Compare commits

...

319 Commits

Author SHA1 Message Date
anisha-amd
773f5de407 Docs: Ray release 25.12 and compatibility version format standardization (#5845) 2026-01-08 12:09:11 -05:00
dependabot[bot]
b297ced032 Bump urllib3 from 2.5.0 to 2.6.3 in /docs/sphinx (#5842)
Bumps [urllib3](https://github.com/urllib3/urllib3) from 2.5.0 to 2.6.3.
- [Release notes](https://github.com/urllib3/urllib3/releases)
- [Changelog](https://github.com/urllib3/urllib3/blob/main/CHANGES.rst)
- [Commits](https://github.com/urllib3/urllib3/compare/2.5.0...2.6.3)

---
updated-dependencies:
- dependency-name: urllib3
  dependency-version: 2.6.3
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-01-08 08:22:01 -05:00
peterjunpark
2dc22ca890 fix(primus-pytorch.rst): FP8 config instead of BF16 (#5839) 2026-01-07 13:49:31 -05:00
Joseph Macaranas
85102079ed [External CI] Add SIMDe dev package to HIP runtime pipeline (#5838) 2026-01-07 11:00:38 -05:00
dependabot[bot]
ba95e0e689 Bump pynacl from 1.6.1 to 1.6.2 in /docs/sphinx (#5836)
Bumps [pynacl](https://github.com/pyca/pynacl) from 1.6.1 to 1.6.2.
- [Changelog](https://github.com/pyca/pynacl/blob/main/CHANGELOG.rst)
- [Commits](https://github.com/pyca/pynacl/compare/1.6.1...1.6.2)

---
updated-dependencies:
- dependency-name: pynacl
  dependency-version: 1.6.2
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-01-06 14:10:42 -05:00
Pratik Basyal
1691d369e9 ROCM-core version fixed (#5827) 2026-01-02 16:06:27 -05:00
peterjunpark
172b0f7c08 Fix inconsistency in xDiT doc
Fix inconsistency in xDiT doc
2025-12-29 10:26:25 -05:00
peterjunpark
c67fac78bd Update docs for xDiT diffusion inference 25.13 Docker release (#5820)
* archive previous version

* add xdit 25.13

* update history index

* add perf results section
2025-12-29 08:44:45 -05:00
peterjunpark
e0b8ec4dfb Update training docs for Primus/25.11 (#5819)
* update conf and toc.yml.in

* archive previous versions

archive data files

update anchors

* primus pytorch: remove training batch size args

* update primus megatron run cmds

multi-node

* update primus pytorch

update

* update

update

* update docker tag
2025-12-29 08:05:47 -05:00
Pratik Basyal
38f2d043dc OS table removed from compatibility table [develop] (#5810)
* OS table removed from compatibility table

* Feedback added

* Azure Linux 3.0 and compatibility version update

* Version fix

* Review feedback added

* Minor change
2025-12-23 16:28:19 -05:00
peterjunpark
3a43bacdda Update xdit diffusion inference history (#5808)
* Update xdit diffusion inference history

* fix
2025-12-22 11:05:32 -05:00
peterjunpark
48d8fe139b fix link to ROCm PyT docker image (#5803) 2025-12-19 15:47:55 -05:00
peterjunpark
7455fe57b8 clean up formatting in FA2 page (#5795) 2025-12-19 09:21:41 -05:00
peterjunpark
52c0a47e84 Update Flash Attention guidance in "Model acceleration libraries" (#5793)
* flash attention update

Signed-off-by: seungrok.jung <seungrok.jung@amd.com>

flash attention update

Signed-off-by: seungrok.jung <seungrok.jung@amd.com>

flash attention update

Signed-off-by: seungrok.jung <seungrok.jung@amd.com>

sentence-case heading

* Update docs/how-to/rocm-for-ai/inference-optimization/model-acceleration-libraries.rst

Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com>

---------

Co-authored-by: seungrok.jung <seungrok.jung@amd.com>
Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com>
2025-12-19 08:48:52 -05:00
peterjunpark
cbab9a465d Update documentation for JAX training MaxText 25.11 release (#5789) 2025-12-18 11:23:58 -05:00
peterjunpark
459283da3c xDiT diffusion inference v25.12 documentation update (#5786)
* Add xdit-diffusion ROCm docs page.

* Update template formatting and fix sphinx warnings

* Add System Validation section.

* Add sw component versions/commits.

* Update to use latest v25.10 image instead of v25.9

* Update commands and add FLUX instructions.

* Update Flux instructions. Change image tag. Describe as diffusion inference instead of specifically video.

* git rm xdit-video-diffusion.rst

* Docs for v25.12

* Add hyperlinks to components

* Command fixes

* -Diffusers suffix

* Simplify yaml file and cleanup main rst page.

* Spelling, added 'js'

* fix merge conflict

fix

---------

Co-authored-by: Kristoffer <kristoffer.torp@amd.com>
2025-12-17 10:20:10 -05:00
peterjunpark
1b4f25733d vLLM inference benchmark 1210 (#5776)
* Archive previous ver

fix anchors

* Update vllm.rst and data yaml for 20251210
2025-12-17 09:21:57 -05:00
Ibrahim Wani
b287372be5 [origami] Test update (#5768)
* Fix the skipping of origami tests

* Update dependencies for origami refactor

* test

* Unsupress test output.

* Ctest implementation

* Test ctest

* Test ctest 2

* Add pip install test

* Fix python version

* Add python dep

* test

* test 2

* Debug for readme

* Fix pip install

* Fix pip install 2

* Clean up

* Run tests on 950

* Replace 950 with 1201

* 1101

* Add more archs

* Add more archs 2

* Comment out archs

* Move pip install script to ./azuredevops/scripts

* Fix path

* Fix path 2

* Fix path 3

* Fix path 4

* Remove pip install testing:

* Use inline script

* Add old deps
2025-12-16 15:37:41 -07:00
Pratik Basyal
78e8baf147 Taichi removed from ROCm docs [Develop] (#5779)
* Taichi removed from ROCm docs

* Warnings fixed
2025-12-16 13:12:40 -05:00
Matt Williams
3e0c8b47e3 Merge pull request #5771 from ROCm/mattwill-amd-patch-4
Reverting Optiq note
2025-12-12 17:53:41 -05:00
Matt Williams
c3f0b99cc0 Reverting Optiq note 2025-12-12 17:47:33 -05:00
dependabot[bot]
c9d1679486 Bump rocm-docs-core from 1.31.0 to 1.31.1 in /docs/sphinx
Bumps [rocm-docs-core](https://github.com/ROCm/rocm-docs-core) from 1.31.0 to 1.31.1.
- [Release notes](https://github.com/ROCm/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/ROCm/rocm-docs-core/compare/v1.31.0...v1.31.1)

---
updated-dependencies:
- dependency-name: rocm-docs-core
  dependency-version: 1.31.1
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-12-12 16:15:26 -05:00
Pratik Basyal
fdbef17d7b Onnx and rocshmem version updated (#5760) 2025-12-11 17:05:25 -05:00
Matt Williams
6592a41a7f Adding ROCm-Optiq note to What is ROCm page (#5709)
* Adding ROCm-Optiq note to What is ROCm page

Adding a note for a link to the Optiq docs

* Apply suggestion from @mattwill-amd

* Apply suggestion from @mattwill-amd

* Apply suggestion from @mattwill-amd

* Update what-is-rocm.rst

* Update what-is-rocm.rst

* Apply suggestion from @mattwill-amd

* Apply suggestion from @mattwill-amd

* Apply suggestion from @mattwill-amd

* Apply suggestion from @mattwill-amd
2025-12-10 12:56:33 -08:00
Matt Williams
65a936023b Fixing link redirects (#5758)
* Update multi-gpu-fine-tuning-and-inference.rst

* Update pytorch-training-v25.6.rst

* Update pytorch-compatibility.rst
2025-12-10 11:17:59 -05:00
anisha-amd
2a64949081 Docs: update verl compatibility - fix (#5756) 2025-12-09 19:51:37 -05:00
anisha-amd
0a17434517 Docs: update verl compatibility - fix (#5754) 2025-12-09 18:36:16 -05:00
anisha-amd
2be7e5ac1e Docs: verl framework - compatibility - 25.11 release (#5752) 2025-12-09 11:41:43 -05:00
dependabot[bot]
ae80c4a31c Bump rocm-docs-core from 1.30.1 to 1.31.0 in /docs/sphinx (#5751)
Bumps [rocm-docs-core](https://github.com/ROCm/rocm-docs-core) from 1.30.1 to 1.31.0.
- [Release notes](https://github.com/ROCm/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/v1.31.0/CHANGELOG.md)
- [Commits](https://github.com/ROCm/rocm-docs-core/compare/v1.30.1...v1.31.0)

---
updated-dependencies:
- dependency-name: rocm-docs-core
  dependency-version: 1.31.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-12-09 08:25:16 -05:00
Adel Johar
dd89a692e1 [Ex CI] Add rocAL dependencies 2025-12-09 10:56:23 +01:00
peterjunpark
bf74351e5a Fix Primus PyTorch doc: training.batch_size -> training.local_batch_size (#5748) 2025-12-08 13:35:22 -05:00
yugang-amd
f2067767e0 xdit-diffusion v25.11 docs (#5744) 2025-12-05 17:09:48 -05:00
Pratik Basyal
effd4174fb PyTorch 2.7 support added (#5740) 2025-12-04 15:49:23 -05:00
peterjunpark
453751a86f fix docker hub links for primus:v25.10 (#5738) 2025-12-04 09:17:33 -05:00
peterjunpark
fb644412d5 Update training Docker docs for Primus 25.10 (#5737) 2025-12-04 09:08:00 -05:00
Pratik Basyal
e8fdc34b71 711 hipBLASLT performance decline known issue added (#5730)
* hipBLASLT performance decline known issue added

* Update RELEASE.md

Co-authored-by: Jeffrey Novotny <jnovotny@amd.com>

* GitHub Issue added

* Ram's feedback incorporated

* GitHub Issue added

* Update RELEASE.md

Co-authored-by: Jeffrey Novotny <jnovotny@amd.com>

---------

Co-authored-by: Jeffrey Novotny <jnovotny@amd.com>
2025-12-03 08:50:25 -05:00
Pratik Basyal
b4031ef23c 7.1.1 known issues post GA (#5721)
* rocblas known issues added

* Minor change

* Update RELEASE.md

Co-authored-by: Jeffrey Novotny <jnovotny@amd.com>

* Resolved

* Update RELEASE.md

Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com>

---------

Co-authored-by: Jeffrey Novotny <jnovotny@amd.com>
Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com>
2025-11-28 16:34:47 -05:00
dependabot[bot]
d0bd4e6f03 Bump rocm-docs-core from 1.29.0 to 1.30.1 in /docs/sphinx (#5712)
Bumps [rocm-docs-core](https://github.com/ROCm/rocm-docs-core) from 1.29.0 to 1.30.1.
- [Release notes](https://github.com/ROCm/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/ROCm/rocm-docs-core/compare/v1.29.0...v1.30.1)

---
updated-dependencies:
- dependency-name: rocm-docs-core
  dependency-version: 1.30.1
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-11-28 08:18:23 -05:00
Jan Stephan
0056b9453e Remove continuous numbering of tables and figures
Signed-off-by: Jan Stephan <jan.stephan@amd.com>
2025-11-28 10:29:01 +01:00
Pratik Basyal
3d1ad79766 Merged cell removed for coloring issue (#5713) 2025-11-27 19:52:36 -05:00
Pratik Basyal
8683bed11b Known issue from 7.1.0 removed (#5702) 2025-11-26 12:27:22 -05:00
Pratik Basyal
847cd7c423 Link and PyTorch version updated (#5700) 2025-11-26 11:52:47 -05:00
Alex Xu
42cad29c04 re-compile requirements.txt 2025-11-26 11:35:00 -05:00
alexxu-amd
f7b2fe0a48 Merge pull request #5699 from ROCm/sync-develop-from-internal
Sync develop from internal for 7.1.1
2025-11-26 11:27:48 -05:00
alexxu-amd
bb199aa2b9 Merge pull request #639 from ROCm/sync-develop-from-external
Sync develop from external
2025-11-26 11:10:19 -05:00
alexxu-amd
2f7b2a7fa1 Merge branch 'develop' into sync-develop-from-external 2025-11-26 10:54:34 -05:00
Pratik Basyal
7fd75919d1 711 GPU and environment variable link updated (#640)
* ROCm environment vairable link updated

* Programming patter link updated
2025-11-26 10:41:16 -05:00
Alex Xu
4490c57c6a resolve merge conflict 2025-11-26 10:33:02 -05:00
Alex Xu
007f24fe7b Merge remote-tracking branch 'external/develop' into sync-develop-from-external 2025-11-26 10:09:04 -05:00
Pratik Basyal
afbb6e0f61 PLDM table synced (#638) 2025-11-26 10:08:12 -05:00
Pratik Basyal
1b5a3e54c2 711 compatibility note update and review feedback added (#636)
* Leo's review feedback added

* rocshmem version bumped from 3.0.0 to 3.1.0

* Footnote cleaned

* Footnote updated

* Ram's feedback

* Link updated

* Footnote updated

* Link fixed
2025-11-26 09:46:57 -05:00
alexxu-amd
2c6eb9cf2a Update versions.md (#637)
* Update versions.md

* remove empty line
2025-11-26 09:03:54 -05:00
Pratik Basyal
b93fdb811c 7.1.1 pre-GA public link reset (#627)
* 7.1.1 pre-GA public link reset

* Update CHANGELOG.md
2025-11-26 08:38:13 -05:00
srayasam-amd
096d91e190 Updating rocm version to 7.1.1 GA (#5697)
* 7.1.1 GA update

* 7.1.1 GA update

* Update rocm-7.1.1.xml

* Update default.xml
2025-11-26 16:08:03 +05:30
Pratik Basyal
02037f4384 7.1.1 fixed issues added (#634)
* Fixed issues added

* Blank line added
2025-11-24 15:56:44 -05:00
peterjunpark
c64dc46a50 [7.1.1] docs(RELEASE.md): Add notes under "Driver and firmware related changes" (#632)
* Add notes under "Driver and firmware related changes"

update

* Update RELEASE.md

---------

Co-authored-by: Pratik Basyal <prbasyal@amd.com>
2025-11-24 13:18:53 -05:00
Pratik Basyal
702d8e4c8e New link updated for MIgraphx (#5691) 2025-11-24 11:52:38 -05:00
Istvan Kiss
19344d7b61 Fix rocr-runtime environment variables content link (#631) 2025-11-21 18:59:57 +01:00
amd-hsivasun
807ec6afcf [Ex CI] Update AMDMIGraphX CMake version (#5683) 2025-11-20 18:05:24 -05:00
amd-hsivasun
4c04da05c3 [Ex CI] Update pipeline ID for amdmis to monorepo (#5685) 2025-11-20 18:05:17 -05:00
dependabot[bot]
411334716c Bump rocm-docs-core from 1.28.0 to 1.29.0 in /docs/sphinx (#5659)
Bumps [rocm-docs-core](https://github.com/ROCm/rocm-docs-core) from 1.28.0 to 1.29.0.
- [Release notes](https://github.com/ROCm/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/ROCm/rocm-docs-core/compare/v1.28.0...v1.29.0)

---
updated-dependencies:
- dependency-name: rocm-docs-core
  dependency-version: 1.29.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-11-20 13:54:33 -05:00
amd-hsivasun
99f0875e70 [Ex CI] amdsmi monorepo enablement (#5677)
* [Ex CI] amdsmi monorepo enablement

* Fix amdsmi yaml
2025-11-20 13:52:01 -05:00
peterjunpark
50658d0812 Update release highlights for 7.1.1 (#629) 2025-11-20 13:51:13 -05:00
Pratik Basyal
7aeecdf8e2 Document 7.1.1 Known issues (#628)
Co-authored-by: Peter Park <peter.park@amd.com>
2025-11-20 13:12:52 -05:00
Istvan Kiss
4f669eb2c6 Add JAX Plugin-PJRT support table (#619) 2025-11-20 10:55:51 -06:00
Jithun Nair
7d1f314303 Update PyTorch compatibility documentation with PyTorch2.9 for ROCm7.1.1 2025-11-19 19:15:04 -06:00
Jithun Nair
c523f51e58 Merge branch 'develop' into update-pytorch-compatibility 2025-11-19 19:11:22 -06:00
Melantha-S
b566858909 Update pytorch-compatibility.rst 2025-11-19 15:54:04 -07:00
Melantha-S
c33b9e3611 Update pytorch-compatibility.rst 2025-11-19 15:16:30 -07:00
Shao
2646b4841d Update pytorch compatibility documentation 2025-11-19 15:05:11 -07:00
Shao
ff2f40d800 Add logsumexp to spellcheck dictionary 2025-11-19 15:03:12 -07:00
Shao
71bcc5b204 Add PyTorch 2.9 release notes for ROCm 2025-11-19 14:59:27 -07:00
Pratik Basyal
fd840df30b JAX and PyTorch support and ROCProfiler upcoming changes updated 7.1.1 (#626)
* ROCProfiler upcoming changes updated

* ROCm examples moved

* JAX verison udpated

* Formatting updated"

* Update RELEASE.md

Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com>

* Environment variable updated added

* Minor changelog fixes

* JAX reverted

* grid alignment

* Revert "grid alignment"

This reverts commit 47939743ab3175cad47f45fd2cd263476eaf14e1.

---------

Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com>
2025-11-19 15:29:02 -05:00
Shao
58e26eede1 Add Cholesky and mx to spellcheck dictionary 2025-11-19 10:27:51 -07:00
Shao
407a9d4cb0 Update PyTorch compatibility documentation 2025-11-19 09:52:47 -07:00
Istvan Kiss
81b7745f8e Docs: Add Environment Variable Page (#395)
Co-authored-by: Adel Johar <adel.johar@amd.com>
2025-11-19 17:40:26 +01:00
Pratik Basyal
6af62fd30a 7.1.1 Compatibility table fixed (#624)
* broken table fixed

* Line break added

* Line break added
2025-11-19 11:22:47 -05:00
Pratik Basyal
bb692dfd84 711 Release Notes update [Batch1] (#623)
* Fixed issue updated

* Release notes updated

* Formatting correction

* RCCL performance decline issue added

* Known issue updated

* Minor update

* Known issues updated

* Review feedback added

Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com>

---------

Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com>
2025-11-19 08:04:37 -05:00
Adel Johar
8d51d0e803 [Ex CI] Add CXX override for MIGraphX 2025-11-19 10:45:10 +01:00
Adel Johar
66b8b96c72 [Ex CI] Add missing dependencies for rccl and mivisionx 2025-11-19 10:45:10 +01:00
Pratik Basyal
fb098b6354 Initial changes for 7.1.1 release notes (#622)
* Changelog and tables updates for 7.1.1 release notes

* Changelog synced

* Naming udpated

* Added upcoming changes for composable kernel

* Update RELEASE.md

Co-authored-by: Pratik Basyal <prbasyal@amd.com>

* Update RELEASE.md

* Highlights udpated for DGL, ROCm-DS, and HIP documentation

* Changelog synced"

* Offline, runfile and ROCm Bandwidth test updated

* CK/AITER highlight added

* Changelog synced

* AI model highlight updated

* PLDM version added

* Changelog updated

* Leo's feedback incorporated

* Compatibility and PLDM versions udpated

* New docs update added

* ROCm resolved issue added

* Review feedback added

* Link added

* PLDM updated

* PLDM table udpated

* Changes

---------

Co-authored-by: spolifroni-amd <Sandra.Polifroni@amd.com>
2025-11-17 12:09:59 -05:00
cfallows-amd
72107dd6d5 [Ex CI] Adding dependencies to rocprofiler-compute azure workflow (#5667) 2025-11-14 12:24:56 -05:00
amd-hsivasun
99c1590057 [Ex CI] Added ROCM_PATH env var to rocprofiler-compute (#5666) 2025-11-14 12:19:06 -05:00
Jeffrey Novotny
3d86323f88 Update licenses document to reflect monorepo (#620) 2025-11-14 09:16:48 -05:00
Carrie Fallows
636d4cc736 Adding dependencies to rocmDependencies in rocprof-compute yaml. Now needed for building because of rocprofiler-sdk dependency.
Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>
2025-11-13 20:56:45 -05:00
amd-hsivasun
d1ce815d8d [Ex CI] Add rocprofiler-sdk dep to build for rocprofiler-compute (#5664) 2025-11-13 16:08:02 -05:00
Pratik Basyal
80ced95526 Changelog updated (#5660) 2025-11-13 10:18:15 -05:00
Pratik Basyal
09c6a9fdef 710 RCCL Known Issues and CRIU note update (#5647)
* RCCL ALltoALL known issue added

* CRIU note added

* Minor change

* Review feedback and AMDSMI detailed changelog link added

* Github issue link added
2025-11-11 16:54:36 -05:00
Alex Xu
372ddd5af3 revert test changes 2025-11-11 09:33:28 -05:00
peterjunpark
eb956cfc5c Fixed wording related to VLLM_V1_USE_PREFILL_DECODE_ATTENTION (#5605)
Co-authored-by: Hongxia Yang <hongxia.yang@amd.com>
2025-11-11 09:22:11 -05:00
peterjunpark
e05cdca54f Fix references to vLLM docs (#5651) 2025-11-11 09:00:07 -05:00
anisha-amd
04c7374f41 Docs: frameworks 25.10 - compatibility - DGL and llama.cpp (#5648) 2025-11-10 15:26:54 -05:00
Alex Xu
39de859bd1 update rocm-docs-core to 1.29.0 2025-11-10 14:10:06 -05:00
amd-hsivasun
c8531ac7ea [Ex CI] Update pipeline Id for hipTensor to monorepo (#5638) 2025-11-10 13:32:10 -05:00
Pratik Basyal
420bbfa126 7.1.0 MI325X PLDM note updated (#5644)
* PLDM note updated

* Footnote update

* Note added to compatibility

* Lint error fixed
2025-11-08 09:08:21 -05:00
Pratik Basyal
4881887e2c rocBLAS precision known issue added [Develop] (#5641)
* rocBLAS precision known issue added

* IPC note removed

* Review feedback added
2025-11-07 19:45:33 -05:00
Pratik Basyal
148d6670ad rocBLAS and HipBLASLt known issue added 7.1.0 (#5634)
* rocBLAS and HipBLASLt known issue added

* Title warning fixed

* Jeff's feedback added

* Leo's feedback incorporated

* Minor feedback

* MI325X PLDM udpate

* Leo's feedback added

* PyTorch profiling issue added

* Changelog synced

* JAX section removed

* Ram's feedback added
2025-11-07 17:48:36 -05:00
amd-hsivasun
9770e9b6ef [Ex CI] hiptensor Enablement (#5636) 2025-11-07 16:08:46 -05:00
Joseph Macaranas
ee4cf66d67 [External CI] Add simde-devel in dnf mapping (#5635) 2025-11-07 00:59:35 -05:00
Alex Xu
908862242a test preview banner 2025-11-06 12:24:52 -05:00
amd-hsivasun
6ba30f191c [Ex CI] rocWMMA increase timeout for test job (#5620) 2025-11-06 11:38:07 -05:00
yugang-amd
674dc355e4 vLLM 10/24 release (#5626)
* vLLM 10/24 release

* updates per SME inputs

* Update docs/how-to/rocm-for-ai/inference/benchmark-docker/vllm.rst

Co-authored-by: Jeffrey Novotny <jnovotny@amd.com>

---------

Co-authored-by: Jeffrey Novotny <jnovotny@amd.com>
2025-11-05 11:13:50 -05:00
Adel Johar
c7f3a56811 [Ex CI] Add half, rccl, and dependencies for rpp, mivisionx and rocjpeg 2025-11-05 15:59:15 +01:00
Pratik Basyal
0107fa731e ROCm Bandwidth test issue added (#5612) 2025-10-31 18:19:40 -04:00
Pratik Basyal
a87ec360e1 710 known issues update[Batch1] (#5604)
* Version update

* ROCm Bandwidth failure added

* Editorial feedback added

* Minor change

* rocprofv3 issue added

* Minor change

* ROCgdb issue added

* SME feedback incorpprated

* Leo's feedback added

* ROCm Compute Profiler known issue added

* Changelog synced
2025-10-31 14:57:13 -04:00
amd-hsivasun
7215e1e8c7 [Ex CI] Update rocwmma pipeline ID to monorepo (#5602) 2025-10-31 13:56:17 -04:00
amd-hsivasun
e4a59d8c66 [Ex CI] Enable rocWMMA Monorepo (#5597)
* [Ex CI] Enable rocWMMA Monorepo

* Updated to use component name parameter
2025-10-30 13:43:05 -04:00
Pratik Basyal
8108fe7275 7.1.0 Post GA updates (#5600)
* Post GA updates

* Mono repo link added

* AMD SMI changelog link removed
2025-10-30 13:27:25 -04:00
alexxu-amd
d3ff9d7c8e Merge pull request #5599 from ROCm/sync-develop-from-internal
Sync develop from internal for 7.1.0
2025-10-30 11:37:20 -04:00
Alex Xu
939ee7de0c Merge remote-tracking branch 'internal/develop' into sync-develop-from-internal 2025-10-30 11:15:00 -04:00
Pratik Basyal
f1e6c285dd 7.1.0 PRE GA Link reset (#616)
* Link reset

* Changelog synced and feedback incorporated

* Jeff's feedback added
2025-10-30 11:01:13 -04:00
alexxu-amd
ff1d9b4d69 Update versions.md for ROCm 7.1.0 GA (#615)
* Update versions.md

* fix linting
2025-10-30 10:00:32 -04:00
srayasam-amd
ef3fa601d5 7.1.0 GA update (#5598)
* PR for GA 7.1.0

* Create rocm-7.1.0.xml

* Update default.xml

* Update rocm-7.1.0.xml
2025-10-30 19:10:03 +05:30
Pratik Basyal
576191a104 710 release highlights update pre GA (#614)
* hipBLASLt highlights updated

* Flash attention highlight added

* PLDM highlight updated

* Spell fixes
2025-10-30 09:03:32 -04:00
Pratik Basyal
2db07b5cda Changelog updated for HIP (#613) 2025-10-29 18:27:05 -04:00
alexxu-amd
fe3dc988b8 Merge pull request #612 from ROCm/sync-develop-from-external
Sync develop from external for 7.1.0 GA
2025-10-29 17:13:01 -04:00
Alex Xu
36c879b7e0 resolve merge conflict 2025-10-29 17:08:07 -04:00
alexxu-amd
91450dca10 Merge branch 'develop' into sync-develop-from-external 2025-10-29 16:49:33 -04:00
Alex Xu
2de92767e6 Merge remote-tracking branch 'external/develop' into sync-develop-from-external 2025-10-29 16:48:29 -04:00
Pratik Basyal
54d226acd9 710 highlight updates [batch 2] (#611)
* Changelog updated for ROCdbg api"

* Systems profiler update

* Minor change
2025-10-29 16:42:57 -04:00
Pratik Basyal
f46d7ec00f 7.1.0 Release notes updated (#610)
* Release notes updated

* Changelog updated"

Changelog udpated
"

* Github link updated for Mono repo
2025-10-29 14:59:33 -04:00
Pratik Basyal
09c946b6fb 710 fixed issue update (#608)
* Resolved issues added

* Changelog synced

* Changelog synced
2025-10-29 12:28:09 -04:00
Pratik Basyal
5285669d98 7.1.0 release notes, changelog, and known issues update (#606)
* RCCL and hipblaslt changelog updated

* ROCProfiler-SDK highlight addede

* Review feedback from Leo and Swati added

Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com>
Co-authored-by: Swati Rawat <120587655+SwRaw@users.noreply.github.com>

* ROCprofiler-SDK added

* Minor edits

---------

Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com>
Co-authored-by: Swati Rawat <120587655+SwRaw@users.noreply.github.com>
2025-10-29 10:22:52 -04:00
Jan Stephan
9b3138cffa [Ex CI] Add aomp, aomp-extras, composable_kernel and rocALUTION
Remove libomp-dev

Signed-off-by: Jan Stephan <jan.stephan@amd.com>
2025-10-29 11:22:27 +01:00
Pratik Basyal
61fffe3250 7.0.2 Broken link, version and known issue update (#5591)
* Version and known issue update

* Historical compatibility updated
2025-10-28 15:16:15 -04:00
dependabot[bot]
43ccfbbe80 Bump rocm-docs-core from 1.26.0 to 1.27.0 in /docs/sphinx (#5570)
Bumps [rocm-docs-core](https://github.com/ROCm/rocm-docs-core) from 1.26.0 to 1.27.0.
- [Release notes](https://github.com/ROCm/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/ROCm/rocm-docs-core/compare/v1.26.0...v1.27.0)

---
updated-dependencies:
- dependency-name: rocm-docs-core
  dependency-version: 1.27.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-10-28 11:06:22 -04:00
peterjunpark
1515fb3779 Revert "Add xdit diffusion docs (#5576)" (#5580)
This reverts commit 4132a2609c.
2025-10-27 16:22:28 -04:00
randyh62
410a69efe4 Update RELEASE.md (#598)
Edit doorbell ring improvements
2025-10-27 13:14:45 -07:00
Joseph Macaranas
248cbf8bc1 [External CI] rccl triggers rocprofiler-sdk downstream (#5420)
- Update rccl component pipeline to include new additions made to projects already in super repos.
- Also update rccl to trigger rocproifler-sdk job upon completion.
- rocprofiler-sdk pipeline updated to include os parameter to enable future almalinux 8 job.
2025-10-27 12:14:30 -04:00
Istvan Kiss
0171dced89 Link fix and remove CentOS Stream mention from PyTorch release notes. (#593)
CentOS Stream not officially supported OS
2025-10-27 16:47:52 +01:00
Istvan Kiss
f2d6675839 Add back extra line to fix spellchecker (#604) 2025-10-27 16:39:29 +01:00
Pratik Basyal
7d0fad9aa8 Changelog duplication fixed (#601) 2025-10-27 10:38:44 -04:00
Kristoffer
4132a2609c Add xdit diffusion docs (#5576)
* Add xdit video diffusion base page.

* Update supported accelerators.

* Remove dependency on mad-tags.

* Update docker pull section.

* Update container launch instructions.

* Improve launch instruction options and layout.

* Add benchmark result outputs.

* Fix wrong HunyuanVideo path

* Finalize instructions.

* Consistent title.

* Make page and side-bar titles the same.

* Updated wordlist. Removed note container reg HF.

* Remove fp8_gemms in command and add release notes.

* Update accelerators naming.

* Add note regarding OOB performance.

* Fix admonition box.

* Overall fixes.
2025-10-27 14:56:55 +01:00
Pratik Basyal
c56d5b7495 7.1.0 release notes and compatibility footnote update (#599)
* RDC changelog and highlight addition

* Compatibility updated

* Minor change

* Consolidated changelog synced
2025-10-25 08:47:17 -05:00
Pratik Basyal
a2e2bd3277 710 Compatibility table fixed (#597)
* Compatibility table fixed

* Ryzen link updated

* rocJPEG added

* Driver updated

* Minor change

* PLDM udpate
2025-10-24 15:54:11 -04:00
randyh62
32d1cdcd90 Update RELEASE.md (#596)
Fix HIP 7.1 issues
2025-10-24 12:11:59 -07:00
Pratik Basyal
ac16524ebd 7.1.0 Compatibility updated (#595)
* Compatibility updated

* rocAL and MIgraphx changelog added

* Minor update

* Heading changes
2025-10-24 13:43:36 -04:00
Pratik Basyal
157d86b780 7.1.0 Release Notes Update (#591)
* Initial changelog added

* Changelog updated

* 7.1.0 draft changes

* Highlight changes

* Add release highlights

* formatting

* Order updated

* Highlights added

* Highlight update

* Changelog updated

* RCCL change

* RCCL changelog entry added

* Changelog updates added

* heading level fixed

* Updates added

* Leo's and Jeff's review feedback incorporated

Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com>
Co-authored-by: Jeffrey Novotny <jnovotny@amd.com>

* Release notes feedback

* Updated highlights

* Minor changes

* TOC for internal updated

---------

Co-authored-by: Peter Park <peter.park@amd.com>
Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com>
Co-authored-by: Jeffrey Novotny <jnovotny@amd.com>
2025-10-24 12:41:23 -04:00
peterjunpark
35ca027aa4 Fix broken links under rocm-for-ai/ (#5564) 2025-10-23 14:39:58 -04:00
peterjunpark
90c1d9068f add xref to vllm v1 optimization guide in workload.rst (#5560) 2025-10-22 13:47:46 -04:00
peterjunpark
cb8d21a0df Updates to the vLLM optimization guide for MI300X/MI355X (#5554)
* Expand vLLM optimization guide for MI300X/MI355X with comprehensive AITER coverage. attention backend selection, environment variables (HIP/RCCL/Quick Reduce), parallelism strategies, quantization (FP8/FP4), engine tuning, CUDA graph modes, and multi-node scaling.

Co-authored-by: PinSiang <pinsiang.tan@embeddedllm.com>
Co-authored-by: Hongxia Yang <62075498+hongxiayang@users.noreply.github.com>
Co-authored-by: pinsiangamd <pinsiang.tan@amd.com>
Co-authored-by: Jeffrey Novotny <jnovotny@amd.com>
2025-10-22 12:54:25 -04:00
Kiriti Gowda
6f8cf36279 Merge pull request #5530 from kiritigowda/kg/ctest-verbose
CTest - Output verbose
2025-10-21 13:16:12 -07:00
anisha-amd
8eb5fef37c Docs: frameworks compatibility standardization (#5488) 2025-10-21 16:12:18 -04:00
Pratik Basyal
a5f0b30a47 PLDM version update for MI350 series [Develop] (#5547)
* PLDM version update for MI350 series

* Minor update
2025-10-20 14:39:17 -04:00
Adel Johar
2ec051dec5 Merge pull request #5531 from adeljo-amd/ci_examples
[Ex CI] Add libomp-dev, MIVisionX, rocDecode and dependencies
2025-10-20 09:55:02 +02:00
Pratik Basyal
fd6bbe18a7 PLDM update for MI250 and MI210 [Develop] (#5537)
* PLDM update for MI250 and MI210

* PLDM update
2025-10-17 17:13:42 -04:00
Istvan Kiss
14ada81c41 Pytorch release notes with rocm 7.1 (#588)
* Add PyTorch release notes udpate

* Remove torchtext

Torchtext development stoped and only supported with PyTorch 2.2

* Update
2025-10-17 22:03:14 +02:00
peterjunpark
a613bd6824 JAX Maxtext v25.9 doc update (#5532)
* archive previous version (25.7)

* update docker components list for 25.9

* update template

* update docker pull tag

* update

* fix intro
2025-10-17 11:31:06 -04:00
Adel Johar
b3459da524 [Ex CI] Add libomp-dev, MIVisionX, rocDecode 2025-10-17 14:02:54 +02:00
kiritigowda
eba211d7f1 CTest - Output verbose 2025-10-16 15:22:27 -07:00
peterjunpark
14bb59fca9 Update Megatron/PyTorch Primus 25.9 docs (#5528)
* add previous versions

* Fix heading levels in pages using embedded templates (#5468)

* update primus-megatron doc

update megatron-lm doc

update templates

fix tab

update primus-megatron model configs

Update primus-pytorch model configs

fix css class

add posttrain to pytorch-training template

update data sheets

update

update

update

update docker tags

* Add known issue and update Primus/Turbo versions

* add primus ver to histories

* update primus ver to 0.1.1

* fix leftovers from merge conflict
2025-10-16 12:51:30 -04:00
anisha-amd
a98236a4e3 Main Docs: references of accelerator removal and change to GPU (#5495)
* Docs: references of accelerator removal and change to GPU

Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com>
Co-authored-by: Pratik Basyal <pratik.basyal@amd.com>
2025-10-16 11:22:10 -04:00
David Dixon
5cb6bfe151 Add yaml-cpp to dependencies 2025-10-16 07:26:06 -06:00
David Dixon
6e7422ded7 Update cli11.yml for Azure Pipelines (#5523) 2025-10-15 10:47:29 -06:00
Istvan Kiss
7b7ff53985 Update Radeon link (#5453) 2025-10-15 17:25:05 +02:00
David Dixon
019796dc63 [external] Create cli11.yml (#5522) 2025-10-15 09:19:56 -06:00
Pratik Basyal
f21cfe1171 GitHub issue added to 702 known issues (#5520)
* GitHub issue added to 702 known issues

* Added missing RCCL changelog
2025-10-15 09:58:23 -04:00
Jan Stephan
170cb47a4f Merge pull request #5512 from j-stephan/rocm-examples-deps
[Ex CI] Add libtiff-dev, libopencv-dev and rpp
2025-10-15 10:02:46 +02:00
Braden Stefanuk
d19a8e4a83 [superbuild] Add dependencies for hipblaslt and origami (#5487)
* ci: add deps for origami in superbuild

* ci: add rocm path to system path

* build: add pip msgpack dep
2025-10-14 16:05:24 -06:00
amd-hsivasun
3a0b8529ed [Ex CI] Added MIOpen to the test dependencies for rocm-examples (#5517) 2025-10-14 14:56:36 -04:00
Joseph Macaranas
f9d7fc2e6a [External CI] Add libsimde-dev to ROCR pipeline (#5515) 2025-10-14 14:24:45 -04:00
Nilesh M Negi
d424687191 [Ex CI] Increase RCCL build time limit to 120mins (#5516) 2025-10-14 12:59:40 -05:00
Jan Stephan
35e6e50888 [Ex CI] Add libopencv-dev
Signed-off-by: Jan Stephan <jan.stephan@amd.com>
2025-10-13 20:00:25 +02:00
Jan Stephan
91cfe98eb3 [Ex CI] Add libtiff-dev and rpp
Signed-off-by: Jan Stephan <jan.stephan@amd.com>
2025-10-13 17:42:59 +02:00
Pratik Basyal
036aaa2e78 ROCm for HPC topic updated Develop (#5504)
* ROCm for HPC topic updated

* ROCm for HPC topic udpated

* Minor editorial
2025-10-10 22:31:51 -04:00
Pratik Basyal
78258e0f85 702 compatibility Footnote updated (#5502)
* Footnote updated

* Minor update

* Minor update

* Break added

* Line break added

* Line break

* Footnote updated

* Minor correction
2025-10-10 21:23:07 -04:00
amd-hsong
c79d9f74ef Merge pull request #5490 Re-enable device_merge_inplace unit test for rocPRIM 2025-10-10 15:03:23 -06:00
amd-hsivasun
fb1b78c6f0 [Ex CI] Added Component and Module Dependencies (#5489)
* [Ex CI] Added Component and Module Dependencies

* Add registerROCmPackages flag
2025-10-10 16:01:11 -04:00
peterjunpark
3a70d75f5e Fix documented AMD SMI version (ROCm 7.0.2) (#5496) 2025-10-10 15:09:20 -04:00
alexxu-amd
61e1f088a1 Merge pull request #5492 from ROCm/sync-dev-from-internal
Sync dev from internal for 7.0.2 GA
2025-10-10 11:17:32 -04:00
Pratik Basyal
1f6e5c5e04 Update compatibility-matrix.rst 2025-10-10 11:10:48 -04:00
Pratik Basyal
e8a0769842 Update RELEASE.md 2025-10-10 11:07:51 -04:00
Alex Xu
6f9579d052 Merge remote-tracking branch 'internal/develop' into sync-dev-from-internal 2025-10-10 11:02:33 -04:00
Pratik Basyal
245d53a021 Merge pull request #579 from prbasyal-amd/post-rc3-702-update
GPU resiliency highlight updated 702
2025-10-10 11:00:59 -04:00
Alex Xu
35dbbb22bc fix linting 2025-10-10 10:29:13 -04:00
alexxu-amd
03dc8cee00 Merge pull request #584 from ROCm/sync-dev-from-external
Sync dev from external
2025-10-10 10:14:56 -04:00
Alex Xu
323e5fd27a Merge remote-tracking branch 'external/develop' into sync-dev-from-external 2025-10-10 10:13:08 -04:00
alexxu-amd
b11fd7b492 Update versions.md (#583) 2025-10-10 09:31:24 -04:00
srayasam-amd
5e2efa05a6 7.0.2 GA update (#5491)
* 7.0.2 GA update

* Create rocm-7.0.2.xml
2025-10-10 18:47:48 +05:30
Hao Song
29a90f0271 [rocPRIM] Re-enable device_merge_inplace unit test for rocPRIM 2025-10-09 21:48:11 +00:00
randyh62
c06242bb89 Update RELEASE.md (#581)
* Update RELEASE.md

Remove support for rocBlas and hipBlasLt

* Update CHANGELOG.md

Removed from the Changelog as well.
2025-10-09 13:15:08 -07:00
peterjunpark
68e8453ca5 Update vLLM doc for 10/6 release and bump rocm-docs-core to 1.26.0 (#5481)
* archive previous doc version

* update model/docker data and doc templates

* Update "Reproducing the Docker image"

* fix: truncated commit hash doesn't work for some reason

* bump rocm-docs-core to 1.26.0

* fix numbering

fix

* update docker tag

* update .wordlist.txt
2025-10-08 16:23:40 -04:00
Pratik Basyal
503b8bcc86 Framework and changelog updated (#5483)
* Framework and chaneglog updated

* Wordlist updated
2025-10-08 15:05:11 -04:00
amd-hsivasun
e3d97d339a [Ex CI] Added rocJPEG and rocprofiler-sdk 2025-10-08 14:47:44 -04:00
alexxu-amd
978c58d196 Merge pull request #577 from ROCm/sync-develop-from-external
Sync develop from external
2025-10-08 14:25:03 -04:00
alexxu-amd
a366048b64 Merge branch 'develop' into sync-develop-from-external 2025-10-08 14:12:14 -04:00
Pratik Basyal
4c3e33c291 Compatibility matrix and changelog synced for ROCm 7.0.2 (#576)
* Compatibility matrix and changelog synced

* Indentation updated

* OS updated
2025-10-08 14:11:15 -04:00
Alex Xu
89758e67d8 Merge remote-tracking branch 'external/develop' into sync-develop-from-external 2025-10-08 14:03:34 -04:00
Pratik Basyal
5d0f201b4d 7.0.2 review update (#575)
* 7.0.2 review update

* Tensorflow footnote updated

* Wordlist added
2025-10-08 12:35:14 -04:00
Pratik Basyal
e3677d89a6 PLDM bundle info updated for 7.0.2 (#574)
* PLDM bundle info updated

* Driver dependency added to GPU resiliency

* Known issue for Migrpahx added

* Footnote added

* Known issue for OpenCV updated

* Leo's feedback incorporated

* Radeon 9060 updated

* Known issues updated
2025-10-08 11:00:42 -04:00
amd-hsivasun
f20edab8fc [Ex CI] Update CMake Flags for hipTensor 2025-10-07 15:21:39 -04:00
Pratik Basyal
6f84d50011 ROCm 7.0.2 Post RC3 update (#573)
* Space minimized

* OS support updated

* Minor change
2025-10-06 14:08:01 -04:00
Pratik Basyal
57dd082f28 Post RC2 7.0.2 review feedback updated (#571)
* Known issue updated

* Space optimized

* Changelog updated

* Apply suggestions from code review

Leo's review feedback incorporated

Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com>

* Highlight changes

* Highlight and OS support updated

* GPU resiliency highlight updated

* Highlights updated

* ROCm-EP deprecation added

* Apply suggestions from code review

leo's feedback incorporated

Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com>

* PLDM update

---------

Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com>
2025-10-06 12:04:09 -04:00
peterjunpark
eeea0d2180 Fix heading levels in pages using embedded templates (#5468) 2025-10-03 13:33:14 -04:00
anisha-amd
93c6d17922 Docs: frameworks 25.09 - compatibility - FlashInfer and llama.cpp (#5462) 2025-10-02 13:51:36 -04:00
amd-hsivasun
f91c2b9b4a Update dependencies-rocm.yml 2025-10-01 15:31:35 -04:00
amd-hsivasun
5e6b66ca39 Remove tasks to locate test dir 2025-10-01 15:30:37 -04:00
amd-hsivasun
6b8b359d03 Updated test dir to s/build/tests 2025-10-01 15:30:37 -04:00
amd-hsivasun
38e659e5f0 Update testDir 2025-10-01 15:30:37 -04:00
amd-hsivasun
0894547f5a Update setupenv 2025-10-01 15:30:37 -04:00
amd-hsivasun
aca31170c4 Update setupenv 2025-10-01 15:30:37 -04:00
amd-hsivasun
d21ec9eea5 Updated testDir 2025-10-01 15:30:37 -04:00
amd-hsivasun
189c269350 Added Debug 2025-10-01 15:30:37 -04:00
amd-hsivasun
774cb7a1b3 Changed testDir 2025-10-01 15:30:37 -04:00
amd-hsivasun
024cb4db76 Added testDir 2025-10-01 15:30:37 -04:00
amd-hsivasun
945fb286f7 Find tests Task 2025-10-01 15:30:37 -04:00
amd-hsivasun
ee93101541 Change list files 2025-10-01 15:30:37 -04:00
amd-hsivasun
e31841312b Update testDir 2025-10-01 15:30:37 -04:00
amd-hsivasun
41b5298659 Added a list for all rp-systems files 2025-10-01 15:30:37 -04:00
amd-hsivasun
58790154b2 Add a script to look for setup-env.sh 2025-10-01 15:30:37 -04:00
amd-hsivasun
6f7f73ac0b Update workingDirectories 2025-10-01 15:30:37 -04:00
amd-hsivasun
b2e3bc8565 [Ex CI] Updated rp-systems CMakeBuildDir 2025-10-01 15:30:37 -04:00
amd-hsivasun
52979e2fdb [Ex CI] Updated testDir for rp-systems tests 2025-10-01 15:30:37 -04:00
peterjunpark
0ea5216ace docs: update article_info in conf.py (#5454) 2025-10-01 13:17:50 -04:00
peterjunpark
2e1b4dd5ee Add multi-node setup instructions for training perf Dockers (#5449)
---------

Co-authored-by: Jeffrey Novotny <jnovotny@amd.com>
2025-09-30 14:53:38 -04:00
Pratik Basyal
5c7b993c0c 7.0.2 release changes (#568)
* Initial changes for 7.0.2

* Heading level updated

* Release notes changes

* rocsolver added

* Known issues updated

* Highlights updated

* RN changes

* Release highlights for AI applications updated

* AI developer contents added

* leo's review feedback added

* Compatibility matrix updated

* GPU driver support
2025-09-30 14:02:04 -04:00
amd-hsivasun
2d79b3c4bd [Ex CI] Added rocm-cmake dependency 2025-09-30 14:00:16 -04:00
Peter Park
fd59b5fbac fix links in docs (#5446) 2025-09-29 15:27:32 -04:00
amd-hsivasun
0a643f4686 [Ex CI] Enable aqlprofile 2025-09-26 14:42:15 -04:00
amd-hsivasun
d9e5744f7a Update testExecutable 2025-09-26 14:01:02 -04:00
amd-hsivasun
ccb849ec02 Added python3-pip to aptModules 2025-09-26 14:01:02 -04:00
amd-hsivasun
42d4867964 Removed more aptPackages 2025-09-26 14:01:02 -04:00
amd-hsivasun
375359a5dd Added ninja to aptPackages 2025-09-26 14:01:02 -04:00
amd-hsivasun
e92745f1ff Removed apt and pip modules 2025-09-26 14:01:02 -04:00
amd-hsivasun
0fa72358d3 Remove registerROCm packages flag 2025-09-26 14:01:02 -04:00
amd-hsivasun
6fec268a4e Removed package manager 2025-09-26 14:01:02 -04:00
amd-hsivasun
ff14cd1ff5 Added pyyaml 2025-09-26 14:01:02 -04:00
amd-hsivasun
8f65688653 Added registerROCmPackages 2025-09-26 14:01:02 -04:00
amd-hsivasun
33d1493adb Removed dependencies 2025-09-26 14:01:02 -04:00
amd-hsivasun
4b6c7776a2 Updated parameters 2025-09-26 14:01:02 -04:00
amd-hsivasun
af811daa1b Added GPUTarget 2025-09-26 14:01:02 -04:00
amd-hsivasun
d6c045e482 Update test parameters 2025-09-26 14:01:02 -04:00
amd-hsivasun
78b24cad39 Update test pool 2025-09-26 14:01:02 -04:00
amd-hsivasun
753a94c0bb Add test step to buildjob 2025-09-26 14:01:02 -04:00
amd-hsivasun
6ecad57c62 Revert pool changes 2025-09-26 14:01:02 -04:00
amd-hsivasun
977554809a Changed cmake prefix path 2025-09-26 14:01:02 -04:00
amd-hsivasun
7b00f4493b Removed module and prefix path 2025-09-26 14:01:02 -04:00
amd-hsivasun
95c439a272 Removed Compiler Path 2025-09-26 14:01:02 -04:00
amd-hsivasun
94e04fbdc0 Updated testpool 2025-09-26 14:01:02 -04:00
amd-hsivasun
7ab59de8af Update testdir 2025-09-26 14:01:02 -04:00
amd-hsivasun
175c817563 Change testdir 2025-09-26 14:01:02 -04:00
amd-hsivasun
25516d312e Updated testdir 2025-09-26 14:01:02 -04:00
amd-hsivasun
30c345629a Changed testdir 2025-09-26 14:01:02 -04:00
amd-hsivasun
210dc94bbb Removed testExecutable 2025-09-26 14:01:02 -04:00
amd-hsivasun
a54023ccb8 Changed testdir 2025-09-26 14:01:02 -04:00
amd-hsivasun
17e3362dc7 Add Checkout to testjob 2025-09-26 14:01:02 -04:00
amd-hsivasun
0f9c0d884d Updated testdir 2025-09-26 14:01:02 -04:00
amd-hsivasun
c890de4b16 Added Path to Gtest 2025-09-26 14:01:02 -04:00
amd-hsivasun
4ea77ab515 Added Tests 2025-09-26 14:01:02 -04:00
amd-hsivasun
c0512612f4 Updated testdir 2025-09-26 14:01:02 -04:00
amd-hsivasun
1c81ac3747 Updated testdir path 2025-09-26 14:01:02 -04:00
amd-hsivasun
4bafa42e52 Updated test parameters 2025-09-26 14:01:02 -04:00
amd-hsivasun
493801e670 Updated testdir 2025-09-26 14:01:02 -04:00
amd-hsivasun
1a5152b7b3 Removed testdir 2025-09-26 14:01:02 -04:00
amd-hsivasun
874c881012 Fixed testdir 2025-09-26 14:01:02 -04:00
amd-hsivasun
bdcaeea74c Updated testdir 2025-09-26 14:01:02 -04:00
amd-hsivasun
b02669acf7 Fixed Dependencies 2025-09-26 14:01:02 -04:00
amd-hsivasun
844f10b2b1 Updated denendecies-other variables 2025-09-26 14:01:02 -04:00
amd-hsivasun
d6c14920b4 External CI: Build pipeline for aqlprofile 2025-09-26 14:01:02 -04:00
amd-hsivasun
4affe10a7c [Ex CI] Update pipeline Id for rdc to monorepo 2025-09-26 12:38:57 -04:00
amd-hsivasun
81341ef435 Add New Line 2025-09-26 11:41:21 -04:00
amd-hsivasun
abacd328f9 [Ex CI] Added rocRand to rocmDependencies 2025-09-26 11:41:21 -04:00
amd-hsivasun
80b2fb6e26 [Ex CI] Add hipRAND to rocmDependencies 2025-09-26 11:41:21 -04:00
amd-hsivasun
b53e8decfc [Ex CI] Enable rdc monorepo 2025-09-26 11:41:21 -04:00
amd-hsivasun
5fcc2eafde [Ex CI] Update pipeline Id for rocprofiler-sdk to monorepo 2025-09-25 16:49:07 -04:00
amd-hsivasun
2eb0d77bc6 Updated testDir 2025-09-25 13:20:37 -04:00
amd-hsivasun
d84b41908f Changed Testdir 2025-09-25 13:20:37 -04:00
amd-hsivasun
986f8284d1 [Ex CI] Update testDir for rocprofiler-sdk 2025-09-25 13:20:37 -04:00
Pratik Basyal
d92d9268dc Use of Radeon and Ryzen reference updated [Develop] (#5432)
* Use of Radeon and Ryzen reference updated

* Pytorch link update
2025-09-24 19:07:41 -05:00
Ibrahim Wani
1629d3f0ea Add origami yaml based tests to azure pipelines (#5431)
* Add origami yaml tests

* Dependency fix in origami.yml

* Fix almalinux dependency; get publish test results step working

* Fix almalinux dependency issue
2025-09-24 14:49:51 -06:00
Pratik Basyal
6cf6b34b2e TOC for ROCm on Radeon and Ryzen updated (#5429) 2025-09-24 13:58:26 -05:00
Pratik Basyal
c35a0a121a ROR link and text updated (#5426) 2025-09-24 13:28:13 -05:00
amd-hsivasun
412e383654 [Ex CI] Update pipeline Id for rocprofiler-sdk 2025-09-23 15:56:49 -04:00
Pratik Basyal
39f6fc187d rocm-core version updated (#5418) 2025-09-23 15:49:33 -04:00
amd-hsivasun
05b480fb28 Update rocm-examples.yml 2025-09-23 12:10:11 -04:00
amd-hsivasun
4fa44d90db Updated dependencies-cmake-custom.yml default ver 2025-09-23 12:10:11 -04:00
amd-hsivasun
c9ef13d823 Added Custom Cmake to testjobs 2025-09-23 12:10:11 -04:00
amd-hsivasun
f02172050b Added rocWMMA dependency 2025-09-23 12:10:11 -04:00
amd-hsivasun
154dbe297a Updated File to take custom cmake version 2025-09-23 12:10:11 -04:00
amd-hsivasun
993a0a4fd4 [Ex CI] Update cmake 2025-09-23 12:10:11 -04:00
amd-hsivasun
c03662f410 [Ex CI] Update pipeline Id for origami to monorepo 2025-09-23 11:17:39 -04:00
Peter Park
442d7e4750 Add env var note to vllm.rst for MoE models and fix links in docs (#5415)
* docs(vllm.rst): add performance note for MoE models

* docs: fix links

update vllm readme link 20250521

fix links
2025-09-22 15:58:43 -04:00
Pratik Basyal
a09a8f517e PLDM version for 7.0.0 updated (#5412) 2025-09-22 11:14:07 -04:00
Pratik Basyal
0bbaab645d rocSHMEM and ROCprofiler-SDK highlight update (#5408) (#5409)
* rocSHMEM and ROCprofiler-SDK highlight update (#5408)

* Update RELEASE.md
2025-09-22 10:26:12 -04:00
Ibrahim Wani
4b80405e2e Add set -e to exit when test fails (#5398) 2025-09-19 10:43:35 -06:00
Peter Park
d92e5b6c12 Update Primus Megatron doc v25.8 (#5396)
* megatron: update previous versions list

update

wording

* megatron: update rst and yaml

update primus repo link

update mig guide

* update headings and anchors

* megatron: update doc

* update docker hub urls
2025-09-19 08:09:21 -04:00
Pratik Basyal
91fce2e134 rocpd highlight updated (#5393) 2025-09-18 19:00:36 -04:00
Peter Park
27d53cf082 Remove duplicate ML FW docker image support table (#5389) 2025-09-18 17:06:53 -04:00
Pratik Basyal
bc084246be Reference to AMD GPU Driver 30.10 release notes updated (#5380) 2025-09-18 13:34:46 -05:00
Peter Park
9827ba7ff2 docs: MaxText v25.7 patch update (#5372)
* remove jax 0.6.0 nanoo fp8 caveat note

* reorder maxtext docker images in data sheet
2025-09-17 16:25:46 -04:00
Pratik Basyal
bafda50153 Link updated (#5369) 2025-09-17 15:03:29 -05:00
Pratik Basyal
cae65c6c43 Link reset (#5368) 2025-09-17 13:49:04 -05:00
pbhandar-amd
6a66167486 Merge pull request #5367 from ROCm/amd/pbhandar/rocm_701_internal_to_external_sync
Sync internal to external develop branch for ROCm 7.0.1
2025-09-17 14:26:03 -04:00
Parag Bhandari
0f3543d6e8 Merge branch 'develop-internal' into develop 2025-09-17 14:15:05 -04:00
pbhandar-amd
678691c3d7 Merge pull request #563 from ROCm/amd/pbhandar/rocm_701_external_to_internal_sync
Sync external develop into internal develop for ROCm 7.0.1
2025-09-17 14:14:40 -04:00
pbhandar-amd
5cb3debed9 Merge branch 'develop' into amd/pbhandar/rocm_701_external_to_internal_sync 2025-09-17 14:09:59 -04:00
pbhandar-amd
dd5d710727 Update versions.md 2025-09-17 14:09:49 -04:00
pbhandar-amd
eca1ecde92 Merge branch 'develop' into amd/pbhandar/rocm_701_external_to_internal_sync 2025-09-17 13:48:36 -04:00
pbhandar-amd
ed1e414710 Update versions.md 2025-09-17 13:42:20 -04:00
Pratik Basyal
20c90fc406 Footnote updated (#564) 2025-09-17 12:24:03 -05:00
JeniferC99
6e39614b22 7.0.1 GA update (#5365)
* Update default.xml - Change 7.0.0 to 7.0.1

* add rocm-7.0.1.xml
2025-09-17 13:18:01 -04:00
Pratik Basyal
f7873ac74e Long cell in compatibility matrix updated 701 (#562)
* Long cell updated

* Long cell updated

* Historical comaptibility updated
2025-09-17 11:57:35 -05:00
Parag Bhandari
a86fba556b Merge branch 'develop' into develop-internal 2025-09-17 12:35:50 -04:00
Pratik Basyal
7603fed080 Release 7.0.1 demo release notes (#536)
* Mono repo highlight added

* Leo's feedback incorporated

* Minor wording change

* Randy's feedback incorp

* Update for upcoming change

* Minor feedback added

* Ram's feedback incorporated

* Reworded for clarity

* ROCM 7.0.1 draft

* Minor change

* Release 7.0.0 notes appended

* Heading order updated for 7.0.1

* 700 GA changes synced

* Issue updated

* Review feedback added

* Conf file updated

* Tensorflow change added

* review feedback added

* GPU depencency matrix updated

* Compatibility updated

* Minor change

* New update note

* AMD GPU Driver notes updated

* Footnotes updated
2025-09-17 10:57:15 -05:00
Braden Stefanuk
9932cd4ac2 [hipsparselt] Update compile command for new build system (#5244) 2025-09-16 15:36:20 -06:00
Peter Park
e8d104124f Fix PyTorch training benchmark doc template (#5357)
* fix template

* update wordlist
2025-09-16 17:21:57 -04:00
Peter Park
26f708da87 Add Stable Diffusion XL to PyT training benchmark doc and fix paths in SGLang Disagg Inference doc (#5282)
* add sdxl to pytorch-training

* fix sphinx warnings

fix links

* fix paths in cmds and links in sglang disagg

* fix col width

* update release highlights

* fix

quickfix
2025-09-16 16:49:33 -04:00
Pratik Basyal
5a5e4dbb6e Compatibility updated (#5355) 2025-09-16 15:49:13 -05:00
randyh62
1c3dae75e1 Revert "Update RELEASE.md (#560)" (#561)
This reverts commit f216b371a0.
2025-09-16 13:02:13 -07:00
Peter Park
bab853a0d3 Add NCF to pytorch training benchmark doc (#5352)
* add previous version (25.6)

* fix template

* Formatting and wording fixes

* add caveats

* update yaml

* add note to pytorch-training

* fix template

* make model name shorter
2025-09-16 13:29:28 -04:00
Pratik Basyal
5c7ccb3c26 Github Issue Links updated (#5350)
* 7.0.0 compatibility updated

* GIM link updated
2025-09-16 12:55:58 -04:00
randyh62
f216b371a0 Update RELEASE.md (#560)
Update llvm-project URL
2025-09-16 09:39:26 -07:00
randyh62
37faf170b1 Update RELEASE.md (#5349)
* Update RELEASE.md

update llvm-project URL

* Update .wordlist.txt

add spelling errors
2025-09-16 09:38:23 -07:00
Peter Park
8c40d14d7e fix pldm note (#5346) 2025-09-16 11:09:19 -05:00
Peter Park
d5101532f7 docs: Add SGLang disaggregated P/D inference w/ Mooncake guide (#5335)
* add main content

* Update content and format

add clarification

update

update data

* fix

fix

fix

* fix: deepseek v3

* add ki

* Update docs/how-to/rocm-for-ai/inference/benchmark-docker/sglang-distributed.rst

Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com>

* Update docs/how-to/rocm-for-ai/inference/benchmark-docker/sglang-distributed.rst

Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com>

* Update docs/how-to/rocm-for-ai/inference/benchmark-docker/sglang-distributed.rst

Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com>

* Update docs/how-to/rocm-for-ai/inference/benchmark-docker/sglang-distributed.rst

Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com>

* Update docs/how-to/rocm-for-ai/inference/benchmark-docker/sglang-distributed.rst

Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com>

* Update docs/how-to/rocm-for-ai/inference/benchmark-docker/sglang-distributed.rst

Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com>

* Update docs/how-to/rocm-for-ai/inference/benchmark-docker/sglang-distributed.rst

Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com>

* Update docs/how-to/rocm-for-ai/inference/benchmark-docker/sglang-distributed.rst

Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com>

* Update docs/how-to/rocm-for-ai/inference/benchmark-docker/sglang-distributed.rst

Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com>

* Update docs/how-to/rocm-for-ai/inference/benchmark-docker/sglang-distributed.rst

Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com>

---------

Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com>
2025-09-16 10:33:58 -05:00
Peter Park
ef4e7ca1fe docs(PyTorch training v25.8): Add Primus and update PyTorch training benchmark docs (#5331)
* pyt: update previous versions list

update conf.py

* pyt: update yaml and rst

update

update toc

* update headings and anchors

* pyt: update doc

* update docker hub urls
2025-09-16 10:33:53 -05:00
Pratik Basyal
be68246824 Compatibility updated for 7.0.0 (#5332)
* Compatibility udpated

* Minor fix
2025-09-16 10:01:49 -05:00
Pratik Basyal
1626ee4d8b Post GA fixes develop (#5329)
* Develop link updated

* Release notes and compatibilty update

* Compatibilitbity updated

* RPP link updated
2025-09-16 09:30:12 -05:00
Pratik Basyal
7316031fe6 7.0.0 Release notes update Batch 9 (#559)
* Changelog synced

* Compatibilty updated

* Compatibilty update

* Compiler highlight updated

* wordlist updated
2025-09-16 07:03:32 -04:00
216 changed files with 26955 additions and 5620 deletions

View File

@@ -128,6 +128,9 @@ jobs:
parameters:
aptPackages: ${{ parameters.aptPackages }}
pipModules: ${{ parameters.pipModules }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-cmake-custom.yml
parameters:
cmakeVersion: '3.28.6'
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/preamble.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/checkout.yml
parameters:
@@ -152,6 +155,7 @@ jobs:
-DCMAKE_BUILD_TYPE=Release
-DGPU_TARGETS=${{ job.target }}
-DAMDGPU_TARGETS=${{ job.target }}
-DCMAKE_CXX_COMPILER=$(Agent.BuildDirectory)/rocm/llvm/bin/amdclang++
-DCMAKE_MODULE_PATH=$(Agent.BuildDirectory)/rocm/lib/cmake/hip
-DCMAKE_PREFIX_PATH=$(Agent.BuildDirectory)/rocm/llvm;$(Agent.BuildDirectory)/rocm
-DHALF_INCLUDE_DIR=$(Agent.BuildDirectory)/rocm/include
@@ -192,6 +196,9 @@ jobs:
parameters:
aptPackages: ${{ parameters.aptPackages }}
pipModules: ${{ parameters.pipModules }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-cmake-custom.yml
parameters:
cmakeVersion: '3.28.6'
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/preamble.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/checkout.yml
parameters:
@@ -217,6 +224,7 @@ jobs:
-DCMAKE_BUILD_TYPE=Release
-DGPU_TARGETS=${{ job.target }}
-DAMDGPU_TARGETS=${{ job.target }}
-DCMAKE_CXX_COMPILER=$(Agent.BuildDirectory)/rocm/llvm/bin/amdclang++
-DCMAKE_MODULE_PATH=$(Agent.BuildDirectory)/rocm/lib/cmake/hip
-DCMAKE_PREFIX_PATH=$(Agent.BuildDirectory)/rocm/llvm;$(Agent.BuildDirectory)/rocm
-DHALF_INCLUDE_DIR=$(Agent.BuildDirectory)/rocm/include

View File

@@ -34,6 +34,7 @@ parameters:
default:
- cmake
- libnuma-dev
- libsimde-dev
- mesa-common-dev
- ninja-build
- ocl-icd-libopencl1

View File

@@ -79,7 +79,7 @@ jobs:
aptPackages: ${{ parameters.aptPackages }}
pipModules: ${{ parameters.pipModules }}
packageManager: ${{ job.packageManager }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-cmake-latest.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-cmake-custom.yml
- task: Bash@3
displayName: Add lit to PATH
inputs:

View File

@@ -131,7 +131,7 @@ jobs:
parameters:
aptPackages: ${{ parameters.aptPackages }}
pipModules: ${{ parameters.pipModules }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-cmake-latest.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-cmake-custom.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/preamble.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/checkout.yml
parameters:
@@ -212,7 +212,7 @@ jobs:
parameters:
aptPackages: ${{ parameters.aptPackages }}
pipModules: ${{ parameters.pipModules }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-cmake-latest.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-cmake-custom.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/preamble.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/checkout.yml
parameters:

View File

@@ -37,6 +37,7 @@ parameters:
- libdrm-dev
- libelf-dev
- libnuma-dev
- libsimde-dev
- ninja-build
- pkg-config
- name: rocmDependencies

View File

@@ -1,10 +1,29 @@
parameters:
- name: componentName
type: string
default: amdsmi
- name: checkoutRepo
type: string
default: 'self'
- name: checkoutRef
type: string
default: ''
# monorepo related parameters
- name: sparseCheckoutDir
type: string
default: ''
- name: triggerDownstreamJobs
type: boolean
default: false
- name: downstreamAggregateNames
type: string
default: ''
- name: buildDependsOn
type: object
default: null
- name: unifiedBuild
type: boolean
default: false
# set to true if doing full build of ROCm stack
# and dependencies are pulled from same pipeline
- name: aggregatePipeline
@@ -31,7 +50,7 @@ parameters:
jobs:
- ${{ each job in parameters.jobMatrix.buildJobs }}:
- job: amdsmi_build_${{ job.os }}
- job: ${{ parameters.componentName }}_build_${{ job.os }}
pool:
${{ if eq(job.os, 'ubuntu2404') }}:
vmImage: 'ubuntu-24.04'
@@ -55,6 +74,7 @@ jobs:
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/checkout.yml
parameters:
checkoutRepo: ${{ parameters.checkoutRepo }}
sparseCheckoutDir: ${{ parameters.sparseCheckoutDir }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/build-cmake.yml
parameters:
os: ${{ job.os }}
@@ -65,50 +85,54 @@ jobs:
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/manifest.yml
parameters:
os: ${{ job.os }}
componentName: ${{ parameters.componentName }}
sparseCheckoutDir: ${{ parameters.sparseCheckoutDir }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-upload.yml
parameters:
os: ${{ job.os }}
componentName: ${{ parameters.componentName }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-links.yml
# - template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
# parameters:
# aptPackages: ${{ parameters.aptPackages }}
- ${{ each job in parameters.jobMatrix.testJobs }}:
- job: amdsmi_test_${{ job.os }}_${{ job.target }}
dependsOn: amdsmi_build_${{ job.os }}
condition:
and(succeeded(),
eq(variables['ENABLE_${{ upper(job.target) }}_TESTS'], 'true'),
not(containsValue(split(variables['DISABLED_${{ upper(job.target) }}_TESTS'], ','), variables['Build.DefinitionName'])),
eq(${{ parameters.aggregatePipeline }}, False)
)
variables:
- group: common
- template: /.azuredevops/variables-global.yml
pool: ${{ job.target }}_test_pool
workspace:
clean: all
steps:
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-other.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
packageManager: ${{ job.packageManager }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/preamble.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/local-artifact-download.yml
parameters:
os: ${{ job.os }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/gpu-diagnostics.yml
parameters:
runRocminfo: false
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/test.yml
parameters:
componentName: amdsmi
testDir: '$(Agent.BuildDirectory)'
testExecutable: 'sudo ./rocm/share/amd_smi/tests/amdsmitst'
testParameters: '--gtest_output=xml:./test_output.xml --gtest_color=yes'
os: ${{ job.os }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
environment: test
gpuTarget: ${{ job.target }}
- ${{ if eq(parameters.unifiedBuild, False) }}:
- ${{ each job in parameters.jobMatrix.testJobs }}:
- job: ${{ parameters.componentName }}_test_${{ job.os }}_${{ job.target }}
dependsOn: ${{ parameters.componentName }}_build_${{ job.os }}
condition:
and(succeeded(),
eq(variables['ENABLE_${{ upper(job.target) }}_TESTS'], 'true'),
not(containsValue(split(variables['DISABLED_${{ upper(job.target) }}_TESTS'], ','), '${{ parameters.componentName }}')),
eq(${{ parameters.aggregatePipeline }}, False)
)
variables:
- group: common
- template: /.azuredevops/variables-global.yml
pool: ${{ job.target }}_test_pool
workspace:
clean: all
steps:
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-other.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
packageManager: ${{ job.packageManager }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/preamble.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/local-artifact-download.yml
parameters:
os: ${{ job.os }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/gpu-diagnostics.yml
parameters:
runRocminfo: false
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/test.yml
parameters:
componentName: ${{ parameters.componentName }}
testDir: '$(Agent.BuildDirectory)'
testExecutable: 'sudo ./rocm/share/amd_smi/tests/amdsmitst'
testParameters: '--gtest_output=xml:./test_output.xml --gtest_color=yes'
os: ${{ job.os }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
environment: test
gpuTarget: ${{ job.target }}

View File

@@ -0,0 +1,174 @@
parameters:
- name: componentName
type: string
default: aqlprofile
- name: checkoutRepo
type: string
default: 'self'
- name: checkoutRef
type: string
default: ''
# monorepo related parameters
- name: sparseCheckoutDir
type: string
default: ''
- name: triggerDownstreamJobs
type: boolean
default: false
- name: downstreamAggregateNames
type: string
default: ''
- name: buildDependsOn
type: object
default: null
- name: unifiedBuild
type: boolean
default: false
# set to true if doing full build of ROCm stack
# and dependencies are pulled from same pipeline
- name: aggregatePipeline
type: boolean
default: false
- name: aptPackages
type: object
default:
- cmake
- git
- ninja-build
- python3-pip
- name: rocmDependencies
type: object
default:
- clr
- llvm-project
- ROCR-Runtime
- name: rocmTestDependencies
type: object
default:
- clr
- llvm-project
- ROCR-Runtime
- rocprofiler-register
- name: jobMatrix
type: object
default:
buildJobs:
- { os: ubuntu2204, packageManager: apt, target: gfx942 }
- { os: ubuntu2204, packageManager: apt, target: gfx90a }
testJobs:
- { os: ubuntu2204, packageManager: apt, target: gfx942 }
- { os: ubuntu2204, packageManager: apt, target: gfx90a }
jobs:
- ${{ each job in parameters.jobMatrix.buildJobs }}:
- job: ${{ parameters.componentName }}_build_${{ job.os }}_${{ job.target }}
${{ if parameters.buildDependsOn }}:
dependsOn:
- ${{ each build in parameters.buildDependsOn }}:
- ${{ build }}_${{ job.os }}
variables:
- group: common
- template: /.azuredevops/variables-global.yml
pool: ${{ variables.MEDIUM_BUILD_POOL }}
workspace:
clean: all
steps:
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-other.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/preamble.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/checkout.yml
parameters:
checkoutRepo: ${{ parameters.checkoutRepo }}
sparseCheckoutDir: ${{ parameters.sparseCheckoutDir }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-vendor.yml
parameters:
dependencyList:
- gtest
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmDependencies }}
gpuTarget: ${{ job.target }}
os: ${{ job.os }}
aggregatePipeline: ${{ parameters.aggregatePipeline }}
${{ if parameters.triggerDownstreamJobs }}:
downstreamAggregateNames: ${{ parameters.downstreamAggregateNames }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/build-cmake.yml
parameters:
os: ${{ job.os }}
consolidateBuildAndInstall: true
extraBuildFlags: >-
-DCMAKE_PREFIX_PATH=$(Agent.BuildDirectory)/rocm;$(Agent.BuildDirectory)/vendor
-DCMAKE_CXX_COMPILER=$(Agent.BuildDirectory)/rocm/llvm/bin/amdclang++
-DCMAKE_MODULE_PATH=$(Agent.BuildDirectory)/aqlprofile/cmake_modules
-DAQLPROFILE_BUILD_TESTS=ON
-DGPU_TARGETS=${{ job.target }}
-GNinja
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/manifest.yml
parameters:
componentName: ${{ parameters.componentName }}
sparseCheckoutDir: ${{ parameters.sparseCheckoutDir }}
gpuTarget: ${{ job.target }}
os: ${{ job.os }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-upload.yml
parameters:
componentName: ${{ parameters.componentName }}
gpuTarget: ${{ job.target }}
os: ${{ job.os }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-links.yml
- ${{ if eq(job.os, 'ubuntu2204') }}:
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
gpuTarget: ${{ job.target }}
- ${{ if eq(parameters.unifiedBuild, False) }}:
- ${{ each job in parameters.jobMatrix.testJobs }}:
- job: ${{ parameters.componentName }}_test_${{ job.os }}_${{ job.target }}
dependsOn: ${{ parameters.componentName }}_build_${{ job.os }}_${{ job.target }}
condition:
and(succeeded(),
eq(variables['ENABLE_${{ upper(job.target) }}_TESTS'], 'true'),
not(containsValue(split(variables['DISABLED_${{ upper(job.target) }}_TESTS'], ','), '${{ parameters.componentName }}')),
eq(${{ parameters.aggregatePipeline }}, False)
)
variables:
- group: common
- template: /.azuredevops/variables-global.yml
pool: ${{ job.target }}_test_pool
workspace:
clean: all
steps:
- checkout: none
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-other.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
packageManager: ${{ job.packageManager }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/preamble.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/local-artifact-download.yml
parameters:
preTargetFilter: ${{ parameters.componentName }}
gpuTarget: ${{ job.target }}
os: ${{ job.os }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmTestDependencies }}
gpuTarget: ${{ job.target }}
os: ${{ job.os }}
${{ if parameters.triggerDownstreamJobs }}:
downstreamAggregateNames: ${{ parameters.downstreamAggregateNames }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/gpu-diagnostics.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/test.yml
parameters:
componentName: ${{ parameters.componentName }}
testDir: $(Agent.BuildDirectory)/rocm/share/hsa-amd-aqlprofile/
testExecutable: ./run_tests.sh
testParameters: ''
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
environment: test
gpuTarget: ${{ job.target }}

View File

@@ -77,6 +77,7 @@ parameters:
- clr
- hipBLAS-common
- llvm-project
- rocm-cmake
- rocminfo
- rocm_smi_lib
- rocprofiler-register
@@ -144,7 +145,7 @@ jobs:
aptPackages: ${{ parameters.aptPackages }}
pipModules: ${{ parameters.pipModules }}
packageManager: ${{ job.packageManager }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-cmake-latest.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-cmake-custom.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/preamble.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/checkout.yml
parameters:

View File

@@ -40,6 +40,7 @@ parameters:
- gfortran
- libgfortran5
- libopenblas-dev
- liblapack-dev
- name: pipModules
type: object
default:
@@ -53,6 +54,7 @@ parameters:
- hipSPARSE
- llvm-project
- rocBLAS
- rocm-cmake
- rocm_smi_lib
- rocminfo
- rocprofiler-register
@@ -66,6 +68,7 @@ parameters:
- llvm-project
- hipBLAS-common
- hipBLASLt
- rocm-cmake
- rocBLAS
- rocminfo
- rocprofiler-register
@@ -109,7 +112,7 @@ jobs:
aptPackages: ${{ parameters.aptPackages }}
pipModules: ${{ parameters.pipModules }}
packageManager: ${{ job.packageManager }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-cmake-latest.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-cmake-custom.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/preamble.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/checkout.yml
parameters:
@@ -125,10 +128,13 @@ jobs:
aggregatePipeline: ${{ parameters.aggregatePipeline }}
${{ if parameters.triggerDownstreamJobs }}:
downstreamAggregateNames: ${{ parameters.downstreamAggregateNames }}
# NOTE: content between `---` is for transition support between old/new build systems
# and should be removed once transition is complete.
# -----------------------------
# Build and install gtest and lapack
# $(Pipeline.Workspace)/deps is a temporary folder for the build process
# $(Pipeline.Workspace)/s/deps is part of the hipSPARSELt repo
- script: mkdir $(Pipeline.Workspace)/deps
- script: mkdir -p $(Pipeline.Workspace)/deps
displayName: Create temp folder for external dependencies
# hipSPARSELt already has a CMake script for external deps, so we can just run that
# https://github.com/ROCm/hipSPARSELt/blob/develop/deps/CMakeLists.txt
@@ -144,22 +150,35 @@ jobs:
- script: sudo make install
displayName: Install hipSPARSELt external dependencies
workingDirectory: $(Pipeline.Workspace)/deps
# -----------------------------
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/build-cmake.yml
parameters:
os: ${{ job.os }}
# NOTE: the following options are old build only
# and can be removed after full transition to new build
# -DAMDGPU_TARGETS=${{ job.target }}
# -DCMAKE_Fortran_COMPILER=f95
# -DTensile_LOGIC=
# -DTensile_CPU_THREADS=
# -DTensile_LIBRARY_FORMAT=msgpack
# -DROCM_PATH=$(Agent.BuildDirectory)/rocm
# -DBUILD_CLIENTS_TESTS=ON
# -DBUILD_USE_LOCAL_TENSILE=OFF
extraBuildFlags: >-
-DCMAKE_BUILD_TYPE=Release
-DCMAKE_CXX_COMPILER=$(Agent.BuildDirectory)/rocm/llvm/bin/amdclang++
-DCMAKE_C_COMPILER=$(Agent.BuildDirectory)/rocm/llvm/bin/amdclang
-DCMAKE_Fortran_COMPILER=f95
-DCMAKE_PREFIX_PATH="$(Agent.BuildDirectory)/rocm"
-DGPU_TARGETS=${{ job.target }}
-DAMDGPU_TARGETS=${{ job.target }}
-DCMAKE_Fortran_COMPILER=f95
-DTensile_LOGIC=
-DTensile_CPU_THREADS=
-DTensile_LIBRARY_FORMAT=msgpack
-DCMAKE_PREFIX_PATH="$(Agent.BuildDirectory)/rocm"
-DROCM_PATH=$(Agent.BuildDirectory)/rocm
-DBUILD_CLIENTS_TESTS=ON
-DBUILD_USE_LOCAL_TENSILE=OFF
-DHIPSPARSELT_ENABLE_FETCH=ON
-GNinja
${{ if ne(parameters.sparseCheckoutDir, '') }}:
cmakeSourceDir: $(Build.SourcesDirectory)/projects/hipsparselt

View File

@@ -1,10 +1,29 @@
parameters:
- name: componentName
type: string
default: hipTensor
- name: checkoutRepo
type: string
default: 'self'
- name: checkoutRef
type: string
default: ''
# monorepo related parameters
- name: sparseCheckoutDir
type: string
default: ''
- name: triggerDownstreamJobs
type: boolean
default: false
- name: downstreamAggregateNames
type: string
default: ''
- name: buildDependsOn
type: object
default: null
- name: unifiedBuild
type: boolean
default: false
# set to true if doing full build of ROCm stack
# and dependencies are pulled from same pipeline
- name: aggregatePipeline
@@ -51,7 +70,7 @@ parameters:
jobs:
- ${{ each job in parameters.jobMatrix.buildJobs }}:
- job: hipTensor_build_${{ job.target }}
- job: ${{ parameters.componentName }}_build_${{ job.target }}
variables:
- group: common
- template: /.azuredevops/variables-global.yml
@@ -66,17 +85,21 @@ jobs:
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/checkout.yml
parameters:
checkoutRepo: ${{ parameters.checkoutRepo }}
sparseCheckoutDir: ${{ parameters.sparseCheckoutDir }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmDependencies }}
gpuTarget: ${{ job.target }}
aggregatePipeline: ${{ parameters.aggregatePipeline }}
${{ if parameters.triggerDownstreamJobs }}:
downstreamAggregateNames: ${{ parameters.downstreamAggregateNames }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/build-cmake.yml
parameters:
extraBuildFlags: >-
-DCMAKE_PREFIX_PATH=$(Agent.BuildDirectory)/rocm;$(Agent.BuildDirectory)/rocm/llvm
-DCMAKE_CXX_COMPILER=$(Agent.BuildDirectory)/rocm/llvm/bin/amdclang++
-DCMAKE_C_COMPILER=$(Agent.BuildDirectory)/rocm/llvm/bin/amdclang
-DROCM_PATH=$(Agent.BuildDirectory)/rocm
-DCMAKE_BUILD_TYPE=Release
-DHIPTENSOR_BUILD_TESTS=ON
@@ -84,9 +107,12 @@ jobs:
-GNinja
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/manifest.yml
parameters:
componentName: ${{ parameters.componentName }}
sparseCheckoutDir: ${{ parameters.sparseCheckoutDir }}
gpuTarget: ${{ job.target }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-upload.yml
parameters:
componentName: ${{ parameters.componentName }}
gpuTarget: ${{ job.target }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-links.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
@@ -94,44 +120,47 @@ jobs:
aptPackages: ${{ parameters.aptPackages }}
gpuTarget: ${{ job.target }}
- ${{ each job in parameters.jobMatrix.testJobs }}:
- job: hipTensor_test_${{ job.target }}
timeoutInMinutes: 90
dependsOn: hipTensor_build_${{ job.target }}
condition:
and(succeeded(),
eq(variables['ENABLE_${{ upper(job.target) }}_TESTS'], 'true'),
not(containsValue(split(variables['DISABLED_${{ upper(job.target) }}_TESTS'], ','), variables['Build.DefinitionName'])),
eq(${{ parameters.aggregatePipeline }}, False)
)
variables:
- group: common
- template: /.azuredevops/variables-global.yml
pool: ${{ job.target }}_test_pool
workspace:
clean: all
steps:
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-other.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/preamble.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/local-artifact-download.yml
parameters:
gpuTarget: ${{ job.target }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-aqlprofile.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmTestDependencies }}
gpuTarget: ${{ job.target }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/gpu-diagnostics.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/test.yml
parameters:
componentName: hipTensor
testDir: '$(Agent.BuildDirectory)/rocm/bin/hiptensor'
testParameters: '-E ".*-extended" --output-on-failure --force-new-ctest-process --output-junit test_output.xml'
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
environment: test
gpuTarget: ${{ job.target }}
- ${{ if eq(parameters.unifiedBuild, False) }}:
- ${{ each job in parameters.jobMatrix.testJobs }}:
- job: ${{ parameters.componentName }}_test_${{ job.target }}
timeoutInMinutes: 90
dependsOn: ${{ parameters.componentName }}_build_${{ job.target }}
condition:
and(succeeded(),
eq(variables['ENABLE_${{ upper(job.target) }}_TESTS'], 'true'),
not(containsValue(split(variables['DISABLED_${{ upper(job.target) }}_TESTS'], ','), '${{ parameters.componentName }}')),
eq(${{ parameters.aggregatePipeline }}, False)
)
variables:
- group: common
- template: /.azuredevops/variables-global.yml
pool: ${{ job.target }}_test_pool
workspace:
clean: all
steps:
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-other.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/preamble.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/local-artifact-download.yml
parameters:
gpuTarget: ${{ job.target }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-aqlprofile.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmTestDependencies }}
gpuTarget: ${{ job.target }}
${{ if parameters.triggerDownstreamJobs }}:
downstreamAggregateNames: ${{ parameters.downstreamAggregateNames }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/gpu-diagnostics.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/test.yml
parameters:
componentName: ${{ parameters.componentName }}
testDir: '$(Agent.BuildDirectory)/rocm/bin/hiptensor'
testParameters: '-E ".*-extended" --extra-verbose --output-on-failure --force-new-ctest-process --output-junit test_output.xml'
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
environment: test
gpuTarget: ${{ job.target }}

View File

@@ -71,7 +71,7 @@ jobs:
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-other.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-cmake-latest.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-cmake-custom.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/preamble.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/checkout.yml
parameters:

View File

@@ -39,10 +39,16 @@ parameters:
- python3
- python3-dev
- python3-pip
- python3-venv
- libgtest-dev
- libboost-filesystem-dev
- libboost-program-options-dev
- name: pipModules
type: object
default:
- nanobind>=2.0.0
- pytest
- pytest-cov
- name: rocmDependencies
type: object
default:
@@ -69,8 +75,10 @@ parameters:
- { os: ubuntu2204, packageManager: apt }
- { os: almalinux8, packageManager: dnf }
testJobs:
- { os: ubuntu2204, packageManager: apt, target: gfx942 }
- { os: ubuntu2204, packageManager: apt, target: gfx90a }
# - { os: ubuntu2204, packageManager: apt, target: gfx1100 }
# - { os: ubuntu2204, packageManager: apt, target: gfx1151 }
# - { os: ubuntu2204, packageManager: apt, target: gfx1201 }
- name: downstreamComponentMatrix
type: object
default:
@@ -107,8 +115,17 @@ jobs:
aptPackages: ${{ parameters.aptPackages }}
pipModules: ${{ parameters.pipModules }}
packageManager: ${{ job.packageManager }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-cmake-latest.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-cmake-custom.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/preamble.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-vendor.yml
parameters:
dependencyList:
- gtest
- ${{ if ne(job.os, 'almalinux8') }}:
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-vendor.yml
parameters:
dependencyList:
- catch2
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/checkout.yml
parameters:
checkoutRepo: ${{ parameters.checkoutRepo }}
@@ -125,11 +142,12 @@ jobs:
parameters:
os: ${{ job.os }}
extraBuildFlags: >-
-DCMAKE_PREFIX_PATH=$(Agent.BuildDirectory)/rocm
-DCMAKE_PREFIX_PATH=$(Agent.BuildDirectory)/rocm;$(Agent.BuildDirectory)/vendor
-DCMAKE_CXX_COMPILER=$(Agent.BuildDirectory)/rocm/llvm/bin/amdclang++
-DORIGAMI_BUILD_SHARED_LIBS=ON
-DORIGAMI_ENABLE_PYTHON=ON
-DORIGAMI_BUILD_TESTING=ON
-DORIGAMI_ENABLE_FETCH=ON
-GNinja
- ${{ if ne(job.os, 'almalinux8') }}:
- task: PublishPipelineArtifact@1
@@ -162,7 +180,6 @@ jobs:
dependsOn: origami_build_${{ job.os }}
condition:
and(succeeded(),
eq(variables['ENABLE_${{ upper(job.target) }}_TESTS'], 'true'),
not(containsValue(split(variables['DISABLED_${{ upper(job.target) }}_TESTS'], ','), '${{ parameters.componentName }}')),
eq(${{ parameters.aggregatePipeline }}, False)
)
@@ -173,30 +190,30 @@ jobs:
workspace:
clean: all
steps:
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/checkout.yml
parameters:
checkoutRepo: ${{ parameters.checkoutRepo }}
sparseCheckoutDir: ${{ parameters.sparseCheckoutDir }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-other.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
pipModules: ${{ parameters.pipModules }}
packageManager: ${{ job.packageManager }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-cmake-custom.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/preamble.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/checkout.yml
parameters:
checkoutRepo: ${{ parameters.checkoutRepo }}
sparseCheckoutDir: ${{ parameters.sparseCheckoutDir }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-vendor.yml
parameters:
dependencyList:
- gtest
- ${{ if ne(job.os, 'almalinux8') }}:
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-vendor.yml
parameters:
dependencyList:
- catch2
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/local-artifact-download.yml
parameters:
preTargetFilter: ${{ parameters.componentName }}
os: ${{ job.os }}
- task: DownloadPipelineArtifact@2
displayName: 'Download Build Directory Artifact'
inputs:
artifact: '${{ parameters.componentName }}_${{ job.os }}_build_dir'
path: '$(Agent.BuildDirectory)/s/build'
- task: DownloadPipelineArtifact@2
displayName: 'Download Python Source Artifact'
inputs:
artifact: '${{ parameters.componentName }}_${{ job.os }}_python_src'
path: '$(Agent.BuildDirectory)/s/python'
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
@@ -205,17 +222,72 @@ jobs:
gpuTarget: ${{ job.target }}
${{ if parameters.triggerDownstreamJobs }}:
downstreamAggregateNames: ${{ parameters.downstreamAggregateNames }}
- task: CMake@1
displayName: 'Origami Test CMake Configuration'
inputs:
cmakeArgs: >-
-DCMAKE_PREFIX_PATH=$(Agent.BuildDirectory)/rocm;$(Agent.BuildDirectory)/vendor
-DCMAKE_CXX_COMPILER=$(Agent.BuildDirectory)/rocm/llvm/bin/amdclang++
-DORIGAMI_BUILD_SHARED_LIBS=ON
-DORIGAMI_ENABLE_PYTHON=ON
-DORIGAMI_BUILD_TESTING=ON
-GNinja
$(Agent.BuildDirectory)/s
- task: Bash@3
displayName: 'Build Origami Tests and Python Bindings'
inputs:
targetType: inline
workingDirectory: build
script: |
cmake --build . --target origami-tests origami_python -- -j$(nproc)
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/gpu-diagnostics.yml
- script: |
export PYTHONPATH=$(Agent.BuildDirectory)/s/build/python:$PYTHONPATH
echo "--- Running origami_test.py ---"
python3 $(Agent.BuildDirectory)/s/python/origami_test.py
echo "--- Running origami_grid_test.py ---"
python3 $(Agent.BuildDirectory)/s/python/origami_grid_test.py
displayName: 'Run Python Binding Tests'
condition: succeeded()
# Run tests using CTest (discovers and runs both C++ and Python tests)
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/test.yml
parameters:
componentName: ${{ parameters.componentName }}
os: ${{ job.os }}
testDir: 'build'
testParameters: '--output-on-failure --force-new-ctest-process --output-junit test_output.xml'
# Test pip install workflow
# - task: Bash@3
# displayName: 'Test Pip Install'
# inputs:
# targetType: inline
# script: |
# set -e
# echo "==================================================================="
# echo "Testing pip install workflow (pip install -e .)"
# echo "==================================================================="
# # Set environment variables for pip install CMake build
# export ROCM_PATH=$(Agent.BuildDirectory)/rocm
# export CMAKE_PREFIX_PATH=$(Agent.BuildDirectory)/rocm:$(Agent.BuildDirectory)/vendor
# export CMAKE_CXX_COMPILER=$(Agent.BuildDirectory)/rocm/llvm/bin/amdclang++
# echo "ROCM_PATH: $ROCM_PATH"
# echo "CMAKE_PREFIX_PATH: $CMAKE_PREFIX_PATH"
# echo "CMAKE_CXX_COMPILER: $CMAKE_CXX_COMPILER"
# echo ""
# # Install from source directory
# cd "$(Agent.BuildDirectory)/s/python"
# pip install -e .
# # Verify import works
# echo ""
# echo "Verifying origami can be imported..."
# python3 -c "import origami; print('✓ Successfully imported origami')"
# # Run pytest on installed package
# echo ""
# echo "Running pytest tests..."
# python3 -m pytest tests/ -v -m "not slow" --tb=short
# echo ""
# echo "==================================================================="
# echo "Pip install test completed successfully"
# echo "==================================================================="
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}

View File

@@ -1,10 +1,35 @@
parameters:
- name: componentName
type: string
default: rccl
- name: checkoutRepo
type: string
default: 'self'
- name: checkoutRef
type: string
default: ''
- name: systemsRepo
type: string
default: systems_repo
- name: systemsSparseCheckoutDir
type: string
default: 'projects/rocprofiler-sdk'
# monorepo related parameters
- name: sparseCheckoutDir
type: string
default: ''
- name: triggerDownstreamJobs
type: boolean
default: false
- name: downstreamAggregateNames
type: string
default: ''
- name: buildDependsOn
type: object
default: null
- name: unifiedBuild
type: boolean
default: false
# set to true if doing full build of ROCm stack
# and dependencies are pulled from same pipeline
- name: aggregatePipeline
@@ -57,37 +82,52 @@ parameters:
type: object
default:
buildJobs:
- gfx942:
target: gfx942
- gfx90a:
target: gfx90a
- { os: ubuntu2204, packageManager: apt, target: gfx942 }
- { os: ubuntu2204, packageManager: apt, target: gfx90a }
testJobs:
- gfx942:
target: gfx942
- gfx90a:
target: gfx90a
- { os: ubuntu2204, packageManager: apt, target: gfx942 }
- { os: ubuntu2204, packageManager: apt, target: gfx90a }
- name: downstreamComponentMatrix
type: object
default:
- rocprofiler-sdk:
name: rocprofiler-sdk
sparseCheckoutDir: ''
skipUnifiedBuild: 'false'
buildDependsOn:
- rccl_build
jobs:
- ${{ each job in parameters.jobMatrix.buildJobs }}:
- job: rccl_build_${{ job.target }}
timeoutInMinutes: 90
- job: ${{ parameters.componentName }}_build_${{ job.os }}_${{ job.target }}
${{ if parameters.buildDependsOn }}:
dependsOn:
- ${{ each build in parameters.buildDependsOn }}:
- ${{ build }}_${{ job.os }}_${{ job.target }}
timeoutInMinutes: 120
variables:
- group: common
- template: /.azuredevops/variables-global.yml
- name: HIP_ROCCLR_HOME
value: $(Build.BinariesDirectory)/rocm
pool: ${{ variables.MEDIUM_BUILD_POOL }}
${{ if eq(job.os, 'almalinux8') }}:
container:
image: rocmexternalcicd.azurecr.io/manylinux228:latest
endpoint: ContainerService3
workspace:
clean: all
steps:
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-other.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-cmake-latest.yml
packageManager: ${{ job.packageManager }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-cmake-custom.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/preamble.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/checkout.yml
parameters:
checkoutRepo: ${{ parameters.checkoutRepo }}
sparseCheckoutDir: ${{ parameters.sparseCheckoutDir }}
submoduleBehaviour: recursive
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-vendor.yml
parameters:
@@ -97,10 +137,14 @@ jobs:
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmDependencies }}
os: ${{ job.os }}
gpuTarget: ${{ job.target }}
aggregatePipeline: ${{ parameters.aggregatePipeline }}
${{ if parameters.triggerDownstreamJobs }}:
downstreamAggregateNames: ${{ parameters.downstreamAggregateNames }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/build-cmake.yml
parameters:
os: ${{ job.os }}
extraBuildFlags: >-
-DCMAKE_CXX_COMPILER=$(Agent.BuildDirectory)/rocm/bin/hipcc
-DCMAKE_C_COMPILER=$(Agent.BuildDirectory)/rocm/bin/hipcc
@@ -112,58 +156,87 @@ jobs:
-GNinja
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/manifest.yml
parameters:
componentName: ${{ parameters.componentName }}
sparseCheckoutDir: ${{ parameters.sparseCheckoutDir }}
os: ${{ job.os }}
gpuTarget: ${{ job.target }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-upload.yml
parameters:
componentName: ${{ parameters.componentName }}
os: ${{ job.os }}
gpuTarget: ${{ job.target }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-links.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
gpuTarget: ${{ job.target }}
extraEnvVars:
- HIP_ROCCLR_HOME:::/home/user/workspace/rocm
installLatestCMake: true
- ${{ if eq(job.os, 'ubuntu2204') }}:
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
gpuTarget: ${{ job.target }}
extraEnvVars:
- HIP_ROCCLR_HOME:::/home/user/workspace/rocm
installLatestCMake: true
- ${{ each job in parameters.jobMatrix.testJobs }}:
- job: rccl_test_${{ job.target }}
timeoutInMinutes: 120
dependsOn: rccl_build_${{ job.target }}
condition:
and(succeeded(),
eq(variables['ENABLE_${{ upper(job.target) }}_TESTS'], 'true'),
not(containsValue(split(variables['DISABLED_${{ upper(job.target) }}_TESTS'], ','), variables['Build.DefinitionName'])),
eq(${{ parameters.aggregatePipeline }}, False)
)
variables:
- group: common
- template: /.azuredevops/variables-global.yml
pool: ${{ job.target }}_test_pool
workspace:
clean: all
steps:
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-other.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/preamble.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/local-artifact-download.yml
parameters:
gpuTarget: ${{ job.target }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-aqlprofile.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmTestDependencies }}
gpuTarget: ${{ job.target }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/gpu-diagnostics.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/test.yml
parameters:
componentName: rccl
testDir: '$(Agent.BuildDirectory)/rocm/bin'
testExecutable: './rccl-UnitTests'
testParameters: '--gtest_output=xml:./test_output.xml --gtest_color=yes'
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
environment: test
gpuTarget: ${{ job.target }}
- ${{ if eq(parameters.unifiedBuild, False) }}:
- ${{ each job in parameters.jobMatrix.testJobs }}:
- job: ${{ parameters.componentName }}_test_${{ job.os }}_${{ job.target }}
timeoutInMinutes: 120
dependsOn: ${{ parameters.componentName }}_build_${{ job.os }}_${{ job.target }}
condition:
and(succeeded(),
eq(variables['ENABLE_${{ upper(job.target) }}_TESTS'], 'true'),
not(containsValue(split(variables['DISABLED_${{ upper(job.target) }}_TESTS'], ','), '${{ parameters.componentName }}')),
eq(${{ parameters.aggregatePipeline }}, False)
)
variables:
- group: common
- template: /.azuredevops/variables-global.yml
pool: ${{ job.target }}_test_pool
workspace:
clean: all
steps:
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-other.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/preamble.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/local-artifact-download.yml
parameters:
preTargetFilter: ${{ parameters.componentName }}
os: ${{ job.os }}
gpuTarget: ${{ job.target }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-aqlprofile.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmTestDependencies }}
os: ${{ job.os }}
gpuTarget: ${{ job.target }}
${{ if parameters.triggerDownstreamJobs }}:
downstreamAggregateNames: ${{ parameters.downstreamAggregateNames }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/gpu-diagnostics.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/test.yml
parameters:
componentName: ${{ parameters.componentName }}
os: ${{ job.os }}
testDir: '$(Agent.BuildDirectory)/rocm/bin'
testExecutable: './rccl-UnitTests'
testParameters: '--gtest_output=xml:./test_output.xml --gtest_color=yes'
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
environment: test
gpuTarget: ${{ job.target }}
- ${{ if parameters.triggerDownstreamJobs }}:
- ${{ each component in parameters.downstreamComponentMatrix }}:
- ${{ if not(and(parameters.unifiedBuild, eq(component.skipUnifiedBuild, 'true'))) }}:
- template: /.azuredevops/components/${{ component.name }}.yml@pipelines_repo
parameters:
checkoutRepo: ${{ parameters.systemsRepo }}
sparseCheckoutDir: ${{ parameters.systemsSparseCheckoutDir }}
triggerDownstreamJobs: true
unifiedBuild: ${{ parameters.unifiedBuild }}
${{ if parameters.unifiedBuild }}:
buildDependsOn: ${{ component.unifiedBuild.buildDependsOn }}
downstreamAggregateNames: ${{ parameters.downstreamAggregateNames }}+${{ component.unifiedBuild.downstreamAggregateNames }}
${{ else }}:
buildDependsOn: ${{ component.buildDependsOn }}
downstreamAggregateNames: ${{ parameters.downstreamAggregateNames }}+${{ parameters.componentName }}

View File

@@ -1,10 +1,29 @@
parameters:
- name: componentName
type: string
default: rdc
- name: checkoutRepo
type: string
default: 'self'
- name: checkoutRef
type: string
default: ''
# monorepo related parameters
- name: sparseCheckoutDir
type: string
default: ''
- name: triggerDownstreamJobs
type: boolean
default: false
- name: downstreamAggregateNames
type: string
default: ''
- name: buildDependsOn
type: object
default: null
- name: unifiedBuild
type: boolean
default: false
# set to true if doing full build of ROCm stack
# and dependencies are pulled from same pipeline
- name: aggregatePipeline
@@ -33,6 +52,7 @@ parameters:
- clr
- hipBLAS-common
- hipBLASLt
- hipRAND
- llvm-project
- rocBLAS
- rocm-cmake
@@ -43,6 +63,7 @@ parameters:
- rocprofiler
- rocprofiler-register
- rocprofiler-sdk
- rocRAND
- ROCR-Runtime
- name: rocmTestDependencies
type: object
@@ -74,7 +95,11 @@ parameters:
jobs:
- ${{ each job in parameters.jobMatrix.buildJobs }}:
- job: rdc_build_${{ job.target }}
- job: ${{ parameters.componentName }}_build_${{ job.target }}
${{ if parameters.buildDependsOn }}:
dependsOn:
- ${{ each build in parameters.buildDependsOn }}:
- ${{ build }}_${{ job.target }}
variables:
- group: common
- template: /.azuredevops/variables-global.yml
@@ -85,16 +110,22 @@ jobs:
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-other.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-cmake-custom.yml
parameters:
cmakeVersion: '3.25.0'
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/preamble.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/checkout.yml
parameters:
checkoutRepo: ${{ parameters.checkoutRepo }}
sparseCheckoutDir: ${{ parameters.sparseCheckoutDir }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmDependencies }}
gpuTarget: ${{ job.target }}
aggregatePipeline: ${{ parameters.aggregatePipeline }}
${{ if parameters.triggerDownstreamJobs }}:
downstreamAggregateNames: ${{ parameters.downstreamAggregateNames }}
# Build grpc
- task: Bash@3
displayName: 'git clone grpc'
@@ -104,6 +135,7 @@ jobs:
workingDirectory: $(Build.SourcesDirectory)
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/build-cmake.yml
parameters:
componentName: ${{ parameters.componentName }}
cmakeBuildDir: $(Build.SourcesDirectory)/grpc/build
cmakeSourceDir: $(Build.SourcesDirectory)/grpc
installDir: $(Build.SourcesDirectory)/bin
@@ -117,6 +149,7 @@ jobs:
-GNinja
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/build-cmake.yml
parameters:
componentName: ${{ parameters.componentName }}
extraBuildFlags: >-
-DCMAKE_PREFIX_PATH=$(Agent.BuildDirectory)/rocm
-DGRPC_ROOT="$(Build.SourcesDirectory)/bin"
@@ -126,9 +159,12 @@ jobs:
-DAMDGPU_TARGETS=${{ job.target }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/manifest.yml
parameters:
componentName: ${{ parameters.componentName }}
sparseCheckoutDir: ${{ parameters.sparseCheckoutDir }}
gpuTarget: ${{ job.target }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-upload.yml
parameters:
componentName: ${{ parameters.componentName }}
gpuTarget: ${{ job.target }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-links.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
@@ -136,60 +172,64 @@ jobs:
aptPackages: ${{ parameters.aptPackages }}
gpuTarget: ${{ job.target }}
- ${{ each job in parameters.jobMatrix.testJobs }}:
- job: rdc_test_${{ job.target }}
dependsOn: rdc_build_${{ job.target }}
condition:
and(succeeded(),
eq(variables['ENABLE_${{ upper(job.target) }}_TESTS'], 'true'),
not(containsValue(split(variables['DISABLED_${{ upper(job.target) }}_TESTS'], ','), variables['Build.DefinitionName'])),
eq(${{ parameters.aggregatePipeline }}, False)
)
variables:
- group: common
- template: /.azuredevops/variables-global.yml
- name: ROCM_PATH
value: $(Agent.BuildDirectory)/rocm
- name: ROCM_DIR
value: $(Agent.BuildDirectory)/rocm
pool: ${{ job.target }}_test_pool
workspace:
clean: all
steps:
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-other.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/preamble.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/local-artifact-download.yml
parameters:
gpuTarget: ${{ job.target }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-aqlprofile.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmTestDependencies }}
gpuTarget: ${{ job.target }}
- task: Bash@3
displayName: Setup test environment
inputs:
targetType: inline
script: |
sudo ln -s $(Agent.BuildDirectory)/rocm/bin/rdcd /usr/sbin/rdcd
echo $(Agent.BuildDirectory)/rocm/lib/rdc/grpc/lib | sudo tee /etc/ld.so.conf.d/grpc.conf
sudo ldconfig -v
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/gpu-diagnostics.yml
- task: Bash@3
displayName: Test rdc
inputs:
targetType: inline
script: >-
$(Agent.BuildDirectory)/rocm/share/rdc/rdctst_tests/rdctst
--batch_mode
--start_rdcd
--unauth_comm
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
environment: test
gpuTarget: ${{ job.target }}
extraPaths: /home/user/workspace/rocm/bin
- ${{ if eq(parameters.unifiedBuild, False) }}:
- ${{ each job in parameters.jobMatrix.testJobs }}:
- job: ${{ parameters.componentName }}_test_${{ job.target }}
dependsOn: ${{ parameters.componentName }}_build_${{ job.target }}
condition:
and(succeeded(),
eq(variables['ENABLE_${{ upper(job.target) }}_TESTS'], 'true'),
not(containsValue(split(variables['DISABLED_${{ upper(job.target) }}_TESTS'], ','), '${{ parameters.componentName }}')),
eq(${{ parameters.aggregatePipeline }}, False)
)
variables:
- group: common
- template: /.azuredevops/variables-global.yml
- name: ROCM_PATH
value: $(Agent.BuildDirectory)/rocm
- name: ROCM_DIR
value: $(Agent.BuildDirectory)/rocm
pool: ${{ job.target }}_test_pool
workspace:
clean: all
steps:
- checkout: none
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-other.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/preamble.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/local-artifact-download.yml
parameters:
gpuTarget: ${{ job.target }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-aqlprofile.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmTestDependencies }}
gpuTarget: ${{ job.target }}
${{ if parameters.triggerDownstreamJobs }}:
downstreamAggregateNames: ${{ parameters.downstreamAggregateNames }}
- task: Bash@3
displayName: Setup test environment
inputs:
targetType: inline
script: |
sudo ln -s $(Agent.BuildDirectory)/rocm/bin/rdcd /usr/sbin/rdcd
echo $(Agent.BuildDirectory)/rocm/lib/rdc/grpc/lib | sudo tee /etc/ld.so.conf.d/grpc.conf
sudo ldconfig -v
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/gpu-diagnostics.yml
- task: Bash@3
displayName: Test rdc
inputs:
targetType: inline
script: >-
$(Agent.BuildDirectory)/rocm/share/rdc/rdctst_tests/rdctst
--batch_mode
--start_rdcd
--unauth_comm
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
environment: test
gpuTarget: ${{ job.target }}
extraPaths: /home/user/workspace/rocm/bin

View File

@@ -70,6 +70,7 @@ parameters:
- hipBLAS-common
- hipBLASLt
- llvm-project
- rocm-cmake
- rocminfo
- rocprofiler-register
- rocm_smi_lib
@@ -154,7 +155,7 @@ jobs:
aptPackages: ${{ parameters.aptPackages }}
pipModules: ${{ parameters.pipModules }}
packageManager: ${{ job.packageManager }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-cmake-latest.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-cmake-custom.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/preamble.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/checkout.yml
parameters:

View File

@@ -210,7 +210,7 @@ jobs:
parameters:
componentName: ${{ parameters.componentName }}
testDir: '$(Agent.BuildDirectory)/rocm/bin/rocprim'
extraTestParameters: '-I ${{ job.shard }},,${{ job.shardCount }} -E device_merge_inplace'
extraTestParameters: '-I ${{ job.shard }},,${{ job.shardCount }}'
os: ${{ job.os }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:

View File

@@ -1,10 +1,29 @@
parameters:
- name: componentName
type: string
default: rocWMMA
- name: checkoutRepo
type: string
default: 'self'
- name: checkoutRef
type: string
default: ''
# monorepo related parameters
- name: sparseCheckoutDir
type: string
default: ''
- name: triggerDownstreamJobs
type: boolean
default: false
- name: downstreamAggregateNames
type: string
default: ''
- name: buildDependsOn
type: object
default: null
- name: unifiedBuild
type: boolean
default: false
# set to true if doing full build of ROCm stack
# and dependencies are pulled from same pipeline
- name: aggregatePipeline
@@ -66,7 +85,11 @@ parameters:
jobs:
- ${{ each job in parameters.jobMatrix.buildJobs }}:
- job: rocWMMA_build_${{ job.target }}
- job: ${{ parameters.componentName }}_build_${{ job.target }}
${{ if parameters.buildDependsOn }}:
dependsOn:
- ${{ each build in parameters.buildDependsOn }}:
- ${{ build }}_${{ job.target }}
variables:
- group: common
- template: /.azuredevops/variables-global.yml
@@ -81,6 +104,7 @@ jobs:
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/checkout.yml
parameters:
checkoutRepo: ${{ parameters.checkoutRepo }}
sparseCheckoutDir: ${{ parameters.sparseCheckoutDir }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
@@ -102,9 +126,12 @@ jobs:
# gfx1030 not supported in documentation
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/manifest.yml
parameters:
componentName: ${{ parameters.componentName }}
sparseCheckoutDir: ${{ parameters.sparseCheckoutDir }}
gpuTarget: ${{ job.target }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-upload.yml
parameters:
componentName: ${{ parameters.componentName }}
gpuTarget: ${{ job.target }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-links.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
@@ -112,43 +139,45 @@ jobs:
aptPackages: ${{ parameters.aptPackages }}
gpuTarget: ${{ job.target }}
- ${{ each job in parameters.jobMatrix.testJobs }}:
- job: rocWMMA_test_${{ job.target }}
timeoutInMinutes: 270
dependsOn: rocWMMA_build_${{ job.target }}
condition:
and(succeeded(),
eq(variables['ENABLE_${{ upper(job.target) }}_TESTS'], 'true'),
not(containsValue(split(variables['DISABLED_${{ upper(job.target) }}_TESTS'], ','), variables['Build.DefinitionName'])),
eq(${{ parameters.aggregatePipeline }}, False)
)
variables:
- group: common
- template: /.azuredevops/variables-global.yml
pool: ${{ job.target }}_test_pool
workspace:
clean: all
steps:
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-other.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/preamble.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/local-artifact-download.yml
parameters:
gpuTarget: ${{ job.target }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-aqlprofile.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmTestDependencies }}
gpuTarget: ${{ job.target }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/gpu-diagnostics.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/test.yml
parameters:
componentName: rocWMMA
testDir: '$(Agent.BuildDirectory)/rocm/bin/rocwmma'
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
environment: test
gpuTarget: ${{ job.target }}
- ${{ if eq(parameters.unifiedBuild, False) }}:
- ${{ each job in parameters.jobMatrix.testJobs }}:
- job: ${{ parameters.componentName }}_test_${{ job.target }}
timeoutInMinutes: 350
dependsOn: ${{ parameters.componentName }}_build_${{ job.target }}
condition:
and(succeeded(),
eq(variables['ENABLE_${{ upper(job.target) }}_TESTS'], 'true'),
not(containsValue(split(variables['DISABLED_${{ upper(job.target) }}_TESTS'], ','), '${{ parameters.componentName }}')),
eq(${{ parameters.aggregatePipeline }}, False)
)
variables:
- group: common
- template: /.azuredevops/variables-global.yml
pool: ${{ job.target }}_test_pool
workspace:
clean: all
steps:
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-other.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/preamble.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/local-artifact-download.yml
parameters:
preTargetFilter: ${{ parameters.componentName }}
gpuTarget: ${{ job.target }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-aqlprofile.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmTestDependencies }}
gpuTarget: ${{ job.target }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/gpu-diagnostics.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/test.yml
parameters:
componentName: ${{ parameters.componentName }}
testDir: '$(Agent.BuildDirectory)/rocm/bin/rocwmma'
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
environment: test
gpuTarget: ${{ job.target }}

View File

@@ -81,7 +81,7 @@ jobs:
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/test.yml
parameters:
componentName: rocm-cmake
testParameters: '-E "pass-version-parent" --output-on-failure --force-new-ctest-process --output-junit test_output.xml'
testParameters: '-E "pass-version-parent" --extra-verbose --output-on-failure --force-new-ctest-process --output-junit test_output.xml'
os: ${{ job.os }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/manifest.yml
parameters:

View File

@@ -14,16 +14,42 @@ parameters:
type: object
default:
- cmake
- libdw-dev
- libglfw3-dev
- libmsgpack-dev
- libopencv-dev
- libtbb-dev
- libtiff-dev
- libva-amdgpu-dev
- libva2-amdgpu
- mesa-amdgpu-va-drivers
- libavcodec-dev
- libavformat-dev
- libavutil-dev
- ninja-build
- python3-pip
- protobuf-compiler
- libprotoc-dev
- libopencv-dev
- name: pipModules
type: object
default:
- future==1.0.0
- pytz==2022.1
- numpy==1.23
- google==3.0.0
- protobuf==3.12.4
- onnx==1.12.0
- nnef==1.0.7
- name: rocmDependencies
type: object
default:
- AMDMIGraphX
- aomp
- aomp-extras
- clr
- half
- composable_kernel
- hipBLAS
- hipBLAS-common
- hipBLASLt
@@ -33,21 +59,37 @@ parameters:
- hipRAND
- hipSOLVER
- hipSPARSE
- hipTensor
- llvm-project
- MIOpen
- MIVisionX
- rocm_smi_lib
- rccl
- rocAL
- rocALUTION
- rocBLAS
- rocDecode
- rocFFT
- rocJPEG
- rocPRIM
- rocprofiler-register
- rocprofiler-sdk
- ROCR-Runtime
- rocRAND
- rocSOLVER
- rocSPARSE
- rocThrust
- rocWMMA
- rpp
- name: rocmTestDependencies
type: object
default:
- AMDMIGraphX
- aomp
- aomp-extras
- clr
- half
- composable_kernel
- hipBLAS
- hipBLAS-common
- hipBLASLt
@@ -57,18 +99,30 @@ parameters:
- hipRAND
- hipSOLVER
- hipSPARSE
- hipTensor
- llvm-project
- MIOpen
- MIVisionX
- rocm_smi_lib
- rccl
- rocAL
- rocALUTION
- rocBLAS
- rocDecode
- rocFFT
- rocminfo
- rocPRIM
- rocJPEG
- rocprofiler-register
- rocprofiler-sdk
- ROCR-Runtime
- rocRAND
- rocSOLVER
- rocSPARSE
- rocThrust
- roctracer
- rocWMMA
- rpp
- name: jobMatrix
type: object
@@ -97,6 +151,11 @@ jobs:
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-other.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
pipModules: ${{ parameters.pipModules }}
registerROCmPackages: true
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-cmake-custom.yml
parameters:
cmakeVersion: '3.25.0'
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/preamble.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/checkout.yml
parameters:
@@ -158,6 +217,10 @@ jobs:
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-other.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
registerROCmPackages: true
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-cmake-custom.yml
parameters:
cmakeVersion: '3.25.0'
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/preamble.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/checkout.yml
parameters:
@@ -188,5 +251,6 @@ jobs:
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
pipModules: ${{ parameters.pipModules }}
environment: test
gpuTarget: ${{ job.target }}

View File

@@ -43,9 +43,14 @@ parameters:
- ninja-build
- python3-pip
- python3-venv
- googletest
- libgtest-dev
- libgmock-dev
- libboost-filesystem-dev
- name: pipModules
type: object
default:
- msgpack
- joblib
- "packaging>=22.0"
- pytest
@@ -102,7 +107,7 @@ jobs:
workspace:
clean: all
steps:
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-cmake-latest.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-cmake-custom.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-other.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
@@ -147,6 +152,13 @@ jobs:
echo "##vso[task.prependpath]$USER_BASE/bin"
echo "##vso[task.setvariable variable=PytestCmakePath]$USER_BASE/share/Pytest/cmake"
displayName: Set cmake configure paths
- task: Bash@3
displayName: Add ROCm binaries to PATH
inputs:
targetType: inline
script: |
echo "##vso[task.prependpath]$(Agent.BuildDirectory)/rocm/bin"
echo "##vso[task.prependpath]$(Agent.BuildDirectory)/rocm/llvm/bin"
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/build-cmake.yml
parameters:
os: ${{ job.os }}

View File

@@ -65,6 +65,13 @@ parameters:
- pytest
- pytest-cov
- pytest-xdist
- name: rocmDependencies
type: object
default:
- clr
- llvm-project
- ROCR-Runtime
- rocprofiler-sdk
- name: rocmTestDependencies
type: object
default:
@@ -101,10 +108,12 @@ jobs:
${{ if parameters.buildDependsOn }}:
dependsOn:
- ${{ each build in parameters.buildDependsOn }}:
- ${{ build }}_${{ job.os }}_${{ job.target }}
- ${{ build }}_${{ job.target }}
variables:
- group: common
- template: /.azuredevops/variables-global.yml
- name: ROCM_PATH
value: $(Agent.BuildDirectory)/rocm
pool:
vmImage: ${{ variables.BASE_BUILD_POOL }}
workspace:
@@ -119,6 +128,14 @@ jobs:
parameters:
checkoutRepo: ${{ parameters.checkoutRepo }}
sparseCheckoutDir: ${{ parameters.sparseCheckoutDir }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmDependencies }}
gpuTarget: ${{ job.target }}
aggregatePipeline: ${{ parameters.aggregatePipeline }}
${{ if parameters.triggerDownstreamJobs }}:
downstreamAggregateNames: ${{ parameters.downstreamAggregateNames }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/build-cmake.yml
parameters:
extraBuildFlags: >-

View File

@@ -79,27 +79,27 @@ parameters:
type: object
default:
buildJobs:
- gfx942:
target: gfx942
- gfx90a:
target: gfx90a
- { os: ubuntu2204, packageManager: apt, target: gfx942 }
- { os: ubuntu2204, packageManager: apt, target: gfx90a }
testJobs:
- gfx942:
target: gfx942
- gfx90a:
target: gfx90a
- { os: ubuntu2204, packageManager: apt, target: gfx942 }
- { os: ubuntu2204, packageManager: apt, target: gfx90a }
jobs:
- ${{ each job in parameters.jobMatrix.buildJobs }}:
- job: rocprofiler_sdk_build_${{ job.target }}
- job: rocprofiler_sdk_build_${{ job.os }}_${{ job.target }}
${{ if parameters.buildDependsOn }}:
dependsOn:
- ${{ each build in parameters.buildDependsOn }}:
- ${{ build }}_${{ job.target }}
- ${{ build }}_${{ job.os}}_${{ job.target }}
variables:
- group: common
- template: /.azuredevops/variables-global.yml
pool: ${{ variables.MEDIUM_BUILD_POOL }}
${{ if eq(job.os, 'almalinux8') }}:
container:
image: rocmexternalcicd.azurecr.io/manylinux228:latest
endpoint: ContainerService3
workspace:
clean: all
steps:
@@ -107,6 +107,7 @@ jobs:
parameters:
aptPackages: ${{ parameters.aptPackages }}
pipModules: ${{ parameters.pipModules }}
packageManager: ${{ job.packageManager }}
registerROCmPackages: true
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/preamble.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/checkout.yml
@@ -118,6 +119,7 @@ jobs:
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmDependencies }}
os: ${{ job.os }}
gpuTarget: ${{ job.target }}
aggregatePipeline: ${{ parameters.aggregatePipeline }}
${{ if parameters.triggerDownstreamJobs }}:
@@ -132,6 +134,7 @@ jobs:
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/build-cmake.yml
parameters:
componentName: ${{ parameters.componentName }}
os: ${{ job.os }}
extraBuildFlags: >-
-DCMAKE_PREFIX_PATH=$(Agent.BuildDirectory)/rocm
-DROCPROFILER_BUILD_TESTS=ON
@@ -143,6 +146,7 @@ jobs:
parameters:
componentName: ${{ parameters.componentName }}
sparseCheckoutDir: ${{ parameters.sparseCheckoutDir }}
os: ${{ job.os }}
gpuTarget: ${{ job.target }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-upload.yml
parameters:
@@ -158,8 +162,8 @@ jobs:
- ${{ if eq(parameters.unifiedBuild, False) }}:
- ${{ each job in parameters.jobMatrix.testJobs }}:
- job: rocprofiler_sdk_test_${{ job.target }}
dependsOn: rocprofiler_sdk_build_${{ job.target }}
- job: rocprofiler_sdk_test_${{ job.os }}_${{ job.target }}
dependsOn: rocprofiler_sdk_build_${{ job.os }}_${{ job.target }}
condition:
and(succeeded(),
eq(variables['ENABLE_${{ upper(job.target) }}_TESTS'], 'true'),
@@ -177,6 +181,7 @@ jobs:
parameters:
aptPackages: ${{ parameters.aptPackages }}
pipModules: ${{ parameters.pipModules }}
packageManager: ${{ job.packageManager }}
registerROCmPackages: true
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/preamble.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/checkout.yml
@@ -188,6 +193,7 @@ jobs:
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmDependencies }}
os: ${{ job.os }}
gpuTarget: ${{ job.target }}
${{ if parameters.triggerDownstreamJobs }}:
downstreamAggregateNames: ${{ parameters.downstreamAggregateNames }}
@@ -202,6 +208,7 @@ jobs:
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/build-cmake.yml
parameters:
componentName: ${{ parameters.componentName }}
os: ${{ job.os }}
extraBuildFlags: >-
-DCMAKE_PREFIX_PATH=$(Agent.BuildDirectory)/rocm
-DROCPROFILER_BUILD_TESTS=ON
@@ -213,6 +220,8 @@ jobs:
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/test.yml
parameters:
componentName: ${{ parameters.componentName }}
os: ${{ job.os }}
testDir: $(Agent.BuildDirectory)/build
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}

View File

@@ -226,8 +226,11 @@ jobs:
echo "##vso[task.prependpath]$(Agent.BuildDirectory)/rocm/llvm/bin"
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/build-cmake.yml
parameters:
cmakeSourceDir: $(Agent.BuildDirectory)/s/projects/rocprofiler-systems
# build flags reference: https://rocm.docs.amd.com/projects/omnitrace/en/latest/install/install.html
extraBuildFlags: >-
-DCMAKE_INSTALL_PREFIX=$(Agent.BuildDirectory)/rocprofiler-systems
-DROCPROFSYS_USE_PYTHON=ON
-DROCPROFSYS_BUILD_TESTING=ON
-DROCPROFSYS_BUILD_DYNINST=ON
-DROCPROFSYS_BUILD_LIBUNWIND=ON
@@ -245,11 +248,13 @@ jobs:
displayName: Set up rocprofiler-systems env
inputs:
targetType: inline
script: source share/rocprofiler-systems/setup-env.sh
workingDirectory: build
script: source $(Agent.BuildDirectory)/rocprofiler-systems/share/rocprofiler-systems/setup-env.sh
workingDirectory: $(Agent.BuildDirectory)/rocprofiler-systems/share/rocprofiler-systems
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/test.yml
parameters:
componentName: ${{ parameters.componentName }}
testDir: $(Agent.BuildDirectory)/s/build/tests/
testParameters: '--output-on-failure'
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/manifest.yml
parameters:
gpuTarget: ${{ job.target }}

View File

@@ -0,0 +1,63 @@
parameters:
- name: checkoutRepo
type: string
default: 'self'
- name: checkoutRef
type: string
default: ''
- name: cli11Version
type: string
default: ''
- name: aptPackages
type: object
default:
- cmake
- git
- ninja-build
- name: jobMatrix
type: object
default:
buildJobs:
- { os: ubuntu2204, packageManager: apt}
- { os: almalinux8, packageManager: dnf}
jobs:
- ${{ each job in parameters.jobMatrix.buildJobs }}:
- job: cli11_${{ job.os }}
variables:
- group: common
- template: /.azuredevops/variables-global.yml
pool:
vmImage: 'ubuntu-22.04'
${{ if eq(job.os, 'almalinux8') }}:
container:
image: rocmexternalcicd.azurecr.io/manylinux228:latest
endpoint: ContainerService3
workspace:
clean: all
steps:
- checkout: none
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-other.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
packageManager: ${{ job.packageManager }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/preamble.yml
- task: Bash@3
displayName: Clone cli11 ${{ parameters.cli11Version }}
inputs:
targetType: inline
script: git clone https://github.com/CLIUtils/CLI11.git -b ${{ parameters.cli11Version }}
workingDirectory: $(Agent.BuildDirectory)
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/build-cmake.yml
parameters:
os: ${{ job.os }}
cmakeBuildDir: $(Agent.BuildDirectory)/CLI11/build
cmakeSourceDir: $(Agent.BuildDirectory)/CLI11
useAmdclang: false
extraBuildFlags: >-
-DCMAKE_BUILD_TYPE=Release
-GNinja
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-upload.yml
parameters:
os: ${{ job.os }}

View File

@@ -0,0 +1,66 @@
parameters:
- name: checkoutRepo
type: string
default: 'self'
- name: checkoutRef
type: string
default: ''
- name: yamlcppVersion
type: string
default: ''
- name: aptPackages
type: object
default:
- cmake
- git
- ninja-build
- name: jobMatrix
type: object
default:
buildJobs:
- { os: ubuntu2204, packageManager: apt}
- { os: almalinux8, packageManager: dnf}
jobs:
- ${{ each job in parameters.jobMatrix.buildJobs }}:
- job: yamlcpp_${{ job.os }}
variables:
- group: common
- template: /.azuredevops/variables-global.yml
pool:
vmImage: 'ubuntu-22.04'
${{ if eq(job.os, 'almalinux8') }}:
container:
image: rocmexternalcicd.azurecr.io/manylinux228:latest
endpoint: ContainerService3
workspace:
clean: all
steps:
- checkout: none
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-other.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
packageManager: ${{ job.packageManager }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/preamble.yml
- task: Bash@3
displayName: Clone yaml-cpp ${{ parameters.yamlcppVersion }}
inputs:
targetType: inline
script: git clone https://github.com/jbeder/yaml-cpp.git -b ${{ parameters.yamlcppVersion }}
workingDirectory: $(Agent.BuildDirectory)
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/build-cmake.yml
parameters:
os: ${{ job.os }}
cmakeBuildDir: $(Agent.BuildDirectory)/yaml-cpp/build
cmakeSourceDir: $(Agent.BuildDirectory)/yaml-cpp
useAmdclang: false
extraBuildFlags: >-
-DCMAKE_BUILD_TYPE=Release
-DYAML_CPP_BUILD_TOOLS=OFF
-DYAML_BUILD_SHARED_LIBS=OFF
-DYAML_CPP_INSTALL=ON
-GNinja
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-upload.yml
parameters:
os: ${{ job.os }}

View File

@@ -0,0 +1,23 @@
variables:
- group: common
- template: /.azuredevops/variables-global.yml
parameters:
- name: cli11Version
type: string
default: "main"
resources:
repositories:
- repository: pipelines_repo
type: github
endpoint: ROCm
name: ROCm/ROCm
trigger: none
pr: none
jobs:
- template: ${{ variables.CI_DEPENDENCIES_PATH }}/cli11.yml
parameters:
cli11Version: ${{ parameters.cli11Version }}

View File

@@ -0,0 +1,24 @@
variables:
- group: common
- template: /.azuredevops/variables-global.yml
parameters:
- name: yamlcppVersion
type: string
default: "0.8.0"
resources:
repositories:
- repository: pipelines_repo
type: github
endpoint: ROCm
name: ROCm/ROCm
trigger: none
pr: none
jobs:
- template: ${{ variables.CI_DEPENDENCIES_PATH }}/yamlcpp.yml
parameters:
yamlcppVersion: ${{ parameters.yamlcppVersion }}

View File

@@ -1,10 +1,15 @@
parameters:
- name: cmakeVersion
type: string
default: '3.31.0'
steps:
- task: Bash@3
displayName: Install CMake 3.31
displayName: Install CMake ${{ parameters.cmakeVersion }}
inputs:
targetType: inline
script: |
CMAKE_VERSION=3.31.0
CMAKE_VERSION=${{ parameters.cmakeVersion }}
CMAKE_ROOT="$(Pipeline.Workspace)/cmake"
echo "Downloading CMake $CMAKE_VERSION..."

View File

@@ -63,6 +63,7 @@ parameters:
libopenblas-dev: openblas-devel
libopenmpi-dev: openmpi-devel
libpci-dev: libpciaccess-devel
libsimde-dev: simde-devel
libssl-dev: openssl-devel
# note: libstdc++-devel is in the base packages list
libsystemd-dev: systemd-devel

View File

@@ -35,8 +35,8 @@ parameters:
developBranch: develop
hasGpuTarget: true
amdsmi:
pipelineId: 99
developBranch: amd-staging
pipelineId: 376
developBranch: develop
hasGpuTarget: false
aomp-extras:
pipelineId: 111
@@ -46,6 +46,10 @@ parameters:
pipelineId: 115
developBranch: aomp-dev
hasGpuTarget: false
aqlprofile:
pipelineId: 365
developBranch: develop
hasGpuTarget: false
clr:
pipelineId: 335
developBranch: develop
@@ -111,7 +115,7 @@ parameters:
developBranch: develop
hasGpuTarget: true
hipTensor:
pipelineId: 105
pipelineId: 374
developBranch: develop
hasGpuTarget: true
llvm-project:
@@ -126,13 +130,17 @@ parameters:
pipelineId: 80
developBranch: develop
hasGpuTarget: true
origami:
pipelineId: 364
developBranch: develop
hasGpuTarget: true
rccl:
pipelineId: 107
developBranch: develop
hasGpuTarget: true
rdc:
pipelineId: 100
developBranch: amd-staging
pipelineId: 360
developBranch: develop
hasGpuTarget: false
rocAL:
pipelineId: 151
@@ -219,8 +227,8 @@ parameters:
developBranch: develop
hasGpuTarget: true
rocprofiler-systems:
pipelineId: 255
developBranch: amd-staging
pipelineId: 345
developBranch: develop
hasGpuTarget: true
rocPyDecode:
pipelineId: 239
@@ -255,7 +263,7 @@ parameters:
developBranch: develop
hasGpuTarget: true
rocWMMA:
pipelineId: 109
pipelineId: 370
developBranch: develop
hasGpuTarget: true
rpp:

View File

@@ -13,7 +13,7 @@ parameters:
default: ctest
- name: testParameters
type: string
default: --output-on-failure --force-new-ctest-process --output-junit test_output.xml
default: --extra-verbose --output-on-failure --force-new-ctest-process --output-junit test_output.xml
- name: extraTestParameters
type: string
default: ''

1
.gitignore vendored
View File

@@ -1,6 +1,7 @@
.venv
.vscode
build
__pycache__
# documentation artifacts
_build/

View File

@@ -27,6 +27,7 @@ ASICs
ASan
ASAN
ASm
Async
ATI
atomicRMW
AddressSanitizer
@@ -34,6 +35,7 @@ AlexNet
Andrej
Arb
Autocast
autograd
BARs
BatchNorm
BLAS
@@ -43,6 +45,7 @@ Blit
Blockwise
Bluefield
Bootloader
Broadcom
CAS
CCD
CDNA
@@ -72,9 +75,11 @@ CU
CUDA
CUs
CXX
CX
Cavium
CentOS
ChatGPT
Cholesky
CoRR
Codespaces
Commitizen
@@ -84,9 +89,11 @@ Conda
ConnectX
CountOnes
CuPy
customizable
da
Dashboarding
Dataloading
dataflows
DBRX
DDR
DF
@@ -118,6 +125,8 @@ Dependabot
Deprecations
DevCap
DirectX
Disaggregated
disaggregated
Dockerfile
Dockerized
Doxygen
@@ -126,9 +135,13 @@ ELMo
ENDPGM
EPYC
ESXi
EP
EoS
etcd
equalto
fas
FBGEMM
FiLM
FIFOs
FFT
FFTs
@@ -142,15 +155,19 @@ Filesystem
FindDb
Flang
FlashAttention
FlashInfers
FlashInfer
FluxBenchmark
Fortran
Fuyu
GALB
GAT
GATNE
GCC
GCD
GCDs
GCN
GCNN
GDB
GDDR
GDR
@@ -169,15 +186,19 @@ Glibc
GLXT
Gloo
GMI
GNN
GNNs
GPG
GPR
GPT
GPU
GPU's
GPUDirect
GPUs
Graphbolt
GraphBolt
GraphSage
GRBM
GRE
GenAI
GenZ
GitHub
@@ -204,7 +225,10 @@ Haswell
Higgs
href
Hyperparameters
HybridEngine
Huggingface
Hunyuan
HunyuanVideo
IB
ICD
ICT
@@ -235,7 +259,9 @@ Intersphinx
Intra
Ioffe
JAX's
JAXLIB
Jinja
js
JSON
Jupyter
KFD
@@ -255,6 +281,7 @@ LLM
LLMs
LLVM
LM
logsumexp
LRU
LSAN
LSan
@@ -290,6 +317,7 @@ Makefiles
Matplotlib
Matrox
MaxText
MBT
Megablocks
Megatrends
Megatron
@@ -299,11 +327,15 @@ Meta's
Miniconda
MirroredStrategy
Mixtral
MLA
MosaicML
MoEs
Mooncake
Mpops
Multicore
Multithreaded
mx
MXFP
MyEnvironment
MyST
NANOO
@@ -339,6 +371,7 @@ OFED
OMM
OMP
OMPI
OOM
OMPT
OMPX
ONNX
@@ -365,6 +398,7 @@ perf
PEQT
PIL
PILImage
PJRT
POR
PRNG
PRs
@@ -384,6 +418,7 @@ Profiler's
PyPi
Pytest
PyTorch
QPS
Qcycles
Qwen
RAII
@@ -445,6 +480,7 @@ SKU
SKUs
SLES
SLURM
Slurm
SMEM
SMFMA
SMI
@@ -473,6 +509,7 @@ TCI
TCIU
TCP
TCR
TVM
THREADGROUPS
threadgroups
TensorRT
@@ -484,13 +521,12 @@ TPS
TPU
TPUs
TSME
Taichi
Taichi's
Tagram
TensileLite
TensorBoard
TensorFlow
TensorParallel
TheRock
ToC
TorchAudio
torchaudio
@@ -508,6 +544,7 @@ UAC
UC
UCC
UCX
ud
UE
UIF
UMC
@@ -615,6 +652,7 @@ coalescable
codename
collater
comgr
compat
completers
composable
concretization
@@ -656,12 +694,14 @@ denoised
denoises
denormalize
dequantization
dequantized
dequantizes
deserializers
detections
dev
devicelibs
devsel
dgl
dimensionality
disambiguates
distro
@@ -701,6 +741,7 @@ githooks
github
globals
gnupg
gpu
grayscale
gx
gzip
@@ -755,6 +796,7 @@ invariants
invocating
ipo
jax
json
kdb
kfd
kv
@@ -768,6 +810,7 @@ linalg
linearized
linter
linux
llm
llvm
lm
localscratch
@@ -776,6 +819,7 @@ lossy
macOS
matchers
maxtext
megablocks
megatron
microarchitecture
migraphx
@@ -812,11 +856,13 @@ pallas
parallelization
parallelizing
param
params
parameterization
passthrough
pe
perfcounter
performant
piecewise
perl
pragma
pre
@@ -857,6 +903,7 @@ querySelectorAll
queueing
qwen
radeon
rc
rccl
rdc
rdma
@@ -918,6 +965,7 @@ scalability
scalable
scipy
seealso
selectattr
selectedTag
sendmsg
seqs
@@ -934,6 +982,7 @@ softmax
spack
spmm
src
stanford
stochastically
strided
subcommand
@@ -953,6 +1002,7 @@ tabindex
targetContainer
td
tensorfloat
tf
th
tokenization
tokenize
@@ -961,10 +1011,12 @@ tokenizer
tokenizes
toolchain
toolchains
topk
toolset
toolsets
torchtitan
torchvision
tp
tqdm
tracebacks
txt
@@ -987,6 +1039,7 @@ USM
UTCL
UTIL
utils
UX
vL
variational
vdi
@@ -1016,6 +1069,8 @@ writebacks
wrreq
wzo
xargs
xdit
xDiT
xGMI
xPacked
xz

File diff suppressed because it is too large Load Diff

2556
RELEASE.md

File diff suppressed because it is too large Load Diff

View File

@@ -1,33 +1,17 @@
<?xml version="1.0" encoding="UTF-8"?>
<manifest>
<remote name="rocm-org" fetch="https://github.com/ROCm/" />
<default revision="refs/tags/rocm-7.0.0"
<default revision="refs/tags/rocm-7.1.1"
remote="rocm-org"
sync-c="true"
sync-j="4" />
<!--list of projects for ROCm-->
<project name="ROCK-Kernel-Driver" />
<project name="ROCR-Runtime" />
<project name="amdsmi" />
<project name="aqlprofile" />
<project name="rdc" />
<project name="rocm_bandwidth_test" />
<project name="rocm_smi_lib" />
<project name="rocm-core" />
<project name="rocm-examples" />
<project name="rocminfo" />
<project name="rocprofiler" />
<project name="rocprofiler-register" />
<project name="rocprofiler-sdk" />
<project name="rocprofiler-compute" />
<project name="rocprofiler-systems" />
<project name="roctracer" />
<!--HIP Projects-->
<project name="hip" />
<project name="hip-tests" />
<project name="HIPIFY" />
<project name="clr" />
<project name="hipother" />
<!-- The following projects are all associated with the AMDGPU LLVM compiler -->
<project name="half" />
<project name="llvm-project" />
@@ -55,9 +39,15 @@
MIOpen rocBLAS rocFFT rocPRIM rocRAND
rocSPARSE rocThrust Tensile -->
<project groups="mathlibs" name="rocm-libraries" />
<!-- The following components have been migrated to rocm-systems:
aqlprofile clr hip hip-tests hipother
rdc rocm-core rocm_smi_lib rocminfo rocprofiler-compute
rocprofiler-register rocprofiler-sdk rocprofiler-systems
rocprofiler rocr-runtime roctracer -->
<project groups="mathlibs" name="rocm-systems" />
<project groups="mathlibs" name="rocPyDecode" />
<project groups="mathlibs" name="rocSHMEM" />
<project groups="mathlibs" name="rocSOLVER" />
<project groups="mathlibs" name="rocSHMEM" />
<project groups="mathlibs" name="rocWMMA" />
<project groups="mathlibs" name="rocm-cmake" />
<project groups="mathlibs" name="rpp" />

View File

@@ -25,69 +25,69 @@ additional licenses. Please review individual repositories for more information.
<!-- spellcheck-disable -->
| Component | License |
|:---------------------|:-------------------------|
| [AMD Compute Language Runtime (CLR)](https://github.com/ROCm/clr) | [MIT](https://github.com/ROCm/clr/blob/amd-staging/LICENSE.txt) |
| [AMD Compute Language Runtime (CLR)](https://github.com/ROCm/rocm-systems/tree/develop/projects/clr) | [MIT](https://github.com/ROCm/rocm-systems/blob/develop/projects/clr/LICENSE.md) |
| [AMD SMI](https://github.com/ROCm/amdsmi) | [MIT](https://github.com/ROCm/amdsmi/blob/amd-staging/LICENSE) |
| [aomp](https://github.com/ROCm/aomp/) | [Apache 2.0](https://github.com/ROCm/aomp/blob/aomp-dev/LICENSE) |
| [aomp-extras](https://github.com/ROCm/aomp-extras/) | [MIT](https://github.com/ROCm/aomp-extras/blob/aomp-dev/LICENSE) |
| [AQLprofile](https://github.com/rocm/aqlprofile/) | [MIT](https://github.com/ROCm/aqlprofile/blob/amd-staging/LICENSE.md) |
| [AQLprofile](https://github.com/ROCm/rocm-systems/tree/develop/projects/aqlprofile/) | [MIT](https://github.com/ROCm/rocm-systems/blob/develop/projects/aqlprofile/LICENSE.md) |
| [Code Object Manager (Comgr)](https://github.com/ROCm/llvm-project/tree/amd-staging/amd/comgr) | [The University of Illinois/NCSA](https://github.com/ROCm/llvm-project/blob/amd-staging/amd/comgr/LICENSE.txt) |
| [Composable Kernel](https://github.com/ROCm/composable_kernel) | [MIT](https://github.com/ROCm/composable_kernel/blob/develop/LICENSE) |
| [half](https://github.com/ROCm/half/) | [MIT](https://github.com/ROCm/half/blob/rocm/LICENSE.txt) |
| [HIP](https://github.com/ROCm/HIP/) | [MIT](https://github.com/ROCm/HIP/blob/amd-staging/LICENSE.txt) |
| [hipamd](https://github.com/ROCm/clr/tree/amd-staging/hipamd) | [MIT](https://github.com/ROCm/clr/blob/amd-staging/hipamd/LICENSE.txt) |
| [hipBLAS](https://github.com/ROCm/hipBLAS/) | [MIT](https://github.com/ROCm/hipBLAS/blob/develop/LICENSE.md) |
| [hipBLASLt](https://github.com/ROCm/hipBLASLt/) | [MIT](https://github.com/ROCm/hipBLASLt/blob/develop/LICENSE.md) |
| [HIP](https://github.com/ROCm/rocm-systems/tree/develop/projects/hip/) | [MIT](https://github.com/ROCm/rocm-systems/blob/develop/projects/hip/LICENSE.md) |
| [hipamd](https://github.com/ROCm/rocm-systems/tree/develop/projects/clr/hipamd/) | [MIT](https://github.com/ROCm/rocm-systems/blob/develop/projects/clr/hipamd/LICENSE.md) |
| [hipBLAS](https://github.com/ROCm/rocm-libraries/tree/develop/projects/hipblas/) | [MIT](https://github.com/ROCm/rocm-libraries/blob/develop/projects/hipblas/LICENSE.md) |
| [hipBLASLt](https://github.com/ROCm/rocm-libraries/tree/develop/projects/hipblaslt/) | [MIT](https://github.com/ROCm/rocm-libraries/blob/develop/projects/hipblaslt/LICENSE.md) |
| [HIPCC](https://github.com/ROCm/llvm-project/tree/amd-staging/amd/hipcc) | [MIT](https://github.com/ROCm/llvm-project/blob/amd-staging/amd/hipcc/LICENSE.txt) |
| [hipCUB](https://github.com/ROCm/hipCUB/) | [Custom](https://github.com/ROCm/hipCUB/blob/develop/LICENSE.txt) |
| [hipFFT](https://github.com/ROCm/hipFFT/) | [MIT](https://github.com/ROCm/hipFFT/blob/develop/LICENSE.md) |
| [hipCUB](https://github.com/ROCm/rocm-libraries/tree/develop/projects/hipcub/) | [Custom](https://github.com/ROCm/rocm-libraries/blob/develop/projects/hipcub/LICENSE.txt) |
| [hipFFT](https://github.com/ROCm/rocm-libraries/tree/develop/projects/hipfft/) | [MIT](https://github.com/ROCm/rocm-libraries/blob/develop/projects/hipfft/LICENSE.md) |
| [hipfort](https://github.com/ROCm/hipfort/) | [MIT](https://github.com/ROCm/hipfort/blob/develop/LICENSE) |
| [HIPIFY](https://github.com/ROCm/HIPIFY/) | [MIT](https://github.com/ROCm/HIPIFY/blob/amd-staging/LICENSE.txt) |
| [hipRAND](https://github.com/ROCm/hipRAND/) | [MIT](https://github.com/ROCm/hipRAND/blob/develop/LICENSE.txt) |
| [hipSOLVER](https://github.com/ROCm/hipSOLVER/) | [MIT](https://github.com/ROCm/hipSOLVER/blob/develop/LICENSE.md) |
| [hipSPARSE](https://github.com/ROCm/hipSPARSE/) | [MIT](https://github.com/ROCm/hipSPARSE/blob/develop/LICENSE.md) |
| [hipSPARSELt](https://github.com/ROCm/hipSPARSELt/) | [MIT](https://github.com/ROCm/hipSPARSELt/blob/develop/LICENSE.md) |
| [hipTensor](https://github.com/ROCm/hipTensor) | [MIT](https://github.com/ROCm/hipTensor/blob/develop/LICENSE) |
| [hipRAND](https://github.com/ROCm/rocm-libraries/tree/develop/projects/hiprand/) | [MIT](https://github.com/ROCm/rocm-libraries/blob/develop/projects/hiprand/LICENSE.md) |
| [hipSOLVER](https://github.com/ROCm/rocm-libraries/tree/develop/projects/hipsolver/) | [MIT](https://github.com/ROCm/rocm-libraries/blob/develop/projects/hipsolver/LICENSE.md) |
| [hipSPARSE](https://github.com/ROCm/rocm-libraries/tree/develop/projects/hipsparse/) | [MIT](https://github.com/ROCm/rocm-libraries/blob/develop/projects/hipsparse/LICENSE.md) |
| [hipSPARSELt](https://github.com/ROCm/rocm-libraries/tree/develop/projects/hipsparselt/) | [MIT](https://github.com/ROCm/rocm-libraries/blob/develop/projects/hipsparselt/LICENSE.md) |
| [hipTensor](https://github.com/ROCm/rocm-libraries/tree/develop/projects/hiptensor/) | [MIT](https://github.com/ROCm/rocm-libraries/blob/develop/projects/hiptensor/LICENSE) |
| [llvm-project](https://github.com/ROCm/llvm-project/) | [Apache](https://github.com/ROCm/llvm-project/blob/amd-staging/LICENSE.TXT) |
| [llvm-project/flang](https://github.com/ROCm/llvm-project/tree/amd-staging/flang) | [Apache 2.0](https://github.com/ROCm/llvm-project/blob/amd-staging/flang/LICENSE.TXT) |
| [MIGraphX](https://github.com/ROCm/AMDMIGraphX/) | [MIT](https://github.com/ROCm/AMDMIGraphX/blob/develop/LICENSE) |
| [MIOpen](https://github.com/ROCm/MIOpen/) | [MIT](https://github.com/ROCm/rocm-libraries/blob/develop/projects/miopen/LICENSE.md) |
| [MIOpen](https://github.com/ROCm/rocm-libraries/tree/develop/projects/miopen/) | [MIT](https://github.com/ROCm/rocm-libraries/blob/develop/projects/miopen/LICENSE.md) |
| [MIVisionX](https://github.com/ROCm/MIVisionX/) | [MIT](https://github.com/ROCm/MIVisionX/blob/develop/LICENSE.txt) |
| [rocAL](https://github.com/ROCm/rocAL) | [MIT](https://github.com/ROCm/rocAL/blob/develop/LICENSE.txt) |
| [rocALUTION](https://github.com/ROCm/rocALUTION/) | [MIT](https://github.com/ROCm/rocALUTION/blob/develop/LICENSE.md) |
| [rocBLAS](https://github.com/ROCm/rocBLAS/) | [MIT](https://github.com/ROCm/rocBLAS/blob/develop/LICENSE.md) |
| [rocBLAS](https://github.com/ROCm/rocm-libraries/tree/develop/projects/rocblas/) | [MIT](https://github.com/ROCm/rocm-libraries/blob/develop/projects/rocblas/LICENSE.md) |
| [ROCdbgapi](https://github.com/ROCm/ROCdbgapi/) | [MIT](https://github.com/ROCm/ROCdbgapi/blob/amd-staging/LICENSE.txt) |
| [rocDecode](https://github.com/ROCm/rocDecode) | [MIT](https://github.com/ROCm/rocDecode/blob/develop/LICENSE) |
| [rocFFT](https://github.com/ROCm/rocFFT/) | [MIT](https://github.com/ROCm/rocFFT/blob/develop/LICENSE.md) |
| [rocFFT](https://github.com/ROCm/rocm-libraries/tree/develop/projects/rocfft/) | [MIT](https://github.com/ROCm/rocm-libraries/blob/develop/projects/rocfft/LICENSE.md) |
| [ROCgdb](https://github.com/ROCm/ROCgdb/) | [GNU General Public License v3.0](https://github.com/ROCm/ROCgdb/blob/amd-staging/COPYING3) |
| [rocJPEG](https://github.com/ROCm/rocJPEG/) | [MIT](https://github.com/ROCm/rocJPEG/blob/develop/LICENSE) |
| [ROCK-Kernel-Driver](https://github.com/ROCm/ROCK-Kernel-Driver/) | [GPL 2.0 WITH Linux-syscall-note](https://github.com/ROCm/ROCK-Kernel-Driver/blob/master/COPYING) |
| [rocminfo](https://github.com/ROCm/rocminfo/) | [The University of Illinois/NCSA](https://github.com/ROCm/rocminfo/blob/amd-staging/License.txt) |
| [rocminfo](https://github.com/ROCm/rocm-systems/tree/develop/projects/rocminfo/) | [The University of Illinois/NCSA](https://github.com/ROCm/rocm-systems/blob/develop/projects/rocminfo/License.txt) |
| [ROCm Bandwidth Test](https://github.com/ROCm/rocm_bandwidth_test/) | [MIT](https://github.com/ROCm/rocm_bandwidth_test/blob/master/LICENSE.txt) |
| [ROCm CMake](https://github.com/ROCm/rocm-cmake/) | [MIT](https://github.com/ROCm/rocm-cmake/blob/develop/LICENSE) |
| [ROCm Communication Collectives Library (RCCL)](https://github.com/ROCm/rccl/) | [Custom](https://github.com/ROCm/rccl/blob/develop/LICENSE.txt) |
| [ROCm-Core](https://github.com/ROCm/rocm-core) | [MIT](https://github.com/ROCm/rocm-core/blob/master/copyright) |
| [ROCm Compute Profiler](https://github.com/ROCm/rocprofiler-compute) | [MIT](https://github.com/ROCm/rocprofiler-compute/blob/amd-staging/LICENSE) |
| [ROCm Data Center (RDC)](https://github.com/ROCm/rdc/) | [MIT](https://github.com/ROCm/rdc/blob/amd-staging/LICENSE.md) |
| [ROCm-Core](https://github.com/ROCm/rocm-systems/tree/develop/projects/rocm-core/) | [MIT](https://github.com/ROCm/rocm-systems/blob/develop/projects/rocm-core/LICENSE.md) |
| [ROCm Compute Profiler](https://github.com/ROCm/rocm-systems/tree/develop/projects/rocprofiler-compute/) | [MIT](https://github.com/ROCm/rocm-systems/blob/develop/projects/rocprofiler-compute/LICENSE.md) |
| [ROCm Data Center (RDC)](https://github.com/ROCm/rocm-systems/tree/develop/projects/rdc/) | [MIT](https://github.com/ROCm/rocm-systems/blob/develop/projects/rdc/LICENSE.md) |
| [ROCm-Device-Libs](https://github.com/ROCm/llvm-project/tree/amd-staging/amd/device-libs) | [The University of Illinois/NCSA](https://github.com/ROCm/llvm-project/blob/amd-staging/amd/device-libs/LICENSE.TXT) |
| [ROCm-OpenCL-Runtime](https://github.com/ROCm/clr/tree/amd-staging/opencl) | [MIT](https://github.com/ROCm/clr/blob/amd-staging/opencl/LICENSE.txt) |
| [ROCm-OpenCL-Runtime](https://github.com/ROCm/rocm-systems/tree/develop/projects/clr/opencl/) | [MIT](https://github.com/ROCm/rocm-systems/blob/develop/projects/clr/opencl/LICENSE.md) |
| [ROCm Performance Primitives (RPP)](https://github.com/ROCm/rpp) | [MIT](https://github.com/ROCm/rpp/blob/develop/LICENSE) |
| [ROCm SMI Lib](https://github.com/ROCm/rocm_smi_lib/) | [MIT](https://github.com/ROCm/rocm_smi_lib/blob/amd-staging/LICENSE.md) |
| [ROCm Systems Profiler](https://github.com/ROCm/rocprofiler-systems) | [MIT](https://github.com/ROCm/rocprofiler-systems/blob/amd-staging/LICENSE.md) |
| [ROCm SMI Lib](https://github.com/ROCm/rocm-systems/tree/develop/projects/rocm-smi-lib/) | [MIT](https://github.com/ROCm/rocm-systems/blob/develop/projects/rocm-smi-lib/LICENSE.md) |
| [ROCm Systems Profiler](https://github.com/ROCm/rocm-systems/tree/develop/projects/rocprofiler-systems/) | [MIT](https://github.com/ROCm/rocm-systems/blob/develop/projects/rocprofiler-systems/LICENSE.md) |
| [ROCm Validation Suite](https://github.com/ROCm/ROCmValidationSuite/) | [MIT](https://github.com/ROCm/ROCmValidationSuite/blob/master/LICENSE) |
| [rocPRIM](https://github.com/ROCm/rocPRIM/) | [MIT](https://github.com/ROCm/rocPRIM/blob/develop/LICENSE.txt) |
| [ROCProfiler](https://github.com/ROCm/rocprofiler/) | [MIT](https://github.com/ROCm/rocprofiler/blob/amd-staging/LICENSE.md) |
| [ROCprofiler-SDK](https://github.com/ROCm/rocprofiler-sdk) | [MIT](https://github.com/ROCm/rocprofiler-sdk/blob/amd-mainline/LICENSE) |
| [rocPRIM](https://github.com/ROCm/rocm-libraries/tree/develop/projects/rocprim/) | [MIT](https://github.com/ROCm/rocm-libraries/blob/develop/projects/rocprim/LICENSE.md) |
| [ROCProfiler](https://github.com/ROCm/rocm-systems/tree/develop/projects/rocprofiler/) | [MIT](https://github.com/ROCm/rocm-systems/blob/develop/projects/rocprofiler/LICENSE.md) |
| [ROCprofiler-SDK](https://github.com/ROCm/rocm-systems/tree/develop/projects/rocprofiler-sdk/) | [MIT](https://github.com/ROCm/rocm-systems/blob/develop/projects/rocprofiler-sdk/LICENSE.md) |
| [rocPyDecode](https://github.com/ROCm/rocPyDecode) | [MIT](https://github.com/ROCm/rocPyDecode/blob/develop/LICENSE.txt) |
| [rocRAND](https://github.com/ROCm/rocRAND/) | [MIT](https://github.com/ROCm/rocRAND/blob/develop/LICENSE.txt) |
| [rocRAND](https://github.com/ROCm/rocm-libraries/tree/develop/projects/rocrand/) | [MIT](https://github.com/ROCm/rocm-libraries/blob/develop/projects/rocrand/LICENSE.md) |
| [ROCr Debug Agent](https://github.com/ROCm/rocr_debug_agent/) | [The University of Illinois/NCSA](https://github.com/ROCm/rocr_debug_agent/blob/amd-staging/LICENSE.txt) |
| [ROCR-Runtime](https://github.com/ROCm/ROCR-Runtime/) | [The University of Illinois/NCSA](https://github.com/ROCm/ROCR-Runtime/blob/amd-staging/LICENSE.txt) |
| [ROCR-Runtime](https://github.com/ROCm/rocm-systems/tree/develop/projects/rocr-runtime/) | [The University of Illinois/NCSA](https://github.com/ROCm/rocm-systems/blob/develop/projects/rocr-runtime/LICENSE.txt) |
| [rocSHMEM](https://github.com/ROCm/rocSHMEM/) | [MIT](https://github.com/ROCm/rocSHMEM/blob/develop/LICENSE.md) |
| [rocSOLVER](https://github.com/ROCm/rocSOLVER/) | [BSD-2-Clause](https://github.com/ROCm/rocSOLVER/blob/develop/LICENSE.md) |
| [rocSPARSE](https://github.com/ROCm/rocSPARSE/) | [MIT](https://github.com/ROCm/rocSPARSE/blob/develop/LICENSE.md) |
| [rocThrust](https://github.com/ROCm/rocThrust/) | [Apache 2.0](https://github.com/ROCm/rocThrust/blob/develop/LICENSE) |
| [ROCTracer](https://github.com/ROCm/roctracer/) | [MIT](https://github.com/ROCm/roctracer/blob/amd-master/LICENSE) |
| [rocWMMA](https://github.com/ROCm/rocWMMA/) | [MIT](https://github.com/ROCm/rocWMMA/blob/develop/LICENSE.md) |
| [Tensile](https://github.com/ROCm/Tensile/) | [MIT](https://github.com/ROCm/Tensile/blob/develop/LICENSE.md) |
| [rocSOLVER](https://github.com/ROCm/rocm-libraries/tree/develop/projects/rocsolver/) | [BSD-2-Clause](https://github.com/ROCm/rocm-libraries/blob/develop/projects/rocsolver/LICENSE.md) |
| [rocSPARSE](https://github.com/ROCm/rocm-libraries/tree/develop/projects/rocsparse/) | [MIT](https://github.com/ROCm/rocm-libraries/blob/develop/projects/rocsparse/LICENSE.md) |
| [rocThrust](https://github.com/ROCm/rocm-libraries/tree/develop/projects/rocthrust/) | [Apache 2.0](https://github.com/ROCm/rocm-libraries/blob/develop/projects/rocthrust/LICENSE) |
| [ROCTracer](https://github.com/ROCm/rocm-systems/tree/develop/projects/roctracer/) | [MIT](https://github.com/ROCm/rocm-systems/blob/develop/projects/roctracer/LICENSE.md) |
| [rocWMMA](https://github.com/ROCm/rocm-libraries/tree/develop/projects/rocwmma/) | [MIT](https://github.com/ROCm/rocm-libraries/blob/develop/projects/rocwmma/LICENSE.md) |
| [Tensile](https://github.com/ROCm/rocm-libraries/tree/develop/shared/tensile/) | [MIT](https://github.com/ROCm/rocm-libraries/blob/develop/shared/tensile/LICENSE.md) |
| [TransferBench](https://github.com/ROCm/TransferBench) | [MIT](https://github.com/ROCm/TransferBench/blob/develop/LICENSE.md) |
Open sourced ROCm components are released via public GitHub

View File

@@ -1,134 +1,136 @@
ROCm Version,7.0.0,6.4.3,6.4.2,6.4.1,6.4.0,6.3.3,6.3.2,6.3.1,6.3.0,6.2.4,6.2.2,6.2.1,6.2.0, 6.1.5, 6.1.2, 6.1.1, 6.1.0, 6.0.2, 6.0.0
:ref:`Operating systems & kernels <OS-kernel-versions>`,Ubuntu 24.04.3,Ubuntu 24.04.2,Ubuntu 24.04.2,Ubuntu 24.04.2,Ubuntu 24.04.2,Ubuntu 24.04.2,Ubuntu 24.04.2,Ubuntu 24.04.2,Ubuntu 24.04.2,"Ubuntu 24.04.1, 24.04","Ubuntu 24.04.1, 24.04","Ubuntu 24.04.1, 24.04",Ubuntu 24.04,,,,,,
,Ubuntu 22.04.5,Ubuntu 22.04.5,Ubuntu 22.04.5,Ubuntu 22.04.5,Ubuntu 22.04.5,Ubuntu 22.04.5,Ubuntu 22.04.5,Ubuntu 22.04.5,Ubuntu 22.04.5,"Ubuntu 22.04.5, 22.04.4","Ubuntu 22.04.5, 22.04.4","Ubuntu 22.04.5, 22.04.4","Ubuntu 22.04.5, 22.04.4","Ubuntu 22.04.5, 22.04.4, 22.04.3","Ubuntu 22.04.4, 22.04.3","Ubuntu 22.04.4, 22.04.3","Ubuntu 22.04.4, 22.04.3","Ubuntu 22.04.4, 22.04.3, 22.04.2","Ubuntu 22.04.4, 22.04.3, 22.04.2"
,,,,,,,,,,,,,,"Ubuntu 20.04.6, 20.04.5","Ubuntu 20.04.6, 20.04.5","Ubuntu 20.04.6, 20.04.5","Ubuntu 20.04.6, 20.04.5","Ubuntu 20.04.6, 20.04.5","Ubuntu 20.04.6, 20.04.5"
,"RHEL 9.6, 9.4","RHEL 9.6, 9.4","RHEL 9.6, 9.4","RHEL 9.6, 9.5, 9.4","RHEL 9.5, 9.4","RHEL 9.5, 9.4","RHEL 9.5, 9.4","RHEL 9.5, 9.4","RHEL 9.5, 9.4","RHEL 9.4, 9.3","RHEL 9.4, 9.3","RHEL 9.4, 9.3","RHEL 9.4, 9.3","RHEL 9.4, 9.3, 9.2","RHEL 9.4, 9.3, 9.2","RHEL 9.4, 9.3, 9.2","RHEL 9.4, 9.3, 9.2","RHEL 9.3, 9.2","RHEL 9.3, 9.2"
,RHEL 8.10,RHEL 8.10,RHEL 8.10,RHEL 8.10,RHEL 8.10,RHEL 8.10,RHEL 8.10,RHEL 8.10,RHEL 8.10,"RHEL 8.10, 8.9","RHEL 8.10, 8.9","RHEL 8.10, 8.9","RHEL 8.10, 8.9","RHEL 8.9, 8.8","RHEL 8.9, 8.8","RHEL 8.9, 8.8","RHEL 8.9, 8.8","RHEL 8.9, 8.8","RHEL 8.9, 8.8"
,SLES 15 SP7,"SLES 15 SP7, SP6","SLES 15 SP7, SP6",SLES 15 SP6,SLES 15 SP6,"SLES 15 SP6, SP5","SLES 15 SP6, SP5","SLES 15 SP6, SP5","SLES 15 SP6, SP5","SLES 15 SP6, SP5","SLES 15 SP6, SP5","SLES 15 SP6, SP5","SLES 15 SP6, SP5","SLES 15 SP5, SP4","SLES 15 SP5, SP4","SLES 15 SP5, SP4","SLES 15 SP5, SP4","SLES 15 SP5, SP4","SLES 15 SP5, SP4"
,,,,,,,,,,,,,,,CentOS 7.9,CentOS 7.9,CentOS 7.9,CentOS 7.9,CentOS 7.9
,"Oracle Linux 9, 8 [#ol-700-mi300x-past-60]_","Oracle Linux 9, 8 [#mi300x-past-60]_","Oracle Linux 9, 8 [#mi300x-past-60]_","Oracle Linux 9, 8 [#mi300x-past-60]_","Oracle Linux 9, 8 [#mi300x-past-60]_",Oracle Linux 8.10 [#mi300x-past-60]_,Oracle Linux 8.10 [#mi300x-past-60]_,Oracle Linux 8.10 [#mi300x-past-60]_,Oracle Linux 8.10 [#mi300x-past-60]_,Oracle Linux 8.9 [#mi300x-past-60]_,Oracle Linux 8.9 [#mi300x-past-60]_,Oracle Linux 8.9 [#mi300x-past-60]_,Oracle Linux 8.9 [#mi300x-past-60]_,Oracle Linux 8.9 [#mi300x-past-60]_,Oracle Linux 8.9 [#mi300x-past-60]_,Oracle Linux 8.9 [#mi300x-past-60]_,,,
,Debian 12,Debian 12 [#single-node-past-60]_,Debian 12 [#single-node-past-60]_,Debian 12 [#single-node-past-60]_,Debian 12 [#single-node-past-60]_,Debian 12 [#single-node-past-60]_,Debian 12 [#single-node-past-60]_,Debian 12 [#single-node-past-60]_,,,,,,,,,,,
,Azure Linux 3.0 [#az-mi300x-past-60]_,Azure Linux 3.0 [#az-mi300x-past-60]_,Azure Linux 3.0 [#az-mi300x-past-60]_,Azure Linux 3.0 [#az-mi300x-past-60]_,Azure Linux 3.0 [#az-mi300x-past-60]_,Azure Linux 3.0 [#az-mi300x-630-past-60]_,Azure Linux 3.0 [#az-mi300x-630-past-60]_,,,,,,,,,,,,
,Rocky Linux 9,,,,,,,,,,,,,,,,,,
,.. _architecture-support-compatibility-matrix-past-60:,,,,,,,,,,,,,,,,,,
:doc:`Architecture <rocm-install-on-linux:reference/system-requirements>`,CDNA4,,,,,,,,,,,,,,,,,,
,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3
,CDNA2,CDNA2,CDNA2,CDNA2,CDNA2,CDNA2,CDNA2,CDNA2,CDNA2,CDNA2,CDNA2,CDNA2,CDNA2,CDNA2,CDNA2,CDNA2,CDNA2,CDNA2,CDNA2
,CDNA,CDNA,CDNA,CDNA,CDNA,CDNA,CDNA,CDNA,CDNA,CDNA,CDNA,CDNA,CDNA,CDNA,CDNA,CDNA,CDNA,CDNA,CDNA
,RDNA4,RDNA4,RDNA4,RDNA4,,,,,,,,,,,,,,,
,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3
,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2
,.. _gpu-support-compatibility-matrix-past-60:,,,,,,,,,,,,,,,,,,
:doc:`GPU / LLVM target <rocm-install-on-linux:reference/system-requirements>`,gfx950,,,,,,,,,,,,,,,,,,
,gfx1201 [#RDNA-OS-past-60]_,gfx1201 [#RDNA-OS-past-60]_,gfx1201 [#RDNA-OS-past-60]_,gfx1201 [#RDNA-OS-past-60]_,,,,,,,,,,,,,,,
,gfx1200 [#RDNA-OS-past-60]_,gfx1200 [#RDNA-OS-past-60]_,gfx1200 [#RDNA-OS-past-60]_,gfx1200 [#RDNA-OS-past-60]_,,,,,,,,,,,,,,,
,gfx1101 [#RDNA-OS-past-60]_ [#7700XT-OS-past-60]_,gfx1101 [#RDNA-OS-past-60]_ [#7700XT-OS-past-60]_,gfx1101 [#RDNA-OS-past-60]_ [#7700XT-OS-past-60]_,gfx1101 [#RDNA-OS-past-60]_,,,,,,,,,,,,,,,
,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100
,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030
,gfx942,gfx942,gfx942,gfx942,gfx942,gfx942,gfx942,gfx942,gfx942,gfx942 [#mi300_624-past-60]_,gfx942 [#mi300_622-past-60]_,gfx942 [#mi300_621-past-60]_,gfx942 [#mi300_620-past-60]_, gfx942 [#mi300_612-past-60]_, gfx942 [#mi300_612-past-60]_, gfx942 [#mi300_611-past-60]_, gfx942 [#mi300_610-past-60]_, gfx942 [#mi300_602-past-60]_, gfx942 [#mi300_600-past-60]_
,gfx90a,gfx90a,gfx90a,gfx90a,gfx90a,gfx90a,gfx90a,gfx90a,gfx90a,gfx90a,gfx90a,gfx90a,gfx90a,gfx90a,gfx90a,gfx90a,gfx90a,gfx90a,gfx90a
,gfx908,gfx908,gfx908,gfx908,gfx908,gfx908,gfx908,gfx908,gfx908,gfx908,gfx908,gfx908,gfx908,gfx908,gfx908,gfx908,gfx908,gfx908,gfx908
,,,,,,,,,,,,,,,,,,,
FRAMEWORK SUPPORT,.. _framework-support-compatibility-matrix-past-60:,,,,,,,,,,,,,,,,,,
:doc:`PyTorch <../compatibility/ml-compatibility/pytorch-compatibility>`,"2.7, 2.6, 2.5, 2.4, 2.3","2.6, 2.5, 2.4, 2.3","2.6, 2.5, 2.4, 2.3","2.6, 2.5, 2.4, 2.3","2.6, 2.5, 2.4, 2.3","2.4, 2.3, 2.2, 1.13","2.4, 2.3, 2.2, 1.13","2.4, 2.3, 2.2, 1.13","2.4, 2.3, 2.2, 2.1, 2.0, 1.13","2.3, 2.2, 2.1, 2.0, 1.13","2.3, 2.2, 2.1, 2.0, 1.13","2.3, 2.2, 2.1, 2.0, 1.13","2.3, 2.2, 2.1, 2.0, 1.13","2.1, 2.0, 1.13","2.1, 2.0, 1.13","2.1, 2.0, 1.13","2.1, 2.0, 1.13","2.1, 2.0, 1.13","2.1, 2.0, 1.13"
:doc:`TensorFlow <../compatibility/ml-compatibility/tensorflow-compatibility>`,"2.19.1, 2.18.1","2.18.1, 2.17.1, 2.16.2","2.18.1, 2.17.1, 2.16.2","2.18.1, 2.17.1, 2.16.2","2.18.1, 2.17.1, 2.16.2","2.17.0, 2.16.2, 2.15.1","2.17.0, 2.16.2, 2.15.1","2.17.0, 2.16.2, 2.15.1","2.17.0, 2.16.2, 2.15.1","2.16.1, 2.15.1, 2.14.1","2.16.1, 2.15.1, 2.14.1","2.16.1, 2.15.1, 2.14.1","2.16.1, 2.15.1, 2.14.1","2.15.0, 2.14.0, 2.13.1","2.15.0, 2.14.0, 2.13.1","2.15.0, 2.14.0, 2.13.1","2.15.0, 2.14.0, 2.13.1","2.14.0, 2.13.1, 2.12.1","2.14.0, 2.13.1, 2.12.1"
:doc:`JAX <../compatibility/ml-compatibility/jax-compatibility>`,0.6.0,0.4.35,0.4.35,0.4.35,0.4.35,0.4.31,0.4.31,0.4.31,0.4.31,0.4.26,0.4.26,0.4.26,0.4.26,0.4.26,0.4.26,0.4.26,0.4.26,0.4.26,0.4.26
:doc:`verl <../compatibility/ml-compatibility/verl-compatibility>` [#verl_compat]_,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,0.3.0.post0,N/A,N/A,N/A,N/A,N/A,
:doc:`Stanford Megatron-LM <../compatibility/ml-compatibility/stanford-megatron-lm-compatibility>`,N/A,N/A,N/A,N/A,N/A,85f95ae,85f95ae,85f95ae,85f95ae,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,
:doc:`DGL <../compatibility/ml-compatibility/dgl-compatibility>` [#dgl_compat]_,N/A,N/A,N/A,N/A,2.4.0,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,
:doc:`Megablocks <../compatibility/ml-compatibility/megablocks-compatibility>`,N/A,N/A,N/A,N/A,N/A,0.7.0,0.7.0,0.7.0,0.7.0,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,
:doc:`Taichi <../compatibility/ml-compatibility/taichi-compatibility>` [#taichi_compat]_,N/A,N/A,N/A,N/A,N/A,N/A,1.8.0b1,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,
`ONNX Runtime <https://onnxruntime.ai/docs/build/eps.html#amd-migraphx>`_,1.22.0,1.20.0,1.20.0,1.20.0,1.20.0,1.17.3,1.17.3,1.17.3,1.17.3,1.17.3,1.17.3,1.17.3,1.17.3,1.17.3,1.17.3,1.17.3,1.17.3,1.14.1,1.14.1
,,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,
THIRD PARTY COMMS,.. _thirdpartycomms-support-compatibility-matrix-past-60:,,,,,,,,,,,,,,,,,,
`UCC <https://github.com/ROCm/ucc>`_,>=1.4.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.2.0,>=1.2.0
`UCX <https://github.com/ROCm/ucx>`_,>=1.17.0,>=1.15.0,>=1.15.0,>=1.15.0,>=1.15.0,>=1.15.0,>=1.15.0,>=1.15.0,>=1.15.0,>=1.15.0,>=1.15.0,>=1.15.0,>=1.15.0,>=1.14.1,>=1.14.1,>=1.14.1,>=1.14.1,>=1.14.1,>=1.14.1
,,,,,,,,,,,,,,,,,,,
THIRD PARTY ALGORITHM,.. _thirdpartyalgorithm-support-compatibility-matrix-past-60:,,,,,,,,,,,,,,,,,,
Thrust,2.6.0,2.5.0,2.5.0,2.5.0,2.5.0,2.3.2,2.3.2,2.3.2,2.3.2,2.2.0,2.2.0,2.2.0,2.2.0,2.1.0,2.1.0,2.1.0,2.1.0,2.0.1,2.0.1
CUB,2.6.0,2.5.0,2.5.0,2.5.0,2.5.0,2.3.2,2.3.2,2.3.2,2.3.2,2.2.0,2.2.0,2.2.0,2.2.0,2.1.0,2.1.0,2.1.0,2.1.0,2.0.1,2.0.1
,,,,,,,,,,,,,,,,,,,
KMD & USER SPACE [#kfd_support-past-60]_,.. _kfd-userspace-support-compatibility-matrix-past-60:,,,,,,,,,,,,,,,,,,
:doc:`KMD versions <rocm-install-on-linux:reference/user-kernel-space-compat-matrix>`,"30.10, 6.4.x, 6.3.x, 6.2.x","6.4.x, 6.3.x, 6.2.x, 6.1.x","6.4.x, 6.3.x, 6.2.x, 6.1.x","6.4.x, 6.3.x, 6.2.x, 6.1.x","6.4.x, 6.3.x, 6.2.x, 6.1.x","6.4.x, 6.3.x, 6.2.x, 6.1.x","6.4.x, 6.3.x, 6.2.x, 6.1.x","6.4.x, 6.3.x, 6.2.x, 6.1.x","6.4.x, 6.3.x, 6.2.x, 6.1.x","6.4.x, 6.3.x, 6.2.x, 6.1.x, 6.0.x","6.4.x, 6.3.x, 6.2.x, 6.1.x, 6.0.x","6.4.x, 6.3.x, 6.2.x, 6.1.x, 6.0.x","6.4.x, 6.3.x, 6.2.x, 6.1.x, 6.0.x","6.4.x, 6.3.x, 6.2.x, 6.1.x, 6.0.x, 5.7.x","6.4.x, 6.3.x, 6.2.x, 6.1.x, 6.0.x, 5.7.x","6.4.x, 6.3.x, 6.2.x, 6.1.x, 6.0.x, 5.7.x","6.4.x, 6.3.x, 6.2.x, 6.1.x, 6.0.x, 5.7.x","6.2.x, 6.1.x, 6.0.x, 5.7.x, 5.6.x","6.2.x, 6.1.x, 6.0.x, 5.7.x, 5.6.x"
,,,,,,,,,,,,,,,,,,,
ML & COMPUTER VISION,.. _mllibs-support-compatibility-matrix-past-60:,,,,,,,,,,,,,,,,,,
:doc:`Composable Kernel <composable_kernel:index>`,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0
:doc:`MIGraphX <amdmigraphx:index>`,2.13.0,2.12.0,2.12.0,2.12.0,2.12.0,2.11.0,2.11.0,2.11.0,2.11.0,2.10.0,2.10.0,2.10.0,2.10.0,2.9.0,2.9.0,2.9.0,2.9.0,2.8.0,2.8.0
:doc:`MIOpen <miopen:index>`,3.5.0,3.4.0,3.4.0,3.4.0,3.4.0,3.3.0,3.3.0,3.3.0,3.3.0,3.2.0,3.2.0,3.2.0,3.2.0,3.1.0,3.1.0,3.1.0,3.1.0,3.0.0,3.0.0
:doc:`MIVisionX <mivisionx:index>`,3.3.0,3.2.0,3.2.0,3.2.0,3.2.0,3.1.0,3.1.0,3.1.0,3.1.0,3.0.0,3.0.0,3.0.0,3.0.0,2.5.0,2.5.0,2.5.0,2.5.0,2.5.0,2.5.0
:doc:`rocAL <rocal:index>`,2.3.0,2.2.0,2.2.0,2.2.0,2.2.0,2.1.0,2.1.0,2.1.0,2.1.0,2.0.0,2.0.0,2.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0
:doc:`rocDecode <rocdecode:index>`,1.0.0,0.10.0,0.10.0,0.10.0,0.10.0,0.8.0,0.8.0,0.8.0,0.8.0,0.6.0,0.6.0,0.6.0,0.6.0,0.6.0,0.6.0,0.5.0,0.5.0,N/A,N/A
:doc:`rocJPEG <rocjpeg:index>`,1.1.0,0.8.0,0.8.0,0.8.0,0.8.0,0.6.0,0.6.0,0.6.0,0.6.0,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A
:doc:`rocPyDecode <rocpydecode:index>`,0.6.0,0.3.1,0.3.1,0.3.1,0.3.1,0.2.0,0.2.0,0.2.0,0.2.0,0.1.0,0.1.0,0.1.0,0.1.0,N/A,N/A,N/A,N/A,N/A,N/A
:doc:`RPP <rpp:index>`,2.0.0,1.9.10,1.9.10,1.9.10,1.9.10,1.9.1,1.9.1,1.9.1,1.9.1,1.8.0,1.8.0,1.8.0,1.8.0,1.5.0,1.5.0,1.5.0,1.5.0,1.4.0,1.4.0
,,,,,,,,,,,,,,,,,,,
COMMUNICATION,.. _commlibs-support-compatibility-matrix-past-60:,,,,,,,,,,,,,,,,,,
:doc:`RCCL <rccl:index>`,2.26.6,2.22.3,2.22.3,2.22.3,2.22.3,2.21.5,2.21.5,2.21.5,2.21.5,2.20.5,2.20.5,2.20.5,2.20.5,2.18.6,2.18.6,2.18.6,2.18.6,2.18.3,2.18.3
:doc:`rocSHMEM <rocshmem:index>`,3.0.0,2.0.1,2.0.1,2.0.0,2.0.0,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A
,,,,,,,,,,,,,,,,,,,
MATH LIBS,.. _mathlibs-support-compatibility-matrix-past-60:,,,,,,,,,,,,,,,,,,
`half <https://github.com/ROCm/half>`_ ,1.12.0,1.12.0,1.12.0,1.12.0,1.12.0,1.12.0,1.12.0,1.12.0,1.12.0,1.12.0,1.12.0,1.12.0,1.12.0,1.12.0,1.12.0,1.12.0,1.12.0,1.12.0,1.12.0
:doc:`hipBLAS <hipblas:index>`,3.0.0,2.4.0,2.4.0,2.4.0,2.4.0,2.3.0,2.3.0,2.3.0,2.3.0,2.2.0,2.2.0,2.2.0,2.2.0,2.1.0,2.1.0,2.1.0,2.1.0,2.0.0,2.0.0
:doc:`hipBLASLt <hipblaslt:index>`,1.0.0,0.12.1,0.12.1,0.12.1,0.12.0,0.10.0,0.10.0,0.10.0,0.10.0,0.8.0,0.8.0,0.8.0,0.8.0,0.7.0,0.7.0,0.7.0,0.7.0,0.6.0,0.6.0
:doc:`hipFFT <hipfft:index>`,1.0.20,1.0.18,1.0.18,1.0.18,1.0.18,1.0.17,1.0.17,1.0.17,1.0.17,1.0.16,1.0.15,1.0.15,1.0.14,1.0.14,1.0.14,1.0.14,1.0.14,1.0.13,1.0.13
:doc:`hipfort <hipfort:index>`,0.7.0,0.6.0,0.6.0,0.6.0,0.6.0,0.5.1,0.5.1,0.5.0,0.5.0,0.4.0,0.4.0,0.4.0,0.4.0,0.4.0,0.4.0,0.4.0,0.4.0,0.4.0,0.4.0
:doc:`hipRAND <hiprand:index>`,3.0.0,2.12.0,2.12.0,2.12.0,2.12.0,2.11.1,2.11.1,2.11.1,2.11.0,2.11.1,2.11.0,2.11.0,2.11.0,2.10.16,2.10.16,2.10.16,2.10.16,2.10.16,2.10.16
:doc:`hipSOLVER <hipsolver:index>`,3.0.0,2.4.0,2.4.0,2.4.0,2.4.0,2.3.0,2.3.0,2.3.0,2.3.0,2.2.0,2.2.0,2.2.0,2.2.0,2.1.1,2.1.1,2.1.1,2.1.0,2.0.0,2.0.0
:doc:`hipSPARSE <hipsparse:index>`,4.0.1,3.2.0,3.2.0,3.2.0,3.2.0,3.1.2,3.1.2,3.1.2,3.1.2,3.1.1,3.1.1,3.1.1,3.1.1,3.0.1,3.0.1,3.0.1,3.0.1,3.0.0,3.0.0
:doc:`hipSPARSELt <hipsparselt:index>`,0.2.4,0.2.3,0.2.3,0.2.3,0.2.3,0.2.2,0.2.2,0.2.2,0.2.2,0.2.1,0.2.1,0.2.1,0.2.1,0.2.0,0.2.0,0.1.0,0.1.0,0.1.0,0.1.0
:doc:`rocALUTION <rocalution:index>`,4.0.0,3.2.3,3.2.3,3.2.3,3.2.2,3.2.1,3.2.1,3.2.1,3.2.1,3.2.1,3.2.0,3.2.0,3.2.0,3.1.1,3.1.1,3.1.1,3.1.1,3.0.3,3.0.3
:doc:`rocBLAS <rocblas:index>`,5.0.0,4.4.1,4.4.1,4.4.0,4.4.0,4.3.0,4.3.0,4.3.0,4.3.0,4.2.4,4.2.1,4.2.1,4.2.0,4.1.2,4.1.2,4.1.0,4.1.0,4.0.0,4.0.0
:doc:`rocFFT <rocfft:index>`,1.0.34,1.0.32,1.0.32,1.0.32,1.0.32,1.0.31,1.0.31,1.0.31,1.0.31,1.0.30,1.0.29,1.0.29,1.0.28,1.0.27,1.0.27,1.0.27,1.0.26,1.0.25,1.0.23
:doc:`rocRAND <rocrand:index>`,4.0.0,3.3.0,3.3.0,3.3.0,3.3.0,3.2.0,3.2.0,3.2.0,3.2.0,3.1.1,3.1.0,3.1.0,3.1.0,3.0.1,3.0.1,3.0.1,3.0.1,3.0.0,2.10.17
:doc:`rocSOLVER <rocsolver:index>`,3.30.0,3.28.2,3.28.2,3.28.0,3.28.0,3.27.0,3.27.0,3.27.0,3.27.0,3.26.2,3.26.0,3.26.0,3.26.0,3.25.0,3.25.0,3.25.0,3.25.0,3.24.0,3.24.0
:doc:`rocSPARSE <rocsparse:index>`,4.0.2,3.4.0,3.4.0,3.4.0,3.4.0,3.3.0,3.3.0,3.3.0,3.3.0,3.2.1,3.2.0,3.2.0,3.2.0,3.1.2,3.1.2,3.1.2,3.1.2,3.0.2,3.0.2
:doc:`rocWMMA <rocwmma:index>`,2.0.0,1.7.0,1.7.0,1.7.0,1.7.0,1.6.0,1.6.0,1.6.0,1.6.0,1.5.0,1.5.0,1.5.0,1.5.0,1.4.0,1.4.0,1.4.0,1.4.0,1.3.0,1.3.0
:doc:`Tensile <tensile:src/index>`,4.44.0,4.43.0,4.43.0,4.43.0,4.43.0,4.42.0,4.42.0,4.42.0,4.42.0,4.41.0,4.41.0,4.41.0,4.41.0,4.40.0,4.40.0,4.40.0,4.40.0,4.39.0,4.39.0
,,,,,,,,,,,,,,,,,,,
PRIMITIVES,.. _primitivelibs-support-compatibility-matrix-past-60:,,,,,,,,,,,,,,,,,,
:doc:`hipCUB <hipcub:index>`,4.0.0,3.4.0,3.4.0,3.4.0,3.4.0,3.3.0,3.3.0,3.3.0,3.3.0,3.2.1,3.2.0,3.2.0,3.2.0,3.1.0,3.1.0,3.1.0,3.1.0,3.0.0,3.0.0
:doc:`hipTensor <hiptensor:index>`,2.0.0,1.5.0,1.5.0,1.5.0,1.5.0,1.4.0,1.4.0,1.4.0,1.4.0,1.3.0,1.3.0,1.3.0,1.3.0,1.2.0,1.2.0,1.2.0,1.2.0,1.1.0,1.1.0
:doc:`rocPRIM <rocprim:index>`,4.0.0,3.4.1,3.4.1,3.4.0,3.4.0,3.3.0,3.3.0,3.3.0,3.3.0,3.2.2,3.2.0,3.2.0,3.2.0,3.1.0,3.1.0,3.1.0,3.1.0,3.0.0,3.0.0
:doc:`rocThrust <rocthrust:index>`,4.0.0,3.3.0,3.3.0,3.3.0,3.3.0,3.3.0,3.3.0,3.3.0,3.3.0,3.1.1,3.1.0,3.1.0,3.0.1,3.0.1,3.0.1,3.0.1,3.0.1,3.0.0,3.0.0
,,,,,,,,,,,,,,,,,,,
SUPPORT LIBS,,,,,,,,,,,,,,,,,,,
`hipother <https://github.com/ROCm/hipother>`_,7.0.51830,6.4.43483,6.4.43483,6.4.43483,6.4.43482,6.3.42134,6.3.42134,6.3.42133,6.3.42131,6.2.41134,6.2.41134,6.2.41134,6.2.41133,6.1.40093,6.1.40093,6.1.40092,6.1.40091,6.1.32831,6.1.32830
`rocm-core <https://github.com/ROCm/rocm-core>`_,7.0.0,6.4.3,6.4.2,6.4.1,6.4.0,6.3.3,6.3.2,6.3.1,6.3.0,6.2.4,6.2.2,6.2.1,6.2.0,6.1.5,6.1.2,6.1.1,6.1.0,6.0.2,6.0.0
`ROCT-Thunk-Interface <https://github.com/ROCm/ROCT-Thunk-Interface>`_,N/A [#ROCT-rocr-past-60]_,N/A [#ROCT-rocr-past-60]_,N/A [#ROCT-rocr-past-60]_,N/A [#ROCT-rocr-past-60]_,N/A [#ROCT-rocr-past-60]_,N/A [#ROCT-rocr-past-60]_,N/A [#ROCT-rocr-past-60]_,N/A [#ROCT-rocr-past-60]_,N/A [#ROCT-rocr-past-60]_,20240607.5.7,20240607.5.7,20240607.4.05,20240607.1.4246,20240125.5.08,20240125.5.08,20240125.5.08,20240125.3.30,20231016.2.245,20231016.2.245
,,,,,,,,,,,,,,,,,,,
SYSTEM MGMT TOOLS,.. _tools-support-compatibility-matrix-past-60:,,,,,,,,,,,,,,,,,,
:doc:`AMD SMI <amdsmi:index>`,26.0.0,25.5.1,25.5.1,25.4.2,25.3.0,24.7.1,24.7.1,24.7.1,24.7.1,24.6.3,24.6.3,24.6.3,24.6.2,24.5.1,24.5.1,24.5.1,24.4.1,23.4.2,23.4.2
:doc:`ROCm Data Center Tool <rdc:index>`,1.1.0,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0
:doc:`rocminfo <rocminfo:index>`,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0
:doc:`ROCm SMI <rocm_smi_lib:index>`,7.8.0,7.7.0,7.5.0,7.5.0,7.5.0,7.4.0,7.4.0,7.4.0,7.4.0,7.3.0,7.3.0,7.3.0,7.3.0,7.2.0,7.2.0,7.0.0,7.0.0,6.0.2,6.0.0
:doc:`ROCm Validation Suite <rocmvalidationsuite:index>`,1.2.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.0.60204,1.0.60202,1.0.60201,1.0.60200,1.0.60105,1.0.60102,1.0.60101,1.0.60100,1.0.60002,1.0.60000
,,,,,,,,,,,,,,,,,,,
PERFORMANCE TOOLS,,,,,,,,,,,,,,,,,,,
:doc:`ROCm Bandwidth Test <rocm_bandwidth_test:index>`,2.6.0,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0
:doc:`ROCm Compute Profiler <rocprofiler-compute:index>`,3.2.3,3.1.1,3.1.1,3.1.0,3.1.0,3.0.0,3.0.0,3.0.0,3.0.0,2.0.1,2.0.1,2.0.1,2.0.1,N/A,N/A,N/A,N/A,N/A,N/A
:doc:`ROCm Systems Profiler <rocprofiler-systems:index>`,1.1.0,1.0.2,1.0.2,1.0.1,1.0.0,0.1.2,0.1.1,0.1.0,0.1.0,1.11.2,1.11.2,1.11.2,1.11.2,N/A,N/A,N/A,N/A,N/A,N/A
:doc:`ROCProfiler <rocprofiler:index>`,2.0.70000,2.0.60403,2.0.60402,2.0.60401,2.0.60400,2.0.60303,2.0.60302,2.0.60301,2.0.60300,2.0.60204,2.0.60202,2.0.60201,2.0.60200,2.0.60105,2.0.60102,2.0.60101,2.0.60100,2.0.60002,2.0.60000
:doc:`ROCprofiler-SDK <rocprofiler-sdk:index>`,1.0.0,0.6.0,0.6.0,0.6.0,0.6.0,0.5.0,0.5.0,0.5.0,0.5.0,0.4.0,0.4.0,0.4.0,0.4.0,N/A,N/A,N/A,N/A,N/A,N/A
:doc:`ROCTracer <roctracer:index>`,4.1.70000,4.1.60403,4.1.60402,4.1.60401,4.1.60400,4.1.60303,4.1.60302,4.1.60301,4.1.60300,4.1.60204,4.1.60202,4.1.60201,4.1.60200,4.1.60105,4.1.60102,4.1.60101,4.1.60100,4.1.60002,4.1.60000
,,,,,,,,,,,,,,,,,,,
DEVELOPMENT TOOLS,,,,,,,,,,,,,,,,,,,
:doc:`HIPIFY <hipify:index>`,20.0.0,19.0.0,19.0.0,19.0.0,19.0.0,18.0.0.25012,18.0.0.25012,18.0.0.24491,18.0.0.24455,18.0.0.24392,18.0.0.24355,18.0.0.24355,18.0.0.24232,17.0.0.24193,17.0.0.24193,17.0.0.24154,17.0.0.24103,17.0.0.24012,17.0.0.23483
:doc:`ROCm CMake <rocmcmakebuildtools:index>`,0.14.0,0.14.0,0.14.0,0.14.0,0.14.0,0.14.0,0.14.0,0.14.0,0.14.0,0.13.0,0.13.0,0.13.0,0.13.0,0.12.0,0.12.0,0.12.0,0.12.0,0.11.0,0.11.0
:doc:`ROCdbgapi <rocdbgapi:index>`,0.77.3,0.77.2,0.77.2,0.77.2,0.77.2,0.77.0,0.77.0,0.77.0,0.77.0,0.76.0,0.76.0,0.76.0,0.76.0,0.71.0,0.71.0,0.71.0,0.71.0,0.71.0,0.71.0
:doc:`ROCm Debugger (ROCgdb) <rocgdb:index>`,16.3.0,15.2.0,15.2.0,15.2.0,15.2.0,15.2.0,15.2.0,15.2.0,15.2.0,14.2.0,14.2.0,14.2.0,14.2.0,14.1.0,14.1.0,14.1.0,14.1.0,13.2.0,13.2.0
`rocprofiler-register <https://github.com/ROCm/rocprofiler-register>`_,0.5.0,0.4.0,0.4.0,0.4.0,0.4.0,0.4.0,0.4.0,0.4.0,0.4.0,0.4.0,0.4.0,0.4.0,0.4.0,0.3.0,0.3.0,0.3.0,0.3.0,N/A,N/A
:doc:`ROCr Debug Agent <rocr_debug_agent:index>`,2.1.0,2.0.4,2.0.4,2.0.4,2.0.4,2.0.3,2.0.3,2.0.3,2.0.3,2.0.3,2.0.3,2.0.3,2.0.3,2.0.3,2.0.3,2.0.3,2.0.3,2.0.3,2.0.3
,,,,,,,,,,,,,,,,,,,
COMPILERS,.. _compilers-support-compatibility-matrix-past-60:,,,,,,,,,,,,,,,,,,
`clang-ocl <https://github.com/ROCm/clang-ocl>`_,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,0.5.0,0.5.0,0.5.0,0.5.0,0.5.0,0.5.0
:doc:`hipCC <hipcc:index>`,1.1.1,1.1.1,1.1.1,1.1.1,1.1.1,1.1.1,1.1.1,1.1.1,1.1.1,1.1.1,1.1.1,1.1.1,1.1.1,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0
`Flang <https://github.com/ROCm/flang>`_,20.0.0.25314,19.0.0.25224,19.0.0.25224,19.0.0.25184,19.0.0.25133,18.0.0.25012,18.0.0.25012,18.0.0.24491,18.0.0.24455,18.0.0.24392,18.0.0.24355,18.0.0.24355,18.0.0.24232,17.0.0.24193,17.0.0.24193,17.0.0.24154,17.0.0.24103,17.0.0.24012,17.0.0.23483
:doc:`llvm-project <llvm-project:index>`,20.0.0.25314,19.0.0.25224,19.0.0.25224,19.0.0.25184,19.0.0.25133,18.0.0.25012,18.0.0.25012,18.0.0.24491,18.0.0.24491,18.0.0.24392,18.0.0.24355,18.0.0.24355,18.0.0.24232,17.0.0.24193,17.0.0.24193,17.0.0.24154,17.0.0.24103,17.0.0.24012,17.0.0.23483
`OpenMP <https://github.com/ROCm/llvm-project/tree/amd-staging/openmp>`_,20.0.0.25314,19.0.0.25224,19.0.0.25224,19.0.0.25184,19.0.0.25133,18.0.0.25012,18.0.0.25012,18.0.0.24491,18.0.0.24491,18.0.0.24392,18.0.0.24355,18.0.0.24355,18.0.0.24232,17.0.0.24193,17.0.0.24193,17.0.0.24154,17.0.0.24103,17.0.0.24012,17.0.0.23483
,,,,,,,,,,,,,,,,,,,
RUNTIMES,.. _runtime-support-compatibility-matrix-past-60:,,,,,,,,,,,,,,,,,,
:doc:`AMD CLR <hip:understand/amd_clr>`,7.0.51830,6.4.43484,6.4.43484,6.4.43483,6.4.43482,6.3.42134,6.3.42134,6.3.42133,6.3.42131,6.2.41134,6.2.41134,6.2.41134,6.2.41133,6.1.40093,6.1.40093,6.1.40092,6.1.40091,6.1.32831,6.1.32830
:doc:`HIP <hip:index>`,7.0.51830,6.4.43484,6.4.43484,6.4.43483,6.4.43482,6.3.42134,6.3.42134,6.3.42133,6.3.42131,6.2.41134,6.2.41134,6.2.41134,6.2.41133,6.1.40093,6.1.40093,6.1.40092,6.1.40091,6.1.32831,6.1.32830
`OpenCL Runtime <https://github.com/ROCm/clr/tree/develop/opencl>`_,2.0.0,2.0.0,2.0.0,2.0.0,2.0.0,2.0.0,2.0.0,2.0.0,2.0.0,2.0.0,2.0.0,2.0.0,2.0.0,2.0.0,2.0.0,2.0.0,2.0.0,2.0.0,2.0.0
:doc:`ROCr Runtime <rocr-runtime:index>`,1.18.0,1.15.0,1.15.0,1.15.0,1.15.0,1.14.0,1.14.0,1.14.0,1.14.0,1.14.0,1.14.0,1.14.0,1.13.0,1.13.0,1.13.0,1.13.0,1.13.0,1.12.0,1.12.0
ROCm Version,7.1.1,7.1.0,7.0.2,7.0.1/7.0.0,6.4.3,6.4.2,6.4.1,6.4.0,6.3.3,6.3.2,6.3.1,6.3.0,6.2.4,6.2.2,6.2.1,6.2.0, 6.1.5, 6.1.2, 6.1.1, 6.1.0, 6.0.2, 6.0.0
:ref:`Operating systems & kernels <OS-kernel-versions>` [#os-compatibility-past-60]_,Ubuntu 24.04.3,Ubuntu 24.04.3,Ubuntu 24.04.3,Ubuntu 24.04.3,Ubuntu 24.04.2,Ubuntu 24.04.2,Ubuntu 24.04.2,Ubuntu 24.04.2,Ubuntu 24.04.2,Ubuntu 24.04.2,Ubuntu 24.04.2,Ubuntu 24.04.2,"Ubuntu 24.04.1, 24.04","Ubuntu 24.04.1, 24.04","Ubuntu 24.04.1, 24.04",Ubuntu 24.04,,,,,,
,Ubuntu 22.04.5,Ubuntu 22.04.5,Ubuntu 22.04.5,Ubuntu 22.04.5,Ubuntu 22.04.5,Ubuntu 22.04.5,Ubuntu 22.04.5,Ubuntu 22.04.5,Ubuntu 22.04.5,Ubuntu 22.04.5,Ubuntu 22.04.5,Ubuntu 22.04.5,"Ubuntu 22.04.5, 22.04.4","Ubuntu 22.04.5, 22.04.4","Ubuntu 22.04.5, 22.04.4","Ubuntu 22.04.5, 22.04.4","Ubuntu 22.04.5, 22.04.4, 22.04.3","Ubuntu 22.04.4, 22.04.3","Ubuntu 22.04.4, 22.04.3","Ubuntu 22.04.4, 22.04.3","Ubuntu 22.04.4, 22.04.3, 22.04.2","Ubuntu 22.04.4, 22.04.3, 22.04.2"
,,,,,,,,,,,,,,,,,"Ubuntu 20.04.6, 20.04.5","Ubuntu 20.04.6, 20.04.5","Ubuntu 20.04.6, 20.04.5","Ubuntu 20.04.6, 20.04.5","Ubuntu 20.04.6, 20.04.5","Ubuntu 20.04.6, 20.04.5"
,"RHEL 10.1, 10.0, 9.7, 9.6, 9.4","RHEL 10.0, 9.6, 9.4","RHEL 10.0, 9.6, 9.4","RHEL 9.6, 9.4","RHEL 9.6, 9.4","RHEL 9.6, 9.4","RHEL 9.6, 9.5, 9.4","RHEL 9.5, 9.4","RHEL 9.5, 9.4","RHEL 9.5, 9.4","RHEL 9.5, 9.4","RHEL 9.5, 9.4","RHEL 9.4, 9.3","RHEL 9.4, 9.3","RHEL 9.4, 9.3","RHEL 9.4, 9.3","RHEL 9.4, 9.3, 9.2","RHEL 9.4, 9.3, 9.2","RHEL 9.4, 9.3, 9.2","RHEL 9.4, 9.3, 9.2","RHEL 9.3, 9.2","RHEL 9.3, 9.2"
,RHEL 8.10,RHEL 8.10,RHEL 8.10,RHEL 8.10,RHEL 8.10,RHEL 8.10,RHEL 8.10,RHEL 8.10,RHEL 8.10,RHEL 8.10,RHEL 8.10,RHEL 8.10,"RHEL 8.10, 8.9","RHEL 8.10, 8.9","RHEL 8.10, 8.9","RHEL 8.10, 8.9","RHEL 8.9, 8.8","RHEL 8.9, 8.8","RHEL 8.9, 8.8","RHEL 8.9, 8.8","RHEL 8.9, 8.8","RHEL 8.9, 8.8"
,SLES 15 SP7,SLES 15 SP7,SLES 15 SP7,SLES 15 SP7,"SLES 15 SP7, SP6","SLES 15 SP7, SP6",SLES 15 SP6,SLES 15 SP6,"SLES 15 SP6, SP5","SLES 15 SP6, SP5","SLES 15 SP6, SP5","SLES 15 SP6, SP5","SLES 15 SP6, SP5","SLES 15 SP6, SP5","SLES 15 SP6, SP5","SLES 15 SP6, SP5","SLES 15 SP5, SP4","SLES 15 SP5, SP4","SLES 15 SP5, SP4","SLES 15 SP5, SP4","SLES 15 SP5, SP4","SLES 15 SP5, SP4"
,,,,,,,,,,,,,,,,,,CentOS 7.9,CentOS 7.9,CentOS 7.9,CentOS 7.9,CentOS 7.9
,"Oracle Linux 10, 9, 8","Oracle Linux 10, 9, 8","Oracle Linux 10, 9, 8","Oracle Linux 9, 8","Oracle Linux 9, 8","Oracle Linux 9, 8","Oracle Linux 9, 8","Oracle Linux 9, 8",Oracle Linux 8.10,Oracle Linux 8.10,Oracle Linux 8.10,Oracle Linux 8.10,Oracle Linux 8.9,Oracle Linux 8.9,Oracle Linux 8.9,Oracle Linux 8.9,Oracle Linux 8.9,Oracle Linux 8.9,Oracle Linux 8.9,,,
,"Debian 13, 12","Debian 13, 12","Debian 13, 12",Debian 12,Debian 12,Debian 12,Debian 12,Debian 12,Debian 12,Debian 12,Debian 12,,,,,,,,,,,
,,,Azure Linux 3.0,Azure Linux 3.0,Azure Linux 3.0,Azure Linux 3.0,Azure Linux 3.0,Azure Linux 3.0,Azure Linux 3.0,Azure Linux 3.0,,,,,,,,,,,,
,Rocky Linux 9,Rocky Linux 9,Rocky Linux 9,Rocky Linux 9,,,,,,,,,,,,,,,,,,
,.. _architecture-support-compatibility-matrix-past-60:,,,,,,,,,,,,,,,,,,,,,
:doc:`Architecture <rocm-install-on-linux:reference/system-requirements>`,CDNA4,CDNA4,CDNA4,CDNA4,,,,,,,,,,,,,,,,,,
,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3
,CDNA2,CDNA2,CDNA2,CDNA2,CDNA2,CDNA2,CDNA2,CDNA2,CDNA2,CDNA2,CDNA2,CDNA2,CDNA2,CDNA2,CDNA2,CDNA2,CDNA2,CDNA2,CDNA2,CDNA2,CDNA2,CDNA2
,CDNA,CDNA,CDNA,CDNA,CDNA,CDNA,CDNA,CDNA,CDNA,CDNA,CDNA,CDNA,CDNA,CDNA,CDNA,CDNA,CDNA,CDNA,CDNA,CDNA,CDNA,CDNA
,RDNA4,RDNA4,RDNA4,RDNA4,RDNA4,RDNA4,RDNA4,,,,,,,,,,,,,,,
,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3
,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2
,.. _gpu-support-compatibility-matrix-past-60:,,,,,,,,,,,,,,,,,,,,,
:doc:`GPU / LLVM target <rocm-install-on-linux:reference/system-requirements>` [#gpu-compatibility-past-60]_,gfx950,gfx950,gfx950,gfx950,,,,,,,,,,,,,,,,,,
,gfx1201,gfx1201,gfx1201,gfx1201,gfx1201,gfx1201,gfx1201,,,,,,,,,,,,,,,
,gfx1200,gfx1200,gfx1200,gfx1200,gfx1200,gfx1200,gfx1200,,,,,,,,,,,,,,,
,gfx1101,gfx1101,gfx1101,gfx1101,gfx1101,gfx1101,gfx1101,,,,,,,,,,,,,,,
,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100
,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030
,gfx942,gfx942,gfx942,gfx942,gfx942,gfx942,gfx942,gfx942,gfx942,gfx942,gfx942,gfx942,gfx942,gfx942,gfx942,gfx942, gfx942, gfx942, gfx942, gfx942, gfx942, gfx942
,gfx90a,gfx90a,gfx90a,gfx90a,gfx90a,gfx90a,gfx90a,gfx90a,gfx90a,gfx90a,gfx90a,gfx90a,gfx90a,gfx90a,gfx90a,gfx90a,gfx90a,gfx90a,gfx90a,gfx90a,gfx90a,gfx90a
,gfx908,gfx908,gfx908,gfx908,gfx908,gfx908,gfx908,gfx908,gfx908,gfx908,gfx908,gfx908,gfx908,gfx908,gfx908,gfx908,gfx908,gfx908,gfx908,gfx908,gfx908,gfx908
,,,,,,,,,,,,,,,,,,,,,,
FRAMEWORK SUPPORT,.. _framework-support-compatibility-matrix-past-60:,,,,,,,,,,,,,,,,,,,,,
:doc:`PyTorch <../compatibility/ml-compatibility/pytorch-compatibility>`,"2.9, 2.8, 2.7","2.8, 2.7, 2.6","2.8, 2.7, 2.6","2.7, 2.6, 2.5","2.6, 2.5, 2.4, 2.3","2.6, 2.5, 2.4, 2.3","2.6, 2.5, 2.4, 2.3","2.6, 2.5, 2.4, 2.3","2.4, 2.3, 2.2, 1.13","2.4, 2.3, 2.2, 1.13","2.4, 2.3, 2.2, 1.13","2.4, 2.3, 2.2, 2.1, 2.0, 1.13","2.3, 2.2, 2.1, 2.0, 1.13","2.3, 2.2, 2.1, 2.0, 1.13","2.3, 2.2, 2.1, 2.0, 1.13","2.3, 2.2, 2.1, 2.0, 1.13","2.1, 2.0, 1.13","2.1, 2.0, 1.13","2.1, 2.0, 1.13","2.1, 2.0, 1.13","2.1, 2.0, 1.13","2.1, 2.0, 1.13"
:doc:`TensorFlow <../compatibility/ml-compatibility/tensorflow-compatibility>`,"2.20.0, 2.19.1, 2.18.1","2.20.0, 2.19.1, 2.18.1","2.19.1, 2.18.1, 2.17.1 [#tf-mi350-past-60]_","2.19.1, 2.18.1, 2.17.1 [#tf-mi350-past-60]_","2.18.1, 2.17.1, 2.16.2","2.18.1, 2.17.1, 2.16.2","2.18.1, 2.17.1, 2.16.2","2.18.1, 2.17.1, 2.16.2","2.17.0, 2.16.2, 2.15.1","2.17.0, 2.16.2, 2.15.1","2.17.0, 2.16.2, 2.15.1","2.17.0, 2.16.2, 2.15.1","2.16.1, 2.15.1, 2.14.1","2.16.1, 2.15.1, 2.14.1","2.16.1, 2.15.1, 2.14.1","2.16.1, 2.15.1, 2.14.1","2.15.0, 2.14.0, 2.13.1","2.15.0, 2.14.0, 2.13.1","2.15.0, 2.14.0, 2.13.1","2.15.0, 2.14.0, 2.13.1","2.14.0, 2.13.1, 2.12.1","2.14.0, 2.13.1, 2.12.1"
:doc:`JAX <../compatibility/ml-compatibility/jax-compatibility>`,0.7.1,0.7.1,0.6.0,0.6.0,0.4.35,0.4.35,0.4.35,0.4.35,0.4.31,0.4.31,0.4.31,0.4.31,0.4.26,0.4.26,0.4.26,0.4.26,0.4.26,0.4.26,0.4.26,0.4.26,0.4.26,0.4.26
:doc:`verl <../compatibility/ml-compatibility/verl-compatibility>` [#verl_compat-past-60]_,N/A,N/A,N/A,0.6.0,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,0.3.0.post0,N/A,N/A,N/A,N/A,N/A,N/A
:doc:`Stanford Megatron-LM <../compatibility/ml-compatibility/stanford-megatron-lm-compatibility>` [#stanford-megatron-lm_compat-past-60]_,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,85f95ae,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A
:doc:`DGL <../compatibility/ml-compatibility/dgl-compatibility>` [#dgl_compat-past-60]_,N/A,N/A,N/A,2.4.0,2.4.0,N/A,N/A,2.4.0,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A
:doc:`Megablocks <../compatibility/ml-compatibility/megablocks-compatibility>` [#megablocks_compat-past-60]_,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,0.7.0,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A
:doc:`Ray <../compatibility/ml-compatibility/ray-compatibility>` [#ray_compat-past-60]_,N/A,N/A,N/A,2.51.1,N/A,N/A,2.48.0.post0,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A
:doc:`llama.cpp <../compatibility/ml-compatibility/llama-cpp-compatibility>` [#llama-cpp_compat-past-60]_,N/A,N/A,N/A,b6652,b6356,b6356,b6356,b5997,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A
:doc:`FlashInfer <../compatibility/ml-compatibility/flashinfer-compatibility>` [#flashinfer_compat-past-60]_,N/A,N/A,N/A,N/A,N/A,N/A,v0.2.5,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A
`ONNX Runtime <https://onnxruntime.ai/docs/build/eps.html#amd-migraphx>`_,1.23.1,1.22.0,1.22.0,1.22.0,1.20.0,1.20.0,1.20.0,1.20.0,1.17.3,1.17.3,1.17.3,1.17.3,1.17.3,1.17.3,1.17.3,1.17.3,1.17.3,1.17.3,1.17.3,1.17.3,1.14.1,1.14.1
,,,,,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,,,,
THIRD PARTY COMMS,.. _thirdpartycomms-support-compatibility-matrix-past-60:,,,,,,,,,,,,,,,,,,,,,
`UCC <https://github.com/ROCm/ucc>`_,>=1.4.0,>=1.4.0,>=1.4.0,>=1.4.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.2.0,>=1.2.0
`UCX <https://github.com/ROCm/ucx>`_,>=1.17.0,>=1.17.0,>=1.17.0,>=1.17.0,>=1.15.0,>=1.15.0,>=1.15.0,>=1.15.0,>=1.15.0,>=1.15.0,>=1.15.0,>=1.15.0,>=1.15.0,>=1.15.0,>=1.15.0,>=1.15.0,>=1.14.1,>=1.14.1,>=1.14.1,>=1.14.1,>=1.14.1,>=1.14.1
,,,,,,,,,,,,,,,,,,,,,,
THIRD PARTY ALGORITHM,.. _thirdpartyalgorithm-support-compatibility-matrix-past-60:,,,,,,,,,,,,,,,,,,,,,
Thrust,2.8.5,2.8.5,2.6.0,2.6.0,2.5.0,2.5.0,2.5.0,2.5.0,2.3.2,2.3.2,2.3.2,2.3.2,2.2.0,2.2.0,2.2.0,2.2.0,2.1.0,2.1.0,2.1.0,2.1.0,2.0.1,2.0.1
CUB,2.8.5,2.8.5,2.6.0,2.6.0,2.5.0,2.5.0,2.5.0,2.5.0,2.3.2,2.3.2,2.3.2,2.3.2,2.2.0,2.2.0,2.2.0,2.2.0,2.1.0,2.1.0,2.1.0,2.1.0,2.0.1,2.0.1
,,,,,,,,,,,,,,,,,,,,,,
DRIVER & USER SPACE [#kfd_support-past-60]_,.. _kfd-userspace-support-compatibility-matrix-past-60:,,,,,,,,,,,,,,,,,,,,,
:doc:`AMD GPU Driver <rocm-install-on-linux:reference/user-kernel-space-compat-matrix>`,"30.20.1, 30.20.0 [#mi325x_KVM-past-60]_, 30.10.2, 30.10.1 [#driver_patch-past-60]_, 30.10, 6.4.x","30.20.0 [#mi325x_KVM-past-60]_, 30.10.2, 30.10.1 [#driver_patch-past-60]_, 30.10, 6.4.x","30.10.2, 30.10.1 [#driver_patch-past-60]_, 30.10, 6.4.x, 6.3.x","30.10.1 [#driver_patch-past-60]_, 30.10, 6.4.x, 6.3.x, 6.2.x","6.4.x, 6.3.x, 6.2.x, 6.1.x","6.4.x, 6.3.x, 6.2.x, 6.1.x","6.4.x, 6.3.x, 6.2.x, 6.1.x","6.4.x, 6.3.x, 6.2.x, 6.1.x","6.4.x, 6.3.x, 6.2.x, 6.1.x","6.4.x, 6.3.x, 6.2.x, 6.1.x","6.4.x, 6.3.x, 6.2.x, 6.1.x","6.4.x, 6.3.x, 6.2.x, 6.1.x","6.4.x, 6.3.x, 6.2.x, 6.1.x, 6.0.x","6.4.x, 6.3.x, 6.2.x, 6.1.x, 6.0.x","6.4.x, 6.3.x, 6.2.x, 6.1.x, 6.0.x","6.4.x, 6.3.x, 6.2.x, 6.1.x, 6.0.x","6.4.x, 6.3.x, 6.2.x, 6.1.x, 6.0.x, 5.7.x","6.4.x, 6.3.x, 6.2.x, 6.1.x, 6.0.x, 5.7.x","6.4.x, 6.3.x, 6.2.x, 6.1.x, 6.0.x, 5.7.x","6.4.x, 6.3.x, 6.2.x, 6.1.x, 6.0.x, 5.7.x","6.2.x, 6.1.x, 6.0.x, 5.7.x, 5.6.x","6.2.x, 6.1.x, 6.0.x, 5.7.x, 5.6.x"
,,,,,,,,,,,,,,,,,,,,,,
ML & COMPUTER VISION,.. _mllibs-support-compatibility-matrix-past-60:,,,,,,,,,,,,,,,,,,,,,
:doc:`Composable Kernel <composable_kernel:index>`,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0
:doc:`MIGraphX <amdmigraphx:index>`,2.14.0,2.14.0,2.13.0,2.13.0,2.12.0,2.12.0,2.12.0,2.12.0,2.11.0,2.11.0,2.11.0,2.11.0,2.10.0,2.10.0,2.10.0,2.10.0,2.9.0,2.9.0,2.9.0,2.9.0,2.8.0,2.8.0
:doc:`MIOpen <miopen:index>`,3.5.1,3.5.1,3.5.0,3.5.0,3.4.0,3.4.0,3.4.0,3.4.0,3.3.0,3.3.0,3.3.0,3.3.0,3.2.0,3.2.0,3.2.0,3.2.0,3.1.0,3.1.0,3.1.0,3.1.0,3.0.0,3.0.0
:doc:`MIVisionX <mivisionx:index>`,3.4.0,3.4.0,3.3.0,3.3.0,3.2.0,3.2.0,3.2.0,3.2.0,3.1.0,3.1.0,3.1.0,3.1.0,3.0.0,3.0.0,3.0.0,3.0.0,2.5.0,2.5.0,2.5.0,2.5.0,2.5.0,2.5.0
:doc:`rocAL <rocal:index>`,2.4.0,2.4.0,2.3.0,2.3.0,2.2.0,2.2.0,2.2.0,2.2.0,2.1.0,2.1.0,2.1.0,2.1.0,2.0.0,2.0.0,2.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0
:doc:`rocDecode <rocdecode:index>`,1.4.0,1.4.0,1.0.0,1.0.0,0.10.0,0.10.0,0.10.0,0.10.0,0.8.0,0.8.0,0.8.0,0.8.0,0.6.0,0.6.0,0.6.0,0.6.0,0.6.0,0.6.0,0.5.0,0.5.0,N/A,N/A
:doc:`rocJPEG <rocjpeg:index>`,1.2.0,1.2.0,1.1.0,1.1.0,0.8.0,0.8.0,0.8.0,0.8.0,0.6.0,0.6.0,0.6.0,0.6.0,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A
:doc:`rocPyDecode <rocpydecode:index>`,0.7.0,0.7.0,0.6.0,0.6.0,0.3.1,0.3.1,0.3.1,0.3.1,0.2.0,0.2.0,0.2.0,0.2.0,0.1.0,0.1.0,0.1.0,0.1.0,N/A,N/A,N/A,N/A,N/A,N/A
:doc:`RPP <rpp:index>`,2.1.0,2.1.0,2.0.0,2.0.0,1.9.10,1.9.10,1.9.10,1.9.10,1.9.1,1.9.1,1.9.1,1.9.1,1.8.0,1.8.0,1.8.0,1.8.0,1.5.0,1.5.0,1.5.0,1.5.0,1.4.0,1.4.0
,,,,,,,,,,,,,,,,,,,,,,
COMMUNICATION,.. _commlibs-support-compatibility-matrix-past-60:,,,,,,,,,,,,,,,,,,,,,
:doc:`RCCL <rccl:index>`,2.27.7,2.27.7,2.26.6,2.26.6,2.22.3,2.22.3,2.22.3,2.22.3,2.21.5,2.21.5,2.21.5,2.21.5,2.20.5,2.20.5,2.20.5,2.20.5,2.18.6,2.18.6,2.18.6,2.18.6,2.18.3,2.18.3
:doc:`rocSHMEM <rocshmem:index>`,3.1.0,3.0.0,3.0.0,3.0.0,2.0.1,2.0.1,2.0.0,2.0.0,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A
,,,,,,,,,,,,,,,,,,,,,,
MATH LIBS,.. _mathlibs-support-compatibility-matrix-past-60:,,,,,,,,,,,,,,,,,,,,,
`half <https://github.com/ROCm/half>`_ ,1.12.0,1.12.0,1.12.0,1.12.0,1.12.0,1.12.0,1.12.0,1.12.0,1.12.0,1.12.0,1.12.0,1.12.0,1.12.0,1.12.0,1.12.0,1.12.0,1.12.0,1.12.0,1.12.0,1.12.0,1.12.0,1.12.0
:doc:`hipBLAS <hipblas:index>`,3.1.0,3.1.0,3.0.2,3.0.0,2.4.0,2.4.0,2.4.0,2.4.0,2.3.0,2.3.0,2.3.0,2.3.0,2.2.0,2.2.0,2.2.0,2.2.0,2.1.0,2.1.0,2.1.0,2.1.0,2.0.0,2.0.0
:doc:`hipBLASLt <hipblaslt:index>`,1.1.0,1.1.0,1.0.0,1.0.0,0.12.1,0.12.1,0.12.1,0.12.0,0.10.0,0.10.0,0.10.0,0.10.0,0.8.0,0.8.0,0.8.0,0.8.0,0.7.0,0.7.0,0.7.0,0.7.0,0.6.0,0.6.0
:doc:`hipFFT <hipfft:index>`,1.0.21,1.0.21,1.0.20,1.0.20,1.0.18,1.0.18,1.0.18,1.0.18,1.0.17,1.0.17,1.0.17,1.0.17,1.0.16,1.0.15,1.0.15,1.0.14,1.0.14,1.0.14,1.0.14,1.0.14,1.0.13,1.0.13
:doc:`hipfort <hipfort:index>`,0.7.1,0.7.1,0.7.0,0.7.0,0.6.0,0.6.0,0.6.0,0.6.0,0.5.1,0.5.1,0.5.0,0.5.0,0.4.0,0.4.0,0.4.0,0.4.0,0.4.0,0.4.0,0.4.0,0.4.0,0.4.0,0.4.0
:doc:`hipRAND <hiprand:index>`,3.1.0,3.1.0,3.0.0,3.0.0,2.12.0,2.12.0,2.12.0,2.12.0,2.11.1,2.11.1,2.11.1,2.11.0,2.11.1,2.11.0,2.11.0,2.11.0,2.10.16,2.10.16,2.10.16,2.10.16,2.10.16,2.10.16
:doc:`hipSOLVER <hipsolver:index>`,3.1.0,3.1.0,3.0.0,3.0.0,2.4.0,2.4.0,2.4.0,2.4.0,2.3.0,2.3.0,2.3.0,2.3.0,2.2.0,2.2.0,2.2.0,2.2.0,2.1.1,2.1.1,2.1.1,2.1.0,2.0.0,2.0.0
:doc:`hipSPARSE <hipsparse:index>`,4.1.0,4.1.0,4.0.1,4.0.1,3.2.0,3.2.0,3.2.0,3.2.0,3.1.2,3.1.2,3.1.2,3.1.2,3.1.1,3.1.1,3.1.1,3.1.1,3.0.1,3.0.1,3.0.1,3.0.1,3.0.0,3.0.0
:doc:`hipSPARSELt <hipsparselt:index>`,0.2.5,0.2.5,0.2.4,0.2.4,0.2.3,0.2.3,0.2.3,0.2.3,0.2.2,0.2.2,0.2.2,0.2.2,0.2.1,0.2.1,0.2.1,0.2.1,0.2.0,0.2.0,0.1.0,0.1.0,0.1.0,0.1.0
:doc:`rocALUTION <rocalution:index>`,4.0.1,4.0.1,4.0.0,4.0.0,3.2.3,3.2.3,3.2.3,3.2.2,3.2.1,3.2.1,3.2.1,3.2.1,3.2.1,3.2.0,3.2.0,3.2.0,3.1.1,3.1.1,3.1.1,3.1.1,3.0.3,3.0.3
:doc:`rocBLAS <rocblas:index>`,5.1.1,5.1.0,5.0.2,5.0.0,4.4.1,4.4.1,4.4.0,4.4.0,4.3.0,4.3.0,4.3.0,4.3.0,4.2.4,4.2.1,4.2.1,4.2.0,4.1.2,4.1.2,4.1.0,4.1.0,4.0.0,4.0.0
:doc:`rocFFT <rocfft:index>`,1.0.35,1.0.35,1.0.34,1.0.34,1.0.32,1.0.32,1.0.32,1.0.32,1.0.31,1.0.31,1.0.31,1.0.31,1.0.30,1.0.29,1.0.29,1.0.28,1.0.27,1.0.27,1.0.27,1.0.26,1.0.25,1.0.23
:doc:`rocRAND <rocrand:index>`,4.1.0,4.1.0,4.0.0,4.0.0,3.3.0,3.3.0,3.3.0,3.3.0,3.2.0,3.2.0,3.2.0,3.2.0,3.1.1,3.1.0,3.1.0,3.1.0,3.0.1,3.0.1,3.0.1,3.0.1,3.0.0,2.10.17
:doc:`rocSOLVER <rocsolver:index>`,3.31.0,3.31.0,3.30.1,3.30.0,3.28.2,3.28.2,3.28.0,3.28.0,3.27.0,3.27.0,3.27.0,3.27.0,3.26.2,3.26.0,3.26.0,3.26.0,3.25.0,3.25.0,3.25.0,3.25.0,3.24.0,3.24.0
:doc:`rocSPARSE <rocsparse:index>`,4.1.0,4.1.0,4.0.2,4.0.2,3.4.0,3.4.0,3.4.0,3.4.0,3.3.0,3.3.0,3.3.0,3.3.0,3.2.1,3.2.0,3.2.0,3.2.0,3.1.2,3.1.2,3.1.2,3.1.2,3.0.2,3.0.2
:doc:`rocWMMA <rocwmma:index>`,2.1.0,2.0.0,2.0.0,2.0.0,1.7.0,1.7.0,1.7.0,1.7.0,1.6.0,1.6.0,1.6.0,1.6.0,1.5.0,1.5.0,1.5.0,1.5.0,1.4.0,1.4.0,1.4.0,1.4.0,1.3.0,1.3.0
:doc:`Tensile <tensile:src/index>`,4.44.0,4.44.0,4.44.0,4.44.0,4.43.0,4.43.0,4.43.0,4.43.0,4.42.0,4.42.0,4.42.0,4.42.0,4.41.0,4.41.0,4.41.0,4.41.0,4.40.0,4.40.0,4.40.0,4.40.0,4.39.0,4.39.0
,,,,,,,,,,,,,,,,,,,,,,
PRIMITIVES,.. _primitivelibs-support-compatibility-matrix-past-60:,,,,,,,,,,,,,,,,,,,,,
:doc:`hipCUB <hipcub:index>`,4.1.0,4.1.0,4.0.0,4.0.0,3.4.0,3.4.0,3.4.0,3.4.0,3.3.0,3.3.0,3.3.0,3.3.0,3.2.1,3.2.0,3.2.0,3.2.0,3.1.0,3.1.0,3.1.0,3.1.0,3.0.0,3.0.0
:doc:`hipTensor <hiptensor:index>`,2.0.0,2.0.0,2.0.0,2.0.0,1.5.0,1.5.0,1.5.0,1.5.0,1.4.0,1.4.0,1.4.0,1.4.0,1.3.0,1.3.0,1.3.0,1.3.0,1.2.0,1.2.0,1.2.0,1.2.0,1.1.0,1.1.0
:doc:`rocPRIM <rocprim:index>`,4.1.0,4.1.0,4.0.1,4.0.0,3.4.1,3.4.1,3.4.0,3.4.0,3.3.0,3.3.0,3.3.0,3.3.0,3.2.2,3.2.0,3.2.0,3.2.0,3.1.0,3.1.0,3.1.0,3.1.0,3.0.0,3.0.0
:doc:`rocThrust <rocthrust:index>`,4.1.0,4.1.0,4.0.0,4.0.0,3.3.0,3.3.0,3.3.0,3.3.0,3.3.0,3.3.0,3.3.0,3.3.0,3.1.1,3.1.0,3.1.0,3.0.1,3.0.1,3.0.1,3.0.1,3.0.1,3.0.0,3.0.0
,,,,,,,,,,,,,,,,,,,,,,
SUPPORT LIBS,,,,,,,,,,,,,,,,,,,,,,
`hipother <https://github.com/ROCm/hipother>`_,7.1.52802,7.1.25424,7.0.51831,7.0.51830,6.4.43483,6.4.43483,6.4.43483,6.4.43482,6.3.42134,6.3.42134,6.3.42133,6.3.42131,6.2.41134,6.2.41134,6.2.41134,6.2.41133,6.1.40093,6.1.40093,6.1.40092,6.1.40091,6.1.32831,6.1.32830
`rocm-core <https://github.com/ROCm/rocm-core>`_,7.1.1,7.1.0,7.0.2,7.0.1/7.0.0,6.4.3,6.4.2,6.4.1,6.4.0,6.3.3,6.3.2,6.3.1,6.3.0,6.2.4,6.2.2,6.2.1,6.2.0,6.1.5,6.1.2,6.1.1,6.1.0,6.0.2,6.0.0
`ROCT-Thunk-Interface <https://github.com/ROCm/ROCT-Thunk-Interface>`_,N/A [#ROCT-rocr-past-60]_,N/A [#ROCT-rocr-past-60]_,N/A [#ROCT-rocr-past-60]_,N/A [#ROCT-rocr-past-60]_,N/A [#ROCT-rocr-past-60]_,N/A [#ROCT-rocr-past-60]_,N/A [#ROCT-rocr-past-60]_,N/A [#ROCT-rocr-past-60]_,N/A [#ROCT-rocr-past-60]_,N/A [#ROCT-rocr-past-60]_,N/A [#ROCT-rocr-past-60]_,N/A [#ROCT-rocr-past-60]_,20240607.5.7,20240607.5.7,20240607.4.05,20240607.1.4246,20240125.5.08,20240125.5.08,20240125.5.08,20240125.3.30,20231016.2.245,20231016.2.245
,,,,,,,,,,,,,,,,,,,,,,
SYSTEM MGMT TOOLS,.. _tools-support-compatibility-matrix-past-60:,,,,,,,,,,,,,,,,,,,,,
:doc:`AMD SMI <amdsmi:index>`,26.2.0,26.1.0,26.0.2,26.0.0,25.5.1,25.5.1,25.4.2,25.3.0,24.7.1,24.7.1,24.7.1,24.7.1,24.6.3,24.6.3,24.6.3,24.6.2,24.5.1,24.5.1,24.5.1,24.4.1,23.4.2,23.4.2
:doc:`ROCm Data Center Tool <rdc:index>`,1.2.0,1.2.0,1.1.0,1.1.0,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0
:doc:`rocminfo <rocminfo:index>`,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0
:doc:`ROCm SMI <rocm_smi_lib:index>`,7.8.0,7.8.0,7.8.0,7.8.0,7.7.0,7.5.0,7.5.0,7.5.0,7.4.0,7.4.0,7.4.0,7.4.0,7.3.0,7.3.0,7.3.0,7.3.0,7.2.0,7.2.0,7.0.0,7.0.0,6.0.2,6.0.0
:doc:`ROCm Validation Suite <rocmvalidationsuite:index>`,1.3.0,1.2.0,1.2.0,1.2.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.0.60204,1.0.60202,1.0.60201,1.0.60200,1.0.60105,1.0.60102,1.0.60101,1.0.60100,1.0.60002,1.0.60000
,,,,,,,,,,,,,,,,,,,,,,
PERFORMANCE TOOLS,,,,,,,,,,,,,,,,,,,,,,
:doc:`ROCm Bandwidth Test <rocm_bandwidth_test:index>`,2.6.0,2.6.0,2.6.0,2.6.0,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0
:doc:`ROCm Compute Profiler <rocprofiler-compute:index>`,3.3.1,3.3.0,3.2.3,3.2.3,3.1.1,3.1.1,3.1.0,3.1.0,3.0.0,3.0.0,3.0.0,3.0.0,2.0.1,2.0.1,2.0.1,2.0.1,N/A,N/A,N/A,N/A,N/A,N/A
:doc:`ROCm Systems Profiler <rocprofiler-systems:index>`,1.2.1,1.2.0,1.1.1,1.1.0,1.0.2,1.0.2,1.0.1,1.0.0,0.1.2,0.1.1,0.1.0,0.1.0,1.11.2,1.11.2,1.11.2,1.11.2,N/A,N/A,N/A,N/A,N/A,N/A
:doc:`ROCProfiler <rocprofiler:index>`,2.0.70101,2.0.70100,2.0.70002,2.0.70000,2.0.60403,2.0.60402,2.0.60401,2.0.60400,2.0.60303,2.0.60302,2.0.60301,2.0.60300,2.0.60204,2.0.60202,2.0.60201,2.0.60200,2.0.60105,2.0.60102,2.0.60101,2.0.60100,2.0.60002,2.0.60000
:doc:`ROCprofiler-SDK <rocprofiler-sdk:index>`,1.0.0,1.0.0,1.0.0,1.0.0,0.6.0,0.6.0,0.6.0,0.6.0,0.5.0,0.5.0,0.5.0,0.5.0,0.4.0,0.4.0,0.4.0,0.4.0,N/A,N/A,N/A,N/A,N/A,N/A
:doc:`ROCTracer <roctracer:index>`,4.1.70101,4.1.70100,4.1.70002,4.1.70000,4.1.60403,4.1.60402,4.1.60401,4.1.60400,4.1.60303,4.1.60302,4.1.60301,4.1.60300,4.1.60204,4.1.60202,4.1.60201,4.1.60200,4.1.60105,4.1.60102,4.1.60101,4.1.60100,4.1.60002,4.1.60000
,,,,,,,,,,,,,,,,,,,,,,
DEVELOPMENT TOOLS,,,,,,,,,,,,,,,,,,,,,,
:doc:`HIPIFY <hipify:index>`,20.0.0,20.0.0,20.0.0,20.0.0,19.0.0,19.0.0,19.0.0,19.0.0,18.0.0.25012,18.0.0.25012,18.0.0.24491,18.0.0.24455,18.0.0.24392,18.0.0.24355,18.0.0.24355,18.0.0.24232,17.0.0.24193,17.0.0.24193,17.0.0.24154,17.0.0.24103,17.0.0.24012,17.0.0.23483
:doc:`ROCm CMake <rocmcmakebuildtools:index>`,0.14.0,0.14.0,0.14.0,0.14.0,0.14.0,0.14.0,0.14.0,0.14.0,0.14.0,0.14.0,0.14.0,0.14.0,0.13.0,0.13.0,0.13.0,0.13.0,0.12.0,0.12.0,0.12.0,0.12.0,0.11.0,0.11.0
:doc:`ROCdbgapi <rocdbgapi:index>`,0.77.4,0.77.4,0.77.4,0.77.3,0.77.2,0.77.2,0.77.2,0.77.2,0.77.0,0.77.0,0.77.0,0.77.0,0.76.0,0.76.0,0.76.0,0.76.0,0.71.0,0.71.0,0.71.0,0.71.0,0.71.0,0.71.0
:doc:`ROCm Debugger (ROCgdb) <rocgdb:index>`,16.3.0,16.3.0,16.3.0,16.3.0,15.2.0,15.2.0,15.2.0,15.2.0,15.2.0,15.2.0,15.2.0,15.2.0,14.2.0,14.2.0,14.2.0,14.2.0,14.1.0,14.1.0,14.1.0,14.1.0,13.2.0,13.2.0
`rocprofiler-register <https://github.com/ROCm/rocprofiler-register>`_,0.5.0,0.5.0,0.5.0,0.5.0,0.4.0,0.4.0,0.4.0,0.4.0,0.4.0,0.4.0,0.4.0,0.4.0,0.4.0,0.4.0,0.4.0,0.4.0,0.3.0,0.3.0,0.3.0,0.3.0,N/A,N/A
:doc:`ROCr Debug Agent <rocr_debug_agent:index>`,2.1.0,2.1.0,2.1.0,2.1.0,2.0.4,2.0.4,2.0.4,2.0.4,2.0.3,2.0.3,2.0.3,2.0.3,2.0.3,2.0.3,2.0.3,2.0.3,2.0.3,2.0.3,2.0.3,2.0.3,2.0.3,2.0.3
,,,,,,,,,,,,,,,,,,,,,,
COMPILERS,.. _compilers-support-compatibility-matrix-past-60:,,,,,,,,,,,,,,,,,,,,,
`clang-ocl <https://github.com/ROCm/clang-ocl>`_,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,0.5.0,0.5.0,0.5.0,0.5.0,0.5.0,0.5.0
:doc:`hipCC <hipcc:index>`,1.1.1,1.1.1,1.1.1,1.1.1,1.1.1,1.1.1,1.1.1,1.1.1,1.1.1,1.1.1,1.1.1,1.1.1,1.1.1,1.1.1,1.1.1,1.1.1,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0
`Flang <https://github.com/ROCm/flang>`_,20.0.025444,20.0.025425,20.0.0.25385,20.0.0.25314,19.0.0.25224,19.0.0.25224,19.0.0.25184,19.0.0.25133,18.0.0.25012,18.0.0.25012,18.0.0.24491,18.0.0.24455,18.0.0.24392,18.0.0.24355,18.0.0.24355,18.0.0.24232,17.0.0.24193,17.0.0.24193,17.0.0.24154,17.0.0.24103,17.0.0.24012,17.0.0.23483
:doc:`llvm-project <llvm-project:index>`,20.0.025444,20.0.025425,20.0.0.25385,20.0.0.25314,19.0.0.25224,19.0.0.25224,19.0.0.25184,19.0.0.25133,18.0.0.25012,18.0.0.25012,18.0.0.24491,18.0.0.24491,18.0.0.24392,18.0.0.24355,18.0.0.24355,18.0.0.24232,17.0.0.24193,17.0.0.24193,17.0.0.24154,17.0.0.24103,17.0.0.24012,17.0.0.23483
`OpenMP <https://github.com/ROCm/llvm-project/tree/amd-staging/openmp>`_,20.0.025444,20.0.025425,20.0.0.25385,20.0.0.25314,19.0.0.25224,19.0.0.25224,19.0.0.25184,19.0.0.25133,18.0.0.25012,18.0.0.25012,18.0.0.24491,18.0.0.24491,18.0.0.24392,18.0.0.24355,18.0.0.24355,18.0.0.24232,17.0.0.24193,17.0.0.24193,17.0.0.24154,17.0.0.24103,17.0.0.24012,17.0.0.23483
,,,,,,,,,,,,,,,,,,,,,,
RUNTIMES,.. _runtime-support-compatibility-matrix-past-60:,,,,,,,,,,,,,,,,,,,,,
:doc:`AMD CLR <hip:understand/amd_clr>`,7.1.52802,7.1.25424,7.0.51831,7.0.51830,6.4.43484,6.4.43484,6.4.43483,6.4.43482,6.3.42134,6.3.42134,6.3.42133,6.3.42131,6.2.41134,6.2.41134,6.2.41134,6.2.41133,6.1.40093,6.1.40093,6.1.40092,6.1.40091,6.1.32831,6.1.32830
:doc:`HIP <hip:index>`,7.1.52802,7.1.25424,7.0.51831,7.0.51830,6.4.43484,6.4.43484,6.4.43483,6.4.43482,6.3.42134,6.3.42134,6.3.42133,6.3.42131,6.2.41134,6.2.41134,6.2.41134,6.2.41133,6.1.40093,6.1.40093,6.1.40092,6.1.40091,6.1.32831,6.1.32830
`OpenCL Runtime <https://github.com/ROCm/clr/tree/develop/opencl>`_,2.0.0,2.0.0,2.0.0,2.0.0,2.0.0,2.0.0,2.0.0,2.0.0,2.0.0,2.0.0,2.0.0,2.0.0,2.0.0,2.0.0,2.0.0,2.0.0,2.0.0,2.0.0,2.0.0,2.0.0,2.0.0,2.0.0
:doc:`ROCr Runtime <rocr-runtime:index>`,1.18.0,1.18.0,1.18.0,1.18.0,1.15.0,1.15.0,1.15.0,1.15.0,1.14.0,1.14.0,1.14.0,1.14.0,1.14.0,1.14.0,1.14.0,1.13.0,1.13.0,1.13.0,1.13.0,1.13.0,1.12.0,1.12.0
1 ROCm Version 7.1.1 7.1.0 7.0.2 7.0.0 7.0.1/7.0.0 6.4.3 6.4.2 6.4.1 6.4.0 6.3.3 6.3.2 6.3.1 6.3.0 6.2.4 6.2.2 6.2.1 6.2.0 6.1.5 6.1.2 6.1.1 6.1.0 6.0.2 6.0.0
2 :ref:`Operating systems & kernels <OS-kernel-versions>` :ref:`Operating systems & kernels <OS-kernel-versions>` [#os-compatibility-past-60]_ Ubuntu 24.04.3 Ubuntu 24.04.3 Ubuntu 24.04.3 Ubuntu 24.04.3 Ubuntu 24.04.2 Ubuntu 24.04.2 Ubuntu 24.04.2 Ubuntu 24.04.2 Ubuntu 24.04.2 Ubuntu 24.04.2 Ubuntu 24.04.2 Ubuntu 24.04.2 Ubuntu 24.04.1, 24.04 Ubuntu 24.04.1, 24.04 Ubuntu 24.04.1, 24.04 Ubuntu 24.04
3 Ubuntu 22.04.5 Ubuntu 22.04.5 Ubuntu 22.04.5 Ubuntu 22.04.5 Ubuntu 22.04.5 Ubuntu 22.04.5 Ubuntu 22.04.5 Ubuntu 22.04.5 Ubuntu 22.04.5 Ubuntu 22.04.5 Ubuntu 22.04.5 Ubuntu 22.04.5 Ubuntu 22.04.5, 22.04.4 Ubuntu 22.04.5, 22.04.4 Ubuntu 22.04.5, 22.04.4 Ubuntu 22.04.5, 22.04.4 Ubuntu 22.04.5, 22.04.4, 22.04.3 Ubuntu 22.04.4, 22.04.3 Ubuntu 22.04.4, 22.04.3 Ubuntu 22.04.4, 22.04.3 Ubuntu 22.04.4, 22.04.3, 22.04.2 Ubuntu 22.04.4, 22.04.3, 22.04.2
4 Ubuntu 20.04.6, 20.04.5 Ubuntu 20.04.6, 20.04.5 Ubuntu 20.04.6, 20.04.5 Ubuntu 20.04.6, 20.04.5 Ubuntu 20.04.6, 20.04.5 Ubuntu 20.04.6, 20.04.5
5 RHEL 10.1, 10.0, 9.7, 9.6, 9.4 RHEL 10.0, 9.6, 9.4 RHEL 10.0, 9.6, 9.4 RHEL 9.6, 9.4 RHEL 9.6, 9.4 RHEL 9.6, 9.4 RHEL 9.6, 9.5, 9.4 RHEL 9.5, 9.4 RHEL 9.5, 9.4 RHEL 9.5, 9.4 RHEL 9.5, 9.4 RHEL 9.5, 9.4 RHEL 9.4, 9.3 RHEL 9.4, 9.3 RHEL 9.4, 9.3 RHEL 9.4, 9.3 RHEL 9.4, 9.3, 9.2 RHEL 9.4, 9.3, 9.2 RHEL 9.4, 9.3, 9.2 RHEL 9.4, 9.3, 9.2 RHEL 9.3, 9.2 RHEL 9.3, 9.2
6 RHEL 8.10 RHEL 8.10 RHEL 8.10 RHEL 8.10 RHEL 8.10 RHEL 8.10 RHEL 8.10 RHEL 8.10 RHEL 8.10 RHEL 8.10 RHEL 8.10 RHEL 8.10 RHEL 8.10, 8.9 RHEL 8.10, 8.9 RHEL 8.10, 8.9 RHEL 8.10, 8.9 RHEL 8.9, 8.8 RHEL 8.9, 8.8 RHEL 8.9, 8.8 RHEL 8.9, 8.8 RHEL 8.9, 8.8 RHEL 8.9, 8.8
7 SLES 15 SP7 SLES 15 SP7 SLES 15 SP7 SLES 15 SP7 SLES 15 SP7, SP6 SLES 15 SP7, SP6 SLES 15 SP6 SLES 15 SP6 SLES 15 SP6, SP5 SLES 15 SP6, SP5 SLES 15 SP6, SP5 SLES 15 SP6, SP5 SLES 15 SP6, SP5 SLES 15 SP6, SP5 SLES 15 SP6, SP5 SLES 15 SP6, SP5 SLES 15 SP5, SP4 SLES 15 SP5, SP4 SLES 15 SP5, SP4 SLES 15 SP5, SP4 SLES 15 SP5, SP4 SLES 15 SP5, SP4
8 CentOS 7.9 CentOS 7.9 CentOS 7.9 CentOS 7.9 CentOS 7.9
9 Oracle Linux 10, 9, 8 Oracle Linux 10, 9, 8 Oracle Linux 10, 9, 8 Oracle Linux 9, 8 [#ol-700-mi300x-past-60]_ Oracle Linux 9, 8 Oracle Linux 9, 8 [#mi300x-past-60]_ Oracle Linux 9, 8 Oracle Linux 9, 8 [#mi300x-past-60]_ Oracle Linux 9, 8 Oracle Linux 9, 8 [#mi300x-past-60]_ Oracle Linux 9, 8 Oracle Linux 9, 8 [#mi300x-past-60]_ Oracle Linux 9, 8 Oracle Linux 8.10 [#mi300x-past-60]_ Oracle Linux 8.10 Oracle Linux 8.10 [#mi300x-past-60]_ Oracle Linux 8.10 Oracle Linux 8.10 [#mi300x-past-60]_ Oracle Linux 8.10 Oracle Linux 8.10 [#mi300x-past-60]_ Oracle Linux 8.10 Oracle Linux 8.9 [#mi300x-past-60]_ Oracle Linux 8.9 Oracle Linux 8.9 [#mi300x-past-60]_ Oracle Linux 8.9 Oracle Linux 8.9 [#mi300x-past-60]_ Oracle Linux 8.9 Oracle Linux 8.9 [#mi300x-past-60]_ Oracle Linux 8.9 Oracle Linux 8.9 [#mi300x-past-60]_ Oracle Linux 8.9 Oracle Linux 8.9 [#mi300x-past-60]_ Oracle Linux 8.9 Oracle Linux 8.9 [#mi300x-past-60]_ Oracle Linux 8.9
10 Debian 13, 12 Debian 13, 12 Debian 13, 12 Debian 12 Debian 12 [#single-node-past-60]_ Debian 12 Debian 12 [#single-node-past-60]_ Debian 12 Debian 12 [#single-node-past-60]_ Debian 12 Debian 12 [#single-node-past-60]_ Debian 12 Debian 12 [#single-node-past-60]_ Debian 12 Debian 12 [#single-node-past-60]_ Debian 12 Debian 12 [#single-node-past-60]_ Debian 12
11 Azure Linux 3.0 Azure Linux 3.0 [#az-mi300x-past-60]_ Azure Linux 3.0 Azure Linux 3.0 [#az-mi300x-past-60]_ Azure Linux 3.0 Azure Linux 3.0 [#az-mi300x-past-60]_ Azure Linux 3.0 Azure Linux 3.0 [#az-mi300x-past-60]_ Azure Linux 3.0 Azure Linux 3.0 [#az-mi300x-past-60]_ Azure Linux 3.0 Azure Linux 3.0 [#az-mi300x-630-past-60]_ Azure Linux 3.0 Azure Linux 3.0 [#az-mi300x-630-past-60]_ Azure Linux 3.0
12 Rocky Linux 9 Rocky Linux 9 Rocky Linux 9 Rocky Linux 9
13 .. _architecture-support-compatibility-matrix-past-60: .. _architecture-support-compatibility-matrix-past-60:
14 :doc:`Architecture <rocm-install-on-linux:reference/system-requirements>` CDNA4 CDNA4 CDNA4 CDNA4
15 CDNA3 CDNA3 CDNA3 CDNA3 CDNA3 CDNA3 CDNA3 CDNA3 CDNA3 CDNA3 CDNA3 CDNA3 CDNA3 CDNA3 CDNA3 CDNA3 CDNA3 CDNA3 CDNA3 CDNA3 CDNA3 CDNA3
16 CDNA2 CDNA2 CDNA2 CDNA2 CDNA2 CDNA2 CDNA2 CDNA2 CDNA2 CDNA2 CDNA2 CDNA2 CDNA2 CDNA2 CDNA2 CDNA2 CDNA2 CDNA2 CDNA2 CDNA2 CDNA2 CDNA2
17 CDNA CDNA CDNA CDNA CDNA CDNA CDNA CDNA CDNA CDNA CDNA CDNA CDNA CDNA CDNA CDNA CDNA CDNA CDNA CDNA CDNA CDNA
18 RDNA4 RDNA4 RDNA4 RDNA4 RDNA4 RDNA4 RDNA4
19 RDNA3 RDNA3 RDNA3 RDNA3 RDNA3 RDNA3 RDNA3 RDNA3 RDNA3 RDNA3 RDNA3 RDNA3 RDNA3 RDNA3 RDNA3 RDNA3 RDNA3 RDNA3 RDNA3 RDNA3 RDNA3 RDNA3
20 RDNA2 RDNA2 RDNA2 RDNA2 RDNA2 RDNA2 RDNA2 RDNA2 RDNA2 RDNA2 RDNA2 RDNA2 RDNA2 RDNA2 RDNA2 RDNA2 RDNA2 RDNA2 RDNA2 RDNA2 RDNA2 RDNA2
21 .. _gpu-support-compatibility-matrix-past-60: .. _gpu-support-compatibility-matrix-past-60:
22 :doc:`GPU / LLVM target <rocm-install-on-linux:reference/system-requirements>` :doc:`GPU / LLVM target <rocm-install-on-linux:reference/system-requirements>` [#gpu-compatibility-past-60]_ gfx950 gfx950 gfx950 gfx950
23 gfx1201 gfx1201 gfx1201 gfx1201 [#RDNA-OS-past-60]_ gfx1201 gfx1201 [#RDNA-OS-past-60]_ gfx1201 gfx1201 [#RDNA-OS-past-60]_ gfx1201 gfx1201 [#RDNA-OS-past-60]_ gfx1201
24 gfx1200 gfx1200 gfx1200 gfx1200 [#RDNA-OS-past-60]_ gfx1200 gfx1200 [#RDNA-OS-past-60]_ gfx1200 gfx1200 [#RDNA-OS-past-60]_ gfx1200 gfx1200 [#RDNA-OS-past-60]_ gfx1200
25 gfx1101 gfx1101 gfx1101 gfx1101 [#RDNA-OS-past-60]_ [#7700XT-OS-past-60]_ gfx1101 gfx1101 [#RDNA-OS-past-60]_ [#7700XT-OS-past-60]_ gfx1101 gfx1101 [#RDNA-OS-past-60]_ [#7700XT-OS-past-60]_ gfx1101 gfx1101 [#RDNA-OS-past-60]_ gfx1101
26 gfx1100 gfx1100 gfx1100 gfx1100 gfx1100 gfx1100 gfx1100 gfx1100 gfx1100 gfx1100 gfx1100 gfx1100 gfx1100 gfx1100 gfx1100 gfx1100 gfx1100 gfx1100 gfx1100 gfx1100 gfx1100 gfx1100
27 gfx1030 gfx1030 gfx1030 gfx1030 gfx1030 gfx1030 gfx1030 gfx1030 gfx1030 gfx1030 gfx1030 gfx1030 gfx1030 gfx1030 gfx1030 gfx1030 gfx1030 gfx1030 gfx1030 gfx1030 gfx1030 gfx1030
28 gfx942 gfx942 gfx942 gfx942 gfx942 gfx942 gfx942 gfx942 gfx942 gfx942 gfx942 gfx942 gfx942 [#mi300_624-past-60]_ gfx942 gfx942 [#mi300_622-past-60]_ gfx942 gfx942 [#mi300_621-past-60]_ gfx942 gfx942 [#mi300_620-past-60]_ gfx942 gfx942 [#mi300_612-past-60]_ gfx942 gfx942 [#mi300_612-past-60]_ gfx942 gfx942 [#mi300_611-past-60]_ gfx942 gfx942 [#mi300_610-past-60]_ gfx942 gfx942 [#mi300_602-past-60]_ gfx942 gfx942 [#mi300_600-past-60]_ gfx942
29 gfx90a gfx90a gfx90a gfx90a gfx90a gfx90a gfx90a gfx90a gfx90a gfx90a gfx90a gfx90a gfx90a gfx90a gfx90a gfx90a gfx90a gfx90a gfx90a gfx90a gfx90a gfx90a
30 gfx908 gfx908 gfx908 gfx908 gfx908 gfx908 gfx908 gfx908 gfx908 gfx908 gfx908 gfx908 gfx908 gfx908 gfx908 gfx908 gfx908 gfx908 gfx908 gfx908 gfx908 gfx908
31
32 FRAMEWORK SUPPORT .. _framework-support-compatibility-matrix-past-60: .. _framework-support-compatibility-matrix-past-60:
33 :doc:`PyTorch <../compatibility/ml-compatibility/pytorch-compatibility>` 2.9, 2.8, 2.7 2.8, 2.7, 2.6 2.8, 2.7, 2.6 2.7, 2.6, 2.5, 2.4, 2.3 2.7, 2.6, 2.5 2.6, 2.5, 2.4, 2.3 2.6, 2.5, 2.4, 2.3 2.6, 2.5, 2.4, 2.3 2.6, 2.5, 2.4, 2.3 2.4, 2.3, 2.2, 1.13 2.4, 2.3, 2.2, 1.13 2.4, 2.3, 2.2, 1.13 2.4, 2.3, 2.2, 2.1, 2.0, 1.13 2.3, 2.2, 2.1, 2.0, 1.13 2.3, 2.2, 2.1, 2.0, 1.13 2.3, 2.2, 2.1, 2.0, 1.13 2.3, 2.2, 2.1, 2.0, 1.13 2.1, 2.0, 1.13 2.1, 2.0, 1.13 2.1, 2.0, 1.13 2.1, 2.0, 1.13 2.1, 2.0, 1.13 2.1, 2.0, 1.13
34 :doc:`TensorFlow <../compatibility/ml-compatibility/tensorflow-compatibility>` 2.20.0, 2.19.1, 2.18.1 2.20.0, 2.19.1, 2.18.1 2.19.1, 2.18.1, 2.17.1 [#tf-mi350-past-60]_ 2.19.1, 2.18.1 2.19.1, 2.18.1, 2.17.1 [#tf-mi350-past-60]_ 2.18.1, 2.17.1, 2.16.2 2.18.1, 2.17.1, 2.16.2 2.18.1, 2.17.1, 2.16.2 2.18.1, 2.17.1, 2.16.2 2.17.0, 2.16.2, 2.15.1 2.17.0, 2.16.2, 2.15.1 2.17.0, 2.16.2, 2.15.1 2.17.0, 2.16.2, 2.15.1 2.16.1, 2.15.1, 2.14.1 2.16.1, 2.15.1, 2.14.1 2.16.1, 2.15.1, 2.14.1 2.16.1, 2.15.1, 2.14.1 2.15.0, 2.14.0, 2.13.1 2.15.0, 2.14.0, 2.13.1 2.15.0, 2.14.0, 2.13.1 2.15.0, 2.14.0, 2.13.1 2.14.0, 2.13.1, 2.12.1 2.14.0, 2.13.1, 2.12.1
35 :doc:`JAX <../compatibility/ml-compatibility/jax-compatibility>` 0.7.1 0.7.1 0.6.0 0.6.0 0.4.35 0.4.35 0.4.35 0.4.35 0.4.31 0.4.31 0.4.31 0.4.31 0.4.26 0.4.26 0.4.26 0.4.26 0.4.26 0.4.26 0.4.26 0.4.26 0.4.26 0.4.26
36 :doc:`verl <../compatibility/ml-compatibility/verl-compatibility>` [#verl_compat]_ :doc:`verl <../compatibility/ml-compatibility/verl-compatibility>` [#verl_compat-past-60]_ N/A N/A N/A N/A 0.6.0 N/A N/A N/A N/A N/A N/A N/A N/A N/A N/A N/A 0.3.0.post0 N/A N/A N/A N/A N/A N/A
37 :doc:`Stanford Megatron-LM <../compatibility/ml-compatibility/stanford-megatron-lm-compatibility>` :doc:`Stanford Megatron-LM <../compatibility/ml-compatibility/stanford-megatron-lm-compatibility>` [#stanford-megatron-lm_compat-past-60]_ N/A N/A N/A N/A N/A N/A N/A N/A 85f95ae N/A 85f95ae N/A 85f95ae N/A 85f95ae N/A N/A N/A N/A N/A N/A N/A N/A N/A N/A
38 :doc:`DGL <../compatibility/ml-compatibility/dgl-compatibility>` [#dgl_compat]_ :doc:`DGL <../compatibility/ml-compatibility/dgl-compatibility>` [#dgl_compat-past-60]_ N/A N/A N/A N/A 2.4.0 N/A 2.4.0 N/A N/A 2.4.0 N/A N/A N/A N/A N/A N/A N/A N/A N/A N/A N/A N/A N/A N/A
39 :doc:`Megablocks <../compatibility/ml-compatibility/megablocks-compatibility>` :doc:`Megablocks <../compatibility/ml-compatibility/megablocks-compatibility>` [#megablocks_compat-past-60]_ N/A N/A N/A N/A N/A N/A N/A N/A 0.7.0 N/A 0.7.0 N/A 0.7.0 N/A 0.7.0 N/A N/A N/A N/A N/A N/A N/A N/A N/A N/A
40 :doc:`Taichi <../compatibility/ml-compatibility/taichi-compatibility>` [#taichi_compat]_ :doc:`Ray <../compatibility/ml-compatibility/ray-compatibility>` [#ray_compat-past-60]_ N/A N/A N/A N/A 2.51.1 N/A N/A N/A 2.48.0.post0 N/A N/A 1.8.0b1 N/A N/A N/A N/A N/A N/A N/A N/A N/A N/A N/A N/A N/A
41 `ONNX Runtime <https://onnxruntime.ai/docs/build/eps.html#amd-migraphx>`_ :doc:`llama.cpp <../compatibility/ml-compatibility/llama-cpp-compatibility>` [#llama-cpp_compat-past-60]_ N/A N/A N/A 1.22.0 b6652 1.20.0 b6356 1.20.0 b6356 1.20.0 b6356 1.20.0 b5997 1.17.3 N/A 1.17.3 N/A 1.17.3 N/A 1.17.3 N/A 1.17.3 N/A 1.17.3 N/A 1.17.3 N/A 1.17.3 N/A 1.17.3 N/A 1.17.3 N/A 1.17.3 N/A 1.17.3 N/A 1.14.1 N/A 1.14.1 N/A
42 :doc:`FlashInfer <../compatibility/ml-compatibility/flashinfer-compatibility>` [#flashinfer_compat-past-60]_ N/A N/A N/A N/A N/A N/A v0.2.5 N/A N/A N/A N/A N/A N/A N/A N/A N/A N/A N/A N/A N/A N/A N/A
43 `ONNX Runtime <https://onnxruntime.ai/docs/build/eps.html#amd-migraphx>`_ 1.23.1 1.22.0 1.22.0 1.22.0 1.20.0 1.20.0 1.20.0 1.20.0 1.17.3 1.17.3 1.17.3 1.17.3 1.17.3 1.17.3 1.17.3 1.17.3 1.17.3 1.17.3 1.17.3 1.17.3 1.14.1 1.14.1
44 THIRD PARTY COMMS .. _thirdpartycomms-support-compatibility-matrix-past-60:
45 `UCC <https://github.com/ROCm/ucc>`_ >=1.4.0 >=1.3.0 >=1.3.0 >=1.3.0 >=1.3.0 >=1.3.0 >=1.3.0 >=1.3.0 >=1.3.0 >=1.3.0 >=1.3.0 >=1.3.0 >=1.3.0 >=1.3.0 >=1.3.0 >=1.3.0 >=1.3.0 >=1.2.0 >=1.2.0
46 `UCX <https://github.com/ROCm/ucx>`_ THIRD PARTY COMMS .. _thirdpartycomms-support-compatibility-matrix-past-60: >=1.17.0 >=1.15.0 >=1.15.0 >=1.15.0 >=1.15.0 >=1.15.0 >=1.15.0 >=1.15.0 >=1.15.0 >=1.15.0 >=1.15.0 >=1.15.0 >=1.15.0 >=1.14.1 >=1.14.1 >=1.14.1 >=1.14.1 >=1.14.1 >=1.14.1
47 `UCC <https://github.com/ROCm/ucc>`_ >=1.4.0 >=1.4.0 >=1.4.0 >=1.4.0 >=1.3.0 >=1.3.0 >=1.3.0 >=1.3.0 >=1.3.0 >=1.3.0 >=1.3.0 >=1.3.0 >=1.3.0 >=1.3.0 >=1.3.0 >=1.3.0 >=1.3.0 >=1.3.0 >=1.3.0 >=1.3.0 >=1.2.0 >=1.2.0
48 THIRD PARTY ALGORITHM `UCX <https://github.com/ROCm/ucx>`_ >=1.17.0 >=1.17.0 >=1.17.0 .. _thirdpartyalgorithm-support-compatibility-matrix-past-60: >=1.17.0 >=1.15.0 >=1.15.0 >=1.15.0 >=1.15.0 >=1.15.0 >=1.15.0 >=1.15.0 >=1.15.0 >=1.15.0 >=1.15.0 >=1.15.0 >=1.15.0 >=1.14.1 >=1.14.1 >=1.14.1 >=1.14.1 >=1.14.1 >=1.14.1
49 Thrust 2.6.0 2.5.0 2.5.0 2.5.0 2.5.0 2.3.2 2.3.2 2.3.2 2.3.2 2.2.0 2.2.0 2.2.0 2.2.0 2.1.0 2.1.0 2.1.0 2.1.0 2.0.1 2.0.1
50 CUB THIRD PARTY ALGORITHM .. _thirdpartyalgorithm-support-compatibility-matrix-past-60: 2.6.0 2.5.0 2.5.0 2.5.0 2.5.0 2.3.2 2.3.2 2.3.2 2.3.2 2.2.0 2.2.0 2.2.0 2.2.0 2.1.0 2.1.0 2.1.0 2.1.0 2.0.1 2.0.1
51 Thrust 2.8.5 2.8.5 2.6.0 2.6.0 2.5.0 2.5.0 2.5.0 2.5.0 2.3.2 2.3.2 2.3.2 2.3.2 2.2.0 2.2.0 2.2.0 2.2.0 2.1.0 2.1.0 2.1.0 2.1.0 2.0.1 2.0.1
52 KMD & USER SPACE [#kfd_support-past-60]_ CUB 2.8.5 2.8.5 2.6.0 .. _kfd-userspace-support-compatibility-matrix-past-60: 2.6.0 2.5.0 2.5.0 2.5.0 2.5.0 2.3.2 2.3.2 2.3.2 2.3.2 2.2.0 2.2.0 2.2.0 2.2.0 2.1.0 2.1.0 2.1.0 2.1.0 2.0.1 2.0.1
53 :doc:`KMD versions <rocm-install-on-linux:reference/user-kernel-space-compat-matrix>` 30.10, 6.4.x, 6.3.x, 6.2.x 6.4.x, 6.3.x, 6.2.x, 6.1.x 6.4.x, 6.3.x, 6.2.x, 6.1.x 6.4.x, 6.3.x, 6.2.x, 6.1.x 6.4.x, 6.3.x, 6.2.x, 6.1.x 6.4.x, 6.3.x, 6.2.x, 6.1.x 6.4.x, 6.3.x, 6.2.x, 6.1.x 6.4.x, 6.3.x, 6.2.x, 6.1.x 6.4.x, 6.3.x, 6.2.x, 6.1.x 6.4.x, 6.3.x, 6.2.x, 6.1.x, 6.0.x 6.4.x, 6.3.x, 6.2.x, 6.1.x, 6.0.x 6.4.x, 6.3.x, 6.2.x, 6.1.x, 6.0.x 6.4.x, 6.3.x, 6.2.x, 6.1.x, 6.0.x 6.4.x, 6.3.x, 6.2.x, 6.1.x, 6.0.x, 5.7.x 6.4.x, 6.3.x, 6.2.x, 6.1.x, 6.0.x, 5.7.x 6.4.x, 6.3.x, 6.2.x, 6.1.x, 6.0.x, 5.7.x 6.4.x, 6.3.x, 6.2.x, 6.1.x, 6.0.x, 5.7.x 6.2.x, 6.1.x, 6.0.x, 5.7.x, 5.6.x 6.2.x, 6.1.x, 6.0.x, 5.7.x, 5.6.x
54 DRIVER & USER SPACE [#kfd_support-past-60]_ .. _kfd-userspace-support-compatibility-matrix-past-60:
55 ML & COMPUTER VISION :doc:`AMD GPU Driver <rocm-install-on-linux:reference/user-kernel-space-compat-matrix>` 30.20.1, 30.20.0 [#mi325x_KVM-past-60]_, 30.10.2, 30.10.1 [#driver_patch-past-60]_, 30.10, 6.4.x 30.20.0 [#mi325x_KVM-past-60]_, 30.10.2, 30.10.1 [#driver_patch-past-60]_, 30.10, 6.4.x 30.10.2, 30.10.1 [#driver_patch-past-60]_, 30.10, 6.4.x, 6.3.x .. _mllibs-support-compatibility-matrix-past-60: 30.10.1 [#driver_patch-past-60]_, 30.10, 6.4.x, 6.3.x, 6.2.x 6.4.x, 6.3.x, 6.2.x, 6.1.x 6.4.x, 6.3.x, 6.2.x, 6.1.x 6.4.x, 6.3.x, 6.2.x, 6.1.x 6.4.x, 6.3.x, 6.2.x, 6.1.x 6.4.x, 6.3.x, 6.2.x, 6.1.x 6.4.x, 6.3.x, 6.2.x, 6.1.x 6.4.x, 6.3.x, 6.2.x, 6.1.x 6.4.x, 6.3.x, 6.2.x, 6.1.x 6.4.x, 6.3.x, 6.2.x, 6.1.x, 6.0.x 6.4.x, 6.3.x, 6.2.x, 6.1.x, 6.0.x 6.4.x, 6.3.x, 6.2.x, 6.1.x, 6.0.x 6.4.x, 6.3.x, 6.2.x, 6.1.x, 6.0.x 6.4.x, 6.3.x, 6.2.x, 6.1.x, 6.0.x, 5.7.x 6.4.x, 6.3.x, 6.2.x, 6.1.x, 6.0.x, 5.7.x 6.4.x, 6.3.x, 6.2.x, 6.1.x, 6.0.x, 5.7.x 6.4.x, 6.3.x, 6.2.x, 6.1.x, 6.0.x, 5.7.x 6.2.x, 6.1.x, 6.0.x, 5.7.x, 5.6.x 6.2.x, 6.1.x, 6.0.x, 5.7.x, 5.6.x
56 :doc:`Composable Kernel <composable_kernel:index>` 1.1.0 1.1.0 1.1.0 1.1.0 1.1.0 1.1.0 1.1.0 1.1.0 1.1.0 1.1.0 1.1.0 1.1.0 1.1.0 1.1.0 1.1.0 1.1.0 1.1.0 1.1.0 1.1.0
57 :doc:`MIGraphX <amdmigraphx:index>` ML & COMPUTER VISION .. _mllibs-support-compatibility-matrix-past-60: 2.13.0 2.12.0 2.12.0 2.12.0 2.12.0 2.11.0 2.11.0 2.11.0 2.11.0 2.10.0 2.10.0 2.10.0 2.10.0 2.9.0 2.9.0 2.9.0 2.9.0 2.8.0 2.8.0
58 :doc:`MIOpen <miopen:index>` :doc:`Composable Kernel <composable_kernel:index>` 1.1.0 1.1.0 1.1.0 3.5.0 1.1.0 3.4.0 1.1.0 3.4.0 1.1.0 3.4.0 1.1.0 3.4.0 1.1.0 3.3.0 1.1.0 3.3.0 1.1.0 3.3.0 1.1.0 3.3.0 1.1.0 3.2.0 1.1.0 3.2.0 1.1.0 3.2.0 1.1.0 3.2.0 1.1.0 3.1.0 1.1.0 3.1.0 1.1.0 3.1.0 1.1.0 3.1.0 1.1.0 3.0.0 1.1.0 3.0.0 1.1.0
59 :doc:`MIVisionX <mivisionx:index>` :doc:`MIGraphX <amdmigraphx:index>` 2.14.0 2.14.0 2.13.0 3.3.0 2.13.0 3.2.0 2.12.0 3.2.0 2.12.0 3.2.0 2.12.0 3.2.0 2.12.0 3.1.0 2.11.0 3.1.0 2.11.0 3.1.0 2.11.0 3.1.0 2.11.0 3.0.0 2.10.0 3.0.0 2.10.0 3.0.0 2.10.0 3.0.0 2.10.0 2.5.0 2.9.0 2.5.0 2.9.0 2.5.0 2.9.0 2.5.0 2.9.0 2.5.0 2.8.0 2.5.0 2.8.0
60 :doc:`rocAL <rocal:index>` :doc:`MIOpen <miopen:index>` 3.5.1 3.5.1 3.5.0 2.3.0 3.5.0 2.2.0 3.4.0 2.2.0 3.4.0 2.2.0 3.4.0 2.2.0 3.4.0 2.1.0 3.3.0 2.1.0 3.3.0 2.1.0 3.3.0 2.1.0 3.3.0 2.0.0 3.2.0 2.0.0 3.2.0 2.0.0 3.2.0 1.0.0 3.2.0 1.0.0 3.1.0 1.0.0 3.1.0 1.0.0 3.1.0 1.0.0 3.1.0 1.0.0 3.0.0 1.0.0 3.0.0
61 :doc:`rocDecode <rocdecode:index>` :doc:`MIVisionX <mivisionx:index>` 3.4.0 3.4.0 3.3.0 1.0.0 3.3.0 0.10.0 3.2.0 0.10.0 3.2.0 0.10.0 3.2.0 0.10.0 3.2.0 0.8.0 3.1.0 0.8.0 3.1.0 0.8.0 3.1.0 0.8.0 3.1.0 0.6.0 3.0.0 0.6.0 3.0.0 0.6.0 3.0.0 0.6.0 3.0.0 0.6.0 2.5.0 0.6.0 2.5.0 0.5.0 2.5.0 0.5.0 2.5.0 N/A 2.5.0 N/A 2.5.0
62 :doc:`rocJPEG <rocjpeg:index>` :doc:`rocAL <rocal:index>` 2.4.0 2.4.0 2.3.0 1.1.0 2.3.0 0.8.0 2.2.0 0.8.0 2.2.0 0.8.0 2.2.0 0.8.0 2.2.0 0.6.0 2.1.0 0.6.0 2.1.0 0.6.0 2.1.0 0.6.0 2.1.0 N/A 2.0.0 N/A 2.0.0 N/A 2.0.0 N/A 1.0.0 N/A 1.0.0 N/A 1.0.0 N/A 1.0.0 N/A 1.0.0 N/A 1.0.0 N/A 1.0.0
63 :doc:`rocPyDecode <rocpydecode:index>` :doc:`rocDecode <rocdecode:index>` 1.4.0 1.4.0 1.0.0 0.6.0 1.0.0 0.3.1 0.10.0 0.3.1 0.10.0 0.3.1 0.10.0 0.3.1 0.10.0 0.2.0 0.8.0 0.2.0 0.8.0 0.2.0 0.8.0 0.2.0 0.8.0 0.1.0 0.6.0 0.1.0 0.6.0 0.1.0 0.6.0 0.1.0 0.6.0 N/A 0.6.0 N/A 0.6.0 N/A 0.5.0 N/A 0.5.0 N/A N/A
64 :doc:`RPP <rpp:index>` :doc:`rocJPEG <rocjpeg:index>` 1.2.0 1.2.0 1.1.0 2.0.0 1.1.0 1.9.10 0.8.0 1.9.10 0.8.0 1.9.10 0.8.0 1.9.10 0.8.0 1.9.1 0.6.0 1.9.1 0.6.0 1.9.1 0.6.0 1.9.1 0.6.0 1.8.0 N/A 1.8.0 N/A 1.8.0 N/A 1.8.0 N/A 1.5.0 N/A 1.5.0 N/A 1.5.0 N/A 1.5.0 N/A 1.4.0 N/A 1.4.0 N/A
65 :doc:`rocPyDecode <rocpydecode:index>` 0.7.0 0.7.0 0.6.0 0.6.0 0.3.1 0.3.1 0.3.1 0.3.1 0.2.0 0.2.0 0.2.0 0.2.0 0.1.0 0.1.0 0.1.0 0.1.0 N/A N/A N/A N/A N/A N/A
66 COMMUNICATION :doc:`RPP <rpp:index>` 2.1.0 2.1.0 2.0.0 .. _commlibs-support-compatibility-matrix-past-60: 2.0.0 1.9.10 1.9.10 1.9.10 1.9.10 1.9.1 1.9.1 1.9.1 1.9.1 1.8.0 1.8.0 1.8.0 1.8.0 1.5.0 1.5.0 1.5.0 1.5.0 1.4.0 1.4.0
67 :doc:`RCCL <rccl:index>` 2.26.6 2.22.3 2.22.3 2.22.3 2.22.3 2.21.5 2.21.5 2.21.5 2.21.5 2.20.5 2.20.5 2.20.5 2.20.5 2.18.6 2.18.6 2.18.6 2.18.6 2.18.3 2.18.3
68 :doc:`rocSHMEM <rocshmem:index>` COMMUNICATION .. _commlibs-support-compatibility-matrix-past-60: 3.0.0 2.0.1 2.0.1 2.0.0 2.0.0 N/A N/A N/A N/A N/A N/A N/A N/A N/A N/A N/A N/A N/A N/A
69 :doc:`RCCL <rccl:index>` 2.27.7 2.27.7 2.26.6 2.26.6 2.22.3 2.22.3 2.22.3 2.22.3 2.21.5 2.21.5 2.21.5 2.21.5 2.20.5 2.20.5 2.20.5 2.20.5 2.18.6 2.18.6 2.18.6 2.18.6 2.18.3 2.18.3
70 MATH LIBS :doc:`rocSHMEM <rocshmem:index>` 3.1.0 3.0.0 3.0.0 .. _mathlibs-support-compatibility-matrix-past-60: 3.0.0 2.0.1 2.0.1 2.0.0 2.0.0 N/A N/A N/A N/A N/A N/A N/A N/A N/A N/A N/A N/A N/A N/A
71 `half <https://github.com/ROCm/half>`_ 1.12.0 1.12.0 1.12.0 1.12.0 1.12.0 1.12.0 1.12.0 1.12.0 1.12.0 1.12.0 1.12.0 1.12.0 1.12.0 1.12.0 1.12.0 1.12.0 1.12.0 1.12.0 1.12.0
72 :doc:`hipBLAS <hipblas:index>` MATH LIBS .. _mathlibs-support-compatibility-matrix-past-60: 3.0.0 2.4.0 2.4.0 2.4.0 2.4.0 2.3.0 2.3.0 2.3.0 2.3.0 2.2.0 2.2.0 2.2.0 2.2.0 2.1.0 2.1.0 2.1.0 2.1.0 2.0.0 2.0.0
73 :doc:`hipBLASLt <hipblaslt:index>` `half <https://github.com/ROCm/half>`_ 1.12.0 1.12.0 1.12.0 1.0.0 1.12.0 0.12.1 1.12.0 0.12.1 1.12.0 0.12.1 1.12.0 0.12.0 1.12.0 0.10.0 1.12.0 0.10.0 1.12.0 0.10.0 1.12.0 0.10.0 1.12.0 0.8.0 1.12.0 0.8.0 1.12.0 0.8.0 1.12.0 0.8.0 1.12.0 0.7.0 1.12.0 0.7.0 1.12.0 0.7.0 1.12.0 0.7.0 1.12.0 0.6.0 1.12.0 0.6.0 1.12.0
74 :doc:`hipFFT <hipfft:index>` :doc:`hipBLAS <hipblas:index>` 3.1.0 3.1.0 3.0.2 1.0.20 3.0.0 1.0.18 2.4.0 1.0.18 2.4.0 1.0.18 2.4.0 1.0.18 2.4.0 1.0.17 2.3.0 1.0.17 2.3.0 1.0.17 2.3.0 1.0.17 2.3.0 1.0.16 2.2.0 1.0.15 2.2.0 1.0.15 2.2.0 1.0.14 2.2.0 1.0.14 2.1.0 1.0.14 2.1.0 1.0.14 2.1.0 1.0.14 2.1.0 1.0.13 2.0.0 1.0.13 2.0.0
75 :doc:`hipfort <hipfort:index>` :doc:`hipBLASLt <hipblaslt:index>` 1.1.0 1.1.0 1.0.0 0.7.0 1.0.0 0.6.0 0.12.1 0.6.0 0.12.1 0.6.0 0.12.1 0.6.0 0.12.0 0.5.1 0.10.0 0.5.1 0.10.0 0.5.0 0.10.0 0.5.0 0.10.0 0.4.0 0.8.0 0.4.0 0.8.0 0.4.0 0.8.0 0.4.0 0.8.0 0.4.0 0.7.0 0.4.0 0.7.0 0.4.0 0.7.0 0.4.0 0.7.0 0.4.0 0.6.0 0.4.0 0.6.0
76 :doc:`hipRAND <hiprand:index>` :doc:`hipFFT <hipfft:index>` 1.0.21 1.0.21 1.0.20 3.0.0 1.0.20 2.12.0 1.0.18 2.12.0 1.0.18 2.12.0 1.0.18 2.12.0 1.0.18 2.11.1 1.0.17 2.11.1 1.0.17 2.11.1 1.0.17 2.11.0 1.0.17 2.11.1 1.0.16 2.11.0 1.0.15 2.11.0 1.0.15 2.11.0 1.0.14 2.10.16 1.0.14 2.10.16 1.0.14 2.10.16 1.0.14 2.10.16 1.0.14 2.10.16 1.0.13 2.10.16 1.0.13
77 :doc:`hipSOLVER <hipsolver:index>` :doc:`hipfort <hipfort:index>` 0.7.1 0.7.1 0.7.0 3.0.0 0.7.0 2.4.0 0.6.0 2.4.0 0.6.0 2.4.0 0.6.0 2.4.0 0.6.0 2.3.0 0.5.1 2.3.0 0.5.1 2.3.0 0.5.0 2.3.0 0.5.0 2.2.0 0.4.0 2.2.0 0.4.0 2.2.0 0.4.0 2.2.0 0.4.0 2.1.1 0.4.0 2.1.1 0.4.0 2.1.1 0.4.0 2.1.0 0.4.0 2.0.0 0.4.0 2.0.0 0.4.0
78 :doc:`hipSPARSE <hipsparse:index>` :doc:`hipRAND <hiprand:index>` 3.1.0 3.1.0 3.0.0 4.0.1 3.0.0 3.2.0 2.12.0 3.2.0 2.12.0 3.2.0 2.12.0 3.2.0 2.12.0 3.1.2 2.11.1 3.1.2 2.11.1 3.1.2 2.11.1 3.1.2 2.11.0 3.1.1 2.11.1 3.1.1 2.11.0 3.1.1 2.11.0 3.1.1 2.11.0 3.0.1 2.10.16 3.0.1 2.10.16 3.0.1 2.10.16 3.0.1 2.10.16 3.0.0 2.10.16 3.0.0 2.10.16
79 :doc:`hipSPARSELt <hipsparselt:index>` :doc:`hipSOLVER <hipsolver:index>` 3.1.0 3.1.0 3.0.0 0.2.4 3.0.0 0.2.3 2.4.0 0.2.3 2.4.0 0.2.3 2.4.0 0.2.3 2.4.0 0.2.2 2.3.0 0.2.2 2.3.0 0.2.2 2.3.0 0.2.2 2.3.0 0.2.1 2.2.0 0.2.1 2.2.0 0.2.1 2.2.0 0.2.1 2.2.0 0.2.0 2.1.1 0.2.0 2.1.1 0.1.0 2.1.1 0.1.0 2.1.0 0.1.0 2.0.0 0.1.0 2.0.0
80 :doc:`rocALUTION <rocalution:index>` :doc:`hipSPARSE <hipsparse:index>` 4.1.0 4.1.0 4.0.1 4.0.0 4.0.1 3.2.3 3.2.0 3.2.3 3.2.0 3.2.3 3.2.0 3.2.2 3.2.0 3.2.1 3.1.2 3.2.1 3.1.2 3.2.1 3.1.2 3.2.1 3.1.2 3.2.1 3.1.1 3.2.0 3.1.1 3.2.0 3.1.1 3.2.0 3.1.1 3.1.1 3.0.1 3.1.1 3.0.1 3.1.1 3.0.1 3.1.1 3.0.1 3.0.3 3.0.0 3.0.3 3.0.0
81 :doc:`rocBLAS <rocblas:index>` :doc:`hipSPARSELt <hipsparselt:index>` 0.2.5 0.2.5 0.2.4 5.0.0 0.2.4 4.4.1 0.2.3 4.4.1 0.2.3 4.4.0 0.2.3 4.4.0 0.2.3 4.3.0 0.2.2 4.3.0 0.2.2 4.3.0 0.2.2 4.3.0 0.2.2 4.2.4 0.2.1 4.2.1 0.2.1 4.2.1 0.2.1 4.2.0 0.2.1 4.1.2 0.2.0 4.1.2 0.2.0 4.1.0 0.1.0 4.1.0 0.1.0 4.0.0 0.1.0 4.0.0 0.1.0
82 :doc:`rocFFT <rocfft:index>` :doc:`rocALUTION <rocalution:index>` 4.0.1 4.0.1 4.0.0 1.0.34 4.0.0 1.0.32 3.2.3 1.0.32 3.2.3 1.0.32 3.2.3 1.0.32 3.2.2 1.0.31 3.2.1 1.0.31 3.2.1 1.0.31 3.2.1 1.0.31 3.2.1 1.0.30 3.2.1 1.0.29 3.2.0 1.0.29 3.2.0 1.0.28 3.2.0 1.0.27 3.1.1 1.0.27 3.1.1 1.0.27 3.1.1 1.0.26 3.1.1 1.0.25 3.0.3 1.0.23 3.0.3
83 :doc:`rocRAND <rocrand:index>` :doc:`rocBLAS <rocblas:index>` 5.1.1 5.1.0 5.0.2 4.0.0 5.0.0 3.3.0 4.4.1 3.3.0 4.4.1 3.3.0 4.4.0 3.3.0 4.4.0 3.2.0 4.3.0 3.2.0 4.3.0 3.2.0 4.3.0 3.2.0 4.3.0 3.1.1 4.2.4 3.1.0 4.2.1 3.1.0 4.2.1 3.1.0 4.2.0 3.0.1 4.1.2 3.0.1 4.1.2 3.0.1 4.1.0 3.0.1 4.1.0 3.0.0 4.0.0 2.10.17 4.0.0
84 :doc:`rocSOLVER <rocsolver:index>` :doc:`rocFFT <rocfft:index>` 1.0.35 1.0.35 1.0.34 3.30.0 1.0.34 3.28.2 1.0.32 3.28.2 1.0.32 3.28.0 1.0.32 3.28.0 1.0.32 3.27.0 1.0.31 3.27.0 1.0.31 3.27.0 1.0.31 3.27.0 1.0.31 3.26.2 1.0.30 3.26.0 1.0.29 3.26.0 1.0.29 3.26.0 1.0.28 3.25.0 1.0.27 3.25.0 1.0.27 3.25.0 1.0.27 3.25.0 1.0.26 3.24.0 1.0.25 3.24.0 1.0.23
85 :doc:`rocSPARSE <rocsparse:index>` :doc:`rocRAND <rocrand:index>` 4.1.0 4.1.0 4.0.0 4.0.2 4.0.0 3.4.0 3.3.0 3.4.0 3.3.0 3.4.0 3.3.0 3.4.0 3.3.0 3.3.0 3.2.0 3.3.0 3.2.0 3.3.0 3.2.0 3.3.0 3.2.0 3.2.1 3.1.1 3.2.0 3.1.0 3.2.0 3.1.0 3.2.0 3.1.0 3.1.2 3.0.1 3.1.2 3.0.1 3.1.2 3.0.1 3.1.2 3.0.1 3.0.2 3.0.0 3.0.2 2.10.17
86 :doc:`rocWMMA <rocwmma:index>` :doc:`rocSOLVER <rocsolver:index>` 3.31.0 3.31.0 3.30.1 2.0.0 3.30.0 1.7.0 3.28.2 1.7.0 3.28.2 1.7.0 3.28.0 1.7.0 3.28.0 1.6.0 3.27.0 1.6.0 3.27.0 1.6.0 3.27.0 1.6.0 3.27.0 1.5.0 3.26.2 1.5.0 3.26.0 1.5.0 3.26.0 1.5.0 3.26.0 1.4.0 3.25.0 1.4.0 3.25.0 1.4.0 3.25.0 1.4.0 3.25.0 1.3.0 3.24.0 1.3.0 3.24.0
87 :doc:`Tensile <tensile:src/index>` :doc:`rocSPARSE <rocsparse:index>` 4.1.0 4.1.0 4.0.2 4.44.0 4.0.2 4.43.0 3.4.0 4.43.0 3.4.0 4.43.0 3.4.0 4.43.0 3.4.0 4.42.0 3.3.0 4.42.0 3.3.0 4.42.0 3.3.0 4.42.0 3.3.0 4.41.0 3.2.1 4.41.0 3.2.0 4.41.0 3.2.0 4.41.0 3.2.0 4.40.0 3.1.2 4.40.0 3.1.2 4.40.0 3.1.2 4.40.0 3.1.2 4.39.0 3.0.2 4.39.0 3.0.2
88 :doc:`rocWMMA <rocwmma:index>` 2.1.0 2.0.0 2.0.0 2.0.0 1.7.0 1.7.0 1.7.0 1.7.0 1.6.0 1.6.0 1.6.0 1.6.0 1.5.0 1.5.0 1.5.0 1.5.0 1.4.0 1.4.0 1.4.0 1.4.0 1.3.0 1.3.0
89 PRIMITIVES :doc:`Tensile <tensile:src/index>` 4.44.0 4.44.0 4.44.0 .. _primitivelibs-support-compatibility-matrix-past-60: 4.44.0 4.43.0 4.43.0 4.43.0 4.43.0 4.42.0 4.42.0 4.42.0 4.42.0 4.41.0 4.41.0 4.41.0 4.41.0 4.40.0 4.40.0 4.40.0 4.40.0 4.39.0 4.39.0
90 :doc:`hipCUB <hipcub:index>` 4.0.0 3.4.0 3.4.0 3.4.0 3.4.0 3.3.0 3.3.0 3.3.0 3.3.0 3.2.1 3.2.0 3.2.0 3.2.0 3.1.0 3.1.0 3.1.0 3.1.0 3.0.0 3.0.0
91 :doc:`hipTensor <hiptensor:index>` PRIMITIVES .. _primitivelibs-support-compatibility-matrix-past-60: 2.0.0 1.5.0 1.5.0 1.5.0 1.5.0 1.4.0 1.4.0 1.4.0 1.4.0 1.3.0 1.3.0 1.3.0 1.3.0 1.2.0 1.2.0 1.2.0 1.2.0 1.1.0 1.1.0
92 :doc:`rocPRIM <rocprim:index>` :doc:`hipCUB <hipcub:index>` 4.1.0 4.1.0 4.0.0 4.0.0 3.4.1 3.4.0 3.4.1 3.4.0 3.4.0 3.4.0 3.3.0 3.3.0 3.3.0 3.3.0 3.2.2 3.2.1 3.2.0 3.2.0 3.2.0 3.1.0 3.1.0 3.1.0 3.1.0 3.0.0 3.0.0
93 :doc:`rocThrust <rocthrust:index>` :doc:`hipTensor <hiptensor:index>` 2.0.0 2.0.0 2.0.0 4.0.0 2.0.0 3.3.0 1.5.0 3.3.0 1.5.0 3.3.0 1.5.0 3.3.0 1.5.0 3.3.0 1.4.0 3.3.0 1.4.0 3.3.0 1.4.0 3.3.0 1.4.0 3.1.1 1.3.0 3.1.0 1.3.0 3.1.0 1.3.0 3.0.1 1.3.0 3.0.1 1.2.0 3.0.1 1.2.0 3.0.1 1.2.0 3.0.1 1.2.0 3.0.0 1.1.0 3.0.0 1.1.0
94 :doc:`rocPRIM <rocprim:index>` 4.1.0 4.1.0 4.0.1 4.0.0 3.4.1 3.4.1 3.4.0 3.4.0 3.3.0 3.3.0 3.3.0 3.3.0 3.2.2 3.2.0 3.2.0 3.2.0 3.1.0 3.1.0 3.1.0 3.1.0 3.0.0 3.0.0
95 SUPPORT LIBS :doc:`rocThrust <rocthrust:index>` 4.1.0 4.1.0 4.0.0 4.0.0 3.3.0 3.3.0 3.3.0 3.3.0 3.3.0 3.3.0 3.3.0 3.3.0 3.1.1 3.1.0 3.1.0 3.0.1 3.0.1 3.0.1 3.0.1 3.0.1 3.0.0 3.0.0
96 `hipother <https://github.com/ROCm/hipother>`_ 7.0.51830 6.4.43483 6.4.43483 6.4.43483 6.4.43482 6.3.42134 6.3.42134 6.3.42133 6.3.42131 6.2.41134 6.2.41134 6.2.41134 6.2.41133 6.1.40093 6.1.40093 6.1.40092 6.1.40091 6.1.32831 6.1.32830
97 `rocm-core <https://github.com/ROCm/rocm-core>`_ SUPPORT LIBS 7.0.0 6.4.3 6.4.2 6.4.1 6.4.0 6.3.3 6.3.2 6.3.1 6.3.0 6.2.4 6.2.2 6.2.1 6.2.0 6.1.5 6.1.2 6.1.1 6.1.0 6.0.2 6.0.0
98 `ROCT-Thunk-Interface <https://github.com/ROCm/ROCT-Thunk-Interface>`_ `hipother <https://github.com/ROCm/hipother>`_ 7.1.52802 7.1.25424 7.0.51831 N/A [#ROCT-rocr-past-60]_ 7.0.51830 N/A [#ROCT-rocr-past-60]_ 6.4.43483 N/A [#ROCT-rocr-past-60]_ 6.4.43483 N/A [#ROCT-rocr-past-60]_ 6.4.43483 N/A [#ROCT-rocr-past-60]_ 6.4.43482 N/A [#ROCT-rocr-past-60]_ 6.3.42134 N/A [#ROCT-rocr-past-60]_ 6.3.42134 N/A [#ROCT-rocr-past-60]_ 6.3.42133 N/A [#ROCT-rocr-past-60]_ 6.3.42131 20240607.5.7 6.2.41134 20240607.5.7 6.2.41134 20240607.4.05 6.2.41134 20240607.1.4246 6.2.41133 20240125.5.08 6.1.40093 20240125.5.08 6.1.40093 20240125.5.08 6.1.40092 20240125.3.30 6.1.40091 20231016.2.245 6.1.32831 20231016.2.245 6.1.32830
99 `rocm-core <https://github.com/ROCm/rocm-core>`_ 7.1.1 7.1.0 7.0.2 7.0.1/7.0.0 6.4.3 6.4.2 6.4.1 6.4.0 6.3.3 6.3.2 6.3.1 6.3.0 6.2.4 6.2.2 6.2.1 6.2.0 6.1.5 6.1.2 6.1.1 6.1.0 6.0.2 6.0.0
100 SYSTEM MGMT TOOLS `ROCT-Thunk-Interface <https://github.com/ROCm/ROCT-Thunk-Interface>`_ N/A [#ROCT-rocr-past-60]_ N/A [#ROCT-rocr-past-60]_ N/A [#ROCT-rocr-past-60]_ .. _tools-support-compatibility-matrix-past-60: N/A [#ROCT-rocr-past-60]_ N/A [#ROCT-rocr-past-60]_ N/A [#ROCT-rocr-past-60]_ N/A [#ROCT-rocr-past-60]_ N/A [#ROCT-rocr-past-60]_ N/A [#ROCT-rocr-past-60]_ N/A [#ROCT-rocr-past-60]_ N/A [#ROCT-rocr-past-60]_ N/A [#ROCT-rocr-past-60]_ 20240607.5.7 20240607.5.7 20240607.4.05 20240607.1.4246 20240125.5.08 20240125.5.08 20240125.5.08 20240125.3.30 20231016.2.245 20231016.2.245
101 :doc:`AMD SMI <amdsmi:index>` 26.0.0 25.5.1 25.5.1 25.4.2 25.3.0 24.7.1 24.7.1 24.7.1 24.7.1 24.6.3 24.6.3 24.6.3 24.6.2 24.5.1 24.5.1 24.5.1 24.4.1 23.4.2 23.4.2
102 :doc:`ROCm Data Center Tool <rdc:index>` SYSTEM MGMT TOOLS .. _tools-support-compatibility-matrix-past-60: 1.1.0 0.3.0 0.3.0 0.3.0 0.3.0 0.3.0 0.3.0 0.3.0 0.3.0 0.3.0 0.3.0 0.3.0 0.3.0 0.3.0 0.3.0 0.3.0 0.3.0 0.3.0 0.3.0
103 :doc:`rocminfo <rocminfo:index>` :doc:`AMD SMI <amdsmi:index>` 26.2.0 26.1.0 26.0.2 1.0.0 26.0.0 1.0.0 25.5.1 1.0.0 25.5.1 1.0.0 25.4.2 1.0.0 25.3.0 1.0.0 24.7.1 1.0.0 24.7.1 1.0.0 24.7.1 1.0.0 24.7.1 1.0.0 24.6.3 1.0.0 24.6.3 1.0.0 24.6.3 1.0.0 24.6.2 1.0.0 24.5.1 1.0.0 24.5.1 1.0.0 24.5.1 1.0.0 24.4.1 1.0.0 23.4.2 1.0.0 23.4.2
104 :doc:`ROCm SMI <rocm_smi_lib:index>` :doc:`ROCm Data Center Tool <rdc:index>` 1.2.0 1.2.0 1.1.0 7.8.0 1.1.0 7.7.0 0.3.0 7.5.0 0.3.0 7.5.0 0.3.0 7.5.0 0.3.0 7.4.0 0.3.0 7.4.0 0.3.0 7.4.0 0.3.0 7.4.0 0.3.0 7.3.0 0.3.0 7.3.0 0.3.0 7.3.0 0.3.0 7.3.0 0.3.0 7.2.0 0.3.0 7.2.0 0.3.0 7.0.0 0.3.0 7.0.0 0.3.0 6.0.2 0.3.0 6.0.0 0.3.0
105 :doc:`ROCm Validation Suite <rocmvalidationsuite:index>` :doc:`rocminfo <rocminfo:index>` 1.0.0 1.0.0 1.0.0 1.2.0 1.0.0 1.1.0 1.0.0 1.1.0 1.0.0 1.1.0 1.0.0 1.1.0 1.0.0 1.1.0 1.0.0 1.1.0 1.0.0 1.1.0 1.0.0 1.1.0 1.0.0 1.0.60204 1.0.0 1.0.60202 1.0.0 1.0.60201 1.0.0 1.0.60200 1.0.0 1.0.60105 1.0.0 1.0.60102 1.0.0 1.0.60101 1.0.0 1.0.60100 1.0.0 1.0.60002 1.0.0 1.0.60000 1.0.0
106 :doc:`ROCm SMI <rocm_smi_lib:index>` 7.8.0 7.8.0 7.8.0 7.8.0 7.7.0 7.5.0 7.5.0 7.5.0 7.4.0 7.4.0 7.4.0 7.4.0 7.3.0 7.3.0 7.3.0 7.3.0 7.2.0 7.2.0 7.0.0 7.0.0 6.0.2 6.0.0
107 PERFORMANCE TOOLS :doc:`ROCm Validation Suite <rocmvalidationsuite:index>` 1.3.0 1.2.0 1.2.0 1.2.0 1.1.0 1.1.0 1.1.0 1.1.0 1.1.0 1.1.0 1.1.0 1.1.0 1.0.60204 1.0.60202 1.0.60201 1.0.60200 1.0.60105 1.0.60102 1.0.60101 1.0.60100 1.0.60002 1.0.60000
108 :doc:`ROCm Bandwidth Test <rocm_bandwidth_test:index>` 2.6.0 1.4.0 1.4.0 1.4.0 1.4.0 1.4.0 1.4.0 1.4.0 1.4.0 1.4.0 1.4.0 1.4.0 1.4.0 1.4.0 1.4.0 1.4.0 1.4.0 1.4.0 1.4.0
109 :doc:`ROCm Compute Profiler <rocprofiler-compute:index>` PERFORMANCE TOOLS 3.2.3 3.1.1 3.1.1 3.1.0 3.1.0 3.0.0 3.0.0 3.0.0 3.0.0 2.0.1 2.0.1 2.0.1 2.0.1 N/A N/A N/A N/A N/A N/A
110 :doc:`ROCm Systems Profiler <rocprofiler-systems:index>` :doc:`ROCm Bandwidth Test <rocm_bandwidth_test:index>` 2.6.0 2.6.0 2.6.0 1.1.0 2.6.0 1.0.2 1.4.0 1.0.2 1.4.0 1.0.1 1.4.0 1.0.0 1.4.0 0.1.2 1.4.0 0.1.1 1.4.0 0.1.0 1.4.0 0.1.0 1.4.0 1.11.2 1.4.0 1.11.2 1.4.0 1.11.2 1.4.0 1.11.2 1.4.0 N/A 1.4.0 N/A 1.4.0 N/A 1.4.0 N/A 1.4.0 N/A 1.4.0 N/A 1.4.0
111 :doc:`ROCProfiler <rocprofiler:index>` :doc:`ROCm Compute Profiler <rocprofiler-compute:index>` 3.3.1 3.3.0 3.2.3 2.0.70000 3.2.3 2.0.60403 3.1.1 2.0.60402 3.1.1 2.0.60401 3.1.0 2.0.60400 3.1.0 2.0.60303 3.0.0 2.0.60302 3.0.0 2.0.60301 3.0.0 2.0.60300 3.0.0 2.0.60204 2.0.1 2.0.60202 2.0.1 2.0.60201 2.0.1 2.0.60200 2.0.1 2.0.60105 N/A 2.0.60102 N/A 2.0.60101 N/A 2.0.60100 N/A 2.0.60002 N/A 2.0.60000 N/A
112 :doc:`ROCprofiler-SDK <rocprofiler-sdk:index>` :doc:`ROCm Systems Profiler <rocprofiler-systems:index>` 1.2.1 1.2.0 1.1.1 1.0.0 1.1.0 0.6.0 1.0.2 0.6.0 1.0.2 0.6.0 1.0.1 0.6.0 1.0.0 0.5.0 0.1.2 0.5.0 0.1.1 0.5.0 0.1.0 0.5.0 0.1.0 0.4.0 1.11.2 0.4.0 1.11.2 0.4.0 1.11.2 0.4.0 1.11.2 N/A N/A N/A N/A N/A N/A
113 :doc:`ROCTracer <roctracer:index>` :doc:`ROCProfiler <rocprofiler:index>` 2.0.70101 2.0.70100 2.0.70002 4.1.70000 2.0.70000 4.1.60403 2.0.60403 4.1.60402 2.0.60402 4.1.60401 2.0.60401 4.1.60400 2.0.60400 4.1.60303 2.0.60303 4.1.60302 2.0.60302 4.1.60301 2.0.60301 4.1.60300 2.0.60300 4.1.60204 2.0.60204 4.1.60202 2.0.60202 4.1.60201 2.0.60201 4.1.60200 2.0.60200 4.1.60105 2.0.60105 4.1.60102 2.0.60102 4.1.60101 2.0.60101 4.1.60100 2.0.60100 4.1.60002 2.0.60002 4.1.60000 2.0.60000
114 :doc:`ROCprofiler-SDK <rocprofiler-sdk:index>` 1.0.0 1.0.0 1.0.0 1.0.0 0.6.0 0.6.0 0.6.0 0.6.0 0.5.0 0.5.0 0.5.0 0.5.0 0.4.0 0.4.0 0.4.0 0.4.0 N/A N/A N/A N/A N/A N/A
115 DEVELOPMENT TOOLS :doc:`ROCTracer <roctracer:index>` 4.1.70101 4.1.70100 4.1.70002 4.1.70000 4.1.60403 4.1.60402 4.1.60401 4.1.60400 4.1.60303 4.1.60302 4.1.60301 4.1.60300 4.1.60204 4.1.60202 4.1.60201 4.1.60200 4.1.60105 4.1.60102 4.1.60101 4.1.60100 4.1.60002 4.1.60000
116 :doc:`HIPIFY <hipify:index>` 20.0.0 19.0.0 19.0.0 19.0.0 19.0.0 18.0.0.25012 18.0.0.25012 18.0.0.24491 18.0.0.24455 18.0.0.24392 18.0.0.24355 18.0.0.24355 18.0.0.24232 17.0.0.24193 17.0.0.24193 17.0.0.24154 17.0.0.24103 17.0.0.24012 17.0.0.23483
117 :doc:`ROCm CMake <rocmcmakebuildtools:index>` DEVELOPMENT TOOLS 0.14.0 0.14.0 0.14.0 0.14.0 0.14.0 0.14.0 0.14.0 0.14.0 0.14.0 0.13.0 0.13.0 0.13.0 0.13.0 0.12.0 0.12.0 0.12.0 0.12.0 0.11.0 0.11.0
118 :doc:`ROCdbgapi <rocdbgapi:index>` :doc:`HIPIFY <hipify:index>` 20.0.0 20.0.0 20.0.0 0.77.3 20.0.0 0.77.2 19.0.0 0.77.2 19.0.0 0.77.2 19.0.0 0.77.2 19.0.0 0.77.0 18.0.0.25012 0.77.0 18.0.0.25012 0.77.0 18.0.0.24491 0.77.0 18.0.0.24455 0.76.0 18.0.0.24392 0.76.0 18.0.0.24355 0.76.0 18.0.0.24355 0.76.0 18.0.0.24232 0.71.0 17.0.0.24193 0.71.0 17.0.0.24193 0.71.0 17.0.0.24154 0.71.0 17.0.0.24103 0.71.0 17.0.0.24012 0.71.0 17.0.0.23483
119 :doc:`ROCm Debugger (ROCgdb) <rocgdb:index>` :doc:`ROCm CMake <rocmcmakebuildtools:index>` 0.14.0 0.14.0 0.14.0 16.3.0 0.14.0 15.2.0 0.14.0 15.2.0 0.14.0 15.2.0 0.14.0 15.2.0 0.14.0 15.2.0 0.14.0 15.2.0 0.14.0 15.2.0 0.14.0 15.2.0 0.14.0 14.2.0 0.13.0 14.2.0 0.13.0 14.2.0 0.13.0 14.2.0 0.13.0 14.1.0 0.12.0 14.1.0 0.12.0 14.1.0 0.12.0 14.1.0 0.12.0 13.2.0 0.11.0 13.2.0 0.11.0
120 `rocprofiler-register <https://github.com/ROCm/rocprofiler-register>`_ :doc:`ROCdbgapi <rocdbgapi:index>` 0.77.4 0.77.4 0.77.4 0.5.0 0.77.3 0.4.0 0.77.2 0.4.0 0.77.2 0.4.0 0.77.2 0.4.0 0.77.2 0.4.0 0.77.0 0.4.0 0.77.0 0.4.0 0.77.0 0.4.0 0.77.0 0.4.0 0.76.0 0.4.0 0.76.0 0.4.0 0.76.0 0.4.0 0.76.0 0.3.0 0.71.0 0.3.0 0.71.0 0.3.0 0.71.0 0.3.0 0.71.0 N/A 0.71.0 N/A 0.71.0
121 :doc:`ROCr Debug Agent <rocr_debug_agent:index>` :doc:`ROCm Debugger (ROCgdb) <rocgdb:index>` 16.3.0 16.3.0 16.3.0 2.1.0 16.3.0 2.0.4 15.2.0 2.0.4 15.2.0 2.0.4 15.2.0 2.0.4 15.2.0 2.0.3 15.2.0 2.0.3 15.2.0 2.0.3 15.2.0 2.0.3 15.2.0 2.0.3 14.2.0 2.0.3 14.2.0 2.0.3 14.2.0 2.0.3 14.2.0 2.0.3 14.1.0 2.0.3 14.1.0 2.0.3 14.1.0 2.0.3 14.1.0 2.0.3 13.2.0 2.0.3 13.2.0
122 `rocprofiler-register <https://github.com/ROCm/rocprofiler-register>`_ 0.5.0 0.5.0 0.5.0 0.5.0 0.4.0 0.4.0 0.4.0 0.4.0 0.4.0 0.4.0 0.4.0 0.4.0 0.4.0 0.4.0 0.4.0 0.4.0 0.3.0 0.3.0 0.3.0 0.3.0 N/A N/A
123 COMPILERS :doc:`ROCr Debug Agent <rocr_debug_agent:index>` 2.1.0 2.1.0 2.1.0 .. _compilers-support-compatibility-matrix-past-60: 2.1.0 2.0.4 2.0.4 2.0.4 2.0.4 2.0.3 2.0.3 2.0.3 2.0.3 2.0.3 2.0.3 2.0.3 2.0.3 2.0.3 2.0.3 2.0.3 2.0.3 2.0.3 2.0.3
124 `clang-ocl <https://github.com/ROCm/clang-ocl>`_ N/A N/A N/A N/A N/A N/A N/A N/A N/A N/A N/A N/A N/A 0.5.0 0.5.0 0.5.0 0.5.0 0.5.0 0.5.0
125 :doc:`hipCC <hipcc:index>` COMPILERS .. _compilers-support-compatibility-matrix-past-60: 1.1.1 1.1.1 1.1.1 1.1.1 1.1.1 1.1.1 1.1.1 1.1.1 1.1.1 1.1.1 1.1.1 1.1.1 1.1.1 1.0.0 1.0.0 1.0.0 1.0.0 1.0.0 1.0.0
126 `Flang <https://github.com/ROCm/flang>`_ `clang-ocl <https://github.com/ROCm/clang-ocl>`_ N/A N/A N/A 20.0.0.25314 N/A 19.0.0.25224 N/A 19.0.0.25224 N/A 19.0.0.25184 N/A 19.0.0.25133 N/A 18.0.0.25012 N/A 18.0.0.25012 N/A 18.0.0.24491 N/A 18.0.0.24455 N/A 18.0.0.24392 N/A 18.0.0.24355 N/A 18.0.0.24355 N/A 18.0.0.24232 N/A 17.0.0.24193 0.5.0 17.0.0.24193 0.5.0 17.0.0.24154 0.5.0 17.0.0.24103 0.5.0 17.0.0.24012 0.5.0 17.0.0.23483 0.5.0
127 :doc:`llvm-project <llvm-project:index>` :doc:`hipCC <hipcc:index>` 1.1.1 1.1.1 1.1.1 20.0.0.25314 1.1.1 19.0.0.25224 1.1.1 19.0.0.25224 1.1.1 19.0.0.25184 1.1.1 19.0.0.25133 1.1.1 18.0.0.25012 1.1.1 18.0.0.25012 1.1.1 18.0.0.24491 1.1.1 18.0.0.24491 1.1.1 18.0.0.24392 1.1.1 18.0.0.24355 1.1.1 18.0.0.24355 1.1.1 18.0.0.24232 1.1.1 17.0.0.24193 1.0.0 17.0.0.24193 1.0.0 17.0.0.24154 1.0.0 17.0.0.24103 1.0.0 17.0.0.24012 1.0.0 17.0.0.23483 1.0.0
128 `OpenMP <https://github.com/ROCm/llvm-project/tree/amd-staging/openmp>`_ `Flang <https://github.com/ROCm/flang>`_ 20.0.025444 20.0.025425 20.0.0.25385 20.0.0.25314 19.0.0.25224 19.0.0.25224 19.0.0.25184 19.0.0.25133 18.0.0.25012 18.0.0.25012 18.0.0.24491 18.0.0.24491 18.0.0.24455 18.0.0.24392 18.0.0.24355 18.0.0.24355 18.0.0.24232 17.0.0.24193 17.0.0.24193 17.0.0.24154 17.0.0.24103 17.0.0.24012 17.0.0.23483
129 :doc:`llvm-project <llvm-project:index>` 20.0.025444 20.0.025425 20.0.0.25385 20.0.0.25314 19.0.0.25224 19.0.0.25224 19.0.0.25184 19.0.0.25133 18.0.0.25012 18.0.0.25012 18.0.0.24491 18.0.0.24491 18.0.0.24392 18.0.0.24355 18.0.0.24355 18.0.0.24232 17.0.0.24193 17.0.0.24193 17.0.0.24154 17.0.0.24103 17.0.0.24012 17.0.0.23483
130 RUNTIMES `OpenMP <https://github.com/ROCm/llvm-project/tree/amd-staging/openmp>`_ 20.0.025444 20.0.025425 20.0.0.25385 .. _runtime-support-compatibility-matrix-past-60: 20.0.0.25314 19.0.0.25224 19.0.0.25224 19.0.0.25184 19.0.0.25133 18.0.0.25012 18.0.0.25012 18.0.0.24491 18.0.0.24491 18.0.0.24392 18.0.0.24355 18.0.0.24355 18.0.0.24232 17.0.0.24193 17.0.0.24193 17.0.0.24154 17.0.0.24103 17.0.0.24012 17.0.0.23483
131 :doc:`AMD CLR <hip:understand/amd_clr>` 7.0.51830 6.4.43484 6.4.43484 6.4.43483 6.4.43482 6.3.42134 6.3.42134 6.3.42133 6.3.42131 6.2.41134 6.2.41134 6.2.41134 6.2.41133 6.1.40093 6.1.40093 6.1.40092 6.1.40091 6.1.32831 6.1.32830
132 :doc:`HIP <hip:index>` RUNTIMES .. _runtime-support-compatibility-matrix-past-60: 7.0.51830 6.4.43484 6.4.43484 6.4.43483 6.4.43482 6.3.42134 6.3.42134 6.3.42133 6.3.42131 6.2.41134 6.2.41134 6.2.41134 6.2.41133 6.1.40093 6.1.40093 6.1.40092 6.1.40091 6.1.32831 6.1.32830
133 `OpenCL Runtime <https://github.com/ROCm/clr/tree/develop/opencl>`_ :doc:`AMD CLR <hip:understand/amd_clr>` 7.1.52802 7.1.25424 7.0.51831 2.0.0 7.0.51830 2.0.0 6.4.43484 2.0.0 6.4.43484 2.0.0 6.4.43483 2.0.0 6.4.43482 2.0.0 6.3.42134 2.0.0 6.3.42134 2.0.0 6.3.42133 2.0.0 6.3.42131 2.0.0 6.2.41134 2.0.0 6.2.41134 2.0.0 6.2.41134 2.0.0 6.2.41133 2.0.0 6.1.40093 2.0.0 6.1.40093 2.0.0 6.1.40092 2.0.0 6.1.40091 2.0.0 6.1.32831 2.0.0 6.1.32830
134 :doc:`ROCr Runtime <rocr-runtime:index>` :doc:`HIP <hip:index>` 7.1.52802 7.1.25424 7.0.51831 1.18.0 7.0.51830 1.15.0 6.4.43484 1.15.0 6.4.43484 1.15.0 6.4.43483 1.15.0 6.4.43482 1.14.0 6.3.42134 1.14.0 6.3.42134 1.14.0 6.3.42133 1.14.0 6.3.42131 1.14.0 6.2.41134 1.14.0 6.2.41134 1.14.0 6.2.41134 1.13.0 6.2.41133 1.13.0 6.1.40093 1.13.0 6.1.40093 1.13.0 6.1.40092 1.13.0 6.1.40091 1.12.0 6.1.32831 1.12.0 6.1.32830
135 `OpenCL Runtime <https://github.com/ROCm/clr/tree/develop/opencl>`_ 2.0.0 2.0.0 2.0.0 2.0.0 2.0.0 2.0.0 2.0.0 2.0.0 2.0.0 2.0.0 2.0.0 2.0.0 2.0.0 2.0.0 2.0.0 2.0.0 2.0.0 2.0.0 2.0.0 2.0.0 2.0.0 2.0.0
136 :doc:`ROCr Runtime <rocr-runtime:index>` 1.18.0 1.18.0 1.18.0 1.18.0 1.15.0 1.15.0 1.15.0 1.15.0 1.14.0 1.14.0 1.14.0 1.14.0 1.14.0 1.14.0 1.14.0 1.13.0 1.13.0 1.13.0 1.13.0 1.13.0 1.12.0 1.12.0

View File

@@ -10,10 +10,9 @@ Use this matrix to view the ROCm compatibility and system requirements across su
You can also refer to the :ref:`past versions of ROCm compatibility matrix<past-rocm-compatibility-matrix>`.
Accelerators and GPUs listed in the following table support compute workloads (no display
information or graphics). If youre using ROCm with AMD Radeon or Radeon Pro GPUs for graphics
workloads, see the `Use ROCm on Radeon GPU documentation
<https://rocm.docs.amd.com/projects/radeon/en/latest/docs/compatibility.html>`_ to verify
GPUs listed in the following table support compute workloads (no display
information or graphics). If youre using ROCm with AMD Radeon GPUs or Ryzen APUs for graphics
workloads, see the :doc:`Use ROCm on Radeon and Ryzen <radeon:index>` to verify
compatibility and system requirements.
.. |br| raw:: html
@@ -23,20 +22,20 @@ compatibility and system requirements.
.. container:: format-big-table
.. csv-table::
:header: "ROCm Version", "7.0.0", "6.4.3", "6.3.0"
:header: "ROCm Version", "7.1.1", "7.1.0", "6.4.0"
:stub-columns: 1
:ref:`Operating systems & kernels <OS-kernel-versions>`,Ubuntu 24.04.3,Ubuntu 24.04.2,Ubuntu 24.04.2
:ref:`Operating systems & kernels <OS-kernel-versions>` [#os-compatibility]_,Ubuntu 24.04.3,Ubuntu 24.04.3,Ubuntu 24.04.2
,Ubuntu 22.04.5,Ubuntu 22.04.5,Ubuntu 22.04.5
,"RHEL 9.6, 9.4","RHEL 9.6, 9.4","RHEL 9.5, 9.4"
,RHEL 8.10 [#rhel-700]_,RHEL 8.10 [#rhel-700],RHEL 8.10 [#rhel-700]
,SLES 15 SP7,"SLES 15 SP7, SP6","SLES 15 SP6, SP5"
,"Oracle Linux 9, 8 [#ol-700-mi300x]_","Oracle Linux 9, 8 [#ol-mi300x]_",Oracle Linux 8.10 [#ol-mi300x]_
,Debian 12,Debian 12 [#single-node]_,
,Azure Linux 3.0 [#az-mi300x]_,Azure Linux 3.0 [#az-mi300x]_,
,Rocky Linux 9 [#rl-700]_,,
,"RHEL 10.1, 10.0, 9.7, |br| 9.6, 9.4","RHEL 10.0, 9.6, 9.4","RHEL 9.5, 9.4"
,RHEL 8.10,RHEL 8.10,RHEL 8.10
,SLES 15 SP7,SLES 15 SP7,SLES 15 SP6
,"Oracle Linux 10, 9, 8","Oracle Linux 10, 9, 8","Oracle Linux 9, 8"
,"Debian 13, 12","Debian 13, 12",Debian 12
,,,Azure Linux 3.0
,Rocky Linux 9,Rocky Linux 9,
,.. _architecture-support-compatibility-matrix:,,
:doc:`Architecture <rocm-install-on-linux:reference/system-requirements>`,CDNA4,,
:doc:`Architecture <rocm-install-on-linux:reference/system-requirements>`,CDNA4,CDNA4,
,CDNA3,CDNA3,CDNA3
,CDNA2,CDNA2,CDNA2
,CDNA,CDNA,CDNA
@@ -44,10 +43,10 @@ compatibility and system requirements.
,RDNA3,RDNA3,RDNA3
,RDNA2,RDNA2,RDNA2
,.. _gpu-support-compatibility-matrix:,,
:doc:`GPU / LLVM target <rocm-install-on-linux:reference/system-requirements>`,gfx950,,
,gfx1201 [#RDNA-OS]_,gfx1201 [#RDNA-OS]_,
,gfx1200 [#RDNA-OS]_,gfx1200 [#RDNA-OS]_,
,gfx1101 [#RDNA-OS]_ [#7700XT-OS]_,gfx1101 [#RDNA-OS]_ [#7700XT-OS]_,
:doc:`GPU / LLVM target <rocm-install-on-linux:reference/system-requirements>` [#gpu-compatibility]_,gfx950,gfx950,
,gfx1201,gfx1201,
,gfx1200,gfx1200,
,gfx1101,gfx1101,
,gfx1100,gfx1100,gfx1100
,gfx1030,gfx1030,gfx1030
,gfx942,gfx942,gfx942
@@ -55,160 +54,122 @@ compatibility and system requirements.
,gfx908,gfx908,gfx908
,,,
FRAMEWORK SUPPORT,.. _framework-support-compatibility-matrix:,,
:doc:`PyTorch <../compatibility/ml-compatibility/pytorch-compatibility>`,"2.7, 2.6, 2.5, 2.4, 2.3","2.6, 2.5, 2.4, 2.3","2.4, 2.3, 2.2, 2.1, 2.0, 1.13"
:doc:`TensorFlow <../compatibility/ml-compatibility/tensorflow-compatibility>`,"2.19.1, 2.18.1","2.18.1, 2.17.1, 2.16.2","2.17.0, 2.16.2, 2.15.1"
:doc:`JAX <../compatibility/ml-compatibility/jax-compatibility>`,0.6.0,0.4.35,0.4.31
:doc:`verl <../compatibility/ml-compatibility/verl-compatibility>` [#verl_compat]_,N/A,N/A,N/A
:doc:`Stanford Megatron-LM <../compatibility/ml-compatibility/stanford-megatron-lm-compatibility>`,N/A,N/A,85f95ae
:doc:`DGL <../compatibility/ml-compatibility/dgl-compatibility>` [#dgl_compat]_,N/A,N/A,N/A
:doc:`Megablocks <../compatibility/ml-compatibility/megablocks-compatibility>`,N/A,N/A,0.7.0
:doc:`Taichi <../compatibility/ml-compatibility/taichi-compatibility>` [#taichi_compat]_,N/A,N/A,N/A
`ONNX Runtime <https://onnxruntime.ai/docs/build/eps.html#amd-migraphx>`_,1.22.0,1.20.0,1.17.3
:doc:`PyTorch <../compatibility/ml-compatibility/pytorch-compatibility>`,"2.9, 2.8, 2.7","2.8, 2.7, 2.6","2.6, 2.5, 2.4, 2.3"
:doc:`TensorFlow <../compatibility/ml-compatibility/tensorflow-compatibility>`,"2.20.0, 2.19.1, 2.18.1","2.20.0, 2.19.1, 2.18.1","2.18.1, 2.17.1, 2.16.2"
:doc:`JAX <../compatibility/ml-compatibility/jax-compatibility>`,0.7.1,0.7.1,0.4.35
:doc:`DGL <../compatibility/ml-compatibility/dgl-compatibility>` [#dgl_compat]_,N/A,N/A,2.4.0
:doc:`llama.cpp <../compatibility/ml-compatibility/llama-cpp-compatibility>` [#llama-cpp_compat]_,N/A,N/A,b5997
`ONNX Runtime <https://onnxruntime.ai/docs/build/eps.html#amd-migraphx>`_,1.23.1,1.22.0,1.20.0
,,,
THIRD PARTY COMMS,.. _thirdpartycomms-support-compatibility-matrix:,,
`UCC <https://github.com/ROCm/ucc>`_,>=1.4.0,>=1.3.0,>=1.3.0
`UCX <https://github.com/ROCm/ucx>`_,>=1.17.0,>=1.15.0,>=1.15.0
`UCC <https://github.com/ROCm/ucc>`_,>=1.4.0,>=1.4.0,>=1.3.0
`UCX <https://github.com/ROCm/ucx>`_,>=1.17.0,>=1.17.0,>=1.15.0
,,,
THIRD PARTY ALGORITHM,.. _thirdpartyalgorithm-support-compatibility-matrix:,,
Thrust,2.6.0,2.5.0,2.3.2
CUB,2.6.0,2.5.0,2.3.2
Thrust,2.8.5,2.8.5,2.5.0
CUB,2.8.5,2.8.5,2.5.0
,,,
KMD & USER SPACE [#kfd_support]_,.. _kfd-userspace-support-compatibility-matrix:,,
:doc:`KMD versions <rocm-install-on-linux:reference/user-kernel-space-compat-matrix>`,"30.10, 6.4.x, 6.3.x, 6.2.x","6.4.x, 6.3.x, 6.2.x, 6.1.x","6.4.x, 6.3.x, 6.2.x, 6.1.x"
DRIVER & USER SPACE [#kfd_support]_,.. _kfd-userspace-support-compatibility-matrix:,,
:doc:`AMD GPU Driver <rocm-install-on-linux:reference/user-kernel-space-compat-matrix>`,"30.20.1, 30.20.0 [#mi325x_KVM]_, |br| 30.10.2, 30.10.1 [#driver_patch]_, |br| 30.10, 6.4.x","30.20.0 [#mi325x_KVM]_, 30.10.2, |br| 30.10.1 [#driver_patch]_, 30.10, 6.4.x","6.4.x, 6.3.x, 6.2.x, 6.1.x"
,,,
ML & COMPUTER VISION,.. _mllibs-support-compatibility-matrix:,,
:doc:`Composable Kernel <composable_kernel:index>`,1.1.0,1.1.0,1.1.0
:doc:`MIGraphX <amdmigraphx:index>`,2.13.0,2.12.0,2.11.0
:doc:`MIOpen <miopen:index>`,3.5.0,3.4.0,3.3.0
:doc:`MIVisionX <mivisionx:index>`,3.3.0,3.2.0,3.1.0
:doc:`rocAL <rocal:index>`,2.3.0,2.2.0,2.1.0
:doc:`rocDecode <rocdecode:index>`,1.0.0,0.10.0,0.8.0
:doc:`rocJPEG <rocjpeg:index>`,1.1.0,0.8.0,0.6.0
:doc:`rocPyDecode <rocpydecode:index>`,0.6.0,0.3.1,0.2.0
:doc:`RPP <rpp:index>`,2.0.0,1.9.10,1.9.1
:doc:`MIGraphX <amdmigraphx:index>`,2.14.0,2.14.0,2.12.0
:doc:`MIOpen <miopen:index>`,3.5.1,3.5.1,3.4.0
:doc:`MIVisionX <mivisionx:index>`,3.4.0,3.4.0,3.2.0
:doc:`rocAL <rocal:index>`,2.4.0,2.4.0,2.2.0
:doc:`rocDecode <rocdecode:index>`,1.4.0,1.4.0,0.10.0
:doc:`rocJPEG <rocjpeg:index>`,1.2.0,1.2.0,0.8.0
:doc:`rocPyDecode <rocpydecode:index>`,0.7.0,0.7.0,0.3.1
:doc:`RPP <rpp:index>`,2.1.0,2.1.0,1.9.10
,,,
COMMUNICATION,.. _commlibs-support-compatibility-matrix:,,
:doc:`RCCL <rccl:index>`,2.26.6,2.22.3,2.21.5
:doc:`rocSHMEM <rocshmem:index>`,3.0.0,2.0.1,N/A
:doc:`RCCL <rccl:index>`,2.27.7,2.27.7,2.22.3
:doc:`rocSHMEM <rocshmem:index>`,3.1.0,3.0.0,2.0.0
,,,
MATH LIBS,.. _mathlibs-support-compatibility-matrix:,,
`half <https://github.com/ROCm/half>`_ ,1.12.0,1.12.0,1.12.0
:doc:`hipBLAS <hipblas:index>`,3.0.0,2.4.0,2.3.0
:doc:`hipBLASLt <hipblaslt:index>`,1.0.0,0.12.1,0.10.0
:doc:`hipFFT <hipfft:index>`,1.0.20,1.0.18,1.0.17
:doc:`hipfort <hipfort:index>`,0.7.0,0.6.0,0.5.0
:doc:`hipRAND <hiprand:index>`,3.0.0,2.12.0,2.11.0
:doc:`hipSOLVER <hipsolver:index>`,3.0.0,2.4.0,2.3.0
:doc:`hipSPARSE <hipsparse:index>`,4.0.1,3.2.0,3.1.2
:doc:`hipSPARSELt <hipsparselt:index>`,0.2.4,0.2.3,0.2.2
:doc:`rocALUTION <rocalution:index>`,4.0.0,3.2.3,3.2.1
:doc:`rocBLAS <rocblas:index>`,5.0.0,4.4.1,4.3.0
:doc:`rocFFT <rocfft:index>`,1.0.34,1.0.32,1.0.31
:doc:`rocRAND <rocrand:index>`,4.0.0,3.3.0,3.2.0
:doc:`rocSOLVER <rocsolver:index>`,3.30.0,3.28.2,3.27.0
:doc:`rocSPARSE <rocsparse:index>`,4.0.2,3.4.0,3.3.0
:doc:`rocWMMA <rocwmma:index>`,2.0.0,1.7.0,1.6.0
:doc:`Tensile <tensile:src/index>`,4.44.0,4.43.0,4.42.0
:doc:`hipBLAS <hipblas:index>`,3.1.0,3.1.0,2.4.0
:doc:`hipBLASLt <hipblaslt:index>`,1.1.0,1.1.0,0.12.0
:doc:`hipFFT <hipfft:index>`,1.0.21,1.0.21,1.0.18
:doc:`hipfort <hipfort:index>`,0.7.1,0.7.1,0.6.0
:doc:`hipRAND <hiprand:index>`,3.1.0,3.1.0,2.12.0
:doc:`hipSOLVER <hipsolver:index>`,3.1.0,3.1.0,2.4.0
:doc:`hipSPARSE <hipsparse:index>`,4.1.0,4.1.0,3.2.0
:doc:`hipSPARSELt <hipsparselt:index>`,0.2.5,0.2.5,0.2.3
:doc:`rocALUTION <rocalution:index>`,4.0.1,4.0.1,3.2.2
:doc:`rocBLAS <rocblas:index>`,5.1.1,5.1.0,4.4.0
:doc:`rocFFT <rocfft:index>`,1.0.35,1.0.35,1.0.32
:doc:`rocRAND <rocrand:index>`,4.1.0,4.1.0,3.3.0
:doc:`rocSOLVER <rocsolver:index>`,3.31.0,3.31.0,3.28.0
:doc:`rocSPARSE <rocsparse:index>`,4.1.0,4.1.0,3.4.0
:doc:`rocWMMA <rocwmma:index>`,2.1.0,2.0.0,1.7.0
:doc:`Tensile <tensile:src/index>`,4.44.0,4.44.0,4.43.0
,,,
PRIMITIVES,.. _primitivelibs-support-compatibility-matrix:,,
:doc:`hipCUB <hipcub:index>`,4.0.0,3.4.0,3.3.0
:doc:`hipTensor <hiptensor:index>`,2.0.0,1.5.0,1.4.0
:doc:`rocPRIM <rocprim:index>`,4.0.0,3.4.1,3.3.0
:doc:`rocThrust <rocthrust:index>`,4.0.0,3.3.0,3.3.0
:doc:`hipCUB <hipcub:index>`,4.1.0,4.1.0,3.4.0
:doc:`hipTensor <hiptensor:index>`,2.0.0,2.0.0,1.5.0
:doc:`rocPRIM <rocprim:index>`,4.1.0,4.1.0,3.4.0
:doc:`rocThrust <rocthrust:index>`,4.1.0,4.1.0,3.3.0
,,,
SUPPORT LIBS,,,
`hipother <https://github.com/ROCm/hipother>`_,7.0.51830,6.4.43483,6.3.42131
`rocm-core <https://github.com/ROCm/rocm-core>`_,7.0.0,6.4.3,6.3.0
`hipother <https://github.com/ROCm/hipother>`_,7.1.52802,7.1.25424,6.4.43482
`rocm-core <https://github.com/ROCm/rocm-core>`_,7.1.1,7.1.0,6.4.0
`ROCT-Thunk-Interface <https://github.com/ROCm/ROCT-Thunk-Interface>`_,N/A [#ROCT-rocr]_,N/A [#ROCT-rocr]_,N/A [#ROCT-rocr]_
,,,
SYSTEM MGMT TOOLS,.. _tools-support-compatibility-matrix:,,
:doc:`AMD SMI <amdsmi:index>`,26.0.0,25.5.1,24.7.1
:doc:`ROCm Data Center Tool <rdc:index>`,1.1.0,0.3.0,0.3.0
:doc:`AMD SMI <amdsmi:index>`,26.2.0,26.1.0,25.3.0
:doc:`ROCm Data Center Tool <rdc:index>`,1.2.0,1.2.0,0.3.0
:doc:`rocminfo <rocminfo:index>`,1.0.0,1.0.0,1.0.0
:doc:`ROCm SMI <rocm_smi_lib:index>`,7.8.0,7.7.0,7.4.0
:doc:`ROCm Validation Suite <rocmvalidationsuite:index>`,1.2.0,1.1.0,1.1.0
:doc:`ROCm SMI <rocm_smi_lib:index>`,7.8.0,7.8.0,7.5.0
:doc:`ROCm Validation Suite <rocmvalidationsuite:index>`,1.3.0,1.2.0,1.1.0
,,,
PERFORMANCE TOOLS,,,
:doc:`ROCm Bandwidth Test <rocm_bandwidth_test:index>`,2.6.0,1.4.0,1.4.0
:doc:`ROCm Compute Profiler <rocprofiler-compute:index>`,3.2.3,3.1.1,3.0.0
:doc:`ROCm Systems Profiler <rocprofiler-systems:index>`,1.1.0,1.0.2,0.1.0
:doc:`ROCProfiler <rocprofiler:index>`,2.0.70000,2.0.60403,2.0.60300
:doc:`ROCprofiler-SDK <rocprofiler-sdk:index>`,1.0.0,0.6.0,0.5.0
:doc:`ROCTracer <roctracer:index>`,4.1.70000,4.1.60403,4.1.60300
:doc:`ROCm Bandwidth Test <rocm_bandwidth_test:index>`,2.6.0,2.6.0,1.4.0
:doc:`ROCm Compute Profiler <rocprofiler-compute:index>`,3.3.1,3.3.0,3.1.0
:doc:`ROCm Systems Profiler <rocprofiler-systems:index>`,1.2.1,1.2.0,1.0.0
:doc:`ROCProfiler <rocprofiler:index>`,2.0.70101,2.0.70100,2.0.60400
:doc:`ROCprofiler-SDK <rocprofiler-sdk:index>`,1.0.0,1.0.0,0.6.0
:doc:`ROCTracer <roctracer:index>`,4.1.70101,4.1.70100,4.1.60400
,,,
DEVELOPMENT TOOLS,,,
:doc:`HIPIFY <hipify:index>`,20.0.0,19.0.0,18.0.0.24455
:doc:`HIPIFY <hipify:index>`,20.0.0,20.0.0,19.0.0
:doc:`ROCm CMake <rocmcmakebuildtools:index>`,0.14.0,0.14.0,0.14.0
:doc:`ROCdbgapi <rocdbgapi:index>`,0.77.3,0.77.2,0.77.0
:doc:`ROCm Debugger (ROCgdb) <rocgdb:index>`,16.3.0,15.2.0,15.2.0
`rocprofiler-register <https://github.com/ROCm/rocprofiler-register>`_,0.5.0,0.4.0,0.4.0
:doc:`ROCr Debug Agent <rocr_debug_agent:index>`,2.1.0,2.0.4,2.0.3
:doc:`ROCdbgapi <rocdbgapi:index>`,0.77.4,0.77.4,0.77.2
:doc:`ROCm Debugger (ROCgdb) <rocgdb:index>`,16.3.0,16.3.0,15.2.0
`rocprofiler-register <https://github.com/ROCm/rocprofiler-register>`_,0.5.0,0.5.0,0.4.0
:doc:`ROCr Debug Agent <rocr_debug_agent:index>`,2.1.0,2.1.0,2.0.4
,,,
COMPILERS,.. _compilers-support-compatibility-matrix:,,
`clang-ocl <https://github.com/ROCm/clang-ocl>`_,N/A,N/A,N/A
:doc:`hipCC <hipcc:index>`,1.1.1,1.1.1,1.1.1
`Flang <https://github.com/ROCm/flang>`_,20.0.0.25314,19.0.0.25224,18.0.0.24455
:doc:`llvm-project <llvm-project:index>`,20.0.0.25314,19.0.0.25224,18.0.0.24491
`OpenMP <https://github.com/ROCm/llvm-project/tree/amd-staging/openmp>`_,20.0.0.25314,19.0.0.25224,18.0.0.24491
`Flang <https://github.com/ROCm/flang>`_,20.0.025444,20.0.025425,19.0.0.25133
:doc:`llvm-project <llvm-project:index>`,20.0.025444,20.0.025425,19.0.0.25133
`OpenMP <https://github.com/ROCm/llvm-project/tree/amd-staging/openmp>`_,20.0.025444,20.0.025425,19.0.0.25133
,,,
RUNTIMES,.. _runtime-support-compatibility-matrix:,,
:doc:`AMD CLR <hip:understand/amd_clr>`,7.0.51830,6.4.43484,6.3.42131
:doc:`HIP <hip:index>`,7.0.51830,6.4.43484,6.3.42131
:doc:`AMD CLR <hip:understand/amd_clr>`,7.1.52802,7.1.25424,6.4.43482
:doc:`HIP <hip:index>`,7.1.52802,7.1.25424,6.4.43482
`OpenCL Runtime <https://github.com/ROCm/clr/tree/develop/opencl>`_,2.0.0,2.0.0,2.0.0
:doc:`ROCr Runtime <rocr-runtime:index>`,1.18.0,1.15.0,1.14.0
:doc:`ROCr Runtime <rocr-runtime:index>`,1.18.0,1.18.0,1.15.0
.. rubric:: Footnotes
.. [#rhel-700] RHEL 8.10 is only supported on AMD Instinct MI300X, MI300A, MI250X, MI250, MI210, and MI100 GPUs.
.. [#ol-700-mi300x] **For ROCm 7.0** - Oracle Linux 9 is supported only on AMD Instinct MI300X, MI350X, and MI355X. Oracle Linux 8 is only supported on AMD Instinct MI300X.
.. [#ol-mi300x] **Prior ROCm 7.0** - Oracle Linux is supported only on AMD Instinct MI300X.
.. [#sles-db-700] SLES 15 SP7 and Debian 12 are only supported on AMD Instinct MI300X, MI300A, MI250X, MI250, and MI210 GPUs.
.. [#az-mi300x] Starting ROCm 6.4.0, Azure Linux 3.0 is supported only on AMD Instinct MI300X and AMD Radeon PRO V710.
.. [#rl-700] Rocky Linux 9 is only supported on AMD Instinct MI300X and MI300A GPUs.
.. [#single-node] **Prior to ROCm 7.0.0** - Debian 12 is supported only on AMD Instinct MI300X for single-node functionality.
.. [#az-mi300x] Starting from ROCm 6.4.0, Azure Linux 3.0 is supported only on AMD Instinct MI300X and AMD Radeon PRO V710.
.. [#RDNA-OS] Radeon AI PRO R9700, Radeon RX 9070 XT (gfx1201), Radeon RX 9060 XT (gfx1200), Radeon PRO W7700 (gfx1101), and Radeon RX 7800 XT (gfx1101) are supported only on Ubuntu 24.04.2, Ubuntu 22.04.5, RHEL 9.6, and RHEL 9.4.
.. [#7700XT-OS] Radeon RX 7700 XT (gfx1101) is supported only on Ubuntu 24.04.2 and RHEL 9.6.
.. [#kfd_support] As of ROCm 6.4.0, forward and backward compatibility between the AMD Kernel-mode GPU Driver (KMD) and its user space software is provided up to a year apart. For earlier ROCm releases, the compatibility is provided for +/- 2 releases. The supported user space versions on this page were accurate as of the time of initial ROCm release. For the most up-to-date information, see the latest version of this information at `User and kernel-space support matrix <https://rocm.docs.amd.com/projects/install-on-linux/en/latest/reference/user-kernel-space-compat-matrix.html>`_.
.. [#os-compatibility] Some operating systems are supported on limited GPUs. For detailed information, see the latest :ref:`supported_distributions`. For version specific information, see `ROCm 7.1.1 <https://rocm.docs.amd.com/projects/install-on-linux/en/docs-7.1.1/reference/system-requirements.html#supported-operating-systems>`__, `ROCm 7.1.0 <https://rocm.docs.amd.com/projects/install-on-linux/en/docs-7.1.0/reference/system-requirements.html#supported-operating-systems>`__, and `ROCm 6.4.0 <https://rocm.docs.amd.com/projects/install-on-linux/en/docs-6.4.0/reference/system-requirements.html#supported-operating-systems>`__.
.. [#gpu-compatibility] Some GPUs have limited operating system support. For detailed information, see the latest :ref:`supported_GPUs`. For version specific information, see `ROCm 7.1.1 <https://rocm.docs.amd.com/projects/install-on-linux/en/docs-7.1.1/reference/system-requirements.html#supported-gpus>`__, `ROCm 7.1.0 <https://rocm.docs.amd.com/projects/install-on-linux/en/docs-7.1.0/reference/system-requirements.html#supported-gpus>`__, and `ROCm 6.4.0 <https://rocm.docs.amd.com/projects/install-on-linux/en/docs-6.4.0/reference/system-requirements.html#supported-gpus>`__.
.. [#dgl_compat] DGL is only supported on ROCm 7.0.0, ROCm 6.4.3 and ROCm 6.4.0.
.. [#llama-cpp_compat] llama.cpp is only supported on ROCm 7.0.0 and ROCm 6.4.x.
.. [#mi325x_KVM] For AMD Instinct MI325X KVM SR-IOV users, do not use AMD GPU Driver (amdgpu) 30.20.0.
.. [#driver_patch] AMD GPU Driver (amdgpu) 30.10.1 is a quality release that resolves an issue identified in the 30.10 release. There are no other significant changes or feature additions in ROCm 7.0.1 from ROCm 7.0.0. AMD GPU Driver (amdgpu) 30.10.1 is compatible with ROCm 7.0.1 and ROCm 7.0.0.
.. [#kfd_support] As of ROCm 6.4.0, forward and backward compatibility between the AMD GPU Driver (amdgpu) and its user space software is provided up to a year apart. For earlier ROCm releases, the compatibility is provided for +/- 2 releases. The supported user space versions on this page were accurate as of the time of initial ROCm release. For the most up-to-date information, see the latest version of this information at `User and AMD GPU Driver support matrix <https://rocm.docs.amd.com/projects/install-on-linux/en/latest/reference/user-kernel-space-compat-matrix.html>`_.
.. [#ROCT-rocr] Starting from ROCm 6.3.0, the ROCT Thunk Interface is included as part of the ROCr runtime package.
.. _OS-kernel-versions:
Operating systems, kernel and Glibc versions
*********************************************
Use this lookup table to confirm which operating system and kernel versions are supported with ROCm.
.. csv-table::
:header: "OS", "Version", "Kernel", "Glibc"
:widths: 40, 20, 30, 20
:stub-columns: 1
`Ubuntu <https://ubuntu.com/about/release-cycle#ubuntu-kernel-release-cycle>`_, 24.04.3, "6.8 [GA], 6.14 [HWE]", 2.39
,,
`Ubuntu <https://ubuntu.com/about/release-cycle#ubuntu-kernel-release-cycle>`_, 24.04.2, "6.8 [GA], 6.11 [HWE]", 2.39
,,
`Ubuntu <https://ubuntu.com/about/release-cycle#ubuntu-kernel-release-cycle>`_, 22.04.5, "5.15 [GA], 6.8 [HWE]", 2.35
,,
`Red Hat Enterprise Linux (RHEL 9) <https://access.redhat.com/articles/3078#RHEL9>`_, 9.6, 5.14.0-570, 2.34
,9.5, 5.14+, 2.34
,9.4, 5.14.0-427, 2.34
,,
`Red Hat Enterprise Linux (RHEL 8) <https://access.redhat.com/articles/3078#RHEL8>`_, 8.10, 4.18.0-553, 2.28
,,
`SUSE Linux Enterprise Server (SLES) <https://www.suse.com/support/kb/doc/?id=000019587#SLE15SP4>`_, 15 SP7, 6.40-150700.51, 2.38
,15 SP6, "6.5.0+, 6.4.0", 2.38
,15 SP5, 5.14.21, 2.31
,,
`Rocky Linux <https://wiki.rockylinux.org/rocky/version/>`_, 9, 5.14.0-570, 2.34
,,
`Oracle Linux <https://blogs.oracle.com/scoter/post/oracle-linux-and-unbreakable-enterprise-kernel-uek-releases>`_, 9, 6.12.0 (UEK), 2.34
,8, 5.15.0 (UEK), 2.28
,,
`Debian <https://www.debian.org/download>`_,12, 6.1.0, 2.36
,,
`Azure Linux <https://techcommunity.microsoft.com/blog/linuxandopensourceblog/azure-linux-3-0-now-in-preview-on-azure-kubernetes-service-v1-31/4287229>`_,3.0, 6.6.92, 2.38
,,
For detailed information on operating system supported on ROCm 7.1.1 and associated Kernel and Glibc version, see the latest :ref:`supported_distributions`. For version specific information, see `ROCm 7.1.0 <https://rocm.docs.amd.com/projects/install-on-linux/en/docs-7.1.0/reference/system-requirements.html#supported-operating-systems>`__, and `ROCm 6.4.0 <https://rocm.docs.amd.com/projects/install-on-linux/en/docs-6.4.0/reference/system-requirements.html#supported-operating-systems>`__.
.. note::
@@ -240,29 +201,18 @@ Expand for full historical view of:
.. rubric:: Footnotes
.. [#ol-700-mi300x-past-60] **For ROCm 7.0.0** - Oracle Linux 9 is supported only on AMD Instinct MI300X, MI350X, and MI355X. Oracle Linux 8 is only supported on AMD Instinct MI300X.
.. [#mi300x-past-60] **Prior to ROCm 7.0.0** - Oracle Linux is supported only on AMD Instinct MI300X.
.. [#single-node-past-60] **Prior to ROCm 7.0.0 ** - Debian 12 is supported only on AMD Instinct MI300X for single-node functionality.
.. [#az-mi300x-past-60] Starting from ROCm 6.4.0, Azure Linux 3.0 is supported only on AMD Instinct MI300X and AMD Radeon PRO V710.
.. [#az-mi300x-630-past-60] **Prior ROCm 6.4.0**- Azure Linux 3.0 is supported only on AMD Instinct MI300X.
.. [#RDNA-OS-past-60] Radeon AI PRO R9700, Radeon RX 9070 XT (gfx1201), Radeon RX 9060 XT (gfx1200), Radeon PRO W7700 (gfx1101), and Radeon RX 7800 XT (gfx1101) are supported only on Ubuntu 24.04.2, Ubuntu 22.04.5, RHEL 9.6, and RHEL 9.4.
.. [#7700XT-OS-past-60] Radeon RX 7700 XT (gfx1101) is supported only on Ubuntu 24.04.2 and RHEL 9.6.
.. [#mi300_624-past-60] **For ROCm 6.2.4** - MI300X (gfx942) is supported on listed operating systems *except* Ubuntu 22.04.5 [6.8 HWE] and Ubuntu 22.04.4 [6.5 HWE].
.. [#mi300_622-past-60] **For ROCm 6.2.2** - MI300X (gfx942) is supported on listed operating systems *except* Ubuntu 22.04.5 [6.8 HWE] and Ubuntu 22.04.4 [6.5 HWE].
.. [#mi300_621-past-60] **For ROCm 6.2.1** - MI300X (gfx942) is supported on listed operating systems *except* Ubuntu 22.04.5 [6.8 HWE] and Ubuntu 22.04.4 [6.5 HWE].
.. [#mi300_620-past-60] **For ROCm 6.2.0** - MI300X (gfx942) is supported on listed operating systems *except* Ubuntu 22.04.5 [6.8 HWE] and Ubuntu 22.04.4 [6.5 HWE].
.. [#mi300_612-past-60] **For ROCm 6.1.2** - MI300A (gfx942) is supported on Ubuntu 22.04.4, RHEL 9.4, RHEL 9.3, RHEL 8.9, and SLES 15 SP5. MI300X (gfx942) is only supported on Ubuntu 22.04.4 and Oracle Linux.
.. [#mi300_611-past-60] **For ROCm 6.1.1** - MI300A (gfx942) is supported on Ubuntu 22.04.4, RHEL 9.4, RHEL 9.3, RHEL 8.9, and SLES 15 SP5. MI300X (gfx942) is only supported on Ubuntu 22.04.4 and Oracle Linux.
.. [#mi300_610-past-60] **For ROCm 6.1.0** - MI300A (gfx942) is supported on Ubuntu 22.04.4, RHEL 9.4, RHEL 9.3, RHEL 8.9, and SLES 15 SP5. MI300X (gfx942) is only supported on Ubuntu 22.04.4.
.. [#mi300_602-past-60] **For ROCm 6.0.2** - MI300A (gfx942) is supported on Ubuntu 22.04.3, RHEL 8.9, and SLES 15 SP5. MI300X (gfx942) is only supported on Ubuntu 22.04.3.
.. [#mi300_600-past-60] **For ROCm 6.0.0** - MI300A (gfx942) is supported on Ubuntu 22.04.3, RHEL 8.9, and SLES 15 SP5. MI300X (gfx942) is only supported on Ubuntu 22.04.3.
.. [#verl_compat] verl is only supported on ROCm 6.2.0.
.. [#stanford-megatron-lm_compat] Stanford Megatron-LM is only supported on ROCm 6.3.0.
.. [#dgl_compat] DGL is only supported on ROCm 6.4.0.
.. [#megablocks_compat] Megablocks is only supported on ROCm 6.3.0.
.. [#taichi_compat] Taichi is only supported on ROCm 6.3.2.
.. [#ray_compat] Ray is only supported on ROCm 6.4.1.
.. [#llama-cpp_compat] llama.cpp is only supported on ROCm 6.4.0.
.. [#kfd_support-past-60] As of ROCm 6.4.0, forward and backward compatibility between the AMD Kernel-mode GPU Driver (KMD) and its user space software is provided up to a year apart. For earlier ROCm releases, the compatibility is provided for +/- 2 releases. The tested user space versions on this page were accurate as of the time of initial ROCm release. For the most up-to-date information, see the latest version of this information at `User and kernel-space support matrix <https://rocm.docs.amd.com/projects/install-on-linux/en/latest/reference/user-kernel-space-compat-matrix.html>`_.
.. [#os-compatibility-past-60] Some operating systems are supported on limited GPUs. For detailed information, see the latest :ref:`supported_distributions`. For version specific information, see `ROCm 7.1.1 <https://rocm.docs.amd.com/projects/install-on-linux/en/docs-7.1.1/reference/system-requirements.html#supported-operating-systems>`__, `ROCm 7.1.0 <https://rocm.docs.amd.com/projects/install-on-linux/en/docs-7.1.0/reference/system-requirements.html#supported-operating-systems>`__, and `ROCm 6.4.0 <https://rocm.docs.amd.com/projects/install-on-linux/en/docs-6.4.0/reference/system-requirements.html#supported-operating-systems>`__.
.. [#gpu-compatibility-past-60] Some GPUs have limited operating system support. For detailed information, see the latest :ref:`supported_GPUs`. For version specific information, see `ROCm 7.1.1 <https://rocm.docs.amd.com/projects/install-on-linux/en/docs-7.1.1/reference/system-requirements.html#supported-gpus>`__, `ROCm 7.1.0 <https://rocm.docs.amd.com/projects/install-on-linux/en/docs-7.1.0/reference/system-requirements.html#supported-gpus>`__, and `ROCm 6.4.0 <https://rocm.docs.amd.com/projects/install-on-linux/en/docs-6.4.0/reference/system-requirements.html#supported-gpus>`__.
.. [#tf-mi350-past-60] TensorFlow 2.17.1 is not supported on AMD Instinct MI350 Series GPUs. Use TensorFlow 2.19.1 or 2.18.1 with MI350 Series GPUs instead.
.. [#verl_compat-past-60] verl is only supported on ROCm 7.0.0 and 6.2.0.
.. [#stanford-megatron-lm_compat-past-60] Stanford Megatron-LM is only supported on ROCm 6.3.0.
.. [#dgl_compat-past-60] DGL is only supported on ROCm 7.0.0, ROCm 6.4.3 and ROCm 6.4.0.
.. [#megablocks_compat-past-60] Megablocks is only supported on ROCm 6.3.0.
.. [#ray_compat-past-60] Ray is only supported on ROCm 7.0.0 and 6.4.1.
.. [#llama-cpp_compat-past-60] llama.cpp is only supported on ROCm 7.0.0 and 6.4.x.
.. [#flashinfer_compat-past-60] FlashInfer is only supported on ROCm 6.4.1.
.. [#mi325x_KVM-past-60] For AMD Instinct MI325X KVM SR-IOV users, do not use AMD GPU Driver (amdgpu) 30.20.0.
.. [#driver_patch-past-60] AMD GPU Driver (amdgpu) 30.10.1 is a quality release that resolves an issue identified in the 30.10 release. There are no other significant changes or feature additions in ROCm 7.0.1 from ROCm 7.0.0. AMD GPU Driver (amdgpu) 30.10.1 is compatible with ROCm 7.0.1 and ROCm 7.0.0.
.. [#kfd_support-past-60] As of ROCm 6.4.0, forward and backward compatibility between the AMD GPU Driver (amdgpu) and its user space software is provided up to a year apart. For earlier ROCm releases, the compatibility is provided for +/- 2 releases. The supported user space versions on this page were accurate as of the time of initial ROCm release. For the most up-to-date information, see the latest version of this information at `User and AMD GPU Driver support matrix <https://rocm.docs.amd.com/projects/install-on-linux/en/latest/reference/user-kernel-space-compat-matrix.html>`_.
.. [#ROCT-rocr-past-60] Starting from ROCm 6.3.0, the ROCT Thunk Interface is included as part of the ROCr runtime package.

View File

@@ -2,7 +2,7 @@
.. meta::
:description: Deep Graph Library (DGL) compatibility
:keywords: GPU, DGL compatibility
:keywords: GPU, CPU, deep graph library, DGL, deep learning, framework compatibility
.. version-set:: rocm_version latest
@@ -10,215 +10,274 @@
DGL compatibility
********************************************************************************
Deep Graph Library `(DGL) <https://www.dgl.ai/>`_ is an easy-to-use, high-performance and scalable
Deep Graph Library (`DGL <https://www.dgl.ai/>`__) is an easy-to-use, high-performance, and scalable
Python package for deep learning on graphs. DGL is framework agnostic, meaning
if a deep graph model is a component in an end-to-end application, the rest of
that if a deep graph model is a component in an end-to-end application, the rest of
the logic is implemented using PyTorch.
* ROCm support for DGL is hosted in the `https://github.com/ROCm/dgl <https://github.com/ROCm/dgl>`_ repository.
* Due to independent compatibility considerations, this location differs from the `https://github.com/dmlc/dgl <https://github.com/dmlc/dgl>`_ upstream repository.
* Use the prebuilt :ref:`Docker images <dgl-docker-compat>` with DGL, PyTorch, and ROCm preinstalled.
* See the :doc:`ROCm DGL installation guide <rocm-install-on-linux:install/3rd-party/dgl-install>`
to install and get started.
DGL provides a high-performance graph object that can reside on either CPUs or GPUs.
It bundles structural data features for better control and provides a variety of functions
for computing with graph objects, including efficient and customizable message passing
primitives for Graph Neural Networks.
Supported devices
Support overview
================================================================================
- **Officially Supported**: TF32 with AMD Instinct MI300X (through hipblaslt)
- **Partially Supported**: TF32 with AMD Instinct MI250X
- The ROCm-supported version of DGL is maintained in the official `https://github.com/ROCm/dgl
<https://github.com/ROCm/dgl>`__ repository, which differs from the
`https://github.com/dmlc/dgl <https://github.com/dmlc/dgl>`__ upstream repository.
- To get started and install DGL on ROCm, use the prebuilt :ref:`Docker images <dgl-docker-compat>`,
which include ROCm, DGL, and all required dependencies.
.. _dgl-recommendations:
Use cases and recommendations
================================================================================
DGL can be used for Graph Learning, and building popular graph models like
GAT, GCN and GraphSage. Using these we can support a variety of use-cases such as:
- Recommender systems
- Network Optimization and Analysis
- 1D (Temporal) and 2D (Image) Classification
- Drug Discovery
Multiple use cases of DGL have been tested and verified.
However, a recommended example follows a drug discovery pipeline using the ``SE3Transformer``.
Refer to the `AMD ROCm blog <https://rocm.blogs.amd.com/>`_,
where you can search for DGL examples and best practices to optimize your training workflows on AMD GPUs.
Coverage includes:
- Single-GPU training/inference
- Multi-GPU training
- See the :doc:`ROCm DGL installation guide <rocm-install-on-linux:install/3rd-party/dgl-install>`
for installation and setup instructions.
- You can also consult the upstream `Installation guide <https://www.dgl.ai/pages/start.html>`__
for additional context.
.. _dgl-docker-compat:
Docker image compatibility
Compatibility matrix
================================================================================
.. |docker-icon| raw:: html
<i class="fab fa-docker"></i>
AMD validates and publishes `DGL images <https://hub.docker.com/r/rocm/dgl>`_
with ROCm and Pytorch backends on Docker Hub. The following Docker image tags and associated
inventories were tested on `ROCm 6.4.0 <https://repo.radeon.com/rocm/apt/6.4/>`_.
AMD validates and publishes `DGL images <https://hub.docker.com/r/rocm/dgl/tags>`__
with ROCm backends on Docker Hub. The following Docker image tags and associated
inventories represent the latest available DGL version from the official Docker Hub.
Click the |docker-icon| to view the image on Docker Hub.
.. list-table:: DGL Docker image components
.. list-table::
:header-rows: 1
:class: docker-image-compatibility
* - Docker
* - Docker image
- ROCm
- DGL
- PyTorch
- Ubuntu
- Python
- GPU
* - .. raw:: html
<a href="https://hub.docker.com/layers/rocm/dgl/dgl-2.4_rocm6.4_ubuntu24.04_py3.12_pytorch_release_2.6.0/images/sha256-8ce2c3bcfaa137ab94a75f9e2ea711894748980f57417739138402a542dd5564"><i class="fab fa-docker fa-lg"></i></a>
<a href="https://hub.docker.com/layers/rocm/dgl/dgl-2.4.0.amd0_rocm7.0.0_ubuntu24.04_py3.12_pytorch_2.8.0/images/sha256-943698ddf54c22a7bcad2e5b4ff467752e29e4ba6d0c926789ae7b242cbd92dd"><i class="fab fa-docker fa-lg"></i> rocm/dgl</a>
- `2.4.0 <https://github.com/dmlc/dgl/releases/tag/v2.4.0>`_
- `2.6.0 <https://github.com/ROCm/pytorch/tree/release/2.6>`_
- `7.0.0 <https://repo.radeon.com/rocm/apt/7.0/>`__
- `2.4.0 <https://github.com/dmlc/dgl/releases/tag/v2.4.0>`__
- `2.8.0 <https://github.com/pytorch/pytorch/releases/tag/v2.8.0>`__
- 24.04
- `3.12.9 <https://www.python.org/downloads/release/python-3129/>`_
- `3.12.9 <https://www.python.org/downloads/release/python-3129/>`__
- MI300X, MI250X
* - .. raw:: html
<a href="https://hub.docker.com/layers/rocm/dgl/dgl-2.4_rocm6.4_ubuntu24.04_py3.12_pytorch_release_2.4.1/images/sha256-cf1683283b8eeda867b690229c8091c5bbf1edb9f52e8fb3da437c49a612ebe4"><i class="fab fa-docker fa-lg"></i></a>
<a href="https://hub.docker.com/layers/rocm/dgl/dgl-2.4.0.amd0_rocm7.0.0_ubuntu24.04_py3.12_pytorch_2.6.0/images/sha256-b2ec286a035eb7d0a6aab069561914d21a3cac462281e9c024501ba5ccedfbf7"><i class="fab fa-docker fa-lg"></i> rocm/dgl</a>
- `2.4.0 <https://github.com/dmlc/dgl/releases/tag/v2.4.0>`_
- `2.4.1 <https://github.com/ROCm/pytorch/tree/release/2.4>`_
- `7.0.0 <https://repo.radeon.com/rocm/apt/7.0/>`__
- `2.4.0 <https://github.com/dmlc/dgl/releases/tag/v2.4.0>`__
- `2.6.0 <https://github.com/pytorch/pytorch/releases/tag/v2.6.0>`__
- 24.04
- `3.12.9 <https://www.python.org/downloads/release/python-3129/>`_
- `3.12.9 <https://www.python.org/downloads/release/python-3129/>`__
- MI300X, MI250X
* - .. raw:: html
<a href="https://hub.docker.com/layers/rocm/dgl/dgl-2.4_rocm6.4_ubuntu22.04_py3.10_pytorch_release_2.4.1/images/sha256-4834f178c3614e2d09e89e32041db8984c456d45dfd20286e377ca8635686554"><i class="fab fa-docker fa-lg"></i></a>
<a href="https://hub.docker.com/layers/rocm/dgl/dgl-2.4.0.amd0_rocm7.0.0_ubuntu22.04_py3.10_pytorch_2.7.1/images/sha256-d27aee16df922ccf0bcd9107bfcb6d20d34235445d456c637e33ca6f19d11a51"><i class="fab fa-docker fa-lg"></i> rocm/dgl</a>
- `2.4.0 <https://github.com/dmlc/dgl/releases/tag/v2.4.0>`_
- `2.4.1 <https://github.com/ROCm/pytorch/tree/release/2.4>`_
- `7.0.0 <https://repo.radeon.com/rocm/apt/7.0/>`__
- `2.4.0 <https://github.com/dmlc/dgl/releases/tag/v2.4.0>`__
- `2.7.1 <https://github.com/pytorch/pytorch/releases/tag/v2.7.1>`__
- 22.04
- `3.10.16 <https://www.python.org/downloads/release/python-31016/>`_
- `3.10.16 <https://www.python.org/downloads/release/python-31016/>`__
- MI300X, MI250X
* - .. raw:: html
<a href="https://hub.docker.com/layers/rocm/dgl/dgl-2.4_rocm6.4_ubuntu22.04_py3.10_pytorch_release_2.3.0/images/sha256-88740a2c8ab4084b42b10c3c6ba984cab33dd3a044f479c6d7618e2b2cb05e69"><i class="fab fa-docker fa-lg"></i></a>
<a href="https://hub.docker.com/layers/rocm/dgl/dgl-2.4.0.amd0_rocm6.4.3_ubuntu24.04_py3.12_pytorch_2.6.0/images/sha256-f3ba6a3c9ec9f6c1cde28449dc9780e0c4c16c4140f4b23f158565fbfd422d6b"><i class="fab fa-docker fa-lg"></i> rocm/dgl</a>
- `2.4.0 <https://github.com/dmlc/dgl/releases/tag/v2.4.0>`_
- `2.3.0 <https://github.com/ROCm/pytorch/tree/release/2.3>`_
- `6.4.3 <https://repo.radeon.com/rocm/apt/6.4.3/>`__
- `2.4.0 <https://github.com/dmlc/dgl/releases/tag/v2.4.0>`__
- `2.6.0 <https://github.com/pytorch/pytorch/releases/tag/v2.6.0>`__
- 24.04
- `3.12.9 <https://www.python.org/downloads/release/python-3129/>`__
- MI300X, MI250X
* - .. raw:: html
<a href="https://hub.docker.com/layers/rocm/dgl/dgl-2.4_rocm6.4_ubuntu24.04_py3.12_pytorch_release_2.6.0/images/sha256-8ce2c3bcfaa137ab94a75f9e2ea711894748980f57417739138402a542dd5564"><i class="fab fa-docker fa-lg"></i> rocm/dgl</a>
- `6.4.0 <https://repo.radeon.com/rocm/apt/6.4/>`__
- `2.4.0 <https://github.com/dmlc/dgl/releases/tag/v2.4.0>`__
- `2.6.0 <https://github.com/pytorch/pytorch/releases/tag/v2.6.0>`__
- 24.04
- `3.12.9 <https://www.python.org/downloads/release/python-3129/>`__
- MI300X, MI250X
* - .. raw:: html
<a href="https://hub.docker.com/layers/rocm/dgl/dgl-2.4_rocm6.4_ubuntu24.04_py3.12_pytorch_release_2.4.1/images/sha256-cf1683283b8eeda867b690229c8091c5bbf1edb9f52e8fb3da437c49a612ebe4"><i class="fab fa-docker fa-lg"></i> rocm/dgl</a>
- `6.4.0 <https://repo.radeon.com/rocm/apt/6.4/>`__
- `2.4.0 <https://github.com/dmlc/dgl/releases/tag/v2.4.0>`__
- `2.4.1 <https://github.com/pytorch/pytorch/releases/tag/v2.4.1>`__
- 24.04
- `3.12.9 <https://www.python.org/downloads/release/python-3129/>`__
- MI300X, MI250X
* - .. raw:: html
<a href="https://hub.docker.com/layers/rocm/dgl/dgl-2.4_rocm6.4_ubuntu22.04_py3.10_pytorch_release_2.4.1/images/sha256-4834f178c3614e2d09e89e32041db8984c456d45dfd20286e377ca8635686554"><i class="fab fa-docker fa-lg"></i> rocm/dgl</a>
- `6.4.0 <https://repo.radeon.com/rocm/apt/6.4/>`__
- `2.4.0 <https://github.com/dmlc/dgl/releases/tag/v2.4.0>`__
- `2.4.1 <https://github.com/pytorch/pytorch/releases/tag/v2.4.1>`__
- 22.04
- `3.10.16 <https://www.python.org/downloads/release/python-31016/>`_
- `3.10.16 <https://www.python.org/downloads/release/python-31016/>`__
- MI300X, MI250X
* - .. raw:: html
<a href="https://hub.docker.com/layers/rocm/dgl/dgl-2.4_rocm6.4_ubuntu22.04_py3.10_pytorch_release_2.3.0/images/sha256-88740a2c8ab4084b42b10c3c6ba984cab33dd3a044f479c6d7618e2b2cb05e69"><i class="fab fa-docker fa-lg"></i> rocm/dgl</a>
- `6.4.0 <https://repo.radeon.com/rocm/apt/6.4/>`__
- `2.4.0 <https://github.com/dmlc/dgl/releases/tag/v2.4.0>`__
- `2.3.0 <https://github.com/pytorch/pytorch/releases/tag/v2.3.0>`__
- 22.04
- `3.10.16 <https://www.python.org/downloads/release/python-31016/>`__
- MI300X, MI250X
.. _dgl-key-rocm-libraries:
Key ROCm libraries for DGL
================================================================================
DGL on ROCm depends on specific libraries that affect its features and performance.
Using the DGL Docker container or building it with the provided docker file or a ROCm base image is recommended.
Using the DGL Docker container or building it with the provided Docker file or a ROCm base image is recommended.
If you prefer to build it yourself, ensure the following dependencies are installed:
.. list-table::
:header-rows: 1
* - ROCm library
- Version
- ROCm 7.0.0 Version
- ROCm 6.4.x Version
- Purpose
* - `Composable Kernel <https://github.com/ROCm/composable_kernel>`_
- :version-ref:`"Composable Kernel" rocm_version`
- 1.1.0
- 1.1.0
- Enables faster execution of core operations like matrix multiplication
(GEMM), convolutions and transformations.
* - `hipBLAS <https://github.com/ROCm/hipBLAS>`_
- :version-ref:`hipBLAS rocm_version`
- 3.0.0
- 2.4.0
- Provides GPU-accelerated Basic Linear Algebra Subprograms (BLAS) for
matrix and vector operations.
* - `hipBLASLt <https://github.com/ROCm/hipBLASLt>`_
- :version-ref:`hipBLASLt rocm_version`
- 1.0.0
- 0.12.0
- hipBLASLt is an extension of the hipBLAS library, providing additional
features like epilogues fused into the matrix multiplication kernel or
use of integer tensor cores.
* - `hipCUB <https://github.com/ROCm/hipCUB>`_
- :version-ref:`hipCUB rocm_version`
- 4.0.0
- 3.4.0
- Provides a C++ template library for parallel algorithms for reduction,
scan, sort and select.
* - `hipFFT <https://github.com/ROCm/hipFFT>`_
- :version-ref:`hipFFT rocm_version`
- 1.0.20
- 1.0.18
- Provides GPU-accelerated Fast Fourier Transform (FFT) operations.
* - `hipRAND <https://github.com/ROCm/hipRAND>`_
- :version-ref:`hipRAND rocm_version`
- 3.0.0
- 2.12.0
- Provides fast random number generation for GPUs.
* - `hipSOLVER <https://github.com/ROCm/hipSOLVER>`_
- :version-ref:`hipSOLVER rocm_version`
- 3.0.0
- 2.4.0
- Provides GPU-accelerated solvers for linear systems, eigenvalues, and
singular value decompositions (SVD).
* - `hipSPARSE <https://github.com/ROCm/hipSPARSE>`_
- :version-ref:`hipSPARSE rocm_version`
- 4.0.1
- 3.2.0
- Accelerates operations on sparse matrices, such as sparse matrix-vector
or matrix-matrix products.
* - `hipSPARSELt <https://github.com/ROCm/hipSPARSELt>`_
- :version-ref:`hipSPARSELt rocm_version`
- 0.2.4
- 0.2.3
- Accelerates operations on sparse matrices, such as sparse matrix-vector
or matrix-matrix products.
* - `hipTensor <https://github.com/ROCm/hipTensor>`_
- :version-ref:`hipTensor rocm_version`
- 2.0.0
- 1.5.0
- Optimizes for high-performance tensor operations, such as contractions.
* - `MIOpen <https://github.com/ROCm/MIOpen>`_
- :version-ref:`MIOpen rocm_version`
- 3.5.0
- 3.4.0
- Optimizes deep learning primitives such as convolutions, pooling,
normalization, and activation functions.
* - `MIGraphX <https://github.com/ROCm/AMDMIGraphX>`_
- :version-ref:`MIGraphX rocm_version`
- 2.13.0
- 2.12.0
- Adds graph-level optimizations, ONNX models and mixed precision support
and enable Ahead-of-Time (AOT) Compilation.
* - `MIVisionX <https://github.com/ROCm/MIVisionX>`_
- :version-ref:`MIVisionX rocm_version`
- 3.3.0
- 3.2.0
- Optimizes acceleration for computer vision and AI workloads like
preprocessing, augmentation, and inferencing.
* - `rocAL <https://github.com/ROCm/rocAL>`_
- :version-ref:`rocAL rocm_version`
- 3.3.0
- 2.2.0
- Accelerates the data pipeline by offloading intensive preprocessing and
augmentation tasks. rocAL is part of MIVisionX.
* - `RCCL <https://github.com/ROCm/rccl>`_
- :version-ref:`RCCL rocm_version`
- 2.26.6
- 2.22.3
- Optimizes for multi-GPU communication for operations like AllReduce and
Broadcast.
* - `rocDecode <https://github.com/ROCm/rocDecode>`_
- :version-ref:`rocDecode rocm_version`
- 1.0.0
- 0.10.0
- Provides hardware-accelerated data decoding capabilities, particularly
for image, video, and other dataset formats.
* - `rocJPEG <https://github.com/ROCm/rocJPEG>`_
- :version-ref:`rocJPEG rocm_version`
- 1.1.0
- 0.8.0
- Provides hardware-accelerated JPEG image decoding and encoding.
* - `RPP <https://github.com/ROCm/RPP>`_
- :version-ref:`RPP rocm_version`
- 2.0.0
- 1.9.10
- Speeds up data augmentation, transformation, and other preprocessing steps.
* - `rocThrust <https://github.com/ROCm/rocThrust>`_
- :version-ref:`rocThrust rocm_version`
- 4.0.0
- 3.3.0
- Provides a C++ template library for parallel algorithms like sorting,
reduction, and scanning.
* - `rocWMMA <https://github.com/ROCm/rocWMMA>`_
- :version-ref:`rocWMMA rocm_version`
- 2.0.0
- 1.7.0
- Accelerates warp-level matrix-multiply and matrix-accumulate to speed up matrix
multiplication (GEMM) and accumulation operations with mixed precision
support.
.. _dgl-supported-features-latest:
Supported features
Supported features with ROCm 7.0.0
================================================================================
Many functions and methods available in DGL Upstream are also supported in DGL ROCm.
Many functions and methods available upstream are also supported in DGL on ROCm.
Instead of listing them all, support is grouped into the following categories to provide a general overview.
* DGL Base
* DGL Backend
* DGL Data
* DGL Dataloading
* DGL DGLGraph
* DGL Graph
* DGL Function
* DGL Ops
* DGL Sampling
@@ -230,26 +289,76 @@ Instead of listing them all, support is grouped into the following categories to
* DGL NN
* DGL Optim
* DGL Sparse
* GraphBolt
.. _dgl-unsupported-features-latest:
Unsupported features
Unsupported features with ROCm 7.0.0
================================================================================
* Graphbolt
* Partial TF32 Support (MI250x only)
* Kineto/ ROCTracer integration
* TF32 Support (only supported for PyTorch 2.7 and above)
* Kineto/ROCTracer integration
.. _dgl-unsupported-functions:
Unsupported functions
Unsupported functions with ROCm 7.0.0
================================================================================
* ``more_nnz``
* ``bfs``
* ``format``
* ``multiprocess_sparse_adam_state_dict``
* ``record_stream_ndarray``
* ``half_spmm``
* ``segment_mm``
* ``gather_mm_idx_b``
* ``pgexplainer``
* ``sample_labors_prob``
* ``sample_labors_noprob``
* ``sparse_admin``
.. _dgl-recommendations:
Use cases and recommendations
================================================================================
DGL can be used for Graph Learning, and building popular graph models like
GAT, GCN, and GraphSage. Using these models, a variety of use cases are supported:
- Recommender systems
- Network Optimization and Analysis
- 1D (Temporal) and 2D (Image) Classification
- Drug Discovery
For use cases and recommendations, refer to the `AMD ROCm blog <https://rocm.blogs.amd.com/>`__,
where you can search for DGL examples and best practices to optimize your workloads on AMD GPUs.
* Although multiple use cases of DGL have been tested and verified, a few have been
outlined in the `DGL in the Real World: Running GNNs on Real Use Cases
<https://rocm.blogs.amd.com/artificial-intelligence/dgl_blog2/README.html>`__ blog
post, which walks through four real-world graph neural network (GNN) workloads
implemented with the Deep Graph Library on ROCm. It covers tasks ranging from
heterogeneous e-commerce graphs and multiplex networks (GATNE) to molecular graph
regression (GNN-FiLM) and EEG-based neurological diagnosis (EEG-GCNN). For each use
case, the authors detail: the dataset and task, how DGL is used, and their experience
porting to ROCm. It is shown that DGL codebases often run without modification, with
seamless integration of graph operations, message passing, sampling, and convolution.
* The `Graph Neural Networks (GNNs) at Scale: DGL with ROCm on AMD Hardware
<https://rocm.blogs.amd.com/artificial-intelligence/why-graph-neural/README.html>`__
blog post introduces the Deep Graph Library (DGL) and its enablement on the AMD ROCm platform,
bringing high-performance graph neural network (GNN) training to AMD GPUs. DGL bridges
the gap between dense tensor frameworks and the irregular nature of graph data through a
graph-first, message-passing abstraction. Its design ensures scalability, flexibility, and
interoperability across frameworks like PyTorch and TensorFlow. AMDs ROCm integration
enables DGL to run efficiently on HIP-based GPUs, supported by prebuilt Docker containers
and open-source repositories. This marks a major step in AMD's mission to advance open,
scalable AI ecosystems beyond traditional architectures.
You can pre-process datasets and begin training on AMD GPUs through:
* Single-GPU training/inference
* Multi-GPU training
Previous versions
===============================================================================
See :doc:`rocm-install-on-linux:install/3rd-party/previous-versions/dgl-history` to find documentation for previous releases
of the ``ROCm/dgl`` Docker image.

View File

@@ -0,0 +1,98 @@
:orphan:
.. meta::
:description: FlashInfer compatibility
:keywords: GPU, LLM, FlashInfer, deep learning, framework compatibility
.. version-set:: rocm_version latest
********************************************************************************
FlashInfer compatibility
********************************************************************************
`FlashInfer <https://docs.flashinfer.ai/index.html>`__ is a library and kernel generator
for Large Language Models (LLMs) that provides a high-performance implementation of graphics
processing units (GPUs) kernels. FlashInfer focuses on LLM serving and inference, as well
as advanced performance across diverse scenarios.
FlashInfer features highly efficient attention kernels, load-balanced scheduling, and memory-optimized
techniques, while supporting customized attention variants. Its compatible with ``torch.compile``, and
offers high-performance LLM-specific operators, with easy integration through PyTorch, and C++ APIs.
.. note::
The ROCm port of FlashInfer is under active development, and some features are not yet available.
For the latest feature compatibility matrix, refer to the ``README`` of the
`https://github.com/ROCm/flashinfer <https://github.com/ROCm/flashinfer>`__ repository.
Support overview
================================================================================
- The ROCm-supported version of FlashInfer is maintained in the official `https://github.com/ROCm/flashinfer
<https://github.com/ROCm/flashinfer>`__ repository, which differs from the
`https://github.com/flashinfer-ai/flashinfer <https://github.com/flashinfer-ai/flashinfer>`__
upstream repository.
- To get started and install FlashInfer on ROCm, use the prebuilt :ref:`Docker images <flashinfer-docker-compat>`,
which include ROCm, FlashInfer, and all required dependencies.
- See the :doc:`ROCm FlashInfer installation guide <rocm-install-on-linux:install/3rd-party/flashinfer-install>`
for installation and setup instructions.
- You can also consult the upstream `Installation guide <https://docs.flashinfer.ai/installation.html>`__
for additional context.
.. _flashinfer-docker-compat:
Compatibility matrix
================================================================================
.. |docker-icon| raw:: html
<i class="fab fa-docker"></i>
AMD validates and publishes `FlashInfer images <https://hub.docker.com/r/rocm/flashinfer/tags>`__
with ROCm backends on Docker Hub. The following Docker image tag and associated
inventories represent the latest available FlashInfer version from the official Docker Hub.
Click |docker-icon| to view the image on Docker Hub.
.. list-table::
:header-rows: 1
:class: docker-image-compatibility
* - Docker image
- ROCm
- FlashInfer
- PyTorch
- Ubuntu
- Python
- GPU
* - .. raw:: html
<a href="https://hub.docker.com/layers/rocm/flashinfer/flashinfer-0.2.5_rocm6.4_ubuntu24.04_py3.12_pytorch2.7/images/sha256-558914838821c88c557fb6d42cfbc1bdb67d79d19759f37c764a9ee801f93313"><i class="fab fa-docker fa-lg"></i> rocm/flashinfer</a>
- `6.4.1 <https://repo.radeon.com/rocm/apt/6.4.1/>`__
- `v0.2.5 <https://github.com/flashinfer-ai/flashinfer/releases/tag/v0.2.5>`__
- `2.7.1 <https://github.com/ROCm/pytorch/releases/tag/v2.7.1>`__
- 24.04
- `3.12 <https://www.python.org/downloads/release/python-3129/>`__
- MI300X
.. _flashinfer-recommendations:
Use cases and recommendations
================================================================================
The release of FlashInfer on ROCm provides the decode functionality for LLM inferencing.
In the decode phase, tokens are generated sequentially, with the model predicting each new
token based on the previously generated tokens and the input context.
FlashInfer on ROCm brings over upstream features such as load balancing, sparse and dense
attention optimizations, and batching support, enabling efficient execution on AMD Instinct™ MI300X GPUs.
Because large LLMs often require substantial KV caches or long context windows, FlashInfer on ROCm
also implements cascade attention from upstream to reduce memory usage.
For currently supported use cases and recommendations, refer to the `AMD ROCm blog <https://rocm.blogs.amd.com/>`__,
where you can search for examples and best practices to optimize your workloads on AMD GPUs.

View File

@@ -2,7 +2,7 @@
.. meta::
:description: JAX compatibility
:keywords: GPU, JAX compatibility
:keywords: GPU, JAX, deep learning, framework compatibility
.. version-set:: rocm_version latest
@@ -10,42 +10,58 @@
JAX compatibility
*******************************************************************************
JAX provides a NumPy-like API, which combines automatic differentiation and the
Accelerated Linear Algebra (XLA) compiler to achieve high-performance machine
learning at scale.
`JAX <https://docs.jax.dev/en/latest/notebooks/thinking_in_jax.html>`__ is a library
for array-oriented numerical computation (similar to NumPy), with automatic differentiation
and just-in-time (JIT) compilation to enable high-performance machine learning research.
JAX uses composable transformations of Python and NumPy through just-in-time
(JIT) compilation, automatic vectorization, and parallelization. To learn about
JAX, including profiling and optimizations, see the official `JAX documentation
<https://jax.readthedocs.io/en/latest/notebooks/quickstart.html>`_.
JAX provides an API that combines automatic differentiation and the
Accelerated Linear Algebra (XLA) compiler to achieve high-performance machine
learning at scale. JAX uses composable transformations of Python and NumPy through
JIT compilation, automatic vectorization, and parallelization.
ROCm support for JAX is upstreamed, and users can build the official source code
with ROCm support:
Support overview
================================================================================
- ROCm JAX release:
- The ROCm-supported version of JAX is maintained in the official `https://github.com/ROCm/rocm-jax
<https://github.com/ROCm/rocm-jax>`__ repository, which differs from the
`https://github.com/jax-ml/jax <https://github.com/jax-ml/jax>`__ upstream repository.
- Offers AMD-validated and community :ref:`Docker images <jax-docker-compat>`
with ROCm and JAX preinstalled.
- To get started and install JAX on ROCm, use the prebuilt :ref:`Docker images <jax-docker-compat>`,
which include ROCm, JAX, and all required dependencies.
- ROCm JAX repository: `ROCm/rocm-jax <https://github.com/ROCm/rocm-jax>`_
- See the :doc:`ROCm JAX installation guide <rocm-install-on-linux:install/3rd-party/jax-install>`
for installation and setup instructions.
- See the :doc:`ROCm JAX installation guide <rocm-install-on-linux:install/3rd-party/jax-install>`
to get started.
- You can also consult the upstream `Installation guide <https://jax.readthedocs.io/en/latest/installation.html#amd-gpu-linux>`__
for additional context.
- Official JAX release:
Version support
--------------------------------------------------------------------------------
- Official JAX repository: `jax-ml/jax <https://github.com/jax-ml/jax>`_
AMD releases official `ROCm JAX Docker images <https://hub.docker.com/r/rocm/jax/tags>`_
quarterly alongside new ROCm releases. These images undergo full AMD testing.
`Community ROCm JAX Docker images <https://hub.docker.com/r/rocm/jax-community/tags>`_
follow upstream JAX releases and use the latest available ROCm version.
- See the `AMD GPU (Linux) installation section
<https://jax.readthedocs.io/en/latest/installation.html#amd-gpu-linux>`_ in
the JAX documentation.
JAX Plugin-PJRT with JAX/JAXLIB compatibility
================================================================================
.. note::
Portable JIT Runtime (PJRT) is an open, stable interface for device runtime and
compiler. The following table details the ROCm version compatibility matrix
between JAX PluginPJRT and JAX/JAXLIB.
AMD releases official `ROCm JAX Docker images <https://hub.docker.com/r/rocm/jax>`_
quarterly alongside new ROCm releases. These images undergo full AMD testing.
`Community ROCm JAX Docker images <https://hub.docker.com/r/rocm/jax-community>`_
follow upstream JAX releases and use the latest available ROCm version.
.. list-table::
:header-rows: 1
* - JAX Plugin-PJRT
- JAX/JAXLIB
- ROCm
* - 0.7.1
- 0.7.1
- 7.1.1, 7.1.0
* - 0.6.0
- 0.6.2, 0.6.0
- 7.0.2, 7.0.1, 7.0.0
Use cases and recommendations
================================================================================
@@ -71,7 +87,7 @@ Use cases and recommendations
* The `Distributed fine-tuning with JAX on AMD GPUs <https://rocm.blogs.amd.com/artificial-intelligence/distributed-sft-jax/README.html>`_
outlines the process of fine-tuning a Bidirectional Encoder Representations
from Transformers (BERT)-based large language model (LLM) using JAX for a text
classification task. The blog post discuss techniques for parallelizing the
classification task. The blog post discusses techniques for parallelizing the
fine-tuning across multiple AMD GPUs and assess the model's performance on a
holdout dataset. During the fine-tuning, a BERT-base-cased transformer model
and the General Language Understanding Evaluation (GLUE) benchmark dataset was
@@ -79,7 +95,7 @@ Use cases and recommendations
* The `MI300X workload optimization guide <https://rocm.docs.amd.com/en/latest/how-to/tuning-guides/mi300x/workload.html>`_
provides detailed guidance on optimizing workloads for the AMD Instinct MI300X
accelerator using ROCm. The page is aimed at helping users achieve optimal
GPU using ROCm. The page is aimed at helping users achieve optimal
performance for deep learning and other high-performance computing tasks on
the MI300X GPU.
@@ -90,75 +106,15 @@ For more use cases and recommendations, see `ROCm JAX blog posts <https://rocm.b
Docker image compatibility
================================================================================
.. |docker-icon| raw:: html
AMD validates and publishes `JAX images <https://hub.docker.com/r/rocm/jax/tags>`__
with ROCm backends on Docker Hub.
<i class="fab fa-docker"></i>
For ``jax-community`` images, see `rocm/jax-community
<https://hub.docker.com/r/rocm/jax-community/tags>`__ on Docker Hub.
AMD validates and publishes ready-made `ROCm JAX Docker images <https://hub.docker.com/r/rocm/jax>`_
with ROCm backends on Docker Hub. The following Docker image tags and
associated inventories represent the latest JAX version from the official Docker Hub and are validated for
`ROCm 6.4.2 <https://repo.radeon.com/rocm/apt/6.4.2/>`_. Click the |docker-icon|
icon to view the image on Docker Hub.
.. list-table:: JAX Docker image components
:header-rows: 1
* - Docker image
- JAX
- Linux
- Python
* - .. raw:: html
<a href="https://hub.docker.com/layers/rocm/jax/rocm6.4.2-jax0.4.35-py3.12/images/sha256-8918fa806a172c1a10eb2f57131eb31b5d7c8fa1656b8729fe7d3d736112de83"><i class="fab fa-docker fa-lg"></i> rocm/jax</a>
- `0.4.35 <https://github.com/ROCm/jax/releases/tag/rocm-jax-v0.4.35>`_
- Ubuntu 24.04
- `3.12.10 <https://www.python.org/downloads/release/python-31210/>`_
* - .. raw:: html
<a href="https://hub.docker.com/layers/rocm/jax/rocm6.4.2-jax0.4.35-py3.10/images/sha256-a394be13c67b7fc602216abee51233afd4b6cb7adaa57ca97e688fba82f9ad79"><i class="fab fa-docker fa-lg"></i> rocm/jax</a>
- `0.4.35 <https://github.com/ROCm/jax/releases/tag/rocm-jax-v0.4.35>`_
- Ubuntu 22.04
- `3.10.17 <https://www.python.org/downloads/release/python-31017/>`_
AMD publishes `Community ROCm JAX Docker images <https://hub.docker.com/r/rocm/jax-community>`_
with ROCm backends on Docker Hub. The following Docker image tags and
associated inventories are tested for `ROCm 6.3.2 <https://repo.radeon.com/rocm/apt/6.3.2/>`_.
.. list-table:: JAX community Docker image components
:header-rows: 1
* - Docker image
- JAX
- Linux
- Python
* - .. raw:: html
<a href="https://hub.docker.com/layers/rocm/jax-community/rocm6.3.2-jax0.5.0-py3.12.8/images/sha256-25dfaa0183e274bd0a3554a309af3249c6f16a1793226cb5373f418e39d3146a"><i class="fab fa-docker fa-lg"></i> rocm/jax-community</a>
- `0.5.0 <https://github.com/ROCm/jax/releases/tag/rocm-jax-v0.5.0>`_
- Ubuntu 22.04
- `3.12.8 <https://www.python.org/downloads/release/python-3128/>`_
* - .. raw:: html
<a href="https://hub.docker.com/layers/rocm/jax-community/rocm6.3.2-jax0.5.0-py3.11.11/images/sha256-ff9baeca9067d13e6c279c911e5a9e5beed0817d24fafd424367cc3d5bd381d7"><i class="fab fa-docker fa-lg"></i> rocm/jax-community</a>
- `0.5.0 <https://github.com/ROCm/jax/releases/tag/rocm-jax-v0.5.0>`_
- Ubuntu 22.04
- `3.11.11 <https://www.python.org/downloads/release/python-31111/>`_
* - .. raw:: html
<a href="https://hub.docker.com/layers/rocm/jax-community/rocm6.3.2-jax0.5.0-py3.10.16/images/sha256-8bab484be1713655f74da51a191ed824bb9d03db1104fd63530a1ac3c37cf7b1"><i class="fab fa-docker fa-lg"></i> rocm/jax-community</a>
- `0.5.0 <https://github.com/ROCm/jax/releases/tag/rocm-jax-v0.5.0>`_
- Ubuntu 22.04
- `3.10.16 <https://www.python.org/downloads/release/python-31016/>`_
To find the right image tag, see the :ref:`JAX on ROCm installation
documentation <rocm-install-on-linux:jax-docker-support>` for a list of
available ``rocm/jax`` images.
.. _key_rocm_libraries:
@@ -294,7 +250,7 @@ The ROCm supported data types in JAX are collected in the following table.
.. note::
JAX data type support is effected by the :ref:`key_rocm_libraries` and it's
JAX data type support is affected by the :ref:`key_rocm_libraries` and it's
collected on :doc:`ROCm data types and precision support <rocm:reference/precision-support>`
page.

View File

@@ -1,8 +1,8 @@
:orphan:
.. meta::
:description: llama.cpp deep learning framework compatibility
:keywords: GPU, GGML, llama.cpp compatibility
:description: llama.cpp compatibility
:keywords: GPU, GGML, llama.cpp, deep learning, framework compatibility
.. version-set:: rocm_version latest
@@ -16,37 +16,230 @@ for Large Language Model (LLM) inference that runs on both central processing un
a simple, dependency-free setup.
The framework supports multiple quantization options, from 1.5-bit to 8-bit integers,
to speed up inference and reduce memory usage. Originally built as a CPU-first library,
to accelerate inference and reduce memory usage. Originally built as a CPU-first library,
llama.cpp is easy to integrate with other programming environments and is widely
adopted across diverse platforms, including consumer devices.
ROCm support for llama.cpp is upstreamed, and you can build the official source code
with ROCm support:
- ROCm support for llama.cpp is hosted in the official `https://github.com/ROCm/llama.cpp
<https://github.com/ROCm/llama.cpp>`_ repository.
- Due to independent compatibility considerations, this location differs from the
`https://github.com/ggml-org/llama.cpp <https://github.com/ggml-org/llama.cpp>`_ upstream repository.
- To install llama.cpp, use the prebuilt :ref:`Docker image <llama-cpp-docker-compat>`,
which includes ROCm, llama.cpp, and all required dependencies.
- See the :doc:`ROCm llama.cpp installation guide <rocm-install-on-linux:install/3rd-party/llama-cpp-install>`
to install and get started.
- See the `Installation guide <https://github.com/ggml-org/llama.cpp/blob/master/docs/build.md#hip>`__
in the upstream llama.cpp documentation.
.. note::
llama.cpp is supported on ROCm 6.4.0.
Supported devices
Support overview
================================================================================
**Officially Supported**: AMD Instinct™ MI300X, MI210
- The ROCm-supported version of llama.cpp is maintained in the official `https://github.com/ROCm/llama.cpp
<https://github.com/ROCm/llama.cpp>`__ repository, which differs from the
`https://github.com/ggml-org/llama.cpp <https://github.com/ggml-org/llama.cpp>`__ upstream repository.
- To get started and install llama.cpp on ROCm, use the prebuilt :ref:`Docker images <llama-cpp-docker-compat>`,
which include ROCm, llama.cpp, and all required dependencies.
- See the :doc:`ROCm llama.cpp installation guide <rocm-install-on-linux:install/3rd-party/llama-cpp-install>`
for installation and setup instructions.
- You can also consult the upstream `Installation guide <https://github.com/ggml-org/llama.cpp/blob/master/docs/build.md>`__
for additional context.
.. _llama-cpp-docker-compat:
Compatibility matrix
================================================================================
.. |docker-icon| raw:: html
<i class="fab fa-docker"></i>
AMD validates and publishes `llama.cpp images <https://hub.docker.com/r/rocm/llama.cpp/tags>`__
with ROCm backends on Docker Hub. The following Docker image tags and associated
inventories represent the latest available llama.cpp versions from the official Docker Hub.
Click |docker-icon| to view the image on Docker Hub.
.. important::
Tag endings of ``_full``, ``_server``, and ``_light`` serve different purposes for entrypoints as follows:
- Full: This image includes both the main executable file and the tools to convert ``LLaMA`` models into ``ggml`` and convert into 4-bit quantization.
- Server: This image only includes the server executable file.
- Light: This image only includes the main executable file.
.. list-table::
:header-rows: 1
:class: docker-image-compatibility
* - Full Docker
- Server Docker
- Light Docker
- llama.cpp
- ROCm
- Ubuntu
- GPU
* - .. raw:: html
<a href="https://hub.docker.com/layers/rocm/llama.cpp/llama.cpp-b6652.amd0_rocm7.0.0_ubuntu24.04_full/images/sha256-a94f0c7a598cc6504ff9e8371c016d7a2f93e69bf54a36c870f9522567201f10g"><i class="fab fa-docker fa-lg"></i> rocm/llama.cpp</a>
- .. raw:: html
<a href="https://hub.docker.com/layers/rocm/llama.cpp/llama.cpp-b6652.amd0_rocm7.0.0_ubuntu24.04_server/images/sha256-be175932c3c96e882dfbc7e20e0e834f58c89c2925f48b222837ee929dfc47ee"><i class="fab fa-docker fa-lg"></i> rocm/llama.cpp</a>
- .. raw:: html
<a href="https://hub.docker.com/layers/rocm/llama.cpp/llama.cpp-b6652.amd0_rocm7.0.0_ubuntu24.04_light/images/sha256-d8ba0c70603da502c879b1f8010b439c8e7fa9f6cbdac8bbbbbba97cb41ebc9e"><i class="fab fa-docker fa-lg"></i> rocm/llama.cpp</a>
- `b6652 <https://github.com/ROCm/llama.cpp/tree/release/b6652>`__
- `7.0.0 <https://repo.radeon.com/rocm/apt/7.0/>`__
- 24.04
- MI325X, MI300X, MI210
* - .. raw:: html
<a href="https://hub.docker.com/layers/rocm/llama.cpp/llama.cpp-b6652.amd0_rocm7.0.0_ubuntu22.04_full/images/sha256-37582168984f25dce636cc7288298e06d94472ea35f65346b3541e6422b678ee"><i class="fab fa-docker fa-lg"></i> rocm/llama.cpp</a>
- .. raw:: html
<a href="https://hub.docker.com/layers/rocm/llama.cpp/llama.cpp-b6652.amd0_rocm7.0.0_ubuntu22.04_server/images/sha256-7e70578e6c3530c6591cc2c26da24a9ee68a20d318e12241de93c83224f83720"><i class="fab fa-docker fa-lg"></i> rocm/llama.cpp</a>
- .. raw:: html
<a href="https://hub.docker.com/layers/rocm/llama.cpp/llama.cpp-b6652.amd0_rocm7.0.0_ubuntu22.04_light/images/sha256-9a5231acf88b4a229677bc2c636ea3fe78a7a80f558bd80910b919855de93ad5"><i class="fab fa-docker fa-lg"></i> rocm/llama.cpp</a>
- `b6652 <https://github.com/ROCm/llama.cpp/tree/release/b6652>`__
- `7.0.0 <https://repo.radeon.com/rocm/apt/7.0/>`__
- 22.04
- MI325X, MI300X, MI210
* - .. raw:: html
<a href="https://hub.docker.com/layers/rocm/llama.cpp/llama.cpp-b6356_rocm6.4.3_ubuntu24.04_full/images/sha256-5960fc850024a8a76451f9eaadd89b7e59981ae9f393b407310c1ddf18892577"><i class="fab fa-docker fa-lg"></i> rocm/llama.cpp</a>
- .. raw:: html
<a href="https://hub.docker.com/layers/rocm/llama.cpp/llama.cpp-b6356_rocm6.4.3_ubuntu24.04_server/images/sha256-1b79775d9f546065a6aaf9ca426e1dd4ed4de0b8f6ee83687758cc05af6538e6"><i class="fab fa-docker fa-lg"></i> rocm/llama.cpp</a>
- .. raw:: html
<a href="https://hub.docker.com/layers/rocm/llama.cpp/llama.cpp-b6356_rocm6.4.3_ubuntu24.04_light/images/sha256-8f863c4c2857ae42bebd64e4f1a0a1e7cc3ec4503f243e32b4a4dcad070ec361"><i class="fab fa-docker fa-lg"></i> rocm/llama.cpp</a>
- `b6356 <https://github.com/ROCm/llama.cpp/tree/release/b6356>`__
- `6.4.3 <https://repo.radeon.com/rocm/apt/6.4.3/>`__
- 24.04
- MI325X, MI300X, MI210
* - .. raw:: html
<a href="https://hub.docker.com/layers/rocm/llama.cpp/llama.cpp-b6356_rocm6.4.3_ubuntu22.04_full/images/sha256-888879b3ee208f9247076d7984524b8d1701ac72611689e89854a1588bec9867"><i class="fab fa-docker fa-lg"></i> rocm/llama.cpp</a>
- .. raw:: html
<a href="https://hub.docker.com/layers/rocm/llama.cpp/llama.cpp-b6356_rocm6.4.3_ubuntu22.04_server/images/sha256-90e4ff99a66743e33fd00728cd71a768588e5f5ef355aaa196669fe65ac70672"><i class="fab fa-docker fa-lg"></i> rocm/llama.cpp</a>
- .. raw:: html
<a href="https://hub.docker.com/layers/rocm/llama.cpp/llama.cpp-b6356_rocm6.4.3_ubuntu22.04_light/images/sha256-bd447a049939cb99054f8fbf3f2352870fe906a75e2dc3339c845c08b9c53f9b"><i class="fab fa-docker fa-lg"></i> rocm/llama.cpp</a>
- `b6356 <https://github.com/ROCm/llama.cpp/tree/release/b6356>`__
- `6.4.3 <https://repo.radeon.com/rocm/apt/6.4.3/>`__
- 22.04
- MI325X, MI300X, MI210
* - .. raw:: html
<a href="https://hub.docker.com/layers/rocm/llama.cpp/llama.cpp-b6356_rocm6.4.2_ubuntu24.04_full/images/sha256-5b3a1bc4889c1fcade434b937fbf9cc1c22ff7dc0317c130339b0c9238bc88c4"><i class="fab fa-docker fa-lg"></i> rocm/llama.cpp</a>
- .. raw:: html
<a href="https://hub.docker.com/layers/rocm/llama.cpp/llama.cpp-b6356_rocm6.4.2_ubuntu24.04_server/images/sha256-5228ff99d0f627a9032d668f4381b2e80dc1e301adc3e0821f26d8354b175271"><i class="fab fa-docker fa-lg"></i> rocm/llama.cpp</a>
- .. raw:: html
<a href="https://hub.docker.com/layers/rocm/llama.cpp/llama.cpp-b6356_rocm6.4.2_ubuntu24.04_light/images/sha256-b12723b332a826a89b7252dddf868cbe4d1a869562fc4aa4032f59e1a683b968"><i class="fab fa-docker fa-lg"></i> rocm/llama.cpp</a>
- `b6356 <https://github.com/ROCm/llama.cpp/tree/release/b6356>`__
- `6.4.2 <https://repo.radeon.com/rocm/apt/6.4.2/>`__
- 24.04
- MI325X, MI300X, MI210
* - .. raw:: html
<a href="https://hub.docker.com/layers/rocm/llama.cpp/llama.cpp-b6356_rocm6.4.2_ubuntu22.04_full/images/sha256-cd6e21a6a73f59b35dd5309b09dd77654a94d783bf13a55c14eb8dbf8e9c2615"><i class="fab fa-docker fa-lg"></i> rocm/llama.cpp</a>
- .. raw:: html
<a href="https://hub.docker.com/layers/rocm/llama.cpp/llama.cpp-b6356_rocm6.4.2_ubuntu22.04_server/images/sha256-c2b4689ab2c47e6626e8fea22d7a63eb03d47c0fde9f5ef8c9f158d15c423e58"><i class="fab fa-docker fa-lg"></i> rocm/llama.cpp</a>
- .. raw:: html
<a href="https://hub.docker.com/layers/rocm/llama.cpp/llama.cpp-b6356_rocm6.4.2_ubuntu22.04_light/images/sha256-1acc28f29ed87db9cbda629cb29e1989b8219884afe05f9105522be929e94da4"><i class="fab fa-docker fa-lg"></i> rocm/llama.cpp</a>
- `b6356 <https://github.com/ROCm/llama.cpp/tree/release/b6356>`__
- `6.4.2 <https://repo.radeon.com/rocm/apt/6.4.2/>`__
- 22.04
- MI325X, MI300X, MI210
* - .. raw:: html
<a href="https://hub.docker.com/layers/rocm/llama.cpp/llama.cpp-b6356_rocm6.4.1_ubuntu24.04_full/images/sha256-2f8ae8a44510d96d52dea6cb398b224f7edeb7802df7ec488c6f63d206b3cdc9"><i class="fab fa-docker fa-lg"></i> rocm/llama.cpp</a>
- .. raw:: html
<a href="https://hub.docker.com/layers/rocm/llama.cpp/llama.cpp-b6356_rocm6.4.1_ubuntu24.04_server/images/sha256-fece497ff9f4a28b12f645de52766941da8ead8471aa1ea84b61d4b4568e51f2"><i class="fab fa-docker fa-lg"></i> rocm/llama.cpp</a>
- .. raw:: html
<a href="https://hub.docker.com/layers/rocm/llama.cpp/llama.cpp-b6356_rocm6.4.1_ubuntu24.04_light/images/sha256-3e14352fa6f8c6128b23cf9342531c20dbfb522550b626e09d83b260a1947022"><i class="fab fa-docker fa-lg"></i> rocm/llama.cpp</a>
- `b6356 <https://github.com/ROCm/llama.cpp/tree/release/b6356>`__
- `6.4.1 <https://repo.radeon.com/rocm/apt/6.4.1/>`__
- 24.04
- MI325X, MI300X, MI210
* - .. raw:: html
<a href="https://hub.docker.com/layers/rocm/llama.cpp/llama.cpp-b6356_rocm6.4.1_ubuntu22.04_full/images/sha256-80763062ef0bec15038c35fd01267f1fc99a5dd171d4b48583cc668b15efad69"><i class="fab fa-docker fa-lg"></i> rocm/llama.cpp</a>
- .. raw:: html
<a href="https://hub.docker.com/layers/rocm/llama.cpp/llama.cpp-b6356_rocm6.4.1_ubuntu22.04_server/images/sha256-db2a6c957555ed83b819bbc54aea884a93192da0fb512dae63d32e0dc4e8ab8f"><i class="fab fa-docker fa-lg"></i> rocm/llama.cpp</a>
- .. raw:: html
<a href="https://hub.docker.com/layers/rocm/llama.cpp/llama.cpp-b6356_rocm6.4.1_ubuntu22.04_light/images/sha256-c6dbb07cc655fb079d5216e4b77451cb64a9daa0585d23b6fb8b32cb22021197"><i class="fab fa-docker fa-lg"></i> rocm/llama.cpp</a>
- `b6356 <https://github.com/ROCm/llama.cpp/tree/release/b6356>`__
- `6.4.1 <https://repo.radeon.com/rocm/apt/6.4.1/>`__
- 22.04
- MI325X, MI300X, MI210
* - .. raw:: html
<a href="https://hub.docker.com/layers/rocm/llama.cpp/llama.cpp-b5997_rocm6.4.0_ubuntu24.04_full/images/sha256-f78f6c81ab2f8e957469415fe2370a1334fe969c381d1fe46050c85effaee9d5"><i class="fab fa-docker fa-lg"></i> rocm/llama.cpp</a>
- .. raw:: html
<a href="https://hub.docker.com/layers/rocm/llama.cpp/llama.cpp-b5997_rocm6.4.0_ubuntu24.04_server/images/sha256-275ad9e18f292c26a00a2de840c37917e98737a88a3520bdc35fd3fc5c9a6a9b"><i class="fab fa-docker fa-lg"></i> rocm/llama.cpp</a>
- .. raw:: html
<a href="https://hub.docker.com/layers/rocm/llama.cpp/llama.cpp-b5997_rocm6.4.0_ubuntu24.04_light/images/sha256-cc324e6faeedf0e400011f07b49d2dc41a16bae257b2b7befa0f4e2e97231320"><i class="fab fa-docker fa-lg"></i> rocm/llama.cpp</a>
- `b5997 <https://github.com/ROCm/llama.cpp/tree/release/b5997>`__
- `6.4.0 <https://repo.radeon.com/rocm/apt/6.4/>`__
- 24.04
- MI300X, MI210
.. _llama-cpp-key-rocm-libraries:
Key ROCm libraries for llama.cpp
================================================================================
llama.cpp functionality on ROCm is determined by its underlying library
dependencies. These ROCm components affect the capabilities, performance, and
feature set available to developers. Ensure you have the required libraries for
your corresponding ROCm version.
.. list-table::
:header-rows: 1
* - ROCm library
- ROCm 7.0.0 version
- ROCm 6.4.x version
- Purpose
- Usage
* - `hipBLAS <https://github.com/ROCm/hipBLAS>`__
- 3.0.0
- 2.4.0
- Provides GPU-accelerated Basic Linear Algebra Subprograms (BLAS) for
matrix and vector operations.
- Supports operations such as matrix multiplication, matrix-vector
products, and tensor contractions. Utilized in both dense and batched
linear algebra operations.
* - `hipBLASLt <https://github.com/ROCm/hipBLASLt>`__
- 1.0.0
- 0.12.0
- hipBLASLt is an extension of the hipBLAS library, providing additional
features like epilogues fused into the matrix multiplication kernel or
use of integer tensor cores.
- By setting the flag ``ROCBLAS_USE_HIPBLASLT``, you can dispatch hipblasLt
kernels where possible.
* - `rocWMMA <https://github.com/ROCm/rocWMMA>`__
- 2.0.0
- 1.7.0
- Accelerates warp-level matrix-multiply and matrix-accumulate to speed up matrix
multiplication (GEMM) and accumulation operations with mixed precision
support.
- Can be used to enhance the flash attention performance on AMD compute, by enabling
the flag during compile time.
.. _llama-cpp-uses-recommendations:
Use cases and recommendations
================================================================================
@@ -70,87 +263,13 @@ llama.cpp is also used in a range of real-world applications, including:
For more use cases and recommendations, refer to the `AMD ROCm blog <https://rocm.blogs.amd.com/>`__,
where you can search for llama.cpp examples and best practices to optimize your workloads on AMD GPUs.
- The `Llama.cpp Meets Instinct: A New Era of Open-Source AI Acceleration <https://rocm.blogs.amd.com/ecosystems-and-partners/llama-cpp/README.html>`__,
- The `Llama.cpp Meets Instinct: A New Era of Open-Source AI Acceleration <https://rocm.blogs.amd.com/ecosystems-and-partners/llama-cpp/README.html>`__
blog post outlines how the open-source llama.cpp framework enables efficient LLM inference—including interactive inference with ``llama-cli``,
server deployment with ``llama-server``, GGUF model preparation and quantization, performance benchmarking, and optimizations tailored for
AMD Instinct GPUs within the ROCm ecosystem.
.. _llama-cpp-docker-compat:
Docker image compatibility
================================================================================
.. |docker-icon| raw:: html
<i class="fab fa-docker"></i>
AMD validates and publishes `ROCm llama.cpp Docker images <https://hub.docker.com/r/rocm/llama.cpp>`__
with ROCm backends on Docker Hub. The following Docker image tags and associated
inventories were tested on `ROCm 6.4.0 <https://repo.radeon.com/rocm/apt/6.4/>`__.
Click |docker-icon| to view the image on Docker Hub.
.. important::
Tag endings of ``_full``, ``_server``, and ``_light`` serve different purposes for entrypoints as follows:
- Full: This image includes both the main executable file and the tools to convert ``LLaMA`` models into ``ggml`` and convert into 4-bit quantization.
- Server: This image only includes the server executable file.
- Light: This image only includes the main executable file.
.. list-table::
:header-rows: 1
:class: docker-image-compatibility
* - Full Docker
- Server Docker
- Light Docker
- llama.cpp
- Ubuntu
* - .. raw:: html
<a href="https://hub.docker.com/layers/rocm/llama.cpp/llama.cpp-b5997_rocm6.4.0_ubuntu24.04_full/images/sha256-f78f6c81ab2f8e957469415fe2370a1334fe969c381d1fe46050c85effaee9d5"><i class="fab fa-docker fa-lg"></i> rocm/llama.cpp</a>
- .. raw:: html
<a href="https://hub.docker.com/layers/rocm/llama.cpp/llama.cpp-b5997_rocm6.4.0_ubuntu24.04_server/images/sha256-275ad9e18f292c26a00a2de840c37917e98737a88a3520bdc35fd3fc5c9a6a9b"><i class="fab fa-docker fa-lg"></i> rocm/llama.cpp</a>
- .. raw:: html
<a href="https://hub.docker.com/layers/rocm/llama.cpp/llama.cpp-b5997_rocm6.4.0_ubuntu24.04_light/images/sha256-cc324e6faeedf0e400011f07b49d2dc41a16bae257b2b7befa0f4e2e97231320"><i class="fab fa-docker fa-lg"></i> rocm/llama.cpp</a>
- `b5997 <https://github.com/ROCm/llama.cpp/tree/release/b5997>`__
- 24.04
Key ROCm libraries for llama.cpp
================================================================================
llama.cpp functionality on ROCm is determined by its underlying library
dependencies. These ROCm components affect the capabilities, performance, and
feature set available to developers.
.. list-table::
:header-rows: 1
* - ROCm library
- Version
- Purpose
- Usage
* - `hipBLAS <https://github.com/ROCm/hipBLAS>`__
- :version-ref:`hipBLAS rocm_version`
- Provides GPU-accelerated Basic Linear Algebra Subprograms (BLAS) for
matrix and vector operations.
- Supports operations such as matrix multiplication, matrix-vector
products, and tensor contractions. Utilized in both dense and batched
linear algebra operations.
* - `hipBLASLt <https://github.com/ROCm/hipBLASLt>`__
- :version-ref:`hipBLASLt rocm_version`
- hipBLASLt is an extension of the hipBLAS library, providing additional
features like epilogues fused into the matrix multiplication kernel or
use of integer tensor cores.
- By setting the flag ``ROCBLAS_USE_HIPBLASLT``, you can dispatch hipblasLt
kernels where possible.
* - `rocWMMA <https://github.com/ROCm/rocWMMA>`__
- :version-ref:`rocWMMA rocm_version`
- Accelerates warp-level matrix-multiply and matrix-accumulate to speed up matrix
multiplication (GEMM) and accumulation operations with mixed precision
support.
- Can be used to enhance the flash attention performance on AMD compute, by enabling
the flag during compile time.
Previous versions
===============================================================================
See :doc:`rocm-install-on-linux:install/3rd-party/previous-versions/llama-cpp-history` to find documentation for previous releases
of the ``ROCm/llama.cpp`` Docker image.

View File

@@ -2,7 +2,7 @@
.. meta::
:description: Megablocks compatibility
:keywords: GPU, megablocks, compatibility
:keywords: GPU, megablocks, deep learning, framework compatibility
.. version-set:: rocm_version latest
@@ -10,64 +10,41 @@
Megablocks compatibility
********************************************************************************
Megablocks is a light-weight library for mixture-of-experts (MoE) training.
`Megablocks <https://github.com/databricks/megablocks>`__ is a lightweight library
for mixture-of-experts `(MoE) <https://huggingface.co/blog/moe>`__ training.
The core of the system is efficient "dropless-MoE" and standard MoE layers.
Megablocks is integrated with `https://github.com/stanford-futuredata/Megatron-LM <https://github.com/stanford-futuredata/Megatron-LM>`_,
Megablocks is integrated with `https://github.com/stanford-futuredata/Megatron-LM
<https://github.com/stanford-futuredata/Megatron-LM>`__,
where data and pipeline parallel training of MoEs is supported.
* ROCm support for Megablocks is hosted in the official `https://github.com/ROCm/megablocks <https://github.com/ROCm/megablocks>`_ repository.
* Due to independent compatibility considerations, this location differs from the `https://github.com/stanford-futuredata/Megatron-LM <https://github.com/stanford-futuredata/Megatron-LM>`_ upstream repository.
* Use the prebuilt :ref:`Docker image <megablocks-docker-compat>` with ROCm, PyTorch, and Megablocks preinstalled.
* See the :doc:`ROCm Megablocks installation guide <rocm-install-on-linux:install/3rd-party/megablocks-install>` to install and get started.
.. note::
Megablocks is supported on ROCm 6.3.0.
Supported devices
Support overview
================================================================================
- **Officially Supported**: AMD Instinct MI300X
- **Partially Supported** (functionality or performance limitations): AMD Instinct MI250X, MI210X
- The ROCm-supported version of Megablocks is maintained in the official `https://github.com/ROCm/megablocks
<https://github.com/ROCm/megablocks>`__ repository, which differs from the
`https://github.com/stanford-futuredata/Megatron-LM <https://github.com/stanford-futuredata/Megatron-LM>`__ upstream repository.
Supported models and features
================================================================================
- To get started and install Megablocks on ROCm, use the prebuilt :ref:`Docker image <megablocks-docker-compat>`,
which includes ROCm, Megablocks, and all required dependencies.
This section summarizes the Megablocks features supported by ROCm.
* Distributed Pre-training
* Activation Checkpointing and Recomputation
* Distributed Optimizer
* Mixture-of-Experts
* dropless-Mixture-of-Experts
.. _megablocks-recommendations:
Use cases and recommendations
================================================================================
The `ROCm Megablocks blog posts <https://rocm.blogs.amd.com/artificial-intelligence/megablocks/README.html>`_
guide how to leverage the ROCm platform for pre-training using the Megablocks framework.
It features how to pre-process datasets and how to begin pre-training on AMD GPUs through:
* Single-GPU pre-training
* Multi-GPU pre-training
- See the :doc:`ROCm Megablocks installation guide <rocm-install-on-linux:install/3rd-party/megablocks-install>`
for installation and setup instructions.
- You can also consult the upstream `Installation guide <https://github.com/databricks/megablocks>`__
for additional context.
.. _megablocks-docker-compat:
Docker image compatibility
Compatibility matrix
================================================================================
.. |docker-icon| raw:: html
<i class="fab fa-docker"></i>
AMD validates and publishes `ROCm Megablocks images <https://hub.docker.com/r/rocm/megablocks/tags>`_
with ROCm and Pytorch backends on Docker Hub. The following Docker image tags and associated
inventories represent the latest Megatron-LM version from the official Docker Hub.
The Docker images have been validated for `ROCm 6.3.0 <https://repo.radeon.com/rocm/apt/6.3/>`_.
AMD validates and publishes `Megablocks images <https://hub.docker.com/r/rocm/megablocks/tags>`__
with ROCm backends on Docker Hub. The following Docker image tag and associated
inventories represent the latest available Megablocks version from the official Docker Hub.
Click |docker-icon| to view the image on Docker Hub.
.. list-table::
@@ -80,6 +57,7 @@ Click |docker-icon| to view the image on Docker Hub.
- PyTorch
- Ubuntu
- Python
- GPU
* - .. raw:: html
@@ -89,5 +67,38 @@ Click |docker-icon| to view the image on Docker Hub.
- `2.4.0 <https://github.com/ROCm/pytorch/tree/release/2.4>`_
- 24.04
- `3.12.9 <https://www.python.org/downloads/release/python-3129/>`_
- MI300X
Supported models and features with ROCm 6.3.0
================================================================================
This section summarizes the Megablocks features supported by ROCm.
* Distributed Pre-training
* Activation Checkpointing and Recomputation
* Distributed Optimizer
* Mixture-of-Experts
* dropless-Mixture-of-Experts
.. _megablocks-recommendations:
Use cases and recommendations
================================================================================
* The `Efficient MoE training on AMD ROCm: How-to use Megablocks on AMD GPUs
<https://rocm.blogs.amd.com/artificial-intelligence/megablocks/README.html>`__
blog post guides how to leverage the ROCm platform for pre-training using the
Megablocks framework. It introduces a streamlined approach for training Mixture-of-Experts
(MoE) models using the Megablocks library on AMD hardware. Focusing on GPT-2, it
demonstrates how block-sparse computations can enhance scalability and efficiency in MoE
training. The guide provides step-by-step instructions for setting up the environment,
including cloning the repository, building the Docker image, and running the training container.
Additionally, it offers insights into utilizing the ``oscar-1GB.json`` dataset for pre-training
language models. By leveraging Megablocks and the ROCm platform, you can optimize your MoE
training workflows for large-scale transformer models.
It features how to pre-process datasets and how to begin pre-training on AMD GPUs through:
* Single-GPU pre-training
* Multi-GPU pre-training

View File

@@ -2,7 +2,7 @@
.. meta::
:description: PyTorch compatibility
:keywords: GPU, PyTorch compatibility
:keywords: GPU, PyTorch, deep learning, framework compatibility
.. version-set:: rocm_version latest
@@ -15,40 +15,42 @@ deep learning. PyTorch on ROCm provides mixed-precision and large-scale training
using `MIOpen <https://github.com/ROCm/MIOpen>`__ and
`RCCL <https://github.com/ROCm/rccl>`__ libraries.
ROCm support for PyTorch is upstreamed into the official PyTorch repository. Due
to independent compatibility considerations, this results in two distinct
release cycles for PyTorch on ROCm:
PyTorch provides two high-level features:
- ROCm PyTorch release:
- Tensor computation (like NumPy) with strong GPU acceleration
- Provides the latest version of ROCm but might not necessarily support the
latest stable PyTorch version.
- Deep neural networks built on a tape-based autograd system (rapid computation
of multiple partial derivatives or gradients)
- Offers :ref:`Docker images <pytorch-docker-compat>` with ROCm and PyTorch
preinstalled.
Support overview
================================================================================
- ROCm PyTorch repository: `<https://github.com/ROCm/pytorch>`__
ROCm support for PyTorch is upstreamed into the official PyTorch repository.
ROCm development is aligned with the stable release of PyTorch, while upstream
PyTorch testing uses the stable release of ROCm to maintain consistency:
- See the :doc:`ROCm PyTorch installation guide <rocm-install-on-linux:install/3rd-party/pytorch-install>`
to get started.
- The ROCm-supported version of PyTorch is maintained in the official `https://github.com/ROCm/pytorch
<https://github.com/ROCm/pytorch>`__ repository, which differs from the
`https://github.com/pytorch/pytorch <https://github.com/pytorch/pytorch>`__ upstream repository.
- Official PyTorch release:
- To get started and install PyTorch on ROCm, use the prebuilt :ref:`Docker images <pytorch-docker-compat>`,
which include ROCm, PyTorch, and all required dependencies.
- Provides the latest stable version of PyTorch but might not necessarily
support the latest ROCm version.
- See the :doc:`ROCm PyTorch installation guide <rocm-install-on-linux:install/3rd-party/pytorch-install>`
for installation and setup instructions.
- Official PyTorch repository: `<https://github.com/pytorch/pytorch>`__
- See the `Nightly and latest stable version installation guide <https://pytorch.org/get-started/locally/>`__
or `Previous versions <https://pytorch.org/get-started/previous-versions/>`__
to get started.
- You can also consult the upstream `Installation guide <https://pytorch.org/get-started/locally/>`__ or
`Previous versions <https://pytorch.org/get-started/previous-versions/>`__ for additional context.
PyTorch includes tooling that generates HIP source code from the CUDA backend.
This approach allows PyTorch to support ROCm without requiring manual code
modifications. For more information, see :doc:`HIPIFY <hipify:index>`.
ROCm development is aligned with the stable release of PyTorch, while upstream
PyTorch testing uses the stable release of ROCm to maintain consistency.
Version support
--------------------------------------------------------------------------------
AMD releases official `ROCm PyTorch Docker images <https://hub.docker.com/r/rocm/pytorch/tags>`_
quarterly alongside new ROCm releases. These images undergo full AMD testing.
.. _pytorch-recommendations:
@@ -73,12 +75,12 @@ Use cases and recommendations
* The :doc:`Instinct MI300X workload optimization guide </how-to/rocm-for-ai/inference-optimization/workload>`
provides detailed guidance on optimizing workloads for the AMD Instinct MI300X
accelerator using ROCm. This guide helps users achieve optimal performance for
GPU using ROCm. This guide helps users achieve optimal performance for
deep learning and other high-performance computing tasks on the MI300X
accelerator.
GPU.
* The :doc:`Inception with PyTorch documentation </conceptual/ai-pytorch-inception>`
describes how PyTorch integrates with ROCm for AI workloads It outlines the
describes how PyTorch integrates with ROCm for AI workloads. It outlines the
use of PyTorch on the ROCm platform and focuses on efficiently leveraging AMD
GPU hardware for training and inference tasks in AI applications.
@@ -89,141 +91,12 @@ For more use cases and recommendations, see `ROCm PyTorch blog posts <https://ro
Docker image compatibility
================================================================================
.. |docker-icon| raw:: html
AMD validates and publishes `PyTorch images <https://hub.docker.com/r/rocm/pytorch/tags>`__
with ROCm backends on Docker Hub.
<i class="fab fa-docker"></i>
AMD validates and publishes `PyTorch images <https://hub.docker.com/r/rocm/pytorch>`__
with ROCm backends on Docker Hub. The following Docker image tags and associated
inventories were tested on `ROCm 6.4.2 <https://repo.radeon.com/rocm/apt/6.4.2/>`__.
Click |docker-icon| to view the image on Docker Hub.
.. list-table:: PyTorch Docker image components
:header-rows: 1
:class: docker-image-compatibility
* - Docker
- PyTorch
- Ubuntu
- Python
- Apex
- torchvision
- TensorBoard
- MAGMA
- UCX
- OMPI
* - .. raw:: html
<a href="https://hub.docker.com/layers/rocm/pytorch/rocm6.4.2_ubuntu24.04_py3.12_pytorch_release_2.6.0/images/sha256-6a287591500b4048a9556c1ecc92bc411fd3d552f6c8233bc399f18eb803e8d6"><i class="fab fa-docker fa-lg"></i></a>
- `2.6.0 <https://github.com/ROCm/pytorch/tree/release/2.6>`__
- 24.04
- `3.12 <https://www.python.org/downloads/release/python-31210/>`__
- `1.6.0 <https://github.com/ROCm/apex/tree/release/1.6.0>`__
- `0.21.0 <https://github.com/pytorch/vision/tree/v0.21.0>`__
- `2.18.0 <https://github.com/tensorflow/tensorboard/tree/2.18.0>`__
- `master <https://bitbucket.org/icl/magma/src/master/>`__
- `1.16.0+ds-5ubuntu1 <https://github.com/openucx/ucx/tree/v1.16.0>`__
- `4.1.6-7ubuntu2 <https://github.com/open-mpi/ompi/tree/v4.1.6>`__
* - .. raw:: html
<a href="https://hub.docker.com/layers/rocm/pytorch/rocm6.4.2_ubuntu22.04_py3.10_pytorch_release_2.6.0/images/sha256-06b967629ba6657709f04169832cd769a11e6b491e8b1394c361d42d7a0c8b43"><i class="fab fa-docker fa-lg"></i></a>
- `2.6.0 <https://github.com/ROCm/pytorch/tree/release/2.6>`__
- 22.04
- `3.10 <https://www.python.org/downloads/release/python-31017/>`__
- `1.6.0 <https://github.com/ROCm/apex/tree/release/1.6.0>`__
- `0.21.0 <https://github.com/pytorch/vision/tree/v0.21.0>`__
- `2.18.0 <https://github.com/tensorflow/tensorboard/tree/2.18.0>`__
- `master <https://bitbucket.org/icl/magma/src/master/>`__
- `1.12.1~rc2-1 <https://github.com/openucx/ucx/tree/v1.12.1>`__
- `4.1.2-2ubuntu1 <https://github.com/open-mpi/ompi/tree/v4.1.2>`__
* - .. raw:: html
<a href="https://hub.docker.com/layers/rocm/pytorch/rocm6.4.2_ubuntu24.04_py3.12_pytorch_release_2.5.1/images/sha256-62022414217ef6de33ac5b1341e57db8a48e8573fa2ace12d48aa5edd4b99ef0"><i class="fab fa-docker fa-lg"></i></a>
- `2.5.1 <https://github.com/ROCm/pytorch/tree/release/2.5>`__
- 24.04
- `3.12 <https://www.python.org/downloads/release/python-31210/>`__
- `1.5.0 <https://github.com/ROCm/apex/tree/release/1.5.0>`__
- `0.20.1 <https://github.com/pytorch/vision/tree/v0.20.1>`__
- `2.18.0 <https://github.com/tensorflow/tensorboard/tree/2.18.0>`__
- `master <https://bitbucket.org/icl/magma/src/master/>`__
- `1.16.0+ds-5ubuntu1 <https://github.com/openucx/ucx/tree/v1.10.0>`__
- `4.1.6-7ubuntu2 <https://github.com/open-mpi/ompi/tree/v4.1.6>`__
* - .. raw:: html
<a href="https://hub.docker.com/layers/rocm/pytorch/rocm6.4.2_ubuntu22.04_py3.11_pytorch_release_2.5.1/images/sha256-469a7f74fc149aff31797e011ee41978f6a190adc69fa423b3c6a718a77bd985"><i class="fab fa-docker fa-lg"></i></a>
- `2.5.1 <https://github.com/ROCm/pytorch/tree/release/2.5>`__
- 22.04
- `3.11 <https://www.python.org/downloads/release/python-31113/>`__
- `1.5.0 <https://github.com/ROCm/apex/tree/release/1.5.0>`__
- `0.20.1 <https://github.com/pytorch/vision/tree/v0.20.1>`__
- `2.18.0 <https://github.com/tensorflow/tensorboard/tree/2.18.0>`__
- `master <https://bitbucket.org/icl/magma/src/master/>`__
- `1.12.1~rc2-1 <https://github.com/openucx/ucx/tree/v1.12.1>`__
- `4.1.2-2ubuntu1 <https://github.com/open-mpi/ompi/tree/v4.1.2>`__
* - .. raw:: html
<a href="https://hub.docker.com/layers/rocm/pytorch/rocm6.4.2_ubuntu22.04_py3.10_pytorch_release_2.5.1/images/sha256-37f41a1cd94019688669a1b20d33ea74156e0c129ef6b8270076ef214a6a1a2c"><i class="fab fa-docker fa-lg"></i></a>
- `2.5.1 <https://github.com/ROCm/pytorch/tree/release/2.5>`__
- 22.04
- `3.10 <https://www.python.org/downloads/release/python-31017/>`__
- `1.5.0 <https://github.com/ROCm/apex/tree/release/1.5.0>`__
- `0.20.1 <https://github.com/pytorch/vision/tree/v0.20.1>`__
- `2.18.0 <https://github.com/tensorflow/tensorboard/tree/2.18.0>`__
- `master <https://bitbucket.org/icl/magma/src/master/>`__
- `1.12.1~rc2-1 <https://github.com/openucx/ucx/tree/v1.12.1>`__
- `4.1.2-2ubuntu1 <https://github.com/open-mpi/ompi/tree/v4.1.2>`__
* - .. raw:: html
<a href="https://hub.docker.com/layers/rocm/pytorch/rocm6.4.2_ubuntu24.04_py3.12_pytorch_release_2.4.1/images/sha256-60824ba83dc1b9d94164925af1f81c0235c105dd555091ec04c57e05177ead1b"><i class="fab fa-docker fa-lg"></i></a>
- `2.4.1 <https://github.com/ROCm/pytorch/tree/release/2.4>`__
- 24.04
- `3.12 <https://www.python.org/downloads/release/python-31210/>`__
- `1.4.0 <https://github.com/ROCm/apex/tree/release/1.4.0>`__
- `0.19.0 <https://github.com/pytorch/vision/tree/v0.19.0>`__
- `2.18.0 <https://github.com/tensorflow/tensorboard/tree/2.18.0>`__
- `master <https://bitbucket.org/icl/magma/src/master/>`__
- `1.16.0+ds-5ubuntu1 <https://github.com/openucx/ucx/tree/v1.16.0>`__
- `4.1.6-7ubuntu2 <https://github.com/open-mpi/ompi/tree/v4.1.6>`__
* - .. raw:: html
<a href="https://hub.docker.com/layers/rocm/pytorch/rocm6.4.2_ubuntu22.04_py3.10_pytorch_release_2.4.1/images/sha256-fe944fe083312f901be6891ab4d3ffebf2eaf2cf4f5f0f435ef0b76ec714fabd"><i class="fab fa-docker fa-lg"></i></a>
- `2.4.1 <https://github.com/ROCm/pytorch/tree/release/2.4>`__
- 22.04
- `3.10 <https://www.python.org/downloads/release/python-31017/>`__
- `1.4.0 <https://github.com/ROCm/apex/tree/release/1.4.0>`__
- `0.19.0 <https://github.com/pytorch/vision/tree/v0.19.0>`__
- `2.18.0 <https://github.com/tensorflow/tensorboard/tree/2.18.0>`__
- `master <https://bitbucket.org/icl/magma/src/master/>`__
- `1.12.1~rc2-1 <https://github.com/openucx/ucx/tree/v1.12.1>`__
- `4.1.2-2ubuntu1 <https://github.com/open-mpi/ompi/tree/v4.1.2>`__
* - .. raw:: html
<a href="https://hub.docker.com/layers/rocm/pytorch/rocm6.4.2_ubuntu24.04_py3.12_pytorch_release_2.3.0/images/sha256-1d59251c47170c5b8960d1172a4dbe52f5793d8966edd778f168eaf32d56661a"><i class="fab fa-docker fa-lg"></i></a>
- `2.3.0 <https://github.com/ROCm/pytorch/tree/release/2.3>`__
- 24.04
- `3.12 <https://www.python.org/downloads/release/python-31210/>`__
- `1.3.0 <https://github.com/ROCm/apex/tree/release/1.3.0>`__
- `0.18.0 <https://github.com/pytorch/vision/tree/v0.18.0>`__
- `2.13.0 <https://github.com/tensorflow/tensorboard/tree/2.13>`__
- `master <https://bitbucket.org/icl/magma/src/master/>`__
- `1.16.0+ds-5ubuntu1 <https://github.com/openucx/ucx/tree/v1.16.0>`__
- `4.1.6-7ubuntu2 <https://github.com/open-mpi/ompi/tree/v4.1.6>`__
To find the right image tag, see the :ref:`PyTorch on ROCm installation
documentation <rocm-install-on-linux:pytorch-docker-support>` for a list of
available ``rocm/pytorch`` images.
Key ROCm libraries for PyTorch
================================================================================
@@ -466,7 +339,7 @@ with ROCm.
* - Library
- Description
* - `torchaudio <https://docs.pytorch.org/audio/stable/index.html>`_
* - `torchaudio <https://docs.pytorch.org/audio/stable/index.html>`_
- Audio and signal processing library for PyTorch. Provides utilities for
audio I/O, signal and data processing functions, datasets, model
implementations, and application components for audio and speech
@@ -476,7 +349,7 @@ with ROCm.
you need to explicitly move audio data (waveform tensor) to GPU using
``.to('cuda')``.
* - `torchtune <https://docs.pytorch.org/torchtune/stable/index.html>`_
* - `torchtune <https://meta-pytorch.org/torchtune/stable/index.html>`_
- PyTorch-native library designed for fine-tuning large language models
(LLMs). Provides supports the full fine-tuning workflow and offers
compatibility with popular production inference systems.
@@ -488,21 +361,12 @@ with ROCm.
popular datasets, model architectures, and common image transformations
for computer vision applications.
* - `torchtext <https://docs.pytorch.org/text/stable/index.html>`_
- Text processing library for PyTorch. Provides data processing utilities
and popular datasets for natural language processing, including
tokenization, vocabulary management, and text embeddings.
**Note:** ``torchtext`` does not implement ROCm-specific kernels.
ROCm acceleration is provided through the underlying PyTorch framework
and ROCm library integration. Only official release exists.
* - `torchdata <https://docs.pytorch.org/data/beta/index.html>`_
* - `torchdata <https://meta-pytorch.org/data/beta/index.html#torchdata>`_
- Beta library of common modular data loading primitives for easily
constructing flexible and performant data pipelines, with features still
in prototype stage.
* - `torchrec <https://docs.pytorch.org/torchrec/>`_
* - `torchrec <https://meta-pytorch.org/torchrec/>`_
- PyTorch domain library for common sparsity and parallelism primitives
needed for large-scale recommender systems, enabling authors to train
models with large embedding tables shared across many GPUs.
@@ -535,7 +399,40 @@ with ROCm.
**Note:** Only official release exists.
Key features and enhancements for PyTorch 2.7 with ROCm 7.0
Key features and enhancements for PyTorch 2.9 with ROCm 7.1.1
================================================================================
- Scaled Dot Product Attention (SDPA) upgraded to use AOTriton version 0.11b.
- Default hipBLASLt support enabled for gfx908 architecture on ROCm 6.3 and later.
- MIOpen now supports channels last memory format for 3D convolutions and batch normalization.
- NHWC convolution operations in MIOpen optimized by eliminating unnecessary transpose operations.
- Improved tensor.item() performance by removing redundant synchronization.
- Enhanced performance for element-wise operations and reduction kernels.
- Added support for grouped GEMM operations through fbgemm_gpu generative AI components.
- Resolved device error in Inductor when using CUDA graph trees with HIP.
- Corrected logsumexp scaling in AOTriton-based SDPA implementation.
- Added stream graph capture status validation in memory copy synchronization functions.
Key features and enhancements for PyTorch 2.8 with ROCm 7.1
================================================================================
- MIOpen deep learning optimizations: Further optimized NHWC BatchNorm feature.
- Added float8 support for the DeepSpeed extension, allowing for decreased
memory footprint and increased throughput in training and inference workloads.
- ``torch.nn.functional.scaled_dot_product_attention`` now calling optimized
flash attention kernel automatically.
Key features and enhancements for PyTorch 2.7/2.8 with ROCm 7.0
================================================================================
- Enhanced TunableOp framework: Introduces ``tensorfloat32`` support for
@@ -545,7 +442,7 @@ Key features and enhancements for PyTorch 2.7 with ROCm 7.0
- Expanded GPU architecture support: Provides optimized support for newer GPU
architectures, including gfx1200 and gfx1201 with preferred hipBLASLt backend
selection, along with improvements for gfx950 and gfx1100 series GPUs.
selection, along with improvements for gfx950 and gfx1100 Series GPUs.
- Advanced Triton Integration: AOTriton 0.10b introduces official support for
gfx950 and gfx1201, along with experimental support for gfx1101, gfx1151,
@@ -570,10 +467,6 @@ Key features and enhancements for PyTorch 2.7 with ROCm 7.0
ROCm-specific test conditions, and enhanced unit test coverage for Flash
Attention and Memory Efficient operations.
- Build system and infrastructure improvements: Provides updated CentOS Stream 9
support, improved Docker configuration, migration to public MAGMA repository,
and enhanced QA automation scripts for PyTorch unit testing.
- Composable Kernel (CK) updates: Features updated CK submodule integration with
the latest optimizations and performance improvements for core mathematical
operations.
@@ -595,11 +488,11 @@ Key features and enhancements for PyTorch 2.7 with ROCm 7.0
network training or inference. For AMD platforms, ``amdclang++`` has been
validated as the supported compiler for building these extensions.
Known issues and notes for PyTorch 2.7 with ROCm 7.0
Known issues and notes for PyTorch 2.7/2.8 with ROCm 7.0 and ROCm 7.1
================================================================================
- The ``matmul.allow_fp16_reduced_precision_reduction`` and
``matmul.allow_bf16_reduced_precision_reduction`` options under
``torch.backends.cuda`` are not supported. As a result,
``matmul.allow_bf16_reduced_precision_reduction`` options under
``torch.backends.cuda`` are not supported. As a result,
reduced-precision reductions using FP16 or BF16 accumulation types are not
available.

View File

@@ -1,8 +1,8 @@
:orphan:
.. meta::
:description: Ray deep learning framework compatibility
:keywords: GPU, Ray compatibility
:description: Ray compatibility
:keywords: GPU, Ray, deep learning, framework compatibility
.. version-set:: rocm_version latest
@@ -12,43 +12,74 @@ Ray compatibility
Ray is a unified framework for scaling AI and Python applications from your laptop
to a full cluster, without changing your code. Ray consists of `a core distributed
runtime <https://docs.ray.io/en/latest/ray-core/walkthrough.html>`_ and a set of
`AI libraries <https://docs.ray.io/en/latest/ray-air/getting-started.html>`_ for
runtime <https://docs.ray.io/en/latest/ray-core/walkthrough.html>`__ and a set of
`AI libraries <https://docs.ray.io/en/latest/ray-air/getting-started.html>`__ for
simplifying machine learning computations.
Ray is a general-purpose framework that runs many types of workloads efficiently.
Any Python application can be scaled with Ray, without extra infrastructure.
ROCm support for Ray is upstreamed, and you can build the official source code
with ROCm support:
- ROCm support for Ray is hosted in the official `https://github.com/ROCm/ray
<https://github.com/ROCm/ray>`_ repository.
- Due to independent compatibility considerations, this location differs from the
`https://github.com/ray-project/ray <https://github.com/ray-project/ray>`_ upstream repository.
- To install Ray, use the prebuilt :ref:`Docker image <ray-docker-compat>`
which includes ROCm, Ray, and all required dependencies.
- See the :doc:`ROCm Ray installation guide <rocm-install-on-linux:install/3rd-party/ray-install>`
for instructions to get started.
- See the `Installation section <https://docs.ray.io/en/latest/ray-overview/installation.html>`_
in the upstream Ray documentation.
- The Docker image provided is based on the upstream Ray `Daily Release (Nightly) wheels <https://docs.ray.io/en/latest/ray-overview/installation.html#daily-releases-nightlies>`__
corresponding to commit `005c372 <https://github.com/ray-project/ray/commit/005c372262e050d5745f475e22e64305fa07f8b8>`__.
.. note::
Ray is supported on ROCm 6.4.1.
Supported devices
Support overview
================================================================================
**Officially Supported**: AMD Instinct™ MI300X, MI210
- The ROCm-supported version of Ray is maintained in the official `https://github.com/ROCm/ray
<https://github.com/ROCm/ray>`__ repository, which differs from the
`https://github.com/ray-project/ray <https://github.com/ray-project/ray>`__ upstream repository.
- To get started and install Ray on ROCm, use the prebuilt :ref:`Docker image <ray-docker-compat>`,
which includes ROCm, Ray, and all required dependencies.
- See the :doc:`ROCm Ray installation guide <rocm-install-on-linux:install/3rd-party/ray-install>`
for installation and setup instructions.
- You can also consult the upstream `Installation guide <https://docs.ray.io/en/latest/ray-overview/installation.html>`__
for additional context.
.. _ray-docker-compat:
Compatibility matrix
================================================================================
.. |docker-icon| raw:: html
<i class="fab fa-docker"></i>
AMD validates and publishes `ROCm Ray Docker images <https://hub.docker.com/r/rocm/ray/tags>`__
with ROCm backends on Docker Hub. The following Docker image tags and
associated inventories represent the latest Ray version from the official Docker Hub.
Click |docker-icon| to view the image on Docker Hub.
.. list-table::
:header-rows: 1
:class: docker-image-compatibility
* - Docker image
- ROCm
- Ray
- Pytorch
- Ubuntu
- Python
- GPU
* - .. raw:: html
<a href="https://hub.docker.com/layers/rocm/ray/ray-2.51.1_rocm7.0.0_ubuntu22.04_py3.12_pytorch2.9.0/images/sha256-a02f6766b4ba406f88fd7e85707ec86c04b569834d869a08043ec9bcbd672168"><i class="fab fa-docker fa-lg"></i> rocm/ray</a>
- `7.0.0 <https://repo.radeon.com/rocm/apt/7.0/>`__
- `2.51.1 <https://github.com/ROCm/ray/tree/release/2.51.1>`__
- 2.9.0a0+git1c57644
- 22.04
- `3.12.12 <https://www.python.org/downloads/release/python-31212/>`__
- MI300X
* - .. raw:: html
<a href="https://hub.docker.com/layers/rocm/ray/ray-2.48.0.post0_rocm6.4.1_ubuntu24.04_py3.12_pytorch2.6.0/images/sha256-0d166fe6bdced38338c78eedfb96eff92655fb797da3478a62dd636365133cc0"><i class="fab fa-docker fa-lg"></i> rocm/ray</a>
- `6.4.1 <https://repo.radeon.com/rocm/apt/6.4.1/>`__
- `2.48.0.post0 <https://github.com/ROCm/ray/tree/release/2.48.0.post0>`__
- 2.6.0+git684f6f2
- 24.04
- `3.12.10 <https://www.python.org/downloads/release/python-31210/>`__
- MI300X, MI210
Use cases and recommendations
================================================================================
@@ -77,35 +108,7 @@ topic <https://docs.ray.io/en/latest/ray-core/scheduling/accelerators.html#accel
of the Ray core documentation and refer to the `AMD ROCm blog <https://rocm.blogs.amd.com/>`__,
where you can search for Ray examples and best practices to optimize your workloads on AMD GPUs.
.. _ray-docker-compat:
Docker image compatibility
================================================================================
.. |docker-icon| raw:: html
<i class="fab fa-docker"></i>
AMD validates and publishes ready-made `ROCm Ray Docker images <https://hub.docker.com/r/rocm/ray/tags>`__
with ROCm backends on Docker Hub. The following Docker image tags and
associated inventories represent the latest Ray version from the official Docker Hub and are validated for
`ROCm 6.4.1 <https://repo.radeon.com/rocm/apt/6.4.1/>`_. Click the |docker-icon|
icon to view the image on Docker Hub.
.. list-table::
:header-rows: 1
:class: docker-image-compatibility
* - Docker image
- Ray
- Pytorch
- Ubuntu
- Python
* - .. raw:: html
<a href="https://hub.docker.com/layers/rocm/ray/ray-2.48.0.post0_rocm6.4.1_ubuntu24.04_py3.12_pytorch2.6.0/images/sha256-0d166fe6bdced38338c78eedfb96eff92655fb797da3478a62dd636365133cc0"><i class="fab fa-docker fa-lg"></i> rocm/ray</a>
- `2.48.0.post0 <https://github.com/ROCm/ray/tree/release/2.48.0.post0>`_
- 2.6.0+git684f6f2
- 24.04
- `3.12.10 <https://www.python.org/downloads/release/python-31210/>`_
Previous versions
===============================================================================
See :doc:`rocm-install-on-linux:install/3rd-party/previous-versions/ray-history` to find documentation for previous releases
of the ``ROCm/ray`` Docker image.

View File

@@ -2,7 +2,7 @@
.. meta::
:description: Stanford Megatron-LM compatibility
:keywords: Stanford, Megatron-LM, compatibility
:keywords: Stanford, Megatron-LM, deep learning, framework compatibility
.. version-set:: rocm_version latest
@@ -10,34 +10,76 @@
Stanford Megatron-LM compatibility
********************************************************************************
Stanford Megatron-LM is a large-scale language model training framework developed by NVIDIA `https://github.com/NVIDIA/Megatron-LM <https://github.com/NVIDIA/Megatron-LM>`_. It is
designed to train massive transformer-based language models efficiently by model and data parallelism.
Stanford Megatron-LM is a large-scale language model training framework developed
by NVIDIA at `https://github.com/NVIDIA/Megatron-LM <https://github.com/NVIDIA/Megatron-LM>`_.
It is designed to train massive transformer-based language models efficiently by model
and data parallelism.
* ROCm support for Stanford Megatron-LM is hosted in the official `https://github.com/ROCm/Stanford-Megatron-LM <https://github.com/ROCm/Stanford-Megatron-LM>`_ repository.
* Due to independent compatibility considerations, this location differs from the `https://github.com/stanford-futuredata/Megatron-LM <https://github.com/stanford-futuredata/Megatron-LM>`_ upstream repository.
* Use the prebuilt :ref:`Docker image <megatron-lm-docker-compat>` with ROCm, PyTorch, and Megatron-LM preinstalled.
* See the :doc:`ROCm Stanford Megatron-LM installation guide <rocm-install-on-linux:install/3rd-party/stanford-megatron-lm-install>` to install and get started.
It provides efficient tensor, pipeline, and sequence-based model parallelism for
pre-training transformer-based language models such as GPT (Decoder Only), BERT
(Encoder Only), and T5 (Encoder-Decoder).
.. note::
Stanford Megatron-LM is supported on ROCm 6.3.0.
Supported Devices
Support overview
================================================================================
- **Officially Supported**: AMD Instinct MI300X
- **Partially Supported** (functionality or performance limitations): AMD Instinct MI250X, MI210X
- The ROCm-supported version of Stanford Megatron-LM is maintained in the official `https://github.com/ROCm/Stanford-Megatron-LM
<https://github.com/ROCm/Stanford-Megatron-LM>`__ repository, which differs from the
`https://github.com/stanford-futuredata/Megatron-LM <https://github.com/stanford-futuredata/Megatron-LM>`__ upstream repository.
- To get started and install Stanford Megatron-LM on ROCm, use the prebuilt :ref:`Docker image <megatron-lm-docker-compat>`,
which includes ROCm, Stanford Megatron-LM, and all required dependencies.
Supported models and features
- See the :doc:`ROCm Stanford Megatron-LM installation guide <rocm-install-on-linux:install/3rd-party/stanford-megatron-lm-install>`
for installation and setup instructions.
- You can also consult the upstream `Installation guide <https://github.com/NVIDIA/Megatron-LM>`__
for additional context.
.. _megatron-lm-docker-compat:
Compatibility matrix
================================================================================
.. |docker-icon| raw:: html
<i class="fab fa-docker"></i>
AMD validates and publishes `Stanford Megatron-LM images <https://hub.docker.com/r/rocm/stanford-megatron-lm/tags>`_
with ROCm and Pytorch backends on Docker Hub. The following Docker image tags and associated
inventories represent the latest Stanford Megatron-LM version from the official Docker Hub.
Click |docker-icon| to view the image on Docker Hub.
.. list-table::
:header-rows: 1
:class: docker-image-compatibility
* - Docker image
- ROCm
- Stanford Megatron-LM
- PyTorch
- Ubuntu
- Python
- GPU
* - .. raw:: html
<a href="https://hub.docker.com/layers/rocm/stanford-megatron-lm/stanford-megatron-lm85f95ae_rocm6.3.0_ubuntu24.04_py3.12_pytorch2.4.0/images/sha256-070556f078be10888a1421a2cb4f48c29f28b02bfeddae02588d1f7fc02a96a6"><i class="fab fa-docker fa-lg"></i> rocm/stanford-megatron-lm</a>
- `6.3.0 <https://repo.radeon.com/rocm/apt/6.3/>`_
- `85f95ae <https://github.com/stanford-futuredata/Megatron-LM/commit/85f95aef3b648075fe6f291c86714fdcbd9cd1f5>`_
- `2.4.0 <https://github.com/ROCm/pytorch/tree/release/2.4>`_
- 24.04
- `3.12.9 <https://www.python.org/downloads/release/python-3129/>`_
- MI300X
Supported models and features with ROCm 6.3.0
================================================================================
This section details models & features that are supported by the ROCm version on Stanford Megatron-LM.
Models:
* Bert
* BERT
* GPT
* T5
* ICT
@@ -54,47 +96,21 @@ Features:
Use cases and recommendations
================================================================================
See the `Efficient MoE training on AMD ROCm: How-to use Megablocks on AMD GPUs blog <https://rocm.blogs.amd.com/artificial-intelligence/megablocks/README.html>`_ post
to leverage the ROCm platform for pre-training by using the Stanford Megatron-LM framework of pre-processing datasets on AMD GPUs.
Coverage includes:
The following blog post mentions Megablocks, but you can run Stanford Megatron-LM with the same steps to pre-process datasets on AMD GPUs:
* Single-GPU pre-training
* Multi-GPU pre-training
* The `Efficient MoE training on AMD ROCm: How-to use Megablocks on AMD GPUs
<https://rocm.blogs.amd.com/artificial-intelligence/megablocks/README.html>`__
blog post guides how to leverage the ROCm platform for pre-training using the
Megablocks framework. It introduces a streamlined approach for training Mixture-of-Experts
(MoE) models using the Megablocks library on AMD hardware. Focusing on GPT-2, it
demonstrates how block-sparse computations can enhance scalability and efficiency in MoE
training. The guide provides step-by-step instructions for setting up the environment,
including cloning the repository, building the Docker image, and running the training container.
Additionally, it offers insights into utilizing the ``oscar-1GB.json`` dataset for pre-training
language models. By leveraging Megablocks and the ROCm platform, you can optimize your MoE
training workflows for large-scale transformer models.
It features how to pre-process datasets and how to begin pre-training on AMD GPUs through:
.. _megatron-lm-docker-compat:
Docker image compatibility
================================================================================
.. |docker-icon| raw:: html
<i class="fab fa-docker"></i>
AMD validates and publishes `Stanford Megatron-LM images <https://hub.docker.com/r/rocm/megatron-lm>`_
with ROCm and Pytorch backends on Docker Hub. The following Docker image tags and associated
inventories represent the latest Megatron-LM version from the official Docker Hub.
The Docker images have been validated for `ROCm 6.3.0 <https://repo.radeon.com/rocm/apt/6.3/>`_.
Click |docker-icon| to view the image on Docker Hub.
.. list-table::
:header-rows: 1
:class: docker-image-compatibility
* - Docker image
- Stanford Megatron-LM
- PyTorch
- Ubuntu
- Python
* - .. raw:: html
<a href="https://hub.docker.com/layers/rocm/stanford-megatron-lm/stanford-megatron-lm85f95ae_rocm6.3.0_ubuntu24.04_py3.12_pytorch2.4.0/images/sha256-070556f078be10888a1421a2cb4f48c29f28b02bfeddae02588d1f7fc02a96a6"><i class="fab fa-docker fa-lg"></i></a>
- `85f95ae <https://github.com/stanford-futuredata/Megatron-LM/commit/85f95aef3b648075fe6f291c86714fdcbd9cd1f5>`_
- `2.4.0 <https://github.com/ROCm/pytorch/tree/release/2.4>`_
- 24.04
- `3.12.9 <https://www.python.org/downloads/release/python-3129/>`_
* Single-GPU pre-training
* Multi-GPU pre-training

View File

@@ -1,76 +0,0 @@
:orphan:
.. meta::
:description: Taichi compatibility
:keywords: GPU, Taichi compatibility
.. version-set:: rocm_version latest
*******************************************************************************
Taichi compatibility
*******************************************************************************
`Taichi <https://www.taichi-lang.org/>`_ is an open-source, imperative, and parallel
programming language designed for high-performance numerical computation.
Embedded in Python, it leverages just-in-time (JIT) compilation frameworks such as LLVM to accelerate
compute-intensive Python code by compiling it to native GPU or CPU instructions.
Taichi is widely used across various domains, including real-time physical simulation,
numerical computing, augmented reality, artificial intelligence, computer vision, robotics,
visual effects in film and gaming, and general-purpose computing.
* ROCm support for Taichi is hosted in the official `https://github.com/ROCm/taichi <https://github.com/ROCm/taichi>`_ repository.
* Due to independent compatibility considerations, this location differs from the `https://github.com/taichi-dev <https://github.com/taichi-dev>`_ upstream repository.
* Use the prebuilt :ref:`Docker image <taichi-docker-compat>` with ROCm, PyTorch, and Taichi preinstalled.
* See the :doc:`ROCm Taichi installation guide <rocm-install-on-linux:install/3rd-party/taichi-install>` to install and get started.
.. note::
Taichi is supported on ROCm 6.3.2.
Supported devices and features
===============================================================================
There is support through the ROCm software stack for all Taichi GPU features on AMD Instinct MI250X and MI210X series GPUs with the exception of Taichis GPU rendering system, CGUI.
AMD Instinct MI300X series GPUs will be supported by November.
.. _taichi-recommendations:
Use cases and recommendations
================================================================================
To fully leverage Taichi's performance capabilities in compute-intensive tasks, it is best to adhere to specific coding patterns and utilize Taichi decorators.
A collection of example use cases is available in the `https://github.com/ROCm/taichi_examples <https://github.com/ROCm/taichi_examples>`_ repository,
providing practical insights and foundational knowledge for working with the Taichi programming language.
You can also refer to the `AMD ROCm blog <https://rocm.blogs.amd.com/>`_ to search for Taichi examples and best practices to optimize your workflows on AMD GPUs.
.. _taichi-docker-compat:
Docker image compatibility
================================================================================
.. |docker-icon| raw:: html
<i class="fab fa-docker"></i>
AMD validates and publishes ready-made `ROCm Taichi Docker images <https://hub.docker.com/r/rocm/taichi/tags>`_
with ROCm backends on Docker Hub. The following Docker image tags and associated inventories
represent the latest Taichi version from the official Docker Hub.
The Docker images have been validated for `ROCm 6.3.2 <https://rocm.docs.amd.com/en/docs-6.3.2/about/release-notes.html>`_.
Click |docker-icon| to view the image on Docker Hub.
.. list-table::
:header-rows: 1
:class: docker-image-compatibility
* - Docker image
- ROCm
- Taichi
- Ubuntu
- Python
* - .. raw:: html
<a href="https://hub.docker.com/layers/rocm/taichi/taichi-1.8.0b1_rocm6.3.2_ubuntu22.04_py3.10.12/images/sha256-e016964a751e6a92199032d23e70fa3a564fff8555afe85cd718f8aa63f11fc6"><i class="fab fa-docker fa-lg"></i> rocm/taichi</a>
- `6.3.2 <https://repo.radeon.com/rocm/apt/6.3.2/>`_
- `1.8.0b1 <https://github.com/taichi-dev/taichi>`_
- 22.04
- `3.10.12 <https://www.python.org/downloads/release/python-31012/>`_

View File

@@ -2,7 +2,7 @@
.. meta::
:description: TensorFlow compatibility
:keywords: GPU, TensorFlow compatibility
:keywords: GPU, TensorFlow, deep learning, framework compatibility
.. version-set:: rocm_version latest
@@ -12,115 +12,46 @@ TensorFlow compatibility
`TensorFlow <https://www.tensorflow.org/>`__ is an open-source library for
solving machine learning, deep learning, and AI problems. It can solve many
problems across different sectors and industries but primarily focuses on
neural network training and inference. It is one of the most popular and
in-demand frameworks and is very active in open-source contribution and
development.
problems across different sectors and industries, but primarily focuses on
neural network training and inference. It is one of the most popular deep
learning frameworks and is very active in open-source development.
Support overview
================================================================================
- The ROCm-supported version of TensorFlow is maintained in the official `https://github.com/ROCm/tensorflow-upstream
<https://github.com/ROCm/tensorflow-upstream>`__ repository, which differs from the
`https://github.com/tensorflow/tensorflow <https://github.com/tensorflow/tensorflow>`__ upstream repository.
- To get started and install TensorFlow on ROCm, use the prebuilt :ref:`Docker images <tensorflow-docker-compat>`,
which include ROCm, TensorFlow, and all required dependencies.
- See the :doc:`ROCm TensorFlow installation guide <rocm-install-on-linux:install/3rd-party/tensorflow-install>`
for installation and setup instructions.
- You can also consult the `TensorFlow API versions <https://www.tensorflow.org/versions>`__ list
for additional context.
Version support
--------------------------------------------------------------------------------
The `official TensorFlow repository <http://github.com/tensorflow/tensorflow>`__
includes full ROCm support. AMD maintains a TensorFlow `ROCm repository
<http://github.com/rocm/tensorflow-upstream>`__ in order to quickly add bug
fixes, updates, and support for the latest ROCM versions.
- ROCm TensorFlow release:
- Offers :ref:`Docker images <tensorflow-docker-compat>` with
ROCm and TensorFlow pre-installed.
- ROCm TensorFlow repository: `<https://github.com/ROCm/tensorflow-upstream>`__
- See the :doc:`ROCm TensorFlow installation guide <rocm-install-on-linux:install/3rd-party/tensorflow-install>`
to get started.
- Official TensorFlow release:
- Official TensorFlow repository: `<https://github.com/tensorflow/tensorflow>`__
- See the `TensorFlow API versions <https://www.tensorflow.org/versions>`__ list.
.. note::
The official TensorFlow documentation does not cover ROCm support. Use the
ROCm documentation for installation instructions for Tensorflow on ROCm.
See :doc:`rocm-install-on-linux:install/3rd-party/tensorflow-install`.
fixes, updates, and support for the latest ROCm versions.
.. _tensorflow-docker-compat:
Docker image compatibility
===============================================================================
================================================================================
.. |docker-icon| raw:: html
AMD provides preconfigured Docker images with TensorFlow and the ROCm backend.
These images are published on `Docker Hub <https://hub.docker.com/r/rocm/tensorflow>`__ and are the
recommended way to get started with deep learning with TensorFlow on ROCm.
<i class="fab fa-docker"></i>
AMD validates and publishes ready-made `TensorFlow images
<https://hub.docker.com/r/rocm/tensorflow>`__ with ROCm backends on
Docker Hub. The following Docker image tags and associated inventories are
validated for `ROCm 6.4.2 <https://repo.radeon.com/rocm/apt/6.4.2/>`__. Click
the |docker-icon| icon to view the image on Docker Hub.
.. list-table:: TensorFlow Docker image components
:header-rows: 1
* - Docker image
- TensorFlow
- Ubuntu
- Python
- TensorBoard
* - .. raw:: html
<a href="https://hub.docker.com/layers/rocm/tensorflow/rocm6.4.2-py3.12-tf2.18-dev/images/sha256-96754ce2d30f729e19b497279915b5212ba33d5e408e7e5dd3f2304d87e3441e"><i class="fab fa-docker fa-lg"></i> rocm/tensorflow</a>
- `tensorflow-rocm 2.18.1 <https://repo.radeon.com/rocm/manylinux/rocm-rel-6.4.2/tensorflow_rocm-2.18.1-cp312-cp312-manylinux_2_28_x86_64.whl>`__
- 24.04
- `Python 3.12 <https://www.python.org/downloads/release/python-31210/>`__
- `TensorBoard 2.18.0 <https://github.com/tensorflow/tensorboard/tree/2.18.0>`__
* - .. raw:: html
<a href="https://hub.docker.com/layers/rocm/tensorflow/rocm6.4.2-py3.10-tf2.18-dev/images/sha256-fa741508d383858e86985a9efac85174529127408102558ae2e3a4ac894eea1e"><i class="fab fa-docker fa-lg"></i> rocm/tensorflow</a>
- `tensorflow-rocm 2.18.1 <https://repo.radeon.com/rocm/manylinux/rocm-rel-6.4.2/tensorflow_rocm-2.18.1-cp310-cp310-manylinux_2_28_x86_64.whl>`__
- 22.04
- `Python 3.10 <https://www.python.org/downloads/release/python-31017/>`__
- `TensorBoard 2.18.0 <https://github.com/tensorflow/tensorboard/tree/2.18.0>`__
* - .. raw:: html
<a href="https://hub.docker.com/layers/rocm/tensorflow/rocm6.4.2-py3.12-tf2.17-dev/images/sha256-3a0aef09f2a8833c2b64b85874dd9449ffc2ad257351857338ff5b706c03a418"><i class="fab fa-docker fa-lg"></i> rocm/tensorflow</a>
- `tensorflow-rocm 2.17.1 <https://repo.radeon.com/rocm/manylinux/rocm-rel-6.4.2/tensorflow_rocm-2.17.1-cp312-cp312-manylinux_2_28_x86_64.whl>`__
- 24.04
- `Python 3.12 <https://www.python.org/downloads/release/python-31210/>`__
- `TensorBoard 2.17.1 <https://github.com/tensorflow/tensorboard/tree/2.17.1>`__
* - .. raw:: html
<a href="https://hub.docker.com/layers/rocm/tensorflow/rocm6.4.2-py3.10-tf2.17-dev/images/sha256-bc7341a41ebe7ab261aa100732874507c452421ef733e408ac4f05ed453b0bc5"><i class="fab fa-docker fa-lg"></i> rocm/tensorflow</a>
- `tensorflow-rocm 2.17.1 <https://repo.radeon.com/rocm/manylinux/rocm-rel-6.4.2/tensorflow_rocm-2.17.1-cp310-cp310-manylinux_2_28_x86_64.whl>`__
- 22.04
- `Python 3.10 <https://www.python.org/downloads/release/python-31017/>`__
- `TensorBoard 2.17.1 <https://github.com/tensorflow/tensorboard/tree/2.17.1>`__
* - .. raw:: html
<a href="https://hub.docker.com/layers/rocm/tensorflow/rocm6.4.2-py3.12-tf2.16-dev/images/sha256-4841a8df7c340dab79bf9362dad687797649a00d594e0832eb83ea6880a40d3b"><i class="fab fa-docker fa-lg"></i> rocm/tensorflow</a>
- `tensorflow-rocm 2.16.2 <https://repo.radeon.com/rocm/manylinux/rocm-rel-6.4.2/tensorflow_rocm-2.16.2-cp312-cp312-manylinux_2_28_x86_64.whl>`__
- 24.04
- `Python 3.12 <https://www.python.org/downloads/release/python-31210/>`__
- `TensorBoard 2.16.2 <https://github.com/tensorflow/tensorboard/tree/2.16.2>`__
* - .. raw:: html
<a href="https://hub.docker.com/layers/rocm/tensorflow/rocm6.4.2-py3.10-tf2.16-dev/images/sha256-883fa95aba960c58a3e46fceaa18f03ede2c7df89b8e9fd603ab2d47e0852897"><i class="fab fa-docker fa-lg"></i> rocm/tensorflow</a>
- `tensorflow-rocm 2.16.2 <https://repo.radeon.com/rocm/manylinux/rocm-rel-6.4.2/tensorflow_rocm-2.16.2-cp310-cp310-manylinux_2_28_x86_64.whl>`__
- 22.04
- `Python 3.10 <https://www.python.org/downloads/release/python-31017/>`__
- `TensorBoard 2.16.2 <https://github.com/tensorflow/tensorboard/tree/2.16.2>`__
To find the right image tag, see the :ref:`TensorFlow on ROCm installation
documentation <rocm-install-on-linux:tensorflow-docker-support>` for a list of
available ``rocm/tensorflow`` images.
Critical ROCm libraries for TensorFlow
@@ -205,7 +136,7 @@ The following section maps supported data types and GPU-accelerated TensorFlow
features to their minimum supported ROCm and TensorFlow versions.
Data types
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
---------------
The data type of a tensor is specified using the ``dtype`` attribute or
argument, and TensorFlow supports a wide range of data types for different use
@@ -323,7 +254,7 @@ are as follows:
- 1.7
Features
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
---------------
This table provides an overview of key features in TensorFlow and their
availability in ROCm.
@@ -415,7 +346,7 @@ availability in ROCm.
- 1.9.2
Distributed library features
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-----------------------------------
Enables developers to scale computations across multiple devices on a single machine or
across multiple machines.

View File

@@ -2,7 +2,7 @@
.. meta::
:description: verl compatibility
:keywords: GPU, verl compatibility
:keywords: GPU, verl, deep learning, framework compatibility
.. version-set:: rocm_version latest
@@ -10,77 +10,109 @@
verl compatibility
*******************************************************************************
Volcano Engine Reinforcement Learning for LLMs (verl) is a reinforcement learning framework designed for large language models (LLMs).
verl offers a scalable, open-source fine-tuning solution optimized for AMD Instinct GPUs with full ROCm support.
Volcano Engine Reinforcement Learning for LLMs (`verl <https://verl.readthedocs.io/en/latest/>`__)
is a reinforcement learning framework designed for large language models (LLMs).
verl offers a scalable, open-source fine-tuning solution by using a hybrid programming model
that makes it easy to define and run complex post-training dataflows efficiently.
* See the `verl documentation <https://verl.readthedocs.io/en/latest/>`_ for more information about verl.
* The official verl GitHub repository is `https://github.com/volcengine/verl <https://github.com/volcengine/verl>`_.
* Use the AMD-validated :ref:`Docker images <verl-docker-compat>` with ROCm and verl preinstalled.
* See the :doc:`ROCm verl installation guide <rocm-install-on-linux:install/3rd-party/verl-install>` to install and get started.
Its modular APIs separate computation from data, allowing smooth integration with other frameworks.
It also supports flexible model placement across GPUs for efficient scaling on different cluster sizes.
verl achieves high training and generation throughput by building on existing LLM frameworks.
Its 3D-HybridEngine reduces memory use and communication overhead when switching between training
and inference, improving overall performance.
.. note::
verl is supported on ROCm 6.2.0.
.. _verl-recommendations:
Use cases and recommendations
Support overview
================================================================================
The benefits of verl in large-scale reinforcement learning from human feedback (RLHF) are discussed in the `Reinforcement Learning from Human Feedback on AMD GPUs with verl and ROCm Integration <https://rocm.blogs.amd.com/artificial-intelligence/verl-large-scale/README.html>`_ blog.
- The ROCm-supported version of verl is maintained in the official `https://github.com/ROCm/verl
<https://github.com/ROCm/verl>`__ repository, which differs from the
`https://github.com/volcengine/verl <https://github.com/volcengine/verl>`__ upstream repository.
.. _verl-supported_features:
- To get started and install verl on ROCm, use the prebuilt :ref:`Docker image <verl-docker-compat>`,
which includes ROCm, verl, and all required dependencies.
Supported features
===============================================================================
- See the :doc:`ROCm verl installation guide <rocm-install-on-linux:install/3rd-party/verl-install>`
for installation and setup instructions.
The following table shows verl on ROCm support for GPU-accelerated modules.
.. list-table::
:header-rows: 1
* - Module
- Description
- verl version
- ROCm version
* - ``FSDP``
- Training engine
- 0.3.0.post0
- 6.2.0
* - ``vllm``
- Inference engine
- 0.3.0.post0
- 6.2.0
- You can also consult the upstream `verl documentation <https://verl.readthedocs.io/en/latest/>`__
for additional context.
.. _verl-docker-compat:
Docker image compatibility
Compatibility matrix
================================================================================
.. |docker-icon| raw:: html
<i class="fab fa-docker"></i>
AMD validates and publishes ready-made `ROCm verl Docker images <https://hub.docker.com/r/rocm/verl/tags>`_
with ROCm backends on Docker Hub. The following Docker image tags and associated inventories represent the available verl versions from the official Docker Hub.
AMD validates and publishes `verl Docker images <https://hub.docker.com/r/rocm/verl/tags>`_
with ROCm backends on Docker Hub. The following Docker image tag and associated inventories
represent the latest verl version from the official Docker Hub.
Click |docker-icon| to view the image on Docker Hub.
.. list-table::
:header-rows: 1
.. list-table::
:header-rows: 1
:class: docker-image-compatibility
* - Docker image
- ROCm
- verl
- Ubuntu
- Pytorch
- Python
- vllm
* - Docker image
- ROCm
- verl
- Ubuntu
- PyTorch
- Python
- vllm
- GPU
* - .. raw:: html
* - .. raw:: html
<a href="https://hub.docker.com/layers/rocm/verl/verl-0.3.0.post0_rocm6.2_vllm0.6.3/images/sha256-cbe423803fd7850448b22444176bee06f4dcf22cd3c94c27732752d3a39b04b2"><i class="fab fa-docker fa-lg"></i> rocm/verl</a>
- `6.2.0 <https://repo.radeon.com/rocm/apt/6.2/>`_
- `0.3.0post0 <https://github.com/volcengine/verl/releases/tag/v0.3.0.post0>`_
- 20.04
- `2.5.0 <https://github.com/ROCm/pytorch/tree/release/2.5>`_
- `3.9.19 <https://www.python.org/downloads/release/python-3919/>`_
- `0.6.3 <https://github.com/vllm-project/vllm/releases/tag/v0.6.3>`_
<a href="https://hub.docker.com/layers/rocm/verl/verl-0.6.0.amd0_rocm7.0_vllm0.11.0.dev/images/sha256-f70a3ebc94c1f66de42a2fcc3f8a6a8d6d0881eb0e65b6958d7d6d24b3eecb0d"><i class="fab fa-docker fa-lg"></i> rocm/verl</a>
- `7.0.0 <https://repo.radeon.com/rocm/apt/7.0/>`__
- `0.6.0 <https://github.com/volcengine/verl/releases/tag/v0.6.0>`__
- 22.04
- `2.9.0 <https://github.com/ROCm/pytorch/tree/release/2.9-rocm7.x-gfx115x>`__
- `3.12.11 <https://www.python.org/downloads/release/python-31211/>`__
- `0.11.0 <https://github.com/vllm-project/vllm/releases/tag/v0.11.0>`__
- MI300X
* - .. raw:: html
<a href="https://hub.docker.com/layers/rocm/verl/verl-0.3.0.post0_rocm6.2_vllm0.6.3/images/sha256-cbe423803fd7850448b22444176bee06f4dcf22cd3c94c27732752d3a39b04b2"><i class="fab fa-docker fa-lg"></i> rocm/verl</a>
- `6.2.0 <https://repo.radeon.com/rocm/apt/6.2/>`__
- `0.3.0.post0 <https://github.com/volcengine/verl/releases/tag/v0.3.0.post0>`__
- 20.04
- `2.5.0 <https://github.com/ROCm/pytorch/tree/release/2.5>`__
- `3.9.19 <https://www.python.org/downloads/release/python-3919/>`__
- `0.6.3 <https://github.com/vllm-project/vllm/releases/tag/v0.6.3>`__
- MI300X
.. _verl-supported_features:
Supported modules with verl on ROCm
===============================================================================
The following GPU-accelerated modules are supported with verl on ROCm:
- ``FSDP``: Training engine
- ``vllm``: Inference engine
.. _verl-recommendations:
Use cases and recommendations
================================================================================
* The benefits of verl in large-scale reinforcement learning from human feedback
(RLHF) are discussed in the `Reinforcement Learning from Human Feedback on AMD
GPUs with verl and ROCm Integration <https://rocm.blogs.amd.com/artificial-intelligence/verl-large-scale/README.html>`__
blog. The blog post outlines how the Volcano Engine Reinforcement Learning
(verl) framework integrates with the AMD ROCm platform to optimize training on
AMD Instinct™ GPUs. The guide details the process of building a Docker image,
setting up single-node and multi-node training environments, and highlights
performance benchmarks demonstrating improved throughput and convergence accuracy.
This resource serves as a comprehensive starting point for deploying verl on AMD GPUs,
facilitating efficient RLHF training workflows.
Previous versions
===============================================================================
See :doc:`rocm-install-on-linux:install/3rd-party/previous-versions/verl-history` to find documentation for previous releases
of the ``ROCm/verl`` Docker image.

View File

@@ -13,22 +13,22 @@
:gutter: 1
:::{grid-item-card}
**AMD Instinct MI300 series**
**AMD Instinct MI300 Series**
Review hardware aspects of the AMD Instinct™ MI300 series of GPU accelerators and the CDNA™ 3
Review hardware aspects of the AMD Instinct™ MI300 Series GPUs and the CDNA™ 3
architecture.
* [AMD Instinct™ MI300 microarchitecture](./gpu-arch/mi300.md)
* [AMD Instinct MI300/CDNA3 ISA](https://www.amd.com/content/dam/amd/en/documents/instinct-tech-docs/instruction-set-architectures/amd-instinct-mi300-cdna3-instruction-set-architecture.pdf)
* [White paper](https://www.amd.com/content/dam/amd/en/documents/instinct-tech-docs/white-papers/amd-cdna-3-white-paper.pdf)
* [MI300 performance counters](./gpu-arch/mi300-mi200-performance-counters.rst)
* [MI350 series performance counters](./gpu-arch/mi350-performance-counters.rst)
* [MI350 Series performance counters](./gpu-arch/mi350-performance-counters.rst)
:::
:::{grid-item-card}
**AMD Instinct MI200 series**
**AMD Instinct MI200 Series**
Review hardware aspects of the AMD Instinct™ MI200 series of GPU accelerators and the CDNA™ 2
Review hardware aspects of the AMD Instinct™ MI200 Series GPUs and the CDNA™ 2
architecture.
* [AMD Instinct™ MI250 microarchitecture](./gpu-arch/mi250.md)
@@ -41,7 +41,7 @@ architecture.
:::{grid-item-card}
**AMD Instinct MI100**
Review hardware aspects of the AMD Instinct™ MI100 series of GPU accelerators and the CDNA™ 1
Review hardware aspects of the AMD Instinct™ MI100 Series GPUs and the CDNA™ 1
architecture.
* [AMD Instinct™ MI100 microarchitecture](./gpu-arch/mi100.md)

View File

@@ -1,14 +1,14 @@
---
myst:
html_meta:
"description lang=en": "Learn about the AMD Instinct MI100 series architecture."
"description lang=en": "Learn about the AMD Instinct MI100 Series architecture."
"keywords": "Instinct, MI100, microarchitecture, AMD, ROCm"
---
# AMD Instinct™ MI100 microarchitecture
The following image shows the node-level architecture of a system that
comprises two AMD EPYC™ processors and (up to) eight AMD Instinct™ accelerators.
comprises two AMD EPYC™ processors and (up to) eight AMD Instinct™ GPUs.
The two EPYC processors are connected to each other with the AMD Infinity™
fabric which provides a high-bandwidth (up to 18 GT/sec) and coherent links such
that each processor can access the available node memory as a single
@@ -18,29 +18,29 @@ available to connect the processors plus one PCIe Gen 4 x16 link per processor
can attach additional I/O devices such as the host adapters for the network
fabric.
![Structure of a single GCD in the AMD Instinct MI100 accelerator](../../data/conceptual/gpu-arch/image004.png "Node-level system architecture with two AMD EPYC™ processors and eight AMD Instinct™ accelerators.")
![Structure of a single GCD in the AMD Instinct MI100 GPU](../../data/conceptual/gpu-arch/image004.png "Node-level system architecture with two AMD EPYC™ processors and eight AMD Instinct™ GPUs.")
In a typical node configuration, each processor can host up to four AMD
Instinct™ accelerators that are attached using PCIe Gen 4 links at 16 GT/sec,
Instinct™ GPUs that are attached using PCIe Gen 4 links at 16 GT/sec,
which corresponds to a peak bidirectional link bandwidth of 32 GB/sec. Each hive
of four accelerators can participate in a fully connected, coherent AMD
Instinct™ fabric that connects the four accelerators using 23 GT/sec AMD
of four GPUs can participate in a fully connected, coherent AMD
Instinct™ fabric that connects the four GPUs using 23 GT/sec AMD
Infinity fabric links that run at a higher frequency than the inter-processor
links. This inter-GPU link can be established in certified server systems if the
GPUs are mounted in neighboring PCIe slots by installing the AMD Infinity
Fabric™ bridge for the AMD Instinct™ accelerators.
Fabric™ bridge for the AMD Instinct™ GPUs.
## Microarchitecture
The microarchitecture of the AMD Instinct accelerators is based on the AMD CDNA
The microarchitecture of the AMD Instinct GPUs is based on the AMD CDNA
architecture, which targets compute applications such as high-performance
computing (HPC) and AI & machine learning (ML) that run on everything from
individual servers to the world's largest exascale supercomputers. The overall
system architecture is designed for extreme scalability and compute performance.
![Structure of the AMD Instinct accelerator (MI100 generation)](../../data/conceptual/gpu-arch/image005.png "Structure of the AMD Instinct accelerator (MI100 generation)")
![Structure of the AMD Instinct GPU (MI100 generation)](../../data/conceptual/gpu-arch/image005.png "Structure of the AMD Instinct GPU (MI100 generation)")
The above image shows the AMD Instinct accelerator with its PCIe Gen 4 x16
The above image shows the AMD Instinct GPU with its PCIe Gen 4 x16
link (16 GT/sec, at the bottom) that connects the GPU to (one of) the host
processor(s). It also shows the three AMD Infinity Fabric ports that provide
high-speed links (23 GT/sec, also at the bottom) to the other GPUs of the local
@@ -48,7 +48,7 @@ hive.
On the left and right of the floor plan, the High Bandwidth Memory (HBM)
attaches via the GPU memory controller. The MI100 generation of the AMD
Instinct accelerator offers four stacks of HBM generation 2 (HBM2) for a total
Instinct GPU offers four stacks of HBM generation 2 (HBM2) for a total
of 32GB with a 4,096bit-wide memory interface. The peak memory bandwidth of the
attached HBM2 is 1.228 TB/sec at a memory clock frequency of 1.2 GHz.
@@ -64,7 +64,7 @@ Therefore, the theoretical maximum FP64 peak performance is 11.5 TFLOPS
![Block diagram of an MI100 compute unit with detailed SIMD view of the AMD CDNA architecture](../../data/conceptual/gpu-arch/image006.png "An MI100 compute unit with detailed SIMD view of the AMD CDNA architecture")
The preceding image shows the block diagram of a single CU of an AMD Instinct™
MI100 accelerator and summarizes how instructions flow through the execution
MI100 GPU and summarizes how instructions flow through the execution
engines. The CU fetches the instructions via a 32KB instruction cache and moves
them forward to execution via a dispatcher. The CU can handle up to ten
wavefronts at a time and feed their instructions into the execution unit. The

View File

@@ -1,13 +1,13 @@
---
myst:
html_meta:
"description lang=en": "Learn about the AMD Instinct MI250 series architecture."
"description lang=en": "Learn about the AMD Instinct MI250 Series architecture."
"keywords": "Instinct, MI250, microarchitecture, AMD, ROCm"
---
# AMD Instinct™ MI250 microarchitecture
The microarchitecture of the AMD Instinct MI250 accelerators is based on the
The microarchitecture of the AMD Instinct MI250 GPU is based on the
AMD CDNA 2 architecture that targets compute applications such as HPC,
artificial intelligence (AI), and machine learning (ML) and that run on
everything from individual servers to the worlds largest exascale
@@ -40,7 +40,7 @@ execution units (also called matrix cores), which are geared toward executing
matrix operations like matrix-matrix multiplications. For FP64, the peak
performance of these units amounts to 90.5 TFLOPS.
![Structure of a single GCD in the AMD Instinct MI250 accelerator.](../../data/conceptual/gpu-arch/image001.png "Structure of a single GCD in the AMD Instinct MI250 accelerator.")
![Structure of a single GCD in the AMD Instinct MI250 GPU.](../../data/conceptual/gpu-arch/image001.png "Structure of a single GCD in the AMD Instinct MI250 GPU.")
```{list-table} Peak-performance capabilities of the MI250 OAM for different data types.
:header-rows: 1
@@ -84,16 +84,9 @@ performance of these units amounts to 90.5 TFLOPS.
- 362.1
```
The above table summarizes the aggregated peak performance of the AMD
Instinct MI250 OCP Open Accelerator Modules (OAM, OCP is short for Open Compute
Platform) and its two GCDs for different data types and execution units. The
middle column lists the peak performance (number of data elements processed in a
single instruction) of a single compute unit if a SIMD (or matrix) instruction
is being retired in each clock cycle. The third column lists the theoretical
peak performance of the OAM module. The theoretical aggregated peak memory
bandwidth of the GPU is 3.2 TB/sec (1.6 TB/sec per GCD).
The above table summarizes the aggregated peak performance of the AMD Instinct MI250 Open Compute Platform (OCP) Open Accelerator Modules (OAMs) and its two GCDs for different data types and execution units. The middle column lists the peak performance (number of data elements processed in a single instruction) of a single compute unit if a SIMD (or matrix) instruction is being retired in each clock cycle. The third column lists the theoretical peak performance of the OAM module. The theoretical aggregated peak memory bandwidth of the GPU is 3.2 TB/sec (1.6 TB/sec per GCD).
![Dual-GCD architecture of the AMD Instinct MI250 accelerators](../../data/conceptual/gpu-arch/image002.png "Dual-GCD architecture of the AMD Instinct MI250 accelerators")
![Dual-GCD architecture of the AMD Instinct MI250 GPUs](../../data/conceptual/gpu-arch/image002.png "Dual-GCD architecture of the AMD Instinct MI250 GPUs")
The following image shows the block diagram of an OAM package that consists
of two GCDs, each of which constitutes one GPU device in the system. The two
@@ -105,18 +98,18 @@ between the two GCDs of an OAM, or a bidirectional peak transfer bandwidth of
## Node-level architecture
The following image shows the node-level architecture of a system that is
based on the AMD Instinct MI250 accelerator. The MI250 OAMs attach to the host
based on the AMD Instinct MI250 GPU. The MI250 OAMs attach to the host
system via PCIe Gen 4 x16 links (yellow lines). Each GCD maintains its own PCIe
x16 link to the host part of the system. Depending on the server platform, the
GCD can attach to the AMD EPYC processor directly or via an optional PCIe switch
. Note that some platforms may offer an x8 interface to the GCDs, which reduces
the available host-to-GPU bandwidth.
![Block diagram of AMD Instinct MI250 Accelerators with 3rd Generation AMD EPYC processor](../../data/conceptual/gpu-arch/image003.png "Block diagram of AMD Instinct MI250 Accelerators with 3rd Generation AMD EPYC processor")
![Block diagram of AMD Instinct MI250 GPUs with 3rd Generation AMD EPYC processor](../../data/conceptual/gpu-arch/image003.png "Block diagram of AMD Instinct MI250 GPUs with 3rd Generation AMD EPYC processor")
The preceding image shows the node-level architecture of a system with AMD
EPYC processors in a dual-socket configuration and four AMD Instinct MI250
accelerators. The MI250 OAMs attach to the host processors system via PCIe Gen 4
GPUs. The MI250 OAMs attach to the host processors system via PCIe Gen 4
x16 links (yellow lines). Depending on the system design, a PCIe switch may
exist to make more PCIe lanes available for additional components like network
interfaces and/or storage devices. Each GCD maintains its own PCIe x16 link to

View File

@@ -1,16 +1,16 @@
.. meta::
:description: MI300 and MI200 series performance counters and metrics
:description: MI300 and MI200 Series performance counters and metrics
:keywords: MI300, MI200, performance counters, command processor counters
***************************************************************************************************
MI300 and MI200 series performance counters and metrics
MI300 and MI200 Series performance counters and metrics
***************************************************************************************************
This document lists and describes the hardware performance counters and derived metrics available
for the AMD Instinct™ MI300 and MI200 GPU. You can also access this information using the
:doc:`ROCprofiler-SDK <rocprofiler-sdk:how-to/using-rocprofv3>`.
MI300 and MI200 series performance counters
MI300 and MI200 Series performance counters
===============================================================
Series performance counters include the following categories:
@@ -27,7 +27,7 @@ The following sections provide additional details for each category.
.. note::
Preliminary validation of all MI300 and MI200 series performance counters is in progress. Those with
Preliminary validation of all MI300 and MI200 Series performance counters is in progress. Those with
an asterisk (*) require further evaluation.
.. _command-processor-counters:
@@ -171,7 +171,7 @@ Instruction mix
"``SQ_INSTS_SMEM``", "Instr", "Number of scalar memory instructions issued"
"``SQ_INSTS_SMEM_NORM``", "Instr", "Number of scalar memory instructions normalized to match ``smem_level`` issued"
"``SQ_INSTS_FLAT``", "Instr", "Number of flat instructions issued"
"``SQ_INSTS_FLAT_LDS_ONLY``", "Instr", "**MI200 series only** Number of FLAT instructions that read/write only from/to LDS issued. Works only if ``EARLY_TA_DONE`` is enabled."
"``SQ_INSTS_FLAT_LDS_ONLY``", "Instr", "**MI200 Series only** Number of FLAT instructions that read/write only from/to LDS issued. Works only if ``EARLY_TA_DONE`` is enabled."
"``SQ_INSTS_LDS``", "Instr", "Number of LDS instructions issued **(MI200: includes flat; MI300: does not include flat)**"
"``SQ_INSTS_GDS``", "Instr", "Number of global data share instructions issued"
"``SQ_INSTS_EXP_GDS``", "Instr", "Number of EXP and global data share instructions excluding skipped export instructions issued"
@@ -396,9 +396,9 @@ Texture cache per pipe counters
"``TCP_UTCL1_TRANSLATION_MISS[n]``", "Req", "Number of unified translation cache (L1) translation misses", "0-15"
"``TCP_UTCL1_PERMISSION_MISS[n]``", "Req", "Number of unified translation cache (L1) permission misses", "0-15"
"``TCP_TOTAL_CACHE_ACCESSES[n]``", "Req", "Number of vector L1d cache accesses including hits and misses", "0-15"
"``TCP_TCP_LATENCY[n]``", "Cycles", "**MI200 series only** Accumulated wave access latency to vL1D over all wavefronts", "0-15"
"``TCP_TCC_READ_REQ_LATENCY[n]``", "Cycles", "**MI200 series only** Total vL1D to L2 request latency over all wavefronts for reads and atomics with return", "0-15"
"``TCP_TCC_WRITE_REQ_LATENCY[n]``", "Cycles", "**MI200 series only** Total vL1D to L2 request latency over all wavefronts for writes and atomics without return", "0-15"
"``TCP_TCP_LATENCY[n]``", "Cycles", "**MI200 Series only** Accumulated wave access latency to vL1D over all wavefronts", "0-15"
"``TCP_TCC_READ_REQ_LATENCY[n]``", "Cycles", "**MI200 Series only** Total vL1D to L2 request latency over all wavefronts for reads and atomics with return", "0-15"
"``TCP_TCC_WRITE_REQ_LATENCY[n]``", "Cycles", "**MI200 Series only** Total vL1D to L2 request latency over all wavefronts for writes and atomics without return", "0-15"
"``TCP_TCC_READ_REQ[n]``", "Req", "Number of read requests to L2 cache", "0-15"
"``TCP_TCC_WRITE_REQ[n]``", "Req", "Number of write requests to L2 cache", "0-15"
"``TCP_TCC_ATOMIC_WITH_RET_REQ[n]``", "Req", "Number of atomic requests to L2 cache with return", "0-15"
@@ -560,7 +560,7 @@ Note the following:
``TCC_TAG_STALL[n]``, probes can stall the pipeline at a variety of places. There is no single point that
can accurately measure the total stalls
MI300 and MI200 series derived metrics list
MI300 and MI200 Series derived metrics list
==============================================================
.. csv-table::

View File

@@ -1,21 +1,21 @@
---
myst:
html_meta:
"description lang=en": "Learn about the AMD Instinct MI300 series architecture."
"description lang=en": "Learn about the AMD Instinct MI300 Series architecture."
"keywords": "Instinct, MI300X, MI300A, microarchitecture, AMD, ROCm"
---
# AMD Instinct™ MI300 series microarchitecture
# AMD Instinct™ MI300 Series microarchitecture
The AMD Instinct MI300 series accelerators are based on the AMD CDNA 3
The AMD Instinct MI300 Series GPUs are based on the AMD CDNA 3
architecture which was designed to deliver leadership performance for HPC, artificial intelligence (AI), and machine
learning (ML) workloads. The AMD Instinct MI300 series accelerators are well-suited for extreme scalability and compute performance, running
learning (ML) workloads. The AMD Instinct MI300 Series GPUs are well-suited for extreme scalability and compute performance, running
on everything from individual servers to the worlds largest exascale supercomputers.
With the MI300 series, AMD is introducing the Accelerator Complex Die (XCD), which contains the
With the MI300 Series, AMD is introducing the Accelerator Complex Die (XCD), which contains the
GPU computational elements of the processor along with the lower levels of the cache hierarchy.
The following image depicts the structure of a single XCD in the AMD Instinct MI300 accelerator series.
The following image depicts the structure of a single XCD in the AMD Instinct MI300 GPU Series.
```{figure} ../../data/shared/xcd-sys-arch.png
---
@@ -39,7 +39,7 @@ infrastructure) using the AMD Infinity Fabric™ technology as interconnect.
The Matrix Cores inside the CDNA 3 CUs have significant improvements, emphasizing AI and machine
learning, enhancing throughput of existing data types while adding support for new data types.
CDNA 2 Matrix Cores support FP16 and BF16, while offering INT8 for inference. Compared to MI250X
accelerators, CDNA 3 Matrix Cores triple the performance for FP16 and BF16, while providing a
GPUs, CDNA 3 Matrix Cores triple the performance for FP16 and BF16, while providing a
performance gain of 6.8 times for INT8. FP8 has a performance gain of 16 times compared to FP32,
while TF32 has a gain of 4 times compared to FP32.
@@ -105,7 +105,7 @@ name: mi300-arch
alt:
align: center
---
MI300 series system architecture showing MI300A (left) with 6 XCDs and 3 CCDs, while the MI300X (right) has 8 XCDs.
MI300 Series system architecture showing MI300A (left) with 6 XCDs and 3 CCDs, while the MI300X (right) has 8 XCDs.
```
## Node-level architecture
@@ -116,11 +116,11 @@ name: mi300-node
align: center
---
MI300 series node-level architecture showing 8 fully interconnected MI300X OAM modules connected to (optional) PCIEe switches via retimers and HGX connectors.
MI300 Series node-level architecture showing 8 fully interconnected MI300X OAM modules connected to (optional) PCIEe switches via retimers and HGX connectors.
```
The image above shows the node-level architecture of a system with AMD EPYC processors in a
dual-socket configuration and eight AMD Instinct MI300X accelerators. The MI300X OAMs attach to the
dual-socket configuration and eight AMD Instinct MI300X GPUs. The MI300X OAMs attach to the
host system via PCIe Gen 5 x16 links (yellow lines). The GPUs are using seven high-bandwidth,
low-latency AMD Infinity Fabric™ links (red lines) to form a fully connected 8-GPU system.

View File

@@ -1,12 +1,12 @@
.. meta::
:description: MI355 series performance counters and metrics
:description: MI355 Series performance counters and metrics
:keywords: MI355, MI355X, MI3XX
***********************************
MI350 series performance counters
MI350 Series performance counters
***********************************
This topic lists and describes the hardware performance counters and derived metrics available on the AMD Instinct MI350 and MI355 accelerators. These counters are available for profiling using `ROCprofiler-SDK <https://rocm.docs.amd.com/projects/rocprofiler-sdk/en/latest/index.html>`_ and `ROCm Compute Profiler <https://rocm.docs.amd.com/projects/rocprofiler-compute/en/latest/>`_.
This topic lists and describes the hardware performance counters and derived metrics available on the AMD Instinct MI350 and MI355 GPUs. These counters are available for profiling using `ROCprofiler-SDK <https://rocm.docs.amd.com/projects/rocprofiler-sdk/en/latest/index.html>`_ and `ROCm Compute Profiler <https://rocm.docs.amd.com/projects/rocprofiler-compute/en/latest/>`_.
The following sections list the performance counters based on the IP blocks.

View File

@@ -34,7 +34,7 @@ Runtime
```{code-block} shell
:caption: Example to expose the 1. device and a device based on UUID.
export ROCR_VISIBLE_DEVICES="0,GPU-DEADBEEFDEADBEEF"
export ROCR_VISIBLE_DEVICES="0,GPU-4b2c1a9f-8d3e-6f7a-b5c9-2e4d8a1f6c3b"
```
### `GPU_DEVICE_ORDINAL`

View File

@@ -8,6 +8,7 @@ import os
import shutil
import sys
from pathlib import Path
from subprocess import run
gh_release_path = os.path.join("..", "RELEASE.md")
gh_changelog_path = os.path.join("..", "CHANGELOG.md")
@@ -80,24 +81,27 @@ latex_elements = {
}
html_baseurl = os.environ.get("READTHEDOCS_CANONICAL_URL", "rocm.docs.amd.com")
html_context = {}
html_context = {"docs_header_version": "7.1.1"}
if os.environ.get("READTHEDOCS", "") == "True":
html_context["READTHEDOCS"] = True
# Check if the branch is a docs/ branch
official_branch = run(["git", "rev-parse", "--abbrev-ref", "HEAD"], capture_output=True, text=True).stdout.find("docs/")
# configurations for PDF output by Read the Docs
project = "ROCm Documentation"
project_path = os.path.abspath(".").replace("\\", "/")
author = "Advanced Micro Devices, Inc."
copyright = "Copyright (c) 2025 Advanced Micro Devices, Inc. All rights reserved."
version = "7.0.0"
release = "7.0.0"
version = "7.1.1"
release = "7.1.1"
setting_all_article_info = True
all_article_info_os = ["linux", "windows"]
all_article_info_author = ""
# pages with specific settings
article_pages = [
{"file": "about/release-notes", "os": ["linux"], "date": "2025-09-16"},
{"file": "about/release-notes", "os": ["linux"], "date": "2025-11-26"},
{"file": "release/changelog", "os": ["linux"],},
{"file": "compatibility/compatibility-matrix", "os": ["linux"]},
{"file": "compatibility/ml-compatibility/pytorch-compatibility", "os": ["linux"]},
@@ -107,14 +111,17 @@ article_pages = [
{"file": "compatibility/ml-compatibility/stanford-megatron-lm-compatibility", "os": ["linux"]},
{"file": "compatibility/ml-compatibility/dgl-compatibility", "os": ["linux"]},
{"file": "compatibility/ml-compatibility/megablocks-compatibility", "os": ["linux"]},
{"file": "compatibility/ml-compatibility/taichi-compatibility", "os": ["linux"]},
{"file": "compatibility/ml-compatibility/ray-compatibility", "os": ["linux"]},
{"file": "compatibility/ml-compatibility/llama-cpp-compatibility", "os": ["linux"]},
{"file": "compatibility/ml-compatibility/flashinfer-compatibility", "os": ["linux"]},
{"file": "how-to/deep-learning-rocm", "os": ["linux"]},
{"file": "how-to/rocm-for-ai/index", "os": ["linux"]},
{"file": "how-to/rocm-for-ai/install", "os": ["linux"]},
{"file": "how-to/rocm-for-ai/system-health-check", "os": ["linux"]},
{"file": "how-to/rocm-for-ai/system-setup/index", "os": ["linux"]},
{"file": "how-to/rocm-for-ai/system-setup/multi-node-setup", "os": ["linux"]},
{"file": "how-to/rocm-for-ai/system-setup/prerequisite-system-validation", "os": ["linux"]},
{"file": "how-to/rocm-for-ai/system-setup/system-health-check", "os": ["linux"]},
{"file": "how-to/rocm-for-ai/training/index", "os": ["linux"]},
{"file": "how-to/rocm-for-ai/training/train-a-model", "os": ["linux"]},
@@ -127,19 +134,36 @@ article_pages = [
{"file": "how-to/rocm-for-ai/training/benchmark-docker/previous-versions/megatron-lm-v25.4", "os": ["linux"]},
{"file": "how-to/rocm-for-ai/training/benchmark-docker/previous-versions/megatron-lm-v25.5", "os": ["linux"]},
{"file": "how-to/rocm-for-ai/training/benchmark-docker/previous-versions/megatron-lm-v25.6", "os": ["linux"]},
{"file": "how-to/rocm-for-ai/training/benchmark-docker/previous-versions/megatron-lm-v25.7", "os": ["linux"]},
{"file": "how-to/rocm-for-ai/training/benchmark-docker/previous-versions/megatron-lm-v25.8", "os": ["linux"]},
{"file": "how-to/rocm-for-ai/training/benchmark-docker/previous-versions/megatron-lm-v25.9", "os": ["linux"]},
{"file": "how-to/rocm-for-ai/training/benchmark-docker/previous-versions/megatron-lm-v25.10", "os": ["linux"]},
{"file": "how-to/rocm-for-ai/training/benchmark-docker/previous-versions/megatron-lm-primus-migration-guide", "os": ["linux"]},
{"file": "how-to/rocm-for-ai/training/benchmark-docker/primus-megatron", "os": ["linux"]},
{"file": "how-to/rocm-for-ai/training/benchmark-docker/previous-versions/primus-megatron-v25.7", "os": ["linux"]},
{"file": "how-to/rocm-for-ai/training/benchmark-docker/previous-versions/primus-megatron-v25.8", "os": ["linux"]},
{"file": "how-to/rocm-for-ai/training/benchmark-docker/previous-versions/primus-megatron-v25.9", "os": ["linux"]},
{"file": "how-to/rocm-for-ai/training/benchmark-docker/previous-versions/primus-megatron-v25.10", "os": ["linux"]},
{"file": "how-to/rocm-for-ai/training/benchmark-docker/pytorch-training", "os": ["linux"]},
{"file": "how-to/rocm-for-ai/training/benchmark-docker/previous-versions/pytorch-training-history", "os": ["linux"]},
{"file": "how-to/rocm-for-ai/training/benchmark-docker/previous-versions/pytorch-training-v25.3", "os": ["linux"]},
{"file": "how-to/rocm-for-ai/training/benchmark-docker/previous-versions/pytorch-training-v25.4", "os": ["linux"]},
{"file": "how-to/rocm-for-ai/training/benchmark-docker/previous-versions/pytorch-training-v25.5", "os": ["linux"]},
{"file": "how-to/rocm-for-ai/training/benchmark-docker/previous-versions/pytorch-training-v25.6", "os": ["linux"]},
{"file": "how-to/rocm-for-ai/training/benchmark-docker/previous-versions/pytorch-training-v25.7", "os": ["linux"]},
{"file": "how-to/rocm-for-ai/training/benchmark-docker/previous-versions/pytorch-training-v25.8", "os": ["linux"]},
{"file": "how-to/rocm-for-ai/training/benchmark-docker/previous-versions/pytorch-training-v25.9", "os": ["linux"]},
{"file": "how-to/rocm-for-ai/training/benchmark-docker/previous-versions/pytorch-training-v25.10", "os": ["linux"]},
{"file": "how-to/rocm-for-ai/training/benchmark-docker/primus-pytorch", "os": ["linux"]},
{"file": "how-to/rocm-for-ai/training/benchmark-docker/previous-versions/primus-pytorch-v25.8", "os": ["linux"]},
{"file": "how-to/rocm-for-ai/training/benchmark-docker/previous-versions/primus-pytorch-v25.9", "os": ["linux"]},
{"file": "how-to/rocm-for-ai/training/benchmark-docker/previous-versions/primus-pytorch-v25.10", "os": ["linux"]},
{"file": "how-to/rocm-for-ai/training/benchmark-docker/jax-maxtext", "os": ["linux"]},
{"file": "how-to/rocm-for-ai/training/benchmark-docker/previous-versions/jax-maxtext-history", "os": ["linux"]},
{"file": "how-to/rocm-for-ai/training/benchmark-docker/previous-versions/jax-maxtext-v25.4", "os": ["linux"]},
{"file": "how-to/rocm-for-ai/training/benchmark-docker/previous-versions/jax-maxtext-v25.5", "os": ["linux"]},
{"file": "how-to/rocm-for-ai/training/benchmark-docker/mpt-llm-foundry", "os": ["linux"]},
{"file": "how-to/rocm-for-ai/inference/xdit-diffusion-inference", "os": ["linux"]},
{"file": "how-to/rocm-for-ai/fine-tuning/index", "os": ["linux"]},
{"file": "how-to/rocm-for-ai/fine-tuning/overview", "os": ["linux"]},
@@ -164,8 +188,16 @@ article_pages = [
{"file": "how-to/rocm-for-ai/inference/benchmark-docker/previous-versions/vllm-0.9.1-20250702", "os": ["linux"]},
{"file": "how-to/rocm-for-ai/inference/benchmark-docker/previous-versions/vllm-0.9.1-20250715", "os": ["linux"]},
{"file": "how-to/rocm-for-ai/inference/benchmark-docker/previous-versions/vllm-0.10.0-20250812", "os": ["linux"]},
{"file": "how-to/rocm-for-ai/inference/benchmark-docker/previous-versions/vllm-0.10.1-20250909", "os": ["linux"]},
{"file": "how-to/rocm-for-ai/inference/benchmark-docker/previous-versions/vllm-0.10.2-20251006", "os": ["linux"]},
{"file": "how-to/rocm-for-ai/inference/benchmark-docker/previous-versions/vllm-0.11.1-20251103", "os": ["linux"]},
{"file": "how-to/rocm-for-ai/inference/benchmark-docker/previous-versions/sglang-history", "os": ["linux"]},
{"file": "how-to/rocm-for-ai/inference/benchmark-docker/pytorch-inference", "os": ["linux"]},
{"file": "how-to/rocm-for-ai/inference/xdit-diffusion-inference", "os": ["linux"]},
{"file": "how-to/rocm-for-ai/inference/benchmark-docker/previous-versions/xdit-25.10", "os": ["linux"]},
{"file": "how-to/rocm-for-ai/inference/benchmark-docker/previous-versions/xdit-25.11", "os": ["linux"]},
{"file": "how-to/rocm-for-ai/inference/benchmark-docker/previous-versions/xdit-25.12", "os": ["linux"]},
{"file": "how-to/rocm-for-ai/inference/benchmark-docker/previous-versions/xdit-25.13", "os": ["linux"]},
{"file": "how-to/rocm-for-ai/inference/deploy-your-model", "os": ["linux"]},
{"file": "how-to/rocm-for-ai/inference-optimization/index", "os": ["linux"]},
@@ -193,7 +225,7 @@ external_toc_path = "./sphinx/_toc.yml"
# Add the _extensions directory to Python's search path
sys.path.append(str(Path(__file__).parent / 'extension'))
extensions = ["rocm_docs", "sphinx_reredirects", "sphinx_sitemap", "sphinxcontrib.datatemplates", "version-ref", "csv-to-list-table"]
extensions = ["rocm_docs", "sphinx_reredirects", "sphinx_sitemap", "sphinxcontrib.datatemplates", "remote-content", "version-ref", "csv-to-list-table"]
compatibility_matrix_file = str(Path(__file__).parent / 'compatibility/compatibility-matrix-historical-6.0.csv')
@@ -203,10 +235,14 @@ external_projects_current_project = "rocm"
# external_projects_remote_repository = ""
html_baseurl = os.environ.get("READTHEDOCS_CANONICAL_URL", "https://rocm-stg.amd.com/")
html_context = {}
html_context = {"docs_header_version": "7.1.0"}
if os.environ.get("READTHEDOCS", "") == "True":
html_context["READTHEDOCS"] = True
html_context["official_branch"] = official_branch
html_context["version"] = version
html_context["release"] = release
html_theme = "rocm_docs_theme"
html_theme_options = {"flavor": "rocm-docs-home"}
@@ -225,10 +261,13 @@ suppress_warnings = ["autosectionlabel.*"]
html_context = {
"project_path" : {project_path},
"gpu_type" : [('AMD Instinct accelerators', 'intrinsic'), ('AMD gfx families', 'gfx'), ('NVIDIA families', 'nvidia') ],
"gpu_type" : [('AMD Instinct GPUs', 'intrinsic'), ('AMD gfx families', 'gfx'), ('NVIDIA families', 'nvidia') ],
"atomics_type" : [('HW atomics', 'hw-atomics'), ('CAS emulation', 'cas-atomics')],
"pcie_type" : [('No PCIe atomics', 'nopcie'), ('PCIe atomics', 'pcie')],
"memory_type" : [('Device DRAM', 'device-dram'), ('Migratable Host DRAM', 'migratable-host-dram'), ('Pinned Host DRAM', 'pinned-host-dram')],
"granularity_type" : [('Coarse-grained', 'coarse-grained'), ('Fine-grained', 'fine-grained')],
"scope_type" : [('Device', 'device'), ('System', 'system')]
}
# Disable figure and table numbering
numfig = False

View File

@@ -0,0 +1,188 @@
dockers:
- pull_tag: rocm/vllm:rocm6.4.1_vllm_0.10.1_20250909
docker_hub_url: https://hub.docker.com/layers/rocm/vllm/rocm6.4.1_vllm_0.10.1_20250909/images/sha256-1113268572e26d59b205792047bea0e61e018e79aeadceba118b7bf23cb3715c
components:
ROCm: 6.4.1
vLLM: 0.10.1 (0.10.1rc2.dev409+g0b6bf6691.rocm641)
PyTorch: 2.7.0+gitf717b2a
hipBLASLt: 0.15
model_groups:
- group: Meta Llama
tag: llama
models:
- model: Llama 3.1 8B
mad_tag: pyt_vllm_llama-3.1-8b
model_repo: meta-llama/Llama-3.1-8B-Instruct
url: https://huggingface.co/meta-llama/Llama-3.1-8B
precision: float16
config:
tp: 1
dtype: auto
kv_cache_dtype: auto
max_seq_len_to_capture: 131072
max_num_batched_tokens: 131072
max_model_len: 8192
- model: Llama 3.1 70B
mad_tag: pyt_vllm_llama-3.1-70b
model_repo: meta-llama/Llama-3.1-70B-Instruct
url: https://huggingface.co/meta-llama/Llama-3.1-70B-Instruct
precision: float16
config:
tp: 8
dtype: auto
kv_cache_dtype: auto
max_seq_len_to_capture: 131072
max_num_batched_tokens: 131072
max_model_len: 8192
- model: Llama 3.1 405B
mad_tag: pyt_vllm_llama-3.1-405b
model_repo: meta-llama/Llama-3.1-405B-Instruct
url: https://huggingface.co/meta-llama/Llama-3.1-405B-Instruct
precision: float16
config:
tp: 8
dtype: auto
kv_cache_dtype: auto
max_seq_len_to_capture: 131072
max_num_batched_tokens: 131072
max_model_len: 8192
- model: Llama 2 70B
mad_tag: pyt_vllm_llama-2-70b
model_repo: meta-llama/Llama-2-70b-chat-hf
url: https://huggingface.co/meta-llama/Llama-2-70b-chat-hf
precision: float16
config:
tp: 8
dtype: auto
kv_cache_dtype: auto
max_seq_len_to_capture: 4096
max_num_batched_tokens: 4096
max_model_len: 4096
- model: Llama 3.1 8B FP8
mad_tag: pyt_vllm_llama-3.1-8b_fp8
model_repo: amd/Llama-3.1-8B-Instruct-FP8-KV
url: https://huggingface.co/amd/Llama-3.1-8B-Instruct-FP8-KV
precision: float8
config:
tp: 1
dtype: auto
kv_cache_dtype: fp8
max_seq_len_to_capture: 131072
max_num_batched_tokens: 131072
max_model_len: 8192
- model: Llama 3.1 70B FP8
mad_tag: pyt_vllm_llama-3.1-70b_fp8
model_repo: amd/Llama-3.1-70B-Instruct-FP8-KV
url: https://huggingface.co/amd/Llama-3.1-70B-Instruct-FP8-KV
precision: float8
config:
tp: 8
dtype: auto
kv_cache_dtype: fp8
max_seq_len_to_capture: 131072
max_num_batched_tokens: 131072
max_model_len: 8192
- model: Llama 3.1 405B FP8
mad_tag: pyt_vllm_llama-3.1-405b_fp8
model_repo: amd/Llama-3.1-405B-Instruct-FP8-KV
url: https://huggingface.co/amd/Llama-3.1-405B-Instruct-FP8-KV
precision: float8
config:
tp: 8
dtype: auto
kv_cache_dtype: fp8
max_seq_len_to_capture: 131072
max_num_batched_tokens: 131072
max_model_len: 8192
- group: Mistral AI
tag: mistral
models:
- model: Mixtral MoE 8x7B
mad_tag: pyt_vllm_mixtral-8x7b
model_repo: mistralai/Mixtral-8x7B-Instruct-v0.1
url: https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1
precision: float16
config:
tp: 8
dtype: auto
kv_cache_dtype: auto
max_seq_len_to_capture: 32768
max_num_batched_tokens: 32768
max_model_len: 8192
- model: Mixtral MoE 8x22B
mad_tag: pyt_vllm_mixtral-8x22b
model_repo: mistralai/Mixtral-8x22B-Instruct-v0.1
url: https://huggingface.co/mistralai/Mixtral-8x22B-Instruct-v0.1
precision: float16
config:
tp: 8
dtype: auto
kv_cache_dtype: auto
max_seq_len_to_capture: 65536
max_num_batched_tokens: 65536
max_model_len: 8192
- model: Mixtral MoE 8x7B FP8
mad_tag: pyt_vllm_mixtral-8x7b_fp8
model_repo: amd/Mixtral-8x7B-Instruct-v0.1-FP8-KV
url: https://huggingface.co/amd/Mixtral-8x7B-Instruct-v0.1-FP8-KV
precision: float8
config:
tp: 8
dtype: auto
kv_cache_dtype: fp8
max_seq_len_to_capture: 32768
max_num_batched_tokens: 32768
max_model_len: 8192
- model: Mixtral MoE 8x22B FP8
mad_tag: pyt_vllm_mixtral-8x22b_fp8
model_repo: amd/Mixtral-8x22B-Instruct-v0.1-FP8-KV
url: https://huggingface.co/amd/Mixtral-8x22B-Instruct-v0.1-FP8-KV
precision: float8
config:
tp: 8
dtype: auto
kv_cache_dtype: fp8
max_seq_len_to_capture: 65536
max_num_batched_tokens: 65536
max_model_len: 8192
- group: Qwen
tag: qwen
models:
- model: QwQ-32B
mad_tag: pyt_vllm_qwq-32b
model_repo: Qwen/QwQ-32B
url: https://huggingface.co/Qwen/QwQ-32B
precision: float16
config:
tp: 1
dtype: auto
kv_cache_dtype: auto
max_seq_len_to_capture: 131072
max_num_batched_tokens: 131072
max_model_len: 8192
- model: Qwen3 30B A3B
mad_tag: pyt_vllm_qwen3-30b-a3b
model_repo: Qwen/Qwen3-30B-A3B
url: https://huggingface.co/Qwen/Qwen3-30B-A3B
precision: float16
config:
tp: 1
dtype: auto
kv_cache_dtype: auto
max_seq_len_to_capture: 32768
max_num_batched_tokens: 32768
max_model_len: 8192
- group: Microsoft Phi
tag: phi
models:
- model: Phi-4
mad_tag: pyt_vllm_phi-4
model_repo: microsoft/phi-4
url: https://huggingface.co/microsoft/phi-4
config:
tp: 1
dtype: auto
kv_cache_dtype: auto
max_seq_len_to_capture: 16384
max_num_batched_tokens: 16384
max_model_len: 8192

View File

@@ -0,0 +1,316 @@
dockers:
- pull_tag: rocm/vllm:rocm7.0.0_vllm_0.10.2_20251006
docker_hub_url: https://hub.docker.com/layers/rocm/vllm/rocm7.0.0_vllm_0.10.2_20251006/images/sha256-94fd001964e1cf55c3224a445b1fb5be31a7dac302315255db8422d813edd7f5
components:
ROCm: 7.0.0
vLLM: 0.10.2 (0.11.0rc2.dev160+g790d22168.rocm700)
PyTorch: 2.9.0a0+git1c57644
hipBLASLt: 1.0.0
dockerfile:
commit: 790d22168820507f3105fef29596549378cfe399
model_groups:
- group: Meta Llama
tag: llama
models:
- model: Llama 2 70B
mad_tag: pyt_vllm_llama-2-70b
model_repo: meta-llama/Llama-2-70b-chat-hf
url: https://huggingface.co/meta-llama/Llama-2-70b-chat-hf
precision: float16
config:
tp: 8
dtype: auto
kv_cache_dtype: auto
max_num_batched_tokens: 4096
max_model_len: 4096
- model: Llama 3.1 8B
mad_tag: pyt_vllm_llama-3.1-8b
model_repo: meta-llama/Llama-3.1-8B-Instruct
url: https://huggingface.co/meta-llama/Llama-3.1-8B
precision: float16
config:
tp: 1
dtype: auto
kv_cache_dtype: auto
max_num_batched_tokens: 131072
max_model_len: 8192
- model: Llama 3.1 8B FP8
mad_tag: pyt_vllm_llama-3.1-8b_fp8
model_repo: amd/Llama-3.1-8B-Instruct-FP8-KV
url: https://huggingface.co/amd/Llama-3.1-8B-Instruct-FP8-KV
precision: float8
config:
tp: 1
dtype: auto
kv_cache_dtype: fp8
max_num_batched_tokens: 131072
max_model_len: 8192
- model: Llama 3.1 405B
mad_tag: pyt_vllm_llama-3.1-405b
model_repo: meta-llama/Llama-3.1-405B-Instruct
url: https://huggingface.co/meta-llama/Llama-3.1-405B-Instruct
precision: float16
config:
tp: 8
dtype: auto
kv_cache_dtype: auto
max_num_batched_tokens: 131072
max_model_len: 8192
- model: Llama 3.1 405B FP8
mad_tag: pyt_vllm_llama-3.1-405b_fp8
model_repo: amd/Llama-3.1-405B-Instruct-FP8-KV
url: https://huggingface.co/amd/Llama-3.1-405B-Instruct-FP8-KV
precision: float8
config:
tp: 8
dtype: auto
kv_cache_dtype: fp8
max_num_batched_tokens: 131072
max_model_len: 8192
- model: Llama 3.1 405B MXFP4
mad_tag: pyt_vllm_llama-3.1-405b_fp4
model_repo: amd/Llama-3.1-405B-Instruct-MXFP4-Preview
url: https://huggingface.co/amd/Llama-3.1-405B-Instruct-MXFP4-Preview
precision: float4
config:
tp: 8
dtype: auto
kv_cache_dtype: fp8
max_num_batched_tokens: 131072
max_model_len: 8192
- model: Llama 3.3 70B
mad_tag: pyt_vllm_llama-3.3-70b
model_repo: meta-llama/Llama-3.3-70B-Instruct
url: https://huggingface.co/meta-llama/Llama-3.3-70B-Instruct
precision: float16
config:
tp: 8
dtype: auto
kv_cache_dtype: auto
max_num_batched_tokens: 131072
max_model_len: 8192
- model: Llama 3.3 70B FP8
mad_tag: pyt_vllm_llama-3.3-70b_fp8
model_repo: amd/Llama-3.3-70B-Instruct-FP8-KV
url: https://huggingface.co/amd/Llama-3.3-70B-Instruct-FP8-KV
precision: float8
config:
tp: 8
dtype: auto
kv_cache_dtype: fp8
max_num_batched_tokens: 131072
max_model_len: 8192
- model: Llama 3.3 70B MXFP4
mad_tag: pyt_vllm_llama-3.3-70b_fp4
model_repo: amd/Llama-3.3-70B-Instruct-MXFP4-Preview
url: https://huggingface.co/amd/Llama-3.3-70B-Instruct-MXFP4-Preview
precision: float4
config:
tp: 8
dtype: auto
kv_cache_dtype: fp8
max_num_batched_tokens: 131072
max_model_len: 8192
- model: Llama 4 Scout 17Bx16E
mad_tag: pyt_vllm_llama-4-scout-17b-16e
model_repo: meta-llama/Llama-4-Scout-17B-16E-Instruct
url: https://huggingface.co/meta-llama/Llama-4-Scout-17B-16E-Instruct
precision: float16
config:
tp: 8
dtype: auto
kv_cache_dtype: auto
max_num_batched_tokens: 32768
max_model_len: 8192
- model: Llama 4 Maverick 17Bx128E
mad_tag: pyt_vllm_llama-4-maverick-17b-128e
model_repo: meta-llama/Llama-4-Maverick-17B-128E-Instruct
url: https://huggingface.co/meta-llama/Llama-4-Maverick-17B-128E-Instruct
precision: float16
config:
tp: 8
dtype: auto
kv_cache_dtype: auto
max_num_batched_tokens: 32768
max_model_len: 8192
- model: Llama 4 Maverick 17Bx128E FP8
mad_tag: pyt_vllm_llama-4-maverick-17b-128e_fp8
model_repo: meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8
url: https://huggingface.co/meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8
precision: float8
config:
tp: 8
dtype: auto
kv_cache_dtype: fp8
max_num_batched_tokens: 131072
max_model_len: 8192
- group: DeepSeek
tag: deepseek
models:
- model: DeepSeek R1 0528 FP8
mad_tag: pyt_vllm_deepseek-r1
model_repo: deepseek-ai/DeepSeek-R1-0528
url: https://huggingface.co/deepseek-ai/DeepSeek-R1-0528
precision: float8
config:
tp: 8
dtype: auto
kv_cache_dtype: fp8
max_num_seqs: 1024
max_num_batched_tokens: 131072
max_model_len: 8192
- group: OpenAI GPT OSS
tag: gpt-oss
models:
- model: GPT OSS 20B
mad_tag: pyt_vllm_gpt-oss-20b
model_repo: openai/gpt-oss-20b
url: https://huggingface.co/openai/gpt-oss-20b
precision: bfloat16
config:
tp: 1
dtype: auto
kv_cache_dtype: auto
max_num_batched_tokens: 8192
max_model_len: 8192
- model: GPT OSS 120B
mad_tag: pyt_vllm_gpt-oss-120b
model_repo: openai/gpt-oss-120b
url: https://huggingface.co/openai/gpt-oss-120b
precision: bfloat16
config:
tp: 8
dtype: auto
kv_cache_dtype: auto
max_num_batched_tokens: 8192
max_model_len: 8192
- group: Mistral AI
tag: mistral
models:
- model: Mixtral MoE 8x7B
mad_tag: pyt_vllm_mixtral-8x7b
model_repo: mistralai/Mixtral-8x7B-Instruct-v0.1
url: https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1
precision: float16
config:
tp: 8
dtype: auto
kv_cache_dtype: auto
max_num_batched_tokens: 32768
max_model_len: 8192
- model: Mixtral MoE 8x7B FP8
mad_tag: pyt_vllm_mixtral-8x7b_fp8
model_repo: amd/Mixtral-8x7B-Instruct-v0.1-FP8-KV
url: https://huggingface.co/amd/Mixtral-8x7B-Instruct-v0.1-FP8-KV
precision: float8
config:
tp: 8
dtype: auto
kv_cache_dtype: fp8
max_num_batched_tokens: 32768
max_model_len: 8192
- model: Mixtral MoE 8x22B
mad_tag: pyt_vllm_mixtral-8x22b
model_repo: mistralai/Mixtral-8x22B-Instruct-v0.1
url: https://huggingface.co/mistralai/Mixtral-8x22B-Instruct-v0.1
precision: float16
config:
tp: 8
dtype: auto
kv_cache_dtype: auto
max_num_batched_tokens: 65536
max_model_len: 8192
- model: Mixtral MoE 8x22B FP8
mad_tag: pyt_vllm_mixtral-8x22b_fp8
model_repo: amd/Mixtral-8x22B-Instruct-v0.1-FP8-KV
url: https://huggingface.co/amd/Mixtral-8x22B-Instruct-v0.1-FP8-KV
precision: float8
config:
tp: 8
dtype: auto
kv_cache_dtype: fp8
max_num_batched_tokens: 65536
max_model_len: 8192
- group: Qwen
tag: qwen
models:
- model: Qwen3 8B
mad_tag: pyt_vllm_qwen3-8b
model_repo: Qwen/Qwen3-8B
url: https://huggingface.co/Qwen/Qwen3-8B
precision: float16
config:
tp: 1
dtype: auto
kv_cache_dtype: auto
max_num_batched_tokens: 40960
max_model_len: 8192
- model: Qwen3 32B
mad_tag: pyt_vllm_qwen3-32b
model_repo: Qwen/Qwen3-32b
url: https://huggingface.co/Qwen/Qwen3-32B
precision: float16
config:
tp: 1
dtype: auto
kv_cache_dtype: auto
max_num_batched_tokens: 40960
max_model_len: 8192
- model: Qwen3 30B A3B
mad_tag: pyt_vllm_qwen3-30b-a3b
model_repo: Qwen/Qwen3-30B-A3B
url: https://huggingface.co/Qwen/Qwen3-30B-A3B
precision: float16
config:
tp: 1
dtype: auto
kv_cache_dtype: auto
max_num_batched_tokens: 40960
max_model_len: 8192
- model: Qwen3 30B A3B FP8
mad_tag: pyt_vllm_qwen3-30b-a3b_fp8
model_repo: Qwen/Qwen3-30B-A3B-FP8
url: https://huggingface.co/Qwen/Qwen3-30B-A3B-FP8
precision: float16
config:
tp: 1
dtype: auto
kv_cache_dtype: fp8
max_num_batched_tokens: 40960
max_model_len: 8192
- model: Qwen3 235B A22B
mad_tag: pyt_vllm_qwen3-235b-a22b
model_repo: Qwen/Qwen3-235B-A22B
url: https://huggingface.co/Qwen/Qwen3-235B-A22B
precision: float16
config:
tp: 8
dtype: auto
kv_cache_dtype: auto
max_num_batched_tokens: 40960
max_model_len: 8192
- model: Qwen3 235B A22B FP8
mad_tag: pyt_vllm_qwen3-235b-a22b_fp8
model_repo: Qwen/Qwen3-235B-A22B-FP8
url: https://huggingface.co/Qwen/Qwen3-235B-A22B-FP8
precision: float8
config:
tp: 8
dtype: auto
kv_cache_dtype: fp8
max_num_batched_tokens: 40960
max_model_len: 8192
- group: Microsoft Phi
tag: phi
models:
- model: Phi-4
mad_tag: pyt_vllm_phi-4
model_repo: microsoft/phi-4
url: https://huggingface.co/microsoft/phi-4
precision: float16
config:
tp: 1
dtype: auto
kv_cache_dtype: auto
max_num_batched_tokens: 16384
max_model_len: 8192

View File

@@ -0,0 +1,316 @@
dockers:
- pull_tag: rocm/vllm:rocm7.0.0_vllm_0.11.1_20251103
docker_hub_url: https://hub.docker.com/layers/rocm/vllm/rocm7.0.0_vllm_0.11.1_20251103/images/sha256-8d60429043d4d00958da46039a1de0d9b82df814d45da482497eef26a6076506
components:
ROCm: 7.0.0
vLLM: 0.11.1 (0.11.1rc2.dev141+g38f225c2a.rocm700)
PyTorch: 2.9.0a0+git1c57644
hipBLASLt: 1.0.0
dockerfile:
commit: 38f225c2abeadc04c2cc398814c2f53ea02c3c72
model_groups:
- group: Meta Llama
tag: llama
models:
- model: Llama 2 70B
mad_tag: pyt_vllm_llama-2-70b
model_repo: meta-llama/Llama-2-70b-chat-hf
url: https://huggingface.co/meta-llama/Llama-2-70b-chat-hf
precision: float16
config:
tp: 8
dtype: auto
kv_cache_dtype: auto
max_num_batched_tokens: 4096
max_model_len: 4096
- model: Llama 3.1 8B
mad_tag: pyt_vllm_llama-3.1-8b
model_repo: meta-llama/Llama-3.1-8B-Instruct
url: https://huggingface.co/meta-llama/Llama-3.1-8B
precision: float16
config:
tp: 1
dtype: auto
kv_cache_dtype: auto
max_num_batched_tokens: 131072
max_model_len: 8192
- model: Llama 3.1 8B FP8
mad_tag: pyt_vllm_llama-3.1-8b_fp8
model_repo: amd/Llama-3.1-8B-Instruct-FP8-KV
url: https://huggingface.co/amd/Llama-3.1-8B-Instruct-FP8-KV
precision: float8
config:
tp: 1
dtype: auto
kv_cache_dtype: fp8
max_num_batched_tokens: 131072
max_model_len: 8192
- model: Llama 3.1 405B
mad_tag: pyt_vllm_llama-3.1-405b
model_repo: meta-llama/Llama-3.1-405B-Instruct
url: https://huggingface.co/meta-llama/Llama-3.1-405B-Instruct
precision: float16
config:
tp: 8
dtype: auto
kv_cache_dtype: auto
max_num_batched_tokens: 131072
max_model_len: 8192
- model: Llama 3.1 405B FP8
mad_tag: pyt_vllm_llama-3.1-405b_fp8
model_repo: amd/Llama-3.1-405B-Instruct-FP8-KV
url: https://huggingface.co/amd/Llama-3.1-405B-Instruct-FP8-KV
precision: float8
config:
tp: 8
dtype: auto
kv_cache_dtype: fp8
max_num_batched_tokens: 131072
max_model_len: 8192
- model: Llama 3.1 405B MXFP4
mad_tag: pyt_vllm_llama-3.1-405b_fp4
model_repo: amd/Llama-3.1-405B-Instruct-MXFP4-Preview
url: https://huggingface.co/amd/Llama-3.1-405B-Instruct-MXFP4-Preview
precision: float4
config:
tp: 8
dtype: auto
kv_cache_dtype: fp8
max_num_batched_tokens: 131072
max_model_len: 8192
- model: Llama 3.3 70B
mad_tag: pyt_vllm_llama-3.3-70b
model_repo: meta-llama/Llama-3.3-70B-Instruct
url: https://huggingface.co/meta-llama/Llama-3.3-70B-Instruct
precision: float16
config:
tp: 8
dtype: auto
kv_cache_dtype: auto
max_num_batched_tokens: 131072
max_model_len: 8192
- model: Llama 3.3 70B FP8
mad_tag: pyt_vllm_llama-3.3-70b_fp8
model_repo: amd/Llama-3.3-70B-Instruct-FP8-KV
url: https://huggingface.co/amd/Llama-3.3-70B-Instruct-FP8-KV
precision: float8
config:
tp: 8
dtype: auto
kv_cache_dtype: fp8
max_num_batched_tokens: 131072
max_model_len: 8192
- model: Llama 3.3 70B MXFP4
mad_tag: pyt_vllm_llama-3.3-70b_fp4
model_repo: amd/Llama-3.3-70B-Instruct-MXFP4-Preview
url: https://huggingface.co/amd/Llama-3.3-70B-Instruct-MXFP4-Preview
precision: float4
config:
tp: 8
dtype: auto
kv_cache_dtype: fp8
max_num_batched_tokens: 131072
max_model_len: 8192
- model: Llama 4 Scout 17Bx16E
mad_tag: pyt_vllm_llama-4-scout-17b-16e
model_repo: meta-llama/Llama-4-Scout-17B-16E-Instruct
url: https://huggingface.co/meta-llama/Llama-4-Scout-17B-16E-Instruct
precision: float16
config:
tp: 8
dtype: auto
kv_cache_dtype: auto
max_num_batched_tokens: 32768
max_model_len: 8192
- model: Llama 4 Maverick 17Bx128E
mad_tag: pyt_vllm_llama-4-maverick-17b-128e
model_repo: meta-llama/Llama-4-Maverick-17B-128E-Instruct
url: https://huggingface.co/meta-llama/Llama-4-Maverick-17B-128E-Instruct
precision: float16
config:
tp: 8
dtype: auto
kv_cache_dtype: auto
max_num_batched_tokens: 32768
max_model_len: 8192
- model: Llama 4 Maverick 17Bx128E FP8
mad_tag: pyt_vllm_llama-4-maverick-17b-128e_fp8
model_repo: meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8
url: https://huggingface.co/meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8
precision: float8
config:
tp: 8
dtype: auto
kv_cache_dtype: fp8
max_num_batched_tokens: 131072
max_model_len: 8192
- group: DeepSeek
tag: deepseek
models:
- model: DeepSeek R1 0528 FP8
mad_tag: pyt_vllm_deepseek-r1
model_repo: deepseek-ai/DeepSeek-R1-0528
url: https://huggingface.co/deepseek-ai/DeepSeek-R1-0528
precision: float8
config:
tp: 8
dtype: auto
kv_cache_dtype: fp8
max_num_seqs: 1024
max_num_batched_tokens: 131072
max_model_len: 8192
- group: OpenAI GPT OSS
tag: gpt-oss
models:
- model: GPT OSS 20B
mad_tag: pyt_vllm_gpt-oss-20b
model_repo: openai/gpt-oss-20b
url: https://huggingface.co/openai/gpt-oss-20b
precision: bfloat16
config:
tp: 1
dtype: auto
kv_cache_dtype: auto
max_num_batched_tokens: 8192
max_model_len: 8192
- model: GPT OSS 120B
mad_tag: pyt_vllm_gpt-oss-120b
model_repo: openai/gpt-oss-120b
url: https://huggingface.co/openai/gpt-oss-120b
precision: bfloat16
config:
tp: 8
dtype: auto
kv_cache_dtype: auto
max_num_batched_tokens: 8192
max_model_len: 8192
- group: Mistral AI
tag: mistral
models:
- model: Mixtral MoE 8x7B
mad_tag: pyt_vllm_mixtral-8x7b
model_repo: mistralai/Mixtral-8x7B-Instruct-v0.1
url: https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1
precision: float16
config:
tp: 8
dtype: auto
kv_cache_dtype: auto
max_num_batched_tokens: 32768
max_model_len: 8192
- model: Mixtral MoE 8x7B FP8
mad_tag: pyt_vllm_mixtral-8x7b_fp8
model_repo: amd/Mixtral-8x7B-Instruct-v0.1-FP8-KV
url: https://huggingface.co/amd/Mixtral-8x7B-Instruct-v0.1-FP8-KV
precision: float8
config:
tp: 8
dtype: auto
kv_cache_dtype: fp8
max_num_batched_tokens: 32768
max_model_len: 8192
- model: Mixtral MoE 8x22B
mad_tag: pyt_vllm_mixtral-8x22b
model_repo: mistralai/Mixtral-8x22B-Instruct-v0.1
url: https://huggingface.co/mistralai/Mixtral-8x22B-Instruct-v0.1
precision: float16
config:
tp: 8
dtype: auto
kv_cache_dtype: auto
max_num_batched_tokens: 65536
max_model_len: 8192
- model: Mixtral MoE 8x22B FP8
mad_tag: pyt_vllm_mixtral-8x22b_fp8
model_repo: amd/Mixtral-8x22B-Instruct-v0.1-FP8-KV
url: https://huggingface.co/amd/Mixtral-8x22B-Instruct-v0.1-FP8-KV
precision: float8
config:
tp: 8
dtype: auto
kv_cache_dtype: fp8
max_num_batched_tokens: 65536
max_model_len: 8192
- group: Qwen
tag: qwen
models:
- model: Qwen3 8B
mad_tag: pyt_vllm_qwen3-8b
model_repo: Qwen/Qwen3-8B
url: https://huggingface.co/Qwen/Qwen3-8B
precision: float16
config:
tp: 1
dtype: auto
kv_cache_dtype: auto
max_num_batched_tokens: 40960
max_model_len: 8192
- model: Qwen3 32B
mad_tag: pyt_vllm_qwen3-32b
model_repo: Qwen/Qwen3-32b
url: https://huggingface.co/Qwen/Qwen3-32B
precision: float16
config:
tp: 1
dtype: auto
kv_cache_dtype: auto
max_num_batched_tokens: 40960
max_model_len: 8192
- model: Qwen3 30B A3B
mad_tag: pyt_vllm_qwen3-30b-a3b
model_repo: Qwen/Qwen3-30B-A3B
url: https://huggingface.co/Qwen/Qwen3-30B-A3B
precision: float16
config:
tp: 1
dtype: auto
kv_cache_dtype: auto
max_num_batched_tokens: 40960
max_model_len: 8192
- model: Qwen3 30B A3B FP8
mad_tag: pyt_vllm_qwen3-30b-a3b_fp8
model_repo: Qwen/Qwen3-30B-A3B-FP8
url: https://huggingface.co/Qwen/Qwen3-30B-A3B-FP8
precision: float16
config:
tp: 1
dtype: auto
kv_cache_dtype: fp8
max_num_batched_tokens: 40960
max_model_len: 8192
- model: Qwen3 235B A22B
mad_tag: pyt_vllm_qwen3-235b-a22b
model_repo: Qwen/Qwen3-235B-A22B
url: https://huggingface.co/Qwen/Qwen3-235B-A22B
precision: float16
config:
tp: 8
dtype: auto
kv_cache_dtype: auto
max_num_batched_tokens: 40960
max_model_len: 8192
- model: Qwen3 235B A22B FP8
mad_tag: pyt_vllm_qwen3-235b-a22b_fp8
model_repo: Qwen/Qwen3-235B-A22B-FP8
url: https://huggingface.co/Qwen/Qwen3-235B-A22B-FP8
precision: float8
config:
tp: 8
dtype: auto
kv_cache_dtype: fp8
max_num_batched_tokens: 40960
max_model_len: 8192
- group: Microsoft Phi
tag: phi
models:
- model: Phi-4
mad_tag: pyt_vllm_phi-4
model_repo: microsoft/phi-4
url: https://huggingface.co/microsoft/phi-4
precision: float16
config:
tp: 1
dtype: auto
kv_cache_dtype: auto
max_num_batched_tokens: 16384
max_model_len: 8192

View File

@@ -0,0 +1,55 @@
xdit_diffusion_inference:
docker:
pull_tag: rocm/pytorch-xdit:v25.10
docker_hub_url: https://hub.docker.com/layers/rocm/pytorch-xdit/v25.10/images/sha256-d79715ff18a9470e3f907cec8a9654d6b783c63370b091446acffc0de4d7070e
ROCm: 7.9.0
components:
TheRock: 7afbe45
rccl: 9b04b2a
composable_kernel: b7a806f
rocm-libraries: f104555
rocm-systems: 25922d0
torch: 2.10.0a0+gite9c9017
torchvision: 0.22.0a0+966da7e
triton: 3.5.0+git52e49c12
accelerate: 1.11.0.dev0
aiter: 0.1.5.post4.dev20+ga25e55e79
diffusers: 0.36.0.dev0
xfuser: 0.4.4
yunchang: 0.6.3.post1
model_groups:
- group: Hunyuan Video
tag: hunyuan
models:
- model: Hunyuan Video
model_name: hunyuanvideo
model_repo: tencent/HunyuanVideo
revision: refs/pr/18
url: https://huggingface.co/tencent/HunyuanVideo
github: https://github.com/Tencent-Hunyuan/HunyuanVideo
mad_tag: pyt_xdit_hunyuanvideo
- group: Wan-AI
tag: wan
models:
- model: Wan2.1
model_name: wan2_1-i2v-14b-720p
model_repo: Wan-AI/Wan2.1-I2V-14B-720P
url: https://huggingface.co/Wan-AI/Wan2.1-I2V-14B-720P
github: https://github.com/Wan-Video/Wan2.1
mad_tag: pyt_xdit_wan_2_1
- model: Wan2.2
model_name: wan2_2-i2v-a14b
model_repo: Wan-AI/Wan2.2-I2V-A14B
url: https://huggingface.co/Wan-AI/Wan2.2-I2V-A14B
github: https://github.com/Wan-Video/Wan2.2
mad_tag: pyt_xdit_wan_2_2
- group: FLUX
tag: flux
models:
- model: FLUX.1
model_name: FLUX.1-dev
model_repo: black-forest-labs/FLUX.1-dev
url: https://huggingface.co/black-forest-labs/FLUX.1-dev
github: https://github.com/black-forest-labs/flux
mad_tag: pyt_xdit_flux

View File

@@ -0,0 +1,109 @@
xdit_diffusion_inference:
docker:
- version: v25-11
pull_tag: rocm/pytorch-xdit:v25.11
docker_hub_url: https://hub.docker.com/layers/rocm/pytorch-xdit/v25.11/images/sha256-c9fa659439bb024f854b4d5eea598347251b02c341c55f66c98110832bde4216
ROCm: 7.10.0
supported_models:
- group: Hunyuan Video
models:
- Hunyuan Video
- group: Wan-AI
models:
- Wan2.1
- Wan2.2
- group: FLUX
models:
- FLUX.1
whats_new:
- "Minor bug fixes and clarifications to READMEs."
- "Bumps TheRock, AITER, Diffusers, xDiT versions."
- "Changes Aiter rounding mode for faster gfx942 FWD Attention."
components:
TheRock: 3e3f834
rccl: d23d18f
composable_kernel: 2570462
rocm-libraries: 0588f07
rocm-systems: 473025a
torch: 73adac
torchvision: f5c6c2e
triton: 7416ffc
accelerate: 34c1779
aiter: de14bec
diffusers: 40528e9
xfuser: 83978b5
yunchang: 2c9b712
- version: v25-10
pull_tag: rocm/pytorch-xdit:v25.10
docker_hub_url: https://hub.docker.com/r/rocm/pytorch-xdit
ROCm: 7.9.0
supported_models:
- group: Hunyuan Video
models:
- Hunyuan Video
- group: Wan-AI
models:
- Wan2.1
- Wan2.2
- group: FLUX
models:
- FLUX.1
whats_new:
- "First official xDiT Docker Release for Diffusion Inference."
- "Supports gfx942 and gfx950 series (AMD Instinct™ MI300X, MI325X, MI350X, and MI355X)."
- "Support Wan 2.1, Wan 2.2, HunyuanVideo and Flux workloads."
components:
TheRock: 7afbe45
rccl: 9b04b2a
composable_kernel: b7a806f
rocm-libraries: f104555
rocm-systems: 25922d0
torch: 2.10.0a0+gite9c9017
torchvision: 0.22.0a0+966da7e
triton: 3.5.0+git52e49c12
accelerate: 1.11.0.dev0
aiter: 0.1.5.post4.dev20+ga25e55e79
diffusers: 0.36.0.dev0
xfuser: 0.4.4
yunchang: 0.6.3.post1
model_groups:
- group: Hunyuan Video
tag: hunyuan
models:
- model: Hunyuan Video
page_tag: hunyuan_tag
model_name: hunyuanvideo
model_repo: tencent/HunyuanVideo
revision: refs/pr/18
url: https://huggingface.co/tencent/HunyuanVideo
github: https://github.com/Tencent-Hunyuan/HunyuanVideo
mad_tag: pyt_xdit_hunyuanvideo
- group: Wan-AI
tag: wan
models:
- model: Wan2.1
page_tag: wan_21_tag
model_name: wan2_1-i2v-14b-720p
model_repo: Wan-AI/Wan2.1-I2V-14B-720P
url: https://huggingface.co/Wan-AI/Wan2.1-I2V-14B-720P
github: https://github.com/Wan-Video/Wan2.1
mad_tag: pyt_xdit_wan_2_1
- model: Wan2.2
page_tag: wan_22_tag
model_name: wan2_2-i2v-a14b
model_repo: Wan-AI/Wan2.2-I2V-A14B
url: https://huggingface.co/Wan-AI/Wan2.2-I2V-A14B
github: https://github.com/Wan-Video/Wan2.2
mad_tag: pyt_xdit_wan_2_2
- group: FLUX
tag: flux
models:
- model: FLUX.1
page_tag: flux_1_tag
model_name: FLUX.1-dev
model_repo: black-forest-labs/FLUX.1-dev
url: https://huggingface.co/black-forest-labs/FLUX.1-dev
github: https://github.com/black-forest-labs/flux
mad_tag: pyt_xdit_flux

View File

@@ -0,0 +1,91 @@
docker:
pull_tag: rocm/pytorch-xdit:v25.12
docker_hub_url: https://hub.docker.com/layers/rocm/pytorch-xdit/v25.12/images/sha256-e06895132316bf3c393366b70a91eaab6755902dad0100e6e2b38310547d9256
ROCm: 7.10.0
whats_new:
- "Adds T2V and TI2V support for Wan models."
- "Adds support for SD-3.5 T2I model."
components:
TheRock:
version: 3e3f834
url: https://github.com/ROCm/TheRock
rccl:
version: d23d18f
url: https://github.com/ROCm/rccl
composable_kernel:
version: 2570462
url: https://github.com/ROCm/composable_kernel
rocm-libraries:
version: 0588f07
url: https://github.com/ROCm/rocm-libraries
rocm-systems:
version: 473025a
url: https://github.com/ROCm/rocm-systems
torch:
version: 73adac
url: https://github.com/pytorch/pytorch
torchvision:
version: f5c6c2e
url: https://github.com/pytorch/vision
triton:
version: 7416ffc
url: https://github.com/triton-lang/triton
accelerate:
version: 34c1779
url: https://github.com/huggingface/accelerate
aiter:
version: de14bec
url: https://github.com/ROCm/aiter
diffusers:
version: 40528e9
url: https://github.com/huggingface/diffusers
xfuser:
version: ccba9d5
url: https://github.com/xdit-project/xDiT
yunchang:
version: 2c9b712
url: https://github.com/feifeibear/long-context-attention
supported_models:
- group: Hunyuan Video
js_tag: hunyuan
models:
- model: Hunyuan Video
model_repo: tencent/HunyuanVideo
revision: refs/pr/18
url: https://huggingface.co/tencent/HunyuanVideo
github: https://github.com/Tencent-Hunyuan/HunyuanVideo
mad_tag: pyt_xdit_hunyuanvideo
js_tag: hunyuan_tag
- group: Wan-AI
js_tag: wan
models:
- model: Wan2.1
model_repo: Wan-AI/Wan2.1-I2V-14B-720P-Diffusers
url: https://huggingface.co/Wan-AI/Wan2.1-I2V-14B-720P-Diffusers
github: https://github.com/Wan-Video/Wan2.1
mad_tag: pyt_xdit_wan_2_1
js_tag: wan_21_tag
- model: Wan2.2
model_repo: Wan-AI/Wan2.2-I2V-A14B-Diffusers
url: https://huggingface.co/Wan-AI/Wan2.2-I2V-A14B-Diffusers
github: https://github.com/Wan-Video/Wan2.2
mad_tag: pyt_xdit_wan_2_2
js_tag: wan_22_tag
- group: FLUX
js_tag: flux
models:
- model: FLUX.1
model_repo: black-forest-labs/FLUX.1-dev
url: https://huggingface.co/black-forest-labs/FLUX.1-dev
github: https://github.com/black-forest-labs/flux
mad_tag: pyt_xdit_flux
js_tag: flux_1_tag
- group: Stable Diffusion
js_tag: stablediffusion
models:
- model: stable-diffusion-3.5-large
model_repo: stabilityai/stable-diffusion-3.5-large
url: https://huggingface.co/stabilityai/stable-diffusion-3.5-large
github: https://github.com/Stability-AI/sd3.5
mad_tag: pyt_xdit_sd_3_5
js_tag: stable_diffusion_3_5_large_tag

View File

@@ -0,0 +1,32 @@
dockers:
- pull_tag: lmsysorg/sglang:v0.5.2rc1-rocm700-mi30x
docker_hub_url: https://hub.docker.com/layers/lmsysorg/sglang/v0.5.2rc1-rocm700-mi30x/images/sha256-10c4ee502ddba44dd8c13325e6e03868bfe7f43d23d0a44780a8ee8b393f4729
components:
ROCm: 7.0.0
SGLang: v0.5.2rc1
pytorch-triton-rocm: 3.4.0+rocm7.0.0.gitf9e5bf54
model_groups:
- group: Dense models
tag: dense-models
models:
- model: Llama 3.1 8B Instruct
model_repo: Llama-3.1-8B-Instruct
url: https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct
- model: Llama 3.1 405B FP8 KV
model_repo: Llama-3.1-405B-Instruct-FP8-KV
url: https://huggingface.co/amd/Llama-3.1-405B-Instruct-FP8-KV
- model: Llama 3.3 70B FP8 KV
model_repo: amd-Llama-3.3-70B-Instruct-FP8-KV
url: https://huggingface.co/amd/Llama-3.3-70B-Instruct-FP8-KV
- model: Qwen3 32B
model_repo: Qwen3-32B
url: https://huggingface.co/Qwen/Qwen3-32B
- group: Small experts models
tag: small-experts-models
models:
- model: DeepSeek V3
model_repo: DeepSeek-V3
url: https://huggingface.co/deepseek-ai/DeepSeek-V3
- model: Mixtral 8x7B v0.1
model_repo: Mixtral-8x7B-v0.1
url: https://huggingface.co/mistralai/Mixtral-8x7B-v0.1

View File

@@ -1,188 +1,316 @@
dockers:
- pull_tag: rocm/vllm:rocm6.4.1_vllm_0.10.1_20250909
docker_hub_url: https://hub.docker.com/layers/rocm/vllm/rocm6.4.1_vllm_0.10.1_20250909/images/sha256-1113268572e26d59b205792047bea0e61e018e79aeadceba118b7bf23cb3715c
- pull_tag: rocm/vllm:rocm7.0.0_vllm_0.11.2_20251210
docker_hub_url: https://hub.docker.com/layers/rocm/vllm/rocm7.0.0_vllm_0.11.2_20251210/images/sha256-e7f02dd2ce3824959658bc0391296f6158638e3ebce164f6c019c4eca8150ec7
components:
ROCm: 6.4.1
vLLM: 0.10.1 (0.10.1rc2.dev409+g0b6bf6691.rocm641)
PyTorch: 2.7.0+gitf717b2a
hipBLASLt: 0.15
ROCm: 7.0.0
vLLM: 0.11.2 (0.11.2.dev673+g839868462.rocm700)
PyTorch: 2.9.0a0+git1c57644
hipBLASLt: 1.0.0
dockerfile:
commit: 8398684622109c806a35d660647060b0b9910663
model_groups:
- group: Meta Llama
tag: llama
models:
- model: Llama 3.1 8B
mad_tag: pyt_vllm_llama-3.1-8b
model_repo: meta-llama/Llama-3.1-8B-Instruct
url: https://huggingface.co/meta-llama/Llama-3.1-8B
precision: float16
config:
tp: 1
dtype: auto
kv_cache_dtype: auto
max_seq_len_to_capture: 131072
max_num_batched_tokens: 131072
max_model_len: 8192
- model: Llama 3.1 70B
mad_tag: pyt_vllm_llama-3.1-70b
model_repo: meta-llama/Llama-3.1-70B-Instruct
url: https://huggingface.co/meta-llama/Llama-3.1-70B-Instruct
precision: float16
config:
tp: 8
dtype: auto
kv_cache_dtype: auto
max_seq_len_to_capture: 131072
max_num_batched_tokens: 131072
max_model_len: 8192
- model: Llama 3.1 405B
mad_tag: pyt_vllm_llama-3.1-405b
model_repo: meta-llama/Llama-3.1-405B-Instruct
url: https://huggingface.co/meta-llama/Llama-3.1-405B-Instruct
precision: float16
config:
tp: 8
dtype: auto
kv_cache_dtype: auto
max_seq_len_to_capture: 131072
max_num_batched_tokens: 131072
max_model_len: 8192
- model: Llama 2 70B
mad_tag: pyt_vllm_llama-2-70b
model_repo: meta-llama/Llama-2-70b-chat-hf
url: https://huggingface.co/meta-llama/Llama-2-70b-chat-hf
precision: float16
config:
tp: 8
dtype: auto
kv_cache_dtype: auto
max_seq_len_to_capture: 4096
max_num_batched_tokens: 4096
max_model_len: 4096
- model: Llama 3.1 8B FP8
mad_tag: pyt_vllm_llama-3.1-8b_fp8
model_repo: amd/Llama-3.1-8B-Instruct-FP8-KV
url: https://huggingface.co/amd/Llama-3.1-8B-Instruct-FP8-KV
precision: float8
config:
tp: 1
dtype: auto
kv_cache_dtype: fp8
max_seq_len_to_capture: 131072
max_num_batched_tokens: 131072
max_model_len: 8192
- model: Llama 3.1 70B FP8
mad_tag: pyt_vllm_llama-3.1-70b_fp8
model_repo: amd/Llama-3.1-70B-Instruct-FP8-KV
url: https://huggingface.co/amd/Llama-3.1-70B-Instruct-FP8-KV
precision: float8
config:
tp: 8
dtype: auto
kv_cache_dtype: fp8
max_seq_len_to_capture: 131072
max_num_batched_tokens: 131072
max_model_len: 8192
- model: Llama 3.1 405B FP8
mad_tag: pyt_vllm_llama-3.1-405b_fp8
model_repo: amd/Llama-3.1-405B-Instruct-FP8-KV
url: https://huggingface.co/amd/Llama-3.1-405B-Instruct-FP8-KV
precision: float8
config:
tp: 8
dtype: auto
kv_cache_dtype: fp8
max_seq_len_to_capture: 131072
max_num_batched_tokens: 131072
max_model_len: 8192
- model: Llama 2 70B
mad_tag: pyt_vllm_llama-2-70b
model_repo: meta-llama/Llama-2-70b-chat-hf
url: https://huggingface.co/meta-llama/Llama-2-70b-chat-hf
precision: float16
config:
tp: 8
dtype: auto
kv_cache_dtype: auto
max_num_batched_tokens: 4096
max_model_len: 4096
- model: Llama 3.1 8B
mad_tag: pyt_vllm_llama-3.1-8b
model_repo: meta-llama/Llama-3.1-8B-Instruct
url: https://huggingface.co/meta-llama/Llama-3.1-8B
precision: float16
config:
tp: 1
dtype: auto
kv_cache_dtype: auto
max_num_batched_tokens: 131072
max_model_len: 8192
- model: Llama 3.1 8B FP8
mad_tag: pyt_vllm_llama-3.1-8b_fp8
model_repo: amd/Llama-3.1-8B-Instruct-FP8-KV
url: https://huggingface.co/amd/Llama-3.1-8B-Instruct-FP8-KV
precision: float8
config:
tp: 1
dtype: auto
kv_cache_dtype: fp8
max_num_batched_tokens: 131072
max_model_len: 8192
- model: Llama 3.1 405B
mad_tag: pyt_vllm_llama-3.1-405b
model_repo: meta-llama/Llama-3.1-405B-Instruct
url: https://huggingface.co/meta-llama/Llama-3.1-405B-Instruct
precision: float16
config:
tp: 8
dtype: auto
kv_cache_dtype: auto
max_num_batched_tokens: 131072
max_model_len: 8192
- model: Llama 3.1 405B FP8
mad_tag: pyt_vllm_llama-3.1-405b_fp8
model_repo: amd/Llama-3.1-405B-Instruct-FP8-KV
url: https://huggingface.co/amd/Llama-3.1-405B-Instruct-FP8-KV
precision: float8
config:
tp: 8
dtype: auto
kv_cache_dtype: fp8
max_num_batched_tokens: 131072
max_model_len: 8192
- model: Llama 3.1 405B MXFP4
mad_tag: pyt_vllm_llama-3.1-405b_fp4
model_repo: amd/Llama-3.1-405B-Instruct-MXFP4-Preview
url: https://huggingface.co/amd/Llama-3.1-405B-Instruct-MXFP4-Preview
precision: float4
config:
tp: 8
dtype: auto
kv_cache_dtype: fp8
max_num_batched_tokens: 131072
max_model_len: 8192
- model: Llama 3.3 70B
mad_tag: pyt_vllm_llama-3.3-70b
model_repo: meta-llama/Llama-3.3-70B-Instruct
url: https://huggingface.co/meta-llama/Llama-3.3-70B-Instruct
precision: float16
config:
tp: 8
dtype: auto
kv_cache_dtype: auto
max_num_batched_tokens: 131072
max_model_len: 8192
- model: Llama 3.3 70B FP8
mad_tag: pyt_vllm_llama-3.3-70b_fp8
model_repo: amd/Llama-3.3-70B-Instruct-FP8-KV
url: https://huggingface.co/amd/Llama-3.3-70B-Instruct-FP8-KV
precision: float8
config:
tp: 8
dtype: auto
kv_cache_dtype: fp8
max_num_batched_tokens: 131072
max_model_len: 8192
- model: Llama 3.3 70B MXFP4
mad_tag: pyt_vllm_llama-3.3-70b_fp4
model_repo: amd/Llama-3.3-70B-Instruct-MXFP4-Preview
url: https://huggingface.co/amd/Llama-3.3-70B-Instruct-MXFP4-Preview
precision: float4
config:
tp: 8
dtype: auto
kv_cache_dtype: fp8
max_num_batched_tokens: 131072
max_model_len: 8192
- model: Llama 4 Scout 17Bx16E
mad_tag: pyt_vllm_llama-4-scout-17b-16e
model_repo: meta-llama/Llama-4-Scout-17B-16E-Instruct
url: https://huggingface.co/meta-llama/Llama-4-Scout-17B-16E-Instruct
precision: float16
config:
tp: 8
dtype: auto
kv_cache_dtype: auto
max_num_batched_tokens: 32768
max_model_len: 8192
- model: Llama 4 Maverick 17Bx128E
mad_tag: pyt_vllm_llama-4-maverick-17b-128e
model_repo: meta-llama/Llama-4-Maverick-17B-128E-Instruct
url: https://huggingface.co/meta-llama/Llama-4-Maverick-17B-128E-Instruct
precision: float16
config:
tp: 8
dtype: auto
kv_cache_dtype: auto
max_num_batched_tokens: 32768
max_model_len: 8192
- model: Llama 4 Maverick 17Bx128E FP8
mad_tag: pyt_vllm_llama-4-maverick-17b-128e_fp8
model_repo: meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8
url: https://huggingface.co/meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8
precision: float8
config:
tp: 8
dtype: auto
kv_cache_dtype: fp8
max_num_batched_tokens: 131072
max_model_len: 8192
- group: DeepSeek
tag: deepseek
models:
- model: DeepSeek R1 0528 FP8
mad_tag: pyt_vllm_deepseek-r1
model_repo: deepseek-ai/DeepSeek-R1-0528
url: https://huggingface.co/deepseek-ai/DeepSeek-R1-0528
precision: float8
config:
tp: 8
dtype: auto
kv_cache_dtype: fp8
max_num_seqs: 1024
max_num_batched_tokens: 131072
max_model_len: 8192
- group: OpenAI GPT OSS
tag: gpt-oss
models:
- model: GPT OSS 20B
mad_tag: pyt_vllm_gpt-oss-20b
model_repo: openai/gpt-oss-20b
url: https://huggingface.co/openai/gpt-oss-20b
precision: bfloat16
config:
tp: 1
dtype: auto
kv_cache_dtype: auto
max_num_batched_tokens: 8192
max_model_len: 8192
- model: GPT OSS 120B
mad_tag: pyt_vllm_gpt-oss-120b
model_repo: openai/gpt-oss-120b
url: https://huggingface.co/openai/gpt-oss-120b
precision: bfloat16
config:
tp: 8
dtype: auto
kv_cache_dtype: auto
max_num_batched_tokens: 8192
max_model_len: 8192
- group: Mistral AI
tag: mistral
models:
- model: Mixtral MoE 8x7B
mad_tag: pyt_vllm_mixtral-8x7b
model_repo: mistralai/Mixtral-8x7B-Instruct-v0.1
url: https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1
precision: float16
config:
tp: 8
dtype: auto
kv_cache_dtype: auto
max_seq_len_to_capture: 32768
max_num_batched_tokens: 32768
max_model_len: 8192
- model: Mixtral MoE 8x22B
mad_tag: pyt_vllm_mixtral-8x22b
model_repo: mistralai/Mixtral-8x22B-Instruct-v0.1
url: https://huggingface.co/mistralai/Mixtral-8x22B-Instruct-v0.1
precision: float16
config:
tp: 8
dtype: auto
kv_cache_dtype: auto
max_seq_len_to_capture: 65536
max_num_batched_tokens: 65536
max_model_len: 8192
- model: Mixtral MoE 8x7B FP8
mad_tag: pyt_vllm_mixtral-8x7b_fp8
model_repo: amd/Mixtral-8x7B-Instruct-v0.1-FP8-KV
url: https://huggingface.co/amd/Mixtral-8x7B-Instruct-v0.1-FP8-KV
precision: float8
config:
tp: 8
dtype: auto
kv_cache_dtype: fp8
max_seq_len_to_capture: 32768
max_num_batched_tokens: 32768
max_model_len: 8192
- model: Mixtral MoE 8x22B FP8
mad_tag: pyt_vllm_mixtral-8x22b_fp8
model_repo: amd/Mixtral-8x22B-Instruct-v0.1-FP8-KV
url: https://huggingface.co/amd/Mixtral-8x22B-Instruct-v0.1-FP8-KV
precision: float8
config:
tp: 8
dtype: auto
kv_cache_dtype: fp8
max_seq_len_to_capture: 65536
max_num_batched_tokens: 65536
max_model_len: 8192
- model: Mixtral MoE 8x7B
mad_tag: pyt_vllm_mixtral-8x7b
model_repo: mistralai/Mixtral-8x7B-Instruct-v0.1
url: https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1
precision: float16
config:
tp: 8
dtype: auto
kv_cache_dtype: auto
max_num_batched_tokens: 32768
max_model_len: 8192
- model: Mixtral MoE 8x7B FP8
mad_tag: pyt_vllm_mixtral-8x7b_fp8
model_repo: amd/Mixtral-8x7B-Instruct-v0.1-FP8-KV
url: https://huggingface.co/amd/Mixtral-8x7B-Instruct-v0.1-FP8-KV
precision: float8
config:
tp: 8
dtype: auto
kv_cache_dtype: fp8
max_num_batched_tokens: 32768
max_model_len: 8192
- model: Mixtral MoE 8x22B
mad_tag: pyt_vllm_mixtral-8x22b
model_repo: mistralai/Mixtral-8x22B-Instruct-v0.1
url: https://huggingface.co/mistralai/Mixtral-8x22B-Instruct-v0.1
precision: float16
config:
tp: 8
dtype: auto
kv_cache_dtype: auto
max_num_batched_tokens: 65536
max_model_len: 8192
- model: Mixtral MoE 8x22B FP8
mad_tag: pyt_vllm_mixtral-8x22b_fp8
model_repo: amd/Mixtral-8x22B-Instruct-v0.1-FP8-KV
url: https://huggingface.co/amd/Mixtral-8x22B-Instruct-v0.1-FP8-KV
precision: float8
config:
tp: 8
dtype: auto
kv_cache_dtype: fp8
max_num_batched_tokens: 65536
max_model_len: 8192
- group: Qwen
tag: qwen
models:
- model: QwQ-32B
mad_tag: pyt_vllm_qwq-32b
model_repo: Qwen/QwQ-32B
url: https://huggingface.co/Qwen/QwQ-32B
precision: float16
config:
tp: 1
dtype: auto
kv_cache_dtype: auto
max_seq_len_to_capture: 131072
max_num_batched_tokens: 131072
max_model_len: 8192
- model: Qwen3 30B A3B
mad_tag: pyt_vllm_qwen3-30b-a3b
model_repo: Qwen/Qwen3-30B-A3B
url: https://huggingface.co/Qwen/Qwen3-30B-A3B
precision: float16
config:
tp: 1
dtype: auto
kv_cache_dtype: auto
max_seq_len_to_capture: 32768
max_num_batched_tokens: 32768
max_model_len: 8192
- model: Qwen3 8B
mad_tag: pyt_vllm_qwen3-8b
model_repo: Qwen/Qwen3-8B
url: https://huggingface.co/Qwen/Qwen3-8B
precision: float16
config:
tp: 1
dtype: auto
kv_cache_dtype: auto
max_num_batched_tokens: 40960
max_model_len: 8192
- model: Qwen3 32B
mad_tag: pyt_vllm_qwen3-32b
model_repo: Qwen/Qwen3-32b
url: https://huggingface.co/Qwen/Qwen3-32B
precision: float16
config:
tp: 1
dtype: auto
kv_cache_dtype: auto
max_num_batched_tokens: 40960
max_model_len: 8192
- model: Qwen3 30B A3B
mad_tag: pyt_vllm_qwen3-30b-a3b
model_repo: Qwen/Qwen3-30B-A3B
url: https://huggingface.co/Qwen/Qwen3-30B-A3B
precision: float16
config:
tp: 1
dtype: auto
kv_cache_dtype: auto
max_num_batched_tokens: 40960
max_model_len: 8192
- model: Qwen3 30B A3B FP8
mad_tag: pyt_vllm_qwen3-30b-a3b_fp8
model_repo: Qwen/Qwen3-30B-A3B-FP8
url: https://huggingface.co/Qwen/Qwen3-30B-A3B-FP8
precision: float16
config:
tp: 1
dtype: auto
kv_cache_dtype: fp8
max_num_batched_tokens: 40960
max_model_len: 8192
- model: Qwen3 235B A22B
mad_tag: pyt_vllm_qwen3-235b-a22b
model_repo: Qwen/Qwen3-235B-A22B
url: https://huggingface.co/Qwen/Qwen3-235B-A22B
precision: float16
config:
tp: 8
dtype: auto
kv_cache_dtype: auto
max_num_batched_tokens: 40960
max_model_len: 8192
- model: Qwen3 235B A22B FP8
mad_tag: pyt_vllm_qwen3-235b-a22b_fp8
model_repo: Qwen/Qwen3-235B-A22B-FP8
url: https://huggingface.co/Qwen/Qwen3-235B-A22B-FP8
precision: float8
config:
tp: 8
dtype: auto
kv_cache_dtype: fp8
max_num_batched_tokens: 40960
max_model_len: 8192
- group: Microsoft Phi
tag: phi
models:
- model: Phi-4
mad_tag: pyt_vllm_phi-4
model_repo: microsoft/phi-4
url: https://huggingface.co/microsoft/phi-4
config:
tp: 1
dtype: auto
kv_cache_dtype: auto
max_seq_len_to_capture: 16384
max_num_batched_tokens: 16384
max_model_len: 8192
- model: Phi-4
mad_tag: pyt_vllm_phi-4
model_repo: microsoft/phi-4
url: https://huggingface.co/microsoft/phi-4
precision: float16
config:
tp: 1
dtype: auto
kv_cache_dtype: auto
max_num_batched_tokens: 16384
max_model_len: 8192

View File

@@ -0,0 +1,105 @@
docker:
pull_tag: rocm/pytorch-xdit:v25.13
docker_hub_url: https://hub.docker.com/layers/rocm/pytorch-xdit/v25.13/images/sha256-81954713070d67bde08595e03f62110c8a3dd66a9ae17a77d611e01f83f0f4ef
ROCm: 7.11.0
whats_new:
- "Flux.1 Kontext support"
- "Flux.2 Dev support"
- "Flux FP8 GEMM support"
- "Hybrid FP8 attention support for Wan models"
components:
TheRock:
version: 1728a81
url: https://github.com/ROCm/TheRock
rccl:
version: d23d18f
url: https://github.com/ROCm/rccl
composable_kernel:
version: ab0101c
url: https://github.com/ROCm/composable_kernel
rocm-libraries:
version: a2f7c35
url: https://github.com/ROCm/rocm-libraries
rocm-systems:
version: 659737c
url: https://github.com/ROCm/rocm-systems
torch:
version: 91be249
url: https://github.com/ROCm/pytorch
torchvision:
version: b919bd0
url: https://github.com/pytorch/vision
triton:
version: a272dfa
url: https://github.com/ROCm/triton
accelerate:
version: b521400f
url: https://github.com/huggingface/accelerate
aiter:
version: de14bec0
url: https://github.com/ROCm/aiter
diffusers:
version: a1f36ee3e
url: https://github.com/huggingface/diffusers
xfuser:
version: adf2681
url: https://github.com/xdit-project/xDiT
yunchang:
version: 2c9b712
url: https://github.com/feifeibear/long-context-attention
supported_models:
- group: Hunyuan Video
js_tag: hunyuan
models:
- model: Hunyuan Video
model_repo: tencent/HunyuanVideo
revision: refs/pr/18
url: https://huggingface.co/tencent/HunyuanVideo
github: https://github.com/Tencent-Hunyuan/HunyuanVideo
mad_tag: pyt_xdit_hunyuanvideo
js_tag: hunyuan_tag
- group: Wan-AI
js_tag: wan
models:
- model: Wan2.1
model_repo: Wan-AI/Wan2.1-I2V-14B-720P-Diffusers
url: https://huggingface.co/Wan-AI/Wan2.1-I2V-14B-720P-Diffusers
github: https://github.com/Wan-Video/Wan2.1
mad_tag: pyt_xdit_wan_2_1
js_tag: wan_21_tag
- model: Wan2.2
model_repo: Wan-AI/Wan2.2-I2V-A14B-Diffusers
url: https://huggingface.co/Wan-AI/Wan2.2-I2V-A14B-Diffusers
github: https://github.com/Wan-Video/Wan2.2
mad_tag: pyt_xdit_wan_2_2
js_tag: wan_22_tag
- group: FLUX
js_tag: flux
models:
- model: FLUX.1
model_repo: black-forest-labs/FLUX.1-dev
url: https://huggingface.co/black-forest-labs/FLUX.1-dev
github: https://github.com/black-forest-labs/flux
mad_tag: pyt_xdit_flux
js_tag: flux_1_tag
- model: FLUX.1 Kontext
model_repo: black-forest-labs/FLUX.1-Kontext-dev
url: https://huggingface.co/black-forest-labs/FLUX.1-Kontext-dev
github: https://github.com/black-forest-labs/flux
mad_tag: pyt_xdit_flux_kontext
js_tag: flux_1_kontext_tag
- model: FLUX.2
model_repo: black-forest-labs/FLUX.2-dev
url: https://huggingface.co/black-forest-labs/FLUX.2-dev
github: https://github.com/black-forest-labs/flux2
mad_tag: pyt_xdit_flux_2
js_tag: flux_2_tag
- group: StableDiffusion
js_tag: stablediffusion
models:
- model: stable-diffusion-3.5-large
model_repo: stabilityai/stable-diffusion-3.5-large
url: https://huggingface.co/stabilityai/stable-diffusion-3.5-large
github: https://github.com/Stability-AI/sd3.5
mad_tag: pyt_xdit_sd_3_5
js_tag: stable_diffusion_3_5_large_tag

View File

@@ -1,47 +1,16 @@
dockers:
- pull_tag: rocm/jax-training:maxtext-v25.7
docker_hub_url: https://hub.docker.com/layers/rocm/jax-training/maxtext-v25.7/images/sha256-45f4c727d4019a63fc47313d3a5f5a5105569539294ddfd2d742218212ae9025
- pull_tag: rocm/jax-training:maxtext-v25.11
docker_hub_url: https://hub.docker.com/layers/rocm/jax-training/maxtext-v25.11/images/sha256-18e4d8f0b8ce7a7422c58046940dd5f32249960449fca09a562b65fb8eb1562a
components:
ROCm: 6.4.1
JAX: 0.5.0
Python: 3.10.12
Transformer Engine: 2.1.0+90d703dd
hipBLASLt: 1.x.x
- pull_tag: rocm/jax-training:maxtext-v25.7-jax060
docker_hub_url: https://hub.docker.com/layers/rocm/jax-training/maxtext-v25.7/images/sha256-45f4c727d4019a63fc47313d3a5f5a5105569539294ddfd2d742218212ae9025
components:
ROCm: 6.4.1
JAX: 0.6.0
Python: 3.10.12
Transformer Engine: 2.1.0+90d703dd
hipBLASLt: 1.1.0-499ece1c21
ROCm: 7.1.0
JAX: 0.7.1
Python: 3.12
Transformer Engine: 2.4.0.dev0+281042de
hipBLASLt: 1.2.x
model_groups:
- group: Meta Llama
tag: llama
models:
- model: Llama 3.3 70B
mad_tag: jax_maxtext_train_llama-3.3-70b
model_repo: Llama-3.3-70B
precision: bf16
doc_options: ["single-node"]
- model: Llama 3.1 8B
mad_tag: jax_maxtext_train_llama-3.1-8b
model_repo: Llama-3.1-8B
precision: bf16
doc_options: ["single-node"]
- model: Llama 3.1 70B
mad_tag: jax_maxtext_train_llama-3.1-70b
model_repo: Llama-3.1-70B
precision: bf16
doc_options: ["single-node"]
- model: Llama 3 8B
mad_tag: jax_maxtext_train_llama-3-8b
multinode_training_script: llama3_8b_multinode.sh
doc_options: ["multi-node"]
- model: Llama 3 70B
mad_tag: jax_maxtext_train_llama-3-70b
multinode_training_script: llama3_70b_multinode.sh
doc_options: ["multi-node"]
- model: Llama 2 7B
mad_tag: jax_maxtext_train_llama-2-7b
model_repo: Llama-2-7B
@@ -54,6 +23,29 @@ model_groups:
precision: bf16
multinode_training_script: llama2_70b_multinode.sh
doc_options: ["single-node", "multi-node"]
- model: Llama 3 8B (multi-node)
mad_tag: jax_maxtext_train_llama-3-8b
multinode_training_script: llama3_8b_multinode.sh
doc_options: ["multi-node"]
- model: Llama 3 70B (multi-node)
mad_tag: jax_maxtext_train_llama-3-70b
multinode_training_script: llama3_70b_multinode.sh
doc_options: ["multi-node"]
- model: Llama 3.1 8B
mad_tag: jax_maxtext_train_llama-3.1-8b
model_repo: Llama-3.1-8B
precision: bf16
doc_options: ["single-node"]
- model: Llama 3.1 70B
mad_tag: jax_maxtext_train_llama-3.1-70b
model_repo: Llama-3.1-70B
precision: bf16
doc_options: ["single-node"]
- model: Llama 3.3 70B
mad_tag: jax_maxtext_train_llama-3.3-70b
model_repo: Llama-3.3-70B
precision: bf16
doc_options: ["single-node"]
- group: DeepSeek
tag: deepseek
models:

View File

@@ -1,15 +1,17 @@
dockers:
- pull_tag: rocm/megatron-lm:v25.7_py310
docker_hub_url: https://hub.docker.com/layers/rocm/megatron-lm/v25.7_py310/images/sha256-6189df849feeeee3ae31bb1e97aef5006d69d2b90c134e97708c19632e20ab5a
components:
ROCm: 6.4.2
Primus: v0.1.0-rc1
PyTorch: 2.8.0a0+gitd06a406
Python: "3.10"
Transformer Engine: 2.1.0.dev0+ba586519
hipBLASLt: 37ba1d36
Triton: 3.3.0
RCCL: 2.22.3
docker:
pull_tag: rocm/primus:v25.10
docker_hub_url: https://hub.docker.com/layers/rocm/primus/v25.10/images/sha256-140c37cd2eeeb183759b9622543fc03cc210dc97cbfa18eeefdcbda84420c197
components:
ROCm: 7.1.0
Primus: 0.3.0
Primus Turbo: 0.1.1
PyTorch: 2.10.0.dev20251112+rocm7.1
Python: "3.10"
Transformer Engine: 2.4.0.dev0+32e2d1d4
Flash Attention: 2.8.3
hipBLASLt: 1.2.0-09ab7153e2
Triton: 3.4.0
RCCL: 2.27.7
model_groups:
- group: Meta Llama
tag: llama
@@ -20,8 +22,6 @@ model_groups:
mad_tag: pyt_megatron_lm_train_llama-3.1-8b
- model: Llama 3.1 70B
mad_tag: pyt_megatron_lm_train_llama-3.1-70b
- model: Llama 3.1 70B (proxy)
mad_tag: pyt_megatron_lm_train_llama-3.1-70b-proxy
- model: Llama 2 7B
mad_tag: pyt_megatron_lm_train_llama-2-7b
- model: Llama 2 70B

View File

@@ -0,0 +1,72 @@
dockers:
- pull_tag: rocm/jax-training:maxtext-v25.7-jax060
docker_hub_url: https://hub.docker.com/layers/rocm/jax-training/maxtext-v25.7/images/sha256-45f4c727d4019a63fc47313d3a5f5a5105569539294ddfd2d742218212ae9025
components:
ROCm: 6.4.1
JAX: 0.6.0
Python: 3.10.12
Transformer Engine: 2.1.0+90d703dd
hipBLASLt: 1.1.0-499ece1c21
- pull_tag: rocm/jax-training:maxtext-v25.7
docker_hub_url: https://hub.docker.com/layers/rocm/jax-training/maxtext-v25.7/images/sha256-45f4c727d4019a63fc47313d3a5f5a5105569539294ddfd2d742218212ae9025
components:
ROCm: 6.4.1
JAX: 0.5.0
Python: 3.10.12
Transformer Engine: 2.1.0+90d703dd
hipBLASLt: 1.x.x
model_groups:
- group: Meta Llama
tag: llama
models:
- model: Llama 3.3 70B
mad_tag: jax_maxtext_train_llama-3.3-70b
model_repo: Llama-3.3-70B
precision: bf16
doc_options: ["single-node"]
- model: Llama 3.1 8B
mad_tag: jax_maxtext_train_llama-3.1-8b
model_repo: Llama-3.1-8B
precision: bf16
doc_options: ["single-node"]
- model: Llama 3.1 70B
mad_tag: jax_maxtext_train_llama-3.1-70b
model_repo: Llama-3.1-70B
precision: bf16
doc_options: ["single-node"]
- model: Llama 3 8B
mad_tag: jax_maxtext_train_llama-3-8b
multinode_training_script: llama3_8b_multinode.sh
doc_options: ["multi-node"]
- model: Llama 3 70B
mad_tag: jax_maxtext_train_llama-3-70b
multinode_training_script: llama3_70b_multinode.sh
doc_options: ["multi-node"]
- model: Llama 2 7B
mad_tag: jax_maxtext_train_llama-2-7b
model_repo: Llama-2-7B
precision: bf16
multinode_training_script: llama2_7b_multinode.sh
doc_options: ["single-node", "multi-node"]
- model: Llama 2 70B
mad_tag: jax_maxtext_train_llama-2-70b
model_repo: Llama-2-70B
precision: bf16
multinode_training_script: llama2_70b_multinode.sh
doc_options: ["single-node", "multi-node"]
- group: DeepSeek
tag: deepseek
models:
- model: DeepSeek-V2-Lite (16B)
mad_tag: jax_maxtext_train_deepseek-v2-lite-16b
model_repo: DeepSeek-V2-lite
precision: bf16
doc_options: ["single-node"]
- group: Mistral AI
tag: mistral
models:
- model: Mixtral 8x7B
mad_tag: jax_maxtext_train_mixtral-8x7b
model_repo: Mixtral-8x7B
precision: bf16
doc_options: ["single-node"]

View File

@@ -0,0 +1,64 @@
dockers:
- pull_tag: rocm/jax-training:maxtext-v25.9.1
docker_hub_url: https://hub.docker.com/layers/rocm/jax-training/maxtext-v25.9.1/images/sha256-60946cfbd470f6ee361fc9da740233a4fb2e892727f01719145b1f7627a1cff6
components:
ROCm: 7.0.0
JAX: 0.6.2
Python: 3.10.18
Transformer Engine: 2.2.0.dev0+c91bac54
hipBLASLt: 1.x.x
model_groups:
- group: Meta Llama
tag: llama
models:
- model: Llama 2 7B
mad_tag: jax_maxtext_train_llama-2-7b
model_repo: Llama-2-7B
precision: bf16
multinode_training_script: llama2_7b_multinode.sh
doc_options: ["single-node", "multi-node"]
- model: Llama 2 70B
mad_tag: jax_maxtext_train_llama-2-70b
model_repo: Llama-2-70B
precision: bf16
multinode_training_script: llama2_70b_multinode.sh
doc_options: ["single-node", "multi-node"]
- model: Llama 3 8B (multi-node)
mad_tag: jax_maxtext_train_llama-3-8b
multinode_training_script: llama3_8b_multinode.sh
doc_options: ["multi-node"]
- model: Llama 3 70B (multi-node)
mad_tag: jax_maxtext_train_llama-3-70b
multinode_training_script: llama3_70b_multinode.sh
doc_options: ["multi-node"]
- model: Llama 3.1 8B
mad_tag: jax_maxtext_train_llama-3.1-8b
model_repo: Llama-3.1-8B
precision: bf16
doc_options: ["single-node"]
- model: Llama 3.1 70B
mad_tag: jax_maxtext_train_llama-3.1-70b
model_repo: Llama-3.1-70B
precision: bf16
doc_options: ["single-node"]
- model: Llama 3.3 70B
mad_tag: jax_maxtext_train_llama-3.3-70b
model_repo: Llama-3.3-70B
precision: bf16
doc_options: ["single-node"]
- group: DeepSeek
tag: deepseek
models:
- model: DeepSeek-V2-Lite (16B)
mad_tag: jax_maxtext_train_deepseek-v2-lite-16b
model_repo: DeepSeek-V2-lite
precision: bf16
doc_options: ["single-node"]
- group: Mistral AI
tag: mistral
models:
- model: Mixtral 8x7B
mad_tag: jax_maxtext_train_mixtral-8x7b
model_repo: Mixtral-8x7B
precision: bf16
doc_options: ["single-node"]

View File

@@ -0,0 +1,49 @@
docker:
pull_tag: rocm/primus:v25.10
docker_hub_url: https://hub.docker.com/layers/rocm/primus/v25.10/images/sha256-140c37cd2eeeb183759b9622543fc03cc210dc97cbfa18eeefdcbda84420c197
components:
ROCm: 7.1.0
Primus: 0.3.0
Primus Turbo: 0.1.1
PyTorch: 2.10.0.dev20251112+rocm7.1
Python: "3.10"
Transformer Engine: 2.4.0.dev0+32e2d1d4
Flash Attention: 2.8.3
hipBLASLt: 1.2.0-09ab7153e2
Triton: 3.4.0
RCCL: 2.27.7
model_groups:
- group: Meta Llama
tag: llama
models:
- model: Llama 3.3 70B
mad_tag: pyt_megatron_lm_train_llama-3.3-70b
- model: Llama 3.1 8B
mad_tag: pyt_megatron_lm_train_llama-3.1-8b
- model: Llama 3.1 70B
mad_tag: pyt_megatron_lm_train_llama-3.1-70b
- model: Llama 2 7B
mad_tag: pyt_megatron_lm_train_llama-2-7b
- model: Llama 2 70B
mad_tag: pyt_megatron_lm_train_llama-2-70b
- group: DeepSeek
tag: deepseek
models:
- model: DeepSeek-V3 (proxy)
mad_tag: pyt_megatron_lm_train_deepseek-v3-proxy
- model: DeepSeek-V2-Lite
mad_tag: pyt_megatron_lm_train_deepseek-v2-lite-16b
- group: Mistral AI
tag: mistral
models:
- model: Mixtral 8x7B
mad_tag: pyt_megatron_lm_train_mixtral-8x7b
- model: Mixtral 8x22B (proxy)
mad_tag: pyt_megatron_lm_train_mixtral-8x22b-proxy
- group: Qwen
tag: qwen
models:
- model: Qwen 2.5 7B
mad_tag: pyt_megatron_lm_train_qwen2.5-7b
- model: Qwen 2.5 72B
mad_tag: pyt_megatron_lm_train_qwen2.5-72b

View File

@@ -0,0 +1,49 @@
dockers:
- pull_tag: rocm/megatron-lm:v25.7_py310
docker_hub_url: https://hub.docker.com/layers/rocm/megatron-lm/v25.7_py310/images/sha256-6189df849feeeee3ae31bb1e97aef5006d69d2b90c134e97708c19632e20ab5a
components:
ROCm: 6.4.2
Primus: v0.1.0-rc1
PyTorch: 2.8.0a0+gitd06a406
Python: "3.10"
Transformer Engine: 2.1.0.dev0+ba586519
hipBLASLt: 37ba1d36
Triton: 3.3.0
RCCL: 2.22.3
model_groups:
- group: Meta Llama
tag: llama
models:
- model: Llama 3.3 70B
mad_tag: pyt_megatron_lm_train_llama-3.3-70b
- model: Llama 3.1 8B
mad_tag: pyt_megatron_lm_train_llama-3.1-8b
- model: Llama 3.1 70B
mad_tag: pyt_megatron_lm_train_llama-3.1-70b
- model: Llama 3.1 70B (proxy)
mad_tag: pyt_megatron_lm_train_llama-3.1-70b-proxy
- model: Llama 2 7B
mad_tag: pyt_megatron_lm_train_llama-2-7b
- model: Llama 2 70B
mad_tag: pyt_megatron_lm_train_llama-2-70b
- group: DeepSeek
tag: deepseek
models:
- model: DeepSeek-V3 (proxy)
mad_tag: pyt_megatron_lm_train_deepseek-v3-proxy
- model: DeepSeek-V2-Lite
mad_tag: pyt_megatron_lm_train_deepseek-v2-lite-16b
- group: Mistral AI
tag: mistral
models:
- model: Mixtral 8x7B
mad_tag: pyt_megatron_lm_train_mixtral-8x7b
- model: Mixtral 8x22B (proxy)
mad_tag: pyt_megatron_lm_train_mixtral-8x22b-proxy
- group: Qwen
tag: qwen
models:
- model: Qwen 2.5 7B
mad_tag: pyt_megatron_lm_train_qwen2.5-7b
- model: Qwen 2.5 72B
mad_tag: pyt_megatron_lm_train_qwen2.5-72b

View File

@@ -0,0 +1,48 @@
dockers:
- pull_tag: rocm/megatron-lm:v25.8_py310
docker_hub_url: https://hub.docker.com/layers/rocm/megatron-lm/v25.8_py310/images/sha256-50fc824361054e445e86d5d88d5f58817f61f8ec83ad4a7e43ea38bbc4a142c0
components:
ROCm: 6.4.3
PyTorch: 2.8.0a0+gitd06a406
Python: "3.10"
Transformer Engine: 2.2.0.dev0+54dd2bdc
hipBLASLt: d1b517fc7a
Triton: 3.3.0
RCCL: 2.22.3
model_groups:
- group: Meta Llama
tag: llama
models:
- model: Llama 3.3 70B
mad_tag: pyt_megatron_lm_train_llama-3.3-70b
- model: Llama 3.1 8B
mad_tag: pyt_megatron_lm_train_llama-3.1-8b
- model: Llama 3.1 70B
mad_tag: pyt_megatron_lm_train_llama-3.1-70b
- model: Llama 3.1 70B (proxy)
mad_tag: pyt_megatron_lm_train_llama-3.1-70b-proxy
- model: Llama 2 7B
mad_tag: pyt_megatron_lm_train_llama-2-7b
- model: Llama 2 70B
mad_tag: pyt_megatron_lm_train_llama-2-70b
- group: DeepSeek
tag: deepseek
models:
- model: DeepSeek-V3 (proxy)
mad_tag: pyt_megatron_lm_train_deepseek-v3-proxy
- model: DeepSeek-V2-Lite
mad_tag: pyt_megatron_lm_train_deepseek-v2-lite-16b
- group: Mistral AI
tag: mistral
models:
- model: Mixtral 8x7B
mad_tag: pyt_megatron_lm_train_mixtral-8x7b
- model: Mixtral 8x22B (proxy)
mad_tag: pyt_megatron_lm_train_mixtral-8x22b-proxy
- group: Qwen
tag: qwen
models:
- model: Qwen 2.5 7B
mad_tag: pyt_megatron_lm_train_qwen2.5-7b
- model: Qwen 2.5 72B
mad_tag: pyt_megatron_lm_train_qwen2.5-72b

View File

@@ -0,0 +1,53 @@
dockers:
MI355X and MI350X:
pull_tag: rocm/megatron-lm:v25.9_gfx950
docker_hub_url: https://hub.docker.com/layers/rocm/megatron-lm/v25.9_gfx950/images/sha256-1a198be32f49efd66d0ff82066b44bd99b3e6b04c8e0e9b36b2c481e13bff7b6
components: &docker_components
ROCm: 7.0.0
Primus: aab4234
PyTorch: 2.9.0.dev20250821+rocm7.0.0.lw.git125803b7
Python: "3.10"
Transformer Engine: 2.2.0.dev0+54dd2bdc
Flash Attention: 2.8.3
hipBLASLt: 911283acd1
Triton: 3.4.0+rocm7.0.0.git56765e8c
RCCL: 2.26.6
MI325X and MI300X:
pull_tag: rocm/megatron-lm:v25.9_gfx942
docker_hub_url: https://hub.docker.com/layers/rocm/megatron-lm/v25.9_gfx942/images/sha256-df6ab8f45b4b9ceb100fb24e19b2019a364e351ee3b324dbe54466a1d67f8357
components: *docker_components
model_groups:
- group: Meta Llama
tag: llama
models:
- model: Llama 3.3 70B
mad_tag: pyt_megatron_lm_train_llama-3.3-70b
- model: Llama 3.1 8B
mad_tag: pyt_megatron_lm_train_llama-3.1-8b
- model: Llama 3.1 70B
mad_tag: pyt_megatron_lm_train_llama-3.1-70b
- model: Llama 2 7B
mad_tag: pyt_megatron_lm_train_llama-2-7b
- model: Llama 2 70B
mad_tag: pyt_megatron_lm_train_llama-2-70b
- group: DeepSeek
tag: deepseek
models:
- model: DeepSeek-V3 (proxy)
mad_tag: pyt_megatron_lm_train_deepseek-v3-proxy
- model: DeepSeek-V2-Lite
mad_tag: pyt_megatron_lm_train_deepseek-v2-lite-16b
- group: Mistral AI
tag: mistral
models:
- model: Mixtral 8x7B
mad_tag: pyt_megatron_lm_train_mixtral-8x7b
- model: Mixtral 8x22B (proxy)
mad_tag: pyt_megatron_lm_train_mixtral-8x22b-proxy
- group: Qwen
tag: qwen
models:
- model: Qwen 2.5 7B
mad_tag: pyt_megatron_lm_train_qwen2.5-7b
- model: Qwen 2.5 72B
mad_tag: pyt_megatron_lm_train_qwen2.5-72b

View File

@@ -0,0 +1,58 @@
docker:
pull_tag: rocm/primus:v25.10
docker_hub_url: https://hub.docker.com/layers/rocm/primus/v25.10/images/sha256-140c37cd2eeeb183759b9622543fc03cc210dc97cbfa18eeefdcbda84420c197
components:
ROCm: 7.1.0
PyTorch: 2.10.0.dev20251112+rocm7.1
Python: "3.10"
Transformer Engine: 2.4.0.dev0+32e2d1d4
Flash Attention: 2.8.3
hipBLASLt: 1.2.0-09ab7153e2
Triton: 3.4.0
RCCL: 2.27.7
model_groups:
- group: Meta Llama
tag: llama
models:
- model: Llama 3.3 70B
mad_tag: primus_pyt_megatron_lm_train_llama-3.3-70b
config_name: llama3.3_70B-pretrain.yaml
- model: Llama 3.1 70B
mad_tag: primus_pyt_megatron_lm_train_llama-3.1-70b
config_name: llama3.1_70B-pretrain.yaml
- model: Llama 3.1 8B
mad_tag: primus_pyt_megatron_lm_train_llama-3.1-8b
config_name: llama3.1_8B-pretrain.yaml
- model: Llama 2 7B
mad_tag: primus_pyt_megatron_lm_train_llama-2-7b
config_name: llama2_7B-pretrain.yaml
- model: Llama 2 70B
mad_tag: primus_pyt_megatron_lm_train_llama-2-70b
config_name: llama2_70B-pretrain.yaml
- group: DeepSeek
tag: deepseek
models:
- model: DeepSeek-V3 (proxy)
mad_tag: primus_pyt_megatron_lm_train_deepseek-v3-proxy
config_name: deepseek_v3-pretrain.yaml
- model: DeepSeek-V2-Lite
mad_tag: primus_pyt_megatron_lm_train_deepseek-v2-lite-16b
config_name: deepseek_v2_lite-pretrain.yaml
- group: Mistral AI
tag: mistral
models:
- model: Mixtral 8x7B
mad_tag: primus_pyt_megatron_lm_train_mixtral-8x7b
config_name: mixtral_8x7B_v0.1-pretrain.yaml
- model: Mixtral 8x22B (proxy)
mad_tag: primus_pyt_megatron_lm_train_mixtral-8x22b-proxy
config_name: mixtral_8x22B_v0.1-pretrain.yaml
- group: Qwen
tag: qwen
models:
- model: Qwen 2.5 7B
mad_tag: primus_pyt_megatron_lm_train_qwen2.5-7b
config_name: primus_qwen2.5_7B-pretrain.yaml
- model: Qwen 2.5 72B
mad_tag: primus_pyt_megatron_lm_train_qwen2.5-72b
config_name: qwen2.5_72B-pretrain.yaml

View File

@@ -0,0 +1,58 @@
dockers:
- pull_tag: rocm/megatron-lm:v25.7_py310
docker_hub_url: https://hub.docker.com/layers/rocm/megatron-lm/v25.7_py310/images/sha256-6189df849feeeee3ae31bb1e97aef5006d69d2b90c134e97708c19632e20ab5a
components:
ROCm: 6.4.2
Primus: v0.1.0-rc1
PyTorch: 2.8.0a0+gitd06a406
Python: "3.10"
Transformer Engine: 2.1.0.dev0+ba586519
hipBLASLt: 37ba1d36
Triton: 3.3.0
RCCL: 2.22.3
model_groups:
- group: Meta Llama
tag: llama
models:
- model: Llama 3.3 70B
mad_tag: primus_pyt_megatron_lm_train_llama-3.3-70b
config_name: llama3.3_70B-pretrain.yaml
- model: Llama 3.1 70B
mad_tag: primus_pyt_megatron_lm_train_llama-3.1-70b
config_name: llama3.1_70B-pretrain.yaml
- model: Llama 3.1 8B
mad_tag: primus_pyt_megatron_lm_train_llama-3.1-8b
config_name: llama3.1_8B-pretrain.yaml
- model: Llama 2 7B
mad_tag: primus_pyt_megatron_lm_train_llama-2-7b
config_name: llama2_7B-pretrain.yaml
- model: Llama 2 70B
mad_tag: primus_pyt_megatron_lm_train_llama-2-70b
config_name: llama2_70B-pretrain.yaml
- group: DeepSeek
tag: deepseek
models:
- model: DeepSeek-V3 (proxy)
mad_tag: primus_pyt_megatron_lm_train_deepseek-v3-proxy
config_name: deepseek_v3-pretrain.yaml
- model: DeepSeek-V2-Lite
mad_tag: primus_pyt_megatron_lm_train_deepseek-v2-lite-16b
config_name: deepseek_v2_lite-pretrain.yaml
- group: Mistral AI
tag: mistral
models:
- model: Mixtral 8x7B
mad_tag: primus_pyt_megatron_lm_train_mixtral-8x7b
config_name: mixtral_8x7B_v0.1-pretrain.yaml
- model: Mixtral 8x22B (proxy)
mad_tag: primus_pyt_megatron_lm_train_mixtral-8x22b-proxy
config_name: mixtral_8x22B_v0.1-pretrain.yaml
- group: Qwen
tag: qwen
models:
- model: Qwen 2.5 7B
mad_tag: primus_pyt_megatron_lm_train_qwen2.5-7b
config_name: primus_qwen2.5_7B-pretrain.yaml
- model: Qwen 2.5 72B
mad_tag: primus_pyt_megatron_lm_train_qwen2.5-72b
config_name: qwen2.5_72B-pretrain.yaml

View File

@@ -0,0 +1,58 @@
dockers:
- pull_tag: rocm/megatron-lm:v25.8_py310
docker_hub_url: https://hub.docker.com/layers/rocm/megatron-lm/v25.8_py310/images/sha256-50fc824361054e445e86d5d88d5f58817f61f8ec83ad4a7e43ea38bbc4a142c0
components:
ROCm: 6.4.3
Primus: 927a717
PyTorch: 2.8.0a0+gitd06a406
Python: "3.10"
Transformer Engine: 2.2.0.dev0+54dd2bdc
hipBLASLt: d1b517fc7a
Triton: 3.3.0
RCCL: 2.22.3
model_groups:
- group: Meta Llama
tag: llama
models:
- model: Llama 3.3 70B
mad_tag: primus_pyt_megatron_lm_train_llama-3.3-70b
config_name: llama3.3_70B-pretrain.yaml
- model: Llama 3.1 70B
mad_tag: primus_pyt_megatron_lm_train_llama-3.1-70b
config_name: llama3.1_70B-pretrain.yaml
- model: Llama 3.1 8B
mad_tag: primus_pyt_megatron_lm_train_llama-3.1-8b
config_name: llama3.1_8B-pretrain.yaml
- model: Llama 2 7B
mad_tag: primus_pyt_megatron_lm_train_llama-2-7b
config_name: llama2_7B-pretrain.yaml
- model: Llama 2 70B
mad_tag: primus_pyt_megatron_lm_train_llama-2-70b
config_name: llama2_70B-pretrain.yaml
- group: DeepSeek
tag: deepseek
models:
- model: DeepSeek-V3 (proxy)
mad_tag: primus_pyt_megatron_lm_train_deepseek-v3-proxy
config_name: deepseek_v3-pretrain.yaml
- model: DeepSeek-V2-Lite
mad_tag: primus_pyt_megatron_lm_train_deepseek-v2-lite-16b
config_name: deepseek_v2_lite-pretrain.yaml
- group: Mistral AI
tag: mistral
models:
- model: Mixtral 8x7B
mad_tag: primus_pyt_megatron_lm_train_mixtral-8x7b
config_name: mixtral_8x7B_v0.1-pretrain.yaml
- model: Mixtral 8x22B (proxy)
mad_tag: primus_pyt_megatron_lm_train_mixtral-8x22b-proxy
config_name: mixtral_8x22B_v0.1-pretrain.yaml
- group: Qwen
tag: qwen
models:
- model: Qwen 2.5 7B
mad_tag: primus_pyt_megatron_lm_train_qwen2.5-7b
config_name: primus_qwen2.5_7B-pretrain.yaml
- model: Qwen 2.5 72B
mad_tag: primus_pyt_megatron_lm_train_qwen2.5-72b
config_name: qwen2.5_72B-pretrain.yaml

View File

@@ -0,0 +1,65 @@
dockers:
MI355X and MI350X:
pull_tag: rocm/primus:v25.9_gfx950
docker_hub_url: https://hub.docker.com/layers/rocm/primus/v25.9_gfx950/images/sha256-1a198be32f49efd66d0ff82066b44bd99b3e6b04c8e0e9b36b2c481e13bff7b6
components: &docker_components
ROCm: 7.0.0
Primus: 0.3.0
Primus Turbo: 0.1.1
PyTorch: 2.9.0.dev20250821+rocm7.0.0.lw.git125803b7
Python: "3.10"
Transformer Engine: 2.2.0.dev0+54dd2bdc
Flash Attention: 2.8.3
hipBLASLt: 911283acd1
Triton: 3.4.0+rocm7.0.0.git56765e8c
RCCL: 2.26.6
MI325X and MI300X:
pull_tag: rocm/primus:v25.9_gfx942
docker_hub_url: https://hub.docker.com/layers/rocm/primus/v25.9_gfx942/images/sha256-df6ab8f45b4b9ceb100fb24e19b2019a364e351ee3b324dbe54466a1d67f8357
components: *docker_components
model_groups:
- group: Meta Llama
tag: llama
models:
- model: Llama 3.3 70B
mad_tag: primus_pyt_megatron_lm_train_llama-3.3-70b
config_name: llama3.3_70B-pretrain.yaml
- model: Llama 3.1 70B
mad_tag: primus_pyt_megatron_lm_train_llama-3.1-70b
config_name: llama3.1_70B-pretrain.yaml
- model: Llama 3.1 8B
mad_tag: primus_pyt_megatron_lm_train_llama-3.1-8b
config_name: llama3.1_8B-pretrain.yaml
- model: Llama 2 7B
mad_tag: primus_pyt_megatron_lm_train_llama-2-7b
config_name: llama2_7B-pretrain.yaml
- model: Llama 2 70B
mad_tag: primus_pyt_megatron_lm_train_llama-2-70b
config_name: llama2_70B-pretrain.yaml
- group: DeepSeek
tag: deepseek
models:
- model: DeepSeek-V3 (proxy)
mad_tag: primus_pyt_megatron_lm_train_deepseek-v3-proxy
config_name: deepseek_v3-pretrain.yaml
- model: DeepSeek-V2-Lite
mad_tag: primus_pyt_megatron_lm_train_deepseek-v2-lite-16b
config_name: deepseek_v2_lite-pretrain.yaml
- group: Mistral AI
tag: mistral
models:
- model: Mixtral 8x7B
mad_tag: primus_pyt_megatron_lm_train_mixtral-8x7b
config_name: mixtral_8x7B_v0.1-pretrain.yaml
- model: Mixtral 8x22B (proxy)
mad_tag: primus_pyt_megatron_lm_train_mixtral-8x22b-proxy
config_name: mixtral_8x22B_v0.1-pretrain.yaml
- group: Qwen
tag: qwen
models:
- model: Qwen 2.5 7B
mad_tag: primus_pyt_megatron_lm_train_qwen2.5-7b
config_name: primus_qwen2.5_7B-pretrain.yaml
- model: Qwen 2.5 72B
mad_tag: primus_pyt_megatron_lm_train_qwen2.5-72b
config_name: qwen2.5_72B-pretrain.yaml

View File

@@ -0,0 +1,32 @@
docker:
pull_tag: rocm/primus:v25.10
docker_hub_url: https://hub.docker.com/layers/rocm/primus/v25.10/images/sha256-140c37cd2eeeb183759b9622543fc03cc210dc97cbfa18eeefdcbda84420c197
components:
ROCm: 7.1.0
PyTorch: 2.10.0.dev20251112+rocm7.1
Python: "3.10"
Transformer Engine: 2.4.0.dev0+32e2d1d4
Flash Attention: 2.8.3
hipBLASLt: 1.2.0-09ab7153e2
model_groups:
- group: Meta Llama
tag: llama
models:
- model: Llama 3.1 8B
mad_tag: primus_pyt_train_llama-3.1-8b
model_repo: Llama-3.1-8B
url: https://huggingface.co/meta-llama/Llama-3.1-8B
precision: BF16
- model: Llama 3.1 70B
mad_tag: primus_pyt_train_llama-3.1-70b
model_repo: Llama-3.1-70B
url: https://huggingface.co/meta-llama/Llama-3.1-70B
precision: BF16
- group: DeepSeek
tag: deepseek
models:
- model: DeepSeek V2 16B
mad_tag: primus_pyt_train_deepseek-v2
model_repo: DeepSeek-V2
url: https://huggingface.co/deepseek-ai/DeepSeek-V2
precision: BF16

View File

@@ -0,0 +1,24 @@
dockers:
- pull_tag: rocm/pytorch-training:v25.8
docker_hub_url: https://hub.docker.com/layers/rocm/pytorch-training/v25.8/images/sha256-5082ae01d73fec6972b0d84e5dad78c0926820dcf3c19f301d6c8eb892e573c5
components:
ROCm: 6.4.3
PyTorch: 2.8.0a0+gitd06a406
Python: 3.10.18
Transformer Engine: 2.2.0.dev0+a1e66aae
Flash Attention: 3.0.0.post1
hipBLASLt: 1.1.0-d1b517fc7a
model_groups:
- group: Meta Llama
tag: llama
models:
- model: Llama 3.1 8B
mad_tag: primus_pyt_train_llama-3.1-8b
model_repo: Llama-3.1-8B
url: https://huggingface.co/meta-llama/Llama-3.1-8B
precision: BF16
- model: Llama 3.1 70B
mad_tag: primus_pyt_train_llama-3.1-70b
model_repo: Llama-3.1-70B
url: https://huggingface.co/meta-llama/Llama-3.1-70B
precision: BF16

View File

@@ -0,0 +1,39 @@
dockers:
MI355X and MI350X:
pull_tag: rocm/primus:v25.9_gfx950
docker_hub_url: https://hub.docker.com/layers/rocm/primus/v25.9_gfx950/images/sha256-1a198be32f49efd66d0ff82066b44bd99b3e6b04c8e0e9b36b2c481e13bff7b6
components: &docker_components
ROCm: 7.0.0
Primus: 0.3.0
Primus Turbo: 0.1.1
PyTorch: 2.9.0.dev20250821+rocm7.0.0.lw.git125803b7
Python: "3.10"
Transformer Engine: 2.2.0.dev0+54dd2bdc
Flash Attention: 2.8.3
hipBLASLt: 911283acd1
Triton: 3.4.0+rocm7.0.0.git56765e8c
RCCL: 2.26.6
MI325X and MI300X:
pull_tag: rocm/primus:v25.9_gfx942
docker_hub_url: https://hub.docker.com/layers/rocm/primus/v25.9_gfx942/images/sha256-df6ab8f45b4b9ceb100fb24e19b2019a364e351ee3b324dbe54466a1d67f8357
components: *docker_components
model_groups:
- group: Meta Llama
tag: llama
models:
- model: Llama 3.1 8B
mad_tag: primus_pyt_train_llama-3.1-8b
model_repo: meta-llama/Llama-3.1-8B
url: https://huggingface.co/meta-llama/Llama-3.1-8B
precision: BF16
config_file:
bf16: "./llama3_8b_fsdp_bf16.toml"
fp8: "./llama3_8b_fsdp_fp8.toml"
- model: Llama 3.1 70B
mad_tag: primus_pyt_train_llama-3.1-70b
model_repo: meta-llama/Llama-3.1-70B
url: https://huggingface.co/meta-llama/Llama-3.1-70B
precision: BF16
config_file:
bf16: "./llama3_70b_fsdp_bf16.toml"
fp8: "./llama3_70b_fsdp_fp8.toml"

View File

@@ -0,0 +1,197 @@
docker:
pull_tag: rocm/primus:v25.10
docker_hub_url: https://hub.docker.com/layers/rocm/primus/v25.10/images/sha256-140c37cd2eeeb183759b9622543fc03cc210dc97cbfa18eeefdcbda84420c197
components:
ROCm: 7.1.0
Primus: 0.3.0
Primus Turbo: 0.1.1
PyTorch: 2.10.0.dev20251112+rocm7.1
Python: "3.10"
Transformer Engine: 2.4.0.dev0+32e2d1d4
Flash Attention: 2.8.3
hipBLASLt: 1.2.0-09ab7153e2
model_groups:
- group: Meta Llama
tag: llama
models:
- model: Llama 4 Scout 17B-16E
mad_tag: pyt_train_llama-4-scout-17b-16e
model_repo: Llama-4-17B_16E
url: https://huggingface.co/meta-llama/Llama-4-Scout-17B-16E
precision: BF16
training_modes: [finetune_fw, finetune_lora]
- model: Llama 3.3 70B
mad_tag: pyt_train_llama-3.3-70b
model_repo: Llama-3.3-70B
url: https://huggingface.co/meta-llama/Llama-3.3-70B-Instruct
precision: BF16
training_modes: [finetune_fw, finetune_lora, finetune_qlora]
- model: Llama 3.2 1B
mad_tag: pyt_train_llama-3.2-1b
model_repo: Llama-3.2-1B
url: https://huggingface.co/meta-llama/Llama-3.2-1B
precision: BF16
training_modes: [finetune_fw, finetune_lora]
- model: Llama 3.2 3B
mad_tag: pyt_train_llama-3.2-3b
model_repo: Llama-3.2-3B
url: https://huggingface.co/meta-llama/Llama-3.2-3B
precision: BF16
training_modes: [finetune_fw, finetune_lora]
- model: Llama 3.2 Vision 11B
mad_tag: pyt_train_llama-3.2-vision-11b
model_repo: Llama-3.2-Vision-11B
url: https://huggingface.co/meta-llama/Llama-3.2-11B-Vision
precision: BF16
training_modes: [finetune_fw]
- model: Llama 3.2 Vision 90B
mad_tag: pyt_train_llama-3.2-vision-90b
model_repo: Llama-3.2-Vision-90B
url: https://huggingface.co/meta-llama/Llama-3.2-90B-Vision
precision: BF16
training_modes: [finetune_fw]
- model: Llama 3.1 8B
mad_tag: pyt_train_llama-3.1-8b
model_repo: Llama-3.1-8B
url: https://huggingface.co/meta-llama/Llama-3.1-8B
precision: BF16
training_modes: [pretrain, finetune_fw, finetune_lora, HF_pretrain]
- model: Llama 3.1 70B
mad_tag: pyt_train_llama-3.1-70b
model_repo: Llama-3.1-70B
url: https://huggingface.co/meta-llama/Llama-3.1-70B-Instruct
precision: BF16
training_modes: [pretrain, finetune_fw, finetune_lora]
- model: Llama 3.1 405B
mad_tag: pyt_train_llama-3.1-405b
model_repo: Llama-3.1-405B
url: https://huggingface.co/meta-llama/Llama-3.1-405B
precision: BF16
training_modes: [finetune_qlora]
- model: Llama 3 8B
mad_tag: pyt_train_llama-3-8b
model_repo: Llama-3-8B
url: https://huggingface.co/meta-llama/Meta-Llama-3-8B
precision: BF16
training_modes: [finetune_fw, finetune_lora]
- model: Llama 3 70B
mad_tag: pyt_train_llama-3-70b
model_repo: Llama-3-70B
url: https://huggingface.co/meta-llama/Meta-Llama-3-70B
precision: BF16
training_modes: [finetune_fw, finetune_lora]
- model: Llama 2 7B
mad_tag: pyt_train_llama-2-7b
model_repo: Llama-2-7B
url: https://github.com/meta-llama/llama-models/tree/main/models/llama2
precision: BF16
training_modes: [finetune_fw, finetune_lora, finetune_qlora]
- model: Llama 2 13B
mad_tag: pyt_train_llama-2-13b
model_repo: Llama-2-13B
url: https://github.com/meta-llama/llama-models/tree/main/models/llama2
precision: BF16
training_modes: [finetune_fw, finetune_lora]
- model: Llama 2 70B
mad_tag: pyt_train_llama-2-70b
model_repo: Llama-2-70B
url: https://github.com/meta-llama/llama-models/tree/main/models/llama2
precision: BF16
training_modes: [finetune_lora, finetune_qlora]
- group: OpenAI
tag: openai
models:
- model: GPT OSS 20B
mad_tag: pyt_train_gpt_oss_20b
model_repo: GPT-OSS-20B
url: https://huggingface.co/openai/gpt-oss-20b
precision: BF16
training_modes: [HF_finetune_lora]
- model: GPT OSS 120B
mad_tag: pyt_train_gpt_oss_120b
model_repo: GPT-OSS-120B
url: https://huggingface.co/openai/gpt-oss-120b
precision: BF16
training_modes: [HF_finetune_lora]
- group: DeepSeek
tag: deepseek
models:
- model: DeepSeek V2 16B
mad_tag: primus_pyt_train_deepseek-v2
model_repo: DeepSeek-V2
url: https://huggingface.co/deepseek-ai/DeepSeek-V2
precision: BF16
training_modes: [pretrain]
- group: Qwen
tag: qwen
models:
- model: Qwen 3 8B
mad_tag: pyt_train_qwen3-8b
model_repo: Qwen3-8B
url: https://huggingface.co/Qwen/Qwen3-8B
precision: BF16
training_modes: [finetune_fw, finetune_lora]
- model: Qwen 3 32B
mad_tag: pyt_train_qwen3-32b
model_repo: Qwen3-32
url: https://huggingface.co/Qwen/Qwen3-32B
precision: BF16
training_modes: [finetune_lora]
- model: Qwen 2.5 32B
mad_tag: pyt_train_qwen2.5-32b
model_repo: Qwen2.5-32B
url: https://huggingface.co/Qwen/Qwen2.5-32B
precision: BF16
training_modes: [finetune_lora]
- model: Qwen 2.5 72B
mad_tag: pyt_train_qwen2.5-72b
model_repo: Qwen2.5-72B
url: https://huggingface.co/Qwen/Qwen2.5-72B
precision: BF16
training_modes: [finetune_lora]
- model: Qwen 2 1.5B
mad_tag: pyt_train_qwen2-1.5b
model_repo: Qwen2-1.5B
url: https://huggingface.co/Qwen/Qwen2-1.5B
precision: BF16
training_modes: [finetune_fw, finetune_lora]
- model: Qwen 2 7B
mad_tag: pyt_train_qwen2-7b
model_repo: Qwen2-7B
url: https://huggingface.co/Qwen/Qwen2-7B
precision: BF16
training_modes: [finetune_fw, finetune_lora]
- group: Stable Diffusion
tag: sd
models:
- model: Stable Diffusion XL
mad_tag: pyt_huggingface_stable_diffusion_xl_2k_lora_finetuning
model_repo: SDXL
url: https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0
precision: BF16
training_modes: [posttrain]
- group: Flux
tag: flux
models:
- model: FLUX.1-dev
mad_tag: pyt_train_flux
model_repo: Flux
url: https://huggingface.co/black-forest-labs/FLUX.1-dev
precision: BF16
training_modes: [posttrain]
- group: NCF
tag: ncf
models:
- model: NCF
mad_tag: pyt_ncf_training
model_repo:
url: https://github.com/ROCm/FluxBenchmark
precision: FP32
- group: DLRM
tag: dlrm
models:
- model: DLRM v2
mad_tag: pyt_train_dlrm
model_repo: DLRM
url: https://github.com/AMD-AGI/DLRMBenchmark
training_modes: [pretrain]

View File

@@ -0,0 +1,162 @@
dockers:
- pull_tag: rocm/pytorch-training:v25.7
docker_hub_url: https://hub.docker.com/layers/rocm/pytorch-training/v25.7/images/sha256-cc6fd840ab89cb81d926fc29eca6d075aee9875a55a522675a4b9231c9a0a712
components:
ROCm: 6.4.2
PyTorch: 2.8.0a0+gitd06a406
Python: 3.10.18
Transformer Engine: 2.2.0.dev0+94e53dd8
Flash Attention: 3.0.0.post1
hipBLASLt: 1.1.0-4b9a52edfc
Triton: 3.3.0
model_groups:
- group: Meta Llama
tag: llama
models:
- model: Llama 4 Scout 17B-16E
mad_tag: pyt_train_llama-4-scout-17b-16e
model_repo: Llama-4-17B_16E
url: https://huggingface.co/meta-llama/Llama-4-Scout-17B-16E
precision: BF16
training_modes: [finetune_fw, finetune_lora]
- model: Llama 3.3 70B
mad_tag: pyt_train_llama-3.3-70b
model_repo: Llama-3.3-70B
url: https://huggingface.co/meta-llama/Llama-3.3-70B-Instruct
precision: BF16
training_modes: [finetune_fw, finetune_lora, finetune_qlora]
- model: Llama 3.2 1B
mad_tag: pyt_train_llama-3.2-1b
model_repo: Llama-3.2-1B
url: https://huggingface.co/meta-llama/Llama-3.2-1B
precision: BF16
training_modes: [finetune_fw, finetune_lora]
- model: Llama 3.2 3B
mad_tag: pyt_train_llama-3.2-3b
model_repo: Llama-3.2-3B
url: https://huggingface.co/meta-llama/Llama-3.2-3B
precision: BF16
training_modes: [finetune_fw, finetune_lora]
- model: Llama 3.2 Vision 11B
mad_tag: pyt_train_llama-3.2-vision-11b
model_repo: Llama-3.2-Vision-11B
url: https://huggingface.co/meta-llama/Llama-3.2-11B-Vision
precision: BF16
training_modes: [finetune_fw]
- model: Llama 3.2 Vision 90B
mad_tag: pyt_train_llama-3.2-vision-90b
model_repo: Llama-3.2-Vision-90B
url: https://huggingface.co/meta-llama/Llama-3.2-90B-Vision
precision: BF16
training_modes: [finetune_fw]
- model: Llama 3.1 8B
mad_tag: pyt_train_llama-3.1-8b
model_repo: Llama-3.1-8B
url: https://huggingface.co/meta-llama/Llama-3.1-8B
precision: BF16
training_modes: [pretrain, finetune_fw, finetune_lora, HF_pretrain]
- model: Llama 3.1 70B
mad_tag: pyt_train_llama-3.1-70b
model_repo: Llama-3.1-70B
url: https://huggingface.co/meta-llama/Llama-3.1-70B-Instruct
precision: BF16
training_modes: [pretrain, finetune_fw, finetune_lora]
- model: Llama 3.1 405B
mad_tag: pyt_train_llama-3.1-405b
model_repo: Llama-3.1-405B
url: https://huggingface.co/meta-llama/Llama-3.1-405B
precision: BF16
training_modes: [finetune_qlora]
- model: Llama 3 8B
mad_tag: pyt_train_llama-3-8b
model_repo: Llama-3-8B
url: https://huggingface.co/meta-llama/Meta-Llama-3-8B
precision: BF16
training_modes: [finetune_fw, finetune_lora]
- model: Llama 3 70B
mad_tag: pyt_train_llama-3-70b
model_repo: Llama-3-70B
url: https://huggingface.co/meta-llama/Meta-Llama-3-70B
precision: BF16
training_modes: [finetune_fw, finetune_lora]
- model: Llama 2 7B
mad_tag: pyt_train_llama-2-7b
model_repo: Llama-2-7B
url: https://github.com/meta-llama/llama-models/tree/main/models/llama2
precision: BF16
training_modes: [finetune_fw, finetune_lora, finetune_qlora]
- model: Llama 2 13B
mad_tag: pyt_train_llama-2-13b
model_repo: Llama-2-13B
url: https://github.com/meta-llama/llama-models/tree/main/models/llama2
precision: BF16
training_modes: [finetune_fw, finetune_lora]
- model: Llama 2 70B
mad_tag: pyt_train_llama-2-70b
model_repo: Llama-2-70B
url: https://github.com/meta-llama/llama-models/tree/main/models/llama2
precision: BF16
training_modes: [finetune_lora, finetune_qlora]
- group: OpenAI
tag: openai
models:
- model: GPT OSS 20B
mad_tag: pyt_train_gpt_oss_20b
model_repo: GPT-OSS-20B
url: https://huggingface.co/openai/gpt-oss-20b
precision: BF16
training_modes: [HF_finetune_lora]
- model: GPT OSS 120B
mad_tag: pyt_train_gpt_oss_120b
model_repo: GPT-OSS-120B
url: https://huggingface.co/openai/gpt-oss-120b
precision: BF16
training_modes: [HF_finetune_lora]
- group: Qwen
tag: qwen
models:
- model: Qwen 3 8B
mad_tag: pyt_train_qwen3-8b
model_repo: Qwen3-8B
url: https://huggingface.co/Qwen/Qwen3-8B
precision: BF16
training_modes: [finetune_fw, finetune_lora]
- model: Qwen 3 32B
mad_tag: pyt_train_qwen3-32b
model_repo: Qwen3-32
url: https://huggingface.co/Qwen/Qwen3-32B
precision: BF16
training_modes: [finetune_lora]
- model: Qwen 2.5 32B
mad_tag: pyt_train_qwen2.5-32b
model_repo: Qwen2.5-32B
url: https://huggingface.co/Qwen/Qwen2.5-32B
precision: BF16
training_modes: [finetune_lora]
- model: Qwen 2.5 72B
mad_tag: pyt_train_qwen2.5-72b
model_repo: Qwen2.5-72B
url: https://huggingface.co/Qwen/Qwen2.5-72B
precision: BF16
training_modes: [finetune_lora]
- model: Qwen 2 1.5B
mad_tag: pyt_train_qwen2-1.5b
model_repo: Qwen2-1.5B
url: https://huggingface.co/Qwen/Qwen2-1.5B
precision: BF16
training_modes: [finetune_fw, finetune_lora]
- model: Qwen 2 7B
mad_tag: pyt_train_qwen2-7b
model_repo: Qwen2-7B
url: https://huggingface.co/Qwen/Qwen2-7B
precision: BF16
training_modes: [finetune_fw, finetune_lora]
- group: Flux
tag: flux
models:
- model: FLUX.1-dev
mad_tag: pyt_train_flux
model_repo: Flux
url: https://huggingface.co/black-forest-labs/FLUX.1-dev
precision: BF16
training_modes: [pretrain]

View File

@@ -0,0 +1,178 @@
dockers:
- pull_tag: rocm/pytorch-training:v25.8
docker_hub_url: https://hub.docker.com/layers/rocm/pytorch-training/v25.8/images/sha256-5082ae01d73fec6972b0d84e5dad78c0926820dcf3c19f301d6c8eb892e573c5
components:
ROCm: 6.4.3
PyTorch: 2.8.0a0+gitd06a406
Python: 3.10.18
Transformer Engine: 2.2.0.dev0+a1e66aae
Flash Attention: 3.0.0.post1
hipBLASLt: 1.1.0-d1b517fc7a
model_groups:
- group: Meta Llama
tag: llama
models:
- model: Llama 4 Scout 17B-16E
mad_tag: pyt_train_llama-4-scout-17b-16e
model_repo: Llama-4-17B_16E
url: https://huggingface.co/meta-llama/Llama-4-Scout-17B-16E
precision: BF16
training_modes: [finetune_fw, finetune_lora]
- model: Llama 3.3 70B
mad_tag: pyt_train_llama-3.3-70b
model_repo: Llama-3.3-70B
url: https://huggingface.co/meta-llama/Llama-3.3-70B-Instruct
precision: BF16
training_modes: [finetune_fw, finetune_lora, finetune_qlora]
- model: Llama 3.2 1B
mad_tag: pyt_train_llama-3.2-1b
model_repo: Llama-3.2-1B
url: https://huggingface.co/meta-llama/Llama-3.2-1B
precision: BF16
training_modes: [finetune_fw, finetune_lora]
- model: Llama 3.2 3B
mad_tag: pyt_train_llama-3.2-3b
model_repo: Llama-3.2-3B
url: https://huggingface.co/meta-llama/Llama-3.2-3B
precision: BF16
training_modes: [finetune_fw, finetune_lora]
- model: Llama 3.2 Vision 11B
mad_tag: pyt_train_llama-3.2-vision-11b
model_repo: Llama-3.2-Vision-11B
url: https://huggingface.co/meta-llama/Llama-3.2-11B-Vision
precision: BF16
training_modes: [finetune_fw]
- model: Llama 3.2 Vision 90B
mad_tag: pyt_train_llama-3.2-vision-90b
model_repo: Llama-3.2-Vision-90B
url: https://huggingface.co/meta-llama/Llama-3.2-90B-Vision
precision: BF16
training_modes: [finetune_fw]
- model: Llama 3.1 8B
mad_tag: pyt_train_llama-3.1-8b
model_repo: Llama-3.1-8B
url: https://huggingface.co/meta-llama/Llama-3.1-8B
precision: BF16
training_modes: [pretrain, finetune_fw, finetune_lora, HF_pretrain]
- model: Llama 3.1 70B
mad_tag: pyt_train_llama-3.1-70b
model_repo: Llama-3.1-70B
url: https://huggingface.co/meta-llama/Llama-3.1-70B-Instruct
precision: BF16
training_modes: [pretrain, finetune_fw, finetune_lora]
- model: Llama 3.1 405B
mad_tag: pyt_train_llama-3.1-405b
model_repo: Llama-3.1-405B
url: https://huggingface.co/meta-llama/Llama-3.1-405B
precision: BF16
training_modes: [finetune_qlora]
- model: Llama 3 8B
mad_tag: pyt_train_llama-3-8b
model_repo: Llama-3-8B
url: https://huggingface.co/meta-llama/Meta-Llama-3-8B
precision: BF16
training_modes: [finetune_fw, finetune_lora]
- model: Llama 3 70B
mad_tag: pyt_train_llama-3-70b
model_repo: Llama-3-70B
url: https://huggingface.co/meta-llama/Meta-Llama-3-70B
precision: BF16
training_modes: [finetune_fw, finetune_lora]
- model: Llama 2 7B
mad_tag: pyt_train_llama-2-7b
model_repo: Llama-2-7B
url: https://github.com/meta-llama/llama-models/tree/main/models/llama2
precision: BF16
training_modes: [finetune_fw, finetune_lora, finetune_qlora]
- model: Llama 2 13B
mad_tag: pyt_train_llama-2-13b
model_repo: Llama-2-13B
url: https://github.com/meta-llama/llama-models/tree/main/models/llama2
precision: BF16
training_modes: [finetune_fw, finetune_lora]
- model: Llama 2 70B
mad_tag: pyt_train_llama-2-70b
model_repo: Llama-2-70B
url: https://github.com/meta-llama/llama-models/tree/main/models/llama2
precision: BF16
training_modes: [finetune_lora, finetune_qlora]
- group: OpenAI
tag: openai
models:
- model: GPT OSS 20B
mad_tag: pyt_train_gpt_oss_20b
model_repo: GPT-OSS-20B
url: https://huggingface.co/openai/gpt-oss-20b
precision: BF16
training_modes: [HF_finetune_lora]
- model: GPT OSS 120B
mad_tag: pyt_train_gpt_oss_120b
model_repo: GPT-OSS-120B
url: https://huggingface.co/openai/gpt-oss-120b
precision: BF16
training_modes: [HF_finetune_lora]
- group: Qwen
tag: qwen
models:
- model: Qwen 3 8B
mad_tag: pyt_train_qwen3-8b
model_repo: Qwen3-8B
url: https://huggingface.co/Qwen/Qwen3-8B
precision: BF16
training_modes: [finetune_fw, finetune_lora]
- model: Qwen 3 32B
mad_tag: pyt_train_qwen3-32b
model_repo: Qwen3-32
url: https://huggingface.co/Qwen/Qwen3-32B
precision: BF16
training_modes: [finetune_lora]
- model: Qwen 2.5 32B
mad_tag: pyt_train_qwen2.5-32b
model_repo: Qwen2.5-32B
url: https://huggingface.co/Qwen/Qwen2.5-32B
precision: BF16
training_modes: [finetune_lora]
- model: Qwen 2.5 72B
mad_tag: pyt_train_qwen2.5-72b
model_repo: Qwen2.5-72B
url: https://huggingface.co/Qwen/Qwen2.5-72B
precision: BF16
training_modes: [finetune_lora]
- model: Qwen 2 1.5B
mad_tag: pyt_train_qwen2-1.5b
model_repo: Qwen2-1.5B
url: https://huggingface.co/Qwen/Qwen2-1.5B
precision: BF16
training_modes: [finetune_fw, finetune_lora]
- model: Qwen 2 7B
mad_tag: pyt_train_qwen2-7b
model_repo: Qwen2-7B
url: https://huggingface.co/Qwen/Qwen2-7B
precision: BF16
training_modes: [finetune_fw, finetune_lora]
- group: Stable Diffusion
tag: sd
models:
- model: Stable Diffusion XL
mad_tag: pyt_huggingface_stable_diffusion_xl_2k_lora_finetuning
model_repo: SDXL
url: https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0
precision: BF16
training_modes: [finetune_lora]
- group: Flux
tag: flux
models:
- model: FLUX.1-dev
mad_tag: pyt_train_flux
model_repo: Flux
url: https://huggingface.co/black-forest-labs/FLUX.1-dev
precision: BF16
training_modes: [pretrain]
- group: NCF
tag: ncf
models:
- model: NCF
mad_tag: pyt_ncf_training
model_repo:
url: https://github.com/NVIDIA/DeepLearningExamples/tree/master/PyTorch/Recommendation/NCF
precision: FP32

View File

@@ -0,0 +1,186 @@
dockers:
MI355X and MI350X:
pull_tag: rocm/pytorch-training:v25.9_gfx950
docker_hub_url: https://hub.docker.com/layers/rocm/pytorch-training/v25.9_gfx950/images/sha256-1a198be32f49efd66d0ff82066b44bd99b3e6b04c8e0e9b36b2c481e13bff7b6
components: &docker_components
ROCm: 7.0.0
Primus: aab4234
PyTorch: 2.9.0.dev20250821+rocm7.0.0.lw.git125803b7
Python: "3.10"
Transformer Engine: 2.2.0.dev0+54dd2bdc
Flash Attention: 2.8.3
hipBLASLt: 911283acd1
Triton: 3.4.0+rocm7.0.0.git56765e8c
RCCL: 2.26.6
MI325X and MI300X:
pull_tag: rocm/pytorch-training:v25.9_gfx942
docker_hub_url: https://hub.docker.com/layers/rocm/pytorch-training/v25.9_gfx942/images/sha256-df6ab8f45b4b9ceb100fb24e19b2019a364e351ee3b324dbe54466a1d67f8357
components: *docker_components
model_groups:
- group: Meta Llama
tag: llama
models:
- model: Llama 4 Scout 17B-16E
mad_tag: pyt_train_llama-4-scout-17b-16e
model_repo: Llama-4-17B_16E
url: https://huggingface.co/meta-llama/Llama-4-Scout-17B-16E
precision: BF16
training_modes: [finetune_fw, finetune_lora]
- model: Llama 3.3 70B
mad_tag: pyt_train_llama-3.3-70b
model_repo: Llama-3.3-70B
url: https://huggingface.co/meta-llama/Llama-3.3-70B-Instruct
precision: BF16
training_modes: [finetune_fw, finetune_lora, finetune_qlora]
- model: Llama 3.2 1B
mad_tag: pyt_train_llama-3.2-1b
model_repo: Llama-3.2-1B
url: https://huggingface.co/meta-llama/Llama-3.2-1B
precision: BF16
training_modes: [finetune_fw, finetune_lora]
- model: Llama 3.2 3B
mad_tag: pyt_train_llama-3.2-3b
model_repo: Llama-3.2-3B
url: https://huggingface.co/meta-llama/Llama-3.2-3B
precision: BF16
training_modes: [finetune_fw, finetune_lora]
- model: Llama 3.2 Vision 11B
mad_tag: pyt_train_llama-3.2-vision-11b
model_repo: Llama-3.2-Vision-11B
url: https://huggingface.co/meta-llama/Llama-3.2-11B-Vision
precision: BF16
training_modes: [finetune_fw]
- model: Llama 3.2 Vision 90B
mad_tag: pyt_train_llama-3.2-vision-90b
model_repo: Llama-3.2-Vision-90B
url: https://huggingface.co/meta-llama/Llama-3.2-90B-Vision
precision: BF16
training_modes: [finetune_fw]
- model: Llama 3.1 8B
mad_tag: pyt_train_llama-3.1-8b
model_repo: Llama-3.1-8B
url: https://huggingface.co/meta-llama/Llama-3.1-8B
precision: BF16
training_modes: [pretrain, finetune_fw, finetune_lora, HF_pretrain]
- model: Llama 3.1 70B
mad_tag: pyt_train_llama-3.1-70b
model_repo: Llama-3.1-70B
url: https://huggingface.co/meta-llama/Llama-3.1-70B-Instruct
precision: BF16
training_modes: [pretrain, finetune_fw, finetune_lora]
- model: Llama 3.1 405B
mad_tag: pyt_train_llama-3.1-405b
model_repo: Llama-3.1-405B
url: https://huggingface.co/meta-llama/Llama-3.1-405B
precision: BF16
training_modes: [finetune_qlora]
- model: Llama 3 8B
mad_tag: pyt_train_llama-3-8b
model_repo: Llama-3-8B
url: https://huggingface.co/meta-llama/Meta-Llama-3-8B
precision: BF16
training_modes: [finetune_fw, finetune_lora]
- model: Llama 3 70B
mad_tag: pyt_train_llama-3-70b
model_repo: Llama-3-70B
url: https://huggingface.co/meta-llama/Meta-Llama-3-70B
precision: BF16
training_modes: [finetune_fw, finetune_lora]
- model: Llama 2 7B
mad_tag: pyt_train_llama-2-7b
model_repo: Llama-2-7B
url: https://github.com/meta-llama/llama-models/tree/main/models/llama2
precision: BF16
training_modes: [finetune_fw, finetune_lora, finetune_qlora]
- model: Llama 2 13B
mad_tag: pyt_train_llama-2-13b
model_repo: Llama-2-13B
url: https://github.com/meta-llama/llama-models/tree/main/models/llama2
precision: BF16
training_modes: [finetune_fw, finetune_lora]
- model: Llama 2 70B
mad_tag: pyt_train_llama-2-70b
model_repo: Llama-2-70B
url: https://github.com/meta-llama/llama-models/tree/main/models/llama2
precision: BF16
training_modes: [finetune_lora, finetune_qlora]
- group: OpenAI
tag: openai
models:
- model: GPT OSS 20B
mad_tag: pyt_train_gpt_oss_20b
model_repo: GPT-OSS-20B
url: https://huggingface.co/openai/gpt-oss-20b
precision: BF16
training_modes: [HF_finetune_lora]
- model: GPT OSS 120B
mad_tag: pyt_train_gpt_oss_120b
model_repo: GPT-OSS-120B
url: https://huggingface.co/openai/gpt-oss-120b
precision: BF16
training_modes: [HF_finetune_lora]
- group: Qwen
tag: qwen
models:
- model: Qwen 3 8B
mad_tag: pyt_train_qwen3-8b
model_repo: Qwen3-8B
url: https://huggingface.co/Qwen/Qwen3-8B
precision: BF16
training_modes: [finetune_fw, finetune_lora]
- model: Qwen 3 32B
mad_tag: pyt_train_qwen3-32b
model_repo: Qwen3-32
url: https://huggingface.co/Qwen/Qwen3-32B
precision: BF16
training_modes: [finetune_lora]
- model: Qwen 2.5 32B
mad_tag: pyt_train_qwen2.5-32b
model_repo: Qwen2.5-32B
url: https://huggingface.co/Qwen/Qwen2.5-32B
precision: BF16
training_modes: [finetune_lora]
- model: Qwen 2.5 72B
mad_tag: pyt_train_qwen2.5-72b
model_repo: Qwen2.5-72B
url: https://huggingface.co/Qwen/Qwen2.5-72B
precision: BF16
training_modes: [finetune_lora]
- model: Qwen 2 1.5B
mad_tag: pyt_train_qwen2-1.5b
model_repo: Qwen2-1.5B
url: https://huggingface.co/Qwen/Qwen2-1.5B
precision: BF16
training_modes: [finetune_fw, finetune_lora]
- model: Qwen 2 7B
mad_tag: pyt_train_qwen2-7b
model_repo: Qwen2-7B
url: https://huggingface.co/Qwen/Qwen2-7B
precision: BF16
training_modes: [finetune_fw, finetune_lora]
- group: Stable Diffusion
tag: sd
models:
- model: Stable Diffusion XL
mad_tag: pyt_huggingface_stable_diffusion_xl_2k_lora_finetuning
model_repo: SDXL
url: https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0
precision: BF16
training_modes: [posttrain-p]
- group: Flux
tag: flux
models:
- model: FLUX.1-dev
mad_tag: pyt_train_flux
model_repo: Flux
url: https://huggingface.co/black-forest-labs/FLUX.1-dev
precision: BF16
training_modes: [posttrain-p]
- group: NCF
tag: ncf
models:
- model: NCF
mad_tag: pyt_ncf_training
model_repo:
url: https://github.com/NVIDIA/DeepLearningExamples/tree/master/PyTorch/Recommendation/NCF
precision: FP32

View File

@@ -1,15 +1,15 @@
dockers:
- pull_tag: rocm/megatron-lm:v25.7_py310
docker_hub_url: https://hub.docker.com/layers/rocm/megatron-lm/v25.7_py310/images/sha256-6189df849feeeee3ae31bb1e97aef5006d69d2b90c134e97708c19632e20ab5a
components:
ROCm: 6.4.2
Primus: v0.1.0-rc1
PyTorch: 2.8.0a0+gitd06a406
Python: "3.10"
Transformer Engine: 2.1.0.dev0+ba586519
hipBLASLt: 37ba1d36
Triton: 3.3.0
RCCL: 2.22.3
docker:
pull_tag: rocm/primus:v25.11
docker_hub_url: https://hub.docker.com/layers/rocm/primus/v25.10/images/sha256-140c37cd2eeeb183759b9622543fc03cc210dc97cbfa18eeefdcbda84420c197
components:
ROCm: 7.1.0
PyTorch: 2.10.0.dev20251112+rocm7.1
Python: "3.10"
Transformer Engine: 2.4.0.dev0+32e2d1d4
Flash Attention: 2.8.3
hipBLASLt: 1.2.0-09ab7153e2
Triton: 3.4.0
RCCL: 2.27.7
model_groups:
- group: Meta Llama
tag: llama

View File

@@ -0,0 +1,32 @@
docker:
pull_tag: rocm/primus:v25.11
docker_hub_url: https://hub.docker.com/layers/rocm/primus/v25.10/images/sha256-140c37cd2eeeb183759b9622543fc03cc210dc97cbfa18eeefdcbda84420c197
components:
ROCm: 7.1.0
PyTorch: 2.10.0.dev20251112+rocm7.1
Python: "3.10"
Transformer Engine: 2.4.0.dev0+32e2d1d4
Flash Attention: 2.8.3
hipBLASLt: 1.2.0-09ab7153e2
model_groups:
- group: Meta Llama
tag: llama
models:
- model: Llama 3.1 8B
mad_tag: primus_pyt_train_llama-3.1-8b
model_repo: Llama-3.1-8B
url: https://huggingface.co/meta-llama/Llama-3.1-8B
precision: BF16
- model: Llama 3.1 70B
mad_tag: primus_pyt_train_llama-3.1-70b
model_repo: Llama-3.1-70B
url: https://huggingface.co/meta-llama/Llama-3.1-70B
precision: BF16
- group: DeepSeek
tag: deepseek
models:
- model: DeepSeek V3 16B
mad_tag: primus_pyt_train_deepseek-v3-16b
model_repo: DeepSeek-V3
url: https://huggingface.co/deepseek-ai/DeepSeek-V3
precision: BF16

View File

@@ -1,14 +1,15 @@
dockers:
- pull_tag: rocm/pytorch-training:v25.7
docker_hub_url: https://hub.docker.com/layers/rocm/pytorch-training/v25.7/images/sha256-cc6fd840ab89cb81d926fc29eca6d075aee9875a55a522675a4b9231c9a0a712
components:
ROCm: 6.4.2
PyTorch: 2.8.0a0+gitd06a406
Python: 3.10.18
Transformer Engine: 2.2.0.dev0+94e53dd8
Flash Attention: 3.0.0.post1
hipBLASLt: 1.1.0-4b9a52edfc
Triton: 3.3.0
docker:
pull_tag: rocm/primus:v25.10
docker_hub_url: https://hub.docker.com/layers/rocm/primus/v25.10/images/sha256-140c37cd2eeeb183759b9622543fc03cc210dc97cbfa18eeefdcbda84420c197
components:
ROCm: 7.1.0
Primus: 0.3.0
Primus Turbo: 0.1.1
PyTorch: 2.10.0.dev20251112+rocm7.1
Python: "3.10"
Transformer Engine: 2.4.0.dev0+32e2d1d4
Flash Attention: 2.8.3
hipBLASLt: 1.2.0-09ab7153e2
model_groups:
- group: Meta Llama
tag: llama
@@ -112,6 +113,15 @@ model_groups:
url: https://huggingface.co/openai/gpt-oss-120b
precision: BF16
training_modes: [HF_finetune_lora]
- group: DeepSeek
tag: deepseek
models:
- model: DeepSeek V2 16B
mad_tag: primus_pyt_train_deepseek-v2
model_repo: DeepSeek-V2
url: https://huggingface.co/deepseek-ai/DeepSeek-V2
precision: BF16
training_modes: [pretrain]
- group: Qwen
tag: qwen
models:
@@ -151,6 +161,15 @@ model_groups:
url: https://huggingface.co/Qwen/Qwen2-7B
precision: BF16
training_modes: [finetune_fw, finetune_lora]
- group: Stable Diffusion
tag: sd
models:
- model: Stable Diffusion XL
mad_tag: pyt_huggingface_stable_diffusion_xl_2k_lora_finetuning
model_repo: SDXL
url: https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0
precision: BF16
training_modes: [posttrain]
- group: Flux
tag: flux
models:
@@ -159,4 +178,20 @@ model_groups:
model_repo: Flux
url: https://huggingface.co/black-forest-labs/FLUX.1-dev
precision: BF16
training_modes: [posttrain]
- group: NCF
tag: ncf
models:
- model: NCF
mad_tag: pyt_ncf_training
model_repo:
url: https://github.com/ROCm/FluxBenchmark
precision: FP32
- group: DLRM
tag: dlrm
models:
- model: DLRM v2
mad_tag: pyt_train_dlrm
model_repo: DLRM
url: https://github.com/AMD-AGI/DLRMBenchmark
training_modes: [pretrain]

View File

@@ -1,4 +1,4 @@
Atomic,MI100,MI200 PCIe,MI200 A+A,MI300X series,MI300A,MI350X series
Atomic,MI100,MI200 PCIe,MI200 A+A,MI300X Series,MI300A,MI350X Series
32 bit atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS
32 bit atomicSub,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS
32 bit atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS
1 Atomic MI100 MI200 PCIe MI200 A+A MI300X series MI300X Series MI300A MI350X series MI350X Series
2 32 bit atomicAdd ✅ CAS ✅ CAS ✅ CAS ✅ CAS ✅ CAS ✅ CAS
3 32 bit atomicSub ✅ CAS ✅ CAS ✅ CAS ✅ CAS ✅ CAS ✅ CAS
4 32 bit atomicMin ✅ CAS ✅ CAS ✅ CAS ✅ CAS ✅ CAS ✅ CAS

View File

@@ -1,4 +1,4 @@
Atomic,MI100,MI200 PCIe,MI200 A+A,MI300X series,MI300A,MI350X series
Atomic,MI100,MI200 PCIe,MI200 A+A,MI300X Series,MI300A,MI350X Series
32 bit atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS
32 bit atomicSub,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS
32 bit atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS
1 Atomic MI100 MI200 PCIe MI200 A+A MI300X series MI300X Series MI300A MI350X series MI350X Series
2 32 bit atomicAdd ✅ CAS ✅ CAS ✅ CAS ✅ CAS ✅ CAS ✅ CAS
3 32 bit atomicSub ✅ CAS ✅ CAS ✅ CAS ✅ CAS ✅ CAS ✅ CAS
4 32 bit atomicMin ✅ CAS ✅ CAS ✅ CAS ✅ CAS ✅ CAS ✅ CAS

View File

@@ -1,4 +1,4 @@
Atomic,MI100,MI200 PCIe,MI200 A+A,MI300X series,MI300A,MI350X series
Atomic,MI100,MI200 PCIe,MI200 A+A,MI300X Series,MI300A,MI350X Series
32 bit atomicAdd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native
32 bit atomicSub,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native
32 bit atomicMin,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native
1 Atomic MI100 MI200 PCIe MI200 A+A MI300X series MI300X Series MI300A MI350X series MI350X Series
2 32 bit atomicAdd ✅ Native ✅ Native ✅ Native ✅ Native ✅ Native ✅ Native
3 32 bit atomicSub ✅ Native ✅ Native ✅ Native ✅ Native ✅ Native ✅ Native
4 32 bit atomicMin ✅ Native ✅ Native ✅ Native ✅ Native ✅ Native ✅ Native

View File

@@ -1,4 +1,4 @@
Atomic,MI100,MI200 PCIe,MI200 A+A,MI300X series,MI300A,MI350X series
Atomic,MI100,MI200 PCIe,MI200 A+A,MI300X Series,MI300A,MI350X Series
32 bit atomicAdd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native
32 bit atomicSub,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native
32 bit atomicMin,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native
1 Atomic MI100 MI200 PCIe MI200 A+A MI300X series MI300X Series MI300A MI350X series MI350X Series
2 32 bit atomicAdd ✅ Native ✅ Native ✅ Native ✅ Native ✅ Native ✅ Native
3 32 bit atomicSub ✅ Native ✅ Native ✅ Native ✅ Native ✅ Native ✅ Native
4 32 bit atomicMin ✅ Native ✅ Native ✅ Native ✅ Native ✅ Native ✅ Native

View File

@@ -32,7 +32,7 @@ library_groups:
- name: "MIGraphX"
tag: "migraphx"
doc_link: "amdmigraphx:reference/cpp"
doc_link: "amdmigraphx:reference/MIGraphX-cpp"
data_types:
- type: "int8"
support: "⚠️"
@@ -290,7 +290,7 @@ library_groups:
- name: "Tensile"
tag: "tensile"
doc_link: "tensile:reference/precision-support"
doc_link: "tensile:src/reference/precision-support"
data_types:
- type: "int8"
support: "✅"

View File

@@ -0,0 +1,141 @@
from docutils import nodes
from docutils.parsers.rst import Directive
from docutils.statemachine import ViewList
from sphinx.util import logging
from sphinx.util.nodes import nested_parse_with_titles
import requests
import re
logger = logging.getLogger(__name__)
class BranchAwareRemoteContent(Directive):
"""
Directive that downloads and includes content from other repositories,
matching the branch/tag of the current documentation build.
Usage:
.. remote-content::
:repo: owner/repository
:path: path/to/file.rst
:default_branch: docs/develop # Branch to use when not on a release
:tag_prefix: Docs/ # Optional
"""
required_arguments = 0
optional_arguments = 0
final_argument_whitespace = True
has_content = False
option_spec = {
'repo': str,
'path': str,
'default_branch': str, # Branch to use when not on a release tag
'start_line': int, # Include the file from a specific line
'tag_prefix': str, # Prefix for release tags (e.g., 'Docs/')
}
def get_current_version(self):
"""Get current version/branch being built"""
env = self.state.document.settings.env
html_context = env.config.html_context
# Check if building from a tag
if "official_branch" in html_context:
if html_context["official_branch"] == 0:
if "version" in html_context:
# Remove any 'v' prefix
version = html_context["version"]
if re.match(r'^\d+\.\d+\.\d+$', version):
return version
# Not a version tag, so we'll use the default branch
return None
def get_target_ref(self):
"""Get target reference for the remote repository"""
current_version = self.get_current_version()
# If it's a version number, use tag prefix and version
if current_version:
tag_prefix = self.options.get('tag_prefix', '')
return f'{tag_prefix}{current_version}'
# For any other case, use the specified default branch
if 'default_branch' not in self.options:
logger.warning('No default_branch specified and not building from a version tag')
return None
return self.options['default_branch']
def construct_raw_url(self, repo, path, ref):
"""Construct the raw.githubusercontent.com URL"""
return f'https://raw.githubusercontent.com/{repo}/{ref}/{path}'
def fetch_and_parse_content(self, url, source_path):
"""Fetch content and parse it as RST"""
response = requests.get(url)
response.raise_for_status()
content = response.text
start_line = self.options.get('start_line', 0)
# Create ViewList for parsing
line_count = 0
content_list = ViewList()
for line_no, line in enumerate(content.splitlines()):
if line_count >= start_line:
content_list.append(line, source_path, line_no)
line_count+=1
# Create a section node and parse content
node = nodes.section()
nested_parse_with_titles(self.state, content_list, node)
return node.children
def run(self):
if 'repo' not in self.options or 'path' not in self.options:
logger.warning('Both repo and path options are required')
return []
target_ref = self.get_target_ref()
if not target_ref:
return []
raw_url = self.construct_raw_url(
self.options['repo'],
self.options['path'],
target_ref
)
try:
logger.info(f'Attempting to fetch content from {raw_url}')
return self.fetch_and_parse_content(raw_url, self.options['path'])
except requests.exceptions.RequestException as e:
logger.warning(f'Failed to fetch content from {raw_url}: {str(e)}')
# If we failed on a tag, try falling back to default_branch
if re.match(r'^\d+\.\d+\.\d+$', target_ref) or target_ref.startswith('Docs/'):
if 'default_branch' in self.options:
try:
fallback_ref = self.options['default_branch']
logger.info(f'Attempting fallback to {fallback_ref}...')
fallback_url = self.construct_raw_url(
self.options['repo'],
self.options['path'],
fallback_ref
)
return self.fetch_and_parse_content(fallback_url, self.options['path'])
except requests.exceptions.RequestException as e2:
logger.warning(f'Fallback also failed: {str(e2)}')
return []
def setup(app):
app.add_directive('remote-content', BranchAwareRemoteContent)
return {
'parallel_read_safe': True,
'parallel_write_safe': True,
}

View File

@@ -10,7 +10,7 @@ Deep learning frameworks provide environments for machine learning, training, fi
ROCm offers a complete ecosystem for developing and running deep learning applications efficiently. It also provides ROCm-compatible versions of popular frameworks and libraries, such as PyTorch, TensorFlow, JAX, and others.
The AMD ROCm organization actively contributes to open-source development and collaborates closely with framework organizations. This collaboration ensures that framework-specific optimizations effectively leverage AMD GPUs and accelerators.
The AMD ROCm organization actively contributes to open-source development and collaborates closely with framework organizations. This collaboration ensures that framework-specific optimizations effectively leverage AMD GPUs.
The table below summarizes information about ROCm-enabled deep learning frameworks. It includes details on ROCm compatibility and third-party tool support, installation steps and options, and links to GitHub resources. For a complete list of supported framework versions on ROCm, see the :doc:`Compatibility matrix <../compatibility/compatibility-matrix>` topic.
@@ -84,6 +84,8 @@ The table below summarizes information about ROCm-enabled deep learning framewor
<a href="https://rocm.docs.amd.com/projects/install-on-linux/en/latest/install/3rd-party/dgl-install.html"><i class="fas fa-link fa-lg"></i></a>
-
- `Docker image <https://rocm.docs.amd.com/projects/install-on-linux/en/latest/install/3rd-party/dgl-install.html#use-a-prebuilt-docker-image-with-dgl-pre-installed>`__
- `Wheels package <https://rocm.docs.amd.com/projects/install-on-linux/en/latest/install/3rd-party/dgl-install.html#use-a-wheels-package>`__
- .. raw:: html
<a href="https://github.com/ROCm/dgl"><i class="fab fa-github fa-lg"></i></a>
@@ -98,18 +100,6 @@ The table below summarizes information about ROCm-enabled deep learning framewor
<a href="https://github.com/ROCm/megablocks"><i class="fab fa-github fa-lg"></i></a>
* - `Taichi <https://rocm.docs.amd.com/en/latest/compatibility/ml-compatibility/taichi-compatibility.html>`__
- .. raw:: html
<a href="https://rocm.docs.amd.com/projects/install-on-linux/en/latest/install/3rd-party/taichi-install.html"><i class="fas fa-link fa-lg"></i></a>
-
- `Docker image <https://rocm.docs.amd.com/projects/install-on-linux/en/latest/install/3rd-party/taichi-install.html#use-a-prebuilt-docker-image-with-taichi-pre-installed>`__
- `Wheels package <https://rocm.docs.amd.com/projects/install-on-linux/en/latest/install/3rd-party/taichi-install.html#use-a-wheels-package>`__
- .. raw:: html
<a href="https://github.com/ROCm/taichi"><i class="fab fa-github fa-lg"></i></a>
* - `Ray <https://rocm.docs.amd.com/en/latest/compatibility/ml-compatibility/ray-compatibility.html>`__
- .. raw:: html
@@ -128,10 +118,22 @@ The table below summarizes information about ROCm-enabled deep learning framewor
<a href="https://rocm.docs.amd.com/projects/install-on-linux/en/latest/install/3rd-party/llama-cpp-install.html"><i class="fas fa-link fa-lg"></i></a>
-
- `Docker image <https://rocm.docs.amd.com/projects/install-on-linux/en/latest/install/3rd-party/llama-cpp-install.html#use-a-prebuilt-docker-image-with-llama-cpp-pre-installed>`__
- `ROCm Base Docker image <https://rocm.docs.amd.com/projects/install-on-linux/en/latest/install/3rd-party/llama-cpp-install.html#build-your-own-docker-image>`__
- .. raw:: html
<a href="https://github.com/ROCm/llama.cpp"><i class="fab fa-github fa-lg"></i></a>
* - `FlashInfer <https://rocm.docs.amd.com/en/latest/compatibility/ml-compatibility/flashinfer-compatibility.html>`__
- .. raw:: html
<a href="https://rocm.docs.amd.com/projects/install-on-linux/en/latest/install/3rd-party/flashinfer-install.html"><i class="fas fa-link fa-lg"></i></a>
-
- `Docker image <https://rocm.docs.amd.com/projects/install-on-linux/en/latest/install/3rd-party/flashinfer-install.html#use-a-prebuilt-docker-image-with-flashinfer-pre-installed>`__
- `ROCm Base Docker image <https://rocm.docs.amd.com/projects/install-on-linux/en/latest/install/3rd-party/flashinfer-install.html#build-your-own-docker-image>`__
- .. raw:: html
<a href="https://github.com/ROCm/flashinfer"><i class="fab fa-github fa-lg"></i></a>
Learn how to use your ROCm deep learning environment for training, fine-tuning, inference, and performance optimization
through the following guides.

View File

@@ -1,5 +1,5 @@
.. meta::
:description: How to configure MI300X accelerators to fully leverage their capabilities and achieve optimal performance.
:description: How to configure MI300X GPUs to fully leverage their capabilities and achieve optimal performance.
:keywords: ROCm, AI, machine learning, MI300X, LLM, usage, tutorial, optimization, tuning
**************************************
@@ -7,11 +7,11 @@ AMD Instinct MI300X performance guides
**************************************
The following performance guides provide essential guidance on the necessary
steps to properly `configure your system for AMD Instinct™ MI300X accelerators
steps to properly `configure your system for AMD Instinct™ MI300X GPUs
<https://instinct.docs.amd.com/projects/amdgpu-docs/en/latest/system-optimization/mi300x.html>`_.
They include detailed instructions on system settings and application
:doc:`workload tuning </how-to/rocm-for-ai/inference-optimization/workload>` to
help you leverage the maximum capabilities of these accelerators and achieve
help you leverage the maximum capabilities of these GPUs and achieve
superior performance.
* `AMD Instinct MI300X system optimization <https://instinct.docs.amd.com/projects/amdgpu-docs/en/latest/system-optimization/mi300x.html>`__
@@ -19,9 +19,9 @@ superior performance.
your AMD Instinct MI300X system for performance.
* :doc:`/how-to/rocm-for-ai/inference-optimization/workload` covers steps to
optimize the performance of AMD Instinct MI300X series accelerators for HPC
optimize the performance of AMD Instinct MI300X Series GPUs for HPC
and deep learning operations.
* :doc:`/how-to/rocm-for-ai/inference/benchmark-docker/vllm` introduces a preconfigured
environment for LLM inference, designed to help you test performance with
popular models on AMD Instinct MI300X series accelerators.
popular models on AMD Instinct MI300X Series GPUs.

View File

@@ -25,7 +25,7 @@ execute on AMD GPUs while maintaining compatibility with CUDA-based systems.
OpenCL (Open Computing Language) is an open standard for cross-platform,
parallel programming of diverse processors. ROCm supports OpenCL for developers
who want to use standard frameworks across different hardware platforms,
including CPUs, GPUs, and other accelerators. For more information, see
including CPUs, GPUs, and APUs. For more information, see
`OpenCL <https://www.khronos.org/opencl/>`_.
Python bindings can be found at https://github.com/ROCm/hip-python.

View File

@@ -11,10 +11,10 @@ Fine-tuning using ROCm involves leveraging AMD's GPU-accelerated :doc:`libraries
ecosystem for deep learning development, including open-source libraries for optimized deep learning operations and
ROCm-aware versions of :doc:`deep learning frameworks <../../deep-learning-rocm>` such as PyTorch, TensorFlow, and JAX.
Single-accelerator systems, such as a machine equipped with a single accelerator or GPU, are commonly used for
Single-accelerator systems, such as a machine equipped with a single GPU, are commonly used for
smaller-scale deep learning tasks, including fine-tuning pre-trained models and running inference on moderately
sized datasets. See :doc:`single-gpu-fine-tuning-and-inference`.
Multi-accelerator systems, on the other hand, consist of multiple accelerators working in parallel. These systems are
Multi-accelerator systems, on the other hand, consist of multiple GPUs working in parallel. These systems are
typically used in LLMs and other large-scale deep learning tasks where performance, scalability, and the handling of
massive datasets are crucial. See :doc:`multi-gpu-fine-tuning-and-inference`.

View File

@@ -3,11 +3,11 @@
:keywords: ROCm, LLM, fine-tuning, usage, tutorial, multi-GPU, distributed, inference, accelerators, PyTorch, HuggingFace, torchtune
*****************************************************
Fine-tuning and inference using multiple accelerators
Fine-tuning and inference using multiple GPUs
*****************************************************
This section explains how to fine-tune a model on a multi-accelerator system. See
:doc:`Single-accelerator fine-tuning <single-gpu-fine-tuning-and-inference>` for a single accelerator or GPU setup.
:doc:`Single-accelerator fine-tuning <single-gpu-fine-tuning-and-inference>` for a single GPU setup.
.. _fine-tuning-llms-multi-gpu-env:
@@ -20,7 +20,7 @@ This section was tested using the following hardware and software environment.
:stub-columns: 1
* - Hardware
- 4 AMD Instinct MI300X accelerators
- 4 AMD Instinct MI300X GPUs
* - Software
- ROCm 6.1, Ubuntu 22.04, PyTorch 2.1.2, Python 3.10
@@ -40,13 +40,13 @@ Setting up the base implementation environment
:doc:`PyTorch installation guide <rocm-install-on-linux:install/3rd-party/pytorch-install>`. For consistent
installation, its recommended to use official ROCm prebuilt Docker images with the framework pre-installed.
#. In the Docker container, check the availability of ROCM-capable accelerators using the following command.
#. In the Docker container, check the availability of ROCm-capable GPUs using the following command.
.. code-block:: shell
rocm-smi --showproductname
#. Check that your accelerators are available to PyTorch.
#. Check that your GPUs are available to PyTorch.
.. code-block:: python
@@ -66,7 +66,7 @@ Setting up the base implementation environment
.. tip::
During training and inference, you can check the memory usage by running the ``rocm-smi`` command in your terminal.
This tool helps you see shows which accelerators or GPUs are involved.
This tool helps you see shows which GPUs are involved.
.. _fine-tuning-llms-multi-gpu-hugging-face-accelerate:
@@ -74,9 +74,9 @@ Setting up the base implementation environment
Hugging Face Accelerate for fine-tuning and inference
===========================================================
`Hugging Face Accelerate <https://huggingface.co/docs/accelerate/en/index>`_ is a library that simplifies turning raw
PyTorch code for a single accelerator into code for multiple accelerators for LLM fine-tuning and inference. It is
integrated with `Transformers <https://huggingface.co/docs/transformers/en/index>`_ allowing you to scale your PyTorch
`Hugging Face Accelerate <https://huggingface.co/docs/accelerate/en/index>`__ is a library that simplifies turning raw
PyTorch code for a single GPU into code for multiple GPUs for LLM fine-tuning and inference. It is
integrated with `Transformers <https://huggingface.co/docs/transformers/en/index>`__, so you can scale your PyTorch
code while maintaining performance and flexibility.
As a brief example of model fine-tuning and inference using multiple GPUs, let's use Transformers and load in the Llama
@@ -107,7 +107,7 @@ Now, it's important to adjust how you load the model. Add the ``device_map`` par
(``"auto"``, ``"balanced"``, ``"balanced_low_0"``, ``"sequential"``).
It's recommended to set the ``device_map`` parameter to ``“auto”`` to allow Accelerate to automatically and
efficiently allocate the model given the available resources (4 accelerators in this case).
efficiently allocate the model given the available resources (four GPUs in this case).
When you have more GPU memory available than the model size, here is the difference between each ``device_map``
option:
@@ -130,8 +130,8 @@ After loading the model in this way, the model is fully ready to use the resourc
torchtune for fine-tuning and inference
=============================================
`torchtune <https://pytorch.org/torchtune/main/>`_ is a PyTorch-native library for easy single and multi-accelerator or
GPU model fine-tuning and inference with LLMs.
`torchtune <https://meta-pytorch.org/torchtune/main/>`_ is a PyTorch-native library for easy single and multi-GPU
model fine-tuning and inference with LLMs.
#. Install torchtune using pip.

Some files were not shown because too many files have changed in this diff Show More