Compare commits

...

88 Commits

Author SHA1 Message Date
Sam Wu
fdfd9187d7 Update documentation requirements 2024-09-16 10:12:18 -08:00
Sam Wu
1659e6e8a0 Update documentation requirements 2024-06-06 16:58:20 -06:00
Sam Wu
9a496d97d5 Fix RTD config 2024-05-02 08:53:40 -06:00
Sam Wu
7dd2b6f12c Update documentation requirements 2024-05-01 16:58:39 -06:00
Sam Wu
7e53ad4f9c Update documentation requirements 2024-05-01 16:50:38 -06:00
Sam Wu
a1f0050f6b add version to html title 2023-08-04 17:15:18 -06:00
Sam Wu
bc8686a20c rocm-docs-core v0.18.3 2023-06-30 09:34:05 -06:00
dependabot[bot]
12ad0c6c8b Bump rocm-docs-core from 0.18.0 to 0.18.1 in /docs/sphinx (#2280)
Bumps [rocm-docs-core](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.18.0 to 0.18.1.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/RadeonOpenCompute/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.18.0...v0.18.1)

---
updated-dependencies:
- dependency-name: rocm-docs-core
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-06-27 19:34:34 -06:00
Sam Wu
eaf98b0001 rocm_smi_lib 2023-06-21 17:00:38 -06:00
Sam Wu
6dcbf4b594 edit instructions, logs, and notes for 5.1.3 2023-06-21 15:25:13 -06:00
Sam Wu
ca426a3821 edit instructions, logs, and notes for 5.1.1 2023-06-21 15:18:20 -06:00
Máté Ferenc Nagy-Egri
0e54cd2ec6 Downgrade license notice to 5.2.0 2023-06-15 13:04:48 +02:00
Máté Ferenc Nagy-Egri
e8ad3843bf Downgrade changelog to 5.2.0 2023-06-15 13:04:48 +02:00
Máté Ferenc Nagy-Egri
daab3058b0 Downgrade install instructions to 5.2.0 2023-06-15 13:04:48 +02:00
Máté Ferenc Nagy-Egri
16be156acc Downgrade release notes to 5.2.0 2023-06-15 13:04:48 +02:00
Máté Ferenc Nagy-Egri
279fa18f5a Downgrade support matrices to 5.2.3 2023-06-15 13:04:48 +02:00
Máté Ferenc Nagy-Egri
746cc7fe57 Downgrade license notice to 5.2.3 2023-06-15 13:04:47 +02:00
Máté Ferenc Nagy-Egri
9e5263ebca Downgrade changelog to 5.2.3 2023-06-15 13:04:47 +02:00
Máté Ferenc Nagy-Egri
95281a4570 Downgrade install instructions to 5.2.3 2023-06-15 13:04:29 +02:00
Máté Ferenc Nagy-Egri
f5e2c6640d Downgrade release notes to 5.2.3 2023-06-14 15:08:19 +02:00
Máté Ferenc Nagy-Egri
d9173e132e Downgrade license notice to 5.3.0 2023-06-14 15:08:19 +02:00
Máté Ferenc Nagy-Egri
cf8a084f47 Downgrade changelog to 5.3.0 2023-06-14 15:08:19 +02:00
Máté Ferenc Nagy-Egri
e69f4bd470 Downgrade install instructions to 5.3.0 2023-06-14 15:08:19 +02:00
Máté Ferenc Nagy-Egri
2044d16f32 Downgrade release notes to 5.3.0 2023-06-14 15:08:19 +02:00
Máté Ferenc Nagy-Egri
6029cf7fff Downgrade license notice to 5.3.2 2023-06-14 15:08:19 +02:00
Máté Ferenc Nagy-Egri
4cbce8a0eb Downgrade changelog to 5.3.2 2023-06-14 15:08:19 +02:00
Máté Ferenc Nagy-Egri
00a448f25d Downgrade install instructions to 5.3.2 2023-06-14 15:08:18 +02:00
Máté Ferenc Nagy-Egri
5f11e96b89 Downgrade release notes to 5.3.2 2023-06-14 15:08:18 +02:00
Máté Ferenc Nagy-Egri
0b60d6c54d Downgrade support matrices to 5.3.3 2023-06-14 15:06:35 +02:00
Máté Ferenc Nagy-Egri
aeb750208f Downgrade license notice to 5.3.3 2023-06-14 14:04:06 +02:00
Máté Ferenc Nagy-Egri
fba1e09deb Downgrade changelog to 5.3.3 2023-06-14 14:04:06 +02:00
Máté Ferenc Nagy-Egri
d6990e32f3 Downgrade install instructions to 5.3.3 2023-06-14 14:04:06 +02:00
Máté Ferenc Nagy-Egri
01d12821fe Downgrade release notes to 5.3.3 2023-06-14 10:59:36 +02:00
Máté Ferenc Nagy-Egri
070a7db8a2 Downgrade license notice to 5.4.0 2023-06-13 14:46:19 +02:00
Máté Ferenc Nagy-Egri
911f18c6c6 Downgrade changelog to 5.4.0 2023-06-13 14:46:10 +02:00
Máté Ferenc Nagy-Egri
9a5d323b01 Downgrade install instructions to 5.4.0 2023-06-13 14:46:02 +02:00
Máté Ferenc Nagy-Egri
ab0e1fd625 Downgrade release notes to 5.4.0 2023-06-13 14:45:50 +02:00
Máté Ferenc Nagy-Egri
a635018505 Downgrade license notice to 5.4.1 2023-06-13 14:24:12 +02:00
Máté Ferenc Nagy-Egri
d2070c1b4a Downgrade changelog to 5.4.1 2023-06-13 14:23:59 +02:00
Máté Ferenc Nagy-Egri
12c9158880 Downgrade install instructions to 5.4.1 2023-06-13 14:23:51 +02:00
Máté Ferenc Nagy-Egri
9a55d71cec Downgrade release notes to 5.4.1 2023-06-13 14:23:38 +02:00
Máté Ferenc Nagy-Egri
9a9df83a77 Downgrade license notice to 5.4.2 2023-06-13 14:21:39 +02:00
Máté Ferenc Nagy-Egri
395e607525 Downgrade changelog to 5.4.2 2023-06-13 14:21:39 +02:00
Máté Ferenc Nagy-Egri
561c304e10 Downgrade install instructions to 5.4.2 2023-06-13 14:21:39 +02:00
Máté Ferenc Nagy-Egri
778db160eb Downgrade release notes to 5.4.2 2023-06-13 14:16:46 +02:00
Máté Ferenc Nagy-Egri
60a3065399 Downgrade support matrices to 5.4.3 2023-06-12 14:38:40 +02:00
Máté Ferenc Nagy-Egri
3c360627bd Downgrade changelog to 5.4.3 2023-06-12 14:37:41 +02:00
Máté Ferenc Nagy-Egri
e1151d4dbb Downgrade install instructions to 5.4.3 2023-06-12 14:37:25 +02:00
Máté Ferenc Nagy-Egri
616a09c442 Downgrade release notes to 5.4.3 2023-06-12 14:37:06 +02:00
Máté Ferenc Nagy-Egri
082fbb9d44 Downgrade changelog to 5.5.0 2023-06-12 14:27:40 +02:00
Máté Ferenc Nagy-Egri
1971584024 Downgrade install instructions to 5.5.0 2023-06-12 14:27:40 +02:00
Máté Ferenc Nagy-Egri
71e52a1c84 Downgrade release notes to 5.5.0 2023-06-12 14:17:26 +02:00
Gergely Meszaros
a471e8debe Add instructions for adding extra repositories in RHEL and SLES
The hip-devel package depends on perl modules not distributed by default
on RHEL and SLES distributions, these can be installed from EPEL and
the `devel:languages:perl` repository respectively.

Ideally in the future these dependencies would be replaced with packages
available from default repositories, but in the meanwhile this should
be at least documented.
2023-06-08 09:37:00 -06:00
dependabot[bot]
8c86526f98 Bump rocm-docs-core from 0.13.3 to 0.13.4 in /docs/sphinx (#2226)
Bumps [rocm-docs-core](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.13.3 to 0.13.4.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/RadeonOpenCompute/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.13.3...v0.13.4)

---
updated-dependencies:
- dependency-name: rocm-docs-core
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-06-08 09:18:23 -06:00
Mészáros Gergely
a42fae5140 Install fixes (#2228)
* Remove install instructions for unsupported RHEL 8.8 and 9.2

Current ROCm release does not support these versions of RHEL

* Centralize disclaimers and prerequisites for installation

- Move the single-version to multi-version disclaimer to the install
  overview page where single vs multi installs are discussed.
- Move the installation of kernel-headers and development packages
  to the install preparation page. Unify it mainly from the quick start
  content.

* s/Name/name/ in repository config files for RHEL

The repository name can be set as `name=<name>` instead of `Name`,
otherwise yum complains about the repo not having a name, e.g:
```output
Repository 'ROCm-5.3.3' is missing name in configuration, using id.
```

This is fixed with this commit.

* Clean up render/video group section on prerequisites

* Installation and Upgrade restructuring & fixes

- Fix the rocm package urls for RHEL in the install & upgrade guides
  - RHEL8 and 9 have different URLs, add a tab-set similar to ubuntu
    for them.
- Fix the package URL in the upgrade guide for SLES (previously pointed
  to the amdgpu url)
- Change the apt-signing key download and conversion to the method used
  in the quick start guide, which is the one recommended by Ubuntu maintainers
- Change the install steps from list items to rubrics with numbered entries
  which is more readable and matches the style in the quick start guide
- Do not pass `--append` to `tee` in the upgrade guide, because it is
  meant to overwrite.
- Split the one long tab-set to multiple tab-sets in the upgrade guide
  to improve readability
2023-06-08 09:17:51 -06:00
Saad Rahim
bcb3dd3b4a PCIe Atomics (#2223)
Co-authored-by: Nagy-Egri Máté Ferenc <beiktatas+github@outlook.hu>
2023-06-06 21:52:18 -06:00
Mészáros Gergely
8784fe3fba Install updates (#2221)
* Install updates

- revert distro command installation -> package manager installation
- move description of installer script to common section
- updates to the installer script installation page
- other misc fixes

* Fix spelling
2023-06-06 07:06:06 -06:00
Saad Rahim
6e79d204b8 Further installation fixes (#2219)
Co-authored-by: Sam Wu <sjwu@ualberta.ca>
2023-06-04 11:33:27 -06:00
Sam Wu
7076bc18ca Standardize install instructions (#2220)
* standardize install instructions

* use rocm-5.5.1 in install instructions
2023-06-04 10:49:11 -06:00
Saad Rahim
519df7a51f Refactoring installation documentation (#2202)
Co-authored-by: Sam Wu <sam.wu2@amd.com>
2023-06-02 14:35:24 -06:00
dependabot[bot]
90c697b6d3 Bump rocm-docs-core from 0.13.2 to 0.13.3 in /docs/sphinx (#2214)
Bumps [rocm-docs-core](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.13.2 to 0.13.3.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/RadeonOpenCompute/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.13.2...v0.13.3)

---
updated-dependencies:
- dependency-name: rocm-docs-core
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-06-01 11:52:50 -06:00
Nara
125cc37981 Update changelog for 5.5.1 (#2199)
* docs(changelog): update changelog for 5.5.1

Signed-off-by: Nara Prasetya <nara@streamhpc.com>

* docs(changelog): Improve continuity in release notes

* docs(changelog): Add changelog to TOC

---------

Signed-off-by: Nara Prasetya <nara@streamhpc.com>
2023-06-01 09:40:51 -06:00
Nagy-Egri Máté Ferenc
5752b5986c Remove links to docs.amd.com (#2200)
* Remove links to docs.amd.com

* Fix linking to list item (not possible)
2023-06-01 08:16:38 -06:00
dependabot[bot]
2829c088c2 Bump rocm-docs-core from 0.13.1 to 0.13.2 in /docs/sphinx (#2201)
Bumps [rocm-docs-core](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.13.1 to 0.13.2.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/RadeonOpenCompute/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.13.1...v0.13.2)

---
updated-dependencies:
- dependency-name: rocm-docs-core
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-05-31 11:49:43 -06:00
dependabot[bot]
3b9fb62600 Bump rocm-docs-core from 0.13.0 to 0.13.1 in /docs/sphinx (#2190)
Bumps [rocm-docs-core](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.13.0 to 0.13.1.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/RadeonOpenCompute/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.13.0...v0.13.1)

---
updated-dependencies:
- dependency-name: rocm-docs-core
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Sam Wu <sam.wu2@amd.com>
2023-05-30 10:20:18 -06:00
Mészáros Gergely
b7222caed2 Replace incorrect em-dashes with dashes in code-blocks (#2192)
Replace em-dash('–') with dash('-') in code blocks where the latter was
meant.
2023-05-30 07:26:23 -06:00
Nagy-Egri Máté Ferenc
c285dd729f Team-feedback (#2193)
* Fix hipRAND copy-paste error

* Remove superfluous table reference
2023-05-30 07:06:06 -06:00
Mészáros Gergely
0c93636d23 Replace links to subprojects docs with intersphinx links (#2181) 2023-05-29 12:33:46 -06:00
Sam Wu
3fa5f1fddc Update doc requirements and suppress duplicate main doc link (#2189)
* update to rocm-docs-core v0.13.0

also suppress main doc link

* rename home link to ROCm Documentation Home
2023-05-29 12:32:50 -06:00
Saad Rahim
17b029b885 Changing title (#2183) 2023-05-25 22:32:59 -06:00
Saad Rahim
460f46c3be Adding repo priority for Ubuntu 22.04 (#2178)
* Adding repo priority for Ubuntu 22.04

* removed unnecessary apt-update
2023-05-25 14:46:43 -06:00
Mészáros Gergely
6feca81dd0 docs: fix bios settings tables in mi100/mi200 tuning guides (#2179)
Add empty cells to list tables to make them uniform (all rows have the
same number of cells), before this the tables errored out with:

> ERROR: Error parsing content block for the "list-table" directive:
> uniform two-level bullet list expected, but row 13 does not contain
> the same number of items as row 1 (3 vs 4)

and the table did not show up.
2023-05-25 09:54:40 -06:00
Mészáros Gergely
ec8496041a ci: change markdown linting to use the NodeJs markdownlint (#2180)
* ci: change markdown linting to use the NodeJs markdownlint

The original ruby based markdownlint has a few shortcomings not known
when it was introduced:
- no support for myst extensions
- no support for disabling specific rules for specific files or regions

Combined, these two make it very hard to use in this project, because it
reports false positives around myst extensions.

Luckily there's a NodeJS based version of markdownlint [1] supporting the
same ruleset that is more configurable:
- seems to support myst extensions better
- has an html comment based syntax to disable specific rules

The library also seems to be better maintained and has better tooling:
e.g. there's a vscode extension using the engine for local use:
markdownlint (DavidAnson.vscode-markdownlint).

[1]: https://github.com/DavidAnson/markdownlint

* docs: hotfix empty links

There are missing links in the docs, these should get fixed, but for now
they are just monkey patched to make CI happy.

* docs: fix links

---------

Co-authored-by: Nara Prasetya <nara@streamhpc.com>
2023-05-25 09:51:19 -06:00
Edgar Gabriel
c7350c08ab update the gpu-aware-mpi page (#2176)
* update the gpu-aware-mpi page

Three changes:
 - add the ucx compatibility table
 - add the --with-rocm=/opt/rocm option to the compilation of Open MPI
 - add a section about how to compile and use UCC for collective
operations.

* Changing link to relative

* Update gpu_aware_mpi.md

---------

Co-authored-by: Saad Rahim <44449863+saadrahim@users.noreply.github.com>
2023-05-24 16:42:45 -06:00
Sam Wu
c1809766e6 Link fixes (#2177)
* fix rocmcc link

* remove unused link

* remove unused linkcheck configs

* update amd smi section

add link to amd smi github

---------

Co-authored-by: Saad Rahim <44449863+saadrahim@users.noreply.github.com>
2023-05-24 16:14:23 -06:00
Saad Rahim
61df1ec8c6 Updating link to new dev hub (#2174) 2023-05-24 16:11:14 -06:00
Li Li
983987aab5 Update deep learning guide (#2124)
* add deep learning guide

* separate out optimization, reference, and troubleshooting as standalone sections.

* resolve lint errors

* delete introduction to DL

* correct syntax highlights and filename

* remove out-of-date QAs

* Renaming and cleanup

* Spelling

* Fixup TOC

---------

Co-authored-by: Nara Prasetya <nara@streamhpc.com>
Co-authored-by: Saad Rahim <44449863+saadrahim@users.noreply.github.com>
2023-05-24 16:04:30 -06:00
zhang2amd
914b62e219 Update default.xml for 5.5.1 release 2023-05-24 13:17:55 -07:00
Saad Rahim
faac45772c Broken Links (#2172) 2023-05-24 11:11:40 -06:00
dependabot[bot]
d206494272 Bump rocm-docs-core from 0.11.1 to 0.12.0 in /docs/sphinx (#2171)
Bumps [rocm-docs-core](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.11.1 to 0.12.0.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/RadeonOpenCompute/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.11.1...v0.12.0)

---
updated-dependencies:
- dependency-name: rocm-docs-core
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-05-24 10:19:34 -06:00
Saad Rahim
26c73a3986 Fixing GPU support tables (#2170)
* Fixing GPU support tables

* Linting
2023-05-24 10:06:12 -06:00
Nagy-Egri Máté Ferenc
dc74008ac6 Fix-landing-pages (#2167) 2023-05-24 07:27:50 -06:00
dependabot[bot]
108287dcd7 Bump rocm-docs-core from 0.11.0 to 0.11.1 in /docs/sphinx (#2164)
Bumps [rocm-docs-core](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.11.0 to 0.11.1.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/RadeonOpenCompute/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.11.0...v0.11.1)

---
updated-dependencies:
- dependency-name: rocm-docs-core
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-05-24 07:08:10 -06:00
Nagy-Egri Máté Ferenc
38440915ef Finish-compat-section (#2166)
* User/Kernel-Space compat

* Update ML compat at 5.5.0

* Fix spelling of user and kernel space
2023-05-24 07:02:43 -06:00
srawat
d9c434881a Update openmp.md (#2163)
Updated the link for supported GPUs from absolute to relative "(../../release/gpu_os_support.md#gpu-support-table)"
2023-05-23 07:05:18 -06:00
Nagy-Egri Máté Ferenc
4c795d45f6 Typo and link style fixes (#2158)
* CMake package config filename format

* No links as text
2023-05-22 17:27:59 -06:00
Saad Rahim
ef0a88ea0e Navigation improvement (#2151)
* Reorganized Ref Grid card and ROCm intro

* MIGraphX link

* openmp header cleanup

* Fixing durationN

* Syncing grid cards to left nav
2023-05-19 15:07:46 -06:00
Nagy-Egri Máté Ferenc
34578f0193 Compatibility pages review (#2134) 2023-05-19 07:38:14 -06:00
82 changed files with 2666 additions and 6249 deletions

2
.github/CODEOWNERS vendored

@@ -1 +1 @@
* @saadrahim @Rmalavally @amd-aakash @zhang2amd @jlgreathouse @samjwu
* @saadrahim @Rmalavally @amd-aakash @zhang2amd @jlgreathouse @samjwu @MathiasMagnus

@@ -32,10 +32,10 @@ jobs:
steps:
- name: Checkout code
uses: actions/checkout@v3
- name: Use markdownlint
uses: actionshub/markdownlint@v3.1.3
- name: Use markdownlint-cli2
uses: DavidAnson/markdownlint-cli2-action@v10.0.1
with:
filesToIgnoreRegex: CHANGELOG.md|(docs\/)?(RELEASE|release).md|tools\/autotag\/templates\/.
globs: '**/*.md'
spelling:
name: "Spelling"
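
The `filesToIgnoreRegex` value in the hunk above can be exercised on its own. The following Python sketch assumes the pattern is applied as an unanchored search against repository-relative paths (how the action itself applies it is not stated here), and the sample paths are hypothetical.

```python
import re

# Pattern copied verbatim from the workflow's filesToIgnoreRegex value.
IGNORE_PATTERN = re.compile(
    r"CHANGELOG.md|(docs\/)?(RELEASE|release).md|tools\/autotag\/templates\/."
)


def is_ignored(path: str) -> bool:
    """Return True if the path matches the pattern (assumed: unanchored search)."""
    return IGNORE_PATTERN.search(path) is not None


if __name__ == "__main__":
    # Hypothetical sample paths, chosen only to exercise each alternative.
    for sample in [
        "CHANGELOG.md",
        "docs/release.md",
        "README.md",
        "tools/autotag/templates/example.md",
    ]:
        print(sample, is_ignored(sample))
```

Note that the dots in the pattern are unescaped, so they match any character rather than a literal `.`; the `ignores` globs in `.markdownlint-cli2.yaml` appear to target the same files without that ambiguity.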

1
.gitignore vendored

@@ -15,3 +15,4 @@ _readthedocs/
# avoid duplicating contributing.md due to conf.py
docs/contributing.md
docs/release.md
docs/CHANGELOG.md

14
.markdownlint-cli2.yaml Normal file

@@ -0,0 +1,14 @@
config:
default: true
MD013: false
MD026:
punctuation: '.,;:!'
MD029:
style: ordered
MD033: false
MD034: false
MD041: false
ignores:
- CHANGELOG.md
- "{,docs/}{RELEASE,release}.md"
- tools/autotag/templates/**/*.md

1
.mdlrc

@@ -1 +0,0 @@
style "mdlrc-style.rb"

@@ -3,12 +3,19 @@
version: 2
build:
os: ubuntu-22.04
tools:
python: "3.10"
apt_packages:
- "doxygen"
- "graphviz" # For dot graphs in doxygen
python:
install:
- requirements: docs/sphinx/requirements.txt
sphinx:
configuration: docs/conf.py
formats: [htmlzip, pdf, epub]
python:
version: "3.8"
install:
- requirements: docs/sphinx/requirements.txt
formats: []

File diff suppressed because it is too large

@@ -1,4 +1,4 @@
# AMD ROCm™ Platform - Powering Your GPU Computational Needs
# AMD ROCm™ Platform
ROCm™ is an open-source stack for GPU computation. ROCm is primarily Open-Source
Software (OSS) that allows developers the freedom to customize and tailor their
@@ -32,7 +32,13 @@ The default.xml file uses the repo Manifest format.
The develop branch of this repository contains content for the next
ROCm release.
## How to build documentation via Sphinx
## ROCm Documentation
ROCm Documentation is available online at
[rocm.docs.amd.com](https://rocm.docs.amd.com). Source code for the documentation
is located in the docs folder of most repositories that are part of ROCm.
### How to build documentation via Sphinx
```bash
cd docs

@@ -15,679 +15,25 @@ The release notes for the ROCm platform.
-------------------
## ROCm 5.5.0
<!-- markdownlint-disable first-line-h1 -->
<!-- markdownlint-disable no-duplicate-header -->
### What's New in This Release
## ROCm 5.1.3
#### HIP Enhancements
The ROCm v5.5 release consists of the following HIP enhancements:
##### Enhanced Stack Size Limit
In this release, the stack size limit is increased from 16k to 131056 bytes (or 128K - 16).
Applications that need to change the stack size can use the `hipDeviceSetLimit` API.
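
The two figures quoted for the new limit are the same quantity, which a few lines of Python can confirm. This is only an arithmetic check, assuming "K" and "k" denote 1024 bytes; the constant names are made up for illustration.

```python
# Arithmetic check of the stack size figures quoted in the release notes.
KIB = 1024  # assumption: "K"/"k" denotes 1024 bytes

OLD_LIMIT_BYTES = 16 * KIB        # previous limit, written as "16k"
NEW_LIMIT_BYTES = 128 * KIB - 16  # new limit, written as "128K - 16"

if __name__ == "__main__":
    print(OLD_LIMIT_BYTES, NEW_LIMIT_BYTES)
```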
##### `hipcc` Changes
The following hipcc changes are implemented in this release:
- `hipcc` will not implicitly link to `libpthread` and `librt`, as they are no longer a link-time dependency for HIP programs. Applications that depend on these libraries must explicitly link to them.
- `-use-staticlib` and `-use-sharedlib` options are deprecated.
##### Future Changes
- Separation of `hipcc` binaries (Perl scripts) from HIP to the `hipcc` project. Users will use a separate `hipcc` package to install `hipcc` binaries in future ROCm releases.
- In a future ROCm release, the following samples will be removed from the `hip-tests` project.
- `hipBusBandwidth` at <https://github.com/ROCm-Developer-Tools/hip-tests/tree/develop/samples/1_Utils/shipBusBandwidth>
- `hipCommander` at <https://github.com/ROCm-Developer-Tools/hip-tests/tree/develop/samples/1_Utils/hipCommander>
Note that the samples will continue to be available in previous release branches.
##### New HIP APIs in This Release
> **Note**
>
> This is a pre-official version (beta) release of the new APIs and may contain unresolved issues.
###### Memory Management HIP APIs
The new memory management HIP API is as follows:
- Sets information on the specified pointer [BETA].
```h
hipError_t hipPointerSetAttribute(const void* value, hipPointer_attribute attribute, hipDeviceptr_t ptr);
```
###### Module Management HIP APIs
The new module management HIP APIs are as follows:
- Launches kernel $f$ with launch parameters and shared memory on stream with arguments passed to `kernelParams`, where thread blocks can cooperate and synchronize as they execute.
```h
hipError_t hipModuleLaunchCooperativeKernel(hipFunction_t f, unsigned int gridDimX, unsigned int gridDimY, unsigned int gridDimZ, unsigned int blockDimX, unsigned int blockDimY, unsigned int blockDimZ, unsigned int sharedMemBytes, hipStream_t stream, void** kernelParams);
```
- Launches kernels on multiple devices where thread blocks can cooperate and synchronize as they execute.
```h
hipError_t hipModuleLaunchCooperativeKernelMultiDevice(hipFunctionLaunchParams* launchParamsList, unsigned int numDevices, unsigned int flags);
```
###### HIP Graph Management APIs
The new HIP Graph Management APIs are as follows:
- Creates a memory allocation node and adds it to a graph [BETA]
```h
hipError_t hipGraphAddMemAllocNode(hipGraphNode_t* pGraphNode, hipGraph_t graph, const hipGraphNode_t* pDependencies, size_t numDependencies, hipMemAllocNodeParams* pNodeParams);
```
- Return parameters for memory allocation node [BETA]
```h
hipError_t hipGraphMemAllocNodeGetParams(hipGraphNode_t node, hipMemAllocNodeParams* pNodeParams);
```
- Creates a memory free node and adds it to a graph [BETA]
```h
hipError_t hipGraphAddMemFreeNode(hipGraphNode_t* pGraphNode, hipGraph_t graph, const hipGraphNode_t* pDependencies, size_t numDependencies, void* dev_ptr);
```
- Returns parameters for memory free node [BETA].
```h
hipError_t hipGraphMemFreeNodeGetParams(hipGraphNode_t node, void* dev_ptr);
```
- Write a DOT file describing graph structure [BETA].
```h
hipError_t hipGraphDebugDotPrint(hipGraph_t graph, const char* path, unsigned int flags);
```
- Copies attributes from source node to destination node [BETA].
```h
hipError_t hipGraphKernelNodeCopyAttributes(hipGraphNode_t hSrc, hipGraphNode_t hDst);
```
- Enables or disables the specified node in the given graphExec [BETA]
```h
hipError_t hipGraphNodeSetEnabled(hipGraphExec_t hGraphExec, hipGraphNode_t hNode, unsigned int isEnabled);
```
- Query whether a node in the given graphExec is enabled [BETA]
```h
hipError_t hipGraphNodeGetEnabled(hipGraphExec_t hGraphExec, hipGraphNode_t hNode, unsigned int* isEnabled);
```
##### OpenMP Enhancements
This release consists of the following OpenMP enhancements:
- Additional support for OMPT functions `get_device_time` and `get_record_type`.
- Add support for min/max fast fp atomics on AMD GPUs.
- Fix the use of the abs function in C device regions.
### Deprecations and Warnings
#### HIP Deprecation
The `hipcc` and `hipconfig` Perl scripts are deprecated. In a future release, compiled binaries will be available as `hipcc.bin` and `hipconfig.bin` as replacements for the Perl scripts.
> **Note**
>
> There will be a transition period where the Perl scripts and compiled binaries are available before the scripts are removed. There will be no functional difference between the Perl scripts and their compiled binary counterpart. No user action is required. Once these are available, users can optionally switch to `hipcc.bin` and `hipconfig.bin`. The `hipcc`/`hipconfig` soft link will be assimilated to point from `hipcc`/`hipconfig` to the respective compiled binaries as the default option.
##### Linux Filesystem Hierarchy Standard for ROCm
ROCm packages have adopted the Linux foundation filesystem hierarchy standard in this release to ensure ROCm components follow open source conventions for Linux-based distributions. While moving to a new filesystem hierarchy, ROCm ensures backward compatibility with its 5.1 version or older filesystem hierarchy. See below for a detailed explanation of the new filesystem hierarchy and backward compatibility.
##### New Filesystem Hierarchy
The following is the new filesystem hierarchy:
```text
/opt/rocm-<ver>
| --bin
| --All externally exposed Binaries
| --libexec
| --<component>
| -- Component specific private non-ISA executables (architecture independent)
| --include
| -- <component>
| --<header files>
| --lib
| --lib<soname>.so -> lib<soname>.so.major -> lib<soname>.so.major.minor.patch
(public libraries linked with application)
| --<component> (component specific private library, executable data)
| --<cmake>
| --components
| --<component>.config.cmake
| --share
| --html/<component>/*.html
| --info/<component>/*.[pdf, md, txt]
| --man
| --doc
| --<component>
| --<licenses>
| --<component>
| --<misc files> (arch independent non-executable)
| --samples
```
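
The `lib<soname>.so -> lib<soname>.so.major -> lib<soname>.so.major.minor.patch` chain in the tree above is a sequence of symbolic links. A minimal Python sketch of that layout follows, assuming a POSIX filesystem; the library name `example`, the version `1.2.3`, and the helper `build_soname_chain` are placeholders, not part of ROCm.

```python
import os
import tempfile


def build_soname_chain(lib_dir: str, soname: str, version: str) -> str:
    """Create lib<soname>.so -> .so.<major> -> .so.<major.minor.patch> in lib_dir.

    Returns the path of the unversioned link. All names are placeholders.
    """
    major = version.split(".")[0]
    real_name = f"lib{soname}.so.{version}"
    major_name = f"lib{soname}.so.{major}"
    dev_name = f"lib{soname}.so"

    # Empty file standing in for the real shared object.
    with open(os.path.join(lib_dir, real_name), "wb"):
        pass
    # Relative link targets, so the chain survives relocation of lib_dir.
    os.symlink(real_name, os.path.join(lib_dir, major_name))
    os.symlink(major_name, os.path.join(lib_dir, dev_name))
    return os.path.join(lib_dir, dev_name)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        link = build_soname_chain(tmp, "example", "1.2.3")
        # Following every link ends at the fully versioned file.
        print(os.path.basename(os.path.realpath(link)))
```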
> **Note**
>
> ROCm will not support backward compatibility with the v5.1(old) file system hierarchy in its next major release.
For more information, refer to <https://refspecs.linuxfoundation.org/fhs.shtml>.
##### Backward Compatibility with Older Filesystems
ROCm has moved header files and libraries to their new locations as indicated in the above structure and includes symbolic links and wrapper header files in the old locations for backward compatibility.
> **Note**
>
> ROCm will continue supporting backward compatibility until the next major release.
##### Wrapper header files
Wrapper header files are placed in the old location (`/opt/rocm-xxx/<component>/include`) with a warning message to include files from the new location (`/opt/rocm-xxx/include`) as shown in the example below:
```h
// Code snippet from hip_runtime.h
#pragma message("This file is deprecated. Use file from include path /opt/rocm-ver/include/ and prefix with hip")
#include "hip/hip_runtime.h"
```
The wrapper header files backward compatibility deprecation is as follows:
- `#pragma` message announcing deprecation -- ROCm v5.2 release
- `#pragma` message changed to `#warning` -- Future release
- `#warning` changed to `#error` -- Future release
- Backward compatibility wrappers removed -- Future release
##### Library files
Library files are available in the `/opt/rocm-xxx/lib` folder. For backward compatibility, the old library location (`/opt/rocm-xxx/<component>/lib`) has a soft link to the library at the new location.
Example:
```log
$ ls -l /opt/rocm/hip/lib/
total 4
drwxr-xr-x 4 root root 4096 May 12 10:45 cmake
lrwxrwxrwx 1 root root 24 May 10 23:32 libamdhip64.so -> ../../lib/libamdhip64.so
```
##### CMake Config files
All CMake configuration files are available in the `/opt/rocm-xxx/lib/cmake/<component>` folder.
For backward compatibility, the old CMake locations (`/opt/rocm-xxx/<component>/lib/cmake`) consist of a soft link to the new CMake config.
Example:
```log
$ ls -l /opt/rocm/hip/lib/cmake/hip/
total 0
lrwxrwxrwx 1 root root 42 May 10 23:32 hip-config.cmake -> ../../../../lib/cmake/hip/hip-config.cmake
```
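
The relative link targets in the two `ls -l` listings can be resolved by hand or with a short script. The sketch below assumes the links live in the directories shown in the listings and resolves them lexically, so it ignores any further symlinks along the path; `resolve_link` is a helper name introduced here for illustration.

```python
import posixpath


def resolve_link(link_dir: str, target: str) -> str:
    """Resolve a relative symlink target against the directory holding the link."""
    return posixpath.normpath(posixpath.join(link_dir, target))


if __name__ == "__main__":
    # Library link from the old HIP library directory.
    print(resolve_link("/opt/rocm/hip/lib", "../../lib/libamdhip64.so"))
    # CMake config link from the old HIP CMake directory.
    print(resolve_link("/opt/rocm/hip/lib/cmake/hip",
                       "../../../../lib/cmake/hip/hip-config.cmake"))
```

Both targets land under `/opt/rocm/lib`, which matches the new locations described in the text.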
#### ROCm Support For Code Object V3 Deprecated
Support for Code Object v3 is deprecated and will be removed in a future release.
#### Comgr V3.0 Changes
The following APIs and macros have been marked as deprecated. They are expected to be removed in a future ROCm release that coincides with the release of Comgr v3.0.
##### API Changes
- `amd_comgr_action_info_set_options()`
- `amd_comgr_action_info_get_options()`
##### Actions and Data Types
- `AMD_COMGR_ACTION_ADD_DEVICE_LIBRARIES`
- `AMD_COMGR_ACTION_COMPILE_SOURCE_TO_FATBIN`
For replacements, see the `AMD_COMGR_ACTION_INFO_GET`/`SET_OPTION_LIST` APIs, and the `AMD_COMGR_ACTION_COMPILE_SOURCE_(WITH_DEVICE_LIBS)_TO_BC` macros.
#### Deprecated Environment Variables
The following environment variables are removed in this ROCm release:
- `GPU_MAX_COMMAND_QUEUES`
- `GPU_MAX_WORKGROUP_SIZE_2D_X`
- `GPU_MAX_WORKGROUP_SIZE_2D_Y`
- `GPU_MAX_WORKGROUP_SIZE_3D_X`
- `GPU_MAX_WORKGROUP_SIZE_3D_Y`
- `GPU_MAX_WORKGROUP_SIZE_3D_Z`
- `GPU_BLIT_ENGINE_TYPE`
- `GPU_USE_SYNC_OBJECTS`
- `AMD_OCL_SC_LIB`
- `AMD_OCL_ENABLE_MESSAGE_BOX`
- `GPU_FORCE_64BIT_PTR`
- `GPU_FORCE_OCL20_32BIT`
- `GPU_RAW_TIMESTAMP`
- `GPU_SELECT_COMPUTE_RINGS_ID`
- `GPU_USE_SINGLE_SCRATCH`
- `GPU_ENABLE_LARGE_ALLOCATION`
- `HSA_LOCAL_MEMORY_ENABLE`
- `HSA_ENABLE_COARSE_GRAIN_SVM`
- `GPU_IFH_MODE`
- `OCL_SYSMEM_REQUIREMENT`
- `OCL_CODE_CACHE_ENABLE`
- `OCL_CODE_CACHE_RESET`
### Known Issues In This Release
The following are the known issues in this release.
#### `DISTRIBUTED`/`TEST_DISTRIBUTED_SPAWN` Fails
When user applications call `ncclCommAbort` to destruct communicators and then create new
communicators repeatedly, subsequent communicators may fail to initialize.
This issue is under investigation and will be resolved in a future release.
#### Failures In HIP Directed Tests
Multiple HIP directed tests fail.
### Library Changes in ROCM 5.5.0
### Library Changes in ROCM 5.1.3
| Library | Version |
|---------|---------|
| hipBLAS | 0.53.0 ⇒ [0.54.0](https://github.com/ROCmSoftwarePlatform/hipBLAS/releases/tag/rocm-5.5.0) |
| hipCUB | 2.13.0 ⇒ [2.13.1](https://github.com/ROCmSoftwarePlatform/hipCUB/releases/tag/rocm-5.5.0) |
| hipFFT | 1.0.10 ⇒ [1.0.11](https://github.com/ROCmSoftwarePlatform/hipFFT/releases/tag/rocm-5.5.0) |
| hipSOLVER | 1.6.0 ⇒ [1.7.0](https://github.com/ROCmSoftwarePlatform/hipSOLVER/releases/tag/rocm-5.5.0) |
| hipSPARSE | 2.3.3 ⇒ [2.3.5](https://github.com/ROCmSoftwarePlatform/hipSPARSE/releases/tag/rocm-5.5.0) |
| rccl | 2.13.4 ⇒ [2.15.5](https://github.com/ROCmSoftwarePlatform/rccl/releases/tag/rocm-5.5.0) |
| rocALUTION | 2.1.3 ⇒ [2.1.8](https://github.com/ROCmSoftwarePlatform/rocALUTION/releases/tag/rocm-5.5.0) |
| rocBLAS | 2.46.0 ⇒ [2.47.0](https://github.com/ROCmSoftwarePlatform/rocBLAS/releases/tag/rocm-5.5.0) |
| rocFFT | 1.0.21 ⇒ [1.0.22](https://github.com/ROCmSoftwarePlatform/rocFFT/releases/tag/rocm-5.5.0) |
| rocPRIM | 2.12.0 ⇒ [2.13.0](https://github.com/ROCmSoftwarePlatform/rocPRIM/releases/tag/rocm-5.5.0) |
| rocRAND | 2.10.16 ⇒ [2.10.17](https://github.com/ROCmSoftwarePlatform/rocRAND/releases/tag/rocm-5.5.0) |
| rocSOLVER | 3.20.0 ⇒ [3.21.0](https://github.com/ROCmSoftwarePlatform/rocSOLVER/releases/tag/rocm-5.5.0) |
| rocSPARSE | 2.4.0 ⇒ [2.5.1](https://github.com/ROCmSoftwarePlatform/rocSPARSE/releases/tag/rocm-5.5.0) |
| rocThrust | [2.17.0](https://github.com/ROCmSoftwarePlatform/rocThrust/releases/tag/rocm-5.5.0) |
| rocWMMA | 0.9 ⇒ [1.0](https://github.com/ROCmSoftwarePlatform/rocWMMA/releases/tag/rocm-5.5.0) |
| Tensile | 4.35.0 ⇒ [4.36.0](https://github.com/ROCmSoftwarePlatform/Tensile/releases/tag/rocm-5.5.0) |
#### hipBLAS 0.54.0
hipBLAS 0.54.0 for ROCm 5.5.0
##### Added
- added option to opt-in to use __half for hipblasHalf type in the API for c++ users who define HIPBLAS_USE_HIP_HALF
- added scripts to plot performance for multiple functions
- data driven hipblas-bench and hipblas-test execution via external yaml format data files
- client smoke test added for quick validation using command hipblas-test --yaml hipblas_smoke.yaml
##### Fixed
- fixed datatype conversion functions to support more rocBLAS/cuBLAS datatypes
- fixed geqrf to return successfully when nullptrs are passed in with n == 0 || m == 0
- fixed getrs to return successfully when given nullptrs with corresponding size = 0
- fixed getrs to give info = -1 when transpose is not an expected type
- fixed gels to return successfully when given nullptrs with corresponding size = 0
- fixed gels to give info = -1 when transpose is not in ('N', 'T') for real cases or not in ('N', 'C') for complex cases
##### Changed
- changed reference code for Windows to OpenBLAS
- hipblas client executables all now begin with hipblas- prefix
#### hipCUB 2.13.1
hipCUB 2.13.1 for ROCm 5.5.0
##### Added
- Benchmarks for `BlockShuffle`, `BlockLoad`, and `BlockStore`.
##### Changed
- CUB backend references CUB and Thrust version 1.17.2.
- Improved benchmark coverage of `BlockScan` by adding `ExclusiveScan`, benchmark coverage of `BlockRadixSort` by adding `SortBlockedToStriped`, and benchmark coverage of `WarpScan` by adding `Broadcast`.
##### Fixed
- Windows HIP SDK support
##### Known Issues
- `BlockRadixRankMatch` is currently broken under the rocPRIM backend.
- `BlockRadixRankMatch` with a warp size that does not exactly divide the block size is broken under the CUB backend.
#### hipFFT 1.0.11
hipFFT 1.0.11 for ROCm 5.5.0
##### Fixed
- Fixed old version rocm include/lib folders not removed on upgrade.
#### hipSOLVER 1.7.0
hipSOLVER 1.7.0 for ROCm 5.5.0
##### Added
- Added functions
- gesvdj
- hipsolverSgesvdj_bufferSize, hipsolverDgesvdj_bufferSize, hipsolverCgesvdj_bufferSize, hipsolverZgesvdj_bufferSize
- hipsolverSgesvdj, hipsolverDgesvdj, hipsolverCgesvdj, hipsolverZgesvdj
- gesvdjBatched
- hipsolverSgesvdjBatched_bufferSize, hipsolverDgesvdjBatched_bufferSize, hipsolverCgesvdjBatched_bufferSize, hipsolverZgesvdjBatched_bufferSize
- hipsolverSgesvdjBatched, hipsolverDgesvdjBatched, hipsolverCgesvdjBatched, hipsolverZgesvdjBatched
#### hipSPARSE 2.3.5
hipSPARSE 2.3.5 for ROCm 5.5.0
##### Improved
- Fixed an issue where the rocm folder was not removed on upgrade of meta packages
- Fixed a compilation issue with cusparse backend
- Added more detailed messages on unit test failures due to missing input data
- Improved documentation
- Fixed a bug with deprecation messages when using gcc9 (Thanks @Maetveis)
#### rccl 2.15.5
RCCL 2.15.5 for ROCm 5.5.0
##### Changed
- Compatibility with NCCL 2.15.5
- Unit test executable renamed to rccl-UnitTests
##### Added
- HW-topology aware binary tree implementation
- Experimental support for MSCCL
- New unit tests for hipGraph support
- NPKit integration
##### Fixed
- rocm-smi ID conversion
- Support for HIP_VISIBLE_DEVICES for unit tests
- Support for p2p transfers to non (HIP) visible devices
##### Removed
- Removed TransferBench from tools. Exists in standalone repo: https://github.com/ROCmSoftwarePlatform/TransferBench
#### rocALUTION 2.1.8
rocALUTION 2.1.8 for ROCm 5.5.0
##### Added
- Added build support for Navi32
##### Improved
- Fixed a typo in MPI backend
- Fixed a bug with the backend when HIP support is disabled
- Fixed a bug in SAAMG hierarchy building on HIP backend
- Improved SAAMG hierarchy build performance on HIP backend
##### Changed
- LocalVector::GetIndexValues(ValueType\*) is deprecated, use LocalVector::GetIndexValues(const LocalVector&, LocalVector\*) instead
- LocalVector::SetIndexValues(const ValueType\*) is deprecated, use LocalVector::SetIndexValues(const LocalVector&, const LocalVector&) instead
- LocalMatrix::RSDirectInterpolation(const LocalVector&, const LocalVector&, LocalMatrix\*, LocalMatrix\*) is deprecated, use LocalMatrix::RSDirectInterpolation(const LocalVector&, const LocalVector&, LocalMatrix\*) instead
- LocalMatrix::RSExtPIInterpolation(const LocalVector&, const LocalVector&, bool, float, LocalMatrix\*, LocalMatrix\*) is deprecated, use LocalMatrix::RSExtPIInterpolation(const LocalVector&, const LocalVector&, bool, LocalMatrix\*) instead
- LocalMatrix::RugeStueben() is deprecated
- LocalMatrix::AMGSmoothedAggregation(ValueType, const LocalVector&, const LocalVector&, LocalMatrix\*, LocalMatrix\*, int) is deprecated, use LocalMatrix::AMGAggregation(ValueType, const LocalVector&, const LocalVector&, LocalMatrix\*, int) instead
- LocalMatrix::AMGAggregation(const LocalVector&, LocalMatrix\*, LocalMatrix\*) is deprecated, use LocalMatrix::AMGAggregation(const LocalVector&, LocalMatrix\*) instead
#### rocBLAS 2.47.0
rocBLAS 2.47.0 for ROCm 5.5.0
##### Added
- added functionality rocblas_geam_ex for matrix-matrix minimum operations
- added HIP Graph support as beta feature for rocBLAS Level 1, Level 2, and Level 3(pointer mode host) functions
- added beta features API. Exposed using compiler define ROCBLAS_BETA_FEATURES_API
- added support for vector initialization in the rocBLAS test framework with negative increments
- added windows build documentation for forthcoming support using ROCm HIP SDK
- added scripts to plot performance for multiple functions
##### Optimizations
- improved performance of Level 2 rocBLAS GEMV for float and double precision. Performance enhanced by 150-200% for certain problem sizes when (m==n) measured on a gfx90a GPU.
- improved performance of Level 2 rocBLAS GER for float, double and complex float precisions. Performance enhanced by 5-7% for certain problem sizes measured on a gfx90a GPU.
- improved performance of Level 2 rocBLAS SYMV for float and double precisions. Performance enhanced by 120-150% for certain problem sizes measured on both gfx908 and gfx90a GPUs.
##### Fixed
- fixed setting of executable mode on client script rocblas_gentest.py to avoid potential permission errors with clients rocblas-test and rocblas-bench
- fixed deprecated API compatibility with Visual Studio compiler
- fixed test framework memory exception handling for Level 2 functions when the host memory allocation exceeds the available memory
##### Changed
- install.sh internally runs rmake.py (also used on windows) and rmake.py may be used directly by developers on linux (use --help)
- rocblas client executables all now begin with rocblas- prefix
##### Removed
- install.sh removed options -o --cov as now Tensile will use the default COV format, set by cmake define Tensile_CODE_OBJECT_VERSION=default
#### rocFFT 1.0.22
rocFFT 1.0.22 for ROCm 5.5.0
##### Optimizations
- Improved performance of 1D lengths < 2048 that use Bluestein's algorithm.
- Reduced time for generating code during plan creation.
- Optimized 3D R2C/C2R lengths 32, 84, 128.
- Optimized batched small 1D R2C/C2R cases.
##### Added
- Added gfx1101 to default AMDGPU_TARGETS.
##### Changed
- Moved client programs to C++17.
- Moved planar kernels and infrequently used Stockham kernels to be runtime-compiled.
- Moved transpose, real-complex, Bluestein, and Stockham kernels to library kernel cache.
##### Fixed
- Removed zero-length twiddle table allocations, which fixes errors from hipMallocManaged.
- Fixed incorrect freeing of HIP stream handles during twiddle computation when multiple devices are present.
#### rocPRIM 2.13.0
rocPRIM 2.13.0 for ROCm 5.5.0
##### Added
- New block level `radix_rank` primitive.
- New block level `radix_rank_match` primitive.
##### Changed
- Improved the performance of `block_radix_sort` and `device_radix_sort`.
##### Known Issues
- Disabled GPU error messages relating to incorrect warp operation usage with Navi GPUs on Windows, due to GPU printf performance issues on Windows.
##### Fixed
- Fixed benchmark build on Windows
#### rocRAND 2.10.17
rocRAND 2.10.17 for ROCm 5.5.0
##### Added
- MT19937 pseudo random number generator based on M. Matsumoto and T. Nishimura, 1998, Mersenne Twister: A 623-dimensionally equidistributed uniform pseudorandom number generator.
- New benchmark for the device API using Google Benchmark, `benchmark_rocrand_device_api`, replacing `benchmark_rocrand_kernel`. `benchmark_rocrand_kernel` is deprecated and will be removed in a future version. Likewise, `benchmark_curand_host_api` is added to replace `benchmark_curand_generate` and `benchmark_curand_device_api` is added to replace `benchmark_curand_kernel`.
- experimental HIP-CPU feature
- ThreeFry pseudorandom number generator based on Salmon et al., 2011, "Parallel random numbers: as easy as 1, 2, 3".
##### Changed
- Python 2.7 is no longer officially supported.
##### Fixed
- Windows HIP SDK support
#### rocSOLVER 3.21.0
rocSOLVER 3.21.0 for ROCm 5.5.0
##### Added
- SVD for general matrices using Jacobi algorithm:
- GESVDJ (with batched and strided\_batched versions)
- LU factorization without pivoting for block tridiagonal matrices:
- GEBLTTRF_NPVT (with batched and strided\_batched versions)
- Linear system solver without pivoting for block tridiagonal matrices:
- GEBLTTRS_NPVT (with batched and strided\_batched versions)
- Product of triangular matrices
- LAUUM
- Added experimental hipGraph support for rocSOLVER functions
##### Optimized
- Improved the performance of SYEVJ/HEEVJ.
##### Changed
- STEDC, SYEVD/HEEVD and SYGVD/HEGVD now use fully implemented Divide and Conquer approach.
##### Fixed
- SYEVJ/HEEVJ should now be invariant under matrix scaling.
- SYEVJ/HEEVJ should now properly output the eigenvalues when no sweeps are executed.
- Fixed GETF2\_NPVT and GETRF\_NPVT input data initialization in tests and benchmarks.
- Fixed rocblas missing from the dependency list of the rocsolver deb and rpm packages.
#### rocSPARSE 2.5.1
rocSPARSE 2.5.1 for ROCm 5.5.0
##### Added
- Added bsrgemm and spgemm for BSR format
- Added bsrgeam
- Added build support for Navi32
- Added experimental hipGraph support for some rocSPARSE routines
- Added csritsv, spitsv csr iterative triangular solve
- Added mixed precisions for SpMV
- Added batched SpMM for transpose A in COO format with atomic algorithm
##### Improved
- Optimization to csr2bsr
- Optimization to csr2csr_compress
- Optimization to csr2coo
- Optimization to gebsr2csr
- Optimization to csr2gebsr
- Fixes to documentation
- Fixes a bug in COO SpMV gridsize
- Fixes a bug in SpMM gridsize when using very large matrices
##### Known Issues
- In csritlu0, the algorithm rocsparse_itilu0_alg_sync_split_fusion has some accuracy issues to investigate with XNACK enabled. The fallback is rocsparse_itilu0_alg_sync_split.
#### rocWMMA 1.0
rocWMMA 1.0 for ROCm 5.5.0
##### Added
- Added support for wave32 on gfx11+
- Added infrastructure changes to support hipRTC
- Added performance tracking system
##### Changed
- Modified the assignment of hardware information
- Modified the data access for unsigned datatypes
- Added library config to support multiple architectures
#### Tensile 4.36.0
Tensile 4.36.0 for ROCm 5.5.0
##### Added
- Add functions for user-driven tuning
- Add GFX11 support: HostLibraryTests yamls, rearrange FP32(C)/FP64(C) instruction order, archCaps for instruction renaming condition, adjust vgpr bank for A/B/C for optimization, separate vscnt and vmcnt, dual mac
- Add binary search for Grid-Based algorithm
- Add reject condition for (StoreCInUnroll + BufferStore=0) and (DirectToVgpr + ScheduleIterAlg<3 + PrefetchGlobalRead==2)
- Add support for (DirectToLds + hgemm + NN/NT/TT) and (DirectToLds + hgemm + GlobalLoadVectorWidth < 4)
- Add support for (DirectToLds + hgemm(TLU=True only) or sgemm + NumLoadsCoalesced > 1)
- Add GSU SingleBuffer algorithm for HSS/BSS
- Add gfx900:xnack-, gfx1032, gfx1034, gfx1035
- Enable gfx1031 support
##### Optimizations
- Use AssertSizeLessThan for BufferStoreOffsetLimitCheck if it is smaller than MT1
- Improve InitAccVgprOpt
##### Changed
- Use global_atomic for GSU instead of flat and global_store for debug code
- Replace flat_load/store with global_load/store
- Use global_load/store for BufferLoad/Store=0 and enable scheduling
- LocalSplitU support for HGEMM+HPA when MFMA disabled
- Update Code Object Version
- Type cast local memory to COMPUTE_DATA_TYPE in LDS to avoid precision loss
- Update asm cap cache arguments
- Unify SplitGlobalRead into ThreadSeparateGlobalRead and remove SplitGlobalRead
- Change checks, error messages, assembly syntax, and coverage for DirectToLds
- Remove unused cmake file
- Clean up the LLVM dependency code
- Update ThreadSeparateGlobalRead test cases for PrefetchGlobalRead=2
- Update sgemm/hgemm test cases for DirectToLds and ThreadSeparateGlobalRead
##### Fixed
- Add build-id to header of compiled source kernels
- Fix solution index collisions
- Fix h beta vectorwidth4 correctness issue for WMMA
- Fix an error with BufferStore=0
- Fix mismatch issue with (StoreCInUnroll + PrefetchGlobalRead=2)
- Fix MoveMIoutToArch bug
- Fix flat load correctness issue on I8 and flat store correctness issue
- Fix mismatch issue with BufferLoad=0 + TailLoop for large array sizes
- Fix code generation error with BufferStore=0 and StoreCInUnrollPostLoop
- Fix issues with DirectToVgpr + ScheduleIterAlg<3
- Fix mismatch issue with DGEMM TT + LocalReadVectorWidth=2
- Fix mismatch issue with PrefetchGlobalRead=2
- Fix mismatch issue with DirectToVgpr + PrefetchGlobalRead=2 + small tile size
- Fix an error with PersistentKernel=0 + PrefetchAcrossPersistent=1 + PrefetchAcrossPersistentMode=1
- Fix mismatch issue with DirectToVgpr + DirectToLds + only 1 iteration in unroll loop case
- Remove duplicate GSU kernels: for GSU = 1, GSUAlgorithm SingleBuffer and MultipleBuffer kernels are identical
- Fix for failing CI tests due to CpuThreads=0
- Fix mismatch issue with DirectToLds + PrefetchGlobalRead=2
- Remove the reject condition for ThreadSeparateGlobalRead and DirectToLds (HGEMM, SGEMM only)
- Modify reject condition for minimum lanes of ThreadSeparateGlobalRead (SGEMM or larger data type only)
| hipBLAS | [0.50.0](https://github.com/ROCmSoftwarePlatform/hipBLAS/releases/tag/rocm-5.1.3) |
| hipCUB | [2.11.0](https://github.com/ROCmSoftwarePlatform/hipCUB/releases/tag/rocm-5.1.3) |
| hipFFT | [1.0.7](https://github.com/ROCmSoftwarePlatform/hipFFT/releases/tag/rocm-5.1.3) |
| hipSOLVER | [1.3.0](https://github.com/ROCmSoftwarePlatform/hipSOLVER/releases/tag/rocm-5.1.3) |
| hipSPARSE | [2.1.0](https://github.com/ROCmSoftwarePlatform/hipSPARSE/releases/tag/rocm-5.1.3) |
| rccl | [2.11.4](https://github.com/ROCmSoftwarePlatform/rccl/releases/tag/rocm-5.1.3) |
| rocALUTION | [2.0.2](https://github.com/ROCmSoftwarePlatform/rocALUTION/releases/tag/rocm-5.1.3) |
| rocBLAS | [2.43.0](https://github.com/ROCmSoftwarePlatform/rocBLAS/releases/tag/rocm-5.1.3) |
| rocFFT | [1.0.16](https://github.com/ROCmSoftwarePlatform/rocFFT/releases/tag/rocm-5.1.3) |
| rocPRIM | [2.10.13](https://github.com/ROCmSoftwarePlatform/rocPRIM/releases/tag/rocm-5.1.3) |
| rocRAND | [2.10.13](https://github.com/ROCmSoftwarePlatform/rocRAND/releases/tag/rocm-5.1.3) |
| rocSOLVER | [3.17.0](https://github.com/ROCmSoftwarePlatform/rocSOLVER/releases/tag/rocm-5.1.3) |
| rocSPARSE | [2.1.0](https://github.com/ROCmSoftwarePlatform/rocSPARSE/releases/tag/rocm-5.1.3) |
| rocThrust | [2.14.0](https://github.com/ROCmSoftwarePlatform/rocThrust/releases/tag/rocm-5.1.3) |
| Tensile | [4.32.0](https://github.com/ROCmSoftwarePlatform/Tensile/releases/tag/rocm-5.1.3) |

@@ -12,7 +12,7 @@ fetch="https://github.com/GPUOpen-ProfessionalCompute-Libraries/" />
fetch="https://github.com/GPUOpen-Tools/" />
<remote name="KhronosGroup"
fetch="https://github.com/KhronosGroup/" />
<default revision="refs/tags/rocm-5.5.0"
<default revision="refs/tags/rocm-5.5.1"
remote="roc-github"
sync-c="true"
sync-j="4" />

@@ -5,40 +5,21 @@
# https://www.sphinx-doc.org/en/master/usage/configuration.html
import shutil
shutil.copy2('../CONTRIBUTING.md','./contributing.md')
shutil.copy2('../RELEASE.md','./release.md')
from rocm_docs import ROCmDocs
# working anchors that linkcheck cannot find
linkcheck_anchors_ignore = [
'd90e61',
'd1667e113',
'd2999e60',
'building-from-source',
'use-the-rocm-build-tool-rbuild',
'use-cmake-to-build-migraphx',
'example'
]
linkcheck_ignore = [
# site to be built
"https://rocmdocs.amd.com/projects/ROCmCC/en/latest/",
"https://rocmdocs.amd.com/projects/amdsmi/en/latest/",
"https://rocmdocs.amd.com/projects/rdc/en/latest/",
"https://rocmdocs.amd.com/projects/rocmsmi/en/latest/",
"https://rocmdocs.amd.com/projects/roctracer/en/latest/",
"https://rocmdocs.amd.com/projects/MIGraphX/en/latest/",
"https://rocmdocs.amd.com/projects/rocprofiler/en/latest/",
# correct links that linkcheck times out on
"https://github.com/ROCm-Developer-Tools/HIP-VS/blob/master/README.md",
r"https://www.amd.com/system/files/.*.pdf",
"https://www.amd.com/en/developer/aocc.html",
"https://www.amd.com/en/support/linux-drivers",
"https://www.amd.com/en/technologies/infinity-hub",
r"https://bitbucket.org/icl/magma/*",
"http://cs231n.stanford.edu/"
]
shutil.copy2('../CONTRIBUTING.md','./contributing.md')
shutil.copy2('../RELEASE.md','./release.md')
# Keep capitalization due to similar linking on GitHub's markdown preview.
shutil.copy2('../CHANGELOG.md','./CHANGELOG.md')
# configurations for PDF output by Read the Docs
project = "ROCm Documentation"
author = "Advanced Micro Devices, Inc."
copyright = "Copyright (c) 2023 Advanced Micro Devices, Inc. All rights reserved."
version = "5.1.3"
release = "5.1.3"
setting_all_article_info = True
all_article_info_os = ["linux"]
@@ -73,7 +54,7 @@ article_pages = [
{"file":"how_to/system_debugging", "os":["linux"]},
{"file":"how_to/tensorflow_install/tensorflow_install", "os":["linux"]},
{"file":"examples/ai_ml_inferencing", "os":["linux"]},
{"file":"examples/machine_learning", "os":["linux"]},
{"file":"examples/inception_casestudy/inception_casestudy", "os":["linux"]},
{"file":"understand/file_reorg", "os":["linux"]},
@@ -83,8 +64,13 @@ article_pages = [
external_toc_path = "./sphinx/_toc.yml"
docs_core = ROCmDocs("ROCm Documentation")
docs_core = ROCmDocs("ROCm 5.1.3 Documentation Home")
docs_core.setup()
external_projects_current_project = "rocm"
for sphinx_var in ROCmDocs.SPHINX_VARS:
globals()[sphinx_var] = getattr(docs_core, sphinx_var)
html_theme_options = {
"link_main_doc": False
}

@@ -1,44 +0,0 @@
# Deploy
Please follow the guides below to begin your ROCm journey. ROCm can be consumed
via many mechanisms.
:::::{grid} 1 1 3 3
:gutter: 1
::::{grid-item-card}
:padding: 2
Quick Start
^^^
- [Linux](quick_start)
- [Windows](hip_sdk_install_win/hip_sdk_install_win)
::::
::::{grid-item-card}
:padding: 2
Docker
^^^
- [Guide](deploy/docker)
- [Dockerhub](https://hub.docker.com/u/rocm/)
::::
::::{grid-item-card}
:padding: 2
[Advanced](deploy/advanced)
^^^
- [Uninstall](deploy/advanced/uninstall)
- [Multi-ROCm Installations](deploy/advanced/multi)
- [spack](deploy/advanced/spack)
- [Build from Source](deploy/advanced/build_source)
::::
:::::
## Related Information
[Release Information](release)

@@ -4,9 +4,9 @@
Docker containers share the kernel with the host operating system, therefore the
ROCm kernel-mode driver must be installed on the host. Please refer to
[](/deploy/linux/install) for details. The other user-space parts
(like the HIP-runtime or math libraries) of the ROCm stack will be loaded from
the container image and don't need to be installed to the host.
{ref}`using-the-package-manager` on installing `amdgpu-dkms`. The other
user-space parts (like the HIP-runtime or math libraries) of the ROCm stack will
be loaded from the container image and don't need to be installed to the host.
(docker-access-gpus-in-container)=

@@ -1,17 +1,13 @@
# Deploy ROCm on Linux
Please start with the [Quick Start Linux](quick_start) or follow the detailed instructions below.
Start with {doc}`/deploy/linux/quick_start` or follow the detailed
instructions below.
::::{grid} 2 3 3 3
## Prepare to Install
::::{grid} 1 1 2 2
:gutter: 1
:::{grid-item-card} Overview
:link: install
:link-type: doc
Overview and comparison of the different ways to install ROCm.
:::
:::{grid-item-card} Prerequisites
:link: prerequisites
:link-type: doc
@@ -19,37 +15,39 @@ Overview and comparison of the different ways to install ROCm.
The prerequisites page lists the required steps *before* installation.
:::
:::{grid-item-card} Installation
:link: install
:::{grid-item-card} Install Choices
:link: install_overview
:link-type: doc
Detailed steps to install with the package manager or with the installation
script, including multi-version installation. Recommended for most users.
Package manager vs AMDGPU Installer
Standard Packages vs Multi-Version Packages
:::
:::{grid-item-card} Upgrading
:link: upgrade
::::
## Choose your install method
::::{grid} 1 1 2 2
:gutter: 1
:::{grid-item-card} Package Manager
:link: os-native/index
:link-type: doc
Instructions for upgrading an existing ROCm installation.
Directly use your distribution's package manager to install ROCm.
:::
:::{grid-item-card} Uninstallation
:link: uninstall
:::{grid-item-card} AMDGPU Installer
:link: installer/index
:link-type: doc
Steps for removing ROCm packages, libraries, and tools.
:::
:::{grid-item-card} Package Manager Integration
:link: package_manager_integration
:link-type: doc
Information about (meta-)packages in the ROCm ecosystem.
Use an installer tool that orchestrates changes via the package
manager.
:::
::::
## See Also
- [GPU and OS Support Linux](../../gpu_os_support.md)
- {doc}`/release/gpu_os_support`

@@ -1,956 +0,0 @@
# Installation (Linux)
Installing can be done in one of two ways, depending on your preference:
- Using an installer script
- Through your system's package manager
```{attention}
For information on installing ROCm on devices with NVIDIA GPUs, refer to the HIP
Installation Guide.
```
(install-script-method)=
## Installer Script Method
The installer script method automates the installation process for the AMDGPU
and ROCm stack. The installer script handles the complete installation process
for ROCm, including setting up the repository, cleaning the system, updating,
and installing the desired drivers and meta-packages. With this approach, the
system has more control over the ROCm installation process. Thus, those who are
less familiar with the Linux standard commands can choose this method for ROCm
installation.
For AMDGPU and ROCm installation using the installer script method on Linux
distribution, follow these steps:
1. **Meet prerequisites** Ensure the Prerequisites are met before downloading
and installing the installer using the installer script method.
2. **Download and install the installer script** Ensure you download and
install the installer script from the recommended URL.
```{tip}
The installer package is updated periodically to resolve known issues and add
new features. The links for each Linux distribution always point to the latest
available build.
```
3. **Use the installer script on Linux distributions** Ensure you execute the
script for installing use cases.
### Download and Install the Installer Script
::::::{tab-set}
:::::{tab-item} Ubuntu
:sync: ubuntu
<!-- markdownlint-disable-next-line MD013 -->
::::{rubric} To download the amdgpu-install script on the system, use the following commands.
::::
::::{tab-set}
:::{tab-item} Ubuntu 20.04
:sync: ubuntu-20.04
```shell
sudo apt update
wget https://repo.radeon.com/amdgpu-install/5.4.3/ubuntu/focal/amdgpu-install_5.4.50403-1_all.deb
sudo apt install ./amdgpu-install_5.4.50403-1_all.deb
```
:::
:::{tab-item} Ubuntu 22.04
:sync: ubuntu-22.04
```shell
sudo apt update
wget https://repo.radeon.com/amdgpu-install/5.4.3/ubuntu/jammy/amdgpu-install_5.4.50403-1_all.deb
sudo apt install ./amdgpu-install_5.4.50403-1_all.deb
```
:::
::::
:::::
:::::{tab-item} Red Hat Enterprise Linux
:sync: RHEL
<!-- markdownlint-disable-next-line MD013 -->
::::{rubric} To download the amdgpu-install script on the system, use the following commands.
::::
::::{tab-set}
:::{tab-item} RHEL 8.6
:sync: RHEL-8.6
```shell
sudo yum install https://repo.radeon.com/amdgpu-install/5.4.3/rhel/8.6/amdgpu-install-5.4.50403-1.el8.noarch.rpm
```
:::
:::{tab-item} RHEL 8.7
:sync: RHEL-8.7
```shell
sudo yum install https://repo.radeon.com/amdgpu-install/5.4.3/rhel/8.7/amdgpu-install-5.4.50403-1.el8.noarch.rpm
```
:::
:::{tab-item} RHEL 9.1
:sync: RHEL-9.1
```shell
sudo yum install https://repo.radeon.com/amdgpu-install/5.4.3/rhel/9.1/amdgpu-install-5.4.50403-1.el9.noarch.rpm
```
:::
::::
:::::
:::::{tab-item} SUSE Linux Enterprise Server 15
:sync: SLES15
<!-- markdownlint-disable-next-line MD013 -->
::::{rubric} To download the amdgpu-install script on the system, use the following commands.
::::
::::{tab-set}
:::{tab-item} Service Pack 4
:sync: SLES15-SP4
```shell
sudo zypper --no-gpg-checks install https://repo.radeon.com/amdgpu-install/5.4.3/sle/15.4/amdgpu-install-5.4.50403-1.noarch.rpm
```
:::
::::
:::::
::::::
### Using the Installer Script for Single-version ROCm Installation
To install use cases specific to your requirements, use the installer
`amdgpu-install` as follows:
- To install a single use case:
```shell
sudo amdgpu-install --usecase=rocm
```
- To install kernel-mode driver:
```shell
sudo amdgpu-install --usecase=dkms
```
- To install multiple use cases:
```shell
sudo amdgpu-install --usecase=hiplibsdk,rocm
```
- To display a list of available use cases:
```shell
sudo amdgpu-install --list-usecase
```
Following is a sample of output listed by the command above:
```{note}
The list in this section represents only a sample of available use cases for ROCm:
```
```none
If --usecase option is not present, the default selection is "graphics,opencl,hip"
Available use cases:
rocm(for users and developers requiring full ROCm stack)
- OpenCL (ROCr/KFD based) runtime
- HIP runtimes
- Machine learning framework
- All ROCm libraries and applications
- ROCm Compiler and device libraries
- ROCr runtime and thunk
lrt(for users of applications requiring ROCm runtime)
- ROCm Compiler and device libraries
- ROCr runtime and thunk
opencl(for users of applications requiring OpenCL on Vega or
later products)
- ROCr based OpenCL
- ROCm Language runtime
openclsdk (for application developers requiring ROCr based OpenCL)
- ROCr based OpenCL
- ROCm Language runtime
- development and SDK files for ROCr based OpenCL
hip(for users of HIP runtime on AMD products)
- HIP runtimes
hiplibsdk (for application developers requiring HIP on AMD products)
- HIP runtimes
- ROCm math libraries
- HIP development libraries
```
```{tip}
Adding `-y` as a parameter to `amdgpu-install` skips user prompts (for
automation). Example: `amdgpu-install -y --usecase=rocm`
```
### Using Installer Script in Docker
When the installation is initiated in Docker, the installer tries to install the
use case along with the kernel-mode driver. However, you cannot install the
kernel-mode driver in a Docker container. To skip the installation of the
kernel-mode driver, proceed with the `--no-dkms` option, as shown below:
```shell
sudo amdgpu-install --usecase=rocm --no-dkms
```
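The options shown so far (`--usecase`, `-y`, and `--no-dkms`) combine freely, which makes the invocation easy to assemble in provisioning scripts. The following is a minimal sketch under that assumption; `build_install_command` is a hypothetical helper, not part of `amdgpu-install`, and it only builds the argument list without running anything.

```python
import shlex


def build_install_command(usecases, assume_yes=False, no_dkms=False):
    """Assemble an amdgpu-install invocation as an argument list.

    usecases   -- use case names, for example ["hiplibsdk", "rocm"]
    assume_yes -- add -y to skip user prompts (for automation)
    no_dkms    -- add --no-dkms to skip the kernel-mode driver (for Docker)
    """
    if not usecases:
        raise ValueError("at least one use case is required")
    cmd = ["sudo", "amdgpu-install"]
    if assume_yes:
        cmd.append("-y")
    # Multiple use cases are passed as one comma-separated value.
    cmd.append("--usecase=" + ",".join(usecases))
    if no_dkms:
        cmd.append("--no-dkms")
    return cmd


# Print the command line for a non-interactive install inside a container.
print(shlex.join(build_install_command(["rocm"], assume_yes=True, no_dkms=True)))
```

Returning a list rather than a string keeps the result usable with `subprocess.run` without any shell quoting concerns.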
### Using the Installer Script for Multi-version ROCm Installation
The multi-version ROCm installation requires you to download and install the
latest ROCm release installer from the list of ROCm releases you want to install
simultaneously on your system.
**Example:** If you want to install ROCm releases 4.5.0, 4.5.1, and 5.4.3
simultaneously, you are required to download the installer from the latest ROCm
release v5.4.3.
To download and install the installer, refer to the [Download and Install the
Installer Script](#download-and-install-the-installer-script) section.
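Picking the latest release from a list is a numeric comparison of version components, not a string comparison. A minimal sketch, assuming plain dotted release numbers such as those in the example above (the helper name `latest_release` is hypothetical):

```python
def latest_release(releases):
    """Return the highest release, comparing each dotted component as a number."""
    return max(releases, key=lambda rel: tuple(int(part) for part in rel.split(".")))


# The installer should be downloaded from this release.
print(latest_release(["4.5.0", "4.5.1", "5.4.3"]))
```

Comparing integers avoids the string-ordering pitfall in which a release such as `5.10.0` would sort before `5.4.3`.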
```{attention}
If the existing ROCm release contains non-versioned ROCm packages, uninstall
those packages before proceeding with the multi-version installation to avoid
conflicts.
```
#### Add Required ROCm Repositories
Add the required repositories using the following steps:
```{important}
Add the AMDGPU and ROCm repositories manually for all ROCm releases you want to
install except the latest one. The amdgpu-install script automatically adds the
required repositories for the latest release.
```
::::::{tab-set}
:::::{tab-item} Ubuntu
:sync: ubuntu
::::{tab-set}
:::{tab-item} Ubuntu 20.04
:sync: ubuntu-20.04
```shell
for ver in 5.0.2 5.1.4 5.2.5 5.3.3; do
# Append so that each release keeps its own entry instead of overwriting the file
echo "deb [arch=amd64 signed-by=/etc/apt/trusted.gpg.d/rocm-keyring.gpg] https://repo.radeon.com/rocm/apt/$ver focal main" | sudo tee --append /etc/apt/sources.list.d/rocm.list
done
echo -e 'Package: *\nPin: release o=repo.radeon.com\nPin-Priority: 600' | sudo tee /etc/apt/preferences.d/rocm-pin-600
sudo apt update
```
:::
:::{tab-item} Ubuntu 22.04
:sync: ubuntu-22.04
```shell
for ver in 5.0.2 5.1.4 5.2.5 5.3.3; do
# Append so that each release keeps its own entry instead of overwriting the file
echo "deb [arch=amd64 signed-by=/etc/apt/trusted.gpg.d/rocm-keyring.gpg] https://repo.radeon.com/rocm/apt/$ver jammy main" | sudo tee --append /etc/apt/sources.list.d/rocm.list
done
echo -e 'Package: *\nPin: release o=repo.radeon.com\nPin-Priority: 600' | sudo tee /etc/apt/preferences.d/rocm-pin-600
sudo apt update
```
:::
::::
:::::
:::::{tab-item} Red Hat Enterprise Linux
:sync: RHEL
```shell
for ver in 5.0.2 5.1.4 5.2.5 5.3.3; do
sudo tee --append /etc/yum.repos.d/rocm.repo <<EOF
[ROCm-$ver]
Name=ROCm$ver
baseurl=https://repo.radeon.com/rocm/$ver/main
enabled=1
priority=50
gpgcheck=1
gpgkey=https://repo.radeon.com/rocm/rocm.gpg.key
EOF
done
sudo yum clean all
```
:::::
:::::{tab-item} SUSE Linux Enterprise Server 15
:sync: SLES15
```shell
for ver in 5.0.2 5.1.4 5.2.5 5.3.3; do
sudo tee --append /etc/zypp/repos.d/rocm.repo <<EOF
[ROCm-$ver]
name=rocm
baseurl=https://repo.radeon.com/amdgpu/$ver/sle/15.4/main/x86_64
enabled=1
gpgcheck=1
gpgkey=https://repo.radeon.com/rocm/rocm.gpg.key
EOF
done
sudo zypper ref
```
:::::
::::::
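Every additional release needs its own repository entry, so it can help to render all entries first and review them before writing any file. The following is a minimal sketch for the Ubuntu case only; `apt_source_lines` is a hypothetical helper, and the template mirrors the `deb` line used in the Ubuntu tabs above.

```python
# Template copied from the Ubuntu instructions; {ver} and {codename} vary.
APT_TEMPLATE = (
    "deb [arch=amd64 signed-by=/etc/apt/trusted.gpg.d/rocm-keyring.gpg] "
    "https://repo.radeon.com/rocm/apt/{ver} {codename} main"
)


def apt_source_lines(versions, codename):
    """Return one apt source line per ROCm release, in the given order."""
    return [APT_TEMPLATE.format(ver=ver, codename=codename) for ver in versions]


# Render the entries for the releases used in the example loop.
for line in apt_source_lines(["5.0.2", "5.1.4", "5.2.5", "5.3.3"], "focal"):
    print(line)
```

The output contains one line per release; the resulting file is expected to hold all of them, not only the last one.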
#### Use the Installer to Install Multi-version ROCm Meta-packages
Use the installer script as given below:
```none
sudo amdgpu-install --usecase=rocm --rocmrelease=<release-number-1>
sudo amdgpu-install --usecase=rocm --rocmrelease=<release-number-2>
sudo amdgpu-install --usecase=rocm --rocmrelease=<release-number-3>
```
```{tip}
If the kernel-mode driver is already present on the system and you do not want
to upgrade it, use the `--no-dkms` option to skip the installation of the
kernel-mode driver, as shown in the following samples:
```
```none
sudo amdgpu-install --usecase=rocm --rocmrelease=4.5.0 --no-dkms
sudo amdgpu-install --usecase=rocm --rocmrelease=5.4.3 --no-dkms
```
The following example shows a ROCm multi-version installation. The kernel-mode
driver associated with ROCm release v5.4.3 is installed because it is the latest
release in the list.
```none
sudo amdgpu-install --usecase=rocm --rocmrelease=4.5.0
sudo amdgpu-install --usecase=rocm --rocmrelease=4.5.2
sudo amdgpu-install --usecase=rocm --rocmrelease=5.4.3
```
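The release list and the `--no-dkms` option can be combined so that only one release installs the kernel-mode driver. The sketch below prints the commands instead of running them. The helper name `plan_multiversion_install` is hypothetical, and the policy (only the newest release installs the driver) is an assumption for illustration, not a requirement of the installer.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print (not run) the amdgpu-install commands for a
# multi-version install. Assumed policy: only the newest release installs
# the kernel-mode driver; every other release gets --no-dkms.
plan_multiversion_install() {
    local latest ver
    # sort -V (GNU coreutils) orders version strings numerically.
    latest="$(printf '%s\n' "$@" | sort -V | tail -n 1)"
    for ver in "$@"; do
        if [ "$ver" = "$latest" ]; then
            echo "sudo amdgpu-install --usecase=rocm --rocmrelease=$ver"
        else
            echo "sudo amdgpu-install --usecase=rocm --rocmrelease=$ver --no-dkms"
        fi
    done
}
```

For example, `plan_multiversion_install 4.5.0 4.5.2 5.4.3` lists three commands, of which only the one for 5.4.3 omits `--no-dkms`.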
## Package Manager Method
The package manager method involves setting up the repository manually, updating
the package lists, and installing or uninstalling meta-packages. It uses the
standard commands of the Linux distribution's package manager, such as `apt`,
`yum`, or `zypper`.
The functions of a package manager installation system are:
- Grouping packages based on function
- Extracting package archives
- Ensuring a package is installed together with all of its necessary
dependencies
- From a remote repository, looking up, downloading, installing, or updating
existing packages
- Ensuring the authenticity and integrity of the package
### Installing ROCm on Linux Distributions
For a fresh ROCm installation using the package manager method on a Linux
distribution, follow the steps below:
1. **Meet prerequisites:** Ensure the prerequisites are met before the ROCm
installation.
2. **Install kernel headers and development packages:** Ensure kernel headers
and development packages are installed on the system.
3. **Select the base URLs for AMDGPU and ROCm stack repository:** Ensure the
base URLs for AMDGPU and ROCm stack repositories are selected.
4. **Add the AMDGPU stack repository:** Ensure the AMDGPU stack repository is
added.
5. **Install the kernel-mode driver and reboot the system:** Ensure the
kernel-mode driver is installed and the system is rebooted.
6. **Add ROCm stack repository:** Ensure the ROCm stack repository is added.
7. **Install single-version or multi-version ROCm meta-packages:** Install the
desired meta-packages.
8. **Verify installation for the applicable distributions:** Verify that the
installation is successful.
```{important}
You cannot install a kernel-mode driver in a Docker container. Refer to the
sections below for specific commands to install the AMDGPU and ROCm stack on
various Linux distributions.
```
#### Understanding the Release-specific AMDGPU and ROCm Stack Repositories on Linux Distributions
Each release-specific repository contains the packages of one specific release of
the AMDGPU stack and ROCm stack. It is not updated with packages from subsequent
releases. When a new ROCm release is available, a new repository, specific to
that release, is added. You can select a specific
release to install, update the previously installed single version to a later
release, or add the latest version of ROCm alongside the currently
installed version by using the multi-version ROCm packages.
```{note}
Users installing multiple versions of the ROCm stack must use the
release-specific base URL.
```
#### Using the Package Manager
::::::{tab-set}
:::::{tab-item} Ubuntu
:sync: ubuntu
::::{rubric} Installation of Kernel Headers and Development Packages
::::
The following instructions for installing kernel headers and development packages
apply to all versions and kernels of Ubuntu. The ROCm installation requires the
`linux-headers` and `linux-modules-extra` packages in the version that matches
the running kernel.
**Example:** If the system is running the Linux kernel version
`5.15.0-41-generic`, you must install the identical versions of the
`linux-headers` and `linux-modules-extra` packages. Refer to
{ref}`check-kernel-info` for how to check the system's kernel version.
To check the `linux-headers` and `linux-modules-extra` package versions,
follow these steps:
1. For the Ubuntu/Debian environment, execute the following command to verify
that the kernel headers are installed in the version matching the running
kernel:
```shell
sudo dpkg -l | grep linux-headers
```
The command indicates if there are Linux headers installed as shown below:
```none
ii linux-headers-5.15.0-41-generic 5.15.0-41.44~20.04.1 amd64 Linux kernel headers for version 5.15.0 on 64 bit x86 SMP
```
2. Execute the following command to check whether the development packages are
installed:
```shell
sudo dpkg -l | grep linux-modules-extra
```
The command mentioned above lists the installed `linux-modules-extra`
packages like the output below:
```none
ii linux-modules-extra-5.15.0-41-generic 5.15.0-41.44~20.04.1 amd64 Linux kernel extra modules for version 5.15.0 on 64 bit x86 SMP
```
3. If the matching versions of the Linux headers and development
packages are not installed on the system, execute the following command
to install them:
```shell
sudo apt install "linux-headers-$(uname -r)" "linux-modules-extra-$(uname -r)"
```
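The three checks above can be combined into one pass. This is a minimal sketch, assuming a Debian-based system with `dpkg-query` available; the helper names `kernel_packages_for` and `missing_kernel_packages` are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the Ubuntu package names that must match a
# given kernel release string (as reported by `uname -r`).
kernel_packages_for() {
    local release="$1"
    printf 'linux-headers-%s\nlinux-modules-extra-%s\n' "$release" "$release"
}

# Hypothetical helper: print those packages that dpkg does not report as
# installed for the running kernel.
missing_kernel_packages() {
    local pkg
    for pkg in $(kernel_packages_for "$(uname -r)"); do
        dpkg-query -W -f='${Status}' "$pkg" 2>/dev/null \
            | grep -q 'install ok installed' || echo "$pkg"
    done
}
```

Running `missing_kernel_packages` prints nothing when both packages are present; any names it prints can be passed to `sudo apt install`.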
::::{rubric} Adding the AMDGPU and ROCm Stack Repositories
::::
1. Add GPG Key for AMDGPU and ROCm Stack
Add the GPG key for AMDGPU and ROCm repositories. For Debian-based systems
like Ubuntu, configure the Debian ROCm repository as follows:
```shell
curl -fsSL https://repo.radeon.com/rocm/rocm.gpg.key | sudo gpg --dearmor -o /etc/apt/trusted.gpg.d/rocm-keyring.gpg
```
```{note}
The GPG key may change; ensure it is updated when installing a new release. If
the key signature verification fails while updating, re-add the key from the
ROCm apt repository as described above. The current `rocm.gpg.key` is not
available in a standard key ring distribution but has the following SHA1
checksum: `73f5d8100de6048aa38a8b84cd9a87f05177d208 rocm.gpg.key`
```
2. Add the AMDGPU Stack Repository and Install the Kernel-mode Driver
```{attention}
If you have a version of the kernel-mode driver installed, you may skip this
section.
```
To add the AMDGPU stack repository, follow these steps:
::::{tab-set}
:::{tab-item} Ubuntu 20.04
:sync: ubuntu-20.04
```shell
echo 'deb [arch=amd64 signed-by=/etc/apt/trusted.gpg.d/rocm-keyring.gpg] https://repo.radeon.com/amdgpu/5.4.3/ubuntu focal main' | sudo tee /etc/apt/sources.list.d/amdgpu.list
sudo apt update
```
:::
:::{tab-item} Ubuntu 22.04
:sync: ubuntu-22.04
```shell
echo 'deb [arch=amd64 signed-by=/etc/apt/trusted.gpg.d/rocm-keyring.gpg] https://repo.radeon.com/amdgpu/5.4.3/ubuntu jammy main' | sudo tee /etc/apt/sources.list.d/amdgpu.list
sudo apt update
```
:::
::::
Install the kernel-mode driver and reboot the system using the following
commands:
```shell
sudo apt install amdgpu-dkms
sudo reboot
```
3. Add the ROCm Stack Repository and Install Meta-packages
To add the ROCm repository, use the following steps:
::::{tab-set}
:::{tab-item} Ubuntu 20.04
:sync: ubuntu-20.04
```shell
for ver in 5.0.2 5.1.4 5.2.5 5.3.3 5.4.3; do
echo "deb [arch=amd64 signed-by=/etc/apt/trusted.gpg.d/rocm-keyring.gpg] https://repo.radeon.com/rocm/apt/$ver focal main" | sudo tee --append /etc/apt/sources.list.d/rocm.list
done
echo -e 'Package: *\nPin: release o=repo.radeon.com\nPin-Priority: 600' | sudo tee /etc/apt/preferences.d/rocm-pin-600
sudo apt update
```
:::
:::{tab-item} Ubuntu 22.04
:sync: ubuntu-22.04
```shell
for ver in 5.0.2 5.1.4 5.2.5 5.3.3 5.4.3; do
echo "deb [arch=amd64 signed-by=/etc/apt/trusted.gpg.d/rocm-keyring.gpg] https://repo.radeon.com/rocm/apt/$ver jammy main" | sudo tee --append /etc/apt/sources.list.d/rocm.list
done
echo -e 'Package: *\nPin: release o=repo.radeon.com\nPin-Priority: 600' | sudo tee /etc/apt/preferences.d/rocm-pin-600
sudo apt update
```
:::
::::
Install the packages of your choice as either a single-version or a
multi-version ROCm installation. For more information on single-version and
multi-version installations, refer to {ref}`installation-types`.
For a comprehensive list of meta-packages, refer to
{ref}`meta-package-desc`.
- Sample Single-version installation
```shell
sudo apt install rocm-hip-sdk
```
- Sample Multi-version installation
```{important}
If the existing ROCm release contains non-versioned ROCm packages, you must
uninstall those packages before proceeding to the multi-version installation
to avoid conflicts.
```
```shell
sudo apt install rocm-hip-sdk5.4.3 rocm-hip-sdk5.2.5
```
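Before a multi-version installation, the installed package list can be scanned for non-versioned ROCm packages. The filter below is a sketch under stated assumptions: `list_unversioned_rocm` is a hypothetical name, it only considers package names that start with `rocm`, and it treats a trailing `<major>.<minor>.<patch>` suffix (as in `rocm-hip-sdk5.4.3`) as the marker of a versioned package.

```shell
#!/usr/bin/env bash
# Hypothetical filter: read package names on stdin (one per line) and print
# the ROCm packages that carry no release suffix such as "5.4.3".
# Assumption: only names starting with "rocm" are considered.
list_unversioned_rocm() {
    grep -E '^rocm' | grep -Ev '[0-9]+\.[0-9]+\.[0-9]+$'
}
```

On Ubuntu, one way to feed it is `dpkg-query -W -f='${Package}\n' | list_unversioned_rocm`; an empty result suggests no non-versioned ROCm meta-packages are installed.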
:::::
:::::{tab-item} Red Hat Enterprise Linux
:sync: RHEL
::::{rubric} Installation of Kernel Headers and Development Packages
::::
The ROCm installation requires that you install the `kernel-headers` and
`kernel-devel` packages in the version corresponding to the
kernel's version.
**Example:** If the system is running Linux kernel version
`3.10.0-1160.el7.x86_64`, you must install the identical versions of kernel
headers and development packages. Refer to {ref}`check-kernel-info` for how to
check the system's kernel version.
To check the `kernel-headers` and `kernel-devel` package versions,
follow these steps:
1. To verify you have the supported version of the installed kernel headers,
type the following on the command line:
```shell
sudo yum list installed kernel-headers
```
The command mentioned above displays the list of kernel headers versions
currently present on your system. Verify if the listed kernel headers have
the same versions as the kernel.
2. The following command lists the development packages on your system. Verify
if the listed development package's version number matches the kernel
version number:
```shell
sudo yum list installed kernel-devel
```
3. If the matching versions of kernel headers and development
packages are not installed on the system, execute the command below to install
them:
```shell
sudo yum install "kernel-headers-$(uname -r)" "kernel-devel-$(uname -r)"
```
::::{rubric} Adding the AMDGPU and ROCm Stack Repositories
::::
1. Add the AMDGPU Stack Repository and Install the Kernel-mode Driver
```{attention}
If you have a version of the kernel-mode driver installed, you may skip this
section.
```
::::{tab-set}
:::{tab-item} RHEL 8.6
:sync: RHEL-8.6
```shell
sudo tee --append /etc/yum.repos.d/amdgpu.repo <<EOF
[amdgpu]
name=amdgpu
baseurl=https://repo.radeon.com/amdgpu/5.4.3/rhel/8.6/main/x86_64/
enabled=1
priority=50
gpgcheck=1
gpgkey=https://repo.radeon.com/rocm/rocm.gpg.key
EOF
sudo yum clean all
```
:::
:::{tab-item} RHEL 8.7
:sync: RHEL-8.7
```shell
sudo tee --append /etc/yum.repos.d/amdgpu.repo <<EOF
[amdgpu]
name=amdgpu
baseurl=https://repo.radeon.com/amdgpu/5.4.3/rhel/8.7/main/x86_64/
enabled=1
priority=50
gpgcheck=1
gpgkey=https://repo.radeon.com/rocm/rocm.gpg.key
EOF
sudo yum clean all
```
:::
:::{tab-item} RHEL 9.1
:sync: RHEL-9.1
```shell
sudo tee --append /etc/yum.repos.d/amdgpu.repo <<EOF
[amdgpu]
name=amdgpu
baseurl=https://repo.radeon.com/amdgpu/5.4.3/rhel/9.1/main/x86_64/
enabled=1
priority=50
gpgcheck=1
gpgkey=https://repo.radeon.com/rocm/rocm.gpg.key
EOF
sudo yum clean all
```
:::
::::
Install the kernel-mode driver and reboot the system using the following
commands:
```shell
sudo yum install amdgpu-dkms
sudo reboot
```
2. Add the ROCm Stack Repository and Install Meta-packages
To add the ROCm repository, use the following steps:
```shell
for ver in 5.0.2 5.1.4 5.2.5 5.3.3 5.4.3; do
sudo tee --append /etc/yum.repos.d/rocm.repo <<EOF
[ROCm-$ver]
name=ROCm$ver
baseurl=https://repo.radeon.com/rocm/yum/$ver/main
enabled=1
priority=50
gpgcheck=1
gpgkey=https://repo.radeon.com/rocm/rocm.gpg.key
EOF
done
sudo yum clean all
```
Install the packages of your choice as either a single-version or a
multi-version ROCm installation. For more information on single-version and
multi-version installations, refer to {ref}`installation-types`.
For a comprehensive list of meta-packages, refer to
{ref}`meta-package-desc`.
- Sample Single-version installation
```shell
sudo yum install rocm-hip-sdk
```
- Sample Multi-version installation
```{important}
If the existing ROCm release contains non-versioned ROCm packages, you must
uninstall those packages before proceeding to the multi-version installation
to avoid conflicts.
```
```shell
sudo yum install rocm-hip-sdk5.4.3 rocm-hip-sdk5.2.5
```
:::::
:::::{tab-item} SUSE Linux Enterprise Server 15
:sync: SLES15
::::{rubric} Installation of Kernel Headers and Development Packages
::::
ROCm installation requires you to install the `kernel-default-devel` and
`kernel-default` packages in the version corresponding to the
kernel's version.
**Example:** If the system is running the Linux kernel version
`5.3.18-57_11.0.18`, you must install the same versions of the kernel
development packages. Refer to {ref}`check-kernel-info` for how to check
the system's kernel version.
To check the `kernel-default-devel` and `kernel-default` package versions, follow
these steps:
1. Ensure that the correct versions of the `kernel-default-devel` and
`kernel-default` packages are installed. The following command shows
information about both packages:
```shell
sudo zypper info kernel-default-devel kernel-default
```
```{note}
The next step is required only if the above command shows that the
`kernel-default-devel` and `kernel-default` packages matching the kernel
release version are not installed on your system.
```
2. If the required versions of the packages are not installed on the system,
install them with the command below:
```shell
sudo zypper install kernel-default-devel kernel-default
```
::::{rubric} Adding the AMDGPU and ROCm Stack Repositories
::::
1. Add the AMDGPU Stack Repository and Install the Kernel-mode Driver
```{attention}
If you have a version of the kernel-mode driver installed, you may skip this
section.
```
```shell
sudo tee --append /etc/zypp/repos.d/amdgpu.repo <<EOF
[amdgpu]
name=amdgpu
baseurl=https://repo.radeon.com/amdgpu/5.4.3/sle/15.4/main/x86_64
enabled=1
gpgcheck=1
gpgkey=https://repo.radeon.com/rocm/rocm.gpg.key
EOF
sudo zypper ref
```
Install the kernel-mode driver and reboot the system using the following
commands:
```shell
sudo zypper --gpg-auto-import-keys install amdgpu-dkms
sudo reboot
```
2. Add the ROCm Stack Repository and Install Meta-packages
To add the ROCm repository, use the following steps:
```shell
for ver in 5.0.2 5.1.4 5.2.5 5.3.3 5.4.3; do
sudo tee --append /etc/zypp/repos.d/rocm.repo <<EOF
[ROCm-$ver]
name=ROCm$ver
baseurl=https://repo.radeon.com/rocm/zyp/$ver/main
enabled=1
gpgcheck=1
gpgkey=https://repo.radeon.com/rocm/rocm.gpg.key
EOF
done
sudo zypper ref
```
Install the packages of your choice as either a single-version or a
multi-version ROCm installation. For more information on single-version and
multi-version installations, refer to {ref}`installation-types`.
For a comprehensive list of meta-packages, refer to
{ref}`meta-package-desc`.
- Sample Single-version installation
```shell
sudo zypper --gpg-auto-import-keys install rocm-hip-sdk
```
- Sample Multi-version installation
```{important}
If the existing ROCm release contains non-versioned ROCm packages, you must
uninstall those packages before proceeding to the multi-version installation
to avoid conflicts.
```
```shell
sudo zypper --gpg-auto-import-keys install rocm-hip-sdk5.4.3 rocm-hip-sdk5.2.5
```
:::::
::::::
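The Ubuntu instructions above quote a SHA1 checksum for `rocm.gpg.key`. The sketch below shows one way to compare a downloaded key against that checksum before trusting it. `verify_sha1` is a hypothetical helper name, and saving the key to a local file first is an assumption that differs from the piped command shown in the Ubuntu tab.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only if FILE has the expected SHA1 digest.
# Usage: verify_sha1 <file> <expected-hex-digest>
verify_sha1() {
    local file="$1" expected="$2" actual
    actual="$(sha1sum "$file" | cut -d ' ' -f 1)"
    [ "$actual" = "$expected" ]
}

# Example (the digest is the one quoted in the GPG key note above):
# curl -fsSL -o rocm.gpg.key https://repo.radeon.com/rocm/rocm.gpg.key
# verify_sha1 rocm.gpg.key 73f5d8100de6048aa38a8b84cd9a87f05177d208 \
#     && sudo gpg --dearmor -o /etc/apt/trusted.gpg.d/rocm-keyring.gpg rocm.gpg.key
```

Since the key may change between releases, a mismatch does not necessarily mean tampering; it may only mean the quoted checksum is out of date.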
(post-install-actions-linux)=
## Post-install Actions and Verification Process
The post-install actions listed here are optional and depend on your use case,
but are generally useful. Verification of the install is advised.
### Post-install Actions
1. Instruct the system linker where to find the shared objects (`.so` files) for
ROCm applications.
```shell
sudo tee --append /etc/ld.so.conf.d/rocm.conf <<EOF
/opt/rocm/lib
/opt/rocm/lib64
EOF
sudo ldconfig
```
```{note}
Multi-version installations require extra care. Having multiple versions on
the system linker library search path is not advised. Take care both
at compile time and at run time to ensure that the proper libraries are
picked up. You can override `ld.so.conf` entries on a case-by-case basis
using the `LD_LIBRARY_PATH` environment variable.
```
2. Add binary paths to the `PATH` environment variable.
```shell
export PATH=$PATH:/opt/rocm-5.4.3/bin:/opt/rocm-5.4.3/opencl/bin
```
```{attention}
When using CMake to build applications, having the ROCm install location on
the PATH subtly affects how ROCm libraries are searched for. See [Config Mode
Search Procedure](https://cmake.org/cmake/help/latest/command/find_package.html#config-mode-search-procedure)
and [CMAKE_FIND_USE_SYSTEM_ENVIRONMENT_PATH](https://cmake.org/cmake/help/latest/variable/CMAKE_FIND_USE_SYSTEM_ENVIRONMENT_PATH.html)
for details.
(Entries in the `PATH` minus `bin` and `sbin` are added to library search
paths, therefore this convenience will affect builds and result in ROCm
libraries almost always being found. This may be an issue when you're
developing these libraries or want to use self-built versions of them.)
```
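For multi-version installations, the per-shell overrides mentioned above can be generated for one release at a time. This is a minimal sketch: `rocm_env` is a hypothetical helper, and it assumes that release `<ver>` is installed under `/opt/rocm-<ver>` with libraries in `lib` and `lib64`, mirroring the `/opt/rocm` paths used above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print export statements that select one ROCm release
# for the current shell. Assumption: the release lives in /opt/rocm-<ver>.
rocm_env() {
    local root="/opt/rocm-$1"
    # Prepend so that the chosen release wins over any other ROCm entries.
    echo "export PATH=\"$root/bin:$root/opencl/bin:\$PATH\""
    echo "export LD_LIBRARY_PATH=\"$root/lib:$root/lib64\${LD_LIBRARY_PATH:+:\$LD_LIBRARY_PATH}\""
}
```

Apply it with `eval "$(rocm_env 5.4.3)"`. The settings last only for that shell session, so the system-wide `ld.so.conf` entries stay untouched.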
(verifying-kernel-mode-driver-installation)=
### Verifying Kernel-mode Driver Installation
Check the installation of the kernel-mode driver by typing the command given
below:
```shell
dkms status
```
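In scripts, the output of `dkms status` can be checked automatically. The filter below is a sketch: `amdgpu_dkms_installed` is a hypothetical name, and the pattern assumes each module line starts with `amdgpu` followed by `/` or `,` and ends with the state `installed`, which may vary between DKMS versions.

```shell
#!/usr/bin/env bash
# Hypothetical filter: read `dkms status` output on stdin and succeed if an
# amdgpu module is reported as installed.
# Assumption: lines look like "amdgpu/<version>, <kernel>, <arch>: installed".
amdgpu_dkms_installed() {
    grep -Eq '^amdgpu[/,].*installed'
}
```

A typical call would be `dkms status | amdgpu_dkms_installed && echo "kernel-mode driver present"`.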
### Verifying ROCm Installation
After completing the ROCm installation, execute the following commands on the
system to verify if the installation is successful. If you see your GPUs listed
by both commands, the installation is considered successful:
```shell
/opt/rocm/bin/rocminfo
# OR
/opt/rocm/opencl/bin/clinfo
```
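The two checks can be wrapped for use in automation. This sketch only confirms that both tools exist and exit successfully; it does not inspect whether GPUs are listed, which still needs a look at the tool output. `check_rocm_tools` is a hypothetical helper, and the optional root argument is an assumption added so that a versioned install directory can be checked.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report whether rocminfo and clinfo exist under a ROCm
# root and exit successfully. Usage: check_rocm_tools [root]
check_rocm_tools() {
    local root="${1:-/opt/rocm}" tool status=0
    for tool in "$root/bin/rocminfo" "$root/opencl/bin/clinfo"; do
        if [ ! -x "$tool" ]; then
            echo "missing: $tool"
            status=1
        elif ! "$tool" >/dev/null 2>&1; then
            echo "failed: $tool"
            status=1
        else
            echo "ok: $tool"
        fi
    done
    return "$status"
}
```

Calling `check_rocm_tools` with no argument checks `/opt/rocm`; the function returns a non-zero status if either tool is missing or fails.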
### Verifying Package Installation
To ensure the packages are installed successfully, use the following commands:
::::{tab-set}
:::{tab-item} Ubuntu
:sync: ubuntu
```shell
sudo apt list --installed
```
:::
:::{tab-item} Red Hat Enterprise Linux
:sync: RHEL
```shell
sudo yum list installed
```
:::
:::{tab-item} SUSE Linux Enterprise Server 15
:sync: SLES15
```shell
sudo zypper search --installed-only
```
:::
::::

@@ -1,111 +1,38 @@
# Installation Overview (Linux)
# ROCm Installation Options (Linux)
This document is intended for users familiar with Linux and discusses the
installation of ROCm on various distributions.
Users installing ROCm must choose between various installation options. A new
user should follow the [Quick Start guide](./quick_start).
The guide provides instructions for the following:
## Package Manager versus AMDGPU Installer?
- Kernel-mode driver installation
- ROCm single-version and multi-version installation
- ROCm and kernel-mode driver version upgrade
- ROCm single-version and multi-version uninstallation
- Kernel-mode driver uninstallation
ROCm supports two methods for installation:
```{note}
The rest of this document refers to _Radeon™ Software for Linux_ as the `amdgpu`
stack and `amdgpu-dkms` driver as the kernel-mode driver.
```
- Directly using the Linux distribution's package manager
- The `amdgpu-install` script
## Installation Methods
There is no difference in the final installation state when choosing either
option.
It is customary for Linux installers to integrate into the system's package
manager. There are two notable groups of package sources:
Using the distribution's package manager lets the user install,
upgrade and uninstall using familiar commands and workflows. Third party
ecosystem support is the same as your OS package manager.
- AMD-hosted repositories maintained by AMD available to register on supported
Linux distribution versions. For a complete list of AMD-supported platforms,
refer to the article: [GPU and OS Support](/release/gpu_os_support).
- Distribution-hosted repositories maintained by the developer of said Linux
distribution. These require little to no setup from the user, but aren't tested
by AMD. For support on these installations, contact the relevant maintainers.
The `amdgpu-install` script is a wrapper around the package manager. The same
packages are installed by this script as the package manager system.
AMD also provides installer scripts for those that wish to drive installations
in a more manual fashion.
## Package Licensing
```{attention}
AQL Profiler and AOCC CPU optimization are both provided in binary form, each
subject to the license agreement enclosed in the directory for the binary and is
available here: `/opt/rocm/share/doc/rocm-llvm-alt/EULA`. By using, installing,
copying or distributing AQL Profiler and/or AOCC CPU Optimizations, you agree to
the terms and conditions of this license agreement. If you do not agree to the
terms of this agreement, do not install, copy or use the AQL Profiler and/or the
AOCC CPU Optimizations.
```
For the rest of the ROCm packages, you can find the licensing information at the
following location: `/opt/rocm/share/doc/<component-name>/`
For example, you can fetch the licensing information of the `_amd_comgr_`
component (Code Object Manager) from the `amd_comgr` folder. A file named
`LICENSE.txt` contains the license details at:
`/opt/rocm-5.4.3/share/doc/amd_comgr/LICENSE.txt`
### Package Manager Integration
Integrating with the distribution's package manager lets the user install,
upgrade and uninstall using familiar commands and workflows. The actual commands
vary from distribution to distribution. For more information, refer to
[Package Manager Integration](package_manager_integration).
### Installer Script
The `amdgpu-install` script streamlines the installation process by:
- Abstracting the distribution-specific package installation logic
- Performing the repository setup
- Allowing you to specify the use case and automating the installation of all
the required packages
- Installing multiple ROCm releases simultaneously on a system
- Automating updating local repository information through enhanced
functionality of the `amdgpu-install` script
- Performing post-install checks to verify whether the installation was
completed successfully
- Upgrading the installed ROCm release
- Uninstalling the installed single-version or multi-version ROCm releases
```{tip}
The installer script is provided for convenience. It doesn't do anything the
user otherwise couldn't. It automates some tasks surrounding installation, such
as registering/unregistering and driving the system's package manager, but the
bulk of the work will still be done by the system's package manager. As is the
case with most convenience wrappers, some degree of customization is lost for
the sake of simplicity.
```
#### Use cases
The installer script introduces the notion of "use cases", which denote usage
patterns or reasons why someone installs ROCm. This is to allow users to install
only a subset of the ROCm ecosystem, parts concerning them, resulting in
smaller installation footprint and faster installs/upgrades.
Some of the ROCm-specific use cases the installer supports are:
- OpenCL (ROCr/KFD based) runtime
- HIP runtimes
- ROCm libraries and applications
- ROCm Compiler and device libraries
- Kernel-mode driver
For more information, refer to the How to Install ROCm section in this guide.
The installer automates the installation process for the AMDGPU
and ROCm stack. It handles the complete installation process
for ROCm, including setting up the repository, cleaning the system, updating,
and installing the desired drivers and meta-packages. Users who are
less familiar with the package manager can choose this method for ROCm
installation.
(installation-types)=
## Installation types
## Single Version ROCm install versus Multi-Version
This section discusses the single-version and multi-version installation of the
ROCm software stack.
ROCm packages are versioned with both semantic versioning that is package
specific and a ROCm release version.
### Single-version Installation
@@ -123,8 +50,14 @@ The multi-version installation refers to the following:
ability to support multiple versions of packages simultaneously.
- Use of versioned ROCm meta-packages.
```{attention}
ROCm packages that were previously installed from a single-version installation
must be removed before proceeding with the multi-version installation to avoid
conflicts.
```
```{note}
Multiversion install is not available for the AMDGPU stack.
Multiversion install is not available for the kernel driver module, also referred to as AMDGPU.
```
The following image demonstrates the difference between single-version and

@@ -0,0 +1,31 @@
# AMDGPU Install Script
::::{grid} 2 3 3 3
:gutter: 1
:::{grid-item-card} Install
:link: install
:link-type: doc
How to install ROCm?
:::
:::{grid-item-card} Upgrade
:link: upgrade
:link-type: doc
Instructions for upgrading an existing ROCm installation.
:::
:::{grid-item-card} Uninstall
:link: uninstall
:link-type: doc
Steps for removing ROCm packages, libraries, and tools.
:::
::::
## See Also
- {doc}`/release/gpu_os_support`

@@ -0,0 +1,299 @@
# Installation with install script
Prior to beginning, please ensure you have the [prerequisites](../prerequisites)
installed.
## Download the Installer Script
To download and install the `amdgpu-install` script on the system, use the
following commands based on your distribution.
::::::{tab-set}
:::::{tab-item} Ubuntu
:sync: ubuntu
::::{tab-set}
:::{tab-item} Ubuntu 18.04
:sync: ubuntu-18.04
```shell
sudo apt update
wget https://repo.radeon.com/amdgpu-install/22.10.3/ubuntu/bionic/amdgpu-install_22.10.3.50103-1_all.deb
sudo apt install ./amdgpu-install_22.10.3.50103-1_all.deb
```
:::
:::{tab-item} Ubuntu 20.04
:sync: ubuntu-20.04
```shell
sudo apt update
wget https://repo.radeon.com/amdgpu-install/22.10.3/ubuntu/focal/amdgpu-install_22.10.3.50103-1_all.deb
sudo apt install ./amdgpu-install_22.10.3.50103-1_all.deb
```
:::
::::
:::::
:::::{tab-item} Red Hat Enterprise Linux
:sync: RHEL
::::{tab-set}
:::{tab-item} RHEL 7.9
:sync: RHEL-7.9
:sync: RHEL-7
```shell
sudo yum install https://repo.radeon.com/amdgpu-install/22.10.3/rhel/7.9/amdgpu-install-22.10.3.50103-1.el7.noarch.rpm
```
:::
:::{tab-item} RHEL 8.4
:sync: RHEL-8.4
:sync: RHEL-8
```shell
sudo yum install https://repo.radeon.com/amdgpu-install/22.10.3/rhel/8.4/amdgpu-install-22.10.3.50103-1.el8.noarch.rpm
```
:::
:::{tab-item} RHEL 8.5
:sync: RHEL-8.5
:sync: RHEL-8
```shell
sudo yum install https://repo.radeon.com/amdgpu-install/22.10.3/rhel/8.5/amdgpu-install-22.10.3.50103-1.el8.noarch.rpm
```
:::
::::
:::::
:::::{tab-item} SUSE Linux Enterprise Server 15
:sync: SLES15
::::{tab-set}
:::{tab-item} Service Pack 4
:sync: SLES15-SP4
```shell
sudo zypper --no-gpg-checks install https://repo.radeon.com/amdgpu-install/22.10.3/sle/15/amdgpu-install-22.10.3.50103-1.noarch.rpm
```
:::
::::
:::::
::::::
## Use cases
Instead of installing individual applications or libraries, the installer script
groups packages into specific use cases, matching typical workflows and runtimes.
To display a list of available use cases, execute the command:
```shell
sudo amdgpu-install --list-usecase
```
The available use-cases will be printed in a format similar to the example
output below.
```none
If --usecase option is not present, the default selection is "graphics,opencl,hip"
Available use cases:
rocm(for users and developers requiring full ROCm stack)
- OpenCL (ROCr/KFD based) runtime
- HIP runtimes
- Machine learning framework
- All ROCm libraries and applications
- ROCm Compiler and device libraries
- ROCr runtime and thunk
lrt(for users of applications requiring ROCm runtime)
- ROCm Compiler and device libraries
- ROCr runtime and thunk
opencl(for users of applications requiring OpenCL on Vega or
later products)
- ROCr based OpenCL
- ROCm Language runtime
openclsdk (for application developers requiring ROCr based OpenCL)
- ROCr based OpenCL
- ROCm Language runtime
- development and SDK files for ROCr based OpenCL
hip(for users of HIP runtime on AMD products)
- HIP runtimes
hiplibsdk (for application developers requiring HIP on AMD products)
- HIP runtimes
- ROCm math libraries
- HIP development libraries
```
To install use cases specific to your requirements, use the installer
`amdgpu-install` as follows:
- To install a single use case add it with the `--usecase` option:
```shell
sudo amdgpu-install --usecase=rocm
```
- For multiple use cases separate them with commas:
```shell
sudo amdgpu-install --usecase=hiplibsdk,rocm
```
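When the use case list is built by a script, the comma-separated argument can be assembled from separate names. A minimal sketch, where `usecase_arg` is a hypothetical helper name:

```shell
#!/usr/bin/env bash
# Hypothetical helper: join use case names into a single --usecase argument.
usecase_arg() {
    # "$*" joins the arguments with the first character of IFS.
    local IFS=','
    echo "--usecase=$*"
}

usecase_arg hiplibsdk rocm   # prints --usecase=hiplibsdk,rocm
```

The result can be passed straight to the installer, for example `sudo amdgpu-install "$(usecase_arg hiplibsdk rocm)"`.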
## Single-version ROCm Installation
By default (without the `--rocmrelease` option)
the installer script will install packages in the single-version layout.
## Multi-version ROCm Installation
For the multi-version ROCm installation you must use the installer script from
the latest release of ROCm that you wish to install.
**Example:** If you want to install ROCm releases 5.0.2 and 5.1.3
simultaneously, you are required to download the installer from the latest ROCm
release v5.1.3.
### Add Required Repositories
You must add the ROCm repositories manually for all ROCm releases
you want to install except the latest one. The `amdgpu-install` script
automatically adds the required repositories for the latest release.
Run the following commands based on your distribution to add the repositories:
::::::{tab-set}
:::::{tab-item} Ubuntu
:sync: ubuntu
::::{tab-set}
:::{tab-item} Ubuntu 18.04
:sync: ubuntu-18.04
```shell
for ver in 5.0.2 5.1.3; do
echo "deb [arch=amd64 signed-by=/etc/apt/trusted.gpg.d/rocm-keyring.gpg] https://repo.radeon.com/rocm/apt/$ver bionic main" | sudo tee --append /etc/apt/sources.list.d/rocm.list
done
echo -e 'Package: *\nPin: release o=repo.radeon.com\nPin-Priority: 600' | sudo tee /etc/apt/preferences.d/rocm-pin-600
sudo apt update
```
:::
:::{tab-item} Ubuntu 20.04
:sync: ubuntu-20.04
```shell
for ver in 5.0.2 5.1.3; do
echo "deb [arch=amd64 signed-by=/etc/apt/trusted.gpg.d/rocm-keyring.gpg] https://repo.radeon.com/rocm/apt/$ver focal main" | sudo tee --append /etc/apt/sources.list.d/rocm.list
done
echo -e 'Package: *\nPin: release o=repo.radeon.com\nPin-Priority: 600' | sudo tee /etc/apt/preferences.d/rocm-pin-600
sudo apt update
```
:::
::::
:::::
:::::{tab-item} Red Hat Enterprise Linux
:sync: RHEL
::::{tab-set}
:::{tab-item} RHEL 7
:sync: RHEL-7
```shell
for ver in 5.0.2 5.1.3; do
sudo tee --append /etc/yum.repos.d/rocm.repo <<EOF
[ROCm-$ver]
name=ROCm$ver
baseurl=https://repo.radeon.com/rocm/yum/$ver/main
enabled=1
priority=50
gpgcheck=1
gpgkey=https://repo.radeon.com/rocm/rocm.gpg.key
EOF
done
sudo yum clean all
```
:::
:::{tab-item} RHEL 8
:sync: RHEL-8
```shell
for ver in 5.0.2 5.1.3; do
sudo tee --append /etc/yum.repos.d/rocm.repo <<EOF
[ROCm-$ver]
name=ROCm$ver
baseurl=https://repo.radeon.com/rocm/yum/$ver/main
enabled=1
priority=50
gpgcheck=1
gpgkey=https://repo.radeon.com/rocm/rocm.gpg.key
EOF
done
sudo yum clean all
```
:::
::::
:::::
:::::{tab-item} SUSE Linux Enterprise Server 15
:sync: SLES15
```shell
for ver in 5.0.2 5.1.3; do
sudo tee --append /etc/zypp/repos.d/rocm.repo <<EOF
[ROCm-$ver]
name=ROCm$ver
baseurl=https://repo.radeon.com/rocm/zyp/$ver/main
enabled=1
gpgcheck=1
gpgkey=https://repo.radeon.com/rocm/rocm.gpg.key
EOF
done
sudo zypper ref
```
:::::
::::::
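For the RHEL tabs above, the repository stanzas can be rendered to standard output and reviewed before anything is appended to `rocm.repo`. This is a sketch only: `render_rocm_yum_repo` is a hypothetical helper, and the stanza fields and URL pattern are copied from the RHEL commands above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print one yum repository stanza per ROCm release.
# Usage: render_rocm_yum_repo <release>...
render_rocm_yum_repo() {
    local ver
    for ver in "$@"; do
        # Fields and URL pattern are copied from the RHEL commands above.
        cat <<EOF
[ROCm-$ver]
name=ROCm$ver
baseurl=https://repo.radeon.com/rocm/yum/$ver/main
enabled=1
priority=50
gpgcheck=1
gpgkey=https://repo.radeon.com/rocm/rocm.gpg.key

EOF
    done
}
```

After review, the output of `render_rocm_yum_repo 5.0.2 5.1.3` can be piped to `sudo tee /etc/yum.repos.d/rocm.repo`, which writes the whole file in one step and avoids duplicate stanzas when the command is repeated.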
### Install packages
Use the installer script as given below:
```none
sudo amdgpu-install --usecase=rocm --rocmrelease=<release-number-1>
sudo amdgpu-install --usecase=rocm --rocmrelease=<release-number-2>
sudo amdgpu-install --usecase=rocm --rocmrelease=<release-number-3>
```
The following example shows a ROCm multi-version installation. The kernel-mode
driver associated with ROCm release v5.1.3 is installed because it is the latest
release in the list.
```none
sudo amdgpu-install --usecase=rocm --rocmrelease=5.0.2
sudo amdgpu-install --usecase=rocm --rocmrelease=5.1.3
```
## Additional options
### Unattended installation
Adding `-y` as a parameter to `amdgpu-install` skips user prompts (for
automation). Example: `amdgpu-install -y --usecase=rocm`
### Skipping kernel mode driver installation
The installer script tries to install the kernel-mode driver along with the
requested use cases. This might be unnecessary, for example in Docker
containers, or unwanted in a multi-version installation where you wish to keep
a specific driver version rather than have the last installed release overwrite
it.
To skip the installation of the kernel-mode driver add the `--no-dkms` option
when calling the installer script.

@@ -0,0 +1,25 @@
# Installer Script Uninstallation (Linux)
Use the following commands to uninstall ROCm packages and the kernel-mode
driver.
::::{rubric} Uninstalling Single-Version Install
::::
```console shell
sudo amdgpu-install --uninstall
```
::::{rubric} Uninstalling a Specific ROCm Release
::::
```console shell
sudo amdgpu-install --uninstall --rocmrelease=<release-number>
```
::::{rubric} Uninstalling all ROCm Releases
::::
```console shell
sudo amdgpu-install --uninstall --rocmrelease=all
```


@@ -0,0 +1,5 @@
# Upgrading with the Installer Script (Linux)
The upgrade procedure with the installer script is exactly the same as a
first-time installation. Refer to the {doc}`install` section for the exact
procedure to follow.


@@ -0,0 +1,38 @@
# Installation via Package manager
::::{grid} 2 3 3 3
:gutter: 1
:::{grid-item-card} Install
:link: install
:link-type: doc
How to install ROCm?
:::
:::{grid-item-card} Upgrade
:link: upgrade
:link-type: doc
Instructions for upgrading an existing ROCm installation.
:::
:::{grid-item-card} Uninstall
:link: uninstall
:link-type: doc
Steps for removing ROCm packages, libraries, and tools.
:::
:::{grid-item-card} Package Manager Integration
:link: package_manager_integration
:link-type: doc
Information about packages.
:::
::::
## See Also
- {doc}`/release/gpu_os_support`


@@ -0,0 +1,465 @@
# Installation (Linux)
## Understanding the Release-specific AMDGPU and ROCm Repositories on Linux Distributions
Each release-specific repository contains the AMDGPU and ROCm packages of one
particular release. These repositories are not updated with packages from
subsequent releases. Instead, when a new ROCm release is available, a new
repository specific to that release is added. You can install a specific
release, upgrade a previously installed single-version release to a later
one, or add the latest version of ROCm alongside the currently installed
version by using the multi-version ROCm packages.
## Step by Step Instructions
::::::{tab-set}
:::::{tab-item} Ubuntu
:sync: ubuntu
::::{rubric} 1. Download and convert the package signing key
::::
```shell
# Make the directory if it doesn't exist yet.
# This location is recommended by the distribution maintainers.
sudo mkdir --parents --mode=0755 /etc/apt/keyrings
# Download the key, convert the signing-key to a full
# keyring required by apt and store in the keyring directory
wget https://repo.radeon.com/rocm/rocm.gpg.key -O - | \
gpg --dearmor | sudo tee /etc/apt/keyrings/rocm.gpg > /dev/null
```
```{note}
The GPG key may change; ensure it is updated when installing a new release. If
the key signature verification fails while updating, re-add the ROCm key to
apt as described above. The current `rocm.gpg.key` is not available in a
standard key ring distribution, but has the following SHA-1 checksum:
`73f5d8100de6048aa38a8b84cd9a87f05177d208 rocm.gpg.key`
```
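The checksum in the note can be compared against a downloaded copy of the key. The following is a minimal sketch, assuming `sha1sum` from GNU coreutils is available and that the key was saved as `rocm.gpg.key` in the current directory; `verify_sha1` is a hypothetical helper name.

```shell
# Succeed only if the SHA-1 digest of file $1 equals the expected value $2.
verify_sha1() {
    file=$1
    expected=$2
    actual=$(sha1sum "$file" | cut -d ' ' -f 1)
    [ "$actual" = "$expected" ]
}

# Usage (after saving the key locally):
#   wget https://repo.radeon.com/rocm/rocm.gpg.key
#   verify_sha1 rocm.gpg.key 73f5d8100de6048aa38a8b84cd9a87f05177d208 \
#       && echo "checksum matches"
```

A mismatch may simply mean that the key has been rotated since this page was written, so treat the result as a prompt to investigate rather than as proof of tampering.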
::::{rubric} 2. Add the AMDGPU Repository and Install the Kernel-mode Driver
::::
```{tip}
If you have a version of the kernel-mode driver installed, you may skip this
section.
```
To add the AMDGPU repository, follow these steps:
::::{tab-set}
:::{tab-item} Ubuntu 18.04
:sync: ubuntu-18.04
```shell
# amdgpu repository for bionic
echo 'deb [arch=amd64 signed-by=/etc/apt/keyrings/rocm.gpg] https://repo.radeon.com/amdgpu/22.10.3/ubuntu bionic main' \
| sudo tee /etc/apt/sources.list.d/amdgpu.list
sudo apt update
```
:::
:::{tab-item} Ubuntu 20.04
:sync: ubuntu-20.04
```shell
# amdgpu repository for focal
echo 'deb [arch=amd64 signed-by=/etc/apt/keyrings/rocm.gpg] https://repo.radeon.com/amdgpu/22.10.3/ubuntu focal main' \
| sudo tee /etc/apt/sources.list.d/amdgpu.list
sudo apt update
```
:::
::::
Install the kernel mode driver and reboot the system using the following
commands:
```shell
sudo apt install amdgpu-dkms
sudo reboot
```
::::{rubric} 3. Add the ROCm Repository
::::
To add the ROCm repository, use the following steps:
::::{tab-set}
:::{tab-item} Ubuntu 18.04
:sync: ubuntu-18.04
```shell
# ROCm repositories for bionic
for ver in 5.0.2 5.1.3; do
echo "deb [arch=amd64 signed-by=/etc/apt/keyrings/rocm.gpg] https://repo.radeon.com/rocm/apt/$ver bionic main" \
| sudo tee --append /etc/apt/sources.list.d/rocm.list
done
echo -e 'Package: *\nPin: release o=repo.radeon.com\nPin-Priority: 600' \
| sudo tee /etc/apt/preferences.d/rocm-pin-600
sudo apt update
```
:::
:::{tab-item} Ubuntu 20.04
:sync: ubuntu-20.04
```shell
# ROCm repositories for focal
for ver in 5.0.2 5.1.3; do
echo "deb [arch=amd64 signed-by=/etc/apt/keyrings/rocm.gpg] https://repo.radeon.com/rocm/apt/$ver focal main" \
| sudo tee --append /etc/apt/sources.list.d/rocm.list
done
echo -e 'Package: *\nPin: release o=repo.radeon.com\nPin-Priority: 600' \
| sudo tee /etc/apt/preferences.d/rocm-pin-600
sudo apt update
```
:::
::::
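To see what the loop above adds to `rocm.list` without touching `/etc`, the same entries can be written to a scratch file first. This is only a sketch: the `codename` and `list_file` variables are illustrative, and `focal` is assumed as the Ubuntu release.

```shell
# Dry run: write the apt source entries to a scratch file for review.
codename=focal    # assumption: Ubuntu 20.04; use bionic for 18.04
list_file=$(mktemp)
for ver in 5.0.2 5.1.3; do
    echo "deb [arch=amd64 signed-by=/etc/apt/keyrings/rocm.gpg] https://repo.radeon.com/rocm/apt/$ver $codename main" \
        >> "$list_file"
done
cat "$list_file"
```

Each release contributes one `deb` line, which is why the real commands use `tee --append` rather than overwriting the file.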
::::{rubric} 4. Install packages
::::
Install the packages of your choice as either a single-version or a
multi-version ROCm installation. For more information on single-version and
multi-version installations, refer to {ref}`installation-types`.
For a comprehensive list of meta-packages, refer to
{ref}`meta-package-desc`.
- Sample Single-version installation
```shell
sudo apt install rocm-hip-sdk
```
- Sample Multi-version installation
```shell
sudo apt install rocm-hip-sdk5.0.2 rocm-hip-sdk5.1.3
```
:::::
:::::{tab-item} Red Hat Enterprise Linux
:sync: RHEL
::::{rubric} 1. Add the AMDGPU Stack Repository and Install the Kernel-mode Driver
::::
```{tip}
If you have a version of the kernel-mode driver installed, you may skip this
section.
```
::::{tab-set}
:::{tab-item} RHEL 7.9
:sync: RHEL-7.9
:sync: RHEL-7
```shell
sudo tee /etc/yum.repos.d/amdgpu.repo <<EOF
[amdgpu]
name=amdgpu
baseurl=https://repo.radeon.com/amdgpu/22.10.3/rhel/7.9/main/x86_64/
enabled=1
priority=50
gpgcheck=1
gpgkey=https://repo.radeon.com/rocm/rocm.gpg.key
EOF
sudo yum clean all
```
:::
:::{tab-item} RHEL 8.4
:sync: RHEL-8.4
:sync: RHEL-8
```shell
sudo tee /etc/yum.repos.d/amdgpu.repo <<EOF
[amdgpu]
name=amdgpu
baseurl=https://repo.radeon.com/amdgpu/22.10.3/rhel/8.4/main/x86_64/
enabled=1
priority=50
gpgcheck=1
gpgkey=https://repo.radeon.com/rocm/rocm.gpg.key
EOF
sudo yum clean all
```
:::
:::{tab-item} RHEL 8.5
:sync: RHEL-8.5
:sync: RHEL-8
```shell
sudo tee /etc/yum.repos.d/amdgpu.repo <<EOF
[amdgpu]
name=amdgpu
baseurl=https://repo.radeon.com/amdgpu/22.10.3/rhel/8.5/main/x86_64/
enabled=1
priority=50
gpgcheck=1
gpgkey=https://repo.radeon.com/rocm/rocm.gpg.key
EOF
sudo yum clean all
```
:::
::::
Install the kernel mode driver and reboot the system using the following
commands:
```shell
sudo yum install amdgpu-dkms
sudo reboot
```
::::{rubric} 2. Add the ROCm Stack Repository
::::
To add the ROCm repository, use the following steps, based on your distribution:
::::{tab-set}
:::{tab-item} RHEL 7
:sync: RHEL-7
```shell
for ver in 5.0.2 5.1.3; do
sudo tee --append /etc/yum.repos.d/rocm.repo <<EOF
[ROCm-$ver]
name=ROCm$ver
baseurl=https://repo.radeon.com/rocm/yum/$ver/main
enabled=1
priority=50
gpgcheck=1
gpgkey=https://repo.radeon.com/rocm/rocm.gpg.key
EOF
done
sudo yum clean all
```
:::
:::{tab-item} RHEL 8
:sync: RHEL-8
```shell
for ver in 5.0.2 5.1.3; do
sudo tee --append /etc/yum.repos.d/rocm.repo <<EOF
[ROCm-$ver]
name=ROCm$ver
baseurl=https://repo.radeon.com/rocm/yum/$ver/main
enabled=1
priority=50
gpgcheck=1
gpgkey=https://repo.radeon.com/rocm/rocm.gpg.key
EOF
done
sudo yum clean all
```
:::
::::
::::{rubric} 3. Install packages
::::
Install the packages of your choice as either a single-version or a
multi-version ROCm installation. For more information on single-version and
multi-version installations, refer to {ref}`installation-types`.
For a comprehensive list of meta-packages, refer to
{ref}`meta-package-desc`.
- Sample Single-version installation
```shell
sudo yum install rocm-hip-sdk
```
- Sample Multi-version installation
```shell
sudo yum install rocm-hip-sdk5.0.2 rocm-hip-sdk5.1.3
```
:::::
:::::{tab-item} SUSE Linux Enterprise Server 15
:sync: SLES15
::::{rubric} 1. Add the AMDGPU Repository and Install the Kernel-mode Driver
::::
```{tip}
If you have a version of the kernel-mode driver installed, you may skip this
section.
```
```shell
sudo tee /etc/zypp/repos.d/amdgpu.repo <<EOF
[amdgpu]
name=amdgpu
baseurl=https://repo.radeon.com/amdgpu/22.10.3/sle/15/main/x86_64
enabled=1
gpgcheck=1
gpgkey=https://repo.radeon.com/rocm/rocm.gpg.key
EOF
sudo zypper ref
```
Install the kernel mode driver and reboot the system using the following
commands:
```shell
sudo zypper --gpg-auto-import-keys install amdgpu-dkms
sudo reboot
```
::::{rubric} 2. Add the ROCm Stack Repository
::::
To add the ROCm repository, use the following steps:
```shell
for ver in 5.0.2 5.1.3; do
sudo tee --append /etc/zypp/repos.d/rocm.repo <<EOF
[ROCm-$ver]
name=ROCm$ver
baseurl=https://repo.radeon.com/rocm/zyp/$ver/main
enabled=1
gpgcheck=1
gpgkey=https://repo.radeon.com/rocm/rocm.gpg.key
EOF
done
sudo zypper ref
```
::::{rubric} 3. Install packages
::::
Install the packages of your choice as either a single-version or a
multi-version ROCm installation. For more information on single-version and
multi-version installations, refer to {ref}`installation-types`.
For a comprehensive list of meta-packages, refer to
{ref}`meta-package-desc`.
- Sample Single-version installation
```shell
sudo zypper --gpg-auto-import-keys install rocm-hip-sdk
```
- Sample Multi-version installation
```shell
sudo zypper --gpg-auto-import-keys install rocm-hip-sdk5.0.2 rocm-hip-sdk5.1.3
```
:::::
::::::
(post-install-actions-linux)=
## Post-install Actions and Verification Process
The post-install actions listed here are optional and depend on your use case,
but are generally useful. Verification of the install is advised.
### Post-install Actions
1. Instruct the system linker where to find the shared objects (`.so` files) for
ROCm applications.
```shell
sudo tee --append /etc/ld.so.conf.d/rocm.conf <<EOF
/opt/rocm/lib
/opt/rocm/lib64
EOF
sudo ldconfig
```
```{note}
Multi-version installations require extra care. Having multiple versions on
the system linker library search path is not advised. Take care at both
compile time and run time to ensure that the proper libraries are picked up.
You can override `ld.so.conf` entries on a case-by-case basis using the
`LD_LIBRARY_PATH` environment variable.
```
2. Add binary paths to the `PATH` environment variable.
```shell
export PATH=$PATH:/opt/rocm-5.1.3/bin:/opt/rocm-5.1.3/opencl/bin
```
```{attention}
When using CMake to build applications, having the ROCm install location on
the PATH subtly affects how ROCm libraries are searched for. See [Config Mode
Search Procedure](https://cmake.org/cmake/help/latest/command/find_package.html#config-mode-search-procedure)
and [CMAKE_FIND_USE_SYSTEM_ENVIRONMENT_PATH](https://cmake.org/cmake/help/latest/variable/CMAKE_FIND_USE_SYSTEM_ENVIRONMENT_PATH.html)
for details.
(Entries in the `PATH` minus `bin` and `sbin` are added to library search
paths, therefore this convenience will affect builds and result in ROCm
libraries almost always being found. This may be an issue when you're
developing these libraries or want to use self-built versions of them.)
```
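If the `export` line from step 2 is placed in a shell startup file, the same directories can end up on `PATH` several times. The following sketch appends a directory only when it is not already present; `path_append` is a hypothetical helper name, and the paths assume the ROCm 5.1.3 install location used above.

```shell
# Append directory $1 to PATH unless it is already listed.
path_append() {
    case ":$PATH:" in
        *":$1:"*) ;;                  # already on PATH, nothing to do
        *) PATH="$PATH:$1" ;;
    esac
}

path_append /opt/rocm-5.1.3/bin
path_append /opt/rocm-5.1.3/opencl/bin
export PATH
```

Wrapping `PATH` in colons before matching avoids false positives on directories that merely share a prefix.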
(verifying-kernel-mode-driver-installation)=
### Verifying Kernel-mode Driver Installation
Check the installation of the kernel-mode driver by typing the command given
below:
```shell
dkms status
```
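In scripts, the status text can be checked rather than read by eye. The following is a sketch only: the exact `dkms status` output format differs between DKMS versions, so the pattern below is an assumption, and `amdgpu_installed` is a hypothetical helper name.

```shell
# Succeed if dkms-style status text on stdin reports the amdgpu module
# as installed. The matched format is an assumption; adjust as needed.
amdgpu_installed() {
    grep -Eq '^amdgpu[/,].*: installed'
}

# Usage on a system with DKMS:
#   dkms status | amdgpu_installed && echo "amdgpu module is installed"
```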
### Verifying ROCm Installation
After completing the ROCm installation, run either of the following commands
to verify that the installation succeeded. If your GPUs are listed in the
output, the installation is considered successful:
```shell
/opt/rocm/bin/rocminfo
# OR
/opt/rocm/opencl/bin/clinfo
```
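Which of the two tools is present depends on the packages that were installed. The sketch below runs whichever tools exist and reports the ones that do not; `check_rocm_tools` is a hypothetical helper name, and `/opt/rocm` is assumed as the default prefix.

```shell
# Run each verification tool found under prefix $1 (default /opt/rocm).
# Returns the number of tools that are missing.
check_rocm_tools() {
    prefix=${1:-/opt/rocm}
    missing=0
    for tool in "$prefix/bin/rocminfo" "$prefix/opencl/bin/clinfo"; do
        if [ -x "$tool" ]; then
            "$tool" > /dev/null && echo "ok: $tool"
        else
            echo "missing: $tool"
            missing=$((missing + 1))
        fi
    done
    return "$missing"
}

check_rocm_tools /opt/rocm || echo "at least one verification tool was not found"
```

The sketch only confirms that the tools run. Whether your GPUs appear in their output still needs to be checked by hand.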
### Verifying Package Installation
To ensure the packages are installed successfully, use the following commands:
::::{tab-set}
:::{tab-item} Ubuntu
:sync: ubuntu
```shell
sudo apt list --installed
```
:::
:::{tab-item} Red Hat Enterprise Linux
:sync: RHEL
```shell
sudo yum list installed
```
:::
:::{tab-item} SUSE Linux Enterprise Server
:sync: SLES
```shell
sudo zypper search --installed-only
```
:::
::::


@@ -1,23 +1,9 @@
# Uninstallation (Linux)
# Uninstallation with package manager (Linux)
Uninstallation of ROCm entails removing ROCm packages, tools, and libraries from
the system.
You can uninstall using the following methods:
- Package manager uninstallation
- Uninstallation using the uninstall script
```{attention}
Use the same uninstall method that you used to install ROCm. Mixing procedures
is untested and may result in inconsistent system state.
```
## Package Manager Method
The package manager uninstallation offers a method for a clean uninstallation
process for ROCm. This section describes how to uninstall the ROCm instance from
various Linux distributions.
This section describes how to uninstall ROCm with the Linux distribution's
package manager. Use this method if ROCm was installed via the package
manager. If the installer script was used for installation, use it for
uninstallation as well; refer to {doc}`/deploy/linux/installer/uninstall`.
::::::{tab-set}
:::::{tab-item} Ubuntu
@@ -182,31 +168,3 @@ sudo zypper remove --clean-deps amdgpu-dkms
:::::
::::::
## Installer Script Method
::::{rubric} Uninstalling Single-Version Install
::::
```console shell
sudo amdgpu-install --uninstall
```
```{note}
This command uninstalls all ROCm packages associated with the installed ROCm
release along with the kernel-mode driver.
```
::::{rubric} Uninstalling a Specific ROCm Release
::::
```console shell
sudo amdgpu-install --uninstall --rocmrelease=<release-number>
```
::::{rubric} Uninstalling all ROCm Releases
::::
```console shell
sudo amdgpu-install --uninstall --rocmrelease=all
```


@@ -0,0 +1,292 @@
# Upgrade ROCm with the package manager
This section explains how to upgrade the existing AMDGPU driver and ROCm
packages to the latest version using your distribution's package manager.
```{note}
Package upgrade is applicable to single-version packages only. If the preference
is to install an updated version of ROCm along with the currently
installed version, refer to the [](install) page.
```
## Upgrade Steps
### Update the AMDGPU repository
Execute the commands below based on your distribution to point the `amdgpu`
repository to the new release.
::::::{tab-set}
:::::{tab-item} Ubuntu
:sync: ubuntu
::::{tab-set}
:::{tab-item} Ubuntu 18.04
:sync: ubuntu-18.04
```shell
# amdgpu repository for bionic
echo 'deb [arch=amd64 signed-by=/etc/apt/keyrings/rocm.gpg] https://repo.radeon.com/amdgpu/22.10.3/ubuntu bionic main' \
| sudo tee /etc/apt/sources.list.d/amdgpu.list
sudo apt update
```
:::
:::{tab-item} Ubuntu 20.04
:sync: ubuntu-20.04
```shell
# amdgpu repository for focal
echo 'deb [arch=amd64 signed-by=/etc/apt/keyrings/rocm.gpg] https://repo.radeon.com/amdgpu/22.10.3/ubuntu focal main' \
| sudo tee /etc/apt/sources.list.d/amdgpu.list
sudo apt update
```
:::
::::
:::::
:::::{tab-item} Red Hat Enterprise Linux
:sync: RHEL
::::{tab-set}
:::{tab-item} RHEL 7.9
:sync: RHEL-7.9
:sync: RHEL-7
```shell
sudo tee /etc/yum.repos.d/amdgpu.repo <<EOF
[amdgpu]
name=amdgpu
baseurl=https://repo.radeon.com/amdgpu/22.10.3/rhel/7.9/main/x86_64/
enabled=1
priority=50
gpgcheck=1
gpgkey=https://repo.radeon.com/rocm/rocm.gpg.key
EOF
sudo yum clean all
```
:::
:::{tab-item} RHEL 8.4
:sync: RHEL-8.4
:sync: RHEL-8
```shell
sudo tee /etc/yum.repos.d/amdgpu.repo <<EOF
[amdgpu]
name=amdgpu
baseurl=https://repo.radeon.com/amdgpu/22.10.3/rhel/8.4/main/x86_64/
enabled=1
priority=50
gpgcheck=1
gpgkey=https://repo.radeon.com/rocm/rocm.gpg.key
EOF
sudo yum clean all
```
:::
:::{tab-item} RHEL 8.5
:sync: RHEL-8.5
:sync: RHEL-8
```shell
sudo tee /etc/yum.repos.d/amdgpu.repo <<EOF
[amdgpu]
name=amdgpu
baseurl=https://repo.radeon.com/amdgpu/22.10.3/rhel/8.5/main/x86_64/
enabled=1
priority=50
gpgcheck=1
gpgkey=https://repo.radeon.com/rocm/rocm.gpg.key
EOF
sudo yum clean all
```
:::
::::
:::::
:::::{tab-item} SUSE Linux Enterprise Server 15
:sync: SLES15
```shell
sudo tee /etc/zypp/repos.d/amdgpu.repo <<EOF
[amdgpu]
name=amdgpu
baseurl=https://repo.radeon.com/amdgpu/22.10.3/sle/15/main/x86_64
enabled=1
gpgcheck=1
gpgkey=https://repo.radeon.com/rocm/rocm.gpg.key
EOF
sudo zypper ref
```
:::::
::::::
### Upgrade the kernel-mode driver & reboot
Upgrade the kernel mode driver and reboot the system using the following
commands based on your distribution:
::::{tab-set}
:::{tab-item} Ubuntu
:sync: ubuntu
```shell
sudo apt install amdgpu-dkms
sudo reboot
```
:::
:::{tab-item} Red Hat Enterprise Linux
:sync: RHEL
```shell
sudo yum install amdgpu-dkms
sudo reboot
```
:::
:::{tab-item} SUSE Linux Enterprise Server 15
:sync: SLES15
```shell
sudo zypper --gpg-auto-import-keys install amdgpu-dkms
sudo reboot
```
:::
::::
### Update the ROCm repository
Execute the commands below based on your distribution to point the `rocm`
repository to the new release.
::::::{tab-set}
:::::{tab-item} Ubuntu
:sync: ubuntu
::::{tab-set}
:::{tab-item} Ubuntu 18.04
:sync: ubuntu-18.04
```shell
echo "deb [arch=amd64 signed-by=/etc/apt/keyrings/rocm.gpg] https://repo.radeon.com/rocm/apt/5.1.3 bionic main" \
| sudo tee /etc/apt/sources.list.d/rocm.list
echo -e 'Package: *\nPin: release o=repo.radeon.com\nPin-Priority: 600' \
| sudo tee /etc/apt/preferences.d/rocm-pin-600
sudo apt update
```
:::
:::{tab-item} Ubuntu 20.04
:sync: ubuntu-20.04
```shell
echo "deb [arch=amd64 signed-by=/etc/apt/keyrings/rocm.gpg] https://repo.radeon.com/rocm/apt/5.1.3 focal main" \
| sudo tee /etc/apt/sources.list.d/rocm.list
echo -e 'Package: *\nPin: release o=repo.radeon.com\nPin-Priority: 600' \
| sudo tee /etc/apt/preferences.d/rocm-pin-600
sudo apt update
```
:::
::::
:::::
:::::{tab-item} Red Hat Enterprise Linux
:sync: RHEL
::::{tab-set}
:::{tab-item} RHEL 7
:sync: RHEL-7
```shell
sudo tee /etc/yum.repos.d/rocm.repo <<EOF
[ROCm-5.1.3]
name=ROCm5.1.3
baseurl=https://repo.radeon.com/rocm/yum/5.1.3/main
enabled=1
priority=50
gpgcheck=1
gpgkey=https://repo.radeon.com/rocm/rocm.gpg.key
EOF
sudo yum clean all
```
:::
:::{tab-item} RHEL 8
:sync: RHEL-8
```shell
sudo tee /etc/yum.repos.d/rocm.repo <<EOF
[ROCm-5.1.3]
name=ROCm5.1.3
baseurl=https://repo.radeon.com/rocm/yum/5.1.3/main
enabled=1
priority=50
gpgcheck=1
gpgkey=https://repo.radeon.com/rocm/rocm.gpg.key
EOF
sudo yum clean all
```
:::
::::
:::::
:::::{tab-item} SUSE Linux Enterprise Server 15
:sync: SLES15
```shell
sudo tee /etc/zypp/repos.d/rocm.repo <<EOF
[ROCm-5.1.3]
name=ROCm5.1.3
baseurl=https://repo.radeon.com/rocm/zyp/5.1.3/main
enabled=1
gpgcheck=1
gpgkey=https://repo.radeon.com/rocm/rocm.gpg.key
EOF
sudo zypper ref
```
:::::
::::::
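The upgrade commands above call `tee` without `--append`, so the repository file ends up pointing at the new release only. The multi-version install instructions use `tee --append` to keep every release. The following sketch shows the difference on two scratch files (the `single` and `multi` variable names are illustrative):

```shell
single=$(mktemp)    # stands in for a single-version repository file
multi=$(mktemp)     # stands in for a multi-version repository file
for ver in 5.0.2 5.1.3; do
    # Overwrite: only the most recent entry survives.
    echo "[ROCm-$ver]" | tee "$single" > /dev/null
    # Append: every entry is kept.
    echo "[ROCm-$ver]" | tee --append "$multi" > /dev/null
done
echo "single-version file:"; cat "$single"
echo "multi-version file:";  cat "$multi"
```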
### Upgrade the ROCm packages
Your packages can now be upgraded through their meta-packages. See the
following example based on your distribution:
::::{tab-set}
:::{tab-item} Ubuntu
:sync: ubuntu
```shell
sudo apt install --only-upgrade rocm-hip-sdk
```
:::
:::{tab-item} Red Hat Enterprise Linux
:sync: RHEL
```shell
sudo yum update rocm-hip-sdk
```
:::
:::{tab-item} SUSE Linux Enterprise Server 15
:sync: SLES15
```shell
sudo zypper --gpg-auto-import-keys update rocm-hip-sdk
```
:::
::::
## Verification Process
To verify if the upgrade is successful, refer to the
{ref}`post-install-actions-linux` given in the
[Installation](install) section.


@@ -49,59 +49,128 @@ Verify the kernel version using the following steps:
uname -srmv
```
2. Confirm that the obtained kernel version information matches with System
Requirements.
**Example:** The output of the command above lists the kernel version in the
following format:
```shell
```output
Linux 5.15.0-46-generic #44~20.04.5-Ubuntu SMP Fri Jun 24 13:27:29 UTC 2022 x86_64
```
## Confirm the System has a ROCm-Capable GPU
2. Confirm that the obtained kernel version information matches with system
requirements as listed in {ref}`supported_distributions`.
The ROCm platform is designed to support the following GPUs:
## Additional package repositories
```{table} GPU Support for ROCm Programming Models
:name: gpu-support
| **Classification** | **GPU Name** | **GFX ID** | **Product Id** |
|:------------------:|:-------------------------:|:----------:|:--------------:|
| **GFX9 GPUs** | AMD Radeon Instinct™ MI50 | gfx906 | Vega 20 |
| **GFX9 GPUs** | AMD Radeon Instinct™ MI60 | gfx906 | Vega 20 |
| **GFX9 GPUs** | AMD Radeon™ VII | gfx906 | Vega 20 |
| **GFX9 GPUs** | AMD Radeon™ Pro VII | gfx906 | Vega 20 |
| **RDNA GPUs** | AMD Radeon™ Pro W6800 | gfx1030 | Navi 21 GL-XL |
| **RDNA GPUs** | AMD Radeon™ Pro V620 | gfx1030 | Navi 21 GL-XE |
| **CDNA GPUs** | AMD Instinct™ MI100 | gfx908 | Arcturus |
| **CDNA GPUs** | AMD Instinct™ MI200 | gfx90a | Aldebaran |
On some distributions the ROCm packages depend on packages outside the default
package repositories. These extra repositories need to be enabled before
installation. Follow the instructions below based on your distributions.
::::::{tab-set}
:::::{tab-item} Ubuntu
:sync: ubuntu
All packages are available in the default Ubuntu repositories, therefore
no additional repositories need to be added.
:::::
:::::{tab-item} Red Hat Enterprise Linux
:sync: RHEL
::::{rubric} 1. Add the EPEL repository
::::
::::{tab-set}
:::{tab-item} RHEL 8
:sync: RHEL-8
```shell
wget https://dl.fedoraproject.org/pub/epel/epel-release-latest-8.noarch.rpm
sudo rpm -ivh epel-release-latest-8.noarch.rpm
```
### Verify Your System Has a ROCm-Capable GPU
:::
:::{tab-item} RHEL 9
To verify that your system has a ROCm-capable GPU, use these steps:
```shell
wget https://dl.fedoraproject.org/pub/epel/epel-release-latest-9.noarch.rpm
sudo rpm -ivh epel-release-latest-9.noarch.rpm
```
1. Enter the following command:
:::
::::
```shell
lspci | grep -i display
```
::::{rubric} 2. Enable the CodeReady Linux Builder repository
::::
The command displays the details of detected GPUs on the system in the
following format in the case of AMD Instinct™ MI200:
Run the following command and follow the instructions.
```text
c1:00.0 Display controller: Advanced Micro Devices, Inc. [AMD/ATI] Aldebaran
c5:00.0 Display controller: Advanced Micro Devices, Inc. [AMD/ATI] Aldebaran
```
```shell
sudo crb enable
```
2. Verify from the output that the listed product names match with the Product
Id given in the table above.
:::::
:::::{tab-item} SUSE Linux Enterprise Server 15
### Setting Permissions for Groups
Add the perl languages repository.
```shell
sudo zypper addrepo https://download.opensuse.org/repositories/devel:languages:perl/SLE_15_SP4/devel:languages:perl.repo
```
:::::
::::::
## Kernel headers and development packages
The driver package uses
[{abbr}`DKMS (Dynamic Kernel Module Support)`][DKMS-wiki] to build
the `amdgpu-dkms` module (driver) for the installed kernels. This requires the
Linux kernel headers and modules to be installed for each. Usually these are
automatically installed with the kernel, but if you have multiple kernel
versions or you have downloaded the kernel images and not the kernel
meta-packages then they must be manually installed.
[DKMS-wiki]: https://en.wikipedia.org/wiki/Dynamic_Kernel_Module_Support
To install for the currently active kernel run the command corresponding
to your distribution.
::::{tab-set}
:::{tab-item} Ubuntu
:sync: ubuntu
```shell
sudo apt install "linux-headers-$(uname -r)" "linux-modules-extra-$(uname -r)"
```
:::
:::{tab-item} Red Hat Enterprise Linux
:sync: RHEL
```shell
sudo yum install kernel-headers kernel-devel
```
:::
:::{tab-item} SUSE Linux Enterprise Server
:sync: SLES
```shell
sudo zypper install kernel-default-devel
```
:::
::::
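A quick way to see whether build files for the running kernel are already present is to look for the `build` entry under `/lib/modules`. This is a sketch that relies on a common convention across distributions, which is assumed here rather than guaranteed; the variable names are illustrative.

```shell
# Check for kernel build files matching the currently running kernel.
kernel=$(uname -r)
if [ -e "/lib/modules/$kernel/build" ]; then
    headers_status="present"
else
    headers_status="missing"
fi
echo "kernel $kernel: build files $headers_status"
```

If the result is `missing`, install the packages for your distribution as shown above before installing `amdgpu-dkms`.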
## Setting Permissions for Groups
This section provides steps to add any current user to a video group to access
GPU resources.
Use of the video group is recommended for all ROCm-supported operating
systems.
1. To check the groups in your system, issue the following command:
@@ -109,21 +178,17 @@ GPU resources.
groups
```
2. Add yourself to the `render` or `video` group using the following instruction:
2. Add yourself to the `render` and `video` groups using the command:
```shell
sudo usermod -a -G render $LOGNAME
# OR
sudo usermod -a -G video $LOGNAME
sudo usermod -a -G render,video $LOGNAME
```
3. Use of the video group is recommended for all ROCm-supported operating
systems.
To add all future users to the `video` and `render` groups by default, run
the following commands:
To add all future users to the `video` and `render` groups by default, run the following commands:
```shell
echo 'ADD_EXTRA_GROUPS=1' | sudo tee -a /etc/adduser.conf
echo 'EXTRA_GROUPS=video' | sudo tee -a /etc/adduser.conf
echo 'EXTRA_GROUPS=render' | sudo tee -a /etc/adduser.conf
```
```shell
echo 'ADD_EXTRA_GROUPS=1' | sudo tee -a /etc/adduser.conf
# A later EXTRA_GROUPS line overrides an earlier one, so list both groups together.
echo 'EXTRA_GROUPS="video render"' | sudo tee -a /etc/adduser.conf
```
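Group changes made with `usermod` normally take effect only at the next login, so it is worth confirming membership afterwards. The following is a minimal sketch; `has_group` and `my_groups` are hypothetical names, and `id -nG` is used to list the groups of the current session.

```shell
# Succeed if group $2 appears in the space-separated group list $1.
has_group() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *) return 1 ;;
    esac
}

my_groups=$(id -nG 2>/dev/null || true)
for g in render video; do
    if has_group "$my_groups" "$g"; then
        echo "$g: member"
    else
        echo "$g: not a member of this session (log out and back in after usermod)"
    fi
done
```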


@@ -1,48 +1,5 @@
# Quick Start (Linux)
## Install Prerequisites
The driver package uses
[{abbr}`DKMS (Dynamic Kernel Module Support)`][DKMS-wiki] to build
the `amdgpu-dkms` module (driver) for the installed kernels. This requires the Linux
kernel headers and modules to be installed for each. Usually these are
automatically installed with the kernel, but if you have multiple kernel
versions or you have downloaded the kernel images and not the kernel
meta-packages then they must be manually installed.
[DKMS-wiki]: https://en.wikipedia.org/wiki/Dynamic_Kernel_Module_Support
To install for the currently active kernel run the command corresponding
to your distribution.
::::{tab-set}
:::{tab-item} Ubuntu
:sync: ubuntu
```shell
sudo apt install "linux-headers-$(uname -r)" "linux-modules-extra-$(uname -r)"
```
:::
:::{tab-item} Red Hat Enterprise Linux
:sync: RHEL
```shell
sudo yum install kernel-headers kernel-devel
```
:::
:::{tab-item} SUSE Linux Enterprise Server
:sync: SLES
```shell
sudo zypper install kernel-default-devel
```
:::
::::
## Add Repositories
::::::{tab-set}
@@ -92,6 +49,7 @@ EOF
# ROCm repository for jammy
sudo tee /etc/apt/sources.list.d/rocm.list <<'EOF'
deb [arch=amd64 signed-by=/etc/apt/keyrings/rocm.gpg] https://repo.radeon.com/rocm/apt/debian jammy main
EOF
echo -e 'Package: *\nPin: release o=repo.radeon.com\nPin-Priority: 600' | sudo tee /etc/apt/preferences.d/rocm-pin-600
```


@@ -1,282 +0,0 @@
# Upgrade (Linux)
This section explains how to upgrade the existing kernel-mode driver and ROCm
packages to the latest version. It assumes that a version of the kernel-mode
driver and the ROCm software stack is already installed on the system.
```{note}
Package upgrade is applicable to single-version packages only. If the preference
is to install an updated version of the ROCm stack along with the currently
installed version, refer to the [](install) page.
```
You may use the following upgrade methods to upgrade ROCm:
- Package manager method
- Installer script method
## Package Manager Method
To upgrade the system with the desired ROCm release using the package manager
method, follow the steps below:
1. **Update the AMDGPU stack repository** Ensure you have updated the AMDGPU
repository.
2. **Upgrade the kernel-mode driver and reboot the system** Ensure you have
upgraded the kernel-mode driver and rebooted the system.
3. **Update the ROCm repository** Ensure you have updated the ROCm repository
with the desired ROCm release.
4. **Upgrade the ROCm meta-packages** Upgrade the ROCm meta-packages.
5. **Verify the upgrade for the applicable distributions** Verify if the
upgrade is successful.
To upgrade ROCm on different Linux distributions, refer to the sections below
for specific commands.
::::::{tab-set}
:::::{tab-item} Ubuntu
:sync: ubuntu
::::{rubric} Update the AMDGPU Stack Repository
::::
::::{tab-set}
:::{tab-item} Ubuntu 20.04
:sync: ubuntu-20.04
```shell
echo 'deb [arch=amd64 signed-by=/etc/apt/trusted.gpg.d/rocm-keyring.gpg] https://repo.radeon.com/amdgpu/5.4.3/ubuntu focal main' | sudo tee /etc/apt/sources.list.d/amdgpu.list
sudo apt update
```
:::
:::{tab-item} Ubuntu 22.04
:sync: ubuntu-22.04
```shell
echo 'deb [arch=amd64 signed-by=/etc/apt/trusted.gpg.d/rocm-keyring.gpg] https://repo.radeon.com/amdgpu/5.4.3/ubuntu jammy main' | sudo tee /etc/apt/sources.list.d/amdgpu.list
sudo apt update
```
:::
::::
Upgrade the kernel mode driver and reboot the system using the following
commands:
```shell
sudo apt install amdgpu-dkms
sudo reboot
```
::::{rubric} Update the ROCm Stack Repository
::::
::::{tab-set}
:::{tab-item} Ubuntu 20.04
:sync: ubuntu-20.04
```shell
echo "deb [arch=amd64 signed-by=/etc/apt/trusted.gpg.d/rocm-keyring.gpg] https://repo.radeon.com/rocm/apt/5.4.3 focal main" | sudo tee /etc/apt/sources.list.d/rocm.list
echo -e 'Package: *\nPin: release o=repo.radeon.com\nPin-Priority: 600' | sudo tee /etc/apt/preferences.d/rocm-pin-600
sudo apt update
```
:::
:::{tab-item} Ubuntu 22.04
:sync: ubuntu-22.04
```shell
echo "deb [arch=amd64 signed-by=/etc/apt/trusted.gpg.d/rocm-keyring.gpg] https://repo.radeon.com/rocm/apt/5.4.3 jammy main" | sudo tee /etc/apt/sources.list.d/rocm.list
echo -e 'Package: *\nPin: release o=repo.radeon.com\nPin-Priority: 600' | sudo tee /etc/apt/preferences.d/rocm-pin-600
sudo apt update
```
:::
::::
::::{rubric} Upgrade the ROCm Meta-packages
::::
Your packages can be upgraded now through their meta-packages, for example:
```shell
sudo apt install --only-upgrade rocm-hip-sdk
```
:::::
:::::{tab-item} Red Hat Enterprise Linux
:sync: RHEL
::::{rubric} Update the AMDGPU Stack Repository
::::
::::{tab-set}
:::{tab-item} RHEL 8.6
:sync: RHEL-8.6
```shell
sudo tee --append /etc/yum.repos.d/amdgpu.repo <<EOF
[amdgpu]
Name=amdgpu
baseurl=https://repo.radeon.com/amdgpu/5.4.3/rhel/8.6/main/x86_64/
enabled=1
priority=50
gpgcheck=1
gpgkey=https://repo.radeon.com/rocm/rocm.gpg.key
EOF
sudo yum clean all
```
:::
:::{tab-item} RHEL 8.7
:sync: RHEL-8.7
```shell
sudo tee --append /etc/yum.repos.d/amdgpu.repo <<EOF
[amdgpu]
Name=amdgpu
baseurl=https://repo.radeon.com/amdgpu/5.4.3/rhel/8.7/main/x86_64/
enabled=1
priority=50
gpgcheck=1
gpgkey=https://repo.radeon.com/rocm/rocm.gpg.key
EOF
sudo yum clean all
```
:::
:::{tab-item} RHEL 9.1
:sync: RHEL-9.1
```shell
sudo tee --append /etc/yum.repos.d/amdgpu.repo <<EOF
[amdgpu]
Name=amdgpu
baseurl=https://repo.radeon.com/amdgpu/5.4.3/rhel/9.2/main/x86_64/
enabled=1
priority=50
gpgcheck=1
gpgkey=https://repo.radeon.com/rocm/rocm.gpg.key
EOF
sudo yum clean all
```
:::
::::
::::{rubric} Upgrade the Kernel-mode Driver and Reboot the System
::::
Upgrade the kernel mode driver and reboot the system using the following
commands:
```shell
sudo yum install amdgpu-dkms
sudo reboot
```
::::{rubric} Update the ROCm Repository
::::
```shell
sudo tee --append /etc/yum.repos.d/rocm.repo <<EOF
[ROCm-5.4.3]
Name=ROCm5.4.3
baseurl=https://repo.radeon.com/rocm/5.4.3/main
enabled=1
priority=50
gpgcheck=1
gpgkey=https://repo.radeon.com/rocm/rocm.gpg.key
EOF
sudo yum clean all
```
::::{rubric} Upgrade the ROCm Meta-packages
::::
Your packages can be upgraded now through their meta-packages, for example:
```shell
sudo yum update rocm-hip-sdk
```
:::::
:::::{tab-item} SUSE Linux Enterprise Server 15
:sync: SLES15
::::{rubric} Update the AMDGPU Stack Repository
::::
```shell
sudo tee --append /etc/zypp/repos.d/amdgpu.repo <<EOF
[amdgpu]
name=amdgpu
baseurl=https://repo.radeon.com/amdgpu/5.4.3/sle/15.4/main/x86_64
enabled=1
gpgcheck=1
gpgkey=https://repo.radeon.com/rocm/rocm.gpg.key
EOF
sudo zypper ref
```
::::{rubric} Upgrade the Kernel-mode Driver and Reboot the System
::::
Upgrade the kernel mode driver and reboot the system using the following
commands:
```shell
sudo zypper --gpg-auto-import-keys install amdgpu-dkms
sudo reboot
```
::::{rubric} Update the ROCm Stack Repository
::::
```shell
sudo tee --append /etc/zypp/repos.d/rocm.repo <<EOF
name=rocm
baseurl=https://repo.radeon.com/amdgpu/5.4.3/sle/15.4/main/x86_64
enabled=1
gpgcheck=1
gpgkey=https://repo.radeon.com/rocm/rocm.gpg.key
EOF
sudo zypper ref
```
::::{rubric} Upgrade the ROCm Meta-packages
::::
Your packages can be upgraded now through their meta-packages, for example:
```shell
sudo zypper --gpg-auto-import-keys update -y rocm-hip-sdk
```
:::::
::::::
## Installer Script Method
The installer script method automates the upgrade process for the AMDGPU and
ROCm stack. The `amdgpu-install` script handles the complete upgrade process for
ROCm, including updating the required repositories and upgrading the desired
meta-packages.
The upgrade procedure is exactly the same as a first-time installation. Refer
to the {ref}`install-script-method` section for the exact procedure to follow.
## Verification Process
To verify that the upgrade was successful, refer to the
{ref}`post-install-actions-linux` given in the
[Installation](install) section.


@@ -1,5 +0,0 @@
# AI/ML/Inferencing
To demonstrate some of the potential usages of ROCm for AI/ML/DL/Inferencing we
provide a detailed example of a
[ROCm implementation of Inception v3 using the PyTorch framework](./inception_casestudy/inception_casestudy.md).

docs/examples/all.md Normal file

@@ -0,0 +1,25 @@
# All Tutorial Material
:::::{grid} 1 1 2 2
:gutter: 1
:::{grid-item-card} ROCm Examples
:link: https://github.com/amd/rocm-examples
:link-type: url
Sample code demonstrating and explaining the use of the HIP API as well as
ROCm-accelerated domain libraries.
:::
:::{grid-item-card} AI/ML/Inferencing
:link: machine_learning/all
:link-type: doc
Detailed walkthroughs of specific use-cases driven by frameworks using ROCm
acceleration.
- [Implementing Inception V3 on ROCm with PyTorch](machine_learning/pytorch_inception.md)
- [Optimizing Inference with MIGraphX](machine_learning/migraphx_optimization.md)
:::
:::::


@@ -0,0 +1,20 @@
# Machine Learning, Deep Learning, and Artificial Intelligence
:::::{grid} 1 1 2 2
:gutter: 1
:::{grid-item-card} Inception V3 with PyTorch
:link: pytorch_inception
:link-type: doc
A collection of detailed and guided examples for working with Inception V3 with PyTorch on ROCm.
:::
:::{grid-item-card} Optimizing Inference with MIGraphX
:link: migraphx_optimization
:link-type: doc
Walkthroughs of optimizing inference using MIGraphX.
:::
:::::


@@ -0,0 +1,338 @@
# Inference Optimization with MIGraphX
The following sections cover inferencing and introduce MIGraphX.
## Inference
Inference is where capabilities learned during Deep Learning training are put to work. It refers to using a fully trained neural network to draw conclusions (predictions) from data that the model has never seen before. Deep Learning inferencing is achieved by feeding new data, such as new images, to the network, giving the Deep Neural Network (DNN) a chance to classify the image.
Taking MNIST as an example, the DNN can be fed new images of handwritten digits, allowing the neural network to classify them. A fully trained DNN should make accurate predictions about what an image represents, and inference cannot happen without training.
## MIGraphX Introduction
MIGraphX is a graph compiler focused on accelerating Machine Learning inference, and it can target AMD GPUs and CPUs. MIGraphX accelerates Machine Learning models by applying several graph-level transformations and optimizations. These optimizations include:
- Operator fusion
- Arithmetic simplifications
- Dead-code elimination
- Common subexpression elimination (CSE)
- Constant propagation
After applying all these transformations, MIGraphX emits code for the AMD GPU by calling MIOpen or rocBLAS, or by creating HIP kernels for a particular operator. MIGraphX can also target CPUs using the DNNL or ZenDNN libraries.
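To make two of the listed optimizations concrete, the following is a minimal toy sketch of constant propagation and dead-code elimination on a small expression graph. It is purely illustrative: the `Node` class, the graph layout, and both helper functions are hypothetical and do not reflect MIGraphX's internal representation or API.

```python
# Toy illustration only: this is NOT MIGraphX's internal representation.
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    op: str              # "const", "param", "add", or "mul"
    args: tuple = ()     # names of the input nodes
    value: float = 0.0   # payload for "const" nodes


def fold_constants(graph):
    """Replace add/mul nodes whose inputs are all constants with a const node."""
    folded = {}
    for name, node in graph.items():  # assumes the dict is in topological order
        inputs = [folded[a] for a in node.args]
        if node.op in ("add", "mul") and all(i.op == "const" for i in inputs):
            a, b = (i.value for i in inputs)
            folded[name] = Node("const", value=a + b if node.op == "add" else a * b)
        else:
            folded[name] = node
    return folded


def eliminate_dead_code(graph, output):
    """Keep only the nodes that are reachable from the output node."""
    live, stack = set(), [output]
    while stack:
        name = stack.pop()
        if name not in live:
            live.add(name)
            stack.extend(graph[name].args)
    return {name: node for name, node in graph.items() if name in live}


graph = {
    "x": Node("param"),
    "two": Node("const", value=2.0),
    "three": Node("const", value=3.0),
    "six": Node("mul", ("two", "three")),  # foldable: both inputs are constants
    "unused": Node("add", ("x", "two")),   # dead: never reaches the output
    "y": Node("mul", ("x", "six")),
}
optimized = eliminate_dead_code(fold_constants(graph), "y")
print(sorted(optimized))
```

After both passes, only the parameter, the folded constant, and the output node remain, so less work is left for the kernels that eventually run.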
MIGraphX provides easy-to-use APIs in C++ and Python to import machine learning models in ONNX or TensorFlow format. Users can compile, save, load, and run these models using MIGraphX's C++ and Python APIs. Internally, MIGraphX parses ONNX or TensorFlow models into an internal graph representation where each operator in the model is mapped to an operator within MIGraphX. Each of these operators defines various attributes such as:
- Number of arguments
- Type of arguments
- Shape of arguments
After optimization passes, all these operators get mapped to different kernels on GPUs or CPUs.
After importing a model into MIGraphX, the model is represented as a `migraphx::program`. A `migraphx::program` is made up of `migraphx::module` objects. The program can consist of several modules, but it always has one `main_module`. Modules are made up of instructions, which are referenced through `migraphx::instruction_ref`. Each instruction contains a `migraphx::op` and the arguments to that operator.
## Installing MIGraphX
There are three options for installing MIGraphX. MIGraphX depends on ROCm libraries, so the following steps assume that ROCm is already installed on the machine.
### Option 1: Installing Binaries
To install MIGraphX on Debian-based systems like Ubuntu, use the following command:
```bash
sudo apt update && sudo apt install -y migraphx
```
The header files and libraries are installed under `/opt/rocm-<version>`, where `<version>` is the ROCm version.
### Option 2: Building from Source
There are two ways to build the MIGraphX sources.
- [Use the ROCm build tool](https://github.com/ROCmSoftwarePlatform/AMDMIGraphX#use-the-rocm-build-tool-rbuild) - This approach uses [rbuild](https://github.com/RadeonOpenCompute/rbuild) to install the prerequisites and build the libraries with just one command.
or
- [Use CMake](https://github.com/ROCmSoftwarePlatform/AMDMIGraphX#use-cmake-to-build-migraphx) - This approach uses a script to install the prerequisites, then uses CMake to build the source.
For detailed steps on building from source and installing dependencies, refer to the following `README` file:
[https://github.com/ROCmSoftwarePlatform/AMDMIGraphX#building-from-source](https://github.com/ROCmSoftwarePlatform/AMDMIGraphX#building-from-source)
### Option 3: Use Docker
To use Docker, follow these steps:
1. The easiest way to set up the development environment is to use Docker. To build the Docker image from scratch, first clone the MIGraphX repository by running:
```bash
git clone --recursive https://github.com/ROCmSoftwarePlatform/AMDMIGraphX
```
2. The repository contains a Dockerfile from which you can build a Docker image as:
```bash
docker build -t migraphx .
```
3. Then, to enter the development environment, use `docker run`:
```bash
docker run --device='/dev/kfd' --device='/dev/dri' -v=`pwd`:/code/AMDMIGraphX -w /code/AMDMIGraphX --group-add video -it migraphx
```
The Docker image contains all the prerequisites required for the installation, so users can go to the folder `/code/AMDMIGraphX` and follow the steps mentioned in [Option 2: Building from Source](#option-2-building-from-source).
## MIGraphX Example
MIGraphX provides both C++ and Python APIs. The following sections show examples of both using the Inception v3 model. To walk through the examples, fetch the Inception v3 ONNX model by running the following:
```py
import torch
import torchvision.models as models
inception = models.inception_v3(pretrained=True)
torch.onnx.export(inception,torch.randn(1,3,299,299), "inceptioni1.onnx")
```
This creates `inceptioni1.onnx`, which can be imported into MIGraphX using the C++ or Python API.
### MIGraphX Python API
Follow these steps:
1. To import the MIGraphX module in a Python script, set `PYTHONPATH` to the MIGraphX libraries installation. If the binaries were installed using the steps mentioned in [Option 1: Installing Binaries](#option-1-installing-binaries), run the following command:
```bash
export PYTHONPATH=$PYTHONPATH:/opt/rocm/lib
```
2. The following script shows how to use the Python API to import the ONNX model, compile it, and run inference on it. Set `LD_LIBRARY_PATH` to `/opt/rocm/lib` if required.
```py
# import migraphx and numpy
import migraphx
import numpy as np
# import and parse inception model
model = migraphx.parse_onnx("inceptioni1.onnx")
# compile model for the GPU target
model.compile(migraphx.get_target("gpu"))
# optionally print compiled model
model.print()
# create random input image
input_image = np.random.rand(1, 3, 299, 299).astype('float32')
# feed image to model, 'x.1' is the input param name
results = model.run({'x.1': input_image})
# get the results back
result_np = np.array(results[0])
# print the inferred class of the input image
print(np.argmax(result_np))
```
Find additional examples of Python API in the `/examples` directory of the MIGraphX repository.
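The script above prints only the single most likely class. As a small, hedged extension, the following sketch ranks the top-k classes and converts the scores to probabilities with a softmax. It assumes the model output is a flat list of unnormalized scores (logits), which is an assumption about the exported model rather than something the MIGraphX API guarantees. The `top_k` helper and the sample `logits` values are hypothetical and use only the Python standard library.

```python
import math


def top_k(scores, k=5):
    """Return the k highest-scoring (class_index, probability) pairs."""
    peak = max(scores)
    # Subtract the maximum before exponentiating for numerical stability.
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [(i, exps[i] / total) for i in ranked[:k]]


# Stand-in for the flattened model output; a real Inception v3 output has 1000 entries.
logits = [0.1, 2.0, -1.0, 0.5]
best = top_k(logits, k=2)
print([index for index, _ in best])
```

With the inference script above, the scores could be obtained with something like `np.array(results[0]).ravel().tolist()` before calling `top_k`.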
### MIGraphX C++ API
Follow these steps:
1. The following is a minimal example that shows how to use the MIGraphX C++ API to load an ONNX file, compile it for the GPU, and run inference on it. To use the MIGraphX C++ API, you only need to include the `migraphx.hpp` header. This example runs inference on the Inception v3 model.
```c++
#include <vector>
#include <string>
#include <algorithm>
#include <cstdlib>   // std::rand, std::srand
#include <ctime>
#include <iostream>  // std::cout
#include <migraphx/migraphx.hpp>
int main(int argc, char** argv)
{
migraphx::program prog;
migraphx::onnx_options onnx_opts;
// import and parse onnx file into migraphx::program
prog = migraphx::parse_onnx("inceptioni1.onnx", onnx_opts);
// print imported model
prog.print();
migraphx::target targ = migraphx::target("gpu");
migraphx::compile_options comp_opts;
comp_opts.set_offload_copy();
// compile for the GPU
prog.compile(targ, comp_opts);
// print the compiled program
prog.print();
// randomly generate input image
// of shape (1, 3, 299, 299)
std::srand(unsigned(std::time(nullptr)));
std::vector<float> input_image(1*299*299*3);
std::generate(input_image.begin(), input_image.end(), std::rand);
// users need to provide data for the input
// parameters in order to run inference
// you can query the MIGraphX program for the parameters
migraphx::program_parameters prog_params;
auto param_shapes = prog.get_parameter_shapes();
auto input = param_shapes.names().front();
// create argument for the parameter
prog_params.add(input, migraphx::argument(param_shapes[input], input_image.data()));
// run inference
auto outputs = prog.eval(prog_params);
// read back the output
float* results = reinterpret_cast<float*>(outputs[0].data());
float* max = std::max_element(results, results + 1000);
int answer = max - results;
std::cout << "answer: " << answer << std::endl;
}
```
2. To compile this program, you can use CMake and you only need to link the `migraphx::c` library to use MIGraphX's C++ API. The following is the `CMakeLists.txt` file that can build the earlier example:
```cmake
cmake_minimum_required(VERSION 3.5)
project (CAI)
set (CMAKE_CXX_STANDARD 14)
set (EXAMPLE inception_inference)
list (APPEND CMAKE_PREFIX_PATH /opt/rocm/hip /opt/rocm)
find_package (migraphx)
message("source file: " ${EXAMPLE}.cpp " ---> bin: " ${EXAMPLE})
add_executable(${EXAMPLE} ${EXAMPLE}.cpp)
target_link_libraries(${EXAMPLE} migraphx::c)
```
3. To build the executable file, run the following from the directory containing the `inception_inference.cpp` file:
```bash
mkdir build
cd build
cmake ..
make -j$(nproc)
./inception_inference
```
:::{note}
Set `LD_LIBRARY_PATH` to `/opt/rocm/lib` if required during the build. Additional examples can be found in the MIGraphX repository under the `/examples/` directory.
:::
## Tuning MIGraphX
MIGraphX uses MIOpen kernels to target AMD GPUs. For a model compiled with MIGraphX, tune MIOpen so that it picks the best possible kernel implementation. MIOpen tuning can result in a significant performance boost. To enable tuning, set the environment variable `MIOPEN_FIND_ENFORCE=3`.
:::{note}
The tuning process can take a long time to finish.
:::
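As a minimal sketch of how the variable could be set from Python for a child process, the helper below copies an environment and adds `MIOPEN_FIND_ENFORCE=3`. The `tuning_env` function and the `inference.py` script name in the comment are hypothetical; only the variable name and value come from the text above.

```python
import os


def tuning_env(base=None):
    """Return a copy of the environment with MIOpen tuning enforced."""
    env = dict(os.environ if base is None else base)
    env["MIOPEN_FIND_ENFORCE"] = "3"  # value taken from the text above
    return env


env = tuning_env({"PATH": "/usr/bin"})
print(env["MIOPEN_FIND_ENFORCE"])

# Hypothetical usage with a script of your own:
# subprocess.run(["python3", "inference.py"], env=tuning_env(), check=True)
```

Passing a copied environment to the child process keeps the setting local to the tuning run instead of changing it for the whole shell session.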
**Example:** The average inference time of the Inception model example shown previously, measured over 100 iterations using untuned kernels, is 0.01383ms. After tuning, it drops to 0.00459ms, which is roughly a 3x improvement. This result is from ROCm v4.5 on an MI100 GPU.
:::{note}
The results may vary depending on the system configurations.
:::
For reference, the following console output shows the inference runs for only the first 10 iterations with both untuned and tuned kernels:
```console
### UNTUNED ###
iterator : 0
Inference complete
Inference time: 0.063ms
iterator : 1
Inference complete
Inference time: 0.008ms
iterator : 2
Inference complete
Inference time: 0.007ms
iterator : 3
Inference complete
Inference time: 0.007ms
iterator : 4
Inference complete
Inference time: 0.007ms
iterator : 5
Inference complete
Inference time: 0.008ms
iterator : 6
Inference complete
Inference time: 0.007ms
iterator : 7
Inference complete
Inference time: 0.028ms
iterator : 8
Inference complete
Inference time: 0.029ms
iterator : 9
Inference complete
Inference time: 0.029ms
### TUNED ###
iterator : 0
Inference complete
Inference time: 0.063ms
iterator : 1
Inference complete
Inference time: 0.004ms
iterator : 2
Inference complete
Inference time: 0.004ms
iterator : 3
Inference complete
Inference time: 0.004ms
iterator : 4
Inference complete
Inference time: 0.004ms
iterator : 5
Inference complete
Inference time: 0.004ms
iterator : 6
Inference complete
Inference time: 0.004ms
iterator : 7
Inference complete
Inference time: 0.004ms
iterator : 8
Inference complete
Inference time: 0.004ms
iterator : 9
Inference complete
Inference time: 0.004ms
```
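In both logs the first iteration is noticeably slower than the rest, which is typical of a warm-up run. The following is a hedged sketch of how an average such as the one quoted above could be measured while skipping warm-up iterations. The `benchmark` helper is hypothetical, and it is an assumption that excluding warm-up runs is appropriate here, since the source does not say how its averages were computed.

```python
import time


def benchmark(run_once, iterations=100, warmup=1):
    """Return (mean, per_run_times) in seconds, skipping `warmup` initial runs."""
    times = []
    for i in range(warmup + iterations):
        start = time.perf_counter()
        run_once()
        elapsed = time.perf_counter() - start
        if i >= warmup:
            times.append(elapsed)
    return sum(times) / len(times), times


# Stand-in workload; with MIGraphX this could be a lambda that calls model.run(...).
mean, times = benchmark(lambda: sum(range(1000)), iterations=10)
print(f"mean over {len(times)} runs: {mean:.6f}s")

# Speedup implied by the averages quoted in the example above.
speedup = 0.01383 / 0.00459
print(f"speedup: {speedup:.2f}x")
```

Running the same harness once with untuned and once with tuned kernels gives two means whose ratio is the speedup.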
### YModel
The best inference performance through MIGraphX is conditioned upon having tuned kernel configurations stored in a `/home` local User Database (DB). If a user were to move their model to a different server or allow a different user to use it, they would have to run through the MIOpen tuning process again to populate the next User DB with the best kernel configurations and corresponding solvers.
Tuning is time consuming, and if the users have not performed tuning, they would see discrepancies between expected or claimed inference performance and actual inference performance. This has led to repetitive and time-consuming tuning tasks for each user.
MIGraphX introduces a feature, known as YModel, that stores the kernel config parameters found during tuning into a `.mxr` file. This ensures the same level of expected performance, even when a model is copied to a different user/system.
The YModel feature is available starting from ROCm 5.4.1 and UIF 1.1.
#### YModel Example
Through `migraphx-driver`, you can generate `.mxr` files with the tuning information stored inside them by passing the additional flags `--binary --output model.mxr` along with the rest of the necessary flags.
For example, to generate a `.mxr` file from the ONNX model, use the following:
```bash
./path/to/migraphx-driver compile --onnx resnet50.onnx --enable-offload-copy --binary --output resnet50.mxr
```
To run generated `.mxr` files through `migraphx-driver`, use the following:
```bash
./path/to/migraphx-driver run --migraphx resnet50.mxr --enable-offload-copy
```
Alternatively, you can use MIGraphX's C++ or Python API to generate a `.mxr` file. Refer to {numref}`image018` for an example.
```{figure} ../../data/understand/deep_learning/image.018.png
:name: image018
:align: center

Generating a `.mxr` File
```
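For readers who cannot view the figure, the following is a minimal sketch of the same idea using the Python API. It assumes that the `migraphx.save` and `migraphx.load` functions are available in the installed MIGraphX version; it is not a reproduction of the code in the figure. The helper names `compile_and_save` and `load_compiled` are hypothetical.

```python
def compile_and_save(onnx_path, mxr_path):
    """Compile an ONNX model for the GPU and store the result as a .mxr file."""
    import migraphx  # imported lazily so the sketch can be read without ROCm installed

    program = migraphx.parse_onnx(onnx_path)
    program.compile(migraphx.get_target("gpu"))
    migraphx.save(program, mxr_path)
    return program


def load_compiled(mxr_path):
    """Load a previously saved .mxr file so it does not need to be recompiled."""
    import migraphx

    return migraphx.load(mxr_path)


# Example (requires MIGraphX and a supported GPU):
# compile_and_save("resnet50.onnx", "resnet50.mxr")
# model = load_compiled("resnet50.mxr")
```

Loading the saved file on another system avoids repeating compilation and tuning there, which is the purpose of the YModel feature described above.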


@@ -1,4 +1,4 @@
# Training and Inference Walk-through: Inception V3 with PyTorch
# Inception V3 with PyTorch
## Deep Learning Training
@@ -15,11 +15,11 @@ Training occurs in multiple phases for every batch of training data. {numref}`Ty
:::{table} Types of Training Phases
:name: TypesOfTrainingPhases
:widths: auto
| Types of Phases | |
| ----------- | ----------- |
| Forward Pass | The input features are fed into the model, whose parameters may be randomly initialized initially. Activations (outputs) of each layer are retained during this pass to help in the loss gradient computation during the backward pass. |
| Loss Computation | The output is compared against the target outputs, and the loss is computed. |
| Backward Pass | The loss is propagated backward, and the model's error gradients are computed and stored for each trainable parameter. |
| Types of Phases | |
| ----------------- | --- |
| Forward Pass | The input features are fed into the model, whose parameters may initially be randomly initialized. Activations (outputs) of each layer are retained during this pass to help in the loss gradient computation during the backward pass. |
| Loss Computation | The output is compared against the target outputs, and the loss is computed. |
| Backward Pass | The loss is propagated backward, and the model's error gradients are computed and stored for each trainable parameter. |
| Optimization Pass | The optimization algorithm updates the model parameters using the stored error gradients. |
:::
@@ -44,19 +44,19 @@ The following sections contain case studies for the Inception v3 model.
### Inception v3 with PyTorch
Convolution Neural Networks are forms of artificial neural networks commonly used for image processing. One of the core layers of such a network is the convolutional layer, which convolves the input with a weight tensor and passes the result to the next layer. Inception v3 [1] is an architectural development over the ImageNet competition-winning entry, AlexNet, using more profound and broader networks while attempting to meet computational and memory budgets.
Convolutional Neural Networks are a form of artificial neural network commonly used for image processing. One of the core layers of such a network is the convolutional layer, which convolves the input with a weight tensor and passes the result to the next layer. Inception v3[^inception_arch] is an architectural development over the ImageNet competition-winning entry, AlexNet, using deeper and wider networks while attempting to meet computational and memory budgets.
The implementation uses PyTorch as a framework. This case study utilizes `torchvision` [2], a repository of popular datasets and model architectures, for obtaining the model. `torchvision` also provides pre-trained weights as a starting point to develop new models or fine-tune the model for a new task.
The implementation uses PyTorch as a framework. This case study utilizes `torchvision`[^torch_vision], a repository of popular datasets and model architectures, for obtaining the model. `torchvision` also provides pre-trained weights as a starting point to develop new models or fine-tune the model for a new task.
#### Evaluating a Pre-Trained Model
The Inception v3 model introduces a simple image classification task with the pre-trained model. This does not involve training but utilizes an already pre-trained model from `torchvision`.
This example is adapted from the PyTorch research hub page on Inception v3 [3].
This example is adapted from the PyTorch research hub page on Inception v3[^torch_vision_inception].
Follow these steps:
1. Run the PyTorch ROCm-based Docker image or refer to the section [Installing PyTorch](https://docs.amd.com/bundle/ROCm-Deep-Learning-Guide-v5.4-/page/Frameworks_Installation.html#d1667e113) for setting up a PyTorch environment on ROCm.
1. Run the PyTorch ROCm-based Docker image or refer to the section [Installing PyTorch](/how_to/pytorch_install/pytorch_install.md) for setting up a PyTorch environment on ROCm.
```dockerfile
docker run -it -v $HOME:/data --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --device=/dev/kfd --device=/dev/dri --group-add video --ipc=host --shm-size 8G rocm/pytorch:latest
@@ -146,16 +146,16 @@ The previous section focused on downloading and using the Inception v3 model for
Follow these steps:
1. Run the PyTorch ROCm Docker image or refer to the section [Installing PyTorch](https://docs.amd.com/bundle/ROCm-Deep-Learning-Guide-v5.4-/page/Frameworks_Installation.html#d1667e113) for setting up a PyTorch environment on ROCm.
1. Run the PyTorch ROCm Docker image or refer to the section [Installing PyTorch](/how_to/pytorch_install/pytorch_install.md) for setting up a PyTorch environment on ROCm.
```dockerfile
docker pull rocm/pytorch:latest
docker run -it --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --device=/dev/kfd --device=/dev/dri --group-add video --ipc=host --shm-size 8G rocm/pytorch:latest
```
2. Download an ImageNet database. For this example, the `tiny-imagenet-200` [4], a smaller ImageNet variant with 200 image classes and a training dataset with 100,000 images, was downsized to 64x64 color images.
2. Download an ImageNet database. This example uses `tiny-imagenet-200`[^Stanford_deep_learning], a smaller ImageNet variant with 200 image classes and a training dataset of 100,000 images, downsized to 64x64 color images.
```py
```bash
wget http://cs231n.stanford.edu/tiny-imagenet-200.zip
```
@@ -357,7 +357,7 @@ Follow these steps:
model.to(device)
```
13. Set the loss criteria. For this example, Cross Entropy Loss [5] is used.
13. Set the loss criteria. For this example, Cross Entropy Loss[^cross_entropy] is used.
```py
criterion = torch.nn.CrossEntropyLoss()
@@ -583,7 +583,7 @@ Follow these steps:
import torch.optim as optim
```
10. Set the loss criteria. For this example, Cross Entropy Loss [5] is used.
10. Set the loss criteria. For this example, Cross Entropy Loss[^cross_entropy] is used.
```py
criterion = nn.CrossEntropyLoss()
@@ -1164,7 +1164,7 @@ To prepare the data for training, follow these steps:
---
```
8. A model needs a loss function and an optimizer for training. Since this is a binary classification problem and the model outputs a probability (a single-unit layer with a sigmoid activation), use [losses.BinaryCrossentropy](https://www.tensorflow.org/api_docs/python/tf/keras/losses/BinaryCrossentropy) loss function.
8. A model needs a loss function and an optimizer for training. Since this is a binary classification problem and the model outputs a probability (a single-unit layer with a sigmoid activation), use [`losses.BinaryCrossentropy`](https://www.tensorflow.org/api_docs/python/tf/keras/losses/BinaryCrossentropy) loss function.
```py
model.compile(loss=losses.BinaryCrossentropy(from_logits=True),
@@ -1272,422 +1272,14 @@ To prepare the data for training, follow these steps:
export_model.predict(examples)
```
## Optimization
The following sections cover inferencing and introduces MIGraphX.
### Inferencing
The inference is where capabilities learned during Deep Learning training are put to work. It refers to using a fully trained neural network to make conclusions (predictions) on unseen data that the model has never interacted with before. Deep Learning inferencing is achieved by feeding new data, such as new images, to the network, giving the Deep Neural Network a chance to classify the image.
Taking our previous example of MNIST, the DNN can be fed new images of handwritten digit images, allowing the neural network to classify digits. A fully trained DNN should make accurate predictions about what an image represents, and inference cannot happen without training.
### MIGraphX Introduction
MIGraphX is a graph compiler focused on accelerating the Machine Learning inference that can target AMD GPUs and CPUs. MIGraphX accelerates the Machine Learning models by leveraging several graph-level transformations and optimizations. These optimizations include:
- Operator fusion
- Arithmetic simplifications
- Dead-code elimination
- Common subexpression elimination (CSE)
- Constant propagation
After doing all these transformations, MIGraphX emits code for the AMD GPU by calling to MIOpen or rocBLAS or creating HIP kernels for a particular operator. MIGraphX can also target CPUs using DNNL or ZenDNN libraries.
MIGraphX provides easy-to-use APIs in C++ and Python to import machine models in ONNX or TensorFlow. Users can compile, save, load, and run these models using MIGraphX's C++ and Python APIs. Internally, MIGraphX parses ONNX or TensorFlow models into internal graph representation where each operator in the model gets mapped to an operator within MIGraphX. Each of these operators defines various attributes such as:
- Number of arguments
- Type of arguments
- Shape of arguments
After optimization passes, all these operators get mapped to different kernels on GPUs or CPUs.
After importing a model into MIGraphX, the model is represented as `migraphx::program`. `migraphx::program` is made up of `migraphx::module`. The program can consist of several modules, but it always has one main_module. Modules are made up of `migraphx::instruction_ref`. Instructions contain the `migraphx::op` and arguments to the operator.
### MIGraphX Installation
There are three options to get started with MIGraphX installation. MIGraphX depends on ROCm libraries; assume that the machine has ROCm installed.
#### Option 1: Installing Binaries
To install MIGraphX on Debian-based systems like Ubuntu, use the following command:
```bash
sudo apt update && sudo apt install -y migraphx
```
The header files and libraries are installed under `/opt/rocm-\<version\>`, where \<version\> is the ROCm version.
#### Option 2: Building from Source
There are two ways to build the MIGraphX sources.
- [Use the ROCm build tool](https://github.com/ROCmSoftwarePlatform/AMDMIGraphX#use-the-rocm-build-tool-rbuild) - This approach uses [rbuild](https://github.com/RadeonOpenCompute/rbuild) to install the prerequisites and build the libraries with just one command.
or
- [Use CMake](https://github.com/ROCmSoftwarePlatform/AMDMIGraphX#use-cmake-to-build-migraphx) - This approach uses a script to install the prerequisites, then uses CMake to build the source.
For detailed steps on building from source and installing dependencies, refer to the following `README` file:
[https://github.com/ROCmSoftwarePlatform/AMDMIGraphX#building-from-source](https://github.com/ROCmSoftwarePlatform/AMDMIGraphX#building-from-source)
#### Option 3: Use Docker
To use Docker, follow these steps:
1. The easiest way to set up the development environment is to use Docker. To build Docker from scratch, first clone the MIGraphX repository by running:
```bash
git clone --recursive https://github.com/ROCmSoftwarePlatform/AMDMIGraphX
```
2. The repository contains a Dockerfile from which you can build a Docker image as:
```bash
docker build -t migraphx .
```
3. Then to enter the development environment, use Docker run:
```bash
docker run --device='/dev/kfd' --device='/dev/dri' -v=`pwd`:/code/AMDMIGraphX -w /code/AMDMIGraphX --group-add video -it migraphx
```
The Docker image contains all the prerequisites required for the installation, so users can go to the folder /code/AMDMIGraphX and follow the steps mentioned in [Option 2: Building from Source](#option-2-building-from-source).
### MIGraphX Example
MIGraphX provides both C++ and Python APIs. The following sections show examples of both using the Inception v3 model. To walk through the examples, fetch the Inception v3 ONNX model by running the following:
```py
import torch
import torchvision.models as models
inception = models.inception_v3(pretrained=True)
torch.onnx.export(inception,torch.randn(1,3,299,299), "inceptioni1.onnx")
```
This will create `inceptioni1.onnx`, which can be imported in MIGraphX using C++ or Python API.
### MIGraphX Python API
Follow these steps:
1. To import the MIGraphX module in Python script, set `PYTHONPATH` to the MIGraphX libraries installation. If binaries are installed using steps mentioned in [Option 1: Installing Binaries](#option-1-installing-binaries), perform the following action:
```py
export PYTHONPATH=$PYTHONPATH:/opt/rocm/
```
2. The following script shows the usage of Python API to import the ONNX model, compile it, and run inference on it. Set `LD_LIBRARY_PATH` to `/opt/rocm/` if required.
```py
# import migraphx and numpy
import migraphx
import numpy as np
# import and parse inception model
model = migraphx.parse_onnx("inceptioni1.onnx")
# compile model for the GPU target
model.compile(migraphx.get_target("gpu"))
# optionally print compiled model
model.print()
# create random input image
input_image = np.random.rand(1, 3, 299, 299).astype('float32')
# feed image to model, 'x.1` is the input param name
results = model.run({'x.1': input_image})
# get the results back
result_np = np.array(results[0])
# print the inferred class of the input image
print(np.argmax(result_np))
```
Find additional examples of Python API in the /examples directory of the MIGraphX repository.
### MIGraphX C++ API
Follow these steps:
1. The following is a minimalist example that shows the usage of MIGraphX C++ API to load ONNX file, compile it for the GPU, and run inference on it. To use MIGraphX C++ API, you only need to load the `migraphx.hpp` file. This example runs inference on the Inception v3 model.
```c++
#include <vector>
#include <string>
#include <algorithm>
#include <ctime>
#include <random>
#include <migraphx/migraphx.hpp>
int main(int argc, char** argv)
{
migraphx::program prog;
migraphx::onnx_options onnx_opts;
// import and parse onnx file into migraphx::program
prog = parse_onnx("inceptioni1.onnx", onnx_opts);
// print imported model
prog.print();
migraphx::target targ = migraphx::target("gpu");
migraphx::compile_options comp_opts;
comp_opts.set_offload_copy();
// compile for the GPU
prog.compile(targ, comp_opts);
// print the compiled program
prog.print();
// randomly generate input image
// of shape (1, 3, 299, 299)
std::srand(unsigned(std::time(nullptr)));
std::vector<float> input_image(1*299*299*3);
std::generate(input_image.begin(), input_image.end(), std::rand);
// users need to provide data for the input
// parameters in order to run inference
// you can query into migraph program for the parameters
migraphx::program_parameters prog_params;
auto param_shapes = prog.get_parameter_shapes();
auto input = param_shapes.names().front();
// create argument for the parameter
prog_params.add(input, migraphx::argument(param_shapes[input], input_image.data()));
// run inference
auto outputs = prog.eval(prog_params);
// read back the output
float* results = reinterpret_cast<float*>(outputs[0].data());
float* max = std::max_element(results, results + 1000);
int answer = max - results;
std::cout << "answer: " << answer << std::endl;
}
```
2. To compile this program, you can use CMake and you only need to link the `migraphx::c` library to use MIGraphX's C++ API. The following is the `CMakeLists.txt` file that can build the earlier example:
```py
cmake_minimum_required(VERSION 3.5)
project (CAI)
set (CMAKE_CXX_STANDARD 14)
set (EXAMPLE inception_inference)
list (APPEND CMAKE_PREFIX_PATH /opt/rocm/hip /opt/rocm)
find_package (migraphx)
message("source file: " ${EXAMPLE}.cpp " ---> bin: " ${EXAMPLE})
add_executable(${EXAMPLE} ${EXAMPLE}.cpp)
target_link_libraries(${EXAMPLE} migraphx::c)
```
3. To build the executable file, run the following from the directory containing the `inception_inference.cpp` file:
```py
mkdir build
cd build
cmake ..
make -j$(nproc)
./inception_inference
```
:::{note}
Set `LD_LIBRARY_PATH` to `/opt/rocm/lib` if required during the build. Additional examples can be found in the MIGraphX repository under the `/examples/` directory.
:::
### Tuning MIGraphX
MIGraphX uses MIOpen kernels to target AMD GPU. For the model compiled with MIGraphX, tune MIOpen to pick the best possible kernel implementation. The MIOpen tuning results in a significant performance boost. Tuning can be done by setting the environment variable MIOPEN_FIND_ENFORCE=3.
:::{note}
The tuning process can take a long time to finish.
:::
**Example:** The average inference time of the inception model example shown previously over 100 iterations using untuned kernels is 0.01383ms. After tuning, it reduces to 0.00459ms, which is a 3x improvement. This result is from ROCm v4.5 on a MI100 GPU.
:::{note}
The results may vary depending on the system configurations.
:::
For reference, the following code snippet shows inference runs for only the first 10 iterations for both tuned and untuned kernels:
```py
### UNTUNED ###
iterator : 0
Inference complete
Inference time: 0.063ms
iterator : 1
Inference complete
Inference time: 0.008ms
iterator : 2
Inference complete
Inference time: 0.007ms
iterator : 3
Inference complete
Inference time: 0.007ms
iterator : 4
Inference complete
Inference time: 0.007ms
iterator : 5
Inference complete
Inference time: 0.008ms
iterator : 6
Inference complete
Inference time: 0.007ms
iterator : 7
Inference complete
Inference time: 0.028ms
iterator : 8
Inference complete
Inference time: 0.029ms
iterator : 9
Inference complete
Inference time: 0.029ms
### TUNED ###
iterator : 0
Inference complete
Inference time: 0.063ms
iterator : 1
Inference complete
Inference time: 0.004ms
iterator : 2
Inference complete
Inference time: 0.004ms
iterator : 3
Inference complete
Inference time: 0.004ms
iterator : 4
Inference complete
Inference time: 0.004ms
iterator : 5
Inference complete
Inference time: 0.004ms
iterator : 6
Inference complete
Inference time: 0.004ms
iterator : 7
Inference complete
Inference time: 0.004ms
iterator : 8
Inference complete
Inference time: 0.004ms
iterator : 9
Inference complete
Inference time: 0.004ms
```
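The averages and the speedup quoted above can be recomputed from logs in this format. The following is a small sketch, assuming each measurement appears on a line of the form `Inference time: <value>ms` as in the output shown. The helper names are hypothetical, and dropping the first iteration as warm-up is an illustrative choice, not something the original measurement is stated to do.

```python
import re
from statistics import mean

# Matches lines such as "Inference time: 0.004ms".
TIME_RE = re.compile(r"Inference time:\s*([0-9.]+)\s*ms")


def parse_times(log_text):
    """Extract all inference times (in ms) from a log, in order."""
    return [float(value) for value in TIME_RE.findall(log_text)]


def steady_state_mean(times, warmup=1):
    """Average the times after skipping the first `warmup` iterations."""
    return mean(times[warmup:])


def speedup(untuned_ms, tuned_ms):
    """Ratio of untuned to tuned average inference time."""
    return untuned_ms / tuned_ms
```

For instance, `speedup(0.01383, 0.00459)` reproduces the roughly 3x figure given for the 100-iteration averages.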
#### YModel
The best inference performance through MIGraphX is conditioned upon having tuned kernel configurations stored in a /home local User Database (DB). If a user were to move their model to a different server or allow a different user to use it, they would have to run through the MIOpen tuning process again to populate the next User DB with the best kernel configurations and corresponding solvers.
Tuning is time consuming, and users who have not performed tuning see discrepancies between expected or claimed inference performance and actual inference performance. This has led to repetitive and time-consuming tuning tasks for each user.
MIGraphX introduces a feature, known as YModel, that stores the kernel config parameters found during tuning into a `.mxr` file. This ensures the same level of expected performance, even when a model is copied to a different user/system.
The YModel feature is available starting from ROCm 5.4.1 and UIF 1.1.
##### YModel Example
With `migraphx-driver`, you can generate `.mxr` files that store the tuning information by passing the additional `--binary --output model.mxr` options along with the rest of the necessary flags.
For example, to generate a `.mxr` file from an ONNX model, use the following:
```bash
./path/to/migraphx-driver compile --onnx resnet50.onnx --enable-offload-copy --binary --output resnet50.mxr
```
To run generated `.mxr` files through `migraphx-driver`, use the following:
```bash
./path/to/migraphx-driver run --migraphx resnet50.mxr --enable-offload-copy
```
Alternatively, you can use MIGraphX's C++ or Python API to generate a `.mxr` file. Refer to {numref}`image018` for an example.
```{figure} ../../data/understand/deep_learning/image.018.png
:name: image018
---
align: center
---
Generating a `.mxr` File
```
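For readers who cannot view the figure, the following is a rough sketch of what generating and reloading a `.mxr` file through the Python API might look like. It assumes the `migraphx` Python module is installed and that `parse_onnx`, `get_target`, `compile`, `save`, and `load` behave as in recent MIGraphX releases. The function names `export_mxr` and `load_mxr` and the file paths are hypothetical, and this is not a transcription of the figure.

```python
def export_mxr(onnx_path, mxr_path):
    """Compile an ONNX model for the GPU and save it as a .mxr file."""
    # Imported lazily so this module loads even without MIGraphX installed.
    import migraphx

    program = migraphx.parse_onnx(onnx_path)
    # offload_copy=True mirrors the --enable-offload-copy driver flag.
    program.compile(migraphx.get_target("gpu"), offload_copy=True)
    migraphx.save(program, mxr_path)
    return mxr_path


def load_mxr(mxr_path):
    """Load a previously saved .mxr program."""
    import migraphx

    return migraphx.load(mxr_path)
```

A call such as `export_mxr("resnet50.onnx", "resnet50.mxr")` would then play the role of the `migraphx-driver compile` command shown earlier. Whether tuned kernel configurations are stored in the file depends on the MIGraphX version, as noted above.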
## Troubleshooting
**Q: What do I do if I get this error when trying to run PyTorch:**
```bash
hipErrorNoBinaryForGPU: Unable to find code object for all current devices!
```
Ans: The error indicates that the installed PyTorch, or one of its dependencies or libraries, does not support the current GPU.
**Workaround:**
To implement a workaround, follow these steps:
1. Confirm that the hardware supports the ROCm stack. Refer to the Hardware and Software Support document at [https://docs.amd.com](https://docs.amd.com).
2. Determine the gfx target.
```bash
rocminfo | grep gfx
```
3. Check if PyTorch is compiled with the correct gfx target.
```bash
TORCHDIR=$( dirname $( python3 -c 'import torch; print(torch.__file__)' ) )
roc-obj-ls -v $TORCHDIR/lib/libtorch_hip.so # check for gfx target
```
:::{note}
If you compiled PyTorch from source and the gfx target of your hardware is missing, recompile PyTorch with the correct gfx target. For wheels or Docker installations, contact ROCm support [6].
:::
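The comparison in steps 2 and 3 can be automated. The sketch below assumes you have captured the text output of `rocminfo` and of `roc-obj-ls -v` on `libtorch_hip.so` (for example with `subprocess.run(..., capture_output=True, text=True)`); it then reports any gfx target present on the system but absent from the PyTorch library. The helper names and the regular expression are illustrative assumptions.

```python
import re

# gfx target names look like gfx906, gfx90a, or gfx1030.
GFX_RE = re.compile(r"gfx[0-9a-f]+")


def gfx_targets(text):
    """Return the set of gfx target names mentioned in a block of text."""
    return set(GFX_RE.findall(text))


def unsupported_targets(rocminfo_output, code_object_listing):
    """Targets reported by rocminfo that the library was not built for."""
    return gfx_targets(rocminfo_output) - gfx_targets(code_object_listing)
```

An empty result suggests that PyTorch was built for every GPU in the system; a non-empty result lists the targets that would trigger `hipErrorNoBinaryForGPU`.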
**Q: Why am I unable to access Docker or GPU in user accounts?**
Ans: Ensure that the user is added to docker, video, and render Linux groups as described in the ROCm Installation Guide at [https://docs.amd.com](https://docs.amd.com).
**Q: Which consumer GPUs does ROCm support?**
Ans: ROCm supports gfx1030, which is the Navi 21 series.
**Q: Can I install PyTorch directly on bare metal?**
Ans: Bare-metal installation of PyTorch is supported through wheels. Refer to Option 2: Install PyTorch Using Wheels Package in the section [Installing PyTorch](/ROCm/docs/how_to/pytorch_install/pytorch_install) of this guide for more information.
**Q: How do I profile PyTorch workloads?**
Ans: Use the PyTorch Profiler \[6\] to profile GPU kernels on ROCm.
**Q: Can I run ROCm on Windows?**
Ans: ROCm is not supported on Windows.
## References
C. Szegedy, V. Vanhoucke, S. Ioffe, J. Shlens and Z. Wojna, "Rethinking the Inception Architecture for Computer Vision," CoRR, p. abs/1512.00567, 2015
[^inception_arch]: C. Szegedy, V. Vanhoucke, S. Ioffe, J. Shlens and Z. Wojna, "Rethinking the Inception Architecture for Computer Vision," CoRR, p. abs/1512.00567, 2015
PyTorch, \[Online\]. Available: [https://pytorch.org/vision/stable/index.html](https://pytorch.org/vision/stable/index.html)
[^torch_vision]: PyTorch, \[Online\]. Available: [https://pytorch.org/vision/stable/index.html](https://pytorch.org/vision/stable/index.html)
PyTorch, \[Online\]. Available: [https://pytorch.org/hub/pytorch_vision_inception_v3/](https://pytorch.org/hub/pytorch_vision_inception_v3/)
[^torch_vision_inception]: PyTorch, \[Online\]. Available: [https://pytorch.org/hub/pytorch_vision_inception_v3/](https://pytorch.org/hub/pytorch_vision_inception_v3/)
Stanford, \[Online\]. Available: [http://cs231n.stanford.edu/](http://cs231n.stanford.edu/)
[^Stanford_deep_learning]: Stanford, \[Online\]. Available: [http://cs231n.stanford.edu/](http://cs231n.stanford.edu/)
Wikipedia, \[Online\]. Available: [https://en.wikipedia.org/wiki/Cross_entropy](https://en.wikipedia.org/wiki/Cross_entropy)
AMD, "ROCm issues," \[Online\]. Available: [https://github.com/RadeonOpenCompute/ROCm/issues](https://github.com/RadeonOpenCompute/ROCm/issues)
PyTorch, \[Online image\]. [https://pytorch.org/assets/brand-guidelines/PyTorch-Brand-Guidelines.pdf](https://pytorch.org/assets/brand-guidelines/PyTorch-Brand-Guidelines.pdf)
TensorFlow, \[Online image\]. [https://www.tensorflow.org/extras/tensorflow_brand_guidelines.pdf](https://www.tensorflow.org/extras/tensorflow_brand_guidelines.pdf)
MAGMA, \[Online image\]. [https://bitbucket.org/icl/magma/src/master/docs/](https://bitbucket.org/icl/magma/src/master/docs/)
Advanced Micro Devices, Inc., \[Online\]. Available: [https://rocmsoftwareplatform.github.io/AMDMIGraphX/doc/html/](https://rocmsoftwareplatform.github.io/AMDMIGraphX/doc/html/)
Advanced Micro Devices, Inc., \[Online\]. Available: [https://github.com/ROCmSoftwarePlatform/AMDMIGraphX/wiki](https://github.com/ROCmSoftwarePlatform/AMDMIGraphX/wiki)
Docker, \[Online\]. [https://docs.docker.com/get-started/overview/](https://docs.docker.com/get-started/overview/)
Torchvision, \[Online\]. Available [https://pytorch.org/vision/master/index.html?highlight=torchvision#module-torchvision](https://pytorch.org/vision/master/index.html?highlight=torchvision#module-torchvision)
[^cross_entropy]: Wikipedia, \[Online\]. Available: [https://en.wikipedia.org/wiki/Cross_entropy](https://en.wikipedia.org/wiki/Cross_entropy)


@@ -0,0 +1,56 @@
# Troubleshooting
**Q: What do I do if I get this error when trying to run PyTorch:**
```bash
hipErrorNoBinaryForGPU: Unable to find code object for all current devices!
```
Ans: The error indicates that the installed PyTorch, or one of its
dependencies or libraries, does not support the current GPU.
**Workaround:**
To implement a workaround, follow these steps:
1. Confirm that the hardware supports the ROCm stack. Refer to
{ref}`supported_gpus`.
2. Determine the gfx target.
```bash
rocminfo | grep gfx
```
3. Check if PyTorch is compiled with the correct gfx target.
```bash
TORCHDIR=$( dirname $( python3 -c 'import torch; print(torch.__file__)' ) )
roc-obj-ls -v $TORCHDIR/lib/libtorch_hip.so # check for gfx target
```
:::{note}
If you compiled PyTorch from source and the gfx target of your hardware is
missing, recompile PyTorch with the correct gfx target. For wheels or Docker
installations, contact ROCm support [^ROCm_issues].
:::
**Q: Why am I unable to access Docker or GPU in user accounts?**
Ans: Ensure that the user is added to docker, video, and render Linux groups as
described in the ROCm Installation Guide at {ref}`setting_group_permissions`.
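A quick way to check the group membership described in this answer is sketched below. It relies only on the Python standard library (`os` and `grp`, both Unix-only). The helper names are hypothetical, and the required group list simply mirrors the three groups named above.

```python
import grp
import os

# Groups named in the answer above.
REQUIRED_GROUPS = ("docker", "video", "render")


def missing_groups(member_of, required=REQUIRED_GROUPS):
    """Return the required groups that are absent from `member_of`."""
    return [name for name in required if name not in member_of]


def current_group_names():
    """Names of the groups the current process belongs to."""
    names = set()
    for gid in set(os.getgroups()) | {os.getegid()}:
        try:
            names.add(grp.getgrgid(gid).gr_name)
        except KeyError:
            # The GID has no entry in the group database; skip it.
            pass
    return names
```

Running `missing_groups(current_group_names())` in a fresh login session lists the groups that still need to be added for the user.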
**Q: Can I install PyTorch directly on bare metal?**
Ans: Bare-metal installation of PyTorch is supported through wheels. Refer to
Option 2: Install PyTorch Using Wheels Package in the section
{ref}`install_pytorch_using_wheels` of this guide for more information.
**Q: How do I profile PyTorch workloads?**
Ans: Use the PyTorch Profiler to profile GPU kernels on ROCm.
------
[^ROCm_issues]: AMD, "ROCm issues," \[Online\]. Available: [https://github.com/RadeonOpenCompute/ROCm/issues](https://github.com/RadeonOpenCompute/ROCm/issues)

docs/how_to/all.md Normal file

@@ -0,0 +1,34 @@
# All How-To Material
:::::{grid} 1 1 2 2
:gutter: 1
:::{grid-item-card} Tuning Guides
:link: tuning_guides/index
:link-type: doc
Use case-specific system setup and tuning guides.
:::
:::{grid-item-card} Deep Learning Guide
:link: deep_learning_rocm
:link-type: doc
Installation of various Deep Learning frameworks and applications.
:::
:::{grid-item-card} GPU-Enabled MPI
:link: gpu_aware_mpi
:link-type: doc
This chapter exemplifies how to set up Open MPI with the ROCm platform.
:::
:::{grid-item-card} System Debugging Guide
:link: system_debugging
:link-type: doc
Useful commands to debug misbehaving ROCm installations.
:::
:::::


@@ -1,7 +1,10 @@
# Deep Learning Guide
The following sections cover the different framework installations for ROCm and
Deep Learning applications. {numref}`Rocm-Compat-Frameworks-Flowchart` provides the sequential flow for the use of each framework. Refer to the ROCm Compatible Frameworks Release Notes for each framework's most current release notes at [Framework Release Notes](https://docs.amd.com/bundle/ROCm-Compatible-Frameworks-Release-Notes/page/Framework_Release_Notes.html).
Deep Learning applications. {numref}`Rocm-Compat-Frameworks-Flowchart` provides
the sequential flow for the use of each framework. Refer to the ROCm Compatible
Frameworks Release Notes for each framework's most current release notes at
{ref}`ml_framework_compat_matrix`.
```{figure} ../data/how_to/magma_install/image.005.png
:name: Rocm-Compat-Frameworks-Flowchart
@@ -14,5 +17,5 @@ ROCm Compatible Frameworks Flowchart
## Frameworks Installation
- [How to Install PyTorch?](pytorch_install/pytorch_install)
- [How to Install Tensorflow?](tensorflow_install/tensorflow_install)
- [How to Install Magma?](magma_install/magma_install)
- [How to Install Magma?](tensorflow_install/tensorflow_install)


@@ -61,7 +61,7 @@ The next step is to set up UCX by compiling its source code and install it:
```shell
export UCX_DIR=$INSTALL_DIR/ucx
cd $BUILD_DIR
git clone https://github.com/openucx/ucx.git -b v1.13.0
git clone https://github.com/openucx/ucx.git -b v1.14.1
cd ucx
./autogen.sh
mkdir build
@@ -75,6 +75,10 @@ make -j $(nproc)
make -j $(nproc) install
```
The following
[table](../release/3rd_party_support_matrix.md#communication-libraries)
documents the compatibility of UCX versions with ROCm versions.
## Install Open MPI
These are the steps to build Open MPI:
@@ -89,6 +93,7 @@ cd ompi
mkdir build
cd build
../configure --prefix=$OMPI_DIR --with-ucx=$UCX_DIR \
--with-rocm=/opt/rocm \
--enable-mca-no-build=btl-uct --enable-mpi1-compatibility \
CC=clang CXX=clang++ FC=flang
make -j $(nproc)
@@ -97,7 +102,7 @@ make -j $(nproc) install
## ROCm-enabled OSU
he OSU Micro Benchmarks v5.9 (OMB) can be used to evaluate the performance of
The OSU Micro Benchmarks v5.9 (OMB) can be used to evaluate the performance of
various primitives with an AMD GPU device and ROCm support. This functionality
is exposed when configured with `--enable-rocm` option. We can use the following
steps to compile OMB:
@@ -118,13 +123,21 @@ make -j $(nproc)
## Intra-node Run
Before running an Open MPI job, set the following environment variables to
ensure that the correct versions of Open MPI and UCX are used.
```shell
export LD_LIBRARY_PATH=$OMPI_DIR/lib:$UCX_DIR/lib:/opt/rocm/lib
export PATH=$OMPI_DIR/bin:$PATH
```
The following command runs the OSU bandwidth benchmark between the first two GPU
devices (i.e., GPU 0 and GPU 1, same OAM) by default inside the same node. It
measures the unidirectional bandwidth from the first device to the other.
```shell
$OMPI_DIR/bin/mpirun -np 2 --mca btl '^openib' \
-x UCX_TLS=sm,self,rocm_copy,rocm_ipc \
$OMPI_DIR/bin/mpirun -np 2 \
-x UCX_TLS=sm,self,rocm \
--mca pml ucx mpi/pt2pt/osu_bw -d rocm D D
```
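When the benchmark is launched from a script, the environment setup and the `mpirun` invocation can be assembled programmatically. The following sketch mirrors the `export` lines and the command above; the helper names are hypothetical, and the directory arguments stand in for `$OMPI_DIR` and `$UCX_DIR`.

```python
import os


def mpi_env(ompi_dir, ucx_dir, rocm_lib="/opt/rocm/lib", base_env=None):
    """Environment that selects the given Open MPI and UCX builds."""
    env = dict(os.environ if base_env is None else base_env)
    # Same value as the export of LD_LIBRARY_PATH shown above.
    env["LD_LIBRARY_PATH"] = ":".join(
        [f"{ompi_dir}/lib", f"{ucx_dir}/lib", rocm_lib]
    )
    env["PATH"] = f"{ompi_dir}/bin:" + env.get("PATH", "")
    return env


def osu_bw_command(ompi_dir, benchmark="mpi/pt2pt/osu_bw"):
    """Argument list for the two-rank device-to-device bandwidth test."""
    return [
        f"{ompi_dir}/bin/mpirun", "-np", "2",
        "-x", "UCX_TLS=sm,self,rocm",
        "--mca", "pml", "ucx",
        benchmark, "-d", "rocm", "D", "D",
    ]
```

Both results can be passed to `subprocess.run(osu_bw_command(...), env=mpi_env(...))` from the OMB build directory.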
@@ -146,3 +159,37 @@ connection:
:alt: OSU execution showing transfer bandwidth increasing alongside payload inc.
Inter-GPU bandwidth with various payload sizes.
:::
## Collective Operations
Collective Operations on GPU buffers are best handled through the
Unified Collective Communication Library (UCC) component in Open MPI.
For this, the UCC library has to be configured and compiled with ROCm
support. An example for configuring UCC and Open MPI with ROCm support
is shown below:
```shell
export UCC_DIR=$INSTALL_DIR/ucc
git clone https://github.com/openucx/ucc.git
cd ucc
./configure --with-rocm=/opt/rocm \
--with-ucx=$UCX_DIR \
--prefix=$UCC_DIR
make -j && make install
# Configure and compile Open MPI with UCX, UCC, and ROCm support
cd ompi
./configure --with-rocm=/opt/rocm \
--with-ucx=$UCX_DIR \
--with-ucc=$UCC_DIR \
--prefix=$OMPI_DIR
```
Using the UCC component with an MPI application requires setting some
additional parameters:
```shell
mpirun --mca pml ucx --mca osc ucx \
--mca coll_ucc_enable 1 \
--mca coll_ucc_priority 100 -np 64 ./my_mpi_app
```


@@ -14,10 +14,12 @@ automatic differentiation. Other advanced features include:
### Installing PyTorch
To install ROCm on bare metal, refer to the section
[ROCm Installation](https://docs.amd.com/bundle/ROCm-Deep-Learning-Guide-v5.4-/page/Prerequisites.html#d2999e60).
The recommended option to get a PyTorch environment is through Docker. However,
installing the PyTorch wheels package on bare metal is also supported.
To install ROCm on bare metal, refer to the sections
[GPU and OS Support (Linux)](../../release/gpu_os_support.md) and
[Compatibility](../../release/compatibility.md) for hardware, software and
3rd-party framework compatibility between ROCm and PyTorch. The recommended
option to get a PyTorch environment is through Docker. However, installing the
PyTorch wheels package on bare metal is also supported.
#### Option 1 (Recommended): Use Docker Image with PyTorch Pre-Installed
@@ -51,6 +53,8 @@ Follow these steps:
onto the container.
:::
(install_pytorch_using_wheels)=
#### Option 2: Install PyTorch Using Wheels Package
PyTorch supports the ROCm platform by providing tested wheels packages. To
@@ -77,9 +81,9 @@ To install PyTorch using the wheels package, follow these installation steps:
b. Download a base OS Docker image and install ROCm following the
installation directions in the section
[Installation](https://docs.amd.com/bundle/ROCm-Deep-Learning-Guide-v5.4-/page/Prerequisites.html#d2999e60).
ROCm 5.2 is installed in this example, as supported by the installation
matrix from <http://pytorch.org/>.
[Installation](../../deploy/linux/install.md). ROCm 5.2 is installed in
this example, as supported by the installation matrix from
<http://pytorch.org/>.
or
@@ -152,7 +156,7 @@ Follow these steps:
cd ~
git clone https://github.com/pytorch/pytorch.git
cd pytorch
git submodule update --init recursive
git submodule update --init --recursive
```
4. Build PyTorch for ROCm.
@@ -194,7 +198,7 @@ Follow these steps:
```bash
python3 tools/amd_build/build_amd.py
USE_ROCM=1 MAX_JOBS=4 python3 setup.py install user
USE_ROCM=1 MAX_JOBS=4 python3 setup.py install --user
```
#### Option 4: Install Using PyTorch Upstream Docker File
@@ -217,7 +221,7 @@ Follow these steps:
cd ~
git clone https://github.com/pytorch/pytorch.git
cd pytorch
git submodule update --init recursive
git submodule update --init --recursive
```
2. Build the PyTorch Docker image.


@@ -52,7 +52,7 @@ Debug messages when developing/debugging base ROCm driver. You could enable the
## Turn Off Page Retry on GFX9/Vega Devices
`sudo s`
`sudo -s`
`echo 1 > /sys/module/amdkfd/parameters/noretry`
@@ -65,4 +65,4 @@ Debug messages when developing/debugging base ROCm driver. You could enable the
## PCIe-Debug
Refer to ROCm PCIe Debug, <a href="https://rocmdocs.amd.com/en/latest/Other_Solutions/PCIe-Debug.html#pcie-debug" target="_blank">https://rocmdocs.amd.com/en/latest/Other_Solutions/PCIe-Debug.html#pcie-debug</a>.
For information on how to debug and profile HIP applications, see <a href="https://rocmdocs.amd.com/projects/HIP/en/latest/how_to_guides/debugging.html" target="_blank">https://rocmdocs.amd.com/projects/HIP/en/latest/how_to_guides/debugging.html</a>
For information on how to debug and profile HIP applications, see {doc}`hip:how_to_guides/debugging`


@@ -16,8 +16,8 @@ The following sections contain options for installing TensorFlow.
#### Option 1: Install TensorFlow Using Docker Image
To install ROCm on bare metal, follow the section
[ROCm Installation](https://docs.amd.com/bundle/ROCm-Deep-Learning-Guide-v5.4-/page/Prerequisites.html#d2999e60).
The recommended option to get a TensorFlow environment is through Docker.
[Installation (Linux)](../../deploy/linux/install.md). The recommended option to
get a TensorFlow environment is through Docker.
Using Docker provides portability and access to a prebuilt Docker container that
has been rigorously tested within AMD. This might also save compilation time and
@@ -45,7 +45,7 @@ To install TensorFlow using the wheels package, follow these steps:
1. Check the Python version.
```bash
python3 version
python3 --version
```
| If: | Then: |
@@ -105,7 +105,7 @@ To install TensorFlow using the wheels package, follow these steps:
5. Install TensorFlow for the Python version as indicated in Step 2.
```bash
/usr/bin/python[version] -m pip install --user tensorflow-rocm==[wheel-version] upgrade
/usr/bin/python[version] -m pip install --user tensorflow-rocm==[wheel-version] --upgrade
```
For a valid wheel version for a ROCm release, refer to the instruction below:


@@ -1,5 +1,7 @@
# Tuning Guides
Use case-specific system setup and tuning guides.
## High Performance Computing
High Performance Computing (HPC) workloads have unique requirements. The default


@@ -83,78 +83,97 @@ available as listed in {numref}`mi100-bios`.
- AMD CBS / NBIO Common Options
- IOMMU
- Disable
-
*
- AMD CBS / NBIO Common Options
- PCIe Ten Bit Tag Support
- Enable
-
*
- AMD CBS / NBIO Common Options
- Preferred IO
- Manual
-
*
- AMD CBS / NBIO Common Options
- Preferred IO Bus
- "Use lspci to find pci device id"
-
*
- AMD CBS / NBIO Common Options
- Enhanced Preferred IO Mode
- Enable
-
*
- AMD CBS / NBIO Common Options / SMU Common Options
- Determinism Control
- Manual
-
*
- AMD CBS / NBIO Common Options / SMU Common Options
- Determinism Slider
- Power
-
*
- AMD CBS / NBIO Common Options / SMU Common Options
- cTDP Control
- Manual
-
*
- AMD CBS / NBIO Common Options / SMU Common Options
- cTDP
- 240
-
*
- AMD CBS / NBIO Common Options / SMU Common Options
- Package Power Limit Control
- Manual
-
*
- AMD CBS / NBIO Common Options / SMU Common Options
- Package Power Limit
- 240
-
*
- AMD CBS / NBIO Common Options / SMU Common Options
- xGMI Link Width Control
- Manual
-
*
- AMD CBS / NBIO Common Options / SMU Common Options
- xGMI Force Link Width
- 2
-
*
- AMD CBS / NBIO Common Options / SMU Common Options
- xGMI Force Link Width Control
- Force
-
*
- AMD CBS / NBIO Common Options / SMU Common Options
- APBDIS
- 1
-
*
- AMD CBS / NBIO Common Options / SMU Common Options
- DF C-states
- Auto
-
*
- AMD CBS / NBIO Common Options / SMU Common Options
- Fixed SOC P-state
- P0
-
*
- AMD CBS / UMC Common Options / DDR4 Common Options
- Enforce POR
- Accept
-
*
- AMD CBS / UMC Common Options / DDR4 Common Options / Enforce POR
- Overclock
- Enabled
-
*
- AMD CBS / UMC Common Options / DDR4 Common Options / Enforce POR
- Memory Clock Speed


@@ -27,8 +27,6 @@ Analogous settings for other non-AMI System BIOS providers could be set
similarly. For systems with Intel processors, some settings may not apply or be
available as listed in {numref}`mi200-bios`.
Table 2: Recommended settings for the system BIOS in a GIGABYTE platform.
```{list-table} Recommended settings for the system BIOS in a GIGABYTE platform.
:header-rows: 1
:name: mi200-bios
@@ -82,30 +80,37 @@ Table 2: Recommended settings for the system BIOS in a GIGABYTE platform.
- AMD CBS / NBIO Common Options
- IOMMU
- Disable
-
*
- AMD CBS / NBIO Common Options
- PCIe Ten Bit Tag Support
- Auto
-
*
- AMD CBS / NBIO Common Options
- Preferred IO
- Bus
-
*
- AMD CBS / NBIO Common Options
- Preferred IO Bus
- "Use lspci to find pci device id"
-
*
- AMD CBS / NBIO Common Options
- Enhanced Preferred IO Mode
- Enable
-
*
- AMD CBS / NBIO Common Options / SMU Common Options
- Determinism Control
- Manual
-
*
- AMD CBS / NBIO Common Options / SMU Common Options
- Determinism Slider
- Power
-
*
- AMD CBS / NBIO Common Options / SMU Common Options
- cTDP Control
@@ -115,6 +120,7 @@ Table 2: Recommended settings for the system BIOS in a GIGABYTE platform.
- AMD CBS / NBIO Common Options / SMU Common Options
- cTDP
- 280
-
*
- AMD CBS / NBIO Common Options / SMU Common Options
- Package Power Limit Control
@@ -124,6 +130,7 @@ Table 2: Recommended settings for the system BIOS in a GIGABYTE platform.
- AMD CBS / NBIO Common Options / SMU Common Options
- Package Power Limit
- 280
-
*
- AMD CBS / NBIO Common Options / SMU Common Options
- xGMI Link Width Control
@@ -133,30 +140,37 @@ Table 2: Recommended settings for the system BIOS in a GIGABYTE platform.
- AMD CBS / NBIO Common Options / SMU Common Options
- xGMI Force Link Width
- 2
-
*
- AMD CBS / NBIO Common Options / SMU Common Options
- xGMI Force Link Width Control
- Force
-
*
- AMD CBS / NBIO Common Options / SMU Common Options
- APBDIS
- 1
-
*
- AMD CBS / NBIO Common Options / SMU Common Options
- DF C-states
- Enabled
-
*
- AMD CBS / NBIO Common Options / SMU Common Options
- Fixed SOC P-state
- P0
-
*
- AMD CBS / UMC Common Options / DDR4 Common Options
- Enforce POR
- Accept
-
*
- AMD CBS / UMC Common Options / DDR4 Common Options / Enforce POR
- Overclock
- Enabled
-
*
- AMD CBS / UMC Common Options / DDR4 Common Options / Enforce POR
- Memory Clock Speed


@@ -1,4 +0,0 @@
# Inference Optimization Using MIGraphX
Pull content from
<https://docs.amd.com/bundle/ROCm-Deep-Learning-Guide-v5.4.1/page/Optimization.html>


@@ -1,4 +1,4 @@
# AMD ROCm™ Platform - Powering Your GPU Computational Needs
# AMD ROCm™ Documentation
:::::{grid} 1 1 3 3
:gutter: 1
@@ -14,7 +14,7 @@ agile, flexible, rapid and secure manner. [more...](rocm)
::::
::::{grid-item}
:::{dropdown} [Deploy ROCm](deploy)
:::{dropdown} Deploy ROCm
- {doc}`/deploy/linux/index`
- {doc}`/deploy/docker`
@@ -44,14 +44,14 @@ agile, flexible, rapid and secure manner. [more...](rocm)
[APIs and Reference](reference/all)
^^^
- [Compilers and Development Tools](reference/compilers)
- [HIP](reference/hip)
- [OpenMP](reference/openmp/openmp)
- [Math Libraries](reference/gpu_libraries/math)
- [C++ Primitives Libraries](reference/gpu_libraries/c++_primitives)
- [Communication Libraries](reference/gpu_libraries/communication)
- [AI Libraries](reference/ai_tools)
- [Computer Vision](reference/computer_vision)
- [OpenMP](reference/openmp/openmp)
- [Compilers and Tools](reference/compilers)
- [Management Tools](reference/management_tools)
- [Validation Tools](reference/validation_tools)
@@ -59,23 +59,24 @@ agile, flexible, rapid and secure manner. [more...](rocm)
:::{grid-item-card}
:padding: 2
Understand ROCm
[Understand ROCm](understand/all)
^^^
- [Compiler Disambiguation](understand/compiler_disambiguation)
- [Using CMake](understand/cmake_packages)
- [ROCm File Reorganization White Paper](understand/file_reorg)
- [GPU Architecture](understand/gpu_arch)
- [Linux Folder Structure Reorganization](understand/file_reorg)
- [GPU Isolation Techniques](understand/gpu_isolation)
- [GPU Architecture](understand/gpu_arch)
:::
:::{grid-item-card}
:padding: 2
How to Guides
[How to Guides](how_to/all)
^^^
- [System Tuning for Various Architectures](how_to/tuning_guides/index)
- [GPU Aware MPI](how_to/gpu_aware_mpi)
- [Setting up for Deep Learning with ROCm](how_to/deep_learning_rocm)
- [Magma Installation](how_to/magma_install/magma_install)
- [PyTorch Installation](how_to/pytorch_install/pytorch_install)
@@ -86,12 +87,13 @@ How to Guides
:::{grid-item-card}
:padding: 2
Examples
[Tutorials & Examples](examples/all)
^^^
- [ROCm Examples](https://github.com/amd/rocm-examples)
- [AI/ML/Inferencing](examples/ai_ml_inferencing)
- [Inception V3 with PyTorch](examples/inception_casestudy/inception_casestudy)
- [Examples](https://github.com/amd/rocm-examples)
- [ML, DL, and AI](examples/machine_learning/all)
- [](examples/machine_learning/pytorch_inception)
- [](examples/machine_learning/migraphx_optimization)
:::
::::


@@ -3,24 +3,24 @@
::::{grid} 1 1 2 2
:gutter: 1
:::{grid-item-card} [MIOpen](https://rocmdocs.amd.com/projects/MIOpen/en/latest/)
:::{grid-item-card} {doc}`MIOpen <miopen:index>`
AMD's library for high performance machine learning primitives.
- [Documentation](https://rocmdocs.amd.com/projects/MIOpen/en/latest/)
- {doc}`Documentation <miopen:index>`
:::
:::{grid-item-card} [Composable Kernel](https://rocmdocs.amd.com/projects/composable_kernel/en/latest/)
:::{grid-item-card} {doc}`Composable Kernel <composable_kernel:index>`
Composable Kernel: Performance Portable Programming Model for Machine Learning Tensor Operators
- [Documentation](https://rocmdocs.amd.com/projects/composable_kernel/en/latest/)
- {doc}`Documentation <composable_kernel:index>`
:::
:::{grid-item-card} [MIGraphX](https://rocmdocs.amd.com/projects/MIGraphX/en/latest/)
:::{grid-item-card} {doc}`MIGraphX <amdmigraphx:index>`
AMD MIGraphX is AMD's graph inference engine that accelerates machine learning model inference.
- [Documentation](https://rocmdocs.amd.com/projects/MIGraphX/en/latest/)
- {doc}`Documentation <amdmigraphx:index>`
:::


@@ -8,7 +8,7 @@
:::{grid-item-card} [HIP](./hip)
HIP is both AMD's GPU programming language extension and the GPU runtime.
- [HIP Runtime API Manual](https://rocmdocs.amd.com/projects/hipBLAS/en/latest/)
- {doc}`hip:.doxygen/docBin/html/index`
- [Examples](https://github.com/amd/rocm-examples/tree/develop/HIP-Basic)
:::
@@ -25,33 +25,33 @@ HIP Math Libraries support the following domains:
:::{grid-item-card} [C++ Primitive Libraries](./gpu_libraries/c++_primitives)
ROCm template libraries for C++ primitives and algorithms are as follows:
- [rocPRIM](https://rocprim.readthedocs.io/en/latest/)
- [rocThrust](https://rocthrust.readthedocs.io/en/latest/)
- [hipCUB](https://hipcub.readthedocs.io/en/latest/)
- {doc}`rocPRIM <rocprim:index>`
- {doc}`rocThrust <rocthrust:index>`
- {doc}`hipCUB <hipcub:index>`
:::
:::{grid-item-card} [Communication Libraries](gpu_libraries/communication)
Inter and intra-node communication is supported by the following projects:
- [RCCL](https://rocmdocs.amd.com/projects/rccl/en/latest/)
- {doc}`RCCL <rccl:index>`
:::
:::{grid-item-card} [AI Libraries](./ai_tools)
Libraries related to AI.
- [MIOpen](https://rocmdocs.amd.com/projects/MIOpen/en/latest/)
- [Composable Kernel](https://rocmdocs.amd.com/projects/composable_kernel/en/latest/)
- [MIGraphX](https://rocmdocs.amd.com/projects/MIGraphX/en/latest/)
- {doc}`MIOpen <miopen:index>`
- {doc}`Composable Kernel <composable_kernel:index>`
- {doc}`MIGraphX <amdmigraphx:index>`
:::
:::{grid-item-card} [Computer Vision](./computer_vision)
Computer vision related projects.
- [MIVisionX](https://rocmdocs.amd.com/projects/MIVisionX/en/latest)
- [rocAL](https://rocmdocs.amd.com/projects/rocAL/en/latest)
- {doc}`MIVisionX <mivisionx:README>`
- {doc}`rocAL <rocal:README>`
:::
@@ -63,25 +63,25 @@ Computer vision related projects.
:::{grid-item-card} [Compilers and Tools](compilers)
- [ROCmCC](https://rocmdocs.amd.com/projects/ROCmCC/en/latest/)
- [ROCgdb](https://rocmdocs.amd.com/projects/ROCgdb/en/latest/)
- [ROCProfiler](https://rocmdocs.amd.com/projects/rocprofiler/en/latest/)
- [ROCTracer](https://rocmdocs.amd.com/projects/roctracer/en/latest/)
- [ROCmCC](/reference/rocmcc/rocmcc)
- {doc}`ROCgdb <rocgdb:index>`
- {doc}`ROCProfiler <rocprofiler:rocprof>`
- {doc}`ROCTracer <roctracer:index>`
:::
:::{grid-item-card} [Management Tools](management_tools)
- [AMD SMI](https://rocmdocs.amd.com/projects/amdsmi/en/latest/)
- [ROCm SMI](https://rocmdocs.amd.com/projects/rocmsmi/en/latest/)
- [ROCm Datacenter Tool](https://rocmdocs.amd.com/projects/rdc/en/latest/)
- AMD SMI
- [ROCm SMI](https://rocmdocs.amd.com/projects/rocm_smi_lib/en/latest/)
- {doc}`ROCm Datacenter Tool <rdc:index>`
:::
:::{grid-item-card} [Validation Tools](validation_tools)
- [ROCm Validation Suite](https://rocm.docs.amd.com/projects/ROCmValidationSuite/en/latest/)
- [TransferBench](https://rocmdocs.amd.com/projects/TransferBench/en/latest/)
- {doc}`ROCm Validation Suite <rocmvalidationsuite:index>`
- {doc}`TransferBench <transferbench:index>`
:::


@@ -3,32 +3,47 @@
:::::{grid} 1 1 2 2
:gutter: 1
:::{grid-item-card} [ROCmCC](https://rocmdocs.amd.com/projects/ROCmCC/en/latest/)
ROCmCC is a Clang/LLVM-based compiler. It is optimized for high-performance computing on AMD GPUs and CPUs and supports various heterogeneous programming models such as HIP, OpenMP, and OpenCL.
- [Documentation](https://rocmdocs.amd.com/projects/ROCmCC/en/latest/)
:::{grid-item-card} ROCmCC
:link: /reference/rocmcc/rocmcc
:link-type: doc
ROCmCC is a Clang/LLVM-based compiler. It is optimized for high-performance
computing on AMD GPUs and CPUs and supports various heterogeneous programming
models such as HIP, OpenMP, and OpenCL.
:::
:::{grid-item-card} [ROCgdb](https://rocmdocs.amd.com/projects/ROCgdb/en/latest/)
:::{grid-item-card} ROCgdb
:link: rocgdb:index
:link-type: doc
This is ROCgdb, the ROCm source-level debugger for Linux, based on GDB, the GNU source-level debugger.
- [Documentation](https://rocmdocs.amd.com/projects/ROCgdb/en/latest/)
:::
:::{grid-item-card} [ROCProfiler](https://rocmdocs.amd.com/projects/rocprofiler/en/latest/)
:::{grid-item-card} ROCProfiler
:link: rocprofiler:rocprof
:link-type: doc
ROC profiler library. Profiling with performance counters and derived metrics. Library supports GFX8/GFX9. Hardware specific low-level performance analysis interface for profiling of GPU compute applications. The profiling includes hardware performance counters with complex performance metrics.
- [Documentation](https://rocmdocs.amd.com/projects/rocprofiler/en/latest/)
:::
:::{grid-item-card} ROCTracer
:link: roctracer:index
:link-type: doc
Callback/Activity Library for Performance tracing AMD GPU's
:::
:::{grid-item-card} [ROCTracer](https://rocmdocs.amd.com/projects/roctracer/en/latest/)
Callback/Activity Library for Performance tracing AMD GPU's
- [Documentation](https://rocmdocs.amd.com/projects/roctracer/en/latest/)
:::{grid-item-card} ROCdbgapi
:link: rocdbgapi:index
:link-type: doc
The AMD Debugger API is a library that provides all the support necessary for a
debugger and other tools to perform low level control of the execution and
inspection of execution state of AMD's commercially available GPU architectures.
:::
:::::
## See Also
- [Compiler Disambiguation](../understand/compiler_disambiguation.md)


@@ -3,17 +3,17 @@
::::{grid} 1 1 2 2
:gutter: 1
:::{grid-item-card} [MIVisionX](https://rocmdocs.amd.com/projects/MIVisionX/en/latest/)
:::{grid-item-card} {doc}`MIVisionX <mivisionx:README>`
MIVisionX toolkit is a set of comprehensive computer vision and machine intelligence libraries, utilities, and applications bundled into a single toolkit. AMD MIVisionX also delivers a highly optimized open-source implementation of the Khronos OpenVX™ and OpenVX™ Extensions.
- [Documentation](https://rocmdocs.amd.com/projects/MIVisionX/en/latest/)
- {doc}`Documentation <mivisionx:README>`
:::
:::{grid-item-card} [rocAL](https://rocmdocs.amd.com/projects/rocAL/en/latest/)
:::{grid-item-card} {doc}`rocAL <rocal:README>`
The AMD ROCm Augmentation Library (rocAL) is designed to efficiently decode and process images and videos from a variety of storage formats and modify them through a processing graph programmable by the user. rocAL currently provides C API.
- [Documentation](https://rocmdocs.amd.com/projects/rocAL/en/latest/)
- {doc}`Documentation <rocal:README>`
:::

View File

@@ -1,37 +0,0 @@
# Framework Compatibility
The ROCm release supports the most recent and two prior releases of PyTorch and TensorFlow.
Legends:
Blue: Shows compatibility tested versions
Gray: Not tested
![With PyTorch](../../data/framework_compatibility/with_pytorch.png)
![With TensorFlow](../../data/framework_compatibility/with_tensorflow.png)
## Supported Frameworks
This section contains the latest release notes for each framework compatible with ROCm™ and Deep Learning (DL) applications.
The ROCm 5.4 platform supports the following frameworks:
- PyTorch v1.12.1
- MAGMA v2.5.4
- TensorFlow v2.10.0
### PyTorch
For the latest release of PyTorch, refer to <a href="https://github.com/pytorch/pytorch/releases/" target="_blank">https://github.com/pytorch/pytorch/releases/</a>
### MAGMA
For the latest release of MAGMA, refer to <a href="https://icl.utk.edu/magma/index.html" target="_blank">https://icl.utk.edu/magma/index.html</a>
### TensorFlow
For the latest release of TensorFlow, refer to <a href="https://github.com/tensorflow/tensorflow/releases/" target="_blank">https://github.com/tensorflow/tensorflow/releases</a>

View File

@@ -5,33 +5,33 @@ ROCm template libraries for algorithms are as follows:
:::::{grid} 1 1 3 3
:gutter: 1
:::{grid-item-card} [rocPRIM](https://rocmdocs.amd.com/projects/rocPRIM/en/latest/)
:::{grid-item-card} {doc}`rocPRIM <rocprim:index>`
rocPRIM is an AMD GPU optimized template library of algorithm primitives, like
transforms, reductions, scans, etc. It also serves as a common back-end for
similar libraries found inside ROCm.
- [Documentation](https://rocmdocs.amd.com/projects/rocPRIM/en/latest/)
- {doc}`Documentation <rocprim:index>`
- [Changelog](https://github.com/ROCmSoftwarePlatform/rocPRIM/blob/develop/CHANGELOG.md)
- [Examples](https://github.com/amd/rocm-examples/tree/develop/Libraries/rocPRIM)
:::
:::{grid-item-card} [rocThrust](https://rocmdocs.amd.com/projects/rocThrust/en/latest/)
:::{grid-item-card} {doc}`rocThrust <rocthrust:index>`
rocThrust is a template library of algorithm primitives with a Thrust-compatible
interface. Their CPU back-ends are identical, while the GPU back-end calls into
rocPRIM.
- [Documentation](https://rocmdocs.amd.com/projects/rocThrust/en/latest/)
- {doc}`Documentation <rocthrust:index>`
- [Changelog](https://github.com/ROCmSoftwarePlatform/rocThrust/blob/develop/CHANGELOG.md)
- [Examples](https://github.com/amd/rocm-examples/tree/develop/Libraries/rocThrust)
:::
:::{grid-item-card} [hipCUB](https://rocmdocs.amd.com/projects/hipCUB/en/latest/)
:::{grid-item-card} {doc}`hipCUB <hipcub:index>`
hipCUB is a template library of algorithm primitives with a CUB-compatible
interface. Its back-end is rocPRIM.
- [Documentation](https://rocmdocs.amd.com/projects/hipCUB/en/latest/)
- {doc}`Documentation <hipcub:index>`
- [Changelog](https://github.com/ROCmSoftwarePlatform/hipCUB/blob/develop/CHANGELOG.md)
- [Examples](https://github.com/amd/rocm-examples/tree/develop/Libraries/hipCUB)

View File

@@ -3,13 +3,13 @@
:::::{grid} 1 1 1 1
:gutter: 1
:::{grid-item-card} [RCCL](https://rocmdocs.amd.com/projects/rccl/en/latest/)
:::{grid-item-card} {doc}`RCCL <rccl:index>`
RCCL (pronounced "Rickle") is a stand-alone library of standard collective communication routines for GPUs,
implementing all-reduce, all-gather, reduce, broadcast, reduce-scatter, gather, scatter, and all-to-all.
The collective operations are implemented using ring and tree algorithms and have been optimized for
throughput and latency.
- [Documentation](https://rocmdocs.amd.com/projects/rccl/en/latest/)
- {doc}`Documentation <rccl:index>`
- [Changelog](https://github.com/ROCmSoftwarePlatform/rccl/blob/develop/CHANGELOG.md)
- [Examples](https://github.com/ROCmSoftwarePlatform/rccl/tree/develop/tools)

View File

@@ -5,20 +5,20 @@ ROCm libraries for FFT are as follows:
:::::{grid} 1 1 2 2
:gutter: 1
:::{grid-item-card} [rocFFT](https://rocmdocs.amd.com/projects/rocFFT/en/latest/)
:::{grid-item-card} {doc}`rocFFT <rocfft:index>`
rocFFT is an AMD GPU optimized library for FFT.
- [Documentation](https://rocmdocs.amd.com/projects/rocFFT/en/latest/)
- {doc}`Documentation <rocfft:index>`
- [Changelog](https://github.com/ROCmSoftwarePlatform/rocFFT/blob/develop/CHANGELOG.md)
:::
:::{grid-item-card} [hipFFT](https://rocmdocs.amd.com/projects/hipFFT/en/latest/)
:::{grid-item-card} {doc}`hipFFT <hipfft:index>`
hipFFT is a compatibility layer for GPU accelerated FFT optimized for AMD GPUs
using rocFFT. hipFFT allows for a common interface for other non-AMD GPU
FFT libraries.
- [Documentation](https://rocmdocs.amd.com/projects/hipFFT/en/latest/)
- {doc}`Documentation <hipfft:index>`
- [Changelog](https://github.com/ROCmSoftwarePlatform/hipFFT/blob/develop/CHANGELOG.md)
:::

View File

@@ -5,85 +5,85 @@ ROCm libraries for linear algebra are as follows:
:::::{grid} 1 1 2 2
:gutter: 1
:::{grid-item-card} [rocBLAS](https://rocmdocs.amd.com/projects/rocBLAS/en/develop/)
:::{grid-item-card} {doc}`rocBLAS <rocblas:index>`
`rocBLAS` is an AMD GPU optimized library for BLAS (Basic Linear Algebra Subprograms).
- [Documentation](https://rocmdocs.amd.com/projects/rocBLAS/en/develop/)
- {doc}`Documentation <rocblas:index>`
- [Changelog](https://github.com/ROCmSoftwarePlatform/rocBLAS/blob/develop/CHANGELOG.md)
- [Examples](https://github.com/amd/rocm-examples/tree/develop/Libraries/rocBLAS)
:::
:::{grid-item-card} [hipBLAS](https://rocmdocs.amd.com/projects/hipBLAS/en/develop/)
:::{grid-item-card} {doc}`hipBLAS <hipblas:index>`
`hipBLAS` is a compatibility layer for GPU accelerated BLAS optimized for AMD GPUs
via `rocBLAS` and `rocSOLVER`. `hipBLAS` allows for a common interface for other GPU
BLAS libraries.
- [Documentation](https://rocmdocs.amd.com/projects/hipBLAS/en/develop/)
- {doc}`Documentation <hipblas:index>`
- [Changelog](https://github.com/ROCmSoftwarePlatform/hipBLAS/blob/develop/CHANGELOG.md)
:::
:::{grid-item-card} [hipBLASLt](https://rocmdocs.amd.com/projects/hipBLASLt/en/develop/)
:::{grid-item-card} {doc}`hipBLASLt <hipblaslt:index>`
`hipBLASLt` is a library that provides general matrix-matrix operations with a
flexible API and extends functionality beyond that of a traditional BLAS library.
`hipBLASLt` exposes its APIs in the HIP programming language, with an underlying
optimized generator as a back-end kernel provider.
- [Documentation](https://rocmdocs.amd.com/projects/hipBLASLt/en/develop/)
- {doc}`Documentation <hipblaslt:index>`
- [Changelog](https://github.com/ROCmSoftwarePlatform/hipBLASLt/blob/develop/CHANGELOG.md)
:::
:::{grid-item-card} [rocALUTION](https://rocmdocs.amd.com/projects/rocALUTION/en/develop/)
:::{grid-item-card} {doc}`rocALUTION <rocalution:index>`
`rocALUTION` is a sparse linear algebra library with focus on exploring
fine-grained parallelism on top of AMD's ROCm runtime and toolchains, targeting
modern CPU and GPU platforms.
- [Documentation](https://rocmdocs.amd.com/projects/rocALUTION/en/develop/)
- {doc}`Documentation <rocalution:index>`
- [Changelog](https://github.com/ROCmSoftwarePlatform/rocALUTION/blob/develop/CHANGELOG.md)
:::
:::{grid-item-card} [rocWMMA](https://rocmdocs.amd.com/projects/rocWMMA/en/develop/)
:::{grid-item-card} {doc}`rocWMMA <rocwmma:index>`
`rocWMMA` provides an API to break down mixed precision matrix multiply-accumulate
(MMA) problems into fragments and distributes these over GPU wavefronts.
- [Documentation](https://rocmdocs.amd.com/projects/rocWMMA/en/develop/)
- {doc}`Documentation <rocwmma:index>`
- [Changelog](https://github.com/ROCmSoftwarePlatform/rocWMMA/blob/develop/CHANGELOG.md)
:::
:::{grid-item-card} [rocSOLVER](https://rocmdocs.amd.com/projects/rocSOLVER/en/develop/)
:::{grid-item-card} {doc}`rocSOLVER <rocsolver:index>`
`rocSOLVER` provides a subset of LAPACK (Linear Algebra Package) functionality on the ROCm platform.
- [Documentation](https://rocmdocs.amd.com/projects/rocSOLVER/en/develop/)
- {doc}`Documentation <rocsolver:index>`
- [Changelog](https://github.com/ROCmSoftwarePlatform/rocSOLVER/blob/develop/CHANGELOG.md)
:::
:::{grid-item-card} [hipSOLVER](https://rocmdocs.amd.com/projects/hipSOLVER/en/develop/)
:::{grid-item-card} {doc}`hipSOLVER <hipsolver:index>`
`hipSOLVER` is a LAPACK marshalling library supporting both `rocSOLVER` and `cuSOLVER`
as backends whilst exporting a unified interface.
- [Documentation](https://rocmdocs.amd.com/projects/hipSOLVER/en/develop/)
- {doc}`Documentation <hipsolver:index>`
- [Changelog](https://github.com/ROCmSoftwarePlatform/hipSOLVER/blob/develop/CHANGELOG.md)
:::
:::{grid-item-card} [rocSPARSE](https://rocmdocs.amd.com/projects/rocSOLVER/en/develop/)
:::{grid-item-card} {doc}`rocSPARSE <rocsparse:index>`
`rocSPARSE` is a library to provide BLAS for sparse computations.
- [Documentation](https://rocmdocs.amd.com/projects/rocSOLVER/en/develop/)
- {doc}`Documentation <rocsparse:index>`
- [Changelog](https://github.com/ROCmSoftwarePlatform/rocSPARSE/blob/develop/CHANGELOG.md)
:::
:::{grid-item-card} [hipSPARSE](https://rocmdocs.amd.com/projects/hipSOLVER/en/develop/)
:::{grid-item-card} {doc}`hipSPARSE <hipsparse:index>`
`hipSPARSE` is a marshalling library to provide sparse BLAS functionality,
supporting both `rocSPARSE` and `cuSPARSE` as backends.
- [Documentation](https://rocmdocs.amd.com/projects/hipSOLVER/en/develop/)
- {doc}`Documentation <hipsparse:index>`
- [Changelog](https://github.com/ROCmSoftwarePlatform/hipSPARSE/blob/develop/CHANGELOG.md)
:::

View File

@@ -12,30 +12,35 @@ vendor libraries as their back-ends. Due to their static dispatch nature, suppor
at compile-time of the hipLIB in question. For dynamic dispatch between vendor implementations, refer to the
[Orochi](https://github.com/GPUOpen-LibrariesAndSDKs/Orochi) library.
::::{grid} 1 2 3 3
:gutter: 1
:::{grid-item-card} [Linear Algebra Libraries](linear_algebra)
- [rocBLAS](https://rocmdocs.amd.com/projects/rocBLAS/en/develop/)
- [hipBLAS](https://rocmdocs.amd.com/projects/hipBLAS/en/develop/)
- [hipBLASLt](https://rocmdocs.amd.com/projects/hipBLASLt/en/develop/)
- [rocALUTION](https://rocmdocs.amd.com/projects/rocALUTION/en/develop/)
- [rocWMMA](https://rocmdocs.amd.com/projects/rocWMMA/en/develop/)
- [rocSOLVER](https://rocmdocs.amd.com/projects/rocSOLVER/en/develop/)
- [hipSOLVER](https://rocmdocs.amd.com/projects/hipSOLVER/en/develop/)
- [rocSPARSE](https://rocmdocs.amd.com/projects/rocSPARSE/en/develop/)
- [hipSPARSE](https://rocmdocs.amd.com/projects/hipSPARSE/en/develop/)
- {doc}`rocBLAS <rocblas:index>`
- {doc}`hipBLAS <hipblas:index>`
- {doc}`hipBLASLt <hipblaslt:index>`
- {doc}`rocALUTION <rocalution:index>`
- {doc}`rocWMMA <rocwmma:index>`
- {doc}`rocSOLVER <rocsolver:index>`
- {doc}`hipSOLVER <hipsolver:index>`
- {doc}`rocSPARSE <rocsparse:index>`
- {doc}`hipSPARSE <hipsparse:index>`
:::
:::{grid-item-card} [Fast Fourier Transforms](fft)
- [rocFFT](https://rocmdocs.amd.com/projects/rocFFT/en/develop/)
- [hipFFT](https://rocmdocs.amd.com/projects/hipFFT/en/develop/)
- {doc}`rocFFT <rocfft:index>`
- {doc}`hipFFT <hipfft:index>`
:::
:::{grid-item-card} [Random Numbers](rand)
- [rocRAND](https://rocmdocs.amd.com/projects/rocRAND/en/develop/)
- [hipRAND](https://rocmdocs.amd.com/projects/hipRAND/en/develop/)
- {doc}`rocRAND <rocrand:index>`
- {doc}`hipRAND <hiprand:index>`
:::
::::

View File

@@ -3,21 +3,21 @@
:::::{grid} 1 1 2 2
:gutter: 1
:::{grid-item-card} [rocRAND](https://rocmdocs.amd.com/projects/rocRAND/en/latest/)
:::{grid-item-card} {doc}`rocRAND <rocrand:index>`
rocRAND is an AMD GPU optimized library for pseudo-random number generators (PRNG).
- [Documentation](https://rocmdocs.amd.com/projects/rocRAND/en/latest/)
- {doc}`Documentation <rocrand:index>`
- [Changelog](https://github.com/ROCmSoftwarePlatform/rocRAND/blob/develop/CHANGELOG.md)
- [Examples](https://github.com/amd/rocm-examples/tree/develop/Libraries/rocRAND)
:::
:::{grid-item-card} [hipRAND](https://rocmdocs.amd.com/projects/hipRAND/en/latest/)
hipRAND is a compatibility layer for GPU accelerated FFT optimized for AMD GPUs
using rocFFT. hipFFT allows for a common interface for other non AMD GPU
FFT libraries.
:::{grid-item-card} {doc}`hipRAND <hiprand:index>`
hipRAND is a compatibility layer for GPU accelerated pseudo-random number
generation (PRNG) optimized for AMD GPUs using rocRAND. hipRAND allows for a
common interface for other non-AMD GPU PRNG libraries.
- [Documentation](https://rocmdocs.amd.com/projects/hipRAND/en/latest/)
- {doc}`Documentation <hiprand:index>`
- [Changelog](https://github.com/ROCmSoftwarePlatform/hipRAND/blob/develop/CHANGELOG.md)
:::

View File

@@ -1,16 +1,18 @@
# HIP
HIP is both AMD's GPU programming language extension and the GPU runtime. This page introduces the HIP runtime and other HIP libraries and tools.
HIP is both AMD's GPU programming language extension and the GPU runtime. This
page introduces the HIP runtime and other HIP libraries and tools.
## HIP Runtime
:::::{grid} 1 1 2 2
:gutter: 1
:::{grid-item-card} [HIP Runtime](https://rocmdocs.amd.com/projects/HIP/en/develop/)
The HIP Runtime is used to enable GPU acceleration for all HIP language based products.
:::{grid-item-card} {doc}`HIP Runtime <hip:index>`
The HIP Runtime is used to enable GPU acceleration for all HIP language based
products.
- [HIP Runtime API Manual](https://rocmdocs.amd.com/projects/HIP/en/develop/)
- {doc}`hip:.doxygen/docBin/html/index`
- [Examples](https://github.com/amd/rocm-examples/tree/develop/HIP-Basic)
:::
@@ -19,14 +21,14 @@ The HIP Runtime is used to enable GPU acceleration for all HIP language based pr
## Porting tools
:::::{grid} 1 1 1 1
:::::{grid} 1 1 2 2
:gutter: 1
:::{grid-item-card} [HIPify](https://rocm.docs.amd.com/projects/HIPIFY/en/latest/)
HIPify assists with porting applications from based on CUDA to the HIP Runtime. Supported
CUDA APIs are documented here as well.
:::{grid-item-card} {doc}`HIPIFY <hipify:index>`
HIPIFY assists with porting CUDA-based applications to the HIP Runtime.
Supported CUDA APIs are documented here as well.
- [Reference Manual](https://rocm.docs.amd.com/projects/HIPIFY/en/latest/)
- {doc}`Reference Manual <hipify:index>`
:::

View File

@@ -3,31 +3,28 @@
:::::{grid} 1 1 3 3
:gutter: 1
:::{grid-item-card} [AMD SMI](https://rocmdocs.amd.com/projects/amdsmi/en/latest/)
GO AMD SMI provides GO binding for [E-SMI In-Band C library](https://github.com/amd/esmi_ib_library),
[ROCm SMI Library](https://github.com/RadeonOpenCompute/rocm_smi_lib), and any
GO language application that needs to link with these libraries and call the APIs
from the GO application. The GO binding are imported in the
[AMD SMI Exporter](https://github.com/amd/amd_smi_exporter) to export information
provided by the AMD E-SMI inband library and the ROCm SMI GPU library to the Prometheus server.
:::{grid-item-card} AMD SMI
The AMD System Management Interface Library, or AMD SMI library, is a C library for Linux that provides a user space interface for applications to monitor and control AMD devices.
- [Documentation](https://rocmdocs.amd.com/projects/amdsmi/en/latest/)
- [GitHub](https://github.com/RadeonOpenCompute/amdsmi)
- [Examples](https://github.com/amd/go_amd_smi#example)
:::
:::{grid-item-card} [ROCm SMI](https://rocmdocs.amd.com/projects/rocmsmi/en/latest/)
:::{grid-item-card} [ROCm SMI](https://rocmdocs.amd.com/projects/rocm_smi_lib/en/latest/)
This tool acts as a command line interface for manipulating and monitoring the AMD GPU kernel, and is intended to replace and deprecate the existing `rocm_smi.py` CLI tool. It uses `ctypes` to call the `rocm_smi_lib` API.
- [Documentation](https://rocmdocs.amd.com/projects/rocmsmi/en/latest/)
- [Documentation](https://rocmdocs.amd.com/projects/rocm_smi_lib/en/latest/)
- [GitHub](https://github.com/RadeonOpenCompute/rocm_smi_lib)
- [Examples](https://github.com/RadeonOpenCompute/rocm_smi_lib/tree/master/python_smi_tools)
:::
:::{grid-item-card} [ROCm Datacenter Tool](https://rocmdocs.amd.com/projects/rdc/en/latest/)
:::{grid-item-card} {doc}`ROCm Datacenter Tool <rdc:index>`
The ROCm™ Data Center Tool simplifies the administration and addresses key infrastructure challenges in AMD GPUs in cluster and data center environments.
- [Documentation](https://rocmdocs.amd.com/projects/rdc/en/latest/)
- {doc}`Documentation <rdc:index>`
- [GitHub](https://github.com/RadeonOpenCompute/rdc)
- [Examples](https://github.com/RadeonOpenCompute/rdc/tree/master/example)
:::

View File

@@ -1,6 +1,6 @@
# OpenMP Support in ROCm
## Introduction to OpenMP Support Guide
## Introduction
The ROCm™ installation includes an LLVM-based implementation that fully supports
the OpenMP 4.5 standard and a subset of OpenMP 5.0, 5.1, and 5.2 standards.
@@ -9,8 +9,7 @@ Along with host APIs, the OpenMP compilers support offloading code and data onto
GPU devices. This document briefly describes the installation location of the
OpenMP toolchain, example usage of device offloading, and usage of `rocprof`
with OpenMP applications. The GPUs supported are the same as those supported by
this ROCm release. See the list of supported GPUs in the installation guide at
[https://docs.amd.com/](https://docs.amd.com/).
this ROCm release. See the list of supported GPUs in {doc}`/release/gpu_os_support`.
### Installation
@@ -97,7 +96,7 @@ code compiled with AOMP:
```
The stats option produces timestamps for the kernels. Look into the output
CSV file for the field, `Durations`, which is useful in getting an
CSV file for the field, `DurationNs`, which is useful in getting an
understanding of the critical kernels in the code.
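As a rough illustration of how the `DurationNs` field can be used, the sketch below sums it per kernel and lists the longest-running kernels first. It is not part of `rocprof`; the function name `total_duration_by_kernel`, the `KernelName` column name, and the sample rows are assumptions made for the example, so adjust them to match the CSV file produced on your system.

```python
import csv
import io
from collections import defaultdict


def total_duration_by_kernel(csv_text, name_column="KernelName",
                             duration_column="DurationNs"):
    """Sum the duration column per kernel; return (name, total_ns), longest first.

    The column names are assumptions; check the header of your own CSV file.
    """
    totals = defaultdict(int)
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[row[name_column]] += int(row[duration_column])
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Hypothetical sample rows; real profiler output contains more columns.
sample = """KernelName,DurationNs
vector_add,1200
matmul,5400
vector_add,800
"""

for name, total_ns in total_duration_by_kernel(sample):
    print(f"{name}: {total_ns} ns")
```

To use it on a real run, read the output file into a string (for example with `open(path).read()`) and pass that string in place of `sample`.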
Apart from `--stats`, the option `--timestamp on` produces a timestamp for
@@ -110,7 +109,7 @@ code compiled with AOMP:
an XML file as an input.
For more details on `rocprof`, refer to the ROCm Profiling Tools document on
<https://docs.amd.com>.
{doc}`rocprofiler:rocprof`.
### Using Tracing Options
@@ -118,7 +117,7 @@ For more details on `rocprof`, refer to the ROCm Profiling Tools document on
program with:
```bash
-Wl,rpath,/opt/rocm-{version}/lib -lamdhip64
-Wl,-rpath,/opt/rocm-{version}/lib -lamdhip64
```
The following tracing options are widely used to generate useful information:
@@ -137,7 +136,7 @@ Navigate to Chrome or Perfetto and load the JSON file to see the timeline of the
HSA calls.
For more details on tracing, refer to the ROCm Profiling Tools document on
<https://docs.amd.com>.
{doc}`rocprofiler:rocprof`.
### Environment Variables
@@ -157,6 +156,8 @@ For more details on tracing, refer to the ROCm Profiling Tools document on
The OpenMP programming model is greatly enhanced with the following new features
implemented in the past releases.
(openmp_usm)=
### Unified Shared Memory
Unified Shared Memory (USM) provides a pointer-based approach to memory

View File

@@ -666,9 +666,8 @@ The following OpenMP pragma is available on MI200, and it must be executed with
omp requires unified_shared_memory
```
For more details on
[USM](https://docs.amd.com/bundle/OpenMP-Support-Guide-v5.4/page/OpenMP_Features.html#d90e61),
refer to the OpenMP Support Guide at [https://docs.amd.com](https://docs.amd.com).
For more details on USM, refer to the {ref}`openmp_usm` section of the OpenMP
Guide.
### Support Status of Other Clang Options

View File

@@ -3,19 +3,19 @@
:::::{grid} 1 1 2 2
:gutter: 1
:::{grid-item-card} [RVS](https://rocm.docs.amd.com/projects/ROCmValidationSuite/en/latest/)
:::{grid-item-card} {doc}`RVS <rocmvalidationsuite:index>`
The ROCm Validation Suite is a system administrator's and cluster manager's tool for detecting and troubleshooting common problems affecting AMD GPU(s) running in a high-performance computing environment, enabled using the ROCm software stack on a compatible platform.
- [Documentation](https://rocm.docs.amd.com/projects/ROCmValidationSuite/en/latest/)
- {doc}`Documentation <rocmvalidationsuite:index>`
:::
:::{grid-item-card} [TransferBench](https://rocmdocs.amd.com/projects/TransferBench/en/latest/)
:::{grid-item-card} {doc}`TransferBench <transferbench:index>`
TransferBench is a simple utility capable of benchmarking simultaneous transfers between user-specified devices (CPUs/GPUs).
- [Documentation](https://rocmdocs.amd.com/projects/TransferBench/en/latest/)
- {doc}`Documentation <transferbench:index>`
- [Changelog](https://github.com/ROCmSoftwarePlatform/TransferBench/blob/develop/CHANGELOG.md)
- [Examples](https://rocmdocs.amd.com/projects/TransferBench/en/develop/examples/index.html#examples)
- {doc}`transferbench:examples/index`
:::

View File

@@ -0,0 +1,48 @@
# 3rd Party Support Matrix
ROCm™ supports various 3rd party libraries and frameworks. Supported versions
are tested and known to work. Non-supported versions of 3rd parties may also
work, but aren't tested.
(ml_framework_compat_matrix)=
## Deep Learning
ROCm releases support the most recent and two prior releases of PyTorch and
TensorFlow.
| ROCm | [PyTorch](https://github.com/pytorch/pytorch/releases/) | [TensorFlow](https://github.com/tensorflow/tensorflow/releases/) | [MAGMA](https://icl.utk.edu/magma/index.html) |
|:------|:--------------------------:|:--------------------:|:-----:|
| 5.0.2 | 1.8, 1.9, 1.10 | 2.6, 2.7, 2.8 | |
| 5.1.3 | 1.9, 1.10, 1.11 | 2.7, 2.8, 2.9 | |
| 5.2.x | 1.10, 1.11, 1.12 | 2.8, 2.9, 2.9 | |
| 5.3.x | 1.10.1, 1.11, 1.12.1, 1.13 | 2.8, 2.9, 2.10 | |
| 5.4.x | 1.10.1, 1.11, 1.12.1, 1.13 | 2.8, 2.9, 2.10, 2.11 | 2.5.4 |
## Communication libraries
ROCm supports [OpenUCX](https://openucx.org/), "an open-source,
production-grade communication framework for data-centric and high-performance
applications".
| UCX version | ROCm 5.4 and older | ROCm 5.5 and newer |
|:----------|:------------------:|:------------------:|
| -1.14.0 | COMPATIBLE | INCOMPATIBLE |
| 1.14.1+ | COMPATIBLE | COMPATIBLE |
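The table can be read as a simple rule, sketched below in Python. This is only an illustration under two assumptions: the `-1.14.0` row is taken to mean "UCX 1.14.0 and older", and the helper names `parse_version` and `ucx_compatible` are hypothetical, not part of ROCm or UCX.

```python
def parse_version(text):
    """Turn a dotted version such as '1.14.1' into a tuple of ints for comparison."""
    return tuple(int(part) for part in text.split("."))


def ucx_compatible(ucx_version, rocm_version):
    """Return True when the table above marks the pair as COMPATIBLE.

    Assumption: the '-1.14.0' row covers UCX 1.14.0 and all older versions.
    """
    ucx = parse_version(ucx_version)
    rocm = parse_version(rocm_version)
    if ucx >= (1, 14, 1):
        # UCX 1.14.1 and newer is listed as compatible with every ROCm column.
        return True
    # Older UCX is only listed as compatible with ROCm 5.4 and older.
    return rocm < (5, 5)


print(ucx_compatible("1.14.0", "5.4.3"))
print(ucx_compatible("1.14.0", "5.5.0"))
print(ucx_compatible("1.14.1", "5.6.0"))
```

Versions are compared as integer tuples rather than strings so that, for example, `1.9.0` sorts before `1.14.0`.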
## Algorithm libraries
ROCm releases provide algorithm libraries with interfaces compatible with
contemporary CUDA / NVIDIA HPC SDK alternatives.
- Thrust → rocThrust
- CUB → hipCUB
| ROCm | Thrust / CUB | HPC SDK |
|:------|:------------:|:-------:|
| 5.0.2 | 1.14 | 21.9 |
| 5.1.3 | 1.15 | 22.1 |
| 5.2.x | 1.15 | 22.2, 22.3 |
For the latest documentation of these libraries, refer to the
[associated documentation](../reference/gpu_libraries/c%2B%2B_primitives.md).

View File

@@ -1,5 +1,29 @@
# Compatibility
[Frameworks Support Matrix](docker_support_matrix.md)
:::::{grid} 1 1 2 2
:gutter: 1
[Framework Compatibility](../reference/framework_compatibility/framework_compatibility)
:::{grid-item-card} User space & Kernel Fusion Driver
Forward and backward compatibility of ROCm user space components and the
kernel space Kernel Fusion Driver (KFD).
- [User/Kernel-Space Support Matrix](./user_kernel_space_compat_matrix.md)
:::
:::{grid-item-card} Docker Image Support
ROCm releases several Docker container images.
- [Docker Image Support Matrix](./docker_image_support_matrix.md)
:::
:::{grid-item-card} 3rd Party Support
Several 3rd party libraries ship with ROCm enablement, and several ROCm
components provide interfaces compatible with 3rd party solutions.
- [3rd Party Support Matrix](./3rd_party_support_matrix.md)
:::
:::::

View File

@@ -1,4 +1,4 @@
# Frameworks Support Matrix
# Docker Image Support Matrix
The software support matrices for ROCm container releases are listed below.
@@ -61,7 +61,7 @@ The software support matrices for ROCm container releases is listed.
* [ROCm5.6](https://repo.radeon.com/rocm/apt/latest/)
* [Python 3.9](https://www.python.org/downloads/release/python-390/)
* [tensorflow-rocm 2.13.0]()
* `tensorflow-rocm` 2.13.0
* [OFED 5.3](https://content.mellanox.com/ofed/MLNX_OFED-5.3-1.0.5.0/MLNX_OFED_LINUX-5.3-1.0.5.0-ubuntu20.04-x86_64.tgz)
* [OMPI 4.0.7](https://github.com/open-mpi/ompi/tree/v4.0.7)
* [Horovod 0.27.0](https://github.com/horovod/horovod/tree/v0.27.0)
@@ -71,7 +71,7 @@ The software support matrices for ROCm container releases is listed.
* [ROCm5.6](https://repo.radeon.com/rocm/apt/latest/)
* [Python 3.9](https://www.python.org/downloads/release/python-390/)
* [tensorflow-rocm 2.11.0](https://pypi.org/project/tensorflow-rocm/2.11.0.540/)
* [`tensorflow-rocm` 2.11.0](https://pypi.org/project/tensorflow-rocm/2.11.0.540/)
* [OFED 5.3](https://content.mellanox.com/ofed/MLNX_OFED-5.3-1.0.5.0/MLNX_OFED_LINUX-5.3-1.0.5.0-ubuntu20.04-x86_64.tgz)
* [OMPI 4.0.7](https://github.com/open-mpi/ompi/tree/v4.0.7)
* [Horovod 0.27.0](https://github.com/horovod/horovod/tree/v0.27.0)
@@ -81,7 +81,7 @@ The software support matrices for ROCm container releases is listed.
* [ROCm5.6](https://repo.radeon.com/rocm/apt/latest/)
* [Python 3.9](https://www.python.org/downloads/release/python-390/)
* [tensorflow-rocm 2.10.1](https://pypi.org/project/tensorflow-rocm/2.10.1.540/)
* [`tensorflow-rocm` 2.10.1](https://pypi.org/project/tensorflow-rocm/2.10.1.540/)
* [OFED 5.3](https://content.mellanox.com/ofed/MLNX_OFED-5.3-1.0.5.0/MLNX_OFED_LINUX-5.3-1.0.5.0-ubuntu20.04-x86_64.tgz)
* [OMPI 4.0.7](https://github.com/open-mpi/ompi/tree/v4.0.7)
* [Horovod 0.27.0](https://github.com/horovod/horovod/tree/v0.27.0)

View File

@@ -24,20 +24,25 @@ ROCm supports virtualization for select GPUs only as shown below.
| VMWare | ESXi 8 | MI210 | Ubuntu 20.04 (`5.15.0-56-generic`), SLES 15 SP4 (`5.14.21-150400.24.18-default`) |
| VMWare | ESXi 7 | MI210 | Ubuntu 20.04 (`5.15.0-56-generic`), SLES 15 SP4 (`5.14.21-150400.24.18-default`) |
(supported_gpus)=
## GPU Support Table
::::{tab-set}
:::{tab-item} Instinct™
:::{tab-item} AMD Instinct™
:sync: instinct
Use Driver Shipped with ROCm
| GPU | Architecture | Product | [LLVM Target](https://www.llvm.org/docs/AMDGPUUsage.html#processors) | Linux | Windows |
|:-----------------:|:---------------:|:-------:|:--------------------------------------------------------------------:|:------------------------------------:|:-----------:|
| AMD Instinct™ MI250X | CDNA2 | Full | gfx90a | Supported | Unsupported |
| AMD Instinct™ MI250 | CDNA2 | Full | gfx90a | Supported | Unsupported |
| AMD Instinct™ MI210 | CDNA2 | Full | gfx90a | Supported | Unsupported |
| AMD Instinct™ MI100 | CDNA | Full | gfx908 | Supported | Unsupported |
| AMD Instinct™ MI50 | Vega | Full | gfx906 | Supported | Unsupported |
| Product Name | Architecture | [LLVM Target](https://www.llvm.org/docs/AMDGPUUsage.html#processors) |Support |
|:------------:|:------------:|:--------------------------------------------------------------------:|:-------:|
| AMD Instinct™ MI250X | CDNA2 | gfx90a | ✅ |
| AMD Instinct™ MI250 | CDNA2 | gfx90a | ✅ |
| AMD Instinct™ MI210 | CDNA2 | gfx90a | ✅ |
| AMD Instinct™ MI100 | CDNA | gfx908 | ✅ |
| AMD Instinct™ MI50 | GCN5.1 | gfx906 | ✅ |
| AMD Instinct™ MI25 | GCN5.0 | gfx900 | ❌ |
:::
@@ -46,69 +51,11 @@ Use Driver Shipped with ROCm
[Use Radeon Pro Driver](https://www.amd.com/en/support/linux-drivers)
This table is incomplete.
| GPU | Architecture | SW Level | [LLVM Target](https://www.llvm.org/docs/AMDGPUUsage.html#processors) | Linux | Windows |
|:-----------------:|:---------------:|:--------:|:--------------------------------------------------------------------:|:------------------------------------:|:-----------:|
| AMD Radeon™ Pro W6800 | RDNA2 | Full | gfx1030 | Supported | Supported |
| AMD Radeon™ Pro V620 | RDNA2 | Full | gfx1030 | Supported | Unsupported |
:::
:::{tab-item} Radeon™
:sync: radeon
[Use Radeon Pro Driver](https://www.amd.com/en/support/linux-drivers)
This table is incomplete.
| GPU | Architecture | SW Level | [LLVM Target](https://www.llvm.org/docs/AMDGPUUsage.html#processors) | Linux | Windows |
|:------------------:|:--------------:|:----------:|:--------------------------------------------------------------------:|:------------------------------------:|:-----------:|
| AMD Radeon™ RX 6900 XT | RDNA2 |HIP SDK | gfx1030 | Supported | Supported |
| AMD Radeon™ RX 6600 | RDNA2 |HIP Runtime | gfx1031 | Supported | Supported |
| AMD Radeon™ VII | Vega |Full | gfx906 | Supported | Unsupported |
| AMD Radeon™ R9 Fury | Fiji |NA | gfx803 | Community | Unsupported |
:::
::::
### Software Enablement Level
::::{tab-set}
:::{tab-item} AMD Instinct™
:sync: instinct
Instinct™ accelerators support the full stack available in ROCm. Instinct™
accelerators are Linux only.
:::
:::{tab-item} AMD Radeon Pro™
:sync: radeonpro
ROCm software support varies by GPU type and Operating System. ROCm ecosystem
products are three software stack enablement levels that correspond as
described below:
- Full includes all software that is part of the ROCm ecosystem. Please see
[article](link) for details of ROCm.
- HIP SDK includes the HIP Runtime and a selection of GPU libraries for compute.
Please see [article](link) for details of HIP SDK.
- HIP Runtime enables the use of the HIP Runtime only.
:::
:::{tab-item} AMD Radeon™
:sync: radeon
ROCm software support varies by GPU type and Operating System. ROCm ecosystem
products are three software stack enablement levels that correspond as described
below:
- Full includes all software that is part of the ROCm ecosystem. Please see
[article](link) for details of ROCm.
- HIP SDK includes the HIP Runtime and a selection of GPU libraries for compute.
Please see [article](link) for details of HIP SDK.
- HIP enables the use of the HIP Runtime only.
| Name | Architecture |[LLVM Target](https://www.llvm.org/docs/AMDGPUUsage.html#processors) | Support|
|:----:|:------------:|:--------------------------------------------------------------------:|:-------:|
| AMD Radeon™ Pro W6800 | RDNA2 | gfx1030 | ✅ |
| AMD Radeon™ Pro V620 | RDNA2 | gfx1030 | ✅ |
| AMD Radeon™ Pro VII | GCN5.1 | gfx906 | ✅ |
:::
@@ -116,47 +63,11 @@ below:
### Support Status
::::{tab-set}
:::{tab-item} Instinct™
:sync: instinct
- Supported - AMD enables these GPUs in our software distributions for the
corresponding ROCm product.
- Unsupported - This configuration is not enabled in our software distributions.
- Deprecated - Support will be removed in a future release.
:::
:::{tab-item} Radeon Pro™
:sync: radeonpro
GPU support levels for Radeon Pro™
- Supported - AMD enables these GPUs in our software distributions for the
corresponding ROCm product.
- Unsupported - This configuration is not enabled in our software distributions.
- Deprecated - Support will be removed in a future release.
- Community - AMD does not enable these GPUs in our software distributions but
end users are free to enable these GPUs themselves.
:::
:::{tab-item} Radeon™
:sync: radeon
Support levels for Radeon™ GPUs:
- Supported - AMD enables these GPUs in our software distributions for the
corresponding ROCm product.
- Unsupported - This configuration is not enabled in our software distributions.
- Deprecated - Support will be removed in a future release.
- Community - AMD does not enable these GPUs in our software distributions but
end users are free to enable these GPUs themselves.
:::
::::
- ✅: **Supported** - AMD enables these GPUs in our software distributions for
the corresponding ROCm product.
- ⚠️: **Deprecated** - Support will be removed in a future release.
- ❌: **Unsupported** - This configuration is not enabled in our software
distributions.
## CPU Support

View File

@@ -27,7 +27,7 @@ The table is ordered to follow ROCm's manifest file.
| [HIPIFY](https://github.com/ROCm-Developer-Tools/HIPIFY/) | [MIT](https://github.com/ROCm-Developer-Tools/HIPIFY/blob/amd-staging/LICENSE.txt) |
| [HIPCC](https://github.com/ROCm-Developer-Tools/HIPCC/blob/develop/LICENSE.txt) | [MIT](https://github.com/ROCm-Developer-Tools/HIPCC/blob/develop/LICENSE.txt) |
| [llvm-project](https://github.com/ROCm-Developer-Tools/llvm-project/) | [Apache](https://github.com/ROCm-Developer-Tools/llvm-project/blob/main/LICENSE.TXT) |
| rocm-llvm-alt | [AMD Proprietary License]()
| rocm-llvm-alt | [AMD Proprietary License](https://www.amd.com/en/support/amd-software-eula)
| [ROCm-Device-Libs](https://github.com/RadeonOpenCompute/ROCm-Device-Libs/) | [The University of Illinois/NCSA](https://github.com/RadeonOpenCompute/ROCm-Device-Libs/blob/amd-stg-open/LICENSE.TXT) |
| [atmi](https://github.com/RadeonOpenCompute/atmi/) | [MIT](https://github.com/RadeonOpenCompute/atmi/blob/master/LICENSE.txt) |
| [ROCm-CompilerSupport](https://github.com/RadeonOpenCompute/ROCm-CompilerSupport/) | [The University of Illinois/NCSA](https://github.com/RadeonOpenCompute/ROCm-CompilerSupport/blob/amd-stg-open/LICENSE.txt) |
@@ -102,3 +102,23 @@ AMD, the AMD Arrow logo, ROCm, and combinations thereof are trademarks of
Advanced Micro Devices, Inc. Other product names used in this publication are
for identification purposes only and may be trademarks of their respective
companies.
## Package Licensing
```{attention}
AQL Profiler and AOCC CPU Optimizations are both provided in binary form, each
subject to the license agreement enclosed in the directory for the binary, which
is available here: `/opt/rocm/share/doc/rocm-llvm-alt/EULA`. By using, installing,
copying or distributing AQL Profiler and/or AOCC CPU Optimizations, you agree to
the terms and conditions of this license agreement. If you do not agree to the
terms of this agreement, do not install, copy or use the AQL Profiler and/or the
AOCC CPU Optimizations.
```
For the rest of the ROCm packages, you can find the licensing information at the
following location: `/opt/rocm/share/doc/<component-name>/`
For example, you can fetch the licensing information of the `amd_comgr`
component (Code Object Manager) from the `amd_comgr` folder. A file named
`LICENSE.txt` contains the license details at:
`/opt/rocm-5.1.3/share/doc/amd_comgr/LICENSE.txt`
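The lookup described above can be sketched in a few lines of Python. This is only an illustration: the helper name `license_path` is hypothetical, and the default prefix and the `LICENSE.txt` file name are assumptions taken from the example path, so a given installation may differ.

```python
from pathlib import PurePosixPath


def license_path(component: str,
                 rocm_root: str = "/opt/rocm",
                 filename: str = "LICENSE.txt") -> PurePosixPath:
    """Build the expected license location for one ROCm component.

    Assumption: the layout is <rocm_root>/share/doc/<component>/<filename>,
    as in the example in the text. The file name may differ per component.
    """
    return PurePosixPath(rocm_root) / "share" / "doc" / component / filename


# Versioned prefix, as in the amd_comgr example from the text.
print(license_path("amd_comgr", rocm_root="/opt/rocm-5.1.3"))
```

Using `PurePosixPath` keeps the sketch free of file system access, so it only computes where the file is expected to be and does not check that it exists.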

View File

@@ -0,0 +1,12 @@
# User/Kernel-Space Support Matrix
ROCm™ provides forward and backward compatibility between the Kernel Fusion
Driver (KFD) and its user space software for +/- 2 releases. This table shows
the compatibility combinations that are currently supported.
| KFD | Tested user space versions |
|:------|:--------------------------:|
| 5.0.2 | 5.1.0, 5.2.0 |
| 5.1.0 | 5.0.2 |
| 5.1.3 | 5.2.0, 5.3.0 |
| 5.2.0 | 5.0.2, 5.1.3 |
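
As a minimal sketch, the table above can be encoded as a lookup so that a deployment script can check a KFD and user space pairing. The names `TESTED_USER_SPACE` and `is_tested` are hypothetical; the data is copied from the table and nothing beyond the listed combinations is implied.

```python
# Tested combinations, transcribed from the support matrix above.
TESTED_USER_SPACE = {
    "5.0.2": ("5.1.0", "5.2.0"),
    "5.1.0": ("5.0.2",),
    "5.1.3": ("5.2.0", "5.3.0"),
    "5.2.0": ("5.0.2", "5.1.3"),
}


def is_tested(kfd: str, user_space: str) -> bool:
    """Return True only if the pairing is listed in the matrix."""
    return user_space in TESTED_USER_SPACE.get(kfd, ())


print(is_tested("5.1.3", "5.3.0"))
print(is_tested("5.1.0", "5.2.0"))
```

A pairing missing from the matrix is reported as not tested, which is a weaker statement than "does not work".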

View File

@@ -1,6 +1,6 @@
# AMD ROCm™ Platform - Overview
# What is ROCm?
ROCm is an open-source stack for GPU computation. ROCm is primarily Open-Source
ROCm is an open-source stack for GPU computation. ROCm is primarily Open-Source
Software (OSS) that allows developers the freedom to customize and tailor their
GPU software for their own needs while collaborating with a community of other
developers, and helping each other find solutions in an agile, flexible, rapid

View File

@@ -19,27 +19,42 @@ subtrees:
title: Installation Overview
- file: deploy/linux/prerequisites
title: Prerequisites
- file: deploy/linux/install
title: Installation
- file: deploy/linux/upgrade
title: Upgrade
- file: deploy/linux/uninstall
title: Uninstallation
- file: deploy/linux/package_manager_integration
- file: deploy/linux/os-native/index
subtrees:
- entries:
- file: deploy/linux/os-native/install
title: Installation
- file: deploy/linux/os-native/upgrade
title: Upgrade
- file: deploy/linux/os-native/uninstall
title: Uninstallation
- file: deploy/linux/os-native/package_manager_integration
- file: deploy/linux/installer/index
subtrees:
- entries:
- file: deploy/linux/installer/install
title: Installation
- file: deploy/linux/installer/upgrade
title: Upgrade
- file: deploy/linux/installer/uninstall
title: Uninstallation
- file: deploy/docker
title: Docker
- caption: Release Info
entries:
- file: release
- file: CHANGELOG
title: Changelog
- file: release/gpu_os_support
- url: https://github.com/RadeonOpenCompute/ROCm/labels/Verified%20Issue
title: Known Issues
- file: release/compatibility
subtrees:
- entries:
- file: release/docker_support_matrix
- file: reference/framework_compatibility/framework_compatibility
- file: release/user_kernel_space_compat_matrix
- file: release/docker_image_support_matrix
- file: release/3rd_party_support_matrix
- file: release/licensing
@@ -47,26 +62,26 @@ subtrees:
entries:
- file: reference/all
- file: reference/compilers
title: Compilers and Development Tools
title: Compilers and Tools
subtrees:
- entries:
- file: reference/rocmcc/rocmcc
title: ROCmCC
- url: https://rocmdocs.amd.com/projects/ROCmCC/en/{branch}/
- url: ${project:rocgdb}
title: ROCgdb
- url: https://rocmdocs.amd.com/projects/ROCgdb/en/hybrid/
- url: ${project:rocprofiler}
title: rocprofiler
- url: https://rocmdocs.amd.com/projects/rocprofiler/en/{branch}/
- url: ${project:roctracer}
title: roctracer
- url: https://rocmdocs.amd.com/projects/roctracer/en/{branch}/
title: ROCdbgapi
- url: ${project:rocdbgapi}
title: ROCdbgapi
- file: reference/hip
subtrees:
- entries:
- title: HIP Runtime API
url: https://rocmdocs.amd.com/projects/HIP/en/{branch}/
url: ${project:hip}
- title: HIPify - Port Your Code
url: https://advanced-micro-devices-demo--737.com.readthedocs.build/projects/HIPIFY/en/737/
url: ${project:hipify}
- file: reference/openmp/openmp
title: OpenMP
- file: reference/gpu_libraries/math
@@ -77,72 +92,72 @@ subtrees:
subtrees:
- entries:
- title: rocBLAS
url: https://rocmdocs.amd.com/projects/rocBLAS/en/{branch}/
url: ${project:rocblas}
- title: hipBLAS
url: https://rocmdocs.amd.com/projects/hipBLAS/en/{branch}/
url: ${project:hipblas}
- title: hipBLASLt
url: https://rocm.docs.amd.com/projects/hipBLASLt/en/{branch}/
url: ${project:hipblaslt}
- title: rocALUTION
url: https://rocm.docs.amd.com/projects/rocALUTION/en/{branch}/
url: ${project:rocalution}
- title: rocWMMA
url: https://rocm.docs.amd.com/projects/rocWMMA/en/{branch}/
url: ${project:rocwmma}
- title: rocSOLVER
url: https://rocm.docs.amd.com/projects/rocSOLVER/en/{branch}/
url: ${project:rocsolver}
- title: hipSOLVER
url: https://rocm.docs.amd.com/projects/hipSOLVER/en/{branch}/
url: ${project:hipsolver}
- title: rocSPARSE
url: https://rocm.docs.amd.com/projects/rocSPARSE/en/{branch}/
url: ${project:rocsparse}
- title: hipSPARSE
url: https://rocm.docs.amd.com/projects/hipSPARSE/en/{branch}/
url: ${project:hipsparse}
- file: reference/gpu_libraries/fft
subtrees:
- entries:
- title: rocFFT
url: https://rocm.docs.amd.com/projects/rocFFT/en/{branch}/
url: ${project:rocfft}
- title: hipFFT
url: https://rocm.docs.amd.com/projects/hipFFT/en/{branch}/
url: ${project:hipfft}
- file: reference/gpu_libraries/rand
subtrees:
- entries:
- title: rocRAND
url: https://rocm.docs.amd.com/projects/rocRAND/en/{branch}/
url: ${project:rocrand}
- title: hipRAND
url: https://rocm.docs.amd.com/projects/hipRAND/en/{branch}/
url: ${project:hiprand}
- file: reference/gpu_libraries/c++_primitives
title: C++ Primitive Libraries
subtrees:
- entries:
- title: rocPRIM
url: https://rocm.docs.amd.com/projects/rocPRIM/en/{branch}/
url: ${project:rocprim}
- entries:
- title: hipCUB
url: https://rocm.docs.amd.com/projects/hipCUB/en/{branch}/
url: ${project:hipcub}
- entries:
- title: rocThrust
url: https://rocm.docs.amd.com/projects/rocThrust/en/{branch}/
url: ${project:rocthrust}
- file: reference/gpu_libraries/communication
title: Communication Libraries
subtrees:
- entries:
- title: RCCL
url: https://rocm.docs.amd.com/projects/rccl/en/{branch}/
url: ${project:rccl}
- file: reference/ai_tools
title: AI Libraries
subtrees:
- entries:
- title: MIOpen - Machine Intelligence
url: https://rocm.docs.amd.com/projects/MIOpen/en/{branch}/
url: ${project:miopen}
- title: Composable Kernel
url: https://rocm.docs.amd.com/projects/composable_kernel/en/{branch}/
url: ${project:composable_kernel}
- title: MIGraphX - Graph Optimization
url: https://rocmsoftwareplatform.github.io/AMDMIGraphX/doc/html/
url: ${project:amdmigraphx}
- file: reference/computer_vision
subtrees:
- entries:
- url: https://rocm.docs.amd.com/projects/MIVisionX/en/{branch}/
- url: ${project:mivisionx}
title: MIVisionX
- entries:
- url: https://rocm.docs.amd.com/projects/rocAL/en/{branch}/
- url: ${project:rocal}
title: rocAL
- file: reference/management_tools
title: Management Tools
@@ -150,20 +165,21 @@ subtrees:
- entries:
- url: https://rocm.docs.amd.com/projects/amdsmi/en/{branch}/
title: AMD SMI
- url: https://rocm.docs.amd.com/projects/rocmsmi/en/{branch}/
- url: https://rocm.docs.amd.com/projects/rocm_smi_lib/en/{branch}/
title: ROCm SMI
- url: https://rocm.docs.amd.com/projects/rdc/en/{branch}/
- url: ${project:rdc}
title: ROCm Datacenter Tool
- file: reference/validation_tools
title: Validation Tools
subtrees:
- entries:
- url: https://rocm.docs.amd.com/projects/rvs/en/{branch}/
- url: ${project:rocmvalidationsuite}
title: RVS
- url: https://rocm.docs.amd.com/projects/TransferBench/en/{branch}/
- url: ${project:transferbench}
title: TransferBench
- caption: Understand ROCm
entries:
- file: understand/all.md
- title: Compiler Disambiguation
file: understand/compiler_disambiguation
- file: understand/cmake_packages
@@ -176,8 +192,10 @@ subtrees:
title: MI250
- file: understand/gpu_arch/mi100
title: MI100
- file: understand/More-about-how-ROCm-uses-PCIe-Atomics
- caption: How to Guides
entries:
- file: how_to/all
- title: Tuning Guides
file: how_to/tuning_guides/index.md
subtrees:
@@ -197,15 +215,17 @@ subtrees:
- file: how_to/gpu_aware_mpi
- file: how_to/system_debugging
- caption: Examples
- caption: Tutorials & Examples
file: examples
entries:
- title: ROCm Examples
url: https://github.com/amd/rocm-examples
- file: examples/ai_ml_inferencing
title: AI/ML/Inferencing
- title: Machine Learning
file: examples/machine_learning/all
subtrees:
- entries:
- file: examples/inception_casestudy/inception_casestudy
- entries:
- file: examples/machine_learning/pytorch_inception
- file: examples/machine_learning/migraphx_optimization
- caption: About
entries:

View File

@@ -1 +1,2 @@
rocm-docs-core==0.11.0
rocm-docs-core==1.8.0
sphinx-reredirects

View File

@@ -1,110 +1,106 @@
#
# This file is autogenerated by pip-compile with Python 3.8
# This file is autogenerated by pip-compile with Python 3.10
# by the following command:
#
# pip-compile sphinx/requirements.in
# pip-compile requirements.in
#
accessible-pygments==0.0.3
accessible-pygments==0.0.5
# via pydata-sphinx-theme
alabaster==0.7.13
alabaster==1.0.0
# via sphinx
babel==2.11.0
babel==2.16.0
# via
# pydata-sphinx-theme
# sphinx
beautifulsoup4==4.11.2
beautifulsoup4==4.12.3
# via pydata-sphinx-theme
breathe==4.34.0
breathe==4.35.0
# via rocm-docs-core
certifi==2022.12.7
certifi==2024.8.30
# via requests
cffi==1.15.1
cffi==1.17.1
# via
# cryptography
# pynacl
charset-normalizer==2.1.1
charset-normalizer==3.3.2
# via requests
click==8.1.3
click==8.1.7
# via sphinx-external-toc
cryptography==40.0.2
cryptography==43.0.1
# via pyjwt
deprecated==1.2.13
deprecated==1.2.14
# via pygithub
docutils==0.19
docutils==0.21.2
# via
# breathe
# myst-parser
# pydata-sphinx-theme
# sphinx
fastjsonschema==2.16.3
fastjsonschema==2.20.0
# via rocm-docs-core
gitdb==4.0.10
gitdb==4.0.11
# via gitpython
gitpython==3.1.30
gitpython==3.1.43
# via rocm-docs-core
idna==3.4
idna==3.10
# via requests
imagesize==1.4.1
# via sphinx
jinja2==3.1.2
jinja2==3.1.4
# via
# myst-parser
# sphinx
linkify-it-py==1.0.3
# via myst-parser
markdown-it-py==2.2.0
markdown-it-py==3.0.0
# via
# mdit-py-plugins
# myst-parser
markupsafe==2.1.2
markupsafe==2.1.5
# via jinja2
mdit-py-plugins==0.3.4
mdit-py-plugins==0.4.2
# via myst-parser
mdurl==0.1.2
# via markdown-it-py
myst-parser[linkify]==1.0.0
myst-parser==4.0.0
# via rocm-docs-core
packaging==23.0
packaging==24.1
# via
# pydata-sphinx-theme
# sphinx
pycparser==2.21
pycparser==2.22
# via cffi
pydata-sphinx-theme==0.13.3
pydata-sphinx-theme==0.15.4
# via
# rocm-docs-core
# sphinx-book-theme
pygithub==1.58.1
pygithub==2.4.0
# via rocm-docs-core
pygments==2.14.0
pygments==2.18.0
# via
# accessible-pygments
# pydata-sphinx-theme
# sphinx
pyjwt[crypto]==2.6.0
pyjwt[crypto]==2.9.0
# via pygithub
pynacl==1.5.0
# via pygithub
pytz==2022.7.1
# via babel
pyyaml==6.0
pyyaml==6.0.2
# via
# myst-parser
# rocm-docs-core
# sphinx-external-toc
requests==2.28.1
requests==2.32.3
# via
# pygithub
# sphinx
rocm-docs-core==0.11.0
rocm-docs-core==1.8.0
# via -r requirements.in
smmap==5.0.0
smmap==5.0.1
# via gitdb
snowballstemmer==2.2.0
# via sphinx
soupsieve==2.4
soupsieve==2.6
# via beautifulsoup4
sphinx==5.3.0
sphinx==8.0.2
# via
# breathe
# myst-parser
@@ -115,33 +111,40 @@ sphinx==5.3.0
# sphinx-design
# sphinx-external-toc
# sphinx-notfound-page
sphinx-book-theme==1.0.1
# sphinx-reredirects
sphinx-book-theme==1.1.3
# via rocm-docs-core
sphinx-copybutton==0.5.1
sphinx-copybutton==0.5.2
# via rocm-docs-core
sphinx-design==0.4.1
sphinx-design==0.6.1
# via rocm-docs-core
sphinx-external-toc==0.3.1
sphinx-external-toc==1.0.1
# via rocm-docs-core
sphinx-notfound-page==0.8.3
sphinx-notfound-page==1.0.4
# via rocm-docs-core
sphinxcontrib-applehelp==1.0.4
sphinx-reredirects==0.1.5
# via -r requirements.in
sphinxcontrib-applehelp==2.0.0
# via sphinx
sphinxcontrib-devhelp==1.0.2
sphinxcontrib-devhelp==2.0.0
# via sphinx
sphinxcontrib-htmlhelp==2.0.1
sphinxcontrib-htmlhelp==2.1.0
# via sphinx
sphinxcontrib-jsmath==1.0.1
# via sphinx
sphinxcontrib-qthelp==1.0.3
sphinxcontrib-qthelp==2.0.0
# via sphinx
sphinxcontrib-serializinghtml==1.1.5
sphinxcontrib-serializinghtml==2.0.0
# via sphinx
typing-extensions==4.5.0
# via pydata-sphinx-theme
uc-micro-py==1.0.1
# via linkify-it-py
urllib3==1.26.13
# via requests
wrapt==1.14.1
tomli==2.0.1
# via sphinx
typing-extensions==4.12.2
# via
# pydata-sphinx-theme
# pygithub
urllib3==2.2.3
# via
# pygithub
# requests
wrapt==1.16.0
# via deprecated

View File

@@ -0,0 +1,149 @@
===========================
How ROCm uses PCIe Atomics
===========================
ROCm PCIe Feature and Overview BAR Memory
==========================================
ROCm is an extension of the HSA platform architecture, so it shares the queuing model, memory model, and signaling and synchronization protocols. Platform atomics are integral to queuing and signaling memory operations where there may be multiple writers across CPU and GPU agents.
The full list of HSA system architecture platform requirements is here: `HSA Sys Arch Features <http://hsafoundation.com/wp-content/uploads/2021/02/HSA-SysArch-1.2.pdf>`_.
The ROCm platform uses the PCI Express 3.0 (PCIe 3.0) features for atomic read-modify-write transactions, which extend inter-processor synchronization mechanisms to I/O to support the defined set of HSA capabilities needed for queuing and signaling memory operations.
The new PCIe AtomicOps operate as completers for ``CAS`` (Compare and Swap), ``FetchADD``, and ``SWAP`` atomics. The AtomicOps are initiated by the
I/O device and support 32-bit, 64-bit and 128-bit operands; the target address has to be naturally aligned to the operand size.
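The natural-alignment requirement above can be illustrated with a short Python sketch. The helper name `is_naturally_aligned` is hypothetical; it only states the rule that an address must be a multiple of the operand size in bytes.

```python
# Operand sizes in bytes for the 32-, 64- and 128-bit AtomicOp operands.
ATOMIC_OP_SIZES = (4, 8, 16)


def is_naturally_aligned(address: int, operand_bytes: int) -> bool:
    """Check that the target address is a multiple of the operand size."""
    if operand_bytes not in ATOMIC_OP_SIZES:
        raise ValueError(f"unsupported operand size: {operand_bytes} bytes")
    return address % operand_bytes == 0


print(is_naturally_aligned(0x1000, 8))  # 64-bit operand on an 8-byte boundary
print(is_naturally_aligned(0x1004, 8))  # 64-bit operand, only 4-byte aligned
```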
In ROCm, platform atomics are used in the following ways:
* Update HSA queue's read_dispatch_id: 64-bit atomic add used by the command processor on the GPU agent to update the packet ID it processed.
* Update HSA queue's write_dispatch_id: 64-bit atomic add used by the CPU and GPU agents to support multi-writer queue insertions.
* Update HSA signals: 64-bit atomic ops are used for CPU and GPU synchronization.
The PCIe 3.0 AtomicOp feature allows atomic transactions to be requested by, routed through and completed by PCIe components. Routing and completion do not require software support. Component support for each is detectable via the DEVCAP2 register. Upstream bridges need to have AtomicOp routing enabled, or the atomic operations will fail even though the PCIe endpoint and PCIe I/O devices are capable of atomic operations.
To support AtomicOp routing between two or more Root Ports, each associated Root Port must indicate that capability via the AtomicOp Routing Supported bit in the Device Capabilities 2 register.
If your system has a PCI Express switch, it needs to support AtomicOp routing. Again, AtomicOp requests are permitted only if a component's ``DEVCTL2.ATOMICOP_REQUESTER_ENABLE`` field is set. These requests can only be serviced if the upstream components support AtomicOp completion and/or routing to a component which does. AtomicOp Routing Support=1 means routing is supported; AtomicOp Routing Support=0 means routing is not supported.
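A minimal sketch of decoding these capability and control bits from raw register values follows. The bit positions are taken from the PCIe Device Capabilities 2 and Device Control 2 register layouts as mirrored in the Linux ``pci_regs.h`` header, and should be verified against the specification; the function name and the sample register values are hypothetical.

```python
# Device Capabilities 2 bits (assumed positions, see the lead-in).
DEVCAP2_ATOMIC_ROUTE = 1 << 6    # AtomicOp Routing Supported
DEVCAP2_ATOMIC_COMP32 = 1 << 7   # 32-bit AtomicOp Completer Supported
DEVCAP2_ATOMIC_COMP64 = 1 << 8   # 64-bit AtomicOp Completer Supported
DEVCAP2_ATOMIC_COMP128 = 1 << 9  # 128-bit CAS Completer Supported
# Device Control 2 bit.
DEVCTL2_ATOMIC_REQ = 1 << 6      # AtomicOp Requester Enable


def describe_atomics(devcap2: int, devctl2: int) -> dict:
    """Translate raw DEVCAP2/DEVCTL2 values into named AtomicOp flags."""
    return {
        "routing_supported": bool(devcap2 & DEVCAP2_ATOMIC_ROUTE),
        "completer_32bit": bool(devcap2 & DEVCAP2_ATOMIC_COMP32),
        "completer_64bit": bool(devcap2 & DEVCAP2_ATOMIC_COMP64),
        "completer_128bit_cas": bool(devcap2 & DEVCAP2_ATOMIC_COMP128),
        "requester_enabled": bool(devctl2 & DEVCTL2_ATOMIC_REQ),
    }


# Hypothetical register values, for illustration only.
for name, value in describe_atomics(0x1C0, 0x40).items():
    print(f"{name}: {value}")
```

The raw values would normally come from a tool that reads PCIe configuration space; how they are obtained is outside the scope of this sketch.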
An atomic operation is a Non-Posted transaction supporting 32-bit and 64-bit address formats; there must be a Completion response containing the result of the operation. Errors associated with the operation (an uncorrectable error accessing the target location or carrying out the atomic operation) are signaled to the requester by setting the Completion Status field in the completion descriptor to Completer Abort (CA) or Unsupported Request (UR).
To understand more about how PCIe atomic operations work, see `PCIe Atomics <https://pcisig.com/sites/default/files/specification_documents/ECN_Atomic_Ops_080417.pdf>`_
`Linux Kernel Patch to pci_enable_atomic_request <https://patchwork.kernel.org/patch/7261731/>`_
There are also a number of papers which talk about these new capabilities:
* `Atomic Read Modify Write Primitives by Intel <https://www.intel.es/content/dam/doc/white-paper/atomic-read-modify-write-primitives-i-o-devices-paper.pdf>`_
* `PCI express 3 Accelerator Whitepaper by Intel <https://www.intel.sg/content/dam/doc/white-paper/pci-express3-accelerator-white-paper.pdf>`_
* `Intel PCIe Generation 3 Hotchips Paper <https://www.hotchips.org/wp-content/uploads/hc_archives/hc21/1_sun/HC21.23.1.SystemInterconnectTutorial-Epub/HC21.23.131.Ajanovic-Intel-PCIeGen3.pdf>`_
* `PCIe Generation 4 Base Specification includes Atomics Operation <http://composter.com.ua/documents/PCI_Express_Base_Specification_Revision_4.0.Ver.0.3.pdf>`_
Other I/O devices with PCIe Atomics support
* `Mellanox ConnectX-5 InfiniBand Card <http://www.mellanox.com/related-docs/prod_adapter_cards/PB_ConnectX-5_VPI_Card.pdf>`_
* `Cray Aries Interconnect <http://www.hoti.org/hoti20/slides/Bob_Alverson.pdf>`_
* `Xilinx PCIe Ultrascale Whitepaper <https://www.xilinx.com/support/documentation/white_papers/wp464-PCIe-ultrascale.pdf>`_
* `Xilinx 7 Series Devices <https://www.xilinx.com/support/documentation/ip_documentation/pcie_7x/v3_1/pg054-7series-pcie.pdf>`_
Future bus technology with richer I/O Atomics Operation Support
* `GenZ <http://genzconsortium.org/faq/gen-z-technology/#33/>`_
New PCIe endpoints with AtomicOp support, beyond AMD Ryzen and EPYC CPUs and Intel Haswell or newer CPUs with PCIe Generation 3.0 support:
* `Mellanox Bluefield SOC <http://www.mellanox.com/related-docs/npu-multicore-processors/PB_Bluefield_SoC.pdf>`_
* `Cavium Thunder X2 <http://www.cavium.com/ThunderX2_ARM_Processors.html>`_
In ROCm, we also take advantage of PCIe ID-based ordering technology for P2P when the GPU originates two writes to two different targets:
| 1. write to another GPU memory,
| 2. then write to system memory to indicate transfer complete.
The two writes are routed to different ends of the system, but the write to system memory that indicates transfer complete must occur AFTER the P2P write to the GPU has completed.
`Good Paper on Understanding PCIe Generation 3 Throughput <https://www.altera.com/en_US/pdfs/literature/an/an690.pdf>`_
BAR Memory Overview
*******************
On a Xeon E5 based system, above 4GB PCIe addressing can be turned on in the BIOS. If so, the MMIO base address (MMIOH Base) and range (MMIO High Size) also need to be set in the BIOS.
On a SuperMicro system, the system BIOS needs the following settings:
* Advanced->PCIe/PCI/PnP configuration-> Above 4G Decoding = Enabled
* Advanced->PCIe/PCI/PnP Configuration->MMIOH Base = 512G
* Advanced->PCIe/PCI/PnP Configuration->MMIO High Size = 256G
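The address window implied by these two settings can be worked out with a short Python sketch. It assumes that the BIOS values ``512G`` and ``256G`` are binary gigabytes (GiB); the helper name `mmio_high_window` is hypothetical.

```python
GIB = 1 << 30


def mmio_high_window(base_gib: int, size_gib: int) -> range:
    """Return the byte address range covered by MMIOH Base and MMIO High Size.

    Assumption: the BIOS values are given in GiB.
    """
    start = base_gib * GIB
    return range(start, start + size_gib * GIB)


window = mmio_high_window(512, 256)
print(hex(window.start), hex(window.stop))  # 0x8000000000 0xc000000000
print(window.stop <= 1 << 40)               # window ends below 2^40
```

With these values the whole window lies below 2^40, and therefore also below 2^44.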
When Large BAR capability is supported, there is a Large BAR VBIOS which also disables the I/O BAR.
For GFX9 and Vega10, which have up to 44-bit physical addresses and 48-bit virtual addresses:
* BAR0-1 registers: 64bit, prefetchable, GPU memory. 8GB or 16GB depending on Vega10 SKU. Must be placed < 2^44 to support P2P access from other Vega10.
* BAR2-3 registers: 64bit, prefetchable, Doorbell. Must be placed < 2^44 to support P2P access from other Vega10.
* BAR4 register: Optional, not a boot device.
* BAR5 register: 32bit, non-prefetchable, MMIO. Must be placed < 4GB.
Here is how our BARs work on GFX8 GPUs with a 40-bit physical address limit ::
11:00.0 Display controller: Advanced Micro Devices, Inc. [AMD/ATI] Fiji [Radeon R9 FURY / NANO Series] (rev c1)
Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] Device 0b35
Flags: bus master, fast devsel, latency 0, IRQ 119
Memory at bf40000000 (64-bit, prefetchable) [size=256M]
Memory at bf50000000 (64-bit, prefetchable) [size=2M]
I/O ports at 3000 [size=256]
Memory at c7400000 (32-bit, non-prefetchable) [size=256K]
Expansion ROM at c7440000 [disabled] [size=128K]
Legend:
1 : GPU Frame Buffer BAR - In this example it happens to be 256M, but typically this will be the size of the GPU memory (typically 4GB+). This BAR has to be placed < 2^40 to allow peer-to-peer access from other GFX8 AMD GPUs. For GFX9 (Vega GPU) the BAR has to be placed < 2^44 to allow peer-to-peer access from other GFX9 AMD GPUs.
2 : Doorbell BAR - The size of this BAR will typically be < 10MB (currently fixed at 2MB) for this generation of GPUs. This BAR has to be placed < 2^40 to allow peer-to-peer access from other current generation AMD GPUs.
3 : IO BAR - This is for legacy VGA and boot device support, but since the GPUs in this project are not VGA devices (headless), this is not a concern even if the SBIOS does not set it up.
4 : MMIO BAR - This is required for the AMD driver software to access the configuration registers. Since the remainder of the BAR available is only 1 DWORD (32-bit), this is placed < 4GB. This is fixed at 256KB.
5 : Expansion ROM - This is required for the AMD driver software to access the GPU's video BIOS. This is currently fixed at 128KB.
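As a worked check, the placement limits from the legend can be applied to the addresses in the listing above. This is a sketch under one assumption: "placed < 2^N" is read as "the whole BAR range ends at or below 2^N". The `Bar` class is hypothetical and the values are transcribed from the sample output.

```python
from dataclasses import dataclass

KIB = 1 << 10
MIB = 1 << 20


@dataclass(frozen=True)
class Bar:
    name: str
    base: int
    size: int
    limit_bits: int  # the BAR must end at or below 2**limit_bits

    def fits(self) -> bool:
        """True if the entire BAR range lies below the address limit."""
        return self.base + self.size <= 1 << self.limit_bits


# Values transcribed from the GFX8 listing above.
BARS = [
    Bar("frame buffer", 0xBF40000000, 256 * MIB, 40),
    Bar("doorbell", 0xBF50000000, 2 * MIB, 40),
    Bar("MMIO", 0xC7400000, 256 * KIB, 32),
]

for bar in BARS:
    print(f"{bar.name}: fits={bar.fits()}")
```

For a GFX9 device the same check would use `limit_bits=44` for the frame buffer and doorbell BARs.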
Excerpts from Overview of Changes to PCI Express 3.0
====================================================
By Mike Jackson, Senior Staff Architect, MindShare, Inc.
********************************************************
Atomic Operations Goal:
*************************
Support SMP-type operations across a PCIe network to allow for things like offloading tasks between CPU cores and accelerators like a GPU. The spec says this enables advanced synchronization mechanisms that are particularly useful with multiple producers or consumers that need to be synchronized in a non-blocking fashion. Three new atomic non-posted requests were added, plus the corresponding completion (the address must be naturally aligned with the operand size or the TLP is malformed):
* Fetch and Add uses one operand as the “add” value. Reads the target location, adds the operand, and then writes the result back to the original location.
* Unconditional Swap uses one operand as the “swap” value. Reads the target location and then writes the swap value to it.
* Compare and Swap uses 2 operands: first data is compare value, second is swap value. Reads the target location, checks it against the compare value and, if equal, writes the swap value to the target location.
* AtomicOp Completion is a new completion that gives the result for an atomic request and indicates that the atomicity of the transaction has been maintained.
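The semantics of the three requests listed above can be modeled in a few lines of Python. This is a toy model of one target location, not an implementation of PCIe: the class `MemoryCell` is hypothetical, nothing here is actually atomic, and each method returns the original value of the location on the assumption that this is what the completion carries back to the requester.

```python
class MemoryCell:
    """Toy model of one naturally aligned AtomicOp target location."""

    def __init__(self, value: int = 0, bits: int = 64):
        self.mask = (1 << bits) - 1
        self.value = value & self.mask

    def fetch_add(self, add_value: int) -> int:
        """Read the location, add the operand, write the sum back."""
        original = self.value
        self.value = (original + add_value) & self.mask  # wraps on overflow
        return original

    def swap(self, swap_value: int) -> int:
        """Read the location, then write the swap value unconditionally."""
        original = self.value
        self.value = swap_value & self.mask
        return original

    def compare_and_swap(self, compare_value: int, swap_value: int) -> int:
        """Write the swap value only if the location equals the compare value."""
        original = self.value
        if original == compare_value & self.mask:
            self.value = swap_value & self.mask
        return original


cell = MemoryCell(5)
print(cell.fetch_add(3), cell.value)            # original 5, location now 8
print(cell.swap(1), cell.value)                 # original 8, location now 1
print(cell.compare_and_swap(1, 9), cell.value)  # match: location now 9
print(cell.compare_and_swap(1, 7), cell.value)  # no match: location stays 9
```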
Since AtomicOps are not locked they don't have the performance downsides of the PCI locked protocol. Compared to locked cycles, they provide “lower latency, higher scalability, advanced synchronization algorithms, and dramatically lower impact on other PCIe traffic.” The lock mechanism can still be used across a bridge to PCI or PCI-X to achieve the desired operation.
AtomicOps can go from device to device, device to host, or host to device. Each completer indicates whether it supports this capability and guarantees atomic access if it does. The ability to route AtomicOps is also indicated in the registers for a given port.
ID-based Ordering Goal:
*************************
Improve performance by avoiding stalls caused by ordering rules. For example, posted writes are never normally allowed to pass each other in a queue, but if they are requested by different functions, we can have some confidence that the requests are not dependent on each other. The previously reserved Attribute bit [2] is now combined with the RO bit to indicate ID ordering with or without relaxed ordering.
This only has meaning for memory requests, and is reserved for Configuration or IO requests. Completers are not required to copy this bit into a completion, and only use the bit if their enable bit is set for this operation.
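How the two attribute bits combine can be shown with a small decoder. This sketch assumes the three-bit attribute field is laid out as Attr[2] = ID-based ordering, Attr[1] = relaxed ordering and Attr[0] = no snoop; the function name and the returned labels are hypothetical.

```python
ATTR_NO_SNOOP = 0b001           # Attr[0], not related to ordering
ATTR_RELAXED_ORDERING = 0b010   # Attr[1], the RO bit
ATTR_ID_BASED_ORDERING = 0b100  # Attr[2], previously reserved


def ordering_model(attr: int) -> str:
    """Name the ordering model selected by Attr[2] and Attr[1]."""
    ido = bool(attr & ATTR_ID_BASED_ORDERING)
    ro = bool(attr & ATTR_RELAXED_ORDERING)
    if ido and ro:
        return "relaxed ordering plus ID-based ordering"
    if ido:
        return "ID-based ordering"
    if ro:
        return "relaxed ordering"
    return "default ordering"


for attr in (0b000, 0b010, 0b100, 0b110):
    print(f"{attr:03b}: {ordering_model(attr)}")
```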
To read more on the new PCIe Gen 3 options, see https://www.mindshare.com/files/resources/PCIe%203-0.pdf

47
docs/understand/all.md Normal file
View File

@@ -0,0 +1,47 @@
# All Explanation Material
:::::{grid} 1 1 2 2
:gutter: 1
:::{grid-item-card} Compiler Nomenclature
:link: compiler_disambiguation
:link-type: doc
ROCm ships multiple compilers of varying origins and purposes. This article
disambiguates compiler naming used throughout the documentation.
:::
:::{grid-item-card} Using CMake
:link: cmake_packages
:link-type: doc
ROCm components ship with first-party CMake support. This article details how
that support works and how to use it.
:::
:::{grid-item-card} Linux Folder Structure Reorganization
:link: file_reorg
:link-type: doc
ROCm™ packages have adopted the Linux Foundation Filesystem Hierarchy Standard
to ensure ROCm components follow open source conventions for Linux-based
distributions.
:::
:::{grid-item-card} GPU Isolation Techniques
:link: gpu_isolation
:link-type: doc
Restricting the access of applications to a subset of GPUs, also known as
isolating GPUs, allows users to hide GPU resources from programs.
:::
:::{grid-item-card} GPU Architectures
:link: gpu_arch
:link-type: doc
AMD documentation around architectural details from both the CDNA and RDNA
product lines.
:::
:::::

View File

@@ -179,7 +179,7 @@ This project can then be configured with for eg.
- Linux: ``cmake -D CMAKE_CXX_COMPILER:PATH=/opt/rocm/bin/amdclang++``
Which use the device compiler provided from the binary packages of
`ROCm HIP SDK <https://www.amd.com/en/graphics/servers-solutions-rocm>`_ and
`ROCm HIP SDK <https://www.amd.com/en/developer/rocm-hub.html>`_ and
`repo.radeon.com <https://repo.radeon.com>`_ respectively.
When using the CXX language support to compile HIP device code, selecting the

View File

@@ -1,6 +1,7 @@
# ROCm Compilers Disambiguation
The following table summarizes the widely used terms in this document.
ROCm ships multiple compilers of varying origins and purposes. This article
disambiguates compiler naming used throughout the documentation.
## Compiler Terms

View File

@@ -17,7 +17,7 @@ distributions. Following is the ROCm proposed file structure.
| -- architecture dependent libraries and binaries used internally by components
| -- cmake
| -- <component>
| --<component>.config.cmake
| --<component>-config.cmake
| -- libexec
| -- <component>
| -- non ISA/architecture independent executables used internally by components
@@ -162,7 +162,6 @@ correct header file and use correct search paths.
## References
ROCm deprecation warning :
<https://docs.amd.com/bundle/ROCm-Release-Notes-v5.4.3/page/Deprecations_and_Warnings.html>
{ref}`ROCm deprecation warning <5_4_0_filesystem_reorg_deprecation_notice>`
Linux File System Standard : <https://refspecs.linuxfoundation.org/fhs.shtml>
[Linux File System Standard](https://refspecs.linuxfoundation.org/fhs.shtml)

View File

@@ -1,79 +0,0 @@
# ISV Deployment Guide (Windows)
## Abstract
ISVs deploying applications using the HIP SDK depend on the AMD GPU Drivers, HIP
Runtime Library and HIP SDK Libraries. A compatibility matrix table provides
details on AMD's support model. AMD GPU Drivers are distributed with a HIP
Runtime included. Each HIP Runtime is associated with a HIP compiler version.
Applications built with a particular HIP compiler should document its associated
HIP Runtime version and AMD GPU Driver as minimum version requirements for its
end users. Applications do not distribute the HIP Runtime. Instead, end users
will use the HIP Runtime provided by an AMD GPU Driver. AMD provides backward
compatibility for applications dynamically linked to the HIP Runtime based on
our Driver and HIP support policy. ISV applications using the HIP SDK Libraries,
for example hipBLAS, should distribute the HIP SDK Library as part of its
installer package. It is recommended not to require end users to install the
HIP SDK. AMD provides backward compatibility for AMD Driver and HIP Runtime for
the HIP SDK Libraries based on our support policy. AMD's support policy for Visual
Studio and other third-party compilers is documented here.
## Introduction
This guide is intended for Independent Software Vendors (ISVs) and other
software developers intending to build applications with the HIP SDK for
Windows. The HIP SDK is intended for developer distribution in contrast to the
AMD GPU driver which is intended for all end users. The guide discusses how to
use and distribute components from the HIP SDK. The HIP SDK is the collection of
the AMD GPU Driver, HIP Runtime and the HIP Libraries. These three parts are
distributed in the HIP SDK installer. The compatibility and versioning relation
between these three parts is documented here. AMD's support policies for the
developer tools allow ISVs the stability to plan the usage of a tool chain.
## Recommended Library Distribution Model
The HIP SDK is distributed via a Windows installer. This distribution system is
only intended for software developers and testers. AMD recommends that end users
of the program built against HIP SDK components do not have a requirement to
install the HIP SDK. There are two types of ISV applications that use the HIP
SDK as follows.
The first group of ISV applications have a dependency on the HIP Runtime and
select HIP Header Only Libraries (rocPRIM, hipCUB and rocThrust). This group of
ISV applications need to require their end users install an AMD GPU Driver. Each
AMD GPU driver has a HIP runtime library bundled with it. The ISV application
should ensure that the HIP runtime library has a minimum version associated with
it. As the HIP runtime library does not have semantic versioning, the ISV
application cannot check for compatibility. However, AMD is committed to not
breaking API/ABI compatibility unless the major version number of the HIP
runtime is incremented. ISV applications may run without user warning if the HIP
major version available in the driver is the same as the HIP major version
associated with the compiler it was built with. The ISV at its discretion may
throw a warning if the HIP major version is higher than the associated HIP major
version of the compiler it was built with.
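
The major-version policy described above could be sketched as follows. This is an illustration only: the names `Verdict` and `check_hip_major` are hypothetical, how the two major versions are obtained is left out, and treating a lower driver major version as unsupported is an assumption that the text does not state explicitly.

```python
from enum import Enum


class Verdict(Enum):
    OK = "ok"                    # run without warning
    WARN = "warn"                # run, warn at the ISV's discretion
    UNSUPPORTED = "unsupported"  # assumption: below the minimum requirement


def check_hip_major(driver_major: int, built_major: int) -> Verdict:
    """Compare the driver's HIP major version with the build-time one."""
    if driver_major == built_major:
        return Verdict.OK
    if driver_major > built_major:
        return Verdict.WARN
    return Verdict.UNSUPPORTED


print(check_hip_major(5, 5).value)
print(check_hip_major(6, 5).value)
```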
The second group of ISV applications has a dependency on the HIP Runtime and one
or more Dynamically Linked HIP Libraries including the HIP RT library. ISV
applications with this dependency need to ensure the end user installs an AMD
GPU Driver, and it is recommended that they distribute the dynamically linked HIP
library in the installer package of the application. This allows end users to avoid
installing the HIP SDK. One benefit of this model is the smaller disk space required,
as only the required binaries are distributed by the ISV application. It also spares
the end user from having to agree to licensing agreements for the entire HIP SDK.
The version checks recommended for ISV applications that include dynamically
linked HIP libraries follow the same requirements as those for ISV applications
that depend only on the HIP Runtime and header-only libraries. In addition, each
dynamically linked HIP library has its own minimum HIP runtime requirement.
Checks for the minimum HIP version for each dynamically linked HIP library may
be added at the ISV's discretion. Usually, the minimum HIP version check for the
HIP runtime is sufficient if the dynamically linked HIP libraries come from the
same SDK package as the HIP compiler.
Please note AMD does not support static linking to any components distributed in
the HIP SDK.
## Conclusion
This guide provides limited guidance on ISV application deployment. Refer to
the HIP API guides for the SDK and the HIP optimization guides for more
information.


@@ -1,23 +0,0 @@
all
# Extend line length
rule 'MD013', :line_length => 99999
rule 'MD026', :punctuation => '.,;:!'
# Use "1. 2. 3."-style numbered lists instead of "1. 1. 1."
rule 'MD029', :style => :ordered
# Allow in-line HTML
exclude_rule 'MD033'
exclude_rule 'MD034'
exclude_rule 'MD041'
# False positives, see: https://github.com/markdownlint/markdownlint/issues/374
exclude_rule 'MD005'
# False positives, see: https://github.com/markdownlint/markdownlint/issues/313
exclude_rule 'MD007'


@@ -60,7 +60,8 @@ ROCDebugger Machine Interface (MI) extends support to lanes. The following enhan
- MI varobjs are now lane-aware.
For more information, refer to the ROC Debugger User Guide at <https://docs.amd.com>.
For more information, refer to the ROC Debugger User Guide at
{doc}`ROCgdb <rocgdb:index>`.
##### Enhanced - clone-inferior Command
@@ -82,7 +83,7 @@ This release includes support for AMD Radeon™ Pro W6800, in addition to other
- Various other bug fixes and performance improvements
For more information, see <https://docs.amd.com/bundle/MIOpen_gh-pages/page/releasenotes.html>
For more information, see {doc}`Documentation <miopen:index>`.
#### Checkpoint Restore Support With CRIU


@@ -271,7 +271,8 @@ The new APIs for virtual memory management are as follows:
hipError_t hipMemUnmap(void* ptr, size_t size);
```
For more information, refer to the HIP API documentation at <https://docs.amd.com/bundle/HIP_API_Guide/page/modules.html>
For more information, refer to the HIP API documentation at
{doc}`hip:.doxygen/docBin/html/modules`.
##### Planned HIP Changes in Future Releases
@@ -287,7 +288,8 @@ This release introduces a new ROCm C++ library for accelerating mixed precision
rocWMMA is released as a header library and includes test and sample projects to validate and illustrate example usages of the C++ API. GEMM matrix multiplication is used as primary validation given the heavy precedent for the library. However, the usage portfolio is growing significantly and demonstrates different ways rocWMMA may be consumed.
For more information, refer to <https://docs.amd.com/category/libraries>.
For more information, refer to
[Communication Libraries](../../../../docs/reference/gpu_libraries/communication.md).
#### OpenMP Enhancements in This Release


@@ -95,6 +95,8 @@ The `hipcc` and `hipconfig` Perl scripts are deprecated. In a future release, co
>
> There will be a transition period where the Perl scripts and compiled binaries are available before the scripts are removed. There will be no functional difference between the Perl scripts and their compiled binary counterpart. No user action is required. Once these are available, users can optionally switch to `hipcc.bin` and `hipconfig.bin`. The `hipcc`/`hipconfig` soft link will be assimilated to point from `hipcc`/`hipconfig` to the respective compiled binaries as the default option.
(5_4_0_filesystem_reorg_deprecation_notice)=
##### Linux Filesystem Hierarchy Standard for ROCm
ROCm packages have adopted the Linux foundation filesystem hierarchy standard in this release to ensure ROCm components follow open source conventions for Linux-based distributions. While moving to a new filesystem hierarchy, ROCm ensures backward compatibility with its 5.1 version or older filesystem hierarchy. See below for a detailed explanation of the new filesystem hierarchy and backward compatibility.
@@ -205,9 +207,8 @@ The test was incorrectly using the `hipDeviceAttributePageableMemoryAccess` devi
`hipHostMalloc()` allocates memory with fine-grained access by default when the environment variable `HIP_HOST_COHERENT=1` is used.
For more information, refer to the HIP Programming Guide at
For more information, refer to {doc}`hip:.doxygen/docBin/html/index`.
<https://docs.amd.com/bundle/HIP-Programming-Guide-v5.4/page/Introduction_to_HIP_Programming_Guide.html>
#### SoftHang with `hipStreamWithCUMask` test on AMD Instinct™


@@ -0,0 +1,11 @@
<!-- markdownlint-disable first-line-h1 -->
<!-- markdownlint-disable no-duplicate-header -->
### What's New in This Release
#### HIP API Change
The following HIP API is updated in the ROCm v5.5.1 release:
##### `hipDeviceSetCacheConfig`
- The return value for `hipDeviceSetCacheConfig` is updated from `hipErrorNotSupported` to `hipSuccess`