Commit Graph

786 Commits

Author SHA1 Message Date
dependabot[bot]
defb276d93 Build(deps): Bump rocm-docs-core from 1.18.1 to 1.18.2 in /docs/sphinx (#4556)
Bumps [rocm-docs-core](https://github.com/ROCm/rocm-docs-core) from 1.18.1 to 1.18.2.
- [Release notes](https://github.com/ROCm/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/ROCm/rocm-docs-core/compare/v1.18.1...v1.18.2)

---
updated-dependencies:
- dependency-name: rocm-docs-core
  dependency-version: 1.18.2
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-04-03 17:02:06 -06:00
Peter Park
fdf24a9c40 fix link to CLR license (#4560) 2025-04-03 13:09:59 -04:00
Dominic Widdows
715cce53de Update workload.rst with small export fix (#4425)
Tiny fix that removes the "export" builtin.
`export HIP_FORCE_DEV_KERNARG=1 hipblaslt-bench ...`
leads to
bash: export: `hipblaslt-bench': not a valid identifier

whereas starting the command with HIP_FORCE_DEV_KERNARG=1 passes this environment variable to the hipblaslt-bench process, which I think is the intention here.
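
The difference described above can be sketched in a few lines of shell. This is an illustration only, not part of the original commit: `sh -c` is a hypothetical stand-in for `hipblaslt-bench`, and the names `child` and `parent` are introduced here for the demonstration.

```shell
# Start from a clean state so the demonstration is reproducible.
unset HIP_FORCE_DEV_KERNARG

# Prefix assignment: the variable is placed in the environment of this one
# command only. `sh -c` stands in for hipblaslt-bench (assumption).
child=$(HIP_FORCE_DEV_KERNARG=1 sh -c 'printf %s "${HIP_FORCE_DEV_KERNARG:-unset}"')

# The calling shell itself never had the variable set.
parent=${HIP_FORCE_DEV_KERNARG:-unset}

echo "child sees: $child"    # prints "child sees: 1"
echo "parent sees: $parent"  # prints "parent sees: unset"

# By contrast, `export NAME=value other-word` treats every argument as a
# variable name, so a word containing a hyphen is rejected as an identifier.
```

If the variable should stay set for every later command in the session, a separate `export HIP_FORCE_DEV_KERNARG=1` line followed by the command on its own line would also work.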
2025-04-03 13:01:26 -04:00
Peter Park
ea66bf386a Fix more links in documentation (#4551)
* fix vllm engine args link

* remove RDNA subtree under system optimization in toc

* fix RDNA 2 architecture PDF link

* fix CLR LICENSE.txt link

* fix rocPyDecode license link
2025-04-01 15:56:34 -04:00
Peter Park
ac2c5e72d4 Fix links in documentation 2025-04-01 15:39:20 -04:00
Peter Park
424e6148bd Add MaxText training Docker doc
2025-03-28 11:25:06 -04:00
Pratik Basyal
a0faccba37 AMD GPU Docs System optimization migration changes in ROCm Docs Develop (#4538)
* AMD GPU Docs System optimization migration changes in ROCm Docs (#296)

* System optimization migration changes in ROCm

* Linting issue fixed

* Linking corrected

* Minor change

* Link updated to Instinct.docs.amd.com

* ROCm docs grid updated by removing IOMMU.rst, pcie-atomics, and oversubscription pages

* Files removed and reference fixed

* Reference text updated

* GPU atomics from 6.4.0 removed
2025-03-27 16:38:10 -04:00
dependabot[bot]
1385196fab Build(deps): Bump sphinx-reredirects from 0.1.5 to 0.1.6 in /docs/sphinx (#4527)
Bumps [sphinx-reredirects](https://github.com/documatt/sphinx-reredirects) from 0.1.5 to 0.1.6.
- [Commits](https://github.com/documatt/sphinx-reredirects/compare/v0.1.5...v0.1.6)

---
updated-dependencies:
- dependency-name: sphinx-reredirects
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-03-25 13:21:53 -06:00
Peter Park
58d42ec50b Improve "tuning guides" landing page (#4504)
* Improve "tuning guides" landing page

* Update docs/how-to/gpu-performance/mi300x.rst

Co-authored-by: Pratik Basyal <pratik.basyal@amd.com>

* Update docs/how-to/gpu-performance/mi300x.rst

Co-authored-by: Pratik Basyal <pratik.basyal@amd.com>

* change tuning to optimization

---------

Co-authored-by: Pratik Basyal <pratik.basyal@amd.com>
2025-03-25 13:54:27 -04:00
dependabot[bot]
e396b4898f Build(deps): Bump jinja2 from 3.1.5 to 3.1.6 in /docs/sphinx (#4465)
Bumps [jinja2](https://github.com/pallets/jinja) from 3.1.5 to 3.1.6.
- [Release notes](https://github.com/pallets/jinja/releases)
- [Changelog](https://github.com/pallets/jinja/blob/main/CHANGES.rst)
- [Commits](https://github.com/pallets/jinja/compare/3.1.5...3.1.6)

---
updated-dependencies:
- dependency-name: jinja2
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-03-21 17:18:02 -06:00
Peter Park
8f359da39e Update Megatron-LM doc for 25.4 (#4520)
* update megatron-lm doc

* update 'previous versions'

* add missing space

* update docker pull tag

* Update options and docker pull tag

* Add performance measurements link to megatron-lm doc

* fix previous versions table

* words

* Simplify system validation section

* minor fixes

* fix prev versions tbl
2025-03-21 16:49:55 -04:00
Alex Xu
388f18cf36 add 6.1.5 to version list 2025-03-14 10:51:57 -04:00
Peter Park
2fca094531 PyTorch training Docker update 25.4 (#4482)
* remove orphan tag

* add hugging face PEFT

* update "previous versions"

* data == ultrachat 200k

* fix "llama 2"

* add ultrachat to wordlist

* fix previous versions table

* add performance measurements

* add mi325x

* fix prev version

* change 'validation' to 'testing'

* fix dir name

* fix backtick
2025-03-13 13:40:00 -04:00
Peter Park
9b2ce2b634 Update vLLM performance Docker docs (#4491)
* add links to performance results

words

* change "performance validation" to "performance testing"

* update vLLM docker 3/11

* add previous versions

* fix llama 3.1 8b model repo name

* words
2025-03-13 10:04:21 -04:00
dependabot[bot]
d171830a85 Build(deps): Bump rocm-docs-core from 1.17.1 to 1.18.1 in /docs/sphinx (#4488)
Bumps [rocm-docs-core](https://github.com/ROCm/rocm-docs-core) from 1.17.1 to 1.18.1.
- [Release notes](https://github.com/ROCm/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/ROCm/rocm-docs-core/compare/v1.17.1...v1.18.1)

---
updated-dependencies:
- dependency-name: rocm-docs-core
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-03-12 16:56:27 -06:00
Peter Park
29ba151b48 Fix "VGPR" typo in workload tuning guide (#4484)
* Fix "VGPR" typo in workload tuning guide

* fix wording
2025-03-12 10:28:35 -04:00
Istvan Kiss
41a5ae5618 Replace "-" on precision support page 2025-03-10 13:41:02 +01:00
dependabot[bot]
6db5bee4dd Build(deps): Bump rocm-docs-core from 1.17.0 to 1.17.1 in /docs/sphinx (#4442)
Bumps [rocm-docs-core](https://github.com/ROCm/rocm-docs-core) from 1.17.0 to 1.17.1.
- [Release notes](https://github.com/ROCm/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/ROCm/rocm-docs-core/compare/v1.17.0...v1.17.1)

---
updated-dependencies:
- dependency-name: rocm-docs-core
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-03-07 17:06:17 -07:00
Pratik Basyal
9aad9ce7ef Content for modprobe added to MI300X system optimization (#4434)
Added content for modprobe
2025-03-07 14:52:20 -05:00
Adel Johar
cd85ccd539 Docs: use custom directive to reference library versions 2025-03-05 10:24:22 +01:00
Peter Park
fa0e212906 Fix applies to linux tag for training benchmark docker pages (#4446) 2025-03-04 12:06:55 -05:00
Peter Park
1fb42c2591 Update LLM inference performance validation on AMD Instinct MI300X guide to filter by desired model (#4424)
* WIP

(cherry picked from commit a06a5b5b959a9425e7384fb58b88c3716f380e48)

rm unneeded files

(cherry picked from commit f1d0c00056a83299bdea74a43cd17454999cf2d8)

* add sphinxcontrib.datatemplates

(cherry picked from commit d056b93a325d87b81f54f70c6eb4ae78f4fb0bc1)

* add template

(cherry picked from commit 0691d59f0a1efbda7908762b7a906e30a65c0ee1)

fix template

(cherry picked from commit 01e4bea5522aa5deeaade58c105ff850f449df8b)

WIP

(cherry picked from commit 4d8daf7445e7be92cd9ee1d39dff564bd8de41f4)

WIP

(cherry picked from commit 9eefd1f5833bc4dc8de9d777ff65a5fe5f826dbd)

update models yaml schema

(cherry picked from commit a5f0fc1e6cc51104dc2d42029bfcf3eea276d270)

add model groups functionality

(cherry picked from commit 13f49f96dd3e5a160d37c52e48a4fbcccdcf4f9e)

add selector headings and fix template

(cherry picked from commit 35f7f2314bcf74b4fd0a8ca10aaabf0de7063bb0)

update template

(cherry picked from commit 9e2dcfe0c7f6e7c2c685866ea83375fbacbc5032)

fix

(cherry picked from commit be51e32791550ddc21785effccb889228394b242)

use classes instead of data tags

(cherry picked from commit cd52d68c504f7e7435d156ae70cf4bde1dfe703e)

update template

(cherry picked from commit 9ed89fee6874b39ee3535fbde54a0a59f346ea2b)

clean up extra wip files

(cherry picked from commit a9f965a104baa966c184054638e935b011526278)

update wordlist

(cherry picked from commit f783656814e896aedd21acd1c8c87b4700c14469)

remove unused template

(cherry picked from commit cac894bd9c2b1262c9c006e5fddbcb742dc6d882)

improve script

(cherry picked from commit ca20ffd4922916616e0924d625652a815f27c35f)

fix template

(cherry picked from commit 752c61fda856fd5b244734636c036c8877e823b9)

fix standalone benchmark output path in template

(cherry picked from commit d8c04203b5ec0f6c2e2307f7890304a3dc5687be)

fix toc

(cherry picked from commit 8df42faf53488ef29f5a263d25032f3d35cd58ed)

update script to prevent flash of unstyled content

import a11y

(cherry picked from commit 46c852717f223a1d8744fab035807cebab4c5404)

add tabindex to wordlist

(cherry picked from commit 11492593f9692f5453045e7ec52c8f8ae9624ae9)

text

update script

* remove unused config option

* reorganize assets

* fix linting warning

* move js from data/ to extension/
2025-02-28 12:39:02 -05:00
Istvan Kiss
cd57bc8186 Fix white paper links 2025-02-27 15:29:06 +01:00
Adel Johar
4be8096109 Merge pull request #4393 from ROCm/docs_fix_arch
Docs: Fix gpu-arch-spec.rst
2025-02-26 14:19:38 +01:00
Peter Park
934767322b Update PT and TF docker inventories in compatibility docs (#4415)
* update PyTorch docker inventories in compatibility doc

* update TF docker inventories in compatibility doc

* update text to rocm 6.3.3
2025-02-25 12:32:34 -05:00
Peter Park
1ea1c5c6e0 fix tab sync and nested tab Megatron-LM doc (#4409) 2025-02-21 17:19:48 -05:00
Peter Park
389fa7071b Update docs on Megatron-LM and PyTorch training Dockers (#4407)
* Update Megatron-LM and PyTorch Training Docker docs

Also restructure TOC

* Apply suggestions from code review

Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com>

update "start training" text

Apply suggestions from code review

Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com>

update conf.py

fix spacing

fix branding issue

add disable numa

reorg

remove extra text
2025-02-21 13:07:18 -05:00
dependabot[bot]
27cb8ea927 Build(deps): Bump rocm-docs-core from 1.15.0 to 1.17.0 in /docs/sphinx (#4402)
Bumps [rocm-docs-core](https://github.com/ROCm/rocm-docs-core) from 1.15.0 to 1.17.0.
- [Release notes](https://github.com/ROCm/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/ROCm/rocm-docs-core/compare/v1.15.0...v1.17.0)

---
updated-dependencies:
- dependency-name: rocm-docs-core
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-02-20 11:22:49 -07:00
Pratik Basyal
1b36ab4850 Final GA day prep for 633 (#313)
* ROCProfiler deprecation notice updated

* Final GA day changes added

* github issue no. added

* ROCTx added

* rocprofv added to wordlist

* Minor fix
2025-02-19 15:19:44 -05:00
Parag Bhandari
ba90b9e61b Removed merge conflict markers 2025-02-19 13:56:00 -05:00
Parag Bhandari
662a40a33f Merge branch 'develop' into internal-develop 2025-02-19 13:35:46 -05:00
pbhandar-amd
fd4ccb9372 Update versions.md 2025-02-19 12:56:36 -05:00
Adel Johar
0c6f660d59 Docs: Fix gpu-arch-spec.rst 2025-02-19 17:05:01 +01:00
Peter Park
618b44ed23 add vllm docker to release highlights (#306) 2025-02-13 12:01:08 -05:00
Adel Johar
c52aa329c8 Merge pull request #4350 from ROCm/docs_device_version
Docs: Add Device Major/Minor Versions to gpu-arch-spec.rst
2025-02-13 14:41:01 +01:00
Adel Johar
1499f74c22 Docs: Add Device Major/Minor Versions to gpu-arch-spec.rst 2025-02-13 14:24:00 +01:00
Pratik Basyal
35f4362e68 Release notes updates for ROCm 6.3.3 release (#298)
* Initial changes for 6.3.3 release updated in RN

* conf file updated

* 6.3.3 compatibility matrix updated

* 6.3.3 version update

* HIP documentation update added

* Deprecation notice added

* ROCm Offline Installer updates added to Release Highlight

* CSV loading error fixed

* ROCm System Profiler 0.1.2 update added

* Reference to Offline Installer updated

* Resolved issues removed

* Azure Linux support for 6.3.2 added

* Minor update in ROCm Offline Installer highlight

Co-authored-by: Jeffrey Novotny <jnovotny@amd.com>

---------

Co-authored-by: Jeffrey Novotny <jnovotny@amd.com>
2025-02-12 09:24:58 -05:00
dependabot[bot]
24603ac37a Build(deps): Bump cryptography from 43.0.3 to 44.0.1 in /docs/sphinx (#4365)
Bumps [cryptography](https://github.com/pyca/cryptography) from 43.0.3 to 44.0.1.
- [Changelog](https://github.com/pyca/cryptography/blob/main/CHANGELOG.rst)
- [Commits](https://github.com/pyca/cryptography/compare/43.0.3...44.0.1)

---
updated-dependencies:
- dependency-name: cryptography
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-02-11 17:01:16 -07:00
Pratik Basyal
c469e34b27 Debian 12 support for single-node added (#300) (#4357) 2025-02-10 09:33:27 -05:00
Pratik Basyal
63b8d9da7b Debian 12 support for single-node added (#300) 2025-02-07 17:47:00 -05:00
Peter Park
2751a17cf0 Update vLLM benchmarking guide (#4347)
* update vllm-benchmark

fix hlist overflow

update standalone benchmarking options

update list of models

fix typo and model name

unnecessary duplicate info

update formatting

update vllm benchmark guide

- remove Llama 2 FP8
- add Jais 13B
- update commands

update docker pull tag

update MAD available models

remove extra mad models not relevant to vllm

update PyTorch version

add changelog

add model names to .wordlist.txt

* Update docs/how-to/rocm-for-ai/inference/vllm-benchmark.rst

Co-authored-by: Pratik Basyal <pratik.basyal@amd.com>

* Update docs/how-to/rocm-for-ai/inference/vllm-benchmark.rst

Co-authored-by: Pratik Basyal <pratik.basyal@amd.com>

* Update docs/how-to/rocm-for-ai/inference/vllm-benchmark.rst

Co-authored-by: Pratik Basyal <pratik.basyal@amd.com>

* fix typo

* update link

* fix link text

* change changelog to previous versions

* fix typo

* remove "for"

---------

Co-authored-by: Pratik Basyal <pratik.basyal@amd.com>
2025-02-05 17:18:35 -05:00
Peter Park
9b0ae86b1b Fix ROCm Bandwidth Test license type
2025-02-05 16:40:31 -05:00
Istvan Kiss
faa67965dd Precision support page update 2025-02-04 16:17:31 +01:00
Pratik Basyal
f885b5df6e Updated ROCm install on Linux installation method link (#4313) 2025-01-31 16:48:33 -05:00
dependabot[bot]
ee70cb0bb5 Build(deps): Bump rocm-docs-core from 1.13.0 to 1.15.0 in /docs/sphinx (#4315)
Bumps [rocm-docs-core](https://github.com/ROCm/rocm-docs-core) from 1.13.0 to 1.15.0.
- [Release notes](https://github.com/ROCm/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/ROCm/rocm-docs-core/compare/v1.13.0...v1.15.0)

---
updated-dependencies:
- dependency-name: rocm-docs-core
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-01-29 17:14:55 -07:00
Jeffrey Novotny
d401b5f152 Add ToC and index links to the AI Developer Tutorials (#4312)
* Add ToC and index links to the AI Developer Tutorials

* Change link positioning

* Change wording
2025-01-29 14:43:32 -05:00
Pratik Basyal
a414216ff4 Duplication from GA merge resolved (#4308)
* Duplication from GA merge resolved

* Date updated
2025-01-28 16:39:49 -05:00
alexxu-amd
d878f49107 Update versions.md for 6.3.2 2025-01-28 14:22:45 -05:00
Pratik Basyal
3738297667 2nd POC for How to Use ROCm for AI (#282)
* Initial draft for How-to POC

* Zone.identifier file removed

* Broken links in index.md fixed

* Zone.identifier file removed

* Review feedback incorporated

* Title updated

* New format for ROCm for AI TOC created

* Folder structure changed

* ROCm for AI index updated

* Link to Llama recipe updated

* Review feedback added

* Feedback from Cindy added

* Intro text from Cindy added

* New flow suggested by Hongxia incorporated

* Overview content from Cindy added, TOC updated, metadata updated

* Reference to HPC removed

* Listing alignment updated

* Overview page updated

* Folder structure and links updated as a result of TOC change

* Content sequence updated

* Metadata updated

* Review feedback incorporated

* Index file renamed

* Conf file updated for OS compatibility info

* update metadata (#4)

update metadata

fix spelling

* Wordlist updated

---------

Co-authored-by: Peter Park <peter.park@amd.com>
2025-01-24 17:42:20 -05:00
Alex Xu
98cc1ec504 Merge remote-tracking branch 'external/develop' into sync-develop-from-external 2025-01-24 14:08:56 -05:00