Commit Graph

2441 Commits

Author SHA1 Message Date
Istvan Kiss
6c7f167650 Fix broken torchserve link 2025-04-07 16:07:31 +02:00
dependabot[bot]
defb276d93 Build(deps): Bump rocm-docs-core from 1.18.1 to 1.18.2 in /docs/sphinx (#4556)
Bumps [rocm-docs-core](https://github.com/ROCm/rocm-docs-core) from 1.18.1 to 1.18.2.
- [Release notes](https://github.com/ROCm/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/ROCm/rocm-docs-core/compare/v1.18.1...v1.18.2)

---
updated-dependencies:
- dependency-name: rocm-docs-core
  dependency-version: 1.18.2
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-04-03 17:02:06 -06:00
Peter Park
fdf24a9c40 fix link to CLR license (#4560) 2025-04-03 13:09:59 -04:00
Dominic Widdows
715cce53de Update workload.rst with small export fix (#4425)
Tiny fix that removes the "export" directive. 
` export HIP_FORCE_DEV_KERNARG=1  hipblaslt-bench ...`
leads to 
bash: export: `hipblaslt-bench': not a valid identifier

whereas just starting with HIP_FORCE_DEV_KERNARG=1 passes this env var to the hipblaslt-bench process, which I think is the intention here.
2025-04-03 13:01:26 -04:00
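Note on 715cce53de: the behaviour described in that message can be reproduced with a short shell sketch. Here `sh -c` stands in for `hipblaslt-bench` (an assumption for illustration; the real benchmark is not invoked), and the variable names `seen_by_child`, `seen_by_parent`, and `export_form` are hypothetical.

```shell
# Start from a clean state so the results do not depend on the caller's environment.
unset HIP_FORCE_DEV_KERNARG

# Prefix assignment: the variable exists only in the environment of this one command.
# `sh -c` is a stand-in for hipblaslt-bench (assumption for illustration).
seen_by_child=$(HIP_FORCE_DEV_KERNARG=1 sh -c 'echo "${HIP_FORCE_DEV_KERNARG:-unset}"')

# The calling shell itself is left untouched by the prefix form.
seen_by_parent=${HIP_FORCE_DEV_KERNARG:-unset}

# `export` treats every argument as a name to export, so the command name is
# rejected as an invalid identifier. A subshell contains the failure.
if (export HIP_FORCE_DEV_KERNARG=1 hipblaslt-bench) 2>/dev/null; then
    export_form=accepted
else
    export_form=rejected
fi

echo "child=$seen_by_child parent=$seen_by_parent export=$export_form"
```

The prefix form passes the variable to a single process, which matches the intention stated in the commit message. Running `export HIP_FORCE_DEV_KERNARG=1` on its own line before the command would also work, but it would leave the variable set for every later command in that shell.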
Daniel Su
2536c40751 Ex CI: fix CK test pool names (#4558) 2025-04-03 11:24:24 -04:00
Daniel Su
07068b6fd8 Ex CI: add pkg-config to ROCgdb, remove tarball link from Tensile (#4555)
* Ex CI: add pkg-config to ROCgdb

* Tensile, remove link to non-existent tar.gz artifact
2025-04-02 17:11:29 -04:00
Daniel Su
09a3cd9a46 Ex CI: convert job strategy matrices into compiletime parameters (#4553) 2025-04-02 11:43:52 -04:00
Peter Park
ea66bf386a Fix more links in documentation (#4551)
* fix vllm engine args link

* remove RDNA subtree under system optimization in toc

* fix RDNA 2 architecture PDF link

* fix CLR LICENSE.txt link

* fix rocPyDecode license link
2025-04-01 15:56:34 -04:00
Peter Park
ac2c5e72d4 Fix links in documentation 2025-04-01 15:39:20 -04:00
Daniel Su
37de280ca6 Ex CI: rocprof-compute, add dependency on rocprof-sdk (#4547) 2025-03-31 17:29:55 -04:00
Daniel Su
a6232d89f2 Ex CI: add Ninja build gen for 12 components (#4544) 2025-03-28 13:40:57 -04:00
Peter Park
424e6148bd Add MaxText training Docker doc
2025-03-28 11:25:06 -04:00
Daniel Su
31b1a1f124 Ex CI: fix snap latest cmake version to 3.31 (#4542) 2025-03-28 10:08:03 -04:00
Pratik Basyal
a0faccba37 AMD GPU Docs System optimization migration changes in ROCm Docs Develop (#4538)
* AMD GPU Docs System optimization migration changes in ROCm Docs (#296)

* System optimization migration changes in ROCm

* Linting issue fixed

* Linking corrected

* Minor change

* Link updated to Instinct.docs.amd.com

* ROCm docs grid updated by removing IOMMU.rst, pcie-atomics, and oversubscription pages

* Files removed and reference fixed

* Reference text updated

* GPU atomics from 6.4.0 removed
2025-03-27 16:38:10 -04:00
Daniel Su
4bee895a1b Ex CI: fixes for RVS, Tensile, hipBLASLt, rocMLIR, CK (#4535) 2025-03-27 11:28:38 -04:00
dependabot[bot]
1385196fab Build(deps): Bump sphinx-reredirects from 0.1.5 to 0.1.6 in /docs/sphinx (#4527)
Bumps [sphinx-reredirects](https://github.com/documatt/sphinx-reredirects) from 0.1.5 to 0.1.6.
- [Commits](https://github.com/documatt/sphinx-reredirects/compare/v0.1.5...v0.1.6)

---
updated-dependencies:
- dependency-name: sphinx-reredirects
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-03-25 13:21:53 -06:00
Daniel Su
ea11ae86ec Ex CI: fixes for rocWMMA, rocprof-sdk, roctracer, AOMP (#4529) 2025-03-25 14:28:02 -04:00
Peter Park
58d42ec50b Improve "tuning guides" landing page (#4504)
* Improve "tuning guides" landing page

* Update docs/how-to/gpu-performance/mi300x.rst

Co-authored-by: Pratik Basyal <pratik.basyal@amd.com>

* Update docs/how-to/gpu-performance/mi300x.rst

Co-authored-by: Pratik Basyal <pratik.basyal@amd.com>

* change tuning to optimization

---------

Co-authored-by: Pratik Basyal <pratik.basyal@amd.com>
2025-03-25 13:54:27 -04:00
Daniel Su
8dc218c5d0 Ex CI: dynamically set rocrtst include directory (#4525) 2025-03-24 16:34:56 -04:00
dependabot[bot]
e396b4898f Build(deps): Bump jinja2 from 3.1.5 to 3.1.6 in /docs/sphinx (#4465)
Bumps [jinja2](https://github.com/pallets/jinja) from 3.1.5 to 3.1.6.
- [Release notes](https://github.com/pallets/jinja/releases)
- [Changelog](https://github.com/pallets/jinja/blob/main/CHANGES.rst)
- [Commits](https://github.com/pallets/jinja/compare/3.1.5...3.1.6)

---
updated-dependencies:
- dependency-name: jinja2
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-03-21 17:18:02 -06:00
Joseph Macaranas
12ac5b8025 [External CI] setuptools version fix for pytorch (#4522) 2025-03-21 17:22:29 -04:00
Peter Park
8f359da39e Update Megatron-LM doc for 25.4 (#4520)
* update megatron-lm doc

* update 'previous versions'

* add missing space

* update docker pull tag

* Update options and docker pull tag

* Add performance measurements link to megatron-lm doc

* fix previous versions table

* words

* Simplify system validation section

* minor fixes

* fix prev versions tbl
2025-03-21 16:49:55 -04:00
Daniel Su
80e89cc885 Ex CI: fix Dockerfile PATH creation (#4518) 2025-03-21 13:36:13 -04:00
Daniel Su
93d6018a83 Ex CI: fix manifest creation for AOMP and HIP/clr (#4517) 2025-03-21 11:19:08 -04:00
Daniel Su
99a35bb1fc Ex CI: remove /opt/rocm symlinks from nine components (#4508) 2025-03-18 11:41:23 -04:00
Daniel Su
60719f0292 Ex CI: add gfx90a to nightly job (#4507) 2025-03-18 10:27:49 -04:00
Alex Xu
388f18cf36 add 6.1.5 to version list 2025-03-14 10:51:57 -04:00
Peter Park
2fca094531 PyTorch training Docker update 25.4 (#4482)
* remove orphan tag

* add hugging face PEFT

* update "previous versions"

* data == ultrachat 200k

* fix "llama 2"

* add ultrachat to wordlist

* fix previous versions table

* add performance measurements

* add mi325x

* fix prev version

* change 'validation' to 'testing'

* fix dir name

* fix backtick
2025-03-13 13:40:00 -04:00
Daniel Su
41e7ae8da8 Ex CI: fixes for RDC, rocprof-sdk, hipBLASLt, CK (#4492) 2025-03-13 13:13:44 -04:00
Peter Park
9b2ce2b634 Update vLLM performance Docker docs (#4491)
* add links to performance results

words

* change "performance validation" to "performance testing"

* update vLLM docker 3/11

* add previous versions

* fix llama 3.1 8b model repo name

* words
2025-03-13 10:04:21 -04:00
dependabot[bot]
d171830a85 Build(deps): Bump rocm-docs-core from 1.17.1 to 1.18.1 in /docs/sphinx (#4488)
Bumps [rocm-docs-core](https://github.com/ROCm/rocm-docs-core) from 1.17.1 to 1.18.1.
- [Release notes](https://github.com/ROCm/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/ROCm/rocm-docs-core/compare/v1.17.1...v1.18.1)

---
updated-dependencies:
- dependency-name: rocm-docs-core
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-03-12 16:56:27 -06:00
Peter Park
29ba151b48 Fix "VGPR" typo in workload tuning guide (#4484)
* Fix "VGPR" typo in workload tuning guide

* fix wording
2025-03-12 10:28:35 -04:00
Joseph Macaranas
17df9993bc External CI: Update mainline branch name for llvm-project dependency (#4481) 2025-03-11 10:54:51 -04:00
Istvan Kiss
41a5ae5618 Replace "-" on precision support page 2025-03-10 13:41:02 +01:00
dependabot[bot]
6db5bee4dd Build(deps): Bump rocm-docs-core from 1.17.0 to 1.17.1 in /docs/sphinx (#4442)
Bumps [rocm-docs-core](https://github.com/ROCm/rocm-docs-core) from 1.17.0 to 1.17.1.
- [Release notes](https://github.com/ROCm/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/ROCm/rocm-docs-core/compare/v1.17.0...v1.17.1)

---
updated-dependencies:
- dependency-name: rocm-docs-core
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-03-07 17:06:17 -07:00
Joseph Macaranas
cb27cda5c7 External CI: Add default ubuntu repos to sources.list (#4464)
- Also add fix-missing parameter to apt install
2025-03-07 17:36:25 -05:00
Daniel Su
1d9ecdef44 Ex CI: temporarily change from low pool to base pool (#4463) 2025-03-07 17:15:32 -05:00
Daniel Su
29509640e7 Ex CI: fixes for rocMLIR, rocPyDecode, RVS, rocprof-compute (#4462) 2025-03-07 16:46:05 -05:00
Pratik Basyal
9aad9ce7ef Content for modprobe added to MI300X system optimization (#4434)
Added content for modprobe
2025-03-07 14:52:20 -05:00
Daniel Su
2bcd398de6 Ex CI: make component CTests nonverbose (#4458) 2025-03-06 17:38:24 -05:00
Daniel Su
a0b91d17ff Ex CI: make disk space print optional, small ROCr and manifest tweaks (#4457) 2025-03-06 16:13:19 -05:00
Daniel Su
c83677f41c Ex CI: enable gfx90a tests (#4450) 2025-03-06 13:50:12 -05:00
Daniel Su
e38b3aea50 Ex CI: update to 6.3.4, fixes for rocm-smi and rocWMMA (#4455)
* Ex CI: update to 6.3.4

* fix rocm-smi not installing apt packages

* extend rocWMMA test timeout to 2 hours
2025-03-06 13:34:56 -05:00
Adel Johar
cad7b92954 Merge pull request #4385 from ROCm/docs_versions
Docs: use custom directive to reference library versions
2025-03-05 15:01:25 +01:00
Adel Johar
cd85ccd539 Docs: use custom directive to reference library versions 2025-03-05 10:24:22 +01:00
alexxu-amd
de4ac7a5a3 Merge pull request #4438 from ROCm/alexxu12/md-file-fix
Fix important block from CONTRIBUTING.md
2025-03-04 13:08:51 -05:00
Peter Park
fa0e212906 Fix applies to linux tag for training benchmark docker pages (#4446) 2025-03-04 12:06:55 -05:00
Daniel Su
84001e176e Ex CI: increase hipBLASLt test timeout to 2 hours (#4445) 2025-03-04 10:53:05 -05:00
Joseph Macaranas
9cd2706fdb External CI: Set hipSPARSELt Fortran compiler to f95 (#4441)
- Explicitly set Fortran compiler to account for recent llvm-project changes that were meant to help with aomp issues.
2025-03-03 16:43:37 -05:00
Alex Xu
13be0b6a51 fix important block 2025-03-03 14:35:33 -05:00