Dominic Widdows
715cce53de
Update workload.rst with small export fix ( #4425 )
...
Tiny fix that removes the "export" directive.
`export HIP_FORCE_DEV_KERNARG=1 hipblaslt-bench ...`
leads to
bash: export: `hipblaslt-bench': not a valid identifier
whereas just starting with HIP_FORCE_DEV_KERNARG=1 passes this env var to the hipblaslt-bench process, which I think is the intention here.
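The difference between the two forms can be sketched in a few lines of shell. This is only an illustration under stated assumptions: `sh -c '...'` is a stand-in for `hipblaslt-bench` (which is not invoked here), and the variables `child`, `parent`, and `export_ok` are hypothetical names used to show the result.

```shell
# Start from a clean state so the parent shell does not already have the variable.
unset HIP_FORCE_DEV_KERNARG

# Prefix form: the assignment is placed in the environment of this one command only.
# 'sh -c ...' is a stand-in for hipblaslt-bench (assumption for illustration).
child=$(HIP_FORCE_DEV_KERNARG=1 sh -c 'echo "${HIP_FORCE_DEV_KERNARG:-unset}"')
parent=${HIP_FORCE_DEV_KERNARG:-unset}
echo "child sees:  $child"
echo "parent sees: $parent"

# export form: every argument is treated as a variable name or NAME=value pair,
# so 'hipblaslt-bench' is rejected as an identifier and no program is run.
# The subshell keeps the failed export from affecting the rest of the script.
if ( export HIP_FORCE_DEV_KERNARG=1 hipblaslt-bench ) 2>/dev/null; then
  export_ok=yes
else
  export_ok=no
fi
echo "export form accepted: $export_ok"
```

With the prefix form the variable is visible to the child process but is not left set in the calling shell, which matches the intention described above.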
2025-04-03 13:01:26 -04:00
Daniel Su
2536c40751
Ex CI: fix CK test pool names ( #4558 )
2025-04-03 11:24:24 -04:00
Daniel Su
07068b6fd8
Ex CI: add pkg-config to ROCgdb, remove tarball link from Tensile ( #4555 )
...
* Ex CI: add pkg-config to ROCgdb
* Tensile, remove link to non-existent tar.gz artifact
2025-04-02 17:11:29 -04:00
Daniel Su
09a3cd9a46
Ex CI: convert job strategy matrices into compiletime parameters ( #4553 )
2025-04-02 11:43:52 -04:00
Peter Park
ea66bf386a
Fix more links in documentation ( #4551 )
...
* fix vllm engine args link
* remove RDNA subtree under system optimization in toc
* fix RDNA 2 architecture PDF link
* fix CLR LICENSE.txt link
* fix rocPyDecode license link
2025-04-01 15:56:34 -04:00
Peter Park
ac2c5e72d4
Fix links in documentation
2025-04-01 15:39:20 -04:00
Daniel Su
37de280ca6
Ex CI: rocprof-compute, add dependency on rocprof-sdk ( #4547 )
2025-03-31 17:29:55 -04:00
Daniel Su
a6232d89f2
Ex CI: add Ninja build gen for 12 components ( #4544 )
2025-03-28 13:40:57 -04:00
Peter Park
424e6148bd
Add MaxText training Docker doc
...
Add MaxText training Docker doc
2025-03-28 11:25:06 -04:00
Daniel Su
31b1a1f124
Ex CI: fix snap latest cmake version to 3.31 ( #4542 )
2025-03-28 10:08:03 -04:00
Pratik Basyal
a0faccba37
AMD GPU Docs System optimization migration changes in ROCm Docs Develop ( #4538 )
...
* AMD GPU Docs System optimization migration changes in ROCm Docs (#296 )
* System optimization migration changes in ROCm
* Linting issue fixed
* Linking corrected
* Minor change
* Link updated to Instinct.docs.amd.com
* ROCm docs grid updated by removing IOMMU.rst, pcie-atomics, and oversubscription pages
* Files removed and reference fixed
* Reference text updated
* GPU atomics from 6.4.0 removed
2025-03-27 16:38:10 -04:00
Daniel Su
4bee895a1b
Ex CI: fixes for RVS, Tensile, hipBLASLt, rocMLIR, CK ( #4535 )
2025-03-27 11:28:38 -04:00
dependabot[bot]
1385196fab
Build(deps): Bump sphinx-reredirects from 0.1.5 to 0.1.6 in /docs/sphinx ( #4527 )
...
Bumps [sphinx-reredirects](https://github.com/documatt/sphinx-reredirects) from 0.1.5 to 0.1.6.
- [Commits](https://github.com/documatt/sphinx-reredirects/compare/v0.1.5...v0.1.6)
---
updated-dependencies:
- dependency-name: sphinx-reredirects
  dependency-type: direct:production
  update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-03-25 13:21:53 -06:00
Daniel Su
ea11ae86ec
Ex CI: fixes for rocWMMA, rocprof-sdk, roctracer, AOMP ( #4529 )
2025-03-25 14:28:02 -04:00
Peter Park
58d42ec50b
Improve "tuning guides" landing page ( #4504 )
...
* Improve "tuning guides" landing page
* Update docs/how-to/gpu-performance/mi300x.rst
Co-authored-by: Pratik Basyal <pratik.basyal@amd.com>
* Update docs/how-to/gpu-performance/mi300x.rst
Co-authored-by: Pratik Basyal <pratik.basyal@amd.com>
* change tuning to optimization
---------
Co-authored-by: Pratik Basyal <pratik.basyal@amd.com>
2025-03-25 13:54:27 -04:00
Daniel Su
8dc218c5d0
Ex CI: dynamically set rocrtst include directory ( #4525 )
2025-03-24 16:34:56 -04:00
dependabot[bot]
e396b4898f
Build(deps): Bump jinja2 from 3.1.5 to 3.1.6 in /docs/sphinx ( #4465 )
...
Bumps [jinja2](https://github.com/pallets/jinja) from 3.1.5 to 3.1.6.
- [Release notes](https://github.com/pallets/jinja/releases)
- [Changelog](https://github.com/pallets/jinja/blob/main/CHANGES.rst)
- [Commits](https://github.com/pallets/jinja/compare/3.1.5...3.1.6)
---
updated-dependencies:
- dependency-name: jinja2
  dependency-type: indirect
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-03-21 17:18:02 -06:00
Joseph Macaranas
12ac5b8025
[External CI] setuptools version fix for pytorch ( #4522 )
2025-03-21 17:22:29 -04:00
Peter Park
8f359da39e
Update Megatron-LM doc for 25.4 ( #4520 )
...
* update megatron-lm doc
* update 'previous versions'
* add missing space
* update docker pull tag
* Update options and docker pull tag
* Add performance measurements link to megatron-lm doc
* fix previous versions table
* words
* Simplify system validation section
* minor fixes
* fix prev versions tbl
2025-03-21 16:49:55 -04:00
Daniel Su
80e89cc885
Ex CI: fix Dockerfile PATH creation ( #4518 )
2025-03-21 13:36:13 -04:00
Daniel Su
93d6018a83
Ex CI: fix manifest creation for AOMP and HIP/clr ( #4517 )
2025-03-21 11:19:08 -04:00
Daniel Su
99a35bb1fc
Ex CI: remove /opt/rocm symlinks from nine components ( #4508 )
2025-03-18 11:41:23 -04:00
Daniel Su
60719f0292
Ex CI: add gfx90a to nightly job ( #4507 )
2025-03-18 10:27:49 -04:00
Alex Xu
388f18cf36
add 6.1.5 to version list
2025-03-14 10:51:57 -04:00
Peter Park
2fca094531
PyTorch training Docker update 25.4 ( #4482 )
...
* remove orphan tag
* add hugging face PEFT
* update "previous versions"
* data == ultrachat 200k
* fix "llama 2"
* add ultrachat to wordlist
* fix previous versions table
* add performance measurements
* add mi325x
* fix prev version
* change 'validation' to 'testing'
* fix dir name
* fix backtick
2025-03-13 13:40:00 -04:00
Daniel Su
41e7ae8da8
Ex CI: fixes for RDC, rocprof-sdk, hipBLASLt, CK ( #4492 )
2025-03-13 13:13:44 -04:00
Peter Park
9b2ce2b634
Update vLLM performance Docker docs ( #4491 )
...
* add links to performance results
words
* change "performance validation" to "performance testing"
* update vLLM docker 3/11
* add previous versions
add previous versions
* fix llama 3.1 8b model repo name
* words
2025-03-13 10:04:21 -04:00
dependabot[bot]
d171830a85
Build(deps): Bump rocm-docs-core from 1.17.1 to 1.18.1 in /docs/sphinx ( #4488 )
...
Bumps [rocm-docs-core](https://github.com/ROCm/rocm-docs-core) from 1.17.1 to 1.18.1.
- [Release notes](https://github.com/ROCm/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/ROCm/rocm-docs-core/compare/v1.17.1...v1.18.1)
---
updated-dependencies:
- dependency-name: rocm-docs-core
  dependency-type: direct:production
  update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-03-12 16:56:27 -06:00
Peter Park
29ba151b48
Fix "VGPR" typo in workload tuning guide ( #4484 )
...
* Fix "VGPR" typo in workload tuning guide
* fix wording
2025-03-12 10:28:35 -04:00
Joseph Macaranas
17df9993bc
External CI: Update mainline branch name for llvm-project dependency ( #4481 )
2025-03-11 10:54:51 -04:00
Istvan Kiss
41a5ae5618
Replace "-" on precision support page
2025-03-10 13:41:02 +01:00
dependabot[bot]
6db5bee4dd
Build(deps): Bump rocm-docs-core from 1.17.0 to 1.17.1 in /docs/sphinx ( #4442 )
...
Bumps [rocm-docs-core](https://github.com/ROCm/rocm-docs-core) from 1.17.0 to 1.17.1.
- [Release notes](https://github.com/ROCm/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/ROCm/rocm-docs-core/compare/v1.17.0...v1.17.1)
---
updated-dependencies:
- dependency-name: rocm-docs-core
  dependency-type: direct:production
  update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-03-07 17:06:17 -07:00
Joseph Macaranas
cb27cda5c7
External CI: Add default ubuntu repos to sources.list ( #4464 )
...
- Also add fix-missing parameter to apt install
2025-03-07 17:36:25 -05:00
Daniel Su
1d9ecdef44
Ex CI: temporarily change from low pool to base pool ( #4463 )
2025-03-07 17:15:32 -05:00
Daniel Su
29509640e7
Ex CI: fixes for rocMLIR, rocPyDecode, RVS, rocprof-compute ( #4462 )
2025-03-07 16:46:05 -05:00
Pratik Basyal
9aad9ce7ef
Content for modprobe added to MI300X system optimization ( #4434 )
...
Added content for modprobe
2025-03-07 14:52:20 -05:00
Daniel Su
2bcd398de6
Ex CI: make component CTests nonverbose ( #4458 )
2025-03-06 17:38:24 -05:00
Daniel Su
a0b91d17ff
Ex CI: make disk space print optional, small ROCr and manifest tweaks ( #4457 )
2025-03-06 16:13:19 -05:00
Daniel Su
c83677f41c
Ex CI: enable gfx90a tests ( #4450 )
2025-03-06 13:50:12 -05:00
Daniel Su
e38b3aea50
Ex CI: update to 6.3.4, fixes for rocm-smi and rocWMMA ( #4455 )
...
* Ex CI: update to 6.3.4
* fix rocm-smi not installing apt packages
* extend rocWMMA test timeout to 2 hours
2025-03-06 13:34:56 -05:00
Adel Johar
cad7b92954
Merge pull request #4385 from ROCm/docs_versions
...
Docs: use custom directive to reference library versions
2025-03-05 15:01:25 +01:00
Adel Johar
cd85ccd539
Docs: use custom directive to reference library versions
2025-03-05 10:24:22 +01:00
alexxu-amd
de4ac7a5a3
Merge pull request #4438 from ROCm/alexxu12/md-file-fix
...
Fix important block from CONTRIBUTING.md
2025-03-04 13:08:51 -05:00
Peter Park
fa0e212906
Fix applies to linux tag for training benchmark docker pages ( #4446 )
2025-03-04 12:06:55 -05:00
Daniel Su
84001e176e
Ex CI: increase hipBLASLt test timeout to 2 hours ( #4445 )
2025-03-04 10:53:05 -05:00
Joseph Macaranas
9cd2706fdb
External CI: Set hipSPARSELt Fortran compiler to f95 ( #4441 )
...
- Explicitly set Fortran compiler to account for recent llvm-project changes that were meant to help with aomp issues.
2025-03-03 16:43:37 -05:00
Alex Xu
13be0b6a51
fix important block
2025-03-03 14:35:33 -05:00
Alex Xu
efefa0f43e
fix important block
2025-03-03 14:12:05 -05:00
Daniel Su
4d15adf284
Ex CI: fix rocm-cmake tests, update component branch names ( #4433 )
2025-02-28 13:57:06 -05:00
Peter Park
1fb42c2591
Update LLM inference performance validation on AMD Instinct MI300X guide to filter by desired model ( #4424 )
...
* WIP
(cherry picked from commit a06a5b5b959a9425e7384fb58b88c3716f380e48)
rm unneeded files
(cherry picked from commit f1d0c00056a83299bdea74a43cd17454999cf2d8)
* add sphinxcontrib.datatemplates
(cherry picked from commit d056b93a325d87b81f54f70c6eb4ae78f4fb0bc1)
* add template
(cherry picked from commit 0691d59f0a1efbda7908762b7a906e30a65c0ee1)
fix template
(cherry picked from commit 01e4bea5522aa5deeaade58c105ff850f449df8b)
WIP
(cherry picked from commit 4d8daf7445e7be92cd9ee1d39dff564bd8de41f4)
WIP
(cherry picked from commit 9eefd1f5833bc4dc8de9d777ff65a5fe5f826dbd)
update models yaml schema
(cherry picked from commit a5f0fc1e6cc51104dc2d42029bfcf3eea276d270)
add model groups functionality
(cherry picked from commit 13f49f96dd3e5a160d37c52e48a4fbcccdcf4f9e)
add selector headings and fix template
(cherry picked from commit 35f7f2314bcf74b4fd0a8ca10aaabf0de7063bb0)
update template
(cherry picked from commit 9e2dcfe0c7f6e7c2c685866ea83375fbacbc5032)
fix
(cherry picked from commit be51e32791550ddc21785effccb889228394b242)
use classes instead of data tags
(cherry picked from commit cd52d68c504f7e7435d156ae70cf4bde1dfe703e)
update template
(cherry picked from commit 9ed89fee6874b39ee3535fbde54a0a59f346ea2b)
clean up extra wip files
(cherry picked from commit a9f965a104baa966c184054638e935b011526278)
update wordlist
(cherry picked from commit f783656814e896aedd21acd1c8c87b4700c14469)
remove unused template
(cherry picked from commit cac894bd9c2b1262c9c006e5fddbcb742dc6d882)
improve script
(cherry picked from commit ca20ffd4922916616e0924d625652a815f27c35f)
fix template
(cherry picked from commit 752c61fda856fd5b244734636c036c8877e823b9)
fix standalone benchmark output path in template
(cherry picked from commit d8c04203b5ec0f6c2e2307f7890304a3dc5687be)
fix toc
(cherry picked from commit 8df42faf53488ef29f5a263d25032f3d35cd58ed)
update script to prevent flash of unstyled content
import a11y
(cherry picked from commit 46c852717f223a1d8744fab035807cebab4c5404)
add tabindex to wordlist
(cherry picked from commit 11492593f9692f5453045e7ec52c8f8ae9624ae9)
text
update script
* remove unused config option
* reorganize assets
* fix linting warning
* move js from data/ to extension/
2025-02-28 12:39:02 -05:00