peterjunpark
e0b8ec4dfb
Update training docs for Primus/25.11 ( #5819 )
...
* update conf and toc.yml.in
* archive previous versions
archive data files
update anchors
* primus pytorch: remove training batch size args
* update primus megatron run cmds
multi-node
* update primus pytorch
update
* update
update
* update docker tag
2025-12-29 08:05:47 -05:00
Pratik Basyal
78e8baf147
Taichi removed from ROCm docs [Develop] ( #5779 )
...
* Taichi removed from ROCm docs
* Warnings fixed
2025-12-16 13:12:40 -05:00
yugang-amd
f2067767e0
xdit-diffusion v25.11 docs ( #5744 )
2025-12-05 17:09:48 -05:00
Pratik Basyal
b93fdb811c
7.1.1 pre-GA public link reset ( #627 )
...
* 7.1.1 pre-GA public link reset
* Update CHANGELOG.md
2025-11-26 08:38:13 -05:00
Istvan Kiss
81b7745f8e
Docs: Add Environment Variable Page ( #395 )
...
Co-authored-by: Adel Johar <adel.johar@amd.com >
2025-11-19 17:40:26 +01:00
Pratik Basyal
fb098b6354
Initial changes for 7.1.1 release notes ( #622 )
...
* Changelog and tables updates for 7.1.1 release notes
* Changelog synced
* Naming updated
* Added upcoming changes for composable kernel
* Update RELEASE.md
Co-authored-by: Pratik Basyal <prbasyal@amd.com >
* Update RELEASE.md
* Highlights updated for DGL, ROCm-DS, and HIP documentation
* Changelog synced
* Offline, runfile and ROCm Bandwidth test updated
* CK/AITER highlight added
* Changelog synced
* AI model highlight updated
* PLDM version added
* Changelog updated
* Leo's feedback incorporated
* Compatibility and PLDM versions updated
* New docs update added
* ROCm resolved issue added
* Review feedback added
* Link added
* PLDM updated
* PLDM table updated
* Changes
---------
Co-authored-by: spolifroni-amd <Sandra.Polifroni@amd.com >
2025-11-17 12:09:59 -05:00
peterjunpark
1515fb3779
Revert "Add xdit diffusion docs ( #5576 )" ( #5580 )
...
This reverts commit 4132a2609c .
2025-10-27 16:22:28 -04:00
Kristoffer
4132a2609c
Add xdit diffusion docs ( #5576 )
...
* Add xdit video diffusion base page.
* Update supported accelerators.
* Remove dependency on mad-tags.
* Update docker pull section.
* Update container launch instructions.
* Improve launch instruction options and layout.
* Add benchmark result outputs.
* Fix wrong HunyuanVideo path
* Finalize instructions.
* Consistent title.
* Make page and side-bar titles the same.
* Updated wordlist. Removed note container reg HF.
* Remove fp8_gemms in command and add release notes.
* Update accelerators naming.
* Add note regarding OOB performance.
* Fix admonition box.
* Overall fixes.
2025-10-27 14:56:55 +01:00
peterjunpark
cb8d21a0df
Updates to the vLLM optimization guide for MI300X/MI355X ( #5554 )
...
* Expand vLLM optimization guide for MI300X/MI355X with comprehensive AITER coverage, attention backend selection, environment variables (HIP/RCCL/Quick Reduce), parallelism strategies, quantization (FP8/FP4), engine tuning, CUDA graph modes, and multi-node scaling.
Co-authored-by: PinSiang <pinsiang.tan@embeddedllm.com >
Co-authored-by: Hongxia Yang <62075498+hongxiayang@users.noreply.github.com >
Co-authored-by: pinsiangamd <pinsiang.tan@amd.com >
Co-authored-by: Jeffrey Novotny <jnovotny@amd.com >
2025-10-22 12:54:25 -04:00
anisha-amd
a98236a4e3
Main Docs: references of accelerator removal and change to GPU ( #5495 )
...
* Docs: references of accelerator removal and change to GPU
Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com >
Co-authored-by: Pratik Basyal <pratik.basyal@amd.com >
2025-10-16 11:22:10 -04:00
anisha-amd
93c6d17922
Docs: frameworks 25.09 - compatibility - FlashInfer and llama.cpp ( #5462 )
2025-10-02 13:51:36 -04:00
peterjunpark
2e1b4dd5ee
Add multi-node setup instructions for training perf Dockers ( #5449 )
...
---------
Co-authored-by: Jeffrey Novotny <jnovotny@amd.com >
2025-09-30 14:53:38 -04:00
Pratik Basyal
6cf6b34b2e
TOC for ROCm on Radeon and Ryzen updated ( #5429 )
2025-09-24 13:58:26 -05:00
Pratik Basyal
c35a0a121a
ROR link and text updated ( #5426 )
2025-09-24 13:28:13 -05:00
Peter Park
d5101532f7
docs: Add SGLang disaggregated P/D inference w/ Mooncake guide ( #5335 )
...
* add main content
* Update content and format
add clarification
update
update data
* fix
fix
fix
* fix: deepseek v3
* add ki
* Update docs/how-to/rocm-for-ai/inference/benchmark-docker/sglang-distributed.rst
Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com >
* Update docs/how-to/rocm-for-ai/inference/benchmark-docker/sglang-distributed.rst
Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com >
* Update docs/how-to/rocm-for-ai/inference/benchmark-docker/sglang-distributed.rst
Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com >
* Update docs/how-to/rocm-for-ai/inference/benchmark-docker/sglang-distributed.rst
Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com >
* Update docs/how-to/rocm-for-ai/inference/benchmark-docker/sglang-distributed.rst
Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com >
* Update docs/how-to/rocm-for-ai/inference/benchmark-docker/sglang-distributed.rst
Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com >
* Update docs/how-to/rocm-for-ai/inference/benchmark-docker/sglang-distributed.rst
Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com >
* Update docs/how-to/rocm-for-ai/inference/benchmark-docker/sglang-distributed.rst
Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com >
* Update docs/how-to/rocm-for-ai/inference/benchmark-docker/sglang-distributed.rst
Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com >
* Update docs/how-to/rocm-for-ai/inference/benchmark-docker/sglang-distributed.rst
Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com >
---------
Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com >
2025-09-16 10:33:58 -05:00
Peter Park
ef4e7ca1fe
docs(PyTorch training v25.8): Add Primus and update PyTorch training benchmark docs ( #5331 )
...
* pyt: update previous versions list
update conf.py
* pyt: update yaml and rst
update
update toc
* update headings and anchors
* pyt: update doc
* update docker hub urls
2025-09-16 10:33:53 -05:00
Pratik Basyal
412f6f2b0e
700 reset link [Develop] ( #5325 )
...
* TOC link update and manifest removed
* Link reset
* Changelog synced
2025-09-16 08:07:40 -04:00
Parag Bhandari
60e3a8107c
Merge branch 'develop' into develop-internal
2025-09-16 05:12:42 -04:00
anisha-amd
db43d18c37
Docs: frameworks compatibility- ray and llama.cpp ( #5273 )
2025-09-09 11:02:30 -04:00
Swati Rawat
4f4f4556a5
Merge branch 'develop' into swraw/docs
2025-08-28 20:48:33 +05:30
Istvan Kiss
d476d09aff
Update precision support page with missing libraries and RDNA2 and CDNA4 support
2025-08-28 17:09:34 +02:00
srawat
95d1752874
Update _toc.yml.in
2025-08-28 20:35:01 +05:30
srawat
eabf72c2db
Update _toc.yml.in
2025-08-28 20:28:34 +05:30
Pratik Basyal
ea8ff1b17d
UCC and UCX version and release notes update for 7.0.0 ( #521 )
...
* Indentation and formatting updated
* UCC and UCX version updated
* ROCm bandwidth test update
* MI350 series info added
* Changelog update
* ROCm systems Profiler highlight updated
* Redundant removed, pulled out from HIP changelog
* Known issues to Compute profiler added
* ONNX compatibility updated
* ROCm Compute Profiler highlight added
* RN update
* ROCm 700 stack image updated
* ROCM Compute and System highlight updated
* Deep learning frameworks added
* removed BF16 support for MIGraphX -- already in 6.4 release notes; removed FP4 MIGraphX support
* ROCm Compute profiler highlight updated
* Formatting update
* AI framework update
* ROCm Systems Profiler update
* removed mention of CentOS
* ROCm Compute Profiler update
* Feedback changes
* leo's feedback incorporated
* ampersand
* Changelog synced
* Changelog synced
* RHEL 10 removed
* Rocky Linux updated
---------
Co-authored-by: spolifroni-amd <sandra.polifroni@amd.com >
2025-08-26 16:34:27 -04:00
Matt Williams
1d42f7cc62
Deep learning frameworks edits for scale ( #5189 )
...
* Deep learning frameworks edits for scale
Based on https://ontrack-internal.amd.com/browse/ROCDOC-1809
* update table
table
* leo comments
* formatting
* format
* update table based on feedback
* header
* Update machine learning page
* headers
* Apply suggestions from code review
Co-authored-by: anisha-amd <anisha.sankar@amd.com >
* Update .wordlist.txt
* formatting
* Update docs/how-to/deep-learning-rocm.rst
Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com >
---------
Co-authored-by: Matt Williams <Matt.Williams+amdeng@amd.com >
Co-authored-by: anisha-amd <anisha.sankar@amd.com >
Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com >
2025-08-22 11:46:07 -04:00
srawat
c587d75701
listing in TOC
2025-08-22 19:57:27 +05:30
Peter Park
98029db4ee
docs: Add Primus (Megatron) training Docker documentation ( #5218 )
2025-08-21 23:50:55 -04:00
Pratik Basyal
08d0840b69
Post RC3 7.0.0 RN update ( #501 )
...
* Indentation and formatting updated
* AMD SMI changelog update
* Changelog update
* Compute and Systems profiler changelog added
* Highlight added
* AMD SMI link added
* Changelog updated
* Reference link updated
* ROCal changelog added
* rocJpeg added
* Minor change
* version update
* rocpydecode added
* Changelog.md updated
* Heading level error fixed
* Feedback from Jeff incorporated
* Title formatting updated
* Changelog updated
* Changelog updated
* Changelog updates
* HIPCC perl script removed
* TOC for internal purpose updated
* ROCgdb api and ROCdbg added
* Changelog update
* Sandra's feedback added
2025-08-18 14:03:43 -04:00
yugang-amd
cc5bc5a882
Add SGLang inference benchmark doc w/ initial support for DeepSeek-R1-Distill-Qwen-32B ( #4870 )
2025-07-25 12:42:40 -04:00
Peter Park
15ee605d18
Fix branches for install docs in _toc.yml.in ( #5083 )
2025-07-22 11:03:40 -04:00
Peter Park
9ed65a81c4
Add Megatron-LM benchmark doc 5/2 ( #4778 )
...
* reorg files
* add tabs
* update template
* update template
* update wordlist and toc
* add previous version to doc
* add selector paragraph
* update wordlist.txt
2025-05-22 14:28:18 -04:00
Peter Park
0e8b745266
Fix toc ( #4762 )
2025-05-21 12:26:30 -04:00
Alex Xu
58a62bc00e
Merge remote-tracking branch 'external/develop' into sync-develop-from-external
2025-05-21 11:16:31 -04:00
Peter Park
0a77e7b3a5
docs: Add system health check doc under ROCm for AI ( #4736 )
...
* add initial draft
* add to toc and install page
* update wording
* improve documentation structure
* restructure and expand content
* add to training section
* add to conf.py article_pages
* Update docs/how-to/rocm-for-ai/includes/system-health-benchmarks.rst
Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com >
* Update docs/how-to/rocm-for-ai/includes/system-health-benchmarks.rst
Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com >
* update wordlist.txt
* Update docs/how-to/rocm-for-ai/includes/system-health-benchmarks.rst
Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com >
* inference --> AI workloads
* update toc
* update article_pages in conf.py
* Update system validation notes in training docs
* fix links in prerequisite-system-validation
* wording
* add note
* consistency
* remove extra files
* fix links
* add links to training index page
---------
Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com >
2025-05-13 15:54:48 -04:00
Pratik Basyal
169f3bbe5e
641 Release notes update post RC2 batch1 ( #387 )
...
* Release highlight updated
* TOC updated for internal
* RC3 manifest added
* clarify docker image highlight
* update doc highlights
* RC3 changes added
* RC3 manifest added
* ROCm SMI version update
---------
Co-authored-by: Peter Park <peter.park@amd.com >
2025-05-06 15:07:54 -04:00
Peter Park
d44ea40a0d
Add MPT-30B + LLM Foundry doc ( #4704 )
...
* add mpt-30b doc
* add tunableop note
* update MPT doc
* add section
* update wordlist
* fix flash attention version
* update "applies to"
* address review feedback
* Update docs/how-to/rocm-for-ai/training/benchmark-docker/mpt-llm-foundry.rst
Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com >
* Update docs/how-to/rocm-for-ai/training/benchmark-docker/mpt-llm-foundry.rst
Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com >
* Update docs/how-to/rocm-for-ai/training/benchmark-docker/mpt-llm-foundry.rst
Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com >
* update docker details to pytorch-training-v25.5
* update
---------
Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com >
2025-05-02 12:13:20 -04:00
Peter Park
c3faa9670b
Add PyTorch inference benchmark Docker guide (+ CLIP and Chai-1) ( #4654 )
...
* update vLLM links in deploy-your-model.rst
* add pytorch inference benchmark doc
* update toc and vLLM title
* remove previous versions
* update
* wording
* fix link and "applies to"
* add pytorch to wordlist
* add tunableop note to clip
* make tunableop note appear to all models
* Update docs/how-to/rocm-for-ai/inference/pytorch-inference-benchmark.rst
Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com >
* Update docs/how-to/rocm-for-ai/inference/pytorch-inference-benchmark.rst
Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com >
* Update docs/how-to/rocm-for-ai/inference/pytorch-inference-benchmark.rst
Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com >
* Update docs/how-to/rocm-for-ai/inference/pytorch-inference-benchmark.rst
Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com >
* fix incorrect links
* wording
* fix wrong docker pull tag
---------
Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com >
2025-04-23 17:35:52 -04:00
Parag Bhandari
e756d99f65
Merge branch 'develop-internal' into develop
2025-04-11 15:15:19 -04:00
Pratik Basyal
686fcece1d
PRE GA Day 640 update for resetting link and HPC application list ( #367 )
...
* Links reset to point to latest from stg, internal, RTD, and develop
* ROCm for HPC updated
* GA prep changes
2025-04-11 14:12:57 -05:00
Parag Bhandari
db3c46fccf
Merge branch 'develop-internal' into develop
2025-04-11 14:32:09 -04:00
Peter Park
ea66bf386a
Fix more links in documentation ( #4551 )
...
* fix vllm engine args link
* remove RDNA subtree under system optimization in toc
* fix RDNA 2 architecture PDF link
* fix CLR LICENSE.txt link
* fix rocPyDecode license link
2025-04-01 15:56:34 -04:00
amitkumar-amd
b178a7ca78
Update the TOC ( #355 )
...
* remove 1200
* update link on TOC
* Update docs/sphinx/_toc.yml.in
Co-authored-by: Pratik Basyal <pratik.basyal@amd.com >
---------
Co-authored-by: Pratik Basyal <prbasyal@amd.com >
Co-authored-by: Pratik Basyal <pratik.basyal@amd.com >
2025-03-28 15:59:27 -05:00
Peter Park
424e6148bd
Add MaxText training Docker doc
...
Add MaxText training Docker doc
2025-03-28 11:25:06 -04:00
Pratik Basyal
a0faccba37
AMD GPU Docs System optimization migration changes in ROCm Docs Develop ( #4538 )
...
* AMD GPU Docs System optimization migration changes in ROCm Docs (#296 )
* System optimization migration changes in ROCm
* Linting issue fixed
* Linking corrected
* Minor change
* Link updated to Instinct.docs.amd.com
* ROCm docs grid updated by removing IOMMU.rst, pcie-atomics, and oversubscription pages
* Files removed and reference fixed
* Reference text updated
* GPU atomics from 6.4.0 removed
2025-03-27 16:38:10 -04:00
Pratik Basyal
544149631a
AMD GPU Docs System optimization migration changes in ROCm Docs ( #296 )
...
* System optimization migration changes in ROCm
* Linting issue fixed
* Linking corrected
* Minor change
* Link updated to Instinct.docs.amd.com
* ROCm docs grid updated by removing IOMMU.rst, pcie-atomics, and oversubscription pages
* Files removed and reference fixed
* Reference text updated
2025-03-26 10:01:33 -04:00
Peter Park
58d42ec50b
Improve "tuning guides" landing page ( #4504 )
...
* Improve "tuning guides" landing page
* Update docs/how-to/gpu-performance/mi300x.rst
Co-authored-by: Pratik Basyal <pratik.basyal@amd.com >
* Update docs/how-to/gpu-performance/mi300x.rst
Co-authored-by: Pratik Basyal <pratik.basyal@amd.com >
* change tuning to optimization
---------
Co-authored-by: Pratik Basyal <pratik.basyal@amd.com >
2025-03-25 13:54:27 -04:00
Pratik Basyal
e980ea5e57
Pre ga 640 update ( #333 )
...
* ROCProfiler deprecation notice updated
* Link error
* Compatibility updated
* New changelog and OS support updated
* Upcoming changes removed from rocWMMA, added to hipTensor
* Glibc added to wordlist
* Instinct docs content added
* RHEL 9.5 to OS
* Compatibility OS update
* Leo's feedback incorporated and TOC updated for linux requirement
2025-03-21 16:09:53 -04:00
Istvan Kiss
635838e7ef
Add atomics operation support page
2025-03-20 17:11:02 +01:00
Peter Park
9b2ce2b634
Update vLLM performance Docker docs ( #4491 )
...
* add links to performance results
words
* change "performance validation" to "performance testing"
* update vLLM docker 3/11
* add previous versions
add previous versions
* fix llama 3.1 8b model repo name
* words
2025-03-13 10:04:21 -04:00
Istvan Kiss
cd57bc8186
Fix white paper links
2025-02-27 15:29:06 +01:00