Alex Xu
8c28f9ca9f
Merge remote-tracking branch 'external/develop' into sync-devlop-from-external
2026-01-21 14:34:02 -05:00
randyh62
45bd726f55
Use intersphinx links for deep learning (#5859)
* Use intersphinx links for deep learning
* Update deep-learning-rocm.rst
remove Taichi
* Update deep-learning-rocm.rst
Change Install link to "link"
* Apply suggestion from @randyh62
OK
2026-01-16 13:17:47 -08:00
peterjunpark
a745e45dcb
Doc update for vLLM refactor #5855
2026-01-15 11:21:38 -05:00
peterjunpark
2dc22ca890
fix(primus-pytorch.rst): FP8 config instead of BF16 (#5839)
2026-01-07 13:49:31 -05:00
Pratik Basyal
8d076740b8
720 RC2 update (#660)
* New GPUs listed
* GPU highlights updated
* OS table removed
* JAX 0.8.0 support added
* Apply suggestions from code review
Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com>
* Azure Linux 3.0 removed
* Review feedback added
* Release and changelog synced
* Minor corrections and date change
---------
Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com>
2026-01-07 11:20:08 -05:00
Swati Rawat
61d2424ab7
Update docs/how-to/rocm-for-ai/training/benchmark-docker/previous-versions/megatron-lm-v24.12-dev.rst
Co-authored-by: peterjunpark <git@peterjunpark.com>
2026-01-05 18:18:35 +05:30
Swati Rawat
2e3500a111
Update docs/how-to/rocm-for-ai/system-setup/prerequisite-system-validation.rst
Co-authored-by: peterjunpark <git@peterjunpark.com>
2026-01-05 18:18:25 +05:30
Swati Rawat
fa4bf5e9ba
Update docs/how-to/rocm-for-ai/system-setup/prerequisite-system-validation.rst
Co-authored-by: peterjunpark <git@peterjunpark.com>
2026-01-05 18:18:17 +05:30
Swati Rawat
2e506f1ae7
Update docs/how-to/rocm-for-ai/system-setup/prerequisite-system-validation.rst
Co-authored-by: peterjunpark <git@peterjunpark.com>
2026-01-05 18:18:00 +05:30
Swati Rawat
56b684fcae
Update docs/how-to/rocm-for-ai/training/benchmark-docker/previous-versions/megatron-lm-v24.12-dev.rst
Co-authored-by: peterjunpark <git@peterjunpark.com>
2026-01-05 18:17:40 +05:30
Swati Rawat
b3e78704f5
Update docs/how-to/rocm-for-ai/training/benchmark-docker/previous-versions/megatron-lm-v24.12-dev.rst
Co-authored-by: peterjunpark <git@peterjunpark.com>
2026-01-05 18:17:11 +05:30
peterjunpark
172b0f7c08
Fix inconsistency in xDiT doc
2025-12-29 10:26:25 -05:00
peterjunpark
c67fac78bd
Update docs for xDiT diffusion inference 25.13 Docker release (#5820)
* archive previous version
* add xdit 25.13
* update history index
* add perf results section
2025-12-29 08:44:45 -05:00
peterjunpark
e0b8ec4dfb
Update training docs for Primus/25.11 (#5819)
* update conf and toc.yml.in
* archive previous versions
archive data files
update anchors
* primus pytorch: remove training batch size args
* update primus megatron run cmds
multi-node
* update primus pytorch
update
* update
update
* update docker tag
2025-12-29 08:05:47 -05:00
srawat
756fad8435
Update single-gpu-fine-tuning-and-inference.rst
2025-12-23 16:05:01 +05:30
peterjunpark
3a43bacdda
Update xdit diffusion inference history (#5808)
* Update xdit diffusion inference history
* fix
2025-12-22 11:05:32 -05:00
srawat
f84d9574a8
Update multi-gpu-fine-tuning-and-inference.rst
2025-12-22 17:30:39 +05:30
peterjunpark
48d8fe139b
fix link to ROCm PyT docker image (#5803)
2025-12-19 15:47:55 -05:00
peterjunpark
7455fe57b8
clean up formatting in FA2 page (#5795)
2025-12-19 09:21:41 -05:00
peterjunpark
52c0a47e84
Update Flash Attention guidance in "Model acceleration libraries" (#5793)
* flash attention update
Signed-off-by: seungrok.jung <seungrok.jung@amd.com>
flash attention update
Signed-off-by: seungrok.jung <seungrok.jung@amd.com>
flash attention update
Signed-off-by: seungrok.jung <seungrok.jung@amd.com>
sentence-case heading
* Update docs/how-to/rocm-for-ai/inference-optimization/model-acceleration-libraries.rst
Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com>
* Apply suggestions from code review
Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com>
---------
Co-authored-by: seungrok.jung <seungrok.jung@amd.com>
Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com>
2025-12-19 08:48:52 -05:00
peterjunpark
cbab9a465d
Update documentation for JAX training MaxText 25.11 release (#5789)
2025-12-18 11:23:58 -05:00
peterjunpark
459283da3c
xDiT diffusion inference v25.12 documentation update (#5786)
* Add xdit-diffusion ROCm docs page.
* Update template formatting and fix sphinx warnings
* Add System Validation section.
* Add sw component versions/commits.
* Update to use latest v25.10 image instead of v25.9
* Update commands and add FLUX instructions.
* Update Flux instructions. Change image tag. Describe as diffusion inference instead of specifically video.
* git rm xdit-video-diffusion.rst
* Docs for v25.12
* Add hyperlinks to components
* Command fixes
* -Diffusers suffix
* Simplify yaml file and cleanup main rst page.
* Spelling, added 'js'
* fix merge conflict
fix
---------
Co-authored-by: Kristoffer <kristoffer.torp@amd.com>
2025-12-17 10:20:10 -05:00
srawat
00683dc244
Update prerequisite-system-validation.rst
2025-12-17 19:59:10 +05:30
peterjunpark
1b4f25733d
vLLM inference benchmark 1210 (#5776)
* Archive previous ver
fix anchors
* Update vllm.rst and data yaml for 20251210
2025-12-17 09:21:57 -05:00
srawat
535b051b8d
replace rocm-smi reference with amd-smi
2025-12-17 19:42:50 +05:30
Pratik Basyal
78e8baf147
Taichi removed from ROCm docs [Develop] (#5779)
* Taichi removed from ROCm docs
* Warnings fixed
2025-12-16 13:12:40 -05:00
Matt Williams
65a936023b
Fixing link redirects (#5758)
* Update multi-gpu-fine-tuning-and-inference.rst
* Update pytorch-training-v25.6.rst
* Update pytorch-compatibility.rst
2025-12-10 11:17:59 -05:00
peterjunpark
bf74351e5a
Fix Primus PyTorch doc: training.batch_size -> training.local_batch_size (#5748)
2025-12-08 13:35:22 -05:00
yugang-amd
f2067767e0
xdit-diffusion v25.11 docs (#5744)
2025-12-05 17:09:48 -05:00
peterjunpark
453751a86f
fix docker hub links for primus:v25.10 (#5738)
2025-12-04 09:17:33 -05:00
peterjunpark
fb644412d5
Update training Docker docs for Primus 25.10 (#5737)
2025-12-04 09:08:00 -05:00
Alex Xu
007f24fe7b
Merge remote-tracking branch 'external/develop' into sync-develop-from-external
2025-11-26 10:09:04 -05:00
Pratik Basyal
fb098b6354
Initial changes for 7.1.1 release notes (#622)
* Changelog and tables updates for 7.1.1 release notes
* Changelog synced
* Naming updated
* Added upcoming changes for composable kernel
* Update RELEASE.md
Co-authored-by: Pratik Basyal <prbasyal@amd.com>
* Update RELEASE.md
* Highlights updated for DGL, ROCm-DS, and HIP documentation
* Changelog synced
* Offline, runfile and ROCm Bandwidth test updated
* CK/AITER highlight added
* Changelog synced
* AI model highlight updated
* PLDM version added
* Changelog updated
* Leo's feedback incorporated
* Compatibility and PLDM versions updated
* New docs update added
* ROCm resolved issue added
* Review feedback added
* Link added
* PLDM updated
* PLDM table updated
* Changes
---------
Co-authored-by: spolifroni-amd <Sandra.Polifroni@amd.com>
2025-11-17 12:09:59 -05:00
peterjunpark
eb956cfc5c
Fixed wording related to VLLM_V1_USE_PREFILL_DECODE_ATTENTION (#5605)
Co-authored-by: Hongxia Yang <hongxia.yang@amd.com>
2025-11-11 09:22:11 -05:00
peterjunpark
e05cdca54f
Fix references to vLLM docs (#5651)
2025-11-11 09:00:07 -05:00
anisha-amd
04c7374f41
Docs: frameworks 25.10 - compatibility - DGL and llama.cpp (#5648)
2025-11-10 15:26:54 -05:00
yugang-amd
674dc355e4
vLLM 10/24 release (#5626)
* vLLM 10/24 release
* updates per SME inputs
* Update docs/how-to/rocm-for-ai/inference/benchmark-docker/vllm.rst
Co-authored-by: Jeffrey Novotny <jnovotny@amd.com>
---------
Co-authored-by: Jeffrey Novotny <jnovotny@amd.com>
2025-11-05 11:13:50 -05:00
peterjunpark
1515fb3779
Revert "Add xdit diffusion docs (#5576)" (#5580)
This reverts commit 4132a2609c.
2025-10-27 16:22:28 -04:00
Kristoffer
4132a2609c
Add xdit diffusion docs (#5576)
* Add xdit video diffusion base page.
* Update supported accelerators.
* Remove dependency on mad-tags.
* Update docker pull section.
* Update container launch instructions.
* Improve launch instruction options and layout.
* Add benchmark result outputs.
* Fix wrong HunyuanVideo path
* Finalize instructions.
* Consistent title.
* Make page and side-bar titles the same.
* Updated wordlist. Removed note container reg HF.
* Remove fp8_gemms in command and add release notes.
* Update accelerators naming.
* Add note regarding OOB performance.
* Fix admonition box.
* Overall fixes.
2025-10-27 14:56:55 +01:00
peterjunpark
35ca027aa4
Fix broken links under rocm-for-ai/ (#5564)
2025-10-23 14:39:58 -04:00
peterjunpark
90c1d9068f
add xref to vllm v1 optimization guide in workload.rst (#5560)
2025-10-22 13:47:46 -04:00
peterjunpark
cb8d21a0df
Updates to the vLLM optimization guide for MI300X/MI355X (#5554)
* Expand vLLM optimization guide for MI300X/MI355X with comprehensive AITER coverage, attention backend selection, environment variables (HIP/RCCL/Quick Reduce), parallelism strategies, quantization (FP8/FP4), engine tuning, CUDA graph modes, and multi-node scaling.
Co-authored-by: PinSiang <pinsiang.tan@embeddedllm.com>
Co-authored-by: Hongxia Yang <62075498+hongxiayang@users.noreply.github.com>
Co-authored-by: pinsiangamd <pinsiang.tan@amd.com>
Co-authored-by: Jeffrey Novotny <jnovotny@amd.com>
2025-10-22 12:54:25 -04:00
peterjunpark
a613bd6824
JAX Maxtext v25.9 doc update (#5532)
* archive previous version (25.7)
* update docker components list for 25.9
* update template
* update docker pull tag
* update
* fix intro
2025-10-17 11:31:06 -04:00
peterjunpark
14bb59fca9
Update Megatron/PyTorch Primus 25.9 docs (#5528)
* add previous versions
* Fix heading levels in pages using embedded templates (#5468)
* update primus-megatron doc
update megatron-lm doc
update templates
fix tab
update primus-megatron model configs
Update primus-pytorch model configs
fix css class
add posttrain to pytorch-training template
update data sheets
update
update
update
update docker tags
* Add known issue and update Primus/Turbo versions
* add primus ver to histories
* update primus ver to 0.1.1
* fix leftovers from merge conflict
2025-10-16 12:51:30 -04:00
anisha-amd
a98236a4e3
Main Docs: references of accelerator removal and change to GPU (#5495)
* Docs: references of accelerator removal and change to GPU
Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com>
Co-authored-by: Pratik Basyal <pratik.basyal@amd.com>
2025-10-16 11:22:10 -04:00
Pratik Basyal
036aaa2e78
ROCm for HPC topic updated Develop (#5504)
* ROCm for HPC topic updated
* ROCm for HPC topic updated
* Minor editorial
2025-10-10 22:31:51 -04:00
peterjunpark
68e8453ca5
Update vLLM doc for 10/6 release and bump rocm-docs-core to 1.26.0 (#5481)
* archive previous doc version
* update model/docker data and doc templates
* Update "Reproducing the Docker image"
* fix: truncated commit hash doesn't work for some reason
* bump rocm-docs-core to 1.26.0
* fix numbering
fix
* update docker tag
* update .wordlist.txt
2025-10-08 16:23:40 -04:00
peterjunpark
eeea0d2180
Fix heading levels in pages using embedded templates (#5468)
2025-10-03 13:33:14 -04:00
anisha-amd
93c6d17922
Docs: frameworks 25.09 - compatibility - FlashInfer and llama.cpp (#5462)
2025-10-02 13:51:36 -04:00
peterjunpark
2e1b4dd5ee
Add multi-node setup instructions for training perf Dockers (#5449)
---------
Co-authored-by: Jeffrey Novotny <jnovotny@amd.com>
2025-09-30 14:53:38 -04:00