github/ROCm - ROCm - AtHeartEngineering

mirror of https://github.com/ROCm/ROCm.git synced 2026-01-08 22:28:06 -05:00

Author	SHA1	Message	Date
peterjunpark	e0b8ec4dfb	Update training docs for Primus/25.11 (#5819 ) * update conf and toc.yml.in * archive previous versions archive data files update anchors * primus pytorch: remove training batch size args * update primus megatron run cmds multi-node * update primus pytorch update * update update * update docker tag	2025-12-29 08:05:47 -05:00
Pratik Basyal	78e8baf147	Taichi removed from ROCm docs [Develop] (#5779 ) * Taichi removed from ROCm docs * Warnings fixed	2025-12-16 13:12:40 -05:00
yugang-amd	f2067767e0	xdit-diffusion v25.11 docs (#5744 )	2025-12-05 17:09:48 -05:00
Pratik Basyal	b93fdb811c	7.1.1 pre-GA public link reset (#627 ) * 7.1.1 pre-GA public link reset * Update CHANGELOG.md	2025-11-26 08:38:13 -05:00
Istvan Kiss	81b7745f8e	Docs: Add Environment Variable Page (#395 ) Co-authored-by: Adel Johar <adel.johar@amd.com>	2025-11-19 17:40:26 +01:00
Pratik Basyal	fb098b6354	Initial changes for 7.1.1 release notes (#622 ) * Changelog and tables updates for 7.1.1 release notes * Changelog synced * Naming updated * Added upcoming changes for composable kernel * Update RELEASE.md Co-authored-by: Pratik Basyal <prbasyal@amd.com> * Update RELEASE.md * Highlights updated for DGL, ROCm-DS, and HIP documentation * Changelog synced * Offline, runfile and ROCm Bandwidth test updated * CK/AITER highlight added * Changelog synced * AI model highlight updated * PLDM version added * Changelog updated * Leo's feedback incorporated * Compatibility and PLDM versions updated * New docs update added * ROCm resolved issue added * Review feedback added * Link added * PLDM updated * PLDM table updated * Changes --------- Co-authored-by: spolifroni-amd <Sandra.Polifroni@amd.com>	2025-11-17 12:09:59 -05:00
peterjunpark	1515fb3779	Revert "Add xdit diffusion docs (#5576 )" (#5580 ) This reverts commit `4132a2609c`.	2025-10-27 16:22:28 -04:00
Kristoffer	4132a2609c	Add xdit diffusion docs (#5576 ) * Add xdit video diffusion base page. * Update supported accelerators. * Remove dependency on mad-tags. * Update docker pull section. * Update container launch instructions. * Improve launch instruction options and layout. * Add benchmark result outputs. * Fix wrong HunyuanVideo path * Finalize instructions. * Consistent title. * Make page and side-bar titles the same. * Updated wordlist. Removed note container reg HF. * Remove fp8_gemms in command and add release notes. * Update accelerators naming. * Add note regarding OOB performance. * Fix admonition box. * Overall fixes.	2025-10-27 14:56:55 +01:00
peterjunpark	cb8d21a0df	Updates to the vLLM optimization guide for MI300X/MI355X (#5554 ) * Expand vLLM optimization guide for MI300X/MI355X with comprehensive AITER coverage. attention backend selection, environment variables (HIP/RCCL/Quick Reduce), parallelism strategies, quantization (FP8/FP4), engine tuning, CUDA graph modes, and multi-node scaling. Co-authored-by: PinSiang <pinsiang.tan@embeddedllm.com> Co-authored-by: Hongxia Yang <62075498+hongxiayang@users.noreply.github.com> Co-authored-by: pinsiangamd <pinsiang.tan@amd.com> Co-authored-by: Jeffrey Novotny <jnovotny@amd.com>	2025-10-22 12:54:25 -04:00
anisha-amd	a98236a4e3	Main Docs: references of accelerator removal and change to GPU (#5495 ) * Docs: references of accelerator removal and change to GPU Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com> Co-authored-by: Pratik Basyal <pratik.basyal@amd.com>	2025-10-16 11:22:10 -04:00
anisha-amd	93c6d17922	Docs: frameworks 25.09 - compatibility - FlashInfer and llama.cpp (#5462 )	2025-10-02 13:51:36 -04:00
peterjunpark	2e1b4dd5ee	Add multi-node setup instructions for training perf Dockers (#5449 ) --------- Co-authored-by: Jeffrey Novotny <jnovotny@amd.com>	2025-09-30 14:53:38 -04:00
Pratik Basyal	6cf6b34b2e	TOC for ROCm on Radeon and Ryzen updated (#5429 )	2025-09-24 13:58:26 -05:00
Pratik Basyal	c35a0a121a	ROR link and text updated (#5426 )	2025-09-24 13:28:13 -05:00
Peter Park	d5101532f7	docs: Add SGLang disaggregated P/D inference w/ Mooncake guide (#5335 ) * add main content * Update content and format add clarification update update data * fix fix fix * fix: deepseek v3 * add ki * Update docs/how-to/rocm-for-ai/inference/benchmark-docker/sglang-distributed.rst Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com> * Update docs/how-to/rocm-for-ai/inference/benchmark-docker/sglang-distributed.rst Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com> * Update docs/how-to/rocm-for-ai/inference/benchmark-docker/sglang-distributed.rst Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com> * Update docs/how-to/rocm-for-ai/inference/benchmark-docker/sglang-distributed.rst Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com> * Update docs/how-to/rocm-for-ai/inference/benchmark-docker/sglang-distributed.rst Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com> * Update docs/how-to/rocm-for-ai/inference/benchmark-docker/sglang-distributed.rst Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com> * Update docs/how-to/rocm-for-ai/inference/benchmark-docker/sglang-distributed.rst Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com> * Update docs/how-to/rocm-for-ai/inference/benchmark-docker/sglang-distributed.rst Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com> * Update docs/how-to/rocm-for-ai/inference/benchmark-docker/sglang-distributed.rst Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com> * Update docs/how-to/rocm-for-ai/inference/benchmark-docker/sglang-distributed.rst Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com> --------- Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com>	2025-09-16 10:33:58 -05:00
Peter Park	ef4e7ca1fe	docs(PyTorch training v25.8): Add Primus and update PyTorch training benchmark docs (#5331 ) * pyt: update previous versions list update conf.py * pyt: update yaml and rst update update toc * update headings and anchors * pyt: update doc * update docker hub urls	2025-09-16 10:33:53 -05:00
Pratik Basyal	412f6f2b0e	700 reset link [Develop] (#5325 ) * TOC link update and manifest removed * Link reset * Changelog synced	2025-09-16 08:07:40 -04:00
Parag Bhandari	60e3a8107c	Merge branch 'develop' into develop-internal	2025-09-16 05:12:42 -04:00
anisha-amd	db43d18c37	Docs: frameworks compatibility- ray and llama.cpp (#5273 )	2025-09-09 11:02:30 -04:00
Swati Rawat	4f4f4556a5	Merge branch 'develop' into swraw/docs	2025-08-28 20:48:33 +05:30
Istvan Kiss	d476d09aff	Update precision support page with missing libraries and RDNA2 and CDNA4 support	2025-08-28 17:09:34 +02:00
srawat	95d1752874	Update _toc.yml.in	2025-08-28 20:35:01 +05:30
srawat	eabf72c2db	Update _toc.yml.in	2025-08-28 20:28:34 +05:30
Pratik Basyal	ea8ff1b17d	UCC and UCX version and release notes update for 7.0.0 (#521 ) * Indentation and formatting updated * UCC and UCX version updated * ROCm bandwidth test update * MI350 series info added * Changelog update * ROCm systems Profiler highlight updated * Redundant removed, pulled out from HIP changelog * Known issues to Compute profiler added * ONNX compatibility updated * ROCm Compute Profiler highlight added * RN update * ROCm 700 stack image updated * ROCm Compute and System highlight updated * Deep learning frameworks added * removed BF16 support for MIGraphX -- already in 6.4 release notes; removed FP4 MIGraphX support * ROCm Compute profiler highlight updated * Formatting update * AI framework update * ROCm Systems Profiler update * removed mention of CentOS of CentOS * ROCm Compute Profiler update * Feedback changes * leo's feedback incorporated * ampersand * Changelog synced * Changelog synced * RHEL 10 removed * Rocky Linux updated --------- Co-authored-by: spolifroni-amd <sandra.polifroni@amd.com>	2025-08-26 16:34:27 -04:00
Matt Williams	1d42f7cc62	Deep learning frameworks edits for scale (#5189 ) * Deep learning frameworks edits for scale Based on https://ontrack-internal.amd.com/browse/ROCDOC-1809 * update table table * leo comments * formatting * format * update table based on feedback * header * Update machine learning page * headers * Apply suggestions from code review Co-authored-by: anisha-amd <anisha.sankar@amd.com> * Update .wordlist.txt * formatting * Update docs/how-to/deep-learning-rocm.rst Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com> --------- Co-authored-by: Matt Williams <Matt.Williams+amdeng@amd.com> Co-authored-by: anisha-amd <anisha.sankar@amd.com> Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com>	2025-08-22 11:46:07 -04:00
srawat	c587d75701	listing in TOC	2025-08-22 19:57:27 +05:30
Peter Park	98029db4ee	docs: Add Primus (Megatron) training Docker documentation (#5218 )	2025-08-21 23:50:55 -04:00
Pratik Basyal	08d0840b69	Post RC3 7.0.0 RN update (#501 ) * Indentation and formatting updated * AMD SMI changelog update * Changelog update * Compute and Systems profiler changelog added * Highlight added * AMD SMI link added * Changelog updated * Reference link updated * ROCal changelog added * rocJpeg added * Minor change * version update * rocpydecode added * Changelog.md updated * Heading level error fixed * Feedback from Jeff incorporated * Title formatting updated * Changelog updated * Changelog updated * Changelog updates * HIPCC perl script removed * TOC for internal purpose updated * ROCgdb api and ROCdbg added * Changelog update * Sandra's feedback added	2025-08-18 14:03:43 -04:00
yugang-amd	cc5bc5a882	Add SGLang inference benchmark doc w/ initial support for DeepSeek-R1-Distill-Qwen-32B (#4870 )	2025-07-25 12:42:40 -04:00
Peter Park	15ee605d18	Fix branches for install docs in _toc.yml.in (#5083 )	2025-07-22 11:03:40 -04:00
Peter Park	9ed65a81c4	Add Megatron-LM benchmark doc 5/2 (#4778 ) * reorg files * add tabs * update template * update template * update wordlist and toc * add previous version to doc * add selector paragraph * update wordlist.txt	2025-05-22 14:28:18 -04:00
Peter Park	0e8b745266	Fix toc (#4762 )	2025-05-21 12:26:30 -04:00
Alex Xu	58a62bc00e	Merge remote-tracking branch 'external/develop' into sync-develop-from-external	2025-05-21 11:16:31 -04:00
Peter Park	0a77e7b3a5	docs: Add system health check doc under ROCm for AI (#4736 ) * add initial draft * add to toc and install page * update wording * improve documentation structure * restructure and expand content * add to training section * add to conf.py article_pages * Update docs/how-to/rocm-for-ai/includes/system-health-benchmarks.rst Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com> * Update docs/how-to/rocm-for-ai/includes/system-health-benchmarks.rst Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com> * update wordlist.txt * Update docs/how-to/rocm-for-ai/includes/system-health-benchmarks.rst Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com> * inference --> AI workloads * update toc * update article_pages in conf.py * Update system validation notes in training docs * fix links in prerequisite-system-validation * wording * add note * consistency * remove extra files * fix links * add links to training index page --------- Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com>	2025-05-13 15:54:48 -04:00
Pratik Basyal	169f3bbe5e	641 Release notes update post RC2 batch1 (#387 ) * Release highlight updated * TOC updated for internal * RC3 manifest added * clarify docker image highlight * update doc highlights * RC3 changes added * RC3 manifest added * ROCm SMI version update --------- Co-authored-by: Peter Park <peter.park@amd.com>	2025-05-06 15:07:54 -04:00
Peter Park	d44ea40a0d	Add MPT-30B + LLM Foundry doc (#4704 ) * add mpt-30b doc * add tunableop note * update MPT doc * add section * update wordlist * fix flash attention version * update "applies to" * address review feedback * Update docs/how-to/rocm-for-ai/training/benchmark-docker/mpt-llm-foundry.rst Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com> * Update docs/how-to/rocm-for-ai/training/benchmark-docker/mpt-llm-foundry.rst Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com> * Update docs/how-to/rocm-for-ai/training/benchmark-docker/mpt-llm-foundry.rst Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com> * update docker details to pytorch-training-v25.5 * update --------- Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com>	2025-05-02 12:13:20 -04:00
Peter Park	c3faa9670b	Add PyTorch inference benchmark Docker guide (+ CLIP and Chai-1) (#4654 ) * update vLLM links in deploy-your-model.rst * add pytorch inference benchmark doc * update toc and vLLM title * remove previous versions * update * wording * fix link and "applies to" * add pytorch to wordlist * add tunableop note to clip * make tunableop note appear to all models * Update docs/how-to/rocm-for-ai/inference/pytorch-inference-benchmark.rst Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com> * Update docs/how-to/rocm-for-ai/inference/pytorch-inference-benchmark.rst Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com> * Update docs/how-to/rocm-for-ai/inference/pytorch-inference-benchmark.rst Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com> * Update docs/how-to/rocm-for-ai/inference/pytorch-inference-benchmark.rst Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com> * fix incorrect links * wording * fix wrong docker pull tag --------- Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com>	2025-04-23 17:35:52 -04:00
Parag Bhandari	e756d99f65	Merge branch 'develop-internal' into develop	2025-04-11 15:15:19 -04:00
Pratik Basyal	686fcece1d	PRE GA Day 640 update for resetting link and HPC application list (#367 ) * Links reset to point to latest from stg, internal, RTD, and develop * ROCm for HPC updated * GA prep changes	2025-04-11 14:12:57 -05:00
Parag Bhandari	db3c46fccf	Merge branch 'develop-internal' into develop	2025-04-11 14:32:09 -04:00
Peter Park	ea66bf386a	Fix more links in documentation (#4551 ) * fix vllm engine args link * remove RDNA subtree in under system optimization in toc * fix RDNA 2 architecture PDF link * fix CLR LICENSE.txt link * fix rocPyDecode license link	2025-04-01 15:56:34 -04:00
amitkumar-amd	b178a7ca78	Update the TOC (#355 ) * remove 1200 * update link on TOC * Update docs/sphinx/_toc.yml.in Co-authored-by: Pratik Basyal <pratik.basyal@amd.com> --------- Co-authored-by: Pratik Basyal <prbasyal@amd.com> Co-authored-by: Pratik Basyal <pratik.basyal@amd.com>	2025-03-28 15:59:27 -05:00
Peter Park	424e6148bd	Add MaxText training Docker doc Add MaxText training Docker doc	2025-03-28 11:25:06 -04:00
Pratik Basyal	a0faccba37	AMD GPU Docs System optimization migration changes in ROCm Docs Develop (#4538 ) * AMD GPU Docs System optimization migration changes in ROCm Docs (#296) * System optimization migration changes in ROCm * Linting issue fixed * Linking corrected * Minor change * Link updated to Instinct.docs.amd.com * ROCm docs grid updated by removing IOMMU.rst, pcie-atomics, and oversubscription pages * Files removed and reference fixed * Reference text updated * GPU atomics from 6.4.0 removed	2025-03-27 16:38:10 -04:00
Pratik Basyal	544149631a	AMD GPU Docs System optimization migration changes in ROCm Docs (#296 ) * System optimization migration changes in ROCm * Linting issue fixed * Linking corrected * Minor change * Link updated to Instinct.docs.amd.com * ROCm docs grid updated by removing IOMMU.rst, pcie-atomics, and oversubscription pages * Files removed and reference fixed * Reference text updated	2025-03-26 10:01:33 -04:00
Peter Park	58d42ec50b	Improve "tuning guides" landing page (#4504 ) * Improve "tuning guides" landing page * Update docs/how-to/gpu-performance/mi300x.rst Co-authored-by: Pratik Basyal <pratik.basyal@amd.com> * Update docs/how-to/gpu-performance/mi300x.rst Co-authored-by: Pratik Basyal <pratik.basyal@amd.com> * change tuning to optimization --------- Co-authored-by: Pratik Basyal <pratik.basyal@amd.com>	2025-03-25 13:54:27 -04:00
Pratik Basyal	e980ea5e57	Pre ga 640 update (#333 ) * ROCProfiler deprecation notice updated * Link error * Compatibility updated * New changelog and OS support updated * Upcoming changes removed from rocWMMA, added to hipTensor * Glibc added to wordlist * Instinct docs content added * RHEL 9.5 to OS * Compatibility OS update * Leo's feedback incorporated and TOC updated for linux requirement	2025-03-21 16:09:53 -04:00
Istvan Kiss	635838e7ef	Add atomics operation support page	2025-03-20 17:11:02 +01:00
Peter Park	9b2ce2b634	Update vLLM performance Docker docs (#4491 ) * add links to performance results words * change "performance validation" to "performance testing" * update vLLM docker 3/11 * add previous versions add previous versions * fix llama 3.1 8b model repo name * words	2025-03-13 10:04:21 -04:00
Istvan Kiss	cd57bc8186	Fix white paper links	2025-02-27 15:29:06 +01:00

183 Commits