* remove 'Using MPI' and 'gpu-cluster-networking' sections due to migration to dcgpu
* remove gpu-cluster-networking from index page
---------
Co-authored-by: Alex Xu <alex.xu@amd.com>
minor fixes to formatting
fix spelling errors
more spelling
fixes
quantization update
fix format
simplify wording in tunableops and format fix
Apply suggestions from code review
review feedback by Peter
Co-authored-by: Peter Park <peter.park@amd.com>
Apply suggestions from code review
addressing feedback
Co-authored-by: Peter Park <peter.park@amd.com>
Apply suggestions from code review
feedback again
Co-authored-by: Peter Park <peter.park@amd.com>
add hipblaslt yaml file figure
feedback and minor formatting
formatting
update wordlist.txt
remove outdated sentence regarding fsdp and rccl
(cherry picked from commit 87fa9fd83a2e623f6cab4e69d65f49e3db0a45f6)
update wordlist
Co-authored-by: hongxyan <hongxyan@amd.com>
* Fix Radeon link and point at R6.1.3 as absolute link (#3757)
* Update ROCm manifest to 6.2.1
* Update ROCm branch name
* Add 6.2.1 to version list (#3770)
* Add links to GH issues in 6.2.1 release notes (#3769)
* add MAD page
* link to GitHub issues in release notes known issues
* update templates for 6.2.1
* Revert "add MAD page"
This reverts commit 9cce72bba3.
* update wordlist for spellcheck linter
* add rccl note
* update rocal version change heading to be more obvious
* make rocal note more specific
* fix missing space
* fix capitalization
* Update RCCL known issue wording (#3775)
* add MAD page
* fix wording in RCCL known issue
* Revert "add MAD page"
This reverts commit c81d0f3b0a.
* update llvm version for 6.2.1 (#3779)
* Fix broken links in 6.2.1 release notes (#3782)
* External CI: Replace libomp dependencies with aomp (#3781)
Add roctracer dependency for hipBLAS and rocWMMA testing
* External CI: Add rocprofiler v1 and v2 smoke tests (#3784)
* External CI: ROCgdb smoke tests (#3785)
- Since this is an autotools project and not cmake, build and test on gfx942 system instead of separating into two jobs. Pipeline time is short anyway.
- Follow build instructions to update build flags and to incorporate the ROCdbgapi.
- Results are not parsed and graphed, but the log contents are printed at the end. This was helpful for debugging and will be kept in the pipeline, as the make check-gdb command's output was not helpful on its own.
* External CI: rocPyDecode Smoke Test (#3786)
* External CI: omniperf pipeline (#3788)
- Referred to public documentation, source, and iterative attempts to create and improve build and test pipeline.
- ctest failures are due to the test node not having the expected marketing name string and the override not working.
- The fix should be on the omniperf repo side of things, so this pull request should be fine as is.
* External CI: create omniperf pipeline IDs, update nightly build (#3790)
* Fixed greater than to be less than in rocFFT changes
* fix footnote for 6.1.0 (#3791)
* fix footnote for 6.1.0
* fix empty columns in historical KFD title
* External CI: Publish wheel as artifact for rocPyDecode (#3796)
* External CI: fix hip-tests symlink creation (#3799)
* Docs: Add Ubuntu 24.04.1 (#3801)
* add ubuntu 24.04.1
* add 24.04.1 to bottom os section
* fix heading and template
* Update compatibility-matrix.rst for OpenMP version
* Update compatibility-matrix-historical-6.0.csv for OpenMP version
* rm ubuntu 24.04.1 from 6.2.0
* Update docs/compatibility/compatibility-matrix.rst
Co-authored-by: Young Hui - AMD <145490163+yhuiYH@users.noreply.github.com>
* rm duplicate ubuntu in historical
---------
Co-authored-by: Young Hui - AMD <145490163+yhuiYH@users.noreply.github.com>
* External CI: fixes for rocMLIR and nightly build (#3800)
* External CI: fix symlinks for rocMLIR and nightly build
* add pipeline IDs for hip-tests
* fix hip-test ID typo
* remove llvm-alt license (#3727)
* remove llvm-alt license
* fix linting error
* External CI: enable ROCR-Runtime tests (#3809)
* External CI: default branches for hip-tests, omniperf (#3811)
* External CI: torch and torchvision smoke tests (#3810)
* External CI: torch and torchvision smoke tests
- Fixed issues with package name and version for the vision wheel that prevented it from installing. A patch is used until my pull request in vision repo is merged.
- Referred to rocAutomation scripts to pick which test scripts to run out of the many in the torch and vision repo, and iteratively tested suggested scripts to see which ones completed in a timely manner.
- Leveraging pytest-azurepipelines module to automatically parse and graph results from these tests.
* External CI: omnitrace build pipeline (#3812)
* External CI: omnitrace build pipeline starter
- Adding initial set of dependencies and build flags.
* External CI: omnitrace build pipeline
- Add bison, rccl, texinfo dependencies based on build failures.
- Add AMDGPU_TARGETS flag
- Add ROCm binaries to PATH for clang-format and other tools used.
* Fix indentation
---------
Co-authored-by: Daniel Su <danielsu@amd.com>
* External CI: AMDMIGraphX Build Fix (#3814)
- Swap to default gcc on OS to resolve build errors from recent commits.
- Added libdnnl-dev dependency from iterative attempts with compiler change.
- Referred to the passing GitHub checks to observe the compilers that were used.
- Build CK jit lib and include in AMDMIGraphX build.
* External CI: test fixes w/ roctracer, list omniperf as partially succeeding (#3815)
* External CI: rpp tests (#3816)
* External CI: Build pipeline for rocprofiler-sdk (#3819)
* External CI: Pipeline for rocprofiler-sdk
* Add rocprofiler dependency
* External CI: rocprofiler-sdk build pipeline
---------
Co-authored-by: Daniel Su <danielsu@amd.com>
* External CI: Fix/add missing pipeline IDs (#3818)
* External CI: omnitrace tests (#3822)
* Update tags to 6.2.2 (#3827)
* External CI: add roctracer to roc/hipSOLVER test deps (#3825)
* External CI: add rocprofiler-sdk pipeline IDs (#3824)
* External CI: AMDMIGraphX Smoke Tests (#3830)
Co-authored-by: Daniel Su <danielsu@amd.com>
* External CI: MIOpen tests (#3837)
* Point to release history instead of deprecated changelog (#3836)
* External CI: filter out hipTensor extended tests (#3838)
* added revised note re. radeon gpus (#3839)
* Restructured the contributions section. (#3715)
* testing if this file is editable
* changed 'kebob-case' to 'dash-case'
* Restructured the page to be more straightforward and provide additional repo information
* forgot to save
* Moved the topic sentence
* Wrong accent on the a in diataxis
* Removed the feedback info from contributing and moved it to Feedback
* fixed spelling errors
* fixed some wording and removed second person text
* consolidated Build and Structure into Contribute; edited toolchain to (hopefully) conform to style guide; updated toc
* updated the titles in the toc
* made changes based on feedback
* it's better when you save
* removed structure and build; fixed something for the linter
* added rst to wordlist
* added customizations to wordlist
* Add links to gpu cluster network guides (#3763)
* Add links to gpu cluster network guides
* Add newline character to eof
* Make link absolute
* add dynamic branch in toc
* remove unnecessary page
clean up
* clean up index/toc
* make multi-node topics adjacent
---------
Co-authored-by: Peter Park <peter.park@amd.com>
* updated the radeon note (#3850)
* External CI: Fix rocPyDecode wheel creation (#3852)
- Set values for expected environment variables.
- Accompanying changes required in rocPyDecode repo. Pull request will be made.
* External CI: pytorch vision patch removal (#3855)
My pull request applying this patch was merged upstream, so this is no longer needed and will break the pipeline since it can no longer be applied.
* Build(deps): Bump rocm-docs-core from 1.8.1 to 1.8.2 in /docs/sphinx (#3807)
Bumps [rocm-docs-core](https://github.com/ROCm/rocm-docs-core) from 1.8.1 to 1.8.2.
- [Release notes](https://github.com/ROCm/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/v1.8.2/CHANGELOG.md)
- [Commits](https://github.com/ROCm/rocm-docs-core/compare/v1.8.1...v1.8.2)
---
updated-dependencies:
- dependency-name: rocm-docs-core
dependency-type: direct:production
update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* updated the radeon note, as it were (#3857)
* updated the radeon note, as it were
* updated the note again
* Set devops team as codeowners for rocm-build (#3860)
* Set ext CI as codeowners for rocm-build
* Update CODEOWNERS to rocm-devops
* External CI: Add option to pull mainline branch for dependencies (#3689)
* External CI: Add option to pull mainline branch for dependencies
* Missing parameter for mainline branch dependencies.
* External CI: mainline branch definitions
* Removed MIGraphX optimization page (#3848)
* External CI: add a global variable to control gfx942 tests (#3864)
* External CI: update component default/mainline branches (#3871)
* External CI: Stop building gfx90a (#3872)
Save on VM resources until infrastructure has test targets.
* External CI: add libstdc++-12 to rocMLIR (#3874)
* Add building doc section (#3873)
* External CI: programmatically get latest aqlprofile (#3876)
* External CI: use ctest for rocm-examples (#3877)
* External CI: Tensile pipeline (#3884)
* add oversubscription conceptual doc (#3885)
add mitigation steps
add to toc
move page for build
move doc
fix spelling
update doc
update oversubscription
update order
fix spelling
add oversubscription to wordlist
move oversubscription topic to bottom of toc and index
* add oversubscription conceptual doc (#3885)
(cherry picked from commit d0ecf51b0c)
* External CI: Add pipeline to build upstream boost (#3896)
* Update bitsandbytes branch in docs (#3898)
* Documentation: Add reference to precision-support floating-point types (#3899)
* External CI: use Boost template for MIOpen (#3903)
* External CI: create rocprofiler-systems pipeline (#3906)
* External CI: omnitrace/rocprof-sys pipeline IDs (#3908)
* External CI: MIOpen parse test results (#3913)
* External CI: Use pip to install latest cmake on test system (#3915)
* added a link to the compatibility matrix (#3904)
* added a link to the compatibility matrix
* removed quotes
* docs: Remove invalid amd_iommu=on parameter
Per kernel-parameters.txt, there is no "on" option for amd_iommu. While
intel_iommu has it, amd_iommu is automatically on unless specified
otherwise. For more info, see these 2 links:
https://www.kernel.org/doc/Documentation/admin-guide/kernel-parameters.txt
75aa74d52f/drivers/iommu/amd/init.c (L3481)
Signed-off-by: Kent Russell <kent.russell@amd.com>
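A minimal sketch, not part of the change itself, of how a kernel command line could be checked for the setting removed above. The function name and the sample command line are hypothetical; the check only flags the literal token `amd_iommu=on`, which kernel-parameters.txt does not list as a valid option.

```python
def find_invalid_amd_iommu(cmdline: str) -> list[str]:
    """Return the tokens in a kernel command line that set amd_iommu=on.

    amd_iommu has no "on" option; the AMD IOMMU is enabled by default
    unless specified otherwise, so this token can simply be dropped.
    """
    return [token for token in cmdline.split() if token == "amd_iommu=on"]


# Hypothetical command line, for illustration only.
example = "root=/dev/sda1 ro quiet amd_iommu=on iommu=pt"
print(find_invalid_amd_iommu(example))
```

On a live Linux system the same function could be given the contents of `/proc/cmdline` instead of the sample string.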
* External CI: hipBLASLt build now requires python packaging module (#3926)
https://github.com/ROCm/hipBLASLt/pull/1250/files#diff-fee2e6f068b33fca3a1dc49392de8848dbf05c3f4632b680abb1052523e5a30fR35
* External CI: Moved location of upstream pytorch build scripts (#3930)
https://github.com/pytorch/pytorch/pull/138103
* External CI: disable rocMLIR tests (#3931)
* External CI: disable rocMLIR tests
* roctracer AMDGPU_TARGETS flag
* External CI: create a GPU diagnostics template (#3932)
* External CI: Add CK into pytorch build environment (#3934)
* External CI: add support to disable individual component tests (#3938)
* External CI: AMDMIGraphX greater-equal pip dependencies (#3939)
* Build(deps): Bump rocm-docs-core from 1.8.2 to 1.8.3 in /docs/sphinx (#3933)
Bumps [rocm-docs-core](https://github.com/ROCm/rocm-docs-core) from 1.8.2 to 1.8.3.
- [Release notes](https://github.com/ROCm/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/ROCm/rocm-docs-core/compare/v1.8.2...v1.8.3)
---
updated-dependencies:
- dependency-name: rocm-docs-core
dependency-type: direct:production
update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* External CI: rocDecode add libva-amdgpu-dev dependency (#3940)
* External CI: enumerate GPUs in gpu-diagnostics (#3942)
* External CI: move gpu-diag directly before tests (#3943)
* External CI: fix HIP_PIPELINE_ID (#3944)
---------
Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: Kent Russell <kent.russell@amd.com>
Co-authored-by: Jeffrey Novotny <jnovotny@amd.com>
Co-authored-by: Sam Wu <22262939+samjwu@users.noreply.github.com>
Co-authored-by: Wang, Yanyao <yanyao.wang@amd.com>
Co-authored-by: Yanyao Wang <yanywang@amd.com>
Co-authored-by: Peter Park <peter.park@amd.com>
Co-authored-by: Young Hui - AMD <145490163+yhuiYH@users.noreply.github.com>
Co-authored-by: Joseph Macaranas <145489236+amd-jmacaran@users.noreply.github.com>
Co-authored-by: Daniel Su <danielsu@amd.com>
Co-authored-by: Sandra Polifroni <sandra.polifroni@amd.com>
Co-authored-by: randyh62 <42045079+randyh62@users.noreply.github.com>
Co-authored-by: Michael Benavidez <michael.benavidez@amd.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: MKKnorr <MKKnorr@web.de>
Co-authored-by: Kent Russell <kent.russell@amd.com>
Co-authored-by: Joseph Greathouse <jlgreathouse@users.noreply.github.com>
* adding preliminary compatibility matrix data for 6.2.1
* bump up some version numbers from 6.2.0 to 6.2.1
* adding kernel versions to compatibility matrix. I hate it
* add kernel version lookup table, in dropdown list
* add KFD and User space support. Also adjust some meta data keywords
* update 6.2.1 RC2 versions
* make spelling linter happy
* remove kernel versions from table, just reference LUT below
* Leave kernel Lookup table expanded
* update kernel version table
* remove kernels from historical matrix, update footnotes
* move historical matrix into compatibility folder
* update historical matrix paths
* version bumps for RC3
* RC4 has no other version bumps. Reorder RPP alphabetically
* change How-To card hue to purple
* add rocAL, hipCC, CLR. Rearrange order of some items to align with stack diagram. Update UCC versions
* update llvm-project to point to docs page instead of GitHub
* initial commit for placeholder 6.2 data
* fix TensorFlow versions, and LLVM/OpenMP version strings
* add third column with 6.1.0 as last column. Update some versions from Peter's review comments
* reduce RPP name
* remove trailing comma
* reduce length of 3rd party communications libs title
* change footnote for 6.2 to remove mention of MI300A
* remove TransferBench
* change from 6.1.0 to 6.0.0 data in last column
* fixing a few version numbers
* add rocprofiler-sdk version
* fix omnitrace version
* adding full matrix, 2 different views
* add copying csv in conf.py
* 6.2 content edits, and change subheadings to remove :, renamed a few as Leo suggested
* add Framework anchor within compat matrix, and fix linting error
* categorized tools
* update Cub/Thrust versions, abbreviate Management
* remove the dedicated historical page
* WIP commit, added anchors in compat matrix, along with anchor test code
* check 6.1.1 and 6.0.2 versions, add anchors thru table
* audit 6.2 RC4 versions against table, remove clang-ocl, and update hip-other version
* avoid linting
* MI300A system optimization guide internal draft
* Small changes to System BIOS paragraph
* Some minor edits
* Changes after external review feedback
* Add CPU Affinity debug setting
* Edit CPU Affinity debug setting
* Changes from external discussion
* Add glossary and other small fixes
* Additional changes from the review
* Update the IOMMU guidance
* Change description of CPU affinity setting
* Slight rewording
* Change Debian to Red Hat-based
* A few changes from the second internal review
* Add MI300X tuning guides
Add mi300x doc (pandoc conversion)
fix headings
add metadata
move images to shared/
move images to shared/
convert tuning-guides.md to rst using pandoc
add mi300x to tuning-guides.rst landing page
update h1s, toc, and landing page
fix spelling
fix fmt
format code blocks
add tensilelite imgs
fix formatting
fix formatting some more
fix formatting
more formatting
spelling
remove --enforce-eager note
satisfy spellcheck linter
more spelling
add fixes from hongxia
fix env var in D5
add fixes to PyTorch inductor section
fix
fix
Update docs/how-to/tuning-guides/mi300x.rst
Co-authored-by: Hongxia Yang <62075498+hongxiayang@users.noreply.github.com>
Update docs/how-to/tuning-guides/mi300x.rst
Co-authored-by: Hongxia Yang <62075498+hongxiayang@users.noreply.github.com>
Update docs/how-to/tuning-guides/mi300x.rst
Co-authored-by: Hongxia Yang <62075498+hongxiayang@users.noreply.github.com>
Update docs/how-to/tuning-guides/mi300x.rst
Co-authored-by: Hongxia Yang <62075498+hongxiayang@users.noreply.github.com>
Update docs/how-to/tuning-guides/mi300x.rst
Co-authored-by: Hongxia Yang <62075498+hongxiayang@users.noreply.github.com>
Update docs/how-to/tuning-guides/mi300x.rst
Co-authored-by: Hongxia Yang <62075498+hongxiayang@users.noreply.github.com>
Update docs/how-to/tuning-guides/mi300x.rst
Co-authored-by: Hongxia Yang <62075498+hongxiayang@users.noreply.github.com>
Update docs/how-to/tuning-guides/mi300x.rst
Co-authored-by: Hongxia Yang <62075498+hongxiayang@users.noreply.github.com>
Update docs/how-to/tuning-guides/mi300x.rst
Co-authored-by: Hongxia Yang <62075498+hongxiayang@users.noreply.github.com>
Update docs/how-to/tuning-guides/mi300x.rst
Co-authored-by: Hongxia Yang <62075498+hongxiayang@users.noreply.github.com>
Update docs/how-to/tuning-guides/mi300x.rst
Co-authored-by: Hongxia Yang <62075498+hongxiayang@users.noreply.github.com>
Update docs/how-to/tuning-guides/mi300x.rst
Co-authored-by: Hongxia Yang <62075498+hongxiayang@users.noreply.github.com>
Update 'torch_compile_debug' suggestion based on Hongxia's feedback
fix PyTorch inductor env vars
minor formatting fixes
Apply suggestions from code review
Co-authored-by: Hongxia Yang <62075498+hongxiayang@users.noreply.github.com>
Update vllm path
Co-authored-by: Hongxia Yang <62075498+hongxiayang@users.noreply.github.com>
disable numfig in Sphinx configuration
fix formatting and capitalization
add words to wordlist
update index
update wordlist
update optimizing-triton-kernel
convert cards to table
fix link in index.md
add @lpaoletti's feedback
Add system tuning guide
add images
add system section
add os settings and sys management
remove pcie=noats recommendation
reorg
add blurb to developer section
impr formatting
remove windows os from tuning guides pages in conf.py
add suggestions from review
fix typo and link
remove os windows from relevant pages in conf
mi300x
add suggestions from review
fix toc
fix index links
reorg
update vLLM vars
Co-authored-by: Hongxia Yang <62075498+hongxiayang@users.noreply.github.com>
update vLLM vars
Co-authored-by: Hongxia Yang <62075498+hongxiayang@users.noreply.github.com>
reorganize
add warnings
add text to system tuning
add filler text on index pages
reorg tuning pages
fix links
fix vars
* rm old pages
fix toc
* add suggestions from review
small change
add more suggestions
rewrite intro
* add 'workload tuning philosophy'
* refactor
* fix broken links
* black format conf.py
* simplify cmd and update doc structure
* add higher-level heading for consistency (mi300x.rst)
* add fixes from review
fix url
add fixes
fix formatting
fix fmt
fix hipBLASLt section
change words
fix tensilelite section
fix
fix
fix fmt
* style guide
* fix some formatting
* satisfy spellcheck linter
* update wordlist
* fix bad conflict resolution
* Add Fine Tuning LLMs how to guide
* Reorg and refactor Fine-tuning LLMs with ROCm
Update index and headings
Fix formatting and update toc
Split out content from index to overview.rst
Add metadata
Clean up overview
Add inference sections, fix rst errors, clean up single-gpu-fine-tuning
Combine fine-tuning and inference guides
Fix some links and formatting
Update toc and add formatting fixes
Add ck kernel fusion content
Update toc
Clean up model quantization and acceleration
Add CK images
Clean up profiling
Update triton kernel performance optimization
Update llm inference frameworks guide
Disable automatic number of figures and tables in Sphinx conf
Change tabs to spaces
Change heading to end with -ing
Add link fixes and heading updates
Add rocprof/Omniperf/Omnitrace section
Update profiling and debugging guide
Add formatting fixes
Satisfy spellcheck
Fix words
Delete unused file
Finish overview
Clean up first 4 sections
Multi-gpu fine-tuning guide: slight fixes
Update toc
Remove tabs
Formatting fixes
* Minor wording updates
* Add some clean-up
* Update profiling and debugging guide
* Fix Omnitrace link
* Update ck kernel fusion with latest
* Update CK formatting
* Fix perfetto link syntax
* Fix typos and add blurbs
* Add fixes to Triton optimization doc
* Tabify saving adapters / models section
* Fix linting errors - spellcheck
Fix spelling and grammar
Satisfy linter
Update wording in profiling guide
Add fixes to satisfy linter
More fixes for linting in Triton guide
More linting fixes
Spellcheck in CK guide
* Improve triton guide
Fix linting errors and optics
* Add occupancy / vgpr table
Change some wording
* Re-add tunableop
* Add missing indent in _toc.yml
* Remove ckProfiler references
* Add links to resources
* Add refs in CK optimization guide
* Rename files and fix internal links
* Organize tuning guides
Reorg triton
* Add compute unit diagram
* Remove AutoAWQ
* Add higher res image for Perfetto trace example
* Update link text
* Update fig nums
* Update some formatting
* Update "Inductor"
* Change "Inductor" to TorchInductor
* Add link to official TorchInductor docs
* Add Using ROCm for AI
Add PyTorch Docker installation images
Split doc into subtopics
Add metadata
Clean up index
Clean up hugging face guide
Clean up installation guide
Fix rST formatting
Clean up install and train-a-model
Clean up MAD
Delete unused file
Add ref anchors and clean up MAD doc
Add formatting fixes
Update toc and section index
Format some code blocks
Remove install guide and update toc
Chop installation guide
Clean up deployment and hugging face sections
Change headings to end in -ing
Fix spelling in Training a model
Delete MAD and split out install content
Fix formatting
Change words to satisfy spellcheck linter
* Add review suggestions and add helpful links
Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com>
Add helpful links and add review suggestions
Remove fine-tuning link and links to D5 and MAGMA
Update docs/how-to/rocm-for-ai/deploy-your-model.rst
Co-authored-by: Young Hui - AMD <145490163+yhuiYH@users.noreply.github.com>
Update DeepSpeed link
Add subheading to ML framework installation and closing blurb to hugging face models guide
* Reorder topics
* Rename 'Tuning guides' to 'Hardware optimization'
* Move deep learning to Install section
* Change 'Hardware' to 'System' to align with index.md
* Satisfy spellcheck linter
* adding new framework install graphic with JAX
* Fix link to ROCm libraries list
* crop framework_install graphic
* Reset .wordlist.txt update
* Prettify deep learning framework installation page
* Change spacing in list of frameworks
---------
Co-authored-by: Young Hui <young.hui@amd.com>
* add rocm software stack diagram to What is ROCm landing page
* restructure ROCm project list table
* clean up unnecessary hyphenation
* update What is ROCm stack diagram filename
* reorder rocm project list to reflect diagram
* update "What is ROCm?" image metadata
* change 'project list' to 'components'
* change 'project' to 'component'