* Known issue updated
* Reworded for clarity
* Minor update
* Minor change
* Known issue updated
* Reference link added
* Apply suggestions from code review
Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com>
* PLDM updated
* SME feedback added
* Minor change
* ROCm Optiq added
---------
Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com>
* Use intersphinx links for deep learning
* Update deep-learning-rocm.rst
remove Taichi
* Update deep-learning-rocm.rst
Change Install link to "link"
* Apply suggestion from @randyh62
OK
* New GPUs listed
* GPU highlights updated
* OS table removed
* JAX 0.8.0 support added
* Apply suggestions from code review
Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com>
* Azure Linux 3.0 removed
* Review feedback added
* Release and changelog synced
* Minor corrections and date change
---------
Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com>
* OS table removed from compatibility table
* Feedback added
* Azure Linux 3.0 and compatibility version update
* Version fix
* Review feedback added
* Minor change
* Adding ROCm-Optiq note to What is ROCm page
Adding a note for a link to the Optiq docs
* Apply suggestion from @mattwill-amd
* Apply suggestion from @mattwill-amd
* Apply suggestion from @mattwill-amd
* Update what-is-rocm.rst
* Update what-is-rocm.rst
* Apply suggestion from @mattwill-amd
* Apply suggestion from @mattwill-amd
* Apply suggestion from @mattwill-amd
* Apply suggestion from @mattwill-amd
- Update rccl component pipeline to include new additions made to projects already in super repos.
- Also update rccl to trigger rocprofiler-sdk job upon completion.
- rocprofiler-sdk pipeline updated to include os parameter to enable future almalinux 8 job.
* add previous versions
* Fix heading levels in pages using embedded templates (#5468)
* update primus-megatron doc
update megatron-lm doc
update templates
fix tab
update primus-megatron model configs
Update primus-pytorch model configs
fix css class
add posttrain to pytorch-training template
update data sheets
update
update
update
update docker tags
* Add known issue and update Primus/Turbo versions
* add primus ver to histories
* update primus ver to 0.1.1
* fix leftovers from merge conflict
* archive previous doc version
* update model/docker data and doc templates
* Update "Reproducing the Docker image"
* fix: truncated commit hash doesn't work for some reason
* bump rocm-docs-core to 1.26.0
* fix numbering
fix
* update docker tag
* update .wordlist.txt
* Update CHANGELOG.md
Removed duplicate num_threads entry, and added a new Resolved issue from Julia.
* Update RELEASE.md
Removed duplicate num_threads entry and added a resolved issue from Julia.
* Add origami yaml pipeline.
* Unindent lines.
* Add cmake dependency step to origami yml.
* Add pybind dep
* Fix pipeline failures.
* Quick fix
* Fix pybind11 dep for almalinux
* Fix pybind11 dep for almalinux again
* Test
* [Ex CI] don't create symlink if more than one sparse checkout dir
* hipBLASLt multi sparse
* Replace pybind with nanobind.
* Quick fix
* Testing nanobind install in pipelines
* Run origami binding tests
* Change build path for tests
* Change build path for tests again
* Add missing dep for CI
* Add archs to buildJobs
* Fix CI error.
* Test
* Test job target
* Adding job target to hipblaslt dependent builds
* Check devices on machine
* Add gpu to pipeline
* Add more gpu targets
* test
* Add test job to origami
* Update test jobs
* Finding test dir
* Fix sparse checkout
* Find build dir
* Try to find build dir
* Clean up
* Test
* Change test dir
* Build origami in test job
* Try removing job.target from params
* Package bindings in build artifacts
* Download build as artifact.
* Comment out block
* Fix checkout in test job
* Test1
* Echo to list dir
* Sparse checkout origami/python
* Download python bindings as artifact
* Try ctest instead of running test files directly
* Only download artifacts for ubuntu
* Add missing cd
* Run individual tests not ctest.
* Fix hipblaslt build failures
* Resolve more ci failures in hipblaslt
* Add old changes back in
* Fix hipblaslt ci errors
* Clean up
* Add nanobind to array
* Add nanobind to array correctly
* Remove nanobind install script
* Quick fix
* Add pip module installs to test job
---------
Co-authored-by: Daniel Su <danielsu@amd.com>
- Trigger downstream build of rocpydecode within rocdecode pipelines.
- Copying similar variables as other pipelines even though these projects are not in the super-repos.
* Indentation and formatting updated
* Known issues added
* Known issues updated
* Minor change
* Known issues updated
* KMD UMD update
* Updated known issues
* Additional text removed from known issues
* Oracle linux 10 removed
* Indentation and formatting updated
* Resolved issue for kokkos option added
* Known issue for ROCr added
* 2nd known issue added
* Known issues updated
* adding 2 known issues
* Apply suggestions from code review
Co-authored-by: Pratik Basyal <pratik.basyal@amd.com>
* Update RELEASE.md
* Known issues added
* Approved known issue added
* Component removed based on Leo's feedback
* Issue link added
---------
Co-authored-by: Matt Williams <Matt.Williams+amdeng@amd.com>
Co-authored-by: Matt Williams <matt.williams@amd.com>
- Subset of the hipblaslt component yaml, deleting extra gpu targets and the testing component.
- Sparse checkout details removed.
- Basic build flags from top-level invocation added.
* init
* fix source dir
* miopen specify test build dir
* fix test build dir
* revert change
* fix test build again
* move to ultra temporarily
* miopen-get-ck, working dir
* exclude flaky test
* move back to high
* Add MIVisionX and AMDMIGraphX downstream jobs to MIOpen
* comment sparsecheckoutdir
* quote component names
* fix artifact name
* miopen ck script exit on fail
* add downstream checkout repos
* mivisionx, add aomp
* Update ROCR-Runtime.yml
Migrate from rocmsmi to amdsmi
* Update ROCR-Runtime.yml
Removed libhwloc.so.5 install
* Update ROCR-Runtime.yml
Link to hwloc.so.5
* Update ROCR-Runtime.yml
Added link in the rocrtst step
* Update ROCR-Runtime.yml
* Sphinx warning for DGL fixed
* Update dgl-compatibility.rst
removed benchmark line and updated link
---------
Co-authored-by: Pratik Basyal <prbasyal@amd.com>
* HIP 7.0 upcoming changes blog link updated
* Documentation highlight for deep learning framework added
* Note loading fixed
* Note removed
* Link fixed
* verl compatibility
* add Supported features
Signed-off-by: Vicky Tsang <vtsang@amd.com>
* updated and edited verl compat doc
* added links to verl
* add future release for sglang and megatron inference eng.
Signed-off-by: Vicky Tsang <vtsang@amd.com>
* fix lint
Signed-off-by: Vicky Tsang <vtsang@amd.com>
* fixed a typo and a table
* Spolifroni amd/add to compat matrix (#430)
* added verl to compatibility matrix
* small change
* fixed an error in csv
* edited the verl compat based on leo's recommendations
* updated compat matrix (#435)
* Added a hardcoded link to the verl install
This is a link to an RTD build and MUST be removed before publishing.
* Update verl-compatibility.rst
* Added a hardcoded link to the verl install
This link is to an RTD build and it WILL break at publishing. It MUST be changed before publishing.
* Added version support note (#448)
* small fixes
* Update verl-compatibility.rst
* Update verl-compatibility.rst
---------
Signed-off-by: Vicky Tsang <vtsang@amd.com>
Co-authored-by: spolifroni-amd <sandra.polifroni@amd.com>
Co-authored-by: anisha-amd <anisha.sankar@amd.com>
* add wan2.1 to pyt inference models
* update group name
* fix container tag
* fix group name
* change documented data type to bfloat16
* fix col width
Added AlmaLinux 8 Pipeline Support
- aomp
- HIPIFY
- rocDecode
- ROCgdb
- rocJPEG
- rocprofiler
- aqlprofile dependency template
- build autotools template
- download latest cmake template
Pipeline Changes
- More gfx build targets.
- Copying llvm-lit to the llvm-project published artifacts.
- HIPIFY now uses our built version of llvm-project for its pipeline.
- Disable testing in HIPIFY pipeline due to low value provided. Revisit in the future.
- aomp's ROCm dependency list reduced.
- aomp's openmp build had issues with ninja on AlmaLinux 8.
- Add hipSPARSELt dependency.
- Add hipBLASLt test dependency for rocroller shared library.
- Update pip dependency versions.
- Install a second copy of typing_extensions in a specific folder so that one of the builds we do not control works.
- Wheel renaming no longer works, so we need to find another mechanism if we start doing builds for different branches and gfx architectures.
- Fixed rocprim pipeline to not rebuild during install step.
- Updates to hipblas-common, hipcub, hiprand, and rocthrust pipelines to build on AlmaLinux8 and more gfx architectures.
- Include rocm-cmake dependency when CMake setup mentions it.
* [External CI] Ubuntu 24.04 job for llvm-project
* temporarily switch to using 'high' build pool while 'ultra' is down
* switch almalinux8 to build on manylinux container
* add pool for alma8 container
* switch alma8 package manager to apt
* Update llvm-project.yml
* switch back to dnf after resolved container init
---------
Co-authored-by: Joseph Macaranas <Joseph.Macaranas@amd.com>
- Increase compilation coverage for rocrand to more gfx architectures.
- Follow similar path as recent rocprim pipeline changes.
- Add and fix conditionals in cmake template to consolidate the cmake build and install steps to deal with the re-build being done. This is not required in the ubuntu 22.04 job.
- The build time is a little bit too long on the free agents and we will end up capped on free runners soon, so changing the build pool.
GCC Toolset 14 Environment
- source /opt/rh/gcc-toolset-14/enable only lasts for the shell session, so run at the beginning of relevant build and test tasks when the OS is AlmaLinux 8.
- CMake tasks set env to behave as if source /opt/rh/gcc-toolset-14/enable command was run.
- Observed that the built ROCm libraries can either be installed on lib or lib64 directories in this OS profile, so ldconfig step is adjusted to look at additional directories. This won't impact usage in ubuntu22 if the lib64 directories don't exist in the custom ROCm build.
- For the llvm linking step we cannot assume the ROCm lib directory exists, as only ROCm lib64 might be present on the build environment.
- libatomic package was added to the gcc toolset setup.
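The GCC Toolset and lib/lib64 handling described above could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the pipeline's actual implementation: the helper names are hypothetical, and the `root/usr/bin` and `root/usr/lib64` subdirectories are an assumption about what the `enable` script adds to the environment.

```python
import os
from pathlib import Path

# Assumed toolset layout; the notes above only name /opt/rh/gcc-toolset-14/enable.
TOOLSET_ROOT = Path("/opt/rh/gcc-toolset-14/root/usr")


def toolset_env(base_env: dict) -> dict:
    """Return a copy of base_env that approximates `source .../enable`.

    Only PATH and LD_LIBRARY_PATH are handled here (an assumption); the
    toolset directories are prepended so they win over system defaults.
    """
    env = dict(base_env)
    for key, subdir in (("PATH", "bin"), ("LD_LIBRARY_PATH", "lib64")):
        parts = [str(TOOLSET_ROOT / subdir), env.get(key, "")]
        env[key] = os.pathsep.join(p for p in parts if p)
    return env


def rocm_lib_dirs(rocm_root: str) -> list:
    """List whichever of lib and lib64 exist under a ROCm install.

    A missing directory is skipped, so an Ubuntu build without lib64 is
    unaffected.
    """
    root = Path(rocm_root)
    return [str(root / name) for name in ("lib", "lib64") if (root / name).is_dir()]
```

Passing the returned dictionary as the environment of each build or test task avoids relying on a `source` call that only lasts for one shell session.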
yaml-based Changes
- base set of dnf packages now defined in an array for dependencies that already come pre-installed on the ubuntu22 VMs.
- Changed format of the job matrix for readability.
New Features
- AlmaLinux 8 pipelines for roctracer and ROCdbgapi.
- roctracer pipeline expanded to support compilation for gfx1030 and gfx1100.
- AlmaLinux 8 llvm-project pipeline now builds flang and flang-rt, so re-enabled for ubuntu 22.04 pipeline as well.
TODO
- Revisit why ninja-build is not used for comgr, device-libs, and hipcc.
- Removed building flang in this pipeline. Will build flang in the aomp pipeline to unblock progress on runtimes and first set of math libraries. Flang debug can also be moved to a cheaper VM.
- ninja-build from dnf is too old for llvm-project. Using a release from GitHub instead.
- Added more dnf package mappings.
- scl enable command is not needed.
- Modified job matrices and templates to support a second OS.
- Included creation of Virtual Machine Scale Sets running AlmaLinux OS 8.10 with GCC toolset 14 to match manylinux 2_28.
- Dependency download algorithm modified so that only a single array of package manager (apt) packages need to be provided as input and then the other package managers have a mapping of equivalent packages.
- Cleaned up python3-pip in the arrays as those should already be on the VMs.
- This will be an iterative process of getting components to build on this OS profile, and starting with the components that don't have interdependencies.
- Highest priority is to get the rocm-libraries working.
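The dependency mapping described above (a single apt package array as input, with equivalents looked up for other package managers) might look something like this sketch. The function name, the table, and its entries are illustrative assumptions, not the pipeline's real mapping.

```python
# Illustrative apt -> dnf name mapping; entries are examples only.
APT_TO_DNF = {
    "libnuma-dev": "numactl-devel",
    "libdrm-dev": "libdrm-devel",
    # None means "do not install through dnf"; the notes above say
    # ninja-build comes from a GitHub release on this OS instead.
    "ninja-build": None,
}


def resolve_packages(apt_packages: list, manager: str) -> list:
    """Translate an apt package list into names for the given manager."""
    if manager == "apt":
        return list(apt_packages)
    if manager != "dnf":
        raise ValueError(f"unsupported package manager: {manager}")
    resolved = []
    for name in apt_packages:
        # Names without a mapping are assumed to be identical in dnf.
        mapped = APT_TO_DNF.get(name, name)
        if mapped is not None:
            resolved.append(mapped)
    return resolved
```

With this shape, component pipelines keep listing only apt names, and supporting a new OS means extending one table.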
* Remove sparseCheckout param
* Add support for downloading same-pipeline-builds for monorepo chain builds
* Make local-artifact step names more informative
* Use componentName param for artifact filenames
* Enable chain downstream triggers for PRIMs & RANDs
* Set preTargetFilter for tests' local-artifact-download call
* Set checkout: none for test jobs
* Exclude failing rocThrust scan.hip test
* Matrixize downstream jobs
* fix vllm link in release.md
* add RDNA4 note in compat matrix
* update hipcc github url to specific path in llvm-project repo
* remove nonexistent HIP upcoming changes reference
* remove nonexistent resolved issues internal link
* fix hip upcoming changes url
* duplicate amd smi known issue
* Remove JAIS 13B and 30B
* update Docker details - vLLM 0.8.3
* add previous version
* Update docs/how-to/rocm-for-ai/inference/vllm-benchmark.rst
* fix link to previous version
* Known issue for installation failure added
* Github issue No. added
* Typo fixed
* Feedback from Anush updated
* Minor change
* Feedback from Fai added
* Public Issue No. updated
* Minor change
* add files
* Allow command line args for download script
* Move script into separate folder
* Add newline to end of script
---------
Co-authored-by: David Dixon <david.dixon@amd.com>
- Add knobs to toggle aggregate build options.
- Aggregate build pipeline will pull ROCm dependencies from earlier in the same pipeline.
- Changing build pool of some components for more compute power.
- Deleting deprecated component.
- Add Ninja to dependency compilation in MIOpen.
- Add retries to wget for MIOpen CK build case.
---------
Co-authored-by: Daniel Su <danielsu@amd.com>
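The idea behind adding retries to the MIOpen CK download can be sketched generically as below. This is a hypothetical helper, not the pipeline's code; the pipeline itself uses wget, whose exact invocation is not shown in this log.

```python
import time


def with_retries(action, attempts: int = 3, delay_seconds: float = 0.0):
    """Call action() until it succeeds or attempts are exhausted.

    Only OSError is retried (network errors such as urllib's URLError
    derive from it); the last error is re-raised if every attempt fails.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except OSError as error:
            last_error = error
            if attempt < attempts:
                time.sleep(delay_seconds)
    raise last_error
```

A transient failure on the first tries then no longer fails the whole build job.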
- Pipeline now uses separate CMake calls to build extras, openmp, and offload.
- Legacy and other components no longer included. Revisit building them without including them in the build artifacts.
Tiny fix that removes the "export" directive.
` export HIP_FORCE_DEV_KERNARG=1 hipblaslt-bench ...`
leads to
bash: export: `hipblaslt-bench': not a valid identifier
whereas just starting with HIP_FORCE_DEV_KERNARG=1 passes this env var to the hipblaslt-bench process, which I think is the intention here.
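The same per-process behavior can be illustrated from Python. This is only a sketch: a small Python child process stands in for `hipblaslt-bench`, and `run_with_env` is a hypothetical helper name.

```python
import os
import subprocess
import sys


def run_with_env(command, extra_env):
    """Run command with extra variables visible only to that child process."""
    env = {**os.environ, **extra_env}
    return subprocess.run(command, env=env, capture_output=True, text=True, check=True)


# Stand-in for hipblaslt-bench: a child that reports the variable it sees.
child = [sys.executable, "-c", "import os; print(os.environ.get('HIP_FORCE_DEV_KERNARG'))"]
result = run_with_env(child, {"HIP_FORCE_DEV_KERNARG": "1"})
print(result.stdout.strip())
```

The variable reaches the child process without being exported into the parent's own environment, which matches the `VAR=value command` form in the shell.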
* Update RELEASE.md
added two new Resolved Issues and made two other changes
* Update RELEASE.md
Co-authored-by: Pratik Basyal <pratik.basyal@amd.com>
---------
Co-authored-by: Pratik Basyal <pratik.basyal@amd.com>
* ROCProfiler deprecation notice updated
* RHEL 9.6 support removed and 9.5 EOS rejected
* Feedback to KV cache highlight added
* Wrong entry of ROCprofiler-SDK removed
* --kokkos-trace issue drafted
* Known issues for compute partition and JAX limited support added
* Known issues for ROCm Systems profiler and MIOpen added
* Feedback from Leo added
* AMD Radeon PRO W7800 48GB support added to RN
* rocSPARSE fixed issue added
* AMD RDNA 2 removed from TOC
* Revert "AMD RDNA 2 removed from TOC"
This reverts commit a8511fb7826891f27d42f1d749fd5356dbaacfbe.
* Unvalidated known issues removed
* Leo's feedback incorporated
* Changelog.md sync with release.md
* fix vllm engine args link
* remove RDNA subtree under system optimization in TOC
* fix RDNA 2 architecture PDF link
* fix CLR LICENSE.txt link
* fix rocPyDecode license link
* ROCProfiler deprecation notice updated
* RHEL 9.6 support removed and 9.5 EOS rejected
* KV cache highlight updated
* Feedback from Peter Incorporated
Co-authored-by: Peter Park <peter.park@amd.com>
---------
Co-authored-by: Peter Park <peter.park@amd.com>
* ROCProfiler deprecation notice updated
* RHEL 9.6 support removed and 9.5 EOS rejected
* OS support updated
* Documentation highlight updated
* Hardware atomics update
* rocPyDecode version updated
* Quick update in Changes to changes
* Command translation fixed
* gfx950 removed from CK changelog
* glibc version updated
* gfx950 removed
* Changelog list updated
* System optimization migration changes in ROCm
* Linting issue fixed
* Linking corrected
* Minor change
* Link updated to Instinct.docs.amd.com
* ROCm docs grid updated by removing IOMMU.rst, pcie-atomics, and oversubscription pages
* Files removed and reference fixed
* Reference text updated
* ROCProfiler deprecation notice updated
* RHEL 9.6 support removed and 9.5 EOS rejected
* Updated KMD/UMD content
* Minor correction
* Quick feedback from Ram incorporated
* KMD/UMD separation highlight updated
* Feedback from leo, Ram, and David updated
* Minor change
* Minor change
* Suggestion from Leo added
* Feedback from Ram incorporated
* Minor fix
* Minor change
* Quick change from Ram
* ROCProfiler deprecation notice updated
* Link error
* Compatibility updated
* New changelog and OS support updated
* Upcoming changes removed from rocWMMA, added to hipTensor
* Glibc added to wordlist
* Instinct docs content added
* RHEL 9.5 to OS
* Compatibility OS update
* Leo's feedback incorporated and TOC updated for linux requirement
* ROCProfiler deprecation notice updated
* Updated forward backward compatibility content
* Minor fixes on KMD user space support note
* SLES 15.7 removed
* SLES version formatting update
* Known issue for generic target added
* Known issue update
* Oracle version major release only
* Only major version for oracle linux
* AMDGPU driver known issue updated
* Leo's feedback incorporated
* Leo's feedback incorporated
* Historical change added
* Quick fix
* Fixed issues added
* Jeff's feedback on rocWMMA and hipTensor changelog added
* 6.4.0 changelog added
* DLPack and VP9 added
* update RELEASE based on internal discussion
* remove link to cl
---------
Co-authored-by: Peter Park <peter.park@amd.com>
- Add flang to built projects.
- Upgrade build VM to account for additional project.
- Temporarily ignore a test case for debug info, which is not a high priority in External CI.
* Corrected typo
Corrected typo in line 119 prerequisities -> prerequisites
* Corrected typo in README.md
Corrected typo in line 119 prerequisities -> prerequisites
* Update Megatron-LM and PyTorch Training Docker docs
Also restructure TOC
* Apply suggestions from code review
Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com>
update "start training" text
Apply suggestions from code review
Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com>
update conf.py
fix spacing
fix branding issue
add disable numa
reorg
remove extra text
- Removing the creation of expected folders and symbolic links as workaround to get the test components compiling.
- Set the only OpenCL build flag affecting the build.
- Also, fixes to rocprofiler-sdk when incorporating recent features.
- URL encoding algorithm converts trailing '=' in the base64 string to an integer representing the number of those trailing '=' characters.
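One way to read the padding rule described above is sketched below. This is an assumption-laden illustration: the function names are hypothetical, and the use of the URL-safe base64 alphabet is a guess, since the log only describes how trailing `=` characters are handled.

```python
import base64


def encode_with_pad_count(data: bytes) -> str:
    """Base64-encode data, replacing trailing '=' with their count (0-2)."""
    encoded = base64.urlsafe_b64encode(data).decode("ascii")
    stripped = encoded.rstrip("=")
    return f"{stripped}{len(encoded) - len(stripped)}"


def decode_with_pad_count(token: str) -> bytes:
    """Reverse encode_with_pad_count by restoring the '=' padding."""
    body, pad_count = token[:-1], int(token[-1])
    return base64.urlsafe_b64decode(body + "=" * pad_count)


print(encode_with_pad_count(b"a"))    # "YQ==" becomes "YQ2"
print(encode_with_pad_count(b"abc"))  # "YWJj" has no padding, so "YWJj0"
```

Because the final character is always a digit, the decoder can restore the padding unambiguously.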
* Initial draft for How-to POC
* Zone.identifier file removed
* Broken links in index.md fixed
* Zone.identifier file removed
* Review feedback incorporated
* Title updated
* New format for ROCm for AI TOC created
* Folder structure changed
* ROCm for AI index updated
* Link to Llama recipe updated
* Review feedback added
* Feedback from Cindy added
* Intro text from Cindy added
* New flow suggested by Hongxia incorporated
* Overview content from Cindy added, TOC updated, Meta data updated
* Reference to HPC removed
* Listing alignment updated
* Overview page updated
* Folder structure and link change resulted from TOC change updated
* Content sequence updated
* Meta data updated
* Review feedback incorporated
* Index file renamed
* Conf file updated for OS compatibility info
* update metadata (#4)
update metadata
fix spelling
* Wordlist updated
---------
Co-authored-by: Peter Park <peter.park@amd.com>
- pip update click module for test failures.
- Test results are at 99.8% with these fixes.
- Missing cmake dependency from last PR for ROCR-Runtime
- Missing pkg-config dependency for amdsmi
- Modify PATH to find pip's cmake for rocprofiler-sdk
- Dynamically write a Dockerfile based on the environment for the failing job.
- Account for additional dependencies that need to be installed and setup.
- Build and push a custom container based on that dynamic Dockerfile to capture that failing environment.
- Documenting additional setup to install Docker on VMSS during provisioning.
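Generating a Dockerfile from the failing job's environment, as described above, could be sketched like this. The function name and its inputs (base image, package list, environment variables) are hypothetical; the real pipeline's inputs and Dockerfile contents are not shown in this log.

```python
def render_dockerfile(base_image: str, apt_packages: list, env_vars: dict) -> str:
    """Build Dockerfile text that reproduces a job's environment."""
    lines = [f"FROM {base_image}"]
    if apt_packages:
        packages = " ".join(sorted(apt_packages))
        lines.append(
            "RUN apt-get update && apt-get install -y --no-install-recommends "
            f"{packages} && rm -rf /var/lib/apt/lists/*"
        )
    # Sorted for a stable, diff-friendly output.
    for key, value in sorted(env_vars.items()):
        lines.append(f'ENV {key}="{value}"')
    return "\n".join(lines) + "\n"


dockerfile_text = render_dockerfile(
    "ubuntu:22.04", ["git", "cmake"], {"ROCM_PATH": "/opt/rocm"}
)
print(dockerfile_text)
```

The rendered text can then be written to disk, built, and pushed so the failing environment is captured for later debugging.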
* Change AMDMIGraphX to use local-artifact-download for half 5.6
* Refactor dependencies-rocm & artifact-download, consolidate component variable lists
* Add mainline option to nightly
* Change all components to new dependencies-rocm usage
* rm aqlprofile checkoutRef
* simplify dependencies-rocm, add gpuTarget back to rocMLIR
* rm tag-builds from aqlprofile
* Make review changes
These were encountered while debugging
https://github.com/ROCm/ROCm/issues/4190
- There is no manifest (-m) for ROCm 6.3.1 in the tools/rocm-build folder
-- Changed the rocm version to 6.3.0 to avoid immediate build failure
- The manifest is not specified in the first instance of "Downloading the ROCm source code", but it is in "Build ROCm from source".
-- Without the correct manifest, subsequent build instructions will fail as the ROCm/ROCm directory doesn't get pulled. It's unclear why these two otherwise identical commands are duplicated and have this discrepancy
* remove 'Using MPI' and 'gpu-cluster-networking' sections due to migration to dcgpu
* remove gpu-cluster-networking from index page
---------
Co-authored-by: Alex Xu <alex.xu@amd.com>
- Recent vision compilation has been failing, and debugging hasn't been fruitful in finding cause.
- Should unblock nightly job to at least build and test pytorch while debug effort continues after the holidays.
- pytorch build and test is unblocked by temporarily patching the composable_kernel submodule on upstream pytorch to latest develop, until that submodule is updated to have explicit cast for hneg.
* Updated for 6.3.1
* Compatible version updated from RC1 build
Co-authored-by: Peter Park <peter.park@amd.com>
* Compatibility table and rst updated
* Compatible version updated from RC1 build
Co-authored-by: Peter Park <peter.park@amd.com>
* Peter's review feedback incorporated
Co-authored-by: Peter Park <peter.park@amd.com>
---------
Co-authored-by: prbasyal <prbasyal@amd.com>
Co-authored-by: Peter Park <peter.park@amd.com>
* 6.3.1 Release notes (#224)
* New Release highlight on offline installer added. OS change, Known Issues, Resolved issues, and upcoming changes copied from 6.3.0 and updated version
---------
Co-authored-by: prbasyal <prbasyal@amd.com>
* Updates to release notes (#229)
* Updates to release notes
* & -> and
* Updated the component changes, table, release highlights, and fixed i… (#232)
* Updated the component changes, table, release highlights, and fixed issues
* Version number and heading title fixed
* Update RELEASE.md
Co-authored-by: Jeffrey Novotny <jnovotny@amd.com>
* Update RELEASE.md
Co-authored-by: Jeffrey Novotny <jnovotny@amd.com>
* Version transition updated
---------
Co-authored-by: prbasyal <prbasyal@amd.com>
Co-authored-by: Peter Park <peter.park@amd.com>
Co-authored-by: Jeffrey Novotny <jnovotny@amd.com>
* Rn 631 custombranch (#234)
* Updated the component changes, table, release highlights, and fixed issues
* Version number and heading title fixed
* Update RELEASE.md
Co-authored-by: Jeffrey Novotny <jnovotny@amd.com>
* Update RELEASE.md
Co-authored-by: Jeffrey Novotny <jnovotny@amd.com>
* Version transition updated
* OS and Hardware compatibility updated
---------
Co-authored-by: prbasyal <prbasyal@amd.com>
Co-authored-by: Peter Park <peter.park@amd.com>
Co-authored-by: Jeffrey Novotny <jnovotny@amd.com>
* [6.3.1 release notes] Add rocprof-sys changes to RN (#235)
* remove extra sections
* add rocprof-sys changelog
* add omni fixed issues and amd smi cl
* Update RELEASE.md
Version transition added in the table
* add documentation update note
* Broken link fixed
---------
Co-authored-by: Pratik Basyal <pratik.basyal@amd.com>
* added edited version of the migraphx changelog and removed CK entry (#238)
Co-authored-by: Pratik Basyal <pratik.basyal@amd.com>
* Updating ROCm-internal with 6.3.1 release notes changes (#241)
* Updated date and version
* Typos and wording fixed
* Minor fix
* Documentation update added
* MIGraphX change dropped
* Debian support removed
* New release highlight added
* HIPRand version changed
* Cross-reference to Per queue added
* Leo's review feedback incorporated
* HIP optimized section updated
* Instinct and Peter's feedback added
---------
Co-authored-by: prbasyal <prbasyal@amd.com>
* Fix changelog and new documentation note (#246)
* fix amd smi and add training a model using megatron note
* update workload tuning doc note
* fmt
---------
Co-authored-by: prbasyal <prbasyal@amd.com>
Co-authored-by: spolifroni-amd <Sandra.Polifroni@amd.com>
Co-authored-by: Peter Park <peter.park@amd.com>
Co-authored-by: Jeffrey Novotny <jnovotny@amd.com>
- Exclude lone, consistently failing MIOpen test.
- test_rnn_seq_api is the only ctest failure, so let's filter it out for now to easily identify new failures.
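Filtering out a known failure can be sketched as building a `ctest` command with its `-E` (exclude regex) option. The helper below is hypothetical and assumes test names contain no regex metacharacters; the pipeline's actual invocation is not shown in this log.

```python
def ctest_command(excluded_tests: list) -> list:
    """Build a ctest command line that skips the named tests."""
    command = ["ctest", "--output-on-failure"]
    if excluded_tests:
        # Anchor each name so that only exact matches are excluded.
        pattern = "|".join(f"^{name}$" for name in excluded_tests)
        command += ["-E", pattern]
    return command


print(ctest_command(["test_rnn_seq_api"]))
```

With the known failure excluded, any remaining ctest failure in the job is a new one.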
minor fixes to formatting
fix spelling errors
more spelling
fixes
quantization update
fix format
simplify wording in tunableops and format fix
Apply suggestions from code review
review feedback by Peter
Co-authored-by: Peter Park <peter.park@amd.com>
Apply suggestions from code review
addressing feedback
Co-authored-by: Peter Park <peter.park@amd.com>
Apply suggestions from code review
feedback again
Co-authored-by: Peter Park <peter.park@amd.com>
add hipblaslt yaml file figure
feedback and minor formatting
formatting
update wordlist.txt
remove outdated sentence regarding fsdp and rccl
(cherry picked from commit 87fa9fd83a2e623f6cab4e69d65f49e3db0a45f6)
update wordlist
Co-authored-by: hongxyan <hongxyan@amd.com>
- aomp: Account for path changes due to LLVM_INSTALL_LOC from aomp PR #1012
- aomp: Add llvm-legacy build script step for aomp PR #1062
- rocWMMA: Fix rpath issue when using ninja.
* removed the building doc; edited toolchain to remove myst; made the fact that rst is the preferred format evident
* edited the readme so that it points to the contributing to the rocm docs page
* Update docs/contribute/contributing.md
Co-authored-by: Peter Park <peter.park@amd.com>
* Update docs/contribute/contributing.md
Co-authored-by: Peter Park <peter.park@amd.com>
* added two images showing where the checks and doc build is
---------
Co-authored-by: Peter Park <peter.park@amd.com>
* Update version list with 6.2.0 (#3505) (#3506)
* Fix link to meta-llama finetuning recipes
* Spellcheck fixes in release notes templates (#3526) (#3548)
* fix spelling in 5.4.x templates
* add to wordlist
* update templates
update wordlist
* remove extra_components
rm extra_components
* fix spelling
Co-authored-by: Peter Park <peter.park@amd.com>
* Fix link to rocr debug agent (#3533)
Co-authored-by: Sam Wu <22262939+samjwu@users.noreply.github.com>
* Fix intersphinx links (#3546)
* update fw install links
* fix more intersphinx links
* fix more links
* add rocPyDecode repo to ROCm6.2 manifest file (#3541) (#3553)
Co-authored-by: Yanyao Wang <yanywang@amd.com>
Co-authored-by: Wang, Yanyao <yanyao.wang@amd.com>
* Fix typo for TFLOPs metric in MI250 architecture page
* Add rocm-examples to default.xml (#3583)
* Add rocm 6.2.0 manifest file for rocm-build scripts (#3538)
* Add rocm 6.2.0 manifest file for rocm-build scripts
Signed-off-by: David Galiffi <David.Galiffi@amd.com>
* Add "rocm-examples"
---------
Signed-off-by: David Galiffi <David.Galiffi@amd.com>
* Add a section on increasing memory allocation to the MI300A system op… (#3587)
* Add a section on increasing memory allocation to the MI300A system optimization guide
* Addition to wordlist
* Change GB to GiB for consistency
* Standardize GiB/KiB spacing
* Minor wording changes
* Update build scripts for ROCm6.2 release
* fix README.md for Ubuntu24 docker
* Correct ttm to amdttm (#3648)
* Expand the section on changing thread affinity (#3653)
* Expand the section on changing thread affinity
* Clarify the methods for configuring allocatable memory settings
* Small correction
* Update model-quantization.rst to import `BitsAndBytesConfig` from transformers library (#3638)
* remove unneeded file (#3663)
* Fix intersphinx links (#3668)
* fix links in install.rst
* fix links in sys opt guides
* Add introduction and links to the new guide to the vLLM optimized Doc… (#3637)
* Add introduction and links to the new guide to the vLLM optimized Docker image on AMD Infinity Hub
* Update target link for the Docker vLLM guide
* Change target URL
* Change link target URL again
* Fixed broken link to RISC-V documentation
* Add FBGEMM/FBGEMM_GPU to the Model acceleration libraries page (#3659)
* Add FBGEMM/FBGEMM_GPU to the Model acceleration libraries page
* Add words to wordlist and fix a typo
* Add new sections for Docker and testing
* Incorporate comments from the external review
* Some minor edits and clarifications
* Incorporate further review comments and fix test section
* Add comment to test section
* Change git clone command for FBGEMM repo
* Change Docker command
* Changes from internal review
* Fix linting issue
* Fixed broken links for tensile, rocprofiler, roctracer, hipify, rocm-cmake
* add missing make command to bitsandbytes install commands (#3722)
* Update link to rocRAND data type support (#3736)
* Fix Radeon link and point at R6.1.3 as absolute link (#3757)
* Include rocal version change in the highlights (#177)
* Include rocal version change in the highlights
* Reworded rocal known issues and added link to rocal in highlights
* Update ROCm manifest to 6.2.1
* Update ROCm branch name
* Add 6.2.1 to version list (#3770)
* Add links to GH issues in 6.2.1 release notes (#3769)
* add MAD page
* link to GitHub issues in release notes known issues
* update templates for 6.2.1
* Revert "add MAD page"
This reverts commit 9cce72bba3.
* update wordlist for spellcheck linter
* add rccl note
* update rocal version change heading to be more obvious
* make rocal note more specific
* fix missing space
* fix capitalization
* Update RCCL known issue wording (#3775)
* add MAD page
* fix wording in RCCL known issue
* Revert "add MAD page"
This reverts commit c81d0f3b0a.
* update llvm version for 6.2.1 (#3779)
* Fix broken links in 6.2.1 release notes (#3782)
* External CI: Replace libomp dependencies with aomp (#3781)
Add roctracer dependency for hipBLAS and rocWMMA testing
* External CI: Add rocprofiler v1 and v2 smoke tests (#3784)
* External CI: ROCgdb smoke tests (#3785)
- Since this is an autotools project and not cmake, build and test on gfx942 system instead of separating into two jobs. Pipeline time is short anyway.
- Follow build instructions to update build flags and to incorporate the ROCdbgapi.
- Results are not parsed and graphed, but the log contents are printed at the end. This was helpful for debugging and will be kept in the pipeline, as the make check-gdb command's output was not helpful on its own.
* External CI: rocPyDecode Smoke Test (#3786)
* External CI: omniperf pipeline (#3788)
- Referred to public documentation, source, and iterative attempts to create and improve build and test pipeline.
- ctest failures are due to the test node not having expected marketing name string and override not working.
- The fix should be on the omniperf repo side of things, so this pull request should be fine as is.
* External CI: create omniperf pipeline IDs, update nightly build (#3790)
* Fixed greater than to be less than in rocFFT changes
* fix footnote for 6.1.0 (#3791)
* fix footnote for 6.1.0
* fix empty columns in historical KFD title
* External CI: Publish wheel as artifact for rocPyDecode (#3796)
* fix build rocal for ROCm6.2.1
* Add ROCm6.2.1 manifest file
* External CI: fix hip-tests symlink creation (#3799)
* Docs: Add Ubuntu 24.04.1 (#3801)
* add ubuntu 24.04.1
* add 24.04.1 to bottom os section
* fix heading and template
* Update compatibility-matrix.rst for OpenMP version
* Update compatibility-matrix-historical-6.0.csv for OpenMP version
* rm ubuntu 24.04.1 from 6.2.0
* Update docs/compatibility/compatibility-matrix.rst
Co-authored-by: Young Hui - AMD <145490163+yhuiYH@users.noreply.github.com>
* rm duplicate ubuntu in historical
---------
Co-authored-by: Young Hui - AMD <145490163+yhuiYH@users.noreply.github.com>
* External CI: fixes for rocMLIR and nightly build (#3800)
* External CI: fix symlinks for rocMLIR and nightly build
* add pipeline IDs for hip-tests
* fix hip-test ID typo
* remove llvm-alt license (#3727)
* remove llvm-alt license
* fix linting error
* External CI: enable ROCR-Runtime tests (#3809)
* External CI: default branches for hip-tests, omniperf (#3811)
* External CI: torch and torchvision smoke tests (#3810)
* External CI: torch and torchvision smoke tests
- Fixed issues with package name and version for the vision wheel that prevented it from installing. A patch is used until my pull request in vision repo is merged.
- Referred to rocAutomation scripts to pick which test scripts to run out of the many in the torch and vision repo, and iteratively tested suggested scripts to see which ones completed in a timely manner.
- Leveraging pytest-azurepipelines module to automatically parse and graph results from these tests.
* External CI: omnitrace build pipeline (#3812)
* External CI: omnitrace build pipeline starter
- Adding initial set of dependencies and build flags.
* External CI: omnitrace build pipeline
- Add bison, rccl, texinfo dependencies based on build failures.
- Add AMDGPU_TARGETS flag
- Add ROCm binaries to PATH for clang-format and other tools used.
* Fix indentation
---------
Co-authored-by: Daniel Su <danielsu@amd.com>
* External CI: AMDMIGraphX Build Fix (#3814)
- Swap to default gcc on OS to resolve build errors from recent commits.
- Added libdnnl-dev dependency from iterative attempts with compiler change.
- Referred to the passing GitHub checks to observe the compilers that were used.
- Build CK jit lib and include in AMDMIGraphX build.
* External CI: test fixes w/ roctracer, list omniperf as partially succeeding (#3815)
* External CI: rpp tests (#3816)
* External CI: Build pipeline for rocprofiler-sdk (#3819)
* External CI: Pipeline for rocprofiler-sdk
* Add rocprofiler dependency
* External CI: rocprofiler-sdk build pipeline
---------
Co-authored-by: Daniel Su <danielsu@amd.com>
* External CI: Fix/add missing pipeline IDs (#3818)
* Update default.xml - Change 6.2.1 to 6.2.2
* Add ROCm6.2.1 manifest file
* External CI: omnitrace tests (#3822)
* Update tags to 6.2.2 (#3827)
* External CI: add roctracer to roc/hipSOLVER test deps (#3825)
* External CI: add rocprofiler-sdk pipeline IDs (#3824)
* External CI: AMDMIGraphX Smoke Tests (#3830)
Co-authored-by: Daniel Su <danielsu@amd.com>
* External CI: MIOpen tests (#3837)
* Point to release history instead of deprecated changelog (#3836)
* External CI: filter out hipTensor extended tests (#3838)
* added revised note re. radeon gpus (#3839)
* Restructured the contributions section. (#3715)
* testing if this file is editable
* changed 'kebob-case' to 'dash-case'
* Restructured the page to be more straightforward and provide additional repo information
* forgot to save
* Moved the topic sentence
* Wrong accent on the a in diataxis
* Removed the feedback info from contributing and moved it to Feedback
* fixed spelling errors
* fixed some wording and removed second person text
* consolidated Build and Structure into Contribute; edited toolchain to (hopefully) conform to style guide; updated toc
* updated the titles in the toc
* made changes based on feedback
* it's better when you save
* removed structure and build; fixed something for the linter
* added rst to wordlist
* added customizations to wordlist
* Add links to gpu cluster network guides (#3763)
* Add links to gpu cluster network guides
* Add newline character to eof
* Make link absolute
* add dynamic branch in toc
* remove unnecessary page
clean up
* clean up index/toc
* make multi-node topics adjacent
---------
Co-authored-by: Peter Park <peter.park@amd.com>
* updated the radeon note (#3850)
* External CI: Fix rocPyDecode wheel creation (#3852)
- Set values for expected environment variables.
- Accompanying changes required in rocPyDecode repo. Pull request will be made.
* External CI: pytorch vision patch removal (#3855)
My pull request applying this patch was merged upstream, so this is no longer needed and will break the pipeline since it can no longer be applied.
* Build(deps): Bump rocm-docs-core from 1.8.1 to 1.8.2 in /docs/sphinx (#3807)
Bumps [rocm-docs-core](https://github.com/ROCm/rocm-docs-core) from 1.8.1 to 1.8.2.
- [Release notes](https://github.com/ROCm/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/v1.8.2/CHANGELOG.md)
- [Commits](https://github.com/ROCm/rocm-docs-core/compare/v1.8.1...v1.8.2)
---
updated-dependencies:
- dependency-name: rocm-docs-core
dependency-type: direct:production
update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* updated the radeon note, as it were (#3857)
* updated the radeon note, as it were
* updated the note again
* Set devops team as codeowners for rocm-build (#3860)
* Set ext CI as codeowners for rocm-build
* Update CODEOWNERS to rocm-devops
* External CI: Add option to pull mainline branch for dependencies (#3689)
* External CI: Add option to pull mainline branch for dependencies
* Missing parameter for mainline branch dependencies.
* External CI: mainline branch definitions
* Removed MIGraphX optimization page (#3848)
* External CI: add a global variable to control gfx942 tests (#3864)
* External CI: update component default/mainline branches (#3871)
* External CI: Stop building gfx90a (#3872)
Save on VM resources until infrastructure has test targets.
* External CI: add libstdc++-12 to rocMLIR (#3874)
* Add building doc section (#3873)
* External CI: programmatically get latest aqlprofile (#3876)
* External CI: use ctest for rocm-examples (#3877)
* External CI: Tensile pipeline (#3884)
* add oversubscription conceptual doc (#3885)
add mitigation steps
add to toc
move page for build
move doc
fix spelling
update doc
update oversubscription
update order
fix spelling
add oversubscription to wordlist
move oversubscription topic to bottom of toc and index
* add oversubscription conceptual doc (#3885)
add mitigation steps
add to toc
move page for build
move doc
fix spelling
update doc
update oversubscription
update order
fix spelling
add oversubscription to wordlist
move oversubscription topic to bottom of toc and index
(cherry picked from commit d0ecf51b0c)
* add oversubscription conceptual doc (#3885)
(cherry picked from commit d0ecf51b0c)
* Add building doc section (#3873)
(cherry picked from commit abc0e6a087)
* External CI: Add pipeline to build upstream boost (#3896)
* Update bitsandbytes branch in docs (#3898)
* Update bitsandbytes branch in docs (#3898)
(cherry picked from commit b541be7bcb)
* Documentation: Add reference to precision-support floating-point types (#3899)
* External CI: use Boost template for MIOpen (#3903)
* External CI: create rocprofiler-systems pipeline (#3906)
* External CI: omnitrace/rocprof-sys pipeline IDs (#3908)
* External CI: MIOpen parse test results (#3913)
* External CI: Use pip to install latest cmake on test system (#3915)
* added a link to the compatibility matrix (#3904)
* added a link to the compatibility matrix
* removed quotes
* docs: Remove invalid amd_iommu=on parameter
Per kernel-parameters.txt, there is no "on" option for amd_iommu. While
intel_iommu has it, amd_iommu is automatically on unless specified
otherwise. For more info, see these 2 links:
https://www.kernel.org/doc/Documentation/admin-guide/kernel-parameters.txt
75aa74d52f/drivers/iommu/amd/init.c (L3481)
Signed-off-by: Kent Russell <kent.russell@amd.com>
* docs: Remove invalid amd_iommu=on parameter
Per kernel-parameters.txt, there is no "on" option for amd_iommu. While
intel_iommu has it, amd_iommu is automatically on unless specified
otherwise. For more info, see these 2 links:
https://www.kernel.org/doc/Documentation/admin-guide/kernel-parameters.txt
75aa74d52f/drivers/iommu/amd/init.c (L3481)
Signed-off-by: Kent Russell <kent.russell@amd.com>
(cherry picked from commit 74333b667d)
* External CI: hipBLASLt build now requires python packaging module (#3926)
https://github.com/ROCm/hipBLASLt/pull/1250/files#diff-fee2e6f068b33fca3a1dc49392de8848dbf05c3f4632b680abb1052523e5a30fR35
* External CI: Moved location of upstream pytorch build scripts (#3930)
https://github.com/pytorch/pytorch/pull/138103
* External CI: disable rocMLIR tests (#3931)
* External CI: disable rocMLIR tests
* roctracer AMDGPU_TARGETS flag
* External CI: create a GPU diagnostics template (#3932)
* External CI: Add CK into pytorch build environment (#3934)
* Update rocm-6.2.2.xml (#3927)
vim typo removed
* External CI: add support to disable individual component tests (#3938)
* External CI: AMDMIGraphX greater-equal pip dependencies (#3939)
* Build(deps): Bump rocm-docs-core from 1.8.2 to 1.8.3 in /docs/sphinx (#3933)
Bumps [rocm-docs-core](https://github.com/ROCm/rocm-docs-core) from 1.8.2 to 1.8.3.
- [Release notes](https://github.com/ROCm/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/ROCm/rocm-docs-core/compare/v1.8.2...v1.8.3)
---
updated-dependencies:
- dependency-name: rocm-docs-core
dependency-type: direct:production
update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* External CI: rocDecode add libva-amdgpu-dev dependency (#3940)
* External CI: enumerate GPUs in gpu-diagnostics (#3942)
* External CI: move gpu-diag directly before tests (#3943)
* External CI: fix HIP_PIPELINE_ID (#3944)
* External CI: pytorch pipeline updates (#3948)
To support recent upstream changes and issues observed.
* External CI: rocpydecode dependency installation change (#3954)
- Install pybind11 through pip instead of apt
- Add pip-installed pybind11 path to CMAKE_PREFIX_PATH
- Tested against source of PR 122
* External CI: do not assume python is python3 for rocpydecode (#3955)
* Improve consistency of the gpu-arch-specs table. (#3936)
* Improve consistency of the gpu-arch-specs table.
* Add XCD to the glossary.
* External CI: Always force rocPyDecode cleanup step
* External CI: Add aqlprofile to Tensile test dependencies (#3961)
* add vllm performance validation doc (#3964)
* External CI: various fixes (#3963)
* add suggestions to vllm perf validation doc (#3968)
* External CI: move allowPartiallySucceededBuilds to library variable (#3970)
* External CI: suppress GPU diag warnings (#3972)
* External CI: rocprofiler-compute pipeline files (#3973)
* External CI: disable reload AMDGPU (#3974)
* Update links to vllm perf validation doc (#3971)
* update links to vllm perf validation doc
* add PagedAttention to wordlist
* External CI: Change test setup for rocPyDecode (#3978)
- Use multiple potential locations for pybind11 to be found by cmake.
* External CI: add roctracer to rocBLAS deps (#3982)
* External CI: decode test changes (#3983)
- Only target container with access to first device
- Ensure pybind11-dev is uninstalled before the package manager install steps
* Changed the introductory text linked to Radeon (#3988)
Co-authored-by: prbasyal <prbasyal@amd.com>
* External CI: finish rocprofiler-compute enablement (#3995)
* External CI: add aomp as rocprofiler-systems dependency (#3996)
* External CI: remove omniperf from nightly (#4000)
* Sync from internal develop 6.2.4 (#4002)
* add radeon pro v710 to gpu arch specs (#192)
* Add V710 specs
add some specs
add cols
clean up extra line
* fix graphics l1 cache description
* update SGPR for RDNA2 and RDNA3 archs
* update VGPR
* Apply suggestions from code review
* change l2 cache to 4
* Update docs/reference/gpu-arch-specs.rst
* ROCm 6.2.4 compatibility matrix (#186)
* prep compat column (historical) and mi300x column
* update historical compat matrix for 6.2.4
* update compat matrix for 6.2.4
* fix compat
* fix thunk version
* fix hipify ver
* ROCm 6.2.4 release notes (#184)
* prep 6.2.4 release notes
* add mathlibs
* add detail component changes
* rm non-updated links
* fix sentence
* fix rocthrust v
* rm offline installer
* condense
* add leo/ram fdback
words
* update documentation section
* add rocm on radeon note
* update os support note wording
* update release
* update version and GA date to 10-17
* update 6.2.4 rn
* update wording
* add link to v710
* update wording
* update templ
* simplify note
* words
os note
words
* change URLs to latest
* update link to supported GPUs
* Update versions.md 6.2.4 date to Oct 18
* Update conf.py release note date to Oct 18
---------
Co-authored-by: Sam Wu <22262939+samjwu@users.noreply.github.com>
* Sync change from ROCm to ROCm-internal (#194)
* Fix Radeon link and point at R6.1.3 as absolute link (#3757)
* Update ROCm manifest to 6.2.1
* Update ROCm branch name
* Add 6.2.1 to version list (#3770)
* Add links to GH issues in 6.2.1 release notes (#3769)
* add MAD page
* link to GitHub issues in release notes known issues
* update templates for 6.2.1
* Revert "add MAD page"
This reverts commit 9cce72bba3.
* update wordlist for spellcheck linter
* add rccl note
* update rocal version change heading to be more obvious
* make rocal note more specific
* fix missing space
* fix capitalization
* Update RCCL known issue wording (#3775)
* add MAD page
* fix wording in RCCL known issue
* Revert "add MAD page"
This reverts commit c81d0f3b0a.
* update llvm version for 6.2.1 (#3779)
* Fix broken links in 6.2.1 release notes (#3782)
* External CI: Replace libomp dependencies with aomp (#3781)
Add roctracer dependency for hipBLAS and rocWMMA testing
* External CI: Add rocprofiler v1 and v2 smoke tests (#3784)
* External CI: ROCgdb smoke tests (#3785)
- Since this is an autotools project and not cmake, build and test on gfx942 system instead of separating into two jobs. Pipeline time is short anyway.
- Follow build instructions to update build flags and to incorporate the ROCdbgapi.
- Results are not parsed and graphed, but the log contents are printed at the end. This was helpful for debugging and will be kept in the pipeline, as the make check-gdb command's output was not helpful on its own.
* External CI: rocPyDecode Smoke Test (#3786)
* External CI: omniperf pipeline (#3788)
- Referred to public documentation, source, and iterative attempts to create and improve build and test pipeline.
- ctest failures are due to the test node not having expected marketing name string and override not working.
- The fix should be on the omniperf repo side of things, so this pull request should be fine as is.
* External CI: create omniperf pipeline IDs, update nightly build (#3790)
* Fixed greater than to be less than in rocFFT changes
* fix footnote for 6.1.0 (#3791)
* fix footnote for 6.1.0
* fix empty columns in historical KFD title
* External CI: Publish wheel as artifact for rocPyDecode (#3796)
* External CI: fix hip-tests symlink creation (#3799)
* Docs: Add Ubuntu 24.04.1 (#3801)
* add ubuntu 24.04.1
* add 24.04.1 to bottom os section
* fix heading and template
* Update compatibility-matrix.rst for OpenMP version
* Update compatibility-matrix-historical-6.0.csv for OpenMP version
* rm ubuntu 24.04.1 from 6.2.0
* Update docs/compatibility/compatibility-matrix.rst
Co-authored-by: Young Hui - AMD <145490163+yhuiYH@users.noreply.github.com>
* rm duplicate ubuntu in historical
---------
Co-authored-by: Young Hui - AMD <145490163+yhuiYH@users.noreply.github.com>
* External CI: fixes for rocMLIR and nightly build (#3800)
* External CI: fix symlinks for rocMLIR and nightly build
* add pipeline IDs for hip-tests
* fix hip-test ID typo
* remove llvm-alt license (#3727)
* remove llvm-alt license
* fix linting error
* External CI: enable ROCR-Runtime tests (#3809)
* External CI: default branches for hip-tests, omniperf (#3811)
* External CI: torch and torchvision smoke tests (#3810)
* External CI: torch and torchvision smoke tests
- Fixed issues with package name and version for the vision wheel that prevented it from installing. A patch is used until my pull request in vision repo is merged.
- Referred to rocAutomation scripts to pick which test scripts to run out of the many in the torch and vision repo, and iteratively tested suggested scripts to see which ones completed in a timely manner.
- Leveraging pytest-azurepipelines module to automatically parse and graph results from these tests.
* External CI: omnitrace build pipeline (#3812)
* External CI: omnitrace build pipeline starter
- Adding initial set of dependencies and build flags.
* External CI: omnitrace build pipeline
- Add bison, rccl, texinfo dependencies based on build failures.
- Add AMDGPU_TARGETS flag
- Add ROCm binaries to PATH for clang-format and other tools used.
* Fix indentation
---------
Co-authored-by: Daniel Su <danielsu@amd.com>
* External CI: AMDMIGraphX Build Fix (#3814)
- Swap to default gcc on OS to resolve build errors from recent commits.
- Added libdnnl-dev dependency from iterative attempts with compiler change.
- Referred to the passing GitHub checks to observe the compilers that were used.
- Build CK jit lib and include in AMDMIGraphX build.
* External CI: test fixes w/ roctracer, list omniperf as partially succeeding (#3815)
* External CI: rpp tests (#3816)
* External CI: Build pipeline for rocprofiler-sdk (#3819)
* External CI: Pipeline for rocprofiler-sdk
* Add rocprofiler dependency
* External CI: rocprofiler-sdk build pipeline
---------
Co-authored-by: Daniel Su <danielsu@amd.com>
* External CI: Fix/add missing pipeline IDs (#3818)
* External CI: omnitrace tests (#3822)
* Update tags to 6.2.2 (#3827)
* External CI: add roctracer to roc/hipSOLVER test deps (#3825)
* External CI: add rocprofiler-sdk pipeline IDs (#3824)
* External CI: AMDMIGraphX Smoke Tests (#3830)
Co-authored-by: Daniel Su <danielsu@amd.com>
* External CI: MIOpen tests (#3837)
* Point to release history instead of deprecated changelog (#3836)
* External CI: filter out hipTensor extended tests (#3838)
* added revised note re. radeon gpus (#3839)
* Restructured the contributions section. (#3715)
* testing if this file is editable
* changed 'kebob-case' to 'dash-case'
* Restructured the page to be more straightforward and provide additional repo information
* forgot to save
* Moved the topic sentence
* Wrong accent on the a in diataxis
* Removed the feedback info from contributing and moved it to Feedback
* fixed spelling errors
* fixed some wording and removed second person text
* consolidated Build and Structure into Contribute; edited toolchain to (hopefully) conform to style guide; updated toc
* updated the titles in the toc
* made changes based on feedback
* it's better when you save
* removed structure and build; fixed something for the linter
* added rst to wordlist
* added customizations to wordlist
* Add links to gpu cluster network guides (#3763)
* Add links to gpu cluster network guides
* Add newline character to eof
* Make link absolute
* add dynamic branch in toc
* remove unnecessary page
clean up
* clean up index/toc
* make multi-node topics adjacent
---------
Co-authored-by: Peter Park <peter.park@amd.com>
* updated the radeon note (#3850)
* External CI: Fix rocPyDecode wheel creation (#3852)
- Set values for expected environment variables.
- Accompanying changes required in rocPyDecode repo. Pull request will be made.
* External CI: pytorch vision patch removal (#3855)
My pull request applying this patch was merged upstream, so this is no longer needed and will break the pipeline since it can no longer be applied.
* Build(deps): Bump rocm-docs-core from 1.8.1 to 1.8.2 in /docs/sphinx (#3807)
Bumps [rocm-docs-core](https://github.com/ROCm/rocm-docs-core) from 1.8.1 to 1.8.2.
- [Release notes](https://github.com/ROCm/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/v1.8.2/CHANGELOG.md)
- [Commits](https://github.com/ROCm/rocm-docs-core/compare/v1.8.1...v1.8.2)
---
updated-dependencies:
- dependency-name: rocm-docs-core
dependency-type: direct:production
update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* updated the radeon note, as it were (#3857)
* updated the radeon note, as it were
* updated the note again
* Set devops team as codeowners for rocm-build (#3860)
* Set ext CI as codeowners for rocm-build
* Update CODEOWNERS to rocm-devops
* External CI: Add option to pull mainline branch for dependencies (#3689)
* External CI: Add option to pull mainline branch for dependencies
* Missing parameter for mainline branch dependencies.
* External CI: mainline branch definitions
* Removed MIGraphX optimization page (#3848)
* External CI: add a global variable to control gfx942 tests (#3864)
* External CI: update component default/mainline branches (#3871)
* External CI: Stop building gfx90a (#3872)
Save on VM resources until infrastructure has test targets.
* External CI: add libstdc++-12 to rocMLIR (#3874)
* Add building doc section (#3873)
* External CI: programmatically get latest aqlprofile (#3876)
* External CI: use ctest for rocm-examples (#3877)
* External CI: Tensile pipeline (#3884)
* add oversubscription conceptual doc (#3885)
add mitigation steps
add to toc
move page for build
move doc
fix spelling
update doc
update oversubscription
update order
fix spelling
add oversubscription to wordlist
move oversubscription topic to bottom of toc and index
* add oversubscription conceptual doc (#3885)
(cherry picked from commit d0ecf51b0c)
* External CI: Add pipeline to build upstream boost (#3896)
* Update bitsandbytes branch in docs (#3898)
* Documentation: Add reference to precision-support floating-point types (#3899)
* External CI: use Boost template for MIOpen (#3903)
* External CI: create rocprofiler-systems pipeline (#3906)
* External CI: omnitrace/rocprof-sys pipeline IDs (#3908)
* External CI: MIOpen parse test results (#3913)
* External CI: Use pip to install latest cmake on test system (#3915)
* added a link to the compatibility matrix (#3904)
* added a link to the compatibility matrix
* removed quotes
* docs: Remove invalid amd_iommu=on parameter
Per kernel-parameters.txt, there is no "on" option for amd_iommu. While
intel_iommu has it, amd_iommu is automatically on unless specified
otherwise. For more info, see these 2 links:
https://www.kernel.org/doc/Documentation/admin-guide/kernel-parameters.txt
75aa74d52f/drivers/iommu/amd/init.c (L3481)
Signed-off-by: Kent Russell <kent.russell@amd.com>
* External CI: hipBLASLt build now requires python packaging module (#3926)
https://github.com/ROCm/hipBLASLt/pull/1250/files#diff-fee2e6f068b33fca3a1dc49392de8848dbf05c3f4632b680abb1052523e5a30fR35
* External CI: Moved location of upstream pytorch build scripts (#3930)
https://github.com/pytorch/pytorch/pull/138103
* External CI: disable rocMLIR tests (#3931)
* External CI: disable rocMLIR tests
* roctracer AMDGPU_TARGETS flag
* External CI: create a GPU diagnostics template (#3932)
* External CI: Add CK into pytorch build environment (#3934)
* External CI: add support to disable individual component tests (#3938)
* External CI: AMDMIGraphX greater-equal pip dependencies (#3939)
* Build(deps): Bump rocm-docs-core from 1.8.2 to 1.8.3 in /docs/sphinx (#3933)
Bumps [rocm-docs-core](https://github.com/ROCm/rocm-docs-core) from 1.8.2 to 1.8.3.
- [Release notes](https://github.com/ROCm/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/ROCm/rocm-docs-core/compare/v1.8.2...v1.8.3)
---
updated-dependencies:
- dependency-name: rocm-docs-core
dependency-type: direct:production
update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* External CI: rocDecode add libva-amdgpu-dev dependency (#3940)
* External CI: enumerate GPUs in gpu-diagnostics (#3942)
* External CI: move gpu-diag directly before tests (#3943)
* External CI: fix HIP_PIPELINE_ID (#3944)
---------
Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: Kent Russell <kent.russell@amd.com>
Co-authored-by: Jeffrey Novotny <jnovotny@amd.com>
Co-authored-by: Sam Wu <22262939+samjwu@users.noreply.github.com>
Co-authored-by: Wang, Yanyao <yanyao.wang@amd.com>
Co-authored-by: Yanyao Wang <yanywang@amd.com>
Co-authored-by: Peter Park <peter.park@amd.com>
Co-authored-by: Young Hui - AMD <145490163+yhuiYH@users.noreply.github.com>
Co-authored-by: Joseph Macaranas <145489236+amd-jmacaran@users.noreply.github.com>
Co-authored-by: Daniel Su <danielsu@amd.com>
Co-authored-by: Sandra Polifroni <sandra.polifroni@amd.com>
Co-authored-by: randyh62 <42045079+randyh62@users.noreply.github.com>
Co-authored-by: Michael Benavidez <michael.benavidez@amd.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: MKKnorr <MKKnorr@web.de>
Co-authored-by: Kent Russell <kent.russell@amd.com>
Co-authored-by: Joseph Greathouse <jlgreathouse@users.noreply.github.com>
* 6.2.4 release notes: add known/fixed issues (#193)
* add "for compute workloads" wording for clarity
* add AMDSMI resolved issue
* add dlm known issue
intro text
wording
* update wording
rm bullet point
update wording
* fix spellcheck due to spacing
* rm s
* rm gfx1151
* remove dlm known issue
* update list of updated docs; note for Radeon users
fmt
* update GA date for 6.2.4
* fix rdc version
* fix RDC version strings (#196)
* revert outdated change for .azuredevops
* Fix 6.2.4 date in versions.md
Co-authored-by: Sam Wu <22262939+samjwu@users.noreply.github.com>
---------
Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: Kent Russell <kent.russell@amd.com>
Co-authored-by: Peter Park <peter.park@amd.com>
Co-authored-by: Sam Wu <22262939+samjwu@users.noreply.github.com>
Co-authored-by: Jeffrey Novotny <jnovotny@amd.com>
Co-authored-by: Wang, Yanyao <yanyao.wang@amd.com>
Co-authored-by: Yanyao Wang <yanywang@amd.com>
Co-authored-by: Young Hui - AMD <145490163+yhuiYH@users.noreply.github.com>
Co-authored-by: Joseph Macaranas <145489236+amd-jmacaran@users.noreply.github.com>
Co-authored-by: Daniel Su <danielsu@amd.com>
Co-authored-by: Sandra Polifroni <sandra.polifroni@amd.com>
Co-authored-by: randyh62 <42045079+randyh62@users.noreply.github.com>
Co-authored-by: Michael Benavidez <michael.benavidez@amd.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: MKKnorr <MKKnorr@web.de>
Co-authored-by: Kent Russell <kent.russell@amd.com>
Co-authored-by: Joseph Greathouse <jlgreathouse@users.noreply.github.com>
* fix links in release notes 6.2.4 (#4008)
* Remove extra line
* Update xml files for 6.2.4 (#4012)
* Update xml files for 6.2.4
* Update README with 6.2.4
* Increase visibility of programming guide
* Docs: Update what is rocm description
* Apply suggestions from code review
Co-authored-by: randyh62 <42045079+randyh62@users.noreply.github.com>
* Update docs/how-to/hip_programming_guide.rst
Co-authored-by: MKKnorr <MKKnorr@web.de>
* WIP
* Update docs/index.md
* Update docs/how-to/hip_programming_guide.rst
Co-authored-by: MKKnorr <MKKnorr@web.de>
* Update docs/how-to/programming_guide.rst
* Update docs/what-is-rocm.rst
* Apply suggestions from code review
Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com>
* Update docs/how-to/programming_guide.rst
Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com>
* Remove tip
* External CI: allow test failures to present as failures on Github (#3993)
* External CI: disable rdmatest and rocrtstFunc.Memory_Max_Mem (#4016)
* Added 6.2.4 manifest.xml
* External CI: fix comgr build (#4025)
* External CI: increase Tensile test timeout to 90 mins (#4027)
---------
Signed-off-by: David Galiffi <David.Galiffi@amd.com>
Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: Kent Russell <kent.russell@amd.com>
Co-authored-by: Sam Wu <22262939+samjwu@users.noreply.github.com>
Co-authored-by: Jeffrey Novotny <jnovotny@amd.com>
Co-authored-by: Peter Park <peter.park@amd.com>
Co-authored-by: Yanyao Wang <yanywang@amd.com>
Co-authored-by: Wang, Yanyao <yanyao.wang@amd.com>
Co-authored-by: David Galiffi <dgaliffi@amd.com>
Co-authored-by: Chris Kime <Christopher.Kime@amd.com>
Co-authored-by: ozziemoreno <109979778+ozziemoreno@users.noreply.github.com>
Co-authored-by: Sandra Polifroni <sandra.polifroni@amd.com>
Co-authored-by: Young Hui - AMD <145490163+yhuiYH@users.noreply.github.com>
Co-authored-by: Joseph Macaranas <145489236+amd-jmacaran@users.noreply.github.com>
Co-authored-by: Daniel Su <danielsu@amd.com>
Co-authored-by: randyh62 <42045079+randyh62@users.noreply.github.com>
Co-authored-by: JeniferC99 <150404595+JeniferC99@users.noreply.github.com>
Co-authored-by: Michael Benavidez <michael.benavidez@amd.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: MKKnorr <MKKnorr@web.de>
Co-authored-by: Kent Russell <kent.russell@amd.com>
Co-authored-by: Joseph Greathouse <jlgreathouse@users.noreply.github.com>
Co-authored-by: Johannes Maria Frank <jmfrank63@gmail.com>
Co-authored-by: Brian Cornille <bcornill@amd.com>
Co-authored-by: Joseph Macaranas <Joseph.Macaranas@amd.com>
Co-authored-by: Pratik Basyal <pratik.basyal@amd.com>
Co-authored-by: prbasyal <prbasyal@amd.com>
Co-authored-by: Istvan Kiss <neon60@gmail.com>
Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com>
Co-authored-by: Ameya Keshava Mallya <ameyakeshava.mallya@amd.com>
* add radeon pro v710 to gpu arch specs (#192)
* Add V710 specs
add some specs
add cols
clean up extra line
* fix graphics l1 cache description
* update SGPR for RDNA2 and RDNA3 archs
* update VGPR
* Apply suggestions from code review
* change l2 cache to 4
* Update docs/reference/gpu-arch-specs.rst
* ROCm 6.2.4 compatibility matrix (#186)
* prep compat column (historical) and mi300x column
* update historical compat matrix for 6.2.4
* update compat matrix for 6.2.4
* fix compat
* fix thunk version
* fix hipify ver
* ROCm 6.2.4 release notes (#184)
* prep 6.2.4 release notes
* add mathlibs
* add detail component changes
* rm non-updated links
* fix sentence
* fix rocthrust v
* rm offline installer
* condense
* add leo/ram fdback
words
* update documentation section
* add rocm on radeon note
* update os support note wording
* update release
* update version and GA date to 10-17
* update 6.2.4 rn
* update wording
* add link to v710
* update wording
* update templ
* simplify note
* words
os note
words
* change URLs to latest
* update link to supported GPUs
* Update versions.md 6.2.4 date to Oct 18
* Update conf.py release note date to Oct 18
---------
Co-authored-by: Sam Wu <22262939+samjwu@users.noreply.github.com>
* Sync change from ROCm to ROCm-internal (#194)
* Fix Radeon link and point at R6.1.3 as absolute link (#3757)
* Update ROCm manifest to 6.2.1
* Update ROCm branch name
* Add 6.2.1 to version list (#3770)
* Add links to GH issues in 6.2.1 release notes (#3769)
* add MAD page
* link to GitHub issues in release notes known issues
* update templates for 6.2.1
* Revert "add MAD page"
This reverts commit 9cce72bba3.
* update wordlist for spellcheck linter
* add rccl note
* update rocal version change heading to be more obvious
* make rocal note more specific
* fix missing space
* fix capitalization
* Update RCCL known issue wording (#3775)
* add MAD page
* fix wording in RCCL known issue
* Revert "add MAD page"
This reverts commit c81d0f3b0a.
* update llvm version for 6.2.1 (#3779)
* Fix broken links in 6.2.1 release notes (#3782)
* External CI: Replace libomp dependencies with aomp (#3781)
Add roctracer dependency for hipBLAS and rocWMMA testing
* External CI: Add rocprofiler v1 and v2 smoke tests (#3784)
* External CI: ROCgdb smoke tests (#3785)
- Since this is an autotools project and not cmake, build and test on gfx942 system instead of separating into two jobs. Pipeline time is short anyway.
- Follow build instructions to update build flags and to incorporate the ROCdbgapi.
- Results are not parsed and graphed, but the log contents are printed at the end. This was helpful for debugging and will be kept in the pipeline, as the make check-gdb command's output was not helpful on its own.
* External CI: rocPyDecode Smoke Test (#3786)
* External CI: omniperf pipeline (#3788)
- Referred to public documentation, source, and iterative attempts to create and improve build and test pipeline.
- ctest failures are due to the test node not having expected marketing name string and override not working.
- The fix should be on the omniperf repo side of things, so this pull request should be fine as is.
* External CI: create omniperf pipeline IDs, update nightly build (#3790)
* Fixed greater than to be less than in rocFFT changes
* fix footnote for 6.1.0 (#3791)
* fix footnote for 6.1.0
* fix empty columns in historical KFD title
* External CI: Publish wheel as artifact for rocPyDecode (#3796)
* External CI: fix hip-tests symlink creation (#3799)
* Docs: Add Ubuntu 24.04.1 (#3801)
* add ubuntu 24.04.1
* add 24.04.1 to bottom os section
* fix heading and template
* Update compatibility-matrix.rst for OpenMP version
* Update compatibility-matrix-historical-6.0.csv for OpenMP version
* rm ubuntu 24.04.1 from 6.2.0
* Update docs/compatibility/compatibility-matrix.rst
Co-authored-by: Young Hui - AMD <145490163+yhuiYH@users.noreply.github.com>
* rm duplicate ubuntu in historical
---------
Co-authored-by: Young Hui - AMD <145490163+yhuiYH@users.noreply.github.com>
* External CI: fixes for rocMLIR and nightly build (#3800)
* External CI: fix symlinks for rocMLIR and nightly build
* add pipeline IDs for hip-tests
* fix hip-test ID typo
* remove llvm-alt license (#3727)
* remove llvm-alt license
* fix linting error
* External CI: enable ROCR-Runtime tests (#3809)
* External CI: default branches for hip-tests, omniperf (#3811)
* External CI: torch and torchvision smoke tests (#3810)
* External CI: torch and torchvision smoke tests
- Fixed issues with package name and version for the vision wheel that prevented it from installing. A patch is used until my pull request in vision repo is merged.
- Referred to rocAutomation scripts to pick which test scripts to run out of the many in the torch and vision repo, and iteratively tested suggested scripts to see which ones completed in a timely manner.
- Leveraging pytest-azurepipelines module to automatically parse and graph results from these tests.
* External CI: omnitrace build pipeline (#3812)
* External CI: omnitrace build pipeline starter
- Adding initial set of dependencies and build flags.
* External CI: omnitrace build pipeline
- Add bison, rccl, texinfo dependencies based on build failures.
- Add AMDGPU_TARGETS flag
- Add ROCm binaries to PATH for clang-format and other tools used.
* Fix indentation
---------
Co-authored-by: Daniel Su <danielsu@amd.com>
* External CI: AMDMIGraphX Build Fix (#3814)
- Swap to default gcc on OS to resolve build errors from recent commits.
- Added libdnnl-dev dependency from iterative attempts with compiler change.
- Referred to the passing GitHub checks to observe the compilers that were used.
- Build CK jit lib and include in AMDMIGraphX build.
* External CI: test fixes w/ roctracer, list omniperf as partially succeeding (#3815)
* External CI: rpp tests (#3816)
* External CI: Build pipeline for rocprofiler-sdk (#3819)
* External CI: Pipeline for rocprofiler-sdk
* Add rocprofiler dependency
* External CI: rocprofiler-sdk build pipeline
---------
Co-authored-by: Daniel Su <danielsu@amd.com>
* External CI: Fix/add missing pipeline IDs (#3818)
* External CI: omnitrace tests (#3822)
* Update tags to 6.2.2 (#3827)
* External CI: add roctracer to roc/hipSOLVER test deps (#3825)
* External CI: add rocprofiler-sdk pipeline IDs (#3824)
* External CI: AMDMIGraphX Smoke Tests (#3830)
Co-authored-by: Daniel Su <danielsu@amd.com>
* External CI: MIOpen tests (#3837)
* Point to release history instead of deprecated changelog (#3836)
* External CI: filter out hipTensor extended tests (#3838)
* added revised note re. radeon gpus (#3839)
* Restructured the contributions section. (#3715)
* testing if this file is editable
* changed 'kebob-case' to 'dash-case'
* Restructured the page to be more straightforward and provide additional repo information
* forgot to save
* Moved the topic sentence
* Wrong accent on the a in diataxis
* Removed the feedback info from contributing and moved it to Feedback
* fixed spelling errors
* fixed some wording and removed second person text
* consolidated Build and Structure into Contribute; edited toolchain to (hopefully) conform to style guide; updated toc
* updated the titles in the toc
* made changes based on feedback
* it's better when you save
* removed structure and build; fixed something for the linter
* added rst to wordlist
* added customizations to wordlist
* Add links to gpu cluster network guides (#3763)
* Add links to gpu cluster network guides
* Add newline character to eof
* Make link absolute
* add dynamic branch in toc
* remove unnecessary page
clean up
* clean up index/toc
* make multi-node topics adjacent
---------
Co-authored-by: Peter Park <peter.park@amd.com>
* updated the radeon note (#3850)
* External CI: Fix rocPyDecode wheel creation (#3852)
- Set values for expected environment variables.
- Accompanying changes required in rocPyDecode repo. Pull request will be made.
* External CI: pytorch vision patch removal (#3855)
My pull request applying this patch was merged upstream, so this is no longer needed and will break the pipeline since it can no longer be applied.
* Build(deps): Bump rocm-docs-core from 1.8.1 to 1.8.2 in /docs/sphinx (#3807)
Bumps [rocm-docs-core](https://github.com/ROCm/rocm-docs-core) from 1.8.1 to 1.8.2.
- [Release notes](https://github.com/ROCm/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/v1.8.2/CHANGELOG.md)
- [Commits](https://github.com/ROCm/rocm-docs-core/compare/v1.8.1...v1.8.2)
---
updated-dependencies:
- dependency-name: rocm-docs-core
dependency-type: direct:production
update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* updated the radeon note, as it were (#3857)
* updated the radeon note, as it were
* updated the note again
* Set devops team as codeowners for rocm-build (#3860)
* Set ext CI as codeowners for rocm-build
* Update CODEOWNERS to rocm-devops
* External CI: Add option to pull mainline branch for dependencies (#3689)
* External CI: Add option to pull mainline branch for dependencies
* Missing parameter for mainline branch dependencies.
* External CI: mainline branch definitions
* Removed MIGraphX optimization page (#3848)
* External CI: add a global variable to control gfx942 tests (#3864)
* External CI: update component default/mainline branches (#3871)
* External CI: Stop building gfx90a (#3872)
Save on VM resources until infrastructure has test targets.
* External CI: add libstdc++-12 to rocMLIR (#3874)
* Add building doc section (#3873)
* External CI: programmatically get latest aqlprofile (#3876)
* External CI: use ctest for rocm-examples (#3877)
* External CI: Tensile pipeline (#3884)
* add oversubscription conceptual doc (#3885)
add mitigation steps
add to toc
move page for build
move doc
fix spelling
update doc
update oversubscription
update order
fix spelling
add oversubscription to wordlist
move oversubscription topic to bottom of toc and index
* add oversubscription conceptual doc (#3885)
(cherry picked from commit d0ecf51b0c)
* External CI: Add pipeline to build upstream boost (#3896)
* Update bitsandbytes branch in docs (#3898)
* Documentation: Add reference to precision-support floating-point types (#3899)
* External CI: use Boost template for MIOpen (#3903)
* External CI: create rocprofiler-systems pipeline (#3906)
* External CI: omnitrace/rocprof-sys pipeline IDs (#3908)
* External CI: MIOpen parse test results (#3913)
* External CI: Use pip to install latest cmake on test system (#3915)
* added a link to the compatibility matrix (#3904)
* added a link to the compatibility matrix
* removed quotes
* docs: Remove invalid amd_iommu=on parameter
Per kernel-parameters.txt, there is no "on" option for amd_iommu. While
intel_iommu has it, amd_iommu is automatically on unless specified
otherwise. For more info, see these 2 links:
https://www.kernel.org/doc/Documentation/admin-guide/kernel-parameters.txt
75aa74d52f/drivers/iommu/amd/init.c (L3481)
Signed-off-by: Kent Russell <kent.russell@amd.com>
* External CI: hipBLASLt build now requires python packaging module (#3926)
https://github.com/ROCm/hipBLASLt/pull/1250/files#diff-fee2e6f068b33fca3a1dc49392de8848dbf05c3f4632b680abb1052523e5a30fR35
* External CI: Moved location of upstream pytorch build scripts (#3930)
https://github.com/pytorch/pytorch/pull/138103
* External CI: disable rocMLIR tests (#3931)
* External CI: disable rocMLIR tests
* roctracer AMDGPU_TARGETS flag
* External CI: create a GPU diagnostics template (#3932)
* External CI: Add CK into pytorch build environment (#3934)
* External CI: add support to disable individual component tests (#3938)
* External CI: AMDMIGraphX greater-equal pip dependencies (#3939)
* Build(deps): Bump rocm-docs-core from 1.8.2 to 1.8.3 in /docs/sphinx (#3933)
Bumps [rocm-docs-core](https://github.com/ROCm/rocm-docs-core) from 1.8.2 to 1.8.3.
- [Release notes](https://github.com/ROCm/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/ROCm/rocm-docs-core/compare/v1.8.2...v1.8.3)
---
updated-dependencies:
- dependency-name: rocm-docs-core
dependency-type: direct:production
update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* External CI: rocDecode add libva-amdgpu-dev dependency (#3940)
* External CI: enumerate GPUs in gpu-diagnostics (#3942)
* External CI: move gpu-diag directly before tests (#3943)
* External CI: fix HIP_PIPELINE_ID (#3944)
---------
Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: Kent Russell <kent.russell@amd.com>
Co-authored-by: Jeffrey Novotny <jnovotny@amd.com>
Co-authored-by: Sam Wu <22262939+samjwu@users.noreply.github.com>
Co-authored-by: Wang, Yanyao <yanyao.wang@amd.com>
Co-authored-by: Yanyao Wang <yanywang@amd.com>
Co-authored-by: Peter Park <peter.park@amd.com>
Co-authored-by: Young Hui - AMD <145490163+yhuiYH@users.noreply.github.com>
Co-authored-by: Joseph Macaranas <145489236+amd-jmacaran@users.noreply.github.com>
Co-authored-by: Daniel Su <danielsu@amd.com>
Co-authored-by: Sandra Polifroni <sandra.polifroni@amd.com>
Co-authored-by: randyh62 <42045079+randyh62@users.noreply.github.com>
Co-authored-by: Michael Benavidez <michael.benavidez@amd.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: MKKnorr <MKKnorr@web.de>
Co-authored-by: Kent Russell <kent.russell@amd.com>
Co-authored-by: Joseph Greathouse <jlgreathouse@users.noreply.github.com>
* 6.2.4 release notes: add known/fixed issues (#193)
* add "for compute workloads" wording for clarity
* add AMDSMI resolved issue
* add dlm known issue
intro text
wording
* update wording
rm bullet point
update wording
* fix spellcheck due to spacing
* rm s
* rm gfx1151
* remove dlm known issue
* update list of updated docs; note for Radeon users
fmt
* update GA date for 6.2.4
* fix rdc version
* fix RDC version strings (#196)
* revert outdated change for .azuredevops
* Fix 6.2.4 date in versions.md
Co-authored-by: Sam Wu <22262939+samjwu@users.noreply.github.com>
---------
Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: Kent Russell <kent.russell@amd.com>
Co-authored-by: Peter Park <peter.park@amd.com>
Co-authored-by: Sam Wu <22262939+samjwu@users.noreply.github.com>
Co-authored-by: Jeffrey Novotny <jnovotny@amd.com>
Co-authored-by: Wang, Yanyao <yanyao.wang@amd.com>
Co-authored-by: Yanyao Wang <yanywang@amd.com>
Co-authored-by: Young Hui - AMD <145490163+yhuiYH@users.noreply.github.com>
Co-authored-by: Joseph Macaranas <145489236+amd-jmacaran@users.noreply.github.com>
Co-authored-by: Daniel Su <danielsu@amd.com>
Co-authored-by: Sandra Polifroni <sandra.polifroni@amd.com>
Co-authored-by: randyh62 <42045079+randyh62@users.noreply.github.com>
Co-authored-by: Michael Benavidez <michael.benavidez@amd.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: MKKnorr <MKKnorr@web.de>
Co-authored-by: Kent Russell <kent.russell@amd.com>
Co-authored-by: Joseph Greathouse <jlgreathouse@users.noreply.github.com>
* update current matrix for 6.2.2
* update history compat
* fix typo
* fixed missed 60201s
* fix missed rocm-6.2.1
* Add additional column to compatibility-matrix-historical-6.0, so it includes it correctly
Also, fixing a few 6.2.2 footnote references
* add oracle linux 8.9 under 6.2.2 in historical
* rm widths in historical table
* lowercase a letter
* Fix version numbers for 6.2.2
* Minor updates to historical matrix
* add ubuntu 24.04.1
* Docs: Add Ubuntu 24.04.1 (#3801)
* add ubuntu 24.04.1
* add 24.04.1 to bottom os section
* fix heading and template
* Update compatibility-matrix.rst for OpenMP version
* Update compatibility-matrix-historical-6.0.csv for OpenMP version
* rm ubuntu 24.04.1 from 6.2.0
* Update docs/compatibility/compatibility-matrix.rst
Co-authored-by: Young Hui - AMD <145490163+yhuiYH@users.noreply.github.com>
* rm duplicate ubuntu in historical
---------
Co-authored-by: Young Hui - AMD <145490163+yhuiYH@users.noreply.github.com>
* add overwritten ubuntu 24.04.1
* fix wrong versions and extra comma
---------
Co-authored-by: Young Hui - AMD <145490163+yhuiYH@users.noreply.github.com>
- Referred to public documentation, source, and iterative attempts to create and improve build and test pipeline.
- ctest failures are due to the test node not having the expected marketing name string and the override not working.
- The fix should be on the omniperf repo side of things, so this pull request should be fine as is.
- Since this is an autotools project and not cmake, build and test on gfx942 system instead of separating into two jobs. Pipeline time is short anyway.
- Follow build instructions to update build flags and to incorporate the ROCdbgapi.
- Results are not parsed and graphed, but the log contents are printed at the end. This was helpful for debugging and will be kept in the pipeline, as the make check-gdb command's output was not helpful on its own.
* add MAD page
* link to GitHub issues in release notes known issues
* update templates for 6.2.1
* Revert "add MAD page"
This reverts commit 9cce72bba3.
* update wordlist for spellcheck linter
* add rccl note
* update rocal version change heading to be more obvious
* make rocal note more specific
* fix missing space
* fix capitalization
* first pass of the release notes for 6.2.1 (#131)
* first pass of the release notes for 6.2.1
* something went wrong building the relnotes the first time; this should be OKer
* Partially complete release notes for 6.2.1
* Spolifroni amd/release notes 621 (#135)
* first pass of the release notes for 6.2.1
* something went wrong building the relnotes the first time; this should be OKer
* Partially complete release notes for 6.2.1
* added a line about there being no OS changes in 6.2.1 relative to 6.2.0
* Updated version and date
* added documentation highlights (#136)
* first pass of the release notes for 6.2.1
* something went wrong building the relnotes the first time; this should be OKer
* Partially complete release notes for 6.2.1
* added a line about there being no OS changes in 6.2.1 relative to 6.2.0
* Updated version and date
* made wording changes and added documentation highlights
* Added information for rocdbgapi (#138)
* first pass of the release notes for 6.2.1
* something went wrong building the relnotes the first time; this should be OKer
* Partially complete release notes for 6.2.1
* added a line about there being no OS changes in 6.2.1 relative to 6.2.0
* Updated version and date
* made wording changes and added documentation highlights
* added information about rocdbgapi
* Updates to documentation section; changed "key" to "notable" (#139)
* first pass of the release notes for 6.2.1
* something went wrong building the relnotes the first time; this should be OKer
* Partially complete release notes for 6.2.1
* added a line about there being no OS changes in 6.2.1 relative to 6.2.0
* Updated version and date
* made wording changes and added documentation highlights
* added information about rocdbgapi
* Changed 'key' to 'notable'; clarified that changes are from 6.2.0 to 6.2.1; clarified the open-source nature of the documentation; brought a note back.
* Updated the release date and made changes to component details (#140)
* first pass of the release notes for 6.2.1
* something went wrong building the relnotes the first time; this should be OKer
* Partially complete release notes for 6.2.1
* added a line about there being no OS changes in 6.2.1 relative to 6.2.0
* Updated version and date
* made wording changes and added documentation highlights
* added information about rocdbgapi
* Changed 'key' to 'notable'; clarified that changes are from 6.2.0 to 6.2.1; clarified the open-source nature of the documentation; brought a note back.
* updated the release date in conf.py; removed added API calls for HIP; added fixed issues to rocdbgapi
* Updated the known issues intro (#141)
* first pass of the release notes for 6.2.1
* something went wrong building the relnotes the first time; this should be OKer
* Partially complete release notes for 6.2.1
* added a line about there being no OS changes in 6.2.1 relative to 6.2.0
* Updated version and date
* made wording changes and added documentation highlights
* added information about rocdbgapi
* Changed 'key' to 'notable'; clarified that changes are from 6.2.0 to 6.2.1; clarified the open-source nature of the documentation; brought a note back.
* updated the release date in conf.py; removed added API calls for HIP; added fixed issues to rocdbgapi
* changed the opening intro to Known Issues
* test (#142)
* first pass of the release notes for 6.2.1
* something went wrong building the relnotes the first time; this should be OKer
* Partially complete release notes for 6.2.1
* added a line about there being no OS changes in 6.2.1 relative to 6.2.0
* Updated version and date
* made wording changes and added documentation highlights
* added information about rocdbgapi
* Changed 'key' to 'notable'; clarified that changes are from 6.2.0 to 6.2.1; clarified the open-source nature of the documentation; brought a note back.
* updated the release date in conf.py; removed added API calls for HIP; added fixed issues to rocdbgapi
* changed the opening intro to Known Issues
* fixed the major copy-pasta error with upcoming changes
* removed a word just to see what happens
* Spolifroni amd/release notes 621 (#143)
* first pass of the release notes for 6.2.1
* something went wrong building the relnotes the first time; this should be OKer
* Partially complete release notes for 6.2.1
* added a line about there being no OS changes in 6.2.1 relative to 6.2.0
* Updated version and date
* made wording changes and added documentation highlights
* added information about rocdbgapi
* Changed 'key' to 'notable'; clarified that changes are from 6.2.0 to 6.2.1; clarified the open-source nature of the documentation; brought a note back.
* updated the release date in conf.py; removed added API calls for HIP; added fixed issues to rocdbgapi
* changed the opening intro to Known Issues
* fixed the major copy-pasta error with upcoming changes
* removed a word just to see what happens
* putting the "are" back
* removed the HIP changes; they were in 6.2.0
* Reworded some things (#146)
* first pass of the release notes for 6.2.1
* something went wrong building the relnotes the first time; this should be OKer
* Partially complete release notes for 6.2.1
* added a line about there being no OS changes in 6.2.1 relative to 6.2.0
* Updated version and date
* made wording changes and added documentation highlights
* added information about rocdbgapi
* Changed 'key' to 'notable'; clarified that changes are from 6.2.0 to 6.2.1; clarified the open-source nature of the documentation; brought a note back.
* updated the release date in conf.py; removed added API calls for HIP; added fixed issues to rocdbgapi
* changed the opening intro to Known Issues
* fixed the major copy-pasta error with upcoming changes
* removed a word just to see what happens
* putting the "are" back
* removed the HIP changes; they were in 6.2.0
* corrected some formatting errors
* changed some wording
* changed a word
* reworded the known issues
* Added info for rocal 2.0.0 (#147)
* first pass of the release notes for 6.2.1
* something went wrong building the relnotes the first time; this should be OKer
* Partially complete release notes for 6.2.1
* added a line about there being no OS changes in 6.2.1 relative to 6.2.0
* Updated version and date
* made wording changes and added documentation highlights
* added information about rocdbgapi
* Changed 'key' to 'notable'; clarified that changes are from 6.2.0 to 6.2.1; clarified the open-source nature of the documentation; brought a note back.
* updated the release date in conf.py; removed added API calls for HIP; added fixed issues to rocdbgapi
* changed the opening intro to Known Issues
* fixed the major copy-pasta error with upcoming changes
* removed a word just to see what happens
* putting the "are" back
* removed the HIP changes; they were in 6.2.0
* corrected some formatting errors
* changed some wording
* changed a word
* reworded the known issues
* added info for rocAL 2.0.0
* Some small changes to the release notes (#148)
* first pass of the release notes for 6.2.1
* something went wrong building the relnotes the first time; this should be OKer
* Partially complete release notes for 6.2.1
* added a line about there being no OS changes in 6.2.1 relative to 6.2.0
* Updated version and date
* made wording changes and added documentation highlights
* added information about rocdbgapi
* Changed 'key' to 'notable'; clarified that changes are from 6.2.0 to 6.2.1; clarified the open-source nature of the documentation; brought a note back.
* updated the release date in conf.py; removed added API calls for HIP; added fixed issues to rocdbgapi
* changed the opening intro to Known Issues
* fixed the major copy-pasta error with upcoming changes
* removed a word just to see what happens
* putting the "are" back
* removed the HIP changes; they were in 6.2.0
* corrected some formatting errors
* changed some wording
* changed a word
* reworded the known issues
* added info for rocAL 2.0.0
* Updated the wording on the rocAL changes
* made some small changes.
* minor wording change
* Updated with more components for RC3 (#149)
* first pass of the release notes for 6.2.1
* something went wrong building the relnotes the first time; this should be OKer
* Partially complete release notes for 6.2.1
* added a line about there being no OS changes in 6.2.1 relative to 6.2.0
* Updated version and date
* made wording changes and added documentation highlights
* added information about rocdbgapi
* Changed 'key' to 'notable'; clarified that changes are from 6.2.0 to 6.2.1; clarified the open-source nature of the documentation; brought a note back.
* updated the release date in conf.py; removed added API calls for HIP; added fixed issues to rocdbgapi
* changed the opening intro to Known Issues
* fixed the major copy-pasta error with upcoming changes
* removed a word just to see what happens
* putting the "are" back
* removed the HIP changes; they were in 6.2.0
* corrected some formatting errors
* changed some wording
* changed a word
* reworded the known issues
* added info for rocAL 2.0.0
* Updated the wording on the rocAL changes
* made some small changes.
* minor wording change
* added more component changes
* Small changes to wording, punctuation; fixed a list (#150)
* first pass of the release notes for 6.2.1
* something went wrong building the relnotes the first time; this should be OKer
* Partially complete release notes for 6.2.1
* added a line about there being no OS changes in 6.2.1 relative to 6.2.0
* Updated version and date
* made wording changes and added documentation highlights
* added information about rocdbgapi
* Changed 'key' to 'notable'; clarified that changes are from 6.2.0 to 6.2.1; clarified the open-source nature of the documentation; brought a note back.
* updated the release date in conf.py; removed added API calls for HIP; added fixed issues to rocdbgapi
* changed the opening intro to Known Issues
* fixed the major copy-pasta error with upcoming changes
* removed a word just to see what happens
* putting the "are" back
* removed the HIP changes; they were in 6.2.0
* corrected some formatting errors
* changed some wording
* changed a word
* reworded the known issues
* added info for rocAL 2.0.0
* Updated the wording on the rocAL changes
* made some small changes.
* minor wording change
* added more component changes
* fixed a bad table; made some minor changes to punctuation and spelling.
* Updated versions and removed previous release notes. (#151)
* first pass of the release notes for 6.2.1
* something went wrong building the relnotes the first time; this should be OKer
* Partially complete release notes for 6.2.1
* added a line about there being no OS changes in 6.2.1 relative to 6.2.0
* Updated version and date
* made wording changes and added documentation highlights
* added information about rocdbgapi
* Changed 'key' to 'notable'; clarified that changes are from 6.2.0 to 6.2.1; clarified the open-source nature of the documentation; brought a note back.
* updated the release date in conf.py; removed added API calls for HIP; added fixed issues to rocdbgapi
* changed the opening intro to Known Issues
* fixed the major copy-pasta error with upcoming changes
* removed a word just to see what happens
* putting the "are" back
* removed the HIP changes; they were in 6.2.0
* corrected some formatting errors
* changed some wording
* changed a word
* reworded the known issues
* added info for rocAL 2.0.0
* Updated the wording on the rocAL changes
* made some small changes.
* minor wording change
* added more component changes
* fixed a bad table; made some minor changes to punctuation and spelling.
* Updated the hipify version from 6.2.0 to 6.2.1, because its version tracks the ROCm version
* undid the hipify version change, but updated the version of amd smi
* removed the previous release notes.
* Update to highlights, SMI, small fixes (#152)
* first pass of the release notes for 6.2.1
* something went wrong building the relnotes the first time; this should be OKer
* Partially complete release notes for 6.2.1
* added a line about there being no OS changes in 6.2.1 relative to 6.2.0
* Updated version and date
* made wording changes and added documentation highlights
* added information about rocdbgapi
* Changed 'key' to 'notable'; clarified that changes are from 6.2.0 to 6.2.1; clarified the open-source nature of the documentation; brought a note back.
* updated the release date in conf.py; removed added API calls for HIP; added fixed issues to rocdbgapi
* changed the opening intro to Known Issues
* fixed the major copy-pasta error with upcoming changes
* removed a word just to see what happens
* putting the "are" back
* removed the HIP changes; they were in 6.2.0
* corrected some formatting errors
* changed some wording
* changed a word
* reworded the known issues
* added info for rocAL 2.0.0
* Updated the wording on the rocAL changes
* made some small changes.
* minor wording change
* added more component changes
* fixed a bad table; made some minor changes to punctuation and spelling.
* Updated the hipify version from 6.2.0 to 6.2.1, because its version tracks the ROCm version
* undid the hipify version change, but updated the version of amd smi
* removed the previous release notes.
* updated release date to Sept 12
* modified the ROCm SMI entry; workaround reworded and put into known issues; one line added to resolved issues
* Added the FBGEMM support highlight
* Updated the known issues wording for rocAL (#153)
* first pass of the release notes for 6.2.1
* something went wrong building the relnotes the first time; this should be OKer
* Partially complete release notes for 6.2.1
* added a line about there being no OS changes in 6.2.1 relative to 6.2.0
* Updated version and date
* made wording changes and added documentation highlights
* added information about rocdbgapi
* Changed 'key' to 'notable'; clarified that changes are from 6.2.0 to 6.2.1; clarified the open-source nature of the documentation; brought a note back.
* updated the release date in conf.py; removed added API calls for HIP; added fixed issues to rocdbgapi
* changed the opening intro to Known Issues
* fixed the major copy-pasta error with upcoming changes
* removed a word just to see what happens
* putting the "are" back
* removed the HIP changes; they were in 6.2.0
* corrected some formatting errors
* changed some wording
* changed a word
* reworded the known issues
* added info for rocAL 2.0.0
* Updated the wording on the rocAL changes
* made some small changes.
* minor wording change
* added more component changes
* fixed a bad table; made some minor changes to punctuation and spelling.
* Updated the hipify version from 6.2.0 to 6.2.1, because its version tracks the ROCm version
* undid the hipify version change, but updated the version of amd smi
* removed the previous release notes.
* updated release date to Sept 12
* modified the ROCm SMI entry; workaround reworded and put into known issues; one line added to resolved issues
* Added the FBGEMM support highlight
* updated wording on rocAL known issues
* small fixes (#155)
* first pass of the release notes for 6.2.1
* something went wrong building the relnotes the first time; this should be OKer
* Partially complete release notes for 6.2.1
* added a line about there being no OS changes in 6.2.1 relative to 6.2.0
* Updated version and date
* made wording changes and added documentation highlights
* added information about rocdbgapi
* Changed 'key' to 'notable'; clarified that changes are from 6.2.0 to 6.2.1; clarified the open-source nature of the documentation; brought a note back.
* updated the release date in conf.py; removed added API calls for HIP; added fixed issues to rocdbgapi
* changed the opening intro to Known Issues
* fixed the major copy-pasta error with upcoming changes
* removed a word just to see what happens
* putting the "are" back
* removed the HIP changes; they were in 6.2.0
* corrected some formatting errors
* changed some wording
* changed a word
* reworded the known issues
* added info for rocAL 2.0.0
* Updated the wording on the rocAL changes
* made some small changes.
* minor wording change
* added more component changes
* fixed a bad table; made some minor changes to punctuation and spelling.
* Updated the hipify version from 6.2.0 to 6.2.1, because its version tracks the ROCm version
* undid the hipify version change, but updated the version of amd smi
* removed the previous release notes.
* updated release date to Sept 12
* modified the ROCm SMI entry; workaround reworded and put into known issues; one line added to resolved issues
* Added the FBGEMM support highlight
* updated wording on rocAL known issues
* made some small edits
* removed a stray "notable" (#156)
* first pass of the release notes for 6.2.1
* something went wrong building the relnotes the first time; this should be OKer
* Partially complete release notes for 6.2.1
* added a line about there being no OS changes in 6.2.1 relative to 6.2.0
* Updated version and date
* made wording changes and added documentation highlights
* added information about rocdbgapi
* Changed 'key' to 'notable'; clarified that changes are from 6.2.0 to 6.2.1; clarified the open-source nature of the documentation; brought a note back.
* updated the release date in conf.py; removed added API calls for HIP; added fixed issues to rocdbgapi
* changed the opening intro to Known Issues
* fixed the major copy-pasta error with upcoming changes
* removed a word just to see what happens
* putting the "are" back
* removed the HIP changes; they were in 6.2.0
* corrected some formatting errors
* changed some wording
* changed a word
* reworded the known issues
* added info for rocAL 2.0.0
* Updated the wording on the rocAL changes
* made some small changes.
* minor wording change
* added more component changes
* fixed a bad table; made some minor changes to punctuation and spelling.
* Updated the hipify version from 6.2.0 to 6.2.1, because its version tracks the ROCm version
* undid the hipify version change, but updated the version of amd smi
* removed the previous release notes.
* updated release date to Sept 12
* modified the ROCm SMI entry; workaround reworded and put into known issues; one line added to resolved issues
* Added the FBGEMM support highlight
* updated wording on rocAL known issues
* made some small edits
* removed a stray 'notable'
* Added offline installer highlight (#157)
* first pass of the release notes for 6.2.1
* something went wrong building the relnotes the first time; this should be OKer
* Partially complete release notes for 6.2.1
* added a line about there being no OS changes in 6.2.1 relative to 6.2.0
* Updated version and date
* made wording changes and added documentation highlights
* added information about rocdbgapi
* Changed 'key' to 'notable'; clarified that changes are from 6.2.0 to 6.2.1; clarified the open-source nature of the documentation; brought a note back.
* updated the release date in conf.py; removed added API calls for HIP; added fixed issues to rocdbgapi
* changed the opening intro to Known Issues
* fixed the major copy-pasta error with upcoming changes
* removed a word just to see what happens
* putting the "are" back
* removed the HIP changes; they were in 6.2.0
* corrected some formatting errors
* changed some wording
* changed a word
* reworded the known issues
* added info for rocAL 2.0.0
* Updated the wording on the rocAL changes
* made some small changes.
* minor wording change
* added more component changes
* fixed a bad table; made some minor changes to punctuation and spelling.
* Updated the hipify version from 6.2.0 to 6.2.1, because its version tracks the ROCm version
* undid the hipify version change, but updated the version of amd smi
* removed the previous release notes.
* updated release date to Sept 12
* modified the ROCm SMI entry; workaround reworded and put into known issues; one line added to resolved issues
* Added the FBGEMM support highlight
* updated wording on rocAL known issues
* made some small edits
* removed a stray 'notable'
* added offline installer highlight
* added link to offline installer; aligned release notes with other FBGEMM doc (#158)
* first pass of the release notes for 6.2.1
* something went wrong building the relnotes the first time; this should be OKer
* Partially complete release notes for 6.2.1
* added a line about there being no OS changes in 6.2.1 relative to 6.2.0
* Updated version and date
* made wording changes and added documentation highlights
* added information about rocdbgapi
* Changed 'key' to 'notable'; clarified that changes are from 6.2.0 to 6.2.1; clarified the open-source nature of the documentation; brought a note back.
* updated the release date in conf.py; removed added API calls for HIP; added fixed issues to rocdbgapi
* changed the opening intro to Known Issues
* fixed the major copy-pasta error with upcoming changes
* removed a word just to see what happens
* putting the "are" back
* removed the HIP changes; they were in 6.2.0
* corrected some formatting errors
* changed some wording
* changed a word
* reworded the known issues
* added info for rocAL 2.0.0
* Updated the wording on the rocAL changes
* made some small changes.
* minor wording change
* added more component changes
* fixed a bad table; made some minor changes to punctuation and spelling.
* Updated the hipify version from 6.2.0 to 6.2.1, because its version tracks the ROCm version
* undid the hipify version change, but updated the version of amd smi
* removed the previous release notes.
* updated release date to Sept 12
* modified the ROCm SMI entry; workaround reworded and put into known issues; one line added to resolved issues
* Added the FBGEMM support highlight
* updated wording on rocAL known issues
* made some small edits
* removed a stray 'notable'
* added offline installer highlight
* added a link to the offline installer doc; removed the second uppercase E in the FBGEMM long form to align with the other documentation
* fixed a link that had to go to latest rather than to 6.2.1
* trying to trigger a pr
* undoing the last change
* changed a link; fixed wording; added a 'removals' section for one component (#159)
* first pass of the release notes for 6.2.1
* something went wrong building the relnotes the first time; this should be OKer
* Partially complete release notes for 6.2.1
* added a line about there being no OS changes in 6.2.1 relative to 6.2.0
* Updated version and date
* made wording changes and added documentation highlights
* added information about rocdbgapi
* Changed 'key' to 'notable'; clarified that changes are from 6.2.0 to 6.2.1; clarified the open-source nature of the documentation; brought a note back.
* updated the release date in conf.py; removed added API calls for HIP; added fixed issues to rocdbgapi
* changed the opening intro to Known Issues
* fixed the major copy-pasta error with upcoming changes
* removed a word just to see what happens
* putting the "are" back
* removed the HIP changes; they were in 6.2.0
* corrected some formatting errors
* changed some wording
* changed a word
* reworded the known issues
* added info for rocAL 2.0.0
* Updated the wording on the rocAL changes
* made some small changes.
* minor wording change
* added more component changes
* fixed a bad table; made some minor changes to punctuation and spelling.
* Updated the hipify version from 6.2.0 to 6.2.1, because its version tracks the ROCm version
* undid the hipify version change, but updated the version of amd smi
* removed the previous release notes.
* updated release date to Sept 12
* modified the ROCm SMI entry; workaround reworded and put into known issues; one line added to resolved issues
* Added the FBGEMM support highlight
* updated wording on rocAL known issues
* made some small edits
* removed a stray 'notable'
* added offline installer highlight
* added a link to the offline installer doc; removed the second uppercase E in the FBGEMM long form to align with the other documentation
* fixed a link that had to go to latest rather than to 6.2.1
* trying to trigger a pr
* undoing the last change
* changed a link; fixed wording; added a 'removals' section for one component
* fixed broken links (#160)
* first pass of the release notes for 6.2.1
* something went wrong building the relnotes the first time; this should be OKer
* Partially complete release notes for 6.2.1
* added a line about there being no OS changes in 6.2.1 relative to 6.2.0
* Updated version and date
* made wording changes and added documentation highlights
* added information about rocdbgapi
* Changed 'key' to 'notable'; clarified that changes are from 6.2.0 to 6.2.1; clarified the open-source nature of the documentation; brought a note back.
* updated the release date in conf.py; removed added API calls for HIP; added fixed issues to rocdbgapi
* changed the opening intro to Known Issues
* fixed the major copy-pasta error with upcoming changes
* removed a word just to see what happens
* putting the "are" back
* removed the HIP changes; they were in 6.2.0
* corrected some formatting errors
* changed some wording
* changed a word
* reworded the known issues
* added info for rocAL 2.0.0
* Updated the wording on the rocAL changes
* made some small changes.
* minor wording change
* added more component changes
* fixed a bad table; made some minor changes to punctuation and spelling.
* Updated the hipify version from 6.2.0 to 6.2.1, because its version tracks the ROCm version
* undid the hipify version change, but updated the version of amd smi
* removed the previous release notes.
* updated release date to Sept 12
* modified the ROCm SMI entry; workaround reworded and put into known issues; one line added to resolved issues
* Added the FBGEMM support highlight
* updated wording on rocAL known issues
* made some small edits
* removed a stray 'notable'
* added offline installer highlight
* added a link to the offline installer doc; removed the second uppercase E in the FBGEMM long form to align with the other documentation
* fixed a link that had to go to latest rather than to 6.2.1
* trying to trigger a pr
* undoing the last change
* changed a link; fixed wording; added a 'removals' section for one component
* fixed up the list for rocAL to make it more compact
* fixed broken links to component documentation
* updated the links again and removed rocAL optimization and known issues (#161)
* first pass of the release notes for 6.2.1
* something went wrong building the relnotes the first time; this should be OKer
* Partially complete release notes for 6.2.1
* added a line about there being no OS changes in 6.2.1 relative to 6.2.0
* Updated version and date
* made wording changes and added documentation highlights
* added information about rocdbgapi
* Changed 'key' to 'notable'; clarified that changes are from 6.2.0 to 6.2.1; clarified the open-source nature of the documentation; brought a note back.
* updated the release date in conf.py; removed added API calls for HIP; added fixed issues to rocdbgapi
* changed the opening intro to Known Issues
* fixed the major copy-pasta error with upcoming changes
* removed a word just to see what happens
* putting the "are" back
* removed the HIP changes; they were in 6.2.0
* corrected some formatting errors
* changed some wording
* changed a word
* reworded the known issues
* added info for rocAL 2.0.0
* Updated the wording on the rocAL changes
* made some small changes.
* minor wording change
* added more component changes
* fixed a bad table; made some minor changes to punctuation and spelling.
* Updated the hipify version from 6.2.0 to 6.2.1, because its version tracks the ROCm version
* undid the hipify version change, but updated the version of amd smi
* removed the previous release notes.
* updated release date to Sept 12
* modified the ROCm SMI entry; workaround reworded and put into known issues; one line added to resolved issues
* Added the FBGEMM support highlight
* updated wording on rocAL known issues
* made some small edits
* removed a stray 'notable'
* added offline installer highlight
* added a link to the offline installer doc; removed the second uppercase E in the FBGEMM long form to align with the other documentation
* fixed a link that had to go to latest rather than to 6.2.1
* trying to trigger a pr
* undoing the last change
* changed a link; fixed wording; added a 'removals' section for one component
* fixed up the list for rocAL to make it more compact
* fixed broken links to component documentation
* Removed optimizations and known issues from rocal
* updated doc links of 404ing components to their readthedocs documentation. Tensile won't be released until later so the link goes to github. Will need to double-check links after release to make sure they still work.
* updated release date (#163)
* first pass of the release notes for 6.2.1
* something went wrong building the relnotes the first time; this should be OKer
* Partially complete release notes for 6.2.1
* added a line about there being no OS changes in 6.2.1 relative to 6.2.0
* Updated version and date
* made wording changes and added documentation highlights
* added information about rocdbgapi
* Changed 'key' to 'notable'; clarified that changes are from 6.2.0 to 6.2.1; clarified the open-source nature of the documentation; brought a note back.
* updated the release date in conf.py; removed added API calls for HIP; added fixed issues to rocdbgapi
* changed the opening intro to Known Issues
* fixed the major copy-pasta error with upcoming changes
* removed a word just to see what happens
* putting the "are" back
* removed the HIP changes; they were in 6.2.0
* corrected some formatting errors
* changed some wording
* changed a word
* reworded the known issues
* added info for rocAL 2.0.0
* Updated the wording on the rocAL changes
* made some small changes.
* minor wording change
* added more component changes
* fixed a bad table; made some minor changes to punctuation and spelling.
* The hipify version needs to be updated because it tracks the ROCm version, so it went from 6.2.0 to 6.2.1
* undid the hipify version change, but updated the version of amd smi
* removed the previous release notes.
* updated release date to Sept 12
* modified the ROCm SMI entry; workaround reworded and put into known issues; one line added to resolved issues
* Added the FBGEMM support highlight
* updated wording on rocAL known issues
* made some small edits
* removed a stray 'notable'
* added offline installer highlight
* added a link to the offline installer doc; removed the second uppercase E in FBGEMM long-form to align with the other documentation
* fixed a link that had to go to latest rather than to 6.2.1
* trying to trigger a pr
* undoing the last change
* changed a link; fixed wording; added a 'removals' section for one component
* fixed up the list for rocAL to make it more compact
* fixed broken links to component documentation
* Removed optimizations and known issues from rocal
* updated doc links of 404ing components to their readthedocs documentation. Tensile won't be released until later so the link goes to github. Will need to double-check links after release to make sure they still work.
* updated release date
* small changes (#165)
* small changes
* Moved known issue to omnitrace (#166)
* moved known issue to omnitrace
* tweaked omnitrace wording (#167)
* tweaked the omnitrace workaround language to be more precise
* fixed rocdbgapi (#168)
* fixed ROCdbgapi
* Changed wording in offline installer changes (#169)
* Updated wording for Offline Installer changes
* Updated to show no new Known Issues. (#170)
* changed Known Issues to say that there are no known issues
* updated the upcoming changes (#171)
* added rccl plugin removal
* added lack of mi300x support to hardware (#172)
* added lack of MI300X support
* removed a contraction (#173)
* i don't like contractions. the irony
* Changed the link in known issues (#174)
* fixed the label in known issues github link and also changed it from being a link to known issues to issues, since there are no verified known issues at this point
* removed link to github and reference to the list of known issues
* remove "6.2.1 does not support MI300X" and add MI300X GPU recovery failure KI
* update words
* removed info re. rocdbgapi known issues (#176)
* Added point about version change to rocal
* Put link to prerequisites in rocal
---------
Co-authored-by: Peter Park <peter.park@amd.com>
* adding preliminary compatibility matrix data for 6.2.1
* bump up some version numbers from 6.2.0 to 6.2.1
* adding kernel versions to compatibility matrix. I hate it
* add kernel version lookup table, in dropdown list
* add KFD and User space support. Also adjust some meta data keywords
* update 6.2.1 RC2 versions
* make spelling linter happy
* remove kernel versions from table, just reference LUT below
* Leave kernel lookup table expanded
* update kernel version table
* remove kernels from historical matrix, update footnotes
* move historical matrix into compatibility folder
* update historical matrix paths
* version bumps for RC3
* RC4 has no other version bumps. Reorder RPP alphabetically
* change How-To card hue to purple
- Add roctracer dependency to hipBLASLt build to address recent failures.
- Change build pool to ultra due to increased build times.
- Enable ccache to help with build times.
- Referred to public documentation, build instructions, source code in tests directory, and iterative runs to modify build flags.
- rdci test failures are a known issue: rocprofiler is a singleton, but gtest attempts to spawn multiple instances. There is an internal ticket to track the issue.
Referred to public documentation, build instructions, and iterative debug runs to update build flags, publish new artifacts, and run tests. Test results are not parsed and graphed in Azure.
40% pass rate for this initial pass. Would like to push this through to at least change the build process and then defer fixing the remaining test failures.
- Test results are not parsed to be graphed in Azure reports.
- Added ccache to potentially improve build times, keyed against the date and hash based on amdclang++ binary.
* Add FBGEMM/FBGEMM_GPU to the Model acceleration libraries page
* Add words to wordlist and fix a typo
* Add new sections for Docker and testing
* Incorporate comments from the external review
* Some minor edits and clarifications
* Incorporate further review comments and fix test section
* Add comment to test section
* Change git clone command for FBGEMM repo
* Change Docker command
* Changes from internal review
* Fix linting issue
Replace cmake calls with bash script calls to compile the components comprising openmp-extras.
Added inline comments to describe the bash scripts from aomp repo being executed.
- Added steps for creating wheel file for torchvision.
- Tried to add torchaudio as well, but it was not reading in AMDGPU_TARGETS value in the nested cmake calls from the python setup.py execution.
- Upstream pytorch builder scripts were updated, so it broke the patching step in the job. Removed the need to patch by using a flag to skip the tests.
- Will work on adding smoke tests of pytorch and torchvision later, just getting this out to fix the nightly build.
* Add introduction and links to the new guide to the vLLM optimized Docker image on AMD Infinity Hub
* Update target link for the Docker vLLM guide
* Change target URL
* Change link target URL again
* Added all variables found in the library page on Azure
* removed extra space
* copied the example of referencing variables from variables-global.yml and add HALF560_PIPELINE_ID to the file
* introduced variables-global.yml to this file and pointed the path to variables.CCACHE_DIR
* introduced variables-global.yml and changed all variables in stagingPipelineIdentifiers and taggedPipelineIdentifiers to match the identifier names in variables-global.yml
* adjusted how the variables are introduced into the file
* tried adding ./ to variables-global.yml path
* copied the formatting from develop branch but changed identifiers to match them in variables-global.yml
* changed build pool to high to test if variable works
* recopied variables from library page to account for any changes
* changed build pool back to medium
* removed extra whitespace
* remove whitespace
* added all the variables from the page on azure
* fix merge
fix merge
---------
Co-authored-by: Daniel Su <danielsu@amd.com>
* move precision_support to reference
* add rocPyDecode to AI
* Use CSS style sheets for Card titles
* remove temp folder and files
* add card hues
* shuffle hues
* update requirements
* add hues test
* add hues test2
* select hues
* remove hues test
* use hues and add gutters
* sync TOC and index titles
* once more through the TOC
- Updating pipelines to account for combined repo changes of ROCR-Runtime and ROCT-Thunk-Interface.
- Removed dependencies referring ROCT-Thunk-Interface since it is now in the ROCR-Runtime repo.
- Changed ROCR-Runtime build command to account for directory changes.
* add rocAL, hipCC, CLR. Rearrange order of some items to align with stack diagram. Update UCC versions
* update llvm-project to point to docs page instead of GitHub
* Add a section on increasing memory allocation to the MI300A system optimization guide
* Addition to wordlist
* Change GB to GiB for consistency
* Standardize GiB/KiB spacing
* Minor wording changes
* Rewrote the section to be minimalist and not specify the number of ways to provide feedback. Also removed the PR info since that's covered in Contributing.
* Update feedback.md
Got feedback from Leo about how to improve on this and make it conform to the style guide. Updated with changes based on that feedback.
Extends PR #3544 with additional logic for ROCm dependency downloads: the GPU target is now taken into account for components that can specify a GPU target when building, and for components that depend directly on them. Also refactors if statements to reduce lines of code.
Adding support for parallel build jobs where the only difference is the singular GPU target. This allows nightly packaging jobs to pick and choose based on GPU target to reduce download size.
To accommodate this new feature producing multiple artifacts for a component, added support for a file filter when downloading a ROCm component using the format "componentName:fileFilter".
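The "componentName:fileFilter" format described above could be parsed along these lines. This is a sketch under stated assumptions: the names `ComponentSpec`, `parse_component_spec`, and `select_artifacts` are hypothetical, and treating the filter as a plain substring match is an assumption; the commit message does not say how the filter is applied.

```python
from typing import List, NamedTuple, Optional


class ComponentSpec(NamedTuple):
    name: str                    # component name, e.g. "rocBLAS"
    file_filter: Optional[str]   # optional filter, e.g. "gfx942"


def parse_component_spec(spec: str) -> ComponentSpec:
    """Split "componentName:fileFilter"; the filter part is optional."""
    name, _, file_filter = spec.partition(":")
    if not name:
        raise ValueError(f"missing component name in {spec!r}")
    return ComponentSpec(name, file_filter or None)


def select_artifacts(spec: ComponentSpec, filenames: List[str]) -> List[str]:
    """Keep only files matching the filter (assumed: substring match)."""
    if spec.file_filter is None:
        return list(filenames)
    return [f for f in filenames if spec.file_filter in f]
```

With a spec such as `rocBLAS:gfx942`, a job would download only the artifact built for that GPU target, while a bare `rocminfo` keeps every file.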
* initial commit for placeholder 6.2 data
* fix TensorFlow versions, and LLVM/OpenMP version strings
* add third column with 6.1.0 as last column. Update some versions from Peter's review comments
* reduce RPP name
* remove trailing comma
* reduce length of 3rd party communications libs title
* change footnote for 6.2 to remove mention of MI300A
* remove TransferBench
* change from 6.1.0 to 6.0.0 data in last column
* fixing a few version numbers
* add rocprofiler-sdk version
* fix omnitrace version
* adding full matrix, 2 different views
* add copying csv in conf.py
* 6.2 content edits, and change subheadings to remove :, renamed a few as Leo suggested
* add Framework anchor within compat matrix, and fix linting error
* categorized tools
* update Cub/Thrust versions, abbreviate Management
* remove the dedicated historical page
* WIP commit, added anchors in compat matrix, along with anchor test code
* check 6.1.1 and 6.0.2 versions, add anchors thru table
* audit 6.2 RC4 versions against table, remove clang-ocl, and update hip-other version
* avoid linting
* MI300A system optimization guide internal draft
* Small changes to System BIOS paragraph
* Some minor edits
* Changes after external review feedback
* Add CPU Affinity debug setting
* Edit CPU Affinity debug setting
* Changes from external discussion
* Add glossary and other small fixes
* Additional changes from the review
* Update the IOMMU guidance
* Change description of CPU affinity setting
* Slight rewording
* Change Debian to Red Hat-based
* A few changes from the second internal review
* Add MI300X tuning guides
Add mi300x doc (pandoc conversion)
fix headings
add metadata
move images to shared/
move images to shared/
convert tuning-guides.md to rst using pandoc
add mi300x to tuning-guides.rst landing page
update h1s, toc, and landing page
fix spelling
fix fmt
format code blocks
add tensilelite imgs
fix formatting
fix formatting some more
fix formatting
more formatting
spelling
remove --enforce-eager note
satisfy spellcheck linter
more spelling
add fixes from hongxia
fix env var in D5
add fixes to PyTorch inductor section
fix
fix
Update docs/how-to/tuning-guides/mi300x.rst
Co-authored-by: Hongxia Yang <62075498+hongxiayang@users.noreply.github.com>
Update docs/how-to/tuning-guides/mi300x.rst
Co-authored-by: Hongxia Yang <62075498+hongxiayang@users.noreply.github.com>
Update docs/how-to/tuning-guides/mi300x.rst
Co-authored-by: Hongxia Yang <62075498+hongxiayang@users.noreply.github.com>
Update docs/how-to/tuning-guides/mi300x.rst
Co-authored-by: Hongxia Yang <62075498+hongxiayang@users.noreply.github.com>
Update docs/how-to/tuning-guides/mi300x.rst
Co-authored-by: Hongxia Yang <62075498+hongxiayang@users.noreply.github.com>
Update docs/how-to/tuning-guides/mi300x.rst
Co-authored-by: Hongxia Yang <62075498+hongxiayang@users.noreply.github.com>
Update docs/how-to/tuning-guides/mi300x.rst
Co-authored-by: Hongxia Yang <62075498+hongxiayang@users.noreply.github.com>
Update docs/how-to/tuning-guides/mi300x.rst
Co-authored-by: Hongxia Yang <62075498+hongxiayang@users.noreply.github.com>
Update docs/how-to/tuning-guides/mi300x.rst
Co-authored-by: Hongxia Yang <62075498+hongxiayang@users.noreply.github.com>
Update docs/how-to/tuning-guides/mi300x.rst
Co-authored-by: Hongxia Yang <62075498+hongxiayang@users.noreply.github.com>
Update docs/how-to/tuning-guides/mi300x.rst
Co-authored-by: Hongxia Yang <62075498+hongxiayang@users.noreply.github.com>
Update docs/how-to/tuning-guides/mi300x.rst
Co-authored-by: Hongxia Yang <62075498+hongxiayang@users.noreply.github.com>
Update 'torch_compile_debug' suggestion based on Hongxia's feedback
fix PyTorch inductor env vars
minor formatting fixes
Apply suggestions from code review
Co-authored-by: Hongxia Yang <62075498+hongxiayang@users.noreply.github.com>
Update vllm path
Co-authored-by: Hongxia Yang <62075498+hongxiayang@users.noreply.github.com>
disable numfig in Sphinx configuration
fix formatting and capitalization
add words to wordlist
update index
update wordlist
update optimizing-triton-kernel
convert cards to table
fix link in index.md
add @lpaoletti's feedback
Add system tuning guide
add images
add system section
add os settings and sys management
remove pcie=noats recommendation
reorg
add blurb to developer section
impr formatting
remove windows os from tuning guides pages in conf.py
add suggestions from review
fix typo and link
remove os windows from relevant pages in conf
mi300x
add suggestions from review
fix toc
fix index links
reorg
update vLLM vars
Co-authored-by: Hongxia Yang <62075498+hongxiayang@users.noreply.github.com>
update vLLM vars
Co-authored-by: Hongxia Yang <62075498+hongxiayang@users.noreply.github.com>
reorganize
add warnings
add text to system tuning
add filler text on index pages
reorg tuning pages
fix links
fix vars
* rm old pages
fix toc
* add suggestions from review
small change
add more suggestions
rewrite intro
* add 'workload tuning philosophy'
* refactor
* fix broken links
* black format conf.py
* simplify cmd and update doc structure
* add higher-level heading for consistency (mi300x.rst)
* add fixes from review
fix url
add fixes
fix formatting
fix fmt
fix hipBLASLt section
change words
fix tensilelite section
fix
fix
fix fmt
* style guide
* fix some formatting
* satisfy spellcheck linter
* update wordlist
* fix bad conflict resolution
* Switch all pipeline gpu targets to gfx942
* Change more pipelines target to gfx942
* set variables for manual testing
* add test pipeline id
* revert test changes
* correct gpu target name
* remove unused flags; change hipSPARSELt target to be gfx942
* added professional graphic
to replace hand modified
* Update deep-learning-rocm.rst
update image reference
* Delete docs/data/how-to/framework_install_2024_05_23-update.png
replace with renamed file with correct date
* Add files via upload
updated date in file name
* Update deep-learning-rocm.rst
corrected image name to reflect new date
* Update deep-learning-rocm.rst
corrected file name
* Add files via upload
correct name
* Delete docs/data/how-to/framework_install_2024_07-04.png
name format incorrect
* Update deep-learning-rocm.rst
correct image name
* add CXX flag
* add CXX flag
* Update ROCmValidationSuite.yml
* Change googletest to libgtest-dev
* Update ROCmValidationSuite.yml
* Update ROCmValidationSuite.yml
* add ROCM_PATH as env var
* add HIP_INC_DIR
* remove manual test variables
* set variables for manual test
* remove CMAKE_CXX_COMPILER flag
* Set link to redirect llvm folder
* correct indentation
* remove manual test variables
* rename task
* update CLR docs reference
* Apply suggestions from code review
Co-authored-by: Peter Park <peter.park@amd.com>
---------
Co-authored-by: amitkumar-amd <Amit.Kumar6@amd.com>
Co-authored-by: Peter Park <peter.park@amd.com>
* Use components.xml instead of default.xml
* Rm unused var
* Use category instead of group
* Add group and category
* Change changelog template
* Conditional display
* Remove sort
* Add mappings
* Jinja does not track state
* Handle dupe logic in python
* Construct doc page and repo url
* Add repo url
* Add doc page
* Avoid using bare URL
* Add None key
* Test release notes
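The "Jinja does not track state" and "Handle dupe logic in python" items above point at a common pattern: a Jinja `for` loop cannot carry an assignment from one iteration to the next, so duplicates are easier to drop before the data reaches the template. A minimal sketch of that pattern follows; the function name `dedupe_by_name` and the `name` key are hypothetical and not taken from the actual script.

```python
from typing import Dict, Iterable, List


def dedupe_by_name(entries: Iterable[Dict[str, str]]) -> List[Dict[str, str]]:
    """Drop repeated components, keeping the first occurrence in order.

    Hypothetical helper: assumes each entry is a dict with a "name" key.
    """
    seen = set()
    unique = []
    for entry in entries:
        name = entry["name"]
        if name in seen:
            continue  # already emitted once, skip the repeat
        seen.add(name)
        unique.append(entry)
    return unique
```

The template then only has to iterate over the already-unique list, with no state to maintain inside the loop.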
[Why]
To maintain the "pitchfork layout" convention used by the repository.
[How]
- Update README.md
- Update INFRA_REPO in ROCm.mk
- Updated to new path: ROCm/tools/rocm-build
---------
Signed-off-by: David Galiffi <David.Galiffi@amd.com>
Co-authored-by: David Galiffi <dgaliffi@amd.com>
With MIOpen now building with latest source on External CI, this unblocked AMDMIGraphX from building with latest source.
Determined rocMLIR also needed to be built with latest source as a dependency.
* Fix first link in compatibility matrix table
* Revert "Fix first link in compatibility matrix table"
This reverts commit 069c5c116a.
* Remove sticky header and unused css
* Remove container from hardware specs matrix
---------
Co-authored-by: Peter Jun Park <peter.park@amd.com>
* Regenerate changelog
* Add component changelogs and known issue
Fix RELEASE.md headings
Update pub datestamp for 6.1.2
Add AMDSMI and ROCm SMI to 6.1.2 template
Add rccl and rocBLAS
Update intro blurb and headings
Add ROCm SMI fix
Add missed heading to AMDSMI
Update datestamp and release version number
Update version and release number
Add known issue re: MI300X error detection
Words
Add issue link
Rm GitHub issue link
Move known issue down
Update ki wording
Remove "this issue has been investigated ... " from known issue
Fix changelog h1
* Reorg known issue, upcoming changes, remove rocDecode tested configurations
* Add fixes from review
* Add fixed issue link
* Fix heading
* Remove known issue
* Update the links for rocminfo and rocm-bandwidth-test
* Update the links for rocminfo and rocm-bandwidth-test
* Update the links for rocminfo and rocm-bandwidth-test
* Update links to intersphinx links
---------
Co-authored-by: Peter Jun Park <peter.park@amd.com>
* Add Fine Tuning LLMs how to guide
* Reorg and refactor Fine-tuning LLMs with ROCm
Update index and headings
Fix formatting and update toc
Split out content from index to overview.rst
Add metadata
Clean up overview
Add inference sections, fix rst errors, clean up single-gpu-fine-tuning
Combine fine-tuning and inference guides
Fix some links and formatting
Update toc and add formatting fixes
Add ck kernel fusion content
Update toc
Clean up model quantization and acceleration
Add CK images
Clean up profiling
Update triton kernel performance optimization
Update llm inference frameworks guide
Disable automatic number of figures and tables in Sphinx conf
Change tabs to spaces
Change heading to end with -ing
Add link fixes and heading updates
Add rocprof/Omniperf/Omnitrace section
Update profiling and debugging guide
Add formatting fixes
Satisfy spellcheck
Fix words
Delete unused file
Finish overview
Clean up first 4 sections
Multi-gpu fine-tuning guide: slight fixes
Update toc
Remove tabs
Formatting fixes
* Minor wording updates
* Add some clean-up
* Update profiling and debugging guide
* Fix Omnitrace link
* Update ck kernel fusion with latest
* Update CK formatting
* Fix perfetto link syntax
* Fix typos and add blurbs
* Add fixes to Triton optimization doc
* Tabify saving adapters / models section
* Fix linting errors - spellcheck
Fix spelling and grammar
Satisfy linter
Update wording in profiling guide
Add fixes to satisfy linter
More fixes for linting in Triton guide
More linting fixes
Spellcheck in CK guide
* Improve triton guide
Fix linting errors and optics
* Add occupancy / vgpr table
Change some wording
* Re-add tunableop
* Add missing indent in _toc.yml
* Remove ckProfiler references
* Add links to resources
* Add refs in CK optimization guide
* Rename files and fix internal links
* Organize tuning guides
Reorg triton
* Add compute unit diagram
* Remove AutoAWQ
* Add higher res image for Perfetto trace example
* Update link text
* Update fig nums
* Update some formatting
* Update "Inductor"
* Change "Inductor" to TorchInductor
* Add link to official TorchInductor docs
Template with bash commands to update cmake with snap.
Use template for two components that want updated cmake with latest source on their default branches.
* Add Using ROCm for AI
Add PyTorch Docker installation images
Split doc into subtopics
Add metadata
Clean up index
Clean up hugging face guide
Clean up installation guide
Fix rST formatting
Clean up install and train-a-model
Clean up MAD
Delete unused file
Add ref anchors and clean up MAD doc
Add formatting fixes
Update toc and section index
Format some code blocks
Remove install guide and update toc
Chop installation guide
Clean up deployment and hugging face sections
Change headings to end in -ing
Fix spelling in Training a model
Delete MAD and split out install content
Fix formatting
Change words to satisfy spellcheck linter
* Add review suggestions and add helpful links
Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com>
Add helpful links and add review suggestions
Remove fine-tuning link and links to D5 and MAGMA
Update docs/how-to/rocm-for-ai/deploy-your-model.rst
Co-authored-by: Young Hui - AMD <145490163+yhuiYH@users.noreply.github.com>
Update DeepSpeed link
Add subheading to ML framework installation and closing blurb to hugging face models guide
* Reorder topics
* removed docker and pointed ROCm deps to our existing builds
* removed vmImage tag for pool
* added pip to apt list and renamed from rocFFT to hipFFT
* fixed spelling mistakes in rocmDependencies
* added correct apt dep for pip
* removed leading slash in the cmake flags
* changed cxx_compiler to /rocm/bin/hipcc
* added llvm-project, ROCR-Runtime, clr, and rocminfo to rocm deps
* added rocFFT as a rocm dependency
* removed docker and added our builds for components
* removed rocFFT from rocm deps
* Fixed typo in rocFFT value
* added rocprofiler-register to rocFFT and fixed typo in the dependencies-rocm file
* changed cxx compiler to amdclang++
* fixed amdclang++ paths
* moving to faster machine
* added cmake module paths
* switched back to medium build
* added libopm-dev to apt deps
* added libomp-14-dev to apt deps
* added aomp as a rocm dep
* added aomp as a rocm dep
* added hipcc as the cxx_compiler
* reverted back to clang++ as the cxx_compiler
* removed unmentioned rocm deps from the readme
* removed docker
* added python3-pip as an apt dep
* fixed compiler paths
* added hipRAND as a rocm dep
* added print statements to see directory structure
* adding a print statement into /agent/_work/1/s/build/library
* added -Tensile_rocm_assembler as a build flag
* removed a broken script line
* added D to tensile rocm assembler
* added DROCM_PATH to build flags
* fixed typo
* changed build pool from medium to base
* changed build pool from base to low
* added env variables using josephs pr
* removed docker from hipBLASLt and added rocm dependencies that point to our builds
* added pip to the apt packages array
* changed cmake_cxx_compiler env var to amdclang++
* changed cmake_cxx_compiler env var to amdclang++
* changed cmake_cxx_compiler env var to hipcc
* changed cmake_cxx_compiler env var to hipcc
* changed clang to amdclang
* changed all refs mentioning hipcc to amdclang
* changed cmake_cxx_compiler back to hipcc
* added a HIP_PATH env var based off Tensile/Source/FindHIP.cmake
* added hipcc to HIP_PATH
* added rocm-cmake to rocm deps
* added rocRAND as a rocm dep
* removed dcmake_module flag
* added libomp-dev as an apt dep
* added aomp as a rocm dep
* added clang as an apt dep
* reverted changes back to how they appear in develop since this branch will be submitted for review
* removed unnecessary flags
* adding -DCMAKE_CXX_COMPILER=$(Agent.BuildDirectory)/rocm/llvm/bin/amdclang++ -DCMAKE_C_COMPILER=$(Agent.BuildDirectory)/rocm/llvm/bin/amdclang back to see if these are vital to a successful build
* removed newline character
* Disable aomp offload build for initial ci-build work
* Missing dependency for medium pool use of rocPRIM
* Latest rocBLAS source needs added ROCm dependencies
* Rename 'Tuning guides' to 'Hardware optimization'
* Move deep learning to Install section
* Change 'Hardware' to 'System' to align with index.md
* Satisfy spellcheck linter
* adding new framework install graphic with JAX
* Fix link to ROCm libraries list
* crop framework_install graphic
* Reset .wordlist.txt update
* Prettify deep learning framework installation page
* Change spacing in list of frameworks
---------
Co-authored-by: Young Hui <young.hui@amd.com>
aomp build is not triggered by changes to aomp repo, but by updates to llvm-project and ROCR-Runtime, so trigger definition can remain in this ROCm/ROCm repo.
Instead of using docker and apt install of ROCm component dependencies, use tarballs from Azure Pipeline builds to enable updates of ROCm interdependencies without waiting for releases.
* Update External CI Interdependencies for more repos
- composable_kernel
- hipBLAS
- rocBLAS
- rocSOLVER
Cleaned up unused flags from llvm-project
* Remove LD_LIBRARY_PATH change. Should not be needed.
- Fixed compilers to pick amdclang.
- Added ldconfig step for setting up linking of shared libraries.
- Set Azure VMs to medium only.
- Remove empty directories in published tarballs.
After examining the build products of recent builds and consuming them for other components, observed some additional flags should be added. Used rocm-build repo for reference.
Move HIPIFY from 6.1.1.md to 6.1.2.md
Regenerate changelog
Fix accidental autoformat in 6.1.1.md
Update 6.1.2.md and regen changelog
Add AMD SMI for ROCm 6.1.2
Regen changelog
Add rocDecode and update RELEASE.md
Update 6.1.2 intro blurb
Fix arrow symbol
Add (tm) to changelog.jinja template
Incorporate Leo's feedback
Intro blurb wording.
Add missed tested ROCm config (rocDecode)
Add OS support
Add version to release notes h1
Update intro blurb again
Make changelog filepath lowercase
Update blurb
Add extra line to 6.1.2 template
Fix heading in RELEASE
Fix amdsmi changelog link
Remove OS support notice
Add rocDecode to table
Add rocDecode to CL
Update rocDecode setup script note for clarity
Update AMD SMI changelog
Apply Leo's feedback
Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com>
To best determine hardware specs per repo, added more build pool options with varying number of vCPUs, RAM size, etc. and will kick off builds with test targets enabled to determine long-term cost values.
Co-authored-by: alexxu-amd <alexxu12@amd.com>
* Update Ubuntu kernel versions for 6.1.1 changelog and release notes
* Add link to GitHub issue for ROCm SMI in changelog and RN
* Fix ROCm SMI GH issue link
* Update kernel versions format
* Update kernel version format for readability
* Update kernel version brackets
- Updating build flags for llvm-project to support another pipeline to work with aomp repos.
- Added support for rocMLIR component.
- Removed MIVisionX python dependency script and leveraged existing dependencies template.
- Change to use cloud systems
* Add ROCm version 6.1.0 to version list (#3023)
* Update CHANGELOG.md
Added GitHub links to Changelog
* Update CHANGELOG.md
* Update manifest for ROCm 6.1.0 (#3022)
* Reorganize default.xml by group and alphabetically
* Add rocDecode to default.xml
* Add rocDecode to included names in tag script
* update tag to 6.1.0
---------
* Update CHANGELOG.md
Updated ROCm Compiler with fixed issue
* docs(tools/autotag/README.md): Add additional note to avoid duplicating data in changelog template (#3018)
* Bump rocm-docs-core from 0.38.1 to 1.0.0 in /docs/sphinx
Bumps [rocm-docs-core](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.38.1 to 1.0.0.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.38.1...v1.0.0)
---
updated-dependencies:
- dependency-name: rocm-docs-core
dependency-type: direct:production
update-type: version-update:semver-major
...
* Use Ubuntu 22.04 and Python 3.10 in RTD config
* Update README.md (#3043)
* Update README.md
Fix rocSPARSE build link
* Update link to just general page, instead of anchor
* Add 'JAX for ROCm' link to index.md (#3034)
* Add JAX for ROCm link to index.md
* Reorder third-party libraries installation guides in index
* Update links to rocAL component (#3033)
* Update links to rocAL component
* Change absolute rocm docs links to relative
* Update compatibility/precision-support links (#3030)
* Change links to component data type support pages from absolute to relative
* Fix rocPRIM data type support links
* Empty commit to trigger demo rebuild.
* Update excluded and included projects
* Separate templates into a module; Fix MIVisionX template
* Add hipfort changelog processor
* Add rpp custom processor
* Add custom processor for rvs
* update the code-owner list (#3046)
* Update default.xml (#3038)
* Remove HIPCC from default.xml
HIPCC moved into llvm-project
* Remove ROCm-Device-Libs from default.xml
ROCm-Device-Libs was moved into llvm-project
* Remove ROCm-CompilerSupport from default.xml
ROCm-CompilerSupport was moved into llvm-project
* Add rocprofiler-register to default.xml
Added in 6.1 manifest
* Apply mathlibs group to projects in manifest
* Bump rocm-docs-core from 0.38.1 to 1.0.0 in /docs/sphinx (#3047)
* Bump rocm-docs-core from 0.38.1 to 1.0.0 in /docs/sphinx
Bumps [rocm-docs-core](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.38.1 to 1.0.0.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.38.1...v1.0.0)
---
updated-dependencies:
- dependency-name: rocm-docs-core
dependency-type: direct:production
update-type: version-update:semver-major
...
* Set Ubuntu 22.04 and Python 3.10 in ReadtheDocs config
---------
* Add 6.1.0.md template
* Add AMD SMI to 6.1.0 template
* Add ROCm Compiler to 6.1.0 template
* Add RDC to 6.1.0 template
* Add ROCgdb to 6.1.0 template
* Add ROCm SMI to 6.1.0 template
* Add ROCProfiler to 6.1.0 template
* Add MI200 SR-IOV known issue to 6.1.0 template
* Add MI300 RAS fixed defect to 6.1.0 template
* docs(6.1.0.md): Add more changelog notes for 6.1.0
* Update 6.1.0.md
Added links to GitHub for known issues and ROCm Compiler fixed defect
* Test autotag script
* Add ck template
* Add HIPIFY to included names for tag script
* Remove rocprofiler from tag_script
* Remove RVS template
Determine cause of missing later
* Add HIPIFY to template for 6.1.0
* Add extra line to top of template for formatting changelog
* Update 5.7.1.md
Fixing the broken link for rocBLAS programmer's guide in 5.7.1 Changelog.
* Regenerate changelog with new 5.7.1 link fix
* Add note for tag_script included_names
* Improve readability of GPU architecture hardware specs (#3009)
* move units of measurement to table headers
* add glossary explaining table headers
* add missed units and update h1
* toc listing to indicate Accelerators & GPUs
* fix typo
* update meta description and keywords
* Update title in toc to fit in sidebar
* update title, toc, and filename
* Fix broken link to HIP programming guide
* Revert "update title, toc, and filename"
This reverts commit 6b9e687805.
* Revert glossary; slight fixes
* Change 'Pro' to 'PRO' for consistency
* Add references to programming and hardware architecture guides
* Change 'warp' to 'wavefront'
* Update changelog.jinja to exclude version number in header for individual libraries (#3058)
* Base set of Azure DevOps pipeline library source (#3021)
* Base set of Azure DevOps pipeline library source
A base set of yaml files to orchestrate the build and testing of ROCm compiler and runtime components in an Azure DevOps project.
* Use hipcc in llvm-project, also build OpenCL runtime.
* Adding llvm-lit tests to llvm-project pipeline.
Added comgr ctest as well.
* rocm-cmake unit testing in pipeline
* Pipeline changes corresponding to 6.1 release
* Bump rocm-docs-core from 1.0.0 to 1.1.0 in /docs/sphinx (#3063)
Bumps [rocm-docs-core](https://github.com/RadeonOpenCompute/rocm-docs-core) from 1.0.0 to 1.1.0.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v1.0.0...v1.1.0)
---
updated-dependencies:
- dependency-name: rocm-docs-core
dependency-type: direct:production
update-type: version-update:semver-minor
...
* Bump rocm-docs-core from 1.0.0 to 1.1.0 in /docs/sphinx
Bumps [rocm-docs-core](https://github.com/RadeonOpenCompute/rocm-docs-core) from 1.0.0 to 1.1.0.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v1.0.0...v1.1.0)
---
updated-dependencies:
- dependency-name: rocm-docs-core
dependency-type: direct:production
update-type: version-update:semver-minor
...
* update the default.xml for ROCm6.1 (#3067)
* Bump urllib3 from 1.26.13 to 1.26.18 in /docs/sphinx (#3068)
Bumps [urllib3](https://github.com/urllib3/urllib3) from 1.26.13 to 1.26.18.
- [Release notes](https://github.com/urllib3/urllib3/releases)
- [Changelog](https://github.com/urllib3/urllib3/blob/main/CHANGES.rst)
- [Commits](https://github.com/urllib3/urllib3/compare/1.26.13...1.26.18)
---
updated-dependencies:
- dependency-name: urllib3
dependency-type: indirect
...
* Add 6.1.1.md template
* Bump rocm-docs-core from 1.1.0 to 1.1.1 in /docs/sphinx (#3070)
Bumps [rocm-docs-core](https://github.com/RadeonOpenCompute/rocm-docs-core) from 1.1.0 to 1.1.1.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v1.1.0...v1.1.1)
---
updated-dependencies:
- dependency-name: rocm-docs-core
dependency-type: direct:production
update-type: version-update:semver-patch
...
* Fix broken link on hardware specs page (#3075)
* Fix broken link
Fix broken link on hardware specs page to HIP programming model due to
refactoring of HIP docs.
* Update link anchor
* Tagged builds of External CI components (#3078)
* Tagged builds of External CI components
Adding the capability to kick off builds of ROCm components based on a tag ref, without needing the YAML file in the corresponding repo that is used for pre-submit and on-submit builds. This frees the team to create an initial set of pipelines to verify that things work.
Also made some improvements to the code structure and added support for more repos.
---------
* More external CI pipelines (#3083)
Changing default behaviour for PRs with tag-builds.
Changing build system for some jobs based on execution time.
* Add compatibility matrix (#3082)
* add compatibility matrix and custom css
* fix toc
* reorder some components in matrix, add missing tools to reference page
* Update docs/compatibility/compatibility-matrix.rst
---------
* update OS strings to be more readable and searchable (#3088)
* Tag build pipelines for four more ROCm repos (#3085)
- rocgdb
- hipother via HIP build with targeted platform
- hipSOLVER
- hipSPARSELt
* Bump jinja2 from 3.1.3 to 3.1.4 in /docs/sphinx (#3089)
Bumps [jinja2](https://github.com/pallets/jinja) from 3.1.3 to 3.1.4.
- [Release notes](https://github.com/pallets/jinja/releases)
- [Changelog](https://github.com/pallets/jinja/blob/main/CHANGES.rst)
- [Commits](https://github.com/pallets/jinja/compare/3.1.3...3.1.4)
---
updated-dependencies:
- dependency-name: jinja2
dependency-type: indirect
...
* Compatibility Matrix - include AMDSMI (#3090)
* Extend codeowners for docs (#3091)
* Add release notes
Improve wording
Clarify Ubuntu 22.04.5 is pre-release
Add AMD SMI changes
Fix md headings and some words
Reword highlight
Add feedback from Leo to release highlight
Add generated changelog
Add RELEASE.md for 6.1.1
Update highlight in RELEASE.md with change in 6.1.1 template
Change h1 in CHANGELOG.md to ROCm 6.1.1 changelog
Change release notes to changelog in CHANGELOG.md
Fix missing info in CHANGELOG.md pre-6.1.1
Add HIPIFY 6.1.1 to changelog
Add HIPIFY to RELEASE.md
Also fix typo in changelog
Add HIPIFY to 6.1.1 template
* Fix util imports
* Skip and log missing branches for release_data.py
* Update autotag readme
* Remove ck template
* Fix changelog and release notes
Add \n to top of 6.0.2 template
Update RELEASE.md and 6.1.1.md
Regenerate changelog
Add minor wording changes in RELEASE.md
Incorporate Leo's feedback
Reformat RELEASE.md to fix build issue
Fixes an issue preventing Changelog from appearing in the TOC.
Update AMDSMI link & change 'release highlights' to 'release notes'
Change AMD SMI link from develop to docs/6.1.1
* Bump rocm-docs-core from 1.1.0 to 1.1.1 in /docs/sphinx
Bumps [rocm-docs-core](https://github.com/RadeonOpenCompute/rocm-docs-core) from 1.1.0 to 1.1.1.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v1.1.0...v1.1.1)
---
updated-dependencies:
- dependency-name: rocm-docs-core
dependency-type: direct:production
update-type: version-update:semver-patch
...
* Update changelog and release notes for 6.1.1
Reformat 6.1.0 to 6.0.0 changelog
Add ROCm SMI known issues to RN
Tweak ROCm SMI wording
Add known issue
Reword known issue rn
Fix headings and wording
Remove redundancy
Fix headings and known issue words
Leo changes
Remove known issue with Radeon GPUs
Specify Navi3 GPUs in ROCm SMI known issue
Change Navi 3x to RDNA3
Add OS support note
Fix 6.1.1 template link to amdsmi
Update 6.1.1 library table, add hipBLASLt to 6.1.1 CL/RN, update HIPCC upcoming changes wording
Remove extra bullet
Change gpu to GPU in rocFFT
---------
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: Roopa Malavally <56051583+Rmalavally@users.noreply.github.com>
Co-authored-by: Young Hui - AMD <145490163+yhuiYH@users.noreply.github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: peter <peter.park@amd.com>
Co-authored-by: amitkumar-amd <120512306+amitkumar-amd@users.noreply.github.com>
Co-authored-by: Joseph Macaranas <145489236+amd-jmacaran@users.noreply.github.com>
Co-authored-by: Yanyao Wang <yanywang@amd.com>
Co-authored-by: Wang, Yanyao <yanyao.wang@amd.com>
Co-authored-by: abhimeda <abhinav.meda@amd.com>
Co-authored-by: alexxu-amd <alex.xu@amd.com>
Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com>
* Add ROCm version 6.1.0 to version list (#3023)
* Update CHANGELOG.md
Added GitHub links to Changelog
* Update CHANGELOG.md
* Update manifest for ROCm 6.1.0 (#3022)
* Reorganize default.xml by group and alphabetically
* Add rocDecode to default.xml
* Add rocDecode to included names in tag script
* update tag to 6.1.0
---------
Co-authored-by: Young Hui - AMD <145490163+yhuiYH@users.noreply.github.com>
* Update CHANGELOG.md
Updated ROCm Compiler with fixed issue
* docs(tools/autotag/README.md): Add additional note to avoid duplicating data in changelog template (#3018)
* Bump rocm-docs-core from 0.38.1 to 1.0.0 in /docs/sphinx
Bumps [rocm-docs-core](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.38.1 to 1.0.0.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.38.1...v1.0.0)
---
updated-dependencies:
- dependency-name: rocm-docs-core
dependency-type: direct:production
update-type: version-update:semver-major
...
Signed-off-by: dependabot[bot] <support@github.com>
* Use Ubuntu 22.04 and Python 3.10 in RTD config
* Update README.md (#3043)
* Update README.md
Fix rocSPARSE build link
* Update link to just general page, instead of anchor
* Add 'JAX for ROCm' link to index.md (#3034)
* Add JAX for ROCm link to index.md
* Reorder third-party libraries installation guides in index
* Update links to rocAL component (#3033)
* Update links to rocAL component
* Change absolute rocm docs links to relative
* Update compatibility/precision-support links (#3030)
* Change links to component data type support pages from absolute to relative
* Fix rocPRIM data type support links
* Empty commit to trigger demo rebuild.
* Update excluded and included projects
* Separate templates into a module; Fix MIVisionX template
* Add hipfort changelog processor
* Add rpp custom processor
* Add custom processor for rvs
* update the code-owner list (#3046)
* Update default.xml (#3038)
* Remove HIPCC from default.xml
HIPCC moved into llvm-project
* Remove ROCm-Device-Libs from default.xml
ROCm-Device-Libs was moved into llvm-project
* Remove ROCm-CompilerSupport from default.xml
ROCm-CompilerSupport was moved into llvm-project
* Add rocprofiler-register to default.xml
Added in 6.1 manifest
* Apply mathlibs group to projects in manifest
* Bump rocm-docs-core from 0.38.1 to 1.0.0 in /docs/sphinx (#3047)
* Bump rocm-docs-core from 0.38.1 to 1.0.0 in /docs/sphinx
Bumps [rocm-docs-core](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.38.1 to 1.0.0.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.38.1...v1.0.0)
---
updated-dependencies:
- dependency-name: rocm-docs-core
dependency-type: direct:production
update-type: version-update:semver-major
...
Signed-off-by: dependabot[bot] <support@github.com>
* Set Ubuntu 22.04 and Python 3.10 in ReadtheDocs config
---------
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Sam Wu <22262939+samjwu@users.noreply.github.com>
* Add 6.1.0.md template
* Add AMD SMI to 6.1.0 template
* Add ROCm Compiler to 6.1.0 template
* Add RDC to 6.1.0 template
* Add ROCgdb to 6.1.0 template
* Add ROCm SMI to 6.1.0 template
* Add ROCProfiler to 6.1.0 template
* Add MI200 SR-IOV known issue to 6.1.0 template
* Add MI300 RAS fixed defect to 6.1.0 template
* docs(6.1.0.md): Add more changelog notes for 6.1.0
* Update 6.1.0.md
Added links to GitHub for known issues and ROCm Compiler fixed defect
* Test autotag script
* Add ck template
* Add HIPIFY to included names for tag script
* Remove rocprofiler from tag_script
* Remove RVS template
Determine cause of missing later
* Add HIPIFY to template for 6.1.0
* Add extra line to top of template for formatting changelog
* Update 5.7.1.md
Fixing the broken link for rocBLAS programmer's guide in 5.7.1 Changelog.
* Regenerate changelog with new 5.7.1 link fix
* Add note for tag_script included_names
* Improve readability of GPU architecture hardware specs (#3009)
* move units of measurement to table headers
* add glossary explaining table headers
* add missed units and update h1
* toc listing to indicate Accelerators & GPUs
* fix typo
* update meta description and keywords
* Update title in toc to fit in sidebar
* update title, toc, and filename
* Fix broken link to HIP programming guide
* Revert "update title, toc, and filename"
This reverts commit 6b9e687805.
* Revert glossary; slight fixes
* Change 'Pro' to 'PRO' for consistency
* Add references to programming and hardware architecture guides
* Change 'warp' to 'wavefront'
* Update changelog.jinja to exclude version number in header for individual libraries (#3058)
* Base set of Azure DevOps pipeline library source (#3021)
* Base set of Azure DevOps pipeline library source
A base set of yaml files to orchestrate the build and testing of ROCm compiler and runtime components in an Azure DevOps project.
* Use hipcc in llvm-project, also build OpenCL runtime.
* Adding llvm-lit tests to llvm-project pipeline.
Added comgr ctest as well.
* rocm-cmake unit testing in pipeline
* Pipeline changes corresponding to 6.1 release
* Bump rocm-docs-core from 1.0.0 to 1.1.0 in /docs/sphinx (#3063)
Bumps [rocm-docs-core](https://github.com/RadeonOpenCompute/rocm-docs-core) from 1.0.0 to 1.1.0.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v1.0.0...v1.1.0)
---
updated-dependencies:
- dependency-name: rocm-docs-core
dependency-type: direct:production
update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* Bump rocm-docs-core from 1.0.0 to 1.1.0 in /docs/sphinx
Bumps [rocm-docs-core](https://github.com/RadeonOpenCompute/rocm-docs-core) from 1.0.0 to 1.1.0.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v1.0.0...v1.1.0)
---
updated-dependencies:
- dependency-name: rocm-docs-core
dependency-type: direct:production
update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com>
* update the default.xml for ROCm6.1 (#3067)
Co-authored-by: Wang, Yanyao <yanyao.wang@amd.com>
* Bump urllib3 from 1.26.13 to 1.26.18 in /docs/sphinx (#3068)
Bumps [urllib3](https://github.com/urllib3/urllib3) from 1.26.13 to 1.26.18.
- [Release notes](https://github.com/urllib3/urllib3/releases)
- [Changelog](https://github.com/urllib3/urllib3/blob/main/CHANGES.rst)
- [Commits](https://github.com/urllib3/urllib3/compare/1.26.13...1.26.18)
---
updated-dependencies:
- dependency-name: urllib3
dependency-type: indirect
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* Add 6.1.1.md template
* Bump rocm-docs-core from 1.1.0 to 1.1.1 in /docs/sphinx (#3070)
Bumps [rocm-docs-core](https://github.com/RadeonOpenCompute/rocm-docs-core) from 1.1.0 to 1.1.1.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v1.1.0...v1.1.1)
---
updated-dependencies:
- dependency-name: rocm-docs-core
dependency-type: direct:production
update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* Fix broken link on hardware specs page (#3075)
* Fix broken link
Fix broken link on hardware specs page to HIP programming model due to
refactoring of HIP docs.
* Update link anchor
* Tagged builds of External CI components (#3078)
* Tagged builds of External CI components
Adding the capability to kick off builds of ROCm components based on a tag ref, without needing the YAML file in the corresponding repo that is used for pre-submit and on-submit builds. This frees the team to create an initial set of pipelines to verify that things work.
Also made some improvements to the code structure and added support for more repos.
---------
Co-authored-by: abhimeda <abhinav.meda@amd.com>
Co-authored-by: alexxu-amd <alex.xu@amd.com>
* More external CI pipelines (#3083)
Changing default behaviour for PRs with tag-builds.
Changing build system for some jobs based on execution time.
Co-authored-by: abhimeda <abhinav.meda@amd.com>
Co-authored-by: alexxu-amd <alex.xu@amd.com>
* Add compatibility matrix (#3082)
* add compatibility matrix and custom css
* fix toc
* reorder some components in matrix, add missing tools to reference page
* Update docs/compatibility/compatibility-matrix.rst
Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com>
---------
Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com>
* update OS strings to be more readable and searchable (#3088)
* Tag build pipelines for four more ROCm repos (#3085)
- rocgdb
- hipother via HIP build with targeted platform
- hipSOLVER
- hipSPARSELt
* Bump jinja2 from 3.1.3 to 3.1.4 in /docs/sphinx (#3089)
Bumps [jinja2](https://github.com/pallets/jinja) from 3.1.3 to 3.1.4.
- [Release notes](https://github.com/pallets/jinja/releases)
- [Changelog](https://github.com/pallets/jinja/blob/main/CHANGES.rst)
- [Commits](https://github.com/pallets/jinja/compare/3.1.3...3.1.4)
---
updated-dependencies:
- dependency-name: jinja2
dependency-type: indirect
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* Compatibility Matrix - include AMDSMI (#3090)
* Extend codeowners for docs (#3091)
* Add release notes
Improve wording
Clarify Ubuntu 22.04.5 is pre-release
Add AMD SMI changes
Fix md headings and some words
Reword highlight
Add feedback from Leo to release highlight
Add generated changelog
Add RELEASE.md for 6.1.1
Update highlight in RELEASE.md with change in 6.1.1 template
Change h1 in CHANGELOG.md to ROCm 6.1.1 changelog
Change release notes to changelog in CHANGELOG.md
Fix missing info in CHANGELOG.md pre-6.1.1
Add HIPIFY 6.1.1 to changelog
Add HIPIFY to RELEASE.md
Also fix typo in changelog
Add HIPIFY to 6.1.1 template
* Fix util imports
* Skip and log missing branches for release_data.py
* Update autotag readme
* Remove ck template
* Fix changelog and release notes
Add \n to top of 6.0.2 template
Update RELEASE.md and 6.1.1.md
Regenerate changelog
Add minor wording changes in RELEASE.md
Incorporate Leo's feedback
Reformat RELEASE.md to fix build issue
Fixes an issue preventing Changelog from appearing in the TOC.
Update AMDSMI link & change 'release highlights' to 'release notes'
Change AMD SMI link from develop to docs/6.1.1
* Bump rocm-docs-core from 1.1.0 to 1.1.1 in /docs/sphinx
Bumps [rocm-docs-core](https://github.com/RadeonOpenCompute/rocm-docs-core) from 1.1.0 to 1.1.1.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v1.1.0...v1.1.1)
---
updated-dependencies:
- dependency-name: rocm-docs-core
dependency-type: direct:production
update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
* Update changelog and release notes for 6.1.1
Reformat 6.1.0 to 6.0.0 changelog
Add ROCm SMI known issues to RN
Tweak ROCm SMI wording
Add known issue
Reword known issue rn
Fix headings and wording
Remove redundancy
Fix headings and known issue words
Leo changes
Remove known issue with Radeon GPUs
Specify Navi3 GPUs in ROCm SMI known issue
Change Navi 3x to RDNA3
Add OS support note
Fix 6.1.1 template link to amdsmi
Update 6.1.1 library table, add hipBLASLt to 6.1.1 CL/RN, update HIPCC upcoming changes wording
Remove extra bullet
Change gpu to GPU in rocFFT
---------
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: Roopa Malavally <56051583+Rmalavally@users.noreply.github.com>
Co-authored-by: Young Hui - AMD <145490163+yhuiYH@users.noreply.github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: peter <peter.park@amd.com>
Co-authored-by: amitkumar-amd <120512306+amitkumar-amd@users.noreply.github.com>
Co-authored-by: Joseph Macaranas <145489236+amd-jmacaran@users.noreply.github.com>
Co-authored-by: Yanyao Wang <yanywang@amd.com>
Co-authored-by: Wang, Yanyao <yanyao.wang@amd.com>
Co-authored-by: abhimeda <abhinav.meda@amd.com>
Co-authored-by: alexxu-amd <alex.xu@amd.com>
Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com>
* Tagged builds of External CI components
Adding the capability to kick off builds of ROCm components based on a tag ref, without needing the YAML file in the corresponding repo that is used for pre-submit and on-submit builds. This frees the team to create an initial set of pipelines to verify that things work.
Also made some improvements to the code structure and added support for more repos.
---------
Co-authored-by: abhimeda <abhinav.meda@amd.com>
Co-authored-by: alexxu-amd <alex.xu@amd.com>
* Base set of Azure DevOps pipeline library source
A base set of yaml files to orchestrate the build and testing of ROCm compiler and runtime components in an Azure DevOps project.
* Use hipcc in llvm-project, also build OpenCL runtime.
* Adding llvm-lit tests to llvm-project pipeline.
Added comgr ctest as well.
* rocm-cmake unit testing in pipeline
* Pipeline changes corresponding to 6.1 release
* update manifest file for ROCm6.1 (#3024)
Co-authored-by: Wang, Yanyao <yanyao.wang@amd.com>
* Add ROCm version 6.1.0 to version list (#3023) (#3025)
* Merge develop into roc-6.1.x (#3048)
* Add ROCm version 6.1.0 to version list (#3023)
* Update CHANGELOG.md
Added GitHub links to Changelog
* Update CHANGELOG.md
* Update manifest for ROCm 6.1.0 (#3022)
* Reorganize default.xml by group and alphabetically
* Add rocDecode to default.xml
* Add rocDecode to included names in tag script
* update tag to 6.1.0
---------
Co-authored-by: Young Hui - AMD <145490163+yhuiYH@users.noreply.github.com>
* Update CHANGELOG.md
Updated ROCm Compiler with fixed issue
* docs(tools/autotag/README.md): Add additional note to avoid duplicating data in changelog template (#3018)
* Bump rocm-docs-core from 0.38.1 to 1.0.0 in /docs/sphinx
Bumps [rocm-docs-core](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.38.1 to 1.0.0.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.38.1...v1.0.0)
---
updated-dependencies:
- dependency-name: rocm-docs-core
dependency-type: direct:production
update-type: version-update:semver-major
...
Signed-off-by: dependabot[bot] <support@github.com>
* Use Ubuntu 22.04 and Python 3.10 in RTD config
* Update README.md (#3043)
* Update README.md
Fix rocSPARSE build link
* Update link to just general page, instead of anchor
* Add 'JAX for ROCm' link to index.md (#3034)
* Add JAX for ROCm link to index.md
* Reorder third-party libraries installation guides in index
* Update links to rocAL component (#3033)
* Update links to rocAL component
* Change absolute rocm docs links to relative
* Update compatibility/precision-support links (#3030)
* Change links to component data type support pages from absolute to relative
* Fix rocPRIM data type support links
* Empty commit to trigger demo rebuild.
* Update excluded and included projects
* Separate templates into a module; Fix MIVisionX template
* Add hipfort changelog processor
* Add rpp custom processor
* Add custom processor for rvs
* update the code-owner list (#3046)
* Update default.xml (#3038)
* Remove HIPCC from default.xml
HIPCC moved into llvm-project
* Remove ROCm-Device-Libs from default.xml
ROCm-Device-Libs was moved into llvm-project
* Remove ROCm-CompilerSupport from default.xml
ROCm-CompilerSupport was moved into llvm-project
* Add rocprofiler-register to default.xml
Added in 6.1 manifest
* Apply mathlibs group to projects in manifest
* Bump rocm-docs-core from 0.38.1 to 1.0.0 in /docs/sphinx (#3047)
* Bump rocm-docs-core from 0.38.1 to 1.0.0 in /docs/sphinx
Bumps [rocm-docs-core](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.38.1 to 1.0.0.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.38.1...v1.0.0)
---
updated-dependencies:
- dependency-name: rocm-docs-core
dependency-type: direct:production
update-type: version-update:semver-major
...
Signed-off-by: dependabot[bot] <support@github.com>
* Set Ubuntu 22.04 and Python 3.10 in ReadtheDocs config
---------
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Sam Wu <22262939+samjwu@users.noreply.github.com>
* Add 6.1.0.md template
* Add AMD SMI to 6.1.0 template
* Add ROCm Compiler to 6.1.0 template
* Add RDC to 6.1.0 template
* Add ROCgdb to 6.1.0 template
* Add ROCm SMI to 6.1.0 template
* Add ROCProfiler to 6.1.0 template
* Add MI200 SR-IOV known issue to 6.1.0 template
* Add MI300 RAS fixed defect to 6.1.0 template
* docs(6.1.0.md): Add more changelog notes for 6.1.0
* Update 6.1.0.md
Added links to GitHub for known issues and ROCm Compiler fixed defect
* Test autotag script
* Add ck template
* Add HIPIFY to included names for tag script
* Remove rocprofiler from tag_script
* Remove RVS template
Determine cause of missing later
* Add HIPIFY to template for 6.1.0
* Add extra line to top of template for formatting changelog
* Update 5.7.1.md
Fixing the broken link for rocBLAS programmer's guide in 5.7.1 Changelog.
* Regenerate changelog with new 5.7.1 link fix
* Add note for tag_script included_names
* Improve readability of GPU architecture hardware specs (#3009)
* move units of measurement to table headers
* add glossary explaining table headers
* add missed units and update h1
* toc listing to indicate Accelerators & GPUs
* fix typo
* update meta description and keywords
* Update title in toc to fit in sidebar
* update title, toc, and filename
* Fix broken link to HIP programming guide
* Revert "update title, toc, and filename"
This reverts commit 6b9e687805.
* Revert glossary; slight fixes
* Change 'Pro' to 'PRO' for consistency
* Add references to programming and hardware architecture guides
* Change 'warp' to 'wavefront'
---------
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: Roopa Malavally <56051583+Rmalavally@users.noreply.github.com>
Co-authored-by: Young Hui - AMD <145490163+yhuiYH@users.noreply.github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: peter <peter.park@amd.com>
Co-authored-by: amitkumar-amd <120512306+amitkumar-amd@users.noreply.github.com>
---------
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: Yanyao Wang <yanywang@amd.com>
Co-authored-by: Wang, Yanyao <yanyao.wang@amd.com>
Co-authored-by: Roopa Malavally <56051583+Rmalavally@users.noreply.github.com>
Co-authored-by: Young Hui - AMD <145490163+yhuiYH@users.noreply.github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: peter <peter.park@amd.com>
Co-authored-by: amitkumar-amd <120512306+amitkumar-amd@users.noreply.github.com>
* Add ROCm version 6.1.0 to version list (#3023)
* Update CHANGELOG.md
Added GitHub links to Changelog
* Update CHANGELOG.md
* Update manifest for ROCm 6.1.0 (#3022)
* Reorganize default.xml by group and alphabetically
* Add rocDecode to default.xml
* Add rocDecode to included names in tag script
* update tag to 6.1.0
---------
Co-authored-by: Young Hui - AMD <145490163+yhuiYH@users.noreply.github.com>
* Update CHANGELOG.md
Updated ROCm Compiler with fixed issue
* docs(tools/autotag/README.md): Add additional note to avoid duplicating data in changelog template (#3018)
* Bump rocm-docs-core from 0.38.1 to 1.0.0 in /docs/sphinx
Bumps [rocm-docs-core](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.38.1 to 1.0.0.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.38.1...v1.0.0)
---
updated-dependencies:
- dependency-name: rocm-docs-core
dependency-type: direct:production
update-type: version-update:semver-major
...
Signed-off-by: dependabot[bot] <support@github.com>
* Use Ubuntu 22.04 and Python 3.10 in RTD config
* Update README.md (#3043)
* Update README.md
Fix rocSPARSE build link
* Update link to just general page, instead of anchor
* Add 'JAX for ROCm' link to index.md (#3034)
* Add JAX for ROCm link to index.md
* Reorder third-party libraries installation guides in index
* Update links to rocAL component (#3033)
* Update links to rocAL component
* Change absolute rocm docs links to relative
* Update compatibility/precision-support links (#3030)
* Change links to component data type support pages from absolute to relative
* Fix rocPRIM data type support links
* Empty commit to trigger demo rebuild.
* Update excluded and included projects
* Separate templates into a module; Fix MIVisionX template
* Add hipfort changelog processor
* Add rpp custom processor
* Add custom processor for rvs
* update the code-owner list (#3046)
* Update default.xml (#3038)
* Remove HIPCC from default.xml
HIPCC moved into llvm-project
* Remove ROCm-Device-Libs from default.xml
ROCm-Device-Libs was moved into llvm-project
* Remove ROCm-CompilerSupport from default.xml
ROCm-CompilerSupport was moved into llvm-project
* Add rocprofiler-register to default.xml
Added in 6.1 manifest
* Apply mathlibs group to projects in manifest
* Bump rocm-docs-core from 0.38.1 to 1.0.0 in /docs/sphinx (#3047)
* Bump rocm-docs-core from 0.38.1 to 1.0.0 in /docs/sphinx
Bumps [rocm-docs-core](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.38.1 to 1.0.0.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.38.1...v1.0.0)
---
updated-dependencies:
- dependency-name: rocm-docs-core
dependency-type: direct:production
update-type: version-update:semver-major
...
Signed-off-by: dependabot[bot] <support@github.com>
* Set Ubuntu 22.04 and Python 3.10 in ReadtheDocs config
---------
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Sam Wu <22262939+samjwu@users.noreply.github.com>
* Add 6.1.0.md template
* Add AMD SMI to 6.1.0 template
* Add ROCm Compiler to 6.1.0 template
* Add RDC to 6.1.0 template
* Add ROCgdb to 6.1.0 template
* Add ROCm SMI to 6.1.0 template
* Add ROCProfiler to 6.1.0 template
* Add MI200 SR-IOV known issue to 6.1.0 template
* Add MI300 RAS fixed defect to 6.1.0 template
* docs(6.1.0.md): Add more changelog notes for 6.1.0
* Update 6.1.0.md
Added links to GitHub for known issues and ROCm Compiler fixed defect
* Test autotag script
* Add ck template
* Add HIPIFY to included names for tag script
* Remove rocprofiler from tag_script
* Remove RVS template
Determine cause of missing later
* Add HIPIFY to template for 6.1.0
* Add extra line to top of template for formatting changelog
* Update 5.7.1.md
Fixing the broken link for rocBLAS programmer's guide in 5.7.1 Changelog.
* Regenerate changelog with new 5.7.1 link fix
* Add note for tag_script included_names
* Improve readability of GPU architecture hardware specs (#3009)
* move units of measurement to table headers
* add glossary explaining table headers
* add missed units and update h1
* toc listing to indicate Accelerators & GPUs
* fix typo
* update meta description and keywords
* Update title in toc to fit in sidebar
* update title, toc, and filename
* Fix broken link to HIP programming guide
* Revert "update title, toc, and filename"
This reverts commit 6b9e687805.
* Revert glossary; slight fixes
* Change 'Pro' to 'PRO' for consistency
* Add references to programming and hardware architecture guides
* Change 'warp' to 'wavefront'
---------
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: Roopa Malavally <56051583+Rmalavally@users.noreply.github.com>
Co-authored-by: Young Hui - AMD <145490163+yhuiYH@users.noreply.github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: peter <peter.park@amd.com>
Co-authored-by: amitkumar-amd <120512306+amitkumar-amd@users.noreply.github.com>
* move units of measurement to table headers
* add glossary explaining table headers
* add missed units and update h1
* toc listing to indicate Accelerators & GPUs
* fix typo
* update meta description and keywords
* Update title in toc to fit in sidebar
* update title, toc, and filename
* Fix broken link to HIP programming guide
* Revert "update title, toc, and filename"
This reverts commit 6b9e687805.
* Revert glossary; slight fixes
* Change 'Pro' to 'PRO' for consistency
* Add references to programming and hardware architecture guides
* Change 'warp' to 'wavefront'
* Remove HIPCC from default.xml
HIPCC moved into llvm-project
* Remove ROCm-Device-Libs from default.xml
ROCm-Device-Libs was moved into llvm-project
* Remove ROCm-CompilerSupport from default.xml
ROCm-CompilerSupport was moved into llvm-project
* Add rocprofiler-register to default.xml
Added in 6.1 manifest
* Apply mathlibs group to projects in manifest
* Update compatibility/precision-support links (#3030)
* Change links to component data type support pages from absolute to relative
* Fix rocPRIM data type support links
* Empty commit to trigger demo rebuild.
* Update links to rocAL component (#3033)
* Update links to rocAL component
* Change absolute rocm docs links to relative
* Add 'JAX for ROCm' link to index.md (#3034)
* Add JAX for ROCm link to index.md
* Reorder third-party libraries installation guides in index
* Update README.md (#3043)
* Update README.md
Fix rocSPARSE build link
* Update link to just general page, instead of anchor
---------
Co-authored-by: Young Hui - AMD <145490163+yhuiYH@users.noreply.github.com>
* Change links to component data type support pages from absolute to relative
* Fix rocPRIM data type support links
* Empty commit to trigger demo rebuild.
* Reorganize default.xml by group and alphabetically
* Add rocDecode to default.xml
* Add rocDecode to included names in tag script
* update tag to 6.1.0
---------
Co-authored-by: Young Hui - AMD <145490163+yhuiYH@users.noreply.github.com>
Added known issue for ROCm compiler
https://ontrack-internal.amd.com/browse/SWDEV-454778
Added known issue for RVS
Added known issue for MI200 SRIOV
Updated PEBB test known issue for RVS
Added expansion for PEBB
Added PBQT known issue
expanded P2P Benchmark and Qualification Tool
Edited RVS known issue description based on Leo's input
Added MI300A fixed defect
Removed PEBB and Babel Stream from RVS known issue
Updated RCCL
Added rocm-cmake
Added rocRAND
Added rocWMMA
Added Tensile
Alan's change 1
Alan change to HIPIFY
Alan's edit 3 for MIOpen
OpenMP 2nd bullet fix - Alan edit
Alan's edit - ROCm Compiler
ROCm Validation Suite edits
Alan's edit rocSOLVER
Alan's edit to ROCTracer
Updated hipSPARSELt
Added hipTensor 1.2.0
Added hipTensor
data type correction
updated the RCCL version
Added bullets to known issues for consistency
Changed RAS to Fixed defect
* Add rocDecode to What is ROCm? components list (#3016)
* Add rocDecode to What is ROCm? components list
* Fix typo -> 'Common Language Runtime'
* Change 'compute' to 'common'
* Add rocDecode to API libraries (#3019)
* Update links
* table cleanup
* cross-refs
* wordlist update
* add temp hard links
* verbiage
* docs(index.md): Disable MD051 for Sphinx Markdown anchor point
In general this rule should be followed to avoid broken links
* revert gpu-arch table, remove dropdowns, quick start hyphen removed on index.md
* revise opening text as per PR comment
---------
Co-authored-by: Lisa <lisa.delaney@amd.com>
Co-authored-by: Sam Wu <sam.wu2@amd.com>
Co-authored-by: Young Hui <young.hui@amd.com>
* add rocm software stack diagram to What is ROCm landing page
* restructure ROCm project list table
* clean up unnecessary hyphenation
* update What is ROCm stack diagram filename
* reorder rocm project list to reflect diagram
* update "What is ROCm?" image metadata
* change 'project list' to 'components'
* change 'project' to 'component'
* Update using-gpu-sanitizer.md
Minor OpenMP update
* Update using-gpu-sanitizer.md
Updated note with additional information.
* Update using-gpu-sanitizer.md
* Update using-gpu-sanitizer.md
Moved the note to another section
* Update using-gpu-sanitizer.md
* Create issue_retrieval.yml
I am tasked with adding a GitHub action to process incoming GitHub issues. The AMD GitHub admin team asked me to try out one of their runners and to do so, I need to load in a workflow file.
* changed group to ROCM-Ubuntu
* Added a field to specify project number
This action receives an org name and project number and adds issues to it using this information
* Update issue_retrieval.yml
* Update issue_retrieval.yml
* Generate release notes for 6.0.1 from autotag script (#2790)
* Update CONTRIBUTING.md (#2791)
* Update CONTRIBUTING.md
* Fixed link to licensing document
Also, changed to use relative links for internal files.
* Revert "Update CONTRIBUTING.md" (#2795)
* Text change to direct PRs into default branch, since not all repos have develop branch
* add keywords (#2799)
* Update issue_retrieval.yml
* ci(default.xml): Add hipBLASLt to manifest (#2796)
* Deleting issue_report.yml in favor of a global issue template placed in ROCm/.github (#2803)
* Delete .github/ISSUE_TEMPLATE/issue_report.yml
* Delete .github/ISSUE_TEMPLATE/config.yml
* Delete .github/ISSUE_TEMPLATE directory (#2805)
* docs(conf.py): Update article info for release page (#2806)
* docs(conf.py): Update article info for release page
* Update conf.py
* Fix typo (#2809)
---------
Co-authored-by: abhimeda <138710508+abhimeda@users.noreply.github.com>
Co-authored-by: David Galiffi <dgaliffi@amd.com>
Co-authored-by: Lisa <lisa.delaney@amd.com>
Co-authored-by: Young Hui <young.hui@amd.com>
Co-authored-by: yhuiYH <145490163+yhuiYH@users.noreply.github.com>
* Mi300 info update (#2780)
* docs(gpu-enabled-mpi.rst): Fix links to 3rd party support matrices (#2775)
* docs(gpu-enabled-mpi.rst): Fix links to 3rd party support matrices
* docs: Directly link for RST instead of using intersphinx
---------
Co-authored-by: Istvan Kiss <neon60@gmail.com>
* moved contributing.md to new location as it describes contributing to documentation
* Adding Governance.md and high-level Contributing.md
* fix linting errors (asterisk, whitespace and unused links)
* More linting fixes
* merge conflicts
* verbiage
* License link moved out of codeblock, and text fix there. Changed to full name of AMD. Update links to ROCm Org path
* whitespace linting fix
* Reverted back to ROCm is led and managed by AMD. Flows better to me.
---------
Co-authored-by: Lisa Delaney <lisa.delaney@amd.com>
* docs(conf.py): Use rocm-docs-core as extension
instead of calling and instantiating as object (legacy method)
Also apply the rocm-docs-home theme flavor
* build: Update rocm-docs-core to 0.30.1
* docs: Remove extra newline from 5.7.1.md template
* docs: Update the changelog and latest release notes
* docs: Rebuild changelog with updated 6.0.0 edits
* Clarify mixing C++ and HIP sources via CMake
* Designate code blocks
* Simplify lang around host-only use of the HIP API
* Remove superfluous wording.
* Note LINKER_LANGUAGE of mixed sources
* Space after code-block
* Single space in code-block
* update gpu-enabled-mpi
update the documentation to also include libfabric based network interconnects,
not just UCX.
* add some technical terms to wordlist
* shorten left nav
* grid updates
---------
Co-authored-by: Edgar Gabriel <Edgar.Gabriel@amd.com>
Co-authored-by: Saad Rahim (AMD) <44449863+saadrahim@users.noreply.github.com>
* Update RELEASE.md and 5.7.0.md to match CHANGELOG.md
* Update 5.2.0.md to match CHANGELOG.md
* Copy CHANGELOG into about folder to match RELEASE
To avoid having divergence in relative links between RELEASE and CHANGELOG
Update docs with information in the AMD blog post announcing support for some RDNA3 Radeon GPUs on Linux.
Co-authored-by: Saad Rahim (AMD) <44449863+saadrahim@users.noreply.github.com>
dependabot mis-detected the repository to be a library
(instead of an application) and widened the rocm-docs-core version
instead of increasing it. This basically disabled pinning.
Explicitly specify to increase the version instead of widening it
to hopefully prevent this in the future.
Flattened out page structure for improved navigability.
* Change Table of Contents
* Update the install guides for windows and linux
* Removed extraneous index pages
* GPU architecture pages duplicate entries removed
* spack page cleanup
---------
Co-authored-by: Sam Wu <samwu103@amd.com>
Co-authored-by: Saad Rahim (AMD) <44449863+saadrahim@users.noreply.github.com>
Since we are not installing the ROCm OpenCL packages, we are not able to
test ROCm with this command.
Signed-off-by: Tasso Zambelakis <Tasso.Zambelakis@amd.com>
* RX 6700* doc fixes in windows_support.md
Correct RX 6700* LLVM target to gfx1031 windows_support.md
Change name from "RX 6750" to "RX 6750 XT"
* Fix RX7600 LLVM to gfx1102 in windows-support.md
---------
Co-authored-by: Saad Rahim (AMD) <44449863+saadrahim@users.noreply.github.com>
This should increase usability and prevent errors, since the most common
use case is the user using the latest version of their OS,
rather than the oldest supported one.
* update relative link to llvm asan guide
remove docs dir from path
* Minor typo and update on supported OSes
---------
Co-authored-by: Sam Wu <sam.wu2@amd.com>
* Update install instructions to 5.7
* RTG additions to install instructions
* update install instructions for multi version
---------
Co-authored-by: Máté Ferenc Nagy-Egri <mate@streamhpc.com>
Co-authored-by: Sam Wu <sam.wu2@amd.com>
* Update using-gpu-sanitizer.md
Updated content
* fixes for markdown linting
use * instead of + for lists
---------
Co-authored-by: Sam Wu <sam.wu2@amd.com>
* Create using_gpu_sanitizer.md
* Created GPU Sanitizer File and Title
* add technical terms to wordlist and fix spelling
* spelling
---------
Co-authored-by: Sam Wu <sam.wu2@amd.com>
Co-authored-by: b-sumner <brian.sumner@amd.com>
* Added deleted sections to openmp.md and other improvements
* Update CONTRIBUTING.md
* add example of snake case
---------
Co-authored-by: Sam Wu <sam.wu2@amd.com>
* spell out HPC acronym in explanation doc
* update toolchain docs
order in importance descending
* update Contributing guide
add discussions
update formatting and grammar
* separate contributing section for readability
* fix formatting for mdl
* fix spelling
* Update Links (#2240)
* update link to PCIe Gen 4 pdf
* fix broken links
* remove references to broken links
* fix spelling of data center
* Fixing HIP link (#2236)
* Swati develop (#2245)
* Added deleted sections to openmp.md and other improvements
* Update openmp.md
Tagged `ICV`
* Resolving discrepancies in openmp.md
There were differences between the published document and the information conveyed by the developer. Fixed them.
* add new words to wordlist
---------
Co-authored-by: Sam Wu <sam.wu2@amd.com>
* fix rocm_smi_lib link in toc (#2260)
* ROCm FHS Reorganization, Backward Compatibility, and Versioning - rev (#2255)
* update requirements
---------
Co-authored-by: Saad Rahim <44449863+saadrahim@users.noreply.github.com>
Co-authored-by: srawat <120587655+SwRaw@users.noreply.github.com>
Co-authored-by: Ehud Sharlin <112672820+Ehud-Sharlin@users.noreply.github.com>
* docs: update changelog and release notes with hipStreamGetDevice
* docs: fix typos and add version update notes
* docs: add HIP changelog
* remove What's New section from changelog
* docs: clean up SLES tab-sets
- Always use a tab-set for SLES 15.4
- In the toplevel SLES title don't say version 15
- harmonize the `:sync:` labels between documents
* docs: Misc fixes in installation
- Fix rocm repository url in the installer script installation for SLES
- Add a missing :sync: tab in installation prerequisites
* docs: add SLES 15.5 support to installation and OS support pages
* Added deleted sections to openmp.md and other improvements
* Update openmp.md
Tagged `ICV`
* Resolving discrepancies in openmp.md
There were differences between the published document and the information conveyed by the developer. Fixed them.
* add new words to wordlist
---------
Co-authored-by: Sam Wu <sam.wu2@amd.com>
* Update tensorflow_install.md
Fixed the commands so that copying and pasting them no longer causes an error.
* Update tensorflow_install.md
Following @saadrahim's suggestion of using "\" to signify a line break in bash.
* Remove package pin from quick start guide
When installing in a single-package fashion, no version pinning is needed
* Add package pinning to quick start guide
Pinning the packages is required to make apt prefer the rocm packages
instead of the system ones when both provide the same package (e.g
`rocm-smi`).
* Removing Ubuntu 20.04 change
---------
Co-authored-by: Gergely Meszaros <gergely@streamhpc.com>
Co-authored-by: Saad Rahim <44449863+saadrahim@users.noreply.github.com>
* Added specialized kernels to openmp.md
A few formatting changes and addition of specialized kernels section at the end.
* Added Specialized kernels in openmp.md
Some formatting changes and addition of specialized kernels instead of no loop and cross team kernels
* Added specialized kernel to openmp.md
* Added specialized kernels to openmp.md
* Replaced the usage of uncertain clauses(may/might) in openmp.md
* Attempt to align the table headings for environment variables in openmp.md
* Feedback from Dhruva
---------
Co-authored-by: Saad Rahim <44449863+saadrahim@users.noreply.github.com>
The hip-devel package depends on perl modules not distributed by default
on RHEL and SLES distributions; these can be installed from EPEL and
the `devel:languages:perl` repository respectively.
Ideally in the future these dependencies would be replaced with packages
available from default repositories, but in the meanwhile this should
be at least documented.
* Remove install instructions for unsupported RHEL 8.8 and 9.2
Current ROCm release does not support these versions of RHEL
* Centralize disclaimers and prerequisites for installation
- Move the single-version to multi-version disclaimer to the install
overview page where single vs multi installs are discussed.
- Move the installation of kernel-headers and development packages
to the install preparation page. Unify it mainly from the quick start
content.
* s/Name/name/ in repository config files for RHEL
The repository name can be set as `name=<name>` instead of `Name`,
otherwise yum complains about the repo not having a name, e.g:
```output
Repository 'ROCm-5.3.3' is missing name in configuration, using id.
```
This is fixed with this commit.
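As a rough illustration of how this class of mistake could be caught, the sketch below scans repo file text for sections that lack a lowercase `name` key. It is not part of the original change: the helper `sections_missing_name` and the sample repo text are hypothetical, and it assumes the file parses as INI.

```python
import configparser

def sections_missing_name(repo_text):
    """Return the repo section ids that lack a lowercase ``name`` key."""
    parser = configparser.ConfigParser(interpolation=None)
    # Preserve key case; by default configparser lowercases keys,
    # which would hide the ``Name`` vs ``name`` difference.
    parser.optionxform = str
    parser.read_string(repo_text)
    return [s for s in parser.sections() if "name" not in parser[s]]

# Hypothetical repo file content, for illustration only.
bad = "[ROCm-5.3.3]\nName=ROCm5.3.3\nenabled=1\n"
good = "[ROCm-5.3.3]\nname=ROCm5.3.3\nenabled=1\n"
print(sections_missing_name(bad))   # section with ``Name`` is reported
print(sections_missing_name(good))  # nothing reported
```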
* Clean up render/video group section on prerequisites
* Installation and Upgrade restructuring & fixes
- Fix the rocm package urls for RHEL in the install & upgrade guides
- RHEL8 and 9 have different URLs, add a tab-set similar to ubuntu
for them.
- Fix the package URL in the upgrade guide for SLES (previously pointed
to the amdgpu url)
- Change the apt-signing key download and conversion to the method used
in the quick start guide, which is the one recommended by Ubuntu maintainers
- Change the install steps from list items to rubrics with numbered entries
which is more readable and matches the style in the quick start guide
- Do not pass `--append` to `tee` in the upgrade guide, because it is
meant to overwrite.
- Split the one long tab-set to multiple tab-sets in the upgrade guide
to improve readability
Add empty cells to list tables to make them uniform (all rows have the
same number of cells), before this the tables errored out with:
> ERROR: Error parsing content block for the "list-table" directive:
> uniform two-level bullet list expected, but row 13 does not contain
> the same number of items as row 1 (3 vs 4)
and the table did not show up.
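The fix amounts to padding short rows with empty cells. A minimal sketch of that idea follows; the helper `pad_rows` and the sample rows are hypothetical and only illustrate the uniform-row requirement, not the actual table content.

```python
def pad_rows(rows, fill=""):
    """Pad every row with ``fill`` so all rows match the widest row."""
    width = max((len(row) for row in rows), default=0)
    return [list(row) + [fill] * (width - len(row)) for row in rows]

# Placeholder table data: the second row is one cell short.
table = [
    ["GPU", "Architecture", "LLVM target", "Support"],
    ["Example GPU", "Example arch", "gfxNNNN"],
]
for row in pad_rows(table):
    print(len(row), row)
```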
* ci: change markdown linting to use the NodeJs markdownlint
The original ruby based markdownlint has a few shortcomings not known
when it was introduced:
- no support for myst extensions
- no support for disabling specific rules for specific files or regions
Together, these two make it very hard to use in this project, because it
reports false positives around myst extensions.
Luckily there's a NodeJS based version of markdownlint [1] supporting the
same ruleset that is more configurable:
- seems to support myst extensions better
- has an html comment based syntax to disable specific rules
The library also seems to be better maintained and has better tooling:
e.g. there's a vscode extension using the engine for local use:
markdownlint (DavidAnson.vscode-markdownlint).
[1]: https://github.com/DavidAnson/markdownlint
* docs: hotfix empty links
There are missing links in the docs, these should get fixed, but for now
they are just monkey patched to make CI happy.
* docs: fix links
---------
Co-authored-by: Nara Prasetya <nara@streamhpc.com>
* update the gpu-aware-mpi page
Three changes:
- add the ucx compatibility table
- add the --with-rocm=/opt/rocm option to the compilation of Open MPI
- add a section about how to compile and use UCC for collective
operations.
* Changing link to relative
* Update gpu_aware_mpi.md
---------
Co-authored-by: Saad Rahim <44449863+saadrahim@users.noreply.github.com>
* fix rocmcc link
* remove unused link
* remove unused linkcheck configs
* update amd smi section
add link to amd smi github
---------
Co-authored-by: Saad Rahim <44449863+saadrahim@users.noreply.github.com>
* update links to new docs and rename .sphinx dir to sphinx
* fix spelling and formatting
add new words to wordlist
remove empty headers
remove version number for ROCm in conf.py
fix typos
* add more formats to rtd config
* Clean up the deployment related pages
- Add an index page for the linux deployment submenu
- Remove deployment options that are not yet completed (i.e. spack,
from source installation)
- remove the general deployment index page
- various cleanups and clarifications in the rest of the pages
* Move all deploy pages to deploy folder
---------
Co-authored-by: Gergely Meszaros <gergely@streamhpc.com>
- Unify code block style (indent vs. fence)
- Mark code languages
- Increase heading level one at a time
- No extra newlines between paragraphs
- List for header reorg stages
- Shrink ascii table (mobile friendliness)
- 80-column width
* add url to ROCgdb-docs
update reqs and gitignore
* add validation tools section for RVS and TransferBench
* stub in links for validation/mgmt tools
* populate compilers page
* add cards for ai libs and computer vision pages
* add content to math lib pages
* reorg hip and math libs
* update index
* consolidate linear algebra libs
* fix release info order in toc
* fix links and content cards for libraries
* update mdl ignored files
* update understand rocm section
* fix formatting errors
* add link to openmp
* ignore md041
* add url to ROCgdb-docs
update reqs and gitignore
* add validation tools section for RVS and TransferBench
* stub in links for validation/mgmt tools
* populate compilers page
* add cards for ai libs and computer vision pages
* add content to math lib pages
* reorg hip and math libs
* update index
* consolidate linear algebra libs
* Pulling libraries out
* add libraries listed in left sidebar to index page
* Adding all
* Updating nav tree
* fix link to rocm-examples in toc
* update TOC
---------
Co-authored-by: Sam Wu <sam.wu2@amd.com>
* Add C++ algorithm primitive lib cards
* Add PRNG section
* API Reference Manuals first
* Add Tensile and rocWMMA
* Change rocFFT and hipFFT order for consistency
* Add RCCL
* Fix PRNG links
* Add rocSOLVER and hipSOLVER
* Add general note on rocLIB vs hipLIB
* linux quick start: Mention correct package to install
* linux quick start: Rephrase prerequisites
Mention that installing the headers is usually not required by hand.
* linux quick start: Simplify command to get signing key
* linux quick start: Add instructions for RHEL and SLES
Reorganize the quick start guide for linux, adding multi level
tab selection for just the commands where it makes sense.
Currently mostly Ubuntu commands are filled out, if the structure
looks fine, then more will follow.
Add more of the sphinx generated files, so generating the docs does not
add untracked files. Ignore the folder `.venv` typically used for
virtual environments.
Also sort the ignored file list for easier maintenance.
The screenshots show tables of text. They are not easily searchable, are
larger than needed (which increases load times), and have a resolution that
makes them blurry on HiDPI displays. Therefore, use a Markdown table instead,
which solves all of the issues above, and delete the images from the
repository.
The SLES service pack version differs in the two screenshots: SP2 vs SP3.
Go for *SP3*.
Resolves: https://github.com/RadeonOpenCompute/ROCm/issues/1591
* Release 2.7 project descriptions.
Signed-off-by: Cole Nelson <cole.nelson@amd.com>
* Update version_history.md for 2.7
Signed-off-by: Cole Nelson <cole.nelson@amd.com>
* Update meta pkg descriptions and misc. edits
Signed-off-by: Cole Nelson <cole.nelson@amd.com>
* Updates for release 2.4 README.md
Signed-off-by: Cole Nelson <cole.nelson@amd.com>
* Updates to version_history.md for release 2.4
Signed-off-by: Cole Nelson <cole.nelson@amd.com>
* Organize README.md
Signed-off-by: Cole Nelson <cole.nelson@amd.com>
* README.md & version_history.md for rocm release 2.5
Fix numerous links, some syntax.
Add links for rocThrust project.
Signed-off-by: Cole Nelson <cole.nelson@amd.com>
* Update README.md
* Updates for release 2.4 README.md
Signed-off-by: Cole Nelson <cole.nelson@amd.com>
* Updates to version_history.md for release 2.4
Signed-off-by: Cole Nelson <cole.nelson@amd.com>
* Organize README.md
Signed-off-by: Cole Nelson <cole.nelson@amd.com>
* README.md: Update links to ROCm 2.3 repos
Change-Id: I49c6ca76deb61afeaa90fa7e4af6f94bf3914768
Signed-off-by: Cole Nelson <cole.nelson@amd.com>
* README.md: Update more links for release 2.3
Signed-off-by: Cole Nelson <cole.nelson@amd.com>
* Text changes and URL updates for 2.3 release
Signed-off-by: Cole Nelson <cole.nelson@amd.com>
* Text changes for 2.3 release
Signed-off-by: Cole Nelson <cole.nelson@amd.com>
* Test updates for release 2.3
Signed-off-by: Cole Nelson <cole.nelson@amd.com>
* Update links for release 2.3
Signed-off-by: Cole Nelson <cole.nelson@amd.com>
* Top level install link for 2.3 release
Signed-off-by: Cole Nelson <cole.nelson@amd.com>
* README.md: Update links to ROCm 2.3 repos
Change-Id: I49c6ca76deb61afeaa90fa7e4af6f94bf3914768
Signed-off-by: Cole Nelson <cole.nelson@amd.com>
* README.md: Update more links for release 2.3
Signed-off-by: Cole Nelson <cole.nelson@amd.com>
* Text changes and URL updates for 2.3 release
Signed-off-by: Cole Nelson <cole.nelson@amd.com>
* Text changes for 2.3 release
Signed-off-by: Cole Nelson <cole.nelson@amd.com>
* Test updates for release 2.3
Signed-off-by: Cole Nelson <cole.nelson@amd.com>
* Update links for release 2.3
Signed-off-by: Cole Nelson <cole.nelson@amd.com>
* README.md: Update links to ROCm 2.3 repos
Change-Id: I49c6ca76deb61afeaa90fa7e4af6f94bf3914768
Signed-off-by: Cole Nelson <cole.nelson@amd.com>
* README.md: Update more links for release 2.3
Signed-off-by: Cole Nelson <cole.nelson@amd.com>
* Text changes and URL updates for 2.3 release
Signed-off-by: Cole Nelson <cole.nelson@amd.com>
* Text changes for 2.3 release
Signed-off-by: Cole Nelson <cole.nelson@amd.com>
* Test updates for release 2.3
Signed-off-by: Cole Nelson <cole.nelson@amd.com>
Community feedback has pointed out a number of confusing,
outdated, or missing sections in our ROCm README file. For example,
we do not describe what our ROCm package structure is, or how the
packages and meta-packages fit together. This can make it confusing
for users who do not want to just install rocm-dkms and move on.
Our repo manifest (default.xml) is severely out of date. It is
missing almost all of the current ROCm projects, and it always
pulls from the main development branch. This means we do not have
a pinned manifest that allows you to pull the code from a
particular ROCm release. Manifest updated, and the section of the
README discussing it is majorly overhauled (including links for
information/scripts about building the code after downloading it).
Rather than continually grow our version history in the main
README page, this splits off old version information into its own
file.
* Updated doc on OS support
This commit specifies the ROCm recommended Ubuntu kernel versions.
It also advises users to remove ROCm packages before upgrading CentOS versions. Known DKMS limitations can cause the system upgrade to fail if rock-dkms modules are installed.
* ROCm 1.9 changes
* Update README.md
Update ROCr Debug Agent description
Revised wording for upstream KFD per request
* Update installation instruction
Added instructions to uninstall the previous version of ROCm before installing a new version. Added Ubuntu 18.04 as a supported distribution.
* Update README to better list supported hardware.
* Add a table of contents to the README
* ROCm 1.9 changes
Update ROCr Debug Agent description
* Update README.md
Added instructions to uninstall the previous version of ROCm before installing a new version. Added Ubuntu 18.04 as a supported distribution.
* ROCm 1.9 changes
* Update installation instruction
Added instructions to uninstall the previous version of ROCm before installing a new version. Added Ubuntu 18.04 as a supported distribution.
* Update README.md
AMD values and encourages contributions to our code and documentation. If you want to contribute
to our ROCm repositories, first review the following guidance. For documentation-specific information,
see [Contributing to ROCm docs](https://rocm.docs.amd.com/en/latest/contribute/contributing.html).
ROCm is a software stack made up of a collection of drivers, development tools, and APIs that enable
GPU programming from low-level kernel to end-user applications. Because some of our components
are inherited from external projects (such as
[LLVM](https://github.com/ROCm/llvm-project) and
[Kernel driver](https://github.com/ROCm/ROCK-Kernel-Driver)), these use
project-specific contribution guidelines and workflow. Refer to their repositories for more information.
All other ROCm components follow the workflow described in the following sections.
## Development workflow
ROCm uses GitHub to host code, collaborate, and manage version control. We use pull requests (PRs)
for all changes within our repositories. We use
[GitHub issues](https://github.com/ROCm/ROCm/issues) to track known issues, such as
bugs.
### Issue tracking
Before filing a new issue, search the
[existing issues](https://github.com/ROCm/ROCm/issues) to make sure your issue isn't
already listed.
General issue guidelines:
* Use your best judgement for issue creation. If your issue is already listed, upvote the issue and
comment or post to provide additional details, such as how you reproduced this issue.
* If you're not sure if your issue is the same, err on the side of caution and file your issue.
You can add a comment to include the issue number (and link) for the similar issue. If we evaluate
your issue as being the same as the existing issue, we'll close the duplicate.
* If your issue doesn't exist, use the issue template to file a new issue.
* When filing an issue, be sure to provide as much information as possible, including script output so
we can collect information about your configuration. This helps reduce the time required to
reproduce your issue.
* Check your issue regularly, as we may require additional information to successfully reproduce the
issue.
### Pull requests
When you create a pull request, you should target the default branch. Our repositories typically use the **develop** branch as the default integration branch.
When creating a PR, use the following process. Note that each repository may include additional,
project-specific steps. Refer to each repository's PR process for any additional steps.
* Identify the issue you want to fix
* Target the default branch (usually the **develop** branch) for integration
* Ensure your code builds successfully
* Each component has a suite of test cases to run; include the log of the successful test run in your PR
* Do not break existing test cases
* New functionality is only merged with new unit tests
* If your PR includes a new feature, you must provide an application or test so we can ensure that the
feature works and continues to be valid in the future
* Tests must have good code coverage
* Submit your PR and work with the reviewer or maintainer to get your PR approved
* Once approved, the PR is brought onto internal CI systems and may be merged into the component
during our release cycle, as coordinated by the maintainer
* We'll inform you once your change is committed
> [!IMPORTANT]
> By creating a PR, you agree to allow your contribution to be licensed under the
> terms of the LICENSE.txt file in the corresponding repository. Different repositories may use different
> licenses.
You can look up each license on the [ROCm licensing](https://rocm.docs.amd.com/en/latest/about/license.html) page.
### New feature development
Use the [GitHub Discussion forum](https://github.com/ROCm/ROCm/discussions)
(Ideas category) to propose new features. Our maintainers are happy to provide direction and
feedback on feature development.
### Documentation
Submit ROCm documentation changes to our
[documentation repository](https://github.com/ROCm/ROCm). You must update
documentation related to any new feature or API contribution.
Note that each ROCm project uses its own repository for documentation.
## Future development workflow
The current ROCm development workflow is GitHub-based. If, in the future, we change this platform,
the tools and links may change. In this instance, we will update contribution guidelines accordingly.
The ROCm Platform brings a rich foundation to advanced computing by seamlessly
integrating the CPU and GPU with the goal of solving real-world problems.
# AMD ROCm Software
#### Supported CPUs
ROCm is an open-source stack, composed primarily of open-source software, designed for graphics
processing unit (GPU) computation. ROCm consists of a collection of drivers, development tools, and
APIs that enable GPU programming from low-level kernel to end-user applications.
Starting with ROCm 1.8, we have relaxed the requirements for PCIe Atomics on Vega 10 (GFX9) GPUs, and we have similarly opened up more options for number of PCIe lanes. With this release, these GFX9 GPUs can support CPUs without PCIe Atomics and, for example, run on PCIe Gen2 x1 lanes. To enable this option, please set the environment variable `HSA_ENABLE_SDMA=0`.
With ROCm, you can customize your GPU software to meet your specific needs. You can develop,
collaborate, test, and deploy your applications in a free, open source, integrated, and secure software
ecosystem. ROCm is particularly well-suited to GPU-accelerated high-performance computing (HPC),
artificial intelligence (AI), scientific computing, and computer aided design (CAD).
Currently, our GFX8 GPUs (Fiji & Polaris family) still need to use PCIe Gen 3 and PCIe Atomics, but we are looking at relaxing this in a future release, once we have fully tested firmware.
ROCm is powered by AMD’s
[Heterogeneous-computing Interface for Portability (HIP)](https://github.com/ROCm/HIP),
an open-source software C++ GPU programming environment and its corresponding runtime. HIP
allows ROCm developers to create portable applications on different platforms by deploying code on a
range of platforms, from dedicated gaming GPUs to exascale HPC clusters.
Current CPUs which support PCIe Gen3 + PCIe Atomics are:
* AMD Ryzen CPUs;
* AMD EPYC CPUs;
* Intel Xeon E7 V3 or newer CPUs;
* Intel Xeon E5 v3 or newer CPUs;
* Intel Xeon E3 v3 or newer CPUs;
* Intel Core i7 v4, Core i5 v4, Core i3 v4 or newer CPUs (i.e. Haswell family or newer).
ROCm supports programming models, such as OpenMP and OpenCL, and includes all necessary open
source software compilers, debuggers, and libraries. ROCm is fully integrated into machine learning
(ML) frameworks, such as PyTorch and TensorFlow.
For Fiji and Polaris GPUs, the ROCm platform leverages PCIe Atomics (Fetch and Add, Compare and Swap,
Unconditional Swap, AtomicsOp Completion).
PCIe Atomics are only supported on PCIe Gen3 enabled CPUs and PCIe Gen3 switches like
Broadcom PLX. When you install your GPUs, make sure you install them in a full
PCIe Gen3 x16, x8, x4, or x1 slot attached either directly to the CPU's Root I/O
controller or via a PCIe switch directly attached to the CPU's Root I/O
controller. In our experience, many issues stem from trying to use consumer
motherboards which provide physical x16 connectors that are electrically
connected as e.g. PCIe Gen2 x4, PCIe slots connected via the
Southbridge PCIe I/O controller, or PCIe slots connected through a PCIe switch that does
not support PCIe atomics.
Experimental support for our Hawaii (GFX7) GPUs (Radeon R9 290, R9 390, FirePro W9100, S9150, S9170)
does not require or take advantage of PCIe Atomics. However, we still recommend that you use a CPU
from the list provided above for compatibility purposes.
> [!IMPORTANT]
> A new open source build platform for ROCm is under development at
> https://github.com/ROCm/TheRock, featuring a unified CMake build with bundled
> dependencies, Windows support, and more.
#### Not supported or very limited support under ROCm
###### Limited support
## Getting and Building ROCm from Source
* ROCm 1.8 and Vega10 should support PCIe Gen2 enabled CPUs such as the AMD Opteron, Phenom, Phenom II, Athlon, Athlon X2, Athlon II and older Intel Xeon and Intel Core Architecture and Pentium CPUs. However, we have done very limited testing on these configurations, since our test farm has been catering to the CPUs listed above. This is where we need community support; if you find problems on such setups, please report these issues.
* Thunderbolt 1, 2, and 3 enabled breakout boxes should now be able to work with ROCm. Thunderbolt 1 and 2 are PCIe Gen2 based, and thus are only supported with GPUs that do not require PCIe Gen 3 atomics (i.e. Vega 10). However, we have done no testing on this configuration and would need community support due to limited access to this type of equipment.
Please use [TheRock](https://github.com/ROCm/TheRock) build system to build ROCm from source.
###### Not supported
## ROCm documentation
* We do not support GFX8-class GPUs (Fiji, Polaris, etc.) on CPUs that do not have PCIe Gen 3 with PCIe atomics.
* As such, we do not support AMD Carrizo and Kaveri APUs as hosts for such GPUs.
* GFX8 GPUs are not supported over Thunderbolt 1 and 2 connections on ROCm, because Thunderbolt 1 and 2 are PCIe Gen2 based.
* AMD Carrizo based APUs have limited support due to OEM & ODM's choices when it comes to some key configuration parameters. In particular, we have observed that Carrizo laptops, AIOs, and desktop systems showed inconsistencies in exposing and enabling the System BIOS parameters required by the ROCm stack. Before purchasing a Carrizo system for ROCm, please verify that the BIOS provides an option for enabling IOMMUv2 and that the system BIOS properly exposes the correct CRAT table - please inquire with the OEM about the latter.
* AMD Merlin/Falcon Embedded System is not currently supported by the public repo.
* AMD Raven Ridge APUs are currently not supported.
This repository contains the [manifest file](https://gerrit.googlesource.com/git-repo/+/HEAD/docs/manifest-format.md)
for ROCm releases, changelogs, and release information.
### New features to ROCm 1.8.3
The `default.xml` file contains information for all repositories and the associated commit used to build
the current ROCm release; `default.xml` uses the [Manifest Format repository](https://gerrit.googlesource.com/git-repo/).
* ROCm 1.8.3 is a minor update meant to fix compatibility issues on Ubuntu releases running kernel 4.15.0-33
Source code for our documentation is located in the `/docs` folder of most ROCm repositories. The
`develop` branch of our repositories contains content for the next ROCm release.
### New features as of ROCm 1.8.2
The ROCm documentation homepage is [rocm.docs.amd.com](https://rocm.docs.amd.com).
#### DKMS driver installation
For information on how to contribute to the ROCm documentation, see [Contributing to the ROCm documentation](https://rocm.docs.amd.com/en/latest/contribute/contributing.html).
* Debian packages are provided for DKMS on Ubuntu
* RPM packages are provided for CentOS/RHEL 7.4 and 7.5 support
* See the [ROCT-Thunk-Interface](https://github.com/RadeonOpenCompute/ROCT-Thunk-Interface/tree/roc-1.8.x) and [ROCK-Kernel-Driver](https://github.com/RadeonOpenCompute/ROCK-Kernel-Driver/tree/roc-1.8.x) for additional documentation on driver setup
## Older ROCm releases
#### New distribution support
* Binary package support for Ubuntu 16.04
* Binary package support for CentOS 7.4 and 7.5
* Binary package support for RHEL 7.4 and 7.5
#### Improved OpenMPI via UCX support
* UCX support for OpenMPI
* ROCm RDMA
### The latest ROCm platform - ROCm 1.8.3
The latest tested versions of the drivers, tools, libraries and source code for
the ROCm platform have been released and are available under the roc-1.8.x or rocm-1.8.x tag.
Next, update the apt repository list and install the rocm package:
>**Warning**: Before proceeding, make sure to completely
>[uninstall any previous ROCm package](https://github.com/RadeonOpenCompute/ROCm#removing-pre-release-packages):
```shell
sudo apt update
sudo apt install rocm-dkms
```
###### Next set your permissions
With the move to upstreaming the KFD driver and the support of DKMS, all console (headless) users must be added to the "video" group by setting the Unix permissions.
Configure
Ensure that your user account is a member of the "video" group prior to using the ROCm driver. You can find which groups you are a member of with the following command:
```shell
groups
```
To add yourself to the video group you will need the sudo password and can use the following command:
```shell
sudo usermod -a -G video $LOGNAME
```
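The membership check above can also be scripted. The following is a minimal sketch, not part of the ROCm tooling: `has_group` is a hypothetical helper name, and the script only reports what it finds rather than changing any account.

```shell
# Hypothetical helper: succeeds if the group named in $1 appears in the
# whitespace-separated group list passed as $2
has_group() {
    local wanted="$1" list="$2" g
    for g in $list; do
        if [ "$g" = "$wanted" ]; then
            return 0
        fi
    done
    return 1
}

# `id -nG` prints the names of all groups the current user belongs to
if has_group video "$(id -nG)"; then
    echo "video group: ok"
else
    echo "video group: missing; run: sudo usermod -a -G video \$LOGNAME"
fi
```

Group changes made with `usermod` typically take effect only after you log out and back in.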
You may want to ensure that any future users you add to your system are put into the "video" group by default. To do that, you can run the following commands:
```shell
echo 'ADD_EXTRA_GROUPS=1' | sudo tee -a /etc/adduser.conf
echo 'EXTRA_GROUPS=video' | sudo tee -a /etc/adduser.conf
```
Once complete, reboot your system.
After rebooting, run the following commands to verify that the ROCm installation was successful. If you see your GPUs listed by both of these commands, you should be ready to go!
```shell
/opt/rocm/bin/rocminfo
/opt/rocm/opencl/bin/x86_64/clinfo
```
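Before running the two verification commands, it can help to confirm that the binaries exist at the paths shown above. This is a small sketch under that assumption; `check_tool` is a hypothetical helper, and the paths are the ROCm 1.8 defaults from this section, which may differ on your system.

```shell
# Hypothetical helper: report whether an executable exists at the given path
check_tool() {
    if [ -x "$1" ]; then
        echo "found: $1"
    else
        echo "missing: $1"
    fi
}

check_tool /opt/rocm/bin/rocminfo
check_tool /opt/rocm/opencl/bin/x86_64/clinfo
```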
Note that, to make running ROCm programs easier, you may wish to put the ROCm libraries in your LD_LIBRARY_PATH environment variable and the ROCm binaries in your PATH.
```shell
echo 'export LD_LIBRARY_PATH=/opt/rocm/opencl/lib/x86_64:/opt/rocm/hsa/lib:$LD_LIBRARY_PATH' | sudo tee -a /etc/profile.d/rocm.sh
echo 'export PATH=$PATH:/opt/rocm/bin:/opt/rocm/profiler/bin:/opt/rocm/opencl/bin/x86_64' | sudo tee -a /etc/profile.d/rocm.sh
```
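If you would rather adjust `PATH` for the current session only, or want to avoid adding the same directory twice, a guard like the one below is one option. It is an illustrative sketch: `append_path` is a hypothetical helper, not something ROCm provides.

```shell
# Hypothetical helper: print the path list $2 with directory $1 appended,
# unless $1 is already one of its entries
append_path() {
    case ":$2:" in
        *":$1:"*) printf '%s\n' "$2" ;;
        *) printf '%s\n' "${2:+$2:}$1" ;;
    esac
}

# Add the ROCm binaries to PATH for the current shell only
PATH="$(append_path /opt/rocm/bin "$PATH")"
export PATH
```

The same helper can be applied to `LD_LIBRARY_PATH`, since both variables are colon-separated lists.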
If you have an installation issue, please read the [Install Issue](https://rocm.github.io/install_issues.html) FAQ.
###### Vega10 users who want to run ROCm on a system that does not support PCIe atomics must set HSA_ENABLE_SDMA=0
Currently, if you want to run ROCm on a Vega10 GPU (GFX9) on a system without PCIe atomics, you must turn off SDMA functionality.
```shell
export HSA_ENABLE_SDMA=0
```
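`export` affects every program started from the shell afterwards. If you want to turn off SDMA for a single program only, standard shell syntax lets you set the variable on that one command line. The sketch below uses `sh -c` as a stand-in for a ROCm application so that it can run on any machine.

```shell
# Set the variable for one command only; `sh -c` stands in for a ROCm application
HSA_ENABLE_SDMA=0 sh -c 'echo "child sees HSA_ENABLE_SDMA=$HSA_ENABLE_SDMA"'

# The calling shell is unchanged unless the variable was exported earlier
echo "parent sees HSA_ENABLE_SDMA=${HSA_ENABLE_SDMA:-unset}"
```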
###### Performing an OpenCL-only Installation of ROCm
Some users may want to install a subset of the full ROCm installation. In particular, if you are trying to install on a system with a limited amount of storage space, or which will only run a small collection of known applications, you may want to install only the packages that are required to run OpenCL applications. To do that, you can run the following installation command **instead** of the command to install `rocm-dkms`.
At this point, the system can install ROCm using the DKMS drivers.
Installing ROCm on the system
At this point ROCm can be installed on the target system. Create a /etc/yum.repos.d/rocm.repo file with the following contents:
```shell
[ROCm]
name=ROCm
baseurl=http://repo.radeon.com/rocm/yum/rpm
enabled=1
gpgcheck=0
```
The repo's URL should point to the location of the repository's repodata database. Install ROCm components using this command:
```shell
sudo yum install rocm-dkms
```
The rock-dkms component should be installed and the /dev/kfd device should be available on reboot.
Ensure that your user account is a member of the "video" or "wheel" group prior to using the ROCm driver.
You can find which groups you are a member of with the following command:
```shell
groups
```
To add yourself to the video (or wheel) group you will need the sudo password and can use the
following command:
```shell
sudo usermod -a -G video $LOGNAME
```
The current release supports CentOS/RHEL 7.4 and 7.5. Users should update to the latest version of the OS:
```shell
sudo yum update
```
###### Vega10 users who want to run ROCm on a system that does not support PCIe atomics must set HSA_ENABLE_SDMA=0
Currently, if you want to run ROCm on a Vega10 GPU (GFX9) on a system without PCIe atomics, you must turn off SDMA functionality.
```shell
export HSA_ENABLE_SDMA=0
```
#### Compiling applications using hcc, hip, etc.
To compile applications or samples, please use gcc-7.2 provided by the devtoolset-7 environment.
To do this, compile all applications after running this command:
```shell
scl enable devtoolset-7 bash
```
#### How to un-install ROCm from CentOS/RHEL 7.4
To un-install the entire rocm development package execute:
```shell
sudo yum autoremove rocm-dkms
```
#### Known Issues / Workarounds
##### If you Plan to Run with X11 - we are seeing X freezes under load
In ROCm 1.8.3, the kernel parameter `noretry` is set to 1 to improve overall system performance. However, this has been shown to make the graphics driver shipped with Ubuntu unstable. This is an ongoing issue and we are looking into it.
In the meantime, try setting the `noretry` parameter to 0:
```shell
echo 0 | sudo tee /sys/module/amdkfd/parameters/noretry
```
Files under `/sys` are not preserved across reboots, so you need to reapply this setting after every boot.
One way to make `noretry=0` persistent is to edit `/etc/modprobe.d/amdkfd.conf` so that it contains the line `options amdkfd noretry=0`.
Once that is done, run `sudo update-initramfs -u`. Reboot and verify that `/sys/module/amdkfd/parameters/noretry` stays at 0.
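To confirm the value after a reboot, you can read the parameter file directly. The following is a minimal sketch; `show_param` is a hypothetical helper that prints a note instead of failing when the `amdkfd` module is not loaded.

```shell
# Hypothetical helper: print "<name>=<value>" for a module parameter file,
# or a note if the file cannot be read
show_param() {
    if [ -r "$1" ]; then
        echo "$(basename "$1")=$(cat "$1")"
    else
        echo "$(basename "$1"): not available (is the module loaded?)"
    fi
}

show_param /sys/module/amdkfd/parameters/noretry
```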
##### If you are using hipCaffe AlexNet training on ImageNet - we are seeing sporadic hangs of hipCaffe during training
###### Vega10 users who want to run ROCm on a system that does not support PCIe atomics must set HSA_ENABLE_SDMA=0
Currently, if you want to run ROCm on a Vega10 GPU (GFX9) on a system without PCIe atomics, you must turn off SDMA functionality.
```shell
export HSA_ENABLE_SDMA=0
```
#### Closed source components
The ROCm platform relies on a few closed source components to provide legacy
functionality like HSAIL finalization and debugging/profiling support. These
components are only available through the ROCm repositories, and will either be
deprecated or become open source components in the future. These components are
made available in the following packages:
* hsa-ext-rocr-dev
### Getting ROCm source code
Modifications can be made to the ROCm 1.8 components by modifying the open
source code base and rebuilding the components. Source code can be cloned from
each of the GitHub repositories using git, or users can use the repo command
and the ROCm 1.8 manifest file to download the entire ROCm 1.8 source code.
#### Installing repo
Google's repo tool allows you to manage multiple git repositories
simultaneously. You can install it by executing the following commands:
| [ROCK-Kernel-Driver](https://github.com/ROCm/ROCK-Kernel-Driver/) | [GPL 2.0 WITH Linux-syscall-note](https://github.com/ROCm/ROCK-Kernel-Driver/blob/master/COPYING) |
| [rocminfo](https://github.com/ROCm/rocm-systems/tree/develop/projects/rocminfo/) | [The University of Illinois/NCSA](https://github.com/ROCm/rocm-systems/blob/develop/projects/rocminfo/License.txt) |
| [ROCm Data Center (RDC)](https://github.com/ROCm/rocm-systems/tree/develop/projects/rdc/) | [MIT](https://github.com/ROCm/rocm-systems/blob/develop/projects/rdc/LICENSE.md) |
| [ROCm-Device-Libs](https://github.com/ROCm/llvm-project/tree/amd-staging/amd/device-libs) | [The University of Illinois/NCSA](https://github.com/ROCm/llvm-project/blob/amd-staging/amd/device-libs/LICENSE.TXT) |
| [ROCr Debug Agent](https://github.com/ROCm/rocr_debug_agent/) | [The University of Illinois/NCSA](https://github.com/ROCm/rocr_debug_agent/blob/amd-staging/LICENSE.txt) |
| [ROCR-Runtime](https://github.com/ROCm/rocm-systems/tree/develop/projects/rocr-runtime/) | [The University of Illinois/NCSA](https://github.com/ROCm/rocm-systems/blob/develop/projects/rocr-runtime/LICENSE.txt) |
The information presented in this document is for informational purposes only
and may contain technical inaccuracies, omissions, and typographical errors. The
information contained herein is subject to change and may be rendered inaccurate
for many reasons, including but not limited to product and roadmap changes,
component and motherboard version changes, new model and/or product releases,
product differences between differing manufacturers, software changes, BIOS
flashes, firmware upgrades, or the like. Any computer system has risks of
security vulnerabilities that cannot be completely prevented or mitigated. AMD
assumes no obligation to update or otherwise correct or revise this information.
However, AMD reserves the right to revise this information and to make changes
from time to time to the content hereof without obligation of AMD to notify any
person of such revisions or changes.
THIS INFORMATION IS PROVIDED “AS IS.” AMD MAKES NO REPRESENTATIONS OR WARRANTIES
WITH RESPECT TO THE CONTENTS HEREOF AND ASSUMES NO RESPONSIBILITY FOR ANY
INACCURACIES, ERRORS, OR OMISSIONS THAT MAY APPEAR IN THIS INFORMATION. AMD
SPECIFICALLY DISCLAIMS ANY IMPLIED WARRANTIES OF NON-INFRINGEMENT,
MERCHANTABILITY, OR FITNESS FOR ANY PARTICULAR PURPOSE. IN NO EVENT WILL AMD BE
LIABLE TO ANY PERSON FOR ANY RELIANCE, DIRECT, INDIRECT, SPECIAL, OR OTHER
CONSEQUENTIAL DAMAGES ARISING FROM THE USE OF ANY INFORMATION CONTAINED HEREIN,
EVEN IF AMD IS EXPRESSLY ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.
AMD, the AMD Arrow logo, ROCm, and combinations thereof are trademarks of
Advanced Micro Devices, Inc. Other product names used in this publication are
for identification purposes only and may be trademarks of their respective
companies.
### Package licensing
:::{attention}
ROCprof Trace Decoder and AOCC CPU optimizations are provided in binary form, subject to the license agreement enclosed on [GitHub](https://github.com/ROCm/rocprof-trace-decoder/blob/amd-mainline/LICENSE) for ROCprof Trace Decoder, and [Developer Central](https://www.amd.com/en/developer/aocc.html) for AOCC. By using, installing,
copying or distributing ROCprof Trace Decoder or AOCC CPU Optimizations, you agree to
the terms and conditions of this license agreement. If you do not agree to the
terms of this agreement, do not install, copy or use ROCprof Trace Decoder or the
AOCC CPU Optimizations.
:::
For the rest of the ROCm packages, you can find the licensing information at the
following location: `/opt/rocm/share/doc/<component-name>/` or in the locations
specified in the preceding table.
For example, you can fetch the licensing information of the `amd_comgr`
component (Code Object Manager) from the `/opt/rocm/share/doc/amd_comgr/LICENSE.txt` file.
,"Oracle Linux 10, 9, 8","Oracle Linux 10, 9, 8","Oracle Linux 10, 9, 8","Oracle Linux 10, 9, 8","Oracle Linux 9, 8","Oracle Linux 9, 8","Oracle Linux 9, 8","Oracle Linux 9, 8","Oracle Linux 9, 8",Oracle Linux 8.10,Oracle Linux 8.10,Oracle Linux 8.10,Oracle Linux 8.10,Oracle Linux 8.9,Oracle Linux 8.9,Oracle Linux 8.9,Oracle Linux 8.9,Oracle Linux 8.9,Oracle Linux 8.9,Oracle Linux 8.9,,,
:doc:`ROCm Data Center Tool <rdc:index>`,1.2.0,1.2.0,1.2.0,1.1.0,1.1.0,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0
:doc:`ROCm Validation Suite <rocmvalidationsuite:index>`,1.3.0,1.3.0,1.2.0,1.2.0,1.2.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.0.60204,1.0.60202,1.0.60201,1.0.60200,1.0.60105,1.0.60102,1.0.60101,1.0.60100,1.0.60002,1.0.60000
,,,,,,,,,,,,,,,,,,,,,,,
PERFORMANCE TOOLS,,,,,,,,,,,,,,,,,,,,,,,
:doc:`ROCm Bandwidth Test <rocm_bandwidth_test:index>`,2.6.0,2.6.0,2.6.0,2.6.0,2.6.0,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0
:doc:`ROCm Systems Profiler <rocprofiler-systems:index>`,1.3.0,1.2.1,1.2.0,1.1.1,1.1.0,1.0.2,1.0.2,1.0.1,1.0.0,0.1.2,0.1.1,0.1.0,0.1.0,1.11.2,1.11.2,1.11.2,1.11.2,N/A,N/A,N/A,N/A,N/A,N/A
.. [#os-compatibility] Some operating systems are supported on specific GPUs. For detailed information about operating systems supported on ROCm 7.2.0, see the latest :ref:`supported_distributions`. For version specific information, see `ROCm 7.1.1 <https://rocm.docs.amd.com/projects/install-on-linux/en/docs-7.1.1/reference/system-requirements.html#supported-operating-systems>`__, and `ROCm 6.4.0 <https://rocm.docs.amd.com/projects/install-on-linux/en/docs-6.4.0/reference/system-requirements.html#supported-operating-systems>`__.
.. [#gpu-compatibility] Some GPUs have limited operating system support. For detailed information about GPUs supporting ROCm 7.2.0, see the latest :ref:`supported_GPUs`. For version specific information, see `ROCm 7.1.1 <https://rocm.docs.amd.com/projects/install-on-linux/en/docs-7.1.1/reference/system-requirements.html#supported-gpus>`__, `ROCm 7.1.0 <https://rocm.docs.amd.com/projects/install-on-linux/en/docs-7.1.0/reference/system-requirements.html#supported-gpus>`__, and `ROCm 6.4.0 <https://rocm.docs.amd.com/projects/install-on-linux/en/docs-6.4.0/reference/system-requirements.html#supported-gpus>`__.
.. [#dgl_compat] DGL is only supported on ROCm 7.0.0, 6.4.3 and 6.4.0.
.. [#llama-cpp_compat] llama.cpp is only supported on ROCm 7.0.0 and 6.4.x.
.. [#flashinfer_compat] FlashInfer is only supported on ROCm 7.1.1 and 6.4.1.
.. [#mi325x_KVM] For AMD Instinct MI325X KVM SR-IOV users, do not use AMD GPU Driver (amdgpu) 30.20.0.
.. [#driver_patch] AMD GPU Driver (amdgpu) 30.10.1 is a quality release that resolves an issue identified in the 30.10 release. There are no other significant changes or feature additions in ROCm 7.0.1 from ROCm 7.0.0. AMD GPU Driver (amdgpu) 30.10.1 is compatible with ROCm 7.0.1 and ROCm 7.0.0.
.. [#kfd_support] As of ROCm 6.4.0, forward and backward compatibility between the AMD GPU Driver (amdgpu) and its user space software is provided up to a year apart. For earlier ROCm releases, the compatibility is provided for +/- 2 releases. The supported user space versions on this page were accurate as of the time of initial ROCm release. For the most up-to-date information, see the latest version of this information at `User and AMD GPU Driver support matrix <https://rocm.docs.amd.com/projects/install-on-linux/en/latest/reference/user-kernel-space-compat-matrix.html>`_.
.. [#ROCT-rocr] Starting from ROCm 6.3.0, the ROCT Thunk Interface is included as part of the ROCr runtime package.
.. _OS-kernel-versions:
Operating systems, kernel and Glibc versions
*********************************************
For detailed information on the operating systems supported on ROCm 7.2.0 and the associated kernel and Glibc versions, see the latest :ref:`supported_distributions`. For version specific information, see `ROCm 7.1.1 <https://rocm.docs.amd.com/projects/install-on-linux/en/docs-7.1.1/reference/system-requirements.html#supported-operating-systems>`__, and `ROCm 6.4.0 <https://rocm.docs.amd.com/projects/install-on-linux/en/docs-6.4.0/reference/system-requirements.html#supported-operating-systems>`__.
.. note::
   * See `Red Hat Enterprise Linux Release Dates <https://access.redhat.com/articles/3078>`_ to learn about the specific kernel versions supported on Red Hat Enterprise Linux (RHEL).
   * See `List of SUSE Linux Enterprise Server kernel <https://www.suse.com/support/kb/doc/?id=000019587>`_ to learn about the specific kernel version supported on SUSE Linux Enterprise Server (SLES).
..
   Footnotes and ref anchors in the historical tables below should be appended with "-past-60" to differentiate them from the
   footnote references in the latest compatibility matrix above. This also makes them easy to find and replace.
   An easy way to work is to download the historical .csv file and open it in Excel. Then, when the content is ready,
   delete the columns you don't need to build the current compatibility matrix for the table above. Find and replace all
   instances of "-past-60" to make it ready for the table above.
You can `download the entire .csv <../downloads/compatibility-matrix-historical-6.0.csv>`_ for offline reference.
.. csv-table::
   :file: compatibility-matrix-historical-6.0.csv
   :header-rows: 1
   :stub-columns: 1
.. rubric:: Footnotes
.. [#os-compatibility-past-60] Some operating systems are supported on specific GPUs. For detailed information, see :ref:`supported_distributions` and select the required ROCm version for version specific support.
.. [#gpu-compatibility-past-60] Some GPUs have limited operating system support. For detailed information, see :ref:`supported_GPUs` and select the required ROCm version for version specific support.
.. [#tf-mi350-past-60] TensorFlow 2.17.1 is not supported on AMD Instinct MI350 Series GPUs. Use TensorFlow 2.19.1 or 2.18.1 with MI350 Series GPUs instead.
.. [#verl_compat-past-60] verl is only supported on ROCm 7.0.0 and 6.2.0.
.. [#stanford-megatron-lm_compat-past-60] Stanford Megatron-LM is only supported on ROCm 6.3.0.
.. [#dgl_compat-past-60] DGL is only supported on ROCm 7.0.0, 6.4.3 and 6.4.0.
.. [#megablocks_compat-past-60] Megablocks is only supported on ROCm 6.3.0.
.. [#ray_compat-past-60] Ray is only supported on ROCm 7.0.0 and 6.4.1.
.. [#llama-cpp_compat-past-60] llama.cpp is only supported on ROCm 7.0.0 and 6.4.x.
.. [#flashinfer_compat-past-60] FlashInfer is only supported on ROCm 7.1.1 and 6.4.1.
.. [#mi325x_KVM-past-60] For AMD Instinct MI325X KVM SR-IOV users, do not use AMD GPU Driver (amdgpu) 30.20.0.
.. [#driver_patch-past-60] AMD GPU Driver (amdgpu) 30.10.1 is a quality release that resolves an issue identified in the 30.10 release. There are no other significant changes or feature additions in ROCm 7.0.1 from ROCm 7.0.0. AMD GPU Driver (amdgpu) 30.10.1 is compatible with ROCm 7.0.1 and ROCm 7.0.0.
.. [#kfd_support-past-60] As of ROCm 6.4.0, forward and backward compatibility between the AMD GPU Driver (amdgpu) and its user space software is provided up to a year apart. For earlier ROCm releases, the compatibility is provided for +/- 2 releases. The supported user space versions on this page were accurate as of the time of initial ROCm release. For the most up-to-date information, see the latest version of this information at `User and AMD GPU Driver support matrix <https://rocm.docs.amd.com/projects/install-on-linux/en/latest/reference/user-kernel-space-compat-matrix.html>`_.
.. [#ROCT-rocr-past-60] Starting from ROCm 6.3.0, the ROCT Thunk Interface is included as part of the ROCr runtime package.
llama.cpp can be applied in a variety of scenarios, particularly when you need to meet one or more of the following requirements:
- Plain C/C++ implementation with no external dependencies
- Support for 1.5-bit, 2-bit, 3-bit, 4-bit, 5-bit, 6-bit, and 8-bit integer quantization for faster inference and reduced memory usage
- Custom HIP (Heterogeneous-compute Interface for Portability) kernels for running large language models (LLMs) on AMD GPUs (graphics processing units)
- CPU (central processing unit) + GPU (graphics processing unit) hybrid inference for partially accelerating models larger than the total available VRAM (video random-access memory)
llama.cpp is also used in a range of real-world applications, including:
- Games such as `Lucy's Labyrinth <https://github.com/MorganRO8/Lucys_Labyrinth>`__:
A simple maze game where AI-controlled agents attempt to trick the player.
- Tools such as `Styled Lines <https://marketplace.unity.com/packages/tools/ai-ml-integration/style-text-webgl-ios-stand-alone-llm-llama-cpp-wrapper-292902>`__:
A proprietary, asynchronous inference wrapper for Unity3D game development, including pre-built mobile and web platform wrappers and a model example.
- Various other AI applications use llama.cpp as their inference engine;
for a detailed list, see the `user interfaces (UIs) section <https://github.com/ggml-org/llama.cpp?tab=readme-ov-file#description>`__.
For more use cases and recommendations, refer to the `AMD ROCm blog <https://rocm.blogs.amd.com/>`__,
where you can search for llama.cpp examples and best practices to optimize your workloads on AMD GPUs.
- The `Llama.cpp Meets Instinct: A New Era of Open-Source AI Acceleration <https://rocm.blogs.amd.com/ecosystems-and-partners/llama-cpp/README.html>`__
blog post outlines how the open-source llama.cpp framework enables efficient LLM inference—including interactive inference with ``llama-cli``,
server deployment with ``llama-server``, GGUF model preparation and quantization, performance benchmarking, and optimizations tailored for
<meta name="description" content="ROCm Linux Filesystem Hierarchy Standard reorganization">
<meta name="keywords" content="FHS, Linux Filesystem Hierarchy Standard, directory structure,
AMD, ROCm">
</head>
# ROCm Linux Filesystem Hierarchy Standard reorganization
## Introduction
The ROCm software has adopted the [Linux Filesystem Hierarchy Standard (FHS)](https://refspecs.linuxfoundation.org/FHS_3.0/fhs/index.html) to ensure ROCm is consistent with standard open source conventions. The following sections specify how current and future releases of ROCm adhere to FHS, how the previous ROCm file system is supported, and how improved versioning specifications are applied to ROCm.
## Adopting the FHS
To standardize the ROCm directory structure and directory content layout, ROCm has adopted the [FHS](https://refspecs.linuxfoundation.org/FHS_3.0/fhs/index.html), adhering to open source conventions for Linux-based distributions. FHS ensures internal consistency within the ROCm stack, as well as external consistency with other systems and distributions. The proposed ROCm file structure is outlined below:
```text
              | -- architecture dependent libraries and binaries used internally by components
         | -- cmake
              | -- <component>
                   | -- <component>-config.cmake
    | -- libexec
         | -- <component>
              | -- non ISA/architecture independent executables used internally by components
    | -- include
         | -- <component>
              | -- public header files
    | -- share
         | -- html
              | -- <component>
                   | -- html documentation
         | -- info
              | -- <component>
                   | -- info files
         | -- man
              | -- <component>
                   | -- man pages
         | -- doc
              | -- <component>
                   | -- license files
         | -- <component>
              | -- samples
              | -- architecture independent misc files
```
## Changes from earlier ROCm versions
The following table provides a brief overview of the new ROCm FHS layout, compared to the layout of earlier ROCm versions. Note that /opt/ is used to denote the default rocm-installation-path and should be replaced in case of a non-standard installation location of the ROCm distribution.
The FHS file organization for ROCm was first introduced in the release of ROCm 5.2. Backward compatibility was implemented to make sure users could still run their ROCm applications while transitioning to the new FHS. ROCm has moved header files and libraries to their new locations as indicated in the above structure, and included symbolic links and wrapper header files in their old location for backward compatibility. The following sections detail the ROCm backward compatibility implementation for wrapper header files, executable files, library files and CMake config files.
### Wrapper header files
Wrapper header files are placed in the old location
(`/opt/rocm-<ver>/<component>/include`) with a warning message to include files
from the new location (`/opt/rocm-<ver>/include`), as shown in the example below.
```cpp
#pragma message "This file is deprecated. Use file from include path /opt/rocm-ver/include/ and prefix with hip."
#include <hip/hip_runtime.h>
```
* Starting with the ROCm 5.2 release, the backward compatibility wrapper header files are deprecated: the `#pragma` message announces a `#warning`.
* Starting from ROCm 6.0 (tentatively), backward compatibility for wrapper header files will be removed, and the `#pragma` message will announce an `#error`.
### Executable files
Executable files are available in the `/opt/rocm-<ver>/bin` folder. For backward
compatibility, the old executable location (`/opt/rocm-<ver>/<component>/bin`) has a
soft link to the executable at the new location. Soft links will be removed in a
Applications using ROCm are advised to use the new file paths, as the old files
will be deprecated in a future release. Applications must include the
correct header files and use the correct search paths.
1. `#include <header_file.h>` needs to be changed to
   `#include <component/header_file.h>`

   For example: `#include <hip.h>` needs to change
   to `#include <hip/hip.h>`

2. Any variable in CMake or Makefiles pointing to a component folder needs to
   be changed.

   For example: `VAR1=/opt/rocm/hip` needs to be changed to `VAR1=/opt/rocm`
   and `VAR2=/opt/rocm/hsa` needs to be changed to `VAR2=/opt/rocm`

3. Any reference to `/opt/rocm/<component>/bin` or `/opt/rocm/<component>/lib`
   needs to be changed to `/opt/rocm/bin` and `/opt/rocm/lib/`, respectively.
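
The path changes listed above can be sketched as a small rewrite helper. This is only an illustration, not a ROCm tool: the function name `to_fhs_path` and the regular expression are assumptions made for this example, and it only handles the `bin`, `lib`, and `include` folders under an `/opt/rocm` or `/opt/rocm-<ver>` prefix.

```python
import re

# Hypothetical pattern for the old per-component layout:
#   /opt/rocm[-<ver>]/<component>/{bin,lib,include}[/...]
_OLD_LAYOUT = re.compile(
    r"^(?P<root>/opt/rocm(?:-[^/]+)?)"
    r"/(?P<component>[^/]+)"
    r"/(?P<kind>bin|lib|include)"
    r"(?P<rest>/.*)?$"
)

# Top-level folders of the new layout; these must never be treated as components.
_TOP_LEVEL = {"bin", "lib", "include", "share", "libexec"}


def to_fhs_path(path: str) -> str:
    """Map an old per-component path to its location in the FHS layout.

    Paths that do not match the old layout are returned unchanged.
    """
    match = _OLD_LAYOUT.match(path)
    if match is None or match["component"] in _TOP_LEVEL:
        return path
    return f"{match['root']}/{match['kind']}{match['rest'] or ''}"


if __name__ == "__main__":
    for old in (
        "/opt/rocm/hip/bin/hipcc",
        "/opt/rocm-5.2.0/hsa/lib",
        "/opt/rocm/hip/include/hip/hip_runtime.h",
        "/opt/rocm/bin/hipcc",  # already in the new layout
    ):
        print(old, "->", to_fhs_path(old))
```

A helper like this is mainly useful for auditing build scripts; variables such as `VAR1=/opt/rocm/hip` still need to be reviewed by hand, because they point at the component folder itself rather than at a `bin` or `lib` subfolder.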
## Changes in versioning specifications
To better manage ROCm dependency specifications and allow smoother releases of ROCm while avoiding dependency conflicts, ROCm software shall adhere to the following scheme when numbering and incrementing ROCm file versions:
rocm-\<ver\>, where \<ver\> = \<x.y.z\>
x.y.z denote: MAJOR.MINOR.PATCH
z: PATCH - increment z when implementing backward compatible bug fixes.
y: MINOR - increment y when implementing minor changes that add functionality but are still backward compatible.
x: MAJOR - increment x when implementing major changes that are not backward compatible.
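
As a minimal sketch of the MAJOR.MINOR.PATCH rules above, the following helper increments one component of an `x.y.z` string. The function name `bump_version` is hypothetical, and resetting the lower components to zero on a MINOR or MAJOR bump is an assumption borrowed from common semantic versioning practice; the text above does not state it.

```python
def bump_version(version: str, change: str) -> str:
    """Increment an x.y.z version string.

    change is one of "patch", "minor", or "major".
    Assumption: lower components reset to zero on a minor or major bump.
    """
    major, minor, patch = (int(part) for part in version.split("."))
    if change == "patch":   # backward compatible bug fix
        return f"{major}.{minor}.{patch + 1}"
    if change == "minor":   # backward compatible new functionality
        return f"{major}.{minor + 1}.0"
    if change == "major":   # change that is not backward compatible
        return f"{major + 1}.0.0"
    raise ValueError(f"unknown change type: {change!r}")


if __name__ == "__main__":
    print(bump_version("5.2.1", "patch"))
    print(bump_version("5.2.1", "minor"))
    print(bump_version("5.2.1", "major"))
```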
The following image shows the node-level architecture of a system that
comprises two AMD EPYC™ processors and (up to) eight AMD Instinct™ GPUs.
The two EPYC processors are connected to each other with the AMD Infinity™
fabric, which provides high-bandwidth (up to 18 GT/sec), coherent links such
that each processor can access the available node memory as a single
shared-memory domain in a non-uniform memory architecture (NUMA) fashion. In a
2P, or dual-socket, configuration, three AMD Infinity™ fabric links are
available to connect the processors, and one PCIe Gen 4 x16 link per processor
can attach additional I/O devices such as the host adapters for the network
fabric.

In a typical node configuration, each processor can host up to four AMD
Instinct™ GPUs that are attached using PCIe Gen 4 links at 16 GT/sec,
which corresponds to a peak bidirectional link bandwidth of 32 GB/sec. Each hive
of four GPUs can participate in a fully connected, coherent AMD
Instinct™ fabric that connects the four GPUs using 23 GT/sec AMD
Infinity fabric links that run at a higher frequency than the inter-processor
links. This inter-GPU link can be established in certified server systems if the
GPUs are mounted in neighboring PCIe slots by installing the AMD Infinity
Fabric™ bridge for the AMD Instinct™ GPUs.
## Microarchitecture
The microarchitecture of the AMD Instinct GPUs is based on the AMD CDNA
architecture, which targets compute applications such as high-performance
computing (HPC) and AI & machine learning (ML) that run on everything from
individual servers to the world's largest exascale supercomputers. The overall
system architecture is designed for extreme scalability and compute performance.
")
The above image shows the AMD Instinct GPU with its PCIe Gen 4 x16
link (16 GT/sec, at the bottom) that connects the GPU to (one of) the host
processor(s). It also shows the three AMD Infinity Fabric ports that provide
high-speed links (23 GT/sec, also at the bottom) to the other GPUs of the local
hive.
On the left and right of the floor plan, the High Bandwidth Memory (HBM)
attaches via the GPU memory controller. The MI100 generation of the AMD
Instinct GPU offers four stacks of HBM generation 2 (HBM2) for a total
of 32 GB with a 4,096-bit-wide memory interface. The peak memory bandwidth of the
attached HBM2 is 1.228 TB/sec at a memory clock frequency of 1.2 GHz.
The execution units of the GPU are depicted in the above image as Compute
Units (CU). There are a total of 120 compute units that are physically organized
into eight Shader Engines (SE) with fifteen compute units per shader engine.
Each compute unit is further sub-divided into four SIMD units that process SIMD
instructions of 16 data elements per instruction. This enables the CU to process
64 data elements (a so-called 'wavefront') at a peak clock frequency of 1.5 GHz.
Therefore, the theoretical maximum FP64 peak performance is 11.5 TFLOPS
(`4 [SIMD units] x 16 [elements per instruction] x 120 [CU] x 1.5 [GHz]`).
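
The arithmetic behind the 11.5 TFLOPS figure can be reproduced with a short sketch. The function name `peak_tflops` is hypothetical; the inputs are the values quoted in the paragraph above.

```python
def peak_tflops(simd_units: int, elements_per_instruction: int,
                compute_units: int, clock_ghz: float) -> float:
    """Theoretical peak in TFLOPS, following the formula quoted in the text."""
    flops_per_clock = simd_units * elements_per_instruction * compute_units
    # flops_per_clock * GHz gives GFLOPS; divide by 1000 for TFLOPS.
    return flops_per_clock * clock_ghz / 1000.0


# MI100 values from the text: 4 SIMD units, 16 elements, 120 CUs, 1.5 GHz.
mi100_fp64 = peak_tflops(4, 16, 120, 1.5)

if __name__ == "__main__":
    print(f"MI100 FP64 peak: {mi100_fp64:.1f} TFLOPS")
```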

The preceding image shows the block diagram of a single CU of an AMD Instinct™
MI100 GPU and summarizes how instructions flow through the execution
engines. The CU fetches the instructions via a 32 KB instruction cache and moves
them forward to execution via a dispatcher. The CU can handle up to ten
wavefronts at a time and feed their instructions into the execution unit. The
execution unit contains 256 vector general-purpose registers (VGPR) and 800
scalar general-purpose registers (SGPR). The VGPR and SGPR are dynamically
allocated to the executing wavefronts. A wavefront can access a maximum of 102
scalar registers. Excess scalar-register usage will cause register spilling and
thus may affect execution performance.
A wavefront can occupy any number of VGPRs from 0 to 256, directly affecting
occupancy; that is, the number of concurrently active wavefronts in the CU. For
instance, with 119 VGPRs used, only two wavefronts can be active in the CU at
the same time. With the instruction latency of four cycles per SIMD instruction,
the occupancy should be as high as possible such that the compute unit can
improve execution efficiency by scheduling instructions from multiple
wavefronts.
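
The relationship between VGPR usage and occupancy described above can be approximated as follows. This is a simplified sketch under stated assumptions: it uses only the two limits named in the text (256 VGPRs and at most ten wavefronts), it ignores register allocation granularity as well as SGPR and LDS limits, and the function name `max_active_wavefronts` is hypothetical.

```python
MAX_WAVEFRONTS = 10   # upper limit of concurrently handled wavefronts (from the text)
VGPR_BUDGET = 256     # vector general-purpose registers available (from the text)


def max_active_wavefronts(vgprs_per_wavefront: int) -> int:
    """Estimate how many wavefronts fit, considering VGPR usage only."""
    if not 0 <= vgprs_per_wavefront <= VGPR_BUDGET:
        raise ValueError("VGPR usage must be between 0 and 256")
    if vgprs_per_wavefront == 0:
        return MAX_WAVEFRONTS
    # Integer division: a wavefront needs its full register allocation.
    return min(MAX_WAVEFRONTS, VGPR_BUDGET // vgprs_per_wavefront)


if __name__ == "__main__":
    for vgprs in (25, 64, 119, 256):
        print(vgprs, "VGPRs ->", max_active_wavefronts(vgprs), "wavefronts")
```

With 119 VGPRs the estimate is two wavefronts, which matches the example in the text; reducing register usage is therefore one way to raise occupancy.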
:::{table} Peak-performance capabilities of MI100 for different data types.
:name: mi100-perf
| Computation and Data Type | FLOPS/CLOCK/CU | Peak TFLOPS |
The microarchitecture of the AMD Instinct MI250 GPU is based on the
AMD CDNA 2 architecture that targets compute applications such as HPC,
artificial intelligence (AI), and machine learning (ML) that run on
everything from individual servers to the world’s largest exascale
supercomputers. The overall system architecture is designed for extreme
scalability and compute performance.
The following image shows the components of a single Graphics Compute Die (GCD) of the CDNA 2 architecture. On the top and the bottom are AMD Infinity Fabric™
interfaces and their physical links that are used to connect the GPU die to the
other system-level components of the node (see also Section 2.2). Both
interfaces can drive four AMD Infinity Fabric links. One of the AMD Infinity
Fabric links of the controller at the bottom can be configured as a PCIe link.
Each of the AMD Infinity Fabric links between GPUs can run at up to 25 GT/sec,
which corresponds to a peak transfer bandwidth of 50 GB/sec for a 16-wide link
(two bytes per transaction). Section 2.2 has more details on the number of AMD
Infinity Fabric links and the resulting transfer rates between the system-level
components.
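
The link bandwidth quoted above follows directly from the transfer rate and the link width. A minimal sketch, with a hypothetical function name and an optional `links` parameter for aggregating several parallel links:

```python
def peak_link_bandwidth_gb_per_s(transfer_rate_gt_per_s: float,
                                 link_width_bits: int = 16,
                                 links: int = 1) -> float:
    """Peak unidirectional bandwidth in GB/sec.

    Each transfer moves link_width_bits / 8 bytes, so a 16-wide link
    moves two bytes per transaction.
    """
    bytes_per_transfer = link_width_bits / 8
    return transfer_rate_gt_per_s * bytes_per_transfer * links


if __name__ == "__main__":
    # One 16-wide AMD Infinity Fabric link at 25 GT/sec.
    print(peak_link_bandwidth_gb_per_s(25))
    # Four such links in parallel.
    print(peak_link_bandwidth_gb_per_s(25, links=4))
```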
To the left and the right are memory controllers that attach the High Bandwidth
Memory (HBM) modules to the GCD. AMD Instinct MI250 GPUs use HBM2e, which offers
a peak memory bandwidth of 1.6 TB/sec per GCD.
The execution units of the GPU are depicted in the following image as Compute
Units (CU). The MI250 GCD has 104 active CUs. Each compute unit is further
subdivided into four SIMD units that process SIMD instructions of 16 data
elements per instruction (for the FP64 data type). This enables the CU to
process 64 work items (a so-called “wavefront”) at a peak clock frequency of 1.7
GHz. Therefore, the theoretical maximum FP64 peak performance per GCD is 22.6
TFLOPS for vector instructions. This equates to 45.3 TFLOPS for vector instructions for both GCDs together. The MI250 compute units also provide specialized
execution units (also called matrix cores), which are geared toward executing
matrix operations like matrix-matrix multiplications. For FP64, the peak
performance of these units amounts to 90.5 TFLOPS.

```{list-table} Peak-performance capabilities of the MI250 OAM for different data types.
:header-rows: 1
:name: mi250-perf-table
*
- Computation and Data Type
- FLOPS/CLOCK/CU
- Peak TFLOPS
*
- Matrix FP64
- 256
- 90.5
*
- Vector FP64
- 128
- 45.3
*
- Matrix FP32
- 256
- 90.5
*
- Packed FP32
- 256
- 90.5
*
- Vector FP32
- 128
- 45.3
*
- Matrix FP16
- 1024
- 362.1
*
- Matrix BF16
- 1024
- 362.1
*
- Matrix INT8
- 1024
- 362.1
```
The above table summarizes the aggregated peak performance of the AMD Instinct MI250 Open Compute Platform (OCP) Open Accelerator Module (OAM) and its two GCDs for different data types and execution units. The middle column lists the peak performance (number of data elements processed in a single instruction) of a single compute unit if a SIMD (or matrix) instruction is being retired in each clock cycle. The third column lists the theoretical peak performance of the OAM. The theoretical aggregated peak memory bandwidth of the GPU is 3.2 TB/sec (1.6 TB/sec per GCD).
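
The third column of the table can be derived from the middle column. The sketch below assumes the figures given earlier in this section (104 active CUs per GCD, two GCDs per OAM, 1.7 GHz peak clock); the names are hypothetical and only three of the table rows are shown.

```python
CUS_PER_GCD = 104      # active compute units per GCD (from the text)
GCDS_PER_OAM = 2       # GCDs in one MI250 OAM (from the text)
PEAK_CLOCK_GHZ = 1.7   # peak clock frequency (from the text)

# FLOPS/CLOCK/CU values copied from a few rows of the table.
FLOPS_PER_CLOCK_PER_CU = {
    "Matrix FP64": 256,
    "Vector FP64": 128,
    "Matrix FP16": 1024,
}


def peak_tflops(flops_per_clock_per_cu: int, gcds: int = GCDS_PER_OAM) -> float:
    """Theoretical peak in TFLOPS for the given number of GCDs."""
    gflops = flops_per_clock_per_cu * CUS_PER_GCD * gcds * PEAK_CLOCK_GHZ
    return gflops / 1000.0


if __name__ == "__main__":
    for name, rate in FLOPS_PER_CLOCK_PER_CU.items():
        print(f"{name}: {peak_tflops(rate):.1f} TFLOPS per OAM")
    print(f"Vector FP64: {peak_tflops(128, gcds=1):.1f} TFLOPS per GCD")
```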

The following image shows the block diagram of an OAM package that consists
of two GCDs, each of which constitutes one GPU device in the system. The two
GCDs in the package are connected via four AMD Infinity Fabric links running at
a theoretical peak rate of 25 GT/sec, giving 200 GB/sec peak transfer bandwidth
between the two GCDs of an OAM, or a bidirectional peak transfer bandwidth of
400 GB/sec for the same.
## Node-level architecture
The following image shows the node-level architecture of a system that is
based on the AMD Instinct MI250 GPU. The MI250 OAMs attach to the host
system via PCIe Gen 4 x16 links (yellow lines). Each GCD maintains its own PCIe
x16 link to the host part of the system. Depending on the server platform, the
GCD can attach to the AMD EPYC processor directly or via an optional PCIe switch.
Note that some platforms may offer an x8 interface to the GCDs, which reduces
the available host-to-GPU bandwidth.

The preceding image shows the node-level architecture of a system with AMD
EPYC processors in a dual-socket configuration and four AMD Instinct MI250
GPUs. The MI250 OAMs attach to the host processor system via PCIe Gen 4
x16 links (yellow lines). Depending on the system design, a PCIe switch may
exist to make more PCIe lanes available for additional components like network
interfaces and/or storage devices. Each GCD maintains its own PCIe x16 link to
the host part of the system or to the PCIe switch. Please note, some platforms
may offer an x8 interface to the GCDs, which will reduce the available
host-to-GPU bandwidth.
Between the OAMs and their respective GCDs, a peer-to-peer (P2P) network allows
for direct data exchange between the GPU dies via AMD Infinity Fabric links
(black, green, and red lines). Each of these 16-wide links connects to one of the
two GPU dies in the MI250 OAM and operates at 25 GT/sec, which corresponds to a
theoretical peak transfer rate of 50 GB/sec per link (or 100 GB/sec
bidirectional peak transfer bandwidth). The GCD pairs 2 and 6 as well as GCDs 0
and 4 connect via two XGMI links, which is indicated by the thicker red line in
"``CPF_CMP_UTCL1_STALL_ON_TRANSLATION``", "Cycles", "Number of cycles one of the compute unified translation caches (L1) is stalled waiting on translation"
"``CPF_CPF_STAT_BUSY``", "Cycles", "Number of cycles command processor-fetcher is busy"
"``CPF_CPF_STAT_IDLE``", "Cycles", "Number of cycles command processor-fetcher is idle"
"``CPF_CPF_STAT_STALL``", "Cycles", "Number of cycles command processor-fetcher is stalled"
"``CPF_CPF_TCIU_BUSY``", "Cycles", "Number of cycles command processor-fetcher texture cache interface unit interface is busy"
"``CPF_CPF_TCIU_IDLE``", "Cycles", "Number of cycles command processor-fetcher texture cache interface unit interface is idle"
"``CPF_CPF_TCIU_STALL``", "Cycles", "Number of cycles command processor-fetcher texture cache interface unit interface is stalled waiting on free tags"
The texture cache interface unit is the interface between the command processor and the memory
"``SPI_CSN_BUSY``", "Cycles", "Number of cycles with outstanding waves"
"``SPI_CSN_WINDOW_VALID``", "Cycles", "Number of cycles enabled by ``perfcounter_start`` event"
"``SPI_CSN_NUM_THREADGROUPS``", "Workgroups", "Number of dispatched workgroups"
"``SPI_CSN_WAVE``", "Wavefronts", "Number of dispatched wavefronts"
"``SPI_RA_REQ_NO_ALLOC``", "Cycles", "Number of arbiter cycles with requests but no allocation"
"``SPI_RA_REQ_NO_ALLOC_CSN``", "Cycles", "Number of arbiter cycles with compute shader (n\ :sup:`th` pipe) requests but no compute shader (n\ :sup:`th` pipe) allocation"
"``SPI_RA_RES_STALL_CSN``", "Cycles", "Number of arbiter stall cycles due to shortage of compute shader (n\ :sup:`th` pipe) pipeline slots"
"``SPI_RA_TMP_STALL_CSN``", "Cycles", "Number of stall cycles due to shortage of temp space"
"``SPI_RA_WAVE_SIMD_FULL_CSN``", "SIMD-cycles", "Accumulated number of single instruction, multiple data (SIMD) per cycle affected by shortage of wave slots for compute shader (n\ :sup:`th` pipe) wave dispatch"
"``SPI_RA_VGPR_SIMD_FULL_CSN``", "SIMD-cycles", "Accumulated number of SIMDs per cycle affected by shortage of vector general-purpose register (VGPR) slots for compute shader (n\ :sup:`th` pipe) wave dispatch"
"``SPI_RA_SGPR_SIMD_FULL_CSN``", "SIMD-cycles", "Accumulated number of SIMDs per cycle affected by shortage of scalar general-purpose register (SGPR) slots for compute shader (n\ :sup:`th` pipe) wave dispatch"
"``SPI_RA_LDS_CU_FULL_CSN``", "CU", "Number of compute units affected by shortage of local data share (LDS) space for compute shader (n\ :sup:`th` pipe) wave dispatch"
"``SPI_RA_BAR_CU_FULL_CSN``", "CU", "Number of compute units with compute shader (n\ :sup:`th` pipe) waves waiting at a BARRIER"
"``SPI_RA_BULKY_CU_FULL_CSN``", "CU", "Number of compute units with compute shader (n\ :sup:`th` pipe) waves waiting for BULKY resource"
"``SPI_RA_TGLIM_CU_FULL_CSN``", "Cycles", "Number of compute shader (n\ :sup:`th` pipe) wave stall cycles due to restriction of ``tg_limit`` for thread group size"
"``SPI_RA_WVLIM_STALL_CSN``", "Cycles", "Number of cycles compute shader (n\ :sup:`th` pipe) is stalled due to ``WAVE_LIMIT``"
"``SPI_VWC_CSC_WR``", "Qcycles", "Number of quad-cycles taken to initialize VGPRs when launching waves"
"``SPI_SWC_CSC_WR``", "Qcycles", "Number of quad-cycles taken to initialize SGPRs when launching waves"
"``SQ_INSTS_VMEM``", "Instr", "Number of vector memory instructions issued, including both flat and buffer instructions"
"``SQ_INSTS_SALU``", "Instr", "Number of scalar arithmetic logic unit (SALU) instructions issued"
"``SQ_INSTS_SMEM``", "Instr", "Number of scalar memory instructions issued"
"``SQ_INSTS_SMEM_NORM``", "Instr", "Number of scalar memory instructions normalized to match ``smem_level`` issued"
"``SQ_INSTS_FLAT``", "Instr", "Number of flat instructions issued"
"``SQ_INSTS_FLAT_LDS_ONLY``", "Instr", "**MI200 Series only** Number of FLAT instructions that read/write only from/to LDS issued. Works only if ``EARLY_TA_DONE`` is enabled."
"``SQ_INSTS_LDS``", "Instr", "Number of LDS instructions issued **(MI200: includes flat; MI300: does not include flat)**"
"``SQ_INSTS_GDS``", "Instr", "Number of global data share instructions issued"
"``SQ_INSTS_EXP_GDS``", "Instr", "Number of EXP and global data share instructions excluding skipped export instructions issued"
"``SQ_INSTS_BRANCH``", "Instr", "Number of branch instructions issued"
"``SQ_INSTS_SENDMSG``", "Instr", "Number of ``SENDMSG`` instructions including ``s_endpgm`` issued"
"``SQ_INSTS_VSKIPPED``", "Instr", "Number of vector instructions skipped"
Flat instructions allow read, write, and atomic access to a generic memory address pointer that can
resolve to any of the following physical memories:
"``SQ_BUSY_CYCLES``", "Cycles", "Number of cycles while sequencers reports it to be busy"
"``SQ_BUSY_CU_CYCLES``", "Qcycles", "Number of quad-cycles each compute unit is busy"
"``SQ_VALU_MFMA_BUSY_CYCLES``", "Cycles", "Number of cycles the matrix FMA arithmetic logic unit (ALU) is busy"
"``SQ_WAVE_CYCLES``", "Qcycles", "Number of quad-cycles spent by waves in the compute units"
"``SQ_WAIT_ANY``", "Qcycles", "Number of quad-cycles spent waiting for anything"
"``SQ_WAIT_INST_ANY``", "Qcycles", "Number of quad-cycles spent waiting for any instruction to be issued"
"``SQ_ACTIVE_INST_ANY``", "Qcycles", "Number of quad-cycles spent by each wave to work on an instruction"
"``SQ_ACTIVE_INST_VMEM``", "Qcycles", "Number of quad-cycles spent by the sequencer instruction arbiter to work on a vector memory instruction"
"``SQ_ACTIVE_INST_LDS``", "Qcycles", "Number of quad-cycles spent by the sequencer instruction arbiter to work on an LDS instruction"
"``SQ_ACTIVE_INST_VALU``", "Qcycles", "Number of quad-cycles spent by the sequencer instruction arbiter to work on a VALU instruction"
"``SQ_ACTIVE_INST_SCA``", "Qcycles", "Number of quad-cycles spent by the sequencer instruction arbiter to work on a SALU or scalar memory instruction"
"``SQ_ACTIVE_INST_EXP_GDS``", "Qcycles", "Number of quad-cycles spent by the sequencer instruction arbiter to work on an ``EXPORT`` or ``GDS`` instruction"
"``SQ_ACTIVE_INST_MISC``", "Qcycles", "Number of quad-cycles spent by the sequencer instruction arbiter to work on a ``BRANCH`` or ``SENDMSG`` instruction"
"``SQ_ACTIVE_INST_FLAT``", "Qcycles", "Number of quad-cycles spent by the sequencer instruction arbiter to work on a flat instruction"
"``SQ_INST_CYCLES_VMEM_WR``", "Qcycles", "Number of quad-cycles spent to send addr and cmd data for vector memory write instructions"
"``SQ_INST_CYCLES_VMEM_RD``", "Qcycles", "Number of quad-cycles spent to send addr and cmd data for vector memory read instructions"
"``SQ_INST_CYCLES_SMEM``", "Qcycles", "Number of quad-cycles spent to execute scalar memory reads"
"``SQ_INST_CYCLES_SALU``", "Qcycles", "Number of quad-cycles spent to execute non-memory read scalar operations"
"``SQ_THREAD_CYCLES_VALU``", "Qcycles", "Number of quad-cycles spent to execute VALU operations on active threads"
"``SQ_WAIT_INST_LDS``", "Qcycles", "Number of quad-cycles spent waiting for LDS instruction to be issued"
``SQ_THREAD_CYCLES_VALU`` is similar to ``INST_CYCLES_VALU``, but it's multiplied by the number of
"``SQC_ICACHE_REQ``", "Req", "Number of L1 instruction (L1i) cache requests"
"``SQC_ICACHE_HITS``", "Count", "Number of L1i cache hits"
"``SQC_ICACHE_MISSES``", "Count", "Number of non-duplicate L1i cache misses including uncached requests"
"``SQC_ICACHE_MISSES_DUPLICATE``", "Count", "Number of duplicate L1i cache misses whose previous lookup miss on the same cache line is not fulfilled yet"
"``SQC_DCACHE_REQ``", "Req", "Number of scalar L1d requests"
"``SQC_DCACHE_INPUT_VALID_READYB``", "Cycles", "Number of cycles while sequencer input is valid but scalar L1d is not ready"
"``SQC_DCACHE_HITS``", "Count", "Number of scalar L1d hits"
"``SQC_DCACHE_MISSES``", "Count", "Number of non-duplicate scalar L1d misses including uncached requests"
"``SQC_DCACHE_MISSES_DUPLICATE``", "Count", "Number of duplicate scalar L1d misses"
"``SQC_DCACHE_REQ_READ_1``", "Req", "Number of constant cache read requests in a single 32-bit data word"
"``SQC_DCACHE_REQ_READ_2``", "Req", "Number of constant cache read requests in two 32-bit data words"
"``SQC_DCACHE_REQ_READ_4``", "Req", "Number of constant cache read requests in four 32-bit data words"
"``SQC_DCACHE_REQ_READ_8``", "Req", "Number of constant cache read requests in eight 32-bit data words"
"``SQC_DCACHE_REQ_READ_16``", "Req", "Number of constant cache read requests in 16 32-bit data words"
"``SQC_DCACHE_ATOMIC``", "Req", "Number of atomic requests"
"``SQC_TC_REQ``", "Req", "Number of texture cache requests that were issued by instruction and constant caches"
"``SQC_TC_INST_REQ``", "Req", "Number of instruction requests to the L2 cache"
"``SQC_TC_DATA_READ_REQ``", "Req", "Number of data Read requests to the L2 cache"
"``SQC_TC_DATA_WRITE_REQ``", "Req", "Number of data write requests to the L2 cache"
"``SQC_TC_DATA_ATOMIC_REQ``", "Req", "Number of data atomic requests to the L2 cache"
"``SQC_TC_STALL``", "Cycles", "Number of cycles while the valid requests to the L2 cache are stalled"
"``TCP_TOTAL_CACHE_ACCESSES[n]``", "Req", "Number of vector L1d cache accesses including hits and misses", "0-15"
"``TCP_TCP_LATENCY[n]``", "Cycles", "**MI200 Series only** Accumulated wave access latency to vL1D over all wavefronts", "0-15"
"``TCP_TCC_READ_REQ_LATENCY[n]``", "Cycles", "**MI200 Series only** Total vL1D to L2 request latency over all wavefronts for reads and atomics with return", "0-15"
"``TCP_TCC_WRITE_REQ_LATENCY[n]``", "Cycles", "**MI200 Series only** Total vL1D to L2 request latency over all wavefronts for writes and atomics without return", "0-15"
"``TCP_TCC_READ_REQ[n]``", "Req", "Number of read requests to L2 cache", "0-15"
"``TCP_TCC_WRITE_REQ[n]``", "Req", "Number of write requests to L2 cache", "0-15"
"``TCP_TCC_ATOMIC_WITH_RET_REQ[n]``", "Req", "Number of atomic requests to L2 cache with return", "0-15"
"``TCP_TCC_ATOMIC_WITHOUT_RET_REQ[n]``", "Req", "Number of atomic requests to L2 cache without return", "0-15"
"``TCP_TCC_NC_READ_REQ[n]``", "Req", "Number of non-coherently cached read requests to L2 cache", "0-15"
"``TCP_TCC_UC_READ_REQ[n]``", "Req", "Number of uncached read requests to L2 cache", "0-15"
"``TCP_TCC_CC_READ_REQ[n]``", "Req", "Number of coherently cached read requests to L2 cache", "0-15"
"``TCP_TCC_RW_READ_REQ[n]``", "Req", "Number of coherently cached with write read requests to L2 cache", "0-15"
"``TCP_TCC_NC_WRITE_REQ[n]``", "Req", "Number of non-coherently cached write requests to L2 cache", "0-15"
"``TCP_TCC_UC_WRITE_REQ[n]``", "Req", "Number of uncached write requests to L2 cache", "0-15"
"``TCP_TCC_CC_WRITE_REQ[n]``", "Req", "Number of coherently cached write requests to L2 cache", "0-15"
"``TCP_TCC_RW_WRITE_REQ[n]``", "Req", "Number of coherently cached with write write requests to L2 cache", "0-15"
"``TCP_TCC_NC_ATOMIC_REQ[n]``", "Req", "Number of non-coherently cached atomic requests to L2 cache", "0-15"
"``TCP_TCC_UC_ATOMIC_REQ[n]``", "Req", "Number of uncached atomic requests to L2 cache", "0-15"
"``TCP_TCC_CC_ATOMIC_REQ[n]``", "Req", "Number of coherently cached atomic requests to L2 cache", "0-15"
"``TCP_TCC_RW_ATOMIC_REQ[n]``", "Req", "Number of coherently cached with write atomic requests to L2 cache", "0-15"
L2 cache is also known as texture cache per channel.
.. tab-set::
.. tab-item:: MI300 hardware counter
.. csv-table::
:header: "Hardware counter", "Unit", "Definition", "Value range for ``n``"
"``TCC_CYCLE[n]``", "Cycles", "Number of L2 cache free-running clocks", "0-31"
"``TCC_BUSY[n]``", "Cycles", "Number of L2 cache busy cycles", "0-31"
"``TCC_REQ[n]``", "Req", "Number of L2 cache requests of all types (measured at the tag block)", "0-31"
"``TCC_STREAMING_REQ[n]``", "Req", "Number of L2 cache streaming requests (measured at the tag block)", "0-31"
"``TCC_NC_REQ[n]``", "Req", "Number of non-coherently cached requests (measured at the tag block)", "0-31"
"``TCC_UC_REQ[n]``", "Req", "Number of uncached requests. This is measured at the tag block", "0-31"
"``TCC_CC_REQ[n]``", "Req", "Number of coherently cached requests. This is measured at the tag block", "0-31"
"``TCC_RW_REQ[n]``", "Req", "Number of coherently cached with write requests. This is measured at the tag block", "0-31"
"``TCC_PROBE[n]``", "Req", "Number of probe requests", "0-31"
"``TCC_PROBE_ALL[n]``", "Req", "Number of external probe requests with ``EA_TCC_preq_all == 1``", "0-31"
"``TCC_READ[n]``", "Req", "Number of L2 cache read requests (includes compressed reads but not metadata reads)", "0-31"
"``TCC_WRITE[n]``", "Req", "Number of L2 cache write requests", "0-31"
"``TCC_ATOMIC[n]``", "Req", "Number of L2 cache atomic requests of all types", "0-31"
"``TCC_HIT[n]``", "Req", "Number of L2 cache hits", "0-31"
"``TCC_MISS[n]``", "Req", "Number of L2 cache misses", "0-31"
"``TCC_WRITEBACK[n]``", "Req", "Number of lines written back to the main memory, including writebacks of dirty lines and uncached write or atomic requests", "0-31"
"``TCC_EA0_WRREQ[n]``", "Req", "Number of 32-byte and 64-byte transactions going over the ``TC_EA_wrreq`` interface (doesn't include probe commands)", "0-31"
"``TCC_EA0_WRREQ_64B[n]``", "Req", "Total number of 64-byte transactions (write or ``CMPSWAP``) going over the ``TC_EA_wrreq`` interface", "0-31"
"``TCC_EA0_WR_UNCACHED_32B[n]``", "Req", "Number of 32 or 64-byte write or atomic going over the ``TC_EA_wrreq`` interface due to uncached traffic", "0-31"
"``TCC_EA0_WRREQ_STALL[n]``", "Cycles", "Number of cycles a write request is stalled", "0-31"
"``TCC_EA0_WRREQ_IO_CREDIT_STALL[n]``", "Cycles", "Number of cycles an efficiency arbiter write request is stalled due to the interface running out of input-output (IO) credits", "0-31"
"``TCC_EA0_WRREQ_GMI_CREDIT_STALL[n]``", "Cycles", "Number of cycles an efficiency arbiter write request is stalled due to the interface running out of GMI credits", "0-31"
"``TCC_EA0_WRREQ_DRAM_CREDIT_STALL[n]``", "Cycles", "Number of cycles an efficiency arbiter write request is stalled due to the interface running out of DRAM credits", "0-31"
"``TCC_TOO_MANY_EA_WRREQS_STALL[n]``", "Cycles", "Number of cycles the L2 cache is unable to send an efficiency arbiter write request due to it reaching its maximum capacity of pending efficiency arbiter write requests", "0-31"
"``TCC_EA0_WRREQ_LEVEL[n]``", "Req", "The accumulated number of efficiency arbiter write requests in flight", "0-31"
"``TCC_EA0_ATOMIC[n]``", "Req", "Number of 32-byte or 64-byte atomic requests going over the ``TC_EA_wrreq`` interface", "0-31"
"``TCC_EA0_ATOMIC_LEVEL[n]``", "Req", "The accumulated number of efficiency arbiter atomic requests in flight", "0-31"
"``TCC_EA0_RDREQ[n]``", "Req", "Number of 32-byte or 64-byte read requests to efficiency arbiter", "0-31"
"``TCC_EA0_RDREQ_32B[n]``", "Req", "Number of 32-byte read requests to efficiency arbiter", "0-31"
"``TCC_EA0_RD_UNCACHED_32B[n]``", "Req", "Number of 32-byte efficiency arbiter reads due to uncached traffic. A 64-byte request is counted as 2", "0-31"
"``TCC_EA0_RDREQ_IO_CREDIT_STALL[n]``", "Cycles", "Number of cycles there is a stall due to the read request interface running out of IO credits", "0-31"
"``TCC_EA0_RDREQ_GMI_CREDIT_STALL[n]``", "Cycles", "Number of cycles there is a stall due to the read request interface running out of GMI credits", "0-31"
"``TCC_EA0_RDREQ_DRAM_CREDIT_STALL[n]``", "Cycles", "Number of cycles there is a stall due to the read request interface running out of DRAM credits", "0-31"
"``TCC_EA0_RDREQ_LEVEL[n]``", "Req", "The accumulated number of efficiency arbiter read requests in flight", "0-31"
"``TCC_EA0_RDREQ_DRAM[n]``", "Req", "Number of 32-byte or 64-byte efficiency arbiter read requests to High Bandwidth Memory (HBM)", "0-31"
"``TCC_EA0_WRREQ_DRAM[n]``", "Req", "Number of 32-byte or 64-byte efficiency arbiter write requests to HBM", "0-31"
"``TCC_TAG_STALL[n]``", "Cycles", "Number of cycles the normal request pipeline in the tag is stalled for any reason", "0-31"
"``TCC_NORMAL_WRITEBACK[n]``", "Req", "Number of writebacks due to requests that are not writeback requests", "0-31"
"``TCC_ALL_TC_OP_WB_WRITEBACK[n]``", "Req", "Number of writebacks due to all ``TC_OP`` writeback requests", "0-31"
"``TCC_NORMAL_EVICT[n]``", "Req", "Number of evictions due to requests that are not invalidate or probe requests", "0-31"
"``TCC_ALL_TC_OP_INV_EVICT[n]``", "Req", "Number of evictions due to all ``TC_OP`` invalidate requests", "0-31"
.. tab-item:: MI200 hardware counter
.. csv-table::
:header: "Hardware counter", "Unit", "Definition", "Value range for ``n``"
"``TCC_CYCLE[n]``", "Cycles", "Number of L2 cache free-running clocks", "0-31"
"``TCC_BUSY[n]``", "Cycles", "Number of L2 cache busy cycles", "0-31"
"``TCC_REQ[n]``", "Req", "Number of L2 cache requests of all types (measured at the tag block)", "0-31"
"``TCC_STREAMING_REQ[n]``", "Req", "Number of L2 cache streaming requests (measured at the tag block)", "0-31"
"``TCC_NC_REQ[n]``", "Req", "Number of non-coherently cached requests (measured at the tag block)", "0-31"
"``TCC_UC_REQ[n]``", "Req", "Number of uncached requests. This is measured at the tag block", "0-31"
"``TCC_CC_REQ[n]``", "Req", "Number of coherently cached requests. This is measured at the tag block", "0-31"
"``TCC_RW_REQ[n]``", "Req", "Number of coherently cached with write requests. This is measured at the tag block", "0-31"
"``TCC_PROBE[n]``", "Req", "Number of probe requests", "0-31"
"``TCC_PROBE_ALL[n]``", "Req", "Number of external probe requests with ``EA_TCC_preq_all == 1``", "0-31"
"``TCC_READ[n]``", "Req", "Number of L2 cache read requests (includes compressed reads but not metadata reads)", "0-31"
"``TCC_WRITE[n]``", "Req", "Number of L2 cache write requests", "0-31"
"``TCC_ATOMIC[n]``", "Req", "Number of L2 cache atomic requests of all types", "0-31"
"``TCC_HIT[n]``", "Req", "Number of L2 cache hits", "0-31"
"``TCC_MISS[n]``", "Req", "Number of L2 cache misses", "0-31"
"``TCC_WRITEBACK[n]``", "Req", "Number of lines written back to the main memory, including writebacks of dirty lines and uncached write or atomic requests", "0-31"
"``TCC_EA_WRREQ[n]``", "Req", "Number of 32-byte and 64-byte transactions going over the ``TC_EA_wrreq`` interface (doesn't include probe commands)", "0-31"
"``TCC_EA_WRREQ_64B[n]``", "Req", "Total number of 64-byte transactions (write or ``CMPSWAP``) going over the ``TC_EA_wrreq`` interface", "0-31"
"``TCC_EA_WR_UNCACHED_32B[n]``", "Req", "Number of 32 write or atomic going over the ``TC_EA_wrreq`` interface due to uncached traffic. A 64-byte request will be counted as 2", "0-31"
"``TCC_EA_WRREQ_STALL[n]``", "Cycles", "Number of cycles a write request is stalled", "0-31"
"``TCC_EA_WRREQ_IO_CREDIT_STALL[n]``", "Cycles", "Number of cycles an efficiency arbiter write request is stalled due to the interface running out of input-output (IO) credits", "0-31"
"``TCC_EA_WRREQ_GMI_CREDIT_STALL[n]``", "Cycles", "Number of cycles an efficiency arbiter write request is stalled due to the interface running out of GMI credits", "0-31"
"``TCC_EA_WRREQ_DRAM_CREDIT_STALL[n]``", "Cycles", "Number of cycles an efficiency arbiter write request is stalled due to the interface running out of DRAM credits", "0-31"
"``TCC_TOO_MANY_EA_WRREQS_STALL[n]``", "Cycles", "Number of cycles the L2 cache is unable to send an efficiency arbiter write request due to it reaching its maximum capacity of pending efficiency arbiter write requests", "0-31"
"``TCC_EA_WRREQ_LEVEL[n]``", "Req", "The accumulated number of efficiency arbiter write requests in flight", "0-31"
"``TCC_EA_ATOMIC[n]``", "Req", "Number of 32-byte or 64-byte atomic requests going over the ``TC_EA_wrreq`` interface", "0-31"
"``TCC_EA_ATOMIC_LEVEL[n]``", "Req", "The accumulated number of efficiency arbiter atomic requests in flight", "0-31"
"``TCC_EA_RDREQ[n]``", "Req", "Number of 32-byte or 64-byte read requests to efficiency arbiter", "0-31"
"``TCC_EA_RDREQ_32B[n]``", "Req", "Number of 32-byte read requests to efficiency arbiter", "0-31"
"``TCC_EA_RD_UNCACHED_32B[n]``", "Req", "Number of 32-byte efficiency arbiter reads due to uncached traffic. A 64-byte request is counted as 2", "0-31"
"``TCC_EA_RDREQ_IO_CREDIT_STALL[n]``", "Cycles", "Number of cycles there is a stall due to the read request interface running out of IO credits", "0-31"
"``TCC_EA_RDREQ_GMI_CREDIT_STALL[n]``", "Cycles", "Number of cycles there is a stall due to the read request interface running out of GMI credits", "0-31"
"``TCC_EA_RDREQ_DRAM_CREDIT_STALL[n]``", "Cycles", "Number of cycles there is a stall due to the read request interface running out of DRAM credits", "0-31"
"``TCC_EA_RDREQ_LEVEL[n]``", "Req", "The accumulated number of efficiency arbiter read requests in flight", "0-31"
"``TCC_EA_RDREQ_DRAM[n]``", "Req", "Number of 32-byte or 64-byte efficiency arbiter read requests to High Bandwidth Memory (HBM)", "0-31"
"``TCC_EA_WRREQ_DRAM[n]``", "Req", "Number of 32-byte or 64-byte efficiency arbiter write requests to HBM", "0-31"
"``TCC_TAG_STALL[n]``", "Cycles", "Number of cycles the normal request pipeline in the tag is stalled for any reason", "0-31"
"``TCC_NORMAL_WRITEBACK[n]``", "Req", "Number of writebacks due to requests that are not writeback requests", "0-31"
"``TCC_ALL_TC_OP_WB_WRITEBACK[n]``", "Req", "Number of writebacks due to all ``TC_OP`` writeback requests", "0-31"
"``TCC_NORMAL_EVICT[n]``", "Req", "Number of evictions due to requests that are not invalidate or probe requests", "0-31"
"``TCC_ALL_TC_OP_INV_EVICT[n]``", "Req", "Number of evictions due to all ``TC_OP`` invalidate requests", "0-31"
Note the following:
* ``TCC_REQ[n]`` may be more than the number of requests arriving at the texture cache per channel,
  but it's a good indication of the total amount of work that needs to be performed.
* For ``TCC_EA0_WRREQ[n]``, atomics may travel over the same interface and are generally classified as
  write requests.
* Coherently cached (CC) mtypes can produce uncached requests, and those are included in
  ``TCC_EA0_WR_UNCACHED_32B[n]``.
* ``TCC_EA0_WRREQ_LEVEL[n]`` is primarily intended to measure average efficiency arbiter write latency.

  * Average write latency = ``TCC_PERF_SEL_EA0_WRREQ_LEVEL`` divided by ``TCC_PERF_SEL_EA0_WRREQ``

* ``TCC_EA0_ATOMIC_LEVEL[n]`` is primarily intended to measure average efficiency arbiter atomic
  latency.

  * Average atomic latency = ``TCC_PERF_SEL_EA0_WRREQ_ATOMIC_LEVEL`` divided by ``TCC_PERF_SEL_EA0_WRREQ_ATOMIC``

* ``TCC_EA0_RDREQ_LEVEL[n]`` is primarily intended to measure average efficiency arbiter read latency.

  * Average read latency = ``TCC_PERF_SEL_EA0_RDREQ_LEVEL`` divided by ``TCC_PERF_SEL_EA0_RDREQ``

* Stalls can occur regardless of the need for a read to be performed.
* Normally, stalls are measured at exactly one point in the pipeline; however, in the case of
  ``TCC_TAG_STALL[n]``, probes can stall the pipeline at a variety of places. There is no single point that
"``ALUStalledByLDS``", "Percentage of GPU time ALU units are stalled due to the LDS input queue being full or the output queue not being ready (value range: 0% (optimal) to 100%)"
"``FetchSize``", "Total kilobytes fetched from the video memory; measured with all extra fetches and any cache or memory effects taken into account"
"``FlatLDSInsts``", "Average number of flat instructions that read from or write to LDS, run per work item (affected by flow control)"
"``FlatVMemInsts``", "Average number of flat instructions that read from or write to the video memory, run per work item (affected by flow control). Includes flat instructions that read from or write to scratch"
"``GDSInsts``", "Average number of global data share read or write instructions run per work item (affected by flow control)"
"``GPUBusy``", "Percentage of time GPU is busy"
"``L2CacheHit``", "Percentage of fetch, write, atomic, and other instructions that hit the data in L2 cache (value range: 0% (no hit) to 100% (optimal))"
"``LDSBankConflict``", "Percentage of GPU time LDS is stalled by bank conflicts (value range: 0% (optimal) to 100%)"
"``LDSInsts``", "Average number of LDS read or write instructions run per work item (affected by flow control). Excludes flat instructions that read from or write to LDS."
"``MemUnitBusy``", "Percentage of GPU time the memory unit is active, which is measured with all extra fetches and writes and any cache or memory effects taken into account (value range: 0% to 100% (fetch-bound))"
"``MemUnitStalled``", "Percentage of GPU time the memory unit is stalled (value range: 0% (optimal) to 100%)"
"``MemWrites32B``", "Total number of effective 32B write transactions to the memory"
"``TCA_BUSY_sum``", "Total number of cycles texture cache arbiter has a pending request, over all texture cache arbiter instances"
"``TCA_CYCLE_sum``", "Total number of cycles over all texture cache arbiter instances"
"``SALUBusy``", "Percentage of GPU time scalar ALU instructions are processed (value range: 0% to 100% (optimal))"
"``SALUInsts``", "Average number of scalar ALU instructions run per work item (affected by flow control)"
"``SFetchInsts``", "Average number of scalar fetch instructions from the video memory run per work item (affected by flow control)"
"``VALUBusy``", "Percentage of GPU time vector ALU instructions are processed (value range: 0% to 100% (optimal))"
"``VALUInsts``", "Average number of vector ALU instructions run per work item (affected by flow control)"
"``VALUUtilization``", "Percentage of active vector ALU threads in a wave, where a lower number can mean either more thread divergence in a wave or that the work-group size is not a multiple of 64 (value range: 0% to 100% (optimal: no thread divergence))"
"``VFetchInsts``", "Average number of vector fetch instructions from the video memory run per work item (affected by flow control); excludes flat instructions that fetch from video memory"
"``VWriteInsts``", "Average number of vector write instructions to the video memory run per work item (affected by flow control); excludes flat instructions that write to video memory"
"``Wavefronts``", "Total wavefronts"
"``WRITE_REQ_32B``", "Total number of 32-byte effective memory writes"
"``WriteSize``", "Total kilobytes written to the video memory; measured with all extra fetches and any cache or memory effects taken into account"
"``WriteUnitStalled``", "Percentage of GPU time the write unit is stalled (value range: 0% (optimal) to 100%)"
You can lower ``ALUStalledByLDS`` by reducing LDS bank conflicts or number of LDS accesses.
You can lower ``MemUnitStalled`` by reducing the number or size of fetches and writes.
``MemUnitBusy`` includes the stall time (``MemUnitStalled``).
Hardware counters by and over all texture addressing unit instances
"``TCC_ALL_TC_OP_WB_WRITEBACK_sum``", "Total number of writebacks due to all ``TC_OP`` writeback requests."
"``TCC_ALL_TC_OP_INV_EVICT_sum``", "Total number of evictions due to all ``TC_OP`` invalidate requests."
"``TCC_ATOMIC_sum``", "Total number of L2 cache atomic requests of all types."
"``TCC_BUSY_avr``", "Average number of L2 cache busy cycles."
"``TCC_BUSY_sum``", "Total number of L2 cache busy cycles."
"``TCC_CC_REQ_sum``", "Total number of coherently cached requests."
"``TCC_CYCLE_sum``", "Total number of L2 cache free running clocks."
"``TCC_EA0_WRREQ_sum``", "Total number of 32-byte and 64-byte transactions going over the ``TC_EA0_wrreq`` interface. Atomics may travel over the same interface and are generally classified as write requests. This does not include probe commands."
"``TCC_EA0_WRREQ_64B_sum``", "Total number of 64-byte transactions (write or ``CMPSWAP``) going over the ``TC_EA0_wrreq`` interface."
"``TCC_EA0_WR_UNCACHED_32B_sum``", "Total number of 32-byte write or atomic requests going over the ``TC_EA0_wrreq`` interface due to uncached traffic. Note that coherently cached mtypes can produce uncached requests, and those are included in this count. A 64-byte request is counted as 2."
"``TCC_EA0_WRREQ_STALL_sum``", "Total number of cycles a write request is stalled, over all instances."
"``TCC_EA0_WRREQ_IO_CREDIT_STALL_sum``", "Total number of cycles an efficiency arbiter write request is stalled due to the interface running out of IO credits, over all instances."
"``TCC_EA0_WRREQ_GMI_CREDIT_STALL_sum``", "Total number of cycles an efficiency arbiter write request is stalled due to the interface running out of GMI credits, over all instances."
"``TCC_EA0_WRREQ_DRAM_CREDIT_STALL_sum``", "Total number of cycles an efficiency arbiter write request is stalled due to the interface running out of DRAM credits, over all instances."
"``TCC_EA0_WRREQ_LEVEL_sum``", "Total number of efficiency arbiter write requests in flight."
"``TCC_EA0_RDREQ_LEVEL_sum``", "Total number of efficiency arbiter read requests in flight."
"``TCC_EA0_ATOMIC_sum``", "Total number of 32-byte or 64-byte atomic requests going over the ``TC_EA0_wrreq`` interface."
"``TCC_EA0_ATOMIC_LEVEL_sum``", "Total number of efficiency arbiter atomic requests in flight."
"``TCC_EA0_RDREQ_sum``", "Total number of 32-byte or 64-byte read requests to efficiency arbiter."
"``TCC_EA0_RDREQ_32B_sum``", "Total number of 32-byte read requests to efficiency arbiter."
"``TCC_EA0_RD_UNCACHED_32B_sum``", "Total number of 32-byte efficiency arbiter reads due to uncached traffic."
"``TCC_EA0_RDREQ_IO_CREDIT_STALL_sum``", "Total number of cycles there is a stall due to the read request interface running out of IO credits."
"``TCC_EA0_RDREQ_GMI_CREDIT_STALL_sum``", "Total number of cycles there is a stall due to the read request interface running out of GMI credits."
"``TCC_EA0_RDREQ_DRAM_CREDIT_STALL_sum``", "Total number of cycles there is a stall due to the read request interface running out of DRAM credits."
"``TCC_EA0_RDREQ_DRAM_sum``", "Total number of 32-byte or 64-byte efficiency arbiter read requests to HBM."
"``TCC_EA0_WRREQ_DRAM_sum``", "Total number of 32-byte or 64-byte efficiency arbiter write requests to HBM."
"``TCC_HIT_sum``", "Total number of L2 cache hits."
"``TCC_MISS_sum``", "Total number of L2 cache misses."
"``TCC_NC_REQ_sum``", "Total number of non-coherently cached requests."
"``TCC_NORMAL_WRITEBACK_sum``", "Total number of writebacks due to requests that are not writeback requests."
"``TCC_NORMAL_EVICT_sum``", "Total number of evictions due to requests that are not invalidate or probe requests."
"``TCC_PROBE_sum``", "Total number of probe requests."
"``TCC_PROBE_ALL_sum``", "Total number of external probe requests with ``EA0_TCC_preq_all == 1``."
"``TCC_READ_sum``", "Total number of L2 cache read requests (including compressed reads but not metadata reads)."
"``TCC_REQ_sum``", "Total number of all types of L2 cache requests."
"``TCC_RW_REQ_sum``", "Total number of coherently cached with write requests."
"``TCC_STREAMING_REQ_sum``", "Total number of L2 cache streaming requests."
"``TCC_TAG_STALL_sum``", "Total number of cycles the normal request pipeline in the tag is stalled for any reason."
"``TCC_TOO_MANY_EA0_WRREQS_STALL_sum``", "Total number of cycles L2 cache is unable to send an efficiency arbiter write request due to it reaching its maximum capacity of pending efficiency arbiter write requests."
"``TCC_UC_REQ_sum``", "Total number of uncached requests."
"``TCC_WRITE_sum``", "Total number of L2 cache write requests."
"``TCC_WRITEBACK_sum``", "Total number of lines written back to the main memory including writebacks of dirty lines and uncached write or atomic requests."
"``TCC_WRREQ_STALL_max``", "Maximum number of cycles a write request is stalled."
Hardware counters by, for, or over all texture cache per pipe instances
The AMD Instinct MI300 Series GPUs are based on the AMD CDNA 3
architecture, which was designed to deliver leadership performance for HPC, artificial intelligence (AI), and machine
learning (ML) workloads. The AMD Instinct MI300 Series GPUs are well-suited for extreme scalability and compute performance, running
on everything from individual servers to the world’s largest exascale supercomputers.
With the MI300 Series, AMD is introducing the Accelerator Complex Die (XCD), which contains the
GPU computational elements of the processor along with the lower levels of the cache hierarchy.
The following image depicts the structure of a single XCD in the AMD Instinct MI300 GPU Series.
```{figure} ../../data/shared/xcd-sys-arch.png
---
name: mi300-xcd
align: center
---
XCD-level system architecture showing 40 Compute Units, each with 32 KB of L1 cache, a Unified Compute System with 4 ACE Compute Accelerators, 4 MB of shared L2 cache, and an HWS Hardware Scheduler.
```
On the XCD, four Asynchronous Compute Engines (ACEs) send compute shader workgroups to the
Compute Units (CUs). The XCD has 40 CUs: 38 active CUs at the aggregate level and 2 disabled CUs for
yield management. The CUs all share a 4 MB L2 cache that serves to coalesce all memory traffic for the
die. With less than half of the CUs of the AMD Instinct MI200 Series compute die, the AMD CDNA™ 3
XCD die is a smaller building block. However, it uses more advanced packaging and the processor
can include 6 or 8 XCDs for up to 304 CUs, roughly 40% more than MI250X.
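
The CU totals quoted above follow from simple arithmetic. The following is a minimal sketch that reproduces them. The MI250X figure of 220 CUs is not stated on this page and is included here as an assumption, and the function name is hypothetical.

```python
ACTIVE_CUS_PER_XCD = 38  # 40 physical CUs per XCD, 2 disabled for yield management
MI250X_CUS = 220         # assumption: not stated on this page


def total_active_cus(num_xcds: int) -> int:
    """Return the number of active CUs for a package with `num_xcds` XCDs."""
    return num_xcds * ACTIVE_CUS_PER_XCD


for xcds in (6, 8):
    print(f"{xcds} XCDs -> {total_active_cus(xcds)} active CUs")

# Relative increase of the 8-XCD configuration over the assumed MI250X CU count.
increase = total_active_cus(8) / MI250X_CUS - 1
print(f"Increase over MI250X: {increase:.0%}")
```

With these inputs, the 8-XCD configuration yields 304 active CUs, which is about 38% more than the assumed MI250X count and is consistent with the "roughly 40%" stated above.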
The MI300 Series integrates up to 8 vertically stacked XCDs, 8 stacks of
High-Bandwidth Memory 3 (HBM3) and 4 I/O dies (containing system
infrastructure) using the AMD Infinity Fabric™ technology as interconnect.
The Matrix Cores inside the CDNA 3 CUs have significant improvements, emphasizing AI and machine
learning, enhancing throughput of existing data types while adding support for new data types.
CDNA 2 Matrix Cores support FP16 and BF16, while offering INT8 for inference. Compared to MI250X
GPUs, CDNA 3 Matrix Cores triple the performance for FP16 and BF16, while providing a
performance gain of 6.8 times for INT8. FP8 has a performance gain of 16 times compared to FP32,
while TF32 has a gain of 4 times compared to FP32.
```{list-table} Peak-performance capabilities of the MI300X for different data types.
:header-rows: 1
:name: mi300x-perf-table
*
- Computation and Data Type
- FLOPS/CLOCK/CU
- Peak TFLOPS
*
- Matrix FP64
- 256
- 163.4
*
- Vector FP64
- 128
- 81.7
*
- Matrix FP32
- 256
- 163.4
*
- Vector FP32
- 256
- 163.4
*
- Vector TF32
- 1024
- 653.7
*
- Matrix FP16
- 2048
- 1307.4
*
- Matrix BF16
- 2048
- 1307.4
*
- Matrix FP8
- 4096
- 2614.9
*
- Matrix INT8
- 4096
- 2614.9
```
The above table summarizes the aggregated peak performance of the AMD Instinct MI300X Open
Compute Platform (OCP) Open Accelerator Modules (OAMs) for different data types and command
processors. The middle column lists the peak performance (number of data elements processed in a
single instruction) of a single compute unit if a SIMD (or matrix) instruction is submitted in each clock
cycle. The third column lists the theoretical peak performance of the OAM. The theoretical aggregated
peak memory bandwidth of the GPU is 5.3 TB per second.
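
The relationship between the second and third columns of the table can be checked numerically. The following is a minimal sketch, assuming 304 active CUs (the 8-XCD figure given earlier) and a peak engine clock of 2.1 GHz. The clock value is not stated on this page and is an assumption, as are the function and variable names.

```python
NUM_CUS = 304          # 8 XCDs x 38 active CUs
PEAK_CLOCK_GHZ = 2.1   # assumption: peak engine clock, not stated on this page

# FLOPS/CLOCK/CU values copied from the table.
FLOPS_PER_CLOCK_PER_CU = {
    "Matrix FP64": 256,
    "Vector FP64": 128,
    "Matrix FP16": 2048,
    "Matrix FP8": 4096,
}


def peak_tflops(flops_per_clock_per_cu: int,
                num_cus: int = NUM_CUS,
                clock_ghz: float = PEAK_CLOCK_GHZ) -> float:
    """Theoretical peak in TFLOPS: per-CU rate x number of CUs x clock."""
    gflops = flops_per_clock_per_cu * num_cus * clock_ghz
    return gflops / 1000.0


for name, rate in FLOPS_PER_CLOCK_PER_CU.items():
    print(f"{name}: {peak_tflops(rate):.1f} TFLOPS")
```

Under these assumptions, the computed values match the Peak TFLOPS column when rounded to one decimal place. The per-CU rates also give the ratios quoted earlier: 4096 / 256 = 16 for FP8 and 1024 / 256 = 4 for TF32, both relative to FP32.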
The following image shows the block diagram of the APU (left) and the OAM package (right) both
connected via AMD Infinity Fabric™ network on-chip.
:description: MI355 Series performance counters and metrics
:keywords: MI355, MI355X, MI3XX
***********************************
MI350 Series performance counters
***********************************
This topic lists and describes the hardware performance counters and derived metrics available on the AMD Instinct MI350 and MI355 GPUs. These counters are available for profiling using `ROCprofiler-SDK <https://rocm.docs.amd.com/projects/rocprofiler-sdk/en/latest/index.html>`_ and `ROCm Compute Profiler <https://rocm.docs.amd.com/projects/rocprofiler-compute/en/latest/>`_.
The following sections list the performance counters based on the IP blocks.
- ADC valid chunk is not available when dispatch walking is in progress in the multi-xcc mode.
* - CPC_ADC_DISPATCH_ALLOC_DONE
- ADC dispatch allocation is done.
* - CPC_ADC_VALID_CHUNK_END
- ADC crawler's valid chunk end in the multi-xcc mode.
* - CPC_SYNC_FIFO_FULL_LEVEL
- SYNC FIFO full last cycles.
* - CPC_SYNC_FIFO_FULL
- SYNC FIFO full times.
* - CPC_GD_BUSY
- ADC busy.
* - CPC_TG_SEND
- ADC thread group send.
* - CPC_WALK_NEXT_CHUNK
- ADC walking next valid chunk in the multi-xcc mode.
* - CPC_STALLED_BY_SE0_SPI
- ADC CSDATA stalled by SE0SPI.
* - CPC_STALLED_BY_SE1_SPI
- ADC CSDATA stalled by SE1SPI.
* - CPC_STALLED_BY_SE2_SPI
- ADC CSDATA stalled by SE2SPI.
* - CPC_STALLED_BY_SE3_SPI
- ADC CSDATA stalled by SE3SPI.
* - CPC_LTE_ALL
- CPC sync counter LteAll. Only Master XCD manages LteAll.
* - CPC_SYNC_WRREQ_FIFO_BUSY
- CPC sync counter request FIFO is not empty.
* - CPC_CANE_BUSY
- CPC CANE bus is busy, which indicates the presence of inflight sync counter requests.
* - CPC_CANE_STALL
- CPC sync counter sending is stalled by CANE.
Shader pipe interpolators (SPI) counters
=========================================
.. list-table::
   :header-rows: 1
* - Hardware counter
- Definition
* - SPI_CS0_WINDOW_VALID
- Clock count enabled by PIPE0 perfcounter_start event.
* - SPI_CS0_BUSY
- Number of clocks with outstanding waves for PIPE0 (SPI or SH).
* - SPI_CS0_NUM_THREADGROUPS
- Number of thread groups launched for PIPE0.
* - SPI_CS0_CRAWLER_STALL
- Number of clocks when PIPE0 event or wave order FIFO is full.
* - SPI_CS0_EVENT_WAVE
- Number of PIPE0 events and waves.
* - SPI_CS0_WAVE
- Number of PIPE0 waves.
* - SPI_CS1_WINDOW_VALID
- Clock count enabled by PIPE1 perfcounter_start event.
* - SPI_CS1_BUSY
- Number of clocks with outstanding waves for PIPE1 (SPI or SH).
* - SPI_CS1_NUM_THREADGROUPS
- Number of thread groups launched for PIPE1.
* - SPI_CS1_CRAWLER_STALL
- Number of clocks when PIPE1 event or wave order FIFO is full.
* - SPI_CS1_EVENT_WAVE
- Number of PIPE1 events and waves.
* - SPI_CS1_WAVE
- Number of PIPE1 waves.
* - SPI_CS2_WINDOW_VALID
- Clock count enabled by PIPE2 perfcounter_start event.
* - SPI_CS2_BUSY
- Number of clocks with outstanding waves for PIPE2 (SPI or SH).
* - SPI_CS2_NUM_THREADGROUPS
- Number of thread groups launched for PIPE2.
* - SPI_CS2_CRAWLER_STALL
- Number of clocks when PIPE2 event or wave order FIFO is full.
* - SPI_CS2_EVENT_WAVE
- Number of PIPE2 events and waves.
* - SPI_CS2_WAVE
- Number of PIPE2 waves.
* - SPI_CS3_WINDOW_VALID
- Clock count enabled by PIPE3 perfcounter_start event.
* - SPI_CS3_BUSY
- Number of clocks with outstanding waves for PIPE3 (SPI or SH).
* - SPI_CS3_NUM_THREADGROUPS
- Number of thread groups launched for PIPE3.
* - SPI_CS3_CRAWLER_STALL
- Number of clocks when PIPE3 event or wave order FIFO is full.
* - SPI_CS3_EVENT_WAVE
- Number of PIPE3 events and waves.
* - SPI_CS3_WAVE
- Number of PIPE3 waves.
* - SPI_CSQ_P0_Q0_OCCUPANCY
- Sum of occupancy info for PIPE0 Queue0.
* - SPI_CSQ_P0_Q1_OCCUPANCY
- Sum of occupancy info for PIPE0 Queue1.
* - SPI_CSQ_P0_Q2_OCCUPANCY
- Sum of occupancy info for PIPE0 Queue2.
* - SPI_CSQ_P0_Q3_OCCUPANCY
- Sum of occupancy info for PIPE0 Queue3.
* - SPI_CSQ_P0_Q4_OCCUPANCY
- Sum of occupancy info for PIPE0 Queue4.
* - SPI_CSQ_P0_Q5_OCCUPANCY
- Sum of occupancy info for PIPE0 Queue5.
* - SPI_CSQ_P0_Q6_OCCUPANCY
- Sum of occupancy info for PIPE0 Queue6.
* - SPI_CSQ_P0_Q7_OCCUPANCY
- Sum of occupancy info for PIPE0 Queue7.
* - SPI_CSQ_P1_Q0_OCCUPANCY
- Sum of occupancy info for PIPE1 Queue0.
* - SPI_CSQ_P1_Q1_OCCUPANCY
- Sum of occupancy info for PIPE1 Queue1.
* - SPI_CSQ_P1_Q2_OCCUPANCY
- Sum of occupancy info for PIPE1 Queue2.
* - SPI_CSQ_P1_Q3_OCCUPANCY
- Sum of occupancy info for PIPE1 Queue3.
* - SPI_CSQ_P1_Q4_OCCUPANCY
- Sum of occupancy info for PIPE1 Queue4.
* - SPI_CSQ_P1_Q5_OCCUPANCY
- Sum of occupancy info for PIPE1 Queue5.
* - SPI_CSQ_P1_Q6_OCCUPANCY
- Sum of occupancy info for PIPE1 Queue6.
* - SPI_CSQ_P1_Q7_OCCUPANCY
- Sum of occupancy info for PIPE1 Queue7.
* - SPI_CSQ_P2_Q0_OCCUPANCY
- Sum of occupancy info for PIPE2 Queue0.
* - SPI_CSQ_P2_Q1_OCCUPANCY
- Sum of occupancy info for PIPE2 Queue1.
* - SPI_CSQ_P2_Q2_OCCUPANCY
- Sum of occupancy info for PIPE2 Queue2.
* - SPI_CSQ_P2_Q3_OCCUPANCY
- Sum of occupancy info for PIPE2 Queue3.
* - SPI_CSQ_P2_Q4_OCCUPANCY
- Sum of occupancy info for PIPE2 Queue4.
* - SPI_CSQ_P2_Q5_OCCUPANCY
- Sum of occupancy info for PIPE2 Queue5.
* - SPI_CSQ_P2_Q6_OCCUPANCY
- Sum of occupancy info for PIPE2 Queue6.
* - SPI_CSQ_P2_Q7_OCCUPANCY
- Sum of occupancy info for PIPE2 Queue7.
* - SPI_CSQ_P3_Q0_OCCUPANCY
- Sum of occupancy info for PIPE3 Queue0.
* - SPI_CSQ_P3_Q1_OCCUPANCY
- Sum of occupancy info for PIPE3 Queue1.
* - SPI_CSQ_P3_Q2_OCCUPANCY
- Sum of occupancy info for PIPE3 Queue2.
* - SPI_CSQ_P3_Q3_OCCUPANCY
- Sum of occupancy info for PIPE3 Queue3.
* - SPI_CSQ_P3_Q4_OCCUPANCY
- Sum of occupancy info for PIPE3 Queue4.
* - SPI_CSQ_P3_Q5_OCCUPANCY
- Sum of occupancy info for PIPE3 Queue5.
* - SPI_CSQ_P3_Q6_OCCUPANCY
- Sum of occupancy info for PIPE3 Queue6.
* - SPI_CSQ_P3_Q7_OCCUPANCY
- Sum of occupancy info for PIPE3 Queue7.
* - SPI_CSQ_P0_OCCUPANCY
- Sum of occupancy info for all PIPE0 queues.
* - SPI_CSQ_P1_OCCUPANCY
- Sum of occupancy info for all PIPE1 queues.
* - SPI_CSQ_P2_OCCUPANCY
- Sum of occupancy info for all PIPE2 queues.
* - SPI_CSQ_P3_OCCUPANCY
- Sum of occupancy info for all PIPE3 queues.
* - SPI_VWC0_VDATA_VALID_WR
- Number of clocks VGPR bus_0 writes VGPRs.
* - SPI_VWC1_VDATA_VALID_WR
- Number of clocks VGPR bus_1 writes VGPRs.
* - SPI_CSC_WAVE_CNT_BUSY
- Number of cycles when there is any wave in the pipe.
Compute unit (SQ) counters
===========================
.. list-table::
   :header-rows: 1
* - Hardware counter
- Definition
* - SQ_INSTS_VALU_MFMA_F6F4
- Number of VALU V_MFMA_*_F6F4 instructions.
* - SQ_INSTS_VALU_MFMA_MOPS_F6F4
- Number of VALU matrix with the performed math operations (add or mul) divided by 512, assuming a full EXEC mask of F6 or F4 data type.
* - SQ_ACTIVE_INST_VALU2
- Number of quad-cycles when two VALU instructions are issued (per-simd, nondeterministic).
* - SQ_INSTS_LDS_LOAD
- Number of LDS load instructions issued (per-simd, emulated).
* - SQ_INSTS_LDS_STORE
- Number of LDS store instructions issued (per-simd, emulated).
* - SQ_INSTS_LDS_ATOMIC
- Number of LDS atomic instructions issued (per-simd, emulated).
* - SQ_INSTS_LDS_LOAD_BANDWIDTH
- Total number of 64-bytes loaded (instrSize * CountOnes(EXEC))/64 (per-simd, emulated).
* - SQ_INSTS_LDS_STORE_BANDWIDTH
- Total number of 64-bytes written (instrSize * CountOnes(EXEC))/64 (per-simd, emulated).
* - SQ_INSTS_LDS_ATOMIC_BANDWIDTH
- Total number of 64-bytes atomic (instrSize * CountOnes(EXEC))/64 (per-simd, emulated).
* - SQ_INSTS_VALU_FLOPS_FP16
- Counts FLOPS per instruction on float 16 excluding MFMA/SMFMA.
* - SQ_INSTS_VALU_FLOPS_FP32
- Counts FLOPS per instruction on float 32 excluding MFMA/SMFMA.
* - SQ_INSTS_VALU_FLOPS_FP64
- Counts FLOPS per instruction on float 64 excluding MFMA/SMFMA.
* - SQ_INSTS_VALU_FLOPS_FP16_TRANS
- Counts FLOPS per instruction on float 16 trans excluding MFMA/SMFMA.
* - SQ_INSTS_VALU_FLOPS_FP32_TRANS
- Counts FLOPS per instruction on float 32 trans excluding MFMA/SMFMA.
* - SQ_INSTS_VALU_FLOPS_FP64_TRANS
- Counts FLOPS per instruction on float 64 trans excluding MFMA/SMFMA.
* - SQ_INSTS_VALU_IOPS
- Counts OPS per instruction on integer or unsigned or bit data (per-simd, emulated).
* - SQ_LDS_DATA_FIFO_FULL
- Number of cycles LDS data FIFO is full (nondeterministic, unwindowed).
* - SQ_LDS_CMD_FIFO_FULL
- Number of cycles LDS command FIFO is full (nondeterministic, unwindowed).
* - SQ_VMEM_TA_ADDR_FIFO_FULL
- Number of cycles texture requests are stalled due to full address FIFO in TA (nondeterministic, unwindowed).
* - SQ_VMEM_TA_CMD_FIFO_FULL
- Number of cycles texture requests are stalled due to full cmd FIFO in TA (nondeterministic, unwindowed).
* - SQ_VMEM_WR_TA_DATA_FIFO_FULL
- Number of cycles texture writes are stalled due to full data FIFO in TA (nondeterministic, unwindowed).
* - SQC_ICACHE_MISSES_DUPLICATE
- Number of duplicate misses (access to a non-resident, miss pending CL) (per-SQ, per-Bank, nondeterministic).
* - SQC_DCACHE_MISSES_DUPLICATE
- Number of duplicate misses (access to a non-resident, miss pending CL) (per-SQ, per-Bank, nondeterministic).
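
The scaled counters in this table can be converted back to raw quantities. The following is a minimal sketch, assuming the counter values have already been collected with a profiler. The function names and sample values are hypothetical. Only the scale factors come from the definitions above: 512 operations per ``SQ_INSTS_VALU_MFMA_MOPS_F6F4`` count and 64 bytes per ``SQ_INSTS_LDS_*_BANDWIDTH`` count.

.. code-block:: python

   MFMA_OPS_PER_COUNT = 512   # SQ_INSTS_VALU_MFMA_MOPS_F6F4 is reported in units of 512 operations
   LDS_BYTES_PER_COUNT = 64   # SQ_INSTS_LDS_*_BANDWIDTH counters are reported in 64-byte units


   def mfma_f6f4_ops(mops_counter: int) -> int:
       """Convert SQ_INSTS_VALU_MFMA_MOPS_F6F4 to a number of math operations."""
       return mops_counter * MFMA_OPS_PER_COUNT


   def lds_bytes(bandwidth_counter: int) -> int:
       """Convert an SQ_INSTS_LDS_*_BANDWIDTH counter value to bytes."""
       return bandwidth_counter * LDS_BYTES_PER_COUNT


   # Hypothetical sample values, for illustration only.
   print(mfma_f6f4_ops(1000))  # 512000
   print(lds_bytes(2048))      # 131072
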
Texture addressing (TA) unit counters
======================================
.. list-table::
   :header-rows: 1
* - Hardware counter
- Definition
* - TA_BUFFER_READ_LDS_WAVEFRONTS
- Number of buffer read wavefronts for LDS return processed by the TA.
* - TA_FLAT_READ_LDS_WAVEFRONTS
- Number of flat opcode reads for LDS return processed by the TA.
Texture data (TD) unit counters
================================
.. list-table::
   :header-rows: 1
* - Hardware counter
- Definition
* - TD_WRITE_ACKT_WAVEFRONT
- Number of write acknowledgments, sent to SQ and not to SP.
* - TD_TD_SP_TRAFFIC
- Number of times this TD sends data to the SP.
Texture cache per pipe (TCP) counters
======================================
.. list-table::
   :header-rows: 1
* - Hardware counter
- Definition
* - TCP_TCP_TA_ADDR_STALL_CYCLES
- TCP stalls TA addr interface.
* - TCP_TCP_TA_DATA_STALL_CYCLES
- TCP stalls TA data interface. Now windowed.
* - TCP_LFIFO_STALL_CYCLES
- Memory latency FIFOs full stall.
* - TCP_RFIFO_STALL_CYCLES
- Memory Request FIFOs full stall.
* - TCP_TCR_RDRET_STALL
- Write into cache stalled by read return from TCR.
* - TCP_PENDING_STALL_CYCLES
- Stall due to data pending from L2.
* - TCP_UTCL1_SERIALIZATION_STALL
- Total number of stalls caused due to serializing translation requests through the UTCL1.
* - TCP_UTCL1_THRASHING_STALL
- Stall caused by thrashing feature in any probe. Lacks accuracy when the stall signal overlaps between probe0 and probe1, which is worse with MECO of thrashing deadlock. Some probe0 events could miss being counted in with MECO on. This perf count provides a rough thrashing estimate.
* - TCP_UTCL1_TRANSLATION_MISS_UNDER_MISS
- Translation miss_under_miss.
* - TCP_UTCL1_STALL_INFLIGHT_MAX
- Total UTCL1 stalls due to inflight counter saturation.
* - TCP_UTCL1_STALL_LRU_INFLIGHT
- Total UTCL1 stalls due to LRU cache line with inflight traffic.
* - TCP_UTCL1_STALL_MULTI_MISS
- Total UTCL1 stalls due to arbitrated multiple misses.
* - TCP_UTCL1_LFIFO_FULL
- Total UTCL1 and UTCL2 latency, which hides FIFO full cycles.
* - TCP_UTCL1_STALL_LFIFO_NOT_RES
- Total UTCL1 stalls due to UTCL2 latency, which hides FIFO output (not resident).
* - TCP_UTCL1_STALL_UTCL2_REQ_OUT_OF_CREDITS
- Total UTCL1 stalls due to UTCL2_req being out of credits.
* - TCP_CLIENT_UTCL1_INFLIGHT
- The sum of inflight client to UTCL1 requests per cycle.
* - TCP_TAGRAM0_REQ
- Total L2 requests mapping to TagRAM 0 from this TCP to all TCCs.
* - TCP_TAGRAM1_REQ
- Total L2 requests mapping to TagRAM 1 from this TCP to all TCCs.
* - TCP_TAGRAM2_REQ
- Total L2 requests mapping to TagRAM 2 from this TCP to all TCCs.
* - TCP_TAGRAM3_REQ
- Total L2 requests mapping to TagRAM 3 from this TCP to all TCCs.
* - TCP_TCP_LATENCY
- Total TCP wave latency (from the first clock of wave entering to the first clock of wave leaving). Divide by TA_TCP_STATE_READ to find average wave latency.
* - TCP_TCC_READ_REQ_LATENCY
- Total TCP to TCC request latency for reads and atomics with return. Not Windowed.
* - TCP_TCC_WRITE_REQ_LATENCY
- Total TCP to TCC request latency for writes and atomics without return. Not Windowed.
* - TCP_TCC_WRITE_REQ_HOLE_LATENCY
- Total TCP req to TCC hole latency for writes and atomics. Not Windowed.
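
The ``TCP_TCP_LATENCY`` definition describes a derived metric: the accumulated latency divided by ``TA_TCP_STATE_READ``. The following is a minimal sketch of that division, assuming both counters were collected over the same run. The function name and sample values are hypothetical.

.. code-block:: python

   def average_wave_latency(tcp_tcp_latency: int, ta_tcp_state_read: int) -> float:
       """Average TCP wave latency in clocks: TCP_TCP_LATENCY / TA_TCP_STATE_READ."""
       if ta_tcp_state_read == 0:
           # No waves were counted, so the average is undefined; report 0.0 here.
           return 0.0
       return tcp_tcp_latency / ta_tcp_state_read


   # Hypothetical sample values, for illustration only.
   print(average_wave_latency(1_200_000, 4_000))  # 300.0
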
Texture cache per channel (TCC) counters
=========================================
.. list-table::
   :header-rows: 1
* - Hardware counter
- Definition
* - TCC_READ_SECTORS
- Total number of 32B data sectors in read requests.
* - TCC_WRITE_SECTORS
- Total number of 32B data sectors in write requests.
* - TCC_ATOMIC_SECTORS
- Total number of 32B data sectors in atomic requests.
* - TCC_BYPASS_REQ
- Number of bypass requests. This is measured at the tag block.
* - TCC_LATENCY_FIFO_FULL
- Number of cycles when the latency FIFO is full.
* - TCC_SRC_FIFO_FULL
- Number of cycles when the SRC FIFO is assumed to be full as measured at the IB block.
* - TCC_EA0_RDREQ_64B
- Number of 64-byte TCC/EA read requests.
* - TCC_EA0_RDREQ_128B
- Number of 128-byte TCC/EA read requests.
* - TCC_IB_REQ
- Number of requests through the IB. This measures the number of raw requests from graphics clients to this TCC.
* - TCC_IB_STALL
- Number of cycles when the IB output is stalled.
* - TCC_EA0_WRREQ_WRITE_DRAM
- Number of TCC/EA write requests (32-byte or 64-byte) destined for DRAM (MC).
* - TCC_EA0_WRREQ_ATOMIC_DRAM
- Number of TCC/EA atomic requests (32-byte or 64-byte) destined for DRAM (MC).
* - TCC_EA0_RDREQ_DRAM_32B
- Number of 32-byte TCC/EA read requests due to DRAM traffic. One 64-byte request is counted as two and one 128-byte as four.
* - TCC_EA0_RDREQ_GMI_32B
- Number of 32-byte TCC/EA read requests due to GMI traffic. One 64-byte request is counted as two and one 128-byte as four.
* - TCC_EA0_RDREQ_IO_32B
- Number of 32-byte TCC/EA read requests due to IO traffic. One 64-byte request is counted as two and one 128-byte as four.
* - TCC_EA0_WRREQ_WRITE_DRAM_32B
- Number of 32-byte TCC/EA write requests due to DRAM traffic. One 64-byte request is counted as two.
* - TCC_EA0_WRREQ_ATOMIC_DRAM_32B
- Number of 32-byte TCC/EA atomic requests due to DRAM traffic. One 64-byte request is counted as two.
* - TCC_EA0_WRREQ_WRITE_GMI_32B
- Number of 32-byte TCC/EA write requests due to GMI traffic. One 64-byte request is counted as two.
* - TCC_EA0_WRREQ_ATOMIC_GMI_32B
- Number of 32-byte TCC/EA atomic requests due to GMI traffic. One 64-byte request is counted as two.
* - TCC_EA0_WRREQ_WRITE_IO_32B
- Number of 32-byte TCC/EA write requests due to IO traffic. One 64-byte request is counted as two.
* - TCC_EA0_WRREQ_ATOMIC_IO_32B
- Number of 32-byte TCC/EA atomic requests due to IO traffic. One 64-byte request is counted as two.
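
The ``_32B`` counters in this table normalize requests of different sizes to 32-byte units. The following is a minimal sketch of that normalization and of the conversion to bytes. The function names and sample values are hypothetical, and the weights (two for a 64-byte request, four for a 128-byte request) are taken from the definitions above.

.. code-block:: python

   SECTOR_BYTES = 32


   def sectors_32b(n_32b: int = 0, n_64b: int = 0, n_128b: int = 0) -> int:
       """Count requests in 32-byte units: a 64-byte request counts as 2, a 128-byte request as 4."""
       return n_32b + 2 * n_64b + 4 * n_128b


   def sectors_to_bytes(count_32b: int) -> int:
       """Convert a ``_32B`` counter value to bytes."""
       return count_32b * SECTOR_BYTES


   # Hypothetical mix of read requests: 10 x 32 bytes, 5 x 64 bytes, 2 x 128 bytes.
   sample = sectors_32b(n_32b=10, n_64b=5, n_128b=2)
   print(sample)                    # 28
   print(sectors_to_bytes(sample))  # 896
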
If you don't see this line, click `Show all checks` to get an itemized view.
## Command line
You can build our documentation via the command line using Python.
See the `build.tools.python` setting in the [Read the Docs configuration file](https://github.com/ROCm/ROCm/blob/develop/.readthedocs.yaml) for the Python version used by Read the Docs to build documentation.
See the [Python requirements file](https://github.com/ROCm/ROCm/blob/develop/docs/sphinx/requirements.txt) for Python packages needed to build the documentation.
Use the Python Virtual Environment (`venv`) and run the following commands from the project root:
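
The same setup can be scripted. The following is a minimal sketch, not the project's official procedure. It uses the standard `venv` module and assumes a `.venv` directory name, an `_build/html` output folder, and a plain `sphinx -b html` invocation, none of which are specified on this page. The helper names are hypothetical.

```python
import os
import subprocess
import sys
from pathlib import Path


def venv_python(venv_dir: Path) -> Path:
    """Return the interpreter path inside a virtual environment."""
    if os.name == "nt":
        return venv_dir / "Scripts" / "python.exe"
    return venv_dir / "bin" / "python"


def build_commands(project_root: Path, venv_dir_name: str = ".venv") -> list[list[str]]:
    """Return the commands to create the venv, install requirements, and build HTML."""
    venv_dir = project_root / venv_dir_name
    python = str(venv_python(venv_dir))
    docs = project_root / "docs"
    return [
        [sys.executable, "-m", "venv", str(venv_dir)],
        [python, "-m", "pip", "install", "-r", str(docs / "sphinx" / "requirements.txt")],
        # Assumed output location; adjust to match your setup.
        [python, "-m", "sphinx", "-b", "html", str(docs), str(project_root / "_build" / "html")],
    ]


def build_docs(project_root: Path) -> None:
    """Run each command in order and stop at the first failure."""
    for command in build_commands(project_root):
        subprocess.run(command, check=True)
```

To run the build from the project root, call `build_docs(Path.cwd())`.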
The ROCm documentation, like all of ROCm, is open source and available on GitHub. You can contribute to the ROCm documentation by forking the appropriate repository, making your changes, and opening a pull request.
To provide feedback on the ROCm documentation, including submitting an issue or suggesting a feature, see [Providing feedback about the ROCm documentation](./feedback.md).
## The ROCm repositories
The repositories for ROCm and all ROCm components are available on GitHub.
| ROCm installation for Linux | [https://github.com/ROCm/rocm-install-on-linux/tree/develop/docs](https://github.com/ROCm/rocm-install-on-linux/tree/develop/docs) |
| ROCm HIP SDK installation for Windows | [https://github.com/ROCm/rocm-install-on-windows/tree/develop/docs](https://github.com/ROCm/rocm-install-on-windows/tree/develop/docs) |
Individual components have their own repositories with their own documentation in their own `docs` folders.
The sub-folders within the `docs` folders across ROCm are typically structured as follows:
| Sub-folder name | Documentation type |
|-------|----------|
| `install` | Installation instructions, build instructions, and prerequisites |
| `conceptual` | Important concepts |
| `how-to` | How to implement specific use cases |
| `tutorials` | Tutorials |
| `reference` | API references and other reference resources |
## Editing and adding to the documentation
ROCm documentation follows the [Google developer documentation style guide](https://developers.google.com/style/highlights).
Most topics in the ROCm documentation are written in [reStructuredText (rst)](https://www.sphinx-doc.org/en/master/usage/restructuredtext/index.html), with some topics written in Markdown. Only use reStructuredText when adding new topics. Only use Markdown if the topic you are editing is already in Markdown.
To edit or add to the documentation:
1. Fork the repository you want to add to or edit.
2. Clone your fork locally.
3. Create a new local branch cut from the `develop` branch of the repository.
4. Make your changes to the documentation.
5. Optionally, build the documentation locally before creating a pull request by running the following commands from within the `docs` folder:
```bash
pip3 install -r sphinx/requirements.txt  # You only need to run this command once
```
The output files will be located in the `docs/_build` folder. Open `docs/_build/html/index.html` to view the documentation.
For more information on ROCm build tools, see [Documentation toolchain](toolchain.md).
6. Push your changes. A GitHub link will be returned in the output of the `git push` command. Open this link in a browser to create the pull request.
The documentation is built as part of the checks on the pull request, along with spell checking and linting. Scroll to the bottom of your pull request to view all the checks.
Verify that the linting and spell checking have passed, and that the documentation was built successfully. New words or acronyms can be added to the [wordlist file](https://github.com/ROCm/rocm-docs-core/blob/develop/.wordlist.txt). The wordlist is subject to approval by the ROCm documentation team.
To access the Read the Docs build of your pull request, click the Details link next to the Read the Docs build check. Verify that your changes are in the build and look as expected.


Your pull request will be reviewed by a member of the ROCm documentation team.
See the [GitHub documentation](https://docs.github.com/en) for information on how to fork and clone a repository, and how to create and push a local branch.
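
The clone and branch steps above can be sketched as a small shell helper. This is a minimal sketch, not part of any ROCm tooling: `start_docs_branch` is a hypothetical function name, and the URL, branch name, and directory in the usage comments are placeholders. It assumes you have already forked the repository on GitHub and that the fork has a `develop` branch.

```bash
# Hypothetical helper: clone a fork and create a working branch cut from develop.
start_docs_branch() {
    fork_url="$1"     # URL of your fork
    branch="$2"       # name of the new local branch
    target_dir="$3"   # directory to clone into

    # Clone your fork locally.
    git clone "$fork_url" "$target_dir" || return 1

    # Create the new branch from the fork's develop branch.
    # --no-track keeps the new branch from tracking origin/develop.
    git -C "$target_dir" checkout -b "$branch" --no-track origin/develop
}

# Example usage (placeholders, replace with your own values):
#   start_docs_branch "https://github.com/<your-username>/<repository>.git" my-docs-fix <repository>
# After editing and committing your changes, push the branch to your fork:
#   git -C <repository> push --set-upstream origin my-docs-fix
```

Passing the fork URL as an argument keeps the helper independent of any particular repository, so the same function works for whichever ROCm component you are editing.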
```{important}
By creating a pull request (PR), you agree to allow your contribution to be licensed under the terms of the
LICENSE.txt file in the corresponding repository. Different repositories can use different licenses.
```
Feedback about the ROCm documentation is welcome. You can provide feedback about the ROCm documentation either through GitHub Discussions or GitHub Issues.
## Participating in discussions through GitHub Discussions
You can ask questions, view announcements, suggest new features, and communicate with other members of the community through [GitHub Discussions](https://github.com/ROCm/ROCm/discussions).
## Submitting issues through GitHub Issues
You can submit issues through [GitHub Issues](https://github.com/ROCm/ROCm/issues).
When creating a new issue, follow these guidelines:
1. Always search first to check whether the same issue already exists. If it does, upvote it and add a comment with any additional details you have.
2. If you find an issue that is similar to your issue, log your issue, then add a comment that includes a link to the similar issue, as well as its issue number.
3. Always provide as much information as possible. This helps reduce the time required to reproduce the issue.
After creating your issue, make sure to check it regularly for any requests for additional information.
For information about contributing content to the ROCm documentation, see [Contributing to the ROCm documentation](./contributing.md).
The ROCm documentation relies on several open source toolchains and sites.
## rocm-docs-core
[rocm-docs-core](https://github.com/ROCm/rocm-docs-core) is an AMD-maintained
project that applies customizations for the ROCm documentation. This project is the tool most ROCm repositories use as part of their documentation build pipeline. It is available as a [pip package on PyPI](https://pypi.org/project/rocm-docs-core/).
See the rocm-docs-core documentation for its user and developer guides.
## Sphinx
[Sphinx](https://www.sphinx-doc.org/en/master/) is a documentation generator originally used for Python. It is now widely used in the open source community.
### Sphinx External ToC
[Sphinx External ToC](https://sphinx-external-toc.readthedocs.io/en/latest/intro.html) is a Sphinx extension used for ROCm documentation navigation. This tool generates a navigation menu on the left
based on a YAML file (`_toc.yml.in`) that contains the table of contents.
### Sphinx-book-theme
[Sphinx-book-theme](https://sphinx-book-theme.readthedocs.io/en/latest/) is a Sphinx theme that defines the base appearance for ROCm documentation. ROCm documentation applies some customization, such as a custom header and footer, on top of the Sphinx Book Theme.
### Sphinx Design
[Sphinx Design](https://sphinx-design.readthedocs.io/en/latest/index.html) is a Sphinx extension that adds design functionality. ROCm documentation uses Sphinx Design for grids, cards, and synchronized tabs.
## Doxygen
[Doxygen](https://www.doxygen.nl/) is a documentation generator that extracts information from in-code comments. It is used for API documentation.
## Breathe
[Breathe](https://www.breathe-doc.org/) is a Sphinx plugin for integrating Doxygen content.
## Read the Docs
[Read the Docs](https://docs.readthedocs.io/en/stable/) is the service that builds and hosts the HTML version of the ROCm documentation.