From 0a237dfd42c8927e7e46a8ef1ec5ca1d606117ca Mon Sep 17 00:00:00 2001
From: alexxu-amd <159800977+alexxu-amd@users.noreply.github.com>
Date: Thu, 14 Nov 2024 13:14:37 -0500
Subject: [PATCH] Sync develop from external repo (#205)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

* Update version list with 6.2.0 (#3505) (#3506) * Fix link to meta-llama finetuning recipes * Spellcheck fixes in release notes templates (#3526) (#3548) * fix spelling in 5.4.x templates * add to wordlist * update templates update wordlist * remove extra_components rm extra_components * fix spelling Co-authored-by: Peter Park * Fix link to rocr debug agent (#3533) Co-authored-by: Sam Wu <22262939+samjwu@users.noreply.github.com> * Fix intersphinx links (#3546) * update fw install links * fix more intersphinx links * fix more links * add rocPyDecode repo to ROCm6.2 manifest file (#3541) (#3553) Co-authored-by: Yanyao Wang Co-authored-by: Wang, Yanyao * Fix typo for TFLOPs metric in MI250 architecture page * Add rocm-examples to default.xml (#3583) * Add rocm 6.2.0 manifest file for rocm-build scripts (#3538) * Add rocm 6.2.0 manifest file for rocm-build scripts Signed-off-by: David Galiffi * Add "rocm-examples" --------- Signed-off-by: David Galiffi * Add a section on increasing memory allocation to the MI300A system op… (#3587) * Add a section on increasing memory allocation to the MI300A system optimization guide * Addition to wordlist * Change GB to GiB for consistency * Standardize GiB/KiB spacing * Minor wording changes * Update build scripts for ROCm6.2 release * fix README.md for Ubuntu24 docker * Correct ttm to amdttm (#3648) * Expand the section on changing thread affinity (#3653) * Expand the section on changing thread affinity * Clarify the methods for configuring allocatable memory settings * Small correction * Update model-quantization.rst to import `BitsAndBytesConfig` from transformers library (#3638) * remove unneeded file (#3663) *
Fix intersphinx links (#3668) * fix links in install.rst * fix links in sys opt guides * Add introduction and links to the new guide to the vLLM optimized Doc… (#3637) * Add introduction and links to the new guide to the vLLM optimized Docker image on AMD Infinity Hub * Update target link for the Docker vLLM guide * Change target URL * Change link target URL again * Fixed broken link to RISC-V documentation * Add FBGEMM/FBGEMM_GPU to the Model acceleration libraries page (#3659) * Add FBGEMM/FBGEMM_GPU to the Model acceleration libraries page * Add words to wordlist and fix a typo * Add new sections for Docker and testing * Incorporate comments from the external review * Some minor edits and clarifications * Incorporate further review comments and fix test section * Add comment to test section * Change git clone command for FBGEMM repo * Change Docker command * Changes from internal review * Fix linting issue * Fixed broken links for tensile, rocprofiler, roctracer, hipify, rocm-cmake * add missing make command to bitsandbytes install commands (#3722) * Update link to rocRAND data type support (#3736) * Fix Radeon link and point at R6.1.3 as absolute link (#3757) * Fix Radeon link and point at R6.1.3 as absolute link (#3757) * Include rocal version change in the highlights (#177) * Include rocal version change in the highlights * Reworded rocal known issues and added link to rocal in highlights * Update ROCm manifest to 6.2.1 * Update ROCm branch name * Add 6.2.1 to version list (#3770) * Add links to GH issues in 6.2.1 release notes (#3769) * add MAD page * link to GitHub issues in release notes known issues * update templates for 6.2.1 * Revert "add MAD page" This reverts commit 9cce72bba306286c7eb317d592645d4e0e1b27aa.
* update wordlist for spellcheck linter * add rccl note * update rocal version change heading to be more obvious * make rocal note more specific * fix missing space * fix capitalization * Update RCCL known issue wording (#3775) * add MAD page * fix wording in RCCL known issue * Revert "add MAD page" This reverts commit c81d0f3b0a3620305b11de8745686c86b060b006. * update llvm version for 6.2.1 (#3779) * Fix broken links in 6.2.1 release notes (#3782) * External CI: Replace libomp dependencies with aomp (#3781) Add roctracer dependency for hipBLAS and rocWMMA testing * External CI: Add rocprofiler v1 and v2 smoke tests (#3784) * External CI: ROCgdb smoke tests (#3785) - Since this is an autotools project and not cmake, build and test on gfx942 system instead of separating into two jobs. Pipeline time is short anyway. - Follow build instructions to update build flags and to incorporate the ROCdbgapi. - Results are not parsed and graphed, but the log contents are printed at the end. This was helpful for debugging and will be kept in the pipeline, as the make check-gdb command's output was not helpful on its own. * External CI: rocPyDecode Smoke Test (#3786) * External CI: omniperf pipeline (#3788) - Referred to public documentation, source, and iterative attempts to create and improve build and test pipeline. - ctest failures are due to the test node not having expected marketing name string and override not working. - The fix should be on the omniperf repo side of things, so this pull request should be fine as is. 
* External CI: create omniperf pipeline IDs, update nightly build (#3790) * Fixed greater than to be less than in rocFFT changes * fix footnote for 6.1.0 (#3791) * fix footnote for 6.1.0 * fix empty columns in historical KFD title * External CI: Publish wheel as artifact for rocPyDecode (#3796) * fix build rocal for ROCm6.2.1 * Add ROCm6.2.1 manifest file * External CI: fix hip-tests symlink creation (#3799) * Docs: Add Ubuntu 24.04.1 (#3801) * add ubuntu 24.04.1 * add 24.04.1 to bottom os section * fix heading and template * Update compatibility-matrix.rst for OpenMP version * Update compatibility-matrix-historical-6.0.csv for OpenMP version * rm ubuntu 24.04.1 from 6.2.0 * Update docs/compatibility/compatibility-matrix.rst Co-authored-by: Young Hui - AMD <145490163+yhuiYH@users.noreply.github.com> * rm duplicate ubuntu in historical --------- Co-authored-by: Young Hui - AMD <145490163+yhuiYH@users.noreply.github.com> * Docs: Add Ubuntu 24.04.1 (#3801) * add ubuntu 24.04.1 * add 24.04.1 to bottom os section * fix heading and template * Update compatibility-matrix.rst for OpenMP version * Update compatibility-matrix-historical-6.0.csv for OpenMP version * rm ubuntu 24.04.1 from 6.2.0 * Update docs/compatibility/compatibility-matrix.rst Co-authored-by: Young Hui - AMD <145490163+yhuiYH@users.noreply.github.com> * rm duplicate ubuntu in historical --------- Co-authored-by: Young Hui - AMD <145490163+yhuiYH@users.noreply.github.com> * External CI: fixes for rocMLIR and nightly build (#3800) * External CI: fix symlinks for rocMLIR and nightly build * add pipeline IDs for hip-tests * fix hip-test ID typo * remove llvm-alt license (#3727) * remove llvm-alt license * fix linting error * External CI: enable ROCR-Runtime tests (#3809) * External CI: default branches for hip-tests, omniperf (#3811) * External CI: torch and torchvision smoke tests (#3810) * External CI: torch and torchvision smoke tests - Fixed issues with package name and version for the vision wheel that 
prevented it from installing. A patch is used until my pull request in vision repo is merged. - Referred to rocAutomation scripts to pick which test scripts to run out of the many in the torch and vision repo, and iteratively tested suggested scripts to see which ones completed in a timely manner. - Leveraging pytest-azurepipelines module to automatically parse and graph results from these tests. * External CI: omnitrace build pipeline (#3812) * External CI: omnitrace build pipeline starter - Adding initial set of dependencies and build flags. * External CI: omnitrace build pipeline - Add bison, rccl, texinfo dependencies based on build failures. - Add AMDGPU_TARGETS flag - Add ROCm binaries to PATH for clang-format and other tools used. * Fix indentation --------- Co-authored-by: Daniel Su * External CI: AMDMIGraphX Build Fix (#3814) - Swap to default gcc on OS to resolve build errors from recent commits. - Added libdnnl-dev dependency from iterative attempts with compiler change. - Referred to the passing GitHub checks to observe the compilers that were used. - Build CK jit lib and include in AMDMIGraphX build.
* External CI: test fixes w/ roctracer, list omniperf as partially succeeding (#3815) * External CI: rpp tests (#3816) * External CI: Build pipeline for rocprofiler-sdk (#3819) * External CI: Pipeline for rocprofiler-sdk * Add rocprofiler dependency * External CI: rocprofiler-sdk build pipeline --------- Co-authored-by: Daniel Su * External CI: Fix/add missing pipeline IDs (#3818) * Update default.xml - Change 6.2.1 to 6.2.2 * Add ROCm6.2.1 manifest file * External CI: omnitrace tests (#3822) * Update tags to 6.2.2 (#3827) * Update tags to 6.2.2 (#3827) * External CI: add roctracer to roc/hipSOLVER test deps (#3825) * External CI: add rocprofiler-sdk pipeline IDs (#3824) * External CI: AMDMIGraphX Smoke Tests (#3830) Co-authored-by: Daniel Su * External CI: MIOpen tests (#3837) * Point to release history instead of deprecated changelog (#3836) * External CI: filter out hipTensor extended tests (#3838) * added revised note re. radeon gpus (#3839) * Restructured the contributions section. 
(#3715) * testing if this file is editable * changed 'kebob-case' to 'dash-case' * Restructured the page to be more straightforward and provide additional repo information * forgot to save * Moved the topic sentence * Wrong accent on the a in diataxis * Removed the feedback info from contributing and moved it to Feedback * fixed spelling errors * fixed some wording and removed second person text * consolidated Build and Structure into Contribute; edited toolchai to (hopefully) conform to style guide; updated toc * updated the titles in the toc * made changes based on feedback * it's better when you save * removed structure and build; fixed something for the linter * added rst to wordlist * added customizations to wordlist * Add links to gpu cluster network guides (#3763) * Add links to gpu cluster network guides * Add newline character to eof * Make link absolute * add dynamic branch in toc * remove unnecessary page clean up * clean up index/toc * make multi-node topics adjacent --------- Co-authored-by: Peter Park * updated the radeon note (#3850) * External CI: Fix rocPyDecode wheel creation (#3852) - Set values for expected environment variables. - Accompanying changes required in rocPyDecode repo. Pull request will be made. * External CI: pytorch vision patch removal (#3855) My pull request applying this patch was merged upstream, so this is no longer needed and will break the pipeline since it can no longer be applied. * Build(deps): Bump rocm-docs-core from 1.8.1 to 1.8.2 in /docs/sphinx (#3807) Bumps [rocm-docs-core](https://github.com/ROCm/rocm-docs-core) from 1.8.1 to 1.8.2. - [Release notes](https://github.com/ROCm/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/v1.8.2/CHANGELOG.md) - [Commits](https://github.com/ROCm/rocm-docs-core/compare/v1.8.1...v1.8.2) --- updated-dependencies: - dependency-name: rocm-docs-core dependency-type: direct:production update-type: version-update:semver-patch ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * updated the radeon note, as it were (#3857) * updated the radeon note, as it were * updated the note again * Set devops team as codeowners for rocm-build (#3860) * Set ext CI as codeowners for rocm-build * Update CODEOWNERS to rocm-devops * External CI: Add option to pull mainline branch for dependencies (#3689) * External CI: Add option to pull mainline branch for dependencies * Missing parameter for mainline branch dependencies. * External CI: mainline branch definitions * Removed MIGraphX optimization page (#3848) * External CI: add a global variable to control gfx942 tests (#3864) * External CI: update component default/mainline branches (#3871) * External CI: Stop building gfx90a (#3872) Save on VM resources until infrastructure has test targets. * External CI: add libstdc++-12 to rocMLIR (#3874) * Add building doc section (#3873) * External CI: programmatically get latest aqlprofile (#3876) * External CI: use ctest for rocm-examples (#3877) * External CI: Tensile pipeline (#3884) * add oversubscription conceptual doc (#3885) add mitigiation steps add to toc move page for build move doc fix spelling update doc update oversubscription update order fix spelling add oversubscription to wordlist move oversubscription topic to bottom of toc and index * add oversubscription conceptual doc (#3885) add mitigiation steps add to toc move page for build move doc fix spelling update doc update oversubscription update order fix spelling add oversubscription to wordlist move oversubscription topic to bottom of toc and index (cherry picked from commit d0ecf51b0c9202475e2abe90a45b50df0de6d7ae) * add oversubscription conceptual doc (#3885) (cherry picked from commit d0ecf51b0c9202475e2abe90a45b50df0de6d7ae) * Add building doc section (#3873) (cherry picked from commit abc0e6a08781e4ec98c1566b2e82ed3806db3af5) * External CI: Add pipeline to build upstream boost 
(#3896) * Update bitsandbytes branch in docs (#3898) * Update bitsandbytes branch in docs (#3898) (cherry picked from commit b541be7bcb6541b4a00633972ed5d0a546ad4e85) * Documentation: Add reference to precision-support floating-point types (#3899) * External CI: use Boost template for MIOpen (#3903) * External CI: create rocprofiler-systems pipeline (#3906) * External CI: omnitrace/rocprof-sys pipeline IDs (#3908) * External CI: MIOpen parse test results (#3913) * External CI: Use pip to install latest cmake on test system (#3915) * added a link to the compatibility matrix (#3904) * added a link to the compatibility matrix * removed quotes * docs: Remove invalid amd_iommu=on parameter Per kernel-parameters.txt, there is no "on" option for amd_iommu. While intel_iommu has it, amd_iommu is automatically on unless specified otherwise. For more info, see these 2 links: https://www.kernel.org/doc/Documentation/admin-guide/kernel-parameters.txt https://github.com/torvalds/linux/blob/75aa74d52f43e75d0beb20572f98529071b700e5/drivers/iommu/amd/init.c#L3481 Signed-off-by: Kent Russell * docs: Remove invalid amd_iommu=on parameter Per kernel-parameters.txt, there is no "on" option for amd_iommu. While intel_iommu has it, amd_iommu is automatically on unless specified otherwise. 
For more info, see these 2 links: https://www.kernel.org/doc/Documentation/admin-guide/kernel-parameters.txt https://github.com/torvalds/linux/blob/75aa74d52f43e75d0beb20572f98529071b700e5/drivers/iommu/amd/init.c#L3481 Signed-off-by: Kent Russell (cherry picked from commit 74333b667d34ce9d90726572533b28059f3fe5b6) * External CI: hipBLASLt build now requires python packaging module (#3926) https://github.com/ROCm/hipBLASLt/pull/1250/files#diff-fee2e6f068b33fca3a1dc49392de8848dbf05c3f4632b680abb1052523e5a30fR35 * External CI: Moved location of upstream pytorch build scripts (#3930) https://github.com/pytorch/pytorch/pull/138103 * External CI: disable rocMLIR tests (#3931) * External CI: disable rocMLIR tests * roctracer AMDGPU_TARGETS flag * External CI: create a GPU diagnostics template (#3932) * External CI: Add CK into pytorch build environment (#3934) * Update rocm-6.2.2.xml (#3927) vim typo removed * External CI: add support to disable individual component tests (#3938) * External CI: AMDMIGraphX greater-equal pip dependencies (#3939) * Build(deps): Bump rocm-docs-core from 1.8.2 to 1.8.3 in /docs/sphinx (#3933) Bumps [rocm-docs-core](https://github.com/ROCm/rocm-docs-core) from 1.8.2 to 1.8.3. - [Release notes](https://github.com/ROCm/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/ROCm/rocm-docs-core/compare/v1.8.2...v1.8.3) --- updated-dependencies: - dependency-name: rocm-docs-core dependency-type: direct:production update-type: version-update:semver-patch ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * External CI: rocDecode add libva-amdgpu-dev dependency (#3940) * External CI: enumerate GPUs in gpu-diagnostics (#3942) * External CI: move gpu-diag directly before tests (#3943) * External CI: fix HIP_PIPELINE_ID (#3944) * External CI: pytorch pipeline updates (#3948) To support recent upstream changes and issues observed. * External CI: rocpydecode dependency installation change (#3954) - Install pybind11 through pip instead of apt - Add pip-installed pybind11 path to CMAKE_PREFIX_PATH - Tested against source of PR 122 * External CI: do not assume python is python3 for rocpydecode (#3955) * Improve consistency of the gpu-arch-specs table. (#3936) * Improve consistency of the gpu-arch-specs table. * Add XCD to the glossary. * External CI: Always force rocPyDecode cleanup step * External CI: Add aqlprofile to Tensile test dependencies (#3961) * add vllm performance validation doc (#3964) * External CI: various fixes (#3963) * add suggestions to vllm perf validation doc (#3968) * External CI: move allowPartiallySucceededBuilds to library variable (#3970) * External CI: suppress GPU diag warnings (#3972) * External CI: rocprofiler-compute pipeline files (#3973) * External CI: disable reload AMDGPU (#3974) * Update links to vllm perf validation doc (#3971) * update links to vllm perf validation doc * add PagedAttention to wordlist * External CI: Change test setup for rocPyDecode (#3978) - Use multiple potential locations for pybind11 to be found by cmake. 
* External CI: add roctracer to rocBLAS deps (#3982) * External CI: decode test changes (#3983) - Only target container with access to first device - Ensure pybind11-dev is uninstalled before the package manager install steps * Changed the introductory text linked to Radeon (#3988) Co-authored-by: prbasyal * External CI: finish rocprofiler-compute enablement (#3995) * External CI: add aomp as rocprofiler-systems dependency (#3996) * External CI: remove omniperf from nightly (#4000) * Sync from internal develop 6.2.4 (#4002) * add radeon pro v710 to gpu arch specs (#192) * Add V710 specs gpg: using RSA key 22223038B47B3ED4B3355AB11B54779B4780494E gpg: Good signature from "Peter Park (MKMPETEPARK01) " [ultimate] add some specs add cols clean up extra line * fix graphics l1 cache description * update SGPR for RDNA2 and RDNA3 archs * update VGPR * Apply suggestions from code review * change l2 cache to 4 * Update docs/reference/gpu-arch-specs.rst * ROCm 6.2.4 compatibility matrix (#186) * prep compat column (historical) and mi300x column * update historical compat matrix for 6.2.4 * update compat matrix for 6.2.4 * fix compat * fix thunk version * fix hipify ver * ROCm 6.2.4 release notes (#184) * prep 6.2.4 release notes * add mathlibs * add detail component changes * rm non-updated linnks * fix sentence * fix rocthrust v * rm offline installer * condense * add leo/ram fdback words * update documentation section * add rocm on radeon note * update os support note wording * update release * update version and GA date to 10-17 * update 6.2.4 rn * update wording * add link to v710 * update wording * update templ * simplify note * words os note words * change URLs to latest * update link to supported GPUs * Update versions.md 6.2.4 date to Oct 18 * Update conf.py release note date to Oct 18 --------- Co-authored-by: Sam Wu <22262939+samjwu@users.noreply.github.com> * Sync change from ROCm to ROCm-internal (#194) * Fix Radeon link and point at R6.1.3 as absolute link 
(#3757) * Update ROCm manifest to 6.2.1 * Update ROCm branch name * Add 6.2.1 to version list (#3770) * Add links to GH issues in 6.2.1 release notes (#3769) * add MAD page * link to GitHub issues in release notes known issues * update templates for 6.2.1 * Revert "add MAD page" This reverts commit 9cce72bba306286c7eb317d592645d4e0e1b27aa. * update wordlist for spellcheck linter * add rccl note * update rocal version change heading to be more obvious * make rocal note more specific * fix missing space * fix capitalization * Update RCCL known issue wording (#3775) * add MAD page * fix wording in RCCL known issue * Revert "add MAD page" This reverts commit c81d0f3b0a3620305b11de8745686c86b060b006. * update llvm version for 6.2.1 (#3779) * Fix broken links in 6.2.1 release notes (#3782) * External CI: Replace libomp dependencies with aomp (#3781) Add roctracer dependency for hipBLAS and rocWMMA testing * External CI: Add rocprofiler v1 and v2 smoke tests (#3784) * External CI: ROCgdb smoke tests (#3785) - Since this is an autotools project and not cmake, build and test on gfx942 system instead of separating into two jobs. Pipeline time is short anyway. - Follow build instructions to update build flags and to incorporate the ROCdbgapi. - Results are not parsed and graphed, but the log contents are printed at the end. This was helpful for debugging and will be kept in the pipeline, as the make check-gdb command's output was not helpful on its own. * External CI: rocPyDecode Smoke Test (#3786) * External CI: omniperf pipeline (#3788) - Referred to public documentation, source, and iterative attempts to create and improve build and test pipeline. - ctest failures are due to the test node not having expected marketing name string and override not working. - The fix should be on the omniperf repo side of things, so this pull request should be fine as is. 
* External CI: create omniperf pipeline IDs, update nightly build (#3790) * Fixed greater than to be less than in rocFFT changes * fix footnote for 6.1.0 (#3791) * fix footnote for 6.1.0 * fix empty columns in historical KFD title * External CI: Publish wheel as artifact for rocPyDecode (#3796) * External CI: fix hip-tests symlink creation (#3799) * Docs: Add Ubuntu 24.04.1 (#3801) * add ubuntu 24.04.1 * add 24.04.1 to bottom os section * fix heading and template * Update compatibility-matrix.rst for OpenMP version * Update compatibility-matrix-historical-6.0.csv for OpenMP version * rm ubuntu 24.04.1 from 6.2.0 * Update docs/compatibility/compatibility-matrix.rst Co-authored-by: Young Hui - AMD <145490163+yhuiYH@users.noreply.github.com> * rm duplicate ubuntu in historical --------- Co-authored-by: Young Hui - AMD <145490163+yhuiYH@users.noreply.github.com> * External CI: fixes for rocMLIR and nightly build (#3800) * External CI: fix symlinks for rocMLIR and nightly build * add pipeline IDs for hip-tests * fix hip-test ID typo * remove llvm-alt license (#3727) * remove llvm-alt license * fix linting error * External CI: enable ROCR-Runtime tests (#3809) * External CI: default branches for hip-tests, omniperf (#3811) * External CI: torch and torchvision smoke tests (#3810) * External CI: torch and torchvision smoke tests - Fixed issues with package name and version for the vision wheel that prevented it from installing. A patch is used until my pull request in vision repo is merged. - Referred to rocAutomation scripts to pick which test scripts to run out of the many in the torch and vision repo, and iteratively tested suggested scripts to see which ones completed in a timely manner. - Leveraging pytest-azurepipelines module to automatically parse and graph results from these tests. * External CI: omnitrace build pipeline (#3812) * External CI: omnitrace build pipeline starter - Adding initial set of dependencies and build flags. 
* External CI: omnitrace build pipeline - Add bison, rccl, texinfo dependencies based on build failures. - Add AMDGPU_TARGETS flag - Add ROCm binaries to PATH for clang-format and other tools used. * Fix indentation --------- Co-authored-by: Daniel Su * External CI: AMDMIGraphX Build Fix (#3814) - Swap to default gcc on OS to resolve build errors from recent commits. - Added libdnnl-dev dependency from iterative attempts with compiler change. - Referred to the passing GitHub checks to observe the compilers that was used. - Build CK jit lib and include in AMDMIGraphX build. * External CI: test fixes w/ roctracer, list omniperf as partially succeeding (#3815) * External CI: rpp tests (#3816) * External CI: Build pipeline for rocprofiler-sdk (#3819) * External CI: Pipeline for rocprofiler-sdk * Add rocprofiler dependency * External CI: rocprofiler-sdk build pipeline --------- Co-authored-by: Daniel Su * External CI: Fix/add missing pipeline IDs (#3818) * External CI: omnitrace tests (#3822) * Update tags to 6.2.2 (#3827) * External CI: add roctracer to roc/hipSOLVER test deps (#3825) * External CI: add rocprofiler-sdk pipeline IDs (#3824) * External CI: AMDMIGraphX Smoke Tests (#3830) Co-authored-by: Daniel Su * External CI: MIOpen tests (#3837) * Point to release history instead of deprecated changelog (#3836) * External CI: filter out hipTensor extended tests (#3838) * added revised note re. radeon gpus (#3839) * Restructured the contributions section. 
(#3715) * testing if this file is editable * changed 'kebob-case' to 'dash-case' * Restructured the page to be more straightforward and provide additional repo information * forgot to save * Moved the topic sentence * Wrong accent on the a in diataxis * Removed the feedback info from contributing and moved it to Feedback * fixed spelling errors * fixed some wording and removed second person text * consolidated Build and Structure into Contribute; edited toolchai to (hopefully) conform to style guide; updated toc * updated the titles in the toc * made changes based on feedback * it's better when you save * removed structure and build; fixed something for the linter * added rst to wordlist * added customizations to wordlist * Add links to gpu cluster network guides (#3763) * Add links to gpu cluster network guides * Add newline character to eof * Make link absolute * add dynamic branch in toc * remove unnecessary page clean up * clean up index/toc * make multi-node topics adjacent --------- Co-authored-by: Peter Park * updated the radeon note (#3850) * External CI: Fix rocPyDecode wheel creation (#3852) - Set values for expected environment variables. - Accompanying changes required in rocPyDecode repo. Pull request will be made. * External CI: pytorch vision patch removal (#3855) My pull request applying this patch was merged upstream, so this is no longer needed and will break the pipeline since it can no longer be applied. * Build(deps): Bump rocm-docs-core from 1.8.1 to 1.8.2 in /docs/sphinx (#3807) Bumps [rocm-docs-core](https://github.com/ROCm/rocm-docs-core) from 1.8.1 to 1.8.2. - [Release notes](https://github.com/ROCm/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/v1.8.2/CHANGELOG.md) - [Commits](https://github.com/ROCm/rocm-docs-core/compare/v1.8.1...v1.8.2) --- updated-dependencies: - dependency-name: rocm-docs-core dependency-type: direct:production update-type: version-update:semver-patch ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * updated the radeon note, as it were (#3857) * updated the radeon note, as it were * updated the note again * Set devops team as codeowners for rocm-build (#3860) * Set ext CI as codeowners for rocm-build * Update CODEOWNERS to rocm-devops * External CI: Add option to pull mainline branch for dependencies (#3689) * External CI: Add option to pull mainline branch for dependencies * Missing parameter for mainline branch dependencies. * External CI: mainline branch definitions * Removed MIGraphX optimization page (#3848) * External CI: add a global variable to control gfx942 tests (#3864) * External CI: update component default/mainline branches (#3871) * External CI: Stop building gfx90a (#3872) Save on VM resources until infrastructure has test targets. * External CI: add libstdc++-12 to rocMLIR (#3874) * Add building doc section (#3873) * External CI: programmatically get latest aqlprofile (#3876) * External CI: use ctest for rocm-examples (#3877) * External CI: Tensile pipeline (#3884) * add oversubscription conceptual doc (#3885) add mitigiation steps add to toc move page for build move doc fix spelling update doc update oversubscription update order fix spelling add oversubscription to wordlist move oversubscription topic to bottom of toc and index * add oversubscription conceptual doc (#3885) (cherry picked from commit d0ecf51b0c9202475e2abe90a45b50df0de6d7ae) * External CI: Add pipeline to build upstream boost (#3896) * Update bitsandbytes branch in docs (#3898) * Documentation: Add reference to precision-support floating-point types (#3899) * External CI: use Boost template for MIOpen (#3903) * External CI: create rocprofiler-systems pipeline (#3906) * External CI: omnitrace/rocprof-sys pipeline IDs (#3908) * External CI: MIOpen parse test results (#3913) * External CI: Use pip to install latest cmake on test system (#3915) * added a link to the 
compatibility matrix (#3904) * added a link to the compatibility matrix * removed quotes * docs: Remove invalid amd_iommu=on parameter Per kernel-parameters.txt, there is no "on" option for amd_iommu. While intel_iommu has it, amd_iommu is automatically on unless specified otherwise. For more info, see these 2 links: https://www.kernel.org/doc/Documentation/admin-guide/kernel-parameters.txt https://github.com/torvalds/linux/blob/75aa74d52f43e75d0beb20572f98529071b700e5/drivers/iommu/amd/init.c#L3481 Signed-off-by: Kent Russell * External CI: hipBLASLt build now requires python packaging module (#3926) https://github.com/ROCm/hipBLASLt/pull/1250/files#diff-fee2e6f068b33fca3a1dc49392de8848dbf05c3f4632b680abb1052523e5a30fR35 * External CI: Moved location of upstream pytorch build scripts (#3930) https://github.com/pytorch/pytorch/pull/138103 * External CI: disable rocMLIR tests (#3931) * External CI: disable rocMLIR tests * roctracer AMDGPU_TARGETS flag * External CI: create a GPU diagnostics template (#3932) * External CI: Add CK into pytorch build environment (#3934) * External CI: add support to disable individual component tests (#3938) * External CI: AMDMIGraphX greater-equal pip dependencies (#3939) * Build(deps): Bump rocm-docs-core from 1.8.2 to 1.8.3 in /docs/sphinx (#3933) Bumps [rocm-docs-core](https://github.com/ROCm/rocm-docs-core) from 1.8.2 to 1.8.3. - [Release notes](https://github.com/ROCm/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/ROCm/rocm-docs-core/compare/v1.8.2...v1.8.3) --- updated-dependencies: - dependency-name: rocm-docs-core dependency-type: direct:production update-type: version-update:semver-patch ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * External CI: rocDecode add libva-amdgpu-dev dependency (#3940) * External CI: enumerate GPUs in gpu-diagnostics (#3942) * External CI: move gpu-diag directly before tests (#3943) * External CI: fix HIP_PIPELINE_ID (#3944) --------- Signed-off-by: dependabot[bot] Signed-off-by: Kent Russell Co-authored-by: Jeffrey Novotny Co-authored-by: Sam Wu <22262939+samjwu@users.noreply.github.com> Co-authored-by: Wang, Yanyao Co-authored-by: Yanyao Wang Co-authored-by: Peter Park Co-authored-by: Young Hui - AMD <145490163+yhuiYH@users.noreply.github.com> Co-authored-by: Joseph Macaranas <145489236+amd-jmacaran@users.noreply.github.com> Co-authored-by: Daniel Su Co-authored-by: Sandra Polifroni Co-authored-by: randyh62 <42045079+randyh62@users.noreply.github.com> Co-authored-by: Michael Benavidez Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: MKKnorr Co-authored-by: Kent Russell Co-authored-by: Joseph Greathouse * 6.2.4 release notes: add known/fixed issues (#193) * add "for compute workloads" wording for clarity * add AMDSMI resolved issue * add dlm known issue intro text wording * update wording rm bullet point update wording * fix spellcheck due to spacing * rm s * rm gfx1151 * remove dlm known issue * update list of updated docs; note for Radeon users fmt * update GA date for 6.2.4 * fix rdc version * fix RDC version strings (#196) * revert outdataed change for .azuredevops * Fix 6.2.4 date in versions.md Co-authored-by: Sam Wu <22262939+samjwu@users.noreply.github.com> --------- Signed-off-by: dependabot[bot] Signed-off-by: Kent Russell Co-authored-by: Peter Park Co-authored-by: Sam Wu <22262939+samjwu@users.noreply.github.com> Co-authored-by: Jeffrey Novotny Co-authored-by: Wang, Yanyao Co-authored-by: Yanyao Wang Co-authored-by: Young Hui - AMD <145490163+yhuiYH@users.noreply.github.com> Co-authored-by: 
Joseph Macaranas <145489236+amd-jmacaran@users.noreply.github.com> Co-authored-by: Daniel Su Co-authored-by: Sandra Polifroni Co-authored-by: randyh62 <42045079+randyh62@users.noreply.github.com> Co-authored-by: Michael Benavidez Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: MKKnorr Co-authored-by: Kent Russell Co-authored-by: Joseph Greathouse * fix links in release notes 6.2.4 (#4008) * Remove extra line * Update xml files for 6.2.4 (#4012) * Update xml files for 6.2.4 * Update README with 6.2.4 * Increase visibility of programming guide * Docs: Update what is rocm description * Apply suggestions from code review Co-authored-by: randyh62 <42045079+randyh62@users.noreply.github.com> * Update docs/how-to/hip_programming_guide.rst Co-authored-by: MKKnorr * WIP * Update docs/index.md * Update docs/how-to/hip_programming_guide.rst Co-authored-by: MKKnorr * Update docs/how-to/programming_guide.rst * Update docs/what-is-rocm.rst * Apply suggestions from code review Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com> * Update docs/how-to/programming_guide.rst Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com> * Remove tip * External CI: allow test failures to present as failures on Github (#3993) * External CI: disable rdmatest and rocrtstFunc.Memory_Max_Mem (#4016) * Added 6.2.4 manifest.xml * External CI: fix comgr build (#4025) * External CI: increase Tensile test timeout to 90 mins (#4027) --------- Signed-off-by: David Galiffi Signed-off-by: dependabot[bot] Signed-off-by: Kent Russell Co-authored-by: Sam Wu <22262939+samjwu@users.noreply.github.com> Co-authored-by: Jeffrey Novotny Co-authored-by: Peter Park Co-authored-by: Yanyao Wang Co-authored-by: Wang, Yanyao Co-authored-by: David Galiffi Co-authored-by: Chris Kime Co-authored-by: ozziemoreno <109979778+ozziemoreno@users.noreply.github.com> Co-authored-by: Sandra Polifroni Co-authored-by: Young Hui - AMD 
<145490163+yhuiYH@users.noreply.github.com> Co-authored-by: Joseph Macaranas <145489236+amd-jmacaran@users.noreply.github.com> Co-authored-by: Daniel Su Co-authored-by: randyh62 <42045079+randyh62@users.noreply.github.com> Co-authored-by: JeniferC99 <150404595+JeniferC99@users.noreply.github.com> Co-authored-by: Michael Benavidez Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: MKKnorr Co-authored-by: Kent Russell Co-authored-by: Joseph Greathouse Co-authored-by: Johannes Maria Frank Co-authored-by: Brian Cornille Co-authored-by: Joseph Macaranas Co-authored-by: Pratik Basyal Co-authored-by: prbasyal Co-authored-by: Istvan Kiss Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com> Co-authored-by: Ameya Keshava Mallya --- .wordlist.txt | 8 + README.md | 2 +- RELEASE.md | 10 +- default.xml | 2 +- docs/compatibility/compatibility-matrix.rst | 4 +- docs/conf.py | 1 + .../llm-inference-frameworks.rst | 6 +- .../mi300x/vllm-benchmark.rst | 407 ++++++++++++++++++ docs/how-to/programming_guide.rst | 37 ++ docs/how-to/tuning-guides/mi300x/index.rst | 2 + docs/how-to/tuning-guides/mi300x/workload.rst | 13 +- docs/index.md | 17 +- docs/reference/gpu-arch-specs.rst | 14 +- docs/release/versions.md | 2 +- docs/sphinx/_toc.yml.in | 6 +- docs/what-is-rocm.rst | 21 +- tools/autotag/components.xml | 2 +- tools/rocm-build/ROCm.mk | 16 +- tools/rocm-build/build_amdmigraphx.sh | 3 +- tools/rocm-build/build_composable_kernel.sh | 12 +- tools/rocm-build/build_hipblas.sh | 3 +- tools/rocm-build/build_hipblaslt.sh | 3 +- tools/rocm-build/build_hipcub.sh | 3 +- tools/rocm-build/build_hipfft.sh | 3 +- tools/rocm-build/build_hiprand.sh | 1 - tools/rocm-build/build_hipsolver.sh | 3 +- tools/rocm-build/build_hipsparse.sh | 3 +- tools/rocm-build/build_hipsparselt.sh | 3 +- tools/rocm-build/build_hiptensor.sh | 4 +- tools/rocm-build/build_lightning.sh | 120 ++++-- tools/rocm-build/build_miopen-deps.sh | 2 +- 
tools/rocm-build/build_miopen-hip.sh | 5 +- tools/rocm-build/build_mivisionx.sh | 5 +- tools/rocm-build/build_omniperf.sh | 171 ++++++++ tools/rocm-build/build_omnitrace.sh | 191 ++++++++ tools/rocm-build/build_opencl_icd_loader.sh | 141 ++++++ tools/rocm-build/build_rccl.sh | 4 +- tools/rocm-build/build_rocal.sh | 71 +++ tools/rocm-build/build_rocalution.sh | 3 +- tools/rocm-build/build_rocblas.sh | 3 +- tools/rocm-build/build_rocdecode.sh | 5 +- tools/rocm-build/build_rocfft.sh | 3 +- tools/rocm-build/build_rocm-cmake.sh | 31 +- tools/rocm-build/build_rocprim.sh | 5 +- tools/rocm-build/build_rocprofiler-sdk.sh | 222 ++++++++++ tools/rocm-build/build_rocrand.sh | 4 +- tools/rocm-build/build_rocsolver.sh | 4 +- tools/rocm-build/build_rocsparse.sh | 8 +- tools/rocm-build/build_rocthrust.sh | 4 +- tools/rocm-build/build_rocwmma.sh | 4 +- tools/rocm-build/build_rpp.sh | 8 +- tools/rocm-build/compute_helper.sh | 59 +++ .../docker/ubuntu20/install-prerequisites.sh | 12 +- tools/rocm-build/docker/ubuntu20/packages | 4 + .../docker/ubuntu22/install-prerequisities.sh | 89 +++- tools/rocm-build/docker/ubuntu22/packages | 3 + tools/rocm-build/docker/ubuntu24/Dockerfile | 11 + tools/rocm-build/docker/ubuntu24/README.md | 27 ++ .../docker/ubuntu24/install-prerequisites.sh | 237 ++++++++++ tools/rocm-build/docker/ubuntu24/local-pin-60 | 3 + tools/rocm-build/docker/ubuntu24/packages | 140 ++++++ tools/rocm-build/envsetup.sh | 6 +- tools/rocm-build/rocm-6.2.0.xml | 28 +- .../{rocm-6.1.1.xml => rocm-6.2.1.xml} | 67 +-- .../{rocm-6.1.0.xml => rocm-6.2.2.xml} | 65 +-- .../{rocm-6.1.2.xml => rocm-6.2.4.xml} | 146 ++++--- 66 files changed, 2214 insertions(+), 308 deletions(-) create mode 100644 docs/how-to/performance-validation/mi300x/vllm-benchmark.rst create mode 100644 docs/how-to/programming_guide.rst create mode 100755 tools/rocm-build/build_omniperf.sh create mode 100755 tools/rocm-build/build_omnitrace.sh create mode 100755 tools/rocm-build/build_opencl_icd_loader.sh create 
mode 100755 tools/rocm-build/build_rocal.sh create mode 100755 tools/rocm-build/build_rocprofiler-sdk.sh create mode 100644 tools/rocm-build/docker/ubuntu24/Dockerfile create mode 100644 tools/rocm-build/docker/ubuntu24/README.md create mode 100644 tools/rocm-build/docker/ubuntu24/install-prerequisites.sh create mode 100644 tools/rocm-build/docker/ubuntu24/local-pin-60 create mode 100644 tools/rocm-build/docker/ubuntu24/packages rename tools/rocm-build/{rocm-6.1.1.xml => rocm-6.2.1.xml} (89%) rename tools/rocm-build/{rocm-6.1.0.xml => rocm-6.2.2.xml} (90%) rename tools/rocm-build/{rocm-6.1.2.xml => rocm-6.2.4.xml} (85%) diff --git a/.wordlist.txt b/.wordlist.txt index 580890845..2b7b7eb70 100644 --- a/.wordlist.txt +++ b/.wordlist.txt @@ -36,6 +36,7 @@ Bluefield Bootloader CCD CDNA +CHTML CIFAR CLI CLion @@ -70,6 +71,7 @@ Concretized Conda ConnectX CuPy +Dashboarding DDR DF DGEMM @@ -227,6 +229,7 @@ Mellanox's Meta's Miniconda MirroredStrategy +Mixtral Multicore Multithreaded MyEnvironment @@ -273,6 +276,7 @@ OpenSSL OpenVX OpenXLA Oversubscription +PagedAttention PCC PCI PCIe @@ -294,6 +298,7 @@ PowerShell PyPi PyTorch Qcycles +Qwen RAII RAS RCCL @@ -563,6 +568,7 @@ hipfort hipify hipsolver hipsparse +hlist hotspotting hpc hpp @@ -586,6 +592,7 @@ intra invariants invocating ipo +jax kdb kfd latencies @@ -606,6 +613,7 @@ migraphx miopen miopengemm mivisionx +mjx mkdir mlirmiopen mtypes diff --git a/README.md b/README.md index 2f42b747b..3d34f00fe 100644 --- a/README.md +++ b/README.md @@ -76,7 +76,7 @@ The Build time will reduce significantly if we limit the GPU Architecture/s agai mkdir -p ~/WORKSPACE/ # Or any folder name other than WORKSPACE cd ~/WORKSPACE/ -export ROCM_VERSION=6.2.2 # Or 6.2.0 or 6.2.1 +export ROCM_VERSION=6.2.4 # Or 6.2.0, 6.2.1, 6.2.2 ~/bin/repo init -u http://github.com/ROCm/ROCm.git -b roc-6.2.x -m tools/rocm-build/rocm-${ROCM_VERSION}.xml ~/bin/repo sync diff --git a/RELEASE.md b/RELEASE.md index 726a599d4..ea9f62b5a 100644 --- 
a/RELEASE.md +++ b/RELEASE.md @@ -32,18 +32,18 @@ ROCm documentation continues to be updated to provide clearer and more comprehen a wider variety of user needs and use cases. * Added a new GPU cluster networking guide. See - [Cluster network performance validation for AMD Instinct accelerators](https://rocm.docs.amd.com/projects/gpu-cluster-networking/en/latest/index.html). + [Cluster network performance validation for AMD Instinct accelerators](https://rocm.docs.amd.com/projects/gpu-cluster-networking/en/docs-6.2.4/index.html). This documentation provides guidelines on validating network configurations in single-node and multi-node environments to attain optimal speed and bandwidth in AMD Instinct-powered clusters. * Updated the HIP runtime documentation. - * Added a new section on how to use [HIP graphs](https://rocm.docs.amd.com/projects/HIP/en/latest/how-to/hipgraph.html). + * Added a new section on how to use [HIP graphs](https://rocm.docs.amd.com/projects/HIP/en/docs-6.2.4/how-to/hipgraph.html). - * Added a new section about the [Stream ordered memory allocator (SOMA)](https://rocm.docs.amd.com/projects/HIP/en/latest/how-to/stream_ordered_allocator.html). + * Added a new section about the [Stream ordered memory allocator (SOMA)](https://rocm.docs.amd.com/projects/HIP/en/docs-6.2.4/how-to/stream_ordered_allocator.html). - * Updated the [Porting CUDA driver API](https://rocm.docs.amd.com/projects/HIP/en/latest/how-to/hip_porting_driver_api.html) section. + * Updated the [Porting CUDA driver API](https://rocm.docs.amd.com/projects/HIP/en/docs-6.2.4/how-to/hip_porting_driver_api.html) section. * Updated the [Post-installation instructions](https://rocm.docs.amd.com/projects/install-on-linux/en/docs-6.2.4/install/post-install.html) with guidance on using the `update-alternatives` utility and environment modules to help you manage multiple ROCm @@ -56,7 +56,7 @@ a wider variety of user needs and use cases. 
## Operating system and hardware support changes ROCm 6.2.4 adds support for the [AMD Radeon PRO V710](https://www.amd.com/en/products/accelerators/radeon-pro/amd-radeon-pro-v710.html) GPU for compute workloads. See -[Supported GPUs](https://advanced-micro-devices-demo--287.com.readthedocs.build/projects/install-on-linux-internal/en/287/reference/system-requirements.html) +[Supported GPUs](https://rocm.docs.amd.com/projects/install-on-linux/en/docs-6.2.4/reference/system-requirements.html#supported-gpus) for more information. This release maintains the same operating system support as 6.2.2. diff --git a/default.xml b/default.xml index d3088683d..419b90d33 100644 --- a/default.xml +++ b/default.xml @@ -1,7 +1,7 @@ - diff --git a/docs/compatibility/compatibility-matrix.rst b/docs/compatibility/compatibility-matrix.rst index 28ef57674..55ea7e990 100644 --- a/docs/compatibility/compatibility-matrix.rst +++ b/docs/compatibility/compatibility-matrix.rst @@ -112,9 +112,7 @@ Accelerators and GPUs listed in the following table support compute workloads (n ,,, PERFORMANCE TOOLS,,, :doc:`ROCm Bandwidth Test `,1.4.0,1.4.0,1.4.0 - :doc:`ROCm Compute Profiler `,2.0.1,2.0.1,N/A - :doc:`ROCm Systems Profiler `,1.11.2,1.11.2,N/A - :doc:`ROCProfiler `,2.0.60202,2.0.60201,2.0.60100 + :doc:`ROCProfiler `,2.0.60204,2.0.60202,2.0.60100 :doc:`ROCprofiler-SDK `,0.4.0,0.4.0,N/A :doc:`ROCTracer `,4.1.60204,4.1.60202,4.1.60100 ,,, diff --git a/docs/conf.py b/docs/conf.py index 89c5ab964..727a5dee6 100644 --- a/docs/conf.py +++ b/docs/conf.py @@ -81,6 +81,7 @@ article_pages = [ "file": "how-to/llm-fine-tuning-optimization/profiling-and-debugging", "os": ["linux"], }, + {"file": "how-to/performance-validation/mi300x/vllm-benchmark", "os": ["linux"]}, {"file": "how-to/system-optimization/index", "os": ["linux"]}, {"file": "how-to/system-optimization/mi300x", "os": ["linux"]}, {"file": "how-to/system-optimization/mi200", "os": ["linux"]}, diff --git 
a/docs/how-to/llm-fine-tuning-optimization/llm-inference-frameworks.rst b/docs/how-to/llm-fine-tuning-optimization/llm-inference-frameworks.rst index 3ee672353..84e839391 100644 --- a/docs/how-to/llm-fine-tuning-optimization/llm-inference-frameworks.rst +++ b/docs/how-to/llm-fine-tuning-optimization/llm-inference-frameworks.rst @@ -16,7 +16,7 @@ This section discusses how to implement `vLLM `_ vLLM inference ============== -vLLM is renowned for its paged attention algorithm that can reduce memory consumption and increase throughput thanks to +vLLM is renowned for its PagedAttention algorithm that can reduce memory consumption and increase throughput thanks to its paging scheme. Instead of allocating GPU high-bandwidth memory (HBM) for the maximum output token lengths of the models, the paged attention of vLLM allocates GPU HBM dynamically for its actual decoding lengths. This paged attention is also effective when multiple requests share the same key and value contents for a large value of beam search or @@ -139,9 +139,7 @@ Refer to :ref:`mi300x-vllm-optimization` for performance optimization tips. ROCm provides a prebuilt optimized Docker image for validating the performance of LLM inference with vLLM on the MI300X accelerator. The Docker image includes ROCm, vLLM, PyTorch, and tuning files in the CSV -format. For more information, see the guide to -`LLM inference performance validation with vLLM on the AMD Instinct™ MI300X accelerator `_ -on the ROCm GitHub repository. +format. For more information, see :doc:`/how-to/performance-validation/mi300x/vllm-benchmark`. .. _fine-tuning-llms-tgi: diff --git a/docs/how-to/performance-validation/mi300x/vllm-benchmark.rst b/docs/how-to/performance-validation/mi300x/vllm-benchmark.rst new file mode 100644 index 000000000..90883ea84 --- /dev/null +++ b/docs/how-to/performance-validation/mi300x/vllm-benchmark.rst @@ -0,0 +1,407 @@ +.. 
meta:: + :description: Learn how to validate LLM inference performance on MI300X accelerators using AMD MAD and the unified + ROCm Docker image. + :keywords: model, MAD, automation, dashboarding, validate + +*********************************************************** +LLM inference performance validation on AMD Instinct MI300X +*********************************************************** + +.. _vllm-benchmark-unified-docker: + +The `ROCm vLLM Docker `_ image offers +a prebuilt, optimized environment designed for validating large language model +(LLM) inference performance on the AMD Instinct™ MI300X accelerator. This +ROCm vLLM Docker image integrates vLLM and PyTorch tailored specifically for the +MI300X accelerator and includes the following components: + +* `ROCm 6.2.1 `_ + +* `vLLM 0.6.4 `_ + +* `PyTorch 2.5.0 `_ + +* Tuning files (in CSV format) + +With this Docker image, you can quickly validate the expected inference +performance numbers on the MI300X accelerator. This topic also provides tips on +optimizing performance with popular AI models. + +.. hlist:: + :columns: 6 + + * Llama 3.1 8B + + * Llama 3.1 70B + + * Llama 3.1 405B + + * Llama 2 7B + + * Llama 2 70B + + * Mixtral 8x7B + + * Mixtral 8x22B + + * Mistral 7B + + * Qwen2 7B + + * Qwen2 72B + + * JAIS 13B + + * JAIS 30B + +.. _vllm-benchmark-vllm: + +.. note:: + + vLLM is a toolkit and library for LLM inference and serving. AMD implements + high-performance custom kernels and modules in vLLM to enhance performance. + See :ref:`fine-tuning-llms-vllm` and :ref:`mi300x-vllm-optimization` for + more information. + +Getting started +=============== + +Use the following procedures to reproduce the benchmark results on an +MI300X accelerator with the prebuilt vLLM Docker image. + +.. _vllm-benchmark-get-started: + +1. Disable NUMA auto-balancing. + + To optimize performance, disable automatic NUMA balancing. Otherwise, the GPU + might hang until the periodic balancing is finalized. 
For more information, + see :ref:`AMD Instinct MI300X system optimization `. + + .. code-block:: shell + + # disable automatic NUMA balancing + sh -c 'echo 0 > /proc/sys/kernel/numa_balancing' + # check if NUMA balancing is disabled (returns 0 if disabled) + cat /proc/sys/kernel/numa_balancing + 0 + +2. Download the :ref:`ROCm vLLM Docker image `. + + Use the following command to pull the Docker image from Docker Hub. + + .. code-block:: shell + + docker pull rocm/vllm:rocm6.2_mi300_ubuntu20.04_py3.9_vllm_0.6.4 + +Once setup is complete, you can choose between two options to reproduce the +benchmark results: + +- :ref:`MAD-integrated benchmarking ` + +- :ref:`Standalone benchmarking ` + +.. _vllm-benchmark-mad: + +MAD-integrated benchmarking +=========================== + +Clone the ROCm Model Automation and Dashboarding (``__) repository to a local +directory and install the required packages on the host machine. + +.. code-block:: shell + + git clone https://github.com/ROCm/MAD + cd MAD + pip install -r requirements.txt + +Use this command to run a performance benchmark test of the Llama 3.1 8B model +on one GPU with the ``float16`` data type on the host machine. + +.. code-block:: shell + + export MAD_SECRETS_HFTOKEN="your personal Hugging Face token to access gated models" + python3 tools/run_models.py --tags pyt_vllm_llama-3.1-8b --keep-model-dir --live-output --timeout 28800 + +ROCm MAD launches a Docker container with the name +``container_ci-pyt_vllm_llama-3.1-8b``. The latency and throughput reports of the +model are collected in the following path: ``~/MAD/reports_float16/``. + +Although the following models are preconfigured to collect latency and +throughput performance data, you can also change the benchmarking parameters. +Refer to the :ref:`Standalone benchmarking ` section. + +Available models +---------------- + +.. 
hlist:: + :columns: 3 + + * ``pyt_vllm_llama-3.1-8b`` + + * ``pyt_vllm_llama-3.1-70b`` + + * ``pyt_vllm_llama-3.1-405b`` + + * ``pyt_vllm_llama-2-7b`` + + * ``pyt_vllm_llama-2-70b`` + + * ``pyt_vllm_mixtral-8x7b`` + + * ``pyt_vllm_mixtral-8x22b`` + + * ``pyt_vllm_mistral-7b`` + + * ``pyt_vllm_qwen2-7b`` + + * ``pyt_vllm_qwen2-72b`` + + * ``pyt_vllm_jais-13b`` + + * ``pyt_vllm_jais-30b`` + + * ``pyt_vllm_llama-3.1-8b_fp8`` + + * ``pyt_vllm_llama-3.1-70b_fp8`` + + * ``pyt_vllm_llama-3.1-405b_fp8`` + + * ``pyt_vllm_mixtral-8x7b_fp8`` + + * ``pyt_vllm_mixtral-8x22b_fp8`` + +.. _vllm-benchmark-standalone: + +Standalone benchmarking +======================= + +You can run the vLLM benchmark tool independently by starting the +:ref:`Docker container ` as shown in the following +snippet. + +.. code-block:: + + docker pull rocm/vllm:rocm6.2_mi300_ubuntu20.04_py3.9_vllm_0.6.4 + docker run -it --device=/dev/kfd --device=/dev/dri --group-add video --shm-size 128G --security-opt seccomp=unconfined --security-opt apparmor=unconfined --cap-add=SYS_PTRACE -v $(pwd):/workspace --env HUGGINGFACE_HUB_CACHE=/workspace --name vllm_v0.6.4 rocm/vllm:rocm6.2_mi300_ubuntu20.04_py3.9_vllm_0.6.4 + +In the Docker container, clone the ROCm MAD repository and navigate to the +benchmark scripts directory at ``~/MAD/scripts/vllm``. + +.. code-block:: + + git clone https://github.com/ROCm/MAD + cd MAD/scripts/vllm + +Command +------- + +To start the benchmark, use the following command with the appropriate options. +See :ref:`Options ` for the list of +options and their descriptions. + +.. code-block:: shell + + ./vllm_benchmark_report.sh -s $test_option -m $model_repo -g $num_gpu -d $datatype + +See the :ref:`examples ` for more information. + +.. note:: + + The input sequence length, output sequence length, and tensor parallel (TP) are + already configured. You don't need to specify them with this script. + +.. 
note:: + + If you encounter the following error, set ``HF_TOKEN`` to a Hugging + Face token that is authorized to access the gated models. + + .. code-block:: shell + + OSError: You are trying to access a gated repo. + + # pass your HF_TOKEN + export HF_TOKEN=$your_personal_hf_token + +.. _vllm-benchmark-standalone-options: + +Options +------- + +.. list-table:: + :header-rows: 1 + :align: center + + * - Name + - Options + - Description + + * - ``$test_option`` + - latency + - Measure decoding token latency + + * - + - throughput + - Measure token generation throughput + + * - + - all + - Measure both throughput and latency + + * - ``$model_repo`` + - ``meta-llama/Meta-Llama-3.1-8B-Instruct`` + - Llama 3.1 8B + + * - (``float16``) + - ``meta-llama/Meta-Llama-3.1-70B-Instruct`` + - Llama 3.1 70B + + * - + - ``meta-llama/Meta-Llama-3.1-405B-Instruct`` + - Llama 3.1 405B + + * - + - ``meta-llama/Llama-2-7b-chat-hf`` + - Llama 2 7B + + * - + - ``meta-llama/Llama-2-70b-chat-hf`` + - Llama 2 70B + + * - + - ``mistralai/Mixtral-8x7B-Instruct-v0.1`` + - Mixtral 8x7B + + * - + - ``mistralai/Mixtral-8x22B-Instruct-v0.1`` + - Mixtral 8x22B + + * - + - ``mistralai/Mistral-7B-Instruct-v0.3`` + - Mistral 7B + + * - + - ``Qwen/Qwen2-7B-Instruct`` + - Qwen2 7B + + * - + - ``Qwen/Qwen2-72B-Instruct`` + - Qwen2 72B + + * - + - ``core42/jais-13b-chat`` + - JAIS 13B + + * - + - ``core42/jais-30b-chat-v3`` + - JAIS 30B + + * - ``$model_repo`` + - ``amd/Meta-Llama-3.1-8B-Instruct-FP8-KV`` + - Llama 3.1 8B + + * - (``float8``) + - ``amd/Meta-Llama-3.1-70B-Instruct-FP8-KV`` + - Llama 3.1 70B + + * - + - ``amd/Meta-Llama-3.1-405B-Instruct-FP8-KV`` + - Llama 3.1 405B + + * - + - ``amd/Mixtral-8x7B-Instruct-v0.1-FP8-KV`` + - Mixtral 8x7B + + * - + - ``amd/Mixtral-8x22B-Instruct-v0.1-FP8-KV`` + - Mixtral 8x22B + + * - ``$num_gpu`` + - 1 or 8 + - Number of GPUs + + * - ``$datatype`` + - ``float16`` or ``float8`` + - Data type + +.. 
_vllm-benchmark-run-benchmark: + +Running the benchmark on the MI300X accelerator +----------------------------------------------- + +Here are some examples of running the benchmark with various options. +See :ref:`Options ` for the list of +options and their descriptions. + +Example 1: latency benchmark +^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Use these commands to benchmark the latency of the Llama 3.1 8B model on one GPU with the ``float16`` and ``float8`` data types. + +.. code-block:: shell + + ./vllm_benchmark_report.sh -s latency -m meta-llama/Meta-Llama-3.1-8B-Instruct -g 1 -d float16 + ./vllm_benchmark_report.sh -s latency -m amd/Meta-Llama-3.1-8B-Instruct-FP8-KV -g 1 -d float8 + +Find the latency reports at: + +- ``./reports_float16/summary/Meta-Llama-3.1-8B-Instruct_latency_report.csv`` + +- ``./reports_float8/summary/Meta-Llama-3.1-8B-Instruct-FP8-KV_latency_report.csv`` + +Example 2: throughput benchmark +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Use these commands to benchmark the throughput of the Llama 3.1 8B model on one GPU with the ``float16`` and ``float8`` data types. + +.. code-block:: shell + + ./vllm_benchmark_report.sh -s throughput -m meta-llama/Meta-Llama-3.1-8B-Instruct -g 1 -d float16 + ./vllm_benchmark_report.sh -s throughput -m amd/Meta-Llama-3.1-8B-Instruct-FP8-KV -g 1 -d float8 + +Find the throughput reports at: + +- ``./reports_float16/summary/Meta-Llama-3.1-8B-Instruct_throughput_report.csv`` + +- ``./reports_float8/summary/Meta-Llama-3.1-8B-Instruct-FP8-KV_throughput_report.csv`` + +.. raw:: html + + + +.. note:: + + Throughput is calculated as: + + - .. math:: throughput\_tot = requests \times (\mathsf{\text{input lengths}} + \mathsf{\text{output lengths}}) / elapsed\_time + + - .. 
math:: throughput\_gen = requests \times \mathsf{\text{output lengths}} / elapsed\_time + +Further reading +=============== + +- For application performance optimization strategies for HPC and AI workloads, + including inference with vLLM, see :doc:`/how-to/tuning-guides/mi300x/workload`. + +- To learn more about the options for latency and throughput benchmark scripts, + see ``_. + +- To learn more about system settings and management practices to configure your system for + MI300X accelerators, see :doc:`/how-to/system-optimization/mi300x`. + +- To learn how to run LLM models from Hugging Face or your own model, see + :doc:`Using ROCm for AI `. + +- To learn how to optimize inference on LLMs, see + :doc:`Fine-tuning LLMs and inference optimization `. + +- For a list of other ready-made Docker images for ROCm, see the + :doc:`Docker image support matrix `. + +- To compare with the previous version of the ROCm vLLM Docker image for performance validation, refer to + `LLM inference performance validation on AMD Instinct MI300X (ROCm 6.2.0) `_. + diff --git a/docs/how-to/programming_guide.rst b/docs/how-to/programming_guide.rst new file mode 100644 index 000000000..8a489b20d --- /dev/null +++ b/docs/how-to/programming_guide.rst @@ -0,0 +1,37 @@ +.. meta:: + :description: Programming guide + :keywords: HIP, programming guide, heterogeneous programming, AMD GPU programming + +.. _hip-programming-guide: + +******************************************************************************** +Programming guide +******************************************************************************** + +ROCm provides a robust environment for heterogeneous programs running on CPUs +and AMD GPUs. ROCm supports various programming languages and frameworks to +help developers access the power of AMD GPUs. The natively supported programming +languages are HIP (Heterogeneous-Compute Interface for Portability) and +OpenCL, but HIP bindings are available for Python and Fortran. 
+ +HIP is an API based on C++ that provides a runtime and kernel language for GPU +programming and is the essential ROCm programming language. HIP is also designed +to be a marshalling language, allowing code written for NVIDIA CUDA to be +easily ported to run on AMD GPUs. Developers can use HIP to write kernels that +execute on AMD GPUs while maintaining compatibility with CUDA-based systems. + +OpenCL (Open Computing Language) is an open standard for cross-platform, +parallel programming of diverse processors. ROCm supports OpenCL for developers +who want to use standard frameworks across different hardware platforms, +including CPUs, GPUs, and other accelerators. For more information, see +`OpenCL `_. + +Python bindings can be found at https://github.com/ROCm/hip-python. +Python is popular in AI and machine learning applications due to available +frameworks like TensorFlow and PyTorch. + +Fortran bindings can be found at https://github.com/ROCm/hipfort. +It enables scientific, academic, and legacy applications, particularly those in +high-performance computing, to run on AMD GPUs via HIP. + +For a complete description of the HIP programming language, see the :doc:`HIP programming guide`. diff --git a/docs/how-to/tuning-guides/mi300x/index.rst b/docs/how-to/tuning-guides/mi300x/index.rst index 1947a28d1..28389f40a 100644 --- a/docs/how-to/tuning-guides/mi300x/index.rst +++ b/docs/how-to/tuning-guides/mi300x/index.rst @@ -8,6 +8,8 @@ accelerators. They include detailed instructions on system settings and application tuning suggestions to help you fully leverage the capabilities of these accelerators, thereby achieving optimal performance. 
+* :doc:`/how-to/performance-validation/mi300x/vllm-benchmark` + * :doc:`/how-to/tuning-guides/mi300x/system` * :doc:`/how-to/tuning-guides/mi300x/workload` diff --git a/docs/how-to/tuning-guides/mi300x/workload.rst b/docs/how-to/tuning-guides/mi300x/workload.rst index 9401fa0a6..66e1dfa8c 100644 --- a/docs/how-to/tuning-guides/mi300x/workload.rst +++ b/docs/how-to/tuning-guides/mi300x/workload.rst @@ -152,9 +152,7 @@ address any new bottlenecks that may emerge. ROCm provides a prebuilt optimized Docker image that has everything required to implement the tips in this section. It includes ROCm, vLLM, PyTorch, and tuning files in the CSV -format. For more information, see the guide to -`LLM inference performance validation with vLLM on the AMD Instinct™ MI300X accelerator `_ -on the ROCm GitHub repository. +format. For more information, see :doc:`/how-to/performance-validation/mi300x/vllm-benchmark`. .. _mi300x-profiling-tools: @@ -378,11 +376,10 @@ Refer to `vLLM documentation `_ -on the ROCm GitHub repository. +ROCm provides a prebuilt optimized Docker image for validating the performance +of LLM inference with vLLM on the MI300X accelerator. The Docker image includes +ROCm, vLLM, PyTorch, and tuning files in the CSV format. For more information, +see :doc:`/how-to/performance-validation/mi300x/vllm-benchmark`. Maximize throughput ------------------- diff --git a/docs/index.md b/docs/index.md index 8513180ab..e1677054c 100644 --- a/docs/index.md +++ b/docs/index.md @@ -9,16 +9,14 @@ ROCm is an open-source software platform optimized to extract HPC and AI workload performance from AMD Instinct accelerators and AMD Radeon GPUs while maintaining -compatibility with industry software frameworks. For more information, see [What is ROCm?](./what-is-rocm.rst) +compatibility with industry software frameworks. For more information, see +[What is ROCm?](./what-is-rocm.rst) -If you're using Radeon GPUs, consider reviewing {doc}`Radeon-specific ROCm documentation`. 
+ROCm supports multiple programming languages and programming interfaces such as +{doc}`HIP (Heterogeneous-Compute Interface for Portability)`, OpenCL, +and OpenMP, as explained in the [Programming guide](./how-to/programming_guide.rst). -If you're using Radeon GPUs, consider reviewing {doc}`Radeon-specific ROCm documentation`. -Installation instructions are available from: - -* {doc}`ROCm installation for Linux` -* {doc}`HIP SDK installation for Windows` -* [Deep learning frameworks installation](./how-to/deep-learning-rocm.rst) -* [Build ROCm from source](./how-to/build-rocm.rst) +If you're using AMD Radeon™ PRO or Radeon GPUs in a workstation setting with a display connected, review {doc}`Radeon-specific ROCm documentation`. ROCm documentation is organized into the following categories: @@ -41,11 +39,12 @@ ROCm documentation is organized into the following categories: :::{grid-item-card} How to :class-body: rocm-card-banner rocm-hue-12 +* [Programming guide](./how-to/programming_guide.rst) * [Using ROCm for AI](./how-to/rocm-for-ai/index.rst) * [Using ROCm for HPC](./how-to/rocm-for-hpc/index.rst) * [Fine-tuning LLMs and inference optimization](./how-to/llm-fine-tuning-optimization/index.rst) * [System optimization](./how-to/system-optimization/index.rst) -* [AMD Instinct MI300X tuning guides](./how-to/tuning-guides/mi300x/index.rst) +* [AMD Instinct MI300X performance validation and tuning](./how-to/tuning-guides/mi300x/index.rst) * [GPU cluster networking](https://rocm.docs.amd.com/projects/gpu-cluster-networking/en/latest/index.html) * [System debugging](./how-to/system-debugging.md) * [Using MPI](./how-to/gpu-enabled-mpi.rst) diff --git a/docs/reference/gpu-arch-specs.rst b/docs/reference/gpu-arch-specs.rst index 536f83f2f..dde0a2eb8 100644 --- a/docs/reference/gpu-arch-specs.rst +++ b/docs/reference/gpu-arch-specs.rst @@ -37,11 +37,11 @@ For more information about ROCm hardware compatibility, see the ROCm `Compatibil - CDNA3 - gfx942 - 192 - - 304 + - 304 (38 per XCD) - 64 - 64 - 256 - - 32 + - 32 (4 per XCD) - 32 - 16 per 2 CUs - 64 per 2 CUs 
@@ -52,11 +52,11 @@ For more information about ROCm hardware compatibility, see the ROCm `Compatibil
 - CDNA3
 - gfx942
 - 128
- - 228
+ - 228 (38 per XCD)
 - 64
 - 64
 - 256
- - 24
+ - 24 (4 per XCD)
 - 32
 - 16 per 2 CUs
 - 64 per 2 CUs
@@ -82,7 +82,7 @@ For more information about ROCm hardware compatibility, see the ROCm `Compatibil
 - CDNA2
 - gfx90a
 - 128
- - 208
+ - 208 (104 per GCD)
 - 64
 - 64
 -
@@ -788,3 +788,7 @@ scalar instructions.
 **GCD**
 Graphics Compute Die.
+
+**XCD**
+
+Accelerator Complex Die.
diff --git a/docs/release/versions.md b/docs/release/versions.md
index 3c01f5d72..e7d346057 100644
--- a/docs/release/versions.md
+++ b/docs/release/versions.md
@@ -8,7 +8,7 @@
 | Version | Release date |
 | ------- | ------------ |
-| [6.2.4](https://rocm.docs.amd.com/en/docs-6.2.4/) | October 18, 2024 |
+| [6.2.4](https://rocm.docs.amd.com/en/docs-6.2.4/) | November 6, 2024 |
 | [6.2.2](https://rocm.docs.amd.com/en/docs-6.2.2/) | September 27, 2024 |
 | [6.2.1](https://rocm.docs.amd.com/en/docs-6.2.1/) | September 20, 2024 |
 | [6.2.0](https://rocm.docs.amd.com/en/docs-6.2.0/) | August 2, 2024 |
diff --git a/docs/sphinx/_toc.yml.in b/docs/sphinx/_toc.yml.in
index 9dd8af346..5180ceb17 100644
--- a/docs/sphinx/_toc.yml.in
+++ b/docs/sphinx/_toc.yml.in
@@ -23,6 +23,8 @@ subtrees:
 - caption: How to
 entries:
+ - file: how-to/programming_guide.rst
+ title: Programming guide
 - file: how-to/rocm-for-ai/index.rst
 title: Using ROCm for AI
 subtrees:
@@ -70,9 +72,11 @@ subtrees:
 - file: how-to/system-optimization/w6000-v620.md
 title: AMD RDNA 2
 - file: how-to/tuning-guides/mi300x/index.rst
- title: AMD MI300X tuning guides
+ title: AMD MI300X performance validation and tuning
 subtrees:
 - entries:
+ - file: how-to/performance-validation/mi300x/vllm-benchmark.rst
+ title: Performance validation
 - file: how-to/tuning-guides/mi300x/system.rst
 title: System tuning
 - file: how-to/tuning-guides/mi300x/workload.rst
diff --git a/docs/what-is-rocm.rst b/docs/what-is-rocm.rst
index b95f3b153..1ee3988a3 100644
--- a/docs/what-is-rocm.rst
+++ b/docs/what-is-rocm.rst
@@ -6,24 +6,19 @@
 What is ROCm?
 ***********************************************************
-ROCm is an open-source stack, composed primarily of open-source software, designed for
-graphics processing unit (GPU) computation. ROCm consists of a collection of drivers, development
-tools, and APIs that enable GPU programming from low-level kernel to end-user applications.
+ROCm is a software stack, composed primarily of open-source software, that
+provides the tools for programming AMD Graphics Processing Units (GPUs), from
+low-level kernels to high-level end-user applications.
 .. image:: data/rocm-software-stack-6_2_0.jpg
 :width: 800
 :alt: AMD's ROCm software stack and neighboring technologies.
 :align: center
-ROCm is powered by
-:doc:`Heterogeneous-computing Interface for Portability (HIP) `;
-it supports programming models, such as OpenMP and OpenCL, and includes all necessary open
-source software compilers, debuggers, and libraries. It's fully integrated into machine learning (ML)
-frameworks, such as PyTorch and TensorFlow.
-
-.. tip::
- If you're using Radeon GPUs, refer to the
- :doc:`Radeon-specific ROCm documentation `.
+Specifically, ROCm provides the tools for
+:doc:`HIP (Heterogeneous-computing Interface for Portability) `,
+OpenCL and OpenMP. These include compilers, libraries for high-level
+functions, debuggers, profilers and runtimes.
 ROCm components
 ===============================================
@@ -150,5 +145,5 @@ Runtimes
 :header: "Component", "Description"
 ":doc:`AMD Common Language Runtime (CLR) `", "Contains source code for AMD's common language runtimes: HIP and OpenCL"
- ":doc:`HIP `", "AMD's GPU programming language extension and the GPU runtime"
+ ":doc:`HIP `", "C++ runtime API and kernel language that lets developers create portable applications for AMD and NVIDIA GPUs from single source code."
 ":doc:`ROCR-Runtime `", "User-mode API interfaces and libraries necessary for host applications to launch compute kernels on available HSA ROCm kernel agents"
diff --git a/tools/autotag/components.xml b/tools/autotag/components.xml
index a151b8628..c0eadced7 100644
--- a/tools/autotag/components.xml
+++ b/tools/autotag/components.xml
@@ -1,7 +1,7 @@
-
diff --git a/tools/rocm-build/ROCm.mk b/tools/rocm-build/ROCm.mk
index faca1661d..092d659b6 100644
--- a/tools/rocm-build/ROCm.mk
+++ b/tools/rocm-build/ROCm.mk
@@ -67,16 +67,18 @@ endef
 $(call adddep,amd_smi_lib,${ASAN_DEP})
 $(call adddep,aqlprofile,${ASAN_DEP} hsa)
-$(call adddep,clang-ocl,lightning rocm-cmake)
 $(call adddep,comgr,lightning devicelibs)
 $(call adddep,dbgapi,hsa comgr)
 $(call adddep,devicelibs,lightning)
-$(call adddep,hip_on_rocclr,${ASAN_DEP} rocclr rocprofiler-register)
+$(call adddep,hip_on_rocclr,${ASAN_DEP} hsa comgr hipcc rocprofiler-register)
 $(call adddep,hipcc,)
 $(call adddep,hipify_clang,hip_on_rocclr lightning)
 $(call adddep,hsa,${ASAN_DEP} thunk lightning devicelibs rocprofiler-register)
 $(call adddep,lightning,)
-$(call adddep,opencl_on_rocclr,${ASAN_DEP} rocclr)
+$(call adddep,omniperf,${ASAN_DEP})
+$(call adddep,omnitrace,hipcc hsa hip_on_rocclr rocm_smi_lib rocprofiler roctracer)
+$(call adddep,opencl_icd_loader,)
+$(call adddep,opencl_on_rocclr,${ASAN_DEP} hsa comgr opencl_icd_loader)
 $(call adddep,openmp_extras,thunk lightning devicelibs hsa)
 $(call adddep,rdc,${ASAN_DEP} rocm_smi_lib hsa rocprofiler)
 $(call adddep,rocclr,${ASAN_DEP} hsa comgr hipcc rocprofiler-register)
@@ -87,14 +89,15 @@ $(call adddep,rocm-core,${ASAN_DEP})
 $(call adddep,rocm-gdb,dbgapi)
 $(call adddep,rocminfo,${ASAN_DEP} hsa)
 $(call adddep,rocprofiler-register,${ASAN_DEP})
-$(call adddep,rocprofiler,${ASAN_DEP} hsa roctracer aqlprofile opencl_on_rocclr hip_on_rocclr comgr dbgapi rocm_smi_lib)
+$(call adddep,rocprofiler-sdk,${ASAN_DEP} hsa aqlprofile opencl_on_rocclr hip_on_rocclr comgr)
+$(call adddep,rocprofiler,${ASAN_DEP} hsa roctracer aqlprofile opencl_on_rocclr hip_on_rocclr comgr)
 $(call adddep,rocr_debug_agent,${ASAN_DEP} hip_on_rocclr hsa dbgapi)
 $(call adddep,roctracer,${ASAN_DEP} hsa hip_on_rocclr)
 $(call adddep,thunk,${ASAN_DEP})
 # rocm-dev points to all possible last finish components of Stage1 build.
 rocm-dev-components :=rdc hipify_clang openmp_extras \
- rocm-core amd_smi_lib hipcc clang-ocl \
+ omniperf omnitrace rocm-core amd_smi_lib hipcc \
 rocm_bandwidth_test rocr_debug_agent rocm-gdb
 $(call adddep,rocm-dev,$(filter-out ${NOBUILD},${rocm-dev-components}))
@@ -117,6 +120,7 @@ $(call adddep,mivisionx,amdmigraphx miopen-hip rpp lightning hipcc)
 $(call adddep,rccl,hip_on_rocclr hsa lightning hipcc rocm_smi_lib hipify_clang)
 $(call adddep,rocalution,rocblas rocsparse rocrand lightning hipcc)
 $(call adddep,rocblas,hip_on_rocclr openmp_extras lightning hipcc)
+$(call adddep,rocal,mivisionx)
 $(call adddep,rocdecode,hip_on_rocclr lightning hipcc)
 $(call adddep,rocfft,hip_on_rocclr rocrand hiprand lightning hipcc openmp_extras)
 $(call adddep,rocmvalidationsuite,hip_on_rocclr hsa rocblas rocm-core lightning hipcc rocm_smi_lib)
@@ -221,7 +225,7 @@ rocm-dev: T_rocm-dev
 ${OUT_DIR}/logs:
 sudo mkdir -p -m 775 "${ROCM_INSTALL_PATH}" && \
- sudo chown -R "$(shell id -u):$(shell id -g)" "${ROCM_INSTALL_PATH}"
+ sudo chown -R "$(shell id -u):$(shell id -g)" "/opt"
 sudo chown -R "$(shell id -u):$(shell id -g)" "/home/$(shell id -un)"
 mkdir -p "${@}"
 mkdir -p ${HOME}/.ccache
diff --git a/tools/rocm-build/build_amdmigraphx.sh b/tools/rocm-build/build_amdmigraphx.sh
index 9ddad2151..05a7528f5 100755
--- a/tools/rocm-build/build_amdmigraphx.sh
+++ b/tools/rocm-build/build_amdmigraphx.sh
@@ -22,12 +22,13 @@ build_amdmigraphx() {
 else
 GPU_TARGETS="gfx908;gfx90a;gfx940;gfx941;gfx942;gfx1030;gfx1100;gfx1101"
 fi
+ init_rocm_common_cmake_params
 mkdir -p ${BUILD_DIR} && rm -rf ${BUILD_DIR}/* && mkdir -p ${HOME}/amdmigraphx && rm -rf ${HOME}/amdmigraphx/*
 rbuild package -d "${HOME}/amdmigraphx" -B "${BUILD_DIR}" \
 --cxx="${ROCM_PATH}/llvm/bin/clang++" \
 --cc="${ROCM_PATH}/llvm/bin/clang" \
- $(rocm_common_cmake_params) \
+ "${rocm_math_common_cmake_params[@]}" \
 -DCMAKE_MODULE_LINKER_FLAGS="-Wl,--enable-new-dtags -Wl,--rpath,$ROCM_LIB_RPATH" \
 -DGPU_TARGETS="${GPU_TARGETS}" \
 -DCMAKE_INSTALL_RPATH=""
diff --git a/tools/rocm-build/build_composable_kernel.sh b/tools/rocm-build/build_composable_kernel.sh
index 577c2c6b2..7d1b38abc 100755
--- a/tools/rocm-build/build_composable_kernel.sh
+++ b/tools/rocm-build/build_composable_kernel.sh
@@ -17,9 +17,7 @@ build_miopen_ck() {
 mkdir "$BUILD_DIR" && cd "$BUILD_DIR"
 if [ -n "$GPU_ARCHS" ]; then
- GPU_TARGETS="$GPU_ARCHS"
- else
- GPU_TARGETS="gfx908;gfx90a;gfx940;gfx941;gfx942;gfx1030;gfx1100;gfx1101"
+ GPU_TARGETS="-DAMDGPU_TARGETS=${GPU_ARCHS}"
 fi
 if [ "${ASAN_CMAKE_PARAMS}" == "true" ] ; then
@@ -43,7 +41,7 @@ build_miopen_ck() {
 ${LAUNCHER_FLAGS} \
 -DINSTANCES_ONLY=ON \
 -DENABLE_ASAN_PACKAGING=true \
- -DAMDGPU_TARGETS=${GPU_TARGETS} \
+ "${GPU_TARGETS}" \
 "$COMPONENT_SRC"
 else
 cmake -DBUILD_DEV=OFF \
@@ -63,9 +61,11 @@ build_miopen_ck() {
 -DROCM_DISABLE_LDCONFIG=ON \
 -DROCM_PATH=${ROCM_PATH} \
 -DCPACK_GENERATOR="${PKGTYPE^^}" \
+ -DCMAKE_CXX_COMPILER="${ROCM_PATH}/llvm/bin/clang++" \
+ -DCMAKE_C_COMPILER="${ROCM_PATH}/llvm/bin/clang" \
 ${LAUNCHER_FLAGS} \
 -DINSTANCES_ONLY=ON \
- -DAMDGPU_TARGETS=${GPU_TARGETS} \
+ "${GPU_TARGETS}" \
 "$COMPONENT_SRC"
 fi
@@ -106,8 +106,6 @@ build_miopen_ckProf() {
 architectures='gfx10 gfx11 gfx90 gfx94'
 if [ -n "$GPU_ARCHS" ]; then
 architectures=$(echo ${GPU_ARCHS} | awk -F';' '{for(i=1;i<=NF;i++) a[substr($i,1,5)]} END{for(i in a) printf i" "}')
- else
- architectures='gfx10 gfx11 gfx90 gfx94'
 fi
 for arch in ${architectures}
diff --git a/tools/rocm-build/build_hipblas.sh b/tools/rocm-build/build_hipblas.sh
index 8a2b1d8bf..e09f61eb6 100755
--- a/tools/rocm-build/build_hipblas.sh
+++ b/tools/rocm-build/build_hipblas.sh
@@ -28,9 +28,10 @@ build_hipblas() {
 rebuild_lapack
 fi
+ init_rocm_common_cmake_params
 cmake \
 ${LAUNCHER_FLAGS} \
- $(rocm_common_cmake_params) \
+ "${rocm_math_common_cmake_params[@]}" \
 -DUSE_CUDA=OFF \
 -DBUILD_CLIENTS_TESTS=ON \
 -DBUILD_CLIENTS_BENCHMARKS=ON \
diff --git a/tools/rocm-build/build_hipblaslt.sh b/tools/rocm-build/build_hipblaslt.sh
index 4895adde2..f99eb7772 100755
--- a/tools/rocm-build/build_hipblaslt.sh
+++ b/tools/rocm-build/build_hipblaslt.sh
@@ -27,11 +27,12 @@ build_hipblaslt() {
 GPU_TARGETS=all
 fi
+ init_rocm_common_cmake_params
 CXX=$(set_build_variables CXX)\
 cmake \
 -DAMDGPU_TARGETS=${GPU_TARGETS} \
 ${LAUNCHER_FLAGS} \
- $(rocm_common_cmake_params) \
+ "${rocm_math_common_cmake_params[@]}" \
 -DTensile_LOGIC= \
 -DTensile_CODE_OBJECT_VERSION=default \
 -DTensile_CPU_THREADS= \
diff --git a/tools/rocm-build/build_hipcub.sh b/tools/rocm-build/build_hipcub.sh
index c559fe102..592c81364 100755
--- a/tools/rocm-build/build_hipcub.sh
+++ b/tools/rocm-build/build_hipcub.sh
@@ -17,6 +17,7 @@ build_hipcub() {
 fi
 mkdir -p "$BUILD_DIR" && cd "$BUILD_DIR"
+ init_rocm_common_cmake_params
 if [ -n "$GPU_ARCHS" ]; then
 GPU_TARGETS="$GPU_ARCHS"
@@ -27,7 +28,7 @@ build_hipcub() {
 CXX=$(set_build_variables CXX)\
 cmake \
 ${LAUNCHER_FLAGS} \
- $(rocm_common_cmake_params) \
+ "${rocm_math_common_cmake_params[@]}" \
 -DCMAKE_MODULE_PATH="${ROCM_PATH}/lib/cmake/hip;${ROCM_PATH}/hip/cmake" \
 -Drocprim_DIR="${ROCM_PATH}/rocprim" \
 -DBUILD_TEST=ON \
diff --git a/tools/rocm-build/build_hipfft.sh b/tools/rocm-build/build_hipfft.sh
index d8d077f1f..17e8c6c9e 100755
--- a/tools/rocm-build/build_hipfft.sh
+++ b/tools/rocm-build/build_hipfft.sh
@@ -16,6 +16,7 @@ build_hipfft() {
 cd $COMPONENT_SRC
 mkdir -p "$BUILD_DIR" && cd "$BUILD_DIR"
+ init_rocm_common_cmake_params
 if [ -n "$GPU_ARCHS" ]; then
 GPU_TARGETS="$GPU_ARCHS"
@@ -26,7 +27,7 @@ build_hipfft() {
 cmake \
 -DCMAKE_CXX_COMPILER=$(set_build_variables CXX) \
 ${LAUNCHER_FLAGS} \
- $(rocm_common_cmake_params) \
+ "${rocm_math_common_cmake_params[@]}" \
 -DAMDGPU_TARGETS=${GPU_TARGETS} \
 -DCMAKE_MODULE_PATH="${ROCM_PATH}/lib/cmake/hip" \
 -DCMAKE_SKIP_BUILD_RPATH=TRUE \
diff --git a/tools/rocm-build/build_hiprand.sh b/tools/rocm-build/build_hiprand.sh
index 293b05b6c..a9cdcbb53 100755
--- a/tools/rocm-build/build_hiprand.sh
+++ b/tools/rocm-build/build_hiprand.sh
@@ -61,7 +61,6 @@ build_hiprand() {
 rm -rf _CPack_Packages/ && find -name '*.o' -delete
 mkdir -p $PACKAGE_DIR && cp ${BUILD_DIR}/*.${PKGTYPE} $PACKAGE_DIR
- $SCCACHE_BIN -s || echo "Unable to display sccache stats"
 }
 clean_hiprand() {
diff --git a/tools/rocm-build/build_hipsolver.sh b/tools/rocm-build/build_hipsolver.sh
index ded0279df..22c778832 100755
--- a/tools/rocm-build/build_hipsolver.sh
+++ b/tools/rocm-build/build_hipsolver.sh
@@ -27,10 +27,11 @@ build_hipsolver() {
 rebuild_lapack
 fi
+ init_rocm_common_cmake_params
 cmake \
 -DUSE_CUDA=OFF \
 ${LAUNCHER_FLAGS} \
- $(rocm_common_cmake_params) \
+ "${rocm_math_common_cmake_params[@]}" \
 -DBUILD_CLIENTS_TESTS=ON \
 -DBUILD_CLIENTS_BENCHMARKS=ON \
 -DBUILD_CLIENTS_SAMPLES=ON \
diff --git a/tools/rocm-build/build_hipsparse.sh b/tools/rocm-build/build_hipsparse.sh
index 727421635..10171905c 100755
--- a/tools/rocm-build/build_hipsparse.sh
+++ b/tools/rocm-build/build_hipsparse.sh
@@ -22,11 +22,12 @@ build_hipsparse() {
 echo "CXX compiler: $CXX"
 mkdir -p "$BUILD_DIR" && cd "$BUILD_DIR"
+ init_rocm_common_cmake_params
 cmake \
 -DCPACK_SET_DESTDIR=OFF \
 ${LAUNCHER_FLAGS} \
- $(rocm_common_cmake_params) \
+ "${rocm_math_common_cmake_params[@]}" \
 -DUSE_CUDA=OFF \
 -DBUILD_CLIENTS_SAMPLES=ON \
 -DBUILD_CLIENTS_TESTS=ON \
diff --git a/tools/rocm-build/build_hipsparselt.sh b/tools/rocm-build/build_hipsparselt.sh
index 99b70edd0..dddff1437 100755
--- a/tools/rocm-build/build_hipsparselt.sh
+++ b/tools/rocm-build/build_hipsparselt.sh
@@ -28,6 +28,7 @@ build_hipsparselt() {
 cd $COMPONENT_SRC
 mkdir -p "$BUILD_DIR" && cd "$BUILD_DIR"
+ init_rocm_common_cmake_params
 if [ -n "$GPU_ARCHS" ]; then
 GPU_TARGETS="$GPU_ARCHS"
@@ -41,7 +42,7 @@ build_hipsparselt() {
 cmake \
 -DAMDGPU_TARGETS=${GPU_TARGETS} \
 ${LAUNCHER_FLAGS} \
- $(rocm_common_cmake_params) \
+ "${rocm_math_common_cmake_params[@]}" \
 -DTensile_LOGIC= \
 -DTensile_CODE_OBJECT_VERSION=default \
 -DTensile_CPU_THREADS= \
diff --git a/tools/rocm-build/build_hiptensor.sh b/tools/rocm-build/build_hiptensor.sh
index c239caab9..04346797b 100755
--- a/tools/rocm-build/build_hiptensor.sh
+++ b/tools/rocm-build/build_hiptensor.sh
@@ -16,6 +16,8 @@ build_hiptensor() {
 cd "$COMPONENT_SRC"
 mkdir -p "$BUILD_DIR" && cd "$BUILD_DIR"
+ init_rocm_common_cmake_params
+
 if [ -n "$GPU_ARCHS" ]; then
 GPU_TARGETS="$GPU_ARCHS"
@@ -25,7 +27,7 @@ build_hiptensor() {
 cmake \
 -B "${BUILD_DIR}" \
- $(rocm_common_cmake_params) \
+ "${rocm_math_common_cmake_params[@]}" \
 $(set_build_variables CMAKE_C_CXX) \
 -DAMDGPU_TARGETS=${GPU_TARGETS} \
 ${LAUNCHER_FLAGS} \
diff --git a/tools/rocm-build/build_lightning.sh b/tools/rocm-build/build_lightning.sh
index dbf9fae22..ed4ba6bc2 100755
--- a/tools/rocm-build/build_lightning.sh
+++ b/tools/rocm-build/build_lightning.sh
@@ -17,7 +17,8 @@ printUsage() {
 echo " -r, --release Build a release version of the package"
 echo " -a, --address_sanitizer Enable address sanitizer (enabled by default)"
 echo " -A --no_address_sanitizer Disable address sanitizer"
- echo " -s, --static Supports static CI by accepting this param & not bailing out. No effect of the param though"
+ echo " -s, --static Build static lib (.a). build instead of dynamic/shared(.so) "
+ echo " -w, --wheel Creates python wheel package of rocm-llvm. It needs to be used along with -r option"
 echo " -l, --build_llvm_static Build LLVM libraries statically linked. Default is to build dynamic linked libs"
 echo " -o, --outdir Print path of output directory containing packages of type referred to by pkg_type"
@@ -42,6 +43,7 @@ DEB_PATH="$(getDebPath lightning)"
 RPM_PATH="$(getRpmPath lightning)"
 INSTALL_PATH="${ROCM_INSTALL_PATH}/lib/llvm"
 LLVM_ROOT_LCL="${LLVM_ROOT}"
+ROCM_WHEEL_DIR="${BUILD_PATH}/_wheel"
 TARGET="all"
 MAKEOPTS="$DASH_JAY"
@@ -69,14 +71,29 @@ ASSERT_LLVM_VERSION_MINOR=""
 SKIP_LIT_TESTS=0
 BUILD_MANPAGES="ON"
+STATIC_FLAG=
 SANITIZER_AMDGPU=1
 HSA_INC_PATH="$WORK_ROOT/ROCR-Runtime/src/inc"
 COMGR_INC_PATH="$WORK_ROOT/llvm-project/amd/comgr/include"
-VALID_STR=`getopt -o htcV:v:draAslo:BPNM --long help,alt,clean,assert_llvm_ver_major:,assert_llvm_ver_minor:,debug,release,address_sanitizer,no_address_sanitizer,static,build_llvm_static,build,package,skip_lit_tests,skip_man_pages,outdir: -- "$@"`
+VALID_STR=`getopt -o htcV:v:draAswlo:BPNM --long help,alt,clean,assert_llvm_ver_major:,assert_llvm_ver_minor:,debug,release,address_sanitizer,no_address_sanitizer,static,build_llvm_static,wheel,build,package,skip_lit_tests,skip_man_pages,outdir: -- "$@"`
 eval set -- "$VALID_STR"
+set_dwarf_version(){
+ case "$DISTRO_ID" in
+ (sles*|rhel*)
+ SET_DWARF_VERSION_4="-gdwarf-4"
+ ;;
+ (*)
+ SET_DWARF_VERSION_4=""
+ ;;
+ esac
+ export CFLAGS="$CFLAGS $SET_DWARF_VERSION_4 "
+ export CXXFLAGS="$CXXFLAGS $SET_DWARF_VERSION_4 "
+ export ASMFLAGS="$ASMFLAGS $SET_DWARF_VERSION_4 "
+}
+
 while true ;
 do
 case "$1" in
@@ -95,6 +112,7 @@ do
 (-r | --release)
 BUILD_TYPE="Release" ; shift ;;
 (-a | --address_sanitizer)
+ set_dwarf_version
 SANITIZER_AMDGPU=1 ;
 HSA_INC_PATH="$WORK_ROOT/hsa/runtime/opensrc/hsa-runtime/inc" ;
 COMGR_INC_PATH="$WORK_ROOT/external/llvm-project/amd/comgr/include" ; shift ;;
@@ -103,9 +121,12 @@ do
 unset HSA_INC_PATH ;
 unset COMGR_INC_PATH ; shift ;;
 (-s | --static)
- SHARED_LIBS="OFF" ; shift ;;
+ SHARED_LIBS="OFF" ;
+ STATIC_FLAG="-DBUILD_SHARED_LIBS=$SHARED_LIBS" ; shift ;;
 (-l | --build_llvm_static)
 BUILD_LLVM_DYLIB="OFF"; shift ;;
+ (-w | --wheel)
+ WHEEL_PACKAGE=true ; shift ;;
 (-o | --outdir)
 TARGET="outdir"; PKGTYPE=$2 ; OUT_DIR_SPECIFIED=1 ; ((CLEAN_OR_OUT|=2)) ; shift 2 ;;
 (-B | --build)
@@ -151,6 +172,7 @@ else
 fi
 clean_lightning() {
+ rm -rf "$ROCM_WHEEL_DIR"
 rm -rf "$BUILD_PATH"
 rm -rf "$DEB_PATH"
 rm -rf "$RPM_PATH"
@@ -196,7 +218,10 @@ LLVM_VERSION_MINOR=""
 LLVM_VERSION_PATCH=""
 LLVM_VERSION_SUFFIX=""
 get_llvm_version() {
- local LLVM_VERSIONS=($(awk '/set\(LLVM_VERSION/ {print substr($2,1,length($2)-1)}' ${LLVM_ROOT_LCL}/CMakeLists.txt))
+ local LLVM_VERSIONS=($(awk '/set\(LLVM_VERSION/ {print substr($2,1,length($2)-1)}' ${LLVM_ROOT_LCL}/../cmake/Modules/LLVMVersion.cmake))
+ if [ ${#LLVM_VERSIONS[@]} -eq 0 ]; then
+ LLVM_VERSIONS=($(awk '/set\(LLVM_VERSION/ {print substr($2,1,length($2)-1)}' ${LLVM_ROOT_LCL}/CMakeLists.txt))
+ fi
 LLVM_VERSION_MAJOR=${LLVM_VERSIONS[0]}
 LLVM_VERSION_MINOR=${LLVM_VERSIONS[1]}
 LLVM_VERSION_PATCH=${LLVM_VERSIONS[2]}
@@ -260,15 +285,22 @@ build_lightning() {
 if [ ! -e Makefile ]; then
 echo "Building LLVM CMake environment"
- if [ -e "$LLVM_ROOT_LCL/../flang/docs/AssumedRank.md" ]; then
- FLANG_NEW=1
- LLVM_PROJECTS="$LLVM_PROJECTS;flang;mlir"
- else
- echo "NOT building project flang"
- fi
+ if [ -e "$LLVM_ROOT_LCL/../flang/AFARrelease" ]; then
+ FLANG_NEW=1
+ LLVM_PROJECTS="$LLVM_PROJECTS;flang;mlir"
+ ENABLE_RUNTIMES="$ENABLE_RUNTIMES;openmp";
+ else
+ if [[ "${JOB_NAME}" != *afar* ]] && [ -e "$LLVM_ROOT_LCL/../flang/DoROCmRelease" ]; then
+ FLANG_NEW=1
+ LLVM_PROJECTS="$LLVM_PROJECTS;flang;mlir"
+ else
+ echo "NOT building project flang"
+ fi
+ fi
 set -x
 cmake $(rocm_cmake_params) ${GEN_NINJA} \
+ ${STATIC_FLAG} \
 -DCMAKE_INSTALL_PREFIX="$INSTALL_PATH" \
 -DLLVM_TARGETS_TO_BUILD="AMDGPU;X86" \
 -DLLVM_ENABLE_PROJECTS="$LLVM_PROJECTS" \
@@ -307,9 +339,9 @@ build_lightning() {
 -DCMAKE_SHARED_LINKER_FLAGS=-Wl,--enable-new-dtags,--build-id=sha1,--rpath,$ROCM_LLVM_LIB_RPATH \
 -DROCM_LLVM_BACKWARD_COMPAT_LINK="$ROCM_INSTALL_PATH/llvm" \
 -DROCM_LLVM_BACKWARD_COMPAT_LINK_TARGET="./lib/llvm" \
- -DCLANG_LINK_FLANG_LEGACY=ON \
- -DCMAKE_CXX_STANDARD=17 \
- -DFLANG_INCLUDE_DOCS=OFF \
+ -DCLANG_LINK_FLANG_LEGACY=ON \
+ -DCMAKE_CXX_STANDARD=17 \
+ -DFLANG_INCLUDE_DOCS=OFF \
 "$LLVM_ROOT_LCL"
 set +x
 echo "CMake complete"
@@ -326,8 +358,23 @@ build_lightning() {
 echo "End Workaround for race condition"
 cmake --build . -- $MAKEOPTS
+ case "$DISTRO_ID" in
+ (rhel*|centos*)
+ RHEL_BUILD=1
+ ;;
+ (*)
+ RHEL_BUILD=0
+ ;;
+ esac
+
 if [ $SKIP_LIT_TESTS -eq 0 ]; then
- if [ "$DISTRO_NAME" != "sles" ] && [ $BUILD_ALT != 1 ]; then
+ if [ $RHEL_BUILD -eq 1 ] && [ $BUILD_ALT != 1 ]; then
+ if [ $FLANG_NEW -eq 1 ]; then
+ cmake --build . -- $MAKEOPTS check-lld check-mlir
+ else
+ cmake --build . -- $MAKEOPTS check-lld
+ fi
+ elif [ "$DISTRO_NAME" != "sles" ] && [ $BUILD_ALT != 1 ]; then
 if [ $FLANG_NEW -eq 1 ]; then
 cmake --build . -- $MAKEOPTS check-llvm check-clang check-lld check-mlir
 else
@@ -733,7 +780,7 @@ package_lightning_static() {
 local amd_compiler_commands=("amdclang" "amdclang++" "amdclang-cl" "amdclang-cpp" "amdflang" "amdlld" "offload-arch")
 local amd_man_pages=("amdclang.1.gz" "flang.1.gz" "amdflang.1.gz")
 local core_bin=("amdgpu-arch" "amdgpu-offload-arch" "amdlld" "amdllvm" "clang" "clang++" "clang-${LLVM_VERSION_MAJOR}" "clang-cl"
- "clang-cpp" "clang-build-select-link" "clang-offload-bundler" "clang-offload-packager" "clang-offload-wrapper" "flang" "flang-new"
+ "clang-cpp" "clang-build-select-link" "clang-offload-bundler" "clang-offload-packager" "clang-offload-wrapper" "clang-linker-wrapper" "clang-nvlink-wrapper" "flang" "flang-new"
 "ld64.lld" "ld.lld" "llc" "lld" "lld-link" "llvm-ar" "llvm-bitcode-strip" "llvm-dwarfdump" "llvm-install-name-tool" "llvm-link" "llvm-mc" "llvm-objcopy" "llvm-objdump" "llvm-otool" "llvm-ranlib" "llvm-readelf" "llvm-readobj" "llvm-strip" "nvidia-arch" "nvptx-arch" "offload-arch" "opt" "wasm-ld" "amdclang" "amdclang++" "amdclang-${LLVM_VERSION_MAJOR}" "amdclang-cl"
@@ -934,7 +981,7 @@ package_lightning_static() {
 if [ $BUILD_ALT -eq 0 ]; then
 echo "cp -R $LLVM_ROOT_LCL/LICENSE.TXT \$RPM_BUILD_ROOT/$licenseDir" >> $specFile
- echo "cp -P $backwardsCompatibleSymlink \$RPM_BUILD_ROOT/$ROCM_INSTALL_PATH" >> $specFile
+ echo "cp -P $backwardsCompatibleSymlink \$RPM_BUILD_ROOT/$ROCM_INSTALL_PATH" >> $specFile
 else
 echo "cp -R $LLVM_PROJECT_ALT_ROOT/EULA \$RPM_BUILD_ROOT/$licenseDir" >> $specFile
 echo "cp -R $LLVM_PROJECT_ALT_ROOT/DISCLAIMER.txt \$RPM_BUILD_ROOT/$licenseDir" >> $specFile
@@ -948,7 +995,6 @@ package_lightning_static() {
 echo "cp -d \"$distBin/flang\" \$RPM_BUILD_ROOT/$installPath/bin/" >> $specFile
- # Copy the config files
 if [ $BUILD_ALT -eq 0 ]; then
 echo "cp -d \"$distBin\"/*.cfg \$RPM_BUILD_ROOT/$installPath/bin/" >> $specFile
 fi
@@ -970,14 +1016,12 @@ package_lightning_static() {
 if [ "$BUILD_MANPAGES" == "ON" ]; then
 if [ $BUILD_ALT -eq 0 ]; then
 echo "mkdir -p \$RPM_BUILD_ROOT/$installPath/share/man/man1" >> $specFile
-
 for i in "${core_man_pages[@]}"; do
 if [ -f "$distMan/man1/$i" ]; then
 echo "gzip -f $distMan/man1/$i" >> $specFile
 echo "cp -d $distMan/man1/${i}.gz \$RPM_BUILD_ROOT/$installPath/share/man/man1/" >> $specFile
 fi
 done
-
 if [ -f "$distMan/man1/clang.1.gz" ]; then
 for i in "${amd_man_pages[@]}"; do
 echo "ln -sf clang.1.gz \"$distMan/man1/$i\"" >> $specFile
@@ -1064,7 +1108,6 @@ package_lightning_static() {
 contains "$bin" "${core_bin[@]}" "${amd_compiler_commands[@]}" && continue
 echo "cp -d \"$i\" \$RPM_BUILD_ROOT/$installPath/bin/" >> $specFileExtra
 done
-
 for i in "$distLib"/*; do
 lib=$(basename "$i")
 contains "$lib" "${core_lib[@]}" && continue
@@ -1072,18 +1115,15 @@ package_lightning_static() {
 done
 echo "cp -R $distInc \$RPM_BUILD_ROOT/$installPath" >> $specFileExtra
-
 echo "rm -rf \$RPM_BUILD_ROOT/$installPath/lib/clang" >> $specFileExtra
 if [ $FLANG_NEW -eq 1 ]; then
-
 echo "rm -rf \$RPM_BUILD_ROOT/$installPath/include/flang" >> $specFileExtra
 fi
 if [ "$BUILD_MANPAGES" == "ON" ]; then
 if [ $BUILD_ALT -eq 0 ]; then
 echo "mkdir -p \$RPM_BUILD_ROOT/$installPath/share/man/man1" >> $specFileExtra
-
 for i in "${extra_man_pages[@]}"; do
 if [ -f "$distMan/man1/$i" ]; then
 echo "gzip -f $distMan/man1/$i" >> $specFileExtra
@@ -1125,34 +1165,34 @@ package_docs() {
 local packageName="rocm-llvm-docs"
 local packageSummary="ROCm LLVM compiler documentation"
 local packageSummaryLong="Documenation for LLVM $llvmParsedVersion"
- local installPath="$ROCM_INSTALL_PATH/lib/llvm/share"
 local packageArch="amd64"
 local packageVersion="${llvmParsedVersion}.${LLVM_COMMIT_GITDATE}"
 local packageMaintainer="ROCm Compiler Support "
- local distDoc="$INSTALL_PATH/share/doc"
+ local distDoc="$INSTALL_PATH/share/doc/LLVM"
 local licenseDir="$ROCM_INSTALL_PATH/share/doc/$packageName"
 local packageDir="$BUILD_PATH/package"
 local packageDeb="$packageDir/deb"
 local controlFile="$packageDeb/DEBIAN/control"
+ local debDependencies="rocm-core"
 local packageRpm="$packageDir/rpm"
 local specFile="$packageDir/$packageName.spec"
+ local rpmRequires="rocm-core"
 rm -rf "$packageDir"
 echo "rm -rf $packageDir"
 if [ "$PACKAGEEXT" = "deb" ]; then
- mkdir -p "$packageDeb/$installPath"
- mkdir "${controlFile%/*}"
 mkdir -p "$packageDeb/$licenseDir"
+ mkdir "${controlFile%/*}"
 cp -r "$LLVM_ROOT_LCL/LICENSE.TXT" "$packageDeb/$licenseDir"
- cp -r "$distDoc" "$packageDeb/$installPath/doc"
+ cp -r "$distDoc" "$packageDeb/$licenseDir"
 {
 echo "Package: $packageName"
@@ -1162,6 +1202,7 @@ package_docs() {
 echo "Maintainer: $packageMaintainer"
 echo "Version: ${packageVersion}.${ROCM_LIBPATCH_VERSION}-${JOB_DESIGNATOR}${BUILD_ID}~${DISTRO_RELEASE}"
 echo "Release: ${JOB_DESIGNATOR}${BUILD_ID}~${DISTRO_RELEASE}"
+ echo "Depends: $debDependencies"
 echo "Recommends: $debRecommends"
 echo "Description: $packageSummary"
 echo " $packageSummaryLong"
@@ -1182,6 +1223,7 @@ package_docs() {
 echo "Summary: $packageSummary" >> $specFile
 echo "Group: System Environment/Libraries" >> $specFile
 echo "License: ASL 2.0 with exceptions" >> $specFile
+ echo "Requires: $rpmRequires" >> $specFile
 echo "%description" >> $specFile
 echo "$packageSummaryLong" >> $specFile
@@ -1190,16 +1232,13 @@ package_docs() {
 echo "%setup -T -D -c -n $packageName" >> $specFile
 echo "%install" >> $specFile
- echo "rm -rf \$RPM_BUILD_ROOT/$installPath" >> $specFile
- echo "mkdir -p \$RPM_BUILD_ROOT/$installPath/doc" >> $specFile
 echo "mkdir -p \$RPM_BUILD_ROOT/$licenseDir" >> $specFile
 echo "cp -R $LLVM_ROOT_LCL/LICENSE.TXT \$RPM_BUILD_ROOT/$licenseDir" >> $specFile
- echo "cp -R \"$distDoc\" \$RPM_BUILD_ROOT/$installPath" >> $specFile
+ echo "cp -R \"$distDoc\" \$RPM_BUILD_ROOT/$licenseDir" >> $specFile
 echo "%clean" >> $specFile
- echo "rm -rf \$RPM_BUILD_ROOT/$installPath" >> $specFile
 echo "%files " >> $specFile
 echo "%defattr(-,root,root,-)" >> $specFile
@@ -1232,6 +1271,18 @@ build() {
 fi
 }
+create_wheel_package() {
+ echo "Creating rocm-llvm wheel package"
+ mkdir -p "$ROCM_WHEEL_DIR"
+ cp -f $SCRIPT_ROOT/generate_setup_py.py $ROCM_WHEEL_DIR
+ cp -f $SCRIPT_ROOT/repackage_wheel.sh $ROCM_WHEEL_DIR
+ cd $ROCM_WHEEL_DIR
+ # Currently only supports python3.6
+ ./repackage_wheel.sh $RPM_PATH/rocm-llvm*.rpm python3.6
+ # Copy the wheel created to RPM folder which will be uploaded to artifactory
+ mv "$ROCM_WHEEL_DIR"/dist/*.whl "$RPM_PATH"
+}
+
 case $TARGET in
 (clean) clean_lightning ;;
 (all)
@@ -1250,4 +1301,9 @@ case $TARGET in
 (*) die "Invalid target $TARGET" ;;
 esac
+if [[ $WHEEL_PACKAGE == true ]]; then
+ echo "Wheel Package build started !!!!"
+ create_wheel_package
+fi
+
 echo "Operation complete"
diff --git a/tools/rocm-build/build_miopen-deps.sh b/tools/rocm-build/build_miopen-deps.sh
index 48910fac0..51bd23f1a 100755
--- a/tools/rocm-build/build_miopen-deps.sh
+++ b/tools/rocm-build/build_miopen-deps.sh
@@ -44,7 +44,7 @@ build_miopen_deps() {
 cd "$COMPONENT_SRC"
 # Commenting the rocMLIR & composable_kernel from requirements.txt
- sed -i '/ROCmSoftwarePlatform\/rocMLIR@\|ROCmSoftwarePlatform\/composable_kernel@/s/^/#/' requirements.txt
+ sed -i '/ROCm\/rocMLIR@\|ROCm\/composable_kernel@/s/^/#/' requirements.txt
 # Extract MLIR commit from requirements.txt
 MLIR_COMMIT="$(awk '/rocMLIR/ {split($1, s, "@"); print s[2]}' requirements.txt)"
diff --git a/tools/rocm-build/build_miopen-hip.sh b/tools/rocm-build/build_miopen-hip.sh
index e4267b96d..b07c0f484 100755
--- a/tools/rocm-build/build_miopen-hip.sh
+++ b/tools/rocm-build/build_miopen-hip.sh
@@ -13,7 +13,7 @@ build_miopen_hip() {
 echo "Start build"
 cd $COMPONENT_SRC
-
+ git config --global --add safe.directory "$COMPONENT_SRC"
 checkout_lfs
 if [ "${ENABLE_ADDRESS_SANITIZER}" == "true" ]; then
@@ -22,8 +22,9 @@ build_miopen_hip() {
 fi
 mkdir "$BUILD_DIR" && cd "$BUILD_DIR"
+ init_rocm_common_cmake_params
 cmake \
- $(rocm_common_cmake_params) \
+ "${rocm_math_common_cmake_params[@]}" \
 -DMIOPEN_BACKEND=HIP \
 -DCMAKE_CXX_COMPILER="${ROCM_PATH}/llvm/bin/clang++" \
 -DCMAKE_C_COMPILER="${ROCM_PATH}/llvm/bin/clang" \
diff --git a/tools/rocm-build/build_mivisionx.sh b/tools/rocm-build/build_mivisionx.sh
index 7bfe40dc3..ae37444b5 100755
--- a/tools/rocm-build/build_mivisionx.sh
+++ b/tools/rocm-build/build_mivisionx.sh
@@ -16,6 +16,8 @@ build_mivisionx() {
 BUILD_DEV=OFF
 fi
+ init_rocm_common_cmake_params
+
 if [ -n "$GPU_ARCHS" ]; then
 GPU_TARGETS="$GPU_ARCHS"
 else
@@ -23,7 +25,7 @@ build_mivisionx() {
 fi
 cmake \
- $(rocm_common_cmake_params) \
+ "${rocm_math_common_cmake_params[@]}" \
 -DROCM_PATH="$ROCM_PATH" \
 -DBUILD_DEV=$BUILD_DEV \
 -DCMAKE_INSTALL_LIBDIR=$(getInstallLibDir) \
@@ -34,6 +36,7 @@ build_mivisionx() {
 "$COMPONENT_SRC"
 cmake --build "$BUILD_DIR" -- -j${PROC}
+ cmake --build "$BUILD_DIR" -- install
 cpack -G ${PKGTYPE^^}
 rm -rf _CPack_Packages/ && find -name '*.o' -delete
diff --git a/tools/rocm-build/build_omniperf.sh b/tools/rocm-build/build_omniperf.sh
new file mode 100755
index 000000000..78ac7ab9a
--- /dev/null
+++ b/tools/rocm-build/build_omniperf.sh
@@ -0,0 +1,171 @@
+#!/bin/bash
+
+source "$(dirname "${BASH_SOURCE}")/compute_utils.sh"
+
+printUsage() {
+ echo
+ echo "Usage: ${BASH_SOURCE##*/} [options ...]"
+ echo
+ echo "Options:"
+ echo " -c, --clean Clean output and delete all intermediate work"
+ echo " -s, --static Build static lib (.a). build instead of dynamic/shared(.so) "
+ echo " -p, --package Specify packaging format"
+ echo " -r, --release Make a release build instead of a debug build"
+ echo " -a, --address_sanitizer Enable address sanitizer"
+ echo " -o, --outdir Print path of output directory containing packages of
+ type referred to by pkg_type"
+ echo " -w, --wheel Creates python wheel package of omniperf.
+ It needs to be used along with -r option"
+ echo " -h, --help Prints this help"
+ echo
+ echo "Possible values for :"
+ echo " deb -> Debian format (default)"
+ echo " rpm -> RPM format"
+ echo
+
+ return 0
+}
+
+API_NAME="omniperf"
+PROJ_NAME="$API_NAME"
+LIB_NAME="lib${API_NAME}"
+TARGET="build"
+MAKETARGET="deb"
+PACKAGE_ROOT="$(getPackageRoot)"
+PACKAGE_LIB="$(getLibPath)"
+BUILD_DIR="$(getBuildPath $API_NAME)"
+PACKAGE_DEB="$(getPackageRoot)/deb/$API_NAME"
+PACKAGE_RPM="$(getPackageRoot)/rpm/$API_NAME"
+ROCM_WHEEL_DIR="${BUILD_DIR}/_wheel"
+BUILD_TYPE="Debug"
+MAKE_OPTS="$DASH_JAY -C $BUILD_DIR"
+SHARED_LIBS="ON"
+CLEAN_OR_OUT=0;
+MAKETARGET="deb"
+PKGTYPE="deb"
+WHEEL_PACKAGE=false
+
+
+#parse the arguments
+VALID_STR=$(getopt -o hcraso:p:w --long help,clean,release,static,address_sanitizer,outdir:,package:,wheel -- "$@")
+eval set -- "$VALID_STR"
+
+while true ;
+do
+ case "$1" in
+ -h | --help)
+ printUsage ; exit 0;;
+ -c | --clean)
+ TARGET="clean" ; ((CLEAN_OR_OUT|=1)) ; shift ;;
+ -r | --release)
+ BUILD_TYPE="Release" ; shift ;;
+ -a | --address_sanitizer)
+ set_asan_env_vars
+ set_address_sanitizer_on ; shift ;;
+ -s | --static)
+ SHARED_LIBS="OFF" ; shift ;;
+ -o | --outdir)
+ TARGET="outdir"; PKGTYPE=$2 ; OUT_DIR_SPECIFIED=1 ; ((CLEAN_OR_OUT|=2)) ; shift 2 ;;
+ -p | --package)
+ MAKETARGET="$2" ; shift 2 ;;
+ -w | --wheel)
+ WHEEL_PACKAGE=true ; shift ;;
+ --) shift; break;; # end delimiter
+ *)
+ echo " This should never come but just incase : UNEXPECTED ERROR Parm : [$1] ">&2 ; exit 20;;
+ esac
+
+done
+
+RET_CONFLICT=1
+check_conflicting_options "$CLEAN_OR_OUT" "$PKGTYPE" "$MAKETARGET"
+if [ $RET_CONFLICT -ge 30 ]; then
+ print_vars "$API_NAME" "$TARGET" "$BUILD_TYPE" "$SHARED_LIBS" "$CLEAN_OR_OUT" "$PKGTYPE" "$MAKETARGET"
+ exit $RET_CONFLICT
+fi
+
+clean() {
+ echo "Cleaning $PROJ_NAME"
+ rm -rf "$ROCM_WHEEL_DIR"
+ rm -rf "$BUILD_DIR"
+ rm -rf "$PACKAGE_DEB"
+ rm -rf "$PACKAGE_RPM"
+ rm -rf "$PACKAGE_ROOT/${PROJ_NAME:?}"
+ rm -rf "$PACKAGE_LIB/${LIB_NAME:?}"*
+}
+
+build() {
+ echo "Building $PROJ_NAME"
+ if [ "$DISTRO_ID" = centos-7 ]; then
+ echo "Skip make and uploading packages for Omniperf on Centos7 distro, due to python dependency"
+ exit 0
+ fi
+
+ if [ ! -d "$BUILD_DIR" ]; then
+ mkdir -p "$BUILD_DIR"
+ pushd "$BUILD_DIR" || exit
+
+ echo "ROCm CMake Params: $(rocm_cmake_params)"
+ echo "ROCm Common CMake Params: $(rocm_common_cmake_params)"
+
+ print_lib_type $SHARED_LIBS
+ cmake \
+ $(rocm_cmake_params) \
+ $(rocm_common_cmake_params) \
+ -DCHECK_PYTHON_DEPS=NO \
+ -DPYTHON_DEPS=${BUILD_DIR}/python-libs \
+ -DMOD_INSTALL_PATH=${BUILD_DIR}/modulefiles \
+ "$OMNIPERF_ROOT"
+ fi
+
+ make $MAKE_OPTS
+ make $MAKE_OPTS install
+ make $MAKE_OPTS package
+
+ copy_if DEB "${CPACKGEN:-"DEB;RPM"}" "$PACKAGE_DEB" "$BUILD_DIR/${API_NAME}"*.deb
+ copy_if RPM "${CPACKGEN:-"DEB;RPM"}" "$PACKAGE_RPM" "$BUILD_DIR/${API_NAME}"*.rpm
+}
+
+create_wheel_package() {
+ echo "Creating Omniperf wheel package"
+
+ # Copy the setup.py generator to build folder
+ mkdir -p "$ROCM_WHEEL_DIR"
+ cp -f "$SCRIPT_ROOT"/generate_setup_py.py "$ROCM_WHEEL_DIR"
+ cp -f "$SCRIPT_ROOT"/repackage_wheel.sh "$ROCM_WHEEL_DIR"
+ cd "$ROCM_WHEEL_DIR" || exit
+
+ # Currently only supports python3.6
+ ./repackage_wheel.sh "$BUILD_DIR"/*.rpm python3.6
+
+ # Copy the wheel created to RPM folder which will be uploaded to artifactory
+ copy_if WHL "WHL" "$PACKAGE_RPM" "$ROCM_WHEEL_DIR"/dist/*.whl
+}
+
+print_output_directory() {
+ case ${PKGTYPE} in
+ ("deb")
+ echo "${PACKAGE_DEB}";;
+ ("rpm")
+ echo "${PACKAGE_RPM}";;
+ (*)
+ echo "Invalid package type \"${PKGTYPE}\" provided for -o" >&2; exit 1;;
+ esac
+ exit
+}
+
+verifyEnvSetup
+
+case "$TARGET" in
+ (clean) clean ;;
+ (build) build ;;
+ (outdir) print_output_directory ;;
+ (*) die "Invalid target $TARGET" ;;
+esac
+
+if [[ $WHEEL_PACKAGE == true ]]; then
+ echo "Wheel Package build started !!!!"
+ create_wheel_package +fi + +echo "Operation complete" \ No newline at end of file diff --git a/tools/rocm-build/build_omnitrace.sh b/tools/rocm-build/build_omnitrace.sh new file mode 100755 index 000000000..c604351de --- /dev/null +++ b/tools/rocm-build/build_omnitrace.sh @@ -0,0 +1,191 @@ +#!/bin/bash + +source "$(dirname "${BASH_SOURCE}")/compute_utils.sh" + +printUsage() { + echo + echo "Usage: ${BASH_SOURCE##*/} [options ...]" + echo + echo "Options:" + echo " -c, --clean Clean output and delete all intermediate work" + echo " -s, --static Build static lib (.a). build instead of dynamic/shared(.so) " + echo " -p, --package Specify packaging format" + echo " -r, --release Make a release build instead of a debug build" + echo " -a, --address_sanitizer Enable address sanitizer" + echo " -o, --outdir Print path of output directory containing packages of + type referred to by pkg_type" + echo " -w, --wheel Creates python wheel package of omnitrace. + It needs to be used along with -r option" + echo " -h, --help Prints this help" + echo + echo "Possible values for :" + echo " deb -> Debian format (default)" + echo " rpm -> RPM format" + echo + + return 0 +} + +API_NAME="omnitrace" +PROJ_NAME="$API_NAME" +LIB_NAME="lib${API_NAME}" +TARGET="build" +MAKETARGET="deb" +PACKAGE_ROOT="$(getPackageRoot)" +PACKAGE_LIB="$(getLibPath)" +BUILD_DIR="$(getBuildPath $API_NAME)" +PACKAGE_DEB="$(getPackageRoot)/deb/$API_NAME" +PACKAGE_RPM="$(getPackageRoot)/rpm/$API_NAME" +BUILD_TYPE="Debug" +MAKE_OPTS="-j 8" +SHARED_LIBS="ON" +CLEAN_OR_OUT=0 +MAKETARGET="deb" +PKGTYPE="deb" +ASAN=0 + +#parse the arguments +VALID_STR=$(getopt -o hcraso:p:w --long help,clean,release,address_sanitizer,static,outdir:,package:,wheel -- "$@") +eval set -- "$VALID_STR" + +while true; do + case "$1" in + -h | --help) + printUsage + exit 0 + ;; + -c | --clean) + TARGET="clean" + ((CLEAN_OR_OUT |= 1)) + shift + ;; + -r | --release) + BUILD_TYPE="RelWithDebInfo" + shift + ;; + -a | --address_sanitizer) + 
ack_and_ignore_asan + + ASAN=1 + shift + ;; + -s | --static) + SHARED_LIBS="OFF" + shift + ;; + -o | --outdir) + TARGET="outdir" + PKGTYPE=$2 + ((CLEAN_OR_OUT |= 2)) + shift 2 + ;; + -p | --package) + MAKETARGET="$2" + shift 2 + ;; + -w | --wheel) + echo "omnitrace: wheel build option accepted and ignored" + shift + ;; + --) + shift + break + ;; + *) + echo " This should never come but just incase : UNEXPECTED ERROR Parm : [$1] " >&2 + exit 20 + ;; + esac + +done + +RET_CONFLICT=1 +check_conflicting_options $CLEAN_OR_OUT $PKGTYPE $MAKETARGET +if [ $RET_CONFLICT -ge 30 ]; then + print_vars $API_NAME $TARGET $BUILD_TYPE $SHARED_LIBS $CLEAN_OR_OUT $PKGTYPE $MAKETARGET + exit $RET_CONFLICT +fi + +clean() { + echo "Cleaning $PROJ_NAME" + rm -rf "$BUILD_DIR" + rm -rf "$PACKAGE_DEB" + rm -rf "$PACKAGE_RPM" + rm -rf "$PACKAGE_ROOT/${PROJ_NAME:?}" + rm -rf "$PACKAGE_LIB/${LIB_NAME:?}"* +} + +build_omnitrace() { + echo "Building $PROJ_NAME" + if [ "$DISTRO_ID" = "mariner-2.0" ] || [ "$DISTRO_ID" = "ubuntu-24.04" ] || [ "$DISTRO_ID" = "azurelinux-3.0" ]; then + echo "Skip make and uploading packages for Omnitrace on \"${DISTRO_ID}\" distro" + exit 0 + fi + + if [ $ASAN == 1 ]; then + echo "Skip make and uploading packages for Omnitrace on ASAN build" + exit 0 + fi + if [ ! 
-d "$BUILD_DIR" ]; then + mkdir -p "$BUILD_DIR" + echo "Created build directory: $BUILD_DIR" + fi + + echo "Build directory: $BUILD_DIR" + pushd "$BUILD_DIR" || exit + print_lib_type $SHARED_LIBS + + echo "ROCm CMake Params: $(rocm_cmake_params)" + echo "ROCm Common CMake Params: $(rocm_common_cmake_params)" + + + if [ $ASAN == 1 ]; then + echo "Address Sanitizer path" + + else + cmake \ + $(rocm_cmake_params) \ + $(rocm_common_cmake_params) \ + -DOMNITRACE_BUILD_{LIBUNWIND,DYNINST}=ON \ + -DDYNINST_BUILD_{TBB,BOOST,ELFUTILS,LIBIBERTY}=ON \ + "$OMNITRACE_ROOT" + fi + + + popd || exit + + echo "Make Options: $MAKE_OPTS" + cmake --build "$BUILD_DIR" --target all -- $MAKE_OPTS + cmake --build "$BUILD_DIR" --target install -- $MAKE_OPTS + cmake --build "$BUILD_DIR" --target package -- $MAKE_OPTS + + copy_if DEB "${CPACKGEN:-"DEB;RPM"}" "$PACKAGE_DEB" "$BUILD_DIR/${API_NAME}"*.deb + copy_if RPM "${CPACKGEN:-"DEB;RPM"}" "$PACKAGE_RPM" "$BUILD_DIR/${API_NAME}"*.rpm +} + +print_output_directory() { + case ${PKGTYPE} in + "deb") + echo "${PACKAGE_DEB}" + ;; + "rpm") + echo "${PACKAGE_RPM}" + ;; + *) + echo "Invalid package type \"${PKGTYPE}\" provided for -o" >&2 + exit 1 + ;; + esac + exit +} + +verifyEnvSetup + +case "$TARGET" in +clean) clean ;; +build) build_omnitrace ;; +outdir) print_output_directory ;; +*) die "Invalid target $TARGET" ;; +esac + +echo "Operation complete" diff --git a/tools/rocm-build/build_opencl_icd_loader.sh b/tools/rocm-build/build_opencl_icd_loader.sh new file mode 100755 index 000000000..1e74d1cb9 --- /dev/null +++ b/tools/rocm-build/build_opencl_icd_loader.sh @@ -0,0 +1,141 @@ +#!/bin/bash + +source "$(dirname "${BASH_SOURCE}")/compute_utils.sh" +PROJ_NAME=OpenCL-ICD-Loader +TARGET="build" +MAKEOPTS="$DASH_JAY" +BUILD_TYPE="Debug" +PACKAGE_ROOT="$(getPackageRoot)" +PACKAGE_DEB="$PACKAGE_ROOT/deb/${PROJ_NAME,,}" +PACKAGE_RPM="$PACKAGE_ROOT/rpm/${PROJ_NAME,,}" +CLEAN_OR_OUT=0; +PKGTYPE="deb" +MAKETARGET="deb" +API_NAME="rocm-opencl-icd-loader" + 
+printUsage() { + echo + echo "Usage: $(basename "${BASH_SOURCE}") [options ...]" + echo + echo "Options:" + echo " -c, --clean Clean output and delete all intermediate work" + echo " -p, --package Specify packaging format" + echo " -r, --release Make a release build instead of a debug build" + echo " -h, --help Prints this help" + echo " -o, --outdir Print path of output directory containing packages" + echo " -s, --static Component/Build does not support static builds just accepting this param & ignore. No effect of the param on this build" + echo + echo "Possible values for :" + echo " deb -> Debian format (default)" + echo " rpm -> RPM format" + echo + return 0 +} + +RET_CONFLICT=1 +check_conflicting_options $CLEAN_OR_OUT $PKGTYPE $MAKETARGET +if [ $RET_CONFLICT -ge 30 ]; then + print_vars $TARGET $BUILD_TYPE $CLEAN_OR_OUT $PKGTYPE $MAKETARGET + exit $RET_CONFLICT +fi + +clean_opencl_icd_loader() { + echo "Cleaning $PROJ_NAME" + rm -rf "$PACKAGE_DEB" + rm -rf "$PACKAGE_RPM" + rm -rf "$PACKAGE_ROOT/${PROJ_NAME,,}" +} + +copy_pkg_files_to_rocm() { + local comp_folder=$1 + local comp_pkg_name=$2 + + cd "${OUT_DIR}/${PKGTYPE}/${comp_folder}"|| exit 2 + if [ "${PKGTYPE}" = 'deb' ]; then + dpkg-deb -x ${comp_pkg_name}_*.deb pkg/ + else + mkdir pkg && pushd pkg/ || exit 2 + if [[ "${comp_pkg_name}" != *-dev* ]]; then + rpm2cpio ../${comp_pkg_name}-*.rpm | cpio -idmv + else + rpm2cpio ../${comp_pkg_name}el-*.rpm | cpio -idmv + fi + popd || exit 2 + fi + ls ./pkg -alt + cp -r ./pkg/*/rocm*/* "${ROCM_PATH}" || exit 2 + rm -rf pkg/ +} + +build_opencl_icd_loader() { + echo "Downloading $PROJ_NAME" package + if [ "$DISTRO_NAME" = ubuntu ]; then + mkdir -p "$PACKAGE_DEB" + local rocm_ver=${ROCM_VERSION} + if [ ${ROCM_VERSION##*.} = 0 ]; then + rocm_ver=${ROCM_VERSION%.*} + fi + local url="https://repo.radeon.com/rocm/apt/${rocm_ver}/pool/main/r/${API_NAME}/" + local package + package=$(curl -s "$url" | grep -Po 'href="\K[^"]*' | grep "${DISTRO_RELEASE}" | head -n 1) + + if [ 
-z "$package" ]; then + echo "No package found for Ubuntu version $DISTRO_RELEASE" + exit 1 + fi + + wget -t3 -P "$PACKAGE_DEB" "${url}${package}" + copy_pkg_files_to_rocm ${PROJ_NAME,,} ${API_NAME} + else + echo "$DISTRO_ID is not supported..." + exit 2 + fi + + echo "Installing $PROJ_NAME" package +} + +print_output_directory() { + case ${PKGTYPE} in + ("deb") + echo ${PACKAGE_DEB};; + ("rpm") + echo ${PACKAGE_RPM};; + (*) + echo "Invalid package type \"${PKGTYPE}\" provided for -o" >&2; exit 1;; + esac + exit +} + +VALID_STR=`getopt -o hcraswlo:p: --long help,clean,release,outdir:,package: -- "$@"` +eval set -- "$VALID_STR" +while true ; +do + case "$1" in + (-c | --clean ) + TARGET="clean" ; ((CLEAN_OR_OUT|=1)) ; shift ;; + (-r | --release ) + BUILD_TYPE="RelWithDebInfo" ; shift ;; + (-h | --help ) + printUsage ; exit 0 ;; + (-a | --address_sanitizer) + ack_and_ignore_asan ; shift ;; + (-o | --outdir) + TARGET="outdir"; PKGTYPE=$2 ; OUT_DIR_SPECIFIED=1 ; ((CLEAN_OR_OUT|=2)) ; shift 2 ;; + (-p | --package) + MAKETARGET="$2" ; shift 2;; + (-s | --static) + echo "-s parameter accepted but ignored" ; shift ;; + --) shift; break;; + (*) + echo " This should never come but just incase : UNEXPECTED ERROR Parm : [$1] ">&2 ; exit 20;; + esac +done + +case $TARGET in + (clean) clean_opencl_icd_loader ;; + (build) build_opencl_icd_loader ;; + (outdir) print_output_directory ;; + (*) die "Invalid target $TARGET" ;; +esac + +echo "Operation complete" diff --git a/tools/rocm-build/build_rccl.sh b/tools/rocm-build/build_rccl.sh index f6c420474..3dd3d3bd5 100755 --- a/tools/rocm-build/build_rccl.sh +++ b/tools/rocm-build/build_rccl.sh @@ -26,14 +26,16 @@ build_rccl() { GPU_TARGETS="gfx908:xnack-;gfx90a:xnack-;gfx90a:xnack+;gfx940;gfx941;gfx942;gfx1030;gfx1100;gfx1101" fi + init_rocm_common_cmake_params CC=${ROCM_PATH}/bin/amdclang \ CXX=$(set_build_variables CXX) \ cmake \ - $(rocm_common_cmake_params) \ + "${rocm_math_common_cmake_params[@]}" \ -DAMDGPU_TARGETS=${GPU_TARGETS} 
\ -DHIP_COMPILER=clang \ -DCMAKE_PREFIX_PATH="${ROCM_PATH};${ROCM_PATH}/share/rocm/cmake/" \ + ${LAUNCHER_FLAGS} \ -DCPACK_GENERATOR="${PKGTYPE^^}" \ -DROCM_PATCH_VERSION=$ROCM_LIBPATCH_VERSION \ -DBUILD_ADDRESS_SANITIZER="${ADDRESS_SANITIZER}" \ diff --git a/tools/rocm-build/build_rocal.sh b/tools/rocm-build/build_rocal.sh new file mode 100755 index 000000000..8bf5e3a97 --- /dev/null +++ b/tools/rocm-build/build_rocal.sh @@ -0,0 +1,71 @@ +#!/bin/bash + +set -ex +source "$(dirname "${BASH_SOURCE[0]}")/compute_helper.sh" + +set_component_src rocAL + +build_rocal() { + + if [ "$DISTRO_ID" = "mariner-2.0" ] ; then + echo "Not building rocal for ${DISTRO_ID}. Exiting..." + return 0 + fi + + echo "Start build" + + # Enable ASAN + if [ "${ENABLE_ADDRESS_SANITIZER}" == "true" ]; then + set_asan_env_vars + set_address_sanitizer_on + fi + +# python3 ${COMPONENT_SRC}/rocAL-setup.py + pushd /tmp + # PyBind11 + git clone -b v2.11.1 https://github.com/pybind/pybind11 + cd pybind11 && mkdir build && cd build + cmake -DDOWNLOAD_CATCH=ON -DDOWNLOAD_EIGEN=ON ../ + make -j$(nproc) && sudo make install + cd ../.. + # Turbo JPEG + git clone -b 3.0.2 https://github.com/libjpeg-turbo/libjpeg-turbo.git + cd libjpeg-turbo && mkdir build && cd build + cmake -DCMAKE_INSTALL_PREFIX=/usr -DCMAKE_BUILD_TYPE=RELEASE -DENABLE_STATIC=FALSE -DCMAKE_INSTALL_DEFAULT_LIBDIR=lib -DWITH_JPEG8=TRUE .. + make -j$(nproc) && sudo make install + cd ../.. + # RapidJSON + git clone https://github.com/Tencent/rapidjson.git + cd rapidjson && mkdir build && cd build + cmake .. && make -j$(nproc) && sudo make install + popd + + mkdir -p $BUILD_DIR && cd $BUILD_DIR + + cmake -DAMDRPP_PATH=$ROCM_PATH ${COMPONENT_SRC} + make -j${PROC} + cmake --build . 
--target PyPackageInstall + sudo make install + sudo make package + sudo chown -R $(id -u):$(id -g) ${BUILD_DIR} + + rm -rf _CPack_Packages/ && find -name '*.o' -delete + mkdir -p $PACKAGE_DIR + cp ${BUILD_DIR}/*.${PKGTYPE} $PACKAGE_DIR + show_build_cache_stats +} + +clean_rocal() { + echo "Cleaning rocAL build directory: ${BUILD_DIR} ${PACKAGE_DIR}" + rm -rf "$BUILD_DIR" "$PACKAGE_DIR" + echo "Done!" +} + +stage2_command_args "$@" + +case $TARGET in + build) build_rocal ;; + outdir) print_output_directory ;; + clean) clean_rocal ;; + *) die "Invalid target $TARGET" ;; +esac diff --git a/tools/rocm-build/build_rocalution.sh b/tools/rocm-build/build_rocalution.sh index d60b6c055..ba4215c07 100755 --- a/tools/rocm-build/build_rocalution.sh +++ b/tools/rocm-build/build_rocalution.sh @@ -22,6 +22,7 @@ build_rocalution() { echo "CXX compiler: $CXX" mkdir -p "$BUILD_DIR" && cd "$BUILD_DIR" + init_rocm_common_cmake_params if [ -n "$GPU_ARCHS" ]; then GPU_TARGETS="$GPU_ARCHS" @@ -32,7 +33,7 @@ build_rocalution() { cmake \ -DSUPPORT_HIP=ON \ ${LAUNCHER_FLAGS} \ - $(rocm_common_cmake_params) \ + "${rocm_math_common_cmake_params[@]}" \ -DAMDGPU_TARGETS=${GPU_TARGETS} \ -DCPACK_SET_DESTDIR=OFF \ -DBUILD_CLIENTS_SAMPLES=ON \ diff --git a/tools/rocm-build/build_rocblas.sh b/tools/rocm-build/build_rocblas.sh index 28593c648..980d199cc 100755 --- a/tools/rocm-build/build_rocblas.sh +++ b/tools/rocm-build/build_rocblas.sh @@ -28,11 +28,12 @@ build_rocblas() { else GPU_TARGETS="gfx908:xnack-;gfx90a:xnack+;gfx90a:xnack-;gfx940;gfx941;gfx942;gfx1030;gfx1100;gfx1101" fi + init_rocm_common_cmake_params cmake \ -DCMAKE_TOOLCHAIN_FILE=toolchain-linux.cmake \ -DBUILD_DIR="${BUILD_DIR}" \ - $(rocm_common_cmake_params) \ + "${rocm_math_common_cmake_params[@]}" \ -DROCM_DIR="${ROCM_PATH}" \ ${LAUNCHER_FLAGS} \ -DCMAKE_PREFIX_PATH="${DEPS_DIR};${ROCM_PATH}" \ diff --git a/tools/rocm-build/build_rocdecode.sh b/tools/rocm-build/build_rocdecode.sh index bc768c8ee..35bb9b5ed 100755 --- 
a/tools/rocm-build/build_rocdecode.sh +++ b/tools/rocm-build/build_rocdecode.sh @@ -4,15 +4,16 @@ source "$(dirname "${BASH_SOURCE[0]}")/compute_helper.sh" set_component_src rocDecode BUILD_DEV=ON build_rocdecode() { - if [ "$DISTRO_ID" = "centos-7" ] ; then + if [ "$DISTRO_ID" = "centos-7" ] || [ "$DISTRO_ID" = "sles-15.4" ] ; then echo "Not building rocDecode for ${DISTRO_ID}. Exiting..." return 0 fi echo "Start build" mkdir -p $BUILD_DIR && cd $BUILD_DIR + python3 ${COMPONENT_SRC}/rocDecode-setup.py --developer OFF - cmake ${COMPONENT_SRC} + cmake -DROCM_DEP_ROCMCORE=ON ${COMPONENT_SRC} make -j8 make install make package diff --git a/tools/rocm-build/build_rocfft.sh b/tools/rocm-build/build_rocfft.sh index 180406fa1..27d6009c3 100755 --- a/tools/rocm-build/build_rocfft.sh +++ b/tools/rocm-build/build_rocfft.sh @@ -16,6 +16,7 @@ build_rocfft() { set_address_sanitizer_on fi mkdir -p "$BUILD_DIR" && cd "$BUILD_DIR" + init_rocm_common_cmake_params if [ -n "$GPU_ARCHS" ]; then GPU_TARGETS="$GPU_ARCHS" @@ -26,7 +27,7 @@ build_rocfft() { CXX="${ROCM_PATH}/bin/hipcc" \ cmake \ ${LAUNCHER_FLAGS} \ - $(rocm_common_cmake_params) \ + "${rocm_math_common_cmake_params[@]}" \ -DAMDGPU_TARGETS=${GPU_TARGETS} \ -DUSE_HIP_CLANG=ON \ -DHIP_COMPILER=clang \ diff --git a/tools/rocm-build/build_rocm-cmake.sh b/tools/rocm-build/build_rocm-cmake.sh index 267313c13..f700f7be7 100755 --- a/tools/rocm-build/build_rocm-cmake.sh +++ b/tools/rocm-build/build_rocm-cmake.sh @@ -10,11 +10,13 @@ printUsage() { echo " -c, --clean Clean output and delete all intermediate work" echo " -r, --release Build a release version of the package" echo " -a, --address_sanitizer Enable address sanitizer" - echo " -s, --static Supports static CI by accepting this param & not bailing out. No effect of the param though" + echo " -s, --static Build static lib (.a). build instead of dynamic/shared(.so) " + echo " -w, --wheel Creates python wheel package of rocm-cmake. 
+ It needs to be used along with -r option" echo " -o, --outdir Print path of output directory containing packages of type referred to by pkg_type" echo " -p, --package Specify packaging format" - echo " -h, --help Prints this help" + echo " -h, --help Prints this help" echo echo @@ -30,6 +32,7 @@ ROCM_CMAKE_BUILD_DIR="$(getBuildPath rocm-cmake)" ROCM_CMAKE_BUILD_DIR="$(getBuildPath rocm-cmake)" ROCM_CMAKE_PACKAGE_DEB="$(getPackageRoot)/deb/rocm-cmake" ROCM_CMAKE_PACKAGE_RPM="$(getPackageRoot)/rpm/rocm-cmake" +ROCM_WHEEL_DIR="${ROCM_CMAKE_BUILD_DIR}/_wheel" ROCM_CMAKE_BUILD_TYPE="debug" BUILD_TYPE="Debug" SHARED_LIBS="ON" @@ -37,7 +40,7 @@ CLEAN_OR_OUT=0; PKGTYPE="deb" MAKETARGET="deb" -VALID_STR=`getopt -o hcraso:p: --long help,clean,release,static,address_sanitizer,outdir:,package: -- "$@"` +VALID_STR=`getopt -o hcraswo:p: --long help,clean,release,static,wheel,address_sanitizer,outdir:,package: -- "$@"` eval set -- "$VALID_STR" while true ; @@ -53,6 +56,8 @@ do ack_and_ignore_asan ; shift ;; (-s | --static) SHARED_LIBS="OFF" ; shift ;; + (-w | --wheel) + WHEEL_PACKAGE=true ; shift ;; (-o | --outdir) TARGET="outdir"; PKGTYPE=$2 ; OUT_DIR_SPECIFIED=1 ; ((CLEAN_OR_OUT|=2)) ; shift 2 ;; (-p | --package) @@ -73,6 +78,7 @@ fi clean_rocm_cmake() { + rm -rf "$ROCM_WHEEL_DIR" rm -rf $ROCM_CMAKE_BUILD_DIR rm -rf $ROCM_CMAKE_PACKAGE_DEB rm -rf $ROCM_CMAKE_PACKAGE_RPM @@ -87,6 +93,7 @@ build_rocm_cmake() { cmake \ $(rocm_cmake_params) \ + -DBUILD_SHARED_LIBS=$SHARED_LIBS \ -DCPACK_SET_DESTDIR="OFF" \ -DROCM_DISABLE_LDCONFIG=ON \ $ROCM_CMAKE_ROOT @@ -99,6 +106,19 @@ build_rocm_cmake() { copy_if RPM "${CPACKGEN:-"DEB;RPM"}" "$ROCM_CMAKE_PACKAGE_RPM" $ROCM_CMAKE_BUILD_DIR/rocm-cmake*.rpm } +create_wheel_package() { + echo "Creating rocm-cmake wheel package" + # Copy the setup.py generator to build folder + mkdir -p $ROCM_WHEEL_DIR + cp -f $SCRIPT_ROOT/generate_setup_py.py $ROCM_WHEEL_DIR + cp -f $SCRIPT_ROOT/repackage_wheel.sh $ROCM_WHEEL_DIR + cd $ROCM_WHEEL_DIR + # Currently 
only supports python3.6 + ./repackage_wheel.sh $ROCM_CMAKE_BUILD_DIR/rocm-cmake*.rpm python3.6 + # Copy the wheel created to RPM folder which will be uploaded to artifactory + copy_if WHL "WHL" "$ROCM_CMAKE_PACKAGE_RPM" "$ROCM_WHEEL_DIR"/dist/*.whl +} + print_output_directory() { case ${PKGTYPE} in ("deb") @@ -118,4 +138,9 @@ case $TARGET in (*) die "Invalid target $TARGET" ;; esac +if [[ $WHEEL_PACKAGE == true ]]; then + echo "Wheel Package build started !!!!" + create_wheel_package +fi + echo "Operation complete" diff --git a/tools/rocm-build/build_rocprim.sh b/tools/rocm-build/build_rocprim.sh index e00cd8038..b0cb796f4 100755 --- a/tools/rocm-build/build_rocprim.sh +++ b/tools/rocm-build/build_rocprim.sh @@ -24,13 +24,14 @@ build_rocprim() { GPU_TARGETS="gfx908:xnack-;gfx90a:xnack-;gfx90a:xnack+;gfx940;gfx941;gfx942;gfx1030;gfx1100;gfx1101" fi + init_rocm_common_cmake_params CXX="${ROCM_PATH}/bin/hipcc" \ cmake \ ${LAUNCHER_FLAGS} \ - $(rocm_common_cmake_params) \ + "${rocm_math_common_cmake_params[@]}" \ -DAMDGPU_TARGETS=${GPU_TARGETS} \ -DBUILD_BENCHMARK=OFF \ - -DBUILD_BENCHMARK=OFF \ + -DBUILD_SHARED_LIBS=ON \ -DBUILD_TEST=ON \ -DCMAKE_MODULE_PATH="${ROCM_PATH}/lib/cmake/hip;${ROCM_PATH}/hip/cmake" \ "$COMPONENT_SRC" diff --git a/tools/rocm-build/build_rocprofiler-sdk.sh b/tools/rocm-build/build_rocprofiler-sdk.sh new file mode 100755 index 000000000..abfba538b --- /dev/null +++ b/tools/rocm-build/build_rocprofiler-sdk.sh @@ -0,0 +1,222 @@ +#!/bin/bash + +source "$(dirname "${BASH_SOURCE}")/compute_utils.sh" + +printUsage() { + echo + echo "Usage: ${BASH_SOURCE##*/} [options ...]" + echo + echo "Options:" + echo " -c, --clean Clean output and delete all intermediate work" + echo " -s, --static Build static lib (.a). build instead of dynamic/shared(.so) " + echo " -w, --wheel Creates python wheel package of rocprofiler-sdk. 
+ It needs to be used along with -r option" + echo " -p, --package Specify packaging format" + echo " -r, --release Make a release build instead of a debug build" + echo " -a, --address_sanitizer Enable address sanitizer" + echo " -o, --outdir Print path of output directory containing packages of + type referred to by pkg_type" + echo " -h, --help Prints this help" + echo + echo "Possible values for :" + echo " deb -> Debian format (default)" + echo " rpm -> RPM format" + echo + + return 0 +} + +API_NAME="rocprofiler-sdk" +PROJ_NAME="$API_NAME" +LIB_NAME="lib${API_NAME}" +TARGET="build" +MAKETARGET="deb" +PACKAGE_ROOT="$(getPackageRoot)" +PACKAGE_LIB="$(getLibPath)" +PACKAGE_INCLUDE="$(getIncludePath)" +BUILD_DIR="$(getBuildPath $API_NAME)" +PACKAGE_DEB="$(getPackageRoot)/deb/$API_NAME" +PACKAGE_RPM="$(getPackageRoot)/rpm/$API_NAME" +ROCM_WHEEL_DIR="${BUILD_DIR}/_wheel" +PACKAGE_PREFIX="$ROCM_INSTALL_PATH" +BUILD_TYPE="Debug" +MAKE_OPTS="$DASH_JAY" +SHARED_LIBS="ON" +CLEAN_OR_OUT=0 +MAKETARGET="deb" +PKGTYPE="deb" + +GPU_LIST="gfx900;gfx906;gfx908;gfx90a;gfx940;gfx941;gfx942;gfx1030;gfx1031;gfx1100;gfx1101;gfx1102" +ASAN=0 + +VALID_STR=$(getopt -o hcrawso:p: --long help,clean,release,static,address_sanitizer,wheel,outdir:,package: -- "$@") +eval set -- "$VALID_STR" + +while true; do + case "$1" in + -h | --help) + printUsage + exit 0 + ;; + -c | --clean) + TARGET="clean" + ((CLEAN_OR_OUT |= 1)) + shift + ;; + -r | --release) + BUILD_TYPE="RelWithDebInfo" + shift + ;; + -a | --address_sanitizer) + set_address_sanitizer_on + set_asan_env_vars + ASAN=1 + shift + ;; + -s | --static) + SHARED_LIBS="OFF" + shift + ;; + -w | --wheel) + WHEEL_PACKAGE=true + shift + ;; + -o | --outdir) + TARGET="outdir" + PKGTYPE=$2 + OUT_DIR_SPECIFIED=1 + ((CLEAN_OR_OUT |= 2)) + shift 2 + ;; + -p | --package) + MAKETARGET="$2" + shift 2 + ;; + --) + shift + break + ;; # end delimiter + *) + echo " This should never come but just incase : UNEXPECTED ERROR Parm : [$1] " >&2 + exit 20 + ;; + 
esac + +done + +RET_CONFLICT=1 +check_conflicting_options $CLEAN_OR_OUT $PKGTYPE $MAKETARGET +if [ $RET_CONFLICT -ge 30 ]; then + print_vars $API_NAME $TARGET $BUILD_TYPE $SHARED_LIBS $CLEAN_OR_OUT $PKGTYPE $MAKETARGET + exit $RET_CONFLICT +fi + +clean() { + echo "Cleaning $PROJ_NAME" + rm -rf "$ROCM_WHEEL_DIR" + rm -rf "$BUILD_DIR" + rm -rf "$PACKAGE_DEB" + rm -rf "$PACKAGE_RPM" + rm -rf "$PACKAGE_ROOT/${PROJ_NAME}" + rm -rf "$PACKAGE_ROOT/libexec/${PROJ_NAME}" + rm -rf "$PACKAGE_INCLUDE/${PROJ_NAME}" + rm -rf "$PACKAGE_LIB/${LIB_NAME}"* + rm -rf "$PACKAGE_LIB/${PROJ_NAME}" +} + +build_rocprofiler-sdk() { + if [ ! -f "${ROCPROFILER_SDK_ROOT}/CMakeLists.txt" ]; then + echo "Warning: $ROCPROFILER_SDK_ROOT not found" + else + echo "Building $PROJ_NAME" + PACKAGE_CMAKE="$(getCmakePath)" + if [ ! -d "$BUILD_DIR" ]; then + mkdir -p "$BUILD_DIR" + pushd "$BUILD_DIR" + print_lib_type $SHARED_LIBS + + if [ $ASAN == 1 ]; then + cmake \ + $(rocm_cmake_params) \ + $(rocm_common_cmake_params) \ + -DAMDDeviceLibs_DIR="${ROCM_INSTALL_PATH}/lib/asan/cmake/AMDDeviceLibs" \ + -Dhip_DIR="${ROCM_INSTALL_PATH}/lib/asan/cmake/hip" \ + -Dhip-lang_DIR="${ROCM_INSTALL_PATH}/lib/asan/cmake/hip-lang" \ + -Damd_comgr_DIR="${ROCM_INSTALL_PATH}/lib/asan/cmake/amd_comgr" \ + -Dhsa-runtime64_DIR="${ROCM_INSTALL_PATH}/lib/asan/cmake/hsa-runtime64" \ + -Dhsakmt_DIR="${ROCM_INSTALL_PATH}/lib/asan/cmake/hsakmt" \ + -DCMAKE_HIP_COMPILER_ROCM_ROOT="${ROCM_INSTALL_PATH}" \ + -DCMAKE_PREFIX_PATH="${ROCM_INSTALL_PATH};${ROCM_INSTALL_PATH}/lib/asan" \ + -DBUILD_SHARED_LIBS=$SHARED_LIBS \ + -DGPU_TARGETS="$GPU_LIST" \ + -DCPACK_DEBIAN_PACKAGE_SHLIBDEPS=OFF \ + -DPython3_EXECUTABLE=$(which python3) \ + "$ROCPROFILER_SDK_ROOT" + else + cmake \ + $(rocm_cmake_params) \ + $(rocm_common_cmake_params) \ + -DCMAKE_PREFIX_PATH="${ROCM_INSTALL_PATH}" \ + -DBUILD_SHARED_LIBS=$SHARED_LIBS \ + -DGPU_TARGETS="$GPU_LIST" \ + -DROCPROFILER_BUILD_SAMPLES=ON \ + -DROCPROFILER_BUILD_TESTS=ON \ + 
-DCPACK_DEBIAN_PACKAGE_SHLIBDEPS=OFF \ + -DPython3_EXECUTABLE=$(which python3) \ + "$ROCPROFILER_SDK_ROOT" + fi + + popd + fi + cmake --build "$BUILD_DIR" --target all -- $MAKE_OPTS + cmake --build "$BUILD_DIR" --target install -- $MAKE_OPTS + cmake --build "$BUILD_DIR" --target package -- $MAKE_OPTS + + copy_if DEB "${CPACKGEN:-"DEB;RPM"}" "$PACKAGE_DEB" "$BUILD_DIR/${API_NAME}"*.deb + copy_if RPM "${CPACKGEN:-"DEB;RPM"}" "$PACKAGE_RPM" "$BUILD_DIR/${API_NAME}"*.rpm + fi +} + +create_wheel_package() { + echo "Creating rocprofiler sdk wheel package" + mkdir -p "$ROCM_WHEEL_DIR" + cp -f "$SCRIPT_ROOT"/generate_setup_py.py "$ROCM_WHEEL_DIR" + cp -f "$SCRIPT_ROOT"/repackage_wheel.sh "$ROCM_WHEEL_DIR" + cd "$ROCM_WHEEL_DIR" + # Currently only supports python3.6 + ./repackage_wheel.sh "$BUILD_DIR"/*.rpm python3.6 + # Copy the wheel created to RPM folder which will be uploaded to artifactory + copy_if WHL "WHL" "$PACKAGE_RPM" "$ROCM_WHEEL_DIR"/dist/*.whl +} + +print_output_directory() { + case ${PKGTYPE} in + "deb") + echo ${PACKAGE_DEB} + ;; + "rpm") + echo ${PACKAGE_RPM} + ;; + *) + echo "Invalid package type \"${PKGTYPE}\" provided for -o" >&2 + exit 1 + ;; + esac + exit +} + +verifyEnvSetup + +case "$TARGET" in + clean) clean ;; + build) build_rocprofiler-sdk ;; + outdir) print_output_directory ;; + *) die "Invalid target $TARGET" ;; +esac + +if [[ $WHEEL_PACKAGE == true ]]; then + echo "Wheel Package build started !!!!" 
+ create_wheel_package +fi + +echo "Operation complete" diff --git a/tools/rocm-build/build_rocrand.sh b/tools/rocm-build/build_rocrand.sh index 8e7aa13af..691e64a63 100755 --- a/tools/rocm-build/build_rocrand.sh +++ b/tools/rocm-build/build_rocrand.sh @@ -25,10 +25,12 @@ build_rocrand() { GPU_TARGETS="gfx908:xnack-;gfx90a:xnack-;gfx90a:xnack+;gfx940;gfx941;gfx942;gfx1030;gfx1100;gfx1101" fi + init_rocm_common_cmake_params + CXX=$(set_build_variables CXX)\ cmake \ ${LAUNCHER_FLAGS} \ - $(rocm_common_cmake_params) \ + "${rocm_math_common_cmake_params[@]}" \ -DAMDGPU_TARGETS=${GPU_TARGETS} \ -DBUILD_TEST=ON \ -DBUILD_BENCHMARK=ON \ diff --git a/tools/rocm-build/build_rocsolver.sh b/tools/rocm-build/build_rocsolver.sh index fbffc5dfd..e310671eb 100755 --- a/tools/rocm-build/build_rocsolver.sh +++ b/tools/rocm-build/build_rocsolver.sh @@ -28,11 +28,13 @@ build_rocsolver() { GPU_TARGETS="gfx908:xnack-;gfx90a:xnack-;gfx90a:xnack+;gfx940;gfx941;gfx942;gfx1030;gfx1100;gfx1101" fi + init_rocm_common_cmake_params + CXX="${ROCM_PATH}/bin/hipcc" \ cmake \ -DCPACK_SET_DESTDIR=OFF \ ${LAUNCHER_FLAGS} \ - $(rocm_common_cmake_params) \ + "${rocm_math_common_cmake_params[@]}" \ -Drocblas_DIR="${ROCM_PATH}/rocblas/lib/cmake/rocblas" \ -DAMDGPU_TARGETS=${GPU_TARGETS} \ -DBUILD_CLIENTS_TESTS=ON \ diff --git a/tools/rocm-build/build_rocsparse.sh b/tools/rocm-build/build_rocsparse.sh index 2426b98c9..8cda3d59f 100755 --- a/tools/rocm-build/build_rocsparse.sh +++ b/tools/rocm-build/build_rocsparse.sh @@ -27,12 +27,14 @@ build_rocsparse() { fi ROCSPARSE_TEST_MIRROR=$MIRROR \ - CXX=$(set_build_variables CXX)\ - CC=$(set_build_variables CC)\ + export CXX=$(set_build_variables CXX)\ + export CC=$(set_build_variables CC)\ + + init_rocm_common_cmake_params cmake \ -DAMDGPU_TARGETS=${GPU_TARGETS} \ ${LAUNCHER_FLAGS} \ - $(rocm_common_cmake_params) \ + "${rocm_math_common_cmake_params[@]}"\ -DBUILD_CLIENTS_SAMPLES=ON \ -DBUILD_CLIENTS_TESTS=ON \ -DBUILD_CLIENTS_BENCHMARKS=ON \ diff --git 
a/tools/rocm-build/build_rocthrust.sh b/tools/rocm-build/build_rocthrust.sh index cd7cc16e4..84e058b4d 100755 --- a/tools/rocm-build/build_rocthrust.sh +++ b/tools/rocm-build/build_rocthrust.sh @@ -27,10 +27,12 @@ build_rocthrust() { GPU_TARGETS="gfx908:xnack-;gfx90a:xnack-;gfx90a:xnack+;gfx940;gfx941;gfx942;gfx1030;gfx1100;gfx1101" fi + init_rocm_common_cmake_params + CXX=$(set_build_variables CXX)\ cmake \ ${LAUNCHER_FLAGS} \ - $(rocm_common_cmake_params) \ + "${rocm_math_common_cmake_params[@]}" \ -DAMDGPU_TARGETS=${GPU_TARGETS} \ -DCMAKE_MODULE_PATH="${ROCM_PATH}/lib/cmake/hip;${ROCM_PATH}/hip/cmake" \ -DROCPRIM_ROOT="${ROCM_PATH}/rocprim" \ diff --git a/tools/rocm-build/build_rocwmma.sh b/tools/rocm-build/build_rocwmma.sh index 54eca1a11..694723c07 100755 --- a/tools/rocm-build/build_rocwmma.sh +++ b/tools/rocm-build/build_rocwmma.sh @@ -27,9 +27,11 @@ build_rocwmma() { GPU_TARGETS="gfx908:xnack-;gfx90a:xnack-;gfx90a:xnack+;gfx940;gfx941;gfx942;gfx1100;gfx1101" fi + init_rocm_common_cmake_params + CXX=$(set_build_variables CXX)\ cmake \ - $(rocm_common_cmake_params) \ + "${rocm_math_common_cmake_params[@]}" \ ${LAUNCHER_FLAGS} \ -DAMDGPU_TARGETS=${GPU_TARGETS} \ -DROCWMMA_BUILD_VALIDATION_TESTS=ON \ diff --git a/tools/rocm-build/build_rpp.sh b/tools/rocm-build/build_rpp.sh index b1051a9d8..d20a62ab7 100755 --- a/tools/rocm-build/build_rpp.sh +++ b/tools/rocm-build/build_rpp.sh @@ -12,7 +12,7 @@ ROCM_LLVM_LIB_RPATH="\$ORIGIN/llvm/lib" rpp_specific_cmake_params() { local rpp_cmake_params if [ "${ASAN_CMAKE_PARAMS}" == "true" ] ; then - rpp_cmake_params="-DCMAKE_EXE_LINKER_FLAGS_INIT=-Wl,--enable-new-dtags,--rpath,$ROCM_ASAN_EXE_RPATH:$LLVM_LIBDIR" + rpp_cmake_params="-DCMAKE_EXE_LINKER_FLAGS_INIT=-Wl,--enable-new-dtags,--build-id=sha1,--rpath,$ROCM_ASAN_EXE_RPATH:$LLVM_LIBDIR" else rpp_cmake_params="" fi @@ -41,14 +41,16 @@ build_rpp() { GPU_TARGETS="gfx908;gfx90a;gfx940;gfx941;gfx942;gfx1030;gfx1100" fi + init_rocm_common_cmake_params + cmake \ - 
$(rocm_common_cmake_params) \ + "${rocm_math_common_cmake_params[@]}" \ ${LAUNCHER_FLAGS} \ -DBACKEND=HIP \ -DCMAKE_INSTALL_LIBDIR=$(getInstallLibDir) \ $(rpp_specific_cmake_params) \ -DAMDGPU_TARGETS=${GPU_TARGETS} \ - -DCMAKE_SHARED_LINKER_FLAGS_INIT="-fno-openmp-implicit-rpath -Wl,--enable-new-dtags,--rpath,${ROCM_LIB_RPATH}:${DEPS_DIR}/lib:${ROCM_LLVM_LIB_RPATH}" \ + -DCMAKE_SHARED_LINKER_FLAGS_INIT="-fno-openmp-implicit-rpath -Wl,--enable-new-dtags,--build-id=sha1,--rpath,${ROCM_LIB_RPATH}:${DEPS_DIR}/lib:${ROCM_LLVM_LIB_RPATH}" \ -DCMAKE_PREFIX_PATH="${DEPS_DIR};${ROCM_PATH}" \ "$COMPONENT_SRC" diff --git a/tools/rocm-build/compute_helper.sh b/tools/rocm-build/compute_helper.sh index 08412ad4c..5fbf1b916 100755 --- a/tools/rocm-build/compute_helper.sh +++ b/tools/rocm-build/compute_helper.sh @@ -189,6 +189,65 @@ echo " PKGTYPE= $6 " echo " MAKETARGET = $7 " } +rocm_math_common_cmake_params=() +init_rocm_common_cmake_params(){ + local retCmakeParams=${1:-rocm_math_common_cmake_params} + local SET_BUILD_TYPE=${BUILD_TYPE:-'RelWithDebInfo'} + local ASAN_LIBDIR="lib/asan" + local CMAKE_PATH=$(getCmakePath) +# Common cmake parameters can be set +# component build scripts can use this function + local cmake_params + if [ "${ASAN_CMAKE_PARAMS}" == "true" ] ; then + cmake_params=( + "-DCMAKE_PREFIX_PATH=$CMAKE_PATH;${ROCM_PATH}/$ASAN_LIBDIR;$ROCM_PATH/llvm;$ROCM_PATH" + "-DCMAKE_SHARED_LINKER_FLAGS_INIT=-Wl,--enable-new-dtags,--build-id=sha1,--rpath,$ROCM_ASAN_LIB_RPATH" + "-DCMAKE_EXE_LINKER_FLAGS_INIT=-Wl,--enable-new-dtags,--build-id=sha1,--rpath,$ROCM_ASAN_EXE_RPATH" + "-DENABLE_ASAN_PACKAGING=true" + ) + else + cmake_params=( + "-DCMAKE_PREFIX_PATH=${ROCM_PATH}/llvm;${ROCM_PATH}" + "-DCMAKE_SHARED_LINKER_FLAGS_INIT=-Wl,--enable-new-dtags,--build-id=sha1,--rpath,$ROCM_LIB_RPATH" + "-DCMAKE_EXE_LINKER_FLAGS_INIT=-Wl,--enable-new-dtags,--build-id=sha1,--rpath,$ROCM_EXE_RPATH" + ) + fi + + cmake_params+=( + "-DCMAKE_VERBOSE_MAKEFILE=1" + 
"-DCMAKE_BUILD_TYPE=${SET_BUILD_TYPE}" + "-DCMAKE_INSTALL_RPATH_USE_LINK_PATH=FALSE" + "-DCMAKE_INSTALL_PREFIX=${ROCM_PATH}" + "-DCMAKE_PACKAGING_INSTALL_PREFIX=${ROCM_PATH}" + "-DBUILD_FILE_REORG_BACKWARD_COMPATIBILITY=OFF" + "-DROCM_SYMLINK_LIBS=OFF" + "-DCPACK_PACKAGING_INSTALL_PREFIX=${ROCM_PATH}" + "-DROCM_DISABLE_LDCONFIG=ON" + "-DROCM_PATH=${ROCM_PATH}" + ) + + #TODO :remove if clause once debug related issues are fixed + if [ "${DISABLE_DEBUG_PACKAGE}" == "true" ] ; then + SET_BUILD_TYPE=${BUILD_TYPE:-'Release'} + cmake_params+=( + "-DCPACK_DEBIAN_DEBUGINFO_PACKAGE=FALSE" + "-DCPACK_RPM_DEBUGINFO_PACKAGE=FALSE" + "-DCPACK_RPM_INSTALL_WITH_EXEC=FALSE" + "-DCMAKE_BUILD_TYPE=${SET_BUILD_TYPE}" + ) + elif [ "$SET_BUILD_TYPE" == "RelWithDebInfo" ] || [ "$SET_BUILD_TYPE" == "Debug" ]; then + # RelWithDebinfo optimization level -O2 is having performance impact + # So overriding the same to -O3 + cmake_params+=( + "-DCPACK_DEBIAN_DEBUGINFO_PACKAGE=TRUE" + "-DCPACK_RPM_DEBUGINFO_PACKAGE=TRUE" + "-DCPACK_RPM_INSTALL_WITH_EXEC=TRUE" + "-DCMAKE_CXX_FLAGS_RELWITHDEBINFO=-O3 -g -DNDEBUG" + ) + fi + eval "${retCmakeParams}=( \"\${cmake_params[@]}\" ) " +} + # Common cmake parameters can be set # component build scripts can use this function rocm_common_cmake_params() { diff --git a/tools/rocm-build/docker/ubuntu20/install-prerequisites.sh b/tools/rocm-build/docker/ubuntu20/install-prerequisites.sh index 0c73e06b1..7828292de 100755 --- a/tools/rocm-build/docker/ubuntu20/install-prerequisites.sh +++ b/tools/rocm-build/docker/ubuntu20/install-prerequisites.sh @@ -2,7 +2,6 @@ set -ex - apt-get update DEBIAN_FRONTEND=noninteractive DEBCONF_NONINTERACTIVE_SEEN=true apt-get install --no-install-recommends -y $(grep -v '^#' /tmp/packages) apt-get clean @@ -105,7 +104,7 @@ git clone --recurse-submodules -b v1.61.0 https://github.com/grpc/grpc cd grpc mkdir -p cmake/build cd cmake/build -cmake -DgRPC_INSTALL=ON -DBUILD_SHARED_LIBS=ON -DgRPC_BUILD_TESTS=OFF 
-DCMAKE_INSTALL_PREFIX=/usr/grpc -DCMAKE_CXX_STANDARD=14 ../.. +cmake -DgRPC_INSTALL=ON -DBUILD_SHARED_LIBS=ON -DgRPC_BUILD_TESTS=OFF -DCMAKE_INSTALL_PREFIX=/usr/grpc -DCMAKE_CXX_STANDARD=14 -DCMAKE_SHARED_LINKER_FLAGS_INIT=-Wl,--enable-new-dtags,--build-id=sha1,--rpath,'$ORIGIN' ../.. make -j$(nproc) make install cd / @@ -120,11 +119,11 @@ mv amd-blis-mt /usr/blis cd / rm -rf /tmp/blis -## Download aocl-linux-aocc-4.0_1_amd64.deb +## Download aocl-linux-gcc-4.2.0_1_amd64.deb mkdir -p /tmp/aocl cd /tmp/aocl -wget -nv https://download.amd.com/developer/eula/aocl/aocl-4-0/aocl-linux-aocc-4.0_1_amd64.deb -apt install ./aocl-linux-aocc-4.0_1_amd64.deb +wget -nv https://download.amd.com/developer/eula/aocl/aocl-4-2/aocl-linux-gcc-4.2.0_1_amd64.deb +apt install ./aocl-linux-gcc-4.2.0_1_amd64.deb rm -rf /tmp/aocl ## lapack(3.9.1v) @@ -180,8 +179,7 @@ cd ninja-1.11.1.g95dee.kitware.jobserver-1 cp ninja /usr/local/bin/ rm -rf /tmp/ninja -# Install pre-built FFmpeg and dependencies -# See docker/build-deps for instructions on how to build these packages +# Install FFmpeg from source wget -qO- https://www.ffmpeg.org/releases/ffmpeg-4.4.2.tar.gz | tar -xzv -C /usr/local command -v lbzip2 diff --git a/tools/rocm-build/docker/ubuntu20/packages b/tools/rocm-build/docker/ubuntu20/packages index bdcc0d3e3..dc42f2d90 100644 --- a/tools/rocm-build/docker/ubuntu20/packages +++ b/tools/rocm-build/docker/ubuntu20/packages @@ -8,6 +8,7 @@ bison bridge-utils build-essential bzip2 +ccache check chrpath cifs-utils @@ -97,6 +98,7 @@ libva-dev libvirt-clients libvirt-daemon-system libyaml-cpp-dev +libzstd-dev llvm llvm-6.0-dev llvm-dev @@ -119,9 +121,11 @@ python3-yaml python3.8-dev re2c redis-tools +# Eventually we should be able to remove rpm for debian builds. 
rpm rsync ssh +# This makes life more pleasant inside the container strace sudo systemtap-sdt-dev diff --git a/tools/rocm-build/docker/ubuntu22/install-prerequisities.sh b/tools/rocm-build/docker/ubuntu22/install-prerequisities.sh index 3ea039bc4..16127d846 100644 --- a/tools/rocm-build/docker/ubuntu22/install-prerequisities.sh +++ b/tools/rocm-build/docker/ubuntu22/install-prerequisities.sh @@ -1,6 +1,6 @@ #! /usr/bin/bash -set -ex +set -x apt-get -y update DEBIAN_FRONTEND=noninteractive DEBCONF_NONINTERACTIVE_SEEN=true apt-get install --no-install-recommends -y $(sed 's/#.*//' /tmp/packages) @@ -60,7 +60,6 @@ apt install -y sharp apt clean rm -rf /var/cache/apt/ /var/lib/apt/lists/* mlnx /etc/apt/sources.list.d/sharp.list - apt update apt -y install libunwind-dev apt -y install libgoogle-glog-dev @@ -118,12 +117,12 @@ git clone --recurse-submodules -b v1.61.0 https://github.com/grpc/grpc cd grpc mkdir -p build cd build -cmake -DgRPC_INSTALL=ON -DBUILD_SHARED_LIBS=ON -DgRPC_BUILD_TESTS=OFF -DCMAKE_INSTALL_PREFIX=/usr/grpc -DCMAKE_BUILD_TYPE=Release -DCMAKE_CXX_STANDARD=14 .. +cmake -DgRPC_INSTALL=ON -DBUILD_SHARED_LIBS=ON -DgRPC_BUILD_TESTS=OFF -DCMAKE_INSTALL_PREFIX=/usr/grpc -DCMAKE_BUILD_TYPE=Release -DCMAKE_CXX_STANDARD=14 -DCMAKE_SHARED_LINKER_FLAGS_INIT=-Wl,--enable-new-dtags,--build-id=sha1,--rpath,'$ORIGIN' .. 
make -j $(nproc) install rm -rf /tmp/grpc ## rocBLAS Pre-requisites -## Download prebuilt AMD multithreaded (2.0) +## Download prebuilt AMD multithreaded blis (2.0) ## Reference : https://github.com/ROCmSoftwarePlatform/rocBLAS/blob/develop/install.sh#L403 mkdir -p /tmp/blis cd /tmp/blis @@ -131,12 +130,13 @@ wget -O - https://github.com/amd/blis/releases/download/2.0/aocl-blis-mt-ubuntu- mv amd-blis-mt /usr/blis cd / rm -rf /tmp/blis + ## rocBLAS Pre-requisites(SWDEV-404612) -## Download aocl-linux-aocc-4.0_1_amd64 +## Download aocl-linux-gcc-4.2.0_1_amd64.deb mkdir -p /tmp/aocl cd /tmp/aocl -wget -nv https://download.amd.com/developer/eula/aocl/aocl-4-0/aocl-linux-aocc-4.0_1_amd64.deb -apt install ./aocl-linux-aocc-4.0_1_amd64.deb +wget -nv https://download.amd.com/developer/eula/aocl/aocl-4-2/aocl-linux-gcc-4.2.0_1_amd64.deb +apt install ./aocl-linux-gcc-4.2.0_1_amd64.deb rm -rf /tmp/aocl ## hipBLAS Pre-requisites @@ -194,9 +194,62 @@ cd ninja-1.11.1.g95dee.kitware.jobserver-1 cp ninja /usr/local/bin/ rm -rf /tmp/ninja -# Install pre-built FFmpeg and dependencies -# See docker/build-deps for instructions on how to build these packages -wget -qO- https://www.ffmpeg.org/releases/ffmpeg-4.4.2.tar.gz | tar -xzv -C /usr/local +# Install FFmpeg and dependencies +# Build NASM +mkdir -p /tmp/nasm-2.15.05 +cd /tmp +wget -qO- "https://distfiles.macports.org/nasm/nasm-2.15.05.tar.bz2" | tar -xvj +cd nasm-2.15.05 +./autogen.sh +./configure --prefix="/usr/local" +make -j$(nproc) install +rm -rf /tmp/nasm-2.15.05 + +# Build YASM +mkdir -p /tmp/yasm-1.3.0 +cd /tmp +wget -qO- "http://www.tortall.net/projects/yasm/releases/yasm-1.3.0.tar.gz" | tar -xvz +cd yasm-1.3.0 +./configure --prefix="/usr/local" +make -j$(nproc) install +rm -rf /tmp/yasm-1.3.0 + +# Build x264 +mkdir -p /tmp/x264-snapshot-20191217-2245-stable +cd /tmp +wget -qO- "https://download.videolan.org/pub/videolan/x264/snapshots/x264-snapshot-20191217-2245-stable.tar.bz2" | tar -xvj +cd 
/tmp/x264-snapshot-20191217-2245-stable +PKG_CONFIG_PATH="/usr/local/lib/pkgconfig" ./configure --prefix="/usr/local" --enable-shared +make -j$(nproc) install +rm -rf /tmp/x264-snapshot-20191217-2245-stable + +# Build x265 +mkdir -p /tmp/x265_2.7 +cd /tmp +wget -qO- "https://get.videolan.org/x265/x265_2.7.tar.gz" | tar -xvz +cd /tmp/x265_2.7/build/linux +cmake -G "Unix Makefiles" -DCMAKE_INSTALL_PREFIX="/usr/local" -DENABLE_SHARED:bool=on ../../source +make -j$(nproc) install +rm -rf /tmp/x265_2.7 + +# Build fdk-aac +mkdir -p /tmp/fdk-aac-2.0.2 +cd /tmp +wget -qO- "https://sourceforge.net/projects/opencore-amr/files/fdk-aac/fdk-aac-2.0.2.tar.gz" | tar -xvz +cd /tmp/fdk-aac-2.0.2 +autoreconf -fiv +./configure --prefix="/usr/local" --enable-shared --disable-static +make -j$(nproc) install +rm -rf /tmp/fdk-aac-2.0.2 + +# Build FFmpeg +cd /tmp +git clone -b release/4.4 https://git.ffmpeg.org/ffmpeg.git ffmpeg +cd ffmpeg +PKG_CONFIG_PATH="/usr/local/lib/pkgconfig" +./configure --prefix="/usr/local" --extra-cflags="-I/usr/local/include" --extra-ldflags="-L/usr/local/lib" --extra-libs=-lpthread --extra-libs=-lm --enable-shared --disable-static --enable-libx264 --enable-libx265 --enable-libfdk-aac --enable-gpl --enable-nonfree +make -j$(nproc) install +rm -rf /tmp/ffmpeg cp /tmp/local-pin-600 /etc/apt/preferences.d @@ -212,21 +265,21 @@ make -j -C build cd /tmp/Gbenchmark/build make install -# Build boost-1.82.0 from source for RPP +# Build boost-1.85.0 from source for RPP # Installing in a non-standard location since the test packages of hipFFT and rocFFT pick up the version of # the installed Boost library and declare a package dependency on that specific version of Boost. -# For example, if this was installed in the standard location it would declare a dependency on libboost-dev(el)1.82.0 +# For example, if this was installed in the standard location it would declare a dependency on libboost-dev(el)1.85.0 # which is not available as a package in any distro. 
# Once this is fixed, we can remove the Boost package from the requirements list and install this # in the standard location -mkdir -p /tmp/boost-1.82.0 -cd /tmp/boost-1.82.0 -wget -nv https://sourceforge.net/projects/boost/files/boost/1.82.0/boost_1_82_0.tar.bz2 -O ./boost_1_82_0.tar.bz2 -tar -xf boost_1_82_0.tar.bz2 --use-compress-program="/usr/local/bin/compressor" -cd boost_1_82_0 +mkdir -p /tmp/boost-1.85.0 +cd /tmp/boost-1.85.0 +wget -nv https://sourceforge.net/projects/boost/files/boost/1.85.0/boost_1_85_0.tar.bz2 -O ./boost_1_85_0.tar.bz2 +tar -xf boost_1_85_0.tar.bz2 --use-compress-program="/usr/local/bin/compressor" +cd boost_1_85_0 ./bootstrap.sh --prefix=${RPP_DEPS_LOCATION} --with-python=python3 ./b2 stage -j$(nproc) threading=multi link=shared cxxflags="-std=c++11" ./b2 install threading=multi link=shared --with-system --with-filesystem ./b2 stage -j$(nproc) threading=multi link=static cxxflags="-std=c++11 -fpic" cflags="-fpic" ./b2 install threading=multi link=static --with-system --with-filesystem -rm -rf /tmp/boost-1.82.0 +rm -rf /tmp/boost-1.85.0 diff --git a/tools/rocm-build/docker/ubuntu22/packages b/tools/rocm-build/docker/ubuntu22/packages index 56a22631a..72483d363 100644 --- a/tools/rocm-build/docker/ubuntu22/packages +++ b/tools/rocm-build/docker/ubuntu22/packages @@ -8,6 +8,7 @@ bison bridge-utils build-essential bzip2 +ccache check chrpath cifs-utils @@ -99,6 +100,7 @@ libva-dev libvirt-clients libvirt-daemon-system libyaml-cpp-dev +libzstd-dev llvm llvm-dev llvm-runtime @@ -137,3 +139,4 @@ unzip vim wget xsltproc +zlib1g-dev diff --git a/tools/rocm-build/docker/ubuntu24/Dockerfile b/tools/rocm-build/docker/ubuntu24/Dockerfile new file mode 100644 index 000000000..d23d420a1 --- /dev/null +++ b/tools/rocm-build/docker/ubuntu24/Dockerfile @@ -0,0 +1,11 @@ +FROM ubuntu:noble as builder +COPY packages /tmp/packages +COPY local-pin-600 /tmp/local-pin-600 +COPY install-prerequisities.sh /tmp/install-prerequisities.sh +RUN chmod +x 
/tmp/install-prerequisities.sh +ENV KBUILD_PKG_ROOTCMD= +ENV RPP_DEPS_LOCATION=/usr/local/rpp-deps +ENV PATH="/opt/venv/bin:$PATH" +ENV PATH=$PATH:"/usr/local/bin" +RUN /tmp/install-prerequisities.sh +WORKDIR /src diff --git a/tools/rocm-build/docker/ubuntu24/README.md b/tools/rocm-build/docker/ubuntu24/README.md new file mode 100644 index 000000000..c7a709f8b --- /dev/null +++ b/tools/rocm-build/docker/ubuntu24/README.md @@ -0,0 +1,27 @@ +## Steps to build the Docker Image + +1. Clone this repository + + ```bash + git clone https://github.com/ROCm/rocm-build.git + ``` + +2. Go into the OS-specific docker directory in build-infra + + ```bash + cd rocm-build/build/docker/ubuntu24 + ``` + +3. Build the docker image + + ```bash + docker build -t . + ``` + + Replace the `` with the new Docker image name of your choice. + +4. After a successful build, verify your \ in the list of all available docker images. + + ```bash + docker images + ``` diff --git a/tools/rocm-build/docker/ubuntu24/install-prerequisites.sh b/tools/rocm-build/docker/ubuntu24/install-prerequisites.sh new file mode 100644 index 000000000..4898a7792 --- /dev/null +++ b/tools/rocm-build/docker/ubuntu24/install-prerequisites.sh @@ -0,0 +1,237 @@ + +#! /usr/bin/bash +set -ex + +# The following assumes that you have a cache, e.g. 
+# https://docs.docker.com/engine/examples/apt-cacher-ng/ +# Comment out if it breaks things +echo 'Acquire::http { Proxy "http://rocm-ci-services.amd.com:3142"; };' > /etc/apt/apt.conf.d/01proxy + +apt-get update +DEBIAN_FRONTEND=noninteractive DEBCONF_NONINTERACTIVE_SEEN=true apt-get install --no-install-recommends -y $(sed 's/#.*//' /tmp/packages) +update-ccache-symlinks +apt-get upgrade +apt-get clean +rm -rf /var/cache/apt/ /var/lib/apt/lists/* /etc/apt/apt.conf.d/01proxy + +#Install 2.17.1 version of git as we are seeing issues with 2.25 , where it was not allowing to add git submodules if the user is different for parent git directory +curl -o git.tar.gz https://cdn.kernel.org/pub/software/scm/git/git-2.17.1.tar.gz +tar -zxf git.tar.gz +cd git-* +make prefix=/usr/local all +make prefix=/usr/local install +git --version + +# venv for python to be able to run pip3 without --break-system-packages +python3 -m venv /opt/venv + +pip3 install --no-cache-dir setuptools wheel tox +pip3 install --no-cache-dir --pre CppHeaderParser argparse requests lxml barectf recommonmark jinja2==3.0.0 websockets matplotlib numpy scipy minimal msgpack pytest sphinx joblib PyYAML==5.3.1 rocm-docs-core cmake==3.25.2 pandas myst-parser + +# Allow sudo for everyone user +echo 'ALL ALL=(ALL) NOPASSWD:ALL' > /etc/sudoers.d/everyone + +# Install OCaml packages to build LLVM's OCaml bindings to be used in lightning compiler test pipeline +wget -nv https://sourceforge.net/projects/opam.mirror/files/2.1.4/opam-2.1.4-x86_64-linux -O /usr/local/bin/opam +chmod +x /usr/local/bin/opam +opam init --yes --disable-sandboxing +opam install ctypes --yes + +# Install and modify git-repo (#!/usr/bin/env python -> #!/usr/bin/env python3) +curl https://storage.googleapis.com/git-repo-downloads/repo > /usr/bin/repo +chmod a+x /usr/bin/repo + +# Build ccache from the source +cd /tmp +git clone https://github.com/ccache/ccache -b v4.7.5 +cd ccache +mkdir build +cd build +cmake -DCMAKE_BUILD_TYPE=Release .. 
+make +make install +cd /tmp +rm -rf ccache + +#Install older version of hwloc-devel package for rocrtst +curl -lO https://download.open-mpi.org/release/hwloc/v1.11/hwloc-1.11.13.tar.bz2 +tar -xvf hwloc-1.11.13.tar.bz2 +cd hwloc-1.11.13 +./configure +make +make install +cp /usr/local/lib/libhwloc.so.5 /usr/lib +hwloc-info --version + +# Install gtest +mkdir -p /tmp/gtest +cd /tmp/gtest +wget https://github.com/google/googletest/archive/refs/tags/v1.14.0.zip -O googletest.zip +unzip googletest.zip +cd googletest-1.14.0/ +mkdir build +cd build +cmake .. +make -j$(nproc) +make install +rm -rf /tmp/gtest + +## Install gRPC from source +## RDC Pre-requisites +GRPC_ARCHIVE=grpc-1.61.0.tar.gz +mkdir /tmp/grpc +mkdir /usr/grpc +cd /tmp +git clone --recurse-submodules -b v1.61.0 https://github.com/grpc/grpc +cd grpc +mkdir -p build +cd build +cmake -DgRPC_INSTALL=ON -DBUILD_SHARED_LIBS=ON -DgRPC_BUILD_TESTS=OFF -DCMAKE_INSTALL_PREFIX=/usr/grpc -DCMAKE_BUILD_TYPE=Release -DCMAKE_CXX_STANDARD=14 -DCMAKE_SHARED_LINKER_FLAGS_INIT=-Wl,--enable-new-dtags,--build-id=sha1,--rpath,'$ORIGIN' .. 
+make -j $(nproc) install +rm -rf /tmp/grpc + +## rocBLAS Pre-requisites(ROCMOPS-3856) +## Download prebuilt AMD multithreaded blis (2.0) +## Reference : https://github.com/ROCmSoftwarePlatform/rocBLAS/blob/develop/install.sh#L403 +mkdir -p /tmp/blis +cd /tmp/blis +wget -O - https://github.com/amd/blis/releases/download/2.0/aocl-blis-mt-ubuntu-2.0.tar.gz | tar xfz - +mv amd-blis-mt /usr/blis +cd / +rm -rf /tmp/blis + +## rocBLAS Pre-requisites(SWDEV-404612) +## Download aocl-linux-gcc-4.2.0_1_amd64.deb +mkdir -p /tmp/aocl +cd /tmp/aocl +wget -nv https://download.amd.com/developer/eula/aocl/aocl-4-2/aocl-linux-gcc-4.2.0_1_amd64.deb +apt install ./aocl-linux-gcc-4.2.0_1_amd64.deb +rm -rf /tmp/aocl + +## hipBLAS Pre-requisites +## lapack(3.9.1v) +## Reference https://github.com/ROCmSoftwarePlatform/rocSOLVER/blob/develop/install.sh#L174 +lapack_version=3.9.1 +lapack_srcdir=lapack-$lapack_version +lapack_blddir=lapack-$lapack_version-bld +mkdir -p /tmp/lapack +cd /tmp/lapack +rm -rf "$lapack_srcdir" "$lapack_blddir" +wget -O - https://github.com/Reference-LAPACK/lapack/archive/refs/tags/v3.9.1.tar.gz | tar xzf - +cmake -H$lapack_srcdir -B$lapack_blddir -DCMAKE_BUILD_TYPE=Release -DCMAKE_Fortran_FLAGS=-fno-optimize-sibling-calls -DBUILD_TESTING=OFF -DCBLAS=ON -DLAPACKE=OFF +make -j$(nproc) -C "$lapack_blddir" +make -C "$lapack_blddir" install +cd $lapack_blddir +cp -r ./include/* /usr/local/include/ +cp -r ./lib/* /usr/local/lib +cd / +rm -rf /tmp/lapack + +## rocSOLVER Pre-requisites +## FMT(7.1.3v) +## Reference https://github.com/ROCmSoftwarePlatform/rocSOLVER/blob/develop/install.sh#L152 +fmt_version=7.1.3 +fmt_srcdir=fmt-$fmt_version +fmt_blddir=fmt-$fmt_version-bld +mkdir -p /tmp/fmt +cd /tmp/fmt +rm -rf "$fmt_srcdir" "$fmt_blddir" +wget -O - https://github.com/fmtlib/fmt/archive/refs/tags/7.1.3.tar.gz | tar xzf - +cmake -H$fmt_srcdir -B$fmt_blddir -DCMAKE_BUILD_TYPE=Release -DCMAKE_POSITION_INDEPENDENT_CODE=ON -DCMAKE_CXX_STANDARD=17 -DCMAKE_CXX_EXTENSIONS=OFF 
-DCMAKE_CXX_STANDARD_REQUIRED=ON -DFMT_DOC=OFF -DFMT_TEST=OFF +make -j$(nproc) -C "$fmt_blddir" +make -C "$fmt_blddir" install + +# Build and install libjpeg-turbo +mkdir -p /tmp/libjpeg-turbo +cd /tmp/libjpeg-turbo +wget -nv https://github.com/rrawther/libjpeg-turbo/archive/refs/heads/2.0.6.2.zip -O libjpeg-turbo-2.0.6.2.zip +unzip libjpeg-turbo-2.0.6.2.zip +cd libjpeg-turbo-2.0.6.2 +mkdir build +cd build +cmake -DCMAKE_INSTALL_PREFIX=/usr -DCMAKE_BUILD_TYPE=RELEASE -DENABLE_STATIC=FALSE -DCMAKE_INSTALL_DEFAULT_LIBDIR=lib .. +make -j$(nproc) install +rm -rf /tmp/libjpeg-turbo + +# Get released ninja from source +mkdir -p /tmp/ninja +cd /tmp/ninja +wget -nv https://codeload.github.com/Kitware/ninja/zip/refs/tags/v1.11.1.g95dee.kitware.jobserver-1 -O ninja.zip +unzip ninja.zip +cd ninja-1.11.1.g95dee.kitware.jobserver-1 +./configure.py --bootstrap +cp ninja /usr/local/bin/ +rm -rf /tmp/ninja + +# Install FFmpeg and dependencies +# Build NASM +mkdir -p /tmp/nasm-2.15.05 +cd /tmp +wget -qO- "https://distfiles.macports.org/nasm/nasm-2.15.05.tar.bz2" | tar -xvj +cd nasm-2.15.05 +./autogen.sh +./configure --prefix="/usr/local" +make -j$(nproc) install +rm -rf /tmp/nasm-2.15.05 + +# Build YASM +mkdir -p /tmp/yasm-1.3.0 +cd /tmp +wget -qO- "http://www.tortall.net/projects/yasm/releases/yasm-1.3.0.tar.gz" | tar -xvz +cd yasm-1.3.0 +./configure --prefix="/usr/local" +make -j$(nproc) install +rm -rf /tmp/yasm-1.3.0 + +# Build x264 +mkdir -p /tmp/x264-snapshot-20191217-2245-stable +cd /tmp +wget -qO- "https://download.videolan.org/pub/videolan/x264/snapshots/x264-snapshot-20191217-2245-stable.tar.bz2" | tar -xvj +cd /tmp/x264-snapshot-20191217-2245-stable +PKG_CONFIG_PATH="/usr/local/lib/pkgconfig" ./configure --prefix="/usr/local" --enable-shared +make -j$(nproc) install +rm -rf /tmp/x264-snapshot-20191217-2245-stable + +# Build x265 +mkdir -p /tmp/x265_2.7 +cd /tmp +wget -qO- "https://get.videolan.org/x265/x265_2.7.tar.gz" | tar -xvz +cd /tmp/x265_2.7/build/linux +cmake -G 
"Unix Makefiles" -DCMAKE_INSTALL_PREFIX="/usr/local" -DENABLE_SHARED:bool=on ../../source +make -j$(nproc) install +rm -rf /tmp/x265_2.7 + + +# Build fdk-aac +mkdir -p /tmp/fdk-aac-2.0.2 +cd /tmp +wget -qO- "https://sourceforge.net/projects/opencore-amr/files/fdk-aac/fdk-aac-2.0.2.tar.gz" | tar -xvz +cd /tmp/fdk-aac-2.0.2 +autoreconf -fiv +./configure --prefix="/usr/local" --enable-shared --disable-static +make -j$(nproc) install +rm -rf /tmp/fdk-aac-2.0.2 + +# Build FFmpeg +cd /tmp +rm -rf ffmpeg +git clone -b release/4.4 https://git.ffmpeg.org/ffmpeg.git ffmpeg +cd ffmpeg +PKG_CONFIG_PATH="/usr/local/lib/pkgconfig" +./configure --prefix="/usr/local" --extra-cflags="-I/usr/local/include" --extra-ldflags="-L/usr/local/lib" --extra-libs=-lpthread --extra-libs=-lm --enable-shared --disable-static --enable-libx264 --enable-libx265 --enable-libfdk-aac --enable-gpl --enable-nonfree +make -j$(nproc) install +rm -rf /tmp/ffmpeg + +cp /tmp/local-pin-600 /etc/apt/preferences.d + +command -v lbzip2 +ln -sf $(command -v lbzip2) /usr/local/bin/compressor || ln -sf $(command -v bzip2) /usr/local/bin/compressor + +# Install Google Benchmark (ROCMOPS-5283) +mkdir -p /tmp/Gbenchmark +cd /tmp/Gbenchmark +wget -qO- https://github.com/google/benchmark/archive/refs/tags/v1.6.1.tar.gz | tar xz +cmake -Sbenchmark-1.6.1 -Bbuild -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=OFF -DBENCHMARK_ENABLE_TESTING=OFF -DCMAKE_CXX_STANDARD=14 +make -j -C build +cd /tmp/Gbenchmark/build +make install diff --git a/tools/rocm-build/docker/ubuntu24/local-pin-60 b/tools/rocm-build/docker/ubuntu24/local-pin-60 new file mode 100644 index 000000000..8cfa54d34 --- /dev/null +++ b/tools/rocm-build/docker/ubuntu24/local-pin-60 @@ -0,0 +1,3 @@ +Package: * +Pin: origin "" +Pin-Priority: 600 diff --git a/tools/rocm-build/docker/ubuntu24/packages b/tools/rocm-build/docker/ubuntu24/packages new file mode 100644 index 000000000..346335a15 --- /dev/null +++ b/tools/rocm-build/docker/ubuntu24/packages @@ -0,0 +1,140 
@@ +# List of packages needed for stage1 build +apt-utils +aria2 +autoconf +automake +bc +bison +bridge-utils +build-essential +bzip2 +ccache +check +chrpath +cifs-utils +cmake +cpio +curl +devscripts +dos2unix +doxygen +fakeroot +flex +freeglut3-dev +g++ +g++-multilib +# gawk is needed for aomp +gawk +gcc +gcc-multilib +gfortran +git-lfs +gnupg +googletest +graphviz +kernel-wedge +# kmod for kernel build +kmod +lbzip2 +# less is needed by repo +less +libass-dev +libatlas-base-dev +libbabeltrace-dev +libboost-all-dev +libboost-dev +libboost-filesystem-dev +libboost-program-options-dev +libboost-system-dev +libbz2-dev +libc++-dev +libc++1 +libc++abi-dev +libc++abi1 +libc6-dev-i386 +libcap-dev +libcurl4-openssl-dev +libdrm-dev +libdw-dev +libdw1 +libdwarf-dev +libelf-dev +libelf1 +libexpat1-dev +libfftw3-dev +libfile-find-rule-perl +libgflags-dev +libglew-dev +libgmp-dev +libgoogle-glog-dev +libgtk2.0-dev +libhdf5-dev +libjpeg-dev +libleveldb-dev +liblmdb-dev +liblzma-dev +libmpfr-dev +libmpich-dev +libmsgpack-dev +libncurses-dev +libnuma-dev +libomp-dev +libopenblas-dev +libpci-dev +libpci3 +libpciaccess-dev +libpciaccess0 +libprotobuf-dev +libpython3-dev +libreadline-dev +libsnappy-dev +libssl-dev +libsuitesparse-dev +libsystemd-dev +libtool +liburi-encode-perl +libva-dev +libvirt-clients +libvirt-daemon-system +libyaml-cpp-dev +llvm +llvm-dev +llvm-runtime +mesa-common-dev +mpich +ocaml +ocaml-findlib +patchelf +pigz +pkg-config +protobuf-compiler +python-is-python3 +python3-barectf +python3-dev +python3-pip +python3-pip-whl +python3-requests +python3-venv +python3-yaml +python3-yaml +re2c +redis-tools +# hipclang needs rpm +rpm +rsync +ssh +# This makes life more pleasant inside the container +strace +sudo +systemtap-sdt-dev +texinfo +texlive +texlive-extra-utils +texlive-plain-generic +texlive-xetex +unzip +vim +wget +xsltproc +xxd +zlib1g-dev diff --git a/tools/rocm-build/envsetup.sh b/tools/rocm-build/envsetup.sh index 9cf30190d..2d1d27e4b 100755 --- 
a/tools/rocm-build/envsetup.sh +++ b/tools/rocm-build/envsetup.sh @@ -123,9 +123,12 @@ if [ -d "$HSA_OPENSOURCE_ROOT/ROCT-Thunk-Interface" ]; then export THUNK_ROOT=$HSA_OPENSOURCE_ROOT/ROCT-Thunk-Interface fi export AQLPROFILE_ROOT=$WORK_ROOT/hsa/aqlprofile +export OMNIPERF_ROOT=$WORK_ROOT/omniperf export ROCPROFILER_ROOT=$WORK_ROOT/rocprofiler export ROCTRACER_ROOT=$WORK_ROOT/roctracer export ROCPROFILER_REGISTER_ROOT=$WORK_ROOT/rocprofiler-register +export ROCPROFILER_SDK_ROOT=$WORK_ROOT/rocprofiler-sdk +export OMNITRACE_ROOT=$WORK_ROOT/omnitrace export RDC_ROOT=$WORK_ROOT/rdc export RDCTST_ROOT=$RDC_ROOT/tests/rdc_tests export UTILS_ROOT=$WORK_ROOT/rocm-utils @@ -147,7 +150,6 @@ export ROCM_CORE_ROOT=$WORK_ROOT/rocm-core export ROCM_CMAKE_ROOT=$WORK_ROOT/rocm-cmake export ROCM_BANDWIDTH_TEST_ROOT=$WORK_ROOT/rocm_bandwidth_test export ROCMINFO_ROOT=$WORK_ROOT/rocminfo -export CLANG_OCL_ROOT=$WORK_ROOT/clang-ocl export ROCR_DEBUG_AGENT_ROOT=$WORK_ROOT/rocr_debug_agent export COMGR_ROOT=$LLVM_PROJECT_ROOT/amd/comgr export COMGR_LIB_PATH=$OUT_DIR/build/amd_comgr @@ -179,7 +181,7 @@ export BUILD_ARTIFACTS=$OUT_DIR/$PACKAGEEXT export HIPCC_COMPILE_FLAGS_APPEND="-O3 -Wno-format-nonliteral -parallel-jobs=4" export HIPCC_LINK_FLAGS_APPEND="-O3 -parallel-jobs=4" -export PATH="${ROCM_PATH}/lib/llvm/bin:${PATH}" +export PATH="${ROCM_PATH}/bin:${ROCM_PATH}/lib/llvm/bin:${PATH}" export LC_ALL=C.UTF-8 export LANG=C.UTF-8 diff --git a/tools/rocm-build/rocm-6.2.0.xml b/tools/rocm-build/rocm-6.2.0.xml index 981598b67..d9bdc9629 100644 --- a/tools/rocm-build/rocm-6.2.0.xml +++ b/tools/rocm-build/rocm-6.2.0.xml @@ -6,30 +6,29 @@ sync-c="true" sync-j="4" /> + + + + - + - - - - - - - - - + + + + - - - + + + @@ -72,5 +71,4 @@ - - + \ No newline at end of file diff --git a/tools/rocm-build/rocm-6.1.1.xml b/tools/rocm-build/rocm-6.2.1.xml similarity index 89% rename from tools/rocm-build/rocm-6.1.1.xml rename to tools/rocm-build/rocm-6.2.1.xml index 868fabfd9..e70601529 100644 --- 
a/tools/rocm-build/rocm-6.1.1.xml +++ b/tools/rocm-build/rocm-6.2.1.xml @@ -1,71 +1,74 @@ - - + - + + + + + - - + - - + - - + - + - - - - - - - - - - - - - - - - - - - - + + + + + + + + + + + + + + + - + + + + + + + + - + \ No newline at end of file diff --git a/tools/rocm-build/rocm-6.1.0.xml b/tools/rocm-build/rocm-6.2.2.xml similarity index 90% rename from tools/rocm-build/rocm-6.1.0.xml rename to tools/rocm-build/rocm-6.2.2.xml index 83f1b0e41..b63a304aa 100644 --- a/tools/rocm-build/rocm-6.1.0.xml +++ b/tools/rocm-build/rocm-6.2.2.xml @@ -1,68 +1,71 @@ - - + - + + + + + - - + - - + - - + - + - - - - - - - - - - - - - - - - - - - - + + + + + + + + + + + + + + + - + + + + + + + + diff --git a/tools/rocm-build/rocm-6.1.2.xml b/tools/rocm-build/rocm-6.2.4.xml similarity index 85% rename from tools/rocm-build/rocm-6.1.2.xml rename to tools/rocm-build/rocm-6.2.4.xml index 94dedfa3c..7ed58a463 100644 --- a/tools/rocm-build/rocm-6.1.2.xml +++ b/tools/rocm-build/rocm-6.2.4.xml @@ -1,71 +1,75 @@ - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + \ No newline at end of file
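Reviewer note on the `compute_helper.sh` change in this patch: the build scripts switch from `$(rocm_common_cmake_params)` to `"${rocm_math_common_cmake_params[@]}"`, filled by `init_rocm_common_cmake_params`. The bash sketch below illustrates one likely motivation, assuming the old helper printed its flags to stdout: command substitution re-splits a flag that contains spaces (such as `-DCMAKE_CXX_FLAGS_RELWITHDEBINFO=-O3 -g -DNDEBUG`), while a quoted array expansion keeps it as one argument. The names `print_params`, `init_params`, `demo_params`, and `count_args` are hypothetical and exist only for illustration; only the eval-based copy-by-name mirrors the patch.

```shell
#!/usr/bin/env bash
# Hypothetical demo; requires bash (arrays), not plain sh.

# Old style (assumed): the helper prints its flags, and the caller
# uses $(...), so the shell re-splits the output on whitespace.
print_params() {
    printf '%s ' "-DCMAKE_BUILD_TYPE=RelWithDebInfo" \
                 "-DCMAKE_CXX_FLAGS_RELWITHDEBINFO=-O3 -g -DNDEBUG"
}

# New style: the helper fills an array whose name is passed as $1
# (defaulting to demo_params), one element per flag.
demo_params=()
init_params() {
    local ret_name=${1:-demo_params}
    local params=(
        "-DCMAKE_BUILD_TYPE=RelWithDebInfo"
        "-DCMAKE_CXX_FLAGS_RELWITHDEBINFO=-O3 -g -DNDEBUG"
    )
    # Copy the local array into the caller-named array, as the patch does.
    eval "${ret_name}=( \"\${params[@]}\" )"
}

# Reports how many arguments it received.
count_args() { echo "$#"; }

init_params
split_count=$(count_args $(print_params))      # unquoted on purpose
array_count=$(count_args "${demo_params[@]}")
echo "command substitution: ${split_count} arguments"
echo "array expansion:      ${array_count} arguments"
```

With the array form, each element reaches `cmake` as exactly one argument, which is why the call sites in the patch quote the expansion. Passing a name, for example `init_params other_params`, fills a differently named array in the same way.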