diff --git a/CHANGELOG.md b/CHANGELOG.md index 270b11ccf..23399f488 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -242,7 +242,9 @@ HIP 6.0.0 for ROCm 6.0.0 * `char luid[8];` * `unsigned int luidDeviceNodeMask;` -Note: HIP supports LUID only on Windows OS. +:::{note} +HIP supports LUID only on Windows OS. +::: ##### Changes @@ -279,7 +281,10 @@ Note: HIP supports LUID only on Windows OS. * HIP complex vector type multiplication and division operations. On AMD platform, some duplicated complex operators are removed to avoid compilation failures. In HIP, `hipFloatComplex` and `hipDoubleComplex` are defined as complex data types: `typedef float2 hipFloatComplex; typedef double2 hipDoubleComplex;` Any application that uses complex multiplication and division operations needs to replace '*' and '/' operators with the following: * `hipCmulf()` and `hipCdivf()` for `hipFloatComplex` * `hipCmul()` and `hipCdiv()` for `hipDoubleComplex` -Note: These complex operations are equivalent to corresponding types/functions on NVIDIA platform. + + :::{note} + These complex operations are equivalent to corresponding types/functions on NVIDIA platform. + ::: ##### Removals @@ -1010,11 +1015,11 @@ New features include: Note that ROCm 5.7.0 is EOS for MI50. 5.7 versions of ROCm are the last major releases in the ROCm 5 series. This release is Linux-only. -```important +:::{important} The next major ROCm release (ROCm 6.0) will not be backward compatible with the ROCm 5 series. Changes will include: splitting LLVM packages into more manageable sizes, changes to the HIP runtime API, splitting rocRAND and hipRAND into separate packages, and reorganizing our file structure. -``` +::: #### AMD Instinct™ MI50 end-of-support notice @@ -1025,8 +1030,8 @@ As outlined in [5.6.0](https://rocm.docs.amd.com/en/docs-5.6.0/release.html), RO final release for gfx906 GPUs to be in a fully supported state. * ROCm 6.0 release will show MI50s as "under maintenance" for - [Linux](../about/compatibility/linux-support.md) and - [Windows](../about/compatibility/windows-support.md) + {doc}`Linux` and + {doc}`Windows` * No new features and performance optimizations will be supported for the gfx906 GPUs beyond this major release (ROCm 5.7). @@ -1060,8 +1065,10 @@ environments. Users may see the following error from runtime (with AMD_LOG_LEVEL The ROCm 5.7 release introduces an alternative to the current hostcall-based implementation that leverages an older OpenCL-based printf scheme, which does not rely on hostcalls/PCIe atomics. -Note: This option is less robust than hostcall-based implementation and is intended to be a +:::{note} +This option is less robust than hostcall-based implementation and is intended to be a workaround when hostcalls do not work. +::: The printf variant is now controlled via a new compiler option -mprintf-kind=. This is supported only for HIP programs and takes the following values, @@ -1094,11 +1101,11 @@ the GPU in heterogeneous applications. Ideally, developers should treat heteroge OpenMP applications like pure CPU applications. However, this simplicity has not been achieved yet. Refer to the documentation on LLVM ASan with the GPU at -[LLVM AddressSanitizer User Guide](../conceptual/using_gpu_sanitizer.md). +[LLVM AddressSanitizer User Guide](../conceptual/using-gpu-sanitizer.md). -```note +:::{note} The beta release of LLVM ASan for ROCm is currently tested and validated on Ubuntu 20.04. -``` +::: #### Defect fixes @@ -2089,9 +2096,9 @@ The following hipcc changes are implemented in this release: ##### New HIP APIs in this release -```note +:::{note} This is a pre-official version (beta) release of the new APIs and may contain unresolved issues. -``` +::: ###### Memory management HIP APIs @@ -2188,13 +2195,13 @@ This release consists of the following OpenMP enhancements: The `hipcc` and `hipconfig` Perl scripts are deprecated. In a future release, compiled binaries will be available as `hipcc.bin` and `hipconfig.bin` as replacements for the Perl scripts. -```note +:::{note} There will be a transition period where the Perl scripts and compiled binaries are available before the scripts are removed. There will be no functional difference between the Perl scripts and their compiled binary counterpart. No user action is required. Once these are available, users can optionally switch to `hipcc.bin` and `hipconfig.bin`. The `hipcc`/`hipconfig` soft link will be assimilated to point from `hipcc`/`hipconfig` to the respective compiled binaries as the default option. -``` +::: ##### Linux file system hierarchy standard for ROCm @@ -2234,10 +2241,10 @@ The following is the new file system hierarchy:4 ``` -```note +:::{note} ROCm will not support backward compatibility with the v5.1(old) file system hierarchy in its next major release. -``` +::: For more information, refer to . @@ -2246,9 +2253,9 @@ For more information, refer to . ROCm has moved header files and libraries to its new location as indicated in the above structure and included symbolic-link and wrapper header files in its old location for backward compatibility. -> **Note** -> -> ROCm will continue supporting backward compatibility until the next major release. +:::{note} +ROCm will continue supporting backward compatibility until the next major release. +::: ##### Wrapper header files @@ -2825,13 +2832,13 @@ Tensile 4.36.0 for ROCm 5.5.0 The `hipcc` and `hipconfig` Perl scripts are deprecated. In a future release, compiled binaries will be available as `hipcc.bin` and `hipconfig.bin` as replacements for the Perl scripts. -```note +:::{note} There will be a transition period where the Perl scripts and compiled binaries are available before the scripts are removed. There will be no functional difference between the Perl scripts and their compiled binary counterpart. No user action is required. Once these are available, users can optionally switch to `hipcc.bin` and `hipconfig.bin`. The `hipcc`/`hipconfig` soft link will be assimilated to point from `hipcc`/`hipconfig` to the respective compiled binaries as the default option. -``` +::: ##### Linux file system hierarchy standard for ROCm @@ -2875,10 +2882,10 @@ The following is the new file system hierarchy:4 ``` -```note +:::{note} ROCm will not support backward compatibility with the v5.1(old) file system hierarchy in its next major release. -``` +::: For more information, refer to . @@ -2887,9 +2894,9 @@ For more information, refer to . ROCm has moved header files and libraries to its new location as indicated in the above structure and included symbolic-link and wrapper header files in its old location for backward compatibility. -```note +:::{note} ROCm will continue supporting backward compatibility until the next major release. -``` +::: ##### Wrapper header files @@ -2999,13 +3006,13 @@ rocFFT 1.0.21 for ROCm 5.4.3 The `hipcc` and `hipconfig` Perl scripts are deprecated. In a future release, compiled binaries will be available as `hipcc.bin` and `hipconfig.bin` as replacements for the Perl scripts. -```note +:::{note} There will be a transition period where the Perl scripts and compiled binaries are available before the scripts are removed. There will be no functional difference between the Perl scripts and their compiled binary counterpart. No user action is required. Once these are available, users can optionally switch to `hipcc.bin` and `hipconfig.bin`. The `hipcc`/`hipconfig` soft link will be assimilated to point from `hipcc`/`hipconfig` to the respective compiled binaries as the default option. -``` +::: #### `hipcc` options deprecation @@ -3067,9 +3074,9 @@ The ROCm v5.4.1 release consists of the following new HIP API: The following new HIP API is introduced in the ROCm v5.4.1 release. -> **Note** -> -> This is a pre-official version (beta) release of the new APIs. +:::{note} +This is a pre-official version (beta) release of the new APIs. +::: ```cpp hipError_t hipLaunchHostFunc(hipStream_t stream, hipHostFn_t fn, void* userData); @@ -3093,13 +3100,13 @@ For more information, refer to the HIP API documentation at The `hipcc` and `hipconfig` Perl scripts are deprecated. In a future release, compiled binaries will be available as `hipcc.bin` and `hipconfig.bin` as replacements for the Perl scripts. -```note +:::{note} There will be a transition period where the Perl scripts and compiled binaries are available before the scripts are removed. There will be no functional difference between the Perl scripts and their compiled binary counterpart. No user action is required. Once these are available, users can optionally switch to `hipcc.bin` and `hipconfig.bin`. The `hipcc`/`hipconfig` soft link will be assimilated to point from `hipcc`/`hipconfig` to the respective compiled binaries as the default option. -``` +::: ### IFWI fixes @@ -3194,9 +3201,9 @@ int wallClkRate = 0; //in kilohertz Where hipDeviceAttributeWallClockRate is a device attribute. -```note +:::{note} The wall clock frequency is a per-device attribute. -``` +::: ##### New registry added for GPU_MAX_HW_QUEUES @@ -3207,10 +3214,10 @@ The environment variable controls how many independent hardware queues HIP runti per process, per device. If the application allocates more HIP streams than this number, then the HIP runtime reuses the same hardware queues for the new streams in a round-robin manner. -```note +:::{note} This maximum number does not apply to hardware queues created for CU-masked HIP streams or cooperative queues for HIP Cooperative Groups (there is only one queue per device). -``` +::: For more details, refer to the HIP Programming Guide. @@ -3218,9 +3225,9 @@ For more details, refer to the HIP Programming Guide. The following new HIP APIs are available in the ROCm v5.4 release. -> **Note** -> -> This is a pre-official version (beta) release of the new APIs. +:::{note} +This is a pre-official version (beta) release of the new APIs. +::: ##### Error handling @@ -3266,13 +3273,13 @@ This release consists of the following OpenMP enhancements: The `hipcc` and `hipconfig` Perl scripts are deprecated. In a future release, compiled binaries will be available as `hipcc.bin` and `hipconfig.bin` as replacements for the Perl scripts. -```note +:::{note} There will be a transition period where the Perl scripts and compiled binaries are available before the scripts are removed. There will be no functional difference between the Perl scripts and their compiled binary counterpart. No user action is required. Once these are available, users can optionally switch to `hipcc.bin` and `hipconfig.bin`. The `hipcc`/`hipconfig` soft link will be assimilated to point from `hipcc`/`hipconfig` to the respective compiled binaries as the default option. -``` +::: ##### Linux file system hierarchy standard for ROCm @@ -3316,10 +3323,10 @@ The following is the new file system hierarchy: ``` -```note +:::{note} ROCm will not support backward compatibility with the v5.1(old) file system hierarchy in its next major release. -``` +::: For more information, refer to . @@ -3328,9 +3335,9 @@ For more information, refer to . ROCm has moved header files and libraries to its new location as indicated in the above structure and included symbolic-link and wrapper header files in its old location for backward compatibility. -```note +:::{note} ROCm will continue supporting backward compatibility until the next major release. -``` +::: ##### Wrapper header files @@ -3398,7 +3405,7 @@ determine coherent support. `hipHostMalloc()` allocates memory with fine-grained access by default when the environment variable `HIP_HOST_COHERENT=1` is used. -For more information, refer to {doc}`hip:.doxygen/docBin/html/index`. +For more information, refer to {doc}`hip:doxygen/html/index`. #### SoftHang with `hipStreamWithCUMask` test on AMD Instinct™ @@ -3864,13 +3871,13 @@ This issue is currently under investigation and will be resolved in a future rel The `hipcc` and `hipconfig` Perl scripts are deprecated. In a future release, compiled binaries will be available as `hipcc.bin` and `hipconfig.bin` as replacements for the Perl scripts. -```note +:::{note} There will be a transition period where the Perl scripts and compiled binaries are available before the scripts are removed. There will be no functional difference between the Perl scripts and their compiled binary counterpart. No user action is required. Once these are available, users can optionally switch to `hipcc.bin` and `hipconfig.bin`. The `hipcc`/`hipconfig` soft link will be assimilated to point from `hipcc`/`hipconfig` to the respective compiled binaries as the default option. -``` +::: #### Linux file system hierarchy standard for ROCm @@ -3914,10 +3921,10 @@ The following is the new file system hierarchy: ``` -```note +:::{note} ROCm will not support backward compatibility with the v5.1(old) file system hierarchy in its next major release. -``` +::: For more information, refer to . @@ -3926,9 +3933,9 @@ For more information, refer to . ROCm has moved header files and libraries to its new location as indicated in the above structure and included symbolic-link and wrapper header files in its old location for backward compatibility. -```note +:::{note} ROCm will continue supporting backward compatibility until the next major release. -``` +::: ##### Wrapper header files @@ -4790,7 +4797,7 @@ The new APIs for virtual memory management are as follows: ``` For more information, refer to the HIP API documentation at -{doc}`hip:.doxygen/docBin/html/modules`. +{doc}`hip:doxygen/html/modules`. ##### Planned HIP changes in future releases @@ -4890,10 +4897,10 @@ The following is the new file system hierarchy: ``` -```note +:::{note} ROCm will not support backward compatibility with the v5.1(old) file system hierarchy in its next major release. -``` +::: For more information, refer to . @@ -4902,9 +4909,10 @@ For more information, refer to . ROCm has moved header files and libraries to its new location as indicated in the above structure and included symbolic-link and wrapper header files in its old location for backward compatibility. -```note +:::{note} ROCm will continue supporting backward compatibility until the next major release. -``` +::: + ##### Wrapper header files Wrapper header files are placed in the old location (`/opt/rocm-xxx//include`) with a @@ -5017,10 +5025,10 @@ allow for undefined values. The workaround is to initialize the parameters to `__shfl_sync`. -```note +:::{note} When the `-Wall` compilation flag is used, the compiler generates a warning indicating the variable is initialized along some path. -``` +::: Example: @@ -5437,11 +5445,11 @@ The accuracy is guaranteed if the compiler options `-g -O0` are used and apply o This enhancement enables ROCDebugger users to interact with the HIP source-level variables and function arguments. -```note +:::{note} The newly-suggested compiler -g option must be used instead of the previously-suggested `-ggdb` option. Although the effect of these two options is currently equivalent, this is not guaranteed for the future, as changes might be made by the upstream LLVM community. -``` +::: ##### Machine interface lanes support @@ -5514,16 +5522,16 @@ this ROCm release, CRIU is enhanced with a new plugin to support AMD GPUs, which For more information, refer to -```note +:::{note} The CRIU plugin (amdgpu_plugin) is merged upstream with the CRIU repository. The KFD kernel patches are also available upstream with the amd-staging-drm-next branch (public) and the ROCm 5.1 release branch. -``` +::: -```note +:::{note} This is a Beta release of the Checkpoint and Restore functionality, and some features are not available in this release. -``` +::: For more information, refer to the following websites: @@ -5584,7 +5592,9 @@ debug information. **Issue:** Random memory access fault issues are observed while running Math libraries unit tests. This issue is encountered in ROCm v5.0, ROCm v5.0.1, and ROCm v5.0.2. -Note, the faults only occur in the SRIOV environment. +:::{note} +The faults only occur in the SRIOV environment. +::: **Workaround:** Use SDMA to update the page table. The Guest set up steps are as follows: @@ -5605,7 +5615,7 @@ Where expectation is 0. #### CU masking causes application to freeze Using CU Masking results in an application freeze or runs exceptionally slowly. This issue is noticed -only in the GFX10 suite of products. Note, this issue is observed only in GFX10 suite of products. +only in the GFX10 suite of products. Note that this issue is observed only in GFX10 suite of products. This issue is under active investigation at this time. @@ -5994,12 +6004,12 @@ The resolution includes a compiler change, which emits the required metadata by compiler can prove that the hostcall facility is not required by the kernel. This ensures that the “assert()” call never fails. -```note +:::{note} This fix may lead to breakage in some OpenMP offload use cases, which use print inside a target region and result in an abort in device code. The issue will be fixed in a future release. -``` +::: -The compatibility matrix in the [Deep-learning guide](./how-to/deep-learning-rocm.md) is updated for +The compatibility matrix in the [Deep-learning guide](../how-to/deep-learning-rocm.md) is updated for ROCm v5.0.2. ### Library changes in ROCM 5.0.2 @@ -6086,27 +6096,27 @@ Refer to the HIP Installation Guide v5.0 for more details. Managed memory, including the `__managed__` keyword, is now supported in the HIP combined host/device compilation. Through unified memory allocation, managed memory allows data to be shared and accessible to both the CPU and GPU using a single pointer. The allocation is managed by the AMD GPU driver using the Linux Heterogeneous Memory Management (HMM) mechanism. The user can call managed memory API hipMallocManaged to allocate a large chunk of HMM memory, execute kernels on a device, and fetch data between the host and device as needed. -> **Note** -> -> In a HIP application, it is recommended to do a capability check before calling the managed memory APIs. For example, -> -> ```cpp -> int managed_memory = 0; -> HIPCHECK(hipDeviceGetAttribute(&managed_memory, -> hipDeviceAttributeManagedMemory,p_gpuDevice)); -> if (!managed_memory ) { -> printf ("info: managed memory access not supported on the device %d\n Skipped\n", p_gpuDevice); -> } -> else { -> HIPCHECK(hipSetDevice(p_gpuDevice)); -> HIPCHECK(hipMallocManaged(&Hmm, N * sizeof(T))); -> . . . -> } -> ``` +:::{note} +In a HIP application, it is recommended to do a capability check before calling the managed memory APIs. For example, -> **Note** -> -> The managed memory capability check may not be necessary; however, if HMM is not supported, managed malloc will fall back to using system memory. Other managed memory API calls will, then, have + ```cpp + int managed_memory = 0; + HIPCHECK(hipDeviceGetAttribute(&managed_memory, + hipDeviceAttributeManagedMemory,p_gpuDevice)); + if (!managed_memory ) { + printf ("info: managed memory access not supported on the device %d\n Skipped\n", p_gpuDevice); + } + else { + HIPCHECK(hipSetDevice(p_gpuDevice)); + HIPCHECK(hipMallocManaged(&Hmm, N * sizeof(T))); + . . . + } + ``` +::: + +:::{note} +The managed memory capability check may not be necessary; however, if HMM is not supported, managed malloc will fall back to using system memory. Other managed memory API calls will, then, have +::: Refer to the HIP API documentation for more details on managed memory APIs. @@ -6465,7 +6475,9 @@ During the deprecation, two macros `_HIP_ENABLE_COMPLEX_OPERATORS` and `_HIP_ENABLE_VECTOR_OPERATORS` are provided to allow users to conditionally enable arithmetic operators of HIP complex or vector types. -Note, the two macros are mutually exclusive and, by default, set to Off. +:::{note} +The two macros are mutually exclusive and, by default, set to Off. +::: The arithmetic operators of HIP complex and vector types will be removed in a future release. diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md index 32ed96f74..dd0cf1065 100644 --- a/CONTRIBUTING.md +++ b/CONTRIBUTING.md @@ -8,7 +8,7 @@ AMD values and encourages contributions to our code and documentation. If you want to contribute to our ROCm repositories, first review the following guidance. For documentation-specific information, -see [Contributing to ROCm docs](./docs/contribute/contribute-docs.md). +see [Contributing to ROCm docs](./contribute-docs.md). ROCm is a software stack made up of a collection of drivers, development tools, and APIs that enable GPU programming from low-level kernel to end-user applications. Because some of our components @@ -67,13 +67,13 @@ project-specific steps. Refer to each repository's PR process for any additional during our release cycle, as coordinated by the maintainer * We'll inform you once your change is committed -```important +:::{important} By creating a PR, you agree to allow your contribution to be licensed under the terms of the LICENSE.txt file in the corresponding repository. Different repositories may use different licenses. -``` +::: -You can look up each license on the [ROCm licensing](./docs/about/licensing.md) page. +You can look up each license on the [ROCm licensing](../about/licensing.md) page. ### New feature development diff --git a/docs/about/compatibility/linux-support.md b/docs/about/compatibility/linux-support.md deleted file mode 100644 index 7d54b951e..000000000 --- a/docs/about/compatibility/linux-support.md +++ /dev/null @@ -1,122 +0,0 @@ - - - - - - -# GPU and OS support (Linux) - -(linux-support)= - -## Supported Linux distributions - -AMD ROCm™ Platform supports the following Linux distributions. - -::::{tab-set} - -:::{tab-item} Supported - -| Distribution | Processor Architectures | Validated Kernel | Support | -| :----------- | :---------------------: | :--------------: | ------: | -| RHEL 9.2 | x86-64 | 5.14 (5.14.0-284.11.1.el9_2.x86_64) | ✅ | -| RHEL 9.1 | x86-64 | 5.14.0-284.11.1.el9_2.x86_64 | ✅ | -| RHEL 8.8 | x86-64 | 4.18.0-477.el8.x86_64 | ✅ | -| RHEL 8.7 | x86-64 | 4.18.0-425.10.1.el8_7.x86_64 | ✅ | -| SLES 15 SP5 | x86-64 | 5.14.21-150500.53-default | ✅ | -| SLES 15 SP4 | x86-64 | 5.14.21-150400.24.63-default | ✅ | -| Ubuntu 22.04.2 | x86-64 | 5.19.0-45-generic | ✅ | -| Ubuntu 20.04.5 | x86-64 | 5.15.0-75-generic | ✅ | - -:::{versionadded} 5.6 - -* RHEL 8.8 and 9.2 support is added. -* SLES 15 SP5 support is added - -::: - -:::{tab-item} Unsupported - -| Distribution | Processor Architectures | Validated Kernel | Support | -| :----------- | :---------------------: | :--------------: | ------: | -| RHEL 9.0 | x86-64 | 5.14 | ❌ | -| RHEL 8.6 | x86-64 | 5.14 | ❌ | -| SLES 15 SP3 | x86-64 | 5.3 | ❌ | -| Ubuntu 22.04.0 | x86-64 | 5.15 LTS, 5.17 OEM | ❌ | -| Ubuntu 20.04.4 | x86-64 | 5.13 HWE, 5.13 OEM | ❌ | -| Ubuntu 22.04.1 | x86-64 | 5.15 LTS | ❌ | - -::: - -:::: - -✅: **Supported** - AMD performs full testing of all ROCm components on distro - GA image. -❌: **Unsupported** - AMD no longer performs builds and testing on these - previously supported distro GA images. - -## Virtualization support - -ROCm supports virtualization for select GPUs only as shown below. - -| Hypervisor | Version | GPU | Validated Guest OS (validated kernel) | -|----------------|----------|-------|----------------------------------------------------------------------------------| -| VMWare | ESXi 8 | MI250 | Ubuntu 20.04 (`5.15.0-56-generic`) | -| VMWare | ESXi 8 | MI210 | Ubuntu 20.04 (`5.15.0-56-generic`), SLES 15 SP4 (`5.14.21-150400.24.18-default`) | -| VMWare | ESXi 7 | MI210 | Ubuntu 20.04 (`5.15.0-56-generic`), SLES 15 SP4 (`5.14.21-150400.24.18-default`) | - -## Linux-supported GPUs - -The table below shows supported GPUs for Instinct™, Radeon Pro™ and Radeon™ -GPUs. Please click the tabs below to switch between GPU product lines. If a GPU -is not listed on this table, the GPU is not officially supported by AMD. - -:::::{tab-set} - -::::{tab-item} AMD Instinct™ -:sync: instinct - -| Product Name | Architecture | [LLVM Target](https://www.llvm.org/docs/AMDGPUUsage.html#processors) |Support | -|:------------:|:------------:|:--------------------------------------------------------------------:|:-------:| -| AMD Instinct™ MI250X | CDNA2 | gfx90a | ✅ | -| AMD Instinct™ MI250 | CDNA2 | gfx90a | ✅ | -| AMD Instinct™ MI210 | CDNA2 | gfx90a | ✅ | -| AMD Instinct™ MI100 | CDNA | gfx908 | ✅ | -| AMD Instinct™ MI50 | GCN5.1 | gfx906 | ✅ | -| AMD Instinct™ MI25 | GCN5.0 | gfx900 | ❌ | - -:::: - -::::{tab-item} Radeon Pro™ -:sync: radeonpro - -| Name | Architecture |[LLVM Target](https://www.llvm.org/docs/AMDGPUUsage.html#processors) | Support| -|:----:|:------------:|:--------------------------------------------------------------------:|:-------:| -| AMD Radeon™ Pro W7900 | RDNA3 | gfx1100 | ✅ (Ubuntu 22.04 only)| -| AMD Radeon™ Pro W6800 | RDNA2 | gfx1030 | ✅ | -| AMD Radeon™ Pro V620 | RDNA2 | gfx1030 | ✅ | -| AMD Radeon™ Pro VII | GCN5.1 | gfx906 | ✅ | -:::: - -::::{tab-item} Radeon™ -:sync: radeonpro - -| Name | Architecture |[LLVM Target](https://www.llvm.org/docs/AMDGPUUsage.html#processors) | Support| -|:----:|:---------------:|:--------------------------------------------------------------------:|:-------:| -| AMD Radeon™ RX 7900 XTX | RDNA3 | gfx1100 | ✅ (Ubuntu 22.04 only)| -| AMD Radeon™ VII | GCN5.1 | gfx906 | ✅ | - -:::: -::::: - -### Support status - -✅: **Supported** - AMD enables these GPUs in our software distributions for - the corresponding ROCm product. -⚠️: **Deprecated** - Support will be removed in a future release. -❌: **Unsupported** - This configuration is not enabled in our software - distributions. - -## CPU support - -ROCm requires CPUs that support PCIe™ atomics. Modern CPUs after the release of -1st generation AMD Zen CPU and Intel™ Haswell support PCIe atomics. diff --git a/docs/about/compatibility/openmp.md b/docs/about/compatibility/openmp.md index 24216d0c6..e7f14f7c1 100644 --- a/docs/about/compatibility/openmp.md +++ b/docs/about/compatibility/openmp.md @@ -15,7 +15,8 @@ Along with host APIs, the OpenMP compilers support offloading code and data onto GPU devices. This document briefly describes the installation location of the OpenMP toolchain, example usage of device offloading, and usage of `rocprof` with OpenMP applications. The GPUs supported are the same as those supported by -this ROCm release. See the list of supported GPUs for [Linux](../../about/compatibility/linux-support.md) and [Windows](../../about/compatibility/windows-support.md). +this ROCm release. See the list of supported GPUs for {doc}`Linux` and +{doc}`Windows`. The ROCm OpenMP compiler is implemented using LLVM compiler technology. The following image illustrates the internal steps taken to translate a user’s application into an executable that can offload computation to the AMDGPU. The compilation is a two-pass process. Pass 1 compiles the application to generate the CPU code and Pass 2 links the CPU code to the AMDGPU device code. @@ -47,10 +48,10 @@ cd $ROCM_PATH/share/openmp-extras/examples/openmp/veccopy sudo make run ``` -```{note} +:::{note} `sudo` is required since we are building inside the `/opt` directory. Alternatively, copy the files to your home directory first. -``` +::: The above invocation of Make compiles and runs the program. Note the options that are required for target offload from an OpenMP program: @@ -59,13 +60,15 @@ that are required for target offload from an OpenMP program: -fopenmp --offload-arch= ``` -```{note} +:::{note} The compiler also accepts the alternative offloading notation: ```bash -fopenmp -fopenmp-targets=amdgcn-amd-amdhsa -Xopenmp-target=amdgcn-amd-amdhsa -march= ``` +::: + Obtain the value of `gpu-arch` by running the following command: ```bash @@ -327,10 +330,10 @@ double a = 0.0; a = a + 1.0; ``` -```{note} +:::{note} `AMD_unsafe_fp_atomics` is an alias for `AMD_fast_fp_atomics`, and `AMD_safe_fp_atomics` is implemented with a compare-and-swap loop. -``` +::: To disable the generation of fast floating-point atomic instructions at the file level, build using the option `-msafe-fp-atomics` or use a hint clause on a diff --git a/docs/about/compatibility/windows-support.md b/docs/about/compatibility/windows-support.md deleted file mode 100644 index 1564d2c28..000000000 --- a/docs/about/compatibility/windows-support.md +++ /dev/null @@ -1,86 +0,0 @@ - - - - - - -# GPU and OS support (Windows) - -(windows-support)= - -## Supported SKUs - -AMD HIP SDK supports the following Windows variants. - -| Distribution |Processor Architectures| Validated update | -|---------------------|-----------------------|--------------------| -| Windows 10 | x86-64 | 22H2 (GA) | -| Windows 11 | x86-64 | 22H2 (GA) | -| Windows Server 2022 | x86-64 | | - -## Windows-supported GPUs - -The table below shows supported GPUs for Radeon Pro™ and Radeon™ GPUs. Please -click the tabs below to switch between GPU product lines. If a GPU is not listed -on this table, the GPU is not officially supported by AMD. - -::::{tab-set} - -:::{tab-item} Radeon Pro™ -:sync: radeonpro - -| Name | Architecture |[LLVM Target](https://www.llvm.org/docs/AMDGPUUsage.html#processors) | Runtime | HIP SDK | -|:----:|:------------:|:--------------------------------------------------------------------:|:-------:|:----------------:| -| AMD Radeon Pro™ W7900 | RDNA3 | gfx1100 | ✅ | ✅ | -| AMD Radeon Pro™ W7800 | RDNA3 | gfx1100 | ✅ | ✅ | -| AMD Radeon Pro™ W6800 | RDNA2 | gfx1030 | ✅ | ✅ | -| AMD Radeon Pro™ W6600 | RDNA2 | gfx1032 | ✅ | ❌ | -| AMD Radeon Pro™ W5500 | RDNA1 | gfx1012 | ❌ | ❌ | -| AMD Radeon Pro™ VII | GCN5.1 | gfx906 | ❌ | ❌ | - -::: - -:::{tab-item} Radeon™ -:sync: radeon - -| Name | Architecture | [LLVM Target](https://www.llvm.org/docs/AMDGPUUsage.html#processors) | Runtime | HIP SDK | -|:----:|:------------:|:--------------------------------------------------------------------:|:-------:|:----------------:| -| AMD Radeon™ RX 7900 XTX | RDNA3 | gfx1100 | ✅ | ✅ | -| AMD Radeon™ RX 7900 XT | RDNA3 | gfx1100 | ✅ | ✅ | -| AMD Radeon™ RX 7600 | RDNA3 | gfx1102 | ✅ | ✅ | -| AMD Radeon™ RX 6950 XT | RDNA2 | gfx1030 | ✅ | ✅ | -| AMD Radeon™ RX 6900 XT | RDNA2 | gfx1030 | ✅ | ✅ | -| AMD Radeon™ RX 6800 XT | RDNA2 | gfx1030 | ✅ | ✅ | -| AMD Radeon™ RX 6800 | RDNA2 | gfx1030 | ✅ | ✅ | -| AMD Radeon™ RX 6750 XT | RDNA2 | gfx1031 | ✅ | ❌ | -| AMD Radeon™ RX 6700 XT | RDNA2 | gfx1031 | ✅ | ❌ | -| AMD Radeon™ RX 6700 | RDNA2 | gfx1031 | ✅ | ❌ | -| AMD Radeon™ RX 6650 XT | RDNA2 | gfx1032 | ✅ | ❌ | -| AMD Radeon™ RX 6600 XT | RDNA2 | gfx1032 | ✅ | ❌ | -| AMD Radeon™ RX 6600 | RDNA2 | gfx1032 | ✅ | ❌ | - -::: - -:::: - -### Component support - -ROCm components are described in [What is ROCm?](../../what-is-rocm.md) Support -on Windows is provided with two levels on enablement. - -* **Runtime**: Runtime enables the use of the HIP and OpenCL runtimes only. -* **HIP SDK**: Runtime plus additional components are listed in [Libraries](../../reference/library-index.md). - Note that some math libraries are Linux exclusive. - -### Support status - -✅: **Supported** - AMD enables these GPUs in our software distributions for - the corresponding ROCm product. -⚠️: **Deprecated** - Support will be removed in a future release. -❌: **Unsupported** - This configuration is not enabled in our software - distributions. - -## CPU support - -ROCm requires CPUs that support PCIe™ atomics. Modern CPUs after the release of -1st generation AMD Zen CPU and Intel™ Haswell support PCIe atomics. diff --git a/docs/about/license.md b/docs/about/license.md index 847e6d251..2929250e4 100644 --- a/docs/about/license.md +++ b/docs/about/license.md @@ -1,6 +1,10 @@ # License -> Note: This license applies to the [ROCm repository](https://github.com/RadeonOpenCompute/ROCm) that primarily contains documentation. For other licensing information, refer to the [Licensing Terms page](./licensing). +:::{note} +This license applies to the [ROCm repository](https://github.com/RadeonOpenCompute/ROCm) that +primarily contains documentation. For other licensing information, refer to the +[Licensing Terms page](./licensing). +::: ```{include} ../../LICENSE ``` diff --git a/docs/about/licensing.md b/docs/about/licensing.md index 5a7f682bd..321872863 100644 --- a/docs/about/licensing.md +++ b/docs/about/licensing.md @@ -114,7 +114,7 @@ companies. ## Package licensing -```{attention} +:::{attention} AQL Profiler and AOCC CPU optimization are both provided in binary form, each subject to the license agreement enclosed in the directory for the binary and is available here: `/opt/rocm/share/doc/rocm-llvm-alt/EULA`. By using, installing, @@ -122,7 +122,7 @@ copying or distributing AQL Profiler and/or AOCC CPU Optimizations, you agree to the terms and conditions of this license agreement. If you do not agree to the terms of this agreement, do not install, copy or use the AQL Profiler and/or the AOCC CPU Optimizations. -``` +::: For the rest of the ROCm packages, you can find the licensing information at the following location: `/opt/rocm/share/doc//` diff --git a/docs/conceptual/ai-migraphx-optimization.md b/docs/conceptual/ai-migraphx-optimization.md index bd7dd390d..60d092a63 100644 --- a/docs/conceptual/ai-migraphx-optimization.md +++ b/docs/conceptual/ai-migraphx-optimization.md @@ -216,23 +216,23 @@ Follow these steps: ./inception_inference ``` -```{note} +:::{note} Set `LD_LIBRARY_PATH` to `/opt/rocm/lib` if required during the build. Additional examples can be found in the MIGraphX repository under the `/examples/` directory. -``` +::: ## Tuning MIGraphX MIGraphX uses MIOpen kernels to target AMD GPU. For the model compiled with MIGraphX, tune MIOpen to pick the best possible kernel implementation. The MIOpen tuning results in a significant performance boost. Tuning can be done by setting the environment variable `MIOPEN_FIND_ENFORCE=3`. -```{note} +:::{note} The tuning process can take a long time to finish. -``` +::: **Example:** The average inference time of the inception model example shown previously over 100 iterations using untuned kernels is 0.01383ms. After tuning, it reduces to 0.00459ms, which is a 3x improvement. This result is from ROCm v4.5 on a MI100 GPU. -```{note} +:::{note} The results may vary depending on the system configurations. -``` +::: For reference, the following code snippet shows inference runs for only the first 10 iterations for both tuned and untuned kernels: diff --git a/docs/conceptual/ai-pytorch-inception.md b/docs/conceptual/ai-pytorch-inception.md index 1c377d225..06813b326 100644 --- a/docs/conceptual/ai-pytorch-inception.md +++ b/docs/conceptual/ai-pytorch-inception.md @@ -63,7 +63,7 @@ This example is adapted from the PyTorch research hub page on [Inception V3](htt Follow these steps: -1. Run the PyTorch ROCm-based Docker image or refer to the section {doc}`Installing PyTorch ` for setting up a PyTorch environment on ROCm. +1. Run the PyTorch ROCm-based Docker image or refer to the section {doc}`Installing PyTorch ` for setting up a PyTorch environment on ROCm. ```dockerfile docker run -it -v $HOME:/data --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --device=/dev/kfd --device=/dev/dri --group-add video --ipc=host --shm-size 8G rocm/pytorch:latest @@ -153,7 +153,7 @@ The previous section focused on downloading and using the Inception V3 model for Follow these steps: -1. Run the PyTorch ROCm Docker image or refer to the section {doc}`Installing PyTorch ` for setting up a PyTorch environment on ROCm. +1. Run the PyTorch ROCm Docker image or refer to the section {doc}`Installing PyTorch ` for setting up a PyTorch environment on ROCm. ```dockerfile docker pull rocm/pytorch:latest @@ -215,9 +215,9 @@ Follow these steps: 7. Set parameters to guide the training process. - ```{note} + :::{note} The device is set to `"cuda"`. In PyTorch, `"cuda"` is a generic keyword to denote a GPU. - ``` + ::: ```py device = "cuda" @@ -277,9 +277,9 @@ Follow these steps: lr_gamma = 0.1 ``` - ```{note} + :::{note} One training epoch is when the neural network passes an entire dataset forward and backward. - ``` + ::: ```py epochs = 90 @@ -340,9 +340,9 @@ Follow these steps: ) ``` - ```{note} + :::{note} Use torchvision to obtain the Inception V3 model. Use the pre-trained model weights to speed up training. - ``` + ::: ```py print("Creating model") @@ -1162,9 +1162,10 @@ To prepare the data for training, follow these steps: print("Accuracy: ", accuracy) ``` - ```{note} - model.fit() returns a History object that contains a dictionary with everything that happened during training. - ``` + :::{note} + `model.fit()` returns a History object that contains a dictionary with everything that happened during + training. + ::: ```py history_dict = history.history diff --git a/docs/conceptual/cmake-packages.rst b/docs/conceptual/cmake-packages.rst index ef3a9ab7c..183984c64 100644 --- a/docs/conceptual/cmake-packages.rst +++ b/docs/conceptual/cmake-packages.rst @@ -25,13 +25,13 @@ Finding dependencies In short, CMake supports finding dependencies in two ways: -* In Module mode, it consults a file ``Find.cmake`` which tries to - find the component in typical install locations and layouts. CMake ships a - few dozen such scripts, but users and projects may ship them as well. +* In Module mode, it consults a file ``Find.cmake`` which tries to find the component + in typical install locations and layouts. CMake ships a few dozen such scripts, but users and projects + may ship them as well. -* In Config mode, it locates a file named ``-config.cmake`` or - ``Config.cmake`` which describes the installed component in all - regards needed to consume it. +* In Config mode, it locates a file named ``-config.cmake`` or + ``Config.cmake`` which describes the installed component in all regards needed to + consume it. ROCm predominantly relies on Config mode, one notable exception being the Module driving the compilation of HIP programs on NVIDIA runtimes. As such, when diff --git a/docs/conceptual/gpu-arch/mi200-performance-counters.md b/docs/conceptual/gpu-arch/mi200-performance-counters.md index 998f4d80f..4a83f07d1 100644 --- a/docs/conceptual/gpu-arch/mi200-performance-counters.md +++ b/docs/conceptual/gpu-arch/mi200-performance-counters.md @@ -14,9 +14,9 @@ This document lists and describes the hardware performance counters and derived See the category-wise listing of MI200 performance counters in the following tables. -```{note} +:::{note} Preliminary validation of all MI200 performance counters is in progress. Those with “*” appended to the names require further evaluation. -``` +::: ### Graphics Register Bus Management (GRBM) counters diff --git a/docs/conceptual/gpu-memory.md b/docs/conceptual/gpu-memory.md index 010ae89c0..3a74d0e8c 100644 --- a/docs/conceptual/gpu-memory.md +++ b/docs/conceptual/gpu-memory.md @@ -9,8 +9,8 @@ For the HIP reference documentation, see: -* {doc}`hip:.doxygen/docBin/html/group___memory` -* {doc}`hip:.doxygen/docBin/html/group___memory_m` +* {doc}`hip:doxygen/html/group___memory` +* {doc}`hip:doxygen/html/group___memory_m` Host memory exists on the host (e.g. CPU) of the machine in random access memory (RAM). diff --git a/docs/conceptual/using-gpu-sanitizer.md b/docs/conceptual/using-gpu-sanitizer.md index d8851b4fd..e9b9a8ff0 100644 --- a/docs/conceptual/using-gpu-sanitizer.md +++ b/docs/conceptual/using-gpu-sanitizer.md @@ -14,7 +14,9 @@ Until now, the LLVM ASan process was only available for traditional purely CPU a This document provides documentation on using ROCm ASan. For information about LLVM ASan, see the [LLVM documentation](https://clang.llvm.org/docs/AddressSanitizer.html). -**Note**: The beta release of LLVM ASan for ROCm is currently tested and validated on Ubuntu 20.04. +:::{note} +The beta release of LLVM ASan for ROCm is currently tested and validated on Ubuntu 20.04. +::: ## Compiling for ASan diff --git a/docs/conf.py b/docs/conf.py index 1cc7ca2e3..7dfc424ce 100644 --- a/docs/conf.py +++ b/docs/conf.py @@ -70,9 +70,6 @@ article_pages = [ {"file":"install/windows/cli/index", "os":["windows"]}, {"file":"install/windows/gui/index", "os":["windows"]}, - {"file":"about/compatibility/linux-support", "os":["linux"]}, - {"file":"about/compatibility/windows-support", "os":["windows"]}, - {"file":"about/compatibility/docker-image-support-matrix", "os":["linux"]}, {"file":"about/compatibility/user-kernel-space-compat-matrix", "os":["linux"]}, diff --git a/docs/how-to/deep-learning-rocm.md b/docs/how-to/deep-learning-rocm.md index acc556f0d..e43901331 100644 --- a/docs/how-to/deep-learning-rocm.md +++ b/docs/how-to/deep-learning-rocm.md @@ -11,12 +11,12 @@ The following sections cover the different framework installations for ROCm and deep-learning applications. The following image provides the sequential flow for the use of each framework. Refer to the ROCm Compatible Frameworks Release Notes for each framework's most current release notes at -{doc}`Third-party support`. +{doc}`Third-party support`. ![ROCm Compatible Frameworks Flowchart](../data/install/magma-install/magma005.png "ROCm Compatible Frameworks") ## Frameworks installation -* {doc}`PyTorch for ROCm` -* {doc}`TensorFlow for ROCm` -* {doc}`MAGMA for ROCm` +* {doc}`PyTorch for ROCm` +* {doc}`TensorFlow for ROCm` +* {doc}`MAGMA for ROCm` diff --git a/docs/how-to/tuning-guides/mi100.md b/docs/how-to/tuning-guides/mi100.md index 437c6ed11..249e44965 100644 --- a/docs/how-to/tuning-guides/mi100.md +++ b/docs/how-to/tuning-guides/mi100.md @@ -365,9 +365,9 @@ installed. ## System management For a complete guide on how to install/manage/uninstall ROCm on Linux, refer to -{doc}`Quick-start (Linux)`. For verifying that the -installation was successful, refer to -{ref}`verifying-kernel-mode-driver-installation` and +{doc}`Quick-start (Linux)`. To verify that the installation was +successful, refer to the +{doc}`post-install instructions` and [Validation Tools](../../reference/library-index.md). Should verification fail, consult the [System Debugging Guide](../system-debugging.md). @@ -412,7 +412,8 @@ SIMD pipelines, memory information, and Instruction Set Architecture: ![rocminfo output fragment on an 8*MI100 system](../../data/how-to/tuning-guides/tuning003.png "rocminfo output fragment on an 8*MI100 system") For a complete list of architecture (LLVM target) names, refer to -[Linux support](../../about/compatibility/linux-support.md) and [Windows support](../../about/compatibility/windows-support.md). +{doc}`Linux` and +{doc}`Windows` support. ### Testing inter-device bandwidth diff --git a/docs/how-to/tuning-guides/mi200.md b/docs/how-to/tuning-guides/mi200.md index 2efcf1dd2..480f08bbe 100644 --- a/docs/how-to/tuning-guides/mi200.md +++ b/docs/how-to/tuning-guides/mi200.md @@ -34,7 +34,7 @@ Analogous settings for other non-AMI System BIOS providers could be set similarly. For systems with Intel processors, some settings may not apply or be available as listed in the following table. -```{list-table} Recommended settings for the system BIOS in a GIGABYTE platform. +```{list-table} :header-rows: 1 :name: mi200-bios @@ -351,8 +351,8 @@ installed. For a complete guide on how to install/manage/uninstall ROCm on Linux, refer to {doc}`Quick-start (Linux)`. For verifying that the -installation was successful, refer to -{ref}`verifying-kernel-mode-driver-installation` and +installation was successful, refer to the +{doc}`post-install instructions` and [Validation Tools](../../reference/library-index.md). Should verification fail, consult the [System Debugging Guide](../system-debugging.md). @@ -397,7 +397,8 @@ Instruction Set Architecture (ISA): ![rocminfo output fragment on an 8*MI200 system](../../data/how-to/tuning-guides/tuning010.png "'rocminfo' output fragment on an 8*MI200 system") For a complete list of architecture (LLVM target) names, refer to GPU OS Support for -[Linux](../../about/compatibility/linux-support.md) and [Windows](../../about/compatibility/windows-support.md). +{doc}`Linux` and +{doc}`Windows`. ### Testing inter-device bandwidth diff --git a/docs/how-to/tuning-guides/w6000-v620.md b/docs/how-to/tuning-guides/w6000-v620.md index 20d380da7..4cafac51a 100644 --- a/docs/how-to/tuning-guides/w6000-v620.md +++ b/docs/how-to/tuning-guides/w6000-v620.md @@ -11,16 +11,16 @@ This chapter reviews system settings that are required to configure the system for ROCm virtualization on RDNA2-based AMD Radeon™ PRO GPUs. Installing ROCm on Bare Metal follows the routine ROCm -{doc}`installation procedure`. +{doc}`installation procedure`. To enable ROCm virtualization on V620, one has to setup Single Root I/O Virtualization (SR-IOV) in the BIOS via setting found in the following ({ref}`bios-settings`). A tested configuration can be followed in ({ref}`os-settings`). -```{attention} +:::{attention} SR-IOV is supported on V620 and unsupported on W6800. -``` +::: (bios-settings)= @@ -166,6 +166,6 @@ First, assign GPU virtual function (VF) to VM using the following steps. Then start the VM. Finally install ROCm on the virtual machine (VM). For detailed instructions, -refer to the {doc}`Linux install guide`. For any +refer to the {doc}`Linux install guide`. For any issue encountered during installation, write to us [here](mailto:CloudGPUsupport@amd.com). diff --git a/docs/index.md b/docs/index.md index b074c4ff3..5d8b2e4e7 100644 --- a/docs/index.md +++ b/docs/index.md @@ -47,8 +47,8 @@ Installation guides ROCm compatibility information ^^^ -* [Linux (GPU & OS)](./about/compatibility/linux-support.md) -* [Windows (GPU & OS)](./about/compatibility/windows-support.md) +* {doc}`System requirements (Linux)` +* {doc}`System requirements (Windows)` * {doc}`Third-party` * {doc}`User/kernel space` * {doc}`Docker` diff --git a/docs/reference/rocmcc.md b/docs/reference/rocmcc.md index e939e715c..c8d504aa3 100644 --- a/docs/reference/rocmcc.md +++ b/docs/reference/rocmcc.md @@ -141,12 +141,12 @@ The `-famd-opt` flag is useful when a user wants to build with the proprietary optimization compiler and not have to depend on setting any of the other proprietary optimization flags. -```{note} +:::{note} `-famd-opt` can be used in addition to the other proprietary CPU optimization flags. The table of optimizations below implicitly enables the invocation of the AMD proprietary optimizations compiler, whereas the `-famd-opt` flag requires this to be handled explicitly. -``` +::: #### `-fstruct-layout=[1,2,3,4,5,6,7]` @@ -262,12 +262,12 @@ loop. The heuristic can be controlled with the following options: Where, `n` is a positive integer and higher value of `` facilitates more unswitching. - ```{note} + :::{note} These options may facilitate more unswitching under some workloads. Since loop-unswitching inherently leads to code bloat, facilitating more unswitching may significantly increase the code size. Hence, it may also lead to longer compilation times. - ``` + ::: ##### `-enable-strided-vectorization` @@ -458,11 +458,11 @@ supports ASM statements, their use is not recommended for the following reasons: * Writing correct ASM statements is often difficult; we strongly recommend thorough testing of any use of ASM statements. -```{note} +:::{note} For developers who choose to include ASM statements in the code, AMD is interested in understanding the use case and appreciates feedback at [https://github.com/RadeonOpenCompute/ROCm/issues](https://github.com/RadeonOpenCompute/ROCm/issues) -``` +::: ### Miscellaneous OpenMP compiler features diff --git a/docs/release/versions.md b/docs/release/versions.md index 235a786bd..f600a231d 100644 --- a/docs/release/versions.md +++ b/docs/release/versions.md @@ -1,4 +1,4 @@ -# ROCm Release History +# ROCm release history | Version | Release Date | | ------- | ------------ | diff --git a/docs/sphinx/_toc.yml.in b/docs/sphinx/_toc.yml.in index 5a765b380..6aaf50273 100644 --- a/docs/sphinx/_toc.yml.in +++ b/docs/sphinx/_toc.yml.in @@ -7,7 +7,8 @@ root: index subtrees: - entries: - file: what-is-rocm.md - - file: about/whats-new/whats-new.md + - file: about/release-notes.md + title: Release notes subtrees: - entries: - file: about/CHANGELOG.md @@ -28,7 +29,7 @@ subtrees: title: Linux - url: https://rocm.docs.amd.com/projects/install-on-windows/en/${branch}/reference/system-requirements.html title: Windows - + - caption: Reference entries: - file: reference/library-index.md diff --git a/docs/temp/troubleshooting.md b/docs/temp/troubleshooting.md index 81ec6a44e..5234ee939 100644 --- a/docs/temp/troubleshooting.md +++ b/docs/temp/troubleshooting.md @@ -29,11 +29,11 @@ To implement a workaround, follow these steps: roc-obj-ls -v $TORCHDIR/lib/libtorch_hip.so # check for gfx target ``` -```{note} +:::{note} Recompile PyTorch with the right gfx target if compiling from the source if the hardware is not supported. For wheels or Docker installation, contact ROCm support [^ROCm_issues]. -``` +::: **Q: Why am I unable to access Docker or GPU in user accounts?** diff --git a/tools/autotag/templates/rocm_changes/5.0.0.md b/tools/autotag/templates/rocm_changes/5.0.0.md index 39991b96c..a262c959f 100644 --- a/tools/autotag/templates/rocm_changes/5.0.0.md +++ b/tools/autotag/templates/rocm_changes/5.0.0.md @@ -16,27 +16,27 @@ Refer to the HIP Installation Guide v5.0 for more details. Managed memory, including the `__managed__` keyword, is now supported in the HIP combined host/device compilation. Through unified memory allocation, managed memory allows data to be shared and accessible to both the CPU and GPU using a single pointer. The allocation is managed by the AMD GPU driver using the Linux Heterogeneous Memory Management (HMM) mechanism. The user can call managed memory API hipMallocManaged to allocate a large chunk of HMM memory, execute kernels on a device, and fetch data between the host and device as needed. -> **Note** -> -> In a HIP application, it is recommended to do a capability check before calling the managed memory APIs. For example, -> -> ```cpp -> int managed_memory = 0; -> HIPCHECK(hipDeviceGetAttribute(&managed_memory, -> hipDeviceAttributeManagedMemory,p_gpuDevice)); -> if (!managed_memory ) { -> printf ("info: managed memory access not supported on the device %d\n Skipped\n", p_gpuDevice); -> } -> else { -> HIPCHECK(hipSetDevice(p_gpuDevice)); -> HIPCHECK(hipMallocManaged(&Hmm, N * sizeof(T))); -> . . . -> } -> ``` +:::{note} +In a HIP application, it is recommended to do a capability check before calling the managed memory APIs. For example, -> **Note** -> -> The managed memory capability check may not be necessary; however, if HMM is not supported, managed malloc will fall back to using system memory. Other managed memory API calls will, then, have + ```cpp + int managed_memory = 0; + HIPCHECK(hipDeviceGetAttribute(&managed_memory, + hipDeviceAttributeManagedMemory,p_gpuDevice)); + if (!managed_memory ) { + printf ("info: managed memory access not supported on the device %d\n Skipped\n", p_gpuDevice); + } + else { + HIPCHECK(hipSetDevice(p_gpuDevice)); + HIPCHECK(hipMallocManaged(&Hmm, N * sizeof(T))); + . . . + } + ``` +::: + +:::{note} +The managed memory capability check may not be necessary; however, if HMM is not supported, managed malloc will fall back to using system memory. Other managed memory API calls will, then, have +::: Refer to the HIP API documentation for more details on managed memory APIs. diff --git a/tools/autotag/templates/rocm_changes/5.0.2.md b/tools/autotag/templates/rocm_changes/5.0.2.md index 6fbcc6f92..985689727 100644 --- a/tools/autotag/templates/rocm_changes/5.0.2.md +++ b/tools/autotag/templates/rocm_changes/5.0.2.md @@ -17,10 +17,10 @@ The resolution includes a compiler change, which emits the required metadata by compiler can prove that the hostcall facility is not required by the kernel. This ensures that the “assert()” call never fails. -```note +:::{note} This fix may lead to breakage in some OpenMP offload use cases, which use print inside a target region and result in an abort in device code. The issue will be fixed in a future release. -``` +::: The compatibility matrix in the [Deep-learning guide](./how-to/deep-learning-rocm.md) is updated for ROCm v5.0.2. diff --git a/tools/autotag/templates/rocm_changes/5.1.0.md b/tools/autotag/templates/rocm_changes/5.1.0.md index 335afe899..82fd6bd2e 100644 --- a/tools/autotag/templates/rocm_changes/5.1.0.md +++ b/tools/autotag/templates/rocm_changes/5.1.0.md @@ -41,11 +41,11 @@ The accuracy is guaranteed if the compiler options `-g -O0` are used and apply o This enhancement enables ROCDebugger users to interact with the HIP source-level variables and function arguments. -```note +:::{note} The newly-suggested compiler -g option must be used instead of the previously-suggested `-ggdb` option. Although the effect of these two options is currently equivalent, this is not guaranteed for the future, as changes might be made by the upstream LLVM community. -``` +::: ##### Machine interface lanes support @@ -118,16 +118,16 @@ this ROCm release, CRIU is enhanced with a new plugin to support AMD GPUs, which For more information, refer to -```note +:::{note} The CRIU plugin (amdgpu_plugin) is merged upstream with the CRIU repository. The KFD kernel patches are also available upstream with the amd-staging-drm-next branch (public) and the ROCm 5.1 release branch. -``` +::: -```note +:::{note} This is a Beta release of the Checkpoint and Restore functionality, and some features are not available in this release. -``` +::: For more information, refer to the following websites: diff --git a/tools/autotag/templates/rocm_changes/5.2.0.md b/tools/autotag/templates/rocm_changes/5.2.0.md index 311738612..f6698a413 100644 --- a/tools/autotag/templates/rocm_changes/5.2.0.md +++ b/tools/autotag/templates/rocm_changes/5.2.0.md @@ -275,7 +275,7 @@ The new APIs for virtual memory management are as follows: ``` For more information, refer to the HIP API documentation at -{doc}`hip:.doxygen/docBin/html/modules`. +{doc}`hip:doxygen/html/modules`. ##### Planned HIP changes in future releases @@ -375,10 +375,10 @@ The following is the new file system hierarchy: ``` -```note +:::{note} ROCm will not support backward compatibility with the v5.1(old) file system hierarchy in its next major release. -``` +::: For more information, refer to . @@ -387,9 +387,10 @@ For more information, refer to . ROCm has moved header files and libraries to its new location as indicated in the above structure and included symbolic-link and wrapper header files in its old location for backward compatibility. -```note +:::{note} ROCm will continue supporting backward compatibility until the next major release. -``` +::: + ##### Wrapper header files Wrapper header files are placed in the old location (`/opt/rocm-xxx//include`) with a @@ -502,10 +503,10 @@ allow for undefined values. The workaround is to initialize the parameters to `__shfl_sync`. -```note +:::{note} When the `-Wall` compilation flag is used, the compiler generates a warning indicating the variable is initialized along some path. -``` +::: Example: diff --git a/tools/autotag/templates/rocm_changes/5.3.0.md b/tools/autotag/templates/rocm_changes/5.3.0.md index 9f4978e86..11bd19c92 100644 --- a/tools/autotag/templates/rocm_changes/5.3.0.md +++ b/tools/autotag/templates/rocm_changes/5.3.0.md @@ -6,13 +6,13 @@ The `hipcc` and `hipconfig` Perl scripts are deprecated. In a future release, compiled binaries will be available as `hipcc.bin` and `hipconfig.bin` as replacements for the Perl scripts. -```note +:::{note} There will be a transition period where the Perl scripts and compiled binaries are available before the scripts are removed. There will be no functional difference between the Perl scripts and their compiled binary counterpart. No user action is required. Once these are available, users can optionally switch to `hipcc.bin` and `hipconfig.bin`. The `hipcc`/`hipconfig` soft link will be assimilated to point from `hipcc`/`hipconfig` to the respective compiled binaries as the default option. -``` +::: #### Linux file system hierarchy standard for ROCm @@ -56,10 +56,10 @@ The following is the new file system hierarchy: ``` -```note +:::{note} ROCm will not support backward compatibility with the v5.1(old) file system hierarchy in its next major release. -``` +::: For more information, refer to . @@ -68,9 +68,9 @@ For more information, refer to . ROCm has moved header files and libraries to its new location as indicated in the above structure and included symbolic-link and wrapper header files in its old location for backward compatibility. -```note +:::{note} ROCm will continue supporting backward compatibility until the next major release. -``` +::: ##### Wrapper header files diff --git a/tools/autotag/templates/rocm_changes/5.4.0.md b/tools/autotag/templates/rocm_changes/5.4.0.md index ea344f67d..15daa4aa9 100644 --- a/tools/autotag/templates/rocm_changes/5.4.0.md +++ b/tools/autotag/templates/rocm_changes/5.4.0.md @@ -27,9 +27,9 @@ int wallClkRate = 0; //in kilohertz Where hipDeviceAttributeWallClockRate is a device attribute. -```note +:::{note} The wall clock frequency is a per-device attribute. -``` +::: ##### New registry added for GPU_MAX_HW_QUEUES @@ -40,10 +40,10 @@ The environment variable controls how many independent hardware queues HIP runti per process, per device. If the application allocates more HIP streams than this number, then the HIP runtime reuses the same hardware queues for the new streams in a round-robin manner. -```note +:::{note} This maximum number does not apply to hardware queues created for CU-masked HIP streams or cooperative queues for HIP Cooperative Groups (there is only one queue per device). -``` +::: For more details, refer to the HIP Programming Guide. @@ -51,9 +51,9 @@ For more details, refer to the HIP Programming Guide. The following new HIP APIs are available in the ROCm v5.4 release. -> **Note** -> -> This is a pre-official version (beta) release of the new APIs. +:::{note} +This is a pre-official version (beta) release of the new APIs. +::: ##### Error handling @@ -99,13 +99,13 @@ This release consists of the following OpenMP enhancements: The `hipcc` and `hipconfig` Perl scripts are deprecated. In a future release, compiled binaries will be available as `hipcc.bin` and `hipconfig.bin` as replacements for the Perl scripts. -```note +:::{note} There will be a transition period where the Perl scripts and compiled binaries are available before the scripts are removed. There will be no functional difference between the Perl scripts and their compiled binary counterpart. No user action is required. Once these are available, users can optionally switch to `hipcc.bin` and `hipconfig.bin`. The `hipcc`/`hipconfig` soft link will be assimilated to point from `hipcc`/`hipconfig` to the respective compiled binaries as the default option. -``` +::: ##### Linux file system hierarchy standard for ROCm @@ -149,10 +149,10 @@ The following is the new file system hierarchy: ``` -```note +:::{note} ROCm will not support backward compatibility with the v5.1(old) file system hierarchy in its next major release. -``` +::: For more information, refer to . @@ -161,9 +161,9 @@ For more information, refer to . ROCm has moved header files and libraries to its new location as indicated in the above structure and included symbolic-link and wrapper header files in its old location for backward compatibility. -```note +:::{note} ROCm will continue supporting backward compatibility until the next major release. -``` +::: ##### Wrapper header files @@ -231,7 +231,7 @@ determine coherent support. `hipHostMalloc()` allocates memory with fine-grained access by default when the environment variable `HIP_HOST_COHERENT=1` is used. -For more information, refer to {doc}`hip:.doxygen/docBin/html/index`. +For more information, refer to {doc}`hip:doxygen/html/index`. #### SoftHang with `hipStreamWithCUMask` test on AMD Instinct™ diff --git a/tools/autotag/templates/rocm_changes/5.4.1.md b/tools/autotag/templates/rocm_changes/5.4.1.md index 20b5f4db8..d52a32ae4 100644 --- a/tools/autotag/templates/rocm_changes/5.4.1.md +++ b/tools/autotag/templates/rocm_changes/5.4.1.md @@ -9,9 +9,9 @@ The ROCm v5.4.1 release consists of the following new HIP API: The following new HIP API is introduced in the ROCm v5.4.1 release. -> **Note** -> -> This is a pre-official version (beta) release of the new APIs. +:::{note} +This is a pre-official version (beta) release of the new APIs. +::: ```cpp hipError_t hipLaunchHostFunc(hipStream_t stream, hipHostFn_t fn, void* userData); @@ -35,13 +35,13 @@ For more information, refer to the HIP API documentation at The `hipcc` and `hipconfig` Perl scripts are deprecated. In a future release, compiled binaries will be available as `hipcc.bin` and `hipconfig.bin` as replacements for the Perl scripts. -```note +:::{note} There will be a transition period where the Perl scripts and compiled binaries are available before the scripts are removed. There will be no functional difference between the Perl scripts and their compiled binary counterpart. No user action is required. Once these are available, users can optionally switch to `hipcc.bin` and `hipconfig.bin`. The `hipcc`/`hipconfig` soft link will be assimilated to point from `hipcc`/`hipconfig` to the respective compiled binaries as the default option. -``` +::: ### IFWI fixes diff --git a/tools/autotag/templates/rocm_changes/5.4.2.md b/tools/autotag/templates/rocm_changes/5.4.2.md index e85843062..ba64bde5c 100644 --- a/tools/autotag/templates/rocm_changes/5.4.2.md +++ b/tools/autotag/templates/rocm_changes/5.4.2.md @@ -6,13 +6,13 @@ The `hipcc` and `hipconfig` Perl scripts are deprecated. In a future release, compiled binaries will be available as `hipcc.bin` and `hipconfig.bin` as replacements for the Perl scripts. -```note +:::{note} There will be a transition period where the Perl scripts and compiled binaries are available before the scripts are removed. There will be no functional difference between the Perl scripts and their compiled binary counterpart. No user action is required. Once these are available, users can optionally switch to `hipcc.bin` and `hipconfig.bin`. The `hipcc`/`hipconfig` soft link will be assimilated to point from `hipcc`/`hipconfig` to the respective compiled binaries as the default option. -``` +::: #### `hipcc` options deprecation diff --git a/tools/autotag/templates/rocm_changes/5.4.3.md b/tools/autotag/templates/rocm_changes/5.4.3.md index 2b3adde8a..e2d941411 100644 --- a/tools/autotag/templates/rocm_changes/5.4.3.md +++ b/tools/autotag/templates/rocm_changes/5.4.3.md @@ -6,13 +6,13 @@ The `hipcc` and `hipconfig` Perl scripts are deprecated. In a future release, compiled binaries will be available as `hipcc.bin` and `hipconfig.bin` as replacements for the Perl scripts. -```note +:::{note} There will be a transition period where the Perl scripts and compiled binaries are available before the scripts are removed. There will be no functional difference between the Perl scripts and their compiled binary counterpart. No user action is required. Once these are available, users can optionally switch to `hipcc.bin` and `hipconfig.bin`. The `hipcc`/`hipconfig` soft link will be assimilated to point from `hipcc`/`hipconfig` to the respective compiled binaries as the default option. -``` +::: ##### Linux file system hierarchy standard for ROCm @@ -56,10 +56,10 @@ The following is the new file system hierarchy:4 ``` -```note +:::{note} ROCm will not support backward compatibility with the v5.1(old) file system hierarchy in its next major release. -``` +::: For more information, refer to . @@ -68,9 +68,9 @@ For more information, refer to . ROCm has moved header files and libraries to its new location as indicated in the above structure and included symbolic-link and wrapper header files in its old location for backward compatibility. -```note +:::{note} ROCm will continue supporting backward compatibility until the next major release. -``` +::: ##### Wrapper header files diff --git a/tools/autotag/templates/rocm_changes/5.5.0.md b/tools/autotag/templates/rocm_changes/5.5.0.md index a988b00d8..18aaab716 100644 --- a/tools/autotag/templates/rocm_changes/5.5.0.md +++ b/tools/autotag/templates/rocm_changes/5.5.0.md @@ -55,9 +55,9 @@ The following hipcc changes are implemented in this release: ##### New HIP APIs in this release -```note +:::{note} This is a pre-official version (beta) release of the new APIs and may contain unresolved issues. -``` +::: ###### Memory management HIP APIs @@ -154,13 +154,13 @@ This release consists of the following OpenMP enhancements: The `hipcc` and `hipconfig` Perl scripts are deprecated. In a future release, compiled binaries will be available as `hipcc.bin` and `hipconfig.bin` as replacements for the Perl scripts. -```note +:::{note} There will be a transition period where the Perl scripts and compiled binaries are available before the scripts are removed. There will be no functional difference between the Perl scripts and their compiled binary counterpart. No user action is required. Once these are available, users can optionally switch to `hipcc.bin` and `hipconfig.bin`. The `hipcc`/`hipconfig` soft link will be assimilated to point from `hipcc`/`hipconfig` to the respective compiled binaries as the default option. -``` +::: ##### Linux file system hierarchy standard for ROCm @@ -200,10 +200,10 @@ The following is the new file system hierarchy:4 ``` -```note +:::{note} ROCm will not support backward compatibility with the v5.1(old) file system hierarchy in its next major release. -``` +::: For more information, refer to . @@ -212,10 +212,9 @@ For more information, refer to . ROCm has moved header files and libraries to its new location as indicated in the above structure and included symbolic-link and wrapper header files in its old location for backward compatibility. -> **Note** -> -> ROCm will continue supporting backward compatibility until the next major release. - +:::{note} +ROCm will continue supporting backward compatibility until the next major release. +::: ##### Wrapper header files Wrapper header files are placed in the old location (`/opt/rocm-xxx//include`) with a diff --git a/tools/autotag/templates/rocm_changes/5.7.0.md b/tools/autotag/templates/rocm_changes/5.7.0.md index 12fa1bd89..76219eb2c 100644 --- a/tools/autotag/templates/rocm_changes/5.7.0.md +++ b/tools/autotag/templates/rocm_changes/5.7.0.md @@ -12,11 +12,11 @@ New features include: Note that ROCm 5.7.0 is EOS for MI50. 5.7 versions of ROCm are the last major releases in the ROCm 5 series. This release is Linux-only. -```important +:::{important} The next major ROCm release (ROCm 6.0) will not be backward compatible with the ROCm 5 series. Changes will include: splitting LLVM packages into more manageable sizes, changes to the HIP runtime API, splitting rocRAND and hipRAND into separate packages, and reorganizing our file structure. -``` +::: #### AMD Instinct™ MI50 end-of-support notice @@ -27,8 +27,8 @@ As outlined in [5.6.0](https://rocm.docs.amd.com/en/docs-5.6.0/release.html), RO final release for gfx906 GPUs to be in a fully supported state. * ROCm 6.0 release will show MI50s as "under maintenance" for - [Linux](../about/compatibility/linux-support.md) and - [Windows](../about/compatibility/windows-support.md) + {doc}`Linux` and + {doc}`Windows` * No new features and performance optimizations will be supported for the gfx906 GPUs beyond this major release (ROCm 5.7). @@ -62,8 +62,10 @@ environments. Users may see the following error from runtime (with AMD_LOG_LEVEL The ROCm 5.7 release introduces an alternative to the current hostcall-based implementation that leverages an older OpenCL-based printf scheme, which does not rely on hostcalls/PCIe atomics. -Note: This option is less robust than hostcall-based implementation and is intended to be a +:::{note} +This option is less robust than hostcall-based implementation and is intended to be a workaround when hostcalls do not work. +::: The printf variant is now controlled via a new compiler option -mprintf-kind=. This is supported only for HIP programs and takes the following values, @@ -96,11 +98,11 @@ the GPU in heterogeneous applications. Ideally, developers should treat heteroge OpenMP applications like pure CPU applications. However, this simplicity has not been achieved yet. Refer to the documentation on LLVM ASan with the GPU at -[LLVM AddressSanitizer User Guide](../conceptual/using_gpu_sanitizer.md). +[LLVM AddressSanitizer User Guide](../conceptual/using-gpu-sanitizer.md). -```note +:::{note} The beta release of LLVM ASan for ROCm is currently tested and validated on Ubuntu 20.04. -``` +::: #### Defect fixes diff --git a/tools/autotag/templates/rocm_changes/6.0.0.md b/tools/autotag/templates/rocm_changes/6.0.0.md index 75a4db89c..8f26615e5 100644 --- a/tools/autotag/templates/rocm_changes/6.0.0.md +++ b/tools/autotag/templates/rocm_changes/6.0.0.md @@ -226,7 +226,9 @@ HIP 6.0.0 for ROCm 6.0.0 * `char luid[8];` * `unsigned int luidDeviceNodeMask;` -Note: HIP supports LUID only on Windows OS. +:::{note} +HIP only supports LUID on Windows OS. +::: ##### Changes