# Release Notes The release notes for the ROCm platform. ------------------- ## ROCm 5.4.0 ### What's New in This Release #### HIP Enhancements The ROCm v5.4 release consists of the following HIP enhancements: ##### Support for Wall Clock64 A new timer function wall_clock64() is supported, which returns wall clock count at a constant frequency on the device. ```h long long int wall_clock64(); ``` It returns wall clock count at a constant frequency on the device, which can be queried via HIP API with the hipDeviceAttributeWallClockRate attribute of the device in the HIP application code. Example: ```h int wallClkRate = 0; //in kilohertz +HIPCHECK(hipDeviceGetAttribute(&wallClkRate, hipDeviceAttributeWallClockRate, deviceId)); ``` Where hipDeviceAttributeWallClockRate is a device attribute. > **Note** > > The wall clock frequency is a per-device attribute. ##### New Registry Added for GPU_MAX_HW_QUEUES The GPU_MAX_HW_QUEUES registry defines the maximum number of independent hardware queues allocated per process per device. The environment variable controls how many independent hardware queues HIP runtime can create per process, per device. If the application allocates more HIP streams than this number, then the HIP runtime reuses the same hardware queues for the new streams in a round-robin manner. > **Note** > > This maximum number does not apply to hardware queues created for CU-masked HIP streams or cooperative queues for HIP Cooperative Groups (there is only one queue per device). For more details, refer to the HIP Programming Guide. #### New HIP APIs in This Release The following new HIP APIs are available in the ROCm v5.4 release. > **Note** > > This is a pre-official version (beta) release of the new APIs. ##### Error Handling ```h hipError_t hipDrvGetErrorName(hipError_t hipError, const char** errorString); ``` This returns HIP errors in the text string format. ```h hipError_t hipDrvGetErrorString(hipError_t hipError, const char** errorString); ``` This returns text string messages with more details about the error. For more information, refer to the HIP API Guide. ##### HIP Tests Source Separation With ROCm v5.4, a separate GitHub project is created at This contains HIP catch2 tests and samples, and new tests will continue to develop. In future ROCm releases, catch2 tests and samples will be removed from the HIP project. ### OpenMP Enhancements This release consists of the following OpenMP enhancements: - Enable new device RTL in libomptarget as default. - New flag `-fopenmp-target-fast` to imply `-fopenmp-target-ignore-env-vars -fopenmp-assume-no-thread-state -fopenmp-assume-no-nested-parallelism`. - Support for the collapse clause and non-unit stride in cases where the No-Loop specialized kernel is generated. - Initial implementation of optimized cross-team sum reduction for float and double type scalars. - Pool-based optimization in the OpenMP runtime to reduce locking during data transfer. ### Deprecations and Warnings #### HIP Perl Scripts Deprecation The `hipcc` and `hipconfig` Perl scripts are deprecated. In a future release, compiled binaries will be available as `hipcc.bin` and `hipconfig.bin` as replacements for the Perl scripts. > **Note** > > There will be a transition period where the Perl scripts and compiled binaries are available before the scripts are removed. There will be no functional difference between the Perl scripts and their compiled binary counterpart. No user action is required. Once these are available, users can optionally switch to `hipcc.bin` and `hipconfig.bin`. The `hipcc`/`hipconfig` soft link will be assimilated to point from `hipcc`/`hipconfig` to the respective compiled binaries as the default option. (5_4_0_filesystem_reorg_deprecation_notice)= ##### Linux Filesystem Hierarchy Standard for ROCm ROCm packages have adopted the Linux foundation filesystem hierarchy standard in this release to ensure ROCm components follow open source conventions for Linux-based distributions. While moving to a new filesystem hierarchy, ROCm ensures backward compatibility with its 5.1 version or older filesystem hierarchy. See below for a detailed explanation of the new filesystem hierarchy and backward compatibility. ##### New Filesystem Hierarchy The following is the new filesystem hierarchy: ```text /opt/rocm- | --bin | --All externally exposed Binaries | --libexec | -- | -- Component specific private non-ISA executables (architecture independent) | --include | -- | --
| --lib | --lib.so -> lib.so.major -> lib.so.major.minor.patch (public libraries linked with application) | -- (component specific private library, executable data) | -- | --components | --.config.cmake | --share | --html//*.html | --info//*.[pdf, md, txt] | --man | --doc | -- | -- | -- | -- (arch independent non-executable) | --samples ``` > **Note** > > ROCm will not support backward compatibility with the v5.1(old) file system hierarchy in its next major release. For more information, refer to . ##### Backward Compatibility with Older Filesystems ROCm has moved header files and libraries to its new location as indicated in the above structure and included symbolic-link and wrapper header files in its old location for backward compatibility. > **Note** > > ROCm will continue supporting backward compatibility until the next major release. ##### Wrapper header files Wrapper header files are placed in the old location (`/opt/rocm-xxx//include`) with a warning message to include files from the new location (`/opt/rocm-xxx/include`) as shown in the example below: ```h // Code snippet from hip_runtime.h #pragma message “This file is deprecated. Use file from include path /opt/rocm-ver/include/ and prefix with hip”. #include "hip/hip_runtime.h" ``` The wrapper header files’ backward compatibility deprecation is as follows: - `#pragma` message announcing deprecation -- ROCm v5.2 release - `#pragma` message changed to `#warning` -- Future release - `#warning` changed to `#error` -- Future release - Backward compatibility wrappers removed -- Future release ##### Library files Library files are available in the `/opt/rocm-xxx/lib` folder. For backward compatibility, the old library location (`/opt/rocm-xxx//lib`) has a soft link to the library at the new location. Example: ```log $ ls -l /opt/rocm/hip/lib/ total 4 drwxr-xr-x 4 root root 4096 May 12 10:45 cmake lrwxrwxrwx 1 root root 24 May 10 23:32 libamdhip64.so -> ../../lib/libamdhip64.so ``` ##### CMake Config files All CMake configuration files are available in the `/opt/rocm-xxx/lib/cmake/` folder. For backward compatibility, the old CMake locations (`/opt/rocm-xxx//lib/cmake`) consist of a soft link to the new CMake config. Example: ```log $ ls -l /opt/rocm/hip/lib/cmake/hip/ total 0 lrwxrwxrwx 1 root root 42 May 10 23:32 hip-config.cmake -> ../../../../lib/cmake/hip/hip-config.cmake ``` ### Fixed Defects The following defects are fixed in this release. These defects were identified and documented as known issues in previous ROCm releases and are fixed in this release. #### Memory Allocated Using hipHostMalloc() with Flags Did Not Exhibit Fine-Grain Behavior ##### Issue The test was incorrectly using the `hipDeviceAttributePageableMemoryAccess` device attribute to determine coherent support. ##### Fix `hipHostMalloc()` allocates memory with fine-grained access by default when the environment variable `HIP_HOST_COHERENT=1` is used. For more information, refer to {doc}`hip:.doxygen/docBin/html/index`. #### SoftHang with `hipStreamWithCUMask` test on AMD Instinct™ ##### Issue On GFX10 GPUs, kernel execution hangs when it is launched on streams created using `hipStreamWithCUMask`. ##### Fix On GFX10 GPUs, each workgroup processor encompasses two compute units, and the compute units must be enabled as a pair. The `hipStreamWithCUMask` API unit test cases are updated to set compute unit mask (cuMask) in pairs for GFX10 GPUs. #### ROCm Tools GPU IDs The HIP language device IDs are not the same as the GPU IDs reported by the tools. GPU IDs are globally unique and guaranteed to be consistent across APIs and processes. GPU IDs reported by ROCTracer and ROCProfiler or ROCm Tools are HSA Driver Node ID of that GPU, as it is a unique ID for that device in that particular node.