mirror of
https://github.com/ROCm/ROCm.git
synced 2026-01-09 22:58:17 -05:00
Compare commits
26 Commits
docs/5.2.0
...
docs/5.0.0
| SHA1 |
|---|
| 815a7aa81d |
| 966d100108 |
| ee0084a97d |
| 110f9c37d6 |
| 0bf18fc59a |
| 5ef1b40475 |
| 1e9227e254 |
| b1e441451e |
| d3a0e85c21 |
| 1f39f027fc |
| eec431a98f |
| e651705e91 |
| 34e9bb3c2e |
| 0ace5e9db2 |
| 7062df1c67 |
| 24db72b8a8 |
| 4668632fa2 |
| 270bc73661 |
| a7ce874940 |
| e78e6a9a23 |
| 4bf9dc9560 |
| 0dd3fc9eb4 |
| 0908ed22b1 |
| 683b940a89 |
| 3f60f1df3b |
| d9543485ce |
@@ -6,9 +6,17 @@ version: 2
|
||||
sphinx:
|
||||
configuration: docs/conf.py
|
||||
|
||||
formats: [htmlzip, pdf, epub]
|
||||
formats: [htmlzip]
|
||||
|
||||
python:
|
||||
version: "3.8"
|
||||
install:
|
||||
- requirements: docs/sphinx/requirements.txt
|
||||
|
||||
build:
|
||||
os: ubuntu-22.04
|
||||
tools:
|
||||
python: "3.10"
|
||||
apt_packages:
|
||||
- "doxygen"
|
||||
- "gfortran" # For pre-processing fortran sources
|
||||
- "graphviz" # For dot graphs in doxygen
|
||||
|
||||
1452
CHANGELOG.md
File diff suppressed because it is too large
782
RELEASE.md
@@ -15,498 +15,404 @@ The release notes for the ROCm platform.
|
||||
|
||||
-------------------
|
||||
|
||||
## ROCm 5.2.0
|
||||
## ROCm 5.0.0
|
||||
<!-- markdownlint-disable first-line-h1 -->
|
||||
<!-- markdownlint-disable no-duplicate-header -->
|
||||
### What's New in This Release
|
||||
|
||||
#### HIP Enhancements
|
||||
|
||||
The ROCm v5.2 release consists of the following HIP enhancements:
|
||||
The ROCm v5.0 release consists of the following HIP enhancements.
|
||||
|
||||
##### HIP Installation Guide Updates
|
||||
|
||||
The HIP Installation Guide is updated to include building HIP tests from source on the AMD and NVIDIA platforms.
|
||||
The HIP Installation Guide is updated to include building HIP from source on the NVIDIA platform.
|
||||
|
||||
For more details, refer to the HIP Installation Guide v5.2.
|
||||
Refer to the HIP Installation Guide v5.0 for more details.
|
||||
|
||||
##### Support for device-side malloc on HIP-Clang
|
||||
##### Managed Memory Allocation
|
||||
|
||||
HIP-Clang now supports device-side malloc. This implementation does not require a call to `hipDeviceSetLimit(hipLimitMallocHeapSize, value)`, nor does it respect any such setting. The heap is fully dynamic and can grow until the available free memory on the device is consumed.
|
||||
|
||||
The test codes at the following link show how to implement applications using malloc and free functions in device kernels:
|
||||
|
||||
<https://github.com/ROCm-Developer-Tools/HIP/blob/develop/tests/src/deviceLib/hipDeviceMalloc.cpp>
|
||||
|
||||
##### New HIP APIs in This Release
|
||||
|
||||
The following new HIP APIs are available in the ROCm v5.2 release. Note that these new APIs are released as a beta (pre-official) version:
|
||||
|
||||
###### Device management HIP APIs
|
||||
|
||||
The new device management HIP APIs are as follows:
|
||||
|
||||
- Gets a UUID for the device
|
||||
|
||||
```h
|
||||
hipError_t hipDeviceGetUuid(hipUUID* uuid, hipDevice_t device);
|
||||
```
|
||||
|
||||
> **Note**
|
||||
>
|
||||
> This new API corresponds to the following CUDA API:
|
||||
>
|
||||
> ```h
|
||||
> CUresult cuDeviceGetUuid(CUuuid* uuid, CUdevice dev);
|
||||
> ```
|
||||
|
||||
- Gets default memory pool of the specified device
|
||||
|
||||
```h
|
||||
hipError_t hipDeviceGetDefaultMemPool(hipMemPool_t* mem_pool, int device);
|
||||
```
|
||||
|
||||
- Sets the current memory pool of a device
|
||||
|
||||
```h
|
||||
hipError_t hipDeviceSetMemPool(int device, hipMemPool_t mem_pool);
|
||||
```
|
||||
|
||||
- Gets the current memory pool for the specified device
|
||||
|
||||
```h
|
||||
hipError_t hipDeviceGetMemPool(hipMemPool_t* mem_pool, int device);
|
||||
```
|
||||
|
||||
###### New HIP Runtime APIs in Memory Management
|
||||
|
||||
The new Stream Ordered Memory Allocator functions of HIP runtime APIs in memory management are as follows:
|
||||
|
||||
- Allocates memory with stream ordered semantics
|
||||
|
||||
```h
|
||||
hipError_t hipMallocAsync(void** dev_ptr, size_t size, hipStream_t stream);
|
||||
```
|
||||
|
||||
- Frees memory with stream ordered semantics
|
||||
|
||||
```h
|
||||
hipError_t hipFreeAsync(void* dev_ptr, hipStream_t stream);
|
||||
```
|
||||
|
||||
- Releases freed memory back to the OS
|
||||
|
||||
```h
|
||||
hipError_t hipMemPoolTrimTo(hipMemPool_t mem_pool, size_t min_bytes_to_hold);
|
||||
```
|
||||
|
||||
- Sets attributes of a memory pool
|
||||
|
||||
```h
|
||||
hipError_t hipMemPoolSetAttribute(hipMemPool_t mem_pool, hipMemPoolAttr attr, void* value);
|
||||
```
|
||||
|
||||
- Gets attributes of a memory pool
|
||||
|
||||
```h
|
||||
hipError_t hipMemPoolGetAttribute(hipMemPool_t mem_pool, hipMemPoolAttr attr, void* value);
|
||||
```
|
||||
|
||||
- Controls visibility of the specified pool between devices
|
||||
|
||||
```h
|
||||
hipError_t hipMemPoolSetAccess(hipMemPool_t mem_pool, const hipMemAccessDesc* desc_list, size_t count);
|
||||
```
|
||||
|
||||
- Returns the accessibility of a pool from a device
|
||||
|
||||
```h
|
||||
hipError_t hipMemPoolGetAccess(hipMemAccessFlags* flags, hipMemPool_t mem_pool, hipMemLocation* location);
|
||||
```
|
||||
|
||||
- Creates a memory pool
|
||||
|
||||
```h
|
||||
hipError_t hipMemPoolCreate(hipMemPool_t* mem_pool, const hipMemPoolProps* pool_props);
|
||||
```
|
||||
|
||||
- Destroys the specified memory pool
|
||||
|
||||
```h
|
||||
hipError_t hipMemPoolDestroy(hipMemPool_t mem_pool);
|
||||
```
|
||||
|
||||
- Allocates memory from a specified pool with stream ordered semantics
|
||||
|
||||
```h
|
||||
hipError_t hipMallocFromPoolAsync(void** dev_ptr, size_t size, hipMemPool_t mem_pool, hipStream_t stream);
|
||||
```
|
||||
|
||||
- Exports a memory pool to the requested handle type
|
||||
|
||||
```h
|
||||
hipError_t hipMemPoolExportToShareableHandle(
    void* shared_handle,
    hipMemPool_t mem_pool,
    hipMemAllocationHandleType handle_type,
    unsigned int flags);
|
||||
```
|
||||
|
||||
- Imports a memory pool from a shared handle
|
||||
|
||||
```h
|
||||
hipError_t hipMemPoolImportFromShareableHandle(
    hipMemPool_t* mem_pool,
    void* shared_handle,
    hipMemAllocationHandleType handle_type,
    unsigned int flags);
|
||||
```
|
||||
|
||||
- Exports data to share a memory pool allocation between processes
|
||||
|
||||
```h
|
||||
hipError_t hipMemPoolExportPointer(hipMemPoolPtrExportData* export_data, void* dev_ptr);

// Imports a memory pool allocation from another process
hipError_t hipMemPoolImportPointer(
    void** dev_ptr,
    hipMemPool_t mem_pool,
    hipMemPoolPtrExportData* export_data);
|
||||
```
|
||||
|
||||
###### HIP Graph Management APIs
|
||||
|
||||
The new HIP Graph Management APIs are as follows:
|
||||
|
||||
- Enqueues a host function call in a stream
|
||||
|
||||
```h
|
||||
hipError_t hipLaunchHostFunc(hipStream_t stream, hipHostFn_t fn, void* userData);
|
||||
```
|
||||
|
||||
- Swaps the stream capture mode of a thread
|
||||
|
||||
```h
|
||||
hipError_t hipThreadExchangeStreamCaptureMode(hipStreamCaptureMode* mode);
|
||||
```
|
||||
|
||||
- Sets a node attribute
|
||||
|
||||
```h
|
||||
hipError_t hipGraphKernelNodeSetAttribute(hipGraphNode_t hNode, hipKernelNodeAttrID attr, const hipKernelNodeAttrValue* value);
|
||||
```
|
||||
|
||||
- Gets a node attribute
|
||||
|
||||
```h
|
||||
hipError_t hipGraphKernelNodeGetAttribute(hipGraphNode_t hNode, hipKernelNodeAttrID attr, hipKernelNodeAttrValue* value);
|
||||
```
|
||||
|
||||
###### Support for Virtual Memory Management APIs
|
||||
|
||||
The new APIs for virtual memory management are as follows:
|
||||
|
||||
- Frees an address range reservation made via hipMemAddressReserve
|
||||
|
||||
```h
|
||||
hipError_t hipMemAddressFree(void* devPtr, size_t size);
|
||||
```
|
||||
|
||||
- Reserves an address range
|
||||
|
||||
```h
|
||||
hipError_t hipMemAddressReserve(void** ptr, size_t size, size_t alignment, void* addr, unsigned long long flags);
|
||||
```
|
||||
|
||||
- Creates a memory allocation described by the properties and size
|
||||
|
||||
```h
|
||||
hipError_t hipMemCreate(hipMemGenericAllocationHandle_t* handle, size_t size, const hipMemAllocationProp* prop, unsigned long long flags);
|
||||
```
|
||||
|
||||
- Exports an allocation to a requested shareable handle type
|
||||
|
||||
```h
|
||||
hipError_t hipMemExportToShareableHandle(void* shareableHandle, hipMemGenericAllocationHandle_t handle, hipMemAllocationHandleType handleType, unsigned long long flags);
|
||||
```
|
||||
|
||||
- Gets the access flags set for the given location and ptr
|
||||
|
||||
```h
|
||||
hipError_t hipMemGetAccess(unsigned long long* flags, const hipMemLocation* location, void* ptr);
|
||||
```
|
||||
|
||||
- Calculates either the minimal or recommended granularity
|
||||
|
||||
```h
|
||||
hipError_t hipMemGetAllocationGranularity(size_t* granularity, const hipMemAllocationProp* prop, hipMemAllocationGranularity_flags option);
|
||||
```
|
||||
|
||||
- Retrieves the property structure of the given handle
|
||||
|
||||
```h
|
||||
hipError_t hipMemGetAllocationPropertiesFromHandle(hipMemAllocationProp* prop, hipMemGenericAllocationHandle_t handle);
|
||||
```
|
||||
|
||||
- Imports an allocation from a requested shareable handle type
|
||||
|
||||
```h
|
||||
hipError_t hipMemImportFromShareableHandle(hipMemGenericAllocationHandle_t* handle, void* osHandle, hipMemAllocationHandleType shHandleType);
|
||||
```
|
||||
|
||||
- Maps an allocation handle to a reserved virtual address range
|
||||
|
||||
```h
|
||||
hipError_t hipMemMap(void* ptr, size_t size, size_t offset, hipMemGenericAllocationHandle_t handle, unsigned long long flags);
|
||||
```
|
||||
|
||||
- Maps or unmaps subregions of sparse HIP arrays and sparse HIP mipmapped arrays
|
||||
|
||||
```h
|
||||
hipError_t hipMemMapArrayAsync(hipArrayMapInfo* mapInfoList, unsigned int count, hipStream_t stream);
|
||||
```
|
||||
|
||||
- Releases a memory handle representing a memory allocation that was previously allocated through hipMemCreate
|
||||
|
||||
```h
|
||||
hipError_t hipMemRelease(hipMemGenericAllocationHandle_t handle);
|
||||
```
|
||||
|
||||
- Returns the allocation handle of the backing memory allocation given the address
|
||||
|
||||
```h
|
||||
hipError_t hipMemRetainAllocationHandle(hipMemGenericAllocationHandle_t* handle, void* addr);
|
||||
```
|
||||
|
||||
- Sets the access flags for each location specified in desc for the given virtual address range
|
||||
|
||||
```h
|
||||
hipError_t hipMemSetAccess(void* ptr, size_t size, const hipMemAccessDesc* desc, size_t count);
|
||||
```
|
||||
|
||||
- Unmaps memory allocation of a given address range
|
||||
|
||||
```h
|
||||
hipError_t hipMemUnmap(void* ptr, size_t size);
|
||||
```
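
The granularity reported by `hipMemGetAllocationGranularity` is typically used to pad a requested size before it is passed to calls such as `hipMemAddressReserve`, `hipMemCreate`, and `hipMemMap`. The following is a minimal sketch of that padding step only. It assumes, by analogy with the corresponding CUDA driver APIs, that sizes need to be multiples of the granularity. The helper name `round_up_to_granularity` is hypothetical and is not part of HIP.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not a HIP API): rounds size up to the next multiple
// of granularity. Overflow for sizes near SIZE_MAX is not handled here.
constexpr std::size_t round_up_to_granularity(std::size_t size, std::size_t granularity) {
    assert(granularity != 0);  // a zero granularity would divide by zero below
    return ((size + granularity - 1) / granularity) * granularity;
}

// Compile-time checks of the rounding behavior.
static_assert(round_up_to_granularity(1, 4096) == 4096, "small sizes round up to one unit");
static_assert(round_up_to_granularity(4096, 4096) == 4096, "aligned sizes are unchanged");
static_assert(round_up_to_granularity(4097, 4096) == 8192, "one byte over needs two units");
```

In an application, the `granularity` argument would be the value written by `hipMemGetAllocationGranularity`, and the padded size would be reused for the reserve, create, and map calls so that they all agree.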
|
||||
|
||||
For more information, refer to the HIP API documentation at
|
||||
{doc}`hip:.doxygen/docBin/html/modules`.
|
||||
|
||||
##### Planned HIP Changes in Future Releases
|
||||
|
||||
Changes to `hipDeviceProp_t`, `HIPMEMCPY_3D`, and `hipArray` structures (and related HIP APIs) are planned in the next major release. These changes may impact backward compatibility.
|
||||
|
||||
Refer to the Release Notes document in subsequent releases for more information.
|
||||
#### ROCm Math and Communication Libraries
|
||||
|
||||
In this release, ROCm Math and Communication Libraries consist of the following enhancements and fixes:
|
||||
##### New rocWMMA for Matrix Multiplication and Accumulation Operations Acceleration
|
||||
|
||||
This release introduces a new ROCm C++ library for accelerating mixed precision matrix multiplication and accumulation (MFMA) operations leveraging specialized GPU matrix cores. rocWMMA provides a C++ API to facilitate breaking down matrix multiply accumulate problems into fragments and using them in block-wise operations that are distributed in parallel across GPU wavefronts. The API is a header library of GPU device code, meaning matrix core acceleration may be compiled directly into your kernel device code. This can benefit from compiler optimization in the generation of kernel assembly and does not incur additional overhead costs of linking to external runtime libraries or having to launch separate kernels.
|
||||
|
||||
rocWMMA is released as a header library and includes test and sample projects to validate and illustrate example usages of the C++ API. GEMM matrix multiplication is used as primary validation given the heavy precedent for the library. However, the usage portfolio is growing significantly and demonstrates different ways rocWMMA may be consumed.
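
To illustrate the idea of breaking a matrix multiply-accumulate problem into fragments that are processed block-wise, here is a minimal CPU-only sketch in plain C++. It does not use the rocWMMA API. The names `kTile`, `mma_tile`, and `tiled_mma` are hypothetical, and the tile size of 2 is chosen only to keep the example small.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical fragment (tile) edge length; real matrix-core tile shapes depend on the GPU.
constexpr std::size_t kTile = 2;

// Row-major square matrix of side n, stored flat.
using Matrix = std::vector<float>;

// Accumulates one kTile x kTile block product of a and b into the matching block of acc.
// On a GPU, independent output blocks like this one would be spread across wavefronts.
void mma_tile(const Matrix& a, const Matrix& b, Matrix& acc, std::size_t n,
              std::size_t row0, std::size_t col0, std::size_t k0) {
    for (std::size_t i = row0; i < row0 + kTile; ++i)
        for (std::size_t j = col0; j < col0 + kTile; ++j)
            for (std::size_t k = k0; k < k0 + kTile; ++k)
                acc[i * n + j] += a[i * n + k] * b[k * n + j];
}

// Computes a*b + c by walking the output in tiles; n must be a multiple of kTile.
Matrix tiled_mma(const Matrix& a, const Matrix& b, const Matrix& c, std::size_t n) {
    assert(n % kTile == 0);
    assert(a.size() == n * n && b.size() == n * n && c.size() == n * n);
    Matrix d = c;  // start from the accumulator input
    for (std::size_t row0 = 0; row0 < n; row0 += kTile)
        for (std::size_t col0 = 0; col0 < n; col0 += kTile)
            for (std::size_t k0 = 0; k0 < n; k0 += kTile)
                mma_tile(a, b, d, n, row0, col0, k0);
    return d;
}
```

Each call to `mma_tile` touches only one output block, which is what makes the block-wise decomposition easy to distribute in parallel.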
|
||||
|
||||
For more information, refer to
|
||||
[Communication Libraries](../../../../docs/reference/gpu_libraries/communication.md).
|
||||
|
||||
#### OpenMP Enhancements in This Release
|
||||
|
||||
##### OMPT Target Support
|
||||
|
||||
The OpenMP runtime in ROCm implements a subset of the OMPT device APIs, as described in the OpenMP specification document. These are APIs that allow first-party tools to examine the profile and traces for kernels that execute on a device. A tool may register callbacks for data transfer and kernel dispatch entry points. A tool may use APIs to start and stop tracing for device-related activities such as data transfer and kernel dispatch timings and associated metadata. If device tracing is enabled, trace records for device activities are collected during program execution and returned to the tool using the APIs described in the specification.
|
||||
|
||||
The following example demonstrates how a tool would use the supported OMPT target APIs. The README in `/opt/rocm/llvm/examples/tools/ompt` outlines the steps to follow, and you can run the provided example as indicated below:
|
||||
|
||||
```sh
|
||||
cd /opt/rocm/llvm/examples/tools/ompt/veccopy-ompt-target-tracing
|
||||
make run
|
||||
```
|
||||
|
||||
The file `veccopy-ompt-target-tracing.c` simulates how a tool would initiate device activity tracing. The file `callbacks.h` shows the callbacks that may be registered and implemented by the tool.
|
||||
|
||||
### Deprecations and Warnings
|
||||
|
||||
#### Linux Filesystem Hierarchy Standard for ROCm
|
||||
|
||||
In this release, ROCm packages adopt the Linux Foundation Filesystem Hierarchy Standard (FHS) so that ROCm components follow open source conventions for Linux-based distributions. While moving to the new filesystem hierarchy, ROCm maintains backward compatibility with the filesystem hierarchy used in version 5.1 and older. See below for a detailed explanation of the new filesystem hierarchy and backward compatibility.
|
||||
|
||||
##### New Filesystem Hierarchy
|
||||
|
||||
The following is the new filesystem hierarchy:
|
||||
|
||||
```text
/opt/rocm-<ver>
    | --bin
        | --All externally exposed Binaries
    | --libexec
        | --<component>
            | -- Component specific private non-ISA executables (architecture independent)
    | --include
        | -- <component>
            | --<header files>
    | --lib
        | --lib<soname>.so -> lib<soname>.so.major -> lib<soname>.so.major.minor.patch
            (public libraries linked with application)
        | --<component> (component specific private library, executable data)
        | --<cmake>
            | --components
                | --<component>.config.cmake
    | --share
        | --html/<component>/*.html
        | --info/<component>/*.[pdf, md, txt]
        | --man
        | --doc
            | --<component>
                | --<licenses>
        | --<component>
            | --<misc files> (arch independent non-executable)
            | --samples
```
|
||||
Managed memory, including the `__managed__` keyword, is now supported in the HIP combined host/device compilation. Through unified memory allocation, managed memory allows data to be shared between, and accessed by, both the CPU and GPU using a single pointer. The allocation is managed by the AMD GPU driver using the Linux Heterogeneous Memory Management (HMM) mechanism. An application can call the managed memory API `hipMallocManaged` to allocate a large chunk of HMM memory, execute kernels on a device, and fetch data between the host and device as needed.
|
||||
|
||||
> **Note**
|
||||
>
|
||||
> ROCm will not support backward compatibility with the v5.1 (old) filesystem hierarchy in its next major release.
|
||||
|
||||
For more information, refer to <https://refspecs.linuxfoundation.org/fhs.shtml>.
|
||||
|
||||
##### Backward Compatibility with Older Filesystems
|
||||
|
||||
ROCm has moved header files and libraries to their new locations, as indicated in the above structure, and has placed symbolic links and wrapper header files in the old locations for backward compatibility.
|
||||
> In a HIP application, it is recommended to do a capability check before calling the managed memory APIs. For example,
|
||||
>
|
||||
> ```cpp
> int managed_memory = 0;
> HIPCHECK(hipDeviceGetAttribute(&managed_memory,
>   hipDeviceAttributeManagedMemory, p_gpuDevice));
> if (!managed_memory) {
>   printf("info: managed memory access not supported on the device %d\n Skipped\n", p_gpuDevice);
> }
> else {
>   HIPCHECK(hipSetDevice(p_gpuDevice));
>   HIPCHECK(hipMallocManaged(&Hmm, N * sizeof(T)));
>   . . .
> }
> ```
|
||||
|
||||
> **Note**
|
||||
>
|
||||
> ROCm will continue supporting backward compatibility until the next major release.
|
||||
> The managed memory capability check may not be necessary; however, if HMM is not supported, managed malloc will fall back to using system memory. Other managed memory API calls will, then, have
|
||||
|
||||
##### Wrapper header files
|
||||
Refer to the HIP API documentation for more details on managed memory APIs.
|
||||
|
||||
Wrapper header files are placed in the old location (`/opt/rocm-xxx/<component>/include`) with a warning message to include files from the new location (`/opt/rocm-xxx/include`) as shown in the example below:
|
||||
For the application, see
|
||||
|
||||
```h
|
||||
// Code snippet from hip_runtime.h
|
||||
#pragma message "This file is deprecated. Use file from include path /opt/rocm-ver/include/ and prefix with hip"
|
||||
#include "hip/hip_runtime.h"
|
||||
<https://github.com/ROCm-Developer-Tools/HIP/blob/rocm-4.5.x/tests/src/runtimeApi/memory/hipMallocManaged.cpp>
|
||||
|
||||
#### New Environment Variable
|
||||
|
||||
The following new environment variable is added in this release:
|
||||
|
||||
| Environment Variable | Value | Description |
|----------------------|-----------------------|-------------|
| HSA_COOP_CU_COUNT | 0 or 1 (default is 0) | Some processors support more CUs than can reliably be used in a cooperative dispatch. Setting the environment variable HSA_COOP_CU_COUNT to 1 will cause ROCr to return the correct CU count for cooperative groups through the HSA_AMD_AGENT_INFO_COOPERATIVE_COMPUTE_UNIT_COUNT attribute of hsa_agent_get_info(). Setting HSA_COOP_CU_COUNT to other values, or leaving it unset, will cause ROCr to return the same CU count for the attributes HSA_AMD_AGENT_INFO_COOPERATIVE_COMPUTE_UNIT_COUNT and HSA_AMD_AGENT_INFO_COMPUTE_UNIT_COUNT. Future ROCm releases will make HSA_COOP_CU_COUNT=1 the default. |
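
The behavior described in the table can be modeled with a small sketch in plain C++. This is an illustration of the documented semantics only, not ROCr code. The helper names and the CU counts in the usage note are hypothetical, and how ROCr actually parses the variable (for example, surrounding whitespace) is not specified here, so the exact string match against `"1"` is an assumption.

```cpp
#include <cstdlib>
#include <cstring>

// Hypothetical helper (not part of ROCr): true only when the variable is set to "1".
// Any other value, or an unset variable (nullptr), counts as disabled.
bool coop_cu_count_enabled(const char* env_value) {
    return env_value != nullptr && std::strcmp(env_value, "1") == 0;
}

// Models the value a query for the cooperative CU count would report:
// the reduced cooperative count when enabled, otherwise the full CU count.
unsigned reported_coop_cu_count(const char* env_value, unsigned total_cus, unsigned coop_cus) {
    return coop_cu_count_enabled(env_value) ? coop_cus : total_cus;
}
```

For example, `reported_coop_cu_count(std::getenv("HSA_COOP_CU_COUNT"), 120, 104)` would yield 104 only when the variable is set to `1`. The values 120 and 104 are made up for illustration.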
|
||||
|
||||
### Breaking Changes
|
||||
|
||||
#### Runtime Breaking Change
|
||||
|
||||
The `hipDeviceAttribute_t` enumerated type in `hip_runtime_api.h` has been reordered to better match NVIDIA CUDA. See below for the difference in enumerated types.

ROCm software is affected if it uses any of the enumerators listed below. Applications built with the ROCm v5.0 enumerated types work with a ROCm v4.5.2 driver. However, a ROCm v4.5.2 application that uses these enumerated types exhibits undefined behavior with a ROCm v5.0 runtime.
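
Reordering an enumeration breaks already-compiled applications because enumerators are compiled into the binary as plain integers. The following minimal sketch in plain C++ models this with two heavily shortened, hypothetical versions of an attribute enumeration. The namespaces and the `runtime_lookup` function are illustrative stand-ins, not HIP code.

```cpp
#include <string>

// Hypothetical "old header": MaxThreadsPerBlock is the first enumerator (value 0).
namespace old_header { enum Attribute { MaxThreadsPerBlock, MaxBlockDimX, ClockRate }; }

// Hypothetical "new header": same names, different order, so different integer values.
namespace new_header { enum Attribute { EccEnabled, ClockRate, MaxBlockDimX, MaxThreadsPerBlock }; }

// Stand-in for a runtime built against the new header: it interprets the raw
// integer it receives using the new numbering.
std::string runtime_lookup(int raw_value) {
    switch (raw_value) {
        case new_header::EccEnabled:         return "EccEnabled";
        case new_header::ClockRate:          return "ClockRate";
        case new_header::MaxBlockDimX:       return "MaxBlockDimX";
        case new_header::MaxThreadsPerBlock: return "MaxThreadsPerBlock";
        default:                             return "unknown";
    }
}

// Stand-in for an application compiled against the old header: it asks for
// MaxThreadsPerBlock, but the integer 0 now means EccEnabled to the runtime.
std::string old_app_query() {
    return runtime_lookup(old_header::MaxThreadsPerBlock);
}
```

In this model, `old_app_query()` receives the answer for a different attribute without any error being reported, which is why rebuilding the application against the new headers is the safe path.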
|
||||
|
||||
```diff
|
||||
typedef enum hipDeviceAttribute_t {
|
||||
- hipDeviceAttributeMaxThreadsPerBlock, ///< Maximum number of threads per block.
|
||||
- hipDeviceAttributeMaxBlockDimX, ///< Maximum x-dimension of a block.
|
||||
- hipDeviceAttributeMaxBlockDimY, ///< Maximum y-dimension of a block.
|
||||
- hipDeviceAttributeMaxBlockDimZ, ///< Maximum z-dimension of a block.
|
||||
- hipDeviceAttributeMaxGridDimX, ///< Maximum x-dimension of a grid.
|
||||
- hipDeviceAttributeMaxGridDimY, ///< Maximum y-dimension of a grid.
|
||||
- hipDeviceAttributeMaxGridDimZ, ///< Maximum z-dimension of a grid.
|
||||
- hipDeviceAttributeMaxSharedMemoryPerBlock, ///< Maximum shared memory available per block in
|
||||
- ///< bytes.
|
||||
- hipDeviceAttributeTotalConstantMemory, ///< Constant memory size in bytes.
|
||||
- hipDeviceAttributeWarpSize, ///< Warp size in threads.
|
||||
- hipDeviceAttributeMaxRegistersPerBlock, ///< Maximum number of 32-bit registers available to a
|
||||
- ///< thread block. This number is shared by all thread
|
||||
- ///< blocks simultaneously resident on a
|
||||
- ///< multiprocessor.
|
||||
- hipDeviceAttributeClockRate, ///< Peak clock frequency in kilohertz.
|
||||
- hipDeviceAttributeMemoryClockRate, ///< Peak memory clock frequency in kilohertz.
|
||||
- hipDeviceAttributeMemoryBusWidth, ///< Global memory bus width in bits.
|
||||
- hipDeviceAttributeMultiprocessorCount, ///< Number of multiprocessors on the device.
|
||||
- hipDeviceAttributeComputeMode, ///< Compute mode that device is currently in.
|
||||
- hipDeviceAttributeL2CacheSize, ///< Size of L2 cache in bytes. 0 if the device doesn't have L2
|
||||
- ///< cache.
|
||||
- hipDeviceAttributeMaxThreadsPerMultiProcessor, ///< Maximum resident threads per
|
||||
- ///< multiprocessor.
|
||||
- hipDeviceAttributeComputeCapabilityMajor, ///< Major compute capability version number.
|
||||
- hipDeviceAttributeComputeCapabilityMinor, ///< Minor compute capability version number.
|
||||
- hipDeviceAttributeConcurrentKernels, ///< Device can possibly execute multiple kernels
|
||||
- ///< concurrently.
|
||||
- hipDeviceAttributePciBusId, ///< PCI Bus ID.
|
||||
- hipDeviceAttributePciDeviceId, ///< PCI Device ID.
|
||||
- hipDeviceAttributeMaxSharedMemoryPerMultiprocessor, ///< Maximum Shared Memory Per
|
||||
- ///< Multiprocessor.
|
||||
- hipDeviceAttributeIsMultiGpuBoard, ///< Multiple GPU devices.
|
||||
- hipDeviceAttributeIntegrated, ///< iGPU
|
||||
- hipDeviceAttributeCooperativeLaunch, ///< Support cooperative launch
|
||||
- hipDeviceAttributeCooperativeMultiDeviceLaunch, ///< Support cooperative launch on multiple devices
|
||||
- hipDeviceAttributeMaxTexture1DWidth, ///< Maximum number of elements in 1D images
|
||||
- hipDeviceAttributeMaxTexture2DWidth, ///< Maximum dimension width of 2D images in image elements
|
||||
- hipDeviceAttributeMaxTexture2DHeight, ///< Maximum dimension height of 2D images in image elements
|
||||
- hipDeviceAttributeMaxTexture3DWidth, ///< Maximum dimension width of 3D images in image elements
|
||||
- hipDeviceAttributeMaxTexture3DHeight, ///< Maximum dimensions height of 3D images in image elements
|
||||
- hipDeviceAttributeMaxTexture3DDepth, ///< Maximum dimensions depth of 3D images in image elements
|
||||
+ hipDeviceAttributeCudaCompatibleBegin = 0,
|
||||
|
||||
- hipDeviceAttributeHdpMemFlushCntl, ///< Address of the HDP_MEM_COHERENCY_FLUSH_CNTL register
|
||||
- hipDeviceAttributeHdpRegFlushCntl, ///< Address of the HDP_REG_COHERENCY_FLUSH_CNTL register
|
||||
+ hipDeviceAttributeEccEnabled = hipDeviceAttributeCudaCompatibleBegin, ///< Whether ECC support is enabled.
|
||||
+ hipDeviceAttributeAccessPolicyMaxWindowSize, ///< Cuda only. The maximum size of the window policy in bytes.
|
||||
+ hipDeviceAttributeAsyncEngineCount, ///< Cuda only. Asynchronous engines number.
|
||||
+ hipDeviceAttributeCanMapHostMemory, ///< Whether host memory can be mapped into device address space
|
||||
+ hipDeviceAttributeCanUseHostPointerForRegisteredMem,///< Cuda only. Device can access host registered memory
|
||||
+ ///< at the same virtual address as the CPU
|
||||
+ hipDeviceAttributeClockRate, ///< Peak clock frequency in kilohertz.
|
||||
+ hipDeviceAttributeComputeMode, ///< Compute mode that device is currently in.
|
||||
+ hipDeviceAttributeComputePreemptionSupported, ///< Cuda only. Device supports Compute Preemption.
|
||||
+ hipDeviceAttributeConcurrentKernels, ///< Device can possibly execute multiple kernels concurrently.
|
||||
+ hipDeviceAttributeConcurrentManagedAccess, ///< Device can coherently access managed memory concurrently with the CPU
|
||||
+ hipDeviceAttributeCooperativeLaunch, ///< Support cooperative launch
|
||||
+ hipDeviceAttributeCooperativeMultiDeviceLaunch, ///< Support cooperative launch on multiple devices
|
||||
+ hipDeviceAttributeDeviceOverlap, ///< Cuda only. Device can concurrently copy memory and execute a kernel.
|
||||
+ ///< Deprecated. Use instead asyncEngineCount.
|
||||
+ hipDeviceAttributeDirectManagedMemAccessFromHost, ///< Host can directly access managed memory on
|
||||
+ ///< the device without migration
|
||||
+ hipDeviceAttributeGlobalL1CacheSupported, ///< Cuda only. Device supports caching globals in L1
|
||||
+ hipDeviceAttributeHostNativeAtomicSupported, ///< Cuda only. Link between the device and the host supports native atomic operations
|
||||
+ hipDeviceAttributeIntegrated, ///< Device is integrated GPU
|
||||
+ hipDeviceAttributeIsMultiGpuBoard, ///< Multiple GPU devices.
|
||||
+ hipDeviceAttributeKernelExecTimeout, ///< Run time limit for kernels executed on the device
|
||||
+ hipDeviceAttributeL2CacheSize, ///< Size of L2 cache in bytes. 0 if the device doesn't have L2 cache.
|
||||
+ hipDeviceAttributeLocalL1CacheSupported, ///< caching locals in L1 is supported
|
||||
+ hipDeviceAttributeLuid, ///< Cuda only. 8-byte locally unique identifier. Undefined on TCC and non-Windows platforms
|
||||
+ hipDeviceAttributeLuidDeviceNodeMask, ///< Cuda only. Luid device node mask. Undefined on TCC and non-Windows platforms
|
||||
+ hipDeviceAttributeComputeCapabilityMajor, ///< Major compute capability version number.
|
||||
+ hipDeviceAttributeManagedMemory, ///< Device supports allocating managed memory on this system
|
||||
+ hipDeviceAttributeMaxBlocksPerMultiProcessor, ///< Cuda only. Max block size per multiprocessor
|
||||
+ hipDeviceAttributeMaxBlockDimX, ///< Max block size in width.
|
||||
+ hipDeviceAttributeMaxBlockDimY, ///< Max block size in height.
|
||||
+ hipDeviceAttributeMaxBlockDimZ, ///< Max block size in depth.
|
||||
+ hipDeviceAttributeMaxGridDimX, ///< Max grid size in width.
|
||||
+ hipDeviceAttributeMaxGridDimY, ///< Max grid size in height.
|
||||
+ hipDeviceAttributeMaxGridDimZ, ///< Max grid size in depth.
|
||||
+ hipDeviceAttributeMaxSurface1D, ///< Maximum size of 1D surface.
|
||||
+ hipDeviceAttributeMaxSurface1DLayered, ///< Cuda only. Maximum dimensions of 1D layered surface.
|
||||
+ hipDeviceAttributeMaxSurface2D, ///< Maximum dimension (width, height) of 2D surface.
|
||||
+ hipDeviceAttributeMaxSurface2DLayered, ///< Cuda only. Maximum dimensions of 2D layered surface.
|
||||
+ hipDeviceAttributeMaxSurface3D, ///< Maximum dimension (width, height, depth) of 3D surface.
|
||||
+ hipDeviceAttributeMaxSurfaceCubemap, ///< Cuda only. Maximum dimensions of Cubemap surface.
|
||||
+ hipDeviceAttributeMaxSurfaceCubemapLayered, ///< Cuda only. Maximum dimension of Cubemap layered surface.
|
||||
+ hipDeviceAttributeMaxTexture1DWidth, ///< Maximum size of 1D texture.
|
||||
+ hipDeviceAttributeMaxTexture1DLayered, ///< Cuda only. Maximum dimensions of 1D layered texture.
|
||||
+ hipDeviceAttributeMaxTexture1DLinear, ///< Maximum number of elements allocatable in a 1D linear texture.
|
||||
+ ///< Use cudaDeviceGetTexture1DLinearMaxWidth() instead on Cuda.
|
||||
+ hipDeviceAttributeMaxTexture1DMipmap, ///< Cuda only. Maximum size of 1D mipmapped texture.
|
||||
+ hipDeviceAttributeMaxTexture2DWidth, ///< Maximum dimension width of 2D texture.
|
||||
+ hipDeviceAttributeMaxTexture2DHeight, ///< Maximum dimension height of 2D texture.
|
||||
+ hipDeviceAttributeMaxTexture2DGather, ///< Cuda only. Maximum dimensions of 2D texture if gather operations performed.
|
||||
+ hipDeviceAttributeMaxTexture2DLayered, ///< Cuda only. Maximum dimensions of 2D layered texture.
|
||||
+ hipDeviceAttributeMaxTexture2DLinear, ///< Cuda only. Maximum dimensions (width, height, pitch) of 2D textures bound to pitched memory.
|
||||
+ hipDeviceAttributeMaxTexture2DMipmap, ///< Cuda only. Maximum dimensions of 2D mipmapped texture.
|
||||
+ hipDeviceAttributeMaxTexture3DWidth, ///< Maximum dimension width of 3D texture.
|
||||
+ hipDeviceAttributeMaxTexture3DHeight, ///< Maximum dimension height of 3D texture.
|
||||
+ hipDeviceAttributeMaxTexture3DDepth, ///< Maximum dimension depth of 3D texture.
|
||||
+ hipDeviceAttributeMaxTexture3DAlt, ///< Cuda only. Maximum dimensions of alternate 3D texture.
|
||||
+ hipDeviceAttributeMaxTextureCubemap, ///< Cuda only. Maximum dimensions of Cubemap texture
|
||||
+ hipDeviceAttributeMaxTextureCubemapLayered, ///< Cuda only. Maximum dimensions of Cubemap layered texture.
|
||||
+ hipDeviceAttributeMaxThreadsDim, ///< Maximum dimension of a block
|
||||
+ hipDeviceAttributeMaxThreadsPerBlock, ///< Maximum number of threads per block.
|
||||
+ hipDeviceAttributeMaxThreadsPerMultiProcessor, ///< Maximum resident threads per multiprocessor.
|
||||
+ hipDeviceAttributeMaxPitch, ///< Maximum pitch in bytes allowed by memory copies
|
||||
+ hipDeviceAttributeMemoryBusWidth, ///< Global memory bus width in bits.
|
||||
+ hipDeviceAttributeMemoryClockRate, ///< Peak memory clock frequency in kilohertz.
|
||||
+ hipDeviceAttributeComputeCapabilityMinor, ///< Minor compute capability version number.
|
||||
+ hipDeviceAttributeMultiGpuBoardGroupID, ///< Cuda only. Unique ID of device group on the same multi-GPU board
|
||||
+ hipDeviceAttributeMultiprocessorCount, ///< Number of multiprocessors on the device.
|
||||
+ hipDeviceAttributeName, ///< Device name.
|
||||
+ hipDeviceAttributePageableMemoryAccess, ///< Device supports coherently accessing pageable memory
|
||||
+ ///< without calling hipHostRegister on it
|
||||
+ hipDeviceAttributePageableMemoryAccessUsesHostPageTables, ///< Device accesses pageable memory via the host's page tables
|
||||
+ hipDeviceAttributePciBusId, ///< PCI Bus ID.
|
||||
+ hipDeviceAttributePciDeviceId, ///< PCI Device ID.
|
||||
+ hipDeviceAttributePciDomainID, ///< PCI Domain ID.
|
||||
+ hipDeviceAttributePersistingL2CacheMaxSize, ///< Cuda11 only. Maximum l2 persisting lines capacity in bytes
|
||||
+ hipDeviceAttributeMaxRegistersPerBlock, ///< 32-bit registers available to a thread block. This number is shared
|
||||
+ ///< by all thread blocks simultaneously resident on a multiprocessor.
|
||||
+ hipDeviceAttributeMaxRegistersPerMultiprocessor, ///< 32-bit registers available per block.
|
||||
+ hipDeviceAttributeReservedSharedMemPerBlock, ///< Cuda11 only. Shared memory reserved by CUDA driver per block.
|
||||
+ hipDeviceAttributeMaxSharedMemoryPerBlock, ///< Maximum shared memory available per block in bytes.
|
||||
+ hipDeviceAttributeSharedMemPerBlockOptin, ///< Cuda only. Maximum shared memory per block usable by special opt in.
|
||||
+ hipDeviceAttributeSharedMemPerMultiprocessor, ///< Cuda only. Shared memory available per multiprocessor.
|
||||
+ hipDeviceAttributeSingleToDoublePrecisionPerfRatio, ///< Cuda only. Performance ratio of single precision to double precision.
|
||||
+ hipDeviceAttributeStreamPrioritiesSupported, ///< Cuda only. Whether to support stream priorities.
|
||||
+ hipDeviceAttributeSurfaceAlignment, ///< Cuda only. Alignment requirement for surfaces
|
||||
+ hipDeviceAttributeTccDriver, ///< Cuda only. Whether device is a Tesla device using TCC driver
|
||||
+ hipDeviceAttributeTextureAlignment, ///< Alignment requirement for textures
|
||||
+ hipDeviceAttributeTexturePitchAlignment, ///< Pitch alignment requirement for 2D texture references bound to pitched memory;
|
||||
+ hipDeviceAttributeTotalConstantMemory, ///< Constant memory size in bytes.
|
||||
+ hipDeviceAttributeTotalGlobalMem, ///< Global memory available on device.
|
||||
+ hipDeviceAttributeUnifiedAddressing, ///< Cuda only. A unified address space shared with the host.
|
||||
+ hipDeviceAttributeUuid, ///< Cuda only. Unique ID in 16 byte.
|
||||
+ hipDeviceAttributeWarpSize, ///< Warp size in threads.
|
||||
|
||||
- hipDeviceAttributeMaxPitch, ///< Maximum pitch in bytes allowed by memory copies
|
||||
- hipDeviceAttributeTextureAlignment, ///<Alignment requirement for textures
|
||||
- hipDeviceAttributeTexturePitchAlignment, ///<Pitch alignment requirement for 2D texture references bound to pitched memory;
|
||||
- hipDeviceAttributeKernelExecTimeout, ///<Run time limit for kernels executed on the device
|
||||
- hipDeviceAttributeCanMapHostMemory, ///<Device can map host memory into device address space
|
||||
- hipDeviceAttributeEccEnabled, ///<Device has ECC support enabled
|
||||
+ hipDeviceAttributeCudaCompatibleEnd = 9999,
|
||||
+ hipDeviceAttributeAmdSpecificBegin = 10000,
|
||||
|
||||
- hipDeviceAttributeCooperativeMultiDeviceUnmatchedFunc, ///< Supports cooperative launch on multiple
|
||||
- ///devices with unmatched functions
|
||||
- hipDeviceAttributeCooperativeMultiDeviceUnmatchedGridDim, ///< Supports cooperative launch on multiple
|
||||
- ///devices with unmatched grid dimensions
|
||||
- hipDeviceAttributeCooperativeMultiDeviceUnmatchedBlockDim, ///< Supports cooperative launch on multiple
|
||||
- ///devices with unmatched block dimensions
|
||||
- hipDeviceAttributeCooperativeMultiDeviceUnmatchedSharedMem, ///< Supports cooperative launch on multiple
|
||||
- ///devices with unmatched shared memories
|
||||
- hipDeviceAttributeAsicRevision, ///< Revision of the GPU in this device
|
||||
- hipDeviceAttributeManagedMemory, ///< Device supports allocating managed memory on this system
|
||||
- hipDeviceAttributeDirectManagedMemAccessFromHost, ///< Host can directly access managed memory on
|
||||
- /// the device without migration
|
||||
- hipDeviceAttributeConcurrentManagedAccess, ///< Device can coherently access managed memory
|
||||
- /// concurrently with the CPU
|
||||
- hipDeviceAttributePageableMemoryAccess, ///< Device supports coherently accessing pageable memory
|
||||
- /// without calling hipHostRegister on it
|
||||
- hipDeviceAttributePageableMemoryAccessUsesHostPageTables, ///< Device accesses pageable memory via
|
||||
- /// the host's page tables
|
||||
- hipDeviceAttributeCanUseStreamWaitValue ///< '1' if Device supports hipStreamWaitValue32() and
|
||||
- ///< hipStreamWaitValue64() , '0' otherwise.
|
||||
+ hipDeviceAttributeClockInstructionRate = hipDeviceAttributeAmdSpecificBegin, ///< Frequency in khz of the timer used by the device-side "clock*"
|
||||
+ hipDeviceAttributeArch, ///< Device architecture
|
||||
+ hipDeviceAttributeMaxSharedMemoryPerMultiprocessor, ///< Maximum Shared Memory PerMultiprocessor.
|
||||
+ hipDeviceAttributeGcnArch, ///< Device gcn architecture
|
||||
+ hipDeviceAttributeGcnArchName, ///< Device gcnArch name in 256 bytes
|
||||
+ hipDeviceAttributeHdpMemFlushCntl, ///< Address of the HDP_MEM_COHERENCY_FLUSH_CNTL register
|
||||
+ hipDeviceAttributeHdpRegFlushCntl, ///< Address of the HDP_REG_COHERENCY_FLUSH_CNTL register
|
||||
+ hipDeviceAttributeCooperativeMultiDeviceUnmatchedFunc, ///< Supports cooperative launch on multiple
|
||||
+ ///< devices with unmatched functions
|
||||
+ hipDeviceAttributeCooperativeMultiDeviceUnmatchedGridDim, ///< Supports cooperative launch on multiple
|
||||
+ ///< devices with unmatched grid dimensions
|
||||
+ hipDeviceAttributeCooperativeMultiDeviceUnmatchedBlockDim, ///< Supports cooperative launch on multiple
|
||||
+ ///< devices with unmatched block dimensions
|
||||
+ hipDeviceAttributeCooperativeMultiDeviceUnmatchedSharedMem, ///< Supports cooperative launch on multiple
|
||||
+ ///< devices with unmatched shared memories
|
||||
+ hipDeviceAttributeIsLargeBar, ///< Whether it is LargeBar
|
||||
+ hipDeviceAttributeAsicRevision, ///< Revision of the GPU in this device
|
||||
+ hipDeviceAttributeCanUseStreamWaitValue, ///< '1' if Device supports hipStreamWaitValue32() and
|
||||
+ ///< hipStreamWaitValue64() , '0' otherwise.
|
||||
|
||||
+ hipDeviceAttributeAmdSpecificEnd = 19999,
|
||||
+ hipDeviceAttributeVendorSpecificBegin = 20000,
|
||||
+ // Extended attributes for vendors
|
||||
} hipDeviceAttribute_t;
|
||||
|
||||
enum hipComputeMode {
|
||||
```
|
||||
|
||||
The wrapper header files’ backward compatibility deprecation is as follows:
|
||||
|
||||
- `#pragma` message announcing deprecation -- ROCm v5.2 release
|
||||
- `#pragma` message changed to `#warning` -- Future release
|
||||
- `#warning` changed to `#error` -- Future release
|
||||
- Backward compatibility wrappers removed -- Future release
|
||||
|
||||
##### Library files
|
||||
|
||||
Library files are available in the `/opt/rocm-xxx/lib` folder. For backward compatibility, the old library location (`/opt/rocm-xxx/<component>/lib`) has a soft link to the library at the new location.
|
||||
|
||||
Example:
|
||||
|
||||
```log
|
||||
$ ls -l /opt/rocm/hip/lib/
|
||||
total 4
|
||||
drwxr-xr-x 4 root root 4096 May 12 10:45 cmake
|
||||
lrwxrwxrwx 1 root root 24 May 10 23:32 libamdhip64.so -> ../../lib/libamdhip64.so
|
||||
```
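To check that a legacy path is only a compatibility link, a shell sketch along the following lines may help. The helper name `show_compat_link` is hypothetical and not part of ROCm. Any path passed to it, such as the one in the listing above, is an assumption that depends on the installed ROCm version.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with ROCm): report whether a path is a
# soft link and, if so, where it points.
show_compat_link() {
    target_path="$1"
    if [ -L "$target_path" ]; then
        # readlink prints the link target exactly as stored (it may be relative).
        printf '%s -> %s\n' "$target_path" "$(readlink "$target_path")"
    elif [ -e "$target_path" ]; then
        printf '%s is a regular file or directory\n' "$target_path"
    else
        printf '%s does not exist\n' "$target_path"
        return 1
    fi
}
```

For example, `show_compat_link /opt/rocm/hip/lib/libamdhip64.so` would be expected to print the link and its target on a system that uses the layout described above.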
|
||||
|
||||
##### CMake Config files
|
||||
|
||||
All CMake configuration files are available in the `/opt/rocm-xxx/lib/cmake/<component>` folder. For backward compatibility, the old CMake locations (`/opt/rocm-xxx/<component>/lib/cmake`) contain a soft link to the new CMake config.
|
||||
|
||||
Example:
|
||||
|
||||
```log
|
||||
$ ls -l /opt/rocm/hip/lib/cmake/hip/
|
||||
total 0
|
||||
lrwxrwxrwx 1 root root 42 May 10 23:32 hip-config.cmake -> ../../../../lib/cmake/hip/hip-config.cmake
|
||||
```
|
||||
|
||||
#### Planned deprecation of hip-rocclr and hip-base packages
|
||||
|
||||
In the ROCm v5.2 release, hip-rocclr and hip-base packages (Debian and RPM) are planned for deprecation and will be removed in a future release. hip-runtime-amd and hip-dev(el) will replace these packages respectively. Users of hip-rocclr must install two packages, hip-runtime-amd and hip-dev, to get the same set of packages installed by hip-rocclr previously.
|
||||
|
||||
Currently, both package names hip-rocclr (or) hip-runtime-amd and hip-base (or) hip-dev(el) are supported.
|
||||
Deprecation of Integrated HIP Directed Tests
|
||||
|
||||
The integrated HIP directed tests, which are currently built by default, are deprecated in this release. The default building and execution support through CMake will be removed in a future release.
|
||||
|
||||
### Fixed Defects
|
||||
|
||||
| Fixed Defect | Fix |
|
||||
|------------------------------------------------------------------------------|----------|
|
||||
| ROCmInfo does not list gpus | Code fix |
|
||||
| Hang observed while restoring cooperative group samples | Code fix |
|
||||
| ROCM-SMI over SRIOV: Unsupported commands do not return proper error message | Code fix |
|
||||
|
||||
### Known Issues
|
||||
|
||||
This section consists of known issues in this release.
|
||||
#### Incorrect dGPU Behavior When Using AMDVBFlash Tool
|
||||
|
||||
#### Compiler Error on gfx1030 When Compiling at -O0
|
||||
The AMDVBFlash tool, used for flashing the VBIOS image to dGPU, does not communicate with the ROM Controller specifically when the driver is present. This is because the driver, as part of its runtime power management feature, puts the dGPU to a sleep state.
|
||||
|
||||
As a workaround, users can run amdgpu.runpm=0, which temporarily disables the runtime power management feature from the driver and dynamically changes some power control-related sysfs files.
|
||||
|
||||
#### Issue with START Timestamp in ROCProfiler
|
||||
|
||||
Users may encounter an issue with the enabled timestamp functionality for monitoring one or multiple counters. ROCProfiler outputs the following four timestamps for each kernel:
|
||||
|
||||
- Dispatch
|
||||
|
||||
- Start
|
||||
|
||||
- End
|
||||
|
||||
- Complete
|
||||
|
||||
##### Issue
|
||||
|
||||
A compiler error occurs when the -O0 flag is used to compile code for gfx1030 that calls atomicAddNoRet, which is defined in amd_hip_atomic.h. The compiler generates an illegal instruction for gfx1030.
|
||||
This defect is related to the Start timestamp functionality, which incorrectly shows an earlier time than the Dispatch timestamp.
|
||||
|
||||
##### Workaround
|
||||
To reproduce the issue,
|
||||
|
||||
The workaround is not to use the -O0 flag for this case. For higher optimization levels, the compiler does not generate an invalid instruction.
|
||||
1. Enable timing using the --timestamp on flag.
|
||||
|
||||
#### System Freeze Observed During CUDA Memtest Checkpoint
|
||||
2. Use the -i option with the input filename that contains the name of the counter(s) to monitor.
|
||||
|
||||
##### Issue
|
||||
3. Run the program.
|
||||
|
||||
Checkpoint/Restore in Userspace (CRIU) requires approximately 20 MB of VRAM to checkpoint and restore. The CRIU process may freeze if the maximum amount of available VRAM is allocated to checkpoint applications.
|
||||
4. Check the output result file.
|
||||
|
||||
##### Workaround
|
||||
##### Current behavior
|
||||
|
||||
To use CRIU to checkpoint and restore your application, limit the amount of VRAM the application uses to ensure at least 20 MB is available.
|
||||
BeginNS is lower than DispatchNS, which is incorrect.
|
||||
|
||||
#### HPC test fails with the “HSA_STATUS_ERROR_MEMORY_FAULT” error
|
||||
##### Expected behavior
|
||||
|
||||
##### Issue
|
||||
The correct order is:
|
||||
|
||||
The compiler may incorrectly compile a program that uses the `__shfl_sync(mask, value, srcLane)` function when the "value" parameter to the function is undefined along some path to the function. For most functions, uninitialized inputs cause undefined behavior, but the definition for `__shfl_sync` should allow for undefined values.
|
||||
Dispatch < Start < End < Complete
|
||||
|
||||
##### Workaround
|
||||
Users cannot use ROCProfiler to measure the time spent on each kernel because of the incorrect timestamp with counter collection enabled.
|
||||
|
||||
The workaround is to initialize the parameters to `__shfl_sync`.
|
||||
##### Recommended Workaround
|
||||
|
||||
> **Note**
|
||||
>
|
||||
> When the `-Wall` compilation flag is used, the compiler generates a warning indicating the variable is uninitialized along some path.
|
||||
Users are recommended to collect kernel execution timestamps without monitoring counters, as follows:
|
||||
|
||||
Example:
|
||||
1. Enable timing using the --timestamp on flag, and run the application.
|
||||
|
||||
```cpp
|
||||
double res = 0.0; // Initialize the input to __shfl_sync.
|
||||
if (lane == 0) {
|
||||
res = <some expression>
|
||||
}
|
||||
res = __shfl_sync(mask, res, 0);
|
||||
```
|
||||
2. Rerun the application using the -i option with the input filename that contains the name of the counter(s) to monitor, and save this to a different output file using the -o flag.
|
||||
|
||||
#### Kernel produces incorrect result
|
||||
3. Check the output result file from step 1.
|
||||
|
||||
##### Issue
|
||||
4. The order of timestamps correctly displays as:
|
||||
DispatchNS < BeginNS < EndNS < CompleteNS
|
||||
|
||||
In recent changes to Clang, insertion of the noundef attribute to all the function arguments has been enabled by default.
|
||||
5. Users can find the values of the collected counters in the output file generated in step 2.
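The two-pass workaround above can be sketched as a small shell script. This is only an illustration under stated assumptions: `rocprof` is assumed to be the ROCProfiler command-line tool, and the helper names `run` and `profile_two_pass` are hypothetical. The output file names and the `-o` on the first pass are choices made here to keep the two result files separate. Setting `DRY_RUN=1` prints the commands instead of executing them.

```shell
#!/bin/sh
# Sketch of the two-pass workaround. `rocprof` is assumed to be the
# ROCProfiler CLI; file names below are placeholders chosen for illustration.

# Run a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: profile an application in two separate passes.
profile_two_pass() {
    app="$1"        # application to profile
    counters="$2"   # input file naming the counter(s) to monitor

    # Pass 1: collect kernel timestamps only, without monitoring counters.
    run rocprof --timestamp on -o timestamps.csv "$app" || return 1
    # Pass 2: collect counters and write them to a different output file.
    run rocprof -i "$counters" -o counters.csv "$app"
}
```

A call such as `DRY_RUN=1 profile_two_pass ./my_app counters.txt` shows the two commands without running them. Timestamps would then be read from the first output file and counter values from the second.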
|
||||
|
||||
In the HIP kernel, variable var in shfl_sync may not be initialized, so LLVM IR treats it as undef.
|
||||
#### Radeon Pro V620 and W6800 Workstation GPUs
|
||||
|
||||
So, the function argument that is potentially undef (because it is not initialized) has always been assumed to be noundef by LLVM IR (since Clang has inserted the noundef attribute). This leads to ambiguous kernel execution.
|
||||
##### No Support for SMI and ROCDebugger on SRIOV
|
||||
|
||||
##### Workaround
|
||||
System Management Interface (SMI) and ROCDebugger are not supported in the SRIOV environment on any GPU. For more information, refer to the Systems Management Interface documentation.
|
||||
|
||||
- Skip adding `noundef` attribute to functions tagged with convergent attribute. Refer to <https://reviews.llvm.org/D124158> for more information.
|
||||
### Deprecations and Warnings
|
||||
|
||||
- Introduce a shuffle attribute and add it to `__shfl`-like APIs in the HIP headers. Clang can skip adding the noundef attribute if it finds that the argument is tagged with the shuffle attribute. Refer to <https://reviews.llvm.org/D125378> for more information.
|
||||
#### ROCm Libraries Changes – Deprecations and Deprecation Removal
|
||||
|
||||
- Introduce clang builtin for `__shfl` to identify it and skip adding `noundef` attribute.
|
||||
- The hipFFT.h header is now provided only by the hipFFT package. Up to ROCm 5.0, users would get hipFFT.h in the rocFFT package too.
|
||||
|
||||
- Introduce `__builtin_freeze` to use on the relevant arguments in library wrappers. The library/header needs to insert freezes on the relevant inputs.
|
||||
- The GlobalPairwiseAMG class is now entirely removed; users should use the PairwiseAMG class instead.
|
||||
|
||||
#### Issue with Applications Triggering Oversubscription
|
||||
- The rocsparse_spmm signature in 5.0 was changed to match that of rocsparse_spmm_ex. In 5.0, rocsparse_spmm_ex is still present, but deprecated. Signature diff for rocsparse_spmm
|
||||
rocsparse_spmm in 5.0
|
||||
|
||||
There is a known issue with applications that trigger oversubscription. A hardware hang occurs when ROCgdb is used on AMD Instinct™ MI50 and MI100 systems.
|
||||
```h
|
||||
rocsparse_status rocsparse_spmm(rocsparse_handle handle,
|
||||
rocsparse_operation trans_A,
|
||||
rocsparse_operation trans_B,
|
||||
const void* alpha,
|
||||
const rocsparse_spmat_descr mat_A,
|
||||
const rocsparse_dnmat_descr mat_B,
|
||||
const void* beta,
|
||||
const rocsparse_dnmat_descr mat_C,
|
||||
rocsparse_datatype compute_type,
|
||||
rocsparse_spmm_alg alg,
|
||||
rocsparse_spmm_stage stage,
|
||||
size_t* buffer_size,
|
||||
void* temp_buffer);
|
||||
```
|
||||
|
||||
This issue is under investigation and will be fixed in a future release.
|
||||
rocsparse_spmm in 4.0
|
||||
|
||||
```h
|
||||
rocsparse_status rocsparse_spmm(rocsparse_handle handle,
|
||||
rocsparse_operation trans_A,
|
||||
rocsparse_operation trans_B,
|
||||
const void* alpha,
|
||||
const rocsparse_spmat_descr mat_A,
|
||||
const rocsparse_dnmat_descr mat_B,
|
||||
const void* beta,
|
||||
const rocsparse_dnmat_descr mat_C,
|
||||
rocsparse_datatype compute_type,
|
||||
rocsparse_spmm_alg alg,
|
||||
size_t* buffer_size,
|
||||
void* temp_buffer);
|
||||
```
|
||||
|
||||
#### HIP API Deprecations and Warnings
|
||||
|
||||
##### Warning - Arithmetic Operators of HIP Complex and Vector Types
|
||||
|
||||
In this release, arithmetic operators of HIP complex and vector types are deprecated.
|
||||
|
||||
- As alternatives to arithmetic operators of HIP complex types, users can use arithmetic operators of `std::complex` types.
|
||||
|
||||
- As alternatives to arithmetic operators of HIP vector types, users can use the operators of the native clang vector type associated with the data member of HIP vector types.
|
||||
|
||||
During the deprecation, two macros `_HIP_ENABLE_COMPLEX_OPERATORS` and `_HIP_ENABLE_VECTOR_OPERATORS` are provided to allow users to conditionally enable arithmetic operators of HIP complex or vector types.
|
||||
|
||||
Note, the two macros are mutually exclusive and, by default, set to Off.
|
||||
|
||||
The arithmetic operators of HIP complex and vector types will be removed in a future release.
|
||||
|
||||
Refer to the HIP API Guide for more information.
|
||||
|
||||
#### Warning - Compiler-Generated Code Object Version 4 Deprecation
|
||||
|
||||
Support for loading compiler-generated code object version 4 will be deprecated in a future release with no release announcement and replaced with code object 5 as the default version.
|
||||
|
||||
The current default is code object version 4.
|
||||
|
||||
#### Warning - MIOpenTensile Deprecation
|
||||
|
||||
MIOpenTensile will be deprecated in a future release.
|
||||
|
||||
@@ -14,6 +14,13 @@ shutil.copy2('../RELEASE.md','./release.md')
|
||||
# Keep capitalization due to similar linking on GitHub's markdown preview.
|
||||
shutil.copy2('../CHANGELOG.md','./CHANGELOG.md')
|
||||
|
||||
# configurations for PDF output by Read the Docs
|
||||
project = "ROCm Documentation"
|
||||
author = "Advanced Micro Devices, Inc."
|
||||
copyright = "Copyright (c) 2023 Advanced Micro Devices, Inc. All rights reserved."
|
||||
version = "5.0.0"
|
||||
release = "5.0.0"
|
||||
|
||||
setting_all_article_info = True
|
||||
all_article_info_os = ["linux"]
|
||||
all_article_info_author = ""
|
||||
@@ -57,7 +64,7 @@ article_pages = [
|
||||
|
||||
external_toc_path = "./sphinx/_toc.yml"
|
||||
|
||||
docs_core = ROCmDocs("ROCm Documentation Home")
|
||||
docs_core = ROCmDocs("ROCm 5.0.0 Documentation Home")
|
||||
docs_core.setup()
|
||||
|
||||
external_projects_current_project = "rocm"
|
||||
|
||||
@@ -18,8 +18,8 @@ following commands based on your distribution.
|
||||
|
||||
```shell
|
||||
sudo apt update
|
||||
wget https://repo.radeon.com/amdgpu-install/22.20/ubuntu/bionic/amdgpu-install_22.20.50200-1_all.deb
|
||||
sudo apt install ./amdgpu-install_22.20.50200-1_all.deb
|
||||
wget https://repo.radeon.com/amdgpu-install/21.50/ubuntu/bionic/amdgpu-install_21.50.50000-1_all.deb
|
||||
sudo apt install ./amdgpu-install_21.50.50000-1_all.deb
|
||||
```
|
||||
|
||||
:::
|
||||
@@ -28,8 +28,8 @@ sudo apt install ./amdgpu-install_22.20.50200-1_all.deb
|
||||
|
||||
```shell
|
||||
sudo apt update
|
||||
wget https://repo.radeon.com/amdgpu-install/22.20/ubuntu/focal/amdgpu-install_22.20.50200-1_all.deb
|
||||
sudo apt install ./amdgpu-install_22.20.50200-1_all.deb
|
||||
wget https://repo.radeon.com/amdgpu-install/21.50/ubuntu/focal/amdgpu-install_21.50.50000-1_all.deb
|
||||
sudo apt install ./amdgpu-install_21.50.50000-1_all.deb
|
||||
```
|
||||
|
||||
:::
|
||||
@@ -44,7 +44,16 @@ sudo apt install ./amdgpu-install_22.20.50200-1_all.deb
|
||||
:sync: RHEL-7
|
||||
|
||||
```shell
|
||||
sudo yum install https://repo.radeon.com/amdgpu-install/22.20/rhel/7.9/amdgpu-install-22.20.50200-1.el7.noarch.rpm
|
||||
sudo yum install https://repo.radeon.com/amdgpu-install/21.50/rhel/7.9/amdgpu-install-21.50.50000-1.el7.noarch.rpm
|
||||
```
|
||||
|
||||
:::
|
||||
:::{tab-item} RHEL 8.4
|
||||
:sync: RHEL-8.4
|
||||
:sync: RHEL-8
|
||||
|
||||
```shell
|
||||
sudo yum install https://repo.radeon.com/amdgpu-install/21.50/rhel/8.4/amdgpu-install-21.50.50000-1.el8.noarch.rpm
|
||||
```
|
||||
|
||||
:::
|
||||
@@ -53,16 +62,7 @@ sudo yum install https://repo.radeon.com/amdgpu-install/22.20/rhel/7.9/amdgpu-in
|
||||
:sync: RHEL-8
|
||||
|
||||
```shell
|
||||
sudo yum install https://repo.radeon.com/amdgpu-install/22.20/rhel/8.5/amdgpu-install-22.20.50200-1.el8.noarch.rpm
|
||||
```
|
||||
|
||||
:::
|
||||
:::{tab-item} RHEL 8.6
|
||||
:sync: RHEL-8.6
|
||||
:sync: RHEL-8
|
||||
|
||||
```shell
|
||||
sudo yum install https://repo.radeon.com/amdgpu-install/22.20/rhel/8.6/amdgpu-install-22.20.50200-1.el8.noarch.rpm
|
||||
sudo yum install https://repo.radeon.com/amdgpu-install/21.50/rhel/8.5/amdgpu-install-21.50.50000-1.el8.noarch.rpm
|
||||
```
|
||||
|
||||
:::
|
||||
@@ -72,19 +72,11 @@ sudo yum install https://repo.radeon.com/amdgpu-install/22.20/rhel/8.6/amdgpu-in
|
||||
:sync: SLES15
|
||||
|
||||
::::{tab-set}
|
||||
:::{tab-item} Service Pack 4
|
||||
:sync: SLES15-SP4
|
||||
|
||||
```shell
|
||||
sudo zypper --no-gpg-checks install https://repo.radeon.com/amdgpu-install/22.20/sle/15.4/amdgpu-install-22.20.50200-1.noarch.rpm
|
||||
```
|
||||
|
||||
:::
|
||||
:::{tab-item} Service Pack 3
|
||||
:sync: SLES15-SP3
|
||||
|
||||
```shell
|
||||
sudo zypper --no-gpg-checks install https://repo.radeon.com/amdgpu-install/22.20/sle/15.3/amdgpu-install-22.20.50200-1.noarch.rpm
|
||||
sudo zypper --no-gpg-checks install https://repo.radeon.com/amdgpu-install/21.50/sle/15/amdgpu-install-21.50.50000-1.noarch.rpm
|
||||
```
|
||||
|
||||
:::
|
||||
@@ -163,9 +155,9 @@ the installer script will install packages in the single-version layout.
|
||||
For the multi-version ROCm installation you must use the installer script from
|
||||
the latest release of ROCm that you wish to install.
|
||||
|
||||
**Example:** If you want to install ROCm releases 5.1.3 and 5.2
|
||||
**Example:** If you want to install ROCm releases 4.5.2 and 5.0.0
|
||||
simultaneously, you are required to download the installer from the latest ROCm
|
||||
release v5.2.
|
||||
release v5.0.0.
|
||||
|
||||
### Add Required Repositories
|
||||
|
||||
@@ -184,10 +176,12 @@ Run the following commands based on your distribution to add the repositories:
|
||||
:sync: ubuntu-18.04
|
||||
|
||||
```shell
|
||||
for ver in 5.1.3; do
|
||||
echo "deb [arch=amd64 signed-by=/etc/apt/trusted.gpg.d/rocm-keyring.gpg] https://repo.radeon.com/rocm/apt/$ver bionic main" | sudo tee /etc/apt/sources.list.d/rocm.list
|
||||
for ver in 5.0; do
|
||||
echo "deb [arch=amd64 signed-by=/etc/apt/trusted.gpg.d/rocm-keyring.gpg] https://repo.radeon.com/rocm/apt/$ver bionic main" \
|
||||
| sudo tee /etc/apt/sources.list.d/rocm.list
|
||||
done
|
||||
echo -e 'Package: *\nPin: release o=repo.radeon.com\nPin-Priority: 600' | sudo tee /etc/apt/preferences.d/rocm-pin-600
|
||||
echo -e 'Package: *\nPin: release o=repo.radeon.com\nPin-Priority: 600' \
|
||||
| sudo tee /etc/apt/preferences.d/rocm-pin-600
|
||||
sudo apt update
|
||||
```
|
||||
|
||||
@@ -196,10 +190,12 @@ sudo apt update
|
||||
:sync: ubuntu-20.04
|
||||
|
||||
```shell
|
||||
for ver in 5.1.3; do
|
||||
echo "deb [arch=amd64 signed-by=/etc/apt/trusted.gpg.d/rocm-keyring.gpg] https://repo.radeon.com/rocm/apt/$ver focal main" | sudo tee /etc/apt/sources.list.d/rocm.list
|
||||
for ver in 5.0; do
|
||||
echo "deb [arch=amd64 signed-by=/etc/apt/trusted.gpg.d/rocm-keyring.gpg] https://repo.radeon.com/rocm/apt/$ver focal main" \
|
||||
| sudo tee /etc/apt/sources.list.d/rocm.list
|
||||
done
|
||||
echo -e 'Package: *\nPin: release o=repo.radeon.com\nPin-Priority: 600' | sudo tee /etc/apt/preferences.d/rocm-pin-600
|
||||
echo -e 'Package: *\nPin: release o=repo.radeon.com\nPin-Priority: 600' \
|
||||
| sudo tee /etc/apt/preferences.d/rocm-pin-600
|
||||
sudo apt update
|
||||
```
|
||||
|
||||
@@ -214,7 +210,7 @@ sudo apt update
|
||||
:sync: RHEL-7
|
||||
|
||||
```shell
|
||||
for ver in 5.1.3; do
|
||||
for ver in 5.0; do
|
||||
sudo tee --append /etc/yum.repos.d/rocm.repo <<EOF
|
||||
[ROCm-$ver]
|
||||
name=ROCm$ver
|
||||
@@ -233,7 +229,7 @@ sudo yum clean all
|
||||
:sync: RHEL-8
|
||||
|
||||
```shell
|
||||
for ver in 5.1.3; do
|
||||
for ver in 5.0; do
|
||||
sudo tee --append /etc/yum.repos.d/rocm.repo <<EOF
|
||||
[ROCm-$ver]
|
||||
name=ROCm$ver
|
||||
@@ -247,7 +243,6 @@ done
|
||||
sudo yum clean all
|
||||
```
|
||||
|
||||
:::
|
||||
::::
|
||||
:::::
|
||||
:::::{tab-item} SUSE Linux Enterprise Server 15
|
||||
@@ -258,27 +253,10 @@ sudo yum clean all
|
||||
:sync: SLES15-SP3
|
||||
|
||||
```shell
|
||||
for ver in 5.1.3; do
|
||||
for ver in 5.0; do
|
||||
sudo tee --append /etc/zypp/repos.d/rocm.repo <<EOF
|
||||
name=rocm
|
||||
baseurl=https://repo.radeon.com/rocm/$ver/sle/15.3/main/x86_64
|
||||
enabled=1
|
||||
gpgcheck=1
|
||||
gpgkey=https://repo.radeon.com/rocm/rocm.gpg.key
|
||||
EOF
|
||||
done
|
||||
sudo zypper ref
|
||||
```
|
||||
|
||||
:::
|
||||
:::{tab-item} Service Pack 4
|
||||
:sync: SLES15-SP4
|
||||
|
||||
```shell
|
||||
for ver in 5.1.3; do
|
||||
sudo tee --append /etc/zypp/repos.d/rocm.repo <<EOF
|
||||
name=rocm
|
||||
baseurl=https://repo.radeon.com/rocm/$ver/sle/15.4/main/x86_64
|
||||
baseurl=https://repo.radeon.com/rocm/$ver/sle/15/main/x86_64
|
||||
enabled=1
|
||||
gpgcheck=1
|
||||
gpgkey=https://repo.radeon.com/rocm/rocm.gpg.key
|
||||
@@ -304,12 +282,12 @@ sudo amdgpu-install --usecase=rocm --rocmrelease=<release-number-3>
|
||||
```
|
||||
|
||||
Following are examples of ROCm multi-version installation. The kernel-mode
|
||||
driver, associated with the ROCm release v5.3, will be installed as its latest
|
||||
driver, associated with the ROCm release v5.0, will be installed as its latest
|
||||
release in the list.
|
||||
|
||||
```none
|
||||
sudo amdgpu-install --usecase=rocm --rocmrelease=5.1.3
|
||||
sudo amdgpu-install --usecase=rocm --rocmrelease=5.2.0
|
||||
sudo amdgpu-install --usecase=rocm --rocmrelease=4.5.2
|
||||
sudo amdgpu-install --usecase=rocm --rocmrelease=5.0.0
|
||||
```
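After a multi-version installation, it can be useful to confirm which releases are present. The sketch below assumes each release is installed into its own `/opt/rocm-<version>` directory. The helper name `list_rocm_versions` is hypothetical, and the default prefix `/opt` is an assumption that may not match every system.

```shell
#!/bin/sh
# Hypothetical helper: list the versions of ROCm directories found under a
# prefix (default: /opt), assuming a rocm-<version> directory per release.
list_rocm_versions() {
    prefix="${1:-/opt}"
    for dir in "$prefix"/rocm-*; do
        # Skip the unexpanded pattern when nothing matches.
        [ -d "$dir" ] || continue
        # Strip everything up to and including "rocm-" to leave the version.
        printf '%s\n' "${dir##*/rocm-}"
    done
}
```

Calling `list_rocm_versions` with no argument scans `/opt`. Pass another directory as the first argument if ROCm lives elsewhere.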
|
||||
|
||||
## Additional options
|
||||
|
||||
@@ -53,7 +53,7 @@ To add the AMDGPU repository, follow these steps:
|
||||
|
||||
```shell
|
||||
# amdgpu repository for bionic
|
||||
echo 'deb [arch=amd64 signed-by=/etc/apt/keyrings/rocm.gpg] https://repo.radeon.com/amdgpu/22.20/ubuntu bionic main' \
|
||||
echo 'deb [arch=amd64 signed-by=/etc/apt/keyrings/rocm.gpg] https://repo.radeon.com/amdgpu/21.50/ubuntu bionic main' \
|
||||
| sudo tee /etc/apt/sources.list.d/amdgpu.list
|
||||
sudo apt update
|
||||
```
|
||||
@@ -64,7 +64,7 @@ sudo apt update
|
||||
|
||||
```shell
|
||||
# amdgpu repository for focal
|
||||
echo 'deb [arch=amd64 signed-by=/etc/apt/keyrings/rocm.gpg] https://repo.radeon.com/amdgpu/22.20/ubuntu focal main' \
|
||||
echo 'deb [arch=amd64 signed-by=/etc/apt/keyrings/rocm.gpg] https://repo.radeon.com/amdgpu/21.50/ubuntu focal main' \
|
||||
| sudo tee /etc/apt/sources.list.d/amdgpu.list
|
||||
sudo apt update
|
||||
```
|
||||
@@ -91,7 +91,7 @@ To add the ROCm repository, use the following steps:
|
||||
|
||||
```shell
|
||||
# ROCm repositories for bionic
|
||||
for ver in 5.1.3 5.2; do
|
||||
for ver in 5.0; do
|
||||
echo "deb [arch=amd64 signed-by=/etc/apt/keyrings/rocm.gpg] https://repo.radeon.com/rocm/apt/$ver bionic main" \
|
||||
| sudo tee --append /etc/apt/sources.list.d/rocm.list
|
||||
done
|
||||
@@ -106,7 +106,7 @@ sudo apt update
|
||||
|
||||
```shell
|
||||
# ROCm repositories for focal
|
||||
for ver in 5.1.3 5.2; do
|
||||
for ver in 5.0; do
|
||||
echo "deb [arch=amd64 signed-by=/etc/apt/keyrings/rocm.gpg] https://repo.radeon.com/rocm/apt/$ver focal main" \
|
||||
| sudo tee --append /etc/apt/sources.list.d/rocm.list
|
||||
done
|
||||
@@ -136,7 +136,7 @@ For a comprehensive list of meta-packages, refer to
|
||||
- Sample Multi-version installation
|
||||
|
||||
```shell
|
||||
sudo apt install rocm-hip-sdk5.2.0 rocm-hip-sdk5.1.3
|
||||
sudo apt install rocm-hip-sdk5.0.0
|
||||
```
|
||||
|
||||
:::::
|
||||
@@ -160,7 +160,26 @@ section.
|
||||
sudo tee /etc/yum.repos.d/amdgpu.repo <<EOF
|
||||
[amdgpu]
|
||||
name=amdgpu
|
||||
baseurl=https://repo.radeon.com/amdgpu/22.20/rhel/7.9/main/x86_64/
|
||||
baseurl=https://repo.radeon.com/amdgpu/21.50/rhel/7.9/main/x86_64/
|
||||
enabled=1
|
||||
priority=50
|
||||
gpgcheck=1
|
||||
gpgkey=https://repo.radeon.com/rocm/rocm.gpg.key
|
||||
EOF
|
||||
sudo yum clean all
|
||||
```
|
||||
|
||||
:::
|
||||
|
||||
:::{tab-item} RHEL 8.4
|
||||
:sync: RHEL-8.4
|
||||
:sync: RHEL-8
|
||||
|
||||
```shell
|
||||
sudo tee /etc/yum.repos.d/amdgpu.repo <<EOF
|
||||
[amdgpu]
|
||||
name=amdgpu
|
||||
baseurl=https://repo.radeon.com/amdgpu/21.50/rhel/8.4/main/x86_64/
|
||||
enabled=1
|
||||
priority=50
|
||||
gpgcheck=1
|
||||
@@ -179,7 +198,7 @@ sudo yum clean all
|
||||
sudo tee /etc/yum.repos.d/amdgpu.repo <<EOF
|
||||
[amdgpu]
|
||||
name=amdgpu
|
||||
baseurl=https://repo.radeon.com/amdgpu/22.20/rhel/8.5/main/x86_64/
|
||||
baseurl=https://repo.radeon.com/amdgpu/21.50/rhel/8.5/main/x86_64/
|
||||
enabled=1
|
||||
priority=50
|
||||
gpgcheck=1
|
||||
@@ -189,26 +208,6 @@ sudo yum clean all
|
||||
```
|
||||
|
||||
:::
|
||||
|
||||
:::{tab-item} RHEL 8.6
|
||||
:sync: RHEL-8.6
|
||||
:sync: RHEL-8
|
||||
|
||||
```shell
|
||||
sudo tee /etc/yum.repos.d/amdgpu.repo <<EOF
|
||||
[amdgpu]
|
||||
name=amdgpu
|
||||
baseurl=https://repo.radeon.com/amdgpu/22.20/rhel/8.6/main/x86_64/
|
||||
enabled=1
|
||||
priority=50
|
||||
gpgcheck=1
|
||||
gpgkey=https://repo.radeon.com/rocm/rocm.gpg.key
|
||||
EOF
|
||||
sudo yum clean all
|
||||
```
|
||||
|
||||
:::
|
||||
|
||||
::::
|
||||
|
||||
Install the kernel mode driver and reboot the system using the following
|
||||
@@ -229,7 +228,7 @@ To add the ROCm repository, use the following steps, based on your distribution:
|
||||
:sync: RHEL-7
|
||||
|
||||
```shell
|
||||
for ver in 5.2.1 5.2; do
|
||||
for ver in 5.0; do
|
||||
sudo tee --append /etc/yum.repos.d/rocm.repo <<EOF
|
||||
[ROCm-$ver]
|
||||
name=ROCm$ver
|
||||
@@ -248,7 +247,7 @@ sudo yum clean all
|
||||
:sync: RHEL-8
|
||||
|
||||
```shell
|
||||
for ver in 5.1.3 5.2; do
|
||||
for ver in 5.0; do
|
||||
sudo tee --append /etc/yum.repos.d/rocm.repo <<EOF
|
||||
[ROCm-$ver]
|
||||
name=ROCm$ver
|
||||
@@ -283,7 +282,7 @@ For a comprehensive list of meta-packages, refer to
|
||||
- Sample Multi-version installation
|
||||
|
||||
```shell
|
||||
sudo yum install rocm-hip-sdk5.2.0 rocm-hip-sdk5.1.3
|
||||
sudo yum install rocm-hip-sdk5.0.0
|
||||
```
|
||||
|
||||
:::::
|
||||
@@ -306,23 +305,7 @@ section.
|
||||
sudo tee /etc/zypp/repos.d/amdgpu.repo <<EOF
|
||||
[amdgpu]
|
||||
name=amdgpu
|
||||
baseurl=https://repo.radeon.com/amdgpu/22.20/sle/15.3/main/x86_64
|
||||
enabled=1
|
||||
gpgcheck=1
|
||||
gpgkey=https://repo.radeon.com/rocm/rocm.gpg.key
|
||||
EOF
|
||||
sudo zypper ref
|
||||
```
|
||||
|
||||
:::
|
||||
:::{tab-item} Service Pack 4
|
||||
:sync: SLES15-SP4
|
||||
|
||||
```shell
|
||||
sudo tee /etc/zypp/repos.d/amdgpu.repo <<EOF
|
||||
[amdgpu]
|
||||
name=amdgpu
|
||||
baseurl=https://repo.radeon.com/amdgpu/22.20/sle/15.4/main/x86_64
|
||||
baseurl=https://repo.radeon.com/amdgpu/21.50/sle/15.3/main/x86_64
|
||||
enabled=1
|
||||
gpgcheck=1
|
||||
gpgkey=https://repo.radeon.com/rocm/rocm.gpg.key
|
||||
@@ -347,7 +330,7 @@ sudo reboot
|
||||
To add the ROCm repository, use the following steps:
|
||||
|
||||
```shell
|
||||
for ver in 5.1.3 5.2; do
|
||||
for ver in 5.0; do
|
||||
sudo tee --append /etc/zypp/repos.d/rocm.repo <<EOF
|
||||
[ROCm-$ver]
|
||||
name=ROCm$ver
|
||||
@@ -379,7 +362,7 @@ For a comprehensive list of meta-packages, refer to
|
||||
- Sample Multi-version installation
|
||||
|
||||
```shell
|
||||
sudo zypper --gpg-auto-import-keys install rocm-hip-sdk5.2.0 rocm-hip-sdk5.1.3
|
||||
sudo zypper --gpg-auto-import-keys install rocm-hip-sdk5.0.0
|
||||
```
|
||||
|
||||
:::::
|
||||
@@ -416,7 +399,7 @@ but are generally useful. Verification of the install is advised.
|
||||
2. Add binary paths to the `PATH` environment variable.
|
||||
|
||||
```shell
|
||||
export PATH=$PATH:/opt/rocm-5.2.0/bin:/opt/rocm-5.2.0/opencl/bin
|
||||
export PATH=$PATH:/opt/rocm-5.0.0/bin:/opt/rocm-5.0.0/opencl/bin
|
||||
```
|
||||
|
||||
```{attention}
|
||||
|
||||
@@ -26,7 +26,7 @@ repository to the new release.
|
||||
|
||||
```shell
|
||||
# amdgpu repository for bionic
|
||||
echo 'deb [arch=amd64 signed-by=/etc/apt/keyrings/rocm.gpg] https://repo.radeon.com/amdgpu/22.20.3/ubuntu bionic main' \
|
||||
echo 'deb [arch=amd64 signed-by=/etc/apt/keyrings/rocm.gpg] https://repo.radeon.com/amdgpu/21.50/ubuntu bionic main' \
|
||||
| sudo tee /etc/apt/sources.list.d/amdgpu.list
|
||||
sudo apt update
|
||||
```
|
||||
@@ -37,12 +37,11 @@ sudo apt update
|
||||
|
||||
```shell
|
||||
# amdgpu repository for focal
|
||||
echo 'deb [arch=amd64 signed-by=/etc/apt/keyrings/rocm.gpg] https://repo.radeon.com/amdgpu/22.20/ubuntu focal main' \
|
||||
echo 'deb [arch=amd64 signed-by=/etc/apt/keyrings/rocm.gpg] https://repo.radeon.com/amdgpu/21.50/ubuntu focal main' \
|
||||
| sudo tee /etc/apt/sources.list.d/amdgpu.list
|
||||
sudo apt update
|
||||
```
|
||||
|
||||
:::
|
||||
::::
|
||||
:::::
|
||||
:::::{tab-item} Red Hat Enterprise Linux
|
||||
@@ -57,7 +56,25 @@ sudo apt update
|
||||
sudo tee /etc/yum.repos.d/amdgpu.repo <<EOF
|
||||
[amdgpu]
|
||||
name=amdgpu
|
||||
baseurl=https://repo.radeon.com/amdgpu/22.20/rhel/7.9/main/x86_64/
|
||||
baseurl=https://repo.radeon.com/amdgpu/21.50/rhel/7.9/main/x86_64/
|
||||
enabled=1
|
||||
priority=50
|
||||
gpgcheck=1
|
||||
gpgkey=https://repo.radeon.com/rocm/rocm.gpg.key
|
||||
EOF
|
||||
sudo yum clean all
|
||||
```
|
||||
|
||||
:::
|
||||
:::{tab-item} RHEL 8.4
|
||||
:sync: RHEL-8.4
|
||||
:sync: RHEL-8
|
||||
|
||||
```shell
|
||||
sudo tee /etc/yum.repos.d/amdgpu.repo <<EOF
|
||||
[amdgpu]
|
||||
name=amdgpu
|
||||
baseurl=https://repo.radeon.com/amdgpu/21.50/rhel/8.4/main/x86_64/
|
||||
enabled=1
|
||||
priority=50
|
||||
gpgcheck=1
|
||||
@@ -75,7 +92,7 @@ sudo yum clean all
|
||||
sudo tee /etc/yum.repos.d/amdgpu.repo <<EOF
|
||||
[amdgpu]
|
||||
name=amdgpu
|
||||
baseurl=https://repo.radeon.com/amdgpu/22.20/rhel/8.5/main/x86_64/
|
||||
baseurl=https://repo.radeon.com/amdgpu/21.50/rhel/8.5/main/x86_64/
|
||||
enabled=1
|
||||
priority=50
|
||||
gpgcheck=1
|
||||
@@ -85,24 +102,8 @@ sudo yum clean all
|
||||
```
|
||||
|
||||
:::
|
||||
:::{tab-item} RHEL 8.6
|
||||
:sync: RHEL-8.6
|
||||
:sync: RHEL-8
|
||||
|
||||
```shell
|
||||
sudo tee /etc/yum.repos.d/amdgpu.repo <<EOF
|
||||
[amdgpu]
|
||||
name=amdgpu
|
||||
baseurl=https://repo.radeon.com/amdgpu/22.20/rhel/8.6/main/x86_64/
|
||||
enabled=1
|
||||
priority=50
|
||||
gpgcheck=1
|
||||
gpgkey=https://repo.radeon.com/rocm/rocm.gpg.key
|
||||
EOF
|
||||
sudo yum clean all
|
||||
```
|
||||
|
||||
::::
|
||||
|
||||
:::::
|
||||
:::::{tab-item} SUSE Linux Enterprise Server 15
|
||||
:sync: SLES15
|
||||
@@ -115,23 +116,7 @@ sudo yum clean all
|
||||
sudo tee /etc/zypp/repos.d/amdgpu.repo <<EOF
|
||||
[amdgpu]
|
||||
name=amdgpu
|
||||
baseurl=https://repo.radeon.com/amdgpu/22.20/sle/15.3/main/x86_64
|
||||
enabled=1
|
||||
gpgcheck=1
|
||||
gpgkey=https://repo.radeon.com/rocm/rocm.gpg.key
|
||||
EOF
|
||||
sudo zypper ref
|
||||
```
|
||||
|
||||
:::
|
||||
:::{tab-item} Service Pack 4
|
||||
:sync: SLES15-SP4
|
||||
|
||||
```shell
|
||||
sudo tee /etc/zypp/repos.d/amdgpu.repo <<EOF
|
||||
[amdgpu]
|
||||
name=amdgpu
|
||||
baseurl=https://repo.radeon.com/amdgpu/22.20/sle/15.4/main/x86_64
|
||||
baseurl=https://repo.radeon.com/amdgpu/21.50/sle/15.3/main/x86_64
|
||||
enabled=1
|
||||
gpgcheck=1
|
||||
gpgkey=https://repo.radeon.com/rocm/rocm.gpg.key
|
||||
@@ -194,7 +179,7 @@ repository to the new release.
|
||||
:sync: ubuntu-18.04
|
||||
|
||||
```shell
|
||||
echo "deb [arch=amd64 signed-by=/etc/apt/keyrings/rocm.gpg] https://repo.radeon.com/rocm/apt/5.2.3 bionic main" \
|
||||
echo "deb [arch=amd64 signed-by=/etc/apt/keyrings/rocm.gpg] https://repo.radeon.com/rocm/apt/5.0 bionic main" \
|
||||
| sudo tee /etc/apt/sources.list.d/rocm.list
|
||||
echo -e 'Package: *\nPin: release o=repo.radeon.com\nPin-Priority: 600' \
|
||||
| sudo tee /etc/apt/preferences.d/rocm-pin-600
|
||||
@@ -206,7 +191,7 @@ sudo apt update
|
||||
:sync: ubuntu-20.04
|
||||
|
||||
```shell
|
||||
echo "deb [arch=amd64 signed-by=/etc/apt/keyrings/rocm.gpg] https://repo.radeon.com/rocm/apt/5.2.3 focal main" \
|
||||
echo "deb [arch=amd64 signed-by=/etc/apt/keyrings/rocm.gpg] https://repo.radeon.com/rocm/apt/5.0 focal main" \
|
||||
| sudo tee /etc/apt/sources.list.d/rocm.list
|
||||
echo -e 'Package: *\nPin: release o=repo.radeon.com\nPin-Priority: 600' \
|
||||
| sudo tee /etc/apt/preferences.d/rocm-pin-600
|
||||
@@ -225,9 +210,9 @@ sudo apt update
|
||||
|
||||
```shell
|
||||
sudo tee /etc/yum.repos.d/rocm.repo <<EOF
|
||||
[ROCm-5.2]
|
||||
name=ROCm5.2
|
||||
baseurl=https://repo.radeon.com/rocm/yum/5.2/main
|
||||
[ROCm-5.0]
|
||||
name=ROCm5.0
|
||||
baseurl=https://repo.radeon.com/rocm/yum/5.0/main
|
||||
enabled=1
|
||||
priority=50
|
||||
gpgcheck=1
|
||||
@@ -242,9 +227,9 @@ sudo yum clean all
|
||||
|
||||
```shell
|
||||
sudo tee /etc/yum.repos.d/rocm.repo <<EOF
|
||||
[ROCm-5.2]
|
||||
name=ROCm5.2
|
||||
baseurl=https://repo.radeon.com/rocm/rhel8/5.2/main
|
||||
[ROCm-5.0.1]
|
||||
name=ROCm5.0.1
|
||||
baseurl=https://repo.radeon.com/rocm/rhel8/5.0/main
|
||||
enabled=1
|
||||
priority=50
|
||||
gpgcheck=1
|
||||
@@ -261,10 +246,10 @@ sudo yum clean all
|
||||
|
||||
```shell
|
||||
sudo tee /etc/zypp/repos.d/rocm.repo <<EOF
|
||||
[ROCm-5.2]
|
||||
name=ROCm5.2
|
||||
[ROCm-5.0]
|
||||
name=ROCm5.0
|
||||
name=rocm
|
||||
baseurl=https://repo.radeon.com/rocm/zyp/5.2/main
|
||||
baseurl=https://repo.radeon.com/rocm/zyp/5.0/main
|
||||
enabled=1
|
||||
gpgcheck=1
|
||||
gpgkey=https://repo.radeon.com/rocm/rocm.gpg.key
|
||||
|
||||
@@ -18,7 +18,6 @@ Detailed walkthroughs of specific use-cases driven by frameworks using ROCm
|
||||
acceleration.
|
||||
|
||||
- [Implementing Inception V3 on ROCm with PyTorch](machine_learning/pytorch_inception.md)
|
||||
- [Optimizing Inference with MIGraphX](machine_learning/migraphx_optimization.md)
|
||||
|
||||
:::
|
||||
|
||||
|
||||
@@ -10,11 +10,4 @@ A collection of detailed and guided examples for working with Inception V3 with
|
||||
|
||||
:::
|
||||
|
||||
:::{grid-item-card} Optimizing Inference with MIGraphX
|
||||
:link: migraphx_optimization
|
||||
:link-type: doc
|
||||
Walkthroughs of optimizing inference using MIGraphX.
|
||||
|
||||
:::
|
||||
|
||||
:::::
|
||||
|
||||
@@ -83,10 +83,6 @@ TensorFlow, \[Online image\]. [https://www.tensorflow.org/extras/tensorflow_bran
|
||||
|
||||
MAGMA, \[Online image\]. [https://bitbucket.org/icl/magma/src/master/docs/](https://bitbucket.org/icl/magma/src/master/docs/)
|
||||
|
||||
Advanced Micro Devices, Inc., \[Online\]. Available: [https://rocmsoftwareplatform.github.io/AMDMIGraphX/doc/html/](https://rocmsoftwareplatform.github.io/AMDMIGraphX/doc/html/)
|
||||
|
||||
Advanced Micro Devices, Inc., \[Online\]. Available: [https://github.com/ROCmSoftwarePlatform/AMDMIGraphX/wiki](https://github.com/ROCmSoftwarePlatform/AMDMIGraphX/wiki)
|
||||
|
||||
Docker, \[Online\]. [https://docs.docker.com/get-started/overview/](https://docs.docker.com/get-started/overview/)
|
||||
|
||||
Torchvision, \[Online\]. Available [https://pytorch.org/vision/master/index.html?highlight=torchvision#module-torchvision](https://pytorch.org/vision/master/index.html?highlight=torchvision#module-torchvision)
|
||||
|
||||
@@ -425,10 +425,6 @@ TensorFlow, \[Online image\]. [https://www.tensorflow.org/extras/tensorflow_bran
|
||||
|
||||
MAGMA, \[Online image\]. [https://bitbucket.org/icl/magma/src/master/docs/](https://bitbucket.org/icl/magma/src/master/docs/)
|
||||
|
||||
Advanced Micro Devices, Inc., \[Online\]. Available: [https://rocmsoftwareplatform.github.io/AMDMIGraphX/doc/html/](https://rocmsoftwareplatform.github.io/AMDMIGraphX/doc/html/)
|
||||
|
||||
Advanced Micro Devices, Inc., \[Online\]. Available: [https://github.com/ROCmSoftwarePlatform/AMDMIGraphX/wiki](https://github.com/ROCmSoftwarePlatform/AMDMIGraphX/wiki)
|
||||
|
||||
Docker, \[Online\]. [https://docs.docker.com/get-started/overview/](https://docs.docker.com/get-started/overview/)
|
||||
|
||||
Torchvision, \[Online\]. Available [https://pytorch.org/vision/master/index.html?highlight=torchvision#module-torchvision](https://pytorch.org/vision/master/index.html?highlight=torchvision#module-torchvision)
|
||||
|
||||
@@ -197,10 +197,6 @@ TensorFlow, \[Online image\]. [https://www.tensorflow.org/extras/tensorflow_bran
|
||||
|
||||
MAGMA, \[Online image\]. [https://bitbucket.org/icl/magma/src/master/docs/](https://bitbucket.org/icl/magma/src/master/docs/)
|
||||
|
||||
Advanced Micro Devices, Inc., \[Online\]. Available: [https://rocmsoftwareplatform.github.io/AMDMIGraphX/doc/html/](https://rocmsoftwareplatform.github.io/AMDMIGraphX/doc/html/)
|
||||
|
||||
Advanced Micro Devices, Inc., \[Online\]. Available: [https://github.com/ROCmSoftwarePlatform/AMDMIGraphX/wiki](https://github.com/ROCmSoftwarePlatform/AMDMIGraphX/wiki)
|
||||
|
||||
Docker, \[Online\]. [https://docs.docker.com/get-started/overview/](https://docs.docker.com/get-started/overview/)
|
||||
|
||||
Torchvision, \[Online\]. Available [https://pytorch.org/vision/master/index.html?highlight=torchvision#module-torchvision](https://pytorch.org/vision/master/index.html?highlight=torchvision#module-torchvision)
|
||||
|
||||
@@ -93,7 +93,6 @@ agile, flexible, rapid and secure manner. [more...](rocm)
|
||||
- [Examples](https://github.com/amd/rocm-examples)
|
||||
- [ML, DL, and AI](examples/machine_learning/all)
|
||||
- [](examples/machine_learning/pytorch_inception)
|
||||
- [](examples/machine_learning/migraphx_optimization)
|
||||
|
||||
:::
|
||||
::::
|
||||
|
||||
@@ -10,17 +10,10 @@ AMD's library for high performance machine learning primitives.
|
||||
|
||||
:::
|
||||
|
||||
:::{grid-item-card} {doc}`Composable Kernel <composable-kernel:index>`
|
||||
:::{grid-item-card} {doc}`Composable Kernel <composable_kernel:index>`
|
||||
Composable Kernel: Performance Portable Programming Model for Machine Learning Tensor Operators
|
||||
|
||||
- {doc}`Documentation <composable-kernel:index>`
|
||||
|
||||
:::
|
||||
|
||||
:::{grid-item-card} {doc}`MIGraphX <migraphx:index>`
|
||||
AMD MIGraphX is AMD's graph inference engine that accelerates machine learning model inference.
|
||||
|
||||
- {doc}`Documentation <migraphx:index>`
|
||||
- {doc}`Documentation <composable_kernel:index>`
|
||||
|
||||
:::
|
||||
|
||||
|
||||
@@ -42,8 +42,7 @@ Inter and intra-node communication is supported by the following projects:
|
||||
Libraries related to AI.
|
||||
|
||||
- {doc}`MIOpen <miopen:index>`
|
||||
- {doc}`Composable Kernel <composable-kernel:index>`
|
||||
- {doc}`MIGraphX <migraphx:index>`
|
||||
- {doc}`Composable Kernel <composable_kernel:index>`
|
||||
|
||||
:::
|
||||
|
||||
@@ -80,7 +79,7 @@ Computer vision related projects.
|
||||
|
||||
:::{grid-item-card} [Validation Tools](validation_tools)
|
||||
|
||||
- {doc}`ROCm Validation Suite <rocm-validation-suite:index>`
|
||||
- {doc}`ROCm Validation Suite <rocmvalidationsuite:index>`
|
||||
- {doc}`TransferBench <transferbench:index>`
|
||||
|
||||
:::
|
||||
|
||||
@@ -3,10 +3,10 @@
|
||||
:::::{grid} 1 1 2 2
|
||||
:gutter: 1
|
||||
|
||||
:::{grid-item-card} {doc}`RVS <rocm-validation-suite:index>`
|
||||
:::{grid-item-card} {doc}`RVS <rocmvalidationsuite:index>`
|
||||
The ROCm Validation Suite is a system administrator’s and cluster manager's tool for detecting and troubleshooting common problems affecting AMD GPU(s) running in a high-performance computing environment, enabled using the ROCm software stack on a compatible platform.
|
||||
|
||||
- {doc}`Documentation <rocm-validation-suite:index>`
|
||||
- {doc}`Documentation <rocmvalidationsuite:index>`
|
||||
|
||||
:::
|
||||
|
||||
|
||||
@@ -8,13 +8,12 @@ AMD ROCm™ Platform supports the following Linux distributions.
|
||||
|
||||
| Distribution       | Processor Architectures | Validated Kernel   |
|--------------------|-------------------------|--------------------|
| CentOS 8.3         | x86-64                  | 4.18               |
| CentOS 7.9         | x86-64                  | 3.10               |
| RHEL 8.6 to 8.5    | x86-64                  | 4.18               |
| RHEL 8.5, 8.4      | x86-64                  | 4.18               |
| RHEL 7.9           | x86-64                  | 3.10               |
| SLES 15 SP4        | x86-64                  | 5.14.21            |
| SLES 15 SP3        | x86-64                  | 5.3.18             |
| Ubuntu 20.04.4 LTS | x86-64                  | 5.13               |
| Ubuntu 20.04.3 LTS | x86-64                  | 5.11               |
| Ubuntu 20.04.3 LTS | x86-64                  | 5.8                |
| Ubuntu 18.04.5 LTS | x86-64                  | 5.4.0              |
|
||||
|
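To compare a host against the validated kernels listed in the table above, a minimal sketch along the following lines may help. The helper name `kernel_major_minor` is hypothetical and not part of ROCm; the script only reads `/etc/os-release` and `uname -r`, and it does not decide whether a system is supported.

```shell
#!/bin/sh
# Hypothetical helper (not a ROCm tool): reduce a kernel release string
# such as "5.13.0-40-generic" to its "major.minor" series.
kernel_major_minor() {
    printf '%s\n' "$1" | cut -d. -f1,2
}

# Report the distribution name as described by the OS itself, if available.
if [ -r /etc/os-release ]; then
    . /etc/os-release
    echo "Distribution: ${PRETTY_NAME:-unknown}"
fi

# Report the running kernel and its series for manual comparison with the table.
running_kernel="$(uname -r)"
echo "Running kernel: ${running_kernel} (series $(kernel_major_minor "${running_kernel}"))"
```

The printed series can then be checked by eye against the "Validated Kernel" column for the matching distribution.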
||||
## Virtualization Support
|
||||
|
||||
@@ -58,7 +58,6 @@ The table is ordered to follow ROCm's manifest file.
|
||||
| [rocPRIM](https://github.com/ROCmSoftwarePlatform/rocPRIM/) | [MIT](https://github.com/ROCmSoftwarePlatform/rocPRIM/blob/develop/LICENSE.txt) |
|
||||
| [rocWMMA](https://github.com/ROCmSoftwarePlatform/rocWMMA/) | [MIT](https://github.com/ROCmSoftwarePlatform/rocWMMA/blob/develop/LICENSE.md) |
|
||||
| [hipfort](https://github.com/ROCmSoftwarePlatform/hipfort/) | [MIT](https://github.com/ROCmSoftwarePlatform/hipfort/blob/master/LICENSE) |
|
||||
| [AMDMIGraphX](https://github.com/ROCmSoftwarePlatform/AMDMIGraphX/) | [MIT](https://github.com/ROCmSoftwarePlatform/AMDMIGraphX/blob/develop/LICENSE) |
|
||||
| [ROCmValidationSuite](https://github.com/ROCm-Developer-Tools/ROCmValidationSuite/) | [MIT](https://github.com/ROCm-Developer-Tools/ROCmValidationSuite/blob/master/LICENSE) |
|
||||
| [aomp](https://github.com/ROCm-Developer-Tools/aomp/) | [Apache 2.0](https://github.com/ROCm-Developer-Tools/aomp/blob/aomp-dev/LICENSE) |
|
||||
| [aomp-extras](https://github.com/ROCm-Developer-Tools/aomp-extras/) | [MIT](https://github.com/ROCm-Developer-Tools/aomp-extras/blob/aomp-dev/LICENSE) |
|
||||
@@ -121,4 +120,4 @@ following location: `/opt/rocm/share/doc/<component-name>/`
|
||||
For example, you can fetch the licensing information of the `_amd_comgr_`
|
||||
component (Code Object Manager) from the `amd_comgr` folder. A file named
|
||||
`LICENSE.txt` contains the license details at:
|
||||
`/opt/rocm-5.2.0/share/doc/amd_comgr/LICENSE.txt`
|
||||
`/opt/rocm-5.0.0/share/doc/amd_comgr/LICENSE.txt`
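As a rough sketch of how the per-component license files described above could be located, the following script lists every `LICENSE*` file one level below the documentation directory. The function name `list_licenses` and the `ROCM_DOC_DIR` variable are assumptions made for illustration; the default path follows the `/opt/rocm/share/doc/<component-name>/` layout given in the text, and a versioned install may use a different prefix.

```shell
#!/bin/sh
# ROCM_DOC_DIR is an assumed override for illustration; the default follows
# the /opt/rocm/share/doc/<component-name>/ layout described in the text.
ROCM_DOC_DIR="${ROCM_DOC_DIR:-/opt/rocm/share/doc}"

# Hypothetical helper: print each LICENSE* file found directly inside a
# component folder below the given documentation directory.
list_licenses() {
    find "$1" -mindepth 2 -maxdepth 2 -type f -name 'LICENSE*' 2>/dev/null | sort
}

list_licenses "${ROCM_DOC_DIR}"
```

If the directory does not exist the script prints nothing, which is itself a hint that ROCm is installed under a different prefix.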
|
||||
|
||||
@@ -146,9 +146,7 @@ subtrees:
|
||||
- title: MIOpen - Machine Intelligence
|
||||
url: ${project:miopen}
|
||||
- title: Composable Kernel
|
||||
url: ${project:composable-kernel}
|
||||
- title: MIGraphX - Graph Optimization
|
||||
url: ${project:migraphx}
|
||||
url: ${project:composable_kernel}
|
||||
- file: reference/computer_vision
|
||||
subtrees:
|
||||
- entries:
|
||||
@@ -171,7 +169,7 @@ subtrees:
|
||||
title: Validation Tools
|
||||
subtrees:
|
||||
- entries:
|
||||
- url: ${project:rocm-validation-suite}
|
||||
- url: ${project:rocmvalidationsuite}
|
||||
title: RVS
|
||||
- url: ${project:transferbench}
|
||||
title: TransferBench
|
||||
@@ -223,7 +221,6 @@ subtrees:
|
||||
subtrees:
|
||||
- entries:
|
||||
- file: examples/machine_learning/pytorch_inception
|
||||
- file: examples/machine_learning/migraphx_optimization
|
||||
|
||||
- caption: About
|
||||
entries:
|
||||
|
||||
@@ -1 +1,2 @@
|
||||
rocm-docs-core==0.16.0
|
||||
rocm-docs-core==1.8.0
|
||||
sphinx-reredirects
|
||||
|
||||
@@ -1,114 +1,106 @@
|
||||
#
|
||||
# This file is autogenerated by pip-compile with Python 3.11
|
||||
# This file is autogenerated by pip-compile with Python 3.10
|
||||
# by the following command:
|
||||
#
|
||||
# pip-compile docs/sphinx/requirements.in
|
||||
# pip-compile requirements.in
|
||||
#
|
||||
accessible-pygments==0.0.3
|
||||
accessible-pygments==0.0.5
|
||||
# via pydata-sphinx-theme
|
||||
alabaster==0.7.13
|
||||
alabaster==1.0.0
|
||||
# via sphinx
|
||||
babel==2.11.0
|
||||
babel==2.16.0
|
||||
# via
|
||||
# pydata-sphinx-theme
|
||||
# sphinx
|
||||
beautifulsoup4==4.11.2
|
||||
beautifulsoup4==4.12.3
|
||||
# via pydata-sphinx-theme
|
||||
breathe==4.34.0
|
||||
breathe==4.35.0
|
||||
# via rocm-docs-core
|
||||
certifi==2022.12.7
|
||||
certifi==2024.8.30
|
||||
# via requests
|
||||
cffi==1.15.1
|
||||
cffi==1.17.1
|
||||
# via
|
||||
# cryptography
|
||||
# pynacl
|
||||
charset-normalizer==2.1.1
|
||||
charset-normalizer==3.3.2
|
||||
# via requests
|
||||
click==8.1.3
|
||||
click==8.1.7
|
||||
# via sphinx-external-toc
|
||||
colorama==0.4.6
|
||||
# via
|
||||
# click
|
||||
# sphinx
|
||||
cryptography==40.0.2
|
||||
cryptography==43.0.1
|
||||
# via pyjwt
|
||||
deprecated==1.2.13
|
||||
deprecated==1.2.14
|
||||
# via pygithub
|
||||
docutils==0.19
|
||||
docutils==0.21.2
|
||||
# via
|
||||
# breathe
|
||||
# myst-parser
|
||||
# pydata-sphinx-theme
|
||||
# sphinx
|
||||
fastjsonschema==2.16.3
|
||||
fastjsonschema==2.20.0
|
||||
# via rocm-docs-core
|
||||
gitdb==4.0.10
|
||||
gitdb==4.0.11
|
||||
# via gitpython
|
||||
gitpython==3.1.30
|
||||
gitpython==3.1.43
|
||||
# via rocm-docs-core
|
||||
idna==3.4
|
||||
idna==3.10
|
||||
# via requests
|
||||
imagesize==1.4.1
|
||||
# via sphinx
|
||||
jinja2==3.1.2
|
||||
jinja2==3.1.4
|
||||
# via
|
||||
# myst-parser
|
||||
# sphinx
|
||||
linkify-it-py==1.0.3
|
||||
# via myst-parser
|
||||
markdown-it-py==2.2.0
|
||||
markdown-it-py==3.0.0
|
||||
# via
|
||||
# mdit-py-plugins
|
||||
# myst-parser
|
||||
markupsafe==2.1.2
|
||||
markupsafe==2.1.5
|
||||
# via jinja2
|
||||
mdit-py-plugins==0.3.4
|
||||
mdit-py-plugins==0.4.2
|
||||
# via myst-parser
|
||||
mdurl==0.1.2
|
||||
# via markdown-it-py
|
||||
myst-parser[linkify]==1.0.0
|
||||
myst-parser==4.0.0
|
||||
# via rocm-docs-core
|
||||
packaging==23.0
|
||||
packaging==24.1
|
||||
# via
|
||||
# pydata-sphinx-theme
|
||||
# sphinx
|
||||
pycparser==2.21
|
||||
pycparser==2.22
|
||||
# via cffi
|
||||
pydata-sphinx-theme==0.13.3
|
||||
pydata-sphinx-theme==0.15.4
|
||||
# via
|
||||
# rocm-docs-core
|
||||
# sphinx-book-theme
|
||||
pygithub==1.58.1
|
||||
pygithub==2.4.0
|
||||
# via rocm-docs-core
|
||||
pygments==2.14.0
|
||||
pygments==2.18.0
|
||||
# via
|
||||
# accessible-pygments
|
||||
# pydata-sphinx-theme
|
||||
# sphinx
|
||||
pyjwt[crypto]==2.6.0
|
||||
pyjwt[crypto]==2.9.0
|
||||
# via pygithub
|
||||
pynacl==1.5.0
|
||||
# via pygithub
|
||||
pytz==2022.7.1
|
||||
# via babel
|
||||
pyyaml==6.0
|
||||
pyyaml==6.0.2
|
||||
# via
|
||||
# myst-parser
|
||||
# rocm-docs-core
|
||||
# sphinx-external-toc
|
||||
requests==2.28.1
|
||||
requests==2.32.3
|
||||
# via
|
||||
# pygithub
|
||||
# sphinx
|
||||
rocm-docs-core==0.16.0
|
||||
# via -r docs/sphinx/requirements.in
|
||||
smmap==5.0.0
|
||||
rocm-docs-core==1.8.0
|
||||
# via -r requirements.in
|
||||
smmap==5.0.1
|
||||
# via gitdb
|
||||
snowballstemmer==2.2.0
|
||||
# via sphinx
|
||||
soupsieve==2.4
|
||||
soupsieve==2.6
|
||||
# via beautifulsoup4
|
||||
sphinx==5.3.0
|
||||
sphinx==8.0.2
|
||||
# via
|
||||
# breathe
|
||||
# myst-parser
|
||||
@@ -119,33 +111,40 @@ sphinx==5.3.0
|
||||
# sphinx-design
|
||||
# sphinx-external-toc
|
||||
# sphinx-notfound-page
|
||||
sphinx-book-theme==1.0.1
|
||||
# sphinx-reredirects
|
||||
sphinx-book-theme==1.1.3
|
||||
# via rocm-docs-core
|
||||
sphinx-copybutton==0.5.1
|
||||
sphinx-copybutton==0.5.2
|
||||
# via rocm-docs-core
|
||||
sphinx-design==0.4.1
|
||||
sphinx-design==0.6.1
|
||||
# via rocm-docs-core
|
||||
sphinx-external-toc==0.3.1
|
||||
sphinx-external-toc==1.0.1
|
||||
# via rocm-docs-core
|
||||
sphinx-notfound-page==0.8.3
|
||||
sphinx-notfound-page==1.0.4
|
||||
# via rocm-docs-core
|
||||
sphinxcontrib-applehelp==1.0.4
|
||||
sphinx-reredirects==0.1.5
|
||||
# via -r requirements.in
|
||||
sphinxcontrib-applehelp==2.0.0
|
||||
# via sphinx
|
||||
sphinxcontrib-devhelp==1.0.2
|
||||
sphinxcontrib-devhelp==2.0.0
|
||||
# via sphinx
|
||||
sphinxcontrib-htmlhelp==2.0.1
|
||||
sphinxcontrib-htmlhelp==2.1.0
|
||||
# via sphinx
|
||||
sphinxcontrib-jsmath==1.0.1
|
||||
# via sphinx
|
||||
sphinxcontrib-qthelp==1.0.3
|
||||
sphinxcontrib-qthelp==2.0.0
|
||||
# via sphinx
|
||||
sphinxcontrib-serializinghtml==1.1.5
|
||||
sphinxcontrib-serializinghtml==2.0.0
|
||||
# via sphinx
|
||||
typing-extensions==4.5.0
|
||||
# via pydata-sphinx-theme
|
||||
uc-micro-py==1.0.1
|
||||
# via linkify-it-py
|
||||
urllib3==1.26.13
|
||||
# via requests
|
||||
wrapt==1.14.1
|
||||
tomli==2.0.1
|
||||
# via sphinx
|
||||
typing-extensions==4.12.2
|
||||
# via
|
||||
# pydata-sphinx-theme
|
||||
# pygithub
|
||||
urllib3==2.2.3
|
||||
# via
|
||||
# pygithub
|
||||
# requests
|
||||
wrapt==1.16.0
|
||||
# via deprecated
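When comparing pins across branches as in the listing above, it can be convenient to read one pinned version directly from a compiled requirements file. The sketch below assumes the `name==version` line format that `pip-compile` emits; `pinned_version` is a hypothetical helper, and the `docs/sphinx/requirements.txt` path is taken from the Read the Docs configuration shown earlier in this comparison.

```shell
#!/bin/sh
# Hypothetical helper: print the pinned version of a package from a
# pip-compile style requirements file (lines of the form name==version,
# optionally with extras such as name[extra]==version).
pinned_version() {
    # $1 = package name, $2 = requirements file
    grep -E "^$1(\[[^]]*\])?==" "$2" | head -n 1 | sed 's/^.*==//'
}

# Example use, guarded so the script is harmless outside the repository.
if [ -r docs/sphinx/requirements.txt ]; then
    echo "rocm-docs-core: $(pinned_version rocm-docs-core docs/sphinx/requirements.txt)"
fi
```

The package name is used as a regular expression, so names containing dots would match slightly more loosely than intended; for the packages listed here that makes no difference.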
|
||||
|
||||
@@ -224,10 +224,6 @@ ROCm CMake Packages
|
||||
+-----------+----------+--------------------------------------------------------+
| MIOpen    | miopen   | ``MIOpen``                                             |
+-----------+----------+--------------------------------------------------------+
| MIGraphX  | migraphx | ``migraphx::migraphx``, ``migraphx::migraphx_c``,      |
|           |          | ``migraphx::migraphx_cpu``, ``migraphx::migraphx_gpu``,|
|           |          | ``migraphx::migraphx_onnx``, ``migraphx::migraphx_tf`` |
+-----------+----------+--------------------------------------------------------+
|
||||
|
||||
Using CMake Presets
|
||||
===================
|
||||
|
||||