Compare commits

...

30 Commits

Author SHA1 Message Date
Sam Wu
815a7aa81d Bump rocm-docs-core to 1.8.0 2024-09-16 12:03:30 -06:00
Sam Wu
966d100108 Update documentation requirements 2024-06-06 16:58:10 -06:00
Sam Wu
ee0084a97d Fix RTD config 2024-05-02 08:52:10 -06:00
Sam Wu
110f9c37d6 Update documentation requirements 2024-05-01 16:58:29 -06:00
Sam Wu
0bf18fc59a Update documentation requirements 2024-05-01 16:44:46 -06:00
Sam Wu
5ef1b40475 add version to html title 2023-08-04 17:11:01 -06:00
Sam Wu
1e9227e254 rocm-docs-core v0.18.3 2023-06-29 15:58:40 -06:00
Máté Ferenc Nagy-Egri
b1e441451e Downgrade license notice to 5.0.0 2023-06-22 18:46:15 +02:00
Máté Ferenc Nagy-Egri
d3a0e85c21 Downgrade changelog to 5.0.0 2023-06-22 18:46:15 +02:00
Máté Ferenc Nagy-Egri
1f39f027fc Downgrade install instructions to 5.0.0 2023-06-22 18:46:14 +02:00
Máté Ferenc Nagy-Egri
eec431a98f Downgrade release notes to 5.0.0 2023-06-22 18:46:14 +02:00
Máté Ferenc Nagy-Egri
e651705e91 Downgrade license notice to 5.0.1 2023-06-22 18:46:14 +02:00
Máté Ferenc Nagy-Egri
34e9bb3c2e Downgrade changelog to 5.0.1 2023-06-22 18:46:14 +02:00
Máté Ferenc Nagy-Egri
0ace5e9db2 Downgrade install instructions to 5.0.1 2023-06-22 18:46:14 +02:00
Máté Ferenc Nagy-Egri
7062df1c67 Downgrade release notes to 5.0.1 2023-06-22 18:46:14 +02:00
Máté Ferenc Nagy-Egri
24db72b8a8 Downgrade license notice to 5.0.2 2023-06-22 18:46:14 +02:00
Máté Ferenc Nagy-Egri
4668632fa2 Downgrade changelog to 5.0.2 2023-06-22 18:46:13 +02:00
Máté Ferenc Nagy-Egri
270bc73661 Downgrade install instructions to 5.0.2 2023-06-22 18:46:13 +02:00
Máté Ferenc Nagy-Egri
a7ce874940 Downgrade OS support to 5.0.2 2023-06-22 18:46:13 +02:00
Máté Ferenc Nagy-Egri
e78e6a9a23 Downgrade release notes to 5.0.2 2023-06-22 18:46:13 +02:00
Máté Ferenc Nagy-Egri
4bf9dc9560 Remove references to GraphX 2023-06-22 18:46:13 +02:00
Máté Ferenc Nagy-Egri
0dd3fc9eb4 Downgrade license notice to 5.1.0 2023-06-22 18:46:13 +02:00
Máté Ferenc Nagy-Egri
0908ed22b1 Downgrade changelog to 5.1.0 2023-06-22 18:46:13 +02:00
Máté Ferenc Nagy-Egri
683b940a89 Downgrade install instructions to 5.1.0 2023-06-22 18:46:12 +02:00
Máté Ferenc Nagy-Egri
3f60f1df3b Downgrade OS support to 5.1.0 2023-06-22 18:46:12 +02:00
Máté Ferenc Nagy-Egri
d9543485ce Downgrade release notes to 5.1.0 2023-06-22 18:46:12 +02:00
Máté Ferenc Nagy-Egri
65180d309c Downgrade license notice to 5.2.0 2023-06-22 18:46:12 +02:00
Máté Ferenc Nagy-Egri
07901f4a38 Downgrade changelog to 5.2.0 2023-06-22 18:46:12 +02:00
Máté Ferenc Nagy-Egri
a6e350fbe1 Downgrade install instructions to 5.2.0 2023-06-22 18:46:12 +02:00
Máté Ferenc Nagy-Egri
e7c7e1ea3b Downgrade release notes to 5.2.0 2023-06-22 18:46:12 +02:00
22 changed files with 571 additions and 1902 deletions

View File

@@ -6,9 +6,17 @@ version: 2
sphinx:
configuration: docs/conf.py
formats: [htmlzip, pdf, epub]
formats: [htmlzip]
python:
version: "3.8"
install:
- requirements: docs/sphinx/requirements.txt
build:
os: ubuntu-22.04
tools:
python: "3.10"
apt_packages:
- "doxygen"
- "gfortran" # For pre-processing fortran sources
- "graphviz" # For dot graphs in doxygen

File diff suppressed because it is too large Load Diff

View File

@@ -15,69 +15,404 @@ The release notes for the ROCm platform.
-------------------
## ROCm 5.2.3
## ROCm 5.0.0
<!-- markdownlint-disable first-line-h1 -->
<!-- markdownlint-disable no-duplicate-header -->
### Changes in This Release
### What's New in This Release
#### Ubuntu 18.04 End of Life Announcement
#### HIP Enhancements
Support for Ubuntu 18.04 ends in this release. Future releases of ROCm will not provide prebuilt packages for Ubuntu 18.04.
HIP and Other Runtimes
The ROCm v5.0 release consists of the following HIP enhancements.
#### HIP Runtime
##### HIP Installation Guide Updates
##### Fixes
The HIP Installation Guide is updated to include building HIP from source on the NVIDIA platform.
- A bug was discovered in the HIP graph capture implementation in the ROCm v5.2.0 release. If the same kernel is called twice (with different argument values) in a graph capture, the implementation only kept the argument values for the second kernel call.
Refer to the HIP Installation Guide v5.0 for more details.
- A bug was introduced in the hiprtc implementation in the ROCm v5.2.0 release. This bug caused the `hiprtcGetLoweredName` call to fail for named expressions with whitespace in it.
##### Managed Memory Allocation
Example:
Managed memory, including the `__managed__` keyword, is now supported in the HIP combined host/device compilation. Through unified memory allocation, managed memory allows data to be shared and accessible to both the CPU and GPU using a single pointer. The allocation is managed by the AMD GPU driver using the Linux Heterogeneous Memory Management (HMM) mechanism. The user can call managed memory API hipMallocManaged to allocate a large chunk of HMM memory, execute kernels on a device, and fetch data between the host and device as needed.
The named expression `my_sqrt<complex<double>>` passed but `my_sqrt<complex<double >>` failed.
ROCm Libraries
> **Note**
>
> In a HIP application, it is recommended to do a capability check before calling the managed memory APIs. For example,
>
> ```cpp
> int managed_memory = 0;
> HIPCHECK(hipDeviceGetAttribute(&managed_memory,
> hipDeviceAttributeManagedMemory,p_gpuDevice));
> if (!managed_memory ) {
> printf ("info: managed memory access not supported on the device %d\n Skipped\n", p_gpuDevice);
> }
> else {
> HIPCHECK(hipSetDevice(p_gpuDevice));
> HIPCHECK(hipMallocManaged(&Hmm, N * sizeof(T)));
> . . .
> }
> ```
#### RCCL
> **Note**
>
> The managed memory capability check may not be necessary; however, if HMM is not supported, managed malloc will fall back to using system memory. Other managed memory API calls will, then, have
##### Added
Refer to the HIP API documentation for more details on managed memory APIs.
Compatibility with NCCL 2.12.10
For the application, see
- Packages for test and benchmark executables on all supported OSes using CPack
<https://github.com/ROCm-Developer-Tools/HIP/blob/rocm-4.5.x/tests/src/runtimeApi/memory/hipMallocManaged.cpp>
- Added custom signal handler - opt-in with RCCL_ENABLE_SIGNALHANDLER=1
#### New Environment Variable
- Additional details provided if Binary File Descriptor library (BFD) is pre-installed.
The following new environment variable is added in this release:
- Added experimental support for using multiple ranks per device
| Environment Variable | Value | Description |
|----------------------|-----------------------|-------------|
| HSA_COOP_CU_COUNT | 0 or 1 (default is 0) | Some processors support more CUs than can reliably be used in a cooperative dispatch. Setting the environment variable HSA_COOP_CU_COUNT to 1 will cause ROCr to return the correct CU count for cooperative groups through the HSA_AMD_AGENT_INFO_COOPERATIVE_COMPUTE_UNIT_COUNT attribute of hsa_agent_get_info(). Setting HSA_COOP_CU_COUNT to other values, or leaving it unset, will cause ROCr to return the same CU count for the attributes HSA_AMD_AGENT_INFO_COOPERATIVE_COMPUTE_UNIT_COUNT and HSA_AMD_AGENT_INFO_COMPUTE_UNIT_COUNT. Future ROCm releases will make HSA_COOP_CU_COUNT=1 the default. |
- Requires using a new interface to create communicator (ncclCommInitRankMulti), refer to the interface documentation for details.
### Breaking Changes
- To avoid potential deadlocks, user might have to set an environment variables increasing the number of hardware queues. For example,
#### Runtime Breaking Change
```sh
export GPU_MAX_HW_QUEUES=16
Re-ordering of the enumerated type in hip_runtime_api.h to better match NV. See below for the difference in enumerated types.
ROCm software will be affected if any of the defined enums listed below are used in the code. Applications built with ROCm v5.0 enumerated types will work with a ROCm 4.5.2 driver. However, an undefined behavior error will occur with a ROCm v4.5.2 application that uses these enumerated types with a ROCm 5.0 runtime.
```diff
typedef enum hipDeviceAttribute_t {
- hipDeviceAttributeMaxThreadsPerBlock, ///< Maximum number of threads per block.
- hipDeviceAttributeMaxBlockDimX, ///< Maximum x-dimension of a block.
- hipDeviceAttributeMaxBlockDimY, ///< Maximum y-dimension of a block.
- hipDeviceAttributeMaxBlockDimZ, ///< Maximum z-dimension of a block.
- hipDeviceAttributeMaxGridDimX, ///< Maximum x-dimension of a grid.
- hipDeviceAttributeMaxGridDimY, ///< Maximum y-dimension of a grid.
- hipDeviceAttributeMaxGridDimZ, ///< Maximum z-dimension of a grid.
- hipDeviceAttributeMaxSharedMemoryPerBlock, ///< Maximum shared memory available per block in
- ///< bytes.
- hipDeviceAttributeTotalConstantMemory, ///< Constant memory size in bytes.
- hipDeviceAttributeWarpSize, ///< Warp size in threads.
- hipDeviceAttributeMaxRegistersPerBlock, ///< Maximum number of 32-bit registers available to a
- ///< thread block. This number is shared by all thread
- ///< blocks simultaneously resident on a
- ///< multiprocessor.
- hipDeviceAttributeClockRate, ///< Peak clock frequency in kilohertz.
- hipDeviceAttributeMemoryClockRate, ///< Peak memory clock frequency in kilohertz.
- hipDeviceAttributeMemoryBusWidth, ///< Global memory bus width in bits.
- hipDeviceAttributeMultiprocessorCount, ///< Number of multiprocessors on the device.
- hipDeviceAttributeComputeMode, ///< Compute mode that device is currently in.
- hipDeviceAttributeL2CacheSize, ///< Size of L2 cache in bytes. 0 if the device doesn't have L2
- ///< cache.
- hipDeviceAttributeMaxThreadsPerMultiProcessor, ///< Maximum resident threads per
- ///< multiprocessor.
- hipDeviceAttributeComputeCapabilityMajor, ///< Major compute capability version number.
- hipDeviceAttributeComputeCapabilityMinor, ///< Minor compute capability version number.
- hipDeviceAttributeConcurrentKernels, ///< Device can possibly execute multiple kernels
- ///< concurrently.
- hipDeviceAttributePciBusId, ///< PCI Bus ID.
- hipDeviceAttributePciDeviceId, ///< PCI Device ID.
- hipDeviceAttributeMaxSharedMemoryPerMultiprocessor, ///< Maximum Shared Memory Per
- ///< Multiprocessor.
- hipDeviceAttributeIsMultiGpuBoard, ///< Multiple GPU devices.
- hipDeviceAttributeIntegrated, ///< iGPU
- hipDeviceAttributeCooperativeLaunch, ///< Support cooperative launch
- hipDeviceAttributeCooperativeMultiDeviceLaunch, ///< Support cooperative launch on multiple devices
- hipDeviceAttributeMaxTexture1DWidth, ///< Maximum number of elements in 1D images
- hipDeviceAttributeMaxTexture2DWidth, ///< Maximum dimension width of 2D images in image elements
- hipDeviceAttributeMaxTexture2DHeight, ///< Maximum dimension height of 2D images in image elements
- hipDeviceAttributeMaxTexture3DWidth, ///< Maximum dimension width of 3D images in image elements
- hipDeviceAttributeMaxTexture3DHeight, ///< Maximum dimensions height of 3D images in image elements
- hipDeviceAttributeMaxTexture3DDepth, ///< Maximum dimensions depth of 3D images in image elements
+ hipDeviceAttributeCudaCompatibleBegin = 0,
- hipDeviceAttributeHdpMemFlushCntl, ///< Address of the HDP_MEM_COHERENCY_FLUSH_CNTL register
- hipDeviceAttributeHdpRegFlushCntl, ///< Address of the HDP_REG_COHERENCY_FLUSH_CNTL register
+ hipDeviceAttributeEccEnabled = hipDeviceAttributeCudaCompatibleBegin, ///< Whether ECC support is enabled.
+ hipDeviceAttributeAccessPolicyMaxWindowSize, ///< Cuda only. The maximum size of the window policy in bytes.
+ hipDeviceAttributeAsyncEngineCount, ///< Cuda only. Asynchronous engines number.
+ hipDeviceAttributeCanMapHostMemory, ///< Whether host memory can be mapped into device address space
+ hipDeviceAttributeCanUseHostPointerForRegisteredMem,///< Cuda only. Device can access host registered memory
+ ///< at the same virtual address as the CPU
+ hipDeviceAttributeClockRate, ///< Peak clock frequency in kilohertz.
+ hipDeviceAttributeComputeMode, ///< Compute mode that device is currently in.
+ hipDeviceAttributeComputePreemptionSupported, ///< Cuda only. Device supports Compute Preemption.
+ hipDeviceAttributeConcurrentKernels, ///< Device can possibly execute multiple kernels concurrently.
+ hipDeviceAttributeConcurrentManagedAccess, ///< Device can coherently access managed memory concurrently with the CPU
+ hipDeviceAttributeCooperativeLaunch, ///< Support cooperative launch
+ hipDeviceAttributeCooperativeMultiDeviceLaunch, ///< Support cooperative launch on multiple devices
+ hipDeviceAttributeDeviceOverlap, ///< Cuda only. Device can concurrently copy memory and execute a kernel.
+ ///< Deprecated. Use instead asyncEngineCount.
+ hipDeviceAttributeDirectManagedMemAccessFromHost, ///< Host can directly access managed memory on
+ ///< the device without migration
+ hipDeviceAttributeGlobalL1CacheSupported, ///< Cuda only. Device supports caching globals in L1
+ hipDeviceAttributeHostNativeAtomicSupported, ///< Cuda only. Link between the device and the host supports native atomic operations
+ hipDeviceAttributeIntegrated, ///< Device is integrated GPU
+ hipDeviceAttributeIsMultiGpuBoard, ///< Multiple GPU devices.
+ hipDeviceAttributeKernelExecTimeout, ///< Run time limit for kernels executed on the device
+ hipDeviceAttributeL2CacheSize, ///< Size of L2 cache in bytes. 0 if the device doesn't have L2 cache.
+ hipDeviceAttributeLocalL1CacheSupported, ///< caching locals in L1 is supported
+ hipDeviceAttributeLuid, ///< Cuda only. 8-byte locally unique identifier in 8 bytes. Undefined on TCC and non-Windows platforms
+ hipDeviceAttributeLuidDeviceNodeMask, ///< Cuda only. Luid device node mask. Undefined on TCC and non-Windows platforms
+ hipDeviceAttributeComputeCapabilityMajor, ///< Major compute capability version number.
+ hipDeviceAttributeManagedMemory, ///< Device supports allocating managed memory on this system
+ hipDeviceAttributeMaxBlocksPerMultiProcessor, ///< Cuda only. Max block size per multiprocessor
+ hipDeviceAttributeMaxBlockDimX, ///< Max block size in width.
+ hipDeviceAttributeMaxBlockDimY, ///< Max block size in height.
+ hipDeviceAttributeMaxBlockDimZ, ///< Max block size in depth.
+ hipDeviceAttributeMaxGridDimX, ///< Max grid size in width.
+ hipDeviceAttributeMaxGridDimY, ///< Max grid size in height.
+ hipDeviceAttributeMaxGridDimZ, ///< Max grid size in depth.
+ hipDeviceAttributeMaxSurface1D, ///< Maximum size of 1D surface.
+ hipDeviceAttributeMaxSurface1DLayered, ///< Cuda only. Maximum dimensions of 1D layered surface.
+ hipDeviceAttributeMaxSurface2D, ///< Maximum dimension (width, height) of 2D surface.
+ hipDeviceAttributeMaxSurface2DLayered, ///< Cuda only. Maximum dimensions of 2D layered surface.
+ hipDeviceAttributeMaxSurface3D, ///< Maximum dimension (width, height, depth) of 3D surface.
+ hipDeviceAttributeMaxSurfaceCubemap, ///< Cuda only. Maximum dimensions of Cubemap surface.
+ hipDeviceAttributeMaxSurfaceCubemapLayered, ///< Cuda only. Maximum dimension of Cubemap layered surface.
+ hipDeviceAttributeMaxTexture1DWidth, ///< Maximum size of 1D texture.
+ hipDeviceAttributeMaxTexture1DLayered, ///< Cuda only. Maximum dimensions of 1D layered texture.
+ hipDeviceAttributeMaxTexture1DLinear, ///< Maximum number of elements allocatable in a 1D linear texture.
+ ///< Use cudaDeviceGetTexture1DLinearMaxWidth() instead on Cuda.
+ hipDeviceAttributeMaxTexture1DMipmap, ///< Cuda only. Maximum size of 1D mipmapped texture.
+ hipDeviceAttributeMaxTexture2DWidth, ///< Maximum dimension width of 2D texture.
+ hipDeviceAttributeMaxTexture2DHeight, ///< Maximum dimension hight of 2D texture.
+ hipDeviceAttributeMaxTexture2DGather, ///< Cuda only. Maximum dimensions of 2D texture if gather operations performed.
+ hipDeviceAttributeMaxTexture2DLayered, ///< Cuda only. Maximum dimensions of 2D layered texture.
+ hipDeviceAttributeMaxTexture2DLinear, ///< Cuda only. Maximum dimensions (width, height, pitch) of 2D textures bound to pitched memory.
+ hipDeviceAttributeMaxTexture2DMipmap, ///< Cuda only. Maximum dimensions of 2D mipmapped texture.
+ hipDeviceAttributeMaxTexture3DWidth, ///< Maximum dimension width of 3D texture.
+ hipDeviceAttributeMaxTexture3DHeight, ///< Maximum dimension height of 3D texture.
+ hipDeviceAttributeMaxTexture3DDepth, ///< Maximum dimension depth of 3D texture.
+ hipDeviceAttributeMaxTexture3DAlt, ///< Cuda only. Maximum dimensions of alternate 3D texture.
+ hipDeviceAttributeMaxTextureCubemap, ///< Cuda only. Maximum dimensions of Cubemap texture
+ hipDeviceAttributeMaxTextureCubemapLayered, ///< Cuda only. Maximum dimensions of Cubemap layered texture.
+ hipDeviceAttributeMaxThreadsDim, ///< Maximum dimension of a block
+ hipDeviceAttributeMaxThreadsPerBlock, ///< Maximum number of threads per block.
+ hipDeviceAttributeMaxThreadsPerMultiProcessor, ///< Maximum resident threads per multiprocessor.
+ hipDeviceAttributeMaxPitch, ///< Maximum pitch in bytes allowed by memory copies
+ hipDeviceAttributeMemoryBusWidth, ///< Global memory bus width in bits.
+ hipDeviceAttributeMemoryClockRate, ///< Peak memory clock frequency in kilohertz.
+ hipDeviceAttributeComputeCapabilityMinor, ///< Minor compute capability version number.
+ hipDeviceAttributeMultiGpuBoardGroupID, ///< Cuda only. Unique ID of device group on the same multi-GPU board
+ hipDeviceAttributeMultiprocessorCount, ///< Number of multiprocessors on the device.
+ hipDeviceAttributeName, ///< Device name.
+ hipDeviceAttributePageableMemoryAccess, ///< Device supports coherently accessing pageable memory
+ ///< without calling hipHostRegister on it
+ hipDeviceAttributePageableMemoryAccessUsesHostPageTables, ///< Device accesses pageable memory via the host's page tables
+ hipDeviceAttributePciBusId, ///< PCI Bus ID.
+ hipDeviceAttributePciDeviceId, ///< PCI Device ID.
+ hipDeviceAttributePciDomainID, ///< PCI Domain ID.
+ hipDeviceAttributePersistingL2CacheMaxSize, ///< Cuda11 only. Maximum l2 persisting lines capacity in bytes
+ hipDeviceAttributeMaxRegistersPerBlock, ///< 32-bit registers available to a thread block. This number is shared
+ ///< by all thread blocks simultaneously resident on a multiprocessor.
+ hipDeviceAttributeMaxRegistersPerMultiprocessor, ///< 32-bit registers available per block.
+ hipDeviceAttributeReservedSharedMemPerBlock, ///< Cuda11 only. Shared memory reserved by CUDA driver per block.
+ hipDeviceAttributeMaxSharedMemoryPerBlock, ///< Maximum shared memory available per block in bytes.
+ hipDeviceAttributeSharedMemPerBlockOptin, ///< Cuda only. Maximum shared memory per block usable by special opt in.
+ hipDeviceAttributeSharedMemPerMultiprocessor, ///< Cuda only. Shared memory available per multiprocessor.
+ hipDeviceAttributeSingleToDoublePrecisionPerfRatio, ///< Cuda only. Performance ratio of single precision to double precision.
+ hipDeviceAttributeStreamPrioritiesSupported, ///< Cuda only. Whether to support stream priorities.
+ hipDeviceAttributeSurfaceAlignment, ///< Cuda only. Alignment requirement for surfaces
+ hipDeviceAttributeTccDriver, ///< Cuda only. Whether device is a Tesla device using TCC driver
+ hipDeviceAttributeTextureAlignment, ///< Alignment requirement for textures
+ hipDeviceAttributeTexturePitchAlignment, ///< Pitch alignment requirement for 2D texture references bound to pitched memory;
+ hipDeviceAttributeTotalConstantMemory, ///< Constant memory size in bytes.
+ hipDeviceAttributeTotalGlobalMem, ///< Global memory available on devicice.
+ hipDeviceAttributeUnifiedAddressing, ///< Cuda only. An unified address space shared with the host.
+ hipDeviceAttributeUuid, ///< Cuda only. Unique ID in 16 byte.
+ hipDeviceAttributeWarpSize, ///< Warp size in threads.
- hipDeviceAttributeMaxPitch, ///< Maximum pitch in bytes allowed by memory copies
- hipDeviceAttributeTextureAlignment, ///<Alignment requirement for textures
- hipDeviceAttributeTexturePitchAlignment, ///<Pitch alignment requirement for 2D texture references bound to pitched memory;
- hipDeviceAttributeKernelExecTimeout, ///<Run time limit for kernels executed on the device
- hipDeviceAttributeCanMapHostMemory, ///<Device can map host memory into device address space
- hipDeviceAttributeEccEnabled, ///<Device has ECC support enabled
+ hipDeviceAttributeCudaCompatibleEnd = 9999,
+ hipDeviceAttributeAmdSpecificBegin = 10000,
- hipDeviceAttributeCooperativeMultiDeviceUnmatchedFunc, ///< Supports cooperative launch on multiple
- ///devices with unmatched functions
- hipDeviceAttributeCooperativeMultiDeviceUnmatchedGridDim, ///< Supports cooperative launch on multiple
- ///devices with unmatched grid dimensions
- hipDeviceAttributeCooperativeMultiDeviceUnmatchedBlockDim, ///< Supports cooperative launch on multiple
- ///devices with unmatched block dimensions
- hipDeviceAttributeCooperativeMultiDeviceUnmatchedSharedMem, ///< Supports cooperative launch on multiple
- ///devices with unmatched shared memories
- hipDeviceAttributeAsicRevision, ///< Revision of the GPU in this device
- hipDeviceAttributeManagedMemory, ///< Device supports allocating managed memory on this system
- hipDeviceAttributeDirectManagedMemAccessFromHost, ///< Host can directly access managed memory on
- /// the device without migration
- hipDeviceAttributeConcurrentManagedAccess, ///< Device can coherently access managed memory
- /// concurrently with the CPU
- hipDeviceAttributePageableMemoryAccess, ///< Device supports coherently accessing pageable memory
- /// without calling hipHostRegister on it
- hipDeviceAttributePageableMemoryAccessUsesHostPageTables, ///< Device accesses pageable memory via
- /// the host's page tables
- hipDeviceAttributeCanUseStreamWaitValue ///< '1' if Device supports hipStreamWaitValue32() and
- ///< hipStreamWaitValue64() , '0' otherwise.
+ hipDeviceAttributeClockInstructionRate = hipDeviceAttributeAmdSpecificBegin, ///< Frequency in khz of the timer used by the device-side "clock*"
+ hipDeviceAttributeArch, ///< Device architecture
+ hipDeviceAttributeMaxSharedMemoryPerMultiprocessor, ///< Maximum Shared Memory PerMultiprocessor.
+ hipDeviceAttributeGcnArch, ///< Device gcn architecture
+ hipDeviceAttributeGcnArchName, ///< Device gcnArch name in 256 bytes
+ hipDeviceAttributeHdpMemFlushCntl, ///< Address of the HDP_MEM_COHERENCY_FLUSH_CNTL register
+ hipDeviceAttributeHdpRegFlushCntl, ///< Address of the HDP_REG_COHERENCY_FLUSH_CNTL register
+ hipDeviceAttributeCooperativeMultiDeviceUnmatchedFunc, ///< Supports cooperative launch on multiple
+ ///< devices with unmatched functions
+ hipDeviceAttributeCooperativeMultiDeviceUnmatchedGridDim, ///< Supports cooperative launch on multiple
+ ///< devices with unmatched grid dimensions
+ hipDeviceAttributeCooperativeMultiDeviceUnmatchedBlockDim, ///< Supports cooperative launch on multiple
+ ///< devices with unmatched block dimensions
+ hipDeviceAttributeCooperativeMultiDeviceUnmatchedSharedMem, ///< Supports cooperative launch on multiple
+ ///< devices with unmatched shared memories
+ hipDeviceAttributeIsLargeBar, ///< Whether it is LargeBar
+ hipDeviceAttributeAsicRevision, ///< Revision of the GPU in this device
+ hipDeviceAttributeCanUseStreamWaitValue, ///< '1' if Device supports hipStreamWaitValue32() and
+ ///< hipStreamWaitValue64() , '0' otherwise.
+ hipDeviceAttributeAmdSpecificEnd = 19999,
+ hipDeviceAttributeVendorSpecificBegin = 20000,
+ // Extended attributes for vendors
} hipDeviceAttribute_t;
enum hipComputeMode {
```
- Added support for reusing ports in NET/IB channels
### Known Issues
- Opt-in with NCCL_IB_SOCK_CLIENT_PORT_REUSE=1 and NCCL_IB_SOCK_SERVER_PORT_REUSE=1
#### Incorrect dGPU Behavior When Using AMDVBFlash Tool
- When "Call to bind failed: Address already in use" error happens in large-scale AlltoAll(for example, >=64 MI200 nodes), users are suggested to opt-in either one or both of the options to resolve the massive port usage issue
The AMDVBFlash tool, used for flashing the VBIOS image to dGPU, does not communicate with the ROM Controller specifically when the driver is present. This is because the driver, as part of its runtime power management feature, puts the dGPU to a sleep state.
- Avoid using NCCL_IB_SOCK_SERVER_PORT_REUSE when NCCL_NCHANNELS_PER_NET_PEER is tuned >1
As a workaround, users can run amdgpu.runpm=0, which temporarily disables the runtime power management feature from the driver and dynamically changes some power control-related sysfs files.
##### Removed
#### Issue with START Timestamp in ROCProfiler
- Removed experimental clique-based kernels
Users may encounter an issue with the enabled timestamp functionality for monitoring one or multiple counters. ROCProfiler outputs the following four timestamps for each kernel:
#### Development Tools
- Dispatch
No notable changes in this release for development tools, including the compiler, profiler, and debugger
Deployment and Management Tools
- Start
No notable changes in this release for deployment and management tools.
Older ROCm Releases
- End
For release information for older ROCm releases, refer to <https://github.com/RadeonOpenCompute/ROCm/blob/master/CHANGELOG.md>
- Complete
##### Issue
This defect is related to the Start timestamp functionality, which incorrectly shows an earlier time than the Dispatch timestamp.
To reproduce the issue,
1. Enable timing using the --timestamp on flag.
2. Use the -i option with the input filename that contains the name of the counter(s) to monitor.
3. Run the program.
4. Check the output result file.
##### Current behavior
BeginNS is lower than DispatchNS, which is incorrect.
##### Expected behavior
The correct order is:
Dispatch < Start < End < Complete
Users cannot use ROCProfiler to measure the time spent on each kernel because of the incorrect timestamp with counter collection enabled.
##### Recommended Workaround
Users are recommended to collect kernel execution timestamps without monitoring counters, as follows:
1. Enable timing using the --timestamp on flag, and run the application.
2. Rerun the application using the -i option with the input filename that contains the name of the counter(s) to monitor, and save this to a different output file using the -o flag.
3. Check the output result file from step 1.
4. The order of timestamps correctly displays as:
DispathNS < BeginNS < EndNS < CompleteNS
5. Users can find the values of the collected counters in the output file generated in step 2.
#### Radeon Pro V620 and W6800 Workstation GPUs
##### No Support for SMI and ROCDebugger on SRIOV
System Management Interface (SMI) and ROCDebugger are not supported in the SRIOV environment on any GPU. For more information, refer to the Systems Management Interface documentation.
### Deprecations and Warnings
#### ROCm Libraries Changes Deprecations and Deprecation Removal
- The hipFFT.h header is now provided only by the hipFFT package. Up to ROCm 5.0, users would get hipFFT.h in the rocFFT package too.
- The GlobalPairwiseAMG class is now entirely removed, users should use the PairwiseAMG class instead.
- The rocsparse_spmm signature in 5.0 was changed to match that of rocsparse_spmm_ex. In 5.0, rocsparse_spmm_ex is still present, but deprecated. Signature diff for rocsparse_spmm
rocsparse_spmm in 5.0
```h
rocsparse_status rocsparse_spmm(rocsparse_handle handle,
rocsparse_operation trans_A,
rocsparse_operation trans_B,
const void* alpha,
const rocsparse_spmat_descr mat_A,
const rocsparse_dnmat_descr mat_B,
const void* beta,
const rocsparse_dnmat_descr mat_C,
rocsparse_datatype compute_type,
rocsparse_spmm_alg alg,
rocsparse_spmm_stage stage,
size_t* buffer_size,
void* temp_buffer);
```
rocSPARSE_spmm in 4.0
```h
rocsparse_status rocsparse_spmm(rocsparse_handle handle,
rocsparse_operation trans_A,
rocsparse_operation trans_B,
const void* alpha,
const rocsparse_spmat_descr mat_A,
const rocsparse_dnmat_descr mat_B,
const void* beta,
const rocsparse_dnmat_descr mat_C,
rocsparse_datatype compute_type,
rocsparse_spmm_alg alg,
size_t* buffer_size,
void* temp_buffer);
```
#### HIP API Deprecations and Warnings
##### Warning - Arithmetic Operators of HIP Complex and Vector Types
In this release, arithmetic operators of HIP complex and vector types are deprecated.
- As alternatives to arithmetic operators of HIP complex types, users can use arithmetic operators of `std::complex` types.
- As alternatives to arithmetic operators of HIP vector types, users can use the operators of the native clang vector type associated with the data member of HIP vector types.
During the deprecation, two macros `_HIP_ENABLE_COMPLEX_OPERATORS` and `_HIP_ENABLE_VECTOR_OPERATORS` are provided to allow users to conditionally enable arithmetic operators of HIP complex or vector types.
Note, the two macros are mutually exclusive and, by default, set to Off.
The arithmetic operators of HIP complex and vector types will be removed in a future release.
Refer to the HIP API Guide for more information.
#### Warning - Compiler-Generated Code Object Version 4 Deprecation
Support for loading compiler-generated code object version 4 will be deprecated in a future release with no release announcement and replaced with code object 5 as the default version.
The current default is code object version 4.
#### Warning - MIOpenTensile Deprecation
MIOpenTensile will be deprecated in a future release.

View File

@@ -14,6 +14,13 @@ shutil.copy2('../RELEASE.md','./release.md')
# Keep capitalization due to similar linking on GitHub's markdown preview.
shutil.copy2('../CHANGELOG.md','./CHANGELOG.md')
# configurations for PDF output by Read the Docs
project = "ROCm Documentation"
author = "Advanced Micro Devices, Inc."
copyright = "Copyright (c) 2023 Advanced Micro Devices, Inc. All rights reserved."
version = "5.0.0"
release = "5.0.0"
setting_all_article_info = True
all_article_info_os = ["linux"]
all_article_info_author = ""
@@ -57,7 +64,7 @@ article_pages = [
external_toc_path = "./sphinx/_toc.yml"
docs_core = ROCmDocs("ROCm Documentation Home")
docs_core = ROCmDocs("ROCm 5.0.0 Documentation Home")
docs_core.setup()
external_projects_current_project = "rocm"

View File

@@ -18,8 +18,8 @@ following commands based on your distribution.
```shell
sudo apt update
wget https://repo.radeon.com/amdgpu-install/22.20.3/ubuntu/bionic/amdgpu-install_22.20.50203-1_all.deb
sudo apt install ./amdgpu-install_22.20.50203-1_all.deb
wget https://repo.radeon.com/amdgpu-install/21.50/ubuntu/bionic/amdgpu-install_21.50.50000-1_all.deb
sudo apt install ./amdgpu-install_21.50.50000-1_all.deb
```
:::
@@ -28,8 +28,8 @@ sudo apt install ./amdgpu-install_22.20.50203-1_all.deb
```shell
sudo apt update
wget https://repo.radeon.com/amdgpu-install/22.20.3/ubuntu/focal/amdgpu-install_22.20.50203-1_all.deb
sudo apt install ./amdgpu-install_22.20.50203-1_all.deb
wget https://repo.radeon.com/amdgpu-install/21.50/ubuntu/focal/amdgpu-install_21.50.50000-1_all.deb
sudo apt install ./amdgpu-install_21.50.50000-1_all.deb
```
:::
@@ -44,7 +44,16 @@ sudo apt install ./amdgpu-install_22.20.50203-1_all.deb
:sync: RHEL-7
```shell
sudo yum install https://repo.radeon.com/amdgpu-install/22.20.3/rhel/7.9/amdgpu-install-22.20.50203-1.el7.noarch.rpm
sudo yum install https://repo.radeon.com/amdgpu-install/21.50/rhel/7.9/amdgpu-install-21.50.50000-1.el7.noarch.rpm
```
:::
:::{tab-item} RHEL 8.4
:sync: RHEL-8.4
:sync: RHEL-8
```shell
sudo yum install https://repo.radeon.com/amdgpu-install/21.50/rhel/8.4/amdgpu-install-21.50.50000-1.el8.noarch.rpm
```
:::
@@ -53,16 +62,7 @@ sudo yum install https://repo.radeon.com/amdgpu-install/22.20.3/rhel/7.9/amdgpu-
:sync: RHEL-8
```shell
sudo yum install https://repo.radeon.com/amdgpu-install/22.20.3/rhel/8.5/amdgpu-install-22.20.50203-1.el8.noarch.rpm
```
:::
:::{tab-item} RHEL 8.6
:sync: RHEL-8.6
:sync: RHEL-8
```shell
sudo yum install https://repo.radeon.com/amdgpu-install/22.20.3/rhel/8.6/amdgpu-install-22.20.50203-1.el8.noarch.rpm
sudo yum install https://repo.radeon.com/amdgpu-install/21.50/rhel/8.5/amdgpu-install-21.50.50000-1.el8.noarch.rpm
```
:::
@@ -72,19 +72,11 @@ sudo yum install https://repo.radeon.com/amdgpu-install/22.20.3/rhel/8.6/amdgpu-
:sync: SLES15
::::{tab-set}
:::{tab-item} Service Pack 4
:sync: SLES15-SP4
```shell
sudo zypper --no-gpg-checks install https://repo.radeon.com/amdgpu-install/22.20.3/sle/15.4/amdgpu-install-22.20.50203-1.noarch.rpm
```
:::
:::{tab-item} Service Pack 3
:sync: SLES15-SP3
```shell
sudo zypper --no-gpg-checks install https://repo.radeon.com/amdgpu-install/22.20.3/sle/15.3/amdgpu-install-22.20.50203-1.noarch.rpm
sudo zypper --no-gpg-checks install https://repo.radeon.com/amdgpu-install/21.50/sle/15/amdgpu-install-21.50.50000-1.noarch.rpm
```
:::
@@ -163,9 +155,9 @@ the installer script will install packages in the single-version layout.
For the multi-version ROCm installation you must use the installer script from
the latest release of ROCm that you wish to install.
**Example:** If you want to install ROCm releases 5.1.3 and 5.2.3
**Example:** If you want to install ROCm releases 4.5.2 and 5.0.0
simultaneously, you are required to download the installer from the latest ROCm
release v5.2.3.
release v5.0.0.
### Add Required Repositories
@@ -184,10 +176,12 @@ Run the following commands based on your distribution to add the repositories:
:sync: ubuntu-18.04
```shell
for ver in 5.1.3 5.2.2; do
echo "deb [arch=amd64 signed-by=/etc/apt/trusted.gpg.d/rocm-keyring.gpg] https://repo.radeon.com/rocm/apt/$ver bionic main" | sudo tee /etc/apt/sources.list.d/rocm.list
for ver in 5.0; do
echo "deb [arch=amd64 signed-by=/etc/apt/trusted.gpg.d/rocm-keyring.gpg] https://repo.radeon.com/rocm/apt/$ver bionic main" \
| sudo tee /etc/apt/sources.list.d/rocm.list
done
echo -e 'Package: *\nPin: release o=repo.radeon.com\nPin-Priority: 600' | sudo tee /etc/apt/preferences.d/rocm-pin-600
echo -e 'Package: *\nPin: release o=repo.radeon.com\nPin-Priority: 600' \
| sudo tee /etc/apt/preferences.d/rocm-pin-600
sudo apt update
```
@@ -196,10 +190,12 @@ sudo apt update
:sync: ubuntu-20.04
```shell
for ver in 5.1.3 5.2.2; do
echo "deb [arch=amd64 signed-by=/etc/apt/trusted.gpg.d/rocm-keyring.gpg] https://repo.radeon.com/rocm/apt/$ver focal main" | sudo tee /etc/apt/sources.list.d/rocm.list
for ver in 5.0; do
echo "deb [arch=amd64 signed-by=/etc/apt/trusted.gpg.d/rocm-keyring.gpg] https://repo.radeon.com/rocm/apt/$ver focal main" \
| sudo tee /etc/apt/sources.list.d/rocm.list
done
echo -e 'Package: *\nPin: release o=repo.radeon.com\nPin-Priority: 600' | sudo tee /etc/apt/preferences.d/rocm-pin-600
echo -e 'Package: *\nPin: release o=repo.radeon.com\nPin-Priority: 600' \
| sudo tee /etc/apt/preferences.d/rocm-pin-600
sudo apt update
```
@@ -214,7 +210,7 @@ sudo apt update
:sync: RHEL-7
```shell
for ver in 5.1.3 5.2.2; do
for ver in 5.0; do
sudo tee --append /etc/yum.repos.d/rocm.repo <<EOF
[ROCm-$ver]
name=ROCm$ver
@@ -233,7 +229,7 @@ sudo yum clean all
:sync: RHEL-8
```shell
for ver in 5.1.3 5.2.2; do
for ver in 5.0; do
sudo tee --append /etc/yum.repos.d/rocm.repo <<EOF
[ROCm-$ver]
name=ROCm$ver
@@ -247,7 +243,6 @@ done
sudo yum clean all
```
:::
::::
:::::
:::::{tab-item} SUSE Linux Enterprise Server 15
@@ -258,27 +253,10 @@ sudo yum clean all
:sync: SLES15-SP3
```shell
for ver in 5.1.3 5.2.2; do
for ver in 5.0; do
sudo tee --append /etc/zypp/repos.d/rocm.repo <<EOF
name=rocm
baseurl=https://repo.radeon.com/rocm/$ver/sle/15.3/main/x86_64
enabled=1
gpgcheck=1
gpgkey=https://repo.radeon.com/rocm/rocm.gpg.key
EOF
done
sudo zypper ref
```
:::
:::{tab-item} Service Pack 4
:sync: SLES15-SP4
```shell
for ver in 5.1.3 5.2.2; do
sudo tee --append /etc/zypp/repos.d/rocm.repo <<EOF
name=rocm
baseurl=https://repo.radeon.com/rocm/$ver/sle/15.4/main/x86_64
baseurl=https://repo.radeon.com/rocm/$ver/sle/15/main/x86_64
enabled=1
gpgcheck=1
gpgkey=https://repo.radeon.com/rocm/rocm.gpg.key
@@ -304,12 +282,12 @@ sudo amdgpu-install --usecase=rocm --rocmrelease=<release-number-3>
```
Following are examples of ROCm multi-version installation. The kernel-mode
driver, associated with the ROCm release v5.3, will be installed as its latest
driver, associated with the ROCm release v5.0, will be installed as its latest
release in the list.
```none
sudo amdgpu-install --usecase=rocm --rocmrelease=5.1.3
sudo amdgpu-install --usecase=rocm --rocmrelease=5.2.3
sudo amdgpu-install --usecase=rocm --rocmrelease=4.5.2
sudo amdgpu-install --usecase=rocm --rocmrelease=5.0.0
```
## Additional options

View File

@@ -48,23 +48,23 @@ section.
To add the AMDGPU repository, follow these steps:
::::{tab-set}
:::{tab-item} Ubuntu 20.04
:sync: ubuntu-20.04
:::{tab-item} Ubuntu 18.04
:sync: ubuntu-18.04
```shell
# amdgpu repository for focal
echo 'deb [arch=amd64 signed-by=/etc/apt/keyrings/rocm.gpg] https://repo.radeon.com/amdgpu/22.20.3/ubuntu focal main' \
# amdgpu repository for bionic
echo 'deb [arch=amd64 signed-by=/etc/apt/keyrings/rocm.gpg] https://repo.radeon.com/amdgpu/21.50/ubuntu bionic main' \
| sudo tee /etc/apt/sources.list.d/amdgpu.list
sudo apt update
```
:::
:::{tab-item} Ubuntu 22.04
:sync: ubuntu-22.04
:::{tab-item} Ubuntu 20.04
:sync: ubuntu-20.04
```shell
# amdgpu repository for jammy
echo 'deb [arch=amd64 signed-by=/etc/apt/keyrings/rocm.gpg] https://repo.radeon.com/amdgpu/22.20.3/ubuntu jammy main' \
# amdgpu repository for focal
echo 'deb [arch=amd64 signed-by=/etc/apt/keyrings/rocm.gpg] https://repo.radeon.com/amdgpu/21.50/ubuntu focal main' \
| sudo tee /etc/apt/sources.list.d/amdgpu.list
sudo apt update
```
@@ -91,7 +91,7 @@ To add the ROCm repository, use the following steps:
```shell
# ROCm repositories for bionic
for ver in 5.1.3 5.2.3; do
for ver in 5.0; do
echo "deb [arch=amd64 signed-by=/etc/apt/keyrings/rocm.gpg] https://repo.radeon.com/rocm/apt/$ver bionic main" \
| sudo tee --append /etc/apt/sources.list.d/rocm.list
done
@@ -106,7 +106,7 @@ sudo apt update
```shell
# ROCm repositories for focal
for ver in 5.1.3 5.2.3; do
for ver in 5.0; do
echo "deb [arch=amd64 signed-by=/etc/apt/keyrings/rocm.gpg] https://repo.radeon.com/rocm/apt/$ver focal main" \
| sudo tee --append /etc/apt/sources.list.d/rocm.list
done
@@ -136,7 +136,7 @@ For a comprehensive list of meta-packages, refer to
- Sample Multi-version installation
```shell
sudo apt install rocm-hip-sdk5.2.3 rocm-hip-sdk5.1.3
sudo apt install rocm-hip-sdk5.0.0
```
:::::
@@ -160,7 +160,26 @@ section.
sudo tee /etc/yum.repos.d/amdgpu.repo <<EOF
[amdgpu]
name=amdgpu
baseurl=https://repo.radeon.com/amdgpu/22.20.3/rhel/7.9/main/x86_64/
baseurl=https://repo.radeon.com/amdgpu/21.50/rhel/7.9/main/x86_64/
enabled=1
priority=50
gpgcheck=1
gpgkey=https://repo.radeon.com/rocm/rocm.gpg.key
EOF
sudo yum clean all
```
:::
:::{tab-item} RHEL 8.4
:sync: RHEL-8.4
:sync: RHEL-8
```shell
sudo tee /etc/yum.repos.d/amdgpu.repo <<EOF
[amdgpu]
name=amdgpu
baseurl=https://repo.radeon.com/amdgpu/21.50/rhel/8.4/main/x86_64/
enabled=1
priority=50
gpgcheck=1
@@ -179,7 +198,7 @@ sudo yum clean all
sudo tee /etc/yum.repos.d/amdgpu.repo <<EOF
[amdgpu]
name=amdgpu
baseurl=https://repo.radeon.com/amdgpu/22.20.3/rhel/8.5/main/x86_64/
baseurl=https://repo.radeon.com/amdgpu/21.50/rhel/8.5/main/x86_64/
enabled=1
priority=50
gpgcheck=1
@@ -189,26 +208,6 @@ sudo yum clean all
```
:::
:::{tab-item} RHEL 8.6
:sync: RHEL-8.6
:sync: RHEL-8
```shell
sudo tee /etc/yum.repos.d/amdgpu.repo <<EOF
[amdgpu]
name=amdgpu
baseurl=https://repo.radeon.com/amdgpu/22.20.3/rhel/8.6/main/x86_64/
enabled=1
priority=50
gpgcheck=1
gpgkey=https://repo.radeon.com/rocm/rocm.gpg.key
EOF
sudo yum clean all
```
:::
::::
Install the kernel mode driver and reboot the system using the following
@@ -229,7 +228,7 @@ To add the ROCm repository, use the following steps, based on your distribution:
:sync: RHEL-7
```shell
for ver in 5.2.1 5.2.3; do
for ver in 5.0; do
sudo tee --append /etc/yum.repos.d/rocm.repo <<EOF
[ROCm-$ver]
name=ROCm$ver
@@ -248,7 +247,7 @@ sudo yum clean all
:sync: RHEL-8
```shell
for ver in 5.1.3 5.2.3; do
for ver in 5.0; do
sudo tee --append /etc/yum.repos.d/rocm.repo <<EOF
[ROCm-$ver]
name=ROCm$ver
@@ -283,7 +282,7 @@ For a comprehensive list of meta-packages, refer to
- Sample Multi-version installation
```shell
sudo yum install rocm-hip-sdk5.2.3 rocm-hip-sdk5.1.3
sudo yum install rocm-hip-sdk5.0.0
```
:::::
@@ -306,23 +305,7 @@ section.
sudo tee /etc/zypp/repos.d/amdgpu.repo <<EOF
[amdgpu]
name=amdgpu
baseurl=https://repo.radeon.com/amdgpu/22.20.3/sle/15.3/main/x86_64
enabled=1
gpgcheck=1
gpgkey=https://repo.radeon.com/rocm/rocm.gpg.key
EOF
sudo zypper ref
```
:::
:::{tab-item} Service Pack 4
:sync: SLES15-SP4
```shell
sudo tee /etc/zypp/repos.d/amdgpu.repo <<EOF
[amdgpu]
name=amdgpu
baseurl=https://repo.radeon.com/amdgpu/22.20.3/sle/15.4/main/x86_64
baseurl=https://repo.radeon.com/amdgpu/21.50/sle/15.3/main/x86_64
enabled=1
gpgcheck=1
gpgkey=https://repo.radeon.com/rocm/rocm.gpg.key
@@ -347,7 +330,7 @@ sudo reboot
To add the ROCm repository, use the following steps:
```shell
for ver in 5.1.3 5.2.3; do
for ver in 5.0; do
sudo tee --append /etc/zypp/repos.d/rocm.repo <<EOF
[ROCm-$ver]
name=ROCm$ver
@@ -379,7 +362,7 @@ For a comprehensive list of meta-packages, refer to
- Sample Multi-version installation
```shell
sudo zypper --gpg-auto-import-keys install rocm-hip-sdk5.2.3 rocm-hip-sdk5.1.3
sudo zypper --gpg-auto-import-keys install rocm-hip-sdk5.0.0
```
:::::
@@ -416,7 +399,7 @@ but are generally useful. Verification of the install is advised.
2. Add binary paths to the `PATH` environment variable.
```shell
export PATH=$PATH:/opt/rocm-5.2.3/bin:/opt/rocm-5.2.3/opencl/bin
export PATH=$PATH:/opt/rocm-5.0.0/bin:/opt/rocm-5.0.0/opencl/bin
```
```{attention}

View File

@@ -26,7 +26,7 @@ repository to the new release.
```shell
# amdgpu repository for bionic
echo 'deb [arch=amd64 signed-by=/etc/apt/keyrings/rocm.gpg] https://repo.radeon.com/amdgpu/22.20.3/ubuntu bionic main' \
echo 'deb [arch=amd64 signed-by=/etc/apt/keyrings/rocm.gpg] https://repo.radeon.com/amdgpu/21.50/ubuntu bionic main' \
| sudo tee /etc/apt/sources.list.d/amdgpu.list
sudo apt update
```
@@ -37,12 +37,11 @@ sudo apt update
```shell
# amdgpu repository for focal
echo 'deb [arch=amd64 signed-by=/etc/apt/keyrings/rocm.gpg] https://repo.radeon.com/amdgpu/22.20.3/ubuntu focal main' \
echo 'deb [arch=amd64 signed-by=/etc/apt/keyrings/rocm.gpg] https://repo.radeon.com/amdgpu/21.50/ubuntu focal main' \
| sudo tee /etc/apt/sources.list.d/amdgpu.list
sudo apt update
```
:::
::::
:::::
:::::{tab-item} Red Hat Enterprise Linux
@@ -57,7 +56,25 @@ sudo apt update
sudo tee /etc/yum.repos.d/amdgpu.repo <<EOF
[amdgpu]
name=amdgpu
baseurl=https://repo.radeon.com/amdgpu/22.20.3/rhel/7.9/main/x86_64/
baseurl=https://repo.radeon.com/amdgpu/21.50/rhel/7.9/main/x86_64/
enabled=1
priority=50
gpgcheck=1
gpgkey=https://repo.radeon.com/rocm/rocm.gpg.key
EOF
sudo yum clean all
```
:::
:::{tab-item} RHEL 8.4
:sync: RHEL-8.4
:sync: RHEL-8
```shell
sudo tee /etc/yum.repos.d/amdgpu.repo <<EOF
[amdgpu]
name=amdgpu
baseurl=https://repo.radeon.com/amdgpu/21.50/rhel/8.4/main/x86_64/
enabled=1
priority=50
gpgcheck=1
@@ -75,25 +92,7 @@ sudo yum clean all
sudo tee /etc/yum.repos.d/amdgpu.repo <<EOF
[amdgpu]
name=amdgpu
baseurl=https://repo.radeon.com/amdgpu/22.20.3/rhel/8.5/main/x86_64/
enabled=1
priority=50
gpgcheck=1
gpgkey=https://repo.radeon.com/rocm/rocm.gpg.key
EOF
sudo yum clean all
```
:::
:::{tab-item} RHEL 8.6
:sync: RHEL-8.6
:sync: RHEL-8
```shell
sudo tee /etc/yum.repos.d/amdgpu.repo <<EOF
[amdgpu]
name=amdgpu
baseurl=https://repo.radeon.com/amdgpu/22.20.3/rhel/8.6/main/x86_64/
baseurl=https://repo.radeon.com/amdgpu/21.50/rhel/8.5/main/x86_64/
enabled=1
priority=50
gpgcheck=1
@@ -104,6 +103,7 @@ sudo yum clean all
:::
::::
:::::
:::::{tab-item} SUSE Linux Enterprise Server 15
:sync: SLES15
@@ -116,23 +116,7 @@ sudo yum clean all
sudo tee /etc/zypp/repos.d/amdgpu.repo <<EOF
[amdgpu]
name=amdgpu
baseurl=https://repo.radeon.com/amdgpu/22.20.3/sle/15.3/main/x86_64
enabled=1
gpgcheck=1
gpgkey=https://repo.radeon.com/rocm/rocm.gpg.key
EOF
sudo zypper ref
```
:::
:::{tab-item} Service Pack 4
:sync: SLES15-SP4
```shell
sudo tee /etc/zypp/repos.d/amdgpu.repo <<EOF
[amdgpu]
name=amdgpu
baseurl=https://repo.radeon.com/amdgpu/22.20.3/sle/15.4/main/x86_64
baseurl=https://repo.radeon.com/amdgpu/21.50/sle/15.3/main/x86_64
enabled=1
gpgcheck=1
gpgkey=https://repo.radeon.com/rocm/rocm.gpg.key
@@ -195,7 +179,7 @@ repository to the new release.
:sync: ubuntu-18.04
```shell
echo "deb [arch=amd64 signed-by=/etc/apt/keyrings/rocm.gpg] https://repo.radeon.com/rocm/apt/5.2.3 bionic main" \
echo "deb [arch=amd64 signed-by=/etc/apt/keyrings/rocm.gpg] https://repo.radeon.com/rocm/apt/5.0 bionic main" \
| sudo tee /etc/apt/sources.list.d/rocm.list
echo -e 'Package: *\nPin: release o=repo.radeon.com\nPin-Priority: 600' \
| sudo tee /etc/apt/preferences.d/rocm-pin-600
@@ -207,7 +191,7 @@ sudo apt update
:sync: ubuntu-20.04
```shell
echo "deb [arch=amd64 signed-by=/etc/apt/keyrings/rocm.gpg] https://repo.radeon.com/rocm/apt/5.2.3 focal main" \
echo "deb [arch=amd64 signed-by=/etc/apt/keyrings/rocm.gpg] https://repo.radeon.com/rocm/apt/5.0 focal main" \
| sudo tee /etc/apt/sources.list.d/rocm.list
echo -e 'Package: *\nPin: release o=repo.radeon.com\nPin-Priority: 600' \
| sudo tee /etc/apt/preferences.d/rocm-pin-600
@@ -226,9 +210,9 @@ sudo apt update
```shell
sudo tee /etc/yum.repos.d/rocm.repo <<EOF
[ROCm-5.2.3]
name=ROCm5.2.3
baseurl=https://repo.radeon.com/rocm/yum/5.2.3/main
[ROCm-5.0]
name=ROCm5.0
baseurl=https://repo.radeon.com/rocm/yum/5.0/main
enabled=1
priority=50
gpgcheck=1
@@ -243,9 +227,9 @@ sudo yum clean all
```shell
sudo tee /etc/yum.repos.d/rocm.repo <<EOF
[ROCm-5.2.3]
name=ROCm5.2.3
baseurl=https://repo.radeon.com/rocm/rhel8/5.2.3/main
[ROCm-5.0.1]
name=ROCm5.0.1
baseurl=https://repo.radeon.com/rocm/rhel8/5.0/main
enabled=1
priority=50
gpgcheck=1
@@ -262,10 +246,10 @@ sudo yum clean all
```shell
sudo tee /etc/zypp/repos.d/rocm.repo <<EOF
[ROCm-5.2.3]
name=ROCm5.2.3
[ROCm-5.0]
name=ROCm5.0
name=rocm
baseurl=https://repo.radeon.com/rocm/zyp/5.2.3/main
baseurl=https://repo.radeon.com/rocm/zyp/5.0/main
enabled=1
gpgcheck=1
gpgkey=https://repo.radeon.com/rocm/rocm.gpg.key

View File

@@ -18,7 +18,6 @@ Detailed walkthroughs of specific use-cases driven by frameworks using ROCm
acceleration.
- [Implementing Inception V3 on ROCm with PyTorch](machine_learning/pytorch_inception.md)
- [Optimizing Inference with MIGraphX](machine_learning/migraphx_optimization.md)
:::

View File

@@ -10,11 +10,4 @@ A collection of detailed and guided examples for working with Inception V3 with
:::
:::{grid-item-card} Optimizing Inference with MIGraphX
:link: migraphx_optimization
:link-type: doc
Walkthroughs of optimizing inference using MIGraphX.
:::
:::::

View File

@@ -83,10 +83,6 @@ TensorFlow, \[Online image\]. [https://www.tensorflow.org/extras/tensorflow_bran
MAGMA, \[Online image\]. [https://bitbucket.org/icl/magma/src/master/docs/](https://bitbucket.org/icl/magma/src/master/docs/)
Advanced Micro Devices, Inc., \[Online\]. Available: [https://rocmsoftwareplatform.github.io/AMDMIGraphX/doc/html/](https://rocmsoftwareplatform.github.io/AMDMIGraphX/doc/html/)
Advanced Micro Devices, Inc., \[Online\]. Available: [https://github.com/ROCmSoftwarePlatform/AMDMIGraphX/wiki](https://github.com/ROCmSoftwarePlatform/AMDMIGraphX/wiki)
Docker, \[Online\]. [https://docs.docker.com/get-started/overview/](https://docs.docker.com/get-started/overview/)
Torchvision, \[Online\]. Available [https://pytorch.org/vision/master/index.html?highlight=torchvision#module-torchvision](https://pytorch.org/vision/master/index.html?highlight=torchvision#module-torchvision)

View File

@@ -425,10 +425,6 @@ TensorFlow, \[Online image\]. [https://www.tensorflow.org/extras/tensorflow_bran
MAGMA, \[Online image\]. [https://bitbucket.org/icl/magma/src/master/docs/](https://bitbucket.org/icl/magma/src/master/docs/)
Advanced Micro Devices, Inc., \[Online\]. Available: [https://rocmsoftwareplatform.github.io/AMDMIGraphX/doc/html/](https://rocmsoftwareplatform.github.io/AMDMIGraphX/doc/html/)
Advanced Micro Devices, Inc., \[Online\]. Available: [https://github.com/ROCmSoftwarePlatform/AMDMIGraphX/wiki](https://github.com/ROCmSoftwarePlatform/AMDMIGraphX/wiki)
Docker, \[Online\]. [https://docs.docker.com/get-started/overview/](https://docs.docker.com/get-started/overview/)
Torchvision, \[Online\]. Available [https://pytorch.org/vision/master/index.html?highlight=torchvision#module-torchvision](https://pytorch.org/vision/master/index.html?highlight=torchvision#module-torchvision)

View File

@@ -197,10 +197,6 @@ TensorFlow, \[Online image\]. [https://www.tensorflow.org/extras/tensorflow_bran
MAGMA, \[Online image\]. [https://bitbucket.org/icl/magma/src/master/docs/](https://bitbucket.org/icl/magma/src/master/docs/)
Advanced Micro Devices, Inc., \[Online\]. Available: [https://rocmsoftwareplatform.github.io/AMDMIGraphX/doc/html/](https://rocmsoftwareplatform.github.io/AMDMIGraphX/doc/html/)
Advanced Micro Devices, Inc., \[Online\]. Available: [https://github.com/ROCmSoftwarePlatform/AMDMIGraphX/wiki](https://github.com/ROCmSoftwarePlatform/AMDMIGraphX/wiki)
Docker, \[Online\]. [https://docs.docker.com/get-started/overview/](https://docs.docker.com/get-started/overview/)
Torchvision, \[Online\]. Available [https://pytorch.org/vision/master/index.html?highlight=torchvision#module-torchvision](https://pytorch.org/vision/master/index.html?highlight=torchvision#module-torchvision)

View File

@@ -93,7 +93,6 @@ agile, flexible, rapid and secure manner. [more...](rocm)
- [Examples](https://github.com/amd/rocm-examples)
- [ML, DL, and AI](examples/machine_learning/all)
- [](examples/machine_learning/pytorch_inception)
- [](examples/machine_learning/migraphx_optimization)
:::
::::

View File

@@ -10,17 +10,10 @@ AMD's library for high performance machine learning primitives.
:::
:::{grid-item-card} {doc}`Composable Kernel <composable-kernel:index>`
:::{grid-item-card} {doc}`Composable Kernel <composable_kernel:index>`
Composable Kernel: Performance Portable Programming Model for Machine Learning Tensor Operators
- {doc}`Documentation <composable-kernel:index>`
:::
:::{grid-item-card} {doc}`MIGraphX <migraphx:index>`
AMD MIGraphX is AMD's graph inference engine that accelerates machine learning model inference.
- {doc}`Documentation <migraphx:index>`
- {doc}`Documentation <composable_kernel:index>`
:::

View File

@@ -42,8 +42,7 @@ Inter and intra-node communication is supported by the following projects:
Libraries related to AI.
- {doc}`MIOpen <miopen:index>`
- {doc}`Composable Kernel <composable-kernel:index>`
- {doc}`MIGraphX <migraphx:index>`
- {doc}`Composable Kernel <composable_kernel:index>`
:::
@@ -80,7 +79,7 @@ Computer vision related projects.
:::{grid-item-card} [Validation Tools](validation_tools)
- {doc}`ROCm Validation Suite <rocm-validation-suite:index>`
- {doc}`ROCm Validation Suite <rocmvalidationsuite:index>`
- {doc}`TransferBench <transferbench:index>`
:::

View File

@@ -3,10 +3,10 @@
:::::{grid} 1 1 2 2
:gutter: 1
:::{grid-item-card} {doc}`RVS <rocm-validation-suite:index>`
:::{grid-item-card} {doc}`RVS <rocmvalidationsuite:index>`
The ROCm Validation Suite is a system administrators and cluster manager's tool for detecting and troubleshooting common problems affecting AMD GPU(s) running in a high-performance computing environment, enabled using the ROCm software stack on a compatible platform.
- {doc}`Documentation <rocm-validation-suite:index>`
- {doc}`Documentation <rocmvalidationsuite:index>`
:::

View File

@@ -8,13 +8,12 @@ AMD ROCm™ Platform supports the following Linux distributions.
| Distribution |Processor Architectures| Validated Kernel |
|--------------------|-----------------------|--------------------|
| CentOS 8.3 | x86-64 | 4.18 |
| CentOS 7.9 | x86-64 | 3.10 |
| RHEL 8.6 to 8.5 | x86-64 | 4.18 |
| RHEL 8.5, 8.4 | x86-64 | 4.18 |
| RHEL 7.9 | x86-64 | 3.10 |
| SLES 15 SP4 | x86-64 | 5.14.21 |
| SLES 15 SP3 | x86-64 | 5.3.18 |
| Ubuntu 20.04.4 LTS | x86-64 | 5.13 |
| Ubuntu 20.04.3 LTS | x86-64 | 5.11 |
| Ubuntu 20.04.3 LTS | x86-64 | 5.8 |
| Ubuntu 18.04.5 LTS | x86-64 | 5.4.0 |
## Virtualization Support

View File

@@ -58,7 +58,6 @@ The table is ordered to follow ROCm's manifest file.
| [rocPRIM](https://github.com/ROCmSoftwarePlatform/rocPRIM/) | [MIT](https://github.com/ROCmSoftwarePlatform/rocPRIM/blob/develop/LICENSE.txt) |
| [rocWMMA](https://github.com/ROCmSoftwarePlatform/rocWMMA/) | [MIT](https://github.com/ROCmSoftwarePlatform/rocWMMA/blob/develop/LICENSE.md) |
| [hipfort](https://github.com/ROCmSoftwarePlatform/hipfort/) | [MIT](https://github.com/ROCmSoftwarePlatform/hipfort/blob/master/LICENSE) |
| [AMDMIGraphX](https://github.com/ROCmSoftwarePlatform/AMDMIGraphX/) | [MIT](https://github.com/ROCmSoftwarePlatform/AMDMIGraphX/blob/develop/LICENSE) |
| [ROCmValidationSuite](https://github.com/ROCm-Developer-Tools/ROCmValidationSuite/) | [MIT](https://github.com/ROCm-Developer-Tools/ROCmValidationSuite/blob/master/LICENSE) |
| [aomp](https://github.com/ROCm-Developer-Tools/aomp/) | [Apache 2.0](https://github.com/ROCm-Developer-Tools/aomp/blob/aomp-dev/LICENSE) |
| [aomp-extras](https://github.com/ROCm-Developer-Tools/aomp-extras/) | [MIT](https://github.com/ROCm-Developer-Tools/aomp-extras/blob/aomp-dev/LICENSE) |
@@ -121,4 +120,4 @@ following location: `/opt/rocm/share/doc/<component-name>/`
For example, you can fetch the licensing information of the `_amd_comgr_`
component (Code Object Manager) from the `amd_comgr` folder. A file named
`LICENSE.txt` contains the license details at:
`/opt/rocm-5.2.3/share/doc/amd_comgr/LICENSE.txt`
`/opt/rocm-5.0.0/share/doc/amd_comgr/LICENSE.txt`

View File

@@ -146,9 +146,7 @@ subtrees:
- title: MIOpen - Machine Intelligence
url: ${project:miopen}
- title: Composable Kernel
url: ${project:composable-kernel}
- title: MIGraphX - Graph Optimization
url: ${project:migraphx}
url: ${project:composable_kernel}
- file: reference/computer_vision
subtrees:
- entries:
@@ -171,7 +169,7 @@ subtrees:
title: Validation Tools
subtrees:
- entries:
- url: ${project:rocm-validation-suite}
- url: ${project:rocmvalidationsuite}
title: RVS
- url: ${project:transferbench}
title: TransferBench
@@ -223,7 +221,6 @@ subtrees:
subtrees:
- entries:
- file: examples/machine_learning/pytorch_inception
- file: examples/machine_learning/migraphx_optimization
- caption: About
entries:

View File

@@ -1 +1,2 @@
rocm-docs-core==0.16.0
rocm-docs-core==1.8.0
sphinx-reredirects

View File

@@ -1,114 +1,106 @@
#
# This file is autogenerated by pip-compile with Python 3.11
# This file is autogenerated by pip-compile with Python 3.10
# by the following command:
#
# pip-compile docs/sphinx/requirements.in
# pip-compile requirements.in
#
accessible-pygments==0.0.3
accessible-pygments==0.0.5
# via pydata-sphinx-theme
alabaster==0.7.13
alabaster==1.0.0
# via sphinx
babel==2.11.0
babel==2.16.0
# via
# pydata-sphinx-theme
# sphinx
beautifulsoup4==4.11.2
beautifulsoup4==4.12.3
# via pydata-sphinx-theme
breathe==4.34.0
breathe==4.35.0
# via rocm-docs-core
certifi==2022.12.7
certifi==2024.8.30
# via requests
cffi==1.15.1
cffi==1.17.1
# via
# cryptography
# pynacl
charset-normalizer==2.1.1
charset-normalizer==3.3.2
# via requests
click==8.1.3
click==8.1.7
# via sphinx-external-toc
colorama==0.4.6
# via
# click
# sphinx
cryptography==40.0.2
cryptography==43.0.1
# via pyjwt
deprecated==1.2.13
deprecated==1.2.14
# via pygithub
docutils==0.19
docutils==0.21.2
# via
# breathe
# myst-parser
# pydata-sphinx-theme
# sphinx
fastjsonschema==2.16.3
fastjsonschema==2.20.0
# via rocm-docs-core
gitdb==4.0.10
gitdb==4.0.11
# via gitpython
gitpython==3.1.30
gitpython==3.1.43
# via rocm-docs-core
idna==3.4
idna==3.10
# via requests
imagesize==1.4.1
# via sphinx
jinja2==3.1.2
jinja2==3.1.4
# via
# myst-parser
# sphinx
linkify-it-py==1.0.3
# via myst-parser
markdown-it-py==2.2.0
markdown-it-py==3.0.0
# via
# mdit-py-plugins
# myst-parser
markupsafe==2.1.2
markupsafe==2.1.5
# via jinja2
mdit-py-plugins==0.3.4
mdit-py-plugins==0.4.2
# via myst-parser
mdurl==0.1.2
# via markdown-it-py
myst-parser[linkify]==1.0.0
myst-parser==4.0.0
# via rocm-docs-core
packaging==23.0
packaging==24.1
# via
# pydata-sphinx-theme
# sphinx
pycparser==2.21
pycparser==2.22
# via cffi
pydata-sphinx-theme==0.13.3
pydata-sphinx-theme==0.15.4
# via
# rocm-docs-core
# sphinx-book-theme
pygithub==1.58.1
pygithub==2.4.0
# via rocm-docs-core
pygments==2.14.0
pygments==2.18.0
# via
# accessible-pygments
# pydata-sphinx-theme
# sphinx
pyjwt[crypto]==2.6.0
pyjwt[crypto]==2.9.0
# via pygithub
pynacl==1.5.0
# via pygithub
pytz==2022.7.1
# via babel
pyyaml==6.0
pyyaml==6.0.2
# via
# myst-parser
# rocm-docs-core
# sphinx-external-toc
requests==2.28.1
requests==2.32.3
# via
# pygithub
# sphinx
rocm-docs-core==0.16.0
# via -r docs/sphinx/requirements.in
smmap==5.0.0
rocm-docs-core==1.8.0
# via -r requirements.in
smmap==5.0.1
# via gitdb
snowballstemmer==2.2.0
# via sphinx
soupsieve==2.4
soupsieve==2.6
# via beautifulsoup4
sphinx==5.3.0
sphinx==8.0.2
# via
# breathe
# myst-parser
@@ -119,33 +111,40 @@ sphinx==5.3.0
# sphinx-design
# sphinx-external-toc
# sphinx-notfound-page
sphinx-book-theme==1.0.1
# sphinx-reredirects
sphinx-book-theme==1.1.3
# via rocm-docs-core
sphinx-copybutton==0.5.1
sphinx-copybutton==0.5.2
# via rocm-docs-core
sphinx-design==0.4.1
sphinx-design==0.6.1
# via rocm-docs-core
sphinx-external-toc==0.3.1
sphinx-external-toc==1.0.1
# via rocm-docs-core
sphinx-notfound-page==0.8.3
sphinx-notfound-page==1.0.4
# via rocm-docs-core
sphinxcontrib-applehelp==1.0.4
sphinx-reredirects==0.1.5
# via -r requirements.in
sphinxcontrib-applehelp==2.0.0
# via sphinx
sphinxcontrib-devhelp==1.0.2
sphinxcontrib-devhelp==2.0.0
# via sphinx
sphinxcontrib-htmlhelp==2.0.1
sphinxcontrib-htmlhelp==2.1.0
# via sphinx
sphinxcontrib-jsmath==1.0.1
# via sphinx
sphinxcontrib-qthelp==1.0.3
sphinxcontrib-qthelp==2.0.0
# via sphinx
sphinxcontrib-serializinghtml==1.1.5
sphinxcontrib-serializinghtml==2.0.0
# via sphinx
typing-extensions==4.5.0
# via pydata-sphinx-theme
uc-micro-py==1.0.1
# via linkify-it-py
urllib3==1.26.13
# via requests
wrapt==1.14.1
tomli==2.0.1
# via sphinx
typing-extensions==4.12.2
# via
# pydata-sphinx-theme
# pygithub
urllib3==2.2.3
# via
# pygithub
# requests
wrapt==1.16.0
# via deprecated

View File

@@ -224,10 +224,6 @@ ROCm CMake Packages
+-----------+----------+--------------------------------------------------------+
| MIOpen | miopen | ``MIOpen`` |
+-----------+----------+--------------------------------------------------------+
| MIGraphX | migraphx | ``migraphx::migraphx``, ``migraphx::migraphx_c``, |
| | | ``migraphx::migraphx_cpu``, ``migraphx::migraphx_gpu``,|
| | | ``migraphx::migraphx_onnx``, ``migraphx::migraphx_tf`` |
+-----------+----------+--------------------------------------------------------+
Using CMake Presets
===================