Mirror of https://github.com/ROCm/ROCm.git (synced 2026-01-08 22:28:06 -05:00)
Changelog updated for HIP (#613)
28
CHANGELOG.md
@@ -69,32 +69,30 @@ See the full [AMD SMI changelog](https://github.com/ROCm/amdsmi/blob/release/roc
 
 * New HIP APIs
 - `hipModuleGetFunctionCount` returns the number of functions within a module
-- `hipMemsetD2D8` used for setting 2D memory range with specified 8-bit values
-- `hipMemsetD2D8Async` used for setting 2D memory range with specified 8-bit values asynchronously
-- `hipMemsetD2D16` used for setting 2D memory range with specified 16-bit values
-- `hipMemsetD2D16Async` used for setting 2D memory range with specified 16-bit values asynchronously
-- `hipMemsetD2D32` used for setting 2D memory range with specified 32-bit values
-- `hipMemsetD2D32Async` used for setting 2D memory range with specified 32-bit values asynchronously
+- `hipMemsetD2D8` sets 2D memory range with specified 8-bit values
+- `hipMemsetD2D8Async` asynchronously sets 2D memory range with specified 8-bit values
+- `hipMemsetD2D16` sets 2D memory range with specified 16-bit values
+- `hipMemsetD2D16Async` asynchronously sets 2D memory range with specified 16-bit values
+- `hipMemsetD2D32` sets 2D memory range with specified 32-bit values
+- `hipMemsetD2D32Async` asynchronously sets 2D memory range with specified 32-bit values
 - `hipStreamSetAttribute` sets attributes such as synchronization policy for a given stream
 - `hipStreamGetAttribute` returns attributes such as priority for a given stream
 - `hipModuleLoadFatBinary` loads fatbin binary to a module
-- `hipMemcpyBatchAsync` performs a batch of 1D or 2D memory copied asynchronously
-- `hipMemcpy3DBatchAsync` performs a batch of 3D memory copied asynchronously
+- `hipMemcpyBatchAsync` asynchronously performs a batch copy of 1D or 2D memory
+- `hipMemcpy3DBatchAsync` asynchronously performs a batch copy of 3D memory
 - `hipMemcpy3DPeer` copies memory between devices
-- `hipMemcpy3DPeerAsync`copies memory between devices asynchronously
-- `hipMemsetD2D32Async` used for setting 2D memory range with specified 32-bit values
-asynchronously
+- `hipMemcpy3DPeerAsync` asynchronously copies memory between devices
+- `hipMemsetD2D32Async` asynchronously sets 2D memory range with specified 32-bit values
 - `hipMemPrefetchAsync_v2` prefetches memory to the specified location
-- `hipMemAdvise_v2` advise about the usage of a given memory range
+- `hipMemAdvise_v2` advises about the usage of a given memory range
 - `hipGetDriverEntryPoint ` gets function pointer of a HIP API.
 - `hipSetValidDevices` sets a default list of devices that can be used by HIP
-- `hipStreamGetId` queries the ID of a stream
-* Support for the flag `hipMemLocationTypeHost`, enables handling virtual memory management in host memory location, in addition to device memory.
+- `hipStreamGetId` queries the id of a stream
 * Support for nested tile partitioning within cooperative groups, matching CUDA functionality.
 
 #### Optimized
 
-* Improved hip module loading latency.
+* Improved HIP module loading latency.
 * Optimized kernel metadata retrieval during module post load.
 * Optimized doorbell ring in HIP runtime for the following performance improvements:
 - Makes efficient packet batching for HIP graph launch
32
RELEASE.md
@@ -201,8 +201,8 @@ ROCm 7.1.0 improves the compatibility between the HIP runtime and NVIDIA CUDA.
 * Stream Management: `hipStreamSetAttribute`, `hipStreamGetAttribute`, and `hipStreamGetId`
 * Device Management: `hipSetValidDevices`
 * Driver Entry Point Access: `hipGetDriverEntryPoint`
-* New HIP flag `hipMemLocationTypeHost` enables handling virtual memory management in host memory location, in addition to device memory.
-* HIP runtime now supports nested tile partitioning within cooperative groups, matching CUDA functionality.
+* HIP runtime now supports nested tile partitioning within cooperative groups, matching CUDA functionality.
+* Improved HIP module loading latency.
 
 For detailed enhancements and updates refer to the [HIP Changelog](#hip-7-1-0).
 
@@ -758,32 +758,30 @@ See the full [AMD SMI changelog](https://github.com/ROCm/amdsmi/blob/release/roc
 
 * New HIP APIs
 - `hipModuleGetFunctionCount` returns the number of functions within a module
-- `hipMemsetD2D8` used for setting 2D memory range with specified 8-bit values
-- `hipMemsetD2D8Async` used for setting 2D memory range with specified 8-bit values asynchronously
-- `hipMemsetD2D16` used for setting 2D memory range with specified 16-bit values
-- `hipMemsetD2D16Async` used for setting 2D memory range with specified 16-bit values asynchronously
-- `hipMemsetD2D32` used for setting 2D memory range with specified 32-bit values
-- `hipMemsetD2D32Async` used for setting 2D memory range with specified 32-bit values asynchronously
+- `hipMemsetD2D8` sets 2D memory range with specified 8-bit values
+- `hipMemsetD2D8Async` asynchronously sets 2D memory range with specified 8-bit values
+- `hipMemsetD2D16` sets 2D memory range with specified 16-bit values
+- `hipMemsetD2D16Async` asynchronously sets 2D memory range with specified 16-bit values
+- `hipMemsetD2D32` sets 2D memory range with specified 32-bit values
+- `hipMemsetD2D32Async` asynchronously sets 2D memory range with specified 32-bit values
 - `hipStreamSetAttribute` sets attributes such as synchronization policy for a given stream
 - `hipStreamGetAttribute` returns attributes such as priority for a given stream
 - `hipModuleLoadFatBinary` loads fatbin binary to a module
-- `hipMemcpyBatchAsync` performs a batch of 1D or 2D memory copied asynchronously
-- `hipMemcpy3DBatchAsync` performs a batch of 3D memory copied asynchronously
+- `hipMemcpyBatchAsync` asynchronously performs a batch copy of 1D or 2D memory
+- `hipMemcpy3DBatchAsync` asynchronously performs a batch copy of 3D memory
 - `hipMemcpy3DPeer` copies memory between devices
-- `hipMemcpy3DPeerAsync`copies memory between devices asynchronously
-- `hipMemsetD2D32Async` used for setting 2D memory range with specified 32-bit values
-asynchronously
+- `hipMemcpy3DPeerAsync` asynchronously copies memory between devices
+- `hipMemsetD2D32Async` asynchronously sets 2D memory range with specified 32-bit values
 - `hipMemPrefetchAsync_v2` prefetches memory to the specified location
-- `hipMemAdvise_v2` advise about the usage of a given memory range
+- `hipMemAdvise_v2` advises about the usage of a given memory range
 - `hipGetDriverEntryPoint ` gets function pointer of a HIP API.
 - `hipSetValidDevices` sets a default list of devices that can be used by HIP
-- `hipStreamGetId` queries the ID of a stream
-* Support for the flag `hipMemLocationTypeHost`, enables handling virtual memory management in host memory location, in addition to device memory.
+- `hipStreamGetId` queries the id of a stream
 * Support for nested tile partitioning within cooperative groups, matching CUDA functionality.
 
 #### Optimized
 
-* Improved hip module loading latency.
+* Improved HIP module loading latency.
 * Optimized kernel metadata retrieval during module post load.
 * Optimized doorbell ring in HIP runtime for the following performance improvements:
 - Makes efficient packet batching for HIP graph launch