mirror of
https://github.com/ROCm/ROCm.git
synced 2026-02-01 09:55:00 -05:00
Compare commits
1 Commits
hipdnn
...
update_jax
| Author | SHA1 | Date | |
|---|---|---|---|
|
|
f37007bf42 |
@@ -36,6 +36,7 @@ Andrej
|
|||||||
Arb
|
Arb
|
||||||
Autocast
|
Autocast
|
||||||
autograd
|
autograd
|
||||||
|
Backported
|
||||||
BARs
|
BARs
|
||||||
BatchNorm
|
BatchNorm
|
||||||
BLAS
|
BLAS
|
||||||
@@ -202,9 +203,11 @@ GenAI
|
|||||||
GenZ
|
GenZ
|
||||||
GitHub
|
GitHub
|
||||||
Gitpod
|
Gitpod
|
||||||
|
hardcoded
|
||||||
HBM
|
HBM
|
||||||
HCA
|
HCA
|
||||||
HGX
|
HGX
|
||||||
|
HLO
|
||||||
HIPCC
|
HIPCC
|
||||||
hipDataType
|
hipDataType
|
||||||
HIPExtension
|
HIPExtension
|
||||||
@@ -329,6 +332,7 @@ MoEs
|
|||||||
Mooncake
|
Mooncake
|
||||||
Mpops
|
Mpops
|
||||||
Multicore
|
Multicore
|
||||||
|
multihost
|
||||||
Multithreaded
|
Multithreaded
|
||||||
mx
|
mx
|
||||||
MXFP
|
MXFP
|
||||||
@@ -1020,6 +1024,7 @@ uncacheable
|
|||||||
uncorrectable
|
uncorrectable
|
||||||
underoptimized
|
underoptimized
|
||||||
unhandled
|
unhandled
|
||||||
|
unfused
|
||||||
uninstallation
|
uninstallation
|
||||||
unmapped
|
unmapped
|
||||||
unsqueeze
|
unsqueeze
|
||||||
|
|||||||
@@ -269,6 +269,33 @@ For a complete and up-to-date list of JAX public modules (for example, ``jax.num
|
|||||||
JAX API modules are maintained by the JAX project and are subject to change.
|
JAX API modules are maintained by the JAX project and are subject to change.
|
||||||
Refer to the official JAX documentation for the most up-to-date information.
|
Refer to the official JAX documentation for the most up-to-date information.
|
||||||
|
|
||||||
|
Key features and enhancements for ROCm 7.1
|
||||||
|
===============================================================================
|
||||||
|
|
||||||
|
- Enabled compilation of multihost HLO runner Python bindings.
|
||||||
|
|
||||||
|
- Backported multihost HLO runner bindings and some related changes to
|
||||||
|
:code:`FunctionalHloRunner`.
|
||||||
|
|
||||||
|
- Added :code:`requirements_lock_3_12` to enable building for Python 3.12.
|
||||||
|
|
||||||
|
- Removed the hardcoded NHWC convolution layout for ``fp16`` precision to address performance drops on gfx12xx GPUs.
|
||||||
|
|
||||||
|
|
||||||
|
- ROCprofiler-SDK integration:
|
||||||
|
|
||||||
|
- Integrated ROCprofiler-SDK (v3) into XLA to improve profiling of GPU events,
|
||||||
|
with support for both time-based and step-based profiling.
|
||||||
|
|
||||||
|
- Added unit tests for :code:`rocm_collector` and :code:`rocm_tracer`.
|
||||||
|
|
||||||
|
- Added Triton unsupported conversion from ``f8E4M3FNUZ`` to ``fp16`` with
|
||||||
|
rounding mode.
|
||||||
|
|
||||||
|
- Introduced :code:`CudnnFusedConvDecomposer` to revert fused convolutions
|
||||||
|
when :code:`ConvAlgorithmPicker` fails to find a fused algorithm, and removed
|
||||||
|
unfused fallback paths from :code:`RocmFusedConvRunner`.
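
The profiling item above can be exercised from user code with the public JAX profiler API. The following is a minimal sketch, not the XLA integration itself: ``train_step`` and ``profile_steps`` are hypothetical names introduced for illustration, the temporary log directory is an assumption, and whether events are collected through ROCprofiler-SDK depends on the installed ROCm build of JAX. It wraps a few annotated steps in a trace so that a step-oriented view is available in the profile.

```python
import tempfile

import jax
import jax.numpy as jnp


@jax.jit
def train_step(x):
    # Hypothetical stand-in for a real training step: a matmul and a reduction.
    return jnp.sum(x @ x.T)


def profile_steps(num_steps=3, log_dir=None):
    """Trace `num_steps` annotated steps and return (log_dir, results)."""
    # Assumption: a throwaway directory is acceptable for the trace output.
    log_dir = log_dir or tempfile.mkdtemp(prefix="jax_trace_")
    x = jnp.ones((64, 64), dtype=jnp.float32)
    results = []
    with jax.profiler.trace(log_dir):
        for step in range(num_steps):
            # Mark step boundaries so the profile can be grouped per step.
            with jax.profiler.StepTraceAnnotation("train_step", step_num=step):
                # float() forces the computation to finish inside the trace.
                results.append(float(train_step(x)))
    return log_dir, results
```

Calling ``profile_steps()`` writes a trace under the returned directory, which can then be opened with a profile viewer such as TensorBoard if the profiling plugin is installed.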
|
||||||
|
|
||||||
Key features and enhancements for ROCm 7.0
|
Key features and enhancements for ROCm 7.0
|
||||||
===============================================================================
|
===============================================================================
|
||||||
|
|
||||||
|
|||||||