Compare commits

Author: Istvan Kiss
SHA1: f37007bf42
Message: JAX key features and enhancements (#5708)
Co-authored-by: Pratik Basyal <prbasyal@amd.com>
Date: 2025-12-04 12:34:48 +01:00
2 changed files with 32 additions and 0 deletions

@@ -36,6 +36,7 @@ Andrej
 Arb
 Autocast
 autograd
+Backported
 BARs
 BatchNorm
 BLAS
@@ -202,9 +203,11 @@ GenAI
 GenZ
 GitHub
 Gitpod
+hardcoded
 HBM
 HCA
 HGX
+HLO
 HIPCC
 hipDataType
 HIPExtension
@@ -329,6 +332,7 @@ MoEs
 Mooncake
 Mpops
 Multicore
+multihost
 Multithreaded
 mx
 MXFP
@@ -1020,6 +1024,7 @@ uncacheable
 uncorrectable
 underoptimized
 unhandled
+unfused
 uninstallation
 unmapped
 unsqueeze

@@ -269,6 +269,33 @@ For a complete and up-to-date list of JAX public modules (for example, ``jax.num
 JAX API modules are maintained by the JAX project and are subject to change.
 Refer to the official JAX documentation for the most up-to-date information.
+Key features and enhancements for ROCm 7.1
+===============================================================================
+
+- Enabled compilation of multihost HLO runner Python bindings.
+
+- Backported multihost HLO runner bindings and some related changes to
+  :code:`FunctionalHloRunner`.
+
+- Added :code:`requirements_lock_3_12` to enable building for Python 3.12.
+
+- Removed the hardcoded NHWC convolution layout for ``fp16`` precision to address performance drops on gfx12xx GPUs.
+
+- ROCprofiler-SDK integration:
+
+  - Integrated ROCprofiler-SDK (v3) into XLA to improve profiling of GPU events,
+    with support for both time-based and step-based profiling.
+
+  - Added unit tests for :code:`rocm_collector` and :code:`rocm_tracer`.
+
+- Added Triton unsupported conversion from ``f8E4M3FNUZ`` to ``fp16`` with
+  rounding mode.
+
+- Introduced :code:`CudnnFusedConvDecomposer` to revert fused convolutions
+  when :code:`ConvAlgorithmPicker` fails to find a fused algorithm, and removed
+  unfused fallback paths from :code:`RocmFusedConvRunner`.
+
 Key features and enhancements for ROCm 7.0
 ===============================================================================