From 67f988f58bec25c24c17ce50d08e0c2ebea6e6bf Mon Sep 17 00:00:00 2001 From: Pratik Basyal Date: Fri, 15 Aug 2025 13:17:41 -0400 Subject: [PATCH 01/58] 7.0.0 release changes to ROCm documentation (#483) * Update RELEASE.md (#481) Update HIP 7.0 Release Notes * Initial 7.0.0 related changes * Update RELEASE.md Add Release Notes entry for `__reduce_XXX_sync` functions in HIP. * Update RELEASE.md Add HIP 7 API changes to Release Highlights * Update RELEASE.md Corect link for HIP 7 changes * Update RELEASE.md Update Release Highlights note for HIP 7 changes * Changelog entry updated post RC2 * 642 GA manifest added * 6.4.3 GA manifest added * 7.0.0 RC1 manifest added * added rocCV (#490) Co-authored-by: Pratik Basyal * 7.0.0 RC2 manifest added * Documentation updated added * Highlight for 7.0.0 added * Highlight updated * Highlights update * removed rocCV (#499) Co-authored-by: Pratik Basyal * Version udpate * Version table update * Installer udpate added * Table updated --------- Co-authored-by: randyh62 <42045079+randyh62@users.noreply.github.com> Co-authored-by: spolifroni-amd --- RELEASE.md | 1379 +++++++++++++++++++++++++++++++++++++++++++--- docs/conf.py | 6 +- manifest_700.xml | 80 +++ 3 files changed, 1389 insertions(+), 76 deletions(-) create mode 100644 manifest_700.xml diff --git a/RELEASE.md b/RELEASE.md index be1527030..4041b0fdd 100644 --- a/RELEASE.md +++ b/RELEASE.md @@ -10,7 +10,7 @@ -# ROCm 6.4.3 release notes +# ROCm 7.0.0 release notes The release notes provide a summary of notable changes since the previous ROCm release. @@ -24,6 +24,8 @@ The release notes provide a summary of notable changes since the previous ROCm r - [ROCm known issues](#rocm-known-issues) +- [ROCm resolved issues](#rocm-resolved-issues) + - [ROCm upcoming changes](#rocm-upcoming-changes) ```{note} @@ -33,40 +35,194 @@ documentation to verify compatibility and system requirements. ## Release highlights -ROCm 6.4.3 is a quality release that resolves the following issues. 
For changes to individual components, see [Detailed component changes](#detailed-component-changes). +The following are notable new features and improvements in ROCm 7.0.0. For changes to individual components, see +[Detailed component changes](#detailed-component-changes). -### AMDGPU driver updates +### HIP API compatibility improvements -* Resolved an issue causing performance degradation in communication operations, caused by increased latency in certain RCCL applications. The fix prevents unnecessary queue eviction during the fork process. -* Fixed an issue in the AMDGPU driver’s scheduler constraints that could cause queue preemption to fail during workload execution. +HIP API 7.0 introduces changes to make it align more closely with NVIDIA CUDA. These changes are incompatible with prior releases, +and might require recompiling existing HIP applications for use in the ROCm 7.0 release. For more information, see the [HIP API 7.0 changes](../hip-7-changes) and the [HIP changelog](#hip-7-0-0) below. -### ROCm SMI update -* Fixed the failure to load GPU data like System Clock (SCLK) by adjusting the logic for retrieving GPU board voltage. +### Instinct Driver / ROCm packaging separation + +The Instinct Driver is now distributed separately from the ROCm software stack and is stored in its own location, ``/amdgpu/``, in the package repository at [repo.radeon.com](https://repo.radeon.com/amdgpu/). The first release is designated as Instinct Driver version 30.10. See [ROCm Gets Modular: Meet the Instinct Datacenter GPU Driver](https://rocm.blogs.amd.com/ecosystems-and-partners/instinct-gpu-driver/README.html) for more information. + +Forward and backward compatibility between the Instinct Driver and ROCm is not supported in the Beta release. See the [installation instructions](https://rocm.docs.amd.com/en/docs-7.0-beta/preview/install/index.html). 
+ +### Deep learning framework support improvements + +ROCm 7.0 supports PyTorch 2.7, TensorFlow 2.19, and Triton 3.3.0. + +### ROCprofiler-SDK and rocprofv3 improvements + +#### rocpd + +Support has been added for the ROCm Profiling Data (rocpd) output format, which is now the default format for ``rocprofv3``. A subproject of the ROCprofiler-SDK, rocpd enables saving profiling results to a SQLite3 database, providing a structured and efficient foundation for analysis and post-processing. + +#### Core SDK enhancements + +* ROCprofiler-SDK is now compatible with the HIP 7.0 API. +* Added stochastic and host-trap PC sampling support for all MI300 series accelerators. +* Added support for tracing KFD events. + +#### rocprofv3 CLI tool enhancements + +* Added stochastic and host-trap PC sampling support for all MI300 series accelerators. +* HIP streams translate to Queues in Time Traces in Perfetto output. + +### Compilers changes and improvements + +ROCm 7.0 introduces the AMD Next-Gen Fortran compiler. ``llvm-flang`` (sometimes called new-flang or flang-18) is a re-implementation of the Fortran frontend. It is a strategic replacement for classic-flang and is developed in LLVM’s upstream repo at [llvm/llvm-project](https://github.com/llvm/llvm-project/tree/main/flang). + +Key enhancements include: + + * Compiler: + * Improved memory load and store instructions. + * Updated clang/llvm to AMD clang version 20.0.0git (equivalent to LLVM 20.0.0 with additional out-of-tree patches). + * Support added for separate debug file generation for device code. + + * Comgr: + * Added support for an in-memory virtual file system (VFS) for storing temporary files generated during intermediate compilation steps. This is designed to improve performance by reducing on-disk file I/O. Currently, VFS is supported only for the device library link step, with plans for expanded support in future releases. 
+ + * SPIR-V: + * Improved [target-specific extensions](https://github.com/ROCm/llvm-project/blob/c2535466c6e40acd5ecf6ba1676a4e069c6245cc/clang/docs/LanguageExtensions.rst): + * Added a new target-specific builtin ``__builtin_amdgcn_processor_is`` for late or deferred queries of the current target processor. + * Added a new target-specific builtin ``__builtin_amdgcn_is_invocable``, enabling fine-grained, per-builtin feature availability. + * HIPIFY now supports NVIDIA CUDA 12.8.0 APIs: + * Added support for all new device and host APIs, including FP4, FP6, and FP128 – including support for the corresponding ROCm HIP equivalents. + +Deprecated features: + + * ROCm components no longer use the ``__AMDGCN_WAVEFRONT_SIZE`` and ``__AMDGCN_WAVEFRONT_SIZE__`` macros nor HIP’s ``warpSize`` variable as ``constexpr``. These macros and reliance on ``warpSize`` as a ``constexpr`` are deprecated and will be disabled in a future release. Users are encouraged to update their code if needed to ensure future compatibility. + +### Libraries changes and improvements + +#### New data type support + +MX-compliant data types bring microscaling support to ROCm. For more information, see the [OCP Microscaling (MX) Formats Specification](https://www.opencompute.org/documents/ocp-microscaling-formats-mx-v1-0-spec-final-pdf). The ROCm 7.0 Alpha enables functional support for MX data types FP4, FP6, and FP8 on MI355X systems in these ROCm libraries: + * Composable Kernel (FP4 and FP8 only) + * hipBLASLt + * MIGraphX (FP4 only) + +The following libraries are updated to support the Open Compute Project (OCP) floating-point FP8 format on AMD Instinct MI355X instead of the NANOO FP8 format: + + * Composable Kernel + * hipBLASLt + * hipSPARSELt + * MIGraphX + * rocWMMA +MIGraphX now also supports BF16. + +#### RCCL support + +RCCL is supported for single-node functional usage only. Multi-node communication capabilities will be supported in future preview releases. 
+ +#### MIGraphX support + +* Support for OCP FP8 and MX FP4 data types on MI355X +* Support for BF16 on all hardware +* Support for PyTorch 2.7 via Torch-MIGraphX + +### Tools changes and improvements + +#### AMD SMI + +* The default output of the ``amd-smi`` CLI now displays a simple table view. +* New APIs: CPU affinity shows each GPU’s affinity to the CPUs in a system. + +#### ROCgdb +* MX data types support: FP4, FP6, and FP8 + +#### ROCprof Compute Viewer +* Initial release: ``rocprof-compute-viewer`` allows the visualization of ``rocprofv3``’s thread trace output + +#### ROCprof Trace Decoder +* Initial release: ``rocprof-trace-decoder``, a plugin API for decoding thread traces + +#### ROCm Compute Profiler + +* MX data types support: FP4, FP6, and FP8. +* AMD Instinct MI355X and MI350X performance counters: CPC, SPI, SQ, TA/TD/TCP, and TCC. +* Enhanced roofline analysis with support for INT8, INT32, FP8, FP16, and BF16 data types. +* Roofline distinction for FP32 and FP64 data types. +* Selective kernel profiling. + +#### ROCm Systems Profiler +* Trace support for computer vision APIs: H264, H265, AV1, VP9, and JPEG. +* Trace support for computer vision engine activity. +* OpenMP for C++ language and kernel activity support. + +#### ROCm Validation Suite +* AMD Instinct MI355X and MI350X accelerator support in the IET (Integrated Execution Test), GST (GPU Stress Test), and Babel (memory bandwidth test) modules. + +#### ROCprofiler-SDK +* Program counter (PC) sampling (host trap-based). +* API for profiling applications using thread traces (beta). +* Support in ``rocprofv3`` CLI tool for thread trace service. + +### ROCm Offline Installer Creator updates + +The ROCm Offline Installer Creator 7.0.0 includes the following features and improvements: + +* Added support for RHEL 10.0, Oracle 10.0, and Rocky 9.6. +* Added support for the new graphics repo structure for graphics/mesa related packages. 
+* Improvements to kernel header version matching for AMDGPU driver installation. +* Added support for creating an offline installer when the kernel version of the target operating system differs from the kernel version of the host creating the installer (for Ubuntu 22.04 and 24.04 only). + +See [ROCm Offline Installer Creator](https://rocm.docs.amd.com/projects/install-on-linux/en/latest/install/rocm-offline-installer.html) for more information. + +### ROCm Runfile Installer updates + +The ROCm Runfile Installer 7.0.0 adds the following features and improvements: + +* Added support for RHEL 10.0, Oracle 10.0, and Rocky 9.6. +* Added `untar` mode for the `.run` file to allow extraction of ROCm to a given directory, similar to a normal tarball. +* Added an RVS test script. +* Fixes to the rocm-examples test script. +* Fixes for `clinfo` and OpenCL use after installation. + +For more information, see [ROCm Runfile Installer](https://rocm.docs.amd.com/projects/install-on-linux/en/latest/install/rocm-runfile-installer.html). ### ROCm documentation updates ROCm documentation continues to be updated to provide clearer and more comprehensive guidance for a wider variety of user needs and use cases. 
-* [Tutorials for AI developers](https://rocm.docs.amd.com/projects/ai-developer-hub/en/latest/) have been expanded with the following five new tutorials: - * Inference tutorials - * [ChatQnA vLLM deployment and performance evaluation](https://rocm.docs.amd.com/projects/ai-developer-hub/en/latest/notebooks/inference/opea_deployment_and_evaluation.html) - * [Text-to-video generation with ComfyUI](https://rocm.docs.amd.com/projects/ai-developer-hub/en/latest/notebooks/inference/t2v_comfyui_radeon.html) - * [DeepSeek Janus Pro on CPU or GPU](https://rocm.docs.amd.com/projects/ai-developer-hub/en/latest/notebooks/inference/deepseek_janus_cpu_gpu.html) - * [DeepSeek-R1 with vLLM V1](https://rocm.docs.amd.com/projects/ai-developer-hub/en/latest/notebooks/inference/vllm_v1_DSR1.html) - * GPU development and optimization tutorial: [MLA decoding kernel of AITER library](https://rocm.docs.amd.com/projects/ai-developer-hub/en/latest/notebooks/gpu_dev_optimize/aiter_mla_decode_kernel.html) - - For more information about the changes, see [Changelog for the AI Developer Hub](https://rocm.docs.amd.com/projects/ai-developer-hub/en/latest/changelog.html). +* ROCm Math libraries support a wide range of data types, enabling optimized performance across various precision requirements. The following Math libraries are now updated with new precision content. For more information, click the Math library’s link: -* ROCm provides a comprehensive ecosystem for deep learning development. For more details, see [Deep learning frameworks for ROCm](https://rocm.docs.amd.com/en/docs-6.4.3/how-to/deep-learning-rocm.html). 
AMD ROCm adds support for the following deep learning frameworks: + * [hipBLAS](https://rocm.docs.amd.com/projects/hipBLAS/en/develop/reference/data-type-support.html) + * [hipBLASLt](https://rocm.docs.amd.com/projects/hipBLASLt/en/develop/reference/data-type-support.html) + * [hipSPARSE](https://rocm.docs.amd.com/projects/hipSPARSE/en/develop/reference/precision.html) + * [rocSPARSE](https://rocm.docs.amd.com/projects/rocSPARSE/en/develop/reference/precision.html) + * [Tensile](https://rocm.docs.amd.com/projects/Tensile/en/develop/src/reference/precision-support.html#precision-support) - * Taichi is an open-source, imperative, and parallel programming language designed for high-performance numerical computation. Embedded in Python, it leverages just-in-time (JIT) compilation frameworks such as LLVM to accelerate compute-intensive Python code by compiling it to native GPU or CPU instructions. It is currently supported on ROCm 6.3.2. For more information, see [Taichi compatibility](https://rocm.docs.amd.com/en/docs-6.4.3/compatibility/ml-compatibility/taichi-compatibility.html). - * Megablocks is a light-weight library for mixture-of-experts (MoE) training. The core of the system is efficient "dropless-MoE" and standard MoE layers. Megablocks is integrated with Megatron-LM, where data and pipeline parallel training of MoEs is supported. It is currently supported on ROCm 6.3.0. For more information, see [Megablocks compatibility](https://rocm.docs.amd.com/en/docs-6.4.3/compatibility/ml-compatibility/megablocks-compatibility.html). +* Documentation for [rocCV](https://rocm.docs.amd.com/projects/rocCV/en/latest/index.html), an efficient GPU-accelerated library for image pre- and post-processing, has been added. rocCV is in an early access state, and using it on production workloads is not recommended. 
-* The [Data types and precision support](https://rocm.docs.amd.com/en/latest/reference/precision-support.html) topic now includes new hardware and library support information. +* ROCm offers a comprehensive ecosystem for deep learning development, featuring libraries optimized for deep learning operations and ROCm-aware versions of popular deep learning frameworks and libraries. The following deep learning frameworks' content now includes release notes and known issues: + + * [PyTorch](https://rocm.docs.amd.com/en/latest/compatibility/ml-compatibility/pytorch-compatibility.html) + * [JAX](https://rocm.docs.amd.com/en/latest/compatibility/ml-compatibility/jax-compatibility.html) + +* ROCm components support a wide range of environment variables that can be used for testing, logging, debugging, experimental features, and more. The following components have been updated with new environment variable content. For more information, click the component’s link: + + * [hipBLASLt](https://rocm.docs.amd.com/projects/hipBLASLt/en/develop/reference/env-variables.html) + * [hipSPARSELt](https://rocm.docs.amd.com/projects/hipSPARSELt/en/develop/reference/env-variables.html) + * [MIVisionX](https://rocm.docs.amd.com/projects/MIVisionX/en/develop/reference/MIVisionX-env-variables.html) + * [MIOpen](https://rocm.docs.amd.com/projects/MIOpen/en/develop/reference/env_variables.html) + * [rocBLAS](https://rocm.docs.amd.com/projects/rocBLAS/en/develop/reference/env-variables.html) + * [ROCm Performance Primitives (RPP)](https://rocm.docs.amd.com/projects/rpp/en/develop/reference/rpp-env-variables.html) + * [rocSOLVER](https://rocm.docs.amd.com/projects/rocSOLVER/en/develop/reference/env_variables.html) + * [rocSPARSE](https://rocm.docs.amd.com/projects/rocSPARSE/en/develop/reference/env_variables.html) + * [Tensile](https://rocm.docs.amd.com/projects/Tensile/en/develop/src/reference/environment-variables.html) + +* Modern computing tasks often require balancing numerical precision 
against hardware resources and processing speed. Low precision floating point number formats in HIP include FP4 (4-bit) and FP6 (6-bit), which reduce memory and bandwidth requirements. For more information, see the updated [Low precision floating point types](https://rocm.docs.amd.com/projects/HIP/en/docs-develop/reference/low_fp_types.html) topic. ## Operating system and hardware support changes -Operating system and hardware support remain unchanged in this release. +ROCm 7.0.0 adds support for [placeholder]. For more information, see installation instructions. + +ROCm 6.4.2 marks the end of support (EoS) for [placeholder] + +ROCm 7.0.0 adds support for AMD Instinct MI355X and MI350X. For details, see the full list of Supported GPUs (Linux). See the [Compatibility matrix](../../docs/compatibility/compatibility-matrix.rst) @@ -74,7 +230,9 @@ for more information about operating system and hardware compatibility. ## ROCm components -The following table lists the versions of ROCm components for ROCm 6.4.3. +The following table lists the versions of ROCm components for ROCm 7.0.0, including any version +changes from 6.4.3 to 7.0.0. Click the component's updated version to go to a list of its changes. + Click {fab}`github` to go to the component's source code on GitHub.
@@ -97,47 +255,47 @@ Click {fab}`github` to go to the component's source code on GitHub. Libraries Machine learning and computer vision Composable Kernel - 1.1.0 + 1.1.0 ⇒ 1.1.0 MIGraphX - 2.12.0 + 2.12.0 ⇒ 2.13.0 MIOpen - 3.4.0 + 3.4.0 ⇒ 3.4.1 MIVisionX - 3.2.0 + 3.2.0 ⇒ 3.3.0 rocAL - 2.2.0 + 2.2.0 ⇒ 2.3.0 rocDecode - 0.10.0 + 0.10.0 ⇒ 1.0.0 rocJPEG - 0.8.0 + 0.8.0 ⇒ 1.1.0 rocPyDecode - 0.3.1 + 0.3.1 ⇒ 0.6.0 RPP - 1.9.10 + 1.9.10 ⇒ 2.0.0 @@ -146,12 +304,12 @@ Click {fab}`github` to go to the component's source code on GitHub. Communication RCCL - 2.22.3 + 2.22.3 ⇒ 2.26.6 rocSHMEM - 2.0.1 + 2.0.1 ⇒ 3.0.0 @@ -160,82 +318,82 @@ Click {fab}`github` to go to the component's source code on GitHub. Math hipBLAS - 2.4.0 + 2.4.0 ⇒ 3.0.0 hipBLASLt - 0.12.1 + 0.12.1 ⇒ 1.0.0 hipFFT - 1.0.18 + 1.0.18 ⇒ 1.0.20 hipfort - 0.6.0 + 0.6.0 ⇒ 0.7.0 hipRAND - 2.12.0 + 2.12.0 ⇒ 3.0.0 hipSOLVER - 2.4.0 + 2.4.0 ⇒ 3.0.0 hipSPARSE - 3.2.0 + 3.2.0 ⇒ 4.0.1 hipSPARSELt - 0.2.3 + 0.2.3 ⇒ 0.2.4 rocALUTION - 3.2.3 + 3.2.3 ⇒ 4.0.0 rocBLAS - 4.4.1 + 4.4.1 ⇒ 5.0.0 rocFFT - 1.0.32 + 1.0.32 ⇒ 1.0.34 rocRAND - 3.3.0 + 3.3.0 ⇒ 4.0.0 rocSOLVER - 3.28.2 + 3.28.2 ⇒ 3.30.0 rocSPARSE - 3.4.0 + 3.4.0 ⇒ 4.0.2 rocWMMA - 1.7.0 + 1.7.0 ⇒ 2.0.0 Tensile - 4.43.0 + 4.43.0 ⇒ 4.44.0 @@ -244,22 +402,22 @@ Click {fab}`github` to go to the component's source code on GitHub. Primitives hipCUB - 3.4.0 + 3.4.0 ⇒ 4.0.0 hipTensor - 1.5.0 + 1.5.0 ⇒ 2.0.0 rocPRIM - 3.4.1 + 3.4.1 ⇒ 4.0.0 rocThrust - 3.3.0 + 3.3.0 ⇒ 4.0.0 @@ -268,12 +426,12 @@ Click {fab}`github` to go to the component's source code on GitHub. Tools System management AMD SMI - 25.5.1 + 25.5.1 ⇒ 26.0.0 ROCm Data Center Tool - 0.3.0 + 0.3.0 ⇒ 1.1.0 @@ -283,12 +441,12 @@ Click {fab}`github` to go to the component's source code on GitHub. ROCm SMI - 7.5.0 ⇒ 7.7.0 + 7.7.0 ⇒ 7.8.0 ROCm Validation Suite - 1.1.0 + 1.1.0 ⇒ 1.2.0 @@ -298,19 +456,19 @@ Click {fab}`github` to go to the component's source code on GitHub. 
Performance ROCm Bandwidth Test - 1.4.0 + 1.4.0 ⇒ 2.6.0 ROCm Compute Profiler - 3.1.1 + 3.1.1 ⇒ 3.2.1 ROCm Systems Profiler - 1.0.2 + 1.0.2 ⇒ 1.1.0 @@ -322,7 +480,7 @@ Click {fab}`github` to go to the component's source code on GitHub. ROCprofiler-SDK - 0.6.0 + 0.6.0 ⇒ 1.0.0 @@ -338,13 +496,13 @@ Click {fab}`github` to go to the component's source code on GitHub. Development HIPIFY - 19.0.0 + 19.0.0 ⇒ 20.0.0 ROCdbgapi - 0.77.2 + 0.77.2 ⇒ 0.77.3 @@ -357,14 +515,14 @@ Click {fab}`github` to go to the component's source code on GitHub. ROCm Debugger (ROCgdb) - 15.2 + 15.2 ⇒ 16.3 ROCr Debug Agent - 2.0.4 + 2.0.4 ⇒ 2.1.0 @@ -379,7 +537,7 @@ Click {fab}`github` to go to the component's source code on GitHub. llvm-project - 19.0.0 + 19.0.0 ⇒ 20.0.0 @@ -388,12 +546,12 @@ Click {fab}`github` to go to the component's source code on GitHub. Runtimes HIP - 6.4.3 + 6.4.3 ⇒ 7.0.0 ROCr Runtime - 1.15.0 + 1.15.0 ⇒ 1.18.0 @@ -408,21 +566,1096 @@ The following sections describe key changes to ROCm components. For a historical overview of ROCm component updates, see the {doc}`ROCm consolidated changelog `. ``` -### **ROCm SMI** (7.7.0) +### Composable Kernel 1.1.0 #### Added -- Support for getting the GPU Board voltage. +* Added support for bf16, f32, and f16 for 2D and 3D NGCHW grouped convolution backward data +* Added a fully asynchronous HOST (CPU) arguments copy flow for CK grouped GEMM kernels. +* Added support GKCYX layout for grouped convolution forward (NGCHW/GKCYX/NGKHW, number of instances in instance factory for NGCHW/GKYXC/NGKHW has been reduced). +* Added support for GKCYX layout for grouped convolution forward (NGCHW/GKCYX/NGKHW). +* Added support for GKCYX layout for grouped convolution backward weight (NGCHW/GKCYX/NGKHW). +* Added support for GKCYX layout for grouped convolution backward data (NGCHW/GKCYX/NGKHW). 
+* Added support for Stream-K version of mixed fp8/bf16 GEMM +* Added support for Multiple D GEMM +* Added GEMM pipeline for microscaling (MX) FP8/FP6/FP4 data types +* Added support for FP16 2:4 structured sparsity to universal GEMM. +* Added support for Split K for grouped convolution backward data. +* Added logit soft-capping support for fMHA forward kernels. +* Added support for hdim as a multiple of 32 for FMHA (fwd/fwd_splitkv) +* Added benchmarking support for tile engine GEMM. +* Added Ping-pong scheduler support for GEMM operation along the K dimension. +* Added rotating buffer feature for CK_Tile GEMM. +* Added int8 support for CK_TILE GEMM. -```{note} -See the full [ROCm SMI changelog](https://github.com/ROCm/rocm_smi_lib/blob/release/rocm-rel-6.4/CHANGELOG.md) for details, examples, and in-depth descriptions. -``` +#### Optimized + + +* Optimize the gemm multiply multiply preshuffle & lds bypass with Pack of KGroup and better instruction layout. +* Added Vectorize Transpose optimization for CK Tile. +* Added the asynchronous copy for gfx950. + +#### Changes + +* Removed support for gfx940 and gfx941 targets. +* Replaced the raw buffer load/store intrinsics with Clang20 built-ins. +* DL and DPP kernels are now enabled by default. +* Number of instances in instance factory for grouped convolution forward NGCHW/GKYXC/NGKHW has been reduced. +* Number of instances in instance factory for grouped convolution backward weight NGCHW/GKYXC/NGKHW has been reduced. +* Number of instances in instance factory for grouped convolution backward data NGCHW/GKYXC/NGKHW has been reduced. + +### HIP 7.0.0 + +#### Added + +* New HIP APIs + - `hipLaunchKernelEx` dispatches the provided kernel with the given launch configuration and forwards the kernel arguments. + - `hipLaunchKernelExC` launches a HIP kernel using a generic function pointer and the specified configuration. + - `hipDrvLaunchKernelEx` dispatches the device kernel represented by a HIP function object. 
+ - `hipMemGetHandleForAddressRange` gets a handle for the address range requested. + - `num_threads` returns the total number of threads in the group. The legacy API `size` is an alias. + - `__reduce_add_sync`, `__reduce_min_sync`, and `__reduce_max_sync` functions added for reduction across lanes of a warp. For details, see [Warp cross-lane functions](https://rocm.docs.amd.com/projects/HIP/en/latest/how-to/hip_cpp_language_extensions.html#warp-cross-lane-functions). +* New support for Open Compute Project (OCP) floating-point `FP4`/`FP6`/`FP8` as follows. For details, see [Low precision floating point document](https://rocm.docs.amd.com/projects/HIP/en/latest/reference/low_fp_types.html). + - Data types for `FP4`/`FP6`/`FP8`. + - HIP APIs for `FP4`/`FP6`/`FP8`, which are compatible with corresponding CUDA APIs. + - HIP Extensions APIs for microscaling formats, which are supported on AMD GPUs. +* New `wptr` and `rptr` values in `ClPrint`, for better logging in dispatch barrier methods. +* New debug mask, to print precise code object information for logging. +* The `_sync()` versions of cross-lane builtins such as `shfl_sync()` and `__reduce_add_sync` are enabled by default. These can be disabled by setting the preprocessor macro `HIP_DISABLE_WARP_SYNC_BUILTINS`. +* Added `constexpr` operators for `fp16`/`bf16`. +* Added `__syncwarp` operation. +* Added PCI CHIP ID information as the device attribute. +* Added new test applications for OCP data types `FP4`/`FP6`/`FP8`. +* Added a device attribute in the HIP runtime that reports how many compute dies (chiplets, XCCs) are available on a given GPU. Developers can query this attribute with the API `hipDeviceGetAttribute` to improve cache locality in a kernel and to optimize the kernel launch grid layout for better performance. + +#### Changed +* Deprecated GPUs. +Some unsupported GPUs such as gfx9, gfx8 and gfx7 are deprecated on Microsoft Windows. 
+* Behavior changes + - `hipGetLastError` now returns the last actual error caught in the current thread during the application execution. + - `hipLaunchCooperativeKernelMultiDevice` and `hipLaunchCooperativeKernel` (cooperative groups) now perform additional input parameter validation checks. + - `hipPointerGetAttributes` now returns `hipSuccess` instead of the error `hipErrorInvalidValue` when a `NULL` host or attribute pointer is passed as an input parameter. It now matches the functionality of `cudaPointerGetAttributes`, which changed in CUDA 11 and later releases. + - `hipFree` previously performed an implicit wait, for synchronization purposes, that applied to all memory allocations. This wait is now disabled for allocations made with `hipMallocAsync` and `hipMallocFromPoolAsync`, to match the behavior of the CUDA API `cudaFree`. + - `hipFreeAsync` now returns `hipSuccess` when the input pointer is NULL, instead of `hipErrorInvalidValue`, to be consistent with `hipFree`. +* Changes in hipRTC. + - Removal of `hipRTC` symbols from HIP Runtime Library. + Any application using `hipRTC` APIs should link explicitly with the `hipRTC` library. This makes the usage of `hipRTC` library on Linux the same as on Windows and matches the behavior of CUDA `nvRTC`. + - `hipRTC` compilation + The device code compilation now uses the namespace `__hip_internal`, instead of the standard namespace `std`, to avoid namespace collisions. + - Changes of datatypes from `hipRTC`. + Datatype definitions such as `int64_t`, `uint64_t`, `int32_t`, and `uint32_t` are removed to avoid potential conflicts in some applications. HIP now uses internal datatypes instead, prefixed with `__hip`, for example, `__hip_int64_t`. +* HIP header clean up + - Usage of STD headers: HIP header files now include only the necessary STL headers. + - Deprecated structure `HIP_MEMSET_NODE_PARAMS` is removed. 
Developers can use the definition `hipMemsetParams` instead. +* API signature/struct changes + - API signatures are adjusted in some APIs to match corresponding CUDA APIs. Impacted APIs are as follows: + * `hiprtcCreateProgram` + * `hiprtcCompileProgram` + * `hipMemcpyHtoD` + * `hipCtxGetApiVersion` + - HIP struct change in `hipMemsetParams`: it is updated and compatible with CUDA. + - HIP vector constructor change in `hipComplex` initialization now generates correct values. The affected constructors are those of small vector types such as `float2`, `int4`, etc. +* Stream Capture updates + - Restricted stream capture mode: it is enforced in HIP APIs by adding the macro `CHECK_STREAM_CAPTURE_SUPPORTED ()`. +In the previous HIP enumeration `hipStreamCaptureMode`, three capture modes were defined. With the check in the macro, the only supported stream capture mode is now `hipStreamCaptureModeRelaxed`. The rest are not supported, and the macro will return `hipErrorStreamCaptureUnsupported`. This update involves the following APIs, which are allowed only in relaxed stream capture mode: + * `hipMallocManaged` + * `hipMemAdvise` + - Checks stream capture mode: the following APIs check the stream capture mode and return error codes to match the behavior of CUDA. + * `hipLaunchCooperativeKernelMultiDevice` + * `hipEventQuery` + * `hipStreamAddCallback` + - Returns error during stream capture. The following HIP APIs now return the specific error `hipErrorStreamCaptureUnsupported` on the AMD platform, rather than always returning `hipSuccess`, to match behavior with CUDA. + * `hipDeviceSetMemPool` + * `hipMemPoolCreate` + * `hipMemPoolDestroy` + * `hipDeviceSetSharedMemConfig` + * `hipDeviceSetCacheConfig` + * `hipMemcpyWithStream` +* Error code update +Returned error/value codes are updated in the following HIP APIs to match the corresponding CUDA APIs. 
+ - Module Management Related APIs + * `hipModuleLaunchKernel` + * `hipExtModuleLaunchKernel` + * `hipExtLaunchKernel` + * `hipDrvLaunchKernelEx` + * `hipLaunchKernel` + * `hipLaunchKernelExC` + * `hipModuleLaunchCooperativeKernel` + * `hipModuleLoad` + - Texture Management Related APIs +The following APIs update the return codes to match the behavior with CUDA: + * `hipTexObjectCreate`, supports zero width and height for 2D image. If either is zero, will not return `false`. + * `hipBindTexture2D`, adds an extra check: if the pointer for the texture reference or device is NULL, returns `hipErrorNotFound`. + * `hipBindTextureToArray`, if any NULL pointer is input for texture object, resource descriptor, or texture descriptor, returns error `hipErrorInvalidChannelDescriptor`, instead of `hipErrorInvalidValue`. + * `hipGetTextureAlignmentOffset`, adds a return code `hipErrorInvalidTexture` when the texture reference pointer is NULL. + - Cooperative Group Related APIs: more validations are added in the implementation of the following APIs: + * `hipLaunchCooperativeKernelMultiDevice` + * `hipLaunchCooperativeKernel` +* Invalid stream input parameter handling +In order to match the CUDA runtime behavior more closely, HIP APIs with streams passed as input parameters no longer check the stream validity. Previously, the HIP runtime returned an error code `hipErrorContextIsDestroyed` if the stream was invalid. In CUDA version 12 and later, the equivalent behavior is to raise a segmentation fault. HIP runtime now matches CUDA by causing a segmentation fault. 
The APIs impacted by this change are as follows: + - Stream Management Related APIs + * `hipStreamGetCaptureInfo` + * `hipStreamGetPriority` + * `hipStreamGetFlags` + * `hipStreamDestroy` + * `hipStreamAddCallback` + * `hipStreamQuery` + * `hipLaunchHostFunc` + - Graph Management Related APIs + * `hipGraphUpload` + * `hipGraphLaunch` + * `hipStreamBeginCaptureToGraph` + * `hipStreamBeginCapture` + * `hipStreamIsCapturing` + * `hipStreamGetCaptureInfo` + * `hipGraphInstantiateWithParams` + - Memory Management Related APIs + * `hipMemcpyPeerAsync` + * `hipMemcpy2DValidateParams` + * `hipMallocFromPoolAsync` + * `hipFreeAsync` + * `hipMallocAsync` + * `hipMemcpyAsync` + * `hipMemcpyToSymbolAsync` + * `hipStreamAttachMemAsync` + * `hipMemPrefetchAsync` + * `hipDrvMemcpy3D` + * `hipDrvMemcpy3DAsync` + * `hipDrvMemcpy2DUnaligned` + * `hipMemcpyParam2D` + * `hipMemcpyParam2DAsync` + * `hipMemcpy2DArrayToArray` + * `hipMemcpy2D` + * `hipMemcpy2DAsync` + * `hipDrvMemcpy2DUnaligned` + * `hipMemcpy3D` + - Event Management Related APIs + * `hipEventRecord` + * `hipEventRecordWithFlags` +* `warpSize` Change +In order to match the CUDA specification, the `warpSize` variable is no longer `constexpr`. In general, this should be a transparent change; however, if an application was using `warpSize` as a compile-time constant, it will have to be updated to handle the new definition. For more information, see the discussion of `warpSize` within the [HIP C++ language extensions](https://rocm.docs.amd.com/projects/HIP/en/latest/how-to/hip_cpp_language_extensions.html#warpsize). + +#### Optimized + +HIP runtime has the following functional improvements which greatly improve runtime performance and user experience. + +* Reduced usage of the lock scope in events and kernel handling. + - Switches to `shared_mutex` for event validation, uses `std::unique_lock` in HIP runtime to create/destroy event, instead of `scopedLock`. 
+ - Reduces the `scopedLock` in handling of kernel execution. HIP runtime now calls `scopedLock` during kernel binary creation/initialization, and doesn't call it again during kernel vector iteration before launch. +* Implementation of unifying managed buffer and kernel argument buffer so HIP runtime doesn't need to create/load a separate kernel argument buffer. +* Refactored memory validation, creates a unique function to validate a variety of memory copy operations. +* Improved kernel logging by demangling shader names. +* Advanced support for SPIRV: kernel compilation caching is now enabled by default. This feature is controlled by the environment variable `AMD_COMGR_CACHE`. For details, see the [hip_rtc document](https://rocm.docs.amd.com/projects/HIP/en/latest/how-to/hip_rtc.html). +* Programmatic support for scratch limits on MI300 and MI350 series GPU devices. More enumeration values were added in `hipLimit_t` as follows: + - `hipExtLimitScratchMin`, minimum allowed value in bytes for scratch limit on the device. + - `hipExtLimitScratchMax`, maximum allowed value in bytes for scratch limit on the device. + - `hipExtLimitScratchCurrent`, current scratch limit threshold in bytes on the device. Must be between the value `hipExtLimitScratchMin` and `hipExtLimitScratchMax`. + Developers can now use the environment variable `HSA_SCRATCH_SINGLE_LIMIT_ASYNC` to change the default allocation size with expected scratch limit in ROCR runtime. In addition, this value can be overridden programmatically in the application using the HIP API `hipDeviceSetLimit(hipExtLimitScratchCurrent, value)` to reset the scratch limit value. +* HIP runtime now enables peer-to-peer (P2P) memory copies to utilize all available SDMA engines, rather than being limited to a single engine. It also selects the best engine first to give optimal bandwidth. +* Improved launch latency for `D2D` copies and `memset` on MI300 series.
+* Memory manager was implemented to improve the efficiency of memory usage and speed up memory allocation/free in memory pools. +* Introduced a threshold to handle the command submission patch to the GPU device(s), considering the synchronization with CPU, for performance improvement. + +#### Resolved issues + +* Error of "unable to find modules" in HIP clean up for code object module. +* The issue of incorrect return error `hipErrorNoDevice`, when a crash occurred on GPU device due to illegal operation or memory violation. HIP runtime now handles the failure on the GPU side properly and reports the precise error code based on the last error seen on the GPU. +* Failures in some framework test applications, HIP runtime fixed the bug in retrieving a memory object from the IPC memory handle. +* A crash in TensorFlow related application. HIP runtime now combines multiple definitions of `callbackQueue` into a single function, in case of an exception, passes its handler to the application and provides corresponding error code. +* Fixed issue of handling the kernel parameters for the graph launch. +* Failures in roc-obj tools. HIP runtime now prints the `DEPRECATED` message in roc-obj tools to `STDERR`. + +### **hipBLAS** (3.0.0) + +#### Added + +* Added the `hipblasSetWorkspace()` API +* Support for code coverage tests + +#### Changed + +* HIPBLAS_V2 API is now the only available API using `hipComplex` and `hipDatatype` types +* Documentation updates +* Verbose compilation for `hipblas.cpp` + +#### Removed + +* `hipblasDatatype_t` type +* `hipComplex` and `hipDoubleComplex` types +* Support code for non-production gfx targets + +#### Resolved Issues + +* The build time `CMake` configuration for the dependency on `hipBLAS-common` is fixed +* Compiler warnings for unhandled enums have been resolved + +### **hipBLASLt** (1.0.0) + +#### Added + +* Stream-K GEMM support has been enabled for the `FP32`, `FP16`, `BF16`, `FP8`, and `BF8` data types on the MI300A APU.
To activate this feature, set the `TENSILE_SOLUTION_SELECTION_METHOD` environment variable to `2`, for example, `export TENSILE_SOLUTION_SELECTION_METHOD=2`. +* Fused Swish/SiLU GEMM in hipBLASLt (enabled by ``HIPBLASLT_EPILOGUE_SWISH_EXT`` and ``HIPBLASLT_EPILOGUE_SWISH_BIAS_EXT``) +* Added support for ``HIPBLASLT_EPILOGUE_GELU_AUX_BIAS`` for gfx942 +* Added `HIPBLASLT_TUNING_USER_MAX_WORKSPACE` to constrain max workspace size for user offline tuning +* Added ``HIPBLASLT_ORDER_COL16_4R16`` and ``HIPBLASLT_ORDER_COL16_4R8`` to ``hipblasLtOrder_t`` to support FP16/BF16 swizzle GEMM and FP8/BF8 swizzle GEMM respectively. +* Added TF32 emulation on gfx950 + +#### Changed + +* ``HIPBLASLT_MATMUL_DESC_A_SCALE_POINTER_VEC_EXT`` and ``HIPBLASLT_MATMUL_DESC_B_SCALE_POINTER_VEC_EXT`` are removed. Use the ``HIPBLASLT_MATMUL_DESC_A_SCALE_MODE`` and ``HIPBLASLT_MATMUL_DESC_B_SCALE_MODE`` attributes to set scalar (``HIPBLASLT_MATMUL_MATRIX_SCALE_SCALAR_32F``) or vector (``HIPBLASLT_MATMUL_MATRIX_SCALE_OUTER_VEC_32F``). +* The non-V2 APIs (``GemmPreference``, ``GemmProblemType``, ``GemmEpilogue``, ``GemmTuning``, ``GemmInputs``) in the Cpp header are now the same as the V2 APIs (``GemmPreferenceV2``, ``GemmProblemTypeV2``, ``GemmEpilogueV2``, ``GemmTuningV2``, ``GemmInputsV2``). The original non-V2 APIs are removed. +* `hipblasltExtAMaxWithScale` API is removed. + +#### Optimized + +* Improved performance for 8-bit (FP8/BF8/I8) NN/NT cases by adding ``s_delay_alu`` to reduce stalls from dependent ALU operations on gfx12+. +* Improved performance for 8-bit and 16-bit (FP16/BF16) TN cases by enabling software dependency check (Expert Scheduling Mode) under certain restrictions to reduce redundant hardware dependency checks on gfx12+. +* Improved performance for 8-bit, 16-bit, and 32-bit batched GEMM with a better heuristic search algorithm for gfx942. 
+ +#### Upcoming Changes + +* V2 APIs (``GemmPreferenceV2``, ``GemmProblemTypeV2``, ``GemmEpilogueV2``, ``GemmTuningV2``, ``GemmInputsV2``) are deprecated. + +### **hipCUB** (4.0.0) + +#### Added + +* Added a new cmake option, `BUILD_OFFLOAD_COMPRESS`. When hipCUB is built with this option enabled, the `--offload-compress` switch is passed to the compiler. This causes the compiler to compress the binary that it generates. Compression can be useful in cases where you are compiling for a large number of targets, since this often results in a large binary. Without compression, in some cases, the generated binary may become so large that symbols are placed out of range, resulting in linking errors. The new `BUILD_OFFLOAD_COMPRESS` option is set to `ON` by default. +* Added single pass operators in `agent/single_pass_scan_operators.hpp` which contains the following API: + * `BlockScanRunningPrefixOp` + * `ScanTileStatus` + * `ScanTileState` + * `ReduceByKeyScanTileState` + * `TilePrefixCallbackOp` +* Added gfx950 support. +* Added an overload of `BlockScan::InclusiveScan` that accepts an initial value to seed the scan. +* Added an overload of `WarpScan::InclusiveScan` that accepts an initial value to seed the scan. +* `UnrolledThreadLoad`, `UnrolledCopy`, and `ThreadLoadVolatilePointer` were added to align hipCUB with CUB. +* `ThreadStoreVolatilePtr` and the `IterateThreadStore` struct were added to align hipCUB with CUB. +* Added `hipcub::InclusiveScanInit` for CUB parity. + +#### Removed + +* The AMD GPU targets `gfx803` and `gfx900` are no longer built by default. If you would like to build for these architectures, please specify them explicitly in the `AMDGPU_TARGETS` cmake option. +* Deprecated `hipcub::AsmThreadLoad` is removed, use `hipcub::ThreadLoad` instead. +* Deprecated `hipcub::AsmThreadStore` is removed, use `hipcub::ThreadStore` instead.
+* Deprecated `BlockAdjacentDifference::FlagHeads`, `BlockAdjacentDifference::FlagTails` and `BlockAdjacentDifference::FlagHeadsAndTails` have been removed. +* This release removes support for custom builds on gfx940 and gfx941. +* Removed C++14 support, only C++17 is supported. + +#### Changed + +* The NVIDIA backend now requires CUB, Thrust, and libcu++ 2.7.0. If they aren't found, they will be downloaded from the NVIDIA CCCL repository. +* Updated `thread_load` and `thread_store` to align hipCUB with CUB. +* All kernels now have hidden symbol visibility. All symbols now have inline namespaces that include the library version (for example, `hipcub::HIPCUB_300400_NS::symbol` instead of `hipcub::symbol`), letting the user link multiple libraries built with different versions of hipCUB. +* Modified the broadcast kernel in warp scan benchmarks. The reported performance may be different to previous versions. +* The `hipcub::detail::accumulator_t` in rocPRIM backend has been changed to utilise `rocprim::accumulator_t`. +* The usage of `rocprim::invoke_result_binary_op_t` has been replaced with `rocprim::accumulator_t`. + +#### Resolved Issues + +* Fixed an issue where `Sort(keys, compare_op, valid_items, oob_default)` in `block_merge_sort.hpp` would not fill in elements that are out of range (items after `valid_items`) with `oob_default`. +* Fixed an issue where `ScatterToStripedFlagged` in `block_exchange.hpp` was calling the wrong function. + +#### Known Issues + +* `BlockAdjacentDifference::FlagHeads`, `BlockAdjacentDifference::FlagTails` and `BlockAdjacentDifference::FlagHeadsAndTails` have been removed from hipCUB's CUB backend. They were already deprecated as of version 2.12.0 of hipCUB and they were removed from CCCL (CUB) as of CCCL's 2.6.0 release. +* `BlockScan::InclusiveScan` for the NVIDIA backend does not compute the block aggregate correctly when passing an initial value parameter. This behavior is not matched by the AMD backend.
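The versioned inline namespaces described in the hipCUB "Changed" list can be illustrated with a minimal sketch in plain C++. The namespace `mylib` and the `MYLIB_*_NS` names are hypothetical stand-ins chosen for this example; they are not hipCUB symbols, and the real macro-generated names may differ.

```cpp
#include <cassert>
#include <string>

// Hypothetical library namespace; not part of hipCUB.
namespace mylib {

// The "current" version lives in an inline namespace, so callers can keep
// writing mylib::version_tag() while the mangled symbol carries the version.
inline namespace MYLIB_300400_NS {
inline std::string version_tag() { return "300400"; }
}  // inline namespace MYLIB_300400_NS

// A non-inline namespace models symbols from a different library version
// that ended up in the same program.
namespace MYLIB_300300_NS {
inline std::string version_tag() { return "300300"; }
}  // namespace MYLIB_300300_NS

}  // namespace mylib

// Lookup without a version resolves through the inline namespace.
inline std::string current_tag() { return mylib::version_tag(); }

// Fully qualified lookup still reaches one specific version.
inline std::string older_tag() { return mylib::MYLIB_300300_NS::version_tag(); }

// Both versions coexist without clashing because their qualified names differ.
inline void check_versions_coexist() {
    assert(current_tag() != older_tag());
}
```

Because each version's symbols have distinct fully qualified names, two builds of such a library can be linked into one program, which is the benefit the note above describes.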
+ +#### Upcoming Changes + +* `BlockAdjacentDifference::FlagHeads`, `BlockAdjacentDifference::FlagTails` and `BlockAdjacentDifference::FlagHeadsAndTails` were deprecated as of version 2.12.0 of hipCUB, and will be removed from the rocPRIM backend in a future release for the next ROCm major version (ROCm 7.0.0). + +### **hipFFT** (1.0.20) + +#### Added + +* Added gfx950 support. + +#### Removed + +* Removed hipfft-rider legacy compatibility from clients +* Remove support for the gfx940 and gfx941 targets from the client programs. +* Remove backward compatibility symlink for include directories. + +### **hipfort** (0.7.0) + +#### Added + +* Added documentation clarifying how hipfort is built for the NVIDIA + platform. Thanks [@fluidnumerics-joe](https://github.com/fluidnumerics-joe)! + +#### Changed + +* Updated and reorganized documentation for clarity and consistency. + +### **HIPIFY** (7.0.0) + +#### Added + +* CUDA 12.9.1 support +* cuDNN 9.11.0 support +* cuTENSOR 2.2.0.0 support +* LLVM 20.1.8 support + +#### Resolved Issues + +* `hipDNN` support is removed by default +* [#1859](https://github.com/ROCm/HIPIFY/issues/1859)[hipify-perl] Fix warnings on unsupported Driver or Runtime APIs which were erroneously not reported +* [#1930](https://github.com/ROCm/HIPIFY/issues/1930) Revise `JIT API` +* [#1962](https://github.com/ROCm/HIPIFY/issues/1962) Support for cuda-samples helper headers +* [#2035](https://github.com/ROCm/HIPIFY/issues/2035) Remove `const_cast<const char**>` in `hiprtcCreateProgram` and `hiprtcCompileProgram` + +### **hipRAND** (3.0.0) + +#### Added + +* gfx950 support + +#### Changed + +* Deprecated hipRAND's Fortran API in favor of hipfort. + +#### Removed + +* Removed C++14 support, only C++17 is supported. 
+ +### **hipSOLVER** (3.0.0) + +#### Added + +* Added compatibility-only functions + * csrlsvqr + * hipsolverSpCcsrlsvqr, hipsolverSpZcsrlsvqr + +#### Resolved Issues + +* Corrected the value of `lwork` returned by various `bufferSize` functions to be consistent with NVIDIA cuSOLVER. The following functions will + now return `lwork` such that the workspace size (in bytes) is `sizeof(T) * lwork`, rather than `lwork`. To restore the original behavior, set + environment variable `HIPSOLVER_BUFFERSIZE_RETURN_BYTES`. + * hipsolverXorgbr_bufferSize, hipsolverXorgqr_bufferSize, hipsolverXorgtr_bufferSize, hipsolverXormqr_bufferSize, hipsolverXormtr_bufferSize, + hipsolverXgesvd_bufferSize, hipsolverXgesvdj_bufferSize, hipsolverXgesvdBatched_bufferSize, hipsolverXgesvdaStridedBatched_bufferSize, + hipsolverXsyevd_bufferSize, hipsolverXsyevdx_bufferSize, hipsolverXsyevj_bufferSize, hipsolverXsyevjBatched_bufferSize, + hipsolverXsygvd_bufferSize, hipsolverXsygvdx_bufferSize, hipsolverXsygvj_bufferSize, hipsolverXsytrd_bufferSize, hipsolverXsytrf_bufferSize + +### **hipSPARSE** (4.0.1) + +#### Added + +* Add the `int8`, `int32`, and `float16` data types to `hipDataTypeToHCCDataType` so that sparse matrix descriptors can be used with them. +* Adds half float mixed precision to `hipsparseAxpby` where X and Y use float16 and result and the compute type use float +* Adds half float mixed precision to `hipsparseSpVV` where X and Y use float16 and result and the compute type use float +* Adds half float mixed precision to `hipsparseSpMM` where A and B use float16 and C and the compute type use float +* Adds half float mixed precision to `hipsparseSDDMM` where A and B use float16 and C and the compute type use float +* Adds half float uniform precision to `hipsparseScatter` and `hipsparseGather` routines +* Adds half float uniform precision to `hipsparseSDDMM` routine +* Add `int8` precision to `hipsparseCsr2cscEx2` routine. 
+* Add the `almalinux` OS name to correct the gfortran dependency + +#### Changed + +* Switch to defaulting to C++17 when building hipSPARSE from source. Previously hipSPARSE was using C++14 by default. + +#### Resolved Issues + +* Fixed a compilation [issue](https://github.com/ROCm/hipSPARSE/issues/555) related to using `std::filesystem` and C++14. +* Fixed the empty clients-common package by moving the `hipsparse_clientmatrices.cmake` and `hipsparse_mtx2csr` files to it. + +#### Known Issues + +* In `hipsparseSpSM_solve()`, the external buffer is passed as a parameter. This does not match the NVIDIA CUDA cuSPARSE API. This extra external buffer parameter will be removed in a future release. For now, this extra parameter can be ignored and nullptr passed because it is unused internally by `hipsparseSpSM_solve()`. + +### **hipSPARSELt** (0.2.4) + +#### Added + +* Support for the LLVM target gfx950. +* Support for the following data type combinations for the LLVM target gfx950: + * FP8(E4M3) inputs, F32 output, and F32 Matrix Core accumulation. + * BF8(E5M2) inputs, F32 output, and F32 Matrix Core accumulation. +* Support for ROC-TX if `HIPSPARSELT_ENABLE_MARKER=1` is set. +* Support for the cuSPARSELt v0.6.3 backend. + +#### Optimized + +* Improved the library loading time. +* Provided more kernels for FP16 datatype. + +#### Removed + +* Support for LLVM targets gfx940 and gfx941 has been removed. +* `hipsparseLtDatatype_t` has been removed. + +### **hipTensor** (2.0.0) + +#### Added + +* Added element-wise binary operation support. +* Added element-wise trinary operation support. +* Added support for new GPU target gfx950. +* Added dynamic unary and binary operator support for element-wise operations and permutation. +* Added a CMake check for `f8` datatype availability. +* Added `hiptensorDestroyOperationDescriptor` to free all resources related to the provided descriptor. 
+* Added `hiptensorOperationDescriptorSetAttribute` to set attribute of a `hiptensorOperationDescriptor_t` object. +* Added `hiptensorOperationDescriptorGetAttribute` to retrieve an attribute of the provided `hiptensorOperationDescriptor_t` object. +* Added `hiptensorCreatePlanPreference` to allocate the `hiptensorPlanPreference_t` and enabled users to limit the applicable kernels for a given plan or operation. +* Added `hiptensorDestroyPlanPreference` to free all resources related to the provided preference. +* Added `hiptensorPlanPreferenceSetAttribute` to set attribute of a `hiptensorPlanPreference_t` object. +* Added `hiptensorPlanGetAttribute` to retrieve information about an already-created plan. +* Added `hiptensorEstimateWorkspaceSize` to determine the required workspaceSize for the given operation. +* Added `hiptensorCreatePlan` to allocate a `hiptensorPlan_t` object, select an appropriate kernel for a given operation and prepare a plan that encodes the execution. +* Added `hiptensorDestroyPlan` to free all resources related to the provided plan. + +#### Changed + +* Removed architecture support for gfx940 and gfx941. +* Generalized opaque buffer now for any descriptor. +* Replaced `hipDataType` with `hiptensorDataType_t` for all supported types, for example, `HIP_R_32F` to `HIPTENSOR_R_32F`. +* Replaced `hiptensorComputeType_t` with `hiptensorComputeDescriptor_t` for all supported types. +* Replaced `hiptensorInitTensorDescriptor` with `hiptensorCreateTensorDescriptor`. +* Changed handle type and API usage from `*handle` to `handle`. +* Replaced `hiptensorContractionDescriptor_t` with `hipTensorOperationDescriptor_t`. +* Replaced `hiptensorInitContractionDescriptor` with `hiptensorCreateContraction`. +* Replaced `hiptensorContractionFind_t` with `hiptensorPlanPreference_t`. +* Replaced `hiptensorInitContractionFind` with `hiptensorCreatePlanPreference`. +* Replaced `hiptensorContractionGetWorkspaceSize` with `hiptensorEstimateWorkspaceSize`. 
+* Replaced `HIPTENSOR_WORKSPACE_RECOMMENDED` with `HIPTENSOR_WORKSPACE_DEFAULT`. +* Replaced `hiptensorContractionPlan_t` with `hiptensorPlan_t`. +* Replaced `hiptensorInitContractionPlan` with `hiptensorCreatePlan`. +* Replaced `hiptensorContraction` with `hiptensorContract`. +* Replaced `hiptensorPermutation` with `hiptensorPermute`. +* Replaced `hiptensorReduction` with `hiptensorReduce`. +* Replaced `hiptensorElementwiseBinary` with `hiptensorElementwiseBinaryExecute`. +* Replaced `hiptensorElementwiseTrinary` with `hiptensorElementwiseTrinaryExecute`. +* Removed function `hiptensorReductionGetWorkspaceSize`. + +### **MIOpen** (3.5.0) + +#### Added + +* [Conv] Added misa kernels for gfx950 +* [Conv] Enabled split_k support for CK backward data solvers (2D) +* Added grouped convolution + activation fusion +* Added grouped convolution + bias + activation fusion +* [BatchNorm] Enabled NHWC in OpenCL +* Composable Kernel (CK) can now be built inline as part of MIOpen +* Changed to using median value with outliers removed when deciding on the best solution to run +* [Conv] Enabled CK wrw solver on gfx950 for bf16 datatype +* [Conv] Updated igemm asm solver + +#### Optimized + +* [BatchNorm] Optimized NHWC OpenCL kernels and improved heuristics +* [RNN] Dynamic algorithm optimization +* [Conv] Eliminated redundant clearing of output buffers +* [RNN] Updated selection heuristics +* Updated tuning for MI300 + +#### Resolved Issues + +* Fixed a segmentation fault when user specifies workspace smaller than what is required +* Fixed a layout calculation logic error that returned incorrect results and enabled less restrictive layout selection +* Fixed memory access faults in misa kernels due to out-of-bounds memory usage +* Fixed performance drop on gfx950 due to transpose kernel use +* Fixed memory access fault caused by not allocating enough workspace +* Fixed a name typo that caused kernel mismatches and long startup times + +### **MIVisionX** (3.3.0) + +#### Changed + 
+* VX_RPP extension : Version 3.1.0 release +* Add support to enable/disable BatchPD code in VX_RPP extensions by checking the RPP_LEGACY_SUPPORT flag. +* Update the parameters and kernel API of Blur, Fog, Jitter, LensCorrection, Rain, Pixelate, Vignette and ResizeCrop wrt tensor kernels replacing the legacy BatchPD API calls in VX_RPP extensions. + +#### Known Issues + +* Installation on CentOS/RedHat/SLES requires the manual installation of the `FFMPEG` & `OpenCV` dev packages. + +#### Upcoming Changes + +* Optimized audio augmentations support for VX_RPP + +### **rccl** (2.26.6) + +#### Resolved Issues + +* Resolved an issue when using more than 64 channels when multiple collectives are used in the same `ncclGroup()` call. +* Fixed unit test failures in tests ending with `ManagedMem` and `ManagedMemGraph` suffixes. +* Suboptimal algorithmic switching point for AllReduce on MI300x. +* Fixed the known issue "When splitting a communicator using `ncclCommSplit` in some GPU configurations, MSCCL initialization can cause a segmentation fault." with a design change to use `comm` instead of `rank` for `mscclStatus`. The Global map for `comm` to `mscclStatus` is still not thread safe but should be explicitly handled by mutexes for read writes. This is tested for correctness, but there is a plan to use a thread-safe map data structure in upcoming changes. + +#### Added + +* Added support for extended fine-grained system memory pool. +* Added new GPU target `gfx950`. +* Added support for `unroll=1` in device-code generation to improve performance. +* Set a default of 112 channels for a single node with `8 * gfx950`. +* Enabled LL128 protocol on `gfx950`. +* Adding ability to choose unroll factor at runtime via `RCCL_UNROLL_FACTOR`. This can be set at runtime to 1, 2, or 4. This change currently increases compilation and linking time because it triples the number of kernels generated. +* Added MSCCL support for AllGather multinode gfx942/gfx950 (i.e., 16 and 32 GPUs). 
To enable, set the environment variable `RCCL_MSCCL_FORCE_ENABLE=1`. Max message size for MSCCL AllGather usage is `12292 * sizeof(datatype) * nGPUs`. +* Thread thresholds for LL/LL128 are selected in Tuning Models for the MI300X. This impacts the number of channels used for AG and RS. Channel tuning model is bypassed if `NCCL_THREAD_THRESHOLDS`, `NCCL_MIN_NCHANNELS`, or `NCCL_MAX_NCHANNELS` are set. +* Multi-node tuning for AllGather, AllReduce, and ReduceScatter that leverages LL/LL64/LL128 protocol to use nontemporal vector load/store for tunable message size ranges. +* LL/LL128 usage ranges for AR, AG, and RS are part of the tuning models, which enable architecture-specific tuning in conjunction with the existing Rome Models scheme in RCCL. +* Two new APIs are exposed as part of an initiative to separate RCCL code. These APIs are `rcclGetAlgoInfo` and `rcclFuncMaxSendRecvCount`. However, user-level invocation requires that RCCL be built with `RCCL_EXPOSE_STATIC` enabled. + +#### Changed + +* Compatibility with NCCL 2.23.4 +* Compatibility with NCCL 2.24.3 +* Compatibility with NCCL 2.25.1 +* Compatibility with NCCL 2.26.6 + +### **rocALUTION** (4.0.0) + +#### Added + +* Added support for gfx950. + +#### Changed + +* Switch to defaulting to C++17 when building rocALUTION from source. Previously rocALUTION was using C++14 by default.
+ +#### Optimized + +* Improved the user documentation + +#### Resolved Issues + +* Fix for GPU hashing algorithm when not compiling with -O2/O3 + +### **rocBLAS** (5.0.0) + +#### Added + +* gfx950 support +* `ROCBLAS_LAYER = 8` internal API logging for `gemm` debugging +* Support for AOCL 5.0 gcc build as a client reference library +* Allow `PkgConfig` for client reference library fallback detection + +#### Changed + +* `CMAKE_CXX_COMPILER` is now passed on during compilation for a Tensile build +* Change default atomics mode from `allowed` to `not allowed` + +#### Removed + +* Support code for non-production gfx targets +* `rocblas_hgemm_kernel_name`, `rocblas_sgemm_kernel_name`, and `rocblas_dgemm_kernel_name` API functions +* Use of `warpSize` as a constexpr +* Use of deprecated behavior of `hipPeekLastError` +* `rocblas_float8.h` and `rocblas_hip_f8_impl.h` files +* `rocblas_gemm_ex3`, `rocblas_gemm_batched_ex3`, `rocblas_gemm_strided_batched_ex3` API functions + +#### Optimized + +* Optimized `gemm` by using `gemv` kernels when applicable +* Optimized `gemv` for small `m` and `n` with a large batch count on gfx942 +* Improved the performance of Level 1 `dot` for all precisions and variants when `N > 100000000` on gfx942 +* Improved the performance of Level 1 `asum` and `nrm2` for all precisions and variants on gfx942 +* Improved the performance of Level 2 `sger` (single precision) on gfx942 +* Improved the performance of Level 3 `dgmm` for all precisions and variants on gfx942 + +#### Resolved Issues + +* Fixed environment variable path-based logging to append multiple handle output to the same file +* Support numerics when `trsm` is running with `rocblas_status_perf_degraded` +* Fixed the build dependency installation of `joblib` on some operating systems +* Return `rocblas_status_internal_error` when `rocblas_[set,get]_[matrix,vector]` is called with a host pointer in place of a device pointer +* Reduced the default verbosity level for internal GEMM
 backend information +* Updated from the deprecated rocm-cmake to ROCmCMakeBuildTools +* Corrected AlmaLinux gfortran package dependencies + +#### Upcoming Changes + +* Deprecated the use of negative indices to indicate the default solution is being used for `gemm_ex` with `rocblas_gemm_algo_solution_index` + +### **rocDecode** (1.0.0) + +#### Added + +* VP9 IVF container file parsing support in bitstream reader. +* CTest for VP9 decode on bitstream reader. +* HEVC/AVC/AV1/VP9 stream syntax error handling. +* HEVC stream bit depth change handling and DPB buffer size change handling through decoder reconfiguration. +* AVC stream DPB buffer size change handling through decoder reconfiguration. +* rocdecode now uses the CMake `CMAKE_PREFIX_PATH` directive. +* rocdecode - A new avcodec-based decoder built as a separate "rocdecode-host" library + +#### Optimized + +* Decode session start latency reduction. +* Bitstream type detection optimization in bitstream reader. + +#### Resolved Issues + +* Fixed a bug in picture files sample "videoDecodePicFiles" that can result in incorrect output frame count. +* Fixed a decoded frame output issue in video size change cases. +* Removed incorrect asserts of bitdepth_minus_8 in GetBitDepth() and num_chroma_planes in GetNumChromaPlanes() API calls in RocVideoDecoder utility class. + +#### Removed + +* GetStream() interface call from RocVideoDecoder utility class + +#### Changed + +* Changed asserts in query API calls in RocVideoDecoder utility class to error reports, to avoid hard stop during query in case error occurs and to let the caller decide actions. +* `libdrm_amdgpu` is now explicitly linked with rocdecode. + +### **rocFFT** (1.0.34) + +#### Added + +* Added gfx950 support. + +#### Removed + +* Removed rocfft-rider legacy compatibility from clients +* Removed support for the gfx940 and gfx941 targets from the client programs. +* Removed backward compatibility symlink for include directories.
+ +#### Optimized + +* Removed unnecessary HIP event/stream allocation and synchronization during MPI transforms. +* Implemented single-precision 1D kernels for lengths: + - 4704 + - 5488 + - 6144 + - 6561 + - 8192 +* Implemented single-kernel plans for some large 1D problem sizes, on devices with at least 160KiB of LDS. + +#### Resolved Issues + +* Fixed kernel faults on multi-device transforms that gather to a single device, when the input/output bricks are not + contiguous. + +### **ROCmValidationSuite** (1.2.0) + +#### Added + +- Support for new platforms: MI350X and MI355X. +- Introduced rotating buffer mechanism for GEMM operations. +- Support for read and write tests in Babel. +- Support for new platforms: RX9070 and RX9070GRE. + +#### Changed + +- Migrated SMI API usage from `rocm-smi` to `amd-smi`. +- Updated FP8 GEMM operations to use hipBLASLt instead of rocBLAS. + +### **rocPRIM** (4.0.0) + +#### Added + +* Added `rocprim::accumulator_t` to ensure parity with CCCL. +* Added test for `rocprim::accumulator_t` +* Added `rocprim::invoke_result_r` to ensure parity with CCCL. +* Added function `is_build_in` into `rocprim::traits::get`. +* Added virtual shared memory as a fallback option in `rocprim::device_merge` when it exceeds shared memory capacity, similar to `rocprim::device_select`, `rocprim::device_partition`, and `rocprim::device_merge_sort`, which already include this feature. +* Added initial value support to device level inclusive scans. +* Added new optimization to the backend for `device_transform` when the input and output are pointers. +* Added `LoadType` to `transform_config`, which is used for the `device_transform` when the input and output are pointers. +* Added a `rocprim::device_transform` API for n-ary transform operations, which takes `n` input iterators inside a `rocprim::tuple`. +* Added gfx950 support. +* Added `rocprim::key_value_pair::operator==`.
+* Added the `rocprim::unrolled_copy` thread function to copy multiple items inside a thread. +* Added the `rocprim::unrolled_thread_load` function to load multiple items inside a thread using `rocprim::thread_load`. +* Added `rocprim::int128_t` and `rocprim::uint128_t` to benchmarks for improved performance evaluation on 128-bit integers. +* Added `rocprim::int128_t` to the supported autotuning types to improve performance for 128-bit integers. +* Added the `rocprim::merge_inplace` function for merging in-place. +* Added initial value support for warp- and block-level inclusive scan. +* Added support for building tests with device-side random data generation, making them finish faster. This requires rocRAND, and is enabled with the `WITH_ROCRAND=ON` build flag. +* Added tests and documentation to `lookback_scan_state`. It is still in the `detail` namespace. + +#### Optimizations + +* Improved performance of `rocprim::device_select` and `rocprim::device_partition` when using multiple streams on the MI3XX architecture. + +#### Changed + +* Changed the parameters `long_radix_bits` and `LongRadixBits` from `segmented_radix_sort` to `radix_bits` and `RadixBits` respectively. +* Marked the initialisation constructor of `rocprim::reverse_iterator<Iter>` `explicit`, use `rocprim::make_reverse_iterator`. +* Merged `radix_key_codec` into type_traits system. +* Renamed `type_traits_interface.hpp` to `type_traits.hpp`, rename the original `type_traits.hpp` to `type_traits_functions.hpp`. +* The default scan accumulator types for device-level scan algorithms have changed. This is a breaking change. +The previous default accumulator types could lead to situations in which unexpected overflow occurred, such as +when the input or initial type was smaller than the output type.
+ * This is a complete list of affected functions and how their default accumulator types are changing: + * `rocprim::inclusive_scan` + * Previous default: `class AccType = typename std::iterator_traits<InputIterator>::value_type>` + * Current default: `class AccType = rocprim::accumulator_t<BinaryFunction, typename std::iterator_traits<InputIterator>::value_type>` + * `rocprim::deterministic_inclusive_scan` + * Previous default: `class AccType = typename std::iterator_traits<InputIterator>::value_type>` + * Current default: `class AccType = rocprim::accumulator_t<BinaryFunction, typename std::iterator_traits<InputIterator>::value_type>` + * `rocprim::exclusive_scan` + * Previous default: `class AccType = detail::input_type_t<InitValueType>>` + * Current default: `class AccType = rocprim::accumulator_t<BinaryFunction, rocprim::detail::input_type_t<InitValueType>>` + * `rocprim::deterministic_exclusive_scan` + * Previous default: `class AccType = detail::input_type_t<InitValueType>>` + * Current default: `class AccType = rocprim::accumulator_t<BinaryFunction, rocprim::detail::input_type_t<InitValueType>>` +* Undeprecated internal `detail::raw_storage`. +* New versions of `rocprim::thread_load` and `rocprim::thread_store` replace the deprecated `rocprim::thread_load` and `rocprim::thread_store` functions. The new versions avoid inline assembly where possible, and don't hinder the optimizer as much as a result. +* Renamed `rocprim::load_cs` to `rocprim::load_nontemporal` and `rocprim::store_cs` to `rocprim::store_nontemporal` to express the intent of these load and store methods better. +* All kernels now have hidden symbol visibility. All symbols now have inline namespaces that include the library version, for example, `rocprim::ROCPRIM_300400_NS::symbol` instead of `rocprim::symbol`, letting the user link multiple libraries built with different versions of rocPRIM.
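The effect of the accumulator-type change on scans can be sketched on the host in plain C++. The alias `accumulator_sketch_t` below is an assumption made for illustration (the decayed result type of the binary function); it is not rocPRIM's definition of `rocprim::accumulator_t`, and `inclusive_scan_as` is a hypothetical sequential helper, not a rocPRIM API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <type_traits>
#include <vector>

// Illustrative stand-in: the type produced by applying the binary function.
// Assumption for this sketch only; rocPRIM's own alias may be defined differently.
template <class BinaryFunction, class T>
using accumulator_sketch_t = std::decay_t<std::invoke_result_t<BinaryFunction, T, T>>;

// Sequential inclusive scan that keeps its running value in AccType.
template <class AccType, class T, class BinaryFunction>
std::vector<AccType> inclusive_scan_as(const std::vector<T>& in, BinaryFunction op) {
    std::vector<AccType> out;
    out.reserve(in.size());
    if (in.empty()) return out;
    AccType acc = static_cast<AccType>(in.front());
    out.push_back(acc);
    for (std::size_t i = 1; i < in.size(); ++i) {
        // The cast is where a too-narrow accumulator silently wraps around.
        acc = static_cast<AccType>(op(acc, in[i]));
        out.push_back(acc);
    }
    return out;
}

// Models the previous default: accumulate in the input value type.
inline std::vector<std::uint8_t> scan_with_input_type(const std::vector<std::uint8_t>& in) {
    return inclusive_scan_as<std::uint8_t>(in, std::plus<>{});
}

// Models the current default: accumulate in the type the operator produces.
inline auto scan_with_accumulator_type(const std::vector<std::uint8_t>& in) {
    using Acc = accumulator_sketch_t<std::plus<>, std::uint8_t>;
    static_assert(std::is_same_v<Acc, int>, "uint8_t + uint8_t promotes to int");
    return inclusive_scan_as<Acc>(in, std::plus<>{});
}
```

With the input `{200, 100}`, the first helper wraps the second element modulo 256 while the second keeps the full sum. Code that relied on the narrower default can pass the accumulator type explicitly.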
+
+#### Upcoming Changes
+
+* `rocprim::invoke_result_binary_op` and `rocprim::invoke_result_binary_op_t` are deprecated. Use `rocprim::accumulator_t` now.
+
+#### Removed
+
+* Removed `rocprim::detail::float_bit_mask` and related tests. Use `rocprim::traits::float_bit_mask` instead.
+* Removed `rocprim::traits::is_fundamental`. Use `rocprim::traits::get<T>::is_fundamental()` directly.
+* Removed the deprecated parameters `short_radix_bits` and `ShortRadixBits` from the `segmented_radix_sort` config. They were unused, so this is only an API change.
+* Removed the deprecated `operator<<` from the iterators.
+* Removed the deprecated `TwiddleIn` and `TwiddleOut`. Use `radix_key_codec` instead.
+* Removed the deprecated flags API of `block_adjacent_difference`. Use `subtract_left()` or `block_discontinuity::flag_heads()` instead.
+* Removed the deprecated `to_exclusive` functions in the warp scans.
+* Removed `rocprim::load_cs` from the `cache_load_modifier` enum. Use `rocprim::load_nontemporal` instead.
+* Removed `rocprim::store_cs` from the `cache_store_modifier` enum. Use `rocprim::store_nontemporal` instead.
+* Removed the deprecated header file `rocprim/detail/match_result_type.hpp`. Include `rocprim/type_traits.hpp` instead.
+    * This header included `rocprim::detail::invoke_result`. Use `rocprim::invoke_result` instead.
+    * This header included `rocprim::detail::invoke_result_binary_op`. Use `rocprim::invoke_result_binary_op` instead.
+    * This header included `rocprim::detail::match_result_type`. Use `rocprim::invoke_result_binary_op_t` instead.
+* Removed the deprecated `rocprim::detail::radix_key_codec` function. Use `rocprim::radix_key_codec` instead.
+* Removed `rocprim/detail/radix_sort.hpp`. Its functionality can now be found in `rocprim/thread/radix_key_codec.hpp`.
+* Removed C++14 support. Only C++17 is supported.
+* Due to the removal of `__AMDGCN_WAVEFRONT_SIZE` in the compiler, the following deprecated warp size-related symbols have been removed:
+    * `rocprim::device_warp_size()`
+        * For compile-time constants, this is replaced with `rocprim::arch::wavefront::min_size()` and `rocprim::arch::wavefront::max_size()`. Use this when allocating global or shared memory.
+        * For run-time constants, this is replaced with `rocprim::arch::wavefront::size()`.
+    * `rocprim::warp_size()`
+        * Use `rocprim::host_warp_size()`, `rocprim::arch::wavefront::min_size()` or `rocprim::arch::wavefront::max_size()` instead.
+    * `ROCPRIM_WAVEFRONT_SIZE`
+        * Use `rocprim::arch::wavefront::min_size()` or `rocprim::arch::wavefront::max_size()` instead.
+    * `__AMDGCN_WAVEFRONT_SIZE`
+        * This was a fallback define for the compiler's removed symbol, having the same name.
+* This release removes support for custom builds on gfx940 and gfx941.
+
+#### Resolved Issues
+
+* Fixed an issue where `device_batch_memcpy` reported benchmarking throughput being 2x lower than it was in reality.
+* Fixed an issue where `device_segmented_reduce` reported autotuning throughput being 5x lower than it was in reality.
+* Fixed device radix sort not returning the correct required temporary storage when a double buffer contains `nullptr`.
+* Fixed constness of equality operators (`==` and `!=`) in `rocprim::key_value_pair`.
+* Fixed an issue with the comparison operators in `arg_index_iterator` and `texture_cache_iterator`, where `<` and `>` comparators were swapped.
+* Fixed an issue where `rocprim::thread_reduce` did not work correctly with a prefix value.
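The split between compile-time bounds and a run-time wavefront size, described in the warp-size removals above, suggests a pattern: size storage for the worst case at compile time, then iterate with the size known at run time. The following host-only sketch illustrates that pattern under stated assumptions. The `sketch::wavefront` helpers are hypothetical stand-ins, the values 32 and 64 are assumed for illustration, and nothing here calls rocPRIM or HIP.

```cpp
#include <array>
#include <cassert>

namespace sketch
{
namespace wavefront
{
// Hypothetical stand-ins for compile-time bounds; 32 and 64 are assumed values.
constexpr unsigned int min_size() { return 32; }
constexpr unsigned int max_size() { return 64; }
} // namespace wavefront
} // namespace sketch

constexpr unsigned int block_size = 256;

// Worst case for per-wavefront storage: the smallest wavefront size yields the
// largest number of wavefronts in a block.
constexpr unsigned int max_wavefronts_per_block = block_size / sketch::wavefront::min_size();

// Sums a block of values through per-wavefront partial sums. The storage is sized
// from the compile-time bound; the loops use the size supplied at run time.
inline int sum_block_sketch(const std::array<int, block_size>& values,
                            unsigned int                       runtime_wavefront_size)
{
    assert(runtime_wavefront_size >= sketch::wavefront::min_size());
    assert(runtime_wavefront_size <= sketch::wavefront::max_size());

    std::array<int, max_wavefronts_per_block> partials{};
    const unsigned int wavefronts = block_size / runtime_wavefront_size;

    for(unsigned int i = 0; i < block_size; ++i)
    {
        partials[i / runtime_wavefront_size] += values[i];
    }

    int total = 0;
    for(unsigned int w = 0; w < wavefronts; ++w)
    {
        total += partials[w];
    }
    return total;
}
```

The storage holds eight partial sums regardless of the run-time size, so the same compiled code stays within bounds whether the wavefront size turns out to be 32 or 64.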
+
+#### Known Issues
+
+* When using `rocprim::deterministic_inclusive_scan_by_key` and `rocprim::deterministic_exclusive_scan_by_key`, the intermediate values can change order on Navi3x.
+    * However, if a commutative scan operator is used, the final scan value (output array) will still always be consistent between runs.
+
+### **rocRAND** (4.0.0)
+
+#### Added
+
+* gfx950 support
+* Additional unit tests for `test_log_normal_distribution.cpp`
+* Additional unit tests for `test_normal_distribution.cpp`
+* Additional unit tests for `test_rocrand_mtgp32_prng.cpp`
+* Additional unit tests for `test_rocrand_scrambled_sobol32_qrng.cpp`
+* Additional unit tests for `test_rocrand_scrambled_sobol64_qrng.cpp`
+* Additional unit tests for `test_rocrand_sobol32_qrng.cpp`
+* Additional unit tests for `test_rocrand_sobol64_qrng.cpp`
+* Additional unit tests for `test_rocrand_threefry2x32_20_prng.cpp`
+* Additional unit tests for `test_rocrand_threefry2x64_20_prng.cpp`
+* Additional unit tests for `test_rocrand_threefry4x32_20_prng.cpp`
+* Additional unit tests for `test_rocrand_threefry4x64_20_prng.cpp`
+* Additional unit tests for `test_uniform_distribution.cpp`
+* New unit tests for `include/rocrand/rocrand_discrete.h` in `test_rocrand_discrete.cpp`
+* New unit tests for `include/rocrand/rocrand_mrg31k3p.h` in `test_rocrand_mrg31k3p_prng.cpp`
+* New unit tests for `include/rocrand/rocrand_mrg32k3a.h` in `test_rocrand_mrg32k3a_prng.cpp`
+* New unit tests for `include/rocrand/rocrand_poisson.h` in `test_rocrand_poisson.cpp`
+
+#### Changed
+
+* Changed the return type for `rocrand_generate_poisson` for the `SOBOL64` and `SCRAMBLED_SOBOL64` engines.
+* Changed the unnecessarily large 64-bit data type for constants used for skipping in `MRG32K3A` to the 32-bit data type.
+* Updated several `gfx942` auto tuning parameters.
+* Modified error handling and expanded the error information for the case of double-deallocation of the (scrambled) sobol32 and sobol64 constants and direction vectors.
+
+#### Removed
+
+* Removed inline assembly and the `ENABLE_INLINE_ASM` CMake option. Inline assembly was used to optimize multiplications in the Mrg32k3a and Philox 4x32-10 generators. It is no longer needed because the current HIP compiler is able to produce code with the same or better performance.
+* Removed instances of the deprecated clang definition `__AMDGCN_WAVEFRONT_SIZE`.
+* Removed C++14 support. Beginning with this release, only C++17 is supported.
+* Directly accessing the (scrambled) sobol32 and sobol64 constants and direction vectors is no longer supported. For:
+    * `h_scrambled_sobol32_constants`, use `rocrand_get_scramble_constants32` instead.
+    * `h_scrambled_sobol64_constants`, use `rocrand_get_scramble_constants64` instead.
+    * `rocrand_h_sobol32_direction_vectors`, use `rocrand_get_direction_vectors32` instead.
+    * `rocrand_h_sobol64_direction_vectors`, use `rocrand_get_direction_vectors64` instead.
+    * `rocrand_h_scrambled_sobol32_direction_vectors`, use `rocrand_get_direction_vectors32` instead.
+    * `rocrand_h_scrambled_sobol64_direction_vectors`, use `rocrand_get_direction_vectors64` instead.
+
+#### Resolved Issues
+
+* Fixed an issue where `mt19937.hpp` would cause kernel errors during auto tuning.
+
+#### Upcoming Changes
+
+* Deprecated the rocRAND Fortran API in favor of hipfort.
+
+### **rocSHMEM** (3.0.0)
+
+#### Added
+
+* Added the Reverse Offload conduit
+* Added new APIs:
+    * `rocshmem_ctx_barrier`
+    * `rocshmem_ctx_barrier_wave`
+    * `rocshmem_ctx_barrier_wg`
+    * `rocshmem_barrier_all`
+    * `rocshmem_barrier_all_wave`
+    * `rocshmem_barrier_all_wg`
+    * `rocshmem_ctx_sync`
+    * `rocshmem_ctx_sync_wave`
+    * `rocshmem_ctx_sync_wg`
+    * `rocshmem_sync_all`
+    * `rocshmem_sync_all_wave`
+    * `rocshmem_sync_all_wg`
+    * `rocshmem_init_attr`
+    * `rocshmem_get_uniqueid`
+    * `rocshmem_set_attr_uniqueid_args`
+* Added dlmalloc based allocator
+* Added XNACK support
+* Added support for initialization with MPI communicators other than `MPI_COMM_WORLD`
+
+#### Changed
+
+* Changed collective APIs to use `_wg` suffix rather than `_wg_` infix
+
+#### Resolved Issues
+
+* Resolved a segfault in `rocshmem_wg_ctx_create`. It now provides `nullptr` if the context cannot be created
+
+### **rocSOLVER** (3.30.0)
+
+#### Added
+
+* Hybrid computation support for existing routines:
+    - STEQR
+
+#### Optimized
+
+* Improved the performance of BDSQR and downstream functions such as GESVD
+* Improved the performance of STEQR and downstream functions such as SYEV/HEEV
+* Improved the performance of LARFT and downstream functions such as GEQR2 and GEQRF
+
+#### Resolved Issues
+
+* Fixed corner cases that can produce NaNs in SYEVD, for valid input matrices
+
+### **rocSPARSE** (4.0.2)
+
+#### Added
+
+* Added the `SpGEAM` generic routine for computing sparse matrix addition in CSR format
+* Added the `v2_SpMV` generic routine for computing sparse matrix vector multiplication. As opposed to the deprecated `rocsparse_spmv` routine, this routine does not use a fallback algorithm if a non-implemented configuration is encountered and will return an error in such a case.
For the deprecated routine `rocsparse_spmv`, the user can enable warning messages in situations where a fallback algorithm is used by either calling the `rocsparse_enable_debug` routine upfront or exporting the `ROCSPARSE_DEBUG` variable (with the shell command `export ROCSPARSE_DEBUG=1`).
+* Added half float mixed precision to `rocsparse_axpby` where X and Y use float16 and the result and the compute type use float
+* Added half float mixed precision to `rocsparse_spvv` where X and Y use float16 and the result and the compute type use float
+* Added half float mixed precision to `rocsparse_spmv` where A and X use float16 and Y and the compute type use float
+* Added half float mixed precision to `rocsparse_spmm` where A and B use float16 and C and the compute type use float
+* Added half float mixed precision to `rocsparse_sddmm` where A and B use float16 and C and the compute type use float
+* Added half float uniform precision to the `rocsparse_scatter` and `rocsparse_gather` routines
+* Added half float uniform precision to the `rocsparse_sddmm` routine
+* Added the `rocsparse_spmv_alg_csr_rowsplit` algorithm.
+* Added support for gfx950
+* Added ROC-TX instrumentation support in rocSPARSE (not available on Windows or in the static library version on Linux).
+* Added the `almalinux` OS name to correct the gfortran dependency
+
+#### Changed
+
+* Switched to defaulting to C++17 when building rocSPARSE from source. Previously, rocSPARSE used C++14 by default.
+
+#### Optimized
+
+* Reduced the number of template instantiations in the library to further reduce the shared library binary size and improve compile times
+* Allowed SpGEMM routines to use more shared memory when available. This can speed up performance for matrices with a large number of intermediate products.
+* Use of the `rocsparse_spmv_alg_csr_adaptive` or `rocsparse_spmv_alg_csr_default` algorithms in `rocsparse_spmv` to perform transposed sparse matrix multiplication (`C=alpha*A^T*x+beta*y`) resulted in unnecessary analysis on A and needless slowdown during the analysis phase. This has been fixed by skipping the analysis when performing the transposed sparse matrix multiplication.
+* Improved the user documentation
+
+#### Resolved Issues
+
+* Fixed an issue in the public headers where `extern "C"` was not wrapped by `#ifdef __cplusplus`, which caused failures when building C programs with rocSPARSE.
+* Fixed a memory access fault in the `rocsparse_Xbsrilu0` routines.
+* Fixed failures that could occur in `rocsparse_Xbsrsm_solve` or `rocsparse_spsm` with BSR format when using host pointer mode.
+* Fixed ASAN compilation failures
+* Fixed failure that occurred when using const descriptor `rocsparse_create_const_csr_descr` with the generic routine `rocsparse_sparse_to_sparse`. Issue was not observed when using non-const descriptor `rocsparse_create_csr_descr` with `rocsparse_sparse_to_sparse`.
+* Fixed a memory leak in the rocsparse handle
+
+#### Removed
+
+* The deprecated `rocsparse_spmv_ex` routine
+* The deprecated `rocsparse_sbsrmv_ex`, `rocsparse_dbsrmv_ex`, `rocsparse_cbsrmv_ex`, and `rocsparse_zbsrmv_ex` routines
+* The deprecated `rocsparse_sbsrmv_ex_analysis`, `rocsparse_dbsrmv_ex_analysis`, `rocsparse_cbsrmv_ex_analysis`, and `rocsparse_zbsrmv_ex_analysis` routines
+
+#### Upcoming Changes
+
+* Deprecated the `rocsparse_spmv` routine. Users should use the `rocsparse_v2_spmv` routine going forward.
+* Deprecated `rocsparse_spmv_alg_csr_stream` algorithm. Users should use the `rocsparse_spmv_alg_csr_rowsplit` algorithm going forward.
+* Deprecated the `rocsparse_itilu0_alg_sync_split_fusion` algorithm. Users should use one of `rocsparse_itilu0_alg_async_inplace`, `rocsparse_itilu0_alg_async_split`, or `rocsparse_itilu0_alg_sync_split` going forward.
+
+### **rocThrust** (4.0.0)
+
+#### Changed
+
+* Updated the required version of Google Benchmark from 1.8.0 to 1.9.0.
+* Dropped C++14 support for rocThrust.
+* Renamed `cpp14_required.h` to `cpp_version_check.h`
+* Refactored `test_header.hpp` into separate modules `test_param_fixtures.hpp`, `test_real_assertions.hpp`, `test_imag_assertions.hpp`, and `test_utils.hpp`.
+    * This is done to prevent unit tests from having access to modules that they're not testing. This will improve the accuracy of code coverage reports.
+
+#### Added
+
+* Additional unit tests for:
+    * binary_search
+    * complex
+    * c99math
+    * catrig
+    * ccosh
+    * cexp
+    * clog
+    * csin
+    * csqrt
+    * ctan
+* Added `test_param_fixtures.hpp` to store all the parameters for typed test suites.
+* Added `test_real_assertions.hpp` to handle unit test assertions for real numbers.
+* Added `test_imag_assertions.hpp` to handle unit test assertions for imaginary numbers.
+* `clang++` is now used to compile Google benchmarks on Windows.
+* Added gfx950 support.
+* Merged changes from upstream CCCL/thrust 2.6.0
+
+#### Removed
+
+* `device_malloc_allocator.h` has been removed. This header file was unused and should not impact users.
+* Removed C++14 support. Only C++17 is supported.
+* `test_header.hpp` has been removed. The `HIP_CHECK` function, as well as the `test` and `inter_run_bwr` namespaces, have been moved to `test_utils.hpp`.
+* `test_assertions.hpp` has been split into `test_real_assertions.hpp` and `test_imag_assertions.hpp`.
+
+#### Upcoming Changes
+
+* `thrust::device_malloc_allocator` is deprecated as of this version. It will be removed in an upcoming version.
+
+#### Resolved Issues
+
+* Fixed an issue where internal calls to unqualified `distance()` were ambiguous because another implementation was also visible through ADL.
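The kind of ambiguity described in the `distance()` fix above can be illustrated with standard C++ alone. This is a minimal sketch using a hypothetical `sketch` namespace, not rocThrust code: a library defines its own `distance()`, and an unqualified call inside the library also finds `std::distance` through argument-dependent lookup when the iterator type is associated with namespace `std`.

```cpp
#include <cstddef>
#include <iterator>
#include <vector>

namespace sketch
{
// A library-local distance(), hypothetical and for illustration only.
template <class Iterator>
std::ptrdiff_t distance(Iterator first, Iterator last)
{
    std::ptrdiff_t count = 0;
    for(; first != last; ++first)
    {
        ++count;
    }
    return count;
}

template <class Iterator>
std::ptrdiff_t length(Iterator first, Iterator last)
{
    // An unqualified call such as
    //     return distance(first, last);
    // is ambiguous for iterators associated with namespace std (for example
    // std::vector<int>::iterator in common standard library implementations):
    // ordinary lookup finds sketch::distance and ADL finds std::distance.
    // Qualifying the call disables ADL and selects the intended overload.
    return sketch::distance(first, last);
}
} // namespace sketch
```

Raw pointers have no associated namespace, so the unqualified form would compile for them, which is one way such a problem can go unnoticed until a different iterator type is used.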
+
+#### Known Issues
+
+* The order of the values being compared by `thrust::exclusive_scan_by_key` and `thrust::inclusive_scan_by_key` can change between runs when integers are being compared. This can cause incorrect output when a non-commutative operator such as division is being used.
+
+### **rocWMMA** (2.0.0)
+
+#### Added
+
+* Added internal register layout transforms to support interleaved MMA layouts
+* Added support for the gfx950 target
+* Added mixed input `bf8` / `fp8` types for MMA support
+* Added fragment scheduler API objects to embed thread block cooperation properties in fragments
+
+#### Changed
+
+* Augmented load / store / MMA internals with static loop unrolling
+* rocWMMA mma_sync API now supports `wave tile` fragment sizes
+* rocWMMA cooperative fragments are now expressed with fragment scheduler template arguments
+* rocWMMA cooperative fragments now use the same base API as non-cooperative fragments
+* rocWMMA cooperative fragments register usage footprint has been reduced
+* rocWMMA fragments now support partial tile sizes with padding
+
+#### Optimized
+
+* Added internal flow control barriers to improve assembly code generation and overall performance
+* Enabled interleaved layouts by default in MMA to improve overall performance
+
+#### Removed
+
+* Removed support for the gfx940 and gfx941 targets
+* Removed the rocWMMA cooperative API
+* Removed wave count template parameters from transforms APIs
+
+#### Resolved Issues
+
+* Fixed a validation issue for small precision compute types `< B32` on gfx9
+* Fixed CMake validation of compiler support for `bf8` / `fp8` types
+* Fixed linkage of `rocwmma::synchronize_workgroup` to inline
+
+### **rpp** (2.0.0)
+
+#### Added
+
+* Bitwise NOT, Bitwise AND, Bitwise OR augmentations on HOST (CPU) and HIP backends. (#520)
+* Tensor Concat augmentation on HOST (CPU) and HIP backends. (#530)
+* JPEG Compression Distortion augmentation on HIP backend.
(#538)
+* `log1p`, defined as `log (1 + x)`, tensor augmentation support on HOST (CPU) and HIP backends.
+* JPEG Compression Distortion augmentation on HOST (CPU) backend. (#531)
+
+#### Changed
+
+* All handle creation and destruction APIs have been consolidated to `rppCreate()`, for handle initialization, and `rppDestroy()`, for handle destruction (#513)
+* RPP function category "logical_operations" more appropriately renamed to "bitwise_operations". (#520)
+* TurboJPEG package installation enabled for RPP Test Suite with `sudo apt-get install libturbojpeg0-dev`. Instructions updated in utilities/test_suite/README.md. (#518)
+* Renamed the `swap_channels` augmentation API to `channel_permute`. It now accepts one new argument, `permutationTensor` (a pointer to an unsigned int tensor), that provides the permutation order used to swap the RGB channels of each input image in the batch. (#547)
+    * Old API - `RppStatus rppt_swap_channels_host(RppPtr_t srcPtr, RpptDescPtr srcDescPtr, RppPtr_t dstPtr, RpptDescPtr dstDescPtr, rppHandle_t rppHandle);`
+    * New API - `RppStatus rppt_channel_permute_host(RppPtr_t srcPtr, RpptDescPtr srcDescPtr, RppPtr_t dstPtr, RpptDescPtr dstDescPtr, Rpp32u *permutationTensor , rppHandle_t rppHandle);`
+
+#### Removed
+
+* Older versions of the RPP handle creation API, including `rppCreateWithBatchSize()`, `rppCreateWithStream()`, and `rppCreateWithStreamAndBatchSize()`, are now removed and replaced with `rppCreate()`.
+* Older versions of the RPP handle destruction API, including `rppDestroyGPU()` and `rppDestroyHost()`, are now removed and replaced with `rppDestroy()`.
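To make the idea of a channel permutation concrete, the following host-only sketch applies a three-entry permutation to one interleaved RGB image. It does not call RPP. The helper name, the interleaved layout, and the direction convention (output channel `c` reads input channel `permutation[c]`) are all assumptions made for illustration, so the RPP documentation remains the reference for how `permutationTensor` is actually interpreted and laid out across a batch.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper, not part of RPP: permutes the channels of one interleaved
// 3-channel image. Assumed convention: output channel c takes input channel
// permutation[c].
inline std::vector<std::uint8_t>
    permute_channels_sketch(const std::vector<std::uint8_t>&    interleaved_rgb,
                            const std::array<std::uint32_t, 3>& permutation)
{
    assert(interleaved_rgb.size() % 3 == 0);
    for(std::uint32_t source_channel : permutation)
    {
        assert(source_channel < 3);
    }

    std::vector<std::uint8_t> output(interleaved_rgb.size());
    for(std::size_t pixel = 0; pixel < interleaved_rgb.size(); pixel += 3)
    {
        for(std::size_t c = 0; c < 3; ++c)
        {
            output[pixel + c] = interleaved_rgb[pixel + permutation[c]];
        }
    }
    return output;
}
```

Under this convention, the permutation `{2, 1, 0}` turns RGB data into BGR order, which matches what a fixed channel swap would have produced, while `{0, 1, 2}` leaves the image unchanged.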
+
+#### Resolved Issues
+
+* Test package - debian packages will install required dependencies
+
+### **Tensile** (4.44.0)
+
+#### Added
+
+- Added support for gfx950
+- Added code object compression via bundling
+- Added support for non-default HIP SDK installations on Windows
+- Added master solution library documentation
+- Added compiler version dependent assembler and architecture capabilities
+- Added documentation from GitHub Wiki to ROCm docs
+
+#### Changed
+
+- Loosened check for CLI compiler choices
+- Introduced 4-tuple targets for bundler invocations
+- Introduced PATHEXT extensions on Windows when searching for toolchain components
+- Enabled passing fully qualified paths to toolchain components
+- Enabled environment variable overrides when searching for a ROCm stack
+- Improved default toolchain configuration
+- Ignored f824 flake errors
+
+#### Removed
+
+- Removed support for the gfx940 and gfx941 targets
+- Removed unused tuning files
+- Removed disabled tests
+
+#### Resolved Issues
+
+- Fixed configure time path not being invoked at build
+- Fixed find_package for msgpack to work with versions 5 and 6
+- Fixed rhel9 testing
+- Fixed gfx908 builds
+- Fixed "argument list too long" error
+- Fixed version typo in 6.3 changelog
+- Fixed improper use of aliases as nested namespace specifiers
 
 ## ROCm known issues
 
 ROCm known issues are noted on {fab}`github` [GitHub](https://github.com/ROCm/ROCm/labels/Verified%20Issue). For known issues related to individual components, review the [Detailed component changes](#detailed-component-changes).
 
+## ROCm resolved issues
+
+The following are previously known issues resolved in this release. For resolved issues related to
+individual components, review the [Detailed component changes](#detailed-component-changes).
+
 ## ROCm upcoming changes
 
 The following changes to the ROCm software stack are anticipated for future releases.
diff --git a/docs/conf.py b/docs/conf.py index c4753c4b7..27cd7f167 100644 --- a/docs/conf.py +++ b/docs/conf.py @@ -82,15 +82,15 @@ project = "ROCm Documentation" project_path = os.path.abspath(".").replace("\\", "/") author = "Advanced Micro Devices, Inc." copyright = "Copyright (c) 2025 Advanced Micro Devices, Inc. All rights reserved." -version = "6.4.3" -release = "6.4.3" +version = "7.0.0" +release = "7.0.0" setting_all_article_info = True all_article_info_os = ["linux", "windows"] all_article_info_author = "" # pages with specific settings article_pages = [ - {"file": "about/release-notes", "os": ["linux"], "date": "2025-08-07"}, + {"file": "about/release-notes", "os": ["linux"], "date": "2025-08-26"}, {"file": "release/changelog", "os": ["linux"],}, {"file": "compatibility/compatibility-matrix", "os": ["linux"]}, {"file": "compatibility/ml-compatibility/pytorch-compatibility", "os": ["linux"]}, diff --git a/manifest_700.xml b/manifest_700.xml new file mode 100644 index 000000000..4f7d505d8 --- /dev/null +++ b/manifest_700.xml @@ -0,0 +1,80 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + \ No newline at end of file From ae734e784641132a12ac7e19e791b69bb22c1d0c Mon Sep 17 00:00:00 2001 From: Istvan Kiss Date: Mon, 18 Aug 2025 15:37:19 +0200 Subject: [PATCH 02/58] Add MI350X and MI355X to atomics operation page (#497) Add MI350X and MI355X to atomics operation page --- .../cas-atomics_nopcie_instinct.csv | 650 +++++++++--------- .../cas-atomics_pcie_instinct.csv | 650 +++++++++--------- .../hw-atomics_nopcie_instinct.csv | 650 +++++++++--------- .../hw-atomics_pcie_instinct.csv | 650 +++++++++--------- docs/reference/gpu-atomics-operation.rst | 30 +- 5 files changed, 1320 insertions(+), 1310 deletions(-) diff --git a/docs/data/reference/gpu-atomics-operation/cas-atomics_nopcie_instinct.csv 
b/docs/data/reference/gpu-atomics-operation/cas-atomics_nopcie_instinct.csv index cb909bbb0..b375886eb 100644 --- a/docs/data/reference/gpu-atomics-operation/cas-atomics_nopcie_instinct.csv +++ b/docs/data/reference/gpu-atomics-operation/cas-atomics_nopcie_instinct.csv @@ -1,325 +1,325 @@ -Atomic,MI100,MI200 PCIe,MI200 A+A,MI300X,MI300A -32 bit atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicSub,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicInc,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicDec,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit float atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit float atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -16bx2 half2 atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -16bx2 bfloat162 atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atoimcExch,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicAnd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicOr,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicXor,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicExch,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicAnd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicOr,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicXor,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicSub,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicInc,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicDec,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ 
CAS -64 bit atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit float atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit float atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -16bx2 half2 atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -16bx2 bfloat162 atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atoimcExch,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicAnd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicOr,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicXor,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicExch,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicAnd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicOr,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicXor,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicSub,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicInc,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicDec,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit float atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit float atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -16bx2 half2 atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -16bx2 bfloat162 atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atoimcExch,✅ CAS,✅ CAS,✅ CAS,✅ 
CAS,✅ CAS -32 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicAnd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicOr,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicXor,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicExch,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicAnd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicOr,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicXor,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicSub,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicInc,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicDec,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit float atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit float atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -16bx2 half2 atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -16bx2 bfloat162 atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atoimcExch,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicAnd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicOr,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicXor,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicExch,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicAnd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicOr,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicXor,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicSub,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit 
atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicInc,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicDec,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit float atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit float atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -16bx2 half2 atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -16bx2 bfloat162 atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atoimcExch,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicAnd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicOr,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicXor,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicExch,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicAnd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicOr,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicXor,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicSub,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicInc,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicDec,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit float atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit float atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -16bx2 half2 
atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -16bx2 bfloat162 atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atoimcExch,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicAnd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicOr,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicXor,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicExch,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicAnd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicOr,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicXor,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicAdd,❌ NOP,❌ NOP,❌ NOP,✅ CAS,✅ CAS -32 bit atomicSub,❌ NOP,❌ NOP,❌ NOP,✅ CAS,✅ CAS -32 bit atomicMin,❌ NOP,❌ NOP,❌ NOP,✅ CAS,✅ CAS -32 bit atomicMax,❌ NOP,❌ NOP,❌ NOP,✅ CAS,✅ CAS -32 bit atomicInc,❌ NOP,❌ NOP,❌ NOP,✅ CAS,✅ CAS -32 bit atomicDec,❌ NOP,❌ NOP,❌ NOP,✅ CAS,✅ CAS -64 bit atomicAdd,❌ NOP,❌ NOP,❌ NOP,✅ CAS,✅ CAS -64 bit atomicMin,❌ NOP,❌ NOP,❌ NOP,✅ CAS,✅ CAS -64 bit atomicMax,❌ NOP,❌ NOP,❌ NOP,✅ CAS,✅ CAS -32 bit float atomicAdd,❌ NOP,❌ NOP,❌ NOP,✅ CAS,✅ CAS -32 bit float atomicMin,❌ NOP,❌ NOP,❌ NOP,✅ CAS,✅ CAS -32 bit float atomicMax,❌ NOP,❌ NOP,❌ NOP,✅ CAS,✅ CAS -64 bit float atomicAdd,❌ NOP,❌ NOP,❌ NOP,✅ CAS,✅ CAS -64 bit float atomicMin,❌ NOP,❌ NOP,❌ NOP,✅ CAS,✅ CAS -64 bit float atomicMax,❌ NOP,❌ NOP,❌ NOP,✅ CAS,✅ CAS -16bx2 half2 atomicAdd,❌ NOP,❌ NOP,❌ NOP,✅ CAS,✅ CAS -16bx2 bfloat162 atomicAdd,❌ NOP,❌ NOP,❌ NOP,✅ CAS,✅ CAS -32 bit atoimcExch,❌ NOP,❌ NOP,❌ NOP,✅ CAS,✅ CAS -32 bit atomicCAS,❌ NOP,❌ NOP,❌ NOP,✅ Native,✅ Native -32 bit atomicAnd,❌ NOP,❌ NOP,❌ NOP,✅ CAS,✅ CAS -32 bit atomicOr,❌ NOP,❌ NOP,❌ NOP,✅ CAS,✅ CAS -32 bit atomicXor,❌ NOP,❌ NOP,❌ NOP,✅ CAS,✅ CAS -64 bit atomicExch,❌ NOP,❌ NOP,❌ NOP,✅ CAS,✅ CAS -64 bit atomicCAS,❌ NOP,❌ NOP,❌ NOP,✅ Native,✅ Native -64 bit atomicAnd,❌ NOP,❌ NOP,❌ NOP,✅ CAS,✅ CAS -64 bit atomicOr,❌ NOP,❌ NOP,❌ NOP,✅ CAS,✅ CAS -64 bit atomicXor,❌ NOP,❌ NOP,❌ NOP,✅ CAS,✅ CAS -32 bit atomicAdd,❌ NOP,❌ NOP,❌ 
NOP,⚠️ Scope Downgrade - CAS,✅ CAS -32 bit atomicSub,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade - CAS,✅ CAS -32 bit atomicMin,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade - CAS,✅ CAS -32 bit atomicMax,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade - CAS,✅ CAS -32 bit atomicInc,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade - CAS,✅ CAS -32 bit atomicDec,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade - CAS,✅ CAS -64 bit atomicAdd,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade - CAS,✅ CAS -64 bit atomicMin,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade - CAS,✅ CAS -64 bit atomicMax,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade - CAS,✅ CAS -32 bit float atomicAdd,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade - CAS,✅ CAS -32 bit float atomicMin,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade - CAS,✅ CAS -32 bit float atomicMax,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade - CAS,✅ CAS -64 bit float atomicAdd,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade - CAS,✅ CAS -64 bit float atomicMin,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade - CAS,✅ CAS -64 bit float atomicMax,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade - CAS,✅ CAS -16bx2 half2 atomicAdd,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade - CAS,✅ CAS -16bx2 bfloat162 atomicAdd,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade - CAS,✅ CAS -32 bit atoimcExch,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade - CAS,✅ CAS -32 bit atomicCAS,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade - CAS,✅ Native -32 bit atomicAnd,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade - CAS,✅ CAS -32 bit atomicOr,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade - CAS,✅ CAS -32 bit atomicXor,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade - CAS,✅ CAS -64 bit atomicExch,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade - CAS,✅ CAS -64 bit atomicCAS,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade - CAS,✅ Native -64 bit atomicAnd,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade - CAS,✅ CAS -64 bit atomicOr,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade - CAS,✅ CAS -64 bit atomicXor,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade - CAS,✅ CAS -32 bit atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicSub,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicMax,✅ 
CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicInc,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicDec,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit float atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit float atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -16bx2 half2 atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -16bx2 bfloat162 atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atoimcExch,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicAnd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicOr,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicXor,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicExch,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicAnd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicOr,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicXor,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicSub,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicInc,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicDec,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit float atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit float atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -16bx2 half2 atomicAdd,✅ CAS,✅ 
CAS,✅ CAS,✅ CAS,✅ CAS -16bx2 bfloat162 atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atoimcExch,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicAnd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicOr,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicXor,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicExch,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicAnd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicOr,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicXor,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicAdd,❌ NOP,❌ NOP,❌ NOP,✅ CAS,✅ CAS -32 bit atomicSub,❌ NOP,❌ NOP,❌ NOP,✅ CAS,✅ CAS -32 bit atomicMin,❌ NOP,❌ NOP,❌ NOP,✅ CAS,✅ CAS -32 bit atomicMax,❌ NOP,❌ NOP,❌ NOP,✅ CAS,✅ CAS -32 bit atomicInc,❌ NOP,❌ NOP,❌ NOP,✅ CAS,✅ CAS -32 bit atomicDec,❌ NOP,❌ NOP,❌ NOP,✅ CAS,✅ CAS -64 bit atomicAdd,❌ NOP,❌ NOP,❌ NOP,✅ CAS,✅ CAS -64 bit atomicMin,❌ NOP,❌ NOP,❌ NOP,✅ CAS,✅ CAS -64 bit atomicMax,❌ NOP,❌ NOP,❌ NOP,✅ CAS,✅ CAS -32 bit float atomicAdd,❌ NOP,❌ NOP,❌ NOP,✅ CAS,✅ CAS -32 bit float atomicMin,❌ NOP,❌ NOP,❌ NOP,✅ CAS,✅ CAS -32 bit float atomicMax,❌ NOP,❌ NOP,❌ NOP,✅ CAS,✅ CAS -64 bit float atomicAdd,❌ NOP,❌ NOP,❌ NOP,✅ CAS,✅ CAS -64 bit float atomicMin,❌ NOP,❌ NOP,❌ NOP,✅ CAS,✅ CAS -64 bit float atomicMax,❌ NOP,❌ NOP,❌ NOP,✅ CAS,✅ CAS -16bx2 half2 atomicAdd,❌ NOP,❌ NOP,❌ NOP,✅ CAS,✅ CAS -16bx2 bfloat162 atomicAdd,❌ NOP,❌ NOP,❌ NOP,✅ CAS,✅ CAS -32 bit atoimcExch,❌ NOP,❌ NOP,❌ NOP,✅ CAS,✅ CAS -32 bit atomicCAS,❌ NOP,❌ NOP,❌ NOP,✅ Native,✅ CAS -32 bit atomicAnd,❌ NOP,❌ NOP,❌ NOP,✅ CAS,✅ CAS -32 bit atomicOr,❌ NOP,❌ NOP,❌ NOP,✅ CAS,✅ CAS -32 bit atomicXor,❌ NOP,❌ NOP,❌ NOP,✅ CAS,✅ CAS -64 bit atomicExch,❌ NOP,❌ NOP,❌ NOP,✅ CAS,✅ CAS -64 bit atomicCAS,❌ NOP,❌ NOP,❌ NOP,✅ Native,✅ CAS -64 bit atomicAnd,❌ NOP,❌ NOP,❌ NOP,✅ CAS,✅ CAS -64 bit atomicOr,❌ NOP,❌ NOP,❌ NOP,✅ CAS,✅ CAS -64 bit atomicXor,❌ NOP,❌ NOP,❌ NOP,✅ CAS,✅ CAS -32 bit atomicAdd,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade - 
CAS,✅ CAS -32 bit atomicSub,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade - CAS,✅ CAS -32 bit atomicMin,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade - CAS,✅ CAS -32 bit atomicMax,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade - CAS,✅ CAS -32 bit atomicInc,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade - CAS,✅ CAS -32 bit atomicDec,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade - CAS,✅ CAS -64 bit atomicAdd,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade - CAS,✅ CAS -64 bit atomicMin,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade - CAS,✅ CAS -64 bit atomicMax,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade - CAS,✅ CAS -32 bit float atomicAdd,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade - CAS,✅ CAS -32 bit float atomicMin,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade - CAS,✅ CAS -32 bit float atomicMax,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade - CAS,✅ CAS -64 bit float atomicAdd,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade - CAS,✅ CAS -64 bit float atomicMin,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade - CAS,✅ CAS -64 bit float atomicMax,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade - CAS,✅ CAS -16bx2 half2 atomicAdd,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade - CAS,✅ CAS -16bx2 bfloat162 atomicAdd,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade - CAS,✅ CAS -32 bit atoimcExch,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade - CAS,✅ CAS -32 bit atomicCAS,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade - CAS,✅ Native -32 bit atomicAnd,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade - CAS,✅ CAS -32 bit atomicOr,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade - CAS,✅ CAS -32 bit atomicXor,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade - CAS,✅ CAS -64 bit atomicExch,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade - CAS,✅ CAS -64 bit atomicCAS,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade,✅ Native -64 bit atomicAnd,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade - CAS,✅ CAS -64 bit atomicOr,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade - CAS,✅ CAS -64 bit atomicXor,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade - CAS,✅ CAS +Atomic,MI100,MI200 PCIe,MI200 A+A,MI300X series,MI300A,MI350X series +32 bit atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicSub,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicMin,✅ 
CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicInc,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicDec,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit float atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit float atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +16bx2 half2 atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +16bx2 bfloat162 atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atoimcExch,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicAnd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicOr,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicXor,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicExch,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicAnd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicOr,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicXor,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicSub,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicInc,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicDec,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit float atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit float atomicMin,✅ CAS,✅ CAS,✅ 
CAS,✅ CAS,✅ CAS,✅ CAS +32 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit float atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +16bx2 half2 atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +16bx2 bfloat162 atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atoimcExch,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicAnd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicOr,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicXor,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicExch,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicAnd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicOr,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicXor,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicSub,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicInc,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicDec,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit float atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit float atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +16bx2 half2 atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +16bx2 bfloat162 atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atoimcExch,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit 
atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicAnd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicOr,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicXor,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicExch,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicAnd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicOr,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicXor,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicSub,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicInc,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicDec,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit float atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit float atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +16bx2 half2 atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +16bx2 bfloat162 atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atoimcExch,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicAnd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicOr,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicXor,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicExch,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicAnd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicOr,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ 
CAS +64 bit atomicXor,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicSub,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicInc,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicDec,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit float atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit float atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +16bx2 half2 atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +16bx2 bfloat162 atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atoimcExch,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicAnd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicOr,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicXor,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicExch,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicAnd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicOr,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicXor,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicSub,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicInc,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicDec,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicMin,✅ 
CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit float atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit float atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +16bx2 half2 atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +16bx2 bfloat162 atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atoimcExch,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicAnd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicOr,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicXor,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicExch,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicAnd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicOr,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicXor,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicAdd,❌ NOP,❌ NOP,❌ NOP,✅ CAS,✅ CAS,✅ CAS +32 bit atomicSub,❌ NOP,❌ NOP,❌ NOP,✅ CAS,✅ CAS,✅ CAS +32 bit atomicMin,❌ NOP,❌ NOP,❌ NOP,✅ CAS,✅ CAS,✅ CAS +32 bit atomicMax,❌ NOP,❌ NOP,❌ NOP,✅ CAS,✅ CAS,✅ CAS +32 bit atomicInc,❌ NOP,❌ NOP,❌ NOP,✅ CAS,✅ CAS,✅ CAS +32 bit atomicDec,❌ NOP,❌ NOP,❌ NOP,✅ CAS,✅ CAS,✅ CAS +64 bit atomicAdd,❌ NOP,❌ NOP,❌ NOP,✅ CAS,✅ CAS,✅ CAS +64 bit atomicMin,❌ NOP,❌ NOP,❌ NOP,✅ CAS,✅ CAS,✅ CAS +64 bit atomicMax,❌ NOP,❌ NOP,❌ NOP,✅ CAS,✅ CAS,✅ CAS +32 bit float atomicAdd,❌ NOP,❌ NOP,❌ NOP,✅ CAS,✅ CAS,✅ CAS +32 bit float atomicMin,❌ NOP,❌ NOP,❌ NOP,✅ CAS,✅ CAS,✅ CAS +32 bit float atomicMax,❌ NOP,❌ NOP,❌ NOP,✅ CAS,✅ CAS,✅ CAS +64 bit float atomicAdd,❌ NOP,❌ NOP,❌ NOP,✅ CAS,✅ CAS,✅ CAS +64 bit float atomicMin,❌ NOP,❌ NOP,❌ NOP,✅ CAS,✅ CAS,✅ CAS +64 bit float atomicMax,❌ NOP,❌ NOP,❌ NOP,✅ CAS,✅ CAS,✅ CAS +16bx2 
half2 atomicAdd,❌ NOP,❌ NOP,❌ NOP,✅ CAS,✅ CAS,✅ CAS +16bx2 bfloat162 atomicAdd,❌ NOP,❌ NOP,❌ NOP,✅ CAS,✅ CAS,✅ CAS +32 bit atoimcExch,❌ NOP,❌ NOP,❌ NOP,✅ CAS,✅ CAS,✅ CAS +32 bit atomicCAS,❌ NOP,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native +32 bit atomicAnd,❌ NOP,❌ NOP,❌ NOP,✅ CAS,✅ CAS,✅ CAS +32 bit atomicOr,❌ NOP,❌ NOP,❌ NOP,✅ CAS,✅ CAS,✅ CAS +32 bit atomicXor,❌ NOP,❌ NOP,❌ NOP,✅ CAS,✅ CAS,✅ CAS +64 bit atomicExch,❌ NOP,❌ NOP,❌ NOP,✅ CAS,✅ CAS,✅ CAS +64 bit atomicCAS,❌ NOP,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native +64 bit atomicAnd,❌ NOP,❌ NOP,❌ NOP,✅ CAS,✅ CAS,✅ CAS +64 bit atomicOr,❌ NOP,❌ NOP,❌ NOP,✅ CAS,✅ CAS,✅ CAS +64 bit atomicXor,❌ NOP,❌ NOP,❌ NOP,✅ CAS,✅ CAS,✅ CAS +32 bit atomicAdd,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade - CAS,✅ CAS,⚠️ Scope Downgrade - CAS +32 bit atomicSub,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade - CAS,✅ CAS,⚠️ Scope Downgrade - CAS +32 bit atomicMin,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade - CAS,✅ CAS,⚠️ Scope Downgrade - CAS +32 bit atomicMax,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade - CAS,✅ CAS,⚠️ Scope Downgrade - CAS +32 bit atomicInc,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade - CAS,✅ CAS,⚠️ Scope Downgrade - CAS +32 bit atomicDec,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade - CAS,✅ CAS,⚠️ Scope Downgrade - CAS +64 bit atomicAdd,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade - CAS,✅ CAS,⚠️ Scope Downgrade - CAS +64 bit atomicMin,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade - CAS,✅ CAS,⚠️ Scope Downgrade - CAS +64 bit atomicMax,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade - CAS,✅ CAS,⚠️ Scope Downgrade - CAS +32 bit float atomicAdd,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade - CAS,✅ CAS,⚠️ Scope Downgrade - CAS +32 bit float atomicMin,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade - CAS,✅ CAS,⚠️ Scope Downgrade - CAS +32 bit float atomicMax,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade - CAS,✅ CAS,⚠️ Scope Downgrade - CAS +64 bit float atomicAdd,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade - CAS,✅ CAS,⚠️ Scope Downgrade - CAS +64 bit float atomicMin,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade - CAS,✅ CAS,⚠️ Scope Downgrade - CAS +64 bit 
float atomicMax,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade - CAS,✅ CAS,⚠️ Scope Downgrade - CAS +16bx2 half2 atomicAdd,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade - CAS,✅ CAS,⚠️ Scope Downgrade - CAS +16bx2 bfloat162 atomicAdd,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade - CAS,✅ CAS,⚠️ Scope Downgrade - CAS +32 bit atoimcExch,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade - CAS,✅ CAS,⚠️ Scope Downgrade - CAS +32 bit atomicCAS,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade - CAS,✅ Native,⚠️ Scope Downgrade - CAS +32 bit atomicAnd,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade - CAS,✅ CAS,⚠️ Scope Downgrade - CAS +32 bit atomicOr,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade - CAS,✅ CAS,⚠️ Scope Downgrade - CAS +32 bit atomicXor,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade - CAS,✅ CAS,⚠️ Scope Downgrade - CAS +64 bit atomicExch,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade - CAS,✅ CAS,⚠️ Scope Downgrade - CAS +64 bit atomicCAS,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade - CAS,✅ Native,⚠️ Scope Downgrade - CAS +64 bit atomicAnd,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade - CAS,✅ CAS,⚠️ Scope Downgrade - CAS +64 bit atomicOr,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade - CAS,✅ CAS,⚠️ Scope Downgrade - CAS +64 bit atomicXor,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade - CAS,✅ CAS,⚠️ Scope Downgrade - CAS +32 bit atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicSub,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicInc,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicDec,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit float atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit float atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS 
+64 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +16bx2 half2 atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +16bx2 bfloat162 atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atoimcExch,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicAnd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicOr,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicXor,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicExch,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicAnd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicOr,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicXor,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicSub,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicInc,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicDec,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit float atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit float atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +16bx2 half2 atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +16bx2 bfloat162 atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atoimcExch,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicAnd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicOr,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicXor,✅ CAS,✅ CAS,✅ CAS,✅ 
CAS,✅ CAS,✅ CAS +64 bit atomicExch,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicAnd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicOr,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicXor,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicAdd,❌ NOP,❌ NOP,❌ NOP,✅ CAS,✅ CAS,✅ CAS +32 bit atomicSub,❌ NOP,❌ NOP,❌ NOP,✅ CAS,✅ CAS,✅ CAS +32 bit atomicMin,❌ NOP,❌ NOP,❌ NOP,✅ CAS,✅ CAS,✅ CAS +32 bit atomicMax,❌ NOP,❌ NOP,❌ NOP,✅ CAS,✅ CAS,✅ CAS +32 bit atomicInc,❌ NOP,❌ NOP,❌ NOP,✅ CAS,✅ CAS,✅ CAS +32 bit atomicDec,❌ NOP,❌ NOP,❌ NOP,✅ CAS,✅ CAS,✅ CAS +64 bit atomicAdd,❌ NOP,❌ NOP,❌ NOP,✅ CAS,✅ CAS,✅ CAS +64 bit atomicMin,❌ NOP,❌ NOP,❌ NOP,✅ CAS,✅ CAS,✅ CAS +64 bit atomicMax,❌ NOP,❌ NOP,❌ NOP,✅ CAS,✅ CAS,✅ CAS +32 bit float atomicAdd,❌ NOP,❌ NOP,❌ NOP,✅ CAS,✅ CAS,✅ CAS +32 bit float atomicMin,❌ NOP,❌ NOP,❌ NOP,✅ CAS,✅ CAS,✅ CAS +32 bit float atomicMax,❌ NOP,❌ NOP,❌ NOP,✅ CAS,✅ CAS,✅ CAS +64 bit float atomicAdd,❌ NOP,❌ NOP,❌ NOP,✅ CAS,✅ CAS,✅ CAS +64 bit float atomicMin,❌ NOP,❌ NOP,❌ NOP,✅ CAS,✅ CAS,✅ CAS +64 bit float atomicMax,❌ NOP,❌ NOP,❌ NOP,✅ CAS,✅ CAS,✅ CAS +16bx2 half2 atomicAdd,❌ NOP,❌ NOP,❌ NOP,✅ CAS,✅ CAS,✅ CAS +16bx2 bfloat162 atomicAdd,❌ NOP,❌ NOP,❌ NOP,✅ CAS,✅ CAS,✅ CAS +32 bit atoimcExch,❌ NOP,❌ NOP,❌ NOP,✅ CAS,✅ CAS,✅ CAS +32 bit atomicCAS,❌ NOP,❌ NOP,❌ NOP,✅ Native,✅ CAS,✅ Native +32 bit atomicAnd,❌ NOP,❌ NOP,❌ NOP,✅ CAS,✅ CAS,✅ CAS +32 bit atomicOr,❌ NOP,❌ NOP,❌ NOP,✅ CAS,✅ CAS,✅ CAS +32 bit atomicXor,❌ NOP,❌ NOP,❌ NOP,✅ CAS,✅ CAS,✅ CAS +64 bit atomicExch,❌ NOP,❌ NOP,❌ NOP,✅ CAS,✅ CAS,✅ CAS +64 bit atomicCAS,❌ NOP,❌ NOP,❌ NOP,✅ Native,✅ CAS,✅ Native +64 bit atomicAnd,❌ NOP,❌ NOP,❌ NOP,✅ CAS,✅ CAS,✅ CAS +64 bit atomicOr,❌ NOP,❌ NOP,❌ NOP,✅ CAS,✅ CAS,✅ CAS +64 bit atomicXor,❌ NOP,❌ NOP,❌ NOP,✅ CAS,✅ CAS,✅ CAS +32 bit atomicAdd,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade - CAS,✅ CAS,⚠️ Scope Downgrade - CAS +32 bit atomicSub,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade - CAS,✅ CAS,⚠️ Scope 
Downgrade - CAS +32 bit atomicMin,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade - CAS,✅ CAS,⚠️ Scope Downgrade - CAS +32 bit atomicMax,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade - CAS,✅ CAS,⚠️ Scope Downgrade - CAS +32 bit atomicInc,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade - CAS,✅ CAS,⚠️ Scope Downgrade - CAS +32 bit atomicDec,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade - CAS,✅ CAS,⚠️ Scope Downgrade - CAS +64 bit atomicAdd,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade - CAS,✅ CAS,⚠️ Scope Downgrade - CAS +64 bit atomicMin,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade - CAS,✅ CAS,⚠️ Scope Downgrade - CAS +64 bit atomicMax,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade - CAS,✅ CAS,⚠️ Scope Downgrade - CAS +32 bit float atomicAdd,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade - CAS,✅ CAS,⚠️ Scope Downgrade - CAS +32 bit float atomicMin,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade - CAS,✅ CAS,⚠️ Scope Downgrade - CAS +32 bit float atomicMax,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade - CAS,✅ CAS,⚠️ Scope Downgrade - CAS +64 bit float atomicAdd,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade - CAS,✅ CAS,⚠️ Scope Downgrade - CAS +64 bit float atomicMin,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade - CAS,✅ CAS,⚠️ Scope Downgrade - CAS +64 bit float atomicMax,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade - CAS,✅ CAS,⚠️ Scope Downgrade - CAS +16bx2 half2 atomicAdd,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade - CAS,✅ CAS,⚠️ Scope Downgrade - CAS +16bx2 bfloat162 atomicAdd,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade - CAS,✅ CAS,⚠️ Scope Downgrade - CAS +32 bit atoimcExch,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade - CAS,✅ CAS,⚠️ Scope Downgrade - CAS +32 bit atomicCAS,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade - CAS,✅ Native,⚠️ Scope Downgrade - CAS +32 bit atomicAnd,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade - CAS,✅ CAS,⚠️ Scope Downgrade - CAS +32 bit atomicOr,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade - CAS,✅ CAS,⚠️ Scope Downgrade - CAS +32 bit atomicXor,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade - CAS,✅ CAS,⚠️ Scope Downgrade - CAS +64 bit atomicExch,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade - CAS,✅ CAS,⚠️ Scope Downgrade - 
CAS +64 bit atomicCAS,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade,✅ Native,⚠️ Scope Downgrade +64 bit atomicAnd,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade - CAS,✅ CAS,⚠️ Scope Downgrade - CAS +64 bit atomicOr,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade - CAS,✅ CAS,⚠️ Scope Downgrade - CAS +64 bit atomicXor,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade - CAS,✅ CAS,⚠️ Scope Downgrade - CAS diff --git a/docs/data/reference/gpu-atomics-operation/cas-atomics_pcie_instinct.csv b/docs/data/reference/gpu-atomics-operation/cas-atomics_pcie_instinct.csv index 74bbfed10..2e0e40fc1 100644 --- a/docs/data/reference/gpu-atomics-operation/cas-atomics_pcie_instinct.csv +++ b/docs/data/reference/gpu-atomics-operation/cas-atomics_pcie_instinct.csv @@ -1,325 +1,325 @@ -Atomic,MI100,MI200 PCIe,MI200 A+A,MI300X,MI300A -32 bit atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicSub,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicInc,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicDec,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit float atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit float atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -16bx2 half2 atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -16bx2 bfloat162 atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atoimcExch,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicAnd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicOr,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicXor,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicExch,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ 
Native,✅ Native -64 bit atomicAnd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicOr,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicXor,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicSub,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicInc,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicDec,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit float atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit float atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -16bx2 half2 atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -16bx2 bfloat162 atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atoimcExch,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicAnd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicOr,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicXor,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicExch,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicAnd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicOr,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicXor,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicSub,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicInc,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicDec,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit float atomicAdd,✅ CAS,✅ 
CAS,✅ CAS,✅ CAS,✅ CAS -32 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit float atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -16bx2 half2 atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -16bx2 bfloat162 atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atoimcExch,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicAnd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicOr,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicXor,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicExch,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicAnd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicOr,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicXor,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicSub,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicInc,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicDec,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit float atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit float atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -16bx2 half2 atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -16bx2 bfloat162 atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atoimcExch,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicAnd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicOr,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit 
atomicXor,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicExch,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicAnd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicOr,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicXor,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicSub,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicInc,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicDec,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit float atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit float atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -16bx2 half2 atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -16bx2 bfloat162 atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atoimcExch,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicAnd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicOr,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicXor,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicExch,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicAnd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicOr,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicXor,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicSub,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicInc,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicDec,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ 
CAS,✅ CAS -64 bit atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit float atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit float atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -16bx2 half2 atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -16bx2 bfloat162 atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atoimcExch,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicAnd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicOr,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicXor,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicExch,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicAnd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicOr,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicXor,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicSub,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicInc,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicDec,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit float atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit float atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -16bx2 half2 atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -16bx2 bfloat162 atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atoimcExch,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicCAS,✅ Native,✅ 
Native,✅ Native,✅ Native,✅ Native -32 bit atomicAnd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicOr,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicXor,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicExch,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicAnd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicOr,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicXor,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicSub,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicInc,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicDec,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit float atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit float atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -16bx2 half2 atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -16bx2 bfloat162 atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atoimcExch,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicAnd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicOr,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicXor,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicExch,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicAnd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicOr,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicXor,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicSub,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS 
-32 bit atomicInc,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicDec,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit float atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit float atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -16bx2 half2 atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -16bx2 bfloat162 atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atoimcExch,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicAnd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicOr,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicXor,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicExch,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicAnd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicOr,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicXor,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicSub,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicInc,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicDec,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit float atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit float atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -16bx2 half2 atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -16bx2 
bfloat162 atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atoimcExch,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicAnd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicOr,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicXor,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicExch,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicAnd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicOr,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicXor,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicSub,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicInc,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicDec,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit float atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit float atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -16bx2 half2 atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -16bx2 bfloat162 atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atoimcExch,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicAnd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicOr,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicXor,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicExch,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicAnd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicOr,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicXor,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit 
atomicSub,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicInc,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicDec,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit float atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit float atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -16bx2 half2 atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -16bx2 bfloat162 atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atoimcExch,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicAnd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicOr,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit atomicXor,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicExch,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicAnd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicOr,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit atomicXor,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +Atomic,MI100,MI200 PCIe,MI200 A+A,MI300X series,MI300A,MI350X series +32 bit atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicSub,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicInc,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicDec,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit float atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 
bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit float atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +16bx2 half2 atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +16bx2 bfloat162 atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atoimcExch,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicAnd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicOr,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicXor,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicExch,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicAnd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicOr,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicXor,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicSub,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicInc,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicDec,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit float atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit float atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +16bx2 half2 atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +16bx2 bfloat162 atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atoimcExch,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicCAS,✅ Native,✅ Native,✅ 
Native,✅ Native,✅ Native,✅ Native +32 bit atomicAnd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicOr,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicXor,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicExch,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicAnd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicOr,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicXor,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicSub,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicInc,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicDec,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit float atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit float atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +16bx2 half2 atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +16bx2 bfloat162 atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atoimcExch,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicAnd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicOr,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicXor,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicExch,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicAnd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicOr,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicXor,✅ CAS,✅ 
CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicSub,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicInc,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicDec,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit float atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit float atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +16bx2 half2 atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +16bx2 bfloat162 atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atoimcExch,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicAnd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicOr,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicXor,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicExch,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicAnd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicOr,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicXor,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicSub,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicInc,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicDec,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ 
CAS +64 bit atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit float atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit float atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +16bx2 half2 atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +16bx2 bfloat162 atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atoimcExch,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicAnd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicOr,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicXor,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicExch,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicAnd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicOr,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicXor,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicSub,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicInc,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicDec,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit float atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit float atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +16bx2 half2 atomicAdd,✅ CAS,✅ CAS,✅ 
CAS,✅ CAS,✅ CAS,✅ CAS +16bx2 bfloat162 atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atoimcExch,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicAnd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicOr,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicXor,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicExch,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicAnd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicOr,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicXor,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicSub,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicInc,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicDec,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit float atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit float atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +16bx2 half2 atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +16bx2 bfloat162 atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atoimcExch,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicAnd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicOr,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicXor,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicExch,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicCAS,✅ 
Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicAnd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicOr,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicXor,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicSub,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicInc,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicDec,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit float atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit float atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +16bx2 half2 atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +16bx2 bfloat162 atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atoimcExch,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicAnd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicOr,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicXor,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicExch,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicAnd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicOr,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicXor,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicSub,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicInc,✅ CAS,✅ 
CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicDec,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit float atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit float atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +16bx2 half2 atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +16bx2 bfloat162 atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atoimcExch,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicAnd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicOr,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicXor,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicExch,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicAnd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicOr,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicXor,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicSub,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicInc,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicDec,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit float atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit float atomicAdd,✅ CAS,✅ 
CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +16bx2 half2 atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +16bx2 bfloat162 atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atoimcExch,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicAnd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicOr,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicXor,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicExch,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicAnd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicOr,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicXor,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicSub,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicInc,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicDec,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit float atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit float atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +16bx2 half2 atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +16bx2 bfloat162 atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atoimcExch,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicAnd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 
bit atomicOr,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicXor,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicExch,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicAnd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicOr,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicXor,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicSub,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicInc,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicDec,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit float atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit float atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +16bx2 half2 atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +16bx2 bfloat162 atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atoimcExch,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicAnd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicOr,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit atomicXor,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicExch,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicAnd,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicOr,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit atomicXor,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS diff --git 
a/docs/data/reference/gpu-atomics-operation/hw-atomics_nopcie_instinct.csv b/docs/data/reference/gpu-atomics-operation/hw-atomics_nopcie_instinct.csv
index 18f0bf55c..483684089 100644
--- a/docs/data/reference/gpu-atomics-operation/hw-atomics_nopcie_instinct.csv
+++ b/docs/data/reference/gpu-atomics-operation/hw-atomics_nopcie_instinct.csv
@@ -1,325 +1,325 @@
-Atomic,MI100,MI200 PCIe,MI200 A+A,MI300X,MI300A
-32 bit atomicAdd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native
-32 bit atomicSub,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native
-32 bit atomicMin,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native
-32 bit atomicMax,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native
-32 bit atomicInc,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native
-32 bit atomicDec,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native
-64 bit atomicAdd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native
-64 bit atomicMin,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native
-64 bit atomicMax,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native
-32 bit float atomicAdd,✅ NoReturn,✅ Native,✅ Native,✅ Native,✅ Native
-32 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS
-32 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS
-64 bit float atomicAdd,✅ CAS,✅ Native,✅ Native,✅ Native,✅ Native
-64 bit float atomicMin,✅ CAS,✅ Native,✅ Native,✅ Native,✅ Native
-64 bit float atomicMax,✅ CAS,✅ Native,✅ Native,✅ Native,✅ Native
-16bx2 half2 atomicAdd,✅ NoReturn,✅ Native,✅ Native,✅ Native,✅ Native
-16bx2 bfloat162 atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ Native,✅ Native
-32 bit atoimcExch,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native
-32 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native
-32 bit atomicAnd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native
-32 bit atomicOr,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native
-32 bit atomicXor,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native
-64 bit atomicExch,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native
-64 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native
-64 bit atomicAnd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native
-64
bit atomicOr,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicXor,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicAdd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicSub,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicMin,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicMax,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicInc,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicDec,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicAdd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicMin,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicMax,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit float atomicAdd,✅ NoReturn,✅ Native,✅ Native,✅ Native,✅ Native -32 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit float atomicAdd,✅ CAS,✅ Native,✅ Native,✅ Native,✅ Native -64 bit float atomicMin,✅ CAS,✅ Native,✅ Native,✅ Native,✅ Native -64 bit float atomicMax,✅ CAS,✅ Native,✅ Native,✅ Native,✅ Native -16bx2 half2 atomicAdd,✅ NoReturn,✅ Native,✅ Native,✅ Native,✅ Native -16bx2 bfloat162 atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ Native,✅ Native -32 bit atoimcExch,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicAnd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicOr,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicXor,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicExch,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicAnd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicOr,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicXor,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicAdd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicSub,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicMin,✅ Native,✅ Native,✅ 
Native,✅ Native,✅ Native -32 bit atomicMax,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicInc,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicDec,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicAdd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicMin,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicMax,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit float atomicAdd,❌ NOP,❌ NOP,❌ NOP,✅ Native,✅ Native -32 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit float atomicAdd,✅ CAS,❌ NOP,❌ NOP,✅ Native,✅ Native -64 bit float atomicMin,✅ CAS,❌ NOP,❌ NOP,✅ Native,✅ Native -64 bit float atomicMax,✅ CAS,❌ NOP,❌ NOP,✅ Native,✅ Native -16bx2 half2 atomicAdd,❌ NOP,❌ NOP,❌ NOP,✅ Native,✅ Native -16bx2 bfloat162 atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ Native,✅ Native -32 bit atoimcExch,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicAnd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicOr,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicXor,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicExch,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicAnd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicOr,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicXor,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicAdd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicSub,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicMin,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicMax,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicInc,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicDec,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicAdd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicMin,✅ Native,✅ Native,✅ Native,✅ Native,✅ 
Native -64 bit atomicMax,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit float atomicAdd,❌ NOP,❌ NOP,❌ NOP,✅ Native,✅ Native -32 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit float atomicAdd,✅ CAS,❌ NOP,❌ NOP,✅ Native,✅ Native -64 bit float atomicMin,✅ CAS,❌ NOP,❌ NOP,✅ Native,✅ Native -64 bit float atomicMax,✅ CAS,❌ NOP,❌ NOP,✅ Native,✅ Native -16bx2 half2 atomicAdd,❌ NOP,❌ NOP,❌ NOP,✅ Native,✅ Native -16bx2 bfloat162 atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ Native,✅ Native -32 bit atoimcExch,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicAnd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicOr,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicXor,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicExch,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicAnd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicOr,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicXor,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicAdd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicSub,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicMin,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicMax,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicInc,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicDec,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicAdd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicMin,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicMax,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit float atomicAdd,✅ NoReturn,✅ Native,✅ Native,✅ Native,✅ Native -32 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit float atomicAdd,✅ CAS,✅ Native,✅ Native,✅ Native,✅ Native -64 bit float 
atomicMin,✅ CAS,✅ Native,✅ Native,✅ Native,✅ Native -64 bit float atomicMax,✅ CAS,✅ Native,✅ Native,✅ Native,✅ Native -16bx2 half2 atomicAdd,✅ NoReturn,✅ Native,✅ Native,✅ Native,✅ Native -16bx2 bfloat162 atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ Native,✅ Native -32 bit atoimcExch,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicAnd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicOr,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicXor,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicExch,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicAnd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicOr,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicXor,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicAdd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicSub,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicMin,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicMax,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicInc,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicDec,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicAdd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicMin,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicMax,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit float atomicAdd,✅ NoReturn,✅ Native,✅ Native,✅ Native,✅ Native -32 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit float atomicAdd,✅ CAS,✅ Native,✅ Native,✅ Native,✅ Native -64 bit float atomicMin,✅ CAS,✅ Native,✅ Native,✅ Native,✅ Native -64 bit float atomicMax,✅ CAS,✅ Native,✅ Native,✅ Native,✅ Native -16bx2 half2 atomicAdd,✅ NoReturn,✅ Native,✅ Native,✅ Native,✅ Native -16bx2 bfloat162 atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ Native,✅ Native -32 bit atoimcExch,✅ Native,✅ Native,✅ 
Native,✅ Native,✅ Native -32 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicAnd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicOr,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicXor,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicExch,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicAnd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicOr,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicXor,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicAdd,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native -32 bit atomicSub,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native -32 bit atomicMin,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native -32 bit atomicMax,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native -32 bit atomicInc,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native -32 bit atomicDec,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native -64 bit atomicAdd,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native -64 bit atomicMin,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native -64 bit atomicMax,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native -32 bit float atomicAdd,❌ NOP,❌ NOP,❌ NOP,✅ Native,✅ Native -32 bit float atomicMin,❌ NOP,❌ NOP,✅ CAS,✅ CAS,✅ CAS -32 bit float atomicMax,❌ NOP,❌ NOP,✅ CAS,✅ CAS,✅ CAS -64 bit float atomicAdd,❌ NOP,❌ NOP,❌ NOP,✅ Native,✅ Native -64 bit float atomicMin,❌ NOP,❌ NOP,❌ NOP,✅ Native,✅ Native -64 bit float atomicMax,❌ NOP,❌ NOP,❌ NOP,✅ Native,✅ Native -16bx2 half2 atomicAdd,❌ NOP,❌ NOP,❌ NOP,✅ Native,✅ Native -16bx2 bfloat162 atomicAdd,❌ NOP,❌ NOP,✅ CAS,✅ Native,✅ Native -32 bit atoimcExch,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native -32 bit atomicCAS,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native -32 bit atomicAnd,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native -32 bit atomicOr,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native -32 bit atomicXor,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native -64 bit atomicExch,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native -64 bit atomicCAS,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native -64 bit atomicAnd,❌ NOP,❌ 
NOP,✅ Native,✅ Native,✅ Native -64 bit atomicOr,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native -64 bit atomicXor,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native -32 bit atomicAdd,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native -32 bit atomicSub,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native -32 bit atomicMin,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native -32 bit atomicMax,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native -32 bit atomicInc,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native -32 bit atomicDec,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native -64 bit atomicAdd,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native -64 bit atomicMin,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native -64 bit atomicMax,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native -32 bit float atomicAdd,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade,✅ Native -32 bit float atomicMin,❌ NOP,❌ NOP,✅ CAS,⚠️ Scope Downgrade - CAS,✅ CAS -32 bit float atomicMax,❌ NOP,❌ NOP,✅ CAS,⚠️ Scope Downgrade - CAS,✅ CAS -64 bit float atomicAdd,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade,✅ Native -64 bit float atomicMin,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade,✅ Native -64 bit float atomicMax,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade,✅ Native -16bx2 half2 atomicAdd,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade,✅ Native -16bx2 bfloat162 atomicAdd,❌ NOP,❌ NOP,✅ CAS,⚠️ Scope Downgrade,✅ Native -32 bit atoimcExch,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native -32 bit atomicCAS,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native -32 bit atomicAnd,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native -32 bit atomicOr,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native -32 bit atomicXor,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native -64 bit atomicExch,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native -64 bit atomicCAS,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native -64 bit atomicAnd,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native -64 bit atomicOr,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native -64 bit atomicXor,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native -32 bit 
atomicAdd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicSub,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicMin,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicMax,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicInc,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicDec,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicAdd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicMin,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicMax,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit float atomicAdd,✅ NoReturn,✅ Native,✅ Native,✅ Native,✅ Native -32 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit float atomicAdd,✅ CAS,✅ Native,✅ Native,✅ Native,✅ Native -64 bit float atomicMin,✅ CAS,✅ Native,✅ Native,✅ Native,✅ Native -64 bit float atomicMax,✅ CAS,✅ Native,✅ Native,✅ Native,✅ Native -16bx2 half2 atomicAdd,✅ NoReturn,✅ Native,✅ Native,✅ Native,✅ Native -16bx2 bfloat162 atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ Native,✅ Native -32 bit atoimcExch,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicAnd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicOr,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicXor,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicExch,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicAnd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicOr,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicXor,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicAdd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicSub,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicMin,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicMax,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicInc,✅ Native,✅ Native,✅ Native,✅ 
Native,✅ Native -32 bit atomicDec,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicAdd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicMin,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicMax,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit float atomicAdd,✅ NoReturn,✅ Native,✅ Native,✅ Native,✅ Native -32 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit float atomicAdd,✅ CAS,✅ Native,✅ Native,✅ Native,✅ Native -64 bit float atomicMin,✅ CAS,✅ Native,✅ Native,✅ Native,✅ Native -64 bit float atomicMax,✅ CAS,✅ Native,✅ Native,✅ Native,✅ Native -16bx2 half2 atomicAdd,✅ NoReturn,✅ Native,✅ Native,✅ Native,✅ Native -16bx2 bfloat162 atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ Native,✅ Native -32 bit atoimcExch,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicAnd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicOr,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicXor,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicExch,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicAnd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicOr,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicXor,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicAdd,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native -32 bit atomicSub,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native -32 bit atomicMin,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native -32 bit atomicMax,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native -32 bit atomicInc,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native -32 bit atomicDec,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native -64 bit atomicAdd,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native -64 bit atomicMin,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native -64 bit atomicMax,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native -32 bit float atomicAdd,❌ NOP,❌ NOP,❌ NOP,✅ Native,✅ Native -32 bit float 
atomicMin,❌ NOP,❌ NOP,✅ CAS,✅ CAS,✅ CAS -32 bit float atomicMax,❌ NOP,❌ NOP,✅ CAS,✅ CAS,✅ CAS -64 bit float atomicAdd,❌ NOP,❌ NOP,❌ NOP,✅ Native,✅ Native -64 bit float atomicMin,❌ NOP,❌ NOP,❌ NOP,✅ Native,✅ Native -64 bit float atomicMax,❌ NOP,❌ NOP,❌ NOP,✅ Native,✅ Native -16bx2 half2 atomicAdd,❌ NOP,❌ NOP,❌ NOP,✅ Native,✅ Native -16bx2 bfloat162 atomicAdd,❌ NOP,❌ NOP,✅ CAS,✅ Native,✅ Native -32 bit atoimcExch,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native -32 bit atomicCAS,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native -32 bit atomicAnd,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native -32 bit atomicOr,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native -32 bit atomicXor,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native -64 bit atomicExch,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native -64 bit atomicCAS,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native -64 bit atomicAnd,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native -64 bit atomicOr,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native -64 bit atomicXor,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native -32 bit atomicAdd,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native -32 bit atomicSub,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native -32 bit atomicMin,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native -32 bit atomicMax,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native -32 bit atomicInc,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native -32 bit atomicDec,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native -64 bit atomicAdd,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native -64 bit atomicMin,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native -64 bit atomicMax,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native -32 bit float atomicAdd,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade,✅ Native -32 bit float atomicMin,❌ NOP,❌ NOP,✅ CAS,⚠️ Scope Downgrade - CAS,✅ CAS -32 bit float atomicMax,❌ NOP,❌ NOP,✅ CAS,⚠️ Scope Downgrade - CAS,✅ CAS -64 bit float atomicAdd,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade,✅ Native -64 bit float atomicMin,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade,✅ Native -64 bit float atomicMax,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade,✅ Native 
-16bx2 half2 atomicAdd,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade,✅ Native -16bx2 bfloat162 atomicAdd,❌ NOP,❌ NOP,✅ CAS,⚠️ Scope Downgrade,✅ Native -32 bit atoimcExch,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native -32 bit atomicCAS,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native -32 bit atomicAnd,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native -32 bit atomicOr,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native -32 bit atomicXor,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native -64 bit atomicExch,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native -64 bit atomicCAS,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native -64 bit atomicAnd,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native -64 bit atomicOr,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native -64 bit atomicXor,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native +Atomic,MI100,MI200 PCIe,MI200 A+A,MI300X series,MI300A,MI350X series +32 bit atomicAdd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicSub,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicMin,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicMax,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicInc,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicDec,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicAdd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicMin,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicMax,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit float atomicAdd,✅ NoReturn,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit float atomicAdd,✅ CAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit float atomicMin,✅ CAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit float atomicMax,✅ CAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +16bx2 half2 atomicAdd,✅ NoReturn,✅ Native,✅ 
Native,✅ Native,✅ Native,✅ Native +16bx2 bfloat162 atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ Native,✅ Native,✅ Native +32 bit atoimcExch,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicAnd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicOr,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicXor,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicExch,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicAnd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicOr,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicXor,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicAdd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicSub,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicMin,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicMax,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicInc,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicDec,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicAdd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicMin,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicMax,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit float atomicAdd,✅ NoReturn,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit float atomicAdd,✅ CAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit float atomicMin,✅ CAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit float atomicMax,✅ CAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +16bx2 half2 atomicAdd,✅ NoReturn,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +16bx2 bfloat162 
atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ Native,✅ Native,✅ Native +32 bit atoimcExch,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicAnd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicOr,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicXor,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicExch,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicAnd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicOr,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicXor,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicAdd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicSub,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicMin,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicMax,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicInc,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicDec,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicAdd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicMin,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicMax,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit float atomicAdd,❌ NOP,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native +32 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit float atomicAdd,✅ CAS,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native +64 bit float atomicMin,✅ CAS,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native +64 bit float atomicMax,✅ CAS,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native +16bx2 half2 atomicAdd,❌ NOP,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native +16bx2 bfloat162 atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ Native,✅ Native,✅ Native +32 bit atoimcExch,✅ Native,✅ Native,✅ 
Native,✅ Native,✅ Native,✅ Native +32 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicAnd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicOr,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicXor,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicExch,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicAnd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicOr,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicXor,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicAdd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicSub,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicMin,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicMax,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicInc,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicDec,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicAdd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicMin,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicMax,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit float atomicAdd,❌ NOP,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native +32 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit float atomicAdd,✅ CAS,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native +64 bit float atomicMin,✅ CAS,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native +64 bit float atomicMax,✅ CAS,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native +16bx2 half2 atomicAdd,❌ NOP,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native +16bx2 bfloat162 atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ Native,✅ Native,✅ Native +32 bit atoimcExch,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ 
Native,✅ Native +32 bit atomicAnd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicOr,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicXor,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicExch,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicAnd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicOr,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicXor,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicAdd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicSub,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicMin,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicMax,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicInc,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicDec,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicAdd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicMin,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicMax,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit float atomicAdd,✅ NoReturn,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit float atomicAdd,✅ CAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit float atomicMin,✅ CAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit float atomicMax,✅ CAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +16bx2 half2 atomicAdd,✅ NoReturn,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +16bx2 bfloat162 atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ Native,✅ Native,✅ Native +32 bit atoimcExch,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicAnd,✅ Native,✅ Native,✅ 
Native,✅ Native,✅ Native,✅ Native +32 bit atomicOr,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicXor,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicExch,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicAnd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicOr,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicXor,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicAdd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicSub,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicMin,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicMax,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicInc,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicDec,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicAdd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicMin,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicMax,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit float atomicAdd,✅ NoReturn,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit float atomicAdd,✅ CAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit float atomicMin,✅ CAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit float atomicMax,✅ CAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +16bx2 half2 atomicAdd,✅ NoReturn,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +16bx2 bfloat162 atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ Native,✅ Native,✅ Native +32 bit atoimcExch,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicAnd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicOr,✅ 
Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicXor,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicExch,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicAnd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicOr,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicXor,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicAdd,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicSub,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicMin,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicMax,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicInc,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicDec,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicAdd,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicMin,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicMax,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native,✅ Native +32 bit float atomicAdd,❌ NOP,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native +32 bit float atomicMin,❌ NOP,❌ NOP,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit float atomicMax,❌ NOP,❌ NOP,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit float atomicAdd,❌ NOP,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native +64 bit float atomicMin,❌ NOP,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native +64 bit float atomicMax,❌ NOP,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native +16bx2 half2 atomicAdd,❌ NOP,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native +16bx2 bfloat162 atomicAdd,❌ NOP,❌ NOP,✅ CAS,✅ Native,✅ Native,✅ Native +32 bit atoimcExch,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicCAS,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicAnd,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicOr,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicXor,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicExch,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ 
Native,✅ Native +64 bit atomicCAS,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicAnd,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicOr,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicXor,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicAdd,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native,⚠️ Scope Downgrade +32 bit atomicSub,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native,⚠️ Scope Downgrade +32 bit atomicMin,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native,⚠️ Scope Downgrade +32 bit atomicMax,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native,⚠️ Scope Downgrade +32 bit atomicInc,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native,⚠️ Scope Downgrade +32 bit atomicDec,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native,⚠️ Scope Downgrade +64 bit atomicAdd,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native,⚠️ Scope Downgrade +64 bit atomicMin,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native,⚠️ Scope Downgrade +64 bit atomicMax,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native,⚠️ Scope Downgrade +32 bit float atomicAdd,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade,✅ Native,⚠️ Scope Downgrade +32 bit float atomicMin,❌ NOP,❌ NOP,✅ CAS,⚠️ Scope Downgrade - CAS,✅ CAS,⚠️ Scope Downgrade - CAS +32 bit float atomicMax,❌ NOP,❌ NOP,✅ CAS,⚠️ Scope Downgrade - CAS,✅ CAS,⚠️ Scope Downgrade - CAS +64 bit float atomicAdd,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade,✅ Native,⚠️ Scope Downgrade +64 bit float atomicMin,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade,✅ Native,⚠️ Scope Downgrade +64 bit float atomicMax,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade,✅ Native,⚠️ Scope Downgrade +16bx2 half2 atomicAdd,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade,✅ Native,⚠️ Scope Downgrade +16bx2 bfloat162 atomicAdd,❌ NOP,❌ NOP,✅ CAS,⚠️ Scope Downgrade,✅ Native,⚠️ Scope Downgrade +32 bit atoimcExch,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native,⚠️ Scope Downgrade +32 bit atomicCAS,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native,⚠️ Scope Downgrade +32 bit atomicAnd,❌ NOP,❌ NOP,✅ Native,⚠️ 
Scope Downgrade,✅ Native,⚠️ Scope Downgrade +32 bit atomicOr,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native,⚠️ Scope Downgrade +32 bit atomicXor,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native,⚠️ Scope Downgrade +64 bit atomicExch,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native,⚠️ Scope Downgrade +64 bit atomicCAS,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native,⚠️ Scope Downgrade +64 bit atomicAnd,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native,⚠️ Scope Downgrade +64 bit atomicOr,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native,⚠️ Scope Downgrade +64 bit atomicXor,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native,⚠️ Scope Downgrade +32 bit atomicAdd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicSub,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicMin,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicMax,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicInc,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicDec,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicAdd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicMin,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicMax,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit float atomicAdd,✅ NoReturn,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit float atomicAdd,✅ CAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit float atomicMin,✅ CAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit float atomicMax,✅ CAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +16bx2 half2 atomicAdd,✅ NoReturn,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +16bx2 bfloat162 atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ Native,✅ Native,✅ Native +32 bit atoimcExch,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ 
Native,✅ Native +32 bit atomicAnd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicOr,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicXor,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicExch,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicAnd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicOr,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicXor,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicAdd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicSub,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicMin,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicMax,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicInc,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicDec,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicAdd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicMin,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicMax,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit float atomicAdd,✅ NoReturn,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit float atomicAdd,✅ CAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit float atomicMin,✅ CAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit float atomicMax,✅ CAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +16bx2 half2 atomicAdd,✅ NoReturn,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +16bx2 bfloat162 atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ Native,✅ Native,✅ Native +32 bit atoimcExch,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicAnd,✅ Native,✅ Native,✅ 
Native,✅ Native,✅ Native,✅ Native +32 bit atomicOr,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicXor,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicExch,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicAnd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicOr,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicXor,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicAdd,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicSub,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicMin,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicMax,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicInc,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicDec,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicAdd,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicMin,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicMax,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native,✅ Native +32 bit float atomicAdd,❌ NOP,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native +32 bit float atomicMin,❌ NOP,❌ NOP,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit float atomicMax,❌ NOP,❌ NOP,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit float atomicAdd,❌ NOP,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native +64 bit float atomicMin,❌ NOP,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native +64 bit float atomicMax,❌ NOP,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native +16bx2 half2 atomicAdd,❌ NOP,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native +16bx2 bfloat162 atomicAdd,❌ NOP,❌ NOP,✅ CAS,✅ Native,✅ Native,✅ Native +32 bit atoimcExch,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicCAS,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicAnd,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicOr,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicXor,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native,✅ Native 
+64 bit atomicExch,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicCAS,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicAnd,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicOr,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicXor,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicAdd,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native,⚠️ Scope Downgrade +32 bit atomicSub,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native,⚠️ Scope Downgrade +32 bit atomicMin,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native,⚠️ Scope Downgrade +32 bit atomicMax,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native,⚠️ Scope Downgrade +32 bit atomicInc,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native,⚠️ Scope Downgrade +32 bit atomicDec,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native,⚠️ Scope Downgrade +64 bit atomicAdd,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native,⚠️ Scope Downgrade +64 bit atomicMin,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native,⚠️ Scope Downgrade +64 bit atomicMax,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native,⚠️ Scope Downgrade +32 bit float atomicAdd,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade,✅ Native,⚠️ Scope Downgrade +32 bit float atomicMin,❌ NOP,❌ NOP,✅ CAS,⚠️ Scope Downgrade - CAS,✅ CAS,⚠️ Scope Downgrade - CAS +32 bit float atomicMax,❌ NOP,❌ NOP,✅ CAS,⚠️ Scope Downgrade - CAS,✅ CAS,⚠️ Scope Downgrade - CAS +64 bit float atomicAdd,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade,✅ Native,⚠️ Scope Downgrade +64 bit float atomicMin,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade,✅ Native,⚠️ Scope Downgrade +64 bit float atomicMax,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade,✅ Native,⚠️ Scope Downgrade +16bx2 half2 atomicAdd,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade,✅ Native,⚠️ Scope Downgrade +16bx2 bfloat162 atomicAdd,❌ NOP,❌ NOP,✅ CAS,⚠️ Scope Downgrade,✅ Native,⚠️ Scope Downgrade +32 bit atoimcExch,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native,⚠️ Scope Downgrade +32 bit atomicCAS,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native,⚠️ Scope 
Downgrade +32 bit atomicAnd,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native,⚠️ Scope Downgrade +32 bit atomicOr,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native,⚠️ Scope Downgrade +32 bit atomicXor,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native,⚠️ Scope Downgrade +64 bit atomicExch,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native,⚠️ Scope Downgrade +64 bit atomicCAS,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native,⚠️ Scope Downgrade +64 bit atomicAnd,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native,⚠️ Scope Downgrade +64 bit atomicOr,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native,⚠️ Scope Downgrade +64 bit atomicXor,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native,⚠️ Scope Downgrade diff --git a/docs/data/reference/gpu-atomics-operation/hw-atomics_pcie_instinct.csv b/docs/data/reference/gpu-atomics-operation/hw-atomics_pcie_instinct.csv index cf4136864..5ea596069 100644 --- a/docs/data/reference/gpu-atomics-operation/hw-atomics_pcie_instinct.csv +++ b/docs/data/reference/gpu-atomics-operation/hw-atomics_pcie_instinct.csv @@ -1,325 +1,325 @@ -Atomic,MI100,MI200 PCIe,MI200 A+A,MI300X,MI300A -32 bit atomicAdd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicSub,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicMin,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicMax,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicInc,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicDec,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicAdd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicMin,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicMax,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit float atomicAdd,✅ NoReturn,✅ Native,✅ Native,✅ Native,✅ Native -32 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit float atomicAdd,✅ CAS,✅ Native,✅ Native,✅ Native,✅ Native -64 bit float atomicMin,✅ CAS,✅ Native,✅ Native,✅ Native,✅ Native -64 bit 
float atomicMax,✅ CAS,✅ Native,✅ Native,✅ Native,✅ Native -16bx2 half2 atomicAdd,✅ NoReturn,✅ Native,✅ Native,✅ Native,✅ Native -16bx2 bfloat162 atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ Native,✅ Native -32 bit atoimcExch,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicAnd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicOr,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicXor,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicExch,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicAnd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicOr,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicXor,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicAdd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicSub,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicMin,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicMax,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicInc,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicDec,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicAdd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicMin,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicMax,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit float atomicAdd,✅ NoReturn,✅ Native,✅ Native,✅ Native,✅ Native -32 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit float atomicAdd,✅ CAS,✅ Native,✅ Native,✅ Native,✅ Native -64 bit float atomicMin,✅ CAS,✅ Native,✅ Native,✅ Native,✅ Native -64 bit float atomicMax,✅ CAS,✅ Native,✅ Native,✅ Native,✅ Native -16bx2 half2 atomicAdd,✅ NoReturn,✅ Native,✅ Native,✅ Native,✅ Native -16bx2 bfloat162 atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ Native,✅ Native -32 bit atoimcExch,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicCAS,✅ Native,✅ Native,✅ 
Native,✅ Native,✅ Native -32 bit atomicAnd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicOr,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicXor,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicExch,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicAnd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicOr,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicXor,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicAdd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicSub,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicMin,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicMax,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicInc,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicDec,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicAdd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicMin,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicMax,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit float atomicAdd,❌ NOP,❌ NOP,❌ NOP,✅ Native,✅ Native -32 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit float atomicAdd,✅ CAS,❌ NOP,❌ NOP,✅ Native,✅ Native -64 bit float atomicMin,✅ CAS,❌ NOP,❌ NOP,✅ Native,✅ Native -64 bit float atomicMax,✅ CAS,❌ NOP,❌ NOP,✅ Native,✅ Native -16bx2 half2 atomicAdd,❌ NOP,❌ NOP,❌ NOP,✅ Native,✅ Native -16bx2 bfloat162 atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ Native,✅ Native -32 bit atoimcExch,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicAnd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicOr,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicXor,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicExch,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ 
Native -64 bit atomicAnd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicOr,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicXor,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicAdd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicSub,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicMin,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicMax,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicInc,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicDec,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicAdd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicMin,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicMax,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit float atomicAdd,❌ NOP,❌ NOP,❌ NOP,✅ Native,✅ Native -32 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit float atomicAdd,✅ CAS,❌ NOP,❌ NOP,✅ Native,✅ Native -64 bit float atomicMin,✅ CAS,❌ NOP,❌ NOP,✅ Native,✅ Native -64 bit float atomicMax,✅ CAS,❌ NOP,❌ NOP,✅ Native,✅ Native -16bx2 half2 atomicAdd,❌ NOP,❌ NOP,❌ NOP,✅ Native,✅ Native -16bx2 bfloat162 atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ Native,✅ Native -32 bit atoimcExch,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicAnd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicOr,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicXor,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicExch,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicAnd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicOr,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicXor,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicAdd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicSub,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit 
atomicMin,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicMax,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicInc,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicDec,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicAdd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicMin,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicMax,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit float atomicAdd,✅ NoReturn,✅ Native,✅ Native,✅ Native,✅ Native -32 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit float atomicAdd,✅ CAS,✅ Native,✅ Native,✅ Native,✅ Native -64 bit float atomicMin,✅ CAS,✅ Native,✅ Native,✅ Native,✅ Native -64 bit float atomicMax,✅ CAS,✅ Native,✅ Native,✅ Native,✅ Native -16bx2 half2 atomicAdd,✅ NoReturn,✅ Native,✅ Native,✅ Native,✅ Native -16bx2 bfloat162 atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ Native,✅ Native -32 bit atoimcExch,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicAnd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicOr,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicXor,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicExch,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicAnd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicOr,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicXor,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicAdd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicSub,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicMin,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicMax,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicInc,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicDec,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicAdd,✅ Native,✅ Native,✅ Native,✅ 
Native,✅ Native -64 bit atomicMin,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicMax,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit float atomicAdd,✅ NoReturn,✅ Native,✅ Native,✅ Native,✅ Native -32 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit float atomicAdd,✅ CAS,✅ Native,✅ Native,✅ Native,✅ Native -64 bit float atomicMin,✅ CAS,✅ Native,✅ Native,✅ Native,✅ Native -64 bit float atomicMax,✅ CAS,✅ Native,✅ Native,✅ Native,✅ Native -16bx2 half2 atomicAdd,✅ NoReturn,✅ Native,✅ Native,✅ Native,✅ Native -16bx2 bfloat162 atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ Native,✅ Native -32 bit atoimcExch,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicAnd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicOr,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicXor,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicExch,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicAnd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicOr,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicXor,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicAdd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicSub,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native -32 bit atomicMin,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native -32 bit atomicMax,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native -32 bit atomicInc,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native -32 bit atomicDec,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native -64 bit atomicAdd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicMin,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native -64 bit atomicMax,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native -32 bit float atomicAdd,❌ NOP,❌ NOP,❌ NOP,✅ Native,✅ Native -32 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit float 
atomicAdd,✅ CAS,❌ NOP,❌ NOP,✅ Native,✅ Native -64 bit float atomicMin,✅ CAS,❌ NOP,❌ NOP,✅ Native,✅ Native -64 bit float atomicMax,✅ CAS,❌ NOP,❌ NOP,✅ Native,✅ Native -16bx2 half2 atomicAdd,❌ NOP,❌ NOP,❌ NOP,✅ Native,✅ Native -16bx2 bfloat162 atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ Native,✅ Native -32 bit atoimcExch,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicAnd,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native -32 bit atomicOr,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native -32 bit atomicXor,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native -64 bit atomicExch,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicAnd,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native -64 bit atomicOr,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native -64 bit atomicXor,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native -32 bit atomicAdd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicSub,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native -32 bit atomicMin,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native -32 bit atomicMax,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native -32 bit atomicInc,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native -32 bit atomicDec,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native -64 bit atomicAdd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicMin,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native -64 bit atomicMax,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native -32 bit float atomicAdd,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade,✅ Native -32 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit float atomicAdd,✅ CAS,❌ NOP,❌ NOP,⚠️ Scope Downgrade,✅ Native -64 bit float atomicMin,✅ CAS,❌ NOP,❌ NOP,⚠️ Scope Downgrade,✅ Native -64 bit float atomicMax,✅ CAS,❌ NOP,❌ NOP,⚠️ Scope Downgrade,✅ Native -16bx2 half2 atomicAdd,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade,✅ Native -16bx2 bfloat162 atomicAdd,✅ CAS,✅ CAS,✅ CAS,⚠️ Scope Downgrade,✅ 
Native -32 bit atoimcExch,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicAnd,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native -32 bit atomicOr,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native -32 bit atomicXor,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native -64 bit atomicExch,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicAnd,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native -64 bit atomicOr,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native -64 bit atomicXor,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native -32 bit atomicAdd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicSub,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicMin,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicMax,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicInc,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicDec,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicAdd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicMin,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicMax,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit float atomicAdd,✅ NoReturn,✅ Native,✅ Native,✅ Native,✅ Native -32 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit float atomicAdd,✅ CAS,✅ Native,✅ Native,✅ Native,✅ Native -64 bit float atomicMin,✅ CAS,✅ Native,✅ Native,✅ Native,✅ Native -64 bit float atomicMax,✅ CAS,✅ Native,✅ Native,✅ Native,✅ Native -16bx2 half2 atomicAdd,✅ NoReturn,✅ Native,✅ Native,✅ Native,✅ Native -16bx2 bfloat162 atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ Native,✅ Native -32 bit atoimcExch,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicAnd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicOr,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit 
atomicXor,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicExch,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicAnd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicOr,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicXor,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicAdd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicSub,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicMin,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicMax,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicInc,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicDec,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicAdd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicMin,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicMax,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit float atomicAdd,✅ NoReturn,✅ Native,✅ Native,✅ Native,✅ Native -32 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit float atomicAdd,✅ CAS,✅ Native,✅ Native,✅ Native,✅ Native -64 bit float atomicMin,✅ CAS,✅ Native,✅ Native,✅ Native,✅ Native -64 bit float atomicMax,✅ CAS,✅ Native,✅ Native,✅ Native,✅ Native -16bx2 half2 atomicAdd,✅ NoReturn,✅ Native,✅ Native,✅ Native,✅ Native -16bx2 bfloat162 atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ Native,✅ Native -32 bit atoimcExch,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicAnd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicOr,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicXor,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicExch,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicAnd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicOr,✅ Native,✅ Native,✅ Native,✅ 
Native,✅ Native -64 bit atomicXor,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicAdd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicSub,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native -32 bit atomicMin,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native -32 bit atomicMax,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native -32 bit atomicInc,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native -32 bit atomicDec,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native -64 bit atomicAdd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicMin,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native -64 bit atomicMax,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native -32 bit float atomicAdd,❌ NOP,❌ NOP,❌ NOP,✅ Native,✅ Native -32 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit float atomicAdd,✅ CAS,❌ NOP,❌ NOP,✅ Native,✅ Native -64 bit float atomicMin,✅ CAS,❌ NOP,❌ NOP,✅ Native,✅ Native -64 bit float atomicMax,✅ CAS,❌ NOP,❌ NOP,✅ Native,✅ Native -16bx2 half2 atomicAdd,❌ NOP,❌ NOP,❌ NOP,✅ Native,✅ Native -16bx2 bfloat162 atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ Native,✅ Native -32 bit atoimcExch,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicAnd,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native -32 bit atomicOr,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native -32 bit atomicXor,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native -64 bit atomicExch,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicAnd,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native -64 bit atomicOr,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native -64 bit atomicXor,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native -32 bit atomicAdd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicSub,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native -32 bit atomicMin,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native -32 bit atomicMax,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native -32 bit atomicInc,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native 
-32 bit atomicDec,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native -64 bit atomicAdd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicMin,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native -64 bit atomicMax,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native -32 bit float atomicAdd,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade,✅ Native -32 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -32 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS -64 bit float atomicAdd,✅ CAS,❌ NOP,❌ NOP,⚠️ Scope Downgrade,✅ Native -64 bit float atomicMin,✅ CAS,❌ NOP,❌ NOP,⚠️ Scope Downgrade,✅ Native -64 bit float atomicMax,✅ CAS,❌ NOP,❌ NOP,⚠️ Scope Downgrade,✅ Native -16bx2 half2 atomicAdd,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade,✅ Native -16bx2 bfloat162 atomicAdd,✅ CAS,✅ CAS,✅ CAS,⚠️ Scope Downgrade,✅ Native -32 bit atoimcExch,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -32 bit atomicAnd,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native -32 bit atomicOr,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native -32 bit atomicXor,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native -64 bit atomicExch,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native -64 bit atomicAnd,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native -64 bit atomicOr,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native -64 bit atomicXor,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native +Atomic,MI100,MI200 PCIe,MI200 A+A,MI300X series,MI300A,MI350X series +32 bit atomicAdd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicSub,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicMin,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicMax,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicInc,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicDec,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicAdd,✅ Native,✅ Native,✅ Native,✅ 
Native,✅ Native,✅ Native +64 bit atomicMin,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicMax,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit float atomicAdd,✅ NoReturn,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit float atomicAdd,✅ CAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit float atomicMin,✅ CAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit float atomicMax,✅ CAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +16bx2 half2 atomicAdd,✅ NoReturn,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +16bx2 bfloat162 atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ Native,✅ Native,✅ Native +32 bit atoimcExch,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicAnd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicOr,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicXor,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicExch,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicAnd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicOr,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicXor,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicAdd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicSub,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicMin,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicMax,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicInc,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicDec,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicAdd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicMin,✅ Native,✅ 
Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicMax,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit float atomicAdd,✅ NoReturn,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit float atomicAdd,✅ CAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit float atomicMin,✅ CAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit float atomicMax,✅ CAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +16bx2 half2 atomicAdd,✅ NoReturn,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +16bx2 bfloat162 atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ Native,✅ Native,✅ Native +32 bit atoimcExch,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicAnd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicOr,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicXor,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicExch,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicAnd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicOr,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicXor,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicAdd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicSub,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicMin,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicMax,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicInc,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicDec,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicAdd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicMin,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit 
atomicMax,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit float atomicAdd,❌ NOP,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native +32 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit float atomicAdd,✅ CAS,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native +64 bit float atomicMin,✅ CAS,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native +64 bit float atomicMax,✅ CAS,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native +16bx2 half2 atomicAdd,❌ NOP,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native +16bx2 bfloat162 atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ Native,✅ Native,✅ Native +32 bit atoimcExch,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicAnd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicOr,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicXor,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicExch,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicAnd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicOr,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicXor,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicAdd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicSub,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicMin,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicMax,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicInc,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicDec,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicAdd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicMin,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicMax,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit float atomicAdd,❌ NOP,❌ 
NOP,❌ NOP,✅ Native,✅ Native,✅ Native +32 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit float atomicAdd,✅ CAS,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native +64 bit float atomicMin,✅ CAS,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native +64 bit float atomicMax,✅ CAS,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native +16bx2 half2 atomicAdd,❌ NOP,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native +16bx2 bfloat162 atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ Native,✅ Native,✅ Native +32 bit atoimcExch,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicAnd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicOr,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicXor,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicExch,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicAnd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicOr,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicXor,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicAdd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicSub,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicMin,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicMax,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicInc,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicDec,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicAdd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicMin,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicMax,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit float atomicAdd,✅ NoReturn,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ 
CAS,✅ CAS,✅ CAS +32 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit float atomicAdd,✅ CAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit float atomicMin,✅ CAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit float atomicMax,✅ CAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +16bx2 half2 atomicAdd,✅ NoReturn,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +16bx2 bfloat162 atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ Native,✅ Native,✅ Native +32 bit atoimcExch,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicAnd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicOr,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicXor,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicExch,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicAnd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicOr,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicXor,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicAdd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicSub,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicMin,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicMax,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicInc,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicDec,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicAdd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicMin,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicMax,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit float atomicAdd,✅ NoReturn,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit float atomicMax,✅ CAS,✅ CAS,✅ 
CAS,✅ CAS,✅ CAS,✅ CAS +64 bit float atomicAdd,✅ CAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit float atomicMin,✅ CAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit float atomicMax,✅ CAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +16bx2 half2 atomicAdd,✅ NoReturn,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +16bx2 bfloat162 atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ Native,✅ Native,✅ Native +32 bit atoimcExch,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicAnd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicOr,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicXor,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicExch,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicAnd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicOr,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicXor,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicAdd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicSub,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicMin,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicMax,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicInc,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicDec,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicAdd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicMin,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicMax,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native,✅ Native +32 bit float atomicAdd,❌ NOP,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native +32 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit float atomicAdd,✅ CAS,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native +64 bit float 
atomicMin,✅ CAS,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native +64 bit float atomicMax,✅ CAS,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native +16bx2 half2 atomicAdd,❌ NOP,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native +16bx2 bfloat162 atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ Native,✅ Native,✅ Native +32 bit atoimcExch,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicAnd,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicOr,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicXor,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicExch,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicAnd,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicOr,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicXor,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicAdd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicSub,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native,⚠️ Scope Downgrade +32 bit atomicMin,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native,⚠️ Scope Downgrade +32 bit atomicMax,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native,⚠️ Scope Downgrade +32 bit atomicInc,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native,⚠️ Scope Downgrade +32 bit atomicDec,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native,⚠️ Scope Downgrade +64 bit atomicAdd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicMin,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native,⚠️ Scope Downgrade +64 bit atomicMax,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native,⚠️ Scope Downgrade +32 bit float atomicAdd,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade,✅ Native,⚠️ Scope Downgrade +32 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit float atomicAdd,✅ CAS,❌ NOP,❌ NOP,⚠️ Scope Downgrade,✅ Native,⚠️ Scope Downgrade +64 bit 
float atomicMin,✅ CAS,❌ NOP,❌ NOP,⚠️ Scope Downgrade,✅ Native,⚠️ Scope Downgrade +64 bit float atomicMax,✅ CAS,❌ NOP,❌ NOP,⚠️ Scope Downgrade,✅ Native,⚠️ Scope Downgrade +16bx2 half2 atomicAdd,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade,✅ Native,⚠️ Scope Downgrade +16bx2 bfloat162 atomicAdd,✅ CAS,✅ CAS,✅ CAS,⚠️ Scope Downgrade,✅ Native,⚠️ Scope Downgrade +32 bit atoimcExch,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicAnd,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native,⚠️ Scope Downgrade +32 bit atomicOr,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native,⚠️ Scope Downgrade +32 bit atomicXor,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native,⚠️ Scope Downgrade +64 bit atomicExch,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicAnd,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native,⚠️ Scope Downgrade +64 bit atomicOr,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native,⚠️ Scope Downgrade +64 bit atomicXor,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native,⚠️ Scope Downgrade +32 bit atomicAdd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicSub,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicMin,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicMax,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicInc,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicDec,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicAdd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicMin,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicMax,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit float atomicAdd,✅ NoReturn,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ 
CAS +64 bit float atomicAdd,✅ CAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit float atomicMin,✅ CAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit float atomicMax,✅ CAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +16bx2 half2 atomicAdd,✅ NoReturn,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +16bx2 bfloat162 atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ Native,✅ Native,✅ Native +32 bit atoimcExch,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicAnd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicOr,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicXor,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicExch,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicAnd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicOr,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicXor,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicAdd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicSub,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicMin,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicMax,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicInc,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicDec,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicAdd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicMin,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicMax,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit float atomicAdd,✅ NoReturn,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit float atomicAdd,✅ CAS,✅ Native,✅ Native,✅ 
Native,✅ Native,✅ Native +64 bit float atomicMin,✅ CAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit float atomicMax,✅ CAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +16bx2 half2 atomicAdd,✅ NoReturn,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +16bx2 bfloat162 atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ Native,✅ Native,✅ Native +32 bit atoimcExch,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicAnd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicOr,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicXor,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicExch,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicAnd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicOr,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicXor,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicAdd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicSub,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicMin,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicMax,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicInc,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicDec,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicAdd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicMin,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicMax,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native,✅ Native +32 bit float atomicAdd,❌ NOP,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native +32 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit float atomicAdd,✅ CAS,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native +64 bit float atomicMin,✅ CAS,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native +64 bit float 
atomicMax,✅ CAS,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native +16bx2 half2 atomicAdd,❌ NOP,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native +16bx2 bfloat162 atomicAdd,✅ CAS,✅ CAS,✅ CAS,✅ Native,✅ Native,✅ Native +32 bit atoimcExch,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicAnd,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicOr,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicXor,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicExch,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicAnd,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicOr,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicXor,❌ NOP,❌ NOP,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicAdd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicSub,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native,⚠️ Scope Downgrade +32 bit atomicMin,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native,⚠️ Scope Downgrade +32 bit atomicMax,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native,⚠️ Scope Downgrade +32 bit atomicInc,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native,⚠️ Scope Downgrade +32 bit atomicDec,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native,⚠️ Scope Downgrade +64 bit atomicAdd,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicMin,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native,⚠️ Scope Downgrade +64 bit atomicMax,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native,⚠️ Scope Downgrade +32 bit float atomicAdd,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade,✅ Native,⚠️ Scope Downgrade +32 bit float atomicMin,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +32 bit float atomicMax,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS,✅ CAS +64 bit float atomicAdd,✅ CAS,❌ NOP,❌ NOP,⚠️ Scope Downgrade,✅ Native,⚠️ Scope Downgrade +64 bit float atomicMin,✅ CAS,❌ NOP,❌ NOP,⚠️ Scope Downgrade,✅ Native,⚠️ 
Scope Downgrade +64 bit float atomicMax,✅ CAS,❌ NOP,❌ NOP,⚠️ Scope Downgrade,✅ Native,⚠️ Scope Downgrade +16bx2 half2 atomicAdd,❌ NOP,❌ NOP,❌ NOP,⚠️ Scope Downgrade,✅ Native,⚠️ Scope Downgrade +16bx2 bfloat162 atomicAdd,✅ CAS,✅ CAS,✅ CAS,⚠️ Scope Downgrade,✅ Native,⚠️ Scope Downgrade +32 bit atoimcExch,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +32 bit atomicAnd,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native,⚠️ Scope Downgrade +32 bit atomicOr,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native,⚠️ Scope Downgrade +32 bit atomicXor,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native,⚠️ Scope Downgrade +64 bit atomicExch,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicCAS,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native,✅ Native +64 bit atomicAnd,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native,⚠️ Scope Downgrade +64 bit atomicOr,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native,⚠️ Scope Downgrade +64 bit atomicXor,❌ NOP,❌ NOP,✅ Native,⚠️ Scope Downgrade,✅ Native,⚠️ Scope Downgrade diff --git a/docs/reference/gpu-atomics-operation.rst b/docs/reference/gpu-atomics-operation.rst index dddbef0c0..faa9a8320 100644 --- a/docs/reference/gpu-atomics-operation.rst +++ b/docs/reference/gpu-atomics-operation.rst @@ -14,16 +14,26 @@ completed as an indivisible unit, preventing race conditions where simultaneous access to the same memory location could lead to incorrect or undefined behavior. -This document details the various support of atomic read-modify-write -(atomicRMW) operations on gfx9, gfx10, gfx11, gfx12, MI100, MI200 and MI300 AMD -GPUs. The atomics operation type behavior effected by the memory locations, -memory granularity or scope of operations. +This topic summarizes the support of atomic read-modify-write +(atomicRMW) operations on AMD GPUs and accelerators. 
This includes gfx9, gfx10, +gfx11, and gfx12 targets and the following Instinct™ series: + +- MI100 + +- MI200 + +- MI300 + +- MI350 + +The atomics operation type behavior is affected by the memory locations, memory +granularity, and scope of operations. Memory locations: -- :ref:`Device memory `, i.e. VRAM, the RAM on a discrete GPU - device or in framebuffer carveout for APUs. This includes peer-device memory - within an Infinity Fabric™ hive. +- :ref:`Device memory `, that is, VRAM, the RAM on a discrete + GPU device or in framebuffer carveout for APUs. This includes peer-device + memory within an Infinity Fabric™ hive. - :ref:`Host memory `: in DRAM associated with the CPU (or peer device memory using PCIe® (PCI Express) peer-to-peer). This can be two sub-types: @@ -69,10 +79,10 @@ Scopes of operations: Support summary ================================================================================ -AMD Instinct™ accelerators +AMD Instinct accelerators -------------------------------------------------------------------------------- -**MI300** +**MI300 and MI350 series** - All atomicRMW operations are forwarded out to the Infinity Fabric. - Infinity Fabric supports common integer and bitwise atomics, FP32 atomic add, @@ -85,7 +95,7 @@ AMD Instinct™ accelerators It will seem like atomics to the wave, but the CPU sees it as a non-atomic load-op-store sequence. This downgrades system-scope atomics to device-scope. -**MI200** +**MI200 series** - L2 cache and Infinity Fabric both support common integer and bitwise atomics.
- L2 cache supports FP32 atomic add, packed-FP16 atomic add, and FP64 add, From 08d0840b6922b7a6702d1cbde3bf809028c91214 Mon Sep 17 00:00:00 2001 From: Pratik Basyal Date: Mon, 18 Aug 2025 14:03:43 -0400 Subject: [PATCH 03/58] Post RC3 7.0.0 RN update (#501) * Indentation and formatting updated * AMD SMI changelog update * Changelog update * Compute and Systems profiler changelog added * Highlight added * AMD SMI link added * Changelog updated * Refernece link updated * ROCal changelog added * rocJpeg added * Minor change * version update * rocpydecode added * Changelog.md updated * Heading level error fixed * Feedback from Jeff incorporated * Title formatting updated * Changelog updated * Changelog updated * Changelog updates * HIPCC perl script removed * TOC for internal purpose updated * ROCgdb api and ROCdbg added * Changelog udpate * Sandra's feedback added --- CHANGELOG.md | 1502 +++++++++++++++++++++++++++++++++++++++ RELEASE.md | 1370 +++++++++++++++++++++++------------ docs/sphinx/_toc.yml.in | 4 +- 3 files changed, 2411 insertions(+), 465 deletions(-) diff --git a/CHANGELOG.md b/CHANGELOG.md index 122110a7a..4d581c4eb 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -4,6 +4,1508 @@ This page is a historical overview of changes made to ROCm components. This consolidated changelog documents key modifications and improvements across different versions of the ROCm software stack and its components. +## ROCm 7.0.0 + +See the [ROCm 7.0.0 release notes](https://rocm-stg.amd.com/en/latest/about/release-notes.html#rocm-7-0-0-release-notes) +for a complete overview of this release. + +### **AMD SMI** (26.0.0) + +### Added + +* The Default command. + + A default view has been added. The default view provides a snapshot of commonly requested information such as bdf, current partition mode, version information, and more. Users can access that information by simply typing `amd-smi` with no additional commands or arguments. 
Users may also obtain this information through alternate output formats such as JSON or CSV by using the default command with the respective output format: `amd-smi default --json` or `amd-smi default --csv`. + +* Support for GPU metrics 1.8. + - Added new fields for `amdsmi_gpu_xcp_metrics_t` including: + - Adding the following metrics to allow new calculations for violation status: + - Per XCP metrics `gfx_below_host_limit_ppt_acc[XCP][MAX_XCC]` - GFX Clock Host limit Package Power Tracking violation counts. + - Per XCP metrics `gfx_below_host_limit_thm_acc[XCP][MAX_XCC]` - GFX Clock Host limit Thermal (TVIOL) violation counts. + - Per XCP metrics `gfx_low_utilization_acc[XCP][MAX_XCC]` - violation counts for how long low utilization caused the GPU to be below application clocks. + - Per XCP metrics `gfx_below_host_limit_total_acc[XCP][MAX_XCC]` - violation counts for how long the GPU was held below application clocks by any limiter (see the new violation metrics above). + - Increasing available JPEG engines to 40. + Current ASICs may not support all 40. Unsupported engines are indicated as `UINT16_MAX` or `N/A` in the CLI. + +* Bad page threshold count. + - Added `amdsmi_get_gpu_bad_page_threshold` to the Python API and CLI; root/sudo permissions are required to display the count. + +* CPU model name for RDC. + - Added new C and Python API `amdsmi_get_cpu_model_name`. + - Not sourced from the esmi library. + +* Added `amdsmi_get_cpu_affinity_with_scope()`. + +* `socket power` to `amdsmi_get_power_info`. + - Previously, the C API had the value in the `amdsmi_power_info` structure, but it was unused. + - The value is now populated in both the C and Python APIs. + - The value represents the socket's power, agnostic of the GPU version. + +* New event notification types to `amdsmi_evt_notification_type_t`.
+ The following values were added to the `amdsmi_evt_notification_type_t` enum: + - `AMDSMI_EVT_NOTIF_EVENT_MIGRATE_START` + - `AMDSMI_EVT_NOTIF_EVENT_MIGRATE_END` + - `AMDSMI_EVT_NOTIF_EVENT_PAGE_FAULT_START` + - `AMDSMI_EVT_NOTIF_EVENT_PAGE_FAULT_END` + - `AMDSMI_EVT_NOTIF_EVENT_QUEUE_EVICTION` + - `AMDSMI_EVT_NOTIF_EVENT_QUEUE_RESTORE` + - `AMDSMI_EVT_NOTIF_EVENT_UNMAP_FROM_GPU` + - `AMDSMI_EVT_NOTIF_PROCESS_START` + - `AMDSMI_EVT_NOTIF_PROCESS_END` + +* Power cap to `amd-smi monitor`. + - `amd-smi monitor -p` displays the power cap along with power. + +### Changed + +* Updated `amdsmi_get_clock_info` in `amdsmi_interface.py`. + - The `clk_deep_sleep` field now returns the sleep integer value. + +* Updated `amdsmi_get_gpu_asic_info` in `amdsmi.h`. + - Added `subsystem_id` structure member. + +* The `amd-smi topology` command has been enabled for Guest environments. + - `amd-smi topology` is now available in Guest environments. This includes full functionality so users can use the command just as they would in Bare Metal environments. + +* Expanded Violation Status tracking for GPU metrics 1.8. + - The driver no longer supports the existing single-value GFX Clk Below Host Limit fields (`acc_gfx_clk_below_host_limit`, `per_gfx_clk_below_host_limit`, `active_gfx_clk_below_host_limit`); they are replaced by new per-XCP/XCC arrays.
+ - Added new fields to `amdsmi_violation_status_t` and related interfaces for enhanced violation breakdown: + - Per-XCP/XCC accumulators and status for: + - GFX Clock Below Host Limit (Power, Thermal, and Total) + - Low Utilization + - Added 2D arrays to track per-XCP/XCC accumulators, percentage, and active status: + - `acc_gfx_clk_below_host_limit_pwr`, `acc_gfx_clk_below_host_limit_thm`, `acc_gfx_clk_below_host_limit_total` + - `per_gfx_clk_below_host_limit_pwr`, `per_gfx_clk_below_host_limit_thm`, `per_gfx_clk_below_host_limit_total` + - `active_gfx_clk_below_host_limit_pwr`, `active_gfx_clk_below_host_limit_thm`, `active_gfx_clk_below_host_limit_total` + - `acc_low_utilization`, `per_low_utilization`, `active_low_utilization` + - Python API and CLI now report these expanded fields. + +* The lengths of the char arrays in the following structures have changed. + - `amdsmi_vbios_info_t` member `build_date` changed from `AMDSMI_MAX_DATE_LENGTH` to `AMDSMI_MAX_STRING_LENGTH`. + - `amdsmi_dpm_policy_entry_t` member `policy_description` changed from `AMDSMI_MAX_NAME` to `AMDSMI_MAX_STRING_LENGTH`. + - `amdsmi_name_value_t` member `name` changed from `AMDSMI_MAX_NAME` to `AMDSMI_MAX_STRING_LENGTH`. + +* Updated `amdsmi_get_clock_info` in `amdsmi_interface.py`. + - The `clk_deep_sleep` field now returns the sleep integer value. + +* Updated `amdsmi_bdf_t` in `amdsmi.h`. + - The `amdsmi_bdf_t` union was changed to have an identical unnamed struct for backward compatibility. + +### Removed + +- Removed the unnecessary API `amdsmi_free_name_value_pairs()` from `amdsmi.h`. + - This API is only used internally to free up memory from the Python interface and does not need to be + exposed to the user.
+ +- Removed unused definitions: + - `AMDSMI_MAX_NAME` + - `AMDSMI_256_LENGTH` + - `AMDSMI_MAX_DATE_LENGTH` + - `MAX_AMDSMI_NAME_LENGTH` + - `AMDSMI_LIB_VERSION_YEAR` + - `AMDSMI_DEFAULT_VARIANT` + - `AMDSMI_MAX_NUM_POWER_PROFILES` + - `AMDSMI_MAX_DRIVER_VERSION_LENGTH` + +- Removed unused member `year` in struct `amdsmi_version_t`. + +- Removed `amdsmi_io_link_type_t` and replaced it with `amdsmi_link_type_t`. + - `amdsmi_io_link_type_t` is no longer needed as `amdsmi_link_type_t` is sufficient. + - `amdsmi_link_type_t` enum has changed. + - This change also affects `amdsmi_link_metrics_t`, where the `link_type` field changes from `amdsmi_io_link_type_t` to `amdsmi_link_type_t`. + +- Removed `amdsmi_get_power_info_v2()`. + - `amdsmi_get_power_info()` has been unified and the v2 function is no longer needed or used. + +- Removed `AMDSMI_EVT_NOTIF_RING_HANG` event notification type in `amdsmi_evt_notification_type_t`. + +- `amdsmi_get_gpu_vram_info` now provides vendor names as a string. + - `amdsmi_vram_vendor_type_t` enum structure is removed. + - `amdsmi_vram_info_t` member named `amdsmi_vram_vendor_type_t` is changed to a character string. + - `amdsmi_get_gpu_vram_info` no longer requires decoding the vendor name as an enum. + +- Removed backward compatibility for the `jpeg_activity` and `vcn_activity` fields of `amdsmi_get_gpu_metrics_info()`. Use `xcp_stats.jpeg_busy` or `xcp_stats.vcn_busy` instead. + - Backward compatibility is removed for the `jpeg_activity` and `vcn_activity` fields if the `jpeg_busy` or `vcn_busy` field is available. + - Providing both `vcn_activity`/`jpeg_activity` and XCP (partition) stats `vcn_busy`/`jpeg_busy` caused confusion about which field to use. By removing backward compatibility, it is easier to identify the relevant field. + - The `jpeg_busy` field increased in size (for supported ASICs), making backward compatibility unable to fully copy the structure into `jpeg_activity`.
+ +### Optimized + +- Reduced the number of API calls the `amd-smi` CLI makes before reading or (re)setting GPU features. + - Every `amd-smi` CLI command now makes fewer API calls. Previously, + when a user read a GPU's status, for example, the CLI also polled for other information used by the set and reset + CLI calls. This change improves the overall run-time performance of the CLI tool. + +- Removed partition information from the default `amd-smi static` CLI command. + - Users can still retrieve the same data by calling `amd-smi`, `amd-smi static -p`, or `amd-smi partition -c -m`/`sudo amd-smi partition -a`. + - Reading `current_compute_partition` may momentarily wake the GPU up. This is due to reading XCD registers, which is expected behavior. Changing partitions is not a trivial operation; the `current_compute_partition` SYSFS entry controls this action. + +- Optimized CLI command `amd-smi topology` in partition mode. + - Reduced the number of `amdsmi_topo_get_p2p_status` API calls to one fourth. + +### Resolved issues + +- Removed duplicated GPU IDs when receiving events using the `amd-smi event` command. + +```{note} +See the full [AMD SMI changelog](https://github.com/ROCm/amdsmi/blob/release/rocm-rel-7.0/CHANGELOG.md) for details, examples, and in-depth descriptions. +``` + +### **Composable Kernel** (1.1.0) + +#### Added + +* Added support for BF16, F32, and F16 for 2D and 3D NGCHW grouped convolution backward data. +* Added a fully asynchronous HOST (CPU) arguments copy flow for CK grouped GEMM kernels. +* Added support for GKCYX layout for grouped convolution forward (NGCHW/GKCYX/NGKHW; the number of instances in the instance factory for NGCHW/GKYXC/NGKHW has been reduced). +* Added support for GKCYX layout for grouped convolution forward (NGCHW/GKCYX/NGKHW). +* Added support for GKCYX layout for grouped convolution backward weight (NGCHW/GKCYX/NGKHW).
+* Added support for GKCYX layout for grouped convolution backward data (NGCHW/GKCYX/NGKHW). +* Added support for Stream-K version of mixed FP8/BF16 GEMM. +* Added support for Multiple D GEMM. +* Added GEMM pipeline for microscaling (MX) FP8/FP6/FP4 data types. +* Added support for FP16 2:4 structured sparsity to universal GEMM. +* Added support for Split K for grouped convolution backward data. +* Added logit soft-capping support for FMHA forward kernels. +* Added support for hdim as a multiple of 32 for FMHA (fwd/fwd_splitkv). +* Added benchmarking support for tile engine GEMM. +* Added Ping-pong scheduler support for GEMM operation along the K dimension. +* Added rotating buffer feature for CK_Tile GEMM. +* Added int8 support for CK_TILE GEMM. + +#### Changed + +* Removed support for gfx940 and gfx941 targets. +* Replaced the raw buffer load/store intrinsics with Clang20 built-ins. +* DL and DPP kernels are now enabled by default. +* The number of instances in the instance factory for grouped convolution forward NGCHW/GKYXC/NGKHW has been reduced. +* The number of instances in the instance factory for grouped convolution backward weight NGCHW/GKYXC/NGKHW has been reduced. +* The number of instances in the instance factory for grouped convolution backward data NGCHW/GKYXC/NGKHW has been reduced. + +#### Optimized + +* Optimized the GEMM multiply preshuffle and LDS bypass with Pack of KGroup and better instruction layout. +* Added Vectorize Transpose optimization for CK Tile. +* Added the asynchronous copy for gfx950. + +### **HIP** (7.0.0) + +#### Added + +* New HIP APIs + - `hipLaunchKernelEx` dispatches the provided kernel with the given launch configuration and forwards the kernel arguments. + - `hipLaunchKernelExC` launches a HIP kernel using a generic function pointer and the specified configuration. + - `hipDrvLaunchKernelEx` dispatches the device kernel represented by a HIP function object. + - `hipMemGetHandleForAddressRange` gets a handle for the address range requested.
+ - `num_threads` returns the total number of threads in the group. The legacy API `size` is an alias. + - `__reduce_add_sync`, `__reduce_min_sync`, and `__reduce_max_sync` functions added for reduction across lanes of a warp. For details, see [Warp cross-lane functions](https://rocm.docs.amd.com/projects/HIP/en/latest/how-to/hip_cpp_language_extensions.html#warp-cross-lane-functions). +* New support for Open Compute Project (OCP) floating-point `FP4`/`FP6`/`FP8` as follows. For details, see [Low precision floating point document](https://rocm.docs.amd.com/projects/HIP/en/latest/reference/low_fp_types.html). + - Data types for `FP4`/`FP6`/`FP8`. + - HIP APIs for `FP4`/`FP6`/`FP8`, which are compatible with corresponding CUDA APIs. + - HIP Extensions APIs for microscaling formats, which are supported on AMD GPUs. +* New `wptr` and `rptr` values in `ClPrint`, for better logging in dispatch barrier methods. +* New debug mask, to print precise code object information for logging. +* The `_sync()` versions of cross-lane builtins such as `__shfl_sync()` and `__reduce_add_sync` are enabled by default. These can be disabled by setting the preprocessor macro `HIP_DISABLE_WARP_SYNC_BUILTINS`. +* Added `constexpr` operators for `FP16`/`BF16`. +* Added `__syncwarp` operation. +* Added PCI CHIP ID information as a device attribute. +* Added new test applications for OCP data types `FP4`/`FP6`/`FP8`. +* Added a new HIP runtime attribute that exposes how many compute dies (chiplets, XCC) are available on a given GPU. Developers can query this attribute through the `hipDeviceGetAttribute` API to improve cache locality in a kernel and optimize the kernel launch grid layout for better performance. + +#### Changed +* Deprecated GPUs. +Some unsupported GPUs such as gfx9, gfx8 and gfx7 are deprecated on Microsoft Windows.
+* Behavior changes + - `hipGetLastError` now returns the last actual error caught in the current thread during application execution. + - Additional input parameter validation checks are added to the cooperative groups functions `hipLaunchCooperativeKernelMultiDevice` and `hipLaunchCooperativeKernel`. + - `hipPointerGetAttributes` now returns `hipSuccess` instead of `hipErrorInvalidValue` when a `NULL` host or attribute pointer is passed as an input parameter. This matches the behavior of `cudaPointerGetAttributes`, which changed in CUDA 11 and later releases. + - `hipFree` previously performed an implicit wait, for synchronization purposes, that applied to all memory allocations. This wait is now disabled for allocations made with `hipMallocAsync` and `hipMallocFromPoolAsync`, to match the behavior of the CUDA API `cudaFree`. + - `hipFreeAsync` now returns `hipSuccess` when the input pointer is NULL, instead of `hipErrorInvalidValue`, to be consistent with `hipFree`. +* Changes in hipRTC. + - Removal of `hipRTC` symbols from HIP Runtime Library. + Any application using `hipRTC` APIs should link explicitly with the `hipRTC` library. This makes the usage of `hipRTC` library on Linux the same as on Windows and matches the behavior of CUDA `nvRTC`. + - `hipRTC` compilation + The device code compilation now uses namespace `__hip_internal`, instead of the standard headers `std`, to avoid namespace collision. + - Changes of datatypes from `hipRTC`. + Datatype definitions such as `int64_t`, `uint64_t`, `int32_t`, and `uint32_t` are removed to avoid any potential conflicts in some applications. HIP now uses internal datatypes instead, prefixed with `__hip`, for example, `__hip_int64_t`. +* HIP header clean up + - Usage of STD headers: HIP header files now include only the necessary STL headers. + - Deprecated structure `HIP_MEMSET_NODE_PARAMS` is removed.
Developers can use the definition `hipMemsetParams` instead. +* API signature/struct changes + - API signatures are adjusted in some APIs to match corresponding CUDA APIs. Impacted APIs are as follows: + * `hiprtcCreateProgram` + * `hiprtcCompileProgram` + * `hipMemcpyHtoD` + * `hipCtxGetApiVersion` + - HIP struct change: `hipMemsetParams` is updated to be compatible with CUDA. + - HIP vector constructor change: `hipComplex` initialization now generates correct values. The affected constructors are those of small vector types such as `float2` and `int4`. +* Stream Capture updates + - Restricted stream capture mode: HIP APIs now enforce this through the macro `CHECK_STREAM_CAPTURE_SUPPORTED()`. +In the previous HIP enumeration `hipStreamCaptureMode`, three capture modes were defined. With the check in the macro, the only supported stream capture mode is now `hipStreamCaptureModeRelaxed`. The rest are not supported, and the macro will return `hipErrorStreamCaptureUnsupported`. This update involves the following APIs, which are allowed only in relaxed stream capture mode: + * `hipMallocManaged` + * `hipMemAdvise` + - Stream capture mode checks: the following APIs check the stream capture mode and return error codes to match the behavior of CUDA. + * `hipLaunchCooperativeKernelMultiDevice` + * `hipEventQuery` + * `hipStreamAddCallback` + - Errors returned during stream capture: the following HIP APIs now return the specific error `hipErrorStreamCaptureUnsupported` on the AMD platform, instead of always returning `hipSuccess`, to match the behavior of CUDA. + * `hipDeviceSetMemPool` + * `hipMemPoolCreate` + * `hipMemPoolDestroy` + * `hipDeviceSetSharedMemConfig` + * `hipDeviceSetCacheConfig` + * `hipMemcpyWithStream` +* Error code update +Returned error/value codes are updated in the following HIP APIs to match the corresponding CUDA APIs.
+ - Module Management Related APIs + * `hipModuleLaunchKernel` + * `hipExtModuleLaunchKernel` + * `hipExtLaunchKernel` + * `hipDrvLaunchKernelEx` + * `hipLaunchKernel` + * `hipLaunchKernelExC` + * `hipModuleLaunchCooperativeKernel` + * `hipModuleLoad` + - Texture Management Related APIs +The following APIs update the return codes to match the behavior of CUDA: + * `hipTexObjectCreate` supports zero width and height for a 2D image. If either is zero, it will not return `false`. + * `hipBindTexture2D` adds an extra check: if the pointer for the texture reference or device is NULL, it returns `hipErrorNotFound`. + * `hipBindTextureToArray` returns `hipErrorInvalidChannelDescriptor`, instead of `hipErrorInvalidValue`, if a NULL pointer is passed for the texture object, resource descriptor, or texture descriptor. + * `hipGetTextureAlignmentOffset` adds a return code `hipErrorInvalidTexture` when the texture reference pointer is NULL. + - Cooperative Group Related APIs: more validations are added in the implementation of the following APIs: + * `hipLaunchCooperativeKernelMultiDevice` + * `hipLaunchCooperativeKernel` +* Invalid stream input parameter handling +In order to match the CUDA runtime behavior more closely, HIP APIs with streams passed as input parameters no longer check the stream validity. Previously, the HIP runtime returned an error code `hipErrorContextIsDestroyed` if the stream was invalid. In CUDA version 12 and later, the equivalent behavior is to raise a segmentation fault. HIP runtime now matches CUDA by causing a segmentation fault.
The list of APIs impacted by this change is as follows: + - Stream Management Related APIs + * `hipStreamGetCaptureInfo` + * `hipStreamGetPriority` + * `hipStreamGetFlags` + * `hipStreamDestroy` + * `hipStreamAddCallback` + * `hipStreamQuery` + * `hipLaunchHostFunc` + - Graph Management Related APIs + * `hipGraphUpload` + * `hipGraphLaunch` + * `hipStreamBeginCaptureToGraph` + * `hipStreamBeginCapture` + * `hipStreamIsCapturing` + * `hipStreamGetCaptureInfo` + * `hipGraphInstantiateWithParams` + - Memory Management Related APIs + * `hipMemcpyPeerAsync` + * `hipMemcpy2DValidateParams` + * `hipMallocFromPoolAsync` + * `hipFreeAsync` + * `hipMallocAsync` + * `hipMemcpyAsync` + * `hipMemcpyToSymbolAsync` + * `hipStreamAttachMemAsync` + * `hipMemPrefetchAsync` + * `hipDrvMemcpy3D` + * `hipDrvMemcpy3DAsync` + * `hipDrvMemcpy2DUnaligned` + * `hipMemcpyParam2D` + * `hipMemcpyParam2DAsync` + * `hipMemcpy2DArrayToArray` + * `hipMemcpy2D` + * `hipMemcpy2DAsync` + * `hipMemcpy3D` + - Event Management Related APIs + * `hipEventRecord` + * `hipEventRecordWithFlags` +* `warpSize` Change +In order to match the CUDA specification, the `warpSize` variable is no longer `constexpr`. In general, this should be a transparent change; however, if an application was using `warpSize` as a compile-time constant, it will have to be updated to handle the new definition. For more information, see the discussion of `warpSize` in the [HIP C++ language extensions](https://rocm.docs.amd.com/projects/HIP/en/latest/how-to/hip_cpp_language_extensions.html#warpsize). + +#### Optimized + +HIP runtime has the following functional improvements which greatly improve runtime performance and user experience. + +* Reduced usage of the lock scope in events and kernel handling. + - Switches to `shared_mutex` for event validation, uses `std::unique_lock` in HIP runtime to create/destroy event, instead of `scopedLock`.
+ - Reduces the `scopedLock` in handling of kernel execution. HIP runtime now calls `scopedLock` during kernel binary creation/initialization, doesn't call it again during kernel vector iteration before launch. +* Implementation of unifying managed buffer and kernel argument buffer so HIP runtime doesn't need to create/load a separate kernel argument buffer. +* Refactored memory validation, creates a unique function to validate a variety of memory copy operations. +* Improved kernel logging using demangling shader names. +* Advanced support for SPIRV, now kernel compilation caching is enabled by default. This feature is controlled by the environment variable `AMD_COMGR_CACHE`, for details, see [hip_rtc document](https://rocm.docs.amd.com/projects/HIP/en/latest/how-to/hip_rtc.html). +* Programmatic support for scratch limits on MI300 and MI350 series up GPU devices. More enumeration values were added in `hipLimit_t` as following, + - `hipExtLimitScratchMin`, minimum allowed value in bytes for scratch limit on the device. + - `hipExtLimitScratchMax`, maximum allowed value in bytes for scratch limit on the device. + - `hipExtLimitScratchCurrent`, current scratch limit threshold in bytes on the device. Must be between the value `hipExtLimitScratchMin` and `hipExtLimitScratchMax`. + Developers can now use the environment variable `HSA_SCRATCH_SINGLE_LIMIT_ASYNC` to change the default allocation size with expected scratch limit in ROCR runtime. On top of it, this value can also be overwritten programmatically in the application using the HIP API `hipDeviceSetLimit(hipExtLimitScratchCurrent, value)` to reset the scratch limit value. +* HIP runtime now enables peer-to-peer (P2P) memory copies to utilize all available SDMA engines, rather than being limited to a single engine. It also selects the best engine first to give optimal bandwidth. +* Improved launch latency for `D2D` copies and `memset` on MI300 series. 
+* Memory manager was implemented to improve the efficiency of memory usage and speed-up memory allocation/free in memory pools. +* Introduced a threshold to handle the command submission patch to the GPU device(s), considering the synchronization with CPU, for performance improvement. + +#### Resolved issues + +* Error of "unable to find modules" in HIP clean up for code object module. +* The issue of incorrect return error `hipErrorNoDevice`, when a crash occurred on GPU device due to illegal operation or memory violation. HIP runtime now handles the failure on the GPU side properly and reports the precise error code based on the last error seen on the GPU. +* Failures in some framework test applications, HIP runtime fixed the bug in retrieving a memory object from the IPC memory handle. +* A crash in TensorFlow related application. HIP runtime now combines multiple definitions of `callbackQueue` into a single function, in case of an exception, passes its handler to the application and provides corresponding error code. +* Fixed issue of handling the kernel parameters for the graph launch. +* Failures in roc-obj tools. HIP runtime now makes `DEPRECATED` message in roc-obj tools as `STDERR`. + +### **hipBLAS** (3.0.0) + +#### Added + +* Added the `hipblasSetWorkspace()` API. +* Support for codecoverage tests. + +#### Changed + +* HIPBLAS_V2 API is the only available API using the `hipComplex` and `hipDatatype` types. +* Documentation updates. +* Verbose compilation for `hipblas.cpp`. + +#### Removed + +* `hipblasDatatype_t` type. +* `hipComplex` and `hipDoubleComplex` types. +* Support code for non-production gfx targets. + +#### Resolved issues + +* The build time `CMake` configuration for the dependency on `hipBLAS-common` is fixed. +* Compiler warnings for unhandled enumerations have been resolved. 
+ +### **hipBLASLt** (1.0.0) + +#### Added + +* Stream-K GEMM support has been enabled for the `FP32`, `FP16`, `BF16`, `FP8`, and `BF8` data types on the Instinct MI300A APU. To activate this feature, set the `TENSILE_SOLUTION_SELECTION_METHOD` environment variable to `2`, for example, `export TENSILE_SOLUTION_SELECTION_METHOD=2`. +* Fused Swish/SiLU GEMM in hipBLASLt (enabled by ``HIPBLASLT_EPILOGUE_SWISH_EXT`` and ``HIPBLASLT_EPILOGUE_SWISH_BIAS_EXT``) +* Added support for ``HIPBLASLT_EPILOGUE_GELU_AUX_BIAS`` for gfx942. +* Added `HIPBLASLT_TUNING_USER_MAX_WORKSPACE` to constrain the maximum workspace size for user offline tuning. +* Added ``HIPBLASLT_ORDER_COL16_4R16`` and ``HIPBLASLT_ORDER_COL16_4R8`` to ``hipblasLtOrder_t`` to support `FP16`/`BF16` swizzle GEMM and `FP8`/`BF8` swizzle GEMM respectively. +* Added TF32 emulation on gfx950. +* Added support for `FP6`, `BF6`, and `FP4` on gfx950 +* Added support for block scaling by setting `HIPBLASLT_MATMUL_DESC_A_SCALE_MODE` and `HIPBLASLT_MATMUL_DESC_B_SCALE_MODE` to `HIPBLASLT_MATMUL_MATRIX_SCALE_VEC32_UE8M0`. + +#### Changed + +* ``HIPBLASLT_MATMUL_DESC_A_SCALE_POINTER_VEC_EXT`` and ``HIPBLASLT_MATMUL_DESC_B_SCALE_POINTER_VEC_EXT`` are removed. Use the ``HIPBLASLT_MATMUL_DESC_A_SCALE_MODE`` and ``HIPBLASLT_MATMUL_DESC_B_SCALE_MODE`` attributes to set scalar (``HIPBLASLT_MATMUL_MATRIX_SCALE_SCALAR_32F``) or vector (``HIPBLASLT_MATMUL_MATRIX_SCALE_OUTER_VEC_32F``) attributes. +* The non-V2 APIs (``GemmPreference``, ``GemmProblemType``, ``GemmEpilogue``, ``GemmTuning``, ``GemmInputs``) in the cpp header are now the same as the V2 APIs (``GemmPreferenceV2``, ``GemmProblemTypeV2``, ``GemmEpilogueV2``, ``GemmTuningV2``, ``GemmInputsV2``). The original non-V2 APIs are removed. +* The `hipblasltExtAMaxWithScale` API is removed. + +#### Optimized + +* Improved performance for 8-bit (`FP8`/`BF8`/`I8`) NN/NT cases by adding ``s_delay_alu`` to reduce stalls from dependent ALU operations on gfx12+. 
+* Improved performance for 8-bit and 16-bit (`FP16`/`BF16`) TN cases by enabling software dependency checks (Expert Scheduling Mode) under certain restrictions to reduce redundant hardware dependency checks on gfx12+. +* Improved performance for 8-bit, 16-bit, and 32-bit batched GEMM with a better heuristic search algorithm for gfx942. + +#### Upcoming changes + +* V2 APIs (``GemmPreferenceV2``, ``GemmProblemTypeV2``, ``GemmEpilogueV2``, ``GemmTuningV2``, ``GemmInputsV2``) are deprecated. + +### **hipCUB** (4.0.0) + +#### Added + +* Added a new cmake option, `BUILD_OFFLOAD_COMPRESS`. When hipCUB is built with this option enabled, the `--offload-compress` switch is passed to the compiler. This causes the compiler to compress the binary that it generates. Compression can be useful in cases where you are compiling for a large number of targets, since this often results in a large binary. Without compression, in some cases, the generated binary may become so large that symbols are placed out of range, resulting in linking errors. The new `BUILD_OFFLOAD_COMPRESS` option is set to `ON` by default. +* Added single pass operators in `agent/single_pass_scan_operators.hpp` which contains the following API: + * `BlockScanRunningPrefixOp` + * `ScanTileStatus` + * `ScanTileState` + * `ReduceByKeyScanTileState` + * `TilePrefixCallbackOp` +* Added gfx950 support. +* Added an overload of `BlockScan::InclusiveScan` that accepts an initial value to seed the scan. +* Added an overload of `WarpScan::InclusiveScan` that accepts an initial value to seed the scan. +* `UnrolledThreadLoad`, `UnrolledCopy`, and `ThreadLoadVolatilePointer` were added to align hipCUB with CUB. +* `ThreadStoreVolatilePtr` and the `IterateThreadStore` struct were added to align hipCUB with CUB. +* Added `hipcub::InclusiveScanInit` for CUB parity. + +#### Removed + +* The AMD GPU targets `gfx803` and `gfx900` are no longer built by default.
To build for these architectures, specify them explicitly in the `AMDGPU_TARGETS` cmake option. +* Deprecated `hipcub::AsmThreadLoad` is removed, use `hipcub::ThreadLoad` instead. +* Deprecated `hipcub::AsmThreadStore` is removed, use `hipcub::ThreadStore` instead. +* Deprecated `BlockAdjacentDifference::FlagHeads`, `BlockAdjacentDifference::FlagTails` and `BlockAdjacentDifference::FlagHeadsAndTails` have been removed. +* This release removes support for custom builds on gfx940 and gfx941. +* Removed C++14 support, only C++17 is supported. + +#### Changed + +* The NVIDIA backend now requires CUB, Thrust, and libcu++ 2.7.0. If they aren't found, they will be downloaded from the NVIDIA CCCL repository. +* Updated `thread_load` and `thread_store` to align hipCUB with CUB. +* All kernels now have hidden symbol visibility. All symbols now have inline namespaces that include the library version (for example, `hipcub::HIPCUB_300400_NS::symbol` instead of `hipcub::symbol`), letting the user link multiple libraries built with different versions of hipCUB. +* Modified the broadcast kernel in warp scan benchmarks. The reported performance may be different to previous versions. +* The `hipcub::detail::accumulator_t` in rocPRIM backend has been changed to utilise `rocprim::accumulator_t`. +* The usage of `rocprim::invoke_result_binary_op_t` has been replaced with `rocprim::accumulator_t`. + +#### Resolved issues + +* Fixed an issue where `Sort(keys, compare_op, valid_items, oob_default)` in `block_merge_sort.hpp` would not fill in elements that are out of range (items after `valid_items`) with `oob_default`. +* Fixed an issue where `ScatterToStripedFlagged` in `block_exchange.hpp` was calling the wrong function. + +#### Known issues + +* `BlockAdjacentDifference::FlagHeads`, `BlockAdjacentDifference::FlagTails` and `BlockAdjacentDifference::FlagHeadsAndTails` have been removed from hipCUB's CUB backend.
They were already deprecated as of version 2.12.0 of hipCUB and they were removed from CCCL (CUB) as of CCCL's 2.6.0 release. +* `BlockScan::InclusiveScan` for the NVIDIA backend does not compute the block aggregate correctly when passing an initial value parameter. This behavior is not matched by the AMD backend. + +#### Upcoming changes + +* `BlockAdjacentDifference::FlagHeads`, `BlockAdjacentDifference::FlagTails` and `BlockAdjacentDifference::FlagHeadsAndTails` were deprecated as of version 2.12.0 of hipCUB, and will be removed from the rocPRIM backend in a future release for the next ROCm major version (ROCm 7.0.0). + +### **hipFFT** (1.0.20) + +#### Added + +* Added gfx950 support. + +#### Removed + +* Removed hipfft-rider legacy compatibility from clients. +* Removed support for the gfx940 and gfx941 targets from the client programs. +* Removed backward compatibility symlink for include directories. + +### **hipfort** (0.7.0) + +#### Added + +* Added documentation clarifying how hipfort is built for the NVIDIA platform. + +#### Changed + +* Updated and reorganized documentation for clarity and consistency. + +### **HIPIFY** (7.0.0) + +#### Added + +* CUDA 12.9.1 support +* cuDNN 9.11.0 support +* cuTENSOR 2.2.0.0 support +* LLVM 20.1.8 support + +#### Resolved issues + +* `hipDNN` support is removed by default +* [#1859](https://github.com/ROCm/HIPIFY/issues/1859)[hipify-perl] Fix warnings on unsupported Driver or Runtime APIs which were erroneously not reported +* [#1930](https://github.com/ROCm/HIPIFY/issues/1930) Revise `JIT API` +* [#1962](https://github.com/ROCm/HIPIFY/issues/1962) Support for cuda-samples helper headers +* [#2035](https://github.com/ROCm/HIPIFY/issues/2035) Remove `const_cast<const char**>` in `hiprtcCreateProgram` and `hiprtcCompileProgram` + +### **hipRAND** (3.0.0) + +#### Added + +* gfx950 support. + +#### Changed + +* Deprecated the hipRAND Fortran API in favor of hipfort. 
+ +#### Removed + +* Removed C++14 support, so only C++17 is supported. + +### **hipSOLVER** (3.0.0) + +#### Added + +* Added compatibility-only functions + * csrlsvqr + * hipsolverSpCcsrlsvqr, hipsolverSpZcsrlsvqr + +#### Resolved issues + +* Corrected the value of `lwork` returned by various `bufferSize` functions to be consistent with NVIDIA cuSOLVER. The following functions now return `lwork` so that the workspace size (in bytes) is `sizeof(T) * lwork`, rather than `lwork`. To restore the original behavior, set the environment variable `HIPSOLVER_BUFFERSIZE_RETURN_BYTES`. + * `hipsolverXorgbr_bufferSize`, `hipsolverXorgqr_bufferSize`, `hipsolverXorgtr_bufferSize`, `hipsolverXormqr_bufferSize`, `hipsolverXormtr_bufferSize`, `hipsolverXgesvd_bufferSize`, `hipsolverXgesvdj_bufferSize`, `hipsolverXgesvdBatched_bufferSize`, `hipsolverXgesvdaStridedBatched_bufferSize`, `hipsolverXsyevd_bufferSize`, `hipsolverXsyevdx_bufferSize`, `hipsolverXsyevj_bufferSize`, `hipsolverXsyevjBatched_bufferSize`, `hipsolverXsygvd_bufferSize`, `hipsolverXsygvdx_bufferSize`, `hipsolverXsygvj_bufferSize`, `hipsolverXsytrd_bufferSize`, `hipsolverXsytrf_bufferSize`. + +### **hipSPARSE** (4.0.1) + +#### Added + +* Added the `int8`, `int32`, and `float16` data types to `hipDataTypeToHCCDataType` so that sparse matrix descriptors can be used with them. +* Added half float mixed precision to `hipsparseAxpby` where X and Y use `float16` and the result and compute type use `float`. +* Added half float mixed precision to `hipsparseSpVV` where X and Y use `float16` and the result and compute type use `float`. +* Added half float mixed precision to `hipsparseSpMM` where A and B use `float16` and C and the compute type use `float`. +* Added half float mixed precision to `hipsparseSDDMM` where A and B use `float16` and C and the compute type use `float`. +* Added half float uniform precision to the `hipsparseScatter` and `hipsparseGather` routines. 
+* Added half float uniform precision to the `hipsparseSDDMM` routine. +* Added `int8` precision to the `hipsparseCsr2cscEx2` routine. +* Added the `almalinux` operating system name to correct the GFortran dependency. + +#### Changed + +* Switched to defaulting to C++17 when building hipSPARSE from source. Previously hipSPARSE was using C++14 by default. + +#### Resolved issues + +* Fixed a compilation [issue](https://github.com/ROCm/hipSPARSE/issues/555) related to using `std::filesystem` and C++14. +* Fixed an issue where the clients-common package was empty by moving the `hipsparse_clientmatrices.cmake` and `hipsparse_mtx2csr` files to it. + +#### Known issues + +* In `hipsparseSpSM_solve()`, the external buffer is passed as a parameter. This does not match the NVIDIA CUDA cuSPARSE API. This extra external buffer parameter will be removed in a future release. For now, this extra parameter can be ignored and nullptr passed in, because it is unused internally. + +### **hipSPARSELt** (0.2.4) + +#### Added + +* Support for the LLVM target gfx950. +* Support for the following data type combinations for the LLVM target gfx950: + * FP8(E4M3) inputs, F32 output, and F32 Matrix Core accumulation. + * BF8(E5M2) inputs, F32 output, and F32 Matrix Core accumulation. +* Support for ROC-TX if `HIPSPARSELT_ENABLE_MARKER=1` is set. +* Support for the cuSPARSELt v0.6.3 backend. + +#### Removed + +* Support for LLVM targets gfx940 and gfx941 has been removed. +* `hipsparseLtDatatype_t` has been removed. + +#### Optimized + +* Improved the library loading time. +* Provided more kernels for the `FP16` data type. + +### **hipTensor** (2.0.0) + +#### Added + +* Added element-wise binary operation support. +* Added element-wise trinary operation support. +* Added support for new GPU target gfx950. +* Added dynamic unary and binary operator support for element-wise operations and permutation. +* Added a CMake check for `f8` datatype availability. 
+* Added `hiptensorDestroyOperationDescriptor` to free all resources related to the provided descriptor. +* Added `hiptensorOperationDescriptorSetAttribute` to set attribute of a `hiptensorOperationDescriptor_t` object. +* Added `hiptensorOperationDescriptorGetAttribute` to retrieve an attribute of the provided `hiptensorOperationDescriptor_t` object. +* Added `hiptensorCreatePlanPreference` to allocate the `hiptensorPlanPreference_t` and enabled users to limit the applicable kernels for a given plan or operation. +* Added `hiptensorDestroyPlanPreference` to free all resources related to the provided preference. +* Added `hiptensorPlanPreferenceSetAttribute` to set attribute of a `hiptensorPlanPreference_t` object. +* Added `hiptensorPlanGetAttribute` to retrieve information about an already-created plan. +* Added `hiptensorEstimateWorkspaceSize` to determine the required workspaceSize for the given operation. +* Added `hiptensorCreatePlan` to allocate a `hiptensorPlan_t` object, select an appropriate kernel for a given operation and prepare a plan that encodes the execution. +* Added `hiptensorDestroyPlan` to free all resources related to the provided plan. + +#### Changed + +* Removed architecture support for gfx940 and gfx941. +* Generalized opaque buffer now for any descriptor. +* Replaced `hipDataType` with `hiptensorDataType_t` for all supported types, for example, `HIP_R_32F` to `HIPTENSOR_R_32F`. +* Replaced `hiptensorComputeType_t` with `hiptensorComputeDescriptor_t` for all supported types. +* Replaced `hiptensorInitTensorDescriptor` with `hiptensorCreateTensorDescriptor`. +* Changed handle type and API usage from `*handle` to `handle`. +* Replaced `hiptensorContractionDescriptor_t` with `hipTensorOperationDescriptor_t`. +* Replaced `hiptensorInitContractionDescriptor` with `hiptensorCreateContraction`. +* Replaced `hiptensorContractionFind_t` with `hiptensorPlanPreference_t`. 
+* Replaced `hiptensorInitContractionFind` with `hiptensorCreatePlanPreference`. +* Replaced `hiptensorContractionGetWorkspaceSize` with `hiptensorEstimateWorkspaceSize`. +* Replaced `HIPTENSOR_WORKSPACE_RECOMMENDED` with `HIPTENSOR_WORKSPACE_DEFAULT`. +* Replaced `hiptensorContractionPlan_t` with `hiptensorPlan_t`. +* Replaced `hiptensorInitContractionPlan` with `hiptensorCreatePlan`. +* Replaced `hiptensorContraction` with `hiptensorContract`. +* Replaced `hiptensorPermutation` with `hiptensorPermute`. +* Replaced `hiptensorReduction` with `hiptensorReduce`. +* Replaced `hiptensorElementwiseBinary` with `hiptensorElementwiseBinaryExecute`. +* Replaced `hiptensorElementwiseTrinary` with `hiptensorElementwiseTrinaryExecute`. +* Removed function `hiptensorReductionGetWorkspaceSize`. + +### **MIOpen** (3.5.0) + +#### Added + +* [Conv] Added misa kernels for gfx950. +* [Conv] Enabled Split-K support for CK backward data solvers (2D). +* [Conv] Enabled CK wrw solver on gfx950 for the `BF16` data type. +* [BatchNorm] Enabled NHWC in OpenCL. +* Added grouped convolution + activation fusion. +* Added grouped convolution + bias + activation fusion. +* Composable Kernel (CK) can now be built inline as part of MIOpen. + +#### Changed + +* Changed to using the median value with outliers removed when deciding on the best solution to run. +* [Conv] Updated the igemm asm solver. + +#### Optimized + +* [BatchNorm] Optimized NHWC OpenCL kernels and improved heuristics +* [RNN] Dynamic algorithm optimization. +* [Conv] Eliminated redundant clearing of output buffers +* [RNN] Updated selection heuristics. +* Updated tuning for the AMD Instinct MI300 series. + +#### Resolved issues + +* Fixed a segmentation fault when the user specified a smaller workspace than what was required. +* Fixed a layout calculation logic error that returned incorrect results and enabled less restrictive layout selection. +* Fixed memory access faults in misa kernels due to out-of-bounds memory usage. 
+* Fixed a performance drop on the gfx950 due to transpose kernel use. +* Fixed a memory access fault caused by not allocating enough workspace. +* Fixed a name typo that caused kernel mismatches and long startup times. + +### **MIVisionX** (3.3.0) + +#### Changed + +* VX_RPP extension : Version 3.1.0 release +* Add support to enable/disable BatchPD code in VX_RPP extensions by checking the RPP_LEGACY_SUPPORT flag. +* Update the parameters and kernel API of Blur, Fog, Jitter, LensCorrection, Rain, Pixelate, Vignette and ResizeCrop wrt tensor kernels replacing the legacy BatchPD API calls in VX_RPP extensions. + +#### Known issues + +* Installation on CentOS/RedHat/SLES requires the manual installation of the `FFMPEG` & `OpenCV` dev packages. + +#### Upcoming changes + +* Optimized audio augmentations support for VX_RPP + +### **RCCL** (2.26.6) + +#### Added + +* Added support for the extended fine-grained system memory pool. +* Added support for gfx950. +* Added support for `unroll=1` in device-code generation to improve performance. +* Set a default of 112 channels for a single node with `8 * gfx950`. +* Enabled LL128 protocol on the gfx950. +* Added the ability to choose the unroll factor at runtime using `RCCL_UNROLL_FACTOR`. This can be set at runtime to 1, 2, or 4. This change currently increases compilation and linking time because it triples the number of kernels generated. +* Added MSCCL support for AllGather multinode gfx942/gfx950 (for instance, 16 and 32 GPUs). To enable this feature, set the environment variable `RCCL_MSCCL_FORCE_ENABLE=1`. The maximum message size for MSCCL AllGather usage is `12292 * sizeof(datatype) * nGPUs`. +* Thread thresholds for LL/LL128 are selected in Tuning Models for the AMD Instinct MI300X. This impacts the number of channels used for AG and RS. The channel tuning model is bypassed if `NCCL_THREAD_THRESHOLDS`, `NCCL_MIN_NCHANNELS`, or `NCCL_MAX_NCHANNELS` are set. 
+* Multi-node tuning for AllGather, AllReduce, and ReduceScatter that leverages LL/LL64/LL128 protocols to use nontemporal vector load/store for tunable message size ranges. +* LL/LL128 usage ranges for AR, AG, and RS are part of the tuning models, which enable architecture-specific tuning in conjunction with the existing Rome Models scheme in RCCL. +* Two new APIs are exposed as part of an initiative to separate RCCL code. These APIs are `rcclGetAlgoInfo` and `rcclFuncMaxSendRecvCount`. However, user-level invocation requires that RCCL be built with `RCCL_EXPOSE_STATIC` enabled. + +#### Changed + +* Compatibility with NCCL 2.23.4. +* Compatibility with NCCL 2.24.3. +* Compatibility with NCCL 2.25.1. +* Compatibility with NCCL 2.26.6. + +#### Resolved issues + +* Resolved an issue when using more than 64 channels when multiple collectives are used in the same `ncclGroup()` call. +* Fixed unit test failures in tests ending with the `ManagedMem` and `ManagedMemGraph` suffixes. +* Fixed a suboptimal algorithmic switching point for AllReduce on the AMD Instinct MI300X. +* Fixed the known issue "When splitting a communicator using `ncclCommSplit` in some GPU configurations, MSCCL initialization can cause a segmentation fault." with a design change to use `comm` instead of `rank` for `mscclStatus`. The Global map for `comm` to `mscclStatus` is still not thread safe but should be explicitly handled by mutexes for read-write operations. This is tested for correctness, but there is a plan to use a thread-safe map data structure in an upcoming release. + +### **rocAL** (2.3.0) + +#### Added +* Extended support to rocAL's video decoder to use rocDecode hardware decoder. +* Setup - installs rocdecode dev packages for Ubuntu, RedHat, and SLES. +* Setup - installs turbojpeg dev package for Ubuntu and Redhat. +* rocAL's image decoder has been extended to support the rocJPEG hardware decoder. +* Added numpy reader support for reading npy files in rocAL. 
+* Added test case for numpy reader in C++ and python tests. + +#### Resolved issues +* `TurboJPEG` no longer needs to be installed manually. It is now installed by the package installer. +* Hardware decode no longer requires that ROCm be installed with the `graphics` usecase. + +#### Known issues +* Package installation on SLES requires manually installing `TurboJPEG`. +* Package installation on CentOS, RedHat, and SLES requires manually installing the `FFMPEG Dev` package. + +#### Upcoming changes + +* rocJPEG support for JPEG decode. + +### **rocALUTION** (4.0.0) + +#### Added + +* Added support for gfx950. + +#### Changed + +* Switch to defaulting to C++17 when building rocALUTION from source. Previously rocALUTION was using C++14 by default. + +#### Optimized + +* Improved the user documentation + +#### Resolved issues + +* Fix for GPU hashing algorithm when not compiling with -O2/O3 + +### **rocBLAS** (5.0.0) + +#### Added + +* gfx950 support. +* Internal API logging for `gemm` debugging using `ROCBLAS_LAYER = 8`. +* Support for AOCL 5.0 gcc build as a client reference library. +* Allowing the use of `PkgConfig` for client reference library fallback detection. + +#### Changed + +* `CMAKE_CXX_COMPILER` is now passed on during compilation for a Tensile build. +* The default atomics mode is changed from `allowed` to `not allowed`. + +#### Optimized + +* Optimized `gemm` by using `gemv` kernels when applicable. +* Optimized `gemv` for small `m` and `n` with a large batch count on gfx942. +* Improved the performance of Level 1 `dot` for all precisions and variants when `N > 100000000` on gfx942. +* Improved the performance of Level 1 `asum` and `nrm2` for all precisions and variants on gfx942. +* Improved the performance of Level 2 `sger` (single precision) on gfx942. +* Improved the performance of Level 3 `dgmm` for all precisions and variants on gfx942. + +#### Removed + +* Support code for non-production gfx targets. 
+* `rocblas_hgemm_kernel_name`, `rocblas_sgemm_kernel_name`, and `rocblas_dgemm_kernel_name` API functions. +* The use of `warpSize` as a constexpr. +* The use of deprecated behavior of `hipPeekLastError`. +* `rocblas_float8.h` and `rocblas_hip_f8_impl.h` files. +* `rocblas_gemm_ex3`, `rocblas_gemm_batched_ex3`, and `rocblas_gemm_strided_batched_ex3` API functions. + +#### Resolved issues + +* Fixed environment variable path-based logging to append multiple handle outputs to the same file. +* Support numerics when `trsm` is running with `rocblas_status_perf_degraded`. +* Fixed the build dependency installation of `joblib` on some operating systems. +* Return `rocblas_status_internal_error` when `rocblas_[set,get]_[matrix,vector]` is called with a host pointer in place of a device pointer. +* Reduced the default verbosity level for internal GEMM backend information. +* Updated from the deprecated rocm-cmake to ROCmCMakeBuildTools. +* Corrected AlmaLinux GFortran package dependencies. + +#### Upcoming changes + +* Deprecated the use of negative indices to indicate the default solution is being used for `gemm_ex` with `rocblas_gemm_algo_solution_index`. + +### **rocDecode** (1.0.0) + +#### Added + +* VP9 IVF container file parsing support in bitstream reader. +* CTest for VP9 decode on bitstream reader. +* HEVC/AVC/AV1/VP9 stream syntax error handling. +* HEVC stream bit depth change handling and DPB buffer size change handling through decoder reconfiguration. +* AVC stream DPB buffer size change handling through decoder reconfiguration. +* rocdecode now uses the CMake `CMAKE_PREFIX_PATH` directive. +* rocdecode - A new avcodec-based decoder built as a separate ``rocdecode-host`` library. + +#### Optimized + +* Decode session start latency reduction. +* Bitstream type detection optimization in bitstream reader. + +#### Resolved issues + +* Fixed a bug in the picture files sample ``videoDecodePicFiles`` that could result in an incorrect output frame count.
+* Fixed a decoded frame output issue in video size change cases. +* Removed incorrect asserts of bitdepth_minus_8 in GetBitDepth() and num_chroma_planes in GetNumChromaPlanes() API calls in RocVideoDecoder utility class. + +#### Removed + +* GetStream() interface call from RocVideoDecoder utility class. + +#### Changed + +* Changed asserts in query API calls in RocVideoDecoder utility class to error reports, to avoid a hard stop during a query if an error occurs and to let the caller decide what action to take. +* `libdrm_amdgpu` is now explicitly linked with rocdecode. + +### **rocFFT** (1.0.34) + +#### Added + +* Added gfx950 support. + +#### Removed + +* Removed ``rocfft-rider`` legacy compatibility from clients. +* Removed support for the gfx940 and gfx941 targets from the client programs. +* Removed backward compatibility symlink for include directories. + +#### Optimized + +* Removed unnecessary HIP event/stream allocation and synchronization during MPI transforms. +* Implemented single-precision 1D kernels for lengths: + - 4704 + - 5488 + - 6144 + - 6561 + - 8192 +* Implemented single-kernel plans for some large 1D problem sizes, on devices with at least 160KiB of LDS. + +#### Resolved issues + +* Fixed kernel faults on multi-device transforms that gather to a single device, when the input/output bricks are not + contiguous. + +### **rocJPEG** (1.1.0) + +#### Added +* cmake config files. +* CTEST - New tests were introduced for JPEG batch decoding using various output formats, such as yuv_planar, y, rgb, and rgb_planar, both with and without region-of-interest (ROI). + +#### Changed +* Readme - cleanup and updates to pre-reqs. +* The `decode_params` argument of the `rocJpegDecodeBatched` API is now an array of `RocJpegDecodeParams` structs representing the decode parameters for the batch of JPEG images. +* `libdrm_amdgpu` is now explicitly linked with rocjpeg. + +#### Removed +* Dev Package - No longer installs pkg-config.
+ +#### Resolved issues +* Fixed a bug that prevented copying the decoded image into the output buffer when the output buffer is larger than the input image. +* Resolved an issue with resizing the internal memory pool by utilizing the explicit constructor of the vector's type during the resizing process. +* Addressed and resolved CMake configuration warnings. + +### **ROCm SMI** (7.8.0) + +#### Added + +- Support for GPU metrics 1.8. + - Added new fields for `rsmi_gpu_metrics_t` including: + - Adding the following metrics to allow new calculations for violation status: + - Per XCP metrics `gfx_below_host_limit_ppt_acc[XCP][MAX_XCC]` - GFX Clock Host limit Package Power Tracking violation counts + - Per XCP metrics `gfx_below_host_limit_thm_acc[XCP][MAX_XCC]` - GFX Clock Host limit Thermal (TVIOL) violation counts + - Per XCP metrics `gfx_low_utilization_acc[XCP][MAX_XCC]` - violation counts for low utilization causing the GPU to be below application clocks. + - Per XCP metrics `gfx_below_host_limit_total_acc[XCP][MAX_XCC]` - violation counts for how long the GPU was held below application clocks by any limiter (see above new violation metrics). + - Increasing available JPEG engines to 40. + Current ASICs may not support all 40. These will be indicated as UINT16_MAX or N/A in CLI. + +#### Removed + +- Removed backwards compatibility for `rsmi_dev_gpu_metrics_info_get()`'s `jpeg_activity` and `vcn_activity` fields. Alternatively use `xcp_stats.jpeg_busy` and `xcp_stats.vcn_busy`. + - Backwards compatibility is removed for `jpeg_activity` and `vcn_activity` fields, if the `jpeg_busy` or `vcn_busy` field is available. + - Providing both `vcn_activity`/`jpeg_activity` and XCP (partition) stats `vcn_busy`/`jpeg_busy` caused confusion for users about which field to use. By removing backward compatibility, it is easier to identify the relevant field.
+ - The `jpeg_busy` field increased in size (for supported ASICs), making backward compatibility unable to fully copy the structure into `jpeg_activity`. + +```{note} +See the full [ROCm SMI changelog](https://github.com/ROCm/rocm_smi_lib/blob/release/rocm-rel-7.0/CHANGELOG.md) for details, examples, and in-depth descriptions. +``` + +### **ROCm Compute Profiler** (3.2.1) + +#### Added + +##### CDNA4 (AMD Instinct MI350/MI355) support + +* Support for AMD Instinct MI350 series GPUs with the addition of the following counters: + * VALU co-issue (Two VALUs are issued instructions) efficiency + * Stream Processor Instruction (SPI) Wave Occupancy + * Scheduler-Pipe Wave Utilization + * Scheduler FIFO Full Rate + * CPC ADC Utilization + * F6F4 data type metrics + * Update formula for total FLOPs while taking into account F6F4 ops + * LDS STORE, LDS LOAD, LDS ATOMIC instruction count metrics + * LDS STORE, LDS LOAD, LDS ATOMIC bandwidth metrics + * LDS FIFO full rate + * Sequencer -> TA ADDR Stall rates + * Sequencer -> TA CMD Stall rates + * Sequencer -> TA DATA Stall rates + * L1 latencies + * L2 latencies + * L2 to EA stalls + * L2 to EA stalls per channel + +* Roofline support for AMD Instinct MI350 series architecture. + +##### Textual User Interface (TUI) (beta version) + +* Text User Interface (TUI) support for analyze mode + * A command line based user interface to support interactive single-run analysis + * To launch, use `--tui` option in analyze mode. For example, ``rocprof-compute analyze --tui``. + +##### PC Sampling (beta version) + +* Stochastic (hardware-based) PC sampling has been enabled for AMD Instinct MI300X series and later accelerators. + +* Host-trap PC Sampling has been enabled for AMD Instinct MI200 series and later accelerators. + +* Support for sorting of PC sampling by type: offset or count. + +* PC Sampling Support on CLI and TUI analysis. + +##### Roofline + +* Support for Roofline plot on CLI (single run) analysis. 
+ +* Roofline support for RHEL 10 OS. + +* FP4 and FP6 data types have been added for roofline profiling on AMD Instinct MI350 series. + +##### rocprofv3 support + +* ``rocprofv3`` is supported as the default backend for profiling. +* Support to obtain performance information for all channels for TCC counters. +* Support for profiling on AMD Instinct MI100 using ``rocprofv3``. +* Deprecation warning for ``rocprofv3`` interface in favor of the ROCprofiler-SDK interface, which directly accesses ``rocprofv3`` C++ tool. + +##### Others + +* Docker files to package the application and dependencies into a single portable and executable standalone binary file. + +* Analysis report based filtering + * ``-b`` option in profile mode now also accepts metric id(s) for analysis report based filtering. + * ``-b`` option in profile mode also accepts hardware IP block for filtering; however, this filter support will be deprecated soon. + * ``--list-metrics`` option added in profile mode to list possible metric id(s), similar to analyze mode. + +* Support MEM chart on CLI (single run) + +* ``--specs-correction`` option to provide missing system specifications for analysis. + +#### Changed + +* Changed the default ``rocprof`` version to ``rocprofv3``. This is used when environment variable ``ROCPROF`` is not set. +* Changed ``normal_unit`` default to ``per_kernel``. +* Decreased profiling time by not collecting unused counters in post-analysis. +* Updated Dash to >=3.0.0 (for web UI). +* Changed the condition when Roofline PDFs are generated during general profiling and ``--roof-only`` profiling (skip only when ``--no-roof`` option is present). +* Updated Roofline binaries: + * Rebuilt using the latest ROCm stack. + * The minimum supported OS distributions for the roofline feature are now Ubuntu 22.04, RHEL 8, and SLES15 SP6. + +#### Removed + +* Roofline support for Ubuntu 20.04 and SLES below 15.6. +* Removed support for AMD Instinct MI50 and MI60.
+ +#### Optimized + +* ROCm Compute Profiler CLI has been improved to better display the GPU architecture analytics. + +#### Resolved issues + +* Fixed kernel name and kernel dispatch filtering when using ``rocprofv3``. +* Fixed an issue of TCC channel counters collection in ``rocprofv3``. +* Fixed peak FLOPS of F8, I8, F16, and BF16 on AMD Instinct MI300. +* Fixed an issue where the memory clock was not detected when using amd-smi. +* Fixed a crash in the standalone GUI. +* Fixed L2 read/write/atomic bandwidths on MI350. + +#### Known issues + +* On AMD Instinct MI100, accumulation counters are not collected, resulting in the following metrics failing to show up in the analysis: Instruction Fetch Latency, Wavefront Occupancy, LDS Latency. + * As a workaround, set the environment variable ``ROCPROF=rocprof`` to use ``rocprof v1`` for profiling on AMD Instinct MI100. + +* GPU id filtering is not supported when using ``rocprofv3``. + +* Analysis of previously collected workload data will not work due to sysinfo.csv schema change. + * As a workaround, re-run the profiling operation for the workload and interrupt the process after 10 seconds. + Then copy the ``sysinfo.csv`` file from the new data folder to the old one. + This assumes your system specification hasn't changed since the creation of the previous workload data. + +* Analysis of new workloads might require providing shader/memory clock speed using the +``--specs-correction`` option if amd-smi or rocminfo does not provide clock speeds. + +* Memory chart on ROCm Compute Profiler CLI might look corrupted if the CLI width is too narrow. + +#### Upcoming changes + +* ``rocprof v1/v2/v3`` interfaces will be removed in favor of the ROCprofiler-SDK interface, which directly accesses ``rocprofv3`` C++ tool. Using ``rocprof v1/v2/v3`` interfaces will trigger a deprecation warning.
+ * To use ROCprofiler-SDK interface, set environment variable `ROCPROF=rocprofiler-sdk` and optionally provide profile mode option ``--rocprofiler-sdk-library-path /path/to/librocprofiler-sdk.so``. Add ``--rocprofiler-sdk-library-path`` runtime option to choose the path to ROCprofiler-SDK library to be used. +* Hardware IP block based filtering using ``-b`` option in profile mode will be removed in favor of analysis report block based filtering using ``-b`` option in profile mode. +* MongoDB database support will be removed, and a deprecation warning has been added to the application interface. +* Usage of ``rocm-smi`` is deprecated in favor of ``amd-smi``, and a deprecation warning has been added to the application interface. + +### **ROCm Data Center Tool** (1.1.0) + +#### Added + +- More profiling and monitoring metrics, especially for AMD Instinct MI300 and newer GPUs. +- Advanced logging and debugging options, including new log levels and troubleshooting guidance. + +#### Changed + +- Completed migration from legacy [ROCProfiler](https://rocm.docs.amd.com/projects/rocprofiler/en/latest/) to [ROCprofiler-SDK](https://rocm.docs.amd.com/projects/rocprofiler-sdk/en/latest/). +- Reorganized the configuration files internally and improved [README/installation](https://github.com/ROCm/rdc/blob/amd-staging/README.md) instructions. +- Updated metrics and monitoring support for the latest AMD data center GPUs. + +#### Optimized + +- Integration with [ROCprofiler-SDK](https://rocm.docs.amd.com/projects/rocprofiler-sdk/en/latest/) for performance metrics collection. +- Standalone and embedded operating modes, including streamlined authentication and configuration options. +- Support and documentation for diagnostic commands and GPU group management. +- [RVS](https://rocm.docs.amd.com/projects/ROCmValidationSuite/en/latest/) test integration and reporting. 
+ +### **ROCm Systems Profiler** (1.1.0) + +#### Added + +- Profiling and metric collection capabilities for VCN engine activity, JPEG engine activity, and API tracing for rocDecode, rocJPEG, and VA-APIs. +- How-to document for VCN and JPEG activity sampling and tracing. +- Support for tracing Fortran applications. +- Support for tracing MPI API in Fortran. + +#### Changed + +- Replaced ROCm SMI backend with AMD SMI backend for collecting GPU metrics. +- ROCprofiler-SDK is now used to trace RCCL API and collect communication counters. +- Updated the Dyninst submodule to v13.0. +- Set the default value of `ROCPROFSYS_SAMPLING_CPUS` to `none`. + +#### Resolved issues + +- Fixed GPU metric collection settings with `ROCPROFSYS_AMD_SMI_METRICS`. +- Fixed a build issue with CMake 4. +- Fixed incorrect kernel names shown for kernel dispatch tracks in Perfetto. +- Fixed formatting of some output logs. + +### **ROCmValidationSuite** (1.2.0) + +#### Added + +- Support for new platforms: MI350X and MI355X. +- Introduced rotating buffer mechanism for GEMM operations. +- Support for read and write tests in Babel. +- Support for new platforms: RX9070 and RX9070GRE. + +#### Changed + +- Migrated SMI API usage from `rocm-smi` to `amd-smi`. +- Updated FP8 GEMM operations to use hipBLASLt instead of rocBLAS. + +### **rocPRIM** (4.0.0) + +#### Added + +* Added `rocprim::accumulator_t` to ensure parity with CCCL. +* Added test for `rocprim::accumulator_t` +* Added `rocprim::invoke_result_r` to ensure parity with CCCL. +* Added function `is_build_in` into `rocprim::traits::get`. +* Added virtual shared memory as a fallback option in `rocprim::device_merge` when it exceeds shared memory capacity, similar to `rocprim::device_select`, `rocprim::device_partition`, and `rocprim::device_merge_sort`, which already include this feature. +* Added initial value support to device level inclusive scans. 
+* Added new optimization to the backend for `device_transform` when the input and output are pointers. +* Added `LoadType` to `transform_config`, which is used for the `device_transform` when the input and output are pointers. +* Added a `rocprim::device_transform` API for n-ary transform operations that takes `n` input iterators inside a `rocprim::tuple`. +* Added gfx950 support. +* Added `rocprim::key_value_pair::operator==`. +* Added the `rocprim::unrolled_copy` thread function to copy multiple items inside a thread. +* Added the `rocprim::unrolled_thread_load` function to load multiple items inside a thread using `rocprim::thread_load`. +* Added `rocprim::int128_t` and `rocprim::uint128_t` to benchmarks for improved performance evaluation on 128-bit integers. +* Added `rocprim::int128_t` to the supported autotuning types to improve performance for 128-bit integers. +* Added the `rocprim::merge_inplace` function for merging in-place. +* Added initial value support for warp- and block-level inclusive scan. +* Added support for building tests with device-side random data generation, making them finish faster. This requires rocRAND, and is enabled with the `WITH_ROCRAND=ON` build flag. +* Added tests and documentation to `lookback_scan_state`. It is still in the `detail` namespace. + +#### Optimized + +* Improved performance of `rocprim::device_select` and `rocprim::device_partition` when using multiple streams on the MI3XX architecture. + +#### Changed + +* Changed the parameters `long_radix_bits` and `LongRadixBits` from `segmented_radix_sort` to `radix_bits` and `RadixBits` respectively. +* Marked the initialisation constructor of `rocprim::reverse_iterator<Iter>` `explicit`. Use `rocprim::make_reverse_iterator` instead. +* Merged `radix_key_codec` into type_traits system. +* Renamed `type_traits_interface.hpp` to `type_traits.hpp`, and renamed the original `type_traits.hpp` to `type_traits_functions.hpp`.
+* The default scan accumulator types for device-level scan algorithms have changed. This is a breaking change. +The previous default accumulator types could lead to situations in which unexpected overflow occurred, such as +when the input or initial type was smaller than the output type. + * This is a complete list of affected functions and how their default accumulator types are changing: + * `rocprim::inclusive_scan` + * Previous default: `class AccType = typename std::iterator_traits<InputIterator>::value_type>` + * Current default: `class AccType = rocprim::accumulator_t<BinaryFunction, typename std::iterator_traits<InputIterator>::value_type>` + * `rocprim::deterministic_inclusive_scan` + * Previous default: `class AccType = typename std::iterator_traits<InputIterator>::value_type>` + * Current default: `class AccType = rocprim::accumulator_t<BinaryFunction, typename std::iterator_traits<InputIterator>::value_type>` + * `rocprim::exclusive_scan` + * Previous default: `class AccType = detail::input_type_t<InitValueType>>` + * Current default: `class AccType = rocprim::accumulator_t<BinaryFunction, rocprim::detail::input_type_t<InitValueType>>` + * `rocprim::deterministic_exclusive_scan` + * Previous default: `class AccType = detail::input_type_t<InitValueType>>` + * Current default: `class AccType = rocprim::accumulator_t<BinaryFunction, rocprim::detail::input_type_t<InitValueType>>` +* Undeprecated internal `detail::raw_storage`. +* New versions of `rocprim::thread_load` and `rocprim::thread_store` replace the deprecated `rocprim::thread_load` and `rocprim::thread_store` functions. The new versions avoid inline assembly where possible, and don't hinder the optimizer as much as a result. +* Renamed `rocprim::load_cs` to `rocprim::load_nontemporal` and `rocprim::store_cs` to `rocprim::store_nontemporal` to express the intent of these load and store methods better. +* All kernels now have hidden symbol visibility.
All symbols now have inline namespaces that include the library version, for example, `rocprim::ROCPRIM_300400_NS::symbol` instead of `rocPRIM::symbol`, letting the user link multiple libraries built with different versions of rocPRIM. + +#### Upcoming changes + +* `rocprim::invoke_result_binary_op` and `rocprim::invoke_result_binary_op_t` are deprecated. Use `rocprim::accumulator_t` now. + +#### Removed + +* Removed `rocprim::detail::float_bit_mask` and relative tests, use `rocprim::traits::float_bit_mask` instead. +* Removed `rocprim::traits::is_fundamental`, please use `rocprim::traits::get<T>::is_fundamental()` directly. +* Removed the deprecated parameters `short_radix_bits` and `ShortRadixBits` from the `segmented_radix_sort` config. They were unused, it is only an API change. +* Removed the deprecated `operator<<` from the iterators. +* Removed the deprecated `TwiddleIn` and `TwiddleOut`. Use `radix_key_codec` instead. +* Removed the deprecated flags API of `block_adjacent_difference`. Use `subtract_left()` or `block_discontinuity::flag_heads()` instead. +* Removed the deprecated `to_exclusive` functions in the warp scans. +* Removed the `rocprim::load_cs` from the `cache_load_modifier` enum. Use `rocprim::load_nontemporal` instead. +* Removed the `rocprim::store_cs` from the `cache_store_modifier` enum. Use `rocprim::store_nontemporal` instead. +* Removed the deprecated header file `rocprim/detail/match_result_type.hpp`. Include `rocprim/type_traits.hpp` instead. + * This header included `rocprim::detail::invoke_result`. Use `rocprim::invoke_result` instead. + * This header included `rocprim::detail::invoke_result_binary_op`. Use `rocprim::invoke_result_binary_op` instead. + * This header included `rocprim::detail::match_result_type`. Use `rocprim::invoke_result_binary_op_t` instead. +* Removed the deprecated `rocprim::detail::radix_key_codec` function. Use `rocprim::radix_key_codec` instead. 
+* Removed `rocprim/detail/radix_sort.hpp`, functionality can now be found in `rocprim/thread/radix_key_codec.hpp`. +* Removed C++14 support, only C++17 is supported. +* Due to the removal of `__AMDGCN_WAVEFRONT_SIZE` in the compiler, the following deprecated warp size-related symbols have been removed: + * `rocprim::device_warp_size()` + * For compile-time constants, this is replaced with `rocprim::arch::wavefront::min_size()` and `rocprim::arch::wavefront::max_size()`. Use this when allocating global or shared memory. + * For run-time constants, this is replaced with `rocprim::arch::wavefront::size().` + * `rocprim::warp_size()` + * Use `rocprim::host_warp_size()`, `rocprim::arch::wavefront::min_size()` or `rocprim::arch::wavefront::max_size()` instead. + * `ROCPRIM_WAVEFRONT_SIZE` + * Use `rocprim::arch::wavefront::min_size()` or `rocprim::arch::wavefront::max_size()` instead. + * `__AMDGCN_WAVEFRONT_SIZE` + * This was a fallback define for the compiler's removed symbol, having the same name. +* This release removes support for custom builds on gfx940 and gfx941. + +#### Resolved issues + +* Fixed an issue where `device_batch_memcpy` reported benchmarking throughput being 2x lower than it was in reality. +* Fixed an issue where `device_segmented_reduce` reported autotuning throughput being 5x lower than it was in reality. +* Fixed device radix sort not returning the correct required temporary storage when a double buffer contains `nullptr`. +* Fixed constness of equality operators (`==` and `!=`) in `rocprim::key_value_pair`. +* Fixed an issue for the comparison operators in `arg_index_iterator` and `texture_cache_iterator`, where `<` and `>` comparators were swapped. +* Fixed an issue for the `rocprim::thread_reduce` not working correctly with a prefix value. 
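The rocPRIM scan accumulator change listed under Changed can be illustrated with a small model. This is a plain Python sketch, not rocPRIM code: `inclusive_scan` here is a hypothetical helper, and the `acc_bits` parameter stands in for the width of the accumulator type. It assumes unsigned wrap-around arithmetic and only shows why an accumulator as narrow as the input type can overflow while a wider one does not.

```python
from itertools import accumulate


def inclusive_scan(values, acc_bits):
    """Inclusive sum scan whose running total wraps at acc_bits (unsigned).

    Hypothetical model of an accumulator type; not a rocPRIM API.
    """
    mask = (1 << acc_bits) - 1
    return list(accumulate(values, lambda acc, x: (acc + x) & mask))


# Each input fits in 8 bits, but the running sums do not.
inputs = [200, 100, 50]

# Accumulator as narrow as the input type: the second partial sum wraps.
narrow = inclusive_scan(inputs, 8)

# Wider accumulator: partial sums are preserved.
wide = inclusive_scan(inputs, 32)

print(narrow)  # [200, 44, 94]
print(wide)    # [200, 300, 350]
```

Code that relied on the previous defaults can keep the old behavior by passing the accumulator type explicitly, since `AccType` is a template parameter in the signatures quoted above.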
+ +#### Known issues + +* When using `rocprim::deterministic_inclusive_scan_by_key` and `rocprim::deterministic_exclusive_scan_by_key`, the intermediate values can change order on Navi3x. + * However, if a commutative scan operator is used, the final scan value (output array) will still always be consistent between runs. + +### **ROCprofiler-SDK** (1.0.0) + +#### Added + +- Support for [rocJPEG](https://rocm.docs.amd.com/projects/rocJPEG/en/latest/index.html) API Tracing. +- Support for AMD Instinct MI350X and MI355X accelerators. +- `rocprofiler_create_counter` to facilitate adding custom derived counters at runtime. +- Support in `rocprofv3` for iteration based counter multiplexing. +- Perfetto support for counter collection. +- Support for negating `rocprofv3` tracing options when using aggregate options such as `--sys-trace --hsa-trace=no`. +- `--agent-index` option in `rocprofv3` to specify the agent naming convention in the output: + - absolute == node_id + - relative == logical_node_id + - type-relative == logical_node_type_id +- MI300 and MI350 stochastic (hardware-based) PC sampling support in ROCprofiler-SDK and `rocprofv3`. +- Python bindings for `rocprofiler-sdk-roctx`. +- SQLite3 output support for `rocprofv3` using `--output-format rocpd`. +- `rocprofiler-sdk-rocpd` package: + - Public API in `include/rocprofiler-sdk-rocpd/rocpd.h`. + - Library implementation in `librocprofiler-sdk-rocpd.so`. + - Support for `find_package(rocprofiler-sdk-rocpd)`. + - `rocprofiler-sdk-rocpd` DEB and RPM packages. +- `--version` option in `rocprofv3`. +- `rocpd` Python package. +- Thread trace as experimental API. +- ROCprof Trace Decoder as experimental API: + - Requires [ROCprof Trace Decoder plugin](https://github.com/rocm/rocprof-trace-decoder).
+- Thread trace option in the `rocprofv3` tool under the `--att` parameters: + - See [using thread trace with rocprofv3](https://rocm.docs.amd.com/projects/rocprofiler-sdk/en/amd-mainline/how-to/using-thread-trace.html) + - Requires [ROCprof Trace Decoder plugin](https://github.com/rocm/rocprof-trace-decoder). +- `rocpd` output format documentation: + - Requires [ROCprof Trace Decoder plugin](https://github.com/rocm/rocprof-trace-decoder). +- Perfetto support for scratch memory. +- Support in the `rocprofv3` avail tool for command-line arguments. +- Documentation for `rocprofv3` advanced options. + +#### Changed + +- SDK to not create a background thread when every tool returns a nullptr from `rocprofiler_configure`. +- `vaddr-to-file-offset` mapping in `disassembly.hpp` to use the dedicated comgr API. +- `rocprofiler_uuid_t` ABI to hold a 128-bit value. +- `rocprofv3` shorthand argument for `--collection-period` to `-P` (upper-case) while `-p` (lower-case) is reserved for later use. +- Default output format for `rocprofv3` to `rocpd` (SQLite3 database). +- `rocprofv3` avail tool name from `rocprofv3_avail` to `rocprofv3-avail`. +- `rocprofv3` tool to facilitate thread trace and PC sampling on the same agent. + +#### Removed + +* Support for compilation of gfx940 and gfx941 targets. + +#### Resolved issues + +- Fixed missing callbacks around internal thread creation within counter collection service. +- Fixed potential data race in the ROCprofiler-SDK double buffering scheme. +- Fixed usage of std::regex in the core ROCprofiler-SDK library that caused segfaults or exceptions when used under dual ABI. +- Fixed Perfetto counter collection by introducing accumulation per dispatch. +- Fixed code object disassembly for missing function inlining information. +- Fixed queue preemption error and `HSA_STATUS_ERROR_INVALID_PACKET_FORMAT` error for stochastic PC-sampling in MI300X, leading to more stable runs.
+- Fixed the system hang issue for host-trap PC-sampling on AMD Instinct MI300X. +- Fixed `rocpd` counter collection issue when counter collection alone is enabled. `rocpd_kernel_dispatch` table is updated to be populated by counters data instead of kernel_dispatch data. +- Fixed `rocprofiler_*_id_t` structs for inconsistency related to a "null" handle: + - The correct definition for a null handle is `.handle = 0` while some definitions previously used `UINT64_MAX`. +- Fixed kernel trace csv output generated by `rocpd`. + +### **rocPyDecode** (0.6.0) + +#### Added + +* ``rocpyjpegdecode`` package. +* Added ``src/rocjpeg`` source new subfolder. +* Added ``samples/rocjpeg`` new subfolder. + +#### Changed +* Minimum version for rocdecode and rocjpeg updated to V1.0.0 + +### **rocRAND** (4.0.0) + +#### Added + +* gfx950 support. +* Additional unit tests for `test_log_normal_distribution.cpp`, `test_normal_distribution.cpp`, `test_rocrand_mtgp32_prng.cpp`, `test_rocrand_scrambled_sobol32_qrng.cpp`, `test_rocrand_scrambled_sobol64_qrng.cpp`, `test_rocrand_sobol32_qrng.cpp`, `test_rocrand_sobol64_qrng.cpp`, `test_rocrand_threefry2x32_20_prng.cpp`, `test_rocrand_threefry2x64_20_prng.cpp`, `test_rocrand_threefry4x32_20_prng.cpp`, `test_rocrand_threefry4x64_20_prng.cpp`, and `test_uniform_distribution.cpp`. +* New unit tests for `include/rocrand/rocrand_discrete.h` in `test_rocrand_discrete.cpp`, `include/rocrand/rocrand_mrg31k3p.h` in `test_rocrand_mrg31k3p_prng.cpp`, `include/rocrand/rocrand_mrg32k3a.h` in `test_rocrand_mrg32k3a_prng.cpp`, and `include/rocrand/rocrand_poisson.h` in `test_rocrand_poisson.cpp`. + +#### Changed + +* Changed the return type for `rocrand_generate_poisson` for the `SOBOL64` and `SCRAMBLED_SOBOL64` engines. +* Changed the unnecessarily large 64-bit data type for constants used for skipping in `MRG32K3A` to the 32-bit data type. +* Updated several `gfx942` auto tuning parameters. 
+* Modified error handling and expanded the error information for the case of double-deallocation of the (scrambled) sobol32 and sobol64 constants and direction vectors. + +#### Removed + +* Removed inline assembly and the `ENABLE_INLINE_ASM` CMake option. Inline assembly was used to optimize multiplication in the Mrg32k3a and Philox 4x32-10 generators. It is no longer needed because the current HIP compiler is able to produce code with the same or better performance. +* Removed instances of the deprecated clang definition `__AMDGCN_WAVEFRONT_SIZE`. +* Removed C++14 support. Beginning with this release, only C++17 is supported. +* Directly accessing the (scrambled) sobol32 and sobol64 constants and direction vectors is no longer supported. For: + * `h_scrambled_sobol32_constants`, use `rocrand_get_scramble_constants32` instead. + * `h_scrambled_sobol64_constants`, use `rocrand_get_scramble_constants64` instead. + * `rocrand_h_sobol32_direction_vectors`, use `rocrand_get_direction_vectors32` instead. + * `rocrand_h_sobol64_direction_vectors`, use `rocrand_get_direction_vectors64` instead. + * `rocrand_h_scrambled_sobol32_direction_vectors`, use `rocrand_get_direction_vectors32` instead. + * `rocrand_h_scrambled_sobol64_direction_vectors`, use `rocrand_get_direction_vectors64` instead. + +#### Resolved issues + +* Fixed an issue where `mt19937.hpp` would cause kernel errors during auto tuning. + +#### Upcoming changes + +* Deprecated the rocRAND Fortran API in favor of hipfort. + +### **ROCr Debug Agent** (2.1.0) + +#### Added + +* Added the `-e` and `--precise-alu-exceptions` flags to enable precise ALU exceptions reporting on supported configurations. + +### **rocSHMEM** (3.0.0) + +#### Added + +* Added the Reverse Offload conduit.
+* Added new APIs: `rocshmem_ctx_barrier`, `rocshmem_ctx_barrier_wave`, `rocshmem_ctx_barrier_wg`, `rocshmem_barrier_all`, `rocshmem_barrier_all_wave`, `rocshmem_barrier_all_wg`, `rocshmem_ctx_sync`, `rocshmem_ctx_sync_wave`, `rocshmem_ctx_sync_wg`, `rocshmem_sync_all`, `rocshmem_sync_all_wave`, `rocshmem_sync_all_wg`, `rocshmem_init_attr`, `rocshmem_get_uniqueid`, and `rocshmem_set_attr_uniqueid_args`. +* Added dlmalloc based allocator. +* Added XNACK support. +* Added support for initialization with MPI communicators other than `MPI_COMM_WORLD`. + +#### Changed + +* Changed collective APIs to use `_wg` suffix rather than `_wg_` infix. + +#### Resolved issues + +* Resolved segfault in `rocshmem_wg_ctx_create`, now provides nullptr if ctx cannot be created. + +### **rocSOLVER** (3.30.0) + +#### Added + +* Hybrid computation support for existing routines: + - STEQR + +#### Optimized + +* Fixed corner cases that can produce NaNs in SYEVD, for valid input matrices. +* Improved the performance of BDSQR and downstream functions, such as GESVD. +* Improved the performance of STEQR and downstream functions, such as SYEV/HEEV. +* Improved the performance of LARFT and downstream functions, such as GEQR2 and GEQRF. + +#### Resolved issues + +* Fixed corner cases that can produce NaNs in SYEVD for valid input matrices. + +### **rocSPARSE** (4.0.2) + +#### Added + +* Added the `SpGEAM` generic routine for computing sparse matrix addition in CSR format. +* Added the `v2_SpMV` generic routine for computing sparse matrix vector multiplication. As opposed to the deprecated `rocsparse_spmv` routine, this routine does not use a fallback algorithm if a non-implemented configuration is encountered and will return an error in such a case. 
For the deprecated `rocsparse_spmv` routine, the user can enable warning messages in situations where a fallback algorithm is used by either calling the `rocsparse_enable_debug` routine upfront or exporting the variable `ROCSPARSE_DEBUG` (with the shell command `export ROCSPARSE_DEBUG=1`). +* Added half float mixed precision to `rocsparse_axpby` where X and Y use `float16` and the result and compute type use `float`. +* Added half float mixed precision to `rocsparse_spvv` where X and Y use `float16` and the result and compute type use `float`. +* Added half float mixed precision to `rocsparse_spmv` where A and X use `float16` and Y and the compute type use `float`. +* Added half float mixed precision to `rocsparse_spmm` where A and B use `float16` and C and the compute type use `float`. +* Added half float mixed precision to `rocsparse_sddmm` where A and B use `float16` and C and the compute type use `float`. +* Added half float uniform precision to the `rocsparse_scatter` and `rocsparse_gather` routines. +* Added half float uniform precision to the `rocsparse_sddmm` routine. +* Added the `rocsparse_spmv_alg_csr_rowsplit` algorithm. +* Added support for gfx950. +* Added ROC-TX instrumentation support in rocSPARSE (not available on Windows or in the static library version on Linux). +* Added the `almalinux` operating system name to correct the GFortran dependency. + +#### Changed + +* Switch to defaulting to C++17 when building rocSPARSE from source. Previously rocSPARSE was using C++14 by default. + +#### Removed + +* The deprecated `rocsparse_spmv_ex` routine. +* The deprecated `rocsparse_sbsrmv_ex`, `rocsparse_dbsrmv_ex`, `rocsparse_cbsrmv_ex`, and `rocsparse_zbsrmv_ex` routines. +* The deprecated `rocsparse_sbsrmv_ex_analysis`, `rocsparse_dbsrmv_ex_analysis`, `rocsparse_cbsrmv_ex_analysis`, and `rocsparse_zbsrmv_ex_analysis` routines. 
+ +#### Optimized + +* Reduced the number of template instantiations in the library to further reduce the shared library binary size and improve compile times. +* Allow SpGEMM routines to use more shared memory when available. This can speed up performance for matrices with a large number of intermediate products. +* Use of the `rocsparse_spmv_alg_csr_adaptive` or `rocsparse_spmv_alg_csr_default` algorithms in `rocsparse_spmv` to perform transposed sparse matrix multiplication (`C=alpha*A^T*x+beta*y`) resulted in unnecessary analysis on A and needless slowdown during the analysis phase. This has been improved by skipping the analysis when performing the transposed sparse matrix multiplication. +* Improved the user documentation. + +#### Resolved issues + +* Fixed an issue in the public headers where `extern "C"` was not wrapped by `#ifdef __cplusplus`, which caused failures when building C programs with rocSPARSE. +* Fixed a memory access fault in the `rocsparse_Xbsrilu0` routines. +* Fixed failures that could occur in `rocsparse_Xbsrsm_solve` or `rocsparse_spsm` with BSR format when using host pointer mode. +* Fixed ASAN compilation failures. +* Fixed a failure that occurred when using const descriptor `rocsparse_create_const_csr_descr` with the generic routine `rocsparse_sparse_to_sparse`. The issue was not observed when using non-const descriptor `rocsparse_create_csr_descr` with `rocsparse_sparse_to_sparse`. +* Fixed a memory leak in the rocSPARSE handle. + +#### Upcoming changes + +* Deprecated the `rocsparse_spmv` routine. Use the `rocsparse_v2_spmv` routine instead. +* Deprecated the `rocsparse_spmv_alg_csr_stream` algorithm. Use the `rocsparse_spmv_alg_csr_rowsplit` algorithm instead. +* Deprecated the `rocsparse_itilu0_alg_sync_split_fusion` algorithm. Use one of `rocsparse_itilu0_alg_async_inplace`, `rocsparse_itilu0_alg_async_split`, or `rocsparse_itilu0_alg_sync_split` instead. 
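For readers unfamiliar with the transposed sparse matrix-vector product mentioned under Optimized, the following plain-Python sketch shows the arithmetic of `alpha*A^T*x + beta*y` for a matrix stored in CSR form. It is only an illustration of the operation, not rocSPARSE's implementation or API: the function name `csr_spmv_transposed` and the list-based CSR layout are hypothetical and chosen for this example.

```python
def csr_spmv_transposed(row_ptr, col_ind, values, x, y, alpha=1.0, beta=0.0):
    """Return alpha * A^T * x + beta * y for a CSR matrix A (illustrative only).

    x has one entry per row of A, y has one entry per column of A.
    The transpose is never built: each stored entry A[i][j] is scattered
    into output position j, scaled by x[i].
    """
    n_rows = len(row_ptr) - 1
    result = [beta * yj for yj in y]
    for i in range(n_rows):
        for k in range(row_ptr[i], row_ptr[i + 1]):
            j = col_ind[k]
            result[j] += alpha * values[k] * x[i]
    return result


# A = [[1, 0, 2],
#      [0, 3, 0]]   (2 x 3), stored in CSR form
row_ptr = [0, 2, 3]
col_ind = [0, 2, 1]
values = [1.0, 2.0, 3.0]

x = [1.0, 2.0]        # one entry per row of A
y = [1.0, 1.0, 1.0]   # one entry per column of A

out = csr_spmv_transposed(row_ptr, col_ind, values, x, y, alpha=2.0, beta=1.0)
print(out)
```

Because the loop walks the CSR rows directly and scatters into the output, no per-row preprocessing of A is needed in this sketch, which fits the note above that the analysis step is skipped for the transposed case.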
+ +### **rocThrust** (4.0.0) + +#### Changed + +* Updated the required version of Google Benchmark from 1.8.0 to 1.9.0. +* Dropped C++14 support for rocThrust. +* Renamed `cpp14_required.h` to `cpp_version_check.h`. +* Refactored `test_header.hpp` into separate modules `test_param_fixtures.hpp`, `test_real_assertions.hpp`, `test_imag_assertions.hpp`, and `test_utils.hpp`. + * This is done to prevent unit tests from having access to modules that they're not testing. This will improve the accuracy of code coverage reports. + +#### Added + +* Additional unit tests for: binary_search, complex, c99math, catrig, ccosh, cexp, clog, csin, csqrt, and ctan. +* Added `test_param_fixtures.hpp` to store all the parameters for typed test suites. +* Added `test_real_assertions.hpp` to handle unit test assertions for real numbers. +* Added `test_imag_assertions.hpp` to handle unit test assertions for imaginary numbers. +* `clang++` is now used to compile Google benchmarks on Windows. +* Added gfx950 support. +* Merged changes from upstream CCCL/thrust 2.6.0. + +#### Removed + +* `device_malloc_allocator.h` has been removed. This header file was unused and should not impact users. +* Removed C++14 support. Only C++17 is supported. +* `test_header.hpp` has been removed. The `HIP_CHECK` function, as well as the `test` and `inter_run_bwr` namespaces, have been moved to `test_utils.hpp`. +* `test_assertions.hpp` has been split into `test_real_assertions.hpp` and `test_imag_assertions.hpp`. + +#### Upcoming changes + +* `thrust::device_malloc_allocator` is deprecated as of this version. It will be removed in an upcoming version. + +#### Resolved issues + +* Fixed an issue with internal calls to unqualified `distance()`, which would be ambiguous because an implementation was also visible through ADL. + +#### Known issues + +* The order of the values being compared by `thrust::exclusive_scan_by_key` and `thrust::inclusive_scan_by_key` can change between runs when integers are being compared.
This can cause incorrect output when a non-commutative operator such as division is being used. + +### **rocWMMA** (2.0.0) + +#### Added + +* Added internal register layout transforms to support interleaved MMA layouts. +* Added support for the gfx950 target. +* Added mixed input `BF8` / `FP8` types for MMA support. +* Added fragment scheduler API objects to embed thread block cooperation properties in fragments + +#### Changed + +* Augmented load / store / MMA internals with static loop unrolling +* rocWMMA mma_sync API now supports `wave tile` fragment sizes +* rocWMMA cooperative fragments are now expressed with fragment scheduler template arguments +* rocWMMA cooperative fragments now use the same base API as non-cooperative fragments +* rocWMMA cooperative fragments register usage footprint has been reduced +* rocWMMA fragments now support partial tile sizes with padding + +#### Optimized + +* Added internal flow control barriers to improve assembly code generation and overall performance +* Enabled interleaved layouts by default in MMA to improve overall performance + +#### Removed + +* Removed support for the gfx940 and gfx941 targets +* Removed the rocWMMA cooperative API +* Removed wave count template parameters from transforms APIs + +#### Resolved issues + +* Fixed a validation issue for small precision compute types `< B32` on gfx9 +* Fixed CMake validation of compiler support for `BF8` / `FP8` types +* Fixed linkage of rocwmma::synchronize_workgroup to inline + +### **RPP** (2.0.0) + +#### Added + +* Bitwise NOT, Bitwise AND, Bitwise OR augmentations on HOST (CPU) and HIP backends. +* Tensor Concat augmentation on HOST (CPU) and HIP backends. +* JPEG Compression Distortion augmentation on HIP backend. +* `log1p`, defined as `log (1 + x)`, tensor augmentation support on HOST (CPU) and HIP backends. +* JPEG Compression Distortion augmentation on HOST (CPU) backend. 
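As general numerical background for the `log1p` entry above (not a statement about how RPP implements it), a dedicated `log(1 + x)` operation is commonly used because forming `1 + x` first loses small values of `x` to rounding. The short sketch below uses only Python's standard `math` module to show the difference in double precision.

```python
import math

x = 1e-18

# In double precision, 1 + 1e-18 rounds to exactly 1.0,
# so the naive form loses x entirely and returns 0.0.
naive = math.log(1 + x)

# math.log1p evaluates log(1 + x) without forming 1 + x first,
# so the small input survives.
accurate = math.log1p(x)

print(naive, accurate)
```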
+ +#### Changed + +* All handle creation and destruction APIs have been consolidated to `rppCreate()`, for handle initialization, and `rppDestroy()`, for handle destruction. +* RPP function category `logical_operations` has been renamed to the more appropriate `bitwise_operations`. +* TurboJPEG package installation enabled for RPP Test Suite with `sudo apt-get install libturbojpeg0-dev`. Instructions updated in utilities/test_suite/README.md. (#518) +* Renamed the swap_channels augmentation API to channel_permute, which now accepts one new argument, `permutationTensor` (a pointer to an unsigned int tensor), that provides the permutation order to swap the RGB channels of each input image in the batch in any order. + * Old API - `RppStatus rppt_swap_channels_host(RppPtr_t srcPtr, RpptDescPtr srcDescPtr, RppPtr_t dstPtr, RpptDescPtr dstDescPtr, rppHandle_t rppHandle);`. + * New API - `RppStatus rppt_channel_permute_host(RppPtr_t srcPtr, RpptDescPtr srcDescPtr, RppPtr_t dstPtr, RpptDescPtr dstDescPtr, Rpp32u *permutationTensor , rppHandle_t rppHandle);`. + +#### Removed + +* Older versions of RPP handle creation including `rppCreateWithBatchSize()`, `rppCreateWithStream()`, and `rppCreateWithStreamAndBatchSize()` are now removed and replaced with `rppCreate()`. +* Older versions of RPP handle destruction API including `rppDestroyGPU()` and `rppDestroyHost()` are now removed and replaced with `rppDestroy()`. + +#### Resolved issues + +* Test package - Debian packages will install required dependencies. + +### **Tensile** (4.44.0) + +#### Added + +- Added support for gfx950. +- Added code object compression via bundling. +- Added support for non-default HIP SDK installations on Windows. +- Added master solution library documentation. +- Added compiler version dependent assembler and architecture capabilities. +- Added documentation from GitHub Wiki to ROCm docs. + +#### Changed + +- Loosened check for CLI compiler choices. +- Introduced 4-tuple targets for bundler invocations.
+- Introduced PATHEXT extensions on Windows when searching for toolchain components. +- Enabled passing fully qualified paths to toolchain components. +- Enabled environment variable overrides when searching for a ROCm stack. +- Improved default toolchain configuration. +- Ignored f824 flake errors. + +#### Removed + +- Removed support for the gfx940 and gfx941 targets. +- Removed unused tuning files. +- Removed disabled tests. + +#### Resolved issues + +- Fixed configure time path not being invoked at build. +- Fixed find_package for msgpack to work with versions 5 and 6. +- Fixed rhel9 testing. +- Fixed gfx908 builds. +- Fixed the 'argument list too long' error. +- Fixed version typo in 6.3 changelog. +- Fixed improper use of aliases as nested namespace specifiers. + ## ROCm 6.4.3 See the [ROCm 6.4.3 release notes](https://rocm.docs.amd.com/en/docs-6.4.3/about/release-notes.html) diff --git a/RELEASE.md b/RELEASE.md index 4041b0fdd..88c74150c 100644 --- a/RELEASE.md +++ b/RELEASE.md @@ -43,7 +43,15 @@ The following are notable new features and improvements in ROCm 7.0.0. For chang HIP API 7.0 introduces changes to make it align more closely with NVIDIA CUDA. These change are incompatible with prior releases, and might require recompiling existing HIP applications for use in the ROCm 7.0 release. For more information, see the [HIP API 7.0 changes](../hip-7-changes) and the [HIP changelog](#hip-7-0-0) below. -### Instinct Driver / ROCm packaging separation +### New machine learning programming language for AMD accelerators + +Wave is a high-performance domain-specific Python programming language (DSL) designed to accelerate the development and optimization of machine learning kernels on AMD GPUs. It introduces a subgroup-level (wave) programming model that deliberately separates the mathematical formulation of computation from subgroup and thread distribution strategies. ROCm 7.0 supports the library on AMD Instinct MI300 and MI350 series accelerators. 
Wave is now integrated into SGLang, also enabling a broader user base. For more information, see its [GitHub repository](https://github.com/iree-org/wave). + +```{note} +Wave for ROCm is in an early access state. Running production workloads is not recommended. +``` + +### Instinct Driver/ROCm packaging separation The Instinct Driver is now distributed separately from the ROCm software stack and is stored under in its own location ``/amdgpu/`` in the package repository at [repo.radeon.com](https://repo.radeon.com/amdgpu/). The first release is designated as Instinct Driver version 30.10. See [ROCm Gets Modular: Meet the Instinct Datacenter GPU Driver](https://rocm.blogs.amd.com/ecosystems-and-partners/instinct-gpu-driver/README.html) for more information. @@ -70,48 +78,51 @@ Support has been added for the ROCm Profiling Data (rocpd) output format, which * Added stochastic and host-trap PC sampling support for all MI300 series accelerators. * HIP streams translate to Queues in Time Traces in Perfetto output. +For more information about ROCprofiler-SDK changes, see the [detailed component changelog](#rocprofiler-sdk-1-0-0) below + ### Compilers changes and improvements ROCm 7.0 introduces the AMD Next-Gen Fortran compiler. ``llvm-flang`` (sometimes called new-flang or flang-18) is a re-implementation of the Fortran frontend. It is a strategic replacement for classic-flang and is developed in LLVM’s upstream repo at [llvm/llvm-project](https://github.com/llvm/llvm-project/tree/main/flang). Key enhancements include: - * Compiler: - * Improved memory load and store instructions. - * Updated clang/llvm to AMD clang version 20.0.0git (equivalent to LLVM 20.0.0 with additional out-of-tree patches). - * Support added for separate debug file generation for device code. +* Compiler: + * Improved memory load and store instructions. + * Updated clang/llvm to AMD clang version 20.0.0git (equivalent to LLVM 20.0.0 with additional out-of-tree patches). 
+ * Support added for separate debug file generation for device code. - * Comgr: - * Added support for an in-memory virtual file system (VFS) for storing temporary files generated during intermediate compilation steps. This is designed to improve performance by reducing on-disk file I/O. Currently, VFS is supported only for the device library link step, with plans for expanded support in future releases. +* Comgr: + * Added support for an in-memory virtual file system (VFS) for storing temporary files generated during intermediate compilation steps. This is designed to improve performance by reducing on-disk file I/O. Currently, VFS is supported only for the device library link step, with plans for expanded support in future releases. - * SPIR-V: - * Improved [target-specific extensions](https://github.com/ROCm/llvm-project/blob/c2535466c6e40acd5ecf6ba1676a4e069c6245cc/clang/docs/LanguageExtensions.rst): - * Added a new target-specific builtin ``__builtin_amdgcn_processor_is`` for late or deferred queries of the current target processor. - * Added a new target-specific builtin ``__builtin_amdgcn_is_invocable``, enabling fine-grained, per-builtin feature availability. - * HIPIFY now supports NVIDIA CUDA 12.8.0 APIs: - * Added support for all new device and host APIs, including FP4, FP6, and FP128 – including support for the corresponding ROCm HIP equivalents. +* SPIR-V: + * Improved [target-specific extensions](https://github.com/ROCm/llvm-project/blob/c2535466c6e40acd5ecf6ba1676a4e069c6245cc/clang/docs/LanguageExtensions.rst): + * Added a new target-specific builtin ``__builtin_amdgcn_processor_is`` for late or deferred queries of the current target processor. + * Added a new target-specific builtin ``__builtin_amdgcn_is_invocable``, enabling fine-grained, per-builtin feature availability. 
+* HIPIFY now supports NVIDIA CUDA 12.8.0 APIs: + * Added support for all new device and host APIs, including `FP4`, `FP6`, and `FP128`– including support for the corresponding ROCm HIP equivalents. Deprecated features: - * ROCm components no longer use the ``__AMDGCN_WAVEFRONT_SIZE`` and ``__AMDGCN_WAVEFRONT_SIZE__`` macros nor HIP’s ``warpSize`` variable as ``constexpr``. These macros and reliance on ``warpSize`` as a ``constexpr`` are deprecated and will be disabled in a future release. Users are encouraged to update their code if needed to ensure future compatibility. +* ROCm components no longer use the ``__AMDGCN_WAVEFRONT_SIZE`` and ``__AMDGCN_WAVEFRONT_SIZE__`` macros nor HIP’s ``warpSize`` variable as ``constexpr``. These macros and reliance on ``warpSize`` as a ``constexpr`` are deprecated and will be disabled in a future release. Users are encouraged to update their code if needed to ensure future compatibility. ### Libraries changes and improvements #### New data type support -MX-compliant data types bring microscaling support to ROCm. For more information, see the [OCP Microscaling (MX) Formats Specification](https://www.opencompute.org/documents/ocp-microscaling-formats-mx-v1-0-spec-final-pdf). The ROCm 7.0 Alpha enables functional support for MX data types FP4, FP6, and FP8 on MI355X systems in these ROCm libraries: - * Composable Kernel (FP4 and FP8 only) - * hipBLASLt - * MIGraphX (FP4 only) +MX-compliant data types bring microscaling support to ROCm. For more information, see the [OCP Microscaling (MX) Formats Specification](https://www.opencompute.org/documents/ocp-microscaling-formats-mx-v1-0-spec-final-pdf). 
The ROCm 7.0 Alpha enables functional support for MX data types `FP4`, `FP6`, and `FP8` on MI355X systems in these ROCm libraries: +* Composable Kernel (`FP4` and `FP8` only) +* hipBLASLt +* MIGraphX (`FP4` only) -The following libraries are updated to support the Open Compute Project (OCP) floating-point FP8 format on AMD Instinct MI355X instead of the NANOO FP8 format: +The following libraries are updated to support the Open Compute Project (OCP) floating-point `FP8` format on AMD Instinct MI355X instead of the NANOO `FP8` format: - * Composable Kernel - * hipBLASLt - * hipSPARSELt - * MIGraphX - * rocWMMA -MIGraphX now also supports BF16. +* Composable Kernel +* hipBLASLt +* hipSPARSELt +* MIGraphX +* rocWMMA + +MIGraphX now also supports `BF16`. #### RCCL support @@ -119,8 +130,8 @@ RCCL is supported for single-node functional usage only. Multi-node communicatio #### MIGraphX support -* Support for OCP FP8 and MX FP4 data types on MI355X -* Support for BF16 on all hardware +* Support for OCP `FP8` and MX `FP4` data types on MI355X +* Support for `BF16` on all hardware * Support for PyTorch 2.7 via Torch-MIGraphX ### Tools changes and improvements @@ -131,20 +142,20 @@ RCCL is supported for single-node functional usage only. Multi-node communicatio * New APIs: CPU affinity shows GPUs’ affinitization to each CPU in a system. #### ROCgdb -* MX data types support: FP4, FP6, and FP8 +* MX data types support: `FP4`, `FP6`, and `FP8`. #### ROCprof Compute Viewer -* Initial release: ``rocprof-compute-viewer`` allows the visualization of ``rocprofv3``’s thread trace output +* Initial release: ``rocprof-compute-viewer`` allows the visualization of ``rocprofv3``’s thread trace output. #### ROCprof Trace Decoder -* Initial release: ``rocprof-trace-decoder`` a plugin API for decoding thread traces +* Initial release: ``rocprof-trace-decoder`` a plugin API for decoding thread traces. #### ROCm Compute Profiler -* MX data types support: FP4, FP6, and FP8. 
+* MX data types support: `FP4`, `FP6`, and `FP8`. * AMD Instinct MI355X and MI350X performance counters: CPC, SPI, SQ, TA/TD/TCP, and TCC. -* Enhanced roofline analysis with support for INT8, INT32, FP8, FP16, and BF16 data types. -* Roofline distinction for FP32 and FP64 data types. +* Enhanced roofline analysis with support for `INT8`, `INT32`, `FP8`, `FP16`, and `BF16` data types. +* Roofline distinction for `FP32` and `FP64` data types. * Selective kernel profiling. #### ROCm Systems Profiler @@ -175,7 +186,7 @@ See [ROCm Offline Installer Creator](https://rocm.docs.amd.com/projects/install- The ROCm Runfile Installer 7.0.0 adds the following features and improvements: -* Added support for RHEL 10.0, Oracle 10.0, and Rocky 9.6, +* Added support for RHEL 10.0, Oracle 10.0, and Rocky 9.6. * Added `untar` mode for the `.run` file to allow extraction of ROCm to a given directory, similar to a normal tarball. * Added an RVS test script. * Fixes to the rocm-examples test script. @@ -187,6 +198,8 @@ For more information, see [ROCm Runfile Installer](https://rocm.docs.amd.com/pro ROCm documentation continues to be updated to provide clearer and more comprehensive guidance for a wider variety of user needs and use cases. +* Documentation for [rocCV](https://rocm.docs.amd.com/projects/rocCV/en/latest/index.html), an efficient GPU-accelerated library for image pre- and post-processing, has been added. rocCV is in an early access state, and using it on production workloads is not recommended. + * ROCm Math libraries support a wide range of data types, enabling optimized performance across various precision requirements. The following Math libraries are now updated with new precision content. 
For more information, click the Math library’s link: * [hipBLAS](https://rocm.docs.amd.com/projects/hipBLAS/en/develop/reference/data-type-support.html) @@ -195,8 +208,6 @@ ROCm documentation continues to be updated to provide clearer and more comprehen * [rocSPARSE](https://rocm.docs.amd.com/projects/rocSPARSE/en/develop/reference/precision.html) * [Tensile](https://rocm.docs.amd.com/projects/Tensile/en/develop/src/reference/precision-support.html#precision-support) -* Documentation for [rocCV](https://rocm.docs.amd.com/projects/rocCV/en/latest/index.html), an efficient GPU-accelerated library for image pre- and post-processing, has been added. rocCV is in an early access state, and using it on production workloads is not recommended. - * ROCm offers a comprehensive ecosystem for deep learning development, featuring libraries optimized for deep learning operations and ROCm-aware versions of popular deep learning frameworks and libraries. The following deep learning frameworks' content now includes release notes and known issues: * [PyTorch](https://rocm.docs.amd.com/en/latest/compatibility/ml-compatibility/pytorch-compatibility.html) @@ -214,13 +225,13 @@ ROCm documentation continues to be updated to provide clearer and more comprehen * [rocSPARSE](https://rocm.docs.amd.com/projects/rocSPARSE/en/develop/reference/env_variables.html) * [Tensile](https://rocm.docs.amd.com/projects/Tensile/en/develop/src/reference/environment-variables.html) -* Modern computing tasks often require balancing numerical precision against hardware resources and processing speed. Low precision floating point number formats in HIP include FP4 (4-bit) and FP6 (6-bit), which reduce memory and bandwidth requirements. For more information, see the updated [Low precision floating point types](https://rocm.docs.amd.com/projects/HIP/en/docs-develop/reference/low_fp_types.html) topic. 
+* Modern computing tasks often require balancing numerical precision against hardware resources and processing speed. Low precision floating point number formats in HIP include `FP4` (4-bit) and `FP6` (6-bit), which reduce memory and bandwidth requirements. For more information, see the updated [Low precision floating point types](https://rocm.docs.amd.com/projects/HIP/en/docs-develop/reference/low_fp_types.html) topic. ## Operating system and hardware support changes ROCm 7.0.0 adds support for [placeholder]. For more information, see installation instructions. -ROCm 6.4.2 marks the end of support (EoS) for [placeholder] +ROCm 7.0.0 marks the end of support (EoS) for [placeholder] ROCm 7.0.0 adds support for AMD Instinct MI355X and MI350X. For details, see the full list of Supported GPUs (Linux). @@ -260,42 +271,42 @@ Click {fab}`github` to go to the component's source code on GitHub. MIGraphX - 2.12.0 ⇒ 2.13.0 + 2.12.0 ⇒ 2.13.0 MIOpen - 3.4.0 ⇒ 3.4.1 + 3.4.0 ⇒ 3.5.0 MIVisionX - 3.2.0 ⇒ 3.3.0 + 3.2.0 ⇒ 3.3.0 rocAL - 2.2.0 ⇒ 2.3.0 + 2.2.0 ⇒ 2.3.0 rocDecode - 0.10.0 ⇒ 1.0.0 + 0.10.0 ⇒ 1.0.0 rocJPEG - 0.8.0 ⇒ 1.1.0 + 0.8.0 ⇒ 1.1.0 rocPyDecode - 0.3.1 ⇒ 0.6.0 + 0.3.1 ⇒ 0.6.0 RPP - 1.9.10 ⇒ 2.0.0 + 1.9.10 ⇒ 2.0.0 @@ -304,12 +315,12 @@ Click {fab}`github` to go to the component's source code on GitHub. Communication RCCL - 2.22.3 ⇒ 2.26.6 + 2.22.3 ⇒ 2.26.6 rocSHMEM - 2.0.1 ⇒ 3.0.0 + 2.0.1 ⇒ 3.0.0 @@ -318,82 +329,82 @@ Click {fab}`github` to go to the component's source code on GitHub. 
Math hipBLAS - 2.4.0 ⇒ 3.0.0 + 2.4.0 ⇒ 3.0.0 hipBLASLt - 0.12.1 ⇒ 1.0.0 + 0.12.1 ⇒ 1.0.0 hipFFT - 1.0.18 ⇒ 1.0.20 + 1.0.18 ⇒ 1.0.20 hipfort - 0.6.0 ⇒ 0.7.0 + 0.6.0 ⇒ 0.7.0 hipRAND - 2.12.0 ⇒ 3.0.0 + 2.12.0 ⇒ 3.0.0 hipSOLVER - 2.4.0 ⇒ 3.0.0 + 2.4.0 ⇒ 3.0.0 hipSPARSE - 3.2.0 ⇒ 4.0.1 + 3.2.0 ⇒ 4.0.1 hipSPARSELt - 0.2.3 ⇒ 0.2.4 + 0.2.3 ⇒ 0.2.4 rocALUTION - 3.2.3 ⇒ 4.0.0 + 3.2.3 ⇒ 4.0.0 rocBLAS - 4.4.1 ⇒ 5.0.0 + 4.4.1 ⇒ 5.0.0 rocFFT - 1.0.32 ⇒ 1.0.34 + 1.0.32 ⇒ 1.0.34 rocRAND - 3.3.0 ⇒ 4.0.0 + 3.3.0 ⇒ 4.0.0 rocSOLVER - 3.28.2 ⇒ 3.30.0 + 3.28.2 ⇒ 3.30.0 rocSPARSE - 3.4.0 ⇒ 4.0.2 + 3.4.0 ⇒ 4.0.2 rocWMMA - 1.7.0 ⇒ 2.0.0 + 1.7.0 ⇒ 2.0.0 Tensile - 4.43.0 ⇒ 4.44.0 + 4.43.0 ⇒ 4.44.0 @@ -402,22 +413,22 @@ Click {fab}`github` to go to the component's source code on GitHub. Primitives hipCUB - 3.4.0 ⇒ 4.0.0 + 3.4.0 ⇒ 4.0.0 hipTensor - 1.5.0 ⇒ 2.0.0 + 1.5.0 ⇒ 2.0.0 rocPRIM - 3.4.1 ⇒ 4.0.0 + 3.4.1 ⇒ 4.0.0 rocThrust - 3.3.0 ⇒ 4.0.0 + 3.3.0 ⇒ 4.0.0 @@ -426,12 +437,12 @@ Click {fab}`github` to go to the component's source code on GitHub. Tools System management AMD SMI - 25.5.1 ⇒ 26.0.0 + 25.5.1 ⇒ 26.0.0 ROCm Data Center Tool - 0.3.0 ⇒ 1.1.0 + 0.3.0 ⇒ 1.1.0 @@ -441,12 +452,12 @@ Click {fab}`github` to go to the component's source code on GitHub. ROCm SMI - 7.7.0 ⇒ 7.8.0 + 7.7.0 ⇒ 7.8.0 ROCm Validation Suite - 1.1.0 ⇒ 1.2.0 + 1.1.0 ⇒ 1.2.0 @@ -456,19 +467,19 @@ Click {fab}`github` to go to the component's source code on GitHub. Performance ROCm Bandwidth Test - 1.4.0 ⇒ 2.6.0 + 1.4.0 ⇒ 2.6.0 ROCm Compute Profiler - 3.1.1 ⇒ 3.2.1 + 3.1.1 ⇒ 3.2.1 ROCm Systems Profiler - 1.0.2 ⇒ 1.1.0 + 1.0.2 ⇒ 1.1.0 @@ -480,7 +491,7 @@ Click {fab}`github` to go to the component's source code on GitHub. ROCprofiler-SDK - 0.6.0 ⇒ 1.0.0 + 0.6.0 ⇒ 1.0.0 @@ -496,13 +507,13 @@ Click {fab}`github` to go to the component's source code on GitHub. 
Development HIPIFY - 19.0.0 ⇒ 20.0.0 + 19.0.0 ⇒ 20.0.0 ROCdbgapi - 0.77.2 ⇒ 0.77.3 + 0.77.2 ⇒ 0.77.3 @@ -515,14 +526,14 @@ Click {fab}`github` to go to the component's source code on GitHub. ROCm Debugger (ROCgdb) - 15.2 ⇒ 16.3 + 15.2 ⇒ 16.3 ROCr Debug Agent - 2.0.4 ⇒ 2.1.0 + 2.0.4 ⇒ 2.1.0 @@ -537,7 +548,7 @@ Click {fab}`github` to go to the component's source code on GitHub. llvm-project - 19.0.0 ⇒ 20.0.0 + 19.0.0 ⇒ 20.0.0 @@ -546,12 +557,12 @@ Click {fab}`github` to go to the component's source code on GitHub. Runtimes HIP - 6.4.3 ⇒ 7.0.0 + 6.4.3 ⇒ 7.0.0 ROCr Runtime - 1.15.0 ⇒ 1.18.0 + 1.15.0 ⇒ 1.18.0 @@ -566,45 +577,189 @@ The following sections describe key changes to ROCm components. For a historical overview of ROCm component updates, see the {doc}`ROCm consolidated changelog `. ``` -### Composable Kernel 1.1.0 +### **AMD SMI** (26.0.0) + +#### Added + +* The Default command. + + A default view has been added. The default view provides a snapshot of commonly requested information such as BDF, current partition mode, version information, and more. Users can access that information by simply typing `amd-smi` with no additional commands or arguments. Users may also obtain this information through alternate output formats such as JSON or CSV by using the default command with the respective output format: `amd-smi default --json` or `amd-smi default --csv`. + +* Support for GPU metrics 1.8. + - Added new fields for `amdsmi_gpu_xcp_metrics_t` including: + - Adding the following metrics to allow new calculations for violation status: + - Per XCP metrics `gfx_below_host_limit_ppt_acc[XCP][MAX_XCC]` - GFX Clock Host limit Package Power Tracking violation counts + - Per XCP metrics `gfx_below_host_limit_thm_acc[XCP][MAX_XCC]` - GFX Clock Host limit Thermal (TVIOL) violation counts + - Per XCP metrics `gfx_low_utilization_acc[XCP][MAX_XCC]` - violation counts for low utilization that caused the GPU to be below application clocks.
+ - Per XCP metrics `gfx_below_host_limit_total_acc[XCP][MAX_XCC]` - violation counts for how long the GPU was held below application clocks by any limiter (see above new violation metrics). + - Increasing available JPEG engines to 40. + Current ASICs may not support all 40. These will be indicated as `UINT16_MAX` or `N/A` in CLI. + +* Bad page threshold count. + - Added `amdsmi_get_gpu_bad_page_threshold` to Python API and CLI; root/sudo permissions required to display the count. + +* CPU model name for RDC. + - Added new C and Python API `amdsmi_get_cpu_model_name`. + - Not sourced from esmi library. + +* Added `amdsmi_get_cpu_affinity_with_scope()`. + +* `socket power` to `amdsmi_get_power_info`. + - Previously the C API had the value in the `amdsmi_power_info` structure, but it was unused. + - Now we populate the value in both C and Python APIs. + - The value is representative of the socket's power agnostic of the GPU version. + +* New event notification types to `amdsmi_evt_notification_type_t`. + The following values were added to the `amdsmi_evt_notification_type_t` enum: + - `AMDSMI_EVT_NOTIF_EVENT_MIGRATE_START` + - `AMDSMI_EVT_NOTIF_EVENT_MIGRATE_END` + - `AMDSMI_EVT_NOTIF_EVENT_PAGE_FAULT_START` + - `AMDSMI_EVT_NOTIF_EVENT_PAGE_FAULT_END` + - `AMDSMI_EVT_NOTIF_EVENT_QUEUE_EVICTION` + - `AMDSMI_EVT_NOTIF_EVENT_QUEUE_RESTORE` + - `AMDSMI_EVT_NOTIF_EVENT_UNMAP_FROM_GPU` + - `AMDSMI_EVT_NOTIF_PROCESS_START` + - `AMDSMI_EVT_NOTIF_PROCESS_END` + +* Power Cap to `amd-smi monitor`. + - `amd-smi monitor -p` will display the power cap along with power. + +#### Changed + +* Updated `amdsmi_get_clock_info` in `amdsmi_interface.py`. + - The `clk_deep_sleep` field now returns the sleep integer value. + +* Updated `amdsmi_get_gpu_asic_info` in `amdsmi.h`. + - Added `subsystem_id` structure member. + +* The `amd-smi topology` command has been enabled for Guest environments. + - `amd-smi topology` is now available in Guest environments.
This includes full functionality so users can use the command just as they would in Bare Metal environments. + +* Expanded Violation Status tracking for GPU metrics 1.8. + - The driver no longer supports the existing single-value GFX Clk Below Host Limit fields (`acc_gfx_clk_below_host_limit`, `per_gfx_clk_below_host_limit`, `active_gfx_clk_below_host_limit`). They are replaced by new per-XCP/XCC arrays. + - Added new fields to `amdsmi_violation_status_t` and related interfaces for enhanced violation breakdown: + - Per-XCP/XCC accumulators and status for: + - GFX Clock Below Host Limit (Power, Thermal, and Total) + - Low Utilization + - Added 2D arrays to track per-XCP/XCC accumulators, percentage, and active status: + - `acc_gfx_clk_below_host_limit_pwr`, `acc_gfx_clk_below_host_limit_thm`, `acc_gfx_clk_below_host_limit_total` + - `per_gfx_clk_below_host_limit_pwr`, `per_gfx_clk_below_host_limit_thm`, `per_gfx_clk_below_host_limit_total` + - `active_gfx_clk_below_host_limit_pwr`, `active_gfx_clk_below_host_limit_thm`, `active_gfx_clk_below_host_limit_total` + - `acc_low_utilization`, `per_low_utilization`, `active_low_utilization` + - Python API and CLI now report these expanded fields. + +* The char arrays in the following structures have been changed. + - `amdsmi_vbios_info_t` member `build_date` changed from `AMDSMI_MAX_DATE_LENGTH` to `AMDSMI_MAX_STRING_LENGTH`. + - `amdsmi_dpm_policy_entry_t` member `policy_description` changed from `AMDSMI_MAX_NAME` to `AMDSMI_MAX_STRING_LENGTH`. + - `amdsmi_name_value_t` member `name` changed from `AMDSMI_MAX_NAME` to `AMDSMI_MAX_STRING_LENGTH`. + +* Updated `amdsmi_bdf_t` in `amdsmi.h`.
+ - The `amdsmi_bdf_t` union was changed to have an identical unnamed struct for backwards compatibility. + +#### Removed + +- Removed unnecessary API, `amdsmi_free_name_value_pairs()`, from `amdsmi.h`. + - This API is only used internally to free up memory from the Python interface and does not need to be + exposed to the user. + +- Removed unused definitions: + - `AMDSMI_MAX_NAME` + - `AMDSMI_256_LENGTH` + - `AMDSMI_MAX_DATE_LENGTH` + - `MAX_AMDSMI_NAME_LENGTH` + - `AMDSMI_LIB_VERSION_YEAR` + - `AMDSMI_DEFAULT_VARIANT` + - `AMDSMI_MAX_NUM_POWER_PROFILES` + - `AMDSMI_MAX_DRIVER_VERSION_LENGTH` + +- Removed unused member `year` in struct `amdsmi_version_t`. + +- Removed `amdsmi_io_link_type_t` and replaced it with `amdsmi_link_type_t`. + - `amdsmi_io_link_type_t` is no longer needed as `amdsmi_link_type_t` is sufficient. + - `amdsmi_link_type_t` enum has changed. + - This change will also affect `amdsmi_link_metrics_t`, where the link_type field changes from `amdsmi_io_link_type_t` to `amdsmi_link_type_t`. + +- Removed `amdsmi_get_power_info_v2()`. + - The ``amdsmi_get_power_info()`` has been unified and the v2 function is no longer needed/used. + +- Removed `AMDSMI_EVT_NOTIF_RING_HANG` event notification type in `amdsmi_evt_notification_type_t`. + +- The `amdsmi_get_gpu_vram_info` now provides vendor names as a string. + - `amdsmi_vram_vendor_type_t` enum structure is removed. + - `amdsmi_vram_info_t` member named `amdsmi_vram_vendor_type_t` is changed to a character string. + - `amdsmi_get_gpu_vram_info` now no longer requires decoding the vendor name as an enum. + +- Removed backwards compatibility for the `jpeg_activity` and `vcn_activity` fields of `amdsmi_get_gpu_metrics_info()`. Use `xcp_stats.jpeg_busy` or `xcp_stats.vcn_busy` instead. + - Backwards compatibility is removed for `jpeg_activity` and `vcn_activity` fields, if the `jpeg_busy` or `vcn_busy` field is available.
+ - Providing both `vcn_activity`/`jpeg_activity` and XCP (partition) stats `vcn_busy`/`jpeg_busy` caused confusion about which field to use. By removing backward compatibility, it is easier to identify the relevant field. + - The `jpeg_busy` field increased in size (for supported ASICs), making backward compatibility unable to fully copy the structure into `jpeg_activity`. + +### Optimized + +- Reduced the number of API calls that the ``amd-smi`` CLI makes before reading or (re)setting GPU features. + - Any `amd-smi` CLI command now makes fewer API calls. Previously, + reading a GPU's status (for example) also polled for other information used by the set/reset + CLI calls. This change improves the overall run-time performance of the CLI tool. + +- Removed partition information from the default `amd-smi static` CLI command. + - Users can still retrieve the same data by calling `amd-smi`, `amd-smi static -p`, or `amd-smi partition -c -m`/`sudo amd-smi partition -a`. + - Reading ``current_compute_partition`` may momentarily wake the GPU up. This is due to reading XCD registers, which is expected behavior. Changing partitions is not a trivial operation; the `current_compute_partition` SYSFS entry controls this action. + +- Optimized CLI command `amd-smi topology` in partition mode. + - Reduced the number of `amdsmi_topo_get_p2p_status` API calls to one fourth. + +### Resolved issues + +- Removed duplicated GPU IDs when receiving events using the `amd-smi event` command. + +```{note} +See the full [AMD SMI changelog](https://github.com/ROCm/amdsmi/blob/release/rocm-rel-7.0/CHANGELOG.md) for details, examples, and in-depth descriptions. +``` + +### **Composable Kernel** (1.1.0) #### Added -* Added support for bf16, f32, and f16 for 2D and 3D NGCHW grouped convolution backward data +* Added support for `BF16`, `F32`, and `F16` for 2D and 3D NGCHW grouped convolution backward data.
* Added a fully asynchronous HOST (CPU) arguments copy flow for CK grouped GEMM kernels. * Added support GKCYX layout for grouped convolution forward (NGCHW/GKCYX/NGKHW, number of instances in instance factory for NGCHW/GKYXC/NGKHW has been reduced). * Added support for GKCYX layout for grouped convolution forward (NGCHW/GKCYX/NGKHW). * Added support for GKCYX layout for grouped convolution backward weight (NGCHW/GKCYX/NGKHW). * Added support for GKCYX layout for grouped convolution backward data (NGCHW/GKCYX/NGKHW). -* Added support for Stream-K version of mixed fp8/bf16 GEMM -* Added support for Multiple D GEMM -* Added GEMM pipeline for microscaling (MX) FP8/FP6/FP4 data types -* Added support for FP16 2:4 structured sparsity to universal GEMM. +* Added support for Stream-K version of mixed `FP8` / `BF16` GEMM. +* Added support for Multiple D GEMM. +* Added GEMM pipeline for microscaling (MX) `FP8` / `FP6` / `FP4` data types. +* Added support for `FP16` 2:4 structured sparsity to universal GEMM. * Added support for Split K for grouped convolution backward data. * Added logit soft-capping support for fMHA forward kernels. -* Added support for hdim as a multiple of 32 for FMHA (fwd/fwd_splitkv) +* Added support for hdim as a multiple of 32 for FMHA (fwd/fwd_splitkv). * Added benchmarking support for tile engine GEMM. * Added Ping-pong scheduler support for GEMM operation along the K dimension. * Added rotating buffer feature for CK_Tile GEMM. * Added int8 support for CK_TILE GEMM. -#### Optimized +#### Changed - -* Optimize the gemm multiply multiply preshuffle & lds bypass with Pack of KGroup and better instruction layout. -* Added Vectorize Transpose optimization for CK Tile. -* Added the asynchronous copy for gfx950. - -#### Changes - -* Removed support for gfx940 and gfx941 targets. * Replaced the raw buffer load/store intrinsics with Clang20 built-ins. * DL and DPP kernels are now enabled by default.
* Number of instances in instance factory for grouped convolution forward NGCHW/GKYXC/NGKHW has been reduced. * Number of instances in instance factory for grouped convolution backward weight NGCHW/GKYXC/NGKHW has been reduced. * Number of instances in instance factory for grouped convolution backward data NGCHW/GKYXC/NGKHW has been reduced. -### HIP 7.0.0 +#### Removed + +* Removed support for gfx940 and gfx941 targets. + +#### Optimized + +* Optimized the GEMM multiply preshuffle and LDS bypass with Pack of KGroup and better instruction layout. +* Added Vectorize Transpose optimization for CK Tile. +* Added the asynchronous copy for gfx950. + +### **HIP** 7.0.0 #### Added @@ -615,17 +770,17 @@ For a historical overview of ROCm component updates, see the {doc}`ROCm consolid - `hipMemGetHandleForAddressRange` gets a handle for the address range requested. - `num_threads` Total number of threads in the group. The legacy API size is alias. - `__reduce_add_sync`, `__reduce_min_sync`, and `__reduce_max_sync` functions added for reduction across lanes of a warp. For details, see [Warp cross-lane functions](https://rocm.docs.amd.com/projects/HIP/en/latest/how-to/hip_cpp_language_extensions.html#warp-cross-lane-functions). -* New support for Open Compute Project (OCP) floating-point `FP4`/`FP6`/`FP8` as the following. For details, see [Low precision floating point document](https://rocm.docs.amd.com/projects/HIP/en/latest/reference/low_fp_types.html). - Data types for `FP4`/`FP6`/`FP8`. - HIP APIs for `FP4`/`FP6`/`FP8`, which are compatible with corresponding CUDA APIs. +* New support for Open Compute Project (OCP) floating-point `FP4` / `FP6` / `FP8` as follows. For details, see [Low precision floating point document](https://rocm.docs.amd.com/projects/HIP/en/latest/reference/low_fp_types.html). + - Data types for `FP4` / `FP6` / `FP8`. + - HIP APIs for `FP4` / `FP6` / `FP8`, which are compatible with corresponding CUDA APIs.
- HIP Extensions APIs for microscaling formats, which are supported on AMD GPUs. * New `wptr` and `rptr` values in `ClPrint`, for better logging in dispatch barrier methods. * New debug mask, to print precise code object information for logging. * The `_sync()` version of crosslane builtins such as `shfl_sync()` and `__reduce_add_sync` are enabled by default. These can be disabled by setting the preprocessor macro `HIP_DISABLE_WARP_SYNC_BUILTINS`. -* Added `constexpr` operators for `fp16`/`bf16`. +* Added `constexpr` operators for `FP16` / `BF16`. * Added `__syncwarp` operation. * Added PCI CHIP ID information as the device attribute. -* Added new tests applications for OCP data types `FP4`/`FP6`/`FP8`. +* Added new test applications for OCP data types `FP4` / `FP6` / `FP8`. * A new attribute in HIP runtime was implemented which exposes a new device capability of how many compute dies (chiplets, xcc) are available on a given GPU. Developers can get this attribute via the API `hipDeviceGetAttribute`, to make use of the best cache locality in a kernel, and optimize the Kernel launch grid layout, for performance improvement. #### Changed @@ -769,50 +924,52 @@ HIP runtime has the following functional improvements which greatly improve runt #### Added -* Added the `hipblasSetWorkspace()` API -* Support for codecoverage tests - +* Added the `hipblasSetWorkspace()` API. +* Support for code coverage tests. + #### Changed -* HIPBLAS_V2 API is now the only available API using `hipComplex` and `hipDatatype` types -* Documentation updates -* Verbose compilation for `hipblas.cpp` +* HIPBLAS_V2 API is the only available API using the `hipComplex` and `hipDatatype` types. +* Documentation updates. +* Verbose compilation for `hipblas.cpp`. #### Removed -* `hipblasDatatype_t` type -* `hipComplex` and `hipDoubleComplex` types -* Support code for non-production gfx targets +* `hipblasDatatype_t` type. +* `hipComplex` and `hipDoubleComplex` types.
+* Support code for non-production gfx targets. -#### Resolved Issues +#### Resolved issues -* The build time `CMake` configuration for the dependency on `hipBLAS-common` is fixed -* Compiler warnings for unhandled enums have been resolved +* The build-time `CMake` configuration for the dependency on `hipBLAS-common` is fixed. +* Compiler warnings for unhandled enumerations have been resolved. ### **hipBLASLt** (1.0.0) #### Added -* Stream-K GEMM support has been enabled for the `FP32`, `FP16`, `BF16`, `FP8`, and `BF8` data types on the MI300A APU. To activate this feature, set the `TENSILE_SOLUTION_SELECTION_METHOD` environment variable to `2`, for example, `export TENSILE_SOLUTION_SELECTION_METHOD=2`. +* Stream-K GEMM support has been enabled for the `FP32`, `FP16`, `BF16`, `FP8`, and `BF8` data types on the Instinct MI300A APU. To activate this feature, set the `TENSILE_SOLUTION_SELECTION_METHOD` environment variable to `2`, for example, `export TENSILE_SOLUTION_SELECTION_METHOD=2`. * Fused Swish/SiLU GEMM in hipBLASLt (enabled by ``HIPBLASLT_EPILOGUE_SWISH_EXT`` and ``HIPBLASLT_EPILOGUE_SWISH_BIAS_EXT``) -* Added support for ``HIPBLASLT_EPILOGUE_GELU_AUX_BIAS`` for gfx942 -* Added `HIPBLASLT_TUNING_USER_MAX_WORKSPACE` to constrain max workspace size for user offline tuning -* Added ``HIPBLASLT_ORDER_COL16_4R16`` and ``HIPBLASLT_ORDER_COL16_4R8`` to ``hipblasLtOrder_t`` to support FP16/BF16 swizzle GEMM and FP8/BF8 swizzle GEMM respectively. -* Added TF32 emulation on gfx950 +* Added support for ``HIPBLASLT_EPILOGUE_GELU_AUX_BIAS`` for gfx942. +* Added `HIPBLASLT_TUNING_USER_MAX_WORKSPACE` to constrain the maximum workspace size for user offline tuning. +* Added ``HIPBLASLT_ORDER_COL16_4R16`` and ``HIPBLASLT_ORDER_COL16_4R8`` to ``hipblasLtOrder_t`` to support `FP16` / `BF16` swizzle GEMM and `FP8` / `BF8` swizzle GEMM respectively. +* Added TF32 emulation on gfx950.
+* Added support for `FP6`, `BF6`, and `FP4` on gfx950 +* Added support for block scaling by setting `HIPBLASLT_MATMUL_DESC_A_SCALE_MODE` and `HIPBLASLT_MATMUL_DESC_B_SCALE_MODE` to `HIPBLASLT_MATMUL_MATRIX_SCALE_VEC32_UE8M0`. #### Changed -* ``HIPBLASLT_MATMUL_DESC_A_SCALE_POINTER_VEC_EXT`` and ``HIPBLASLT_MATMUL_DESC_B_SCALE_POINTER_VEC_EXT`` are removed. Use the ``HIPBLASLT_MATMUL_DESC_A_SCALE_MODE`` and ``HIPBLASLT_MATMUL_DESC_B_SCALE_MODE`` attributes to set scalar (``HIPBLASLT_MATMUL_MATRIX_SCALE_SCALAR_32F``) or vector (``HIPBLASLT_MATMUL_MATRIX_SCALE_OUTER_VEC_32F``). -* The non-V2 APIs (``GemmPreference``, ``GemmProblemType``, ``GemmEpilogue``, ``GemmTuning``, ``GemmInputs``) in the Cpp header are now the same as the V2 APIs (``GemmPreferenceV2``, ``GemmProblemTypeV2``, ``GemmEpilogueV2``, ``GemmTuningV2``, ``GemmInputsV2``). The original non-V2 APIs are removed. -* `hipblasltExtAMaxWithScale` API is removed. +* ``HIPBLASLT_MATMUL_DESC_A_SCALE_POINTER_VEC_EXT`` and ``HIPBLASLT_MATMUL_DESC_B_SCALE_POINTER_VEC_EXT`` are removed. Use the ``HIPBLASLT_MATMUL_DESC_A_SCALE_MODE`` and ``HIPBLASLT_MATMUL_DESC_B_SCALE_MODE`` attributes to set scalar (``HIPBLASLT_MATMUL_MATRIX_SCALE_SCALAR_32F``) or vector (``HIPBLASLT_MATMUL_MATRIX_SCALE_OUTER_VEC_32F``) attributes. +* The non-V2 APIs (``GemmPreference``, ``GemmProblemType``, ``GemmEpilogue``, ``GemmTuning``, ``GemmInputs``) in the cpp header are now the same as the V2 APIs (``GemmPreferenceV2``, ``GemmProblemTypeV2``, ``GemmEpilogueV2``, ``GemmTuningV2``, ``GemmInputsV2``). The original non-V2 APIs are removed. +* The `hipblasltExtAMaxWithScale` API is removed. #### Optimized -* Improved performance for 8-bit (FP8/BF8/I8) NN/NT cases by adding ``s_delay_alu`` to reduce stalls from dependent ALU operations on gfx12+. 
-* Improved performance for 8-bit and 16-bit (FP16/BF16) TN cases by enabling software dependency check (Expert Scheduling Mode) under certain restrictions to reduce redundant hardware dependency checks on gfx12+. +* Improved performance for 8-bit (`FP8` / `BF8` / `I8`) NN/NT cases by adding ``s_delay_alu`` to reduce stalls from dependent ALU operations on gfx12+. +* Improved performance for 8-bit and 16-bit (`FP16` / `BF16`) TN cases by enabling software dependency checks (Expert Scheduling Mode) under certain restrictions to reduce redundant hardware dependency checks on gfx12+. * Improved performance for 8-bit, 16-bit, and 32-bit batched GEMM with a better heuristic search algorithm for gfx942. -#### Upcoming Changes +#### Upcoming changes * V2 APIs (``GemmPreferenceV2``, ``GemmProblemTypeV2``, ``GemmEpilogueV2``, ``GemmTuningV2``, ``GemmInputsV2``) are deprecated. @@ -841,28 +998,28 @@ HIP runtime has the following functional improvements which greatly improve runt * Deprecated `hipcub::AsmThreadStore` is removed, use `hipcub::ThreadStore` instead. * Deprecated `BlockAdjacentDifference::FlagHeads`, `BlockAdjacentDifference::FlagTails` and `BlockAdjacentDifference::FlagHeadsAndTails` have been removed. * This release removes support for custom builds on gfx940 and gfx941. -* Removed C++14 support, only C++17 is supported. +* Removed C++14 support. Only C++17 is supported. #### Changed -* The NVIDIA backend now requires CUB, Thrust, and libcu++ 2.7.0. If they aren't found, they will be downloaded from the NVIDIA CCCL repository. +* The NVIDIA backend now requires CUB, Thrust, and libcu++ 2.7.0. If they aren't found, they will be downloaded from the NVIDIA CCCL repository. * Updated `thread_load` and `thread_store` to align hipCUB with CUB. -* All kernels now have hidden symbol visibility. 
All symbols now have inline namespaces that include the library version, (for example, hipcub::HIPCUB_300400_NS::symbol instead of hipcub::symbol), letting the user link multiple libraries built with different versions of hipCUB. +* All kernels now have hidden symbol visibility. All symbols now have inline namespaces that include the library version, (for example, `hipcub::HIPCUB_300400_NS::symbol` instead of `hipcub::symbol`), letting the user link multiple libraries built with different versions of hipCUB. * Modified the broadcast kernel in warp scan benchmarks. The reported performance may be different to previous versions. * The `hipcub::detail::accumulator_t` in rocPRIM backend has been changed to utilise `rocprim::accumulator_t`. * The usage of `rocprim::invoke_result_binary_op_t` has been replaced with `rocprim::accumulator_t`. -#### Resolved Issues +#### Resolved issues * Fixed an issue where `Sort(keys, compare_op, valid_items, oob_default)` in `block_merge_sort.hpp` would not fill in elements that are out of range (items after `valid_items`) with `oob_default`. * Fixed an issue where `ScatterToStripedFlagged` in `block_exhange.hpp` was calling the wrong function. -#### Known Issues +#### Known issues -* `BlockAdjacentDifference::FlagHeads`, `BlockAdjacentDifference::FlagTails` and `BlockAdjacentDifference::FlagHeadsAndTails` have been removed from hipCUB's CUB backend. They were already deprecated as of version 2.12.0 of hipCUB and they were removed from CCCL (CUB) as of CCCL's 2.6.0 release. +* `BlockAdjacentDifference::FlagHeads`, `BlockAdjacentDifference::FlagTails` and `BlockAdjacentDifference::FlagHeadsAndTails` have been removed from hipCUB's CUB backend. They were already deprecated as of version 2.12.0 of hipCUB and they were removed from CCCL (CUB) as of CCCL's 2.6.0 release. * `BlockScan::InclusiveScan` for the NVIDIA backend does not compute the block aggregate correctly when passing an initial value parameter. 
This behavior is not matched by the AMD backend. -#### Upcoming Changes +#### Upcoming changes * `BlockAdjacentDifference::FlagHeads`, `BlockAdjacentDifference::FlagTails` and `BlockAdjacentDifference::FlagHeadsAndTails` were deprecated as of version 2.12.0 of hipCUB, and will be removed from the rocPRIM backend in a future release for the next ROCm major version (ROCm 7.0.0). @@ -874,16 +1031,15 @@ HIP runtime has the following functional improvements which greatly improve runt #### Removed -* Removed hipfft-rider legacy compatibility from clients -* Remove support for the gfx940 and gfx941 targets from the client programs. -* Remove backward compatibility symlink for include directories. +* Removed hipfft-rider legacy compatibility from clients. +* Removed support for the gfx940 and gfx941 targets from the client programs. +* Removed backward compatibility symlink for include directories. ### **hipfort** (0.7.0) #### Added -* Added documentation clarifying how hipfort is built for the NVIDIA - platform. Thanks [@fluidnumerics-joe](https://github.com/fluidnumerics-joe)! +* Added documentation clarifying how hipfort is built for the NVIDIA platform. 
#### Changed @@ -898,27 +1054,27 @@ HIP runtime has the following functional improvements which greatly improve runt * cuTENSOR 2.2.0.0 support * LLVM 20.1.8 support -#### Resolved Issues +#### Resolved issues * `hipDNN` support is removed by default * [#1859](https://github.com/ROCm/HIPIFY/issues/1859)[hipify-perl] Fix warnings on unsupported Driver or Runtime APIs which were erroneously not reported * [#1930](https://github.com/ROCm/HIPIFY/issues/1930) Revise `JIT API` * [#1962](https://github.com/ROCm/HIPIFY/issues/1962) Support for cuda-samples helper headers -* [#2035](https://github.com/ROCm/HIPIFY/issues/2035) Remove `const_cast<const char**>` in `hiprtcCreateProgram` and `hiprtcCompileProgram` +* [#2035](https://github.com/ROCm/HIPIFY/issues/2035) Remove `const_cast<const char**>` in `hiprtcCreateProgram` and `hiprtcCompileProgram` ### **hipRAND** (3.0.0) #### Added -* gfx950 support +* gfx950 support. #### Changed -* Deprecated hipRAND's Fortran API in favor of hipfort. - +* Deprecated the hipRAND Fortran API in favor of hipfort. + #### Removed - -* Removed C++14 support, only C++17 is supported. + +* Removed C++14 support, so only C++17 is supported. ### **hipSOLVER** (3.0.0) @@ -928,42 +1084,37 @@ HIP runtime has the following functional improvements which greatly improve runt * csrlsvqr * hipsolverSpCcsrlsvqr, hipsolverSpZcsrlsvqr -#### Resolved Issues - -* Corrected the value of `lwork` returned by various `bufferSize` functions to be consistent with NVIDIA cuSOLVER. The following functions will - now return `lwork` such that the workspace size (in bytes) is `sizeof(T) * lwork`, rather than `lwork`. To restore the original behavior, set - environment variable `HIPSOLVER_BUFFERSIZE_RETURN_BYTES`.
- * hipsolverXorgbr_bufferSize, hipsolverXorgqr_bufferSize, hipsolverXorgtr_bufferSize, hipsolverXormqr_bufferSize, hipsolverXormtr_bufferSize, - hipsolverXgesvd_bufferSize, hipsolverXgesvdj_bufferSize, hipsolverXgesvdBatched_bufferSize, hipsolverXgesvdaStridedBatched_bufferSize, - hipsolverXsyevd_bufferSize, hipsolverXsyevdx_bufferSize, hipsolverXsyevj_bufferSize, hipsolverXsyevjBatched_bufferSize, - hipsolverXsygvd_bufferSize, hipsolverXsygvdx_bufferSize, hipsolverXsygvj_bufferSize, hipsolverXsytrd_bufferSize, hipsolverXsytrf_bufferSize +#### Resolved issues + +* Corrected the value of `lwork` returned by various `bufferSize` functions to be consistent with NVIDIA cuSOLVER. The following functions now return `lwork` so that the workspace size (in bytes) is `sizeof(T) * lwork`, rather than `lwork`. To restore the original behavior, set the environment variable `HIPSOLVER_BUFFERSIZE_RETURN_BYTES`. + * `hipsolverXorgbr_bufferSize`, `hipsolverXorgqr_bufferSize`, `hipsolverXorgtr_bufferSize`, `hipsolverXormqr_bufferSize`, `hipsolverXormtr_bufferSize`, `hipsolverXgesvd_bufferSize`, `hipsolverXgesvdj_bufferSize`, `hipsolverXgesvdBatched_bufferSize`, `hipsolverXgesvdaStridedBatched_bufferSize`, `hipsolverXsyevd_bufferSize`, `hipsolverXsyevdx_bufferSize`, `hipsolverXsyevj_bufferSize`, `hipsolverXsyevjBatched_bufferSize`, `hipsolverXsygvd_bufferSize`, `hipsolverXsygvdx_bufferSize`, `hipsolverXsygvj_bufferSize`, `hipsolverXsytrd_bufferSize`, `hipsolverXsytrf_bufferSize`. ### **hipSPARSE** (4.0.1) #### Added -* Add the `int8`, `int32`, and `float16` data types to `hipDataTypeToHCCDataType` so that sparse matrix descriptors can be used with them. 
-* Adds half float mixed precision to `hipsparseAxpby` where X and Y use float16 and result and the compute type use float -* Adds half float mixed precision to `hipsparseSpVV` where X and Y use float16 and result and the compute type use float -* Adds half float mixed precision to `hipsparseSpMM` where A and B use float16 and C and the compute type use float -* Adds half float mixed precision to `hipsparseSDDMM` where A and B use float16 and C and the compute type use float -* Adds half float uniform precision to `hipsparseScatter` and `hipsparseGather` routines -* Adds half float uniform precision to `hipsparseSDDMM` routine -* Add `int8` precision to `hipsparseCsr2cscEx2` routine. -* Add the `almalinux` OS name to correct the gfortran dependency +* Added the `int8`, `int32`, and `float16` data types to `hipDataTypeToHCCDataType` so that sparse matrix descriptors can be used with them. +* Added half float mixed precision to `hipsparseAxpby` where X and Y use `float16` and the result and compute type use `float`. +* Added half float mixed precision to `hipsparseSpVV` where X and Y use `float16` and the result and compute type use `float`. +* Added half float mixed precision to `hipsparseSpMM` where A and B use `float16` and C and the compute type use `float`. +* Added half float mixed precision to `hipsparseSDDMM` where A and B use `float16` and C and the compute type use `float`. +* Added half float uniform precision to the `hipsparseScatter` and `hipsparseGather` routines. +* Added half float uniform precision to the `hipsparseSDDMM` routine. +* Added `int8` precision to the `hipsparseCsr2cscEx2` routine. +* Added the `almalinux` operating system name to correct the GFortran dependency. #### Changed -* Switch to defaulting to C++17 when building hipSPARSE from source. Previously hipSPARSE was using C++14 by default. - -#### Resolved Issues +* Switched to defaulting to C++17 when building hipSPARSE from source. Previously hipSPARSE was using C++14 by default. 
+ +#### Resolved issues * Fixed a compilation [issue](https://github.com/ROCm/hipSPARSE/issues/555) related to using `std::filesystem` and C++14. -* Fixed the empty clients-common package by moving the `hipsparse_clientmatrices.cmake` and `hipsparse_mtx2csr` files to it. +* Fixed an issue where the clients-common package was empty by moving the `hipsparse_clientmatrices.cmake` and `hipsparse_mtx2csr` files to it. -#### Known Issues - -* In `hipsparseSpSM_solve()`, the external buffer is passed as a parameter. This does not match the NVIDIA CUDA cuSPARSE API. This extra external buffer parameter will be removed in a future release. For now, this extra parameter can be ignored and nullptr passed because it is unused internally by `hipsparseSpSM_solve()`. +#### Known issues + +* In `hipsparseSpSM_solve()`, the external buffer is passed as a parameter. This does not match the NVIDIA CUDA cuSPARSE API. This extra external buffer parameter will be removed in a future release. For now, this extra parameter can be ignored and nullptr passed in, because it is unused internally. ### **hipSPARSELt** (0.2.4) @@ -971,20 +1122,20 @@ HIP runtime has the following functional improvements which greatly improve runt * Support for the LLVM target gfx950. * Support for the following data type combinations for the LLVM target gfx950: - * FP8(E4M3) inputs, F32 output, and F32 Matrix Core accumulation. - * BF8(E5M2) inputs, F32 output, and F32 Matrix Core accumulation. + * `FP8`(E4M3) inputs, `F32` output, and `F32` Matrix Core accumulation. + * `BF8`(E5M2) inputs, `F32` output, and `F32` Matrix Core accumulation. * Support for ROC-TX if `HIPSPARSELT_ENABLE_MARKER=1` is set. * Support for the cuSPARSELt v0.6.3 backend. +#### Removed + +* Support for LLVM targets gfx940 and gfx941 has been removed. +* `hipsparseLtDatatype_t` has been removed. + #### Optimized * Improved the library loading time. -* Provided more kernels for FP16 datatype. 
- -#### Removed - -* Support for LLVM targets gfx940 and gfx941 has been removed. -* `hipsparseLtDatatype_t` has been removed. +* Provided more kernels for the `FP16` data type. ### **hipTensor** (2.0.0) @@ -1032,79 +1183,107 @@ HIP runtime has the following functional improvements which greatly improve runt ### **MIOpen** (3.5.0) #### Added + +* [Conv] Added misa kernels for gfx950. +* [Conv] Enabled Split-K support for CK backward data solvers (2D). +* [Conv] Enabled CK wrw solver on gfx950 for the `BF16` data type. +* [BatchNorm] Enabled NHWC in OpenCL. +* Added grouped convolution + activation fusion. +* Added grouped convolution + bias + activation fusion. +* Composable Kernel (CK) can now be built inline as part of MIOpen. -* [Conv] Added misa kernels for gfx950 -* [Conv] Enabled split_k support for CK backward data solvers (2D) -* Added grouped convolution + activation fusion -* Added grouped convolution + bias + activation fusion -* [BatchNorm] Enabled NHWC in OpenCL -* Composable Kernel (CK) can now be built inline as part of MIOpen -* Changed to using median value with outliers removed when deciding on the best solution to run -* [Conv] Enabled CK wrw solver on gfx950 for bf16 datatype -* [Conv] Updated igemm asm solver +#### Changed + +* Changed to using the median value with outliers removed when deciding on the best solution to run. +* [Conv] Updated the igemm asm solver. #### Optimized * [BatchNorm] Optimized NHWC OpenCL kernels and improved heuristics -* [RNN] Dynamic algorithm optimization +* [RNN] Dynamic algorithm optimization. * [Conv] Eliminated redundant clearing of output buffers -* [RNN] Updated selection heuristics -* Updated tuning for MI300 +* [RNN] Updated selection heuristics. +* Updated tuning for the AMD Instinct MI300 series. 
-#### Resolved Issues +#### Resolved issues -* Fixed a segmentation fault when user specifies workspace smaller than what is required -* Fixed a layout calculation logic error that returned incorrect results and enabled less restrictive layout selection -* Fixed memory access faults in misa kernels due to out-of-bounds memory usage -* Fixed performance drop on gfx950 due to transpose kernel use -* Fixed memory access fault caused by not allocating enough workspace -* Fixed a name typo that caused kernel mismatches and long startup times +* Fixed a segmentation fault when the user specified a smaller workspace than what was required. +* Fixed a layout calculation logic error that returned incorrect results and enabled less restrictive layout selection. +* Fixed memory access faults in misa kernels due to out-of-bounds memory usage. +* Fixed a performance drop on the gfx950 due to transpose kernel use. +* Fixed a memory access fault caused by not allocating enough workspace. +* Fixed a name typo that caused kernel mismatches and long startup times. ### **MIVisionX** (3.3.0) +#### Added + +* Support to enable/disable BatchPD code in VX_RPP extensions by checking the RPP_LEGACY_SUPPORT flag. + #### Changed -* VX_RPP extension : Version 3.1.0 release -* Add support to enable/disable BatchPD code in VX_RPP extensions by checking the RPP_LEGACY_SUPPORT flag. +* VX_RPP extension : Version 3.1.0 release. * Update the parameters and kernel API of Blur, Fog, Jitter, LensCorrection, Rain, Pixelate, Vignette and ResizeCrop wrt tensor kernels replacing the legacy BatchPD API calls in VX_RPP extensions. -#### Known Issues +#### Known issues * Installation on CentOS/RedHat/SLES requires the manual installation of the `FFMPEG` & `OpenCV` dev packages. 
-#### Upcoming Changes +#### Upcoming changes * Optimized audio augmentations support for VX_RPP -### **rccl** (2.26.6) - -#### Resolved Issues - -* Resolved an issue when using more than 64 channels when multiple collectives are used in the same `ncclGroup()` call. -* Fixed unit test failures in tests ending with `ManagedMem` and `ManagedMemGraph` suffixes. -* Suboptimal algorithmic switching point for AllReduce on MI300x. -* Fixed the known issue "When splitting a communicator using `ncclCommSplit` in some GPU configurations, MSCCL initialization can cause a segmentation fault." with a design change to use `comm` instead of `rank` for `mscclStatus`. The Global map for `comm` to `mscclStatus` is still not thread safe but should be explicitly handled by mutexes for read writes. This is tested for correctness, but there is a plan to use a thread-safe map data structure in upcoming changes. +### **RCCL** (2.26.6) #### Added -* Added support for extended fine-grained system memory pool. -* Added new GPU target `gfx950`. +* Added support for the extended fine-grained system memory pool. +* Added support for gfx950. * Added support for `unroll=1` in device-code generation to improve performance. * Set a default of 112 channels for a single node with `8 * gfx950`. -* Enabled LL128 protocol on `gfx950`. -* Adding ability to choose unroll factor at runtime via `RCCL_UNROLL_FACTOR`. This can be set at runtime to 1, 2, or 4. This change currently increases compilation and linking time because it triples the number of kernels generated. -* Added MSCCL support for AllGather multinode gfx942/gfx950 (i.e., 16 and 32 GPUs). To enable, set the environment variable `RCCL_MSCCL_FORCE_ENABLE=1`. Max message size for MSCCL AllGather usage is `12292 * sizeof(datatype) * nGPUs`. -* Thread thresholds for LL/LL128 are selected in Tuning Models for the MI300X. This impacts the number of channels used for AG and RS. 
Channel tuning model is bypassed if `NCCL_THREAD_THRESHOLDS`, `NCCL_MIN_NCHANNELS', or 'NCCL_MAX_NCHANNELS` are set. -* Multi-node tuning for AllGather, AllReduce, and ReduceScatter that leverages LL/LL64/LL128 protocol to use nontemporal vector load/store for tunable message size ranges. +* Enabled the LL128 protocol on gfx950. +* Added the ability to choose the unroll factor at runtime using `RCCL_UNROLL_FACTOR`. Valid values are 1, 2, or 4. This change currently increases compilation and linking time because it triples the number of kernels generated. +* Added MSCCL support for AllGather multinode gfx942/gfx950 (that is, 16 and 32 GPUs). To enable this feature, set the environment variable `RCCL_MSCCL_FORCE_ENABLE=1`. The maximum message size for MSCCL AllGather usage is `12292 * sizeof(datatype) * nGPUs`. +* Thread thresholds for LL/LL128 are selected in Tuning Models for the AMD Instinct MI300X. This impacts the number of channels used for AG and RS. The channel tuning model is bypassed if `NCCL_THREAD_THRESHOLDS`, `NCCL_MIN_NCHANNELS`, or `NCCL_MAX_NCHANNELS` are set. +* Multi-node tuning for AllGather, AllReduce, and ReduceScatter that leverages LL/LL64/LL128 protocols to use nontemporal vector load/store for tunable message size ranges. * LL/LL128 usage ranges for AR, AG, and RS are part of the tuning models, which enable architecture-specific tuning in conjunction with the existing Rome Models scheme in RCCL. * Two new APIs are exposed as part of an initiative to separate RCCL code. These APIs are `rcclGetAlgoInfo` and `rcclFuncMaxSendRecvCount`. However, user-level invocation requires that RCCL be built with `RCCL_EXPOSE_STATIC` enabled. #### Changed -* Compatibility with NCCL 2.23.4 -* Compatibility with NCCL 2.24.3 -* Compatibility with NCCL 2.25.1 -* Compatibility with NCCL 2.26.6 +* Compatibility with NCCL 2.23.4. +* Compatibility with NCCL 2.24.3. +* Compatibility with NCCL 2.25.1. +* Compatibility with NCCL 2.26.6.
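The MSCCL AllGather limit quoted in the RCCL Added list above, `12292 * sizeof(datatype) * nGPUs`, is plain arithmetic. The following is a minimal sketch of that formula, assuming the result is a byte count. The helper name `msccl_allgather_max_bytes` and the `DTYPE_SIZES` table are hypothetical illustrations and are not part of RCCL.

```python
# Hypothetical helper, not an RCCL API. It only evaluates the formula
# quoted in the release notes: 12292 * sizeof(datatype) * nGPUs.
MSCCL_ALLGATHER_FACTOR = 12292

# Element sizes in bytes for a few common data types (illustrative table).
DTYPE_SIZES = {"int8": 1, "float16": 2, "bfloat16": 2, "float32": 4, "float64": 8}


def msccl_allgather_max_bytes(dtype: str, n_gpus: int) -> int:
    """Return the maximum MSCCL AllGather message size (assumed to be bytes)."""
    if n_gpus <= 0:
        raise ValueError("n_gpus must be positive")
    return MSCCL_ALLGATHER_FACTOR * DTYPE_SIZES[dtype] * n_gpus


if __name__ == "__main__":
    # The release notes mention 16-GPU and 32-GPU multinode configurations.
    for n_gpus in (16, 32):
        print(n_gpus, msccl_allgather_max_bytes("float32", n_gpus))
```

Under these assumptions, messages larger than the computed value would presumably not use the MSCCL AllGather path; the release notes do not say which path is taken instead.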
+ +#### Resolved issues + +* Resolved an issue when using more than 64 channels when multiple collectives are used in the same `ncclGroup()` call. +* Fixed unit test failures in tests ending with the `ManagedMem` and `ManagedMemGraph` suffixes. +* Fixed a suboptimal algorithmic switching point for AllReduce on the AMD Instinct MI300X. +* Fixed the known issue "When splitting a communicator using `ncclCommSplit` in some GPU configurations, MSCCL initialization can cause a segmentation fault" with a design change to use `comm` instead of `rank` for `mscclStatus`. The Global map for `comm` to `mscclStatus` is still not thread safe but should be explicitly handled by mutexes for read-write operations. This is tested for correctness, but there is a plan to use a thread-safe map data structure in an upcoming release. + +### **rocAL** (2.3.0) + +#### Added +* Extended rocAL's video decoder to support the rocDecode hardware decoder. +* Setup - installs rocdecode dev packages for Ubuntu, RedHat, and SLES. +* Setup - installs turbojpeg dev package for Ubuntu and RedHat. +* rocAL's image decoder has been extended to support the rocJPEG hardware decoder. +* Added NumPy reader support for reading npy files in rocAL. +* Added a test case for the NumPy reader in the C++ and Python tests. + +#### Resolved issues +* `TurboJPEG` no longer needs to be installed manually. It is now installed by the package installer. +* Hardware decode no longer requires that ROCm be installed with the `graphics` usecase. + +#### Known issues +* Package installation on SLES requires manually installing `TurboJPEG`. +* Package installation on CentOS, RedHat, and SLES requires manually installing the `FFMPEG Dev` package. + +#### Upcoming changes + +* rocJPEG support for JPEG decode.
### **rocALUTION** (4.0.0) @@ -1120,7 +1299,7 @@ HIP runtime has the following functional improvements which greatly improve runt * Improved the user documentation -#### Resolved Issues +#### Resolved issues * Fix for GPU hashing algorithm when not compiling with -O2/O3 @@ -1128,47 +1307,55 @@ HIP runtime has the following functional improvements which greatly improve runt #### Added -* gfx950 support -* `ROCBLAS_LAYER = 8` internal API logging for `gemm` debugging -* Support for AOCL 5.0 gcc build as a client reference library -* Allow `PkgConfig` for client reference library fallback detection +* gfx950 support. +* Internal API logging for `gemm` debugging using `ROCBLAS_LAYER = 8`. +* Support for AOCL 5.0 gcc build as a client reference library. +* Allowing the use of `PkgConfig` for client reference library fallback detection. #### Changed -* `CMAKE_CXX_COMPILER` is now passed on during compilation for a Tensile build -* Change default atomics mode from `allowed` to `not allowed` - -#### Removed - -* Support code for non-production gfx targets -* `rocblas_hgemm_kernel_name`, `rocblas_sgemm_kernel_name`, and `rocblas_dgemm_kernel_name` API functions -* Use of `warpSize` as a constexpr -* Use of deprecated behavior of `hipPeekLastError` -* `rocblas_float8.h` and `rocblas_hip_f8_impl.h` files -* `rocblas_gemm_ex3`, `rocblas_gemm_batched_ex3`, `rocblas_gemm_strided_batched_ex3` API functions +* `CMAKE_CXX_COMPILER` is now passed on during compilation for a Tensile build. +* The default atomics mode is changed from `allowed` to `not allowed`. 
#### Optimized -* Optimized `gemm` by using `gemv` kernels when applicable -* Optimized `gemv` for small `m` and `n` with a large batch count on gfx942 -* Improved the performance of Level 1 `dot` for all precisions and variants when `N > 100000000` on gfx942 -* Improved the performance of Level 1 `asum` and `nrm2` for all precisions and variants on gfx942 -* Improved the performance of Level 2 `sger` (single precision) on gfx942 -* Improved the performance of Level 3 `dgmm` for all precisions and variants on gfx942 +* Optimized `gemm` by using `gemv` kernels when applicable. +* Optimized `gemv` for small `m` and `n` with a large batch count on gfx942. +* Improved the performance of Level 1 `dot` for all precisions and variants when `N > 100000000` on gfx942. +* Improved the performance of Level 1 `asum` and `nrm2` for all precisions and variants on gfx942. +* Improved the performance of Level 2 `sger` (single precision) on gfx942. +* Improved the performance of Level 3 `dgmm` for all precisions and variants on gfx942. -#### Resolved Issues +#### Removed -* Fixed environment variable path-based logging to append multiple handle output to the same file -* Support numerics when `trsm` is running with `rocblas_status_perf_degraded` -* Fixed the build dependency installation of `joblib` on some operating systems -* Return `rocblas_status_internal_error` when `rocblas_[set,get]_ [matrix,vector]` is called with a host pointer in place of a device pointer -* Reduced the default verbosity level for internal GEMM backend information -* Updated from the deprecated rocm-cmake to ROCmCMakeBuildTools -* Corrected AlmaLinux gfortran package dependencies +* Support code for non-production gfx targets. +* `rocblas_hgemm_kernel_name`, `rocblas_sgemm_kernel_name`, and `rocblas_dgemm_kernel_name` API functions. +* The use of `warpSize` as a constexpr. +* The use of deprecated behavior of `hipPeekLastError`. +* `rocblas_float8.h` and `rocblas_hip_f8_impl.h` files. 
+* `rocblas_gemm_ex3`, `rocblas_gemm_batched_ex3`, and `rocblas_gemm_strided_batched_ex3` API functions. -#### Upcoming Changes +#### Resolved issues + +* Fixed environment variable path-based logging to append multiple handle outputs to the same file. +* Support numerics when `trsm` is running with `rocblas_status_perf_degraded`. +* Fixed the build dependency installation of `joblib` on some operating systems. +* Return `rocblas_status_internal_error` when `rocblas_[set,get]_ [matrix,vector]` is called with a host pointer in place of a device pointer. +* Reduced the default verbosity level for internal GEMM backend information. +* Updated from the deprecated rocm-cmake to ROCmCMakeBuildTools. +* Corrected AlmaLinux GFortran package dependencies. -* Deprecated the use of negative indices to indicate the default solution is being used for `gemm_ex` with `rocblas_gemm_algo_solution_index` +#### Upcoming changes + +* Deprecated the use of negative indices to indicate the default solution is being used for `gemm_ex` with `rocblas_gemm_algo_solution_index`. + +### **ROCdbgapi** (0.77.3) + +#### Added +- Support for the `gfx950`, `gfx1150`, and `gfx1151` architectures. + +#### Removed +- Support for the `gfx940` and `gfx941` architectures. ### **rocDecode** (1.0.0) @@ -1179,23 +1366,26 @@ HIP runtime has the following functional improvements which greatly improve runt * HEVC/AVC/AV1/VP9 stream syntax error handling. * HEVC stream bit depth change handling and DPB buffer size change handling through decoder reconfiguration. * AVC stream DPB buffer size change handling through decoder reconfiguration. -* rocdecode now uses the Cmake CMAKE_PREFIX_PATH directive. -* rocdecode - A new avcodec-based decoder built as a separate "rocdecode-host" library +* A new avcodec-based decoder built as a separate `rocdecode-host` library + +#### Changed + +* rocDecode now uses the Cmake `CMAKE_PREFIX_PATH` directive. #### Optimized -* Decode session start latency reduction. 
+* Decode session start latency reduction. * Bitstream type detection optimization in bitstream reader. -#### Resolved Issues +#### Resolved issues -* Fixed a bug in picture files sample "videoDecodePicFiles" that can results in incorrect output frame count. +* Fixed a bug in the `videoDecodePicFiles` picture files sample that can result in incorrect output frame count. * Fixed a decoded frame output issue in video size change cases. -* Removed incorrect asserts of bitdepth_minus_8 in GetBitDepth() and num_chroma_planes in GetNumChromaPlanes() API calls in RocVideoDecoder utility class. +* Removed incorrect asserts of `bitdepth_minus_8` in `GetBitDepth()` and `num_chroma_planes` in `GetNumChromaPlanes()` API calls in the RocVideoDecoder utility class. #### Removed -* GetStream() interface call from RocVideoDecoder utility class +* `GetStream()` interface call from RocVideoDecoder utility class. #### Changed @@ -1210,7 +1400,7 @@ HIP runtime has the following functional improvements which greatly improve runt #### Removed -* Removed rocfft-rider legacy compatibility from clients +* Removed ``rocfft-rider`` legacy compatibility from clients. * Removed support for the gfx940 and gfx941 targets from the client programs. * Removed backward compatibility symlink for include directories. @@ -1225,11 +1415,234 @@ HIP runtime has the following functional improvements which greatly improve runt - 8192 * Implemented single-kernel plans for some large 1D problem sizes, on devices with at least 160KiB of LDS. -#### Resolved Issues +#### Resolved issues * Fixed kernel faults on multi-device transforms that gather to a single device, when the input/output bricks are not contiguous. +### **rocJPEG** (1.1.0) + +#### Added +* cmake config files. +* CTEST - New tests were introduced for JPEG batch decoding using various output formats, such as yuv_planar, y, rgb, and rgb_planar, both with and without region-of-interest (ROI).
+ +#### Changed +* Readme - cleanup and updates to pre-reqs. +* The `decode_params` argument of the `rocJpegDecodeBatched` API is now an array of `RocJpegDecodeParams` structs representing the decode parameters for the batch of JPEG images. +* `libdrm_amdgpu` is now explicitly linked with rocjpeg. + +#### Removed +* Dev Package - No longer installs pkg-config. + +#### Resolved issues +* Fixed a bug that prevented copying the decoded image into the output buffer when the output buffer is larger than the input image. +* Resolved an issue with resizing the internal memory pool by utilizing the explicit constructor of the vector's type during the resizing process. +* Addressed and resolved CMake configuration warnings. + +### **ROCm SMI** (7.8.0) + +#### Added + +- Support for GPU metrics 1.8. + - Added new fields for `rsmi_gpu_metrics_t` including: + - Adding the following metrics to allow new calculations for violation status: + - Per XCP metrics `gfx_below_host_limit_ppt_acc[XCP][MAX_XCC]` - GFX Clock Host limit Package Power Tracking violation counts + - Per XCP metrics `gfx_below_host_limit_thm_acc[XCP][MAX_XCC]` - GFX Clock Host limit Thermal (TVIOL) violation counts + - Per XCP metrics `gfx_low_utilization_acc[XCP][MAX_XCC]` - violation counts for how did low utilization caused the GPU to be below application clocks. + - Per XCP metrics `gfx_below_host_limit_total_acc[XCP][MAX_XCC]`- violation counts for how long GPU was held below application clocks any limiter (see above new violation metrics). + - Increasing available JPEG engines to 40. + Current ASICs may not support all 40. These will be indicated as UINT16_MAX or N/A in CLI. + +#### Removed + +- Removed backwards compatibility for `rsmi_dev_gpu_metrics_info_get()`'s `jpeg_activity` and `vcn_activity` fields. Alternatively use `xcp_stats.jpeg_busy` and `xcp_stats.vcn_busy`. 
+ - Backwards compability is removed for `jpeg_activity` and `vcn_activity` fields, if the `jpeg_busy` or `vcn_busy` field is available. + - Providing both `vcn_activity`/`jpeg_activity` and XCP (partition) stats `vcn_busy`/`jpeg_busy` caused confusion for users about which field to use. By removing backward compatibility, it is easier to identify the relevant field. + - The `jpeg_busy` field increased in size (for supported ASICs), making backward compatibility unable to fully copy the structure into `jpeg_activity`. + +```{note} +See the full [ROCm SMI changelog](https://github.com/ROCm/rocm_smi_lib/blob/release/rocm-rel-7.0/CHANGELOG.md) for details, examples, and in-depth descriptions. +``` + +### **ROCm Compute Profiler** (3.2.1) + +#### Added + +##### CDNA4 (AMD Instinct MI350/MI355) support + +* Support for AMD Instinct MI350 series GPUs with the addition of the following counters: + * VALU co-issue (Two VALUs are issued instructions) efficiency + * Stream Processor Instruction (SPI) Wave Occupancy + * Scheduler-Pipe Wave Utilization + * Scheduler FIFO Full Rate + * CPC ADC Utilization + * F6F4 data type metrics + * Update formula for total FLOPs while taking into account F6F4 ops + * LDS STORE, LDS LOAD, LDS ATOMIC instruction count metrics + * LDS STORE, LDS LOAD, LDS ATOMIC bandwidth metrics + * LDS FIFO full rate + * Sequencer -> TA ADDR Stall rates + * Sequencer -> TA CMD Stall rates + * Sequencer -> TA DATA Stall rates + * L1 latencies + * L2 latencies + * L2 to EA stalls + * L2 to EA stalls per channel + +* Roofline support for AMD Instinct MI350 series architecture. + +##### Textual User Interface (TUI) (beta version) + +* Text User Interface (TUI) support for analyze mode + * A command line based user interface to support interactive single-run analysis + * To launch, use `--tui` option in analyze mode. For example, ``rocprof-compute analyze --tui``. 
+ +##### PC Sampling (beta version) + +* Stochastic (hardware-based) PC sampling has been enabled for AMD Instinct MI300X series and later accelerators. + +* Host-trap PC Sampling has been enabled for AMD Instinct MI200 series and later accelerators. + +* Support for sorting of PC sampling by type: offset or count. + +* PC Sampling Support on CLI and TUI analysis. + +##### Roofline + +* Support for Roofline plot on CLI (single run) analysis. + +* Roofline support for RHEL 10 OS. + +* `FP4` and `FP6` data types have been added for roofline profiling on AMD Instinct MI350 series. + +##### rocprofv3 support + +* ``rocprofv3`` is supported as the default backend for profiling. +* Support to obtain performance information for all channels for TCC counters. +* Support for profiling on AMD Instinct MI 100 using ``rocprofv3``. +* Deprecation warning for ``rocprofv3`` interface in favor of the ROCprofiler-SDK interface, which directly accesses ``rocprofv3`` C++ tool. + +##### Others + +* Docker files to package the application and dependencies into a single portable and executable standalone binary file. + +* Analysis report based filtering + * ``-b`` option in profile mode now also accepts metric id(s) for analysis report based filtering. + * ``-b`` option in profile mode also accepts hardware IP block for filtering; however, this filter support will be deprecated soon. + * ``--list-metrics`` option added in profile mode to list possible metric id(s), similar to analyze mode. + +* Support MEM chart on CLI (single run) + +* ``--specs-correction`` option to provide missing system specifications for analysis. + +#### Changed + +* Changed the default ``rocprof`` version to ``rocprofv3``. This is used when environment variable ``ROCPROF`` is not set. +* Changed ``normal_unit`` default to ``per_kernel``. +* Decreased profiling time by not collecting unused counters in post-analysis. +* Updated Dash to >=3.0.0 (for web UI). 
+* Changed the condition when Roofline PDFs are generated during general profiling and ``--roof-only`` profiling (skip only when ``--no-roof`` option is present). +* Updated Roofline binaries: + * Rebuild using latest ROCm stack + * Minimum OS distribution support minimum for roofline feature is now Ubuntu 22.04, RHEL 8, and SLES15 SP6. + +#### Removed + +* Roofline support for Ubuntu 20.04 and SLES below 15.6 +* Removed support for AMD Instinct MI50 and MI60. + +#### Optimized + +* ROCm Compute Profiler CLI has been improved to better display the GPU architecture analytics + +#### Resolved issues + +* Fixed kernel name and kernel dispatch filtering when using ``rocprofv3``. +* Fixed an issue of TCC channel counters collection in ``rocprofv3``. +* Fixed peak FLOPS of `F8`, `I8`, `F16`, and `BF16` on AMD Instinct MI 300. +* Fixed not detecting memory clock issue when using amd-smi +* Fixed standalone GUI crashing +* Fixed L2 read/write/atomic bandwidths on MI350 + +#### Known issues + +* On AMD Instinct MI100, accumulation counters are not collected, resulting in the following metrics failing to show up in the analysis: Instruction Fetch Latency, Wavefront Occupancy, LDS Latency + * As a workaround, use the environment variable ``ROCPROF=rocprof``, to use ``rocprof v1`` for profiling on AMD Instinct MI100. + +* GPU id filtering is not supported when using ``rocprofv3``. + +* Analysis of previously collected workload data will not work due to sysinfo.csv schema change. + * As a workaround, re-run the profiling operation for the workload and interrupt the process after 10 seconds. + Followed by copying the ``sysinfo.csv`` file from the new data folder to the old one. + This assumes your system specification hasn't changed since the creation of the previous workload data. + +* Analysis of new workloads might require providing shader/memory clock speed using +``--specs-correction`` operation if amd-smi or rocminfo does not provide clock speeds. 
+ +* Memory chart on ROCm Compute Profiler CLI might look corrupted if the CLI width is too narrow. + +#### Upcoming changes + +* ``rocprof v1/v2/v3`` interfaces will be removed in favor of the ROCprofiler-SDK interface, which directly accesses ``rocprofv3`` C++ tool. Using ``rocprof v1/v2/v3`` interfaces will trigger a deprecation warning. + * To use ROCprofiler-SDK interface, set environment variable `ROCPROF=rocprofiler-sdk` and optionally provide profile mode option ``--rocprofiler-sdk-library-path /path/to/librocprofiler-sdk.so``. Add ``--rocprofiler-sdk-library-path`` runtime option to choose the path to ROCprofiler-SDK library to be used. +* Hardware IP block based filtering using ``-b`` option in profile mode will be removed in favor of analysis report block based filtering using ``-b`` option in profile mode. +* MongoDB database support will be removed, and a deprecation warning has been added to the application interface. +* Usage of ``rocm-smi`` is deprecated in favor of ``amd-smi``, and a deprecation warning has been added to the application interface. + +### **ROCgdb** (16.3) + +#### Added + +- Support for the `gfx950`, `gfx1150`, and `gfx1151` architectures. + +#### Removed + +- Support for the `gfx940` and `gfx941` architectures. + + +### **ROCm Data Center Tool** (1.1.0) + +#### Added + +- More profiling and monitoring metrics, especially for AMD Instinct MI300 and newer GPUs. +- Advanced logging and debugging options, including new log levels and troubleshooting guidance. + +#### Changed + +- Completed migration from legacy [ROCProfiler](https://rocm.docs.amd.com/projects/rocprofiler/en/latest/) to [ROCprofiler-SDK](https://rocm.docs.amd.com/projects/rocprofiler-sdk/en/latest/). +- Reorganized the configuration files internally and improved [README/installation](https://github.com/ROCm/rdc/blob/amd-staging/README.md) instructions. +- Updated metrics and monitoring support for the latest AMD data center GPUs. 
+ +#### Optimized + +- Integration with [ROCprofiler-SDK](https://rocm.docs.amd.com/projects/rocprofiler-sdk/en/latest/) for performance metrics collection. +- Standalone and embedded operating modes, including streamlined authentication and configuration options. +- Support and documentation for diagnostic commands and GPU group management. +- [RVS](https://rocm.docs.amd.com/projects/ROCmValidationSuite/en/latest/) test integration and reporting. + +### **ROCm Systems Profiler** (1.1.0) + +#### Added + +- Profiling and metric collection capabilities for VCN engine activity, JPEG engine activity, and API tracing for rocDecode, rocJPEG, and VA-APIs. +- How-to document for VCN and JPEG activity sampling and tracing. +- Support for tracing Fortran applications. +- Support for tracing MPI API in Fortran. + +#### Changed + +- Replaced ROCm SMI backend with AMD SMI backend for collecting GPU metrics. +- ROCprofiler-SDK is now used to trace RCCL API and collect communication counters. +- Updated the Dyninst submodule to v13.0. +- Set the default value of `ROCPROFSYS_SAMPLING_CPUS` to `none`. + +#### Resolved issues + +- Fixed GPU metric collection settings with `ROCPROFSYS_AMD_SMI_METRICS`. +- Fixed a build issue with CMake 4. +- Fixed incorrect kernel names shown for kernel dispatch tracks in Perfetto. +- Fixed formatting of some output logs. + ### **ROCmValidationSuite** (1.2.0) #### Added @@ -1242,14 +1655,15 @@ HIP runtime has the following functional improvements which greatly improve runt #### Changed - Migrated SMI API usage from `rocm-smi` to `amd-smi`. -- Updated FP8 GEMM operations to use hipBLASLt instead of rocBLAS. +- Updated `FP8` GEMM operations to use hipBLASLt instead of rocBLAS. ### **rocPRIM** (4.0.0) #### Added +* Added gfx950 support. * Added `rocprim::accumulator_t` to ensure parity with CCCL. -* Added test for `rocprim::accumulator_t` +* Added test for `rocprim::accumulator_t`. * Added `rocprim::invoke_result_r` to ensure parity with CCCL. 
* Added function `is_build_in` into `rocprim::traits::get`. * Added virtual shared memory as a fallback option in `rocprim::device_merge` when it exceeds shared memory capacity, similar to `rocprim::device_select`, `rocprim::device_partition`, and `rocprim::device_merge_sort`, which already include this feature. @@ -1268,56 +1682,55 @@ HIP runtime has the following functional improvements which greatly improve runt * Added support for building tests with device-side random data generation, making them finish faster. This requires rocRAND, and is enabled with the `WITH_ROCRAND=ON` build flag. * Added tests and documentation to `lookback_scan_state`. It is still in the `detail` namespace. -#### Optimizations +#### Optimized -* Improved performance of `rocprim::device_select` and `rocprim::device_partition` when using multiple streams on the MI3XX architecture. +* Improved performance of `rocprim::device_select` and `rocprim::device_partition` when using multiple streams on the AMD Instinct MI300 series. #### Changed * Changed the parameters `long_radix_bits` and `LongRadixBits` from `segmented_radix_sort` to `radix_bits` and `RadixBits` respectively. -* Marked the initialisation constructor of `rocprim::reverse_iterator<Iter>` `explicit`, use `rocprim::make_reverse_iterator`. +* Marked the initialisation constructor of `rocprim::reverse_iterator` `explicit`, use `rocprim::make_reverse_iterator`. * Merged `radix_key_codec` into type_traits system. * Renamed `type_traits_interface.hpp` to `type_traits.hpp`, rename the original `type_traits.hpp` to `type_traits_functions.hpp`. * The default scan accumulator types for device-level scan algorithms have changed. This is a breaking change. -The previous default accumulator types could lead to situations in which unexpected overflow occured, such as -when the input or inital type was smaller than the output type. 
- * This is a complete list of affected functions and how their default accumulator types are changing: - * `rocprim::inclusive_scan` - * Previous default: `class AccType = typename std::iterator_traits<InputIterator>::value_type>` - * Current default: `class AccType = rocprim::accumulator_t<BinaryFunction, typename std::iterator_traits<InputIterator>::value_type>` - * `rocprim::deterministic_inclusive_scan` - * Previous default: `class AccType = typename std::iterator_traits<InputIterator>::value_type>` - * Current default: `class AccType = rocprim::accumulator_t<BinaryFunction, typename std::iterator_traits<InputIterator>::value_type>` - * `rocprim::exclusive_scan` - * Previous default: `class AccType = detail::input_type_t<InitValueType>>` - * Current default: `class AccType = rocprim::accumulator_t<BinaryFunction, rocprim::detail::input_type_t<InitValueType>>` - * `rocprim::deterministic_exclusive_scan` - * Previous default: `class AccType = detail::input_type_t<InitValueType>>` - * Current default: `class AccType = rocprim::accumulator_t<BinaryFunction, rocprim::detail::input_type_t<InitValueType>>` +The previous default accumulator types could lead to situations in which unexpected overflow occurred, such as when the input or initial type was smaller than the output type.
This is a complete list of affected functions and how their default accumulator types are changing: + + * `rocprim::inclusive_scan` + * Previous default: `class AccType = typename std::iterator_traits<InputIterator>::value_type>` + * Current default: `class AccType = rocprim::accumulator_t<BinaryFunction, typename std::iterator_traits<InputIterator>::value_type>` + * `rocprim::deterministic_inclusive_scan` + * Previous default: `class AccType = typename std::iterator_traits<InputIterator>::value_type>` + * Current default: `class AccType = rocprim::accumulator_t<BinaryFunction, typename std::iterator_traits<InputIterator>::value_type>` + * `rocprim::exclusive_scan` + * Previous default: `class AccType = detail::input_type_t<InitValueType>>` + * Current default: `class AccType = rocprim::accumulator_t<BinaryFunction, rocprim::detail::input_type_t<InitValueType>>` + * `rocprim::deterministic_exclusive_scan` + * Previous default: `class AccType = detail::input_type_t<InitValueType>>` + * Current default: `class AccType = rocprim::accumulator_t<BinaryFunction, rocprim::detail::input_type_t<InitValueType>>` * Undeprecated internal `detail::raw_storage`. -* A new version of `rocprim::thread_load` and `rocprim::thread_store` replace the deprecated `rocprim::thread_load` and `rocprim::thread_store` functions. The versions avoid inline assembly where possible, and don't hinder the optimizer as much as a result. +* A new version of `rocprim::thread_load` and `rocprim::thread_store` replace the deprecated `rocprim::thread_load` and `rocprim::thread_store` functions. The versions avoid inline assembly where possible, and don't hinder the optimizer as much as a result. * Renamed `rocprim::load_cs` to `rocprim::load_nontemporal` and `rocprim::store_cs` to `rocprim::store_nontemporal` to express the intent of these load and store methods better. * All kernels now have hidden symbol visibility. All symbols now have inline namespaces that include the library version, for example, `rocprim::ROCPRIM_300400_NS::symbol` instead of `rocPRIM::symbol`, letting the user link multiple libraries built with different versions of rocPRIM. -#### Upcoming Changes +#### Upcoming changes -* `rocprim::invoke_result_binary_op` and `rocprim::invoke_result_binary_op_t` are deprecated.
Use `rocprim::accumulator_t` now. +* `rocprim::invoke_result_binary_op` and `rocprim::invoke_result_binary_op_t` are deprecated. Use `rocprim::accumulator_t` instead. #### Removed * Removed `rocprim::detail::float_bit_mask` and relative tests, use `rocprim::traits::float_bit_mask` instead. -* Removed `rocprim::traits::is_fundamental`, please use `rocprim::traits::get<T>::is_fundamental()` directly. +* Removed `rocprim::traits::is_fundamental`, use `rocprim::traits::get::is_fundamental()` directly. * Removed the deprecated parameters `short_radix_bits` and `ShortRadixBits` from the `segmented_radix_sort` config. They were unused, it is only an API change. -* Removed the deprecated `operator<<` from the iterators. +* Removed the deprecated `operator<<` from the iterators. * Removed the deprecated `TwiddleIn` and `TwiddleOut`. Use `radix_key_codec` instead. * Removed the deprecated flags API of `block_adjacent_difference`. Use `subtract_left()` or `block_discontinuity::flag_heads()` instead. * Removed the deprecated `to_exclusive` functions in the warp scans. * Removed the `rocprim::load_cs` from the `cache_load_modifier` enum. Use `rocprim::load_nontemporal` instead. * Removed the `rocprim::store_cs` from the `cache_store_modifier` enum. Use `rocprim::store_nontemporal` instead. -* Removed the deprecated header file `rocprim/detail/match_result_type.hpp`. Include `rocprim/type_traits.hpp` instead. - * This header included `rocprim::detail::invoke_result`. Use `rocprim::invoke_result` instead. - * This header included `rocprim::detail::invoke_result_binary_op`. Use `rocprim::invoke_result_binary_op` instead. - * This header included `rocprim::detail::match_result_type`. Use `rocprim::invoke_result_binary_op_t` instead. +* Removed the deprecated header file `rocprim/detail/match_result_type.hpp`. Include `rocprim/type_traits.hpp` instead. This header included: + * `rocprim::detail::invoke_result`. Use `rocprim::invoke_result` instead. 
+ * `rocprim::detail::invoke_result_binary_op`. Use `rocprim::invoke_result_binary_op` instead. + * `rocprim::detail::match_result_type`. Use `rocprim::invoke_result_binary_op_t` instead. * Removed the deprecated `rocprim::detail::radix_key_codec` function. Use `rocprim::radix_key_codec` instead. * Removed `rocprim/detail/radix_sort.hpp`, functionality can now be found in `rocprim/thread/radix_key_codec.hpp`. * Removed C++14 support, only C++17 is supported. @@ -1330,44 +1743,104 @@ when the input or inital type was smaller than the output type. * `ROCPRIM_WAVEFRONT_SIZE` * Use `rocprim::arch::wavefront::min_size()` or `rocprim::arch::wavefront::max_size()` instead. * `__AMDGCN_WAVEFRONT_SIZE` - * This was a fallback define for the compiler's removed symbol, having the same name. + * This was a fallback define for the compiler's removed symbol, having the same name. * This release removes support for custom builds on gfx940 and gfx941. -#### Resolved Issues +#### Resolved issues * Fixed an issue where `device_batch_memcpy` reported benchmarking throughput being 2x lower than it was in reality. * Fixed an issue where `device_segmented_reduce` reported autotuning throughput being 5x lower than it was in reality. * Fixed device radix sort not returning the correct required temporary storage when a double buffer contains `nullptr`. * Fixed constness of equality operators (`==` and `!=`) in `rocprim::key_value_pair`. -* Fixed an issue for the comparison operators in `arg_index_iterator` and `texture_cache_iterator`, where `<` and `>` comparators were swapped. +* Fixed an issue for the comparison operators in `arg_index_iterator` and `texture_cache_iterator`, where `<` and `>` comparators were swapped. * Fixed an issue for the `rocprim::thread_reduce` not working correctly with a prefix value. 
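The accumulator change listed under Changed for rocPRIM is easiest to see with a small model. The sketch below is plain Python, not rocPRIM: it emulates an inclusive scan whose running total is stored in an unsigned integer of a given width. A narrow accumulator (as when the accumulator type is taken from an 8-bit input type) wraps around, while a wider accumulator (as when it is derived from the result type of the binary function) does not. The function name `inclusive_scan` and the `acc_bits` parameter are assumptions made for illustration only.

```python
from typing import Callable, List


def inclusive_scan(values: List[int],
                   op: Callable[[int, int], int],
                   acc_bits: int) -> List[int]:
    """Emulate an inclusive scan with an unsigned accumulator of acc_bits bits."""
    mask = (1 << acc_bits) - 1  # largest value the accumulator can hold
    out: List[int] = []
    acc = None
    for v in values:
        acc = v if acc is None else op(acc, v)
        acc &= mask  # model unsigned wraparound of the accumulator type
        out.append(acc)
    return out


if __name__ == "__main__":
    data = [100, 100, 100]  # each value fits in 8 bits
    add = lambda a, b: a + b
    print(inclusive_scan(data, add, acc_bits=8))   # narrow accumulator wraps
    print(inclusive_scan(data, add, acc_bits=32))  # wide accumulator does not
```

Code that relied on the narrower default can still request it by passing the accumulator type explicitly, which is why the notes flag the new default as a breaking change.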
-#### Known Issues +#### Known issues -* When using `rocprim::deterministic_inclusive_scan_by_key` and `rocprim::deterministic_exclusive_scan_by_key` the intermediate values can change order on Navi3x - * However if a commutative scan operator is used then the final scan value (output array) will still always be consistent between runs +* When using `rocprim::deterministic_inclusive_scan_by_key` and `rocprim::deterministic_exclusive_scan_by_key` the intermediate values can change order on Navi3x. However, if a commutative scan operator is used then the final scan value (output array) will still always be consistent between runs. + +### **ROCprofiler-SDK** (1.0.0) + +#### Added + +- Support for [rocJPEG](https://rocm.docs.amd.com/projects/rocJPEG/en/latest/index.html) API Tracing. +- Support for AMD Instinct MI350X and MI355X accelerators. +- `rocprofiler_create_counter` to facilitate adding custom derived counters at runtime. +- Support in `rocprofv3` for iteration-based counter multiplexing. +- Perfetto support for counter collection. +- Support for negating `rocprofv3` tracing options when using aggregate options such as `--sys-trace --hsa-trace=no`. +- `--agent-index` option in `rocprofv3` to specify the agent naming convention in the output: + - absolute == node_id + - relative == logical_node_id + - type-relative == logical_node_type_id +- MI300 and MI350 stochastic (hardware-based) PC sampling support in ROCprofiler-SDK and `rocprofv3`. +- Python bindings for `rocprofiler-sdk-roctx`. +- SQLite3 output support for `rocprofv3` using `--output-format rocpd`. +- `rocprofiler-sdk-rocpd` package: + - Public API in `include/rocprofiler-sdk-rocpd/rocpd.h`. + - Library implementation in `librocprofiler-sdk-rocpd.so`. + - Support for `find_package(rocprofiler-sdk-rocpd)`. + - `rocprofiler-sdk-rocpd` DEB and RPM packages. +- `--version` option in `rocprofv3`. +- `rocpd` Python package. +- Thread trace as experimental API.
+- ROCprof Trace Decoder as experimental API: + - Requires [ROCprof Trace Decoder plugin](https://github.com/rocm/rocprof-trace-decoder). +- Thread trace option in the `rocprofv3` tool under the `--att` parameters: + - See [using thread trace with rocprofv3](https://rocm.docs.amd.com/projects/rocprofiler-sdk/en/amd-mainline/how-to/using-thread-trace.html). + - Requires [ROCprof Trace Decoder plugin](https://github.com/rocm/rocprof-trace-decoder). +- `rocpd` output format documentation: + - Requires [ROCprof Trace Decoder plugin](https://github.com/rocm/rocprof-trace-decoder). +- Perfetto support for scratch memory. +- Support in the `rocprofv3` avail tool for command-line arguments. +- Documentation for `rocprofv3` advanced options. + +#### Changed + +- SDK to not create a background thread when every tool returns a nullptr from `rocprofiler_configure`. +- `vaddr-to-file-offset` mapping in `disassembly.hpp` to use the dedicated comgr API. +- `rocprofiler_uuid_t` ABI to hold a 128-bit value. +- `rocprofv3` shorthand argument for `--collection-period` to `-P` (upper-case) while `-p` (lower-case) is reserved for later use. +- Default output format for `rocprofv3` to `rocpd` (SQLite3 database). +- `rocprofv3` avail tool renamed from `rocprofv3_avail` to `rocprofv3-avail`. +- `rocprofv3` tool to facilitate thread trace and PC sampling on the same agent. + +#### Removed + +* Support for compilation of gfx940 and gfx941 targets. + +#### Resolved issues + +- Fixed missing callbacks around internal thread creation within counter collection service. +- Fixed potential data race in the ROCprofiler-SDK double buffering scheme. +- Fixed usage of `std::regex` in the core ROCprofiler-SDK library that caused segfaults or exceptions when used under dual ABI. +- Fixed Perfetto counter collection by introducing accumulation per dispatch. +- Fixed code object disassembly for missing function inlining information.
+- Fixed queue preemption error and `HSA_STATUS_ERROR_INVALID_PACKET_FORMAT` error for stochastic PC-sampling in MI300X, leading to stabler runs. +- Fixed the system hang issue for host-trap PC-sampling on AMD Instinct MI300X. +- Fixed `rocpd` counter collection issue when counter collection alone is enabled. `rocpd_kernel_dispatch` table is updated to be populated by counters data instead of kernel_dispatch data. +- Fixed `rocprofiler_*_id_t` structs for inconsistency related to a "null" handle: + - The correct definition for a null handle is `.handle = 0` while some definitions previously used `UINT64_MAX`. +- Fixed kernel trace csv output generated by `rocpd`. + +### **rocPyDecode** (0.6.0) + +#### Added + +* ``rocpyjpegdecode`` package. +* Added ``src/rocjpeg`` source new subfolder. +* Added ``samples/rocjpeg`` new subfolder. + +#### Changed +* Minimum version for rocdecode and rocjpeg updated to V1.0.0 ### **rocRAND** (4.0.0) #### Added -* gfx950 support -* Additional unit tests for `test_log_normal_distribution.cpp` -* Additional unit tests for `test_normal_distribution.cpp` -* Additional unit tests for `test_rocrand_mtgp32_prng.cpp` -* Additional unit tests for `test_rocrand_scrambled_sobol32_qrng.cpp` -* Additional unit tests for `test_rocrand_scrambled_sobol64_qrng.cpp` -* Additional unit tests for `test_rocrand_sobol32_qrng.cpp` -* Additional unit tests for `test_rocrand_sobol64_qrng.cpp` -* Additional unit tests for `test_rocrand_threefry2x32_20_prng.cpp` -* Additional unit tests for `test_rocrand_threefry2x64_20_prng.cpp` -* Additional unit tests for `test_rocrand_threefry4x32_20_prng.cpp` -* Additional unit tests for `test_rocrand_threefry4x64_20_prng.cpp` -* Additional unit tests for `test_uniform_distribution.cpp` -* New unit tests for `include/rocrand/rocrand_discrete.h` in `test_rocrand_discrete.cpp` -* New unit tests for `include/rocrand/rocrand_mrg31k3p.h` in `test_rocrand_mrg31k3p_prng.cpp` -* New unit tests for 
`include/rocrand/rocrand_mrg32k3a.h` in `test_rocrand_mrg32k3a_prng.cpp` -* New unit tests for `include/rocrand/rocrand_poisson.h` in `test_rocrand_poisson.cpp` +* gfx950 support. +* Additional unit tests for `test_log_normal_distribution.cpp`, `test_normal_distribution.cpp`, `test_rocrand_mtgp32_prng.cpp`, `test_rocrand_scrambled_sobol32_qrng.cpp`, `test_rocrand_scrambled_sobol64_qrng.cpp`, `test_rocrand_sobol32_qrng.cpp`, `test_rocrand_sobol64_qrng.cpp`, `test_rocrand_threefry2x32_20_prng.cpp`, `test_rocrand_threefry2x64_20_prng.cpp`, `test_rocrand_threefry4x32_20_prng.cpp`, `test_rocrand_threefry4x64_20_prng.cpp`, and `test_uniform_distribution.cpp`. +* New unit tests for `include/rocrand/rocrand_discrete.h` in `test_rocrand_discrete.cpp`, `include/rocrand/rocrand_mrg31k3p.h` in `test_rocrand_mrg31k3p_prng.cpp`, `include/rocrand/rocrand_mrg32k3a.h` in `test_rocrand_mrg32k3a_prng.cpp`, and `include/rocrand/rocrand_poisson.h` in `test_rocrand_poisson.cpp`. #### Changed @@ -1378,7 +1851,7 @@ when the input or inital type was smaller than the output type. #### Removed -* Removed inline assembly and the `ENABLE_INLINE_ASM` CMake option. Inline assembly was used to optimizate of multiplications in the Mrg32k3a and Philox 4x32-10 generators. It is no longer needed because the current HIP compiler is able to produce code with the same or better performance. +* Removed inline assembly and the `ENABLE_INLINE_ASM` CMake option. Inline assembly was used to optimize multiplication in the Mrg32k3a and Philox 4x32-10 generators. It is no longer needed because the current HIP compiler is able to produce code with the same or better performance. * Removed instances of the deprecated clang definition `__AMDGCN_WAVEFRONT_SIZE`. * Removed C++14 support. Beginning with this release, only C++17 is supported. * Directly accessing the (scrambled) sobol32 and sobol64 constants and direction vectors is no longer supported. 
For: @@ -1389,46 +1862,37 @@ when the input or inital type was smaller than the output type. * `rocrand_h_scrambled_sobol32_direction_vectors`, use `rocrand_get_direction_vectors32` instead. * `rocrand_h_scrambled_sobol64_direction_vectors`, use `rocrand_get_direction_vectors64` instead. -#### Resolved Issues +#### Resolved issues * Fixed an issue where `mt19937.hpp` would cause kernel errors during auto tuning. -#### Upcoming Changes +#### Upcoming changes * Deprecated the rocRAND Fortran API in favor of hipfort. +### **ROCr Debug Agent** (2.1.0) + +#### Added + +* Added the `-e` and `--precise-alu-exceptions` flags to enable precise ALU exception reporting on supported configurations. + ### **rocSHMEM** (3.0.0) #### Added -* Added the Reverse Offload conduit -* Added new APIs: - * `rocshmem_ctx_barrier` - * `rocshmem_ctx_barrier_wave` - * `rocshmem_ctx_barrier_wg` - * `rocshmem_barrier_all` - * `rocshmem_barrier_all_wave` - * `rocshmem_barrier_all_wg` - * `rocshmem_ctx_sync` - * `rocshmem_ctx_sync_wave` - * `rocshmem_ctx_sync_wg` - * `rocshmem_sync_all` - * `rocshmem_sync_all_wave` - * `rocshmem_sync_all_wg` - * `rocshmem_init_attr` - * `rocshmem_get_uniqueid` - * `rocshmem_set_attr_uniqueid_args` -* Added dlmalloc based allocator -* Added XNACK support -* Added support for initialization with MPI communicators other than `MPI_COMM_WORLD` +* Added the Reverse Offload conduit. +* Added new APIs: `rocshmem_ctx_barrier`, `rocshmem_ctx_barrier_wave`, `rocshmem_ctx_barrier_wg`, `rocshmem_barrier_all`, `rocshmem_barrier_all_wave`, `rocshmem_barrier_all_wg`, `rocshmem_ctx_sync`, `rocshmem_ctx_sync_wave`, `rocshmem_ctx_sync_wg`, `rocshmem_sync_all`, `rocshmem_sync_all_wave`, `rocshmem_sync_all_wg`, `rocshmem_init_attr`, `rocshmem_get_uniqueid`, and `rocshmem_set_attr_uniqueid_args`. +* Added dlmalloc-based allocator. +* Added XNACK support. +* Added support for initialization with MPI communicators other than `MPI_COMM_WORLD`. 
#### Changed -* Changed collective APIs to use `_wg` suffix rather than `_wg_` infix +* Changed collective APIs to use `_wg` suffix rather than `_wg_` infix. -#### Resolved Issues +#### Resolved issues -* Resolved segfault in `rocshmem_wg_ctx_create`, now provides nullptr if ctx cannot be created +* Resolved segfault in `rocshmem_wg_ctx_create`, now provides nullptr if ctx cannot be created. ### **rocSOLVER** (3.30.0) @@ -1439,120 +1903,109 @@ when the input or inital type was smaller than the output type. #### Optimized -* Improved the performance of BDSQR and downstream functions such as GESVD -* Improved the performance of STEQR and downstream functions such as SYEV/HEEV -* Improved the performance of LARFT and downstream functions such as GEQR2 and GEQRF +* Fixed corner cases that can produce NaNs in SYEVD, for valid input matrices. +* Improved the performance of BDSQR and downstream functions, such as GESVD. +* Improved the performance of STEQR and downstream functions, such as SYEV/HEEV. +* Improved the performance of LARFT and downstream functions, such as GEQR2 and GEQRF. + +#### Resolved issues -#### Resolved Issues - -* Fixed corner cases that can produce NaNs in SYEVD, for valid input matrices +* Fixed corner cases that can produce NaNs in SYEVD for valid input matrices. ### **rocSPARSE** (4.0.2) #### Added -* Adds `SpGEAM` generic routine for computing sparse matrix addition in CSR format -* Adds `v2_SpMV` generic routine for computing sparse matrix vector multiplication. As opposed to the deprecated `rocsparse_spmv` routine, this routine does not use a fallback algorithm if a non-implemented configuration is encountered and will return an error in such a case. For the deprecated routine `rocsparse_spmv`, the user can enable warning messages in situations where a fallback algorithm is used by either calling upfront the routine `rocsparse_enable_debug` or exporting the variable `ROCSPARSE_DEBUG` (with the shell command `export ROCSPARSE_DEBUG=1`). 
-* Adds half float mixed precision to `rocsparse_axpby` where X and Y use float16 and result and the compute type use float -* Adds half float mixed precision to `rocsparse_spvv` where X and Y use float16 and result and the compute type use float -* Adds half float mixed precision to `rocsparse_spmv` where A and X use float16 and Y and the compute type use float -* Adds half float mixed precision to `rocsparse_spmm` where A and B use float16 and C and the compute type use float -* Adds half float mixed precision to `rocsparse_sddmm` where A and B use float16 and C and the compute type use float -* Adds half float uniform precision to `rocsparse_scatter` and `rocsparse_gather` routines -* Adds half float uniform precision to `rocsparse_sddmm` routine -* Added `rocsparse_spmv_alg_csr_rowsplit` algorithm. -* Added support for gfx950 -* Add ROC-TX instrumentation support in rocSPARSE (not available on Windows or in the static library version on Linux). -* Added the `almalinux` OS name to correct the gfortran dependency +* Added the `SpGEAM` generic routine for computing sparse matrix addition in CSR format. +* Added the `v2_SpMV` generic routine for computing sparse matrix vector multiplication. As opposed to the deprecated `rocsparse_spmv` routine, this routine does not use a fallback algorithm if a non-implemented configuration is encountered and will return an error in such a case. For the deprecated `rocsparse_spmv` routine, the user can enable warning messages in situations where a fallback algorithm is used by either calling the `rocsparse_enable_debug` routine upfront or exporting the variable `ROCSPARSE_DEBUG` (with the shell command `export ROCSPARSE_DEBUG=1`). +* Added half float mixed precision to `rocsparse_axpby` where X and Y use `float16` and the result and compute type use `float`. +* Added half float mixed precision to `rocsparse_spvv` where X and Y use `float16` and the result and compute type use `float`. 
+* Added half float mixed precision to `rocsparse_spmv` where A and X use `float16` and Y and the compute type use `float`. +* Added half float mixed precision to `rocsparse_spmm` where A and B use `float16` and C and the compute type use `float`. +* Added half float mixed precision to `rocsparse_sddmm` where A and B use `float16` and C and the compute type use `float`. +* Added half float uniform precision to the `rocsparse_scatter` and `rocsparse_gather` routines. +* Added half float uniform precision to the `rocsparse_sddmm` routine. +* Added the `rocsparse_spmv_alg_csr_rowsplit` algorithm. +* Added support for gfx950. +* Added ROC-TX instrumentation support in rocSPARSE (not available on Windows or in the static library version on Linux). +* Added the `almalinux` operating system name to correct the GFortran dependency. #### Changed * Switch to defaulting to C++17 when building rocSPARSE from source. Previously rocSPARSE was using C++14 by default. -#### Optimized - -* Reduced the number of template instantiations in the library to further reduce the shared library binary size and improve compile times -* Allow SpGEMM routines to use more shared memory when available. This can speed up performance for matrices with a large number of intermediate products. -* Use of the `rocsparse_spmv_alg_csr_adaptive` or `rocsparse_spmv_alg_csr_default` algorithms in `rocsparse_spmv` to perform transposed sparse matrix multiplication (`C=alpha*A^T*x+beta*y`) resulted in unnecessary analysis on A and needless slowdown during the analysis phase. This has been fixed by skipping the analysis when performing the transposed sparse matrix multiplication. -* Improved the user documentation - -#### Resolved Issues - -* Fixed an issue in the public headers where `extern "C"` was not wrapped by `#ifdef __cplusplus`, which caused failures when building C programs with rocSPARSE. -* Fixed a memory access fault in the `rocsparse_Xbsrilu0` routines. 
-* Fixed failures that could occur in `rocsparse_Xbsrsm_solve` or `rocsparse_spsm` with BSR format when using host pointer mode. -* Fixed ASAN compilation failures -* Fixed failure that occurred when using const descriptor `rocsparse_create_const_csr_descr` with the generic routine `rocsparse_sparse_to_sparse`. Issue was not observed when using non-const descriptor `rocsparse_create_csr_descr` with `rocsparse_sparse_to_sparse`. -* Fixed a memory leak in the rocsparse handle - #### Removed -* The deprecated `rocsparse_spmv_ex` routine -* The deprecated `rocsparse_sbsrmv_ex`, `rocsparse_dbsrmv_ex`, `rocsparse_cbsrmv_ex`, and `rocsparse_zbsrmv_ex` routines -* The deprecated `rocsparse_sbsrmv_ex_analysis`, `rocsparse_dbsrmv_ex_analysis`, `rocsparse_cbsrmv_ex_analysis`, and `rocsparse_zbsrmv_ex_analysis` routines +* The deprecated `rocsparse_spmv_ex` routine. +* The deprecated `rocsparse_sbsrmv_ex`, `rocsparse_dbsrmv_ex`, `rocsparse_cbsrmv_ex`, and `rocsparse_zbsrmv_ex` routines. +* The deprecated `rocsparse_sbsrmv_ex_analysis`, `rocsparse_dbsrmv_ex_analysis`, `rocsparse_cbsrmv_ex_analysis`, and `rocsparse_zbsrmv_ex_analysis` routines. -#### Upcoming Changes +#### Optimized -* Deprecated the `rocsparse_spmv` routine. Users should use the `rocsparse_v2_spmv` routine going forward. -* Deprecated `rocsparse_spmv_alg_csr_stream` algorithm. Users should use the `rocsparse_spmv_alg_csr_rowsplit` algorithm going forward. -* Deprecated the `rocsparse_itilu0_alg_sync_split_fusion` algorithm. Users should use one of `rocsparse_itilu0_alg_async_inplace`, `rocsparse_itilu0_alg_async_split`, or `rocsparse_itilu0_alg_sync_split` going forward. +* Reduced the number of template instantiations in the library to further reduce the shared library binary size and improve compile times. +* Allow SpGEMM routines to use more shared memory when available. This can speed up performance for matrices with a large number of intermediate products. 
+* Use of the `rocsparse_spmv_alg_csr_adaptive` or `rocsparse_spmv_alg_csr_default` algorithms in `rocsparse_spmv` to perform transposed sparse matrix multiplication (`C=alpha*A^T*x+beta*y`) resulted in unnecessary analysis on A and needless slowdown during the analysis phase. This has been improved by skipping the analysis when performing the transposed sparse matrix multiplication. +* Improved the user documentation. + +#### Resolved issues + +* Fixed an issue in the public headers where `extern "C"` was not wrapped by `#ifdef __cplusplus`, which caused failures when building C programs with rocSPARSE. +* Fixed a memory access fault in the `rocsparse_Xbsrilu0` routines. +* Fixed failures that could occur in `rocsparse_Xbsrsm_solve` or `rocsparse_spsm` with BSR format when using host pointer mode. +* Fixed ASAN compilation failures. +* Fixed a failure that occurred when using const descriptor `rocsparse_create_const_csr_descr` with the generic routine `rocsparse_sparse_to_sparse`. The issue was not observed when using non-const descriptor `rocsparse_create_csr_descr` with `rocsparse_sparse_to_sparse`. +* Fixed a memory leak in the rocSPARSE handle. + +#### Upcoming changes + +* Deprecated the `rocsparse_spmv` routine. Use the `rocsparse_v2_spmv` routine instead. +* Deprecated the `rocsparse_spmv_alg_csr_stream` algorithm. Use the `rocsparse_spmv_alg_csr_rowsplit` algorithm instead. +* Deprecated the `rocsparse_itilu0_alg_sync_split_fusion` algorithm. Use one of `rocsparse_itilu0_alg_async_inplace`, `rocsparse_itilu0_alg_async_split`, or `rocsparse_itilu0_alg_sync_split` instead. ### **rocThrust** (4.0.0) #### Changed * Updated the required version of Google Benchmark from 1.8.0 to 1.9.0. -* Drop `c++14` support for rocthrust. -* Renamed `cpp14_required.h` to `cpp_version_check.h` -* Refactored `test_header.hpp` into separte modules `test_param_fixtures.hpp`, `test_real_assertions.hpp`, `test_imag_assertions.hpp`, and `test_utils.hpp`. 
- * This is done to prevent unit tests from having access to modules that they're not testing. This will improve the accuracy of code coverage reports. +* Renamed `cpp14_required.h` to `cpp_version_check.h`. +* Refactored `test_header.hpp` into `test_param_fixtures.hpp`, `test_real_assertions.hpp`, `test_imag_assertions.hpp`, and `test_utils.hpp`. This is done to prevent unit tests from having access to modules that they're not testing. This will improve the accuracy of code coverage reports. #### Added -* Additional unit tests for: - * binary_search - * complex - * c99math - * catrig - * ccosh - * cexp - * clog - * csin - * csqrt - * ctan +* Additional unit tests for: binary_search, complex, c99math, catrig, ccosh, cexp, clog, csin, csqrt, and ctan. * Added `test_param_fixtures.hpp` to store all the parameters for typed test suites. * Added `test_real_assertions.hpp` to handle unit test assertions for real numbers. * Added `test_imag_assertions.hpp` to handle unit test assertions for imaginary numbers. * `clang++` is now used to compile google benchmarks on Windows. * Added gfx950 support. -* Merged changes from upstream CCCL/thrust 2.6.0 +* Merged changes from upstream CCCL/thrust 2.6.0. #### Removed * `device_malloc_allocator.h` has been removed. This header file was unused and should not impact users. -* Removed C++14 support, only C++17 is supported. +* Removed C++14 support. Only C++17 is now supported. * `test_header.hpp` has been removed. The `HIP_CHECK` function, as well as the `test` and `inter_run_bwr` namespaces, have been moved to `test_utils.hpp`. * `test_assertions.hpp` has been split into `test_real_assertions.hpp` and `test_imag_assertions.hpp`. -#### Upcoming Changes +#### Upcoming changes * `thrust::device_malloc_allocator` is deprecated as of this version. It will be removed in an upcoming version. 
-#### Resolved Issues +#### Resolved issues * Fixed an issue with internal calls to unqualified `distance()` which would be ambiguous due to an implementation also being visible through ADL. -#### Known Issues +#### Known issues -* The order of the values being compared by thrust::exclusive_scan_by_key and thrust::inclusive_scan_by_key can change between runs when integers are being compared. This can cause incorrect output when a non-commutative operator such as division is being used. +* The order of the values being compared by `thrust::exclusive_scan_by_key` and `thrust::inclusive_scan_by_key` can change between runs when integers are being compared. This can cause incorrect output when a non-commutative operator such as division is being used. ### **rocWMMA** (2.0.0) #### Added -* Added internal register layout transforms to support interleaved MMA layouts -* Added support for the gfx950 target -* Added mixed input `bf8` / `fp8` types for MMA support +* Added internal register layout transforms to support interleaved MMA layouts. +* Added support for the gfx950 target. +* Added mixed input `BF8` / `FP8` types for MMA support. * Added fragment scheduler API objects to embed thread block cooperation properties in fragments #### Changed @@ -1575,76 +2028,80 @@ when the input or inital type was smaller than the output type. * Removed the rocWMMA cooperative API * Removed wave count template parameters from transforms APIs -#### Resolved Issues +#### Resolved issues -* Fixed a validation issue for small precision compute types `< B32` on gfx9 -* Fixed CMake validation of compiler support for `bf8` / `fp8` types +* Fixed a validation issue for small precision compute types `< B32` on gfx9 +* Fixed CMake validation of compiler support for `BF8` / `FP8` types * Fixed linkage of rocwmma::synchronize_workgroup to inline -### **rpp** (2.0.0) +### **RPP** (2.0.0) #### Added -* Bitwise NOT, Bitwise AND, Bitwise OR augmentations on HOST (CPU) and HIP backends. 
(#520) -* Tensor Concat augmentation on HOST (CPU) and HIP backends. (#530) -* JPEG Compression Distortion augmentation on HIP backend. (#538) +* Bitwise NOT, Bitwise AND, and Bitwise OR augmentations on HOST (CPU) and HIP backends. +* Tensor Concat augmentation on HOST (CPU) and HIP backends. +* JPEG Compression Distortion augmentation on HIP backend. * `log1p`, defined as `log (1 + x)`, tensor augmentation support on HOST (CPU) and HIP backends. -* JPEG Compression Distortion augmentation on HOST (CPU) backend. (#531) +* JPEG Compression Distortion augmentation on HOST (CPU) backend. #### Changed -* All handle creation and destruction APIs have been consolidated to `rppCreate()`, for handle initialization, and `rppDestroy()`, for handle destruction (#513) -* RPP function category "logical_operations" more appropriately renamed to "bitwise_operations". (#520) -* TurboJPEG package installation enabled for RPP Test Suite with `sudo apt-get install libturbojpeg0-dev`. Instructions updated in utilities/test_suite/README.md. (#518) -* Changed API of swap_channels augmentation to be called channel_permute, which now accepts one new argument, "permutationTensor" (pointer to a unsigned int tensor) that provides the permutation order to swap the RGB channels of each input image in the batch in any order. (#547) - * Old API - `RppStatus rppt_swap_channels_host(RppPtr_t srcPtr, RpptDescPtr srcDescPtr, RppPtr_t dstPtr, RpptDescPtr dstDescPtr, rppHandle_t rppHandle);` - * New API - `RppStatus rppt_channel_permute_host(RppPtr_t srcPtr, RpptDescPtr srcDescPtr, RppPtr_t dstPtr, RpptDescPtr dstDescPtr, Rpp32u *permutationTensor , rppHandle_t rppHandle);` +* Handle creation and destruction APIs have been consolidated. Use `rppCreate()` for handle initialization and `rppDestroy()` for handle destruction. +* The `logical_operations` function category has been renamed to `bitwise_operations`. 
+* TurboJPEG package installation enabled for RPP Test Suite with `sudo apt-get install libturbojpeg0-dev`. Instructions have been updated in `utilities/test_suite/README.md`. +* The `swap_channels` augmentation has been renamed to `channel_permute`. `channel_permute` also accepts a new argument, `permutationTensor` (pointer to an unsigned int tensor), that provides the permutation order used to rearrange the RGB channels of each input image in the batch: + + `RppStatus rppt_swap_channels_host(RppPtr_t srcPtr, RpptDescPtr srcDescPtr, RppPtr_t dstPtr, RpptDescPtr dstDescPtr, rppHandle_t rppHandle);` + + changed to: + + `RppStatus rppt_channel_permute_host(RppPtr_t srcPtr, RpptDescPtr srcDescPtr, RppPtr_t dstPtr, RpptDescPtr dstDescPtr, Rpp32u *permutationTensor, rppHandle_t rppHandle);` #### Removed -* Older versions of RPP handle creation inlcuding `rppCreateWithBatchSize()`, `rppCreateWithStream()`, and `rppCreateWithStreamAndBatchSize()` are now removed and replaced with `rppCreate()`. -* Older versions of RPP handle destruction API including `rppDestroyGPU()` and `rppDestroyHost()` are now removed and replaced with `rppDestroy()`. +* Older versions of RPP handle creation including `rppCreateWithBatchSize()`, `rppCreateWithStream()`, and `rppCreateWithStreamAndBatchSize()`. These have been replaced with `rppCreate()`. +* Older versions of RPP handle destruction API including `rppDestroyGPU()` and `rppDestroyHost()`. These have been replaced with `rppDestroy()`. -#### Resolved Issues +#### Resolved issues -* Test package - debian packages will install required dependencies +* Test package - Debian packages will install required dependencies. 
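As an illustration of the `channel_permute` change listed in the RPP notes above, the following is a minimal sketch of what a per-image channel permutation does. It does not call RPP: `permuteChannels` is a hypothetical helper, and both the interleaved 3-channel layout and the reading of the permutation values (output channel `c` of image `n` takes input channel `permutation[3 * n + c]`) are assumptions made for this example. See the RPP documentation for the actual `rppt_channel_permute_host` semantics and tensor layouts.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical helper, not part of the RPP API.
// Assumptions: `src` holds a batch of interleaved 3-channel images of equal
// size, and `permutation` holds three entries per image, where output
// channel c of image n is read from input channel permutation[3 * n + c].
std::vector<std::uint8_t> permuteChannels(const std::vector<std::uint8_t>& src,
                                          std::size_t pixelsPerImage,
                                          const std::vector<std::uint32_t>& permutation)
{
    constexpr std::size_t kChannels = 3;
    if (pixelsPerImage == 0) {
        throw std::invalid_argument("pixelsPerImage must be non-zero");
    }
    const std::size_t imageStride = pixelsPerImage * kChannels;
    if (src.size() % imageStride != 0) {
        throw std::invalid_argument("source size must be a multiple of the image size");
    }
    const std::size_t batchSize = src.size() / imageStride;
    if (permutation.size() != batchSize * kChannels) {
        throw std::invalid_argument("expected three permutation entries per image");
    }
    for (std::uint32_t index : permutation) {
        if (index >= kChannels) {
            throw std::invalid_argument("permutation entries must be 0, 1, or 2");
        }
    }

    std::vector<std::uint8_t> dst(src.size());
    for (std::size_t n = 0; n < batchSize; ++n) {
        for (std::size_t p = 0; p < pixelsPerImage; ++p) {
            // Offset of the first channel of pixel p in image n.
            const std::size_t base = n * imageStride + p * kChannels;
            for (std::size_t c = 0; c < kChannels; ++c) {
                dst[base + c] = src[base + permutation[n * kChannels + c]];
            }
        }
    }
    return dst;
}
```

Under these assumptions, a batch of two single-pixel images `{10, 20, 30}` and `{40, 50, 60}` with permutations `{2, 1, 0}` and `{1, 2, 0}` comes out as `{30, 20, 10}` and `{50, 60, 40}`. The old `swap_channels` behavior corresponds to one fixed ordering, while the per-image tensor lets each image in the batch use its own.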
### **Tensile** (4.44.0) #### Added -- Added support for gfx950 -- Added code object compression via bundling -- Added support for non-default HIP SDK installations on Windows -- Added master solution library documentation -- Added compiler version dependent assembler and architecture capabilities -- Added documentation from GitHub Wiki to ROCm docs +- Added support for gfx950. +- Added code object compression via bundling. +- Added support for non-default HIP SDK installations on Windows. +- Added master solution library documentation. +- Added compiler version dependent assembler and architecture capabilities. +- Added documentation from GitHub Wiki to ROCm docs. #### Changed -- Loosened check for CLI compiler choices -- Introduced 4-tuple targets for bundler invocations -- Introduced PATHEXT extensions on Windows when searching for toolchain components -- Enabled passing fully qualified paths to toolchain components -- Enabled environment variable overrides when searching for a ROCm stack -- Improved default toolchain configuration -- Ignored f824 flake errors +- Loosened check for CLI compiler choices. +- Introduced 4-tuple targets for bundler invocations. +- Introduced PATHEXT extensions on Windows when searching for toolchain components. +- Enabled passing fully qualified paths to toolchain components. +- Enabled environment variable overrides when searching for a ROCm stack. +- Improved default toolchain configuration. +- Ignored f824 flake errors. #### Removed -- Removed support for the gfx940 and gfx941 targets -- Removed unused tuning files -- Removed disabled tests +- Removed support for the gfx940 and gfx941 targets. +- Removed unused tuning files. +- Removed disabled tests. 
-#### Resolved Issues +#### Resolved issues -- Fixed configure time path not being invoked at build -- Fixed find_package for msgpack to work with versions 5 and 6 -- Fixed rhel9 testing -- Fixed gfx908 builds -- Fixed "argument list too long" error -- Fixed version typo in 6.3 changelog -- Fixed improper use of aliases as nested namespace specifiers +- Fixed configure time path not being invoked at build. +- Fixed find_package for msgpack to work with versions 5 and 6. +- Fixed rhel9 testing. +- Fixed gfx908 builds. +- Fixed the 'argument list too long' error. +- Fixed version typo in 6.3 changelog. +- Fixed improper use of aliases as nested namespace specifiers. ## ROCm known issues @@ -1686,8 +2143,7 @@ It's anticipated that ROCTracer, ROCProfiler, `rocprof`, and `rocprofv2` will re ### AMDGPU wavefront size compiler macro deprecation Access to the wavefront size as a compile-time constant via the `__AMDGCN_WAVEFRONT_SIZE` -and `__AMDGCN_WAVEFRONT_SIZE__` macros or the `constexpr warpSize` variable is deprecated -and will be disabled in a future release. +and `__AMDGCN_WAVEFRONT_SIZE__` macros is deprecated and will be disabled in a future release. In ROCm 7.0.0, `warpSize` is only available as a non-`constexpr` variable. * The `__AMDGCN_WAVEFRONT_SIZE__` macro and `__AMDGCN_WAVEFRONT_SIZE` alias will be removed in an upcoming release. It is recommended to remove any use of this macro. For more information, see @@ -1708,10 +2164,6 @@ and will be disabled in a future release. #endif ``` -### HIPCC Perl scripts deprecation - -The HIPCC Perl scripts (`hipcc.pl` and `hipconfig.pl`) will be removed in an upcoming release. - ### Changes to ROCm Object Tooling ROCm Object Tooling tools ``roc-obj-ls``, ``roc-obj-extract``, and ``roc-obj`` are @@ -1722,11 +2174,3 @@ or executables passed as input. The ``llvm-objdump --offloading`` tool option a supports the ``--arch-name`` option, and only extracts code objects found with the specified target architecture. 
See [llvm-objdump](https://llvm.org/docs/CommandGuide/llvm-objdump.html) for more information. - -### HIP runtime API changes - -There are a number of upcoming changes planned for HIP runtime API in an upcoming major release -that are not backward compatible with prior releases. Most of these changes increase -alignment between HIP and CUDA APIs or behavior. Some of the upcoming changes are to -clean up header files, remove namespace collision, and have a clear separation between -`hipRTC` and HIP runtime. For more information, see [HIP 7.0 Is Coming: What You Need to Know to Stay Ahead](https://rocm.blogs.amd.com/ecosystems-and-partners/transition-to-hip-7.0-blog/README.html). diff --git a/docs/sphinx/_toc.yml.in b/docs/sphinx/_toc.yml.in index 8560f0c68..be53a146f 100644 --- a/docs/sphinx/_toc.yml.in +++ b/docs/sphinx/_toc.yml.in @@ -12,14 +12,14 @@ subtrees: - file: compatibility/compatibility-matrix.rst title: Compatibility matrix entries: - - url: https://rocm.docs.amd.com/projects/install-on-linux/en/latest/reference/system-requirements.html + - url: https://rocm.docs.amd.com/projects/install-on-linux-internal/en/latest/reference/system-requirements.html title: Linux system requirements - url: https://rocm.docs.amd.com/projects/install-on-windows/en/${branch}/reference/system-requirements.html title: Windows system requirements - caption: Install entries: - - url: https://rocm.docs.amd.com/projects/install-on-linux/en/${branch}/ + - url: https://rocm.docs.amd.com/projects/install-on-linux/en/latest/ title: ROCm on Linux - url: https://rocm.docs.amd.com/projects/install-on-windows/en/latest/ title: HIP SDK on Windows From 7b087769a2f02a03a7b822a74603abdec385f727 Mon Sep 17 00:00:00 2001 From: srawat <120587655+SwRaw@users.noreply.github.com> Date: Tue, 19 Aug 2025 20:40:28 +0530 Subject: [PATCH 04/58] Create mi355-performance-counters.rst --- .../gpu-arch/mi355-performance-counters.rst | 530 ++++++++++++++++++ 1 file changed, 530 insertions(+) create mode 
100644 docs/conceptual/gpu-arch/mi355-performance-counters.rst diff --git a/docs/conceptual/gpu-arch/mi355-performance-counters.rst b/docs/conceptual/gpu-arch/mi355-performance-counters.rst new file mode 100644 index 000000000..23c6a6ae3 --- /dev/null +++ b/docs/conceptual/gpu-arch/mi355-performance-counters.rst @@ -0,0 +1,530 @@ +.. meta:: + :description: MI355 series performance counters and metrics + :keywords: MI355, MI355X, MI3XX + +********************************************** +MI350 and MI355 series performance counters +********************************************** + +This topic lists and describes the hardware performance counters and derived metrics available on the AMD Instinct MI355 series GPUs. These counters are available for profiling using `ROCprofiler-SDK `_ and `ROCm Compute Profiler `_. + +The following sections list the performance counters based on the IP blocks. + +Command processor packet processor counters (CPC) +================================================== + +.. list-table:: CPC counters + :header-rows: 1 + + * - Hardware counter + - Definition + + * - CPC_ALWAYS_COUNT + - Always count. + + * - CPC_ADC_VALID_CHUNK_NOT_AVAIL + - ADC valid chunk is not available when dispatch walking is in progress in the multi-xcc mode. + + * - CPC_ADC_DISPATCH_ALLOC_DONE + - ADC dispatch allocation done. + + * - CPC_ADC_VALID_CHUNK_END + - ADC crawler's valid chunk end in the multi-xcc mode. + + * - CPC_SYNC_FIFO_FULL_LEVEL + - Level count SYNC FIFO full last cycles. + + * - CPC_SYNC_FIFO_FULL + - SYNC FIFO full times. + + * - CPC_GD_BUSY + - ADC busy. + + * - CPC_TG_SEND + - ADC thread group send. + + * - CPC_WALK_NEXT_CHUNK + - ADC walking next valid chunk in the multi-xcc mode. + + * - CPC_STALLED_BY_SE0_SPI + - ADC CSDATA stalled by SE0SPI. + + * - CPC_STALLED_BY_SE1_SPI + - ADC CSDATA stalled by SE1SPI. + + * - CPC_STALLED_BY_SE2_SPI + - ADC CSDATA stalled by SE2SPI. + + * - CPC_STALLED_BY_SE3_SPI + - ADC CSDATA stalled by SE3SPI. 
+ + * - CPC_LTE_ALL + - CPC sync counter LteAll. Only Master XCD manages LteAll. + + * - CPC_SYNC_WRREQ_FIFO_BUSY + - CPC sync counter request FIFO is not empty. + + * - CPC_CANE_BUSY + - CPC CANE bus is busy, which indicates the presence of in-flight sync counter requests. + + * - CPC_CANE_STALL + - CPC sync counter sending is stalled by CANE. + +Shader pipe interpolators (SPI) counters +========================================= + +.. list-table:: SPI counters + :header-rows: 1 + + * - Hardware counter + - Definition + + * - SPI_CS0_WINDOW_VALID + - Clock count enabled by PIPE0 perfcounter_start event. + + * - SPI_CS0_BUSY + - Number of clocks with outstanding waves for PIPE0 (SPI or SH). + + * - SPI_CS0_NUM_THREADGROUPS + - Number of threadgroups launched for PIPE0. + + * - SPI_CS0_CRAWLER_STALL + - Number of clocks when PIPE0 event or wave order FIFO is full. + + * - SPI_CS0_EVENT_WAVE + - Number of PIPE0 events and waves. + + * - SPI_CS0_WAVE + - Number of PIPE0 waves. + + * - SPI_CS1_WINDOW_VALID + - Clock count enabled by PIPE1 perfcounter_start event. + + * - SPI_CS1_BUSY + - Number of clocks with outstanding waves for PIPE1 (SPI or SH). + + * - SPI_CS1_NUM_THREADGROUPS + - Number of threadgroups launched for PIPE1. + + * - SPI_CS1_CRAWLER_STALL + - Number of clocks when PIPE1 event or wave order FIFO is full. + + * - SPI_CS1_EVENT_WAVE + - Number of PIPE1 events and waves. + + * - SPI_CS1_WAVE + - Number of PIPE1 waves. + + * - SPI_CS2_WINDOW_VALID + - Clock count enabled by PIPE2 perfcounter_start event. + + * - SPI_CS2_BUSY + - Number of clocks with outstanding waves for PIPE2 (SPI or SH). + + * - SPI_CS2_NUM_THREADGROUPS + - Number of threadgroups launched for PIPE2. + + * - SPI_CS2_CRAWLER_STALL + - Number of clocks when PIPE2 event or wave order FIFO is full. + + * - SPI_CS2_EVENT_WAVE + - Number of PIPE2 events and waves. + + * - SPI_CS2_WAVE + - Number of PIPE2 waves. 
+ + * - SPI_CS3_WINDOW_VALID + - Clock count enabled by PIPE3 perfcounter_start event. + + * - SPI_CS3_BUSY + - Number of clocks with outstanding waves for PIPE3 (SPI or SH). + + * - SPI_CS3_NUM_THREADGROUPS + - Number of threadgroups launched for PIPE3. + + * - SPI_CS3_CRAWLER_STALL + - Number of clocks when PIPE3 event or wave order FIFO is full. + + * - SPI_CS3_EVENT_WAVE + - Number of PIPE3 events and waves. + + * - SPI_CS3_WAVE + - Number of PIPE3 waves. + + * - SPI_CSQ_P0_Q0_OCCUPANCY + - Sum of occupancy info for PIPE0 Queue0. + + * - SPI_CSQ_P0_Q1_OCCUPANCY + - Sum of occupancy info for PIPE0 Queue1. + + * - SPI_CSQ_P0_Q2_OCCUPANCY + - Sum of occupancy info for PIPE0 Queue2. + + * - SPI_CSQ_P0_Q3_OCCUPANCY + - Sum of occupancy info for PIPE0 Queue3. + + * - SPI_CSQ_P0_Q4_OCCUPANCY + - Sum of occupancy info for PIPE0 Queue4. + + * - SPI_CSQ_P0_Q5_OCCUPANCY + - Sum of occupancy info for PIPE0 Queue5. + + * - SPI_CSQ_P0_Q6_OCCUPANCY + - Sum of occupancy info for PIPE0 Queue6. + + * - SPI_CSQ_P0_Q7_OCCUPANCY + - Sum of occupancy info for PIPE0 Queue7. + + * - SPI_CSQ_P1_Q0_OCCUPANCY + - Sum of occupancy info for PIPE1 Queue0. + + * - SPI_CSQ_P1_Q1_OCCUPANCY + - Sum of occupancy info for PIPE1 Queue1. + + * - SPI_CSQ_P1_Q2_OCCUPANCY + - Sum of occupancy info for PIPE1 Queue2. + + * - SPI_CSQ_P1_Q3_OCCUPANCY + - Sum of occupancy info for PIPE1 Queue3. + + * - SPI_CSQ_P1_Q4_OCCUPANCY + - Sum of occupancy info for PIPE1 Queue4. + + * - SPI_CSQ_P1_Q5_OCCUPANCY + - Sum of occupancy info for PIPE1 Queue5. + + * - SPI_CSQ_P1_Q6_OCCUPANCY + - Sum of occupancy info for PIPE1 Queue6. + + * - SPI_CSQ_P1_Q7_OCCUPANCY + - Sum of occupancy info for PIPE1 Queue7. + + * - SPI_CSQ_P2_Q0_OCCUPANCY + - Sum of occupancy info for PIPE2 Queue0. + + * - SPI_CSQ_P2_Q1_OCCUPANCY + - Sum of occupancy info for PIPE2 Queue1. + + * - SPI_CSQ_P2_Q2_OCCUPANCY + - Sum of occupancy info for PIPE2 Queue2. + + * - SPI_CSQ_P2_Q3_OCCUPANCY + - Sum of occupancy info for PIPE2 Queue3. 
+ + * - SPI_CSQ_P2_Q4_OCCUPANCY + - Sum of occupancy info for PIPE2 Queue4. + + * - SPI_CSQ_P2_Q5_OCCUPANCY + - Sum of occupancy info for PIPE2 Queue5. + + * - SPI_CSQ_P2_Q6_OCCUPANCY + - Sum of occupancy info for PIPE2 Queue6. + + * - SPI_CSQ_P2_Q7_OCCUPANCY + - Sum of occupancy info for PIPE2 Queue7. + + * - SPI_CSQ_P3_Q0_OCCUPANCY + - Sum of occupancy info for PIPE3 Queue0. + + * - SPI_CSQ_P3_Q1_OCCUPANCY + - Sum of occupancy info for PIPE3 Queue1. + + * - SPI_CSQ_P3_Q2_OCCUPANCY + - Sum of occupancy info for PIPE3 Queue2. + + * - SPI_CSQ_P3_Q3_OCCUPANCY + - Sum of occupancy info for PIPE3 Queue3. + + * - SPI_CSQ_P3_Q4_OCCUPANCY + - Sum of occupancy info for PIPE3 Queue4. + + * - SPI_CSQ_P3_Q5_OCCUPANCY + - Sum of occupancy info for PIPE3 Queue5. + + * - SPI_CSQ_P3_Q6_OCCUPANCY + - Sum of occupancy info for PIPE3 Queue6. + + * - SPI_CSQ_P3_Q7_OCCUPANCY + - Sum of occupancy info for PIPE3 Queue7. + + * - SPI_CSQ_P0_OCCUPANCY + - Sum of occupancy info for all PIPE0 queues. + + * - SPI_CSQ_P1_OCCUPANCY + - Sum of occupancy info for all PIPE1 queues. + + * - SPI_CSQ_P2_OCCUPANCY + - Sum of occupancy info for all PIPE2 queues. + + * - SPI_CSQ_P3_OCCUPANCY + - Sum of occupancy info for all PIPE3 queues. + + * - SPI_VWC0_VDATA_VALID_WR + - Number of clocks vgpr bus_0 writes VGPRs. + + * - SPI_VWC1_VDATA_VALID_WR + - Number of clocks vgpr bus_1 writes VGPRs. + + * - SPI_CSC_WAVE_CNT_BUSY + - Number of cycles when there is any wave in the pipe. + +Compute unit counters +====================== + +.. list-table:: SQ counters + :header-row: 1 + + * - Hardware counter + - Definition + + * - SQ_INSTS_VALU_MFMA_F6F4 + - Number of VALU V_MFMA_*_F6F4 instructions. + + * - SQ_INSTS_VALU_MFMA_MOPS_F6F4 + - Number of VALU matrix with the performed math operations (add or mul) divided by 512, assuming a full EXEC mask of F6 or F4 data type. + + * - SQ_ACTIVE_INST_VALU2 + - Number of quad-cycles when two VALU instructions are issued (per-simd, nondeterministic). 
+ + * - SQ_INSTS_LDS_LOAD + - Number of LDS load instructions issued (per-simd, emulated). + + * - SQ_INSTS_LDS_STORE + - Number of LDS store instructions issued (per-simd, emulated). + + * - SQ_INSTS_LDS_ATOMIC + - Number of LDS atomic instructions issued (per-simd, emulated). + + * - SQ_INSTS_LDS_LOAD_BANDWIDTH + - Total number of 64-bytes loaded (instrSize * CountOnes(EXEC))/64 (per-simd, emulated). + + * - SQ_INSTS_LDS_STORE_BANDWIDTH + - Total number of 64-bytes written (instrSize * CountOnes(EXEC))/64 (per-simd, emulated). + + * - SQ_INSTS_LDS_ATOMIC_BANDWIDTH + - Total number of 64-bytes atomic (instrSize * CountOnes(EXEC))/64 (per-simd, emulated). + + * - SQ_INSTS_VALU_FLOPS_FP16 + - Counts FLOPS per instruction on float 16 excluding MFMA/SMFMA. + + * - SQ_INSTS_VALU_FLOPS_FP32 + - Counts FLOPS per instruction on float 32 excluding MFMA/SMFMA. + + * - SQ_INSTS_VALU_FLOPS_FP64 + - Counts FLOPS per instruction on float 64 excluding MFMA/SMFMA. + + * - SQ_INSTS_VALU_FLOPS_FP16_TRANS + - Counts FLOPS per instruction on float 16 trans excluding MFMA/SMFMA. + + * - SQ_INSTS_VALU_FLOPS_FP32_TRANS + - Counts FLOPS per instruction on float 32 trans excluding MFMA/SMFMA. + + * - SQ_INSTS_VALU_FLOPS_FP64_TRANS + - Counts FLOPS per instruction on float 64 trans excluding MFMA/SMFMA. + + * - SQ_INSTS_VALU_IOPS + - Counts OPS per instruction on integer or unsigned or bit data (per-simd, emulated). + + * - SQ_LDS_DATA_FIFO_FULL + - Number of cycles LDS data FIFO is full (nondeterministic, unwindowed). + + * - SQ_LDS_CMD_FIFO_FULL + - Number of cycles LDS command FIFO is full (nondeterministic, unwindowed). + + * - SQ_VMEM_TA_ADDR_FIFO_FULL + - Number of cycles texture requests are stalled due to full address FIFO in TA (nondeterministic, unwindowed). + + * - SQ_VMEM_TA_CMD_FIFO_FULL + - Number of cycles texture requests are stalled due to full cmd FIFO in TA (nondeterministic, unwindowed). 
+ + * - SQ_VMEM_WR_TA_DATA_FIFO_FULL + - Number of cycles texture writes are stalled due to full data FIFO in TA (nondeterministic, unwindowed). + + * - SQC_ICACHE_MISSES_DUPLICATE + - Number of duplicate misses (access to a non-resident, miss pending CL) (per-SQ, per-Bank, nondeterministic). + + * - SQC_DCACHE_MISSES_DUPLICATE + - Number of duplicate misses (access to a non-resident, miss pending CL) (per-SQ, per-Bank, nondeterministic). + +Texture addressing unit counters +================================= + +.. list-table:: TA counters + :header-row: 1 + + * - Hardware counter + - Definition + + * - TA_BUFFER_READ_LDS_WAVEFRONTS + - Number of buffer read wavefronts for LDS return processed by the TA. + + * - TA_FLAT_READ_LDS_WAVEFRONTS + - Number of flat opcode reads for LDS return processed by the TA. + +Texture data unit counters +=========================== + +.. list-table:: TD counters + :header-row: 1 + + * - Hardware counter + - Definition + + * - TD_WRITE_ACKT_WAVEFRONT + - Number of write acknowledgments, sent to SQ and not to SP. + + * - TD_TD_SP_TRAFFIC + - Number of times this TD sends data to the SP. + +Texture cache per pipe counters +================================ + +.. list-table:: TCP counters + :header-row: 1 + + * - Hardware counter + - Definition + + * - TCP_TCP_TA_ADDR_STALL_CYCLES + - TCP stalls TA addr interface. + + * - TCP_TCP_TA_DATA_STALL_CYCLES + - TCP stalls TA data interface. Now windowed. + + * - TCP_LFIFO_STALL_CYCLES + - Memory latency FIFOs full stall. + + * - TCP_RFIFO_STALL_CYCLES + - Memory Request FIFOs full stall. + + * - TCP_TCR_RDRET_STALL + - Write into cache stalled by read return from TCR. + + * - TCP_PENDING_STALL_CYCLES + - Stall due to data pending from L2. + + * - TCP_UTCL1_SERIALIZATION_STALL + - Total number of stalls caused due to serializing translation requests through the UTCL1. + + * - TCP_UTCL1_THRASHING_STALL + - Stall caused by thrashing feature in any probe. 
Lacks accuracy when the stall signal overlaps between probe0 and probe1, which is worse with MECO of thrashing deadlock. Some probe0 events could miss being counted in with MECO on. This perf count provides a rough thrashing estimate. + + * - TCP_UTCL1_TRANSLATION_MISS_UNDER_MISS + - Translation miss_under_miss. + + * - TCP_UTCL1_STALL_INFLIGHT_MAX + - Total UTCL1 stalls due to inflight counter saturation. + + * - TCP_UTCL1_STALL_LRU_INFLIGHT + - Total UTCL1 stalls due to LRU cache line with inflight traffic. + + * - TCP_UTCL1_STALL_MULTI_MISS + - Total UTCL1 stalls due to arbitrated multiple misses. + + * - TCP_UTCL1_LFIFO_FULL + - Total UTCL1 and UTCL2 latency, which hides FIFO full cycles. + + * - TCP_UTCL1_STALL_LFIFO_NOT_RES + - Total UTCL1 stalls due to UTCL2 latency, which hides FIFO output (not resident). + + * - TCP_UTCL1_STALL_UTCL2_REQ_OUT_OF_CREDITS + - Total UTCL1 stalls due to UTCL2_req being out of credits. + + * - TCP_CLIENT_UTCL1_INFLIGHT + - The sum of inflight client to UTCL1 requests per cycle. + + * - TCP_TAGRAM0_REQ + - Total L2 requests mapping to TagRAM 0 from this TCP to all TCCs. + + * - TCP_TAGRAM1_REQ + - Total L2 requests mapping to TagRAM 1 from this TCP to all TCCs. + + * - TCP_TAGRAM2_REQ + - Total L2 requests mapping to TagRAM 2 from this TCP to all TCCs. + + * - TCP_TAGRAM3_REQ + - Total L2 requests mapping to TagRAM 3 from this TCP to all TCCs. + + * - TCP_TCP_LATENCY + - Total TCP wave latency (from the first clock of wave entering to the first clock of wave leaving). Divide by TA_TCP_STATE_READ to find average wave latency. + + * - TCP_TCC_READ_REQ_LATENCY + - Total TCP to TCC request latency for reads and atomics with return. Not Windowed. + + * - TCP_TCC_WRITE_REQ_LATENCY + - Total TCP to TCC request latency for writes and atomics without return. Not Windowed. + + * - TCP_TCC_WRITE_REQ_HOLE_LATENCY + - Total TCP req to TCC hole latency for writes and atomics. Not Windowed. 
+ +Texture cache per channel counters +=================================== + +.. list-table:: TCC counters + :header-row: 1 + + * - Hardware counter + - Definition + + * - TCC_READ_SECTORS + - Total number of 32B data sectors in read requests. + + * - TCC_WRITE_SECTORS + - Total number of 32B data sectors in write requests. + + * - TCC_ATOMIC_SECTORS + - Total number of 32B data sectors in atomic requests. + + * - TCC_BYPASS_REQ + - Number of bypass requests. This is measured at the tag block. + + * - TCC_LATENCY_FIFO_FULL + - Number of cycles when the latency FIFO is full. + + * - TCC_SRC_FIFO_FULL + - Number of cycles when the SRC FIFO is assumed to be full as measured at the IB block. + + * - TCC_EA0_RDREQ_64B + - Number of 64-byte TCC/EA read requests. + + * - TCC_EA0_RDREQ_128B + - Number of 128-byte TCC/EA read requests. + + * - TCC_IB_REQ + - Number of requests through the IB. This measures the number of raw requests from graphics clients to this TCC. + + * - TCC_IB_STALL + - Number of cycles when the IB output is stalled. + + * - TCC_EA0_WRREQ_WRITE_DRAM + - Number of TCC/EA write requests (32-byte or 64-byte) destined for DRAM (MC). + + * - TCC_EA0_WRREQ_ATOMIC_DRAM + - Number of TCC/EA atomic requests (32-byte or 64-byte) destined for DRAM (MC). + + * - TCC_EA0_RDREQ_DRAM_32B + - Number of 32-byte TCC/EA read requests due to DRAM traffic. One 64-byte request is counted as two and one 128-byte as four. + + * - TCC_EA0_RDREQ_GMI_32B + - Number of 32-byte TCC/EA read requests due to GMI traffic. One 64-byte request is counted as two and one 128-byte as four. + + * - TCC_EA0_RDREQ_IO_32B + - Number of 32-byte TCC/EA read requests due to IO traffic. One 64-byte request is counted as two and one 128-byte as four. + + * - TCC_EA0_WRREQ_WRITE_DRAM_32B + - Number of 32-byte TCC/EA write requests due to DRAM traffic. One 64-byte request is counted as two. + + * - TCC_EA0_WRREQ_ATOMIC_DRAM_32B + - Number of 32-byte TCC/EA atomic requests due to DRAM traffic. 
One 64-byte request is counted as two. + + * - TCC_EA0_WRREQ_WRITE_GMI_32B + - Number of 32-byte TCC/EA write requests due to GMI traffic. One 64-byte request is counted as two. + + * - TCC_EA0_WRREQ_ATOMIC_GMI_32B + - Number of 32-byte TCC/EA atomic requests due to GMI traffic. One 64-byte request is counted as two. + + * - TCC_EA0_WRREQ_WRITE_IO_32B + - Number of 32-byte TCC/EA write requests due to IO traffic. One 64-byte request is counted as two. + + * - TCC_EA0_WRREQ_ATOMIC_IO_32B + - Number of 32-byte TCC/EA atomic requests due to IO traffic. One 64-byte request is counted as two. From 71bc63d2d8403d691eb1894ed7fabf2dc99d4126 Mon Sep 17 00:00:00 2001 From: randyh62 <42045079+randyh62@users.noreply.github.com> Date: Tue, 19 Aug 2025 10:53:36 -0700 Subject: [PATCH 05/58] Update RELEASE.md (#505) * Update RELEASE.md Updated with Changelog info from Julia * Update RELEASE.md * Update RELEASE.md * Update RELEASE.md --- RELEASE.md | 31 ++++++++++++++++++++----------- 1 file changed, 20 insertions(+), 11 deletions(-) diff --git a/RELEASE.md b/RELEASE.md index 88c74150c..d81e1ade7 100644 --- a/RELEASE.md +++ b/RELEASE.md @@ -770,24 +770,28 @@ See the full [AMD SMI changelog](https://github.com/ROCm/amdsmi/blob/release/roc - `hipMemGetHandleForAddressRange` gets a handle for the address range requested. - `num_threads` Total number of threads in the group. The legacy API size is alias. - `__reduce_add_sync`, `__reduce_min_sync`, and `__reduce_max_sync` functions added for reduction across lanes of a warp. For details, see [Warp cross-lane functions](https://rocm.docs.amd.com/projects/HIP/en/latest/how-to/hip_cpp_language_extensions.html#warp-cross-lane-functions). -* New support for Open Compute Project (OCP) floating-point `FP4` / `FP6` / `FP8` as the following. For details, see [Low precision floating point document](https://rocm.docs.amd.com/projects/HIP/en/latest/reference/low_fp_types.html). - - Data types for `FP4` / `FP6` / `FP8`. 
- - HIP APIs for `FP4` / `FP6` / `FP8`, which are compatible with corresponding CUDA APIs. +* New support for Open Compute Project (OCP) floating-point `FP4`/`FP6`/`FP8` as the following. For details, see [Low precision floating point document](https://rocm.docs.amd.com/projects/HIP/en/latest/reference/low_fp_types.html). + - Data types for `FP4`/`FP6`/`FP8`. + - HIP APIs for `FP4`/`FP6`/`FP8`, which are compatible with corresponding CUDA APIs. - HIP Extensions APIs for microscaling formats, which are supported on AMD GPUs. * New `wptr` and `rptr` values in `ClPrint`, for better logging in dispatch barrier methods. * New debug mask, to print precise code object information for logging. -* The `_sync()` version of crosslane builtins such as `shfl_sync()` and `__reduce_add_sync` are enabled by default. These can be disabled by setting the preprocessor macro `HIP_DISABLE_WARP_SYNC_BUILTINS`. -* Added `constexpr` operators for `FP16` / `BF16`. -* Added `__syncwarp` operation. +* The `_sync()` versions of crosslane builtins such as `shfl_sync()` are enabled by default. These can be disabled by setting the preprocessor macro `HIP_DISABLE_WARP_SYNC_BUILTINS`. +* Added `constexpr` operators for `fp16`/`bf16`. +* Added warp level primitives: `__syncwarp` and reduce intrinsics (e.g. `__reduce_add_sync()`). +* Extended fine grained system memory pool. +* `num_threads`: total number of threads in the group. The legacy API `size` is an alias. * Added PCI CHIP ID information as the device attribute. -* Added new tests applications for OCP data types `FP4` / `FP6` / `FP8`. +* Added new tests applications for OCP data types `FP4`/`FP6`/`FP8`. * A new attribute in HIP runtime was implemented which exposes a new device capability of how many compute dies (chiplets, xcc) are available on a given GPU.
Developers can get this attribute via the API `hipDeviceGetAttribute`, to make use of the best cache locality in a kernel, and optimize the Kernel launch grid layout, for performance improvement. #### Changed * Deprecated GPUs. Some unsupported GPUs such as gfx9, gfx8 and gfx7 are deprecated on Microsoft Windows. +* Removal of Beta warnings in HIP Graph APIs +All Beta warnings in usage of HIP Graph APIs are removed; the APIs are now officially and fully supported. * Behavior changes - - `hipGetLastError` now gets the error code returned by `hipGetLastError` which should be the last actual error caught in the current thread during the application execution. + - `hipGetLastError` now returns the error code which is the last actual error caught in the current thread during the application execution. - Cooperative groups in `hipLaunchCooperativeKernelMultiDevice` and `hipLaunchCooperativeKernel` functions, additional input parameter validation checks are added. - `hipPointerGetAttributes` returns `hipSuccess` instead of an error with invalid value `hipErrorInvalidValue`, in case `NULL` host or attribute pointer is passed as input parameter. It now matches the functionality of `cudaPointerGetAttributes` which changed with CUDA 11 and above releases. - `hipFree` previously there was an implicit wait which was applicable for all memory allocations, for synchronization purpose. This wait is now disabled for allocations made with `hipMallocAsync` and `hipMallocFromPoolAsync`, to match the behavior of CUDA API `cudaFree` @@ -901,9 +905,9 @@ HIP runtime has the following functional improvements which greatly improve runt * Refactored memory validation, creates a unique function to validate a variety of memory copy operations. * Improved kernel logging using demangling shader names. * Advanced support for SPIRV, now kernel compilation caching is enabled by default.
This feature is controlled by the environment variable `AMD_COMGR_CACHE`, for details, see [hip_rtc document](https://rocm.docs.amd.com/projects/HIP/en/latest/how-to/hip_rtc.html). -* Programmatic support for scratch limits on MI300 and MI350 series up GPU devices. More enumeration values were added in `hipLimit_t` as following, - - `hipExtLimitScratchMin`, minimum allowed value in bytes for scratch limit on the device. - - `hipExtLimitScratchMax`, maximum allowed value in bytes for scratch limit on the device. +* Programmatic support for scratch limits on MI300 and MI350 series up GPU devices. More enumeration values were added in `hipLimit_t` as following, + - `hipExtLimitScratchMin`, minimum allowed value in bytes for scratch limit on the device. + - `hipExtLimitScratchMax`, maximum allowed value in bytes for scratch limit on the device. - `hipExtLimitScratchCurrent`, current scratch limit threshold in bytes on the device. Must be between the value `hipExtLimitScratchMin` and `hipExtLimitScratchMax`. Developers can now use the environment variable `HSA_SCRATCH_SINGLE_LIMIT_ASYNC` to change the default allocation size with expected scratch limit in ROCR runtime. On top of it, this value can also be overwritten programmatically in the application using the HIP API `hipDeviceSetLimit(hipExtLimitScratchCurrent, value)` to reset the scratch limit value. * HIP runtime now enables peer-to-peer (P2P) memory copies to utilize all available SDMA engines, rather than being limited to a single engine. It also selects the best engine first to give optimal bandwidth. @@ -919,6 +923,11 @@ HIP runtime has the following functional improvements which greatly improve runt * A crash in TensorFlow related application. HIP runtime now combines multiple definitions of `callbackQueue` into a single function, in case of an exception, passes its handler to the application and provides corresponding error code. * Fixed issue of handling the kernel parameters for the graph launch. 
* Failures in roc-obj tools. HIP runtime now makes `DEPRECATED` message in roc-obj tools as `STDERR`. +* Support of `hipDeviceMallocContiguous` flags in `hipExtMallocWithFlags()`. It now enables `HSA_AMD_MEMORY_POOL_CONTIGUOUS_FLAG` in the memory pool allocation on GPU device. +* Compilation failure, HIP runtime refactored the vector type alignment with `__hip_vec_align_v` +* A numerical error/corruption found in Pytorch during graph replay. HIP runtime fixed the input sizes of kernel launch dimensions in hipExtModuleLaunchKernel for the execution of hipGraph capture. +* A crash during kernel execution in a customer application. The structure of kernel arguments was updated via adding the size of kernel arguments, and HIP runtime does validation before launch kernel with the structured arguments. +* A permission related error during execution of `hipLaunchHostFunc`. The API is now supported and allowed to run during stream capture, to match the behavior of CUDA. ### **hipBLAS** (3.0.0) From 1d127d987b16ec9a982ce268c637739ef161de9e Mon Sep 17 00:00:00 2001 From: randyh62 <42045079+randyh62@users.noreply.github.com> Date: Tue, 19 Aug 2025 12:51:09 -0700 Subject: [PATCH 06/58] Update RELEASE.md (#508) * Update RELEASE.md Added ROCR Runtime * Update RELEASE.md Removed Resolved Issue from HIP * Update RELEASE.md fix a few bad words --- RELEASE.md | 10 +++++++++- 1 file changed, 9 insertions(+), 1 deletion(-) diff --git a/RELEASE.md b/RELEASE.md index d81e1ade7..2bc13014f 100644 --- a/RELEASE.md +++ b/RELEASE.md @@ -927,7 +927,6 @@ HIP runtime has the following functional improvements which greatly improve runt * Compilation failure, HIP runtime refactored the vector type alignment with `__hip_vec_align_v` * A numerical error/corruption found in Pytorch during graph replay. HIP runtime fixed the input sizes of kernel launch dimensions in hipExtModuleLaunchKernel for the execution of hipGraph capture. * A crash during kernel execution in a customer application. 
The structure of kernel arguments was updated via adding the size of kernel arguments, and HIP runtime does validation before launch kernel with the structured arguments. -* A permission related error during execution of `hipLaunchHostFunc`. The API is now supported and allowed to run during stream capture, to match the behavior of CUDA. ### **hipBLAS** (3.0.0) @@ -2075,6 +2074,15 @@ The previous default accumulator types could lead to situations in which unexpec * Test package - debian packages will install required dependencies. +### **ROCr Runtime** (1.18.0) + +#### Added + +* New API `hsa_amd_memory_get_preferred_copy_engine` to get the preferred copy engine, which can be used when calling `hsa_amd_memory_async_copy_on_engine`. +* New API `hsa_amd_portable_export_dmabuf_v2`, an extension of the existing `hsa_amd_portable_export_dmabuf` API that supports a new flags parameter. This allows specifying the new `HSA_AMD_DMABUF_MAPPING_TYPE_PCIE` flag when exporting dma-bufs. +* New flag `HSA_AMD_VMEM_ADDRESS_NO_REGISTER` is supported when calling the `hsa_amd_vmem_address_reserve` API. This allows virtual address range reservations for SVM allocations to be tracked when running in ASAN mode. +* New sub query `HSA_AMD_AGENT_INFO_CLOCK_COUNTERS` returns a snapshot of the underlying driver's clock counters that can be used for profiling.
+ ### **Tensile** (4.44.0) #### Added From da340c3d05ec9650c0a339877945e08124789325 Mon Sep 17 00:00:00 2001 From: srawat <120587655+SwRaw@users.noreply.github.com> Date: Wed, 20 Aug 2025 17:06:02 +0530 Subject: [PATCH 07/58] spellcheck --- .wordlist.txt | 20 +++++++++++++++++++ .../gpu-arch/mi355-performance-counters.rst | 8 ++++---- 2 files changed, 24 insertions(+), 4 deletions(-) diff --git a/.wordlist.txt b/.wordlist.txt index c32752d7c..12ef2fce7 100644 --- a/.wordlist.txt +++ b/.wordlist.txt @@ -61,6 +61,7 @@ CPU CPUs Cron CSC +CSDATA CSE CSV CSn @@ -123,6 +124,7 @@ EPYC ESXi EoS FBGEMM +FIFOs FFT FFTs FFmpeg @@ -196,6 +198,7 @@ Haswell Higgs Hyperparameters Huggingface +IB ICD ICT ICV @@ -204,8 +207,11 @@ IDEs IFWI IMDb IncDec +instrSize +interpolators IOMMU IOP +IOPS IOPM IOV IRQ @@ -242,12 +248,15 @@ LLM LLMs LLVM LM +LRU LSAN LSan LTS LSTMs +LteAll LanguageCrossEntropy LoRA +MECO MEM MERCHANTABILITY MFMA @@ -266,6 +275,7 @@ MNIST MPI MPT MSVC +mul MVAPICH MVFFR Makefile @@ -343,6 +353,7 @@ PCC PCI PCIe PEFT +perf PEQT PIL PILImage @@ -426,6 +437,7 @@ SKUs SLES SLURM SMEM +SMFMA SMI SMT SPI @@ -437,18 +449,23 @@ SWE SerDes ShareGPT Shlens +simd Skylake Softmax Spack SplitK Supermicro Szegedy +TagRAM TCA TCC +TCCs TCI TCIU TCP TCR +THREADGROUPS +threadgroups TensorRT TensorFloat TF @@ -492,6 +509,7 @@ UltraChat Uncached Unittests Unhandled +unwindowed VALU VBIOS VCN @@ -507,11 +525,13 @@ Vanhoucke Vulkan WGP WGPs +WR WX WikiText Wojna Workgroups Writebacks +xcc XCD XCDs XGBoost diff --git a/docs/conceptual/gpu-arch/mi355-performance-counters.rst b/docs/conceptual/gpu-arch/mi355-performance-counters.rst index 23c6a6ae3..11831f453 100644 --- a/docs/conceptual/gpu-arch/mi355-performance-counters.rst +++ b/docs/conceptual/gpu-arch/mi355-performance-counters.rst @@ -86,7 +86,7 @@ Shader pipe interpolators (SPI) counters - Number of clocks with outstanding waves for PIPE0 (SPI or SH). * - SPI_CS0_NUM_THREADGROUPS - - Number of threadgroups launched for PIPE0. 
+ - Number of thread groups launched for PIPE0. * - SPI_CS0_CRAWLER_STALL - Number of clocks when PIPE0 event or wave order FIFO is full. @@ -104,7 +104,7 @@ Shader pipe interpolators (SPI) counters - Number of clocks with outstanding waves for PIPE1 (SPI or SH). * - SPI_CS1_NUM_THREADGROUPS - - Number of threadgroups launched for PIPE1. + - Number of thread groups launched for PIPE1. * - SPI_CS1_CRAWLER_STALL - Number of clocks when PIPE1 event or wave order FIFO is full. @@ -122,7 +122,7 @@ Shader pipe interpolators (SPI) counters - Number of clocks with outstanding waves for PIPE2 (SPI or SH). * - SPI_CS2_NUM_THREADGROUPS - - Number of threadgroups launched for PIPE2. + - Number of thread groups launched for PIPE2. * - SPI_CS2_CRAWLER_STALL - Number of clocks when PIPE2 event or wave order FIFO is full. @@ -140,7 +140,7 @@ Shader pipe interpolators (SPI) counters - Number of clocks with outstanding waves for PIPE3 (SPI or SH). * - SPI_CS3_NUM_THREADGROUPS - - Number of threadgroups launched for PIPE3. + - Number of thread groups launched for PIPE3. * - SPI_CS3_CRAWLER_STALL - Number of clocks when PIPE3 event or wave order FIFO is full. 
From 35ec186cd99473343bfb190e17276ef9b3edacd6 Mon Sep 17 00:00:00 2001 From: srawat <120587655+SwRaw@users.noreply.github.com> Date: Wed, 20 Aug 2025 17:19:28 +0530 Subject: [PATCH 08/58] spellcheck --- .wordlist.txt | 1 + docs/conceptual/gpu-arch/mi355-performance-counters.rst | 4 ++-- 2 files changed, 3 insertions(+), 2 deletions(-) diff --git a/.wordlist.txt b/.wordlist.txt index 12ef2fce7..d828dbe7b 100644 --- a/.wordlist.txt +++ b/.wordlist.txt @@ -81,6 +81,7 @@ CommonMark Concretized Conda ConnectX +CountOnes CuPy da Dashboarding diff --git a/docs/conceptual/gpu-arch/mi355-performance-counters.rst b/docs/conceptual/gpu-arch/mi355-performance-counters.rst index 11831f453..f073169cd 100644 --- a/docs/conceptual/gpu-arch/mi355-performance-counters.rst +++ b/docs/conceptual/gpu-arch/mi355-performance-counters.rst @@ -260,10 +260,10 @@ Shader pipe interpolators (SPI) counters - Sum of occupancy info for all PIPE3 queues. * - SPI_VWC0_VDATA_VALID_WR - - Number of clocks vgpr bus_0 writes VGPRs. + - Number of clocks VGPR bus_0 writes VGPRs. * - SPI_VWC1_VDATA_VALID_WR - - Number of clocks vgpr bus_1 writes VGPRs. + - Number of clocks VGPR bus_1 writes VGPRs. * - SPI_CSC_WAVE_CNT_BUSY - Number of cycles when there is any wave in the pipe. 
From 073ac54e47004bb3f03c4450424bf48499bc10d5 Mon Sep 17 00:00:00 2001 From: randyh62 <42045079+randyh62@users.noreply.github.com> Date: Wed, 20 Aug 2025 11:26:28 -0700 Subject: [PATCH 09/58] Llvm rn update (#511) * Update RELEASE.md Added LLVM Release Notes content * Update RELEASE.md minor formatting edits * Update RELEASE.md updated CUDA version --- RELEASE.md | 25 ++++++++++++++++++++++++- 1 file changed, 24 insertions(+), 1 deletion(-) diff --git a/RELEASE.md b/RELEASE.md index 2bc13014f..b03c139b3 100644 --- a/RELEASE.md +++ b/RELEASE.md @@ -98,7 +98,7 @@ Key enhancements include: * Improved [target-specific extensions](https://github.com/ROCm/llvm-project/blob/c2535466c6e40acd5ecf6ba1676a4e069c6245cc/clang/docs/LanguageExtensions.rst): * Added a new target-specific builtin ``__builtin_amdgcn_processor_is`` for late or deferred queries of the current target processor. * Added a new target-specific builtin ``__builtin_amdgcn_is_invocable``, enabling fine-grained, per-builtin feature availability. -* HIPIFY now supports NVIDIA CUDA 12.8.0 APIs: +* HIPIFY now supports NVIDIA CUDA 12.9.1 APIs: * Added support for all new device and host APIs, including `FP4`, `FP6`, and `FP128`– including support for the corresponding ROCm HIP equivalents. Deprecated features: @@ -1188,6 +1188,29 @@ HIP runtime has the following functional improvements which greatly improve runt * Replaced `hiptensorElementwiseTrinary` with `hiptensorElementwiseTrinaryExecute`. * Removed function `hiptensorReductionGetWorkspaceSize`. +### **llvm-project** (20.0.0) + +#### Added + +* Added compiler support for separate debug file generation for device code. +* Added `llvm-flang`, AMD's next-generation Fortran compiler. It is a re-implementation of the Fortran frontend that can be found at `llvm/llvm-project/flang` on GitHub.
+* Added Comgr support for an in-memory virtual file system (VFS) for storing temporary files generated during intermediate compilation steps to improve performance in the device library link step. +* Added compiler support of a new target-specific builtin `__builtin_amdgcn_processor_is` for late or deferred queries of the current target processor, and `__builtin_amdgcn_is_invocable` enabling fine-grained target-specific feature availability. +* Added HIPIFY support for NVIDIA CUDA 12.9.1 APIs. Added support for all new device and host APIs, including FP4, FP6, and FP128, and support for the corresponding ROCm HIP equivalents. + +#### Changed + +* Updated clang/llvm to AMD clang version 20.0.0 (equivalent to LLVM 20.0.0 with additional out-of-tree patches). + +#### Optimized + +* Improved compiler memory load and store instructions. + +#### Upcoming changes + +* `__AMDGCN_WAVEFRONT_SIZE__` macro and HIP’s `warpSize` variable as `constexpr` are deprecated and will be disabled in a future release. Users are encouraged to update their code if needed to ensure future compatibility. For more information, see [AMDGCN_WAVEFRONT_SIZE deprecation](#amdgpu-wavefront-size-compiler-macro-deprecation). +* The `roc-obj-ls` and `roc-obj-extract` tools are deprecated. To extract all Clang offload bundles into separate code objects use `llvm-objdump --offloading `. For more information, see [Changes to ROCm Object Tooling](#changes-to-rocm-object-tooling). 
+ ### **MIOpen** (3.5.0) #### Added From acdb5c90a68156e51a463abc168bbc8681b24096 Mon Sep 17 00:00:00 2001 From: Pratik Basyal Date: Wed, 20 Aug 2025 14:50:24 -0400 Subject: [PATCH 10/58] PRE RC4 7.0.0 RN Update (#507) * Indentation and formatting updated * Feedback changes and AQLprofiler addition * AQL Profiler update * MIgraphx changelog added * Release highlight added * Indentation fixed * Highlights updated * Highlights changes * Leo quick review feedback added * Leo's review feedback added * Leo's feedback incorporated * Consolidated changelog synced * OS virtualization link updated * ROCm Bandwidth test added * Changelog.md sycned --- CHANGELOG.md | 374 ++++++++++++++++++++++++++++++++++++++------------- RELEASE.md | 324 ++++++++++++++++++++++++++++++++++++-------- 2 files changed, 543 insertions(+), 155 deletions(-) diff --git a/CHANGELOG.md b/CHANGELOG.md index 4d581c4eb..a2e827555 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -38,7 +38,7 @@ for a complete overview of this release. * `socket power` to `amdsmi_get_power_info` - Previously the C API had the value in the `amdsmi_power_info` structure, but was unused - - Now we populate the value in both C & Python APIs + - Now we populate the value in both C and Python APIs - The value is representative of the socket's power agnostic of the the GPU version. * New event notification types to `amdsmi_evt_notification_type_t`. @@ -155,16 +155,16 @@ See the full [AMD SMI changelog](https://github.com/ROCm/amdsmi/blob/release/roc #### Added -* Added support for BF16, F32, and F16 for 2D and 3D NGCHW grouped convolution backward data. +* Added support for `BF16`, `F32`, and `F16` for 2D and 3D NGCHW grouped convolution backward data. * Added a fully asynchronous HOST (CPU) arguments copy flow for CK grouped GEMM kernels. * Added support GKCYX layout for grouped convolution forward (NGCHW/GKCYX/NGKHW, number of instances in instance factory for NGCHW/GKYXC/NGKHW has been reduced). 
* Added support for GKCYX layout for grouped convolution forward (NGCHW/GKCYX/NGKHW). * Added support for GKCYX layout for grouped convolution backward weight (NGCHW/GKCYX/NGKHW). * Added support for GKCYX layout for grouped convolution backward data (NGCHW/GKCYX/NGKHW). -* Added support for Stream-K version of mixed FP8/BF16 GEMM. +* Added support for Stream-K version of mixed `FP8` / `BF16` GEMM. * Added support for Multiple D GEMM. -* Added GEMM pipeline for microscaling (MX) FP8/FP6/FP4 data types -* Added support for FP16 2:4 structured sparsity to universal GEMM. +* Added GEMM pipeline for microscaling (MX) `FP8` / `FP6` / `FP4` data types +* Added support for `FP16` 2:4 structured sparsity to universal GEMM. * Added support for Split K for grouped convolution backward data. * Added logit soft-capping support for fMHA forward kernels. * Added support for hdim as a multiple of 32 for FMHA (fwd/fwd_splitkv). @@ -175,13 +175,16 @@ See the full [AMD SMI changelog](https://github.com/ROCm/amdsmi/blob/release/roc #### Changed -* Removed support for gfx940 and gfx941 targets. * Replaced the raw buffer load/store intrinsics with Clang20 built-ins. * DL and DPP kernels are now enabled by default. * Number of instances in instance factory for grouped convolution forward NGCHW/GKYXC/NGKHW has been reduced. * Number of instances in instance factory for grouped convolution backward weight NGCHW/GKYXC/NGKHW has been reduced. * Number of instances in instance factory for grouped convolution backward data NGCHW/GKYXC/NGKHW has been reduced. +#### Removed + +* Removed support for gfx940 and gfx941 targets. + #### Optimized * Optimize the GEMM multiply preshuffle and lds bypass with Pack of KGroup and better instruction layout. @@ -205,9 +208,11 @@ See the full [AMD SMI changelog](https://github.com/ROCm/amdsmi/blob/release/roc - HIP Extensions APIs for microscaling formats, which are supported on AMD GPUs. 
* New `wptr` and `rptr` values in `ClPrint`, for better logging in dispatch barrier methods. * New debug mask, to print precise code object information for logging. -* The `_sync()` version of crosslane builtins such as `shfl_sync()` and `__reduce_add_sync` are enabled by default. These can be disabled by setting the preprocessor macro `HIP_DISABLE_WARP_SYNC_BUILTINS`. -* Added `constexpr` operators for `FP16`/`BF16`. -* Added `__syncwarp` operation. +* The `_sync()` version of crosslane builtins such as `shfl_sync()` are enabled by default. These can be disabled by setting the preprocessor macro `HIP_DISABLE_WARP_SYNC_BUILTINS`. +* Added `constexpr` operators for `fp16`/`bf16`. +* Added warp level primitives: `__syncwarp` and reduce intrinsics (e.g. `__reduce_add_sync()`) +* Extended fine grained system memory pool. +* `num_threads` total number of threads in the group. The legacy API size is alias. * Added PCI CHIP ID information as the device attribute. * Added new tests applications for OCP data types `FP4`/`FP6`/`FP8`. * A new attribute in HIP runtime was implemented which exposes a new device capability of how many compute dies (chiplets, xcc) are available on a given GPU. Developers can get this attribute via the API `hipDeviceGetAttribute`, to make use of the best cache locality in a kernel, and optimize the Kernel launch grid layout, for performance improvement. @@ -215,8 +220,10 @@ See the full [AMD SMI changelog](https://github.com/ROCm/amdsmi/blob/release/roc #### Changed * Deprecated GPUs. Some unsupported GPUs such as gfx9, gfx8 and gfx7 are deprecated on Microsoft Windows. +* Removal of Beta warnings in HIP Graph APIs +All Beta warnings in usage of HIP Graph APIs are removed, they are now officially and fully supported. * Behavior changes - - `hipGetLastError` now gets the error code returned by `hipGetLastError` which should be the last actual error caught in the current thread during the application execution. 
+ - `hipGetLastError` now returns the error code which is the last actual error caught in the current thread during the application execution. - Cooperative groups in `hipLaunchCooperativeKernelMultiDevice` and `hipLaunchCooperativeKernel` functions, additional input parameter validation checks are added. - `hipPointerGetAttributes` returns `hipSuccess` instead of an error with invalid value `hipErrorInvalidValue`, in case `NULL` host or attribute pointer is passed as input parameter. It now matches the functionality of `cudaPointerGetAttributes` which changed with CUDA 11 and above releases. - `hipFree` previously there was an implicit wait which was applicable for all memory allocations, for synchronization purpose. This wait is now disabled for allocations made with `hipMallocAsync` and `hipMallocFromPoolAsync`, to match the behavior of CUDA API `cudaFree` @@ -317,11 +324,12 @@ In order to match the CUDA runtime behavior more closely, HIP APIs with streams * `hipEventRecord` * `hipEventRecordWithFlags` * `warpSize` Change + In order to match the CUDA specification, the `warpSize` variable is no longer `constexpr`. In general, this should be a transparent change; however, if an application was using `warpSize` as a compile-time constant, it will have to be updated to handle the new definition. For more information, see either the discussion of `warpSize` within the [HIP C++ language extensions](https://rocm.docs.amd.com/projects/HIP/en/latest/how-to/hip_cpp_language_extensions.html#warpsize). #### Optimized -HIP runtime has the following functional improvements which greatly improve runtime performance and user experience. +HIP runtime has the following functional improvements which improves runtime performance and user experience. * Reduced usage of the lock scope in events and kernel handling. - Switches to `shared_mutex` for event validation, uses `std::unique_lock` in HIP runtime to create/destroy event, instead of `scopedLock`. 
@@ -330,9 +338,9 @@ HIP runtime has the following functional improvements which greatly improve runt * Refactored memory validation, creates a unique function to validate a variety of memory copy operations. * Improved kernel logging using demangling shader names. * Advanced support for SPIRV, now kernel compilation caching is enabled by default. This feature is controlled by the environment variable `AMD_COMGR_CACHE`, for details, see [hip_rtc document](https://rocm.docs.amd.com/projects/HIP/en/latest/how-to/hip_rtc.html). -* Programmatic support for scratch limits on MI300 and MI350 series up GPU devices. More enumeration values were added in `hipLimit_t` as following, - - `hipExtLimitScratchMin`, minimum allowed value in bytes for scratch limit on the device. - - `hipExtLimitScratchMax`, maximum allowed value in bytes for scratch limit on the device. +* Programmatic support for scratch limits on the AMD Instinct MI300 and MI350 series up GPU devices. More enumeration values were added in `hipLimit_t` as following, + - `hipExtLimitScratchMin`, minimum allowed value in bytes for scratch limit on the device. + - `hipExtLimitScratchMax`, maximum allowed value in bytes for scratch limit on the device. - `hipExtLimitScratchCurrent`, current scratch limit threshold in bytes on the device. Must be between the value `hipExtLimitScratchMin` and `hipExtLimitScratchMax`. Developers can now use the environment variable `HSA_SCRATCH_SINGLE_LIMIT_ASYNC` to change the default allocation size with expected scratch limit in ROCR runtime. On top of it, this value can also be overwritten programmatically in the application using the HIP API `hipDeviceSetLimit(hipExtLimitScratchCurrent, value)` to reset the scratch limit value. * HIP runtime now enables peer-to-peer (P2P) memory copies to utilize all available SDMA engines, rather than being limited to a single engine. It also selects the best engine first to give optimal bandwidth. 
@@ -348,6 +356,10 @@ HIP runtime has the following functional improvements which greatly improve runt * A crash in TensorFlow related application. HIP runtime now combines multiple definitions of `callbackQueue` into a single function, in case of an exception, passes its handler to the application and provides corresponding error code. * Fixed issue of handling the kernel parameters for the graph launch. * Failures in roc-obj tools. HIP runtime now makes `DEPRECATED` message in roc-obj tools as `STDERR`. +* Support of `hipDeviceMallocContiguous` flags in `hipExtMallocWithFlags()`. It now enables `HSA_AMD_MEMORY_POOL_CONTIGUOUS_FLAG` in the memory pool allocation on GPU device. +* Compilation failure, HIP runtime refactored the vector type alignment with `__hip_vec_align_v` +* A numerical error/corruption found in Pytorch during graph replay. HIP runtime fixed the input sizes of kernel launch dimensions in hipExtModuleLaunchKernel for the execution of hipGraph capture. +* A crash during kernel execution in a customer application. The structure of kernel arguments was updated via adding the size of kernel arguments, and HIP runtime does validation before launch kernel with the structured arguments. ### **hipBLAS** (3.0.0) @@ -355,7 +367,7 @@ HIP runtime has the following functional improvements which greatly improve runt * Added the `hipblasSetWorkspace()` API. * Support for codecoverage tests. - + #### Changed * HIPBLAS_V2 API is the only available API using the `hipComplex` and `hipDatatype` types. @@ -381,7 +393,7 @@ HIP runtime has the following functional improvements which greatly improve runt * Fused Swish/SiLU GEMM in hipBLASLt (enabled by ``HIPBLASLT_EPILOGUE_SWISH_EXT`` and ``HIPBLASLT_EPILOGUE_SWISH_BIAS_EXT``) * Added support for ``HIPBLASLT_EPILOGUE_GELU_AUX_BIAS`` for gfx942. * Added `HIPBLASLT_TUNING_USER_MAX_WORKSPACE` to constrain the maximum workspace size for user offline tuning. 
-* Added ``HIPBLASLT_ORDER_COL16_4R16`` and ``HIPBLASLT_ORDER_COL16_4R8`` to ``hipblasLtOrder_t`` to support `FP16`/`BF16` swizzle GEMM and `FP8`/`BF8` swizzle GEMM respectively. +* Added ``HIPBLASLT_ORDER_COL16_4R16`` and ``HIPBLASLT_ORDER_COL16_4R8`` to ``hipblasLtOrder_t`` to support `FP16`/`BF16` swizzle GEMM and `FP8` / `BF8` swizzle GEMM respectively. * Added TF32 emulation on gfx950. * Added support for `FP6`, `BF6`, and `FP4` on gfx950 * Added support for block scaling by setting `HIPBLASLT_MATMUL_DESC_A_SCALE_MODE` and `HIPBLASLT_MATMUL_DESC_B_SCALE_MODE` to `HIPBLASLT_MATMUL_MATRIX_SCALE_VEC32_UE8M0`. @@ -394,8 +406,8 @@ HIP runtime has the following functional improvements which greatly improve runt #### Optimized -* Improved performance for 8-bit (`FP8`/`BF8`/`I8`) NN/NT cases by adding ``s_delay_alu`` to reduce stalls from dependent ALU operations on gfx12+. -* Improved performance for 8-bit and 16-bit (`FP16`/`BF16`) TN cases by enabling software dependency checks (Expert Scheduling Mode) under certain restrictions to reduce redundant hardware dependency checks on gfx12+. +* Improved performance for 8-bit (`FP8` / `BF8` / `I8`) NN/NT cases by adding ``s_delay_alu`` to reduce stalls from dependent ALU operations on gfx12+. +* Improved performance for 8-bit and 16-bit (`FP16` / `BF16`) TN cases by enabling software dependency checks (Expert Scheduling Mode) under certain restrictions to reduce redundant hardware dependency checks on gfx12+. * Improved performance for 8-bit, 16-bit, and 32-bit batched GEMM with a better heuristic search algorithm for gfx942. #### Upcoming changes @@ -406,7 +418,7 @@ HIP runtime has the following functional improvements which greatly improve runt #### Added -* Added a new cmake option, `BUILD_OFFLOAD_COMPRESS`. When hipCUB is build with this option enabled, the `--offload-compress` switch is passed to the compiler. This causes the compiler to compress the binary that it generates. 
Compression can be useful in cases where you are compiling for a large number of targets, since this often results in a large binary. Without compression, in some cases, the generated binary may become so large symbols are placed out of range, resulting in linking errors. The new `BUILD_OFFLOAD_COMPRESS` option is set to `ON` by default. +* Added a new cmake option, `BUILD_OFFLOAD_COMPRESS`. When hipCUB is built with this option enabled, the `--offload-compress` switch is passed to the compiler. This causes the compiler to compress the binary that it generates. Compression can be useful in cases where you are compiling for a large number of targets, since this often results in a large binary. Without compression, in some cases, the generated binary may become so large symbols are placed out of range, resulting in linking errors. The new `BUILD_OFFLOAD_COMPRESS` option is set to `ON` by default. * Added single pass operators in `agent/single_pass_scan_operators.hpp` which contains the following API: * `BlockScanRunningPrefixOp` * `ScanTileStatus` @@ -427,13 +439,13 @@ HIP runtime has the following functional improvements which greatly improve runt * Deprecated `hipcub::AsmThreadStore` is removed, use `hipcub::ThreadStore` instead. * Deprecated `BlockAdjacentDifference::FlagHeads`, `BlockAdjacentDifference::FlagTails` and `BlockAdjacentDifference::FlagHeadsAndTails` have been removed. * This release removes support for custom builds on gfx940 and gfx941. -* Removed C++14 support, only C++17 is supported. +* Removed C++14 support. Only C++17 is supported. #### Changed * The NVIDIA backend now requires CUB, Thrust, and libcu++ 2.7.0. If they aren't found, they will be downloaded from the NVIDIA CCCL repository. * Updated `thread_load` and `thread_store` to align hipCUB with CUB. -* All kernels now have hidden symbol visibility. 
All symbols now have inline namespaces that include the library version, (for example, hipcub::HIPCUB_300400_NS::symbol instead of hipcub::symbol), letting the user link multiple libraries built with different versions of hipCUB. +* All kernels now have hidden symbol visibility. All symbols now have inline namespaces that include the library version, (for example, `hipcub::HIPCUB_300400_NS::symbol` instead of `hipcub::symbol`), letting the user link multiple libraries built with different versions of hipCUB. * Modified the broadcast kernel in warp scan benchmarks. The reported performance may be different to previous versions. * The `hipcub::detail::accumulator_t` in rocPRIM backend has been changed to utilise `rocprim::accumulator_t`. * The usage of `rocprim::invoke_result_binary_op_t` has been replaced with `rocprim::accumulator_t`. @@ -489,7 +501,7 @@ HIP runtime has the following functional improvements which greatly improve runt * [#1859](https://github.com/ROCm/HIPIFY/issues/1859)[hipify-perl] Fix warnings on unsupported Driver or Runtime APIs which were erroneously not reported * [#1930](https://github.com/ROCm/HIPIFY/issues/1930) Revise `JIT API` * [#1962](https://github.com/ROCm/HIPIFY/issues/1962) Support for cuda-samples helper headers -* [#2035](https://github.com/ROCm/HIPIFY/issues/2035) Remove `const_cast<const char**>` in `hiprtcCreateProgram` and `hiprtcCompileProgram` +* [#2035](https://github.com/ROCm/HIPIFY/issues/2035) Remove `const_cast;` in `hiprtcCreateProgram` and `hiprtcCompileProgram` ### **hipRAND** (3.0.0) @@ -500,9 +512,9 @@ HIP runtime has the following functional improvements which greatly improve runt #### Changed * Deprecated the hipRAND Fortran API in favor of hipfort. - + #### Removed - + * Removed C++14 support, so only C++17 is supported. 
### **hipSOLVER** (3.0.0) @@ -514,7 +526,7 @@ HIP runtime has the following functional improvements which greatly improve runt * hipsolverSpCcsrlsvqr, hipsolverSpZcsrlsvqr #### Resolved issues - + * Corrected the value of `lwork` returned by various `bufferSize` functions to be consistent with NVIDIA cuSOLVER. The following functions now return `lwork` so that the workspace size (in bytes) is `sizeof(T) * lwork`, rather than `lwork`. To restore the original behavior, set the environment variable `HIPSOLVER_BUFFERSIZE_RETURN_BYTES`. * `hipsolverXorgbr_bufferSize`, `hipsolverXorgqr_bufferSize`, `hipsolverXorgtr_bufferSize`, `hipsolverXormqr_bufferSize`, `hipsolverXormtr_bufferSize`, `hipsolverXgesvd_bufferSize`, `hipsolverXgesvdj_bufferSize`, `hipsolverXgesvdBatched_bufferSize`, `hipsolverXgesvdaStridedBatched_bufferSize`, `hipsolverXsyevd_bufferSize`, `hipsolverXsyevdx_bufferSize`, `hipsolverXsyevj_bufferSize`, `hipsolverXsyevjBatched_bufferSize`, `hipsolverXsygvd_bufferSize`, `hipsolverXsygvdx_bufferSize`, `hipsolverXsygvj_bufferSize`, `hipsolverXsytrd_bufferSize`, `hipsolverXsytrf_bufferSize`. @@ -535,14 +547,14 @@ HIP runtime has the following functional improvements which greatly improve runt #### Changed * Switched to defaulting to C++17 when building hipSPARSE from source. Previously hipSPARSE was using C++14 by default. - + #### Resolved issues * Fixed a compilation [issue](https://github.com/ROCm/hipSPARSE/issues/555) related to using `std::filesystem` and C++14. * Fixed an issue where the clients-common package was empty by moving the `hipsparse_clientmatrices.cmake` and `hipsparse_mtx2csr` files to it. #### Known issues - + * In `hipsparseSpSM_solve()`, the external buffer is passed as a parameter. This does not match the NVIDIA CUDA cuSPARSE API. This extra external buffer parameter will be removed in a future release. For now, this extra parameter can be ignored and nullptr passed in, because it is unused internally. 
### **hipSPARSELt** (0.2.4) @@ -551,16 +563,16 @@ HIP runtime has the following functional improvements which greatly improve runt * Support for the LLVM target gfx950. * Support for the following data type combinations for the LLVM target gfx950: - * FP8(E4M3) inputs, F32 output, and F32 Matrix Core accumulation. - * BF8(E5M2) inputs, F32 output, and F32 Matrix Core accumulation. + * `FP8`(E4M3) inputs, `F32` output, and `F32` Matrix Core accumulation. + * `BF8`(E5M2) inputs, `F32` output, and `F32` Matrix Core accumulation. * Support for ROC-TX if `HIPSPARSELT_ENABLE_MARKER=1` is set. * Support for the cuSPARSELt v0.6.3 backend. #### Removed - + * Support for LLVM targets gfx940 and gfx941 has been removed. * `hipsparseLtDatatype_t` has been removed. - + #### Optimized * Improved the library loading time. @@ -609,10 +621,118 @@ HIP runtime has the following functional improvements which greatly improve runt * Replaced `hiptensorElementwiseTrinary` with `hiptensorElementwiseTrinaryExecute`. * Removed function `hiptensorReductionGetWorkspaceSize`. +### **llvm-project** (20.0.0) + +#### Added + +* Added compiler support for separate debug file generation for device code. +* Added `llvm-flang`, AMD's next generation Fortran compiler is a re-implementation of the Fortran frontend that can be found at `llvm/llvm-project/flang` on GitHub. +* Added Comgr support for an in-memory virtual file system (VFS) for storing temporary files generated during intermediate compilation steps to improve performance in the device library link step. +* Added compiler support of a new target-specific builtin `__builtin_amdgcn_processor_is` for late or deferred queries of the current target processor, and `__builtin_amdgcn_is_invocable` enabling fine-grained target-specific feature availability. +* Added HIPIFY support for NVIDIA CUDA 12.9.1 APIs. Added support for all new device and host APIs, including FP4, FP6, and FP128, and support for the corresponding ROCm HIP equivalents. 
+ +#### Changed + +* Updated clang/llvm to AMD clang version 20.0.0 (equivalent to LLVM 20.0.0 with additional out-of-tree patches). + +#### Optimized + +* Improved compiler memory load and store instructions. + +#### Upcoming changes + +* `__AMDGCN_WAVEFRONT_SIZE__` macro and HIP’s `warpSize` variable as `constexpr` are deprecated and will be disabled in a future release. Users are encouraged to update their code if needed to ensure future compatibility. For more information, see [AMDGCN_WAVEFRONT_SIZE deprecation](#amdgpu-wavefront-size-compiler-macro-deprecation). +* The `roc-obj-ls` and `roc-obj-extract` tools are deprecated. To extract all Clang offload bundles into separate code objects use `llvm-objdump --offloading `. For more information, see [Changes to ROCm Object Tooling](#changes-to-rocm-object-tooling). + +### **MIGraphX** (2.13.0) + +### Added + +* Support for OCP `FP8` and MX `FP4` data types on AMD Instinct MI350X and MI355X accelerators. +* Support for `BF16` on all hardware. +* Support for PyTorch 2.7 via Torch-MIGraphX. +* Contrib Operators for Microsoft ONNX: Attention, RotaryEmbedding, QuickGelu, BiasAdd, BiasSplitGelu, skipLayerNorm. +* TensorFlow Operator: Sigmoid, AddN. +* GroupQuery Attention for LLM support . +* Added support for edge mode in the ONNX Pad operator. +* Support additional types for linear Resize operator. +* Added bitonic topk ONNX operator. +* Added onnx runtime python driver +* Added FLUX e2e example. +* Added API to save and load arguments. +* Added quantize_bf16 to C api output. +* Added rocMLIR fusion for kv-cache attention. + +### Changed + +* Print Kernel/Module in Compile Failure. +* Use hipblaslt instead of rocBLAS for newer GPU asics. +* Normalize standard input shapes for rocBLAS. +* Updated Stable Diffusion example to use torch 6.3. +* Rewrite 1x1 convolutions to gemm. +* Make version header public. +* Represent `BF16::max` by its encoding, rather than the expected value. 
+* Direct warnings to cout, instead into cerr. +* Use vector instead of `set` for implicit deps. +* Disable layernorm by default. +* Update timing in compile_ops() to use common average + +### Removed + +* DPP for v_add_f64 as it is unsupported. +* rocBLAS bug workaround for solution index. +* ROCM_USE_FLOAT8 macro. +* rocBLAS `FP8`, always use hipBlasLt. +* Call to hipGetMemoryInfo when checking free memory based on feedback from HIP team. + +### Optimized + +* Layout convolution as NHWC or NCHW only +* einsum: conditionally do squeeze before transpose +* Update problem cache as configs are benchmarked +* Enable debug assertions in libstdc++ +* Topologically sort onnx models if nodes are unordered +* Use time_loop function to measure time for exhaustive tune runs +* Slice Channels Conv Optimization (slice output fusion) +* Horiz fuse after pointwise +* GridSample Linear Sampler Refactor +* find_splits::is_dependent refactor +* Visually improved the output from Graphviz +* Print MigraphX consumed Env Variables when using the migraphx-driver +* Add timestamps and duration when printing the summary of migraphx-driver +* Add a trim size flag to the verify option for migraphx-driver +* Print node names, to track parsing within the onnx graph when using the MIGRAPHX_TRACE_ONNX_PARSER flag +* Embed onnx/tf files for api tests +* Fuse multiple outputs for pointwise ops +* Fuse reshapes on pointwise inputs for mlir output fusion +* Print MIGRAPHX ENV Variables at end of summary +* Update accuracy checker to spit out test data with --show-test-data flag +* Dont fold mul with gemm when the gemm is used more than once +* Detect when parallel stl is not parallel and enable when it is in parallel +* Dont fuse broadcast after conv/gemm in mlir +* Avoid the fusion (in reduction) when operator data-types mismatch + +### Resolved issues + +* Workaround ICE in clang 20 when using views::transform. +* Fix bug with reshape_lazy in MLIR. +* Quantizelinear nearbyint fix. 
+* Add case for empty strings in node inputs for ops like resize. +* Parse resize fix: only check "keep_aspect_ratio_policy" attribute for sizes input. +* Fix Layernorm and SimplifiedLayernorm onnx parsers. +* nonmaxsuppression: identical boxes/scores not ordered correctly. +* Gcc/G++ compilation fix. +* Bug fix: events would get created on the wrong device in a multi-gpu scenario. +* Check for file-write errors. +* Fix out of order keys in value for comparisons and hashes when caching best kernels. +* Make checking env variables thread-safe again. +* [controlnet] Fixed mul: Types do not match. +* Fix check for scales if presenting roi in Resize op. + ### **MIOpen** (3.5.0) #### Added - + * [Conv] Added misa kernels for gfx950. * [Conv] Enabled Split-K support for CK backward data solvers (2D). * [Conv] Enabled CK wrw solver on gfx950 for the `BF16` data type. @@ -645,10 +765,13 @@ HIP runtime has the following functional improvements which greatly improve runt ### **MIVisionX** (3.3.0) +#### Added + +* Support to enable/disable BatchPD code in VX_RPP extensions by checking the RPP_LEGACY_SUPPORT flag. + #### Changed -* VX_RPP extension : Version 3.1.0 release -* Add support to enable/disable BatchPD code in VX_RPP extensions by checking the RPP_LEGACY_SUPPORT flag. +* VX_RPP extension : Version 3.1.0 release. * Update the parameters and kernel API of Blur, Fog, Jitter, LensCorrection, Rain, Pixelate, Vignette and ResizeCrop wrt tensor kernels replacing the legacy BatchPD API calls in VX_RPP extensions. #### Known issues @@ -687,7 +810,7 @@ HIP runtime has the following functional improvements which greatly improve runt * Resolved an issue when using more than 64 channels when multiple collectives are used in the same `ncclGroup()` call. * Fixed unit test failures in tests ending with the `ManagedMem` and `ManagedMemGraph` suffixes. * Fixed a suboptimal algorithmic switching point for AllReduce on the AMD Instinct MI300X. 
-* Fixed the known issue "When splitting a communicator using `ncclCommSplit` in some GPU configurations, MSCCL initialization can cause a segmentation fault." with a design change to use `comm` instead of `rank` for `mscclStatus`. The Global map for `comm` to `mscclStatus` is still not thread safe but should be explicitly handled by mutexes for read-write operations. This is tested for correctness, but there is a plan to use a thread-safe map data structure in an upcoming release. +* Fixed the known issue "When splitting a communicator using `ncclCommSplit` in some GPU configurations, MSCCL initialization can cause a segmentation fault" with a design change to use `comm` instead of `rank` for `mscclStatus`. The Global map for `comm` to `mscclStatus` is still not thread safe but should be explicitly handled by mutexes for read-write operations. This is tested for correctness, but there is a plan to use a thread-safe map data structure in an upcoming release. ### **rocAL** (2.3.0) @@ -762,7 +885,7 @@ HIP runtime has the following functional improvements which greatly improve runt * `rocblas_gemm_ex3`, `rocblas_gemm_batched_ex3`, and `rocblas_gemm_strided_batched_ex3` API functions. #### Resolved issues - + * Fixed environment variable path-based logging to append multiple handle outputs to the same file. * Support numerics when `trsm` is running with `rocblas_status_perf_degraded`. * Fixed the build dependency installation of `joblib` on some operating systems. @@ -775,6 +898,14 @@ HIP runtime has the following functional improvements which greatly improve runt * Deprecated the use of negative indices to indicate the default solution is being used for `gemm_ex` with `rocblas_gemm_algo_solution_index`. +### **ROCdbgapi** (0.77.3) + +#### Added +- Support for the `gfx950`, `gfx1150`, and `gfx1151` architectures. + +#### Removed +- Support for the `gfx940` and `gfx941` architectures. 
+ ### **rocDecode** (1.0.0) #### Added @@ -784,23 +915,26 @@ HIP runtime has the following functional improvements which greatly improve runt * HEVC/AVC/AV1/VP9 stream syntax error handling. * HEVC stream bit depth change handling and DPB buffer size change handling through decoder reconfiguration. * AVC stream DPB buffer size change handling through decoder reconfiguration. -* rocdecode now uses the Cmake CMAKE_PREFIX_PATH directive. -* rocdecode - A new avcodec-based decoder built as a separate ``rocdecode-host`` library. +* A new avcodec-based decoder built as a separate `rocdecode-host` library + +#### Changed + +* rocDecode now uses the CMake `CMAKE_PREFIX_PATH` directive. #### Optimized -* Decode session start latency reduction. +* Decode session starts latency reduction. * Bitstream type detection optimization in bitstream reader. #### Resolved issues -* Fixed a bug in picture files sample ``videoDecodePicFiles`` that can results in incorrect output frame count. +* Fixed a bug in the `videoDecodePicFiles` picture files sample that can result in incorrect output frame count. * Fixed a decoded frame output issue in video size change cases. -* Removed incorrect asserts of bitdepth_minus_8 in GetBitDepth() and num_chroma_planes in GetNumChromaPlanes() API calls in RocVideoDecoder utility class. +* Removed incorrect asserts of `bitdepth_minus_8` in `GetBitDepth()` and `num_chroma_planes` in `GetNumChromaPlanes()` API calls in the RocVideoDecoder utility class. #### Removed -* GetStream() interface call from RocVideoDecoder utility class. +* `GetStream()` interface call from RocVideoDecoder utility class. #### Changed @@ -830,7 +964,7 @@ HIP runtime has the following functional improvements which greatly improve runt - 8192 * Implemented single-kernel plans for some large 1D problem sizes, on devices with at least 160KiB of LDS.
-#### Resolved isues +#### Resolved issues * Fixed kernel faults on multi-device transforms that gather to a single device, when the input/output bricks are not contiguous. @@ -854,6 +988,32 @@ HIP runtime has the following functional improvements which greatly improve runt * Resolved an issue with resizing the internal memory pool by utilizing the explicit constructor of the vector's type during the resizing process. * Addressed and resolved CMake configuration warnings. +### **ROCm Bandwidth Test** (2.6.0) + +### Added + +* Plugin architecture: + * `rocm_bandwidth_test` is now the **framework** for individual `plugins` and features. The `framework` is available at: `/opt/rocm/bin/` + + * Individual `plugins`: The **plugins (shared libraries)** are available at: `/opt/rocm/lib/rocm_bandwidth_test/plugins/` + +```{note} +Review the [README](https://github.com/ROCm/rocm_bandwidth_test/blob/release/rocm-rel-7.0/README.md) file for details about the new options and outputs. +``` + +### Changed + +* The `CLI` and options/parameters have changed due to the new plugin architecture, where the plugin parameters are parsed by the plugin. + +### Removed + +- The old CLI, parameters, and switches used. + +### Known Issues + +- MI350: Crashes due to HIP gfx support. + + ### **ROCm SMI** (7.8.0) #### Added @@ -928,7 +1088,7 @@ See the full [ROCm SMI changelog](https://github.com/ROCm/rocm_smi_lib/blob/rele * Roofline support for RHEL 10 OS. -* FP4 and FP6 data types have been added for roofline profiling on AMD Instinct MI350 series. +* `FP4` and `FP6` data types have been added for roofline profiling on AMD Instinct MI350 series. ##### rocprofv3 support @@ -974,7 +1134,7 @@ See the full [ROCm SMI changelog](https://github.com/ROCm/rocm_smi_lib/blob/rele * Fixed kernel name and kernel dispatch filtering when using ``rocprofv3``. * Fixed an issue of TCC channel counters collection in ``rocprofv3``. -* Fixed peak FLOPS of F8, I8, F16, and BF16 on AMD Instinct MI 300. 
+* Fixed peak FLOPS of `F8`, `I8`, `F16`, and `BF16` on AMD Instinct MI 300. * Fixed not detecting memory clock issue when using amd-smi * Fixed standalone GUI crashing * Fixed L2 read/write/atomic bandwidths on MI350 @@ -1004,6 +1164,17 @@ See the full [ROCm SMI changelog](https://github.com/ROCm/rocm_smi_lib/blob/rele * MongoDB database support will be removed, and a deprecation warning has been added to the application interface. * Usage of ``rocm-smi`` is deprecated in favor of ``amd-smi``, and a deprecation warning has been added to the application interface. +### **ROCgdb** (16.3) + +#### Added + +- Support for the `gfx950`, `gfx1150`, and `gfx1151` architectures. + +#### Removed + +- Support for the `gfx940` and `gfx941` architectures. + + ### **ROCm Data Center Tool** (1.1.0) #### Added @@ -1059,14 +1230,15 @@ See the full [ROCm SMI changelog](https://github.com/ROCm/rocm_smi_lib/blob/rele #### Changed - Migrated SMI API usage from `rocm-smi` to `amd-smi`. -- Updated FP8 GEMM operations to use hipBLASLt instead of rocBLAS. +- Updated `FP8` GEMM operations to use hipBLASLt instead of rocBLAS. ### **rocPRIM** (4.0.0) #### Added +* Added gfx950 support. * Added `rocprim::accumulator_t` to ensure parity with CCCL. -* Added test for `rocprim::accumulator_t` +* Added test for `rocprim::accumulator_t`. * Added `rocprim::invoke_result_r` to ensure parity with CCCL. * Added function `is_build_in` into `rocprim::traits::get`. * Added virtual shared memory as a fallback option in `rocprim::device_merge` when it exceeds shared memory capacity, similar to `rocprim::device_select`, `rocprim::device_partition`, and `rocprim::device_merge_sort`, which already include this feature. @@ -1087,30 +1259,29 @@ See the full [ROCm SMI changelog](https://github.com/ROCm/rocm_smi_lib/blob/rele #### Optimized -* Improved performance of `rocprim::device_select` and `rocprim::device_partition` when using multiple streams on the MI3XX architecture. 
+* Improved performance of `rocprim::device_select` and `rocprim::device_partition` when using multiple streams on the AMD Instinct MI300 series. #### Changed * Changed the parameters `long_radix_bits` and `LongRadixBits` from `segmented_radix_sort` to `radix_bits` and `RadixBits` respectively. -* Marked the initialisation constructor of `rocprim::reverse_iterator<Iter>` `explicit`, use `rocprim::make_reverse_iterator`. +* Marked the initialisation constructor of `rocprim::reverse_iterator` `explicit`, use `rocprim::make_reverse_iterator`. * Merged `radix_key_codec` into type_traits system. * Renamed `type_traits_interface.hpp` to `type_traits.hpp`, rename the original `type_traits.hpp` to `type_traits_functions.hpp`. * The default scan accumulator types for device-level scan algorithms have changed. This is a breaking change. -The previous default accumulator types could lead to situations in which unexpected overflow occured, such as -when the input or inital type was smaller than the output type. 
- * This is a complete list of affected functions and how their default accumulator types are changing: - * `rocprim::inclusive_scan` - * Previous default: `class AccType = typename std::iterator_traits<InputIterator>::value_type>` - * Current default: `class AccType = rocprim::accumulator_t<BinaryFunction, typename std::iterator_traits<InputIterator>::value_type>` - * `rocprim::deterministic_inclusive_scan` - * Previous default: `class AccType = typename std::iterator_traits<InputIterator>::value_type>` - * Current default: `class AccType = rocprim::accumulator_t<BinaryFunction, typename std::iterator_traits<InputIterator>::value_type>` - * `rocprim::exclusive_scan` - * Previous default: `class AccType = detail::input_type_t<InitValueType>>` - * Current default: `class AccType = rocprim::accumulator_t<BinaryFunction, rocprim::detail::input_type_t<InitValueType>>` - * `rocprim::deterministic_exclusive_scan` - * Previous default: `class AccType = detail::input_type_t<InitValueType>>` - * Current default: `class AccType = rocprim::accumulator_t<BinaryFunction, rocprim::detail::input_type_t<InitValueType>>` +The previous default accumulator types could lead to situations in which unexpected overflow occurred, such as when the input or initial type was smaller than the output type.
This is a complete list of affected functions and how their default accumulator types are changing: + + * `rocprim::inclusive_scan` + * Previous default: `class AccType = typename std::iterator_traits<InputIterator>::value_type>` + * Current default: `class AccType = rocprim::accumulator_t<BinaryFunction, typename std::iterator_traits<InputIterator>::value_type>` + * `rocprim::deterministic_inclusive_scan` + * Previous default: `class AccType = typename std::iterator_traits<InputIterator>::value_type>` + * Current default: `class AccType = rocprim::accumulator_t<BinaryFunction, typename std::iterator_traits<InputIterator>::value_type>` + * `rocprim::exclusive_scan` + * Previous default: `class AccType = detail::input_type_t<InitValueType>>` + * Current default: `class AccType = rocprim::accumulator_t<BinaryFunction, rocprim::detail::input_type_t<InitValueType>>` + * `rocprim::deterministic_exclusive_scan` + * Previous default: `class AccType = detail::input_type_t<InitValueType>>` + * Current default: `class AccType = rocprim::accumulator_t<BinaryFunction, rocprim::detail::input_type_t<InitValueType>>` * Undeprecated internal `detail::raw_storage`. * A new version of `rocprim::thread_load` and `rocprim::thread_store` replace the deprecated `rocprim::thread_load` and `rocprim::thread_store` functions. The versions avoid inline assembly where possible, and don't hinder the optimizer as much as a result. * Renamed `rocprim::load_cs` to `rocprim::load_nontemporal` and `rocprim::store_cs` to `rocprim::store_nontemporal` to express the intent of these load and store methods better. @@ -1118,23 +1289,23 @@ when the input or inital type was smaller than the output type. #### Upcoming changes -* `rocprim::invoke_result_binary_op` and `rocprim::invoke_result_binary_op_t` are deprecated. Use `rocprim::accumulator_t` now. +* `rocprim::invoke_result_binary_op` and `rocprim::invoke_result_binary_op_t` are deprecated. Use `rocprim::accumulator_t` instead. #### Removed * Removed `rocprim::detail::float_bit_mask` and relative tests, use `rocprim::traits::float_bit_mask` instead. -* Removed `rocprim::traits::is_fundamental`, please use `rocprim::traits::get<T>::is_fundamental()` directly.
+* Removed `rocprim::traits::is_fundamental`, use `rocprim::traits::get<T>::is_fundamental()` directly. * Removed the deprecated parameters `short_radix_bits` and `ShortRadixBits` from the `segmented_radix_sort` config. They were unused, it is only an API change. -* Removed the deprecated `operator<<` from the iterators. +* Removed the deprecated `operator<<` from the iterators. * Removed the deprecated `TwiddleIn` and `TwiddleOut`. Use `radix_key_codec` instead. * Removed the deprecated flags API of `block_adjacent_difference`. Use `subtract_left()` or `block_discontinuity::flag_heads()` instead. * Removed the deprecated `to_exclusive` functions in the warp scans. * Removed the `rocprim::load_cs` from the `cache_load_modifier` enum. Use `rocprim::load_nontemporal` instead. * Removed the `rocprim::store_cs` from the `cache_store_modifier` enum. Use `rocprim::store_nontemporal` instead. -* Removed the deprecated header file `rocprim/detail/match_result_type.hpp`. Include `rocprim/type_traits.hpp` instead. - * This header included `rocprim::detail::invoke_result`. Use `rocprim::invoke_result` instead. - * This header included `rocprim::detail::invoke_result_binary_op`. Use `rocprim::invoke_result_binary_op` instead. - * This header included `rocprim::detail::match_result_type`. Use `rocprim::invoke_result_binary_op_t` instead. +* Removed the deprecated header file `rocprim/detail/match_result_type.hpp`. Include `rocprim/type_traits.hpp` instead. This header included: + * `rocprim::detail::invoke_result`. Use `rocprim::invoke_result` instead. + * `rocprim::detail::invoke_result_binary_op`. Use `rocprim::invoke_result_binary_op` instead. + * `rocprim::detail::match_result_type`. Use `rocprim::invoke_result_binary_op_t` instead. * Removed the deprecated `rocprim::detail::radix_key_codec` function. Use `rocprim::radix_key_codec` instead. * Removed `rocprim/detail/radix_sort.hpp`, functionality can now be found in `rocprim/thread/radix_key_codec.hpp`.
* Removed C++14 support, only C++17 is supported. @@ -1156,13 +1327,12 @@ when the input or inital type was smaller than the output type. * Fixed an issue where `device_segmented_reduce` reported autotuning throughput being 5x lower than it was in reality. * Fixed device radix sort not returning the correct required temporary storage when a double buffer contains `nullptr`. * Fixed constness of equality operators (`==` and `!=`) in `rocprim::key_value_pair`. -* Fixed an issue for the comparison operators in `arg_index_iterator` and `texture_cache_iterator`, where `<` and `>` comparators were swapped. +* Fixed an issue for the comparison operators in `arg_index_iterator` and `texture_cache_iterator`, where `<` and `>` comparators were swapped. * Fixed an issue for the `rocprim::thread_reduce` not working correctly with a prefix value. #### Known issues -* When using `rocprim::deterministic_inclusive_scan_by_key` and `rocprim::deterministic_exclusive_scan_by_key` the intermediate values can change order on Navi3x - * However if a commutative scan operator is used then the final scan value (output array) will still always be consistent between runs +* When using `rocprim::deterministic_inclusive_scan_by_key` and `rocprim::deterministic_exclusive_scan_by_key` the intermediate values can change order on Navi3x. However, if a commutative scan operator is used then the final scan value (output array) will still always be consistent between runs. ### **ROCprofiler-SDK** (1.0.0) @@ -1199,6 +1369,7 @@ when the input or inital type was smaller than the output type. - Perfetto support for scratch memory. - Support in the `rocprofv3` avail tool for command-line arguments. - Documentation for `rocprofv3` advanced options. +- AQLprofile is now available as open source. ### Changed @@ -1267,11 +1438,11 @@ when the input or inital type was smaller than the output type. * `rocrand_h_scrambled_sobol32_direction_vectors`, use `rocrand_get_direction_vectors32` instead.
* `rocrand_h_scrambled_sobol64_direction_vectors`, use `rocrand_get_direction_vectors64` instead. -#### Resolved isues +#### Resolved issues * Fixed an issue where `mt19937.hpp` would cause kernel errors during auto tuning. -#### Upcoming canges +#### Upcoming changes * Deprecated the rocRAND Fortran API in favor of hipfort. @@ -1312,7 +1483,7 @@ when the input or inital type was smaller than the output type. * Improved the performance of BDSQR and downstream functions, such as GESVD. * Improved the performance of STEQR and downstream functions, such as SYEV/HEEV. * Improved the performance of LARFT and downstream functions, such as GEQR2 and GEQRF. - + #### Resolved issues * Fixed corner cases that can produce NaNs in SYEVD for valid input matrices. @@ -1353,7 +1524,7 @@ when the input or inital type was smaller than the output type. * Improved the user documentation. #### Resolved issues - + * Fixed an issue in the public headers where `extern "C"` was not wrapped by `#ifdef __cplusplus`, which caused failures when building C programs with rocSPARSE. * Fixed a memory access fault in the `rocsparse_Xbsrilu0` routines. * Fixed failures that could occur in `rocsparse_Xbsrsm_solve` or `rocsparse_spsm` with BSR format when using host pointer mode. @@ -1372,10 +1543,8 @@ when the input or inital type was smaller than the output type. #### Changed * Updated the required version of Google Benchmark from 1.8.0 to 1.9.0. -* Drop `c++14` support for rocthrust. -* Renamed `cpp14_required.h` to `cpp_version_check.h` -* Refactored `test_header.hpp` into separte modules `test_param_fixtures.hpp`, `test_real_assertions.hpp`, `test_imag_assertions.hpp`, and `test_utils.hpp`. - * This is done to prevent unit tests from having access to modules that they're not testing. This will improve the accuracy of code coverage reports. +* Renamed `cpp14_required.h` to `cpp_version_check.h`. 
+* Refactored `test_header.hpp` into `test_param_fixtures.hpp`, `test_real_assertions.hpp`, `test_imag_assertions.hpp`, and `test_utils.hpp`. This is done to prevent unit tests from having access to modules that they're not testing. This will improve the accuracy of code coverage reports. #### Added @@ -1390,7 +1559,7 @@ when the input or inital type was smaller than the output type. #### Removed * `device_malloc_allocator.h` has been removed. This header file was unused and should not impact users. -* Removed C++14 support, only C++17 is supported. +* Removed C++14 support. Only C++17 is now supported. * `test_header.hpp` has been removed. The `HIP_CHECK` function, as well as the `test` and `inter_run_bwr` namespaces, have been moved to `test_utils.hpp`. * `test_assertions.hpp` has been split into `test_real_assertions.hpp` and `test_imag_assertions.hpp`. @@ -1404,7 +1573,7 @@ when the input or inital type was smaller than the output type. #### Known issues -* The order of the values being compared by thrust::exclusive_scan_by_key and thrust::inclusive_scan_by_key can change between runs when integers are being compared. This can cause incorrect output when a non-commutative operator such as division is being used. +* The order of the values being compared by `thrust::exclusive_scan_by_key` and `thrust::inclusive_scan_by_key` can change between runs when integers are being compared. This can cause incorrect output when a non-commutative operator such as division is being used. ### **rocWMMA** (2.0.0) @@ -1437,7 +1606,7 @@ when the input or inital type was smaller than the output type. 
#### Resolved issues -* Fixed a validation issue for small precision compute types `< B32` on gfx9 +* Fixed a validation issue for small precision compute types `< B32` on gfx9 * Fixed CMake validation of compiler support for `BF8` / `FP8` types * Fixed linkage of rocwmma::synchronize_workgroup to inline @@ -1445,30 +1614,43 @@ when the input or inital type was smaller than the output type. #### Added -* Bitwise NOT, Bitwise AND, Bitwise OR augmentations on HOST (CPU) and HIP backends. +* Bitwise NOT, Bitwise AND, and Bitwise OR augmentations on HOST (CPU) and HIP backends. * Tensor Concat augmentation on HOST (CPU) and HIP backends. -* JPEG Compression Distortion augmentation on HIP backend. +* JPEG Compression Distortion augmentation on HIP backend. * `log1p`, defined as `log (1 + x)`, tensor augmentation support on HOST (CPU) and HIP backends. -* JPEG Compression Distortion augmentation on HOST (CPU) backend. +* JPEG Compression Distortion augmentation on HOST (CPU) backend. #### Changed -* All handle creation and destruction APIs have been consolidated to `rppCreate()`, for handle initialization, and `rppDestroy()`, for handle destruction. -* RPP function category `logical_operations` more appropriately renamed to `bitwise_operations`. -* TurboJPEG package installation enabled for RPP Test Suite with `sudo apt-get install libturbojpeg0-dev`. Instructions updated in utilities/test_suite/README.md. (#518) -* Changed API of swap_channels augmentation to be called channel_permute, which now accepts one new argument, `permutationTensor` (pointer to a unsigned int tensor) that provides the permutation order to swap the RGB channels of each input image in the batch in any order. - * Old API - `RppStatus rppt_swap_channels_host(RppPtr_t srcPtr, RpptDescPtr srcDescPtr, RppPtr_t dstPtr, RpptDescPtr dstDescPtr, rppHandle_t rppHandle);`. 
- * New API - `RppStatus rppt_channel_permute_host(RppPtr_t srcPtr, RpptDescPtr srcDescPtr, RppPtr_t dstPtr, RpptDescPtr dstDescPtr, Rpp32u *permutationTensor , rppHandle_t rppHandle);`. +* Handle creation and destruction APIs have been consolidated. Use `rppCreate()` for handle initialization and `rppDestroy()` for handle destruction. +* The `logical_operations` function category has been renamed to `bitwise_operations`. +* TurboJPEG package installation enabled for RPP Test Suite with `sudo apt-get install libturbojpeg0-dev`. Instructions have been updated in utilities/test_suite/README.md. +* The `swap_channels` augmentation has been changed to `channel_permute`. `channel_permute` now also accepts a new argument, `permutationTensor` (pointer to an unsigned int tensor) that provides the permutation order to swap the RGB channels of each input image in the batch in any order: + + `RppStatus rppt_swap_channels_host(RppPtr_t srcPtr, RpptDescPtr srcDescPtr, RppPtr_t dstPtr, RpptDescPtr dstDescPtr, rppHandle_t rppHandle);` + + changed to: + + `RppStatus rppt_channel_permute_host(RppPtr_t srcPtr, RpptDescPtr srcDescPtr, RppPtr_t dstPtr, RpptDescPtr dstDescPtr, Rpp32u *permutationTensor , rppHandle_t rppHandle);` #### Removed -* Older versions of RPP handle creation inlcuding `rppCreateWithBatchSize()`, `rppCreateWithStream()`, and `rppCreateWithStreamAndBatchSize()` are now removed and replaced with `rppCreate()`. -* Older versions of RPP handle destruction API including `rppDestroyGPU()` and `rppDestroyHost()` are now removed and replaced with `rppDestroy()`. +* Older versions of RPP handle creation including `rppCreateWithBatchSize()`, `rppCreateWithStream()`, and `rppCreateWithStreamAndBatchSize()`. These have been replaced with `rppCreate()`. +* Older versions of RPP handle destruction API including `rppDestroyGPU()` and `rppDestroyHost()`. These have been replaced with `rppDestroy()`.
#### Resolved issues * Test package - debian packages will install required dependencies. +### **ROCr Runtime** (1.18.0) + +#### Added + +* New API `hsa_amd_memory_get_preferred_copy_engine` to get the preferred copy engine that can be used when calling `hsa_amd_memory_async_copy_on_engine`. +* New API `hsa_amd_portable_export_dmabuf_v2`, an extension of the existing `hsa_amd_portable_export_dmabuf` API that supports a new flags parameter. This allows specifying the new `HSA_AMD_DMABUF_MAPPING_TYPE_PCIE` flag when exporting dma-bufs. +* New flag `HSA_AMD_VMEM_ADDRESS_NO_REGISTER` for use when calling the `hsa_amd_vmem_address_reserve` API. This allows virtual address range reservations for SVM allocations to be tracked when running in ASAN mode. +* New sub query `HSA_AMD_AGENT_INFO_CLOCK_COUNTERS` returns a snapshot of the underlying driver's clock counters that can be used for profiling. + ### **Tensile** (4.44.0) #### Added diff --git a/RELEASE.md b/RELEASE.md index b03c139b3..8398914a6 100644 --- a/RELEASE.md +++ b/RELEASE.md @@ -16,7 +16,7 @@ The release notes provide a summary of notable changes since the previous ROCm r - [Release highlights](#release-highlights) -- [Operating system and hardware support changes](#operating-system-and-hardware-support-changes) +- [Operating system, hardware, and virtualization support changes](#operating-system-hardware-and-virtualization-support-changes) - [ROCm components versioning](#rocm-components) @@ -40,77 +40,109 @@ The following are notable new features and improvements in ROCm 7.0.0. For chang ### HIP API compatibility improvements -HIP API 7.0 introduces changes to make it align more closely with NVIDIA CUDA. These change are incompatible with prior releases, -and might require recompiling existing HIP applications for use in the ROCm 7.0 release. For more information, see the [HIP API 7.0 changes](../hip-7-changes) and the [HIP changelog](#hip-7-0-0) below.
+To improve code portability between AMD ROCm and other programming models, the HIP API has been updated in ROCm 7.0 to simplify cross-platform programming. These changes are incompatible with prior ROCm releases and might require recompiling existing HIP applications for use with ROCm 7.0. For more information, see the [HIP API 7.0 changes](https://rocm.docs.amd.com/projects/HIP/en/docs-develop/hip-7-changes.html) and the [HIP changelog](#hip-7-0-0) below. -### New machine learning programming language for AMD accelerators +### HIP runtime updates -Wave is a high-performance domain-specific Python programming language (DSL) designed to accelerate the development and optimization of machine learning kernels on AMD GPUs. It introduces a subgroup-level (wave) programming model that deliberately separates the mathematical formulation of computation from subgroup and thread distribution strategies. ROCm 7.0 supports the library on AMD Instinct MI300 and MI350 series accelerators. Wave is now integrated into SGLang, also enabling a broader user base. For more information, see its [GitHub repository](https://github.com/iree-org/wave). +The HIP runtime now includes support for: -```{note} -Wave for ROCm is in an early access state. Running production workloads is not recommended. -``` +* Open Compute Project (OCP) MX floating-point `FP4`, `FP6`, and `FP8` data types and APIs. +* Improved logging by adding more precise pointer information and launch arguments for better tracking and debugging in dispatch methods. +* `constexpr` operators for `FP16` and `BF16`. +* `__syncwarp` operation. +* The `_sync()` versions of cross-lane builtins, such as `__shfl_sync()` and `__reduce_add_sync()`, are enabled by default. These can be disabled by setting the preprocessor macro `HIP_DISABLE_WARP_SYNC_BUILTINS`. + +In addition, the HIP runtime includes improvements to functionality, runtime performance, and user experience.
For more information, see [HIP changelog](#hip-7-0-0) below. ### Instinct Driver/ROCm packaging separation The Instinct Driver is now distributed separately from the ROCm software stack and is stored under in its own location ``/amdgpu/`` in the package repository at [repo.radeon.com](https://repo.radeon.com/amdgpu/). The first release is designated as Instinct Driver version 30.10. See [ROCm Gets Modular: Meet the Instinct Datacenter GPU Driver](https://rocm.blogs.amd.com/ecosystems-and-partners/instinct-gpu-driver/README.html) for more information. -Forward and backward compatibility between the Instinct Driver and ROCm is not supported in the Beta release. See the [installation instructions](https://rocm.docs.amd.com/en/docs-7.0-beta/preview/install/index.html). +### Deep learning and AI framework support improvements -### Deep learning framework support improvements +ROCm 7.0 introduces several newly supported versions of deep learning and AI frameworks. For more information, see [Installing Deep learning frameworks for ROCm](https://rocm.docs.amd.com/en/latest/how-to/deep-learning-rocm.html). -ROCm 7.0 supports PyTorch 2.7, TensorFlow 2.19, and Triton 3.3.0. +See the [Compatibility +matrix](../../docs/compatibility/compatibility-matrix.rst) +for the complete list of deep learning and AI framework versions tested for compatibility with ROCm. -### ROCprofiler-SDK and rocprofv3 improvements +#### PyTorch -#### rocpd +ROCm 7.0 enables the following PyTorch features: -Support has been added for the ROCm Profiling Data (rocpd) output format, which is now the default format for ``rocprofv3``. A subproject of the ROCprofiler-SDK, rocpd enables saving profiling results to a SQLite3 database, providing a structured and efficient foundation for analysis and post-processing. +* Support for PyTorch 2.7. +* Integrated Fused Rope kernels in APEX. +* Compilation of Python C++ extensions using ``amdclang++``.
+* Support for channels-last NHWC format for convolutions via MIOpen. -#### Core SDK enhancements +#### JAX -* ROCprofiler-SDK is now compatible with the HIP 7.0 API. -* Added stochastic and host-trap PC sampling support for all MI300 series accelerators. -* Added support for tracing KFD events. +ROCm 7.0 enables support for JAX 0.6.0. -#### rocprofv3 CLI tool enhancements +#### Megatron-LM -* Added stochastic and host-trap PC sampling support for all MI300 series accelerators. -* HIP streams translate to Queues in Time Traces in Perfetto output. +Megatron-LM for ROCm now supports: -For more information about ROCprofiler-SDK changes, see the [detailed component changelog](#rocprofiler-sdk-1-0-0) below +* Fused Gradient Accumulation via APEX. -### Compilers changes and improvements +* Fused Rope Kernel in APEX. -ROCm 7.0 introduces the AMD Next-Gen Fortran compiler. ``llvm-flang`` (sometimes called new-flang or flang-18) is a re-implementation of the Fortran frontend. It is a strategic replacement for classic-flang and is developed in LLVM’s upstream repo at [llvm/llvm-project](https://github.com/llvm/llvm-project/tree/main/flang). +* Fused_bias_swiglu kernel. -Key enhancements include: +For more information, see [Training a model with Megatron-LM for ROCm](https://rocm.docs.amd.com/en/latest/how-to/rocm-for-ai/training/benchmark-docker/megatron-lm.html?model=pyt_megatron_lm_train_llama-3.3-70b). + +#### TensorFlow + +ROCm 7.0 enables support for TensorFlow 2.19.1. + +#### vLLM + +* Support for Open Compute Project (OCP) `FP8` data type. +* `FP4` precision for Llama 3.1 405B. + +#### Triton + +ROCm 7.0 enables support for Triton 3.3.0. + +### Compiler changes and improvements + +ROCm 7.0 introduces the AMD Next-Gen Fortran compiler. ``llvm-flang`` (sometimes called ``new-flang`` or ``flang-18``) is a re-implementation of the Fortran frontend.
It is a strategic replacement for ``classic-flang`` and is developed in LLVM’s upstream repo at [llvm/llvm-project](https://github.com/llvm/llvm-project/tree/main/flang). + +Key compiler enhancements include: * Compiler: * Improved memory load and store instructions. * Updated clang/llvm to AMD clang version 20.0.0git (equivalent to LLVM 20.0.0 with additional out-of-tree patches). * Support added for separate debug file generation for device code. - + * `llvm-strip` now supports AMD GPU device code objects (EM_AMDGPU). * Comgr: * Added support for an in-memory virtual file system (VFS) for storing temporary files generated during intermediate compilation steps. This is designed to improve performance by reducing on-disk file I/O. Currently, VFS is supported only for the device library link step, with plans for expanded support in future releases. - * SPIR-V: * Improved [target-specific extensions](https://github.com/ROCm/llvm-project/blob/c2535466c6e40acd5ecf6ba1676a4e069c6245cc/clang/docs/LanguageExtensions.rst): * Added a new target-specific builtin ``__builtin_amdgcn_processor_is`` for late or deferred queries of the current target processor. * Added a new target-specific builtin ``__builtin_amdgcn_is_invocable``, enabling fine-grained, per-builtin feature availability. +* The compiler driver now uses parallel code generation by default when compiling using full LTO (including when using the `-fgpu-rdc` option) for HIP. This divides the optimized LLVM IR module into roughly equal partitions before instruction selection and lowering, which can help improve build times. + + Each kernel in the linked LTO module can be put in a separate partition, and any non-inlined function it depends on may be copied alongside it. Thus, while parallel code generation can improve build time, it can duplicate non-inlined, non-kernel functions across multiple partitions, potentially increasing the binary size of the final object file. 
+ + * Compiler option `-flto-partitions=`: + + Equivalent to the `--lto-partitions=` LLD option. Controls the number of partitions used for parallel code generation when using full LTO (including when using `-fgpu-rdc`). The number of partitions must be greater than 0, and a value of 1 disables the feature. The default value is 8. + + Developers are encouraged to experiment with different numbers of partitions using the `-flto-partitions` Clang command line option. Recommended values are 1 to 16 partitions, with especially large projects containing many kernels potentially benefiting from up to 64 partitions. It is not recommended to use a value greater than the number of threads on the machine. Smaller projects, or those containing only a few kernels, might not benefit at all from partitioning and might even experience a slight increase in build time due to the small overhead of analyzing and partitioning the modules. + * HIPIFY now supports NVIDIA CUDA 12.9.1 APIs: * Added support for all new device and host APIs, including `FP4`, `FP6`, and `FP128`– including support for the corresponding ROCm HIP equivalents. -Deprecated features: +* The HIPCC Perl scripts (`hipcc.pl` and `hipconfig.pl`) have been removed in this release. -* ROCm components no longer use the ``__AMDGCN_WAVEFRONT_SIZE`` and ``__AMDGCN_WAVEFRONT_SIZE__`` macros nor HIP’s ``warpSize`` variable as ``constexpr``. These macros and reliance on ``warpSize`` as a ``constexpr`` are deprecated and will be disabled in a future release. Users are encouraged to update their code if needed to ensure future compatibility. - -### Libraries changes and improvements +### Library changes and improvements #### New data type support -MX-compliant data types bring microscaling support to ROCm. For more information, see the [OCP Microscaling (MX) Formats Specification](https://www.opencompute.org/documents/ocp-microscaling-formats-mx-v1-0-spec-final-pdf). 
The ROCm 7.0 Alpha enables functional support for MX data types `FP4`, `FP6`, and `FP8` on MI355X systems in these ROCm libraries: -* Composable Kernel (`FP4` and `FP8` only) +MX-compliant data types bring microscaling support to ROCm. For more information, see the [OCP Microscaling (MX) Formats Specification](https://www.opencompute.org/documents/ocp-microscaling-formats-mx-v1-0-spec-final-pdf). ROCm 7.0 enables functional support for MX data types `FP4`, `FP6`, and `FP8` on AMD MI355X systems in these ROCm libraries: + +* Composable Kernel (`FP4`, `FP6`, and `FP8` only) * hipBLASLt * MIGraphX (`FP4` only) @@ -124,52 +156,94 @@ The following libraries are updated to support the Open Compute Project (OCP) fl MIGraphX now also supports `BF16`. -#### RCCL support +For more information about data types, see [Data types and precision support](https://rocm.docs.amd.com/en/latest/reference/precision-support.html). -RCCL is supported for single-node functional usage only. Multi-node communication capabilities will be supported in future preview releases. +#### hipBLASLt improvement + +GEMM performance has been improved for `FP8`, `FP16`, `BF16`, and `FP32` data types. + +For more information about hipBLASLt changes, see the [hipBLASLt changelog](#hipblaslt-1-0-0) below. #### MIGraphX support -* Support for OCP `FP8` and MX `FP4` data types on MI355X +* Support for OCP `FP8` and MX `FP4` data types on AMD Instinct MI350X and MI355X accelerators. * Support for `BF16` on all hardware * Support for PyTorch 2.7 via Torch-MIGraphX -### Tools changes and improvements +For more information about MIGraphX changes, see the [MIGraphX changelog](#migraphx-2-13-0) below.
+ +#### rocSHMEM Reverse Offload conduit inter-node support + +The rocSHMEM communications library has added the RO (Reverse Offload) inter-node communication backend, which enables communication between GPUs on different nodes through a NIC, using a host-based CPU proxy to forward communication orders to and from the GPU. Inter-node communication requires MPI, and is tested with Open MPI and CX7 IB NICs. For more information, see [available network backends](https://rocm.docs.amd.com/projects/rocSHMEM/en/develop/install.html#available-network-backends) for installing rocSHMEM. + +See the [rocSHMEM changelog](#rocshmem-3-0-0) for more details. + +### Tool changes and improvements #### AMD SMI -* The default output of the ``amd-smi`` CLI now displays a simple table view. -* New APIs: CPU affinity shows GPUs’ affinitization to each CPU in a system. +Key enhancements to AMD SMI include the ability to reload the AMD GPU driver from the +CLI or API. The `amd-smi` command-line interface gains a new default view, `amd-smi` topology support +in guest environments, and performance optimizations. Additionally, AMD SMI library APIs +have been refined for improved usability. See the [AMD SMI changelog](#amdsmi-26-0-0) for more details. #### ROCgdb -* MX data types support: `FP4`, `FP6`, and `FP8`. -#### ROCprof Compute Viewer -* Initial release: ``rocprof-compute-viewer`` allows the visualization of ``rocprofv3``’s thread trace output. +ROCgdb now supports the MX data types `FP4`, `FP6`, and `FP8`. -#### ROCprof Trace Decoder -* Initial release: ``rocprof-trace-decoder`` a plugin API for decoding thread traces. +See the [ROCgdb changelog](#rocgdb-16-3) for more details. #### ROCm Compute Profiler +ROCm Compute Profiler includes the following key changes: + * MX data types support: `FP4`, `FP6`, and `FP8`.
* Enhanced roofline analysis with support for `INT8`, `INT32`, `FP8`, `FP16`, and `BF16` data types. * Roofline distinction for `FP32` and `FP64` data types. * Selective kernel profiling. +See the [ROCm Compute Profiler changelog](#rocm-compute-profiler-3-2-1) for more details. + +#### ROCm Data Center (RDC) improvements + +The ROCm Data Center tool (RDC) streamlines the administration of AMD GPUs in cluster data center environments. ROCm 7.0 introduces new data center management and monitoring tools for system administrators. For more information, see [ROCm Data Center (RDC) tool documentation](https://rocm.docs.amd.com/projects/rdc/en/latest/index.html). + #### ROCm Systems Profiler + +ROCm Systems Profiler includes the following key changes: + * Trace support for computer vision APIs: H264, H265, AV1, VP9, and JPEG. * Trace support for computer vision engine activity. * OpenMP for C++ language and kernel activity support. +See the [ROCm Systems Profiler changelog](#rocm-systems-profiler-1-1-0) for more details. + #### ROCm Validation Suite -* AMD Instinct MI355X and MI350X accelerator support in the IET (Integrated Execution Test), GST (GPU Stress Test), and Babel (memory bandwidth test) modules. +AMD Instinct MI355X and MI350X accelerator support in the IET (Integrated Execution Test), GST (GPU Stress Test), and Babel (memory bandwidth test) modules. + +See the [ROCm Validation Suite changelog](#rocm-validation-suite-1-2-0) for more details. #### ROCprofiler-SDK -* Program counter (PC) sampling (host trap-based). + +##### Core SDK enhancements + +* ROCprofiler-SDK is now compatible with the HIP 7.0 API. +* Added stochastic and host-trap PC sampling support for all AMD Instinct MI300 series accelerators. +* Added support for tracing KFD events. * API for profiling applications using thread traces (beta). -* Support in ``rocprofv3`` CLI tool for thread trace service. 
+ +##### rocpd + +Support has been added for the ROCm Profiling Data (rocpd) output format, which is now the default format for ``rocprofv3``. A subproject of the ROCprofiler-SDK, rocpd enables saving profiling results to a SQLite3 database, providing a structured and efficient foundation for analysis and post-processing. + +##### rocprofv3 CLI tool enhancements + +* Added stochastic and host-trap PC sampling support for all AMD Instinct MI300 series accelerators. +* HIP streams translate to Queues in Time Traces in Perfetto output. +* Support for thread trace service. + +See the [ROCprofiler-SDK changelog](#rocprofiler-sdk-1-0-0) for more details. ### ROCm Offline Installer Creator updates @@ -227,18 +301,29 @@ ROCm documentation continues to be updated to provide clearer and more comprehen * Modern computing tasks often require balancing numerical precision against hardware resources and processing speed. Low precision floating point number formats in HIP include `FP4` (4-bit) and `FP6` (6-bit), which reduce memory and bandwidth requirements. For more information, see the updated [Low precision floating point types](https://rocm.docs.amd.com/projects/HIP/en/docs-develop/reference/low_fp_types.html) topic. -## Operating system and hardware support changes +## Operating system, hardware, and virtualization support changes -ROCm 7.0.0 adds support for [placeholder]. For more information, see installation instructions. +ROCm 7.0.0 adds support for the following operating systems and kernel versions: -ROCm 7.0.0 marks the end of support (EoS) for [placeholder] +* Ubuntu 24.04.3 (kernel: 6.8 [GA], 6.14 [HWE]) +* RHEL 10.0 (kernel: 6.12) +* Oracle Linux 10 (kernel: 6.12 UEK) +* Rocky 9 (kernel: 5.14+ B/P from 6.11/6.12) -ROCm 7.0.0 adds support for AMD Instinct MI355X and MI350X. For details, see the full list of Supported GPUs (Linux). 
+For more information about supported operating systems, see [Supported operating systems](https://rocm.docs.amd.com/projects/install-on-linux/en/latest/reference/system-requirements.html). + +ROCm 7.0.0 marks the end of support (EoS) for Ubuntu 24.04.2 (kernel: 6.8 [GA], 6.11 [HWE]) + +ROCm 7.0.0 adds support for [AMD Instinct MI355X](https://www.amd.com/en/products/accelerators/instinct/mi350/mi355x.html) and [MI350X](https://www.amd.com/en/products/accelerators/instinct/mi350/mi350x.html). For details, see the full list of [Supported GPUs (Linux)](https://rocm.docs.amd.com/projects/install-on-linux-internal/en/latest/reference/system-requirements.html#supported-gpus). See the [Compatibility matrix](../../docs/compatibility/compatibility-matrix.rst) for more information about operating system and hardware compatibility. +### Virtualization support + +ROCm 7.0 introduces support for KVM-based SR-IOV for select Instinct accelerators. All supported configurations require the GIM SR-IOV driver version 8.3.0K. In addition, support for VMware ESXi 8 has been introduced for select AMD accelerators. For more information, see [Virtualization Support](https://rocm.docs.amd.com/projects/install-on-linux-internal/en/latest/reference/system-requirements.html#virtualization-support). + ## ROCm components The following table lists the versions of ROCm components for ROCm 7.0.0, including any version @@ -892,11 +977,12 @@ In order to match the CUDA runtime behavior more closely, HIP APIs with streams * `hipEventRecord` * `hipEventRecordWithFlags` * `warpSize` Change + In order to match the CUDA specification, the `warpSize` variable is no longer `constexpr`. In general, this should be a transparent change; however, if an application was using `warpSize` as a compile-time constant, it will have to be updated to handle the new definition. 
For more information, see either the discussion of `warpSize` within the [HIP C++ language extensions](https://rocm.docs.amd.com/projects/HIP/en/latest/how-to/hip_cpp_language_extensions.html#warpsize). #### Optimized -HIP runtime has the following functional improvements which greatly improve runtime performance and user experience. +HIP runtime has the following functional improvements which improves runtime performance and user experience. * Reduced usage of the lock scope in events and kernel handling. - Switches to `shared_mutex` for event validation, uses `std::unique_lock` in HIP runtime to create/destroy event, instead of `scopedLock`. @@ -905,7 +991,7 @@ HIP runtime has the following functional improvements which greatly improve runt * Refactored memory validation, creates a unique function to validate a variety of memory copy operations. * Improved kernel logging using demangling shader names. * Advanced support for SPIRV, now kernel compilation caching is enabled by default. This feature is controlled by the environment variable `AMD_COMGR_CACHE`, for details, see [hip_rtc document](https://rocm.docs.amd.com/projects/HIP/en/latest/how-to/hip_rtc.html). -* Programmatic support for scratch limits on MI300 and MI350 series up GPU devices. More enumeration values were added in `hipLimit_t` as following, +* Programmatic support for scratch limits on the AMD Instinct MI300 and MI350 series up GPU devices. More enumeration values were added in `hipLimit_t` as following, - `hipExtLimitScratchMin`, minimum allowed value in bytes for scratch limit on the device. - `hipExtLimitScratchMax`, maximum allowed value in bytes for scratch limit on the device. - `hipExtLimitScratchCurrent`, current scratch limit threshold in bytes on the device. Must be between the value `hipExtLimitScratchMin` and `hipExtLimitScratchMax`. 
@@ -985,7 +1071,7 @@ HIP runtime has the following functional improvements which greatly improve runt #### Added -* Added a new cmake option, `BUILD_OFFLOAD_COMPRESS`. When hipCUB is build with this option enabled, the `--offload-compress` switch is passed to the compiler. This causes the compiler to compress the binary that it generates. Compression can be useful in cases where you are compiling for a large number of targets, since this often results in a large binary. Without compression, in some cases, the generated binary may become so large symbols are placed out of range, resulting in linking errors. The new `BUILD_OFFLOAD_COMPRESS` option is set to `ON` by default. +* Added a new cmake option, `BUILD_OFFLOAD_COMPRESS`. When hipCUB is built with this option enabled, the `--offload-compress` switch is passed to the compiler. This causes the compiler to compress the binary that it generates. Compression can be useful in cases where you are compiling for a large number of targets, since this often results in a large binary. Without compression, in some cases, the generated binary may become so large symbols are placed out of range, resulting in linking errors. The new `BUILD_OFFLOAD_COMPRESS` option is set to `ON` by default. * Added single pass operators in `agent/single_pass_scan_operators.hpp` which contains the following API: * `BlockScanRunningPrefixOp` * `ScanTileStatus` @@ -1211,6 +1297,91 @@ HIP runtime has the following functional improvements which greatly improve runt * `__AMDGCN_WAVEFRONT_SIZE__` macro and HIP’s `warpSize` variable as `constexpr` are deprecated and will be disabled in a future release. Users are encouraged to update their code if needed to ensure future compatibility. For more information, see [AMDGCN_WAVEFRONT_SIZE deprecation](#amdgpu-wavefront-size-compiler-macro-deprecation). * The `roc-obj-ls` and `roc-obj-extract` tools are deprecated. 
To extract all Clang offload bundles into separate code objects use `llvm-objdump --offloading `. For more information, see [Changes to ROCm Object Tooling](#changes-to-rocm-object-tooling). +### **MIGraphX** (2.13.0) + +### Added + +* Support for OCP `FP8` and MX `FP4` data types on AMD Instinct MI350X and MI355X accelerators. +* Support for `BF16` on all hardware. +* Support for PyTorch 2.7 via Torch-MIGraphX. +* Contrib Operators for Microsoft ONNX: Attention, RotaryEmbedding, QuickGelu, BiasAdd, BiasSplitGelu, skipLayerNorm. +* TensorFlow Operator: Sigmoid, AddN. +* GroupQuery Attention for LLM support. +* Added support for edge mode in the ONNX Pad operator. +* Support additional types for linear Resize operator. +* Added bitonic topk ONNX operator. +* Added ONNX Runtime Python driver. +* Added FLUX e2e example. +* Added API to save and load arguments. +* Added quantize_bf16 to C API output. +* Added rocMLIR fusion for kv-cache attention. + +### Changed + +* Print Kernel/Module in Compile Failure. +* Use hipBLASLt instead of rocBLAS for newer GPU ASICs. +* Normalize standard input shapes for rocBLAS. +* Updated Stable Diffusion example to use torch 6.3. +* Rewrite 1x1 convolutions to gemm. +* Make version header public. +* Represent `BF16::max` by its encoding, rather than the expected value. +* Direct warnings to `cout` instead of `cerr`. +* Use vector instead of `set` for implicit deps. +* Disable layernorm by default. +* Update timing in compile_ops() to use common average. + +### Removed + +* DPP for v_add_f64 as it is unsupported. +* rocBLAS bug workaround for solution index. +* ROCM_USE_FLOAT8 macro. +* rocBLAS `FP8`, always use hipBLASLt. +* Call to hipGetMemoryInfo when checking free memory based on feedback from HIP team. 
+ +### Optimized + +* Layout convolution as NHWC or NCHW only. +* einsum: conditionally do squeeze before transpose. +* Update problem cache as configs are benchmarked. +* Enable debug assertions in libstdc++. +* Topologically sort ONNX models if nodes are unordered. +* Use time_loop function to measure time for exhaustive tune runs. +* Slice Channels Conv Optimization (slice output fusion). +* Horiz fuse after pointwise. +* GridSample Linear Sampler Refactor. +* find_splits::is_dependent refactor. +* Visually improved the output from Graphviz. +* Print environment variables consumed by MIGraphX when using the migraphx-driver. +* Add timestamps and duration when printing the summary of migraphx-driver. +* Add a trim size flag to the verify option for migraphx-driver. +* Print node names, to track parsing within the ONNX graph when using the MIGRAPHX_TRACE_ONNX_PARSER flag. +* Embed onnx/tf files for API tests. +* Fuse multiple outputs for pointwise ops. +* Fuse reshapes on pointwise inputs for MLIR output fusion. +* Print MIGraphX environment variables at end of summary. +* Update accuracy checker to output test data with --show-test-data flag. +* Don't fold mul with gemm when the gemm is used more than once. +* Detect when parallel STL is not parallel, and enable it only when it is parallel. +* Don't fuse broadcast after conv/gemm in MLIR. +* Avoid the fusion (in reduction) when operator data-types mismatch. + +### Resolved issues + +* Workaround ICE in clang 20 when using views::transform. +* Fix bug with reshape_lazy in MLIR. +* Quantizelinear nearbyint fix. +* Add case for empty strings in node inputs for ops like resize. +* Parse resize fix: only check "keep_aspect_ratio_policy" attribute for sizes input. +* Fix Layernorm and SimplifiedLayernorm ONNX parsers. +* nonmaxsuppression: identical boxes/scores not ordered correctly. +* Gcc/G++ compilation fix. +* Bug fix: events would get created on the wrong device in a multi-GPU scenario. +* Check for file-write errors. 
+* Fix out of order keys in value for comparisons and hashes when caching best kernels. +* Make checking env variables thread-safe again. +* [controlnet] Fixed mul: Types do not match. +* Fix check for scales if presenting roi in Resize op. + ### **MIOpen** (3.5.0) #### Added @@ -1446,7 +1617,7 @@ HIP runtime has the following functional improvements which greatly improve runt - 8192 * Implemented single-kernel plans for some large 1D problem sizes, on devices with at least 160KiB of LDS. -#### Resolved isues +#### Resolved issues * Fixed kernel faults on multi-device transforms that gather to a single device, when the input/output bricks are not contiguous. @@ -1470,6 +1641,32 @@ HIP runtime has the following functional improvements which greatly improve runt * Resolved an issue with resizing the internal memory pool by utilizing the explicit constructor of the vector's type during the resizing process. * Addressed and resolved CMake configuration warnings. +### **ROCm Bandwidth Test** (2.6.0) + +### Added + +* Plugin architecture: + * `rocm_bandwidth_test` is now the **framework** for individual `plugins` and features. The `framework` is available at: `/opt/rocm/bin/` + + * Individual `plugins`: The **plugins (shared libraries)** are available at: `/opt/rocm/lib/rocm_bandwidth_test/plugins/` + +```{note} +Review the [README](https://github.com/ROCm/rocm_bandwidth_test/blob/release/rocm-rel-7.0/README.md) file for details about the new options and outputs. +``` + +### Changed + +* The `CLI` and options/parameters have changed due to the new plugin architecture, where the plugin parameters are parsed by the plugin. + +### Removed + +- The old CLI, parameters, and switches used. + +### Known Issues + +- MI350: Crashes due to HIP gfx support. + + ### **ROCm SMI** (7.8.0) #### Added @@ -1825,6 +2022,7 @@ The previous default accumulator types could lead to situations in which unexpec - Perfetto support for scratch memory. 
- Support in the `rocprofv3` avail tool for command-line arguments. - Documentation for `rocprofv3` advanced options. +- AQLprofile is now available as open source. ### Changed @@ -1893,11 +2091,11 @@ The previous default accumulator types could lead to situations in which unexpec * `rocrand_h_scrambled_sobol32_direction_vectors`, use `rocrand_get_direction_vectors32` instead. * `rocrand_h_scrambled_sobol64_direction_vectors`, use `rocrand_get_direction_vectors64` instead. -#### Resolved isues +#### Resolved issues * Fixed an issue where `mt19937.hpp` would cause kernel errors during auto tuning. -#### Upcoming canges +#### Upcoming changes * Deprecated the rocRAND Fortran API in favor of hipfort. @@ -2153,6 +2351,14 @@ issues related to individual components, review the [Detailed component changes] The following are previously known issues resolved in this release. For resolved issues related to individual components, review the [Detailed component changes](#detailed-component-changes). +### Failure when using a generic target with compression and vice versa + +An issue where compilation for a generic target with compression failed has been resolved in this release. This issue prevented you from compiling for a generic target and using compression simultaneously. See [GitHub issue #4602](https://github.com/ROCm/ROCm/issues/4602). + +### Limited support for Sparse API and Pallas functionality in JAX + +An issue where limited support for the Sparse API in JAX restricted some functionality of the Pallas extension has been resolved. See [GitHub issue #4608](https://github.com/ROCm/ROCm/issues/4608). + ## ROCm upcoming changes The following changes to the ROCm software stack are anticipated for future releases. 
@@ -2183,12 +2389,12 @@ It's anticipated that ROCTracer, ROCProfiler, `rocprof`, and `rocprofv2` will re ### AMDGPU wavefront size compiler macro deprecation Access to the wavefront size as a compile-time constant via the `__AMDGCN_WAVEFRONT_SIZE` -and `__AMDGCN_WAVEFRONT_SIZE__` macros are deprecated and will be disabled in a future release. In ROCm 7.0.0 `warpSize` is only availble as a non-`constextpr` variable. +and `__AMDGCN_WAVEFRONT_SIZE__` macros are deprecated and will be disabled in a future release. In ROCm 7.0.0 `warpSize` is only available as a non-`constexpr` variable. You're encouraged to update your code if needed to ensure future compatibility. * The `__AMDGCN_WAVEFRONT_SIZE__` macro and `__AMDGCN_WAVEFRONT_SIZE` alias will be removed in an upcoming release. It is recommended to remove any use of this macro. For more information, see [AMDGPU support](https://rocm.docs.amd.com/projects/llvm-project/en/docs-6.4.3/LLVM/clang/html/AMDGPUSupport.html). -* `warpSize` will only be available as a non-`constexpr` variable. Where required, +* `warpSize` is only available as a non-`constexpr` variable. Where required, the wavefront size should be queried via the `warpSize` variable in device code, or via `hipGetDeviceProperties` in host code. Neither of these will result in a compile-time constant. For more information, see [warpSize](https://rocm.docs.amd.com/projects/HIP/en/docs-6.4.3/how-to/hip_cpp_language_extensions.html#warpsize). 
* For cases where compile-time evaluation of the wavefront size cannot be avoided, From 6b93d7a75a744a026636f52d5a89a9f036f5aefd Mon Sep 17 00:00:00 2001 From: Peter Park Date: Wed, 20 Aug 2025 15:29:20 -0400 Subject: [PATCH 11/58] Update amdsmi changelog for 7.0 (#510) Co-authored-by: Pratik Basyal --- RELEASE.md | 120 ++++++++++++++++++++++++++--------------------------- 1 file changed, 60 insertions(+), 60 deletions(-) diff --git a/RELEASE.md b/RELEASE.md index 8398914a6..37ce0dd86 100644 --- a/RELEASE.md +++ b/RELEASE.md @@ -664,30 +664,35 @@ For a historical overview of ROCm component updates, see the {doc}`ROCm consolid ### **AMD SMI** (26.0.0) -### Added +#### Added -* The Default command. +* Ability to restart the AMD GPU driver from the CLI and API. + - `amdsmi_gpu_driver_reload()` API and `amd-smi reset --reload-driver` or `amd-smi reset -r` CLI options. + - Driver reload functionality is now separated from memory partition + functions; memory partition change requests should now be followed by a driver reload. + - Driver reload requires all GPU activity on all devices to be stopped. - A default view has been added. The default view provides a snapshot of commonly requested information such as bdf, current partition mode, version information, and more. Users can access that information by simply typing `amd-smi` with no additional commands or arguments. Users may also obtain this information through laternate output formats such as json or csv by using the default command with the respective output format: `amd-smi default --json` or `amd-smi default --csv`. +* Default command: -* Support for GPU metrics 1.8. 
- - Added new fields for `amdsmi_gpu_xcp_metrics_t` including: - - Adding the following metrics to allow new calculations for violation status: - - Per XCP metrics `gfx_below_host_limit_ppt_acc[XCP][MAX_XCC]` - GFX Clock Host limit Package Power Tracking violation counts - - Per XCP metrics `gfx_below_host_limit_thm_acc[XCP][MAX_XCC]` - GFX Clock Host limit Thermal (TVIOL) violation counts - - Per XCP metrics `gfx_low_utilization_acc[XCP][MAX_XCC]` - violation counts for how did low utilization caused the GPU to be below application clocks. - - Per XCP metrics `gfx_below_host_limit_total_acc[XCP][MAX_XCC]`- violation counts for how long GPU was held below application clocks any limiter (see above new violation metrics). - - Increasing available JPEG engines to 40. - Current ASICs may not support all 40. These will be indicated as `UINT16_MAX` or `N/A` in CLI. + A default view has been added. The default view provides a snapshot of commonly requested information such as bdf, current partition mode, version information, and more. Users can access that information by simply typing `amd-smi` with no additional commands or arguments. Users may also obtain this information through alternate output formats such as json or csv by using the default command with the respective output format: `amd-smi default --json` or `amd-smi default --csv`. -* Bad page threshold count. +* Support for GPU metrics 1.8: + - Added new fields for `amdsmi_gpu_xcp_metrics_t` including: + - Metrics to allow new calculations for violation status: + - Per XCP metrics `gfx_below_host_limit_ppt_acc[XCP][MAX_XCC]` - GFX Clock Host limit Package Power Tracking violation counts + - Per XCP metrics `gfx_below_host_limit_thm_acc[XCP][MAX_XCC]` - GFX Clock Host limit Thermal (TVIOL) violation counts + - Per XCP metrics `gfx_low_utilization_acc[XCP][MAX_XCC]` - violation counts for how long low utilization caused the GPU to be below application clocks. 
+ - Per XCP metrics `gfx_below_host_limit_total_acc[XCP][MAX_XCC]`- violation counts for how long GPU was held below application clocks any limiter (see above new violation metrics). + - Increased available JPEG engines to 40. Current ASICs may not support all 40. These are indicated as `UINT16_MAX` or `N/A` in CLI. + +* Bad page threshold count. - Added `amdsmi_get_gpu_bad_page_threshold` to Python API and CLI; root/sudo permissions required to display the count. -* CPU model name for RDC. +* CPU model name for RDC. - Added new C and Python API `amdsmi_get_cpu_model_name`. - Not sourced from esmi library. -* Added `amdsmi_get_cpu_affinity_with_scope()`. +* Added `amdsmi_get_cpu_affinity_with_scope()`. * `socket power` to `amdsmi_get_power_info` - Previously the C API had the value in the `amdsmi_power_info` structure, but was unused @@ -706,26 +711,28 @@ For a historical overview of ROCm component updates, see the {doc}`ROCm consolid - `AMDSMI_EVT_NOTIF_PROCESS_START` - `AMDSMI_EVT_NOTIF_PROCESS_END` -- Power Cap to `amd-smi monitor`. +- Power cap to `amd-smi monitor`. - `amd-smi monitor -p` will display the power cap along with power. -### Changed +#### Changed -* Updated `amdsmi_get_clock_info` in `amdsmi_interface.py`. - - The `clk_deep_sleep` field now returns the sleep integer value. +* Separated driver reload functionality from `amdsmi_set_gpu_memory_partition()` and + `amdsmi_set_gpu_memory_partition_mode()` APIs -- and from the CLI `amd-smi set -M `. -* Updated `amdsmi_get_gpu_asic_info` in `amdsmi.h`. - - Added `subsystem_id` structure member. +* Disabled `amd-smi monitor --violation` on guest. Modified `amd-smi metric --throttle` to alias to `amd-smi metric --violation`. -* The `amd-smi topology` command has been enabled for Guest environments. - - `amd-smi topology` is now available in Guest environments. This includes full functionality so users can use the command just as they would in Bare Metal environments. 
+* Updated `amdsmi_get_clock_info` in `amdsmi_interface.py`. + - The `clk_deep_sleep` field now returns the sleep integer value. -* Expanded Violation Status tracking for GPU metrics 1.8. - - The driver will no longer be supporting existing single-value GFX Clk Below Host Limit fields (`acc_gfx_clk_below_host_limit`, `per_gfx_clk_below_host_limit`, `active_gfx_clk_below_host_limit`), they are now changed in favor of new per-XCP/XCC arrays. +* The `amd-smi topology` command has been enabled for guest environments. + - This includes full functionality so users can use the command just as they would in bare metal environments. + +* Expanded violation status tracking for GPU metrics 1.8. + - The driver will no longer be supporting existing single-value GFX clock below host limit fields (`acc_gfx_clk_below_host_limit`, `per_gfx_clk_below_host_limit`, `active_gfx_clk_below_host_limit`), they are now changed in favor of new per-XCP/XCC arrays. - Added new fields to `amdsmi_violation_status_t` and related interfaces for enhanced violation breakdown: - Per-XCP/XCC accumulators and status for: - - GFX Clock Below Host Limit (Power, Thermal, and Total) - - Low Utilization + - GFX clock below host limit (power, thermal, and total) + - Low utilization - Added 2D arrays to track per-XCP/XCC accumulators, percentage, and active status: - `acc_gfx_clk_below_host_limit_pwr`, `acc_gfx_clk_below_host_limit_thm`, `acc_gfx_clk_below_host_limit_total` - `per_gfx_clk_below_host_limit_pwr`, `per_gfx_clk_below_host_limit_thm`, `per_gfx_clk_below_host_limit_total` @@ -738,41 +745,34 @@ For a historical overview of ROCm component updates, see the {doc}`ROCm consolid - `amdsmi_dpm_policy_entry_t` member `policy_description` changed from `AMDSMI_MAX_NAME` to `AMDSMI_MAX_STRING_LENGTH`. - `amdsmi_name_value_t` member `name` changed from `AMDSMI_MAX_NAME` to `AMDSMI_MAX_STRING_LENGTH`. -* Updated `amdsmi_get_clock_info` in `amdsmi_interface.py`. 
- - The `clk_deep_sleep` field now returns the sleep integer value. +* Updated `amdsmi_get_clock_info` in `amdsmi_interface.py`. + - The `clk_deep_sleep` field now returns the sleep integer value. -* Updated `amdsmi_bdf_t` in `amdsmi.h`. - - The `amdsmi_bdf_t` union was changed to have an identical unnamed struct for backwards compatiblity +* For backwards compatibility, updated `amdsmi_bdf_t` union to have an identical unnamed struct. -### Removed +#### Removed -- Removed unnecessary API, `amdsmi_free_name_value_pairs(),` from amdsmi.h - - This API is only used internally to free up memory from the python interface and does not need to be - exposed to the User. +- Removed unnecessary API, `amdsmi_free_name_value_pairs()` + - This API is only used internally to free up memory from the Python interface and does not need to be + exposed to the user. -- Removed unused definitions: - - `AMDSMI_MAX_NAME` - - `AMDSMI_256_LENGTH` - - `AMDSMI_MAX_DATE_LENGTH` - - `MAX_AMDSMI_NAME_LENGTH` - - `AMDSMI_LIB_VERSION_YEAR` - - `AMDSMI_DEFAULT_VARIANT` - - `AMDSMI_MAX_NUM_POWER_PROFILES` - - `AMDSMI_MAX_DRIVER_VERSION_LENGTH` +- Removed unused definitions: + - `AMDSMI_MAX_NAME`, `AMDSMI_256_LENGTH`, `AMDSMI_MAX_DATE_LENGTH`, `MAX_AMDSMI_NAME_LENGTH`, `AMDSMI_LIB_VERSION_YEAR`, + `AMDSMI_DEFAULT_VARIANT`, `AMDSMI_MAX_NUM_POWER_PROFILES`, `AMDSMI_MAX_DRIVER_VERSION_LENGTH`. -- Removed unused member `year` in struct `amdsmi_version_t`. +- Removed unused member `year` in struct `amdsmi_version_t`. -- Removed `amdsmi_io_link_type_t` and replaced with `amdsmi_link_type_t`** +- Removed `amdsmi_io_link_type_t` and replaced with `amdsmi_link_type_t`. - `amdsmi_io_link_type_t` is no longer needed as `amdsmi_link_type_t` is sufficient. - `amdsmi_link_type_t` enum has changed. - This change will also affect `amdsmi_link_metrics_t`, where the link_type field changes from `amdsmi_io_link_type_t` to `amdsmi_link_type_t`. -- Removed `amdsmi_get_power_info_v2()`. 
- - The ``amdsmi_get_power_info()`` has been unified and the v2 function is no longer needed/used. +- Removed `amdsmi_get_power_info_v2()`. + - The ``amdsmi_get_power_info()`` has been unified and the v2 function is no longer needed or used. -- Removed `AMDSMI_EVT_NOTIF_RING_HANG` event notification type in `amdsmi_evt_notification_type_t`. +- Removed `AMDSMI_EVT_NOTIF_RING_HANG` event notification type in `amdsmi_evt_notification_type_t`. -- The `amdsmi_get_gpu_vram_info` now provides vendor names as a string. +- The `amdsmi_get_gpu_vram_info` now provides vendor names as a string. - `amdsmi_vram_vendor_type_t` enum structure is removed. - `amdsmi_vram_info_t` member named `amdsmi_vram_vendor_type_t` is changed to a character string. - `amdsmi_get_gpu_vram_info` now no longer requires decoding the vendor name as an enum. @@ -782,24 +782,24 @@ For a historical overview of ROCm component updates, see the {doc}`ROCm consolid - Providing both `vcn_activity`/`jpeg_activity` and XCP (partition) stats `vcn_busy`/`jpeg_busy` caused confusion about which field to use. By removing backward compatibility, it is easier to identify the relevant field. - The `jpeg_busy` field increased in size (for supported ASICs), making backward compatibility unable to fully copy the structure into `jpeg_activity`. -### Optimized +#### Optimized -- Reduced ``amd-smi`` CLI API calls needed to be called before reading or (re)setting GPU features. - - Now when users call any amd-smi CLI command, we have reduced the APIs needed to be called. Previously, - when a user would read a GPU's status, (for example) we would poll for other information helpful for our sets/reset - CLI calls. This change will increase overall run-time performance of the CLI tool. +- Reduced ``amd-smi`` CLI API calls needed to be called before reading or (re)setting GPU features. This + improves overall runtime performance of the CLI. -- Removed partition information from the default `amd-smi static` CLI command. 
- - Users can still retrieve the same data by calling `amd-smi`, `amd-smi static -p`, or `amd-smi partition -c -m`/`sudo amd-smi partition -a`. +- Removed partition information from the default `amd-smi static` CLI command. + - Users can still retrieve the same data by calling `amd-smi`, `amd-smi static -p`, or `amd-smi partition -c -m`/`sudo amd-smi partition -a`. - Reading ``current_compute_partition`` may momentarily wake the GPU up. This is due to reading XCD registers, which is expected behavior. Changing partitions is not a trivial operation, `current_compute_partition` SYSFS controls this action. -- Optimized CLI command `amd-smi topology` in partition mode. - - Reduced the number of `amdsmi_topo_get_p2p_status` API calls to one fourth. +- Optimized CLI command `amd-smi topology` in partition mode. + - Reduced the number of `amdsmi_topo_get_p2p_status` API calls to one fourth. -### Resolved issues +#### Resolved issues - Removed duplicated GPU IDs when receiving events using the `amd-smi event` command. +- Fixed `amd-smi monitor` decoder utilization (`DEC%`) not showing up on MI300 series ASICs. + ```{note} See the full [AMD SMI changelog](https://github.com/ROCm/amdsmi/blob/release/rocm-rel-7.0/CHANGELOG.md) for details, examples, and in-depth descriptions. 
``` From 0d5f17a58b5dc4afbd48b26cb5f9f2a29ca9d32d Mon Sep 17 00:00:00 2001 From: randyh62 <42045079+randyh62@users.noreply.github.com> Date: Thu, 21 Aug 2025 06:18:35 -0700 Subject: [PATCH 12/58] Update RELEASE.md (#515) * Update RELEASE.md Add logical reduction changes to ROCm 7.0 Release Notes * Update RELEASE.md Added description of DebugFission option for llvm-project * Update RELEASE.md update definition of __builtin_amdgcn_is_invocable * Update RELEASE.md Removed Perl Scripts from HIPCC --- RELEASE.md | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/RELEASE.md b/RELEASE.md index 37ce0dd86..3d5db57f1 100644 --- a/RELEASE.md +++ b/RELEASE.md @@ -854,7 +854,8 @@ See the full [AMD SMI changelog](https://github.com/ROCm/amdsmi/blob/release/roc - `hipDrvLaunchKernelEx` dispatches the device kernel represented by a HIP function object. - `hipMemGetHandleForAddressRange` gets a handle for the address range requested. - `num_threads` Total number of threads in the group. The legacy API size is alias. - - `__reduce_add_sync`, `__reduce_min_sync`, and `__reduce_max_sync` functions added for reduction across lanes of a warp. For details, see [Warp cross-lane functions](https://rocm.docs.amd.com/projects/HIP/en/latest/how-to/hip_cpp_language_extensions.html#warp-cross-lane-functions). + - `__reduce_add_sync`, `__reduce_min_sync`, and `__reduce_max_sync` functions added for arithmetic reduction across lanes of a warp, and `__reduce_and_sync`, `__reduce_or_sync`, and `__reduce_xor_sync` +functions added for logical reduction. For details, see [Warp cross-lane functions](https://rocm.docs.amd.com/projects/HIP/en/latest/how-to/hip_cpp_language_extensions.html#warp-cross-lane-functions). * New support for Open Compute Project (OCP) floating-point `FP4`/`FP6`/`FP8` as the following. For details, see [Low precision floating point document](https://rocm.docs.amd.com/projects/HIP/en/latest/reference/low_fp_types.html). - Data types for `FP4`/`FP6`/`FP8`. 
- HIP APIs for `FP4`/`FP6`/`FP8`, which are compatible with corresponding CUDA APIs. @@ -1278,15 +1279,16 @@ HIP runtime has the following functional improvements which improves runtime per #### Added -* Added compiler support for separate debug file generation for device code. +* Added the compiler `-gsplit-dwarf` option to enable the generation of separate debug information files at compile time. When used, separate debug information files are generated for the host and for each offload architecture. For additional information, see [DebugFission](https://gcc.gnu.org/wiki/DebugFission). * Added `llvm-flang`, AMD's next generation Fortran compiler is a re-implementation of the Fortran frontend that can be found at `llvm/llvm-project/flang` on GitHub. * Added Comgr support for an in-memory virtual file system (VFS) for storing temporary files generated during intermediate compilation steps to improve performance in the device library link step. -* Added compiler support of a new target-specific builtin `__builtin_amdgcn_processor_is` for late or deferred queries of the current target processor, and `__builtin_amdgcn_is_invocable` enabling fine-grained target-specific feature availability. +* Added compiler support of a new target-specific builtin `__builtin_amdgcn_processor_is` for late or deferred queries of the current target processor, and `__builtin_amdgcn_is_invocable` to determine the current target processor's ability to invoke a particular builtin. * Added HIPIFY support for NVIDIA CUDA 12.9.1 APIs. Added support for all new device and host APIs, including FP4, FP6, and FP128, and support for the corresponding ROCm HIP equivalents. #### Changed * Updated clang/llvm to AMD clang version 20.0.0 (equivalent to LLVM 20.0.0 with additional out-of-tree patches). +* HIPCC Perl scripts (`hipcc.pl` and `hipconfig.pl`) have been removed from this release. 
#### Optimized From 19156cf2c6f47af00d0c55b98a027d5e9ea3ffa0 Mon Sep 17 00:00:00 2001 From: spolifroni-amd Date: Thu, 21 Aug 2025 11:30:12 -0400 Subject: [PATCH 13/58] adding roccv to rocm (#479) * adding-roccv * removed rocCV --------- Co-authored-by: Pratik Basyal --- docs/what-is-rocm.rst | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/docs/what-is-rocm.rst b/docs/what-is-rocm.rst index 4accd00ed..e83e3faa8 100644 --- a/docs/what-is-rocm.rst +++ b/docs/what-is-rocm.rst @@ -45,6 +45,10 @@ Machine Learning & Computer Vision ":doc:`rocJPEG `", "Library for decoding JPG images on AMD GPUs" ":doc:`rocPyDecode `", "Provides access to rocDecode APIs in both Python and C/C++ languages" +.. note:: + + `rocCV `_ is an efficient GPU-accelerated library for image pre- and post-processing. rocCV is in an early access state. Using it on production workloads is not recommended. + Communication ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ From e24bd407c1729bf774dd749d8da3cd74044eabcb Mon Sep 17 00:00:00 2001 From: yugang-amd Date: Thu, 21 Aug 2025 11:58:26 -0400 Subject: [PATCH 14/58] edit release notes (#516) Co-authored-by: Pratik Basyal --- RELEASE.md | 44 ++++++++++++++++++++++---------------------- 1 file changed, 22 insertions(+), 22 deletions(-) diff --git a/RELEASE.md b/RELEASE.md index 3d5db57f1..cf85b1d83 100644 --- a/RELEASE.md +++ b/RELEASE.md @@ -1248,14 +1248,14 @@ HIP runtime has the following functional improvements which improves runtime per * Added `hiptensorDestroyPlanPreference` to free all resources related to the provided preference. * Added `hiptensorPlanPreferenceSetAttribute` to set attribute of a `hiptensorPlanPreference_t` object. * Added `hiptensorPlanGetAttribute` to retrieve information about an already-created plan. -* Added `hiptensorEstimateWorkspaceSize` to determine the required workspaceSize for the given operation. +* Added `hiptensorEstimateWorkspaceSize` to determine the required workspace size for the given operation. 
 * Added `hiptensorCreatePlan` to allocate a `hiptensorPlan_t` object, select an appropriate kernel for a given operation and prepare a plan that encodes the execution.
 * Added `hiptensorDestroyPlan` to free all resources related to the provided plan.

 #### Changed

 * Removed architecture support for gfx940 and gfx941.
-* Generalized opaque buffer now for any descriptor.
+* Generalized opaque buffer for any descriptor.
 * Replaced `hipDataType` with `hiptensorDataType_t` for all supported types, for example, `HIP_R_32F` to `HIPTENSOR_R_32F`.
 * Replaced `hiptensorComputeType_t` with `hiptensorComputeDescriptor_t` for all supported types.
 * Replaced `hiptensorInitTensorDescriptor` with `hiptensorCreateTensorDescriptor`.
@@ -1873,14 +1873,14 @@ See the full [ROCm SMI changelog](https://github.com/ROCm/rocm_smi_lib/blob/rele
 - Fixed incorrect kernel names shown for kernel dispatch tracks in Perfetto.
 - Fixed formatting of some output logs.

-### **ROCmValidationSuite** (1.2.0)
+### **ROCm Validation Suite** (1.2.0)

 #### Added

-- Support for new platforms: MI350X and MI355X.
+- Support for AMD Instinct MI350X and MI355X accelerators.
 - Introduced rotating buffer mechanism for GEMM operations.
 - Support for read and write tests in Babel.
-- Support for new platforms: RX9070 and RX9070GRE.
+- Support for AMD Radeon RX9070 and RX9070GRE graphics cards.

 #### Changed

@@ -2123,7 +2123,7 @@ The previous default accumulator types could lead to situations in which unexpec

 #### Resolved issues

-* Resolved segfault in `rocshmem_wg_ctx_create`, now provides nullptr if ctx cannot be created.
+* Resolved segfault in `rocshmem_wg_ctx_create`, now provides `nullptr` if `ctx` cannot be created.

 ### **rocSOLVER** (3.30.0)

@@ -2236,34 +2236,34 @@ The previous default accumulator types could lead to situations in which unexpec

 * Added internal register layout transforms to support interleaved MMA layouts.
 * Added support for the gfx950 target.
-* Added mixed input `BF8` / `FP8` types for MMA support.
-* Added fragment scheduler API objects to embed thread block cooperation properties in fragments
+* Added mixed input `BF8`/`FP8` types for MMA support.
+* Added fragment scheduler API objects to embed thread block cooperation properties in fragments.

 #### Changed

-* Augmented load / store / MMA internals with static loop unrolling
-* rocWMMA mma_sync API now supports `wave tile` fragment sizes
-* rocWMMA cooperative fragments are now expressed with fragment scheduler template arguments
-* rocWMMA cooperative fragments now use the same base API as non-cooperative fragments
-* rocWMMA cooperative fragments register usage footprint has been reduced
-* rocWMMA fragments now support partial tile sizes with padding
+* Augmented load/store/MMA internals with static loop unrolling.
+* Updated linkage of `rocwmma::synchronize_workgroup` to inline.
+* rocWMMA `mma_sync` API now supports `wave tile` fragment sizes.
+* rocWMMA cooperative fragments are now expressed with fragment scheduler template arguments.
+* rocWMMA cooperative fragments now use the same base API as non-cooperative fragments.
+* rocWMMA cooperative fragments register usage footprint has been reduced.
+* rocWMMA fragments now support partial tile sizes with padding.

 #### Optimized

-* Added internal flow control barriers to improve assembly code generation and overall performance
-* Enabled interleaved layouts by default in MMA to improve overall performance
+* Added internal flow control barriers to improve assembly code generation and overall performance.
+* Enabled interleaved layouts by default in MMA to improve overall performance.

 #### Removed

-* Removed support for the gfx940 and gfx941 targets
-* Removed the rocWMMA cooperative API
-* Removed wave count template parameters from transforms APIs
+* Removed support for the gfx940 and gfx941 targets.
+* Removed the rocWMMA cooperative API.
+* Removed wave count template parameters from transforms APIs.

 #### Resolved issues

-* Fixed a validation issue for small precision compute types `< B32` on gfx9
-* Fixed CMake validation of compiler support for `BF8` / `FP8` types
-* Fixed linkage of rocwmma::synchronize_workgroup to inline
+* Fixed a validation issue for small precision compute types `< B32` on gfx9.
+* Fixed CMake validation of compiler support for `BF8`/`FP8` types.

 ### **RPP** (2.0.0)

From 60571680b54e7b214c2ea2dcb1bdc710a913d0c6 Mon Sep 17 00:00:00 2001
From: Jeffrey Novotny
Date: Thu, 21 Aug 2025 14:17:17 -0400
Subject: [PATCH 15/58] Second round of proofreading for components in 7.0 release notes (#514)

* Second round of proofreading for components

* Remove duplicate item

---------

Co-authored-by: Pratik Basyal
---
 RELEASE.md | 48 +++++++++++++++++++++++++-----------------------
 1 file changed, 25 insertions(+), 23 deletions(-)

diff --git a/RELEASE.md b/RELEASE.md
index cf85b1d83..a443eae81 100644
--- a/RELEASE.md
+++ b/RELEASE.md
@@ -1044,18 +1044,21 @@ HIP runtime has the following functional improvements which improves runtime per
 #### Added

 * Stream-K GEMM support has been enabled for the `FP32`, `FP16`, `BF16`, `FP8`, and `BF8` data types on the Instinct MI300A APU. To activate this feature, set the `TENSILE_SOLUTION_SELECTION_METHOD` environment variable to `2`, for example, `export TENSILE_SOLUTION_SELECTION_METHOD=2`.
-* Fused Swish/SiLU GEMM in hipBLASLt (enabled by ``HIPBLASLT_EPILOGUE_SWISH_EXT`` and ``HIPBLASLT_EPILOGUE_SWISH_BIAS_EXT``)
+* Added fused Swish/SiLU GEMM (enabled by ``HIPBLASLT_EPILOGUE_SWISH_EXT`` and ``HIPBLASLT_EPILOGUE_SWISH_BIAS_EXT``).
 * Added support for ``HIPBLASLT_EPILOGUE_GELU_AUX_BIAS`` for gfx942.
 * Added `HIPBLASLT_TUNING_USER_MAX_WORKSPACE` to constrain the maximum workspace size for user offline tuning.
 * Added ``HIPBLASLT_ORDER_COL16_4R16`` and ``HIPBLASLT_ORDER_COL16_4R8`` to ``hipblasLtOrder_t`` to support `FP16`/`BF16` swizzle GEMM and `FP8` / `BF8` swizzle GEMM respectively.
 * Added TF32 emulation on gfx950.
-* Added support for `FP6`, `BF6`, and `FP4` on gfx950
+* Added support for `FP6`, `BF6`, and `FP4` on gfx950.
 * Added support for block scaling by setting `HIPBLASLT_MATMUL_DESC_A_SCALE_MODE` and `HIPBLASLT_MATMUL_DESC_B_SCALE_MODE` to `HIPBLASLT_MATMUL_MATRIX_SCALE_VEC32_UE8M0`.

 #### Changed

-* ``HIPBLASLT_MATMUL_DESC_A_SCALE_POINTER_VEC_EXT`` and ``HIPBLASLT_MATMUL_DESC_B_SCALE_POINTER_VEC_EXT`` are removed. Use the ``HIPBLASLT_MATMUL_DESC_A_SCALE_MODE`` and ``HIPBLASLT_MATMUL_DESC_B_SCALE_MODE`` attributes to set scalar (``HIPBLASLT_MATMUL_MATRIX_SCALE_SCALAR_32F``) or vector (``HIPBLASLT_MATMUL_MATRIX_SCALE_OUTER_VEC_32F``) attributes.
 * The non-V2 APIs (``GemmPreference``, ``GemmProblemType``, ``GemmEpilogue``, ``GemmTuning``, ``GemmInputs``) in the cpp header are now the same as the V2 APIs (``GemmPreferenceV2``, ``GemmProblemTypeV2``, ``GemmEpilogueV2``, ``GemmTuningV2``, ``GemmInputsV2``). The original non-V2 APIs are removed.
+
+#### Removed
+
+* ``HIPBLASLT_MATMUL_DESC_A_SCALE_POINTER_VEC_EXT`` and ``HIPBLASLT_MATMUL_DESC_B_SCALE_POINTER_VEC_EXT`` are removed. Use the ``HIPBLASLT_MATMUL_DESC_A_SCALE_MODE`` and ``HIPBLASLT_MATMUL_DESC_B_SCALE_MODE`` attributes to set scalar (``HIPBLASLT_MATMUL_MATRIX_SCALE_SCALAR_32F``) or vector (``HIPBLASLT_MATMUL_MATRIX_SCALE_OUTER_VEC_32F``) attributes.
 * The `hipblasltExtAMaxWithScale` API is removed.
 #### Optimized
@@ -1177,7 +1180,7 @@ HIP runtime has the following functional improvements which improves runtime per

 * Added compatibility-only functions
     * csrlsvqr
-        * hipsolverSpCcsrlsvqr, hipsolverSpZcsrlsvqr
+        * `hipsolverSpCcsrlsvqr`, `hipsolverSpZcsrlsvqr`

 #### Resolved issues

@@ -1209,7 +1212,7 @@ HIP runtime has the following functional improvements which improves runtime per

 #### Known issues

-* In `hipsparseSpSM_solve()`, the external buffer is passed as a parameter. This does not match the NVIDIA CUDA cuSPARSE API. This extra external buffer parameter will be removed in a future release. For now, this extra parameter can be ignored and nullptr passed in, because it is unused internally.
+* In `hipsparseSpSM_solve()`, the external buffer is passed as a parameter. This does not match the NVIDIA CUDA cuSPARSE API. This extra external buffer parameter will be removed in a future release. For now, this extra parameter can be ignored and nullptr passed in because it is unused internally.

 ### **hipSPARSELt** (0.2.4)

@@ -1403,9 +1406,9 @@ HIP runtime has the following functional improvements which improves runtime per

 #### Optimized

-* [BatchNorm] Optimized NHWC OpenCL kernels and improved heuristics
+* [BatchNorm] Optimized NHWC OpenCL kernels and improved heuristics.
 * [RNN] Dynamic algorithm optimization.
-* [Conv] Eliminated redundant clearing of output buffers
+* [Conv] Eliminated redundant clearing of output buffers.
 * [RNN] Updated selection heuristics.
 * Updated tuning for the AMD Instinct MI300 series.

@@ -1447,10 +1450,10 @@ HIP runtime has the following functional improvements which improves runtime per
 * Set a default of 112 channels for a single node with `8 * gfx950`.
 * Enabled LL128 protocol on the gfx950.
 * Added the ability to choose the unroll factor at runtime using `RCCL_UNROLL_FACTOR`. This can be set at runtime to 1, 2, or 4. This change currently increases compilation and linking time because it triples the number of kernels generated.
-* Added MSCCL support for AllGather multinode gfx942/gfx950 (for instance, 16 and 32 GPUs). To enable this feature, set the environment variable `RCCL_MSCCL_FORCE_ENABLE=1`. The maximum message size for MSCCL AllGather usage is `12292 * sizeof(datatype) * nGPUs`.
-* Thread thresholds for LL/LL128 are selected in Tuning Models for the AMD Instinct MI300X. This impacts the number of channels used for AG and RS. The channel tuning model is bypassed if `NCCL_THREAD_THRESHOLDS`, `NCCL_MIN_NCHANNELS`, or `NCCL_MAX_NCHANNELS` are set.
+* Added MSCCL support for AllGather multinode on the gfx942 and gfx950 (for instance, 16 and 32 GPUs). To enable this feature, set the environment variable `RCCL_MSCCL_FORCE_ENABLE=1`. The maximum message size for MSCCL AllGather usage is `12292 * sizeof(datatype) * nGPUs`.
+* Thread thresholds for LL/LL128 are selected in Tuning Models for the AMD Instinct MI300X. This impacts the number of channels used for AllGather and ReduceScatter. The channel tuning model is bypassed if `NCCL_THREAD_THRESHOLDS`, `NCCL_MIN_NCHANNELS`, or `NCCL_MAX_NCHANNELS` are set.
 * Multi-node tuning for AllGather, AllReduce, and ReduceScatter that leverages LL/LL64/LL128 protocols to use nontemporal vector load/store for tunable message size ranges.
-* LL/LL128 usage ranges for AR, AG, and RS are part of the tuning models, which enable architecture-specific tuning in conjunction with the existing Rome Models scheme in RCCL.
+* LL/LL128 usage ranges for AllReduce, AllGather, and ReduceScatter are part of the tuning models, which enable architecture-specific tuning in conjunction with the existing Rome Models scheme in RCCL.
 * Two new APIs are exposed as part of an initiative to separate RCCL code. These APIs are `rcclGetAlgoInfo` and `rcclFuncMaxSendRecvCount`. However, user-level invocation requires that RCCL be built with `RCCL_EXPOSE_STATIC` enabled.
 #### Changed
@@ -1465,7 +1468,7 @@ HIP runtime has the following functional improvements which improves runtime per
 * Resolved an issue when using more than 64 channels when multiple collectives are used in the same `ncclGroup()` call.
 * Fixed unit test failures in tests ending with the `ManagedMem` and `ManagedMemGraph` suffixes.
 * Fixed a suboptimal algorithmic switching point for AllReduce on the AMD Instinct MI300X.
-* Fixed the known issue "When splitting a communicator using `ncclCommSplit` in some GPU configurations, MSCCL initialization can cause a segmentation fault" with a design change to use `comm` instead of `rank` for `mscclStatus`. The Global map for `comm` to `mscclStatus` is still not thread safe but should be explicitly handled by mutexes for read-write operations. This is tested for correctness, but there is a plan to use a thread-safe map data structure in an upcoming release.
+* Fixed the known issue "When splitting a communicator using `ncclCommSplit` in some GPU configurations, MSCCL initialization can cause a segmentation fault" with a design change to use `comm` instead of `rank` for `mscclStatus`. The global map for `comm` to `mscclStatus` is still not thread safe but should be explicitly handled by mutexes for read-write operations. This is tested for correctness, but there is a plan to use a thread-safe map data structure in an upcoming release.

 ### **rocAL** (2.3.0)

@@ -1513,23 +1516,14 @@ HIP runtime has the following functional improvements which improves runtime per

 * gfx950 support.
 * Internal API logging for `gemm` debugging using `ROCBLAS_LAYER = 8`.
-* Support for AOCL 5.0 gcc build as a client reference library.
-* Allowing the use of `PkgConfig` for client reference library fallback detection.
+* Support for the AOCL 5.0 gcc build as a client reference library.
+* The use of `PkgConfig` for client reference library fallback detection.

 #### Changed

 * `CMAKE_CXX_COMPILER` is now passed on during compilation for a Tensile build.
 * The default atomics mode is changed from `allowed` to `not allowed`.

-#### Optimized
-
-* Optimized `gemm` by using `gemv` kernels when applicable.
-* Optimized `gemv` for small `m` and `n` with a large batch count on gfx942.
-* Improved the performance of Level 1 `dot` for all precisions and variants when `N > 100000000` on gfx942.
-* Improved the performance of Level 1 `asum` and `nrm2` for all precisions and variants on gfx942.
-* Improved the performance of Level 2 `sger` (single precision) on gfx942.
-* Improved the performance of Level 3 `dgmm` for all precisions and variants on gfx942.
-
 #### Removed

 * Support code for non-production gfx targets.
@@ -1539,6 +1533,15 @@ HIP runtime has the following functional improvements which improves runtime per
 * `rocblas_float8.h` and `rocblas_hip_f8_impl.h` files.
 * `rocblas_gemm_ex3`, `rocblas_gemm_batched_ex3`, and `rocblas_gemm_strided_batched_ex3` API functions.

+#### Optimized
+
+* Optimized `gemm` by using `gemv` kernels when applicable.
+* Optimized `gemv` for small `m` and `n` with a large batch count on gfx942.
+* Improved the performance of Level 1 `dot` for all precisions and variants when `N > 100000000` on gfx942.
+* Improved the performance of Level 1 `asum` and `nrm2` for all precisions and variants on gfx942.
+* Improved the performance of Level 2 `sger` (single precision) on gfx942.
+* Improved the performance of Level 3 `dgmm` for all precisions and variants on gfx942.
+
 #### Resolved issues

 * Fixed environment variable path-based logging to append multiple handle outputs to the same file.
@@ -2134,7 +2137,6 @@ The previous default accumulator types could lead to situations in which unexpec

 #### Optimized

-* Fixed corner cases that can produce NaNs in SYEVD, for valid input matrices.
 * Improved the performance of BDSQR and downstream functions, such as GESVD.
 * Improved the performance of STEQR and downstream functions, such as SYEV/HEEV.
 * Improved the performance of LARFT and downstream functions, such as GEQR2 and GEQRF.

From 0ae99ea21e6679f2cd1a9dbe22ba70170948f128 Mon Sep 17 00:00:00 2001
From: Pratik Basyal
Date: Thu, 21 Aug 2025 16:02:27 -0400
Subject: [PATCH 16/58] Indentation and formatting updated (#517)

From 91c26c502d0357af0019e71c440ffc45af80e7cb Mon Sep 17 00:00:00 2001
From: Matt Williams
Date: Thu, 21 Aug 2025 18:02:31 -0400
Subject: [PATCH 17/58] Updating license for AQLprofile

---
 docs/about/license.md | 10 ++++------
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/docs/about/license.md b/docs/about/license.md
index 3ab8b3544..91dbca114 100644
--- a/docs/about/license.md
+++ b/docs/about/license.md
@@ -29,6 +29,7 @@ additional licenses. Please review individual repositories for more information.
 | [AMD SMI](https://github.com/ROCm/amdsmi) | [MIT](https://github.com/ROCm/amdsmi/blob/amd-staging/LICENSE) |
 | [aomp](https://github.com/ROCm/aomp/) | [Apache 2.0](https://github.com/ROCm/aomp/blob/aomp-dev/LICENSE) |
 | [aomp-extras](https://github.com/ROCm/aomp-extras/) | [MIT](https://github.com/ROCm/aomp-extras/blob/aomp-dev/LICENSE) |
+| [AQLprofile] | [MIT](https://github.com/ROCm/aqlprofile/blob/amd-staging/LICENSE) |
 | [Code Object Manager (Comgr)](https://github.com/ROCm/llvm-project/tree/amd-staging/amd/comgr) | [The University of Illinois/NCSA](https://github.com/ROCm/llvm-project/blob/amd-staging/amd/comgr/LICENSE.txt) |
 | [Composable Kernel](https://github.com/ROCm/composable_kernel) | [MIT](https://github.com/ROCm/composable_kernel/blob/develop/LICENSE) |
 | [half](https://github.com/ROCm/half/) | [MIT](https://github.com/ROCm/half/blob/rocm/LICENSE.txt) |
@@ -46,7 +47,6 @@ additional licenses. Please review individual repositories for more information.
 | [hipSPARSE](https://github.com/ROCm/hipSPARSE/) | [MIT](https://github.com/ROCm/hipSPARSE/blob/develop/LICENSE.md) |
 | [hipSPARSELt](https://github.com/ROCm/hipSPARSELt/) | [MIT](https://github.com/ROCm/hipSPARSELt/blob/develop/LICENSE.md) |
 | [hipTensor](https://github.com/ROCm/hipTensor) | [MIT](https://github.com/ROCm/hipTensor/blob/develop/LICENSE) |
-| hsa-amd-aqlprofile | [AMD Software EULA](https://www.amd.com/en/legal/eula/amd-software-eula.html) |
 | [llvm-project](https://github.com/ROCm/llvm-project/) | [Apache](https://github.com/ROCm/llvm-project/blob/amd-staging/LICENSE.TXT) |
 | [llvm-project/flang](https://github.com/ROCm/llvm-project/tree/amd-staging/flang) | [Apache 2.0](https://github.com/ROCm/llvm-project/blob/amd-staging/flang/LICENSE.TXT) |
 | [MIGraphX](https://github.com/ROCm/AMDMIGraphX/) | [MIT](https://github.com/ROCm/AMDMIGraphX/blob/develop/LICENSE) |
@@ -132,12 +132,10 @@ companies.
 ### Package licensing

 :::{attention}
-AQL Profiler and AOCC CPU optimization are both provided in binary form, each
-subject to the license agreement enclosed in the directory for the binary available
-in `/opt/rocm/share/doc/hsa-amd-aqlprofile/EULA`. By using, installing,
-copying or distributing AQL Profiler and/or AOCC CPU Optimizations, you agree to
+ROCprof Trace Decoder and AOCC CPU optimizations are provided in binary form, subject to the license agreement enclosed on [GitHub](https://github.com/ROCm/rocprof-trace-decoder/blob/amd-mainline/LICENSE) for ROCprof Trace Decoder, and [Developer Central](https://www.amd.com/en/developer/aocc.html) for AOCC. By using, installing,
+copying or distributing ROCprof Trace Decoder or AOCC CPU Optimizations, you agree to
 the terms and conditions of this license agreement. If you do not agree to the
-terms of this agreement, do not install, copy or use the AQL Profiler and/or the
+terms of this agreement, do not install, copy or use ROCprof Trace Decoder or the
 AOCC CPU Optimizations.
 :::

From 2ec8757ffaaff9bccadb9163825c4144ae1e149e Mon Sep 17 00:00:00 2001
From: Pratik Basyal
Date: Thu, 21 Aug 2025 18:51:57 -0400
Subject: [PATCH 18/58] Post RC4 RN 700 update (#513)

* Indentation and formatting updated

* Rc4 compute profiler version update

* Editorial changes in changelog

* Changelog and compatibility matrix updated

* ROCProfiler-SDK highlight update

* az and ol added to wordlist

* updated with newer info fr from migraphx

* fixed a formatting error

* Release date updated

* ROCProfiler-SDK highlight updated

* Changelog update

* Changelog update

* Release notes feedback

* Release notes update

---------

Co-authored-by: spolifroni-amd
---
 .wordlist.txt | 2 +
 RELEASE.md | 483 +++++++++---------
 .../compatibility-matrix-historical-6.0.csv | 266 +++++-----
 docs/compatibility/compatibility-matrix.rst | 141 ++---
 docs/conf.py | 2 +-
 5 files changed, 444 insertions(+), 450 deletions(-)

diff --git a/.wordlist.txt b/.wordlist.txt
index c32752d7c..5377d2eef 100644
--- a/.wordlist.txt
+++ b/.wordlist.txt
@@ -546,6 +546,7 @@ autogenerated
 autotune
 avx
 awk
+az
 backend
 backends
 bb
@@ -763,6 +764,7 @@ opencv
 openmp
 openssl
 optimizers
+ol
 os
 oversubscription
 pageable
diff --git a/RELEASE.md b/RELEASE.md
index a443eae81..a4e7de681 100644
--- a/RELEASE.md
+++ b/RELEASE.md
@@ -38,33 +38,35 @@ documentation to verify compatibility and system requirements.
 The following are notable new features and improvements in ROCm 7.0.0. For changes to individual components, see
 [Detailed component changes](#detailed-component-changes).

-### HIP API compatibility improvements
+### Operating system, hardware, and virtualization support changes

-To improve code portability between AMD ROCm and other programming models, HIP API has been updated in ROCm 7.0 to simplify cross-platform programming. These changes are incompatible with prior ROCm releases and might require recompiling existing HIP applications for use with ROCm 7.0.
For more information, see the [HIP API 7.0 changes](https://rocm.docs.amd.com/projects/HIP/en/docs-develop/hip-7-changes.html) and the [HIP changelog](#hip-7-0-0) below.
+ROCm 7.0.0 adds support for [AMD Instinct MI355X](https://www.amd.com/en/products/accelerators/instinct/mi350/mi355x.html) and [MI350X](https://www.amd.com/en/products/accelerators/instinct/mi350/mi350x.html). For details, see the full list of [Supported GPUs (Linux)](https://rocm.docs.amd.com/projects/install-on-linux-internal/en/latest/reference/system-requirements.html#supported-gpus).

-### HIP runtime updates
+ROCm 7.0.0 adds support for the following operating systems and kernel versions:

-The HIP runtime now includes support for:
+* Ubuntu 24.04.3 (kernel: 6.8 [GA], 6.14 [HWE])
+* RHEL 10 (kernel: 6.12.0-55)
+* Oracle Linux 10 (kernel: 6.12.0 UEK)
+* Rocky 9 (kernel: 5.14.0-570)

-* Open Compute Project (OCP) MX floating-point `FP4`, `FP6`, and `FP8` data types and APIs.
-* Improved logging by adding more precise pointer information and launch arguments for better tracking and debugging in dispatch methods.
-* `constexpr` operators for `FP16` and `BF16`.
-* `__syncwarp` operation.
-* The `_sync()` version of crosslane builtins such as `shfl_sync()` and `__reduce_add_sync` are enabled by default. These can be disabled by setting the preprocessor macro `HIP_DISABLE_WARP_SYNC_BUILTINS`.
+ROCm 7.0.0 marks the end of support (EoS) for Ubuntu 24.04.2 (kernel: 6.8 [GA], 6.11 [HWE]).

-In addition, the HIP runtime includes functional improvements, which improves functionality, runtime performance, and user experience. For more information, see [HIP changelog](#hip-7-0-0) below.
-
-### Instinct Driver/ROCm packaging separation
-
-The Instinct Driver is now distributed separately from the ROCm software stack and is stored under in its own location ``/amdgpu/`` in the package repository at [repo.radeon.com](https://repo.radeon.com/amdgpu/).
The first release is designated as Instinct Driver version 30.10. See [ROCm Gets Modular: Meet the Instinct Datacenter GPU Driver](https://rocm.blogs.amd.com/ecosystems-and-partners/instinct-gpu-driver/README.html) for more information.
-
-### Deep learning and AI framework support improvements
-
-ROCm 7.0 introduces several newly supported versions of Deep learning and AI frameworks. For more information, see [Installting Deep learning frameworks for ROCm](https://rocm.docs.amd.com/en/latest/how-to/deep-learning-rocm.html).
+For more information about supported operating systems, see [Supported operating systems](https://rocm.docs.amd.com/projects/install-on-linux-internal/en/latest/reference/system-requirements.html#supported-operating-systems) and [install instructions](https://rocm.docs.amd.com/projects/install-on-linux-internal/en/latest/).

 See the [Compatibility
 matrix](../../docs/compatibility/compatibility-matrix.rst)
-for the complete list of Deep learning and AI framework versions tested for compatibility with ROCm.
+for more information about operating system and hardware compatibility.
+
+#### Virtualization support
+
+ROCm 7.0 introduces support for KVM Passthrough for AMD Instinct MI350X and MI355X accelerators.
+
+All KVM-based SR-IOV supported configurations require the GIM SR-IOV driver version 8.4.0.K. In addition, support for VMware ESXi 8 has been introduced for select AMD accelerators. For more information, see [Virtualization Support](https://rocm.docs.amd.com/projects/install-on-linux-internal/en/latest/reference/system-requirements.html#virtualization-support).
+
+### Deep learning and AI framework updates
+
+ROCm 7.0 introduces several newly supported versions of Deep learning and AI frameworks.
For more information, see [Deep learning frameworks for ROCm](https://rocm.docs.amd.com/en/latest/how-to/deep-learning-rocm.html) and the [Compatibility
+matrix](../../docs/compatibility/compatibility-matrix.rst) for the complete list of Deep learning and AI framework versions tested for compatibility with ROCm.

 #### PyTorch

@@ -89,12 +91,14 @@ Megatron-LM for ROCm now supports:

 * Fused_bias_swiglu kernel.

-For more information, see [Training a model with Megatron-LM for ROCm](https://rocm.docs.amd.com/en/latest/how-to/rocm-for-ai/training/benchmark-docker/megatron-lm.html?model=pyt_megatron_lm_train_llama-3.3-70b).
-
-#### Tensorflow
+#### TensorFlow

 ROCm 7.0 enables support for TensorFlow 2.19.1.

+#### ONNX Runtime
+
+ROCm 7.0 enables support for ONNX Runtime 1.22.1.
+
 #### vLLM

 * Support for Open Compute Project (OCP) `FP8` data type.
@@ -102,7 +106,30 @@ ROCm 7.0 enables support for TensorFlow 2.19.1.

 #### Triton

-ROCm 7.0 enables support for support for Triton 3.3.0.
+ROCm 7.0 enables support for Triton 3.3.0.
+
+### Instinct Driver/ROCm packaging separation
+
+The Instinct Driver is now distributed separately from the ROCm software stack and is stored under in its own location ``/amdgpu/`` in the package repository at [repo.radeon.com](https://repo.radeon.com/amdgpu/). The first release is designated as Instinct Driver version 30.10. See [ROCm Gets Modular: Meet the Instinct Datacenter GPU Driver](https://rocm.blogs.amd.com/ecosystems-and-partners/instinct-gpu-driver/README.html) for more information.
+
+### HIP API compatibility improvements
+
+To improve code portability between AMD ROCm and other programming models, HIP API has been updated in ROCm 7.0 to simplify cross-platform programming. These changes are incompatible with prior ROCm releases and might require recompiling existing HIP applications for use with ROCm 7.0.
For more information, see the [HIP API 7.0 changes](https://rocm.docs.amd.com/projects/HIP/en/docs-develop/hip-7-changes.html) and the [HIP changelog](#hip-7-0-0) below.
+
+### HIP runtime updates
+
+The HIP runtime now includes support for:
+
+* Open Compute Project (OCP) MX floating-point `FP4`, `FP6`, and `FP8` data types and APIs.
+* Improved logging by adding more precise pointer information and launch arguments for better tracking and debugging in dispatch methods.
+* `constexpr` operators for `FP16` and `BF16`.
+* `__syncwarp` operation.
+* The `_sync()` version of crosslane builtins such as `shfl_sync()` are enabled by default. These can be disabled by setting the preprocessor macro `HIP_DISABLE_WARP_SYNC_BUILTINS`.
+* Added warp level primitives: `__syncwarp` and reduce intrinsics (e.g. `__reduce_add_sync()`).
+* Extended fine grained system memory pool.
+* A new attribute in HIP runtime was implemented which exposes a new device capability of how many compute dies (chiplets, xcc) are available on a given GPU. Developers can get this attribute via the API `hipDeviceGetAttribute`, to make use of the best cache locality in a kernel, and optimize the Kernel launch grid layout, for performance improvement.
+
+In addition, the HIP runtime includes functional improvements, which improves functionality, runtime performance, and user experience. For more information, see [HIP changelog](#hip-7-0-0) below.

 ### Compiler changes and improvements

@@ -140,13 +167,13 @@ Key compiler enhancements include:

 #### New data type support

-MX-compliant data types bring microscaling support to ROCm. For more information, see the [OCP Microscaling (MX) Formats Specification](https://www.opencompute.org/documents/ocp-microscaling-formats-mx-v1-0-spec-final-pdf). The ROCm 7.0 enables functional support for MX data types `FP4`, `FP6`, and `FP8` on AMD MI355X systems in these ROCm libraries:
+MX-compliant data types bring microscaling support to ROCm.
For more information, see the [OCP Microscaling (MX) Formats Specification](https://www.opencompute.org/documents/ocp-microscaling-formats-mx-v1-0-spec-final-pdf). The ROCm 7.0 enables functional support for MX data types `FP4`, `FP6`, and `FP8` on AMD Instinct MI350 series accelerators in these ROCm libraries:

 * Composable Kernel (`FP4`, `FP6`, and `FP8` only)
 * hipBLASLt
 * MIGraphX (`FP4` only)

-The following libraries are updated to support the Open Compute Project (OCP) floating-point `FP8` format on AMD Instinct MI355X instead of the NANOO `FP8` format:
+The following libraries are updated to support the Open Compute Project (OCP) floating-point `FP8` format on AMD Instinct MI350 series accelerators instead of the NANOO `FP8` format:

 * Composable Kernel
 * hipBLASLt
@@ -166,9 +193,8 @@ For more information about hipBLASLt changes, see the [hipBLASLt changelog](#hip

 #### MIGraphX support

-* Support for OCP `FP8` and MX `FP4` data types on AMD Instinct MI350X and MI355X accelerators.
-* Support for `BF16` on all hardware
-* Support for PyTorch 2.7 via Torch-MIGraphX
+* Support for OCP `FP8` on AMD Instinct MI350X and MI355X accelerators.
+* Support for PyTorch 2.7 via Torch-MIGraphX.

 For more information about MIGraphX changes, see the [MIGraphX changelog](migraphx-2-13-0) below.

@@ -185,7 +211,7 @@ See the [rocSHMEM changelog](#rocshmem-3-0-0) for more details.
 Key enhancements to AMD SMI include the ability to reload the AMD GPU driver from the CLI or API. The `amd-smi` command-line interface gains a new default view, `amd-smi` topology support in guest environments, and performance optimizations. Additionally, AMD SMI library APIs
-have been refined for improved usability. See the [AMDSMI changelog](#amdsmi-26-0-0) for more details.
+have been refined for improved usability. See the [AMD SMI changelog](#amdsmi-26-0-0) for more details.
 #### ROCgdb

@@ -203,7 +229,7 @@ ROCm Compute Profiler includes the following key changes:
 * Roofline distinction for `FP32` and `FP64` data types.
 * Selective kernel profiling.

-See the [ROCm Compute Profiler changelog](#rocm-compute-profiler-3-2-1) for more details.
+See the [ROCm Compute Profiler changelog](#rocm-compute-profiler-3-2-3) for more details.

 #### ROCm Data Center (RDC) improvements

@@ -227,19 +253,24 @@ See the [ROCm Validation Suite changelog](#rocm-validation-suite-1-2-0) for more
 #### ROCprofiler-SDK

 ##### Core SDK enhancements
-
+
 * ROCprofiler-SDK is now compatible with the HIP 7.0 API.
-* Added stochastic and host-trap PC sampling support for all AMD Instinct MI300 series accelerators.
-* Added support for tracing KFD events.
-* API for profiling applications using thread traces (beta).
-
+* ROCprofiler-SDK adds support for AMD Instinct MI350X and MI355X accelerators.
+* The stochastic and host-trap PC sampling support has been added for all AMD Instinct MI300 and MI350 series accelerators, which
+provides information particularly useful for understanding stalls during kernel execution.
+* The added support for tracing events surfaced by AMD's Kernel Fusion Driver (KFD) captures low level driver routines involved in mapping, invalidation, and migration of data between CPU and GPU memories. Such events are central to the support for [Unified Memory](https://rocm.docs.amd.com/projects/HIP/en/latest/how-to/hip_runtime_api/memory_management/unified_memory.html) on AMD systems. Tracing of KFD events helps to detect performance problems arising from excessive data migration.
+* New APIs are added for profiling applications using thread traces (beta)
+which facilitates profiling wavefronts at the instruction timing level.
+
 ##### rocpd
-
-Support has been added for the ROCm Profiling Data (rocpd) output format, which is now the default format for ``rocprofv3``.
A subproject of the ROCprofiler-SDK, rocpd enables saving profiling results to a SQLite3 database, providing a structured and efficient foundation for analysis and post-processing. - + +The ROCm Profiling Data (``rocpd``) is now the default output format for ``rocprofv3``. +A subproject of the ROCprofiler-SDK, ``rocpd`` enables saving profiling results to a SQLite3 database, providing a structured and +efficient foundation for analysis and post-processing. + ##### rocprofv3 CLI tool enhancements - -* Added stochastic and host-trap PC sampling support for all AMD Instinct MI300 series accelerators. + +* Added stochastic and host-trap PC sampling support for all AMD Instinct MI300 and MI350 series accelerators. * HIP streams translate to Queues in Time Traces in Perfetto output. * Support for thread trace service. @@ -301,29 +332,6 @@ ROCm documentation continues to be updated to provide clearer and more comprehen * Modern computing tasks often require balancing numerical precision against hardware resources and processing speed. Low precision floating point number formats in HIP include `FP4` (4-bit) and `FP6` (6-bit), which reduce memory and bandwidth requirements. For more information, see the updated [Low precision floating point types](https://rocm.docs.amd.com/projects/HIP/en/docs-develop/reference/low_fp_types.html) topic. -## Operating system, hardware, and virtualization support changes - -ROCm 7.0.0 adds support for the following operating systems and kernel versions: - -* Ubuntu 24.04.3 (kernel: 6.8 [GA], 6.14 [HWE]) -* RHEL 10.0 (kernel: 6.12) -* Oracle Linux 10 (kernel: 6.12 UEK) -* Rocky 9 (kernel: 5.14+ B/P from 6.11/6.12) - -For more information about supported operating systems, see [Supported operating systems](https://rocm.docs.amd.com/projects/install-on-linux/en/latest/reference/system-requirements.html). 
- -ROCm 7.0.0 marks the end of support (EoS) for Ubuntu 24.04.2 (kernel: 6.8 [GA], 6.11 [HWE]) - -ROCm 7.0.0 adds support for [AMD Instinct MI355X](https://www.amd.com/en/products/accelerators/instinct/mi350/mi355x.html) and [MI350X](https://www.amd.com/en/products/accelerators/instinct/mi350/mi350x.html). For details, see the full list of [Supported GPUs (Linux)](https://rocm.docs.amd.com/projects/install-on-linux-internal/en/latest/reference/system-requirements.html#supported-gpus). - -See the [Compatibility -matrix](../../docs/compatibility/compatibility-matrix.rst) -for more information about operating system and hardware compatibility. - -### Virtualization support - -ROCm 7.0 introduces support for KVM-based SR-IOV for select Instinct accelerators. All supported configurations require the GIM SR-IOV driver version 8.3.0K. In addition, support for VMware ESXi 8 has been introduced for select AMD accelerators. For more information, see [Virtualization Support](https://rocm.docs.amd.com/projects/install-on-linux-internal/en/latest/reference/system-requirements.html#virtualization-support). - ## ROCm components The following table lists the versions of ROCm components for ROCm 7.0.0, including any version @@ -558,7 +566,7 @@ Click {fab}`github` to go to the component's source code on GitHub. ROCm Compute Profiler - 3.1.1 ⇒ 3.2.1 + 3.1.1 ⇒ 3.2.3 @@ -808,23 +816,25 @@ See the full [AMD SMI changelog](https://github.com/ROCm/amdsmi/blob/release/roc #### Added -* Added support for `BF16`, `F32`, and `F16` for 2D and 3D NGCHW grouped convolution backward data. -* Added a fully asynchronous HOST (CPU) arguments copy flow for CK grouped GEMM kernels. -* Added support GKCYX layout for grouped convolution forward (NGCHW/GKCYX/NGKHW, number of instances in instance factory for NGCHW/GKYXC/NGKHW has been reduced). -* Added support for GKCYX layout for grouped convolution forward (NGCHW/GKCYX/NGKHW). 
-* Added support for GKCYX layout for grouped convolution backward weight (NGCHW/GKCYX/NGKHW). -* Added support for GKCYX layout for grouped convolution backward data (NGCHW/GKCYX/NGKHW). -* Added support for Stream-K version of mixed `FP8` / `BF16` GEMM. -* Added support for Multiple D GEMM. -* Added GEMM pipeline for microscaling (MX) `FP8` / `FP6` / `FP4` data types -* Added support for `FP16` 2:4 structured sparsity to universal GEMM. -* Added support for Split K for grouped convolution backward data. -* Added logit soft-capping support for fMHA forward kernels. -* Added support for hdim as a multiple of 32 for FMHA (fwd/fwd_splitkv). -* Added benchmarking support for tile engine GEMM. -* Added Ping-pong scheduler support for GEMM operation along the K dimension. -* Added rotating buffer feature for CK_Tile GEMM. -* Added int8 support for CK_TILE GEMM. +* Support for `BF16`, `F32`, and `F16` for 2D and 3D NGCHW grouped convolution backward data. +* Fully asynchronous HOST (CPU) arguments copy flow for CK grouped GEMM kernels. +* Support for GKCYX layout for grouped convolution forward (NGCHW/GKCYX/NGKHW, number of instances in instance factory for NGCHW/GKYXC/NGKHW has been reduced). +* Support for GKCYX layout for grouped convolution forward (NGCHW/GKCYX/NGKHW). +* Support for GKCYX layout for grouped convolution backward weight (NGCHW/GKCYX/NGKHW). +* Support for GKCYX layout for grouped convolution backward data (NGCHW/GKCYX/NGKHW). +* Support for Stream-K version of mixed `FP8` / `BF16` GEMM. +* Support for Multiple D GEMM. +* GEMM pipeline for microscaling (MX) `FP8` / `FP6` / `FP4` data types. +* Support for `FP16` 2:4 structured sparsity to universal GEMM. +* Support for Split K for grouped convolution backward data. +* Logit soft-capping support for fMHA forward kernels. +* Support for hdim as a multiple of 32 for FMHA (fwd/fwd_splitkv). +* Benchmarking support for tile engine GEMM. +* Ping-pong scheduler support for GEMM operation along the K dimension.
+* Rotating buffer feature for CK_Tile GEMM. +* `int8` support for CK_TILE GEMM. +* Vectorize Transpose optimization for CK Tile. +* Asynchronous copy for gfx950. #### Changed @@ -840,9 +850,7 @@ See the full [AMD SMI changelog](https://github.com/ROCm/amdsmi/blob/release/roc #### Optimized -* Optimize the GEMM multiply preshuffle and lds bypass with Pack of KGroup and better instruction layout. -* Added Vectorize Transpose optimization for CK Tile. -* Added the asynchronous copy for gfx950. +* Optimized the GEMM multiply preshuffle and lds bypass with Pack of KGroup and better instruction layout. ### **HIP** 7.0.0 @@ -872,8 +880,7 @@ functions added for logical reduction. For details, see [Warp cross-lane functio * A new attribute in HIP runtime was implemented which exposes a new device capability of how many compute dies (chiplets, xcc) are available on a given GPU. Developers can get this attribute via the API `hipDeviceGetAttribute`, to make use of the best cache locality in a kernel, and optimize the Kernel launch grid layout, for performance improvement. #### Changed -* Deprecated GPUs. -Some unsupported GPUs such as gfx9, gfx8 and gfx7 are deprecated on Microsoft Windows. +* Some unsupported GPUs such as gfx9, gfx8 and gfx7 are deprecated on Microsoft Windows. * Removal of Beta warnings in HIP Graph APIs All Beta warnings in usage of HIP Graph APIs are removed, they are now officially and fully supported. * Behavior changes @@ -1304,88 +1311,69 @@ HIP runtime has the following functional improvements which improves runtime per ### **MIGraphX** (2.13.0) -### Added +#### Added -* Support for OCP `FP8` and MX `FP4` data types on AMD Instinct MI350X and MI355X accelerators. -* Support for `BF16` on all hardware. +* Support for OCP `FP8` on AMD Instinct MI350X accelerators. * Support for PyTorch 2.7 via Torch-MIGraphX. -* Contrib Operators for Microsoft ONNX: Attention, RotaryEmbedding, QuickGelu, BiasAdd, BiasSplitGelu, skipLayerNorm. 
-* TensorFlow Operator: Sigmoid, AddN. -* GroupQuery Attention for LLM support . +* Support for the Microsoft ONNX Contrib Operators (Self) Attention, RotaryEmbedding, QuickGelu, BiasAdd, BiasSplitGelu, SkipLayerNorm. +* Support for Sigmoid and AddN TensorFlow operators. +* Added GroupQuery Attention support for LLMs. * Added support for edge mode in the ONNX Pad operator. -* Support additional types for linear Resize operator. -* Added bitonic topk ONNX operator. -* Added onnx runtime python driver +* Added ONNX runtime Python driver. * Added FLUX e2e example. -* Added API to save and load arguments. -* Added quantize_bf16 to C api output. +* Added C++ and Python APIs to save arguments to a graph as a msgpack file, and then read the file back. * Added rocMLIR fusion for kv-cache attention. +* Introduced a check for file-write errors. -### Changed +#### Changed -* Print Kernel/Module in Compile Failure. -* Use hipblaslt instead of rocBLAS for newer GPU asics. -* Normalize standard input shapes for rocBLAS. -* Updated Stable Diffusion example to use torch 6.3. -* Rewrite 1x1 convolutions to gemm. -* Make version header public. -* Represent `BF16::max` by its encoding, rather than the expected value. -* Direct warnings to cout, instead into cerr. -* Use vector instead of `set` for implicit deps. -* Disable layernorm by default. -* Update timing in compile_ops() to use common average +* `quantize_bf16` for quantizing the model to `BF16` has been made visible in the MIGraphX user API. +* Print additional kernel/module information in the event of compile failure. +* Use hipBLASLt instead of rocBLAS on newer GPUs. +* 1x1 convolutions are now rewritten to GEMMs. +* `BF16::max` is now represented by its encoding rather than its expected value. +* Direct warnings now go to `cout` rather than `cerr`. +* `FP8` uses hipBLASLt rather than rocBLAS. +* ONNX models are now topologically sorted when nodes are unordered. +* Improved layout of Graphviz output.
+* Enhanced debugging for migraphx-driver: consumed environment variables are printed, timestamps and duration are added to the summary. +* Add a trim size flag to the verify option for migraphx-driver. +* Node names are printed to track parsing within the ONNX graph when using the `MIGRAPHX_TRACE_ONNX_PARSER` flag. +* Update accuracy checker to output test data with the `--show-test-data` flag. +* The `MIGRAPHX_TRACE_BENCHMARKING` option now allows the problem cache file to be updated after finding the best solution. -### Removed +#### Removed -* DPP for v_add_f64 as it is unsupported. -* rocBLAS bug workaround for solution index. -* ROCM_USE_FLOAT8 macro. -* rocBLAS `FP8`, always use hipBlasLt. -* Call to hipGetMemoryInfo when checking free memory based on feedback from HIP team. +* `ROCM_USE_FLOAT8` macro. +* The BF16 GEMM test was removed for Navi21, as it is unsupported by rocBLAS and hipBLASLt on that platform. -### Optimized +#### Optimized -* Layout convolution as NHWC or NCHW only -* einsum: conditionally do squeeze before transpose -* Update problem cache as configs are benchmarked -* Enable debug assertions in libstdc++ -* Topologically sort onnx models if nodes are unordered -* Use time_loop function to measure time for exhaustive tune runs +* Use common average in `compile_ops` to reduce run-to-run variations when tuning. +* Improved the performance of the TopK operator. +* Conform to a single layout (NHWC or NCHW) during compilation rather than combining two. 
* Slice Channels Conv Optimization (slice output fusion) -* Horiz fuse after pointwise -* GridSample Linear Sampler Refactor -* find_splits::is_dependent refactor -* Visually improved the output from Graphviz -* Print MigraphX consumed Env Variables when using the migraphx-driver -* Add timestamps and duration when printing the summary of migraphx-driver -* Add a trim size flag to the verify option for migraphx-driver -* Print node names, to track parsing within the onnx graph when using the MIGRAPHX_TRACE_ONNX_PARSER flag -* Embed onnx/tf files for api tests -* Fuse multiple outputs for pointwise ops -* Fuse reshapes on pointwise inputs for mlir output fusion -* Print MIGRAPHX ENV Variables at end of summary -* Update accuracy checker to spit out test data with --show-test-data flag -* Dont fold mul with gemm when the gemm is used more than once -* Detect when parallel stl is not parallel and enable when it is in parallel -* Dont fuse broadcast after conv/gemm in mlir -* Avoid the fusion (in reduction) when operator data-types mismatch +* Horizontal fusion optimization after pointwise operations. +* Reduced the number of literals used in `GridSample` linear sampler. +* Fuse multiple outputs for pointwise operations. +* Fuse reshapes on pointwise inputs for MLIR output fusion. +* MUL operation not folded into the GEMM when the GEMM is used more than once. +* Broadcast not fused after convolution or GEMM MLIR kernels. +* Avoid reduction fusion when operator data-types mismatch. -### Resolved issues +#### Resolved issues -* Workaround ICE in clang 20 when using views::transform. -* Fix bug with reshape_lazy in MLIR. -* Quantizelinear nearbyint fix. -* Add case for empty strings in node inputs for ops like resize. -* Parse resize fix: only check "keep_aspect_ratio_policy" attribute for sizes input. -* Fix Layernorm and SimplifiedLayernorm onnx parsers. -* nonmaxsuppression: identical boxes/scores not ordered correctly. -* Gcc/G++ compilation fix. 
-* Bug fix: events would get created on the wrong device in a multi-gpu scenario. -* Check for file-write errors. -* Fix out of order keys in value for comparisons and hashes when caching best kernels. -* Make checking env variables thread-safe again. -* [controlnet] Fixed mul: Types do not match. -* Fix check for scales if presenting roi in Resize op. +* Compilation workaround ICE in clang 20 when using `views::transform`. +* Fix bug with `reshape_lazy` in MLIR. +* Quantizelinear fixed for Nearbyint operation. +* Check for empty strings in ONNX node inputs for operations like Resize. +* Parse Resize fix: only check `keep_aspect_ratio_policy` attribute for sizes input. +* Nonmaxsuppression: fixed issue where identical boxes/scores not ordered correctly. +* Fixed a bug where events were created on the wrong device in a multi-gpu scenario. +* Fixed out of order keys in value for comparisons and hashes when caching best kernels. +* Fixed Controlnet MUL types do not match error. +* Fixed check for scales if ROI input is present in Resize operation. +* Einsum: Fixed a crash on empty squeeze operations. ### **MIOpen** (3.5.0) @@ -1627,6 +1615,16 @@ HIP runtime has the following functional improvements which improves runtime per * Fixed kernel faults on multi-device transforms that gather to a single device, when the input/output bricks are not contiguous. +### **ROCgdb** (16.3) + +#### Added + +- Support for the `gfx950`, `gfx1150`, and `gfx1151` architectures. + +#### Removed + +- Support for the `gfx940` and `gfx941` architectures. + ### **rocJPEG** (1.1.0) #### Added @@ -1648,56 +1646,26 @@ HIP runtime has the following functional improvements which improves runtime per ### **ROCm Bandwidth Test** (2.6.0) -### Added +#### Added * Plugin architecture: - * `rocm_bandwidth_test` is now the **framework** for individual `plugins` and features. 
The `framework` is available at: `/opt/rocm/bin/` + * `rocm_bandwidth_test` is now the `framework` for individual `plugins` and features. The `framework` is available at: `/opt/rocm/bin/` - * Individual `plugins`: The **plugins (shared libraries)** are available at: `/opt/rocm/lib/rocm_bandwidth_test/plugins/` + * Individual `plugins`: The `plugins` (shared libraries) are available at: `/opt/rocm/lib/rocm_bandwidth_test/plugins/` ```{note} Review the [README](https://github.com/ROCm/rocm_bandwidth_test/blob/release/rocm-rel-7.0/README.md) file for details about the new options and outputs. ``` -### Changed +#### Changed * The `CLI` and options/parameters have changed due to the new plugin architecture, where the plugin parameters are parsed by the plugin. -### Removed +#### Removed - The old CLI, parameters, and switches used. -### Known Issues - -- MI350: Crashes due to HIP gfx support. - - -### **ROCm SMI** (7.8.0) - -#### Added - -- Support for GPU metrics 1.8. - - Added new fields for `rsmi_gpu_metrics_t` including: - - Adding the following metrics to allow new calculations for violation status: - - Per XCP metrics `gfx_below_host_limit_ppt_acc[XCP][MAX_XCC]` - GFX Clock Host limit Package Power Tracking violation counts - - Per XCP metrics `gfx_below_host_limit_thm_acc[XCP][MAX_XCC]` - GFX Clock Host limit Thermal (TVIOL) violation counts - - Per XCP metrics `gfx_low_utilization_acc[XCP][MAX_XCC]` - violation counts for how did low utilization caused the GPU to be below application clocks. - - Per XCP metrics `gfx_below_host_limit_total_acc[XCP][MAX_XCC]`- violation counts for how long GPU was held below application clocks any limiter (see above new violation metrics). - - Increasing available JPEG engines to 40. - Current ASICs may not support all 40. These will be indicated as UINT16_MAX or N/A in CLI. - -#### Removed - -- Removed backwards compatibility for `rsmi_dev_gpu_metrics_info_get()`'s `jpeg_activity` and `vcn_activity` fields. 
Alternatively use `xcp_stats.jpeg_busy` and `xcp_stats.vcn_busy`. - - Backwards compability is removed for `jpeg_activity` and `vcn_activity` fields, if the `jpeg_busy` or `vcn_busy` field is available. - - Providing both `vcn_activity`/`jpeg_activity` and XCP (partition) stats `vcn_busy`/`jpeg_busy` caused confusion for users about which field to use. By removing backward compatibility, it is easier to identify the relevant field. - - The `jpeg_busy` field increased in size (for supported ASICs), making backward compatibility unable to fully copy the structure into `jpeg_activity`. - -```{note} -See the full [ROCm SMI changelog](https://github.com/ROCm/rocm_smi_lib/blob/release/rocm-rel-7.0/CHANGELOG.md) for details, examples, and in-depth descriptions. -``` - -### **ROCm Compute Profiler** (3.2.1) +### **ROCm Compute Profiler** (3.2.3) #### Added @@ -1792,10 +1760,10 @@ See the full [ROCm SMI changelog](https://github.com/ROCm/rocm_smi_lib/blob/rele * Fixed kernel name and kernel dispatch filtering when using ``rocprofv3``. * Fixed an issue of TCC channel counters collection in ``rocprofv3``. -* Fixed peak FLOPS of `F8`, `I8`, `F16`, and `BF16` on AMD Instinct MI 300. +* Fixed peak FLOPS of `F8`, `I8`, `F16`, and `BF16` on AMD Instinct MI300. * Fixed not detecting memory clock issue when using amd-smi * Fixed standalone GUI crashing -* Fixed L2 read/write/atomic bandwidths on MI350 +* Fixed L2 read/write/atomic bandwidths on AMD Instinct MI350 series. #### Known issues @@ -1822,17 +1790,6 @@ See the full [ROCm SMI changelog](https://github.com/ROCm/rocm_smi_lib/blob/rele * MongoDB database support will be removed, and a deprecation warning has been added to the application interface. * Usage of ``rocm-smi`` is deprecated in favor of ``amd-smi``, and a deprecation warning has been added to the application interface. -### **ROCgdb** (16.3) - -#### Added - -- Support for the `gfx950`, `gfx1150`, and `gfx1151` architectures. 
- -#### Removed - -- Support for the `gfx940` and `gfx941` architectures. - - ### **ROCm Data Center Tool** (1.1.0) #### Added @@ -1853,6 +1810,31 @@ See the full [ROCm SMI changelog](https://github.com/ROCm/rocm_smi_lib/blob/rele - Support and documentation for diagnostic commands and GPU group management. - [RVS](https://rocm.docs.amd.com/projects/ROCmValidationSuite/en/latest/) test integration and reporting. +### **ROCm SMI** (7.8.0) + +#### Added + +- Support for GPU metrics 1.8. + - Added new fields for `rsmi_gpu_metrics_t` including: + - Adding the following metrics to allow new calculations for violation status: + - Per XCP metrics `gfx_below_host_limit_ppt_acc[XCP][MAX_XCC]` - GFX Clock Host limit Package Power Tracking violation counts + - Per XCP metrics `gfx_below_host_limit_thm_acc[XCP][MAX_XCC]` - GFX Clock Host limit Thermal (TVIOL) violation counts + - Per XCP metrics `gfx_low_utilization_acc[XCP][MAX_XCC]` - violation counts for low utilization causing the GPU to be below application clocks. + - Per XCP metrics `gfx_below_host_limit_total_acc[XCP][MAX_XCC]` - violation counts for how long the GPU was held below application clocks by any limiter (see above new violation metrics). + - Increasing available JPEG engines to 40. + Current ASICs may not support all 40. These will be indicated as UINT16_MAX or N/A in CLI. + +#### Removed + +- Removed backwards compatibility for `rsmi_dev_gpu_metrics_info_get()`'s `jpeg_activity` and `vcn_activity` fields. Alternatively use `xcp_stats.jpeg_busy` and `xcp_stats.vcn_busy`. + - Backwards compatibility is removed for `jpeg_activity` and `vcn_activity` fields, if the `jpeg_busy` or `vcn_busy` field is available. + - Providing both `vcn_activity`/`jpeg_activity` and XCP (partition) stats `vcn_busy`/`jpeg_busy` caused confusion for users about which field to use. By removing backward compatibility, it is easier to identify the relevant field.
+ - The `jpeg_busy` field increased in size (for supported ASICs), making backward compatibility unable to fully copy the structure into `jpeg_activity`. + +```{note} +See the full [ROCm SMI changelog](https://github.com/ROCm/rocm_smi_lib/blob/release/rocm-rel-7.0/CHANGELOG.md) for details, examples, and in-depth descriptions. +``` + ### **ROCm Systems Profiler** (1.1.0) #### Added @@ -1994,7 +1976,7 @@ The previous default accumulator types could lead to situations in which unexpec ### **ROCprofiler-SDK** (1.0.0) -### Added +#### Added - Support for [rocJPEG](https://rocm.docs.amd.com/projects/rocJPEG/en/latest/index.html) API Tracing. - Support for AMD Instinct MI350X and MI355X accelerators. @@ -2029,7 +2011,7 @@ The previous default accumulator types could lead to situations in which unexpec - Documentation for `rocprofv3` advanced options. - AQLprofile is now available as open source. -### Changed +#### Changed - SDK to NOT to create a background thread when every tool returns a nullptr from `rocprofiler_configure`. - `vaddr-to-file-offset` mapping in `disassembly.hpp` to use the dedicated comgr API. @@ -2039,11 +2021,11 @@ The previous default accumulator types could lead to situations in which unexpec - `rocprofv3` avail tool to be renamed from `rocprofv3_avail` to `rocprofv3-avail` tool. - `rocprofv3` tool to facilitate thread trace and PC sampling on the same agent. -#### Removed +##### Removed * Support for compilation of gfx940 and gfx941 targets. -### Resolved issues +#### Resolved issues - Fixed missing callbacks around internal thread creation within counter collection service. - Fixed potential data race in the ROCprofiler-SDK double buffering scheme. @@ -2110,15 +2092,24 @@ The previous default accumulator types could lead to situations in which unexpec * Added the `-e` and `--precise-alu-exceptions` flags to enable precise ALU exceptions reporting on supported configurations. 
+### **ROCr Runtime** (1.18.0) + +#### Added + +* New API `hsa_amd_memory_get_preferred_copy_engine` to get preferred copy engine that can be used when calling `hsa_amd_memory_async_copy_on_engine`. +* New API `hsa_amd_portable_export_dmabuf_v2` extension of existing `hsa_amd_portable_export_dmabuf` API to support new flags parameter. This allows specifying the new `HSA_AMD_DMABUF_MAPPING_TYPE_PCIE` flag when exporting dma-bufs. +* New flag `HSA_AMD_VMEM_ADDRESS_NO_REGISTER` adds support for new `HSA_AMD_VMEM_ADDRESS_NO_REGISTER` when calling `hsa_amd_vmem_address_reserve` API. This allows virtual address range reservations for SVM allocations to be tracked when running in ASAN mode. +* New sub query `HSA_AMD_AGENT_INFO_CLOCK_COUNTERS` returns a snapshot of the underlying driver's clock counters that can be used for profiling. + ### **rocSHMEM** (3.0.0) #### Added -* Added the Reverse Offload conduit. -* Added new APIs: `rocshmem_ctx_barrier`, `rocshmem_ctx_barrier_wave`, `rocshmem_ctx_barrier_wg`, `rocshmem_barrier_all`, `rocshmem_barrier_all_wave`, `rocshmem_barrier_all_wg`, `rocshmem_ctx_sync`, `rocshmem_ctx_sync_wave`, `rocshmem_ctx_sync_wg`, `rocshmem_sync_all`, `rocshmem_sync_all_wave`, `rocshmem_sync_all_wg`, `rocshmem_init_attr`, `rocshmem_get_uniqueid`, and `rocshmem_set_attr_uniqueid_args`. -* Added dlmalloc based allocator. -* Added XNACK support. -* Added support for initialization with MPI communicators other than `MPI_COMM_WORLD`. +* Reverse Offload conduit. +* New APIs: `rocshmem_ctx_barrier`, `rocshmem_ctx_barrier_wave`, `rocshmem_ctx_barrier_wg`, `rocshmem_barrier_all`, `rocshmem_barrier_all_wave`, `rocshmem_barrier_all_wg`, `rocshmem_ctx_sync`, `rocshmem_ctx_sync_wave`, `rocshmem_ctx_sync_wg`, `rocshmem_sync_all`, `rocshmem_sync_all_wave`, `rocshmem_sync_all_wg`, `rocshmem_init_attr`, `rocshmem_get_uniqueid`, and `rocshmem_set_attr_uniqueid_args`. +* `dlmalloc` based allocator. +* XNACK support.
+* Support for initialization with MPI communicators other than `MPI_COMM_WORLD`. #### Changed @@ -2132,8 +2123,7 @@ The previous default accumulator types could lead to situations in which unexpec #### Added -* Hybrid computation support for existing routines: - - STEQR +* Hybrid computation support for existing routines: STEQR #### Optimized @@ -2149,19 +2139,19 @@ The previous default accumulator types could lead to situations in which unexpec #### Added -* Added the `SpGEAM` generic routine for computing sparse matrix addition in CSR format. -* Added the `v2_SpMV` generic routine for computing sparse matrix vector multiplication. As opposed to the deprecated `rocsparse_spmv` routine, this routine does not use a fallback algorithm if a non-implemented configuration is encountered and will return an error in such a case. For the deprecated `rocsparse_spmv` routine, the user can enable warning messages in situations where a fallback algorithm is used by either calling the `rocsparse_enable_debug` routine upfront or exporting the variable `ROCSPARSE_DEBUG` (with the shell command `export ROCSPARSE_DEBUG=1`). -* Added half float mixed precision to `rocsparse_axpby` where X and Y use `float16` and the result and compute type use `float`. -* Added half float mixed precision to `rocsparse_spvv` where X and Y use `float16` and the result and compute type use `float`. -* Added half float mixed precision to `rocsparse_spmv` where A and X use `float16` and Y and the compute type use `float`. -* Added half float mixed precision to `rocsparse_spmm` where A and B use `float16` and C and the compute type use `float`. -* Added half float mixed precision to `rocsparse_sddmm` where A and B use `float16` and C and the compute type use `float`. -* Added half float uniform precision to the `rocsparse_scatter` and `rocsparse_gather` routines. -* Added half float uniform precision to the `rocsparse_sddmm` routine. -* Added the `rocsparse_spmv_alg_csr_rowsplit` algorithm. 
-* Added support for gfx950. -* Added ROC-TX instrumentation support in rocSPARSE (not available on Windows or in the static library version on Linux). -* Added the `almalinux` operating system name to correct the GFortran dependency. +* The `SpGEAM` generic routine for computing sparse matrix addition in CSR format. +* The `v2_SpMV` generic routine for computing sparse matrix vector multiplication. As opposed to the deprecated `rocsparse_spmv` routine, this routine does not use a fallback algorithm if a non-implemented configuration is encountered and will return an error in such a case. For the deprecated `rocsparse_spmv` routine, the user can enable warning messages in situations where a fallback algorithm is used by either calling the `rocsparse_enable_debug` routine upfront or exporting the variable `ROCSPARSE_DEBUG` (with the shell command `export ROCSPARSE_DEBUG=1`). +* Half float mixed precision to `rocsparse_axpby` where X and Y use `float16` and the result and compute type use `float`. +* Half float mixed precision to `rocsparse_spvv` where X and Y use `float16` and the result and compute type use `float`. +* Half float mixed precision to `rocsparse_spmv` where A and X use `float16` and Y and the compute type use `float`. +* Half float mixed precision to `rocsparse_spmm` where A and B use `float16` and C and the compute type use `float`. +* Half float mixed precision to `rocsparse_sddmm` where A and B use `float16` and C and the compute type use `float`. +* Half float uniform precision to the `rocsparse_scatter` and `rocsparse_gather` routines. +* Half float uniform precision to the `rocsparse_sddmm` routine. +* The `rocsparse_spmv_alg_csr_rowsplit` algorithm. +* Support for gfx950. +* ROC-TX instrumentation support in rocSPARSE (not available on Windows or in the static library version on Linux). +* The `almalinux` operating system name to correct the GFortran dependency. 
#### Changed @@ -2197,12 +2187,6 @@ The previous default accumulator types could lead to situations in which unexpec ### **rocThrust** (4.0.0) -#### Changed - -* Updated the required version of Google Benchmark from 1.8.0 to 1.9.0. -* Renamed `cpp14_required.h` to `cpp_version_check.h`. -* Refactored `test_header.hpp` into `test_param_fixtures.hpp`, `test_real_assertions.hpp`, `test_imag_assertions.hpp`, and `test_utils.hpp`. This is done to prevent unit tests from having access to modules that they're not testing. This will improve the accuracy of code coverage reports. - #### Added * Additional unit tests for: binary_search, complex, c99math, catrig, ccosh, cexp, clog, csin, csqrt, and ctan. @@ -2213,6 +2197,12 @@ The previous default accumulator types could lead to situations in which unexpec * Added gfx950 support. * Merged changes from upstream CCCL/thrust 2.6.0. +#### Changed + +* Updated the required version of Google Benchmark from 1.8.0 to 1.9.0. +* Renamed `cpp14_required.h` to `cpp_version_check.h`. +* Refactored `test_header.hpp` into `test_param_fixtures.hpp`, `test_real_assertions.hpp`, `test_imag_assertions.hpp`, and `test_utils.hpp`. This is done to prevent unit tests from having access to modules that they're not testing. This will improve the accuracy of code coverage reports. + #### Removed * `device_malloc_allocator.h` has been removed. This header file was unused and should not impact users. @@ -2220,10 +2210,6 @@ The previous default accumulator types could lead to situations in which unexpec * `test_header.hpp` has been removed. The `HIP_CHECK` function, as well as the `test` and `inter_run_bwr` namespaces, have been moved to `test_utils.hpp`. * `test_assertions.hpp` has been split into `test_real_assertions.hpp` and `test_imag_assertions.hpp`. -#### Upcoming changes - -* `thrust::device_malloc_allocator` is deprecated as of this version. It will be removed in an upcoming version. 
- #### Resolved issues * Fixed an issue with internal calls to unqualified `distance()` which would be ambigious due to also visibile implementation through ADL. @@ -2232,6 +2218,10 @@ The previous default accumulator types could lead to situations in which unexpec * The order of the values being compared by `thrust::exclusive_scan_by_key` and `thrust::inclusive_scan_by_key` can change between runs when integers are being compared. This can cause incorrect output when a non-commutative operator such as division is being used. +#### Upcoming changes + +* `thrust::device_malloc_allocator` is deprecated as of this version. It will be removed in an upcoming version. + ### **rocWMMA** (2.0.0) #### Added @@ -2251,17 +2241,17 @@ The previous default accumulator types could lead to situations in which unexpec * rocWMMA cooperative fragments register usage footprint has been reduced. * rocWMMA fragments now support partial tile sizes with padding. -#### Optimized - -* Added internal flow control barriers to improve assembly code generation and overall performance. -* Enabled interleaved layouts by default in MMA to improve overall performance. - #### Removed * Removed support for the gfx940 and gfx941 targets. * Removed the rocWMMA cooperative API. * Removed wave count template parameters from transforms APIs. +#### Optimized + +* Added internal flow control barriers to improve assembly code generation and overall performance. +* Enabled interleaved layouts by default in MMA to improve overall performance. + #### Resolved issues * Fixed a validation issue for small precision compute types `< B32` on gfx9. @@ -2299,15 +2289,6 @@ The previous default accumulator types could lead to situations in which unexpec * Test package - debian packages will install required dependencies. -### **ROCr Runtime** (1.18.0) - -#### Added - -* New API `hsa_amd_memory_get_preferred_copy_engine` to get preferred copy engine that can be used to when calling `hsa_amd_memory_async_copy_on_engine`. 
-* New API `hsa_amd_portable_export_dmabuf_v2` extension of existing `hsa_amd_portable_export_dmabuf` API to support new flags parameter. This allows specifying the new `HSA_AMD_DMABUF_MAPPING_TYPE_PCIE` flag when exporting dma-bufs. -* New flag `HSA_AMD_VMEM_ADDRESS_NO_REGISTER` adds support for new `HSA_AMD_VMEM_ADDRESS_NO_REGISTER` when calling `hsa_amd_vmem_address_reserve` API. This allows virtual address range reservations for SVM allocations to be tracked when running in ASAN mode. -* New sub query `HSA_AMD_AGENT_INFO_CLOCK_COUNTERS` returns a snapshot of the underlying driver's clock counters that can be used for profiling. - ### **Tensile** (4.44.0) #### Added diff --git a/docs/compatibility/compatibility-matrix-historical-6.0.csv b/docs/compatibility/compatibility-matrix-historical-6.0.csv index 38a9ed893..dc92e2a42 100644 --- a/docs/compatibility/compatibility-matrix-historical-6.0.csv +++ b/docs/compatibility/compatibility-matrix-historical-6.0.csv @@ -1,131 +1,135 @@ -ROCm Version,6.4.3,6.4.2,6.4.1,6.4.0,6.3.3,6.3.2,6.3.1,6.3.0,6.2.4,6.2.2,6.2.1,6.2.0, 6.1.5, 6.1.2, 6.1.1, 6.1.0, 6.0.2, 6.0.0 - :ref:`Operating systems & kernels `,Ubuntu 24.04.2,Ubuntu 24.04.2,Ubuntu 24.04.2,Ubuntu 24.04.2,Ubuntu 24.04.2,Ubuntu 24.04.2,Ubuntu 24.04.2,Ubuntu 24.04.2,"Ubuntu 24.04.1, 24.04","Ubuntu 24.04.1, 24.04","Ubuntu 24.04.1, 24.04",Ubuntu 24.04,,,,,, - ,Ubuntu 22.04.5,Ubuntu 22.04.5,Ubuntu 22.04.5,Ubuntu 22.04.5,Ubuntu 22.04.5,Ubuntu 22.04.5,Ubuntu 22.04.5,Ubuntu 22.04.5,"Ubuntu 22.04.5, 22.04.4","Ubuntu 22.04.5, 22.04.4","Ubuntu 22.04.5, 22.04.4","Ubuntu 22.04.5, 22.04.4","Ubuntu 22.04.5, 22.04.4, 22.04.3","Ubuntu 22.04.4, 22.04.3","Ubuntu 22.04.4, 22.04.3","Ubuntu 22.04.4, 22.04.3","Ubuntu 22.04.4, 22.04.3, 22.04.2","Ubuntu 22.04.4, 22.04.3, 22.04.2" - ,,,,,,,,,,,,,"Ubuntu 20.04.6, 20.04.5","Ubuntu 20.04.6, 20.04.5","Ubuntu 20.04.6, 20.04.5","Ubuntu 20.04.6, 20.04.5","Ubuntu 20.04.6, 20.04.5","Ubuntu 20.04.6, 20.04.5" - ,"RHEL 9.6, 9.4","RHEL 9.6, 9.4","RHEL 9.6, 
9.5, 9.4","RHEL 9.5, 9.4","RHEL 9.5, 9.4","RHEL 9.5, 9.4","RHEL 9.5, 9.4","RHEL 9.5, 9.4","RHEL 9.4, 9.3","RHEL 9.4, 9.3","RHEL 9.4, 9.3","RHEL 9.4, 9.3","RHEL 9.4, 9.3, 9.2","RHEL 9.4, 9.3, 9.2","RHEL 9.4, 9.3, 9.2","RHEL 9.4, 9.3, 9.2","RHEL 9.3, 9.2","RHEL 9.3, 9.2" - ,RHEL 8.10,RHEL 8.10,RHEL 8.10,RHEL 8.10,RHEL 8.10,RHEL 8.10,RHEL 8.10,RHEL 8.10,"RHEL 8.10, 8.9","RHEL 8.10, 8.9","RHEL 8.10, 8.9","RHEL 8.10, 8.9","RHEL 8.9, 8.8","RHEL 8.9, 8.8","RHEL 8.9, 8.8","RHEL 8.9, 8.8","RHEL 8.9, 8.8","RHEL 8.9, 8.8" - ,"SLES 15 SP7, SP6","SLES 15 SP7, SP6",SLES 15 SP6,SLES 15 SP6,"SLES 15 SP6, SP5","SLES 15 SP6, SP5","SLES 15 SP6, SP5","SLES 15 SP6, SP5","SLES 15 SP6, SP5","SLES 15 SP6, SP5","SLES 15 SP6, SP5","SLES 15 SP6, SP5","SLES 15 SP5, SP4","SLES 15 SP5, SP4","SLES 15 SP5, SP4","SLES 15 SP5, SP4","SLES 15 SP5, SP4","SLES 15 SP5, SP4" - ,,,,,,,,,,,,,,CentOS 7.9,CentOS 7.9,CentOS 7.9,CentOS 7.9,CentOS 7.9 - ,"Oracle Linux 9, 8 [#mi300x-past-60]_","Oracle Linux 9, 8 [#mi300x-past-60]_","Oracle Linux 9, 8 [#mi300x-past-60]_","Oracle Linux 9, 8 [#mi300x-past-60]_",Oracle Linux 8.10 [#mi300x-past-60]_,Oracle Linux 8.10 [#mi300x-past-60]_,Oracle Linux 8.10 [#mi300x-past-60]_,Oracle Linux 8.10 [#mi300x-past-60]_,Oracle Linux 8.9 [#mi300x-past-60]_,Oracle Linux 8.9 [#mi300x-past-60]_,Oracle Linux 8.9 [#mi300x-past-60]_,Oracle Linux 8.9 [#mi300x-past-60]_,Oracle Linux 8.9 [#mi300x-past-60]_,Oracle Linux 8.9 [#mi300x-past-60]_,Oracle Linux 8.9 [#mi300x-past-60]_,,, - ,Debian 12 [#single-node-past-60]_,Debian 12 [#single-node-past-60]_,Debian 12 [#single-node-past-60]_,Debian 12 [#single-node-past-60]_,Debian 12 [#single-node-past-60]_,Debian 12 [#single-node-past-60]_,Debian 12 [#single-node-past-60]_,,,,,,,,,,, - ,Azure Linux 3.0 [#mi300x-past-60]_,Azure Linux 3.0 [#mi300x-past-60]_,Azure Linux 3.0 [#mi300x-past-60]_,Azure Linux 3.0 [#mi300x-past-60]_,Azure Linux 3.0 [#mi300x-past-60]_,Azure Linux 3.0 [#mi300x-past-60]_,,,,,,,,,,,, - ,.. 
_architecture-support-compatibility-matrix-past-60:,,,,,,,,,,,,,,,,, - :doc:`Architecture `,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3 - ,CDNA2,CDNA2,CDNA2,CDNA2,CDNA2,CDNA2,CDNA2,CDNA2,CDNA2,CDNA2,CDNA2,CDNA2,CDNA2,CDNA2,CDNA2,CDNA2,CDNA2,CDNA2 - ,CDNA,CDNA,CDNA,CDNA,CDNA,CDNA,CDNA,CDNA,CDNA,CDNA,CDNA,CDNA,CDNA,CDNA,CDNA,CDNA,CDNA,CDNA - ,RDNA4,RDNA4,RDNA4,,,,,,,,,,,,,,, - ,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3 - ,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2 - ,.. _gpu-support-compatibility-matrix-past-60:,,,,,,,,,,,,,,,,, - :doc:`GPU / LLVM target `,gfx1201 [#RDNA-OS-past-60]_,gfx1201 [#RDNA-OS-past-60]_,gfx1201 [#RDNA-OS-past-60]_,,,,,,,,,,,,,,, - ,gfx1200 [#RDNA-OS-past-60]_,gfx1200 [#RDNA-OS-past-60]_,gfx1200 [#RDNA-OS-past-60]_,,,,,,,,,,,,,,, -,gfx1101 [#RDNA-OS-past-60]_ [#7700XT-OS-past-60]_,gfx1101 [#RDNA-OS-past-60]_ [#7700XT-OS-past-60]_,gfx1101 [#RDNA-OS-past-60]_,,,,,,,,,,,,,,, - ,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100 - ,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030 - ,gfx942,gfx942,gfx942,gfx942,gfx942,gfx942,gfx942,gfx942,gfx942 [#mi300_624-past-60]_,gfx942 [#mi300_622-past-60]_,gfx942 [#mi300_621-past-60]_,gfx942 [#mi300_620-past-60]_, gfx942 [#mi300_612-past-60]_, gfx942 [#mi300_612-past-60]_, gfx942 [#mi300_611-past-60]_, gfx942 [#mi300_610-past-60]_, gfx942 [#mi300_602-past-60]_, gfx942 [#mi300_600-past-60]_ - ,gfx90a,gfx90a,gfx90a,gfx90a,gfx90a,gfx90a,gfx90a,gfx90a,gfx90a,gfx90a,gfx90a,gfx90a,gfx90a,gfx90a,gfx90a,gfx90a,gfx90a,gfx90a - 
,gfx908,gfx908,gfx908,gfx908,gfx908,gfx908,gfx908,gfx908,gfx908,gfx908,gfx908,gfx908,gfx908,gfx908,gfx908,gfx908,gfx908,gfx908 -,,,,,,,,,,,,,,,,,, - FRAMEWORK SUPPORT,.. _framework-support-compatibility-matrix-past-60:,,,,,,,,,,,,,,,,, - :doc:`PyTorch <../compatibility/ml-compatibility/pytorch-compatibility>`,"2.6, 2.5, 2.4, 2.3","2.6, 2.5, 2.4, 2.3","2.6, 2.5, 2.4, 2.3","2.6, 2.5, 2.4, 2.3","2.4, 2.3, 2.2, 1.13","2.4, 2.3, 2.2, 1.13","2.4, 2.3, 2.2, 1.13","2.4, 2.3, 2.2, 2.1, 2.0, 1.13","2.3, 2.2, 2.1, 2.0, 1.13","2.3, 2.2, 2.1, 2.0, 1.13","2.3, 2.2, 2.1, 2.0, 1.13","2.3, 2.2, 2.1, 2.0, 1.13","2.1, 2.0, 1.13","2.1, 2.0, 1.13","2.1, 2.0, 1.13","2.1, 2.0, 1.13","2.1, 2.0, 1.13","2.1, 2.0, 1.13" - :doc:`TensorFlow <../compatibility/ml-compatibility/tensorflow-compatibility>`,"2.18.1, 2.17.1, 2.16.2","2.18.1, 2.17.1, 2.16.2","2.18.1, 2.17.1, 2.16.2","2.18.1, 2.17.1, 2.16.2","2.17.0, 2.16.2, 2.15.1","2.17.0, 2.16.2, 2.15.1","2.17.0, 2.16.2, 2.15.1","2.17.0, 2.16.2, 2.15.1","2.16.1, 2.15.1, 2.14.1","2.16.1, 2.15.1, 2.14.1","2.16.1, 2.15.1, 2.14.1","2.16.1, 2.15.1, 2.14.1","2.15.0, 2.14.0, 2.13.1","2.15.0, 2.14.0, 2.13.1","2.15.0, 2.14.0, 2.13.1","2.15.0, 2.14.0, 2.13.1","2.14.0, 2.13.1, 2.12.1","2.14.0, 2.13.1, 2.12.1" - :doc:`JAX <../compatibility/ml-compatibility/jax-compatibility>`,0.4.35,0.4.35,0.4.35,0.4.35,0.4.31,0.4.31,0.4.31,0.4.31,0.4.26,0.4.26,0.4.26,0.4.26,0.4.26,0.4.26,0.4.26,0.4.26,0.4.26,0.4.26 - :doc:`verl <../compatibility/ml-compatibility/verl-compatibility>` [#verl_compat]_,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,0.3.0.post0,N/A,N/A,N/A,N/A,N/A - :doc:`Stanford Megatron-LM <../compatibility/ml-compatibility/stanford-megatron-lm-compatibility>`,N/A,N/A,N/A,N/A,85f95ae,85f95ae,85f95ae,85f95ae,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A - :doc:`DGL <../compatibility/ml-compatibility/dgl-compatibility>` [#dgl_compat]_,N/A,N/A,N/A,2.4.0,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A, - :doc:`Megablocks 
<../compatibility/ml-compatibility/megablocks-compatibility>`,N/A,N/A,N/A,N/A,0.7.0,0.7.0,0.7.0,0.7.0,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A - :doc:`Taichi <../compatibility/ml-compatibility/taichi-compatibility>` [#taichi_compat]_,N/A,N/A,N/A,N/A,N/A,1.8.0b1,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A - `ONNX Runtime `_,1.2,1.2,1.2,1.2,1.17.3,1.17.3,1.17.3,1.17.3,1.17.3,1.17.3,1.17.3,1.17.3,1.17.3,1.17.3,1.17.3,1.17.3,1.14.1,1.14.1 -,,,,,,,,,,,,,,,,,, - ,,,,,,,,,,,,,,,,,, - THIRD PARTY COMMS,.. _thirdpartycomms-support-compatibility-matrix-past-60:,,,,,,,,,,,,,,,,, - `UCC `_,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.2.0,>=1.2.0 - `UCX `_,>=1.15.0,>=1.15.0,>=1.15.0,>=1.15.0,>=1.15.0,>=1.15.0,>=1.15.0,>=1.15.0,>=1.15.0,>=1.15.0,>=1.15.0,>=1.15.0,>=1.14.1,>=1.14.1,>=1.14.1,>=1.14.1,>=1.14.1,>=1.14.1 - ,,,,,,,,,,,,,,,,,, - THIRD PARTY ALGORITHM,.. _thirdpartyalgorithm-support-compatibility-matrix-past-60:,,,,,,,,,,,,,,,,, - Thrust,2.5.0,2.5.0,2.5.0,2.5.0,2.3.2,2.3.2,2.3.2,2.3.2,2.2.0,2.2.0,2.2.0,2.2.0,2.1.0,2.1.0,2.1.0,2.1.0,2.0.1,2.0.1 - CUB,2.5.0,2.5.0,2.5.0,2.5.0,2.3.2,2.3.2,2.3.2,2.3.2,2.2.0,2.2.0,2.2.0,2.2.0,2.1.0,2.1.0,2.1.0,2.1.0,2.0.1,2.0.1 -,,,,,,,,,,,,,,,,,, - KMD & USER SPACE [#kfd_support-past-60]_,.. 
_kfd-userspace-support-compatibility-matrix-past-60:,,,,,,,,,,,,,,,,, - :doc:`KMD versions `,"6.4.x, 6.3.x, 6.2.x, 6.1.x","6.4.x, 6.3.x, 6.2.x, 6.1.x","6.4.x, 6.3.x, 6.2.x, 6.1.x","6.4.x, 6.3.x, 6.2.x, 6.1.x","6.4.x, 6.3.x, 6.2.x, 6.1.x","6.4.x, 6.3.x, 6.2.x, 6.1.x","6.4.x, 6.3.x, 6.2.x, 6.1.x","6.4.x, 6.3.x, 6.2.x, 6.1.x","6.4.x, 6.3.x, 6.2.x, 6.1.x, 6.0.x","6.4.x, 6.3.x, 6.2.x, 6.1.x, 6.0.x","6.4.x, 6.3.x, 6.2.x, 6.1.x, 6.0.x","6.4.x, 6.3.x, 6.2.x, 6.1.x, 6.0.x","6.4.x, 6.3.x, 6.2.x, 6.1.x, 6.0.x, 5.7.x","6.4.x, 6.3.x, 6.2.x, 6.1.x, 6.0.x, 5.7.x","6.4.x, 6.3.x, 6.2.x, 6.1.x, 6.0.x, 5.7.x","6.4.x, 6.3.x, 6.2.x, 6.1.x, 6.0.x, 5.7.x","6.2.x, 6.1.x, 6.0.x, 5.7.x, 5.6.x","6.2.x, 6.1.x, 6.0.x, 5.7.x, 5.6.x" - ,,,,,,,,,,,,,,,,,, - ML & COMPUTER VISION,.. _mllibs-support-compatibility-matrix-past-60:,,,,,,,,,,,,,,,,, - :doc:`Composable Kernel `,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0 - :doc:`MIGraphX `,2.12.0,2.12.0,2.12.0,2.12.0,2.11.0,2.11.0,2.11.0,2.11.0,2.10.0,2.10.0,2.10.0,2.10.0,2.9.0,2.9.0,2.9.0,2.9.0,2.8.0,2.8.0 - :doc:`MIOpen `,3.4.0,3.4.0,3.4.0,3.4.0,3.3.0,3.3.0,3.3.0,3.3.0,3.2.0,3.2.0,3.2.0,3.2.0,3.1.0,3.1.0,3.1.0,3.1.0,3.0.0,3.0.0 - :doc:`MIVisionX `,3.2.0,3.2.0,3.2.0,3.2.0,3.1.0,3.1.0,3.1.0,3.1.0,3.0.0,3.0.0,3.0.0,3.0.0,2.5.0,2.5.0,2.5.0,2.5.0,2.5.0,2.5.0 - :doc:`rocAL `,2.2.0,2.2.0,2.2.0,2.2.0,2.1.0,2.1.0,2.1.0,2.1.0,2.0.0,2.0.0,2.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0 - :doc:`rocDecode `,0.10.0,0.10.0,0.10.0,0.10.0,0.8.0,0.8.0,0.8.0,0.8.0,0.6.0,0.6.0,0.6.0,0.6.0,0.6.0,0.6.0,0.5.0,0.5.0,N/A,N/A - :doc:`rocJPEG `,0.8.0,0.8.0,0.8.0,0.8.0,0.6.0,0.6.0,0.6.0,0.6.0,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A - :doc:`rocPyDecode `,0.3.1,0.3.1,0.3.1,0.3.1,0.2.0,0.2.0,0.2.0,0.2.0,0.1.0,0.1.0,0.1.0,0.1.0,N/A,N/A,N/A,N/A,N/A,N/A - :doc:`RPP `,1.9.10,1.9.10,1.9.10,1.9.10,1.9.1,1.9.1,1.9.1,1.9.1,1.8.0,1.8.0,1.8.0,1.8.0,1.5.0,1.5.0,1.5.0,1.5.0,1.4.0,1.4.0 - ,,,,,,,,,,,,,,,,,, - COMMUNICATION,.. 
_commlibs-support-compatibility-matrix-past-60:,,,,,,,,,,,,,,,,, - :doc:`RCCL `,2.22.3,2.22.3,2.22.3,2.22.3,2.21.5,2.21.5,2.21.5,2.21.5,2.20.5,2.20.5,2.20.5,2.20.5,2.18.6,2.18.6,2.18.6,2.18.6,2.18.3,2.18.3 - :doc:`rocSHMEM `,2.0.1,2.0.1,2.0.0,2.0.0,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A - ,,,,,,,,,,,,,,,,,, - MATH LIBS,.. _mathlibs-support-compatibility-matrix-past-60:,,,,,,,,,,,,,,,,, - `half `_ ,1.12.0,1.12.0,1.12.0,1.12.0,1.12.0,1.12.0,1.12.0,1.12.0,1.12.0,1.12.0,1.12.0,1.12.0,1.12.0,1.12.0,1.12.0,1.12.0,1.12.0,1.12.0 - :doc:`hipBLAS `,2.4.0,2.4.0,2.4.0,2.4.0,2.3.0,2.3.0,2.3.0,2.3.0,2.2.0,2.2.0,2.2.0,2.2.0,2.1.0,2.1.0,2.1.0,2.1.0,2.0.0,2.0.0 - :doc:`hipBLASLt `,0.12.1,0.12.1,0.12.1,0.12.0,0.10.0,0.10.0,0.10.0,0.10.0,0.8.0,0.8.0,0.8.0,0.8.0,0.7.0,0.7.0,0.7.0,0.7.0,0.6.0,0.6.0 - :doc:`hipFFT `,1.0.18,1.0.18,1.0.18,1.0.18,1.0.17,1.0.17,1.0.17,1.0.17,1.0.16,1.0.15,1.0.15,1.0.14,1.0.14,1.0.14,1.0.14,1.0.14,1.0.13,1.0.13 - :doc:`hipfort `,0.6.0,0.6.0,0.6.0,0.6.0,0.5.1,0.5.1,0.5.0,0.5.0,0.4.0,0.4.0,0.4.0,0.4.0,0.4.0,0.4.0,0.4.0,0.4.0,0.4.0,0.4.0 - :doc:`hipRAND `,2.12.0,2.12.0,2.12.0,2.12.0,2.11.1,2.11.1,2.11.1,2.11.0,2.11.1,2.11.0,2.11.0,2.11.0,2.10.16,2.10.16,2.10.16,2.10.16,2.10.16,2.10.16 - :doc:`hipSOLVER `,2.4.0,2.4.0,2.4.0,2.4.0,2.3.0,2.3.0,2.3.0,2.3.0,2.2.0,2.2.0,2.2.0,2.2.0,2.1.1,2.1.1,2.1.1,2.1.0,2.0.0,2.0.0 - :doc:`hipSPARSE `,3.2.0,3.2.0,3.2.0,3.2.0,3.1.2,3.1.2,3.1.2,3.1.2,3.1.1,3.1.1,3.1.1,3.1.1,3.0.1,3.0.1,3.0.1,3.0.1,3.0.0,3.0.0 - :doc:`hipSPARSELt `,0.2.3,0.2.3,0.2.3,0.2.3,0.2.2,0.2.2,0.2.2,0.2.2,0.2.1,0.2.1,0.2.1,0.2.1,0.2.0,0.2.0,0.1.0,0.1.0,0.1.0,0.1.0 - :doc:`rocALUTION `,3.2.3,3.2.3,3.2.3,3.2.2,3.2.1,3.2.1,3.2.1,3.2.1,3.2.1,3.2.0,3.2.0,3.2.0,3.1.1,3.1.1,3.1.1,3.1.1,3.0.3,3.0.3 - :doc:`rocBLAS `,4.4.1,4.4.1,4.4.0,4.4.0,4.3.0,4.3.0,4.3.0,4.3.0,4.2.4,4.2.1,4.2.1,4.2.0,4.1.2,4.1.2,4.1.0,4.1.0,4.0.0,4.0.0 - :doc:`rocFFT 
`,1.0.32,1.0.32,1.0.32,1.0.32,1.0.31,1.0.31,1.0.31,1.0.31,1.0.30,1.0.29,1.0.29,1.0.28,1.0.27,1.0.27,1.0.27,1.0.26,1.0.25,1.0.23 - :doc:`rocRAND `,3.3.0,3.3.0,3.3.0,3.3.0,3.2.0,3.2.0,3.2.0,3.2.0,3.1.1,3.1.0,3.1.0,3.1.0,3.0.1,3.0.1,3.0.1,3.0.1,3.0.0,2.10.17 - :doc:`rocSOLVER `,3.28.2,3.28.2,3.28.0,3.28.0,3.27.0,3.27.0,3.27.0,3.27.0,3.26.2,3.26.0,3.26.0,3.26.0,3.25.0,3.25.0,3.25.0,3.25.0,3.24.0,3.24.0 - :doc:`rocSPARSE `,3.4.0,3.4.0,3.4.0,3.4.0,3.3.0,3.3.0,3.3.0,3.3.0,3.2.1,3.2.0,3.2.0,3.2.0,3.1.2,3.1.2,3.1.2,3.1.2,3.0.2,3.0.2 - :doc:`rocWMMA `,1.7.0,1.7.0,1.7.0,1.7.0,1.6.0,1.6.0,1.6.0,1.6.0,1.5.0,1.5.0,1.5.0,1.5.0,1.4.0,1.4.0,1.4.0,1.4.0,1.3.0,1.3.0 - :doc:`Tensile `,4.43.0,4.43.0,4.43.0,4.43.0,4.42.0,4.42.0,4.42.0,4.42.0,4.41.0,4.41.0,4.41.0,4.41.0,4.40.0,4.40.0,4.40.0,4.40.0,4.39.0,4.39.0 - ,,,,,,,,,,,,,,,,,, - PRIMITIVES,.. _primitivelibs-support-compatibility-matrix-past-60:,,,,,,,,,,,,,,,,, - :doc:`hipCUB `,3.4.0,3.4.0,3.4.0,3.4.0,3.3.0,3.3.0,3.3.0,3.3.0,3.2.1,3.2.0,3.2.0,3.2.0,3.1.0,3.1.0,3.1.0,3.1.0,3.0.0,3.0.0 - :doc:`hipTensor `,1.5.0,1.5.0,1.5.0,1.5.0,1.4.0,1.4.0,1.4.0,1.4.0,1.3.0,1.3.0,1.3.0,1.3.0,1.2.0,1.2.0,1.2.0,1.2.0,1.1.0,1.1.0 - :doc:`rocPRIM `,3.4.1,3.4.1,3.4.0,3.4.0,3.3.0,3.3.0,3.3.0,3.3.0,3.2.2,3.2.0,3.2.0,3.2.0,3.1.0,3.1.0,3.1.0,3.1.0,3.0.0,3.0.0 - :doc:`rocThrust `,3.3.0,3.3.0,3.3.0,3.3.0,3.3.0,3.3.0,3.3.0,3.3.0,3.1.1,3.1.0,3.1.0,3.0.1,3.0.1,3.0.1,3.0.1,3.0.1,3.0.0,3.0.0 - ,,,,,,,,,,,,,,,,,, - SUPPORT LIBS,,,,,,,,,,,,,,,,,, - `hipother `_,6.4.43483,6.4.43483,6.4.43483,6.4.43482,6.3.42134,6.3.42134,6.3.42133,6.3.42131,6.2.41134,6.2.41134,6.2.41134,6.2.41133,6.1.40093,6.1.40093,6.1.40092,6.1.40091,6.1.32831,6.1.32830 - `rocm-core `_,6.4.3,6.4.2,6.4.1,6.4.0,6.3.3,6.3.2,6.3.1,6.3.0,6.2.4,6.2.2,6.2.1,6.2.0,6.1.5,6.1.2,6.1.1,6.1.0,6.0.2,6.0.0 - `ROCT-Thunk-Interface `_,N/A [#ROCT-rocr-past-60]_,N/A [#ROCT-rocr-past-60]_,N/A [#ROCT-rocr-past-60]_,N/A [#ROCT-rocr-past-60]_,N/A [#ROCT-rocr-past-60]_,N/A [#ROCT-rocr-past-60]_,N/A [#ROCT-rocr-past-60]_,N/A 
[#ROCT-rocr-past-60]_,20240607.5.7,20240607.5.7,20240607.4.05,20240607.1.4246,20240125.5.08,20240125.5.08,20240125.5.08,20240125.3.30,20231016.2.245,20231016.2.245 - ,,,,,,,,,,,,,,,,,, - SYSTEM MGMT TOOLS,.. _tools-support-compatibility-matrix-past-60:,,,,,,,,,,,,,,,,, - :doc:`AMD SMI `,25.5.1,25.5.1,25.4.2,25.3.0,24.7.1,24.7.1,24.7.1,24.7.1,24.6.3,24.6.3,24.6.3,24.6.2,24.5.1,24.5.1,24.5.1,24.4.1,23.4.2,23.4.2 - :doc:`ROCm Data Center Tool `,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0 - :doc:`rocminfo `,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0 - :doc:`ROCm SMI `,7.7.0,7.5.0,7.5.0,7.5.0,7.4.0,7.4.0,7.4.0,7.4.0,7.3.0,7.3.0,7.3.0,7.3.0,7.2.0,7.2.0,7.0.0,7.0.0,6.0.2,6.0.0 - :doc:`ROCm Validation Suite `,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.0.60204,1.0.60202,1.0.60201,1.0.60200,1.0.60105,1.0.60102,1.0.60101,1.0.60100,1.0.60002,1.0.60000 - ,,,,,,,,,,,,,,,,,, - PERFORMANCE TOOLS,,,,,,,,,,,,,,,,,, - :doc:`ROCm Bandwidth Test `,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0 - :doc:`ROCm Compute Profiler `,3.1.1,3.1.1,3.1.0,3.1.0,3.0.0,3.0.0,3.0.0,3.0.0,2.0.1,2.0.1,2.0.1,2.0.1,N/A,N/A,N/A,N/A,N/A,N/A - :doc:`ROCm Systems Profiler `,1.0.2,1.0.2,1.0.1,1.0.0,0.1.2,0.1.1,0.1.0,0.1.0,1.11.2,1.11.2,1.11.2,1.11.2,N/A,N/A,N/A,N/A,N/A,N/A - :doc:`ROCProfiler `,2.0.60403,2.0.60402,2.0.60401,2.0.60400,2.0.60303,2.0.60302,2.0.60301,2.0.60300,2.0.60204,2.0.60202,2.0.60201,2.0.60200,2.0.60105,2.0.60102,2.0.60101,2.0.60100,2.0.60002,2.0.60000 - :doc:`ROCprofiler-SDK `,0.6.0,0.6.0,0.6.0,0.6.0,0.5.0,0.5.0,0.5.0,0.5.0,0.4.0,0.4.0,0.4.0,0.4.0,N/A,N/A,N/A,N/A,N/A,N/A - :doc:`ROCTracer `,4.1.60403,4.1.60402,4.1.60401,4.1.60400,4.1.60303,4.1.60302,4.1.60301,4.1.60300,4.1.60204,4.1.60202,4.1.60201,4.1.60200,4.1.60105,4.1.60102,4.1.60101,4.1.60100,4.1.60002,4.1.60000 - ,,,,,,,,,,,,,,,,,, - 
DEVELOPMENT TOOLS,,,,,,,,,,,,,,,,,, - :doc:`HIPIFY `,19.0.0,19.0.0,19.0.0,19.0.0,18.0.0.25012,18.0.0.25012,18.0.0.24491,18.0.0.24455,18.0.0.24392,18.0.0.24355,18.0.0.24355,18.0.0.24232,17.0.0.24193,17.0.0.24193,17.0.0.24154,17.0.0.24103,17.0.0.24012,17.0.0.23483 - :doc:`ROCm CMake `,0.14.0,0.14.0,0.14.0,0.14.0,0.14.0,0.14.0,0.14.0,0.14.0,0.13.0,0.13.0,0.13.0,0.13.0,0.12.0,0.12.0,0.12.0,0.12.0,0.11.0,0.11.0 - :doc:`ROCdbgapi `,0.77.2,0.77.2,0.77.2,0.77.2,0.77.0,0.77.0,0.77.0,0.77.0,0.76.0,0.76.0,0.76.0,0.76.0,0.71.0,0.71.0,0.71.0,0.71.0,0.71.0,0.71.0 - :doc:`ROCm Debugger (ROCgdb) `,15.2.0,15.2.0,15.2.0,15.2.0,15.2.0,15.2.0,15.2.0,15.2.0,14.2.0,14.2.0,14.2.0,14.2.0,14.1.0,14.1.0,14.1.0,14.1.0,13.2.0,13.2.0 - `rocprofiler-register `_,0.4.0,0.4.0,0.4.0,0.4.0,0.4.0,0.4.0,0.4.0,0.4.0,0.4.0,0.4.0,0.4.0,0.4.0,0.3.0,0.3.0,0.3.0,0.3.0,N/A,N/A - :doc:`ROCr Debug Agent `,2.0.4,2.0.4,2.0.4,2.0.4,2.0.3,2.0.3,2.0.3,2.0.3,2.0.3,2.0.3,2.0.3,2.0.3,2.0.3,2.0.3,2.0.3,2.0.3,2.0.3,2.0.3 - ,,,,,,,,,,,,,,,,,, - COMPILERS,.. 
_compilers-support-compatibility-matrix-past-60:,,,,,,,,,,,,,,,,, - `clang-ocl `_,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,0.5.0,0.5.0,0.5.0,0.5.0,0.5.0,0.5.0 - :doc:`hipCC `,1.1.1,1.1.1,1.1.1,1.1.1,1.1.1,1.1.1,1.1.1,1.1.1,1.1.1,1.1.1,1.1.1,1.1.1,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0 - `Flang `_,19.0.0.25224,19.0.0.25224,19.0.0.25184,19.0.0.25133,18.0.0.25012,18.0.0.25012,18.0.0.24491,18.0.0.24455,18.0.0.24392,18.0.0.24355,18.0.0.24355,18.0.0.24232,17.0.0.24193,17.0.0.24193,17.0.0.24154,17.0.0.24103,17.0.0.24012,17.0.0.23483 - :doc:`llvm-project `,19.0.0.25224,19.0.0.25224,19.0.0.25184,19.0.0.25133,18.0.0.25012,18.0.0.25012,18.0.0.24491,18.0.0.24491,18.0.0.24392,18.0.0.24355,18.0.0.24355,18.0.0.24232,17.0.0.24193,17.0.0.24193,17.0.0.24154,17.0.0.24103,17.0.0.24012,17.0.0.23483 - `OpenMP `_,19.0.0.25224,19.0.0.25224,19.0.0.25184,19.0.0.25133,18.0.0.25012,18.0.0.25012,18.0.0.24491,18.0.0.24491,18.0.0.24392,18.0.0.24355,18.0.0.24355,18.0.0.24232,17.0.0.24193,17.0.0.24193,17.0.0.24154,17.0.0.24103,17.0.0.24012,17.0.0.23483 -,,,,,,,,,,,,,,,,,, - RUNTIMES,.. 
_runtime-support-compatibility-matrix-past-60:,,,,,,,,,,,,,,,,, - :doc:`AMD CLR `,6.4.43484,6.4.43484,6.4.43483,6.4.43482,6.3.42134,6.3.42134,6.3.42133,6.3.42131,6.2.41134,6.2.41134,6.2.41134,6.2.41133,6.1.40093,6.1.40093,6.1.40092,6.1.40091,6.1.32831,6.1.32830 - :doc:`HIP `,6.4.43484,6.4.43484,6.4.43483,6.4.43482,6.3.42134,6.3.42134,6.3.42133,6.3.42131,6.2.41134,6.2.41134,6.2.41134,6.2.41133,6.1.40093,6.1.40093,6.1.40092,6.1.40091,6.1.32831,6.1.32830 - `OpenCL Runtime `_,2.0.0,2.0.0,2.0.0,2.0.0,2.0.0,2.0.0,2.0.0,2.0.0,2.0.0,2.0.0,2.0.0,2.0.0,2.0.0,2.0.0,2.0.0,2.0.0,2.0.0,2.0.0 - :doc:`ROCr Runtime `,1.15.0,1.15.0,1.15.0,1.15.0,1.14.0,1.14.0,1.14.0,1.14.0,1.14.0,1.14.0,1.14.0,1.13.0,1.13.0,1.13.0,1.13.0,1.13.0,1.12.0,1.12.0 +ROCm Version,7.0.0,6.4.3,6.4.2,6.4.1,6.4.0,6.3.3,6.3.2,6.3.1,6.3.0,6.2.4,6.2.2,6.2.1,6.2.0, 6.1.5, 6.1.2, 6.1.1, 6.1.0, 6.0.2, 6.0.0 + :ref:`Operating systems & kernels `,Ubuntu 24.04.3,Ubuntu 24.04.2,Ubuntu 24.04.2,Ubuntu 24.04.2,Ubuntu 24.04.2,Ubuntu 24.04.2,Ubuntu 24.04.2,Ubuntu 24.04.2,Ubuntu 24.04.2,"Ubuntu 24.04.1, 24.04","Ubuntu 24.04.1, 24.04","Ubuntu 24.04.1, 24.04",Ubuntu 24.04,,,,,, + ,Ubuntu 22.04.5,Ubuntu 22.04.5,Ubuntu 22.04.5,Ubuntu 22.04.5,Ubuntu 22.04.5,Ubuntu 22.04.5,Ubuntu 22.04.5,Ubuntu 22.04.5,Ubuntu 22.04.5,"Ubuntu 22.04.5, 22.04.4","Ubuntu 22.04.5, 22.04.4","Ubuntu 22.04.5, 22.04.4","Ubuntu 22.04.5, 22.04.4","Ubuntu 22.04.5, 22.04.4, 22.04.3","Ubuntu 22.04.4, 22.04.3","Ubuntu 22.04.4, 22.04.3","Ubuntu 22.04.4, 22.04.3","Ubuntu 22.04.4, 22.04.3, 22.04.2","Ubuntu 22.04.4, 22.04.3, 22.04.2" + ,,,,,,,,,,,,,,"Ubuntu 20.04.6, 20.04.5","Ubuntu 20.04.6, 20.04.5","Ubuntu 20.04.6, 20.04.5","Ubuntu 20.04.6, 20.04.5","Ubuntu 20.04.6, 20.04.5","Ubuntu 20.04.6, 20.04.5" +,RHEL 10,,,,,,,,,,,,,,,,,, + ,"RHEL 9.6, 9.4","RHEL 9.6, 9.4","RHEL 9.6, 9.4","RHEL 9.6, 9.5, 9.4","RHEL 9.5, 9.4","RHEL 9.5, 9.4","RHEL 9.5, 9.4","RHEL 9.5, 9.4","RHEL 9.5, 9.4","RHEL 9.4, 9.3","RHEL 9.4, 9.3","RHEL 9.4, 9.3","RHEL 9.4, 9.3","RHEL 9.4, 9.3, 9.2","RHEL 
9.4, 9.3, 9.2","RHEL 9.4, 9.3, 9.2","RHEL 9.4, 9.3, 9.2","RHEL 9.3, 9.2","RHEL 9.3, 9.2" + ,RHEL 8.10,RHEL 8.10,RHEL 8.10,RHEL 8.10,RHEL 8.10,RHEL 8.10,RHEL 8.10,RHEL 8.10,RHEL 8.10,"RHEL 8.10, 8.9","RHEL 8.10, 8.9","RHEL 8.10, 8.9","RHEL 8.10, 8.9","RHEL 8.9, 8.8","RHEL 8.9, 8.8","RHEL 8.9, 8.8","RHEL 8.9, 8.8","RHEL 8.9, 8.8","RHEL 8.9, 8.8" + ,SLES 15 SP7,"SLES 15 SP7, SP6","SLES 15 SP7, SP6",SLES 15 SP6,SLES 15 SP6,"SLES 15 SP6, SP5","SLES 15 SP6, SP5","SLES 15 SP6, SP5","SLES 15 SP6, SP5","SLES 15 SP6, SP5","SLES 15 SP6, SP5","SLES 15 SP6, SP5","SLES 15 SP6, SP5","SLES 15 SP5, SP4","SLES 15 SP5, SP4","SLES 15 SP5, SP4","SLES 15 SP5, SP4","SLES 15 SP5, SP4","SLES 15 SP5, SP4" + ,,,,,,,,,,,,,,,CentOS 7.9,CentOS 7.9,CentOS 7.9,CentOS 7.9,CentOS 7.9 + ,"Oracle Linux 10, 9, 8 [#mi300x-past-60]_","Oracle Linux 9, 8 [#mi300x-past-60]_","Oracle Linux 9, 8 [#mi300x-past-60]_","Oracle Linux 9, 8 [#mi300x-past-60]_","Oracle Linux 9, 8 [#mi300x-past-60]_",Oracle Linux 8.10 [#mi300x-past-60]_,Oracle Linux 8.10 [#mi300x-past-60]_,Oracle Linux 8.10 [#mi300x-past-60]_,Oracle Linux 8.10 [#mi300x-past-60]_,Oracle Linux 8.9 [#mi300x-past-60]_,Oracle Linux 8.9 [#mi300x-past-60]_,Oracle Linux 8.9 [#mi300x-past-60]_,Oracle Linux 8.9 [#mi300x-past-60]_,Oracle Linux 8.9 [#mi300x-past-60]_,Oracle Linux 8.9 [#mi300x-past-60]_,Oracle Linux 8.9 [#mi300x-past-60]_,,, + ,Debian 12 [#single-node-past-60]_,Debian 12 [#single-node-past-60]_,Debian 12 [#single-node-past-60]_,Debian 12 [#single-node-past-60]_,Debian 12 [#single-node-past-60]_,Debian 12 [#single-node-past-60]_,Debian 12 [#single-node-past-60]_,Debian 12 [#single-node-past-60]_,,,,,,,,,,, + ,Azure Linux 3.0 [#mi300x-past-60]_,Azure Linux 3.0 [#mi300x-past-60]_,Azure Linux 3.0 [#mi300x-past-60]_,Azure Linux 3.0 [#mi300x-past-60]_,Azure Linux 3.0 [#mi300x-past-60]_,Azure Linux 3.0 [#mi300x-past-60]_,Azure Linux 3.0 [#mi300x-past-60]_,,,,,,,,,,,, +,Rocky 9,,,,,,,,,,,,,,,,,, + ,.. 
_architecture-support-compatibility-matrix-past-60:,,,,,,,,,,,,,,,,,, + :doc:`Architecture `,CDNA4,,,,,,,,,,,,,,,,,, +,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3 + ,CDNA2,CDNA2,CDNA2,CDNA2,CDNA2,CDNA2,CDNA2,CDNA2,CDNA2,CDNA2,CDNA2,CDNA2,CDNA2,CDNA2,CDNA2,CDNA2,CDNA2,CDNA2,CDNA2 + ,CDNA,CDNA,CDNA,CDNA,CDNA,CDNA,CDNA,CDNA,CDNA,CDNA,CDNA,CDNA,CDNA,CDNA,CDNA,CDNA,CDNA,CDNA,CDNA + ,RDNA4,RDNA4,RDNA4,RDNA4,,,,,,,,,,,,,,, + ,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3 + ,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2 + ,.. _gpu-support-compatibility-matrix-past-60:,,,,,,,,,,,,,,,,,, + :doc:`GPU / LLVM target `,gfx950,,,,,,,,,,,,,,,,,, +,gfx1201 [#RDNA-OS-past-60]_,gfx1201 [#RDNA-OS-past-60]_,gfx1201 [#RDNA-OS-past-60]_,gfx1201 [#RDNA-OS-past-60]_,,,,,,,,,,,,,,, + ,gfx1200 [#RDNA-OS-past-60]_,gfx1200 [#RDNA-OS-past-60]_,gfx1200 [#RDNA-OS-past-60]_,gfx1200 [#RDNA-OS-past-60]_,,,,,,,,,,,,,,, +,gfx1101 [#RDNA-OS-past-60]_ [#7700XT-OS-past-60]_,gfx1101 [#RDNA-OS-past-60]_ [#7700XT-OS-past-60]_,gfx1101 [#RDNA-OS-past-60]_ [#7700XT-OS-past-60]_,gfx1101 [#RDNA-OS-past-60]_,,,,,,,,,,,,,,, + ,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100 + ,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030 + ,gfx942,gfx942,gfx942,gfx942,gfx942,gfx942,gfx942,gfx942,gfx942,gfx942 [#mi300_624-past-60]_,gfx942 [#mi300_622-past-60]_,gfx942 [#mi300_621-past-60]_,gfx942 [#mi300_620-past-60]_, gfx942 [#mi300_612-past-60]_, gfx942 [#mi300_612-past-60]_, gfx942 [#mi300_611-past-60]_, gfx942 [#mi300_610-past-60]_, gfx942 [#mi300_602-past-60]_, gfx942 [#mi300_600-past-60]_ + 
,gfx90a,gfx90a,gfx90a,gfx90a,gfx90a,gfx90a,gfx90a,gfx90a,gfx90a,gfx90a,gfx90a,gfx90a,gfx90a,gfx90a,gfx90a,gfx90a,gfx90a,gfx90a,gfx90a + ,gfx908,gfx908,gfx908,gfx908,gfx908,gfx908,gfx908,gfx908,gfx908,gfx908,gfx908,gfx908,gfx908,gfx908,gfx908,gfx908,gfx908,gfx908,gfx908 +,,,,,,,,,,,,,,,,,,, + FRAMEWORK SUPPORT,.. _framework-support-compatibility-matrix-past-60:,,,,,,,,,,,,,,,,,, + :doc:`PyTorch <../compatibility/ml-compatibility/pytorch-compatibility>`,"2.7, 2.6, 2.5, 2.4, 2.3","2.6, 2.5, 2.4, 2.3","2.6, 2.5, 2.4, 2.3","2.6, 2.5, 2.4, 2.3","2.6, 2.5, 2.4, 2.3","2.4, 2.3, 2.2, 1.13","2.4, 2.3, 2.2, 1.13","2.4, 2.3, 2.2, 1.13","2.4, 2.3, 2.2, 2.1, 2.0, 1.13","2.3, 2.2, 2.1, 2.0, 1.13","2.3, 2.2, 2.1, 2.0, 1.13","2.3, 2.2, 2.1, 2.0, 1.13","2.3, 2.2, 2.1, 2.0, 1.13","2.1, 2.0, 1.13","2.1, 2.0, 1.13","2.1, 2.0, 1.13","2.1, 2.0, 1.13","2.1, 2.0, 1.13","2.1, 2.0, 1.13" + :doc:`TensorFlow <../compatibility/ml-compatibility/tensorflow-compatibility>`,"2.19.1, 2.18.1, 2.17.1, 2.16.2","2.18.1, 2.17.1, 2.16.2","2.18.1, 2.17.1, 2.16.2","2.18.1, 2.17.1, 2.16.2","2.18.1, 2.17.1, 2.16.2","2.17.0, 2.16.2, 2.15.1","2.17.0, 2.16.2, 2.15.1","2.17.0, 2.16.2, 2.15.1","2.17.0, 2.16.2, 2.15.1","2.16.1, 2.15.1, 2.14.1","2.16.1, 2.15.1, 2.14.1","2.16.1, 2.15.1, 2.14.1","2.16.1, 2.15.1, 2.14.1","2.15.0, 2.14.0, 2.13.1","2.15.0, 2.14.0, 2.13.1","2.15.0, 2.14.0, 2.13.1","2.15.0, 2.14.0, 2.13.1","2.14.0, 2.13.1, 2.12.1","2.14.0, 2.13.1, 2.12.1" + :doc:`JAX <../compatibility/ml-compatibility/jax-compatibility>`,0.6.0,0.4.35,0.4.35,0.4.35,0.4.35,0.4.31,0.4.31,0.4.31,0.4.31,0.4.26,0.4.26,0.4.26,0.4.26,0.4.26,0.4.26,0.4.26,0.4.26,0.4.26,0.4.26 + :doc:`verl <../compatibility/ml-compatibility/verl-compatibility>` [#verl_compat]_,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,0.3.0.post0,N/A,N/A,N/A,N/A,N/A, + :doc:`Stanford Megatron-LM <../compatibility/ml-compatibility/stanford-megatron-lm-compatibility>`,N/A,N/A,N/A,N/A,N/A,85f95ae,85f95ae,85f95ae,85f95ae,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A, + 
:doc:`DGL <../compatibility/ml-compatibility/dgl-compatibility>` [#dgl_compat]_,N/A,N/A,N/A,N/A,2.4.0,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A, + :doc:`Megablocks <../compatibility/ml-compatibility/megablocks-compatibility>`,N/A,N/A,N/A,N/A,N/A,0.7.0,0.7.0,0.7.0,0.7.0,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A, + :doc:`Taichi <../compatibility/ml-compatibility/taichi-compatibility>` [#taichi_compat]_,N/A,N/A,N/A,N/A,N/A,N/A,1.8.0b1,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A, + `ONNX Runtime `_,1.22.1,1.20.0,1.20.0,1.20.0,1.20.0,1.17.3,1.17.3,1.17.3,1.17.3,1.17.3,1.17.3,1.17.3,1.17.3,1.17.3,1.17.3,1.17.3,1.17.3,1.14.1,1.14.1 +,,,,,,,,,,,,,,,,,,, + ,,,,,,,,,,,,,,,,,,, + THIRD PARTY COMMS,.. _thirdpartycomms-support-compatibility-matrix-past-60:,,,,,,,,,,,,,,,,,, + `UCC `_,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.2.0,>=1.2.0 + `UCX `_,>=1.15.0,>=1.15.0,>=1.15.0,>=1.15.0,>=1.15.0,>=1.15.0,>=1.15.0,>=1.15.0,>=1.15.0,>=1.15.0,>=1.15.0,>=1.15.0,>=1.15.0,>=1.14.1,>=1.14.1,>=1.14.1,>=1.14.1,>=1.14.1,>=1.14.1 + ,,,,,,,,,,,,,,,,,,, + THIRD PARTY ALGORITHM,.. _thirdpartyalgorithm-support-compatibility-matrix-past-60:,,,,,,,,,,,,,,,,,, + Thrust,2.5.0,2.5.0,2.5.0,2.5.0,2.5.0,2.3.2,2.3.2,2.3.2,2.3.2,2.2.0,2.2.0,2.2.0,2.2.0,2.1.0,2.1.0,2.1.0,2.1.0,2.0.1,2.0.1 + CUB,2.5.0,2.5.0,2.5.0,2.5.0,2.5.0,2.3.2,2.3.2,2.3.2,2.3.2,2.2.0,2.2.0,2.2.0,2.2.0,2.1.0,2.1.0,2.1.0,2.1.0,2.0.1,2.0.1 +,,,,,,,,,,,,,,,,,,, + KMD & USER SPACE [#kfd_support-past-60]_,.. 
_kfd-userspace-support-compatibility-matrix-past-60:,,,,,,,,,,,,,,,,,, + :doc:`KMD versions `,"7.0.x, 6.4.x, 6.3.x, 6.2.x","6.4.x, 6.3.x, 6.2.x, 6.1.x","6.4.x, 6.3.x, 6.2.x, 6.1.x","6.4.x, 6.3.x, 6.2.x, 6.1.x","6.4.x, 6.3.x, 6.2.x, 6.1.x","6.4.x, 6.3.x, 6.2.x, 6.1.x","6.4.x, 6.3.x, 6.2.x, 6.1.x","6.4.x, 6.3.x, 6.2.x, 6.1.x","6.4.x, 6.3.x, 6.2.x, 6.1.x","6.4.x, 6.3.x, 6.2.x, 6.1.x, 6.0.x","6.4.x, 6.3.x, 6.2.x, 6.1.x, 6.0.x","6.4.x, 6.3.x, 6.2.x, 6.1.x, 6.0.x","6.4.x, 6.3.x, 6.2.x, 6.1.x, 6.0.x","6.4.x, 6.3.x, 6.2.x, 6.1.x, 6.0.x, 5.7.x","6.4.x, 6.3.x, 6.2.x, 6.1.x, 6.0.x, 5.7.x","6.4.x, 6.3.x, 6.2.x, 6.1.x, 6.0.x, 5.7.x","6.4.x, 6.3.x, 6.2.x, 6.1.x, 6.0.x, 5.7.x","6.2.x, 6.1.x, 6.0.x, 5.7.x, 5.6.x","6.2.x, 6.1.x, 6.0.x, 5.7.x, 5.6.x" + ,,,,,,,,,,,,,,,,,,, + ML & COMPUTER VISION,.. _mllibs-support-compatibility-matrix-past-60:,,,,,,,,,,,,,,,,,, + :doc:`Composable Kernel `,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0 + :doc:`MIGraphX `,2.13.0,2.12.0,2.12.0,2.12.0,2.12.0,2.11.0,2.11.0,2.11.0,2.11.0,2.10.0,2.10.0,2.10.0,2.10.0,2.9.0,2.9.0,2.9.0,2.9.0,2.8.0,2.8.0 + :doc:`MIOpen `,3.5.0,3.4.0,3.4.0,3.4.0,3.4.0,3.3.0,3.3.0,3.3.0,3.3.0,3.2.0,3.2.0,3.2.0,3.2.0,3.1.0,3.1.0,3.1.0,3.1.0,3.0.0,3.0.0 + :doc:`MIVisionX `,3.3.0,3.2.0,3.2.0,3.2.0,3.2.0,3.1.0,3.1.0,3.1.0,3.1.0,3.0.0,3.0.0,3.0.0,3.0.0,2.5.0,2.5.0,2.5.0,2.5.0,2.5.0,2.5.0 + :doc:`rocAL `,2.3.0,2.2.0,2.2.0,2.2.0,2.2.0,2.1.0,2.1.0,2.1.0,2.1.0,2.0.0,2.0.0,2.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0 + :doc:`rocDecode `,1.0.0,0.10.0,0.10.0,0.10.0,0.10.0,0.8.0,0.8.0,0.8.0,0.8.0,0.6.0,0.6.0,0.6.0,0.6.0,0.6.0,0.6.0,0.5.0,0.5.0,N/A,N/A + :doc:`rocJPEG `,1.1.0,0.8.0,0.8.0,0.8.0,0.8.0,0.6.0,0.6.0,0.6.0,0.6.0,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A + :doc:`rocPyDecode `,0.6.0,0.3.1,0.3.1,0.3.1,0.3.1,0.2.0,0.2.0,0.2.0,0.2.0,0.1.0,0.1.0,0.1.0,0.1.0,N/A,N/A,N/A,N/A,N/A,N/A + :doc:`RPP 
`,2.0.0,1.9.10,1.9.10,1.9.10,1.9.10,1.9.1,1.9.1,1.9.1,1.9.1,1.8.0,1.8.0,1.8.0,1.8.0,1.5.0,1.5.0,1.5.0,1.5.0,1.4.0,1.4.0 + ,,,,,,,,,,,,,,,,,,, + COMMUNICATION,.. _commlibs-support-compatibility-matrix-past-60:,,,,,,,,,,,,,,,,,, + :doc:`RCCL `,2.26.6,2.22.3,2.22.3,2.22.3,2.22.3,2.21.5,2.21.5,2.21.5,2.21.5,2.20.5,2.20.5,2.20.5,2.20.5,2.18.6,2.18.6,2.18.6,2.18.6,2.18.3,2.18.3 + :doc:`rocSHMEM `,3.0.0,2.0.1,2.0.1,2.0.0,2.0.0,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A + ,,,,,,,,,,,,,,,,,,, + MATH LIBS,.. _mathlibs-support-compatibility-matrix-past-60:,,,,,,,,,,,,,,,,,, + `half `_ ,1.12.0,1.12.0,1.12.0,1.12.0,1.12.0,1.12.0,1.12.0,1.12.0,1.12.0,1.12.0,1.12.0,1.12.0,1.12.0,1.12.0,1.12.0,1.12.0,1.12.0,1.12.0,1.12.0 + :doc:`hipBLAS `,3.0.0,2.4.0,2.4.0,2.4.0,2.4.0,2.3.0,2.3.0,2.3.0,2.3.0,2.2.0,2.2.0,2.2.0,2.2.0,2.1.0,2.1.0,2.1.0,2.1.0,2.0.0,2.0.0 + :doc:`hipBLASLt `,1.0.0,0.12.1,0.12.1,0.12.1,0.12.0,0.10.0,0.10.0,0.10.0,0.10.0,0.8.0,0.8.0,0.8.0,0.8.0,0.7.0,0.7.0,0.7.0,0.7.0,0.6.0,0.6.0 + :doc:`hipFFT `,1.0.20,1.0.18,1.0.18,1.0.18,1.0.18,1.0.17,1.0.17,1.0.17,1.0.17,1.0.16,1.0.15,1.0.15,1.0.14,1.0.14,1.0.14,1.0.14,1.0.14,1.0.13,1.0.13 + :doc:`hipfort `,0.7.0,0.6.0,0.6.0,0.6.0,0.6.0,0.5.1,0.5.1,0.5.0,0.5.0,0.4.0,0.4.0,0.4.0,0.4.0,0.4.0,0.4.0,0.4.0,0.4.0,0.4.0,0.4.0 + :doc:`hipRAND `,3.0.0,2.12.0,2.12.0,2.12.0,2.12.0,2.11.1,2.11.1,2.11.1,2.11.0,2.11.1,2.11.0,2.11.0,2.11.0,2.10.16,2.10.16,2.10.16,2.10.16,2.10.16,2.10.16 + :doc:`hipSOLVER `,3.0.0,2.4.0,2.4.0,2.4.0,2.4.0,2.3.0,2.3.0,2.3.0,2.3.0,2.2.0,2.2.0,2.2.0,2.2.0,2.1.1,2.1.1,2.1.1,2.1.0,2.0.0,2.0.0 + :doc:`hipSPARSE `,4.0.1,3.2.0,3.2.0,3.2.0,3.2.0,3.1.2,3.1.2,3.1.2,3.1.2,3.1.1,3.1.1,3.1.1,3.1.1,3.0.1,3.0.1,3.0.1,3.0.1,3.0.0,3.0.0 + :doc:`hipSPARSELt `,0.2.4,0.2.3,0.2.3,0.2.3,0.2.3,0.2.2,0.2.2,0.2.2,0.2.2,0.2.1,0.2.1,0.2.1,0.2.1,0.2.0,0.2.0,0.1.0,0.1.0,0.1.0,0.1.0 + :doc:`rocALUTION `,4.0.0,3.2.3,3.2.3,3.2.3,3.2.2,3.2.1,3.2.1,3.2.1,3.2.1,3.2.1,3.2.0,3.2.0,3.2.0,3.1.1,3.1.1,3.1.1,3.1.1,3.0.3,3.0.3 + :doc:`rocBLAS 
`,5.0.0,4.4.1,4.4.1,4.4.0,4.4.0,4.3.0,4.3.0,4.3.0,4.3.0,4.2.4,4.2.1,4.2.1,4.2.0,4.1.2,4.1.2,4.1.0,4.1.0,4.0.0,4.0.0 + :doc:`rocFFT `,1.0.34,1.0.32,1.0.32,1.0.32,1.0.32,1.0.31,1.0.31,1.0.31,1.0.31,1.0.30,1.0.29,1.0.29,1.0.28,1.0.27,1.0.27,1.0.27,1.0.26,1.0.25,1.0.23 + :doc:`rocRAND `,4.0.0,3.3.0,3.3.0,3.3.0,3.3.0,3.2.0,3.2.0,3.2.0,3.2.0,3.1.1,3.1.0,3.1.0,3.1.0,3.0.1,3.0.1,3.0.1,3.0.1,3.0.0,2.10.17 + :doc:`rocSOLVER `,3.30.0,3.28.2,3.28.2,3.28.0,3.28.0,3.27.0,3.27.0,3.27.0,3.27.0,3.26.2,3.26.0,3.26.0,3.26.0,3.25.0,3.25.0,3.25.0,3.25.0,3.24.0,3.24.0 + :doc:`rocSPARSE `,4.0.2,3.4.0,3.4.0,3.4.0,3.4.0,3.3.0,3.3.0,3.3.0,3.3.0,3.2.1,3.2.0,3.2.0,3.2.0,3.1.2,3.1.2,3.1.2,3.1.2,3.0.2,3.0.2 + :doc:`rocWMMA `,2.0.0,1.7.0,1.7.0,1.7.0,1.7.0,1.6.0,1.6.0,1.6.0,1.6.0,1.5.0,1.5.0,1.5.0,1.5.0,1.4.0,1.4.0,1.4.0,1.4.0,1.3.0,1.3.0 + :doc:`Tensile `,4.44.0,4.43.0,4.43.0,4.43.0,4.43.0,4.42.0,4.42.0,4.42.0,4.42.0,4.41.0,4.41.0,4.41.0,4.41.0,4.40.0,4.40.0,4.40.0,4.40.0,4.39.0,4.39.0 + ,,,,,,,,,,,,,,,,,,, + PRIMITIVES,.. 
_primitivelibs-support-compatibility-matrix-past-60:,,,,,,,,,,,,,,,,,, + :doc:`hipCUB `,4.0.0,3.4.0,3.4.0,3.4.0,3.4.0,3.3.0,3.3.0,3.3.0,3.3.0,3.2.1,3.2.0,3.2.0,3.2.0,3.1.0,3.1.0,3.1.0,3.1.0,3.0.0,3.0.0 + :doc:`hipTensor `,2.0.0,1.5.0,1.5.0,1.5.0,1.5.0,1.4.0,1.4.0,1.4.0,1.4.0,1.3.0,1.3.0,1.3.0,1.3.0,1.2.0,1.2.0,1.2.0,1.2.0,1.1.0,1.1.0 + :doc:`rocPRIM `,4.0.0,3.4.1,3.4.1,3.4.0,3.4.0,3.3.0,3.3.0,3.3.0,3.3.0,3.2.2,3.2.0,3.2.0,3.2.0,3.1.0,3.1.0,3.1.0,3.1.0,3.0.0,3.0.0 + :doc:`rocThrust `,4.0.0,3.3.0,3.3.0,3.3.0,3.3.0,3.3.0,3.3.0,3.3.0,3.3.0,3.1.1,3.1.0,3.1.0,3.0.1,3.0.1,3.0.1,3.0.1,3.0.1,3.0.0,3.0.0 + ,,,,,,,,,,,,,,,,,,, + SUPPORT LIBS,,,,,,,,,,,,,,,,,,, + `hipother `_,7.0.51830,6.4.43483,6.4.43483,6.4.43483,6.4.43482,6.3.42134,6.3.42134,6.3.42133,6.3.42131,6.2.41134,6.2.41134,6.2.41134,6.2.41133,6.1.40093,6.1.40093,6.1.40092,6.1.40091,6.1.32831,6.1.32830 + `rocm-core `_,7.0.0,6.4.3,6.4.2,6.4.1,6.4.0,6.3.3,6.3.2,6.3.1,6.3.0,6.2.4,6.2.2,6.2.1,6.2.0,6.1.5,6.1.2,6.1.1,6.1.0,6.0.2,6.0.0 + `ROCT-Thunk-Interface `_,N/A [#ROCT-rocr-past-60]_,N/A [#ROCT-rocr-past-60]_,N/A [#ROCT-rocr-past-60]_,N/A [#ROCT-rocr-past-60]_,N/A [#ROCT-rocr-past-60]_,N/A [#ROCT-rocr-past-60]_,N/A [#ROCT-rocr-past-60]_,N/A [#ROCT-rocr-past-60]_,N/A [#ROCT-rocr-past-60]_,20240607.5.7,20240607.5.7,20240607.4.05,20240607.1.4246,20240125.5.08,20240125.5.08,20240125.5.08,20240125.3.30,20231016.2.245,20231016.2.245 + ,,,,,,,,,,,,,,,,,,, + SYSTEM MGMT TOOLS,.. 
_tools-support-compatibility-matrix-past-60:,,,,,,,,,,,,,,,,,, + :doc:`AMD SMI `,26.0.0,25.5.1,25.5.1,25.4.2,25.3.0,24.7.1,24.7.1,24.7.1,24.7.1,24.6.3,24.6.3,24.6.3,24.6.2,24.5.1,24.5.1,24.5.1,24.4.1,23.4.2,23.4.2 + :doc:`ROCm Data Center Tool `,1.1.0,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0 + :doc:`rocminfo `,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0 + :doc:`ROCm SMI `,7.8.0,7.7.0,7.5.0,7.5.0,7.5.0,7.4.0,7.4.0,7.4.0,7.4.0,7.3.0,7.3.0,7.3.0,7.3.0,7.2.0,7.2.0,7.0.0,7.0.0,6.0.2,6.0.0 + :doc:`ROCm Validation Suite `,1.2.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.0.60204,1.0.60202,1.0.60201,1.0.60200,1.0.60105,1.0.60102,1.0.60101,1.0.60100,1.0.60002,1.0.60000 + ,,,,,,,,,,,,,,,,,,, + PERFORMANCE TOOLS,,,,,,,,,,,,,,,,,,, + :doc:`ROCm Bandwidth Test `,2.6.0,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0 + :doc:`ROCm Compute Profiler `,3.2.3,3.1.1,3.1.1,3.1.0,3.1.0,3.0.0,3.0.0,3.0.0,3.0.0,2.0.1,2.0.1,2.0.1,2.0.1,N/A,N/A,N/A,N/A,N/A,N/A + :doc:`ROCm Systems Profiler `,1.1.0,1.0.2,1.0.2,1.0.1,1.0.0,0.1.2,0.1.1,0.1.0,0.1.0,1.11.2,1.11.2,1.11.2,1.11.2,N/A,N/A,N/A,N/A,N/A,N/A + :doc:`ROCProfiler `,2.0.70000,2.0.60403,2.0.60402,2.0.60401,2.0.60400,2.0.60303,2.0.60302,2.0.60301,2.0.60300,2.0.60204,2.0.60202,2.0.60201,2.0.60200,2.0.60105,2.0.60102,2.0.60101,2.0.60100,2.0.60002,2.0.60000 + :doc:`ROCprofiler-SDK `,1.0.0,0.6.0,0.6.0,0.6.0,0.6.0,0.5.0,0.5.0,0.5.0,0.5.0,0.4.0,0.4.0,0.4.0,0.4.0,N/A,N/A,N/A,N/A,N/A,N/A + :doc:`ROCTracer `,4.1.70000,4.1.60403,4.1.60402,4.1.60401,4.1.60400,4.1.60303,4.1.60302,4.1.60301,4.1.60300,4.1.60204,4.1.60202,4.1.60201,4.1.60200,4.1.60105,4.1.60102,4.1.60101,4.1.60100,4.1.60002,4.1.60000 + ,,,,,,,,,,,,,,,,,,, + DEVELOPMENT TOOLS,,,,,,,,,,,,,,,,,,, + :doc:`HIPIFY 
`,20.0.0,19.0.0,19.0.0,19.0.0,19.0.0,18.0.0.25012,18.0.0.25012,18.0.0.24491,18.0.0.24455,18.0.0.24392,18.0.0.24355,18.0.0.24355,18.0.0.24232,17.0.0.24193,17.0.0.24193,17.0.0.24154,17.0.0.24103,17.0.0.24012,17.0.0.23483 + :doc:`ROCm CMake `,0.14.0,0.14.0,0.14.0,0.14.0,0.14.0,0.14.0,0.14.0,0.14.0,0.14.0,0.13.0,0.13.0,0.13.0,0.13.0,0.12.0,0.12.0,0.12.0,0.12.0,0.11.0,0.11.0 + :doc:`ROCdbgapi `,0.77.3,0.77.2,0.77.2,0.77.2,0.77.2,0.77.0,0.77.0,0.77.0,0.77.0,0.76.0,0.76.0,0.76.0,0.76.0,0.71.0,0.71.0,0.71.0,0.71.0,0.71.0,0.71.0 + :doc:`ROCm Debugger (ROCgdb) `,16.3.0,15.2.0,15.2.0,15.2.0,15.2.0,15.2.0,15.2.0,15.2.0,15.2.0,14.2.0,14.2.0,14.2.0,14.2.0,14.1.0,14.1.0,14.1.0,14.1.0,13.2.0,13.2.0 + `rocprofiler-register `_,0.5.0,0.4.0,0.4.0,0.4.0,0.4.0,0.4.0,0.4.0,0.4.0,0.4.0,0.4.0,0.4.0,0.4.0,0.4.0,0.3.0,0.3.0,0.3.0,0.3.0,N/A,N/A + :doc:`ROCr Debug Agent `,2.1.0,2.0.4,2.0.4,2.0.4,2.0.4,2.0.3,2.0.3,2.0.3,2.0.3,2.0.3,2.0.3,2.0.3,2.0.3,2.0.3,2.0.3,2.0.3,2.0.3,2.0.3,2.0.3 + ,,,,,,,,,,,,,,,,,,, + COMPILERS,.. 
_compilers-support-compatibility-matrix-past-60:,,,,,,,,,,,,,,,,,, + `clang-ocl `_,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,0.5.0,0.5.0,0.5.0,0.5.0,0.5.0,0.5.0 + :doc:`hipCC `,1.1.1,1.1.1,1.1.1,1.1.1,1.1.1,1.1.1,1.1.1,1.1.1,1.1.1,1.1.1,1.1.1,1.1.1,1.1.1,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0 + `Flang `_,20.0.0.25314,19.0.0.25224,19.0.0.25224,19.0.0.25184,19.0.0.25133,18.0.0.25012,18.0.0.25012,18.0.0.24491,18.0.0.24455,18.0.0.24392,18.0.0.24355,18.0.0.24355,18.0.0.24232,17.0.0.24193,17.0.0.24193,17.0.0.24154,17.0.0.24103,17.0.0.24012,17.0.0.23483 + :doc:`llvm-project `,20.0.0.25314,19.0.0.25224,19.0.0.25224,19.0.0.25184,19.0.0.25133,18.0.0.25012,18.0.0.25012,18.0.0.24491,18.0.0.24491,18.0.0.24392,18.0.0.24355,18.0.0.24355,18.0.0.24232,17.0.0.24193,17.0.0.24193,17.0.0.24154,17.0.0.24103,17.0.0.24012,17.0.0.23483 + `OpenMP `_,20.0.0.25314,19.0.0.25224,19.0.0.25224,19.0.0.25184,19.0.0.25133,18.0.0.25012,18.0.0.25012,18.0.0.24491,18.0.0.24491,18.0.0.24392,18.0.0.24355,18.0.0.24355,18.0.0.24232,17.0.0.24193,17.0.0.24193,17.0.0.24154,17.0.0.24103,17.0.0.24012,17.0.0.23483 +,,,,,,,,,,,,,,,,,,, + RUNTIMES,.. 
_runtime-support-compatibility-matrix-past-60:,,,,,,,,,,,,,,,,,, + :doc:`AMD CLR `,7.0.51830,6.4.43484,6.4.43484,6.4.43483,6.4.43482,6.3.42134,6.3.42134,6.3.42133,6.3.42131,6.2.41134,6.2.41134,6.2.41134,6.2.41133,6.1.40093,6.1.40093,6.1.40092,6.1.40091,6.1.32831,6.1.32830 + :doc:`HIP `,7.0.51830,6.4.43484,6.4.43484,6.4.43483,6.4.43482,6.3.42134,6.3.42134,6.3.42133,6.3.42131,6.2.41134,6.2.41134,6.2.41134,6.2.41133,6.1.40093,6.1.40093,6.1.40092,6.1.40091,6.1.32831,6.1.32830 + `OpenCL Runtime `_,2.0.0,2.0.0,2.0.0,2.0.0,2.0.0,2.0.0,2.0.0,2.0.0,2.0.0,2.0.0,2.0.0,2.0.0,2.0.0,2.0.0,2.0.0,2.0.0,2.0.0,2.0.0,2.0.0 + :doc:`ROCr Runtime `,1.18.0,1.15.0,1.15.0,1.15.0,1.15.0,1.14.0,1.14.0,1.14.0,1.14.0,1.14.0,1.14.0,1.14.0,1.13.0,1.13.0,1.13.0,1.13.0,1.13.0,1.12.0,1.12.0 diff --git a/docs/compatibility/compatibility-matrix.rst b/docs/compatibility/compatibility-matrix.rst index 3cae61198..e5c89265a 100644 --- a/docs/compatibility/compatibility-matrix.rst +++ b/docs/compatibility/compatibility-matrix.rst @@ -23,26 +23,30 @@ compatibility and system requirements. .. container:: format-big-table .. 
csv-table:: - :header: "ROCm Version", "6.4.3", "6.4.2", "6.3.0" + :header: "ROCm Version", "7.0.0", "6.4.3", "6.3.0" :stub-columns: 1 - :ref:`Operating systems & kernels `,Ubuntu 24.04.2,Ubuntu 24.04.2,Ubuntu 24.04.2 + :ref:`Operating systems & kernels `,Ubuntu 24.04.3,Ubuntu 24.04.2,Ubuntu 24.04.2 ,Ubuntu 22.04.5,Ubuntu 22.04.5,Ubuntu 22.04.5 + ,RHEL 10,, ,"RHEL 9.6, 9.4","RHEL 9.6, 9.4","RHEL 9.5, 9.4" ,RHEL 8.10,RHEL 8.10,RHEL 8.10 - ,"SLES 15 SP7, SP6","SLES 15 SP7, SP6","SLES 15 SP6, SP5" - ,"Oracle Linux 9, 8 [#mi300x]_","Oracle Linux 9, 8 [#mi300x]_",Oracle Linux 8.10 [#mi300x]_ + ,SLES 15 SP7,"SLES 15 SP7, SP6","SLES 15 SP6, SP5" + ,"Oracle Linux 10, 9, 8 [#ol-mi300x]_","Oracle Linux 9, 8 [#ol-mi300x]_",Oracle Linux 8.10 [#ol-mi300x]_ ,Debian 12 [#single-node]_,Debian 12 [#single-node]_, - ,Azure Linux 3.0 [#mi300x]_,Azure Linux 3.0 [#mi300x]_, + ,Azure Linux 3.0 [#az-mi300x]_,Azure Linux 3.0 [#az-mi300x]_, + ,Rocky 9,, ,.. _architecture-support-compatibility-matrix:,, - :doc:`Architecture `,CDNA3,CDNA3,CDNA3 + :doc:`Architecture `,CDNA4,, + ,CDNA3,CDNA3,CDNA3 ,CDNA2,CDNA2,CDNA2 ,CDNA,CDNA,CDNA ,RDNA4,RDNA4, ,RDNA3,RDNA3,RDNA3 ,RDNA2,RDNA2,RDNA2 ,.. _gpu-support-compatibility-matrix:,, - :doc:`GPU / LLVM target `,gfx1201 [#RDNA-OS]_,gfx1201 [#RDNA-OS]_, + :doc:`GPU / LLVM target `,gfx950,, + ,gfx1201 [#RDNA-OS]_,gfx1201 [#RDNA-OS]_, ,gfx1200 [#RDNA-OS]_,gfx1200 [#RDNA-OS]_, ,gfx1101 [#RDNA-OS]_ [#7700XT-OS]_,gfx1101 [#RDNA-OS]_ [#7700XT-OS]_, ,gfx1100,gfx1100,gfx1100 @@ -52,12 +56,15 @@ compatibility and system requirements. ,gfx908,gfx908,gfx908 ,,, FRAMEWORK SUPPORT,.. 
_framework-support-compatibility-matrix:,, - :doc:`PyTorch <../compatibility/ml-compatibility/pytorch-compatibility>`,"2.6, 2.5, 2.4, 2.3","2.6, 2.5, 2.4, 2.3","2.4, 2.3, 2.2, 2.1, 2.0, 1.13" - :doc:`TensorFlow <../compatibility/ml-compatibility/tensorflow-compatibility>`,"2.18.1, 2.17.1, 2.16.2","2.18.1, 2.17.1, 2.16.2","2.17.0, 2.16.2, 2.15.1" - :doc:`JAX <../compatibility/ml-compatibility/jax-compatibility>`,0.4.35,0.4.35,0.4.31 + :doc:`PyTorch <../compatibility/ml-compatibility/pytorch-compatibility>`,"2.7, 2.6, 2.5, 2.4, 2.3","2.6, 2.5, 2.4, 2.3","2.4, 2.3, 2.2, 2.1, 2.0, 1.13" + :doc:`TensorFlow <../compatibility/ml-compatibility/tensorflow-compatibility>`,"2.19.1, 2.18.1, 2.17.1, 2.16.2","2.18.1, 2.17.1, 2.16.2","2.17.0, 2.16.2, 2.15.1" + :doc:`JAX <../compatibility/ml-compatibility/jax-compatibility>`,0.6.0,0.4.35,0.4.31 + :doc:`verl <../compatibility/ml-compatibility/verl-compatibility>` [#verl_compat]_,N/A,N/A,N/A :doc:`Stanford Megatron-LM <../compatibility/ml-compatibility/stanford-megatron-lm-compatibility>`,N/A,N/A,85f95ae + :doc:`DGL <../compatibility/ml-compatibility/dgl-compatibility>` [#dgl_compat]_,N/A,N/A,N/A :doc:`Megablocks <../compatibility/ml-compatibility/megablocks-compatibility>`,N/A,N/A,0.7.0 - `ONNX Runtime `_,1.2,1.2,1.17.3 + :doc:`Taichi <../compatibility/ml-compatibility/taichi-compatibility>` [#taichi_compat]_,N/A,N/A,N/A + `ONNX Runtime `_,1.22.1,1.20.0,1.17.3 ,,, THIRD PARTY COMMS,.. _thirdpartycomms-support-compatibility-matrix:,, `UCC `_,>=1.3.0,>=1.3.0,>=1.3.0 @@ -68,94 +75,94 @@ compatibility and system requirements. CUB,2.5.0,2.5.0,2.3.2 ,,, KMD & USER SPACE [#kfd_support]_,.. _kfd-userspace-support-compatibility-matrix:,, - :doc:`KMD versions `,"6.4.x, 6.3.x, 6.2.x, 6.1.x","6.4.x, 6.3.x, 6.2.x, 6.1.x","6.4.x, 6.3.x, 6.2.x, 6.1.x" + :doc:`KMD versions `,"7.0.x, 6.4.x, 6.3.x, 6.2.x","6.4.x, 6.3.x, 6.2.x, 6.1.x","6.4.x, 6.3.x, 6.2.x, 6.1.x" ,,, ML & COMPUTER VISION,.. 
_mllibs-support-compatibility-matrix:,, :doc:`Composable Kernel `,1.1.0,1.1.0,1.1.0 - :doc:`MIGraphX `,2.12.0,2.12.0,2.11.0 - :doc:`MIOpen `,3.4.0,3.4.0,3.3.0 - :doc:`MIVisionX `,3.2.0,3.2.0,3.1.0 - :doc:`rocAL `,2.2.0,2.2.0,2.1.0 - :doc:`rocDecode `,0.10.0,0.10.0,0.8.0 - :doc:`rocJPEG `,0.8.0,0.8.0,0.6.0 - :doc:`rocPyDecode `,0.3.1,0.3.1,0.2.0 - :doc:`RPP `,1.9.10,1.9.10,1.9.1 + :doc:`MIGraphX `,2.13.0,2.12.0,2.11.0 + :doc:`MIOpen `,3.5.0,3.4.0,3.3.0 + :doc:`MIVisionX `,3.3.0,3.2.0,3.1.0 + :doc:`rocAL `,2.3.0,2.2.0,2.1.0 + :doc:`rocDecode `,1.0.0,0.10.0,0.8.0 + :doc:`rocJPEG `,1.1.0,0.8.0,0.6.0 + :doc:`rocPyDecode `,0.6.0,0.3.1,0.2.0 + :doc:`RPP `,2.0.0,1.9.10,1.9.1 ,,, COMMUNICATION,.. _commlibs-support-compatibility-matrix:,, - :doc:`RCCL `,2.22.3,2.22.3,2.21.5 - :doc:`rocSHMEM `,2.0.1,2.0.1,N/A + :doc:`RCCL `,2.26.6,2.22.3,2.21.5 + :doc:`rocSHMEM `,3.0.0,2.0.1,N/A ,,, MATH LIBS,.. _mathlibs-support-compatibility-matrix:,, `half `_ ,1.12.0,1.12.0,1.12.0 - :doc:`hipBLAS `,2.4.0,2.4.0,2.3.0 - :doc:`hipBLASLt `,0.12.1,0.12.1,0.10.0 - :doc:`hipFFT `,1.0.18,1.0.18,1.0.17 - :doc:`hipfort `,0.6.0,0.6.0,0.5.0 - :doc:`hipRAND `,2.12.0,2.12.0,2.11.0 - :doc:`hipSOLVER `,2.4.0,2.4.0,2.3.0 - :doc:`hipSPARSE `,3.2.0,3.2.0,3.1.2 - :doc:`hipSPARSELt `,0.2.3,0.2.3,0.2.2 - :doc:`rocALUTION `,3.2.3,3.2.3,3.2.1 - :doc:`rocBLAS `,4.4.1,4.4.1,4.3.0 - :doc:`rocFFT `,1.0.32,1.0.32,1.0.31 - :doc:`rocRAND `,3.3.0,3.3.0,3.2.0 - :doc:`rocSOLVER `,3.28.2,3.28.2,3.27.0 - :doc:`rocSPARSE `,3.4.0,3.4.0,3.3.0 - :doc:`rocWMMA `,1.7.0,1.7.0,1.6.0 - :doc:`Tensile `,4.43.0,4.43.0,4.42.0 + :doc:`hipBLAS `,3.0.0,2.4.0,2.3.0 + :doc:`hipBLASLt `,1.0.0,0.12.1,0.10.0 + :doc:`hipFFT `,1.0.20,1.0.18,1.0.17 + :doc:`hipfort `,0.7.0,0.6.0,0.5.0 + :doc:`hipRAND `,3.0.0,2.12.0,2.11.0 + :doc:`hipSOLVER `,3.0.0,2.4.0,2.3.0 + :doc:`hipSPARSE `,4.0.1,3.2.0,3.1.2 + :doc:`hipSPARSELt `,0.2.4,0.2.3,0.2.2 + :doc:`rocALUTION `,4.0.0,3.2.3,3.2.1 + :doc:`rocBLAS `,5.0.0,4.4.1,4.3.0 + :doc:`rocFFT `,1.0.34,1.0.32,1.0.31 + 
:doc:`rocRAND `,4.0.0,3.3.0,3.2.0 + :doc:`rocSOLVER `,3.30.0,3.28.2,3.27.0 + :doc:`rocSPARSE `,4.0.2,3.4.0,3.3.0 + :doc:`rocWMMA `,2.0.0,1.7.0,1.6.0 + :doc:`Tensile `,4.44.0,4.43.0,4.42.0 ,,, PRIMITIVES,.. _primitivelibs-support-compatibility-matrix:,, - :doc:`hipCUB `,3.4.0,3.4.0,3.3.0 - :doc:`hipTensor `,1.5.0,1.5.0,1.4.0 - :doc:`rocPRIM `,3.4.1,3.4.1,3.3.0 - :doc:`rocThrust `,3.3.0,3.3.0,3.3.0 + :doc:`hipCUB `,4.0.0,3.4.0,3.3.0 + :doc:`hipTensor `,2.0.0,1.5.0,1.4.0 + :doc:`rocPRIM `,4.0.0,3.4.1,3.3.0 + :doc:`rocThrust `,4.0.0,3.3.0,3.3.0 ,,, SUPPORT LIBS,,, - `hipother `_,6.4.43483,6.4.43483,6.3.42131 - `rocm-core `_,6.4.3,6.4.2,6.3.0 + `hipother `_,7.0.51830,6.4.43483,6.3.42131 + `rocm-core `_,7.0.0,6.4.3,6.3.0 `ROCT-Thunk-Interface `_,N/A [#ROCT-rocr]_,N/A [#ROCT-rocr]_,N/A [#ROCT-rocr]_ ,,, SYSTEM MGMT TOOLS,.. _tools-support-compatibility-matrix:,, - :doc:`AMD SMI `,25.5.1,25.5.1,24.7.1 - :doc:`ROCm Data Center Tool `,0.3.0,0.3.0,0.3.0 + :doc:`AMD SMI `,26.0.0,25.5.1,24.7.1 + :doc:`ROCm Data Center Tool `,1.1.0,0.3.0,0.3.0 :doc:`rocminfo `,1.0.0,1.0.0,1.0.0 - :doc:`ROCm SMI `,7.7.0,7.5.0,7.4.0 - :doc:`ROCm Validation Suite `,1.1.0,1.1.0,1.1.0 + :doc:`ROCm SMI `,7.8.0,7.7.0,7.4.0 + :doc:`ROCm Validation Suite `,1.2.0,1.1.0,1.1.0 ,,, PERFORMANCE TOOLS,,, - :doc:`ROCm Bandwidth Test `,1.4.0,1.4.0,1.4.0 - :doc:`ROCm Compute Profiler `,3.1.1,3.1.1,3.0.0 - :doc:`ROCm Systems Profiler `,1.0.2,1.0.2,0.1.0 - :doc:`ROCProfiler `,2.0.60403,2.0.60402,2.0.60300 - :doc:`ROCprofiler-SDK `,0.6.0,0.6.0,0.5.0 - :doc:`ROCTracer `,4.1.60403,4.1.60402,4.1.60300 + :doc:`ROCm Bandwidth Test `,2.6.0,1.4.0,1.4.0 + :doc:`ROCm Compute Profiler `,3.2.3,3.1.1,3.0.0 + :doc:`ROCm Systems Profiler `,1.1.0,1.0.2,0.1.0 + :doc:`ROCProfiler `,2.0.70000,2.0.60403,2.0.60300 + :doc:`ROCprofiler-SDK `,1.0.0,0.6.0,0.5.0 + :doc:`ROCTracer `,4.1.70000,4.1.60403,4.1.60300 ,,, DEVELOPMENT TOOLS,,, - :doc:`HIPIFY `,19.0.0,19.0.0,18.0.0.24455 + :doc:`HIPIFY `,20.0.0,19.0.0,18.0.0.24455 :doc:`ROCm CMake 
`,0.14.0,0.14.0,0.14.0 - :doc:`ROCdbgapi `,0.77.2,0.77.2,0.77.0 - :doc:`ROCm Debugger (ROCgdb) `,15.2.0,15.2.0,15.2.0 - `rocprofiler-register `_,0.4.0,0.4.0,0.4.0 - :doc:`ROCr Debug Agent `,2.0.4,2.0.4,2.0.3 + :doc:`ROCdbgapi `,0.77.3,0.77.2,0.77.0 + :doc:`ROCm Debugger (ROCgdb) `,16.3.0,15.2.0,15.2.0 + `rocprofiler-register `_,0.5.0,0.4.0,0.4.0 + :doc:`ROCr Debug Agent `,2.1.0,2.0.4,2.0.3 ,,, COMPILERS,.. _compilers-support-compatibility-matrix:,, `clang-ocl `_,N/A,N/A,N/A :doc:`hipCC `,1.1.1,1.1.1,1.1.1 - `Flang `_,19.0.0.25224,19.0.0.25224,18.0.0.24455 - :doc:`llvm-project `,19.0.0.25224,19.0.0.25224,18.0.0.24491 - `OpenMP `_,19.0.0.25224,19.0.0.25224,18.0.0.24491 + `Flang `_,20.0.0.25314,19.0.0.25224,18.0.0.24455 + :doc:`llvm-project `,20.0.0.25314,19.0.0.25224,18.0.0.24491 + `OpenMP `_,20.0.0.25314,19.0.0.25224,18.0.0.24491 ,,, RUNTIMES,.. _runtime-support-compatibility-matrix:,, - :doc:`AMD CLR `,6.4.43484,6.4.43484,6.3.42131 - :doc:`HIP `,6.4.43484,6.4.43484,6.3.42131 + :doc:`AMD CLR `,7.0.51830,6.4.43484,6.3.42131 + :doc:`HIP `,7.0.51830,6.4.43484,6.3.42131 `OpenCL Runtime `_,2.0.0,2.0.0,2.0.0 - :doc:`ROCr Runtime `,1.15.0,1.15.0,1.14.0 - + :doc:`ROCr Runtime `,1.18.0,1.15.0,1.14.0 .. rubric:: Footnotes -.. [#mi300x] Oracle Linux and Azure Linux are supported only on AMD Instinct MI300X. +.. [#ol-mi300x] Oracle Linux 10 and 9 are supported only on AMD Instinct MI300X, MI350X, and MI355X. Oracle Linux 8 is only supported on AMD Instinct MI300X. .. [#single-node] Debian 12 is supported only on AMD Instinct MI300X for single-node functionality. +.. [#az-mi300x] Azure Linux 3.0 is supported only on AMD Instinct MI300X and AMD Radeon PRO V710. .. [#RDNA-OS] Radeon AI PRO R9700, Radeon RX 9070 XT (gfx1201), Radeon RX 9060 XT (gfx1200), Radeon PRO W7700 (gfx1101), and Radeon RX 7800 XT (gfx1101) are supported only on Ubuntu 24.04.2, Ubuntu 22.04.5, RHEL 9.6, and RHEL 9.4. .. [#7700XT-OS] Radeon RX 7700 XT (gfx1101) is supported only on Ubuntu 24.04.2 and RHEL 9.6. 
.. [#kfd_support] As of ROCm 6.4.0, forward and backward compatibility between the AMD Kernel-mode GPU Driver (KMD) and its user space software is provided up to a year apart. For earlier ROCm releases, the compatibility is provided for +/- 2 releases. The tested user space versions on this page were accurate as of the time of initial ROCm release. For the most up-to-date information, see the latest version of this information at `User and kernel-space support matrix `_. diff --git a/docs/conf.py b/docs/conf.py index 27cd7f167..9250a9447 100644 --- a/docs/conf.py +++ b/docs/conf.py @@ -90,7 +90,7 @@ all_article_info_author = "" # pages with specific settings article_pages = [ - {"file": "about/release-notes", "os": ["linux"], "date": "2025-08-26"}, + {"file": "about/release-notes", "os": ["linux"], "date": "2025-08-28"}, {"file": "release/changelog", "os": ["linux"],}, {"file": "compatibility/compatibility-matrix", "os": ["linux"]}, {"file": "compatibility/ml-compatibility/pytorch-compatibility", "os": ["linux"]}, From ff7d9eb17a670519964ccd778b0015174a087285 Mon Sep 17 00:00:00 2001 From: Pratik Basyal Date: Thu, 21 Aug 2025 21:34:11 -0400 Subject: [PATCH 19/58] Post RC4 7.0.0 release notes update [Batch 2] (#519) * Indentation and formatting updated * Compatibility updated * OS support updated * Changelog synced * AMD SMI link updated * Broken links fixed * Changelog synced --- CHANGELOG.md | 565 +++++++++--------- RELEASE.md | 20 +- .../compatibility-matrix-historical-6.0.csv | 10 +- docs/compatibility/compatibility-matrix.rst | 46 +- 4 files changed, 311 insertions(+), 330 deletions(-) diff --git a/CHANGELOG.md b/CHANGELOG.md index a2e827555..682a3401b 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -11,30 +11,35 @@ for a complete overview of this release. ### **AMD SMI** (26.0.0) -### Added +#### Added -* The Default command. +* Ability to restart the AMD GPU driver from the CLI and API. 
+ - `amdsmi_gpu_driver_reload()` API and `amd-smi reset --reload-driver` or `amd-smi reset -r` CLI options. + - Driver reload functionality is now separated from memory partition + functions; memory partition change requests should now be followed by a driver reload. + - Driver reload requires all GPU activity on all devices to be stopped. - A default view has been added. The default view provides a snapshot of commonly requested information such as bdf, current partition mode, version information, and more. Users can access that information by simply typing `amd-smi` with no additional commands or arguments. Users may also obtain this information through laternate output formats such as json or csv by using the default command with the respective output format: `amd-smi default --json` or `amd-smi default --csv`. +* Default command: -* Support for GPU metrics 1.8. - - Added new fields for `amdsmi_gpu_xcp_metrics_t` including: - - Adding the following metrics to allow new calculations for violation status: - - Per XCP metrics `gfx_below_host_limit_ppt_acc[XCP][MAX_XCC]` - GFX Clock Host limit Package Power Tracking violation counts - - Per XCP metrics `gfx_below_host_limit_thm_acc[XCP][MAX_XCC]` - GFX Clock Host limit Thermal (TVIOL) violation counts - - Per XCP metrics `gfx_low_utilization_acc[XCP][MAX_XCC]` - violation counts for how did low utilization caused the GPU to be below application clocks. - - Per XCP metrics `gfx_below_host_limit_total_acc[XCP][MAX_XCC]`- violation counts for how long GPU was held below application clocks any limiter (see above new violation metrics). - - Increasing available JPEG engines to 40. - Current ASICs may not support all 40. These will be indicated as `UINT16_MAX` or `N/A` in CLI. + A default view has been added. The default view provides a snapshot of commonly requested information such as bdf, current partition mode, version information, and more. 
Users can access that information by simply typing `amd-smi` with no additional commands or arguments. Users may also obtain this information through alternate output formats such as json or csv by using the default command with the respective output format: `amd-smi default --json` or `amd-smi default --csv`. -* Bad page threshold count. +* Support for GPU metrics 1.8: + - Added new fields for `amdsmi_gpu_xcp_metrics_t` including: + - Metrics to allow new calculations for violation status: + - Per XCP metrics `gfx_below_host_limit_ppt_acc[XCP][MAX_XCC]` - GFX Clock Host limit Package Power Tracking violation counts + - Per XCP metrics `gfx_below_host_limit_thm_acc[XCP][MAX_XCC]` - GFX Clock Host limit Thermal (TVIOL) violation counts + - Per XCP metrics `gfx_low_utilization_acc[XCP][MAX_XCC]` - violation counts for how low utilization caused the GPU to be below application clocks. + - Per XCP metrics `gfx_below_host_limit_total_acc[XCP][MAX_XCC]`- violation counts for how long GPU was held below application clocks by any limiter (see above new violation metrics). + - Increased available JPEG engines to 40. Current ASICs may not support all 40. These are indicated as `UINT16_MAX` or `N/A` in CLI. + +* Bad page threshold count. - Added `amdsmi_get_gpu_bad_page_threshold` to Python API and CLI; root/sudo permissions required to display the count. -* CPU model name for RDC. +* CPU model name for RDC. - Added new C and Python API `amdsmi_get_cpu_model_name`. - Not sourced from esmi library. -* Added `amdsmi_get_cpu_affinity_with_scope()`. +* Added `amdsmi_get_cpu_affinity_with_scope()`. * `socket power` to `amdsmi_get_power_info` - Previously the C API had the value in the `amdsmi_power_info` structure, but was unused @@ -53,26 +58,28 @@ for a complete overview of this release. - `AMDSMI_EVT_NOTIF_PROCESS_START` - `AMDSMI_EVT_NOTIF_PROCESS_END` -- Power Cap to `amd-smi monitor`. +- Power cap to `amd-smi monitor`.
- `amd-smi monitor -p` will display the power cap along with power. -### Changed +#### Changed -* Updated `amdsmi_get_clock_info` in `amdsmi_interface.py`. - - The `clk_deep_sleep` field now returns the sleep integer value. +* Separated driver reload functionality from `amdsmi_set_gpu_memory_partition()` and + `amdsmi_set_gpu_memory_partition_mode()` APIs -- and from the CLI `amd-smi set -M `. -* Updated `amdsmi_get_gpu_asic_info` in `amdsmi.h`. - - Added `subsystem_id` structure member. +* Disabled `amd-smi monitor --violation` on guest. Modified `amd-smi metric --throttle` to alias to `amd-smi metric --violation`. -* The `amd-smi topology` command has been enabled for Guest environments. - - `amd-smi topology` is now available in Guest environments. This includes full functionality so users can use the command just as they would in Bare Metal environments. +* Updated `amdsmi_get_clock_info` in `amdsmi_interface.py`. + - The `clk_deep_sleep` field now returns the sleep integer value. -* Expanded Violation Status tracking for GPU metrics 1.8. - - The driver will no longer be supporting existing single-value GFX Clk Below Host Limit fields (`acc_gfx_clk_below_host_limit`, `per_gfx_clk_below_host_limit`, `active_gfx_clk_below_host_limit`), they are now changed in favor of new per-XCP/XCC arrays. +* The `amd-smi topology` command has been enabled for guest environments. + - This includes full functionality so users can use the command just as they would in bare metal environments. + +* Expanded violation status tracking for GPU metrics 1.8. + - The driver will no longer be supporting existing single-value GFX clock below host limit fields (`acc_gfx_clk_below_host_limit`, `per_gfx_clk_below_host_limit`, `active_gfx_clk_below_host_limit`), they are now changed in favor of new per-XCP/XCC arrays. 
- Added new fields to `amdsmi_violation_status_t` and related interfaces for enhanced violation breakdown: - Per-XCP/XCC accumulators and status for: - - GFX Clock Below Host Limit (Power, Thermal, and Total) - - Low Utilization + - GFX clock below host limit (power, thermal, and total) + - Low utilization - Added 2D arrays to track per-XCP/XCC accumulators, percentage, and active status: - `acc_gfx_clk_below_host_limit_pwr`, `acc_gfx_clk_below_host_limit_thm`, `acc_gfx_clk_below_host_limit_total` - `per_gfx_clk_below_host_limit_pwr`, `per_gfx_clk_below_host_limit_thm`, `per_gfx_clk_below_host_limit_total` @@ -85,41 +92,34 @@ for a complete overview of this release. - `amdsmi_dpm_policy_entry_t` member `policy_description` changed from `AMDSMI_MAX_NAME` to `AMDSMI_MAX_STRING_LENGTH`. - `amdsmi_name_value_t` member `name` changed from `AMDSMI_MAX_NAME` to `AMDSMI_MAX_STRING_LENGTH`. -* Updated `amdsmi_get_clock_info` in `amdsmi_interface.py`. - - The `clk_deep_sleep` field now returns the sleep integer value. +* Updated `amdsmi_get_clock_info` in `amdsmi_interface.py`. + - The `clk_deep_sleep` field now returns the sleep integer value. -* Updated `amdsmi_bdf_t` in `amdsmi.h`. - - The `amdsmi_bdf_t` union was changed to have an identical unnamed struct for backwards compatiblity +* For backwards compatibility, updated `amdsmi_bdf_t` union to have an identical unnamed struct. -### Removed +#### Removed -- Removed unnecessary API, `amdsmi_free_name_value_pairs(),` from amdsmi.h - - This API is only used internally to free up memory from the python interface and does not need to be - exposed to the User. +- Removed unnecessary API, `amdsmi_free_name_value_pairs()` + - This API is only used internally to free up memory from the Python interface and does not need to be + exposed to the user. 
-- Removed unused definitions: - - `AMDSMI_MAX_NAME` - - `AMDSMI_256_LENGTH` - - `AMDSMI_MAX_DATE_LENGTH` - - `MAX_AMDSMI_NAME_LENGTH` - - `AMDSMI_LIB_VERSION_YEAR` - - `AMDSMI_DEFAULT_VARIANT` - - `AMDSMI_MAX_NUM_POWER_PROFILES` - - `AMDSMI_MAX_DRIVER_VERSION_LENGTH` +- Removed unused definitions: + - `AMDSMI_MAX_NAME`, `AMDSMI_256_LENGTH`, `AMDSMI_MAX_DATE_LENGTH`, `MAX_AMDSMI_NAME_LENGTH`, `AMDSMI_LIB_VERSION_YEAR`, + `AMDSMI_DEFAULT_VARIANT`, `AMDSMI_MAX_NUM_POWER_PROFILES`, `AMDSMI_MAX_DRIVER_VERSION_LENGTH`. -- Removed unused member `year` in struct `amdsmi_version_t`. +- Removed unused member `year` in struct `amdsmi_version_t`. -- Removed `amdsmi_io_link_type_t` and replaced with `amdsmi_link_type_t`** +- Removed `amdsmi_io_link_type_t` and replaced with `amdsmi_link_type_t`. - `amdsmi_io_link_type_t` is no longer needed as `amdsmi_link_type_t` is sufficient. - `amdsmi_link_type_t` enum has changed. - This change will also affect `amdsmi_link_metrics_t`, where the link_type field changes from `amdsmi_io_link_type_t` to `amdsmi_link_type_t`. -- Removed `amdsmi_get_power_info_v2()`. - - The ``amdsmi_get_power_info()`` has been unified and the v2 function is no longer needed/used. +- Removed `amdsmi_get_power_info_v2()`. + - The ``amdsmi_get_power_info()`` has been unified and the v2 function is no longer needed or used. -- Removed `AMDSMI_EVT_NOTIF_RING_HANG` event notification type in `amdsmi_evt_notification_type_t`. +- Removed `AMDSMI_EVT_NOTIF_RING_HANG` event notification type in `amdsmi_evt_notification_type_t`. -- The `amdsmi_get_gpu_vram_info` now provides vendor names as a string. +- The `amdsmi_get_gpu_vram_info` now provides vendor names as a string. - `amdsmi_vram_vendor_type_t` enum structure is removed. - `amdsmi_vram_info_t` member named `amdsmi_vram_vendor_type_t` is changed to a character string. - `amdsmi_get_gpu_vram_info` now no longer requires decoding the vendor name as an enum. 
@@ -129,24 +129,24 @@ for a complete overview of this release. - Providing both `vcn_activity`/`jpeg_activity` and XCP (partition) stats `vcn_busy`/`jpeg_busy` caused confusion about which field to use. By removing backward compatibility, it is easier to identify the relevant field. - The `jpeg_busy` field increased in size (for supported ASICs), making backward compatibility unable to fully copy the structure into `jpeg_activity`. -### Optimized +#### Optimized -- Reduced ``amd-smi`` CLI API calls needed to be called before reading or (re)setting GPU features. - - Now when users call any amd-smi CLI command, we have reduced the APIs needed to be called. Previously, - when a user would read a GPU's status, (for example) we would poll for other information helpful for our sets/reset - CLI calls. This change will increase overall run-time performance of the CLI tool. +- Reduced ``amd-smi`` CLI API calls needed to be called before reading or (re)setting GPU features. This + improves overall runtime performance of the CLI. -- Removed partition information from the default `amd-smi static` CLI command. - - Users can still retrieve the same data by calling `amd-smi`, `amd-smi static -p`, or `amd-smi partition -c -m`/`sudo amd-smi partition -a`. +- Removed partition information from the default `amd-smi static` CLI command. + - Users can still retrieve the same data by calling `amd-smi`, `amd-smi static -p`, or `amd-smi partition -c -m`/`sudo amd-smi partition -a`. - Reading ``current_compute_partition`` may momentarily wake the GPU up. This is due to reading XCD registers, which is expected behavior. Changing partitions is not a trivial operation, `current_compute_partition` SYSFS controls this action. -- Optimized CLI command `amd-smi topology` in partition mode. - - Reduced the number of `amdsmi_topo_get_p2p_status` API calls to one fourth. +- Optimized CLI command `amd-smi topology` in partition mode. 
+ - Reduced the number of `amdsmi_topo_get_p2p_status` API calls to one fourth. -### Resolved issues +#### Resolved issues - Removed duplicated GPU IDs when receiving events using the `amd-smi event` command. +- Fixed `amd-smi monitor` decoder utilization (`DEC%`) not showing up on MI300 series ASICs. + ```{note} See the full [AMD SMI changelog](https://github.com/ROCm/amdsmi/blob/release/rocm-rel-7.0/CHANGELOG.md) for details, examples, and in-depth descriptions. ``` @@ -155,23 +155,25 @@ See the full [AMD SMI changelog](https://github.com/ROCm/amdsmi/blob/release/roc #### Added -* Added support for `BF16`, `F32`, and `F16` for 2D and 3D NGCHW grouped convolution backward data. -* Added a fully asynchronous HOST (CPU) arguments copy flow for CK grouped GEMM kernels. -* Added support GKCYX layout for grouped convolution forward (NGCHW/GKCYX/NGKHW, number of instances in instance factory for NGCHW/GKYXC/NGKHW has been reduced). -* Added support for GKCYX layout for grouped convolution forward (NGCHW/GKCYX/NGKHW). -* Added support for GKCYX layout for grouped convolution backward weight (NGCHW/GKCYX/NGKHW). -* Added support for GKCYX layout for grouped convolution backward data (NGCHW/GKCYX/NGKHW). -* Added support for Stream-K version of mixed `FP8` / `BF16` GEMM. -* Added support for Multiple D GEMM. -* Added GEMM pipeline for microscaling (MX) `FP8` / `FP6` / `FP4` data types -* Added support for `FP16` 2:4 structured sparsity to universal GEMM. -* Added support for Split K for grouped convolution backward data. -* Added logit soft-capping support for fMHA forward kernels. -* Added support for hdim as a multiple of 32 for FMHA (fwd/fwd_splitkv). -* Added benchmarking support for tile engine GEMM. -* Added Ping-pong scheduler support for GEMM operation along the K dimension. -* Added rotating buffer feature for CK_Tile GEMM. -* Added int8 support for CK_TILE GEMM. +* Support for `BF16`, `F32`, and `F16` for 2D and 3D NGCHW grouped convolution backward data. 
+* Fully asynchronous HOST (CPU) arguments copy flow for CK grouped GEMM kernels. +* Support for GKCYX layout for grouped convolution forward (NGCHW/GKCYX/NGKHW, number of instances in instance factory for NGCHW/GKYXC/NGKHW has been reduced). +* Support for GKCYX layout for grouped convolution forward (NGCHW/GKCYX/NGKHW). +* Support for GKCYX layout for grouped convolution backward weight (NGCHW/GKCYX/NGKHW). +* Support for GKCYX layout for grouped convolution backward data (NGCHW/GKCYX/NGKHW). +* Support for Stream-K version of mixed `FP8` / `BF16` GEMM. +* Support for Multiple D GEMM. +* GEMM pipeline for microscaling (MX) `FP8` / `FP6` / `FP4` data types. +* Support for `FP16` 2:4 structured sparsity to universal GEMM. +* Support for Split K for grouped convolution backward data. +* Logit soft-capping support for fMHA forward kernels. +* Support for hdim as a multiple of 32 for FMHA (fwd/fwd_splitkv). +* Benchmarking support for tile engine GEMM. +* Ping-pong scheduler support for GEMM operation along the K dimension. +* Rotating buffer feature for CK_Tile GEMM. +* `int8` support for CK_TILE GEMM. +* Vectorize Transpose optimization for CK Tile. +* Asynchronous copy for gfx950. #### Changed @@ -187,9 +189,7 @@ See the full [AMD SMI changelog](https://github.com/ROCm/amdsmi/blob/release/roc #### Optimized -* Optimize the GEMM multiply preshuffle and lds bypass with Pack of KGroup and better instruction layout. -* Added Vectorize Transpose optimization for CK Tile. -* Added the asynchronous copy for gfx950. +* Optimized the GEMM multiply preshuffle and lds bypass with Pack of KGroup and better instruction layout. ### **HIP** 7.0.0 @@ -201,7 +201,8 @@ See the full [AMD SMI changelog](https://github.com/ROCm/amdsmi/blob/release/roc - `hipDrvLaunchKernelEx` dispatches the device kernel represented by a HIP function object. - `hipMemGetHandleForAddressRange` gets a handle for the address range requested. - `num_threads` Total number of threads in the group.
The legacy API size is alias. - `__reduce_add_sync`, `__reduce_min_sync`, and `__reduce_max_sync` functions added for reduction across lanes of a warp. For details, see [Warp cross-lane functions](https://rocm.docs.amd.com/projects/HIP/en/latest/how-to/hip_cpp_language_extensions.html#warp-cross-lane-functions). + - `__reduce_add_sync`, `__reduce_min_sync`, and `__reduce_max_sync` functions added for arithmetic reduction across lanes of a warp, and `__reduce_and_sync`, `__reduce_or_sync`, and `__reduce_xor_sync` +functions added for logical reduction. For details, see [Warp cross-lane functions](https://rocm.docs.amd.com/projects/HIP/en/latest/how-to/hip_cpp_language_extensions.html#warp-cross-lane-functions). * New support for Open Compute Project (OCP) floating-point `FP4`/`FP6`/`FP8` as the following. For details, see [Low precision floating point document](https://rocm.docs.amd.com/projects/HIP/en/latest/reference/low_fp_types.html). - Data types for `FP4`/`FP6`/`FP8`. - HIP APIs for `FP4`/`FP6`/`FP8`, which are compatible with corresponding CUDA APIs. @@ -218,8 +219,7 @@ See the full [AMD SMI changelog](https://github.com/ROCm/amdsmi/blob/release/roc * A new attribute in HIP runtime was implemented which exposes a new device capability of how many compute dies (chiplets, xcc) are available on a given GPU. Developers can get this attribute via the API `hipDeviceGetAttribute`, to make use of the best cache locality in a kernel, and optimize the Kernel launch grid layout, for performance improvement. #### Changed -* Deprecated GPUs. -Some unsupported GPUs such as gfx9, gfx8 and gfx7 are deprecated on Microsoft Windows. +* Some unsupported GPUs such as gfx9, gfx8 and gfx7 are deprecated on Microsoft Windows. * Removal of Beta warnings in HIP Graph APIs All Beta warnings in usage of HIP Graph APIs are removed, they are now officially and fully supported.
* Behavior changes @@ -390,18 +390,21 @@ HIP runtime has the following functional improvements which improves runtime per #### Added * Stream-K GEMM support has been enabled for the `FP32`, `FP16`, `BF16`, `FP8`, and `BF8` data types on the Instinct MI300A APU. To activate this feature, set the `TENSILE_SOLUTION_SELECTION_METHOD` environment variable to `2`, for example, `export TENSILE_SOLUTION_SELECTION_METHOD=2`. -* Fused Swish/SiLU GEMM in hipBLASLt (enabled by ``HIPBLASLT_EPILOGUE_SWISH_EXT`` and ``HIPBLASLT_EPILOGUE_SWISH_BIAS_EXT``) +* Added fused Swish/SiLU GEMM (enabled by ``HIPBLASLT_EPILOGUE_SWISH_EXT`` and ``HIPBLASLT_EPILOGUE_SWISH_BIAS_EXT``). * Added support for ``HIPBLASLT_EPILOGUE_GELU_AUX_BIAS`` for gfx942. * Added `HIPBLASLT_TUNING_USER_MAX_WORKSPACE` to constrain the maximum workspace size for user offline tuning. * Added ``HIPBLASLT_ORDER_COL16_4R16`` and ``HIPBLASLT_ORDER_COL16_4R8`` to ``hipblasLtOrder_t`` to support `FP16`/`BF16` swizzle GEMM and `FP8` / `BF8` swizzle GEMM respectively. * Added TF32 emulation on gfx950. -* Added support for `FP6`, `BF6`, and `FP4` on gfx950 +* Added support for `FP6`, `BF6`, and `FP4` on gfx950. * Added support for block scaling by setting `HIPBLASLT_MATMUL_DESC_A_SCALE_MODE` and `HIPBLASLT_MATMUL_DESC_B_SCALE_MODE` to `HIPBLASLT_MATMUL_MATRIX_SCALE_VEC32_UE8M0`. #### Changed -* ``HIPBLASLT_MATMUL_DESC_A_SCALE_POINTER_VEC_EXT`` and ``HIPBLASLT_MATMUL_DESC_B_SCALE_POINTER_VEC_EXT`` are removed. Use the ``HIPBLASLT_MATMUL_DESC_A_SCALE_MODE`` and ``HIPBLASLT_MATMUL_DESC_B_SCALE_MODE`` attributes to set scalar (``HIPBLASLT_MATMUL_MATRIX_SCALE_SCALAR_32F``) or vector (``HIPBLASLT_MATMUL_MATRIX_SCALE_OUTER_VEC_32F``) attributes. * The non-V2 APIs (``GemmPreference``, ``GemmProblemType``, ``GemmEpilogue``, ``GemmTuning``, ``GemmInputs``) in the cpp header are now the same as the V2 APIs (``GemmPreferenceV2``, ``GemmProblemTypeV2``, ``GemmEpilogueV2``, ``GemmTuningV2``, ``GemmInputsV2``). 
The original non-V2 APIs are removed. + +#### Removed + +* ``HIPBLASLT_MATMUL_DESC_A_SCALE_POINTER_VEC_EXT`` and ``HIPBLASLT_MATMUL_DESC_B_SCALE_POINTER_VEC_EXT`` are removed. Use the ``HIPBLASLT_MATMUL_DESC_A_SCALE_MODE`` and ``HIPBLASLT_MATMUL_DESC_B_SCALE_MODE`` attributes to set scalar (``HIPBLASLT_MATMUL_MATRIX_SCALE_SCALAR_32F``) or vector (``HIPBLASLT_MATMUL_MATRIX_SCALE_OUTER_VEC_32F``) attributes. * The `hipblasltExtAMaxWithScale` API is removed. #### Optimized @@ -486,7 +489,7 @@ HIP runtime has the following functional improvements which improves runtime per * Updated and reorganized documentation for clarity and consistency. -### **HIPIFY** (7.0.0) +### **HIPIFY** (20.0.0) #### Added @@ -523,7 +526,7 @@ HIP runtime has the following functional improvements which improves runtime per * Added compatibility-only functions * csrlsvqr - * hipsolverSpCcsrlsvqr, hipsolverSpZcsrlsvqr + * `hipsolverSpCcsrlsvqr`, `hipsolverSpZcsrlsvqr` #### Resolved issues @@ -555,7 +558,7 @@ HIP runtime has the following functional improvements which improves runtime per #### Known issues -* In `hipsparseSpSM_solve()`, the external buffer is passed as a parameter. This does not match the NVIDIA CUDA cuSPARSE API. This extra external buffer parameter will be removed in a future release. For now, this extra parameter can be ignored and nullptr passed in, because it is unused internally. +* In `hipsparseSpSM_solve()`, the external buffer is passed as a parameter. This does not match the NVIDIA CUDA cuSPARSE API. This extra external buffer parameter will be removed in a future release. For now, this extra parameter can be ignored and nullptr passed in because it is unused internally. ### **hipSPARSELt** (0.2.4) @@ -594,14 +597,14 @@ HIP runtime has the following functional improvements which improves runtime per * Added `hiptensorDestroyPlanPreference` to free all resources related to the provided preference. 
* Added `hiptensorPlanPreferenceSetAttribute` to set attribute of a `hiptensorPlanPreference_t` object. * Added `hiptensorPlanGetAttribute` to retrieve information about an already-created plan. -* Added `hiptensorEstimateWorkspaceSize` to determine the required workspaceSize for the given operation. +* Added `hiptensorEstimateWorkspaceSize` to determine the required workspace size for the given operation. * Added `hiptensorCreatePlan` to allocate a `hiptensorPlan_t` object, select an appropriate kernel for a given operation and prepare a plan that encodes the execution. * Added `hiptensorDestroyPlan` to free all resources related to the provided plan. #### Changed * Removed architecture support for gfx940 and gfx941. -* Generalized opaque buffer now for any descriptor. +* Generalized opaque buffer for any descriptor. * Replaced `hipDataType` with `hiptensorDataType_t` for all supported types, for example, `HIP_R_32F` to `HIPTENSOR_R_32F`. * Replaced `hiptensorComputeType_t` with `hiptensorComputeDescriptor_t` for all supported types. * Replaced `hiptensorInitTensorDescriptor` with `hiptensorCreateTensorDescriptor`. @@ -625,15 +628,16 @@ HIP runtime has the following functional improvements which improves runtime per #### Added -* Added compiler support for separate debug file generation for device code. +* Added the compiler `-gsplit-dwarf` option to enable the generation of separate debug information file at compile time. When used, separate debug information files are generated for host and for each offload architecture. For additional information, see [DebugFission](https://gcc.gnu.org/wiki/DebugFission). * Added `llvm-flang`, AMD's next generation Fortran compiler is a re-implementation of the Fortran frontend that can be found at `llvm/llvm-project/flang` on GitHub. * Added Comgr support for an in-memory virtual file system (VFS) for storing temporary files generated during intermediate compilation steps to improve performance in the device library link step. 
-* Added compiler support of a new target-specific builtin `__builtin_amdgcn_processor_is` for late or deferred queries of the current target processor, and `__builtin_amdgcn_is_invocable` enabling fine-grained target-specific feature availability. +* Added compiler support of a new target-specific builtin `__builtin_amdgcn_processor_is` for late or deferred queries of the current target processor, and `__builtin_amdgcn_is_invocable` to determine the current target processor ability to invoke a particular builtin. * Added HIPIFY support for NVIDIA CUDA 12.9.1 APIs. Added support for all new device and host APIs, including FP4, FP6, and FP128, and support for the corresponding ROCm HIP equivalents. #### Changed * Updated clang/llvm to AMD clang version 20.0.0 (equivalent to LLVM 20.0.0 with additional out-of-tree patches). +* HIPCC Perl scripts (`hipcc.pl` and `hipconfig.pl`) have been removed from this release. #### Optimized @@ -646,88 +650,69 @@ HIP runtime has the following functional improvements which improves runtime per ### **MIGraphX** (2.13.0) -### Added +#### Added -* Support for OCP `FP8` and MX `FP4` data types on AMD Instinct MI350X and MI355X accelerators. -* Support for `BF16` on all hardware. +* Support for OCP `FP8` on AMD Instinct MI350X accelerators. * Support for PyTorch 2.7 via Torch-MIGraphX. -* Contrib Operators for Microsoft ONNX: Attention, RotaryEmbedding, QuickGelu, BiasAdd, BiasSplitGelu, skipLayerNorm. -* TensorFlow Operator: Sigmoid, AddN. -* GroupQuery Attention for LLM support . +* Support for the Microsoft ONNX Contrib Operators (Self) Attention, RotaryEmbedding, QuickGelu, BiasAdd, BiasSplitGelu, SkipLayerNorm. +* Support for Sigmoid and AddN TensorFlow operators. +* Added GroupQuery Attention support for LLMs. * Added support for edge mode in the ONNX Pad operator. -* Support additional types for linear Resize operator. -* Added bitonic topk ONNX operator. -* Added onnx runtime python driver +* Added ONNX runtime Python driver. 
* Added FLUX e2e example. -* Added API to save and load arguments. -* Added quantize_bf16 to C api output. +* Added C++ and Python APIs to save arguments to a graph as a msgpack file, and then read the file back. * Added rocMLIR fusion for kv-cache attention. +* Introduced a check for file-write errors. -### Changed +#### Changed -* Print Kernel/Module in Compile Failure. -* Use hipblaslt instead of rocBLAS for newer GPU asics. -* Normalize standard input shapes for rocBLAS. -* Updated Stable Diffusion example to use torch 6.3. -* Rewrite 1x1 convolutions to gemm. -* Make version header public. -* Represent `BF16::max` by its encoding, rather than the expected value. -* Direct warnings to cout, instead into cerr. -* Use vector instead of `set` for implicit deps. -* Disable layernorm by default. -* Update timing in compile_ops() to use common average +* `quantize_bf16` for quantizing the model to `BF16` has been made visible in the MIGraphX user API. +* Print additional kernel/module information in the event of compile failure. +* Use hipBLASLt instead of rocBLAS on newer GPUs. +* 1x1 convolutions are now rewritten to GEMMs. +* `BF16::max` is now represented by its encoding rather than its expected value. +* Direct warnings now go to `cout` rather than `cerr`. +* `FP8` uses hipBLASLt rather than rocBLAS. +* ONNX models are now topologically sorted when nodes are unordered. +* Improved layout of Graphviz output. +* Enhanced debugging for migraphx-driver: consumed environment variables are printed, timestamps and duration are added to the summary. +* Add a trim size flag to the verify option for migraphx-driver. +* Node names are printed to track parsing within the ONNX graph when using the `MIGRAPHX_TRACE_ONNX_PARSER` flag. +* Update accuracy checker to output test data with the `--show-test-data` flag. +* The `MIGRAPHX_TRACE_BENCHMARKING` option now allows the problem cache file to be updated after finding the best solution.
-### Removed +#### Removed -* DPP for v_add_f64 as it is unsupported. -* rocBLAS bug workaround for solution index. -* ROCM_USE_FLOAT8 macro. -* rocBLAS `FP8`, always use hipBlasLt. -* Call to hipGetMemoryInfo when checking free memory based on feedback from HIP team. +* `ROCM_USE_FLOAT8` macro. +* The BF16 GEMM test was removed for Navi21, as it is unsupported by rocBLAS and hipBLASLt on that platform. -### Optimized +#### Optimized -* Layout convolution as NHWC or NCHW only -* einsum: conditionally do squeeze before transpose -* Update problem cache as configs are benchmarked -* Enable debug assertions in libstdc++ -* Topologically sort onnx models if nodes are unordered -* Use time_loop function to measure time for exhaustive tune runs +* Use common average in `compile_ops` to reduce run-to-run variations when tuning. +* Improved the performance of the TopK operator. +* Conform to a single layout (NHWC or NCHW) during compilation rather than combining two. * Slice Channels Conv Optimization (slice output fusion) -* Horiz fuse after pointwise -* GridSample Linear Sampler Refactor -* find_splits::is_dependent refactor -* Visually improved the output from Graphviz -* Print MigraphX consumed Env Variables when using the migraphx-driver -* Add timestamps and duration when printing the summary of migraphx-driver -* Add a trim size flag to the verify option for migraphx-driver -* Print node names, to track parsing within the onnx graph when using the MIGRAPHX_TRACE_ONNX_PARSER flag -* Embed onnx/tf files for api tests -* Fuse multiple outputs for pointwise ops -* Fuse reshapes on pointwise inputs for mlir output fusion -* Print MIGRAPHX ENV Variables at end of summary -* Update accuracy checker to spit out test data with --show-test-data flag -* Dont fold mul with gemm when the gemm is used more than once -* Detect when parallel stl is not parallel and enable when it is in parallel -* Dont fuse broadcast after conv/gemm in mlir -* Avoid the fusion (in reduction) when 
operator data-types mismatch +* Horizontal fusion optimization after pointwise operations. +* Reduced the number of literals used in `GridSample` linear sampler. +* Fuse multiple outputs for pointwise operations. +* Fuse reshapes on pointwise inputs for MLIR output fusion. +* MUL operation not folded into the GEMM when the GEMM is used more than once. +* Broadcast not fused after convolution or GEMM MLIR kernels. +* Avoid reduction fusion when operator data-types mismatch. -### Resolved issues +#### Resolved issues -* Workaround ICE in clang 20 when using views::transform. -* Fix bug with reshape_lazy in MLIR. -* Quantizelinear nearbyint fix. -* Add case for empty strings in node inputs for ops like resize. -* Parse resize fix: only check "keep_aspect_ratio_policy" attribute for sizes input. -* Fix Layernorm and SimplifiedLayernorm onnx parsers. -* nonmaxsuppression: identical boxes/scores not ordered correctly. -* Gcc/G++ compilation fix. -* Bug fix: events would get created on the wrong device in a multi-gpu scenario. -* Check for file-write errors. -* Fix out of order keys in value for comparisons and hashes when caching best kernels. -* Make checking env variables thread-safe again. -* [controlnet] Fixed mul: Types do not match. -* Fix check for scales if presenting roi in Resize op. +* Compilation workaround ICE in clang 20 when using `views::transform`. +* Fix bug with `reshape_lazy` in MLIR. +* Quantizelinear fixed for Nearbyint operation. +* Check for empty strings in ONNX node inputs for operations like Resize. +* Parse Resize fix: only check `keep_aspect_ratio_policy` attribute for sizes input. +* Nonmaxsuppression: fixed issue where identical boxes/scores not ordered correctly. +* Fixed a bug where events were created on the wrong device in a multi-gpu scenario. +* Fixed out of order keys in value for comparisons and hashes when caching best kernels. +* Fixed Controlnet MUL types do not match error. 
+* Fixed check for scales if ROI input is present in Resize operation. +* Einsum: Fixed a crash on empty squeeze operations. ### **MIOpen** (3.5.0) @@ -748,9 +733,9 @@ HIP runtime has the following functional improvements which improves runtime per #### Optimized -* [BatchNorm] Optimized NHWC OpenCL kernels and improved heuristics +* [BatchNorm] Optimized NHWC OpenCL kernels and improved heuristics. * [RNN] Dynamic algorithm optimization. -* [Conv] Eliminated redundant clearing of output buffers +* [Conv] Eliminated redundant clearing of output buffers. * [RNN] Updated selection heuristics. * Updated tuning for the AMD Instinct MI300 series. @@ -792,10 +777,10 @@ HIP runtime has the following functional improvements which improves runtime per * Set a default of 112 channels for a single node with `8 * gfx950`. * Enabled LL128 protocol on the gfx950. * Added the ability to choose the unroll factor at runtime using `RCCL_UNROLL_FACTOR`. This can be set at runtime to 1, 2, or 4. This change currently increases compilation and linking time because it triples the number of kernels generated. -* Added MSCCL support for AllGather multinode gfx942/gfx950 (for instance, 16 and 32 GPUs). To enable this feature, set the environment variable `RCCL_MSCCL_FORCE_ENABLE=1`. The maximum message size for MSCCL AllGather usage is `12292 * sizeof(datatype) * nGPUs`. -* Thread thresholds for LL/LL128 are selected in Tuning Models for the AMD Instinct MI300X. This impacts the number of channels used for AG and RS. The channel tuning model is bypassed if `NCCL_THREAD_THRESHOLDS`, `NCCL_MIN_NCHANNELS`, or `NCCL_MAX_NCHANNELS` are set. +* Added MSCCL support for AllGather multinode on the gfx942 and gfx950 (for instance, 16 and 32 GPUs). To enable this feature, set the environment variable `RCCL_MSCCL_FORCE_ENABLE=1`. The maximum message size for MSCCL AllGather usage is `12292 * sizeof(datatype) * nGPUs`. 
+* Thread thresholds for LL/LL128 are selected in Tuning Models for the AMD Instinct MI300X. This impacts the number of channels used for AllGather and ReduceScatter. The channel tuning model is bypassed if `NCCL_THREAD_THRESHOLDS`, `NCCL_MIN_NCHANNELS`, or `NCCL_MAX_NCHANNELS` are set. * Multi-node tuning for AllGather, AllReduce, and ReduceScatter that leverages LL/LL64/LL128 protocols to use nontemporal vector load/store for tunable message size ranges. -* LL/LL128 usage ranges for AR, AG, and RS are part of the tuning models, which enable architecture-specific tuning in conjunction with the existing Rome Models scheme in RCCL. +* LL/LL128 usage ranges for AllReduce, AllGather, and ReduceScatter are part of the tuning models, which enable architecture-specific tuning in conjunction with the existing Rome Models scheme in RCCL. * Two new APIs are exposed as part of an initiative to separate RCCL code. These APIs are `rcclGetAlgoInfo` and `rcclFuncMaxSendRecvCount`. However, user-level invocation requires that RCCL be built with `RCCL_EXPOSE_STATIC` enabled. #### Changed @@ -810,7 +795,7 @@ HIP runtime has the following functional improvements which improves runtime per * Resolved an issue when using more than 64 channels when multiple collectives are used in the same `ncclGroup()` call. * Fixed unit test failures in tests ending with the `ManagedMem` and `ManagedMemGraph` suffixes. * Fixed a suboptimal algorithmic switching point for AllReduce on the AMD Instinct MI300X. -* Fixed the known issue "When splitting a communicator using `ncclCommSplit` in some GPU configurations, MSCCL initialization can cause a segmentation fault" with a design change to use `comm` instead of `rank` for `mscclStatus`. The Global map for `comm` to `mscclStatus` is still not thread safe but should be explicitly handled by mutexes for read-write operations. This is tested for correctness, but there is a plan to use a thread-safe map data structure in an upcoming release. 
+* Fixed the known issue "When splitting a communicator using `ncclCommSplit` in some GPU configurations, MSCCL initialization can cause a segmentation fault" with a design change to use `comm` instead of `rank` for `mscclStatus`. The global map for `comm` to `mscclStatus` is still not thread safe but should be explicitly handled by mutexes for read-write operations. This is tested for correctness, but there is a plan to use a thread-safe map data structure in an upcoming release. ### **rocAL** (2.3.0) @@ -858,23 +843,14 @@ HIP runtime has the following functional improvements which improves runtime per * gfx950 support. * Internal API logging for `gemm` debugging using `ROCBLAS_LAYER = 8`. -* Support for AOCL 5.0 gcc build as a client reference library. -* Allowing the use of `PkgConfig` for client reference library fallback detection. +* Support for the AOCL 5.0 gcc build as a client reference library. +* The use of `PkgConfig` for client reference library fallback detection. #### Changed * `CMAKE_CXX_COMPILER` is now passed on during compilation for a Tensile build. * The default atomics mode is changed from `allowed` to `not allowed`. -#### Optimized - -* Optimized `gemm` by using `gemv` kernels when applicable. -* Optimized `gemv` for small `m` and `n` with a large batch count on gfx942. -* Improved the performance of Level 1 `dot` for all precisions and variants when `N > 100000000` on gfx942. -* Improved the performance of Level 1 `asum` and `nrm2` for all precisions and variants on gfx942. -* Improved the performance of Level 2 `sger` (single precision) on gfx942. -* Improved the performance of Level 3 `dgmm` for all precisions and variants on gfx942. - #### Removed * Support code for non-production gfx targets. @@ -884,6 +860,15 @@ HIP runtime has the following functional improvements which improves runtime per * `rocblas_float8.h` and `rocblas_hip_f8_impl.h` files. 
* `rocblas_gemm_ex3`, `rocblas_gemm_batched_ex3`, and `rocblas_gemm_strided_batched_ex3` API functions. +#### Optimized + +* Optimized `gemm` by using `gemv` kernels when applicable. +* Optimized `gemv` for small `m` and `n` with a large batch count on gfx942. +* Improved the performance of Level 1 `dot` for all precisions and variants when `N > 100000000` on gfx942. +* Improved the performance of Level 1 `asum` and `nrm2` for all precisions and variants on gfx942. +* Improved the performance of Level 2 `sger` (single precision) on gfx942. +* Improved the performance of Level 3 `dgmm` for all precisions and variants on gfx942. + #### Resolved issues * Fixed environment variable path-based logging to append multiple handle outputs to the same file. @@ -969,6 +954,16 @@ HIP runtime has the following functional improvements which improves runtime per * Fixed kernel faults on multi-device transforms that gather to a single device, when the input/output bricks are not contiguous. +### **ROCgdb** (16.3) + +#### Added + +- Support for the `gfx950`, `gfx1150`, and `gfx1151` architectures. + +#### Removed + +- Support for the `gfx940` and `gfx941` architectures. + ### **rocJPEG** (1.1.0) #### Added @@ -990,56 +985,26 @@ HIP runtime has the following functional improvements which improves runtime per ### **ROCm Bandwidth Test** (2.6.0) -### Added +#### Added * Plugin architecture: - * `rocm_bandwidth_test` is now the **framework** for individual `plugins` and features. The `framework` is available at: `/opt/rocm/bin/` + * `rocm_bandwidth_test` is now the `framework` for individual `plugins` and features. 
The `framework` is available at: `/opt/rocm/bin/` - * Individual `plugins`: The **plugins (shared libraries)** are available at: `/opt/rocm/lib/rocm_bandwidth_test/plugins/` + * Individual `plugins`: The `plugins` (shared libraries) are available at: `/opt/rocm/lib/rocm_bandwidth_test/plugins/` ```{note} Review the [README](https://github.com/ROCm/rocm_bandwidth_test/blob/release/rocm-rel-7.0/README.md) file for details about the new options and outputs. ``` -### Changed +#### Changed * The `CLI` and options/parameters have changed due to the new plugin architecture, where the plugin parameters are parsed by the plugin. -### Removed +#### Removed - The old CLI, parameters, and switches used. -### Known Issues - -- MI350: Crashes due to HIP gfx support. - - -### **ROCm SMI** (7.8.0) - -#### Added - -- Support for GPU metrics 1.8. - - Added new fields for `rsmi_gpu_metrics_t` including: - - Adding the following metrics to allow new calculations for violation status: - - Per XCP metrics `gfx_below_host_limit_ppt_acc[XCP][MAX_XCC]` - GFX Clock Host limit Package Power Tracking violation counts - - Per XCP metrics `gfx_below_host_limit_thm_acc[XCP][MAX_XCC]` - GFX Clock Host limit Thermal (TVIOL) violation counts - - Per XCP metrics `gfx_low_utilization_acc[XCP][MAX_XCC]` - violation counts for how did low utilization caused the GPU to be below application clocks. - - Per XCP metrics `gfx_below_host_limit_total_acc[XCP][MAX_XCC]`- violation counts for how long GPU was held below application clocks any limiter (see above new violation metrics). - - Increasing available JPEG engines to 40. - Current ASICs may not support all 40. These will be indicated as UINT16_MAX or N/A in CLI. - -#### Removed - -- Removed backwards compatibility for `rsmi_dev_gpu_metrics_info_get()`'s `jpeg_activity` and `vcn_activity` fields. Alternatively use `xcp_stats.jpeg_busy` and `xcp_stats.vcn_busy`. 
- - Backwards compability is removed for `jpeg_activity` and `vcn_activity` fields, if the `jpeg_busy` or `vcn_busy` field is available. - - Providing both `vcn_activity`/`jpeg_activity` and XCP (partition) stats `vcn_busy`/`jpeg_busy` caused confusion for users about which field to use. By removing backward compatibility, it is easier to identify the relevant field. - - The `jpeg_busy` field increased in size (for supported ASICs), making backward compatibility unable to fully copy the structure into `jpeg_activity`. - -```{note} -See the full [ROCm SMI changelog](https://github.com/ROCm/rocm_smi_lib/blob/release/rocm-rel-7.0/CHANGELOG.md) for details, examples, and in-depth descriptions. -``` - -### **ROCm Compute Profiler** (3.2.1) +### **ROCm Compute Profiler** (3.2.3) #### Added @@ -1134,10 +1099,10 @@ See the full [ROCm SMI changelog](https://github.com/ROCm/rocm_smi_lib/blob/rele * Fixed kernel name and kernel dispatch filtering when using ``rocprofv3``. * Fixed an issue of TCC channel counters collection in ``rocprofv3``. -* Fixed peak FLOPS of `F8`, `I8`, `F16`, and `BF16` on AMD Instinct MI 300. +* Fixed peak FLOPS of `F8`, `I8`, `F16`, and `BF16` on AMD Instinct MI300. * Fixed not detecting memory clock issue when using amd-smi * Fixed standalone GUI crashing -* Fixed L2 read/write/atomic bandwidths on MI350 +* Fixed L2 read/write/atomic bandwidths on AMD Instinct MI350 series. #### Known issues @@ -1164,17 +1129,6 @@ See the full [ROCm SMI changelog](https://github.com/ROCm/rocm_smi_lib/blob/rele * MongoDB database support will be removed, and a deprecation warning has been added to the application interface. * Usage of ``rocm-smi`` is deprecated in favor of ``amd-smi``, and a deprecation warning has been added to the application interface. -### **ROCgdb** (16.3) - -#### Added - -- Support for the `gfx950`, `gfx1150`, and `gfx1151` architectures. - -#### Removed - -- Support for the `gfx940` and `gfx941` architectures. 
- - ### **ROCm Data Center Tool** (1.1.0) #### Added @@ -1195,6 +1149,31 @@ See the full [ROCm SMI changelog](https://github.com/ROCm/rocm_smi_lib/blob/rele - Support and documentation for diagnostic commands and GPU group management. - [RVS](https://rocm.docs.amd.com/projects/ROCmValidationSuite/en/latest/) test integration and reporting. +### **ROCm SMI** (7.8.0) + +#### Added + +- Support for GPU metrics 1.8. + - Added new fields for `rsmi_gpu_metrics_t` including: + - Adding the following metrics to allow new calculations for violation status: + - Per XCP metrics `gfx_below_host_limit_ppt_acc[XCP][MAX_XCC]` - GFX Clock Host limit Package Power Tracking violation counts + - Per XCP metrics `gfx_below_host_limit_thm_acc[XCP][MAX_XCC]` - GFX Clock Host limit Thermal (TVIOL) violation counts + - Per XCP metrics `gfx_low_utilization_acc[XCP][MAX_XCC]` - violation counts for how long low utilization caused the GPU to be below application clocks. + - Per XCP metrics `gfx_below_host_limit_total_acc[XCP][MAX_XCC]`- violation counts for how long GPU was held below application clocks by any limiter (see above new violation metrics). + - Increasing available JPEG engines to 40. + Current ASICs may not support all 40. These will be indicated as UINT16_MAX or N/A in CLI. + +#### Removed + +- Removed backwards compatibility for `rsmi_dev_gpu_metrics_info_get()`'s `jpeg_activity` and `vcn_activity` fields. Alternatively use `xcp_stats.jpeg_busy` and `xcp_stats.vcn_busy`. + - Backwards compatibility is removed for `jpeg_activity` and `vcn_activity` fields, if the `jpeg_busy` or `vcn_busy` field is available. + - Providing both `vcn_activity`/`jpeg_activity` and XCP (partition) stats `vcn_busy`/`jpeg_busy` caused confusion for users about which field to use. By removing backward compatibility, it is easier to identify the relevant field.
+ - The `jpeg_busy` field increased in size (for supported ASICs), making backward compatibility unable to fully copy the structure into `jpeg_activity`. + +```{note} +See the full [ROCm SMI changelog](https://github.com/ROCm/rocm_smi_lib/blob/release/rocm-rel-7.0/CHANGELOG.md) for details, examples, and in-depth descriptions. +``` + ### **ROCm Systems Profiler** (1.1.0) #### Added @@ -1218,14 +1197,14 @@ See the full [ROCm SMI changelog](https://github.com/ROCm/rocm_smi_lib/blob/rele - Fixed incorrect kernel names shown for kernel dispatch tracks in Perfetto. - Fixed formatting of some output logs. -### **ROCmValidationSuite** (1.2.0) +### **ROCm Validation Suite** (1.2.0) #### Added -- Support for new platforms: MI350X and MI355X. +- Support for AMD Instinct MI350X and MI355X accelerators. - Introduced rotating buffer mechanism for GEMM operations. - Support for read and write tests in Babel. -- Support for new platforms: RX9070 and RX9070GRE. +- Support for AMD Radeon RX9070 and RX9070GRE graphics cards. #### Changed @@ -1336,7 +1315,7 @@ The previous default accumulator types could lead to situations in which unexpec ### **ROCprofiler-SDK** (1.0.0) -### Added +#### Added - Support for [rocJPEG](https://rocm.docs.amd.com/projects/rocJPEG/en/latest/index.html) API Tracing. - Support for AMD Instinct MI350X and MI355X accelerators. @@ -1371,7 +1350,7 @@ The previous default accumulator types could lead to situations in which unexpec - Documentation for `rocprofv3` advanced options. - AQLprofile is now available as open source. -### Changed +#### Changed - SDK to NOT to create a background thread when every tool returns a nullptr from `rocprofiler_configure`. - `vaddr-to-file-offset` mapping in `disassembly.hpp` to use the dedicated comgr API. @@ -1381,11 +1360,11 @@ The previous default accumulator types could lead to situations in which unexpec - `rocprofv3` avail tool to be renamed from `rocprofv3_avail` to `rocprofv3-avail` tool. 
- `rocprofv3` tool to facilitate thread trace and PC sampling on the same agent. -#### Removed +##### Removed * Support for compilation of gfx940 and gfx941 targets. -### Resolved issues +#### Resolved issues - Fixed missing callbacks around internal thread creation within counter collection service. - Fixed potential data race in the ROCprofiler-SDK double buffering scheme. @@ -1452,15 +1431,24 @@ The previous default accumulator types could lead to situations in which unexpec * Added the `-e` and `--precise-alu-exceptions` flags to enable precise ALU exceptions reporting on supported configurations. +### **ROCr Runtime** (1.18.0) + +#### Added + +* New API `hsa_amd_memory_get_preferred_copy_engine` to get the preferred copy engine to use when calling `hsa_amd_memory_async_copy_on_engine`. +* New API `hsa_amd_portable_export_dmabuf_v2`, an extension of the existing `hsa_amd_portable_export_dmabuf` API that supports a new flags parameter. This allows specifying the new `HSA_AMD_DMABUF_MAPPING_TYPE_PCIE` flag when exporting dma-bufs. +* New flag `HSA_AMD_VMEM_ADDRESS_NO_REGISTER` for use when calling the `hsa_amd_vmem_address_reserve` API. This allows virtual address range reservations for SVM allocations to be tracked when running in ASAN mode. +* New sub query `HSA_AMD_AGENT_INFO_CLOCK_COUNTERS` returns a snapshot of the underlying driver's clock counters that can be used for profiling. + ### **rocSHMEM** (3.0.0) #### Added -* Added the Reverse Offload conduit. -* Added new APIs: `rocshmem_ctx_barrier`, `rocshmem_ctx_barrier_wave`, `rocshmem_ctx_barrier_wg`, `rocshmem_barrier_all`, `rocshmem_barrier_all_wave`, `rocshmem_barrier_all_wg`, `rocshmem_ctx_sync`, `rocshmem_ctx_sync_wave`, `rocshmem_ctx_sync_wg`, `rocshmem_sync_all`, `rocshmem_sync_all_wave`, `rocshmem_sync_all_wg`, `rocshmem_init_attr`, `rocshmem_get_uniqueid`, and `rocshmem_set_attr_uniqueid_args`. -* Added dlmalloc based allocator. -* Added XNACK support. 
-* Added support for initialization with MPI communicators other than `MPI_COMM_WORLD`. +* Reverse Offload conduit. +* New APIs: `rocshmem_ctx_barrier`, `rocshmem_ctx_barrier_wave`, `rocshmem_ctx_barrier_wg`, `rocshmem_barrier_all`, `rocshmem_barrier_all_wave`, `rocshmem_barrier_all_wg`, `rocshmem_ctx_sync`, `rocshmem_ctx_sync_wave`, `rocshmem_ctx_sync_wg`, `rocshmem_sync_all`, `rocshmem_sync_all_wave`, `rocshmem_sync_all_wg`, `rocshmem_init_attr`, `rocshmem_get_uniqueid`, and `rocshmem_set_attr_uniqueid_args`. +* `dlmalloc` based allocator. +* XNACK support. +* Support for initialization with MPI communicators other than `MPI_COMM_WORLD`. #### Changed @@ -1468,18 +1456,16 @@ The previous default accumulator types could lead to situations in which unexpec #### Resolved issues -* Resolved segfault in `rocshmem_wg_ctx_create`, now provides nullptr if ctx cannot be created. +* Resolved segfault in `rocshmem_wg_ctx_create`, now provides `nullptr` if `ctx` cannot be created. ### **rocSOLVER** (3.30.0) #### Added -* Hybrid computation support for existing routines: - - STEQR +* Hybrid computation support for existing routines: STEQR #### Optimized -* Fixed corner cases that can produce NaNs in SYEVD, for valid input matrices. * Improved the performance of BDSQR and downstream functions, such as GESVD. * Improved the performance of STEQR and downstream functions, such as SYEV/HEEV. * Improved the performance of LARFT and downstream functions, such as GEQR2 and GEQRF. @@ -1492,19 +1478,19 @@ The previous default accumulator types could lead to situations in which unexpec #### Added -* Added the `SpGEAM` generic routine for computing sparse matrix addition in CSR format. -* Added the `v2_SpMV` generic routine for computing sparse matrix vector multiplication. As opposed to the deprecated `rocsparse_spmv` routine, this routine does not use a fallback algorithm if a non-implemented configuration is encountered and will return an error in such a case. 
For the deprecated `rocsparse_spmv` routine, the user can enable warning messages in situations where a fallback algorithm is used by either calling the `rocsparse_enable_debug` routine upfront or exporting the variable `ROCSPARSE_DEBUG` (with the shell command `export ROCSPARSE_DEBUG=1`). -* Added half float mixed precision to `rocsparse_axpby` where X and Y use `float16` and the result and compute type use `float`. -* Added half float mixed precision to `rocsparse_spvv` where X and Y use `float16` and the result and compute type use `float`. -* Added half float mixed precision to `rocsparse_spmv` where A and X use `float16` and Y and the compute type use `float`. -* Added half float mixed precision to `rocsparse_spmm` where A and B use `float16` and C and the compute type use `float`. -* Added half float mixed precision to `rocsparse_sddmm` where A and B use `float16` and C and the compute type use `float`. -* Added half float uniform precision to the `rocsparse_scatter` and `rocsparse_gather` routines. -* Added half float uniform precision to the `rocsparse_sddmm` routine. -* Added the `rocsparse_spmv_alg_csr_rowsplit` algorithm. -* Added support for gfx950. -* Added ROC-TX instrumentation support in rocSPARSE (not available on Windows or in the static library version on Linux). -* Added the `almalinux` operating system name to correct the GFortran dependency. +* The `SpGEAM` generic routine for computing sparse matrix addition in CSR format. +* The `v2_SpMV` generic routine for computing sparse matrix vector multiplication. As opposed to the deprecated `rocsparse_spmv` routine, this routine does not use a fallback algorithm if a non-implemented configuration is encountered and will return an error in such a case. 
For the deprecated `rocsparse_spmv` routine, the user can enable warning messages in situations where a fallback algorithm is used by either calling the `rocsparse_enable_debug` routine upfront or exporting the variable `ROCSPARSE_DEBUG` (with the shell command `export ROCSPARSE_DEBUG=1`). +* Half float mixed precision to `rocsparse_axpby` where X and Y use `float16` and the result and compute type use `float`. +* Half float mixed precision to `rocsparse_spvv` where X and Y use `float16` and the result and compute type use `float`. +* Half float mixed precision to `rocsparse_spmv` where A and X use `float16` and Y and the compute type use `float`. +* Half float mixed precision to `rocsparse_spmm` where A and B use `float16` and C and the compute type use `float`. +* Half float mixed precision to `rocsparse_sddmm` where A and B use `float16` and C and the compute type use `float`. +* Half float uniform precision to the `rocsparse_scatter` and `rocsparse_gather` routines. +* Half float uniform precision to the `rocsparse_sddmm` routine. +* The `rocsparse_spmv_alg_csr_rowsplit` algorithm. +* Support for gfx950. +* ROC-TX instrumentation support in rocSPARSE (not available on Windows or in the static library version on Linux). +* The `almalinux` operating system name to correct the GFortran dependency. #### Changed @@ -1540,12 +1526,6 @@ The previous default accumulator types could lead to situations in which unexpec ### **rocThrust** (4.0.0) -#### Changed - -* Updated the required version of Google Benchmark from 1.8.0 to 1.9.0. -* Renamed `cpp14_required.h` to `cpp_version_check.h`. -* Refactored `test_header.hpp` into `test_param_fixtures.hpp`, `test_real_assertions.hpp`, `test_imag_assertions.hpp`, and `test_utils.hpp`. This is done to prevent unit tests from having access to modules that they're not testing. This will improve the accuracy of code coverage reports. 
- #### Added * Additional unit tests for: binary_search, complex, c99math, catrig, ccosh, cexp, clog, csin, csqrt, and ctan. @@ -1556,6 +1536,12 @@ The previous default accumulator types could lead to situations in which unexpec * Added gfx950 support. * Merged changes from upstream CCCL/thrust 2.6.0. +#### Changed + +* Updated the required version of Google Benchmark from 1.8.0 to 1.9.0. +* Renamed `cpp14_required.h` to `cpp_version_check.h`. +* Refactored `test_header.hpp` into `test_param_fixtures.hpp`, `test_real_assertions.hpp`, `test_imag_assertions.hpp`, and `test_utils.hpp`. This is done to prevent unit tests from having access to modules that they're not testing. This will improve the accuracy of code coverage reports. + #### Removed * `device_malloc_allocator.h` has been removed. This header file was unused and should not impact users. @@ -1563,10 +1549,6 @@ The previous default accumulator types could lead to situations in which unexpec * `test_header.hpp` has been removed. The `HIP_CHECK` function, as well as the `test` and `inter_run_bwr` namespaces, have been moved to `test_utils.hpp`. * `test_assertions.hpp` has been split into `test_real_assertions.hpp` and `test_imag_assertions.hpp`. -#### Upcoming changes - -* `thrust::device_malloc_allocator` is deprecated as of this version. It will be removed in an upcoming version. - #### Resolved issues * Fixed an issue with internal calls to unqualified `distance()` which would be ambigious due to also visibile implementation through ADL. @@ -1575,40 +1557,44 @@ The previous default accumulator types could lead to situations in which unexpec * The order of the values being compared by `thrust::exclusive_scan_by_key` and `thrust::inclusive_scan_by_key` can change between runs when integers are being compared. This can cause incorrect output when a non-commutative operator such as division is being used. +#### Upcoming changes + +* `thrust::device_malloc_allocator` is deprecated as of this version. 
It will be removed in an upcoming version. + ### **rocWMMA** (2.0.0) #### Added * Added internal register layout transforms to support interleaved MMA layouts. * Added support for the gfx950 target. -* Added mixed input `BF8` / `FP8` types for MMA support. -* Added fragment scheduler API objects to embed thread block cooperation properties in fragments +* Added mixed input `BF8`/`FP8` types for MMA support. +* Added fragment scheduler API objects to embed thread block cooperation properties in fragments. #### Changed -* Augmented load / store / MMA internals with static loop unrolling -* rocWMMA mma_sync API now supports `wave tile` fragment sizes -* rocWMMA cooperative fragments are now expressed with fragment scheduler template arguments -* rocWMMA cooperative fragments now use the same base API as non-cooperative fragments -* rocWMMA cooperative fragments register usage footprint has been reduced -* rocWMMA fragments now support partial tile sizes with padding - -#### Optimized - -* Added internal flow control barriers to improve assembly code generation and overall performance -* Enabled interleaved layouts by default in MMA to improve overall performance +* Augmented load/store/MMA internals with static loop unrolling. +* Updated linkage of `rocwmma::synchronize_workgroup` to inline. +* rocWMMA `mma_sync` API now supports `wave tile` fragment sizes. +* rocWMMA cooperative fragments are now expressed with fragment scheduler template arguments. +* rocWMMA cooperative fragments now use the same base API as non-cooperative fragments. +* rocWMMA cooperative fragments register usage footprint has been reduced. +* rocWMMA fragments now support partial tile sizes with padding. #### Removed -* Removed support for the gfx940 and gfx941 targets -* Removed the rocWMMA cooperative API -* Removed wave count template parameters from transforms APIs +* Removed support for the gfx940 and gfx941 targets. +* Removed the rocWMMA cooperative API. 
+* Removed wave count template parameters from transforms APIs. + +#### Optimized + +* Added internal flow control barriers to improve assembly code generation and overall performance. +* Enabled interleaved layouts by default in MMA to improve overall performance. #### Resolved issues -* Fixed a validation issue for small precision compute types `< B32` on gfx9 -* Fixed CMake validation of compiler support for `BF8` / `FP8` types -* Fixed linkage of rocwmma::synchronize_workgroup to inline +* Fixed a validation issue for small precision compute types `< B32` on gfx9. +* Fixed CMake validation of compiler support for `BF8`/`FP8` types. ### **RPP** (2.0.0) @@ -1642,15 +1628,6 @@ The previous default accumulator types could lead to situations in which unexpec * Test package - debian packages will install required dependencies. -### **ROCr Runtime** (1.18.0) - -#### Added - -* New API `hsa_amd_memory_get_preferred_copy_engine` to get preferred copy engine that can be used to when calling `hsa_amd_memory_async_copy_on_engine`. -* New API `hsa_amd_portable_export_dmabuf_v2` extension of existing `hsa_amd_portable_export_dmabuf` API to support new flags parameter. This allows specifying the new `HSA_AMD_DMABUF_MAPPING_TYPE_PCIE` flag when exporting dma-bufs. -* New flag `HSA_AMD_VMEM_ADDRESS_NO_REGISTER` adds support for new `HSA_AMD_VMEM_ADDRESS_NO_REGISTER` when calling `hsa_amd_vmem_address_reserve` API. This allows virtual address range reservations for SVM allocations to be tracked when running in ASAN mode. -* New sub query `HSA_AMD_AGENT_INFO_CLOCK_COUNTERS` returns a snapshot of the underlying driver's clock counters that can be used for profiling. - ### **Tensile** (4.44.0) #### Added diff --git a/RELEASE.md b/RELEASE.md index a4e7de681..e3fb4438f 100644 --- a/RELEASE.md +++ b/RELEASE.md @@ -173,7 +173,7 @@ MX-compliant data types bring microscaling support to ROCm. 
For more information * hipBLASLt * MIGraphX (`FP4` only) -The following libraries are updated to support the Open Compute Project (OCP) floating-point `FP8` format on AMD Instinct MI350 series accelerators instead of the NANOO `FP8` format: +The following libraries are updated to support the Open Compute Project (OCP) floating-point `FP8` format on MI350 series accelerators instead of the NANOO `FP8` format: * Composable Kernel * hipBLASLt @@ -211,7 +211,7 @@ See the [rocSHMEM changelog](#rocshmem-3-0-0) for more details. Key enhancements to AMD SMI include the ability to reload the AMD GPU driver from the CLI or API. The `amd-smi` command-line interface gains a new default view, `amd-smi` topology support in guest environments, and performance optimizations. Additionally, AMD SMI library APIs -have been refined for improved usability. See the [AMD SMI changelog](#amdsmi-26-0-0) for more details. +have been refined for improved usability. See the [AMD SMI changelog](#amd-smi-26-0-0) for more details. #### ROCgdb @@ -303,7 +303,7 @@ For more information, see [ROCm Runfile Installer](https://rocm.docs.amd.com/pro ROCm documentation continues to be updated to provide clearer and more comprehensive guidance for a wider variety of user needs and use cases. -* Documentation for [rocCV](https://rocm.docs.amd.com/projects/rocCV/en/latest/index.html), an efficient GPU-accelerated library for image pre- and post-processing, has been added. rocCV is in an early access state, and using it on production workloads is not recommended. +* Documentation for [rocCV](https://advanced-micro-devices-roccv--28.com.readthedocs.build/en/28/), an efficient GPU-accelerated library for image pre- and post-processing, has been added. rocCV is in an early access state, and using it on production workloads is not recommended. * ROCm Math libraries support a wide range of data types, enabling optimized performance across various precision requirements. 
The following Math libraries are now updated with new precision content. For more information, click the Math library’s link: @@ -379,7 +379,7 @@ Click {fab}`github` to go to the component's source code on GitHub. rocAL - 2.2.0 ⇒ 2.3.0 + 2.2.0 ⇒ 2.3.0 @@ -530,12 +530,12 @@ Click {fab}`github` to go to the component's source code on GitHub. Tools System management AMD SMI - 25.5.1 ⇒ 26.0.0 + 25.5.1 ⇒ 26.0.0 ROCm Data Center Tool - 0.3.0 ⇒ 1.1.0 + 0.3.0 ⇒ 1.1.0 @@ -545,7 +545,7 @@ Click {fab}`github` to go to the component's source code on GitHub. ROCm SMI - 7.7.0 ⇒ 7.8.0 + 7.7.0 ⇒ 7.8.0 @@ -1150,7 +1150,7 @@ HIP runtime has the following functional improvements which improves runtime per * Updated and reorganized documentation for clarity and consistency. -### **HIPIFY** (7.0.0) +### **HIPIFY** (20.0.0) #### Added @@ -2348,10 +2348,6 @@ An issue where due to limited support for Sparse API in JAX, some of the functio The following changes to the ROCm software stack are anticipated for future releases. -### AMD SMI migration to AMDGPU driver repository - -In a future release, [AMD SMI](https://github.com/ROCm/amdsmi) will be relocated from the ROCm organization repository to a new AMDTools repository to better align with its system-level functionality. `amd-smi-lib` will no longer be included in the `rocm-developer-tools` meta-package included with your standard ROCm installation. Instead, it will be packaged with the AMDGPU driver installation. 
- ### ROCm SMI deprecation [ROCm SMI](https://github.com/ROCm/rocm_smi_lib) will be phased out in an diff --git a/docs/compatibility/compatibility-matrix-historical-6.0.csv b/docs/compatibility/compatibility-matrix-historical-6.0.csv index dc92e2a42..8078ba0d4 100644 --- a/docs/compatibility/compatibility-matrix-historical-6.0.csv +++ b/docs/compatibility/compatibility-matrix-historical-6.0.csv @@ -7,9 +7,9 @@ ROCm Version,7.0.0,6.4.3,6.4.2,6.4.1,6.4.0,6.3.3,6.3.2,6.3.1,6.3.0,6.2.4,6.2.2,6 ,RHEL 8.10,RHEL 8.10,RHEL 8.10,RHEL 8.10,RHEL 8.10,RHEL 8.10,RHEL 8.10,RHEL 8.10,RHEL 8.10,"RHEL 8.10, 8.9","RHEL 8.10, 8.9","RHEL 8.10, 8.9","RHEL 8.10, 8.9","RHEL 8.9, 8.8","RHEL 8.9, 8.8","RHEL 8.9, 8.8","RHEL 8.9, 8.8","RHEL 8.9, 8.8","RHEL 8.9, 8.8" ,SLES 15 SP7,"SLES 15 SP7, SP6","SLES 15 SP7, SP6",SLES 15 SP6,SLES 15 SP6,"SLES 15 SP6, SP5","SLES 15 SP6, SP5","SLES 15 SP6, SP5","SLES 15 SP6, SP5","SLES 15 SP6, SP5","SLES 15 SP6, SP5","SLES 15 SP6, SP5","SLES 15 SP6, SP5","SLES 15 SP5, SP4","SLES 15 SP5, SP4","SLES 15 SP5, SP4","SLES 15 SP5, SP4","SLES 15 SP5, SP4","SLES 15 SP5, SP4" ,,,,,,,,,,,,,,,CentOS 7.9,CentOS 7.9,CentOS 7.9,CentOS 7.9,CentOS 7.9 - ,"Oracle Linux 10, 9, 8 [#mi300x-past-60]_","Oracle Linux 9, 8 [#mi300x-past-60]_","Oracle Linux 9, 8 [#mi300x-past-60]_","Oracle Linux 9, 8 [#mi300x-past-60]_","Oracle Linux 9, 8 [#mi300x-past-60]_",Oracle Linux 8.10 [#mi300x-past-60]_,Oracle Linux 8.10 [#mi300x-past-60]_,Oracle Linux 8.10 [#mi300x-past-60]_,Oracle Linux 8.10 [#mi300x-past-60]_,Oracle Linux 8.9 [#mi300x-past-60]_,Oracle Linux 8.9 [#mi300x-past-60]_,Oracle Linux 8.9 [#mi300x-past-60]_,Oracle Linux 8.9 [#mi300x-past-60]_,Oracle Linux 8.9 [#mi300x-past-60]_,Oracle Linux 8.9 [#mi300x-past-60]_,Oracle Linux 8.9 [#mi300x-past-60]_,,, - ,Debian 12 [#single-node-past-60]_,Debian 12 [#single-node-past-60]_,Debian 12 [#single-node-past-60]_,Debian 12 [#single-node-past-60]_,Debian 12 [#single-node-past-60]_,Debian 12 [#single-node-past-60]_,Debian 12 
[#single-node-past-60]_,Debian 12 [#single-node-past-60]_,,,,,,,,,,, - ,Azure Linux 3.0 [#mi300x-past-60]_,Azure Linux 3.0 [#mi300x-past-60]_,Azure Linux 3.0 [#mi300x-past-60]_,Azure Linux 3.0 [#mi300x-past-60]_,Azure Linux 3.0 [#mi300x-past-60]_,Azure Linux 3.0 [#mi300x-past-60]_,Azure Linux 3.0 [#mi300x-past-60]_,,,,,,,,,,,, + ,"Oracle Linux 10, 9, 8 [#ol-700-mi300x-past-60]_","Oracle Linux 9, 8 [#mi300x-past-60]_","Oracle Linux 9, 8 [#mi300x-past-60]_","Oracle Linux 9, 8 [#mi300x-past-60]_","Oracle Linux 9, 8 [#mi300x-past-60]_",Oracle Linux 8.10 [#mi300x-past-60]_,Oracle Linux 8.10 [#mi300x-past-60]_,Oracle Linux 8.10 [#mi300x-past-60]_,Oracle Linux 8.10 [#mi300x-past-60]_,Oracle Linux 8.9 [#mi300x-past-60]_,Oracle Linux 8.9 [#mi300x-past-60]_,Oracle Linux 8.9 [#mi300x-past-60]_,Oracle Linux 8.9 [#mi300x-past-60]_,Oracle Linux 8.9 [#mi300x-past-60]_,Oracle Linux 8.9 [#mi300x-past-60]_,Oracle Linux 8.9 [#mi300x-past-60]_,,, + ,Debian 12,Debian 12 [#single-node-past-60]_,Debian 12 [#single-node-past-60]_,Debian 12 [#single-node-past-60]_,Debian 12 [#single-node-past-60]_,Debian 12 [#single-node-past-60]_,Debian 12 [#single-node-past-60]_,Debian 12 [#single-node-past-60]_,,,,,,,,,,, + ,Azure Linux 3.0 [#az-mi300x-past-60]_,Azure Linux 3.0 [#mi300x-past-60]_,Azure Linux 3.0 [#mi300x-past-60]_,Azure Linux 3.0 [#mi300x-past-60]_,Azure Linux 3.0 [#mi300x-past-60]_,Azure Linux 3.0 [#mi300x-past-60]_,Azure Linux 3.0 [#mi300x-past-60]_,,,,,,,,,,,, ,Rocky 9,,,,,,,,,,,,,,,,,, ,.. _architecture-support-compatibility-matrix-past-60:,,,,,,,,,,,,,,,,,, :doc:`Architecture `,CDNA4,,,,,,,,,,,,,,,,,, @@ -47,8 +47,8 @@ ROCm Version,7.0.0,6.4.3,6.4.2,6.4.1,6.4.0,6.3.3,6.3.2,6.3.1,6.3.0,6.2.4,6.2.2,6 `UCX `_,>=1.15.0,>=1.15.0,>=1.15.0,>=1.15.0,>=1.15.0,>=1.15.0,>=1.15.0,>=1.15.0,>=1.15.0,>=1.15.0,>=1.15.0,>=1.15.0,>=1.15.0,>=1.14.1,>=1.14.1,>=1.14.1,>=1.14.1,>=1.14.1,>=1.14.1 ,,,,,,,,,,,,,,,,,,, THIRD PARTY ALGORITHM,.. 
_thirdpartyalgorithm-support-compatibility-matrix-past-60:,,,,,,,,,,,,,,,,,, - Thrust,2.5.0,2.5.0,2.5.0,2.5.0,2.5.0,2.3.2,2.3.2,2.3.2,2.3.2,2.2.0,2.2.0,2.2.0,2.2.0,2.1.0,2.1.0,2.1.0,2.1.0,2.0.1,2.0.1 - CUB,2.5.0,2.5.0,2.5.0,2.5.0,2.5.0,2.3.2,2.3.2,2.3.2,2.3.2,2.2.0,2.2.0,2.2.0,2.2.0,2.1.0,2.1.0,2.1.0,2.1.0,2.0.1,2.0.1 + Thrust,2.6.0,2.5.0,2.5.0,2.5.0,2.5.0,2.3.2,2.3.2,2.3.2,2.3.2,2.2.0,2.2.0,2.2.0,2.2.0,2.1.0,2.1.0,2.1.0,2.1.0,2.0.1,2.0.1 + CUB,2.6.0,2.5.0,2.5.0,2.5.0,2.5.0,2.3.2,2.3.2,2.3.2,2.3.2,2.2.0,2.2.0,2.2.0,2.2.0,2.1.0,2.1.0,2.1.0,2.1.0,2.0.1,2.0.1 ,,,,,,,,,,,,,,,,,,, KMD & USER SPACE [#kfd_support-past-60]_,.. _kfd-userspace-support-compatibility-matrix-past-60:,,,,,,,,,,,,,,,,,, :doc:`KMD versions `,"7.0.x, 6.4.x, 6.3.x, 6.2.x","6.4.x, 6.3.x, 6.2.x, 6.1.x","6.4.x, 6.3.x, 6.2.x, 6.1.x","6.4.x, 6.3.x, 6.2.x, 6.1.x","6.4.x, 6.3.x, 6.2.x, 6.1.x","6.4.x, 6.3.x, 6.2.x, 6.1.x","6.4.x, 6.3.x, 6.2.x, 6.1.x","6.4.x, 6.3.x, 6.2.x, 6.1.x","6.4.x, 6.3.x, 6.2.x, 6.1.x","6.4.x, 6.3.x, 6.2.x, 6.1.x, 6.0.x","6.4.x, 6.3.x, 6.2.x, 6.1.x, 6.0.x","6.4.x, 6.3.x, 6.2.x, 6.1.x, 6.0.x","6.4.x, 6.3.x, 6.2.x, 6.1.x, 6.0.x","6.4.x, 6.3.x, 6.2.x, 6.1.x, 6.0.x, 5.7.x","6.4.x, 6.3.x, 6.2.x, 6.1.x, 6.0.x, 5.7.x","6.4.x, 6.3.x, 6.2.x, 6.1.x, 6.0.x, 5.7.x","6.4.x, 6.3.x, 6.2.x, 6.1.x, 6.0.x, 5.7.x","6.2.x, 6.1.x, 6.0.x, 5.7.x, 5.6.x","6.2.x, 6.1.x, 6.0.x, 5.7.x, 5.6.x" diff --git a/docs/compatibility/compatibility-matrix.rst b/docs/compatibility/compatibility-matrix.rst index e5c89265a..2ddf9cea9 100644 --- a/docs/compatibility/compatibility-matrix.rst +++ b/docs/compatibility/compatibility-matrix.rst @@ -32,9 +32,9 @@ compatibility and system requirements. 
,"RHEL 9.6, 9.4","RHEL 9.6, 9.4","RHEL 9.5, 9.4" ,RHEL 8.10,RHEL 8.10,RHEL 8.10 ,SLES 15 SP7,"SLES 15 SP7, SP6","SLES 15 SP6, SP5" - ,"Oracle Linux 10, 9, 8 [#ol-mi300x]_","Oracle Linux 9, 8 [#ol-mi300x]_",Oracle Linux 8.10 [#ol-mi300x]_ - ,Debian 12 [#single-node]_,Debian 12 [#single-node]_, - ,Azure Linux 3.0 [#az-mi300x]_,Azure Linux 3.0 [#az-mi300x]_, + ,"Oracle Linux 10, 9, 8 [#ol-700-mi300x]_","Oracle Linux 9, 8 [#mi300x]_",Oracle Linux 8.10 [#mi300x]_ + ,Debian 12,Debian 12 [#single-node]_, + ,Azure Linux 3.0 [#az-mi300x]_,Azure Linux 3.0 [#mi300x]_, ,Rocky 9,, ,.. _architecture-support-compatibility-matrix:,, :doc:`Architecture `,CDNA4,, @@ -71,8 +71,8 @@ compatibility and system requirements. `UCX `_,>=1.15.0,>=1.15.0,>=1.15.0 ,,, THIRD PARTY ALGORITHM,.. _thirdpartyalgorithm-support-compatibility-matrix:,, - Thrust,2.5.0,2.5.0,2.3.2 - CUB,2.5.0,2.5.0,2.3.2 + Thrust,2.6.0,2.5.0,2.3.2 + CUB,2.6.0,2.5.0,2.3.2 ,,, KMD & USER SPACE [#kfd_support]_,.. _kfd-userspace-support-compatibility-matrix:,, :doc:`KMD versions `,"7.0.x, 6.4.x, 6.3.x, 6.2.x","6.4.x, 6.3.x, 6.2.x, 6.1.x","6.4.x, 6.3.x, 6.2.x, 6.1.x" @@ -160,9 +160,10 @@ compatibility and system requirements. .. rubric:: Footnotes -.. [#ol-mi300x] Oracle Linux 10 and 9 are supported only on AMD Instinct MI300X, MI350X, and MI355X. Oracle Linux 8 is only supported on AMD Instinct MI300X. +.. [#ol-700-mi300x] **For ROCm 7.0** - Oracle Linux 10 and 9 are supported only on AMD Instinct MI300X, MI350X, and MI355X. Oracle Linux 8 is supported only on AMD Instinct MI300X. +.. [#mi300x] **Prior to ROCm 7.0** - Oracle Linux and Azure Linux are supported only on AMD Instinct MI300X. .. [#single-node] Debian 12 is supported only on AMD Instinct MI300X for single-node functionality. -.. [#az-mi300x] Azure Linux 3.0 is supported only on AMD Instinct MI300X and AMD Radeon PRO V710. +.. [#az-mi300x] **For ROCm 7.0** - Azure Linux 3.0 is supported only on AMD Instinct MI300X and AMD Radeon PRO V710. .. 
[#RDNA-OS] Radeon AI PRO R9700, Radeon RX 9070 XT (gfx1201), Radeon RX 9060 XT (gfx1200), Radeon PRO W7700 (gfx1101), and Radeon RX 7800 XT (gfx1101) are supported only on Ubuntu 24.04.2, Ubuntu 22.04.5, RHEL 9.6, and RHEL 9.4. .. [#7700XT-OS] Radeon RX 7700 XT (gfx1101) is supported only on Ubuntu 24.04.2 and RHEL 9.6. .. [#kfd_support] As of ROCm 6.4.0, forward and backward compatibility between the AMD Kernel-mode GPU Driver (KMD) and its user space software is provided up to a year apart. For earlier ROCm releases, the compatibility is provided for +/- 2 releases. The tested user space versions on this page were accurate as of the time of initial ROCm release. For the most up-to-date information, see the latest version of this information at `User and kernel-space support matrix `_. @@ -181,28 +182,33 @@ Use this lookup table to confirm which operating system and kernel versions are :widths: 40, 20, 30, 20 :stub-columns: 1 - `Ubuntu `_, 24.04.2, "6.8 GA, 6.11 HWE", 2.39 + `Ubuntu `_, 24.04.3, "6.8 [GA], 6.14 [HWE]", 2.39 ,, - `Ubuntu `_, 22.04.5, "5.15 GA, 6.8 HWE", 2.35 + `Ubuntu `_, 24.04.2, "6.8 [GA], 6.11 [HWE]", 2.39 ,, - `Red Hat Enterprise Linux (RHEL 9) `_, 9.6, 5.14+, 2.34 + `Ubuntu `_, 22.04.5, "5.15 [GA], 6.8 [HWE]", 2.35 + ,, + `Red Hat Enterprise Linux (RHEL 10) `_, 10, 6.12.0-55, 2.39 + ,, + `Red Hat Enterprise Linux (RHEL 9) `_, 9.6, 5.14.0-570, 2.34 ,9.5, 5.14+, 2.34 - ,9.4, 5.14+, 2.34 - ,9.3, 5.14+, 2.34 + ,9.4, 5.14.0-427, 2.34 ,, - `Red Hat Enterprise Linux (RHEL 8) `_, 8.10, 4.18.0+, 2.28 - ,8.9, 4.18.0, 2.28 + `Red Hat Enterprise Linux (RHEL 8) `_, 8.10, 4.18.0-553, 2.28 ,, - `SUSE Linux Enterprise Server (SLES) `_, 15 SP7, 6.11.0+, 2.38 + `SUSE Linux Enterprise Server (SLES) `_, 15 SP7, 6.40-150700.51, 2.38 ,15 SP6, "6.5.0+, 6.4.0", 2.38 ,15 SP5, 5.14.21, 2.31 ,, - `Oracle Linux `_, 9, 5.15.0 (UEK), 2.35 + `Rocky `_, 9, 5.14.0-570, 2.34 + ,, + `Oracle Linux `_, 10, 6.12.0 (UEK), 2.39 + ,9, 6.12.0 (UEK), 2.34 ,8, 5.15.0 (UEK), 2.28 ,, - 
`Debian `_,12, 6.1, 2.36 + `Debian `_,12, 6.1.0, 2.36 ,, - `Azure Linux `_,3.0, 6.6.60, 2.38 + `Azure Linux `_,3.0, 6.6.92, 2.38 ,, .. note:: @@ -235,8 +241,10 @@ Expand for full historical view of: .. rubric:: Footnotes - .. [#mi300x-past-60] Oracle Linux and Azure Linux are supported only on AMD Instinct MI300X. + .. [#ol-700-mi300x-past-60] **For ROCm 7.0** - Oracle Linux 10 and 9 are supported only on AMD Instinct MI300X, MI350X, and MI355X. Oracle Linux 8 is supported only on AMD Instinct MI300X. + .. [#mi300x-past-60] **Prior to ROCm 7.0** - Oracle Linux and Azure Linux are supported only on AMD Instinct MI300X. .. [#single-node-past-60] Debian 12 is supported only on AMD Instinct MI300X for single-node functionality. + .. [#az-mi300x-past-60] **For ROCm 7.0** - Azure Linux 3.0 is supported only on AMD Instinct MI300X and AMD Radeon PRO V710. .. [#RDNA-OS-past-60] Radeon AI PRO R9700, Radeon RX 9070 XT (gfx1201), Radeon RX 9060 XT (gfx1200), Radeon PRO W7700 (gfx1101), and Radeon RX 7800 XT (gfx1101) are supported only on Ubuntu 24.04.2, Ubuntu 22.04.5, RHEL 9.6, and RHEL 9.4. .. [#7700XT-OS-past-60] Radeon RX 7700 XT (gfx1101) is supported only on Ubuntu 24.04.2 and RHEL 9.6. .. [#mi300_624-past-60] **For ROCm 6.2.4** - MI300X (gfx942) is supported on listed operating systems *except* Ubuntu 22.04.5 [6.8 HWE] and Ubuntu 22.04.4 [6.5 HWE]. 
From a88151f505fc905acd9c0fef8a70c2cb882b39e1 Mon Sep 17 00:00:00 2001 From: srawat <120587655+SwRaw@users.noreply.github.com> Date: Fri, 22 Aug 2025 14:59:59 +0530 Subject: [PATCH 20/58] Update mi355-performance-counters.rst --- docs/conceptual/gpu-arch/mi355-performance-counters.rst | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/conceptual/gpu-arch/mi355-performance-counters.rst b/docs/conceptual/gpu-arch/mi355-performance-counters.rst index f073169cd..faec46606 100644 --- a/docs/conceptual/gpu-arch/mi355-performance-counters.rst +++ b/docs/conceptual/gpu-arch/mi355-performance-counters.rst @@ -26,13 +26,13 @@ Command processor packet processor counters (CPC) - ADC valid chunk is not available when dispatch walking is in progress in the multi-xcc mode. * - CPC_ADC_DISPATCH_ALLOC_DONE - - ADC dispatch allocation done. + - ADC dispatch allocation is done. * - CPC_ADC_VALID_CHUNK_END - ADC crawler's valid chunk end in the multi-xcc mode. * - CPC_SYNC_FIFO_FULL_LEVEL - - Level count SYNC FIFO full last cycles. + - SYNC FIFO full last cycles. * - CPC_SYNC_FIFO_FULL - SYNC FIFO full times. 
From c587d75701d25b17a38a7b83d2940ad4ff026dc7 Mon Sep 17 00:00:00 2001 From: srawat <120587655+SwRaw@users.noreply.github.com> Date: Fri, 22 Aug 2025 19:57:27 +0530 Subject: [PATCH 21/58] listing in TOC --- docs/conceptual/gpu-arch/mi355-performance-counters.rst | 2 +- docs/sphinx/_toc.yml.in | 6 ++++-- 2 files changed, 5 insertions(+), 3 deletions(-) diff --git a/docs/conceptual/gpu-arch/mi355-performance-counters.rst b/docs/conceptual/gpu-arch/mi355-performance-counters.rst index faec46606..0335f08d0 100644 --- a/docs/conceptual/gpu-arch/mi355-performance-counters.rst +++ b/docs/conceptual/gpu-arch/mi355-performance-counters.rst @@ -6,7 +6,7 @@ MI350 and MI355 series performance counters ********************************************** -This topic lists and describes the hardware performance counters and derived metrics available on the AMD Instinct MI355 series GPUs. These counters are available for profiling using `ROCprofiler-SDK `_ and `ROCm Compute Profiler `_. +This topic lists and describes the hardware performance counters and derived metrics available on the AMD Instinct MI350 and MI355 series GPUs. These counters are available for profiling using `ROCprofiler-SDK `_ and `ROCm Compute Profiler `_. The following sections list the performance counters based on the IP blocks. 
diff --git a/docs/sphinx/_toc.yml.in b/docs/sphinx/_toc.yml.in index be53a146f..a3f23da94 100644 --- a/docs/sphinx/_toc.yml.in +++ b/docs/sphinx/_toc.yml.in @@ -125,7 +125,7 @@ subtrees: - file: how-to/setting-cus title: Set the number of CUs - file: how-to/Bar-Memory.rst - title: Troubleshoot BAR access limitation + title: Troubleshoot BAR access limitation - url: https://github.com/amd/rocm-examples title: ROCm examples @@ -145,7 +145,9 @@ subtrees: - url: https://www.amd.com/content/dam/amd/en/documents/instinct-tech-docs/white-papers/amd-cdna-3-white-paper.pdf title: White paper - file: conceptual/gpu-arch/mi300-mi200-performance-counters.rst - title: MI300 and MI200 Performance counter + title: MI300 and MI200 performance counters + - file: conceptual/gpu-arch/mi355-performance-counters.rst + title: MI350 and MI355 series performance counters - file: conceptual/gpu-arch/mi250.md title: MI250 microarchitecture subtrees: From 78c4a4c12a8d0d6dbcff59b31be16413573dfc92 Mon Sep 17 00:00:00 2001 From: Pratik Basyal Date: Fri, 22 Aug 2025 11:30:37 -0400 Subject: [PATCH 22/58] Post RC4 700 RN update [Batch 3] (#520) * Indentation and formatting updated * OS support changes * Historical compatibility updated * Minor update --- RELEASE.md | 4 +++- .../compatibility-matrix-historical-6.0.csv | 2 +- docs/compatibility/compatibility-matrix.rst | 13 +++++++------ 3 files changed, 11 insertions(+), 8 deletions(-) diff --git a/RELEASE.md b/RELEASE.md index e3fb4438f..debe0d914 100644 --- a/RELEASE.md +++ b/RELEASE.md @@ -49,7 +49,7 @@ ROCm 7.0.0 adds support for the following operating systems and kernel versions: * Oracle Linux 10 (kernel: 6.12.0 UEK) * Rocky 9 (kernel: 5.14.0-570) -ROCm 7.0.0 marks the end of support (EoS) for Ubuntu 24.04.2 (kernel: 6.8 [GA], 6.11 [HWE]). +ROCm 7.0.0 marks the end of support (EoS) for Ubuntu 24.04.2 (kernel: 6.8 [GA], 6.11 [HWE]) and SLES 15 SP6. 
For more information about supported operating systems, see [Supported operating systems](https://rocm.docs.amd.com/projects/install-on-linux-internal/en/latest/reference/system-requirements.html#supported-operating-systems) and [install instructions](https://rocm.docs.amd.com/projects/install-on-linux-internal/en/latest/). @@ -112,6 +112,8 @@ ROCm 7.0 enables support for Triton 3.3.0. The Instinct Driver is now distributed separately from the ROCm software stack and is stored under in its own location ``/amdgpu/`` in the package repository at [repo.radeon.com](https://repo.radeon.com/amdgpu/). The first release is designated as Instinct Driver version 30.10. See [ROCm Gets Modular: Meet the Instinct Datacenter GPU Driver](https://rocm.blogs.amd.com/ecosystems-and-partners/instinct-gpu-driver/README.html) for more information. +[AMD SMI](https://github.com/ROCm/amdsmi) continues to stay with the ROCm software stack under the ROCm organization repository. + ### HIP API compatibility improvements To improve code portability between AMD ROCm and other programming models, HIP API has been updated in ROCm 7.0 to simplify cross-platform programming. These changes are incompatible with prior ROCm releases and might require recompiling existing HIP applications for use with ROCm 7.0. For more information, see the [HIP API 7.0 changes](https://rocm.docs.amd.com/projects/HIP/en/docs-develop/hip-7-changes.html) and the [HIP changelog](#hip-7-0-0) below. 
diff --git a/docs/compatibility/compatibility-matrix-historical-6.0.csv b/docs/compatibility/compatibility-matrix-historical-6.0.csv index 8078ba0d4..7d3e9d040 100644 --- a/docs/compatibility/compatibility-matrix-historical-6.0.csv +++ b/docs/compatibility/compatibility-matrix-historical-6.0.csv @@ -9,7 +9,7 @@ ROCm Version,7.0.0,6.4.3,6.4.2,6.4.1,6.4.0,6.3.3,6.3.2,6.3.1,6.3.0,6.2.4,6.2.2,6 ,,,,,,,,,,,,,,,CentOS 7.9,CentOS 7.9,CentOS 7.9,CentOS 7.9,CentOS 7.9 ,"Oracle Linux 10, 9, 8 [#ol-700-mi300x-past-60]_","Oracle Linux 9, 8 [#mi300x-past-60]_","Oracle Linux 9, 8 [#mi300x-past-60]_","Oracle Linux 9, 8 [#mi300x-past-60]_","Oracle Linux 9, 8 [#mi300x-past-60]_",Oracle Linux 8.10 [#mi300x-past-60]_,Oracle Linux 8.10 [#mi300x-past-60]_,Oracle Linux 8.10 [#mi300x-past-60]_,Oracle Linux 8.10 [#mi300x-past-60]_,Oracle Linux 8.9 [#mi300x-past-60]_,Oracle Linux 8.9 [#mi300x-past-60]_,Oracle Linux 8.9 [#mi300x-past-60]_,Oracle Linux 8.9 [#mi300x-past-60]_,Oracle Linux 8.9 [#mi300x-past-60]_,Oracle Linux 8.9 [#mi300x-past-60]_,Oracle Linux 8.9 [#mi300x-past-60]_,,, ,Debian 12,Debian 12 [#single-node-past-60]_,Debian 12 [#single-node-past-60]_,Debian 12 [#single-node-past-60]_,Debian 12 [#single-node-past-60]_,Debian 12 [#single-node-past-60]_,Debian 12 [#single-node-past-60]_,Debian 12 [#single-node-past-60]_,,,,,,,,,,, - ,Azure Linux 3.0 [#az-mi300x-past-60]_,Azure Linux 3.0 [#mi300x-past-60]_,Azure Linux 3.0 [#mi300x-past-60]_,Azure Linux 3.0 [#mi300x-past-60]_,Azure Linux 3.0 [#mi300x-past-60]_,Azure Linux 3.0 [#mi300x-past-60]_,Azure Linux 3.0 [#mi300x-past-60]_,,,,,,,,,,,, + ,Azure Linux 3.0 [#az-mi300x-past-60]_,Azure Linux 3.0 [#az-mi300x-past-60]_,Azure Linux 3.0 [#az-mi300x-past-60]_,Azure Linux 3.0 [#az-mi300x-past-60]_,Azure Linux 3.0 [#az-mi300x-past-60]_,Azure Linux 3.0 [#az-mi300x-630-past-60]_,Azure Linux 3.0 [#az-mi300x-630-past-60]_,,,,,,,,,,,, ,Rocky 9,,,,,,,,,,,,,,,,,, ,.. 
_architecture-support-compatibility-matrix-past-60:,,,,,,,,,,,,,,,,,, :doc:`Architecture `,CDNA4,,,,,,,,,,,,,,,,,, diff --git a/docs/compatibility/compatibility-matrix.rst b/docs/compatibility/compatibility-matrix.rst index 2ddf9cea9..6af5e9d19 100644 --- a/docs/compatibility/compatibility-matrix.rst +++ b/docs/compatibility/compatibility-matrix.rst @@ -32,9 +32,9 @@ compatibility and system requirements. ,"RHEL 9.6, 9.4","RHEL 9.6, 9.4","RHEL 9.5, 9.4" ,RHEL 8.10,RHEL 8.10,RHEL 8.10 ,SLES 15 SP7,"SLES 15 SP7, SP6","SLES 15 SP6, SP5" - ,"Oracle Linux 10, 9, 8 [#ol-700-mi300x]_","Oracle Linux 9, 8 [#mi300x]_",Oracle Linux 8.10 [#mi300x]_ + ,"Oracle Linux 10, 9, 8 [#ol-700-mi300x]_","Oracle Linux 9, 8 [#ol-mi300x]_",Oracle Linux 8.10 [#ol-mi300x]_ ,Debian 12,Debian 12 [#single-node]_, - ,Azure Linux 3.0 [#az-mi300x]_,Azure Linux 3.0 [#mi300x]_, + ,Azure Linux 3.0 [#az-mi300x]_,Azure Linux 3.0 [#az-mi300x]_, ,Rocky 9,, ,.. _architecture-support-compatibility-matrix:,, :doc:`Architecture `,CDNA4,, @@ -161,9 +161,9 @@ compatibility and system requirements. .. rubric:: Footnotes .. [#ol-700-mi300x] **For ROCm 7.0** - Oracle Linux 10 and 9 are supported only on AMD Instinct MI300X, MI350X, and MI355X. Oracle Linux 8 is only supported on AMD Instinct MI300X. -.. [#mi300x] **Prior ROCm 7.0** - Oracle Linux and Azure Linux are supported only on AMD Instinct MI300X. +.. [#ol-mi300x] **Prior ROCm 7.0** - Oracle Linux is supported only on AMD Instinct MI300X. .. [#single-node] Debian 12 is supported only on AMD Instinct MI300X for single-node functionality. -.. [#az-mi300x] **For ROCm 7.0** - Azure Linux 3.0 is supported only on AMD Instinct MI300X and AMD Radeon PRO V710. +.. [#az-mi300x] Starting ROCm 6.4.0, Azure Linux 3.0 is supported only on AMD Instinct MI300X and AMD Radeon PRO V710. .. 
[#RDNA-OS] Radeon AI PRO R9700, Radeon RX 9070 XT (gfx1201), Radeon RX 9060 XT (gfx1200), Radeon PRO W7700 (gfx1101), and Radeon RX 7800 XT (gfx1101) are supported only on Ubuntu 24.04.2, Ubuntu 22.04.5, RHEL 9.6, and RHEL 9.4. .. [#7700XT-OS] Radeon RX 7700 XT (gfx1101) is supported only on Ubuntu 24.04.2 and RHEL 9.6. .. [#kfd_support] As of ROCm 6.4.0, forward and backward compatibility between the AMD Kernel-mode GPU Driver (KMD) and its user space software is provided up to a year apart. For earlier ROCm releases, the compatibility is provided for +/- 2 releases. The tested user space versions on this page were accurate as of the time of initial ROCm release. For the most up-to-date information, see the latest version of this information at `User and kernel-space support matrix `_. @@ -242,9 +242,10 @@ Expand for full historical view of: .. rubric:: Footnotes .. [#ol-700-mi300x-past-60] **For ROCm 7.0** - Oracle Linux 10 and 9 are supported only on AMD Instinct MI300X, MI350X, and MI355X. Oracle Linux 8 is only supported on AMD Instinct MI300X. - .. [#mi300x-past-60] **Prior ROCm 7.0** - Oracle Linux and Azure Linux are supported only on AMD Instinct MI300X. + .. [#mi300x-past-60] **Prior ROCm 7.0** - Oracle Linux is supported only on AMD Instinct MI300X. .. [#single-node-past-60] Debian 12 is supported only on AMD Instinct MI300X for single-node functionality. - .. [#az-mi300x-past-60] **For ROCm 7.0** - Azure Linux 3.0 is supported only on AMD Instinct MI300X and AMD Radeon PRO V710. + .. [#az-mi300x-past-60] Starting ROCm 6.4.0, Azure Linux 3.0 is supported only on AMD Instinct MI300X and AMD Radeon PRO V710. + .. [#az-mi300x-630-past-60] **Prior ROCm 6.4.0** - Azure Linux 3.0 is supported only on AMD Instinct MI300X. .. [#RDNA-OS-past-60] Radeon AI PRO R9700, Radeon RX 9070 XT (gfx1201), Radeon RX 9060 XT (gfx1200), Radeon PRO W7700 (gfx1101), and Radeon RX 7800 XT (gfx1101) are supported only on Ubuntu 24.04.2, Ubuntu 22.04.5, RHEL 9.6, and RHEL 9.4. .. 
[#7700XT-OS-past-60] Radeon RX 7700 XT (gfx1101) is supported only on Ubuntu 24.04.2 and RHEL 9.6. .. [#mi300_624-past-60] **For ROCm 6.2.4** - MI300X (gfx942) is supported on listed operating systems *except* Ubuntu 22.04.5 [6.8 HWE] and Ubuntu 22.04.4 [6.5 HWE]. From e839054e562ed313dc530bdd3b000a28be701ae5 Mon Sep 17 00:00:00 2001 From: srawat <120587655+SwRaw@users.noreply.github.com> Date: Fri, 22 Aug 2025 22:31:49 +0530 Subject: [PATCH 23/58] Update mi355-performance-counters.rst --- .../gpu-arch/mi355-performance-counters.rst | 278 +++++++++--------- 1 file changed, 139 insertions(+), 139 deletions(-) diff --git a/docs/conceptual/gpu-arch/mi355-performance-counters.rst b/docs/conceptual/gpu-arch/mi355-performance-counters.rst index 0335f08d0..dcb181d05 100644 --- a/docs/conceptual/gpu-arch/mi355-performance-counters.rst +++ b/docs/conceptual/gpu-arch/mi355-performance-counters.rst @@ -74,202 +74,202 @@ Shader pipe interpolators (SPI) counters ========================================= .. list-table:: SPI counters - :header-row: 1 + :header-row: 1 - * - Hardware counter - - Definition + * - Hardware counter + - Definition - * - SPI_CS0_WINDOW_VALID - - Clock count enabled by PIPE0 perfcounter_start event. + * - SPI_CS0_WINDOW_VALID + - Clock count enabled by PIPE0 perfcounter_start event. - * - SPI_CS0_BUSY - - Number of clocks with outstanding waves for PIPE0 (SPI or SH). + * - SPI_CS0_BUSY + - Number of clocks with outstanding waves for PIPE0 (SPI or SH). - * - SPI_CS0_NUM_THREADGROUPS - - Number of thread groups launched for PIPE0. + * - SPI_CS0_NUM_THREADGROUPS + - Number of thread groups launched for PIPE0. - * - SPI_CS0_CRAWLER_STALL - - Number of clocks when PIPE0 event or wave order FIFO is full. + * - SPI_CS0_CRAWLER_STALL + - Number of clocks when PIPE0 event or wave order FIFO is full. - * - SPI_CS0_EVENT_WAVE - - Number of PIPE0 events and waves. + * - SPI_CS0_EVENT_WAVE + - Number of PIPE0 events and waves. 
- * - SPI_CS0_WAVE - - Number of PIPE0 waves. + * - SPI_CS0_WAVE + - Number of PIPE0 waves. - * - SPI_CS1_WINDOW_VALID - - Clock count enabled by PIPE1 perfcounter_start event. + * - SPI_CS1_WINDOW_VALID + - Clock count enabled by PIPE1 perfcounter_start event. - * - SPI_CS1_BUSY - - Number of clocks with outstanding waves for PIPE1 (SPI or SH). + * - SPI_CS1_BUSY + - Number of clocks with outstanding waves for PIPE1 (SPI or SH). - * - SPI_CS1_NUM_THREADGROUPS - - Number of thread groups launched for PIPE1. + * - SPI_CS1_NUM_THREADGROUPS + - Number of thread groups launched for PIPE1. - * - SPI_CS1_CRAWLER_STALL - - Number of clocks when PIPE1 event or wave order FIFO is full. + * - SPI_CS1_CRAWLER_STALL + - Number of clocks when PIPE1 event or wave order FIFO is full. - * - SPI_CS1_EVENT_WAVE - - Number of PIPE1 events and waves. + * - SPI_CS1_EVENT_WAVE + - Number of PIPE1 events and waves. - * - SPI_CS1_WAVE - - Number of PIPE1 waves. + * - SPI_CS1_WAVE + - Number of PIPE1 waves. - * - SPI_CS2_WINDOW_VALID - - Clock count enabled by PIPE2 perfcounter_start event. + * - SPI_CS2_WINDOW_VALID + - Clock count enabled by PIPE2 perfcounter_start event. - * - SPI_CS2_BUSY - - Number of clocks with outstanding waves for PIPE2 (SPI or SH). + * - SPI_CS2_BUSY + - Number of clocks with outstanding waves for PIPE2 (SPI or SH). - * - SPI_CS2_NUM_THREADGROUPS - - Number of thread groups launched for PIPE2. + * - SPI_CS2_NUM_THREADGROUPS + - Number of thread groups launched for PIPE2. - * - SPI_CS2_CRAWLER_STALL - - Number of clocks when PIPE2 event or wave order FIFO is full. + * - SPI_CS2_CRAWLER_STALL + - Number of clocks when PIPE2 event or wave order FIFO is full. - * - SPI_CS2_EVENT_WAVE - - Number of PIPE2 events and waves. + * - SPI_CS2_EVENT_WAVE + - Number of PIPE2 events and waves. - * - SPI_CS2_WAVE - - Number of PIPE2 waves. + * - SPI_CS2_WAVE + - Number of PIPE2 waves. - * - SPI_CS3_WINDOW_VALID - - Clock count enabled by PIPE3 perfcounter_start event. 
+ * - SPI_CS3_WINDOW_VALID + - Clock count enabled by PIPE3 perfcounter_start event. - * - SPI_CS3_BUSY - - Number of clocks with outstanding waves for PIPE3 (SPI or SH). + * - SPI_CS3_BUSY + - Number of clocks with outstanding waves for PIPE3 (SPI or SH). - * - SPI_CS3_NUM_THREADGROUPS - - Number of thread groups launched for PIPE3. + * - SPI_CS3_NUM_THREADGROUPS + - Number of thread groups launched for PIPE3. - * - SPI_CS3_CRAWLER_STALL - - Number of clocks when PIPE3 event or wave order FIFO is full. + * - SPI_CS3_CRAWLER_STALL + - Number of clocks when PIPE3 event or wave order FIFO is full. - * - SPI_CS3_EVENT_WAVE - - Number of PIPE3 events and waves. + * - SPI_CS3_EVENT_WAVE + - Number of PIPE3 events and waves. - * - SPI_CS3_WAVE - - Number of PIPE3 waves. + * - SPI_CS3_WAVE + - Number of PIPE3 waves. - * - SPI_CSQ_P0_Q0_OCCUPANCY - - Sum of occupancy info for PIPE0 Queue0. + * - SPI_CSQ_P0_Q0_OCCUPANCY + - Sum of occupancy info for PIPE0 Queue0. - * - SPI_CSQ_P0_Q1_OCCUPANCY - - Sum of occupancy info for PIPE0 Queue1. + * - SPI_CSQ_P0_Q1_OCCUPANCY + - Sum of occupancy info for PIPE0 Queue1. - * - SPI_CSQ_P0_Q2_OCCUPANCY - - Sum of occupancy info for PIPE0 Queue2. + * - SPI_CSQ_P0_Q2_OCCUPANCY + - Sum of occupancy info for PIPE0 Queue2. - * - SPI_CSQ_P0_Q3_OCCUPANCY - - Sum of occupancy info for PIPE0 Queue3. + * - SPI_CSQ_P0_Q3_OCCUPANCY + - Sum of occupancy info for PIPE0 Queue3. - * - SPI_CSQ_P0_Q4_OCCUPANCY - - Sum of occupancy info for PIPE0 Queue4. + * - SPI_CSQ_P0_Q4_OCCUPANCY + - Sum of occupancy info for PIPE0 Queue4. - * - SPI_CSQ_P0_Q5_OCCUPANCY - - Sum of occupancy info for PIPE0 Queue5. + * - SPI_CSQ_P0_Q5_OCCUPANCY + - Sum of occupancy info for PIPE0 Queue5. - * - SPI_CSQ_P0_Q6_OCCUPANCY - - Sum of occupancy info for PIPE0 Queue6. + * - SPI_CSQ_P0_Q6_OCCUPANCY + - Sum of occupancy info for PIPE0 Queue6. - * - SPI_CSQ_P0_Q7_OCCUPANCY - - Sum of occupancy info for PIPE0 Queue7. 
+ * - SPI_CSQ_P0_Q7_OCCUPANCY + - Sum of occupancy info for PIPE0 Queue7. - * - SPI_CSQ_P1_Q0_OCCUPANCY - - Sum of occupancy info for PIPE1 Queue0. + * - SPI_CSQ_P1_Q0_OCCUPANCY + - Sum of occupancy info for PIPE1 Queue0. - * - SPI_CSQ_P1_Q1_OCCUPANCY - - Sum of occupancy info for PIPE1 Queue1. + * - SPI_CSQ_P1_Q1_OCCUPANCY + - Sum of occupancy info for PIPE1 Queue1. - * - SPI_CSQ_P1_Q2_OCCUPANCY - - Sum of occupancy info for PIPE1 Queue2. + * - SPI_CSQ_P1_Q2_OCCUPANCY + - Sum of occupancy info for PIPE1 Queue2. - * - SPI_CSQ_P1_Q3_OCCUPANCY - - Sum of occupancy info for PIPE1 Queue3. + * - SPI_CSQ_P1_Q3_OCCUPANCY + - Sum of occupancy info for PIPE1 Queue3. - * - SPI_CSQ_P1_Q4_OCCUPANCY - - Sum of occupancy info for PIPE1 Queue4. + * - SPI_CSQ_P1_Q4_OCCUPANCY + - Sum of occupancy info for PIPE1 Queue4. - * - SPI_CSQ_P1_Q5_OCCUPANCY - - Sum of occupancy info for PIPE1 Queue5. + * - SPI_CSQ_P1_Q5_OCCUPANCY + - Sum of occupancy info for PIPE1 Queue5. - * - SPI_CSQ_P1_Q6_OCCUPANCY - - Sum of occupancy info for PIPE1 Queue6. + * - SPI_CSQ_P1_Q6_OCCUPANCY + - Sum of occupancy info for PIPE1 Queue6. - * - SPI_CSQ_P1_Q7_OCCUPANCY - - Sum of occupancy info for PIPE1 Queue7. + * - SPI_CSQ_P1_Q7_OCCUPANCY + - Sum of occupancy info for PIPE1 Queue7. - * - SPI_CSQ_P2_Q0_OCCUPANCY - - Sum of occupancy info for PIPE2 Queue0. + * - SPI_CSQ_P2_Q0_OCCUPANCY + - Sum of occupancy info for PIPE2 Queue0. - * - SPI_CSQ_P2_Q1_OCCUPANCY - - Sum of occupancy info for PIPE2 Queue1. + * - SPI_CSQ_P2_Q1_OCCUPANCY + - Sum of occupancy info for PIPE2 Queue1. - * - SPI_CSQ_P2_Q2_OCCUPANCY - - Sum of occupancy info for PIPE2 Queue2. + * - SPI_CSQ_P2_Q2_OCCUPANCY + - Sum of occupancy info for PIPE2 Queue2. - * - SPI_CSQ_P2_Q3_OCCUPANCY - - Sum of occupancy info for PIPE2 Queue3. + * - SPI_CSQ_P2_Q3_OCCUPANCY + - Sum of occupancy info for PIPE2 Queue3. - * - SPI_CSQ_P2_Q4_OCCUPANCY - - Sum of occupancy info for PIPE2 Queue4. + * - SPI_CSQ_P2_Q4_OCCUPANCY + - Sum of occupancy info for PIPE2 Queue4. 
- * - SPI_CSQ_P2_Q5_OCCUPANCY - - Sum of occupancy info for PIPE2 Queue5. + * - SPI_CSQ_P2_Q5_OCCUPANCY + - Sum of occupancy info for PIPE2 Queue5. - * - SPI_CSQ_P2_Q6_OCCUPANCY - - Sum of occupancy info for PIPE2 Queue6. + * - SPI_CSQ_P2_Q6_OCCUPANCY + - Sum of occupancy info for PIPE2 Queue6. - * - SPI_CSQ_P2_Q7_OCCUPANCY - - Sum of occupancy info for PIPE2 Queue7. + * - SPI_CSQ_P2_Q7_OCCUPANCY + - Sum of occupancy info for PIPE2 Queue7. - * - SPI_CSQ_P3_Q0_OCCUPANCY - - Sum of occupancy info for PIPE3 Queue0. + * - SPI_CSQ_P3_Q0_OCCUPANCY + - Sum of occupancy info for PIPE3 Queue0. - * - SPI_CSQ_P3_Q1_OCCUPANCY - - Sum of occupancy info for PIPE3 Queue1. + * - SPI_CSQ_P3_Q1_OCCUPANCY + - Sum of occupancy info for PIPE3 Queue1. - * - SPI_CSQ_P3_Q2_OCCUPANCY - - Sum of occupancy info for PIPE3 Queue2. + * - SPI_CSQ_P3_Q2_OCCUPANCY + - Sum of occupancy info for PIPE3 Queue2. - * - SPI_CSQ_P3_Q3_OCCUPANCY - - Sum of occupancy info for PIPE3 Queue3. + * - SPI_CSQ_P3_Q3_OCCUPANCY + - Sum of occupancy info for PIPE3 Queue3. - * - SPI_CSQ_P3_Q4_OCCUPANCY - - Sum of occupancy info for PIPE3 Queue4. + * - SPI_CSQ_P3_Q4_OCCUPANCY + - Sum of occupancy info for PIPE3 Queue4. - * - SPI_CSQ_P3_Q5_OCCUPANCY - - Sum of occupancy info for PIPE3 Queue5. + * - SPI_CSQ_P3_Q5_OCCUPANCY + - Sum of occupancy info for PIPE3 Queue5. - * - SPI_CSQ_P3_Q6_OCCUPANCY - - Sum of occupancy info for PIPE3 Queue6. + * - SPI_CSQ_P3_Q6_OCCUPANCY + - Sum of occupancy info for PIPE3 Queue6. - * - SPI_CSQ_P3_Q7_OCCUPANCY - - Sum of occupancy info for PIPE3 Queue7. + * - SPI_CSQ_P3_Q7_OCCUPANCY + - Sum of occupancy info for PIPE3 Queue7. - * - SPI_CSQ_P0_OCCUPANCY - - Sum of occupancy info for all PIPE0 queues. + * - SPI_CSQ_P0_OCCUPANCY + - Sum of occupancy info for all PIPE0 queues. - * - SPI_CSQ_P1_OCCUPANCY - - Sum of occupancy info for all PIPE1 queues. + * - SPI_CSQ_P1_OCCUPANCY + - Sum of occupancy info for all PIPE1 queues. 
- * - SPI_CSQ_P2_OCCUPANCY - - Sum of occupancy info for all PIPE2 queues. + * - SPI_CSQ_P2_OCCUPANCY + - Sum of occupancy info for all PIPE2 queues. - * - SPI_CSQ_P3_OCCUPANCY - - Sum of occupancy info for all PIPE3 queues. + * - SPI_CSQ_P3_OCCUPANCY + - Sum of occupancy info for all PIPE3 queues. - * - SPI_VWC0_VDATA_VALID_WR - - Number of clocks VGPR bus_0 writes VGPRs. + * - SPI_VWC0_VDATA_VALID_WR + - Number of clocks VGPR bus_0 writes VGPRs. - * - SPI_VWC1_VDATA_VALID_WR - - Number of clocks VGPR bus_1 writes VGPRs. + * - SPI_VWC1_VDATA_VALID_WR + - Number of clocks VGPR bus_1 writes VGPRs. - * - SPI_CSC_WAVE_CNT_BUSY - - Number of cycles when there is any wave in the pipe. + * - SPI_CSC_WAVE_CNT_BUSY + - Number of cycles when there is any wave in the pipe. -Compute unit counters -====================== +Compute unit (SQ) counters +=========================== .. list-table:: SQ counters :header-row: 1 @@ -346,8 +346,8 @@ Compute unit counters * - SQC_DCACHE_MISSES_DUPLICATE - Number of duplicate misses (access to a non-resident, miss pending CL) (per-SQ, per-Bank, nondeterministic). -Texture addressing unit counters -================================= +Texture addressing (TA) unit counters +====================================== .. list-table:: TA counters :header-row: 1 @@ -361,8 +361,8 @@ Texture addressing unit counters * - TA_FLAT_READ_LDS_WAVEFRONTS - Number of flat opcode reads for LDS return processed by the TA. -Texture data unit counters -=========================== +Texture data (TD) unit counters +================================ .. list-table:: TD counters :header-row: 1 @@ -376,8 +376,8 @@ Texture data unit counters * - TD_TD_SP_TRAFFIC - Number of times this TD sends data to the SP. -Texture cache per pipe counters -================================ +Texture cache per pipe (TCP) counters +====================================== .. 
list-table:: TCP counters :header-row: 1 @@ -457,8 +457,8 @@ Texture cache per pipe counters * - TCP_TCC_WRITE_REQ_HOLE_LATENCY - Total TCP req to TCC hole latency for writes and atomics. Not Windowed. -Texture cache per channel counters -=================================== +Texture cache per channel (TCC) counters +========================================= .. list-table:: TCC counters :header-row: 1 From 7fd6146b160b98fdb83cc5b60f695786a0c7dea3 Mon Sep 17 00:00:00 2001 From: srawat <120587655+SwRaw@users.noreply.github.com> Date: Fri, 22 Aug 2025 23:16:18 +0530 Subject: [PATCH 24/58] Update mi355-performance-counters.rst --- .../gpu-arch/mi355-performance-counters.rst | 576 +++++++++--------- 1 file changed, 288 insertions(+), 288 deletions(-) diff --git a/docs/conceptual/gpu-arch/mi355-performance-counters.rst b/docs/conceptual/gpu-arch/mi355-performance-counters.rst index dcb181d05..861ce5641 100644 --- a/docs/conceptual/gpu-arch/mi355-performance-counters.rst +++ b/docs/conceptual/gpu-arch/mi355-performance-counters.rst @@ -74,457 +74,457 @@ Shader pipe interpolators (SPI) counters ========================================= .. list-table:: SPI counters - :header-row: 1 + :header-rows: 1 - * - Hardware counter - - Definition + * - Hardware counter + - Definition - * - SPI_CS0_WINDOW_VALID - - Clock count enabled by PIPE0 perfcounter_start event. + * - SPI_CS0_WINDOW_VALID + - Clock count enabled by PIPE0 perfcounter_start event. - * - SPI_CS0_BUSY - - Number of clocks with outstanding waves for PIPE0 (SPI or SH). + * - SPI_CS0_BUSY + - Number of clocks with outstanding waves for PIPE0 (SPI or SH). - * - SPI_CS0_NUM_THREADGROUPS - - Number of thread groups launched for PIPE0. + * - SPI_CS0_NUM_THREADGROUPS + - Number of thread groups launched for PIPE0. - * - SPI_CS0_CRAWLER_STALL - - Number of clocks when PIPE0 event or wave order FIFO is full. + * - SPI_CS0_CRAWLER_STALL + - Number of clocks when PIPE0 event or wave order FIFO is full. 
- * - SPI_CS0_EVENT_WAVE - - Number of PIPE0 events and waves. + * - SPI_CS0_EVENT_WAVE + - Number of PIPE0 events and waves. - * - SPI_CS0_WAVE - - Number of PIPE0 waves. + * - SPI_CS0_WAVE + - Number of PIPE0 waves. - * - SPI_CS1_WINDOW_VALID - - Clock count enabled by PIPE1 perfcounter_start event. + * - SPI_CS1_WINDOW_VALID + - Clock count enabled by PIPE1 perfcounter_start event. - * - SPI_CS1_BUSY - - Number of clocks with outstanding waves for PIPE1 (SPI or SH). + * - SPI_CS1_BUSY + - Number of clocks with outstanding waves for PIPE1 (SPI or SH). - * - SPI_CS1_NUM_THREADGROUPS - - Number of thread groups launched for PIPE1. + * - SPI_CS1_NUM_THREADGROUPS + - Number of thread groups launched for PIPE1. - * - SPI_CS1_CRAWLER_STALL - - Number of clocks when PIPE1 event or wave order FIFO is full. + * - SPI_CS1_CRAWLER_STALL + - Number of clocks when PIPE1 event or wave order FIFO is full. - * - SPI_CS1_EVENT_WAVE - - Number of PIPE1 events and waves. + * - SPI_CS1_EVENT_WAVE + - Number of PIPE1 events and waves. - * - SPI_CS1_WAVE - - Number of PIPE1 waves. + * - SPI_CS1_WAVE + - Number of PIPE1 waves. - * - SPI_CS2_WINDOW_VALID - - Clock count enabled by PIPE2 perfcounter_start event. + * - SPI_CS2_WINDOW_VALID + - Clock count enabled by PIPE2 perfcounter_start event. - * - SPI_CS2_BUSY - - Number of clocks with outstanding waves for PIPE2 (SPI or SH). + * - SPI_CS2_BUSY + - Number of clocks with outstanding waves for PIPE2 (SPI or SH). - * - SPI_CS2_NUM_THREADGROUPS - - Number of thread groups launched for PIPE2. + * - SPI_CS2_NUM_THREADGROUPS + - Number of thread groups launched for PIPE2. - * - SPI_CS2_CRAWLER_STALL - - Number of clocks when PIPE2 event or wave order FIFO is full. + * - SPI_CS2_CRAWLER_STALL + - Number of clocks when PIPE2 event or wave order FIFO is full. - * - SPI_CS2_EVENT_WAVE - - Number of PIPE2 events and waves. + * - SPI_CS2_EVENT_WAVE + - Number of PIPE2 events and waves. - * - SPI_CS2_WAVE - - Number of PIPE2 waves. 
+ * - SPI_CS2_WAVE + - Number of PIPE2 waves. - * - SPI_CS3_WINDOW_VALID - - Clock count enabled by PIPE3 perfcounter_start event. + * - SPI_CS3_WINDOW_VALID + - Clock count enabled by PIPE3 perfcounter_start event. - * - SPI_CS3_BUSY - - Number of clocks with outstanding waves for PIPE3 (SPI or SH). + * - SPI_CS3_BUSY + - Number of clocks with outstanding waves for PIPE3 (SPI or SH). - * - SPI_CS3_NUM_THREADGROUPS - - Number of thread groups launched for PIPE3. + * - SPI_CS3_NUM_THREADGROUPS + - Number of thread groups launched for PIPE3. - * - SPI_CS3_CRAWLER_STALL - - Number of clocks when PIPE3 event or wave order FIFO is full. + * - SPI_CS3_CRAWLER_STALL + - Number of clocks when PIPE3 event or wave order FIFO is full. - * - SPI_CS3_EVENT_WAVE - - Number of PIPE3 events and waves. + * - SPI_CS3_EVENT_WAVE + - Number of PIPE3 events and waves. - * - SPI_CS3_WAVE - - Number of PIPE3 waves. + * - SPI_CS3_WAVE + - Number of PIPE3 waves. - * - SPI_CSQ_P0_Q0_OCCUPANCY - - Sum of occupancy info for PIPE0 Queue0. + * - SPI_CSQ_P0_Q0_OCCUPANCY + - Sum of occupancy info for PIPE0 Queue0. - * - SPI_CSQ_P0_Q1_OCCUPANCY - - Sum of occupancy info for PIPE0 Queue1. + * - SPI_CSQ_P0_Q1_OCCUPANCY + - Sum of occupancy info for PIPE0 Queue1. - * - SPI_CSQ_P0_Q2_OCCUPANCY - - Sum of occupancy info for PIPE0 Queue2. + * - SPI_CSQ_P0_Q2_OCCUPANCY + - Sum of occupancy info for PIPE0 Queue2. - * - SPI_CSQ_P0_Q3_OCCUPANCY - - Sum of occupancy info for PIPE0 Queue3. + * - SPI_CSQ_P0_Q3_OCCUPANCY + - Sum of occupancy info for PIPE0 Queue3. - * - SPI_CSQ_P0_Q4_OCCUPANCY - - Sum of occupancy info for PIPE0 Queue4. + * - SPI_CSQ_P0_Q4_OCCUPANCY + - Sum of occupancy info for PIPE0 Queue4. - * - SPI_CSQ_P0_Q5_OCCUPANCY - - Sum of occupancy info for PIPE0 Queue5. + * - SPI_CSQ_P0_Q5_OCCUPANCY + - Sum of occupancy info for PIPE0 Queue5. - * - SPI_CSQ_P0_Q6_OCCUPANCY - - Sum of occupancy info for PIPE0 Queue6. + * - SPI_CSQ_P0_Q6_OCCUPANCY + - Sum of occupancy info for PIPE0 Queue6. 
- * - SPI_CSQ_P0_Q7_OCCUPANCY - - Sum of occupancy info for PIPE0 Queue7. + * - SPI_CSQ_P0_Q7_OCCUPANCY + - Sum of occupancy info for PIPE0 Queue7. - * - SPI_CSQ_P1_Q0_OCCUPANCY - - Sum of occupancy info for PIPE1 Queue0. + * - SPI_CSQ_P1_Q0_OCCUPANCY + - Sum of occupancy info for PIPE1 Queue0. - * - SPI_CSQ_P1_Q1_OCCUPANCY - - Sum of occupancy info for PIPE1 Queue1. + * - SPI_CSQ_P1_Q1_OCCUPANCY + - Sum of occupancy info for PIPE1 Queue1. - * - SPI_CSQ_P1_Q2_OCCUPANCY - - Sum of occupancy info for PIPE1 Queue2. + * - SPI_CSQ_P1_Q2_OCCUPANCY + - Sum of occupancy info for PIPE1 Queue2. - * - SPI_CSQ_P1_Q3_OCCUPANCY - - Sum of occupancy info for PIPE1 Queue3. + * - SPI_CSQ_P1_Q3_OCCUPANCY + - Sum of occupancy info for PIPE1 Queue3. - * - SPI_CSQ_P1_Q4_OCCUPANCY - - Sum of occupancy info for PIPE1 Queue4. + * - SPI_CSQ_P1_Q4_OCCUPANCY + - Sum of occupancy info for PIPE1 Queue4. - * - SPI_CSQ_P1_Q5_OCCUPANCY - - Sum of occupancy info for PIPE1 Queue5. + * - SPI_CSQ_P1_Q5_OCCUPANCY + - Sum of occupancy info for PIPE1 Queue5. - * - SPI_CSQ_P1_Q6_OCCUPANCY - - Sum of occupancy info for PIPE1 Queue6. + * - SPI_CSQ_P1_Q6_OCCUPANCY + - Sum of occupancy info for PIPE1 Queue6. - * - SPI_CSQ_P1_Q7_OCCUPANCY - - Sum of occupancy info for PIPE1 Queue7. + * - SPI_CSQ_P1_Q7_OCCUPANCY + - Sum of occupancy info for PIPE1 Queue7. - * - SPI_CSQ_P2_Q0_OCCUPANCY - - Sum of occupancy info for PIPE2 Queue0. + * - SPI_CSQ_P2_Q0_OCCUPANCY + - Sum of occupancy info for PIPE2 Queue0. - * - SPI_CSQ_P2_Q1_OCCUPANCY - - Sum of occupancy info for PIPE2 Queue1. + * - SPI_CSQ_P2_Q1_OCCUPANCY + - Sum of occupancy info for PIPE2 Queue1. - * - SPI_CSQ_P2_Q2_OCCUPANCY - - Sum of occupancy info for PIPE2 Queue2. + * - SPI_CSQ_P2_Q2_OCCUPANCY + - Sum of occupancy info for PIPE2 Queue2. - * - SPI_CSQ_P2_Q3_OCCUPANCY - - Sum of occupancy info for PIPE2 Queue3. + * - SPI_CSQ_P2_Q3_OCCUPANCY + - Sum of occupancy info for PIPE2 Queue3. - * - SPI_CSQ_P2_Q4_OCCUPANCY - - Sum of occupancy info for PIPE2 Queue4. 
+ * - SPI_CSQ_P2_Q4_OCCUPANCY + - Sum of occupancy info for PIPE2 Queue4. - * - SPI_CSQ_P2_Q5_OCCUPANCY - - Sum of occupancy info for PIPE2 Queue5. + * - SPI_CSQ_P2_Q5_OCCUPANCY + - Sum of occupancy info for PIPE2 Queue5. - * - SPI_CSQ_P2_Q6_OCCUPANCY - - Sum of occupancy info for PIPE2 Queue6. + * - SPI_CSQ_P2_Q6_OCCUPANCY + - Sum of occupancy info for PIPE2 Queue6. - * - SPI_CSQ_P2_Q7_OCCUPANCY - - Sum of occupancy info for PIPE2 Queue7. + * - SPI_CSQ_P2_Q7_OCCUPANCY + - Sum of occupancy info for PIPE2 Queue7. - * - SPI_CSQ_P3_Q0_OCCUPANCY - - Sum of occupancy info for PIPE3 Queue0. + * - SPI_CSQ_P3_Q0_OCCUPANCY + - Sum of occupancy info for PIPE3 Queue0. - * - SPI_CSQ_P3_Q1_OCCUPANCY - - Sum of occupancy info for PIPE3 Queue1. + * - SPI_CSQ_P3_Q1_OCCUPANCY + - Sum of occupancy info for PIPE3 Queue1. - * - SPI_CSQ_P3_Q2_OCCUPANCY - - Sum of occupancy info for PIPE3 Queue2. + * - SPI_CSQ_P3_Q2_OCCUPANCY + - Sum of occupancy info for PIPE3 Queue2. - * - SPI_CSQ_P3_Q3_OCCUPANCY - - Sum of occupancy info for PIPE3 Queue3. + * - SPI_CSQ_P3_Q3_OCCUPANCY + - Sum of occupancy info for PIPE3 Queue3. - * - SPI_CSQ_P3_Q4_OCCUPANCY - - Sum of occupancy info for PIPE3 Queue4. + * - SPI_CSQ_P3_Q4_OCCUPANCY + - Sum of occupancy info for PIPE3 Queue4. - * - SPI_CSQ_P3_Q5_OCCUPANCY - - Sum of occupancy info for PIPE3 Queue5. + * - SPI_CSQ_P3_Q5_OCCUPANCY + - Sum of occupancy info for PIPE3 Queue5. - * - SPI_CSQ_P3_Q6_OCCUPANCY - - Sum of occupancy info for PIPE3 Queue6. + * - SPI_CSQ_P3_Q6_OCCUPANCY + - Sum of occupancy info for PIPE3 Queue6. - * - SPI_CSQ_P3_Q7_OCCUPANCY - - Sum of occupancy info for PIPE3 Queue7. + * - SPI_CSQ_P3_Q7_OCCUPANCY + - Sum of occupancy info for PIPE3 Queue7. - * - SPI_CSQ_P0_OCCUPANCY - - Sum of occupancy info for all PIPE0 queues. + * - SPI_CSQ_P0_OCCUPANCY + - Sum of occupancy info for all PIPE0 queues. - * - SPI_CSQ_P1_OCCUPANCY - - Sum of occupancy info for all PIPE1 queues. 
+ * - SPI_CSQ_P1_OCCUPANCY + - Sum of occupancy info for all PIPE1 queues. - * - SPI_CSQ_P2_OCCUPANCY - - Sum of occupancy info for all PIPE2 queues. + * - SPI_CSQ_P2_OCCUPANCY + - Sum of occupancy info for all PIPE2 queues. - * - SPI_CSQ_P3_OCCUPANCY - - Sum of occupancy info for all PIPE3 queues. + * - SPI_CSQ_P3_OCCUPANCY + - Sum of occupancy info for all PIPE3 queues. - * - SPI_VWC0_VDATA_VALID_WR - - Number of clocks VGPR bus_0 writes VGPRs. + * - SPI_VWC0_VDATA_VALID_WR + - Number of clocks VGPR bus_0 writes VGPRs. - * - SPI_VWC1_VDATA_VALID_WR - - Number of clocks VGPR bus_1 writes VGPRs. + * - SPI_VWC1_VDATA_VALID_WR + - Number of clocks VGPR bus_1 writes VGPRs. - * - SPI_CSC_WAVE_CNT_BUSY - - Number of cycles when there is any wave in the pipe. + * - SPI_CSC_WAVE_CNT_BUSY + - Number of cycles when there is any wave in the pipe. Compute unit (SQ) counters =========================== .. list-table:: SQ counters - :header-row: 1 + :header-rows: 1 - * - Hardware counter - - Definition + * - Hardware counter + - Definition - * - SQ_INSTS_VALU_MFMA_F6F4 - - Number of VALU V_MFMA_*_F6F4 instructions. + * - SQ_INSTS_VALU_MFMA_F6F4 + - Number of VALU V_MFMA_*_F6F4 instructions. - * - SQ_INSTS_VALU_MFMA_MOPS_F6F4 - - Number of VALU matrix with the performed math operations (add or mul) divided by 512, assuming a full EXEC mask of F6 or F4 data type. + * - SQ_INSTS_VALU_MFMA_MOPS_F6F4 + - Number of VALU matrix with the performed math operations (add or mul) divided by 512, assuming a full EXEC mask of F6 or F4 data type. - * - SQ_ACTIVE_INST_VALU2 - - Number of quad-cycles when two VALU instructions are issued (per-simd, nondeterministic). + * - SQ_ACTIVE_INST_VALU2 + - Number of quad-cycles when two VALU instructions are issued (per-simd, nondeterministic). - * - SQ_INSTS_LDS_LOAD - - Number of LDS load instructions issued (per-simd, emulated). + * - SQ_INSTS_LDS_LOAD + - Number of LDS load instructions issued (per-simd, emulated). 
- * - SQ_INSTS_LDS_STORE - - Number of LDS store instructions issued (per-simd, emulated). + * - SQ_INSTS_LDS_STORE + - Number of LDS store instructions issued (per-simd, emulated). - * - SQ_INSTS_LDS_ATOMIC - - Number of LDS atomic instructions issued (per-simd, emulated). + * - SQ_INSTS_LDS_ATOMIC + - Number of LDS atomic instructions issued (per-simd, emulated). - * - SQ_INSTS_LDS_LOAD_BANDWIDTH - - Total number of 64-bytes loaded (instrSize * CountOnes(EXEC))/64 (per-simd, emulated). + * - SQ_INSTS_LDS_LOAD_BANDWIDTH + - Total number of 64-bytes loaded (instrSize * CountOnes(EXEC))/64 (per-simd, emulated). - * - SQ_INSTS_LDS_STORE_BANDWIDTH - - Total number of 64-bytes written (instrSize * CountOnes(EXEC))/64 (per-simd, emulated). + * - SQ_INSTS_LDS_STORE_BANDWIDTH + - Total number of 64-bytes written (instrSize * CountOnes(EXEC))/64 (per-simd, emulated). - * - SQ_INSTS_LDS_ATOMIC_BANDWIDTH - - Total number of 64-bytes atomic (instrSize * CountOnes(EXEC))/64 (per-simd, emulated). + * - SQ_INSTS_LDS_ATOMIC_BANDWIDTH + - Total number of 64-bytes atomic (instrSize * CountOnes(EXEC))/64 (per-simd, emulated). - * - SQ_INSTS_VALU_FLOPS_FP16 - - Counts FLOPS per instruction on float 16 excluding MFMA/SMFMA. + * - SQ_INSTS_VALU_FLOPS_FP16 + - Counts FLOPS per instruction on float 16 excluding MFMA/SMFMA. - * - SQ_INSTS_VALU_FLOPS_FP32 - - Counts FLOPS per instruction on float 32 excluding MFMA/SMFMA. + * - SQ_INSTS_VALU_FLOPS_FP32 + - Counts FLOPS per instruction on float 32 excluding MFMA/SMFMA. - * - SQ_INSTS_VALU_FLOPS_FP64 - - Counts FLOPS per instruction on float 64 excluding MFMA/SMFMA. + * - SQ_INSTS_VALU_FLOPS_FP64 + - Counts FLOPS per instruction on float 64 excluding MFMA/SMFMA. - * - SQ_INSTS_VALU_FLOPS_FP16_TRANS - - Counts FLOPS per instruction on float 16 trans excluding MFMA/SMFMA. + * - SQ_INSTS_VALU_FLOPS_FP16_TRANS + - Counts FLOPS per instruction on float 16 trans excluding MFMA/SMFMA. 
- * - SQ_INSTS_VALU_FLOPS_FP32_TRANS - - Counts FLOPS per instruction on float 32 trans excluding MFMA/SMFMA. + * - SQ_INSTS_VALU_FLOPS_FP32_TRANS + - Counts FLOPS per instruction on float 32 trans excluding MFMA/SMFMA. - * - SQ_INSTS_VALU_FLOPS_FP64_TRANS - - Counts FLOPS per instruction on float 64 trans excluding MFMA/SMFMA. + * - SQ_INSTS_VALU_FLOPS_FP64_TRANS + - Counts FLOPS per instruction on float 64 trans excluding MFMA/SMFMA. - * - SQ_INSTS_VALU_IOPS - - Counts OPS per instruction on integer or unsigned or bit data (per-simd, emulated). + * - SQ_INSTS_VALU_IOPS + - Counts OPS per instruction on integer or unsigned or bit data (per-simd, emulated). - * - SQ_LDS_DATA_FIFO_FULL - - Number of cycles LDS data FIFO is full (nondeterministic, unwindowed). + * - SQ_LDS_DATA_FIFO_FULL + - Number of cycles LDS data FIFO is full (nondeterministic, unwindowed). - * - SQ_LDS_CMD_FIFO_FULL - - Number of cycles LDS command FIFO is full (nondeterministic, unwindowed). + * - SQ_LDS_CMD_FIFO_FULL + - Number of cycles LDS command FIFO is full (nondeterministic, unwindowed). - * - SQ_VMEM_TA_ADDR_FIFO_FULL - - Number of cycles texture requests are stalled due to full address FIFO in TA (nondeterministic, unwindowed). + * - SQ_VMEM_TA_ADDR_FIFO_FULL + - Number of cycles texture requests are stalled due to full address FIFO in TA (nondeterministic, unwindowed). - * - SQ_VMEM_TA_CMD_FIFO_FULL - - Number of cycles texture requests are stalled due to full cmd FIFO in TA (nondeterministic, unwindowed). + * - SQ_VMEM_TA_CMD_FIFO_FULL + - Number of cycles texture requests are stalled due to full cmd FIFO in TA (nondeterministic, unwindowed). - * - SQ_VMEM_WR_TA_DATA_FIFO_FULL - - Number of cycles texture writes are stalled due to full data FIFO in TA (nondeterministic, unwindowed). + * - SQ_VMEM_WR_TA_DATA_FIFO_FULL + - Number of cycles texture writes are stalled due to full data FIFO in TA (nondeterministic, unwindowed). 
- * - SQC_ICACHE_MISSES_DUPLICATE - - Number of duplicate misses (access to a non-resident, miss pending CL) (per-SQ, per-Bank, nondeterministic). + * - SQC_ICACHE_MISSES_DUPLICATE + - Number of duplicate misses (access to a non-resident, miss pending CL) (per-SQ, per-Bank, nondeterministic). - * - SQC_DCACHE_MISSES_DUPLICATE - - Number of duplicate misses (access to a non-resident, miss pending CL) (per-SQ, per-Bank, nondeterministic). + * - SQC_DCACHE_MISSES_DUPLICATE + - Number of duplicate misses (access to a non-resident, miss pending CL) (per-SQ, per-Bank, nondeterministic). Texture addressing (TA) unit counters ====================================== .. list-table:: TA counters - :header-row: 1 + :header-rows: 1 - * - Hardware counter - - Definition + * - Hardware counter + - Definition - * - TA_BUFFER_READ_LDS_WAVEFRONTS - - Number of buffer read wavefronts for LDS return processed by the TA. + * - TA_BUFFER_READ_LDS_WAVEFRONTS + - Number of buffer read wavefronts for LDS return processed by the TA. - * - TA_FLAT_READ_LDS_WAVEFRONTS - - Number of flat opcode reads for LDS return processed by the TA. + * - TA_FLAT_READ_LDS_WAVEFRONTS + - Number of flat opcode reads for LDS return processed by the TA. Texture data (TD) unit counters ================================ .. list-table:: TD counters - :header-row: 1 + :header-rows: 1 - * - Hardware counter - - Definition + * - Hardware counter + - Definition - * - TD_WRITE_ACKT_WAVEFRONT - - Number of write acknowledgments, sent to SQ and not to SP. + * - TD_WRITE_ACKT_WAVEFRONT + - Number of write acknowledgments, sent to SQ and not to SP. - * - TD_TD_SP_TRAFFIC - - Number of times this TD sends data to the SP. + * - TD_TD_SP_TRAFFIC + - Number of times this TD sends data to the SP. Texture cache per pipe (TCP) counters ====================================== .. 
list-table:: TCP counters - :header-row: 1 + :header-rows: 1 - * - Hardware counter - - Definition + * - Hardware counter + - Definition - * - TCP_TCP_TA_ADDR_STALL_CYCLES - - TCP stalls TA addr interface. + * - TCP_TCP_TA_ADDR_STALL_CYCLES + - TCP stalls TA addr interface. - * - TCP_TCP_TA_DATA_STALL_CYCLES - - TCP stalls TA data interface. Now windowed. + * - TCP_TCP_TA_DATA_STALL_CYCLES + - TCP stalls TA data interface. Now windowed. - * - TCP_LFIFO_STALL_CYCLES - - Memory latency FIFOs full stall. + * - TCP_LFIFO_STALL_CYCLES + - Memory latency FIFOs full stall. - * - TCP_RFIFO_STALL_CYCLES - - Memory Request FIFOs full stall. + * - TCP_RFIFO_STALL_CYCLES + - Memory Request FIFOs full stall. - * - TCP_TCR_RDRET_STALL - - Write into cache stalled by read return from TCR. + * - TCP_TCR_RDRET_STALL + - Write into cache stalled by read return from TCR. - * - TCP_PENDING_STALL_CYCLES - - Stall due to data pending from L2. + * - TCP_PENDING_STALL_CYCLES + - Stall due to data pending from L2. - * - TCP_UTCL1_SERIALIZATION_STALL - - Total number of stalls caused due to serializing translation requests through the UTCL1. + * - TCP_UTCL1_SERIALIZATION_STALL + - Total number of stalls caused due to serializing translation requests through the UTCL1. - * - TCP_UTCL1_THRASHING_STALL - - Stall caused by thrashing feature in any probe. Lacks accuracy when the stall signal overlaps between probe0 and probe1, which is worse with MECO of thrashing deadlock. Some probe0 events could miss being counted in with MECO on. This perf count provides a rough thrashing estimate. + * - TCP_UTCL1_THRASHING_STALL + - Stall caused by thrashing feature in any probe. Lacks accuracy when the stall signal overlaps between probe0 and probe1, which is worse with MECO of thrashing deadlock. Some probe0 events could miss being counted in with MECO on. This perf count provides a rough thrashing estimate. - * - TCP_UTCL1_TRANSLATION_MISS_UNDER_MISS - - Translation miss_under_miss. 
+ * - TCP_UTCL1_TRANSLATION_MISS_UNDER_MISS + - Translation miss_under_miss. - * - TCP_UTCL1_STALL_INFLIGHT_MAX - - Total UTCL1 stalls due to inflight counter saturation. + * - TCP_UTCL1_STALL_INFLIGHT_MAX + - Total UTCL1 stalls due to inflight counter saturation. - * - TCP_UTCL1_STALL_LRU_INFLIGHT - - Total UTCL1 stalls due to LRU cache line with inflight traffic. + * - TCP_UTCL1_STALL_LRU_INFLIGHT + - Total UTCL1 stalls due to LRU cache line with inflight traffic. - * - TCP_UTCL1_STALL_MULTI_MISS - - Total UTCL1 stalls due to arbitrated multiple misses. + * - TCP_UTCL1_STALL_MULTI_MISS + - Total UTCL1 stalls due to arbitrated multiple misses. - * - TCP_UTCL1_LFIFO_FULL - - Total UTCL1 and UTCL2 latency, which hides FIFO full cycles. + * - TCP_UTCL1_LFIFO_FULL + - Total UTCL1 and UTCL2 latency, which hides FIFO full cycles. - * - TCP_UTCL1_STALL_LFIFO_NOT_RES - - Total UTCL1 stalls due to UTCL2 latency, which hides FIFO output (not resident). + * - TCP_UTCL1_STALL_LFIFO_NOT_RES + - Total UTCL1 stalls due to UTCL2 latency, which hides FIFO output (not resident). - * - TCP_UTCL1_STALL_UTCL2_REQ_OUT_OF_CREDITS - - Total UTCL1 stalls due to UTCL2_req being out of credits. + * - TCP_UTCL1_STALL_UTCL2_REQ_OUT_OF_CREDITS + - Total UTCL1 stalls due to UTCL2_req being out of credits. - * - TCP_CLIENT_UTCL1_INFLIGHT - - The sum of inflight client to UTCL1 requests per cycle. + * - TCP_CLIENT_UTCL1_INFLIGHT + - The sum of inflight client to UTCL1 requests per cycle. - * - TCP_TAGRAM0_REQ - - Total L2 requests mapping to TagRAM 0 from this TCP to all TCCs. + * - TCP_TAGRAM0_REQ + - Total L2 requests mapping to TagRAM 0 from this TCP to all TCCs. - * - TCP_TAGRAM1_REQ - - Total L2 requests mapping to TagRAM 1 from this TCP to all TCCs. + * - TCP_TAGRAM1_REQ + - Total L2 requests mapping to TagRAM 1 from this TCP to all TCCs. - * - TCP_TAGRAM2_REQ - - Total L2 requests mapping to TagRAM 2 from this TCP to all TCCs. 
+ * - TCP_TAGRAM2_REQ + - Total L2 requests mapping to TagRAM 2 from this TCP to all TCCs. - * - TCP_TAGRAM3_REQ - - Total L2 requests mapping to TagRAM 3 from this TCP to all TCCs. + * - TCP_TAGRAM3_REQ + - Total L2 requests mapping to TagRAM 3 from this TCP to all TCCs. - * - TCP_TCP_LATENCY - - Total TCP wave latency (from the first clock of wave entering to the first clock of wave leaving). Divide by TA_TCP_STATE_READ to find average wave latency. + * - TCP_TCP_LATENCY + - Total TCP wave latency (from the first clock of wave entering to the first clock of wave leaving). Divide by TA_TCP_STATE_READ to find average wave latency. - * - TCP_TCC_READ_REQ_LATENCY - - Total TCP to TCC request latency for reads and atomics with return. Not Windowed. + * - TCP_TCC_READ_REQ_LATENCY + - Total TCP to TCC request latency for reads and atomics with return. Not Windowed. - * - TCP_TCC_WRITE_REQ_LATENCY - - Total TCP to TCC request latency for writes and atomics without return. Not Windowed. + * - TCP_TCC_WRITE_REQ_LATENCY + - Total TCP to TCC request latency for writes and atomics without return. Not Windowed. - * - TCP_TCC_WRITE_REQ_HOLE_LATENCY - - Total TCP req to TCC hole latency for writes and atomics. Not Windowed. + * - TCP_TCC_WRITE_REQ_HOLE_LATENCY + - Total TCP req to TCC hole latency for writes and atomics. Not Windowed. Texture cache per channel (TCC) counters ========================================= .. list-table:: TCC counters - :header-row: 1 + :header-rows: 1 - * - Hardware counter - - Definition + * - Hardware counter + - Definition - * - TCC_READ_SECTORS - - Total number of 32B data sectors in read requests. + * - TCC_READ_SECTORS + - Total number of 32B data sectors in read requests. - * - TCC_WRITE_SECTORS - - Total number of 32B data sectors in write requests. + * - TCC_WRITE_SECTORS + - Total number of 32B data sectors in write requests. - * - TCC_ATOMIC_SECTORS - - Total number of 32B data sectors in atomic requests. 
+ * - TCC_ATOMIC_SECTORS + - Total number of 32B data sectors in atomic requests. - * - TCC_BYPASS_REQ - - Number of bypass requests. This is measured at the tag block. + * - TCC_BYPASS_REQ + - Number of bypass requests. This is measured at the tag block. - * - TCC_LATENCY_FIFO_FULL - - Number of cycles when the latency FIFO is full. + * - TCC_LATENCY_FIFO_FULL + - Number of cycles when the latency FIFO is full. - * - TCC_SRC_FIFO_FULL - - Number of cycles when the SRC FIFO is assumed to be full as measured at the IB block. + * - TCC_SRC_FIFO_FULL + - Number of cycles when the SRC FIFO is assumed to be full as measured at the IB block. - * - TCC_EA0_RDREQ_64B - - Number of 64-byte TCC/EA read requests. + * - TCC_EA0_RDREQ_64B + - Number of 64-byte TCC/EA read requests. - * - TCC_EA0_RDREQ_128B - - Number of 128-byte TCC/EA read requests. + * - TCC_EA0_RDREQ_128B + - Number of 128-byte TCC/EA read requests. - * - TCC_IB_REQ - - Number of requests through the IB. This measures the number of raw requests from graphics clients to this TCC. + * - TCC_IB_REQ + - Number of requests through the IB. This measures the number of raw requests from graphics clients to this TCC. - * - TCC_IB_STALL - - Number of cycles when the IB output is stalled. + * - TCC_IB_STALL + - Number of cycles when the IB output is stalled. - * - TCC_EA0_WRREQ_WRITE_DRAM - - Number of TCC/EA write requests (32-byte or 64-byte) destined for DRAM (MC). + * - TCC_EA0_WRREQ_WRITE_DRAM + - Number of TCC/EA write requests (32-byte or 64-byte) destined for DRAM (MC). - * - TCC_EA0_WRREQ_ATOMIC_DRAM - - Number of TCC/EA atomic requests (32-byte or 64-byte) destined for DRAM (MC). + * - TCC_EA0_WRREQ_ATOMIC_DRAM + - Number of TCC/EA atomic requests (32-byte or 64-byte) destined for DRAM (MC). - * - TCC_EA0_RDREQ_DRAM_32B - - Number of 32-byte TCC/EA read requests due to DRAM traffic. One 64-byte request is counted as two and one 128-byte as four. 
+ * - TCC_EA0_RDREQ_DRAM_32B + - Number of 32-byte TCC/EA read requests due to DRAM traffic. One 64-byte request is counted as two and one 128-byte as four. - * - TCC_EA0_RDREQ_GMI_32B - - Number of 32-byte TCC/EA read requests due to GMI traffic. One 64-byte request is counted as two and one 128-byte as four. + * - TCC_EA0_RDREQ_GMI_32B + - Number of 32-byte TCC/EA read requests due to GMI traffic. One 64-byte request is counted as two and one 128-byte as four. - * - TCC_EA0_RDREQ_IO_32B - - Number of 32-byte TCC/EA read requests due to IO traffic. One 64-byte request is counted as two and one 128-byte as four. + * - TCC_EA0_RDREQ_IO_32B + - Number of 32-byte TCC/EA read requests due to IO traffic. One 64-byte request is counted as two and one 128-byte as four. - * - TCC_EA0_WRREQ_WRITE_DRAM_32B - - Number of 32-byte TCC/EA write requests due to DRAM traffic. One 64-byte request is counted as two. + * - TCC_EA0_WRREQ_WRITE_DRAM_32B + - Number of 32-byte TCC/EA write requests due to DRAM traffic. One 64-byte request is counted as two. - * - TCC_EA0_WRREQ_ATOMIC_DRAM_32B - - Number of 32-byte TCC/EA atomic requests due to DRAM traffic. One 64-byte request is counted as two. + * - TCC_EA0_WRREQ_ATOMIC_DRAM_32B + - Number of 32-byte TCC/EA atomic requests due to DRAM traffic. One 64-byte request is counted as two. - * - TCC_EA0_WRREQ_WRITE_GMI_32B - - Number of 32-byte TCC/EA write requests due to GMI traffic. One 64-byte request is counted as two. + * - TCC_EA0_WRREQ_WRITE_GMI_32B + - Number of 32-byte TCC/EA write requests due to GMI traffic. One 64-byte request is counted as two. - * - TCC_EA0_WRREQ_ATOMIC_GMI_32B - - Number of 32-byte TCC/EA atomic requests due to GMI traffic. One 64-byte request is counted as two. + * - TCC_EA0_WRREQ_ATOMIC_GMI_32B + - Number of 32-byte TCC/EA atomic requests due to GMI traffic. One 64-byte request is counted as two. - * - TCC_EA0_WRREQ_WRITE_IO_32B - - Number of 32-byte TCC/EA write requests due to IO traffic. 
One 64-byte request is counted as two. + * - TCC_EA0_WRREQ_WRITE_IO_32B + - Number of 32-byte TCC/EA write requests due to IO traffic. One 64-byte request is counted as two. - * - TCC_EA0_WRREQ_ATOMIC_IO_32B - - Number of 32-byte TCC/EA atomic requests due to IO traffic. One 64-byte request is counted as two. + * - TCC_EA0_WRREQ_ATOMIC_IO_32B + - Number of 32-byte TCC/EA atomic requests due to IO traffic. One 64-byte request is counted as two. From 8cc17e307cfa67fb40ec104b7c0130558c0a7f68 Mon Sep 17 00:00:00 2001 From: srawat <120587655+SwRaw@users.noreply.github.com> Date: Tue, 26 Aug 2025 18:22:35 +0530 Subject: [PATCH 25/58] review comments --- docs/conceptual/gpu-arch.md | 3 ++- ...ers.rst => mi350-performance-counters.rst} | 20 +++++++++---------- 2 files changed, 12 insertions(+), 11 deletions(-) rename docs/conceptual/gpu-arch/{mi355-performance-counters.rst => mi350-performance-counters.rst} (98%) diff --git a/docs/conceptual/gpu-arch.md b/docs/conceptual/gpu-arch.md index e60c6a653..d2f790578 100644 --- a/docs/conceptual/gpu-arch.md +++ b/docs/conceptual/gpu-arch.md @@ -21,7 +21,8 @@ architecture. 
* [AMD Instinct™ MI300 microarchitecture](./gpu-arch/mi300.md) * [AMD Instinct MI300/CDNA3 ISA](https://www.amd.com/content/dam/amd/en/documents/instinct-tech-docs/instruction-set-architectures/amd-instinct-mi300-cdna3-instruction-set-architecture.pdf) * [White paper](https://www.amd.com/content/dam/amd/en/documents/instinct-tech-docs/white-papers/amd-cdna-3-white-paper.pdf) -* [Performance counters](./gpu-arch/mi300-mi200-performance-counters.rst) +* [MI300 performance counters](./gpu-arch/mi300-mi200-performance-counters.rst) +* [MI350 performance counters](./gpu-arch/mi350-performance-counters.rst) ::: :::{grid-item-card} diff --git a/docs/conceptual/gpu-arch/mi355-performance-counters.rst b/docs/conceptual/gpu-arch/mi350-performance-counters.rst similarity index 98% rename from docs/conceptual/gpu-arch/mi355-performance-counters.rst rename to docs/conceptual/gpu-arch/mi350-performance-counters.rst index 861ce5641..d42eaa0e6 100644 --- a/docs/conceptual/gpu-arch/mi355-performance-counters.rst +++ b/docs/conceptual/gpu-arch/mi350-performance-counters.rst @@ -2,9 +2,9 @@ :description: MI355 series performance counters and metrics :keywords: MI355, MI355X, MI3XX -********************************************** -MI350 and MI355 series performance counters -********************************************** +*********************************** +MI350 series performance counters +*********************************** This topic lists and describes the hardware performance counters and derived metrics available on the AMD Instinct MI350 and MI355 series GPUs. These counters are available for profiling using `ROCprofiler-SDK `_ and `ROCm Compute Profiler `_. @@ -13,7 +13,7 @@ The following sections list the performance counters based on the IP blocks. Command processor packet processor counters (CPC) ================================================== -.. list-table:: CPC counters +.. 
list-table:: :header-rows: 1 * - Hardware counter @@ -73,7 +73,7 @@ Command processor packet processor counters (CPC) Shader pipe interpolators (SPI) counters ========================================= -.. list-table:: SPI counters +.. list-table:: :header-rows: 1 * - Hardware counter @@ -271,7 +271,7 @@ Shader pipe interpolators (SPI) counters Compute unit (SQ) counters =========================== -.. list-table:: SQ counters +.. list-table:: :header-rows: 1 * - Hardware counter @@ -349,7 +349,7 @@ Compute unit (SQ) counters Texture addressing (TA) unit counters ====================================== -.. list-table:: TA counters +.. list-table:: :header-rows: 1 * - Hardware counter @@ -364,7 +364,7 @@ Texture addressing (TA) unit counters Texture data (TD) unit counters ================================ -.. list-table:: TD counters +.. list-table:: :header-rows: 1 * - Hardware counter @@ -379,7 +379,7 @@ Texture data (TD) unit counters Texture cache per pipe (TCP) counters ====================================== -.. list-table:: TCP counters +.. list-table:: :header-rows: 1 * - Hardware counter @@ -460,7 +460,7 @@ Texture cache per pipe (TCP) counters Texture cache per channel (TCC) counters ========================================= -.. list-table:: TCC counters +.. 
list-table:: :header-rows: 1 * - Hardware counter From ea8ff1b17dcce8b23aba6c645cd89a2da6fc4cfd Mon Sep 17 00:00:00 2001 From: Pratik Basyal Date: Tue, 26 Aug 2025 16:34:27 -0400 Subject: [PATCH 26/58] UCC and UCX version and release notes update for 7.0.0 (#521) * Indentation and formatting updated * UCC and UCX version udpated * ROCm bandwidth test update * MI350 series info added * Changelog update * ROCm systems Profiler highlight updated * Redundant removed, pulled out from HIP changelog * Known issues to Compute profiler added * ONNX compatibility updtaed * ROCm COmpute Profiler highlight added * RN update * ROCm 700 stack image updated * ROCM Compute and System highlight updated * Deep learning frameworks added * removed BF16 support for MIGraphX -- already in 6.4 release notes; removed FP4 MIGraphX support * ROCm Compute profiler highlight updated * Formatting update * AI framework update * ROCm Systems Profiler udpate * removed mention of CentOS of CentOS * ROCm Compute Profiler update * Feedback changes * leo's feedback incorporated * ampersand * Changelog synced * Changelog synced * RHEL 10 removed * Rocky Linux updated --------- Co-authored-by: spolifroni-amd --- CHANGELOG.md | 81 ++++---- RELEASE.md | 196 ++++++++++-------- .../compatibility-matrix-historical-6.0.csv | 9 +- docs/compatibility/compatibility-matrix.rst | 13 +- docs/data/rocm-software-stack-7_0_0.jpg | Bin 0 -> 358711 bytes docs/reference/gpu-arch-specs.rst | 35 +++- docs/sphinx/_toc.yml.in | 2 +- docs/what-is-rocm.rst | 2 +- 8 files changed, 189 insertions(+), 149 deletions(-) create mode 100644 docs/data/rocm-software-stack-7_0_0.jpg diff --git a/CHANGELOG.md b/CHANGELOG.md index 682a3401b..40d86ebba 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -21,7 +21,7 @@ for a complete overview of this release. * Default command: - A default view has been added. 
The default view provides a snapshot of commonly requested information such as bdf, current partition mode, version information, and more. Users can access that information by simply typing `amd-smi` with no additional commands or arguments. Users may also obtain this information through laternate output formats such as json or csv by using the default command with the respective output format: `amd-smi default --json` or `amd-smi default --csv`. + A default view has been added. The default view provides a snapshot of commonly requested information such as bdf, current partition mode, version information, and more. Users can access that information by simply typing `amd-smi` with no additional commands or arguments. Users may also obtain this information through alternate output formats such as json or csv by using the default command with the respective output format: `amd-smi default --json` or `amd-smi default --csv`. * Support for GPU metrics 1.8: - Added new fields for `amdsmi_gpu_xcp_metrics_t` including: @@ -30,7 +30,7 @@ for a complete overview of this release. - Per XCP metrics `gfx_below_host_limit_thm_acc[XCP][MAX_XCC]` - GFX Clock Host limit Thermal (TVIOL) violation counts - Per XCP metrics `gfx_low_utilization_acc[XCP][MAX_XCC]` - violation counts for how did low utilization caused the GPU to be below application clocks. - Per XCP metrics `gfx_below_host_limit_total_acc[XCP][MAX_XCC]`- violation counts for how long GPU was held below application clocks any limiter (see above new violation metrics). - - Increased available JPEG engines to 40. Current ASICs may not support all 40. These are indicated as `UINT16_MAX` or `N/A` in CLI. + - Increased available JPEG engines to 40. Current ASICs might not support all 40. These are indicated as `UINT16_MAX` or `N/A` in CLI. * Bad page threshold count. - Added `amdsmi_get_gpu_bad_page_threshold` to Python API and CLI; root/sudo permissions required to display the count. 
@@ -99,32 +99,32 @@ for a complete overview of this release. #### Removed -- Removed unnecessary API, `amdsmi_free_name_value_pairs()` +- Unnecessary API, `amdsmi_free_name_value_pairs()` - This API is only used internally to free up memory from the Python interface and does not need to be exposed to the user. -- Removed unused definitions: +- Unused definitions: - `AMDSMI_MAX_NAME`, `AMDSMI_256_LENGTH`, `AMDSMI_MAX_DATE_LENGTH`, `MAX_AMDSMI_NAME_LENGTH`, `AMDSMI_LIB_VERSION_YEAR`, `AMDSMI_DEFAULT_VARIANT`, `AMDSMI_MAX_NUM_POWER_PROFILES`, `AMDSMI_MAX_DRIVER_VERSION_LENGTH`. -- Removed unused member `year` in struct `amdsmi_version_t`. +- Unused member `year` in struct `amdsmi_version_t`. -- Removed `amdsmi_io_link_type_t` and replaced with `amdsmi_link_type_t`. +- `amdsmi_io_link_type_t`, which is replaced with `amdsmi_link_type_t`. - `amdsmi_io_link_type_t` is no longer needed as `amdsmi_link_type_t` is sufficient. - `amdsmi_link_type_t` enum has changed. - This change will also affect `amdsmi_link_metrics_t`, where the link_type field changes from `amdsmi_io_link_type_t` to `amdsmi_link_type_t`. -- Removed `amdsmi_get_power_info_v2()`. +- `amdsmi_get_power_info_v2()`. - The ``amdsmi_get_power_info()`` has been unified and the v2 function is no longer needed or used. -- Removed `AMDSMI_EVT_NOTIF_RING_HANG` event notification type in `amdsmi_evt_notification_type_t`. +- `AMDSMI_EVT_NOTIF_RING_HANG` event notification type in `amdsmi_evt_notification_type_t`. - The `amdsmi_get_gpu_vram_info` now provides vendor names as a string. - `amdsmi_vram_vendor_type_t` enum structure is removed. - `amdsmi_vram_info_t` member named `amdsmi_vram_vendor_type_t` is changed to a character string. - `amdsmi_get_gpu_vram_info` now no longer requires decoding the vendor name as an enum. -- Removed backwards compatibility for `amdsmi_get_gpu_metrics_info()`'s,`jpeg_activity`and `vcn_activity` fields. Alternatively use `xcp_stats.jpeg_busy` or `xcp_stats.vcn_busy`.
+- Backwards compatibility for `amdsmi_get_gpu_metrics_info()`'s `jpeg_activity` and `vcn_activity` fields. Alternatively, use `xcp_stats.jpeg_busy` or `xcp_stats.vcn_busy`. - Backwards compatibility is removed for `jpeg_activity` and `vcn_activity` fields, if the `jpeg_busy` or `vcn_busy` field is available. - Providing both `vcn_activity`/`jpeg_activity` and XCP (partition) stats `vcn_busy`/`jpeg_busy` caused confusion about which field to use. By removing backward compatibility, it is easier to identify the relevant field. - The `jpeg_busy` field increased in size (for supported ASICs), making backward compatibility unable to fully copy the structure into `jpeg_activity`. @@ -203,7 +203,7 @@ See the full [AMD SMI changelog](https://github.com/ROCm/amdsmi/blob/release/roc - `num_threads` Total number of threads in the group. The legacy API size is alias. - `__reduce_add_sync`, `__reduce_min_sync`, and `__reduce_max_sync` functions added for aritimetic reduction across lanes of a warp, and `__reduce_and_sync`, `__reduce_or_sync`, and `__reduce_xor_sync` functions added for logical reduction. For details, see [Warp cross-lane functions](https://rocm.docs.amd.com/projects/HIP/en/latest/how-to/hip_cpp_language_extensions.html#warp-cross-lane-functions). -* New support for Open Compute Project (OCP) floating-point `FP4`/`FP6`/`FP8` as the following. For details, see [Low precision floating point document](https://rocm.docs.amd.com/projects/HIP/en/latest/reference/low_fp_types.html). +* New support for Open Compute Project (OCP) floating-point `FP4`/`FP6`/`FP8` as follows. For details, see [Low precision floating point document](https://rocm.docs.amd.com/projects/HIP/en/latest/reference/low_fp_types.html). - Data types for `FP4`/`FP6`/`FP8`. - HIP APIs for `FP4`/`FP6`/`FP8`, which are compatible with corresponding CUDA APIs. - HIP Extensions APIs for microscaling formats, which are supported on AMD GPUs. @@ -220,7 +220,7 @@ functions added for logical reduction.
For details, see [Warp cross-lane functio #### Changed * Some unsupported GPUs such as gfx9, gfx8 and gfx7 are deprecated on Microsoft Windows. -* Removal of Beta warnings in HIP Graph APIs +* Removal of beta warnings in HIP Graph APIs All Beta warnings in usage of HIP Graph APIs are removed, they are now officially and fully supported. * Behavior changes - `hipGetLastError` now returns the error code which is the last actual error caught in the current thread during the application execution. @@ -421,7 +421,7 @@ HIP runtime has the following functional improvements which improves runtime per #### Added -* Added a new cmake option, `BUILD_OFFLOAD_COMPRESS`. When hipCUB is built with this option enabled, the `--offload-compress` switch is passed to the compiler. This causes the compiler to compress the binary that it generates. Compression can be useful in cases where you are compiling for a large number of targets, since this often results in a large binary. Without compression, in some cases, the generated binary may become so large symbols are placed out of range, resulting in linking errors. The new `BUILD_OFFLOAD_COMPRESS` option is set to `ON` by default. +* Added a new cmake option, `BUILD_OFFLOAD_COMPRESS`. When hipCUB is built with this option enabled, the `--offload-compress` switch is passed to the compiler. This causes the compiler to compress the binary that it generates. Compression can be useful in cases where you are compiling for a large number of targets, since this often results in a large binary. Without compression, in some cases, the generated binary may become so large that symbols are placed out of range, resulting in linking errors. The new `BUILD_OFFLOAD_COMPRESS` option is set to `ON` by default. 
* Added single pass operators in `agent/single_pass_scan_operators.hpp` which contains the following API: * `BlockScanRunningPrefixOp` * `ScanTileStatus` @@ -437,7 +437,7 @@ HIP runtime has the following functional improvements which improves runtime per #### Removed -* The AMD GPU targets `gfx803` and `gfx900` are no longer built by default. If you would like to build for these architectures, please specify them explicitly in the `AMDGPU_TARGETS` cmake option. +* The AMD GPU targets `gfx803` and `gfx900` are no longer built by default. If you want to build for these architectures, specify them explicitly in the `AMDGPU_TARGETS` cmake option. * Deprecated `hipcub::AsmThreadLoad` is removed, use `hipcub::ThreadLoad` instead. * Deprecated `hipcub::AsmThreadStore` is removed, use `hipcub::ThreadStore` instead. * Deprecated `BlockAdjacentDifference::FlagHeads`, `BlockAdjacentDifference::FlagTails` and `BlockAdjacentDifference::FlagHeadsAndTails` have been removed. @@ -587,7 +587,7 @@ HIP runtime has the following functional improvements which improves runtime per * Added element-wise binary operation support. * Added element-wise trinary operation support. -* Added support for new GPU target gfx950. +* Added support for GPU target gfx950. * Added dynamic unary and binary operator support for element-wise operations and permutation. * Added a CMake check for `f8` datatype availability. * Added `hiptensorDestroyOperationDescriptor` to free all resources related to the provided descriptor. @@ -629,7 +629,7 @@ HIP runtime has the following functional improvements which improves runtime per #### Added * Added the compiler `-gsplit-dwarf` option to enable the generation of separate debug information file at compile time. When used, separate debug information files are generated for host and for each offload architecture. For additional information, see [DebugFission](https://gcc.gnu.org/wiki/DebugFission). 
-* Added `llvm-flang`, AMD's next generation Fortran compiler is a re-implementation of the Fortran frontend that can be found at `llvm/llvm-project/flang` on GitHub. +* Added `llvm-flang`, AMD's next-generation Fortran compiler. It's a re-implementation of the Fortran frontend that can be found at `llvm/llvm-project/flang` on GitHub. * Added Comgr support for an in-memory virtual file system (VFS) for storing temporary files generated during intermediate compilation steps to improve performance in the device library link step. * Added compiler support of a new target-specific builtin `__builtin_amdgcn_processor_is` for late or deferred queries of the current target processor, and `__builtin_amdgcn_is_invocable` to determine the current target processor ability to invoke a particular builtin. * Added HIPIFY support for NVIDIA CUDA 12.9.1 APIs. Added support for all new device and host APIs, including FP4, FP6, and FP128, and support for the corresponding ROCm HIP equivalents. @@ -761,11 +761,11 @@ HIP runtime has the following functional improvements which improves runtime per #### Known issues -* Installation on CentOS/RedHat/SLES requires the manual installation of the `FFMPEG` & `OpenCV` dev packages. +* Installation on RHEL and SLES requires the manual installation of the `FFMPEG` and `OpenCV` dev packages. #### Upcoming changes -* Optimized audio augmentations support for VX_RPP +* Optimized audio augmentations support for VX_RPP. ### **RCCL** (2.26.6) @@ -813,7 +813,7 @@ HIP runtime has the following functional improvements which improves runtime per #### Known issues * Package installation on SLES requires manually installing `TurboJPEG`. -* Package installation on CentOS, RedHat, and SLES requires manually installing the `FFMPEG Dev` package. +* Package installation on RHEL and SLES requires manually installing the `FFMPEG Dev` package. 
#### Upcoming changes @@ -993,7 +993,7 @@ HIP runtime has the following functional improvements which improves runtime per * Individual `plugins`: The `plugins` (shared libraries) are available at: `/opt/rocm/lib/rocm_bandwidth_test/plugins/` ```{note} -Review the [README](https://github.com/ROCm/rocm_bandwidth_test/blob/release/rocm-rel-7.0/README.md) file for details about the new options and outputs. +Review the [README](https://github.com/ROCm/rocm_bandwidth_test/blob/amd-mainline/README.md) file for details about the new options and outputs. ``` #### Changed @@ -1002,7 +1002,7 @@ Review the [README](https://github.com/ROCm/rocm_bandwidth_test/blob/release/roc #### Removed -- The old CLI, parameters, and switches used. +- The old CLI, parameters, and switches. ### **ROCm Compute Profiler** (3.2.3) @@ -1051,8 +1051,6 @@ Review the [README](https://github.com/ROCm/rocm_bandwidth_test/blob/release/roc * Support for Roofline plot on CLI (single run) analysis. -* Roofline support for RHEL 10 OS. - * `FP4` and `FP6` data types have been added for roofline profiling on AMD Instinct MI350 series. ##### rocprofv3 support @@ -1121,6 +1119,8 @@ Review the [README](https://github.com/ROCm/rocm_bandwidth_test/blob/release/roc * Memory chart on ROCm Compute Profiler CLI might look corrupted if the CLI width is too narrow. +* Roofline feature is currently not functional on Azure Linux 3.0 and Debian 12. + #### Upcoming changes * ``rocprof v1/v2/v3`` interfaces will be removed in favor of the ROCprofiler-SDK interface, which directly accesses ``rocprofv3`` C++ tool. Using ``rocprof v1/v2/v3`` interfaces will trigger a deprecation warning. @@ -1166,7 +1166,7 @@ Review the [README](https://github.com/ROCm/rocm_bandwidth_test/blob/release/roc #### Removed - Removed backwards compatibility for `rsmi_dev_gpu_metrics_info_get()`'s `jpeg_activity` and `vcn_activity` fields. Alternatively use `xcp_stats.jpeg_busy` and `xcp_stats.vcn_busy`. 
- - Backwards compability is removed for `jpeg_activity` and `vcn_activity` fields, if the `jpeg_busy` or `vcn_busy` field is available. + - Backwards compatibility is removed for `jpeg_activity` and `vcn_activity` fields, if the `jpeg_busy` or `vcn_busy` field is available. - Providing both `vcn_activity`/`jpeg_activity` and XCP (partition) stats `vcn_busy`/`jpeg_busy` caused confusion for users about which field to use. By removing backward compatibility, it is easier to identify the relevant field. - The `jpeg_busy` field increased in size (for supported ASICs), making backward compatibility unable to fully copy the structure into `jpeg_activity`. @@ -1225,7 +1225,6 @@ See the full [ROCm SMI changelog](https://github.com/ROCm/rocm_smi_lib/blob/rele * Added new optimization to the backend for `device_transform` when the input and output are pointers. * Added `LoadType` to `transform_config`, which is used for the `device_transform` when the input and output are pointers. * Added `rocprim:device_transform` for n-ary transform operations API with as input `n` number of iterators inside a `rocprim::tuple`. -* Added gfx950 support. * Added `rocprim::key_value_pair::operator==`. * Added the `rocprim::unrolled_copy` thread function to copy multiple items inside a thread. * Added the `rocprim::unrolled_thread_load` function to load multiple items inside a thread using `rocprim::thread_load`. @@ -1242,12 +1241,12 @@ See the full [ROCm SMI changelog](https://github.com/ROCm/rocm_smi_lib/blob/rele #### Changed -* Changed the parameters `long_radix_bits` and `LongRadixBits` from `segmented_radix_sort` to `radix_bits` and `RadixBits` respectively. +* Changed the parameters `long_radix_bits` and `LongRadixBits` from `segmented_radix_sort` to `radix_bits` and `RadixBits`, respectively. * Marked the initialisation constructor of `rocprim::reverse_iterator` `explicit`, use `rocprim::make_reverse_iterator`. * Merged `radix_key_codec` into type_traits system. 
* Renamed `type_traits_interface.hpp` to `type_traits.hpp`, rename the original `type_traits.hpp` to `type_traits_functions.hpp`. * The default scan accumulator types for device-level scan algorithms have changed. This is a breaking change. -The previous default accumulator types could lead to situations in which unexpected overflow occured, such as when the input or inital type was smaller than the output type. This is a complete list of affected functions and how their default accumulator types are changing: +The previous default accumulator types could lead to situations in which unexpected overflow occurred, such as when the input or initial type was smaller than the output type. This is a complete list of affected functions and how their default accumulator types are changing: * `rocprim::inclusive_scan` * Previous default: `class AccType = typename std::iterator_traits::value_type>` @@ -1262,7 +1261,7 @@ The previous default accumulator types could lead to situations in which unexpec * Previous default: `class AccType = detail::input_type_t>` * Current default: `class AccType = rocprim::accumulator_t>` * Undeprecated internal `detail::raw_storage`. -* A new version of `rocprim::thread_load` and `rocprim::thread_store` replace the deprecated `rocprim::thread_load` and `rocprim::thread_store` functions. The versions avoid inline assembly where possible, and don't hinder the optimizer as much as a result. +* A new version of `rocprim::thread_load` and `rocprim::thread_store` replaces the deprecated `rocprim::thread_load` and `rocprim::thread_store` functions. The versions avoid inline assembly where possible, and don't hinder the optimizer as much as a result. * Renamed `rocprim::load_cs` to `rocprim::load_nontemporal` and `rocprim::store_cs` to `rocprim::store_nontemporal` to express the intent of these load and store methods better. * All kernels now have hidden symbol visibility. 
All symbols now have inline namespaces that include the library version, for example, `rocprim::ROCPRIM_300400_NS::symbol` instead of `rocPRIM::symbol`, letting the user link multiple libraries built with different versions of rocPRIM. @@ -1287,7 +1286,7 @@ The previous default accumulator types could lead to situations in which unexpec * `rocprim::detail::match_result_type`. Use `rocprim::invoke_result_binary_op_t` instead. * Removed the deprecated `rocprim::detail::radix_key_codec` function. Use `rocprim::radix_key_codec` instead. * Removed `rocprim/detail/radix_sort.hpp`, functionality can now be found in `rocprim/thread/radix_key_codec.hpp`. -* Removed C++14 support, only C++17 is supported. +* Removed C++14 support. Only C++17 is supported. * Due to the removal of `__AMDGCN_WAVEFRONT_SIZE` in the compiler, the following deprecated warp size-related symbols have been removed: * `rocprim::device_warp_size()` * For compile-time constants, this is replaced with `rocprim::arch::wavefront::min_size()` and `rocprim::arch::wavefront::max_size()`. Use this when allocating global or shared memory. @@ -1311,7 +1310,7 @@ The previous default accumulator types could lead to situations in which unexpec #### Known issues -* * When using `rocprim::deterministic_inclusive_scan_by_key` and `rocprim::deterministic_exclusive_scan_by_key` the intermediate values can change order on Navi3x. However, if a commutative scan operator is used then the final scan value (output array) will still always be consistent between runs. +* When using `rocprim::deterministic_inclusive_scan_by_key` and `rocprim::deterministic_exclusive_scan_by_key` the intermediate values can change order on Navi3x. However, if a commutative scan operator is used then the final scan value (output array) will still always be consistent between runs. 
### **ROCprofiler-SDK** (1.0.0) @@ -1551,7 +1550,7 @@ The previous default accumulator types could lead to situations in which unexpec #### Resolved issues -* Fixed an issue with internal calls to unqualified `distance()` which would be ambigious due to also visibile implementation through ADL. +* Fixed an issue with internal calls to unqualified `distance()` which would be ambiguous due to the visible implementation through ADL. #### Known issues @@ -1565,10 +1564,10 @@ The previous default accumulator types could lead to situations in which unexpec #### Added -* Added internal register layout transforms to support interleaved MMA layouts. -* Added support for the gfx950 target. -* Added mixed input `BF8`/`FP8` types for MMA support. -* Added fragment scheduler API objects to embed thread block cooperation properties in fragments. +* Internal register layout transforms to support interleaved MMA layouts. +* Support for the gfx950 target. +* Mixed input `BF8`/`FP8` types for MMA support. +* Fragment scheduler API objects to embed thread block cooperation properties in fragments. #### Changed @@ -1582,9 +1581,9 @@ The previous default accumulator types could lead to situations in which unexpec #### Removed -* Removed support for the gfx940 and gfx941 targets. -* Removed the rocWMMA cooperative API. -* Removed wave count template parameters from transforms APIs. +* Support for the gfx940 and gfx941 targets. +* The rocWMMA cooperative API. +* Wave count template parameters from transforms APIs. #### Optimized @@ -1611,7 +1610,7 @@ The previous default accumulator types could lead to situations in which unexpec * Handle creation and destruction APIs have been consolidated. Use `rppCreate()` for handle initialization and `rppDestroy()` for handle destruction. * The `logical_operations` function category has been renamed to `bitwise_operations`. * TurboJPEG package installation enabled for RPP Test Suite with `sudo apt-get install libturbojpeg0-dev`. 
Instructions have been updated in utilities/test_suite/README.md. -* The `swap_channels` augmentation has been changed to `channel_permute`. `channel_permute` now also accepts a new argument, `permutationTensor` (pointer to a unsigned int tensor) that provides the permutation order to swap the RGB channels of each input image in the batch in any order: +* The `swap_channels` augmentation has been changed to `channel_permute`. `channel_permute` now also accepts a new argument, `permutationTensor` (pointer to an unsigned int tensor), that provides the permutation order to swap the RGB channels of each input image in the batch in any order: `RppStatus rppt_swap_channels_host(RppPtr_t srcPtr, RpptDescPtr srcDescPtr, RppPtr_t dstPtr, RpptDescPtr dstDescPtr, rppHandle_t rppHandle);` @@ -1626,7 +1625,7 @@ The previous default accumulator types could lead to situations in which unexpec #### Resolved issues -* Test package - debian packages will install required dependencies. +* Test package - Debian packages will install required dependencies. ### **Tensile** (4.44.0) @@ -1636,7 +1635,7 @@ The previous default accumulator types could lead to situations in which unexpec - Added code object compression via bundling. - Added support for non-default HIP SDK installations on Windows. - Added master solution library documentation. -- Added compiler version dependent assembler and architecture capabilities. +- Added compiler version-dependent assembler and architecture capabilities. - Added documentation from GitHub Wiki to ROCm docs. #### Changed @@ -1659,7 +1658,7 @@ The previous default accumulator types could lead to situations in which unexpec - Fixed configure time path not being invoked at build. - Fixed find_package for msgpack to work with versions 5 and 6. -- Fixed rhel9 testing. +- Fixed RHEL 9 testing. - Fixed gfx908 builds. - Fixed the 'argument list too long' error. - Fixed version typo in 6.3 changelog. 
diff --git a/RELEASE.md b/RELEASE.md index debe0d914..fa784e6d0 100644 --- a/RELEASE.md +++ b/RELEASE.md @@ -45,9 +45,8 @@ ROCm 7.0.0 adds support for [AMD Instinct MI355X](https://www.amd.com/en/product ROCm 7.0.0 adds support for the following operating systems and kernel versions: * Ubuntu 24.04.3 (kernel: 6.8 [GA], 6.14 [HWE]) -* RHEL 10 (kernel: 6.12.0-55) * Oracle Linux 10 (kernel: 6.12.0 UEK) -* Rocky 9 (kernel: 5.14.0-570) +* Rocky Linux 9 (kernel: 5.14.0-570) ROCm 7.0.0 marks the end of support (EoS) for Ubuntu 24.04.2 (kernel: 6.8 [GA], 6.11 [HWE]) and SLES 15 SP6. @@ -65,10 +64,22 @@ All KVM-based SR-IOV supported configurations require the GIM SR-IOV driver vers ### Deep learning and AI framework updates -ROCm 7.0 introduces several newly supported versions of Deep learning and AI frameworks. For more information, see [Deep learning frameworks for ROCm](https://rocm.docs.amd.com/en/latest/how-to/deep-learning-rocm.html) and the [Compatibility +ROCm provides a comprehensive ecosystem for deep learning development. For more information, see [Deep learning frameworks for ROCm](https://rocm.docs.amd.com/en/latest/how-to/deep-learning-rocm.html) and the [Compatibility matrix](../../docs/compatibility/compatibility-matrix.rst) for the complete list of Deep learning and AI framework versions tested for compatibility with ROCm. -#### PyTorch +#### New frameworks + +AMD ROCm has officially added support for the following Deep learning and AI frameworks: + +* Ray is a unified framework for scaling AI and Python applications from your laptop to a full cluster, without changing your code. Ray consists of a core distributed runtime and a set of AI libraries for simplifying machine learning computations. It is currently supported on ROCm 6.4.1. For more information, see [Ray compatibility](https://advanced-micro-devices-rocm-internal--500.com.readthedocs.build/en/500/compatibility/ml-compatibility/ray-compatibility.html). 
+ +* llama.cpp is an open-source framework for Large Language Model (LLM) inference that runs on both central processing units (CPUs) and graphics processing units (GPUs). It is written in plain C/C++, providing a simple, dependency-free setup. It is currently supported on ROCm 6.4.0. For more information, see [llama.cpp compatibility](https://advanced-micro-devices-rocm-internal--500.com.readthedocs.build/en/500/compatibility/ml-compatibility/llama-cpp-compatibility.html). + +#### Updated framework support + +ROCm 7.0 introduces several newly supported versions of Deep learning and AI frameworks: + +##### PyTorch ROCm 7.0 enables the following PyTorch features: @@ -77,11 +88,11 @@ ROCm 7.0 enables the following PyTorch features: * Compilation of Python C++ extensions using ``amdclang++``. * Support for channels-last NHWC format for convolutions via MIOpen. -#### JAX +##### JAX ROCm 7.0 enables support for JAX 0.6.0. -#### Megatron-LM +##### Megatron-LM Megatron-LM for ROCm now supports: @@ -91,26 +102,26 @@ Megatron-LM for ROCm now supports: * Fused_bias_swiglu kernel. -#### TensorFlow +##### TensorFlow ROCm 7.0 enables support for TensorFlow 2.19.1. -#### ONNX Runtime +##### ONNX Runtime ROCm 7.0 enables support for ONNX Runtime 1.22.1. -#### vLLM +##### vLLM * Support for Open Compute Project (OCP) `FP8` data type. * `FP4` precision for Llama 3.1 405B. -#### Triton +##### Triton ROCm 7.0 enables support for Triton 3.3.0. ### Instinct Driver/ROCm packaging separation -The Instinct Driver is now distributed separately from the ROCm software stack and is stored under in its own location ``/amdgpu/`` in the package repository at [repo.radeon.com](https://repo.radeon.com/amdgpu/). The first release is designated as Instinct Driver version 30.10. See [ROCm Gets Modular: Meet the Instinct Datacenter GPU Driver](https://rocm.blogs.amd.com/ecosystems-and-partners/instinct-gpu-driver/README.html) for more information. 
+The Instinct Driver is now distributed separately from the ROCm software stack and is stored in its own location ``/amdgpu/`` in the package repository at [repo.radeon.com](https://repo.radeon.com/amdgpu/). The first release is designated as Instinct Driver version 30.10. See the [ROCm Gets Modular: Meet the Instinct Datacenter GPU Driver](https://rocm.blogs.amd.com/ecosystems-and-partners/instinct-gpu-driver/README.html) blog for more information. [AMD SMI](https://github.com/ROCm/amdsmi) continues to stay with the ROCm software stack under the ROCm organization repository. @@ -127,11 +138,11 @@ The HIP runtime now includes support for: * `constexpr` operators for `FP16` and `BF16`. * `__syncwarp` operation. * The `_sync()` version of crosslane builtins such as `shfl_sync()` are enabled by default. These can be disabled by setting the preprocessor macro `HIP_DISABLE_WARP_SYNC_BUILTINS`. -* Added warp level primitives: `__syncwarp` and reduce intrinsics (e.g. `__reduce_add_sync()`). +* Added warp level primitives: `__syncwarp` and reduce intrinsics (for example, `__reduce_add_sync()`). * Extended fine grained system memory pool. * A new attribute in HIP runtime was implemented which exposes a new device capability of how many compute dies (chiplets, xcc) are available on a given GPU. Developers can get this attribute via the API `hipDeviceGetAttribute`, to make use of the best cache locality in a kernel, and optimize the Kernel launch grid layout, for performance improvement. -In addition, the HIP runtime includes functional improvements, which improves functionality, runtime performance, and user experience. For more information, see [HIP changelog](#hip-7-0-0) below. +Additionally, the HIP runtime includes improvements to functionality, runtime performance, and the user experience. For more information, see the [HIP changelog](#hip-7-0-0) below. 
### Compiler changes and improvements @@ -152,11 +163,11 @@ Key compiler enhancements include: * Added a new target-specific builtin ``__builtin_amdgcn_is_invocable``, enabling fine-grained, per-builtin feature availability. * The compiler driver now uses parallel code generation by default when compiling using full LTO (including when using the `-fgpu-rdc` option) for HIP. This divides the optimized LLVM IR module into roughly equal partitions before instruction selection and lowering, which can help improve build times. - Each kernel in the linked LTO module can be put in a separate partition, and any non-inlined function it depends on may be copied alongside it. Thus, while parallel code generation can improve build time, it can duplicate non-inlined, non-kernel functions across multiple partitions, potentially increasing the binary size of the final object file. + Each kernel in the linked LTO module can be put in a separate partition, and any non-inlined function it depends on can be copied alongside it. Thus, while parallel code generation can improve build time, it can duplicate non-inlined, non-kernel functions across multiple partitions, potentially increasing the binary size of the final object file. * Compiler option `-flto-partitions=`: - Equivalent to the `--lto-partitions=` LLD option. Controls the number of partitions used for parallel code generation when using full LTO (including when using `-fgpu-rdc`). The number of partitions must be greater than 0, and a value of 1 disables the feature. The default value is 8. + Equivalent to the `--lto-partitions=` LLD option. Controls the number of partitions used for parallel code generation when using full LTO (including when using `-fgpu-rdc`). The number of partitions must be greater than 0, and a value of 1 turns off the feature. The default value is 8. Developers are encouraged to experiment with different numbers of partitions using the `-flto-partitions` Clang command line option. 
Recommended values are 1 to 16 partitions, with especially large projects containing many kernels potentially benefiting from up to 64 partitions. It is not recommended to use a value greater than the number of threads on the machine. Smaller projects, or those containing only a few kernels, might not benefit at all from partitioning and might even experience a slight increase in build time due to the small overhead of analyzing and partitioning the modules. @@ -169,11 +180,10 @@ Key compiler enhancements include: #### New data type support -MX-compliant data types bring microscaling support to ROCm. For more information, see the [OCP Microscaling (MX) Formats Specification](https://www.opencompute.org/documents/ocp-microscaling-formats-mx-v1-0-spec-final-pdf). The ROCm 7.0 enables functional support for MX data types `FP4`, `FP6`, and `FP8` on AMD Instinct MI350 series accelerators in these ROCm libraries: +MX-compliant data types bring microscaling support to ROCm. For more information, see the [OCP Microscaling (MX) Formats Specification](https://www.opencompute.org/documents/ocp-microscaling-formats-mx-v1-0-spec-final-pdf). ROCm 7.0 enables functional support for MX data types `FP4`, `FP6`, and `FP8` on AMD Instinct MI350 series accelerators in these ROCm libraries: * Composable Kernel (`FP4`, `FP6`, and `FP8` only) * hipBLASLt -* MIGraphX (`FP4` only) The following libraries are updated to support the Open Compute Project (OCP) floating-point `FP8` format on MI350 series accelerators instead of the NANOO `FP8` format: @@ -183,8 +193,6 @@ The following libraries are updated to support the Open Compute Project (OCP) fl * MIGraphX * rocWMMA -MIGraphX now also supports `BF16`. - For more information about data types, see [Data types and precision support](https://rocm.docs.amd.com/en/latest/reference/precision-support.html). 
#### hipBLASLt improvement @@ -193,10 +201,12 @@ GEMM performance has been improved for `FP8`, `FP16`, `BF16`, and `FP32` data ty For more information about hipBLASLt changes, see the [hipBLASLt changelog](#hipblaslt-1-0-0) below. -#### MIGraphX support +#### MIGraphX improvements * Support for OCP `FP8` on AMD Instinct MI350X and MI355X accelerators. * Support for PyTorch 2.7 via Torch-MIGraphX. +* Improved performance of Generative AI models. +* Added Microsoft Contrib Operators for an improved ONNX Runtime experience. For more information about MIGraphX changes, see the [MIGraphX changelog](migraphx-2-13-0) below. @@ -217,7 +227,7 @@ have been refined for improved usability. See the [AMD SMI changelog](#amd-smi-2 #### ROCgdb -The MX data types now support `FP4`, `FP6`, and `FP8`. +The microscaling (MX) data types now support `FP4`, `FP6`, and `FP8`. See the [ROCgdb changelog](#rocgdb-16-3) for more details. @@ -225,11 +235,14 @@ See the [ROCgdb changelog](#rocgdb-16-3) for more details. ROCm Compute Profiler includes the following key changes: -* MX data types support: `FP4`, `FP6`, and `FP8`. -* AMD Instinct MI355X and MI350X performance counters: CPC, SPI, SQ, TA/TD/TCP, and TCC. -* Enhanced roofline analysis with support for `INT8`, `INT32`, `FP8`, `FP16`, and `BF16` data types. -* Roofline distinction for `FP32` and `FP64` data types. -* Selective kernel profiling. +* Interactive command line with a Textual User Interface (TUI) has been added to analyze mode. For more details, see [TUI analysis](https://rocm.docs.amd.com/projects/rocprofiler-compute/en/amd-staging/how-to/analyze/tui.html). +* Support added for advanced data types: `FP4` and `FP6`. +* Support for AMD Instinct MI355X and MI350X with addition of performance counters: CPC, SPI, SQ, TA/TD/TCP, and TCC. +* Roofline enhancement added for AMD Instinct MI350 series. +* Improved support for Selective Kernel profiling. 
+* Program Counter (PC) sampling (Software-based) feature has been enabled for AMD Instinct MI200, MI300X, MI350X, and MI355X accelerators. This feature helps in GPU profiling to understand code execution patterns and hotspots during GPU kernel execution. For more details, see [Using PC sampling in ROCm Compute Profiler](https://rocm.docs.amd.com/projects/rocprofiler-compute/en/amd-staging/how-to/pc_sampling.html). +* Program Counter (PC) sampling (Hardware-based, Stochastic) feature has been enabled for AMD Instinct MI300X, MI350, and MI355X accelerators. +* Docker files have been added to package the application and dependencies into a single portable and executable standalone binary file. See the [ROCm Compute Profiler changelog](#rocm-compute-profiler-3-2-3) for more details. @@ -241,14 +254,14 @@ The ROCm Data Center tool (RDC) streamlines the administration of AMD GPUs in cl ROCm Systems Profiler includes the following key changes: -* Trace support for computer vision APIs: H264, H265, AV1, VP9, and JPEG. -* Trace support for computer vision engine activity. -* OpenMP for C++ language and kernel activity support. +* Improved profiling support for Computer Vision workloads through rocDecode and rocJPEG API tracing and engine activity sampling. +* Network profiling support has been added to AMD Instinct MI300X, MI350X, and MI355X. +* Improved profiling of the communication layer with RCCL and MPI API tracing. See the [ROCm Systems Profiler changelog](#rocm-systems-profiler-1-1-0) for more details. #### ROCm Validation Suite -AMD Instinct MI355X and MI350X accelerator support in the IET (Integrated Execution Test), GST (GPU Stress Test), and Babel (memory bandwidth test) modules. +In ROCm 7.0, ROCm Validation Suite includes support for the AMD Instinct MI355X and MI350X accelerators in the IET (Integrated Execution Test), GST (GPU Stress Test), and Babel (memory bandwidth test) modules. 
See the [ROCm Validation Suite changelog](#rocm-validation-suite-1-2-0) for more details. @@ -260,7 +273,7 @@ See the [ROCm Validation Suite changelog](#rocm-validation-suite-1-2-0) for more * ROCprofiler-SDK adds support for AMD Instinct MI350X and MI355X accelerators. * The stochastic and host-trap PC sampling support has been added for all AMD Instinct MI300 and MI350 series accelerators, which provides information particularly useful for understanding stalls during kernel execution. -* The added support for tracing events surfaced by AMD's Kernel Fusion Driver (KFD) captures low level driver routines involved in mapping, invalidation, and migration of data between CPU and GPU memories. Such events are central to the support for [Unified Memory](https://rocm.docs.amd.com/projects/HIP/en/latest/how-to/hip_runtime_api/memory_management/unified_memory.html) on AMD systems. Tracing of KFD events helps to detect performance problems arising from excessive data migration. +* The added support for tracing events surfaced by AMD's Kernel Fusion Driver (KFD) captures low-level driver routines involved in mapping, invalidation, and migration of data between CPU and GPU memories. Such events are central to the support for [Unified Memory](https://rocm.docs.amd.com/projects/HIP/en/latest/how-to/hip_runtime_api/memory_management/unified_memory.html) on AMD systems. Tracing of KFD events helps to detect performance problems arising from excessive data migration. * New APIs are added for profiling applications using thread traces (beta) which facilitates profiling wavefronts at the instruction timing level. @@ -282,8 +295,8 @@ See the [ROCprofiler-SDK changelog](#rocprofiler-sdk-1-0-0) for more details. The ROCm Offline Installer Creator 7.0.0 includes the following features and improvements: -* Added support for RHEL 10.0, Oracle 10.0, and Rocky 9.6. -* Added support for the new graphics repo structure for graphics/mesa related packages. 
+* Added support for Oracle 10.0, and Rocky Linux 9.6. +* Added support for the new graphics repo structure for graphics/Mesa related packages. * Improvements to kernel header version matching for AMDGPU driver installation. * Added support for creating an offline installer when the kernel version of the target operating system differs from the operating system of the host creating the installer (for Ubuntu 22.04 and 24.04 only). @@ -293,7 +306,7 @@ See [ROCm Offline Installer Creator](https://rocm.docs.amd.com/projects/install- The ROCm Runfile Installer 7.0.0 adds the following features and improvements: -* Added support for RHEL 10.0, Oracle 10.0, and Rocky 9.6. +* Added support for Oracle 10.0, and Rocky Linux 9.6. * Added `untar` mode for the `.run` file to allow extraction of ROCm to a given directory, similar to a normal tarball. * Added an RVS test script. * Fixes to the rocm-examples test script. @@ -372,7 +385,7 @@ Click {fab}`github` to go to the component's source code on GitHub. MIOpen 3.4.0 ⇒ 3.5.0 - + MIVisionX @@ -425,17 +438,17 @@ Click {fab}`github` to go to the component's source code on GitHub. Math hipBLAS 2.4.0 ⇒ 3.0.0 - + hipBLASLt 0.12.1 ⇒ 1.0.0 - + hipFFT 1.0.18 ⇒ 1.0.20 - + hipfort @@ -445,7 +458,7 @@ Click {fab}`github` to go to the component's source code on GitHub. hipRAND 2.12.0 ⇒ 3.0.0 - + hipSOLVER @@ -455,12 +468,12 @@ Click {fab}`github` to go to the component's source code on GitHub. hipSPARSE 3.2.0 ⇒ 4.0.1 - + hipSPARSELt 0.2.3 ⇒ 0.2.4 - + rocALUTION @@ -470,17 +483,17 @@ Click {fab}`github` to go to the component's source code on GitHub. rocBLAS 4.4.1 ⇒ 5.0.0 - + rocFFT 1.0.32 ⇒ 1.0.34 - + rocRAND 3.3.0 ⇒ 4.0.0 - + rocSOLVER @@ -490,7 +503,7 @@ Click {fab}`github` to go to the component's source code on GitHub. rocSPARSE 3.4.0 ⇒ 4.0.2 - + rocWMMA @@ -500,7 +513,7 @@ Click {fab}`github` to go to the component's source code on GitHub. 
Tensile 4.43.0 ⇒ 4.44.0 - + @@ -509,7 +522,7 @@ Click {fab}`github` to go to the component's source code on GitHub. Primitives hipCUB 3.4.0 ⇒ 4.0.0 - + hipTensor @@ -519,12 +532,12 @@ Click {fab}`github` to go to the component's source code on GitHub. rocPRIM 3.4.1 ⇒ 4.0.0 - + rocThrust 3.3.0 ⇒ 4.0.0 - + @@ -684,7 +697,7 @@ For a historical overview of ROCm component updates, see the {doc}`ROCm consolid * Default command: - A default view has been added. The default view provides a snapshot of commonly requested information such as bdf, current partition mode, version information, and more. Users can access that information by simply typing `amd-smi` with no additional commands or arguments. Users may also obtain this information through laternate output formats such as json or csv by using the default command with the respective output format: `amd-smi default --json` or `amd-smi default --csv`. + A default view has been added. The default view provides a snapshot of commonly requested information such as bdf, current partition mode, version information, and more. Users can access that information by simply typing `amd-smi` with no additional commands or arguments. Users may also obtain this information through alternate output formats such as json or csv by using the default command with the respective output format: `amd-smi default --json` or `amd-smi default --csv`. * Support for GPU metrics 1.8: - Added new fields for `amdsmi_gpu_xcp_metrics_t` including: @@ -693,7 +706,7 @@ For a historical overview of ROCm component updates, see the {doc}`ROCm consolid - Per XCP metrics `gfx_below_host_limit_thm_acc[XCP][MAX_XCC]` - GFX Clock Host limit Thermal (TVIOL) violation counts - Per XCP metrics `gfx_low_utilization_acc[XCP][MAX_XCC]` - violation counts for how did low utilization caused the GPU to be below application clocks. 
- Per XCP metrics `gfx_below_host_limit_total_acc[XCP][MAX_XCC]`- violation counts for how long GPU was held below application clocks any limiter (see above new violation metrics). - - Increased available JPEG engines to 40. Current ASICs may not support all 40. These are indicated as `UINT16_MAX` or `N/A` in CLI. + - Increased available JPEG engines to 40. Current ASICs might not support all 40. These are indicated as `UINT16_MAX` or `N/A` in CLI. * Bad page threshold count. - Added `amdsmi_get_gpu_bad_page_threshold` to Python API and CLI; root/sudo permissions required to display the count. @@ -762,32 +775,32 @@ For a historical overview of ROCm component updates, see the {doc}`ROCm consolid #### Removed -- Removed unnecessary API, `amdsmi_free_name_value_pairs()` +- Unnecessary API, `amdsmi_free_name_value_pairs()` - This API is only used internally to free up memory from the Python interface and does not need to be exposed to the user. -- Removed unused definitions: +- Unused definitions: - `AMDSMI_MAX_NAME`, `AMDSMI_256_LENGTH`, `AMDSMI_MAX_DATE_LENGTH`, `MAX_AMDSMI_NAME_LENGTH`, `AMDSMI_LIB_VERSION_YEAR`, `AMDSMI_DEFAULT_VARIANT`, `AMDSMI_MAX_NUM_POWER_PROFILES`, `AMDSMI_MAX_DRIVER_VERSION_LENGTH`. -- Removed unused member `year` in struct `amdsmi_version_t`. +- Unused member `year` in struct `amdsmi_version_t`. -- Removed `amdsmi_io_link_type_t` and replaced with `amdsmi_link_type_t`. +- `amdsmi_io_link_type_t` and replaced with `amdsmi_link_type_t`. - `amdsmi_io_link_type_t` is no longer needed as `amdsmi_link_type_t` is sufficient. - `amdsmi_link_type_t` enum has changed. - This change will also affect `amdsmi_link_metrics_t`, where the link_type field changes from `amdsmi_io_link_type_t` to `amdsmi_link_type_t`. -- Removed `amdsmi_get_power_info_v2()`. +- `amdsmi_get_power_info_v2()`. - The ``amdsmi_get_power_info()`` has been unified and the v2 function is no longer needed or used. 
-- Removed `AMDSMI_EVT_NOTIF_RING_HANG` event notification type in `amdsmi_evt_notification_type_t`. +- `AMDSMI_EVT_NOTIF_RING_HANG` event notification type in `amdsmi_evt_notification_type_t`. - The `amdsmi_get_gpu_vram_info` now provides vendor names as a string. - `amdsmi_vram_vendor_type_t` enum structure is removed. - `amdsmi_vram_info_t` member named `amdsmi_vram_vendor_type_t` is changed to a character string. - `amdsmi_get_gpu_vram_info` now no longer requires decoding the vendor name as an enum. -- Removed backwards compatibility for `amdsmi_get_gpu_metrics_info()`'s,`jpeg_activity`and `vcn_activity` fields. Alternatively use `xcp_stats.jpeg_busy` or `xcp_stats.vcn_busy`. +- Backwards compatibility for `amdsmi_get_gpu_metrics_info()`'s,`jpeg_activity`and `vcn_activity` fields. Alternatively use `xcp_stats.jpeg_busy` or `xcp_stats.vcn_busy`. - Backwards compatibility is removed for `jpeg_activity` and `vcn_activity` fields, if the `jpeg_busy` or `vcn_busy` field is available. - Providing both `vcn_activity`/`jpeg_activity` and XCP (partition) stats `vcn_busy`/`jpeg_busy` caused confusion about which field to use. By removing backward compatibility, it is easier to identify the relevant field. - The `jpeg_busy` field increased in size (for supported ASICs), making backward compatibility unable to fully copy the structure into `jpeg_activity`. @@ -866,7 +879,7 @@ See the full [AMD SMI changelog](https://github.com/ROCm/amdsmi/blob/release/roc - `num_threads` Total number of threads in the group. The legacy API size is alias. - `__reduce_add_sync`, `__reduce_min_sync`, and `__reduce_max_sync` functions added for aritimetic reduction across lanes of a warp, and `__reduce_and_sync`, `__reduce_or_sync`, and `__reduce_xor_sync` functions added for logical reduction. For details, see [Warp cross-lane functions](https://rocm.docs.amd.com/projects/HIP/en/latest/how-to/hip_cpp_language_extensions.html#warp-cross-lane-functions). 
-* New support for Open Compute Project (OCP) floating-point `FP4`/`FP6`/`FP8` as the following. For details, see [Low precision floating point document](https://rocm.docs.amd.com/projects/HIP/en/latest/reference/low_fp_types.html). +* New support for Open Compute Project (OCP) floating-point `FP4`/`FP6`/`FP8` as follows. For details, see [Low precision floating point document](https://rocm.docs.amd.com/projects/HIP/en/latest/reference/low_fp_types.html). - Data types for `FP4`/`FP6`/`FP8`. - HIP APIs for `FP4`/`FP6`/`FP8`, which are compatible with corresponding CUDA APIs. - HIP Extensions APIs for microscaling formats, which are supported on AMD GPUs. @@ -883,7 +896,7 @@ functions added for logical reduction. For details, see [Warp cross-lane functio #### Changed * Some unsupported GPUs such as gfx9, gfx8 and gfx7 are deprecated on Microsoft Windows. -* Removal of Beta warnings in HIP Graph APIs +* Removal of beta warnings in HIP Graph APIs All Beta warnings in usage of HIP Graph APIs are removed, they are now officially and fully supported. * Behavior changes - `hipGetLastError` now returns the error code which is the last actual error caught in the current thread during the application execution. @@ -1084,7 +1097,7 @@ HIP runtime has the following functional improvements which improves runtime per #### Added -* Added a new cmake option, `BUILD_OFFLOAD_COMPRESS`. When hipCUB is built with this option enabled, the `--offload-compress` switch is passed to the compiler. This causes the compiler to compress the binary that it generates. Compression can be useful in cases where you are compiling for a large number of targets, since this often results in a large binary. Without compression, in some cases, the generated binary may become so large symbols are placed out of range, resulting in linking errors. The new `BUILD_OFFLOAD_COMPRESS` option is set to `ON` by default. +* Added a new cmake option, `BUILD_OFFLOAD_COMPRESS`. 
When hipCUB is built with this option enabled, the `--offload-compress` switch is passed to the compiler. This causes the compiler to compress the binary that it generates. Compression can be useful in cases where you are compiling for a large number of targets, since this often results in a large binary. Without compression, in some cases, the generated binary may become so large that symbols are placed out of range, resulting in linking errors. The new `BUILD_OFFLOAD_COMPRESS` option is set to `ON` by default. * Added single pass operators in `agent/single_pass_scan_operators.hpp` which contains the following API: * `BlockScanRunningPrefixOp` * `ScanTileStatus` @@ -1100,7 +1113,7 @@ HIP runtime has the following functional improvements which improves runtime per #### Removed -* The AMD GPU targets `gfx803` and `gfx900` are no longer built by default. If you would like to build for these architectures, please specify them explicitly in the `AMDGPU_TARGETS` cmake option. +* The AMD GPU targets `gfx803` and `gfx900` are no longer built by default. If you want to build for these architectures, specify them explicitly in the `AMDGPU_TARGETS` cmake option. * Deprecated `hipcub::AsmThreadLoad` is removed, use `hipcub::ThreadLoad` instead. * Deprecated `hipcub::AsmThreadStore` is removed, use `hipcub::ThreadStore` instead. * Deprecated `BlockAdjacentDifference::FlagHeads`, `BlockAdjacentDifference::FlagTails` and `BlockAdjacentDifference::FlagHeadsAndTails` have been removed. @@ -1250,7 +1263,7 @@ HIP runtime has the following functional improvements which improves runtime per * Added element-wise binary operation support. * Added element-wise trinary operation support. -* Added support for new GPU target gfx950. +* Added support for GPU target gfx950. * Added dynamic unary and binary operator support for element-wise operations and permutation. * Added a CMake check for `f8` datatype availability. 
* Added `hiptensorDestroyOperationDescriptor` to free all resources related to the provided descriptor. @@ -1292,7 +1305,7 @@ HIP runtime has the following functional improvements which improves runtime per #### Added * Added the compiler `-gsplit-dwarf` option to enable the generation of separate debug information file at compile time. When used, separate debug information files are generated for host and for each offload architecture. For additional information, see [DebugFission](https://gcc.gnu.org/wiki/DebugFission). -* Added `llvm-flang`, AMD's next generation Fortran compiler is a re-implementation of the Fortran frontend that can be found at `llvm/llvm-project/flang` on GitHub. +* Added `llvm-flang`, AMD's next-generation Fortran compiler. It's a re-implementation of the Fortran frontend that can be found at `llvm/llvm-project/flang` on GitHub. * Added Comgr support for an in-memory virtual file system (VFS) for storing temporary files generated during intermediate compilation steps to improve performance in the device library link step. * Added compiler support of a new target-specific builtin `__builtin_amdgcn_processor_is` for late or deferred queries of the current target processor, and `__builtin_amdgcn_is_invocable` to determine the current target processor ability to invoke a particular builtin. * Added HIPIFY support for NVIDIA CUDA 12.9.1 APIs. Added support for all new device and host APIs, including FP4, FP6, and FP128, and support for the corresponding ROCm HIP equivalents. @@ -1424,11 +1437,11 @@ HIP runtime has the following functional improvements which improves runtime per #### Known issues -* Installation on CentOS/RedHat/SLES requires the manual installation of the `FFMPEG` & `OpenCV` dev packages. +* Installation on RHEL and SLES requires the manual installation of the `FFMPEG` and `OpenCV` dev packages. #### Upcoming changes -* Optimized audio augmentations support for VX_RPP +* Optimized audio augmentations support for VX_RPP. 
### **RCCL** (2.26.6) @@ -1476,7 +1489,7 @@ HIP runtime has the following functional improvements which improves runtime per #### Known issues * Package installation on SLES requires manually installing `TurboJPEG`. -* Package installation on CentOS, RedHat, and SLES requires manually installing the `FFMPEG Dev` package. +* Package installation on RHEL and SLES requires manually installing the `FFMPEG Dev` package. #### Upcoming changes @@ -1656,7 +1669,7 @@ HIP runtime has the following functional improvements which improves runtime per * Individual `plugins`: The `plugins` (shared libraries) are available at: `/opt/rocm/lib/rocm_bandwidth_test/plugins/` ```{note} -Review the [README](https://github.com/ROCm/rocm_bandwidth_test/blob/release/rocm-rel-7.0/README.md) file for details about the new options and outputs. +Review the [README](https://github.com/ROCm/rocm_bandwidth_test/blob/amd-mainline/README.md) file for details about the new options and outputs. ``` #### Changed @@ -1665,7 +1678,7 @@ Review the [README](https://github.com/ROCm/rocm_bandwidth_test/blob/release/roc #### Removed -- The old CLI, parameters, and switches used. +- The old CLI, parameters, and switches. ### **ROCm Compute Profiler** (3.2.3) @@ -1714,8 +1727,6 @@ Review the [README](https://github.com/ROCm/rocm_bandwidth_test/blob/release/roc * Support for Roofline plot on CLI (single run) analysis. -* Roofline support for RHEL 10 OS. - * `FP4` and `FP6` data types have been added for roofline profiling on AMD Instinct MI350 series. ##### rocprofv3 support @@ -1784,6 +1795,8 @@ Review the [README](https://github.com/ROCm/rocm_bandwidth_test/blob/release/roc * Memory chart on ROCm Compute Profiler CLI might look corrupted if the CLI width is too narrow. +* Roofline feature is currently not functional on Azure Linux 3.0 and Debian 12. 
+ #### Upcoming changes * ``rocprof v1/v2/v3`` interfaces will be removed in favor of the ROCprofiler-SDK interface, which directly accesses ``rocprofv3`` C++ tool. Using ``rocprof v1/v2/v3`` interfaces will trigger a deprecation warning. @@ -1829,7 +1842,7 @@ Review the [README](https://github.com/ROCm/rocm_bandwidth_test/blob/release/roc #### Removed - Removed backwards compatibility for `rsmi_dev_gpu_metrics_info_get()`'s `jpeg_activity` and `vcn_activity` fields. Alternatively use `xcp_stats.jpeg_busy` and `xcp_stats.vcn_busy`. - - Backwards compability is removed for `jpeg_activity` and `vcn_activity` fields, if the `jpeg_busy` or `vcn_busy` field is available. + - Backwards compatibility is removed for `jpeg_activity` and `vcn_activity` fields, if the `jpeg_busy` or `vcn_busy` field is available. - Providing both `vcn_activity`/`jpeg_activity` and XCP (partition) stats `vcn_busy`/`jpeg_busy` caused confusion for users about which field to use. By removing backward compatibility, it is easier to identify the relevant field. - The `jpeg_busy` field increased in size (for supported ASICs), making backward compatibility unable to fully copy the structure into `jpeg_activity`. @@ -1888,7 +1901,6 @@ See the full [ROCm SMI changelog](https://github.com/ROCm/rocm_smi_lib/blob/rele * Added new optimization to the backend for `device_transform` when the input and output are pointers. * Added `LoadType` to `transform_config`, which is used for the `device_transform` when the input and output are pointers. * Added `rocprim:device_transform` for n-ary transform operations API with as input `n` number of iterators inside a `rocprim::tuple`. -* Added gfx950 support. * Added `rocprim::key_value_pair::operator==`. * Added the `rocprim::unrolled_copy` thread function to copy multiple items inside a thread. * Added the `rocprim::unrolled_thread_load` function to load multiple items inside a thread using `rocprim::thread_load`. 
@@ -1905,12 +1917,12 @@ See the full [ROCm SMI changelog](https://github.com/ROCm/rocm_smi_lib/blob/rele #### Changed -* Changed the parameters `long_radix_bits` and `LongRadixBits` from `segmented_radix_sort` to `radix_bits` and `RadixBits` respectively. +* Changed the parameters `long_radix_bits` and `LongRadixBits` from `segmented_radix_sort` to `radix_bits` and `RadixBits`, respectively. * Marked the initialisation constructor of `rocprim::reverse_iterator` `explicit`, use `rocprim::make_reverse_iterator`. * Merged `radix_key_codec` into type_traits system. * Renamed `type_traits_interface.hpp` to `type_traits.hpp`, rename the original `type_traits.hpp` to `type_traits_functions.hpp`. * The default scan accumulator types for device-level scan algorithms have changed. This is a breaking change. -The previous default accumulator types could lead to situations in which unexpected overflow occured, such as when the input or inital type was smaller than the output type. This is a complete list of affected functions and how their default accumulator types are changing: +The previous default accumulator types could lead to situations in which unexpected overflow occurred, such as when the input or initial type was smaller than the output type. This is a complete list of affected functions and how their default accumulator types are changing: * `rocprim::inclusive_scan` * Previous default: `class AccType = typename std::iterator_traits::value_type>` @@ -1925,7 +1937,7 @@ The previous default accumulator types could lead to situations in which unexpec * Previous default: `class AccType = detail::input_type_t>` * Current default: `class AccType = rocprim::accumulator_t>` * Undeprecated internal `detail::raw_storage`. -* A new version of `rocprim::thread_load` and `rocprim::thread_store` replace the deprecated `rocprim::thread_load` and `rocprim::thread_store` functions. The versions avoid inline assembly where possible, and don't hinder the optimizer as much as a result. 
+* A new version of `rocprim::thread_load` and `rocprim::thread_store` replaces the deprecated `rocprim::thread_load` and `rocprim::thread_store` functions. The versions avoid inline assembly where possible, and don't hinder the optimizer as much as a result. * Renamed `rocprim::load_cs` to `rocprim::load_nontemporal` and `rocprim::store_cs` to `rocprim::store_nontemporal` to express the intent of these load and store methods better. * All kernels now have hidden symbol visibility. All symbols now have inline namespaces that include the library version, for example, `rocprim::ROCPRIM_300400_NS::symbol` instead of `rocPRIM::symbol`, letting the user link multiple libraries built with different versions of rocPRIM. @@ -1950,7 +1962,7 @@ The previous default accumulator types could lead to situations in which unexpec * `rocprim::detail::match_result_type`. Use `rocprim::invoke_result_binary_op_t` instead. * Removed the deprecated `rocprim::detail::radix_key_codec` function. Use `rocprim::radix_key_codec` instead. * Removed `rocprim/detail/radix_sort.hpp`, functionality can now be found in `rocprim/thread/radix_key_codec.hpp`. -* Removed C++14 support, only C++17 is supported. +* Removed C++14 support. Only C++17 is supported. * Due to the removal of `__AMDGCN_WAVEFRONT_SIZE` in the compiler, the following deprecated warp size-related symbols have been removed: * `rocprim::device_warp_size()` * For compile-time constants, this is replaced with `rocprim::arch::wavefront::min_size()` and `rocprim::arch::wavefront::max_size()`. Use this when allocating global or shared memory. @@ -1974,7 +1986,7 @@ The previous default accumulator types could lead to situations in which unexpec #### Known issues -* * When using `rocprim::deterministic_inclusive_scan_by_key` and `rocprim::deterministic_exclusive_scan_by_key` the intermediate values can change order on Navi3x. 
However, if a commutative scan operator is used then the final scan value (output array) will still always be consistent between runs. +* When using `rocprim::deterministic_inclusive_scan_by_key` and `rocprim::deterministic_exclusive_scan_by_key` the intermediate values can change order on Navi3x. However, if a commutative scan operator is used then the final scan value (output array) will still always be consistent between runs. ### **ROCprofiler-SDK** (1.0.0) @@ -2214,7 +2226,7 @@ The previous default accumulator types could lead to situations in which unexpec #### Resolved issues -* Fixed an issue with internal calls to unqualified `distance()` which would be ambigious due to also visibile implementation through ADL. +* Fixed an issue with internal calls to unqualified `distance()` which would be ambiguous due to the visible implementation through ADL. #### Known issues @@ -2228,10 +2240,10 @@ The previous default accumulator types could lead to situations in which unexpec #### Added -* Added internal register layout transforms to support interleaved MMA layouts. -* Added support for the gfx950 target. -* Added mixed input `BF8`/`FP8` types for MMA support. -* Added fragment scheduler API objects to embed thread block cooperation properties in fragments. +* Internal register layout transforms to support interleaved MMA layouts. +* Support for the gfx950 target. +* Mixed input `BF8`/`FP8` types for MMA support. +* Fragment scheduler API objects to embed thread block cooperation properties in fragments. #### Changed @@ -2245,9 +2257,9 @@ The previous default accumulator types could lead to situations in which unexpec #### Removed -* Removed support for the gfx940 and gfx941 targets. -* Removed the rocWMMA cooperative API. -* Removed wave count template parameters from transforms APIs. +* Support for the gfx940 and gfx941 targets. +* The rocWMMA cooperative API. +* Wave count template parameters from transforms APIs. 
#### Optimized @@ -2274,7 +2286,7 @@ The previous default accumulator types could lead to situations in which unexpec * Handle creation and destruction APIs have been consolidated. Use `rppCreate()` for handle initialization and `rppDestroy()` for handle destruction. * The `logical_operations` function category has been renamed to `bitwise_operations`. * TurboJPEG package installation enabled for RPP Test Suite with `sudo apt-get install libturbojpeg0-dev`. Instructions have been updated in utilities/test_suite/README.md. -* The `swap_channels` augmentation has been changed to `channel_permute`. `channel_permute` now also accepts a new argument, `permutationTensor` (pointer to a unsigned int tensor) that provides the permutation order to swap the RGB channels of each input image in the batch in any order: +* The `swap_channels` augmentation has been changed to `channel_permute`. `channel_permute` now also accepts a new argument, `permutationTensor` (pointer to an unsigned int tensor), that provides the permutation order to swap the RGB channels of each input image in the batch in any order: `RppStatus rppt_swap_channels_host(RppPtr_t srcPtr, RpptDescPtr srcDescPtr, RppPtr_t dstPtr, RpptDescPtr dstDescPtr, rppHandle_t rppHandle);` @@ -2289,7 +2301,7 @@ The previous default accumulator types could lead to situations in which unexpec #### Resolved issues -* Test package - debian packages will install required dependencies. +* Test package - Debian packages will install required dependencies. ### **Tensile** (4.44.0) @@ -2299,7 +2311,7 @@ The previous default accumulator types could lead to situations in which unexpec - Added code object compression via bundling. - Added support for non-default HIP SDK installations on Windows. - Added master solution library documentation. -- Added compiler version dependent assembler and architecture capabilities. +- Added compiler version-dependent assembler and architecture capabilities. 
- Added documentation from GitHub Wiki to ROCm docs. #### Changed @@ -2322,7 +2334,7 @@ The previous default accumulator types could lead to situations in which unexpec - Fixed configure time path not being invoked at build. - Fixed find_package for msgpack to work with versions 5 and 6. -- Fixed rhel9 testing. +- Fixed RHEL 9 testing. - Fixed gfx908 builds. - Fixed the 'argument list too long' error. - Fixed version typo in 6.3 changelog. @@ -2340,7 +2352,7 @@ individual components, review the [Detailed component changes](#detailed-compone ### Failure when using a generic target with compression and vice versa -An issue where compilation for generic target with compression failing has been resolved in this release. This issue resulted in you being unable to compile for a generic target and use compression simultaneously. See [GitHub issue #4602](https://github.com/ROCm/ROCm/issues/4602). +An issue where compiling a generic target with compression failed has been resolved in this release. This issue prevented you from compiling a generic target and using compression simultaneously. See [GitHub issue #4602](https://github.com/ROCm/ROCm/issues/4602). ### Limited support for Sparse API and Pallas functionality in JAX @@ -2395,7 +2407,7 @@ and `__AMDGCN_WAVEFRONT_SIZE__` macros are deprecated and will be disabled in a ### Changes to ROCm Object Tooling -ROCm Object Tooling tools ``roc-obj-ls``, ``roc-obj-extract``, and ``roc-obj`` are +ROCm Object Tooling tools ``roc-obj-ls``, ``roc-obj-extract``, and ``roc-obj`` were deprecated in ROCm 6.4, and will be removed in a future release.
Functionality has been added to the ``llvm-objdump --offloading`` tool option to extract all clang-offload-bundles into individual code objects found within the objects diff --git a/docs/compatibility/compatibility-matrix-historical-6.0.csv b/docs/compatibility/compatibility-matrix-historical-6.0.csv index 7d3e9d040..b54e8025d 100644 --- a/docs/compatibility/compatibility-matrix-historical-6.0.csv +++ b/docs/compatibility/compatibility-matrix-historical-6.0.csv @@ -2,7 +2,6 @@ ROCm Version,7.0.0,6.4.3,6.4.2,6.4.1,6.4.0,6.3.3,6.3.2,6.3.1,6.3.0,6.2.4,6.2.2,6 :ref:`Operating systems & kernels `,Ubuntu 24.04.3,Ubuntu 24.04.2,Ubuntu 24.04.2,Ubuntu 24.04.2,Ubuntu 24.04.2,Ubuntu 24.04.2,Ubuntu 24.04.2,Ubuntu 24.04.2,Ubuntu 24.04.2,"Ubuntu 24.04.1, 24.04","Ubuntu 24.04.1, 24.04","Ubuntu 24.04.1, 24.04",Ubuntu 24.04,,,,,, ,Ubuntu 22.04.5,Ubuntu 22.04.5,Ubuntu 22.04.5,Ubuntu 22.04.5,Ubuntu 22.04.5,Ubuntu 22.04.5,Ubuntu 22.04.5,Ubuntu 22.04.5,Ubuntu 22.04.5,"Ubuntu 22.04.5, 22.04.4","Ubuntu 22.04.5, 22.04.4","Ubuntu 22.04.5, 22.04.4","Ubuntu 22.04.5, 22.04.4","Ubuntu 22.04.5, 22.04.4, 22.04.3","Ubuntu 22.04.4, 22.04.3","Ubuntu 22.04.4, 22.04.3","Ubuntu 22.04.4, 22.04.3","Ubuntu 22.04.4, 22.04.3, 22.04.2","Ubuntu 22.04.4, 22.04.3, 22.04.2" ,,,,,,,,,,,,,,"Ubuntu 20.04.6, 20.04.5","Ubuntu 20.04.6, 20.04.5","Ubuntu 20.04.6, 20.04.5","Ubuntu 20.04.6, 20.04.5","Ubuntu 20.04.6, 20.04.5","Ubuntu 20.04.6, 20.04.5" -,RHEL 10,,,,,,,,,,,,,,,,,, ,"RHEL 9.6, 9.4","RHEL 9.6, 9.4","RHEL 9.6, 9.4","RHEL 9.6, 9.5, 9.4","RHEL 9.5, 9.4","RHEL 9.5, 9.4","RHEL 9.5, 9.4","RHEL 9.5, 9.4","RHEL 9.5, 9.4","RHEL 9.4, 9.3","RHEL 9.4, 9.3","RHEL 9.4, 9.3","RHEL 9.4, 9.3","RHEL 9.4, 9.3, 9.2","RHEL 9.4, 9.3, 9.2","RHEL 9.4, 9.3, 9.2","RHEL 9.4, 9.3, 9.2","RHEL 9.3, 9.2","RHEL 9.3, 9.2" ,RHEL 8.10,RHEL 8.10,RHEL 8.10,RHEL 8.10,RHEL 8.10,RHEL 8.10,RHEL 8.10,RHEL 8.10,RHEL 8.10,"RHEL 8.10, 8.9","RHEL 8.10, 8.9","RHEL 8.10, 8.9","RHEL 8.10, 8.9","RHEL 8.9, 8.8","RHEL 8.9, 8.8","RHEL 8.9, 8.8","RHEL 8.9, 
8.8","RHEL 8.9, 8.8","RHEL 8.9, 8.8" ,SLES 15 SP7,"SLES 15 SP7, SP6","SLES 15 SP7, SP6",SLES 15 SP6,SLES 15 SP6,"SLES 15 SP6, SP5","SLES 15 SP6, SP5","SLES 15 SP6, SP5","SLES 15 SP6, SP5","SLES 15 SP6, SP5","SLES 15 SP6, SP5","SLES 15 SP6, SP5","SLES 15 SP6, SP5","SLES 15 SP5, SP4","SLES 15 SP5, SP4","SLES 15 SP5, SP4","SLES 15 SP5, SP4","SLES 15 SP5, SP4","SLES 15 SP5, SP4" @@ -10,7 +9,7 @@ ROCm Version,7.0.0,6.4.3,6.4.2,6.4.1,6.4.0,6.3.3,6.3.2,6.3.1,6.3.0,6.2.4,6.2.2,6 ,"Oracle Linux 10, 9, 8 [#ol-700-mi300x-past-60]_","Oracle Linux 9, 8 [#mi300x-past-60]_","Oracle Linux 9, 8 [#mi300x-past-60]_","Oracle Linux 9, 8 [#mi300x-past-60]_","Oracle Linux 9, 8 [#mi300x-past-60]_",Oracle Linux 8.10 [#mi300x-past-60]_,Oracle Linux 8.10 [#mi300x-past-60]_,Oracle Linux 8.10 [#mi300x-past-60]_,Oracle Linux 8.10 [#mi300x-past-60]_,Oracle Linux 8.9 [#mi300x-past-60]_,Oracle Linux 8.9 [#mi300x-past-60]_,Oracle Linux 8.9 [#mi300x-past-60]_,Oracle Linux 8.9 [#mi300x-past-60]_,Oracle Linux 8.9 [#mi300x-past-60]_,Oracle Linux 8.9 [#mi300x-past-60]_,Oracle Linux 8.9 [#mi300x-past-60]_,,, ,Debian 12,Debian 12 [#single-node-past-60]_,Debian 12 [#single-node-past-60]_,Debian 12 [#single-node-past-60]_,Debian 12 [#single-node-past-60]_,Debian 12 [#single-node-past-60]_,Debian 12 [#single-node-past-60]_,Debian 12 [#single-node-past-60]_,,,,,,,,,,, ,Azure Linux 3.0 [#az-mi300x-past-60]_,Azure Linux 3.0 [#az-mi300x-past-60]_,Azure Linux 3.0 [#az-mi300x-past-60]_,Azure Linux 3.0 [#az-mi300x-past-60]_,Azure Linux 3.0 [#az-mi300x-past-60]_,Azure Linux 3.0 [#az-mi300x-630-past-60]_,Azure Linux 3.0 [#az-mi300x-630-past-60]_,,,,,,,,,,,, -,Rocky 9,,,,,,,,,,,,,,,,,, +,Rocky Linux 9,,,,,,,,,,,,,,,,,, ,.. 
_architecture-support-compatibility-matrix-past-60:,,,,,,,,,,,,,,,,,, :doc:`Architecture `,CDNA4,,,,,,,,,,,,,,,,,, ,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3 @@ -39,12 +38,12 @@ ROCm Version,7.0.0,6.4.3,6.4.2,6.4.1,6.4.0,6.3.3,6.3.2,6.3.1,6.3.0,6.2.4,6.2.2,6 :doc:`DGL <../compatibility/ml-compatibility/dgl-compatibility>` [#dgl_compat]_,N/A,N/A,N/A,N/A,2.4.0,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A, :doc:`Megablocks <../compatibility/ml-compatibility/megablocks-compatibility>`,N/A,N/A,N/A,N/A,N/A,0.7.0,0.7.0,0.7.0,0.7.0,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A, :doc:`Taichi <../compatibility/ml-compatibility/taichi-compatibility>` [#taichi_compat]_,N/A,N/A,N/A,N/A,N/A,N/A,1.8.0b1,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A, - `ONNX Runtime `_,1.22.1,1.20.0,1.20.0,1.20.0,1.20.0,1.17.3,1.17.3,1.17.3,1.17.3,1.17.3,1.17.3,1.17.3,1.17.3,1.17.3,1.17.3,1.17.3,1.17.3,1.14.1,1.14.1 + `ONNX Runtime `_,1.22.0,1.20.0,1.20.0,1.20.0,1.20.0,1.17.3,1.17.3,1.17.3,1.17.3,1.17.3,1.17.3,1.17.3,1.17.3,1.17.3,1.17.3,1.17.3,1.17.3,1.14.1,1.14.1 ,,,,,,,,,,,,,,,,,,, ,,,,,,,,,,,,,,,,,,, THIRD PARTY COMMS,.. _thirdpartycomms-support-compatibility-matrix-past-60:,,,,,,,,,,,,,,,,,, - `UCC `_,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.2.0,>=1.2.0 - `UCX `_,>=1.15.0,>=1.15.0,>=1.15.0,>=1.15.0,>=1.15.0,>=1.15.0,>=1.15.0,>=1.15.0,>=1.15.0,>=1.15.0,>=1.15.0,>=1.15.0,>=1.15.0,>=1.14.1,>=1.14.1,>=1.14.1,>=1.14.1,>=1.14.1,>=1.14.1 + `UCC `_,>=1.4.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.2.0,>=1.2.0 + `UCX `_,>=1.17.0,>=1.15.0,>=1.15.0,>=1.15.0,>=1.15.0,>=1.15.0,>=1.15.0,>=1.15.0,>=1.15.0,>=1.15.0,>=1.15.0,>=1.15.0,>=1.15.0,>=1.14.1,>=1.14.1,>=1.14.1,>=1.14.1,>=1.14.1,>=1.14.1 ,,,,,,,,,,,,,,,,,,, THIRD PARTY ALGORITHM,.. 
_thirdpartyalgorithm-support-compatibility-matrix-past-60:,,,,,,,,,,,,,,,,,, Thrust,2.6.0,2.5.0,2.5.0,2.5.0,2.5.0,2.3.2,2.3.2,2.3.2,2.3.2,2.2.0,2.2.0,2.2.0,2.2.0,2.1.0,2.1.0,2.1.0,2.1.0,2.0.1,2.0.1 diff --git a/docs/compatibility/compatibility-matrix.rst b/docs/compatibility/compatibility-matrix.rst index 6af5e9d19..5f0ace4ea 100644 --- a/docs/compatibility/compatibility-matrix.rst +++ b/docs/compatibility/compatibility-matrix.rst @@ -28,14 +28,13 @@ compatibility and system requirements. :ref:`Operating systems & kernels `,Ubuntu 24.04.3,Ubuntu 24.04.2,Ubuntu 24.04.2 ,Ubuntu 22.04.5,Ubuntu 22.04.5,Ubuntu 22.04.5 - ,RHEL 10,, ,"RHEL 9.6, 9.4","RHEL 9.6, 9.4","RHEL 9.5, 9.4" ,RHEL 8.10,RHEL 8.10,RHEL 8.10 ,SLES 15 SP7,"SLES 15 SP7, SP6","SLES 15 SP6, SP5" ,"Oracle Linux 10, 9, 8 [#ol-700-mi300x]_","Oracle Linux 9, 8 [#ol-mi300x]_",Oracle Linux 8.10 [#ol-mi300x]_ ,Debian 12,Debian 12 [#single-node]_, ,Azure Linux 3.0 [#az-mi300x]_,Azure Linux 3.0 [#az-mi300x]_, - ,Rocky 9,, + ,Rocky Linux 9,, ,.. _architecture-support-compatibility-matrix:,, :doc:`Architecture `,CDNA4,, ,CDNA3,CDNA3,CDNA3 @@ -64,11 +63,11 @@ compatibility and system requirements. :doc:`DGL <../compatibility/ml-compatibility/dgl-compatibility>` [#dgl_compat]_,N/A,N/A,N/A :doc:`Megablocks <../compatibility/ml-compatibility/megablocks-compatibility>`,N/A,N/A,0.7.0 :doc:`Taichi <../compatibility/ml-compatibility/taichi-compatibility>` [#taichi_compat]_,N/A,N/A,N/A - `ONNX Runtime `_,1.22.1,1.20.0,1.17.3 + `ONNX Runtime `_,1.22.0,1.20.0,1.17.3 ,,, THIRD PARTY COMMS,.. _thirdpartycomms-support-compatibility-matrix:,, - `UCC `_,>=1.3.0,>=1.3.0,>=1.3.0 - `UCX `_,>=1.15.0,>=1.15.0,>=1.15.0 + `UCC `_,>=1.4.0,>=1.3.0,>=1.3.0 + `UCX `_,>=1.17.0,>=1.15.0,>=1.15.0 ,,, THIRD PARTY ALGORITHM,.. 
_thirdpartyalgorithm-support-compatibility-matrix:,, Thrust,2.6.0,2.5.0,2.3.2 @@ -188,8 +187,6 @@ Use this lookup table to confirm which operating system and kernel versions are ,, `Ubuntu `_, 22.04.5, "5.15 [GA], 6.8 [HWE]", 2.35 ,, - `Red Hat Enterprise Linux (RHEL 10) `_, 10, 6.12.0-55, 2.39 - ,, `Red Hat Enterprise Linux (RHEL 9) `_, 9.6, 5.14.0-570, 2.34 ,9.5, 5.14+, 2.34 ,9.4, 5.14.0-427, 2.34 @@ -200,7 +197,7 @@ Use this lookup table to confirm which operating system and kernel versions are ,15 SP6, "6.5.0+, 6.4.0", 2.38 ,15 SP5, 5.14.21, 2.31 ,, - `Rocky `_, 9, 5.14.0-570, 2.34 + `Rocky Linux `_, 9, 5.14.0-570, 2.34 ,, `Oracle Linux `_, 10, 6.12.0 (UEK), 2.39 ,9, 6.12.0 (UEK), 2.34 diff --git a/docs/data/rocm-software-stack-7_0_0.jpg b/docs/data/rocm-software-stack-7_0_0.jpg new file mode 100644 index 0000000000000000000000000000000000000000..dd15df471ee8430aff37e6a269c50da4afd68c8f GIT binary patch literal 358711 [binary JPEG data omitted]
zE$^6Yr+RHV?sNxLZ(u|7t^kma(B6CPWQ-g|uW!J54&%enqBsjGL5PU>d4d01JiqmHXp|=<$ z3#8eCsOnLcxxiM1*25*AYGI>WBl{(>T=*NR<)NDs9L?ZVfg2y{Dx>95>YRmmx z^dV!aH+hJ~4ZupEeNsuLt*>TLtYP{#E|1EgRkXR-tkP^9^p|r|Y>I6oc)-UbI|c>w z?%u#-qopX<=fgnCwuX$C;>ei^m&`iqt2pRYMitU-3p1JHMuHmVv|Wo(|W6*X6*qdDiP_k%$9`B(S3RzQl|-bUm>+O%Ip z-VB4vazc|bn2Zp&bzbN`zcH4&7w~wKg#N%#j@H+t?KMC+ojQPmRkn1Zspbu(f*{v! zH1vpiO%W7g)k{k)$1$dN_rhRd=d=#OA)L>mlXh=Id#3@ZSky5lh14?$-Hr`qJe6@w z=nWVcHw1hMv*` zII}9e9!6>!?8D3n$33f4G~2GJFhBrvU76KPX?@7d z(S`t?xPG(`qq#jymX5XzawkII1}-tcDji^hWH1D-`BJVLdaV}#`$CW@z)3Dh(Am*Z z7mPkhCk@WQ>;l*)N~Rpo6dMFj1(o$zhx160V zvWDp^Jws|{(ny0C1}kKM5O#ywmo2<;4{US%-p69?d&#&h8CnsUePt%I*<>`y zHZVTMs6K7&Iv|&4UUH)>4yA%-+-c<4tVQ_<(AZ_XYr0Tx7lMSh!=^}GZzB;%*%?wB z!B#P2Mk!FWoM;Lgbw|tPQ0uS9AtP5E5LcEZcEbSyT~NC|FwQDos9QhZupr@Zflxg?_v6V8#HdLd#9r0$6G3^2&V&J0`Mm~za9(je0uHdR4nsT>SJ)WaQ6kc^=y5Skz&hExXcCyCMVdSkiCJq6ZDIo;X| zH=izyRHysAGE-P?`hyxPWtEy=s04hfSbKytQq>rh91$hpNaa%9f z8^KAFz3H^kx}fFFd~U&It53ZVk6=i>uDjsOaV)bLkVH@H3Lvkl43F9dBr8Dm-W|Y> zM|6>TtoNYCR=KgHM|}waQNC=kan|ialB^JW*@EOl#OtFYY$Uun8A16SbLv+@gG@qdrJqs<1knk?>4v;guUKg;BPc)6%?ipD8k$1Q^gDyBeJ< z{Vq*v?QT)P!K57qN{5rEQJrXYC#7~?OVwOtnF=n-Q%h#6W<``D5i=0i=~B7N9vvTU zymujVJXNIbT8efl2#)neAxJ13xc17SDzgLfB*2kw#3<2FCZHlZREkesZ>IQ@T{K85 z5@$j1&c4rI4wt>D3SiZlxQt|ctu3ZvZ)f51l-%wRL}BA8uoyBB2{$Bd1X6uG$*rfw zP|X*EW+ap?$cDG}XkGO8yCX^*#z+B}VmX4;x zWH?x8#-`YdoUl@-NEn9t#0>JXGiin^ak%A=ZpCJ$!)(DTVDmSkn{Jj%ChBn$Kjaez z8`$n<4o-8%U9CEc3S14C5!S@fCXVVw3j{<582}pTAOO}|q?7wXND@aI4!N$`bu@sI zenU^tqPMYmL=v?Mq+J57-xvZ0r53s&(9NnYM&f26kC(fc6ABLJ5<4-m0W*(}s7SFO z%Q+eBJY$&mJ2QM13XYI_Q&{i2%yV^Q8RBi?qp?_bKRwu z_^T|>l3~-bO3AlLwp7HUwwXH`qE8EK?bt?8fu#2dV+kfx)7=qbfK+#I%^}iV7(^Li zXu5!X;O91r?bit7(gv|W**i$e(b+{kkY=?#__Fo+g7YSBJG862=yJk@RH`Sl zj6AF%?umT`jynYnk~jQ78#+3aRZ7t*p$5$ykR#g45HNoMV6bsIbxSfi<@?)eY9I;k zHxq$o(QbuUAwN8_O?NovaIGDy>=5L?RST{hIgYa%S$@CXWHhWN(sY0=Eq6Ipm#W#W z7LaUjC9RKVopE==MfxyLGrv|>@X>;`=IS(J+IH>FGuB*Gh7Yo+STb>zD_k{t;84yz zB*H6TzGU65H>Hd2pMMkJ-`*Vwe2xPe6RaD;KCpiw>5 
zfI@fU4uA*I5ga!dM`J2zvYJ~5qX`0|+jK{6VJB&@QWU7I7wrw|u!iFta>&LX;KqI` z`of4qrq*mC>S=ElLit!1-MyX%HW~Lp@&GIL!OaIxEi=Wn$fW6vk*yF5u1wAhcYHiX z*9yq+XF{1ptDu{eeJsK-bUcHd02W7$#Xu@D)p5p!7Gg7mUAHLT%m zzu|}^%q!~1hr86@65&b=>v^>Gz1Z3y5#=StvO5$VxiKc0Njvj$$l9(>Zgv`ov@dq; z07HkXRROX#S`05^D)r)0IHp{JL^EiiL{lm)+SXb+V9|#YyA}u^m-GNBgwaT2>Y(6z zQ*RBtUD@IwEiT1n3-G0FuQR|#GOSD)f7u?{rX8CIj>3*&4k(w66lOj#yK9(@`B5JO z0R9NY;ysl}au{r-S%>O$q?X(22w}{|7+qwe$Rklj1J`_J$8(v{3y?NC@wV!MM0*TX zdl6C;Nd?(zMz-!76c2n3n}&@f!IBQ_t>1Aozz=U!iG=DmGp;)bE`=@xRz8klnN)_3Hc zF{)y4xr*K!dmhPCfEXMh({8atS;kFG#-@-Z?AIXktqTxks)wx_M+I0O2pM_KyxGk? z9XE#JK#wgMHi5Gsuzh)pp=+|Nu+4(Aq$ z6xwd2qwEeuDFh@U419)97lpSy^>;U4ozHoga!H64$e_xOu?d`%hyW_Or;%U<+=2kP zN~g+uH(0n7+>QIG=C<%LLCdf05+rCrAVF)iOczp89P_>&2B`d6j1pw5&mezOEev-t zYx@(1?FiLYqL-#0Yo)U{x_fCdcjIH$J=Ixhtmfr{Ui+BYEyG2)gM?`q_?WjKu-yje z!WoPz4ZPPoUcihCYeVzVFm*I=kqGLL-5Ebw^#VEA9}lTLnAURYcGA7OQlcJE*;^1n z;75kyD6^gJ1|0xJ07If_Kr&n1Qthf8QpVWIXAzPuPZP*dU*$PqQF%wmv`MB4h^dnM zUMa;hNn=6KZJh)iQBw%RQS_W?d1vBB{%TOx`@UXu0Q`_`z%>#T#o-wk_X<;65Z|#t>m!LRH#ne=G&Mz8J=oT&}Ze4bpESc$Vt+wyo@;E(vI0 zfxu7E$g&_<;5Emn*(g)}5%9W0fz2jDXf9ShxQ&5P@y5h4$YH=;z&5VRnE{e06iMy& z@(k?_W?Uvyn%08bo!%14NNqSBTSwOn%9kSOM}fBi6gy5q&Vd4l;D$WV|H)>xpV-)9 zsgAh?8kUVt(3HD_3{j|Y0R++$YBz@#c4&hCrwjxw4^t)+vI4271qfpf8HOXGqOEvr ztagT63;eK^G}1ZF8ju*GgBxy`YxN5^A+~Y2Axmdlv+TF#xdq{)XVue}0@`Mg5z#RtG({uwD0#Sxg$i)Dp-`TO#nLK7=0+1mmua68DE68G!HL#eXCLa{%D1z@t_@37XnKi_TIg zKi(U45sxgpwF|P~+HI678tN7@pF{j$)C{JD*q9v-MD$c(y@U-jOUzD0rNC`T>zEs= zAZW4H2YUj+L}q!^Hgf~pi7mTQTN+-{JOU<0fvW_L2PU#u;af10sguFTu(#?>d63oE zBTTlcw(Ecxg6mX0Q&)E;VMD@xcvP@J*TEA65AoSz73W^g1FV4;W1QWlTZz)FaAmng zH5P0FWb6W@b17+f8o*41#3@+sOMWIB)przxvwdo_R7^j*htOyxWQ-axm+ylc#hz&@-KZ3f8WlJ>W3Q7iT zzT4yy=FO_%o9ZjS4r`H4AQ8~@+E@B99r_dd_z1bla_y>{=t)@QBF26r=)iREYVF(Nt7t&^wOU^cQxgrow& zTkSaH_<~lc+oZ)XL>9IsAOg5NGnH+|3s3>NA&neDaWWz&13kbPIB^^a&PI{IxOf6i z!6IpeNCHYB)W}tMgAEwHMdjE?kwRx?QBf=rFdEaF==7kMMCD)v;8Kg>$R&CkRu=N8 zB@Pa_VDc~+jLgCfpfd~^NH<mtJ%|uZwz#+(i$dLv^(_%vmm)@3! 
zwR2o*75uXfytz$$m|N9RYf*CZ3exCz;{Ul41f!X0&8t> z58Yq`I7$wM#|#2Np&r;BS-ns^29ME61X6*YYLY z(jnm#36TAY90(JE46~S2Bqxwf(B%X;5-YKC*d#w3Yj+!gaU{S{s5Fvjrv$+f10#mv z6#=M2N%Z1n96JD${~)%eBO?$mIn?3Ddhi50KvPJ;L^q?jT0A_6wKCkm#@C5wi$l<; zlvryvGEpQQPrBPQg$9B4Lm`AK%Ir!zfHk z2xkj&xqLht#xg;DL97ibQ&7N) zn@Cg!&`ttYPtE{G3l>3(aKgMOiXG$8=-6_dJp(V-7?m6s#>!Jtf&gCxj-h0v5E#*s z?LdPh(d)qcLXl9l1i7%F85c4GU4j9C^ewf2i#2qg0%m2P(QJ^;5v2Srj~l_q8>0DX zjv!EM(-c6R2ABGP5)Bp|Hf{#WPB(c2G8@=@n6BC{8`=v1b~ub27cw}&u=N8?6jTbf zJTxxyB?n)?nw?BeOT(gQZbSsNFN;8GYB@BR%|1X5CZoeJVE>9_a{9og2@cyts|{Ef zf_9IPF>Cxl|K|ZBL8^uh=qR-)L?Om9MhB`F|A1^r-v z6VYix88BSNfl~w)&gS!?mq!dGkRiis15PhOQZgd8>O8fFq!R)~7ETL9 zBz!+5jTlhtOmZ}olYt6F&Q2L36CjEw(n)+dG9$WTli5m@ibOSfjC_vGt&%%^E{;}Z zb;5OSncf)^24xb9-Dxy)s6lpyhGcRIZ4|NrYi81d20qJb1wE|u2MIQ#%@bV@#X*PH z3gn<>gxZ!SB8$*M42h_;nYef=MrHT{kJ#I2kda`(K ztz07^*!e82SRB3Hdg#C$QUb8-M5G_dF-j~Pk_kcATYXZLh|gp4Elv?srw?N6DwPh4 z2b)e%Bo~JSNDB{1a8oEmwTcY}H$e4%7mN!jEWg;KXV{QDAVvWCGJ6OGmBC$JV6#9n z84M~Ufnu;ZLbMDR&&iik-2o3lC*}ZgFIA($i^X(5F9Hbm17$R(cGyUOt^nvw@lN0- zgKfd0v<41T4i~|VK+YLR3s@|CCk5`J7`%EDMdLM_{U~r%1ZrS>z#l|_b3DxG@dvS9 zi`8$58V&K3ZWGMvGn45i6_-cD;?-7})P}(uKvPQ$QVf8GTlg%D3&u_pgL6D1sOBiJ zPMU@cmMb|Nfnczx7@1}TP`LSNmirm|cwb;EYDLo{9Bq zf)269$H4+ckVt9+XECkMC4eb)z}DOhHZ(h$NDrWJaGW-JKJ>a#7@JO_mB3{H;{onO zP9%UXqu6wuQIsY_t2A6O9UG!C?M5*E#CV-hZK5G1b}A(#u?KY;su|!d9bPK9x_P5Y zE}F}M*MMt=ip4M+yef`Vhz6htn%lyUNKvK0Xoae%{q_UHCz%NWv{T^x4NwgtC^G=k z3zfAA*uYNpn;-~^#wsX5rI3!5*x5+J4~QQj5MTOvG;qF#y7_!1Od?0f83ZOF2rfHr zddRQUf>j1oM8|=H>n_z0y_%67Tsnbb7wbY?fl};`>`y|RI%H)q%uJpdC@1U;s~Ydt z^6qYvhlz@}z0TN&()1jjn6*9CJYt3+h0|Si0hNdwzC@9U2^y&E_z7?ZW*N#Ii z3KlZ!T_PFY<1oR!LPp3%RMJ^!k{KEz3)pzA3yx3o3GpI^25BbQ8EJB)6|Khrl@Oaw z76b%LBSYWe(LwFq=6Bdq5qp8gZnv5F~^4Tnf1=ug?g+jZPYj!fE0+$^p!kHN+ zGJ+)lXdR%ej|ze?z<^N2<)iQln~5UlIgCm=6_{k|q+$t7rpO>6T)+_*8cB1hBY+ne zm*-$|Y-%wR2h`qRPtXPc)E(&XZCr&kqqbA*F1io_(>vt`r&DH^tLYhDkr05Jg>sBR zCr<3J3t2mZR1;{7N*u=a#b|CbW=m1O{SZ)a98nZ?L*cAb5a~U3(lHgoWF2Tv5 zsllnvQp;h$#{|k@C&wg0k_Ao<(a0f-7-$2EVFrjEIq*~9)MA5h7|R{vXs|Svo9Wl4 
znGGVO2y6DDd=xC+Xf=b`N;OOAQK*~Vk99~DNFb%P1teAvo5b;p$rzyd#^bRmuXBF0O}AGY`F=X5&Qz&hac^C zRi;n|wKHIBzK;%UQ@~vx4Z{ct85t70i6eIk^+66H1DIYj-FlG;W|8BJW-!@Rk%OWM zrl}<@@+llZY)xiI=Q7D7AOY~TmTUa1etCtPW!dc%z*XqQFoDfMX<* zWiX60q=4A~J{e{SF)%KTg>T1GbZW8QK}ErMX%swyrN@9Vsv`O%cr1~DU~wogg-H&z z5wRAJF{-22n;E(QILRaIa0U+!CyDFd@fC zO=JV(V5$-ru{mKJdnn@h$P50A&J>9q=$lAr6O;^GBrxSR9mxiKEP2s3rOd3f0N|dL zA#q`CCIIiYP_+Rw+N37|WN?sx0H`$vOa)G{BoR1l@_^{Ac3^YROequW95xe~uA|eT zCX>-8CCj-gQPAZvf#VDtO_nG~2#SQSkp|ERJTpxnG-I49ptRP(%&MSS1I!JaRx4K& z-SxO^F4h@98%-piN`Y5FsZ_9c+B{~M7U2g~p#ly%>TbJw7jJMJQ1&86rfJ1i(Q;Bn)yS5k2C!IHt1OlXnbH3O*6uQ zj2VNAN{MkO;Q1v&RDd8j)0_gN2VltrRvJx4wqoG~mzfk`Gl?vshQj-H24HYapeG#) zic=yDNZoLS&rZ~;2x)}q=?m=&QSffyIualB0`aT}1x&$+dN0So_B-@=KgUEP2(17^ zDOO^wIyst;kjbDZvk{mlp?Cr&M#khD*uVhINaOIFwVlFad7V7AiEFpwgfgz*DF~pb z65ulkkwG=eghGN?jTXVNUZF|EcLbyyuilIH65$cWy9Au@OyJt=4_UyLN@8V1Hz}Jy zXq2czXsL+`EMF8Z;E_k=#;EZ@34lIJkTl?v1TAnQU_ynKX+xy3`BDtZ>n0fhz?lUg zL1d~ME|S0ix;_MpD!GU*B{+EjIir||u_y&WyE)*&_(W+GD4ds(p`fVD6t`bwr2rPc(99x}iZ)0Awtx3i4&->fT^AhfThHK;Eu*6W}|5U zf9nP3cPmB2)=SL*{T9;D*zO<$>{1?=OYXDi1b(;Gg|NG+W+agb6A`W7yvJ~9OfyYk z1lC6uqfM>WD-jBiA~p&kw=iT(5%7Fu@Pv3oHU%LDo&!}9B2>qA z@VpE>fMjFwQKdf+?r6w#ltqtFurNSghKISS3gEIs6@ZJ);AY@AxV(NI84nOtVx-;1 z1y(a^pJ;D*COaKfGA{ADm2?3?(e84UuE$F85ghsD|1H~A<0KbJ&34_rCr`503(9HxH5A>qg zfC_MebQqPul?v1Ve$F!ik5g>d?|(YAK+415`t$@bG>z?{rO|%CPWo%x5ZDpwSpY67 zb-OtttX>I|yUlQ!9IjJ{(ITMV5rauBW&-4w#E#LKF)SY@jYmXLEC{lcVgk)z)r#>Z zu303tBcfHtBa|kr6T|Y$R9w802;925X#xaAs5J)^3=D9J1?k z2nK~gE)naIC{;$imlOyv@mh~cYKjuH$x!%^%tNBPuza2qD}p*)UM^kaG--hQlpxV1 z(E#U#6t59jsR$9~kQ;c002?NW1%w+K8cAwI=tFk!Eu(>jwzkSnwh2JCfLjEu438!N ztp&`<`W_+$7V|{D)Eo&je@i97BlQHCR1dY7_)s5@B;{i5;A)BW1YyxLHjAn!XjEjk z48e)mfh!R(3ggh>$w#`Vnj|8)1*8%P7Z_q@!Jo+`A}gH3?Z*xMSvmHBp4tYO{|ByRli4KU{Rfv z8mug7Vj^G#=hnld5!C<&3~{7is{)nCh_GsMkr5&+LW-knm1?RmVAY%S2&u{d6$GRv zW5(~^0c$%Y;-^vx+yROmb{j8*e|uvmpM zA28Dwn7A}F%*>N8kb032pm3PLn?Wr>s|EPtLwnH#0oWXf9tRVrQjlZ@@C*~ujt&Gg z406C5Qh5co4249lfz!BBMF0~7=5v0G#$vL=^&z7Q>*Z)sz$~GbV47M-55cq{avD!B 
zaX76Gxdg9*GlV{`EFeXrFi?;;;PyI5E-Bm_z>;h_j+sGY2;DYGNMt7;83(o zZ>0)w083}28i034F&ns%kdR$bf*7<5xV8X^9a=IE7|62W8o%0zAS$79x>=(|ptaI8 zf<`Jp`}skf3{64!P+l{IPi4xuN-H&>fttA?HkK)1qM{1x3^T#0%@Am$LMz7Y$B3a$ zGQ#bYhOjC*NvBE^Neq4xR!wAus2C>{pj|u&pf%EAT{NI$XY+yMwlufJ&vpyBz;LOS zLmDtzruuLqDTw*6qA<$eyeRK=pg#{Cl%u2e{X~VEsbTW%1rO|js z2|=ZBnQ#E89W)aJFzL6c=ktpRT$e*9F&Z!uicLn#5FqHT3tr56^b$cStVv%g{`*(+~Zi*@D5f&_9e{vBU*;Hyh z#hRkM{Lf1JPYS9XHh*wP`$K7E$NI3W)SZJ*d6f9Co z1~~}KD5NNu0N`+}dV-uI^h0YIkKO_lz=5n93<;(lV*%HQ+TwmE$PxRYMgh1{1;O(C zog`p9k_m>TQDI#B4~*BgVWc>jT~8pJ^gxw>(0GB8A;KSUxE*nXivk-G*#s=0!S5e< ziA2>l6L3QUuEuPiEYeK11i!UaZIQ$@qtG8>!mzdRW}F-5u|*u&MyiLxSpHDGXlX!V z4^=WE&8P>xQcEzR2Z^*6M+ArA(SR4Qm;Q~-?-3c4f7&EMAV48~>#|yc(SE}dYk83Y z_E~{{(AyE8xxg|q0}2KL=yjjY1ll`VISqsAitwV-N3Ru1zz`T=pxuD4FS-bgtc}%1 zoLT-Jq2!oVwFJ>HE)Mhp@FNN~f1Dr4A_86zkwMn*{GsS=6nwbtj)=^G4;1_qk2OOU zAxTW5+cEaYpoPPM0h2yrSQ#M;%r-+tEAVIv`&$HPG7tgw->8CyLcznqkZ=N;OehYF zW#F3G4zrVS--7jj3`XJMU`#q4(O@tpCgl8hk&T9!uzlL8CSO@M&gbgn;!evbfekNjINRMfr#(_f{DJg_s!qXMHO(s|%znD1e~v;6xD710F$U~s5# z1Qe2h#`japep2CyppqgPOHsrVkfxhjpx?sa&K;2o4G z*+1s{=kzs3@Y0J;I|%kY8lC&w?Bj#i!K@~=8@R9Rfxv;M#RkCskm`@Q1K(dUlIq-^$U21tuaTuj^ZZ@h zx2(~=`Qi1j+T}7D3f?9IucEEBkm|C4|GUZ?+)RR2WluR{LRApiRF zujcyK?enMDznbe$4f3x)|7xy(-9CSc{j0hD)FA)*^RMRm*X{GC*nibrO@4n^72M_q z_RIj!rhdH@(+c>nrNz{#TPKDA{)zeeJf<(-2>e`lj1FfHR4P0MPhxWGJ&(NG9@8q4 z0zCO1Nn88->(!V^G2pHB|LITN-+$`;?dMzOx^?P+Sn%K9BL1g;e7z3dm7nybJuK~%$-dUbw(x4wRZ*oKW7H;JiR zuPF$vQ?CKw2PBIA)QhRppl!+O*(>SUep?`-r$>@w=%z=@FG?>>E|h{Z$IqRJEac}Pw3BJvg8V-5joQL`|f?s z2T<{P;0gQKCQTYOY|vyxq|X4;t=l9Z8W3aS+6uDHw(AKvbA?xIscaaJL}SRF@hfjP zYM%&RGt0gMK~W?J&%NwGoww?I$0Ut=!o>UwckeZB(kZzYU=JJNk46U^N@ajYAjdKQ zKE1wLJ||87c?w`PCpQmpTTr+fFue8eIIg+~c)tJO;U8H3dL^b=J-}$|x~*dd$CPnd z%TtBSG3ScKdC!Zl?lG@;Q`_+koV7c=R&Yc2cjI!@#-CTxiP@!H{Q*?WcI&nug^ z8I~Wu56P=d^-)qx$=l2S+xPLJ-TC{?`6uRuhaWt1R=g|ev3T{c)tx}tb`bVG<0K)| z9cF+3elA5b-Aj_Lb^egDWF@3b}$Wts`Q(&H@hJ9kgICps>k5qA|1Ut9Pne7}FB$bC=WrL_Hd-Jvjh zO>uhBjm5A|yt7HaWbQUCTFWlWx`2Gs9g`%z|B$mFFlRLUEa8{b9QF=QDRtj((tG#4 
zAakF2hb4DMSI;gwy>5AN`htYDuky1lc-!N+qP^k^_nBAv$mV})yKrqMSRc*Wf?(67 zS=ZKnyimD7SPJ8;^Uv7uZp1E*Pdxcrl7J# zoeH64^Pep7c6-7s-qeYCk$53dT>R<8>Fn%H3BNwdza9tcU(o+m_wauD24>Qj^WhO+ zfB9>S{WZq^FORXae~q#KhA|eqYx$O{L|Y9p9%ZxlcwaiCS&w!#JXkNwC33OF(dlap zS^r?)^Cw$Y-WRS(7r&sSXIvk1KWWH?;DtWhR=)1`Wax!{Ac#pg8*I98KIFnfusCIJ zPw#m%GFDf@ z6W7v;95s1Mdi6>t-#p9Qot*At`R)u(8Z$>3W_>P#?EhN%*C_mN7=;i>azDsvLNYXE zU^u>YbNtK(>jO8G<1NbDH7$mjj{FpR@%c4eyJFk&@?og05s$1^jU zu)8c~&KOXvS{gXS{48e7glSf4?{qy;Le0E1AT#}HYNu5LFKu}9HRdTdZ^F~+UihC6 z_RyA$yE|NYX6K3C1Ge%THL@e(LHbdaUppRV7r_jgPi%J3@&L)oAPDO@bGzZxor| z?0dpIJ)?Zs`Gd)ue!9?0ykzC8_>bp4KZ#760#shkyPJD>qFL?P7kW{SN$&F2rmG*W zrLA4IFI~KNZMx=VW#gaM9WUaQQunXS;#_a9WFC)XZ{DU;YVlfLQP=j9Yo4rH>D=Aa z_atkry3{ng=pyk&{-V&U{3YJ5LJWKC$hl-pBwfosc)YzUS9f?jv8E@t&e5Ajd*@HnCSUo8eYWpZ-gAG^)cW#A z>zeNPuUyN||1}W*8wTQk_08zQe>%okm+Fs3@62d2eFMI_{9fgyd!v<;N|%XOb6&rk zRq`hL^>X;eVJqid8*;Zk@vic!Ez@7L*jgE!K$iFiZ(V(O*{-Pw67@sGfa5j@u2a@rmLT=efjGP zO2wGjun!|vx0};>!Z7)Wyjxoizuo)%6{%wQy3axoQCOR}5x%6~mIfPkUTz9;dCMNBh zJh)%>{NOp4s%`g!*JrX`qfGZduA1|?dFjcW1t+u*x_L`BPiy&0GA`|&UX=O#g9`q+ z`}#9xjr>hXuRDi&$W~k%wagJe6d8Y}Fs|45wdb#G_;j#uon*96@c6Cg zCl4*Vl^5S;xIJe@Md_sJtbSh@+)>xdK1zmd<&9i4qj8V@X{{$MwkKYFxzkmebG4?} zPBllO^5Wyz{g3r?Nmp#5?Fvn-n?J$Y0GD>;&cX!cyZsQN`YLcWP)z-KMdkj+b*nfJ zVp11h?|Q4lja6>Vy`I9?@5^pntnbNbvth!cx|@$q)wce5!ns8mZ7qx^s=4n+PTm~f zy-S_TqZirSXHuq(Kegjm9h;V@(~>~kB*R9D1kK zv&(bAYqjrc`^Qo zO%FhKdE3XPChd5u9GjIfdVu9tpE1LGT#L|1op7uC$ibaA&TLKo8nb%Vs0QpjCLlgc zo<4T#$s@OS{AzkZdECYF=xa=?qE3%nJgE6ldFRll&$+XEe_E;?Oz-@@$)eo-^FjB$ z{+wHw0>WS*tVLcwFn(qlYk^ABCzy9c-Q1ODBC?Von@^fjP9 zZ`wEXBX*w3Ha!G=0m65A7oUo>`Y*$};g5rIu}ePSzs4Lm|2_QSn3f!!CH?av-bd*N z;~!!^!bTZtFiS$U>A%rG*8k1^hE?btzuHgr0+aNoh6m1no}0JJu_QIp-(Sj>?>8Ip z*H0(69RIvJW!9)Eo>7loDqsA#SDXE#KNWu7e|`R#h6N|`AJk@;@JlLg$EGnS3%|^o zaw|0Kdkx+nLL9qSW_xsLOvA$al^)#by!iHcW$kB=O-7BN9DPZZTyC71TibbW_D$S! 
z0d$HAbF=#7*!q>vCM`J;DbO1~$#u!GZCm5QUkf$wd**2&FNW53in8r7|IQmH9xVE@ zWX`SW!y}w8+W=TE{^j(*Mw!CL%PfzkWrG3I#wAgd#sF{r}h_JbK>Q2 z94vciZ8J&)q(vUa@8C0{^dt`uC}ez;qt~Iw;VR`tm31$LbS%#_ox*R~!7Y z!GnGs4!kc0!GQMJFG-d6=R9Z}8&f(4)u{Wu!r_kq69p$moNRNT{j6$T{r%6+^{eK5 zjfoYxkG-isF($qJ*~M|T=a)IfJH{3o-XpWt2d{yobE{1au6}0e_Uv1*?ACG8mb*@N z4B&QsJeZu6ZDY@&K6Pa0?4O@B4whi7YP$8^vz3Z7VQbd&MlNf#BX3TbBlgog z_Q1Ho2j@kWacoTKAXET%Z>8eVf-lVqjxReImph?!@~Zh5FcJlQBP@~MFW??u${P=o zQbG96(I1jVeU0f0hz{;E7S7oEv^>;*qVneXrLz5x0gb}@X`r5p;0U_YmbYK~df$aw zUWXNYI@t4oY*ux!;r{1mpw-;&vBLYX2aOL8NynXi*4_5y)9d||voCeHl*kSTJ)#^m z@zthx4{|;v&zYFwp5m#$d0Z)d)qCpB(KQ9b_wMg|_N6V6|E44vw8{3djljgKC`g~S z3XJwRlY7dW`tM1v_RUqr@A&mHdfg63q47O-(VPr8cO0zctR`_a$vGiF*3@ZA*Mkz> z{m}`BGX2jMiq1G7=U)0!`7cH;0*zYPB^eBJ^y-Xr=ucg_$M($cwe?lEtc2m4_V-rop#^2NOvQ(9XJSEMHFrcq;oUSiNcie~ z0M8k8M6a~oo^2YU#iP`=Q zwLz~Y_vl-?57WVMZ#MVsm3=J>%Y<8B#?M=E)$#6b_UN4(9d}3f)j>g}gk3E+u4gg# z#>oy=Uz+tIt36mXL{o;oy_4;K`)L~pe1B?v&OB(&jy8vrv$hn%&t-2LRElZk`oxo0 z4$UiK2&TB|#<*A2!Itb)R$U!)VFNF&?V@}iZ+A-{%=VH#1>htNzpDB~Ks!JV~PH!7dZ+~;l^T&|!b9{TYhlezK_^G&<2@(5W%m`J7DaV5!#@;D{WIy@# zyU)AP`!!6++!1?LA7(S9$FHwtD zA7m!o8Jsv~PI{PK7G?~D%+36DzaOv~5!WT%(Lc=?)jmJ`vU}Ek`X*-5T_&Vkn0;gn z>vOjqki2P&uMcX%gw=Ub><5`YZo0M?k~bOr3q&EnP@zs#LXn>A@xD)il+y1x&OWVV z21fWk<^h7>Um$gKfFcIIeIaB9I-WXrD?j||59J*C`0I5*a@wNj*WNw`6n0;~5+U2U z8njh9I!r8rluo|4=|c_p-R!3QVR3|5)eBI@&F^J2XC~c))e0NgM^4+9HGt$z{WR;t zK)_woKXphcecyGUu&YPdJqmPE;%nTa-$0Qj$ixWCqdyh{-nxCvNcql1=kM8xAQk=I zrn65$9TwEq;bpg{874@nWB2D1AI^s(@s53;uL3eHqr>dg9~=D;Xt(_vYTLE!kM(W# z;PFd9bx!%sRTn|=`Qet#q}!w4y|@8-AoKaP2uHOgHT>4HmCS~b7L00;(zRB#%XVMph9=tqtVXvX=T*J3F_a--tU9G*&d3K(--_>nI_v||} zX4YwhWA^f8&dz?V4<3<|n&;n~W)MGd*K0Pv;!A2xvg5L~)#JH$kRA3b(qksBA2{a8 zYq8HP@yqjyYLHXj^_Zd!=d!ndUd36G^P6CG@7YVbyTq&N-7PwDFuNb>LCU@DQzWm_ zw-vG$97=>DhN|o4Ku7hxntftT$Nbgk1oR4P*Dw>hVnch=N2uw3y%%~F!p=J*>YLU4 zV2hn|k1vcl*}>ZJ`H^Gm67-{nRNaezIRC_gLZ6Vjq9l7>)7+j%bnC<1?hVstE$=^< zx^~SC(a0SI@Z=8n>|JhIK0YO_(SBeDKe}2BP%(|!F_l(l! 
zVtfIfN6d89jY~~{ge&%^z3<<(;CWxI^hL{2)7B|+hfS$i(rfn1zO$Bg+|y~;pFK0ddyly*nOkD#ZoZ+K{W5s+)y_eJrsRe0^Xoc1 zENgRbTd#r`QQPov)>hhzR~MMiPG|f~>&EFWyxeZGD&ELy82?Ee&P?hbQ~9xK*XB5D zx1NyQ{DrFZle3bocXD}ic%hTpLSFe7TLJ4{pL@1P*4OtAde#peKZ@_ukp z@S^sQ@|v3rUCfL6jjYJk!h{_SP}Jd(&uVDQ7~@fR@BDp&ZlwCqaBu0mq)|g=%}dTP zBsHwV|G8C{kp#rd>Tu@tU%JhEc{+c`)vinL9vgLYX7%TStLAPET8>_N;&y6&5vyz; zZ6|!~&{&t5zq|?Uyjdt&nv>RpxV+U=eB}{T8|Fmsx!5@9LuiZduvB?$T=u3)2!Uc?Ae1zGm9bZnCgZ*4hs2JM$UMcQfmrA2iil zR3MMty`xT!YwWPZ#N#jK$C2k3Sbl_eK=!>1>L$t{ zRmoepe`eB<6^*&<^QYVs+oqgvd8D^pyg|_T%HorVf!Ryv4H1l-s_xLw{UV&UK9jLU zHo80E0r}>=gQuQ!D7b9y7u&-&ZWi>Qv*Y@-?#D-53wNpNwY)US zV_ddNi2N|UL8SQT&)4q>`-aN*oqV>{M0ma$&uVUiY@Z!Q zAI0FX$)xh?M}lL;t6OmfiJTE_~l|lQ|D`^R=kH^@6!G3nkQOj_Pox$`W%A4yyo8lnU{0yQ+IsW1Uu1; zu4^pp{p!7^vG(~u=Hek`hs^J~k{4&K?gZDxX5j|4Kd~q!K)DhadLilV%u#D-OTMDi>*>U}TRSU;9er2lesN8`&TY>k-$FTYyAD(_dIr0}GdF3L zj4Ykr>!(@Wm>0L{W^5Hz)HIzc>k7#p-sJ4!q5YO0szG%<-hJpx(YAUAcZYMCZLzyx z2P(UmCvH;oU7gf?=JPc#F87$-Z1R?&ofgykv{}9yd#5!pw@s>-+;iG=S^Z zEcDstFtg><&C)E0Y1<@20_1#f+x7mMnd1j6>3_HvzG7=apDsGK17st-1PF$p&32)Aoj7#FI7!VU6SFQ zjXbs>$2|wjM}L}*3{0cQ8n#X53zQhifaeS$y`!+?L3b+_HY^LMVE;kXKY%u zZ`)gPyWO$xCwkGPlh)F-OV8v@?{Mx?@aX!!cP=*Tb?HF7tjq0jd-CV)fp&Nl^Lfy2 zbCa{zRhJ7kAvfPXoLGFk!_bY%CHoFRMC$Z$o9>{_S5ZjDN9C<$ExX-?_XHxUxB1~& zo`Id7q^86=kFR?*^qhSVU7?l#aMkpuovXJts(P=zJ-(#+33J!<3ptB9Ymva|a_Vk=MxwYbWXEcb%3`hfeQ)>z5MB?mE3RV@qF$ zemVWxvgc#m#E*5$9+a4K*d>pb#5L$$UUI(W)H-?I@U+Ww8s6_)2g+@HB8#jRV%Bpj zwT)As9QlxSe9$ut6}M;SfU?CmI(C^`G_di}OWNMZgXpYrLu2bGE0PkuHw(KoSegi1 zyL1NvI_~uClB;z?wWisH7!*r8~D?T<27&Mse%H+-K6D*!ht)qoc0u9G5U3E)y0lmlj~mHXxrH- zbM|3Y3FlJNrZco*Ds{8nx-8M=nsEDB0oswVZSF!DWlQx=+N@vtHH=?y<@s^NmH1XW zPnnxFHWLz>&h4cxorD&knp-Drkg$VDGw56weL26^Wm-gNHY$WSB#vrcK}T( z9C&&a9Jin~i(FMz>^*A$zJ)Hd(;bOD-V!o?A^&s(u(l7l$-QDU=5?HTYhlp>u6y!U zO)4t2|MS8Ntsb;4S-)uI)IqQ=YilM@)#tW`T~AIl&zk-;bqH(o^!tP%kZf4$xl|Ez z;kNWHWzNel42nq)&rKbAcYSP~o|*U~{L7+~S|mN$SR;(nlrN#2Mop+U2j0L|LJ+*N zL-Vtx($tC>4O=E8_g~WIr-T9cW49oMQ_EXoou5wMA^)@oGGObXBPq~5qTMUQMV#r! 
z7L&5;@(;D1esDLled@97x7FE6Gj5%$pMto0M_x8WycJo!03FZTWtrjbU47)u+;M9T z-#HfRo76VTR?b&d@M4Irwm&t1Wc_ug`G4t7^B-gI{)H$2+-Y9VjBEPjaNB#OxgRf` zPsV^d&C=bMUcNd%JWzHnHuh#|RqQ*NZlLhl6XFP0D?{A8yG^IZOeHdV2~G@Guv5%& zEnufQvNx3&I+}&;v#xTV_3VGSig7PwpZ~P{% zS-tAcd)U!^zI{c0*!eu;F{)K^C2jDu1Cod6k`~C*%S$(2MOAdzv9BX`%#{msWr>se z{oF%DXnk*<`|4%)nAIKGsl=hNsJDEOW52d5ZtEtsD@;luWyT)4Hq7#K1*9~LN>*PpekkszLGJUjmbeD0t%^17_8A^829gpmiZGNU^*83&B*x91qp#{?RxvT3rb2TRy;Es0+xQE9r>YtJ|c<|kemwY*i zO*Q1T*mf;Oiq|C8+dW}Ow|$FTbMDdLkZ?w0%dy6X@IT#mHXg9~){|UR{nPJPw-2wF zuHXBpe4}|Tdho9GX?=OeH};*kZ&yrC&n8C;mbON=ZFQw;m%kHuUyY}zjQ5}bhEzZeYlNzkq=R?XIZo4NmnoB zhqDgcCLSiAFWj^3o?8BN)-c`f^!H02@2GeIx%cUD((%^u)H~Dhy$#gv*;}^cZWXy_ zLRt=O_1nPlK6T3umE*!(@zErdHIi(j`n zLzo5~eX8%lsnlWWlziLzQ#dL1*6{x8EH^S``LxY%va*1pqO+7m;Mby9w~bu1&;=NPIg_hpZ6ssw&F{v!Db z$h^#>anD36V{zm91SUUhOta^*5V=HhlRO;bswSk?W)AH-cjlD9x{7IW+}A7aU(Ou9 z(YSlbj)deH*-vK3nMti*cU#&9+22sih~*wTH_A7KHMP9cvP!0CgKy!Ht&p58)1mrF zGapM=KHISJ4CMN6dzouyIyyG#w^1Ag*4=+ znr)rBt$>_^l$@TQFIY`zH_bE`H$3iWIiJ~ZxT06w=*i7`UTGT+xtES8t11iSPC5K~ zl<9I+AN9jp|)YAY1QD{tCx&;@W?K%fLArL(xnSDf|7c@OPLB0c}DtOXf}eA zm~wwcmkLAM35}G zz&UcoJ8w7rTDQc5XIf|LUNbcp6MFEsFK9bPelgs90s$gicsn6sxpCKMqO<;xEjwF1 z3rqd))?KT6VuQ3&i~kt2yR7j{we->3B6+90Q-zzS-GZNw`8nmy^ra`SEf1sTm!*#o z7O}`?T?)%_*^{~gf5^-v4ZuFQS!dqUvr99xZrRVwS!g(M)_+;!Jw4!V_X<(NzAI)9 zt0+x*jJ%qVy*^w>S>e_`4jiX3n(aN_Xa>9ovv*eZD@#=$SqswJUpr^?x6Qn@fZ5=# z<-(?VmQ<3eC+U}8-uZ0L578k*CJ*Vm1>e_1?KpZI`oN7tyP1j>g}?1w+Hg(#hr^QQ zR9}0SXc@15TYh7e^2*Yrns(6X>)SUr1cqHqOdB+`&eLiMYjQuz{MZ?@+MT_$s-EWr zT*gd%jDFE)VB@QziW`+3^9&u_DF;8@yHqdLGeLqN^nX{=cb4x9bIAF@Bd31s8P}p@ z!?v|0tDbkXbPMyBK$fCgw7g7#yW`;}+giFc?$tB7mo`4UtP_3GYp8Hn@rM7<*oSb;I>z*EGjz=ko?k+{Vqu>yz57 zUp;63%D%3m7O=h>%QeIqyH|I(*llxzd>(7$%q=sF*0$j!Sc;~Cv!%Fu?~~5tm4+^D zO3u`w-qeaH1qXXBkaI3QepT5$X$(g*QQPs*18Pp;uy&)<%Au{;f}xTD>cDOb*(`Lv9)z=CSj0#iO|*xqxk z>eaU&dIqgzWT#AU*1r?Vd=mEyMVV4O(h)TbBE|U z#x}{*nd%>$QuqGJj0Q6>tmpHk%+?8~5jBZ%+3O}W7WXK;_m-c;n2uRBuV-1q3I${V zeZZE1v+q2wlQe;Gzocy7pek%Z7YM&(`-+20b`X1SOYe!u$$3=RfN}N4v{`dMteJp6 
z*fC*C;jlf*!~4#Mvl|?`fBBAW$t%dAgxoC~6DB{otJs>{RdQ$8`V*~Y5@SEk9np4R z-U8L^_V-3CY_+Q_M$xdbu)%m%UT%shCj<;;$2jlV7G7gn~IyK`=3!HW)S#&^Ty58YT%)+*0q%TzvwD7sW^V$6MM zqPngXom$=Y-iy|Y550ceO(o78+HA{8Leb*Xt&Q4KQfiv^vl`QS;O|v%ca+ze*f_6Y zZ-Kfllag{HQ@8NRj%Ka9EkVDktDGaXu0SOmOw6t(G#INyK4PKSi@UoylPBECY!##y zr)=Md|8-1SxcU3-Vd2dl$F|v->I)}ZvR=#?9N+tEOp~3V#c2y}x2Zmxq8{a5zpYaO zGH=E8%PW-SngP4&6-=@nQEPk6q%9aJAC?K3H#zr=>wJN9#CGKa);(!L$mrY$qN%&;`SFSKcAmHnS>b3Dx2NamUL>TZ z&v~rtVwrM^IpI~@==E{J&NDYt)cyJ!oE`853!D#%&RtnIDPc1?GjDY9{!UfQK7;ea zQRU3oi>H`LxAY$~2JU9UTK`d%G#6;(5`wir*Z+Mr($sfXfg7m*KwEbr@vjQ+<(b7Tzd zb2p&Hngy8~Q9DJIYl%1Qz}9MNZArI&)V=}zRL+NipxSX!MO3}(_U9lkuzmuWfv)Xa z{(7%vkA5qHlyYiSMB)eZFR$zW`l&SkYs?Fqul3sSTwrJQZvcq@V@r^Kant#KT8=&i(U01r=iVv#d393noRXbwHV|!3FJKiThl)2)>eXE!{P|@`cXjMYO3(Gd z#q6uI`xQZ?13v@3VzRwB;~?nCAC#E?M1MGG*Y4c|)@(hNk@R9+%;X*0n_SE7JuJWZ z$x}s&%fH0@M(bo0pttu(=%d~+CZX}PIzu)Oi2>Io5fic{P!UHj?@R7T81VWRB?(Pt51Hps4TX1)GcbAR3I|=UY?!j|+@_*lRX4X10XJ*|` z_kLngPgQqyb#?EjpZZlbZb9ZTpj*a4nM8?aT+W(Vc)!*FojN0H0Z=b(&})8^5XnUr zkY}x_E#w#{_`eQNc96v+SWi|SWhxQb+X_eCp~zK-qEslSvW(fgkbNTKXpND|p&meC zK$KGZo?pS-TgW87JdgbqSy?jgpP1%9i|>E8Bq(f}99I5|m?msN+%;xg*U3IOV$9;z zmbLcpb@OM1f`it*aguCg$uaH!)3W(@{FVV6gat=l!BN!zDTe!h8d?1l!~HA%zYVN{ zL%)BX{}ayM{3o~N-8IESfai z&p+D{E%@W^f1jx%DJAu}1aVMU4bHE3d({|O-?ygHLU*hAJBn5>DA0OgA$Lqo2C7DZ z%+jmcloQI@11+zkaLGfyk%bwKmZ6LiqJuc;FFthch^tM(?Ypyp6n#qUyiTlxk_He&bj2_Cc2(p|X!wW;P48$YM}-O6t1 zSwXvNy!fHDa7Dy)b+$ zK;L1}G_+Dd$S?OMQ9Cug#$jj-7F4+xkN(@Lo1P9%cjvEspV(7}+ZKhzCAaL|gtmXF z(|%DUr>o~N(k1ac^z&l+?2oL=e|CD=+2mwRe*vfWv?y}CEW=4TQ%bs6wea3%7EEJy_ zbwz^vCzbeX4&T)BSHCC1CM~+!- z`lkDrU68V~qkrqQwD+*G%kWFlT@IYNi{gRR9^ORf#wK6ZdOcrh8zJS68@T{O;YIAD zNabhzl8j61b+!JAk~TT(9iGT@G_YWVA|N}ut- z53g0Om%XNx@dA?InjdHQB7gySiMg2FuJkqL8Qs>zELe;nC{jHAr z*3zn8>jL8kVKx=78&>|jZ9<5-CXyE$j`^}3;n%MB1f>_9-~4BdBsOV)AuhT$l>*qC zIySvfWfm)@CLw&{qTcTchl8E=pQd-fXN{rPGQE*JLZ#SG-_7ia*W5ZAny22KT)mk` zUz-HaTK-2P0G9677a#82m(<)lT-1k;$t3s-=_fJ~mkeVk8>*PE?J-YLp`LJRA0fP- 
zC+>>CO;_HXls?~AGm9K^Q(oc&1bM1C=9979lJ`J6rCZL`EMoPhyWHfL_@u&c8~=3s z`ta<;^~fUjS3u>jjx$$TFbg}!vkSW;m?ekB-3cwo=j|I>){IqMgyKauEA0cAB@e~j z4ch^Poi|=)c7fh{z~z+A$1uRcB_a9)Ms^I+$;4gZAc67p)6Q(iZwMQ7-TdvYcgx8{ z|I#1OMTk_`MM2deG2jhbybINOLq6z=c`(#fsxacmpp0VUTWSxv<`xGY5eNh-rM-V< zlvp|Y4WabDN#H80beH6Q;EyRko_JZ9i~m|#3KRGvbMy%DM^B)suTv+8jo9l3-MfHf z18Vx;?Q5UNDE&Iq{Te+Rp$7&Je|OO}`fK+?bB=3WFwt-`?BB$fhZ`sCM}Wk!G}EP+ zz<;Vupqtl4KKv>r?){v?QhSVFV4nf7IXzt5eRu>&9m}&_ zssV)8|C+ISD(Xg2FK#xgkkki8O()vMW&df$t`lZvr)3uL*(;g+jv5CH5jI>|djvbu z#=Do!$~#D}>)Fmf{L25*5*O@_<-pg!WJ4dA>Hi=DA5I-MD_4>V&TgD(@gA5{e*Brd zVzA}2@x47>k_ztV!PE@P%Reqw2%aT{sA}Ff8Re&>$QNvd#jSsJdg7&e5Pp9_FLSKL z1qQ+nAzl38e!$c|&gR)@ax>7r?;^R{@TZQN!r%DEg?WzidGtI>IZ9oB^(3?r@PKWZ z9;pLU=6O;FhH55s7Ljg%ak`KmhF}-#4*?3U=%X9 zjSACm2!HB-7Q!DbGw)$>#6u3Egq z+>tDEoXn}a!b2)|Y!$pY`o|G$Dmkz>b%p0BY}S@gJyu?tu)N`m&`Ddt3SooI5Y6E& zsq&Pe55^7W{_$(MLD%HOJBf7XjSDZn-w;(hggpr_0a_33BM$@9PkG?9w6#b6i{TS6 zc@?}z7`>SKx`wN2Ry}V>p7}tH`+&#ww{5#Dm>*iX6WR=*%kbY2I`;y8NB=B9pKlCF@DO^iV%!r0po1V&-U z{o&Jb(^Jv3v3fg8evDcGFB_Sy8!|oPU3xj=j!rFx8{+!SHs#%QLS~@_U6@2*lwtmUsk3OSs+bjd7_wCC?x@{;wD!;ac z;I^OIfFbPhSw74j)x_p{)c0-18xM>AV8dned5-v3fsJfg=%IJnH9kuStGxS$zBjL{ zKHI3=#K`bsC3LCAfwCow*Q%$SD!xHQIKC?YqrA;ajS10ST?)(wBbx1kZ}>2K9GfGm zeVA=G)n$%VDD~Wgx~@9F#8sh#8!{c^eR^r5mHR455A~aO0)C}-l-c81CzDTcBk3*h z?qZlds;g4CFC*aJjXKR+WU$dPyO+mt=5!-=@`yB45f7O z-fKTsrp>%R78o>fYxrFg6xd6ooGriJ$Xu$=7Y?8%5qmK39E*@{xHIgFc0teBiShN` zU=-N1wdn{oS+$R*Z>9m2WDyq>qC`mCHeSZT|69`)w_bG?FjK6&?ypSudw!ND;Yat? 
zHJEi4>A3I)kCKJ*fhJ|p#&VjX$8AGt>ErVXgI9Lvt|yvIID4!5N3PBmDpyw*moG?i z+U4o~aXFTt<}geeE~)ONub39#o44<;iO;p2IAxXS|L=0@|J#IM>Au?pJwiM_y$5!1 z>RK4b*+M|)D4I3lPw)oj+O?bp`UUj_o7zC5Zm2`d z(?Veg`Tz2-x?AJ+@kr2y2$S${h_%3^&irRsA4LYZPHBdFecjvWS zpcu=$Z-g$`GfFeZs%PwbnAcNh=u_9V991+!jny4f5%ihe3_(lR3bzy{q|eF%FS#tQ zRcW;~P1+6eOHVMve{}&co4-60g{okeo zOAj@C*BOe`V_11B1~|XjihDOLRKs~Cn5++@`)p(}+!s}PwZHk6S)GDQ3Ut?%zc!tM zXj*VzeHMP|TJ?7Zt^({iz$b<3yd<(JOs>Cl8B}%njL5gHlf%ML+Q3Qi-w=9tx61vA zy_Eyb+s8|B8zRhqoYFV0v+x<_H-sW3ocCX^obpQ7o87$^{KqR97aEcW)c(KXmC`+q z!Nt60srt<7OLgl03!qWa=@_}m!KOqbXEQ6!8e$WhQ>Uuw&R`WkIc#zsA`?gu9`5SO!%#6tl^5E$oU}D|xN)2n~lhbuF@j@Lq z=Fib&qiISsWCpH|>?M|VuCEGQBy`2vXdMhve)l|2a4%HZHuI2L)EUpxlsP-=dwZ05 z#JH>SG;oERJd!9=gRa16XT~yU;fh9)e=8O{mE39Q15nBZ#W9H&4>6^@!Tff`X!F~igkVZ zp72i#dN%TeE`Odkpt}?GKsK3m${j~4gck9_FM5I16D61g^8HaB{~Kg0RnPa80U@$8 zA82Xst_zx+OAZ4UW;bJ$&WRTu?h(01l=-m9pGhdN>ogxQWqrXJp*w0qo8AM29+XK+ z>8DG5`4tHp;D%!5V{AvXT}zDctMgv+enVW9p=?GCo*P|Y)bl!J{Fa7!r5&1)>M=pjH)lE#NeknXCYVP(oL>#^uMVnko5C6Ix=|rPm=IROf zz2kW81#3w}@HYgt?!akNAKdYi&j?aOuzimr^s2QKOJSq+4i~P^Ls+8^T%(2Kr|X0S z-BBQ&Ppcq~hhH$?c^t zZ~+sGzhvKv9GX2M;0+ZDG~yS6`*0{%cp|=K9SnnP2FnKjk=08EdbRYm9hrTk;)<3w z?8Pti(%cO^Sb!lEB?v0pWqlbcWU%Cs$;^@jp7=%QeG*Z{p-^j!t`)B7I>rg1F`PT8`09w4TI0K`{nXOC}S>*xTko2=d>JAau2;|M=VSJ#7H zP7igsZ`+Q-FZ7FdW5ClaVcHnX!<#tS0eCWaUNnm&Begq{sb3Mv%#uEF0o^|uJg=1o zlJOHHDz1PIJlru4^e|tUbk1jA{xs-4seaRx>+N*qGxK@gFmC{7ROuW6Vs8xQtP?%EuhKAu4ycB-Cx$s;PSSJ z$G+VLEPV1DJp-o6jTl@Oo$mSYaAWEY13Oc)?rf~fc4Hpg)pXLU{&aF0I(8I#@#X$F zUJjXO=H9w3vjaXemdlHY?|6NUgCe{)+2o{U3tioz%)d3bv;%7{AG8ZYX_f zl8GPKl4t+RnQe@;p-R~fn@kzqTXs6|SD4EADVJ}PwT0Iky^TqGay@Wz6u+}X!4;9I zI0Es=*`Num@8orphu%o+I@H-T`qhIn?qA|f-H=Z)X*=9X@h+J5uh!^Xp6q$Ly;=)q z2DxVL|MHa-R%0w(8@YUccIu)-SOod1l_z(E)!` zk6B$tdE4yOY6xwZ$?>oD!vDp{fjae3C#eE#m?O9N^_AD*?CH(>qlb_en$FR=KaS*V z^6O6YEg6O>@O$BsRwvj!{!QUP?{B02)9w6?#F#f#`t;WQQ9qp;V(Ax`rd+4?xnIYj z9RNb>((P2osF^aTFSiVn6WQJ+>C-*;M|O0VWgfJDgdg4N&!_s(+%@>uI0LiLcg)o` zzT5&bGX%VpI0}8wX6$6uoaM`}+) zTT&a 
zwn^~~k@@>lz5c`(^Ey1Mfw>jF&LHC;pEf9RFNOOLZ#n!(F9~|5(=B<=>^`MKr@q&B zK4*+tI@M;?VgFG#^G|6eS(JBIm~5nFh2`DepAeNMLt7wOT*Qu|N_;UgTfuLTX+cWb zE$CEh?${Ehrv=xQ;Nb5wROP8%i8TPW5sk8SCY=COfR(c-USlT?_^$_cKvkzs*CGHD+ z9)(Sxb=VB`t((4ONIMDdktKH8y7;*ETD|G{|Auhd*IIsod|RuJ^;5u47Me9LuVLy8#X@0gTy*HHF9W)tcc)WpQ3SDjQO=b4(I(6U6d0Fv)^ zeeDHthZW!k2QoVWdbl?IcRmBvWl!NXQLPe>)+!Zz3HEEACtZeoLW>t`lV!*2MZ5em z2}??^X-i%ar$abAxE4G0Z590ci`C(@TEA9{JBn?SS13UQ(ifH*mHpesuS3|LuFdV7 zJeC@lD?CNs5(KWHl6rY{=q}bi5$&xK*|_KS#vHP7VedP3iIccPw0SA6_;M$BH^cq2-26KP=ccDb)aXB}*s6(;rrfXM0rcr@o5^sI z#d?U&sea+?a7`Ly)b4m%5h_uqjfgilij$Qol5*LmSaMs$as4QB#ovf&EHD;lAL6)l$#Dm*7o=&R_81#Zk5Y+$%x@+ z+t9Jp49c4@qu4b<9yboIV>ijp8(5s?Sz zG-Vk3k6lG>TLOANpIQ9WZlGp&Vydk0{-|WpI?3Mur@sR0F?Wi=MILpVGx1ZaSWNte z?5Oen!3`*xOjEuIYdJIi6CPs+3y`>Vm%~bVAr|Z74=5M!u6)(32EnAb46#^}lmmjj zQfTM1#b?@*#KNJAZ6_n8r4K)XWtd0lw3Q`D*H^~O9q~h(=ZfT^#8zl_K-C@^5;`($ zzy>DnJ4M*!#v#xkVD-$#*Q0g3pOh74tFpo%%s;=oxjrjb7rj#-Q@ z76L>aG1H8(VH8%scFFF@_^D?s^~ZP=NUUDm0E$%~QYD-6Vm6Zio?FFe0$f4|58eC5 z7Oh{GbaD9c4X~ruE8wrE_$JDU1A^QJ`Zr46cC0v z&N_D06q_|sy)s_6M(w>Omcf5}UV9_2;UvGDV;KD;B*$kUyPhC#4?0W-3o(A4iiiJyx(fMw{W_2<75U)1|UT5my#J4*Hl%7LRQ=%o()E z`JKM7P98^BPCGs>l9bP35Q;WZ(Hw0Z+~YA6x+;q{HU3sWJ| zf18s1l(3|ecmvP0|6(T?d^phJ2V;LwecIZi4PQ-8^nqc0fHK zN8f}G;>&xTkZ4TGogpL6b`8dKj|}G_4}-p}q|3r{zDYXPRmiHIk z7$Y2y`8IE-A)J)?6WG8)l|08s_Q9>eRQ$_v>&9uSTYEct@w>Qu>U?axz($mU1n8u7TRy1AZvC-ioFN&|`jkjt(N^Gb&yN|C zZRJG?(3}gq#Hd3 zS^7SmJt4RJUKRYXh@-X1Nl~Vb_#MVoAP6noaj3*Z0w@CLle(H097i?drNDZ{aQE^~ zpd4O26CK3a4Ol=E0H6Db)H>`zZHTE3`SbtrGNAsey^rS9Wia@k0N|Dqg z)wl|19O(ZpHqzWo^5_}|r?yNsy`NMsjE8(tiJjlSPn$fugLh3iK{$~s`>+(7sgiE# zPb#d2A5WOH_i(H(zD|NnDLc!dy6-D&2TWg7C;QAIM~Fj;C1bt52(AdENsvxsUQ$|W zvW?K+E&yMRje$EnzKyr6Be1A0;OR)H_7|HroT|xc6kM$ZoS?5w>DhL80OC)x?1fn*SBC{+KtJK@pYbFZ=u!ir_J9T!R4qZf?;68cbVO+_>GGPS zG6aV}ty6U}P-~W9<{j0<(ri@)Y7bEy`y5lXzWz7Ji<)-_OtH`!RMTz#4yfa%fEmxG z(l3bgOy(k5X+blS@n3(Q5?#gp;BPYorp#eDY0{%RDV38$;FUAU&YYmo`ZX=G%3Vye%oc!RIp}MZ8iX1dPHrHMP7z>eKnX@ostHZ-G;4El)w}}gH;^}g=SLDXI 
z6obUtnwuMw6pRNTIR{|ij$nzO;GQH7OGQJELlJcDVXqEpP{$nJ;CVQYJlW5X{QS8i$4!84CTFB6M2vgcG8nE)KUa{-_}IxcYT?%PF}%yMgPfmT*J zsxa5*GupZKKbwRm+|F^x_6WR{FtAcn2K-rX$b)lmuZpzg!X9G^Kob#O|t3kD?Ja^F)+SNiDm~bcVeVQ7@X${+9+&|hE8M@--EXR-~M++8v znD=@gIkLX^MFMZOv3Zd6=LhcMVl^4zd9vV`&=xKW1$)TQNpk{od?+k=j zMo;!W{dMDVaSrd5xx|gmVqEAFDJ5bJD8$3;)%A6@VigmJKiVaz(>sLWmaUIlCGj#? z8t*hV6S-Q&K88^({gfK$%mUi+n+!_no0&NWEpT9TUgg=PH#1*)imNOY(awd#mi86N zSsWEZ9_Qn8ef=QM7Dhx0*waL3Dp^d3(x>pCrDm!096G0;q#if3qUaA#q#J;KkIC$> zPHc@?Q;%m&&(Z1k+%Sh{`Ji>GG)%U=XQ;~l+(K><4HX=`L)(4iXFCkM#j(K?>P# zXf&L8y(neorIKM>8iF3o&O}ngO%%%qQyzsnB2YR?Y`l?xFxf>M!EX#T5E+pTH5dx6 zSB)`{zENBKjRb@0LD|p@@4*wa*uiMq%qhh^vlu${%RexJ^W~K>8HG#}b$p(AB8q=$&z9NyShnNzLhG3kBKfqeV0Sn)hoG#`^})#`f+-u&2$Y$*|jkJfO!nn`mUCHfvHrmqz`f>q39hT_i#N1136 zMRt@CT5v5q4t;?MB*@)U@ka5${3^y;C!s_d`U@*Py9Jtiet#G$uDP_RILw_>V}1_+ zmyWa;%<}p1hRh^)(#3Ni4Z2$ip*GU7wh&kt3lgZbM*g;QGnvPjVagRbR1B^QL?g>z6bk_8IQh23h9YiG z(byr}Y_dA2(A;@MA_!Q2Jsh~?X6{S=ToB_cS(H2pH_3oSGnK8thc`hhGhZ2Iah`xy z`P7;vY?==+5lxa%j22Nid$FInK*)+2gGzfa?Q!t(c7{?aG)7^x$VNVRhS-bgy8Y*C z&-@%oVc5k0GZcxDbn4+4hM-MEtIWanbqb0X)E`IiRougz8|w6NRs`dYr5S8?M6;sN z??;H^se%M%suXz;jA6fTejVSnwz<|e0?AdL4q&`w6f-v4_u+g+2e6sc_MSP=GrC$; z+t~wHn%G;2@Y5DBI^v*Y4IRe9C5c$K_$nt|YOoI+Jw9xHA8LUasyD&68ce&bc+Pt% zt{4{2r@*4Ip3W8Zcz+B_Tzpy#owd!*MOUPW8Z)R@K%<)EAa!k!e8&V`{QjxLl%jAh z>RYdd>#BYM&;}El@KwipB4sGUD#6|X#nsO2XB8q*c-;HWlRFVU^TQ3oqdf8=%gP`p zB;7*B&XOF55hD_NZz}J;Pap7^yk)!x9D>wpQqiLncZ@!eMTi`e4-yE4f9&WdD_HJs zNoJHk0^ajA>-GW|yu=30*UJ)P*ttl2u+hKASZe3i4sW=pzCiw@XC4Oel(?+v8a(K# z^k|%jYmyiPY%;`mrj1a@Mz04EeY0b5;w8gagBA|Ujn7WSOag#3mm_k9bp~ECY)aeYM@zqQ15}H%ba+geVa6jEanIWA`J|mmLEi zu=L4LnQ*P4PS_t4_a}3Vob1_4b0;jC8~vC892`@r;K&oVeuGC8>uB{yTmFITebTnx)#y{oSWR~IwXqJI=04y(Oyf2EaVw)TX_k4|K@OL7PBF7PqN2& z-koNLd1;d@F-|_lG0SD%8Hxr4&JGz&+4_c7#>B^3jg+F473>{S)La*NEvG|S@8^(h zQJ#R_P7$)JI%eAy>?Zb$l=%HGKl_P`vZMqVpv6}XOm)vmdm$Eji>zy!S(v0V#cTsj z+NaXj_}iD8c@&1D6IYf9DSpAP|EvFe*#?DTQlbWKU%7Z$4n+-VwZbDq6B0hzeRZ~P#^0D$Xe*TGl5j8t zcGO`PlFn-qf{vy+y~PKz<4il=%3qj<6>X~Q 
zA}Q-`4cb8XV$e3{_sfh~=3owr89lgp<&>i#NiE0OiV-;Ypu!qyO>g~aYpyxZ%&zyn zHCbeiK@DV{aece;RGNavIRj!5^$-aI@JNi1HPQI}>IBTTWFBdpNf{a1spQ&>cC1m3 zOtt5GNLFJ|k^bO3ukd&~Sq{i;FUtc2m7EDSzGHITjApF5_&1~sGhyP?d<`DjV}9(6 z1#=o6=suN{(^L~m>CM_q>~b8*cOK*@Fu-mNBARYAJl%b!YfgzUA=?w+J9>u>AV2#6 zUfd8G=P7aqc}`F=oB;&xN`L80Aj$h-dF;)jDF$w@p(o=h_mG|?-`^0NJ9ws%AA6Zp zf?;N%mr){2j%t>bQHTPwS%-$vXs}CBizN4Pp|5?v-e0&$??j2iVTv!nquP{UVqTRz z?MDHyWuVACC*JyE$_ex3id zs1jWMuxr=eeBH*8SK7zuE$(7>qXzZfuMm@(tbEeuRo1v`Z~XOb@AIr|c2V0S(`|X~ zxx{#!BTHH7|HZvb|803l>B$ByJzR_~a9rf=$l7Q4^h-O+y(-}m(I05lw?ELT`-Lai zk)|4xIWV+pH2Z#|9=(0nyQy0DTeF_fm8&`zVdcqmvkYdSL+sOf2W>#@e??N$ZqqiY zUaq`lVP$z;#HKwiu93Kdv~}49@BAi-69P-MukrxGmkzyWS&SswYp5(gE+P|}RT5Qvo9zJx7>?*(fDaIti%tr{EG*NocS7M<}8gYuH}?zNMOHhG{k zSeFopkexH>d)7Jcb-rcq_|9Msab0~qN$}W7eJJ-d!S}s0^R%vgvBeMfRAsk+siwc; zdIXejUvcTRYP<2TnalT|RrTHrz&|7Q+qXMEpOa!@KFY1GHI~nHE_33a4b?Ah-*1h1 zp!KhML}?FPN7{IkUQ}AVPKoM;-nK+hTt_73lMOl^Q3lF8oR=-BC0|iX*3&bWSd2V= z;D(-bi=Foj9PmB*TmcnBAwkH$Z4GIF;=8C7uBO*`LY)(!)LEOzq^yGSOPPctif0+Q z7Y`|le5bKfD}X+@dm_G}oE9InwZvlilv&!4;v)j3+`AmwCsLNl0vw9em>0Zg@wI2R zl{(TR%n6%f1^fzT%tBQUBB&LXu=FnxJ<1<*fW$66KsCB9s3kRm9SMm`ikRz`*F3VE z=_i{RCyh`PGU)bdQ9U|x&hQW}DzL(s6t#8k1(3-QVCwe;-FHOMb;I$jE?v?Z4~R81 zb12oYJ5Rd6?dBq7HNfKJet9yFVx}_@BCpj)vibZ~%ElZ55-}*x)rNYP;8R4Z=cnK; z%#q}I53+j>jL;$q_LeUE;uuzOaCxVFwrxjOab^B8)IEdNERB=ZcaY9>5G~6rXCHBN zD#b?2$hX~y4KsE^uo;u}K^2H+0ml06t10WBep=$!gE;9Md9a1##(3yP@}r1I>A&;U zUimK8@@ivnI4^K%Xh0Xm*PbJ|e*gCI15-Dau2#NMjFl=%Off2!Jqsn`VQqZ;Sb*C| z%VxYvf=uY9j?ulnveeiIVkUigJ5OyO`3lHGn!B=GML|sos~{;o+)&pRjR>+k%Cb5D z`2&|Pb==d z?@~~r9x9duXxDCJsnvS}&3MMh@5y#F z(eQaJYz+Zk>Uj8;hEO@%ZMw(ZHR20aw5B>rw$6v55rl{eOn8Xa#R^HQOF(vJSo;n8 z>z{1)`AL#kyoGWIJeinUcdhRKp1OQz-9$4g~vIOPch9=;z^^#(3@hiM1k z@@C{Ss#iU$iaus!cpCLKD;gMZsemX(ER$t#g_tXf^a6Xb*0980g6elYOso1Ro`ID! 
z6h?8Z9K*!=!NX{@Y```|a+zz0&l->C?HBs$k8g0q&N+xO7sQdUzcm*{ zYe_L@jd_w9+=O+%S*j7QyWQfPv20l#Ryk``QQAi^?zw(RIFP*a2Ln;3-*^Q(fv z-`m0KCQ(Q-o(ldWfEAXq*Q(Ah$?Ddn6uA1 z4YUEC9^^67&Oin-R31atuWj_Zs2|n=bymSO&ZK2r=(9CA`n~4E{=mWP7WOz7w$y#C z1=ysBZd+(mqm8eg)P=8-XnDYxE~zxBHLa788xuW_oL1oMr)Pe1I_~UL#c#c|p-42rT z3(O#wPUIkW1=&z}b={PbAr z5-6y&YI?&+epGmdQqKftqH|vrgj61=JVkEa&ZC?^`He{4B}RjQk3nCN4^mXIgFlQi zLS;$yKbSLw<}n-1U_K`J5Y_f=hz3qsgf2?&^}oeW=Z?%%Fi)1C$gY*e@ zGPYugnt_zjc7DrCs>+)Ujg!dFPui+D!^pQLS-x}a{yO2JWlbQ`Mm-oVcNg&y`-+peWdz^#2{J~^sw&hO3A24)wiBC0*#e+y z)j3IO{j6ATKtcB1J!ol+t67VXnP(P|Tk5C_%1R@oX3$ZIbBG?bTM%~Eb~UwRtC@0j zN;fg!`VpFBq< z@xoU&aZb^XRFYUpu!6oxedd#A=U1L;sR>$sT!nP-V*R1>&U~2wAbAwfNnx)eSAxP3$cgu0c;mDobCM0J{rQk_(3u&meBc?;9h9b`K z!P1{V47K~T6&EkyC{YHTtVPJ2#1w|Zfn@?SK`vX_ zQ24!(lC81RxEnq_8g^50@Mmsu{AbRD;8_XME7)*Y&Q@?-I^&LvoSkGVdhj$Q_X54y z_{`nRiwCI?M4768O;1wPFEeM%`NcDpI5jtXA2uS8Q=f#5B7i~M76o9bc$rX_p->Z_ zMzm=*HfOt2EN=k=hY&HAcm<`A6vgvOOQIR;^1f&A-f6g&0RFh#RC2jD zaB~vb1rcbpb3G{(&P6?EKcvnE5hZE=k~zh@H9iJm5HlJCI%mMMi=~O%1y^8;rq>i) zniDUE%&#?y+RuZNlIv%A&sQCyPqg|94Pr-7 z7Hd!I74v%^Mj*f`Zcendl8eL0(la3roZ$upd^-T2ceLyQo-Vz(O$v{(0C=Scl1opsh8u`WYk?ASMM(WA|+aphW|EPavSQ>&H73 z{Cq`!6)SwFHG5xMO_4*7Z8fqc$s|PcPJaM`JYpq^_{EZ;xxva-fD;N3{jP@1E``Rh zzE+MDxW)|8oPJPcC{vE#pTW``_uf|)1!}U&HXS-&1W54Y;W>c?!_Pv=EW$h=W}uvM zvYgUyqQFAJL_gpL!4i!A;v_$Q?o0%oMm7)~phGTs1&j zmjVG|bTGxh%)u&zn^CBSVxZ{2m4;$p9tF>i&+jr+WWfkg%ao(?i-8sGPP=*>2ss{) zDlJz4_F%Y3BjIz*5L!~o20;|gkI9N6Wo?LS7d1T8+4m65vl>^*9o3=TOu12Z#7&J> zjhc>!mB!VPSF^0G>Zw}Y)6aS!Tf2@Z5xPpBB_)7vv7DCqLOJbUQ&{6psIIAnv2z408(b`fR18`dM;%_qbZnWhw9*I7M&v|<${ph(5<%!H zB;2#kb$og!^Bn~R;DRvl*qN)?j$$4#-F;xoi!tS%JoUn?`60F`000%SNe7>sHaLq> zWPi?@Egu_7pwT^nlVxxFZDb@b*VxI?%*-`w$2!l-mM+i~HBv}~)jzW1Vf%fq*>=XY z9aAMVA`(8sHqFtBmLR2(6&6M8IrMYU&ip!4tVqU#G^6?cqZvk+v2}TjE$L1cqP;vC zG%H4@ec%X_gYPvcA9{bMllp`XxfiWjtIw_D5hAn(a_2n)HQ5;Zo}$r4E*w4E)WDXS zvoM-xrh_=i504yi_vZ}ILU*>OORrMTg4Ne8T~xeuC^w=^3CgQ|%7+&-n!X=6!!^5Q0gNzG-HWuLvml^IsfOXU6g$N)i1Z^1 
zx4Cf-VgJPXZ;}*y^zY8FY~yf)NhFDoS(_%yhq-%t555v0WJpmADm)@^v&~|7jcN{J zQ%H|SK_zZ#8j^oqa_2-t%X2B_P!n>G+L2&C@UFBL)$*>Ojctty8B2ERml~A56de2k zx&9tbfM2>VQKVm`e?Qrd!OoVw-alL}c>FyFKG>9;3XT({rvo`>oLfl|gNdf6`wdDZ zLF6nMa0RP~2*Eo=W_VGkChRuWGm5cP?&=ttJc7!Ylr^n%wxUvlc!gNu?;P?c$ZCej zs{5>oCx)BjoVu*G*Ng$|k$j>xGrfct&8vWUR!DpLWe1>%jSp-Xg^8T;HxJqAHTfMp z=%upMhHU1`?^_5%+4yxUup|aieW|zQ>Gr_2o_r)&Oinl3wCV(bakYeug>vQubg}rs zZEXDzv8a!sT0VfzI^H~kQkJM((CIbj zEiUs(Y-L&FQLD;!(o}aY)_w~}+)#iHLS)m4K1SYZb?i!@fM_P}mhC(7jDU%-jG2t9 zuQ6=o6dm)hTMgG7A=6jvL%?Ssm+{p0Sgs~_9VSkDrr9dIJvI^xhTDm?#ldt|J*TQ| z?h?W)M>M`sjjc?$eBSi2}>-Jgp}L5l40M)yO5#~JPC@G{)RRm7sGjqLWn z$arf4%m}atSMA3&s@u{(@#7jFh@U3#no{lu4snui|CSF@9EucocXxLQR@^J@Ya-=d6=tos(q$zI%VR$@MiH4Bqu2^36$7V`2gy!;EDn39@IV=LkB_ zgi)|+&#F!U!d?n9j1@Y=p@?u4mL*>-@8DkOZ@=W5K1YdMc)8CBU_L9Kx!Svqe*hMX z^r`j>a!vdO+gW&(XSnSg#nYUe6nG1pT@kUM%{z-z)F7hu7{fw^W_vJ~++%hmA3{Tf zt%4_W4C&YiUDwAUK>#iJhnF^0=gHfst2mFBR9>f?L&hwNGbi?#!;AT#rsg6MU;9cQ z@47qMGtTV7Dg_8*A#Q5LXR|6y1hc6xOG3AM4GstBWT>#**EW07RuEQI%MsI0qgR5~ zs9A3FMsaOtLIJCu;jR3!Xv1H-;_a)*CSyFrt4I#YUqi}0YgDKcSVPaE4M|d`DFjuK z6U(sLLS0nm)PCEPdOCg8gROC$=mNo-GWoFvd*#QGz^CzGx>U;*#FeU7l~mZ^YS6$m ziO#Y%P99<-(Ln4`GMMpMcFS3gX z$yVJ|rk1*3Wg&xc4Z>AYxVdjU)E2WAcvm zY<3a;LQbU}Tb9P*aqxK)o0R1#Ge?fMjPHrOV4=_a%RM-%4c}dwmEK1-E?vnBB}*#b zK7d2PX`jTgXSCST{wzvN@u&}SNqkD9MMiejGbqHyEbo9f8`QcHlILh!br=*y ztVKxkl$ewb>fUF?JE0+YCG^sc1J#j4DfM~oUSyt#xJ!$m%4>Ss&6)d**e;1!z5%B< zu3LyICN2tmK0B^c+4N!VJX&H|3uQpVL45^-{4J~lC^KR^63-|f_$ z*_`8+I?cA0aXz;{2<1C|xnQZj<2oB7W?Li+=g<+76^K3bV>~i8I>e8{E{VOoz@ryX zZg|Zmj>A6wP6hTB-4rqrQvZTcB@>DqhefJtk&R@*b}V2-nsC4{G-%Btgx;u zn$Ad!U*t))#BkJ#UO?AN;rhOVuwnTN02I~6WWgqx;DJ+dTxpLN+=ka)-j*N$z}XZ@Q#bf=-Ucq`ukd^0m8TZ972 zMc*nJM{8DhgfRP<1p zv~k@l+CsW3Lm*!~9XUs*PA}aOlfg!4W_ilaDb0kj4F>n~-d!ryr4cc4FqF{dHhFx5 zx4K=Jw)C0Ad@kF@@aEqwMeIW#V<8YePiis;LtP=;nW$Be?F{;+#%vU)nCTo&Pw3RH zat!rYb~gPyI5q=H{h7k;H^2aA%zKa}>7uAyn)Lv}u@6F;drkm+FEj0SJ*W&pTbLpb z)+f3M>HmIC9LaLoA-% zgs@j=k{h_gh5o>DN+U>gd4yGZGEPb@!a)Y~zm&Gr#lsy?LlN8Pe+8}0XK45JCf_6~ 
zf24OM0K5!sXFL9GQ$6WjvA1W#4=Mz2&WjTN znMQ5fo6$FgQ$(b(k0(cQe}$1%F(&v-rxdrK&;=`Ku*fHg1mPu#-)0X*;u82lmtgTu zDHL|dtqqgZ+y~AlLEZqW=p~oUfe=sgtoKdMa#a;iWY*8hPimar)(U3m7O_`!+I5F( z{JKL&(qlx>*uI?cj`@P-0H2PfS$`Iwr~@3}8KnKJU(Tg8Hb>xp{GoO3@CaPy&qJn1 zRnDQ&CHJ=6_0VoF0U#!IyZH(Wr}|2(FOFc>ZPM5xn5}VA4R(RoYN|rjRO+C*`D{tZ%q#+UW1~^TJEH_y&q$@5huAl@rx7mFnd7 z_`RqY{?!ebI9%!3`RSGNck+$)$7H`5?d1nWHVvR|&X4De>453Xm$xbAicE2pFSnpm zUFq%kOq!oI+8RGamy=ygc#`Wvs^ltNp3-ya>Hkv>*F|UIq3H47@Jz+GN}B$E{>K(D z#f9Grro^}k?5`ZgPTh|r(YitYQ^hWpOya+0KfY+{^lkhDkl5n8N8jw<)z|#Hb#IZ8 z6aVG%JzK{8-49`;bK=2&<*iFs3HyZl`-|@Vsqy==)PJ^@x;L`9iT-E}a86wusY<=t zHa2hD`e_|$^U}b%wh~|#a0at`GBNshmN_fG+4MQbNc_qEU;G5qTC478*#9#@{~rRv z{XOP<{{YzEXP{kotWN%bS7QF@e*iC_+oWh1B~1LS)DHZ_P>T9Y8+#&a0P68(VTe?+u3= zrmW2Su;svU8zl*HH+zOS;UdWEuUL*L8ctqk0!lEkN{X9y2E@=>zb)Y?=q~T95L|8z z3fn-IX7pjtXQsJKh!E$lk=CGqeC$^Knv|s=%;2+UWkM$)3&|p$Nve`d>rxTOV=r)g zffV5rJjnLY>!a&DFKbFRQ&8%m4cX_yi)iCmg$sYo1^WcErpNP;p}Eqoo3P?RU!2*4tcL!>msubT$6$#aP%ulKlHn^t?(_&@ zVzVOFG<+moIM{G2YvdGNFXZa|;LUOX^h|fnLoL-DKh)AQKOA9g+q~NJ;Y|WNR|Epx zwg`!RRhA7NI(B%8zwPh-Kq`$%8I?3R^F{ZkgfHzNclv@< zF-FNz5fTN7k#1~9cAaSmT;; z^X91Dm%b^L`oe?;c^)<{MDylNu=#_YBV@R5t?l=%BWt_rhOsEACNYdzN*5y{p5+uL z?Is6)P6nx?sgL_<)6Z-T4H555H%7fFt;J-7s`#QTS<|9sgvo^|(~Bte&?Y(MICQDK zae|zsZehK_M9=$>7#*8!IoSQ!>`-DlrV2){W5E^5(Kv@Jed`~(7SmWCYS(arY-flq z-F;#AR8(i{HZ78@xoy&rLB4KnhW*IRpg~Ab95&F$EH)x1vKy@#!Fq9FBnL6VEd$f? 
z!{wzbZ}Irzw~}nZu80%|=g7)#*{CEoH}=Eiq*!?ZS0A(1OI>+eHGI(`FR%=<*VtAA zx>EC{#Ewc$v0($>PfM;Wrbbewto3*t+EPgGMC@MkbUVWKdpp8XT0E`b3lGa6mks{5 zprCPF88?HHV-?E2#VO>_9J!967|+?mJ%-6q@pmfh`Jb`x8rW>Ho`-0CJ1jaYTO1e9 zY^Iow%lmSzqxE>U;6{%Dk4tcxZeOv#%Ah~u%H{VKPg2d2;yRv1Ah9V{Pn;PIH)_t% zw_Vb$`_&;I%}!9TfM}w%k#V@2(KM&fK|oBLW41-4uT=~h@OJ4oT!BtBX*$owE~_XB z-sP$kmpT@jVyj6(PC+eIF$%*#7mC%b;Cx=Za@H(k7`?crR0)S4tsOsx8pV@UGZrvxSrL$n*ss<6`(%OeQ3kL()5T!{?+ zO^12Pu2FJ6lp8l?er$25kPcaTQdtJ66B)W^M%0}15r*QkRSuYe0NDEY(6+Ub zfB%x4@xS6n+eCKlfDf@KHmNFk{;~Eum^$6m{e<5wew5QYA=WD%)odovM9_RpZ{b*a zO(P?P9!M+d8it1mPVz@gGV>;p@s9}Eo)6;(lEB`YUF;O&6Vi1qBShI zzougAl_pK-w9<5xiv$fz=7ty)+jak-i%vU`bO1^e)YJ4Go;njwzt^dDD*pB5F=ZaU zkErxX@+V_?4bbH`=8xeI3+D#qKYK5JjNAD_68eepzD?lQp zP5&uMu$U!Th8Q57rZbFF8qjIm)mav-Sef}5w`05mW;X9`mna|B=R^o&#_PpMy%tVP z43tV}Z=TAD!30$yBG21y#97SQcbDGx=5tEHCqTXZ3D@kt2YX&aTVnq;!J-2PZD73{ z+N$1{u>$aFSaj-d?H%pBJ<-KhQz3ON0K1TwtcE~B_gkTtKGqodTrq+e*ym9CyBxRY z!d}BeSaZ72o+>+D;H9FmMZJ<UcGe#gaidfO%aL%oOaZ$2br7b0W|PFR zv|b0wS-gD{v!a;II2k6DJ`U!a^rajfINV<!e}= zRk#Z>4CHeD0hm2T&d^!soCS9(C%Ijh(I?yV%)cj|Sfz!4aki7+NIv7}QfgJA!s_$M zrt%E0r--pih0WUYxfzq8U?&5W3l=9ELSbWZYfzhasGy6&+pQ(}**soU?zc^NhQ!-M z?4^?pT@(`u5k@u%^m=LtQ9XV2*bTF_W0 z0E5}~I${9zt{Yjq_YqK}enoErkWcZKE<5UC8fdu|TgY)$R@%{#R!Iq29?yp0a(9}c zL$J`urj~iG=&=fL3X#?5IxzLxk^Ga8Ig7nHjxf&kiV_JiQU`IQ+NX+Hv3umHNW{r? 
zjj8nzrTQ%ael|$LQ{ry9pQM-hd&h!zKey~`li=M~t^LSr`8mr4sEAgD=A-3Be%b*clgyxaf~YJNa*AFL_5qx;n{V=XY5~(&-v+-Hjt|l~!aeTgtb?Rq zaYAcoums+O&BNLq^QvQeXJax8t1lG80q9CY60x(T?xZ14eywlWiNKM6q}*Lm%Wx`K zOl%LKw!@YiuN17)Fi-L|xY7E`OP$qNKuxsQQ#4nJW1-K1d{oy0hjkj?PHzU^jQ&N3 zLxdCRGimn49w9o9IlQ?8JyNVmVH{>k;@|BI2RZv7Wu%#eDIkrOUl!r-IRGhJb zj^$xp1K9xYHk4DXrl4PAdo)sbn4C${K&oA|y3eznvMLhh?j6JwC`d8f<|Nn2HMRAY zmgwTsd30WHKxbV|2mTLpk^~W=v|pu{e|TzCx=<`SayYe5dPYwQj9CmFKa#KnSQ*Y@ zhAYEdc(v8?levl_L;GGI=+=iG#by`E1GmazD6sYLG zW3pzp?SFavG<>hz*t>vi?tTfM77VszbMg%Gv8gI!>Q}cvHhTPYTkS7rGZ=c#EP?um zlG-k`KJIizKb}FJEBT;KE{t=&;0B&<_S({rEL@)<9;jxF4a$#Zl{|}KK!~(xttWHH zFdQ>Wj>f;NVNGllB;GpKD7R7|qdYJ_n80(U!m!x2{>COrydombB`lX`LLGm3W?%}s zibIQ6QBvwuA|}mB%?cas)z`gDDJ2>?U>SI=81~UVP~fAMCA3&G_mrXfS}Sz!_l679 z%GC;5n#;DyN_U%Pb;T!Wc8ldL-=3^wR6A1I-kQ{f8@%B@<~ly+3Wnl5|F3JYH-NUXSpQ$kmJpofqQ3?>EK#{7msEsamHNsb>AVij{Qv zS&q_Jbgq@FSwFp~>~8#_s|q>1dagqIU6{K4r&9bt`#`z+{a56Ob3y1Gvc3exLB}@$ zvS|v-{J&}<{vV6w6jyizOv!N+T$z=?++?1{uRmrg)Xi3dP(o%`NvPgOM^6;ccsk# zjQ65b8pWbmYKgZ|?{-9Ls3hGYBd*_Q0;}*Bc{MOnsNGrFKv)Wi#mOIts^>y-5 z+Q^M?d|wE&ujdwXjgnR#`kG`p{x#n)A-|~~RIBFP~9t`Ov1O5-_zDKoNi5KSE<3Tml1sxHDDMWE(>eD!r0b{8vX2inXO7ni8orv~Nk~lw@E7ETse&N3~7S5bV588|a3Y zM<=A3VFwYilJB8$g*34okmOK~&+8NRZoY+-hO@-xEs+VJE>BTcTHzqzoMs!drSC?X7vr*p%6@W3T4mZ}xnu-<5eq+oN^N7W|%s zn>a(^d--KsMviNhMB>O*B^)FYsVNLNM^)B~CJf1fsXz-{o-N~dl4GXJT(DrL7 zXTnTgKf;{{*^*4gHZeOW_moN~Kbc1f-DIIg(uUHZHxDO(SwXt{Vzl5Ac<0GTsZHXZ<9GFYJ~~7Cgr9Mh$gj#lS$% z-tOrz84&m@FCY!RaDo?1jp6AZr(WnocL32%JmcvP5b5z+tml^EFu;aJUM}YS41@EV zRKCWZb!6yvJQ_4NS+nb)*X2G;GO!lYy^Q1NNHbn8NrH^}U_0w4$8%}rP5H1T2{4|v zg*Ivo50a^_BB+FA0}Bf?kuHpgw)hMQ2442ucg-DN&}VPQHw<+w6Zs}6MX=8I^&Ob; zR=C8B<8QURkBD=I^z>#~GmkayNDO*Z!$>x3Jat^XO3BS*oND>vrYklY_z;&GM(K7i zqM4gf4_qmsJ47T1!~&MQD5fFXbOn33j;Rf6Q=3)(a2Zwhu2WXwh4C0^%GW~ja>iqD z#BtPD#(u=FG1YGrR<=7m?7X=#wDI+V1}Z^Z@O2K%zAb*h$(@9r zDaW&fgq&M%$q!HJkxSlO1Lmim7(0b&d3{8-Gti2v0Gu%%q-<`Yu~-AlEgWmO4n3np zP)i47z^@r-+HlSnyN~$I@u#Y#xDbN(>I^@bljv 
zWe5!*B~v)$bD?%;^5H);WKT|LT;0bxCXz^?L~zpHw&`+fvo8d2*sV+wnN%eKG}gf$I0zcEVQIN-o3i{|BPPjnJfdQR0%S~ z+u-A_|H;SsZ1Hh!+8|5Wm}*RYu)VQ`ndbM6jCprC^BCWF9wre(l{2c-x^z(*Uk;|~ zYG+*}VkW^Z44-3MCq)S5Y(+@b7 zu?{TNA6=#*s(`F7UaL|~k3hv0#meG~CM?oroT`rVu4}O1Ta}_YTa3s>-rE+hWqCAL}Gp6%5VzSXJ2mIff%J` zj&oSrs`3Z0mMn7?h?QIvRfb__xNT%G&9az13uCTlHS8B8pdHI^ei`LS>%tWO_@{ys zu&pxNh^Mzd8ft{-@_pZs^_6col^?B597Hd+@{>^OL_8+NJgGt@(u@Tg6WI)&XAA4P zVmQacryTh>xLBY{JC~%qGl`XG7%bxQY5cn8GJ10#U0vUhdM2u!iw>MAXkOD46_!go zA#t3U(@woyO!I=R-i};Kgr07Su;Hid8m`f~C^$L?JsJsyvC5*cnzO{$gLBxT1{n`I zMHmkJZ1a!>m2W1F!% zMhD+NlXot_$DpL1B{3G-nV0Nb>TNQ1S4R@o&>zP9$G`U45t;6x(uH+tSy(NFl^-27 zHjDT)M~Mq5JB(l;NCI2ogC%gjSxDk&I$3(RaS=P%$ERhCBg$PNdz2GdIpe@$L7JAx zV!}3ubh@u@BpZ47WNtQj>lhH3s;k6?ao5&*f>;4cEFHe#eDf8sHoCi$?-&EoxN%ow z?Z?*Hv9!50q!tmr4Wp`{n|T@@i?YtUX4G59`owDsba-+85bJbvp;M3Oi!5)HoI@r} zBAg`#cQ;YNJ>&x95t5cA6dnqJrZ|8yi%xG1N2t!$R%`ucORc|FU0;We;GhP^PC2zI ze=V1GH0yIP@uOLnsDsuUesA{Ov*PMjz@A+|%0GYxUhkF31?hSsg%}W;poR0FRQJ-* z1I|e5>Ne(G{NOXl=Y_+JF|s$F3peLR+$Br8FbEs1_>Gu72_ zQUno_VWnepN^4MEuah*v`9d{ZpTas~9BhGn9hzQ}1@_u5l zNR^H`wc^a7cetAwW}uogg^|_`=Qt@ZxAQpAjW|!GQTWwxsAc20%#NUVU!1FMzx;W0oz@1k5oN4bsB~JxkGLD0bFm=dv zQTBW5lQ}d^iDT^fjvj2F==r- zb*ex}G6`-VIlR(f4l++1%nuuu2X|6SjkW9aOx#Yq4&4X-W<#GtvfzftfMQ%aOff3* zMlPptY0(Qb#5RUX5IFgmtfG4(giG-Zzfc-o=p?olI)p{~MT$|Cr_rydwjO>(O+6b| zVkt?i+D;!L2tl{PT}MQan$ejp@2Jb+ympl%_>3AWolY5P9s|_sqLPb6o}!#N+|DXa zAvNFjoDa^>MxuM8kOfEPyBmRte{dHiMpv~i(^aMTM|;TOo7+wXg`hejsPv9Ud`;~d zcbyhe3^|Y19n2f~6;BsO!k+vJN;GM?i%$^kYd#3CB!9!_=9 z(pCBPo`arE6A%lbc|VkZf(ryxnTqVCGPsm(1g#Z!+}ECbvXkGuYd?oJIC0 z%mA|OQ1>scQE?8YgrErwPb`0q6CdJiyMwO06f+N*#NLHDnyAtlYkmL~zEd3wC)(^g zR!f#t$$GmKZlEh+Sr>r^k=@*GOnHwieyFMGXIbuG{2elwFBg6H%Yk93E6+ueuCEhX z_DOoB&l%shECX3-*R5M42iTZTj_x(l4Xv}7+n#IqWA4HePQk5}S$exk#n!DCK2-pZ zOdG=(=c2KJZmjk&v_ZE^Q)!!Rly#TSm{Hho)b^$%=H;`NP4W%NXv4%-lNW1RnNp%v zdp7c=nC+JJ7>Tj|v4O0Vv$EZ!ih67cn(KV)i+qTDML>mvbyboJ33LLdd~roDa-B(r zy>h$5Cf+g$A|x=ot3OL~!^zS>de&F}WfR4{tM!em2DbrvVZ2?;IHDOph_&qHxA zuR*HW 
zGJkj|Wm5@sf?hvdu?!NxaFq*6V^EnC91dGrL}oXSN%o1Z9vdX>Z~OjY@N+geT^AH&6{7tPQZ;8JH44?!4pQ(63up68 zc&^O)Oz>?>FQ;x0;_TzH-Gm|t6CIqn*P}H3ZX>&-j*)$;cWi%Wt(DMvr)!1Uezh91 zL;3Vh=eq7|vZ%2x7X;A>XhmM=m}I*-HW|ldZsE#sf`zTx^3> z%9wwrm8#%W)R4nVUN5ByCyy0woCr*0U{0LiT%?jGkDC}<{?siuT&{73PuRgmXRs#W zMrAC*W#@jUYN5s9)BNCr4nVxLHM+t%fvl40P)9Th9=;JoftaNCgvLR7#!5E_2wmD0 zZ1wt<`CB)&l_Ou^dg-j6gKo^rkW=BNXvX#~nM0K%i<#FkDjUxm>?^UigdhqV zM-7g*D*yHui}6vd^@GHkVK~`|-V22jltvcF#lw5%H9@TQs@X`kZZ+4n>vf7v@L;?r zc~G{Bp$%i{UM>2be5g3o4SV|&+ay=G+(^glosxHFxJaMvu92d^YiG-e)gdivQ+W8T zR-UUP_TX4+Md4m+ESCWBknC^xeWHMKouhD-OlosbSe1e6v637jjBkWWaQN4zZ3|?*EB}`G?oE9dG-V$PzZ+x1js$C4lh`n5*yKkLBk=Um>&Ix=6 zSa?O~H3LvGwpg`TJgzq$=GGnry;6_h=YxaG6fR>4Y7`LG#xwfiSQ8yf^Jly=T55H-Fsr!Fo$O871Byez$U17)x&0j1xYvTlqoG>y7>Q&+_;uR#J-& zU;X|8$lc_;wcc#5{#EJ(n)vHZI=y<_;J0CgZ)@L}Z0^6D@0`_nNpq>z-1ukk_yt`j zdc!goA^$??OS0L(3dV|`qQ;(fV;K{ZDQ20by$yeyNETE~IeghUPh~XrU|a!w00%j& zQ+1U)vZub+XRFb=;tK}yzztfF8q+Gf-$-Ln%AI*)11^D7S2SCOmC*WHd0S$+S5~6I zWeX1bJ4y#wVaHZnyYUz*0cYN_<-6GC@^UoUFSMP2*w{bGIr}pUot;&knDR+Xf@V@= z0!j!=FeJo4zs`m$Do{J=8{6t;4_14%hy2AhR?T(9j^k zPY`gL>csb+6(QtM;pY`Y*n`XLddd}ITd=nDcZ3P%uO!X9?&I-J&R~wEM`PO+@GYHb zUG5H0{zRP3_~kZe>!Tpt@lTs~Jf9KtC1*lUfZ-OLMI6cR3%h^Qa^;!lMUC8b`*d6F5 z#xB-)5-bf%sPLKP=Sbb=rRL=6Frk#FVXxNNVZ#kFgf4+*AtkVd(m zmyvuf@>I@@;~~!YE%(AV#mS(`+qxIuM93Df3@Z}O()K6~*BLfb{OvxC=E}@uZw0N5 z1rI4|tL?#k2723jx4z7$1grWL)o`#Ls=o#ld7n5kKy1I`PQWpz3q4#1IovT;X3S30 zEBxs%T~pAd9}I$akg8j~!cIAJTF8cMay1!KnLn-aP9qKA%t8{3<@VeJxpz zDn~`FQ6(PfaZklHRoW2Q(v8?HS)kARrTGOt{QIT&XZ35jTHLc?zN_|7ev@h(CtGMpa)XKSq_-xF$|q$AO}<*B4f69^GF4~e zOFgrZ%!ES-!TL}JlLEtZ2Fk?~jWZN|pvJq0Was}pNl=XVzMVKf!6cR9~)8q(FG zlFGKRpLBC?Q`0C^Iai0RpE1e!Lw`}uJIbS0!$Y;79mu)hj(1i;M^rCVsP(^Ff|H@T zP_Ej43wFk3&c|MUsq!fE7~9cwDqrH%1s|88l-0GIt{rI{p5MOD1gfN5dt79IoApzL zn`NdclkCF3dbO&F*t8L~&TlhmzZpI=6;u(irB{6^7h-EkMSVX;l~e?V4#yYk zB09;rfZyh-O27G3((Uxv(HkRbK{bBdu?wlH_C)j-CA%y4oTZ$P_$$7VwbG=%XaEEL zSr}7MEO;Md_%F3G1f%=D`)QI>=R+6ZlKuI zl8dDx`+qO;zqc<*{GjkbyfnRLuKWjJ%V&{g6o0;9e@D9ZjYJZX_~yf^cxr)~^Y3%I 
zN|;860VdNLU3$SM)X(<@??yOWX`oZHzU@0EM2r7zzH#jxIbvpd2Ka9$TsC{NrnuGL z|61N}$u?Ov#rrQt$P|#B`MsnzgX+URo4vfYl5XHD8BuXAjvK7?Rm>8pIxMTO$9u01 zFIrK%8-se+Z?3rc4d9Vs{dRx7q+1u17T>uSX0!Td676ta?K}Bd<-Bh1!DX+S{;ZTz z1u{SV;LBf4_-O{Yz;D||MgDW8X3JKJ>9lx@rufvzPDeG_X5WJtN81-bAF<7(__xfA z_)CR<0Jov@_g(g5L*Jbj1X9l49ed2ayH|g2LbqP~5&xdgw6;FNT|53@f3H7ZIb?65 zevbMX(D(7NCVJpp*u?+e=@CtOM*B{t4vz`-^^(g|GP-)PR*?z`Qwy zGSbZ5A^PkWis3L-R$m!Wf&o7lORNx%>BPpcg~Qi6T$;AvXgFpJPW9ICgmIiU~H$QVY&CvIu0$d-m%1?W@D~@kxQZ#|?PtKPH3MIQL4r|lDwPUjwv)pGD#6P{$ zR07OZmz>p~&5t}tr(c-0@D?7rUzN;UmG-JkQ^G0^OdclNVkS$-vA{+&JN~%Fa5*-_@*BZ2C4u1{ zcwO^)d*l??B9a22;Ght?h|`|z00_h*Hdv9zC&S*RMvbD;(@U;>`qhgQm(f{2cHsno zgSa9Dk`6exMc%x}F}lK1507Uy80HDHfSH-F2`n4ne*g?BJ6sS6H5f@-sSQ!EuR4>& zC#z~6TD?-U!y(%X(lp%rrIC^K$F|lY?yjNlM06f1N=G<-@|k0CYyw>_?(B5-F`9&h zsM+t`YN{U7=<#f5Z1d};P;Y)OqRCcYaUU^oM5vET3i9O_7NUCh7(DQ>K!hU-rh z@CPh(aRi__KHl;Hz}iT4QdODfrX7tJUmrbASyt{E-%Ux5A-6RWx3ID&PU@|ekl0{i z@l#k-G^=pNbz{`wO8$-f%!pe=YU63k2iDUQmVn)!#C~?Sf z&OwG7p13Ce7?&3IMAC!TR>=jWcm?TJATxnvSg4vao_%qW}d_Chh88u)uFdJIv$+5!3wddDoa8!WZyaSoAb4;_5` z!dF;vjPv^Q@JznOA$M|m!%8KWyLL%6qk?$m@wf^G(O5-{jW;r~dpRf8w<879p{e*B z-9{Gc^{hA$W(uYsnLupNo95)7;(mM=e>(Dg{L2L z(}eafN(INSxZ;#Y>G3zPe`cRcrT0-ztCDt=$X#F+_uZg6VUK=F!{)7nQwr1^RaZx* zBP1L07}{q)XjEEv?(%4d?nf15{L<^H2$EbNx-b01hhL%Gju%ZMrf_vcnMjf*G|ndV zEobfFAxmNI=)r6!$WB*4qL45QF(f7Qsh2R4LKMimW zM2_-@ZN1NSeK;c#F}1-l6b29`BAircLP9-8ec~RrzRD38?LKgp4ojQ6X0~dxm_w69;nGnT4!I6SfhwtBt>uv!EQR}~hXye0>ph|tJ^9(N)_TfD)Ox+f zyIvs7O(aR}IfMh>TTV0#+pwdK*scl9D90&m$|Qut#02bcZ`nV-v4m=7>>V1 zOypELz<0w`TvZwUsul0d>iF+PF@GYvh%R-wA%aCX^5bE;hC7*}T0lI<-s-i+$%N-ErQV#ieQ)U_lc%aswIzY}+6_ zf5XNn^FsoUPnui`58jC?SVF9EgG=X)P}WFnpIS*9@FYd2=>2vYNBjNt3}|G%9@` z5UDh&yb^HGLwMOC?haw2D4u#XSU; z0*IQ~eJ*2@L-mYyhBF`^VLicQ^{qDLkDKexh$|RHt~rvkm&%!A%#nB2ieoqJ$W$Zf zwMsOodV5XQ)}t==TZD;aocdu%cSs6A3(^5K7A!F&OMw!Q(YXuYS$jlVTa6!{OJQv1 z-XRkIWa6x=`Rq=I^CFjaHkK%VS(wZ^i$*aRlumf319Y5R9AQW-a+Zl!5|yQ+(t1v$ z_OW+EVby4s&}Wc=@O|r!PpA8iJ#P0+K`T2*6@yKj--;$DvK6vWoFsxP| 
zMzaFLDwxtGakBt0wH^Zi7$2+`rb1oyFX;lE^VZlvn$6wK!$*;mF#@^Hi_>U&*1#qe z+SqI0?ZoEOGGmq8^$+YUws5lJGO{*95`~5|Z=#N(^wMy?jE3a!Q2RM7{13CFBg(8a z0zr|sUt-R!q)o1e1{6TGU|;Ba6|>W1z|G8=^uFc|(&KLPml5#?bYJb0eC+&TLDOHP ze_k3t4*M%j`75>eWk<)56NdHJh=x6TD5YqDeO38yMCYiTq4-)vY0D5DdgGw1KQP(E zt_~ThaA^dRsW!GZ!hu?B=S6{a#=k<7?fo@YVpBNhkb54893?OI(7F&#szH#ztC zrz@1|0`XBF18}yt_kJ$$LnbfI8IE&%Yto(8L}$_{$ltNC|TuFBFT`HZ`X1`y+w^g!l zTf(T#EZpQLj)@iC%?F|1Eu%a@N*~*NBQc^?tpjIyIU*)B zX!}uK17xUY7&y&nRqLqQ+F9$qv;U@H?}8}5v!9`oJEYT%W?&zU_}y)(_C(xGz>=# z#no-De!5v5i?lycD<1Y9w&2Z(W-8=Q3GG*A1O^4(z`6K5^{^+EkzwWY2$@mQvCgUe67e9HD$K3IMyo7v z-BoGUV?8nd{<@r`dkLidXYqd&!CO287Hqqig97ViU z)|lG&7r18s-y%3$Bz^Vu><)eC8P2V9($8hC)HQYLg!WIrQ0kB_bI!jTb&-wg+)NGc zDvfB}DmS&AvgVas3bXe5ul%C>8ed=oG~P?s^QrM)cP!nSVdw(9 zN)w1Mg!erN-9+;e*Q`Olid`k_J!o43Sn zj+}7jZmJ3|`+GU!5Ha;X);GVR#jV@c_EN=D++tr$uF3yYh1_C2SLfrS?Y(5)##i&l z`)XXQZoYDr-^M9j`V0iDR(+h1KdbXK;*PkonG&nn5xr4rHgve2%AXQsZ~X=HURhKj zPDx(XFUI;b{J?!guH!-`mfdVKi5V)1pU!Hy0dobd!0^UfAh_K^<$g4%pW=nV7O}OTiZr@voCm!^{+bh zUCOHO=R7=rux9%!mTv7izULvH^2H#a^_|2T@O~<4@~SRVE^DyM3?bLsMo4>H*~QrU zE8F`U()N+uReSsM?XR4;wzny}HI_F^XHxz5bfL#H=jqFUe@}II5NL!-fNn-Sdq z>XzN)qjs*Xew3=VD-SQ1%@=da@<&a5-ieT>amj;5#_JGcbzN%#|29)MUK%w2bLl ze}Dbb3j62sYsrdEu1)q719W#B(eTQ)GU9gYYogyp&oC_S$N1<=hf6HQZ8>i`KM*RW zClU`gt90*nlK=$Yh`GX(ZK5ne>9Yzdtz8GJ89X@*jF0wv*%qTcO7fWBnjjAN#{WRI zI+;d94W66=L)nn;#`8z}R`2PJB9~(lZ_nT=%8sHCqlsl2K|bYjTCnP+uRc3QW5_&z zyz17drbT|=^wIlB?EcH{_5Wh;t)uE{o;6`&NC-|yf)kwJ?t~D61qc@0-Q6unaCaxc z-Q5oE?(WXP;oxNUf%kpy@7}p<-S3ZC-xaY>8)2kC|5>fuiF<-Z{vJ_5Zw?U)<$_)mbIHXPfXExU++=yzjPr^p7o4c+1s z5YpQ5ir+ibJ6SD~MQ=WE;xGRL1Lxa~cv`yZ0`%BukfWBd2TZ4ieY5v-`B6K4%=@5E z^_RW?{~{6Kfp)v&^8QYrL!Nvh;4AwTtb!V%)HsSq;%wORCw~G%TkKnaPa1H>&>|$d=;$NyTd*lzqACwF3Wm62J-A6)#mB2b zVEx5l=jc`C*x1Ko<_B!Cu;EANN7Y$tuMP13<|iQV=)4H`^wv(8Lx6m2Q_lwZx-8Es z0ST=gbL2pXTRfa!rlT`BMOr1m7eM|w*cstiIy6RjjM=oeq;nGSs7yE5DRoua_m`Li z5ZkD((g7Isp#x@W@pSFdH^r`zy<*|tF8o}=YNuQISZbYgKxkqI*=r;x0e@j%O}Emq 
z)Ld(T(8LX$*PEoo!bbqwfO_3uVzGlph&n}P_jY|dudoVO^-V&OE4{j3rXkmQ%A6cdB>>z_NN`c>1h^np4h2wWL_w3hP`e_JiX}-uq`x9^v!n7Fkl>A@ zYnO4IY7*HX?V$j`fq2WoC8+!ZF0B$0_z>3)oZHbAC^p}M^Ewceh|q>MASnNcSuzeI z97U`|c~9R&yy3z$KNa)MLOjMSBzFP4kC3{EYX}2zVVIu^___l@sYpJN1pY}#KV)zu zvMyykWTrdZxD5Cn`M|^Are6(kQoMcu@#f0yee?rzEh&LxU=J|6$(Hhq|3#VNQCYBh zckkv(?JwoOX`q>2*WKUCKgQ{Sa|gh=UD>S&1d3jzRFe2B8~_V9JHnZB-G$3yG|^G} z9`20Wn8=0)WgOmudeG~?l}R%mDZ>us(;jv>JWw8ueU!S{;ZL2*HC`4wk3RTED<%Ss z|CZ*yyPWcp$VKDButcm^ByU1z$Un+$M3;cUS{m{P2TL}K^Joq;ZWa>}Z=ivWiEI=E zNVg?4<&-x?9y%$g17jPpKAybG2Y}#CoC_wOr8)TUSxXQ%_mb@ zIXxKM(){`dMo`^_eDSOV)Fd*R8aZa4a|_I0F}4x82YSc{OQW&mS&uwo%+vB8>=7*V zj3!2oUI4lvE(G0cs>XK!^3O+PvjgubesXd?ag;2+hw4*q1dzNOc9!t91=uX zR%sj05to8H;JH*jzA;wTFddL%M$0AdG|Mw4)|09%#DAirFOm1Zb`xgozkL5(sCbFY ziWi}NsoS7@{l;f%2vpV_4h|GFpHy(4P3VB6U&EnQm-s-BYtKuCceogi%TJ--()_1IK`-GH2i&NczUh{oy^- zjCX);*WBXo_fR89>t)sdIq#vu+yYri{)z8!msYwudq0iu+1Y+~vX=F`$uIldXBVZ2 zl>dp*@iqU|hk&Uy(|zkjXI%?>15-a$9-`1^s@|(4IMaB*C1I|ue_(W)kdMf!sE|r# zBe_d0gIA}gUa=?5_c&YEOYCEmAA4-^v3~pm!#3~sBv;a4&Nu~y;yNGW#@zy^rua6J z&s62I`7||=&^_^h4`vh+TvfY#5}Fq~ck^SR!+kye#%QY(F6nmW+3k|$C2VYkY2^@H zl&fsh(md?Jhd(eRLv!*QXNXCKrz$9?u3gL6Y9m!s-1t)GGIt<$qS12X z>H!PMfeV5kxz1naRE`{lCd46k%WRhFS7dIpb?W?%t#cDC?f}Q9S&-023VAoKM+A=) z)jkdTm?EEBnE7r&5rlz)4FEw7ML*?PYC{p!3I49%Nux;ha;;qI{O4`Ue1)<%B6GM$ zR}Wfl&;QmEO8eTa9wCRg?W}%8lN{G?WJ0 z&w%_2R-ASO`0ZK?%`%h8b5)sUn7qoIW5%zL&PT;t*_WrE1Pe#b6E5|Ru@a;dSrWxsC=^nI;a2q$A3zie*`6WDbg5-_)w z>?XfeOUa>6NXsf@Wk zb46;LpcQ0Wy+4enGhxOZwLz|&Jjg9<>b!=zsr%%kJr9J=S@RcQ*tV$AUhE9{?8${q zE#}fg9DKY0qLZ~#HlUA(9|`Qm!o@Ee`J~mR4LG1-h_x3p({6-uU4Y_4N1$Zb8X7jW zhKy?&sGRM6;<3AZf9l)|IOgPMh1SKCyitGSfa*8J{)7CTSa>g2RAb4v2b3+fk-KLP zD8#YNspV(fQ0iAx?dlIyB)d0{Y_cFNe+6Kb?w9>9l=`goh!7nB;9>MOmcGr!-9rG{ z5n{^{tak_tK&;f0-2~214mB@qZbQ`J4z41==O|3qK&;fm?w;LDyDl(fL|K#0?BeU6 z0oLr#SU33^eLH2W6kQN*xCHv|Jtje1g#2fDT+=SPd?Zj}QR<<9&l~F$G$(+RTbEXZ z&VymDg4i00sw2!BE#wRrl(k-ViDBpD8-Mq#&=wpH+L65|ahP=jROM!7>AIkEDuV5f 
zDg}aqBi^Hn+hb=Hc>K^m_z#R|cB5D4=z5t0I(Xv^l?-baxJF{RV9mTj{QrxrAw14n*Tsx6ri<00Hw7zqpd!g zfB5nE!PZ^;Uj#~vcjPVrP?ZG9r>g&xFonNAR?mgy>t6|ii_Rh_xDujLkF>AJ=& zC6CC#nQVZj!CmGG@dz1yNSDG3obJFFn2)r4sKomU#CXL(=JvPEn8eq%l2UkLVrz{a zRR)u}LjP;uEXV)LH_K%QJ`YRHTADvFKE?yJp!X%cSH}=VcNy%!n-hWiiSKk3pkWTA zMsK^~%h$4hU@CAJB;0^pts+P<-8`M!>W3s?oZ>dIo>0fF5;c9~nAH0Bid zEy%vjsWkj+a{}B#Cvf@tbu^cFt>A6>jUdx)f%slP>3;s9^%|nS1>#DTOXM9wyx;wJ zD3B%se3}?%6`0UW?qm1^Mfw>L)q}`~hi}C}eMjyG4SmY$Kloq=PX53g6{>x@dC!-V z;(G&J6BN*0fLva=weh%S8|bkiUuZNV!M&f7edh+UfT7FiZms!VCAu;@(E%9$oMQoeEL_q|Wh5Cj8&3?&AS7F2kSSkj} z<$(mKGnp3(BevNRAgWjr5MA9_o9vq0mK;^A^5J%#XvQvxGLnv}@cX~)ApGA~oImkj zKGv(3JT01<{w%;_?L1NK{xmG_Y9FHXuaC9+Fb61?xRS3qFZ=G-On}{q-Sv`DL!(9G z1+ZT+x96Jouss14K5SSdH&xF*Y+;a_7eXF(ELcrV6J}_EC%Wz7i8h8j`L`!}=`Mj_ zeEy#&`Ze%G+x|Cvq7UD;Oss<=LoLglT9v2PDcdEG{ag6$*}bVYLc8C}P_u`{eG4XK z=xb$wZAEHF1<#p6`tyyH1%+s|JLWJY9C23)X{+N6kwVLL=5W+vbv`fm80Besvu{3$ z-q#?}4Wq4#SNhS6ZKIQv_=9;ANNgdH`szqAZ*w+AW$5Me_WE}Q3PQ$*_nz96Tsv&S zV835JeKcAh6S{Dj9DbA(hky(7D>t&hMKhRTU|)gGr=z!~jU~QX`_|8QlEE;wia;VF zJ6(&YYlZ_|%d<+OOL;s+kNxnHV@aA$5(bU(jwHDHZZ?d-0$QcmLR6zPu5j(Q$y=!0 z?^DGU*HKEz__;ff$v!V(n(<72#)_>N+7g=NU?!5ZO!9kKRKfO6W4(w;4E`{Eg|;mr z{2Vy_eiTU2?i}0~3r@83MSZ?)YHD4=>YW4OSSci%^wk{yCcKa>g|mdg(D^yoKXX*m z!BNv%iBuN`;_sT>3@QUM`{PW6l(HxBtKi*QZH&Y-i#a_J7}b#&q{63=c8 zkeTA;u)+ zVz&-pYxSaMbOZ)j>dsCz<|qDiA@H=98IbPmpQ`mny-gYw@L)Hg7A*cfFIt@ep0N#9FYa@8bV z(_Ni!E2KeVc?@y(Pqh~+QUdxPpm#X8;Ga{BJ7F*pnoQuqO&g~1s|Du4_fiQ>- zwk_`Lb-tr!u3ykf7hBu+Po;~360kEQL~>m(dpo-WJU{^(Jw@mdjUi+!o$v=3QZp1q zInR}rYrNPHs&HZA^RS0WBDiOd=em4xTnE!JTwt%%Z8ykEK}h+C$|4`HaCaTZ->ReC zM1NsG1eOLzZk10ssu4PIq6ga2MReTX?)oVb;BS1HpYzIy`@}iihG*fy(IK7Rhc2xo zB2N8DX`@ULCIeocyd2$cdm>CLT|m2O?e$M%dDCsug7o!>Ry$WQGD~Z(wn7~9*+kfC z1h}`Z!9I$)Z)=d0#mcHcx`z4hsv#~FLB?;lRo}9pAXRv79Cji)q$60+5@UVY_Sc7v zsp_>g)t2$bd~fHV$8z_Q9N~b*oE0d#B0})ZubZi}U`^IF5~FRWtfhKRWar_w*o*QK z@+?lr^dq%#NyIy%!KwNFTdfsL=QkGnk*sjTSpApBKXiZK5_(uWe!;MMk*vM^iCE~NHD2&6b%m!cYvH$N=%>fFe3(*7axfOMVlCVc0?46 
z_fz&26{5=2pIoe*y+#HLAyhFD_za3^>uiLh668-8rcLb?C`D1UY0+d&3ezjd^)cki zQzqk7e{ih(G~Shl!TNlLbR#?$hBwhk@-$~Uvfc1?m0X`k%PbIy$rL{Bw|ahk6iCXq zT9)~WC{7St?PW#U=Vr)4$Oxr_g>>4!pQr+hPu2(BRwhxdfR|>dqrtIwc4IMEb@JS= zo?J@2L7VkNJi)51EM6v3+sB95XFH0?*0L~)$rIjL^B=_E#5G-eJ7ssM?ftPP(&KkW zJJLE1eR#^NLR+150)L;KP0>vkJ(G)&bsZ{qF9r(D*sg?Ge3gVn_%KnWU^B9P$?b8n z?-n2ZEk&Q%bSBSDPjC}XAI$|N|KjjyBgd&od`gXU`W!8%N`q`>{^^RX9)qMH5kv2G z5@Ncc@htW#F7B~O(kQ~d3W||vq5rz$0Wncn2gyLd3`w|eEc=;9d~<$8qvL{yj5Uu5&;6jSh1@zoT4QAlrB zkGD2!cj7Cy%4$X3;ODZ^GOQ`&GCi2=p_mB!KiVyoE|#2gu)Ko=((Z8eg88*w5VOG} zVN2}dr>v2yWq3ksSx6r{Ru#A~v?=;m{Bpd%=;{uyh>)0B-PkIRaKt(<-glir)!AY<7tl3xzzJx<%N%kd2{0fk$ z`AK@g<=3cMaB6vz*_3fpD(nc;*t&$ecWDe$XPrf7WZ?S_2VQ|G_~(icMbhXv=g6O9 z_!b3E6zbnt9{&h3a}~~H6n<}_P-gi&VwH(AQBk%h9`8vpYOB7>MtliYWudmD(O4Ks zQm+Byr)e{-<+^^(D8-*gdOW=-Dr9k5Sf6gi0*SwoNactm2=y7`YNDafMPjeu`80xr z(CcZ8iNwQl?cn6#i@vwih&FHUDZ^U_J}21}8A9#9v{#RRDcw`0yemd*l9icWX@G^R zT!z~mZpGYWlxB3?OQbUH?nzUp^u<>puF^V0Am+R}DDG5%mDJ(KS2XT5YT?DU=nY46 zlCK1b(G6V#f&9dmB{vn`<)Su`g}Y@hKZn78b{yHPsp$Ty9~d-y@h+(V&U=a1JXyS4 z$Jg>nIa_eDM@to(v9s2ygjO>f%>i4t(pD=+3&-mLD;kd49Wi+{@POxO*7v@*Rsu0u z>}J}&DaX|Hu&WmOL1ApSdZqU|jdvLZJiS~M2|s%`>FD#b*emql+3FLvzTlGwRei6( zd5YA+{>9O}`@Yu>oYlWQK>hoOXVB2n(9qD>5~pf&4Oy#;@2SeR{i|)$a@ok~1IB*# zHU)HI=FnLDirIzyq>o;b3jR|HlmzJ%zrMwP)+XAZB%#&$*^0`RX{zaD{bfFLaHp3$ z>@AGiT3$p(JL0VN+fHh7w6^5IPitIiE$BYjMADm`kp2V8Y4*-#^LLgS{FZPZW;ds# zKfKUXISzA>J_C8O6t>)C&9fez`d3=yalv-Y8gzuIxF@h*EA&eYZCQEkP)HWppTgswuanlNON2*_^S% zXyfz=+8~dzTooqWX_X7IwmMB(7>A3InT}tjA2a*bWo#sxT^bSJWAQd+F1a~oZj)H@ zW&Nx&FQsm_DY1YEP4Fa)T}PpVoyxvt?X4W;7CW3cT@uOux*?9pCAKlL^RnUqhqgrA z?=2*4!*MP$Jqqeb)>cbo0T<7Dp_?~RgJgrlU)o+Ng~>(iIo_u2}7yVL}-S5gvWt@`$N$#kbF=6E;X}s859JB0s^X=e>e2 zO1VrFqK+DE97LKzV?9=vj92VZDQ{dszu+7a zLCcTy&#h!dDZK5u^U*Z94i+RM=-gkDFx%WkGhulN{1AQew4Jx5uqtJHcJYNk!oF{r z_Ja8y*|T@rhC!Xn*ZJejynHyCUy>U`czfRUML@o1vC6{tk-o=@iGmL#VjzVfjS|s> zH`SP(tO#Zb%w4nWk72&@^e)uaE{G3?VJWMDVTNH}PxHIVTgjWVI?0m;mKkob_Mo|U zvb~wW7d15Z4yY(&EHJ2($$LYYoF>+WVvMf 
zfC{%Q*NZ%0jRRN_NCxu$>JZ%pWnT*jj-E@LZ3#|B{092$bmn&E_vXrFn-jQ=eg6X! zy>!CCV$N2FPs&&3E@FRTK8@haFC7>aA-BbLpT)Rzvg!5*#wuW!Oma;1Htt$(_=0#- zg!A2mY#oX%$sABRwD)T_JWu&vcT1#aD(vD++RQ&TSvuJvv`0OCG9v*0$X+{ej7?ZCi@Y{{v$f6KXsswz~V@?@YTzhNKUj z9OU@xF=sk8bY10sNh=smUbG^J(d|oNd}LoNTc}4cMI*zWKYi`_!TbGO*-df$_c7M0 z5GA%mR)wTiN3R5Bcg|590z;8fNF(i(|L>F0@eX>!ZVZgsBZR@`m>EUIgvL=^Hfc*& zMXi3v)SzBOMjcPX@U=00hc}u}43j(cUdfQq_f-UobtmM$d;gy~*+OqIc3X+01zhWR`+=gaIT1;^UeqiRak$;mdxmsn%%MX=%6#2DZquLB6-@+LNeVJ@C zIpDYL##WkIN|CQn>$%08QP$UjaHbmId6WqnHZ+Lkk+^EueK(%N$6~)pyA{k9tFDq< zp+yaH4dG=Z?D@7~ud&mTF7;EV`8R2}T;|}6QLiZqLrC8yrOB=sX_R`Nq{Xg`@LFIP z^`g~2?Mt~yJ28KHwiwRTz{bueZ|XbraJ1%eoDYefd8PDn(dmRB2UJZqn8fW}33{Pm z?q7DfhNY%^(Mr8m=4 z>bG5J6|8l0RGQYB7;3nrL)%f5f%>IR4=-$$cg+US@Q;fd+_f+iMBilp7czCqG^R@9>*7A>XoZY;l7Znm$MK7Y-gP2Iy?yf8tMY3d6;a ziIQt2lHqT0O_x-Ep(`84-6b!>wC|@L62clpSh0!QDojpfjBZ0a4-sF!AD*_YC)R6G zhELV~CM_5k$S#gM);K*s<@IxO=wKeQYYHNK#XYz8I{=(hB5{my3{YW#R$? z&;C|md&Bp07;l(pFfnOzJ=t@?BOP_$-;}T})cZL6tT6Sct?gFx zyVM68u@0EcM(R(=%E`m=%>RxmXMHDjjD|>H7sHN@n1Yz-4I7A|vuU^jKUSqBR|JWj z*3&cgH>zUD{@pIinB@iR{$R&0$HbgQ{tC-fxUVc>h!tVd?e{Z!^VMCgyw)Dp%f%y0|`$S zwRs~9zCL_|LE^s;(wK;;&hr;W9$|YoRmwVIVQn}nEp3U;-)H!>t*TwyA}ZyT>uXs=7_?eOO!@uY2YrS?yBmX2L`h*w%0w6jliiki z*2ev`S9*8{Q!g20Kk2>;5v1D_NWF@wx ze3{uDXeH`D48u*5V)>oo)3%7Zt7te_+t3anUc0 zVmIJ^Wu=sof`$d$U%e43G)hgoD0g!gQJOQqrJ*}Xg}@eb)^bwVu8E+jW8-5t{CwJn z{gDq}lqqsBwiG}3>@>6SlcA?3qqS~lJUhdyZ>ev#UW7YDzM7fjo-O(q8pY!&*lzAb z$xni9`FXYzoua9Wp+EAiIqnYbX3eXPfw|MC=BN&&c3)mSPm?n~78YDA#7W1nyYTj+YQL z_bwpGX;7`uurvyvpDQA1UjgnpQ@#yE zX)iUoCWvMj8=qu@Ve!{U^UgDs9lx2NJb;;Q4rzV+IfzAz|Jxv4aLdo;#n}jW)~34^ zc|3_h=7Y4z5X@B-6AL8TS_4C5)8;Bgg>=6!zZ{qfGr?Wyxot&tilxe(g~yPLW7$E< zppsYFg!n$>p6d#YJ%NPPT$>IymW4{$JNW^g)U%|&%0BPZbe;eiGx~yxU97(wzB9Zt zd=9MAUFkSRf6bXmias-E@(XI&Cn_K(iy}<5}~G$M8AKbRI+J zl+U!UR%JI<6%K}}d2-ip#X3$}D*4W>c@7VcX?{X4zZ%3H6F=4=o{!Q8EUj|E8VRAx zGXwpX5K}U(QQ80;=Hz&7`%WOS34^xEGf~lI7v}a zzDlu{dQ~%ry2yMHesQ7&moRPWo?LtRdXF{gYi?E?OGimdM~x1hi^iPlU@v5V>us!e 
zvC#Dw<*ZlwSq-+iue^R7yV;N9TlJ&NRDNOf)$QS25Hy^;R&oe!j@KAug144JbuqFG zl%Y_nvA4G?F<~6Ws5<`|WsBk4B;E2MVVK5>OI~Xb!ja-J7>YHLk-;D&F@yI^?Sj3o zr3;516{e=!I%$Z}Q}6w95&4Hq36(@|gg)3@kK&h8mMs;VebF;k7VE+3CJfXz9BB7M z6du`rPZafkm<1!lal94z@$NGgV-N>RjZI(>h^x>naPj_WL1EeDf9aDvCO+A@r)JO-_B#m}io<7tZDVZ`f|^XO|S$`T?ZYLw2D4(H6% zzP&Y8j6qr##YE!hlPI-kC#1NMmzE?Zp`%9cRt#(WVF;(Jw$*X2#$+hXPmSa!Y`9`l zo5b!Po?(FW19tA}-f3wJlA$r3E#h?Bd_tD1G~<@1$>|ilHiaciy-RG;SIYWZ>z&>g zlOC0>fmzc9euRJh&C!H6SChj5w0dvs(&>& znWw>35+KnNq%qeD=fzC8P;eBDz)sTlCPwOg5ol+@(Agtt(rRuBPurpGI!g(+pEk1@ zk=&S3Xz;eu(3oBei$tycDx8;mg+^LLy7*`@b=JiGv(#gUP*e27*40SXv&_@ zD-DrM3)UWFiBKl)_O;`N0M$N*7GM9O5{eX_9Lu2)P@8gD{&n2Zt9=#cJ7KlrNGr;zxf zV-Y4<7GWa*%jOBS5J&eno?_ZB$GyeRBC1Avz6yW-7J_%6>Q60ai+SiHLH*{5F~(5x z!D6K#^9^rE5tZMoL1Phe3?*c5M8z(*>N4R^HFAssw9#ib8RT)sPaVu~2i{hhUB$dD z_ysPe?T9O;9ArFGMk#u_VHh7Hq;*;lu(~~`8TakSuZj(oW=zCUSDaUeN%%7Z@8>)S z5zn6Y&sx1S*ntO5P>onQ2vh>@DMBR)`#o6B^jzp$Lhbm0(~m0N@=vGVUHTN2=%*+| z7kxy3CK3*+lN!i$qgB*3v1B?vp{?4D7%PE@B&ZUxE#_jT5`+9RlfL+ley4>$x-{0qoUJ z<^n%sGvc=E5N%~KkmQrUL9>=OF4SwCL^|_3u0_DHJnIwNzC})Db%^g?bj0<4Pr}dACe>s z#n6GGz5nD7s$lZTnZmo&wobhJt+@O>-|0UvsijM>O%I7JtFG}tjpE$OVebSqvqi{u zzw+4ezu`Xk|I2L2_aQ5i@_!!)^+(t+<>lJo?}}6xV~b4j zPY}yi-}VhgMivttZf7iD&~avR9uq`>gbGDS-Nqx-|NPh2PV8KAJnL*_OoV_Ma;`wP z6&mNKZQvSKh;Z8dt;HXhO1VJr)){i~uE@UR5#_^yR7R-{sT=O=_6pKAro- z5^+yT5K*7Z&N-`H%t_lQ8ylNU_jB>OQL%BZYiJjl8ZYcy4x1^2!aKmV@V=fXtMYx(wncF4EXXaF+>MF286en=Wfjz`t5DoS*m=>?4|gn;%yXPWe!TkRB85} z{^|Q-$oWvnsT^NzvQW+|&1=cb{bKK>pQ$|pH%jV;#oCgSLz{x?`+KSX3M4kCjsQaB zi|~b1G>xfsrWU652Y}|Lwz}-|0`Iwdc|K|rjJnd9x{%#c--{B=j@@P51@kHM#lLp1 zE~=gw%q<2nQJg;sau+#ktUQW6D!-xtlIQo?z=qc|x83=7jkZmNl_IWJ(Ly=6y4T;b zAQP?XtwZa-?+W>5t*58X%df)aZ=6n>3MxO*Tm=gPNsd@dcNC+Lu~x~y)tsm2plWhA zPA6D-=j?o>akA>>M+m`%d!Gx%_tU-I0T0`5lq|vgU<>9MP=7%2+|-Xs!Ca{;>bkT) zFjF|V%98;~iF17LL%Zcy?5={}AjA+3Y?Gx;LHYe|it`USZLG&9y@C7pPc1hZ4Jz|e zO_GRiDeficq)?qPZt#&<&;C>N|cvA2=}hu6@6u=<7EcP;(S6Zzqq@ul@ZR+g!fL!12%KrLj+N4h;Bw z$K7TFLV9Ohzbq;f<=`*;fpPV7a1DHGf2!nrKNGk|im8#*V&=wBox 
zR~?&}!({dcrs-CPwaRhWoPs z`xOmum}qFK4tO^A>p7ZM{A6wWu`?p@1vQM+&8L*qa3{5c^#=w`ve(%8_(XB{7X(iG zA~c)40K_@Gvb7s3Tz#3Kl2zT5N}7`A)?yNJe71fb`#(3Sdd}dj?eqYZ6y-IJm+7Mv z+~Tw{N^Z`@)^VLFMx`DX$= zb^sok(=_3Wyh$l#TZ-~fKw)liHi|+%Y-le#6dn6VioJDzV0g!YegT=iA0GN;gYml$ z3B>QqZuj0VL7Py3roSP@BNyEg?{o*tXioQqGQuwNRMJXr4NsJJv%zFNhkQV?Q!iqP^9gBq^P~y0JWt?6-%JFBpI8t70P+nOlDT2Vi5&vl#?80m!d_un7=+OekVz*Tluxdi8`@$=rvA zJM!zk&`Rqm;izg4zU(C4kopIgT;RB2X zC2!S*ie?A|tn(tkR!=0D;Q>;3JRlp}!G^Dm~z_3VqGKD*gb(D6h$ zDbm$uc23v>k;^o z?1K6BF&YtIiT*=%5nyVA#jY!*c61gu6NB@hvrXEq9Xxpc07rVKlyg>QZY>t{*rMt` z)3^r&SEb?0)YWeK=uYDRACBBZ-0NSLJ5K@bg1AttQv!HsQ|HGlqg8>}?n+KKCN4Gs z@f&QX*-8r)%ZB9s)gnG!_}Fqa?O=X7B9u9HBU>|D1Pn0{^naQxxKE{aZ0;v$8M7Z^ zbo1uXQtkAAY_X%$yLbqZb;+>fW_klT)DjPrgIK9ESEDO1mrhGsZ%?Hb;UT#-y(v&b z5Ib#Fdnah@xL(l>vEVI-8pUv?Tt09cNPci=swBeY zL~0QsGbFo-06H6(^dE=F`tOELS?E;GrR_91c#J5zWQr(P3v=msY7h?+Kr8W)*2$44 z-D!uABN^0$6Aob8&f)T}(MA?Zq2lTRx?HpYP<+~U?oQVd04q4+q1Slp2?}{Ao%lXL zz!p*gQjNNUEn`=!=?#(M0qIQ-@?GTfv*bP6M$fF%79mHfR6I$S2h-zltB?8^j?3wd zuf+p?-D+-|OREOY-0kgfO{##==u%#CjmVbD+P4iaX4K=9-%O3#0|Wd_Y|As`OukwY za2v2d4z`7RWb{k~+Mr(jkLhcu>B-t@Y)NiR`!wL5@0>&5-$0yry@0{6E$(ZMP~SYq z+Xk7%P=!3Yz3sd-Y7bNby&y;&|Cb(DFHoGXC4i!nAMa0{LLlnZqUU%=XKcA_+S^tlyhjCEc!sl}Fq++sec+mhWB4;Ew?86!HA*4LAm zJT2!d+civZW^X@w=~kKUn;^+D+})HCpyGQQ{Pscf!CEmqirgW^lZIY=`v2=;vU`_o zZJjMA`M}G2!NbdY*_TRm{FP$NS{`KX(gjVtdRGijO?(aOuGWhWK7?N_8^*=Bl zuCFPz#Qsy6z!mx&OSf*5j}@@LK8bz32jSBP+qM*k1>PmdbExZ?pR?9DpH#ZH}+cb7RzweAegs5nhH7TKxVDa@a(sNQ!0vdJ zXt6*?hMvsh!ZMi!k2e!~R;oMo{RmJ`@&UtXwZPfvX8y7j$YZ=TO(g-c1UhCUi8c@HxW~Z$R0py}zdsf< z2}OR#vFp&ZEOeV3h;0%otznU`T6PuW4lgZmL&tMoY(9WXHm1e7p#X9UxCw{!c^;$Mk8@7ve^V)IB?V3Sp@wyIp04coW zJ~0KE8<78nGV&LSFMuNVfTE_g4C(P$V(~mf_ye=-WNx4=2I4H)%j>}N>fE!8HyXs)RrC6}JN z8krl^3xNdA8s!vj2pyQ+rR3cKzn?LEQbcX6*F&ubUq5xZB+KQ7n|-$S8;DPVGw+ef zeR|RT>l-aPw$IilRmYiOH-MJYQRQq7BwBq*6VUqLn!?U`IBp99A|NPauxa&WkvWah zHC2?{9Z@&Wt9f?ixD*H!Ueh2qf&H%N(283fsZ^#&fN6svrhzhV%s~p)&9~$hs>xd& zb+KF(rkQLFkfDRf_q(Ia1O2?- 
zp{_>bc|X_7Q5hO|7AyY;1~XwC&g5mxF2wOVdfu&9`Mo&huA6T`xP)c2sm42B3JKl7 zYO@ZjMDwy6E)}h|@7CZR4Nd2yQJ4j{Otx|d^KwYCicPp6lR9w$LYwAVd4Tx1QU6q2 zaEYMB;?%ygvJr%eeBf<(283C}Xmsw%0C4UF<78X%UA;drln1%B zuafppvmrf?Tx`cva}#f~0dPj^>LH-eSybYJJh9NWzLfR$>(U0#1pBYBLvv_}byVYl zB%9LSR0MB^`AN!dEKy`Kz`acR;sM@~c=yW#ywy97IeCB#8Dzu&_FU?Ml*C zEUae`8nVA(%6QWrVcgjz_d{YEdp3cp>lp&6M_o`7>3#2R-f#^=8`_jE0@?1-JAR>Uw}aGhP!c)#jty9u zVgrhd!)ZXMQr84P?Xvk6LMS|??;_v8WbMp*$sCG1lul_VL}SSq6r!jxlH`;#6fnm} z?9MJ{=k{pMP9m~@mrvlrrZaraL9G02q3sqJ+Q^JYRW?$0_P>QEC*Cnv$&%Y@jHwZP zN#r4iR7g2<`0RT%oFk8wUHQjH$y7 zKvlLo`&ZU;hMTUdx~VN)zlQ$&GrD);J?euP5W5Fr)@AK#dU614CLlxy?>{=crJM7h ziX#>QjOWa(8#ItpWoNv4auEOHiuw546+XS48(#pGi%wuuSKS~LM=X*LP_27V9qb#v zikF`y?;K?w!f91a%swF4+*74v6flpMyZ2h7w9-HI_Lp0vY0IZ&z@ayzFXxPP+kIbH3 zawdoCffx(_qpG&VN(1Rk+=FmSubaEO_>2HYyqybR`zTZ!#JTM*KSRNhv(mTr4w?D! z5XI!A>Z@@5S>nzyYUWGKA(3{?7V8Hl>mV)*o1{>w8R`5AZQn=UbLe~?nsaHd2G~lr*C&yI z+wuAU1}CQwKw1w-#r*8q#mbKC>40zz1t6No#kw;A2gv~dax(^8v5s}4;nCZ^tw`UJ z()|(0t==QdOR#jb!$ao$i01TKCXfv^OXZ^`sB<-)pTOYgg&6&`09MG=tlf9Dfl^~y z{lg22qBNa5mZhH1OjQ&sUY*F_$#P=9E~Ld99Y!VUkaMv!L7QqeNy&~V?t*07@s1Oe zQ+~J6*E~idA0QbK{~wNQ+`nY?>THRq2VONxuI_hHm9AzWsyNOh!+&7h*dG!(^{?Ud z&q-OcFEN2lp9&ln1wQim8@Y`jvBkcmRO@}|_J}P&AX`dO+jGG?K`laSjIYZHL&Y_<`itxCulMA#$9~s_{`s1K3{D2JB~dbwtOTf6Dtl` zCa@$0{g;u$^&HVReeKg?`myTS%6@*JvdEQH3{X@gK`i*yjHqc>X6$pZNO+jjSX#ML zMT%COZ@d6+fu-?$ekVVHdyi~6^aX3^D@dOmexD1&L>(LVJF$d5dX>#i?xn=atJc#o zyX368^j`R5Rz90RrtT?bZXV1kt;(6kC`v`)mp=vLGNU6t*7d%EDb(86nL;{j;9rv6 z4=xrjr4VyQ4T}4C9X2EC3hs|SufzDtXYqt+w^vK0Vm~E=74S-t+h73`M~TM zc{ZW}#`{gMbXQ!sZZe;CTtGSYA3 zDd@h=Z;WAYh^t~>8nT)?S2_R6OR)P)VvpH;^sce1`uUTFa81|m0xd(@H2JM~{`iX_ z^}3@{;@>nha^sIVPJ-Vq)p8VBm$weDu4%^zs|a|dvWSv#K!8;`#y0j`y~JQ2WfJZU z%Jd8;T2abfa(&^sdJ~U2+Do5((K+k_hhRk;dHB~DW;C5d-cNE?%m#f@V$vvf$xZcm zslPODsd~db9o5%uY{FC*lz77W2OY35Qd>>Z73Xs$tQ2A(|K-jcV&= zq0+n?U-}_iY5gb97vl9S1t;r)63(a&-FZWa5^o%ZH;hx-M~V8TGnv`x=y2_{masoo ztZ3&rW(*4}@o_{f>ZjMdaS_rpi)Ih9ZCFSuEX|w)ak24!u8fpCnp3DwQU2KG*X;*!K+obiZBWaPc2nE 
zd4`8EEX;KgC7XjQwplAij~V#1u>yQ}2HVB@X4xBPtW^&XYOVR7ql%wcXD7Wm?RR}` z5+Qjml;-#TqTeQvyo^7~v2v3x3}0Kqv^TcuJdg1_bOd*`w0)hNm^hr?a8OCO_eph) zptGP*Yk*U{DjpAwNwCS!p94W09?jDWEt``y#Q%Jz&-o3{YUJ)VduaP~uZh7USu?)hT9%9eQ9L`wdIyeF-s# z>kIcJf~w=XH{->BxVAaNOgvBQd8-(U^utgKJAVa<b>(|BWzySq!!#v!=7Ymfv9N#4!-?Q{0s8T%hNH+@wXJ*q~{wQ9}zn~%Wf zQZQS5dBXaP=OI?tfc??Jx!{+qL9A+%mi^78C-we3{j|8z36r7Hjboeclqe3Z%z zX(o?I(u7=D+j|5%ycm%_+iDqvTKF^;+kAml*vZT%KgAUFcQmG$=`#D$`e?$`D?JD| z*jL4L%>oS>2HQUJPttvofWT0@5KWEu&lSRNS=rTZaJ65Ali72BeE<`Iyzp8mYrb#M zj?FL*A4djGVsc@tyVm1)5I1$~_zV><9HctSPML%kr7xEW7op0q48}w-aHq!^^_*2E z=TORUY;JRo-uop|r1*os2+KGq^>^XHun)FM2uzIIR&{WWV|+PH1gqO79oI0Qv@#gqD$8 z6{BmxX;|@O-buUs(D9b_AriR6L{lo%JF2nJsUq(<5j;HtMcW420$)NY@hBP=aPRnJ z_Rx-jNPCCsVMQ;4A{wUF*C=;h-H%^P1Qu_s}E7)ineq`$bP)sAYZg zlear3hv%FK%=iTt=!RHu@Bx}iWj(Q~l=hk#1X!hVT)w@rZai|>`VHGjnH z&3rv5DM6p*7?p-TQ=}p}!N`Uv|A(#HI31Sdu3D^egYa?Ap>Xb*q7NLt`d{QAW_jN- zHN(SKN@A+CpF>q4{+v@dP7H>N93sQrOpFW@X#!UZ?ZwGMhI>b9w_a=Q4eqQrGFxJCofHFcfx#A+QC|6tWcypslZ3pztr7yc`ZD?B z+U&Iw;pnImZO?cxz>;?eSI$K^HTslWDqz`y=Lj{1)r2&2)+2sX6b9Q|KX-+?elC+R zJT)v{nvp91D{JS->#fVI#Rps00nAxPQKyWc=_)OS=E#@rWk}u&6{6 zD^vQzYtpCrriPCw0cM=_O$o55aB{J^esupN*meosHQzN|A5pt#+fdaRq8p)DKMO#Wj0$veE^=orO-&kT za6*_PkPwPc_k^a}3IU`~vVWWJrsf2_laouilnkvudm|G=C9!0fU739*_DH~*|H{zEJtHTGGiE+*`w9SmURbwFA;)UFDXi{pvfW308I z^8lkW6cCJ%?t)zzEz`-YqdYzla2j^?26V2^Ls(ia3358mKz| zs}xE>edW2f0@$y52)E5RxVg>S=@jR5HS>%hHJeg4mcKUk{H|!Jl{S6v-ceT4rFKCU zh|6h{)WQ+eA+dqVqi2lol%6UcXh*G?kY{s8Y|u-IjU6l2d#7|UH^&oYKG>0RKV3q) z8>p(0VS68W`Ldr(Na^?SJY6D-sS&ywH&oGmk(pRZ+%Og%uU`WJmYb%d@SrW zXM!EwWtwZYPVn^c-6Aa<>iJ_-bMl@S*g#}cJ2J~IZ?g~p1qBWK>3wl`vR(669*lQ& z_6B<%NZ-%#PLd_q&w_d0>(@`jW9hGwZSKT92}7Z(dqLlqyMJxfEd)%VD2z*X@U&MO zmZs2#B*>oTzE8e_^eYtFlSUs9LDb+Yr&fx%s+02J(MPGbnvPSb5H#t_I zdQ-vlh@$9wfTaQ}8#Dea^7YdvQ)8yoIm(PzN)Sw6FY!Y(sk)9&HMZd3@f7J-re&QJ z#ndRwc~@R4w&MkT-l2PCjHKwz1MeB$nNqtyl-qP1&N-Uh=Q2ia)3)yNE#s+;OUUku z)vx<8{!VB&!0KhGX@P|a2r|)hj1)0hv3%TP1GQ>8BGFVeIE{}BM&(>8V+_RP-M-iq zllZee#N;Iz^>7vZh@%5$z- 
z<$+hk_?l>%b*+d_MOgKosDk~-iX$&A2cX-5FX+vf6?`l%AQ=P0^^o68kc6E7Cqm1} zUxFnNO+>848#`sKbcUzPmu>Xm_|}TX`=XQkEZDITWx6ICFGZUuEYeb^D)UtZnzFP= zX{g#YMfElJwC9O-ef?#@Ln7Q7(bUw`S!6tN_?wv}Uj<8sa_BQUr1jI;zKi*yoTtFMn0xmjspIiJJ<@%E+%m=NBhR1iIGj`0Il?pDPLtezaEKgv z9KjojxeM{5$Vn%(;}ah87zI9XriD4;-6rznBSTujv#oyVzu05xA`El-w6ANwwtyu- zNI@e&Wm)O4EV*3rx_YhBCKayCBalIRz02;;g;`AS@OWGApp@PKqyjI&aCh|)k|KT} zpu~f1XaO?gv!=94Sv`O#DRDd0b)^vjv_Qs8Q+Oqw8aO@$Y3stZA!Sd`fU}6c6-#}3 zU1Bj350Fp~Z19?d^yXb>12XAFFhd-j4kOyH-9lY1d|`Dm5E{?XQXXC#%$WPQV(eKE zfojDHdYZ6JT;$b8HLAT`hC03=6`7@M`Y%i9&O%qXAUyu`)XbLCB8Pr2OZe#qw)1$yXE zkRQwu9X*Wz4!Nntx~rlSGHGdXKp8yS zI2%-i;N7kFCzS)#wxh^B+NUj&Hh1 z?%)VcQ7pr8QFODnwzhYpQ%UR{=}kymDp!WK;%Ej&zj(Z!P*(6TMopG!PC^487`WpG#N-)0mBf^v(16y;gZ(l(*Fn;w zm-!5(>__Y+GCJ&vWhEI*nwrzx@T8wz#1uIY?a{5*_#s`m`>s;Ugx^L*!{$Z5`g
%=a=yRTHiUq4oJR;jv>6 z$GbeXqXZ+h5*KI)T%k01sakG=N~Y2fTNAaj8kF$ii$=`+SbCaQ^`91%M&{J<7#?KP z&umdb!s5FAM?~%t0}7qGISjz@ILouHPUTQ0cfN$dTcKRoU-E`@DnVy<8r$quvVwi_ z71S95f$JpAKRR4v>DjAFnLSF-X*;kWtf&#FSdLSBjKdT`kFQ;fEid{5E0EYQUp?d!I#mz(y)*2ISjKr~PYbl3; z^`BsJ4!RbRUD>4$^QaL+bPo^3@8RR!vJmR^iDlz6hY$E^4D+0@D)E+)HXBrT!C4F_ z`C40VHZJJWIeQDNxVwsVQih#Jrz45;CB&v`WO5@ zk(eQ~TjFE?a7w#Ck+LEuAGY}7JJT$9_o=<6VjdwW_le*LO%hUFN>G}@kENp4k504V zzGBiiWt)}%UP7psM8XzXS|xcUc_Ul7etzGV zH#L)g24NG&D}B4?0U7@$|1$1PezRn}K~jDP*cd?ferHdIP3OG*ono)b`KadqU~^$0 zvz?p+ICs#~K9^Wd|7^~m-Vj+MU;X(tE1#L!TxM@UsfWqjSK0UQBV%Z?ahmeqYQVSu z{NJ12f4m4FbRsGLXuaR~@*g;#7simQH_qAqG{>-Slpk;Otlzl)BJ}=F%hUOT?i{v> zm0QOZ|EvCW$dicd9)EB4}a`is6@Xj8J=gRl~?{)9p1CKT6 zlYScw={J8Q==N@XV+Q*ECKoPu@2^eB{hMWn{s*p=?{}WK_^WZ_Q;s97zh{8^{SLh2 zmKFkNKK3sqhe0sk{*DnDDzxDI!v1(-bYYUWPVqZiD=efntSRc)!}tI+lK3kV$1I&k zUU}Rx-%13J!FY0+SPNhIxGKgATZ9)0=f>>>sHr5~6!qbj4pC?;*tM3E!i}oXWN3pz zL7Y#dMPAWn^APFzfN9EM-68TV*v8{`fm)-SffB_!+FU0uJ6C zJaNjd=pp@VyUhlPQvF)0%_5Jr>PUtttDXKD=^)CeuJvG<$=ci;1(#9*z6eVfes|jL zqxNA8^p9@V;+$1BE;hy3MR5HGQNs9JFrHLvU!?v{7s#82hgjn0W;x8p^~ak|g-(n- zOpQ!%(yYq_G&}dfc;3VKj)CzAZ^B%8{;2Ee zqGk=el4OOtAJ#`OR;N$9DQK`ZINUT-A)yRDrwVE9X*~|6VM5IKOjSs6-FPBA-3;Sg zr<=7>f9Hi&%8Xw==XCjIh>+hlq{1wRLvTK%PZ6XFgQy~qCARe%uf}9W_-#H*G`rfa zi_-BJ4S7a#=Xqa`sEjI>zeaPaq;?ni`gx%<6B+SJVWme@7ne5|JJt!X&cY=ChjIhb z;ox}4hN+HCluM9r>dV_yY_<%I@REwcIfXm=I{~T1@54YI2aZz>IGSr8BmPkBzu6W> z92&7jJqU};wI<0Nrehed7Wym#(Ev?wHl9qIT?v0I_(vKDqwTKgvZxL=Jo4pjARvR( z@AGAv0jlznlUN=ud4f#R(22#L-ytY6A3n$y^|EPuhIEBfuL}ZpK$~wOF!VTvG;Deq zfL&7X_L(>_#nWK_?CJ)bYr~73(oOQ$5L(NIjbE)&rtFq(;)f1ngs!&lM^jjoMtHn3 zR&mzeG3=P5;|H6jc`7f=TMvI?S;ij;B=eel#<6x%UFE(J_lrrdm+;!J|Lj-E^Lk|| z*r7Jzuq|W6x7I@xWPmiE62pU?p89ZaDm6PuG#&kl zzX;q0A!vRqnVyyc6-@Mx&B>yAhdVwRYWz z3E9-^?fI@irq0Nn^>iAHQEoH@^^??Gj#=;5Q9P(ySrLW(WE1 zv%DdOLwF3Y(HL#hk#9(=OHUg6I9=`LXFF-ThemL3!;R@mHv{R5^?8iyS~_f_vg6_fRC>5fCz3Z`4*zxAA6N^EH3$=bf$+g{j+?K(e>7*ZU@B znvt0Le<*3_a-JU?Y5d8Sw3aS)Zv;F&hsWKbhO~;Nt%9Z~e0!(IuT-yAAeLrd>MwgthbgUy^pv 
z!Us03haE|BG6idTo%w~U@7oQe=@|a#;d<9WWje9bdacjZsA=#^C%x@X^yv`-jTT(* z-?5GrPD=u`e{?EJ%l}5dS0XE1#AYu0b6wYKnkel4xaWu%;)yS*PyFPE6ZoDAMZ*N7bjGlI=}P4OSST#x9YkmV@3CEn4D#5M!1E87OdotHW=7#I$3{W!&xF|m5!wBzNLB^EEi=yWTGvEW|Pih}Yl z{Si6K!rsV&(L>mB?NER$5YC44eKH4~J33H`BkFHf*jYLwz zb$@>;1?K%#8ynBG2G@*^aoYg3x;kW-Y(I1TjxG`JE&XH-DL%z69uYG)blTnx=Sfq-F zN{4qv%mQ!O{=>yoy3@V9h_30KB7stDqo>AM7i^LTio#*3*K#pYc^#H1tG95JkVl$v z3$Tf-RN2&?7PXuHL0aKWJ*Lx_{*^3$X4U(rwf%r_8V`N&B${{nLoYmjYeUOIgCRV* z7m%Lr;uNGorOtGpE-^HGxG%PC>XI3&JXPyd=?hpEHyP!>=b7v9dceDr zt?Fp$ggty_x*p4atTre_V}>}-(4rW1>OhdaJw1K+&`s#4=dHIhxHU;eO)NNII`)j0 zWooy*MA3H3-EHy*vURhitwuMC{kfZg5sATc9P(}FtJM7uW9#8>x>=^j0coFj3p2$E=fQ>?qsrSuN_r$Olz1`V~$V9IVr+`COU2^ zH~KBvwVikP4}3U{7uNNl|GA`H+R!A8iVTvhybPVU)Tl8WZ)z}>ddQPS_L{sQn?KKf z#FD*~bJpxV|B|UVTNhZ5JZLfyH-r5!d(|Ov^C-*y(+ab@Yor*JtsUilb$CpxnFg>K z*!}e03z;>~Ap6iUbSOL*sg0vC@2l(MA@MWg;lnv-i9azP-mUZGpuMqQs7Lqf1qyrZ z{+|_CA^-PYRj+d#Ec?Wt6UQ^=&V{Fd4P*kokB3EH)z}Th@SO5aujf)+m~l7o+Byo9 zi+3+R&_*We&gDa0@xM{1iRBj&GXU#*C7xxaW--^Ml`AxeWyZ2thp*CtMiL2Z?9-mK&$zQG6k3~rV^=EMW{mL(?v zU63Um~@TZ5>0Sh zs+oFs+hJcAruR-+7QVfmkO$QE7ygic+4%=MA?pam4SO*B58Q^U!DD<|QWYP6dVCir z!NnOr=cv(9qygL61;kh-p<;MUt>jjXd&1+FJxIv*OkJO56SQHaU{Y{k z;=Q11O+&IR^;cKSVm)GN&f^M1J513TuoZLr>Ds(j`@v)Q(!Ly?uIVlv2M< zelq!8hW&E$T6Ku74@aGniX{A%B6F;1i-EBq|2~nsL$$3V{B>pe5Q45eQQ^mmP&{|M zO({zO{Yi)G$Op7scR)NteO=mNwCRJDFl4qVn|8fQD$U_}^IGvK`S~F7$EMm`N0U%G zLXE349#c0U=c1R4N3Df=h#9M_mt+jJ^urHZ?W+@ZE*4tfMk-F!2%B2%b#3AsTTA<= zqi@aJP}@6=*9QEjmKn8CU-#LGe8f+)N^N&q6{HjKlovdKjD4zp%wGK`M_%^U^&SZ3Lg@=apmhB6#6K6 z1*~dG3#6@p@Y?z^1iQ)_FQfq%AUDtVEPuu--9*&Od(?~yn}1I8*qozWi>I6E*GI9= zlJO)3bvH0(UGl1L#xSr5+}|1gg$`#oDBHI=&VWlrq!L6|9cKh#q|@xi|9~-F%Q3qn z(iDv!D~Idz5OI9o*E60r#-~wVr@)EZB{gxh_3tsvnW+RK$HW!e>0HS8+-Lu^RQZd?w?&3QKD!>snMv;N)sw zy_x5kiIpa0-yhN%_xo*hxIy`5DL&OIKwX_kI%bmw6dLZiujy7hkz)|0AOO2yP>7D@ zX;_D#$7fr3^D9#idIAy?w(0R%$?rGD-4eS6MGtxtegGtKhung7!jP3=UhA7wL`tb^ z-%%3n8eQS;LN0zC1Wu80jIGIfR$|7Am{F$Q8Qz5qDB4^xHeTk?6s#Pq#zud}QBx4c 
zLMScCvWBY~LxJ$cl)(!gcHLMR@`D_4Xff{RcMjw?y%bf|IEB^3{YfOIB=`H$4u$;^ zy37C@Yb5>lAmfGjw(t@q5GnIzvp1;g{_+rC))TcdjtroV_cXd#no?~FxIixZ3CNX_ z0gOm{`mCVzl;DvKzx9fl?=leVJ52u%(-GQ!xp~;@v3{xg=VTL7uR{P$B)zEIBmJ;DGhni}(9+lw4ck&>q87 z&!>g(wwObC`4t?bZDt+(d^Hr!oUld@*Wp&#R@a}!|1c6cZ-wbD^2@v46Y{31dkv^u z{tPOLIe3VKn$-IeGIcq^yYO)vB<%hk2tRso{V$7DDUF6O}kGC4C zteohp$oVgrKKh;TfH6%oeZZXST$(Vp^L&56c%AT;0Ut9kO*d5XEtRUs>)G{t2$j}Z zO-8#UvSj(Xfink>k~X0l_7PoL;Jw*QN0EbHqWgfsj>{ImKB{w3Y57oen)Qsx2HfM9 zC*k@&3JZ2SM4@!qu!4Vy$f?ANh* ziMpgzN@eBOtuhx+uL*Ql-Sx7n$|(N~0&tI8Q6HWoxD~Jgz-WN<4)0Ab1opO+5a!u} z93)?|vY4Q%lEgM!4YKYv!2m>Cxrm*CL5Cb8t?b^G3C6{&^KGpj6sq9V0kI~jT*^Xo zV=GXDx-&#Tj(nF9TLa%g|NN35QC>QWC?Shli;g@piC>1a+!z&(z}qcwRq_xc-mz6_ zDQ~DCFhIY}8aD~=l7``q&GAjE$1vr@HKNKKTUCuj2C98_F(9E7PtPfkE^~=NLafgotp79C>RlBWF@)TL!Vku_==!!K!`8PW)ev8Wh3ZC( zP=(^IVe*_gRf}>t`sT?6PF9t4pc|mGgN1FoQajvKGu)RmvkpQ=73XQS)ytjK3x?i&~@fE4~W_ z#^x)Rp%u8y4 zLmImSfsN0+LhZg#iErn(ky4Efd^mJ?eo-$@Fii4vps6h*eh^3EB{i}-L8jO|tOkEk z^a-mFT{B{xxG|X%1l`NPYr}C;DZAj%Oe^ zhPkjAWo)I3zPno46T>~*By-$u#y)BqN*pgkH%XP3)YLVsSXX@0{$mrdMNog1;magm zSTlRJ)fcgggRowwDjEfb$IEI1RS^Sb+kY5yq=x^LC(>PGrio%4b6tpCp;P83)eO8x zzXDO>Ytj^y!;BB?g`yW?+Z73^v(k8^Wda%8X%Pjnj_X~Vi#g!%O?m$*mmZ`(Xz8M) zeZr(aeG(r0$>KYd^!I0B6@uksVSZWWW2b~B9Rcp+YIY;eY!qeWv)@p@2~omh_z>ILp&i9li6HxX`aH7Gsrn6AJN@g zZ}}HON$d|qR}%y`lpbSU^cJ7og0q_*ImL6W zx<^d@b}GHhUq{P%w!XU0qNx>|>B3t6)k%iUqeB`YS8Do4J4IvG?T&6tE{cj=(gJ@s z7#l9%D1>R0!U59n2m{#^wpo3BF+X+hi8IJ}NUJ9~XVZ&pw?N97%ltUSe^M=>X00j2 zo~X}YfLl?TTe~FZ%~xw{c05HFOn`vOJRTHy9Ftt$EN5aD4z1nz=h6TOO)}&{$mg`j(Bckx-iD8GD&Fc>LuU#MikOWiR&{Pub8#FwrjS~* zHrYH1(};z%=5<5lwbV5 zW#azUh$gB`+9shz+4Yfx6{EEjoc%dr|7Si1J`?QJJHo=f+sYZe2=L$$)SF&&cj%Hq z3|bkztNIa+lY_spJxA8R^593PL;=q;Hg=hBI`hf^txlPmME5+OdC^^2n66Xh207PP zV~>4OfEu40a3pJz(d`C2*ATx%nmSZgW^g$i<5+L?^{~xl{Zeq4G+@mTA(nm5J0q_& zHBadm(#;{6*FnM=R%e^TsUfgB%eguF-7|a~cw13!E;-TkRsY7CW`^{wqz1jX#>W zfzu)T{ObKNtXs~Pfcc`~pXwL3^WfA+KtASWjTuT$n%g@C#P_?e;>!Dg4DVyv@zq4| zZ&o4M>A7n4K854Pg)40(i*`WPfBx@GzxytJgige1uGagl|9Xh$R3R5{Tv-3dLv(&) 
z|BLYLcUtfN@epOwJHP+Bnp(s7uZPI|x%g_a>)k^exA^uKV+Dzy~YVNX2;dVmghxSA;gZ^Rk+u$r@FYTDCTo@O(v%ra{AL zM86yJwIW>|9li0<71*Rz|6PCa@R4d9o%we+j?$JIB=hM4XXhqVocPc*?scO3hj)rLg9Oz0w_J4jbFkULOB#g@!;pT< z#kH2{+6=Txx}~H{dS&XV+T4*v>fLd5UXedSO`^okobG(XtU+}rygXyNEj>5vxziOR zzuGlXLK+h`_i~Ye*Y_p&VH;o6I1>YI2y2pnbcBVNtxL9AlYE0(vlu;-9;w0d+Je-- zy<_){B~y&uUT=C_Z%GQ+fk1F%lN9sy7{9*5AMHh@rTUd=dXs2*xhVZB>q%0UC=?h# zA~xHql$-!jV-;VZGa~mhHccV4ns7?Ynt&?^BEGLC$52DVh)&EfLWNLy1@(+S81RRf zL9N`X7PQhSKs?qle=MAL7oP?&drj6Yu-c?399&T^N!c!C*lxzb0#A8ZclV_vI~?Aw z{_V5vEUWBQ;?Hna^jrwDuR#Q4iq7R|HqDfod`Znr5*{$;rKL7L3Hr}~lXpf3Ic=5| zDV_P@R`8yqv!MA*P4=vg8e%$y*<2FXvm+-(UHK_8Y8$YTF`Vq+hJG0&pC-dlV7QJZ zlCs*zPZ07jAN2W11pUhgjD5@2_z)nkefqj>MEs}y z=2cpuvx8!Rjn%2f<01Pf9e#9M4SP6nY^eDTw9=2gv}qzny77x(kkQKKrL#MG-Ph$V zHNG-fZ9IZx;CMUb#a8^eE>b8!jEdIPd3;<{c@K*9wY>#Rcq7lMg5BKX2$dW;rM8@r zXabm+f$??daLY1JNrf~;fzS;<&B8tcl(*e8(!m@k)YBchvYCk;u|gz(t7(|R63F4gnY^(lENSk{+&7(BfRy^ZP0*G{)KKIN!qf`EYiw+t(`*ht0Y(ASK@nFh*1(LMmtB-jnbZrMNo?0-c~!M zXH|V|V6}^hAZwYpqvZ`^mM4tDJ4*DRQ?TW_Q@^JMgYn2}d9^5#>dI{W_mJy+>h<)h zdD$g)>#v$c_2MfF&A%hde+n5VLt;r1H*1AllQLnRP)H%D|HoL%^Xl_R@Sa$8z>C3f zBo339^0)MZk=P8iSgiV3Y%!}2{r37ar@UWih4h~9rNIpz)A!ldWe18Cj5L$pf{w#c z<#93`Jl7oaO1WTlcvAND4{QhRW1(yAzf^w`@~Vt0VXEi#0G%WNX2j|4ZZ*#C3K~Xw zRCL5Z*A{^&Z!2u9&Y4n#t|+lwI`qSLq6n&V|1BN*^ZSqa+vnij=bPQ?u-ivQY^S=w z*U%p^zijApBKUE7K^IjQzu5fpyb@m>o)R2h4fEEBY_<$`&tSxJ`y?Z-ugt#*`0)N# z)BMxS`M0KKxNcIv$&8<``96gB8A`Rc%bgdN^-ks7350ZfgA2vVJNXaXfpK5-lWdrl z1kUrj1xJ)>Naf%%Km9)t(nsRArFWd=Xejx6ik;W02GjoPS)!*L*Xgkc`Sx#^e6@sQ z1QE-tdm$EPX)_+{+4-+?cLPkN=-%j0Y)=x}tRp8i8*j@|N^lnsJDY8(N`tv;jalcl zWc)h(7r7~+tV>#p|AzA`wo8AGOkIj)J?U3*l>YF4feh|vjp%-3kwD`A70>bSE5(at zkA#cvUy}a3$KT4UTVV%IVHy8{YZ3@tnOu-*Bv(urH0)A%j0%&ql#ruOI83+S^j&-u zOgrFk&ORAQ6t#hfW3ev{D^x4vR|3R1^2pG$YjmwM5>twLi_m!|hCj(?`^A762o4z6 z&HpHwf$&-SjhbT)- z$~HXluEPMAZOTvdYsB*!aC$Nw%+Vl6Ftel}ar;;$uG4_QD9Rqyni{1=RHX1I?`k$5 zgR22yA@ibrmW59iShK+~^J3Y$daYAYYeKb;@*jvX5h zln_La0pCOYsj?D$uHeS_448y#DYZhiCqNA?-IT3N%9A#_U0i95Q+Ae47&Wai-9zp9 
zbd~TDKWxj0nIy6kCxr_X=bp$6u@kq3BeZ~CaieZnL!?ek2zWDD6wco5zi6@eaYjhc zkUf(fwo$6uv`n>Kx72&g92QL(>vbsmJPlO6#v|;4{1tz(O6#7-+O^uy=@2P{v{StZ zNqpetjYt(ymoPcYK%?gk*Ic0txo=&Zz)oxlVsf3ifxY3Av5~JF4%S%l2N+&eE{3QHzge6+I?ng#^V#e+)wvCQa2VWfW<(Yx zL&(#nxR&xg*m}h8us6O*ECOTpELz0)x)f*DwFhJWE&4>!-4T5Dm8n_vbAh5?2vq(! zG8-S|U*DzF1wF(;gmJLmN@+^fGm=FRZ4iZ$-hRO^h7VXC-R0(YQR!9fx6EWev(7qN ze6R+;)+IFGxp+}H3T4b~8zYZ0t7`S4h|vrUHA=(DwV7#p=1;dKy9zb5Pi@DUF<&( zioLv+(=LyPn+8j$Wa%g8uB9L6TJ@d#tm13jvTVt~_M;CIJ1pr!EZX<{Zjpit z$X*EBz})j?NltZXj|}W8UBx4BfOq3FFFD*cP<-t#I5Du&wy8N=ff6Gz-^!l^&n5q< z!{AJkw*h?B$tm}bJG^t;b7*z{GP_Idu27#}33Oe&GA&6_)w*;dDQ{lb=(e9n-~kY5mRh^|d}rIqx^u zf!MKcWU-Qbu!_|^>s8(XwrGU_FfAy5r5&SfA$Ua0HbyjBj1`q) zT01*(=CU-?FNt%pB>9%NgCd7|H4$I(S^aaqh1Lz%f`@1&h80+$XP@HfwcyLX@yNK( z=zvGfNjlL!zGnygBKSj$)0<9;ghl%ikP*zZkj*4Jx4!)|D`AS)+P|-mIkl@ydR1!n zKz8wXCcBJX1C5Sl2yS?SWy8V7d;5&L<2W%qu;u3t(w4eOYa*D5R|FB_NK1l+b`yVR zE3dM~=24?jnR0zY7U}dJ0rw4EZe=gC;6DKpfgV)cHFHJ*ys*i(7JzFkgS~i0M0J7% zz(Kiq0yULcL~ep}m0SGNbqyO+`z&WjGUm(&xrp|N2`nK-&ZX4rU?W1F9w`UnNH4T* z5>63(J{dk5^qM~}LjKVDqFtN|ec3Oua>edOZsy|iLlc!{(bA;N*H&8%aLeV{u1^aj zN%XX>qB|2`1}U~{x7-?4eM0nA(>@J;cInv2duBb|v=7d)44?Q!T6c8}M-%3Rj&5*(@YuqTYBGb3pMqExpp7-*xwv_8vOdIj=t+AU1DBFk!^6F>$icM*iO6JLo zPL+!PBH4}bR#@bnYE@2U2ZjC=W*9y3Vw^g}6@pO?k!4oEW@Kb6J9tHfMJv>7)JeN# zG=ztGtDE5v(}XZSO@MCtj^$qGloU+K&q!ntn#Fqsp^aUBgIn1YmE;n`2YExRD{Wcq z2Ur}=Je;kz!um292OEVn>T;;I)CGBEGhLaLG);$pN};!Ep?Z^0zEJJ&T3l*HlMoeA8;whD07yw`^JTR-UiY%S)p$2YKxMBI)8;aSyE z3)i%%v!fiBJNzPIbFCrhWu4G0yLKX;Fm=VAI_>4!1YuP9`o>^hVThd4Q!DHOH8ENw z_aNAPQkJr@S*J8t8a-49x3fC^HdkK>5#dJNqrEWOAHPP5IYy$5t`0G$1;m(oOAL6ufr@i{GrLbNWKtYjavor3?Q+zk3x2hT zA9|z=6-Zex(A-HI{Z*;e#$Z_tW3KHi%Xyy!RlQ?2GUk#>Mq^zpe~<}^Vt6x|?h=De zV_7fW=%qYvb9r*k+Vn=E1Y%$6X*>(em@o)sn8EKvg9tJOLGHrx z%O||9#7WTEoV&aU7;^f!hd%gV#%Fv;d$UEPy-@DESV7 z1p`26bb9NIcovy;k5XIuJNE>f_$#IKp732YVq#RKDDAaNYJ#pxKbXA0sbd-AadCp2q>3>-b%1Mk=EnjyB%H5*vof+ud z&<>GSy4u}D)WeKD5n?gIg%Hk~^*U+RZj0#ZH)%AGbO@vSJ;JIB>W8#X%JVo8mO4rFyk_1xX!_gx{+ld~+dt7;Ii^&rXoolJ@`S! 
zbJ08kpjQ)Dy!=5Yp|9(KcG|(=ph`cfVTp?oK188Zk+d+tt4>C#wDhHIdgwqN+@n8D zI4DHOfknO^@x^;hVpUrq(k05X6zenzd)yb16d0`WTflwoXvXOe_v>it%|E$XHSsCy zBg?{09^sE@f0IHkij%rYP4Xyyr}Ca4{4eUxGN`Sv4bxD(xI0CHy9X`q?h@SHtx(*Z z;O<&ngSJ?3DDK5wiWM(izHj%(&d%)Y@BN$1Bze!AIeE^x-}}1sT*o_D#O&=R)~ON? zWp@r==DQEOASHF!6S&rmk!)P6pT64>zuptW?p#slJeu8x^Z8VwSkuhEr8HU^DrLJH z=I;`@;Oyz2`{OG%?=MkDBHjW=t2W%bHx@5`a`NqBRes=y?w`zOV8Qn>NIHC`u?P#> zcpyXtrT9CTrT2^{jwy_A&c2tW`3Vt&#Q<@jqp!kLl2g16s0=55xqOl(QyhMmoC_*I z%>=G{z<6!FyQ3ScghG+vP&s{bo9Kv6X@P|uv6yF)U~*Nc54#<*ipqvq6k4;K=)O_|SQ$Y{8rA!PKE6~2q7n_{?S!2AF4@Lm{ zx(Rx-uiQ5IvZp@b8wT}3Ke7vn4zel8E#nlM>NO163MBDmE7h%(*jR`$H1dm^xMCDK z*jno~;9@n@;zj#;>bY7_JS-f2oss)WSp|ANo}^Fi-w?JXW#8@%h_^-a({T2z*p$S5 z^i{98c`eT5y7Zn)jBev)pO|a6llVj2jW$kjPgKB-nIuM?JPP-&t%cmnOu)uV?p9%y z+N*Wi5x|la=M@hShqBYz+FFp^2lIbIzl8`C-5d#Ag7tR&^IX>9(j$TiXebfmChG}U z6;Xv%ZZ(Gt;pO-I=+Ou7F4*Ulk_eK&E1osV2+RGD5gm?pi%~_%38!a$s9Au3#NU#rw+nmY%nnkaNm`WOOr{2Nm_#gTThgbhkAH&IqXN zG17-oHXk}iZi@SD#mb6IGp+HkimjaFo{mbEH@FgSFFY=!>q?V-o741?SqdU~yBuG4DkuvSc@F@X}XTlS12G`pmh zZXG>UcJR0mTdIP7-u`oZru4kBv1(w_>Bq@xaAA>iv%4$)Vfo7Cuyf1huOwJUl&`H0-IeWqYp^@% z7>r$DP~Om59jxEh*VA&n+5b(Tq>Z39>)ET>D=_O-R`nm3vfH~$`KCtOZhd5{b@BUw z&ugNx;nZbRjOuyeoY8CU{RiWpS|ta7bib*0b#HvfYiHs2jgqdI)Vt*%I?;4bkN3n|5O7S*1(GY7NN=t zMep6a!1)fX%!`ioZu_d^Y=3Q(<^M!!wBv3y3b#P6!49|ymyYX=cQdl*3Vy)R3wz!D zc=7)-p?o{9yu-viRd_d{F#JywN|TD}*>ZZb@$HmH>hoRmf8fbyb7L&a0-qZc{{vNi z>tz8?+3YbGT6b$3K7JYe2eYw<@b&8p#rvw<`V0SyPZl(>*8Rf=^t)*@dB-O^`TGT! 
z{r`kduDcEEeu3h;+`N8zY2aE~-p1^23HbiXtef07nsiEE{Py^Hx0#tSLYk~Hl_8Bd zYr#@za(|LjrZi6(m~jItQ>iw<%7N^kEK+E2EwZ3bVSVHoOy_RS$Wf?f zv-nR6)@}<}S32jKWW{f(vR?C-K+jb%ps&hK;py5bo&)lLg5?OQ@8Q?oW))Q9H3eQh zw@BFO2_MVczZxz7f(UyfEQvzj`OK*eUro;%^uI)1QhZi(-!yzMNj(UZDHdrtpYFd8 z>e&Bc);JPJKn8R(#J_#eY4>TK5?$&6+@D+!{Jb|<9*Nmd!C;Tyu!Hz7l_8Go1d=CT8i29smSg-RQ@+3e!fGmpiPj zL4?^egUaLH^R2=;o%Z$j`#Z(g%D*Asrt4K=$$jUfbmen>iFnz-aAD2&+1FD4U}$%< z52LRFD=M0+S-umt6)@(F;8u*S8VDumfk0->jJY6Ml5c1ghpnnA<@D|INj_0D$|CuAZw3J?5v^jNU0BX22BnU$3r>5}~cpv*YR4Pl*8iU_%A2>Rf` z5d`0-CtWb#@0ClMYd-0os^a&eX{sOf>pD_5?Le=f&^&evaOEgY86$rP{NpXf`xBl< zfFvEU;1@Tm70LirrWuC9p&PJz)Z@CXa{~tC#oyEnsa)lXdF=XmC_CKyx7r~t3}Ms` zKJJ0=tpZ*ak zMW6$jNn(GzH4fX~jeVr|w@r>g*B75DdICj5H`5E3zN351dCl|F@zLs{y|v}8=^=yn zi8tDI28zvmnRCaR-&$E+ds6z2-uPH>qUS4psQPgMP>8+16p%zPMXcO{36RDxisPIH zdYPv;kL@$0-Pp^4Bw8~#V0Lp|9i90q1hUCx2@GCVTCvR$ZaW2-lzGwaz%#NVd~%Io z3WIIQ{3sSyr)Ypjc@WAGVg<*nF~`6ynWS)@CBJvI#)Y8$tA>?ZThr^;Z@%|{VLz)I z<`0~qqN7Dw?uWIZt!ZTqyk7T6|r@HFg=W62!^&$ouy}@*ag|qN^JhrKIM)UBjb7(XVH3FW(@!658|d|fstP#YeK zLR|9F&D87^M*uj;7X@4>-zO0jw2n>0RG+s8bAEb3<5R~9eEK=g-~Xed-0nmiy=W{u zy~U=K)Ut;U9)cwOR|~k4lH5E|hJkzp$#%|P!eqoPSacS(sSET5N3kKfsa*rRTb~n* z)P4%K6UUoDy?M)v422+Y{C2V8sWC!*n}N?byQ0Lw!mH0$9`!1iX&^dXHpt%ktqH6y}Zs zNlENnRgNd-o=uCK{pZ0wdhKh$DEyQi+?O8Tr@=qAb9NAf2#TiQPY&!Le~vKQY(~#w z^hvR@URZ~cu+lp5dHM*EFvNOJZ8h}lf3SiQs@p8WoNBq!lf{R#^qBD%Adfv?`&b0Y z5-!Z0%nAi1t(LDPoY^m}vLc;2of=%(Ju*2UzX}Qea$otMR3e`Us@HsTFr?f!(lYhz z3Q_o7S^3I{HryEfGIVN#=@f-$EE}R!V13JHz6q zLWMA$+*QQ3s9CtjK!Q1;1w{GUBZBh!ac!(Hs40&9ofEP}xa zGQQh8(3}Qx4Q)kmStCccN|*Y6)c#=WtaUysILM73_v;tQ4USp9xe<)?0v&(cSg`4C zQ%2ow9|NS3h_s4c%`f^;PP}EWEU~0W$%v)^5ogwPPGHSrCSS2u;73PgW`d>tGTNcz zUTRh&0h5rT?*oFR4xq{QbwEGQYb_F%W?o6dI{hG01=@)h$j_qdXU1BrU<*wgp%V^9 zYfiwu4=+KKCpED2rrya9wz|fyC$JNvUEqo0`j+~uCJi*S(Jcb-HH~Gut zD0U{F`|tcRctSX*A9wMqF1(6Px*NO<)JibRcD$IMvnlusrR4Z67%=x+(9B;fO$#~Mi7H_Jbkd@>imP`-p2 zAIav*nF+q~+(kkgj&x1M=)$(CRZiY?d(vQ8r|mzzu1!&rFt+@n(>8!lFmmnJ3S?HJ 
zR4aybcXpxCA(zZNk;Z$|3e^;l0wP6O2cC?d;#+!?#Kdc?;8i?1Vj%Zj%Wh5y35jAK zgo)E$ucA#UP`^?fD$Ck=V-qiLMDdu@GVRXVO@8A(*)-G<>Ns9Zg-Q)A5r#7_!p)}s!u*)6kRvspZ(`6Y7)+m#8qd13a_rEy8 zT=Jz@El+e9Lk`!}BlnOPndf@Nj@RV{u7Az@jE-5Ajgf=~cZ>m8F4(Mir46#pqKz~4 zT18wr<~8UpY7AWPWLB<}EFCspyc=96vMZ0K3>N44kcL*qXs?=9*L6_~9DBYp`ZtK8 zfyRm;Z)6$MfJOmF2gW@mc{<}}k>snrxFJL8{b-)&*nK^z_za*W>K4l|Rh7Gu~_UgAJ@Gq=oJNxcz6*cKK;>0>}X zPH0+-p7Dn}O|P*JHIj8yUqWq*-7|6Uy*APT=_wB#1PCCi;30`Z+9*Zr8PfFK!Kr$e z)p}fx`J6V4JS8%I086Dk0H0}lAh(M`M<}s91$DT2a+ujJp{Qwvr%%XUc%)iD0*!aJ zF*Zcz(;$xCBHbu3lHleyTdQ4(UIo=h>nuwGdQc3Tz3o>8{0m3;PIhMyCzXb}q9P~o zgG04+cR+NXSYGE;q$ISSwE1k)m4+X2H6#AQA2n~WTD%NQDxP_ zP970z7uZJ`{XP0gkW35TgJTrx9h~AW=|K__9hWl6R^c7|$l(U9ZB5T@zPoWqw&-kD z_y?21SrAdAE?IiA%C)37K!j*lRa>*nC-;ZU%iALM|b>YhY=uy&~3w84t=t7fOW9`LS>H5+pq2FqVOl& zqP2OIGtLn5qHIU>qqE?rCS`LEW~zHwTysW0C(5h)fXPmbJmpnq2$6c0Eqt0aGGgfV z_@x%*1QLPKPVbN7DMMr;TV?GTB3kq-z1BkVOMzu9yZVpc!n}2BX~@en#^Mb&N7fwV zpa_%qq1gx-rUAKWWJYQ2DAB5W8946;R6!moViI{Eeozch$|1&sc49OxC1AEY(ZdfR zNBKWLdm)NOl>}!{(Dvtj(89OCI_nDi?}YX^Qg_2)?xN(8A4E8Yax&6(t+Cs3MF(#U z(@6|blHiA9rf|w-y0SeK)#M49?UBL7xAJ6qh#z?2_YJ=ex#)iQ2rP=UdmM!FjAS?~KOVqA-C3dx_NX8b2t}C^Fzk(SN|4`P;19Hke3CuEL5<3lRw6 zC^|XdW;)be)IGfZv`4RSLf@%lwCm6ySqfMb9{#N}(Y&6)G4jR8B-R{W;`$f)=*K3* zjSL^w@!6OZ7aYRQT#d_OI~R$(ZfYf5Og+5>WJjL~p+iqKQ`n47uUmi#v!Q#llL$cT zdPc{RPH}~6pub`n5SIVo_m0pX5Jvf8JKia)k^ILXa&$_~t4nWAr>Ir(^H(|=bhVqO zdRnKH9yrtm7F$Qn4_$RK(rLULEO4Z8K}=?wYWl-^>sZPyXXO?o=QITTNSOOwDHZ3p zx+#u1et;Cn2JvvyY+lq!HV@FICnV)#xd~734CYTZ^##XC?y5^asNg@CI6K&T0mZ^LcKta&Xu zhZAX~*U(fJ*Lv?5{h)uM7y5MuFvnUrHK~b!-}*d>>Og)-l}xO3)6L%Y>~7BWHI$RU z`mE*Vu$`~3OH>d`R*O|5pNcF{0GgpEC4?i(8{E6fgIb8CGut%R+tg@uNeR0{*RSgR zj=I1h^g~?$ZF2RTXM;e7!^vVK$-MJiVx8?dzvsrfpc}+pdlR^z z-lW|X;@M`!5P+@OQ-A=SMZ(oQ8iCHS{lFTH^E_HVXP6XvfZs?W8`o(dRIG%Mt5ai|Y*=9HT7?#4)#ap}c&6PUNb44jmg`XMYoJ}w>V z?`Kbjx4=oNm>$KxVNXrgKZyW%&cS60mi`4o8tZ5P^zAKU&rcdX#E=sOY#2~@+ z(N^>V&*B75#c1ar(-X{b_Snk|2?r_?R^80tgJwh>xm-6pWhdjcobmdyez2J}p87|o zOgZyWSQ?P(loLKL9!o?UG1$x><|@j>1Vd8MOTrm*lb!#JR2O3SUVQ-VKE5fvZvV|F 
zzkpN9AE1i7enG8$VIIS(H~w|LvUa!$m%x6CyGW~*$2=q*F8}I)BNTJ6Yvvio51SdD z7nN_IF2-bUh?~ z>`Ds$eLReb)Rtn?M7FF_Vim`bCi(d?(wfgyL+cLfv6&xrm9~LU5Ec)F@Ay{(xx}FH zxHa$^$BdO8#71~+qQ0ljuQBZEQ54tJbdp0;Uj+r3r)Mmb86>A~A4P+q*0-5}!0vE& zl*AoyRd^l~3UjKIc_sCHTsp(N6s4Xy3#ce}+=8Y3>}^cv_J(6!j1~v(VYvMm2J4HK zP5+}L!>KH@#8n`EY&PGxKKAILt)p6{H37Hr#;A+lvAM*i<#c?lJlZK9V&bw|RdVdk zZsY?|C~1alJ0D@sYB3^1Ax|j7@Gli{#Ze#l2IPQ4L%}falNzyAdT987kf1m)YmdxY z%59lOHg$X_`A_{Sg%iFFH+8Sl-`?6uzmF!TwFxLg+o5?yib(0 z_u9t}U@OP2Q-eD~L(ydXE6F2{B95!ALl7U^c*0d4&h9T~oLp4m16qASy98H%B}tmh z#Z3NBbpcZz!U5y;j$&9u+$7*C5kD(88R}X%RIHk>N-Rwty+0vs)1W64S|v|yE<(hP zja<^?bqdGPnIGkfY&k%YtBlyz?^Vhmjx!k(@E`mXFL@NsB-LyeILYE`F}ne6n50|} zxdJnm=f(NY-XTxQ* zg=xQsA$!p*@UP8dW!$Df{HrctLy$3Qf|8~$gFR1I)F!&^{`$8(J-knrLs;2z7Nm}a zd-7#a>F#+Y4?s}l(}_^E#vB#nZ?G^yYbjoql}L|xeDUt5OQDRjQ31hZnOWpt=023% zKQI+|bmf1p9j!+?NlE3DZDL5~)h}-lQlQc#i+LiTe=ydd zUPa}+BH!NzrPxlTYE^9tjh`I4octm{b&&QminxsQ+gv$6m9T9eqspE{kr;zGJ_ni@ z#Hj0$#pz_e&>iB0Zj|FnD!Sr1z55nF!}mg68`d9JKWPEJx&)-!T|arQtFfJBadfH) z$cTq28S;PPW}ix3wIUYjSdwCjbm*0RXvt}Yk5e(i2? 
zP6-{K>fJ*R$yDctumWA>u5TILD&A0+ssiN~N@bBtF(eYDIdmeMb?VpjJbMdTHyJY( z@P6F;%_@ho4tw_Y#Zswwpy{ELk~JI19>txlIm$)_J2W zeAzTy(q-u0?-$^#42eCGlwrij(s?tgBCK9xBH=oGVJ&>(JR@YsR=E|NJ!z{my7 zJ#= zGdZX9g9qXa`!Kk(S(xBw{}6gcvP*$sM0=)-ta5mZQv01?w~L7 z7wJIDV+dN4?ajR}e5^x^WxOoHuwqRjWR*75)sae1nOw0@?bsLwCV76wNj8Z?Q%jvo zQ0|ll0BI$J>&8EqF#utIclWajAn79Mf8r01kg@Vw4Cp|&4MRMpr-QD&bHazPdvcir z7l(onLX}c(B$)^AS9Odz7HE@K;9{R&pD%rO1)xDhw&UPa+` z)#s(2Ad}X>oB~pnRmC@jX(dKefjie_;mX9y zHUD&Jnp+-coSuMr@)(6umumWi*jzlUweYJ*X!y;!NC^56m)uboqZoAEnuqf7RpD0& zQ8u23gwb%{li)H!d!GeOP2DNuqQQSKnBp$SHKA3uUB=bF@{YUawAd#udx49UcYSSu`j(xvx)!6; zcj#sDKbT9m(7qpElY$mHwrBoJz06l_VSdhe{a?__`re*C&i_I$71O}Rx?6!498#!5 zpoPSmj0V_8VP085x_$VmO~MMn5_kjev$sBbVrl1nwNkHREAqnT)bS3J&LHE^11}N< zw#_eaZMw4n#V8TOcQ!}vveDq>)}W5+qJl)uQDNrgql*p~V-~L13uHPcu~4JZu7lxw zMv!eqP(F*oO@wUmD_2PlOXo{I?)U7ICtjEksgV4bJ_ z&A)pn9Z!Lnf0wfz9IlYISNlxf@V||kSjzsvefM1je7e@h@3spW&M(x+3Ghu3xkHEkX0#b`emrAxBTkl@kSGDm~E#FlFBsL!UN-BJMali85> z=S+(|-x)V&8Wz2_m>5Ob>M!p3i5vhO-sf2UFSr%8(6pYn^!yOyeaX3T#8s4kAA z-^dPutmoRZNez?9k_9tqRKj6}0huGn^#yhJ=>)3*Zth=D47Krj`G_U0xr~|`f&Puq zjFgC}-HGPT_8Q5sb&F7JRSk#x1Lk`xmAN%xE9U(&MU^?vcWP z9jt(=QY>~)uiR)xRA1u__^IY(ug7pIU|1s_)8F7>gh?{6)Vi?|M@UwWSU*0%ZsR%* z?sj%Hv0J*Zrx=!QiLSQTk4Z{w)R-LmL?h>n2!U{xVmajb%UGk-895zB#vfJw#hWCz zRHN^K5-9CP(HBQ&61sRMc3|_FvVY?FK4KwGH#<~p%__D&Dhv+iK%RFp16dHYU|={N zoln)Z+U;4^tv(iWX6{mrz(EFeWyB}hw&oRsXSZRsF5#Bub(^QgC3d!9fQ;0C=nX-cx=C$PKt zJGQXc%lPy8G;S>kq^_&Rx`cq!(@IE#a*44IAPO9~)bC(Py-#xV(M$1L?W1k{Y z6Y6n-pDgf(A>^-;_ewgd_|$L@{z0h{eJX;RSb%VXnIoSrlX+#**de1`VWrYbEHDmmkL!Y&&M zu$(=0XSHP3a&(s_;LhNu+58n9iK8eVP*Hb#8v-G?#l~2A6ynp38#c*}mr<>X<>BBA zR>@O&XUYWmq6Yo)4uy!$!yc2Ej9#ma!DNEbp?9WD>6EzmCp!{tHisc2NYn~HgD$>S zy`|NKCs^5$wX?U$Qbo-9S1kiyxVs8%$H_q1)FMLm%L;%MU;7HQu*?XyqU* zCT9C8H3?fy2$`QZL(!r2MccRc{13&QpMd^cUS(wkdlRZq!w{MRNSFQ z1V<%@bgu43iYVna&(M24gmb%!wh6MoZomIp_URv3iioR7;$3Rw@XGqWzo#Of@ z`T9E{I55*MJf1SsN1je4QH6gF|MT{RgaqOGIC)>FK7+3uLDCj@qD-*2l}J{QUv9gowX9b(ZHO 
zo@K^I3!nq*-CT!cDu;X*?p|>&h2ab{QMe8!1ojR(H~F7kE03p%c>*5q7HI2r7H*RM}OKpE>rKh*q&W9VE#q8AaV-W0l8J$Hq|u5*7}y zVb{j?!R={XF@B2H`16-4gPU7-qtgvNJPNB4*@+32llW*70ab7uOG{*k8*Q3X`g zp{WiMl8qT*r97cY_qSiTIKH0#Z5f6xeAK73k}Qq}ZKHYgBkUug(Fa}x2-H|CKnuOt ztH^TGtvfB$4amndCfUm#K!2!{PoDikFv9Cc)Iibw2yDo6g$j*jkhiB0k9#S=-6%-2 zSfdVc9$mMQF^(*aYY>-YWtGyQFsdD+ceMA#!r!ul$?nvi*5d$-B$8rI8`0iKY1m3I z>`~lqcqfleZ7JudD7L}kMwq=|&CNYV)GePt^u}>SXK$T4qiVUBhjMAKieaVI$z5Bv zl$HHa8v>3l9g=iK=h+U=Mbptn^Qb9yg>NM^>0iq=InvUQgQ_iZK#)3X8~bc7L+_6! z44rj!ul@SJcOvZ?O6o(Rx>A$gfS_laA6 zC;AjtKe4BN6J=8J88~zAF(BtS+r2t2gFFJiSy;W0MyX#yCU)MG9WUzAvhh@CqC9>h zWpr}&mGUei4?lhj@y8l5;%rUgh0dLl4@0&4A5u6s%=!pJhp46i!(+6aDC`;u#55`# z2x3DV(ZcG{0&~)=+(b7IT4Wb%`&S;`ISAe)qt=ym8FK5SwreaP&Liv(QwtmEctBn{ z!B8mn2I71QM4E@FJ8TCkiI;lxmBr@#B)0iO1xV-kRkL0rxWn#cku~AxQ8nu70ou{07;E zbo4^r^qrkUb-u!&mhou}U=NePfFotCku=I+iWse~m)*pRwk$dW@+i|5l>`v<%-u_{ za#ia)Tsg#@z>P$rv+~anMNz6vwnsS$LyH z^K*?xZ~Xv+3Ama37F`;EJ8FpuJ|A1#=e6GPKji6MEW4G}mB$JmL2=CRU=K$k2M~5g zH#RM_k8LlCs^xdOezoVKZ7{MqUf1soK3GAah{<=;Q6DD$2lH8Ki}Z$C7p?u~JxXc1 z^A$iSAlivR(Mu5LdLX@-NMJxn!zTp}MkL^~dhxi+lIJIG2fI(ur2n2wP^Xzfi?7ci zF2C^YVROjizNi=LZG4cH!ytUkt%{!)UJ_d3So#Npu(2-JjHCNmmKrgXjl4NV%!nn#@P;?S9qa%n;mHFxFeUYCO&2h=COh+p; ztn+p^C^iagFtigktaQ>c85~{P=Dhm$^-VJ(_uakumJyOv86##A&cTPeSdz-MDl(^Y z&3RhQj20QBd<7ihDMcInoDRR<;6G{Rc#7Ab*dXc-v1@!$<+OZ|SreX%g`)5|I7xFu zvvskzkHGX$t138BP^=GPRCDLhqzs|vx=g-CdW>UXI_xqLo<@%M1eEhh?#M0$VO8%~ z#ZF6%*+_M041bcvDy{{?LHQbOl-R*(M=B5rEggGEm?|rdYMTCf(&UrrNKOsH5m%?C zC1ZZd$&uXCVki^oSpG5}wUeh;x*)aGmNN*kV7mw`zei?f*6EOSG}l=N4>nAm`5lTe zp3tPpl|^knm_|1rn{wd)Nw>n^i zbpADEnanJW{t`&L|zcMjZm6!Tg++!;FGz>vBBxsvFZqT~Dnh)Qj@Z1}7>?_7T; zu{Qg{>nmE`^f51?=9NCa6tgQR*{+l(q50azk1V^Bc>rP({sBgF*~@) zd%hXS+YfUrLhKg8JKS>_R8!G6lfO~rl(Nl3t+HFxZu45VaN?AoBTxf+mCOZi(%R0{ z^bW5XxufNwZb-xRX}~eSG(w#p0TNAed$e|aLX)`CGVkw%`c6AYcb<`{&x%8*UefSy zgnuw@A^|3StVFrHxYyHO|6q0(uVtcMIs5iM9uZ)vI14^g<)>)>L@q(KJdmXHC zB(6@AH`19jAa2_iIz(z)hvxXLG?GuA)G`k1{oGqz2_YOk+tZj=TPyg{{^QH@@Z)#2 
zWJZ=G4A+1w89NJ#={=l!=){!1sv_|Sc0Sm$D@f^?4<)UwfX%_Yyu4NhJVKBFvmCjk zI95O@iV>`TxNYLtU6%#MhT-83zWVjU~FRrFAUnxA?l++NRj`2H_g+zR@L z@(2Z%Z0%gaWWqUX^zbJOITGo0G6YpcSyU_hB?MzChjRD8cno6HWW^@bkZoOj97(-Z zvsf7 zO*K8uZA6s?-QW0CFHT@MdN`nM(%>)Z7AMf|OuIf$fanP2)L)^=YW0ls%5{3jJ#VT? z&X^?RAg1+$K-#9)XPOZ%P~+R8AK)vWp|6Kpp4_Fo8o)R;Ww%0rpO`y7F4|*lFQS3? zgB;?Di^~U%EPDZvn>-P_rzIhf2Zud6)W`TQm9#Rk?A?6-CerfVves%PiB*i!I?6Cl zcH(Xh!$Jk_s2`3b`2u89`_ND8gOH1~Mw0E+f4e~+p!ZzzcCI2M zUXm>R5eBgqj!p8E*^sO&^0hFBV;TSHn$AK5DUPn}S5OM1qr(*gZj)yNHkqFgj-|(H zVBs~#E4$V8ca&FmEjot%gl;yRx}bH$@e!rDQd|@_hQgLG9>gNu5L!?EOu zce{hyjc=CPxdtRaX(qI)CSC=%1|p0&`dhpJgbdwio!F$>q+6IaPErO}a|ypS_-+KQho}+TjVezeEWBBVG~j{hAwL;9m{=`-e7t_3mm6%+Zsi^w@&Nt6X%07Mow3 zeKkGN^etr98GE>!AjXa!g~QngPb{KYlR!`glE{D-aqg_*ef_N_bJ$>&7ErR$Sj;fh zXBi)qaE9DM8(eeoLXqy`ZB(sGY9HCe_<$%m>P+B zAaLkte?MQ2oY;@i6M?c^kr>-epq+U4J{-xap7Mc|yc=iPuzJTsPJEIhk)T2)6(0%{o2l%4}u4?bmbi`}Ab zvR?x8kQD2@I2mus6_YFs%i&{o^%%?5mhexkHs=vZ7F?5q{S6Javlxk%dp&Z)tYaP^ zx;ii;TGa5~vDwwGB{*W6mDz$>5GP(IaA3X?>v!{hfe;s;40?FNk$@3KhNFfupbSZn zFkxUnKA0Fa{|D3C*q1Nq=I8tsuPYC9p@JiSghT;pG>c=!&1H9EdrqzpzayEvyO0$RQozkkV$6{vHnBwqXM6Zn}zg}T1p zeU(|u&;G_@lq^*wFN8A+o)X2{4Mk1q!W!#-O}uN!(qm>WM2S@H#CJsMWx$d11Io}w zGtFnjYAd$UFYY@Z=Npg~FgO|J0wXvlg>??5a&)BezQf)pS z-mI9d%H1-Z6k1ax`i7emBs9V|m-AyFd3z17RAy`vs?H50ovv2?A?bSNzM z8|Z>Ds|2Ld8g7xqN~`eSh|zu8hQ+sDDpJRZ$~RP7KD=>{Ub$T||7fn# zL`3oOWu=If3Mv20M26Gy6apdl4{66s%$!UE&RG*H6DFp6P^iQM@2bFlFbef!UFLKG zXRJPxg=Nb9RQv}J4AAu#^nT;pe}*tR;g%=}50J|(u&i}BnS1(^;||)?lerK2I9%4r zAY0w=5QYK6HF-G? 
z)$U0_8bK$e7-W_lqueZcTHE`YQ$owRLm&-obN^ zJ*88fH+s!_*8cfGb0{xJyr|WM=>=F{P5_)0ou<^h*O89o$Ci^H?lhcZGV?X^eUrva zN|(yO-|gku|1JMNQ__21j*E^Dir*)D{=vMRB!7GwyEpxGLi--_1u_J!)hyo2J|m@c zG`7EQ$NLalcpqi`jdXYYz4FJ_yXW3H+_7BlS+5Ihf4-|vA3q}4GESfC1iimM*QLuW zLah6;g)kItbY4faKoG9hMdjbZCx?RUW#5xRx}dgs9X^6H z@0g)Vt@mGOc!cEh5Kp&t@c8+`*s8AK$EPssr=Ulj2CA<4q;kpCWUECbR6rh`avII9E%N1RTS5e>a(}Q8l2HRuDC;9F$}@ms0S1 z7WqP&K8R!HyrcH9H%Tkj^~t_H@O?32f&m@gbpQ=ly0)`}XC|mJ-jtwq*{4{~1<&t| z|A~74a@=$!9{f$OLA8C^e4}jy(q1jHk7J4}iL?G`S1P&%6pc35X2#h-bR6BBl$=Zaq51*^=iF7j z=Z`Cg1iM>n2=4IJw9z0r7k{vudGBwx2~q$YQEwjdb;3tqn~V>8*i9a&D%ouVLpIZ* zp2~@wI%<>6X;%T_2}hkewzXl}fM^TbfJhGpt*nqV{Zrik!QER0#T~T$f)gMCg1bwC zySoM_Sa5>7ySqaO&fxCu?hxDucMBdoNSL76f#f~s+f!Tn-JFZvb2C+R_wTW8{?+nG zjRZj^Q8Z~hCV&iAx<%w&=wfHJmyNs^nJo>RmB&cURatKM=gdIt*7JQp{1keZL6MK- z&hqRV#%zmw&Ynx-ns=C5M6xWhXKcK{VAC{Vm>x7w194IhI1s6R<17`Mxj&s<PAtBs;6S=ILXd~QzZHJ*9dptGm4=oz zk|G(yF8F;s=7PR0)_dBU&WdfR6QVE!mkL|9uB3ym-E_o(GgBNReeRH`RU2C_EuU*K z7AeFtclZyn?kY{D+Qj6?f@#q-gzj-rBrt{oZ`E5Xa^&C^-Uy+aDg8)kl>c~(X)&0S zlcH~{&)}J{-?S8l!-=!u@g;;(;+83(U<>kB4SxJMKdsnU18miaAFgk@0=sg#xhyOF zLgkqPW3ojbi^LAvpxR}yC)n)(oMK5?6xtvLwZkJ4Btu=J2}Z57UDvuSGi%|flhAg9 zMQ*e|di!HVkDRs9UmNn$4;#D;C7DZP&(jlYynl*+RM|(sCxJSFUx&`XZ!-VMjoYc* zf{6h*cx!-qg{vl^t%(V7CE6=NSB}TU5a#n&!&+0Bh1F+eBRaJf0q-}?^M^Jxe0-NM zD1IKW&@6Cj<*s6C6Wd*%8_Urk$k)g@hCf?7s+F#tlq*VMB6TxODK$+hm9ib*(O3wS zoZ`mLl#-=Jh=WwFs4YlhUPFb432?}3Mz)865`I6_mw9#jO_2HPQGA)nsB!!V|Hy5= z5>VT4nZdz*gtaA!f6&usVEp8wXq7(@DMmA1jHLiAGmmmSO6GE5{~j|o2$4o6ZI;O@ z&0;Vj%2Q)_%YtE~AejVjJY#;WFs-0L@XX!Z%5jYqn!$%gk#oALy0$#yT{4-4vJx|F zMk={w=uT*A71GFsI)4C@&nKA3I4Kd>rMA~ftz@J5P)aJ)fLx`Z*ZN7XbZ;6F6`;>F zZcy1cNtc`wBxh6yHJ9JV6%O{EzTZ8=!OI3{k7{zv-ylVXTcll=mJ$jezis&OntoJ+ zi)!b`usR(x1-Tg2p|MVG+2}kzA57h?F86?!=0fKot2;Tjkr&~a{)cM{1`2J4A$BE} zj57=9g5deiZP{|-p_&Is-HymQ1Y(+Dx()pApLw?N$gGFz-MB9s<<3v1;qw+|$Q)J2 zt)>Y@dKGE7`-jk(@$|#qH--(>bkv|lS~8_v1hZjQMt4W7nVeQ+}X@ z-a!8l?}rvB1>p%MMe{KY##DyMw@nt3O3^%VNh}`agA}5Hmw_E=vq%U)P6rw7?-tvG 
zA3sTmoNZ?^ET+Z6TNhl8?pzF_Ze$4pK+K&Py2&k#x7DDqm5=pI=i%R)yx7rmC$h2W57q;>)a359e$XSN17i>gYcj;IEj`#6n0#r z$lDY${dK}hG)ya7GpSaPv~J;QwzQ5aNsbAt?JaG(rtB_DPPp+wvk*P@O^w!?iw4#(2zNQvxg6AR*6;`o75RM$0eMbTn zz=d^cRqoH|tlEHsx)9M)Oi5L|1{N4mNHIhTcD$Zt9>g4ulXehOz#rfBpyjV4t(Zff zi-}cpRRxA7vbgj|CZo?$(3?jC{9XL%bQcO|1{dH7_d|(xBg~2{R$(S%`v2_n4dJ1njc3(v zs0WN+L<>hn;VDrwibQmY?9)@Wa22=;Yt5NTg*2X43{=VttH{NWwx?uVjw4SYa<#Ke zcsVc!&UTn$$wy0N7$p`~;|J19jw?s!e@x4j7VsSS*sz|BVFI_teiHG1@QC?Fl+v>= zCcZ%ad!F+z6jok51a0=FUy2#vq}icco8T(Fnu)Lp-P8Tkb+;GyyLp)?OHf!Y z>~;XHxN@hHR=OLQn>#0}39PFTV}i7US8HkKgpvGm1q>Q@B=1sNYWm0)05Vg}HFs3;kPXVVSO z==@#ljf}MMx1pm0?|#SE(GVjnEn;&iy&`o7oo8~fi;YTecj6u+@C{V+^xP@V6`2Gt zk7zOMa~HohLHvSU-{%g;;w87sf?MAwmob%8kxMAaFq(kz%HhodR#T{SZq$;om*$zF zu!ISN%ES<&qd>-1L3AvJN3y-H&Fpm?BA0-{T5+(+nYBg1X{~H~j7iIi)E0qe=R_%C z2mdt+uVG^=LX_a!U3xXkV z+D|W1FAiJnlMUxR+3zDYa1y@mzZ>mT*6O3{)67feiYRWf{#zK?F_sgc4ZoDyIWtgBX8_&NG> zp`>skdVQyqKTOPWpdTNUCbva~`j|oy=Q1`& zVt-kj!lERMZ%3OaT|-66`tv=DlM?(&+ibC3qD*}C*jcw@DSYomCmNMyH|-b8RLB7? 
zN@G_Eu6IBlt#~oz@>k>~+|fXVZG@(3dCcfCAsK5D1??x|jGg6z+GZ2EGk7t&*xI-8 zQru+JJ^l*%66R6lrbtnpym^)dN5F?7X=m)G@Dke4h{B`HLr_E|g~&tK@obcGlzO64 zx%`IUV^QTF0A2U4qveL3Gyg|F<0s(x?q&`8O<*Ao4S6}p?ormHdvE;lX%A~oHoK_( zmgTxU_e^pe=*(7D>P&nj-CT_iIqA=nsWd-~f#Tz<-<|&`SjS-EE#$o*0d`APN>ljG z0rmb&{kMlf@NcHm=2tTJT7Iv=Pj=s#NESf~_0V^z&O)2vBsXP^KCw>$2anP=rBAQ1 zgiets)C2||7q7}k-E@Dp9pHfVK#vL174#ZK`&*kAr9pX5vw+;!nnQ5CuRzJ0x zu)EFN`s*;;;}fIzKOJUc>Z$J1*8VxnW_5P`ALuX}u;AM9@N&v!VzKLFtuXJ9+$!3< zqQuV`wC#3UEmM4x%?q$SU%a-AUkF1}ib*>BIW@|MhMwgcqZ_(s-=Ct@q5@TIH2*$B zP_r@{;{F?poXxOaE{&N1U+H=48(teP&L*E)3cH=jLBw=IvtlzUB+Flw-|A;#8Mq<{ z_WNp$4DZb$a~U+Hte^qh{PC_k6d`}1QDNF^xm&29(x&+dJNZ`O9$Gz9I}?-lq9Cl{ zn_9Fxzl`fQ7PjmAIBcN-cPhgU%C=A!GEV7Lyvy4ZR!Ky_$8_4>KDi znPbW>@bo&d^~JUd4v3%D@~S|tEJi$0t%y~}+E+`)uprwyUCH`t0`6u*%s{i7K~RHR zFGmwust|LXg3&sWxfQo~jbQ!i_3uA`Cv)StJQB*yOK=VaC{ zgpC=p$nM}ekf!**2MN?5cX^zRNJKGPnMp;S^EM0xgif^ZI^2}iF!BymdeF(o;86() z7+NGC(ADiysb2-eUhjm6W8+P`^Slv0>M@_!nYNxzjIEG@!kx zWjfW59c9^Q4~{xz^ma(zu^WPnsIT!<#WEcIY$dB7@5VX_ry#hniDT3 z0?CMB--do^rX!w&?m!KEcnX`LVJG+^l&}t3Yd+45aEpaM8EO1CrSTl)+rn z__$eq0? z9-vH3T&v_};VI9QUdqa;JeqYFwdCRRmu)q|!HDtFYLg0P8{EerD~N?u8v+WSHq7g? zzD=H=kt3|MmJP(=X*oGzFlRh{W31JaHr(U{9NbTBW|1{0ykvnlRUT&5DAf@)1%2ea zXK-^iVybL{v)=5tR#9z5&|FIbF@ok5`e&9_7QJ$&P(%i(ob=c>ChfJQE--*C6%2(8 z_!I1LJy{cU18hc(OucC3gaJjDPNI}@9#)Y%Vfk(B@7_A+oZxkGHxq$?8Hn(#6;0no;JMaHAR0UuHbh z??3*9VECBO$l9^NCpMn+qqld_T)-!dnaBW)8WG&IA`F58U(Vqn0u2Pzwa;3vC(}qA zSTd%@EGe0YkvRQ2VLV%*-*=JSM%RX4rR7YW`MWCsS$5V5+YD0{qFqQzD(Dy{P){b} zS#=Gv22lGiyE2i|ZzzD|O@k9`YLbhAg(bXGqQcTDVeo+r$+fv9qROh0Zqi}HgRyV` zX!~PrGn?gg)lu^j&LrHeZ;-92Owd;cY$ql|8?T$*g=uR04Foqb#Wh0SxPH&u#A7R& zDt?+3*@2i~P}=1gU;0ku_bopCv|93HBgDuHGGnz?5NgO|oVS^iy*#7rwyi!RU*NSg z>Gs~&3%LDlZO2*|&6ODB?XpmeQJSOo8544&8fzC&ME2r6Ei<=)%H8zQa`r*lZ>RCf zHE~f`Iggx(OvVxX8Z8!5phgu!5|y;9&GZr;U~gd~aZUO>{Ft=b;?Q;;caN*G(jX>uz=?3pN!JpvF`;d& zFFe_LKOV`BjkT1+AhRIF9!HTy8A%20Kw1>_pKtSB7;?8 zPwM9P_+PQsWhcK&_Cnurl_P$N;yEbdiY8qtQXVG__$|6gAh)6*n8Xti%tL1(r=u-? 
z7R{5jDv*kbnKU7J^7KjbtePj&sjnyRe_uG_Fn(&)C(z1klXnjPLft&9uBQ z;O8WBlGWf5>;A|p@my;YCs?GPFcHmitBHJ?@+)cRvDwb5^)Y&**;sg_SxTOU($yM_ zbZyPM$+)H2qo4(CxB=Anc#6 z-#ikczPI*4762=fnG}asjL9+`(RU;S_V>4O>nZ&sqEFMWxsEO$MD_gg{#o}EXd~>jN@S=S@@7D_^jTn#g}w~%?sWa=6cu6 zq=0xsjRwNjAue=)sQ~M9N0s-W$UF7v=*;U-(0mc|Zj|JakmQApYFruB$zl>2E1{%7 zn(;$6(HF{q9jS;}R}Ony|5XBXzl~8GXhKBZ2M4Tyrdcu5EJ@(K*y2{7K?*>3_aGBY zXCcl(kM@d~W;aX-&nduy9FkmXboJcD1!s8tPGHWrn%i(mnbIQqhzi@@Vi!?c(|9i| zBzeJCB5*fTp%Ey($ot*-9df!Ewdw^77WwiwG2K4^O7lJw|49sL`AR0(Td8;5Cm*CQ z0+T#k%N?zw!eaJ%5&i&{h-UJARlgZb%_4v5Ey4I2HKs|({mI+wEkL?P#$#qMX?LRj zw^h3my8P>e_gbuzC&}B%Uhxj3D7N=fxKhhJAFR{DzPCQfp2VaULP>exI*XDq=tJL< zAq{2r<$ioLu()|$rHh8k*hYRzoo@e>*oak0DB1hQ%Th6Ae0ZRzGS)=-!L&})Max6H zbrC9%w_m?i80sjT`Sk^N5>^jPRdEsf&U$Ah{gWEKC(>wpX-HyXUTxo*;s}}Xg_tz4 zvNd`WUpZBhAUa}1FA}|hg~s-;dz72k^ZeT+u3OwgYqR5p_!g{w*fylI@h&2-3S4zl z;@*On%((-RS9KSR;Mnt02wSDi1?kmPISazUYApz9lpRsaN!p%@l%dYKgx_x_HB-J8 zF|giS4ZXMYQ@GP~V*JLvXoLH$K|SG@C4B$^_l)ePec8k#^+-+GS~ukR;TTx_2YH(C zA|qu2qAuo{^WA}-rYjy=UejqgGOw$w1wV2`r!#&x{fnaNz)4tBLY)E|3_}=A-w6+B zFiU2c$xyF5+9 zcP`L1{u!fiN@c34wrlV;ZOiEzM zkH-z)IoKF!W)8r=1=x(XeDfeBsx|;{HD*m&$?-Tqf$i>5)$b|Ju_pAX-r($cclI^Y z6y^(HePgptylyX9y9k}Og1=Jy+)T5hoyiaKJhqz6>*V-;@|i4{Iq5Jgozf$SxN2mU zg0@J(juK{OH>#v67Kmg^Zv`++_c(B(@OEeiSbYF+{{Gt4N`UCUzO&z2SYX_6G0EmW zOEyd~W6(P#d}_Ppl*%_0#J_w%NYt}c5XG=x(Fbf;>()H~WQ(=_$@~~+1jAC$A>5#D zjVHq4hAp;lb9a;_R=p^`Kn+VqjZ7fc`k)k}CDd{@9yOKwBYA0>T5Z)gL`5W%8 z2#0ZChRtQUHxoC`@H6I_7nSWmnf`gpNKOfv5lrocm#oJOy}^ESYq>-e{$dc&fALib zcV7QnGgs<^akPPYKE!kHKLFS8HVOT-`1gwrb%F4d3r%XgTB4Jk%xQSvmuAN&WmeEf zd4v?~JBG(9q>EIT1olUs8n@caJZA>kz8r8YeZUMtHZrTog(X&WHL#9o9ZCMGMPJbP zew!;H;|*MAVOE*M7#Z&N%r|;ynOXupHX~ab5iBB1#^O+ETtW&ejEFvaz==RfOIPp@ zk0b}j48b>MCbMstVZ*xX%n<7j;@K%B1^Inhx8Tiy?b`A@POx-KH6jzF1Y$TRdI6@E3o ziRSaOZ}iv2DDXyxIpi*FM%poTg-Yl3ge;bvj>T;CXyM4>9$2qcC3wqyBP_mAFHy8n z7E%#Z*KO0YRE=G`47qf1k5wk8`C9dZ3cC)I%DKF=js-2w9#}WSGQ+Z?EK$?-0({D0 zcdPRUKo+ld<(0dwd{1?q>__?NpWJ{&4Y_5|b@DqhV)z4KG&=`VSLGQ$LXJ@}g*qsX 
zg2wT=nW{25U?sUu_Zj&%!=)OsiZ@OK0^%|7B zQt5BY33T>{xPql}&FeJW8lEa9CZ_;Ks-|}5+QyJ(<7AT~m*kh@T{VO)Q{qUgo@HJ&)^9ktxSWd3THJbuH9QCmr)q%NfW zJsw(x_W*3vL>@!h9@>RY?S!N^7jiK>NJ1k_lR;f43uaM1Oovi#~*0Df1Jb)oW!X970v;vsxd zYbUN9pC464m0sYd#6(l&4<&pZ&B(}z+^=Bc5X*(#*lZl;+Nc0|5GAw1@BQ{|V-_K5 z4UT-{fGu-c_8@qNT4-pC+1o0Ry|$>r4r6@E`sP;*J2;B`f|EZ#?xL(7t7l=4s#QWH zZL**uuDw(VGs=WN-|qcUBzs+;AEkpe_0o(sSm2z2p*@TfR!ig6zU*T$UD|>qp`rva zu_(T<1HK_SP7De%Gv2?HUICc;nRh4RS}m z4&%}iT&O%t`ZZb1`dmL34T8)g-yBq zEK2Jb&VpB_rCRi=KX4z_+{QkPWg3pT5jYmm{?(>K_dv`us>Cv4ogCGn8m|RbZ;;QC z>53A5Rxiz;kNYq}5pBSs1zBN6jl9R*3cFntwIg4)h{50JwdfA5NvX~*in@nmd=0>2 zbBx+TD?|E|!-tR3lyGfi7J1v7Xst$T(%m~|$tQH`>XW1IOKj5VWv#TVeoPy9gQMJt zjiKo*^$1N6nSe&(hgayrtQ$)6r7J<7X@~6f>p6{5c=$_qiSZ{~7th@R07dvNs7VGZeZ zST8*JQoNTry`b^*%HX*8_9>;|4TjM)`n3*_)m5Xn-hCVAy_s~;t%Pz6rOdkCA`d?= zGt-I59{^rc+6?AI$t_XQviRIZfa6jR2_D(RBrbr=#h>iPN#(M&r6kYh3;F?1ft4Go z(|VkFljwnxB(r^`nRYH_(%`mxzU-Z|)5tP5R^pWkSbnK-6peQxzQco}WB>g{tPV*L z#+xbB*;qS^*Kw@*W+Fx{*6t(NAc{A<)JxyB4?5UUp`*URl39nI)OUT{w<+bXV6Y;K zZ0!Nj+eCzM;f$YydV7Bm%WtuJjiq7*F-)-Q<(MHRWLZ#T@A7E;G%i(aosnjWlrjQR zeD(-gq~Q@;huSXt=rp+@>cu?jM;)>k1`9-3DN+n&uC0>^T!7kXtX)q8Yw_&?%j60U zbt$}=9a6qhLF((l5Q<_oZ<4Z2cy;|O*$Zf~_}yXV@0?%!gE5PP4R(GOnWoo$#e_Hz zbQR5~w87nwOi2SY9&x_-_5Mx21lQ~LRD(0ZUk$bxzj5xR=|Dwgc5w-SoDgiKT42*j z`A0yrNf#Fm0|aET{2V#{+U%6O@e?IS;Wb3^-;>08T0Npq z<@K@wzbovlHXgu@4>Y^|YZPsb{dXZ+C4G>xl#0H-T0jLzS<1Rwj(9`(W8cpvkg}Bd zGsArUDN9+3M^5oy%2Lkp0{?GiDapheEb}tv4Ewj``l>V?py1x`(9g;wH2NevMhvEq z8*(JQqL^XC8(LC88NL`Opl1q8Wr^Ikh~0K<=Awefmc|l z(n~he?OUs3^A+oJ60blgns#mG1Y5@tGYTc{#Kd~|N^71NW#rIDtgd7cLA{0&dj++S z`&O9D+8QX)*xVR4n=bopN>4c2HNbf?O-kQu(`vnFkzgQ$UcRa^fDc%k95Tk;WLlb~ z>)NaNIa$(VMV?Bm)cN;qG^N}U6kf5|%u@d{J>@&>P~+i@hB&G?i7G#pY9D*<9&WnU zWZYo6VIy(8NM@ezUv>r@!rB&ETgb-CZRb+tP3XfK1_rijd-3e80i;J<@ujzMeT$3) z3ET=g#w%zu$o%+^oMw^Y2qwKtO`4o>yVkv{#4!?%8NuT{DXgC>SR*+TX@@j+>Uo4U z6Z^H3scOX&^R%;s0oE}BEtCiL>#_Jh?EU~M9fvctbr9w5)6{;LkoieP zXkaKgW%6+zy{bZ=$8c6Cc@!9HSmq`yMChGO%bDfT#@7FFjENz0Oo}=N=QK1NA@5mJ 
zB>{!RP4(&X;;A+vzdqV@T619TPuNg^`|udj?N?X^wz0l#r2`E6T1zUF z9a#?qWB$c6hFJu(aV%>2k2)*&LpH(aUlRCZCpNr~u-A2_JpOkCCG z=LbV*3Zb?DA1DYF92^*t9AFxV=d$6B-f0gs9E5Y$aI%r%i|R^RPuazCHC9v3DX80b9pS~;UhBj3?DXxiU}juS+)O-D^;@G9*`XXP@mMcqawU5-=q~(P6oxXnJJCj779%3>SwIt()<04Q*)*kxoR%IbS@mg*{ z{P0IWtdiGCZzE!UIJQ7VX949OfWY-wUs6f@DQ)O3Gc#2iJ zGBCum_}v?M-yl(!WT0`Xf}jyKUqrgDP?A)^<&boE!MY_R7lfKbf4r(s@Ok34*22ULm&_@F zg|*y?ualM0(8U;1;@(Y7`*v<)x{lRwx%M>{E&+#W&19M zEM~B3W5&31OMfFdC1VKowHXbwWVMocfMNLV2!o^o1|x&xwlrlwPzl(Sr_R3Ji$67m zF9l%Nu)7`AHaT@|vZVIh9pTA`-RAK0pPzOoZIafOAuj$n8@|CZ7Zq|vZp-LJa_E#>HzB2NyELzq#Wa-Ix%Xe zpf3eqV_$rPG~LX`)_H1ho&HG8r9gI*VA6fLh#bruG301$(b;4lvNg)?h*%zBngSHD z?6+pBEf&Sfj{+rTg$ZkDAZULkW2eGI1DZIAEvF8&{g!L^N$}L$6qtKG3!e{Yfnu9w z{;iyzbiXCyE%V`xS@Al(r5|3;KKjT8gXJ8EJ5dhX&;s*<@13$@PlmxV3JEo| zY@wLV+r-K$A4gjrCH+)K&NWxmf+DL8W2>woG^5~^o0(fo4XbQS5!gB&GMKnXjNnWg zLyo42PmsUv&Tf<}l`5|YWTCB9woWc@h33zuBZyOE25H1Im{9Dcyy2iSL!h2n znuW3mXRz0Rn-SC8H^tv?>>^!fNY*rfoY3NiMP+up))ypKdc zdllUR)3KM5Z-KSwyd|RIoXrULz6&i(M=>h!ckebfftJmUP2M4}k#hgX>%(~}@hPpc zoD|EBqYx#?#SK&2c@ zao+VyslvQfg-H4TI|=h<{@YueJ8njsW`L~FcpoX`4yVFLkK1>g?2&ww&((rfnx;e*KM-huSfi>3B-JDqmWBwOZJ< z?2i1a{Xw}~DO|kgaf{FDZ_AOpQYmHatINktq@zJ`HYjrvH@sQJ*&`K5-5s3E$y3b9 z%UZ4~+M%ZE&dErI%x_Gg%a;n*l%{0&N!oGwKdIJ4e}JcxsX(b>faeYw^-b9 zUg1kdj;043UY-Zqv)Y&s%pm_j%Q@mB?@+xjW2hziGV17m>8;CTWpf!7()*PYi|=q% zo)X2E?jnUD7P|>nPtF!Q??&{7hEQr2>@Ciscq&TqE~<`8^ELo71o}>U)~A;80mfJFc0@&R@Fr$+@e1Ykx=k;*5OOaR`POf@0p-w1 z=Jd-yPSUphs zJPp|U$+X!uVg6N%hjwUbcl!SBMXgv-ZgJxA)!#5wO`BU~kS8h7&89@@y}sOU)Q9FP zV1-X~B}sSoKGoIXE3ROolZkGBOZ(F&eBthgZVkAamlIJ_OUmtG8 z&8+56PP_fnAj9y!bwBj1d^tU{`}Sq8J0mXrMyRp7+Ydr*54~jJYWd#pRm1ny2UY;n zZCE4X&kduyNMEhn5pA|5yvoOWKNi1ic)@ZFCQH{Rn9`o+-}?S~SA_ToEhcQ;N+_d$ z06gU-k3lPwn-CYj0lR}<8HTx?m5J^EpySOS04#vyVKmhL3~T&+F8{I{WO*rMP&9OZ z)&0Br;Vdp1ul3Z8;5;~^itDHLtCOcM0cz)4r7`@Osn zR?nMFS2`lhrJ-TuBDs-%S5|o%2qs1^{&Z=aw<}}&&eWvAWCh^p=c&#^8tuFVWnZZf zUzLI8QXZrr3z6gVy^`IQP%mT)=)5CvJN4XqUAQdW{+vyML-ka#I~Y>Y>fslNZ8H-d 
z((>W_qfSD4M-0=I(oM-`uV2WM=fev4;G^G!u-8n=sdP*~q)SzBF8#N{&+bVbJG1tO zLxtNC6i=m-g7sYwzqB$!4D(0j2d1mV1FUmU!5=`HKe&)_RBl5u`L@(c<)180;ot+M zXPOHsr#vq-|0lZ^;`6Igq5TwrV<{V+_Niw%xxbf79>r|=Wp(p#b@RqQSp;n~JjTC> zseG?yw?*=-pRc2xuM5?~ENrJ`d_$2CCV%&Y%u2l<+!N02@#UVt=7FZqh6e-~I+No4 z>6t3a_j2iO>gVoiQt*MH-vp`GLfJnxvRBIgBCkp}9!?Z)3s5}gsVroDsUa#RH<~`7 zJ!{CDc@Bc4n@bZ0S2B*sZG3sw@PA07!?yn9O7$=U+gW-~#pOw?w_o<<^@@;+AH>?X zbJf1C#m{|ZS;G^5fg1nRa4OY(I90gKL-A5NiCq8j!Y}kMKSeUHZqVG#6L-+Y+>V8; z;IlB|w66g6rqAJzJeo z+?qhmR1edzorkUxw$~7&$#XMr_6k8J{&?Y_n{W2X9D3uro$N(};+wj^T?blO59bQE zIVfH;|3k>6e@0mYA9(sr0K7I*PT^w)P@Y%PKh)vEP&YHx!xU^-;WGT+nN>ZtfqWc) z_)IJv7Oi_k#qD)yFf0@wi zl~u@>{u$%y*8b3&J^Sr~eXZ@q7&0q6f4dedCVdSRAUPz{eD(g>qHW_3;95w)S0@o7 z1Pj&mPkuR;e?kx=vp@V5LbCgB52MZts_3TaFUsW@V>4y;&Y74tJNab=lvP_*p8o-e zLY7rOc%S_#6E?hvW0oWwvT%xT{;_hZQ1CTUIGl1Xv(c;e}(6g)#atHxzU#MDnbZ zE-%mwR^snd=BAJxIO81n<-Qj za7J_cmsW~3c1-jgNk%sxme*aNpB(g>(q8tN4G^FIeu2zjIiWnWDJFo=-+>6g1nS>-%FCV=2DASYIlIQ@1;dBobr=53 zWfs3RW*~;{O6i2)uP*&r{}5|swl^U=^h|1Q$B5nA4l*l?HObut3Y$H0qoyzI(SsJhHzBsas`>KLE|V~^@yww8?Z)G? 
z7P_clzXZs5g@5BA?e~y&mlM46LuK$!WyjQ))u6iM3f|1W9r3_0*8oLHBhc%FT4t#b zrkn`H?MA@bnOUya%;FJI+Ttoi{{hrs{bHalG|+b?mAh^}Nhm8)r~BU;!ae)n?%VI( z8EmW$?+l(yX<+XQjaF`ZZVXTkDL?&>5N@_R*iy13J&Qkp0C9zT(R$Sc2OC|*I~p_r#MlJ1MxVTs+fT%xTq;07f+H`k$&^OQUa1s+}vLQT|%Azi<9IAOA}!wy5=B z@4ky*J~G@CW=imktth{Et`y?p&a{`kLX*8`Xs4|jW25JljjP4yuZL8>oBcy}T=J&C z?^eX@UWdcTz_!^uKXn4dh3_0IyJcpO2?vp$n;L@EaEL}Xx=puRt>z&NBuHUmj ziU^nYbtjbl0azr3S*;=LKw{V-@%E6Z-17mS&B6$Gh)+lzm0JEJ3F=dE%Ci-gU7z+e zDkBvyw(UON&rMJ}6MxI48>+Z`Ps3mCa-aO`2cx=TLEh$?TeEVFmbf?Rj z;OPU0@uNxsO_j&3?|JAiqxY!+>?rM#knGUEXkF7GPO)Y;;bylQ>5EuKtLhsEgV><& zIDOW)RCgF`Xpf~NPoGks7UeXSv?^Dq&@(?l1hge2KK+4h#8Wb#D^X&QiTDaXxIc^Z zOYM!-g$NmdZi&y%>!h;u8~2VcD5Wzei_abQT(cwYMnpwLwAYfdAKX|J7vdfv{?gLeLVSnEmR}RHkxz}}kAzc&mf4)AT#-e zN=u5}nN~5gES;6zoil{uloQ_PN~_1+Cfh40JuH=r{{mfXbJK*fq7$54kufVke zuC{g=%3b@7sBnoscUR$Qecm*=%v^}4!`JMRIcMpfvFK0*3_SN$lU0;=AJ%!uA4?6K z+9d5jN)6fE1F0@tY=7rXHrfd*p{dS;m-+J-V&$VIZg#AN!`Hiv3*6fpfZX><*RLsF z>!|z0j|x~+dx(Uu`Gn1^y6*02J-x3mj^^`HW_2s(|>ihd~76 zSRx+n&CZFpV?@7{y}m!Tc|XtS@5BGJ89lcZPS`Ok49Pu%$Ff|<9>?vxNt$KxSASr9 zE$W6?N7uV`Q;n+&WworPIfb_61-x*O`RfI2xN!AaI?MKSodU9Ld(_zYEo7UH;1Osm z=nHA``1RorK%MUe`n^w(Vf3A%XMI0V3)yWQ z*xuzlwv}*n^r%p#Z9qsm9|b*Cxj_U}!EU;Htxm+{dIoHD`A0Ab`wC{zqVg|VvHs^( z`;W(Y75okTQjBNysxp=GAzY^K%Zec)mPaMGxk+muC6YiSj9FLe-Ox`aU~)c#OlYCR z*dqv1euGxXMFuj`fxa0me@Hi0rsAW0b|>T_7t+zAJ~+0{S5W2hF{V6oQ}V3zc`f~i zAhG!~L1pZw6_}P~uVUxK9hZ-;x5iHoW^KhtcFa<844t!(rF~SWYa!H)_g}DjnJVQI zRNx(fdA77wbBv3+XSOAYw5LAH&jZV1A;wlVc@|rTw^Th%;e61mtQ;bsFQ@L|I)zhN zqDR3qlQQjvE0_oC7bb0uT1;RQiSu#@JRRBiZ|g-{qpzjnW+Yfu7h*9`lGZ&$$lH0# zhq?S`Hpv{#Q!d} zKdyCEPMj$!R%tuPB((jqoa87y-@h+=S?>RonTPzR**uR_4sE)F+5T+z%r{%D;z~2? 
z$*!#ka*I{xCVxmX7l3b3IR;}9O5nw?Y?Z=-eA7&-dZW4Bd2VIXsWox-<*WQjG0#*> z3(&#$n@ri;c*xeTny3303*?eInGJlOpsvX3n|8iCG*k7Zm4sD#3RG4&U3NYx{v^hKZ`jo}y9WG(!A}K)vyfa%o-etU7^A$DPP^nkfDp?|oMp5}v1_4C*atOnV@0X@Gya zCxV1+C(OfXX*2V+kSFtOc}Mn)9h$xydadCvti`@w6K*9%+I>x4j~&{m>c35;z}iG+ zA0>uCD-mUM`I(UR)7K90YkZE6!D(;cFet3|ibJzp0QLwJw_HKH%D+Y<}HEmhv+(t!=MP5)u+hf%6(n}?7 zy=`iw!!b6)2I@`?XzheBVrC9F77b9Oyo!zNWqcP*=&N+|`owsJT~X~4o*bzRHC2jq zNdhS%qB0oZF#%o~HAF_SNsEeKULtk6F(}9E`PLB6mqtux?cOqkJ_GkI*vx8^Vy*)( zNk#NB;p}HIyF&-{9---|86V-s=rN*m5?{)<-jEsgvqr=eOKg5@^6meNytj<1Yuma7 zHz5fS+@0Vq!QEXG+}+*X5;VBG1%kV6+=4p zG3R2hxyJ0HigIM%k?xwE!e;rCO9pgdP+GWoYA#2QW%Tc#A>ACWmV(@4yk|vCq1YA+ zY1pDDJk8lWH>?HiE%8xvUEV^8t&=voTE@wt9+#&0KNpQr4K5Q2Q2M^r<5JYJ0|gG5gk7%%W9J zymCiD?&6sjmv6zst0jn`kQ`^*LnqkG;Q5X!?SsAby=r3Gd3e-tIW<*oJ}rvINkj$4 zwp~M8%6SePI2hA_g(J{meuX71_`}-nm7$OrXIuuE@>^k|=)sM=A54;B;JO9o*oDXt ztUOCK`oh;F8qq@pG#PeZ*yNazYJxB|87;R8cd+;UX~~f~wPp?VT*R&#iAT3x)FTa$ zS5jG~V}a~;lyxiJxj}YIabrbwyi9u%QKT;&zdnhPIjH9zJ5t$gkm-ksVU1&X#2ur; zwRI%Qo8Wu^$@50C0S0~0 z-Wa0AqUi06X`#A^9~psL_bo0ZuNaY>{B}+z96w3O0i1a5p=|j;?*~F&63x%bSb}x% zr6}DruV>zN@8w2{f&GB$8tbDg&MwQTJ5%DaaO3^AFreCoFE&0?$iMmv!Cx*`Ii{Fx zyqyA@dT=K9#y(CgOUVA9)H?pUBf-O_{Prs+SX5}D`|8a^%~JI%#%Q4`A~E839KuQv z0iX{xQgdOD6^qT#r_pjga0z9+%=@nc10e8>w;wvbB#QGq5U}bg_`Id-gKfPMW!)Iri2v z+*%S6Y$HFKMf5;$-MU<(S*~1PN(Cc~zO#S{EyQS#6-Rw7?An;J+%#cM++#-V2G``G z&1F&h3%ZOF{HSZBJ|km%}Z3?bo1&_NUA>9YLh)(mA##@u2`BAJ10 z8YMfv8~<&`{fUeE4Ufy`Lg7S`Ue@o_#9L%a;}qn6pR6lVvlb%XA&)7`acsjaY76X)1WaKC!)FLFE9}?BU(nKK*{#p+g))gDW z;R#B=VOG?PB0s!2J`-3@6xcbpQ3zIF!i!2JrW>jAEq(+l(}fP_Bb&-_2j|b*pOX`t z^)rgyw%EuRIt@X0Y~<+4W4H@c8bT(=zs~Dk=?*iyNgP)WjdQ6M7(OS|q^(D7)mE|E zk%yI}rH|GGo7=k>{MsI;v$S0)NiA1$*z>+F4hpTxIZ(n^mOhJ>{&riOpV#QFOa+h( z%JP!!MktO zb&Zu1R2h$1DYQUO*D?+N8Qh??!*vB;x8}WRqZ%J?Q;g{5iPhgh+7Qjmh6C9uX%Y!# zwM>}YjjQGilPlgL&DqI0$tUj2Hx|nTo0%0AxL0UBGqS77o!xD^sS+Z_N3=$CAYT%J zYKfFv74Sn%qEO_&nfAvVXr`?7)?t!V~2z{jqyXC z0g*(iEI&Z6u*-wtY&bs0Z{_Ro-2H4uRSb1`)oI?doH9o2sa4aDZ}q)A=>jIh*)sUj 
zH2pNDa%Rc*yOoBG>fk6UCi_fH`oiaB5;uBI7+59qOOo0~VHsJEXbKfa4@(U30ZQl~ zH4#Y*!@$Fc>+W-X+xg!_lk&2PXmR3J$n41NW6?ueQA`2KcBPn989CC!ns-!wrDmn< z!b|uScc*q{p@z!}<|LTLoGe?NFo<31b;gN?x6}rCuidLfXK&7+#W4Ei&%cL3Yj`-o z^Y#Z4D=ifg6B5FR;o8b8Q1vYJB|~(ser@oLX-k6v1+E( z=!wMfke|}$9}P>$;X4{;^Q1J=iWbtv)oN_sy$*LA*1I|aKY8=Ms~069u^N5v=zZw^ zd*KliCNC%dKkiW}2CDsZ26&TZxAw@n=>3+JUw0|*C{aHj7JAPf>%`g@pXMuheGiVl9;&-i~4_v?SJkoj~&L-#ne)b}u4{r*~_&-vW-u;(e?`d=rvzTZ5rX_MTrpXJSx5{&>3>=9sPeiJ5Pi}90a0GIFri33y z`VJ0F=T4qaYkj>=pT8LdANCH-D+c>g0l7YOh_el>e+n@zl&^COHRu|$eRY#z2EBJ?L9~?3n;(isk!K2MXc-^nA-EhC`=yczHwVFe_ELAdxpPse`R=mP6fs+) zJ71DD6UuQ@FdMtIj^yO;o9KCY-90c6T6~^g> zFj?xino5(K+x3{owTZr1+6MBtFXkwXT{ek0Iq$a)&7QW4o;pp-y}iC5HITzb0A4pWZ6 z;~oLNbN66-qqgT2|z{Shn}q?+{8d+i;I z-OHt>-kQe)^a(WuO2x#vMPWV}-{>6_2j#ZgIPA>^w^GFlDHAyJpsgTj`D3}++U6H< zsq3vK`Am}6g2-!8^^V6!e=kGU49ql`PifL{HcO`8F2iSqh!8eVVsNa}(>|KvN1;t956<<-)Y)Pa#)`e| z)fnZh!fnj!)*dl)e!xKaZD*^;9yrZnXH{%E)oiM;;Y(4#a&gWjxk%a)aY@L=iPalR z;<=0g>6xTG4<0+*Bbl#ZsQRXh=qNH7X9~ZExiQ6ViN!~X$@kntK4wR91*W1&XY94JOT!K*;kwN*TS4}7Zv?VT+z)W%45q|R0( z?UrD(D6;IOM*C1r%1=CX0&LgcFS-Wzxs$F`3hBO|xZWgo{7ULlYNzCV!*Ury_vD9r zqu0c1sq~21ns$CCHM8iuX6_pgeB)+ypC0%rT!AAIW_&ZNgq`e>|7-WBbJpCK{NaR- z=BUKYQdcvy*FTJeg@3fhZ%Im~3?YW6vHDnBdxf%RCLouyizG|7ZBX)LFXeOH7_O5S zRc)kwva)fw|2A&kY9g;JJ6Kd$&dFRDM5Ld!q~*xgI1E=|w4e{A9)}=3G4#Y~joK?3 z_0)x*^#1ffjd+)QyCS1n(wTM-UPgrVXVp=iO71+d$P87(g~gzTDOxs)aj@Mx;0Krk z)V;q3Lq>ItZRRF4>>{mPxQ<};l9=i+<|6G$=Fy&GB*A%|`{d~B*qz=1((h_1?f!AW z;^-uYp@h-|MgvE1mFZS+42bkR5gB<~H?`h657H12QZ!#nC|`L(6JiW%5r?Wmmr^aV zP-q32?EG{x%aE6s#odQsjmj^!F&K6Bw+pjyomaSiP5RC?m_w0;@G|k&IjzCyV;a1H zkyAgi>aHJDaGc?&Otz(^%G1i{zyrS{LZNhHG@|et6%=HR*$=ozmP_-LdC1Z0xDCs~ zR$n)REd=&bw`A)yV0&DUZil?4&S#PU}3RTmTl{6!4u;6U5L~ydT6kdWh ztR*$Fva)iW+e*BwEMwRYW)^ei>`=`mSDm}_g2-Tj3VV}6WMO~B3$V;;W^FcoH`2z5 zAJQQ{Mn#VS5l2K5YY4Q|oyB9h|DZ|W`4Rfc&5%)P)V8-y(a^bh%>`xF*sGKgb4%7lma=_y zC6wG?2aHUlc$V+99hA`J+MP7-u3DYBlyzn7eLKH>S|u2(GYu{LeLXS=ZE50O*fRS= z)9);^Mo|jGbcWYw2hR4jNB=k))|Jkxe}4yu{mo6Tc?OFOok7*Pj|i*YX9cHau>?CY 
z363Nva@;199(>D`k2aLuFjWn0B$*F~r6s!ZY(qxiq>b)lP>f+PtZa}l&BLsU)HHS^ z@9$ANKYk7lZgG>j9lu4)u@4v9G@+9whyD>~LcG+wk^El%;Z+(*DtyDHKi$wGE;dOq z$~d>6;(4k9j5V&RNJF86%9xP^*(mk~*8ucdV=lxRm zHAvYKPtFncnD z*G6+X_O}bBo{M}&J=F)VjLE|7NGzG@o`sbGKu%NACVmJ_$gjNwxA!Z+Y04(K7W9Ii^7vCeb)mBvJ4OWO;A3KJ_T z)2Zm|z9;v6n!QON2sLnE9X>6gV#P~93dgpl4)T!8lOtSOw9{v>LaGpzc*53rJ!7OlK8C?&Um;s0#6U&$Z@$nO`0%y#x^z$(%)G)bB%GUX~^@J%9_YP zV;-AY&mlBEle}A-dki1p*li6Sm9eQfP*y$QLq4Si-h|fd%)}L*em1zm)KkM=7@J3& zG8`74{>`DQ>3Ax8{|BW}C>j<=Oan-6)PM%t2HZ0PaH3a`+HH5vkL1`rfL3hmdlV$KYVlzP}t*GNTbM z@Ex0TzYzha>$i$kWS~PGraRBNd(`jlP^nbDDDzF$?>KRjz;k_$TT6?``hlAreSDN7 zMQ;6HKwBB7rp?ZiLdG#MN-_+!93lOrF5Pu=@MS5OYy*O=+{RQ_PLo-%BJF`V`bA&z zc^_qLY>4>*eo}h+!4d7hTUQ!29;>P^rgy$d$byU&T9m0 zS>XrsDApg3{pjOUJA8BaA-&w%YhaAjLn>4qB0nsA^K^1plF)$TbL(feez=;`!pUf2 zE}gz5ggd`PGtM2Z?cH*Rsk&F878zX}EE_j6lQIL9*FA5p;i;Ybk}r=fBumv3j%^y5 z#GhJNu3V?k*(L{WFz_0(=^j z(OrM&@T(AlPLLec(Jg5`ER=AiTf9&3Bhpmwy&HlGFSYz7USe^zv_Vo6((_rkn1<>utUovPr!SMc#-nf1x<2$o z$$dp&StNDxM)I%gq@CO+J8w)-cm*BH7Pi%Pe4ggD3^yYxF+fRVXXnJs6K(7DRqr;H zUYv8U&nW*5mp>|+JX_!p(cX?d3+f!_S@K4=S4uqkv}2m4)&!+xHOuE1k)`x`i~_m! z$Q=@ni(m{$dZzG7Q^?eVHNv^M8$|X90~^D6E4^@`_4Qf2jdVAaUZzCz!B`z+IwJuh z#yN|D(Wo2hapOlkR?#XEs!lJJJQFU*pO)cKG-T`INfHh=1~z?B#{+2y>eJh7HMOy5 z0hmr3Sicr=-_YQ#+qGG8Gg}chGl1IDON%9qoSjPbu-5Z?@KYJWtGP4iwJjGlA!5nC zPy_YpGbyKO*YV19VX9i-xweNSjBKzPsMqV@^ zE=Xlk3S7~Pao18}B=gNvztXtgyg2Gxth-Cg$3oIU?RITc?~b0>Ej8=OfC4JP-a(~K z*dNyExs2a8<&!T4QkVhy!+d!nl_okmEBEj=jOhEhj_5|>_)aXyK3bK!p;k}fZW-4` zrl_|&)ND$TU&l!3zA83^efGIci(jUhmI=Zc$urAcqOjz`}qCJ`GL=?YpQWY=h4xevjVG!fX?UmUv%Z_QVD^ zfio4{nMD77ZcYAX4T=QzntdgSv~i9Bqm`jTYN_udaLI!^2GpNoKa3MOG%w8hQQvLN zqG8Eqfa6o zf{1nIxA75EjtzTzNTO@aSnFsp9FjprD0iQt+L*6-3*RS#Oh>#Z)UD*<4fP&*S+}ip zX>m8gMCj^4J4ueyH9TVazpr~lCEz)k0?m>0yf~#M-jV-k3Mw=dR%h-a=sOx7bezUg0utl{gl3r8(Kvl$! 
zHE>*k^|;UcrP>D+4lE7LQ1Z@C*&*Un`3hJ!=<0+bpHB4N-rL}!WO8t7E`i>T(CrN4 zK0NHA$0=X2Q4XbVug8?vC7PRAKR`N+$VwId=aC+Vas;LZaA4`96KnYY58kK$UOehA zPy=HLqDlXq8*lkArE*9BS-S_*);Ho{tmx|9zY&_jmH= z+C5`?QPDrSpTQ+dG#k|aV{$)j)A3hVKBW03{rU^E?4DTH4)TFd+xiQ%rqL33CE_d3 zi+By3Gor_+;hS2H4ZUP|)~ zcCTFX;qLf2$@8qO3tVql+W9-$(OE(^XwsNl#m7#}w4SY`ZcSo0Rw&+HM3_XM9mm=; zHw_n&8@c|V2?J7%49*~8dc#)6-RT2DLO88Xe0dN;cu3qw7jy7q*aI)b23 zQlxPE4a)V7f69mg>1(YY6paS>18u$*R9tZ zg&b6hSlLs)%YGXEQP^^nkbOu(I>V7KIn?;d8OMZ`(JRyuCpixVQ><<+TKVQ%BiXB8 zz_)WcrLX$o>k$!_k-tF~6tTtmCPA{KQF`N?-I0C@^XEDmo3X-)8v__M&RUNk2ctvd zUbxG`^p-sYUA%q0oI`IbW2cTQWtj{ySoA4>LJqhl!-&%WBlHsaU1x-<)A3k|ow}N_tSS(TCPix+yJU_kA45eCHg{w>T}4 z4o8HlU!;g;u=NU>&M~~9$Von0Lt}}4I1FwqID?31ZN$yfe~xz2(IryP6>cZB2ImT5r^tUMorD(f?xJ^SN3n?2R zRjqUw^kHO^3}-^Li{VUm@i^(PjdgW|J7akgi@9hTWTPFfE@oqCvqNb>bWA89KyCIc$DcF?O-7@gEx|Jl=2B$I(PURLUG8$=jqw)sFmH!Y^tG9h>cP@_LPS1C^~&sTBlre*ovlF&L;H% zbpees(zb;31$EAhGMSgnM{+nzdVCr*Q!Y1FY8&a z7nSnurh|$y9Q79^h#&tZIoy+_?c_I;xb zY%4hrw-(7qv`kRlSh8EJnaqsZWU3B#ifH9Tru5?IZ8Y5brwh=BSgQQ4!9>6@UUpe0 zd`D#^iQDgDS&?J+m@)U54nG?k!70=h{55wIybYd_&pk%63HhuD*j>L5@uS8`4gx2c z1=dVkI7uXsn>d@M3lR zT*rKj{WvxfO;5tLy;*(3Cn%kKmScX!7$u&=F=Hbpl!VIAR|Qb@q!7n^<%zGiYZck} z86+P<;y?;jAQM7&EKdmGx;|-dzed-&M7k6Pi{gGeX+H^7Cu7q)4q|5qnL#YQJs^>M zo6Cw!I|%RIqzF7wlUSJE5UM*|_*2OdIQR>Ef)F$ln&r3mR-_R?6D?sOQ_Iua>@+*X zPei|}3?;7#v(zBrE~WdKRNKwW(OmjAp

67f7tA_JwDo_R*Ew1#ku{d}CI!sgd4I01Vp zg)kB=$jQ00PCg1Ow$>O!EkV225j+H>X(V&7YU?;_hK81rhNCF40Icbo*r3L8(;?JX zXG4b`iU|-@{WUNkCaqLbp{bve7y__0ot48`hA`<$7%NqM#Y)5w+plB%y3i{=V&V#P z&V&&J%n@yy+X$<)fdlbE^^7?{TAr31+)CZISaT zKHM7G51$o&FV0u~8f70LsiB0G@B^k3M>!;jI1Jj{cHM;lw*IT5S~j0V3Ck9eCLm}p@LCBag#yKoYSqC`2vU7cVi zkRG5ZgSidpWz{?qS~<-_*JD9`!-i2%m;v@)GZ?$FeX_mwbIZNgs_gx0rscv6C29)F!OdRnkEtdOY4yn@mJlpQAfo)c!ow#Ih3 zA_$Dk(Qn;@lbt59{SNM%vYl@@i0Me&qT-nSnkuZEoB{_2&DEk&DAVH7h{>pN6V*0N z3Bu#vYWK>!zVp;LFaXUhxgm08S9mgfjT8(v);+MsoBeS^Nyto6sZ z1q&Zza|&{e49LvWUx0X)tqNEGr>*!xV96rj$X+_~bTsE2D9V@Z1DA(}gP!E@S_Lmw z?#*kO*AcYFp+a90xPL3>Q!dg%V8|Q7LO0wCTh1?|$+=Ll=FNcnF_o)x+!nt?d8&KP zp}cvJuzIh`RnZHbPuDQ1*c8|oF^{}8C>f3WYrcR~Bt|axBO#0};yGF=ipm-3iCkVH zQjtOYTp4v}`LaqYc_yKyq)t}05KK*oRRcz%1NNx&& zIUe@K_Qhw#%I1rGtF_bZTdoh0vM(9*9}lx8v>(#0Ug?Z0f9dr2-w2 zGu7S8BGQV5Sn2;>o=RFk3mpP{an1z16S4$9PB1QeE!{Ilw8okFw!`rsXN@fZ6w5z)vk53$ZwJ9=tXVn(H08LA8-^qa+4T$x$))Un!sH&RwhIIqbYBz2dPf+mCUWxY zZbUEzXpnGX#j4ym@!zqmh>+Z33`Ij;o2$0{(E2X(=EODz{b;d0pX45eH+B_QXTLSH z4++LRHGiYSZi2HO_i&vvM29x*4jQVmTcL^VlU<@1Yc~{IYB+0$E!LI4wqarSOr{-z z6#TxY^yD(dSSsU;sOwi=-c}JLS%u<21`m4+x}u)F$MArCL@&Ka8raXueTYaMC~vjH zuo(20RFQ=9y<=J<>vpOl7sK?dZ_UX|Q)9rO$tLki#xrTG*8^yBJ)smRkP=n4R+5As z>~pU3!TvfUE!bIsTIhalS*D7!Z80MyM)Lhek(4I%W@(LPc+yEqMFWL-_aIklJ;0H^R@hUmCp2Q(_;0Ry|jG-=OYL?fkRu@$Htgg*0 z6fEG9TX1{}iWhU+>7UT<<(+>kO0Hrz6+V^lm_mU2*f(D?bx3iiuOfQYIgh_yGZ1w( zUCQsi5tJ2pSTQ)|i%Rb^90mWhaX7v2g8KNO$h)#m+^zJ?@Da?g;=1Fq~e&0Qm zRYTI{+i=h7krD70NQPS1iY{h_h6lK{Vl920$zgt@Gx$&@e#&&%v5$)Hy74+6@-PqG++QFTo1G}3_I!BuMM$IT&@poUMP=QV%P#)TE7dKECvw5y zPMWWxO%N$fhQHe<9~F=O0s+I?pNAP9daXr2s#gC6qT)YgdTk4BLQ1e4jn}t$$9Q8q z`-E5llve9HN?4(Oc=krb7EsoOd{<)pq~N=5q~QwvmWtUH11g!Wez&sg0kwG=Dmaby zK~xtowQ%I>)tcL*^53G#=z~&EwRCj#hhZ{?8S=n1k^2AoW)|I@j=#8VN8)N%i%$ zKTtNmxgc65vls4Xl)=lGu=P=rS_oTj+pF_+{}P*?_Y}ez*vxmbZ$ZJ?iL+*EvBtga zl$+w3?~gHDTxZ{_d_>68sGJ?FK=)o0>c2zW_@c8u$J@49-MX(EK5k1 zn)^)x#YfS}k}3D$(-l^M;_OJmHWJTwdSQ(_{-R< zGTl|eJ=emfONS`VjQdVo?@9a;P{agDug555G` 
zGby{Ry3kIgy-jf9MBcFk?RNMCP`Lk(Lg@vtua|$`O3uT651%79pztoATY=>C*_e+< z8h@twXls-_?6{o3v|h@vy9u)Av}a>e|nx zy-)8fRl&L3Jm~AKh+l;LVky~?LGz0#KH5^vfXwAfw5PjVwsh|op;2!RD(;N%=`S(m z)OFL3TetYzipvLkRLWxd%+#PPdkL!8AeGpLO{RA;}}`JYIcAxWJucm1Q|D zsRh=K4x^jys-70!JNPC_=*%EV-YBM1IVoDOY_)piAkfwJU4nFq)!EFJq9ODaW z(hJkJUcmsOd)J<$!F*@pZB)%Z8t~o z-i`87_{ReTc$MpZ4*<><1z4wm$uW{b*n`g*y`~WMHrnt{@ZTT; z7H&INM&D=>!Ntd@ym_qq=QSO7uU&c)eet~9UEgzQdH&?%OFnx0J}mAzq_dXXFJNs= za;&V@CLvs?M&;uQW*o~suRbVC?gjvo@>Z#>9mQZznm%Rin5fo)m$>b>9Rm)OobBON zoELSqCTtW%j+=Tu%QXD4&~^PA(5IUr_V+wRm%l}}me07ppqwGJv!n2Ir5aJ(O!EF^ zTFJttt}CzpT{KT? z+kBY-Vvyuo{b0?p0C(cx#6Ba(_D1dg2JoO_w7$uGs<}03!@SXxdnp5Q=HuMEwqtt0 zf%CG_OlSHN18q2c#}<)CpdED_hjIJuh?hWayoY`1VIzm;a^V$+IC%msPW) z;MT)-T&1%s*G4X=(yU5!?Tt`{prl|3-PaXW~TJv2={Uh2{cu?bO>?u>PO&nwDeMfscjgcOcIu zeT|iMKpq7yLZSD=v%@>;1YU=#yO76wx0Wxaf1H!bd;vGZ(EYqh_Zwn!K5H4jJB7ey z+!ac}i&GEHxMR;ocrUX9M+rbr>>=4lS4<%MJ1$Ta+{zxH`{I(s`i5uo@Iw@Qo?c>A zim2mKd_i=rELm!PKL9|HR++&22lS519g7N~_s+6IxJ#1byhe`!MK4G+UGZG+SY!#? zIx9QOT%$N4s`Kkxcs|2dG1><(nJMmC9~Z}|!TjA-{L)qK!u4m8O`R@_c+9^NK3ol;0ck;fN}sClzsO&Q558RPkg@a&*Q z%B;xb&KnBapQ{#r)Xx+mUJj5Q!J@Mdz-T?PT$w42`%=2#VAo|VUKRFs&SfGG2{RUl zoVij&OH4oe#zsufta8=JypDyhuh~NfxlY{9y&0!=8oI-Agd;zG6l+#JZ<|)|TQp21 zC(Xo6d3f!4GAkl|mNaWcNM%6davh~i()%IAuk>yLjnyA!7QWbjb|3utW*6E)Pbz$3 z00l`+u$@N#UGGta**(KOiXCF07f7Awk-cz5WD)O*1YP|wbHLeekJj1_}(#43%C4N+jwZ>Z??qJL8Dl<9yq`37H2w= zO5TY!Mzf_Qd9E??BP)mXN8iy>NIqlmDedgGrPb_xS$pu(pF1F~b4QQTa{t@`i8|~V z`{xcw8G>pi;h_QlA5hQxV(t){)=0QPdfgVz2y3#+!QA zGv=H$18%=q8>%VR`k*(~dE8lInhy91eV)5{(C>Li#V#^5S|ryZzFG0}KtWO$Oq%r9 ze$39=+wPk@XAWTbE&!PWn0HXKm*g(bnFGGH+??ud zLyzyyoM)Sklad=19c^U$5EvXpKja%J7KZLf#+r03PxWX_^nb!>lS{~Be22!3*O{JO z^vYrAgK2oC3=#TC^|=>rVrEu=CSOi<_LNh2*_W~O4=tTPXr%MuKuO6R4Dg^Nj5Sc4 zaro~+oZ>cUYYW*7iIHqw@{wXAr}CI8wtR7c--E7F>odLN_Q*+%BX9`a60g>GFF|WR zxC;+6&l9&(V+Um%Qr?j(RlM#ZI#;=JHeqBXH4>}a)BT_nVTKM<1@Q*aAd{@~lgr!L zMbEo~VAl7BP>46;?C;pxpu{BLt?P-TWY9QKLvKW^=`P^eE zJf`{d;2{(MWXLyURmCo>w6RSj%E}VA8ssq(jmg=INFVCMJ(u`)JR6o`^r)K^e@3wN 
zn3nFc)bl4Bg?r@?d~vKfoPJk}5R(6vw_mY=HJ{3e;FaFuR}-lfGjSu6pzTcKMtz-G z2p{t+vH6w0W@hUis4B3t%J`gIu4NXPu-0tIaBT1gD8)%UVz%yIZq@WVF5BJg3j|$@ zhVz!zb`~C4CfX^2)oz~_uB$1{31)lu<>3&FBI!!XAdV9zS^R0_Fj!9Ykaqg;lkU(z zYg>nvc3C(GP$t%#p%uaiHQ1*YFA}~&&Tv{&BR5>o5{j3im7gV~rcVp8pMzXKTI^{X zNxFwUe8ZT3Mg8kY04vda_7}NODs`&z}6Mt`@;sf=5>tkUH)cwfC`|?R&$oi}4vH4<&cDrVV z664jpVqGh#v5t+0qDz#=?K{L-^Y$-`_ZB;c)1 zp$=+SJB}W~K2IYGqW6LG-kS7}bc#_%TwQmO{DWEY3@f(Z9$nII@Z0Mu#^wtK`0cs? zJ?(h_=`p_PuNXi|XvK;hFeK@DNO}F5Y|QFuGGK50MLj^UtbRl48_$=cf5q_$akaMn zS(AHMAhoz}uW{OR_3Fg3;#FtK0}B7(dwJ#iQ5KAPA7-nWr?wJb z>G-xeC;_k z_7yjvva~(g_Nb63otz1k&bK2lbJ5qIEGuupuxqgEF7{a~l6Mpf$@KxlCholj1A9Qo z!&q-Z&f;Gn+c!z8kFG~b1o+BXQ&oV#uMzj}%{1>XlFr~C!5H|RI1f3q5APO)A3c>< zpFS-F9pfpVt6Mi(j%l zqtakosg3LFqwpl!UPp|sG3%jNH`{0ZZ4n?rlKl&`#ak#IcqzwsZJOvU>M*9y#pd&1 z^bqE2C7F;~c4ugTPXRDc+5LUxBMHCiz%9|efh66Obn2qc zYzQ$!M7`sW&eaE7ux5JzI>Sw(yXmxQ)jfuy_#JrX*34tCE^ZTDA;3?}Q4-pZM4COG z&TOo;T>EN{N06L)7JT(cFs(Laja{75)5ZpFS(|-)-Z!tI=R ze&315ceKYL51-u1`t%F>$E4D8F#mI_TEVgA+D~{J?FG3y1|2?$ezTIIZtB96vhZpv0VypfZFu?yB z%*Y(oYWr~tU~kF{T&(VCbIHG0%e>0(;s4j*j6}L~Fna;qHTYk~KFifu#)ocqmg`Cro?53_xr-QxFD zT=?E-EBR;W)r2eTv9;{p&-;bcX~+}J36GaBJ{~6YjZU>?yl*Y?UTypb+R*_muOsQ_ zu}RB++xiUG|MSqyY5XCNj|x`LA;F9)pjx-Z2|=6lYhzw-wc&Of3#ylp#;tnfuf7Z=te|pRDi{QT%dOcfQ@V}bz;zv>( z$KsBKu3Xg0)M^e(CmyOW2{u|`q_30J=-fIm1MAUJ-X~m#?m0RL@U&+Ajoc!6B2j?|pJ}$`5 zJ21v+HXbyNRXxqlTvCa5=(cdCV>j*3yU?fKC>`o7wjtk@@@hMBg zK=bpnrV`jtO%DnTJ9^%XzUZ>Ovd=q0&Jfb$;U@R+cHUM-w?_eBB?EvW@8F!(4b^Tr ztj}jyFS_|N{WnCbkC&pbZARin&x0~MF2l>y1Aa%_hpMA~bjbA-V5er&*>rl8!~jO> zQodT{<`-`d&FdXnip(k^1pI7ACw@45BzU6Eib#3^8N^;Ov?Ww_Uj6v=M4uM<(ik0% z4|u7T#dWfJvc89%_ud0|qOn1YHa)1ooR@0<1$tUf!hg}N+!N`1d#X+{TIG6jA8NxV z_pJK`IzAfnheJ%C_OQIZLB0#1^(GsbuZrqE$M@)KFC7`zh|jNJ1)1UFr5bxX!*~A| z$X+_?rO}7hvAlj5#`?T;^gmklvAAb-!%#kQ4WG1Mvozdt$4p*Qtaq$5tum3+kJfme zJdUtazRZ6fxzm$?nPGChdY`J5^J3FY)o42(?Jgj6<@PU-qjbDvIWq8H;>CP3!5#JG z^QY~!_fJac=Lv&xm=K{&A`_pCArjS~Rj^!sR@&sRR+)W#+Gxe;? 
zOMyDnI7n9t5^<1FR8&QInOr4cLj zSVY?X*MDVmmKO9v(_LvE{tNWy&B)48m(6F`ci;4r@82&*xOY%v|M_Ax;?C$xtRE%FEPbTAZ+NAz}vG-QNaV^c(pvYpDEM{hAmW3^} zn89LZX116aZHt+u7Be%W#mvmonYPdU{{POr+?a@Yn2DIj-j%&FD{Dvhu3edHi2+BO zKm;Gwxgy3pov0$nmJXxmSH3U^PlDZ-^(vv9+5FW-E6HjEgoTl@RxKG&$o8K_$(Jb- zCZc(UN5&lE0oJQ??po7uL`j@qOSM(|E&KQ}y+WZ-0g;a4#}G~vJl(09Q@XV=d>rBd zMveWJMPR@B8ovDDYFp)EHo7z!p|r8t`dCA84uG0bZGeyuT|es6=3Uy*cz8T`$jUxS~lAWED=0?ooV0~wTcp%2uHpT^{4L9 zv=tIrFi{G~`F=9E0E)VBu&N1@FGx{Hfkq)nsZFQ0)pdn6^(%NO0S${WQF7FM_H*J% zr-m@cK3Z5~a73g(g?ilQ0cXN`%^& z%^6xwdNO>XG*%RZ=D#Ys_>MrabH%gElE5%K*IQATBJPSqM8HL&AXSm-_aId_JvNq~ z3D;04<}GIhPY-3s{W6w{?&$Vh5;iau)Wsy4Bw(s4u)ob@%=ch|ZP2 zRnXeDEm6gVCFxkk#*|ILX$*>q7s;NCAJ^_4fQ!Rmj&7)swu&No1b>9G2FHg)FbQkHgu z{u0h|CtOnFNt2}3Rjq6H+7-*G*eKW}#Mr6hwP3uG6hJLgDooB?5)>uz5=d=N``M=1 zNr|YdVq29zsLi^gdCXp$+&AqB)}<`zhV7*Uw^eMi;F)L4waFKb9O>P3p1sGnjxz=X zHX~KH1pd2FE-xW%LE^imIeS9W!BD)8*?S_i&dZp+s<+PS1n=M`ZHGnWgUXgpLrKz& zvR6sLJHM&wny0$B=t`IInskS$dznJds+c+o$nN|>W-ID_{PHb1w~Rt5v<_acyd!8Q%Q+xd31HmaqCcIuRAKWI zx!+RXelZ{f(1%VA<-Co&O{Wbh%WON>9{RKSb5LH-CjxcPJU7SlCg6NrDXOq9jW0e- zJ}hK+%YHchF?;l9^NURzQdy^TX!{hnv2 z-wC;63myKOnannQ=&2dEtoEj>XJ=L3-~#?dO7HtAd2xonq~fksc?ihY@bQ7cOw!Y; z2AnRFYr@=kS3dQ1Llrtu_djNqe|zP*vD|^}L>0cSZuXoV^+dIc82E(|(VC;WNg@H}lbU{~&gGP3|Aq&xBt$ zesxv>dCnR7cj%fEl-zpY^Irr8`9Cwe5HfqmiNlx0PzLG;S~xUTS{J`##^I}?Gv#jk z-52(O{)v7i?b7|&!T{m?=OgW3pa!64-kq|K=^r4cb0&P|M+4ADTUG}JSbaTn$A3ff zk(YO9jmQ{%==%LPAo+p0b!MSDUQi4sA6y9zw7N_|#x5Og0DmdPUCThQLWlO3-cwN4 zKr18;N}*^Czk$==_0oX5U}$v{x&*D>?9+(`TorbDel>TOB{Ar(Ayy69i(<-&6?l$& z{>dCy#9IW^H;NJ5czi80S5FtxW~`rpFbDl_}c6-etzV8k07pd1wFKbP64d6Db`U+*-SL9$zI#Woqd36mmD8Ge0-BZMuSVg^>{^ue3CB{3j+lUkd|Aus=R%HZ+V zkllTD&k>$D&wVe@JhjL|ZN2RjOgrz|kh=9Y=(2RQJ5V#WI=&W2_ndjVvVJ`0#`-W- zAZh=gb|&Wi2P_xK@1wxjbG>|U!*%iLz!X#|+KVSmdzVGq_LN9b;tJpc$6-(yl*v~t2moK_1&q--bt5rdR z&emtI{U4qj-XHOi3ie+_&k+i*lQ$<750j+2j)A4j#Xrd&Ry;RxChh%KiEl^pHbEL` zjKx1m9&W4B1~Kw@K=Il90|u+&$mZ;Y&TOggJ}}qO{M!5BS^q{V-AB#+lgHb(?9@kx 
z|2FFRVCpKAHB8p>NKSvPS~x=Yagz8jpSzdiO)h&Qg?>NDSSKS+>v?w$8PnDSF1A|hq+wBO8bdfKpj5k2ZSJ_pe6~&ceZ7GK0^v6j;na+plAz)-F{_#%PE+Q>Q zImaVz#2|mMAhF8|!2HkuC>|jznvIFIR&1GPe|?S;-#;GN<|~=B2lj8`GuMT6!5!_uLTQLn0gC_gzPP)^!E)MW6WDF}EOm zx=wO*f#tX|^dbyR_Q6m)YUFXRQyqTa0BjW4O!@B08b(a- z<2+WT`zRh<55`t$dmvCOcR4iBTK1;qI&DR`x2rbG$GPN-^_bZI2t5w6hPEfZXAi11Au&nyQ6 zOKH%c0S&hI3{8It7~HuLloaG|njW*b&DHN%J1m7t%?Qp<1E#R+n0sAv(a-KOhZTKKV!MP6T>uSC03KPud`%c1z#hj?_j&LhZ7&fEGy~`o6Y!lCjvf|} zEG|J+8FI{Gangbq2BpcWjRNu_!9oc1NqiXlMv1V_Cb86&9k#|VFt|j_T;c`|+=7N+ z4hVY3d06)F4Z#|}&utq0uIy&4?H0^40Fn8+IWW$kDGX~3aPjl@O7gLRo5Uoc#`8BZ z_wVRMp_x8)8Qhkmwl3qOTGq=qIRx^9YJD$AKMr>9A7sy5KMM1pg*!d|Xg-cTfCNEQ z6u%q%;eo3}Tmqc|n*QfB?f>nXkN^J};QyU7!2gjCT~>4*^ozOC2CA?fpZjwP_-C)% z3EJ!aw^Klr7nsfezS;fl$?09FXF`p=ha`VxWXxb$JVWJ`A;Jz7|CWQu^ksoS^+u-I zGQZw*zSP~p*7hsZ`Hf`#7_HI(x;qIe#11D9r2jV?Kcw?b6Z_gevn%(2bhFhver**k z1xgc1k;8exG&bI7E}I%4=LM1*y2K_9%waREhjEWTNM5rZ#m1p+MO-M)&{jSe(#=;X z2(X4`qna=hTFVB&(|w}DIh0)D1$H|iCMRGoKe4$k=j@k$kyBe<{iTbRnJaqzb z-0AZhBO4WUhYEKSCr_cPTbPlBiHubX4WKbLjoK-xgKqyBKb%k%`g^}93%|#J{(8=X zEu|_OftYQ|jacPHJ5f zb&psB*Kaj>azdX{JwgHbKA)?MZ^;cu+iOufEfq4iK@WHnL{=X!P?sO(!{1-C;+8-~ zx3z08so6Lm&pRNkpp+@8m56tFAwr)*jaP*RcKsa#+&{FOpk@XDGyP?kTRPdN9VRO& zHCc^-HzK_J91JH2vx}%XLL~@Be%F7%3^);gH9OHLwcfGCPOSj;b*k69szsezV<@rh zI>1-Uj}j^Saw|6q$mCC}wjb@c%c)NaN#&iV#vXmR%W*Ns;ePn8o6AcnmZ~(YB^!(h zH>e1Y_HO{oiB>xwg;<*0!8-BHM}(&{(&Nsh{{g!RnR;lo8tnaTH^Y;B5;_W?7nrTe zWJOl$58rp@`ZS4RF4XZ8FOgUuf2`F+3KMFp&P5iMYza)lw|VEhOVB13(ZSi&Rxp{a zZ23@5f7{^@F>s8;rhVY7{4A~a1L3OimD*U9jgJ({4r~B_Vjbz$8E{)ldYy2*c0+M| zQYLj3VmRGI!<9Tfm%^bSvbVfTYj1Fcthu97PRvw z^TWp8BL&q*vI+A0iRTEaNq?Xfo^u&;=j-9G5?b1U;PJcUVd&t zHEtWv-m(p{gMZ9lyTY66l}u;T(eDefk|%=ri)YhBf|RhDGV}lTGlycfX(IMg#XKx_ z!i;uQt|+=zM{{^mBylGk$qCqU>$z08*<37tb@L`&`xgj`UE;+T`KUant6Kc~(LZ3A zmT`~|D1T?A`C$F<<&EhN772{S8aVXQ&OmnMyXPME{W@K64^oH_Eof>pQuR`GaPJy_ z=7uH{Lt+^8aBYmtwep7EwfUh~h62?@*d^nhdeLC^%69d3s)oHsc8`XaL;Vf?;hNkI zYvm11$Y4oN)cuni%`CU4-Khq{!?*42#WLaHffI_^4(9FNrfc%k%KD?O4ud5fklh0d 
zuH+@-Z>tMo(@ugmZBXGhId)kh!rggfcuflxXX;9`*;29f#YK6CPFJuug(ZG_v1|EJ zM~a8lxs~x-vR~PkmTd;N%)UKsH5jw=sw%R>t{D}tM!)WrfW5wb{z`z@Sw`>UYgw3| zjIbKl#nUWVQrs|!06({_ZjJu%=|XZAUSKQ7vs9%_&sk`1+H2{`Gtax~%y7C6eguyt z00#=eAj&)2;=!h)5@^&>-TCX3hW2OZR5a=q@8}p%xObwtfh!wxL78*9V+$=XpPq0_ zkNZBZvMD&-u$O;zX~o*+pn1A^vD8ajlOnL%+iutQF#Aazqw%40rf`)_6e``w4j++` zud(1jTcOZpU+%?z5~Wd!Lv`^{$ayiJU%0XDa>l%@y~Wut>-p=eYRhMC=R-lO&M8~F z*E_9~DyM^Y?wg+vrA~Bqhm&5jXAg^)T@txvrwcH?!Z$x3IbuGuyl(QIs=E1GyKnxQ zQgdlddUzJz7rn3yCH#w2+uYRHCI7+zDnL}ty94!I1h3EC{GY)N)rkKcSfGkc_r1)% zFFt!%_}7iA)aquKKOKs*9L@1-9`C6F*TsL0ELaJZzWorKP#ZpO0}ZITIa@vc3Bb8* z-dE*W#5=QKRwb_EHn*XTw~_zP6kC5AW%b66>C?pE#lH)svIm+G6)*GLfD=KAxLw%_Er{@={FI#q-8%V-RMMSIEb zQ`z^0D--3Leo=yIv!@H^Z`{vcItF<1i~oSBavjV!AOA_)+i6f&xnGzDuuT4M2$rwh z{|48x^vPfS9kJR`HgWFH!lao!6#w-v3an_F`2OcB#hA&5mmoBziF2Vp0aOdMFupV& znLM@H$twPdOfPFS=kch1LH9Iq6XZlOs&K^ePtIiPy;VBju|QS87uF!Huc;SYA*M9S zzmf5*c`N(_R`h4ESP%Q%;Ev2kkDD^MlRqZv=ctT_+&%O$?CcrY8SiI32~m za5*jp_x{rn*Diyp2;TpFpzwD#*XG60k@!=qZYDD5wpsAaSk=2dYF^Ouy4XI5n-0HL zeM|j2K$yAe`zL0D9CwFXsUvZZGGlqLKR1+Y{eOA+kxO9f&*1DU9`{*}_Q${JoSePP zljsCxb1W^>UzB`6<;LPqWQr>}g_}ZAzG{>0EuO`9=mKiWmOqKT4+SmVj4j=#er>9` zwvDAuF?F_q-yRQfaC z`igQ#6k}f9dQNGV*xLU5JHw^wiw*0)PUHo2H=Oe7*2Ml!}s6z~XOCT+TvwC-TOReI!)=e21=T%oh1?RBm6c52g-W zI?2+|u(`7fSN_ZO*l;eF!NnnW#_Ut1t~khf_P;@SrRAB}3)y^_19oVE;Vtd^YAZ4y zAfWk{75u3z;fz_2f6dcO>0d$VP(J)$*3oD>ioA;dW!+v)>AkkUd0>)D79Yf9&nw4O z>Dwju!gykklO*`u{G37Eas2cLo@rDAv_Kn$lOwrrYj5L88+U`SVVD04rxjk|**4F; zDA6s6Pd!%XVZ%a4r7UdC@^$=zR~L-w}r`zE(PJNaXW z^0#QayW+DPclQH(<^6q>Hs(j~tnuOZOZfq*iDTJ1*Wa5N9kQc0#h2WmYP$T#4vqeS ziDLz~91GAPANwB%iXZkLrV^J{1${WbkL6eRE-uGGfsSszZN800ZOpH$`fMb(QS1GF z&YEx4*CJW-Zv#C_mvo#f_76=QtGUTEx7~7=`nhS5-1a@EV0V-yVY<%I@qzNgX>#B5@g|6Udv zH}M0#6JjHItmMX=G&L}x=LbcS6oKrbdlITNHnl&_Xg~q(B%V@35b)0ADcn0G=wU|M{EyPApcF`HH6{0~dAgcPx7t;=Ty*9mk=h1@4QrooKN3*|9#Mvte72lMJa$*bV_$$)js0u0uHjZO4s%?dQaLt#yyjtN} zv-PU2HJ~pyJbsUS6hl>L(sgXiY--}1dt{_~SlX_!Xgsi`>YUNY$M)>{?69XYs}I`y 
z^V}8XI{!8W3)e4g6$C01@N*O#O-p+H!|!zOIljbm2c{HR*WR(Dfxe&0B6Dw$P# zlP+UAB1DJWNN^60>J`;QS^ipXC>yq~vqH3bKDI8kbqtJ-yN=L}3qLCa#Nt@|zFG}U z>&7#bocu=oQcOGA`Xhg=Q$>R`DdN379-Wg*8Z(9jlw7$ip;r?6i+sysW#6;CX$PWS1$R$YHaKB7O3|J~St^gTQcd|w-S zP?^!3mHEoF+hMk@76^{A;xV{g^wUqm(Ja@hU!F@$Ker`(iCOxY?B+MBNC#@=e)mF1 znYW~Ab3H;{Fc42*BDlA^Cq=i{b4O|ON752jqlLH5^)qPO?$8`AGL(nB*yXe0^OXS+ zKmYpBdjFO2p-y!_p!u-y*m50O5BzksCIt8h8SnSEk^&XfQTaE{sdHhSx<&LCtAZlb z>jd?GwnJ=w=%E`t>}9TJ!#%-u(fk7@b4^{}d`~>`e$)5St2zn>ZP=}4h!%!J5&#Bf zM;dL^g-HS?20F$K^-XPTbMG+iUt|9@_XFLAosu^(nmXerqAB1YJ&szOTxvZ&fizj9 zP!7>tPvlJ58}saBHUT_MU~HLW+4QNmbIi6wORS;ERjyrvQyg7R=)T%IXBv)9zdpE~ zWM#g$-s0<`sI;y52x9cN&Yb#8{M@r!`565Y*8|4KA`n|!0H{k?g%g@R@BUVe&>|}c z!XBgon=BHkYS-nwRtC@SDlYSUL@nVoq5RE!>?=_<*?NKVi;dR{S3>?Kyju9jOva4gRs>CvcnrLHV)pZLZ#7k!;@Dj7fzBr3%R=d1< z6yp<|EJe>%L18&s%8-6J1(lGyJ=5Ldm&BA6tN%-qF@s8!-CVLSb%=ocxjf z#Qno7aI0cLp&_%#X{Qk`;KsXGdXPg+tzT5{rB zwrb0-Ny4k$zhJh2iKGR%k`b}6dRYiIr26{yyzvWPD z`~zz5)``i;($Go{R#g^y&Hd<1B_+KVB}5BZXAi_hoq6w~qoJ{PpTr&Zh8L#(RAFQh z0CAv~X2*HhwiaJg(}1+9We@ZF9C9|0|30J2mhJczt774TzHjc;RHfdzN;~_T)JPOM z4i`(Z36=&8W+#30YW_$P@p9?5V|}*IMKRs>UWnJ*s3*5S)73)&C~~y#f6-|+3jxnL`p2Z3*q=uT?-H?7OydSg$=K>@tquY zW1YO$*lFOLC3=YrF-&|TmMLO*J&2pZCY6H64XtCaFHWlRVHAENn? 
z{uSgop{glGx{IZ>aa~m$lmNDJ$!95emE)fyr8L#SmhCLXct8HrumP&l+BPbPE zQ)tkJ*LE|ADF%N|MH!zG;_K_6cNE%qa`cDMN>&6Ap~Cjcf2sl}vPAqUq`rosE&Uw2 zkisCJTSr6eFf_cFWo@*D)HtDKG9!8ydsL zp@X_#62_cfPazxUXvdv2XrL>>OseA8$7@MULP7v`R#R=W8d-#s!;NJeAGjI>OM%Ww zVW(_guyO;pB0t1sp{=)u6)&cX=VC!}n8@!jBMg0jJF%Q5q4nZhv5=XQ!)EA2y`qDi z5Ej{;-s@QsuFK|#X=*=I$0f0XK`v=zn}Jk;Y$ws#K%?f6OL5do5>h@hO1EcRMuv7x zk_G7_&JpiCjbT$qmoHKOGb7o#)lPY9tp(`N&ywucav6^Cwze@9*v!)>$<63TQP zp`J|QNH5pJ6%)DEgic*IM1wIO4-HKZPcR_}d*>NVh^RM$HCJjOCv8D@3&tk$3<8dW zyLI#_m1b%*@j1*yvb?q)lK-e77tTTaI;S!~(KxWjAR+?i6%%Xg)cIeUe46F96Rz4&I+-eKd4(6!dvleYV9dbPXbmL*&OvyGTGdDT$Z?eDHnD~Y(HKZ49LPFi(OsENal~Q-% zcaSQ|C^HhJFnJsW@?7Ai4IJT=ROSGPbMw1B7}_?o;5<3sJp_&w^BQPlO<8|q?w!My zj@DVEHhnh7iB`cnfqTW_GOQDnchSXqht}x|iX(;2m+bCt2peFI9q&Ac-Y0%-B>Oz`bURzzucu-> zC8q_3b1vR^#Relc)&|#DA=>qwJ`NX4O|*`#pv;ux2aABg=T(U?$wk*}(cu8Qd>7+j zdyOQ9YX07YbbT(jJY&i9zV+!kSH>c;pXW4G01(B7k&{z)za*=t{*Ky%D;ahu-3Z$f z!)$OY!KDHlF@8-1&;l-PGq+csp`DiLuE0DDW8~ zX%oG0c7iT}-FAq(HD(cVq%eRJ#Xt~6;6x7_CpP0uL6)?%uF?n!p6gKvq)H}R3`38q zxjCv8?wTZ?Ug|4C8Llidy)8j7RN;2fYXsMO){|<~f&%x}MR1!*G)hsCt@C&zHuMOU z5%8|e?=duOmmrEXZXzmDdVy<~$L152xH03)@Pca$j=t_f^lnnfd)jWAL+?p3I7Pc= zc+|dkbjwI{9HpxYAtb_&Vhp^nHdtz;20UI$AQ+SdID+P=b!HQ zoTH+LB9ENTUWRQ4*_N~M%N1>0p!=7fkdnR&YD%0n zRWw#nldQ>uoQb<02kaDS@{1yM;xbnEF_dM-6$#V`xe1Uhr%V;`1>j%QEh%u6G!%^4 zcFyrcc+sr)m|%fR8QM8{>Qpv3LPuNDwNR+wRQ>q)nn$1+r&rPS;WB4Aua?VQUr z3-xELgrDIlsHokN@?W)0wUELXuU>3xgs;~L|OKr zX+I03?PC{bV^`4V^t4(;s1W2}f@6AP%cT$3)yAmE783$Pk)ZYVJCijuJ;MMZt~s{3 zPabiy@v|Lo2KN~5wGjO1GiQf5HpRf_y=+qwX_I22#ErxG;z~|ZDMUy(s_fBnsdQ3)tC)P4W+~&+9cc*> zC2MX}qM_^3b+MzG_1y8jTn^W(2PZ2@3nXo>V7akKqBWLet$n$d3qaCg&~`1xEnURN zzumySw36!zEJw{_XND*}ZuAh+#BoOTeZrKdQjHQHD`b4U*i^nPxoEj}7ux_CO~t5o zK83WXehL%0m^N{RBF5@86+5Eqp9luuC17F@g+h|=X$*V2ktx!L%kdv&BbfUvcJ;+_ ze3z*I+uTa94J5N+%c-*fpU2tJ5MKO-BJxO_quTjk3tKsSEv2^W>KKV+Wt~$K@?B=D z!}KHH1Vx~Or160Qm#uZKu{F~}fPD5`!t^@6-E(s>niTjD{GZ2cMyd`%qYRsm)lw{l zUS!6)htnE2MTcE?6I@nVeI+Nx#3ddJ6329quNZ%X&_>d-u;O*Nj)W!EfXb(mgQePj7Hs 
zanlKr{=t(*NXV^-j9c<;YAh;^3fJtWAi5Tlm8Kkb;QRQtlM{GVXRw>eTDD0P*JM&G zt8(K&C2ksTh?U5=Cc@f!_JV@hIAQADZsMMYH&6FwZIetydNgw_0PPc2BddZm245qy z+)vuktQFZ zKEAvu!9Uypgo#MMgPkMPLtQrQ@Gh*g%-(RT&R>VkLI=!3|H#?8rS9>EM$Xf}=Z#Dh zzgg5VoZ0&aEV$<}WLEN&)+j=n69%@|flDskXAM6wBlXOIR$qcoG%C5sUSG>1UW@J~ zj+JLnWSA>KjU;*)b>#;~i^cCji%~@#~!E#2spw zohi5=Gck(yCeT5UBiVjLl0zjmYA%KeLxYl&nDLIS>F$c^l7aI|Jotf8FQgAm%W zXl|yJ5!6I2*)EGOROxeaLtoEGA%S^O>bo?8ZvFUJNe-=1&R1>YP+}GnS>yp_vm&)b z*mcn=W}%?reT)=`y)8!k(fBSdETb&~^dJrF2y3@7CHidRZk(mmL+|~axgX^%B>92S z7>-3!gM^MD;J%Aw2Wz*m^`uv}B6vK6EZnf;SeA^QW9Yhh)W(yV`y;D7Hb%73aw99` zm?DR)Vd4o5PO1@;>0i*fnZH&}e4}D;rJ=+Gy>*pV1unGabbkZWe6oQf@@F%jsj+!j zHyjT0}^8uN3EzapNaGxl+r%=M5* z0Y?bUkqvDm>K6M6F3RdP(}OsRJ9JO7!27U;c>AvW^6ULbB3c0RKXfP^VvOaosKZ#9 z;w!9dOs&Jb>B#^h~50c6Am#h9CUJ)u%)u--*@HZJ-13 zC6-gC;fvm<3AxYmdX{t5qlF_&`;P`Ksv+AR8{tVH>29$`xh~p7efLeTQgx31Ca9_= zs>-3qta_>7qGw5mz5l46Z3W__EKAXLyM8?9w0{P*M3E<{-+KStQF*1lOY^6C4@hhH zO!NQf8~guJpg{JE4@TZb>xA(50*I>5#m3Qq^J6^jzxC2!{9nq8&OyC2g$4r5vb-p2 zZ+LvVVaCNJ&h*Jn&WRMKlbJl7+}zpW>~B>IMziTdL+JJC2=-c4G$X z{KU&+_OCSWvDvuGZ^W62c#&SKE!kGY}Lx-8BZ!b()6Vdwt~*b1gk|71lcBzVyuid>g>>Oz@XV>E}FHCgR`1Xt=YUX zrn+OUb&NBo*NlZ=H$;;bE;6=2JdiO78|OmXz@PX=_2L0e+>$4&(s3qK&Y_f>=p@Hi zZR|{xVT@fW<1h?Dt6ZyQd6{TKjK;Zp6(f_6f2J6R01g$$K4FEMtbgZHbc2iZxLJN_ z8oG~RbcV9Hz72A9YSiEfLQ+-glWex4=;jb}b(MFUUcT>b>`CNujJO`0TuBcOw4e+b zW<;&!BFFGH*TfTysBdEG-nJX?8;+Unp+wu+Darj7^tHElDYfYv;*6^w?5`1=iLPN+QWpF4(wDy>` z%Y8_25U644BCX$alpVTk=bHfH%>#;GcX)sS5#z==xZnB@+ufSnab~JFtqR<#Yo#E$ zT;f&mtf4i&N)ivvx7^ERYVKys%??Vy)sJQL%K&O8LJVn4GA*Y+SBeXU9 zI0i{%Te!ZMWS?Cvcok>fQ|BRNP*++#Qd({vkv6^^tHIhV@Bnd~pw#+4Wmh#CZC4Uu zAP99-rBx=!@N5OSu&0yA_(#9>g%&o+=M?B~x?K*y%;rzhy$4R_A!&}3Wfm{6JE07K zj`D0zoU)wa=1}NMFAP?+e>BB)Z0_hl1{okJzx}wwoLi)7eu1^$4m!H8RJqfCu%9}n z*RlNvEVS@L<)eGyP=i_mIgFhP=k8*DT)Al;x{a6AhSqs3$SgHPl37*xE4{0$_BZJ@ zd==O=#Czy)O7ySdtf1b+OEaq8qe`4`JFT2kokDB#pkjj4U8G&|b@sDCn6Mgp7-+BT zM(oiiRWH7TZC(z*6u}pT|7FOC5|UPHM5KYBPUjPEyg+8lU6sED z!Qpi($u4q7_u|qpf|hA}o2iukaOcs{3eI;_XOFsB2KfpAHpM3Pce!M#x!j^_=s*@K 
zoY0oNar^WYl6FSz0^^CRCxKpiE;jFik+2-r&RlBc#RDwt1@h#c;%ipk^0=0uEgSs~ z8V1ofC*Zcg)rcB>E@Y-0QY}do4-XIphc8AghEtVaC?}f9dUdr0GJs@*ZLlT_GM(&! z552wapw?w3*=4C0BhaB>nROf{ibMYA8$!WBbm}fq=$>v_HP%TOIB4e{kj^tDLtVPw zYVq-Mb#8V=;F^{ANi+albSFN)I3r1dA?T)6s%3E;p&>Aj1q)n_(Uha15%Y>~{w#<} zDzdu`p)cFEJrTgQ&PT~#M?j0XZ5?`*({)NOL3P}PC(7b7;Z2~T!n5d`L!4$Th$G?{ z6-~t(N~$oGB$2p;`{uJpiTX`ERm-iV2vL{}91554i$#c&<0KPqL)gKMy>m)M)o@^z zR-62s=t$;b7EA+hJDRKK*TEx}*4cEB%_!oDTrl`a7Oo`*jd(tjz2jL<3OlJLZ^Tln z{`7DYYOF~MSt?{U(To7%EgC-@zj6poveV#oke1}A+O>~f;kSg4qfZ~+u4c6_;M~NK zx$1p1*vQT`vDOB1&#~%+d|)&s?2}%Uz$XCGf|+ZtvX6Ka;VAxr zw*RoBElb}4uP`yUI4ealO*S>RcmN(Nt%JbA2nG?kB2RfFeWYiYdjAL%<;*tp<4AXF z_j*7b^b~4YkW2)rt7fbn>jYzMo3tSlmZe)KL2r!GB*uhb_1r<8cnI@^*XA(9jCw|+ z604_3L?o(y)?G@*=KlB`)XFD7zIu^L!Y-nN`U+3n-UuEQ{`%M%c$%`a6iv!fGZoJI zOczFIrJbQ8+OCM*Y%$06>Q3Xj4v`BE&W>moFZj<5@L7;_p+4ONPPuM+Fuqyzmv_2J z;o46$ImX=y@%gcEn~wb#XeDkOo~E!kYVEvyMymEn6cm{sp==G{CqWfx(R~PDt5}`f z{f~{^T&Y8hTg}Ww+EnE}DiQ)p9idazz;QEJkAbv}_2@yF=DRj*slt!XwtT%HJ|U@UWjlMF8{;Mz#lBS~rIA;V>xVNg6*2*3gXh%3>D;T#}?)u}X|1m~)$z%Px+ zjJPo5_L1!KvmMsSkyL3=vDI9m&$2kdgX^+hZP#62W2knvNh#z%75-!eLEf7bOyZD3 zU~$MhB|Qv(-TC7OB&IfZS+*eN$S7q73X}{RmFkG{0~T(siS+2k&E(!}WlO6_BVoLg z`0@@kzmg(@LXZY7MYQ(0u$Od=0~lgh2qC9(Q5}4kn#f=u;zdU;8;XVN7{EEatpb*; z`zUzd$2Rm{uL_a17axD|Biiu-L31s?rAaDkpJ6tLGLO~j3^KCj4TbEB`c|O@zi^fe z#9?#_T#X!US~Kq_02$*~8DI<&1LzN?HDzYl^gMt>uNaMjKpQ;QT}E@etkP`S=Bf!n zQrfxVVB1<0$CbKqbP4(xKfdTGI79{9s+nNd=Q-%AN!(}Kb-oB_lyI(WGKh#c$JCP< z5Z+M3ir2@TftQQtGRbpD*sQCKq$v{+|APwK@vH{w+$OoM_R>1L2U61^YPsPS@qAlV z`pGQ>eTX31B&_lV>;~VhT~jl=iZNZuoxC`FI~Kuas>l_$4My?C*#NExD_yG%b>CwB z9I6UhAoa<%Je+Z&Fi67ZK;|2X+^s_qT6Q&hFVKvZ9-j*v+L6MH(Ky`1 zhyP6)1@XI?PDpeojz zvCr-x+L--d7eIg;G%5eVmlI*4^@ReR_jQclE5-KO;v~3AOM}J7Iu(C)-C?7OE-FKS zEv#k2()B9H)wY?QF&CebQSQ^3@y4PKMzCmo4{sirvf2S$eq-f21_5zv^nwDFXRJ|+ zdi|kS;z&9^EsJ!mE{o@goaiG;H)Brs^z$dVI+6FtBoco&l9ZDC(wws&MmRFN>$Wbt zab*#)ZNZSDxLH;Xr(L9-qkDDnO{rnO)Ug_d^|iOOSv4Ds#?5q$oFPt}Nyc8N=2D%v z7|p+%wBterZZV3Vs@dsVgrA(ymvco6>B~ZYrsKI}A7?!h;uVo|lG`&%fm^t*VWbF2 
z@yVR)O_3gJWGG4H3))1bjm`BU8Dgz4Ni?eOyV^vm+JcrEuyr0T z%bDehx(oWvKYvH9?pdwaLgX8*q4Kg_&q~zI= z+_a6}y&k<#75TY@riy1yf5Dqme80kl(ZM>RV;O)o*9ct7k(ixTh8Z3$9z1N|hDC>N zTv}?b4;|D(@-yjLZ2DCrZty+vQ)x-X0Vi)aPapf0#~x?BfF-4DQbvjd%Lmommy8yh zIxLe?MjAIGKB^A9EG;%ucSjjZYHCsuISS0V0n(L0Y6wb3kJ>RdJ$o5S-79X}rk^1= zcuz4szv%72PJK5I`OT4l}^{e&uE*5 zPQ&%JX+Z3-Zmyrk&wu7_giCkiV)0EX9H4Np8gx`Ib7hT=NX5u}`AGD^%o4?a4T*Wqd2xq&z!r1ENRBR*) z@;?n;ZBVk2m+l%WNINlQ?JBoq>v@yL7G(yG;F`hS4CZd{M>Y4*7N9}50|?S%b!su( zHmOnB>pFL72d6oaeeA+4SYp@21{$;;CIS+y_mm_#=fzP_tg(Zu2^l@7nR2fNvso|U z$lm~N>8{zYxx1(>LSA!-BP8&-2fD%Wrj^%Ar3}Y8K(u~99wN&S3?w`(h?tn}UWi%< zPM2p$`etm$w%QyC4AQNIbcmmfV2%JZgSByR z@3#^Y@mf#7#jl5!CP>~y-oXOaWuAl9#q)~AZK@5y)v^@|#Fz$2NyqGrvwjrNWAY9p zRwe>Qk#(lWh$gu_E=tgM`CAbE*P6n ze>``DN`8fjL}mhfeH7!sZ#F~N0eA*9j)_F|33|;f0%I_=j9zE1#oqu?3XaR7z#EoU zu5n$lK#Re-5Tn^=mz35a0#z6#g1lnK^X=j9W*%<#GX;Sv@RGqLip5!0B#B3n+p#@IL|TxBM@C8 z?mOjAWlr4Cq6{|1Fal1-sQlr2>N9>EYKdvBihK%kBGQ4EPz{v6#O86&L`vJpoK;8C zgYuzRBzrw=s7H~Vf=^XoOsuN@pUE}#aTM?pcNjBp4|wlK6dbSlM9~s{@I_QWWd~Qx z8M*%-?0scy9LkoaV|L6CvtzbnW~Ml1X13cgGc!AmnVA`5W@ct)W@ct*+qw6>d85^6 zW~EtaN7_Fvb)7m`)vbb4=QGh8ZT-RkDKUKp)%#>IAvvy$ZEG{QXI$+!jMl#EEG#R) zGVRHpVHV_r5CSQ*vpB^o-Vm(YRg7QEvnkJnMr6#yr9Nt0T;y^4Q2%CB-n1*SrS_o$_lhX(h&$B6E>vW zZ~-YBSzYq#99kYYHXOu5iPo(v#P`T-vO@yA;Fi zEu-G5NP&&CpUmr`>#+4`kFt{JAZz ziQB|CRF$s=?(Owyq@n;@jLiY`@x;BtQayv`TuZYgIGc);BXbe6cs&}GviMu=pqs}|{uot~g?CAaS&{|Hi!Udt*yVUY}>!X;<5cD6?r@IFTd{klx6C_@=SX7^}jVrB_epq?=)RSte zI22Am6Rg}F02^8o%l^je)bLnuS=GqY@svg2rU71arYEdzH?*J$`CZ$i8uH~Xlzr%- z_@|x(hcs3U3q}6QcGUG-*$qx4W|q3#usO$toTf?c;hyLKYGLl298zT^>*qGm?{Tmi zYQ@wit|2SB(kR}>!KDjD1_)-FbG6|Z)^(O3u=a#ZA=w3XfIE)!XeL8kWjK;((P2*k z#V+eulOH}Z$+bUA&25TDkqzLa;;;K;#^|AQ$>ot*78kErl5D3#gR&AwP)c&u2;&uS z1gN3Yt^mlV)zYBl=hQL9QM;Pii;_Pe;xlJ9wn-XdE9U)#kUko+JZrB?Evqb83 ztbUw@ySgrizPlcJ9WnPcgKHa#Jzi@9p#M{+vDi{3xFme%5jC}69Mmjen}SHC8;7i1 z8a`^Ds=&&WLu@b7PX?*kJH|Ov-wJJlxn_kMg(hEmR%@8qu TwCf-ca;u+JAK{ga zvUYao4Tf~%aU;^NpRAGsydOsoUWaW*HH9QBZBF8YHrX7yV2tpj6lP?J>>;`eL0Bni 
zCyZGzf$2ml*ic0jrs|Tt>P`iwXgsa?SX%+ z&-KKZ1H{^dC$MudpZ_wcw0-5;l|nmqf;n2^3Qb@fDOjjnb1E(zGwHH2+l0mw-soo! zmC}Q=q?yyO52jsc5g&pM#`9r|#?m5*7N#L^ld#IF{>pJ}sKlW7b%4-c&shs~>Df?d zO>?F3+$srnIQg(=DXX8}JhjaZ(vOlapfTl3Pi6H;8}lgX4->H%(2Ww4aQyY`BbC_J zWO!IV-Tb~ByfZZ#Au7qt=I|~=G?++EHuQyz2KAS-M)Lhp@>vls@C2|Mt&TRG26afy z4iLKRk~5?hQHnQ%%@1=oGzH!WG)^rYr=93}-bqM7M2QY_1SFon8G=y)< zVA6QQ2g3)?Z>Uf;z9-s*Y-M9*x*e9-uOr~{>y4^LU&+ySKS?c(ous>)zX+49D6vCn z+ffDO<`Na@gyu}A;_O@+=Sz`hnkNuRE>2V$BkUlEOX*ROOL6!riFDuNz9>%bUxCjR zKM4p;-Ado=3(ZAm0eKGl3j2zO3Y7}1@I9xMe}lxWUvaQlu{C`r<*V`%bGouxfN|rO z4~dOd+Gl&qXI#J9^ZX5B8+=42Gp+TK_@p#(PrN6_i8ZVEe^5$fqEAXZ6Zj;-bGlwJ z;k16a+4R>zlA99D{9g~nZHgqmrJ_167l;)+Cg*ycg@jY>)h|YS<)-n!K79WTVtTBy zKzD%ezt*ksjt$4RgqK?K8zjW^N^8E!QiZ)P&%W$`|5P@-HmI}q;RaT+xcaM-g~Eyb zzf`iAyaV2+C`kNW$pUu1VZP7de@Z0_@zWdf*9~;wd-j9q-yroQb2}R{9a4`XPbxt? zY3;w>WX|L9?@&r`J_#pCUiq^H)V36ca}4cIJWSjBVHsgp2HBnXHI@<&$;T9Z%bcY# z{RDmiUOkBE3#&>DarbeN)P6KSJ2~qVQQwBAw?G84CK^FRLqlLoyrGV2%+UbNm)6%7|evpI&B}k-5{pN^)Qo z8tls$cqusPkzoz=4Cq(m#1lEQZZo=1r^B{R4dwm{7Xi9c*3K8f@zI!hb`8}yVCoX* zXA<2I&w#ZDkFn#OdU!7oXT4fy_jSF~!dlZK6E*zLi@S^ieceBA=DHrG!uc8&_*P$U zt3^vyq#YDEXTZ0q^;FZ{fE%Tv3wDYx*v8A;WOjxM93yUNvVHZd1M{LA(jr8f8i74oEJstUJ|;ZJj@gOo zdFEemZE>dy88tNAqZtSZs#7qeNHmjR(lNDB%b*8XZ2HLM!Ah+)mGSDvwilRqIW?v@ zE?WW@od?a5?cgl;Sr`4ih2qf&H+xA~aAY87gR!;{(lr!WszeByC9Cv5_y4p8NN6Ic zwCQPM9CPBJ1(rmX&fF-&&1?)6K^N{yE9nKZHf3{gRL&w3&%;x3vVjMG{37ZfzDZx{ z%SL(ET3Qvj*trxOgae)JA@KvT9&I1VV~$EQ(rJBPC|O(o(TFjIr;frm*x%4>n4yMqu5UM0(xM+|hR23^G z`tI)pgseEybo|g76?#~P!?5%pd|+cr9a#UiA6UT-hD=2Ex(T+`Z|f#E5=Mhjg-0Sc zDu=KXOs*7^xX+rGIn#C*m+K98Ha3oL@5CV`&_*x=?xRL4eI9a+x^f6sMatBLo5iF; zp-X-kbQE9zD*>&rV&8Y{Di4HxOSr0&!}^55f4kMswaLPBI2aOhufT!HscM`|vqQ?{qipt+p38nfhmBVGDy`@}#tbM-GX= zWr{F$LU$@TdKBh3=^!#itT39magNCuUGV&XOjyweu_(=|X=tH%i>F$7TUOQXA7;Jt zr>hA*ax{4J8S>*2KQcdd8jAy1B_%E5s025jXgyswlBE=YHFX9+FQ5 z3#sghCq(-$RQU zPj^FKf$`6^T&8GQ*8jcdV7Ac`c}z6|@2NHdLaLLV^6vnyBuya>^h)q5;)RN#N$CLM 
z2zYtS>*5&HXVhqncs5|^q~tZMbnIEQ+RE9&QRG}2OIu6zuLuNJ!eC;|A21gh1q7jv{#EbHe2F#Qf8<`1jzpXFgl^==9^-Bp~}IoA@;N5(2i6 z%IjX6_+?L4?TTQP3bf<+XGw?j6C}=C6`6q#>(n!kinL>+wV+0oSZIca3dMk?Uucwz zvCq0FP`=l0k+hd?Id8ies?0u={-v4k7(~Zlb&&tDGW%FM%~Oe145xKKdnbTciOAjA zOMYo={IjL(&0g)q4IJ{`%_$t5ahRsd9;uO@iKx7y$q(!3B#y7t6}}%TEF`{SQ&w{H z*OBa6B?6CLYmQ6rZuJfcBxz{8n{`s8Aq$RP}6 zucqNS)rULiw)V$uXCjneqggkOeby)^i!eIzG}6{Dh(JQ_Wdt8cs&ggHW%y@+)#^6S zm*ktHnCt_ZSy@RIfWx;!pttX+x`*W0<=6)5yCew|;X(6Z_iL%fiN;|gQKIk>c|plf z7y3%D-M&F^`u!|95jl}uMCKOu9^xa!XDex&gN;ep4j;XhBG`R>y>%PJA$&^cJPyw+ z{9nr%$$;Ua_1Vk+m|s?NIKL~ubA0t&zEbqfy2VORY^i)GYbhV-E*%$eEQ+ydm7o-l zfvdI&jVH>?B5c}(BBugQ+L}06AdfVQGz&mUyp+rN`7XKsrR+CI)L^ko>%=9+Ys`~G zdyHp?#L~xDV1i3INBb>9JTOa~>HaxZMUK%k`Fy8;sw|)ppHXaLtSE27t&nW3#e`f7 zN-<1+{hE$bTDm2mJv+u;vgIiDxW`lc^>Zyr+_Sux*YC;ye4Vy}s=d}E#Wynlo{^<# zqs_`Y;*D7K&ysK7y)r$$+T@F03)IoeZRkRe{{ht#cfZ~&i41y)!9rY= zSKQ%=2roAl)cAqPN$4>A*AMixrr#jt$^LR0jc8w1U9m)rPI9kG=g+j~GaI#pqeJQe z-ls}8Vkn#A%IEUPWViFy&-E9xF}L!O`HlO#&uuufp}~gW6TD|k`8fqMg-aGgbh{Z# z6dAj^R+>r?7TRtZm>OK(+VIso83&RhB6qo7kKc@*wOI6YEbEO^#T~u@F_o7OWS%>K z06V*rq`yHd#YzV|GQ3l+B5ul2+_=2tUNY9H`A$gHEK_%ZcMX{p=1ZD8WX?5QblWb9 zK?^X-bf|hHj=E!Kdu;<6Q*?mpLgXCGMvs`3F$*H>O;x5>e z*0h>kAdLjDF@*#)E}fi9QYl?&DK%ypy^{*_oh0Fvj_4v(xch2ZW2_)4AYv;pD&)^B zEpChB7$s+EE)N?`!QS;M-PXNer_xCw)28|$tK#UNnjSzn0Fj`ofk=9qXQJ=M)X{zzetthw9{91mjFTPg}2Trzzqo)^{FzGuMQkJsrCTLGkc2hDPtRb6n zQg<8t^k>fQsXA8khIK*@jWAnfCF_I_g8BMTL0je3aCbutqd zfR<5VM?Wqt>80c#QWFOB8Zlh{)?(^qjEcJwbSrvT@MX`z6yzE=Rr8Pc)y8~AviuUn zIP&u?ojd}$Ezw&sNWpS0q}FupO|%eXw_@d~{PBq_+lXQb0Mf`2@9wDJL!jl^ewavO`@NJ*cLvpN*ee-m1Y=E7FWU429Yp-P$`>P+(EcLT;d-mFLLH0np13@J&wks-G zVay5xiEkt*6R#r@2`_)<$aD}5gWrB8MW%=;Xh_b1 zDT}>6bzr5Q75>FOOq0}^Rw)LrY1hZzFDHQtZSUw}SIH*9`E)mh^CSK%2pPNBNkk`J zv07C@)-NaI&kOlQvAmq2vQ_NE_-0$*K_FkpAUDbDDqZXXslB>F{MATC$L&@Y*`kQ~ zyFR6oogT`V4p@p~+ZR&xV6kFF&{S$j1+~cKs}Y;3LO($2?sQ4%HHG)Z!Y(*yek!#W zisZ);TbyLx>vZ)W=;sqS3lZ?=fja%=8=Y}tD(Y1B!59->xd?)Y#2U~kt zAp9E9@(g_$T6dKScpwA63-uZTmGSsb^Ho8=9;D5~yM>id3wSQsd>Js>+GX=4X)YSk 
zdCj$jZ%CTujpne+*iJ;9a(z0xsrrcyy>6+j^?rsYC*@dTy}Bn>s0cjt8qoc24=hSt z3AOl)6J;S4)bsp|do6Spa;De5q+KRam(OtagNj2ReD@oQ=zOyW$Pi8u5C<5BNtD0m zP9-a($`g;t*@aRILpy!CCbEzcNKs_?9?k`pNrJL%=@zEQJWH3E9d-lG6qan(@WB>a z9he%qKDxMUZmtTAX9zpWDyOLOu&{#87o7y({SH^;JdmNoMn0v zG}%^TZYHQ_Ln4bZSs%jlw+{?gx-zIk7wY>fo4x)|OGh`zeODXUly3NafbN?SXB!QP z<<2tGs+((?Wb2aM!E6MiS0N)TdkiUcxtaV-SqUQJ%(6w~tk2A0)gu`bgwih)e$>AT zSm$8Dp_R%62@}PH;wuSg830$xb^Z*4m=pW0rIoGKta3Hyp0jb?#sk;5ay<9wI8#jS{7?ptcW-tNaIfSq6{>^wXUP6BYx_Lm8rC zqOnFrM+Eemy5T;02U&kZ`nc&J>6uqJJ_2g4`{dE3Um`}D7HM9HSJJ$~qr*e&U{D)ijbmq?dkXbt_3)#qM{Es9Z0pj=#I-y+9u^>VQ$k)fJP(3MxQM~h}@hBHE zi=H??Ghz;sKr9NdRq9f3NA5#UDCuMxLgQb}Y4G8Ql|F&!7wzO!0VQC&aK%>?CH6r= zu27hllmE+3EczD%WE)#mlqGJB^LX6GmU-bq?-?u|1Oh&j%{pbw%8*012)q`YX)k); zgBV8n)_G+T{-x zk2q&?qtl{I5Rr#vPQ_o)RD2=vBA9PuFBF~V1nh{%aIB&XPCU|}xXjI5JFCge<kCP{@9*Q zX=s6+cqouVQ;_hfVv{Gc!niwfe|<#wK?-bRVoK7t1vbBbCFYl|LvtY>T2jdZcViYA zM_nQ@g_d>dx@;w`(w8u5%g?elwrUYs#L}&oRD~2%4TXqgz8m2DtCI(OK1aLHdxu}m0RmF;YPUxWXQh}F{l^ZJ86Zb(UP za^U<}GSQ-6kUo70l{-%uz5%&`aQr<~-gxMC;(!{wSD%QldEr!n{5OpS4@YLhg(^Q9 zEd{}<8j)V%E2h@54Axiofr^tOs3z#ohmC@l0kAqfpMrjI4D%MU z4ly8c=vcYxb*8S)=}(5=6Lp$3p$8gZm)=kl`7}cci;nigITr<-_*^+VygCnzAJ47s@IBB)^*5uMOg6*rRhG zI&2tFTxs{85nMKNLZzE-uH`G8Uw0E?-tZ22c*vRY4 zh|~O;By2Cq5ygkgA1Q5vj%IScmc|4iLaAbaC?zh&P!r<&71VITE`8ehf>V;+YB}@- zW_-W4hO;Q6a$#VNh7EhbW}Em-e-nWQ%MM2yJ#(kpLOvJ*cmxsKFL*rmd@cz*1b8n* zYh)aKjs6z=9r*RHV}k_Xo2)oC-Qy@vb<*}1X94iUg4Sn! 
zQVBwK(!N!pm1EK&_ZN`I)uza%=&DG`WVwI6X10CT=_OoBWArQ_NaiGCO6w)%4iqq< zm<`+#yNvz@DGaTDY0PL}$!-dYxtaGO1*LlM~)iQmCdIdXMHz_T61HZ*TMHo~5S$*v3;KF#k%0X-Xw4Nq@sNeYBLX{W2 zH(vTUQx-3g!dt&D-=s!8YiBPd6A;zoz6Ik0*1SDqH1+ik-dS;KDDxyQW1 zBn7_BL;wqx0;_hL<(HV#%$=eJ0T20dqnkZofRGOm?JrFv?Wcn*jqIowsGf3faa%G* zR!4j>&$|IYl=dh!+iBp{Ah7#2`tvkpU&th?&t2Yu7WQ1Rg`I_RT4!T#tJuJNCLnZc zhD1F`-6lSZvZe?Y#xGIThaQd<$;xbLzLri;-$bv2ny3f26G~S2LP>U*gpt))b5F7# zie;1pJ<<}In4W(GIgvt%b*NE*7$q>&FPgl%3Wv^yHHI-!RDYN4eKC5XIq?uiPE^gs;k>w7J4eYE~;Is$I=M6%W*tXao!^Ro?&)i5};z--f%D zqbb;($QqAr(s;2eW0Y8AiVSz~gMk5Gp4nyH#n1n&qOj6lRm*RI@7CT^2L;}So%hIp zKv7U4Wp#tJ-VHH%tX^=?da|A|rChiC8w6J3rQr2P@mQI3FrM?2pt7kEOzvxp+>*=; zfYX;KnB%ZO5))BE=Ayq4KJQX<7P80rrNoNWGMSK(b11PPUOPPSDxN7>{Fu%6So5O{ zd-fQ70+EitD8TabJ?9gWSVO(9T_Nw_z*1t7o<*n}gD+sSGrR8*E|ty5(!Nlje3Z8R ztMW$bU>0d3v5aiXY%JqllCETOMwDJlT*2N+pWeryJ?e*oI^BAI_b?F6Dq8QES1oMO zgRj#|_sWKAv$T1MTw&oF;_6`GH~|_0W;=HKlB|AIIa_1}y^`~dBOi7+5=$Moy7;l+ z)e-$*==^l7Yr*TBo2A(wPT6Uuplc8>i?id0k=O{PGL(@CD2W#1{T%EVjWScBjFtz( zNp8VxFwbo{7*s##d}B`~+z;Po*s8eKyqlUE8u)3N&KJtS7GuM$f?@XrsOv2~Pp7qy z2|EsvY;YBbUaeWof#IdH8cLR&IZGt{vg?}bFcdBxYl@B*{}~8jnSH{8Dah}hM2@e6HNQ}-qH5uo@e zz_aM+Od*Oqa%Z$~26${5E%BB(!93Lw?tXhg_;I~qx;Nk^pmDyvsYl%+U%7j8KF@n5 z-iS?>GiqGjpU`(eZn2)-fO<^wdqEa24hkE|IKB}?MFJ6+9%ee64(e#kqD53rQ%Ky< z6{$_b4-(`{IaaE5*stw8evw8gv5JFo=Im?x>i4yyr|vavjTDDvZJ&ZzYj$0A<9>eZ zwTZm=7&V?(;uQJO1aFk%?OWUDW>SMiCea)@T^v-IW>B*)b7r?r3MguwN8K(qfVrEQ zZ)xS0L?>$SbC!-cH*H1VnoT)tC%HhZFp@AD?83^nBj>uQUrbeu%08o3Yr2~EyFnC{jw&cE*} zdt=k|BMa84yTg(!_A3nV128wPS(}Fy(f$U(Qi#IEs#s3pXMhCDJ>9TLjc3XLDQpmq z5Byr~OzYt#-eMJb|Unw1#vr1!@Q@w6OLB$ zxgK|a!4hgBM;hB}#?h&3DR2;Fku`HyEskkq8Bt%1R#>vHjNlnGtkMyDW0%&N#Q&yM zIE#Y{zA8?lTNgy2Dp=COzG@53#P~y|zEEvjqt}sOj7f)HRODC8LlUz=I@;0P6!i`b zrg#8Tqg^Qg`)9424-1T+m7nd5-M0S#`!N}3vwoPtPUs_6zsyCDdAM?G1k$#NFJU@U z=!OhYloka%^}LXN&{Y11erg&<-9a=QzL=p!2IdWL{f^}oD}p_D@{u65Iu}h8fgjtU z){ekf@{^5%tHRx&Kb}yB4!Kif?h_cf-xjjs5#d7*V`op5<<#7!VZ{9+SA%oqryWmh z{E3w_Sg8y}4Uoz@nUG&Xqk(hmFJTxkvs7J$h7k&%WehqEtHh%1<0-={+e=y4PkVeB 
z3UbV~l_8bT?WV5e2|q1^v9kz-#DK=HxRW2a?Lg>?gZIe*m2^yH#SAYgstcdkzfkl-Y2uc z)|%F;c^>YMiFffIgIz;0wus?tUuP?g+zifYDyOp{H#pj`y`_8Z(<*eDe*VB6*jTVf zCXYKsXe(lgExpY1`*GFPft9zgZb12pMb(D!aCUu+j#25SoPzg z?|E#DA%1GNou^+;IU=G#{ecm*Wl5LP>_B!ghTETel0-3ZFb^N-PK z_eq6#bw)5WT;>|~MzNq}2}?a>63VU5#)wFCI_lUML!S+zuNpH!u|Yt%kcRdi9~5KV zVP(7jQ_@bKj?S_O-70 z$Hn2k)Z!OaB7LNda0=h}KB<8^!Kk*S&=%a2D?Bf~{uPR1T#Q8KDNmu`BUt%y?CO3A zwQ7l6h-LUamdemDWE0gB+b=kaSi;Ohqmd&^YXl67VuuggjGcJiY8*_yq0wxMqlTz) zXN=sBCY$`9CY!Ky;<=@wCH3m+FjT|!0*&&;dS;+t+Sq3DIb1$VOfx@khehf_$5Xis zU507vC(hxkR-t~-?}v-Y2ulO!EBWfe$%8BA@XPf#h@(R=waHHjOXa&-26#0BnP*N}bmu0y0Cq1FLLg1=AAvf`$3D!Xo6<#Gxwg|W=d$0ZAD%Mxr zm!$k&r&cf9VcX-r0GkZBsQ~T)9E0Zh?GzvsRMhQ{uL^A)9zEvIlnB`YY}q8ID!a0$ z3Ut6r;wp`PC)sMKokJueYNyHkjwb@M`i&c(2Kv&AWPQ9}`K9}}EA011Wm`u<6eUBC`G+7paZ zw_4U(v-)|7}RD}c5b$A6g&t;yygu|!7-ykmAKJ9-=(W2|ZN;!Wv&@BgwG5n{P zM2`sEPx!@k0T&rd^ZeGnhMN}8ySi=bXVK^WCarC!>DN*>uLdgT+!q=^WiRA+wOh1E z)C)8hBS0Wrt8TRZfRyhqQf$+=aRMi98x^k_HGIBrb3YNaIda`L8=6)-BzQ6N?{4c5 zd!{}3)RcMq&tM~FYkz}ud6|73nKv|Ka~Hr|ZopFMnlJKbu?YX8P4y0RqvKUW`-SSR z24Gpw1k>!;b6HyV3Npk^Tg$&$HqY4$G)iCj8SKKKNm<3~<7??Byb|*|&oyQ}+-0i? zxRkDa0d9t+{;k5djFRy6?`QD!;!Wq9fNWB^qLTaqtIu2@2&-*^jI^G|eg~GDhBH9? zyjR{DM>kk&hh(L>-yk1vQCt97SK_TLj-u?&>k>xiAD1Vzx-4?H?QrpBO0}vN0=`#^ zCy#G>W=qfv$Mi0{2rmS(uZtOFJmj%ua#@{1nGSd+s5eI})a#1m38`GPE0LfN=8pO7 z%opg=a3nm(A98LEtFoQG_77tcSS*`a6{|U>X`WI?Rr=#u1iv{jRW~44I~piDSW9u` zhIM$Ku<~*-qk)zv>j%w#*gAr1l3xZF8MtzibGY7f=CYdp!wzNtWKH@#&j3zgy^KhL(0G?O)~qgyx!JdNQzTISijD z;&Z_lJCa!HCHIKQD!w|UWcBSX*;**DQ>DT!uesa$P@4=X(aQj0U|C^FGeDueLzX=Y_X1C{! 
zIn%H_hcb&F%ibfKYHb+nZl3nR)1mVDfZ0{YsFgBZTWLjP$~ER$J@35A9%8JH&jT_- zlKMW6_Af&8U9ALgFknlUGRoUVKPLiBv(Z-Oy#h*0s4W6KlYiAtGxm6b=Y6MG{(GK( zBWSk$2^{IEX+qoJ1jgH5u=&RT48y!^a{seTF&Lbw1*5Mxxrr#|)cJ7`9Vom)!HQV165JQ2srG>P5 z$d$Jj->0!r6`j40fg@x}8($`^Bkk?2aNAosaMZN^jGE{Pw%%m~a6)aBtGxVS$zw11 zi^abqA0PA0yB&yf{#2~(sd6rE2|U&52hPsBzlD9dy;W{|DPN0i)ROr#1uB?VFJC`+ zI{pU)hZCXwqtdiTm6pRsr9bMG91$}9*hO@Gt338n47BSnf<%S=D^QeOB}PV3ozpGg zI&Y{*W(}x3`UH;;7=A4}1n1V@APDB?L#>Araes}OfA=4Y8yD>QXH&SY%3bAmm3ZEx zOxro_)PDmp)}78GjVpyIqyLOzpD#HD)>{&)ZMHsX?=SblUpv4R-KfW7Z5Sv)T@fGWzAMnOu7?$qod#eN3s|yjcm#Nit#pkJ1Ma6sSLmu(0DvJJf zpeLS@avC{c(KR!ahbU|KAP(!NrVpJv|$KzWrm#U3*Zlv+f zFK$(drT1ersDcOsSh9E`TNCaQz}SUPgCs=2Ct1*D33G}_t|&z#};%6 zMEfUAI$n*~a*wgoXN7KGb6$wFNeex1086hHc~?7HR-N;(wRi4vdH6{dyhjN&ANjgS zuhfh1^`)2i+K#koh4yF>4z%h#p{pWB>dGzubRfA>k9;Pon|!^Of3&3|w07y|Y^l%QuE9g5vK%}Kf1R5f zxBSHonR|X^53c+giSzyK`4HG4=7#SRutSf}GjE8URlR@{sEJ7b-~{sG$!+cGlTXU9 zE0IEu>L8AzGw;QM-d)W{HE7JR<3teCOjt=P8j(O&tOLV8OgkmOv7`wCF+9zuI zWg1q|rk-~G6b_`eyy~ug=-!mK*iZ!A(%FUt5p>{=4T175&s{N^A4h}On*`M+T6l(W z5w^#0g{Z+%{h8Hu*;M=^nG{lFgOAl=Rc~DaG+9=u2Nkhd^0J{=n+4RJ3IOmMM8c|G zh>OG9ExW{{R`EXHK5l=O=t=g?gSMZkiL%{-1@AD8)UHVAChO%%E8NIl{%%@sEon++ zUn%4e?CxCB(GF_;lT2Mv_{~^!)9Uxqs~^u!c?)|^LhPGT{0~5JkF(iNiJq&kMep1S z&nCalB@e06)aeQXPS=<6xp;n_2LqNUxu!YAUz#>GF6m#Jr*L@ECZ*ZsWIZX;NF9zY zUnTi0Z(bwbxs5VfnR3es@mv=shkambqVn7CJP0pFSiydEiW;!2yBYKcmiQ z-vE`*m+3u4@_aff>#uI2c&(msdN*jE;Z^i=@U2+pciw}%)AXD_B1gRBp|V@q2;My$ z|Kp)X6PDQ<8e$#gFY`MmAw_m6K35|#?KB6bqp#1pMJ}*h*%`~!@4D^E@;yZH%@rGg zO$l{s&p=m>$n9vV&rn{G3VXp2*U?(5#@}Nm-I8+q-u5pY_W@E*L7svC%KOqhSy#6v z*e*z^>GvK7v3I>%u6I^5us7fGcWBOnz0B?22WqS}+`^F>TcrZu6kJ?BY}-JqHDH^ z7LO6V?;VE~|FjiE*QXYo+^n9qlG;_x^6C>$h9{5$5&5d8oq(*5F1Be2&%cDv3mE@u zt$uHvu7&{U5l<#b0C|g=Ip}~P&^+l>NgTK_l;0`cr*O0d06lA5^eJbEIvB}k9>p2b zN_{u4EFr0=enilzo5I%cI@W(B_C>=LI2fZUiGk<-3Rgfk{vmDhuO@86_n?rz2c>kc zsPr1e^TB|=0QazOIreq?opn(=G3LIT;*rT^$a7R;g$?M~Xxw8EpF3mXbd42-qbXn$ z=vcLZr;f}zT>ci$n8MwkCi|9M?L>w9Zpt!AU`$R6Mb_R~$KoFS*xXqQnK!H?s^#zD 
z52>!J^E`E=*5UJd5bT>-=&XSg@6Ne;1g3ej<|XYN@Gr^D9~h>^t#Nv?bN^{fY_`~D zhWBob|7fzI0R}|(-3(8FYt{OnZh=5M!pbS0`)=aqt8det^3U^UxkcLw$L1qF-rai0 zq`q!t^I**g#63zM5qo!Qcn2cB{*(vU80l?SfTJr76zK4fMRVOq=fRp1gnLvr{N67w z)#PNpr{EW}&j2zd2M%~8v~p`BJ)*@K?b>QiR_hP{PSMGkrXt3B5Nh*SMHa{Z)*`q6 zIgIxIUuXS?>#YA8gxNqzqe<_dr@Jm89*4l!&l^CzX~j|g(EV5USrK)z4et~nIvXj{ z|1-D!>L08o?jh|Ar6!b<$f1rCQ+G=>F>=((dx_{GD~MKPJ`SkYh4(#qqkD7eR^nCG5pc?iw_-`qV4DAA)ooxekBo=8XRJi{tnB^mu8$h;;VR zPD3PpFh`sP)badVe3~3B0+R%27K2wyv0EHY`$5;AGt!7XvKF*OeHC3k)A>*XGb%*- z96kM%B8D5((p6%n@+nyIEWX19)FRk*^B81mQj*tVz0;5&bG1E9`>Tc^pv(S-N8?d) zQypYwjOKh|u~AE#d0;KI1$(&pMIN7PIVLq#`^F<`Pu54J89)kAH1 zRAQsTBzlC2EU>kvBNL8}0iou#|DRv7I-$F7%_YejGv zQzCl9UP59nm*AYO%()jT4LoR)gUC#pF^`^!MJfv*?r>%m+l+(pnUlsh7RTItbI13Y zMBsF*3@+H9-R&^Q?NMywGaQWlV|V)kyS^ntszR>a`+KLu%pAUj2&7(Z3^lwG#4a7fmjY`egbcW<&`twtg-*7(a;=4p`1?JrO+d z__LoZm~|+(Rx^a@5jlf=Q#P=S+}0ERH#`V}3}eRd98}Azc?~(Ub&Y+Sk@Q0UBzu`; z=77{&oE%#gosY4H)8vO&sI&p5ygC6!C)IVU0u3^%s!&Mo{g}T&Fw^i{B3i!D>NeBT zsN12Y!$NK%l@T47&mxa4n9`%SmJ!q-o3qaCTo(sS%Z-XY1@p3$g^hr7*Zd5>tNc(` zTmNf1JmiO0lE!%T(P7TEp+Qb-8LXDxh;@ur`r!Bo zF477yJ32(PEkJXc>x2<62sv?|TeHk2I2AVqGuTD4!#=NyahS_CRy-?cH6C3pLh8#r z3p%o`kDTtQ%~D`-p}w1Hzt6r&)X~5Lf@$NO^yH85c-25XmeD4_eC?TtFDP2nls2ex zz8x&DaJ-?p<26V~OwLPVy7eRU)g6RRGT#pRh$Dw3l4+{mBU<5<3xdzsiX?fFB4;gQ zz%M_WTT=}SW@echolyTiekV+9TZd9NAb?EN2C0_(iF#7<{>aqaC9R#Cp^sGAvXAA5 ziyzw`+m#Ollr1h=<@8ycZ5DochU^d8_~)(B5cuPmn{cSm@FJiyWReN%r8XDl7WRn8 zQIj+gi%NCnUQ0;Q;}Rz)!Y)jlE%7@ZC)U1Mr4MyX%=$|Z>U6?Zv@=AtmX^6(YiDsD zFf&m2Z}6zeRQrOF(aTgKz)tx)_O%9&ST8jwC~_%a@uKzPyN`^Li{cv26Dfel`C8C~ zc7RQr752a{a@yB1&S4cI#D#_W%yV?C;H+jIHp``5?FpgL>Z7zdHi-5yboJI%uYk~w zJ^qZu`dKC}u$2zQeHVjx6u@j6jCLKAgP9<5WquwO&9=N7(Wdg}g35z$B*n>hL zE%mc^oruwhWzTOiFYq~tkpC##k_u!9UmKlO1O~w_x678b&Z%+$)y-JEDL{XWFxoUe zo+SzcVX?lo_ov?=h>!Gl=gE_lA6Dc#FXNs2=)>3S#qmsYxbu>cFOi2JLzq~a<5&aL zbq4PJ5s?FCOz3ihv`JXZu8iGI;M0Qeltg_cN;BCFVdBP+41hr_svIaBu)qd~FuyZf z^h>}|y?GE(GUSj52E}()ng&{jKy;f{?=}MZvy~jfDt4M$n_T8bax$rlnrya$hcOM? 
z{>#PCV72g71BlxHMciFQ#Swq|g2yGe1b26BToT+}(m+FiZnSX=mIT+PfyNql4XzCV zf;8?D9D+*%1PBm_{fBeTnwh)SJlv;Q_qCqtp{mxd-`@N4?NL;YW|A{tzaINNUG6ju zv=;Jn`j*RRECc|VdOHx~ZZ%t~DIr&}`8_Yd-X!qc;K{BJ>mBB$i%KysS?zUbzd$3r zL}b#i!U1yC|A=SQ9AO(Dg|zNgGWgw!b(i||AxCRJMD6i_IO7TC_k?5;QrKXGy^eWe z!e_{#xifGT|G7gmqVU~k7zU>L&1dxeuK4a#TH$b`2;p2O)A+5i$Ju^JroreKMMzsd z*bz5&LwhwuANn5*d_#}v8PV>anj8Kz*o(vE@{cs48o8}@&)>|MqR!b<7}`@VYgq2`yA_J$Kh|fo;pz;qE#W0IdUq2I7`?{T;7Lh zTU4?yIUWS_DSYn-$u0GAmru~w$0}QI6m>gCFtOV8Ptx9SvD2`}ZMwHvVB{oz+7|VQl=}f93Ym|^wTjN%!(y~M zn!$nl5O%Y$6<+xG*f?*Kfy0?Ic@X+x{3%D_wnW(zLW3D%85tF{?0Tav#ORSWjB+T0!u%t zY;@%r>81*=XVRe$E6u)`XxUwU%D(yB-+B$lpPrQ*t#{#pxzGbDoy7xFB@F8|1qS4_ zgu5#31F|$L&k1IHf}?p21D9p^2r?M#)h6J3faeyv5B3sT9`-2#(~-L;=I6hHzt(Ny zpkg4h`PPqskAhJ${=C18s)g~K8#vwbjYHLF8m`aBw2$_zxMull>4TWZ7uIhj#hviB z)UAA7JRIP@{ljsBp!u`yS$(V)+6L9Vebi_Dlx#hpzK$~ zq#M}HYJGA4q!ZB^@PVmS*U8OTFyK6;#cBYvIx}zN+S?~Dull7sl5=0kAmthu9Pnwn zk%R;%P|aohEX0Hz6~}}*y4k^TViPkWfBHou&uCCKjQv~Urn%&w z4H%u4vL~qs#-pSsAG=s+_a|)S1y-7G&Bf9t zws24?&G<9vJ7lreBPwR*#O?JW(G(bn1t{=X2LqxRg8gyeO_nT2u%-`OcDx%bcot_w z5aSp#x#gm}!;M;;OVQWna!CA;RT^8FSDQD%uHJ{l2)u6SgKCV^QCk@-5%cV%+vssd z(_MiA#(#N2pJHElrkdhp^n@VdJ;!nvVRnbXFI%5i{so)`Z1X!YyFqckDanGht3o2& zs>oosyKSsvE|j1|sR%;}$F4KtP3{&wBsg7~Zko@55YpCY-eGc&PZtpn=RsHnFW|`f zX6I0T+Zc!}>5^~Ee)dlPYhBqtAw>Y*8l^VRVSWS#(+HQNJa5i7ULHtBu>3VC2(d>e zADr@8;VTi_o@_?D?WFpe)k~-B!)h&|*GZ(IZ_~?shy*f+sunN?&vDqCCB-cRBkC?;l{1%Oy8Ng z;6RUw!#Bsn^-dgT;PmRy&B>2PC3C1lk#K(RR!uhWW(7hJDC@f+xt>Zg%>nwfFNNW4%Kb#$EzA2Bs?eDrqH5+@rCsx)cL=X_5 zfh$RVjK&BM6fhcJHa4AIa<+wVfyeR^$`c8CZ0{vl(9AaK$zseLt8z=T4pO!nY z!yMo;)2&=@w`W1>b$s38`mxm?dzv$`GpYd5<6=WPRt8kl-cR4g@uRGtB#Hva&OpdQ zCvZJ@JB+d24h?J6e8b?5L;yT%HT2p?*O(mZfNr$d06?c`OWZs@0@lB>K;j!+1wWKfNW{tK>zfJc!#bodGEz+y%AeY@OBpEtj zDq{CAUG2B|`e`r71Gn<^j1xW*5BQBgT$6@${%rXIC6R;okUZ;VlL;1;7!z~7qx%uXLgCncm0geQvpACd^Ejm`8Q$a72B|(*&I1P7 zryf+RuB5qWAG@jRm%h-!FZ}3 zX7Zu#4CL*dnB)8wumi*zf)r5N`T^75UC;U$MU=d@1@hnIvXTzTl+Y#rDSj@}(#amX 
zW|)ny;3|R)j>maLQ1MUxf=}PGxdOHQy@`R7q(Q;?&6Zz(j&$?mdE88)^?i<1+$yJuzsFNCbpF^_ z#giY*VmQHo|K_veQKR(ZAc zy!r4=1y5x#prw;3#V4h^BfXFlQONYs$9K#=P3OFvRNu~yx#K+?#7$597&gknqeqHh z<6f^j9+4k9G>%$pPZpt=jB*t=w-5<|KWM*GNPpia2R-R2D~nC<^b?QQ7+Iqv`$#L( zF?eMYRL67pj_7TOW1Q@ATv+u_Y0XE#3I#lbqnfVoY@c0z;@O0E)QBJ2H_GdbeW=OQ z=_{75qcb58%^4K+8;8Pr%*jGwf+Q%SkTlxHsdy&P{p&N-fd9I^HUlhio}6U!{b;LW zhj2xT=IJG5J2g=J>5XCfyxl#q*D1#V94#FIOM-H5IG_{Utvx+h=TmkEN#~KaBalls z4BSM#P_;zGF*xwpi*IJ5wy4P6#lEnzQ6UxnZ!2>SQj|5Td8lQQ9vmva0ZctTm{R% zV!Ie(3Sx#gG8EH|rEzH{H+gU|)VPgvb_xUOeNn(e!|vdc1Z@8^ujiCp#|KUPE0^e4 ziqjlvZMD%#8{Z^1#I$7-F}rbsFx|ouha9B6GA--SEki1EtDd)@7-l>qAg7NKcD~+n zCb{oxdKYFzn>IKakFSv7%j9)#+1LET?)Po0H;pTq-bPM@+n9)}9khsDJbbW;$!zdZ zD*9{H!#`X8c!a<mPzNXk)U*y+}KphbAXwm)WByIHQ!!ZXlH{;nO zb8SOS)NaJ>Xr?lMER0>zwAFGSwq-ZD=pN!$z=^b4=`T5avjq}l`)r2{mIi>%g)p-` z8_$l*GIq~jr=?Dk>yQ$p*wgaIS)(bYkwfa1RNN8rvs`q43j<2B@4RTz$t#u+b(l^# zkj)ft>JCX#@d98f1&vI<=GOhE=_Z#bIP%)mx5|TU%VF3LbDSc~rD2)Oq}0HId(Q8# zcySWi&MgwW8r~)XyNANjvr{Sb;%;)4;k)M%ggz}3*Pjv*2wMVky_VPygh%8W)242a zgwbG_quzbY=DV(rhQ_*8#4NX%!|xJWjsv|XLb}1@&Tzk<@Ncm6NXZY{Ej6knNeJ5z zNlLo(GD|+&jl|-ipK(nLN@ovo-a920CG~>W&(OjBi?MW-P+OFwCdSTcX;O{IORn@? 
zg??{UvC=vcQeg~6>q}R;GarMoImoxKU6D-|p&_3)3VFe|LU|E>{t}L@gaQeA*PMPi z2~Y2&R&jNPPzt$5Hn(2DBQqTDB}St9wi!QfHc_7O2$Y!3wQe6FBDG!frSYFf+hb$0 z@>~)BL!8x#0=Z_ism3g8(L@So)EB!C{-L&VQf@~Vzs2Moe=;pFrD#?@Z0K)Fjm+qx zcY2&sdFrWn`FJ!u_y=05n}G#eq~a=?)EER#F!nd*gVC^&4ZF>|tM9DxbY*o1vT0Kj zagXF)a)W)T(?vHA&w?#xEhjVk_!=lfD7X_Vf9?-f|5hgh`L}j*9K_Fj&uSohYT4{j zB>qBI`Wi_~uXHT_Ns}Eu%G_Q>-f^Vl+xTuk_dI*NpDIs%QqJ9xRT|$;V4`X6Kg^$oHgO%||t-X#8`J!ZFdg-5Lm0(9_nJ>Drp(2qb0*&1OAH>r4MxtaS}u zwm(qNeS^;>y|6qLPV$py^pN7pQ3I!O!bccYKx(c-CQq*PYY4fAJ&bJ*^>ZZSn)4o%lZ z=yILMr|Zbl%V2bHuGr>mrhuBZ3&vh6wp9bDtcZx&WBEawZd7|g0`dl{(!dvxzg0r}PEi0?1bl_ocN1bO^y zH9s{vj7=LuV1(u+{Ga>Cot;o1{SSs%W@GjSme?C>3G72ZTR6T*e`@IiEuG&SEB%Nf zBA_9K8vOi1lYRBAo7e7vuax{xLNz@^RgxIbh+#K)cH=jKTCuR&BYT2tX024jNvZ@E zA~7r|L2dA^c!zqxgq*{?jyP>t$M&bZ-;Ay+&>Yo8<)OvESW3?b4Rblxcgq}O{MMtGy_gW5xD7*5E0Ps$Y6_nRS`6$Hswk+KPb zFtR2k^4xSls_71uXQG0flo%(amHlwoOV{SOAX`c4QX`uE4kzZt=zPi;!###rtRwv(osxO!KM${eesua51nVfyfIA21gI98!1_oRTAH z*Nv|IM+W1Co_(d4>x5mM7I@2LK8>lY)MIjeZTT(^a`zRl(OxtT171`Yvm~B1v*_Xk zDG3dx!U}KnFx$nu)3Hrk8}Qz%*}L@0?&Mg9P5o{;FlY{Ee-{8b)l>*g)*$n6FW?;% zN-|UC_aS0qEsDtkolV~)fEsii;fmOVK@^0Sc8zevGg2HGkA--x_TygfG`Urw1Gr^B zh0kb|HD>dwq7MczTwYk6f_L+OZsx2{fRu zCeAPv`az1BFN`qv*my_g6Yt#SluI`muf2{LR=^q~Bf+Ho$s?c{g@ftTB>kS?!$a7B zGKImA^U{p;9K1;PjWpju0QV*00*qWK;7rB)py9Wudfi_U_}(t{7VNkYzBm+X}UQ%FoE`;>l(4F?%q=RH7b z=0&L#8o5Q#Tkjcxx6*PvB?uAf6w23FemAjPG`f7X2*+1Wi+||=i>ELUiX0iK99eb6 z`66a8=qqATUc6x}c4a<>%$w*58nq)&9M<#20xZR-;+=NdXr|>C1W(IuW*4Nz!^8`ZEvusDQMAMKut+!KHksF)MKWWJREKex5Uvd7ssM9;uvPpBMmoKrXJ{T9B z8Ife*O?m~_%j)RyI>*}0R~3B#Ut{2n+HUmxPZ*cA~8-&YX0vDKi7}V`2aRu$}0>WWR4f<{5#k|8n#) zwN78)mh#0PDac>*3jW5B$9Lc3?!X+mkpQxfu(O)8JAt5lzvM@k`y`h~VE*cJ@aC(X zlTn)4UD{!fN8Vplk`#YyS^pX5{#!KwuGutf@KP2UKlR9ieo_6qB6|NAp!0qc0rm6Y z-R1dH{#OuPH2Y=wJ@3ycvfqSe9IZ#y|F!p|!+k9Lv+8NDD5ATT*-Q#gviR zYgg?3|EGAmcgM$Q=^zhG!WkCUXN4AHCX#qdQl6`4ePgff80K`I5p$6)5LufZV+#=~ zfOXLeJ$&(_447i({D#ziqv4o>njt)*F)o?m`X1-Fsz@a`OM;#M{`qf*(!Ol-?Q$Jv z14#(P(6t)oo7W8=PX7rEQ>)?FQ!gk_rVF&GJU6PwlFBcSn^8IaZXP=Sf?MIJL4AKFBV 
zqJ%WdUsJYZ^L$=x&41QIEBw8cnho`Xcm<1q=7ntfD(r=+ajje(N_v_l|1>aD{_K*OoR*2!*Lxi)JLFo zQT9>r(Xq!%pD}%J`eOZA@kD7gmv32_>ejaO8--*qJI31&i*hAOGsQ958mUjej4jKL z&Zd+wLNl7Y-cF_qg=Z%s^Mj^u_I%UOQ&cnk=m)cu>fYgm^%sr;?`G!Y^qATm{86fb z3IJY_?wJEV@yOV+G)45CdK=3o$Bet`=JP{@|H3z`Roz zct`_n0gpHer{S*t{^r_Pl{H(25O^#J+?vLc7*ZM8q~ydFlKE+KYa8f7F>jcUQkMD> zd;2Nf=jJY#R@BXE^jVm4&-G<>D~)d?ION`lDB$9qY{G$%Pd-_186+JOTF(Bg*rNLD zKaLMvSQ68(^hIj(>iBs`CmKJwT4S;~`5FAd7peZ>L?VTAwIrA0zwX@=pMzmr7_CVK zM$UmQJSN&^OT<3>Byb_4UMKY!nKM4`Dg}LfhNe91=HNMCm}r%1QxagAqa2+(>@8RY zgjpuZ_PV7>jBpJ(!f z$ST;&$w!itNIIEE58icKpY6?TbT4e)hofZ$1dFd#6`v2iAzof_SgGG8-eBoeh-=ks zx*}DI0$0HzuFO`gOTVLD2!R=u1ieE{3+^`?kBl&x-Xo~kYAAZNq_!a3rBP#Ii#lCO zgPqppGq1GI*0CWj6sD^^YQE073>e*8*2gb5_&`pRy7e-l@ z5R}L1<+z!9@?OhDY>_hjBV3fd&9iw+2h;v8rFpZ~Q6B5Q!u>;{{sPBz5EmYl3|(^T ze0RmkOm3&Ua#a;Q$vavZLW+)D_?q|ybCa^_qt}d%t(tX|H~)afWP_O(|GPZqcaqc`LuZ^amHO~PT$EDxgVsi=)#{=+O0n^oNFlD zISt+HA8^Z6kxxaUoj;dSs=$Gcq15M5ts+B61}OMz2j`XxgXRI5TNYosa0^K`->gzf z_j_J-_Ps2zy&zuN^z;_o5+auOSY4!vi}TBgoP%^JjdT${?1u==5{$qFEt&8xIL%pV z++)FcKEW%Vsh)ww>2kV`d`SnTpS)w`k8S26O*ecD6x}34?vvxo{~~oqJoijhA>Yi9 z_Cwgos06xy@&#+_^?^4_Jm01xoM20qFaz`>P}A`cU-{5K(zjaP0VT)V>wV) zMipOEc;$SIqGn6fPb^XnnDqOW`}EhhN|Z!>-okb$F~&2hQAgFq)iOoO&OVCqF`{&f zrw&?h+8pI6snfaV;v7!EsB{h%5Kj{~LlNbJ_Px{!j>>v^>1)H;k}0l&-OVVwuHH;Y z#-#(wwv!4WJq;^{#sll`Y3JB)VC}rYOCrVKANRmxqlEGG<89NO{leHIwTNv}Zk^aj zIW`V?ZHQ<#?{%3>|KATPIE-)aTIKvxsL@j72_-ej8M$T2BwyH}k+|y)I9Mb%(lVda zRg#Kh{a@pkMR)WG%!9xFAuFdZfNEJWGP!eyNq2*nRO~{{Nb#NDiwn&Ps1KPo za17IGX><2cW{`c0jQ5@E-X7Jm_{F4+T|pP=?dEyNJi5c0Yb}$Q7k%pxk*%r@d2tzW zY`wU7EE3YnxTI8DpgvPSE`yVDHTrhR?K@mPV97b!Znu>QwY;5CVpY4U!MpW!S*2;g>DC^#jfDzC1UgOPr?aa7W!kOi6=sA(o+mYf@VeNSC zr^8NRJ?eeqQ0H43ikf8;0X)JLy0}%@u2Rl z*EHF$B%}$0kQoc z_-t)kxE{uY3BaYneC7@_wPL~SIG)$J2#`yik1MbXs4;HfjBsZMpmT2!4;^mXd{oCU5&f(6VI>Q5PB|U?m><()k8N`G6;xk z^|Mi*_kqc3ys)N8yHahy=Y*%}29?}Of!6K9>+`SEPnDjWvUaCiCDr;q@bjOX^1P}h_XW)@yj*OnNh7g}XChdal*A}cE2gOI z!pdqY2u?CnYf$gxKexTWKV$LK`UPwGJ>I?JLOT5c)}GKt 
zyfv`c?7B>f6|o^U%@T+S9-@E!e0-C0oyL^Xp|%p1O4;tN9y@3)UaPCHtu#ZYj!|TK zT4$q7E3b&#SJRERdsf$8Nn8|pi&~oqp1)4m!@*1QFnl!^8svLP8>f6)ZLG+2@cB5o z>~Tn`ewwT{lgf>ztZe#M*o5pF3^$!^(Guw#AofA!3*<_U_=r9^A1ByQOoJJsVlL5S z-lIjhX%?kM@8H>-AMNAMZ~}2wAJ+cGBF!$A)zs}Xq)ACqSUD&kV5=Ou_}LJ9a-Nit zvt`sJ{!u*KmSRU~nT66|$Y8Ko z%5UbOMG>F*N^$Tyrr)}YTRka3#ZR1x@v)Bj6N?UOYA6xF(L?`YTT*A{Sqgiw?Pr=y zjX;$MA@d$NIY0x6EDRk9_7jhO|2)fNhE}xNH_S&m6U$48jCv0|?VZ)Vcwz)hZ#ghx zZNB;{;82qy?f~d%S%@q6mKXG z%l0vF7Z+*2O-(5MrM@Cv>St(^czoUx%mWn8RvS{_UXBqWmffuP^8Tii1FnWtQSO&b zSS_%)4s}>VDP5!WVp<5x25wGTN`Rm+81D<#*}}%CH%msN=#>${RdOThQtH9dI5ZIPV@m3GrtOchiM*wcw=0 z?8G83Z*VsK$t|mUE7Ro@s3X>t! zHEH2-{Te(WoD{m11zaIMm=&8>qmJmL58hePTLU=!RF6*D@IP+)%A3$}NA61-F6(9I zyVi~+X17W+CFm4ZSG)Ujc~oMlKjJ}eP$J%KSZu1wY#Z;MEK-W5cHQ4nU57cto#AHn zm;hwN4o;j#ZiTCDa0?iD134HQYa#{6$mfBtq{QWQkykkYjBn{5x4yT~*Zkr?h=Yf# zzZ?5LZ|vzb6=;dEh_G%GH#H(tw(tdw9(6h6Kdyv4NcDB^4U> zv=T-HYwKj!HpZ>zcMS@}P#F%jsKr%ds7d!Q4Wx*~O(ERzY388Sc9^w}cy}|!=&chk z2ir(S0bx=?iaw~6yOHrL-QpP6Nr`aa_*eEuU0)1BEjFVQ5zy@cf zM_3u?hKpcrzWo(*SG7n^M@eNVZea3-Xgwrnbo zyD0W??GI667G%e&qF$v(+?QgGwcl9w1g)n_gL6MnzFHbtvs}rCV|j_`Sh4Oevq(k* z2P9R2Q2*D)Pklyz8+0~Ib(ME;W={=VUJjStxtH57ZIU%)V_2A#=1@StMxtel45S zV_NpF!{Zm}(a#Rp#t^#w=r*B%eo7p_4tHOZ#O+bVgh^s=2RSmws+ zGt>8*EkAohbMYuiN-#Ek(dVhLHp<8wJk!y&QV&|&aE?%d&-nQHc*PPQLuBb!<@OuY zoFSJ7LyMWgj#ANVH^A<^$g#zTMr~|5%81CeQZGaJ9>%a^#+GzzM`I8lXPCP<@%hVD zg+kI-rsPWJsU+qL%4uTvsAzcHnhISc-qv_oNY3?M0r|_-8-B3wM)p*w-53+GcRe?2 zppn5;3;J>2NiqLp^5F_yr$Yq+#H8d%RZKztz=v@|k==Zac@mZUc3O5~QR3Gt{WJQt zf0kt11%qTN-7gd{SYPNFG*Df|xz4Ys&*s%;ol%WcU*WDdjMN$LLq zWzJ^%Q`BOXR>Hf3BKEx_yqgyca%prIH(x&+{UoWL{NnT}k<^g#E!!~)@44o;+UNT7 zYOJ8;vnt(n{N?G7bP7`Xw~L@XYN^ldRQYVQSfeajk(`MKHB2bs7X8Z0 z>Ywm>qrX%7RQ~nNbW@s(L}gR1gq3oDF_cq6CjL3kol~Mdqpu!YmpCfo-xg|e(QDJK ze>gDhVWk~8N4+9l^iu32dH3up*Rf&LS1jqHKkkMZD3Y=_ljM$tEtYLkMxs;cA7+pk z2PtcI{y{lgWs5NQ%&G!^e7k%bK)%M?`qlVHY&w+^3*9RJEJ7zjYR<`L7H2z9C^4HdDrlPcmo61O?r@-yg%n2gKL2jkTbLhs6 
z)1y31U57yS!m-Z40+F{%9vYW`m3xcy!RYK}%c4N7MmMWeIGH{fXf7kI#wpvi{uoRE1!h_LdE;K_`_R;}s;WGZVGUZNeufW*=n36k1ap1I0 zJZ&Z%*?9;mm7Q9b4%(OU@I5ixGzOgn)R;vM%&&cb4eSgR70vezbl=cyD$X>OG!ram zq3RsLF}?$={vNUC8WUU(9r}2<8gwf!@S<0;^_Tw@U-Oi5fY?nN%yq^-2raJ1t+%qg zm_V{TPnph;IMhMK%!weijw!gu>O8_8?YZ=Z`> zm!&3-u1s>HeHlSCkzs`fV=20Q-d~H2=%ZP$c34v-xi}1&;$Q40BsvZj{7NbP4tQ-K zI|NPgTV7?-{6eP2lQXh;s84I{et4$e;>twH|03fV;*w2+2^Jt5NkjH2>UjV3j!J{m zT@1yDjJVpvp_Cg^vY&fkDo&IU&KIY*_52(A(-?(0mN`Mb_$yjbDcrx$c?cRjC;b(K zx>&ts)?;LSe)zA!2yG|=?3bi@p_tl_xfEpe(94f*x$kRxdwl#=>Z0}fLCO>pud3Nt zoF_fdB04`qgf?kVKat}3*M z7%|utq76%Bbzk~m_y$~iqfd)R9Rgww=QX?V?$2iE3|MKI7Opie5~MGENpYmn4t=T7 z!e3kN#>ix>lEifDjS9pSZlRir_#2AjZHHbVLWU{lAFk^a7+y+9vn5rhRAzQ{9>*+! zBGLp7zIiQ+G3X`>0LlljrU6<>B50Cr$%v_x%Pir8VIR2)EyG3s<7Un=|0>;N z_iIhjsmngu&reZc76R=T2j4f`zubKMr^VB7{8J5GeQ$Io^3UKmrl_lci626@_xL{; z%cC~`KyzO%O>EUCenCHFu#>os9aX$hhA3{1~8jS>Aj!sJ(H8aVfmM2 zupsfH&a8>zQSdf&8S&~uhxDl*r|r+R%?JC--k=%Z`5Ke~@3?&wX!tXT()r`>VY6 zq2Qf2s6YY*w@$e=`m`G9bD;4X@c5~ZcTdh ze!I7~>nsW^eHPO(#>_J5qT0r)q)5-_p;SnXnS^t=Rr#o3?g>QinsH}S`|6A{fEzJ( zMkBCVQ0Q>l>U-1ye`KiHpVBub;T9JlCBzg0A7LBF;C;oLkNf0qsbl^t?ApGio&?n~ zOQ{D}lXTRk?W<32S4yiP9saNuPQOn5TZFm9YLaOkk~>Gj71u_GC4KzT6$c=<=3y{g z_3BxujSiip-fWKe=5gHt`Oa{j|70SpQ~-+Lp~d6e&N+xlOm+dD*9xYW(ysu4=Q>s7 zzzE|g3@wF7IQd7HF)P6!!NXC9p{1V#Yuf%86qOwl3A(`U=#DyU`C_(CbyYN8n3Dd=mE(ShUiMt(Ey z+fiek0^~QFN4Ae!FXJA|#Cax1$DM4q+1bC-KdSdh3WqFv9zcV?W8lt0 zzViyrz9hmVc|yytzl!{yM9V+OEl~-or<&%4rfM^i_lbRSVbY(xpFYjZnt{)dW~|_d z7{K{5G0@-0n>2$bWy`VH=zy0L*7toEW?}9+>}#L>)%Vw}4ES*v$<>%t@O|;cEI;iA zDn1VvSvT*{_sCql5O7w4V|2^}Op`(*$2$hnRK}y7f~~WclH36J${JM+u1j$QAtt%L zFnmo(Ca+bBcwGtG%60R#+-(F$Vqd?k06Awomz0VCIY&pb;(2W*h#xD6m39ZcBLa*p0g=BBE_()T$R`?6yq$CCL z!~=d!eZutPZtNb88)B7hY7J{s7LyO`Y_k~M%9ck+a>KUx7WwSZc(V=dGDw4K792gM zz4qI2KDaCDN{@hNV`Kk8e`heuq{KqKJqmP68#E-rYgi+r)mBtt z{6`z#l8~90T@l#LnkK747#%5g8yicWr^P1E4U~wft4bF-ow0zc`Quj-{vKk;RL?zH z0Vw3^MbvTueqB;jlgsKw-An}^xCW8^s-sPZN{ub1XH@1ONZcILzTY!2!X(dWus1gS 
zJ8NmuS3*R<;$3Za3g2P8y6p91x}Ef-p*o`0gkaOID1pOkV-Y!ZGe%xZhhv{iAR>Noy(gnb`rm!wyJt)bX>5qf(z0E;&l)85N4Ez^|tJ-FjKmQvzv#q9U$E zZPumjk;|#=q0ITDLryLydS8J@OzV}@y-ICt(z&!8)0@Z;2IV9FO_o)jHITc(7&I?4 zdF_oe3Z>Uo0;TW3#ZIVB|5B#yMgaiR6(_D4rrV`c9+z)D86wP#UG#tpt7wbpg#tF)x+6NHk!Fzfx94{bwv4r}{xfAHB|RPm^alFu>mJ-Axf zFvDwR_P0VE8y2jbn@u4L?g(2d3*ZPjp=LEFjU4bK!1enmy%TGTEKJt&K+`EuFxIF3 za7f5=)+n=+T*lS^gg*I2M}X(XOqKIu;wJfdWV@{ydR*A35>1yp@q1Tbv}- z$3#~{Xs|WGi|NePk3YI-Rs3m4gHtfWN!yTB*vjnIx`*qF{3Yvpu=}D0FX`c`l9y!l zu%(21RqdVrR`j)D9y~v^plYl9c6jn@SkAp=jd1z*HxJnZyE((=pZJt;0e6WF|DqTl z?0b};=6`AX@_+o&UD6xqEEEDoizz9p22Lzw(N|_I?5vL#Y zH6+;mS?RAkHGW0oTSgz;DjH7O(fu82Gn5a~tM+_(vSdfKTR@9!Gr8VRRQdzcgH?4i z9ZOK(bQ$^&`W$PecT&GiU?tZAV$zb^NT2sl9%b`I95AhXDTq9?yjn6h2dOabXrinQ zdN6e|A94hQ$4k(B*F`F?vh@S02!s0o5}NPS#jkKd*KJtTX0C_jftYNl%e;whDBkKP zTBpv%13LOqT2j%X$AD@NiJWWp#)Br0r>3Y>hY>>`3Df~+yO_Rl;j-4`Fp)^_;}5(S zud26Ni@YAjSUSAY5EcyZ!QKq0Us^MG5%;`t-0`-1TB*mX93vffTH4i&lj<>ScXzX4(J=D&F@tO)1FrY zcylEr3Jy(G_(et&@`<^(2Ch4O8#<(cm9xB31ODW1pC!6|tdx@vG_00DZ)_JNdFqF@MmX#X zj9@rNtv+uQv_602ggK@UHe97C@jb0>%nlNPe%fe88Rat4S8v^#0^mG*OuaUt4?F6) z>~_x5xsjA>aqPZ+u2T{fNikE7)T7s7Z4&5K?g4T{E1#|}6i<7SPE93SaFfbAZP`75 zRrxaDkNSN-{s#Iqz`IA8XH)WHsLROQR^1Th(?5*c(r^Ocdku-INGNmwC@JoOFT0`s z;m}|$n%=ls3FdJAl`%0?h$J4FP zFU=WiN2Z1_?BQ&domTVgd5pz!s}K5%@Jt{P9;F->XM5|?`23N&sjBS#){z-n#wHSdT$h)9G><_MMSwv;41OMgfwf8=$Nt)0m3-3^&2INw80uYT zYIIwzvr!$5MAbIt1!9s(wPGD*6(ncw^(M25f1;J_mXcIB%o+%Sb5Uv&nGN4iC)p7< z+a#E_N5%PQ`Z|!u%zxbKt5$HAI=+6O*w!g!V;;)fYkv71a9;>&RS7qF-5SU4`#O|a7)k~ta?5Olp6&F9`XM2M&#rzEy zAtNXkcTnKZQmIi9rxLE82K8Pi z-@VjJt~|ZrM3^mP$lGkUey0(PGv?-VY;Kk&kwNWj2D7idK1ZFwHDNQ1zrK^E4{~z1 zC0^Q%2)X}mUu)@&C0wokofZjW?%}0pp*&Y3Sl$BK*!-}cA@}fO*AaSOdg1PbQnNH1k$?G_#$a?VwmWYkt>qG9ib`8yP#+Fmw}U~m zHTMnmMR3~{I|Q`rj`)}}>h#?-eKmnYFq2h!i=}o1WRKrVoyk`^ zT4%p)Q0NT+2T(z*a@T+p_tln6jLq=1+4U@4gFWz7UxWDpcrzu2#y>ho*C+}2Wn_F+ z+R%|eiJ^ML1^>8@Tz(8l>zeUOyV&{vV(+bk>JFAaQG$jb!GgO65AK%W?s|aW?iQS& z!QI_CxN~rKcXxLS5Hx#8?!9||`+o2JsiGG%(Q&EXHGHSp6+Q#dF)u4R(50M 
zv#IZ7mYTPvOmo}WU)68POGk@8BpY^c{D|iY4E+j0qOPBlsl${&6vjQ-_~EN9EVZ@c zsdi>4dE2-;mVYtd_o@=(T*txUBFc=~rfD+$@2!UEEaP@FWH@X_rE!+&r?jPq^Mw7w zJk=|y^p8F^2s1dd7Lt-lLJ%R8LtNC-{yn!iL}h94?D+G0dm<^gdtH;f_HPYf=glAh z>sm+>YgtHlKNUi%xGRPXhlfg3%Kv-QbLy^A)23M}a1ZpKu>@E`*2TVh#o*k1Vs`!n zN{i#pXk|u(9dQFut}=X2{O_~q78bcjs(M7uJLkwUBLmY#%8)0nJ>=w2AG+k}aOoI5 zkD62U+AT(cfs}x@<2y!hL;G*jH1q#xyI|Ayb6dDu1pM{79J4PCr0>D*hZK->D@4=V zKm1up)E!eThO5LTVQ+U5B$Ju8vV@D$CPlF7hxb{+oM4ub4NbU1Hcl9|wmGaSyHAkE z!QyF&6vxof>PKc{Z48|M@%FTvsmQ1b^r*Jkbj48q;QVVZ24(-vZ)-nd>vZG4)JeBf z$LP4My8%^M|EBjhHE9lw=FDnSdgUKBg<&Q~ z%`>g2!Rpl7Q+xE;j94dt+b6IuM!j1IBJGYQm?RG>dKOJ7yW)RK>~7+JB6jCBlI?By z@cz6T?omy%t;1XJ`6X{{`ZolmdE{-739WDig&yt7a9m|vBDEehgJPE=Am2Z)hA2&- z&9Buv`Ih8mFL(A<>%?(*Vc7RK#5%9;$y6h_WUd{U+g%Y?{2Rh-E`WB$Vw-!HJn>KV z`L5^P*)+6Zi^!<>5zzdC2%L28kYg<pOVjhnsJ z-Hjd{0*-%`w#CCS+KRfBPnVM9aFTO)~BRjH_A~QnpGL)kSuGw3XOwN|ZV>rc`7YOm=Ty=c3QnInvCK!=~^I zLg_jFj?$L8Wilm)mupNOE@x>njRec^3A+?5pjuf-$pT5{VVFjh7zSm+{~e>=$(1u9 z?-=w+(xGuKnz9|sC)U+bitGxD4jZ$v$#S~tz50&S#nb@wQyQ7#1NLtvdGS{WV#H{C zml6nJei7rf(Q_T@P&NCoEMj$EVz$)F7$_p*2`o%4$ICRMjiX||cMK}AiyE1Tjx0OU zj61WXS(*(BD%2DIq{4If-VndFw$aklj27Xxt;@h6exxL;W#``~LAf>Gn~a4Fxtu^D z)=gjWr3$a|P+0nVs!ZLGHra{Gn)vLLI|I4u|SbwXm^oXL7LsZ=?gLAX)!u| z$)L;HJrtdbZG73;=+%1DJrMh0+ZWxFvM%~u1u34atRIT=%$4;YrL8jZli^KjF8O*4 zZ5tEYTmz$m7y-$yTnOR_3<5-|<|k!lKeln!8x`#bCE-=(axCEm-<@E0qX%}0ehN3i zT+MfGw+{KnCMeFCKjarVG+xkEucl4(W=3n)`M{f+>nd^(3*+bUUSti#Rel7p3m~)nUXmco9|B~LRDyaF#|wiY2&m} z%zca>4Ym_1AuDYxTtQSdlm;4P1@u;AX06kn-2lRQJdV{M2%9t6Ti`o6BYrf6p?lzV zo@I=c$tlzl?~)lA6QA!DQ6na2C|(P?9r4E{m9FR_coxk)qo@^z=tgVSk%e4fN=|hi zOn!(7%hoZ)vm|7p8U+4iSuhpXo9)Vb79^g;ccE|DVOjvli%~7&i!=!SlM<;Z)(3X5 z&wKuyzPylg>tsqN&0UfM>CjQ~>{N;~(~DID^aL7V!#2(CSq3ovYdy`Fw4to001}Nh z9}cG+#M4&a5aX;mzC|n z$;L6-m>Z9wHeg7?F%K6d-oPJ3zC~FH`owX0Xd$U<=-Dc#BqP6Yh(BbNv;>8h>yVq0 z-o$R4m?q0JEcK?YOkarCWK&^$k~4M%K9uP2xd{U}CU}S%+OXXqnqa9)EtISNGZZh^ zD3Lv3W+i{-dTt@V)warf22roI9gS`?||E1qG zM8JrzT}JfbE%mLtX=vKP!UABfGi@i(>WEA>U<2Qsx8l|hkGX~7&Hqnwf=ZkQc)k5o< 
zV!(GLyO2Z?lJRZvA2sE)q~ya12wFzkg}N(-xAtusLV+^Quhw}pCaRg&f1qYZM+<9 z4fSfXY8=pX^Plzr>p;--t(hnK1Sh-G!XQlIu4VboGTbe=RmzxY2dhFcclYt;#HWYEF5oH#7?v%@1Ii&H-nLCxupK7o8K#uI4}Ds^6h3>--H!k!5}KY0%4+0 z*3>S+Sb7sFRbbmnLWXsnEle*NIv0Ihk49_h620f?dceg5!J%g3`v6y3KC!>ZsYC*B z=G@_AONSXEw3ZV+iF^4=)JmQv^#P6s9$C*^J%{l=k9fVDAu?l zO>1B~yb;@1w|*DFmhuyA5~vXf+HIQl>#TNW9_4!fPSXAflLCKgIYQCAB^58*>lokE znVquCB1jdx6dBq-7aDieaK#`LV7+J?2#fX>8q!WnTeh7Uf!&DX1D>IQZ5yJo5-~C| zJp%*_rv9x>Bl+5$N_ZDr=+AIprohk*kQRbuZes(j>7DpP+(@#@KBYiwpDNeR3Td)q zARD3&XHrZ3#L%vJ&%<8rjW47TF+HBSuxZy3Z|7anwqf!Ir_A$`EF}aZhy7 z(jzWjI(*b%7-tzv52}I4!*TE8po|PtXl;YL?m1{&$}Kpv*CD~Be>z-fp9}98Q0J@@ z@g&9kIO1hiP$%7}U~iCY9B!t%9VDi@Tu%=Oe&bgeifwH?sovT-p(2Aq9~Ky4*(W>f zT(4-yG&wt=IK#>B+;~R$D-a7s!^CDH+G0_KJ&vU8{L)?y`s))v=s z#tv^=rmE{E*~Udnt!UXdup3D))W?DN3^nA$-B>ZBSKbb!WNdT>n5!xQZI9p> zZqD8(VVV|Oq|XR+RynN(!Sb<`)Tl!nw(+GX(O$m+Hw zY#-y)Z*3UuJu0@$Okj4oC(jN8EZThEGYM1Ba}m>#L~!(fV7Xi^cVMQsRx{&Xy2I7EQ&W<`xcN|Z z`OVOP{M2vV!Q{I>Lgm7-g%&;Pt?gtfp8V4ACiNJaB^U0DVWI3otnUzCq^+~i3<7^L zQ&{pg$Ke-Tv>U2aBM$0YdQ}D6gE=;p8Tw$N>`t5aI%mA7P%}kmAQe65+Pqb z@RWXF;(j+zODwSDNp#}wPfcYZgo!c1{<8Dk*CKN3dLYE2$nmWzPcVWjOs8_yU%osKXNICn1$07$JFw#r9Kjt1 zB)^BTuBx2{h7HEm{)R}(8qlt`^xWl-s-yC+gNxc6r1({!7BP|Vy z%sHPDRVS7DEdUoaa0{uYFhTHF(mdOudqIMqfxF4L#Rns4*wK=R922IKMuOa>Eruz^ z=%IPgCAD5)rBWkq)W-Tu0@?gZjg}k0Zc|}Oht`AAc|NgfWgFiF3w|4VHh{bt+t4K# zkF@SA$`RRx;xc`-le3W}xX}l6hTtm!_km6>OpF655`F?2m2S5Zb2e@FqXY-#k$M04N( z!dmd+w#L8w3O8NMRpoZXGFqApwpFk--4gSONd@jas7)2fO>ZLJBPjg3z(O6$L&I?g zZv3Ei5JZdrqBVAB-b+>2WyhKGlo1^IK8RH;yDzh+?lRk5Zs3OseuF8 z-gC6Sk&rD?GQ~L*Q!~Kd%uo3tdEz}3xq(zlQtyJSegR(zyYqlYjdGbfvlCM^&0M*H z>dN6;U#C-y5|X45NxdHi7NV6`(0C9e+wbODgc7iesaQTWHdueNhBIg5Nsu7mlM(qq zX0~HNV6vm@AKS+x1iX)W7T@nHG(KMly+JJiTI`75ok|`!Lv}~!5WDychlTKuJJ9f^ z%&ct*S)n;QyN;9)QgQXP0!QqrIl1#l-}WIC<&wqpeuS%ats2qkTE&v%?Fc#CAlDJ^ z?Gv4Mdg6-C0AAD~b7?x)Uacik$%k8XASphX`8cXaYvkkR%3sn^dg|#@Won&=B>P^slqk!e-CyhO?5nKNgsRg+ai}-N=7SjusU-D zRfKllkhM955ygTqwCPZJ^~C$mJ{f*NO6~V#26*g*)wy1nXpJLB-84a zeHJ5LZx6|NpE`K<={Up%eilcb3NFZRQc 
zLDb{Bh03~?yxtz5q-aK+nIu`J&w5EhExs>127%Sh#Ws?)fCf8yIV7DXi<1U2W`vVz zcId9Ivu7)&7bN3nW3g4AtCTrXQLy;)vhNnt2M?L}$pFEHxD=bAp zr+#Caibf`k4r$fPvlo5Ebq4eO417rA1>Z)|bTx6C8U#mw3f4x)-*&KE{&Pfw>{JfW zE}EUtdic*G%Gbpoz^~5ye}#ml;>ISq4;ak97MmZ-rhBO{$hI>!JrCtkA@uzePA`5c zCMY(#myWq^dz_W5%Gs57k_88IwPD(Gir;1D9J&>{IpiEm$b)%)ZYyuA>?#*eH$=9Z zm3R-G-{Yq=q^`b4%Ul%5Yrf|=fo{zLiVR86T&IFtLE_llI|z6;ia4QD9hSdHd(s@Z zM_-etBOVoN!!^ZN9rwjJO*o0q(*K5FzRue@9&mr4<`5np1Io6D?Nm%2D@y3rkthD1 z{IBD$PiP4*yL}cf#)CwUIR zMkRw5tC$F>^b)i%qX~KOzI(%plAR7lK#H$5IBDK({(Qe7{uzc+Bde4u3`B=6Rf??f?m|JtJ=&A}1;~!4|%5x<|mMa9Eiyz2?ZTCxDfNx{n zjzONwzsv_8fZ|Gk>#OH=q@z9&7D!Vf7o2H@=_BP+_Fb&2@l%ZPt16Cin*REV*6}d< z^bW?r#Vfg=l*VL_iULpBEYIaBRh11|wXzi4;BLT@r$4ElH!y!gTph&xhG-z4+u4xo zkhvFoPz&ZuZ!db0J5M0GMJo_78kPn3HwGItqIE7^g3I{EL6N(_9MMGJc(0gyk^>rg z9WD^~@jFJ*s@HylGKTZfkNE2e*^u3oysq(tEIu8PM$#32&9cm#s9=t<=Ac_0+}db~ z?nbt8t+2z3*{~E~v8NIT@sm>eqj1>h21hAn7>SJIaIi5GlC0|6#m;ABpJ*m%E$; za&Sqy2gHK3HE*hq6=xRLTZ!?t$lpl1j<)6d(K&;nGq@MPQ*~f`V^2IB21Z`Y*U%C@ zDJ;xi;ltz(!RdPi#8IdWvRyJ8A>w&vm(Bf(+0bTep1= zl&M@5(bceliI5C}Q}^o8;}*rsb>weGJmmS&t~Tab6Kd0KZ#z2(EuuFhrhX}~u@Z>U z&uHc)#dlb90yNFE{8SB8OUy=2Syx;9-%Q_st0VQ#%kG2GW-=E{X0BG%uyrdecGW}F zw-b4f-V>&Cj7fXD&|Gp|h^JT*XCLdw{0VQ_SVhTl!?uFD+WK!LIlDPOtsFG|UhrF^z1ttnf?!3!nASV3l*e-Q&s@ zBgW829C}>*2XaGIBgiN={wC27idH)VfEx@DJwaozw|Ci2N_tU;9NC>5a^9?pENSI~ zos>2OuQBAnj(;;IR;bUh=2j2hGFf`!$+ADAyasu?xwGqnbo{V3kV(o~+f9K;eWjj# zI-syhV~yXq@9n}`N|J#LU3~z1jZc8-iVv*vCV4ZaQ)$lFl6rFQCJ0q$os)RI;IT$U zz!Y3)V3#D3iA-19C4_AiHGy>Rm^&Z^yHJM}JXKnl2X6Ld2>uI{* z5PHg^xx9AlC^!LC{NQd0dF0JrIrrB4FgTfABCOus6DvxwKPPifrfwqanCXZhMi-^f z23D`7<#3pC0IR4G+3p4db_?!NS*OSTQlA?z*-8{4ARgn?U@D9;!qKp*u2``EKK4{z z$NTN9{iZEF3;7{f2{msWQoxmsSkQN?c8Y^%saeh^Bq^P?%pe4VXry~j4sI{A4&=rbDd2Fm`pMK2cSrFv_)`idr6ToC~GLOrOC${9<64)0&^q z{S3tKF_~IRgNH9|RrUF}YTZKsE2;i|!Ed`>qH7qRii?(HTRJ~1>%nb`G9`9jc_7sx zY*^-(=Moa{P$@N${E$p2f1ALVu|)2qng)x$QAuhpP(!iwlTU&vv`CM93O75`sCy>a z#R0xG?C}Bk`xI9upFEnIT@2&f^Q3_#fYBXIqO#2pfuTkgojOCQzctjDkHK`9ap&vS 
z4EB#QN!cp=(ow;sPrXQ_FlcrvWD$k~C-iFuNMSkBQWisdFtQZEBxJ19!s<54A)-}x z;a6-)75=C}TIusudrdB<8LyB%2TFCCl>4~yC8M zt~mxTQb<0GIJp|49!efQ4}KyeGLoKG75bY>5M!DI#%LE20+d!lmj2vb>m`cgAdP(x zypDV(N%!RsYqV*~47&`1(Npe7n;ti>er9n;Sc??whj1RiI&r{@Cu0I%ncz^I6>vcM zD410Ah6*Dsr<@0ol{q?`!6EIAZ?xM5MI67In0E-s?jMP0frD>X_3NS*1tIX-Ud~S; z#wOOHnIV5zb7Lf+pQ(4};UUQQKvZ9lJ(zF7z>4dfwx0#DQ(%dDz; zgs@0pm0PP1JS?V=vi-+y+`h(*N|5i{c1sDKGMi*9RBW z@s_8Nm2vG@YIg3Kddl!kZ619(*J3IefoAEgg^2ebL)k=>4|@TNhAfrj!&H$JIp~Mi zy@~EzQ)5S}E7igpdB}Zlhf-xH_IMk|9GEos0jjN67e+bxHx-Wd!gHbvJKeSuB9%IG z8GqzoeqQ4Bt`|U-l*64>;BLg<71Z$)P(MK$wl&0}w-|P-;l{3P`n+1 z-^CBN69$rh<&5b=8yI5>NF;sitWv!g>=_{Mn%8m&=p0&%Tun)jY@td1+KEUX>MlR_5%aX{B`6v=k1;Ye%jG*gxAj9}Hvw4gSPrHMo12y}`OfhQ$u0iv1FtPM0*UpRs z3%bx9;Ft(Z*xn}`)vN-pcN!=mabpFNh|O&Jj{XMA68MJ9jU!{|LAj_=h5B6gt|ifn zi<&l#-+VQz3;J$b#~gyqkRXU4M$RD8Z};wf@>94-4Y2#_P8rC*my<=c4w7UUIkfy5 zfDbIQudTp7B*Th#=1h^n%+JIMEI`>>Pu>^OjGFX5a-hxTqhM*~C$0c6aUMnv;NGqYV|6P3js=do!C#NkApbQ3J>=x735`%BOit z9$LgnhD8<`Y5|pj0N-Vk)V5(&aB?2gcoMc^u}dEo4AD2mqlVfNdhBubb`)g*q{v2B z!FssNwB6V8<$J7|U^0#Ln;_eI9y$E3DA@bM;qbu&$)I+wAwV9q%62(p{Mi_gK2fgY@Vq{tRYT7B5$5htHuoUTQ#4pWBMqp1;*x ziBvBj5W7V973U$=OuO^Qv-?8J7xp|pyHcp}(7y99Kj_&l71M74SPtuN)>==M69~Xc zlnMQi>O~c@er6T;aBP#>5%*xeDLuk%(zoD^o7I(~J3{^QGLgVR2O?pg&gNsH^u@@S z;2m=GfEX;GJ1U1Bn%vF9x1j`kE?cva+4o*J1U;g61!GivP)9p>v}YL)|Hn9>g?e*| zdAce(Q1k@3XKF*jjGb~Z;%f`$Mdl7rmEN_d(LxSaYRu9zejmq~-D4+%#K zqcwUK6fAd=Ii>xSdJCyDp`5cz9`8E(8v+>C@YI;uzLL`v9Q$+Lo031?k#FEWMmEq% zg~4xCg8gP&g8P6w&pZJfE^wW)ciQ25#z@UE+E*^rA-hsLxB-$^EX7Fup879C7{d5y z-fN)If0TbpgUdhLMl%1c{1f2;ZvSuj=ggn-PkL8sckmbeYuqO<=C%COA@Xh_sQsG1 zM_>h9{yFp<(LPhN(SJAo9$fzE<~1U4WM=!1@=xynr2LZ+75s@`tl0twZX}6+kQ#-~ z;i!{z^aIswx#Hoz$mm0J^yl#T>^NtyAvqeb*Pq=G;%Cf0Hai_4&*}RD=yxFQgQ1Dd->TH|aoZC|}y=^8eZ+Q#R~CiEpTw@>y!I8rZi_h-F?(a_-aBZU>tuw0MeG5~l zWIJ__G+h~8jt^$9Z=_1RH5?%Vnd=?JFm0Rbo(9SNB45+up8N1P=5BB^mUv~);5Cx$ zcZVd+rzpy6#aK@uuo3HOc`S98cH4iP&bYG!cur=4JHt(K%M1Iqq-)9T-AQ@1IG1?N zbs9O(tvfi-wC*46d$)Y~m*U8#9)rkNYPX8>k$}D9`F`p}R%G7`4HMzy33iVVNJhw= 
zxcu&YC&kr5)PE<~VL;tOwR(c1cy8Z_VA?LutVjGKL*;R-`7cP~?M*BLtZhDqhMVoM zUq=Qf-`_k-FedcMKN8%6p6?Z3WQ7sOQ1*mQ7k^OHzqBY+5GZF5b$Ba+fuMLsY5}~S zfZkN3j6D7@X{*gl_~omMh&KHya6y$X@QiUis?U+j>RwM~U=nHp9VF4y#SjY2<)4PA z;{;z^ko5(CMpsD@<6rrT*Vn^2qDe`W$_ln`PkO86Paha3&j~)eAmxy)Ro_di(cQjV z3V;5k~Am`Jr^ z6WgUnV8})@Ds-V3T)jggFhrQzFOgE1x)R#2N=kRU*2Oz}+7Df#+t3Gs=M;UnT5h2d zE;_f1$aE=dO;Wbxi9ojS1P5GnnpEY_;rOwJ~Zvq1FwsJ+fxof87fi!@Wi@?`lE8id6cW%4VG+$!ahs?H<|M$ z?fB5v*Ey9_&Gt5-sdEK~k{Q>L%QbZOvqU5PbNinhb5cun-X4DWYdOcWL%e$~=kTp} z($1w~Te}A~rNCRD?mF>R<`F}pdcoauF3+9++G8GjsrH#p;KNXMr{>E&oK&mGSgF!J zOFLL!@jvx3n$JF8At$)7d<(&JHPb=BXw+bZpK;zO6B##x${edlxP=NoC~Jd-~_ z6YE0G{y@2RqB#f4beRD&MlIRhy$Yq|#VT(eN`m z=)J^K2xd&Nt0Vp6QWa#xd6xlh?IXr91(S5-^qokLX=hpfJH{`bhA(p#?mK)8H(2{0 zYsP|Puf*ESnkk6>K+%~i`8%ZmZ*^;F>Ud7>zL#dj+5g9Dj20@bSA<}X7OtLi<3A1( z{wg?$^L$;0L3hov^QfO-l#i|ZUou_E)-nDc5_cb(ZlCkG?sE7prT$?7IVYReItkdF zO_yr-6T54{cfAaQVr#FPb7EEW6)dFC)L7jck9#aXik*E*gOk3HmFFqa9{LdPh>Q4kIix$bcU>G4h8e-F;CKTDdcecnljnv+|Q%=lx| z)rk#17{i-8$Cawq9`KA*Wjx=CHhA&V7YJKURSuxeipw?Ew!rAfohc+9fBjjiN94I=SqIrec||RDY)e|4ub79RFo%vIZw`fQ?PmEBRY zzIo;s8|ySx4$!NI`udBAykDI+ciAw`qwI#~?zmE_`$ycq!3^rz_UrcJ!S%SV6U%OB zkcy9ZwTZ?ZNolngylK63m>=D}rsH1Own|;lz&n$3VfL>lWc!Fb&IBf=?W@4jxM#J4 zionA7i|t9Qj`5;OsLh~cnM!uw=$$lR^9JiJTYJrRD<^ZVoQOppQHW0}@h33%C&b8H zwV(JTut~OnkGd6H?Ke!|({>`u9XJrJ!0eTBzSBQd9@I$0EHN=woIl|Xq+Dw;qtb>` z4p&^iX5^NYZ3$}6iFK52If^^(@sfNdtRs(qRFv@kJ^gLPG{O;v15&d8lHJ3h@uc(ADH%+~*g*cdxB zDa~nH_UiX8w{u1JnrpnwH>bM8;r24`=f}xE)v|d(JWFsJ^sIl5_zeNrx}tn8bNF{L zQcLygPdKY|H!+V^j|4A`y*o0Gnoqx`ZcuId2B)u8xh_T-&6m9|OkAa3N!5_xU+HWx2iL| zVc*KMal44yahC8f;J3)=(=U@v6wO%VgXdQb#$s}vygcYlX*WAKKT%!_bXIY93i}yRYkaZpJATI&CNN}pgNj>V!b*Y?PmV(Z7t`9O za_a^Kq8=!ep?!ABfXftlCuGLgmDZ|Q4DuF1(q`?w-}xbjv4}X z2OqM$RvDxeRxq0XDW_3D`TcUi&U9oeHCWQ!t0eBMRIiUj4N|dkD{){t_iNtc>tgYT zEHitv$C%9OWsY_x2k_rTfBsH1s<-jde>#fcFUAF=cmpdtXmq49PUuj0{ z7AKX8a)J|Azz0dHtjr!k+X}Z{=O>3L#2Sg(qG>A#nqe8t7FAw*Nvy&5ax-L znT^zsv^CAgSo>;u51{`@m#!0g^p}hZOy;a+;xfkoZZ*n^b^kKUASYn2cv3KaUJ5w& 
z1rObxpDNj*yHGF;S-w2a{KbMSqN)Qi?q!Or!2u>W7j7{?m3 zt`)p9PZhT)&m_GDCtT*rokrnOLLOcIN@MsRhS2UP{xLM!J}Tnb{HP2Hc8^JX)v30x zt@SjgcD1&bhzizuD;d;Q=jAlpJZj`k50*~>c6m(Jc^Tli#3g0u6~>m)nWXirdMU|UR%19@v`K~kgZ35L zz{K5L#aIiwYh`9Vqk~j%B6wLyMAmT|=M%=+V{d!U%)^b z<9tEIKgq5!n#sPra{i}79O!?AU1K~)=bVRYyZxHp+wxAhV+8@ZLbRj6LxAILxzykn zt(B>*ygh2?&!KH>4i(}2)>NmVE7Nw04j9d^z}mSVCikxR7g|RdXk_&h4L>$-tNm+8 z-~$ztR5gb7Wl6fH>9s47rOsYzNR;(oVP4wz8SA`nANBw0^wJgeVk$}ZHooqNd^J5J zHYtDj;1MkCWpv-y`(Hyv|6wT9o7qcS0t|zP*yy@fU>U11zgJ>J$^L=qGF&%3s``E_ z6AX81YX>g`&=v`{pc+$`SLwORJ;j>-sh2@l?3I<&@On^SnW(Yw)zEtrsvzFtG1ng|Ao}7s;?P%vw>vE_=nG^#?ws-t{EMG8QPmw>?>a zF8^#POrb(_;|*@GkU0XxJ=2fowcEG1>Hw|Nx@HfprKH*hWz*dVg%IYqZ?zRm{{KBg z&HUGAC|)$xs|x3>7>$hEAD5?6DY=BNz|m!MpW%y9#HOH zOb&t`QzO+DZK|dp5&&2)_nW^VjN`|37lWQ<7zjM_)gEPQsdTn=abByXUthjsq&;tU zP9lZbVlWp|ky7({BE-#2N4JMGJ`0^DP=KW8arzCR%Z8ZKVE<8}=_hl<_@czFR>ewt zg^+zyI0cq<>zk#LgBXhLjIy;nQkmoO%{!aT5~`zo5*df_;X4nG5*&;X_$<#A6B#i% zpi=E}obIqdt&+e{&l;$NaH;K%|5LpSL>r#Tpuj|OQsk*PI48@0{IfQ^I$sZEwIsaP3WYfxA zYT5z=E&)z1VoLiBQIS$~EWCn(?m>u~mjkVs>-3nhDzC6yOsY06#7&!E9*e-Q>2u?X zxf5!8Z(i>e^7((s3pt^2G1}DWIk@0t!zo;vzR;@}r+ei_iWx1SZ;^SEie&y!_EMp? 
z$oNX$2P&0al|tQkp;?L{!$hg`R%KRjhb_&lqhejK47gDhHu~Wq;A@`Rg;oW)7~WNh z=CvAH+EKRUyzS>uRnDPo5um_1n32impx|9q&fnBW2`^oz!$4qP$yl)pL&N%{a=gT> zmlCCpie}lr-5SV#Fc6sof-eyg^O6pl5S*0Jk<`-Vi(iB7;JSA=iN2D)WYt%{8{{}` zd)^h|QsOeyg0C!`gXe`Dlwg$9G#!txfc8%Jp7UoaB%C@NIpn8myYi<>j9^pJ@79XJ zmeBZt??9wuz*iz4NrU{Z3$3#MQX-KJ_E_yf6xBczhnN#USFuJ__r&H|M7)NKJXRAzrRqa&ysjIkvM)WAS|%5;?_3&{LdFEU#}?s;|rDl zhh0%9%V1z_yRiu zwJ6owk?Bv{oN2)j7{$VDBZ^$PWIRou=!ET^fYAA^dOlToNl;QCPzQ>Lh0gS{rmLB= zii|Ssw3r-OiX~Fo9H*Z_^oXgw=CN-r>#|$`9*6N<0|9}EC$A#=j&_V>9|(XVvLp^7 z4)6J3+NdR~X*7hYtt)!L4(-v)r6lF5MQD)pFoUD0!fibiYLYB5j7Kb6m1CMUwY{or z@O9+vJKoRa+o4OYJ`i2Jq(PQe#niVR%Zk97iFqON2w2L>9vtkbg_>217UMSq04)cr zHuwtx%So#vS;Uh;)B3sIa$i`?nL!qF=~;$CQ~J0xtZsKWL-Ere(AXD+zK;TL6pB)f z)jx{^si!f;NXgI}#A-B|F(}!&+#{%*AUU)ALrF^`sQ45Oa~c%w&H9I_DW9Z}0k-o- zsKJ*rRe4O^=ukFzt3HwpC)>!CRDj+Thk6=tu3I?5|6+vnze(f^aBosi41d?}VC!BD?8q zipW3Wo*vC9v{p~{6KEsG`wx;O5f_97Ue-ZZk6Hv{f5Ks4kVQet>|=O8&tA#_WK|S( z=4#V7CybprJX3;6Dz*)j#Bq3$MwDThW-dvZS~{8~mPSQUIHMk__WBrJ=SxVP7@8O$ z9cwz-gvHI5ai&aVv1n4;Nl1&D4AT@fWmEXwa3-VoTBy>wMa8aPp&@x;T7;ZAQG|@v zoHOf|aJJl{-yC7-)80uyhz*9VuF18L{I6WZemH5Fgwb8jQLT|L#HAjFemm;^ltO&Y z{^N+G>{ygsFiE)xbZDWHSuhz8KAZ7FkXxYJ{pubxIXWk%AKR8bTs~NRcXi)SPD{01 zkhbZ1PprOr27#n4s1JaJi_^(RhJ}nF_P1Yn75Gg=*m8UNH^LWTL68u3R=rbD8{1gm%nXzjUG_TjnzZwMX86>+q$R)f zu!_um)2I<)y?Jl+g^MSNaGnL>l=@8s61j81-HKIu4ORnTuLp_qME2r^j1ct_Vp=!3 zgTarc^t@plcXdH`s~}>HP&)md0H6mNZj^)UNOp>6A;-iJcE;)!OpQ@0Cmypd9^_`Tx80`B`>#-l`pzA z!__gOXt{Me^6_HYcPa)oR;9aZf_%64vqmT9m+DTUuO2}Z4yX23j^Cte?oG&$Qo5{Z z!#eNCJ`&{z>8@H&t>GExXq?zJG^4OL#)NTBv6sYNw&#d2v7;wlEQhXpsMV|X#<**7 z5?ph0J!%mO4SmDm)y}0e;6zr$5THaF(nB>JI5*Y8CSVxY#E^5|hyNZbDclj?15K^7 zQ}xMOtgN$2!qm->qdj^6etMaR5pEqN-aP$SYgv;{t)kwK2 zg4D?>h)S0Z`Z4Sy8`vk5Oa>I)`W7eqe50qfF(OXKN|05DCGxHaXrhHo_s58Kha(Az zm?l?-4B;OoTMU<;@bKy_g zihO5>JspYZ_L_-J>HVfunYFREo@t~chkqe@?O(UBGWo-2P~1W=zW#Ix<~w(h;4#SC zXtN|3!;&W!a0434L{c9tPAQyav47tSzMZx@vQM>-oPJLWJkwp&%&L!UUEk7z^3?@b z@MQhOn4Y6-czA2G>3eWfru0+i)@J;WR?7;U1DF4z^wKfAuSO+;++s#by%_&f>GIR4 
z8*leIMq8VQGq#chS5P%Wx7*^v`+mupq5Xw(O&(g?lfp`!7CY{W;#9Z&!P6Z@yL#9; z?likWN<2Gj+j)QhyW0Ip#cs}4VIaQRwgz5>71J`kL&K;yq?<854nc^AiU90d#joUx zQVv@~?{CXMx4YTj5Ezl24G)G&#|{KDpYmku}Q zM@qbgJ0lzJK6r>V&aXNz{o)%B(s3c9y3jdXgbq^2&V{nb-lsRm6GfH&Z|uEgR9)Q` zB?trp1PSgC2=2k%-JReBce%K`b8&ZfcZcA1ahKrk5-dpSChxtfQT?Z?y1GVpkMaH8 zti8@!bFXvm;q1NUykr}B>~<_NtK=f`HJIZmAK;4p@{*Xtyf?HT4GPmrst^_qg7@2l zqJzMLL|&tcZnkFX)~FpMF^A@UJC%uqc4yc9;xygM$ZWGUF073HyYm_&YxbA4lWi+bGJie$zb>0B0&uq1z3`{XQ4E{G%a9Q#9&qvU&L3rT{QRQZ$*>1@!M z($jdp`Hhper7~NEb*@*Pkb);-RR*Sze*r}!nq3|@f?IW?4wsHb$t_QD9aw~9Ub4Bf-5dhA z(3+Lqv@I+En}N+~pzqezpe8oD%uRQgv!#5g^bmTd1ngLBS~XJa@uCGTC6`?ME-!%D zSoVQ!1kqtRp;#CoxmajYBijZaG4^FIk50#m$1;2L9Sv^4R;&uCZ6;!fE_N*}wvAgscap_Sn*RI2y~C zbG6GpAaO~6?NpHWxHg1s_2X>d%$G7owP$9FK<|C>Pjpx9ruSZu8uB)bsmmt*lfHFGs z0vps(G)6Yl?cbCJ7|=|+yTzyX5q;_W=wg{~Izx3=(4|-L2tgU;U=JU+}(kPhQmoT{xJd<|hfaPCt)7am{| zMVD0utuBuO>$S)kmSe9jmeg(n4P)vhSf4UU#-6 znP^O^M4^kehwCv@=SvAh4(*-pdneMP?|PLj5z*2{E+&!}a_)TCk4AZxYLas{Dub&t z@4Bqc=)^vhdgcSIfL0-wtTKhL=9nS79@bJxWk-91;{fbHR;6;|cvtH6LoMcT3?Gh? z)^2{iZMrBFAt5KfQrU|I5;yx49%GI2MuxJC2vczYb*9R-xbG($6+$C@-q_U%pl8MY z_O-@-_1>-Z%IAWRLIQ=Tg$k%ukQ~!(azR;Gp_EH`|WCwwp=Yt##(=A#t>%Vh1xG8+f2+J zgg8fP4+GW=1|h^YZMtPz)R-mru%jmIufUwDX$I*~_$&m&L`SH{%Gko`@)4G|y$#xawwVaY`xld4?l43A;YjO4ynYE#-s(vY8DQ~zz)qPGh)f!6GC#`iw zx71sKc1G-ViHCC`Sv_ratVmHDc$A%YNc+-OMce|9Q;lyu!tr_Z3>$h9!E-`63U4bqUC`kpROcG?sAsSce&>D2N9x;@ecY*=R@5hF*dp?4onXokJLzxUu%8q!DCw zgIp_IOZ*O@Lpv6p1FNeMTNXL? 
zCw8f>@jihK_7GIoI1~;v2 z)dFT^YRB~y752WhRQYHBpw1Q?3t}`#c06cqEv}P7D#$08p=Swu=BRExXgg_aI+bo<6_Wu2m<5!zSrL)oy8p~9^Fo)CvNmNooT+LKD(T| z9Lf{v|F>+&d-m}c5 zMfPw6kD{ik|QCt#&&Uk z8fy#lR6R#op@$vZ0y|w2FuaVoRbpot9?A1HqQX#P*EpwsNgr;_NCDqU3-~^n4#U57_6UDjHu{E zO;=H@YD#lA<;g0W$fQ=Q_v0Ci(upTOaf@_P!kS=tXWrV#+7?}l&twaDCYfBMX9E`@ znC8$MWOnS2tzNW@u1&_QPk$89nua0aJH#ucmKOMI;S8WclkTH4om4>vIue2iq4j3_~K{-K}b_6s<%I(56b|zJR%Smjz=$8k1 zmy#s}X|CFGTE=BG5AGN|MT@wo7Rz3J;fEIBnktsM1m1#07?-42Dll>NGP=_np9Zg zf(ofP`D?2R+szk35e2#6bLU@cc@tKAvc+!4ITEXdE+d@Vq}(r|FgA1d8v0EBxv?!+ z7vG!BaAeTfE*Rpj3%6**4R_$65_ju@*9E(3JP3IqV2w=C^IGe!DKV171_P*>T9ks; z-3IGb=E$$u_ci7lg1FyBkfYNJ$~O`wvaSGU`mT#V#&yfEX#NUdFa%uEU5~7%D_>*D z!BP;IgXqXQ;mzfdF31*_lg=pZRu>RKr?|!Gn_}$ zxPT1OX4hNl+9g=xK$oQF>8qvlvZaf~>a>Kb`^22{%VLHe<&+INS zZJkt#LK<4mke#eQuzqWH4jp$(mlA%0bqGyQqcOcZm2^=o%)rR4cUKuJNLr~LnkBai zwBLAOm>s6K4Rdj!WQmN$bFG^k5W+(0rUg!_Kh1e^ZCws2uM4`E&%CdYTJUmVOuJ@B z9Ag?aMUAsqbbgn3=p0kU86l{`b2jEPN)|A_Ss;VrY#?v3Sg4HRP=UYbO@!p@n$9K;Y5x)N z%gM=RD}mEJyL)n#$ctvZZ%kaGJg6Y26f)(jcx10JIcj|HL1m*?whu!ZanLr8)nGJ)D-H z2$oL{F+HxVOTpM0oo{=OZriUJtq>l(Vq8sa!K?;7;*Z5Om#myr+8DH%@SG7rYA#&N zM4Rt!B_lZxykLknFaE;p)}KuT&u+nKEeVRe)prPiyPEt4Xr}4#p&q5J%37B*>wS=b zNu+tK@1RVIc<-r4y@xk^EdrphM`%C{iLPX2Xl|DY8KWPa(E$sw^XNLx`FO3+0tiBp z(mm}Q^qQuw@Y;Grvv2v#{h455bm;94ylvSeZha&lQRlfz}Kcz{t0hEUV9>X1e%q-Q$h>>l*>+!XnMwP)Dk_YAi)bODZliMq_X zUoi0`vzgx9k&H~`3J4j)S^4oJyLmzo6E;540%U?$)1A3E_+d%V$=|GLZCY#>v~`vc z<%k1nthbL0E(?wzvCVA{bc<5{PpsAC*Lo_t9d0u!6nu&3o%SC za&cQ_2&sEFpN%i70A95l7V=osjnZ70kZ8i^{Q#u-RV!8vhb<1L!+zniU8%xM* zdKGBmx_+G4QA^lIvIa?&Xk68(Sx8asPSPcEgqFwgETrjt7Dh@FC`SPL6Y|+%*-r_I z;)Vk}DB&`iHyrzb0IlUw2id*MfgM!~A#V<5b$5K@KWmRMQ(k2TrnF#lQxmS#n3XG~G0Yw7$z5dl*?ac9;VnFuU~=whWMOljbndyjZG{xtI61hc z>iLZY$eN@i7w`yH!()2{Xslu3>kTb+MMw1SuahH6^>aA%dYDCRKSe8#Dbh706vc3T zmP!1WJaWDMvon4Og{`eoLXnx0Sr&DDn!HxJ-ub;>{y66Nmrm&%|JiKn4Ed%zj4Q#% z0`sW4EoTBYxZM1Njl#dm3GK36+f zytl7jWlO}YRcgnjI@MM@xJl`XcZnB&T%E%~Z90qFd2O?K00xl5!gnUcs7yXx2V$_W 
z4PqPISK=*2j3v>1o{>aQ(EeH^gOkcu*Q8qqK!c{BAs%N7%m1MGIY4&+h0v(SKX$xM zyM8}wGE)^7*eYRtYcG?kWBVloHIkXpdu-^F>4$b79D$&pjZF37^GF3YUt*&fu(pXM zXf@W!c(LCNE<7I2^Mg(vHNAAYn6Ui2V6tWj7< z#&t!q+4ySUczw%y;x#qNZwPAXt7y@Ps4aO1)hNmQs*6)1L4BooRKLN`#N;~o(CCKu zJuoy|E9fH>@zr&W*FxypIJP)zEQD69@jDRn7hRVfWffn-_7Bp#=!0oMj;4PhqeJ0MD@o~eEeTof=A4tlt!(Y)l}pmk zdH;Bu4zDNczx7lY!h3a))Wj1|M=Ku_r{rCXNYI0}f1~VRFRhi%-Fk8|esYSak!@sU z{ka*{zZ&Wu3(NwewTTzHqMl)9Zi*0_(K($16OaDla+G^scs&q z-i&pKdKD6bx+(FM5f$cnSJGrEs_gN0y!qa(giB*XNpoGp@?vNNcwJQ+duN|)-LoEa zeooXA@Y0g>Z7QN#KP^0apg-JaFVdcx2ZSQI%@Qltm>XS?VFkaW;g#<)Q zM$6#je7SaC?-%XjMh(b0Y#?pMHH|eZyq70McP&drB&kv7wq!$UZOQx+k=Fm+bzKr`M# zrftOrk-I#G843g#l7&jSpv)2oHAWYWvLPwrcO4l6=F+Js}7Mi%kkMo+~{9g;h2!Z&)c$3eWGpX6a;Ma9fH>zpX6RY93UQLJh%AfF^7RtV8fG%6zX~ zFa8wx@*(E-+AEyg^%?g-xa zoVdhPLzYyRG3QM(K}y2DQsyXG=(f{iGu{szmkbhe-Pq*3^JzrCuT~J!&}u;3y~)1ojZ*FG+EUog4Opp)67)(hA62{ z+&@Ap%LuHh)F_8x@iw-xb|DY`aZEbkzv!EG!yHG$) z#QlKmapdHkN6#AS$gfdN1i`{$EA4M$>G1e1KZc!OsA^6u8JKSn8fRHaSO-b=+)gZ3l7-vT6qN4W~#>gQo6dcLqNTAo< zhn?c|lzCaZpz$C)y51~@U0!YEE4B#}jbgeVvHKRJI4AOJy$eA%Xz&>9+%lz_1(z0- zGDxmag#%b_IueeL>m{SARt90SqC1N_PFv^(Ikw`y?9#e-Y)GprO*L*-RHNo?wY)NN z-`DMOqExV&OKv-}dVPCUzKpVH+6pIEkMWX@E{>U@afBO389>Xk?ioOrd-`Z~O$dpm z7WP7rvTwryG0*FFx>r89G$kdl`MuwcAxS=@@>uyq0V?u}X7HhIpg2^wqkdRqk8tLw zGWQ8{+(-ljbe@Opt`nq758OOeq6wj^ufUA9st|18#&6B#R(T5-laoaAHN$aG+rxhd z0MUttO|;aSmz0KNj$&N&-O`4=e+saLJN(L~>dXcV@KE0kQ>iNqV5Y|+K8%Br(&32x z;^`C8%I6eWcKpmw`@MZ|hQG1kWaEaHq!+yUWh~rCYFfpn6;G;Nh{tF|p_7?aJns-=@wWA!kvloG0 z05xF1wZFRV3=<~8{kk0e&gBd7{8-XlKWZmU}{scpDg|nY01i``R!p`Pp%j~&R%!vYPwr)syTWfQUK1}~3OO`6|H_)vSRWuz0=1-)fJQd+6pXjrT6V0$WoyjmD(xrHFrE3cSte12?jQEPY5j zG_!cD#~UJvw|bqRqxaq|&zNs{2_Z3DC2SrClM?YeL`S*%XRA%+!(qQK@B9mq94K{e z>tiYdTn8}`J|{{V11#ds$<`749BRj&E)`4sy~n7gQtu)zd4P-=y{oOvY|H!1A#9^C zNRBCDC_AKyBLzsaSFDmSn0jy1N~7=&^)M~M2F*NWlRFNEtu@F@LwME~T*=(an`v6s z4l}GpJ7*ZPI(kbuj@Xk+hjl6iXnR?7JmyTRu6bTG$HB@%&N)TEl0K%W!xc-RZ017W zcRTQ_O9WiW*9ECYC@v3`-=-)NTTv;PNO9$(tW--NxA90|G^(;!g5~i-4LfZSi8XY9 
zy-U3H6Z(VXO!>p&APCuEX@tm3yt&m@Ff$Ox*zRDu`*n5;@Mb=f^o;nW5S7?H} zVBWMssJthF408M<|Mu}LD1|4aCVUjHFw&DYK?>hEY>wQ-j$R}+CBs2f(k%pz)aiOu zkXl2k^!ws=UDu$0j~1(HgILp#MluK`c%m0q92vQhx^Cl5q4c2k*qM-`yWN0R4u|Gg zKt$&)?9eLIup0t7k1!e2JJ8UR5z&F{)hT)S-unl5C|Ed)y(SqH8hJ2L(OiYBcNCcs zOf?O!#?U$`7ue>dqKRn6-1`((BgNki`SWTZcLXH}jUo7a>f^rOf=4 zTPO*~Gc!MN5){&bqv3ZUUi#F`LIw7r=uOMimW#mpjq%Oz94`xqQkinfNI`eonqKBNBd=<7USUJ4UEw+CxD(}n%rMamRYg%kde)F>mi zl>n6xE7g9H$c+ZzfNooE8zR{fe3Ud6F+|v)?&eM{A~XNSy5nKT#OjpC3s5d=X|x2f z6`N8ajFil4z?^B%Gz(u9)(7IirflMNPq=n=U`CgVjI@yrzGsIpvWaXs=*i+j0lM3K zYtG322_^=2VEss7{u~mc>+b(rBlp-rW2T9myWNIBzgyV?9Xpuc?jl3xYT+US9|ND2 z_=LxbL^H>!U$(HGi=$EM*%~-FGOhQB!(;d!4HEW;nuqz7M>puj@BG@GhFT+AFN#Ij&pREe?th_aGA*To1KYx+^Mh`_M?*$kM*Kh zg&HNwV!N@?MX-}A9i#+~68e$8&!YB&JkY?RqJT=>^N$}o0kZFbng(y%#L{Wnh|>}c zmiFuQ;hWX5HkjpdK8(seRiK~l=OC3CrQ`B%pZB30rNs@4N3sgYVjrE-4PxZW-xwBx z{5g-aqUAjj6d4?k%u)}-AHQ<>k-nm3N~;|5l%e#By<8mn_Os2X2jEwfRrm9>agTsJt> znBSkFH8s9YPk|MRRHQdEXG|OC&r(}QE>2E2oe|em)wIIK)=hy~cEF9an8-l#r&3GzVj_q7#CGOe{@9g{fil7v>_e6|MrR?uH62RWGcw*Hw28&aCK0P zr2|z8F=DJAG~yhD?l8f1zwC)SQVGsrrLC7Ek zx>&j@o=mJtIId~pXW1_lt6TN{=SA+Aut>W=gkzTgxa*(=fPlFup*Saq1teExPq3wB zQx&$6gl95CVLUGZt4BU){bWXU?BSdHYIous$t4CrW*vWebN-Sz5^J!O{3NXrl9V7{ zIC6lq9@gHR?951w#?FA|?~ET6S$pYBcNSgP>32VhZ{q-g_ni%)Be6{|min2W%R8CQg2Z{(91QYr_X(Yk^S z+fFGNnpL{sHC;qB#t?j=)Ru&^$7G?Yb9?-N2@oO_Fhp7s7vO;477IC#?jT9<8=}4< zZEjJ7w6s6{O;rlJPp}Vy0i^xKON__;p0G@|CiORjA@lGm7s2Zs=QtaB??zE@0_wOz zlw9GLJ-7zko9E^-8xK>BojwuE_>fKrR^uBeIABxnS~;ncwFygcGsMzq(T!MI!zjHK zb3T7-xJqIU)kyGeEomDxt zNY~y;<_%(_JK+u6$J`VE_*SY6_!;uwtD_wH8_QDFND@DN zp(NP5<{}oIjPej-zk_7w9LyCce>Z8!#my( zpHC-0S!0n~7lQkl$9Z`+AncCPFFu46k)glUb4`WCEijBNrtL>6#7Dfw5L_yYY?=^Q zevLDvY7r^2wN!mI98Q%-fhMJZHf1_PR2@#w7j(iO=GdX-@XG+Axqb) zIQ|a){r2QA0Z9VLeivqjoz?Gnm>lFx&E#MdZA39rEHKT$0vws5UYMES4#h?*l7nrE zk7A-I=QFoOf!cx_!Y5(1&1XI5$V@>%as()cWSgFgFY+}PnOiV$YluVM?nN~Y{la;J z+xoun9{0P)_xbg&3R++6LltnY-WOcCaL@9vj>>khlTV<~OBh+q(~mx)e{H4G zDBFGywIhqS_YM~(=zE&Opw48Fl;YAF&>pf)CwtCyGFWyjR7~=*W`Jp+g9n1M-}4ZZ 
zc%>__N3RTeL~X(@xi2lufwXwm!HY}S*StG?JRl($2BmB(B<|m<4~`bzxR8UHW`-)1 zTUl8PenV7_PO68011hC7WIk|Wl8(65kTz7?gdX3~#S4$p zntTf@Gq7e?4u5Bv%kb%Q^%i7sc=k1Zqz>dTf?yj2V%X+0{C4=9IMFayK22ym9ztXT ztoSBxC8LSts)e8UU0S~xMcf`-WH;>q)PlO_f{SUmg0}cr2iCT=HeLE*I8fMav5pgD z!c-L;M2CPL)}fOyxz{bxWWO=8i-dy2;e#?`v$vm8B>sJ9{`~VHkm))a0BaB38hE5m zowVnnzhn!h3jOTns&+deVOpY7z1`P_7&!5Bbw5yY*jR`lQPie^fmUB?C@Q z8AaZ#cs$2bf0SE&azF0sG4+2VrYdHG5jZ&kT{;|e^p&F+J2 zf9VPj2d5iYt>X25@^ZjSqP@t3RIt^%$K?9xNYWsZdLz}^*7OtaaQiE!oRdv*@r9Ga z5QO>riEcvO3x}q9a>A?hjsSAPZOloHafQ|1aXiljRlThc?2LeL>XmkBWTG&HXh7FYP1S9zN~o^}|uw|WcI z*8o$(+=uKyV9_SrrqW)5j!6AQoUH_Sy0P26>QWZ~w*fI^y5!qI6HXSpFo6r$8Dy&M z)-MVyPA{+DmIfYX1{Ovm%83@MFs53JvIo{qQYR~P(09xW>_(c7b0_G4j}?1J8;u1{ zjmQBJ(C;*hOHlXGzWAAHH{!Jx(^=O`#3tafUun1}B^jAGp^T@6MnGijM_k;N6Lv{b zEm`)^gj%NgsE~o1BJY$d&D&H7RcgRZh(dNpv&;E^ zc9&Ciq_WLo@UKFJd!+Yql9AXJxaC=OLI`;H>a`%DMX}|V0VaJUNBNwDU$cIH5)l1E z39Z`$enzK3HC6Lw9=2kng-kniT&>}!GAhN>lKtCwzj{11rz=Ipk^NLN=c~dj^_ep}X;lo`jC5ODg5;exhvl!_1s&l^k`Q zZRX6XH?RGMK#+MUdG)WDtd@@;bek0g_t+lGo}(;*yJTZ$)LHct6aMH;lPBj~9b@-2ctxG*u00k<6_vVw&WXmt+i|MWBm+Bd-oLMcs3` z=5$8-z9AcmdjK*OE_S&^S+g$NTg35OJrdyF)?ols7;haXJ;!~)nbKBY6HdfP zOHsI1Esbw9qYXKG2fJi)vqeFibxe*nX4fFIUKSCSptsKtYhjLg(ITbCOnEuX9Y zZ!O{~39{|GSgE^3jqDZ*QcqffX)}WhJ>({CLxx0>cIcssVMqfbb;-6vqBLf9Oh_SF zG=*ayYD4(>fKBJ$A~ZPX93K2Z^md#yzdm%6G@}mfPKwmb5@7hut9y;K7f#}GD5hCh z8NNg%o6wrnHz8I@t>DpZ)smw@PefU`Bu7D!g^!WthKb|bV_T&fgy@K%*Rv1w;`5DQ z38$OQOCJ|L*m6)Omrut}WRS{{+K*218hLRwlf+L%WN=%^C&7f0LR0rzHCo^eGZo_3 z*+>$m=xWR)x?5mWkTVS@8^1si^ZfKWy$L@Tcv6lL?=(K6d_TVxPLJFimm>V zlKlAz=uX}$Pbgx~f&O`js|P2xEUk5b9nYyefr_RN=Sc?FYPASEnig;tG6l+8SxQ>~s?1R~g71U{)P#d0jtne)Pcdy=-=Y=eH1OafT4zQ>^q zzVEsVmGb}Fzr_DznS0TH2ml6rEhTjT&;VAxxqxgT7qxgTUefq!qQm*d|-G#5-UBnPg z^Fe%LEwg}jiI)3CS-C*`WFSvDP;6}{Q3xhXxQDGu--LfS92dLJ_q|E1KfZ+_*)1dq z*-r+q?(tUgvU`o~E>*F7P_D#7j)p<8vIbOAT zC@4P(-B;ZUy;_!oY3bXpw=6OqnEPM9G;BC^{l`Mza?|+JVtIAQukjqB_h~(jbOLh- ze1|e?`mh2Bp@nzk{|gQ#fA!4que~A0{)UjPcqrm%?nd^(#+Eu!d#&h-70@bghpp@k 
zuK2;~((|6vw^;3YHU(_)hW%0MM0V5g@;_mWcy!FkJ(yC3{;?>#U+CTJ+d1Zw?rHL^ zhw1+CClmjeuXxs79hxDe?Y|*dX&;K1w!gyKUADF>C|7#o*#CyOE3W`+6F}Kkv2M`1 z)c=(J4Y3&PEAmoU{_%qw-=mH$Wf_Ji&efy#<7qjV7MxB%ZPf|0W^f$9^MFC~P=$C^ z5C~?R8JzWq(>^onpQ#@IV@0M^Zo?O`w6pJ4_55|+qsC)82L2v)TgfTA$bQO5k1)!` zL_zQ_9$;F0g`3=iPBjFx&Zp32F2BvtD^Q*fPT)eoD4PJu#YRE!REn)Vx_8KC$GCL4 zBb4f)3G#(>*-Se1mx6M4%<@A>fo_)4H1{m^uz*2>HNOIh#Bhu4L`crvZwT*`g%{%G zfqkVa6|1I zIu#NSJe5o6Rvs={_5xOwPIR~c#O)%QG{&3Rjlvs2%2)ORK@-h1)S)2mdEb%Qu;+2j z@YNs6RXpXAED2=7t*!o;k9|+XA#3!_3rPC9QcCc~%xqJwBjBEM|21?g>(|Pa({J&d ztt|*y4B|p;*F_d_)vOB=iJ5g|xCXcdweJAyJ22l?cUElK`tUdN^AOT|T~VM`_t5jT z@9r|DSoKmF*UH_FJo9GD`YsIHmKN=JXFR_3*rpJEbmCV2XXqa!{;7$7ZiRmy3jaI{ z|BwADGgQSH5}pj)R|A=oI<>k+mjR=CLQpLy)dHFfaOT@psg{2&kw$P!{9 zOr=Jkr@X{(9xiA`iSi1;2lf_^&h%zqhjkFpa^- z#C-#fT^KEVJ__t@&Zjh2BfPaK`a+AGR{wW?X8)@jMjIXMSBKm;xX>k?$Ef|wb+vdt zksbUX|e~;1$qx(GJYjI zTWk%VM!{Z*>b1W-M(|aI83tEfe^3#rmhl|(P($+{u1g;c|X&eP&||O$4b{OyuPSqm)GC+j9J3& zNn6c@$J{l%GZcKWoIw3*R)G|-4P!)oaLcayc}MLNQqR6Khf#B=TgHg1GN8{EDT_V$ z<+8qMD6SYPbPt;=LyW0xd!RPeyy zrBU~0qaCjx_N!}}SEw^hC|oPQQ+2^hqfh-c29(Q3$`8q*W^)om?@sfZNJO-d?Z@|c z%G6d~1DvV~{}^3iFc^NMM2`==0m=K{&93Iet3??mjPgn6wXjpq)~DRm{Tkuixl#TN zff(H#cymtMe_GO{Y*8*=*8?`HYi1XkEe9VcWN26vJHaQ23yrSovQ*T%|41k-GJ?Ez z(Sp<0Z_`pvhyld*)PW5`i9vG46nSiS&-Z*vzfYKk7ZOwo)sN6LVpVU!+lFA^SzSE0 zS;p79Z>sSqWSU|(Cwr`pA*&N!S8l7aId>4mep)I~EWgElOJP}8(-F<;Z~yR-A^&p5 z-R4x7J6s{{@imeeFHa9ai~Yu@g6%CJU4fR64%D4lUrK%m*9` zcj~`gKc+lT=5hSRO72wW#L*|$%mYlGKhV?42mfUA$vXy|&s{y>tYsbQ#~tl+dO>(R zI46vkzwVKnp67Re`E@M3PN)rZwDG|^tm-{~O5|SR&h%Bp_UHlo`R_;dm9RZ~nqt2Am%PKSGl5t<4v3MuzuV>1?IUbTek3T7#Qv>AxfLH%-@ZwtL|IDvI<=)u) zN%7YKm2iN_0(BDA*WdFrCmOF6zoG_h}Gz+duaKHe_pZ-7D@5=_{TflJs zS1*Fz#9-;F@`?yP@Phx_wrv1*7Iu1VwP=v&zq!_Fg{=P#K}@(QV`rH<0KO(Arc=u+ z`yZTjUP? 
zJAr-Q#-hROLr_kViuTVDaFU%*Jg*8*`v1YYJfoWhytE@G2zJ}9Ke6x2Yd9W_W-_jY|$8EW)`vvK#`Y%5dy7a#x|j6ivNU~8_v=6fSUOmf{M|y43!&<15JjS+$JtK8Nau=y`zoC4Uyc84wuhpM2DBD;4 zah(k=5MdGty9@M=S>X6{JJYzUad?@5BaIqS zx%+EwOAUS`sHK-{`=tHJKs&a?R~L_@9Wvf_;_1(9N!_OC#7p@k=x)@eLf zx({+9oV!MkD1dP- zRa0^Oi-j)tRB73z>2taxM)Ct?Zn>Ns8~Ah6pY*cks@8m4F z$BD_NqS{6@_{oX!FFXYP;*q0L^DAKTr)yi`5x53|^=bZP?*4=OZF3!g!&~ywQBw5J z1Tne)OyG3SGV>7lBeep4vZz{GRQ)pI1=s38(y&?5;zPpg1P3NL7+dP#f6vI8Z9TiG z-?;GwOP~mVt7bxuujg;DvGH6k-KPwfl+Wn~=-@?7u(WvI1-ok3rA=b8j{;jm4Ij*NHUk{}V%Q(a=@Wr!K6#i9a zPSqK951PL2N*5M5g%=7!@-!Cbe**?zr6euoA&KZ9wVxckAk&laQa#<|?G- z@bBeiB6Yo`1?z;BQa`W;DZ4U&aS_FSW%^UQ-jde`Um0#H))#nX!7?Din&9qEaJhJbySqCC zhYMUBLeSvu7Tn!ExI==wOK=SqAbm*wzk42Lt(kt9HT}vyTdV4DtIn?PJ6@ljyc9h^ zHgzo9*z>LiSO+}yXQsa4zL0{*y+8y}>+Dhiy;2PNxkqxlG61y)RyV+QZZ!?GNcpNG zs2yGYu9C6_7|st1RDcKRJ_3Ka-c;G$^|?KrHRb_$O$-w_0TPqea0rh&s-}!{Zi%lt z;yl1=B2B&NS^hKQu!|vZ^e^f6i|w#t=7iyY(!YI1d_~A3aTG@YJvq=s`O?b>P@i=0 z{B5E<03PcC{3Y++8f5%80w8K>%G?7r23-2T?ZG+0H=7eva@QCs-}#> z4>iDPGK&I0gfd-qWfP`6oA-0Q3l{O*B2jG z2Y6Iu3YJQfba|ANZs|Bt0W3!)F=V7y2ai$e`%x1HdN=*FmcpodMAN>#}~ zUZJEH%@bAWqryYixoH+Nud-PxFIv*&OA~CDoONF2PasB!qTg zocg8ZGc&ct$Xdz^_E|MmIEH?&dWptA`gKCKJg9Ky8H(vrBrtBz;e2TD-@O; zV~ti@-)dTX04&}GSS%!GM(g+w;>K~q?}+H372;`A9wD2%rBn-$M3iKFj&LU~x&226B`(C#Kt9hu3Y~mUGkNs^W?QGuJ0* ziN~f{3rh|krS|;sP^0Y0%se6!D|8%Z1foE3#w6bsGao2A-mWN(XxF%GPVqFf;}QpV zP;TK+E@S1Pj$&6@qP}HvTyI+H9MyJh90_w)WK!q<2;W~QfdD%u3^^^ZYJS%fRw^;C za=dg;m$4!uF+H}2a}^Sd6T1mN+{jO=*oFztaz$r3#?N_FmbA(?RTKw9eR>6`XrfigyW5iDS0ooBQH7qbQzx@M^sz? zI+9d5QLeZIXD6jdJK`*^&6k>$HaORlrQ4idRYi4DVp&S6u$#Vl1eAs`40Ici>-R+n zQM^?*m!J|%CkDhLa=)vDG>9!BEie5UF;cpy#PF z{E&PeF*fx%nTtyn(?$5)m-P1e(kvP(EZZ{nai9pGsN#g6^zvgk3re}a9O64Gq&sit zEC0x(3aV9D4RIn^xC)2$XU};1p3GC8n5)5*Z|=&^gF%*4EJdn^HEgq^%5k{CRhOEz zwJ<}5NZ2Yi8O}@?x1)PxX;okKTsLrPy0KC))8J6EUa$)5G6@IWabsf?CpN_qM9II% z_QP8?(aC1%=NxFj1A$y>S%i(JT6i;#*K`TW?*|4OleMv(K?{=~*yPozvGD@g+VY8) z>ddUFYPGa76?njf+0pU?FB$FPZw@OvM0vqi0#_5Ox`afxZ}2RsyblV8zH4{#yYq3! 
zkWUpeeyN2j9>9F<*;Id^M9j8H4EZ1PIG)P0a&H_C_V3BBgKC=9&z;|E zeX8zYKD07I2L1uA6w(irj7J#jrL*^^s&Wl%?eUrt(d})co1>5i`ba4!zR$) zKQ0I{nbD~RPiNYDoAC+MG%R?K#j;JAUlS325J~rV1uP`#DP(OB&y)45=6$AWE zCWoKiPe;6*>an8At}&{jaGU4rQRmGN2Fti3)4t|vc;%Lduz6l@I5#H|sN*~tL1ztR zUe2LAq~ftAg0dxwGB~Daa2kHLN@<>VK&QvOX3b(~1>DEr6Wn_lEEzJk8==9)bNs4e z{7@y+Xl*y;o$CQ#V;XnL%IS)Yk$k%u3`{A|^UWw?!EtIorax{s$b;rc)M252TCXr%qtmdjj<<;f&k6>yi_j_*E;9IWZ3=v1alpd&jBPwBt z7>%6cbKe?r;x}y zoLs-q@*yYGs|-b~u+iHM-!nyCFFTtis^;gl2bHHFi57TUuR^eryU}rzueTH zD(rG`bntZ4VCk>@h^XeH5wrHkN&*Yv1AyasS@Z@!x4E5BqgqH;#}ex6x(TTh8>56> zGwNU`xYo`0>g4b7>*P_ED&3A?-?Jp_-ZH*Jk=grcq8&J=NE(j)E=iz2uStTOw`X%W zDOYOeU=+$tD z*gtc0c|8n$em7_-;OSx%kkOONP^4>G0ULMfEZyvXzl+ymnh`1{s7%-wS9vVO1b(Lj zKU{l$Y5(Az1%7!av_}5?pn-9m9##GL3B61|hP@KqXeN*?-)RT~XS&)fjmBj&n?|ZY zfjvJKkI{0W1XsgxT@#cNLji>am4^Za2dj##WuU**u%IBRyEHfRIC4<0t`J+pmF?mT zJ7BsJ>lN<>ZSO^;p*&UT-m9D(S}7DnV;{_Bo=k+-sW;5-FQ25gv^ruRrTT8`5PN!$qY)*^5K;;eC$ntBK=_%dxM^(p*uFp;P3Bs zDypW^)|*6sIsg^h$Jnt&`gE5lu8u)rYpE=rrBtbuZjH=+MVEk{kkLfKD-OH-d|2^a zPBLx%!t<$VvD`RiDaL<+a5L3P^*WDk>8V`w8G=Oq?cK@%Ld#vYAHWSd3}Aj8GFv;A zS|3O+F(kk2B@d>FadjXfSNS{Q_iFin=}MXX*MG>B*@qWU9sS$R3eE!@l`IS;|8|zX zO4F6e>z>7yWdtlEF3JDdlpZpza&D^OCLY#L0IQSv2m52G3g8W(5}lQ)$=y}i0XR2b z)^Bl3=~uwLTARAckoDgubk10 z9Yh?D0w2jXCJ~xhWV;_)O6b>X$YaO9Y@&Ik^ynRVm8EB#GD|lMw`hFj0I7HDFm4>( zMNa15P1dG7>%>c+p*RbMT}l=fE$GyT?cj<#RbC^P#+0*esa@vN<8QTmW4}QgY1Ej= zS@Nb>dWs2+H*%4;XA6JL{1NY_ zlLGnV^vY`(tr_fQi>A>ycd=a@T=Hw-iOohr403~>&r@E!6<+J4RybTV`85ie_?{d~ zJ3rMl_LuT@btH`0?V|7Oi8tzFrB7%bMG<&Aomw@0gKfhdcG9?DWQ~nDQM6Y9V`jc4 zH%Z1Pl_=2cbuov?)(JC*J%ux>gPcu()tq1fG071!!VsKnuw>@ulgh3t8o44P>-L!a zS|MrVOeK$BVT-E^Of_%OJM)`R5kGtzJqqTjSzR6dmZqPe1a0MhWqH!eoe&~F_lm%5 zs3yLCki#o0ckFNSZQV;{{MIwyPYBWjJ~azXW5LBbK){re+XBR|5sT5|f_*+X8jdSl zCzD#OLBlKDQjUOACkz}1Y4-5qy&JJQykM!!{7+|N(HtyATG;X#-@?Qw^n+RIe;+ex z18k~TxY@~Rv9{*@woXi2n79cx1xD5_2!?prx{>Z~`k6-;>o+HhqpXvK ztW9lWaoJ*{=9~yfrig6?jG9`NdsXTzeWkt^)_+vOrm6}VzLQwd$6JXnXk{mkCw@JU z!?xoO2NPMtg5+oNHR7A!G>l!akq8^oT>PTi)s$oO$7!{2h4DCOf`#pPmUP8b(+s*L 
zEPl;}Tl*8@G|^&rYTs;RR019kWz`S)JPQ(#F&&p27+*CAeY#CCFCS)3uozM<*Z{3IS?`3`tr!ZA_&O>;rB!mpX;#4+Tim`7dP{ovfl$H>F&j@Y zR4@bgeVIIojM#Y0n<%TO?XX=epFmnw!x4#aX6czIa7Mw9p-`E#UHI z`Wt@nnpdGe9NPj;meN02jH)ZwEgNbPW2?zIIy>_72Y;pnzjcGrxz(%QD}4tbDeN&) zW-q03&^T)s?@qOAqh)lnAEgb7E7V5V9+upf3Yf-_5M^bcmnbxZ8Zj!(l{buR;hTz^ zGTF!-%e3mWQaoshLD%wD8Z`t2g*NyR4s32|IiTDaOfT}ASD(n#>D|*WQaNKtdy6%L z?!v!t+tl_Qmpi1D&Sxi|-Z^N-(OF^&{cfH}i^^6hvTCC1O8R73_$e{81FRv~;B)G2 zY%B<|Nir>KX_0B5Nu=CvH!rynbem;#Bc7DW zl=)1F5e~nf0CsILi71a4)rN)5Sjiu6Atoqlbm>7&S0^_m{U{zEG*C4$kJ}`Y!dEtc zXJwGuFqpi(z>Tt9LL<7-yYXp;x59m$XwZaBN5NO2D5^w=C*C4`Cp5(;Xcx4AlU z{+yPSmvcj#CYS3zccztlNLZcU35ueT!=*6^~2LjGN}OoR2mC_l#hDMG5L$JouAs*d z`oaBW&9)C~cF({;#0@u{UDJ7dOSRTryoK{V z6s_vE=ezyU`Nf?}JK?K%+`=_mZil&1f2E+1@OYmruRq%?Gt2`PfH)hw>O0U!3!PA?xUM z)2JR6;eEs>!0WdXb+EwJ|;y{lJ~JkGVk2&QP7Z-krEYhh0O_UL60oO z?qJie1V{L*ec9h~78?QB`5_N348QO7JWdo{%;bjpHg z8enIezO*hdBPd0IQ%`+vG@8QS~;$6C zdWF3{2)2-)EMIaYu`MA384IajPqYbV|I4R$Z^6?>Y`7%L{Qd|aozvQSY%p<}b&KE! zJQ@_@LLr>^ET5WDM$K8lr4YZ3O!moUgJ1Yj*DZzf9ykQz-IfY8gDnVeCE3wX2J&W~ z@nr?Z-{qxv%E{cBnCh7^-PC<#5E`|l7aEl}g41XS!Q4%)zYsjDYaWfURlqdP>ad(e z@0O?57XwS^O;EmOdX*Ppw`!5mFrDt?{7aoWh;vIWxiTb@|DeYqnQa{b7iQT2^w~^K zXqc^=r<*xHuZ3Pg=vWa+Fr|v!JUXj+4)O`=YC5rQK!#&u zhF*MTKxXFsjL-V}abr^*Gl*54P@OcRc~E`L)JSiJ%Sbn^#rMEbtPb7Gj&1@UVqZhtvH=*`Ue=?s9qJ{qRjSVr-q3vdDCbscq}QAl7*XEF|*!H$pjM-|k zO4FouqZ(Ct$1sX&rX!!|x0MKG8jjJ@?o8>@4EL&+_yRV|XAc9Lc zB)msQcPX6*bkflSL9NJjfID_dxvo?e?9fMgAczbI6Du0AyRoFC@ zuu{R}_$NkrzLnfLXINZgbJYif4rk3G&e}ar@3l+B z>1Q>@JH^K7c8DH4V1X%vGuGgdvekhbj24msN9xE%5_-@>8AZGmHa4hS_h(2pbq;vN z)6&w_%Kxi1Dm=0{CpyfO_+1%lx)Ff_{H|?K`-rlj)e&oW~6LeMefi@GkyurU$6Ch*FMi74L`yQj%esehtYdq7HTG@8QZvQGNyK z!#mM#I6LA_xgHJI*haIp9dFolyUl8c#20y)(W;kpCk_8mC4e5)*&JhLkAU zc^xyfrFn>R1(m%;yvUAPcPGj+*Gze0AcA6`e*%Z*mTbYH=~(PZqu=R!DHwNTzEg9ag}tG~w&uIcJMu94@Fjc;R-T9pot_&U{Kxrm z%|B4k`zp`l;i6BPLr#X+f1qLxvlQbGEBkgj#QS%BC6f==6yp{z^{Q~gFLfFAExjfb zcS=W1munr@Tc7-+c{n|PFBl?+m=U|~beV^w{iZyc)Nt}i@iyc<%^ 
zxKZajlORdf&FYM02+AEt7sN6VxE|%`lyv}2pwd^c#inJdnp(-9C?GoAS_UMfdF*Qs zn4ya+4+ykO8E3kOm>Tv+4{CZJA@3kdbAEiyzC;abPq*7|liXzR&C% zScE*IH7j6_t`M?j3@1#Py>=CM*@SnSyf1^3*{{inj&v%spm!A+7@1Bc)3F$bEj${X^y!z5@WM{%M?z2?Nlg5)98J7abQiR5h5ZSjE_nOD6 z42n&sSIi+wYPz&S#8Oc#j8ia=%W+lYT;kj1n*J!7mO5duv(J#lxz`R#u3m+|g>ymL z*QxhoMs#z<5Gz}`%NFgC1O_eezScL6yEO)xD0Vsd6hD^4n$LxgQ6W?p7x3}ba_+!UaOLsS{g$B$%yqAE4%b_Wq!jKY;BSdBSmeQw1fyn8$lcm@M54B?mLKXIvaw z5tf(f+lPSH&(k%RE9U3o3tfsk;d$pG?qZz%KO|{vn_7np{N2@?%&a1B6REtWN8i1w)FM?3)K!3{9pEtbF{enMrSq54zpf)@ zRbtF4z|f3tt5r`i%ez^t%+kkF{ao{Z=fL|ux1|6i9f0pP|E~*3*;!o}a9`qDS}^zf zfB~)SGS2+(7CX@Ys*o?2WyCDO-TSGW{|^+_{2!<$i8+4mgR^G^pLd{w+k8Wsbs@1C z%+eNpgw{B_qk%ZccZk$1&??IgoID>5c|Fi`i;Yf{D6~uNRQ)(okv43gO8!5UXN?{> z*Oa(Y!zC8GNUtY-wX5L{>!w4d&|*V)qe&55@AoP})S1x#V`Jc^LQ29e4sw?8o|734 za@Z!6ovlWdKOv~JQf6Vd)L&YDjUpb|z+;6q8!gbM959Fl!IZUQ~eJ*hS(8zm9b@ zwNM?Grv8A}Eg20fj6?nWvvVU((J#sms;T%!dXpLgnk}kil>=T<;7Mmal5WJ`=_Wll zHIW2b4a<|QR55JhuGXljXB0y_lRw$h_hfq=@aQ*3X|RrOa%$nPOJw@t*z`+wzaS>Z zR9NldP~)g55^b)Szj3fC+?W`Nqss{je!W?Y2~8Zd#(q*soahHIud4p|O?jS+czAe3 zZD%L14_mx)7@^1%v=fIPbX)@$$7X^{b3$J{vJ{cN+pg3aZG-CD*y7kJzHQthA|p5V7;H#=V+tytSs^Ni#>v}9 z)v=DXGUr!Qa(|GI9~wnoLlZfr4`ru19<%imi~aVdpcZG;CH7L>X<*Ix@oR{$&Rnwy z0o9_~+0xiD_2jNxNEpmE8e#KFxWO&>_mMmAC<2mepa6EfKdPBvL{hmL?%PvN0j`K@ zR{=c6?pR~)F1xQ7@3_vB6*%hj+9N^=^$GZfsV}&tJ}(lDv7WY3v*tTmseO-~;$rpA z5@50MipHtj2W5K^V1*$=9V1`38eoL@yZq)l=4^m$(FpJ(5Wras?C|FJy*<+IeHWB! 
zmu)VX(}TB1sAz7>Ju8oy$8q@)e*sxmG?j7}Y_sT9y&vdj8+!E#vHnP)VN7bM)#!(F zInek73CWC$=&3@kwSkd@PqaNCHpRppzUmFo-EnDf6RiHd@I7Be3;wM^_tuJ<9(K_p z@J*ns5}j>xf#*A(6n9-39yD4*26v!O&2f%l4_+amtogB|x;%0ohtq%E7l3V2|G7^= zc=Fm;XCG%$AP7JY8|I9Jjo8kxZ3V47WLGH{PkTy)$MVtlNbMke9wNX*q>NF~?%)eN zM1_)J?rFk$^2@PRSK=q!E{mOX2jlY&JXNGInWl6Mky(^aqTTEy>N3~9r(uYUhb0fJ z<}opaa#*#$H5vF3bm!=gh=V&i(}_NuixQIP1RI=I4`D$wPRPxWBUc?`?6g6uZ=rOL zC3zl+lH_#s{it(TK8;&eqd*i}Hjdv2?R4Z|IPKxhkCT;DWa(ehqhwLaOc-YDPT6>hHU^gMd(*~ z*wbx3>;Ffz)&GrlU)4O^@x}M}p_iR+BBdXe!|S{%xxGfvqiP3}Ton$!%nj>-4=X!Z ztklWPkA2KK`n)A}t&lZm39$k}YdWvW6C7b@RE#)w<$Ft3XE4ta5g9g#&Y*oJ29@PNS=f zT&W6wb@TaW>a{R4{9^DU4^#uC!M6)=H=;Q=@U^WHy8$1h`C3aZQZkNe#%x|!-!gJr zPD2`I&ZrgSA?Mxuk>bPb0<_5E+2H4teV#!~Q}kZ_nkd|8v-#J@ikLza)6FX(bmA%c zmpVUfOTCD|`WC&^Naz_s%H{-8!VNr3C9d8|4V5vltW;}zi*XW??>={qTv^(M9_sB0 zZ7peXRW#y+XmP?I`}La$gZhg^B8)cq=bp~*wdi}N@16|kCDaBnHhJr3y9mq7 z`FQAbEi{r%L)uFhNoD%JN@El$Tx+4DdqIDow$U}#7ndIZ=QR49#gZ9jeOn%@;HTH` zyL9?J7=mVJ#bakE#wEn#EIz0oEnx>~CoDp=Q0>S2tX6Nl_nd=L``$mSX#;?*!7s&W& zWjUrz=LpS6Z%XJKCKXl0npw1%a{;yEpt`VdRnTevgGp`6afKTeIkT0i76J1B2Ei-87Q5?p zuS(dU0DTa^H!V>_v)!a6z=9)LU9QjZoZ?bO8B=G}Q{zlMQRDaxxai_H`4VN7Qx&&wmg+X`!XzOP-kSU+A9_ zjWZNF2);mZhBbBL`AKMOuiP$=$-8kw=cn1LRQ=ACoId*SUR*Y~z%wSZU*HPntrk}! zUK$Y;j&lw1edc}sL+TC_(7T35z?7KAiLaaOYZ)e*ei>NycH0~6iT3bO`O8lHfvaHH zuOxxwntL?GW9PbLv`$JZMJm^$eG~aVP+(@Q{J>SkTfJA{vF#a3Q#1C@%gs=V;fgW^ z(=tEPVWU@4q+uDuE4+VA^s2LOdjdOO(faiQ%N?qFTU_;4`q5NsSJ?5`jcR&R7U(@J zn{^BKErXdG>}_`!o1vqmtUy|7wYieS`);H-TV$QVCB78CkI7oSWOyNgmXhQ9&<2ei zV@09mZPqy+J)1Vhu4bg!@^e(ANWS7d$(6$rBHtFp*FpL1lX5E3D|N7pwtdagLFt5o zJN34&E%MC2ux`69cm~hATml*E6xGpU&MZ$~Q`at7iWCTKEBe zlJ@wEJVj#S3|N|PCsXWIw>?Mlu7pq&v<1vj*hf5?rd8I>F8|vmQ0am^3;OWcEyl)? 
zme<$U9lyS3RGHIKJkmUD034#pX(`SQ)G-`bs>D4`PDPA%U$VEZ=x(SY*hFh`nC}jr zVfLc}cM+o{V&m2}bd`vtSYh7}{u16#Gm~p=}!P_JUNo|z~#dFzZY)@>GZLJ7g#9kpgScP}M-4vM}bQ?DRR^LKaC-1r{I zqVl~`WLA#iku5{57CULHvv36%8v8_9RP?$HBu$#A!ePDON1{1-F=o>UNz|cZq#Urs zD>(*f%iu<6ekZn=?=gKjJb^(Olql%01SnbGU!j?0&8L4LjeIrRz>LE@Bt2jTRd@WV z->)%%NdR$gh=`HkVR%qTF?C-o{4BzG@53w#IlnzwcWeU6;uRkB4(nqAXsDCJ!rs+G zj`o+Sp2}20o(JbB-j+-rl-6fax{_3Hrdv5C-fSlb0q{kArINVe-7XXGNyh2?VyWJB z{QJ(2$*Z$g(+!=5u^+ql#5sf1()h-FpU0Z3*gjt6)C|#4U*an;W=L6Fl-MeJ-k5Dg zNK-0eaznnTHqqNU2&(u&OTkNv!_{`a9)j=1)Dd;?(hAqQMsuM%EVOcDCXW-Wg!r{E zQTg+JX=y0OZa#byL@DKgN&5Db2Tf*1qFC8HgFBwG<%;egTJphks0NnpRL(L}?zoKZ z;9jl&HyxiDJZO6fp7th(l4{o9k&7?rRUk!FSd?q)JZcCco@mB~JtI4$!p`Y}-s&ci zF?Jm}zhq#FNSp9=P0LBTlRf*GyE?C+bvA41tXZD6awvH<<1WSc<=AtQG!COb6G^u0 z9&^0;U;~w!Oo9b>!p~zXA2|<+B3`aj<*)cAZ&`fFqeI`KyZ4GZ@4<{v&5T$_>MG!e zk5+PH*k|*eA!nvtn6Tym$T>nW@u8_+_~+H3RWkpDwihm!lTrF8-0~@yX0{(MR46 z@+b8^o;+l_7#wF5pR$T&*0-xp@ZTQbi{W!^*(H7FM38IoW1YCoQ~9(%IKIcM0GmqY zPgXy!&M;sa6R8&&H1V41iBgAG^zrZ_=z^*2*6?Zl1jI;*zS7$>H+3+IcpXUKaoE>O zq_BNK%P(@TOTX=3%2a4syT7uG8qd?k?;CvD*3OgaS$AR-uU&zh=U-0-N^0|BQp<6m ztaCBP?X)~+?X+s9jOuOP9q%I?qq7+4MJF&B$yG~! 
zv|~guima%Et$}tfUav|QmpwY%i5a+_RKWDfJSS8gUtc&p|bHC^nZ8VJC@DC#24uVnZU5??n zIu8np*zI}i7&ez>YUQZUi1T{WE$X2*bE=`dm#7A^8|qRng2m%`S}jn55QFyQj>4Qf z-MsoP1kB2)_yB&^2}Qv`xg7rl&Rkn_pGX-R1({)yR8`Mn{ZCx zju7(N97P{V;CBsKqQK;4wlkZ|K;+f@81Y+w1#L8l8DHim>yH8PU=iZng<6M{UzFIB zJZ>!clUJGS2b5BEbFLAd51VAE14+{LONC(D+$stp@T-;moA0>}0-Y5wqNqz+C&u}i zQDn^^iYZ2-eU$54jfQq4kvvNksc$+{4}Ldo=%#p1+!QArXPY<#i>kpQGiZ`ohkS0O zCRvHSPCM&dGS0jCth3fBN&S~ZxsF|Ap-apyi200`f>~mndwHxmem%@u%!Esu4JY0U zX$ygT)!b;_bG4yy9;E!G4py?GmFzV%_uhvh`v#bq;6q8GjxFzwj7U$vkqm;>!ZC2W zN9aXKDr$eMvOOxprS1fqzi4)?axu{`leUkq+O| z(0yWGn@DX=hF(=0;>0)PyW>~SUM131BzP;g#rGiTKx65))Av)4XSgPE24-x_xSLvg zO5mtIFwOh0NmCWZ0&FAq;j)~#mPfl`pR}=r5;3P&;t5t{@G~3Of+%3911Js{`BS@8 z(5US4x2kup`Lk%w6x8**6*zUm)s}D=95rOSK73dz5$CB#_SC_;F3FMNIj5L>l8h4%xhwo%xH(8D14Ywf; zestvs)&ZN|U^@i=N)(e?xSCYmrf1)qXl@4DG2R=N!WABV)7~ha?$I)(taq*3l!!{kw#5n>=#!b=>in^Tc!o^kFesB*s5a zC4xXaxhQC`K4Y3(&9Y}vH1Fl%4HyMZ1k=XpaWM5jZUxrcIJHWPA|q{xEgI!a$xnkH z2UB)8DJEd`w;#0*h80Q}AK8Fo2 z7dihO=mPgYbN7mn3M&&b|K3q$HppcA0Vz^r;ZqR%cYzZ)3x2tyI@0*FRYl=n&u_r0 zXTEjZW$DW~G0--=i^_KZ8~_K0)#N@F{iP59TJD*1SEm)Qg*^<)Nu-wFf@A}EYU`&aeyz3Spuo_ZEs;;R{ zg9Ua$*Kpo^afxBKT1SVYbLa3qy`Wm>kF}w7f6*6oqn7`xor062fXo}kVd=wyo2VO! 
zM-0-9FFugT#ULw(OUqw}44UWaEDi$1XSKdEis~%&ZnfMS01hVrwN%A_9%$uk_x$W& zkh}Zom8zG=#@XL+O1tVm)v|2F`WlexdxQrP$~GLA-JC@5I?j%ZpOq5~ z=!gwDlGThzeXaTD%K`h+ zRF4^1rDQjlH=?8^f6X!R8xYtKSM?_g7(VO|p8(h47y^3bKd=9N@SRVPPmJ(H63b}s zQEtGm(IWDtM{U!No25TcjS*?fzbg_rdYSxv*%gPoW+gvXPrezbuX)M#EyG-6)_B8R zTr$v*=%XrlnTmvQUXD1c%UTAolb>>#g5-XGki6L-=Mrn~UUV2ct!?x)CQ2cfSt4(%db0-8k#SxClCLLE+Br$cq zNRA(EZ8Hm((iWB$9C`EvE5{;g;R?qSJuQAo5EY9cnkcYWIO1@y45A02&29PjbX4@e z`VBFcRVL;Px?D%JtA#rwTBA-qIV1>s0QJg2$4C*U!ZI=wJ{r@v(ON!~tR;Tvp9wLy zykFgED={&cYix2{XX>!IxwJ@MKJR^^%j=&eE|Z)%qh6IBg#VAQ$jr`A?ZZ^5fI7Fi z74JHrsD9#XFs5$%Qx-=SFwzfXw>{5RwxnaCwg?rRN=dnGq>M-O%T{67r zxPq4dF&RQ5K)5t`M!4Hl0XYu+kF1biCFaa6?zBL$q2r(4kq5Fl@R{qypFVm00q;wo zYCPNxA!~oF9tYL^*J>s$e|s_j=+)BT6?|`Zq<8$e@ptg20yXzm{e=8|z-WF)rJccM zr!Ih<^t^gKA_!zROKUY{sOiCcD7&;bd-@WKul`fp$2jE!brR+jJ3a+BK&HUM+u-n5 z2st|XG))KOhpXcVvlBpZVt!b^R`Ud^Qn1KiI0zIh<=o9WyWJsatzqA-p3q@g z`9SqiUwx79uP}S*bOs0n0J*A?@(qiYDZX%Nqm`ZjS^`(4pZ+EXnA|srdBP?E26mP> zuHf+#Ao9{%>ge@;LCiXM^W(2U=HdUu@n5CgHKp}T$(Le*>kmGW4g-K@-K(YVFJ59a zXs39feHpoe{~5;3f5Uj)8Tt(5e3&&f?aK`$_B`MCpK`d1Un{;;N^z0E;U%%;hLe&N z?iXw={U1I+d@qHPNWTCk0`-ZPH86eB=h`a|`hS>g4@j5#P^bK(FMPIoZsr>m_)D`J3#2qv9fajpeBZbYcCkIG&3B z#_^MubH2-ZM3CVVkl4Q%O9^4EZkd2cy7+d|0gE-y%;~jQf1sj)N%z4Bl9B)!dPV=tT*5Dskx_FG z%q$T6)Tu5oo-T)%zOjl0iWe?VF-0%A$y9zjA^EFkP(^!a?Zfb?x#&3P!#1F#mif89 zEeV)cdjC02rE5LIkV71JK3S9R1a;VB&vSy!nr$Wp>90wu9NNxnL2=;YBCe$Tj(I zg|Etn9uPbMYiel^f2}1OHU5(nCl@W=HM#Szs9*!U){#IlcouYJn;if&w&}9z=U->= zzr%coRA1&zfu_+-^xx`U9!)=C@(*|JC7aQA5duSLTKykMk!H~Yja#1Jxa#7+uuG0{ zrv9ylO2p1Dzkg)C==_0tQPTgsr2luuCmUqj{Wg68X4CwSy{hW}#Be`*@ciXB0MKMw zPhW8htmU|4k{^FZ!x!TDDfj)&2kVmEzkz4vw;|Mfy_DQA(^3Aq=Nk5=#OWnjMHT&k zCBe~`rIqU_zjV*$34l{=TfU^^nJ8-fKSxmh`u3XQri3_0&aDlIuB)_poG^!oP2>K@&Tee||a~Wq&D7bbK(QZ+UqYsxC0) ztuz6MRz;fB4;7uWPoG*2WPlUVmX}%sPTrT3PL_I+%TJt9{h~dl`3_`>)^I-$ zm#y(;f9g1hd)tGygxeSt@^5}SO8#hX_pV*LtAADXJVc{D`3o(J(E64% zolvXbaJ6#>TgTjUj=Bd1$MSRS#?6qq^nW;sF;JNBLQ~ZP_ZR}hW1YGlJv^D_{6urr 
zKzz1y-peUgUYdUZCj)R5SeO+>XX*ub(%))|UPEgxq`-O|>Y_iD3;VIGt8(3~9fI|8 zBkua6@S(2qQ`K9A24QP%)*W$C8=ApdEErR(8<2NvY?nCtsJ_#g`Bt%Ovgl3$p1z3- zc%P-zuH=h~>LE4fw!C{vH$l&r>9=W`Kl)o!4)o;$3BvFgeS3-j@5?K4-Yh}X3;U}S z-&W9gxghK3w&Ky_mPrx0@AI#3_T9DxqIBxX94Gj`c@bQp;YQxpLi{#Q4k8D8e(5yp zv-r;@?RQwO;nn|Ir}Jwqc2?YD#7UJ7 zg`@wfdZJDgudAr=S10@1ZnTowpCh_|{nUn`_s4;w%}s8<@6f#*d4Q$Zb8ClSM{!HI zJH6q2J#TFu2BImJTsMFnbuS24{&^{5W6vc99OyWg0y?jJsROf4Cw9(L;8;|!cjA-M zHrxotXyc29g#W>V-2?H10UioIDbpdG9Ca$39Q3gAaIQa(LJ_p@oSE+^9o&J_LUe!nf#(E z^Yx$q^|WGYpwN|;!_Vmf`BiLDy6a9vnL6=_*1G%FsaPAy&Q#(t4L1RO0ia;IW$LA+P} zx+^-@bwQZ=tVh#%>gBDyx*0e+#kZL3xBPqXABgG(a1PM@IR#>Q38VU*Ds$Ni-b+4i z)!m^sbpBTxlsVYj#m>En*tGc#;nzBIm#g9R>q~3WYcCieQLgB5UG3c3-Vcc-vr2eX z*q;Qf1M~go!vSEq@R}g=U8|DsELhF1H#KKoEUuvL7gCF!zui1b&m}i)Ke-prcweM~ zI@*3Rc0hpz@cRq@F+z`}Xk2R5k1S>Nc_MYwrK9S6v%C0+$kMmo1M2x$ZpK$*=2>}8 z(|w26tgG_a_g>n;*$&0B2;AcLau%d;m#lu`Ne!E&j`H@6W`>5{iA(2m=|e|j7i{AA zLhgC;53!fql4GHYYn)UN1R7Q3KF&p4=BE0(|Eh4%@WH7DnyDWY5qCC^%!$4M!6$bX zdlI~1u{KYHj$3tp!}Hkgy<69c!8#g9)J)mTqI;Hi$GhCPd)Iy8f-`C_J7xSx)5?rxG^h=!c&kSStGUu(H~3RRG9k_&&gVs;93$KRZ1@ zRqkMFy7F{t>~4^jpKa!HOZxgj{~~JW#5)#n`bjW*S@5X?^+f;G-07CKWD7so&CFaFcsruzS@eX0ew z^3Q`i{&MFkW+8LjLWcF3y=0={Tk5lm;mh$Ochy7fET#_Djt_np6<8pgN1@=?tYxX) zOfdWD(*o$XIc@>V`s|H%%;8k3`NL;Fd$CsV9*G6?V%gtz9QEijkdXI!Sy;t9@$cx0 zja9dR4s^kJwPAYxTq$k6%w@ed^9-`>bPgQ7NWCj`Qwm#zOxt+s+OFP4*GYN(s2@&z zlBGI_*72wO<*oen)N8T~X&P!>LpQ~5?1&1!qC`*&|*>=cBZkL2ogSh`T( zoYYkL?Ef|_^WS1`Mjr>TlwT*<(~gfF(X)ru(_Ls@0;iNlBMP@A9~Mcux<_3p5LEM zj9pf}IYBk|Pit2Ln#`6v>bK%Tu&vE%B-Rh3otFh9c4@QY|F(atXd<_m`R&`$`o{-K z%h~ChQefeS|8FUe^O3@TN_jDbQg6@CXZB{@g@W5S>iLh}yt^0uY%tl>q!CViq57ZU zZuVnVdrwt&sK7-7_0_a>F-?Qe;FaI`MHTxX`2vjM%qgn=L{?U%^XM|_MS?9zb%WiVRQ`4EpY&M?X z26=mbiAQxHZ41o&I{Rix%;+nku6+AOa2s2__nW>)3~n{**Y7vM0I{HlKf~(ls`w&0 zLpRP&rJ6ms`x3#GyQ0H*yH&}PCXqP(r^wUE(wt(WOeu5OjTGbU)R6}gb*rYqOp zE)|o7-EY+9R0jy|>UksS%W}AqiopWz>h|?bnkRga{a3QDP4;$aYUB3KM~ZHZ6HUj3 zdH2li0bMdr+Ycv+(1Vw1ZUz?a(^Folafw7tS8Z<-{!6@ZngaBu64*D)Sr?l_Z&6dS 
zcMEoZ=)&^)oeO`trykwQSER00v?giryq5CoSQy;BId@7%-KJ~J{Vn*KBieFB#jI^k ze`kPQ)ZPkInyYyWoVVElw+GLKSE>kw+R97hKM*!T>#U{Vz@g6sHU3rp{(Vi^=6TS% zs;j^Pctx)#>cZB;BuvOwPjh~p8V~=vD0`WR-|Yn4q1}v3 zLD

B);Lu`^~n=O1n-Il0upRH7Bi?!~>1&a*;psq5ICg-Qr_AF?Z=&7#}+^57dFB zAB9-K({tzs=bNKS;zI-88(M<9D7t0Cbp0_eQs{c+rAX+f60nG_6s#2WIUn&KrP`nd z)5134uNzvJBwmMaY20S>-y(yBOlpIB^CJ9ZTZlWpxU8w=j*|OdS+4D#hhx-U>^`#{ z$BA41ldlBQGjk*zLS~;$|9WdoC2W&_Z1Z0IH+Nid8Q4VD74^@5hCf{Dz05vvpZm=X zPVG>aO6eNQ^vrL@Ut{S#KfacJXgewY||8d za@qZcr~U*qaHr*8_t=0|j8`VS0?0eyu3I(b+XM*iS z`Qv_%4LZ{E*MrPAan3R-_D(*-12YA0ekO;3!4nCMbFhPwVeJ9JTw1Te;xPCdb)0*z zJi-TgyVZQ^!)ZeT!RW(rouk>=a#Qpv%NCtf{QtQO|G%DZ?Em7y{$D)U|0D2VX9g5x z^T6kxezQ&Lspn0rZB}JR>E8bV;~jCVkozB{-uo{t1)5WZ759vjD4K%4D>&(Tfe2@@ zwrM$YB=A;QG&vcdUcuPUSeXE`plk^9y-ka&*ZUIPeNKOAQT2|rfiZOB~ zzk#lOEOkK{l!X7HJQBVCNtq1jX|YTS%b#on7vbWeP#-!UWeSj`H#gCp5p!@G?|TQR zaklnOVChg51U|epGime;ZOXsq%5dy@L*)*aZ5nKZZ*e8L4<< z+i{_8bFpEzB6_uCE2$ben?wsPg)%flfUl#jtQ-vNndN*&egLc>`(j!*hK>})bXhdk z@ysb6r3dt<04H2N?N*Tj2*R~$0CnTOyjzWB@(0={TUxnn)V=5$I-3 zd*YMI=2)uPBmnmm*KBo>vSj#W4F}hvq-&(uH7m72q3Rl(m)~MC8dN0*0qp~8aHx6w zAMomjbm=NCjfv{(8-od`eG6-2=5juIr+GB&-*o{2$-u@ zkl%-PF#|8dc;mC^wT-i1JcDqW-&(1|2A49-RD#QNO{81vE9|U5v>NwxVz_G8RfdRX z4l(}os;U^I33WI_Js-cLLi=AV#qC;~!gs!&P@H==>D0TG%v+al!VZ1&6&Sb9jc(^#PVMAb3ybd^=xk%`T=~AguL-|}dz`|p&5bF?YHjQ@gb`%lQs%}3?r^C}E~{lT z3CL_KD~qjNVOeF2yq0;FP{!JkD?g+-f1;LZYmJ{VWW(|h=GQHWgbromtIAOUe8(*j z`S4ksWsa?9)E-xnr zPKQE=3_Sa&h7R034xXTDgiRA~#>T>>i7pE@_QEf=f1t&hCa8~oi7X{MU1B&IEXW;2cGQYj`wR30--S@p&M=m~2g*FB(}tFa|7up2-4 zWU7oz+J@>Q33*Xf{`Y*16j!r!Ei{n}g`a4Z`B|ytW)a3b;_4a}YkMnuPDioQClC{F zl~pkHHybUdk=Dk03i4gb4O&mpJ`9`2^_2JlN)uBE|1DInLt7t)4?EGALU1_Ewf#+& z&D@l_L|YdE!^xgcewU&5mgLC+oM>&)$7RR&^lq6>QBM{Rkrq#ynTyyatDiRx2XH2~ zaC>c@Xr3`?@NS+q_dn#Go8IHre^nA5Mk`_h_S6KW-}JtQ9J{N@K^Ua2`|YzSa;X`7 z#0RkGET2mKSOpwN|FJhnG~IOc>Sd%vA1qN(~>Ad5tnNMW?=|+vk^2P<>uX^*XUqp z_$8P`ZJ9#Vt3PoE-Ypx{K{kyDGgZY#l9sAuHa3n~x>r}*2*vXQ7|GoX+MVBRHA47G zE=0=gQO(qtKcCPtX37@Edu=kl@O_lU$-cs|o^EN_v^=dge!_S-ty*IIb*eEeTWsCJ zQg%gg2p#s5ZrX6VV~%+*y6wkpo1&4Pa}2CWT%}ka8*BZ0Bp)udN^bE$5gcdH+J@aZ z_@MXa`!Gf;3xb@#yb&mnR(58%YKe)SzgdkaN!to=bNp50Sx=u!bPg0f$D z7kF555ox(Opn)&5S+txR!r6jrx3N3 
zbqKRg<4+=8Q=V54%q29)A0F=SdFX7meW5=o3Cu{uQd4?ns~XP_j#~NYxMs24YX)VC z?>!mMXiGd+S4on(3kcMu{1Itp3Co$mVPXkLfsmv`_Hx=A$zZ;tlJFO09!@401l2g4 zgDQ^XFmp_-GfeTLqVB9JiIkLw`X){KXryrB(85>x83zjm4RI#!$%vCc)UfcI+Arij z#7HCWER0UKlo3*E6y2k=Fb(v2&Qgv&9fK+ui>j4?BT&&4DzFSDW|P6>PJCljYwBr| zrAU=5#Wz<4Jm-v7ZfxwKyAjM;&VAlwd!~|r-6Mhn(jg`e-c?Uhz-(%-@#*gHCgo8i z9a|tqnY6$=1mlj&?>tMHoRqj>4d1Qo7zNDOipx!b!091MH~)sUpCQ$Lrfh@K8MR-$ z2IHG-S*?hwB~ryb*!YoRu!m`L>O2yuPx{FQfmS9G%o!1UN-a|6Q*=AHH=mb?*VSqK z1I3ojPt!&R$G&kUi;P=Fems1M4_@D3tbacQdkb!S}kegmO5nf4yMo7C{8JdZaGnw5<>>7h@o4##Mjfb5sQ`2zkZ}>AbLP|GfjGDyEi2*xET0<3ppE4u`B{aEKPm%;>Q`0T+8CH>bwM zU}kOM`jrQfm$xS^i``iIi6vrG*ThQ9D{b0_sFJ{6v@x<8VRtuei=cB)*W-3rckb-D zRqIJ0+R>962X>;d*$Ef)Nc%1R+E1=`rue&VQ_P`equq)Yb#t#1r_65*dV4x)JIVv5 zSeeXKqL~yAG4H^G_XLPwrn17?v-svT5qkg+;>4GtkR>YXjO*q=G!Z4Y#=HLOY%5WN z(v9)ov>lQCtb2uZ7Xxh=Ct$qH1?<7}urjai#V29a2jS&QofiEnj*92R3PA8pdxXoZ z!U);>l|@P&Jp^J@Ge<3jp%WA##!o0bppIFwhL)KJX)W_q5DVm>t*4i3sn0P|-(*N* z#%K~%TEbVtL3D;W9^tRomQW=NVjG1Zxy17i0o3-T7zCd`MAA2MOg>EH0|BC}13;UG8Dp zJbb)`1LXh4GGVDS~+4PdeUu<&t~uyCBsA!2Qe|J zZpOTPjCyRuZ#eLwxHb%o!o=`3_yeZtsPi$kO*AxD<21(*plg@fFdc%7?HEzc?TK#c z^1mxZO9lKKwSxf;fi7ULZ7z%JB^|TbP86+y#P#AlN}fGBVJ4=$^zmll6ErzpCX&!& zYkuu?UfCd4`X~_kym9|Lg`kQNMUBDL7Ajv*i-o zM9W!_N~@95Zkec^mS>Gg;UY@g`S{7{-3XgL`fv*(AI$IoDRH=xBW-V+p+*9$elPRE zb+5PnBB)V$Qvf52T@dIV9B{1BFQdWX*ozfw&#x_g3;l7mL{GY1lxmr+P^qP3kzn@` z4kzM3#I_%^e=1z=8*h71n`!H(bpHjs)gY~h9XcxGKzUN%{9vAQwe$j1WM(^#yyli` z80pe7Mt1lO6T4D`@7I}{G`phO(g9#x;M|enr;gcoCg%c{1(xM9Hq)tCXzXo_=Anv? 
zMNIOZorJpoKoBC!YOInkORH;n{Oph!z9HZD6dhFK&qMjKrs&-Ju>%BsYBClpAv;nk zGayh0*kz$XWrD{64GQx)+)@YRNo)-<=W|+=ec-Mkt6P_3JC<;fk<6j|`IR|uyk0ut zIEOz*^72kcQ^;jl|w|#{KY6op?@wrFf(tbc>QddkHx0SI%DpcwJ>#} z`RExVrEhvL>=N`1pzEZ;ns)51!Ujmf-b%aMP^0Scq>JQ5{i?jBCW7o#+9W;lY3R25 zVlaRflYuiaIM0Z=yZO>uKR^JLtjZfHCOvvuPey$pwUZYL-$up~g4ozLAzMKJC8H|Z z?3||i9ut1F*_rSpB8L0QP9w&$p3ltLk)Wu^69Q$a(;Zq2J(8a&a~3xi_^E;(sPYZL zT07f-P-4vurllsh_97fKVKj@5JIiY4G^ydit)ONWk&NO^5r?|OHqq<7ClO|w+Hx@d zQP;beNoD^I*{(TN0sIOGG+cSTw3@=#+k$EnD6(`@H`R3?Z)(MNkiA%G9rKQX_uYrN zfoM{WNQRZgW}m5gnvT2}C&-4#j1pyJm5v|B=84|c*3eqCvxt2-5-vslB~*3=>xJAdaIXE|u@5>+hC zj@~6641XFTbvX)BygM%=)IpEjBg_oECFTuKjG|qV-*AsMpT-OBE1bR8qLwhNgOz-+ zM6jTIN#c?AT)_BBWw&UTXa>AsGRe^8f(R5<3iMbZp&(u$LV$+aq0yjjXe*@D>QZ_R znS4LESi-4XPEGuy`2#Rbfs!N{q>x=)OcvIg!r<|)h&p1x2>B8X&4eXE74laqQ=%A$C6(X$IWbjzpC9^t#|6*$(7haY-eoZ4rN{^Jso_lDa4bRPP<)FPo7R~&R~B&{6Hs4XP(&`23j7NSuT4qLn5ln&>xfEs zkOumtm+#OF2sv+Ku)i))S5=)gj*aP8?KPh2f$6S&x41rOR1m9uIP(w0fZjLx&Ekm( zAv$4?}3hI$kwe>!8o7 z^Nsk)s6Kn|{IK=b{s)3%$NikzQzL)GW9@&I;hG0$yPe`t` zStgdZKcC)NOy4CfB)3{+7}90kkuBJAM4e&j2joJBcI2@~wery*Izmf-b^z$y==k%* zva(Z*^ztMFsV0u5ZWB6QEzzJ3{ZZ<1B$+R~v7U*Eme}pc&8_aB;&W1%P|DhSLoMWo z`Z`#ApHLvAYI0@B9-&pTqXp3R!uWHe8H~R!#WIm$R^1`$(VbP|Rf# zO}?Yy#fjSc+ltoMZz}gnRKqv>jx8VCCwB^6QaW)Gr*4^J z;s)4^+m%^7*9xb&1Bt<0?&;nXXDroUvCx1T#pkoN6ZH?8n@gn@*%o`x^%YsIizALL zFg|*L1f|^sB-XpmlC8!kpV?F$6Gm~sygc4DwhjhNPF<7shzU!Jfpv7W8QTXD2PL>a z@}A&=a5Q=hI5N1l-zj8mu}P~n%FB@tj88+Dd2E$@n{6;LNlrL6J8uH?P}YHMw8WgA&gEFb zr1EVpg{aesMsYSy?#g5UB3rahtpQl=7w6oHvf~COvj*lmxoqtYzq4$i#J2s?$!z^@=3TWK&{T^HDPo6)edn5kl&OHaIbSDU%}yEOiQg#& z{!5A{0U(qb4$y>S*ZAU$b$1shBX4C9R~xY1!h<~w{V0~sKj5H9mb_;JYez7TrD|Oq zed^C)UQ$y&T*8@TqQ=%pGs|BI$`xkH+JW4lc2e#XSu21okxlP#njuP}qe)QK5-8uF zVqgU=Es2ji*rD4yoijQujzukP9e?PTM&VyN+$D{I-Nar^Z*H2%T<(I4;Yit`s22(c z(>HuGsf`|wxUy4@k-bU^Q=Ba$EaYeOwd_-cOy4wDdypZ-*L}g)4El!XtO%ly3rdzU zzAzl2u+k?9%uhqGq!9SHs@Kb!ey7qZ#GRnXmfu~3lM<&_a_b;{_!KW$**EKuoj7KC} zO7_g*$E<6(?^frfttVbCH7W+8Zqkmj-yW%`%DCck0W`Q=iq5g+Zpvhw?v7)ks`&E3 
z2=U~9LFwP<(Cm*&G@2x45?ueXr4gUbQ>dpVtG~T{d1GU{DN!z6D$1!D5H|bQwB*NCH|3LJEnJCYvK{4ac-e8L9T-ot*-(%e2rYQ_A z=#%QtA{RxOUe`t0E<~zaa$m>Vk9qlSh<`^aOtarssMTORG2D60C44@Yq59{4Q?5V; zi7lVskBWl7H90OHu{4Z`hlJ+<|3D;L6x-Pw+MgInyxxre?!i3fv+QBhM;0hMFH&W9 znL$ZrujJ}tAxhRQgdM+D-mxENwoP@h4CN>`ry|IVQu`u@rzo_R*kat$)bYe#aBRAw zb~OY4a2?8*M(PNMx)S%xBs=vbeq;9javXrTArAI=f z&@WhRN8LB|gYaO`#5pN%XKgvKf*pO8j_0mJP}x<< zpka%9eiExWZ{BH#6hu#h?%(2iBVsaP2*v3^7dZ#}6JH=7Np~6vXI%qs#SQR}h45V} z&M!Q2IZq4ex1*yeB%uhBch8E6hReousXp!oGnMN`idJ5S0SBd3QkoJ#=dK`bm-$R5 z5tLOurFx1KAxBEP=+!}+j~>!0S@^4W-ZTNMd`(CX)Kl9;3XG4B)QF+#NSiIFKX%+3 zh+UC?8f*ph4}N+45aPPl8`E$!4~-NTV2JR84WnuBtaaWkG3zKw}`<&Y*JgD^hhk9_YT63dNh*e82bzF42Wi~(qL==BAxiBvLQwkp|K^nbM6Ydp_*RD2>AFFcNbWFdQG0=S18E-?aJyp{t-;flF1k zN>&Q~W0{QK49^i2GQ}FnRsuSrRtQ;z$%X6whSQ~LMn_wMT64jWMvY@&3TOqz6yCTi8+=uq0olcxl$mQg0sLX8yQtG=qTaD^<{ERWS1wxb$ZAy3mvd~)fyUcfSYux@)=jQx z+uG^o#i@(}eT1tKBUtZFf2celykj)3F_dATOyl{|uRy}&yX|bk>}T5oGl7a1X2pmS z-aK*S!ylNT`DL93?xZp4H8dQtd9TcxY($Q0EFuOt!11US@gr`F$b2aYHgihG)eLQB zBAY9@yI&}?2YnZ?r-N-Z!>~N2g{*@|_ycG|HNU(a|GiNp@AN&Vdo$1r0tGveAs;Um z5Bf#Ezn;3hW|=@FI&DbzC+d5-OA82uj?1qdzco_S9UMUdRE5iJNFm;^mX-wa17g|} zYyolGur3@ zXO88JSfTiP8~*jdtGX`6#c+Juv_7kGe{_v%RkT?I&|Iqw-umH~-x<3mKzi=ocYQl+$FSA-8 zAA9CEcZP%`#&UtSpFsCW;c^tKRsnH>a|{}^x#sV^fw8@2yFf-PT& z1u*vQwE{C~CQ=(&c8m!-j+p3(N26|?`jE-)8XmkIh57Qt$S%1JLTerKpyPtXHuqF@ zJDd8lqQ-z%!qh+o6#K9e`t3Fv`>)gDY=VPmk`&^strKF(!T!y)QTtc{J&e&LJk-j4 z(R_tq7&xu-{idHt+>7urxyU2p? zY~DMCav;7hd;$_BV$aQ*xS*2cn+!Z;_r=HkR+Wov!k!zT*5K+*9U}+sdZ4)5T<+Ui z!(9yNm)w=-f_Q_P)oNq1YE5%l#FCI&iR7pv4tA3&uxh-y5>4Cr}Y}LxR2gj>H&-EC(u?nvQmXa5})4>?#M?2iG+l+jdCiqEe*l$a{_qkX}WV z>kgrD4V5LN7c_4xLm914AoY37I@@921>aUnA8JTv@J_bYS%e7vDy*_@Brl-X7{-n^ z7#}gsBp|7dZypN7#M5z6-PnOGfA6#?x~6V-9@LEk3JX&iuc%cp_@%L0w8pF*nynz& z6jZ3}MQa*y)IF{*YvI|ew)`zCRuSvc*9sa41syD_qY=O`jrY12v7m2luy znnOVU5=V)SMU>G7>~ZBctbXuAq$)}iqTx&rzN->Z5wb9;2v;Od-*d`iqv|d7w7^NE z@QbemNl)~}-}@{nsH)m3Ab62~A4z`Ymf2KOUN2T&7`j?eNtkvv^hn7J$m5wD$UH}! 
zEcN7;ctK6-rBndJh-C=o@H>3`nrAJ&KF{&fTYP}jgf6}wK(F#z*cXwNy{6Q3G>!g| zI;=~9eda8trjP7f4@ltT7OeNY2b2D)tJnlChaz~&DO5A_OPw0q-hQ)B>OTDP)4=r~ z2+Cp0I^!kSY8UG=?sX)V!4@Yp)O53bMWb2SWk@ClMXH10LW~xl1Qd5;m2yOe5Zh2s zD*x`A6l8DR4KYkD8~OO=3k0Zj?>r9BS#j1d>_Ya;&dz?wFO95s?#-Q$cJIhffXxPL zrenu0&LL*am?|D-ldwildwNy78c$Lr`_ab&k;6`fzm~IF+`{o&k*2X5Kzr19pg6k(7wX z?dU1k`jXb%vFGwg%SDRL<8@-#S#i)O-H%IYLjf3Nurc2Y)yMbx12r)#Bl&g-bY

e?7HOg%sOd?L;5bGsQtAq z;USF9>g8v}OcmYz1YbXCuSzb$HB4(2sIatBLS0-spxG$YeeijZ^d`a#o$n{}pY_LE zXb$kWT58twhgv1PIcSyHdgU}n(60h;-Ny_K{3ukIW`--16m}lAFh%x?pl^9`hg&f2 zkC`%ue;yWjzsv}Vak$XtIzlH^&PVaLEDf%U#5>nblw30RPaH@e2F0Xxccy@nS=HoE zTD=+rhWklzn-?RUZD1~>`TW_Xh2E`G@CNfy&&^ammV2seo5kM}y1G#gxQ7sxRLWz? z?oDY-p@>zd9@5s2+C(uIYSILwRK{DV>Aq)FWJodWV}tD|2D}sjG~uB*e58C=4P9p0 zvH9O8y$+N{$2GL{!Z*U6FupXKpMON58=1kPR}}cr?b^eE?=umL*CeN{KGGbjR}_sD zTK8aSkYZNL+v#iNtywjTUJ$=4tu0?s;1qMYm6|bA%MLA28$yhfLW<~)%5J4f0K9|- z#4*C#3z;vBEQ%MIG-uf6&D>(C@D>BDv%>=~4z`RtJ*wiQ_-wriV9~dx<~AsPJa^1B zxGSyjJB97R%43#Q(#3KXO>(I)kIOiVW2!XoxT83Jg zV0}o^mobVTGjd7A%aE%wGM)ivgWF&8YFVtabA7|6HV#Xn9r>wkwQzwfKDT2AEIGiR z1RI8q`)gEaoaX(ELp&TYUk;-ie<2_AfP|>9>GBg1ln$b%SFo6`=M+=EqIbnp$&#{j%Z;%9DcYbdI?|p&u;* z`_0lCyfo4}AwUwfwNWgyo`MYC(o&4fZIghZQ}cVWIn_x8Db3?hLZ|Q1-IT_DZm?VC zOup*5^KN!y^;N#V`{1I?J$1-OYGZ`A?#9fz9eaOGh@L`*X?1!RO_Q^;p=ocqxQ5^D zdTEoUrL##oE)X;G;LS3Vi4oeClz3}HjTIa$XJvQjYAA4=ek<_KGO^{aMBx5VFOnza z+#NJ~$5w4KmewV{ z2FoUInS3K85WOd8cM_+zfG4mbgJqfVz?H+N}sa( zK`kP-%|4Eu1^JU-C?!{QEfbToIOK?kFio5@7Uf_3r8}}l^;b^3)Em4e6Vo?zwqNb% zh_%wkYl-O>xHIwNzwc?82C5+hntV}0BIFY`Z~U@b(SI%AM^Wbz@dJq*KI*%@ zk^Uk)u*!@Lm?u!Xf;`NYyZl+wqEmaCKO&;8o?He3!GG4Fj(^)yi*^w3Ie6aD14+w? 
z(fM1}GE|(@JxYcvMX(ch+Dts&96fJ4j(v%yx)u2-BRMa9C@oiVd+-Hw1^&_+Je6hB z9y0n$`ZQp0FrXx>)RAVW8*^NLntH4+5GQgo-Q<1JAl_o`ziv@_GRcD?V5G1c!^3FZPEnD>6;4z#4~tgKoAbJgHzVhF{6psFqnSI|+# zlj*#GPIJ^I&Z#B?tKSH|tnGG?PZays?XCcst(SYZ+sr{8yZ%>qoOf9#mWN^bVrlbi=&21P6BIQkX{vrEMjym^kYi??eXdXt%ll53 zq(ZBdQ0->6W9Tm@6|F_7V=6d6S4rx+ay=``|ND-$+lJZ>rtsg_Zy?zAD8ZM?4p0Sw}ZLV z1}N=Kd%KwThuBN^(HdtwA$&s4auG}&+B+@?O+DW;m0dv{s$Q!%%2b$Oh|l&bta1`+ zK~3y%Y$iC=QY5KlSc55r_Vj!UvW|Rr^ zrxV=z1=0Mxl*-`z?9Z7Tc#jcvYgb5>@G4m{;auGEm9mezY{?DQ*(N!=SocRca^90C;FLOI z7?L{JIsu`n454;_>D{_AacQl2sQJ*z7P*6Ub74B}%l0<>++{!}(|{XpeG(7p8Sj>x zd+pJuVC^vjT94T|3%QU!BQN5~HO!TtGoK=p2apB>$<4+>^@l%T0!P*j&t7zfOBkLXc#Pj^thHL*-y3U!W`iUIwY>M2-ZF>uV zp1kkm{MpKx@p}bH<-eX$IaQAEMjXx2&&1Ky>keLg{ENbLa=) zgfuHrs>Ng|SJR*>qJ8Kx-p!A47rJ&;Iy>A(#UvkOB#fm8lH{aOIt-(7C-1hmWe^i? zvD41JFh)2k^|hC#{3kKRRs-LY{DR6aU34?LC44k*g36j)W6anT? z-c{TnX_)rv=|$81H(kv?5O%;yH~ZPmUs=zeY~?PcI~6+pBg(*pdaY=#l&gzC%G11) z_zIohW`{Dvi*b*4s&)~n*{azdvBL(13q=WCRx&LA%>D0qC7w!N$dPL>R$`bG6G0wd zT-Fr<@oU%+Q6Y;2)hAR1RpWInjRIZh6A{}`$N^1?CaZ@I$byWFY9Z=sZaI|8kaJXA zd{Ys`hagK^qGV7MF+hCu2nH3f1NGb@Zbbc9MI^@LV zylseg%L+}I#!t06m!-G-a&%%5nvma^936jd8rY{T*$~>U8E*jfhT1)|)9Ve6Qkc4SV)zSz zo)hTI8vTBYE310riPjqG8baLLmKjn*+AM`~Yo}-s-}OK604LnHZ2WC+=~<2KHazZi8k z*;d+jQ8~RfHCn77uguQR3tO=-nY+}mN0~B_xv?3=)ed7wJ2k4rv++w(exUYw!uL&K z&Ji5Up=QVB=7|_h8)q9aKbZmb-J!m3dY8jSyyIx1X*<9WK6>7{@I@R@B!aLlE;lTLSL2RWvtZHSp?tjLKl zKpI_S0NCiGp0W6zuYKDC@LsPfgSA+gMBaZcFbUYKUs$$+eX=b?gY zmBP$1%Yv$Om&N5Fx-*e8~ZfO> zZABK!<{^jUHIDp!q6Ct;PunSJyOnLuEG%4G@jHbt)wl8}m`lUdVMkl*Gf+i@n=jP< zZZ0&nts@R+PNp7lrv3SnWecGO$hHXd9ia%hi5vZl`$|i~-4WiMQxv<83|#HYQCnXr97L~h2RdFOUgiV;FO{jmiRo-HDW|KwdWso|%L=$g0s|87Iw0ZOQ z`3|eVlse({yidXKQzl#W4qb!~;Z(@sK)MX+b*tD($iGWQDT^79B#1hGK@^&vdiy~U z<0~~tEk#UaVYuva`-af->y3`5sEPvb>YJ%91@dR&o)wcBRQfcgfcD*c@XDfgO^G&%BY6N_p7A`Pn0O&yUnKfGLa_#$@(3 zGWD~Bk#uyydnz!)(HKz_Ed(Q>Z2(G;wf5piSpVrf#Qq5uNAHel9$`MxQuP|7c?m$#DbFCSRAr)C5UFf6GgiG;} z^e0O9bx^)M)uTa+29dJpO;LF3Y<{+SmfpAMAAWSRq4+jJY=ps^xh<@69Uw1{I2TG)x%B0A?|01 
z6ZwuYWl*uU_UqvgWuX^zW3&Sa$bn|;8uY0wt*;|8M7Xl~ z%HZYm#uhbiBJm{e0>%Ih1_wMC12l_1qgB-{viOa?%FAU?Vj)`@e1`$!@2v^&7VT~Q z(XSR9%kcfVbOGM&fVVkvBNZ~>I`BG8;yf*W+^E0@@2hUIss5$GYK?MN0BFGMdsT6O z43gy< z+}#q~WpE3@g1dWg4+-uB1_>TW0zq>(v%CnL+M4EblJVl{&qql-n>$Ns796jNGzwyfFi%>Az-p&oa$N+@xHI*nnBi z3!%Sf|EB|({A>e%m4VWGJ0Y|gY(V2+r0nnPCYty? zn!QdTSWucn-8a4%7cD`Q)VB$uAN;t_H8xWCkp)E5>htr%EoGHA)Ij6jz6>P^3ADv2 zEl#7KP@MQNj>O?|z5JuQEs)?uLIobZ>@mVuxA%$7Q){9tw=Fe$?`Ot8tuBcWGwUQo z!AK#8&ep5Z& z6ycY(2De!r*@4v0+h!L>mPoIO+?+^W;tX# zUPE?HG@MlSk`ylDI+K3Gp)Ulcmfg`P{Q87xT0OUh8;J1fHyj^JZoU0<+7>OpPgYQ% zVy1D2`(f@chrXLeol_j+f2Z$|-Azs0J+p%Q-_tM7PE0q)xz!M`-061=>7UzLxLIgm zQO{=|cla0? zo6xV2zB!bJ)4NlXAd{Xs{Sp4s>hPkMnbHD-#3?C6n8L1B^(b7?6!-N|b8I*qPQ)sU z-iyvTBw>BYqw~b;XTnb|F~*IGHv*^OJBiQx7H&7|hvUy@zfi7uY*cUfvYgfbhQnnK zn817#@S>iu^q29E4@Li4Hq-6B9M&CrSUoyX^-9%$)0k5Wyc+V>YwX(*mwB`D4KxRwctzJq3z9ex0LQ5Pxg$mtVw5^ zSu8Rwp+(tAx*eu5spUzv6Z{tXwIfa@ja^d_}yIQ3=Idj0_Bt;kx5u*jm}<~`OK znQJ(w{(!Q@z~tOt8TjPAOreU)15pn6&C&3cs? z4|qQ_$6=8P`d9ZyTLAZfQNQS$^Mt}lId7D7pp(n^V25Wp-=%k_Xz@cm}pREXH8pjb6@^*=w%NXha&uCw>!$&nV>E=7FZ(7z2$fZUu zQ2}9B`^uDNx1Hp*D$?5{*XmZ5)8~^tlJUf9B2BD)*2D{eg zZ7pQ9ju^{en+qc(j^(eLGX^ZlNWR2PEwoECx~q$S?IU9y63D)>dj+H{4i;naF%s|K zSoVSA(VmpS_?1#`9CHd_AY97X%p%cBw9<7lEwf#ixF^(q74 zA3js(#X9R|S4L$Ai_a-cU*1f@O50i2elmPM(DKzRKdJTS&~mMF5$>(O%pFoctoNPC zm96BkN%@0rY-AXqv=d0KvS@ixDjKh6-XmXsdJhxep{SCs9mw-*-@H*TfEi#SYbnRd zKYGunymG5>eFr~x{7x|2UgiPHxktYCFDN;kbgkQlJHFhy*b{YsiiW#c=IcXag<0dK z#1#MG`zgV!x=KoXfIo$$BI5^yISFYyPXg_Maudok)?!5uLY4m!>ZT)vusQhwEMe&n zMp#he1M^j#fk+Ru1OZxN_XxFowf_hl@biYkROfv9*M7iA+`TH4j`YgA27YvEUCw{I z3^mSIft9@-!MLAMIWqb)^RXLgoVgPxkX)heMW@);pYMeU%EU+sr+(@vbp4qj)$&mF zyjPOZY*{-XQ>kmUaTjXWA_ZU8zl62<=S1mY!imb2i1lr`mghoOn)^wwP4Wv-p;xA^b^M_~5{gwlt&myI*D35U+j1 zB=WuAaI%$RjXn=B1_#|IQ2coccCH2)Ud)RXUKl=krwa$l68*UTsBm?iQJ)!ew`>ddMk241VYDbZ zH1)k%*~>Sw`Lu)6XW!QV7G{m+nS~Am6-oo#!FMKVUnB0>{zFOQZyt6vb9eA50CWm< zlz&9kD?O>bWiOg9oA{0zFQpP&rsOP!_s6PbZ?7q*_6;3M)>f|~WtSNSr?#(K0JAo# 
zepyxVdyrMzDt*Sk&d;8GSbq~>Vd}hDW1Z&8P~m$yl4AzmpP9d`=TO`w4QOFkJ9<3) z764HPYH;6cZtA>MW1SX&DM>S$TXZW1<}9d<9isu&0o9Q;s)_G;5Nx`$6gokAKiX%Tt;duYo`)g25eEjv<*l* z3GARRm?atKJ;$YSpGce?@1KaOKQK235p2IAbM1g~W!t~uh8@4V7ae*0MW;3GC#_iW zg=TBu8R^ZcwXb36Ng|-{8D$HPE*Q_Hu|vWVLTC5HT(;9+5X!A7KZ5UcmJ2;zp8%N3 zI;%B3hi`5bzE?q(w~=nDo0Z0fd;_y8Ic#S|&cwvjZw~S(<;AxBfoiBg+up27t}nL| zzwRYD@YFotq1^R#Yo}T}{|(2l=V+Vq0I@eOwcdVP?&}u!8EJEU2iTzj1%H&%PIpCn zt+|>&est&1a5rBjJ1Dyg)Q0l}|ByW|^$_=LzHFO1?^b&-*Y|41yL6+lGzijv=H@Fy zh-cRtXPdp!;-fVHw-yHG%me$?M%ex$ceDHu{rh2s8hr8x=G#X}9l3wQ+5Bjy`NL$u z2TjVD2W_7R@2CCXFFp+RT?Q1_{zz?|cz56M@p-W2N1aPGUoE+%r=8z~fYOIG_((9? z?>{+2JE^Xk&k8P{-6lS0g5QK$Xe%I~XudjjUHeWm0I1>nK>UxqyZZ(QFwa>ak(si! zXp#N`6R_F>YTPAdF7N>TY@Q(aL3_cg0wG-yn*e^{t{x`b+7BEZZdEBI#axy7f`!5_ z+oofK#r&*tOWhB?TV=lZstA-frun!5Xz%t-$?+nfytF{urk~V55PhR$U)3uDL<;%P zZ%^L>T-6~~^CEpB-Q15G=5O%?DovW$2sL?MrK!p3`OT5~zv*Jnwny~DxQD>IT^7Do z_;`jG&vy^8cMbZuvF6Ya*IRN70|Xx%Ccat16nAWSFC?9M0YJPzd@AdCFNlZa176(sm(sYYs^7ga)P#qlLn~+osPE3)%s-cbw2zE^lw%K3ZeW zty84*I^ov~==msm`@pb_3V+(#`s#c9DlE~6ca`}u@Tl{ALjJG2k}5HXunq%UPjcl=nwg3w?+!)Z2Je^ib-*t zFAl%|WY2-tdT*)tiuF&E7Z6|1d$-LRhU7j#b!3D)9x1*(FX*2-jPW1qShb1knIY@D z*K7F;JHawqbgCrXP%*J=0q!3j8e8&EjXp84|FK^7`R zq&7B3qIzVz)Whw!0pkrs; zRq!g9?*^xSvO-u&jq?~ef>(W@9G?Upr%*2WHIYvW zvkE5_Yt7>qL<2y?rt`=~50uj#JEAZ7t#LP=+hab%yd7WjuGd z*ZvzhJGbawCp8O&#boUCf2?+E8NPo1Bc#GRO_bkIBH&BK$W4sp`lDG%-OA&vDV*9J zqjl1*cVbSKW_L0g#&4M4{*7fG<}*3B9$b%f#*^uXq7W0`<|rpWl;Pf$%d^_Nf!^tG z%$sq8_MgV#CJEL<&bHfI)+`(^mHsv1*7f+~n=@Tw*5&p%OM;(^`+2OM7fO35y5Bj3 zzErGG(uH#uwYQubIvTz$gP8$5;hG8$c&{T{zP@CAk&LkVTK?#uVd`N9r`J1b&yFYk zQFDBhiyXfN1EwS|WTPUj>zz74bXQ>Mj;gwMaQDUXc?WK|0JPE`=1gm%MC z%fMkZT_YIoxee7zBPIMpE1H0-vzgE}n|Ku~$yAq+*W9O}XhHJmF+b;-I4QDDv(2)v z+s2`Dy&S_}N`$-s249W^LnzmcKe0-&%7+x;T8{tc?|$?1+>;Q<5*Jl+!i0h zJ-4^lMMHh7k89d90{pB=ptpTzC->EL*+nV7(gAdGNt6riyRj5H*^}1YRZGIw9!Afu z6lOx3H;EnZsplq3{DBV#x5YJ{_--;;RMw9J94OjQH$Sj&bXsD=bAdf3!k)ISrSccE z^Dy-ponuhS70@cZ)cMD?CI(tCn+_F$9kxz1sD`RI@S~Jqko7Tof 
zQ`h?s)D{fKyE%&dU+4!gr1;#!yF>`ROC@yWn2a@7?QY4}TEn^CPh$*zpnh;#X@h%# zC-2+q=&!zwBu+u|Io?>bhtY>lxzQMZshu3c-eQ1b4ls_ZU?3Uz5#W8|Awv} zoOs*q?3dL+e=$L=_T=`#fVreH-nWA|4!}ZQV4wkhRM)Z-A-$E$|)Y@X_POZYu|1%KASP(Dyt524?UEzzy;-zS2Kz zi&iVj64_3PL||821@CYSU->Koge=hqM5PrOLJ`&Be* z^02UNUHP(U`LsW}qxfQgT?5Gt=fknZ+LF(N_R}U<#?$UkOn32@YMPnp9KdqGE&=@G z)7No@BKLR)6cJ7y21WS~C$HnS$bmuTs=shto_6op0>$z%&T`lBR^DSg?5xUf7#7X8 zo4!=5KS6iD3T;e)9fa7P;bjcBqwhFW_pr8Ak^Dg`5+9Q=m!rd`KZVG6qY zdT&D5Dg1bD%g^6%9f`2L2}J^6YXKHF``{gSes}u)H=Gac&@^hqA9gz&Uw@}O77aX+ z^171wOJ6&}Na@$cZ~j_&ui0F*%KWpg$K%@T@$1*0DMTN(7GN;1cz8>;_5g;HYq1e2 z-_3N*7psrAt9A@UsanG?D7!uLh5fnS;$HLsTiXc!!(zDoZ#aaD<1^X#;%%(u)yL6> z<88(WmT30ZU&I`)5-CJWVH@{K^d}7cCTg;LD^A%C5KDYZ@s)willve_c#ODgtMk7TAdg+#))?P|m%=u?+fd(45ytm3 za0G}yj3vHh@qL#C?BVop+C~5!<7R)kcuF}HLO0wEJ|mr;mp;p1R64)LQz}|U-F;7RT`3B zeex1B$4zpFJg>TeFPo^0O_i&pf>d zoQXL7DVHkhlwb`o)9QGJc)b>J`Xi?Hdq}kW83D$a&f-PC-@VBU>^x>bpL9~bdFUB;`@@B=xXM-f#&ImBagaS)wckc-}1CU=xIaoH(>+QekeEcCBUuY4x6snBtzUxPw zYTtAl)MW?sQ2|5)R)!izo&Ac~jvr>(^l` zF1hBWqhP65H2OAq_YBjn=gi3I%=W~tW&ORB7*ue1Ct z+G{O|@AX3ELu_#~Y<2thp_$xxI2J&y`##1`2K+}Sjnrjk(R*>F^Z)zbcm6jG&Z1KV z+-c^L{o474hrJqqjIAa)_;*Qcfx` z7kEaXIUXm2Ge>`C(`l2j<-YC$e_V2fj;X+Vk|!*mOJphvw@P@DQF|q!Fy%+c;=Cpf zrr-~x;U)wAO;6fAn{9HjdJT}NdF$&Akvu*S>6vHiN9mfpMgz5N-84>c=EtIlv%Yu~ zhAjS!26OlbOWO8u(O2uqL}wKTOvS9zXMf&xT*(SVs16Rru?eV98B4Yvd<>Q6tkOt+ z6qd;uLfu*EjC$VK?%Bvkzcx=OKM=s^X$^LMsZro!``VngkAXoLEYy*_};nr&=ebK`b6j#e&ND>ut2Zf^A}n@uHw4= zG>1YC%eMCKT{HE*Akz;}@t_&FcP;VBB8Z2L?sBTsmZe>2Jrj+pIxZ&5rJ1 z(0s>D_u>>-#&6(O257>>0`=u#-bwm#l3qr6tQq{J@$MYa^v-ky>ADTJ)LL=N7)XB` zR2pxbVJv>}L(_)&8&6ho514LKj5Z+N2&ym$l@kFa_yTv<0IT=y_vkh)4?O({3!&dA z6T4-r!E`V%hytqcS7V8^$m~^7R4~V*<7N*47)W^P1f_)ne;4cRKEUvwZ_t2={iP6k`VTt2*C(q0%E+vvmGeCS zL)+@oJzCNqfUb<5K_yB-9idy*SC_zl^mT{b=6hsw#{iA>ziB)L(`|7)G!Inh|LBnW z1e6wW6MH2BaEJD92I-LPS_NK`5T3XB0Y;)dzGvD0x*6b&;@XqxZ!Spx z2A2R<$!=ADJ^*CMc#z>6#et_asKc@CcI;{fdOz>jb-N6p!TgiLb^1@G_X2wBu6)qz z;6LR>-`V>HV2vMMFPI0&7eXT|vCcK|oZoP1$+@=^Rr$Q?>J*8yg8v6y 
z8ULI1A`v;Cs#Gg8Y~Vo81{>w$iBV^=6A<4h_!R2Hdo#{O(7?slG#QX`*g7_$CLZeYoI36{&+h1x zF3dS4XXm@ytG63oSvVS_k6wBz^3rv5S7TEdY|dMttl^!7tPBxs_Et1nN5Q?=LPzXD z7HPNZ)HD;^_LJI|>B7n)*q*QXZWsEKi;hZ>fiF1%N;s=P5Xu}hBT%X_q+1eQMogbw z0uQrmsrEJAvLAR_wF2(_!pbfzidSnHr);hYI-5czG3&ij9g{Sy;v^6c8Gd2k?qr&j*qqVi>-7YXuH%!TU#St&jP(cR`7HFz(L^dTyNjY z>lhW59gLx7^71npj$|}kyu5TwtA{;bAO0bE9+rpxR<%iDZ94&jeqc66;Xv;k%Jg)< zyMpqop7$hXYA+vG5lp@BH<)0MNfjZ81);o>?D3eLJ@9jZh;UI6*?emqrp}|=Ec5;Wn=L(vI_e$ zcvVf^T4J{yUA}HCeH?`WBkTzjx})?{Bbt#ts}s6Xk+*q8{2VT{2@CP3BILnt_-3*lqS-ush9Bs7Z>7V?-!{NCxXj2P3h zA#_t@)Pv|{Zalw|5Fx!D7|=-Bu>cH;G3~!pcBD*Hp!S-kQrP@Vqgj#yHv6`SdlSH& zLqltjLP_OqW3Rn3E^oOR@~(%8v*z{d70=FK+qDt8YuxFPyizHH zAl#Vub?+rAzyZvKuieccMz7^HkWFht7d{Gb_}Vu%O@SHc^xS61R3mXBl~KkM>u6YG zU0#}={OYy)sAwi|Rk_*-m7wMwh!r4yNkL|QFkPhmejDrEmOPwp2#sQmI=CTaqob3^ zIqVR}KV`?R-Kz9YgHsGu022Bv4by2n^l;kU#U8!el>>3r(n!@8sBR4k+1b+<3l3rQbb?_$BV7Wd% z)_)u>wXbu|?o@j8IKP~6JpEYcHH;&7u9Ya)gXYW|!j0sS)?$_{Psth%t#8#%`9Ilusai~4KV*a$%utLMxeu9A zP2-8$!-0Fl@-)QaXaM7A1T?ux7SUhL<>b0CbaVQ2#?oo1stxM$7DA-hs4Qz+uYb-q zJgN)L5H*lsHq?wQ*#d_=GTJpzYHsbp#4C>l72x@wi#EUXLg`F1dd{JN6t2EYu@PZW ztJu*)In<{J)fFcvP5rp#@VWR2Yp^pKxJM%NN_C5Co=ZJTp z>Ns&Yx-yPMhj%Ph;^7;%Q8Ap6l{l%pAQo&tD`(pSdOqKXERd~y5&Vn($Mlc+DDgl^ z&Y*;&M}wdiN2N@)Pz16O(C1ZcrL*L6-uAlwwOkTdLbw<07%Tmm=ld-r`nbmKm9R&T zw~^?Y>FLPQ`?0c2LCN%AMhUe0I;gp0Ii61zd%O`>o_HE2{xP{r53kjiDGV!2X`v_u zo08WY>Vm4NS5AjKhzl=dw`?B?8n?EhhxnD|cj2{Ss0`!UCEUg1evyGVCN1j!=-HqZ z^hPu>$SLK?4)u+iB9+tIeD?(ld*NNCcW{jzb<_ozg)wQsTrX_OA*Nv#VzJCN@T|J1 zQfp(I@L59oc>xK3zpmA35WIaf);vpOhqDIOqo{p!HCJhqQ2NiHB&&?AlZAPnZ-*Ac zw27ayGf?`kgr%`Fn_ee~4dKdt+cWh+kerr(CKuMC#3bt+G)MGQQSEh$A}c&9*veS1 z)?TN&|4L&9v8bNwxp$JV)b1D;{`?38=XvtWD8>UauG=U$h^3|)Iv!EoS?`OZ^tiW`{-wA(`3f)wJ5687l^|z z!$FkC2k|yL^*{1%vd&T8iv(EsBdmW`D=Gi;%qKSZaM7i=7smCQPeMx(UWYtJR~t+@ ziAM|5WjFZ{TU#qwR3-)A<81T+uG(bjV$}vrtwDr-KrR>>P1U=dpi^56UsFGB9*-Ta zM#m}n$VPJQJQlURnCJtGuAeg7G3m0S%@3$t8<^_i;gu~Wb!a`I;h9_-w@^X+Fs7Sa z?^S7|nTKbiqL~a~!vuyE&SPbH;{MIIqp#%#K^MYZ>OHV<8!$F;+v@t0Jy0xVfx5tO 
zmOPYGWN5+%Vgs#D{Wa5BIP?m(=~Mv;w1+>#jdZ2$mC&6ji>et`HT*RYA#nt_ijKiT z=iWu7grT(sc_=qMbV$+L#%NCe#@Qw%;`1=EOzZPU?hx(DRtaeqnjI|D5F)IoO}I)3 zRQdkZ;Xl7|fGR`G{UVbMHDyn;QLPgx6Pq)?RYK7S{L+MqUMUyR(5~V&)lfU>zHmk( z!%gPtN5jKaVdcjU;au7SnR4_Ihr#WP4hatsGvi#)3yzrEoqc7^=i8%F=ZR3nJG4Jx zdzI2oLB|}${(%Y+6~;-u*r44w1?Gc35;A7jkN?zLRBjbpmYX2Zt1h#qsgS@P&ETZN zAk)MrPu)?ZMJ7L0GP|8FClk{!7_kgL5C^=fG^i+LM&;H}rS@=A*klfnS5;T4ze;7z z9)^AN7jIQCJ8tgesb#f)IpZMrQQ49HRV&XZvDpGu)Et<8b?GS|#9s5)Pt4>7Zkps) ziBw@h+10G2DWwbztlCYA(jGlYb!;YUxx#wQ<`@rc3CqVXSgG-=-Jo)R-a}nBh=h0{ z=L3^O)0o?2VV@Sfpe=1ZwKsKoD{F;y(W=i5bU2?0(93xtbJDh;V@aE!&M(%TcY>h2 zgZSJw7NRnHI7xg%Nh1cPvR&1t?lR+2BSq9`IcBngjPiOoTN4o}crGD?Kqa(j?JM%{>oW7QFl77uhtTo zwuQsgZqdI1YKZz_aYS)1fKw@qyDcVlA)%O%)k{V&E$g|1qiPOw!$F!x+_xKQ#!jn8qV)Ks#)eP&h z@NYgV(Xp`OPjQZ7kS3H{sHZkxbYo+QdY5Y2s~9SknA1>X!>UG$|G}`I~0lK`o()JR`l~>8a=Q zeW6dGSnq0Yu2Do~?DTm+F@NOX)=(`dseZ*au~vVz;JN4{-XRjr_ z#6{4$33a@uW5)?FWFko`hJLKqEPB`WfTBVOA7Tk0r%8gOkqM_m_+o)|Rv~-^}K0lzKDG^9k~?E1NRoX6s^rz5wl0rAU`X=#E$; zM!&f5zlz!9q7G9aO8!Z^Vx~R1O`cF+-YT67sj8^jOKjyYX+?tbcr*ZKAfb27X9PTh z^hp0j3@DzA24^thhl{}hVIrS2J(5UqBzT3`AW^o6VK5m4rlk@1QN*NFZ6!O)K~hjI2Os4#I5pAmBFUui0qb$d`b7F zT7F(2o{o1Lo{G^g?0F>}Po22YK54wblafc~SoUsE@p%9CWT-6Trc0XcO_26HNuW!# zRJ?@>GxVLMvu+QusEkX)=S>hM$~4i+xa(OmQfdUVBCC9uT(1tv7yJq+xcvVQaBYe` zL=NR|OF`Yg|K~PD{})Gh4Ix6?W_EMoK(avv8ZTew0csX}0dWmWGM`MlI>bi3#7dm> zy6YBxCMj#w382zwc#kpM@S-w; z4osHDnY)N?vCrIGKGh(^)j~SL-$laR$}8=xjLgo?Y43yrDx^K`E>hS|08dk7|4548 z{zev+1Og?(#wB6+ehKY#4*rh3`P$#&Ecy9g9Yn_GYnBQqgQim%sFt&A(|l1tS;I}u zbq(NMb+njx^ASx?%`pr1Rxt|sB}nwUFCMAFtsP7rF*J^;ydrh2O+??k87JGm$Ff!_ zzGw}{k|-)nh@3?3Dh0)*f7bMIYsanGENRdN^}zSnp{$4^e_h2Aej)Bw8eLELNDjY3 z<|P>4rC2u!2MfMemoE+mrg1Z%e(oQy&7n7F7iuQe`usLQF)qfUW~>iE6?-T_M#<$s z&Pu`$A@Ar&D#E-zjh^RMUkit`(L9(D7URw()aR9@O60*gjcrIgVI!my`L=~yIz=?>~pFEo1xo3=~H#+q2Vgzpbe}w~&7Dn+qu)cM%j6;v zGEE68i83cs>jPl|6_ZAX)ECyM z)TkqA{NShFW@Tn;wq)BM=19C?{d_ux^~3~-=P8cGbIqi&TF7kQm*TqqO1+O? 
zHIl(mo;A}TGJ?}8BETL-Ha-l2~AidJ=H{)O$R|$cZBhaV_Erkz(Q+G z8d+e8GMx@OG_#b+?Nh#A9fr9i17}q8B=1pcQ$8Pkg;~-SphRJ|@-fkcyHs&S+kly! zG03BK6-f)jywRwu?6AXD1Gf4~sVh-S@*$gUx)i}I4RcE{{iSn|!d@wADG zz^A*o%K|8Lc)YRU@6z|VLW-OhvWPeZv?y-X9wC4ljs~QHxu?}SL-y=ESW7J z8iT1DY838b++&2qjQ-7~d1Q6DT>am0jh%dfnHoji7uf}c(3BD`4^kf`INvcYCQ*qv z2`?5BS4j`}x@VO+ffg~Bd%y)?op71mCCpiUdKF%{3H78tm&qKYC3Hzq$Ni=+Yf$C=@s zOgwe=VBj^gk;M)lQdZ1P2`Ofi-ySS$?QM}~qz2PXA{8jqAH1(5UMQ^ib`efa3V}>a zc|Xo$WlBs0XE0!%61K|EGsz`}GU-qGQZF9JC-!f>FHWXz69PS%q!CPNhoIg{c{5BDK*adBw;+*du-_Ov*qU_+?)o z63jWC@skG&sxqEq%l6?=IEi$7aChrC?vvk?J`8TV>ht3S4ZHz)-FH@RpIj)G4G%#2 zxOebPK}pD8=JB0BzYPnena$91dREdDh-#Tc>z(qw&SjwFdF8lsZF+0yHc7xc{0bd> z-Pf;Lt|e^4?GK@9qspQh!L`Z$E~Jx`MlS{YJ^4l@ZL3KU%@uwQkFhS)^%*v&bjDt% zy}DD_0wTlWt`9*s?Z7AoC)V{Kx1Yr!nTZ+JX)G|p#bq=Uc#|}?!B(YZMwM!s*p}yD zD)t=rD6!^JXEc+UN_%y2++@`XYm=HL5LAp^W?OX?{dwQ6Jh~2llvC`^uayC{@(Fid z#nNa@*iI{IFgZp`oD1i!TQTZBl5fPpdVIj@9w=oAXeqVPb1O#wB3GjLnZz6`(bW4L z9*F0dATtJj_G@f?(BM-Q%H2Ix;vyQZ=5rOUrR(on6d~7qE4iso@}hNfP4YV4Q<}AT1W41#Y>| zS3%h;Ot)Vh+JD21oydwEk7O``ie=(ForHnJ&GnaWyZ8&C=3M8@Q|Ir@NxX*>vym<7 z z8NKEFkr1o@bf0=Qe^9VtT82A#z4|QV77_wMA4xKUo-0LC4A-X_F3=87|CIk&qMe%v zgIZf6o8y25T9XL(2y1hEd10N;TO`+iSV3vn#?P8r$AnUHMPw}VHO#;)q`v*M9@m!vBN_1O+>`}@kgL!MbCg0&TGDzqkB`7m1RB~2sQKP*~9S_6PJV4}m z0sVHw*^`mD&8L;NprrEhPD)cnt0;HA(suj-r|*LZ8PW^iwpt0^!=6jp=4Ql=fXo*; zix-A7mai5-EY(9G} z>pAtvw3U^&!Vl?rJZ>g)r#I{ZEu?P#dv#`1M*X1uhD{aI$EOCKn}tdIsiy*KKL;`k zuqP-LdDG@PjNdooET%zXrcL{zojwqCn$0tmO8Lm4&|E)%TeU?+=lV`g{v`p^a~;N- z9!TAC_4%)s`jlJoo6mJ!~ zT$3_u@%*_^a_gO=)N*;foo&PuB{z@y8Q+kbG7=8!&3D^Cj8oECjXtmD$%*MwP{V`vXwikR^}J+6cn3eG%_}R%)9Gf5Ox#=v@g?CLKIq_Vmz--Ailbwq z7`V=Lxh_=_H?<>wW~D$u!35tGURByRY!9T9^{n%?ah_oGfS~sv_cP)#*-IqEK1#Yt zc30og`Z2eh)n&Y`QsVU6pFj0k)u>d>H?b8s46;8u7G!LD@Zt7!n^?s$!|(m%9vd1h&at)#oWCO=NL}IqVPu8db z85TaK&z9Ul4sSDKB6|={z(@WYO$$iKnCVQ@d{$x1B>ds?kH5CYFqpjHbBN^Wpht zB0ncvLs)7F!OU(mq^X>4<1gV>xZsUrxm*~gGxD7I)A%o}Q9V4U@IGoZc|&?EMKA=p`Q za5(14v8nx0{i}MjS$JgW;YjHd|7IdK2(-2oVa8P_GjiIpYQ7M@!@r39#eo{sXGhYl 
zG}3-x$yh}Dy%UK$ErXeQR*b0&g0nR0^KId4xj+iYViqb&6B>)y2;K!b#{JB$kBO54DAP3CS(x z4-nC4M1Ru^SFtI~rfFf2o1)vcL{V1thQca3Nxj{i{M`3muL>;df2Qk< z#ZqMSMeqeB$GnzuyUncisCMg#bkZ{v;>xK!LjA(TS{qNhg&uLi2}#_o$cra|R56~> z)=%3$3ytJr!8c&*QH}+dd*##jP?15PDkeAh@6b<^xo@IT~Ng|~@;PomkgMxmfa63Qor!!+nbLos7PMkPD`yR*~ zbviDAzRq6znW>1V+2K)&S!QdlxuWU&%bn8q3!y8qVgsr)cCq7Jrcr0;{@)qoAyYF4 zV*;Vd4l?vlBN~&Ao;~6Ml~~Hh%w6K1AuMXiOiG^Z#C+X9D8-q>)nFA!BGyJRLDg|r zGJi76y%R!jVA4%8N3i@76!BN04d#7HmJvVDQ}Wj4;TDnqA@2Wgw}5}AC~f%99~%P| znSi7m*k4IGf8DMk{xjj}AuZ?c@a3Omr-$S(BZPk>8T}(xPxs~D5}E#j|MDj!@mDfa ziZUedgi0AWd5o&u?aZr@lrmw;?dP~r?1P)DNx<_GK%0x^o3W2cN!8RRUkg`pIRuLY z|4tAJn1LGqZ}gJ*zldr8*={u7-2*^UMK|d5B*mP<_SEi9PZ~UsUOC;T7af9CR zl{T2IM}CSBB~PjK4W2iTF(~FKr=X-HqFE2xG#f{)@lqF2 z{@yyG1_I*HhNL8J1lt!Od<4-X97*&kjB<^?XiTI~&~wZzA_oILs^PcG+2SFYk{qdi zta@3H-b%+3N3*I?_n|SG`2!eqTmZR>N!YidkYvalnsxy9vFrU9RJ2>bBp-Q)_Fam+ zg!t|D;AeRfedl_`#7;NznuI*pws2`PI`;~MDz9qiRnXg&`64;Rp%4v<1FRmpF1!#K z2Hfo4pTqQZ1c$}%J#Ak=X`PmxvgUDf1ZPY^5w)2WP3FhOL?ShyB*|UTexl|Og^(!W z?=I${37W^9Jc?VU{E9F2l^yj07_`x`zwfm{*H?}1v|!YM@e0=dbY^PVM*g_tm#ZZHwtwV+{ab%7X zwI*xdVvZ=bQh3QI8DnI>SYM%a-W%cS=X$ZenQx!oFq{?dY;LD49~9@Lo;`*~*>bcD z+`QrA&fX~oH<^3T6UZU!I@i<-LJk$u4ai<@*N%U;bSbV`EbwXW(`b{PkQ)Ru$t&Yo zLDAMW1$q_`;K97lZJFstQD4SON7b@3D~?Lm!7WFbSEE}SNHz2-k%XsUo`DtWGlQgL zdL7K(GEox+!7a@%LZCjBya{z`xKV@KS02*_jlIM%W(zq`yIX?cq4B7=7v}wg3;5ZU z^cphb=A+LF7mn43C8cMKF1GA2TY%Ilj<|W+pT!lo79l@52HQ`ITypIFx9=FpGtWKe ztN0<=xt6P**YRauof0$av@?EztxTX$^CJi(+ig86BMU~fdKr!{?lQ-v+{a|H+s;xI56 zcxB#o8E>56u}l=NvMmY_**ValGbjL+Nr2L{Hk|9%!S&~QSEN@c4kGuo`@fu`*o8I z3b#T*8DzF;kn}tPRcUg`!G?;)Xk#wC7!`0@2^XG&_yuX8UMuvg!E#W>x1vPI2IKT$ zk$^mLDmD#%FBJ${gMIJb!$rFFi^As(wKuy|!e&@OJtwQW7aPA2VYdmx?A zpqNy|Ce%y&p-AvG{9rN;KH?}V)R1;5J!kMIKEuouL3(8Ui22>i#FmC~@iT~2%3@-^ z(HH|04!m)MHyaIJy_TdnE@3)%3~PBN?ICuEHK--Vme`k`b#MWyRy-lFZ7HmpsvSo0 z9a``le^fpNG-!0h>rD*=F4bcANOa#)AesG&MB9YwvITJ7ze@)7@kVq}BLof{G~5=0 zx3FRC;CdD}A6Js$=)`^qG)zYKEUcvZC3oGS@OJ`-t?NQHp53litqsGc)@@ 
z&6!TEw$x*}xyM0rpm)njtkiX2GU>)fPU}$lJlt{Vp!y`gZtk)A>H09kSS2!YghZ>S ztVcKuk9EQz$S=$wQNxzVhdSzOUpQfWXpv5EC?E=^itSUcnSFGJoQ7G3q3_E=UQEX;LZRtEimiSs#IEtr5!= zhN~9&B$A!GM4Z&Oi=Qo!qV~gT!Wt5?h2@`TYt2^9fpTVjE_BZZ5ucryDR$6TIdkbJ zL}9-H!4UPpye$fpLgL#BRcQ&NlZ%ih-_$>#*(CS28mg~OPE zn9Hq-^Js!MY>uSAU=bNk)hO9M#@RHlHsHzXrkHfcEg-9=Wq2N!J z^XKaEttKyV6x~Dclp1bHZG(rb$~PS4Ez}RaK0kq)L%gm694(3(ga+C%lbPgA)CwH6 z?R(XNM0W^H)xo}H^C;8O>7vwPy)Dc_L8q4fF_u;au2If5gayLod?|bLonQ#6(|36i?>P}UvsI1ewd-YnY*Y4fj&(lV8 z)zj<1V%D5neZfdFB zn>Kc{Z5sN4L4?4MmGv`V3ZkXbLiwW!a%P1jvYbwjR;KUqY- zuAx-i_)>1>$QSVS^aW=;h3MpJ34i!rgo8}vCZ`;eM?h@|-1(*f&kK8O-^FRzsS zL}u=lC&V*`8l8@^41SB3nsEW6=ahOAW>{4i?PesbRT_Vwkt9qkMA~s;u{24=3KTm? zurSc%%l$qcOxd{B*d|l_36bVWRFzO&(ufdm4@&rp!M6qMisgj-?%W#{H0+~HTFBhf zd%7UzX=dP%<-K}-N^W9yVpS`J;%S8R>tXtJn|M4XYniI=?pjFRD6&H8QM*&RjJwY$ zgtt+exS@F1blN4VF=+(9CPCyQmAGVc8ogPKxqGgzPxPiwB9^xlk;WU-$sAsWCClnj zhV)Jqy2{Qc@=xTK*wzoIGN1=M(Tfdhly=OEuY5keZ58y<$gTd31XlZQvl4}Sx7y{~ zmlg3EjR{TAc^I~$d--zRvxZgTpKyz@7p8bpih0yCbj&5tjQX3w^oietw?XV>iucjR zylTj-YyA=hRs=n>Eu(Mx$}^<>@DN8{6;gag)#l`!#EPVlC#Go>56xtFF7;%@(2G=x1K}wF zl?y^Q885eWsId1G>P89Aa_u26eZI!#Me&hWG7xy8s$TRrGX|R^Lx%V` zYU40ZR9|H6sdB_-SRrK$G{zuEVURyF`6z*O&<2r}*%>y6Qag;bLrblYh$~9! 
ztIiS(+kzbm=6f0$$!Cis)THuLaxW*0zSEo8DVYJLEcwWEdvGKaa8C;6AiR?^Z-gg z83K064t8g9vrOA^64Wi6yLIz6=~BQf^A~y34t|gGvT!yH&y6dNvz9(Jpf9v%ds3WD zTAfKf47?Y;K~v?ZJUy~WLYlL2@xIqIx+R|(NKKvcRDHsPqZqyYyxhWMt3^}0#B-S5KyzKV>3X}8lR=xR%9P6Jj0`r@r6^c5 zhA>KYj=Mh9g?5(Xvh#4W5qoird|ZC2kx+JgvVoA;n~^F<00W2T`Rug7F^ss|CIr;_ z`l9Ca0&U{LYr=<%QFN9$=bc2B&i3i-_Y#dy+O#7!F12i(gtyMR=9@!>OPC>3AM^+E z>jDj_d_OdUQMqGxP`3I%=&kTUohG3$2*bq z!tX(m?HlJi_70@@jh(21+uvB{(wZ!qbw;cOMAuV9?1_`^eGs9;bedwHZq13JJVyW4 zC?mM@f|ZaMR2Wyp#a7-!VgiZiV8%FL?}4boI(C_+m{m17Bgo|Ia;1tB$R-Y(##6577pRh#>)~TeC^frE7Kvs^;ZKGEH?~xmfN9t*;j^@}d(CWZ zc_*G2Cpz!8YEo>4DdP23uz?lz61%cnn4`-xDqRv~ATacNG0WZKwT0HQJfuw$`WHIW zV5)gPoG3-AY4L*ke$y}0{q|#I8>VXw(Swo^vJp@CpTN+&35>K*RZYk8JlTon30SC@ zL2uy`m@Ne*_)TnAle(O~XNhVQP;6rN3nB}pm157*4FZF-Q3%`%1Q0)IdN521auI~T66&5wqr+Y#0#?wUrlZ(Fwu z#Pqe>%P)9Gr(fnG6c4=K*g&EZtdN|tWZBM!UiHKi_2kJz;G3jN=|}Zn!@blpA|)&i z2PPQAZy|C+mt~?`JGRi_eBs(`(JT{hLTtTb2=XdUsbC5X3IM`3#wPudGdE7>G-EvYNO!Zc#ojNo1T$*k+7$1T=>6_|CD~(w} zY`&MJR<60(WS|r-%t8bP6{qaG%U5Vy3uS&tBSaQw;S~fOp$3QktQN@ir4G;`AfkxL^j3d(3!U0p4l7El2#OK#rKrwB8)k86S9oEa-{sPvM zg@s9~sgLrMDlV!2Q@XAO#@NTJH8N$4uMq7E&fCROtuts4?94iq`4S)-otJT31>1!gwb06ld6zK0VloQBhkAZ&6uk{fFq!tN zdQsdH$@VnD!~&V-UF|(_Y%g2nlQfx2cGdYb<@}7xpe1vFk?^&LvWMxn&&J5HNLD6j z733(F)nHBW z<&1@2QSmV>;$7rh_Y(I1z=acURl)xMId!hw|6V@)AAI#+*yn${KE#~E20kQPIZ2%r_O2Re&LkCgcTnbb zoc9GiB<}*!{FqOza24e$_?y-AN-D|`{sX9$inN`#$L#iOa5xR;+#*!55W|~ovBQ1V zg>Mt5)RzVh<;)zP0%LS^ARPTl<105i;^pV98~J_dOP=kujibqIUdK^#-ry2A?T!fH#- zJsGGkRqq2F4+w#Cd51O!*oRIExCDzl_3rW$!AmkW-$VB5Z+9Xea@Iv)UN%q=&!oQW z!JfY(Yc6fS-ViDBNbYpGd*^Oo2H0ue4^2pYc|kzX@*EZ8iaGhQFkso{hatF^r13p5 zD(EbM#ZKf)z&^GhQf2Rbi9`Q}bkV!)PM0RMqt<@MESQ;%yD65NP(|tD=A^X$Vf`L~KuJu`c=1Oc) zVP`^5FzWA6JMLX6LG52qDteSI{-yEB#h&-+V-*bF?U*G5D%pYX^Yiib%IL5^OUhPp zbO^Hgt3l?58L5KfcBiTN2k7+`*oo68zdKTXK3rUSoKfc$5#d897mKn#l#IWs?JyR6+8W@;BsSevi#OJgUjvKf0CVT^v2Fg8N6Kr#I7+nU^C3_fkE=sDIVZ+%R2{=P=*8R5@vD zx_YDc=8s!WcCE!v?PvD*IgSzk65d_?;3B@}wmB!pzYO`ycpbumOHzIcI``5&ZEG*x zgxbD-d$jau411iX8fL9g5*sTCM&nIvV{GO1UFmYDJZ8)MH9N 
z#=*r)xOUMPAgmG?7Bg&^%XKft1B$!x0d)Wa0J!72y`m1r_e|Wk-BM43622WiF9ZTG zoA|BoT`G?kB>HwB!}hx$AkGI;l2UYou%Gu z_%e~X7v=bOY2Cd-!IcKj2;W=7IGM}V;uaEtN}g#T01PyzG(6g`gZzuNfLT)UEz$C~ zAHG(NpNE=}Uh=J&KinVQc2HCSJYf?~u*{B03;;jWOZ=XTxj8le;v`Z})yGGu0XuxrDH_2!bW;tQoIq{0 z3V|2#@xn|_oY*KFUV#LY6r{9m$-fePOrNZk413w!haD@K9g_>Bt-IIobtj(jCrOLV zhU#cnxnJGIy+0T1Kc@HlQ`QsUtjk(yKtxqLXg@9S#Cz<@>)T3^=w$W>88km5 z_S=ED+^5HY)Sly0Yj25sf5Sz4sCx!u`$^to&gYrVCXpMt$s2NCeh58-NnYk`0_qsZ zdhALvo7=;svM=g9CQoaz6Wfw!qSk5i?#Z|XsR){%an0>OG6>L&2XH0N6t&^mF5EVy zNNaHu8IY+4%bJnYJ-u!nX7WT*A|Fl)INuYRoKZi%kETfU+bJP(|8-aYB@C@w)2t;Z!q z<-SL+9BOV#tS_S|9P~Nrwu6;j;NT^JN%dQ$EG}ibisP(dwmL)=9zU zMkY7c*2$PTDbI(yA)Ct(N=4Womj>=ru*!!krC)tX=WQ%ot*&`@)HfJLZr%^z9xgeN z4o$Ckz7!PQd8zLip^h(5S7$=_6N)e$p!XtIU57bz>6|fepL|egtGu*B-S&2GwQ_u# z$>*A!crFspbf&{1!+qNm`-XTIPTzAyZH|70^TUq_pipC+w?4%_3&Wm2=&)n)TN`NQ zm_ScavsrY>rM<9P=rSW>Tt;X^;megObM+|>Ai346W}dqH612>DC$po7c-CI%#J!&p z{u64NljNj;?MWM(YjSq4K%8Gu#P~ZPF@zKC9L?Vw&9;6Bk8+aWof?StEhC`QFPB7v5d877$P(i zTgNBTU}M6()c!sO(7`Z-l}BHgujzRF!!ZJ&=o^~m)AmA|bcag<=F|9P&X3Omn$K_w z3IUe={@i1*(1yHjCWN}v(@@Ph{g{gQ!tV@xOrXWyHr^aFYH!)rA|P1u(Kdgn-#IM; z&OFF;rkG2m{&^r8Y-iQm6^lWg6`jd7Hi;kqrx?W^sbL*U{-epGtc;6`)3u8Rwa)t- zb`_9mbW*Lk-gipEi8PI|68tBuIYe0NYQ2q&WQ!X0!}SaGLka(kN2-7L--V@(NCY-^_Qkh986Kr&WvB<3K%M z^srgh>Q^GffqPvlWLGArM1EJ~ffU$wB@)9SdFw4wq&hl5#lt{6`dC+BOyJfPX$+y{ ztzm3;w)9=&?zPsD&#t2%7TO51^!UXC@^hZKJJ_;@uMhB*Kj;uY315yRm zaagUN^3DDf*y2mN8u$}RtYj;<>Fgfd*n!Bz6BPBWvRAvza0aa3RDXT%rB~}N>;nA4 zz!S8)CcM-nR0~WE%{6xBcDDB7wNRI@Crc~O!(aHq-$eKi+DwH1QAPr>HfuH6XA18&(lXTjWaP))V_QIsQfpWZ zTPw7fTS?bMHRX*RJv6NNGV2Ir^8=>k!8ow>QE9Y?6~Lx2ExbfpeNUj>ct4M zn_oYglMK(3z2mM2M=&yH_%d1TY$W!fq8SX{BJIobb?MG1esq(bA{M!=hLHR!79nO`?xs zh+jltOhnChuZ4yy4d6yZCK0APLy<|qzWOYk3e^jKEFyFI_rT7T!}|3@9$>uPTl$!_ zll#PNi|9*&X_A2xa_)+I~^Mb$L4~GMNoDK00UUk4kz_dv7@5w9@_ift_1MmbI1pq>j>? 
ze=6kC0n0oH#O(1w9Z)uW-}g9sPoeut^Tlg3M>oIcEp9)dkQ}DYMOf~XewzTMGZ`=e zQ@a1hm2+>o|ZSM%ft=Ux5A7+q2^p~nhSqF5)X8DOpg!%GX-%kb#xt{ zyZ*i5)j7Kk9K(awd%f|0wvd0td!ciARc+sJBy_L$x28ug^nOBpj<{v<*TlN(I00}F zkA89KfYW$11My1N?YV=}9O)k;U*9qxQ32DX&W5_%m<#{mx`z5e?3Nh?`DAPQ>rG>{>+LI8;6uV1wszAH(A@j{R7FLai0wKYb{p>ltfU z{@v>7)t8R9**{QkD*vg;zAxfI`)GLb5A>VL{6AJPz}{v;wmtMwcV9Tgyr}?gd-$bx z2jX{}kPBUhZfS?}{(DWe3+dm+)eP|j5?|>9qTotw_LQsI9P)9{=s67#1-%;t4+J7( zN_)*2he-E3zXLDo2VebyFg@v*=3ae@9g;5~ejH7N-<*MQ6CU~l`=&g95V)Uc^qdL^ z)3`L^r7N1BP%jKiRw{o7g;MvHO(nxOZ}a*)wVzP$E2r$34Id52gS%!xF?O-ka8-Eb zC)V+@wySx?bLS@h#1DJxB-S7K_#4Z)w!BT{W&1DZwn}~=S^5d}>^S$<{GpWY+E1es zh+Vc7uTz|N^=|ES3y0T8_Zn(W$A3oGDigZxowNH-s3-m<19~n081NAi*_8gBJU7TU zKcTi}6bAiqN6$g{cKf{>pa=YpG0VM*?0=*T!f)vg8>#oOekHq2mXFD0Ey;lAt1u9l zSi>uyfTD3_9m{}t5lAGclEfXy)Xsk5{x4m+s#m;$o~#ZG_UX_9qXDa}V}=K~E+xqi z$VHE%`CYBreA@$^Kk$N@{KpSUK@2P`k(F2GfcZK~O7v=e>1@x^)yndrV=eI>_MxP> z$?#E)tp)>=ulA2>RDKgUe<^QW;X4Yr1ug`*7QBVq4?*|hVQ!_Q#82}IupPxj6(rL?rqnp|nZzHa4JZ8%LBPz9vq|%@ zu^`4)humMLkNGBYuBI^X%VOHD>rD7SGHQ)$xk9~HAKwxIl6CysIZq@hZ$JZJs`ut6 z)RZKpntvP6+D8wd%U0FEtDE!&D6RUt^qPU@*NQ=dpHNu7(|7OuO6Z!Tfk>qO)au9K zuoiO<4@d|ij?V67OIaAct@yUnu1Kv+il3{*hcG|_2IMgX29@5pb z!s-)}XgAwC^o|_TDUq4{-krisvIQ3MSNrCbSHS17m`I72o`29toE^|n*M^hshhTs8 zA6?$|XL*F?8@?oVca9>!lS0e_0&^4#mFTZ78~i&lZ!m61XR?xs6U$<848*=(3?5v4 zaQ2hvQ2(*$9`LK!t);_ySo=??IF>0cW`R#xfZs(Q{VuffDS0m~F8NbRJrH1`4|x7# z!2gp8SCxR)$nEZ4cjSmnaq$qYDufDDl3SsgU-&5h5>YZ;(o~<4#9r9mp*8%3g7z;l ztdx;JgjDw+5A0#XKSuP-vi1blrcnXfZZoZTtA>z>lwn7Ht7eeL5#@jzN7EP-ZTiM=1y zw_r+$zCD%3Bug43b6_c-jwbhUb)n*ef%>9>=+wTx4hx*lOe1EKsR9-c<-*ra?>W{4 zZHOYcM)Oe)u1ff3OKrErqzcxg#|BhCpabh)eZ{;Zn{UVhl%pCI;3GDmz@Xr6{Lch? 
zHfsD%LB-=)S&lViPSwWz=m+4~srSHj&WPOO{O2=uSbtmyzC3Zq1EZZ9FgWrHFL1Xk_DzzBkQiZi3;gV1@MX zLTuPoLw(<~R?F^r{I#|GmeYX^DyH zVLv8kQs%JE*|0Jarcs9RXS4%yWX)!cV0aFVf$HtIt zf1(a%Q+@ZF)Aeanv#q*w#2?Nt_mm;*DU8eo&qKQLSe%0)N|e#;)-;?}hA6DBkYGnx zsg=6$b~hL-*lV|4mvfzX5}X3FdIW1o$dP?Rtj!{a_j+K+nm1spFSqPmpDof+J>iUc z`f~Jsb=$#W_?sN3JkBxC|Wh#HyESgK>E$+JOAzt~^3z=!4_CgnMttXxWnqHsNKz~Y24nfy=yZoc3Af ziN(hZoH+_+D!xRigWtYmD zBN7wCd+n+DcvCALzmsNesyx2VF3Ec>&gyFFLlU2QkvJJS*<{Y{aI5f!6x;kNN+TIX zXxv~qCL2QI`2|(k+8@S+MVQkur-K#ooNIZ+M%Dvd#l?%B59lROtn$-^pAJEcQeE`~ z^Jf%c^;8D6Mhnr02FszXn^3segm;?q?`xWIcLvoR~ zKS~=R!HY$f)>d<6IK^R1dm15rkgqCy^1^a<49^G_Jq%P@$mXw7(=o7_H8jwacQ3F_ zA9|ZecE#l17Lekf;`HL-sPUr0Mnsmhzr(1+bY@#~Oz0k)AiVY8;kyNq{N9Fqya*lO z*Zs|;1psn>Qwjf<32y=Vzu!!a|Ao!nzY`G2{f$)68q7lO$?(t){C)aw_M!i|1N?g% z`tcg~e}@D3k@fMB`0>}y?r##s|5a+ke^4*~OJ>7|x?jwPgue+T|HY;|KxO<$B>ab- z`A>iUJrc}+j~MxH^vuAv_kYNY3~Y7-4*mByFl7tbMJ{7${I|A%S*Np2QIvW@Xr8Mg zaA7QHVI~T+{~owER`40fmFCe*Ji@I zOlQ=OPwx(qq|BxoQogw6Cey``4`XAJL2>{yNge(8wpf*jznPRzhY-fQG4!#A3>N@c zpa^5ys;{wr(I&q{Z$**UK+iA|+$WCAv5xqX&;^Q)n@LhOIVVnN6~t0QoE9&87GVqN zqf(8B!k@8T%t#bvW}A2Oq>c2TXZF)`cG6)VkDrl-T+c`M>dzXHMna zsOo+DZd>-X8fBamE=TU>vsbdRl(0^v%RO2!+Mv`WJsWRaYxB=CsYJ@s@7y#$n}I9` zX(_AfHOjo21&1jtePg&cSWzs#|9B>ULGBwSyNb2SQpyOG`VkimA|t+n0FveEa;BNV zLO}p|_f|_V+9KvuFVsm}P@1t|O0wPdcjJxi)0m5Gr>?z`9iQJRrv0%7kGv^m2au`r zI13rS+DEr)P|@O%)+!aqbDttt;l2x?7{%cAn!XCcdf7khC8cHR!@Gq@)GsSHKaQEz z*fqJlXKS@h+7DHlmk^WqeX(0}pWBLCeI|X&i)tVxGE!fti{16Pgh*|M4MDA z^$T4zD^I<${n$K7e#+(_^a@Q~2wR2tR#e)&JXGHeP4Nk|ZBKbB5;2io6ZN;RnvElu ztFT`05?K`=GDHiUQExGY zDXcj9raSLXC|yeF@~7biJs!dITu;prxAL)G(D0U6ML@iw<+&MYQdr+!J}>vqOVqMp z+2eM<){CD@)#S`B^yyaq3FVYW3R8LckpuM%MMh+4hGc5ka##GC=Kx#Y>u%r%uH!?l ziso;6T{_&n18vg5)DarebR46q76ca0r?hELCqYlNj!A3E69*85tho?y$eE@H&11FJ zqiDy6Gzf!@a5_UkSwx{hY+b(ktJLtpj;iw5MTxV}R;ZEZTWj4qS{lkuCf0G?uy;;1 z2GM$B4{%bq>CQJrxU@WtfkOSDt!Tod_QQZYvyf^oqYu# zUAf#udhGH{)+N->+Gf;oh) zhwk(`)i_Wx<_V2wautW*dT-+Znt*gxUO`FRIpqzViU}Kk&1e&BoHvyF^GEs%I_kKA 
zKE-D5NE0XtJA_*9I>%VnzZ)_-b7lUXCuK5e?9yl6(MD%t&qZny#Z#Tz zlMy1Brtq+e-gQj?rg@6`?JWX#>}I{?z4B_Z*;iybfxajdty5igtymV73-^#$$ud&= zVG{3C%*@ral>DpAlG?wiV}ZB@O*+s)u?jl23N+oUI7O9TPA;r?r#~$5S6gW8W*o?~ zNb*Xoa+`_4cg;d1idQ!U!OUxx>D2rEIlN|Jq}19zW{~GBV+J}=wJZi-xw@K&&yy*| zs%(>W$b-<0YlQzA6d(1@1(hIf zj~XfDlsfOVSBpSv2gksyqwR?7KEH!Y%qiV;cRH$5_rQnC_$ciSba(tvGIZLPcEX>Z z7_))ib4!JO2$aR?N#9Y5L*q=}Fo6?Gquv z6D+Jo%D7(?w_ZlOqKxHVkcKQJnu3P= zGYnE96V2vKPq%G7Ot2#>JD3{)X*!93@i37w%J+H9IFK#lXE0+C+axx+-y$ zuZy2flfiWKdbr>@nwR)7GIj)-R=p-Mt=3XB*n=t=EvY1Ckt2AYT|mJ4d>5EjI=VIrH1PeC3|Xqfv$mm0$<6UEDKbr^o!SAWW?-b; z+&Q}0;c!@<8MA|foib>)mM1U5f_mW{7U}>b(Re|)bbNO0F82WD!HefTP0oz$)oh&> zXJE=_mn_{Wg~M@S^Y9c}i>_ZjB$U2%eGZl??k}6rJV2nVy8_6bSUAKuz!}A}vqPmA zrb(&o2K}?Z`kB}%&}b0rLIuQ;m=u(*}tZI{;u z5ivQ5vKcH`J%Tfznl&AWp>w+Q&!S@UC~d2jd6U=;dOo+6l`618_m?JciAf}-<~-P5 zudFByha-l^hGg}?mAsYgK|%oK{J6onjoLnwcc&vu827-c5J!qDQ1rm2QM|?R(C9gO zff#Wc1*YO9XC-G9p(@fCQ`~u_NlWhc6l!yC^3!nYddQn$Gw4xf@H(evt)dZ z{O*rCC(V6nai42kdTFS?pf#u@3{~MX*Yn$M#zJHaU}&SBB%=*rx69Ch2t}K@d#{O> zB>AklS!=(G>ujgQ5cG_=`I-->eXAoV$gjWwtl{ym2w6f>8_(mtDQ}0 zlm}ro9-qdUppMzDS)#4702|nv3@Krz%HFl5k}Ex-$4%RMU0vtm_xTH2neM!)`E*$~tcT;wOzdGS(%I`ZYf4w38mmests@%z-%z z9p}EP3Z?L_p_qQkITR&oa{_Hl+h8fZ$!SF=XT-M8jykuD05U)*4}q~4v%~C4(cn`J zu2zA7VPZa>?_2v*3w@)&>N&n@9@m4?53fwt`Pri+6PTssI!k3i6pgZY<=xF-Sz_AP ze%u`+Y^DW18OG@xW<)~#gXMV7u77maaQ0Kstn*DK6A!2WGa8KAgy0>i+8A7=nbVig z%lUFy(*r*A6lJ_p$Aj|AFQ=1;NZVbsRQX*nKP~=(_EG3mH5>6dgw6aPM{Cg0@TOvh= zlx^C~8D)poN_BQaO#TQVfw6aIT8xOqF^iFi^fmHk*Ay-RrR_D!U1OWRQjMR90kFYt9Fn#M8y{AnJ`nvCZTZpAz9?&o4Pk zr-E&5l-ND0(~12`AJuIeur#U?AYjW^iitJkUQ4}WpYI2Vv-g+HMBk zV<$>vRdKxJ$%l(Yl}jsH2^{69Aq z{Ou~D|NR>a{#$8J0fLY}$F(cCZ>+nN7tsrQzdgh_8cmm^Q zAOEqjs?YR&W-B|nOp_^QmwH4W_JSdzhKvFuH)q@%xA`YJmteZg(uG%=sBPv^A}@Z- z@lk?41-ck#v+->e6YMj`C&&WdE^cq5ddtM5lw-I*Z9*x8zJ4P{F^xTl2nP4~3Vut( z7=~Z#=T0Fu<6zQSZG2XD%gDZ6mNc?K01*_AYWCJk&<{+=NC?WpG7M)!xeS1Tu2{34-$)`a96z_I; zR18YH%uc@L2h)zGbQI04Y&j`enCp%EWu9S+-w(w4;B+j18WI`eBl`*EuIH`Em7R~e zx$^NP+x{8DHgIaN$_zx0hG3y==yjmZOKE=5S`Dqo8 
zQZ@6*Z4f)~)e!{Fr;k4iK6=K;YuZ-v1l^g%4C0GsZK!xg;i6BWGZX~hVwJ06+s*K{ zHWOk=Crp6|iu8OgUsO!^6~j`pEk=%T8q@7-37>1YwtYUMr0o+r(2^`y$yh0pxvkz? z#(bY;ajJo;MkzJuibASvxGkdC&bZAglzIk5-PmY1Pcz|8^bFC2#4lrx&_WH(V1cHm zY`Y!oedp^V9sHad(|Y?{6Tt;fm^^yt6b{=?XU|KN%?7%xV!mj$b0G+%zej>e!pQ2; z8_w4yMxgFb1oCve?h>I>X-lVOj2Nk+DzSwRt@FKR;atpf!)!F}fDpJcL< zQLtpTBDXk-rRi;xvBrv_Yj+vf`U-eQipgHIg|mUOT~qhvlKXe%u-5SsyFBytF5j{r zIreI4C_U*t!i-E3-|%4;wO|$fvO>Xr5Lt*Oo|aesg3EaQ;{gd%`#PA~8N&jZA(Da@ z5zSAQ{9CJG7w<&svcZHz!Mll~&%?GJb@gZCDkME7rC9I3jStR!6wGb3)wJX3hh7J> z*m4z@y^LIPC8u`wP;b1(so)$;HAXJdUVc_@h}NB&Or}T$N69CKbvR5PX6`CO(eEUL z{x(W75J^vz{S%mlH)BE(&rthnRE1Iddt~Bs!&Gxj1aVyp_CCknDt@ zOw92FQ}0Gax}vEMjcV^iH}piGkIT~cvla-5BM#MgR5;nB*O~I+fTFq+$OFL}yX>lW zu}LOXtF^?5DMKfu`~*oPj)9CnX!Kt*lS6MkBb0(DNfC{vu%K9pX+QOQQ@h1IL%=2uv@I1(@E~>_D35Tt%w>xU z#3$BU(91Ei^vtrHz1>uz=oA#kp0GQ06~4I2Qs||u z>$a)^VK#4u7|E}hib0>>9P(@pvszu^4{&i1es;=u!8tm@Y{$-Mo|6CB;436MM>cm) zr82l{X}+=hLpZVpl;x~jQ)k@);jBhAn8I1nrn_2S!3ej{6d=xE>U6Z@GlkwBwUtf7 z?Z^KMdrE{JNGAo({7?HcuzTXMzN>fJN$ zM)TPu!C_@vO_E5Zs&u5us6%-io%i4J=prZwH!2mL$|v3<#W4TKN8iAHi_OhRotIpn z%E2A-vDKNNBVZq|W|ZKhM=okpGuhU-h)-BgH;#0piGo!=oRAHeDB)L`&P*4&jU&k0 z5QI`p;JsD?1%*0KOI_@vefTz`uQIJAP0ax>UvMYZ1MHB&R=qCG;D^sFWnODF=lHCx#MeoQpoEmO!U=E7N%PMcn~Jfm&pA-Zdpi7W1Gf zEyI|@j(qDbK*TGWrn}m57qV%%A%KMZY(5&qAG`b4Hah> zG^SWb9BILJPry(!mcGpPQ0R)3SglJopB+ua%@udr!;~UDLa@p?hxt9)+C+~bX^~4B zdiMek3f>$bfSIs1fBQT(;~c+)w_bwD3!%QIS`HZpWM=%ebxxi@u|PNWFb+e2zeu#4 z+%CA(Mn0!(dyJm?C@*pi!W8lxl!arbY_)*9Yx&F=l4^*Rp2vNw}8% z%h#{Dc`zO9HIba(7RNPJeImr1?bhV7rg)oc;>^g9X_OaVn&D~}U>>E@(x0-eL%Xdq zdBir3^)bcUn`nyfH)a!AWwGkCTJQ*fKH18o~z`YAlkqDk*-*|a8nt)zsC zN&Lp33@aJtJ0yhAXRwh~&jw*YaTva`rgN7F5wJz0Y+Y>O$0d=baW9$OM2ppO^-##U zB1p=2J9Aq?#wUn^fufWsbfC7X?OI1Cd6h!AzGcfUUQ5PRdG2_^Vx@S_H>I3g zejPnoTl<70Cs{h;xX$ERf>%ov%$=e+!?mV%YInACF!* zcto$wGJ2{P+c2QhKa)XLtWzkj1wn5#0+%~@uhgA}3QC-Z1)iaZTfUWaVxQ^PKpl)- z)S7wD=kzWz-+9v#*X##gJ=6kS{0nwD2MWSsN0}TO=Nz*lO=qMn#%B;z`ixf{JeG^j zW%$vB`X})$oaq&l6 
zo4&%Tlo{qdBcltK%L~hO!cilI-6B+%5)uA3BLeXgK_U0gzfRhLz(7$PG~XgCzW zdBGr`v?nboFU!@IwB|XjO`B}JfH*>4?osSUo={U33Xq9~*cEqeu=Bdb@KR_TRL?Yk z2({Uskm5~*80lGohIbS(ntlkNT4@qnsqxa8D`8Fez!E1f#EC9+6<7F^M%8eXiDjNT zBM_Vrg!c_G58zKBkX^MJr7+cFkAri1Qfn0Pv!!vzM{;YBKO=%Su1`MmHf(+uB{}`< z^KnNC3=Q**m~eIq>q@se7sSB3okA9rgbt_I8JfgM2XfD-gUi2{vNjKt300YMmp>8Y z32;0j>E_FPi$v)W+t$*ZWw=OWhBEYeMuju?WkB3q&RRe@<)sjWyt(ndN^@Yy7%h`tcs? z|IX7I|7C&{X~*QPDs3V^<#}ji$EdbC5i;oa$X`C|jI;>^D$_C6AmRtZ|5*Fjg${d( zvY>OZV)42~W(NT|JnJV^jo@Tq%1L&yrQrmZ_Nn+1O0Xiwr>0wuVOh@@{F@}#dk)pj z&}QOCEZ*o`mbKE5$LM5DW z|B!0sBo3fUd(a$T$6AAQ$4z6@n6nG`^u%kqUo@!mAo+H-NpkF`i1EY zSDdQX0e&e8b=cL`)SdBYW0q1{d4#OO=SUTtI%R)GitvpP8}vF#F5Hy}L zS7LQ6epAai54q#sN7=D{B9CqaK+ms(thMw@+a!4W#DO1D#&nPG9?_kjI{YnQij{;v zkUW-8kMraCRh~JU^dj%>_T5bk__#zb;sgLn{sK7rr$IXGk5x!jc7-k#X&oNj4IDMr zcc?6vS@E8fuDgQ=T`q}`-2p&$0Lu!Z(D8h~X#b-590LFn zLeu$P%KfD!&}vey&rR4(++(XjCxC`Pt8Q1fuDE_H1|NLeSI*JpM(=G)JW<@bw{Ip+ z04v|@O};#CsB~0P1!_|26CwR$hx5^*^keUEuQ<8m=6_V>XQw7OV8c*Xy>qs}(kM;?58+GGnZmw4wc0cbGbR>D;PW8J#HGiftf;_s(+A@Ivh z4gs$170&sOp6F05Zo8KIn>r2|kLuXpSN2r{54in)0AOn{)VS`sFKz9)mn#H<#c@APkTvL?~bjXL%9e5r}ZapcQ{+-+cOtwFCO`A-^E|mq}ER_ z%y=9ZCp_E~5DNc<`lJ5HP2F(v@;>o7XErc&;C>lx6#9&$+}Y5^rp$eE-rr~Co9eHy zoH$7}7#z8&9o9V59~atl6ltgX9UK6J`xK|#S+ahi!tq!w&wD*ZdF9W@X?+k79No$A zpFH&^m49XTC-t>fmI@Bm|Ag{l`w6AC2{3fNNIU_&6$zl7kuFE4Ee;~0u2FMHRK{nftr3=&{3_({OiaQ6M=~&-O7MAEi z*6XIwGX#;S!cD`d->w?DsT#bvt_wXwjv-X&0}!@t6Gp*FQ${VJ9@ov_Oc~tNbAbnhl zT(@vr+xtW0)fexQ-*O*Mea}1BYtvz5|CG(!Kf=et4kdS$+i#VI0)?9G!B;?r0N}fF zdo2uZ$IhuNcUCk@Y6Q9f>e<)I^g9c=bqoE9#YCzId)US9HO*soi#TZQE?F$BskdSJ z2^D2FfoS-cP~nyU6nc}KJMG{?LuT;!X2)Z{SpfWPDSK20mc`u=&3u0Uqv7(1U3Yvn z%1UkT*6Hz&8BL}gfL@gLFZjfZY&!~XCsRM#5`ajL{9o-|YfMvT7*3Wrr%V|P zR;?ODL8Qc@gv%%cm5H;oFmrkekxRP-ozkf&p+H-~sK`L3jsnthr?!@!4!ccC1zU(t znbaHYQaZU#ut3iPK_IQ9AJ55I;HFX|7ICg4oaHZp`*QQ^i+YXK8hV+#dygt$T%WF*K?^JGP{$*_# z=iKg!&S|w2XtTv`spi{>Qz4MY9qrQ@mw8A2#iNI7U|)S1CR7eo|%<5_|b;IFD%)qC4pn z@ZjNVr6+E84%$u11Du~kK44RZPhtplBX 
zc;90d9WIESxO(cwFvc7M1WOe|cVmPlvGVo_%do<{oelL(d!7QydCmRl7fF@_7VRbb z&9;|I>pxlhSB>G@tbm`t{4Q%FVM4*^Rv+UDV!nchswSZb zRk}8^ais(+Y964al$EpWPljz8`yb#AGgM?sz!LmzZ4c-nC}Rvs^&Z z)>y9 z&g8N&N&g652gM^-2Dt`;^&^{xCOw_9rJ#XVje;iWDf7lgYTVm6jftfQsU=_MGwR~! z9%Xn&=BHoG*s2L^!xxd{WMx>>c+^uv3JAs({zmQ30P^yO(kcaw?~05F^K~hhfC!2T z))|N$xM<$AQwf6g#w8#kmR&`D7h=(A`H<3^EKSX;ouoIAnef?uc1CNL@=l{<*^1~Q z<2oo+TJ=V9lqgkOj9h}uHE9qDJuVnk1!^cN7Cpm=Ysq_91AF7rHh2dC+fi;6Luo47 zr(xMMx0o);f?f#A29;+YkR2JLDu_w!@ODU2ZSN6d@x=%w5X4*R44-RKR8R_;N%J*{ z2JmblLt-%)h-%Tm_7&@S3?%}Zki|}UIz7-Zg=I1jzE$HY7-9=e`I>Zk1w!z!12-$d zUh!(-{V)5BMR#SIg9?f<6l4Ek(V|~$=PwjvXdYXj4bY->n<&Omj4cpji&kZ!7(+3( zK#VPVw<3x$6k`j-*dOTsG4#FiJ-=5jx+aF^F*J`YFpufbJcj16_ngO2ehlSc-!})V zmn_fUG5fnWtmwSKNDz=ZY9imV<)F1`QOqpQo+{r)RfJiHgujuNTs%c+8)4<&5zs=U z)(1Pss?U+~*8`_;#U8Yxa8hAETs$20t_VPMb(-h;Z3OQ5Pr-c6NA`%1N+##(uxwkpryuM~V?xy|s|$&E^O>-A z36ZJ#*lidS3rKmMNn+PBW(1{BChTaZ1{H#GWH5vgTOeoYa3Zi-jxZFN&7?*5vTLKj zG@;cjN`nKdmuV@yxffI9Qwypd#ofeh8kSuNZ*V)CB4ponyzhTRGI#p6WK$Y(R>!t(_~ Date: Tue, 26 Aug 2025 16:52:56 -0400 Subject: [PATCH 27/58] ONNX version 1.22.0 updated ROCm 7.0.0 (#524) * Indentation and formatting updated * ONNX v 1.22.0 udpated --- RELEASE.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/RELEASE.md b/RELEASE.md index fa784e6d0..3395b070d 100644 --- a/RELEASE.md +++ b/RELEASE.md @@ -108,7 +108,7 @@ ROCm 7.0 enables support for TensorFlow 2.19.1. ##### ONNX Runtime -ROCm 7.0 enables support for ONNX Runtime 1.22.1. +ROCm 7.0 enables support for ONNX Runtime 1.22.0. ##### vLLM From a7edb17538db820942ce83f390711ae9768aaadf Mon Sep 17 00:00:00 2001 From: randyh62 <42045079+randyh62@users.noreply.github.com> Date: Tue, 26 Aug 2025 16:02:49 -0700 Subject: [PATCH 28/58] Fix hip7 rn (#523) * Update RELEASE.md Update per LRT meeting notes * Update RELEASE.md move warpSize change as requested * Update RELEASE.md update warpSize change wording. * Update RELEASE.md * Update RELEASE.md Why either? 
* Update RELEASE.md Add content from HIP 7 Changelog * Update RELEASE.md looks good * Update RELEASE.md Co-authored-by: Julia Jiang <56359287+jujiang-del@users.noreply.github.com> --------- Co-authored-by: Julia Jiang <56359287+jujiang-del@users.noreply.github.com> --- RELEASE.md | 16 ++++++++++------ 1 file changed, 10 insertions(+), 6 deletions(-) diff --git a/RELEASE.md b/RELEASE.md index 3395b070d..3503e1cd4 100644 --- a/RELEASE.md +++ b/RELEASE.md @@ -884,11 +884,12 @@ functions added for logical reduction. For details, see [Warp cross-lane functio - HIP APIs for `FP4`/`FP6`/`FP8`, which are compatible with corresponding CUDA APIs. - HIP Extensions APIs for microscaling formats, which are supported on AMD GPUs. * New `wptr` and `rptr` values in `ClPrint`, for better logging in dispatch barrier methods. -* New debug mask, to print precise code object information for logging. * The `_sync()` version of crosslane builtins such as `shfl_sync()` are enabled by default. These can be disabled by setting the preprocessor macro `HIP_DISABLE_WARP_SYNC_BUILTINS`. * Added `constexpr` operators for `fp16`/`bf16`. * Added warp level primitives: `__syncwarp` and reduce intrinsics (e.g. `__reduce_add_sync()`) -* Extended fine grained system memory pool. +* Support for the flags in APIs as following, now allows uncached memory allocation. + - `hipExtHostRegisterUncached`, used in `hipHostRegister`. + - `hipHostMallocUncached` and `hipHostAllocUncached`, used in `hipHostMalloc` and `hipHostAlloc`. * `num_threads` total number of threads in the group. The legacy API size is alias. * Added PCI CHIP ID information as the device attribute. * Added new tests applications for OCP data types `FP4`/`FP6`/`FP8`. @@ -898,6 +899,8 @@ functions added for logical reduction. For details, see [Warp cross-lane functio * Some unsupported GPUs such as gfx9, gfx8 and gfx7 are deprecated on Microsoft Windows. 
* Removal of beta warnings in HIP Graph APIs All Beta warnings in usage of HIP Graph APIs are removed, they are now officially and fully supported. +* `warpSize` has changed. +In order to match the CUDA specification, the `warpSize` variable is no longer `constexpr`. In general, this should be a transparent change; however, if an application was using `warpSize` as a compile-time constant, it will have to be updated to handle the new definition. For more information, see the discussion of `warpSize` within the [HIP C++ language extensions](https://rocm.docs.amd.com/projects/HIP/en/latest/how-to/hip_cpp_language_extensions.html#warpsize). * Behavior changes - `hipGetLastError` now returns the error code which is the last actual error caught in the current thread during the application execution. - Cooperative groups in `hipLaunchCooperativeKernelMultiDevice` and `hipLaunchCooperativeKernel` functions, additional input parameter validation checks are added. @@ -999,9 +1002,6 @@ In order to match the CUDA runtime behavior more closely, HIP APIs with streams - Event Management Related APIs * `hipEventRecord` * `hipEventRecordWithFlags` -* `warpSize` Change - -In order to match the CUDA specification, the `warpSize` variable is no longer `constexpr`. In general, this should be a transparent change; however, if an application was using `warpSize` as a compile-time constant, it will have to be updated to handle the new definition. For more information, see either the discussion of `warpSize` within the [HIP C++ language extensions](https://rocm.docs.amd.com/projects/HIP/en/latest/how-to/hip_cpp_language_extensions.html#warpsize). #### Optimized @@ -1021,7 +1021,6 @@ HIP runtime has the following functional improvements which improves runtime per Developers can now use the environment variable `HSA_SCRATCH_SINGLE_LIMIT_ASYNC` to change the default allocation size with expected scratch limit in ROCR runtime. 
On top of it, this value can also be overwritten programmatically in the application using the HIP API `hipDeviceSetLimit(hipExtLimitScratchCurrent, value)` to reset the scratch limit value. * HIP runtime now enables peer-to-peer (P2P) memory copies to utilize all available SDMA engines, rather than being limited to a single engine. It also selects the best engine first to give optimal bandwidth. * Improved launch latency for `D2D` copies and `memset` on MI300 series. -* Memory manager was implemented to improve the efficiency of memory usage and speed-up memory allocation/free in memory pools. * Introduced a threshold to handle the command submission patch to the GPU device(s), considering the synchronization with CPU, for performance improvement. #### Resolved issues @@ -1037,6 +1036,11 @@ HIP runtime has the following functional improvements which improves runtime per * A numerical error/corruption found in Pytorch during graph replay. HIP runtime fixed the input sizes of kernel launch dimensions in hipExtModuleLaunchKernel for the execution of hipGraph capture. * A crash during kernel execution in a customer application. The structure of kernel arguments was updated via adding the size of kernel arguments, and HIP runtime does validation before launch kernel with the structured arguments. +#### Known issues + +* `hipLaunchHostFunc` returns an error during stream capture. Any application using `hipLaunchHostFunc` might fail to capture graphs during stream capture and instead return `hipErrorStreamCaptureUnsupported`. +* Compilation of kernels via hipRTC fails when the `std=c++11` option is used.
+ ### **hipBLAS** (3.0.0) #### Added From 010a191938fd010559965e91faa7a72162b96144 Mon Sep 17 00:00:00 2001 From: Pratik Basyal Date: Wed, 27 Aug 2025 14:22:04 -0400 Subject: [PATCH 29/58] 700 RN update Batch 4 (#526) * Indentation and formatting updated * Resolved issue for kokkos option added * Known issue for ROCr added * 2nd known issue added * Known issues updated * adding 2 known issues * Apply suggestions from code review Co-authored-by: Pratik Basyal * Update RELEASE.md * Known issues added * Approved known issue added * Component removed based on Leo's feedback * Issue link added --------- Co-authored-by: Matt Williams Co-authored-by: Matt Williams --- RELEASE.md | 11 ++++++++--- 1 file changed, 8 insertions(+), 3 deletions(-) diff --git a/RELEASE.md b/RELEASE.md index 3503e1cd4..5cea39c0b 100644 --- a/RELEASE.md +++ b/RELEASE.md @@ -337,9 +337,6 @@ ROCm documentation continues to be updated to provide clearer and more comprehen * [hipBLASLt](https://rocm.docs.amd.com/projects/hipBLASLt/en/develop/reference/env-variables.html) * [hipSPARSELt](https://rocm.docs.amd.com/projects/hipSPARSELt/en/develop/reference/env-variables.html) - * [MIVisionX](https://rocm.docs.amd.com/projects/MIVisionX/en/develop/reference/MIVisionX-env-variables.html) - * [MIOpen](https://rocm.docs.amd.com/projects/MIOpen/en/develop/reference/env_variables.html) - * [rocBLAS](https://rocm.docs.amd.com/projects/rocBLAS/en/develop/reference/env-variables.html) * [ROCm Performance Primitives (RPP)](https://rocm.docs.amd.com/projects/rpp/en/develop/reference/rpp-env-variables.html) * [rocSOLVER](https://rocm.docs.amd.com/projects/rocSOLVER/en/develop/reference/env_variables.html) * [rocSPARSE](https://rocm.docs.amd.com/projects/rocSPARSE/en/develop/reference/env_variables.html) @@ -2349,6 +2346,10 @@ The previous default accumulator types could lead to situations in which unexpec ROCm known issues are noted on {fab}`github` [GitHub](https://github.com/ROCm/ROCm/labels/Verified%20Issue). 
For known issues related to individual components, review the [Detailed component changes](#detailed-component-changes). +### A memory error in the kernel might lead to applications using the ROCr library being unresponsive + +Applications using the ROCr library may become unresponsive if a memory error occurs in the launched kernel when the queue from which it was launched is destroyed. The application is unable to receive further signal, resulting in the stall condition. The issue will be fixed in a future ROCm release. + ## ROCm resolved issues The following are previously known issues resolved in this release. For resolved issues related to @@ -2362,6 +2363,10 @@ An issue where compiling of a generic target with compression failing has been r An issue where due to limited support for Sparse API in JAX, some of the functionality of the Pallas extension were restricted has been resolved. See [GitHub issue #4608](https://github.com/ROCm/ROCm/issues/4608). +### Failure to use --kokkos-trace option in ROCm Compute Profiler + +An issue where using the ``--kokkos-trace`` option resulted in a difference between the output of ``--kokkos-trace`` and the ``counter_collection.csv`` output file has been resolved. Due to this issue, the program used to exit with a warning message if the ``--kokkos-trace`` option was detected in the ROCm Compute Profiler. The issue was caused by the partial implementation of ``--kokkos-trace`` in the ``rocprofv3`` tool. See [GitHub issue #4604](https://github.com/ROCm/ROCm/issues/4604). + ## ROCm upcoming changes The following changes to the ROCm software stack are anticipated for future releases. 
From d58e2b16db0f6386b3a99967569cad8661c0bb30 Mon Sep 17 00:00:00 2001 From: srawat <120587655+SwRaw@users.noreply.github.com> Date: Thu, 28 Aug 2025 19:05:00 +0530 Subject: [PATCH 30/58] Update mi350-performance-counters.rst --- docs/conceptual/gpu-arch/mi350-performance-counters.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/conceptual/gpu-arch/mi350-performance-counters.rst b/docs/conceptual/gpu-arch/mi350-performance-counters.rst index d42eaa0e6..fe103c17c 100644 --- a/docs/conceptual/gpu-arch/mi350-performance-counters.rst +++ b/docs/conceptual/gpu-arch/mi350-performance-counters.rst @@ -6,7 +6,7 @@ MI350 series performance counters *********************************** -This topic lists and describes the hardware performance counters and derived metrics available on the AMD Instinct MI350 and MI355 series GPUs. These counters are available for profiling using `ROCprofiler-SDK `_ and `ROCm Compute Profiler `_. +This topic lists and describes the hardware performance counters and derived metrics available on the AMD Instinct MI350 and MI355 accelerators. These counters are available for profiling using `ROCprofiler-SDK `_ and `ROCm Compute Profiler `_. The following sections list the performance counters based on the IP blocks. 
From 0665e73e2da7c637797eca1d1b5cda9420a8252c Mon Sep 17 00:00:00 2001 From: Pratik Basyal Date: Thu, 28 Aug 2025 09:50:57 -0400 Subject: [PATCH 31/58] 700 known Issues update [Batch1] (#527) * Indentation and formatting updated * Known issues added * Known issues udpated * Minor change * Known issues updated * KMD UMD udpate * Updated known issues * Additional text removed from known issues * Oracle linux 10 removed --- RELEASE.md | 23 +++++++++++++++---- .../compatibility-matrix-historical-6.0.csv | 4 ++-- docs/compatibility/compatibility-matrix.rst | 11 ++++----- 3 files changed, 26 insertions(+), 12 deletions(-) diff --git a/RELEASE.md b/RELEASE.md index 5cea39c0b..1bcff67c7 100644 --- a/RELEASE.md +++ b/RELEASE.md @@ -45,7 +45,6 @@ ROCm 7.0.0 adds support for [AMD Instinct MI355X](https://www.amd.com/en/product ROCm 7.0.0 adds support for the following operating systems and kernel versions: * Ubuntu 24.04.3 (kernel: 6.8 [GA], 6.14 [HWE]) -* Oracle Linux 10 (kernel: 6.12.0 UEK) * Rocky Linux 9 (kernel: 5.14.0-570) ROCm 7.0.0 marks the end of support (EoS) for Ubuntu 24.04.2 (kernel: 6.8 [GA], 6.11 [HWE]) and SLES 15 SP6. @@ -121,7 +120,7 @@ ROCm 7.0 enables support for Triton 3.3.0. ### Instinct Driver/ROCm packaging separation -The Instinct Driver is now distributed separately from the ROCm software stack and is stored under in its own location ``/amdgpu/`` in the package repository at [repo.radeon.com](https://repo.radeon.com/amdgpu/). The first release is designated as Instinct Driver version 30.10. See the [ROCm Gets Modular: Meet the Instinct Datacenter GPU Driver](https://rocm.blogs.amd.com/ecosystems-and-partners/instinct-gpu-driver/README.html) blog for more information. +The Instinct Driver is now distributed separately from the ROCm software stack and is stored under in its own location ``/amdgpu/`` in the package repository at [repo.radeon.com](https://repo.radeon.com/amdgpu/). The first release is designated as Instinct Driver version 30.10. 
See the [ROCm Gets Modular: Meet the Instinct Datacenter GPU Driver](https://rocm.blogs.amd.com/ecosystems-and-partners/instinct-gpu-driver/README.html) blog and [User and kernel-space support matrix](https://rocm.docs.amd.com/projects/install-on-linux-internal/en/latest/reference/user-kernel-space-compat-matrix.html) for more information. [AMD SMI](https://github.com/ROCm/amdsmi) continues to stay with the ROCm software stack under the ROCm organization repository. @@ -295,7 +294,7 @@ See the [ROCprofiler-SDK changelog](#rocprofiler-sdk-1-0-0) for more details. The ROCm Offline Installer Creator 7.0.0 includes the following features and improvements: -* Added support for Oracle 10.0, and Rocky Linux 9.6. +* Added support for Rocky Linux 9.6. * Added support for the new graphics repo structure for graphics/Mesa related packages. * Improvements to kernel header version matching for AMDGPU driver installation. * Added support for creating an offline installer when the kernel version of the target operating system differs from the operating system of the host creating the installer (for Ubuntu 22.04 and 24.04 only). @@ -306,7 +305,7 @@ See [ROCm Offline Installer Creator](https://rocm.docs.amd.com/projects/install- The ROCm Runfile Installer 7.0.0 adds the following features and improvements: -* Added support for Oracle 10.0, and Rocky Linux 9.6. +* Added support for Rocky Linux 9.6. * Added `untar` mode for the `.run` file to allow extraction of ROCm to a given directory, similar to a normal tarball. * Added an RVS test script. * Fixes to the rocm-examples test script. @@ -2350,6 +2349,22 @@ issues related to individual components, review the [Detailed component changes] Applications using the ROCr library may become unresponsive if a memory error occurs in the launched kernel when the queue from which it was launched is destroyed. The application is unable to receive further signal, resulting in the stall condition. The issue will be fixed in a future ROCm release. 
+### Applications using stream capture APIs may fail during stream capture + +Applications using ``hipLaunchHostFunc`` with stream capture APIs may fail to capture graphs during stream capture, and return `hipErrorStreamCaptureUnsupported`. This issue resulted from an update in ``hipStreamAddCallback``. This issue will be fixed in a future ROCm release. + +### Compilation failure via hipRTC when compiling with std=c++11 + +Applications compiling kernels using `hipRTC` might fail while passing the `std=c++11` compiler option. This issue will be fixed in a future ROCm release. + +### Compilation failure when referencing std::array if _GLIBCXX_ASSERTIONS is defined + +Compiling from a device kernel or function results in failure when attempting to reference `std::array` if `_GLIBCXX_ASSERTIONS` is defined. The issue occurs because there's no device definition for `std::__glibcxx_assert_fail()`. This issue will be resolved in a future ROCm release with the implementation of `std::__glibcxx_assert_fail()`. + +### Segmentation fault in ROCprofiler-SDK due to ABI mismatch affecting std::regex + +Starting with GCC 5.1, GNU `libstdc++` introduced a dual Application Binary Interface (ABI) to adopt `C++11`, primarily affecting the `std::string` and its dependencies, including `std::regex`. If your code is compiled against headers expecting one ABI but linked or run with the other, it can cause problems with `std::string` and `std::regex`, leading to a segmentation fault in ROCprofiler-SDK, which uses `std::regex`. This issue is resolved in the [ROCm Systems `develop` branch](https://github.com/ROCm/rocm-systems) and will be part of a future ROCm release. + ## ROCm resolved issues The following are previously known issues resolved in this release. 
For resolved issues related to diff --git a/docs/compatibility/compatibility-matrix-historical-6.0.csv b/docs/compatibility/compatibility-matrix-historical-6.0.csv index b54e8025d..26089a642 100644 --- a/docs/compatibility/compatibility-matrix-historical-6.0.csv +++ b/docs/compatibility/compatibility-matrix-historical-6.0.csv @@ -6,7 +6,7 @@ ROCm Version,7.0.0,6.4.3,6.4.2,6.4.1,6.4.0,6.3.3,6.3.2,6.3.1,6.3.0,6.2.4,6.2.2,6 ,RHEL 8.10,RHEL 8.10,RHEL 8.10,RHEL 8.10,RHEL 8.10,RHEL 8.10,RHEL 8.10,RHEL 8.10,RHEL 8.10,"RHEL 8.10, 8.9","RHEL 8.10, 8.9","RHEL 8.10, 8.9","RHEL 8.10, 8.9","RHEL 8.9, 8.8","RHEL 8.9, 8.8","RHEL 8.9, 8.8","RHEL 8.9, 8.8","RHEL 8.9, 8.8","RHEL 8.9, 8.8" ,SLES 15 SP7,"SLES 15 SP7, SP6","SLES 15 SP7, SP6",SLES 15 SP6,SLES 15 SP6,"SLES 15 SP6, SP5","SLES 15 SP6, SP5","SLES 15 SP6, SP5","SLES 15 SP6, SP5","SLES 15 SP6, SP5","SLES 15 SP6, SP5","SLES 15 SP6, SP5","SLES 15 SP6, SP5","SLES 15 SP5, SP4","SLES 15 SP5, SP4","SLES 15 SP5, SP4","SLES 15 SP5, SP4","SLES 15 SP5, SP4","SLES 15 SP5, SP4" ,,,,,,,,,,,,,,,CentOS 7.9,CentOS 7.9,CentOS 7.9,CentOS 7.9,CentOS 7.9 - ,"Oracle Linux 10, 9, 8 [#ol-700-mi300x-past-60]_","Oracle Linux 9, 8 [#mi300x-past-60]_","Oracle Linux 9, 8 [#mi300x-past-60]_","Oracle Linux 9, 8 [#mi300x-past-60]_","Oracle Linux 9, 8 [#mi300x-past-60]_",Oracle Linux 8.10 [#mi300x-past-60]_,Oracle Linux 8.10 [#mi300x-past-60]_,Oracle Linux 8.10 [#mi300x-past-60]_,Oracle Linux 8.10 [#mi300x-past-60]_,Oracle Linux 8.9 [#mi300x-past-60]_,Oracle Linux 8.9 [#mi300x-past-60]_,Oracle Linux 8.9 [#mi300x-past-60]_,Oracle Linux 8.9 [#mi300x-past-60]_,Oracle Linux 8.9 [#mi300x-past-60]_,Oracle Linux 8.9 [#mi300x-past-60]_,Oracle Linux 8.9 [#mi300x-past-60]_,,, + ,"Oracle Linux 9, 8 [#ol-700-mi300x-past-60]_","Oracle Linux 9, 8 [#mi300x-past-60]_","Oracle Linux 9, 8 [#mi300x-past-60]_","Oracle Linux 9, 8 [#mi300x-past-60]_","Oracle Linux 9, 8 [#mi300x-past-60]_",Oracle Linux 8.10 [#mi300x-past-60]_,Oracle Linux 8.10 [#mi300x-past-60]_,Oracle Linux 8.10 
[#mi300x-past-60]_,Oracle Linux 8.10 [#mi300x-past-60]_,Oracle Linux 8.9 [#mi300x-past-60]_,Oracle Linux 8.9 [#mi300x-past-60]_,Oracle Linux 8.9 [#mi300x-past-60]_,Oracle Linux 8.9 [#mi300x-past-60]_,Oracle Linux 8.9 [#mi300x-past-60]_,Oracle Linux 8.9 [#mi300x-past-60]_,Oracle Linux 8.9 [#mi300x-past-60]_,,, ,Debian 12,Debian 12 [#single-node-past-60]_,Debian 12 [#single-node-past-60]_,Debian 12 [#single-node-past-60]_,Debian 12 [#single-node-past-60]_,Debian 12 [#single-node-past-60]_,Debian 12 [#single-node-past-60]_,Debian 12 [#single-node-past-60]_,,,,,,,,,,, ,Azure Linux 3.0 [#az-mi300x-past-60]_,Azure Linux 3.0 [#az-mi300x-past-60]_,Azure Linux 3.0 [#az-mi300x-past-60]_,Azure Linux 3.0 [#az-mi300x-past-60]_,Azure Linux 3.0 [#az-mi300x-past-60]_,Azure Linux 3.0 [#az-mi300x-630-past-60]_,Azure Linux 3.0 [#az-mi300x-630-past-60]_,,,,,,,,,,,, ,Rocky Linux 9,,,,,,,,,,,,,,,,,, @@ -50,7 +50,7 @@ ROCm Version,7.0.0,6.4.3,6.4.2,6.4.1,6.4.0,6.3.3,6.3.2,6.3.1,6.3.0,6.2.4,6.2.2,6 CUB,2.6.0,2.5.0,2.5.0,2.5.0,2.5.0,2.3.2,2.3.2,2.3.2,2.3.2,2.2.0,2.2.0,2.2.0,2.2.0,2.1.0,2.1.0,2.1.0,2.1.0,2.0.1,2.0.1 ,,,,,,,,,,,,,,,,,,, KMD & USER SPACE [#kfd_support-past-60]_,.. 
_kfd-userspace-support-compatibility-matrix-past-60:,,,,,,,,,,,,,,,,,, - :doc:`KMD versions `,"7.0.x, 6.4.x, 6.3.x, 6.2.x","6.4.x, 6.3.x, 6.2.x, 6.1.x","6.4.x, 6.3.x, 6.2.x, 6.1.x","6.4.x, 6.3.x, 6.2.x, 6.1.x","6.4.x, 6.3.x, 6.2.x, 6.1.x","6.4.x, 6.3.x, 6.2.x, 6.1.x","6.4.x, 6.3.x, 6.2.x, 6.1.x","6.4.x, 6.3.x, 6.2.x, 6.1.x","6.4.x, 6.3.x, 6.2.x, 6.1.x","6.4.x, 6.3.x, 6.2.x, 6.1.x, 6.0.x","6.4.x, 6.3.x, 6.2.x, 6.1.x, 6.0.x","6.4.x, 6.3.x, 6.2.x, 6.1.x, 6.0.x","6.4.x, 6.3.x, 6.2.x, 6.1.x, 6.0.x","6.4.x, 6.3.x, 6.2.x, 6.1.x, 6.0.x, 5.7.x","6.4.x, 6.3.x, 6.2.x, 6.1.x, 6.0.x, 5.7.x","6.4.x, 6.3.x, 6.2.x, 6.1.x, 6.0.x, 5.7.x","6.4.x, 6.3.x, 6.2.x, 6.1.x, 6.0.x, 5.7.x","6.2.x, 6.1.x, 6.0.x, 5.7.x, 5.6.x","6.2.x, 6.1.x, 6.0.x, 5.7.x, 5.6.x" + :doc:`KMD versions `,"30.10, 6.4.x, 6.3.x, 6.2.x","6.4.x, 6.3.x, 6.2.x, 6.1.x","6.4.x, 6.3.x, 6.2.x, 6.1.x","6.4.x, 6.3.x, 6.2.x, 6.1.x","6.4.x, 6.3.x, 6.2.x, 6.1.x","6.4.x, 6.3.x, 6.2.x, 6.1.x","6.4.x, 6.3.x, 6.2.x, 6.1.x","6.4.x, 6.3.x, 6.2.x, 6.1.x","6.4.x, 6.3.x, 6.2.x, 6.1.x","6.4.x, 6.3.x, 6.2.x, 6.1.x, 6.0.x","6.4.x, 6.3.x, 6.2.x, 6.1.x, 6.0.x","6.4.x, 6.3.x, 6.2.x, 6.1.x, 6.0.x","6.4.x, 6.3.x, 6.2.x, 6.1.x, 6.0.x","6.4.x, 6.3.x, 6.2.x, 6.1.x, 6.0.x, 5.7.x","6.4.x, 6.3.x, 6.2.x, 6.1.x, 6.0.x, 5.7.x","6.4.x, 6.3.x, 6.2.x, 6.1.x, 6.0.x, 5.7.x","6.4.x, 6.3.x, 6.2.x, 6.1.x, 6.0.x, 5.7.x","6.2.x, 6.1.x, 6.0.x, 5.7.x, 5.6.x","6.2.x, 6.1.x, 6.0.x, 5.7.x, 5.6.x" ,,,,,,,,,,,,,,,,,,, ML & COMPUTER VISION,.. _mllibs-support-compatibility-matrix-past-60:,,,,,,,,,,,,,,,,,, :doc:`Composable Kernel `,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0 diff --git a/docs/compatibility/compatibility-matrix.rst b/docs/compatibility/compatibility-matrix.rst index 5f0ace4ea..30dda9b02 100644 --- a/docs/compatibility/compatibility-matrix.rst +++ b/docs/compatibility/compatibility-matrix.rst @@ -31,7 +31,7 @@ compatibility and system requirements. 
,"RHEL 9.6, 9.4","RHEL 9.6, 9.4","RHEL 9.5, 9.4" ,RHEL 8.10,RHEL 8.10,RHEL 8.10 ,SLES 15 SP7,"SLES 15 SP7, SP6","SLES 15 SP6, SP5" - ,"Oracle Linux 10, 9, 8 [#ol-700-mi300x]_","Oracle Linux 9, 8 [#ol-mi300x]_",Oracle Linux 8.10 [#ol-mi300x]_ + ,"Oracle Linux 9, 8 [#ol-700-mi300x]_,"Oracle Linux 9, 8 [#ol-mi300x]_",Oracle Linux 8.10 [#ol-mi300x]_ ,Debian 12,Debian 12 [#single-node]_, ,Azure Linux 3.0 [#az-mi300x]_,Azure Linux 3.0 [#az-mi300x]_, ,Rocky Linux 9,, @@ -74,7 +74,7 @@ compatibility and system requirements. CUB,2.6.0,2.5.0,2.3.2 ,,, KMD & USER SPACE [#kfd_support]_,.. _kfd-userspace-support-compatibility-matrix:,, - :doc:`KMD versions `,"7.0.x, 6.4.x, 6.3.x, 6.2.x","6.4.x, 6.3.x, 6.2.x, 6.1.x","6.4.x, 6.3.x, 6.2.x, 6.1.x" + :doc:`KMD versions `,"30.10, 6.4.x, 6.3.x, 6.2.x","6.4.x, 6.3.x, 6.2.x, 6.1.x","6.4.x, 6.3.x, 6.2.x, 6.1.x" ,,, ML & COMPUTER VISION,.. _mllibs-support-compatibility-matrix:,, :doc:`Composable Kernel `,1.1.0,1.1.0,1.1.0 @@ -159,7 +159,7 @@ compatibility and system requirements. .. rubric:: Footnotes -.. [#ol-700-mi300x] **For ROCm 7.0** - Oracle Linux 10 and 9 are supported only on AMD Instinct MI300X, MI350X, and MI355X. Oracle Linux 8 is only supported on AMD Instinct MI300X. +.. [#ol-700-mi300x] **For ROCm 7.0** - Oracle Linux 9 is supported only on AMD Instinct MI300X, MI350X, and MI355X. Oracle Linux 8 is only supported on AMD Instinct MI300X. .. [#ol-mi300x] **Prior ROCm 7.0** - Oracle Linux is only on AMD Instinct MI300X. .. [#single-node] Debian 12 is supported only on AMD Instinct MI300X for single-node functionality. .. [#az-mi300x] Starting ROCm 6.4.0, Azure Linux 3.0 is supported only on AMD Instinct MI300X and AMD Radeon PRO V710. 
@@ -199,8 +199,7 @@ Use this lookup table to confirm which operating system and kernel versions are ,, `Rocky Linux `_, 9, 5.14.0-570, 2.34 ,, - `Oracle Linux `_, 10, 6.12.0 (UEK), 2.39 - ,9, 6.12.0 (UEK), 2.34 + `Oracle Linux `_, 9, 6.12.0 (UEK), 2.34 ,8, 5.15.0 (UEK), 2.28 ,, `Debian `_,12, 6.1.0, 2.36 @@ -238,7 +237,7 @@ Expand for full historical view of: .. rubric:: Footnotes - .. [#ol-700-mi300x-past-60] **For ROCm 7.0** - Oracle Linux 10 and 9 are supported only on AMD Instinct MI300X, MI350X, and MI355X. Oracle Linux 8 is only supported on AMD Instinct MI300X. + .. [#ol-700-mi300x-past-60] **For ROCm 7.0** - Oracle Linux 9 is supported only on AMD Instinct MI300X, MI350X, and MI355X. Oracle Linux 8 is only supported on AMD Instinct MI300X. .. [#mi300x-past-60] **Prior ROCm 7.0** - Oracle Linux is supported only on AMD Instinct MI300X. .. [#single-node-past-60] Debian 12 is supported only on AMD Instinct MI300X for single-node functionality. .. [#az-mi300x-past-60] Starting ROCm 6.4.0, Azure Linux 3.0 is supported only on AMD Instinct MI300X and AMD Radeon PRO V710. From 53bd9b5da48b6cf5dfba2b35d4345fa66d3f3f76 Mon Sep 17 00:00:00 2001 From: Pratik Basyal Date: Thu, 28 Aug 2025 10:52:03 -0400 Subject: [PATCH 32/58] Table loading and broken link fixed in 7.0.0 (#528) * Indentation and formatting updated * Table and broken link fixed * Clang-ocl removed --- docs/compatibility/compatibility-matrix.rst | 3 +-- docs/how-to/rocm-for-ai/install.rst | 2 +- docs/reference/gpu-arch-specs.rst | 1 + 3 files changed, 3 insertions(+), 3 deletions(-) diff --git a/docs/compatibility/compatibility-matrix.rst b/docs/compatibility/compatibility-matrix.rst index 30dda9b02..2a696e3c6 100644 --- a/docs/compatibility/compatibility-matrix.rst +++ b/docs/compatibility/compatibility-matrix.rst @@ -31,7 +31,7 @@ compatibility and system requirements. 
,"RHEL 9.6, 9.4","RHEL 9.6, 9.4","RHEL 9.5, 9.4" ,RHEL 8.10,RHEL 8.10,RHEL 8.10 ,SLES 15 SP7,"SLES 15 SP7, SP6","SLES 15 SP6, SP5" - ,"Oracle Linux 9, 8 [#ol-700-mi300x]_,"Oracle Linux 9, 8 [#ol-mi300x]_",Oracle Linux 8.10 [#ol-mi300x]_ + ,"Oracle Linux 9, 8 [#ol-700-mi300x]_","Oracle Linux 9, 8 [#ol-mi300x]_",Oracle Linux 8.10 [#ol-mi300x]_ ,Debian 12,Debian 12 [#single-node]_, ,Azure Linux 3.0 [#az-mi300x]_,Azure Linux 3.0 [#az-mi300x]_, ,Rocky Linux 9,, @@ -145,7 +145,6 @@ compatibility and system requirements. :doc:`ROCr Debug Agent `,2.1.0,2.0.4,2.0.3 ,,, COMPILERS,.. _compilers-support-compatibility-matrix:,, - `clang-ocl `_,N/A,N/A,N/A :doc:`hipCC `,1.1.1,1.1.1,1.1.1 `Flang `_,20.0.0.25314,19.0.0.25224,18.0.0.24455 :doc:`llvm-project `,20.0.0.25314,19.0.0.25224,18.0.0.24491 diff --git a/docs/how-to/rocm-for-ai/install.rst b/docs/how-to/rocm-for-ai/install.rst index d9c9c345d..7ec961ffa 100644 --- a/docs/how-to/rocm-for-ai/install.rst +++ b/docs/how-to/rocm-for-ai/install.rst @@ -24,7 +24,7 @@ If you’re new to ROCm, refer to the :doc:`ROCm quick start install guide for L If you’re using a Radeon GPU for graphics-accelerated applications, refer to the `Radeon installation instructions `_. -You can install ROCm on :ref:`compatible systems ` via your Linux +You can install ROCm on :doc:`compatible systems ` via your Linux distribution's package manager. 
See the following documentation resources to get started: * :doc:`ROCm installation overview ` diff --git a/docs/reference/gpu-arch-specs.rst b/docs/reference/gpu-arch-specs.rst index 67f954f00..c6c92e383 100644 --- a/docs/reference/gpu-arch-specs.rst +++ b/docs/reference/gpu-arch-specs.rst @@ -33,6 +33,7 @@ For more information about ROCm hardware compatibility, see the ROCm `Compatibil - VGPR File (KiB) - SGPR File (KiB) - GFXIP Major version + - GFXIP Minor version * - MI355X - CDNA4 From eabf72c2db3c42c5ae1391dc35b5d47a89c929ba Mon Sep 17 00:00:00 2001 From: srawat <120587655+SwRaw@users.noreply.github.com> Date: Thu, 28 Aug 2025 20:28:34 +0530 Subject: [PATCH 33/58] Update _toc.yml.in --- docs/sphinx/_toc.yml.in | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/sphinx/_toc.yml.in b/docs/sphinx/_toc.yml.in index a3f23da94..880b835c6 100644 --- a/docs/sphinx/_toc.yml.in +++ b/docs/sphinx/_toc.yml.in @@ -146,7 +146,7 @@ subtrees: title: White paper - file: conceptual/gpu-arch/mi300-mi200-performance-counters.rst title: MI300 and MI200 performance counters - - file: conceptual/gpu-arch/mi355-performance-counters.rst + - file: conceptual/gpu-arch/mi350-performance-counters.rst title: MI350 and MI355 series performance counters - file: conceptual/gpu-arch/mi250.md title: MI250 microarchitecture From 95d175287443661cbd3c408d1d85d92f7f5d3524 Mon Sep 17 00:00:00 2001 From: srawat <120587655+SwRaw@users.noreply.github.com> Date: Thu, 28 Aug 2025 20:35:01 +0530 Subject: [PATCH 34/58] Update _toc.yml.in --- docs/sphinx/_toc.yml.in | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/sphinx/_toc.yml.in b/docs/sphinx/_toc.yml.in index 880b835c6..8a12c9484 100644 --- a/docs/sphinx/_toc.yml.in +++ b/docs/sphinx/_toc.yml.in @@ -147,7 +147,7 @@ subtrees: - file: conceptual/gpu-arch/mi300-mi200-performance-counters.rst title: MI300 and MI200 performance counters - file: conceptual/gpu-arch/mi350-performance-counters.rst - title: MI350 and 
MI355 series performance counters + title: MI350 series performance counters - file: conceptual/gpu-arch/mi250.md title: MI250 microarchitecture subtrees: From 04beef8773caf87d6f85ea673d4ecae485b92a9f Mon Sep 17 00:00:00 2001 From: Adel Johar Date: Thu, 28 Aug 2025 17:08:27 +0200 Subject: [PATCH 35/58] Docs: Overhaul JAX compatibility page for ROCm 7.0 --- .../ml-compatibility/jax-compatibility.rst | 53 ++++++++++++++++++- 1 file changed, 51 insertions(+), 2 deletions(-) diff --git a/docs/compatibility/ml-compatibility/jax-compatibility.rst b/docs/compatibility/ml-compatibility/jax-compatibility.rst index f85a3d722..cf2eabd6e 100644 --- a/docs/compatibility/ml-compatibility/jax-compatibility.rst +++ b/docs/compatibility/ml-compatibility/jax-compatibility.rst @@ -27,7 +27,7 @@ with ROCm support: - Offers AMD-validated and community :ref:`Docker images ` with ROCm and JAX preinstalled. - - ROCm JAX repository: `ROCm/jax `_ + - ROCm JAX repository: `ROCm/rocm-jax `_ - See the :doc:`ROCm JAX installation guide ` to get started. @@ -310,5 +310,54 @@ For a complete and up-to-date list of JAX public modules (for example, ``jax.num Since version 0.1.56, JAX has full support for ROCm, and the :ref:`Known issues and important notes ` section contains details about limitations specific to the ROCm backend. The list of - JAX API modules is maintained by the JAX project and is subject to change. + JAX API modules is maintained by the JAX project and is subject to change. + Refer to the official JAX documentation for the most up-to-date information. + +Key features and enhancements for ROCm 7.0 +=============================================================================== + +- Upgraded XLA backend: Integrates a newer XLA version, enabling better + optimizations, broader operator support, and potential performance gains. + +- RNN support: Native RNN support (including LSTMs via ``jax.experimental.rnn``) + is now available on ROCm, aiding sequence model development.
+ +- Comprehensive linear algebra capabilities: Offers robust ``jax.linalg`` + operations, essential for scientific and machine learning tasks. + +- Expanded AMD GPU architecture support: Provides ongoing support for gfx1101 + GPUs and introduces support for gfx950 and gfx12xx GPUs. + +- Mixed FP8 precision support: Enables ``lax.dot_general`` operations with mixed FP8 + types, offering pathways for memory and compute efficiency. + +- Streamlined PyPI packaging: Provides reliable PyPI wheels for JAX on ROCm, + simplifying the installation process. + +- Pallas experimental kernel development: Continued Pallas framework + enhancements for custom GPU kernels, including new intrinsics (specific + kernel behaviors under review). + +- Improved build system and CI: Enhanced ROCm build system and CI for greater + reliability and maintainability. + +- Enhanced distributed computing setup: Improved JAX setup in multi-GPU + distributed environments. + +.. _jax_comp_known_issues: + +Known issues and notes for ROCm 7.0 +=============================================================================== + +- ``nn.dot_product_attention``: Certain configurations of ``jax.nn.dot_product_attention`` + may cause segmentation faults, though the majority of use cases work correctly. + +- SVD with dynamic shapes: SVD on inputs with dynamic/symbolic shapes might result in an error. + SVD with static shapes is unaffected. + +- QR decomposition with symbolic shapes: QR decomposition operations may fail when using + symbolic/dynamic shapes in shape polymorphic contexts. + +- Pallas kernels: Specific advanced Pallas kernels may exhibit variations in + numerical output or resource usage. These are actively reviewed as part of + Pallas's experimental development.
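The SVD and QR entries in the known-issues list distinguish static shapes from dynamic or symbolic ones. As a rough illustration of the static-shape case those notes describe as unaffected, the sketch below runs both decompositions on a small fixed-size matrix. It is not taken from the ROCm or JAX documentation: the matrix values and variable names are assumptions chosen for the example, the calls go through ``jax.numpy.linalg``, and the code runs on whichever backend the installed JAX build selects.

```python
import jax.numpy as jnp

# Illustrative input (assumed values): the shape (3, 2) is fixed and known
# up front, so no symbolic or dynamic dimensions are involved.
a = jnp.array([[3.0, 0.0],
               [0.0, 4.0],
               [0.0, 0.0]])

# Reduced SVD: u has shape (3, 2), s has shape (2,), vt has shape (2, 2).
u, s, vt = jnp.linalg.svd(a, full_matrices=False)

# Reduced QR (the default mode): q has shape (3, 2), r has shape (2, 2).
q, r = jnp.linalg.qr(a)

# Multiply the factors back together to check both decompositions.
svd_reconstruction = (u * s) @ vt
qr_reconstruction = q @ r

print(jnp.allclose(svd_reconstruction, a, atol=1e-5))
print(jnp.allclose(qr_reconstruction, a, atol=1e-5))
```

If a workload hits the dynamic-shape limitation, one possible workaround is to call these routines with concrete shapes as above, at the cost of recompiling for each distinct input shape.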
From d476d09aff492382c1736ef861429f99fbdbd3dd Mon Sep 17 00:00:00 2001 From: Istvan Kiss Date: Thu, 28 Aug 2025 17:09:34 +0200 Subject: [PATCH 36/58] Update precision support page with missing libraries and RDNA2 and CDNA4 support --- .wordlist.txt | 12 + .../floating-point-data-types.png | Bin 82859 -> 117072 bytes .../precision-support/precision-support.yaml | 391 +++++++++ docs/index.md | 2 +- docs/reference/precision-support.rst | 813 ++++++++++-------- docs/sphinx/_toc.yml.in | 2 +- 6 files changed, 883 insertions(+), 337 deletions(-) create mode 100644 docs/data/reference/precision-support/precision-support.yaml diff --git a/.wordlist.txt b/.wordlist.txt index 5377d2eef..9473bf389 100644 --- a/.wordlist.txt +++ b/.wordlist.txt @@ -96,6 +96,7 @@ DIMM DKMS DL DMA +DOMContentLoaded DNN DNNL DPM @@ -531,6 +532,7 @@ ZenDNN accuracies activations addr +addEventListener ade ai alloc @@ -564,6 +566,7 @@ boson bosons br BrainFloat +btn buildable bursty bzip @@ -575,6 +578,7 @@ centric changelog checkpointing chiplet +classList cmake cmd coalescable @@ -586,6 +590,7 @@ composable concretization config conformant +const constructible convolutional convolves @@ -649,6 +654,7 @@ exascale executables ffmpeg filesystem +forEach fortran fp framebuffer @@ -657,6 +663,7 @@ galb gcc gdb gemm +getAttribute gfortran gfx githooks @@ -808,6 +815,8 @@ recommenders quantile quantizer quasirandom +querySelector +querySelectorAll queueing qwen radeon @@ -870,9 +879,11 @@ scalability scalable scipy seealso +selectedTag sendmsg seqs serializers +setAttribute sglang shader sharding @@ -899,6 +910,7 @@ symlink symlinks sys tabindex +targetContainer td tensorfloat th diff --git a/docs/data/about/compatibility/floating-point-data-types.png b/docs/data/about/compatibility/floating-point-data-types.png index b59b40be4d81ade4498332fdbf1e409814f6cb3e..87c3afe29c4ab53a7a8204a3a663194f80e765a6 100644 GIT binary patch literal 117072 zcmeFZcTiMY_brOYEDA@098^FhXQ643tRf&;at29F2FV%BNDu_1$sjp{ 
zBaKu^bE|6>~H8B8Jk)O z(yvt1(BCjM6r@*0%iNZEB4T7>`q0(ZNYPbR$-vdzfY*>-Sm@FP0cSoqf~AqY-VJ9< z3oAQ5XTe*49+(f_Bb(W82?^L5V)zsui2c0>d=k86VsHP1kDcAg$%)Mg#b#}5%+A5f z%gcV7lbw^374~4YbFs45b7r-&qyGB<4~*;#Y)zlon_64lKn|##~Juz%k-TDK-B@$lifN5e&>6=#^2bCj8s(wHAzs54i4 zi&c1rz3)fbH?d4WTPkvn=)TsdFx$^5UX%pgP4eH|h0R;b7Zx{K@btb(uPhz}$eg7A zip-w>c-5XBZuyUo$lDu5kF-wx_ck^Wo6G;brSxK*=znkd!AU}P;=eXMZT>%Ypnk3= z_p`|Pf*bn8(fB{P4F2mpczBk)cIsr$MOyj;y6WlzlKo5^bEh#E|9hto1d3rM%i|#! z3-82)|FH}GKX%9e_dx&s`2I(y*Z&>t|2x?KIRgKm$GiW*32D$(dp-{{`>>%?l&eD3 z4uyaRJiOrET+P()-yflh88nNt)~B0|>wgUh;j4c`=ftf)H#2^auiE3YoAxKybm!P( zc911G1#&6_td77ar~nz6C5 zm}rZRj*bfi1l$Jvb5)(q%vE9JlS1*O=l(fArNf_xE41oRIHnM?s863LV6T1@W~JTw z2OeJO_x$AM&B12&sJs}m{Z09r6NjyL1~5m0B8Nf;-~E>=H%8s&_v;rh{`)WgzIHgS zb2qQ1mObhua*EpHm#^uBEaiibB2b$0^73#!1(-H2YVYX7!(X?AXL=sVcdOsd6A<)% zI(fFpa83X4j^6V#?%i^7xW)9wnnR4vg&&v6u3j}0BhWryn&4pG7ip6p&F3IFRBVMd z@V+ndrRCiA4L&{%v=S<^&mE-;hpMy%PS&wqMmP?WO zh(>PBUS##b0si4(pYC9#X+e%|h1KY#T52|~uVoVnR#Yp|M40wyBeAs$m-Hs|ZKDNT z6f#wFI)CbV^?d&nH~(t1swb&~m`OAL{neh#T;n)bjs2}<)XHBE9xmW1$$ccf3lOQW z=*GwJJ29;`k=o(oO^^n^)c;Y#tTH=TWR8iZ;o#ti6l~s|&_5?GN-tH?roB0_(Qe-V zE&a1WeDEOSe z`^VU#7#>PtS-2-py?RBvwC&xE8^&*6$aWf)T~#T!H{G!Mb%?<0QD|1XSDxe$o+~rd`F^m}o;vN=jQNCmsX5#_m!*>KHzkPJh->=LOa( zT-y-l;*ZN)TT#|#pDK4T(H~}FSF)eII1^!$kaE16dWK-9-=A11Ux|@bT7TfO!rP1< zd-Uz^{;btex9_!T7T^9C!O(p8(oGxbc4xeOy6&qe-a{AeC>b3coo)BT!gF}Fhh&jo z6BB4nm!rK%E??HG_lZQKOdB3ZDKLnO{yBY-oBDC$EAq|XjOo}gZPDudTZ`NV!;ZV_ zMoS|F;_$4ylBLKZ?tbR>S@usADEA?&I3=@Sof0ZxQYMyrh0cXw~8DAR)NY`W!HnkCOk-50PH`RNZk zZZ&$wxIUt1tN;3Sc5$Gfob-Cuw{H(c#l$4JqdHxj88!?88DSZ(Z*6g}U@gYS$NAi* z7nf$lN)r+ieojxb$+62BwMHuQ+Wvf0Za;^NvYJ|WTO2<`*xRqXy_xy=B(0X0C09aq z2J8$~i!BCzj*sh&UixQ(Cf#h4)k=a0DDkAm&rdh-TB>!umu3AGPu%=mSYr*X#t}Z; z&wj6^{bE0dU94Fu+ zM}q=7i#9lyT~denPTme-5%=FtoDr4}bBTIrXMDYsXFht#rk z%5tz<-EB2o5XYvSEfRFTyJPoN5j$vRqKH94nX6EU}W=bO}Y9tKjyXpYk~@TUhbZ z($GARy29GIs9sR$(uJ|(*TK#Asph7}Dkaj#K2(tZATs1QSYsVWE)`rKK$Tm(G*p*O 
zed|`C^_UPGQW*}}6R^Zj;=YkGC6t0;jWqvh95D_@HzOXTitO#|q$#B_{eYbL=-#^$G)NAAq6c{t!5*QbD(iRpbWq4HUPb*^V~I&|KN_xvx)^*LGT|Et*#;X?8M0 zDLpun+d8Es0u>cD=mR&_StnKKvVmC|DiA+D+>V23+0x9CZ`f$N-`AY3kSv+5KP73_ zk#AJ#MOA*NV>#5D&F8ooE*mG5+?6W3u$it1(>B;8JtmshE<>^>BSX2OIFideW4O#N zD(s(G`gH#dWYnbR``TI;3Gr1kudouuT;1;J=a~PVmq@PZe+{KF<{9(ke7<80UnD1v z(3?u2FH=#Ktc{{Y*FydhRB_5(DX;qDrvn>@)4F(7%BHe5=x5nk&HJ1B`3R0(>uK{n?Jkc zB5a;Rcu4c4@=UWcZjaEEW!8;_NRy6|yi&Wp1I zW=%iGxV)AHyp7&}C4nGWa1=55gs4AVF}1t&o(QU%UeI-`dsUvg^^aPSwunAY@!{d& zLYrocQG3vr;e28Wifq@Sh1vC~=4q7#_|xWCULt{R=1RD*Q>@VPnC?wR`z-q@rcm4` zajJB&*2rnBh?h)UPT)R{$6#)Y@KV-D>Aktm12?XWZ<7 zqKNp#u&Ai0vSi4zX%dO_EjzLPlsqa+{PwpE&>5<_5RNzu{Mg03rn(7lUFUOtbSOMe zwnF$DP2KQme5GP7byTrh>EmZ&m#E_X$k-z_B_<{(69vz1HS9;b$2(bPs}~LWhtzd- zDGIvcI9G`x1rTl{EWPT2jp zid@qfo@)P_@iouc=yz}4oU2;X&_$bfs@Y-MWmSBh)U8jsZfBQ7v$1N`cxnq)ICdFR zi_{4o{jRs3sjYb^yvHz*EGn^Q@$;c}>61^ej7NCJ=I3TgpXcL4?v6@QgCjaF4JW6%Zgn1%ZM0<5;GaA9 zooBM9mzf>cL5#aL>%Bs6vJxvOTXQtV;W5b78p+Y+uP8GuoxUIW(R5PGVPTk7RMIc_ zJ&&#qDu2gx;z1Q|@7lJO#`~m7mkqj|qmGoF?O>)`x9%%??#sZ8-d$$b5UpHz- zTh7N}EaaUEhZ*TLI1TS{OYIbBDPo$2yeD!Ju#G&v! zk**Zkp8k$RfXY^9MwG@>b!8mOrmY%z9eaY}Jfmm3^MV=2^BNRNl772kdEvZ_=eTvR zKe;T0qOkdN73uPQ5s}t!35Jcq82fg8ef`blA?ws{+Z!u2hbqiicAiO^{q>D@QhVxe z{)(a)dU{U!hSm=^s*c+#e2%L4m_7D3&&$3WEpf`N{sR|5POlkTc|MFr3M;ugGue?S zj_cfLXQrd8AO3|5O-TS4KGfsL<0yC8m=+u%V&>h1M8iqXd1$#Td~DJ>kH%yPdmd}M zInuko$4%Bz_W612g|B(6E%E5m^U<2k{*Y6wEOZus^yr<={9)ziFTp&dx6u|_HJ|(! 
zqwjcXy$~ihtlwUgtR?l}{9%Bqwgv=*BsY{SX}Y??iFb~#UcHV8>N+?)B#sGpI^1($ z)h%xE5249>t~u(y`*UHzyndU7n_ETKeM>!IMN?~Pz9&pE)ZIFd?{Il2|CNi-7cK1RBgvQWj)zVZ= z{RV1GGgjvC%*ERV{BiCFc~&Fkq$NUL%hA<)2l&nCfQ#2S?gL`Pxa(>`;elJ|jk1Xr zIZJ3(Kjsx>%07MBF24eaipNZUqMvg zNDq%NuD`$kLGy-B1h2tsa`l!t9=V{PAVB{<6vmbX14;=G&ZA^6LrK%)GuhC(Q*F~@ zem_OYT}G|ItcZLhM8IW&ok&bn)O2L92;(kC#E<1e)3J?mcxX(7l>bvr-5tCO^#`PT0 z_ilD=Fz5Xvj!DV8Pq8dI7WHdPs@0WOxy;fuip(3kn!AP_O$OskQ~&sA>~x^(_x?RE zikR$~xXgnGUj@TzXIf*Tesqtjqn4M#O(ZTz{4CNAmc+`&FggV*{4AY)KT+QIvYU}j z?7;(*jT9N=1|px9TCWJ5KApU{Ht4eclY`ptmZ)FTmoH!Z6~D%gxOH~#qyKquzY9ZE z@nl{)?Oy^i@O>|<#(d2e@76nY6hiYW-Dk_i4+Aq~nz83S@0u43aM`t40Qk^Ysl;K+ zEQ4qPF=_7RLY~&Q8c0`7&0&icwyr;)p;Q(O=sJl-owBPApRCBFXV>b z8hK{PCsTX7@FZ7yyPmAX?YRh`F`ebQ1PZ58*Jq+UE^c2CNpgA@} zDO(@FoXjhbXUD@`kzZDcLZAt?Sco*-f_$<~6t;>=v2#ZBD+VVj0lwN~PU% z&s0eE%}~n3!1;{bgu_h8Pu6~!+QThmsHEqU(+})e*T^q$TehenStL8M_27}YGmMe?Y6&GjcZN@hB#T|JYz&~_$Ih;O zE3=z%bpGn-YQ5sDd-58T$0*~r@Vq**#2SUvwip->66YyjGnGuM`YT9!=3lAGd@Uys zf1THs0Zv_V--_|4${VYZ3%!7c|15U+9i^d^QXF!$#I0urjGNN5US&t3o7mN@KU%WRyZwZ66$RN=jUOcy(Pp=11ucag#V)`LLp8iVEMrtw-(3Pb2kG5N#v|-@>G^IG9Kjxa~iiLjZ`?A?&eQT(>h)cRl06E{4q#2 zMtY}J{=Tj%USKRl!Ze*FK8%HY+7rE3-mm5+>inivqW3P%FN8IQw5m~i6ZWx(cydqP zSzca-^_TBxF=l0@e!#iQ(_{U;)}Qsc)hR|MK1uZg47y~(0XPW3?aeNqYA*)GV*hDY zEt7hi<>XSeBold~_E>G>_Gp`V9577$j!JMmIBTLATKm#PE(5<6I_XCnSP5qAsJYGB z@IZmt57l;mRzAn&EOv1(qx#16_Gn($;~1ybe*Ny5xJ>hHD~L*Mal*M!mNs{Fl}WVc z<>g)F!B`}$xX_yJe7ig!`Ku<(oIoN0<-ZF>$)hL*$SqH$ik$vR#eGFAIW`+JA&>9Q1k z*Dqf_&P#-pfuiPT)hM!61K=eg3$<)io~V-2dxGnC9@jT2#OsS*y1wbdZ8ef0oFdcZ zeJIq~3y*WfU;0=Qfv+Fl>eCI$mE?Hp*HJdXNm605U&JnkwPH{u%MvA& zf=cDmNsE%q2TrqN@a?tD zI#u0tox?rH5Hvm+V}kqcx@LaTg>suO{);4x8ejcc)_*oyiaD}r`+Aj!qkPHObC9xT z!0x!iu3gw#)&JvwMaII6aU))B=>6-JTj2tNf;~lUPGx0v$8jn!gCbGp#k86Qt_}&r z$F4%VNs3cB@4kP<$2KbGDSXb2FW3GoeST@Mq%U}?8ErhORkhmb*Azl8HyjHkNvz8j zZ!>O9V)5K;Vao{XxI|EJA>_=uuV2G$@=MDajw`DS+G9sB_PWSO7&V8&W4L^oal`$W z$e7N&CdY_}XQ~5banZ{6PyR7mg9{$YuzFCAKK64&?1l`n3J*-wCSC>Ke}&Jl{EbEH 
z&iJjk2}U`(o(=Zu4@ljcCyFpF3=*F|e{T6Et6z66igm5z+%{iFQ?94rpICx*8E1Y1 zZ~JqEj~~#^O)>l>P|})NskcP(b}6Jqp$%RejVW!eB|x!@Rjd;QGL0^*q-1ig(y5}a z@8rnQMgY06Y@II|s$^nk$2qD)gx+pi8e?cY`TElQKz2oMmX2L}*J)xmxmyxI+2jTa z?c4<-LNcgk!R_)`;cNx3Yx^%wyEFEy=6SrQXForEM#Zd{@>O_|o0L&Q$wSAb&R@IU zM^>-$R!n%>#hLoVr;C=G?v2HGp=!} zWnuG$!1(R##&E_EqfSG22y;1Pv2L=UF)IzJWOHqH-PFi;&ijoxV#1V3J^wv9FPC|} zG{UijHifC?E(>xwZvJ7ZR-p*k%y8JA?;texVp~I8ON5TMxls*lNrc2 zR;>v<3pC8@!0R{JbH7Smb}oJS`t>~Efwj@-h{1Dsg#5;I?oTiA1gsS-T$g^eJU(I8<#O;A`Z$mnjN=d&ll2Wc2MF)Rd%`DdV#>i)SmWt2+wh z1hXTRGfiv74-1XAmr0l}iiwL42@+1m268ZNnE!KYPxk{6-s3jJpi7i|KVyN(Ym+(8 ztemNgnsf$gO&x9C!wDSk4aToN_dlfe9s9nrva*_w_|{$b3jYcNlxv8_P{J}; zXnxJRLSXv@>D)??@}SA$83WLKdLn)r8P1luL~+P+_tV)x}A5x{Aih9z#bKv zUE({?U0j}vcX0EgBB4#8sy+#UnLj7%7Gu4gtQPvRQB=PwE`bk#6KX}iLi&FpblB}r zAsNONvv=v?-b*n&JW5iDh*AcLmvFj)S+v>u&Twpa)k?AnR4IDdJTU;N0mD|ELqhbrI zpPct4Wu(Tv)9AQ7oW5l_h|p`(*cj#$CtUNkH|D#mUsRI|>u8o($Uk}VQC~lN_59z@ zr8e<7JQe{%EW0K~i~zV=v!zHhqBvRMUy{+V;Q(pCtf#!lXl!#sE~MpKg2g~SU5O2NK=m zc#@DwJ>z{@s3CB_$owK=`FiHenGCnE9E~co&@#kq^XH55CQ{emoLE`OrK&z2^`AuR z&)(Lqa_s@N>}k>?;{ab-W7aEM?o8v83B+9sbE!bePxQ9QXi1lrmKrn=9IhUoLWy0S z2E0f`f0*QgPnKbnsWR;t{I$N`Y>-N^wLjs350S=4=ODTx0SQuOWzLe2pMYidoa!#1 z0yYNT1o1h{zyE~44d;hnf!Ch&JR~J0HL|K(0D0#sT0aTdOX8*v$!}91)YOtC zLk;C}HgGuQA;%?_h%n;1x_1u(d}G%atT8XoK8VOS+HBn(Go2blUlp)_6UlAPH0Dp< z+DuerJqG-TLBYW0gUR|xUgKg+`=R(s<#sGkZH$D3gd3d?mWC?Y{b?jrb9H*62MQ1o z@2{7@^FJruM6PL9p?Ab0%i)=FRNhbW9is_aPlp5f|^ zOH{n~zlgu}uMphbRGdyOZ66*PQLes!x~e-h5Yrx&Ii5%q7fV4xlJsWv9LmwRq@-lN zJ3G^9%uBafI?8DbhNrOXq10#ViGvBU{E{s z#T+|@Z`9RY=e3Iriou^@2VhtGd!&D2g2Vo|8Al4faGU2D%DFj2w36X@UmAF5H~E|n z2OUuPihb>JA;&w~?OV&)lvS?j;_b0EJTrDbKvXD~;-D>I^}MD;P>|O}SzF&poX^O#tPE5#e({ z+1hpoX^NX_FrKtI%)D`;4r4Lia=AhH@QyqZ@g|i@^mlDhw%>+6+1CWqN<{?4eEfL3 zH3B{4&x+*VOTfkC#V8~b9pgHme-^zE6%=D6g$=(PEx>ubHjz61{tt>ZQv5jk+kzZ7 z<%Fi3^Yz>Vy-HpN6gpze4C^FGV_b7r(|mefY(&2M#PfnYIr1WT@Ui(TK92G7`^mnT zUjT~o{-4?Q^y$;-(mui4?owG6^b(&wecIa$xrecsf%3Q~Uzr(I{03yUhQqUWw8QP| 
zeW#kajMfd8MvJ6Cj>jsfnV6)}9q(nP936R22?YR`%VdxujbW+TYm<3)mDBlSidA06)Prb=9e zxH`QhGI_9ThdK4*@|CMsy>6}s?D)P)ZhkKhoOo-0Ji=P2`%Pn5>g-607-^nyhF4Tn zkPno(@8*-&xxC*W24w_I9o7$|q2tQnRJR-Ph$o@b-X`Ga$tfe@DvjXY9Q8!vhCqJp zQDK*fB=P|tYe@$L5q^`!`wvud{!f~fdpfhD+<7elN?!zY>3V5HmQx}}8F7`cZY_>x zr+d-p(Af7bYgeG{S5xOZ^Fo;3GiGwqN@j?KW|+;Y%AqrqdOqG1X|1@Ulx#-oO;tS8 z7$RQhOA5#qRPw9+UQf-GvQzqWTtp9r4<4>mh^ z9(2*5t@q7LsnAg=kkB60oH*HH z$*=2i^a`qK>w3mpU{66(6G(M2d`znWgAViI^3f`m8d36<&#gzh^J%5flPfi7>t(GN z-jE&p1b#&LcHiBIvWZkGlO&yLiy88#;B(rFzS0tv%PxF;A3^JKXD@Yr8~=g^^wk>6 zD>YPNsd)G9(wn~LG zMP?%!5h!D@$$&(|m3Zg8;1R{N(qu&_t14TE!3XP+^5mIVPq8VfF!|k1BV@MVJg3!V z_U7odtK<~VDn?n?zw`XGnF-gW0WtF4DBfs5pRv+?{h!L-;<$=Btb}c_T!&VODMoy zaa;9&U<{!*|6cpdHD6p#P`>8be0N%U1}SqorfA-ED?PQy>)3NcZ&4{zWyC*es3Djh zvlLmBp_rJ$6v~{7EHk|-SA_lnwC+;b4vyu&UaLP4R}kR`pnxvThV7h6S{8Io11-&F z0nnsoZvZ03e*5srbpFM-D{di5K6*yiT^?a*CP}vAiMw8PUf|*3soDkWMsxzf9eIt1 z=DsA1`eL&Do{Qn+C!9~c95t}v0Chqm&ovZy2%;6sjpA%3wT~ZM`tGz{&t{eYu!o_M zXo}`7P&DEg@$^qRc>L)_*4<}6k8j^MtEL0=j#j4Q>qQ768VNVY!(5t(&WU~8!F=6u zY#Yh->scU*{q47cJ0d)fBu_kKrR=S&>Y<9i#Zp%YaRxXQ4p>~}Ff)8ki!XuTI9AR4 zZvPcOK8J;j`*l0`WYyKxkm(Rm%XHh1Ta*km=lE_11 zrxreL@$V9VXC$*@li(|)(>*MCnNZ&*59}XPjZuBz@EJeYJtcJyk5X2zjf9MNSMRA? 
zs2az5z)R0u?Q+MU+}vDp;m%03b&h`okucj}5Foiod0&aQxAa!XbP%$!MJ*UAJ2q6g zahyA!nw3?+aqby7pI~1_$$wzvcaw6&5HA!!eHS8M&AauPScp5N9m*f;1eE1Q*$DsagDCE1csrSSUa^aVsdaJ>y6<-+{2inTA z8SB2de}96VZMr#5SL8Jz${>}LS*69F(wUso@Vjvw{fb9hUSfm768eeU1v0X0Jf;aw zqaJH^+>;RQM{I5%z^6sCxk<)m~ z0t_D={;Zsa-#$P&7K-pz69{!fEqu+W-tp+bAOm4T>(kl1jt=%Va_7!t*=3+uLWJW9 zBPj%F068BXb|?FOoi8H%U%D2QFZyl&rg+=+NFlq|)jRW2QxAXMQ7~4{l^5|Ca~FY} zIjs_@S|xT#CXPd7cYOv`LgX;9V}tr_^>=W(@PHSO^$J~BK_J!D$R~~Y9^M4=Jdf`_ zUm=usI6TH$z-jNUb}l< zZOp(X(}EP>+Hwgf_>%Bv+83a@^Y!)RWDlW9mW$H~Wl~j`$b2C!Hwgm3o&c^8TD6w7tu2-G&SbE)nT9ecO9FgT@d>64 z#A%rZ5%;i^pBpe3(``ashU(XAh^fZNDKnH}iK^&d$#f9h|x&rB$3LcjFr9&>8 z1lj9$T0gcLe-5=#e5Gg<}eKV82`g>39%dt#{xn3FN3_5{JygzK8 z!t-J{rat1EUt#w;rskXXQUX9eoovI2X-Z>K&dj*?<+a&Jc~QPmyOqua;`EV?5lE*# zrU(;GJ5C?$aGNcf) z|2C{rwmvnb+AimCIG+ygAseq$xOkxFz(8{6?N#A;gc$lIGCv9WFH7$Y^=L-6>c2(6 z$zE8G<+fA4Y}%D+pi%*q!l=zhI0X>bWqmp{N^}0wZv>Yv^`wW;J22Bc2)Z1!zWO9A zH8pjejN#o+2`NIq{(d#>YIpUS_83(|!K;QkbZT6Ot$kuy|+A6@yM`%YzBhB zH}I<45>fJzMF5Q9OzKeDJ5h1K4XfCrV&>!{M|^jT&u zF6p44%j*xrWC!w%6k~^7g0+;SzqLm3O!@}TRaJ$u>k9qpP5jkGB7g_<1fGJjEoANg zcsVR?Nn)*Ms!P{EuSO40L|N7Gt^8fxO>|}%-^1ohT|6zRo>knG5|7Q>Mlw=av zrF^4igjqsFV3CUn!+M(j#S2dIVSt$rkLt z=r8zgE%jgi)}18sq5tjPmLm%4L<4+(Ig$Um5XVHF8 z;D$y}8Koqw;R}^S;E~NApvm2wn_6081^NwA)+!!SRYwYTfI)d`ZZ3UubhHV}8inYA z<(>}8*{%+dS4JyIUZXQr76ALR4nS^E2H%I&G!qcG?{%{Wu@=L9x|uQ+#n_7hZ>jiY zp-`nN$2j@Blpkz<-*1^*gWIbXIa9gM>^J zcTrYMo<$>(ce3b6?7J#>_wOwd;vwt2v=*WldM8AV%i`OkCII^9G>Z+>>jNo_>uW`3 z#z>o47hugXSy#JbTgdUJ**|9ClQ2A1XV-c>@-Mx1200SVvx*A0=R8rZn9(}@ChHp0 z__Y%Pct1X^HJRs{0Rg)r?>@lU zTyDC7!4{!%PMtDoo^Hij{G5qX@73hCmIoi2)UO5DDTH`f%oY{@?GE`7Px8kfX`qX! 
z=OZ}MW6XIZn3ss0JRQ)r8Ibdln_V<90ydxXbwAuak*!rh*AYyO_u~jLkk0pJ88z-* z`r`xHex+d}5z&<^DWHng<`fDU_TNSbO6CF?h3c7dtFWm<3(NhvEn>HU*F&YSHjV?O|zikHg5WFAX zj=x=o;c&i#fUA?>3sY4&-X0^xqL1jUM=R%de}-U(4iR@DFsCCc-ktgrD%o15jI7h( zlm}!`s5R*?ichA|+p1akjs{(ESmVT^@wkfMpS7Cg0U#Uip(t=oorp&*Pvu%y5|o0u zK!v8+hB9Fn!2^IruijSAvnVqfDRe2N*3AG3%}+6nh~Z^NY`j&JoUfO_NlkR?rOh5l^&wg4cC{n8*B5;X%Iofufuz%Z|pA{`Md(8^E- zi61P25B0|Nn%*-i`d*(T2PepnACJLn$Hd7biR_KTg+fi|GAHQg<0B8u0%*J7t_Pbl zC$r+TG{TJ7gt2KWj6@c(r~{9)sUH`{5`oh3!l9)uGedEsQR5Oa70Hh6g>FBTv-$$<+z{DBMsyPfyOp=I5941`j8^(|W z-m*-nIvV#pDM?9_1>LqxfR5IvvC5W>;miK`{n4w}ubV(C8x<023S*DEaD^%D{;La_ zj!T1x4Quyya8g&Q+z`e7P`0>}QL7oNCm#{`g-0r!%!jbqgXIo>TPq`}05BPWd-_wK zjcp2|_J$^`)PvLOnUCT>XJ}++Kn+3%t)W}BPiGQbA0#efZn<8frUAaz*L=|)_1U`Zy8%cOM7yN-5lFu!NiWEoP zcUk&gdV8CLlUAbwlg({CnAD7PO}xG43&Rdhp6((y9*30hoW1Ares6Kms8o?1g>6+p=ii4-V$OQ}*+9dX7t7if%Ti7cy zzi2-cr<^oNH1Ux@*#dq;}%)f^L(ckujSRS0Ao{2@$CKOde9xHZY45Std!!6%cSZe#a_VKG(zMEKJC}r;p-rnuK&N2374bVr zP5Z+`od4!^T$gP()dmh7evO>JSNK!!|7Y%0C@)p3wkLn}`smm`8^|z|uf~Ziec|Bt zM_L7z`j2V|UFBljbS^p!7S5Yp~pfl>q%Hvb;N9HzQ1LO zwVk1WN1-%eJM}Z<=8D=0_FZ0W1T}X~~xs`sM)f!)TN&&U_0JNp8l$1#yY0rLDd;7?X zUM~9kQ_sVtfT6d6PDv}!~_x5OA3M7_@Ap)ei=Mb!X4Q(+ou7Zc=d-fi~xn9GrF#g?D;)JKQ2);v^ zUg(IsYB^dc0kZ(R9|5k;gfe_G21p}dScm;N6TfkiVL-EIn_=zaWi@5}ik1yjfP_5w|jdCDHNr z#6##H-)3EJqTTK~K*iVNpRyWd^4QSX*MnBi}ynA zTvw4x7hMd$TbmwY=i=U&Y3l-SXQuq&8I8rCK|Y^KtW10TBWy}s>}Di`5u4|lK?CA{ z$j~YC9rc`)K^*%z0NxPd!JNIuf)UsS78dCj!n+SzV@ZaIi|A!T zj)=N86X(Gu0WA+r5zJv=rmn5M3ua3W^<+^wxpt+9#cPz5Ibpi)5^%|1kwcIvLfgg3 zs>A6h*+Yu(Bkk(@RF&J+^L^Qw@h^mX=9M#vSXt+{2pKBZG{&6A4DNudUW{2K>!YDe zj;G=j;8VcsOglJzAv~z- zw_L4`Yq_|dMRf^bPMf}f7}v?7YBROmYQzcx=2LEVpT2kZUN{`|bD)aP0X4DFE|dyD zK0ZAB4?BKQ@LMQr^MT}SHyy5d9ufXx=LZyYD8f`JoXAa@tK&FxVpDaqId(C}iYjG*Aw9X8C#w9y>?-;D}6yuu}<~ z1HY5iR8y!FK%ePi8dKtI_KL}lhr_r2UfL1R6p7H_#D0Ya!{#F^Qt zDOym#S^Dl6bbLOIb=bJVx_A@|jgFBe7J7|WSPLmp%l?7@H6H!ewOLtNIpM)J(?ou3 z=h8BpTWDo@U^EC0P6ovkX~+b5{22nO7mlwQR}7*bIl)NUneR{CV?J#R}7%JN?7?JLZFhr@Wrl15*~kX{7M_BKaa+ 
zQ>5<7G;3B}U4e1?{A#=imW4wqj3p1Lcs_6%HLG#{4Ue6><>8k{ZoGv{kp~yVWN$S& zDgyziU|{IU(AI=`t6XrURXNp4TX3u20J}fRVfg(qP)!5=#3aA}{sD3~txLW?4KP;o zx(1!Wb5U!${Ly$=dTSkr#bI?I3`LOu*Iu*p+^jF1T8`aZ;zvNA&Pt|ks&p@wVu*rr zjr`iROdcIK8OSiVOWs2(q{D2#JmM8`6~rY$n}Nyp&0?CW{oWx^_`qR(H<)(DZJq)V z(~zb^17ui)=8umXPK~Yg%;@%qx+4G~y(`sp65EP!<&g&sD|>Fe(v~A-A_fhCs=Zn2 ztbNB@&Fsjuq^%hGH5b}0LGme-Z(+aor(%27_GS?o?givF?1$8dYRIrgONpPvWpVSXZglm2#90 zR4FvGlQ64DvsWGDBMqBlo^c6~XztNUg>tfYrO5cm6-&KQ-^m^OPz^JgRb3y6z#|n| z!dxB0;ZL2s)o%|KxRtcUN(?R(Ck9&qX@s8NXfM+5kz5jXXBeHa-4CE>t&l89eJr69C+b_?6O-k%%ewr#TV>Es^tr6L!K!nF6kX%|a>;1fKp-Q^FG=UhP$ zH7Ny!WuqYU_gbusX6PiA)0PZaxE|G!p&$RaD|0mI1jC_xL?2+jedICt(Wn4>xSxB4 zT{i_6#besMWqEbGwGaB3qoz0j0fawhL_@EZL4!ZqpeIXN5$P^PtjSOWn9Ym|4%$7g zA)vw`1B+{azPqQtsft92DbIiu7dFkvLMn`UxpSV;NWO!4#X9TXvs+*uVf%jk*2tQ* zdF{Jyi07`TIgi!%oGQ1%#W63$R(r^?BzL&l^}yr!3n3z$An+Xj3cCVu?yTE24hz`n z_88;F;4k2C(!k9lCKjZd10;mZQ1iaDm21}FVRUM)rQ{?@x^y!6hQN&W`41EnIeu~V z;N4^P=wG3(+WvGF((ng_sBv5x$P~)0)GrG}77XGpZo;^idNhDBq{yhC=mu!bvWFa& zk=zyoS-JHA9Z*$6!heE@D}(t46(hOCK(bQNNq;oNdv|?0uLk-f3bKgxt6%?w7>3MW zKP;Ov;93?pXfDvG{N}1tYwq#AItYhKztl)?MtuFLU4$oXkK8i~?@~Ml+|1 z8d(2pJZkjWp)AUh@T3(=mPZh5T=85MNChP*RuLtwp_U7h$-743;gJyQz z-1m6we&Z(tT2W9_ag9Ij?Kwe}rkyigt!60TW%u)2JQ%|?^Z7xCO^W8tZeU{9A@Tauis-(r_4`-|bSCwS>hptD zQ5VP4=K(q5Zsea#LtemW!QKvBy4l^IibOP|1)7yH{Sj6qfN*4$)QU zgkVyfKB8!F1xCYi40BlSu+SkwF%i+?RFWF&qLB?q-S8TBmMXAUv{KR)TT4YI0}9BB z796T5L9{EKTO*z$sO{?%?8{leoy~n+I~L9#g0(zOET@)Z^O`q1A1qJ=x+M-3BxXX^ zt^4$b(Bcm@c%C6U&Sk*rH&cZOqlbVss@8g=)~fql9)S=nT`+T$*fCk3arfD8(~kLw zNCQ@xJVVX5RAOaOyEtqTkam4J;ZY>h@7v+$EqbA$nMK{bCD;JzHz?NP{l)$L^=SE@ z{CGOzJ;7@?m7k^P3mrEa#rd-K169pO&K+M1;f|P9ayz$$4?Z|j4Dv@`v=M)HNAC;$AfeK3rWwK_K*jpMhJUOs1!mnGXBV?Oe zFEo^7?}NVRRQ}lUyEEI61$U2ie{C?W_iiafAfzRU*6kf~H=z4RRy=+NW*`%*)Hmy- zO`5%+_hE4(ExAr$fl%&u_KH*jgRT#|SR7adCC$JO~QkqTuhB(en*tjb(foa7wG%LVdqy17fa1<%9E6 z-}D6eY@kCKi=*%+q$=KjIy9gl1M=*W@0AKix~hZoc)|Ab-Hb?EGA65DCcEn^kwT^; zE2~<*8sZ2jS%g4Pmk+s5t+H!g8gp3?9*wurCj4SW^I$_m!}!|AoFB_S+8rnN98k6q 
zeq?NFy-07ALQw{-)I(T-tq;T4bhr|^Om{iWZ8+2$9@i^>pd34vrRd%{XXfo1i^DWq~r=_Ji z(?9kmSCmkkSXi2##;8VWSJSLbOhCgNVk45O=jix*{Cy?Xg-EVzePdUlc#C!0BQLTV{cJU;52U72&9e=~ z5noc~901hNVm-aBE+;qRCmIl-jP}zGCLuTVkg>5hEXowX()v|aW_i_LTHv86`xvKi z|9+OPmo*hts$XKc_HXEsawD(njv98z@3)K=4F#i3d%N=;>l0q}8pK9%nMclUG=phm zxsQ{IUTAth(VlYa_)IW$5hYJHv=jxCCzmE&WZ8xTw>p-kJKXPM}?fX1*-ARDr5@+?_QVWJCOuwVyiNfv8n(R~GCzb1{r#_#2f-tdi(q!U~r{Z1DT{du| zLn~k1+@N!<=QuHNvvp?;=+c-kE8-{55Gd8tkd%acW?7`2qW#pdAnyMnFt* z)1ToXYQJ3=F;POpHES93*Z;-ddq+iiw*8{yHJUdzf(R&8B@{(Kx^xY~NE4CXl|J;L zw*jM(UX-dJVCd3&FBW>0-a%*ReP{y=F#CEe`M&k-bj7^mtn$F?s8qf zb~n-IJINo6-rU!1YPgxm9~+aSU-+SjN296)%&8As&pB;0{K_QpP;#Rjs>T=>Z{3mt zL$~?Qmk|&sLbUPykhE;-ApXPU_y@B~2EcD2?zMuFojy%14V4v>t+xcm{Z z@pqxeTb-Z(^>`Yo%s>k;5Te%ElhQ2cJ%2yLuKwVl*n{}XAKP&|C`+;swn|snHeHxo z4?Vs91!)28CTq}!e&>t4_Z-`+6mGH>zw7WNSrTv)%>KGd+@V$Ms5$n!$ zYh>I$iDi-0mJRaTl+iM)G+}P0DfXlFl)c}G&phX!U-il|sP%(*FQ<&K4{{#>akciqc7b4_EsEQulARCy;s_k9Qv{lkMexbro61tp zwW_QKH#c#n!eRIn<*A`D`O6vytk+ssem# z20V)$6Yv*+Fze)1eq4PyugM)LNcfRK#w_%8xb#woGFZ11JoFxn{=8v6_hDg)n>`OwU=C`JL9#CLYYzq z04|=VzyXU)xuLwt z`N5!Re!c$(jSA67iVT}OgSgLY#FkMx1=3v?SY!a5M=-gCPpU|>U(T!zW=4(L{cY}% zxfH3sc&yP7tb8s<2ObFHB__7-i&6nCEWnr9TEx|pHI9vr^-wAzDA#bdZf;!6VfB@f z@@kGy$$x`=Glf!W|E{9u^&#lMkxJQRWl$Lk9s%Cy-<8RmDxlGA4Ed4@a-dw@(%uNR zZGRyU)~Mz7mpQ!m4th%*fnWXAHWLF=76F<7O!5~&OgYu#B5<1}IYT*yxuA(rEOTr; zDT_ii=fc*Is}L>9PXB$qIETc*JwC6VY${(%dQXbE4)7=-iiv<`UiUDb?b<_q4Vjtg zc%>#}HHheu7$>2XF`i^ozIfB}IkVR?pJJJN&lLob%reTzL#9K z<^qh!5j(cCvMhDyUPmttm-E(ipy|XUmMppXh2ry{pn?t$k~(rPvIkOWRz~WE<4ju` zl$zH4bV=3#p&(r`M3{Z3QSu_L3bEbx$y#2cIg$bg{t^uJfet5Nagwtf4Z@r;v`K6AG1hpQP?K0a;FFUyRJfisU` zFuz>C*tBr_Y7%85{<4s1y-9JAz!}0=_$h`!BX1f?)Se~1kO7%<+ zO?!;qAP75OPZupc+1u%(Jw?Owx<>EsKW1gvY~xItmr-KYgixK+~FvKS!GA_9g z7IDpAcEq*9DuCZVdqFnHA`vSGm2eqXnnXZbG;dOB%I~(q@jzClBM=d9-G1BZO}ohW z&Li%7&K+cysb454xxn$%E_UV0hoXDG0?8WV0w}v#Ld@R&ewO~;1`IpF{-Qd%i~QO~ zB=93dY;9M1Y?3qRCSSZKk~wPA*nmLLt+RQ>rr2g(Z6^(d(J5}70jq(nwMC8A>Rg>% z1X5(S|MKJ+A;!9M&hgHa%jy4oX0>ff?o2(Vbl};dV=(R&RT9JeQh` 
z>>ZfH6N)jf!Gw!ZceojOx`-3ch#k1wj(DFHFFZvf3~+@GD*DY zBQaTIL|tt8_aR@`tZ8+s z)xc=o?)?1Fi_{I8<)n5l^5IZOL{1IgfD@_A_N*@_n`p=Tu@^7M@%ypQ0c(QFhnHT| zrlK-BYn5kzg6jNvQ@OKjY4`O3p>E#?0VBcK}JhL}vH8Pcxn6lrfH`NpMk$BXFkt9kxHee78NFiZJdfKJO@>!ZY@ zSi2L4WYw49k~=T`!Y(8xqaM;h^AEDn5>t31#W_EHel)E$<-@=7?x$nUeXT{Gm_+Qx z1Kb{%0IY$83%2^~9(gfsOIx+-vt5lGfC!- zJc{eYix8gpMaCa}D7Iv+@lo!~O3-t3BS3@POX;1FK1y33_6GlW6dEVAVptz}&G*lP zEd54LrL_-xrx<+>xwPJ1f$~&w=dgTgV)L(tMImc#L{v(@MY!cp{qM*&@md$-`Iydl zCT5fmz$qZS37hmh1WBo?8fYC{}4@N}|%V%hG~GV*?8mv(ioJE=Ga2#dt&UzGz3M{HbY z5&&mLL+JyHy*|iW5%ZG?#oc0Li`s9()Orz@Rf%~Mh`ksogcUFIft({lC24RmIve>* z#06byJD{M!rP;gZ6c+Bg18SJs{Xru*&?N^#dqwf5lDF?Gcx4} zRSGUcEP2HZNphU2BW5j7hi+RXN$r~-xOXOrt3br3#;4FrimEH{6(q=)!)psMTF)H$ zY)81(V2mfdm$(IbMexl_uQ-x=OWvxvtS;+-Lyr@nR@l>fqFN&diuEApfD>LD=y}d4 z#%R8lbkg*v)ZrKv5>8j*n0oFitL_`bI7Y)P&GLzkzlT5(S>>-mtH00$oeB+e*>6 zh!oPpTs>!0;cy$@teX)~{`sZl_ZjXG2g0f0a3+U0CJR^MCbBERVX}T7s%Z7jtU*fM zssB(;A%-}lO)@p8av8ArHR`m{N*@a3Dj65QDlU48o^4EbX14_V8zC}5z%|gz7VNTY zdC>Rg*#o!XFBoiu^TxEi5PG;J0?Z~Ual%e^j=i5${4GH&*qFNs-*o{zJ+E>s0P+#R zlCNFFk>sNg8ZW52I$jwqpv42ayS+q?8>?g0T1;c5A|?abMnKS&r`RPEsnCtS(Fc#@NR)2CIl8*RY?&W=fBK$F=t-iHWUg~Lwu(D6VSi>UItGR{= zRaFj0WW}c{l7}TyOv3$+%6~v(Kk&Em*v}c&&;kVNdo;{8U8RA=5q3&~bU z7-P9<_0FnllhP}|7?I>9R`MZTIrtbN>lg(_9Ya~`4>yASD4A%>^Pql*ZTutDn^Fv- z3Sig^(F1VMcC<z6AxzJzvZ?Uvvu(7 zi#?_Bb>y9~pB01Ap`vMu<HDYDG-|V_Fkot&KC`U`B6NY4YSHkue63cuu_&LDFh>4^P z-zUXW{zQS+1vQTlVNTwadTJGFu!M(b)j)4DmqB`nTsT8~PQ>0bcXm>0K(bevlP!`2DUf*AlGN0LY~@*OBK92h-s$RtnH;9qV5}b?ct;W z&MYUc+lZVv#kBqTdlb+z!uUIYqbVbKPgTTu*f9SM;Ui$IL2t0sXR9yt&brpLX=T2t zH}Cocp=t4{y)j%!*A2lPrJz3%D$SuTwjkOiNV|1Kl%nJLo9|3LVy44FEcjQ_(wE$)Q zRJ~T8?@QY;^m=cz9|+oLE_+EUu~H8Y5A=ry77b7ZI<<#^Xlzj=Gau5oK6Te0WkInK z7u(Pdk2dA#aO*H!I_v(I53{aMcwW05c|E|VE8`?jR>=m(5bpi;!x%A)hgdBN#=8qC z$!iycO}|>_*{nx%-qw)`DF5Tn;Ef;}ui5VS44^-p#P0#+Jy4*^>ayA^Yt|N{+MeDy z?K_m3Q{oKPUIzn_BH5F}=Z+nS9_Co2_c$y*C@RHO>=2y_xY$7k9P}1s7m*Ur)_Zc(q zzcr5Fm_&92F0uh4I$vxEq^@$|(uUIcBL6NDjCW15rUJp*xTgt{qtYo4t|`ana{k&H|qu> zp#+23lejK)m0gFwv%zrS7K 
z0#qzhoslqtvjWKrI?n`%OHR%H^G`%y>+W75A<^3^B6DJf_x`QpCuzSLd;mV2>0I)b zcLdrKv4$ZPm!cH#H3I-6Z~chi3BO_y9+VCg@#ABN)5_<7EZrZS3PC$p^~!8wu0tpE z5g&G`POm3fb_RNqCJdNAw!4=_5#coyVn@W=`Qhd8#&S=bVHm~!+jq7Dw5@(kxxqAC zwDyFqp0%s6J)M1kYe&+nV zK_2{7S$uiCG+2tfp(qeG1Vf_)(t06tQvv3hsmQR?rsdIoQ*P+XYL@_2BT&D5lf<#_tN1$S zSDO>FJNR#_=-o9yq29k^JGQklG^MQ&0h*{@P#Ps9UI(5bf)@}rnB9d34h^u(QYZBO zTfzo}XHfichwIhv_9`wC2^8%=5=PfE9m^_O!k-+Ycr4fsiyabQ07&)rI2^;UxP^{6 z1hXq*1q4gL)6zws^;*J|7i4tZFs!P8Z2`tGCMNH2nfhv?CPq8yg6LjCWZKgEy#M*!#o zK&i{77~Ay`k@8*T1tZ@##UbK2_!IU1$=)XSlIzNt>Swrp$$k91`7I`!B;m^YUcMO z%NU!FF`b}4iDml%$Dk5^c#*@*2X1$OL>$<2nsZ)V7DeQCH4^vWo;6A=RV58`Uc%Cw z99L@9Ix#U(^N(E;p3^w$`!NO-y?=p^bmH{M2hf!M-a7G72GCu=Tc7r3sMs6+V~&BZ z^j`RG%%bT18P17RLi!{Ai#w-ap~ba!lYI=;qSKL6Q+-~7IC(w2FM>EeIRAHEk5 z`fs&)x(Qea*Z+OeZZ9u+Gg{b$g7Qz2s{ZSWxlCF{IKkwfXXx<4ZAnwTJE*bqLMtQ^ zzexG-v-|#eHpn0TUp|QOf8oJo|CiS2|G3}(Qx5_BpUzxk{d!knC(fZ*CN4KNX&3eXHojpXJI7_)M8W*8;MnUz zDPDs`vLDE2gfQKz+&5h(L)LUE^>v~f$J#^IMrQkB&tHt(CF+XAM{$^(py`TGl`S%T z@THkQ%ONJDZ8p+yL2`U%VU;IdIVdP_VBph@<3IZ^K5)p~F+AB{Rvhc{#XMl|`9BZ- zEe?(x7$`Qz2*~lfysWz8M42RLA`WN;2asOpFAA; z6m{w3?|-{3)c!1g>GEaCKeucJHV+Bthh26b?1V2ybOI^wg{`Zf>!^McZ#oxm)HPCR zAD(%Q>pe!CWG zFOiiYnpTgsuhv&R(Op#?iyKppq5Vl)nz2Twe%R~bnr1XD>y@&l2}vJJhY#QW8-wJM z-y_4m(TKzEu7dVESCii%K2vNaC{toFq@`r-b=Z);28mPppg(gT$D8b2oX~fbV(FF0 zlum?P?#7EpGPozlT;)x^y#`?ikO#h_Z+Bj$qG`rTW=U;6RLPxb?0J`i1^cm&dNDe?$;GG3*|#CAT-p5?WL%C!3ygB_Z#@M<+Mbgl)YN z{$FfQ%Y%bHs)R-)KJDwV;;uiSrgGuxb~SPAlEN!+UK5T>**RS&Ft&Yp6MhU4wZ571}j&)1U-_hYC>1rG8trX^|4{*miq7?s+m>3 zJ+H@_;`T{QxCSO+6_deFErry$L3y>l3aUm~%M*>VkBL!sUbmuH4n{Z^`X!lYTIPyE zT@~~VFnoP!T2U;Xj#0LyT4L8e^*h0;qXcTh6?6MrujCdDM)_u$N5RAv+MhC`m$*^@jHs@NX)lm3nPmd48&~9#*suJIAtGbc$`=}_E0;_E7LKTCls5WB; z8ajDZ1 zi_EyE?>d8pA+fm1!XTuTz&t{?8XO=b-XtKhl4iNvjX$ol6FQeXJL;6YT4p5cv+&`` zMGD)#Cp4vKi1ybo`|%V?Xr%UjqR{=l-oac@LP%<3PhU_%fKOB3#ZyqiTkVdFK~+iP z;@Y0tw9`mSrx%x19@(zn&`ouE?w(?7;ZK>lxuJAM?vHZyoyVjnYg37aVDSl@HF&t-9Ooln9m#mB%XINBB z9*1hVG`Bd<)9!g>^U!Ge?D(sSJ=DHExIR9i>NduEsz|-{aW1Qf$%8vmv0+!PBs&jF 
zsqXdUXJD@Xm%+|;Kuj(iDK>9xmJ3zF3vG;a`%2xDTnR=qN*{eOBXs*7u!c(CN2Afp zCG9+gL@82}4{431mb4`wE7(3>>4ck!H4M@+^c0;;?aqN@LT!4xVWa!;i7+0 zSXR;~YU=6QgpO}%Y3k_yReCwTvq?=wf~vk^BZJTyp7dL|)P_TqX|!gJHkEJn7r&d2 z`ZIUu3!9tVcq%U0RoK}`Y<)gP*sUisbxRzoOAwcf3A-lYjZcYFEF$_M!aRqoZum4h z!@Dd4SN6Q6e4IxpaR-}Sw$8PZCZ1J{!j`X;;tuNlUQ2%$tc{*Tx?^Eo7Z(@I`=NSf zH|2xR2|MX^d*1uxSPOFi#9@zaIuhGAm8QjX8r^IE>pONuYcQ!LKdyQSKJZ>Fl9n4)GlF2kZ|;O##I#vR zQv>jzh@ztE0ap3KP0RaHv08UvPzahb7{l3Ed|(WIDdc26F8s*Yd2w76t5XbAD2J32 zE)Zf6`;18Cz6T~Yl^S@mkEJL|HlO9}M2@e%uy?m$SPI$Ot3#)1)i>!INO4z~iuXq3acj+)7yisa!S#}2 zR#sM;=CMcfs6z5jztH9clVg~>Me!l-JFhZMZLo$*iJg_&Di}6y=T5O6$~#_6%!Y{x zabztyiw9!^Ei;aoRUL~)0Tt;|el9MqU_U>iYoeOIzSnuWA<9_BGSulQ?_0OaO3I|p z?oZ?%{}y5MZuCt>g{x3`#XN$j3yYDK%Z(U)8Umgy*Hy1wtsp1G8hRJLDS6pyit~y+ zx>P5$KUUbB=#^Jia)CTy>!_>%tE0%2v)XMmCcl;#m|H$l_V7#zTfE!acxK53D#CJ( z$3M%O5+Y{`?VTj8>N!YTYxY%7PI$R}x%YxKoMg%9+;aK&NY#z)P7(ZKwomOwok?>T zVfz$MfY9=-4=28aNXGNb9_@{g{~0qkz79YICUIxZiu|{4x#3!2k^1}d(Wj4Rxhjm{ zFj#tA#F`-^j8LxP%`UR1O;IcR@ufFx`);ehu}qlv?Y0!W$MnZM#jXE+XhMFEH3|Qz z*4Te?!lc+PA=$t~XTK;})I9rkVfPy;G-G$iYr5{vxTGHo|2u>&8s;=p=w*4f*^0d* zl@xxlFC)JuKQkxf(U^@|fw{phe$AIIUpcO1_9^=}-MtWJj@n6{$c3Y*u} zQFz8=V(gE*{*2u9Lz+daDe)}PE6K?<$)9~}v){fARIK;HOgmOakBIP8+yJB4gMza` zW|qV*LN(kEHvZ-Z4_i^+dM4!2_T4m$C)*utO!21keYbi6ik>CpiJw%X|>3J`ykE| zzE6yH|5ou?iVv=E3^zx5l;RwX;ZLkwl7qBJ@dL z+Gl-aw_|EGFw(wG8Ue9m>QP=TFso=`nA7(uvf(&|en1TcFt-AyHKv8g|9^6L@9OZsJ( z8SIrHnbi8rlj931gK;_n7PRA98X9!tBF)LN!MJC0N#V`RSV3m&^cVNpg$(0aINz;Z z3Sky;zf6&RZo7VQtD{~SbK`7zQ)zzPkT$=|i^DLk_(oz*`2v|wKwEtM>8l)$z2 z%jpZh869j~gi9H2sy#4=Vy(fowmL&KSUoDHK}Bwv=}=r`QrUS?ToI?|CdEABEyeV0 z6n?viPwWSqEaeRz7D*G+E~_IJjXFkRl2V?_e0+;9Q}B6#)Ta!^-PfNZ%lfIJ(nmng zV0JEAG<3XDW?_-EUmdT&ZLs9lber;HzR1$aYG`P@7vn5Luki$mD#m7DO7b}sQ)u_P z-d8uBRbC#gs7UAOs>Z+2*40W1`!>5QpCXf1d+ypbE_}O1!T`=Xymm*4rBnf8Rl+%v z+-q8w;Nb+!C1QnVyAtB-uJ2P`_)WTQqpii1f59@^&#umc(e0oam%N6RK3b2hZ#d&y zFWNM@RbH{>FC9dMV$vvZg=}^VF=1^bb5|NE9^4n$3L*3|dW1vz@YS&S7^PGZeb6IS zWdnLyGc=lBXpGKvOw7+->PBeF<=9|*r2{u5qY}X%PfZT#)(W~bG&P$7lMd8_^O;=7 
zuVVXkOY!%_8}svzl($yiFDSo)25pAL5p# zsxH~>4cQXf`Ls~rne9X6K4`wjlww3aLXWq!;HBuz%+{FRk`}r08#ZRvn5xAsc9wUy z!Q;45S5hJ)TlrRJM0&rS)9_gM9yf2Fk+*I*$GVZ^Y8Mjl zlHbGez4u_ptK&}EYG!PqQ#|IIozhZOdM;)M2lI!?P8#@xJ5$;fH|t#HOt?Lnzi$U9 zyh<{VQ=08Lp(~4@W%|{>6r@YV%PD7F;|fh;u2k7t>X%|_JTT+JEsrq=v6XIrhYucy zeqCb))#{E{e@wHUsy}8zT%EwTq=f6FWLuZp9q*oiS9NlB%%y`*^E{>$6lqZ~ zj4gXabjw#k;8n#|fuwP~;j^i1DysG+2Hfj0<}gvkDRD8Tub1{X6ghkbb}Dw!Yvp+t z({jmZOG~IeGTrToRUp4;ESC#6OtJp_?$6`hl6z5VCujZ7uo&L(7s7mg?t^q>UxyR@ z-l&gw(~-=6I!18Xk+A(MIgET=JL>y%9d1EZu6Nf;vU$cacCh$2A9%3nkL%WXdnm<= zxy)c8e=6A|;~uhyUsX9-&MjoMbeYgNQ7(^ncaycP;dgsPPq*7~Pp^qEW*38OLf?{tTu&|n$+JF6!}$kH0cw# zxQt_cByrOTorLpY!RZj*iB5++wp)7@aw9fWA~xSUel-@LW!GO_VI|>|qSzmlNZKqz%5| zD-^jAfjuU1+vvf}M>59a@*n#y2%p@miqppIHTPY3Jn(k$H&tETmqPuhejeTq-K-Lc zBF zSCXr$svdS$@tf>#xo-q=;1Zslg0aB9vFEL>rG;{-UL5xEXEM%yq=!Q9AJ)U^q$Qlt z0ccLTK{wadHn`gJjv<>s&7Q8B9eUGvd+qN$4W zI;?nExD?ZvY;)TNC->FBUM2RTJ#RqAOjdho_$(GS+`R}z?w+$X+UjbtQh7R68u`@| zvJ8%u+WLAwmd9MjNh$G$KyjPIrkuZDm~b@YMy97@Sc8H;orZyZ0INuefo(3ry~F3- zFVLj@Ks~>`FbazMg2)W`{<+ojpQCDJQ@(!@`LE}{<_>=Ux&Qvs|M?Fh5~kAzdN@k; zqhCWsq+bS~&H0TpE~=Z&_NjcgiiP?`QmGg+X-GcTiX{C~`97TpNxH!os%zUomF-LE z((aqyd`w_A4SJ4S#nG|RX$Zi=y}po876J>{-T#8cet%dL6mJH;W9~ox=Kr&SWNN(i z(wklw0=_Tvt)_lKpDwuFHZ&yX)A~Ws&&7PQ<~X|H#I>P1#XCB(^v7)9?fy#LmpLHE zSh!ET9}G)28Y_b|0D_9KZ5zWjx26P7z*&o-CK&N*K{L3 zAA2uY9+L{e;eNp6(DFPs7xjwe#`SsGe4W(mf1~X%J*quFo)E=<&<P;{OK^;(wZx z?@nALH94tzO~|58pV!my=VYo=*)szkkvxK=9Mvf?f*K4ha{v$?c(_bLWhFddLu3jrkZ$gJNG$C5 z$GNQfP_VgYJM8f2HDnNnrH}^bvy6b_@?RA$z5RlAcQ91LoL6~1_-Chff8vy0S5xIT zCACCFrQy;# z6>Euc=PB>mj!2KG>YWYBU#*sL&dIrp>fAZV*(1zZko$1?AW36jMc@aDL6J^w9Ft=#^v}pbQs7vQwN%PkZ^^$=N5}#^U~3@oV~=$23#{ zkYu4tw;iVOg;#SlYL!~Utdm?T#NA=bW%M*f*-RgYz@WScqGUaEI;E1}PVdowk5n8rDLe75RTdppif=QOtWS%xwR zJKs8R*iA%#(`DUaK-6~UWf#u+-x(1mWe^nh{PtVAUY;Lm?Y{HTH!)y2%HoQMg%v>B z&I<}N!0ss#eP9Lb@x&*vq5JR32-CGME}F<#2kUox9= z=dsGK<_!JQg-z|;fYF|e>8P7LN6-2wzGuCm{tCjwlu0q6)ZIdQZ;#3C{h1ULp}e2b zXX(D(lUcooLf3R%XTlholFlfayg740`jIhx>1Y6}CYgCLZjPhbR?R$}K+grdhyOhW 
znlqL9HNxZx_sxVxzU>4yaxHE1a$0-JWvyM>0&5qu(<_6bey(fhIaUf}w^oPq9x_wG zk2Mksm)4Z39|6w4DX3ezAtuFBa%JvO>22B=9MtXD&8 zwGB6$bP-rZsz%RrDhZ_EG0vp{pLJ*Tj5KVph5JSz0P zGcm;zJYGT<&%-TuCrA6-h&(wgPTE_tNX_VOhN_rzY(P;E@U)x*bB>rrZerv6`f8zj z7u}5X9jE#TLvu&elq&Hw0<^r=>bfBT?iRAP5_>!26-Uon39a#~LCqQ6^05t{^AeI( zsGl{d^RQs{b19p`Sv+~{eD*@!-A8kFyPwDc%guU4Q_vJ&sk4c9Q>mvl+r(#1_Xsz!O|m?UR)L;%a_K?`82+ zm=#NDqNh^wE~0LUm|<+O`kT{SXgDd_>+F2Aob^D|vtYbEukt+rAcoVm$_2kw`VWsG z;^G`9wpwB3l$Lw@Zg+U%o3f*mHqEDJyYJ##w(JhPDF!t(AAX zw*-TB*Ix8~olF!ZObCx&rDgbv{e_+&Z)hAFd39VcNXbKk{-**z$Gw1O?bf|FuLa*{ zeqC+jK`(oL!5a2+_}rI3z9<%Zl}jZhKwlA2m|@7KhQ@RT}9 zCqwCle{V_bYA6qtjTsf^F&_p*yaPdl4rvJ@cpf+(QrMgSd>>ush25-SG@58a4L% z#3q(Za5fea>wt(SEi;I(6)CD~=K9QaxlMX-AQW0hq7avPcOnh=(gJC?Vo)+NGEuRy zD99>3&CLJ1>!?TUuy-(ba!$^5Y{Rt~$I6!iUam^%=6+a^+XAnpmoq?%fm= zo=R1`?TSAYiRYk{)_R_MDm>w+hm*VYys2~rd7)w^TI3W&Mqa`#86F9OLrKXi<|-CaRkt;WA6t%ZB7R&g~=`FMrw z4dmlWx|0}D%s$%sn7#1ivNB<;`k!u7Sdlu9qaj>?3}qm77z5250Q|cH-M-A_la7kx z*trD4J1nuW)@_Q3dbNVJ>1eN;FyPb`h_vcmM;ozt7(xmxzgR1?t7$Wa1X#Ar^nUBl zC7 z$X(uCQp{+ssdVZiC=3h?5avHpHV0`iKAnhyUbkCn)7g8WUk#z3m6{7QopbY)QnU?? zUPri7C&)LWz76{lJ~*I5SRpLP4|i>hl%k4Rw>Vkky|ST=nXzD{Up%=naCM>DLmhj> zfo2r)J&>C=S5m;cd{FJ-j%22heE9D11B%L9XpYVNl8f#XApiBfWTy<(m;rGk&Xw?9lkPH%gXgC*_s{B6?A$;%tA5Bm}AbxTKLQ&v`1 z#da8>2W+O!4fGD`TjZsIzs2lG9ja(XgcL`33i^DFv{yG<(OXm3vZRP9f?2N*!dR8&lMfHLsUmpxq30hhVdUitBPn@AYf9w~^FgQuzd`|0(BcIn`P6a7X9Pc)y zPhCl58X?sx`u_F{ljGg@o`cYZkV*5x@l7I>!=j=1-6i&~ED}Mt)TKL*kb!6LmG;H? ziJP?jQN8ppid%?OgyQkGvQuy{a+J$p<#>qv3utfz!-`hUh6Lfe!YzlK|6cOtk3Sud zrDD{Jkg}@eciXaD+`VN{(b0{u-C|rP&M?>Tx6f^2$XFsxPH@2dK(!kqb{SQuCuUz)ut8Z@KX$DOc60Uv5JQsQr_hB`OeotK$uP)Qdxj1FX@w!HK8cce$k z4Fmm}?UhX|%aquuNvrj~w|90z2ANQLs;X4KFgoAA4aq3UPe_BqUQ^gps~O4(nhun? 
zm=CTp-|*k1Bwxv*J#*W^`lhp^B}T7HKGLEY%j@>d1*ZpBzSn_oDv<3pxrH~5u2jTZ zdZQhjK`-9g(9!(j!=Kf?Zwqp!SDY7%WYihG?x`mb;yK3%1udrZdfSq;m+Qf_)%%rJL(IwNI=XcWBvOI~r`6^2}W9z6#mNN#(i?CmneWNbJ_tZ}rrx2bwK_w6hD?&}pQ;c0#`V&3zQ0pU(ZGjjc) zm8VG_26_cG>~_Aky4ST=FF1dF;E&6YxyPBM6nzsJaoOcADU+rg(32oNZ%RyL-p>e5D|+Us;*O!umYn^nIw zMsGT{EV~3E!x)YRAuD_l1{X8U_m%7Ba*C}oZM1zV?{j@YUzuC+JXnl$zlb~|O zg0M`KV{xrUSn~iEm03H(@gkn6StWSoibBzI(XN)#W!!!oerdaqlO|3=F)>6lg6P@YHTbS}|X_8pnrmXV2C%o-nyWgrq=3RK6e$)up9i-I%z}e)SB@~P3{cCIs85P#3l-c)MbWl*%cSM zG`@ncD7#7vb)RKDq?8^JF$&oKrAf}NDb;z4%+ID;)#;;JM-Ei&%~cT#J=kbe-5hgb zt#|i)visg;= z6t%go<9l%klY;N)Q{ipV-0p6RyZ9gdT7CC74D38JcU_*@6)ZQ-VO2{PRX=xDR`f8Sh{XN{s7`7AQfckc8n*9coR z`*;ncH3kHD^Azz){w7o}DnM`OaWg?K-5?|L=FV$DKT1J3=JF=F#_5zoFimuHA<#TD zIiw^%Kfjm!d)w@Jt%Si6bf`iwhtGx@W`@_dQd1!qs^+ksscYBC%M6Q3%(8TaJ^eRYr{E_wU|t5__A0vtDQS=d z!$y6NuG1&haEEGfkNxI?_c?c_dm`apvh(^56;o9@AH&-z^3Le)YI!}d`%v-YpXB>W z-aCM5XmLK}tLnggMT)e_0CX&fSNrSj4p9dyehkTZc|1&|ca@cuy_c5enyc4piuspD zvbDSq^`?T9*wqSXrCRv(&ICoeja);xXcAG1hPPAaWs|Ji5?3hsgy9;XIu@j+X5DWK zL%}0Qb-@Bv_}6)r#zLdOq7}+(Esm$m?{vNtfOe(Ld6`zjg3QDJ>O;M+s@GWl71Fo? 
z$gMtS&d11y2dx4v#ppeM845*{-502)=a=KtF?Xmz=l0gNY(m)QppPpBC`V)_PkPKH z>_Qk}2IkJ4>5WS5B6oQu&+TYAao*K35;FnM1ubT8?~@u)s4Vpb_disuC3wZsi*se+M z)p9aD&>ppNH#hE|z-ZhqOD^&ciJ}3QHF&@(=2eVdf``g;h<+!@#F>C_rj9ZjV$5PH$(B%BI6g;XnF3LP*4nK zBD^6*K^@@9g#=BIQ(CL(;3Jgm2u}p9UL4@|(C1a5JGMzxP*hCG$;oNWuBnj_bDsYV zPAm$SBuTK-XiJ+X((q&i_gXn8`4 zHS*-!*qe0*LuEE*UXG*8ksw9x_TKe)dz`s>@9rv@o6 z+`Pe4H{Ht#YDeu;*^z+(Ys?^)`(zR6d|v3x$j!AdlyO%t&BDBlRv@}|08IZkT%_PC{Y{%=ciNMy*930V%gJ{gNP7QlJu#due+{& zie;LMWW(ZbGbkO(8r$9c%uR|mzXQv`luMTtVI)qDv~9}2{OvA;5uWdzShP{qjnL%O zsTg$szh1E$B^mYZF1_YBeU2>NZ&d&_~Dnmh8-DvDpT2A zME-H`-U9#rZE>%AZK2FsllUFEJJ;Xls(gP^UrBQK9ZAxxp&*Z*q>5f+N*O!HCQ}w3 z!Bg9N({Xc3fpzzO8cHTMXEtEdTZ$#tVB-+cl%3qj_g#82Y&JA;l7IISQeSk9S@eQJ^vpo#Qp4}t@9KQxdFP2l6!GAJ0K{f+Xmyo zhj=p>T@~n2ckb9VS)+euL9}g6kXofR85mnDQsf(`xrZQ^G{k zFFmC!!mh0nYGvHQmDnT0j6e8J24ypQT@4*#NXWj1Bl)i~OO9$~q_}(J8Mu5~rAN;J z82m7yt#1GB^b-AAgod!qTBg{_oNC$TI`}el?krqWFaRvr^i!E?*;I4N;wddT`R>~b zK`nLLj?;8qN+!Ga0cNd0^e#ov7M~xlpUN!{39didQ#n5J0>#3F$bxR4=p<|(x&?tG zQZP(kiUr2hE+XC7vv$Xtt-%o`!u+$thjD5ub(O8i~FK>X} zagQ%!@g);wVeug9WE3u0Fyg`w;tj5RyAR9C|H0mShc%h4d&A6jJDzO;5drBVC?E(7 z9cgh!ib@-#OH-;yGfIaL%P1ujktWS3C?LH^kroROEQr+5qX`fKgcu-{BqZ;80O!m( zd%ko2Ip24_>%I1Kjb0;4NY=BSwbuP>w`qcSs>R#FgujNKPK;6cEa9)>e4Ktle=uTC zyyP8W#5yVFzfj9>kvp%wb~aN0hypBiR>!uy27h&wa0reLV^K#W* z{-{N^LMwJjO1dDX)7J?n(8*EkR*84bp;U2F?%JTNe|}WGTr;K3*xF8W@>x=b5bVh{ z+=sf$+XQ9zaw@U@gl}aSf+)kC3~VIliFIz_h%=(sD6qc9SD=`zp14)I-4%TNRr^C_oT;bmU&K1TuJlwOHEwt?y1s&rp$k_Hj%Ud;k`4 zV}Qj#r63j;gZBGQ^m%@OdMo?1T$})n@KUg1g2*;D11R?V`;}juQwIg+c-cmZe@E6Q zfZ6Nn2HWP+D`N;{NWH`RGH_2#w%58VyFfEfzcldM{H+HyvgA{;=NSRw#}P3`g*SR) z`<7yBmH;_qpg(r7_csaf-^<;g$^b5?oQ;Sne3lr{fzUkju2~0tIO*J^Hq(s=)O<5H zH#a*`WC12@L}X_|acQYzIKJ5eT?|f5Ks^s6hJ_YragZ|zQaZh`Am!X0h-rI)D2SIA zU{sdQ&SDVkQa=&U484knO4f>UBEUq|Wf{QnE{{x5~e)@=S? 
zW3&H?W7gRmEN-5b@!3Cbs*z(NrSaPb9`>&ukq6BFGso6sSJ*kxKLbhpPS@tvOccUd z6=Pe1zd!R^f4;@#L))IX?ME0ui1|?Wr|(z15dcn)!|Y=9g9zs4G~9b585zlDP6G;yuRFsbyy49egY; z%V+;9xrFf+q*K`N4qN*wBL8KFp|Rnv@1K?HU$U#6q4>XlAk6+Noafu+6AGvYug46} zRhkt5u}4?XqL8H}zvYEPDIW}@J03t{k0t8IxbXx#DNWz&jNkJe{yS4{aA8xKTEvS= z#jcD&&Gy^V>bpm>A+5$jA7FT?<3|~x0TSQ{Y#e^>oQDc8B>Vt%Q=pQIfU1B(BBa}R zw@=_%GS~WPDNU+`pm@Fa|HFfDSn%@Zee#gz`!$Y1T0nKAhumJ2T4^e`fPVe_RYUoVoa!(C21g z0IEmAlyiN&qy8yLBW{rgJ5KK_kVgMzzC4VuB8%tnrJOinW~}hQ2X{g)iQ|oYqlCg$&U#|;#)MtL46qFGW7u0@$CN;~iK}qd z(tDuGNrl8Cr=qTK)|qi{w~COGnjy;7u{0oQRn)pf0`x?t0MLHx>Lbiq3n?Tb5r-E7 z{A%O^JqKRB7TDRbXt$}Cw6QFYdnMYH2`>L@qYu zKGLTNyxaswJ_%HF(Z{nyn+VTE9r&oin&!)8LP^^;l)~j7v-J~?BQz|4R3hS}Bv1yI zrsr&+yA*QMJ1zSfXv;s0HsJ(M#(R5xG69u-G0=eb^t1>_VUvt)HEn5~y!kIK-P0 znBVx&hp@pVatE!)_*J;09fyx6R^}YRL@jjK+v4sgIf9bn7$f8=x|8-QC$Pw}R^2A- znQo%wbzj@UcbfDj6`i1Rr3@ukag>0z!0r@ zu|XPF<7v`w{PxGoK#N99!W9Ce)N<_%mc)(@OlX^rdsycY-V~H*lb-nu&{8cwM_vX_ zpnbK1Pr;w8Gu1bNgAux}h(M_bwV(EisQaBw(D4Zq-sN}yk1hO_w(S=2;eF*7+t40~ zg0J|7!yet1@He|H>pR4EJ8bfC;u$(f$NS~}-?g0&A#=;2(DBtuH5Yky#M?yWE8o7{ z_8U+(2G$BQiS)5U%?nMwtOYxaIS~YWD*MiSe5mJZ3tVDbA`H+hGx6R%hOB}D~T>ex&^7)D7g}xe$gf+m~RSPYWpIhM*8Vb!(hA06;XJ>y?v1gxv@3kZ`kGi^r*>nJlFW%(YpceDuRe*PL%WZ>4Cgd{tJ3S= z@(tajxBsDlP085e-}Y`eDJDvuKi~hf3!lV`p@L*xQ7LY{V-SE1hrDXj znVUbcSuwEoMhr)Ry^)M;SJ1?ZYPuj1EP}UPCkeJp){0pF3V+ujh<$L}q|aFWI%UwE zP>a?*!L4_Rv;Og%%$5N?Iw6(QxqjYPwTpU~1(dtwD$ zuSA{v8Mvw4Kte~Th#EQ>`v^rj39EJQTGFKpM0VuCZD$a zq^_j2#2w^k`m~%k$v7Do$JdD!G-o@tY;=l{8(+C|*r6@Tp692s<>(quKpFWcdhYr9 z`z-JFJ3Qm)o!x^=>M*Qha5*>}2|HT(d^6E~^1L z%k>~&TFr@d?=5?~Z}d2ac9d3@v5}}#wWRjVwTb4XHefk|uYcoVhc7?1Kxj7wQ(f*q zGd4`tnk$|icy9GNdj=GumlAXr8*lpoj}-h3ykKt)@az_sl*GR;Dd{3z-x4orjD=x) zfd8X&SR@co1F0Ykfxr}WJsfKO@P4PIrGrp9KYz-H>HNaN!gt=@Dp`%IJjeuvhEX}O z>U$b11^GGd&Z7K?8I1Ur;*97;zUV9z`Y6g&SVvCS{`k%bw+z2+Sg*3Y;Pg9>@w-?4 z`E#nUv&3TL(rPg}%uHCbJYQSRqDA)np2Hb?mJiS&WrdN){Jy0+GdP?Q4nRdhuIeZI zaLi!UWjIBGZC_RRvX4N3-ZL;p@s~MR0Zg4%gy-mMO!(CvX(8XA#Ur6AT(ImpaIwFu 
z#E2tVn^Z!zv4p}y=iu(T>WS4cK@ZMygGfNEu>|5+{Yc2z5w);u`-RoCe%vn#X;S2I z12`OZrC~C@2pu68ePgh)puRq2dI1{kwxF3Td8&_CN5>!_bcXh9Roh%#^9?fsf@1s! zz^WF&C{4*zK9N`u)liU6C{~qxU)pDR%e2TWf9WX-gw04v3bXYB@-hWRw9(XE^?Kn^^WsV1f=fma_y89bEmZVqwL@#++t=!j z3`c=90V1YZC!lG>i&mYC)NPCOxlCJQGZz!1+8mHPJ0YHzTDQce7W#t09G2mfh_K-P z47gtI52Q2W3nwf&pUr&s8`@pCa0oc2JbtOa z_Vo>gMtQX(XsS92VG!|hm~Rvy-dyU!DOwn(kUt*>g!Kh$ zgpX+&<&=s_t7hWT8H=R3+KRZP)?}2agCXNVS9kWIIwbS5LB&zA*D;XQaNpwmys%&} z&Kii*f}QBabpun!)NIa59XFGXaRTb%U>6)y}4V z3DK~sZ@8Q5$}Q8(N0u8SAM%&n@$nDDclFb*rc=V~tr6rBNDa+<+>?An-}(mWlFS|*ek)VEA`DNdo5gM-A;`8mAy4jZWyJ%zb&CBE~&&CNG0Jhte zQt6li9(GBq3&PXP=#wSpZMrHG%%UCWnzY!wuLeBkC10Y=fCR7+;Oe*iV#GpEHtuv@ z7(Tf5+BWq)*!=4Qc2ZU&E3|rW^fIOZRcp3u^wBvst>2zjn_3V+P)~6jUxD|5^9rg& z$w}z=X8l=XzjmOB8k)VD2qY`*JW8!k9!2|x!bQ+&$|ZbbI(_Z< z@#F9}GUg}kcgp3R30COQ#cdh=MF!B(lGkhTv7;hNanQ-NrczHJrDv1#IKWU%H#`zT zBrP>*4JodS07hR|nvFq+F-EAj;)b@H6n(_v(cwLmW(!osjUIh@gy}7+Q`j$2(;roZ zoxLBiJG98ZCwCYp>uD@DQe5S6p^M|E6xrs)9 zJMVCNbUG|8#nPoP!da%$c?O$zG!41~dr%AW7KH_C9`glX-yu8VOU;vkF1z0+Eoe?W z9b48G_G3Bi&2kU6-$HM27V6jL#&8VeFPrdR(<5Z#IY*&&>UNZwnwnaBsFCZDWm30e z7JXqv^^?~8!tLPiPX?5p@>uqK%)-H})x=PPvgYLmOU@TMZ;FMnS*as$oug@nCe*d<0 z90#TZUtz<=iBmwq?eD{Sef@wXvPYs{4fr}!U1Jh^ZtW2hHI|6;aDUBclPr? 
zM)f{q;PLVgRQqdfdGQ<1lEJg~#Crj`y(hfaFadiQXf6Fw<%Et?UWfTJq4{zjg(x{x=7( z4FH4#Q*>XWLzAstXP%5iYA1Y=lHC5bAFL;*|1+56_J=$+)z;fRipQVV*;Ejt%R*7SV?wX@dQ}`8$1u(a6BUp_c6qhBpsnTCDzXlgFa44i+!l zQpJAfq0$?8nseK4V~KzFTImIF>=le&&QabhZ?N?On<4FvgtK^S#G0c0z7E^W?n3BP zpPpW5S!+{4s4Hb$+@AU`4wU(b=ZRI$kg)PH8a3z z!NJiJTic`Sd>^^|rnj0BGRE)m?XrFE=b)@>P|aREQolH|;%hgcSzVyG{nGyTprY&W zh9|YqGHqq#Wa33I7Mk?nOJ}~}V4M>H92{(4jYaf~s&W*_x7n|MvhKjhLrfWM@yQ3L#P-xqhaSpt!KP3GHueqVxCwsnOQWMZ_i8;?aUJU^uF#PLrO=N`d zFTLt&$C^7lZINXg?KC1WTiJbY%`P5oEpweslifm5!{WKx@`-L|8_cV%&}Mz>yCD35 z$$d5I?+l=oZ-=tFzWYPA$}Pi13g@Y%hg?a4+xl}}1zi>GexFoX*vgbzl6?p6mfQR3 z>$KICH~9e|*yc4b18feY!bgtdmAdN3F2@`}&^jzH5U{~RN<1y6RH0IpWK0#=3Y{xe ztGW@hPAR^Vk1!0^zBrBB9=6AnZ|7veXFO1;dl(o}fN)fd2UTKK5!+Y38gEJ(Omk5= zo7G(t zXJ|_1bXR=*kz4pMtc8WO1p8@~z|bCxPY*=^aAigW_Gc7Y0$9YOz6B^pE&l*yj_{+c z018_b8<6>g55%rd81uwWij6%p$uDd>+R4FEi95I2`M*Nws*)i28FaQt+&5)+GH(2K z)*rm7%CrsmFh8kCRH?dC5;F}}=`#IcK}XN2@W?BMn3rN{&S^+Fm_Kd zW$KgD-Q z$NG}{Gk`mQt18+z01l^*IRja%8Q?TG=Nk&X4jT8p6aGNDVHw(k&SzKPi=^!3y*`~h zncm9$q@(P)G*5f~K?%}i9|qY=1Hj4i^y#rrjQ4igCl%h@Tf8SIjSXFlmYghq4xmA> zPIanVw1{R-R_-jQRn=QZ8GC7L;1F)$6Aljk z2K7fJ9RYnR_;N#4QgfNk!OXM>^?t7Tbz~H%xaz#$Jva_N{8aiHSk6%j_2V^{%x<)@ z`cRgx_^PGo(#vo2fe8OncQ&+qbf&$7#QMN{VGdHxbY^<;1cQU?Qa+FhfYwM(Ue5g+ zkh+JA1qwPI7}FlD^Uc&ROab=$u#IP#eT^NEoE<3RL9*kkP9Wu z;p*ap+hxyGc&q&@1FC&7k9sSun6vnoh?Og@V$;u#r^J~8JP(o*wzP&he6(*l38jEO ziiiwQ8+PNpa`Z2%(Z3I+L2h{w5CwVZOwpg_#lOtB+mq*i&8z)qI`Myt6Z%W0WTGovcknUc>yQ=nLxhWDPiJHj2q)xU+ z8vOp^;*Qbc`&a%yWJ>=FPWC_N$o_Z}K+4GV!eR^uNGdSd^cV$^-#v{ViHx*udEGlL z9zHZG*C50WVq%(B|HOfgwjiVC{|Yzc)lvLgRmjvbr=0XMyGf-&X4qS z*j=B~`W~1DRg}p#jsl?n!_e6x5XBI94Xp)dK7q4=dT!cyf!_|KII$Lr5d;}vAP5wz zx)J47q(J;)#x(3%9VCC)u49K||0xZE3&M#$CjtXgu+RQd8|67)%lhtUpYf~m(VHDz z7*J2);=j8v_rLgoC}V0`U@HP|pTnPVUS>@-`oI6E<{|@;IBhk6!2)4Hw^{!VuiKrI zCd4ZI*a_Ltof)eM;DdlbRhLJ#KQBH~kz%j*r4{>k6}sP)PW(?&fj?DulvprkqgCBa z!SSvV&8{qu_Q@><)2t_eMoJbF#_M^k4YwjIx46%G@}R<}kw|9a#?!CxZ-twEK)(T9 z{trRb!F5_>;AM>@bPjG@B*t_sZbm*M+Kzy%>-FxgU2Y!$IaVM5Y;?}7+1OS>J0<43 
zf_79Edi-u`JbiJDPsXpoA%Qu?yXAE{)?br%5Y`1YY%IUjY7&q&1`WuOD9!Go`Y({% z>mJwVA0c`2d^)Sfu=i&&zPxFPUI!wNH1;;HQH5}M*X(mKc=O)pC6Zzd7o{Z?wXTt5 z<^37uQ%}RG?NoHMdfT-RoA6;OM~LZq!(<_}TLC^xWecI3$cyWP7}ta%{3F2;Cw+uZ zJMsfcO#Z_*K>^+@Bm+^~jI(t6Bl+X6@AY=z@oE!lZw>ujA1}$U9$%&QHYqRE#yR=?^Xd^te4WuqRPzf^iA5@?WXr>_nfv zIIx9Ni&OzZzvJl!^(nj^8TxArJzdis0uaEF(FicG&V7>g;MHq zK4S6lVQOc!30}OnEuUtdur-mk{np@}HEv}ic7PbL_H&vOVwvNdAfO>g1tcKIN&hjJ z{IC z?9ku*lfI^JG@$irYiA&m4}?t=?g)we&;C=%2IP8lo{1xL9;eLCs*C&uZ33QS3wHN+ zvIvTQazy`;DUkiOl$0xGq00^fHx(B!K|qHlSP;Yf`<`miZ=lUU@C3-Su(9C>I_(7K zP`uviq&I#G{&r@rlKXfDfHttMj!L;==itEaSt=V|ZVm!lq0nehCy0WoC5h3*C@!r; zb&(2cO5`9a>D8-S5XV=2E?GR~okySX=e_^Dnw%#(@4>ad`XsiMc!sVtPxNaeH-!J! z$WhevY78;Or?$V*0Z2aLb@U^Hgt0FhV240jKzb*a@?TbH9PA_lYv+t-?n0uE9ox^( zyZg0}=3>glz2;6o?S3Y5E5OI!e}1t~;@M=s9-h5KT6vZTVpVB98_UF0qIn{ajoL-(uFgu4O+-Mqs)B`E zdk9SYMo6v-V`_Z>+u6aR*z|MZ=29kYo{a01K&~}I$+!}#6)&YLnwH|VIJvb%cXw<% z@WrLH@#nl0w5^E4z?>(q_*55I558b;j4tiO!Zj&ED_fWa2*Gp$xHFB3XGzzHoIvV< z*fn@3*)<7;{Q0;GvivJ0KyxNhb22tCmo})y%)!Stv+0##^WB>LnQB)*KmKJWFp%r9 z2~_P}O4VAwf5vU<@VK*(n#7q+HgT~p~YYXQhpw|%C#fb7Ok4q zgObHTxFASf%iO`BgZ`1#cNfZ`v8lqKP2jx6uq{PiUg z%&vLTrznrWz_8TeW@lV6UT*D;S z`QnCV4!m5-#3Z~uub-CW>sbqo6lHei|;3 zl)*614SD#NU-lgT)ElxqgU?PH^&99a#KdH?hY5|I?`nd^FISwJp4IY;p&S|+=jckn z8aH1X4Bh_$5z#T;lW2~G&*U}}@k$kEvE6-@c9yK=eeQkmT)oiCIZ6%>At=Bdbj#zV`tQ>ttD@t` z{e^?_3f$w-%pP|{>?7qxpV6MHi{J!_73oU_!gOv@`QvQiQ5ZJnAP-sBHddSV#@w zZEc93?NyZ|Rw@SN-oAZ1!ipP@u4tUN`Ga!wo8Mz&yFPqC0e&%+R)pD9kTzxb9p>jh zF5WN7Q&^~w#9Ws-w8JdjT>NOk`QfJdab!$u0dA@^A7_UYYHf=)TVZro7Ob7Z8BzoB zM_MtQwb|g;=zNl0H1FM_M5S_)`heNW1_2l;c$ZR_K4kFgp$5zFR0vYl?-goE4aid$p7}V zzW~<|EqJ#mSFG3ZR!Y(!&~rCld!IHOp6T3Iba5)(&gEQLsDfe=pt!rXUx%wn6{1N@ zK1@v;zy1Mh01S{SD!a_e3zj31O!68(zku}6&cB?<11F}>3LX)G{CXBS!qtZJ_0+pg z&!2BxVv?_1ioE*$jY0XT!yYB#&3bYHdA$4$n;LQgEPc7fMn$>B7JWJD#trMqNGoeL zbEfzt)7V`5wNa>Q-+afEF`@ki1*+5G9xkg@N_qWRL7D-+{iViMvp zuHy5R=tI6rIS1#XLw~VWik3w*gF6lTHXJK%NIs%b0%8trf$2$dex%mhEl(b-e$^55 zIs4X#V5ms}=)~!|!*PE0gbr`cCiT)+UDy_7oaN;Fx#!LO-cqTq2g7x!nMvf|71^)T 
zwBYU43flUo76(@RN4ry#VdC<36=lwX0s@_@V}a<#LMbRxapVq9XM%5K9Q6@0I_fG2 zvsjJ&3SasM7W+;{R6;6AS5lu|4TpCC=h7}t8w`L;eGNM|WA*~XKUF+0)Rv>RDd!m( z7_b%hJeAUZ0{9Jk=3=vgzV@4aN-YOD%v1@PnkR4X83I!Akc0918p!7u!E33q*PHPj zdZBoJ&;i`U^-v{XFf=#mO7O*oHh%mm;LXBBBN&Hv?x`X`hPY;HYAP{D)B`KeVpGVi zp4x7zE35C*X=AQESJodYOL;j(nmNHoJe4<4uCYBy9MhI>tmQbYL@4VYalzg+Du@@M z<|17nfsVxvi5Bz^NK%!V;Y^Rvj-RzmGoSta-a>PrX!uGm7(11yFos6r#n!7ftVJ9q zV12#2l9-HPI*+g5FztY`5(9Kp95zW@RU=7%b=Y92cWHUq9kaQX+>@(u^b_Y(DB;Zs zSS&S)!=zPPM5)wEsL8_pM|SbFpYimJ+iBzXxD?u)n(D2;`rR?re$1!sPk6dcvnSw^6^#_jvWDdl)PCaO>TS;o1!C~Smfy+#JbpQ_UE4EaQ z_&BV^Y*{8GcA5W0;UFZqw@a+HL#aR~@7t!mpm*F#Dym6cjB^o_2 zT2&=$Jlz!)idRU|otu@FyBc-j3a|{hEeW}rc&trC2D*WeF6jDf0|7&y6%GKQ20N-B zFQUiRX3jKD4*BbGo*#+A~cqU}l!C#CW z>}eVGRD%An!gYzgi_U+w!i1d@-qDvD|&RtYufkB3%q-#R%bxZ;l%9hY_QQbOh(AtR=PMh zi?^J4n>zPcYN9U4K^u@=rC4!(CiiS+l8E|ai^_+}iw}oOye~7X$89FR-Prp0T%Qrp z9nCTL`py;grrO)^b&QemFQ{ zhL|z_(OqRqS^;t4>z2KE(@zJc|;X1r6*(ATFNqS~; z?QYNxfu?E2DFP`Xr|#QD{dQP1%BX zWK4z@1iijp?00EwmOek!0pfj#@gEdTreZ`x?A;E$tZFOKR%~UaKu5&Z$iwYEdniJf zBR4$8+0U#9N<@o@&y`?==q3|)PK=@{8f~iY1W9j$zOpel3 zmoPc8GexQqA0M_e@(eo`L2eEof9+YuQtP08pe8czXO z!B-iSdXa-tNjmy)p00Fcx)Mr=M?Wu#FMjWmJ1*L>Lw}5pI=YL|5isV*?|7Jg)-~fA z^Fbydlt=G-tF764(9)m#VPztu;COomkW*UHDjh6eehDR8eLf);VSg@)#1u4&j$T^( z1t^Ae0eJ~l-$fJ>k3@n$SOM==USnZZ)z55F5e#~|4Y*&bZN4rd_BBhy1*c0KzUo*i zos$UDWBWyM(43@_Y(IwD_AH%%tZ(-tNBpc|%u62fxT2mHXX^{KOCC8KnXlM1vYuEb zt@z9i7$pnwWjez>f#eg;urA=%o&B;-r*aN57aLp|ho{ETX!}DF&C$^xqYHH>{S9wb zmo!N)%DD>gv0Hvp#E`0sKglU%u*JjX_@%w~E@{3de-xwGeOiySUP3Pf? 
z+^<}W{sr73DHTVa1XKJEDGl_ST`OV-vv6p{pZ!#6~d{LBn&3_a{xdDTmSl6*`) z-;2y=xyef9oyVKu>&}!51)Ez_e?BJGzdsqaYTvRs4^9XN8ou~5Ob@*+PE zY~?pflO5Fsz8(`JT|t($(5D$ds>izdUEJETpviS+W8n#Tm+<3k1P5Wc`sGgb)#>-p z4(hSAl<#fXGwg;^Al!e;jk^3c2K$QAU;#@DQ43_3lZ(5J68#vF#pGm#B*%=CCRnkV zFTad+4(X-#YulAMi1f6qzNU5$$$soX@ldpWIv;Z6ppM_5OvreNgWE#nyi_Ojs`WWfrK)b4) zU;17?w#%dM#ODJabCAB;S8FY$a@gV}+BsmW^g@{Yc-mlU+S3jD%dY-yI5w~SIZ&}L z7@G9M_5M7QazNLlE2USy&n2#(y#LHU4&DLbApfJ>Bvod(0(x>43-w_HEfM!30byoJ|Z8P{c)xnqb^oAQk-H2 zLZc_FfaCdD0^K*}S@q-(YJc+k?(1-NQvqkT^w+Y>K?}>kt94!?g%DF46D%TME_tDZVpGH6nT= zzRM5e_`Npeddvs8#x$~Szr`B$J|`IzuGx~U6dq_^!_A%|8>ciZN0Y5~KKf}HXan;* z1nP9byE4%y#AG~tOQf3FK!4JF^(`!C_>r2`fKI$Vn-crT@#*&}l`9Vts8yXioRpgj zw8Hr_-P4wE>__{x=6!Ay3z(3vl9+@qQp4vh4E)c0!1`tDNe(il+&(AziU(sQQQR1v z0alAvJ6u^}j7xOgM3NnDfTJ4XY!mEJK>ORN?f0f;C#{$}REE|~6uBU(nHO?F`^VV_ zCFn@CAy?ROv0bMgjD!@;(g$4&fuaj}a)?7gN)5H9>f2a&y;8JXA)_QKsUdN?%uny_ zBd1AHHR^FKt!3X&HoGQXwQMvL7HEj5*N~}C3SsqH0S+}M#R4y6#jbzwk?Bx$49x1{ zu-XZYD~rZ_Libmm?FvCviicm(-jZeC?C$*c)ma*B?WZy^xUXc{8k_WLb|uW;76d&> zGUt{p;UARx;LvAo5nBb@f?9uX$DR3ffl}WZ<>g+EG>9~Xt}6;)ZT^Y06gsIXW=OB) zF)Bl9ifKLv$FdaXO)u@Fmb^6+WAM!+?g<^Sj@!qTI+q3&n5sL#moRh-r(7;mgAS?7 zzwhtu^z_Z286@@lc*M`)C2l6_fIO0F&MP)qi@p90*|)=tyvT+Qt~av8?}B=?NiI*# zq`pZ-%up=|sbK;fK8HvL)w$f_$;)T87Z0u}I33}r()W-X&*??1!~%(t z3b$U_v3mz3#G%p$0Flp>O1)R{f$;!r1x)3}#-JTmJaRET$^A_cj6V?M(oz^W4Cl;# zVO=e_S!CE|9gZMzBQQ=U0HY@WWBM_I-h2@YiFF5oLgwO|z1zT`qbGoImD#E21o}Mg z_|j1(MG!i@OZs-rSGEeE^R8Xg4iGdJ4-Sf2d$X?}9`gT&bMzJYDNThr zS?e48Fo4P+G!f@e6QR^r+Nf{kP-H8ojoJoAI587*wMM-V~sVd1H+&w3R!NOMJ9DD8&R5B%^}$6*`rD2@(3iH2qwn6e3ieGQlS9tNG=^g;EsFf#gfPT!P@-~&H!-K@9A zXyh4qq5S(SYr!K3Tp;r~ir)U_85x;ayS9}M`?j(hra|+v&>NuVqX?runE2rN2dUgS zP>UCK1Thi5s@cX{YFB<99gxyDCit}AVkn60C@_Bl{yrd2Q^}k;AhQL)l{q#jmXwzI zCeO%4ez~>2g4Gv*ZT2P;*pbvko9-wjW?-|wwj}n)fL?j?(HliUqR6Nz*mpW;^rLO> zE3bF9y;VMRt&j^VN7!)&wzZ5z`_REmM>PfI{y^p|o$lKYFKhm~$Fg=RC1_mn`b(P? 
z=TNqM8;25ztALut8B6E@8pe5~t8>n#l$a7D5pQQ#b~aZBz;{~2kE(nQ&d7~^G@(Sk zve-GWVdCcSjY>T}oElEEM_(pn7>*4RbqAen{rC|?d_eqbodr>H9F zJXEUjY$o>e<+sF}Ah|{#!2sNJQ}8?igJSbRK%DNgYcFaUXPPpYRzyxr+@g8-@m4M*Ew2$==CvjGH*wNPYV)~+DM$virn zc5+Qf1-P#Y$$$F>IR^OjnIi_P1F06BHg>F&kd4lO-3DQK6`W<)KDZty7Y2)7dHdYb z3dmF6Qkm@_^ct^C8V}07k+W>uOWF^U?UYDk>2dnrtJ7%nBxAGMHut~3vF+VjxjrwH zL+NhaAl=RHVtBKJupSo2(X*%Z;0*?MX%-Rhc`oH|abnbO2D#Ot&Nre_O9JbvnLKPD(!Vt6G1@>_Uyy;#k+#i z6+Wd&a*FzZK&00Tivwzv-E22Hcxa0ojYGUi^q|EC0jh^9> zKzvsZzN$46fF)%vb#g9IU++R+97ZJT&5gtRhn-L;1Zpoaqvw&RUgR-O)H*qj2aA6c zp_uZ68I)uQu1hy?AnT(X^uNw6bzm3_w9u|=C%xgr9Nv8fYuIiiJ;8t2=<9RRfILXK zw}!clUVhe5Aqj)Q+8A|}$Ir(Hwr+T}fS|0n`6iTOfO&><=twZGakT+6b_MLM5TQvV z*>U9bgVtmCHga>kWrn(gnYYId%}XU|YbWW?3>$=CDGf41Id9Mv7=36>IglZ+Htt&) z^ukiL&^c9BVU>!9MYBJY60yLio2tLIvHlGZ?`9mfG34XF7WS;d30D0bKT}JVnrmS!G^PPCf|!LVo7V@Oe_>B45hy_F{>!U^HG+-DL~D( z!pWg|5rF5Sp>N{iKma6X7k8|4sG0edC_**%(0rmpSdSiLr)mbb@<6HpDn6bd^%nM! zo3UbL(Xr&26XN@&U)rohA#@dNuAlU&E$hSs$arw`bXpz!eRr54HlUL zhm*-^oeQ-leiNm#Ddpv5@fiJ=$5OW8?9-9kr{$78q)tW_N2!L|lBdrPG73BP81@!n zBw9aV^!Hk@FHZS4TQMF0g(FmrY7!TU9on=I@d1l@>b{5ODSHmYyR=3eubBb;E9vza z(WJboR!!+vvlUCMs1=in6$PtvMG&8b^FqHj>S0tTj_@Kbm0*EvKB2!FXCYlL5Eut~ z9xGRaR?83Cvp>nhQsd(GQ1gd64x?$}#)-);_=mh3d&G0}>qa7JcLfRMn)otj5{ZOe4VLEWYdbG^Oh$HSLYjfV zisrC_BaF#~=%!)ut5+X%d}Xm%;8u{oiS22h918&VqRE(@y3;W6@dQMA=9o@?Y0W2* z?TlW(#uix&zPGhA>ITshjkblj3`@1uGFjGogH(inmVT>4NpW$k>n1@kZsGnJp2Yk| zdVE3@cY28p?G=Vv=}fI>E>`z{(vhi%i#-zEo)UKfPias%n@iOg{rTmC@s05(bQED9 z+Ev_S(kcD>N8f%A25h#maW4re9KI?1=x-05mu;=b*)jIaWu?w0e6EOZ5kf&uB3ERj zWNH;ll|Q0#GQzA)P9<@?QYVZ_Nr%$c)uYBE1EIdNv}Qa@yKUuVPbc6(ZJ3SXdJ*ke zcx`#O(kNiqpnik2jt3zZG~JNmsHnYDPls3_JWWiFKGf_jxcrXbO=Q*fTwT{9Ev!%O7Hkg$F12UV z({;%sKBPS2D4wMef_G1DZ`GJd8WU_U>P+;dRj-eD_czR~o);X?u$MrXzU=_vzwSoI zdeRkVqpbvSm5zc?<6Tw+qmDMQ+ynYwiwI)H5Z0onR^BQTx}@AbcBK;DP@`o)BD6?g zE8Ujyb?TKj&vTeR?Hzw$yVM6)kRP*keLe~i5##TbV)K)cIw}X!_LV+r@;!5zhOd<> zk?Si0nJUeU?&4gee_76IU2!<&?h3g}q|tY)kk&M5jiAT7CyDWHGS@87qs7oVy)^yd 
zBIqfvq03v_eDURV&osUDOOyy~UP$~z2wo2y_@HK$Zq5;rYx0UN>l4)WP$ezUFn0_O z-9d=SM#a;C0VuzwMrAUZNoib8(jj04IX~;`hnS8xT)WH=2zo45KgWlqw7F|*_-_}7 zH*=vS@w!}K=B897hnuVg%^!Xw$HAaubmjSvo_H7DdG@!w;%Hjd%JVMRmxr-$T*zUy zwcx1NxX6P^wm|qfeaz9(L}Z6tBZ-9?fWTo2W7a)65vVL03qe7!mVAd>zcIv_J$}Q| zh3!Ph?mh>+d{IN%{k%Vbj+!KMttaeaT#)7s#S#&zFnBQou2L04O? zid-Pm%)q0+D7$_!ulS*+0w6ar##4e4X-Um35NCneBnBsnV>aGO5a)c0us95@8#=}K zN;E`oGSRxnbF@x;(pvtw-U<5Ndt<)c)%y1M4&U@Au!J=-NOscVCRotG#xg;u##cLO zUVOu1;KjG(?N0d_Aksc-I>_#36wSUh7@7;}!xCMGB zuA_z1c7I1V^SA(75ntL2=;Rr6`})gw5413%wBt>C^vPFhFWB23@y-qRl8jt`65}@g zK0PjBsCn6ybN(=Q@@SVvFI7bA+NJ%VOT(NknjuSk&44hYvOU z)_LjD7L#Dgx|uQOCJKJi4}Y~t)(Qw)7%47hUuxcAjk|V31Kgl;fJH9=sKyJ^Gs!)^ zU&1w1LEik0{YSK_Lh`!3h;??ZW=?13C)&Q$w=>8x=yG>70f1lZ(JLW zc`(+AVRz2QsZ-|Dj&@XgT3(inEXr32h?sk6u;a~t`9+4o#Wi7XdAX+8dg8ZtsREIk zbR4laM#9{~0!H^pQ?&nFMg*-^&g-tO?%vll7}DF_Tht>E7u>;!Z_i$}x}O>Mk3H0^ zBw}IFBa}M-AG^GEE;P#>M;Pc}I|LWkJGFBLzgRmy@Y?%ex=FxymZnZed)E#ml%tw! zT3V*&ODihO%N!-0=RT^biLK~`PFeyf+!Xk^@!lLJK4`pr5N{i~EEh1-cP?^KbXUV{ zUROzpFX0{*ZRWAPaNP&--f99|kEP0-Q_Ur7DIkx^aH)PmSC-3<3q*@+NP; zJZ@VAT3A^uO$JZ2UHZjs0y*aI{`v811?~Fr>mSQUBdsGIA za@MK)Xrnsw-(}`gP`l*&R_cMr8oCl=WrwM=u`RcQ#!qJb&};;bl>jcohbfEsXN;=c zs**qQ^QF^nt%P(t0u1%&oJTjrB7#uZm_s9wX2>2x1$Q!0Mm(}C#M#>VuqNoqO(S7A zt%g`0BrCNYVInv_sZfuqDDj&ddP{cA+1OpqRJ~bv0Pp$RLYo;rNED~XK<_9txVG4+ zDeG?eo@WJJD`b(~Z}l^hG%JiPSG2r-)lR%J6|SM_J5&~jbB$C;p`SOCU6`vU<6+#| z>t75sm{g=iS&loq1!>@!_r{sS-gL+8;y&xN+@^Cr(4s6x3~sP8Hz=Q?jy)P9F1!gU z+mP0%jT-wE(+?(@ITw4tabP~vrNq@dpiBNkrp(Y-AWXKp5D~TbxwYSN+@bAcMq8cs z4mrQgCZ4j{9B>BJ##5rKuGg=J5i|K+K#^hKC51FRA)C-bxU+pOafvDFG3pjkGVK1q zF~jEMiBw|rL3^gFIAEGi0^WN?xLN46@qfrj4?zj`S;+M>wMBxH(PRGb?m;`58i0#c z^p>Mcti}wsbxSvZL=Rgnm1N>Ujtt$cv(RYe2KvP-h#;u7BYw~sBF?X??aU$H(ZyF3 zCieB6dNV8yYi#PcErYq#KYCUkwEjeR+F^tSEhf{Gz?O(ielSZ<2Wr9H$oTvo*F$&R zG5mLBkO~nAjH`!}^i}s_^F;J8OQ=~oMWlfAyM)oC#-09#*x1bj5mQl^lNK@rw4n`Q)B0XD&B(BoCo650?&0Q#%IUQ4#q=DBQ{WN|Sa zO4`XDXB~w2!;SR2ISiRrcXQ^AyqKC24HUD^6thYh%grvU&!Czh& 
zkcoH{Z8+tx_cNlfyF_3@kRQ3}qaC2R&67=esf>SziHeA}D3LB*6$dNlCVvudUN2s`M%jiw<)$-T2xO=X()E z21)CO&z*o&-a}qE76S*6r2~Oh&N$6a>sNIdT~c-PXqS-i{V+moyLvhJXYvH1J9eld zbTpjQ$nHU&b4p#M+~|pm6jU_z0EK#d=omm<#}pz~W_fMWZD>m$wt?8}?5TmlerfY~ z)la&rDgJx^S|RM!ygH`BR%v73KrbiXg|GEZX-u4EZuI@o++mljtG4~fz49Xd2j%b0 zaYv4M8D^1b*}5>YlBMxHBIp-u7QUyC<`p;}y-6r>HS>M(Vpt5?(HiIGTn#z?^MNQw z2L~arEm%WmxHP?Qr*&zdXw49aq%VpIa;9Vwp{9EQuGbvh^a$%jzWI#zKz7Ufyu2_F zQ$Fu6xS6+4DSmSm4@&{!=&*oy4;nY?$K2R3Ht)UwrnfGHC7nkx_X{0zU0=%N6Y|%j zePQ$qpd)YCj2GOoh&_N_f96@rj$r+R+ zStMtn$!pk=SHZXpo#~GLmUhlbg(~Ms(&4>wfEob=P-)zwi8ktJ~A( z)TvXoYuA3B{p>N*ukW$%z|SxFM8~RXJ_^+z1gwdFNH94m*818G^tPlcBDFw(Nn@vj zqkLtu?x<;+k|M4$lPg#RqN!7xwzKCpU*A=z-{7^mt*=K;7eq#IaBu*wL*l-iV~YR^ zMndj;a>=Q<6)XK5lIU{T?@kn_<+$+owaVckU|_`4XiJFEg|`OOes&k1{3b+g(~}g+ z1)30Er z@c8!1VLTm_a1wIg4J3COhU|gRPaDXYQo!`{y^ym9wdXZUfMCbAk}%qt)189>6rtW{n@sM z${^>;m`F;b)prQ}(?VgDwgeOFFYz=8iKWNvu7FB}H>ir2_*_kcH6TFL+A$hn^{5M; z%f~8^LtpGl?E-3OL;}8%qcMn|{b0brE)uHGU+4d_VD$IFii7h%M4*eF_?M3OchZdR zGXO!BpbKRH%%JDF8{{BP6uDLHA=k-Ndw8evMAV?=NCPOLKpLgMbT^qN_6L300YN z_2l=lHto!|Pj&OXkLBWUOW4}};3aIDptqD7XR>ELUql&SDir-%S>AK<_V4A0c=|>_iiUjz8@;OWk_cObwR zDrs~oR#ghwY5X2ePnj?VY`D(D*oDgb2-|*hx(q0gN`Kc@V-wgWP^8$>!~8a_fdX)J z+nU1y)^<l z_pK>g(be*`q2l%-?dCo0Jq7)tY*6PkK+j!ZomrpI+sn>2Ur#@_{sA2;B^PrrBSpf0 zm?#h|z_D}PT>>y=*1W_s4=!_LubjBPjpgsh<2iP3_mXE|3A5Bc+Bn?|fJil`i|qy9 zMst3Yz!nh$euLPRoq!@QmxfB$zE+6!)XWae_p#Gsf+g-b2#oO0Hx9Wpq*P7|>*_nZ z;BqP+!~@mGJebljb-iX;xFkNSFx16{TH=GV3uUIFSaYqpAi7_apZ1RebF-7rxkddH-Ng&tOWX_sAI`>=2)*5oN9f&AgI`Tnh+a`sb?tAQ z{(z{pSn!DyEcfY@8UAAX{Xs>AO|)S|*9w!1mEH{&V7tC2z#gg!vX=52ykw|WZvK3y z=0qc-T87Pg&1}WULAU6oa9I-2r!5eJ>~awxJkkJ z(Xk~4rIb`Z?cNseJ_SjGuEuQ7yhddxO%8XGgbue=)e6dFB@+<^-Xn+hcSAs~qEm+} zB&|#^DbM|l(?BN1w&xSEKwUABJ_SefKtp|%)GX#iv@v;bsWPPu)ne^(ru|flnJwrB zi3>~T0m%v@TOtKV7bUnDhJ=H^IY~sw#A({gV4ekA)?09YD{r&GStu-+ zT7(Usa?wmu-CUo=U|6ez|2g}JR2984Q^~~c2n4UR<&w9#i)D6Vc-{I$h>+`ogxbZ z=pNKy?{kq7zR9Bmx-P5ih|zZ?D~4?!U)er_>I=s=)nNB{xm+k-kRZ_igUuuZ_Q`6d 
z==$cr*kV{UlG>x@?Aq-^kT~~@cfA$_o4$&xT@aG|@;Nkq@{FEm*nUpggp@kz-cdG9 zB(0$Wx+Hr+1cXp1K=Ttef|^VAW@$p2w)7j8NoQgh_ogsooStjwOk@prJUY%a|6b zNC@_$v zM38j!P#o)fE`3?FHbur_@;&^1M5!N7Q;DpIi^@bFc!RwVLC6=IXC1Ye63I62UQt(8 zDjC+`xAwF)>BC4e7SB4CP)61Y?U60d-G;p!4(lHS9>( zbQBKAoGhwN)|Qq_aI;Q;RE4sexSDtDn@k+l)9tR9DL>UY{HlTHQPOoZhgd@sMr=1n2W|umnll1mQ)W7!4lAc=8)0 zBVlg#6-U>y#3vL}t-4@J6lg9f?GHQZlo5o~GKL-3^fzo+q1g_R1NhB{BBeVx3>MT)2C+ z6Yf&Sb3HBM-IH^mZl4+Pv2LhhN+L5IJo%%I)!#g$V3eAlN(SmR0ZXDw8k>Q|OX*Zo zMRms3h8iv(NjBX>{bD7EFYD2jDB1n`*x_#E;NVXyECJT3mxLsE)XKA4ZO=NtRCF4< z+qmf$gbkN!8dr997pKwE3~?-$Tbmdh?6j{~iIGEWRZ2-RVb5x%+}a&44|b(08a!GZ z1~ppA*z$FLVwtK~3b~mx?(GxHb@93_jgstjwHW>15uMS zL@oEor?k`Y1N#o@J@;($?S*GzyfjbswY*mRWVP67doe=>$CZ~O4^sAWjZ z6N2_f5lV11B94}pVU+ycmA0&w;cLjjYcht*$N|FmXC7Y|{4kZ^{q{Bq=qRFUtI?Qr zUQ4^iWMg+G{bM;sh-N7VuVM5f+n$s|e4@oO%v%k0)x;lQDz+0|b}@t#g-$^mMNm3z zHUZTFH>AW@OQooTN@rR(x0}R2y(?i5gKmHYCC&mF8J$D~_0_%WXAn@j(2NQiuP$X` zV=HoYcsGM@D3@C0q9kY8{Ji}gu0$`B>5%54Rkg8&`|`VYVRhWS=~9!(ygx*`NW7l& z2Lpel!JT(d?K`UQlDh}QO8GO6XD+2-oX<$k)1^J9SC@7Z8zhUiO(cqJPFUT%*5Z~9 zGLj{x>FV(>OL@_5ISHhX5t5anVZYTf7~jd;irMTms(J`!Cr6xF4S`}3=?&$11bq!X z?JZJbXg6EX)>dDP?dcM+@N_jlhQ-gXRIQ0dejL7*^{T4_m%H zx&+-w>4oeaqgd)qXZ>e)O6h3AVz=XvI?N3)rx~H{5r_6GZAJLqN%@tPex8dW@$D!TbH)TcQ@?QWMp zU3bviQd$~#Q1H8w*jdo(O^C5c;^BBx0Cb_wUZ4_Nu#!-jHZa7mlg~$&yfg0?ZYj-( z&RT4Lki2P0P`BAKvNVzk^!Iao6C2$U7&Hmz%p5#`f+*Usbkn1QU>v^7%x4YhbGT&Qg90|@7&;taS=JRgxLjtReEI|4qO1xVsawRBMu7<+TZ|=JIXBWhFvgu9-{UveF{I_)OaOP7!hhM zhuUx^zum4(f(w^n9d4}iV&DU|BF5=ruNJ z7ynVLZeK`C5!HOT@mW8i_EWEn6RcOvdXJCUjc_hY@!W0A=&t+WwWoAq`&YgY2vek% zSm~(qs2ShkbUqcU&owU zmLSLYD3|^mWtEO5pH##3h^8cYo^`E4qi1r7j^5)QsCPjT2f|PF-1Gv;xj8kpJkY`% zS==RWcTy^%58}8POoo_lggwuSk#&f zGb8AlVwzw@0S7@7B|FPaQByR^4N2jCz8Q-ligK=jF?Tt%pLu2I&YQAthU1HnVWZHV zjBQgo1R~n|8b1u66IPhr%5E@Zf&ygj`}unxGcsBq7^{pQZ)fi}BugOVqs0;WUSun0 zJB`>384Vy-iU6b!{1vawc@6i$Xv33Ek&rm`WW^~~Dyr+kJH1$}ErG*tEorPSd-(O2 ziLEMIH!-1VLI&V-gvIb~&cwK)Pl~O&jBVX+?Afsjc{|)pexaMl;7$Xv9lDCS2RW!? 
zNnar#ZpS)gut~`(4#p%wxR!Ali)(<^nv)%P!@7=bNL@ASBEvwm<~#{KeKi-EcDUK6 zf$>F4?%q($P?XC2Xgar6>CE31+u<|fuPXAIO&_sh-hHt*a#>z#(r8|z^$C-=a3(@I-^@m+X0h8egAeHi9s3iTnD-xUPm zJ!BL!`>`cII87cByCN%;9mAaDA^ccRyl5C`Aex8ds@JdcqT#K1AJxZrr zU+HSwuEsK&vucfpsmJ7Xh854G5x94frKPa)BRA>nu zN!90HemHW4(MVU7cGT2;We$XZ@l9dFguJD_t`P{^LscR=c#yT3-8U1qv2Xf|QXVV^ zlacaO{KCVq8qUm&wsdNN5>X8>GhiuYF57dNqj?VL6Yvp6FT1{;ua@61`%}qYtky0b zwm*lj`Wam!4vww@x5cDR)GcbayBYX#_et?WVV1b1oD0Z_U|w3p(VLT!&+b1_2r;$>!szkcCqM3f3p;rnZV#%W9#DnXG%OD!>kf_7^2|RBcWsD`sZY`6 ziv$h52=!t4j3D{I;2 zN-6G}$skt?H=+jnkJ+to-xh4xA=y6~2ngA9H?ZYeXzkV9{r(cPn4CpAui`8{jW?a9 z?bk3baWb5_dX}GAW@E%Dw6{9Vc6yyH%hrUGnKvT zwB?INRXvp_C{>C*__0Z;q5|fXACYQ7iVp)ZNu{3`J8RFfGbttnDaWVZ6d5<#bMJSc z8G?nBwS~H?3NCjd`tqPMh9m|8QHR{QYv>WlI`wtg7#BMM>oGwpN!s`Th;f^QANzC@ z2D5IhINuLl{W!M`__1#Kl-1s3@S|W9uSLno{$v5Br4O$5b5i?aSw+>pSKx20b2X5s zqZ#W%!|6>rZ6^3d8IIlhWGx}moJ46>Z9QOTeQzFdi)|Msw?$yHx*M%XN&coKF!+W{ zd)mDs);8JhJoWVeU_!^L#(8EeiKMphv(x5gc+)(S70ipn5BI%l5R129E>pWqgu;ZF zx=N}aAroKACAwoUA{8GQ@cU)2zSd5zm`!g|Yd7N78nO(YXf73)FS3t_QV6-x92oqWkA4@nA;-+X64Vz(hRccHp3D+J&lCVoZ{B!xiKn)HyF1{X zBPEZgo6Za1S_zmvzo`gIBf|rx(|Aq4gIW0gvx#0U8mPj%y$PCXPSz*h)->UU9ogAu z6xMTQ-&isugGU{of~ERCcBwA(ULD!*pAGooJk5!J$AV=;eqzD@w8gDbdP-c^m3S|f zLc*_d>vNTo>&lM!klN3TN%DM7R$@+Jlc7R7M8`agt=c%1vkU$5h?eu=+ahD&MAzBR z?l_r`WK6ZF(O2;jcEu*ppG1PLZwx=uf{TK!+HQNLQB$+t^FvFUNz=g%lH4ak13N8*EiAH88MEAD&Y}&K zdjrr^(??!F@}Yh%V}HE>1DAbNm8SE59@XiYpIzdBYkBw@cGi|+6oElml z4hJinXM}?9j}@68l~O5a-THLYRmt3k))0}as;0f%kZ6>gY`h^=VgFWu_7J=<{MtGX1fG4%#+<;F~?;HKFXeYwO>ZvIghi%ciD(AR2B)MxW-#VS08fS^aYKExlm27|&(qxWio~54)4s zMof{bBem+1jfZ>8fqBAi;yWv<4IOnm1%tNhB7~9>j`D{W#%3-8uZa&*anP^@`{{}sCr z0w4ioerX!w5&h`<9GulUTT3NoY{%n*%O$FgEq-Uk9?Jsvlk*C37^hWLJZ56Fu4X3hCyDL2dKSmq z>vC}}4(WN;B)^b9NPXNJp(-~)^l5m{hNK$q5#VvDqWWu#XtV{93TjrLFSNFCo>PC9 zZFu1JGdJ#^WgDX88rG<77Eh>cR{jw>;`ZF=(3&W={t{KG>e*MLq(kK4afAEb}& zR#H|M&-=8)mT;T7J+DqV*KJop^^p1X+ zAT*jXw)Xg|i6t0bR3Q1Xp52stac1VI&$w#GZ98&eqapV8WRRQ@zk^}MjM>rhEd*ly 
z@dYZEVrc;o;?v)#ab9+%W-=PCEx&V~3|?L4p&F_!mzE8fkQ(!MaFq3r#hjX!_J4~S zi~TwPVuP0OkuIY+T`8+c)$cPFCouplaGy-*@bN~R80yM%V~}0!ej#0k8pYzurYJ1D0b3ZTU>=(9dGyI2P&r>)BE?}AJRvJ_3ho41obbA%w4R5{yIECL zL=vXWUjM+QyC`&V>uhWpv6uYp<>9}NsST|N^RFGfkBPdtSC8X)*#D2q&3m&Tq3`_I ze+eE8N8YXB(-ZA%(310`RuOv@Zj?o2t^DS$Nh#Nl;L!6@tbDy_{DTFE3AY5x3+iY3 z{j{-^BMF!cK4dLhjkMlH-0Hh;opLh3FknnwW8O@f=@=QNFI(T$wNZa@lQJyH7KfY) zKX!D}hb4Qi{8@{Lm^K&OvT}2Uy|enlvq&k$uIDL@jvk$bv&kO{xb+IgH%Pd>CYp=B zw-slp0~=UMQ8u%_UtER!jt}8DC}0C;J;uYd7x}A8huE|JkYb0ilE+m5(L3zn;E6r! zJ@p7T(e1HAKqJ5&FXgdMk5e63pMp#W%C3Du= zyykQ)JxvmKGFNejVKwwWKx}_+-e~@R6Lc9ABu$8t_r<-(lg?|+zm^V^BXqSoSJ)&F z;~=fvNfji|1TTFV@|@#Ps#|0IHwlaFnP}*Adr<~-H07gIh{HgNB4wpi!p9bcA22%i zKamk$c53aGm_ko%x@B8>vboZ_{Fcdmfc^#P@LdGrr`*llC(XuNfq)hKBg@0=&3_l_ z0(M4T9RIt3;unwm|20WTl&3hSL0lq_(n%a)D1n-8awfDmK48=0rp99lq-TuNNr>0~s) zBkJe&K$Pb{<4G{~Qy^OL5|DY5PL7&WSdPS%fZHB+1+Wn5y)7LDtB-g0+rwoIH%Wmm zfCG~$uE0ZDvRd^Y-UIGmd=xCqzI12HX+-3D@V?62?o{zKH2x9F7p%v$ST+HeIiPw+ zL7c8uAdt#oFCOF8MzrjWu~_f#TmwNe-?%T|7Y9*aY;=gfA$3xTcP}5DiQ>RXXV%_T z;J1jJ#SBBd&NgRLZIldtIL2juU^(kS5hS^|&FxOG0<_wgdTfrQI@~yKN?xjat_o=m7m_A#!Z))(W63!GZc6UJaVK#tH~V zUPS!WuuBntk+5gliyb7;Z9u}OI3d+h4)==GO;ZvVStK!d$^MR}v-*DxpU0*BGj?v)Xx~B4 z60(J<@r%XKQF*0Qy>0U0JQcoW_D7y~VB(|?k-o4HD{*-JBEZf=-7{~*;d-m3&A#sQ zq10DD)RSjqgMC}ELRqqfZ1tgRKOdw)t?Jwd4-)!sr%mMRsNd43J7(+N5yD>C3;(|QYj3M@4Vj*eM+5;BZ ztC}Ky@90DBO_5D~Lf%v2o?oVy_uY*E5%e^Gs(gTWe35|In`lN1)~F|TSH&V^?N-X2 zhY%-NS>j%}sliWp?R=CZb_`ug+h}6Tn$7ljt}p=dK>2A~u&6w~Z_uQf0V_KxEp9M5 zC#P4>Q_?~u(2lkqVvp12JiGX3VsAf{S&0ksoOGCdbRK4HGi)o;Iu+(psp^Q*n>Z}m zuIqL+loeULCZ^5X8LH!Afrv-ZyY13Hk@{|2dMvQqH8Bx*zUg0k7zgvg_N1WQ1|c@` zQU;JFZ`Kz=<-O>MZwpcmTB`4vN~oxcf9_FE+t_D&&}E7&@B_K?PmIV7eo)aWFGV(A ze$(92G3qD{BJkF$NtKz`5Y&d^w)}w$Q_hk*vZsfZ=5xE;8NHYi2GpW_ ztcw_Pje|J9{K!`!!*l4}YL!)pL!F(K~aTN!ji_+v_lgHZqO7TG?BRiOoIPf#uRbV&BMU^X9)ma%n?IUdHn;XM7RaJJKNDAIXPMM@%Gx( z+PZL6>LPef+6a37s!Vs0HgCdR4pVuV1e-)@2-l{hE|GeLqlLi%43edg_R(m$3X-{g zV8R5#89Ys2=9vsf6540NTG0*O;j=rki_O?e$Kv;gm`@Bm;a9M)cy#rAguJXcx647; 
zOwSg`ZWgT?%dz$VR&!KNAWvIW#^!Z`-wd&B>$paYHDOn6Z754xJM_^6t&X@{g5wAv z6o9B}0x4CmB@-62uk|K+>ZWIS-Jo@#h=;%UV}_H+bVT-%bXoZ~&BMHaWx2=Lh%D#9 zBd%~2Gw+=2VIV|&qwxUDdS*kGy zwbh(Od|e`TZ1)f9)MiTZurARcHij%#_%NEzZ(60oc^M?R^UBr9+-YTC05TARqj%b% z%>+o#kIYrPUX!fadJS1Fb_uGOHvm}8EO#?AFz+Z5AhDmRVwnVS+YGdPpVbh6OZOqD z-ncVWVC1pK%TBC%QV+upuiN7}++nUzlT{|F+`Kz{uyEM(RKx9{ccD22UhSjh zn%MBPva{|TckaBK$f^iM5su6OLLOPeW!3Qbn8(BX0G@{@Ch-+Hb|QA7Ch?`X@(f04 z5$-por5va(yY4o~=jRqm|D3_Q-c*-y-a}ml)`05vz3AYytwK^DRMh7nM`|1wlRaz; z;NQ+=o~V{E2}h)uA!g6|*s*L2wOMdmE1)3m`7krJf`USWmBZM2W_raAYZY(T(*C5O zCyjf#dvW;CScP-9!(HsUNPdx9e@;=0R|BcpWbSH4?5+1+4zium<*wH;g!V@M6H&mC zmO!$pDIBW2voxBzqfUgl%4PtDby=F;=XbnNiwTx!(y@w~YdM%2qmFBQ48KQp<&_3G zi8l$m&&|1-C#5-ob>Q3wx!8!E>l=NPCP)IYt`(k#p~g%|Da2~Blo3pRltQb4#X;;l ztze|xwF$`@TwPur!zbLVV3Vq$E-cinkgaXT1fp>(*?sy~8HCtqhen)T0iMcV=A!w} zG=maRkPR9)zF|622>1w*j+o4K|73d}r?&UH&?6T2tl3J;TlKzwYE;&EFvXX`F+FSY)dC zR-Qo4Ljdkpk(VOud>MjnEEBTX7q8iEUW3r~>eSL_azQ^Di(k6@g-j04%$BH!wqO5P zP-Y$qbH@a-Mc?$h5h8z@6!Z?8W;9cn(f%X6{Jq1>cP1IHiyZDckPOOFh5${3OsE#a zSYbH=ODIr5Sy9i|&a~P?4RULVjfoIV2A80#2N_CwdL&rRTO*y0`eHp6DfF*Q_dBYX z;3!Tn9IEcNV*k1Iy}Ftrnz6j9Mq zt~2yOeW?D~*N{ul{Xgf5lFzNstudMSQeR5qDdFY!k|AhvdowwqburtMtl$vW1Vzu_i?QG2Z;qbYU51R0UG9 zcOKBcVm?4q7x7K9JEYrPD`;X>W~oWizw8x5lhw4jwz&&n zJSmtdzfa1fK0ZiHo2omL+7;gO!~imfEFr}JW!MuI42_a;~rCf%cl?Z@Vl z5oqn>=C=JWQ!=wvaJ7sdQ0X~glrP%*8_1<}H{%cX_dca9VbYgvc~F+}g2%vP8;)z8 z{yQJ^&=ox6E8yMii*GVhQPEEJtiCqbeAhm+Fpp{J@IxIMk^^>+rhYsbb+t}7i;$n) zvji@_Xf{LfgwPlZO+tHX%j&9_gj<>oFGok0UmVPs_9;;(`5nL`rTX=0;`=(gnS;7d z=l3*4IAc|sBAaV(v=ASR2^xEqgs2#KFj&MBL&4A>z$Khl}Se)yY(!YSUHaqh8r$y z5ftb1wIpO#G_nTWpw+>AF1f5sKcHtMBAj6teO5%Sooh=tHNmi!1QaTk{ zB_77L=kCu)DVR~Wg`)BBl z+1F#;ec7^8+1!$R@*GpSwOxbw&p4;!a`F1OehJKTIQ#nRL~qBo)eY;0%F(a$oI)Ed z_@P=6>34NKw~d7(d6Kf|QCzf&ssn)$v&r}SB4(E14LdxQF)xxG9u!kxf-@ZRXe2axFM?K=xuin)>|#nlnlst@}kGdE@!Cq(jGuZ$8N zZwn_Yge^KuWL~~OI8bM7y;J4W$Ke^mvmB(*qcSy@Dx8RR{|wt+RB`X@>)=pO@%?(O zdMM#p0ciH8BbQn3y8Q`$d(Hc|YOcToc%uJ`zs^2>0aiLrK3NYbO?-j)cDn*ydP|Rz z{KRLZWtxVgacnf2Yg|K`0vTYn{M 
z9=f5jondtCdQLIxT#4^pkJH4346|Q^CkV?$?ejvYt<6MPjl;Sij=F-zzrgaiDb-)} zl9I^BrrUe2Pmspdq%0pW?>`=-lpVc<9ORH26@OOkuVK3#p|4fAd$iSIjPUV^k^MRP zKH1*K#*AOiF2-x@gx_^E-}9@WbqnI?wla!&D4{;(5vw@gnWf*S2J3HER7by#&9tW- z=7c;TREKi!2C;WoTjJuAerY$pNFTGr_4@=w&Si}7fV+%_0;DP z^JS;k`uM>wNVzU5qn@|@^!(l8ctxdJXDW3kciD*A*?tyrXi!Woc$Jo^=!(xd$WxVl zgl_1jd%(LLNX;|Q_q8H+OpjkydT~7SGOkP0XoV=-p|BCdnqZ;=cAQ)syfo)VN4|@G zIBP2;d24JztD)(b+K9K^GN0*epFy?5y|wnPJgY=j@@%-yK<<(6O7P)>v5LBMd$t1M zzG2GQa77{f)BO~WWKk~j+smUCJOkPKV;$86Zeko#v`gb|zY@b)T%;F+Yc*`*mN{(d zCFcm^!<#Q_{WWl_zc-;MUt`@d;H1LUt|G?!^_0$CVnT+1o>t1`u?lnTa@gnB(GD$qrVQrkHm0H|_BvF^W-FJ~K-A?MWhwW^ zAiubF%CtR*ElJZeUzD>bdHF3j7YF)mtkhv0P=<`rC+^a0Q(p01u=o_E=(uFt`LC-j z@iUOw%F81br@2eM@0fn{7S11dSQGuJR|VaK-1yY@t5AJyRH1jC8mjY%ZKxx(kt=tU zd%N#wXMxc^fkpl{vv-MKHA`XNNwi*mEbfR&U3xfOjYjE;aYnmCKbK<@A?ayKWM6r7 z)dTZLwYmeza(9p9b?@z`Pq(dmFImXTEL9E zUTPL*7bL8C8cY$Rsr|Ch+MI)OAiiA$Ip}D#`V>A|tw&+My0c}Z7gSM*?6 zwiY$(n{z|U5$uoDMK-HS?A4Y#uTl(jlyMHm*~rQ&k9M^0e7b}-)9ljo+-l;A`-bEk z`&HgErGj!CN2rEd_<3~SMAi*sEx3h%<3o|W!@67B?h5^vyz@CKmeORqEd*K)$1~)5 zW|P%~7thyD4omMYb93`JzU@bKRUR27B4BYi!rX7X{e{d)mm?is#Vp=WxPl zL8Q9t=OaZcQFYHakHQ!hfY2))m$Q>kn|@uVjwFXErLg>?EPa$RN1MOu;arArdFE4( zW)?em9*lFnwcE`98`XFJNO^&ug#Vl=+{@w z6D+^)H`aTW_dH(Jd~Mco(fraNH`Ta@hA(3R|B6U5^2m+uO}yk@f5xR?ytWGf5G?a= z|5YURf^RrZeju>@Qn#UPQeKpj&xs-@J5s&Ab}g zAv(cd-C#{Vq3@i;{q(jxNhXFZE}sS|hvYB~Dom!`BC9!h=u7fQfejJ32lqN!su z4262~!?t&x`0wCS_*~_SAj8iwAiv|_)NQ02Gb%x}yOJ4j=-7YRv&KvYeoMwLC}PlV zN8}%Un8g(YdMLL2=cxN+OO3UjI0+@Ut`de&Xf?LdpyTz?JDbgF}MHr zB7(3hJx=XX4f2jABh;_lX&|q_CtUM!-cx&2r4a!|pYl}UP`Q27lBjg-!NZ@o>7mCC zyHC8V;PAmI6LyHD?BptGpXIpm^`3SUWX=p51L?aNp?G6 z-LqJ4N?aZtnK}|J2LVeeZ%$PyhsxpTd1Ub6?Ubc~)}#>{y&{4_{*kh?y2)JHwsnw8rz% z_x-;Iu6_nUIu0jG^Q*-PH8fqftb0;`ZQR8nicfVu41atE|Ni20e)iwK`HlFzxqtiS zkLTwT;&0!)e0lB-{Oz0nxzPW@eZ*Xik!aM(@7&R2i1Tc{Ykv>tWq6#D9+jd(X4R^C zmFwk&I|_zLL*z!k`o_28XubUV6pz%^$-R^@zrarN!BMWPN+s`BZ$U&4W5<7<$bbCn z_#`Q)|12T-x7sDxG57!5;Rg(6??WO>jyiRJ6Mc3Bwhz1(jyGki>d!iKgv~3*S?zNq z7=+fYSIySMV0*G^yFScdUbPUr!#!QnkPDVIT&&& 
zVo3OrRKtaw80aFi;LJKMb^AVed^;D`ZluK8vU)0ZA#CE?&1n6*D<#&Q>N<>ZSU&mv z##&!XtDFpL6{dL|fVL60fGL1nV9ph{~`7_XZZY^UvK{P zqw_Ctet&pwOuv6P|10|G+=%_{8=U{S5Y|=v+wnhr^#5oaX?=WJBQG?Y%2lp?RTq-& zd-|{}#3bm+U}xGP$_Fo}+T;1{#F4tRzEK1N`>=s9`Olu;5crSNQ|B-#Az+F-x>xRQ zvTeZlnj7%z{`17#I(KHYPY-iGJH=lqgQa~~n;lG(FXNYxqK8m*)UIaVYx z!XQm^KX#bw-T7Tnj{HM$;uvbl`3I#72dA^Q48CBYaOh-Tb%p<&L??Q~cCAO)?Q|7# z*!hWqWa$2fM;|1lOEj352`JO1Zm*>AadZy^3MZ;ct^XA>_`VJKbj>6*pIh^FqmlfR z(1OL^Mvj7_JvZ)^WF=}=;U+xLM)#J`WVPKPq#0RV08Y*WSUX$T@jyed>U%Ng5b|_O zpPigi%LhNZspsjrlj%46TPRmf-l|xaT2B=Z<5F$>U5BCF>-@7?m*r192c6{SPc@MY zORJjK`4_M7`6x!0JG%c8i1WOJtD#4+7(wk7JtTYSjy6Zdy`NY1IyWi5CLtItAN!M; zn=&qSh_pct)k_>H89_XT`9r}2sgJ=Ov_gHMy?4mjw2_}nLSk53=JJeIgyl-E#u&lvT8d}7F{{F0-t1{+{WX6sK^7w(Nfiu9vJ^zs{6PNmVuVISUVmzEhK z^-dz{^z%ygEawaB;!166PEXfJbdOr^=we!*t%;~u)aM_nKgzX>q<&z3@G4%TX$UTO6NurSusO{c4!T{?pM0(GS_Jul~35i5LB`% z39F4=kp@Qy)6A%&Qm6fR|WGPdGpg3&r*w#!i zvaumQi216hsH9_0ARus^inom@bY}5unGFTMmzMVH7NTch;4@x-r0k0F#nq~&mbyGo zuprh_#%P&LpboF8aLE1 z9An&HB{fk-2BYKFr)v__ZB{@~Bi8t0;+0&S+Oesf_5x#Hqt8TqxWo}Q=M z_t&Nx^g$_SWzZ1^??aZB%p{~*=phiL-rnAYM%<|Q_&nSdq-AH^;An|)KuL)clSqST z*g&o-6HU47v&AmmfjW0*Tyl05+s(}qEz+}w7V{3zi1dhvuI1h|->TKL?Cd8f6zOhz z99P@duRX!!0uj;CbRjKi?|Xiyah6fTtWR*6y@-3JtW0q4o_T$f()^r}Wup)7B+{mI z^e!oBUg1PFBV+Dn_x#7LEj!p`t+cz{-q$r-*)XzOzCbW82grxVM$Lqk7I)Ht`#+qGK5^FYDR<)MQ3E4UQ-uP$9>prvhQ>xK(2bRD&!U$y2a(<1o{ z59YaI53OV?j*pZ{={gLM>gq{p&ucofPSL8QyM4DwkG(=HwlCa)N76rD;>+-kzibnB zr@+apA9id&pJ-sW8)!|ytRGj##9Irdf&ifciG z@jJ)#G?APe5lEhe@|_eHlVZjU_wqEnW|FFbRT$DZ@NOQVki za!IHYqgjy7$zSX2p+UvHQSXLInkyoivljGWUK+cNw!UjX2RY--+pC?%(tVgIZ|9Dl z)Lci#=k-|@)u(FPR7u+}c&y@U-Mn>6O-Cm| z3vONT?HlzhLRLnm%m@_*y+XE<2f-+X?Va%X|p&3L# za1B0fa7eo{M*=a~%Z&>LBIrB?+^hxo5!hz`36Z%me`pNdV?kCt0T>##;+ z=QswHW?6KSM&Yq!Ahg*uwY!h6xZEp6*)PvCkzd-yuMHccWuWEPx+S$neoB%njD{Do zWZ3(6ANU#@uMOoD)#rA`^R&-y(sOdE=+-#(BcgRu-O)#-6mClf6RyaKI8O60D(bXa zXef5t%tz}l0fBj5lcgSDJm&dJnhlFxYL3ZyUx?~^ZA!d(onCHRrE>_fAv3aUB69h~ z)X=#or<(cPSx2v}Rx(e_giK^a#8+!L6FEV^#&Qkrj$P~PSVR-g(*%{ 
zZ9%J|JpV{58_rs{>9keZIpD~&MEK87p17}9sSCLeqPZlcTC#!s^mSZZF3McniL7KV z$&;%$cQln%RI(EVoR@vk4ahHVZ^YOhJkW7T8h?Hsu`HZa&l0 z&$YQDy*yC^r=rqoo&GIsY)#F;;6oJ4ks6!eSpqu;N8L1Ed?%RTwbJkjFW-X_FGThd z_2l6oYI(BR`PUW!d^z@)QbBoF6mDEflJ0&$xf^m z0^9iZx!i})vi}2=Zg+L0-;|WyD7#DK?3a_>&%O>*$!PUxR^7V$1z?Rr?IQIC!5ZBe zy)|0r4xjYIklUC+z^H9FnHa2|?W!55QKWq9<1_S|=jE$c^W@{Xm#4g69%3lJMfVuO zEZiF$VdZZLX&87PXtXAVIS#5e403b}XE7Dl$^bo16(gevE7de-@>{-l@W(7hcTm{H z0E05Bv|m*4gm2Ycd5fT;qULhjAx77=cXkfKWurNlNK96S3V`)E{-g1sQLzT6(>O1p zBAAO!Ah$jzhyP?p+HHTj0abKcI9)oNHU~^E|G>a!W@g!1|!=ciwX6MO@^*?D1C%$Gh|9bLuv;P!_&d>VTC zC(4;}g9Y%(OfXn&gs7>-L{9e8xK z>^9Tdu+kiFt!8Ie;(T|fH;_U|D<>ys@&qX9!zQXFGBPrRSAva~J{kyj&x{#QS%6wN z$y~ehz>_eWejFJcEgLopw<*G-mW+uWMENrn4upGq%it@D)OCH#X`Jq9^2@Ru$nF*1 z1@F#$u`5xSi!*5BkD5`bL9+V|E6gPS}KP`nqjI(A$pWIm^)1^`E@#b&q`5getw=&(50uwrd(?bN#KXxf>>$iyTICh6{aOLH?LSNW9f zTmG>!^E9P2NmmRg3}0r5uuJA@*#iEcMWfH|t^Pu~xBb4lx3S^n3r^vNX0 zOas>aLDb5|TyLM@HukKDz`|NmroguE;y8YT-2RPqn&DCvM>jbx+h>YTHGS&6eWF`zE<^z?IdMCkFqrD7xt;vp3%34~vHua}k3a0b# z7Um-E29fdg)pnq7HuiwAovbA%mmd~3dN}Y+-#LN%X)ALuHFTn72&zxS}n`7ZsM zv7dZcIn+i|V0YKLBEqZWb{tj9e*0aIJzTrMv!ML7YU#bkM4-Ry>OOZkZC~YeDkCZV z0hP8OML|G7at6sc=ZpzdG6;>NCg%)I4knU>Cg%o}+$6~o8oqTf-m3TgB?PT8gybo5VoDt%p?ZtMS#lF4YjvGvB`t z->h6&Sy@BxRp!8vd%bF=-5Fu6QJj*n@&rxE6I?eI$OK%996g)DA24f_-=z$!H`Fj! 
z%-vNn?cKd1R?u&Tr1v4Aa?MRU3vf`_mh&1`5Oe$K;dbl%qcOO`(AK&3^tu<*3CqU> zD%_ZH#J@naQKj#Xu6s04whf+@e~bvtY_yea<5wZSemhiuzE2`Nui+DmpSP%pcFxcY zqr#27^MJt0i}VM7C6~%dNVrn9x9j?;ud-jw*Vk7Ly5hU{h;{6`Z_CTde!*FuKOJ8= z#P1H756%{gwp)?pnyFLiHSIas3#n&r=gD5JR244uTyo3kWFee8YM0BknCK$+sI8=@ z>k@jejk#pVt$BWPDmGgwqT~7r$egZmarIUp*%lNaZCoR6*+-~$GU2p1`iXIQhb+Iq zz~1vpn#qM&K~dt)>0$2|{{G$1j-IWGuc@r$M`FkF%1Vob`$b;6`&w>G&m9(rnZ?BK z-*0=}C|j5-(r20dv}O;WV>YAk&r@&pjUi zgnwRfGW!AVY~7idOQy11Dyp>&XDIFUj{Antaz7~GVL0e3tBOdh~z1x;GJrjT0#<06mZq07_t5*!7pAH{7G}C+& zD~pZg;A0`CVo`QiJBM%W-S*;)4^_32gF`Vk^2w9G&4==CQ+e(P z!IH6Vi7GaOc1yG~iOpKhRQ)kUEq`EWHlM5%?ytS7Rn|L71R#wTPu-5{ zZ9{cN*8!D+yR{OEQO?#YXGTwFVud!cETZj&tBP4`zP)$bl9?YVk*;u>K@Rjm#u>?qy7{TOn` zu%L*vSOHg))rcEpH*}K9?PfAyAYi+6=BI>`b^7V4;a)k@Huph)62aM4^N4UyHsOG`aoP< zR(0S%iyj^x)rvXFEQtZwNl1o^^d73>R5vasqGh$!LSax~({NPF9fpDd?R21R;3!|TN_5l0Q{oU``(zt$8l}Gx} zphvLlwko;>=x}P!K)(Xvkko>)C?>_m(@YXhMP~E_1l2nv`OV*qVq2clRKyRj{Xpm+ zY~&qdraRtrhfQHepc9mg8v{ipkS27A}@z8p(k*fL2Lk74=QqY?)9 zt&kb-{9H|hFJ|UENupLNB`GVY03y8&r(B#JvN%v`R|C~zqaQePX20fhFB>99Idfsd z(G&fc+p$?K%X1^N5)WUD{~KIukxNEp9$dQ+gs~6FvZfgCFI1?Df8yd`$?tI1)bEtFJMBh88jDztNytJN75nKl+S=KAHhMr<;3s@4OM z$qv1v&XJO)fBkCp+{Xv0U7_K8XWr)z7~O3LeLcPV3ff#MN%V3lpFUg3Zy!Br=usoR zOQ6?qU*m}6UGw8Dfh?JUiK%%?6@#Ra^->fZdJPYOdN>hdUH67}%T{aJ|0qE%jWecTJVJfDLK$+)6nx?6`6{0js5Z^z)ey3 z&YdC;f{Xb`wMNDKWZC_?Fz~>f_u(U!-dlq}MOf)5yl%Y493mq-UFdA%*i>bYQO}%e z+MM4;RJ!}yu1vKhcl%@zM|#D@52=je|3DI&cs@83U-_B$0lv|?xx}5i|MW|d7Q)c{2Jcir8nvD<=Z9xqz|b_JN!Q}&%a_N9h&a(6@`VcloE0e zS*Y1#<(-|Q#+_{X^l7!<5N)#bg*q`R(5JG-#2|KOZDTO#@OUpe!TrP~|j#q!#| zzVal{Z%1%0HH+Y)(`aH@kKnf7uQ>XBi6q_m#MPjoD8+`*ib1=NS0rZF3+iYh@sHct ztEN?RAXf2Q6crI8IZvEXSvh3b(%v`k@Z6h{R83t`F=fYX#JZmDrf)5r)9A`zZWSr* z4mJ9xmX@`2j@R&_(%KW2t1I-Q&ciC-a;F)^+zW{#m3KhTZd&9b44Wk2&h2 zB!520fw=VVPeLjjTY1rWeLjfa0!&ji@lF0HiBP|_WS z5VEm4b-|fryM?(zEmhToN|D1PwCMn)iA(UEc{LRBCcRdI7`P4NrI6ZzO6O3T#?Mwp z?>}~iMMta8o9$VZsYeR97RS0z8YA3J2%LeN4KS!GiRQP=!~WOn^HpeYLYF1o&#LwQ 
zt3sm0Z9j`@3e!@o8T~Du8$Gq3lQq_lh08=M>8cko^?aKDNBgY_p_Yy&O(FB&MGp~? zly$1K6L}O8#$qptx(B4f8WRv?qc>ZQLrurpn| z)>c_&8L@<0#0)%-mg1wI|J5vlDXc_;Fe>*eW{0AmPc( z-K^5lQPuo+?e6wio*-|Pgb!K#y+G~h;oR!Z;>u4`zefAOKc8N18&JWjxG`jrZW3*7 zt4(z1^E80TsYwoE!-Rx{JAnK^)!eQ(HEu-Mf^rxLFME?GIbL+CQc}fO4~PUhx@y~9 z0X_i9d|Nu(RqgI-Y2II;zxDOQsUov}HcRujUVj03Hig?j)%udmEY(wX2h0y9XN=D1 zqX;4hRtFduJF6&l%QQ{}bqY@}!R&cUpm2wTc?9Fr#_{W3{|-P(YVM;ddPk3<%s591 z1{DgAQ%BMQ`UsVs>jsqJ%lb|7GBTeFTbt({5Rg|ez8c}xy3#|j<&%0N)$RyN+wK)N z51Gzdr?k*!7q0_837!Z&u!J8#7%xFKmh!x>al@W_0_TY-UuV4cKT8*z6BHC=V#pED zG|>SdAHLr(wtWFt0ZGH1J9pT1>o&_LH#{HC_V}ODf8QH1UR+Kn`m3NTMAdByb)srp z!oar4^ZQw`@F=6_ZX?o{Xgj9(&jlZ8#8YI-MVw#C%Y2a&Ug5BMdkanEil8Kx)tjnP zshx&6ZOTAVIDxHw&j7pL0~=4BKUMDNXkFk#3kHPxqI4Q%-P)W&PuNiKDXRe|tE9DA z(vzz-l?|}5KP5+pSUoJ+wkcY+RGB9biTh| zHNytbQ#KWwN^V)Ma`foY$qA#R;iVfcMjhAtvpGB`)bjL8Vl~z!p1v>o^-8>ix_5|= z!|Z8<8pJ<{mPx!IvYfUD1tfe0Wx-Vk&Xb0mdgtcCrbc&7IM@y0{IworZ~R;q9eq(? zt3;IDaRCKq=WQIm5F?cemFU{q&gRzaLC?-EI+Ln z5;D8+`mv=lZ=jMzM7#xeM%|IJ&t)!O6a%vj2}?lt4X?4FB%n?+8mj~MJmkDvv$>h2 zrOFvD=tPmgo`zs*_Z*yZSYn)9i#`-!#$RSayW)jKSK`OQQK(cBt({cGT+MlREQ|wQ z=#f`PCl?;d!kpe>qxL(H6=2<*BZ_=a^vW3q$VPnN8e$eRwIQ0`we$#zYWs}=NnQG` zaaI09XDeJN>~Abxk|;??>--{ib&WxXunbgrR(9iOZe-Br%Yj$d$(H>(oa zI!qEXVUr!3`)V^+oYHYkZYSfLpW3)ZZ#ULY35pEESm##5=t3Z1)^b_9VE&6M#1JJWBy^=pYppHd z)F5>z7&K#m!nwzMun6x%Rb)Fwgbwb9lU+N6PYESxe+bwWEG#Inn>dd$#g$`1k^q#d zT3%G@Mo3!O+*hxT32Zlax_o{<#1=eBe&Ic0rC#WEZSJU9BY|Q9hB2COdy7$y1stbv%L^_u$lEtX5LyUdh8l`u2f(mPD};ymxf@ zN>*N`{#v$in^EgM>Mdd;7A(CFx4F=;0wgcn#eX-j>S^*6py=pB0o3PjhNK zZHA9ij=2{;dR=AEAxw8Dxxk)6T{qisPpYg#iQU-f^sYWI19}brA?!lj(9uzYikZ|q zzRGpoXnihlwJ*`Lvua~F5snv#GQjTMN48s#PJH7#800-<#_b=0L}%HdfqP2#g#m$yCox>$_E^$N9Cy4je=a(S1q|q z((>+4Oy~=8pqi55fUm>mP(dNY3)X&g7Uc4Hc?qL--3u;`<|!_9wMI@p&&CG4i-A=~ zcl-(%w}iiK3`&CtlUIet_g=w(-b=C9Rg21pBzI$;w0?U!_cWy;yhFO@**O)RGx{%I zp=fz;!yo!X10bw0RM<2m3*4EZ?l=|Y@ zX=rI14GJ4gFbp^Hvw=iymLO~SY&DH=bE0q@GPP*HYJk6BBox~kZcDT-yU#t1@Le|p z%BnAWC!Lf5=Dkea^7H4<$Br%$%)~qNYs%Dt(&376U*{S2GSGnO2JBuBY_-WqavVx0 
z%`>@u0`gtpu;uNg`Rt}2=3;NdcEe!+W79UJEN2vum;l&Ag85)^$ZmvQlt(uz*%ht;8%$`@*Y$ zmx?g)&QnraXijns7F%?m%JgW1MHu!bI37SmzZESRMGj3Cr^X+0_v=b9C{HTkzrP2!+79$%?FCacwB8>$Arxb#CEZWnV%iZ zkRGYV_vWI@9p)7mx91%y!eXmsp!5pCSt!Ae?qmrPh}dsXJ$bGh7Ul!5SxTC4aZ__Z zLfOwM&JSSZAo5f2+C9X^R++f(0r9$}-$F*wAn-`A>k%L>8Bjq_7HAbclWAYvaS16e z4q!tO5cnVLF{ZnRrsK zfm{AsZ)EO-DgonHN&Qz|N*72{_JvMO6Cs^5uVLA0UjR)ARLg)#!~ffRzm+;{QKH6z zYTCwr$f&6wc<*WC@(;psYQ(<4YGPsv6>-a$Q! zMA)W64>+Wal*RGry;tgt22Bl3H&{Q4eS#?u1R%^^Xu9)IPcI1}7J>fZxfsBE@Dbf{ zNS0uwz`}*pyrTm7$;}<{$DZXy92fv35KOM}u`k++QB1++dgh(Ci&>r`OO=CD?3g*N zFLj?IO;J5_Su`7ixc<3`Plbk={QXfA!{s6(G8Pj{2NK4vwAo;u1^7;_iz`03pI0_V zA{8hK8aA1|ENALAq@msSYjj+GB)}I*osA5nUXFgzCid0CG@ZG~fK)Ee?tMCQQIS-= zNS?#np#r>d$;-lQ!12iC<+5d67=8Oi`yUcoVQ;4HFLx&iawwTis;1xy?cjUfBUg+$ zNJXzye|;MMA9XYOKeYAH(RgxoJ$ZIWBdt#e+wkeXKKH56^eI$KpVpfGF7t?_>kZUW zv;c^q%H!qG8=+6Oj~pkAX3m~dHRB;83C^IDEs`gU(5LoTMcoTG0RhQ#Uh-z0bjOSN z9+^L%39=#QeT|LPr~LbDJ7-+@avsH^RVw#b|C2GiD8oCwIbm_ZyC&!NdOpx{2AN`B zav{lVWJ;z}Z*P6oO0D~}p`Ej(rGLLuD7T*~63h6DqB!8?{r&S9R%R|J`boItilG0P`;S(A$Q@Zd*)OMVg?&0*TTj(y>UFkfCjT=TiK=|(p*sR}*q^N$ z;a9!z`y&-c=38=BXY9QTvuI(}DM1|D?=K!5-%nl-V6MEK78<(!`-!=qAgTEQydC%3 z7j>7)5x3Hh%jf&oCq|nESDsyRIr%~N56{?J`tNoVvf5Ot#EH8f{|*cyX+d0CwdrkW7 zKU^DA7gvz-i2d&yYUAJkssmDFelLFu+^v?s+qM3Ak=!0HT>hH|KES>C{rCU=Xk_;_ zFMsM+c5&eCOVR!H&uz?JtV{j-8={?%wejcme@EXPr&1UJ$@$OAO7{DWv736A{ro%l zNyqWuS$0~^{lPjMD7w3UFJ6;%X?{$q&3}x8(O=(}Sd0JjhTqw*SNQjr`^Q)P|Joh& zU+KRME;0nks6S#sv#(P%+(fCKlhz||JgviW(C9BvQ;K`b@JZ$?ss)$=1fe-LZzww=R~nLIOtEfZvagth%4_7cgov~u{wTBnsmp1AFf6VaSv?G$T7lZS%LXLghnuRb+PNQESwE$OY} z*BytI*!)}wF7lR;+=tid+(1mZt6ro#vSwYBAHsng%hWWgD44wm^cxy#B|c^S=?o{}{;sO+S>;J00Pt-E@MAq2z|7A?0aW zN|K8)35ex{p}|7%{I9PyIzx=xPS9rQF>uK}*HW$bstmXV)qj85x`{4GSVcfCEc-A` z?^H;wPB0BAW#BjiO2*;7$mhQr=h@I?4}ThY*}RA!EmY>}U0B;%OB+s+dNW5@$>RK3 zl4H>>GA2sN#pc-y*@<18SJc4;<1Cg9s+HoME7^@aai@ALW$GS3oup;{lyy2ik|)kJ z2K&dVzWIAqBl+v-KLrhE&a-a66|;FzdFS>`3#G5M8;p9BPmfj4X&=&B&%49oe(POL zwe-8Gp=Lu4u)@@)mqT+pp9+e)FU)8n6C z{p%2c`n@On?oV(KGC`Jq{Zjg3-A)z&-MOhymoh8 
zFqM9KNX5o4XD-g@C{d_l@7K)Pbh<|Yc z4slt2a=(%jFRJ|vj|lg+-l=i?>QvfjyqFK8Z2VgCe@< zzj1tIWjz4LM_s!hIo)4?1%SR4z2N}h14saYn~6e|+=mheafor@neanY%9i9*Pb&F?}*w|G>b;vT_P=Mz~!LI+}_XODwIGR z64p7xPQ{zTnLU_hX_5iJZ?jJka8l-dd7AK~lfs(BD(B9UgL%V<_UhGZ3=9L6Nw;D4 z0?>>Uz}YH=B2YRjC@O~UBlY`)u1PA3Mn#5bAf>njc>?I3nmQV!tN`y-QBfHvqSs(i z`XD6gP@#dV*mzhUVDDoDiYTNOp6$)i2ayC^NC-d`hHE@?I ziS)-)#}HGzD-fw`?9q*wdx{OBWoVtU3DFzZ&)n&p8t=!nyQPV-r;a9QUqF;UYgo_S z#8$7*?RK+0l6xK#93T*ANnNlhTm1$;g>I$Y?2wSez-Wvz6R4o%fxj%bn@@Q6^sp&t zF7bWop=QOZPg3l;VPc1~YUP{~a?%>{sE`K)Zjx3I24oGS8o47H?wI%Q-#7b^C&4It z5@xklL6AhHbf60$sK}AM+aU1H-3)@+0Y0fBolTpTH!JxFZW|qKw!+2=CHipH_Ow#X zR+vGw=VOrH1d?Wa`t+&8ykt0Gq6y`!0 z`H;qb@;nFhJRvnBV&%2^h$jO!mYGf5*S%htuW6zY$2{d|JHBHP;I|xknODN@2`Fdo zbNXzy(SP<)5W1)Y@fP%F^YaGEF`#_j7;#rZEFSTdh{dCc)fA-5yQ!j7geZ$j<8Z;P zX<;B}*ja@#0NXIe#5wy^fBRyKZqor$>dTieUFG(sXrgBTa;R48s66#(ZGRp9 zdK%OdLOEbhx#N6feUS@AD^^xk=CZpvWw(KI#E%8;?SJWp$NYfS>c_Lv(?Iqp0v!*& zlEGs0&Pa|-cTG>wn;~KdU)EhY;AYk(E-Y4rF|KWy<}%RH4Nkdiug`a8PQNteF%;qD-*{!`}OHM_OSjwv1Ez|PFYS-7n!P=Tnt`_Y*e51H9phkV4 zlvK8cDDKySxaGcYdw#6zG(KFdxZvNUuvF`7Hg=xkJabg`I+5o|?S`Pjffq)g)@o=-BU{`s zQiH_>?Ay3)%LhdB39cdO{{ArS8UU+k>F8z=aeB9sR-(}E60aoshA@c2UVpHraZ^)v zdbn7gfloy}KVHW;P0jpWPxIOyr@A;uJIgq8deYxlRI>tRI>d|1aSo!Qi1&8^79r~80 zJ*@KF>aACZ1k`fsBN$9HY#NgzIo-WsH;x1B4}A#{6BAMLr-4wMkqPnm*z@7nW=%oO z-x6^{W1l#xt*@U2C5w!#EFat%0P7gzP7yFcP6}_0hw_xKg-5|0;4o^t6l34hPsV1K zHc%`n3&WVDmajFP3Ooo~@OiO{xx-a0sCCs+tH(Kn4^#m| zA3ojduv$d6X4OzBWp}i2jU+WUp8(W+jq3WXeifBrLqSlU2`o2^ z07GvssPN_uY56361d(nc`%x>;l)66|kNEsRiCm`QX#iYpcw;^_wRD4mICTo`DMiki zwc(&YH-L$Xb=^G#^2WDr#x=U~cF9eLSKCffZ*5gBJSnh|be_M7cvF5{aZ>wll{tXU zG}JFdRpg)=)0wifh&+;#Um3Vz&0O=*w~smQ_+mipj(;7U1?q_{GvskV8zg>8VD-hq zZl{2_l-SZ?FIYPJ(g?@`Hf=QDkdh)=*Q&=AeSOLro`jR-+CBb=?-G&=x4-WZ``){J z`7)x#9ewu{xUxkNA7WG8!v5|uTh#vg0oje|NoH}`?R5%vosai5JU=iI2 zVCg~94urxuc~arQ13w^qvT&j^w(CwL{(9N5| zBc2-s>m$pciNjcZ{(HVpX0z!Y>?A1p7&30f{Tei(wpc!CB~9lrtSzp}6%L70w;sx+I%&wBd4cxlupUso^4)A8Bf}6z z!KP&W1O$!vXxr9qguVk6silDXm@ObhDxf2xQ%Vk8 
zJv8YczyeQT^3x-xM+rS2fXgoeUU$*-K3IHVrbeMC6d>5IHBblQ5iKBMI7*BQG#GCP z#*ggXcdSMHFoWd^z%vWZ{(7JT(Uth!9`ISmspQ`$-refBh!D%mNYbnd?c}1Po6#vi|A8~kB^@e=1|U7T$}CH0{hug zx}l*V$F)-6ggC$sx0XZ33Z%#yVPnL&GsSMD!$k#S62@uvLz8DaV0i4`rRG2P` zrG9r0(tC)!-?C&Bf-G7hIUt#gl0R=o4qt=JROcCbI$#?DE{!bpvZQ9d(+wHKNOa_B zf2gEJ0p+_oUow)Bknp~;ND7EK)gq(#jY>)Z7nA9r`1-lXz6*+(l0+4_RD*m~#-~mH z{_Dz&?VXs!D*qbNZ_+K6N${sgdQ+WOOqhR2{NEo8so15MX=8^#bLQ-s%587$w4$Oe zL;sBODZ);tv`rZU-!#l5lW$)?7eu``N$Ub9%Ha`m-`b2Dss3vYd8Ms})Uwk{(r?IAeb97jwr8IUBd;Z=|B8J#6o zr@j`c;nZLVnwcM$(TV+;oTTk95l-88PLu6~^ST*`#3t25-;#QNrXVnh!CJl@a0$&3 zBJcDyt|wQB?G7(Jy%;?`J@|VRpnS-lVIX$!a3rElGVsqec z!myF$;9Q;s1%jfKQ~(YA<;%B2uSz4^mUFrmV*j2m#|#r-(Ils&xKxNH%=n8*h44g0 zzHw|T`FWSX>HNL#7L&~bVkNRmq2S1Nm%s3#<+;JjWsDUYpFe_k|5Iz@`of?YCoRp@ zt*qcAoQetq6aSefh*c0ApKd+T8Hk4vcMpsYjdnkZWX>(=Dw0?blSQQo*C|`pQ zT#f8069^5vAX8X4|0>3noWJ1WuUqFP`#!>tB$y3m zY|W~H3d_R&u1rH#808|%5l!^z4)f3fkk~%()}Dq$8<_4kM7$HqW4j7wSQB8{iy*>* zL|?{-2s31jL72jsdIi|NmU|!@$O54@rBeMp$i~4LlnzyztU{jM#H$G?&2W`Pim(OC zEB083g@Ej-)Osw^PJIPYzpeQ1;UNb{m_HhX7M+re0v8@NbwfM*z_aJmB~M*vhL@h| zLw^zV3OXf;;;9#XzajIT=LjNcw9a(-bHN%VwS>_#Y1}nN$n9K7Y5lGXv>!Wu3jg7h zU|TB%UnQnkOiX+pg)fu}fg)|{rYtW%le+8r_%yg_x^RxT^eBtyWR*gTwv$}ysfeAS z(4tfm`G*U~ErCb?s&|%;L;Z+YOL(2~Zh5+nYvXL1TI?-KhFuDGSJ1r*va&kWAFcZf z4FidplrNz&3tZE17QJopAG)&DxFGN#re$I86Keq~!}+>CP!F*{vjj~%H5Zo>$Xi9u zpAXZ`1gB2o?WdgqjJKR4v@b9)Gb>nFT!yNZwMWq@Is5ww=OAtZ*8N`?+^W9F{v;N+&oiH z3%6{!vr*$R6;o10aT@dx_LFZ6>3Mi+E7g67D3*`55>{8F%fZKPKT@@U%$-$Gj~PS8 zv`7ZEF{H=Y8r!~z7Kt#kP_eqRf;9!>Pj?Pp2=<6A_}$e7KFi*+s+ILYL>7nSZNguc zU|B@&b@OLy)wHTfJb2Lc#xN!sG@5A4No^3>gL4$D&n~9E#4Pff7CG2*JQR$MBiUVb z{^WWWykWNh8vr+?%26oJ`yjSN?5{25pIyK|iYZ?xH?^Oq%4ttVsb?V*0m-hRNChya z=j&%`D0MtLvQ)SjI5~UV=({@0s$}hMkFbd1QZ^Q{Zo1kbt&%6@Xb~IZ+F<|jus5RI zzZc`~e&;)OU1AlTP(3P}6aco|0xsMuvmxw+A>O;K^yaap=`urfw|}7p_oPFoe?Q(8Tt6{CSb>oE-<>D;?##8;LCe`3zQwi`tWa*1>xQ_3c3J!f@0kz@7MZ)xEKv`{ zc=J4KU`>KS0malFn5mc!VHwgI@dsv^ka!p&))P*%Qp9Wxc6rzphAc|JZj!mTJ7r(B 
zvn;`I&8WQ9cTL-g-^7fX_rvufhj~>nXl*mFu!LunGfTa3@jiT#%uI9m5$Kf=cTUwd zVnqoQB3v80#<~Nk7g*Q%LEIo6(p7A(79(g}@Z`ypW(o2n5Z)lSgc!pS4~t5rGWe=H z!c|uh|UVa*6_iew8CApRA zr}(XEkOh67SkXbR5M(_3+>#|jHm2D)V%o+Dhs|e`fMh~$jYrC=sttllZq;u2?)C`R zTA*iT+Mer-e$`T4)Yk0NQzoFf8~&;ezKCJh4GQ-X8+{Dpwy3glL@V#C8DXoJq`OMP zE+2N)_;261S7^x)Ri(>(w^Ahq5i_yx)~O3|zO3b?J)?%gkbsJSjP)UGQ}W%HZFiNl z;N7jaErRP0bfG1~1W2_qA+8FV28dc&1woRchv^|O-WRDe5ZD_k+#195;=?>^vgYQ@ zR?htMmPa6TZD4bZqg12vMZn)bNAK;(_oQp|ZvFst@{wP!P8H+a4ldVjDCE<~L05+` z1lY|BCqo%zKi9Vup$7W?H9T(6%>{v$5p@cPNYLhsu>!DDAVC{cR(Es~SgOCB5m^2D z0f0H1X0HKlfEc>8D4lKv3a*zLv{pVKY@d24D+?yuFXyPJa$936rgv6Y+qCw!g1m_+ z@B*x?tkNshk*NqUI?I7f%!RTIK>L{&1Y0Js^a>B&^F0Y9~&f?Ecs( ztpqOzTRzsFq3|rH9ADzHUN+@k>_h3I%+*{jlZqA;1C23+8jyvw0U)wA9x6SJXmDF& z_$HQ|no)Pp&h}hY-Q!(k$MKifXbJv~q;Iv?wmG50(xo^i&~k?Bj-OCT$Z#yYi6}a4 zH$|5(_NinOiGjSuW%{!y`?jZTKpA1fpNqp`P8lW!4I7&*v|}L?F!0;Itu63cq`EH1 zfR+f((Q0lAG=ahfv1uXz7g+#&41PG#2XXpC#VZj4>zIv=laY?DzH%oCn$>V@G_kIs zd$m>qUJxBW-bo8hDPE?chvQtGm;_u_yGCz6t?F$CknB?!W0iK}2OArk)%lL(;rtfR zG_gB<9d+?}II9C00;m=ddlZ;nw46(ta!X~wGwHj#+tul^^~TVCYbI2$M9Di0@hLRY z*{*)m4K(poh&Z$62oi>+Fdw^jM$#({Y z^fWYgVM0Y-*G|~oncB8rD4TK?LhI}6Z+-E_Le2PujN*z+Nz4&nOk&)0+Ti-m74t^M zu0WOTwSmBya)(wOfyz-)h0>u(iJ6pO!a3PM&mX5cz>s-u?YD4@JTGzA^pJlPy;?6j zG_}fQ0MyrGQ)GcdOCskLJ~i$oXqzP?uFF78{}{{Z{OKH+(d0q-wc^l_8Cc3`4bGDr zsM%!6Sx%FvzMcvM(gvT6plKvu$zV2>r+P9SrRRgFj?Ux^SAK^%ZrCu_pv-1Re<4Bi zZ-s`^iyj6B-qqU+e2X3c{Z(&HnnH3Y4vsIRA;wFn6NWh+x6$*cAPWr8`L8{&L9u)8 z?!cpe_Jp>HGtj`d_urS^+;e0@EY+&7e{Y`L9{aJrQ2XVJFx{?0NOLlkO#5{kg$9g6 zzWRteLtmk3}jK%iCOrJmLfz0qS3YKc41*}qO~#-*r37g30A-aW4CH&pJ1>36tKTK^Squ^!D%kiS`eV_$ZT2WFL zV%K>7JYC`Lv$7#3n2^oQ4+w-tx>YDlc%LL=SLPbF^L1IBqtC?5?7C0_%!s%< zX=(3e9FxgW&R0huJJ8`LJU?##^^yLI8!#Qm>H?7%IK%KbG$i`VS#|J5K#38HC1dyB z>9ENjLD8)|TIdV<&-Z3{`%CjJYve+d49qx}f@eh`?tRW-@%Kc(1|Os>8B+5Ix9#^~ zJ9d=0_#DW;k7f68+yG!?^UxvKs8aIs+UP2Y&!10*k_XMv(hd7PLMwpuyaWO`;)(+# zWubDGWo>b!Moie(XAR#i0oz4Ti(-Gf9Y8j!#b$311vV0=5+jZfT}--t@W`QY4hsJk z&Ft9>nhG2U^h`VjT1nA}!?UzfTDJq64Nbe{xA?dYfPM@V{K 
z#xB_Lkkrs(hs|aEhqbs>Kn#5Q=+T$M1dwRX_f_KzCCy>S3!oSTnIy4gwxoJAOy|uk z(*UR*_66^V`98Y!(=BBXPTFmtcats)d%HY+CdmvVoa#?8d-lMN%dXOX`Wl?Cl7ZLZ z@z`;~n2uBrym^cY4IL|_A|uP_>+6B89uy@qwus2?$2H1|O!~J+b%C^m-hnoL(U&EZ z$G;D@r(!MFUa;X}{y}r@vS;4j#%*!8hY7H8GOO)obG13LL(>t}ayz7H86D#5$M+|u z0tjKMIgm96Nf}mPBTJAleijfS5c_6;n7w>vG77}ZZ@r!qpCZ5h$P0Q;EZ{~;trHPT zM_9OG!0bX>{L`zkf?`BIi32M2mXN4}1p%eqCFJHd`t~ovDp-fP&BYLy3w9>=oe_zAgC`|?^!m6V#Jn9-K9P>Vl7HcVRJKqrUf*OGj@n| zU@72Xjz6#(@Y+|w!scoUeZR=KJSriXbnD@y1H8or3gy+$M-~yALdfgr=!A-eEvWKi zzm!-#5~iWWbWY#Ut<;TcYHA9La@9NM3WjJ%{+0L=tz(@hjR_OGLeM0EDu7fc8u>gQkFwEbJs-}U6rcjA@uqhDeR3pCEM@vD4W(wVRiT)^s87>jQ| z1E4P~$NSWx{>$D|4rU`cNYsSHLl(T>TyLjAZ)-Us=)OLa>hfi&9%DKsmCmrk4PD5@ z$)hK2%{{FuGzOr>Lq})lECx~ef{H!8U8W62!xt;mnVGwz29ZEk;XtnCNVQFw1_H`J zpqa2KU2Fmq)5e^7vy>0HY zY8f~^m@I)Ylnwx!Y$e5c4HgvC$_xSm8d#ydXrx6JTw)3vW*-}Z$t10A#62y75xb z$FI^%!ENo;(3?y*e2dsGw)oBk@AwtN2fgr`lQO(|eW+S`01az_asim2DQrXv+G9&B z&zw0skvE8ZrukrrG4M`C^#K%syM5J6 zgt}ngmSQu0-?9yp-W;=`;D5kGS2BjJtQ9-mWXiVIA3kS5a?EX`H?%9+9$MVZHfxj{ z?rto9+oX)SRw$BWGU>HoFZdg#WB=WH&^v`Nc?zuQKKF_#Kb*7M3M;rvC!a~aJlCJ` z%5s-_qnM)`Z{gxBf&5w93l&O;2}^Kq=jO!Z zZ5ZSMP?NBnKQ|u5YCi-tvKl>u7d70?1QLY+kRhUk{@i)>oV@{kBC5 zj2QO)&`?opcUcMgeck7bYN`Av3h9R3M3~YH==PG^6YDS9y1PI3es0{*7Jm}|L^%dN z^{(DC2$R@TKxKf%AaJ|98~7>4t5XGfnnnk^q&vm`2^YP!cMWI_n zKEl9AAj!Y=mzgzQE_gH0p%ZkLaHoocABzhHO+C>h8gBWNF5gba*e;Yo60 zcxdk_saq$lq$m=;K1H}v&_Y}j$}OmPq^I`` zSiLkvlmYuDevqr%@M58m6F(*7;|MzdeI!>t>lUbHfrhuhsdD|dmodz_4}GH7w&xBFW|0jEYB~mx#w9gG zZV(S67kV?@6}~k@&=<#x32Xbq(4K(ki{Fl>Q~u zQ2%w;MeHnV3v+2-2z0Z7L4p5)ZH3hPGA!x}+yc&OnQtfh)ZWVV3LFbS@Pw(CZD*eN zcDkfaOY0tS_MD7Ambwm5{tI1$!_$>;Ln}CY)X;4GS3mU z*1U}Is1c&uSM>kbT=qbiyEM%V>E?l*oRy zrOV}_z9$zwTA}$U*kV^L;3v#jLGAe3@GJCJ)H{MNX*vZ7L*oQ!<9yq2<3Z2!bPYS1 z8Vk_yacQ)b_@>y+#=Y)q;1%|bTA!NL8p*bliHB7E#}%!-Uw^#4un8#yNkr4`XX~ay z*H92dZT@>8nl0qY`-~}Ka&!B258QICO<4KRA19@~wbuPP&sA?- zP>0=syiP0fjGl$RsMWe{`&1Z%yH+*l0=Adc)dmAC zgNJrqWLnN41<8+VZ(rz6_H0y#0MZH79I%1#ll^fR`4FN;H`TMHH?OA#J?$ 
zhno7E7PPDn8&Td*{8X8sOEtgE3*c&Ekd1H~O(NAI$ZpU;Sfgsax?4{T%Kzr#y$Zn}Pw zV@}l1>K%udVh-*9!q>fdoO9lYz5C))uO$y%Q{!Qt%kTaZPIHOQ>}A}Ghd~?7~S|sn-p*7HMMs&URw_Pf6)te=&*wtd37M4^FM$1KRpKWHr@6uU#ujCyY25!siQC>6wLWp;fbA46%)qu{cf+V6MqnjNo+0duQ)q2wuYP~| z$=VwgkLCn*UnMd}vK!p33%sCt@%xm>JKh>2z*Qo&u4lb-44E81u16?y%hG9u_Md+~ zaLf4F?`8P^-X9ps@Z|>9rL>E9UK(4LNENv4gST}1Z_ezt3K7>=C^$MwTNeor3=Y0Z z)Z^lJ6wMa;EHE?6pT>Rou$QwVE=D9#=Hg-q5I%2A>}P9O9qQj3u6!`mYFHq2QFIO$ zPk3tivhbUTw??}>g7NyH$L^aR>bJGs)|Do%&%FdP`;HQ)+nhsA`qs_FS%OoqRwd^c z0v`W0iuso$*V9OQLAmrB$4Qcs-ekIqVZNzU;je5??+=?-cHgj-^{mkCUmBMKeJ%e- zM)7~X^#6ew`_?p7R&u~Qh0Vat#HaekTB4LBq0d%UXrY8Fhtx2o$6hDnh?Io>^ot}_ zM!~sDphu4cjHlq3+69p{0ObrRBLK2$Qn+%tG9-!PTUCEDxsBR+Y?M!s^QR&x?`sG7&# zT=?3%vjGei-?6OyXzq1JL6($e8ex)feq|D&MTU{I25e-Rhl zgD{_Lzu_H|Tp>a8nzh&>^ip-#RcxP=D)j+1yXI&4n;n~}Q`D)HDEFA0@6N+pLSuBh z2X65&bj$Odx<6>_8B~08j|M|;d0&6LZ(c|sUsXc?TUr;9(B*kUf|!Y1prna;t;T_; z1K&=6SS~T7M9+@6n-EczNH4K{ACJ}J$atI!)25b-@({sm(4}*={PVj%5#L5noIcda!}(P2KL1HvC|Y{^YGD6c+2Ob$=1N z$I(KfbwAR-1^qfH1po1aEZSO;;REln-UEkGg(;LlQJ$6tEY4#^C$cT4=!?I9%U7~9 za1O{sn!IfK&X!uB0k!`GHf7^+MTh&;RHyC6$r<D_#2Z#i=C2ub&krjY>wYt2-HX=StOvC!N0`<={8* zX(c&R3G?OEBJ_%3x*y)Gs>dv4%7-g?qrN1$%tLY(BcQj&qm*Q+qh#GonP)fC!N2oW zcxHZvS#u!8eH}+9N?~v3eUeO3)8#;ZHX-sJ%WIk@Hw@4>NCjLTTzwD`bVIl9 z9P45T6efls15ha#FJs^0tUgm!Sa?$|MG7~Mu3e9E?8O)+S* zw@VAy%}qB~fAy1RP&-wklrtFQS-scZ%x@yB z@uIQOT?48tYlP)$JM8wINLCi}H;eS|YnR>ddme0itI~TBC&@Jht`CJgJtfG^HFTX( zqci0^>qIAdS3YNZNJt+Fwpt11xvTCb-8ym`e0u6ckT8*0l_skN?y`V@ zK-rkqE^V)-t3BvEf~vKAe2-kTy0tq_plWa(*{(i8%9i9rPTLptzLsWlhflgS|Lg{1 zKqsyjWx3P)e(2j<)`FA@F0|(qJ&(he(r|&F+x7ahdDwn-&$1uz!lm_BvA75BJ{lAA zZHgVk%ttBAvMel6aG)n+r&ef6b>Vi^?t6XmmlsCp_-H@Tt+ z8Z7_H14_@GrI_8DH@~^9i=H|!`ljTOOzYl2rTX@mPhpB{y)@(Z6xxdoc_CGIR-WHe zJ3(@m&hw;vtBW~!b_5iYa)9u^WuGuyA zsFo~^3eHo4-yR9><_AvJk>_9|IruhAmWPLjha6_J-w5q}hF_9rIG@^^-%07JtzW(U zO-?LiPqlz)v1)U~wi~yh-zYIiWcxkA8^zGT4!?e^WjHp;Rl~aZ(sKMMjbN%N=TJVxuL#OS{FInU(psYVfwU!ae=L%+)8JT9%Zpl3UUiCn2Uao9U_T 
zXl!YTZGK}o^@>q2ny@}-+35xXH0RO()!wy7C7Eq;?(LRqR*l`pDI3jkbuH~hR=!5e zxV=L~8Z6DnsFjb@)RYtf6{ngkYitHm$4sqE9P*h;q#{_RX1-rwqA;3*_=r*gL4kX| z!qwb0Yu!KYANT$<=fCrz-?z^``|Q2X-uwJcFf#0Q4IC$BSt$l*O+&1WB;(^Rmm6v# z%%?cyDC3X1u`bEt1Ec%Y1)B~!q`a~A*om&@88(&iD;bQjqD(8OTrBdAOeP!q`KX9F zMY7QX>NBmx=|z5=eGD&nL2lF$S4Uapi9)uCE>W|>6uEUdOvQ29+}DnxR~q`N`Mj(~ z%JPtiQ+!17P*Gl<8Ca-{)^%{N-WeIU$cwQJgv2uTkTs2+Q2?z$z z?D8dW;$mE}?QCO#s`?<6=-tAq3iUK<$w3<>z;XQsjkj_;XJi8mzG16oUO>WuR6-z_ z6MA1LVL{3=12HuhFndwkbwj=fd5DECZ!*dr0cx zWM=$#bD&sv)Y6rGsTe!R%Dq6 zIkO{P)Zvy?ci0;Ug;!S*uyy{y^POScBiySlh%PE{+7~oOUqQ|M4oBw zHgWC3AbRrBtS}p?sz{$cm-i7`S&W+UYXiMpBD z!cfRv1sNrK4fw`hU@|D+htk$=q|M6Lz zfFKr2H5#h@`xjJfF?)tT9#oMDt>vm|uD#^#TUYRR0Pm7Y#n^97Ys2V&v^KEyqrOT5 z4_FfKKyl8%#-jzAbbP^p{nqBl1EoR0%A-4OMWBS#)tk{>BmT1Mm_@W`khoyW^I)s` z?#2#C8u<8^0KyE{xnFPWih+QFT+CC5th5(m_NgA<3RcZY=)B2>X%s|jYX*C+VK1mL zLQ#MU9$l8WydomyHO3TM!p@~0oJbQ{8-DVJQV4l-)V1!P1z3k{GM zV$DjsJ-=K`HO{zhU=e{s|58T2y6P(+wuSrRB!N|;0rGiI4?kS^K9bBvIn~D-J!`rU z*S>eQb*e_nE+s~f0641}NoB*6Cr?=FX%4ldB&3uG5-?D!^M?$buQ!7js^`jI4|k1E zaIWn4S-=v>Gt+a9@wd+X>?peJCb%=(!MPU*Au!o%=Iw&JAEH9S!nO;_56_C)+rTXu zluafkCX%<3W+q$N79TF3DlQ>O)!$av_-Sg?FUVSaM3s^TJ&%7g>DPh*hrqy!&PvcN zo;m>r*f>_~w!U8Tn0!t8EH-Zj%TYTXrmyZ#RO z8oVQn^ojY)ZsZmM-6ZbNB>ovfK$D`Z#zI#f1RDG0qn9B>h} za6921iwyF33b;q4dTi--QF$!1l{%CLc4jVolp)P_?DJv^cnN81?P?)apeJ>j&0h=0 zKNH*oMiLkap?)zazs~nkw}o|6!Ewp_3fv%V5_n)PfjwcpOXgw)&@snToi?X+y}=^i zvwBnKi__2T21MI|(njU`@DO(8AZjU2v?~cxl-LtTVrbKCVp`(-XY|U|B;|n))f{{= zghAGZlc*sTvgL62D`Tv~VjCdZJVY)xuRoHXxaiOmt6J!e zZ8Vk66DJ$!smO2W^W3FV*D43(kRHtGTNC{8ZS_ajHqlp29-F>rhDFYdX8T#SJE>nX zBlZ$L#%5r!aUO_@{L}ssV^iA5#WzLRp|}J?_rQzd?XI{0X>8qJ^a22$EZx0OGBwo- z%Gz&u$G)2MRaEzeGENlsxqy|4KwLFs-VU1TfBY3d{YOy%(}J|08DL9!OS^m6iZ3C4 zr37`T9(H+ekeaty^NO5Q^4?hj+$UB!rW9Ka{xli+ba$?n?Dz6?vWfd(x!cZ~#hJki z^+i79`X7~pF+2cO3_k9J1NLytDp#32c;QV^{IJS6x=ZTE;{R0#=$fQ?L@6u@TbJxT zQ&MtpkN81m!*19v*c{yVr_b$tJwd>bvDFGYiP)O9#u$^qFE*-`s3rIw#b`Y$uQV=( z>6OsFR+80!a>KG^b*!4Vu6bQsQt{2L740sJYMIcmMsxp%s| 
z%S;fzFPM2xuHghefoJ4gII%M)@zl07cy`Xpk1D6RszNN1H|6X|_1h0g`xiTDSahf? z0JGe())9rn(2&NT8J_;_i=Au);M0f3FRvXQ;gT<{fR#De7^5Qb-4&-hioSTT6q)qr zeF#;rD8p!f6IYFLNnYHdTu1jooebo!0a_Ne%70}#=NG+<9bulB;&}Q~T``6eUD8WF~>Q{$ui z0O=|a;v_61=q9PL-TaCJUf4dQGfKNoL1Wl5^!AwH_N>j98^mili8g6%XI`09>2!5c zHbm;Ziw?vi!|9ivi&EJS3C^pPG3DR9^Yzn>rjrc4Wn2;K?EbVN`FxZP3##sGNX;*` zKaTRxU;Yo=ace%wv$;_ZCwTBSX5zY9N7lpE!|oV!X^#U=N3p!&+P+sK=aO#@;a#F` zPhAzi{k5?l6{0WM0;0_Rg75GJUFW`RstEwl@iea?mGlPOw{~X>H+)N~vrTL+T)W_9 z9nyO3rs-}L$Pqv-3H={}`cGwf$R#0w!D6kJRvvhE-008FpFH^Ql!59c{=nvv8yYYdGx9L^+SAo xNm~DRowT5e*E{reP7nCJ{BIVJUD36CrFLulsaAG-_dN6jb=dV#$pQaM{{ks=dJX^p literal 82859 zcmeFZcTkh*-#5zYs=Knvx(cWWSXOkYA_77vmK9tPP$Bdt2uLr1P(v(>s0gSSdQlOO znF?Qnc(;&{(4N1p{&`DCXt$z}(2gH}+6jJg@lfcW;J+UbxAZRw6?Gn&0Z)E(y=Zt* zNT?)En0tE%c>c5RRZE1Bkl0DVzaRd(_4|U5&}Q~ue_k{VwwoK?6>jR>yKD5ae#1!3 zX|Xv;(GLbC59*j!ee;81%(|lLqWQs^(@&o|G9AoK58Uv4eN-%zMog~_e}3oX^`p;T z(FQ2oU{d;DGRe5)uwgbyBXE4q`qVD)vELsT_I)$l-1hG?q4t0W(f@gJq3b8b|N7OH z6IDO`=T`?}MvwpJ$&GXWqZe&mNp2t}!L#>wD}@>h=s#~Llrzh28a;T<(G-$3_w_A7 z5U0AFDQkKEbumd$H*Yx0BlCwmDD7s~e_r_i@D=}$Zu5V?Xlr`@hl}KYkKX^^N6+L` zXMLc*m3iKO-^DqrQxZO(rn|CCFqFd{1^WyX>H_}R|5Nhm^GqpQ&M#NtlRt|8ASCo! 
z%(Dl%ygn8seG5au^w~((%vV_d=a0bxQEBW@A33NivY`Q2F{cd&MekATKMEIWu)ob+ zUsQJg_TCikoe?ti^La0?Sa_EeEV?zl>qcSZk<(F*^-=_F2!M%8k13dwliWv*?MtO>1J?F z;OFAS7J6~6drLTLIB3zX*e(wmobGb^`gHJl%yg%wO#SLqs`5f^=2x@t@9&-p%Jv!a zs=1d2jwIWlXfp}#1N0)C*<5YYUcdGi`ps~2n?rG0T;EzwXU!skiGF9j{ncS@=)c(TSHa ze5x2HE@gAX*>ojf08u$i{qj&GHd?Xj%sh@WAI!j*J2g~NdS|PAzQ3>RDM4ep@uW-j zr=2cuDs!umhCM4GG)AS1N!cmDCJn>oJ%31D z-_N(4%yuRw{CkgxVokrb!uam$Yw6Z%af+M@W6=h(Thv}jYQ7E%EZ^|u4sm8}pB%iq zID)rvv)HcA^amC9RE5f?%$;DtM-ZgY;fki)&NS=B!4#{tR|BBk%dCeYPfJPK>hQ?0 z>HE9J=Tu17;J3|ey~=}g9^KYIt~TP(wC*+i#dMNYZ0!@XCJB2dz#>9I_bvrg0-p)u}590I2%Wv)R5ncF+B}OHs<^78jPGC3QERLwNxe$=-obUu=Rrc97!4^4y zmp`L9N4mY(xQnUJBm8blNc#xce>ZkHND?IzwoVFcg{?&ip%tuH(e}x zJ&rJu>Qnz@--$=GjHo$fx50OA=j`2@Eu1lMr=A|Om>uFaXHf9P>tM6j^JnveH+d|u z94_K$o)hnHcD;_^kt4oV8`3SDd*I!L!nxGrXSDbB2jPRkvN3^IS$o$g3ZUgX%3UfB(Gyt@m(f*N2-;7I)u7I(A(VkJ-8B z@H=Z#m9kUIr-sakr(IcuiFw5Mra%}&QX$C=uN(xK}4TTknJ4VEz#7l{7R}C zC?fasWIb}zgJL~=s&wG{nxghZdmHOY#%XU@gfIE^#C2R#Fnijx>G$VcHOISK(xLr9 zh;b0(u&oqYtA3puT~;1CkjktHm`|%=!p$sCdMUKSeDM2s&DHooZC3ty+2cLxc1ru7 zwr@*guFi@wS)ody7_Rkj1SWF3kuJTxKd^H6YeLE)Rd>?>^x6Q~X=xtD%qi^gs9{f) zF!`KjGix8)tk)NE(~+H#XO?mEk0cy(ZIMQBw}X?VFNf;wm(JgbYu?2~|9aw)h;iRk zV&r|rI_{E%!~Gp!6R`E1`I>$tvG5k$eQDx>QHbm?Q_~xdBCi*xn(Sy-SnCh$Tc2)< zy9KZEsu)(|qX_6*@H#mq{BdtQqry~FQFs$ghHcUrW!QVxYmN6wx|>?{N+B^^?}ng7 z9*fc}LfL#CCRub%%*Mv6yytCCPU=Lgu!L8wUrcNl6VI~KVt27BgO0rJ_b( ztIoIQq}Y=*&D2*~Qj@lb9{3{12cIj8w+*6b_fi~qrf>xqay7itwRUOA04IV9AE^lf ztMjaFs(ppAe09HLNxx&X-m<9q)4)Dk^$HJ)Vo4{OXbMgYEvg|3WQIa~3ZeP-lCB9RD)61wvT{S^W zQ%t#EzS&|!lgQ+y>;b36fWF+NM>nDPK>KiR>!~H=HsXWmAX;zi+&x8)(-vjgdlD2dv`2Om;V^G)~}+7 zU-^1yuYyap4*O0S1zEGc@KB^X;Si!VQv6D)p?Zvp5zMRla&+<>OXA$!8xX`QPWA4L z(MW&s#c`_CST`me*{`Dm8bX=##_H^VN?)TVO%nN-AQci&eJMJ*-X3ksh_FeWl6!3; zdWLpLJN-7^^PP{g?M`CY@~V1{Tv<}s>3dhX^xG{1cnCi!hFzpEP>%*kZL;%+s9&UBR(uCCrP~{?+#9&w zyQ5^es3u1iFaIfM=8AYOR3QJ;^FU^v_Z+2*AhloaoKFUn62>X4bUA;apBgE(ILZ?N zGmxbp7uzJF^SciB1M<=%lV5nNedkwRKYZ&$sdpo;5H*GP{IH&^6T&ZeMSb>RUJ$f4 
z%3~1%`)uN&Ly3W3v^GAU$c5Rc6wVx~Ng<*6sIZGpCq0_B3IC+{k$m^t^o$5sn?Iek zBPI{Dw?o2YXTl?>pn4W^gNI=oCZ23^T?=b7DE`YGI1%~w_xVLt#~<-DUQ#ZR~G3r5Y``ig#-^wD+%~YAXgaj<>NI-LzNde5L(A zW3mg3iItk$(z}B?;p@NHLc$tp5xvQ{7vJEmnmg;)WNgFiI@kBwZG9V%vuIZvD z^kc8Xx3EEean)Hkre^x3fdKyHZFo3e#Hmw)Rp6p1cC+~6^oh84?o;cF1ZiqU4HMj% zl7y(OZC{f|QFUlepk(Mwvv`(0B_eNw?tJ1(^qcArXU@=CKia5hIMIH-U1;_ z)e}Np0Gu9wvn!wZZO`|%a9Ge5Olp`oeb2JRmb^A4VOgU?a%quDof@1u_nNcLlpv~E z;Vo5?gi;wvgX&Sck zXUkRjc(nt}q%(*I(E~ax06gA-EARPX5mXamU}BM>fg+d@iQuPy2L%F*h=F2We#V*1#b@Ea|Hdg4?UnHt6`)VX%$bzLk zg9XH|PR0RG1P#>aJb%4R6)(WDcERjyGQP*H#*%*UoUa44yL3L3-ZDrpuwLScKhE66 zJ3`XoTraxp9B6iTM9VIGa?;cg6PQE3Tem74+Jt%&DfW~R)PD^0h3em9Bw%Vf+|j1# zwUuswrHi`@p|#J{IKqZRK_-M93E`xlcc7q5RF~7kXY>3%-PF&43UX_7;0#3iKzMWq z^+B{!taGBRn%9$cKn$pDM8&u$>Cn?&txCc$9rlVz9`wFD7B;D;CU#j&iai>SrIOG9 ze+!swHuSAYMB9SFui$Uc!3TOLL0CTW3nhP#lMo`o^Vu?J<%!TtP*aC`^Iq6|5|k%E z@BQ_tC{siHF6oe}K@k;l5UI|OkE69XMk|OBi$7eFFPZqQ3+@}o`<(}<>+jl2n# z#sqU$He|ZV=jBVr)q=O2kSrMv&Ipm!-s$~+qd~MC-k=wExi`h!!q`9Dxo7fjcrGFE|5z}EhYMr=@1u- znkrKvoz~bjc$5|VOhF^+uqt+fuNU)l-wC}bDN4ji7RwNcKAC86Z7tF))$?V&Cf*ZI zif9ovw^2^83C*)OJ0xzf*4iv)ua(l?{rhZBvH9R)*+kH+iT^yWk;62HAyH5NZRxUG z3-fiq<}gLOdhO72s12wU-x>EG|v~u=)?WQO$op+3!zCzl2(3i!ZXa?|r|-_|1Gc@U3DlYmd!SB@NJzFWqB)yzGWbBwE7yAEd@wD8}XtX;fELacX2Ls;1k%?RwtSawmY zUwPSI^H^Hudj)EfJy&(t^jw4I{QIo9*vUkD9XNZHF;qGDrBNo;`*6CZx%9SzCzA=* ziq*(4%|16M96*rIZ=+j|MM)+=kh3kSEsWk#`4w_Ew+}NM0hmX4)Hbj~wdBi3;amlb z+zKD+E_3Vd4`dCGqgiB0AJ*(4bDKc;5d_)KZ@l#sp-zNs|D|fmk18t-+psHUeTDob zD=0cXca`G<{oT8f%K+@}&;jsr*|a06ywCC!SSV>^CqK$Vk(RvN%mn%`fkgbKBy7&K z-C~mxJlFev=Wc<@WtyjM5-zDU7GO3kP{|N?FlhQ$0|i@l66FuQ zIH+E3JbamCo+Y^S-{-5AJM$o$W5e+tj)moYu_~RKawCBFPcH&e6#KHcZq=oggf*$Y z(&Iv-egyN{4f#haRNS@ds4B2_3dF`nmn%rv+ujvi{dVoCDkp@r+566gvGN+-?@&Y~ zYV>%6ry@N#XzZkTuKUKFi{;VR>h;%he7yz{%(|?}*?d6C0(&h_6+#I4&_ahv{W&kXJ!+!+S7$x?vM-UZEtZ@s#`e=q_QSH zSEW!E^qna;4CSSP;tMkjXEL#^CR+43{l^W!NUkIn8MjbGbgC|$IX)z9UFjAtdv9C< z-jFmQA6`xXEm`vya~Zl=V&J#*UXS8>D=z{bwKi1z+PCk0SI;i_+&)dZ89J@=Qd0SC 
znl19lJOJbSjRRvY?}e4M2lM9^wHJKQ#_%3U;sSkiYsm$;+Gj!oxcwmfb-*xz8OH6(`56E!fk%R-CeoDx)$^%fdTX2NFBqG530M-7@ z#fKlO856=+o=KUYUjHsJW_k4}B2pCEyX3qSdCFsU^s>kkHZ4;Z>{p+5(qkZjyn|0d z!g}BGOn85Me1ZYi}j+W$@O`c$wvQJ!u|% zHq9Wvt&Z1gT|&wTYX-xn)-=2$A9kvh9L(F}H~2^HCw=LNmBgUZf&%N6GIzPF!do}n zZUX=t;;{-zM{X*$ctgF0NZb}3CGGbo^wdW$MJrb6b(T@xMo*Ap+c%#Z70b7EN{Q30 zp3e}0ow_V?zwhv>Y@H37Q9hG^kx)mjNjCp{ocT)oc}m+a>uW>n3r!*j z`kTv9EkvcdG;4DiPK4z`{p}__P?FtGSlB0DitR2%L^WA<)SjgIcOz2@+IKPW(otL< zvxGlL|70}eBQ2hRz9N0^ zsd2ipbqBtd?{hFRzj~_akH+Q>tIDm~R9VzcjeCW(44-W+QuZ7FMApolt0IR0ct?Mr zsn|LCe%f=+S1gnilzgrPsrQ>n72M2PwGc{-ToLg1Dq2y^c+!cur-o|e(|cj-&x1?! zDh^vnki~tDbTF5wAFM^sVkSBlREChk^Z$%t^Z7?7Gq_7f*R+ca3+c8^@<{r zYa?U$+qEABH+D}W66j_Msje9zg#8^NAJp}fC2z@ zyG0=P4tiqU3hLXSlD4OE&F%_jonab)YIU36l4lH78t11FKRzt3<@z7->mMuWnX>-m zQCyo+)C!13JCZS&4NYNr>=#kef?gY2haA%0SY;#@{n0MaYSnFmP({&#S;Gv8o7?7H z89^!6)UsP7X5_Xy#Jz!ZP_d5uE^L?<#%Je2Rst}yqT(^~W}ncMnCeVb(tuaoaQ-ic zV}V$t{uED2w9ips8^Vy)uVbTkJfnAP^V+Og#+R&us~T`kJ2adTsu=Eq`eKRvc#8E>sLhxwybKb?u1S^OyM1}|Tv5(xC-x&&FF*R=(yHi0>N z#1yMWAC&O+!hp5`+eGMFO;)xX_V@(Qay*nLDVcvZ^!lnUP1W31G|?vDrs9#70tNbm z;8y3vX0;*j)Qa{bVG>MtV~vptJxx*d5Ub>JDuAWY^EF{cU7+@Uo`trRT#nisQ-`FS zutQB^$^_VfGcoz4#=C#}n55JK}@lQw~;khh#rWPrXT{BTBW*FkOQfr@5L#?$x6Q{qfww@n* z7w-tz*;_z_X2ih#c9XtB1#{a~j00v~e#e~--ZFyO3kJ6hd?1~x@Ue_w?qV>|DQ>BT zC0Cq)XUiBHMMKWRk0FyUybhU84JLV-%@0!v7;sC(jvcgLQXmo#?ee zbOc00tIR(9Yef3>3K3UoTb#~XQt;?l)OZ$22WWzLrC80WX^Zg1W&uYs1j0DHmtH&| za3w)qxFsm)_v|CsCyT1)mP~udg0=6|^6W3TpXj<&=(j)E_8~yRFv@R>Ii#Xi2&uUQ ztlE|f)cvxQa@D(Ej)LC(exk%W_{9~M>06(vB4=O{+Bx@Pqhn-u3J1-Q^DYUM{K{^* zuI|}?x=>agjKni2Ujx{~Xh5S>?z0-xu?ZanI`Yk#q@;9CF+t-XM9n1u#l7$DO zocf_jBu>#dutVFp=Mc~|o=LzI;i~cJn|C;bC#N!xDAjUr&DD`~W>VUmX*uizOShBf zd}cqOxhFdD3m?yN9FdkGlbG(pt@ZRzg&2Litn?T#l6TNl{U)DN1xkns+~#r=KgA60 z@vX|yH)~O=XBSO}Hxciyf7c~+Z7$gi`L~W zSnf>M%!bZLc$eKQ5U2(z5HOqIk~IE0FRl!-p{sd~&vWKu@R9*H=IWUBIETSk2B{TZ z-Gxb_J%!flEG-2*cM0@18N-ZMX5GcWBJv~uya_;tjBy*<9>2L3W(A+Gru#hZ?mVx( zrn_)&>$d#!PX71Xx}ZR8Spf6o{YYv5DWmI@5_hGw>3zeQVTy3J=-Kgg-OYOKjc~Rz 
z^z;;mEv#&uF(2{0FDlZ1XMdvF0s?T)2M_uo5L7VMUgfh)AF~gamPG-|RSIBcm|f5R zqMXt|e)D`Z%(?L+Zz;gIpe=$!V|rSAc|R>1(Rry_uc&lhb4R*dyO&HN*9X~EX4zeq z!p1I)uN=0$5JfLH_@~1IFmQDI|B@`$f8W)MK`m!P6T@Z ze%9AgeRl6VLmIrjK&8!n0eA#k?YJ zljLn#zN&7a%S>emG@i1%UB}^=)N^jP(oikQC5t;HT5jqe0g`#BRa!7t*9nlM#{egl zw>7ks4VJmZo0V38)@_gvIUoDvRY5+>43%;D1!jph#XDxoV$ zV4E~OVmCP?qt@gK=;QNY^OkP!3&1vVf__pI`zp!Axn{%wvF}7sZTp?utLk=WZAJXV z*Oye>MfDhq=F0=^zCO?b|q(dB{(!@yX54~TcGk`V$7{El!H5#IK#`P1@+Gl^N z7g=^M*=LiC8`{nqme!`A)CC2E9P%|2%|CP*%~Kr-UG8`}l$lYF^~VI>YmbPX6`)<` z&W;G~7$JD>g+BFh(_RuI_j`Lw+nQ)h-Axp@qHM@N(FW;u&YgT(%+|B(Jg;4)ySL32 z)r?X09c?h}DH0$AA3RS?m%r|GF>70He?V!+SLiJxh%_x4p271PX~XBd2IT?AV@R7wMZvuB+Tsnojr`*7@&@u9J-sWg6_ndi zNoYRX1dgj?4WBBMmHoZSqCS7Itk(k3;ZC>^2jCVzqTzmt8LgvXN{Bq>n!hSKccvH5Yzmmvrb|?@v?hNcSIxBV9x-^|Pr)>#mSXEJ^q$g?9H8b)O6~ zRhFPF?`SvFBRTk%04aBIJiqwUP+h25Xb7v$u7vDQ%r1Wa{&`n*a)}cm2X1NUi+%pV zLWU!y%64d+^Bi!_-E8_vWjX2jjw`Nu`h`&xYC+Tc z1f;@S-;rLsn?RzzJW$n8ys`2ZJR(}Y`f8e)cdb*e-yie_3tLuBqFuOhU$U>+2TOV{ z&N~X@1|kw^h>?)I6b;S5`6H>G)qg=QA~nns{eqg2>O2Tl>g4N3GfMvoO>GD(ezb8j z{oz>sqZmjhXA8>$Ultx}2@(k13d{NK2Ug3V*}VhEW$rO7ZosH!?xmTkx=P&7nGxK- zta>s+S7!l`^-6@WI!42i9^F`?qJ)CEPcnox^T1ij15WmjhcM*=5KL+E58HQsicz=8 zgxb=B8h0pDxXan@(3D)!LBf$B8%Q=WrcF}E(*oijwmkSQeEn=@I&wh6NSjv%c2=*~ z;HPYJn+#|`zc+BP7fq3pD=Kpwyj9~u7ow`avQZJ1wc+6drul_rjlxS^H&qPdFDtmkOI4xHUAcv;ElsG}Mnsd@o1ljze z6o~uimqWjq+lDxU+vu}HqY6r<4ui1N?ag(48vvmMiH$Pn7Dg$h-jB}M%$RxeD%rWK zSuBPl9e!GXlFPE!R`>-hc~zs@UsvgUN&~q?#K4Rcpv*3NS^a=`IHy|lyutrXR$8w4 z8^Y_GS+XLEilBP&Lo=OY2of}MgTcmlxc-SEQ|EPT?Hp@3jDn)# z#53YzhkP=H^9+kU&#h{+C;+G3s=4?Xe@V`&Kl$@L~=k^{<=F!;+Z5&fO5NGiel?L{kS5{J2-<}BfpJgSF@0a z8+N^c>th7zXaGrL^kQ~PSZc%EC?%K4Z2V^PC1HhE6$f1UKipm%`tk-CD5>Qm^1T*W zA4I%NPr}d|{J+sH9hD=$#T=%g=Ubi~+7BXtLjjWeiA2rA_UP^yzm*cgI|U3;=YsFD z!sb%}Dfh~=Lq>!FOw{b*Q+G)JNTBK9_gGYW=j^9}O0jRUZ~qfWOo9yqePwa^^~Jmt z9CFBs)$f}Xa|=Gmo`K|wo>9;o1+Ug(hFB_L4L0{2f`$>S;l)XsJz$q&a>YlYjv((s zIqtxjA+M7m&pz)O3;)*aaZpLyhu3JC=-8k4%e?a-x|6H?1bh7?o+X8A7zfg2d{MFv 
z9;0P6oDrcZppoRO1h9kuur=v=Agp*$uKe!JrC^_f>6%QdC0Ji=scA222F#W*e6sUA zy`c6)Z7%$CL!a6A^XE#tN?l3o{2C_o%hVlGqq~B6G~C?dP<~;B zRhdnlbCC(HcG*nVY8eDD3zn2gW!^(FNU~clh>wl!l5PQrFAca6CGm*|e*Ki*D^Rw9 z?v4@OhUav`K$}Ou^T=s?APRQ*tyU)99mu|F6$%5H;vh&9pl~LaSgl|e-WUlSNZnY? zvk8D3nfBBJMZIMQEj>^&NWi%fq=SsLMoz-n3fSql0(m)G7Ju<1p(V8l ze+-U+F+U2sA1FI=Hu&?V^!`xH5o!I{BgsX{vM?DJLPJ7+v6OYfk%#IO$LQKOU-OG= z&qhdCt>Oj&Y&t~uxe=&7G3UCmL{xgq>_Y+1v%ag6*>`etEn2{NtW96V)Z3vS?g742 zv#ZUooUpy(=u|`9P7r z;dTLgmf8{%eJV5f_&kn2KW*Y)WHFNBQ7Gzlty9!0@yNqcf=3%7MO&@U7Gda}PBI(v zwDsRR_z$bUP!hP6=uC(0#s08E3 zJNL{IfGnI${}#XFi%Wuh+KakBbp5}*Kd9>+{;k+vN&LKp%Yd^7+0D)9#SYP(B2q`T zqYqmw**qimSl&A)?0P3X#$^q0HvjV)(Psm_zbQ(D#7G>_(O`xzH}94!IMWTw`+*C` zo(WUiwi%}zK??k%^T7Zyv39=a-3e;iY0kYXV*SJi`Ia9PVSDfV(2$*Ix1pp9=3z#_ znIG}kuJg4r^rR&1pn+Yxu};Wi((>=0%#{FBqe$KV)8m&MiFla#d&=vJ^0f;efyZai zd;YxzX6~TY$6I#EA)s~7RY#|GwQfZJab%AF7P2CC!ouL8O<2ayIj zBquKo>d7VfO0Iv&mrKG|(lrCWD*XG}q}1<0&wrN1zdsgov3S#XCF|{X6A1kD-rIdP zTZ-n^Up&Oi)sb z7M2AlJV2p`4Y2Kag|#`;On{p8xFcA*zd;Om}e@|lw^e@Nh;>#f8 zsTp`2gujn0x2qU~y`by0!vT3NcwzKz<*HhqEZ|goviw^89L6HWV-3}Y&skNvNeX0& zbBY=|H_GL!f8`ke6P z&E>3~oglQnA&>iZ3-{p~z1Pc;SXlWqgqE&uyS-sz(L6UL2pBfj< z+k5`Phq~Plq7ez{z+@AM;-**95d4tEO8llaJdzFKxQ_9A)Z0++-_=$J(pzB6)wamL zze`${m%lF=2bn`4WuUe8Im_?kB7tD$r5?v&){%n$yodV+VypYaZGCSa32Ijsq+iQK zr)&g4)nt%ms$oQ+sF}mM!Do}bor}MpS8@>nqjF>W>TFN` z(yd&h^Z+6k6w#vK9J0iMP`gg-`Pt9G?A1Qmy4sb&{6%iGQp@H%4#p1||GC6JV;=!D zoUmUGpF0gd<2t*g9J|-7F}^~+_`QdI6A)5(wqo7*QkiVe@^FUoorUNB&Puc((!HY` zHA}{Z1PoN7AR&iNwSeg_cgq_e<;ezq{OeRRh+D-&Z86fiIv_7BE{$HfNnXr^IfBq# zj>k3U^aog9gTQ0@LVPPAR~eLfm;VJ_O9w$6zv;24Y$y~+hEKYUG_o(<l2+d3lGkZHf#q%u;9VP{8tcQ z24v~W3obs~Sz<*@C}GueTK3y!K3i4qKWb|-8zmj_806#9Wre2h0KqJ_l%!F4$$|&! 
z`gOUSau8Z;*5NSBrb9qa%QVSfkDTa(nB z;Cm(V)5izAoqfJfar*{XC=N0yLG|qMFHIuGJ*LG8beSCuX&wSF4um9Kcs59563QNE zkfN1VuIAqwNSaPTrgUm%tbr^aJR5Y`Bu?dJ!dG-|aItwYnelyH4**M6sQr$$XUP>S zL)ANogm*Z*m&whRQzl8mT+6Nx(l{_r4l;GJ$=V#uX-{^O`1y{*0u6ExNU|&d_ zKy3vfU0Px!#0BEXJjL@Dr!2h~|15bLh_oLLQc1CMBE-hrM(>SQb^@X9O0||?7d#2* zu8!@~UCI$3|0Xkg3M@;>*%i)l)?!kUDlXT*0X1|g(}$nr!)Jdj7s|%&^)9sbdI{ai z%?xol61sjnPnN8#F9nx*Tn>bcPpQoMcnCX>XVt&#YZ}XYdY=3Eglug+i!r1=4_pTe zpSo3l!YgQny+jO#=q8IH%#kjj8I(zBn92q1h4j6K$}4{UIsmo`8j-nP44(0}EHwvHK-w5CpTj%@bEa zMB4lyr-n8X2ABY5PsEf8@cq`;sd3c0jJmhx#Ti|kZj$4~=NHNm+;uWFkyjtBSbbL> zzEmn-I(HVA>HKhy`e~f!K~NbCDA@&bbY>kV1MCrG4TGCh7lBE^;C z--`QK#_?BM3muqD=wRpyIFTA@`3KArmcU7`q{OOYX`{nI)VP<+Rb5Vcw9Y-HG#G4` zr8jQUK#|ydO~L9<0t42vyCXHXSdk8reX(6xdfBp$jKJO}hU!D|a5$Y(jIIUAP%u{}!zy`Ed8Xn@DF0KuR0WEv7NNy|Aot&F_^ETPr87tKDLSGv0uTeev z+)^y1X8M6Jb;zgwFKeiBsZEZ6nEYsBQoGic@ArR!ElUzW**SuqE5Z5N8KPDOa+R62 z%OLItR9Ovy9}QizMN?x+sGTq%vRWRbH*%yOgIK8SBMdh@^7FP7P+s%`M!YH?#O8|V zuG~n2FC8&{k#2p5cJT5}ktei_Ff}{$YPpStval2Uq_8*=m18Xr8igQd9FZ|cOL45! 
zM#jlqlX{qP5M;H>hRZw;GF!ckH_=t{@jGUb1AYWkOel|0nZpVCyjCZyRCTl-9B=vK zo4bOwUgb*y?yTkA)ipFlOc4HzT#kj({wuL4INK%=mf`yZ1U=2K*Af`pV)9yR@3TNE zt^DQkTlof?WcjP85e?cqS^wgFr_%ked8gOytoy=aS51sZ8+00@+q~!?9C(&?qt2ky z8OE^SQ4DlE3hR(I!HZXLXkq0^pZUT3;?Bq`!sUdFuoh=?yBE-7-@`MFNR7PJBVsp^UAIIW&?4{wgKF8`pw-k*2v2T@OICP6B(rXJ+C z;(4C>JgT(wgWd4fZ3hNYM;R3(GDWrAr#9+o;cuF+_a>j>*yivK5$&kL`3Fp!0S2!sLU{JzL@TQ9Q3s?@e>NW3*C+S@{Xh zheq#on6dd?WnqubXPxx}epAv>QVFnL(Zg2lcN)K$80(xAxFtKZcTLuPqc)$Pd>8Mu z+*?TfxZD|MZXc382;e2>-X*cS`Ey6IZdm%xHH+I{hvyYsb~e;zT^~Rs3GyDj7EUpy z7%Nmv4`Ns(7tW{eUTr|JXxmCve^zi2vknK2_x;x}J# z)HxSE-{&x-^`%25HlxH;2116)5)M6Bt2P<+f1PSpZ(oL|XyE-3_uDMp~9aZ-`V?>E!jUUD8wa zNMc1QL6CHiNVa$nRB$Ens)B31A!miF??{GY)stOwz6oB3RHAFdPpXHAt{h) z3j}EYtmNEs0Gy2bP7lteg)X(?luOumsJ^AKsg}7Do?6~IB&}fu#q&z&Qyow(_p8C; zpRFZ$=mOdqOGQzN@kTu>Y;fq37_hxP0z$&&o?lDG zI<yqGLog4nl1D@?1Df@sE=wew-;cUU>A8$L>OvO>cRj&shFtM5gyvwxMwf|IQbT<9#VR* z^+3l|x;kshTTM|(q9ZNvjC@>);%fZWb>>b>nE&rmFiQboTzvDWFPu@*lQ*BLLQ;;yGTAQ`lxyx0{b z4W#_JB;Ub9CWIf2{Z>WI&3nZZ~8q_PCneYdzkrq!X8D7pAJ7urb+ZA00#qQ_^AA28FLpiZ6as zAx%UuJ>`*JAPZ-?@S%1t7y&@Wl%9#J(?^PIh$42?mNgPDk!xeuYq?x!S9eR*325p2 zqCjcGQ8a4sf`giZqp$%CQ)aCR%C!c1CBpToZ=S+6SQy?R49g2?6s}EbjvC(7aY%^# z70rF)LSTd+8=f?`IHp)L{xoh}^C`NKz);L}hYI2;c;>os?dGCB z4z<{1tlP@SwO?f>4v9-KzMfQ{4+CKwLwjZV%pzCV4|uXIpFmDD;nqONH_Bek(`tjk zS!LO#VrR71zg$LSC6xB12T|yvC_GPw73_lEu2j2nYu@J^CsmMZ@#7h)PkoH-ZnKMq zc4xZ`I~LchRht_2xW}8}Xfq5uZ3LP%Kuy=4M{TTj^bD*o$`#`@#iugN>b*0RUU5fM z7r9$ZB^EY!#^lo1tzuKE0`#2I4`%ih2o9! 
zU=BG>iV4zV6PISnPjW2Vf_Cybjk{>u9NE=>xnPglid}DuV9H;&7y_q1DQ<9CIyXrY zfu#ah)%%vr>CC_dqULXQem!tMex!I#p$TTwPW443oCIP_Ho>KwXl;MO!dy`2yb@>h zvY3McUV4pJ?PFz_IoA&>WDH`4b%hr_aXfX(v@}nGY z+;V~^<@4I4_S&Ed(9>(%P#R%6%OTYAKDmIp_6J$&!%M~eE<;|+5u?9zer*Y*7mJ56 z>Vm)t&$n*$f>RU?hGM$rl{j;|PKJ-LVwJ<7W{JQcM$tke#&a&jS&-^kFPtd%v4YK2IgtaIbvRlcSLbD`-7Z0tN^nBgpF{((k>Zb4 z(=YpmE`el~?#3z|98(=)en5RE&1XQC5)knLiCW83Np2WvRg2`#n)$=l-gR&RFvQfi zrnVLlFGX9j3Os)qDT6|@DrE;)?GN?{OgqJc>@Qhy6C(lk=!dHa>O8j73OHqsk^7!g z&@7Y{d}Yw182%GeYvpqjPJLF6!xMbFgRqK=ZFtVAjVkqMjn(muYb(Mle<6Fsc%4+1}RtaPwpLQa1P;Y;jDCM}MUEQxJDPXBWPTvAJ-@KDR%p zZO3jE*R(*Fa$6>tn!3Av^(g3C9bobx90t;~_G9H-AYI6HpRBUbi#v=lGbya!uvd>= zWl_*^EQ%bbih*N|;1uya1Dg%o{cq>H)vflX;(kRL&eakX7w3dYG@Xg(lCV<00Z^&y zmii_olKgT8QlO8GQ-|tzy{O@U0LjFoz{rV?9I*EU z;W!BCoL;KwZC{Z^BuK-LC|RW!5U-0pYt_`82S>=S>ECqS^6Imp-P*O_B%T}GJu61N zGjK}wlU*1SW0oBO5+u7XzWxJyp#$cVh}s{#yFbD@#g?$Cu7QPgLpozU68*E@$m>mK;rQ{HO_gTsdk0_kP4 ztAMvmFp5CSv47WNYVBPPw-W99~b4{!pTeghu$X4zGX46Gl*_SQL`-0c6Hb34=E+A$(!#14( z({3t*P;%v`Ve^XgZ?BYsZ((dTgP^j@CLw{cGY?*owoVtAg4EnsC{(T(d(GQsq72)0 zkG(R}o%>OO9QnxI%rdZN_r>Tqoq~|{{YT?>ypPq*8MokhRmcq)7RXxW_InR!5_dF0 z%k*M=4m=NI?tRm(vU;g)S6AsMvi{75;2TFkyZ_v#E>&@Gc9K2J=ThU!Wa~3`?YRa7 zy1VlN`4HYiY=I@RfH{@hQ&30{29wmSDpPGe;N50v{yE0F9Go~Bpj=^#SiIWQ!5mV_ znX4yzj=11BX*ZpSC`tihEp37yCDk4yMXGir$?rM*ieZ$2It`Srey6s;lZ7z^SlF<< zbPA)ZZew&D&>|4xH6kY66|}>9+$~k{5a7Bt;bW10q~V}kg>0ifrC+?~_OC=;~=S)ag+R6*dS2=)){8~^r2$H!U=`U8u(!-#*G`FwsMn~_; z*qhhCI3X9?l+p8xl-7A2{xlWnHWkB2H+Yq6<{z*r4USX~!+v0Ol9|tCM}bAhe>9HhV%=DK{5^B{{UlO%)opD1&F79or zjrTiO!;eRjs{`9cBE~;GJ#k6p|6uREqnb+Fet%}>QJ;}fY=BZ{WK=pLz1x5h3`hyR zC`fP8JFyHZeU#o62qiS>gsuV+=@4pwD4_?4p#>6>oO^@wywCftcm3A+y4ga&+>pcojq`- z-n#$=Rrhx6gOV|P>ndj1y<;?J7+s}|0w>DKq?#t znxSfdpqY^tM+~>3yg7k$wzGI00t%(=fxR)C$T0sj-|9b;+Ty%ZsaG5+P|Y8<^meJ< zvT?kEu;U4=w-^zLRvu7#)g9Bvf%N|?BV_G#*}^nOwvgP60`Xb{5Eua=OMaf`Ky75Tb zb^NMZDh*BG|5Wff(-M=viVv) z0m#XM7gZnL3(J!o_Go{3nhXRapqS=2|G_FeAm4m~?osSkTXywm)3u4UN}O&n&b2|U zzo?S!$uH-x{t9KS<*6?PY$<)2wgqc-{ozrYmQWegLm@#;_hhm4 
zbD$`pNbK9Mm^_S&MZhae>4FYS^&oMC@dffrP!mB+n$<%p1>l-F2bH$Ug0T%QrMy68 z>s)OSnAnsY)z|uK=zNJoZ_Vi4Vtv!Gw403=>^+Xgfr<|tDLeiMtidy&K*k9)0A@Fm zPd9k|kZAHYaMN+P)-5r)1UD6{L^_91T6!}@WFF&3@KB}@)+Nn-ip!1n(jh(gE``_s zHaO9F^9~8e$25b3S-<>zd%VqXjBG4$$mhJ#8#i2w8MBt~4(a<~R$N{YggQQ4?$Npn z)P$fwL_Z<>>4*nw^TWDqqft7w{5+?Q`UHAaElx$ecG|00o|NJ3MMzKsE5_6xBvFeD z(lt=%+!{fYLY2)0P;@X<$CM8ib*gS%ThyfG0FzL2HxDKt(TiEgzKtl|xY zpSw}TrAh88BeFPGCCkhg*B}=J9Y_~Xs?815(-6$=<{z@dS1Mn10ml~J&gkR=a1e%p zkn8DDsQOm(wZru;g+1B&6-s}CyK7!a{>L8IV2EqwOHzSbxzqaBJn1oVoeGPY>A`eR z&M11zX&;MBG7~OqS)UBp?v)>j*j%#j&-=hp+<|t;HhAuTjSb8l(>sdvA&wZcf%e5bBK$OL}!2T-MYo4c@@1J$|Q8Y^@A ztpgZ))oCjbV!|IEcmDw z3WU`n&lyw~YNS4mKj6CA5BE|_q}Es^n&qdHos?pFPbOJ-b;-|H59fe0!)$3534lzj z`GZ%{4x-iHJ}h(Cscg0V1Sm26hI$4Q$1Yn10?&6MYWams<(H+N<25S7YfkU6ttNE^ zS&?jNyT8H>?7ohwb6EbKWMUK1lnj|T-1o+aKqqIG7_>HGY}=C}StPu@5hbzEXI=0q z?1k@`)a3(x!W&_z?ivM!3G(EGcmud_uE9a8Q{j7OKOU!0kW-Rh23Qmm?jLAfw zB2@cL+ASXUxM;k%$7vlM_ESq{SvT-0B($)$tsn$-b-BPv9o9TnZ2SwqtTOx+YlC|=S))k!uIvynVKacb z48O(9|L{-eNMhIP%?T&Fav{i^7P7m1g}fe2>VS%4l#TwgjMSFd@-dJ#dp$OkUle&> zCCH}<@YITgN9gEE+Z-iVyyVL0HUfQ63^eRuMjdCVS^aYBWW)(MVOoCZs0?`r$H+u| z0EDfGYT-oREHvfGF)!XczJT4guea%e5@10fEsNO0L~2Zg7{L@vw7Al+?HJ(KiIle} z!ddPX@4uY66K@|@hn{$wYsrx`31ZoR>Tbe;Sg{7nnT^OYk}3mSC=gErFSHXpC8Vz0 z*kklWT+8lF5P;Qe%i^%|tCMTXk<%Q>)(^s$8%0|~s3#3Z3eyQalgW!iWpPmP8zK;) zw0OVO{bWlDrNjg}*l2ciU1Ao)Vz^`?OKT?-;mCYm&dkuZ1;E++16~9Y4SyX11}_jDnlK-9%}@`csx1vVr*;SxKO4 zm$zICO&+s~fr{aw1^WFy!TCdv-1b5PEyUKkrHHgXcTl!XGcu@gp9TmI-SDwI&l+Yx zGHHvhAIHgTO%IY)44NF9EWr_>Gx?$PR2TxHO1+2FwGP?-?%-Df*qDy$*FVoI*}G2a zqk*4Usx*J{{<9Oq&O81T=6WQ+PyGr?)V_}VG~RcJ)uAE-=*U+wHWQ3_|AOy~9D}$3 zchk#NmN9i{w{tqIQpAVS9a|8#UNY&7IE;?u+SZ(k+BegOas8wdgXR<=U!I|pA| zjFIx9o4)vzIrlwm$K<~r338s-GsOT_(C zhEJ`fnrIEL?92A2Q(8SCDvwx)q}zZ;)DOwh6 zbFu>)T;Dtb#z0}7Ru|_~yX1)NQ3<7dxyI~}I3`F?@k6zHc#*j~W9%3IKc)&tIzeJB zIILfVxZ1?~PZ#H`PCKRN_KOL7ov(mzghZY@tJ_Y-@Nh%Mk~#zo_gmHZB(tF9<8h=H 
z{2?hudv=Ii<{qV$Ky@jHnWKyu7uTz7o)_vfUm?9m9Gd@pLV|$eQn;Tl$%}z{sIUCt6-tSw|t)c0_>xbaImN!S+Pf*gZyHO{4Zgg$AVAyyh7ApG1D(`=*Uw+O# zyIiK*;gB7c#wrTqHtY_I6+KrEY#orRVQ5s}ZF1DL$X$n*}E*RXq1o*Snk3$IGzN9;=0Nx)pvy)5&Kfqq~HjYohCp}eha zPJy7BI2H36(?oG#SCVj88v$jF=;xOsnnTM9^P`GJ& z@_uLnDIxQS2}OKQsfn<|Y`+3xW>8S>3TLhIj#>5mw1X=l*zd?0GCWg90i!Gx%y`dl z6JengK*|yy1L%p>8X8{(nabVXbtNOE5q}`x+$gIoRb~zt?yB^;N47l+UV0s10VMZ<%U(9 zDxBr21Gl1&WA)^lRqj$q?avdsDp`a|g)CHAOGB67NI$%HTzymIUk^AwERk3BX2R|~3$F}5>%3sHcZwy5~4-E_y%5E^WYgjZ|XTS(t+=r=|W%+Ao( zxuI~1>U`fFERwCaiAHvs64`&N-JBO34-rPrj*pl>oqV||(E$AOL|mPR&!lI8aATw0 z+j`M96F6rZl*EU07zS`74(f9Ca32H%s75}5oa9rFhG9j*M8KW{EDA+II5I2r>mJ1Z z1gZT2Yeb9@w~?xr0_z1kHCwxBfhx&Ce!!>L?WX6RV44=`e04}V?%f!rVWTR`&5`s zz`$?`3iF@I5XTKt{dg5Y*3|U+BYg8tZ0V#s-&*a2>==46Dcz&&TD5$!5FRNK-z*gN z4bRA8z2^#cl8R1LaYvHd>Z5n+T^i!4_HrRWI+37*0?= z&0O!l8Do0iySvscmtKooPz=uu+7N<_lHI-z8ADDLK!}e<{9w;10*FLYVT-)%iY+h^ z)=fCOS&!j1b)-IKZvd*ot)quY*+!SwmGc+(84USO;1k$nO z6Zc1Nk2;Ydz5GZ?!H1NkAJtZ`s@DSSrqQ@_Jk0qLp!;>gJSv?g2FOyED9Ik#)*$aa zn?~Gt!B!ma%Td}xzLN15u|-ZrnCO?bT>o(g%@%~krhzqxmhJ`$NKV?GtB5Z@Bp%&E zQE^it-D1vL)Jhy*Bj)W3en8_ZS_u>_M=koiQs?Xn4>&kE2u4Fk>>HzdOP^iJbii{? 
z&MPrDRLxZMkL_z*OG15k=<&_o@$CcFSSdPeTksA$vl^ua0EPG}h`7U_eYn;KMo01l?P zX7aW5XzEoOPMJIxPXPKL%voKvMB*v*f|;Ertl!lXLn?`C?i4NwRrWnG{ilZ+t_CMgx$M#c{Ai$x)qkXy7B($6CUi~d=t=G!hjq^q<2tML^YXwsh-n~}a&sr};N_faJf)q6 zeh>AXKTy^-oB8pZ1W)@qPm#sBOX(Zh5}LpSH;V(#`Bws(bFnla#*Bh!W@Z2ti*v8D z>}nMoGCsEeX^tg=5LqaFbB{)t`)$A9W`qO#fX(<7>eG>q%5M+h(0-|-K+ZEv^xF+7 z)12qc+wAI7X`xo?0x9okP}NgS3r?HeZ(lFb9!_k&y2%Kuy6?8*>(P>mu(q;KwX*@i zXzyA#w`;W?(zd4{(U)U?cC()Vj=IrS%|V@jFKe^4Bzb8u&{jLE__?HY7pd8`i(+0j z=kYu7WlCsUZo_7}==C&TQW)5YFVU#)UbjwMu4wy_wcV>*BAMvk+?AgWq%|PhqB1ge z32_C#!vu%Y%oD}bU?`F$boaM{Gqq@;+3;hlo!|ZfqQBqy!oI!T&6y)Tz!ttjw|1{C zx`RCGRG{mQlrF960L|{uMN&|2Jh{ete9NAa=qrFIo>K?`0<77(*PGM>F{lD@%;QwA zRv)=?@oF&4EG&H9o6>05wdeA7Ml>)JEDF{Hkbb{wt~JdZc!utKdocC^tX@-{Rs6^6 ze5QYVOZUicDM~kmAkxVfUNt{ZwEG9Sx^m#=XllVV)4w*l#C6*jn}ac+l&*zea=WHm z?K*L-cHFD`xW@;F8_x%GH4B7u)T1~OkCHa%cPC7UAs&8x!6E`KQ#yNJF~mCu`qZ$ zB-!<7+N@=3IJn&eP)TF44+hkSX#8^q2N36bH!){|JwYiV@#j1Cypcg+-@@K!B3T|U zC;PsGSc=^t@)x-)EG&~XQigc~cfSq-tSV6TZoT_ByH(Y9mmT`pwyDyAFFA2fe`2Wu zHSDiBQDPuOtuflCe|hFDu$!M4qp=FYeV}8t&j+a9;=7KL#B0gCxewf5$56wz6DtMg z{3q>_s}HI$Z|;HATbnC{G(Sx5V?Vvh9Rif0RI7~qj~4cTW(9eH^pAMI0NkeEI^+1W zQ7<9u?m>?D;3fqV!2WU-m!cf#wSKJ>Si%a}S*}(=V00_MH3220GRwy5r)bT(f^3~) zeI-(56^q2dZTK8K9qtK}B``trQHFr80{v!F1`uYCyO5e^Kjp?zncbbGW=NL@>sNsz zB7oQ{2NnxJ3D-H)yg5U_eL)mN`^iF-1@G^EjFSWbH-LegI+8Q0`oLZRB1Z`$E4xJC zwvg)nz$(E5$uLv3fx>*a=BhvNGkQyc`j?@4NzKk=#m&uyL;dfY%aDmPDYMQb;$K)z zmNyHPazSHOp)u!yBEGE5yv`liW?ol7##eo9iFnqFIzQjInol$!ZYuDSM~I#WE(xd0L;6>CQ3paH`} zft+k0S~GbN&OXEeK12Z|TFxCX1(7o=d+reP@bH`eYT)>2u+Wj^jUxmYGLo_ft8+96 z$o_CV4GdblP5vYZT*ho&1ylk(L|ADuO2U3ML0jU8{URh&w)#&GcLQsY3 z+yFyk*@#^Xn1m{3BSkua^t-(l=f+(sI{`?L#mV0tQNRIrL_-}6%S7OWKUmnn){1Vh z0*wd&5dv&N&{pfteK2Z(fsS}hn2RN)RN5IU2mz{1oR^W_$*x;7;3bJx|J@46`~3f% z#^QRnUiC7}n z+DD1)s>G5EVSB^>=be`SGa~=@N92F9(dO%9gc9ef%k&qMdULy5)(nn#C;w{@wPh6b zJ=MNFt)2uj4tG9=#dNiXq@$J%>%y>ZEq>S{a|t&=nF{=|OA_zm^<`-12g&Uz^*-JI zbri4c=u*||7(W@ngGVhw-kLnh(kMcykiwYS(bs?Ootml}&G z9g7GZHC6bp(}Lx_`u*6^!aLHAjmB7fK*8~^AAaU3 
z9frAOQ%W{dONPCw|KNPkCv)Wpr^k2&%(Zsv4EWACI;CjWwx`|@UcKoo(k?5O6tSIb zCe3*-!z!DuEX~-Q;UxF5wdQ>{Hv^Qa&DtTpmZhh31sfAYtcePYZ4l#f9V)u%=&6A< z4M=sY?hhOcAbD>*=lCH-%B^{52;DXp4dnbTVik|3>J4!6!Nn(XtVUZ7{mzdS4wCBO zE8=w;(yDW5j@2Ir^$M2vom|8mr=+>6m87|tgwDdcA*z!Ai+7Xst5fx`I z18;x~hq>VTWGHc$+<%(f@da<&#{ZGfh=0-X7_ZF`n@A$Lq)Qj_i-=2GSA!(Gd`Rwy z?UO(*q)UF#TN0b6QC!ydyDL<=QfqZ($JAloP%`&FET=z<-~8bpoKGf9n`P==9yMo* zt@PDBntY9)yK`2=N2^4*-o)WaNACODW9sw~S(y1oHzic+X;fQ@y$+jr3zmtIr|gka#~%$zs^xIsFQy>jrtw#FV}ItSyp&`rUcB5yzSmwa3D5wAGdMgw12_mW zc@7PyDN=y2+Q#F)-%Xk8l+k?^)5=d-p-V0#3E0dQls`1rDldF{2frQrvo!^HVIuG$ zJhlw)kZ7Wx_gy}+IpCj!b1Me9nkA0Zw=hEvy}v020gK0279PbTxX+)RL5PDO^W0h0 z5LFGl*X_Fjo3CE}Oc)Z&$L}Vu_k~})a$--lQ#s&9bm*6+6y@DUA2IJLD3k+QX3E%cpqKtO*;J=s;rc{xRRRovnJogA7`xU_iVnN3_@Ch!k1 z{Jy;h#%tyuBZkJ7VW!I4bMe6M4@%fJ%_n4E0t#(JZMp}awkY~xL%#h{HpV4Y(zH^` z5%RUyZ^l~3UBK(`3YN#c-Tlu@r~&Q^3LhTd}q|A)hc^`7#J=!iS zf4e@L8G$qB{jG0c=t|iPu~{2(nf+-x2n-1Xx>fURC$TJs4us+}VZ>e^*fwY##~Y`* zb#HY7?Y&yN+2^^Q>d<+MZHnKciD9-CvX}|lsZa=nIMqnNC;+Abs`Gk-ws`%oK?tf7 z_2e)R+jV&YT}!9Xm8^STH-_WX^3307yw%C97IgUVJLW(ya!AYtYh)?Fado5?WO1ov zmA4GfK$OJRwMGxXN%oa=yw|HZ2c1})WJ0q@3=7~+6e)VDfV&9eDu9~`bi%&drdaI# zHPmnY5FL4R;pniEaD$?TdVl8UxX3<@EByW+eg-uYxRIrpk&7u=QbJ8QC<>(G4#cFU zrkD=^ywm>Xbv7|-rkPOX#d`{n`?z#fC>HCBJy*Dognq)GjE0W3Vs>Ic_*OBs(X9!< zx8AvhSMFw<%GHdC1R05TK&;I*r!3mG>@Dinm8C-Dl9K>^dR|7D8H)EDv#eNg-szgW zZ@5+R0GF5)(LL$$8dc=iw3xbU*S_jMb@<}!-89t4$_{cX2WuVr$rqak|KMabP_37T zU@BF>$_`%>LfKu<8x5j}0#X=bj}`!4=T%X8)Du=RY~*rb~QFS~}Cf zYwiEc+Hd_^b;_RO0I=R4Je2!70OHHiX5myN3#7@|H#hbgXPWnICUjn9`b~?~PTHp% z_A39>LYH-KoVPBDgt`e3fL|zjx##)^*RM;kjbq-Vy(Qv*{j?eTUhrw4N#8K41aj(u z0(2wL$jWR!%zY!#@?{_DbvJp&b5^LJ^J0|fYvDq)-GT>4V`+uW>X29X_=A!8T+1js zn@#qlNTY6XKB8#Fqmt@@kiongbRXg@&Z%ZTcY8w{%uTjqCnMsH(Z4@3mW6+SObts< z;tEy_A*`%GR~nh?iYH+B!~THZ{-_a?GtcD$nyjFSt^S?Hp;TZv<>jk$8xx0L4K=cS z_(VQ5YMZcHDjU67Hz6qQRCmXsif-?@spCe7wXBPLvE$BJ&O0iUHPvcsQNPw%v0NFY zIepQ0*R|zKljq6;5w!kcHc`~Sa|x~%K$)AzTG3oG*E9hA2=&QV*>&K}4G5eE@Ug+9 
z1`+saR`Ke4Ww?vyx`#GpVq%L$Ipx-|vg>&7Mw8HrBf~OMnz_{N*~*>t+l1GM?%;3n zEo6r7t?%rywv2;Dio%lat606>)aNI-@<)0k?R!${=bmobLQ*8{b|;4{_5XY*RHozvN1E z+=y%CdEpm$i9w!o@!T>f$B?IIqT{ndm2?n{Y57LY{`zn8lv$LVYNXN(^P=jm)1&C@ z6g9z=7;n{}_!@n<66j?dkb$+IHca%j>+0W{=KuJev(c6T$)z5ry3hU@L549IGA;YE zK91CrH6EZ*CHQ)tgg}5J0X#^D5Na3DqTxI+IAEvZ2T90Z_LmEM+`c|f49V7(XaEJz zK1Iru<7Bld>WzIEkL+PzQtKVeVUngX96w zGOCNItg5(He`|HT#gc~}_mPlrR`=zwj-pVk8g*&hrOwh8B-tX(Jz5mPb{Vv-M0n%+ zO0n${Fc}EjP*f-ob>u4ip0e$gyU|2ELz-_l&nBSaMsq_2<8>79REQ1KKd@-MMRXi3 zyaB#yBc91K*1GNVeXmS0fLKshjrRabynbnaa~AA{LrG8(!OaZrKJexVRMC$Hdz|@Y zN0S*KcWMVQQwH$NzeFK{8elO)WxJK!Tv6FLjN zAESD@4qqLYtn)cd-eA`liBj3=-$P%l&_Tlh(8DS#P{P!|18^P$0R$`q6e_L(QJb-) zzoodeNg>3zYg}*91jamdax3Vd^@1;0c^xq~SU)G!h3mZ2UiBe=I8ZgM*h4LKU=+s zkKkwUysAAMYF7(D=*)6I-vwK!2NI!RrT{}s(6Ukw53#JyYM)q(zyL8)YOSFg=!DrF zevau+3ibQ|I*kC{E~t7L1Ja1;Z6>Q3f#_eyA$*8i{8|AX!a_6P==+oe~1#* zU$!ekCPe~#ecY%UuCoZpb##IA*}Y-A*2M8^_^^NB%z>d=$f#hrtWlj#HPBoM1MRjz zLt<=2W9$CDH|RjXZNLuky5(t2n>j!s=ET~gFly=bO5UOPgaLe1AbUFs47m8iT^XSG z54M=oCspSL0xEG*iB^2;8@H!`jnH!sZj1q9{xbkX>PY=OIR;!{CSl?1KpVcbH0Z<* z!GzAs`MIJhV-sQsX(Hv(0&Ggvligzg=F@`0f-lLMGF&|E?S8&;+OTl$^6BoUZ^{3RE z-~2NtXY07T`YHcc%QmQAkJs9L(#WqA8oFsZ{cPPnV=g)1!LGj~GU0Vb9Xk+l!npu{9?j@IT zg)QFj)Vm?1PoXvi3|OnNcakw2V_K(jzA9OkFZ1NVy|?9SLchXDpz5<8Ih;L`PBEor><{Dh8yfo_wG9IV)X z{YfL{#>b;bY+*zG3KVV*u5{rJbru$r81Z?Se?~ePFj)Y=BFelwSt4#M zUsqQsM>kt_uvs%M;r>TqE94`bk*z-uNA`4ZK>}BEbIwJpl&W8mNueQeYlQYpe6qd) zzHzZDU@K-q=UyaE z^aRoW3LghTG4Nps@murkRjFnyb3%0n3Y-L$K`A}Eoee~Jc zSN)-ad?o25#kJk~RnR4wx!&jQ+>>f048_?m&t|0x!mcsCHRAL8 zov%Ln^5(n(bqTzw6$2}-`3`lVnz+xdvMqrWTS-Kqtsdv$8y|f!dy$2uqTOj2DGIs4E^ZG&EHGBlqYRKFeXOs=c@=hr|BblYslw zp6|1Og{XV*3~dJ9y!V`7H|~r8?arUQ_~lM<@fgG${W4xbe!=Y?{orG*PVu|1y(Uef z;=x;c0rsQ&g+-;XVz+paxt6OAH}=1^ATU3V=|mRT_i5K1e$DH&Pwl%umr{pIDn7pG z=p|_6|06wV@^fOvY!36z&u`*|i5z++DL3^k+4U-v!+i}+EGXla`T@yyQACE@w3ZAB ziOlVfSpD)SKRxuu(TVVG>hmr=4qKr>=E+-m>9#U|^sY_b6ax zpN>=8F%h+R_p4Evg{k${XBGdzpL z9Tlzs0~ybuIq910!Q9b(zLkxFf=;(#`#fw*BgHfBW>6AXDH^Qyu3xkQ471Gd$ 
zq{e~Rz6nH~d|;iMflX)(+E0H<(Jn}<7>t+FH}(H)Q#!}_c5}rQn$*xWs^eJ^T2ouU z1PY9QU%z^s`!||nlOC9$e_tC1tJ?R!ed7N6Q@}s`&kucnD*q2Zh|6aQSFrhkF`Xy* zIppHFQ_r}XITwrA&U>^bow<&K8^a4CV%|KM?vMv#b41{4h35BhG9yxX_$I z)0mMCJXfcanl-c{1mf(!FJfs;hq@-H zG5>u9=0Ypm_iHW&T8qA4f5}B2mban5U!M{Dw@>;1;fG+CJ9P_Tn_Mxxs-o_Ea@Q_B zjxDW4U35+s{o`lZCbah=xcRbtm-3f6uM-~kJ)F(!a&lJ7)hyjp!OljaOw8b!XTZ0@ zZ3&CTQrsj3FM;nEK>APf&-Ha>xiUNZnB`w*t`o(0LvF!L_?CiX*ZVWEJ z_Yc6mf8*97Z_~ItuG*LnO*~TbCI0`fC9o*gu8QcYcldJV{iFJhX`SGd{q1vG{2ZxP^xQY=k*_No@WRJ0I#=m6 z+BjjwGuUf9`eRqqB{X`s9EesvBg=if_@TkY+kW4^{ns)4_W&#kO$4D)QgZ|5-ORMD zBZ+UD|CkT)jGcP8p1wc3{WU#H1MVmhVcYdec1efj$dqgT50n|B84XR>RJ3q*D`mf_ zG>G9Wvax_`Bj9*3|6foqe54 ziPDanX$^bPIf^Q!jZqUgy#2WA1L@ufW?|^$*7v-x4-Z)5+PD&<6_g?=%8E$eWFhgG zAwugpbrOG~5sRZoxu{7({*Gsx(N6>i4Bx3ErK{KbESi*IMA9d>+L!#XQ*zLw1DE1i zush$cIqt8d9OwR)JQZ&r0dI|*5Y29YC&sP%=aWbyJ= zug$bDtGPM9zm(lyy!{^8512l*+5c09c}MTse>-{w4NlElH(b7E8H^0q7?Dun8i}K; zIL~D0Q=X!rmL#hruPQ5*H70*{V?5_XU;XhPWwtE{o%Cjr6k3m?mn}beA+?QnwAL$BRyV3 zSls+H6nOPJ8(JBj$5GaJ1&(gV7-+7ZLW=R6pl4t3CTz9NB!hk6o4Ykng=L6qwxt2s z>K;>OQe3Wdlp5q0R0sKiaLUZ^>O&fbb`(5rA5;alo)reJO(bbYDTpN9_WRswGcRKu zv%2LlWRv^BP2={t#Jqo84=zF#Md;$abb0(nNo`wa3*<)yJ}c0rgmjo;ArhVM+^A+S2cekolH2E71^rH)S_=P z{24jmiztYos>s-$K+lnQiOwsG5PvA%oE~zFy8)o~V;j(>4z8tNKw$WogWbK(KaFcd z+?of@+z)kkHZe($w9^Y*91)WKd@fnz>$_?FAK$2`$OYdpteySmQW6MGBfI$#zajH? zrgIQ!ve8rGF^C{{XD!$owpBOnKY#WXUJ(}K)hMU|lSPgeWzwCoJ;72)jnp4{`T4Jt*v`7Bn!-$oDTQ(z%7hjv6ADEzN>L z6f@_$lR5y&WJY{!V_u9i&Au{2fQQ8Edv> z(x!x(!zMNZ%XVNQ7}1Lc#=m}?UGNkz_;G3cUy;DT>&jvAh^J3*dDu4_>v)arR`)wa z+gdt`YLT@#CFD^!m=&n?`fJEo+vO$ii%J0jS|z<|AR#Y)9ucNRTm-D#Gqg3tb*&iH zy}`K7>LQ5(Y!KAy9#FmCFeLG6Hn$U!N&`6u!dW>xw~Vc0Or;z;OH8F~cS?cOLTEUZPbXiQ5>qu!i%DR1T|haqGjGlog6tEnLz zuKwV`1G=q8?cTz+HE;>7@!7fTnVHI+y#muQpj}XL?j^kyM+7-f$+~H1amNvL6v3x9 z8Ec(z53zT)*#BZ6>uLU?nkZ&^@BlAZzlLZ zqQ(iZ7(42@0V@tke}H;%w9(G2!P;-y&*RUXYYu%GyFf~cF>Vdf$)HxmjdC6{UIg4- z;aGkZ;g;Rj>^Kr)5eL!i+Sd?F^H9l621rXblA5G0&x8fv-JO>pZ0TllY4ZC%wD1~? 
z%gM>PsB*L0OGzz@lIU%VJ^cc6=Sp-;c?KQ1JKrD1Q8(kIJ>`qx^#ssttQsZ{w7@H) z(}6PEA(I_Cd#@9B@l;E}2e-N&rg0qe2FuXU_cIg`%Go$hY;-s|8orzOB3%Qsue#yn zY>4|n_!vANBRBK1AF&`sp4>`eF6dI$r-!kMB1sbE66#@JH?yPA-V1pn3!l5Y383>v z33Gq+vwamg8=-a~o4Mr#LEZA! zb8KZ(?|Q(mTXe&%W5dLA`N~C<$nfV$_FpDtKdbtBYQ-2ac3ku}7ZMT_Pim^y^`bP@ zJt{glBmtS)z&ZUEo09T#>@)&i%*P(OpCT!F|4Um!8mhE32K5<570AvG-APSR^RQ|_ zzK8mFCNpM}g=EaOvjvPF_9UL{UMZWB>>GiPfF8W&dZrWmHJU>q6RthkYONj9+}@fRcxqh4AW}QI`UCZmq0ckM^k+hC9Dlri z?&f8+Gyb|@p*WFyH9kU9mH9S-D-(ec@WfzG(7@I@a2nkB;Z`0OX=L#a**lR@@BEJ3 z@_SjS%Me6rjP%cNCn9L3>!Qoml}G2Nb5|2QG~Yii9lrT&5*2AeCU~Bf;U(2Oc68L5 z)_kk$Kc;$$)0#RA(gdiE4r|Mubf}fJfm!p)%F1qUIf54eR^7~;@^wbmklx9)g66O7*-5p8`a3qF-1%j=k z<0${66H%V@g;IcBYA0UCxOhn1ar!u3VNKedj_7U8R`dGEE9v$mGYDi);Roj`S64H0 z3m-#imVulp_m}9`BvhoI!tYu-yIQxCleA4ppIlqi6GB%Td$pnz;1o;zZ`x5l;_B)k z!*9KN^OtGo;_9Jy0)%GD6nGyDP#vkjAg+QYp~=wpg2vz;?0(zr__I6h99uC8HT(7j z-`yg#L{_L@2CZguww^8QUCSf1vy)`On>W86J;BGr&zvjrTyX*l^AqdMi`h<3w~ftA zP8F<8G-c3tK1T+44-O^BW_+yW%gEP&Po$l&N;#ZizqUeq)Cj7Ho~&n%zKGON|4512 zkP17iek|jho<*7w_rnGcj!T%V3wGJ`z;jzJqJK6E(5S3>^rcdv_rW4{A7_66y`uD= z)y-J65*)oXHTgr9-U`W>G&$;}jpD0Y;{@9aa2D%bGseIDw5xW#N}gSl@~I;rlRJNuk;d#$o=SSuHb9~I^j z1eqTa^+Lumx%bE2^}O~?^^!!S*TWt@yw;28{iK7qEX>&WILa$^B5(-g&7@^a(3OFv zb)`ls&Wd$Jo8V>894C#-(5U3@*^=ziApf{w%A_PxLtCVHHs7HDiERl=N^F0H`z3<) zBu*+!-U}}%?nG$SMI^XOt7N=x#EfUK9RB4m2SK z%U-DG7s^i$3HtB5kNhO3(NV*BThygBbMcKziCJZDXlq&dQ;x{SwMz!?K98lVexM49 z2}kqu)OaMYb0QNp+tu?cV>jzJP6fGKZV_b!G>SE zKX|ZdxjH%$U1tx{!zhgYEpy(QUJgk=OsRo$bq-3guaz)ZXf{s^yp`S>VmHMYa@poT zOnNV6Kx4G#^A<0cT{h!R0-v|WxviCCU<+`hg|3jyy|$Pd=aTc`OH=kej%MOK0CWw8 z9wAO8E&nLawF;K@2BgZHjQ8;mg7dLhbqe!}m36f=a(|*k+cbpm*J;n;E&;;3(qjdO zt$n{A1!5w16fb|)Y5e^-^#n|$ccdqk4<0woQKOC55$R=bMm$UiU2xF5>evWUx-VSa zjR*Wq@wv&I5@zQB4CcDl38?p=bZK^=_hKh(%sN~D85AqP24vAszv#x!v@(_!;IOd0 z?fNz3n7(r(y4$$Ds%oK}92B46d9nK4%yU>}( znVsL$IU!nE<(hg27uP&^gpDj$K1yQ5(k5hk%9@1r8 z1;@Z7qj9Vu+^A!JZxa^+H#E%CbVPVB_S`XeR=KeOHdr_uzQ5RYh0xQF5Kj^{eQhM5 z0g9zmw)1}tQTKZPw8nf+jKM6P1Phq_mp9M!pN@)vL94An;-ojoFvGIJ0XKw_b;ZC; 
zjv27Ly^HOol#r=&){F~R@Bt1yLGz{LRP*&x3GLcB;)|x%Fo!=5Z%Q%lZge(aHOX)( zg_uk!yFb)o;Wgh$EXbK{VAcQ8OuwGqqDTE&UVt^+8F9#2X>rM2=oNCZV^4i5IJMj+ zsJb*qt(-+UCriX88fWp=L@go{|N97V{TL@bpyo)kZrE0?q}xijkox{gCBD6^k(4!* z1~)dwMA)jH-&~XllL*_y;bR5joEgC$?sXr%0cRZPUtB67ws8C}5u=hnTu%5+yHOg- z={ZZFA()w|Kd>!k%WryTYZtv=X)zk@{f~4XLhbOZHM-?UJ|Sk{PrpCc;<-zEE-l*B zy+1EPIE!devX^k9HfKl>{BW{;Gp&JWI#`K-+bBR7oKL^-{(l~i(nN?{moTA{;rYT> ztWUF#G4Ly352=xP6}sMHvX?HEOAS9ZY?BKuO<<`6tG4Y-Qy@9fyU$(UOjJlLhR!B` z6`AjK31e}ia!*Fa3bX~aM|Q;P?m1deqpYj7oji9+OeGmaSXBE9o*)w*&pADz8^qAC z>`Ze}F(dmb&-L~f_^KhBK40Cj=6nuOt)jsj<|nky9=_SkC(t3|?$^=a+xaCfS~0?m zo#Dc;{n`Yx-^_Uv<*a!|bf*(JxY*rmTyEFR2;HEesh>q=&6yO0m=$`Ozm+^v-r2>k zD%r^wd{>z~Zo6!CVn^-EeNi&KoDl_sT zSlEft8*IDWr8oG_{t^Zu6zQiTjXWx|iH}Y^+xJJ2+qG%h?Fr8E$h-N=_|(Gswd7Z! zD}4!yN56-EY&`OUCzWQuC75{tvpeS91M>nb7KDZ6#Zc3Wu>E*#2ktk z<~uvP*mTWalAQ10Oj~QJ9|PcsIEiNIvI#}4)T@p<+CbTQgNTdTicfW@6IZD3Lxh|d!y%eKHC_Vr!9dhrTE zHx}AGkg+#BE8L94#Md{Uu(fw2Go+C_gM1ww^c}pK`2IR2rzwQ)=hMx5MFo(V2v^iv z3d6!MoUJ~7A-e+n;Fwm1Uin;4jTF&${+77e7e&BC!yiQ2Q^n41_aEbn zj_{x6MKsg-w=p;46u$nsG**siaG+iA-5PXq{sxXEqx`ffvwmf|PY6>9uT%7XhPHav zA%t9#@Wxk*m38*@Ag$xCtmz@G+`Vq&&SWO}UP(z`{&(H{CpNbqSl;+vBks98dwnte zFJQ4%PZqZ{i2P(<1&fWfiof8|%#Pn(F-{9%aG1NBA`*PvH8mYeEsoI5HCK5XPfRsE z_-g58*Wt||-|MkzxLE3{eB4a9|L`6MiX2a|u=I{#^WN&&hsOE3FEhlH*?c&{)x({f z2hw;slSNI>SSu{9np8XLnMEwkGzaf5&1$cx-82tzVpEr5ppK~{+lq2x?@z_Y6C19=hm2z*0982w$q0l@~g=cZr3+?FM1~#8X^x9 z42P9!$9%t;ba0ZC4{3cl)S2sn%eg;ZrZ-TG(28+hXwJI46k?>prD%N9B6G?;T5WgT ze-^K+`pSqaWSfY8c{+GK@l18!19P@m?;X4tGCzz)5+WxrTHkRZ7?}JO!M4`qs};jh zS`rQ3&_P)OOuK^`v~+x^w^kaLU8Gjd+_)6Z!m7tr_IIsVW$+o!Ga*8-0d!A>q=1pA zb(a2ZJv}{ojbgHH$|--HtPK5U2D&sK9`7knu=uVj-mlu-IUTHdIUY=y%XG^2?zTWb zvYNbXYIFu)M4P1fiRp^OWM$jFusV3CkNa*zC+W+y*;{(%6Cn=8IbUh&6U%B*Ih45Q zr$3OQ1TA2vi>tNI{W7f(d2&Bnkk*6L8}M*!c&oRwy_oCwK04ve=Ko^vy~CPJ*S%3@ zTNMTydjkzPV9OOYB;I#L8_p$P;+O;k!0r1u(4 zq=sT>38bATfU{hCe|w*OowLt%zU%t-`-gwvMZ){O&vQTbegDe+em>2gvO196d*i(@ z>{Wg>%(MCohruLe_kNm&qoYhA?3vh%ReWHlq_LG*qDfYA&0=d@-;f6V%@{?2>W_3L 
z*z{x-jsJtfAI?h{(uZ7$8otbzrSzvErO-uSl2J90F0!N6FYICH{_q~EE;k1)E~M|# zv*bE48Reg_ohl3~{af7aEHNHAk)Kz!@F4bHyr2Xl=J*-Lmsl%v*z>4i+*EIU9>Aqs zj_%nj9p0Lm;YeCJtk^14D__hefo3u)5Bi$mJKD@xV z8JZ2S3*Yx62_k*n)WW;*TH&m&N(hn6YWl*O=Q^LnqHeD@4nMfBZ7q(&U)Q6G4q z;nMAo70=22ay_P2I@zMaz-v4tq#mq* zNqdnf+;8XZ12r36i#PzYy48cd^Y+0p04lW9Xb1Hd^2E{9lD;@$=Gwdnq-vU|7%%*? z{4k&#B@Yv`JU>RGIWr#FdRyD!8V{?mtatM7qY;~Ym@k?{s1qD#DsDW zDxg+}-s}2USUrlqIa7AKD;Mw_tYqoy@M{{shM?H1r9JIhxuEDUH-{2UPfh9h>a=^e zOUST4tt8PN5g>n}+U)Qr$F`u`vyS9EryI||kyL|aF z+slK{@O3aC!Dg25`06oJ)JXCPSmo)-j~)TI=1S4BCShZ1Yr55~dhf?snV~DwM`Rv7 zlB+05uHj0gW@Cb-Z%6PkNX4;=vs?aYTSzQ74zCZES`00F$VY8DN}R6xQNa zxqt~9x?<6|*GsrD!b_yNcRgZu0Cbf45ujtj#|I8>Nui>uzq>sX6uIw{UG?`=3&%}> zIz@PsB%!8Oq^P9H>p^W_|3jYmKlDL-0Ia17XWOj5SWBIyw@Lmg@}AWiGJ{{shLTTr zbtyvTPQ}a<3=PgYp8TFSR3nl@v?*!3w#gE>HgcP)f#^sz`L_tucPxoFfv)a|F5|9m ztM0#_Ht}Bt?w^Oz4!QKyU4)17N2Cy=uKtt(FDUwO(X5!quDD)4l^u4Vr0R5L#_u~b zP9!d)RYQ<^?t)G0GHkp~tV@as^a5BA(?)8h6&p*}RzDM#vNx!1~!NxZojuS<#eJciEgRjalqhiMU%qla!fxe)?Nu)~_wKT=Xn zNs$oURTBW|8Wixk=4?^5UL&@n%o{5qDet?n$b|%B?DAW8dBsV=mBAC?bcjkzM^zzy zqff53Mlq^Ue(5E9%FvMegbM29S32W6*Br9vzDX?(aCyVJvyMqx{}3sYRUuGL8tjC zV;OXL0Sh{^EOT{44kyDSj8i_*^>Y>|7#CMpN|(9=(MT##lQXrUWuN{NDw>})IXPIe zFw7i4MN>k!XvVmXQ8OgB{;)zp`HZb$Y6YpdtG4d+w?9?^{IMB&J0DhtJxD{Ex|p$h zQ;iYx>e8U%5@;34?aOfyX%FXhQsOMqVFWP>2CTT+1yZ=lk2jF>oj42YGiy&7a=sin?A!W{h)~FTYr=gAdo7G%rP}hcoMV|e6RBm%*VlZ_ z3~RI>T}_a(vdf9JqBK$C3#OG7Pv;*fa(Jf>)rffu{k+- z-SwC%*WnboqA`%*HP7HiN676F1wmWQGd@lJt!J?VXJ`tbLH$`ykvw20B+*ED*MA4> zgq;tQA-a6-^d*)aZSim$da@(`qXUJ`4UucM)>7fDjAHXb=xR@>pE8P)6ZVj->P{>m zeJ&alC5*(;Vg?+udynG^;2qu1zTI9V@Al&3VzXZww>_v7F}>>oJFetlxvXUtTeAQolnd!od| zC05}|JLo-t%Hi-SYB4aQYwni00xAc2(1Xl+_3F0`y426L1TLF~2g3>;Jg8)|Ge9qf zH&(=cW5t59I#be@a9lw_;pp=Zy*j?m-qdkL-w`Bz(0R%x6N^1P7O|%GeR8%%>h5<} zAAznenoDY}QV(&1Z?9_oV!QVCA-wSuUQV@&ojpK)S1V=Yyt?6|h#r9;t{OzO*UZ25 zuX-e%4udZ%-BzJJC+74{&y7fh@3wQr!<647x`#Hd>o24LS`oO!Qd9sB9+rC*e8wD) zvXWPsa||#d{!Dya`9To-+RNN<*=G3$eC6wT{^n`v1L)_FVEY&BCgnyw2qT2WNp>Yq 
zLb!6AwE;r2B5M5#kXEnKmeXyIq;D{Y|v%VtY&p`ODpOSs)qKsYBe54E>S0mtB>75S|gi;5WI;WU0~jVo#t|+|60-{@yCX_ zY=wSHtIiANe9e`wzRgu zo5F?UifoEsYXo#RGCt;~M*`bqnWDhs9xv;2@u!NFkt)FG)8yf(K$mRl?`qk*nwpx7&CP|geJniux^bkCT8_pH)V*{f z9x)%K-Y2y_`fSnK|J|3*6%M#5LnJ+G1{R z-Yjh-%VlXou1N24Kx|s@IYEyu?6d{UMo1wf`0iLlseBQScsM@=T7t$X7EQ}nRJRWQ z1g8XA z!d#36>?-XOZ2S!K^C4q*&I3Q`^StH)u@h+RU!KQdl(pg$5^_88g4&?aCSVz0OXwkc zhOi4ZrULbg8`sxQbzBWw&916)0EvC`S}&;DELm2>CLMQQ?F&XE(AuYWP5dY07FiPlVL8NY{r~jD4FR&r8LCYW$lD2D0T)jEL=%F5i(sq9^gx|-X~ zi41Ap?ergKF2PHeFv|LEF9x~|On25Z^)DB{X3Sj4(y+L3f*SAPjF}A!D_z3bMF2)a z_a5(Sk9LZ!S0@NSJ6R||r`WQS+)eqkXYy?_LzZ9)T|_dXu5Y;`+uHKN;y^@{<7wR(IS(8k zTUn{LfTtG69&sZy9B{UR0O}56{U1NcsSdi+$9lHkZYP45-xffz+|VrVxdds`;U!}% zFA;4@5BXUEMN7(KQH|`IfJ~s}LxO%AIUfC7NtA)#Y-|7S&%^95`uMDB+EC%`FNSCh zi%8`(|G84^K^{9xQe@kuM+W`^5vFANY zgV&d^WedkxO9_V2>Fq+{g!Oq(eRP_y;?AvgrFo2(UM9V>&c7q13K;vdM2{=^`&}?u zvkRo0&f^w%mzjVJ9r3%|3bpKt)&csnh7|Fi>fMIQ z7!8XdV!5cm6=WccS}*OFi4tepz%O(vbU`PS7@ zt83NuzY{4ss-C?IM_944_NO0-<{EvMV6m7xcuZ)(T^ygtMiBFf+20bZl%#`av;J*Q z=%19+^+QH{8P()69-UZ2!H}A>Hh*RtiX|qDAZVQp2TGKM@gsorb_mc+MH^E^$ zE&wBAg-N)!N)^t7y>Y@@@$BA#j`YAE`Py!uboNh3ZvL@P(|Lxo$I|-`>9t`6FzhMT zq%nM=l@jM{YHC#rQgEV0`x!0a0c5Z~Fa6L5hTS$b-j@|#(~i5%I9^mJ>OgLN?sOFv zKq37f3KvWEx5&92Ihk~I-?;rHgZ&_@kI7MOo&V_137Fy?vuWl>0nR~YLTZll=r87< z>W!1ya`N-bGT)u9u$ggLsI8gW%FZ&pcxOOzLDA*&SyzoOC-)m)A^tIN8YMZs@9yn} zV=k^euLCQ5(!D-_IV&g*`<9jSo$+-Fe9Ff6@=xPh3K*Eup@mMs5+C*;hA5K!^!fr9 zmiBHg_BR&|dAMDU@6ZSUD}Vorbhc7g93x~;-8bx$LUC)W*o!DYIFXa$`bUcEXv+9| zLd_iE<7Jbzp^_&HX-+GZn*M{IPOGZIX=SIktBk|ERI;tNJ?NBS)E+6Bsu}KIy6}k6daD+X}!MUpj#eZwFFWW>8}kf^gc&KpR?FDPPgz9F! 
z5ae?au&jqlcQ>XzWaeOfVQX{37STGVYvKiS<1T(zgQ5=Z0pH7hx5F4_6K9gOd4>V* z_&W*RpOeyDkLTUmbF{~=_4pAL&6aE2>w*WymBu=OE@dJwl^JNz8Ni~x47H#9*i42V zdmaW1)U$hgu>jf=vtwJ`Dj&igES>0lNo$A*5eNB^^G-EMG~tZwxHaIFT?A6$)rqFP z{38k%J!A^?aH7Sv?M?tEpU0gMZldEVOq(nyndRmU)W`-@oH`36+uVcwY48AUcFXNF}RaNT; z5J2yu0_fsZUfr*%05K;MgDGXy{)LSmDY7g*dhN{2t3)loYgQlX|FMTsN29<$0lSsk z^Od7_omK!GDAP~;Cl@LxuQfgG2+pvTbl_nE^6Mwj(w6ooTXD#b%BB0KD{c^ZT$b$k z3EM-SC54ed&D-1CmaX?_^>%Y+#oW=2<n6k^k~2j z(Kfi^H8FbiRymJx9cfwSk7$rBBzLECkFJe^KG>1A6m?}D6$x}BSzzeG`>rgY8$Pa7 zIj-+Jb_^f3d+rux7_3aK_B*VfnHxp_5Z<8Iwo*0Rrl7p*&14xpKP_v|6>wc9X>$kv;l z_V9Ob30^KHlIP=uKrR3UJgjwi2a&!2wc6TwgJzp4l124VdG1}~!a5FYRV%KzV0adj z$LUBdU0N#71Y?t@%#V27i?Uv*8Ucq4z&9-{M{f_5I|eF&&g zQH+i&D!{t|I~>TP%2vwtP#OjdqZQKpRf@La;A^=%puD;aYOKPnr3K^P@1z_T#kw~T z(~#2P;nI7k#Sh+X*c_3D2^rfQR$U&COg^lvZe4CyWJld6E;JSb%IJeVIoAy3GOr{k zm5!~AfWB%6gzjbr?L%^9s^PhvJuNQIkf$On$J1zyBG ztKqhVBCs*JSUST~>`V6$QnY;+o1l0QQOC7~sIM)C{5#ipk_^{cUfi50xuc)b& z+*U&!y1q;a!%g-UDU|JS-%jQ4MZsMEV$Q#*0gdK%S5xphyuG*4CNbRq$W*tY9{Bga zL5}=i#nt|QQcM3Qe;%*pH+;IJ>wC3RPkL0i7$N1Ln?I0BZ6L1fe(m-}>8io?ed7xh!ka$>|J+;MZVUYNga0iI3sU|a`?u4q z@1eS0U;RLKBgSgAA^yP4p)N;#-_C!h#1;A`1;1BuxOV<2Vn}Z$r_|)|N zof17GAG6yd62mOM64st`Yp&WD+jC#8B3+2E;RcY=I1}RQZ%zAu1@wK^drT>n&ZkU< zUmuWEJC4Vaid9}Y;;`MH))H(Z_AZ8KgWUc7g~(x#Rt*Q{O8}1&HtcV?>Og|djjGPK zF?zguGBY(nr?=njW94}=OPao^6#Q30K;;>*#-?X? 
z6yph_|8o#={QL$kADdWi%GE{mhtxLPHAfYa9&I)o0kv$?1?Q}6tky0~zgx>bT(4Wm z($_lkxQTm=w#0{UW?N8poDV1x5*Xkq_DZR2$?V?boa41CGY);25BBd?T8v0BIt-;F zQjWjJy300J4EdI)AwEpAGqHKU7h%y}2L^SbhaD~f0x<9L2K`RHH=6l2mB%r;EZ6#C zpgmi0;=_j4hpt2b4dqGj(j>dG3V$oAOt;>-Bo)D3C$FXdRoq3Y34SZH%z$s2+)Tg8X}1$Jom9M< zpFeu-2@YhvY?50oIZOn8+edAz%;SK|OEV4Ndaz+5HZr?@W-qx^-7&YitF{Z&X zyewii&9T5%hsWmuSH>w5P=fPKe@aVJg)9VQzdXxmdC-LDcjs?TLZLE3)SK5#0Me%P z#Hj5>LQW;K<72h63#P8>cb@SIxV$g^*S4(cd;F4_nYG_JgOuoRj!0d?cDmFE+F%ws z2fGkRpE-nL5*q>G^Ym?l;Em_$mz?K8lOZ-PP7f$l;T`q#X!aW9mo*G?Fxz!t<#3UO zjg5qmP?XkiTztGocu>zy6O&w0FxL#JQpJOiY=)e1W`|`UU5)0V`W`NuI}4 zoo9->Jt)(t+`~?ge=j61S}%S`wQo%uvj#Ja( z#5G`)69zYNS;-!eiVU?)i8C1i#%jV`Mq{a*swQ(Slnf;EIe+VP`j6uWPo#h2d91=u z#h@IV{FrozTQy%P%3WP1Cjx}bj^k~*5n0iy(=74)SW_r5I?Om!io-3*P2Vuu`uQQW zj__1Ey@K&cMSoK|yqD`4yeTv0foBB5WHN*8i4wB_!$MiC+=OOctLK^qr^{iFNP}&f+q~SNlVCpLu{JlfiUtO0M%i(b<7ZlGfKnBxbqBbp)7}UKVr9s!eB^h6(cp?SQ(XUI-<&mt@{fuFVQIp_RKpL0URh4LJ4{`ke&a!_Nqi&NWGdo=`$Z_F z`cj5+WGxc+ocP_rn{q{jgq*8kMyRDja(Zq+tc(~-=e+2io$xjP7`IzlgDY10$wo_Z zGCYdw_Zepro|l*RB*mjjm-BUm)Gwt{e+)pz%J9atpzL1jg)<%O?(2q}<@ZCNa)guE zYdmYZUn@x=Oo!~2QTAke>Y;Ln_HY}&5yhfz+y81)k|I}RODLW|FlD1J=XZfU{0lEK z7`XQU&c%YpDgY=`k<9b7Pp+p-e{?$SF!C#c^B-k?e6KIJ&f6!BuxVl>H>*bECpQMr!Zb!wTGHPI!lKZVB>Ps%KBc%&4C}nc^vs zG`>T+r`+>Gf}{Y*RSpyEJ86PKLaHMZUsK0~m}^uK+4Q>x>=z0mX?Mt>l*a^CHr~zB z%F2o`E{rZI8zB$!rzRfTT1EHTjS7G_q3olFIP_8vN^&Bdc{?^8aXl9Gxn=rpyHsZD zfJda#6zSlEUHz{uy3B~s?@egq?qM2^)U0+ z7spIip=;)shmQK*#y4KUXS*w0BMF}#3<57kY?+5cTXI~^uIg@kAaU3dZt!I+^V|=Y zj+*`{s#hy+LO2i4A&gB7de?igRiT_Is0fTKN1%^e zeF~FKfbh&HfQwE~eM+nM{9&2h>CXBk z+=m_qo-LyX`}o;sFj7@2n1y;Z3}?MIuVxvOm%Nga$CF?Ow94>EnWbd1>6WcYuWV1& zX=oMor9e|QQ0^&PSXr;EjmW72qQw={5T0J~naq{h5kBwo6 zR6Xkjh18MCm~_-HifS!_!g_DxwXz#?rnN+Df7X}qbxJAbM@s+!F()l(GKr_!F%SUY zlFE%Y_kdp9O-;;5o8xayOKBxj?bZ!moc47S4}8NxtiF9Knk( z`O2!>rHc2h&F_akj{-^W)tCA2Ue>+}Gc>+^`*xuvzOHTH@mDYynureh$_D-2T2e?C zI0(Vl>2p_0G?;rGn-@!}kMZ;qiD$en)?{}TCY_@2SQsFXT}O;p+wtTF-dpjV?<|p5 
z(!1TMSRQ|pj05S#@t*9Ur$E=(%=2QXN^!{uuu91X;l+MtHtyta;*FGTsK*SnHdVhp>g&><)A z+^mRU?^JHv-tKz!7uJZct|Z1qWX$hU00+J3nM37zgi--o-zTvP<5+OQ7k{EU+QF2jAz%k9o+}f`v zeKz7@(p8Q@MZ-AGN8G>jt+7?jhQy^-;H0tA!SyuQ&sqK2;Jp2^r$I0E2fpdpyIP0G zy_XdW>N&ww61nn&Ut=nLTm%B%DFt;+QiPYY)XT!1avQy?b$-&&Nyhw~-NrvIvHDZvH9@Hl}dDx*m z&Do{a-NK6$*T)Ouyel;cy!qFk<9x%ea&O9=#uQm49m{}@S6p$v`MGBC?r!Q)i(}SR znE{_!Pj5a1q_NO_VlQ%XfR$(bc8t}O-VAgK;*~+StTg0nCX2W^KdDJ1L46Gii!la0 z!?#I3!5%=_mHZ1zGEo;GFqI>97H)Qa70{BhXBu}3?#rnr0}b^@o1DR|Godqy2iJ(1 zx}r18a2IpqV9$J>&uS{y&EDvGB>J3n9re2E(3xuC@!Sp_bwJYj14I0`rD7r`;M z8e|nY+&p%u_Q2o?-CBAyVik00ivsDFaHlaX{f&)WeA@Q@NaeNDUtin_pyf;({-|d$ zx%xvNL@ulYEt}F5OSgqUOE}I1?hxh)t=E{BV?$=i$C)eH#%t?#dg~URz`%s{yTJjb z|1E>sSNblP*>MPXKZ-UrzF^QbPvgs_N0oqSGc^hrD-E?8kR>GgGv^UAO(MB*yMiJ` zV~)K~2Z#iVai9CD*DQgYDT*7Zq(x8;MAjNHZ?I;*WRdx$bhUW|N(4neI=oibC)nEO z-Amexj$b5a%!)H~r;U)?2n?vQzu{xN1hvpM=a-X;bMcAK<&GDUE|+GI%#8uqXRP5g za$NMakD!~{D^8_qe&Cl9W8gOdp7m5BCHji#!`nb`wv*C<1_`xry3RaSgFdI+RBfkO z;K4DAEQB#Ret3z#zP_1qX)p{?Bvc~gg27WeI$l${*q`+XDCDfnED2fU1IQra*N}QR z(3fSxoDnq@w|V9;#Q{rA->8ruH@P$|E2F_hGL4nssEKL`Dv83>76X9QIgtBAhf=pp z1ktW%zybt$fLeIbrU@J&wC>WSF;+0Z%e;Xa_1QDV0zb-ZjVqXII_2b zipg1UY}nsXCq6}@IhhUg=6v&E*yR7Ny0}U+mngNnka%5TiR#d(A_dTUo*0RnD9i2Xf0q(t?tv?4Lhzy94bx`#Nin(C+tFhg!@z; zTR{0-1cqNWhz}2qKwadxoqeS|25rbr3`F9eCIp}j35+}XkOSEla!oA-ckaN2T`l48 zn9X|vR|~j_ACM=g!Z9OpDt>+a7t8?5gS>8Be#^EOucPIX0}&evDeXhLp28RVeyCHYn zG%zhOa8>6C-Bw~yO^47B$p{fea1w^U7Zkj3-TAsVwO?^fRORT}2ndys8)s{Fi@p2z zxLZn0_hTIBDP&myj}J{pWg7wIcK2ZS;)rMO9t^r(A`E)nA@FRv_vP1a_$r{uHH(x7 zSC3)JR7-C(wfMWz%7B$j8P9Rpe|j~kwD2z5&+Tt&ittK_gHuvT>FJA`;RHMOz1V~g zySGVCO(;B~Gr)aMlis#d^mi1stq=q{6aQQEqW{OHck6xpzm(j40=Dw9Uw!&+a~SE^ znPz-Kyy_n@We<&AS{*XFO(K4$>@rNH<$C|E>!zRgCsKZ7Rzv0!`-;eq#}%iiGqTS{ zoE7iZ&jWw^y7w`*F}qx>yT&vZtr!7o*FFQ~`!nw)UMUl= z3&pMtOr;qqY}B92u1?P|F!^EMZ+k_?lFXY<1(-~wznndCIJ zqqYVV^QENpOe%)2Uy0Afz`y_I5+C&{r%eIm*4O`+|H!|82LdF0KOOO;G_Sp}ecj@X zKlyh3XHP>0F*ElVyU;rWZzac{GP)*omTOL5ftRkya?|9egH~5po%K-k{>zT~V+(MT 
zo&S2<9>ImjLh!7T{2w=Jop;ZRrTZ^_-Wo7-PqD*!S-6QM(`wqM0tOmJn9;v9l&<9@ z<^MR1hdx(+QlrzHvq$ujn%*5myYv-G1-Drd>NN7TWi+-TBPN(F<4eL@$tFo&Q3y~&ib9g zdS?gBJ&(hc&{+Ib?{OWcKlTm#EMm*O_OJRFb62FY*1GI2j4h&KNHHEW1am|DFSBc+ zDe^adxn4Ehj+IaQ@LZF-PjxCL8!=KQlt1h*rRB!0`kE+h(&yLQ#!1g}o$lAaG*M^oN*t|OeEY2z-Z=BfZSY2D>9(LmPWu>*>ear=&lKcwrN zer(9L-pB17$ht-Md$m2rhVPzY9Ee?=`N0WIH*;-%mVvn^thv27Cryv~+(Qt5xTdQy z|1D%yUI5SYZ0ORj4KEkDw=lyAQoF}quN67@Q&)3JQX_v>1b4oVIntpIp4LBq>Gb9= zJ-(J4WBzE?&h{`hL`A)yH6>HJQ6LrFXOSjQKv1inarJ-m>}@@?-Ao0hfO zs*rm~2GH`FWPSU2E-R=pS-mGMI&QAgXdiR>wu%$erYG&#eowq`LCHEZxV9Oa%e7Qp ze~>o?ic4Z?b zl%=i-b*33fZbUOyO7*k1%*iUXbK78o%LylpphR!b^BGoU+@rsOmOma|1u@TV-rPy0 zoe=MS>s9OZ40(*R||u)98sLDD)+2DIDqc0 z?H`j@OzwkNJU7?r{8MVR0}>bRAk#1`DRs|~`rf#<@mFcTYtV8j)|!#?Hu1Gy76f~2 z!L9e5SeKW@YKnn5FFlY}6`kX>m+!}ey`U`T4-{kItcYMffKe)8Q$!oT@xv(pl`jAJ zih=I>d;5o}m5fh4s3-{G3w^tP(qwutq21#W^;)WK2R(iM00Oiq%| z&}o1K`0P7Ufx;vg)=m*J*Y2n1V3?0XzC;)YzUj($DdP<6wP$y5rJ-yh29#lAl5o>L zoTO!)2FItL@Cp<3c;DfJBW|_s;;Y3=qL7<^h_4s>HNC#s&|M^v!tH_Cx6cjb&!&1> z``OGvChwW4rmS_L8w*P6o?)JF86}IvJV3C5YAfTv{xOPRy3-^^;!qR7qrj(L|};7opub)ud7kKv7wqjYspzh1rI0X%NN7Z zTDtfXQZ6vqRpMc%E=L7zBG$|I39&*ebw;{Dx6&WG9Lcd&tHgdZ0_ zXkN0wTAi@tu9Dc-u4;c~Xxx0q0KzAaKzm@48#Si5yhz}G!QwPiY03#UM=U~C3&2O) zNDLsChnq_*f+tAibneV@6t07TO2345Fij4*HR2J0hH%3>w{tVS%|+}vKZ|uTFKfHz z+;>U_wwP$HSU8V78(FMak{^a*eBnmKDX5{NJ&-a|@hLt-w`pTTq-a&wJz5gJ`bZ>_ z^+g{Nz6@awHQ%%IG#m5sv#MB)z+kY$bzkhbsh|szB(;8r^$Ef)r-@HK%n$=^6dcfo z%0exx)B2QAXC&*moaQ$UesguLO3qC=C-0^RAeIvAP5S0`6PTvkQ~u7Ey7{Y97q~Lv$_iNk_0cw5@-XrEE-?^UI%wl!*%=N%dSY zE^ECbSKb~s<{LpFS;Sx}kK82yL*OXJn5R_uRCAkak1@PHd z&*AgK6;Z1K`VvB|Y5qsY{V7q+8&A*kM-2`5=^kbG0rTS$YfiSA&s>+jl;Yp7OKSMK z&0$TaK9d^`K#t}hS(JR2r9mmq&jTTBR~nHhc_nDidJd%UcAdJxfBagHe=)X?F?WZoLHsq@q3@PrQL+{t=t}F_vgkv=aGml@xE>0xahAW3~o;829y^_FgE%fKU zx|~0Y$J8!HCna#aFh`YIYzPqT4W|<|lxVChS$5eoiZ}Vxg@Rn0Ga@z+Tn}(VRxt(e zcPhkJ7}xdEK2r5UO_f*IJ8Uj;>?fsSOPX6SEArI%0k_GChv`J|AnQmSMux_x)ZNWs z)TVtch(>Q8rd%YJz;MMX{_CKHpuRu?x&|@2rvc364uW~}Jn;5SKiCi|$XWl|7laby 
zE)PD>|FP!$R!MMA(Z=s##6`UweDfGPj#DpT#SN3ZAIX)?@*bV3hKh_$8-zA@_9!_W zviexSL_!Js6xT%7Zw%&koXP#=6{7{1?p{&GCt~hTRKoT?#4_hgw zES~C!%KZgf<;AHBq1G^P3dK)mP;@$Cr3w)>rO8eExckWR2skcy^0+8z(rR(rQEZhA zv-4MG#ZQX(vOxFI{wV(KF4b$5&2zP{2<@}CIO_4^%C@A?v=zDv>$yc!29n<}$OpPiL z+5m>|f!D@2Cab%eZ>Ac+z!9QEf<#w>f#(qAEpyQb+G9Bxd~b0~UrM4Ol;0?5`E65e zk4sMgu**84?|{P&*akIO6juRC$^_Kei2j8L=Pu!cO#8mr5jGa1?q5(tH5lwB5)ZhF ziL*a1ra!6h(UD?q4>1|z&DiwL<0xMNCv6ED+}P*1b6IQX&vD!%4LL};UFZO##!dm_ zz}^okEUDtCn^Fw7eZwA8-W?9Bo^^s$&J@W9_VLn(409Wm5r;pS^||q28kud7B2+;7 zV&s!UvUmt|% z=bb*C$~8i1jaIYQMu_VoHiD3<^-#n|1X1s@NShg~HB=ks3)5yiJ%6M_mfg_1gB~Vv zao-i_^FFF8MHXclFTa&b9oGeAdWLBQp<1WEoz2bs+-A{}_Q%841L&Jl>?u(M%}`|} z*2}(ADJUDZXEVL8=CJbYs!z|~9}n{C!(d&?n0dxgpPSNJ&`KUURvT^Vd(mTY{qRh* z*5`j6$LN+FlX&z*{Ar0^rFZeab5SQAN(-Ix!?i}FSa zYJLyB7MU&5<4G6?2peu#hxh$996E!+TV6_ikX5I`b@X!98XYtrqaZC-nGN!2=l@dbCT6uFr>I?a-osE_C%+0 zL=$T0LZwo2^f@oKALO(cFDb>HuA|*8M168JCS*F^j~^x}g%9gv?(TNd@<_&4;Gub* zU@#{)%MR~FzDR9?qkKs9#xPh4rcy!}H|$d2-%*9g4`=i|lX2G+TC?!1Uw5!lb(Y|} zy}U}xI&ihX!B3ZVRU6l$f!ca|8}YoVrQ)Tu{q$~KF=@`{VbMDMkfn# zFY)-_V_%?3O4oxr(rTNu{Dyvowt29|(!bdnZ$xjFO6RippL}+-Ih3vcB6bbi_a{zw z)wtPQT@RcNEu z+KBSnaL^)1J@0%h-T4w9v{0i}v@3DkyZ*ORC_l52pEt<4=FMdZc1BzM3BDBx-kLrC zkpGPCV>a6Y-WfgbH|uJ-Y5+21X6c3)SGhlU@tvg(^MxfhWhCom<(&((UDNytJq*N!XY`&KL`vf4ncui%Y9BGj^ko z1$lx5FI%3>0M-ENgTA%%UGn8RQDsih0WV)m;zhYRyNa2LMJHnWv=#Hfoe4!IVm2v7?a zpFFM^Ze}aSxz(EPV|I!hH3-eo@wU*N_NA>)RS<0Mgap%`T_HcFmY^q8Rc8dc2GU%P$>@eBgL5;MnhgwN~dMb1jo&N=x?YsRlaS> zj6Nj23H5Q^Blt5pLX0FLD~Yle*IJm)(^yMP)SUTJwX&BhxYv`QV;e9Lw{pB}g(f}X z3u85SDK{;e;}gFveHd@)!yaCN50XN34TgUb>DTqOGrd9MPELjUID5W)tZo~<8i5yc zbx1$7s;r&5rY3g!E=g3=`66aoVKW4dTL@?^)_D#0APX!?qQShv6@M53&-O^!(Z`$S z&_`G|!~g+^Ye}B_OOn?Qh849{f>bdH^ zxGcuf4|N+G8)m9eVM=mxm!UAN`8qH0kVb+0@VtPf4|(|kC!w&k;HR{Rbi!}E@b1iy zbl(uz-I!s%*p+LGlwN@0kAACh>AYa4wrkNY&Ajc%c(7tWX|iI?LOlMSm-mUIig1{xvbxJd9Zexj^ksCoWv#D!@Mh{7rU-j^RQ(#Q zv%pw+Kbl=og02Qe5SSXZkasGGF+aix#hp}E=KYWRi*X5d(V>_EQ=!rkkJ_gDaAFPy 
zPOLBay~-c>^y-~&cyl>wg5*cqX!o=48lyL=gG5IKo5-&m@&)eLc%A!orkVc1qNGNi z7yBgzXdh2?G@7<1lJ-&gwUxlF>&`whXrl$L`)8!3LKtTbqH z7f;r!U0vw@8b{DvUFo3!Cd7%qAFGV#Z?0U>UI**cS&c%MDZ*m;^+f)qtV?#ou8Z|>T#U4-P}HxrB55dH=!!Ys)cy*J z)c{vn?nW*2uL0UeuD)u%i%+5A+uCs(jc1&u1ly|$8}D`#E@sL#ke2LfR83Qg7(!Dm4Z2j3ZCQVH9W?K)7Y(c-C4<_&qvgs zHtYEj`(d8X%?3@AR?_rvsA1Qq)ll|z2DoNKx z+G_LjwQ9hETIc{B{a_>*OX=L_Rdha8X5L5AaDlA{wGz_Tea~o2=f1!wg5Mm5ncq2e zMHW4zbmLr$>v7TR;DHKbxoVMfR0PaJOXxeqCMq^YhM_z?@q+B)l_^ZQ57(Ioq*2 z9N(RC;2-R_9<{&<-Y|_xwWAnaku-wR>p8=&G_Z6GpwGwrNIFG(>Vf!-HZ6^QC2v3Y z8kl~eQ3rvLI(^NF*-b#oNW8q&3Vpy;M!+wpnBptnj^IE9dEfwg5*~8A9O~bmJI0vd_c+9Qv%n zqp+=jW@2jFl8pp4o$ZC0-V3Rj$+5(F*kG1kHWl#-dc(dq2y!DGyz8CIQ$*0IA&iXe zU96@SGunLyvN0b(8>_UTs|i}vd?DSxyAeB^#G;Wb5cZg4tgbKWH4*SEq0b-TNc|Jl z9VAXGbHNb$ymWVzp+fk?>oLD*>5KEvK)T9E_1H$a+^l#zRBXRzowyIGN*VeH#Bs^Z z^bGNVADohQ0_Tm`MA5*OGb*>1a^cglUhwG|k>;-`hC?KSg|=s4m76kBFXrkCTX8?J zu$i^}B_&Z7wjtFG^SnxUDlJHfcYdf)xwR)a)W?h2`f5oheP%EE?3z;m-L&5%!E~P> zY@hZ}<&`Np)0{@zAfd-ocrE4#Vn~G0>91s7Vhax80ze}m>8$(-yl@Cag0-*nIb4%_ z<F$Cq-EhakXE zrvm?%2!ika5zWWxtE-~ehWFSkf0qR9pQ#~BeP*l51;^}u_6fSxb9ezC#8v04FQ*$p z7xx`L`CG`Hp0-lQf_tQ05YA_Q%CIXw4fDiQ_jlFJhNFBm=;AbOwlz zN_99yNHZjjnPkcChdCRSvs|UT-u%D|4HF?7ieVM5{f)rtex%-0_ntb{jvy`zMmFd? 
z3eoViwWhbZn)V3}r>tiKz|9MwcC^nux}qHrZt(CSS}K=;)?hqcBbwn>hRPul{M-dU zklW9KLFWH}M_=x14&tm&kv)xcRG|XEf7owyg#@_R zp0KL|RBc0!+-_0JHo`)Dd*KU79_bUKmij1S;c5Xn>Hli)yQ7-gw!J;aqaGF8L7KF4 zL;!7OLqZWLQF`wI4UzyML3#-x-` zv`8+Xvg+X-*}F>rfb(f3R0tBqjgpa((_muTm6A}j(i9kpVbRXuZq%Xuz!y<%EpD?I z4c^CbPYk<2 zKiiMZ2!@#xmQrr-IuLw}v)?P(WourWfS%C4;X7hnL>Mw$SewWV9dyV;O(<(1Zud1Q zPSc;)HCpK5d-z8yDo0>|bvVZ{Vp_>k@&`}wB(8W;dV93O3Ii8SBtW1j_y-vGxPoVs5NCT*1ufgU002w;b{`@GIe9Rq|!%{JhRq#*G^gQx`@)jOs zLF9`-+Q?dqWQSCwpm~XpN3X1u)Ik=F2p(=Mugw=JQ2RKluWX+7~ z#W`C{76bGt=QJcqip*oM5oaBPSX%YlM7jBlK_9a7y}lY68$-z4XFgL;t%?L_rF^Gi zvmGW-?Md0#z^z3+H7DS-`S}`V?K_ZxF`vb@HOjRUjX@_yc>}k4QmoqIrHfqf7RHnK; z;CiKB&wfz*&GgyXmP2fZj)+^~;daM7$m!ou1G)$N=hjIUw}dya!QV!%Gl^-kTzn;B>TYYy(((Of zQed!;qrY>00qQnO=eVBjmo+&W#tW>Xbs)+bpQQ{=O_0I(}wiOJ&A$L>%(* za`D5iBzuU`TLF(Pj+xq6C2j_-)h+kT;H}%YdytL{_tDq$1CpnVqBndVuPdsbK(iDDLxA8&At9q_x*I4Rsb?WE{u;5OX$OomCbt zz0uz;zwv5~*b9Yj0FglY>Y1VRPAl`9)Q-s_EQLKqW#cQ_W8(sBy9x5c|hXc z!9+jH53==dWPdrG5ZS{xeAh+Z!8|}2<>Q+0{u&`npi0=Lik4b>Xy-SfAjZ_nD1$`3!n(!NieJA#DEAgva+%%}SmI;f>mf1`_M$rLAwat4%|Ta}gcHe#S=F z%o>Q*%+xCJ9y>-^B`MLSuI**1eG1QsCE-zDV1MBi+WqNIn51^VOBET2=1j7mT>qGq zB*Y|gk0TdKXl57=pT644z?!ILbZfSeVj@NV;=kh;KX;>@heGvOaH>_LRIk}e1437A z&FMlQ-?8G^fn6OG6-!x59`w%pX0QDgvk@#eAG|(K#5ny-GSKGX#YIPG{S%kR4bUPt zo*?CQt(J0Lb5>|3F-uR&q59lc(Yo0Uxw_(^w~2acni_}5m+J&cqd|GZ?~!j+%@gJp zpES+E{2`i3@2KWZ)e4A8O0PV3@)vY~0}V0V-aijZC<{boHJq-64-=uv6g6&;_``~nD3!w>g} zg}tlXKPl$%ezPnk0V@hm(jWyP8*5sZh8fC>rul_=+K>`Dd`tzk)BjXBkY(LBRpj?0 z|IudD@oTmRbSmTJVKX_fh(F4yd6AWpxW=h4Y-IDXUnZf=-40MaT&pYGJ=sT&2Qz~i zoN}_ov`4Rhuo~{m^tH(-G68pM0|uGMp%2}4f0Bo`0|p6fA@qMTB87w@ zurE21RxZf1&Cvd;qtIGCkKFvTLc?v19*gMPb1&7jm3Z7nj`0wA+!z#?e%J)taef!N za(tA5IS)xQncdn|wSeD&is37G=M7EhG+EC-G||sdeR##Wvt==Epv`S)X=6njGSDm} zl(uLvxaI96qf9KQ5)!)CSO2Z^X29q&nzgQRK~UUvNTTS6Z>he={O25}TWkI$d{%VRbnW zHe9795%^_6&*RR+Q;*Y5nsN>P!j|C7Q`gwO{{eoWXM%{^mohS_Q+Ro@t%%S})F@*D zkB0n>`H1n_WDSm9IoEanGRnhCEPHdou*y=Mt*(YTl?Oba5ZbWx!{kBd6k5I!0@una 
zfE*Z?9G8G?TDu<7XtBj}>9Qqu5sckEYmCh+7IG3j?k6gz2VJz$giOfIJatSQnDH;Y z@T-^%7SN$=j5G7R0bvIHB-k+B_^)Z;p%w1BMtY{+z9Pwx<@ zJ@(Oy3P%AFVx!Sfm&j&4X=E}l9DebQ&+YQ@h7;;7!aCJo3r0CQru!K&z50C>3D?5b ze|+l=Q9gVnrZM0Uu_nrb$8D}wxnN4LG|oW3Yvl~7t(xyn_?TZ*R4HZ1XyC(ZSFxSm z*r|cOnCC(Q?0#OSk%tI6ulm7#O(mUICW$<6F@1LnMVd}O;LDphT-R9MY}4Q^Dt>+0 zT0${ple`{2m54O#9c`CLTlmC-a4tYER2mk6HtannM40XWY0yh4rm{UD;#+1=4{~S& zgDKr%gFYSuDbcAfgskr6d3yJu-aM}+x5Qvpt5Hf#OBb(KWO|K%pSJINK>ZX!HZqhX z&9Fjq_=E&j>7R+kbQ6Rr+g-e=lX9ul%4k4>p0w{q8RXAN1JbK6ca@DV6=J!U9ju1l zlsZYttT3B8@dZZS7UE~~taXwu$NCyaEQ!@BwNq)MXA_F*;p68~M@0Rn%l^K9h*D~+ zA-7~LJ|gllWT3G%hA&H4`wMqSe>}!GdC=keBMm54v1`UJRy&D3fnNYwLL?Fx)p@Ct z_UV8N_ga_)0{a~sS8sRgL>cuZyUy4ZLOvm}CEYq-jF@~XyM6kcp@F1;R+1-FXaSh2 z@>XF<7ZTR$p1PTWub2fOtvna)SY48HRhOBri_b)4_Z?JlbU7Y5WUI%ef^^0;ss#op z^!N7fQYZIq``AIb^Od}O2T#~lU)x4m>d?fy%6PO`k1x^@yX z1l8>!4Pk4{oe&RuY7@5Tf+*Rrk3(f;WwrGSH1jw0KK)LxmcVjGh;sI|hBy+lJj`wI zq9zg9?ZT>-%pN`uf=*=yO;jCzk%=?{LXu6At4U_`PVkY+cJ3G;TXG3K`(beAsA>pm zYaJ;1$y|gssM61=n@WkRZ%za@{cx*cu2)?zW8Zim7}^b0WepmGWjvlxcet_tiEQSZ zrLvovri{u@5AOpb+7|xCQ#b+UE8@ap-ui+pghPbuiWh{&$0N~OgbNb)ha*HIO?aXg zlf#5Jq|{cuMDmcKNUD=E6F0H`9dlC!ULA4f-ALAVbGQoHDF1h;QKHAz5+EK1zrXq> zOd8%ME=PLjQQ;)8Y3#klRM)6En-Zxiv@&#mL{B=;XJN=|rcZV+8CflGW#7KjYB#4} z33AkEMuE`lu)Q8YytIhoSsMd6#1fmWf%Z&NF+8-cmvx!br%Lnw;63dWd8FJ_mVD)_ zSjDmo&#I?;ah8zSI|VTtm+N2iC933HeZCD&CC*gZHawfRZD_3&CAdY$;znt%a}z)q z=FytysjlZ~%Nc0qUJ-JC0O?k!$UE8!Pp-<*Bz=kKx+@_s)Pw07&70T&kUEwU6Z6)E z6HPaPlIOe{ywK*E9_T&wcQ=P+mp}RwqTtlGNE_FDGqwSdtv@Lfav+L$C(5bh8k6!? 
zEBjV%p~jbT#q4S+2Rpk+0V1mz+-Mf;yoayT!$VI#lS!thWAAWtq~lUfi7q_X6mq3{ ze*BP6K%zY+gCr`7J^YHEFSJZlx!GKAjOq3%+-S<-Qavv=Z7o!MPAk0RD8^VH_Y|w3 zE!N*!scxgJxIbMFHry^GI>yH=>`1M ztHH|R5MGffJ0s4vaMUb@39rQImZCy6?0bp%b#-Slq4WsYNkV@A#>>U)jBjBkUq_3gkdsIOi~vg z_A1j2dkaljmkX<3@eI{m!On7AX6hAWfF*oWEM@uK9pG8b{M4_+ z;+h;g-~JF*la^bx?J^h(bf3ina!1-F@v#<&LeYS8yLj|nQxMUJo#Ckz=ojD6%EwNO z6j94v8!;xer^o>-Wrpw;mLe*S!Ri>emTBnQ(`rZy5^sv1Gdt;6vz}|{PZ5ARQv*Pf zb>zLAB|PtE`nh}Y$S2vBqi$V6p~}CGqNbZ zA7j~fv}JjUXFj8Y_=xYSp0r7s zi@)h3(*C0Iak69C%%RieB3l?ifPRR@mY*0kDCyQg-@@gX`}Ox zxt9oT8GcCH4Cfg_7+i(D#loBK&z^!Ka-)A(7D_jh%}DD6m;pTwmQx6wMwLGPd?1z+ zLklT5_Z#Y;vK96E0P!kPun8!jw?Cnq7=_!jbek5J3KU*qL^^g2mFTG55xOJR^b~sp zVdOTj$%+>T7i(BY{U*)Q7+{8-nk?Z77cMm(eK3X{_>5i~Har$%%6_tL4tNj+z2QP+Q9K$^1g4NWU_ZWWllWlHCn6uiN z`FaBmb4B5h%lyI|x?Hv0AW_$ginKLnYwIb`v+~TlIGs_$bW-2!bapwjag|4~epo^? zNCCHy_cmp6Ag!)xGu9rAZqhRMaKsvLseX|92T*kJx?hxuBZk#U>&*VRW7QG&^&8V- zCSF2)oa|-?I@j`!S~Ei(@(RMLdD4}BFLUD`NyKtOyMn)4^>KGC6i0Pz+=0fGYG-s> zwZuK}o@~1smeASG7u+bV8ySLnar~xlyz2|0!5OxHkse8 zHkr4~jCmli+IqEqAx@zParGJZbJjvBULRc!eFeD!73-_BH57ed#@1Gq3#rsjJ;_1M zC8)CP1S+Z0U>Ep)mhk_B_cte5!8K1@1FEl95eGFw zDDd|;6rZ=R3~_FKHxr|eKM|%-UP*|`xznP)t~26B4Luxll*slrRaBG zYG2qUtx@H?zH4bJ8XE$0C@~O6@N%YL_L2h|$v|fP^6OIGK{6^*k)1eMYl|J19XBen ze_i8JMdXuGjm?@H8WI)@lR&8}i=*=neQ89Jt!%EyemV;vAH6iU{>;-<;k(|qPT^z> z8Zth8w`xuFHo6k`;Zgl8)vCF+VE*~Ih8?Vj){4GH00?>jUwb1sc#V*BqG`ck?1t}- zDSqE3=*bLoAH~b+)M{o5!YU#a&f8pA_3V6$jj5qr zft)S16sGw{tvo~rqRT8TT8>1@!){W`yEvY0#4GW%ARl$43z#T=X^0ni)KQMX%$I4h zOBYWsoB8ROjxWI69u-mu7hn$5mBXkeK@y}Li(IQ~Ee`1tQrswC$YoAP>!#%q-t-Tk z{Cw^tC1Nuty{DS`(?baEGiZN}#`8GtcgU-<$_v*wf2`i#i*<*DfSfO!JeR5c*PO4# zmSF$5uP!n*y8FEZ;Zvo~r3Z$?T@UE4E{J7#`%MVrgTulBoeJEbo^_^>7>9>?t0n~c z%GG)<$mM+Hg!$+poM@*OTTSu>gwg6ak1fl2>wyg=muCwt(D|nv2j18E;TjJ=cSHH6 zDhZMNg$r_~W)8TF`lixJ{dlq|^ME$@fYk%I03r_~s9r&RQ&}YB8p)E7b9>^G8axX! 
ze1XmRUB~m*$m8$oLcBfjEA66nEA=m$%6R4YJ-NDML~1spWeJ}YS#cQ5 z#@?isIR>B+f=ANR4i$7C=5^RKt|f%nc}*5}eVB=UIqVTZQz+|Ok?<-kl6XK6fOa0o zW+t`Gq~wK252VB*x1c@Yp$CgEo>W(}khH#fXV6v;8Q&)>)lfUWiZz_yNU+!Lm_*Ez zYn4)vHP*W&36~kYc%3dF3OW7!E8(N`<}BH&v)wlfd9!{70^D`d&Wo=C1)#k(VikQF zOkxp&+`k2R;sp*;(7qTrt$(Y7MQ3%5ptsEXH0*o?<9g7Ql7yh9+g+s@zaG5$a9)Gg zH?RttO}Pg#bww3MOcXMVklXB9(0PEj-rnI~7Pyu6;KgfueEx$5840_AYUmHe$d}zk z%xOzsf%?f#;BP7vprfQ?nMC`=yi+ZLSt}m2eM655J!d^Ylu-swVBKz02rw=`4l+EW zn0VeG2|4~tfl$1tL1t*1@y(hPi?>8b$;!&kzH8rQ1Mq)G9^+tL7H6}4W>0r(fl?4P zTx4RPvO#ZF^@9QdzT|Ogeoo;oOKsnFQ5m&={CditMP--YFV}ABCBLoF+o)HL~VE$#DZQO{(B8Y!!EdZt(?2eD>$&I$w!;55TgXW?%s$_rYXQc{KV8 z`9tE5(Swz|d5erFsg#IK;J_YzB4W0ct>f~qV#iPKfuwZZ-4YB=an%LyWzJKsHD>^d z&gr6==%2Wax#WDHwXAn+xl$Rq;{s#Lnwu+6P|J zB++kFbo_kL>PB&bC0x2Dvh_E=>1CPVqa#AiLv~DRE*p!xY0^&c1@0@qv3{xv2uP2& zUY%EvmZ^?xvQBY0Vs8`kYPOrWcxQKL_<%L!mbdjg=ikK@l^g&C%@Dt1G% zg(X)W?IO@q+2A#AEc}EW`(??;@~wx@Y~>ix)r~2?{?!z>-JHR_bWG$h&7CGti(*A#eOJV%GB;Dhfa?12Z;3dUhjV-S-C2pY2AE1s}%0 zU=He;5csN)Y-io$IJFUw5O`x(9WZrUX5fR-l*{irfEBmf)HvA1)bWe_yQ*VT4Df^B z`Oa4#=bkCwwbN1{*nwBCgUt_}H~@SNw#Qo_V{YHZF=*P~eX#xIe_e6?!ASU~#jZ^c z9t15gm@l1Cxclq8GrA}CIul3o ze<%lE?w%=|-@q&b$gwBhJG*pdc#2XU2}a%j1V#Z(z1va;{x!hhfAZ`HgE!b1zmZ)S z`pAtYj0d?V^oeDguPUntcJ}RmczO?Vw=HxYC~OV2Vpn)@!`OqG{({2GygDwtyE#8y z|97}~xt)h_^2;(=(rJJ!Mn7VJ=^`y zlr6wISmXXNfn=|EeZO|{RHMYg^MZy-xVjI zz!1yk)nlC-a;q5hv9tHtb>>T`f1Jk$g8|B5!!W1VEtbZM-`6CeKdvy}gXG+&%mf71 z82lFs4?>ZW$469;n{9P-FB4Q1Bx)A6V3>jiTuSfSKw1*Ci z-^m-3w`V9%)NZXb3%NXy0USG2x+sMFI zMUFJFY0K?uRqVUV1x%C6g~tL&XhG$wvyczDlhBaoK%R57Fh-_Vlfi2Bz|siFBC90O zHUc}8*p9}CAm$*dJdBlE;AGdMbBGSHHS0v6*hNlub$Kz2=OrU&3n~)eV4aUbIr^D8 zvd}0TAR}&fYNu}3l8g=c(AX6wC3EfiKa_iM*IEWqJ#Mw)e#6QZS54amoU^g4oL<## zI6S}*(MrI6kVst1M9!?W7xIRsY!&fSQAVdus4SXp5{F2f*CHU=DxBl=W!{OZ5jGqy z&`(>$cfHI#>gwEhAaJhQ_M~_`wpdXLJS9`oBbO+rGBbdiU%PU(cgyDratig|0J$Uo zf^hdNDHjwXL*kDOHM}{-eJa62_SIl=WXFUHW!4d!q>v2=klAQaJt|a)QKRzCj5yT( zc#1{d=!6?7Pgwa(n3-;0`{L}hV~1Hia+F*x7pQ5%o7d=}wVJhNZc-vPW@?dur1~yI 
z4{%~7`ejS~JGiiOQ-3}Nm{^k;{>rBn+P!}a$gz`M=)sX;YPx>`y?-kDN6?!i+gwQi z^E%e?JwMS@>AX6Emn~Y-4flmt9N(S4i+YD^J*f!>9w_Yn4(|Pl4WD$`QpLvTjW2y5 zkXk%Sb{QVXS0I)v5Dz~8sV%YqLY?nvk04{wtd!@ep3Jc`9cvkp`y)ZH##{*>T&M)6X4IsM1`KfyIn3BW1p4Yztz=W3u3 zP)5)2xzY=iS0dS8B1z_~(G%J39C40!Z~#)Is}T_vXfJ+j;QdUTqMK=W4Vs)Yn$?>` zMXcLRymf1jFt_1X)B(Y@#l>Fp@5S$DFcE>t(+UPFOA6YY16Zw?r4WEdFOQ0kV80i? zdzy8p@3wAg-ikw{F4AWJO~8;v$lGtihe)ihQ0Fb(@^Ms_vnHdJEyBt?z(?z?muOhIP+|~;*vRK zDTt=IS&0n)*oWUC`|G_YPK4QOfC?$wE%aNdcbL`Io5|dnxJbaUAfh8_mI<9!*RJ{J zW0&KbI?wvsJIxyCoMOS_G292qU(u2Up~3}K%E@NymaR$V$yDqWw~E@8E$z5?d4q~0 zBbtcHRp2Z@b9p}aY_Xv5$Omcv6ce{`sD7R+h*M7`1o$Z{A=aJwo_avoDww#4=8URo zdpu_kRKG3+BT+EzJn$(VYjq{;`u+9%qfiIq)jQr%24h@d1}bdAtl!5JC1nz8Ibix! z|ESs~H{l+$@p70M-?Sp0BWPC%uo^76Ub^wch1T$jc^J^{G}~0wVa%rJss;95LiYOn z`sikqS|dvZj?n+*Q>6j?;w20})#cr@mfS=*gzZ~xv7BRFo6al8JOE~29k}@k!p7qM zhX{asn0u-@?LzKdBadzSD#RVT4i_n9Y0Y}yk{rM3Mx|Ha0PN*^!2^v)bLqmoPRw^B z&flg9rta=koFc)cfy!xmrd|vC`>lf~ZYsw2s#YlVGZtRtrqw!d=+EYi8v_acqe0Jg zxQ|jD#2@K1YoGqknvKhF$y{l+x^pE5rI0)o8&V#R-XhCpd1$Rq&PSw$S84qXjQ34X zU?Q0f{B`>>09l&; zc81S&sxbYUO4J}G-D(xmkfEO*kQWv&A`(VF4rFoxpy{P|ZGLa7ysS+P_oxWMXrO&6 z|09Tx?=@2c41YGV4N^+_)D?bC2J|_Fq+y4HJGVa&GRax-D zh1nj1)!CG^l`WHd2Sm&MitU$8>BZ@K-=f~2-v-`rOZD(EV0H@AW?1wZhDwYhO{sQH zZ~^0_IUpx0dHAHhsFQ@cB7mnKV3Xy;Q4|$W@qb4i*nUtccQB6?$p_f@K7H6wmcd#1 zeExj`HiGvkKDnR!6koGq+@vY2eX{G4KAT&|OI?!tZ~1xen6tGUIqZZHMx>>ZL90GB zogpjgqsgkg4Jj`pnfagtB(zVRS*38V|MQ>)r0NFYsswH!661g+DKFq8SPvj1W2?q})XdGGG*KiG{&kD{ z86Fr_vPTqtRtNLi>DlceNd0L{JTi>XJ1?rpLcA7OvD)@>Qbm^~wv@@-M#8hw$?A^i z4KZ`p5^UFCu?~6P{eyT>MZ?ll%~}>VS}YznWYj0M`|vAzl)^1$kVmm$a(U zLJkf{!bVtj!!vu?ayb`oy>?%*f~lU<_=%!?q!1EQ(&t|I;(l9VUb_4tL!Od?q}O$8 z6UB1mO~CO0#1iUxHb-i_#l!Zn{^g7C8iyn0qwv^=VDg zia5$TU)kIO1)R((43NB1u+dWF6T+56)QVxkf9`J$33P%zYPR(k#)>^<$-ZuFEEDhR zVLEQzUo2!(Q584I8~!$Q-Y-uqv}*!4S>V_r3O{aZ&MGSA78R@AVMYQ7UBbebeucYK zs>qd#sJMR);C9L8U5b>OOrkZ!nH^yyj$2OOE^HLo$`(rYn_T|8_|?=O-La{8+ta^voPmsuQXbzoY6S3R0BLuz z9+4<@6$U6cini){OW1QJWUZ*4i7F1GnwOz+*wj30kchPS-ReI4bNNrb5fSGHMEsRd 
zm<2D9W@Q|8x%h!DdU<|wex7%nxdUaKw6ZhMFVBPi%{j`c)`UrUJs+TL0BT=4tDo_? zR+jdEqXQ`O%$w*kL)cO{qbb}@K{0$8Jq2x7*MCZaAIQfPq%vCr6Yebe8_t?p)v&Xx z*vH1!?5rb!dO)*_iJqQIE8SwgafEt(Yt;`8@ z&O;|rA;9#$odmZnHUGH!4Mco*1PoU#0Z{Dx#Pr>5O*)61l4oCECo){Ad>|DfZ0 zA`t-FIeE{1+pLNMw-74Q2j|>?LMs#&>e+xaxYwA~6Z8?^JTj5rAA$`I<{@@Br&m=e z2?>jXOFz3Xt0{}5P&0-*ARps(>rR{}G)$to658gd&~8*Yq1SqYFJt5v_B<$Y?KSL+ z4R8z$W-XkRFSZ4sJ0+Ygi!)%gO;S}xCA88v5@nKv7Y9%d_rp@Hsq)%MgEN}lc@G>9 zH8-c_TdA8CeO)nxG1|I(th`k6F_jkNEkV_p^=q>Dp>HeS66V4Y^dF@8>Gjp7SqANz z5`eExzkc4X;cVr{$xqgX@9;aVcDZ7|G z^fiS|hUW-69ub=iLf{Xp0*AYFXSEPUX>K^A9rxo~KQ+-@BaKi$JG*5sdypEb)A`W)aiyR3 zX&#Jj`p_fx@=f<2_8M=uwM^8;HCf#AQ*|u*)Fj?dDiQoHKB|Ls@s|wVGg3<|`&y^} ztr6gNRJKWW>Mv7TjwsyJ!q@Tx#}Fut$EYMzQ$&o6y?hJJ?f!)FPZ)wweusS8qlK)Q zlSZd(E3|UUnM=?4s`^%C71695>t94QVOIu4cfH<@$X+_2T@iT&TgwM&bKs?`F%3J9 zvzK8B}KS^xOt;zqf62WBe@@s^CqMLV>Fluod$p3Mqk{W=hg3Iu+=rRAOqSrX6m zcI9<3{n`WX#ieE-`#88L4pbonIg{+&l;hHN!5C-Z#UjIV)SR)9?X}~!j2Gubp8T{Pm5UGq=bi+ts7SIZi&>9c>Q7{ZzyG>)IrAm0|K!#6i&aW z_5OLE6Ej!b^0>xi`S{H7Ku$goU2 z2Sn%F&Eyv_643z^GH}Bw{wO;W3UoD-Pu9_}wj56@z`AK+4(-2Ylqgm9dnJ1xK&C#A zzgZ6MCbP@9>=S#)^IuGo|B9f8F%NfEm%mHOtC83y17c5E=I1t>O#s}vQ9akY-do|L ziZ9wUQd*zv6dpWQ2NNl8rSsCLCstXVLAD7u`Z;EqGFIX8W&iYXrvjfd_k{o^>^C$a zv9hPBaVl4r@a7-T^@*d3`-xc5cG0G_@4!LA){sh4*ISgvM1gG4-@tR4$nSmZTAsSjkFf1 zNj?-KL%G+sd<6NoHR-oCMmO5 zm>b={zGOoLtKEx}%7ID%TXVDMy=}ZAnDtF414h=R6Z!#{jhD=QaTII4B2;mx9JzS} z3UcwN`7NCF&J^Ov0u9iA$~OWCe9;|^D1-Kwp=nt|LRQ`R$%h1wa-jp~_Ip|%I_>^m zg!e70nVmlhv#MlVOI(Np(lv^lRJf(_caPntVc7Ciot2 zI7=nXLT5DW&M7D93@NZgMUTaYkU8*YRY>-BRmhWvf8*d^yx5EKI@>??PzNcP)eXyT z?0l>88p(gmQNM8}FDnxnXuPe1nqJ+72Fh+TY&^Tc2Q`RJSwic0Y=Lc4>cl zg)kG3i?aI-fc@tQTQcXH*nf!1$*baCyEb?qdZyw8cdg~bWn zRN$M5$P#vw(AuJRK8lHhAN~j!-YkPHuLK+{?ils4NW^}Ph1Fj};F7eJxIyiKWRqq6 zUO0?y;pOHrYEW{vhDOGnLCni8vNpBVTAcA1lilgz(vzKPF#f+FRe#BFyF2%NKbq!( zD9RmvGYNh4v!;;)2d4*-*7T2%H$XiZ=TOQy=ZSZ?&cu63?$#k`I>Ql&1FW2#{ek+= zh=Klm$I*)O^8a~C`*FnY3bAFI9OECZX!rHWHdq)`vw+k;Fd;s%(wmoeX4N_5{VwVy 
z1Cq}6O);Mh-g%mRlx@&ZrRxz|a`zL?ZP2&k)~)hazos1D9TNMtRqig2JJ<=nTsl_8 zJ`rq?U4M4x7nkASFJ9$-Vuy|o4*vSBNmLQU*^ldzW^-q-!zGhWPgv_m<1)4BFE$D_UD?juPRBr z0#>a`+?XZpaf*Ozezh ::: diff --git a/docs/reference/precision-support.rst b/docs/reference/precision-support.rst index 4f5be7e33..8ee81e4b3 100644 --- a/docs/reference/precision-support.rst +++ b/docs/reference/precision-support.rst @@ -9,8 +9,8 @@ Data types and precision support ************************************************************* -This topic lists the data types support on AMD GPUs, ROCm libraries along -with corresponding :doc:`HIP ` data types. +This topic summarizes the data types supported on AMD GPUs and accelerators and +ROCm libraries, along with corresponding :doc:`HIP ` data types. Integral types ============== @@ -61,18 +61,38 @@ The floating-point types supported by ROCm are listed in the following table. - Type name - HIP type - Description + + * + - float4 (E2M1) + - | ``__hip_fp4_e2m1`` + - A 4-bit floating-point number with **E2M1** bit layout, as described + in :doc:`low precision floating point types page `. + + * + - float6 (E3M2) + - | ``__hip_fp6_e3m2`` + - A 6-bit floating-point number with **E3M2** bit layout, as described + in :doc:`low precision floating point types page `. + + * + - float6 (E2M3) + - | ``__hip_fp6_e2m3`` + - A 6-bit floating-point number with **E2M3** bit layout, as described + in :doc:`low precision floating point types page `. + * - float8 (E4M3) - | ``__hip_fp8_e4m3_fnuz``, | ``__hip_fp8_e4m3`` - - An 8-bit floating-point number with **S1E4M3** bit layout, as described in :doc:`low precision floating point types page `. + - An 8-bit floating-point number with **E4M3** bit layout, as described in :doc:`low precision floating point types page `. The FNUZ variant has expanded range with no infinity or signed zero (NaN represented as negative zero), while the OCP variant follows the Open Compute Project specification. 
+ * - float8 (E5M2) - | ``__hip_fp8_e5m2_fnuz``, | ``__hip_fp8_e5m2`` - - An 8-bit floating-point number with **S1E5M2** bit layout, as described in :doc:`low precision floating point types page `. + - An 8-bit floating-point number with **E5M2** bit layout, as described in :doc:`low precision floating point types page `. The FNUZ variant has expanded range with no infinity or signed zero (NaN represented as negative zero), while the OCP variant follows the Open Compute Project specification. @@ -81,22 +101,26 @@ The floating-point types supported by ROCm are listed in the following table. - ``half`` - A 16-bit floating-point number that conforms to the IEEE 754-2008 half-precision storage format. + * - bfloat16 - ``bfloat16`` - A shortened 16-bit version of the IEEE 754 single-precision storage format. + * - tensorfloat32 - Not available - A floating-point number that occupies 32 bits or less of storage, providing improved range compared to half (16-bit) format, at (potentially) greater throughput than single-precision (32-bit) formats. + * - float32 - ``float`` - A 32-bit floating-point number that conforms to the IEEE 754 single-precision storage format. + * - float64 - ``double`` @@ -108,8 +132,8 @@ The floating-point types supported by ROCm are listed in the following table. * The float8 and tensorfloat32 types are internal types used in calculations in Matrix Cores and can be stored in any type of the same size. - * CNDA3 natively supports FP8 FNUZ (E4M3 and E5M2), which differs from the customised - FP8 format used in NVIDIA's H100 + * CDNA3 natively supports FP8 FNUZ (E4M3 and E5M2), which differs from the customized + FP8 format used with NVIDIA H100 (`FP8 Formats for Deep Learning `_). * In some AMD documents and articles, float8 (E5M2) is referred to as bfloat8. 
@@ -168,11 +192,13 @@ Data type support by hardware architecture AMD's GPU lineup spans multiple architecture generations: -* CDNA1 architecture: includes models such as MI100 -* CDNA2 architecture: includes models such as MI210, MI250, and MI250X -* CDNA3 architecture: includes models such as MI300A, MI300X, and MI325X -* RDNA3 architecture: includes models such as RX 7900XT and RX 7900XTX -* RDNA4 architecture: includes models such as RX 9070 and RX 9070XT +* CDNA1 such as MI100 +* CDNA2 such as MI210, MI250, and MI250X +* CDNA3 such as MI300A, MI300X, and MI325X +* CDNA4 such as MI350X and MI355X +* RDNA2 such as PRO W6800 and PRO V620 +* RDNA3 such as RX 7900XT and RX 7900XTX +* RDNA4 such as RX 9070 and RX 9070XT HIP C++ type implementation support ----------------------------------- @@ -188,6 +214,8 @@ following table. - CDNA1 - CDNA2 - CDNA3 + - CDNA4 + - RDNA2 - RDNA3 - RDNA4 @@ -198,6 +226,8 @@ following table. - ✅ - ✅ - ✅ + - ✅ + - ✅ * - ``int16_t``, ``uint16_t`` @@ -206,6 +236,8 @@ following table. - ✅ - ✅ - ✅ + - ✅ + - ✅ * - ``int32_t``, ``uint32_t`` @@ -214,6 +246,8 @@ following table. - ✅ - ✅ - ✅ + - ✅ + - ✅ * - ``int64_t``, ``uint64_t`` @@ -222,6 +256,38 @@ following table. - ✅ - ✅ - ✅ + - ✅ + - ✅ + + * + - ``__hip_fp4_e2m1`` + - ❌ + - ❌ + - ❌ + - ✅ + - ❌ + - ❌ + - ❌ + + * + - ``__hip_fp6_e2m3`` + - ❌ + - ❌ + - ❌ + - ✅ + - ❌ + - ❌ + - ❌ + + * + - ``__hip_fp6_e3m2`` + - ❌ + - ❌ + - ❌ + - ✅ + - ❌ + - ❌ + - ❌ * - ``__hip_fp8_e4m3_fnuz`` @@ -230,6 +296,8 @@ following table. - ✅ - ❌ - ❌ + - ❌ + - ❌ * - ``__hip_fp8_e5m2_fnuz`` @@ -238,12 +306,16 @@ following table. - ✅ - ❌ - ❌ + - ❌ + - ❌ * - ``__hip_fp8_e4m3`` - ❌ - ❌ - ❌ + - ✅ + - ❌ - ❌ - ✅ @@ -252,6 +324,8 @@ following table. - ❌ - ❌ - ❌ + - ✅ + - ❌ - ❌ - ✅ @@ -262,6 +336,8 @@ following table. - ✅ - ✅ - ✅ + - ✅ + - ✅ * - ``bfloat16`` @@ -270,6 +346,8 @@ following table. - ✅ - ✅ - ✅ + - ✅ + - ✅ * - ``float`` @@ -278,6 +356,8 @@ following table. 
- ✅ - ✅ - ✅ + - ✅ + - ✅ * - ``double`` @@ -286,6 +366,8 @@ following table. - ✅ - ✅ - ✅ + - ✅ + - ✅ .. note:: @@ -314,18 +396,21 @@ The following table lists data type support for compute units. - int16 - int32 - int64 + * - CDNA1 - ✅ - ✅ - ✅ - ✅ + * - CDNA2 - ✅ - ✅ - ✅ - ✅ + * - CDNA3 - ✅ @@ -333,6 +418,20 @@ The following table lists data type support for compute units. - ✅ - ✅ + * + - CDNA4 + - ✅ + - ✅ + - ✅ + - ✅ + + * + - RDNA2 + - ✅ + - ✅ + - ✅ + - ✅ + * - RDNA3 - ✅ @@ -347,53 +446,132 @@ The following table lists data type support for compute units. - ✅ - ✅ - .. tab-item:: Floating-point types - :sync: floating-point-type + .. tab-item:: Low precision floating-point types + :sync: floating-point-type-low .. list-table:: :header-rows: 1 * - Type name + - float4 + - float6 (E2M3) + - float6 (E3M2) - float8 (E4M3) - float8 (E5M2) + + * + - CDNA1 + - ❌ + - ❌ + - ❌ + - ❌ + - ❌ + + * + - CDNA2 + - ❌ + - ❌ + - ❌ + - ❌ + - ❌ + + * + - CDNA3 + - ❌ + - ❌ + - ❌ + - ❌ + - ❌ + + * + - CDNA4 + - ❌ + - ❌ + - ❌ + - ❌ + - ❌ + + * + - RDNA2 + - ❌ + - ❌ + - ❌ + - ❌ + - ❌ + + * + - RDNA3 + - ❌ + - ❌ + - ❌ + - ❌ + - ❌ + + * + - RDNA4 + - ❌ + - ❌ + - ❌ + - ❌ + - ❌ + + .. tab-item:: High precision floating-point types + :sync: floating-point-type-high + + .. list-table:: + :header-rows: 1 + + * + - Type name - float16 - bfloat16 - tensorfloat32 - float32 - float64 + * - CDNA1 - - ❌ - - ❌ - ✅ - ✅ - ❌ - ✅ - ✅ + * - CDNA2 - - ❌ - - ❌ - ✅ - ✅ - ❌ - ✅ - ✅ + * - CDNA3 + - ✅ + - ✅ - ❌ + - ✅ + - ✅ + + * + - CDNA4 + - ✅ + - ✅ - ❌ - ✅ - ✅ + + * + - RDNA2 + - ✅ + - ✅ - ❌ - ✅ - ✅ * - RDNA3 - - ❌ - - ❌ - ✅ - ✅ - ❌ @@ -402,8 +580,6 @@ The following table lists data type support for compute units. * - RDNA4 - - ❌ - - ❌ - ✅ - ✅ - ❌ @@ -429,18 +605,21 @@ The following table lists data type support for AMD GPU matrix cores. 
- int16 - int32 - int64 + * - CDNA1 - ✅ - ❌ - ❌ - ❌ + * - CDNA2 - ✅ - ❌ - ❌ - ❌ + * - CDNA3 - ✅ @@ -448,6 +627,20 @@ The following table lists data type support for AMD GPU matrix cores. - ❌ - ❌ + * + - CDNA4 + - ✅ + - ❌ + - ❌ + - ❌ + + * + - RDNA2 + - ✅ + - ❌ + - ❌ + - ❌ + * - RDNA3 - ✅ @@ -462,53 +655,132 @@ The following table lists data type support for AMD GPU matrix cores. - ❌ - ❌ - .. tab-item:: Floating-point types - :sync: floating-point-type + .. tab-item:: Low precision floating-point types + :sync: floating-point-type-low .. list-table:: :header-rows: 1 * - Type name + - float4 + - float6 (E2M3) + - float6 (E3M2) - float8 (E4M3) - float8 (E5M2) - - float16 - - bfloat16 - - tensorfloat32 - - float32 - - float64 + * - CDNA1 - ❌ - ❌ - - ✅ - - ✅ - ❌ - - ✅ - ❌ + - ❌ + * - CDNA2 - ❌ - ❌ - - ✅ - - ✅ + - ❌ + - ❌ + - ❌ + + * + - CDNA3 + - ❌ + - ❌ - ❌ - ✅ - ✅ + * - - CDNA3 - - ✅ - - ✅ + - CDNA4 - ✅ - ✅ - ✅ - ✅ - ✅ + * + - RDNA2 + - ❌ + - ❌ + - ❌ + - ❌ + - ❌ + * - RDNA3 - ❌ - ❌ + - ❌ + - ❌ + - ❌ + + * + - RDNA4 + - ❌ + - ❌ + - ❌ + - ✅ + - ✅ + + .. tab-item:: High precision floating-point types + :sync: floating-point-type-high + + .. list-table:: + :header-rows: 1 + + * + - Type name + - float16 + - bfloat16 + - tensorfloat32 + - float32 + - float64 + + * + - CDNA1 + - ✅ + - ✅ + - ❌ + - ✅ + - ❌ + + * + - CDNA2 + - ✅ + - ✅ + - ❌ + - ✅ + - ✅ + + * + - CDNA3 + - ✅ + - ✅ + - ✅ + - ✅ + - ✅ + + * + - CDNA4 + - ✅ + - ✅ + - ✅ + - ✅ + - ✅ + + * + - RDNA2 + - ✅ + - ✅ + - ❌ + - ❌ + - ❌ + + * + - RDNA3 - ✅ - ✅ - ❌ @@ -519,8 +791,6 @@ The following table lists data type support for AMD GPU matrix cores. - RDNA4 - ✅ - ✅ - - ✅ - - ✅ - ❌ - ❌ - ❌ @@ -582,48 +852,59 @@ page. - ✅ - ✅ - .. tab-item:: Floating-point types - :sync: floating-point-type + .. tab-item:: Low precision floating-point types + :sync: floating-point-type-low .. 
list-table:: :header-rows: 1 * - Type name + - float4 + - float6 (E2M3) + - float6 (E3M2) - float8 (E4M3) - float8 (E5M2) - - 2 x float16 - - 2 x bfloat16 - - tensorfloat32 - - float32 - - float64 + * - CDNA1 - ❌ - ❌ - - ✅ - - ✅ - ❌ - - ✅ - ❌ + - ❌ + * - CDNA2 - ❌ - ❌ - - ✅ - - ✅ - ❌ - - ✅ - - ✅ + - ❌ + - ❌ + * - CDNA3 - ❌ - ❌ - - ✅ - - ✅ - ❌ - - ✅ - - ✅ + - ❌ + - ❌ + + * + - CDNA4 + - ❌ + - ❌ + - ❌ + - ❌ + - ❌ + + * + - RDNA2 + - ❌ + - ❌ + - ❌ + - ❌ + - ❌ * - RDNA3 @@ -632,13 +913,79 @@ page. - ❌ - ❌ - ❌ - - ✅ - - ❌ * - RDNA4 - ❌ - ❌ + - ❌ + - ❌ + - ❌ + + .. tab-item:: High precision floating-point types + :sync: floating-point-type-high + + .. list-table:: + :header-rows: 1 + + * + - Type name + - 2 x float16 + - 2 x bfloat16 + - tensorfloat32 + - float32 + - float64 + + * + - CDNA1 + - ✅ + - ✅ + - ❌ + - ✅ + - ❌ + + * + - CDNA2 + - ✅ + - ✅ + - ❌ + - ✅ + - ✅ + + * + - CDNA3 + - ✅ + - ✅ + - ❌ + - ✅ + - ✅ + + * + - CDNA4 + - ✅ + - ✅ + - ❌ + - ✅ + - ✅ + + * + - RDNA2 + - ❌ + - ❌ + - ❌ + - ✅ + - ❌ + + * + - RDNA3 + - ❌ + - ❌ + - ❌ + - ✅ + - ❌ + + * + - RDNA4 - ✅ - ✅ - ❌ @@ -662,295 +1009,64 @@ Libraries input/output type support ----------------------------------- The following tables list ROCm library support for specific input and output -data types. Refer to the corresponding library data type support page for a -detailed description. +data types. Select a library from the below table to view the supported data +types. -.. tab-set:: +.. datatemplate:yaml:: /data/reference/precision-support/precision-support.yaml - .. tab-item:: Integral types - :sync: integral-type + {% set library_groups = data.library_groups %} - .. list-table:: - :header-rows: 1 + .. raw:: html - * - - Library input/output data type name - - int8 - - int16 - - int32 - - int64 +
+
+
Category
+
+ {% for group in library_groups %} +
{{ group.group }}
+ {% endfor %} +
+
- * - - :doc:`Composable Kernel ` - - ✅/✅ - - ❌/❌ - - ✅/✅ - - ❌/❌ +
+
Library
+
+ {% for group in library_groups %} + {% for library in group.libraries %} +
{{ library.name }}
+ {% endfor %} + {% endfor %} +
+
+
- * - - :doc:`hipCUB ` - - ✅/✅ - - ✅/✅ - - ✅/✅ - - ✅/✅ + {% for group in library_groups %} + {% for library in group.libraries %} - * - - :doc:`hipRAND ` - - NA/✅ - - NA/✅ - - NA/✅ - - NA/✅ + .. container:: model-doc {{ library.tag }} - * - - :doc:`hipSOLVER ` - - ❌/❌ - - ❌/❌ - - ❌/❌ - - ❌/❌ + For more information, please visit :doc:`{{ library.name }} <{{ library.doc_link }}>`. - * - - :doc:`hipSPARSELt ` - - ✅/✅ - - ❌/❌ - - ❌/❌ - - ❌/❌ + .. list-table:: + :header-rows: 1 + :widths: 70, 30 - * - - :doc:`hipTensor ` - - ❌/❌ - - ❌/❌ - - ❌/❌ - - ❌/❌ + * + - Data Type + - Support + {% for data_type in library.data_types %} + * + - {{ data_type.type }} + - {{ data_type.support }} + {% endfor %} - * - - :doc:`MIGraphX ` - - ✅/✅ - - ✅/✅ - - ✅/✅ - - ✅/✅ + {% endfor %} + {% endfor %} - * - - :doc:`MIOpen ` - - ⚠️/⚠️ - - ❌/❌ - - ⚠️/⚠️ - - ❌/❌ +.. note:: - * - - :doc:`RCCL ` - - ✅/✅ - - ❌/❌ - - ✅/✅ - - ✅/✅ - - * - - :doc:`rocFFT ` - - ❌/❌ - - ❌/❌ - - ❌/❌ - - ❌/❌ - - * - - :doc:`rocPRIM ` - - ✅/✅ - - ✅/✅ - - ✅/✅ - - ✅/✅ - - * - - :doc:`rocRAND ` - - NA/✅ - - NA/✅ - - NA/✅ - - NA/✅ - - * - - :doc:`rocSOLVER ` - - ❌/❌ - - ❌/❌ - - ❌/❌ - - ❌/❌ - - * - - :doc:`rocThrust ` - - ✅/✅ - - ✅/✅ - - ✅/✅ - - ✅/✅ - - * - - :doc:`rocWMMA ` - - ✅/✅ - - ❌/❌ - - ❌/✅ - - ❌/❌ - - - .. tab-item:: Floating-point types - :sync: floating-point-type - - .. 
list-table:: - :header-rows: 1 - - * - - Library input/output data type name - - float8 (E4M3) - - float8 (E5M2) - - float16 - - bfloat16 - - tensorfloat32 - - float32 - - float64 - - * - - :doc:`Composable Kernel ` - - ✅/✅ - - ✅/✅ - - ✅/✅ - - ✅/✅ - - ❌/❌ - - ✅/✅ - - ✅/✅ - - * - - :doc:`hipCUB ` - - ❌/❌ - - ❌/❌ - - ✅/✅ - - ✅/✅ - - ❌/❌ - - ✅/✅ - - ✅/✅ - - * - - :doc:`hipRAND ` - - NA/❌ - - NA/❌ - - NA/✅ - - NA/❌ - - NA/❌ - - NA/✅ - - NA/✅ - - * - - :doc:`hipSOLVER ` - - ❌/❌ - - ❌/❌ - - ❌/❌ - - ❌/❌ - - ❌/❌ - - ✅/✅ - - ✅/✅ - - * - - :doc:`hipSPARSELt ` - - ✅/✅ - - ✅/✅ - - ✅/✅ - - ✅/✅ - - ❌/❌ - - ❌/❌ - - ❌/❌ - - * - - :doc:`hipTensor ` - - ❌/❌ - - ❌/❌ - - ✅/✅ - - ✅/✅ - - ❌/❌ - - ✅/✅ - - ✅/✅ - - * - - :doc:`MIGraphX ` - - ✅/✅ - - ✅/✅ - - ✅/✅ - - ✅/✅ - - ✅/✅ - - ✅/✅ - - ✅/✅ - - * - - :doc:`MIOpen ` - - ⚠️/⚠️ - - ⚠️/⚠️ - - ✅/✅ - - ⚠️/⚠️ - - ❌/❌ - - ✅/✅ - - ⚠️/⚠️ - - * - - :doc:`RCCL ` - - ✅/✅ - - ✅/✅ - - ✅/✅ - - ✅/✅ - - ❌/❌ - - ✅/✅ - - ✅/✅ - - * - - :doc:`rocFFT ` - - ❌/❌ - - ❌/❌ - - ✅/✅ - - ❌/❌ - - ❌/❌ - - ✅/✅ - - ✅/✅ - - * - - :doc:`rocPRIM ` - - ❌/❌ - - ❌/❌ - - ✅/✅ - - ✅/✅ - - ❌/❌ - - ✅/✅ - - ✅/✅ - - * - - :doc:`rocRAND ` - - NA/❌ - - NA/❌ - - NA/✅ - - NA/❌ - - NA/❌ - - NA/✅ - - NA/✅ - - * - - :doc:`rocSOLVER ` - - ❌/❌ - - ❌/❌ - - ❌/❌ - - ❌/❌ - - ❌/❌ - - ✅/✅ - - ✅/✅ - - * - - :doc:`rocThrust ` - - ❌/❌ - - ❌/❌ - - ⚠️/⚠️ - - ⚠️/⚠️ - - ❌/❌ - - ✅/✅ - - ✅/✅ - - * - - :doc:`rocWMMA ` - - ✅/❌ - - ✅/❌ - - ✅/✅ - - ✅/✅ - - ✅/✅ - - ✅/✅ - - ✅/✅ + The meaning of partial support depends on the library. Please refer to the individual + libraries' documentation for more information. .. note:: @@ -958,6 +1074,15 @@ detailed description. data types for the random values they generate, with no need for input data types. +.. note:: + + hipBLASLt supports additional data types as internal compute types, which may + differ from the supported input/output types shown in the tables above. 
While + TensorFloat32 is not supported as an input or output type in this library, it + is available as an internal compute type. For complete details on supported + compute types, refer to the :doc:`hipBLASLt ` + documentation. + hipDataType enumeration ----------------------- @@ -1049,6 +1174,24 @@ following table with descriptions and values. - 29 - 8-bit real bfloat8 precision floating-point (OCP version). + * + - ``HIP_R_6F_E2M3`` + - ``__hip_fp6_e2m3`` + - 31 + - 6-bit real float6 precision floating-point. + + * + - ``HIP_R_6F_E3M2`` + - ``__hip_fp6_e3m2`` + - 32 + - 6-bit real bfloat6 precision floating-point. + + * + - ``HIP_R_4F_E2M1`` + - ``__hip_fp4_e2m1`` + - 33 + - 4-bit real float4 precision floating-point. + * - ``HIP_R_8F_E4M3_FNUZ`` - ``__hip_fp8_e4m3_fnuz`` @@ -1061,4 +1204,4 @@ following table with descriptions and values. - 1001 - 8-bit real bfloat8 precision floating-point (FNUZ version). -The full list of the ``hipDataType`` enumeration listed in `library_types.h `_ . +The full list of the ``hipDataType`` enumeration listed in `library_types.h `_. 
diff --git a/docs/sphinx/_toc.yml.in b/docs/sphinx/_toc.yml.in
index 1f56af9a8..b6d1343cf 100644
--- a/docs/sphinx/_toc.yml.in
+++ b/docs/sphinx/_toc.yml.in
@@ -180,7 +180,7 @@ subtrees:
 - file: reference/gpu-arch-specs.rst
 - file: reference/gpu-atomics-operation.rst
 - file: reference/precision-support.rst
-  title: Precision support
+  title: Data types and precision support
 - file: reference/graph-safe-support.rst
   title: Graph safe support
 

From 4f8426376b2353d84b01a96c7a1f5fe33c49e395 Mon Sep 17 00:00:00 2001
From: srawat <120587655+SwRaw@users.noreply.github.com>
Date: Thu, 28 Aug 2025 20:43:10 +0530
Subject: [PATCH 37/58] Update gpu-arch.md

---
 docs/conceptual/gpu-arch.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/conceptual/gpu-arch.md b/docs/conceptual/gpu-arch.md
index d2f790578..f14a39421 100644
--- a/docs/conceptual/gpu-arch.md
+++ b/docs/conceptual/gpu-arch.md
@@ -22,7 +22,7 @@ architecture.
 * [AMD Instinct MI300/CDNA3 ISA](https://www.amd.com/content/dam/amd/en/documents/instinct-tech-docs/instruction-set-architectures/amd-instinct-mi300-cdna3-instruction-set-architecture.pdf)
 * [White paper](https://www.amd.com/content/dam/amd/en/documents/instinct-tech-docs/white-papers/amd-cdna-3-white-paper.pdf)
 * [MI300 performance counters](./gpu-arch/mi300-mi200-performance-counters.rst)
-* [MI350 performance counters](./gpu-arch/mi350-performance-counters.rst)
+* [MI350 series performance counters](./gpu-arch/mi350-performance-counters.rst)
 :::
 
 :::{grid-item-card}

From 52ce2014011be7b02ed9d17576fb272d60c57f29 Mon Sep 17 00:00:00 2001
From: Pratik Basyal
Date: Thu, 28 Aug 2025 16:50:55 -0400
Subject: [PATCH 38/58] ROCm 7.0.0 Known issues [Batch2] (#529)

* Known issues added
* SME feedback added
---
 RELEASE.md | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/RELEASE.md b/RELEASE.md
index 1bcff67c7..2ea6521bc 100644
--- a/RELEASE.md
+++ b/RELEASE.md
@@ -120,7 +120,7 @@ ROCm 7.0 enables support for Triton 3.3.0.
 
 ### Instinct Driver/ROCm packaging separation
 
-The Instinct Driver is now distributed separately from the ROCm software stack and is stored under in its own location ``/amdgpu/`` in the package repository at [repo.radeon.com](https://repo.radeon.com/amdgpu/). The first release is designated as Instinct Driver version 30.10. See the [ROCm Gets Modular: Meet the Instinct Datacenter GPU Driver](https://rocm.blogs.amd.com/ecosystems-and-partners/instinct-gpu-driver/README.html) blog and [User and kernel-space support matrix](https://rocm.docs.amd.com/projects/install-on-linux-internal/en/latest/reference/user-kernel-space-compat-matrix.html)for more information.
+The Instinct Driver is now distributed separately from the ROCm software stack and is stored under in its own location ``/amdgpu/`` in the package repository at [repo.radeon.com](https://repo.radeon.com/amdgpu/). The first release is designated as Instinct Driver version 30.10. See the [ROCm Gets Modular: Meet the Instinct Datacenter GPU Driver](https://rocm.blogs.amd.com/ecosystems-and-partners/instinct-gpu-driver/README.html) blog and [User and kernel-space support matrix](https://rocm.docs.amd.com/projects/install-on-linux-internal/en/latest/reference/user-kernel-space-compat-matrix.html) for more information.
 
 [AMD SMI](https://github.com/ROCm/amdsmi) continues to stay with the ROCm software stack under the ROCm organization repository.
 
@@ -2365,6 +2365,10 @@ Compiling from a device kernel or function results in failure when attempting to
 
 Starting with GCC 5.1, GNU `libstdc++` introduced a dual Application Binary Interface (ABI) to adopt `C++11`, primarily affecting the `std::string` and its dependencies, including `std::regex`. If your code is compiled against headers expecting one ABI but linked or run with the other, it can cause problems with `std::string` and `std::regex`, leading to a segmentation fault in ROCprofiler-SDK, which uses `std::regex`. This issue is resolved in the [ROCm Systems `develop` branch](https://github.com/ROCm/rocm-systems) and will be part of a future ROCm release.
 
+### Decline in performance of batched GEMM operation for applications using hipBLASLT kernels
+
+Default batched General Matrix Multiplications (GEMM) operations for rocBLAS and hipBLAS on gfx1200 and gfx1201 may have a decline in performance in comparison with non-batched and strided_batched GEMM operations. By default, the batched GEMM uses hipBLASLT kernels, and switching to the Tensile kernel resolves the performance decline issue. The issue will be fixed in a future ROCm release. As a workaround, you can set the environment variable `ROCBLAS_USE_HIPBLASLT=0` before the batched GEMM operation is performed on gfx1200 and gfx1201. After completing the batched operation, reset the variable to `ROCBLAS_USE_HIPBLASLT=1` before calling non-batched or strided_batched.
+
 ## ROCm resolved issues
 
 The following are previously known issues resolved in this release. For resolved issues related to

From b4c5980a964a788f44f33869d9ba697152e2fdc1 Mon Sep 17 00:00:00 2001
From: Pratik Basyal
Date: Thu, 28 Aug 2025 17:52:39 -0400
Subject: [PATCH 39/58] Update to 7.0.0 RN and Compatibility matrix (#530)

* Fixes applied
* Tutorial HUB update added
---
 RELEASE.md | 6 +++++-
 docs/compatibility/compatibility-matrix.rst | 2 +-
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/RELEASE.md b/RELEASE.md
index 2ea6521bc..958175fe1 100644
--- a/RELEASE.md
+++ b/RELEASE.md
@@ -317,6 +317,10 @@ For more information, see [ROCm Runfile Installer](https://rocm.docs.amd.com/pro
 
 ROCm documentation continues to be updated to provide clearer and more comprehensive guidance for a wider variety of user needs and use cases.
 
+* [Tutorials for AI developers](https://rocm.docs.amd.com/projects/ai-developer-hub/en/latest/) have been expanded with the following new inference tutorial: [PD disaggregation with SGLang](https://rocm.docs.amd.com/projects/ai-developer-hub/en/latest/notebooks/inference/SGlang_PD_Disagg_On_AMD_GPU.html)
+
+  In addition, the [AI agent with MCPs using vLLM and PydanticAI](https://rocm.docs.amd.com/projects/ai-developer-hub/en/latest/notebooks/inference/build_airbnb_agent_mcp.html) tutorial has been updated. For more information about the changes, see [Changelog for the AI Developer Hub](https://rocm.docs.amd.com/projects/ai-developer-hub/en/latest/changelog.html).
+
 * Documentation for [rocCV](https://advanced-micro-devices-roccv--28.com.readthedocs.build/en/28/), an efficient GPU-accelerated library for image pre- and post-processing, has been added. rocCV is in an early access state, and using it on production workloads is not recommended.
 
 * ROCm Math libraries support a wide range of data types, enabling optimized performance across various precision requirements. The following Math libraries are now updated with new precision content. For more information, click the Math library’s link:
@@ -2367,7 +2371,7 @@ Starting with GCC 5.1, GNU `libstdc++` introduced a dual Application Binary Inte
 
 ### Decline in performance of batched GEMM operation for applications using hipBLASLT kernels
 
-Default batched General Matrix Multiplications (GEMM) operations for rocBLAS and hipBLAS on gfx1200 and gfx1201 may have a decline in performance in comparison with non-batched and strided_batched GEMM operations. By default, the batched GEMM uses hipBLASLT kernels, and switching to the Tensile kernel resolves the performance decline issue. The issue will be fixed in a future ROCm release. As a workaround, you can set the environment variable `ROCBLAS_USE_HIPBLASLT=0` before the batched GEMM operation is performed on gfx1200 and gfx1201. After completing the batched operation, reset the variable to `ROCBLAS_USE_HIPBLASLT=1` before calling non-batched or strided_batched.
+Default batched General Matrix Multiplications (GEMM) operations for rocBLAS and hipBLAS on gfx1200 and gfx1201 may have a decline in performance in comparison with non-batched and strided_batched GEMM operations. By default, the batched GEMM uses hipBLASLT kernels, and switching to the Tensile kernel resolves the performance decline issue. The issue will be fixed in a future ROCm release. As a workaround, you can set the environment variable `ROCBLAS_USE_HIPBLASLT=0` before the batched GEMM operation is performed on gfx1200 and gfx1201. After completing the batched operation, reset the variable to `ROCBLAS_USE_HIPBLASLT=1` before calling non-batched or strided_batched operations.
 
 ## ROCm resolved issues
 
diff --git a/docs/compatibility/compatibility-matrix.rst b/docs/compatibility/compatibility-matrix.rst
index 2a696e3c6..794292cf3 100644
--- a/docs/compatibility/compatibility-matrix.rst
+++ b/docs/compatibility/compatibility-matrix.rst
@@ -159,7 +159,7 @@ compatibility and system requirements.
 .. rubric:: Footnotes
 
 .. [#ol-700-mi300x] **For ROCm 7.0** - Oracle Linux 9 is supported only on AMD Instinct MI300X, MI350X, and MI355X. Oracle Linux 8 is only supported on AMD Instinct MI300X.
-.. [#ol-mi300x] **Prior ROCm 7.0** - Oracle Linux is only on AMD Instinct MI300X.
+.. [#ol-mi300x] **Prior ROCm 7.0** - Oracle Linux is supported only on AMD Instinct MI300X.
 .. [#single-node] Debian 12 is supported only on AMD Instinct MI300X for single-node functionality.
 .. [#az-mi300x] Starting ROCm 6.4.0, Azure Linux 3.0 is supported only on AMD Instinct MI300X and AMD Radeon PRO V710.
 .. [#RDNA-OS] Radeon AI PRO R9700, Radeon RX 9070 XT (gfx1201), Radeon RX 9060 XT (gfx1200), Radeon PRO W7700 (gfx1101), and Radeon RX 7800 XT (gfx1101) are supported only on Ubuntu 24.04.2, Ubuntu 22.04.5, RHEL 9.6, and RHEL 9.4.

From 08dad2dc41a787564dda98f85d84f895c2414180 Mon Sep 17 00:00:00 2001
From: randyh62 <42045079+randyh62@users.noreply.github.com>
Date: Tue, 2 Sep 2025 13:34:02 -0700
Subject: [PATCH 40/58] Update RELEASE.md (#531)

Remove fine-grained system memory pool from HIP Highlights
---
 RELEASE.md | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/RELEASE.md b/RELEASE.md
index 958175fe1..385123c94 100644
--- a/RELEASE.md
+++ b/RELEASE.md
@@ -138,7 +138,9 @@ The HIP runtime now includes support for:
 * `__syncwarp` operation.
 * The `_sync()` version of crosslane builtins such as `shfl_sync()` are enabled by default. These can be disabled by setting the preprocessor macro `HIP_DISABLE_WARP_SYNC_BUILTINS`.
 * Added warp level primitives: `__syncwarp` and reduce intrinsics (for example, `__reduce_add_sync()`).
-* Extended fine grained system memory pool.
+* Support for the flags in APIs as following, now allows uncached memory allocation.
+  - `hipExtHostRegisterUncached`, used in `hipHostRegister`.
+  - `hipHostMallocUncached` and `hipHostAllocUncached`, used in `hipHostMalloc` and `hipHostAlloc`.
 * A new attribute in HIP runtime was implemented which exposes a new device capability of how many compute dies (chiplets, xcc) are available on a given GPU. Developers can get this attribute via the API `hipDeviceGetAttribute`, to make use of the best cache locality in a kernel, and optimize the Kernel launch grid layout, for performance improvement.
 
 Additionally, the HIP runtime includes functional improvements, which improve functionality, runtime performance, and the user experience. For more information, see [HIP changelog](#hip-7-0-0) below.
From c2080a90c7a61341e8677c7f4c81c6e31fdbc39f Mon Sep 17 00:00:00 2001 From: Pratik Basyal Date: Fri, 5 Sep 2025 09:07:51 -0400 Subject: [PATCH 41/58] Changelog editorial fix ROCm 700 (#534) * Changelog editorial fix * Changelog synced --- CHANGELOG.md | 385 +++++++++++++++++++++++++-------------------------- RELEASE.md | 375 ++++++++++++++++++++++++------------------------- 2 files changed, 375 insertions(+), 385 deletions(-) diff --git a/CHANGELOG.md b/CHANGELOG.md index 40d86ebba..e66d1062b 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -92,9 +92,6 @@ for a complete overview of this release. - `amdsmi_dpm_policy_entry_t` member `policy_description` changed from `AMDSMI_MAX_NAME` to `AMDSMI_MAX_STRING_LENGTH`. - `amdsmi_name_value_t` member `name` changed from `AMDSMI_MAX_NAME` to `AMDSMI_MAX_STRING_LENGTH`. -* Updated `amdsmi_get_clock_info` in `amdsmi_interface.py`. - - The `clk_deep_sleep` field now returns the sleep integer value. - * For backwards compatibility, updated `amdsmi_bdf_t` union to have an identical unnamed struct. #### Removed @@ -163,7 +160,7 @@ See the full [AMD SMI changelog](https://github.com/ROCm/amdsmi/blob/release/roc * Support for GKCYX layout for grouped convolution backward data (NGCHW/GKCYX/NGKHW). * Support for Stream-K version of mixed `FP8` / `BF16` GEMM. * Support for Multiple D GEMM. -* GEMM pipeline for microscaling (MX) `FP8` / `FP6` / `FP4` data types +* GEMM pipeline for microscaling (MX) `FP8` / `FP6` / `FP4` data types. * Support for `FP16` 2:4 structured sparsity to universal GEMM. * Support for Split K for grouped convolution backward data. * Logit soft-capping support for fMHA forward kernels. @@ -208,11 +205,12 @@ functions added for logical reduction. For details, see [Warp cross-lane functio - HIP APIs for `FP4`/`FP6`/`FP8`, which are compatible with corresponding CUDA APIs. - HIP Extensions APIs for microscaling formats, which are supported on AMD GPUs. 
* New `wptr` and `rptr` values in `ClPrint`, for better logging in dispatch barrier methods. -* New debug mask, to print precise code object information for logging. * The `_sync()` version of crosslane builtins such as `shfl_sync()` are enabled by default. These can be disabled by setting the preprocessor macro `HIP_DISABLE_WARP_SYNC_BUILTINS`. * Added `constexpr` operators for `fp16`/`bf16`. -* Added warp level primitives: `__syncwarp` and reduce intrinsics (e.g. `__reduce_add_sync()`) -* Extended fine grained system memory pool. +* Added warp level primitives: `__syncwarp` and reduce intrinsics (e.g. `__reduce_add_sync()`). +* Support for the flags in APIs as following, now allows uncached memory allocation. + - `hipExtHostRegisterUncached`, used in `hipHostRegister`. + - `hipHostMallocUncached` and `hipHostAllocUncached`, used in `hipHostMalloc` and `hipHostAlloc`. * `num_threads` total number of threads in the group. The legacy API size is alias. * Added PCI CHIP ID information as the device attribute. * Added new tests applications for OCP data types `FP4`/`FP6`/`FP8`. @@ -220,13 +218,14 @@ functions added for logical reduction. For details, see [Warp cross-lane functio #### Changed * Some unsupported GPUs such as gfx9, gfx8 and gfx7 are deprecated on Microsoft Windows. -* Removal of beta warnings in HIP Graph APIs -All Beta warnings in usage of HIP Graph APIs are removed, they are now officially and fully supported. +* Removal of beta warnings in HIP Graph APIs. All Beta warnings in usage of HIP Graph APIs are removed, they are now officially and fully supported. +* `warpSize` has changed. +In order to match the CUDA specification, the `warpSize` variable is no longer `constexpr`. In general, this should be a transparent change; however, if an application was using `warpSize` as a compile-time constant, it will have to be updated to handle the new definition. 
For more information, see the discussion of `warpSize` within the [HIP C++ language extensions](https://rocm.docs.amd.com/projects/HIP/en/latest/how-to/hip_cpp_language_extensions.html#warpsize). * Behavior changes - `hipGetLastError` now returns the error code which is the last actual error caught in the current thread during the application execution. - Cooperative groups in `hipLaunchCooperativeKernelMultiDevice` and `hipLaunchCooperativeKernel` functions, additional input parameter validation checks are added. - `hipPointerGetAttributes` returns `hipSuccess` instead of an error with invalid value `hipErrorInvalidValue`, in case `NULL` host or attribute pointer is passed as input parameter. It now matches the functionality of `cudaPointerGetAttributes` which changed with CUDA 11 and above releases. - - `hipFree` previously there was an implicit wait which was applicable for all memory allocations, for synchronization purpose. This wait is now disabled for allocations made with `hipMallocAsync` and `hipMallocFromPoolAsync`, to match the behavior of CUDA API `cudaFree` + - `hipFree` previously there was an implicit wait which was applicable for all memory allocations, for synchronization purpose. This wait is now disabled for allocations made with `hipMallocAsync` and `hipMallocFromPoolAsync`, to match the behavior of CUDA API `cudaFree`. - `hipFreeAsync` now returns `hipSuccess` when the input pointer is NULL, instead of ` hipErrorInvalidValue` , to be consistent with `hipFree`. * Changes in hipRTC. - Removal of `hipRTC` symbols from HIP Runtime Library. @@ -248,14 +247,14 @@ All Beta warnings in usage of HIP Graph APIs are removed, they are now officiall - HIP vector constructor change in `hipComplex` initialization now generates correct values. The affected constructors will be small vector types such as `float2`, `int4`, etc. 
* Stream Capture updates - Restricted stream capture mode, it is made in HIP APIs via adding the macro `CHECK_STREAM_CAPTURE_SUPPORTED ()`. -In the previous HIP enumeration `hipStreamCaptureMode`, three capture modes were defined. With checking in the macro, the only supported stream capture mode is now `hipStreamCaptureModeRelaxed`. The rest are not supported, and the macro will return `hipErrorStreamCaptureUnsupported`. This update involves the following APIs, which is allowed only in relaxed stream capture mode, +In the previous HIP enumeration `hipStreamCaptureMode`, three capture modes were defined. With checking in the macro, the only supported stream capture mode is now `hipStreamCaptureModeRelaxed`. The rest are not supported, and the macro will return `hipErrorStreamCaptureUnsupported`. This update involves the following APIs, which is allowed only in relaxed stream capture mode: * `hipMallocManaged` * `hipMemAdvise` - Checks stream capture mode, the following APIs check the stream capture mode and return error codes to match the behavior of CUDA. * `hipLaunchCooperativeKernelMultiDevice` * `hipEventQuery` * `hipStreamAddCallback` - - Returns error during stream capture. The following HIP APIs now returns specific error `hipErrorStreamCaptureUnsupported` on the AMD platform, but not always `hipSuccess`, to match behavior with CUDA. + - Returns error during stream capture. The following HIP APIs now returns specific error `hipErrorStreamCaptureUnsupported` on the AMD platform, but not always `hipSuccess`, to match behavior with CUDA: * `hipDeviceSetMemPool` * `hipMemPoolCreate` * `hipMemPoolDestroy` @@ -264,7 +263,7 @@ In the previous HIP enumeration `hipStreamCaptureMode`, three capture modes were * `hipMemcpyWithStream` * Error code update Returned error/value codes are updated in the following HIP APIs to match the corresponding CUDA APIs. 
- - Module Management Related APIs + - Module Management Related APIs: * `hipModuleLaunchKernel` * `hipExtModuleLaunchKernel` * `hipExtLaunchKernel` @@ -273,13 +272,13 @@ Returned error/value codes are updated in the following HIP APIs to match the co * `hipLaunchKernelExC` * `hipModuleLaunchCooperativeKernel` * `hipModuleLoad` - - Texture Management Related APIs + - Texture Management Related APIs: The following APIs update the return codes to match the behavior with CUDA: * `hipTexObjectCreate`, supports zero width and height for 2D image. If either is zero, will not return `false`. * `hipBindTexture2D`, adds extra check, if pointer for texture reference or device is NULL, returns `hipErrorNotFound`. * `hipBindTextureToArray`, if any NULL pointer is input for texture object, resource descriptor, or texture descriptor, returns error `hipErrorInvalidChannelDescriptor`, instead of `hipErrorInvalidValue`. * `hipGetTextureAlignmentOffset`, adds a return code `hipErrorInvalidTexture` when the texture reference pointer is NULL. - - Cooperative Group Related APIs, more calidations are added in the following API implementation, + - Cooperative Group Related APIs, more calidations are added in the following API implementation: * `hipLaunchCooperativeKernelMultiDevice` * `hipLaunchCooperativeKernel` * Invalid stream input parameter handling @@ -323,13 +322,10 @@ In order to match the CUDA runtime behavior more closely, HIP APIs with streams - Event Management Related APIs * `hipEventRecord` * `hipEventRecordWithFlags` -* `warpSize` Change - -In order to match the CUDA specification, the `warpSize` variable is no longer `constexpr`. In general, this should be a transparent change; however, if an application was using `warpSize` as a compile-time constant, it will have to be updated to handle the new definition. 
For more information, see either the discussion of `warpSize` within the [HIP C++ language extensions](https://rocm.docs.amd.com/projects/HIP/en/latest/how-to/hip_cpp_language_extensions.html#warpsize). #### Optimized -HIP runtime has the following functional improvements which improves runtime performance and user experience. +HIP runtime has the following functional improvements which improves runtime performance and user experience: * Reduced usage of the lock scope in events and kernel handling. - Switches to `shared_mutex` for event validation, uses `std::unique_lock` in HIP runtime to create/destroy event, instead of `scopedLock`. @@ -338,14 +334,13 @@ HIP runtime has the following functional improvements which improves runtime per * Refactored memory validation, creates a unique function to validate a variety of memory copy operations. * Improved kernel logging using demangling shader names. * Advanced support for SPIRV, now kernel compilation caching is enabled by default. This feature is controlled by the environment variable `AMD_COMGR_CACHE`, for details, see [hip_rtc document](https://rocm.docs.amd.com/projects/HIP/en/latest/how-to/hip_rtc.html). -* Programmatic support for scratch limits on the AMD Instinct MI300 and MI350 series up GPU devices. More enumeration values were added in `hipLimit_t` as following, +* Programmatic support for scratch limits on the AMD Instinct MI300 and MI350 series up GPU devices. More enumeration values were added in `hipLimit_t` as following: - `hipExtLimitScratchMin`, minimum allowed value in bytes for scratch limit on the device. - `hipExtLimitScratchMax`, maximum allowed value in bytes for scratch limit on the device. - `hipExtLimitScratchCurrent`, current scratch limit threshold in bytes on the device. Must be between the value `hipExtLimitScratchMin` and `hipExtLimitScratchMax`. 
Developers can now use the environment variable `HSA_SCRATCH_SINGLE_LIMIT_ASYNC` to change the default allocation size with expected scratch limit in ROCR runtime. On top of it, this value can also be overwritten programmatically in the application using the HIP API `hipDeviceSetLimit(hipExtLimitScratchCurrent, value)` to reset the scratch limit value. * HIP runtime now enables peer-to-peer (P2P) memory copies to utilize all available SDMA engines, rather than being limited to a single engine. It also selects the best engine first to give optimal bandwidth. * Improved launch latency for `D2D` copies and `memset` on MI300 series. -* Memory manager was implemented to improve the efficiency of memory usage and speed-up memory allocation/free in memory pools. * Introduced a threshold to handle the command submission patch to the GPU device(s), considering the synchronization with CPU, for performance improvement. #### Resolved issues @@ -357,10 +352,15 @@ HIP runtime has the following functional improvements which improves runtime per * Fixed issue of handling the kernel parameters for the graph launch. * Failures in roc-obj tools. HIP runtime now makes `DEPRECATED` message in roc-obj tools as `STDERR`. * Support of `hipDeviceMallocContiguous` flags in `hipExtMallocWithFlags()`. It now enables `HSA_AMD_MEMORY_POOL_CONTIGUOUS_FLAG` in the memory pool allocation on GPU device. -* Compilation failure, HIP runtime refactored the vector type alignment with `__hip_vec_align_v` +* Compilation failure, HIP runtime refactored the vector type alignment with `__hip_vec_align_v`. * A numerical error/corruption found in Pytorch during graph replay. HIP runtime fixed the input sizes of kernel launch dimensions in hipExtModuleLaunchKernel for the execution of hipGraph capture. * A crash during kernel execution in a customer application. 
The structure of kernel arguments was updated via adding the size of kernel arguments, and HIP runtime does validation before launch kernel with the structured arguments. +#### Known issues + +* `hipLaunchHostFunc` returns an error during stream capture. Any application using `hipLaunchHostFunc` might fail to capture graphs during stream capture, instead, it returns `hipErrorStreamCaptureUnsupported`. +* Compilation failure in kernels via hiprtc when using option `std=c++11`. + ### **hipBLAS** (3.0.0) #### Added @@ -390,13 +390,13 @@ HIP runtime has the following functional improvements which improves runtime per #### Added * Stream-K GEMM support has been enabled for the `FP32`, `FP16`, `BF16`, `FP8`, and `BF8` data types on the Instinct MI300A APU. To activate this feature, set the `TENSILE_SOLUTION_SELECTION_METHOD` environment variable to `2`, for example, `export TENSILE_SOLUTION_SELECTION_METHOD=2`. -* Added fused Swish/SiLU GEMM (enabled by ``HIPBLASLT_EPILOGUE_SWISH_EXT`` and ``HIPBLASLT_EPILOGUE_SWISH_BIAS_EXT``). -* Added support for ``HIPBLASLT_EPILOGUE_GELU_AUX_BIAS`` for gfx942. -* Added `HIPBLASLT_TUNING_USER_MAX_WORKSPACE` to constrain the maximum workspace size for user offline tuning. -* Added ``HIPBLASLT_ORDER_COL16_4R16`` and ``HIPBLASLT_ORDER_COL16_4R8`` to ``hipblasLtOrder_t`` to support `FP16`/`BF16` swizzle GEMM and `FP8` / `BF8` swizzle GEMM respectively. -* Added TF32 emulation on gfx950. -* Added support for `FP6`, `BF6`, and `FP4` on gfx950. -* Added support for block scaling by setting `HIPBLASLT_MATMUL_DESC_A_SCALE_MODE` and `HIPBLASLT_MATMUL_DESC_B_SCALE_MODE` to `HIPBLASLT_MATMUL_MATRIX_SCALE_VEC32_UE8M0`. +* Fused Swish/SiLU GEMM (enabled by ``HIPBLASLT_EPILOGUE_SWISH_EXT`` and ``HIPBLASLT_EPILOGUE_SWISH_BIAS_EXT``). +* Support for ``HIPBLASLT_EPILOGUE_GELU_AUX_BIAS`` for gfx942. +* `HIPBLASLT_TUNING_USER_MAX_WORKSPACE` to constrain the maximum workspace size for user offline tuning. 
+* ``HIPBLASLT_ORDER_COL16_4R16`` and ``HIPBLASLT_ORDER_COL16_4R8`` to ``hipblasLtOrder_t`` to support `FP16`/`BF16` swizzle GEMM and `FP8` / `BF8` swizzle GEMM respectively. +* TF32 emulation on gfx950. +* Support for `FP6`, `BF6`, and `FP4` on gfx950. +* Support for block scaling by setting `HIPBLASLT_MATMUL_DESC_A_SCALE_MODE` and `HIPBLASLT_MATMUL_DESC_B_SCALE_MODE` to `HIPBLASLT_MATMUL_MATRIX_SCALE_VEC32_UE8M0`. #### Changed @@ -421,28 +421,19 @@ HIP runtime has the following functional improvements which improves runtime per #### Added -* Added a new cmake option, `BUILD_OFFLOAD_COMPRESS`. When hipCUB is built with this option enabled, the `--offload-compress` switch is passed to the compiler. This causes the compiler to compress the binary that it generates. Compression can be useful in cases where you are compiling for a large number of targets, since this often results in a large binary. Without compression, in some cases, the generated binary may become so large that symbols are placed out of range, resulting in linking errors. The new `BUILD_OFFLOAD_COMPRESS` option is set to `ON` by default. -* Added single pass operators in `agent/single_pass_scan_operators.hpp` which contains the following API: +* A new cmake option, `BUILD_OFFLOAD_COMPRESS`. When hipCUB is built with this option enabled, the `--offload-compress` switch is passed to the compiler. This causes the compiler to compress the binary that it generates. Compression can be useful in cases where you are compiling for a large number of targets, since this often results in a large binary. Without compression, in some cases, the generated binary may become so large that symbols are placed out of range, resulting in linking errors. The new `BUILD_OFFLOAD_COMPRESS` option is set to `ON` by default. 
+* Single pass operators in `agent/single_pass_scan_operators.hpp` which contains the following API: * `BlockScanRunningPrefixOp` * `ScanTileStatus` * `ScanTileState` * `ReduceByKeyScanTileState` * `TilePrefixCallbackOp` -* Added gfx950 support. -* Added an overload of `BlockScan::InclusiveScan` that accepts an initial value to seed the scan. -* Added an overload of `WarpScan::InclusiveScan` that accepts an initial value to seed the scan. +* Support for gfx950. +* An overload of `BlockScan::InclusiveScan` that accepts an initial value to seed the scan. +* An overload of `WarpScan::InclusiveScan` that accepts an initial value to seed the scan. * `UnrolledThreadLoad`, `UnrolledCopy`, and `ThreadLoadVolatilePointer` were added to align hipCUB with CUB. * `ThreadStoreVolatilePtr` and the `IterateThreadStore` struct were added to align hipCUB with CUB. -* Added `hipcub::InclusiveScanInit` for CUB parity. - -#### Removed - -* The AMD GPU targets `gfx803` and `gfx900` are no longer built by default. If you want to build for these architectures, specify them explicitly in the `AMDGPU_TARGETS` cmake option. -* Deprecated `hipcub::AsmThreadLoad` is removed, use `hipcub::ThreadLoad` instead. -* Deprecated `hipcub::AsmThreadStore` is removed, use `hipcub::ThreadStore` instead. -* Deprecated `BlockAdjacentDifference::FlagHeads`, `BlockAdjacentDifference::FlagTails` and `BlockAdjacentDifference::FlagHeadsAndTails` have been removed. -* This release removes support for custom builds on gfx940 and gfx941. -* Removed C++14 support. Only C++17 is supported. +* `hipcub::InclusiveScanInit` for CUB parity. #### Changed @@ -453,6 +444,15 @@ HIP runtime has the following functional improvements which improves runtime per * The `hipcub::detail::accumulator_t` in rocPRIM backend has been changed to utilise `rocprim::accumulator_t`. * The usage of `rocprim::invoke_result_binary_op_t` has been replaced with `rocprim::accumulator_t`. 
+#### Removed + +* The AMD GPU targets `gfx803` and `gfx900` are no longer built by default. If you want to build for these architectures, specify them explicitly in the `AMDGPU_TARGETS` cmake option. +* Deprecated `hipcub::AsmThreadLoad` is removed, use `hipcub::ThreadLoad` instead. +* Deprecated `hipcub::AsmThreadStore` is removed, use `hipcub::ThreadStore` instead. +* Deprecated `BlockAdjacentDifference::FlagHeads`, `BlockAdjacentDifference::FlagTails` and `BlockAdjacentDifference::FlagHeadsAndTails` have been removed. +* This release removes support for custom builds on gfx940 and gfx941. +* Removed C++14 support. Only C++17 is supported. + #### Resolved issues * Fixed an issue where `Sort(keys, compare_op, valid_items, oob_default)` in `block_merge_sort.hpp` would not fill in elements that are out of range (items after `valid_items`) with `oob_default`. @@ -471,7 +471,7 @@ HIP runtime has the following functional improvements which improves runtime per #### Added -* Added gfx950 support. +* Support for gfx950. #### Removed @@ -483,7 +483,7 @@ HIP runtime has the following functional improvements which improves runtime per #### Added -* Added documentation clarifying how hipfort is built for the NVIDIA platform. +* Documentation clarifying how hipfort is built for the NVIDIA platform. #### Changed @@ -493,24 +493,24 @@ HIP runtime has the following functional improvements which improves runtime per #### Added -* CUDA 12.9.1 support -* cuDNN 9.11.0 support -* cuTENSOR 2.2.0.0 support -* LLVM 20.1.8 support +* CUDA 12.9.1 support. +* cuDNN 9.11.0 support. +* cuTENSOR 2.2.0.0 support. +* LLVM 20.1.8 support. 
#### Resolved issues -* `hipDNN` support is removed by default -* [#1859](https://github.com/ROCm/HIPIFY/issues/1859)[hipify-perl] Fix warnings on unsupported Driver or Runtime APIs which were erroneously not reported -* [#1930](https://github.com/ROCm/HIPIFY/issues/1930) Revise `JIT API` -* [#1962](https://github.com/ROCm/HIPIFY/issues/1962) Support for cuda-samples helper headers -* [#2035](https://github.com/ROCm/HIPIFY/issues/2035) Remove `const_cast;` in `hiprtcCreateProgram` and `hiprtcCompileProgram` +* `hipDNN` support is removed by default. +* [#1859](https://github.com/ROCm/HIPIFY/issues/1859)[hipify-perl] Fix warnings on unsupported Driver or Runtime APIs which were erroneously not reported. +* [#1930](https://github.com/ROCm/HIPIFY/issues/1930) Revise `JIT API`. +* [#1962](https://github.com/ROCm/HIPIFY/issues/1962) Support for cuda-samples helper headers. +* [#2035](https://github.com/ROCm/HIPIFY/issues/2035) Removed `const_cast;` in `hiprtcCreateProgram` and `hiprtcCompileProgram`. ### **hipRAND** (3.0.0) #### Added -* gfx950 support. +* Support for gfx950. #### Changed @@ -524,7 +524,7 @@ HIP runtime has the following functional improvements which improves runtime per #### Added -* Added compatibility-only functions +* Added compatibility-only functions: * csrlsvqr * `hipsolverSpCcsrlsvqr`, `hipsolverSpZcsrlsvqr` @@ -537,15 +537,15 @@ HIP runtime has the following functional improvements which improves runtime per #### Added -* Added the `int8`, `int32`, and `float16` data types to `hipDataTypeToHCCDataType` so that sparse matrix descriptors can be used with them. -* Added half float mixed precision to `hipsparseAxpby` where X and Y use `float16` and the result and compute type use `float`. -* Added half float mixed precision to `hipsparseSpVV` where X and Y use `float16` and the result and compute type use `float`. -* Added half float mixed precision to `hipsparseSpMM` where A and B use `float16` and C and the compute type use `float`. 
-* Added half float mixed precision to `hipsparseSDDMM` where A and B use `float16` and C and the compute type use `float`. -* Added half float uniform precision to the `hipsparseScatter` and `hipsparseGather` routines. -* Added half float uniform precision to the `hipsparseSDDMM` routine. -* Added `int8` precision to the `hipsparseCsr2cscEx2` routine. -* Added the `almalinux` operating system name to correct the GFortran dependency. +* `int8`, `int32`, and `float16` data types to `hipDataTypeToHCCDataType` so that sparse matrix descriptors can be used with them. +* Half float mixed precision to `hipsparseAxpby` where X and Y use `float16` and the result and compute type use `float`. +* Half float mixed precision to `hipsparseSpVV` where X and Y use `float16` and the result and compute type use `float`. +* Half float mixed precision to `hipsparseSpMM` where A and B use `float16` and C and the compute type use `float`. +* Half float mixed precision to `hipsparseSDDMM` where A and B use `float16` and C and the compute type use `float`. +* Half float uniform precision to the `hipsparseScatter` and `hipsparseGather` routines. +* Half float uniform precision to the `hipsparseSDDMM` routine. +* `int8` precision to the `hipsparseCsr2cscEx2` routine. +* The `almalinux` operating system name to correct the GFortran dependency. #### Changed @@ -585,21 +585,21 @@ HIP runtime has the following functional improvements which improves runtime per #### Added -* Added element-wise binary operation support. -* Added element-wise trinary operation support. -* Added support for GPU target gfx950. -* Added dynamic unary and binary operator support for element-wise operations and permutation. -* Added a CMake check for `f8` datatype availability. -* Added `hiptensorDestroyOperationDescriptor` to free all resources related to the provided descriptor. -* Added `hiptensorOperationDescriptorSetAttribute` to set attribute of a `hiptensorOperationDescriptor_t` object. 
-* Added `hiptensorOperationDescriptorGetAttribute` to retrieve an attribute of the provided `hiptensorOperationDescriptor_t` object. -* Added `hiptensorCreatePlanPreference` to allocate the `hiptensorPlanPreference_t` and enabled users to limit the applicable kernels for a given plan or operation. -* Added `hiptensorDestroyPlanPreference` to free all resources related to the provided preference. -* Added `hiptensorPlanPreferenceSetAttribute` to set attribute of a `hiptensorPlanPreference_t` object. -* Added `hiptensorPlanGetAttribute` to retrieve information about an already-created plan. -* Added `hiptensorEstimateWorkspaceSize` to determine the required workspace size for the given operation. -* Added `hiptensorCreatePlan` to allocate a `hiptensorPlan_t` object, select an appropriate kernel for a given operation and prepare a plan that encodes the execution. -* Added `hiptensorDestroyPlan` to free all resources related to the provided plan. +* Element-wise binary operation support. +* Element-wise trinary operation support. +* Support for GPU target gfx950. +* Dynamic unary and binary operator support for element-wise operations and permutation. +* CMake check for `f8` datatype availability. +* `hiptensorDestroyOperationDescriptor` to free all resources related to the provided descriptor. +* `hiptensorOperationDescriptorSetAttribute` to set attribute of a `hiptensorOperationDescriptor_t` object. +* `hiptensorOperationDescriptorGetAttribute` to retrieve an attribute of the provided `hiptensorOperationDescriptor_t` object. +* `hiptensorCreatePlanPreference` to allocate the `hiptensorPlanPreference_t` and enabled users to limit the applicable kernels for a given plan or operation. +* `hiptensorDestroyPlanPreference` to free all resources related to the provided preference. +* `hiptensorPlanPreferenceSetAttribute` to set attribute of a `hiptensorPlanPreference_t` object. +* `hiptensorPlanGetAttribute` to retrieve information about an already-created plan. 
+* `hiptensorEstimateWorkspaceSize` to determine the required workspace size for the given operation. +* `hiptensorCreatePlan` to allocate a `hiptensorPlan_t` object, select an appropriate kernel for a given operation and prepare a plan that encodes the execution. +* `hiptensorDestroyPlan` to free all resources related to the provided plan. #### Changed @@ -628,11 +628,11 @@ HIP runtime has the following functional improvements which improves runtime per #### Added -* Added the compiler `-gsplit-dwarf` option to enable the generation of separate debug information file at compile time. When used, separate debug information files are generated for host and for each offload architecture. For additional information, see [DebugFission](https://gcc.gnu.org/wiki/DebugFission). -* Added `llvm-flang`, AMD's next-generation Fortran compiler. It's a re-implementation of the Fortran frontend that can be found at `llvm/llvm-project/flang` on GitHub. -* Added Comgr support for an in-memory virtual file system (VFS) for storing temporary files generated during intermediate compilation steps to improve performance in the device library link step. -* Added compiler support of a new target-specific builtin `__builtin_amdgcn_processor_is` for late or deferred queries of the current target processor, and `__builtin_amdgcn_is_invocable` to determine the current target processor ability to invoke a particular builtin. -* Added HIPIFY support for NVIDIA CUDA 12.9.1 APIs. Added support for all new device and host APIs, including FP4, FP6, and FP128, and support for the corresponding ROCm HIP equivalents. +* The compiler `-gsplit-dwarf` option to enable the generation of separate debug information file at compile time. When used, separate debug information files are generated for host and for each offload architecture. For additional information, see [DebugFission](https://gcc.gnu.org/wiki/DebugFission). +* `llvm-flang`, AMD's next-generation Fortran compiler. 
It's a re-implementation of the Fortran frontend that can be found at `llvm/llvm-project/flang` on GitHub. +* Comgr support for an in-memory virtual file system (VFS) for storing temporary files generated during intermediate compilation steps to improve performance in the device library link step. +* Compiler support of a new target-specific builtin `__builtin_amdgcn_processor_is` for late or deferred queries of the current target processor, and `__builtin_amdgcn_is_invocable` to determine the current target processor ability to invoke a particular builtin. +* HIPIFY support for NVIDIA CUDA 12.9.1 APIs. Added support for all new device and host APIs, including FP4, FP6, and FP128, and support for the corresponding ROCm HIP equivalents. #### Changed @@ -656,12 +656,12 @@ HIP runtime has the following functional improvements which improves runtime per * Support for PyTorch 2.7 via Torch-MIGraphX. * Support for the Microsoft ONNX Contrib Operators (Self) Attention, RotaryEmbedding, QuickGelu, BiasAdd, BiasSplitGelu, SkipLayerNorm. * Support for Sigmoid and AddN TensorFlow operators. -* Added GroupQuery Attention support for LLMs. -* Added support for edge mode in the ONNX Pad operator. -* Added ONNX runtime Python driver. -* Added FLUX e2e example. -* Added C++ and Python APIs to save arguments to a graph as a msgpack file, and then read the file back. -* Added rocMLIR fusion for kv-cache attention. +* GroupQuery Attention support for LLMs. +* Support for edge mode in the ONNX Pad operator. +* ONNX runtime Python driver. +* FLUX e2e example. +* C++ and Python APIs to save arguments to a graph as a msgpack file, and then read the file back. +* rocMLIR fusion for kv-cache attention. * Introduced a check for file-write errors. #### Changed @@ -684,14 +684,14 @@ HIP runtime has the following functional improvements which improves runtime per #### Removed * `ROCM_USE_FLOAT8` macro. 
-* The BF16 GEMM test was removed for Navi21, as it is unsupported by rocBLAS and hipBLASLt on that platform. +* The `BF16` GEMM test was removed for Navi21, as it is unsupported by rocBLAS and hipBLASLt on that platform. #### Optimized * Use common average in `compile_ops` to reduce run-to-run variations when tuning. * Improved the performance of the TopK operator. * Conform to a single layout (NHWC or NCHW) during compilation rather than combining two. -* Slice Channels Conv Optimization (slice output fusion) +* Slice Channels Conv Optimization (slice output fusion). * Horizontal fusion optimization after pointwise operations. * Reduced the number of literals used in `GridSample` linear sampler. * Fuse multiple outputs for pointwise operations. @@ -718,12 +718,12 @@ HIP runtime has the following functional improvements which improves runtime per #### Added -* [Conv] Added misa kernels for gfx950. +* [Conv] Misa kernels for gfx950. * [Conv] Enabled Split-K support for CK backward data solvers (2D). * [Conv] Enabled CK wrw solver on gfx950 for the `BF16` data type. * [BatchNorm] Enabled NHWC in OpenCL. -* Added grouped convolution + activation fusion. -* Added grouped convolution + bias + activation fusion. +* Grouped convolution + activation fusion. +* Grouped convolution + bias + activation fusion. * Composable Kernel (CK) can now be built inline as part of MIOpen. #### Changed @@ -771,12 +771,12 @@ HIP runtime has the following functional improvements which improves runtime per #### Added -* Added support for the extended fine-grained system memory pool. -* Added support for gfx950. -* Added support for `unroll=1` in device-code generation to improve performance. +* Support for the extended fine-grained system memory pool. +* Support for gfx950. +* Support for `unroll=1` in device-code generation to improve performance. * Set a default of 112 channels for a single node with `8 * gfx950`. * Enabled LL128 protocol on the gfx950. 
-* Added the ability to choose the unroll factor at runtime using `RCCL_UNROLL_FACTOR`. This can be set at runtime to 1, 2, or 4. This change currently increases compilation and linking time because it triples the number of kernels generated. +* The ability to choose the unroll factor at runtime using `RCCL_UNROLL_FACTOR`. This can be set at runtime to 1, 2, or 4. This change currently increases compilation and linking time because it triples the number of kernels generated. * Added MSCCL support for AllGather multinode on the gfx942 and gfx950 (for instance, 16 and 32 GPUs). To enable this feature, set the environment variable `RCCL_MSCCL_FORCE_ENABLE=1`. The maximum message size for MSCCL AllGather usage is `12292 * sizeof(datatype) * nGPUs`. * Thread thresholds for LL/LL128 are selected in Tuning Models for the AMD Instinct MI300X. This impacts the number of channels used for AllGather and ReduceScatter. The channel tuning model is bypassed if `NCCL_THREAD_THRESHOLDS`, `NCCL_MIN_NCHANNELS`, or `NCCL_MAX_NCHANNELS` are set. * Multi-node tuning for AllGather, AllReduce, and ReduceScatter that leverages LL/LL64/LL128 protocols to use nontemporal vector load/store for tunable message size ranges. @@ -804,8 +804,8 @@ HIP runtime has the following functional improvements which improves runtime per * Setup - installs rocdecode dev packages for Ubuntu, RedHat, and SLES. * Setup - installs turbojpeg dev package for Ubuntu and Redhat. * rocAL's image decoder has been extended to support the rocJPEG hardware decoder. -* Added numpy reader support for reading npy files in rocAL. -* Added test case for numpy reader in C++ and python tests. +* Numpy reader support for reading npy files in rocAL. +* Test case for numpy reader in C++ and python tests. #### Resolved issues * `TurboJPEG` no longer needs to be installed manually. It is now installed by the package installer. 
@@ -823,7 +823,7 @@ HIP runtime has the following functional improvements which improves runtime per #### Added -* Added support for gfx950. +* Support for gfx950. #### Changed @@ -831,17 +831,17 @@ HIP runtime has the following functional improvements which improves runtime per #### Optimized -* Improved the user documentation +* Improved the user documentation. #### Resolved issues -* Fix for GPU hashing algorithm when not compiling with -O2/O3 +* Fix for GPU hashing algorithm when not compiling with -O2/O3. ### **rocBLAS** (5.0.0) #### Added -* gfx950 support. +* Support for gfx950. * Internal API logging for `gemm` debugging using `ROCBLAS_LAYER = 8`. * Support for the AOCL 5.0 gcc build as a client reference library. * The use of `PkgConfig` for client reference library fallback detection. @@ -886,10 +886,10 @@ HIP runtime has the following functional improvements which improves runtime per ### **ROCdbgapi** (0.77.3) #### Added -- Support for the `gfx950`, `gfx1150`, and `gfx1151` architectures. +* Support for the `gfx950` architecture. #### Removed -- Support for the `gfx940` and `gfx941` architectures. +* Support for the `gfx940` and `gfx941` architectures. ### **rocDecode** (1.0.0) @@ -900,11 +900,17 @@ HIP runtime has the following functional improvements which improves runtime per * HEVC/AVC/AV1/VP9 stream syntax error handling. * HEVC stream bit depth change handling and DPB buffer size change handling through decoder reconfiguration. * AVC stream DPB buffer size change handling through decoder reconfiguration. -* A new avcodec-based decoder built as a separate `rocdecode-host` library +* A new avcodec-based decoder built as a separate `rocdecode-host` library. #### Changed * rocDecode now uses the Cmake `CMAKE_PREFIX_PATH` directive. +* Changed asserts in query API calls in RocVideoDecoder utility class to error reports, to avoid hard stop during query in case error occurs and to let the caller decide actions.
+* `libdrm_amdgpu` is now explicitly linked with rocdecode. + +#### Removed + +* `GetStream()` interface call from RocVideoDecoder utility class. #### Optimized @@ -917,20 +923,11 @@ HIP runtime has the following functional improvements which improves runtime per * Fixed a decoded frame output issue in video size change cases. * Removed incorrect asserts of `bitdepth_minus_8` in `GetBitDepth()` and `num_chroma_planes` in `GetNumChromaPlanes()` API calls in the RocVideoDecoder utility class. -#### Removed - -* `GetStream()` interface call from RocVideoDecoder utility class. - -#### Changed - -* Changed asserts in query API calls in RocVideoDecoder utility class to error reports, to avoid hard stop during query in case error occurs and to let the caller decide actions. -* `libdrm_amdgpu` is now explicitly linked with rocdecode. - ### **rocFFT** (1.0.34) #### Added -* Added gfx950 support. +* Support for gfx950. #### Removed @@ -958,7 +955,7 @@ HIP runtime has the following functional improvements which improves runtime per #### Added -- Support for the `gfx950`, `gfx1150`, and `gfx1151` architectures. +- Support for the `gfx950` architecture. #### Removed @@ -1029,12 +1026,12 @@ Review the [README](https://github.com/ROCm/rocm_bandwidth_test/blob/amd-mainlin * L2 to EA stalls * L2 to EA stalls per channel -* Roofline support for AMD Instinct MI350 series architecture. +* Roofline support for AMD Instinct MI350 series accelerators. ##### Textual User Interface (TUI) (beta version) * Text User Interface (TUI) support for analyze mode - * A command line based user interface to support interactive single-run analysis + * A command line based user interface to support interactive single-run analysis. * To launch, use `--tui` option in analyze mode. For example, ``rocprof-compute analyze --tui``.
##### PC Sampling (beta version) @@ -1069,7 +1066,7 @@ Review the [README](https://github.com/ROCm/rocm_bandwidth_test/blob/amd-mainlin * ``-b`` option in profile mode also accepts hardware IP block for filtering; however, this filter support will be deprecated soon. * ``--list-metrics`` option added in profile mode to list possible metric id(s), similar to analyze mode. -* Support MEM chart on CLI (single run) +* Support MEM chart on CLI (single run). * ``--specs-correction`` option to provide missing system specifications for analysis. @@ -1081,30 +1078,30 @@ Review the [README](https://github.com/ROCm/rocm_bandwidth_test/blob/amd-mainlin * Updated Dash to >=3.0.0 (for web UI). * Changed the condition when Roofline PDFs are generated during general profiling and ``--roof-only`` profiling (skip only when ``--no-roof`` option is present). * Updated Roofline binaries: - * Rebuild using latest ROCm stack + * Rebuild using latest ROCm stack. * Minimum OS distribution support minimum for roofline feature is now Ubuntu 22.04, RHEL 8, and SLES15 SP6. #### Removed -* Roofline support for Ubuntu 20.04 and SLES below 15.6 +* Roofline support for Ubuntu 20.04 and SLES below 15.6. * Removed support for AMD Instinct MI50 and MI60. #### Optimized -* ROCm Compute Profiler CLI has been improved to better display the GPU architecture analytics +* ROCm Compute Profiler CLI has been improved to better display the GPU architecture analytics. #### Resolved issues * Fixed kernel name and kernel dispatch filtering when using ``rocprofv3``. * Fixed an issue of TCC channel counters collection in ``rocprofv3``. * Fixed peak FLOPS of `F8`, `I8`, `F16`, and `BF16` on AMD Instinct MI300. -* Fixed not detecting memory clock issue when using amd-smi -* Fixed standalone GUI crashing +* Fixed not detecting memory clock issue when using ``amd-smi``. +* Fixed standalone GUI crashing. * Fixed L2 read/write/atomic bandwidths on AMD Instinct MI350 series. 
#### Known issues -* On AMD Instinct MI100, accumulation counters are not collected, resulting in the following metrics failing to show up in the analysis: Instruction Fetch Latency, Wavefront Occupancy, LDS Latency +* On AMD Instinct MI100, accumulation counters are not collected, resulting in the following metrics failing to show up in the analysis: Instruction Fetch Latency, Wavefront Occupancy, LDS Latency. * As a workaround, use the environment variable ``ROCPROF=rocprof``, to use ``rocprof v1`` for profiling on AMD Instinct MI100. * GPU id filtering is not supported when using ``rocprofv3``. @@ -1133,14 +1130,14 @@ Review the [README](https://github.com/ROCm/rocm_bandwidth_test/blob/amd-mainlin #### Added -- More profiling and monitoring metrics, especially for AMD Instinct MI300 and newer GPUs. -- Advanced logging and debugging options, including new log levels and troubleshooting guidance. +* More profiling and monitoring metrics, especially for AMD Instinct MI300 and newer GPUs. +* Advanced logging and debugging options, including new log levels and troubleshooting guidance. #### Changed -- Completed migration from legacy [ROCProfiler](https://rocm.docs.amd.com/projects/rocprofiler/en/latest/) to [ROCprofiler-SDK](https://rocm.docs.amd.com/projects/rocprofiler-sdk/en/latest/). -- Reorganized the configuration files internally and improved [README/installation](https://github.com/ROCm/rdc/blob/amd-staging/README.md) instructions. -- Updated metrics and monitoring support for the latest AMD data center GPUs. +* Completed migration from legacy [ROCProfiler](https://rocm.docs.amd.com/projects/rocprofiler/en/latest/) to [ROCprofiler-SDK](https://rocm.docs.amd.com/projects/rocprofiler-sdk/en/latest/). +* Reorganized the configuration files internally and improved [README/installation](https://github.com/ROCm/rdc/blob/amd-staging/README.md) instructions. +* Updated metrics and monitoring support for the latest AMD data center GPUs. 
#### Optimized @@ -1156,8 +1153,8 @@ Review the [README](https://github.com/ROCm/rocm_bandwidth_test/blob/amd-mainlin - Support for GPU metrics 1.8. - Added new fields for `rsmi_gpu_metrics_t` including: - Adding the following metrics to allow new calculations for violation status: - - Per XCP metrics `gfx_below_host_limit_ppt_acc[XCP][MAX_XCC]` - GFX Clock Host limit Package Power Tracking violation counts - - Per XCP metrics `gfx_below_host_limit_thm_acc[XCP][MAX_XCC]` - GFX Clock Host limit Thermal (TVIOL) violation counts + - Per XCP metrics `gfx_below_host_limit_ppt_acc[XCP][MAX_XCC]` - GFX Clock Host limit Package Power Tracking violation counts. + - Per XCP metrics `gfx_below_host_limit_thm_acc[XCP][MAX_XCC]` - GFX Clock Host limit Thermal (TVIOL) violation counts. - Per XCP metrics `gfx_low_utilization_acc[XCP][MAX_XCC]` - violation counts for how did low utilization caused the GPU to be below application clocks. - Per XCP metrics `gfx_below_host_limit_total_acc[XCP][MAX_XCC]`- violation counts for how long GPU was held below application clocks any limiter (see above new violation metrics). - Increasing available JPEG engines to 40. @@ -1215,29 +1212,25 @@ See the full [ROCm SMI changelog](https://github.com/ROCm/rocm_smi_lib/blob/rele #### Added -* Added gfx950 support. -* Added `rocprim::accumulator_t` to ensure parity with CCCL. -* Added test for `rocprim::accumulator_t`. -* Added `rocprim::invoke_result_r` to ensure parity with CCCL. -* Added function `is_build_in` into `rocprim::traits::get`. -* Added virtual shared memory as a fallback option in `rocprim::device_merge` when it exceeds shared memory capacity, similar to `rocprim::device_select`, `rocprim::device_partition`, and `rocprim::device_merge_sort`, which already include this feature. -* Added initial value support to device level inclusive scans. -* Added new optimization to the backend for `device_transform` when the input and output are pointers. 
-* Added `LoadType` to `transform_config`, which is used for the `device_transform` when the input and output are pointers. -* Added `rocprim:device_transform` for n-ary transform operations API with as input `n` number of iterators inside a `rocprim::tuple`. -* Added `rocprim::key_value_pair::operator==`. -* Added the `rocprim::unrolled_copy` thread function to copy multiple items inside a thread. -* Added the `rocprim::unrolled_thread_load` function to load multiple items inside a thread using `rocprim::thread_load`. -* Added `rocprim::int128_t` and `rocprim::uint128_t` to benchmarks for improved performance evaluation on 128-bit integers. -* Added `rocprim::int128_t` to the supported autotuning types to improve performance for 128-bit integers. -* Added the `rocprim::merge_inplace` function for merging in-place. -* Added initial value support for warp- and block-level inclusive scan. -* Added support for building tests with device-side random data generation, making them finish faster. This requires rocRAND, and is enabled with the `WITH_ROCRAND=ON` build flag. -* Added tests and documentation to `lookback_scan_state`. It is still in the `detail` namespace. - -#### Optimized - -* Improved performance of `rocprim::device_select` and `rocprim::device_partition` when using multiple streams on the AMD Instinct MI300 series. +* Support for gfx950. +* `rocprim::accumulator_t` to ensure parity with CCCL. +* Test for `rocprim::accumulator_t`. +* `rocprim::invoke_result_r` to ensure parity with CCCL. +* Function `is_build_in` into `rocprim::traits::get`. +* Virtual shared memory as a fallback option in `rocprim::device_merge` when it exceeds shared memory capacity, similar to `rocprim::device_select`, `rocprim::device_partition`, and `rocprim::device_merge_sort`, which already include this feature. +* Initial value support to device level inclusive scans. +* New optimization to the backend for `device_transform` when the input and output are pointers. 
+* `LoadType` to `transform_config`, which is used for the `device_transform` when the input and output are pointers. +* `rocprim:device_transform` for n-ary transform operations API with as input `n` number of iterators inside a `rocprim::tuple`. +* `rocprim::key_value_pair::operator==`. +* The `rocprim::unrolled_copy` thread function to copy multiple items inside a thread. +* The `rocprim::unrolled_thread_load` function to load multiple items inside a thread using `rocprim::thread_load`. +* `rocprim::int128_t` and `rocprim::uint128_t` to benchmarks for improved performance evaluation on 128-bit integers. +* `rocprim::int128_t` to the supported autotuning types to improve performance for 128-bit integers. +* The `rocprim::merge_inplace` function for merging in-place. +* Initial value support for warp- and block-level inclusive scan. +* Support for building tests with device-side random data generation, making them finish faster. This requires rocRAND, and is enabled with the `WITH_ROCRAND=ON` build flag. +* Tests and documentation to `lookback_scan_state`. It is still in the `detail` namespace. #### Changed @@ -1265,26 +1258,22 @@ The previous default accumulator types could lead to situations in which unexpec * Renamed `rocprim::load_cs` to `rocprim::load_nontemporal` and `rocprim::store_cs` to `rocprim::store_nontemporal` to express the intent of these load and store methods better. * All kernels now have hidden symbol visibility. All symbols now have inline namespaces that include the library version, for example, `rocprim::ROCPRIM_300400_NS::symbol` instead of `rocPRIM::symbol`, letting the user link multiple libraries built with different versions of rocPRIM. -#### Upcoming changes - -* `rocprim::invoke_result_binary_op` and `rocprim::invoke_result_binary_op_t` are deprecated. Use `rocprim::accumulator_t` instead. - #### Removed -* Removed `rocprim::detail::float_bit_mask` and relative tests, use `rocprim::traits::float_bit_mask` instead. 
-* Removed `rocprim::traits::is_fundamental`, use `rocprim::traits::get::is_fundamental()` directly. -* Removed the deprecated parameters `short_radix_bits` and `ShortRadixBits` from the `segmented_radix_sort` config. They were unused, it is only an API change. -* Removed the deprecated `operator<<` from the iterators. -* Removed the deprecated `TwiddleIn` and `TwiddleOut`. Use `radix_key_codec` instead. -* Removed the deprecated flags API of `block_adjacent_difference`. Use `subtract_left()` or `block_discontinuity::flag_heads()` instead. -* Removed the deprecated `to_exclusive` functions in the warp scans. -* Removed the `rocprim::load_cs` from the `cache_load_modifier` enum. Use `rocprim::load_nontemporal` instead. -* Removed the `rocprim::store_cs` from the `cache_store_modifier` enum. Use `rocprim::store_nontemporal` instead. -* Removed the deprecated header file `rocprim/detail/match_result_type.hpp`. Include `rocprim/type_traits.hpp` instead. This header included: +* `rocprim::detail::float_bit_mask` and relative tests, use `rocprim::traits::float_bit_mask` instead. +* `rocprim::traits::is_fundamental`, use `rocprim::traits::get::is_fundamental()` directly. +* The deprecated parameters `short_radix_bits` and `ShortRadixBits` from the `segmented_radix_sort` config. They were unused, it is only an API change. +* The deprecated `operator<<` from the iterators. +* The deprecated `TwiddleIn` and `TwiddleOut`. Use `radix_key_codec` instead. +* The deprecated flags API of `block_adjacent_difference`. Use `subtract_left()` or `block_discontinuity::flag_heads()` instead. +* The deprecated `to_exclusive` functions in the warp scans. +* The `rocprim::load_cs` from the `cache_load_modifier` enum. Use `rocprim::load_nontemporal` instead. +* The `rocprim::store_cs` from the `cache_store_modifier` enum. Use `rocprim::store_nontemporal` instead. +* The deprecated header file `rocprim/detail/match_result_type.hpp`. Include `rocprim/type_traits.hpp` instead. 
This header included: * `rocprim::detail::invoke_result`. Use `rocprim::invoke_result` instead. * `rocprim::detail::invoke_result_binary_op`. Use `rocprim::invoke_result_binary_op` instead. * `rocprim::detail::match_result_type`. Use `rocprim::invoke_result_binary_op_t` instead. -* Removed the deprecated `rocprim::detail::radix_key_codec` function. Use `rocprim::radix_key_codec` instead. +* The deprecated `rocprim::detail::radix_key_codec` function. Use `rocprim::radix_key_codec` instead. * Removed `rocprim/detail/radix_sort.hpp`, functionality can now be found in `rocprim/thread/radix_key_codec.hpp`. * Removed C++14 support. Only C++17 is supported. * Due to the removal of `__AMDGCN_WAVEFRONT_SIZE` in the compiler, the following deprecated warp size-related symbols have been removed: @@ -1299,6 +1288,10 @@ The previous default accumulator types could lead to situations in which unexpec * This was a fallback define for the compiler's removed symbol, having the same name. * This release removes support for custom builds on gfx940 and gfx941. +#### Optimized + +* Improved performance of `rocprim::device_select` and `rocprim::device_partition` when using multiple streams on the AMD Instinct MI300 series. + #### Resolved issues * Fixed an issue where `device_batch_memcpy` reported benchmarking throughput being 2x lower than it was in reality. @@ -1312,6 +1305,10 @@ The previous default accumulator types could lead to situations in which unexpec * When using `rocprim::deterministic_inclusive_scan_by_key` and `rocprim::deterministic_exclusive_scan_by_key` the intermediate values can change order on Navi3x. However, if a commutative scan operator is used then the final scan value (output array) will still always be consistent between runs. +#### Upcoming changes + +* `rocprim::invoke_result_binary_op` and `rocprim::invoke_result_binary_op_t` are deprecated. Use `rocprim::accumulator_t` instead. 
+ ### **ROCprofiler-SDK** (1.0.0) #### Added @@ -1327,7 +1324,7 @@ The previous default accumulator types could lead to situations in which unexpec - relative == logical_node_id - type-relative == logical_node_type_id - MI300 and MI350 stochastic (hardware-based) PC sampling support in ROCProfiler-SDK and `rocprofv3`. -- Python bindings for `rocprofiler-sdk-roctx` +- Python bindings for `rocprofiler-sdk-roctx`. - SQLite3 output support for `rocprofv3` using `--output-format rocpd`. - `rocprofiler-sdk-rocpd` package: - Public API in `include/rocprofiler-sdk-rocpd/rocpd.h`. @@ -1382,17 +1379,17 @@ The previous default accumulator types could lead to situations in which unexpec #### Added * ``rocpyjpegdecode`` package. -* Added ``src/rocjpeg`` source new subfolder. -* Added ``samples/rocjpeg`` new subfolder. +* ``src/rocjpeg`` source new subfolder. +* ``samples/rocjpeg`` new subfolder. #### Changed -* Minimum version for rocdecode and rocjpeg updated to V1.0.0 +* Minimum version for rocdecode and rocjpeg updated to V1.0.0. ### **rocRAND** (4.0.0) #### Added -* gfx950 support. +* Support for gfx950. * Additional unit tests for `test_log_normal_distribution.cpp`, `test_normal_distribution.cpp`, `test_rocrand_mtgp32_prng.cpp`, `test_rocrand_scrambled_sobol32_qrng.cpp`, `test_rocrand_scrambled_sobol64_qrng.cpp`, `test_rocrand_sobol32_qrng.cpp`, `test_rocrand_sobol64_qrng.cpp`, `test_rocrand_threefry2x32_20_prng.cpp`, `test_rocrand_threefry2x64_20_prng.cpp`, `test_rocrand_threefry4x32_20_prng.cpp`, `test_rocrand_threefry4x64_20_prng.cpp`, and `test_uniform_distribution.cpp`. * New unit tests for `include/rocrand/rocrand_discrete.h` in `test_rocrand_discrete.cpp`, `include/rocrand/rocrand_mrg31k3p.h` in `test_rocrand_mrg31k3p_prng.cpp`, `include/rocrand/rocrand_mrg32k3a.h` in `test_rocrand_mrg32k3a_prng.cpp`, and `include/rocrand/rocrand_poisson.h` in `test_rocrand_poisson.cpp`. 
@@ -1428,7 +1425,7 @@ The previous default accumulator types could lead to situations in which unexpec #### Added -* Added the `-e` and `--precise-alu-exceptions` flags to enable precise ALU exceptions reporting on supported configurations. +* The `-e` and `--precise-alu-exceptions` flags to enable precise ALU exceptions reporting on supported configurations. ### **ROCr Runtime** (1.18.0) @@ -1528,11 +1525,11 @@ The previous default accumulator types could lead to situations in which unexpec #### Added * Additional unit tests for: binary_search, complex, c99math, catrig, ccosh, cexp, clog, csin, csqrt, and ctan. -* Added `test_param_fixtures.hpp` to store all the parameters for typed test suites. -* Added `test_real_assertions.hpp` to handle unit test assertions for real numbers. -* Added `test_imag_assertions.hpp` to handle unit test assertions for imaginary numbers. +* `test_param_fixtures.hpp` to store all the parameters for typed test suites. +* `test_real_assertions.hpp` to handle unit test assertions for real numbers. +* `test_imag_assertions.hpp` to handle unit test assertions for imaginary numbers. * `clang++` is now used to compile google benchmarks on Windows. -* Added gfx950 support. +* Support for gfx950. * Merged changes from upstream CCCL/thrust 2.6.0. #### Changed @@ -1631,12 +1628,12 @@ The previous default accumulator types could lead to situations in which unexpec #### Added -- Added support for gfx950. -- Added code object compression via bundling. -- Added support for non-default HIP SDK installations on Windows. -- Added master solution library documentation. -- Added compiler version-dependent assembler and architecture capabilities. -- Added documentation from GitHub Wiki to ROCm docs. +- Support for gfx950. +- Code object compression via bundling. +- Support for non-default HIP SDK installations on Windows. +- Master solution library documentation. +- Compiler version-dependent assembler and architecture capabilities. 
+- Documentation from GitHub Wiki to ROCm docs. #### Changed @@ -1650,9 +1647,9 @@ The previous default accumulator types could lead to situations in which unexpec #### Removed -- Removed support for the gfx940 and gfx941 targets. -- Removed unused tuning files. -- Removed disabled tests. +- Support for the gfx940 and gfx941 targets. +- Unused tuning files. +- Disabled tests. #### Resolved issues diff --git a/RELEASE.md b/RELEASE.md index 385123c94..6eec4e879 100644 --- a/RELEASE.md +++ b/RELEASE.md @@ -206,8 +206,8 @@ For more information about hipBLASLt changes, see the [hipBLASLt changelog](#hip * Support for OCP `FP8` on AMD Instinct MI350X and MI355X accelerators. * Support for PyTorch 2.7 via Torch-MIGraphX. -* Improved performance of Generative AI models -* Added additional MSFT Contrib Operators for improved ONNX Runtime Experience +* Improved performance of Generative AI models. +* Added additional MSFT Contrib Operators for improved ONNX Runtime Experience. For more information about MIGraphX changes, see the [MIGraphX changelog](migraphx-2-13-0) below. @@ -770,9 +770,6 @@ For a historical overview of ROCm component updates, see the {doc}`ROCm consolid - `amdsmi_dpm_policy_entry_t` member `policy_description` changed from `AMDSMI_MAX_NAME` to `AMDSMI_MAX_STRING_LENGTH`. - `amdsmi_name_value_t` member `name` changed from `AMDSMI_MAX_NAME` to `AMDSMI_MAX_STRING_LENGTH`. -* Updated `amdsmi_get_clock_info` in `amdsmi_interface.py`. - - The `clk_deep_sleep` field now returns the sleep integer value. - * For backwards compatibility, updated `amdsmi_bdf_t` union to have an identical unnamed struct. #### Removed @@ -841,7 +838,7 @@ See the full [AMD SMI changelog](https://github.com/ROCm/amdsmi/blob/release/roc * Support for GKCYX layout for grouped convolution backward data (NGCHW/GKCYX/NGKHW). * Support for Stream-K version of mixed `FP8` / `BF16` GEMM. * Support for Multiple D GEMM. 
-* GEMM pipeline for microscaling (MX) `FP8` / `FP6` / `FP4` data types +* GEMM pipeline for microscaling (MX) `FP8` / `FP6` / `FP4` data types. * Support for `FP16` 2:4 structured sparsity to universal GEMM. * Support for Split K for grouped convolution backward data. * Logit soft-capping support for fMHA forward kernels. @@ -888,7 +885,7 @@ functions added for logical reduction. For details, see [Warp cross-lane functio * New `wptr` and `rptr` values in `ClPrint`, for better logging in dispatch barrier methods. * The `_sync()` version of crosslane builtins such as `shfl_sync()` are enabled by default. These can be disabled by setting the preprocessor macro `HIP_DISABLE_WARP_SYNC_BUILTINS`. * Added `constexpr` operators for `fp16`/`bf16`. -* Added warp level primitives: `__syncwarp` and reduce intrinsics (e.g. `__reduce_add_sync()`) +* Added warp level primitives: `__syncwarp` and reduce intrinsics (e.g. `__reduce_add_sync()`). * Support for the flags in APIs as following, now allows uncached memory allocation. - `hipExtHostRegisterUncached`, used in `hipHostRegister`. - `hipHostMallocUncached` and `hipHostAllocUncached`, used in `hipHostMalloc` and `hipHostAlloc`. @@ -899,15 +896,14 @@ functions added for logical reduction. For details, see [Warp cross-lane functio #### Changed * Some unsupported GPUs such as gfx9, gfx8 and gfx7 are deprecated on Microsoft Windows. -* Removal of beta warnings in HIP Graph APIs -All Beta warnings in usage of HIP Graph APIs are removed, they are now officially and fully supported. +* Removal of beta warnings in HIP Graph APIs. All Beta warnings in usage of HIP Graph APIs are removed, they are now officially and fully supported. * `warpSize` has changed. In order to match the CUDA specification, the `warpSize` variable is no longer `constexpr`. In general, this should be a transparent change; however, if an application was using `warpSize` as a compile-time constant, it will have to be updated to handle the new definition. 
For more information, see the discussion of `warpSize` within the [HIP C++ language extensions](https://rocm.docs.amd.com/projects/HIP/en/latest/how-to/hip_cpp_language_extensions.html#warpsize). * Behavior changes - `hipGetLastError` now returns the error code which is the last actual error caught in the current thread during the application execution. - Cooperative groups in `hipLaunchCooperativeKernelMultiDevice` and `hipLaunchCooperativeKernel` functions, additional input parameter validation checks are added. - `hipPointerGetAttributes` returns `hipSuccess` instead of an error with invalid value `hipErrorInvalidValue`, in case `NULL` host or attribute pointer is passed as input parameter. It now matches the functionality of `cudaPointerGetAttributes` which changed with CUDA 11 and above releases. - - `hipFree` previously there was an implicit wait which was applicable for all memory allocations, for synchronization purpose. This wait is now disabled for allocations made with `hipMallocAsync` and `hipMallocFromPoolAsync`, to match the behavior of CUDA API `cudaFree` + - `hipFree` previously there was an implicit wait which was applicable for all memory allocations, for synchronization purpose. This wait is now disabled for allocations made with `hipMallocAsync` and `hipMallocFromPoolAsync`, to match the behavior of CUDA API `cudaFree`. - `hipFreeAsync` now returns `hipSuccess` when the input pointer is NULL, instead of ` hipErrorInvalidValue` , to be consistent with `hipFree`. * Changes in hipRTC. - Removal of `hipRTC` symbols from HIP Runtime Library. @@ -929,14 +925,14 @@ In order to match the CUDA specification, the `warpSize` variable is no longer ` - HIP vector constructor change in `hipComplex` initialization now generates correct values. The affected constructors will be small vector types such as `float2`, `int4`, etc. 
* Stream Capture updates - Restricted stream capture mode, it is made in HIP APIs via adding the macro `CHECK_STREAM_CAPTURE_SUPPORTED ()`. -In the previous HIP enumeration `hipStreamCaptureMode`, three capture modes were defined. With checking in the macro, the only supported stream capture mode is now `hipStreamCaptureModeRelaxed`. The rest are not supported, and the macro will return `hipErrorStreamCaptureUnsupported`. This update involves the following APIs, which is allowed only in relaxed stream capture mode, +In the previous HIP enumeration `hipStreamCaptureMode`, three capture modes were defined. With checking in the macro, the only supported stream capture mode is now `hipStreamCaptureModeRelaxed`. The rest are not supported, and the macro will return `hipErrorStreamCaptureUnsupported`. This update involves the following APIs, which are allowed only in relaxed stream capture mode: * `hipMallocManaged` * `hipMemAdvise` - Checks stream capture mode, the following APIs check the stream capture mode and return error codes to match the behavior of CUDA. * `hipLaunchCooperativeKernelMultiDevice` * `hipEventQuery` * `hipStreamAddCallback` - Returns error during stream capture. The following HIP APIs now returns specific error `hipErrorStreamCaptureUnsupported` on the AMD platform, but not always `hipSuccess`, to match behavior with CUDA. + - Returns error during stream capture. The following HIP APIs now return the specific error `hipErrorStreamCaptureUnsupported` on the AMD platform, rather than always `hipSuccess`, to match behavior with CUDA: * `hipDeviceSetMemPool` * `hipMemPoolCreate` * `hipMemPoolDestroy` @@ -945,7 +941,7 @@ In the previous HIP enumeration `hipStreamCaptureMode`, three capture modes were * `hipMemcpyWithStream` * Error code update Returned error/value codes are updated in the following HIP APIs to match the corresponding CUDA APIs.
- - Module Management Related APIs + - Module Management Related APIs: * `hipModuleLaunchKernel` * `hipExtModuleLaunchKernel` * `hipExtLaunchKernel` @@ -954,13 +950,13 @@ Returned error/value codes are updated in the following HIP APIs to match the co * `hipLaunchKernelExC` * `hipModuleLaunchCooperativeKernel` * `hipModuleLoad` - - Texture Management Related APIs + - Texture Management Related APIs: The following APIs update the return codes to match the behavior with CUDA: * `hipTexObjectCreate`, supports zero width and height for 2D image. If either is zero, will not return `false`. * `hipBindTexture2D`, adds extra check, if pointer for texture reference or device is NULL, returns `hipErrorNotFound`. * `hipBindTextureToArray`, if any NULL pointer is input for texture object, resource descriptor, or texture descriptor, returns error `hipErrorInvalidChannelDescriptor`, instead of `hipErrorInvalidValue`. * `hipGetTextureAlignmentOffset`, adds a return code `hipErrorInvalidTexture` when the texture reference pointer is NULL. - - Cooperative Group Related APIs, more calidations are added in the following API implementation, + - Cooperative Group Related APIs, more validations are added in the following API implementation: * `hipLaunchCooperativeKernelMultiDevice` * `hipLaunchCooperativeKernel` * Invalid stream input parameter handling @@ -1007,7 +1003,7 @@ In order to match the CUDA runtime behavior more closely, HIP APIs with streams #### Optimized -HIP runtime has the following functional improvements which improves runtime performance and user experience. +HIP runtime has the following functional improvements which improve runtime performance and user experience: * Reduced usage of the lock scope in events and kernel handling. - Switches to `shared_mutex` for event validation, uses `std::unique_lock` in HIP runtime to create/destroy event, instead of `scopedLock`.
@@ -1016,7 +1012,7 @@ HIP runtime has the following functional improvements which improves runtime per * Refactored memory validation, creates a unique function to validate a variety of memory copy operations. * Improved kernel logging using demangling shader names. * Advanced support for SPIRV, now kernel compilation caching is enabled by default. This feature is controlled by the environment variable `AMD_COMGR_CACHE`, for details, see [hip_rtc document](https://rocm.docs.amd.com/projects/HIP/en/latest/how-to/hip_rtc.html). -* Programmatic support for scratch limits on the AMD Instinct MI300 and MI350 series up GPU devices. More enumeration values were added in `hipLimit_t` as following, +* Programmatic support for scratch limits on the AMD Instinct MI300 and MI350 series up GPU devices. More enumeration values were added in `hipLimit_t` as follows: - `hipExtLimitScratchMin`, minimum allowed value in bytes for scratch limit on the device. - `hipExtLimitScratchMax`, maximum allowed value in bytes for scratch limit on the device. - `hipExtLimitScratchCurrent`, current scratch limit threshold in bytes on the device. Must be between the value `hipExtLimitScratchMin` and `hipExtLimitScratchMax`. @@ -1034,14 +1030,14 @@ HIP runtime has the following functional improvements which improves runtime per * Fixed issue of handling the kernel parameters for the graph launch. * Failures in roc-obj tools. HIP runtime now makes `DEPRECATED` message in roc-obj tools as `STDERR`. * Support of `hipDeviceMallocContiguous` flags in `hipExtMallocWithFlags()`. It now enables `HSA_AMD_MEMORY_POOL_CONTIGUOUS_FLAG` in the memory pool allocation on GPU device. -* Compilation failure, HIP runtime refactored the vector type alignment with `__hip_vec_align_v` +* Compilation failure, HIP runtime refactored the vector type alignment with `__hip_vec_align_v`. * A numerical error/corruption found in Pytorch during graph replay.
HIP runtime fixed the input sizes of kernel launch dimensions in hipExtModuleLaunchKernel for the execution of hipGraph capture. * A crash during kernel execution in a customer application. The structure of kernel arguments was updated via adding the size of kernel arguments, and HIP runtime does validation before launch kernel with the structured arguments. #### Known issues * `hipLaunchHostFunc` returns an error during stream capture. Any application using `hipLaunchHostFunc` might fail to capture graphs during stream capture, instead, it returns `hipErrorStreamCaptureUnsupported`. -* Compilation failure in kernels via hiprtc when use option `std=c++11`. +* Compilation failure in kernels via hiprtc when using option `std=c++11`. ### **hipBLAS** (3.0.0) @@ -1072,13 +1068,13 @@ HIP runtime has the following functional improvements which improves runtime per #### Added * Stream-K GEMM support has been enabled for the `FP32`, `FP16`, `BF16`, `FP8`, and `BF8` data types on the Instinct MI300A APU. To activate this feature, set the `TENSILE_SOLUTION_SELECTION_METHOD` environment variable to `2`, for example, `export TENSILE_SOLUTION_SELECTION_METHOD=2`. -* Added fused Swish/SiLU GEMM (enabled by ``HIPBLASLT_EPILOGUE_SWISH_EXT`` and ``HIPBLASLT_EPILOGUE_SWISH_BIAS_EXT``). -* Added support for ``HIPBLASLT_EPILOGUE_GELU_AUX_BIAS`` for gfx942. -* Added `HIPBLASLT_TUNING_USER_MAX_WORKSPACE` to constrain the maximum workspace size for user offline tuning. -* Added ``HIPBLASLT_ORDER_COL16_4R16`` and ``HIPBLASLT_ORDER_COL16_4R8`` to ``hipblasLtOrder_t`` to support `FP16`/`BF16` swizzle GEMM and `FP8` / `BF8` swizzle GEMM respectively. -* Added TF32 emulation on gfx950. -* Added support for `FP6`, `BF6`, and `FP4` on gfx950. -* Added support for block scaling by setting `HIPBLASLT_MATMUL_DESC_A_SCALE_MODE` and `HIPBLASLT_MATMUL_DESC_B_SCALE_MODE` to `HIPBLASLT_MATMUL_MATRIX_SCALE_VEC32_UE8M0`. 
+* Fused Swish/SiLU GEMM (enabled by ``HIPBLASLT_EPILOGUE_SWISH_EXT`` and ``HIPBLASLT_EPILOGUE_SWISH_BIAS_EXT``). +* Support for ``HIPBLASLT_EPILOGUE_GELU_AUX_BIAS`` for gfx942. +* `HIPBLASLT_TUNING_USER_MAX_WORKSPACE` to constrain the maximum workspace size for user offline tuning. +* ``HIPBLASLT_ORDER_COL16_4R16`` and ``HIPBLASLT_ORDER_COL16_4R8`` to ``hipblasLtOrder_t`` to support `FP16`/`BF16` swizzle GEMM and `FP8` / `BF8` swizzle GEMM respectively. +* TF32 emulation on gfx950. +* Support for `FP6`, `BF6`, and `FP4` on gfx950. +* Support for block scaling by setting `HIPBLASLT_MATMUL_DESC_A_SCALE_MODE` and `HIPBLASLT_MATMUL_DESC_B_SCALE_MODE` to `HIPBLASLT_MATMUL_MATRIX_SCALE_VEC32_UE8M0`. #### Changed @@ -1103,28 +1099,19 @@ HIP runtime has the following functional improvements which improves runtime per #### Added -* Added a new cmake option, `BUILD_OFFLOAD_COMPRESS`. When hipCUB is built with this option enabled, the `--offload-compress` switch is passed to the compiler. This causes the compiler to compress the binary that it generates. Compression can be useful in cases where you are compiling for a large number of targets, since this often results in a large binary. Without compression, in some cases, the generated binary may become so large that symbols are placed out of range, resulting in linking errors. The new `BUILD_OFFLOAD_COMPRESS` option is set to `ON` by default. -* Added single pass operators in `agent/single_pass_scan_operators.hpp` which contains the following API: +* A new cmake option, `BUILD_OFFLOAD_COMPRESS`. When hipCUB is built with this option enabled, the `--offload-compress` switch is passed to the compiler. This causes the compiler to compress the binary that it generates. Compression can be useful in cases where you are compiling for a large number of targets, since this often results in a large binary. 
Without compression, in some cases, the generated binary may become so large that symbols are placed out of range, resulting in linking errors. The new `BUILD_OFFLOAD_COMPRESS` option is set to `ON` by default. +* Single pass operators in `agent/single_pass_scan_operators.hpp` which contains the following API: * `BlockScanRunningPrefixOp` * `ScanTileStatus` * `ScanTileState` * `ReduceByKeyScanTileState` * `TilePrefixCallbackOp` -* Added gfx950 support. -* Added an overload of `BlockScan::InclusiveScan` that accepts an initial value to seed the scan. -* Added an overload of `WarpScan::InclusiveScan` that accepts an initial value to seed the scan. +* Support for gfx950. +* An overload of `BlockScan::InclusiveScan` that accepts an initial value to seed the scan. +* An overload of `WarpScan::InclusiveScan` that accepts an initial value to seed the scan. * `UnrolledThreadLoad`, `UnrolledCopy`, and `ThreadLoadVolatilePointer` were added to align hipCUB with CUB. * `ThreadStoreVolatilePtr` and the `IterateThreadStore` struct were added to align hipCUB with CUB. -* Added `hipcub::InclusiveScanInit` for CUB parity. - -#### Removed - -* The AMD GPU targets `gfx803` and `gfx900` are no longer built by default. If you want to build for these architectures, specify them explicitly in the `AMDGPU_TARGETS` cmake option. -* Deprecated `hipcub::AsmThreadLoad` is removed, use `hipcub::ThreadLoad` instead. -* Deprecated `hipcub::AsmThreadStore` is removed, use `hipcub::ThreadStore` instead. -* Deprecated `BlockAdjacentDifference::FlagHeads`, `BlockAdjacentDifference::FlagTails` and `BlockAdjacentDifference::FlagHeadsAndTails` have been removed. -* This release removes support for custom builds on gfx940 and gfx941. -* Removed C++14 support. Only C++17 is supported. +* `hipcub::InclusiveScanInit` for CUB parity. 
#### Changed @@ -1135,6 +1122,15 @@ HIP runtime has the following functional improvements which improves runtime per * The `hipcub::detail::accumulator_t` in rocPRIM backend has been changed to utilise `rocprim::accumulator_t`. * The usage of `rocprim::invoke_result_binary_op_t` has been replaced with `rocprim::accumulator_t`. +#### Removed + +* The AMD GPU targets `gfx803` and `gfx900` are no longer built by default. If you want to build for these architectures, specify them explicitly in the `AMDGPU_TARGETS` cmake option. +* Deprecated `hipcub::AsmThreadLoad` is removed, use `hipcub::ThreadLoad` instead. +* Deprecated `hipcub::AsmThreadStore` is removed, use `hipcub::ThreadStore` instead. +* Deprecated `BlockAdjacentDifference::FlagHeads`, `BlockAdjacentDifference::FlagTails` and `BlockAdjacentDifference::FlagHeadsAndTails` have been removed. +* This release removes support for custom builds on gfx940 and gfx941. +* Removed C++14 support. Only C++17 is supported. + #### Resolved issues * Fixed an issue where `Sort(keys, compare_op, valid_items, oob_default)` in `block_merge_sort.hpp` would not fill in elements that are out of range (items after `valid_items`) with `oob_default`. @@ -1153,7 +1149,7 @@ HIP runtime has the following functional improvements which improves runtime per #### Added -* Added gfx950 support. +* Support for gfx950. #### Removed @@ -1165,7 +1161,7 @@ HIP runtime has the following functional improvements which improves runtime per #### Added -* Added documentation clarifying how hipfort is built for the NVIDIA platform. +* Documentation clarifying how hipfort is built for the NVIDIA platform. #### Changed @@ -1175,24 +1171,24 @@ HIP runtime has the following functional improvements which improves runtime per #### Added -* CUDA 12.9.1 support -* cuDNN 9.11.0 support -* cuTENSOR 2.2.0.0 support -* LLVM 20.1.8 support +* CUDA 12.9.1 support. +* cuDNN 9.11.0 support. +* cuTENSOR 2.2.0.0 support. +* LLVM 20.1.8 support. 
#### Resolved issues -* `hipDNN` support is removed by default -* [#1859](https://github.com/ROCm/HIPIFY/issues/1859)[hipify-perl] Fix warnings on unsupported Driver or Runtime APIs which were erroneously not reported -* [#1930](https://github.com/ROCm/HIPIFY/issues/1930) Revise `JIT API` -* [#1962](https://github.com/ROCm/HIPIFY/issues/1962) Support for cuda-samples helper headers -* [#2035](https://github.com/ROCm/HIPIFY/issues/2035) Remove `const_cast;` in `hiprtcCreateProgram` and `hiprtcCompileProgram` +* `hipDNN` support is removed by default. +* [#1859](https://github.com/ROCm/HIPIFY/issues/1859)[hipify-perl] Fix warnings on unsupported Driver or Runtime APIs which were erroneously not reported. +* [#1930](https://github.com/ROCm/HIPIFY/issues/1930) Revise `JIT API`. +* [#1962](https://github.com/ROCm/HIPIFY/issues/1962) Support for cuda-samples helper headers. +* [#2035](https://github.com/ROCm/HIPIFY/issues/2035) Removed `const_cast;` in `hiprtcCreateProgram` and `hiprtcCompileProgram`. ### **hipRAND** (3.0.0) #### Added -* gfx950 support. +* Support for gfx950. #### Changed @@ -1206,7 +1202,7 @@ HIP runtime has the following functional improvements which improves runtime per #### Added -* Added compatibility-only functions +* Added compatibility-only functions: * csrlsvqr * `hipsolverSpCcsrlsvqr`, `hipsolverSpZcsrlsvqr` @@ -1219,15 +1215,15 @@ HIP runtime has the following functional improvements which improves runtime per #### Added -* Added the `int8`, `int32`, and `float16` data types to `hipDataTypeToHCCDataType` so that sparse matrix descriptors can be used with them. -* Added half float mixed precision to `hipsparseAxpby` where X and Y use `float16` and the result and compute type use `float`. -* Added half float mixed precision to `hipsparseSpVV` where X and Y use `float16` and the result and compute type use `float`. -* Added half float mixed precision to `hipsparseSpMM` where A and B use `float16` and C and the compute type use `float`. 
-* Added half float mixed precision to `hipsparseSDDMM` where A and B use `float16` and C and the compute type use `float`. -* Added half float uniform precision to the `hipsparseScatter` and `hipsparseGather` routines. -* Added half float uniform precision to the `hipsparseSDDMM` routine. -* Added `int8` precision to the `hipsparseCsr2cscEx2` routine. -* Added the `almalinux` operating system name to correct the GFortran dependency. +* `int8`, `int32`, and `float16` data types to `hipDataTypeToHCCDataType` so that sparse matrix descriptors can be used with them. +* Half float mixed precision to `hipsparseAxpby` where X and Y use `float16` and the result and compute type use `float`. +* Half float mixed precision to `hipsparseSpVV` where X and Y use `float16` and the result and compute type use `float`. +* Half float mixed precision to `hipsparseSpMM` where A and B use `float16` and C and the compute type use `float`. +* Half float mixed precision to `hipsparseSDDMM` where A and B use `float16` and C and the compute type use `float`. +* Half float uniform precision to the `hipsparseScatter` and `hipsparseGather` routines. +* Half float uniform precision to the `hipsparseSDDMM` routine. +* `int8` precision to the `hipsparseCsr2cscEx2` routine. +* The `almalinux` operating system name to correct the GFortran dependency. #### Changed @@ -1267,21 +1263,21 @@ HIP runtime has the following functional improvements which improves runtime per #### Added -* Added element-wise binary operation support. -* Added element-wise trinary operation support. -* Added support for GPU target gfx950. -* Added dynamic unary and binary operator support for element-wise operations and permutation. -* Added a CMake check for `f8` datatype availability. -* Added `hiptensorDestroyOperationDescriptor` to free all resources related to the provided descriptor. -* Added `hiptensorOperationDescriptorSetAttribute` to set attribute of a `hiptensorOperationDescriptor_t` object. 
-* Added `hiptensorOperationDescriptorGetAttribute` to retrieve an attribute of the provided `hiptensorOperationDescriptor_t` object. -* Added `hiptensorCreatePlanPreference` to allocate the `hiptensorPlanPreference_t` and enabled users to limit the applicable kernels for a given plan or operation. -* Added `hiptensorDestroyPlanPreference` to free all resources related to the provided preference. -* Added `hiptensorPlanPreferenceSetAttribute` to set attribute of a `hiptensorPlanPreference_t` object. -* Added `hiptensorPlanGetAttribute` to retrieve information about an already-created plan. -* Added `hiptensorEstimateWorkspaceSize` to determine the required workspace size for the given operation. -* Added `hiptensorCreatePlan` to allocate a `hiptensorPlan_t` object, select an appropriate kernel for a given operation and prepare a plan that encodes the execution. -* Added `hiptensorDestroyPlan` to free all resources related to the provided plan. +* Element-wise binary operation support. +* Element-wise trinary operation support. +* Support for GPU target gfx950. +* Dynamic unary and binary operator support for element-wise operations and permutation. +* CMake check for `f8` datatype availability. +* `hiptensorDestroyOperationDescriptor` to free all resources related to the provided descriptor. +* `hiptensorOperationDescriptorSetAttribute` to set an attribute of a `hiptensorOperationDescriptor_t` object. +* `hiptensorOperationDescriptorGetAttribute` to retrieve an attribute of the provided `hiptensorOperationDescriptor_t` object. +* `hiptensorCreatePlanPreference` to allocate the `hiptensorPlanPreference_t` and enable users to limit the applicable kernels for a given plan or operation. +* `hiptensorDestroyPlanPreference` to free all resources related to the provided preference. +* `hiptensorPlanPreferenceSetAttribute` to set an attribute of a `hiptensorPlanPreference_t` object. +* `hiptensorPlanGetAttribute` to retrieve information about an already-created plan.
+* `hiptensorEstimateWorkspaceSize` to determine the required workspace size for the given operation. +* `hiptensorCreatePlan` to allocate a `hiptensorPlan_t` object, select an appropriate kernel for a given operation and prepare a plan that encodes the execution. +* `hiptensorDestroyPlan` to free all resources related to the provided plan. #### Changed @@ -1310,11 +1306,11 @@ HIP runtime has the following functional improvements which improves runtime per #### Added -* Added the compiler `-gsplit-dwarf` option to enable the generation of separate debug information file at compile time. When used, separate debug information files are generated for host and for each offload architecture. For additional information, see [DebugFission](https://gcc.gnu.org/wiki/DebugFission). -* Added `llvm-flang`, AMD's next-generation Fortran compiler. It's a re-implementation of the Fortran frontend that can be found at `llvm/llvm-project/flang` on GitHub. -* Added Comgr support for an in-memory virtual file system (VFS) for storing temporary files generated during intermediate compilation steps to improve performance in the device library link step. -* Added compiler support of a new target-specific builtin `__builtin_amdgcn_processor_is` for late or deferred queries of the current target processor, and `__builtin_amdgcn_is_invocable` to determine the current target processor ability to invoke a particular builtin. -* Added HIPIFY support for NVIDIA CUDA 12.9.1 APIs. Added support for all new device and host APIs, including FP4, FP6, and FP128, and support for the corresponding ROCm HIP equivalents. +* The compiler `-gsplit-dwarf` option to enable the generation of separate debug information file at compile time. When used, separate debug information files are generated for host and for each offload architecture. For additional information, see [DebugFission](https://gcc.gnu.org/wiki/DebugFission). +* `llvm-flang`, AMD's next-generation Fortran compiler. 
It's a re-implementation of the Fortran frontend that can be found at `llvm/llvm-project/flang` on GitHub. +* Comgr support for an in-memory virtual file system (VFS) for storing temporary files generated during intermediate compilation steps to improve performance in the device library link step. +* Compiler support of a new target-specific builtin `__builtin_amdgcn_processor_is` for late or deferred queries of the current target processor, and `__builtin_amdgcn_is_invocable` to determine the current target processor ability to invoke a particular builtin. +* HIPIFY support for NVIDIA CUDA 12.9.1 APIs. Added support for all new device and host APIs, including FP4, FP6, and FP128, and support for the corresponding ROCm HIP equivalents. #### Changed @@ -1338,12 +1334,12 @@ HIP runtime has the following functional improvements which improves runtime per * Support for PyTorch 2.7 via Torch-MIGraphX. * Support for the Microsoft ONNX Contrib Operators (Self) Attention, RotaryEmbedding, QuickGelu, BiasAdd, BiasSplitGelu, SkipLayerNorm. * Support for Sigmoid and AddN TensorFlow operators. -* Added GroupQuery Attention support for LLMs. -* Added support for edge mode in the ONNX Pad operator. -* Added ONNX runtime Python driver. -* Added FLUX e2e example. -* Added C++ and Python APIs to save arguments to a graph as a msgpack file, and then read the file back. -* Added rocMLIR fusion for kv-cache attention. +* GroupQuery Attention support for LLMs. +* Support for edge mode in the ONNX Pad operator. +* ONNX runtime Python driver. +* FLUX e2e example. +* C++ and Python APIs to save arguments to a graph as a msgpack file, and then read the file back. +* rocMLIR fusion for kv-cache attention. * Introduced a check for file-write errors. #### Changed @@ -1366,14 +1362,14 @@ HIP runtime has the following functional improvements which improves runtime per #### Removed * `ROCM_USE_FLOAT8` macro. 
-* The BF16 GEMM test was removed for Navi21, as it is unsupported by rocBLAS and hipBLASLt on that platform. +* The `BF16` GEMM test was removed for Navi21, as it is unsupported by rocBLAS and hipBLASLt on that platform. #### Optimized * Use common average in `compile_ops` to reduce run-to-run variations when tuning. * Improved the performance of the TopK operator. * Conform to a single layout (NHWC or NCHW) during compilation rather than combining two. -* Slice Channels Conv Optimization (slice output fusion) +* Slice Channels Conv Optimization (slice output fusion). * Horizontal fusion optimization after pointwise operations. * Reduced the number of literals used in `GridSample` linear sampler. * Fuse multiple outputs for pointwise operations. @@ -1400,12 +1396,12 @@ HIP runtime has the following functional improvements which improves runtime per #### Added -* [Conv] Added misa kernels for gfx950. +* [Conv] Misa kernels for gfx950. * [Conv] Enabled Split-K support for CK backward data solvers (2D). * [Conv] Enabled CK wrw solver on gfx950 for the `BF16` data type. * [BatchNorm] Enabled NHWC in OpenCL. -* Added grouped convolution + activation fusion. -* Added grouped convolution + bias + activation fusion. +* Grouped convolution + activation fusion. +* Grouped convolution + bias + activation fusion. * Composable Kernel (CK) can now be built inline as part of MIOpen. #### Changed @@ -1453,12 +1449,12 @@ HIP runtime has the following functional improvements which improves runtime per #### Added -* Added support for the extended fine-grained system memory pool. -* Added support for gfx950. -* Added support for `unroll=1` in device-code generation to improve performance. +* Support for the extended fine-grained system memory pool. +* Support for gfx950. +* Support for `unroll=1` in device-code generation to improve performance. * Set a default of 112 channels for a single node with `8 * gfx950`. * Enabled LL128 protocol on the gfx950. 
-* Added the ability to choose the unroll factor at runtime using `RCCL_UNROLL_FACTOR`. This can be set at runtime to 1, 2, or 4. This change currently increases compilation and linking time because it triples the number of kernels generated. +* The ability to choose the unroll factor at runtime using `RCCL_UNROLL_FACTOR`. This can be set at runtime to 1, 2, or 4. This change currently increases compilation and linking time because it triples the number of kernels generated. * Added MSCCL support for AllGather multinode on the gfx942 and gfx950 (for instance, 16 and 32 GPUs). To enable this feature, set the environment variable `RCCL_MSCCL_FORCE_ENABLE=1`. The maximum message size for MSCCL AllGather usage is `12292 * sizeof(datatype) * nGPUs`. * Thread thresholds for LL/LL128 are selected in Tuning Models for the AMD Instinct MI300X. This impacts the number of channels used for AllGather and ReduceScatter. The channel tuning model is bypassed if `NCCL_THREAD_THRESHOLDS`, `NCCL_MIN_NCHANNELS`, or `NCCL_MAX_NCHANNELS` are set. * Multi-node tuning for AllGather, AllReduce, and ReduceScatter that leverages LL/LL64/LL128 protocols to use nontemporal vector load/store for tunable message size ranges. @@ -1486,8 +1482,8 @@ HIP runtime has the following functional improvements which improves runtime per * Setup - installs rocdecode dev packages for Ubuntu, RedHat, and SLES. * Setup - installs turbojpeg dev package for Ubuntu and Redhat. * rocAL's image decoder has been extended to support the rocJPEG hardware decoder. -* Added numpy reader support for reading npy files in rocAL. -* Added test case for numpy reader in C++ and python tests. +* Numpy reader support for reading npy files in rocAL. +* Test case for numpy reader in C++ and python tests. #### Resolved issues * `TurboJPEG` no longer needs to be installed manually. It is now installed by the package installer. 
@@ -1505,7 +1501,7 @@ HIP runtime has the following functional improvements which improves runtime per #### Added -* Added support for gfx950. +* Support for gfx950. #### Changed @@ -1513,17 +1509,17 @@ HIP runtime has the following functional improvements which improves runtime per #### Optimized -* Improved the user documentation +* Improved the user documentation. #### Resolved issues -* Fix for GPU hashing algorithm when not compiling with -O2/O3 +* Fix for GPU hashing algorithm when not compiling with -O2/O3. ### **rocBLAS** (5.0.0) #### Added -* gfx950 support. +* Support for gfx950. * Internal API logging for `gemm` debugging using `ROCBLAS_LAYER = 8`. * Support for the AOCL 5.0 gcc build as a client reference library. * The use of `PkgConfig` for client reference library fallback detection. @@ -1568,10 +1564,10 @@ HIP runtime has the following functional improvements which improves runtime per ### **ROCdbgapi** (0.77.3) #### Added -- Support for the `gfx950`, `gfx1150`, and `gfx1151` architectures. +* Support for the `gfx950` architecture. #### Removed -- Support for the `gfx940` and `gfx941` architectures. +* Support for the `gfx940` and `gfx941` architectures. ### **rocDecode** (1.0.0) @@ -1582,11 +1578,17 @@ HIP runtime has the following functional improvements which improves runtime per * HEVC/AVC/AV1/VP9 stream syntax error handling. * HEVC stream bit depth change handling and DPB buffer size change handling through decoder reconfiguration. * AVC stream DPB buffer size change handling through decoder reconfiguration. -* A new avcodec-based decoder built as a separate `rocdecode-host` library +* A new avcodec-based decoder built as a separate `rocdecode-host` library. #### Changed * rocDecode now uses the Cmake `CMAKE_PREFIX_PATH` directive. +* Changed asserts in query API calls in the RocVideoDecoder utility class to error reports, to avoid a hard stop if an error occurs during a query and to let the caller decide what action to take.
+* `libdrm_amdgpu` is now explicitly linked with rocdecode. + +#### Removed + +* `GetStream()` interface call from the RocVideoDecoder utility class. #### Optimized @@ -1599,20 +1601,11 @@ HIP runtime has the following functional improvements which improves runtime per * Fixed a decoded frame output issue in video size change cases. * Removed incorrect asserts of `bitdepth_minus_8` in `GetBitDepth()` and `num_chroma_planes` in `GetNumChromaPlanes()` API calls in the RocVideoDecoder utility class. -#### Removed - -* `GetStream()` interface call from RocVideoDecoder utility class. - -#### Changed - -* Changed asserts in query API calls in RocVideoDecoder utility class to error reports, to avoid hard stop during query in case error occurs and to let the caller decide actions. -* `libdrm_amdgpu` is now explicitly linked with rocdecode. - ### **rocFFT** (1.0.34) #### Added -* Added gfx950 support. +* Support for gfx950. #### Removed @@ -1640,7 +1633,7 @@ HIP runtime has the following functional improvements which improves runtime per #### Added -- Support for the `gfx950`, `gfx1150`, and `gfx1151` architectures. +- Support for the `gfx950` architecture. #### Removed @@ -1711,12 +1704,12 @@ Review the [README](https://github.com/ROCm/rocm_bandwidth_test/blob/amd-mainlin * L2 to EA stalls * L2 to EA stalls per channel -* Roofline support for AMD Instinct MI350 series architecture. +* Roofline support for AMD Instinct MI350 series accelerators. ##### Textual User Interface (TUI) (beta version) * Text User Interface (TUI) support for analyze mode - * A command line based user interface to support interactive single-run analysis + * A command line based user interface to support interactive single-run analysis. * To launch, use `--tui` option in analyze mode. For example, ``rocprof-compute analyze --tui``.
##### PC Sampling (beta version) @@ -1751,7 +1744,7 @@ Review the [README](https://github.com/ROCm/rocm_bandwidth_test/blob/amd-mainlin * ``-b`` option in profile mode also accepts hardware IP block for filtering; however, this filter support will be deprecated soon. * ``--list-metrics`` option added in profile mode to list possible metric id(s), similar to analyze mode. -* Support MEM chart on CLI (single run) +* Support MEM chart on CLI (single run). * ``--specs-correction`` option to provide missing system specifications for analysis. @@ -1763,30 +1756,30 @@ Review the [README](https://github.com/ROCm/rocm_bandwidth_test/blob/amd-mainlin * Updated Dash to >=3.0.0 (for web UI). * Changed the condition when Roofline PDFs are generated during general profiling and ``--roof-only`` profiling (skip only when ``--no-roof`` option is present). * Updated Roofline binaries: - * Rebuild using latest ROCm stack + * Rebuild using latest ROCm stack. * Minimum OS distribution support minimum for roofline feature is now Ubuntu 22.04, RHEL 8, and SLES15 SP6. #### Removed -* Roofline support for Ubuntu 20.04 and SLES below 15.6 +* Roofline support for Ubuntu 20.04 and SLES below 15.6. * Removed support for AMD Instinct MI50 and MI60. #### Optimized -* ROCm Compute Profiler CLI has been improved to better display the GPU architecture analytics +* ROCm Compute Profiler CLI has been improved to better display the GPU architecture analytics. #### Resolved issues * Fixed kernel name and kernel dispatch filtering when using ``rocprofv3``. * Fixed an issue of TCC channel counters collection in ``rocprofv3``. * Fixed peak FLOPS of `F8`, `I8`, `F16`, and `BF16` on AMD Instinct MI300. -* Fixed not detecting memory clock issue when using amd-smi -* Fixed standalone GUI crashing +* Fixed not detecting memory clock issue when using ``amd-smi``. +* Fixed standalone GUI crashing. * Fixed L2 read/write/atomic bandwidths on AMD Instinct MI350 series. 
#### Known issues -* On AMD Instinct MI100, accumulation counters are not collected, resulting in the following metrics failing to show up in the analysis: Instruction Fetch Latency, Wavefront Occupancy, LDS Latency +* On AMD Instinct MI100, accumulation counters are not collected, resulting in the following metrics failing to show up in the analysis: Instruction Fetch Latency, Wavefront Occupancy, LDS Latency. * As a workaround, use the environment variable ``ROCPROF=rocprof``, to use ``rocprof v1`` for profiling on AMD Instinct MI100. * GPU id filtering is not supported when using ``rocprofv3``. @@ -1815,14 +1808,14 @@ Review the [README](https://github.com/ROCm/rocm_bandwidth_test/blob/amd-mainlin #### Added -- More profiling and monitoring metrics, especially for AMD Instinct MI300 and newer GPUs. -- Advanced logging and debugging options, including new log levels and troubleshooting guidance. +* More profiling and monitoring metrics, especially for AMD Instinct MI300 and newer GPUs. +* Advanced logging and debugging options, including new log levels and troubleshooting guidance. #### Changed -- Completed migration from legacy [ROCProfiler](https://rocm.docs.amd.com/projects/rocprofiler/en/latest/) to [ROCprofiler-SDK](https://rocm.docs.amd.com/projects/rocprofiler-sdk/en/latest/). -- Reorganized the configuration files internally and improved [README/installation](https://github.com/ROCm/rdc/blob/amd-staging/README.md) instructions. -- Updated metrics and monitoring support for the latest AMD data center GPUs. +* Completed migration from legacy [ROCProfiler](https://rocm.docs.amd.com/projects/rocprofiler/en/latest/) to [ROCprofiler-SDK](https://rocm.docs.amd.com/projects/rocprofiler-sdk/en/latest/). +* Reorganized the configuration files internally and improved [README/installation](https://github.com/ROCm/rdc/blob/amd-staging/README.md) instructions. +* Updated metrics and monitoring support for the latest AMD data center GPUs. 
#### Optimized @@ -1838,8 +1831,8 @@ Review the [README](https://github.com/ROCm/rocm_bandwidth_test/blob/amd-mainlin - Support for GPU metrics 1.8. - Added new fields for `rsmi_gpu_metrics_t` including: - Adding the following metrics to allow new calculations for violation status: - - Per XCP metrics `gfx_below_host_limit_ppt_acc[XCP][MAX_XCC]` - GFX Clock Host limit Package Power Tracking violation counts - - Per XCP metrics `gfx_below_host_limit_thm_acc[XCP][MAX_XCC]` - GFX Clock Host limit Thermal (TVIOL) violation counts + - Per XCP metrics `gfx_below_host_limit_ppt_acc[XCP][MAX_XCC]` - GFX Clock Host limit Package Power Tracking violation counts. + - Per XCP metrics `gfx_below_host_limit_thm_acc[XCP][MAX_XCC]` - GFX Clock Host limit Thermal (TVIOL) violation counts. - Per XCP metrics `gfx_low_utilization_acc[XCP][MAX_XCC]` - violation counts for how did low utilization caused the GPU to be below application clocks. - Per XCP metrics `gfx_below_host_limit_total_acc[XCP][MAX_XCC]`- violation counts for how long GPU was held below application clocks any limiter (see above new violation metrics). - Increasing available JPEG engines to 40. @@ -1897,29 +1890,25 @@ See the full [ROCm SMI changelog](https://github.com/ROCm/rocm_smi_lib/blob/rele #### Added -* Added gfx950 support. -* Added `rocprim::accumulator_t` to ensure parity with CCCL. -* Added test for `rocprim::accumulator_t`. -* Added `rocprim::invoke_result_r` to ensure parity with CCCL. -* Added function `is_build_in` into `rocprim::traits::get`. -* Added virtual shared memory as a fallback option in `rocprim::device_merge` when it exceeds shared memory capacity, similar to `rocprim::device_select`, `rocprim::device_partition`, and `rocprim::device_merge_sort`, which already include this feature. -* Added initial value support to device level inclusive scans. -* Added new optimization to the backend for `device_transform` when the input and output are pointers. 
-* Added `LoadType` to `transform_config`, which is used for the `device_transform` when the input and output are pointers. -* Added `rocprim:device_transform` for n-ary transform operations API with as input `n` number of iterators inside a `rocprim::tuple`. -* Added `rocprim::key_value_pair::operator==`. -* Added the `rocprim::unrolled_copy` thread function to copy multiple items inside a thread. -* Added the `rocprim::unrolled_thread_load` function to load multiple items inside a thread using `rocprim::thread_load`. -* Added `rocprim::int128_t` and `rocprim::uint128_t` to benchmarks for improved performance evaluation on 128-bit integers. -* Added `rocprim::int128_t` to the supported autotuning types to improve performance for 128-bit integers. -* Added the `rocprim::merge_inplace` function for merging in-place. -* Added initial value support for warp- and block-level inclusive scan. -* Added support for building tests with device-side random data generation, making them finish faster. This requires rocRAND, and is enabled with the `WITH_ROCRAND=ON` build flag. -* Added tests and documentation to `lookback_scan_state`. It is still in the `detail` namespace. - -#### Optimized - -* Improved performance of `rocprim::device_select` and `rocprim::device_partition` when using multiple streams on the AMD Instinct MI300 series. +* Support for gfx950. +* `rocprim::accumulator_t` to ensure parity with CCCL. +* Test for `rocprim::accumulator_t`. +* `rocprim::invoke_result_r` to ensure parity with CCCL. +* Function `is_build_in` into `rocprim::traits::get`. +* Virtual shared memory as a fallback option in `rocprim::device_merge` when it exceeds shared memory capacity, similar to `rocprim::device_select`, `rocprim::device_partition`, and `rocprim::device_merge_sort`, which already include this feature. +* Initial value support to device level inclusive scans. +* New optimization to the backend for `device_transform` when the input and output are pointers. 
+* `LoadType` to `transform_config`, which is used for the `device_transform` when the input and output are pointers. +* `rocprim::device_transform` for the n-ary transform operations API, which takes `n` input iterators inside a `rocprim::tuple`. +* `rocprim::key_value_pair::operator==`. +* The `rocprim::unrolled_copy` thread function to copy multiple items inside a thread. +* The `rocprim::unrolled_thread_load` function to load multiple items inside a thread using `rocprim::thread_load`. +* `rocprim::int128_t` and `rocprim::uint128_t` to benchmarks for improved performance evaluation on 128-bit integers. +* `rocprim::int128_t` to the supported autotuning types to improve performance for 128-bit integers. +* The `rocprim::merge_inplace` function for merging in-place. +* Initial value support for warp- and block-level inclusive scan. +* Support for building tests with device-side random data generation, making them finish faster. This requires rocRAND, and is enabled with the `WITH_ROCRAND=ON` build flag. +* Tests and documentation to `lookback_scan_state`. It is still in the `detail` namespace. #### Changed @@ -1947,26 +1936,22 @@ The previous default accumulator types could lead to situations in which unexpec * Renamed `rocprim::load_cs` to `rocprim::load_nontemporal` and `rocprim::store_cs` to `rocprim::store_nontemporal` to express the intent of these load and store methods better. * All kernels now have hidden symbol visibility. All symbols now have inline namespaces that include the library version, for example, `rocprim::ROCPRIM_300400_NS::symbol` instead of `rocPRIM::symbol`, letting the user link multiple libraries built with different versions of rocPRIM. -#### Upcoming changes - -* `rocprim::invoke_result_binary_op` and `rocprim::invoke_result_binary_op_t` are deprecated. Use `rocprim::accumulator_t` instead. - #### Removed -* Removed `rocprim::detail::float_bit_mask` and relative tests, use `rocprim::traits::float_bit_mask` instead.
-* Removed `rocprim::traits::is_fundamental`, use `rocprim::traits::get::is_fundamental()` directly. -* Removed the deprecated parameters `short_radix_bits` and `ShortRadixBits` from the `segmented_radix_sort` config. They were unused, it is only an API change. -* Removed the deprecated `operator<<` from the iterators. -* Removed the deprecated `TwiddleIn` and `TwiddleOut`. Use `radix_key_codec` instead. -* Removed the deprecated flags API of `block_adjacent_difference`. Use `subtract_left()` or `block_discontinuity::flag_heads()` instead. -* Removed the deprecated `to_exclusive` functions in the warp scans. -* Removed the `rocprim::load_cs` from the `cache_load_modifier` enum. Use `rocprim::load_nontemporal` instead. -* Removed the `rocprim::store_cs` from the `cache_store_modifier` enum. Use `rocprim::store_nontemporal` instead. -* Removed the deprecated header file `rocprim/detail/match_result_type.hpp`. Include `rocprim/type_traits.hpp` instead. This header included: +* `rocprim::detail::float_bit_mask` and related tests. Use `rocprim::traits::float_bit_mask` instead. +* `rocprim::traits::is_fundamental`. Use `rocprim::traits::get::is_fundamental()` directly. +* The deprecated parameters `short_radix_bits` and `ShortRadixBits` from the `segmented_radix_sort` config. They were unused, so this is only an API change. +* The deprecated `operator<<` from the iterators. +* The deprecated `TwiddleIn` and `TwiddleOut`. Use `radix_key_codec` instead. +* The deprecated flags API of `block_adjacent_difference`. Use `subtract_left()` or `block_discontinuity::flag_heads()` instead. +* The deprecated `to_exclusive` functions in the warp scans. +* The `rocprim::load_cs` from the `cache_load_modifier` enum. Use `rocprim::load_nontemporal` instead. +* The `rocprim::store_cs` from the `cache_store_modifier` enum. Use `rocprim::store_nontemporal` instead. +* The deprecated header file `rocprim/detail/match_result_type.hpp`. Include `rocprim/type_traits.hpp` instead.
This header included: * `rocprim::detail::invoke_result`. Use `rocprim::invoke_result` instead. * `rocprim::detail::invoke_result_binary_op`. Use `rocprim::invoke_result_binary_op` instead. * `rocprim::detail::match_result_type`. Use `rocprim::invoke_result_binary_op_t` instead. -* Removed the deprecated `rocprim::detail::radix_key_codec` function. Use `rocprim::radix_key_codec` instead. +* The deprecated `rocprim::detail::radix_key_codec` function. Use `rocprim::radix_key_codec` instead. * Removed `rocprim/detail/radix_sort.hpp`, functionality can now be found in `rocprim/thread/radix_key_codec.hpp`. * Removed C++14 support. Only C++17 is supported. * Due to the removal of `__AMDGCN_WAVEFRONT_SIZE` in the compiler, the following deprecated warp size-related symbols have been removed: @@ -1981,6 +1966,10 @@ The previous default accumulator types could lead to situations in which unexpec * This was a fallback define for the compiler's removed symbol, having the same name. * This release removes support for custom builds on gfx940 and gfx941. +#### Optimized + +* Improved performance of `rocprim::device_select` and `rocprim::device_partition` when using multiple streams on the AMD Instinct MI300 series. + #### Resolved issues * Fixed an issue where `device_batch_memcpy` reported benchmarking throughput being 2x lower than it was in reality. @@ -1994,6 +1983,10 @@ The previous default accumulator types could lead to situations in which unexpec * When using `rocprim::deterministic_inclusive_scan_by_key` and `rocprim::deterministic_exclusive_scan_by_key` the intermediate values can change order on Navi3x. However, if a commutative scan operator is used then the final scan value (output array) will still always be consistent between runs. +#### Upcoming changes + +* `rocprim::invoke_result_binary_op` and `rocprim::invoke_result_binary_op_t` are deprecated. Use `rocprim::accumulator_t` instead. 
+ ### **ROCprofiler-SDK** (1.0.0) #### Added @@ -2009,7 +2002,7 @@ The previous default accumulator types could lead to situations in which unexpec - relative == logical_node_id - type-relative == logical_node_type_id - MI300 and MI350 stochastic (hardware-based) PC sampling support in ROCProfiler-SDK and `rocprofv3`. -- Python bindings for `rocprofiler-sdk-roctx` +- Python bindings for `rocprofiler-sdk-roctx`. - SQLite3 output support for `rocprofv3` using `--output-format rocpd`. - `rocprofiler-sdk-rocpd` package: - Public API in `include/rocprofiler-sdk-rocpd/rocpd.h`. @@ -2064,17 +2057,17 @@ The previous default accumulator types could lead to situations in which unexpec #### Added * ``rocpyjpegdecode`` package. -* Added ``src/rocjpeg`` source new subfolder. -* Added ``samples/rocjpeg`` new subfolder. +* ``src/rocjpeg`` source new subfolder. +* ``samples/rocjpeg`` new subfolder. #### Changed -* Minimum version for rocdecode and rocjpeg updated to V1.0.0 +* Minimum version for rocdecode and rocjpeg updated to V1.0.0. ### **rocRAND** (4.0.0) #### Added -* gfx950 support. +* Support for gfx950. * Additional unit tests for `test_log_normal_distribution.cpp`, `test_normal_distribution.cpp`, `test_rocrand_mtgp32_prng.cpp`, `test_rocrand_scrambled_sobol32_qrng.cpp`, `test_rocrand_scrambled_sobol64_qrng.cpp`, `test_rocrand_sobol32_qrng.cpp`, `test_rocrand_sobol64_qrng.cpp`, `test_rocrand_threefry2x32_20_prng.cpp`, `test_rocrand_threefry2x64_20_prng.cpp`, `test_rocrand_threefry4x32_20_prng.cpp`, `test_rocrand_threefry4x64_20_prng.cpp`, and `test_uniform_distribution.cpp`. * New unit tests for `include/rocrand/rocrand_discrete.h` in `test_rocrand_discrete.cpp`, `include/rocrand/rocrand_mrg31k3p.h` in `test_rocrand_mrg31k3p_prng.cpp`, `include/rocrand/rocrand_mrg32k3a.h` in `test_rocrand_mrg32k3a_prng.cpp`, and `include/rocrand/rocrand_poisson.h` in `test_rocrand_poisson.cpp`. 
@@ -2110,7 +2103,7 @@ The previous default accumulator types could lead to situations in which unexpec #### Added -* Added the `-e` and `--precise-alu-exceptions` flags to enable precise ALU exceptions reporting on supported configurations. +* The `-e` and `--precise-alu-exceptions` flags to enable precise ALU exceptions reporting on supported configurations. ### **ROCr Runtime** (1.18.0) @@ -2210,11 +2203,11 @@ The previous default accumulator types could lead to situations in which unexpec #### Added * Additional unit tests for: binary_search, complex, c99math, catrig, ccosh, cexp, clog, csin, csqrt, and ctan. -* Added `test_param_fixtures.hpp` to store all the parameters for typed test suites. -* Added `test_real_assertions.hpp` to handle unit test assertions for real numbers. -* Added `test_imag_assertions.hpp` to handle unit test assertions for imaginary numbers. +* `test_param_fixtures.hpp` to store all the parameters for typed test suites. +* `test_real_assertions.hpp` to handle unit test assertions for real numbers. +* `test_imag_assertions.hpp` to handle unit test assertions for imaginary numbers. * `clang++` is now used to compile google benchmarks on Windows. -* Added gfx950 support. +* Support for gfx950. * Merged changes from upstream CCCL/thrust 2.6.0. #### Changed @@ -2313,12 +2306,12 @@ The previous default accumulator types could lead to situations in which unexpec #### Added -- Added support for gfx950. -- Added code object compression via bundling. -- Added support for non-default HIP SDK installations on Windows. -- Added master solution library documentation. -- Added compiler version-dependent assembler and architecture capabilities. -- Added documentation from GitHub Wiki to ROCm docs. +- Support for gfx950. +- Code object compression via bundling. +- Support for non-default HIP SDK installations on Windows. +- Master solution library documentation. +- Compiler version-dependent assembler and architecture capabilities. 
+- Documentation from GitHub Wiki to ROCm docs. #### Changed @@ -2332,9 +2325,9 @@ The previous default accumulator types could lead to situations in which unexpec #### Removed -- Removed support for the gfx940 and gfx941 targets. -- Removed unused tuning files. -- Removed disabled tests. +- Support for the gfx940 and gfx941 targets. +- Unused tuning files. +- Disabled tests. #### Resolved issues From 519364179cbaa8bc877820db2ee7766874f96c9e Mon Sep 17 00:00:00 2001 From: Pratik Basyal Date: Tue, 9 Sep 2025 11:40:26 -0400 Subject: [PATCH 42/58] Mono repo highlight and known issues feedback added (#532) * Mono repo highlight added * Leo's feedback incorporated * Minor wording change * Randy's feedback incorp * Update for upcoming change * Minor feedback added * Ram's feedback incorporated * Reworded for clarity * Minor update * Minor update --- CHANGELOG.md | 1 + RELEASE.md | 22 ++++++++++++++++------ 2 files changed, 17 insertions(+), 6 deletions(-) diff --git a/CHANGELOG.md b/CHANGELOG.md index e66d1062b..0ae63cfd5 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -1184,6 +1184,7 @@ See the full [ROCm SMI changelog](https://github.com/ROCm/rocm_smi_lib/blob/rele - Replaced ROCm SMI backend with AMD SMI backend for collecting GPU metrics. - ROCprofiler-SDK is now used to trace RCCL API and collect communication counters. + - Use the setting `ROCPROFSYS_USE_RCCLP = ON` to enable profiling and tracing of RCCL application data. - Updated the Dyninst submodule to v13.0. - Set the default value of `ROCPROFSYS_SAMPLING_CPUS` to `none`. diff --git a/RELEASE.md b/RELEASE.md index 6eec4e879..f5cf817bc 100644 --- a/RELEASE.md +++ b/RELEASE.md @@ -124,6 +124,15 @@ The Instinct Driver is now distributed separately from the ROCm software stack a [AMD SMI](https://github.com/ROCm/amdsmi) continues to stay with the ROCm software stack under the ROCm organization repository. 
+### Consolidation of ROCm library repositories + +The following ROCm library repositories are migrating from multiple repositories under {fab}`github` [ROCm](https://github.com/ROCm) to a single repository under {fab}`github` [rocm-libraries](https://github.com/ROCm/rocm-libraries) in the ROCm organization GitHub: [hipBLAS](https://github.com/ROCm/rocm-libraries/tree/develop/projects/hipblas), [hipBLASLt](https://github.com/ROCm/rocm-libraries/tree/develop/projects/hipblaslt) +, [hipCUB](https://github.com/ROCm/rocm-libraries/tree/develop/projects/hipcub), [hipFFT](https://github.com/ROCm/rocm-libraries/tree/develop/projects/hipfft), [hipRAND](https://github.com/ROCm/rocm-libraries/tree/develop/projects/hiprand), [hipSPARSE](https://github.com/ROCm/rocm-libraries/tree/develop/projects/hipsparse), [hipSPARSELt](https://github.com/ROCm/rocm-libraries/tree/develop/projects/hipsparselt), [MIOpen](https://github.com/ROCm/rocm-libraries/tree/develop/projects/miopen), [rocBLAS](https://github.com/ROCm/rocm-libraries/tree/develop/projects/rocblas), [rocFFT](https://github.com/ROCm/rocm-libraries/tree/develop/projects/rocfft), [rocPRIM](https://github.com/ROCm/rocm-libraries/tree/develop/projects/rocprim), [rocRAND](https://github.com/ROCm/rocm-libraries/tree/develop/projects/rocrand), [rocSPARSE](https://github.com/ROCm/rocm-libraries/tree/develop/projects/rocsparse), [rocThrust](https://github.com/ROCm/rocm-libraries/tree/develop/projects/rocthrust), and [Tensile](https://github.com/ROCm/rocm-libraries/tree/develop/shared/tensile). + +Use the new ROCm Libraries repository to access source code, clone projects, and contribute to the code base and documentation. This change helps to streamline development, CI, and integration. For more information about working with the ROCm Libraries repository, see [Contributing to the ROCm Libraries](https://github.com/ROCm/rocm-libraries/blob/develop/CONTRIBUTING.md) on GitHub.
+ +Other ROCm libraries are also being migrated, along with ROCm tools, to {fab}`github` [rocm-systems](https://github.com/ROCm/rocm-systems). For the latest status information, see the [README file](https://github.com/ROCm/rocm-systems/blob/develop/README.md). The official completion of the migration will be communicated in a future ROCm release. + ### HIP API compatibility improvements To improve code portability between AMD ROCm and other programming models, HIP API has been updated in ROCm 7.0 to simplify cross-platform programming. These changes are incompatible with prior ROCm releases and might require recompiling existing HIP applications for use with ROCm 7.0. For more information, see the [HIP API 7.0 changes](https://rocm.docs.amd.com/projects/HIP/en/docs-develop/hip-7-changes.html) and the [HIP changelog](#hip-7-0-0) below. @@ -1862,6 +1871,7 @@ See the full [ROCm SMI changelog](https://github.com/ROCm/rocm_smi_lib/blob/rele - Replaced ROCm SMI backend with AMD SMI backend for collecting GPU metrics. - ROCprofiler-SDK is now used to trace RCCL API and collect communication counters. + - Use the setting `ROCPROFSYS_USE_RCCLP = ON` to enable profiling and tracing of RCCL application data. - Updated the Dyninst submodule to v13.0. - Set the default value of `ROCPROFSYS_SAMPLING_CPUS` to `none`. @@ -2344,13 +2354,13 @@ The previous default accumulator types could lead to situations in which unexpec ROCm known issues are noted on {fab}`github` [GitHub](https://github.com/ROCm/ROCm/labels/Verified%20Issue). For known issues related to individual components, review the [Detailed component changes](#detailed-component-changes).
-### A memory error in the kernel might lead to applications using the ROCr library being unresponsive +### A memory error in the kernel might lead to applications using the ROCr library becoming unresponsive -Applications using the ROCr library may become unresponsive if a memory error occurs in the launched kernel when the queue from which it was launched is destroyed. The application is unable to receive further signal, resulting in the stall condition. The issue will be fixed in a future ROCm release. +Applications using the ROCr library might become unresponsive if a memory error occurs in the launched kernel when the queue from which it was launched is destroyed. The application is unable to receive further signals, resulting in a stall condition. The issue will be fixed in a future ROCm release. -### Applications using stream capture APIs may fail during stream capture +### Applications using stream capture APIs might fail during stream capture -Applications using ``hipLaunchHostFunc`` with stream capture APIs may fail to capture graphs during stream capture, and return `hipErrorStreamCaptureUnsupported`. This issue resulted from an update in ``hipStreamAddCallback``. This issue will be fixed in a future ROCm release. +Applications using ``hipLaunchHostFunc`` with stream capture APIs might fail to capture graphs during stream capture, and return `hipErrorStreamCaptureUnsupported`. This issue resulted from an update in ``hipStreamAddCallback``. This issue will be fixed in a future ROCm release. ### Compilation failure via hipRTC when compiling with std=c++11 @@ -2375,7 +2385,7 @@ individual components, review the [Detailed component changes](#detailed-compone ### Failure when using a generic target with compression and vice versa -An issue where compiling of a generic target with compression failing has been resolved in this release. This issue prevented you from compiling a generic target and using compression simultaneously.
 See [GitHub issue #4602](https://github.com/ROCm/ROCm/issues/4602). +An issue where compiling a generic target resulted in compression failing has been resolved in this release. This issue prevented you from compiling a generic target and using compression simultaneously. See [GitHub issue #4602](https://github.com/ROCm/ROCm/issues/4602). ### Limited support for Sparse API and Pallas functionality in JAX @@ -2383,7 +2393,7 @@ An issue where due to limited support for Sparse API in JAX, some of the functio ### Failure to use –kokkos-trace option in ROCm Compute Profiler -An issue where using of the ``--kokkos-trace`` option resulted in a difference between the output of the ``--kokkos-trace`` and the ``counter_collection.csv`` output file has been resolved. Due to this issue the program used to exit with a warning message if the ``-kokkos-trace`` option was detected in the ROCm Compute Profiler. This issue resulted due to the partial implementation of ``--kokkos-trace`` in ``rocprofv3`` tool. See [GitHub issue #4604](https://github.com/ROCm/ROCm/issues/4604). +An issue where using the ``--kokkos-trace`` option resulted in a difference between the ``--kokkos-trace`` output and the ``counter_collection.csv`` output file has been resolved. Due to this issue, the program exited with a warning message if the ``--kokkos-trace`` option was detected in the ROCm Compute Profiler. This issue was due to the partial implementation of ``--kokkos-trace`` in the ``rocprofv3`` tool. See [GitHub issue #4604](https://github.com/ROCm/ROCm/issues/4604).
## ROCm upcoming changes From aebf1b44809de47a8e99c409f166054053ce57f8 Mon Sep 17 00:00:00 2001 From: Peter Park Date: Wed, 10 Sep 2025 13:46:58 -0400 Subject: [PATCH 43/58] Update amdsmi changelog (#533) * update amdsmi cl * remove duplicated changelog entry * minor tweaks and add upcoming changes * update --- RELEASE.md | 29 ++++++++++++++++++----------- 1 file changed, 18 insertions(+), 11 deletions(-) diff --git a/RELEASE.md b/RELEASE.md index f5cf817bc..9a824b018 100644 --- a/RELEASE.md +++ b/RELEASE.md @@ -720,20 +720,19 @@ For a historical overview of ROCm component updates, see the {doc}`ROCm consolid - Increased available JPEG engines to 40. Current ASICs might not support all 40. These are indicated as `UINT16_MAX` or `N/A` in CLI. * Bad page threshold count. - - Added `amdsmi_get_gpu_bad_page_threshold` to Python API and CLI; root/sudo permissions required to display the count. + - Added `amdsmi_get_gpu_bad_page_threshold` to Python API and CLI; root/sudo permissions are required to display the count. * CPU model name for RDC. - Added new C and Python API `amdsmi_get_cpu_model_name`. - Not sourced from esmi library. -* Added `amdsmi_get_cpu_affinity_with_scope()`. +* New API `amdsmi_get_cpu_affinity_with_scope()`. * `socket power` to `amdsmi_get_power_info` - - Previously the C API had the value in the `amdsmi_power_info` structure, but was unused - - Now we populate the value in both C and Python APIs + - Previously, the C API had the value in the `amdsmi_power_info` structure, but was unused. - The value is representative of the socket's power agnostic of the the GPU version. -* New event notification types to `amdsmi_evt_notification_type_t`. +* New event notification types to `amdsmi_evt_notification_type_t`. 
The following values were added to the `amdsmi_evt_notification_type_t` enum: - `AMDSMI_EVT_NOTIF_EVENT_MIGRATE_START` - `AMDSMI_EVT_NOTIF_EVENT_MIGRATE_END` @@ -745,7 +744,7 @@ For a historical overview of ROCm component updates, see the {doc}`ROCm consolid - `AMDSMI_EVT_NOTIF_PROCESS_START` - `AMDSMI_EVT_NOTIF_PROCESS_END` -- Power cap to `amd-smi monitor`. +- Power cap to `amd-smi monitor`. - `amd-smi monitor -p` will display the power cap along with power. #### Changed @@ -753,7 +752,7 @@ For a historical overview of ROCm component updates, see the {doc}`ROCm consolid * Separated driver reload functionality from `amdsmi_set_gpu_memory_partition()` and `amdsmi_set_gpu_memory_partition_mode()` APIs -- and from the CLI `amd-smi set -M `. -* Disabled `amd-smi monitor --violation` on guest. Modified `amd-smi metric --throttle` to alias to `amd-smi metric --violation`. +* Disabled `amd-smi monitor --violation` on guests. Modified `amd-smi metric -T/--throttle` to alias to `amd-smi metric -v/--violation`. * Updated `amdsmi_get_clock_info` in `amdsmi_interface.py`. - The `clk_deep_sleep` field now returns the sleep integer value. @@ -774,13 +773,17 @@ For a historical overview of ROCm component updates, see the {doc}`ROCm consolid - `acc_low_utilization`, `per_low_utilization`, `active_low_utilization` - Python API and CLI now report these expanded fields. -* The char arrays in the following structures have been changed. +* The char arrays in the following structures have been changed. - `amdsmi_vbios_info_t` member `build_date` changed from `AMDSMI_MAX_DATE_LENGTH` to `AMDSMI_MAX_STRING_LENGTH`. - `amdsmi_dpm_policy_entry_t` member `policy_description` changed from `AMDSMI_MAX_NAME` to `AMDSMI_MAX_STRING_LENGTH`. - `amdsmi_name_value_t` member `name` changed from `AMDSMI_MAX_NAME` to `AMDSMI_MAX_STRING_LENGTH`. * For backwards compatibility, updated `amdsmi_bdf_t` union to have an identical unnamed struct. 
+* Updated `amdsmi_get_temp_metric` and `amdsmi_temperature_type_t` with new values. + - Added new values to `amdsmi_temperature_type_t` representing various baseboard and GPU board temperature measures. + - Updated `amdsmi_get_temp_metric` API to be able to take in and return the respective values for the new temperature types. + #### Removed - Unnecessary API, `amdsmi_free_name_value_pairs()` @@ -793,9 +796,9 @@ For a historical overview of ROCm component updates, see the {doc}`ROCm consolid - Unused member `year` in struct `amdsmi_version_t`. -- `amdsmi_io_link_type_t` and replaced with `amdsmi_link_type_t`. +- `amdsmi_io_link_type_t` has been replaced with `amdsmi_link_type_t`. - `amdsmi_io_link_type_t` is no longer needed as `amdsmi_link_type_t` is sufficient. - - `amdsmi_link_type_t` enum has changed. + - `amdsmi_link_type_t` enum has changed; primarily, the ordering of the PCI and XGMI types. - This change will also affect `amdsmi_link_metrics_t`, where the link_type field changes from `amdsmi_io_link_type_t` to `amdsmi_link_type_t`. - `amdsmi_get_power_info_v2()`. @@ -820,7 +823,7 @@ For a historical overview of ROCm component updates, see the {doc}`ROCm consolid - Removed partition information from the default `amd-smi static` CLI command. - Users can still retrieve the same data by calling `amd-smi`, `amd-smi static -p`, or `amd-smi partition -c -m`/`sudo amd-smi partition -a`. - - Reading ``current_compute_partition`` may momentarily wake the GPU up. This is due to reading XCD registers, which is expected behavior. Changing partitions is not a trivial operation, `current_compute_partition` SYSFS controls this action. + - Reading `current_compute_partition` may momentarily wake the GPU up. This is due to reading XCD registers, which is expected behavior. Changing partitions is not a trivial operation, `current_compute_partition` SYSFS controls this action. - Optimized CLI command `amd-smi topology` in partition mode. 
- Reduced the number of `amdsmi_topo_get_p2p_status` API calls to one fourth. @@ -831,6 +834,10 @@ For a historical overview of ROCm component updates, see the {doc}`ROCm consolid - Fixed `amd-smi monitor` decoder utilization (`DEC%`) not showing up on MI300 series ASICs. +#### Known issues + +- `amd-smi monitor` on Linux Guest systems triggers an attribute error. + ```{note} See the full [AMD SMI changelog](https://github.com/ROCm/amdsmi/blob/release/rocm-rel-7.0/CHANGELOG.md) for details, examples, and in-depth descriptions. ``` From e3227d14e6e7170a3f548f7b81a79824803bcca1 Mon Sep 17 00:00:00 2001 From: Peter Park Date: Wed, 10 Sep 2025 16:45:35 -0400 Subject: [PATCH 44/58] 7.0.0 release notes: Add highlight for training/inference benchmark docker docs (#538) * add highlight for training/inference benchmark docker docs * update update blurb double word Update RELEASE.md Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com> Update RELEASE.md Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com> Update RELEASE.md Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com> update wording --- RELEASE.md | 22 ++++++++++++++++++++++ 1 file changed, 22 insertions(+) diff --git a/RELEASE.md b/RELEASE.md index 9a824b018..511f61dc8 100644 --- a/RELEASE.md +++ b/RELEASE.md @@ -328,6 +328,28 @@ For more information, see [ROCm Runfile Installer](https://rocm.docs.amd.com/pro ROCm documentation continues to be updated to provide clearer and more comprehensive guidance for a wider variety of user needs and use cases. +* The ROCm AI [training](https://rocm.docs.amd.com/en/latest/how-to/rocm-for-ai/training/index.html) and + [inference](https://rocm.docs.amd.com/en/latest/how-to/rocm-for-ai/inference/index.html) + benchmarking guides have been updated with expanded model coverage and + optimized Docker environments. 
Highlights include: + + * The [Training a model with Primus and Megatron](https://rocm.docs.amd.com/en/latest/how-to/rocm-for-ai/training/benchmark-docker/primus-megatron.html) benchmarking guide + now leverages the unified AMD Primus framework with the Megatron backend. See [Primus: A Lightweight, Unified Training Framework for Large Models on AMD + GPUs](https://rocm.blogs.amd.com/software-tools-optimization/primus/README.html) for an introduction to Primus. + + * The [Training a model with PyTorch](https://rocm.docs.amd.com/en/latest/how-to/rocm-for-ai/training/benchmark-docker/pytorch-training.html) benchmarking guide + now includes fine-tuning for OpenAI GPT OSS and Qwen models. It also includes a multi-node training example. + + * The [Training a model with JAX MaxText](https://rocm.docs.amd.com/en/latest/how-to/rocm-for-ai/training/benchmark-docker/pytorch-training.html) benchmarking guide + now supports [MAD](https://github.com/ROCm/MAD)-integrated benchmarking. The MaxText training environment now uses JAX 0.6.0 or 0.5.0. FP8 quantized training is supported with JAX 0.5.0. + + * The [vLLM inference performance testing](https://rocm.docs.amd.com/en/latest/how-to/rocm-for-ai/inference/benchmark-docker/vllm.html) documentation + now features clearer serving and throughput benchmarking commands -- for improved transparency of model benchmarking configurations. The vLLM inference + environment now uses vLLM 0.10.1 and includes improved default configurations. + + These training and inference resources will continue to grow with ongoing improvements and expanded model coverage. + For a searchable view of supported frameworks and models, see [AMD Infinity Hub](https://www.amd.com/en/developer/resources/infinity-hub.html). 
+ * [Tutorials for AI developers](https://rocm.docs.amd.com/projects/ai-developer-hub/en/latest/) have been expanded with the following new inference tutorial: [PD disaggregation with SGLang](https://rocm.docs.amd.com/projects/ai-developer-hub/en/latest/notebooks/inference/SGlang_PD_Disagg_On_AMD_GPU.html) In addition, the [AI agent with MCPs using vLLM and PydanticAI](https://rocm.docs.amd.com/projects/ai-developer-hub/en/latest/notebooks/inference/build_airbnb_agent_mcp.html) tutorial has been updated. For more information about the changes, see [Changelog for the AI Developer Hub](https://rocm.docs.amd.com/projects/ai-developer-hub/en/latest/changelog.html). From 8eee15558504ee60865ea4454c6a6f853a89d78c Mon Sep 17 00:00:00 2001 From: Peter Park Date: Thu, 11 Sep 2025 14:48:47 -0400 Subject: [PATCH 45/58] Mockup: List some bullets horizontally (#539) * list horizontally * make it 2 cols * use grid * margin - * update margins --- RELEASE.md | 50 +++++++++++++++++++++++++++++++++++++------------- 1 file changed, 37 insertions(+), 13 deletions(-) diff --git a/RELEASE.md b/RELEASE.md index 511f61dc8..bddb3f3c2 100644 --- a/RELEASE.md +++ b/RELEASE.md @@ -358,25 +358,49 @@ ROCm documentation continues to be updated to provide clearer and more comprehen * ROCm Math libraries support a wide range of data types, enabling optimized performance across various precision requirements. The following Math libraries are now updated with new precision content. 
For more information, click the Math library’s link: - * [hipBLAS](https://rocm.docs.amd.com/projects/hipBLAS/en/develop/reference/data-type-support.html) - * [hipBLASLt](https://rocm.docs.amd.com/projects/hipBLASLt/en/develop/reference/data-type-support.html) - * [hipSPARSE](https://rocm.docs.amd.com/projects/hipSPARSE/en/develop/reference/precision.html) - * [rocSPARSE](https://rocm.docs.amd.com/projects/rocSPARSE/en/develop/reference/precision.html) - * [Tensile](https://rocm.docs.amd.com/projects/Tensile/en/develop/src/reference/precision-support.html#precision-support) + ::::{grid} 2 + :margin: auto 0 auto auto + :::{grid} + :margin: auto 0 auto auto + * [hipBLAS](https://rocm.docs.amd.com/projects/hipBLAS/en/develop/reference/data-type-support.html) + * [hipBLASLt](https://rocm.docs.amd.com/projects/hipBLASLt/en/develop/reference/data-type-support.html) + * [hipSPARSE](https://rocm.docs.amd.com/projects/hipSPARSE/en/develop/reference/precision.html) + ::: + :::{grid} + :margin: auto 0 auto auto + * [rocSPARSE](https://rocm.docs.amd.com/projects/rocSPARSE/en/develop/reference/precision.html) + * [Tensile](https://rocm.docs.amd.com/projects/Tensile/en/develop/src/reference/precision-support.html#precision-support) + ::: + :::: * ROCm offers a comprehensive ecosystem for deep learning development, featuring libraries optimized for deep learning operations and ROCm-aware versions of popular deep learning frameworks and libraries. 
The following deep learning frameworks' content now includes release notes and known issues: - * [PyTorch](https://rocm.docs.amd.com/en/latest/compatibility/ml-compatibility/pytorch-compatibility.html) - * [JAX](https://rocm.docs.amd.com/en/latest/compatibility/ml-compatibility/jax-compatibility.html) + ::::{grid} 1 + :margin: auto 0 auto auto + :::{grid-item} + :margin: auto 0 auto auto + * [PyTorch](https://rocm.docs.amd.com/en/latest/compatibility/ml-compatibility/pytorch-compatibility.html) + * [JAX](https://rocm.docs.amd.com/en/latest/compatibility/ml-compatibility/jax-compatibility.html) + ::: + :::: * ROCm components support a wide range of environment variables that can be used for testing, logging, debugging, experimental features, and more. The following components have been updated with new environment variable content. For more information, click the component’s link: - * [hipBLASLt](https://rocm.docs.amd.com/projects/hipBLASLt/en/develop/reference/env-variables.html) - * [hipSPARSELt](https://rocm.docs.amd.com/projects/hipSPARSELt/en/develop/reference/env-variables.html) - * [ROCm Performance Primitives (RPP)](https://rocm.docs.amd.com/projects/rpp/en/develop/reference/rpp-env-variables.html) - * [rocSOLVER](https://rocm.docs.amd.com/projects/rocSOLVER/en/develop/reference/env_variables.html) - * [rocSPARSE](https://rocm.docs.amd.com/projects/rocSPARSE/en/develop/reference/env_variables.html) - * [Tensile](https://rocm.docs.amd.com/projects/Tensile/en/develop/src/reference/environment-variables.html) + ::::{grid} 2 + :margin: auto 0 auto auto + :::{grid-item} + :margin: auto 0 auto auto + * [hipBLASLt](https://rocm.docs.amd.com/projects/hipBLASLt/en/develop/reference/env-variables.html) + * [hipSPARSELt](https://rocm.docs.amd.com/projects/hipSPARSELt/en/develop/reference/env-variables.html) + * [ROCm Performance Primitives (RPP)](https://rocm.docs.amd.com/projects/rpp/en/develop/reference/rpp-env-variables.html) + ::: + :::{grid-item} + :margin: auto 0 
auto auto + * [rocSOLVER](https://rocm.docs.amd.com/projects/rocSOLVER/en/develop/reference/env_variables.html) + * [rocSPARSE](https://rocm.docs.amd.com/projects/rocSPARSE/en/develop/reference/env_variables.html) + * [Tensile](https://rocm.docs.amd.com/projects/Tensile/en/develop/src/reference/environment-variables.html) + ::: + :::: * Modern computing tasks often require balancing numerical precision against hardware resources and processing speed. Low precision floating point number formats in HIP include `FP4` (4-bit) and `FP6` (6-bit), which reduce memory and bandwidth requirements. For more information, see the updated [Low precision floating point types](https://rocm.docs.amd.com/projects/HIP/en/docs-develop/reference/low_fp_types.html) topic. From e1a1a4e71238487c547f8c4ff67bba6edd8a86bf Mon Sep 17 00:00:00 2001 From: randyh62 <42045079+randyh62@users.noreply.github.com> Date: Thu, 11 Sep 2025 14:46:12 -0700 Subject: [PATCH 46/58] Update RELEASE.md (#540) * Update RELEASE.md Added per Julia * Update CHANGELOG.md change added to Changelog.md as well --- CHANGELOG.md | 1 + RELEASE.md | 1 + 2 files changed, 2 insertions(+) diff --git a/CHANGELOG.md b/CHANGELOG.md index 0ae63cfd5..0a8b23ad3 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -227,6 +227,7 @@ In order to match the CUDA specification, the `warpSize` variable is no longer ` - `hipPointerGetAttributes` returns `hipSuccess` instead of an error with invalid value `hipErrorInvalidValue`, in case `NULL` host or attribute pointer is passed as input parameter. It now matches the functionality of `cudaPointerGetAttributes` which changed with CUDA 11 and above releases. - `hipFree` previously there was an implicit wait which was applicable for all memory allocations, for synchronization purpose. This wait is now disabled for allocations made with `hipMallocAsync` and `hipMallocFromPoolAsync`, to match the behavior of CUDA API `cudaFree`. 
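As a rough illustration of why 4-bit and 6-bit formats reduce memory and bandwidth requirements, the sketch below decodes the OCP MX `FP4` (E2M1) bit layout in plain Python and compares packed storage sizes. It is not HIP code and does not use any ROCm API. The helper names `decode_fp4_e2m1` and `storage_bytes` are hypothetical, and the size estimate assumes densely packed elements while ignoring the shared scale factors that MX formats carry per block.

```python
def decode_fp4_e2m1(nibble: int) -> float:
    """Decode one 4-bit E2M1 value: 1 sign bit, 2 exponent bits, 1 mantissa bit, bias 1."""
    if not 0 <= nibble <= 0xF:
        raise ValueError("expected a 4-bit value in the range 0..15")
    sign = -1.0 if nibble & 0b1000 else 1.0
    exponent = (nibble >> 1) & 0b11
    mantissa = nibble & 0b1
    if exponent == 0:
        # Subnormal range: no implicit leading one, so only 0 and 0.5 exist here.
        magnitude = mantissa * 0.5
    else:
        # Normal range: implicit leading one, exponent bias of 1.
        magnitude = (1.0 + mantissa * 0.5) * 2.0 ** (exponent - 1)
    return sign * magnitude


def storage_bytes(num_elements: int, bits_per_element: int) -> int:
    """Bytes needed for densely packed elements, rounded up (illustrative only)."""
    return (num_elements * bits_per_element + 7) // 8


if __name__ == "__main__":
    # All non-negative FP4 (E2M1) values, in encoding order.
    print([decode_fp4_e2m1(code) for code in range(8)])
    # Packed storage for 1024 elements at 4, 6, and 16 bits per element.
    for bits in (4, 6, 16):
        print(bits, storage_bytes(1024, bits))
```

The small set of representable values is the trade-off: each element costs a quarter of a 16-bit value, at the price of very coarse precision.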
- `hipFreeAsync` now returns `hipSuccess` when the input pointer is NULL, instead of ` hipErrorInvalidValue` , to be consistent with `hipFree`. + - Exceptions occurring during a kernel execution will not abort the process anymore but will return an error unless core dump is enabled. * Changes in hipRTC. - Removal of `hipRTC` symbols from HIP Runtime Library. Any application using `hipRTC` APIs should link explicitly with the `hipRTC` library. This makes the usage of `hipRTC` library on Linux the same as on Windows and matches the behavior of CUDA `nvRTC`. diff --git a/RELEASE.md b/RELEASE.md index bddb3f3c2..9dbf8a6b8 100644 --- a/RELEASE.md +++ b/RELEASE.md @@ -967,6 +967,7 @@ In order to match the CUDA specification, the `warpSize` variable is no longer ` - `hipPointerGetAttributes` returns `hipSuccess` instead of an error with invalid value `hipErrorInvalidValue`, in case `NULL` host or attribute pointer is passed as input parameter. It now matches the functionality of `cudaPointerGetAttributes` which changed with CUDA 11 and above releases. - `hipFree` previously there was an implicit wait which was applicable for all memory allocations, for synchronization purpose. This wait is now disabled for allocations made with `hipMallocAsync` and `hipMallocFromPoolAsync`, to match the behavior of CUDA API `cudaFree`. - `hipFreeAsync` now returns `hipSuccess` when the input pointer is NULL, instead of ` hipErrorInvalidValue` , to be consistent with `hipFree`. + - Exceptions occurring during a kernel execution will not abort the process anymore but will return an error unless core dump is enabled. * Changes in hipRTC. - Removal of `hipRTC` symbols from HIP Runtime Library. Any application using `hipRTC` APIs should link explicitly with the `hipRTC` library. This makes the usage of `hipRTC` library on Linux the same as on Windows and matches the behavior of CUDA `nvRTC`. 
From e805e987019627d70067a07b5e98af1c4a0a0d66 Mon Sep 17 00:00:00 2001 From: Adel Johar Date: Sat, 13 Sep 2025 11:56:58 +0200 Subject: [PATCH 47/58] Add key features and known issue for ROCm 7.0 (#421) Co-authored-by: Istvan Kiss --- .../pytorch-compatibility.rst | 72 ++++++++++++++++++- 1 file changed, 71 insertions(+), 1 deletion(-) diff --git a/docs/compatibility/ml-compatibility/pytorch-compatibility.rst b/docs/compatibility/ml-compatibility/pytorch-compatibility.rst index cd325c8c5..243afb022 100644 --- a/docs/compatibility/ml-compatibility/pytorch-compatibility.rst +++ b/docs/compatibility/ml-compatibility/pytorch-compatibility.rst @@ -366,7 +366,8 @@ feature set available to developers. Supported modules and data types ================================================================================ -The following section outlines the supported data types, modules, and domain libraries available in PyTorch on ROCm. +The following section outlines the supported data types, modules, and domain +libraries available in PyTorch on ROCm. Supported data types -------------------------------------------------------------------------------- @@ -533,3 +534,72 @@ with ROCm. dispatching. **Note:** Only official release exists. + +Key features and enhancements for PyTorch 2.7 with ROCm 7.0 +================================================================================ + +- Enhanced TunableOp framework: Introduces ``tensorfloat32`` support for + TunableOp operations, improved offline tuning for ScaledGEMM operations, + submatrix offline tuning capabilities, and better logging for BLAS operations + without bias vectors. + +- Expanded GPU architecture support: Provides optimized support for newer GPU + architectures, including gfx1200 and gfx1201 with preferred hipBLASLt backend + selection, along with improvements for gfx950 and gfx1100 series GPUs. 
+
+- Advanced Triton integration: AOTriton 0.10b introduces official support for
+  gfx950 and gfx1201, along with experimental support for gfx1101, gfx1151,
+  gfx1150, and gfx1200.
+
+- Improved element-wise kernel performance: Delivers enhanced vectorized
+  element-wise kernels with better support for heterogeneous tensor types and
+  optimized input vectorization for tensors with mixed data types.
+
+- MIOpen deep learning optimizations: Enables NHWC BatchNorm by default on
+  ROCm 7.0+, provides ``maxpool`` forward and backward performance improvements
+  targeting ResNet scenarios, and includes updated launch configurations for
+  better performance.
+
+- Enhanced memory and tensor operations: Features fixes for in-place ``aten``
+  sum operations with specialized templated kernels, improved 3D tensor
+  performance with NHWC format, and better handling of memory-bound matrix
+  multiplication operations.
+
+- Robust testing and quality improvements: Includes comprehensive test suite
+  updates with improved tolerance handling for Navi3x architectures, generalized
+  ROCm-specific test conditions, and enhanced unit test coverage for Flash
+  Attention and Memory Efficient operations.
+
+- Build system and infrastructure improvements: Provides updated CentOS Stream 9
+  support, improved Docker configuration, migration to the public MAGMA
+  repository, and enhanced QA automation scripts for PyTorch unit testing.
+
+- Composable Kernel (CK) updates: Features updated CK submodule integration with
+  the latest optimizations and performance improvements for core mathematical
+  operations.
+
+- Development and debugging enhancements: Includes improved source handling for
+  dynamic compilation, better error handling for atomic operations, and enhanced
+  state checking for trace operations.
+
+- APEX fused layer normalization has been integrated, which can have a positive
+  impact on text-to-video models.
+
+- APEX distributed fused LAMB and distributed fused ADAM have been integrated,
+  which can have a positive impact on BERT-L and Llama2-SFT.
+
+- FlashAttention v3 has been integrated for AMD GPUs.
+
+- `PyTorch C++ extensions `_
+  provide a mechanism for compiling custom operations that can be used during
+  network training or inference. For AMD platforms, ``amdclang++`` has been
+  validated as the supported compiler for building these extensions.
+
+Known issues and notes for PyTorch 2.7 with ROCm 7.0
+================================================================================
+
+- The ``matmul.allow_fp16_reduced_precision_reduction`` and
+  ``matmul.allow_bf16_reduced_precision_reduction`` options under
+  ``torch.backends.cuda`` are not supported. As a result,
+  reduced-precision reductions using FP16 or BF16 accumulation types are not
+  available.
From 2de5a33aecda398330b0ff23bdadecad93fb64c9 Mon Sep 17 00:00:00 2001 From: Pratik Basyal Date: Sun, 14 Sep 2025 01:58:32 -0400 Subject: [PATCH 48/58] User space and firmware content added 700 (#542) * User space and firmware content added * New updates added * BKC dep added --- RELEASE.md | 190 +++++++++++++++++++++++++++++++++++++++++++++-------- 1 file changed, 161 insertions(+), 29 deletions(-) diff --git a/RELEASE.md b/RELEASE.md index 9dbf8a6b8..214f0f684 100644 --- a/RELEASE.md +++ b/RELEASE.md @@ -18,6 +18,8 @@ The release notes provide a summary of notable changes since the previous ROCm r - [Operating system, hardware, and virtualization support changes](#operating-system-hardware-and-virtualization-support-changes) +- [User space, driver, and firmware dependent changes](#user-space-driver-and-firmware-dependent-changes) + - [ROCm components versioning](#rocm-components) - [Detailed component changes](#detailed-component-changes) @@ -57,9 +59,9 @@ for more information about operating system and hardware compatibility.
#### Virtualization support -ROCm 7.0 introduces support for KVM Passthrough for AMD Instinct MI350X and MI355X accelerators. +ROCm 7.0 introduces support for KVM Passthrough for AMD Instinct MI350X and MI355X GPUs. -All KVM-based SR-IOV supported configurations require the GIM SR-IOV driver version 8.4.0.K. In addition, support for VMware ESXi 8 has been introduced for select AMD accelerators. For more information, see [Virtualization Support](https://rocm.docs.amd.com/projects/install-on-linux-internal/en/latest/reference/system-requirements.html#virtualization-support). +All KVM-based SR-IOV supported configurations require the GIM SR-IOV driver version 8.4.0.K. In addition, support for VMware ESXi 8 has been introduced for select AMD GPUs. For more information, see [Virtualization Support](https://rocm.docs.amd.com/projects/install-on-linux-internal/en/latest/reference/system-requirements.html#virtualization-support). ### Deep learning and AI framework updates @@ -190,12 +192,12 @@ Key compiler enhancements include: #### New data type support -MX-compliant data types bring microscaling support to ROCm. For more information, see the [OCP Microscaling (MX) Formats Specification](https://www.opencompute.org/documents/ocp-microscaling-formats-mx-v1-0-spec-final-pdf). ROCm 7.0 enables functional support for MX data types `FP4`, `FP6`, and `FP8` on AMD Instinct MI350 series accelerators in these ROCm libraries: +MX-compliant data types bring microscaling support to ROCm. For more information, see the [OCP Microscaling (MX) Formats Specification](https://www.opencompute.org/documents/ocp-microscaling-formats-mx-v1-0-spec-final-pdf). 
ROCm 7.0 enables functional support for MX data types `FP4`, `FP6`, and `FP8` on AMD Instinct MI350 series GPUs in these ROCm libraries: * Composable Kernel (`FP4`, `FP6`, and `FP8` only) * hipBLASLt -The following libraries are updated to support the Open Compute Project (OCP) floating-point `FP8` format on MI350 series accelerators instead of the NANOO `FP8` format: +The following libraries are updated to support the Open Compute Project (OCP) floating-point `FP8` format on MI350 series GPUs instead of the NANOO `FP8` format: * Composable Kernel * hipBLASLt @@ -213,7 +215,7 @@ For more information about hipBLASLt changes, see the [hipBLASLt changelog](#hip #### MIGraphX improvements -* Support for OCP `FP8` on AMD Instinct MI350X and MI355X accelerators. +* Support for OCP `FP8` on AMD Instinct MI350X and MI355X GPUs. * Support for PyTorch 2.7 via Torch-MIGraphX. * Improved performance of Generative AI models. * Added additional MSFT Contrib Operators for improved ONNX Runtime Experience. @@ -250,8 +252,8 @@ ROCm Compute Profiler includes the following key changes: * Support for AMD Instinct MI355X and MI350X with addition of performance counters: CPC, SPI, SQ, TA/TD/TCP, and TCC. * Roofline enhancement added for AMD Instinct MI350 series. * Improved support for Selective Kernel profiling. -* Program Counter (PC) sampling (Software-based) feature has been enabled for AMD Instinct MI200, MI300X, MI350X, and MI355X accelerators. This feature helps in GPU profiling to understand code execution patterns and hotspots during GPU kernel execution. For more details, see [Using PC sampling in ROCm Compute Profiler](https://rocm.docs.amd.com/projects/rocprofiler-compute/en/amd-staging/how-to/pc_sampling.html). -* Program Counter (PC) sampling (Hardware-based, Stochastic) feature has been enabled for AMD Instinct MI300X, MI350, and MI355X accelerators. 
+* Program Counter (PC) sampling (Software-based) feature has been enabled for AMD Instinct MI200, MI300X, MI350X, and MI355X GPUs. This feature helps in GPU profiling to understand code execution patterns and hotspots during GPU kernel execution. For more details, see [Using PC sampling in ROCm Compute Profiler](https://rocm.docs.amd.com/projects/rocprofiler-compute/en/amd-staging/how-to/pc_sampling.html). +* Program Counter (PC) sampling (Hardware-based, Stochastic) feature has been enabled for AMD Instinct MI300X, MI350, and MI355X GPUs. * Docker files has been added to package the application and dependencies into a single portable and executable standalone binary file. See the [ROCm Compute Profiler changelog](#rocm-compute-profiler-3-2-3) for more details. @@ -271,7 +273,7 @@ ROCm Systems Profiler includes the following key changes: See the [ROCm Systems Profiler changelog](#rocm-systems-profiler-1-1-0) for more details. #### ROCm Validation Suite -In ROCm 7.0, ROCm Validation Suite includes support for the AMD Instinct MI355X and MI350X accelerators in the IET (Integrated Execution Test), GST (GPU Stress Test), and Babel (memory bandwidth test) modules. +In ROCm 7.0, ROCm Validation Suite includes support for the AMD Instinct MI355X and MI350X GPUs in the IET (Integrated Execution Test), GST (GPU Stress Test), and Babel (memory bandwidth test) modules. See the [ROCm Validation Suite changelog](#rocm-validation-suite-1-2-0) for more details. @@ -280,8 +282,8 @@ See the [ROCm Validation Suite changelog](#rocm-validation-suite-1-2-0) for more ##### Core SDK enhancements * ROCprofiler-SDK is now compatible with the HIP 7.0 API. -* ROCprofiler-SDK adds support for AMD Instinct MI350X and MI355X accelerators. -* The stochastic and host-trap PC sampling support has been added for all AMD Instinct MI300 and MI350 series accelerators, which +* ROCprofiler-SDK adds support for AMD Instinct MI350X and MI355X GPUs. 
+* The stochastic and host-trap PC sampling support has been added for all AMD Instinct MI300 and MI350 series GPUs, which provides information particularly useful for understanding stalls during kernel execution. * The added support for tracing events surfaced by AMD's Kernel Fusion Driver (KFD) captures low-level driver routines involved in mapping, invalidation, and migration of data between CPU and GPU memories. Such events are central to the support for [Unified Memory](https://rocm.docs.amd.com/projects/HIP/en/latest/how-to/hip_runtime_api/memory_management/unified_memory.html) on AMD systems. Tracing of KFD events helps to detect performance problems arising from excessive data migration. * New APIs are added for profiling applications using thread traces (beta) @@ -295,7 +297,7 @@ efficient foundation for analysis and post-processing. ##### rocprofv3 CLI tool enhancements -* Added stochastic and host-trap PC sampling support for all AMD Instinct MI300 and MI350 series accelerators. +* Added stochastic and host-trap PC sampling support for all AMD Instinct MI300 and MI350 series GPUs. * HIP streams translate to Queues in Time Traces in Perfetto output. * Support for thread trace service. @@ -404,6 +406,136 @@ ROCm documentation continues to be updated to provide clearer and more comprehen * Modern computing tasks often require balancing numerical precision against hardware resources and processing speed. Low precision floating point number formats in HIP include `FP4` (4-bit) and `FP6` (6-bit), which reduce memory and bandwidth requirements. For more information, see the updated [Low precision floating point types](https://rocm.docs.amd.com/projects/HIP/en/docs-develop/reference/low_fp_types.html) topic. 
+## User space, driver, and firmware dependent changes
+
+GPU software for AMD datacenter GPU products requires you to maintain a hardware and software stack with interdependencies between the GPU and baseboard firmware, AMD GPU drivers, and the ROCm user space software. Starting with the ROCm 7.0 release, these interdependencies are publicly documented. Note that while AMD publishes drivers and ROCm user space, your server or infrastructure provider publishes the GPU and baseboard firmware by bundling AMD’s firmware releases via AMD's Platform Level Data Model (PLDM) bundle (Firmware), which includes the Integrated Firmware Image (IFWI).
+
+The GPU and baseboard firmware release numbering may vary by GPU family. Note that ROCm 7.0 is the first release where the AMD GPU driver is versioned independently of ROCm.
+
+| ROCm Version | GPU | PLDM Bundle (Firmware) | AMD GPU Driver | AMD GPU Virtualization Driver (GIM) |
+|---|---|---|---|---|
+| ROCm 7.0.0 | MI355X | 01.25.13.04 (or later)<br>01.25.11.02 | 30.10 | 8.4.0.K |
+| | MI350X | 01.25.13.04 (or later)<br>01.25.11.02 | 30.10 | |
+| | MI325X | 01.25.04.00 (or later)<br>01.25.03.03 | 30.10<br>6.4.z where z (0–3)<br>6.3.y where y (1–3) | |
+| | MI300X | 01.25.03.02 (or later) | 30.10<br>6.4.z where z (0–3)<br>6.3.y where y (0–3)<br>6.2.x where x (1–4) | 8.4.0.K |
+| | MI300A | 26 (or later) | | Not Applicable |
+| | MI250X | IFWI 47 (or later) | | |
+| | MI250 | MU5 w/ IFWI 75 (or later) | | |
+| | MI210 | MU5 w/ IFWI 75 | | 8.4.0.K |
+| | MI100 | VBIOS D3430401-037 | | Not Applicable |
+
+### New feature details
+
+#### AMD SMI changes dependent on PLDM bundles
+
+New APIs introduced in AMD SMI for ROCm 7.0 provide additional data for AMD Instinct products. To support these features, the following firmware is required for each GPU:
+
+* AMD Instinct MI355X - PLDM bundle 01.25.13.04
+
+* AMD Instinct MI350X - PLDM bundle 01.25.13.04
+
+* AMD Instinct MI325X - PLDM bundle 01.25.04.00
+
+* AMD Instinct MI300X - PLDM bundle 01.25.03.12
+
+If ROCm 7.0 is installed on a system with an earlier version of the PLDM bundle (firmware), the new APIs return `N/A` to indicate that these items are not supported.
+
+#### Enhanced temperature telemetry introduced in AMD SMI for MI355X and MI350X GPUs
+
+AMD SMI in ROCm 7.0 supports enhanced temperature metrics and temperature anomaly detection for AMD Instinct MI350X and MI355X GPUs when paired with PLDM bundle 01.25.13.04.
+
+For more information on these features, see the [AMD SMI changelog](https://github.com/ROCm/amdsmi/blob/release/rocm-rel-7.0/CHANGELOG.md).
+
+#### KVM SR-IOV virtualization changes dependent on open source AMD GPU Virtualization Driver (GIM)
+
+KVM SR-IOV support for all Instinct GPUs requires the open source AMD GPU Virtualization Driver (GIM) 8.4.0.K. For detailed support information, see [virtualization support](https://rocm.docs.amd.com/projects/install-on-linux-internal/en/latest/reference/system-requirements.html#virtualization-support) and the [GIM release notes](https://github.com/amd/MxGPU-Virtualization/releases).
+
+#### GPU partitioning support for AMD Instinct MI355X and MI350X GPUs
+
+NPS2 and DPX partitioning on bare metal is enabled on AMD Instinct MI355X and MI350X GPUs in ROCm 7.0 when paired with PLDM bundle 01.25.13.04.
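
The PLDM bundle minimums listed under "AMD SMI changes dependent on PLDM bundles" can be turned into a simple pre-flight check. The sketch below is only an illustration: the `MIN_PLDM_BUNDLE` table copies the four versions from that list, while the helper names and the assumption that bundle versions order field by field as dotted integers are hypothetical and not taken from AMD tooling. How the installed bundle version is obtained on a real system is outside the scope of this example.

```python
# Minimum PLDM bundle per GPU, copied from the AMD SMI list in this section.
MIN_PLDM_BUNDLE = {
    "MI355X": "01.25.13.04",
    "MI350X": "01.25.13.04",
    "MI325X": "01.25.04.00",
    "MI300X": "01.25.03.12",
}


def parse_bundle_version(version: str) -> tuple:
    """Split a dotted version such as '01.25.13.04' into a tuple of integers."""
    return tuple(int(part) for part in version.split("."))


def meets_minimum_bundle(gpu: str, installed_bundle: str) -> bool:
    """Return True if the installed bundle is at least the listed minimum.

    Assumes versions compare field by field as integers (an assumption of
    this sketch). Raises KeyError for GPUs that the list does not mention.
    """
    minimum = MIN_PLDM_BUNDLE[gpu.upper()]
    return parse_bundle_version(installed_bundle) >= parse_bundle_version(minimum)


if __name__ == "__main__":
    print(meets_minimum_bundle("MI355X", "01.25.13.04"))
    print(meets_minimum_bundle("MI300X", "01.25.03.02"))
```

Comparing integer tuples rather than raw strings avoids mis-ordering fields of different widths, for example `"4"` against `"13"`.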
+ ## ROCm components The following table lists the versions of ROCm components for ROCm 7.0.0, including any version @@ -1108,7 +1240,7 @@ HIP runtime has the following functional improvements which improves runtime per * Added the `hipblasSetWorkspace()` API. * Support for codecoverage tests. - + #### Changed * HIPBLAS_V2 API is the only available API using the `hipComplex` and `hipDatatype` types. @@ -1256,9 +1388,9 @@ HIP runtime has the following functional improvements which improves runtime per #### Changed * Deprecated the hipRAND Fortran API in favor of hipfort. - + #### Removed - + * Removed C++14 support, so only C++17 is supported. ### **hipSOLVER** (3.0.0) @@ -1270,7 +1402,7 @@ HIP runtime has the following functional improvements which improves runtime per * `hipsolverSpCcsrlsvqr`, `hipsolverSpZcsrlsvqr` #### Resolved issues - + * Corrected the value of `lwork` returned by various `bufferSize` functions to be consistent with NVIDIA cuSOLVER. The following functions now return `lwork` so that the workspace size (in bytes) is `sizeof(T) * lwork`, rather than `lwork`. To restore the original behavior, set the environment variable `HIPSOLVER_BUFFERSIZE_RETURN_BYTES`. * `hipsolverXorgbr_bufferSize`, `hipsolverXorgqr_bufferSize`, `hipsolverXorgtr_bufferSize`, `hipsolverXormqr_bufferSize`, `hipsolverXormtr_bufferSize`, `hipsolverXgesvd_bufferSize`, `hipsolverXgesvdj_bufferSize`, `hipsolverXgesvdBatched_bufferSize`, `hipsolverXgesvdaStridedBatched_bufferSize`, `hipsolverXsyevd_bufferSize`, `hipsolverXsyevdx_bufferSize`, `hipsolverXsyevj_bufferSize`, `hipsolverXsyevjBatched_bufferSize`, `hipsolverXsygvd_bufferSize`, `hipsolverXsygvdx_bufferSize`, `hipsolverXsygvj_bufferSize`, `hipsolverXsytrd_bufferSize`, `hipsolverXsytrf_bufferSize`. @@ -1291,14 +1423,14 @@ HIP runtime has the following functional improvements which improves runtime per #### Changed * Switched to defaulting to C++17 when building hipSPARSE from source. 
Previously hipSPARSE was using C++14 by default. - + #### Resolved issues * Fixed a compilation [issue](https://github.com/ROCm/hipSPARSE/issues/555) related to using `std::filesystem` and C++14. * Fixed an issue where the clients-common package was empty by moving the `hipsparse_clientmatrices.cmake` and `hipsparse_mtx2csr` files to it. #### Known issues - + * In `hipsparseSpSM_solve()`, the external buffer is passed as a parameter. This does not match the NVIDIA CUDA cuSPARSE API. This extra external buffer parameter will be removed in a future release. For now, this extra parameter can be ignored and nullptr passed in because it is unused internally. ### **hipSPARSELt** (0.2.4) @@ -1313,10 +1445,10 @@ HIP runtime has the following functional improvements which improves runtime per * Support for the cuSPARSELt v0.6.3 backend. #### Removed - + * Support for LLVM targets gfx940 and gfx941 has been removed. * `hipsparseLtDatatype_t` has been removed. - + #### Optimized * Improved the library loading time. @@ -1393,7 +1525,7 @@ HIP runtime has the following functional improvements which improves runtime per #### Added -* Support for OCP `FP8` on AMD Instinct MI350X accelerators. +* Support for OCP `FP8` on AMD Instinct MI350X GPUs. * Support for PyTorch 2.7 via Torch-MIGraphX. * Support for the Microsoft ONNX Contrib Operators (Self) Attention, RotaryEmbedding, QuickGelu, BiasAdd, BiasSplitGelu, SkipLayerNorm. * Support for Sigmoid and AddN TensorFlow operators. @@ -1458,7 +1590,7 @@ HIP runtime has the following functional improvements which improves runtime per ### **MIOpen** (3.5.0) #### Added - + * [Conv] Misa kernels for gfx950. * [Conv] Enabled Split-K support for CK backward data solvers (2D). * [Conv] Enabled CK wrw solver on gfx950 for the `BF16` data type. @@ -1611,7 +1743,7 @@ HIP runtime has the following functional improvements which improves runtime per * Improved the performance of Level 3 `dgmm` for all precisions and variants on gfx942. 
#### Resolved issues - + * Fixed environment variable path-based logging to append multiple handle outputs to the same file. * Support numerics when `trsm` is running with `rocblas_status_perf_degraded`. * Fixed the build dependency installation of `joblib` on some operating systems. @@ -1767,7 +1899,7 @@ Review the [README](https://github.com/ROCm/rocm_bandwidth_test/blob/amd-mainlin * L2 to EA stalls * L2 to EA stalls per channel -* Roofline support for AMD Instinct MI350 series accelerators. +* Roofline support for AMD Instinct MI350 series GPUs. ##### Textual User Interface (TUI) (beta version) @@ -1777,9 +1909,9 @@ Review the [README](https://github.com/ROCm/rocm_bandwidth_test/blob/amd-mainlin ##### PC Sampling (beta version) -* Stochastic (hardware-based) PC sampling has been enabled for AMD Instinct MI300X series and later accelerators. +* Stochastic (hardware-based) PC sampling has been enabled for AMD Instinct MI300X series and later GPUs. -* Host-trap PC Sampling has been enabled for AMD Instinct MI200 series and later accelerators. +* Host-trap PC Sampling has been enabled for AMD Instinct MI200 series and later GPUs. * Support for sorting of PC sampling by type: offset or count. @@ -1940,7 +2072,7 @@ See the full [ROCm SMI changelog](https://github.com/ROCm/rocm_smi_lib/blob/rele #### Added -- Support for AMD Instinct MI350X and MI355X accelerators. +- Support for AMD Instinct MI350X and MI355X GPUs. - Introduced rotating buffer mechanism for GEMM operations. - Support for read and write tests in Babel. - Support for AMD Radeon RX9070 and RX9070GRE graphics cards. @@ -2056,7 +2188,7 @@ The previous default accumulator types could lead to situations in which unexpec #### Added - Support for [rocJPEG](https://rocm.docs.amd.com/projects/rocJPEG/en/latest/index.html) API Tracing. -- Support for AMD Instinct MI350X and MI355X accelerators. +- Support for AMD Instinct MI350X and MI355X GPUs. 
- `rocprofiler_create_counter` to facilitate adding custom derived counters at runtime. - Support in `rocprofv3` for iteration based counter multiplexing. - Perfetto support for counter collection. @@ -2207,7 +2339,7 @@ The previous default accumulator types could lead to situations in which unexpec * Improved the performance of BDSQR and downstream functions, such as GESVD. * Improved the performance of STEQR and downstream functions, such as SYEV/HEEV. * Improved the performance of LARFT and downstream functions, such as GEQR2 and GEQRF. - + #### Resolved issues * Fixed corner cases that can produce NaNs in SYEVD for valid input matrices. @@ -2248,7 +2380,7 @@ The previous default accumulator types could lead to situations in which unexpec * Improved the user documentation. #### Resolved issues - + * Fixed an issue in the public headers where `extern "C"` was not wrapped by `#ifdef __cplusplus`, which caused failures when building C programs with rocSPARSE. * Fixed a memory access fault in the `rocsparse_Xbsrilu0` routines. * Fixed failures that could occur in `rocsparse_Xbsrsm_solve` or `rocsparse_spsm` with BSR format when using host pointer mode. From 29f4d65da54548d234d31cc31df2c66191e4db8f Mon Sep 17 00:00:00 2001 From: amitkumar-amd Date: Sun, 14 Sep 2025 01:17:53 -0500 Subject: [PATCH 49/58] Update RELEASE.md --- RELEASE.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/RELEASE.md b/RELEASE.md index 214f0f684..0cf405011 100644 --- a/RELEASE.md +++ b/RELEASE.md @@ -59,7 +59,7 @@ for more information about operating system and hardware compatibility. #### Virtualization support -ROCm 7.0 introduces support for KVM Passthrough for AMD Instinct MI350X and MI355X GPUs. +ROCm 7.0.0 introduces support for KVM Passthrough for AMD Instinct MI350X and MI355X GPUs. All KVM-based SR-IOV supported configurations require the GIM SR-IOV driver version 8.4.0.K. In addition, support for VMware ESXi 8 has been introduced for select AMD GPUs. 
For more information, see [Virtualization Support](https://rocm.docs.amd.com/projects/install-on-linux-internal/en/latest/reference/system-requirements.html#virtualization-support). From b357ba993b4c342649e50bbc84cd6c7a31f191ec Mon Sep 17 00:00:00 2001 From: amitkumar-amd Date: Sun, 14 Sep 2025 01:30:49 -0500 Subject: [PATCH 50/58] Update RELEASE.md --- RELEASE.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/RELEASE.md b/RELEASE.md index 0cf405011..0ad842c7f 100644 --- a/RELEASE.md +++ b/RELEASE.md @@ -2607,7 +2607,7 @@ It's anticipated that ROCTracer, ROCProfiler, `rocprof`, and `rocprofv2` will re ### AMDGPU wavefront size compiler macro deprecation Access to the wavefront size as a compile-time constant via the `__AMDGCN_WAVEFRONT_SIZE` -and `__AMDGCN_WAVEFRONT_SIZE__` macros are deprecated and will be disabled in a future release. In ROCm 7.0.0 `warpSize` is only available as a non-`constextpr` variable. You're encougared to update your code if needed to ensure future compatibility. +and `__AMDGCN_WAVEFRONT_SIZE__` macros are deprecated and will be disabled in a future release. In ROCm 7.0.0 `warpSize` is only available as a non-`constextpr` variable. You're encouraged to update your code if needed to ensure future compatibility. * The `__AMDGCN_WAVEFRONT_SIZE__` macro and `__AMDGCN_WAVEFRONT_SIZE` alias will be removed in an upcoming release. It is recommended to remove any use of this macro. 
For more information, see From 1660ac335a66e53a9373dd635115e7352ddd66f7 Mon Sep 17 00:00:00 2001 From: amitkumar-amd Date: Sun, 14 Sep 2025 01:50:27 -0500 Subject: [PATCH 51/58] Update RELEASE.md Swap new framework vs updated framework --- RELEASE.md | 16 +++++++++------- 1 file changed, 9 insertions(+), 7 deletions(-) diff --git a/RELEASE.md b/RELEASE.md index 0ad842c7f..888a6f2b4 100644 --- a/RELEASE.md +++ b/RELEASE.md @@ -68,13 +68,6 @@ All KVM-based SR-IOV supported configurations require the GIM SR-IOV driver vers ROCm provides a comprehensive ecosystem for deep learning development. For more information, see [Deep learning frameworks for ROCm](https://rocm.docs.amd.com/en/latest/how-to/deep-learning-rocm.html) and the [Compatibility matrix](../../docs/compatibility/compatibility-matrix.rst) for the complete list of Deep learning and AI framework versions tested for compatibility with ROCm. -#### New frameworks - -AMD ROCm has officially added support for the following Deep learning and AI frameworks: - -* Ray is a unified framework for scaling AI and Python applications from your laptop to a full cluster, without changing your code. Ray consists of a core distributed runtime and a set of AI libraries for simplifying machine learning computations. It is currently supported on ROCm 6.4.1. For more information, see [Ray compatibility](https://advanced-micro-devices-rocm-internal--500.com.readthedocs.build/en/500/compatibility/ml-compatibility/ray-compatibility.html). - -* llama.cpp is an open-source framework for Large Language Model (LLM) inference that runs on both central processing units (CPUs) and graphics processing units (GPUs). It is written in plain C/C++, providing a simple, dependency-free setup. It is currently supported on ROCm 6.4.0. For more information, see [llama.cpp compatibility](https://advanced-micro-devices-rocm-internal--500.com.readthedocs.build/en/500/compatibility/ml-compatibility/llama-cpp-compatibility.html). 
#### Updated framework support @@ -120,6 +113,15 @@ ROCm 7.0 enables support for ONNX Runtime 1.22.0. ROCm 7.0 enables support for Triton 3.3.0. +#### New frameworks + +AMD ROCm has officially added support for the following Deep learning and AI frameworks: + +* Ray is a unified framework for scaling AI and Python applications from your laptop to a full cluster, without changing your code. Ray consists of a core distributed runtime and a set of AI libraries for simplifying machine learning computations. It is currently supported on ROCm 6.4.1. For more information, see [Ray compatibility](https://advanced-micro-devices-rocm-internal--500.com.readthedocs.build/en/500/compatibility/ml-compatibility/ray-compatibility.html). + +* llama.cpp is an open-source framework for Large Language Model (LLM) inference that runs on both central processing units (CPUs) and graphics processing units (GPUs). It is written in plain C/C++, providing a simple, dependency-free setup. It is currently supported on ROCm 6.4.0. For more information, see [llama.cpp compatibility](https://advanced-micro-devices-rocm-internal--500.com.readthedocs.build/en/500/compatibility/ml-compatibility/llama-cpp-compatibility.html). + + ### Instinct Driver/ROCm packaging separation The Instinct Driver is now distributed separately from the ROCm software stack and is stored under in its own location ``/amdgpu/`` in the package repository at [repo.radeon.com](https://repo.radeon.com/amdgpu/). The first release is designated as Instinct Driver version 30.10. See the [ROCm Gets Modular: Meet the Instinct Datacenter GPU Driver](https://rocm.blogs.amd.com/ecosystems-and-partners/instinct-gpu-driver/README.html) blog and [User and kernel-space support matrix](https://rocm.docs.amd.com/projects/install-on-linux-internal/en/latest/reference/user-kernel-space-compat-matrix.html) for more information. 
From cbd4e8f0ba64a5b433db449826ced42cb1535fc2 Mon Sep 17 00:00:00 2001 From: Pratik Basyal Date: Mon, 15 Sep 2025 16:29:34 -0400 Subject: [PATCH 52/58] 7.0.0 release notes feedback updated [Batch 6] (#550) * RN changes updated * Changelog synced and release notes updated * Compatibility changes added --- CHANGELOG.md | 41 +++++++------ RELEASE.md | 60 +++++++++++-------- .../compatibility-matrix-historical-6.0.csv | 2 +- docs/compatibility/compatibility-matrix.rst | 6 +- 4 files changed, 62 insertions(+), 47 deletions(-) diff --git a/CHANGELOG.md b/CHANGELOG.md index 0a8b23ad3..42e0f0cfc 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -33,20 +33,19 @@ for a complete overview of this release. - Increased available JPEG engines to 40. Current ASICs might not support all 40. These are indicated as `UINT16_MAX` or `N/A` in CLI. * Bad page threshold count. - - Added `amdsmi_get_gpu_bad_page_threshold` to Python API and CLI; root/sudo permissions required to display the count. + - Added `amdsmi_get_gpu_bad_page_threshold` to Python API and CLI; root/sudo permissions are required to display the count. * CPU model name for RDC. - Added new C and Python API `amdsmi_get_cpu_model_name`. - Not sourced from esmi library. -* Added `amdsmi_get_cpu_affinity_with_scope()`. +* New API `amdsmi_get_cpu_affinity_with_scope()`. * `socket power` to `amdsmi_get_power_info` - - Previously the C API had the value in the `amdsmi_power_info` structure, but was unused - - Now we populate the value in both C and Python APIs + - Previously, the C API had the value in the `amdsmi_power_info` structure, but was unused. - The value is representative of the socket's power agnostic of the the GPU version. -* New event notification types to `amdsmi_evt_notification_type_t`. +* New event notification types to `amdsmi_evt_notification_type_t`. 
The following values were added to the `amdsmi_evt_notification_type_t` enum: - `AMDSMI_EVT_NOTIF_EVENT_MIGRATE_START` - `AMDSMI_EVT_NOTIF_EVENT_MIGRATE_END` @@ -58,7 +57,7 @@ for a complete overview of this release. - `AMDSMI_EVT_NOTIF_PROCESS_START` - `AMDSMI_EVT_NOTIF_PROCESS_END` -- Power cap to `amd-smi monitor`. +- Power cap to `amd-smi monitor`. - `amd-smi monitor -p` will display the power cap along with power. #### Changed @@ -66,7 +65,7 @@ for a complete overview of this release. * Separated driver reload functionality from `amdsmi_set_gpu_memory_partition()` and `amdsmi_set_gpu_memory_partition_mode()` APIs -- and from the CLI `amd-smi set -M `. -* Disabled `amd-smi monitor --violation` on guest. Modified `amd-smi metric --throttle` to alias to `amd-smi metric --violation`. +* Disabled `amd-smi monitor --violation` on guests. Modified `amd-smi metric -T/--throttle` to alias to `amd-smi metric -v/--violation`. * Updated `amdsmi_get_clock_info` in `amdsmi_interface.py`. - The `clk_deep_sleep` field now returns the sleep integer value. @@ -87,13 +86,17 @@ for a complete overview of this release. - `acc_low_utilization`, `per_low_utilization`, `active_low_utilization` - Python API and CLI now report these expanded fields. -* The char arrays in the following structures have been changed. +* The char arrays in the following structures have been changed. - `amdsmi_vbios_info_t` member `build_date` changed from `AMDSMI_MAX_DATE_LENGTH` to `AMDSMI_MAX_STRING_LENGTH`. - `amdsmi_dpm_policy_entry_t` member `policy_description` changed from `AMDSMI_MAX_NAME` to `AMDSMI_MAX_STRING_LENGTH`. - `amdsmi_name_value_t` member `name` changed from `AMDSMI_MAX_NAME` to `AMDSMI_MAX_STRING_LENGTH`. * For backwards compatibility, updated `amdsmi_bdf_t` union to have an identical unnamed struct. +* Updated `amdsmi_get_temp_metric` and `amdsmi_temperature_type_t` with new values. 
+ - Added new values to `amdsmi_temperature_type_t` representing various baseboard and GPU board temperature measures. + - Updated `amdsmi_get_temp_metric` API to be able to take in and return the respective values for the new temperature types. + #### Removed - Unnecessary API, `amdsmi_free_name_value_pairs()` @@ -106,9 +109,9 @@ for a complete overview of this release. - Unused member `year` in struct `amdsmi_version_t`. -- `amdsmi_io_link_type_t` and replaced with `amdsmi_link_type_t`. +- `amdsmi_io_link_type_t` has been replaced with `amdsmi_link_type_t`. - `amdsmi_io_link_type_t` is no longer needed as `amdsmi_link_type_t` is sufficient. - - `amdsmi_link_type_t` enum has changed. + - `amdsmi_link_type_t` enum has changed; primarily, the ordering of the PCI and XGMI types. - This change will also affect `amdsmi_link_metrics_t`, where the link_type field changes from `amdsmi_io_link_type_t` to `amdsmi_link_type_t`. - `amdsmi_get_power_info_v2()`. @@ -133,7 +136,7 @@ for a complete overview of this release. - Removed partition information from the default `amd-smi static` CLI command. - Users can still retrieve the same data by calling `amd-smi`, `amd-smi static -p`, or `amd-smi partition -c -m`/`sudo amd-smi partition -a`. - - Reading ``current_compute_partition`` may momentarily wake the GPU up. This is due to reading XCD registers, which is expected behavior. Changing partitions is not a trivial operation, `current_compute_partition` SYSFS controls this action. + - Reading `current_compute_partition` may momentarily wake the GPU up. This is due to reading XCD registers, which is expected behavior. Changing partitions is not a trivial operation, `current_compute_partition` SYSFS controls this action. - Optimized CLI command `amd-smi topology` in partition mode. - Reduced the number of `amdsmi_topo_get_p2p_status` API calls to one fourth. @@ -144,6 +147,10 @@ for a complete overview of this release. 
- Fixed `amd-smi monitor` decoder utilization (`DEC%`) not showing up on MI300 series ASICs. +#### Known issues + +- `amd-smi monitor` on Linux Guest systems triggers an attribute error. + ```{note} See the full [AMD SMI changelog](https://github.com/ROCm/amdsmi/blob/release/rocm-rel-7.0/CHANGELOG.md) for details, examples, and in-depth descriptions. ``` @@ -653,7 +660,7 @@ HIP runtime has the following functional improvements which improves runtime per #### Added -* Support for OCP `FP8` on AMD Instinct MI350X accelerators. +* Support for OCP `FP8` on AMD Instinct MI350X GPUs. * Support for PyTorch 2.7 via Torch-MIGraphX. * Support for the Microsoft ONNX Contrib Operators (Self) Attention, RotaryEmbedding, QuickGelu, BiasAdd, BiasSplitGelu, SkipLayerNorm. * Support for Sigmoid and AddN TensorFlow operators. @@ -1027,7 +1034,7 @@ Review the [README](https://github.com/ROCm/rocm_bandwidth_test/blob/amd-mainlin * L2 to EA stalls * L2 to EA stalls per channel -* Roofline support for AMD Instinct MI350 series accelerators. +* Roofline support for AMD Instinct MI350 series GPUs. ##### Textual User Interface (TUI) (beta version) @@ -1037,9 +1044,9 @@ Review the [README](https://github.com/ROCm/rocm_bandwidth_test/blob/amd-mainlin ##### PC Sampling (beta version) -* Stochastic (hardware-based) PC sampling has been enabled for AMD Instinct MI300X series and later accelerators. +* Stochastic (hardware-based) PC sampling has been enabled for AMD Instinct MI300X series and later GPUs. -* Host-trap PC Sampling has been enabled for AMD Instinct MI200 series and later accelerators. +* Host-trap PC Sampling has been enabled for AMD Instinct MI200 series and later GPUs. * Support for sorting of PC sampling by type: offset or count. @@ -1200,7 +1207,7 @@ See the full [ROCm SMI changelog](https://github.com/ROCm/rocm_smi_lib/blob/rele #### Added -- Support for AMD Instinct MI350X and MI355X accelerators. +- Support for AMD Instinct MI350X and MI355X GPUs. 
- Introduced rotating buffer mechanism for GEMM operations. - Support for read and write tests in Babel. - Support for AMD Radeon RX9070 and RX9070GRE graphics cards. @@ -1316,7 +1323,7 @@ The previous default accumulator types could lead to situations in which unexpec #### Added - Support for [rocJPEG](https://rocm.docs.amd.com/projects/rocJPEG/en/latest/index.html) API Tracing. -- Support for AMD Instinct MI350X and MI355X accelerators. +- Support for AMD Instinct MI350X and MI355X GPUs. - `rocprofiler_create_counter` to facilitate adding custom derived counters at runtime. - Support in `rocprofv3` for iteration based counter multiplexing. - Perfetto support for counter collection. diff --git a/RELEASE.md b/RELEASE.md index 888a6f2b4..0e78448fb 100644 --- a/RELEASE.md +++ b/RELEASE.md @@ -71,11 +71,11 @@ matrix](../../docs/compatibility/compatibility-matrix.rst) for the complete list #### Updated framework support -ROCm 7.0 introduces several newly supported versions of Deep learning and AI frameworks: +ROCm 7.0.0 introduces several newly supported versions of Deep learning and AI frameworks: ##### PyTorch -ROCm 7.0 enables the following PyTorch features: +ROCm 7.0.0 enables the following PyTorch features: * Support for PyTorch 2.7. * Integrated Fused Rope kernels in APEX. @@ -84,7 +84,7 @@ ROCm 7.0 enables the following PyTorch features: ##### JAX -ROCm 7.0 enables support for JAX 0.6.0. +ROCm 7.0.0 enables support for JAX 0.6.0. ##### Megatron-LM @@ -98,11 +98,15 @@ Megatron-LM for ROCm now supports: ##### TensorFlow -ROCm 7.0 enables support for TensorFlow 2.19.1. +ROCm 7.0.0 enables the following TensorFlow support: + +* Support for TensorFlow 2.19.1. +* MX data type support for AMD Instinct MI350 series GPUs. +* Triton autotuner. ##### ONNX Runtime -ROCm 7.0 enables support for ONNX Runtime 1.22.0. +ROCm 7.0.0 enables support for ONNX Runtime 1.22.0. ##### vLLM @@ -111,7 +115,7 @@ ROCm 7.0 enables support for ONNX Runtime 1.22.0. 
##### Triton -ROCm 7.0 enables support for Triton 3.3.0. +ROCm 7.0.0 enables support for Triton 3.3.0. #### New frameworks @@ -122,9 +126,9 @@ AMD ROCm has officially added support for the following Deep learning and AI fra * llama.cpp is an open-source framework for Large Language Model (LLM) inference that runs on both central processing units (CPUs) and graphics processing units (GPUs). It is written in plain C/C++, providing a simple, dependency-free setup. It is currently supported on ROCm 6.4.0. For more information, see [llama.cpp compatibility](https://advanced-micro-devices-rocm-internal--500.com.readthedocs.build/en/500/compatibility/ml-compatibility/llama-cpp-compatibility.html). -### Instinct Driver/ROCm packaging separation +### AMD GPU Driver/ROCm packaging separation -The Instinct Driver is now distributed separately from the ROCm software stack and is stored under in its own location ``/amdgpu/`` in the package repository at [repo.radeon.com](https://repo.radeon.com/amdgpu/). The first release is designated as Instinct Driver version 30.10. See the [ROCm Gets Modular: Meet the Instinct Datacenter GPU Driver](https://rocm.blogs.amd.com/ecosystems-and-partners/instinct-gpu-driver/README.html) blog and [User and kernel-space support matrix](https://rocm.docs.amd.com/projects/install-on-linux-internal/en/latest/reference/user-kernel-space-compat-matrix.html) for more information. +The AMD GPU Driver is now distributed separately from the ROCm software stack and is stored under in its own location ``/amdgpu/`` in the package repository at [repo.radeon.com](https://repo.radeon.com/amdgpu/). The first release is designated as AMD GPU Driver version 30.10. See the [User and kernel-space support matrix](https://rocm.docs.amd.com/projects/install-on-linux-internal/en/latest/reference/user-kernel-space-compat-matrix.html) for more information. 
[AMD SMI](https://github.com/ROCm/amdsmi) continues to stay with the ROCm software stack under the ROCm organization repository. @@ -139,7 +143,7 @@ Other ROCm libraries are also in the process of migration along with ROCm tools ### HIP API compatibility improvements -To improve code portability between AMD ROCm and other programming models, HIP API has been updated in ROCm 7.0 to simplify cross-platform programming. These changes are incompatible with prior ROCm releases and might require recompiling existing HIP applications for use with ROCm 7.0. For more information, see the [HIP API 7.0 changes](https://rocm.docs.amd.com/projects/HIP/en/docs-develop/hip-7-changes.html) and the [HIP changelog](#hip-7-0-0) below. +To improve code portability between AMD ROCm and other programming models, HIP API has been updated in ROCm 7.0.0 to simplify cross-platform programming. These changes are incompatible with prior ROCm releases and might require recompiling existing HIP applications for use with ROCm 7.0.0. For more information, see the [HIP API 7.0.0 changes](https://rocm.docs.amd.com/projects/HIP/en/docs-develop/hip-7-changes.html) and the [HIP changelog](#hip-7-0-0) below. ### HIP runtime updates @@ -160,7 +164,7 @@ Additionally, the HIP runtime includes functional improvements, which improve fu ### Compiler changes and improvements -ROCm 7.0 introduces the AMD Next-Gen Fortran compiler. ``llvm-flang`` (sometimes called ``new-flang`` or ``flang-18``) is a re-implementation of the Fortran frontend. It is a strategic replacement for ``classic-flang`` and is developed in LLVM’s upstream repo at [llvm/llvm-project](https://github.com/llvm/llvm-project/tree/main/flang). +ROCm 7.0.0 introduces the AMD Next-Gen Fortran compiler. ``llvm-flang`` (sometimes called ``new-flang`` or ``flang-18``) is a re-implementation of the Fortran frontend. 
It is a strategic replacement for ``classic-flang`` and is developed in LLVM’s upstream repo at [llvm/llvm-project](https://github.com/llvm/llvm-project/tree/main/flang). Key compiler enhancements include: @@ -194,7 +198,7 @@ Key compiler enhancements include: #### New data type support -MX-compliant data types bring microscaling support to ROCm. For more information, see the [OCP Microscaling (MX) Formats Specification](https://www.opencompute.org/documents/ocp-microscaling-formats-mx-v1-0-spec-final-pdf). ROCm 7.0 enables functional support for MX data types `FP4`, `FP6`, and `FP8` on AMD Instinct MI350 series GPUs in these ROCm libraries: +MX-compliant data types bring microscaling support to ROCm. For more information, see the [OCP Microscaling (MX) Formats Specification](https://www.opencompute.org/documents/ocp-microscaling-formats-mx-v1-0-spec-final-pdf). ROCm 7.0.0 enables functional support for MX data types `FP4`, `FP6`, and `FP8` on AMD Instinct MI350 series GPUs in these ROCm libraries: * Composable Kernel (`FP4`, `FP6`, and `FP8` only) * hipBLASLt @@ -241,7 +245,7 @@ have been refined for improved usability. See the [AMD SMI changelog](#amd-smi-2 #### ROCgdb -The micro-scaling (MX) data types now support `FP4`, `FP6`, and `FP8`. +ROCgdb now supports `FP4`, `FP6`, and `FP8` micro-scaling (MX) data types with AMD Instinct MI350 series GPUs. See the [ROCgdb changelog](#rocgdb-16-3) for more details. @@ -262,7 +266,7 @@ See the [ROCm Compute Profiler changelog](#rocm-compute-profiler-3-2-3) for more #### ROCm Data Center (RDC) improvements -The ROCm Data Center tool (RDC) streamlines the administration of AMD GPUs in cluster data center environments. ROCm 7.0 introduces new data center management and monitoring tools for system administrators. For more information, see [ROCm Data Center (RDC) tool documentation](https://rocm.docs.amd.com/projects/rdc/en/latest/index.html). 
+The ROCm Data Center tool (RDC) streamlines the administration of AMD GPUs in cluster data center environments. ROCm 7.0.0 introduces new data center management and monitoring tools for system administrators. For more information, see [ROCm Data Center (RDC) tool documentation](https://rocm.docs.amd.com/projects/rdc/en/latest/index.html). #### ROCm Systems Profiler @@ -275,7 +279,7 @@ ROCm Systems Profiler includes the following key changes: See the [ROCm Systems Profiler changelog](#rocm-systems-profiler-1-1-0) for more details. #### ROCm Validation Suite -In ROCm 7.0, ROCm Validation Suite includes support for the AMD Instinct MI355X and MI350X GPUs in the IET (Integrated Execution Test), GST (GPU Stress Test), and Babel (memory bandwidth test) modules. +In ROCm 7.0.0, ROCm Validation Suite includes support for the AMD Instinct MI355X and MI350X GPUs in the IET (Integrated Execution Test), GST (GPU Stress Test), and Babel (memory bandwidth test) modules. See the [ROCm Validation Suite changelog](#rocm-validation-suite-1-2-0) for more details. @@ -283,7 +287,7 @@ See the [ROCm Validation Suite changelog](#rocm-validation-suite-1-2-0) for more ##### Core SDK enhancements -* ROCprofiler-SDK is now compatible with the HIP 7.0 API. +* ROCprofiler-SDK is now compatible with the HIP 7.0.0 API. * ROCprofiler-SDK adds support for AMD Instinct MI350X and MI355X GPUs. * The stochastic and host-trap PC sampling support has been added for all AMD Instinct MI300 and MI350 series GPUs, which provides information particularly useful for understanding stalls during kernel execution. @@ -410,9 +414,9 @@ ROCm documentation continues to be updated to provide clearer and more comprehen ## User space, driver, and firmware dependent changes -GPU Software for AMD datacenter GPU products requires you to maintain a hardware and software stack with interdependencies between the GPU and baseboard firmware, AMD GPU drivers, and the ROCm user space software. 
Starting ROCm 7.0 release, we are publicly documenting these interdependencies. Note that while AMD publishes drivers and ROCm user space, your server or infrastructure provider publishes the GPU and baseboard firmware by bundling AMD’s firmware releases via AMD's Platform Level Data Model (PLDM) bundle (Firmware), which includes Integrated Firmware Image (IFWI). +GPU Software for AMD datacenter GPU products requires you to maintain a hardware and software stack with interdependencies between the GPU and baseboard firmware, AMD GPU drivers, and the ROCm user space software. Starting ROCm 7.0.0 release, we are publicly documenting these interdependencies. Note that while AMD publishes drivers and ROCm user space, your server or infrastructure provider publishes the GPU and baseboard firmware by bundling AMD’s firmware releases via AMD's Platform Level Data Model (PLDM) bundle (Firmware), which includes Integrated Firmware Image (IFWI). -The GPU and baseboard firmware releases numbering may vary by GPU family. Note that, ROCm 7.0 release is the first release where the AMD GPU driver is versioned independently of ROCm. +The GPU and baseboard firmware releases numbering may vary by GPU family. Note that, ROCm 7.0.0 release is the first release where the AMD GPU driver is versioned independently of ROCm.
@@ -510,23 +514,23 @@ The GPU and baseboard firmware releases numbering may vary by GPU family. Note t ### New feature details -#### AMD SMI changes dependent on PLDM bundles +#### AMD SMI changes dependent on PLDM bundles (firmware) -New APIs introduced in AMD SMI for ROCm 7.0 provide additional data for the AMD Instinct products. To support these features, the following firmware for each GPUs are required: +New APIs introduced in AMD SMI for ROCm 7.0.0 provide additional data for the AMD Instinct products. To support these features, the following firmware for each GPUs are required: -* AMD Instinct MI355x - PLDM bundle 01.25.13.04 +* AMD Instinct MI355X - PLDM bundle 01.25.13.04 -* AMD Instinct MI350x - PLDM bundle 01.25.13.04 +* AMD Instinct MI350X - PLDM bundle 01.25.13.04 -* AMD Instinct MI325x - PLDM bundle 01.25.04.00 +* AMD Instinct MI325X - PLDM bundle 01.25.04.00 -* AMD Instinct MI300x - PLDM bundle 01.25.03.12 +* AMD Instinct MI300X - PLDM bundle 01.25.03.12 -If ROCm 7.0 is applied on system with prior version of PLDM bundles (firmware), the new APIs will return `N/A` to indicate lack of support for these items. +If ROCm 7.0.0 is applied on system with prior version of PLDM bundles (firmware), the new APIs will return `N/A` to indicate lack of support for these items. #### Enhanced temperature telemetry introduced in AMD SMI for MI355X and MI350X GPUs -AMD SMI in ROCm 7.0 provides support for enhanced temperature metrics and temperature anomaly detection for AMD Instinct MI350X and MI355X GPUs when paired with: AMD Instinct MI355x/MI350X - PLDM bundle 01.25.13.04. +AMD SMI in ROCm 7.0.0 provides support for enhanced temperature metrics and temperature anomaly detection for AMD Instinct MI350X and MI355X GPUs when paired with: PLDM bundle 01.25.13.04. For more information on these features, see [AMD SMI changelog](https://github.com/ROCm/amdsmi/blob/release/rocm-rel-7.0/CHANGELOG.md). 
@@ -536,7 +540,7 @@ KVM SR-IOV support for all Instinct GPUs require the open source AMD GPU Virtual #### GPU partitioning support for AMD Instinct MI355X and MI350X GPUs -NPS2 and DPX partitioning on bare metal is enabled on AMD Instinct MI355X and MI350X GPUs on ROCm 7.0 when paired with: AMD Instinct MI355x/MI350X - PLDM bundle 01.25.13.04. +NPS2 and DPX partitioning on bare metal is enabled on AMD Instinct MI355X and MI350X GPUs on ROCm 7.0.0 when paired with: PLDM bundle 01.25.13.04. ## ROCm components @@ -2566,6 +2570,10 @@ Starting with GCC 5.1, GNU `libstdc++` introduced a dual Application Binary Inte Default batched General Matrix Multiplications (GEMM) operations for rocBLAS and hipBLAS on gfx1200 and gfx1201 may have a decline in performance in comparison with non-batched and strided_batched GEMM operations. By default, the batched GEMM uses hipBLASLT kernels, and switching to the Tensile kernel resolves the performance decline issue. The issue will be fixed in a future ROCm release. As a workaround, you can set the environment variable `ROCBLAS_USE_HIPBLASLT=0` before the batched GEMM operation is performed on gfx1200 and gfx1201. After completing the batched operation, reset the variable to `ROCBLAS_USE_HIPBLASLT=1` before calling non-batched or strided_batched operations. +### Failure to declare out-of-bound CPERs for bad memory page + +Exceeding of bad memory page threshold fails to declare Out-Of-Band Common Platform Error Records (CPERs). This issue affects all MI300 series GPUs, including MI300X, MI325, MI350X, and MI355X. This issue will be fixed in a future AMD GPU Driver releases. + ## ROCm resolved issues The following are previously known issues resolved in this release. 
For resolved issues related to diff --git a/docs/compatibility/compatibility-matrix-historical-6.0.csv b/docs/compatibility/compatibility-matrix-historical-6.0.csv index 26089a642..61ac9027b 100644 --- a/docs/compatibility/compatibility-matrix-historical-6.0.csv +++ b/docs/compatibility/compatibility-matrix-historical-6.0.csv @@ -31,7 +31,7 @@ ROCm Version,7.0.0,6.4.3,6.4.2,6.4.1,6.4.0,6.3.3,6.3.2,6.3.1,6.3.0,6.2.4,6.2.2,6 ,,,,,,,,,,,,,,,,,,, FRAMEWORK SUPPORT,.. _framework-support-compatibility-matrix-past-60:,,,,,,,,,,,,,,,,,, :doc:`PyTorch <../compatibility/ml-compatibility/pytorch-compatibility>`,"2.7, 2.6, 2.5, 2.4, 2.3","2.6, 2.5, 2.4, 2.3","2.6, 2.5, 2.4, 2.3","2.6, 2.5, 2.4, 2.3","2.6, 2.5, 2.4, 2.3","2.4, 2.3, 2.2, 1.13","2.4, 2.3, 2.2, 1.13","2.4, 2.3, 2.2, 1.13","2.4, 2.3, 2.2, 2.1, 2.0, 1.13","2.3, 2.2, 2.1, 2.0, 1.13","2.3, 2.2, 2.1, 2.0, 1.13","2.3, 2.2, 2.1, 2.0, 1.13","2.3, 2.2, 2.1, 2.0, 1.13","2.1, 2.0, 1.13","2.1, 2.0, 1.13","2.1, 2.0, 1.13","2.1, 2.0, 1.13","2.1, 2.0, 1.13","2.1, 2.0, 1.13" - :doc:`TensorFlow <../compatibility/ml-compatibility/tensorflow-compatibility>`,"2.19.1, 2.18.1, 2.17.1, 2.16.2","2.18.1, 2.17.1, 2.16.2","2.18.1, 2.17.1, 2.16.2","2.18.1, 2.17.1, 2.16.2","2.18.1, 2.17.1, 2.16.2","2.17.0, 2.16.2, 2.15.1","2.17.0, 2.16.2, 2.15.1","2.17.0, 2.16.2, 2.15.1","2.17.0, 2.16.2, 2.15.1","2.16.1, 2.15.1, 2.14.1","2.16.1, 2.15.1, 2.14.1","2.16.1, 2.15.1, 2.14.1","2.16.1, 2.15.1, 2.14.1","2.15.0, 2.14.0, 2.13.1","2.15.0, 2.14.0, 2.13.1","2.15.0, 2.14.0, 2.13.1","2.15.0, 2.14.0, 2.13.1","2.14.0, 2.13.1, 2.12.1","2.14.0, 2.13.1, 2.12.1" + :doc:`TensorFlow <../compatibility/ml-compatibility/tensorflow-compatibility>`,"2.19.1, 2.18.1","2.18.1, 2.17.1, 2.16.2","2.18.1, 2.17.1, 2.16.2","2.18.1, 2.17.1, 2.16.2","2.18.1, 2.17.1, 2.16.2","2.17.0, 2.16.2, 2.15.1","2.17.0, 2.16.2, 2.15.1","2.17.0, 2.16.2, 2.15.1","2.17.0, 2.16.2, 2.15.1","2.16.1, 2.15.1, 2.14.1","2.16.1, 2.15.1, 2.14.1","2.16.1, 2.15.1, 2.14.1","2.16.1, 2.15.1, 2.14.1","2.15.0, 
2.14.0, 2.13.1","2.15.0, 2.14.0, 2.13.1","2.15.0, 2.14.0, 2.13.1","2.15.0, 2.14.0, 2.13.1","2.14.0, 2.13.1, 2.12.1","2.14.0, 2.13.1, 2.12.1" :doc:`JAX <../compatibility/ml-compatibility/jax-compatibility>`,0.6.0,0.4.35,0.4.35,0.4.35,0.4.35,0.4.31,0.4.31,0.4.31,0.4.31,0.4.26,0.4.26,0.4.26,0.4.26,0.4.26,0.4.26,0.4.26,0.4.26,0.4.26,0.4.26 :doc:`verl <../compatibility/ml-compatibility/verl-compatibility>` [#verl_compat]_,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,0.3.0.post0,N/A,N/A,N/A,N/A,N/A, :doc:`Stanford Megatron-LM <../compatibility/ml-compatibility/stanford-megatron-lm-compatibility>`,N/A,N/A,N/A,N/A,N/A,85f95ae,85f95ae,85f95ae,85f95ae,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A, diff --git a/docs/compatibility/compatibility-matrix.rst b/docs/compatibility/compatibility-matrix.rst index 794292cf3..b6015d51f 100644 --- a/docs/compatibility/compatibility-matrix.rst +++ b/docs/compatibility/compatibility-matrix.rst @@ -56,7 +56,7 @@ compatibility and system requirements. ,,, FRAMEWORK SUPPORT,.. _framework-support-compatibility-matrix:,, :doc:`PyTorch <../compatibility/ml-compatibility/pytorch-compatibility>`,"2.7, 2.6, 2.5, 2.4, 2.3","2.6, 2.5, 2.4, 2.3","2.4, 2.3, 2.2, 2.1, 2.0, 1.13" - :doc:`TensorFlow <../compatibility/ml-compatibility/tensorflow-compatibility>`,"2.19.1, 2.18.1, 2.17.1, 2.16.2","2.18.1, 2.17.1, 2.16.2","2.17.0, 2.16.2, 2.15.1" + :doc:`TensorFlow <../compatibility/ml-compatibility/tensorflow-compatibility>`,"2.19.1, 2.18.1","2.18.1, 2.17.1, 2.16.2","2.17.0, 2.16.2, 2.15.1" :doc:`JAX <../compatibility/ml-compatibility/jax-compatibility>`,0.6.0,0.4.35,0.4.31 :doc:`verl <../compatibility/ml-compatibility/verl-compatibility>` [#verl_compat]_,N/A,N/A,N/A :doc:`Stanford Megatron-LM <../compatibility/ml-compatibility/stanford-megatron-lm-compatibility>`,N/A,N/A,85f95ae @@ -164,7 +164,7 @@ compatibility and system requirements. .. [#az-mi300x] Starting ROCm 6.4.0, Azure Linux 3.0 is supported only on AMD Instinct MI300X and AMD Radeon PRO V710. .. 
[#RDNA-OS] Radeon AI PRO R9700, Radeon RX 9070 XT (gfx1201), Radeon RX 9060 XT (gfx1200), Radeon PRO W7700 (gfx1101), and Radeon RX 7800 XT (gfx1101) are supported only on Ubuntu 24.04.2, Ubuntu 22.04.5, RHEL 9.6, and RHEL 9.4. .. [#7700XT-OS] Radeon RX 7700 XT (gfx1101) is supported only on Ubuntu 24.04.2 and RHEL 9.6. -.. [#kfd_support] As of ROCm 6.4.0, forward and backward compatibility between the AMD Kernel-mode GPU Driver (KMD) and its user space software is provided up to a year apart. For earlier ROCm releases, the compatibility is provided for +/- 2 releases. The tested user space versions on this page were accurate as of the time of initial ROCm release. For the most up-to-date information, see the latest version of this information at `User and kernel-space support matrix `_. +.. [#kfd_support] As of ROCm 6.4.0, forward and backward compatibility between the AMD Kernel-mode GPU Driver (KMD) and its user space software is provided up to a year apart. For earlier ROCm releases, the compatibility is provided for +/- 2 releases. The supported user space versions on this page were accurate as of the time of initial ROCm release. For the most up-to-date information, see the latest version of this information at `User and kernel-space support matrix `_. .. [#ROCT-rocr] Starting from ROCm 6.3.0, the ROCT Thunk Interface is included as part of the ROCr runtime package. @@ -255,6 +255,6 @@ Expand for full historical view of: .. [#verl_compat] verl is only supported on ROCm 6.2.0. .. [#dgl_compat] DGL is only supported on ROCm 6.4.0. .. [#taichi_compat] Taichi is only supported on ROCm 6.3.2. - .. [#kfd_support-past-60] As of ROCm 6.4.0, forward and backward compatibility between the AMD Kernel-mode GPU Driver (KMD) and its user space software is provided up to a year apart. For earlier ROCm releases, the compatibility is provided for +/- 2 releases. The tested user space versions on this page were accurate as of the time of initial ROCm release. 
For the most up-to-date information, see the latest version of this information at `User and kernel-space support matrix `_. + .. [#kfd_support-past-60] As of ROCm 6.4.0, forward and backward compatibility between the AMD Kernel-mode GPU Driver (KMD) and its user space software is provided up to a year apart. For earlier ROCm releases, the compatibility is provided for +/- 2 releases. The supported user space versions on this page were accurate as of the time of initial ROCm release. For the most up-to-date information, see the latest version of this information at `User and kernel-space support matrix `_. .. [#ROCT-rocr-past-60] Starting from ROCm 6.3.0, the ROCT Thunk Interface is included as part of the ROCr runtime package. From 06fd378036b63242ea9d0ef809788edb5f7b7e3c Mon Sep 17 00:00:00 2001 From: Pratik Basyal Date: Mon, 15 Sep 2025 17:09:07 -0400 Subject: [PATCH 53/58] Known issues updated (#555) --- RELEASE.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/RELEASE.md b/RELEASE.md index 0e78448fb..285d1a496 100644 --- a/RELEASE.md +++ b/RELEASE.md @@ -128,7 +128,7 @@ AMD ROCm has officially added support for the following Deep learning and AI fra ### AMD GPU Driver/ROCm packaging separation -The AMD GPU Driver is now distributed separately from the ROCm software stack and is stored under in its own location ``/amdgpu/`` in the package repository at [repo.radeon.com](https://repo.radeon.com/amdgpu/). The first release is designated as AMD GPU Driver version 30.10. See the [User and kernel-space support matrix](https://rocm.docs.amd.com/projects/install-on-linux-internal/en/latest/reference/user-kernel-space-compat-matrix.html) for more information. +The AMD GPU Driver (amdgpu) is now distributed separately from the ROCm software stack and is stored under in its own location ``/amdgpu/`` in the package repository at [repo.radeon.com](https://repo.radeon.com/amdgpu/). 
The first release is designated as AMD GPU Driver (amdgpu) version 30.10. See the [User and kernel-space support matrix](https://rocm.docs.amd.com/projects/install-on-linux-internal/en/latest/reference/user-kernel-space-compat-matrix.html) for more information. [AMD SMI](https://github.com/ROCm/amdsmi) continues to stay with the ROCm software stack under the ROCm organization repository. @@ -416,7 +416,7 @@ ROCm documentation continues to be updated to provide clearer and more comprehen GPU Software for AMD datacenter GPU products requires you to maintain a hardware and software stack with interdependencies between the GPU and baseboard firmware, AMD GPU drivers, and the ROCm user space software. Starting ROCm 7.0.0 release, we are publicly documenting these interdependencies. Note that while AMD publishes drivers and ROCm user space, your server or infrastructure provider publishes the GPU and baseboard firmware by bundling AMD’s firmware releases via AMD's Platform Level Data Model (PLDM) bundle (Firmware), which includes Integrated Firmware Image (IFWI). -The GPU and baseboard firmware releases numbering may vary by GPU family. Note that, ROCm 7.0.0 release is the first release where the AMD GPU driver is versioned independently of ROCm. +The GPU and baseboard firmware releases numbering may vary by GPU family. Note that, ROCm 7.0.0 release is the first release where the AMD GPU Driver (amdgpu) is versioned independently of ROCm.
@@ -432,7 +432,7 @@ The GPU and baseboard firmware releases numbering may vary by GPU family. Note t

 PLDM Bundle (Firmware)
-AMD GPU Driver
+AMD GPU Driver (amdgpu)

AMD GPU
@@ -2572,7 +2572,7 @@ Default batched General Matrix Multiplications (GEMM) operations for rocBLAS and ### Failure to declare out-of-bound CPERs for bad memory page -Exceeding of bad memory page threshold fails to declare Out-Of-Band Common Platform Error Records (CPERs). This issue affects all MI300 series GPUs, including MI300X, MI325, MI350X, and MI355X. This issue will be fixed in a future AMD GPU Driver releases. +Exceeding of bad memory page threshold fails to declare Out-Of-Band Common Platform Error Records (CPERs). This issue affects all AMD Instinct MI300 series and MI350 series GPUs and will be fixed in a future AMD GPU Driver release. ## ROCm resolved issues From df1ae524b258be8a57666eb99f53a043fa25003f Mon Sep 17 00:00:00 2001 From: randyh62 <42045079+randyh62@users.noreply.github.com> Date: Mon, 15 Sep 2025 14:15:25 -0700 Subject: [PATCH 54/58] Hip minor update (#553) * Update CHANGELOG.md Removed duplicate num_threads entry, and added a new Resolved issue from Julia. * Update RELEASE.md Removed duplicate num_threads entry and added a resolved issue from Julia. --- CHANGELOG.md | 2 +- RELEASE.md | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/CHANGELOG.md b/CHANGELOG.md index 42e0f0cfc..97552c0ea 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -204,7 +204,6 @@ See the full [AMD SMI changelog](https://github.com/ROCm/amdsmi/blob/release/roc - `hipLaunchKernelExC` launches a HIP kernel using a generic function pointer and the specified configuration. - `hipDrvLaunchKernelEx` dispatches the device kernel represented by a HIP function object. - `hipMemGetHandleForAddressRange` gets a handle for the address range requested. - - `num_threads` Total number of threads in the group. The legacy API size is alias. 
- `__reduce_add_sync`, `__reduce_min_sync`, and `__reduce_max_sync` functions added for aritimetic reduction across lanes of a warp, and `__reduce_and_sync`, `__reduce_or_sync`, and `__reduce_xor_sync` functions added for logical reduction. For details, see [Warp cross-lane functions](https://rocm.docs.amd.com/projects/HIP/en/latest/how-to/hip_cpp_language_extensions.html#warp-cross-lane-functions). * New support for Open Compute Project (OCP) floating-point `FP4`/`FP6`/`FP8` as follows. For details, see [Low precision floating point document](https://rocm.docs.amd.com/projects/HIP/en/latest/reference/low_fp_types.html). @@ -363,6 +362,7 @@ HIP runtime has the following functional improvements which improves runtime per * Compilation failure, HIP runtime refactored the vector type alignment with `__hip_vec_align_v`. * A numerical error/corruption found in Pytorch during graph replay. HIP runtime fixed the input sizes of kernel launch dimensions in hipExtModuleLaunchKernel for the execution of hipGraph capture. * A crash during kernel execution in a customer application. The structure of kernel arguments was updated via adding the size of kernel arguments, and HIP runtime does validation before launch kernel with the structured arguments. +* Compilation error when using bfloat16 functions. HIP runtime removed the anonymous namespace from FP16 functions to resolve this issue. #### Known issues diff --git a/RELEASE.md b/RELEASE.md index 285d1a496..2e245881e 100644 --- a/RELEASE.md +++ b/RELEASE.md @@ -1075,7 +1075,6 @@ See the full [AMD SMI changelog](https://github.com/ROCm/amdsmi/blob/release/roc - `hipLaunchKernelExC` launches a HIP kernel using a generic function pointer and the specified configuration. - `hipDrvLaunchKernelEx` dispatches the device kernel represented by a HIP function object. - `hipMemGetHandleForAddressRange` gets a handle for the address range requested. - - `num_threads` Total number of threads in the group. The legacy API size is alias. 
- `__reduce_add_sync`, `__reduce_min_sync`, and `__reduce_max_sync` functions added for aritimetic reduction across lanes of a warp, and `__reduce_and_sync`, `__reduce_or_sync`, and `__reduce_xor_sync` functions added for logical reduction. For details, see [Warp cross-lane functions](https://rocm.docs.amd.com/projects/HIP/en/latest/how-to/hip_cpp_language_extensions.html#warp-cross-lane-functions). * New support for Open Compute Project (OCP) floating-point `FP4`/`FP6`/`FP8` as follows. For details, see [Low precision floating point document](https://rocm.docs.amd.com/projects/HIP/en/latest/reference/low_fp_types.html). @@ -1234,6 +1233,7 @@ HIP runtime has the following functional improvements which improves runtime per * Compilation failure, HIP runtime refactored the vector type alignment with `__hip_vec_align_v`. * A numerical error/corruption found in Pytorch during graph replay. HIP runtime fixed the input sizes of kernel launch dimensions in hipExtModuleLaunchKernel for the execution of hipGraph capture. * A crash during kernel execution in a customer application. The structure of kernel arguments was updated via adding the size of kernel arguments, and HIP runtime does validation before launch kernel with the structured arguments. +* Compilation error when using bfloat16 functions. HIP runtime removed the anonymous namespace from FP16 functions to resolve this issue. 
#### Known issues From 5637deb81eebf80f5a1c5704a7ad3c8d06c29bd7 Mon Sep 17 00:00:00 2001 From: Pratik Basyal Date: Mon, 15 Sep 2025 17:36:19 -0400 Subject: [PATCH 55/58] Release notes changes to TF (#556) * RN changes to TF * Series capitalized * Minor update --- CHANGELOG.md | 34 +++++++++++++++++----------------- RELEASE.md | 44 ++++++++++++++++++++------------------------ 2 files changed, 37 insertions(+), 41 deletions(-) diff --git a/CHANGELOG.md b/CHANGELOG.md index 97552c0ea..23ee67539 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -145,7 +145,7 @@ for a complete overview of this release. - Removed duplicated GPU IDs when receiving events using the `amd-smi event` command. -- Fixed `amd-smi monitor` decoder utilization (`DEC%`) not showing up on MI300 series ASICs. +- Fixed `amd-smi monitor` decoder utilization (`DEC%`) not showing up on MI300 Series ASICs. #### Known issues @@ -341,13 +341,13 @@ HIP runtime has the following functional improvements which improves runtime per * Refactored memory validation, creates a unique function to validate a variety of memory copy operations. * Improved kernel logging using demangling shader names. * Advanced support for SPIRV, now kernel compilation caching is enabled by default. This feature is controlled by the environment variable `AMD_COMGR_CACHE`, for details, see [hip_rtc document](https://rocm.docs.amd.com/projects/HIP/en/latest/how-to/hip_rtc.html). -* Programmatic support for scratch limits on the AMD Instinct MI300 and MI350 series up GPU devices. More enumeration values were added in `hipLimit_t` as following: +* Programmatic support for scratch limits on the AMD Instinct MI300 and MI350 Series up GPU devices. More enumeration values were added in `hipLimit_t` as following: - `hipExtLimitScratchMin`, minimum allowed value in bytes for scratch limit on the device. - `hipExtLimitScratchMax`, maximum allowed value in bytes for scratch limit on the device. 
- `hipExtLimitScratchCurrent`, current scratch limit threshold in bytes on the device. Must be between the value `hipExtLimitScratchMin` and `hipExtLimitScratchMax`. Developers can now use the environment variable `HSA_SCRATCH_SINGLE_LIMIT_ASYNC` to change the default allocation size with expected scratch limit in ROCR runtime. On top of it, this value can also be overwritten programmatically in the application using the HIP API `hipDeviceSetLimit(hipExtLimitScratchCurrent, value)` to reset the scratch limit value. * HIP runtime now enables peer-to-peer (P2P) memory copies to utilize all available SDMA engines, rather than being limited to a single engine. It also selects the best engine first to give optimal bandwidth. -* Improved launch latency for `D2D` copies and `memset` on MI300 series. +* Improved launch latency for `D2D` copies and `memset` on MI300 Series. * Introduced a threshold to handle the command submission patch to the GPU device(s), considering the synchronization with CPU, for performance improvement. #### Resolved issues @@ -745,7 +745,7 @@ HIP runtime has the following functional improvements which improves runtime per * [RNN] Dynamic algorithm optimization. * [Conv] Eliminated redundant clearing of output buffers. * [RNN] Updated selection heuristics. -* Updated tuning for the AMD Instinct MI300 series. +* Updated tuning for the AMD Instinct MI300 Series. 
#### Resolved issues @@ -1015,7 +1015,7 @@ Review the [README](https://github.com/ROCm/rocm_bandwidth_test/blob/amd-mainlin ##### CDNA4 (AMD Instinct MI350/MI355) support -* Support for AMD Instinct MI350 series GPUs with the addition of the following counters: +* Support for AMD Instinct MI350 Series GPUs with the addition of the following counters: * VALU co-issue (Two VALUs are issued instructions) efficiency * Stream Processor Instruction (SPI) Wave Occupancy * Scheduler-Pipe Wave Utilization @@ -1034,7 +1034,7 @@ Review the [README](https://github.com/ROCm/rocm_bandwidth_test/blob/amd-mainlin * L2 to EA stalls * L2 to EA stalls per channel -* Roofline support for AMD Instinct MI350 series GPUs. +* Roofline support for AMD Instinct MI350 Series GPUs. ##### Textual User Interface (TUI) (beta version) @@ -1044,9 +1044,9 @@ Review the [README](https://github.com/ROCm/rocm_bandwidth_test/blob/amd-mainlin ##### PC Sampling (beta version) -* Stochastic (hardware-based) PC sampling has been enabled for AMD Instinct MI300X series and later GPUs. +* Stochastic (hardware-based) PC sampling has been enabled for AMD Instinct MI300X Series and later GPUs. -* Host-trap PC Sampling has been enabled for AMD Instinct MI200 series and later GPUs. +* Host-trap PC Sampling has been enabled for AMD Instinct MI200 Series and later GPUs. * Support for sorting of PC sampling by type: offset or count. @@ -1056,7 +1056,7 @@ Review the [README](https://github.com/ROCm/rocm_bandwidth_test/blob/amd-mainlin * Support for Roofline plot on CLI (single run) analysis. -* `FP4` and `FP6` data types have been added for roofline profiling on AMD Instinct MI350 series. +* `FP4` and `FP6` data types have been added for roofline profiling on AMD Instinct MI350 Series. ##### rocprofv3 support @@ -1105,7 +1105,7 @@ Review the [README](https://github.com/ROCm/rocm_bandwidth_test/blob/amd-mainlin * Fixed peak FLOPS of `F8`, `I8`, `F16`, and `BF16` on AMD Instinct MI300. 
* Fixed not detecting memory clock issue when using ``amd-smi``. * Fixed standalone GUI crashing. -* Fixed L2 read/write/atomic bandwidths on AMD Instinct MI350 series. +* Fixed L2 read/write/atomic bandwidths on AMD Instinct MI350 Series. #### Known issues @@ -1299,7 +1299,7 @@ The previous default accumulator types could lead to situations in which unexpec #### Optimized -* Improved performance of `rocprim::device_select` and `rocprim::device_partition` when using multiple streams on the AMD Instinct MI300 series. +* Improved performance of `rocprim::device_select` and `rocprim::device_partition` when using multiple streams on the AMD Instinct MI300 Series. #### Resolved issues @@ -2040,7 +2040,7 @@ for a complete overview of this release. - Changed the name of the `power` field to `energy_accumulator` in the Python API for `amdsmi_get_energy_count()`. - Added violation status output for Graphics Clock Below Host Limit to `amd-smi` CLI: `amdsmi_get_violation_status()`, `amd-smi metric --throttle`, and `amd-smi monitor --violation`. - Users can retrieve violation status through either our Python or C++ APIs. Only available for MI300 series+ ASICs. + Users can retrieve violation status through either our Python or C++ APIs. Only available for MI300 Series+ ASICs. - Updated API `amdsmi_get_violation_status()` structure and CLI `amdsmi_violation_status_t` to include GFX Clk below host limit. @@ -2060,7 +2060,7 @@ for a complete overview of this release. #### Resolved issues -- Fixed `amdsmi_get_gpu_asic_info` and `amd-smi static --asic` not displaying graphics version correctly for Instinct MI200 series, Instinct MI100 series, and RDNA3-based GPUs. +- Fixed `amdsmi_get_gpu_asic_info` and `amd-smi static --asic` not displaying graphics version correctly for Instinct MI200 Series, Instinct MI100 Series, and RDNA3-based GPUs. 
#### Known issues @@ -2699,7 +2699,7 @@ The following lists the backward incompatible changes planned for upcoming major #### Resolved issues -- Fixed `rsmi_dev_target_graphics_version_get`, `rocm-smi --showhw`, and `rocm-smi --showprod` not displaying graphics version correctly for Instinct MI200 series, MI100 series, and RDNA3-based GPUs. +- Fixed `rsmi_dev_target_graphics_version_get`, `rocm-smi --showhw`, and `rocm-smi --showprod` not displaying graphics version correctly for Instinct MI200 Series, MI100 Series, and RDNA3-based GPUs. > [!NOTE] > See the full [ROCm SMI changelog](https://github.com/ROCm/rocm_smi_lib/blob/release/rocm-rel-6.4/CHANGELOG.md) for details, examples, and in-depth descriptions. @@ -6035,7 +6035,7 @@ for a complete overview of this release. #### Resolved issues * Fixed an issue causing ROCm SMI to incorrectly report GPU utilization for RDNA3 GPUs. See the issue on [GitHub](https://github.com/ROCm/ROCm/issues/3112). -* Fixed the parsing of `pp_od_clk_voltage` in `get_od_clk_volt_info` to work better with MI-series hardware. +* Fixed the parsing of `pp_od_clk_voltage` in `get_od_clk_volt_info` to work better with MI-Series hardware. ## ROCm 6.1.1 @@ -6476,8 +6476,8 @@ for a complete overview of this release. #### Added * Added support for additional GPU architectures. - * Navi 3 series: gfx1100, gfx1101, and gfx1102. - * MI300 series: gfx942. + * Navi 3 Series: gfx1100, gfx1101, and gfx1102. + * MI300 Series: gfx942. ### **ROCm SMI** (6.0.0) diff --git a/RELEASE.md b/RELEASE.md index 2e245881e..6fbadb5af 100644 --- a/RELEASE.md +++ b/RELEASE.md @@ -61,7 +61,7 @@ for more information about operating system and hardware compatibility. ROCm 7.0.0 introduces support for KVM Passthrough for AMD Instinct MI350X and MI355X GPUs. -All KVM-based SR-IOV supported configurations require the GIM SR-IOV driver version 8.4.0.K. In addition, support for VMware ESXi 8 has been introduced for select AMD GPUs. 
For more information, see [Virtualization Support](https://rocm.docs.amd.com/projects/install-on-linux-internal/en/latest/reference/system-requirements.html#virtualization-support). +All KVM-based SR-IOV supported configurations require the GIM SR-IOV driver version 8.4.0.K. In addition, support for VMware ESXi 8 has been introduced for AMD Instinct MI300X GPUs. For more information, see [Virtualization Support](https://rocm.docs.amd.com/projects/install-on-linux-internal/en/latest/reference/system-requirements.html#virtualization-support). ### Deep learning and AI framework updates @@ -98,11 +98,7 @@ Megatron-LM for ROCm now supports: ##### TensorFlow -ROCm 7.0.0 enables the following TensorFlow support: - -* Support for TensorFlow 2.19.1. -* MX data type support for AMD Instinct MI350 series GPUs. -* Triton autotuner. +ROCm 7.0.0 enables support for TensorFlow 2.19.1. ##### ONNX Runtime @@ -198,12 +194,12 @@ Key compiler enhancements include: #### New data type support -MX-compliant data types bring microscaling support to ROCm. For more information, see the [OCP Microscaling (MX) Formats Specification](https://www.opencompute.org/documents/ocp-microscaling-formats-mx-v1-0-spec-final-pdf). ROCm 7.0.0 enables functional support for MX data types `FP4`, `FP6`, and `FP8` on AMD Instinct MI350 series GPUs in these ROCm libraries: +MX-compliant data types bring microscaling support to ROCm. For more information, see the [OCP Microscaling (MX) Formats Specification](https://www.opencompute.org/documents/ocp-microscaling-formats-mx-v1-0-spec-final-pdf). 
ROCm 7.0.0 enables functional support for MX data types `FP4`, `FP6`, and `FP8` on AMD Instinct MI350 Series GPUs in these ROCm libraries: * Composable Kernel (`FP4`, `FP6`, and `FP8` only) * hipBLASLt -The following libraries are updated to support the Open Compute Project (OCP) floating-point `FP8` format on MI350 series GPUs instead of the NANOO `FP8` format: +The following libraries are updated to support the Open Compute Project (OCP) floating-point `FP8` format on MI350 Series GPUs instead of the NANOO `FP8` format: * Composable Kernel * hipBLASLt @@ -245,7 +241,7 @@ have been refined for improved usability. See the [AMD SMI changelog](#amd-smi-2 #### ROCgdb -ROCgdb now supports `FP4`, `FP6`, and `FP8` micro-scaling (MX) data types with AMD Instinct MI350 series GPUs. +ROCgdb now supports `FP4`, `FP6`, and `FP8` micro-scaling (MX) data types with AMD Instinct MI350 Series GPUs. See the [ROCgdb changelog](#rocgdb-16-3) for more details. @@ -256,7 +252,7 @@ ROCm Compute Profiler includes the following key changes: * Interactive command line with a Textual User Interface (TUI) has been added to analyze mode. For more details, see [TUI analysis](https://rocm.docs.amd.com/projects/rocprofiler-compute/en/amd-staging/how-to/analyze/tui.html). * Support added for advanced data types: `FP4` and `FP6` * Support for AMD Instinct MI355X and MI350X with addition of performance counters: CPC, SPI, SQ, TA/TD/TCP, and TCC. -* Roofline enhancement added for AMD Instinct MI350 series. +* Roofline enhancement added for AMD Instinct MI350 Series. * Improved support for Selective Kernel profiling. * Program Counter (PC) sampling (Software-based) feature has been enabled for AMD Instinct MI200, MI300X, MI350X, and MI355X GPUs. This feature helps in GPU profiling to understand code execution patterns and hotspots during GPU kernel execution. 
For more details, see [Using PC sampling in ROCm Compute Profiler](https://rocm.docs.amd.com/projects/rocprofiler-compute/en/amd-staging/how-to/pc_sampling.html). * Program Counter (PC) sampling (Hardware-based, Stochastic) feature has been enabled for AMD Instinct MI300X, MI350, and MI355X GPUs. @@ -289,7 +285,7 @@ See the [ROCm Validation Suite changelog](#rocm-validation-suite-1-2-0) for more * ROCprofiler-SDK is now compatible with the HIP 7.0.0 API. * ROCprofiler-SDK adds support for AMD Instinct MI350X and MI355X GPUs. -* The stochastic and host-trap PC sampling support has been added for all AMD Instinct MI300 and MI350 series GPUs, which +* The stochastic and host-trap PC sampling support has been added for all AMD Instinct MI300 and MI350 Series GPUs, which provides information particularly useful for understanding stalls during kernel execution. * The added support for tracing events surfaced by AMD's Kernel Fusion Driver (KFD) captures low-level driver routines involved in mapping, invalidation, and migration of data between CPU and GPU memories. Such events are central to the support for [Unified Memory](https://rocm.docs.amd.com/projects/HIP/en/latest/how-to/hip_runtime_api/memory_management/unified_memory.html) on AMD systems. Tracing of KFD events helps to detect performance problems arising from excessive data migration. * New APIs are added for profiling applications using thread traces (beta) @@ -303,7 +299,7 @@ efficient foundation for analysis and post-processing. ##### rocprofv3 CLI tool enhancements -* Added stochastic and host-trap PC sampling support for all AMD Instinct MI300 and MI350 series GPUs. +* Added stochastic and host-trap PC sampling support for all AMD Instinct MI300 and MI350 Series GPUs. * HIP streams translate to Queues in Time Traces in Perfetto output. * Support for thread trace service. 
@@ -1016,7 +1012,7 @@ For a historical overview of ROCm component updates, see the {doc}`ROCm consolid - Removed duplicated GPU IDs when receiving events using the `amd-smi event` command. -- Fixed `amd-smi monitor` decoder utilization (`DEC%`) not showing up on MI300 series ASICs. +- Fixed `amd-smi monitor` decoder utilization (`DEC%`) not showing up on MI300 Series ASICs. #### Known issues @@ -1212,13 +1208,13 @@ HIP runtime has the following functional improvements which improves runtime per * Refactored memory validation, creates a unique function to validate a variety of memory copy operations. * Improved kernel logging using demangling shader names. * Advanced support for SPIRV, now kernel compilation caching is enabled by default. This feature is controlled by the environment variable `AMD_COMGR_CACHE`, for details, see [hip_rtc document](https://rocm.docs.amd.com/projects/HIP/en/latest/how-to/hip_rtc.html). -* Programmatic support for scratch limits on the AMD Instinct MI300 and MI350 series up GPU devices. More enumeration values were added in `hipLimit_t` as following: +* Programmatic support for scratch limits on the AMD Instinct MI300 and MI350 Series up GPU devices. More enumeration values were added in `hipLimit_t` as following: - `hipExtLimitScratchMin`, minimum allowed value in bytes for scratch limit on the device. - `hipExtLimitScratchMax`, maximum allowed value in bytes for scratch limit on the device. - `hipExtLimitScratchCurrent`, current scratch limit threshold in bytes on the device. Must be between the value `hipExtLimitScratchMin` and `hipExtLimitScratchMax`. Developers can now use the environment variable `HSA_SCRATCH_SINGLE_LIMIT_ASYNC` to change the default allocation size with expected scratch limit in ROCR runtime. On top of it, this value can also be overwritten programmatically in the application using the HIP API `hipDeviceSetLimit(hipExtLimitScratchCurrent, value)` to reset the scratch limit value. 
* HIP runtime now enables peer-to-peer (P2P) memory copies to utilize all available SDMA engines, rather than being limited to a single engine. It also selects the best engine first to give optimal bandwidth. -* Improved launch latency for `D2D` copies and `memset` on MI300 series. +* Improved launch latency for `D2D` copies and `memset` on MI300 Series. * Introduced a threshold to handle the command submission patch to the GPU device(s), considering the synchronization with CPU, for performance improvement. #### Resolved issues @@ -1616,7 +1612,7 @@ HIP runtime has the following functional improvements which improves runtime per * [RNN] Dynamic algorithm optimization. * [Conv] Eliminated redundant clearing of output buffers. * [RNN] Updated selection heuristics. -* Updated tuning for the AMD Instinct MI300 series. +* Updated tuning for the AMD Instinct MI300 Series. #### Resolved issues @@ -1886,7 +1882,7 @@ Review the [README](https://github.com/ROCm/rocm_bandwidth_test/blob/amd-mainlin ##### CDNA4 (AMD Instinct MI350/MI355) support -* Support for AMD Instinct MI350 series GPUs with the addition of the following counters: +* Support for AMD Instinct MI350 Series GPUs with the addition of the following counters: * VALU co-issue (Two VALUs are issued instructions) efficiency * Stream Processor Instruction (SPI) Wave Occupancy * Scheduler-Pipe Wave Utilization @@ -1905,7 +1901,7 @@ Review the [README](https://github.com/ROCm/rocm_bandwidth_test/blob/amd-mainlin * L2 to EA stalls * L2 to EA stalls per channel -* Roofline support for AMD Instinct MI350 series GPUs. +* Roofline support for AMD Instinct MI350 Series GPUs. ##### Textual User Interface (TUI) (beta version) @@ -1915,9 +1911,9 @@ Review the [README](https://github.com/ROCm/rocm_bandwidth_test/blob/amd-mainlin ##### PC Sampling (beta version) -* Stochastic (hardware-based) PC sampling has been enabled for AMD Instinct MI300X series and later GPUs. 
+* Stochastic (hardware-based) PC sampling has been enabled for AMD Instinct MI300X Series and later GPUs. -* Host-trap PC Sampling has been enabled for AMD Instinct MI200 series and later GPUs. +* Host-trap PC Sampling has been enabled for AMD Instinct MI200 Series and later GPUs. * Support for sorting of PC sampling by type: offset or count. @@ -1927,7 +1923,7 @@ Review the [README](https://github.com/ROCm/rocm_bandwidth_test/blob/amd-mainlin * Support for Roofline plot on CLI (single run) analysis. -* `FP4` and `FP6` data types have been added for roofline profiling on AMD Instinct MI350 series. +* `FP4` and `FP6` data types have been added for roofline profiling on AMD Instinct MI350 Series. ##### rocprofv3 support @@ -1976,7 +1972,7 @@ Review the [README](https://github.com/ROCm/rocm_bandwidth_test/blob/amd-mainlin * Fixed peak FLOPS of `F8`, `I8`, `F16`, and `BF16` on AMD Instinct MI300. * Fixed not detecting memory clock issue when using ``amd-smi``. * Fixed standalone GUI crashing. -* Fixed L2 read/write/atomic bandwidths on AMD Instinct MI350 series. +* Fixed L2 read/write/atomic bandwidths on AMD Instinct MI350 Series. #### Known issues @@ -2170,7 +2166,7 @@ The previous default accumulator types could lead to situations in which unexpec #### Optimized -* Improved performance of `rocprim::device_select` and `rocprim::device_partition` when using multiple streams on the AMD Instinct MI300 series. +* Improved performance of `rocprim::device_select` and `rocprim::device_partition` when using multiple streams on the AMD Instinct MI300 Series. #### Resolved issues @@ -2572,7 +2568,7 @@ Default batched General Matrix Multiplications (GEMM) operations for rocBLAS and ### Failure to declare out-of-bound CPERs for bad memory page -Exceeding of bad memory page threshold fails to declare Out-Of-Band Common Platform Error Records (CPERs). This issue affects all AMD Instinct MI300 series and MI350 series GPUs and will be fixed in a future AMD GPU Driver release. 
+Exceeding bad memory page threshold fails to declare Out-Of-Band Common Platform Error Records (CPERs). This issue affects all AMD Instinct MI300 Series and MI350 Series GPUs, and will be fixed in a future AMD GPU Driver release. ## ROCm resolved issues From b80080142748fc56f84b4dd3de94a5eba4f26e29 Mon Sep 17 00:00:00 2001 From: pbhandar-amd <138039281+pbhandar-amd@users.noreply.github.com> Date: Tue, 16 Sep 2025 04:10:31 -0400 Subject: [PATCH 56/58] Update versions.md --- docs/release/versions.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/docs/release/versions.md b/docs/release/versions.md index 943449d03..09a5253f9 100644 --- a/docs/release/versions.md +++ b/docs/release/versions.md @@ -10,6 +10,7 @@ | Version | Release date | | ------- | ------------ | +| [7.0.0](https://rocm.docs.amd.com/en/docs-7.0.0/) | September 16, 2025 | | [6.4.3](https://rocm.docs.amd.com/en/docs-6.4.3/) | August 7, 2025 | | [6.4.2](https://rocm.docs.amd.com/en/docs-6.4.2/) | July 21, 2025 | | [6.4.1](https://rocm.docs.amd.com/en/docs-6.4.1/) | May 21, 2025 | @@ -50,3 +51,4 @@ | [5.0.2](https://rocm.docs.amd.com/en/docs-5.0.2/) | Mar 4, 2022 | | [5.0.1](https://rocm.docs.amd.com/en/docs-5.0.1/) | Feb 16, 2022 | | [5.0.0](https://rocm.docs.amd.com/en/docs-5.0.0/) | Feb 9, 2022 | + From 81f5314368770f53db71e13722ec3e924e5f3999 Mon Sep 17 00:00:00 2001 From: pbhandar-amd <138039281+pbhandar-amd@users.noreply.github.com> Date: Tue, 16 Sep 2025 05:16:12 -0400 Subject: [PATCH 57/58] Update versions.md --- docs/release/versions.md | 1 - 1 file changed, 1 deletion(-) diff --git a/docs/release/versions.md b/docs/release/versions.md index 09a5253f9..f492d2742 100644 --- a/docs/release/versions.md +++ b/docs/release/versions.md @@ -51,4 +51,3 @@ | [5.0.2](https://rocm.docs.amd.com/en/docs-5.0.2/) | Mar 4, 2022 | | [5.0.1](https://rocm.docs.amd.com/en/docs-5.0.1/) | Feb 16, 2022 | | [5.0.0](https://rocm.docs.amd.com/en/docs-5.0.0/) | Feb 9, 2022 | - From 
882f71302add7f8a8492a336dab40c1cb048a7c9 Mon Sep 17 00:00:00 2001 From: Yanyao Wang Date: Tue, 16 Sep 2025 04:25:09 -0500 Subject: [PATCH 58/58] Update default manifest file for ROCm7.0.0 (#5317) Co-authored-by: Wang, Yanyao --- default.xml | 27 +++++++++------------------ 1 file changed, 9 insertions(+), 18 deletions(-) diff --git a/default.xml b/default.xml index 74f068837..d20ce2103 100644 --- a/default.xml +++ b/default.xml @@ -1,7 +1,7 @@ - @@ -9,6 +9,7 @@ + @@ -22,7 +23,7 @@ - + @@ -37,36 +38,26 @@ - - - - - - - - - - - + + - - - - -