KV cache highlight update (#344)

* ROCProfiler deprecation notice updated

* RHEL 9.6 support removed and 9.5 EOS rejected

* KV cache highlight updated

* Feedback from Peter incorporated

Co-authored-by: Peter Park <peter.park@amd.com>

This commit is contained in:
Pratik Basyal
2025-03-27 09:50:51 -04:00
committed by GitHub
parent f418dbac55
commit 97085691f6

@@ -92,7 +92,7 @@ The new bitstream reader feature has been added to [rocDecode](https://github.co
* ROCm Compute Profiler now supports:
* ROCprofiler-SDK/ ROCProfv3
* ROCprofiler-SDK (`rocprofv3`)
* Experimental multi-node profiling support.
* Roofline plot for 64-bit floating point (FP64) and 32-bit floating point (FP32) data types.
@@ -142,7 +142,7 @@ For more information, see [ROCm Runfile Installer](https://rocm.docs.amd.com/pro
### Dynamic calculation of KV cache scaling factors supported
ROCm 6.4.0 enables dynamic calculation of key-value (KV) cache scaling factors.
When using an FP8 key-value (KV) cache, models that don't provide scaling factors (derived from the attention layer projection output scales) might experience reduced accuracy due to quantization. ROCm 6.4.0 enables dynamic calculation of KV cache scaling factors based on the actual runtime weight and input data to improve accuracy. For a vLLM usage example, see [Quantized KV Cache](https://docs.vllm.ai/en/latest/features/quantization/quantized_kvcache.html#usage-example) in the vLLM documentation.
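
As a rough illustration of what a dynamically calculated scaling factor means, the following sketch derives a per-tensor scale from the values observed at runtime so that the largest magnitude maps onto the FP8 maximum. It is not the ROCm or vLLM implementation: the function names `dynamic_scale` and `scale_and_clamp` are hypothetical, the default `fp8_max=448.0` assumes the OCP E4M3 format (other FP8 variants have a different maximum), and rounding to the FP8 grid is omitted.

```python
def dynamic_scale(values, fp8_max=448.0):
    """Return a per-tensor scale computed from the observed data.

    Hypothetical helper: maps the largest absolute value onto fp8_max.
    fp8_max=448.0 is an assumption (OCP E4M3); adjust for other variants.
    """
    amax = max((abs(v) for v in values), default=0.0)
    # Fall back to a neutral scale when there is no data or all zeros.
    return amax / fp8_max if amax > 0 else 1.0


def scale_and_clamp(values, scale, fp8_max=448.0):
    """Divide by the scale and clamp to the representable range.

    Rounding to actual FP8 values is intentionally left out of this sketch.
    """
    return [max(-fp8_max, min(fp8_max, v / scale)) for v in values]


# Example: key/value activations seen at runtime (made-up numbers).
keys = [0.5, -3.0, 1.25]
scale = dynamic_scale(keys)
scaled = scale_and_clamp(keys, scale)
```

A fixed scale of 1.0 would leave these small values crowded into a narrow part of the FP8 range. Deriving the scale from the data spreads them across the full range, which is the intuition behind the accuracy improvement described above.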
### ROCm documentation updates
@@ -1174,7 +1174,7 @@ The following lists the backward incompatible changes planned for upcoming major
#### Added
* Roofline support for Ubuntu 24.04.
* Experimental support rocprofv3 (not enabled as default).
* Experimental support for `rocprofv3` (not enabled by default).
* Experimental feature: Spatial multiplexing.
#### Resolved issues