diff --git a/.wordlist.txt b/.wordlist.txt
index 17f11e77b..2b7b7eb70 100644
--- a/.wordlist.txt
+++ b/.wordlist.txt
@@ -276,6 +276,7 @@ OpenSSL
 OpenVX
 OpenXLA
 Oversubscription
+PagedAttention
 PCC
 PCI
 PCIe
diff --git a/docs/how-to/llm-fine-tuning-optimization/llm-inference-frameworks.rst b/docs/how-to/llm-fine-tuning-optimization/llm-inference-frameworks.rst
index 3ee672353..84e839391 100644
--- a/docs/how-to/llm-fine-tuning-optimization/llm-inference-frameworks.rst
+++ b/docs/how-to/llm-fine-tuning-optimization/llm-inference-frameworks.rst
@@ -16,7 +16,7 @@ This section discusses how to implement `vLLM <https://docs.vllm.ai/en/latest>`_
 vLLM inference
 ==============
 
-vLLM is renowned for its paged attention algorithm that can reduce memory consumption and increase throughput thanks to
+vLLM is renowned for its PagedAttention algorithm, which can reduce memory consumption and increase throughput thanks to
 its paging scheme. Instead of allocating GPU high-bandwidth memory (HBM) for the maximum output token lengths of the
 models, the paged attention of vLLM allocates GPU HBM dynamically for its actual decoding lengths. This paged attention
 is also effective when multiple requests share the same key and value contents for a large value of beam search or
@@ -139,9 +139,7 @@ Refer to :ref:`mi300x-vllm-optimization` for performance optimization tips.
 
 ROCm provides a prebuilt optimized Docker image for validating the performance of LLM inference with vLLM 
 on the MI300X accelerator. The Docker image includes ROCm, vLLM, PyTorch, and tuning files in the CSV 
-format. For more information, see the guide to 
-`LLM inference performance validation with vLLM on the AMD Instinct™ MI300X accelerator <https://github.com/ROCm/MAD/blob/develop/benchmark/vllm/README.md>`_ 
-on the ROCm GitHub repository.
+format. For more information, see :doc:`/how-to/performance-validation/mi300x/vllm-benchmark`.
 
 .. _fine-tuning-llms-tgi:
 
diff --git a/docs/how-to/rocm-for-ai/deploy-your-model.rst b/docs/how-to/rocm-for-ai/deploy-your-model.rst
index 0435e83ee..7a96f32e0 100644
--- a/docs/how-to/rocm-for-ai/deploy-your-model.rst
+++ b/docs/how-to/rocm-for-ai/deploy-your-model.rst
@@ -46,9 +46,7 @@ Validating vLLM performance
 
 ROCm provides a prebuilt optimized Docker image for validating the performance of LLM inference with vLLM 
 on the MI300X accelerator. The Docker image includes ROCm, vLLM, PyTorch, and tuning files in the CSV 
-format. For more information, see the guide to 
-`LLM inference performance validation with vLLM on the AMD Instinct™ MI300X accelerator <https://github.com/ROCm/MAD/blob/develop/benchmark/vllm/README.md>`_ 
-on the ROCm GitHub repository.
+format. For more information, see :doc:`/how-to/performance-validation/mi300x/vllm-benchmark`.
 
 .. _rocm-for-ai-serve-hugging-face-tgi:
 
diff --git a/docs/how-to/tuning-guides/mi300x/workload.rst b/docs/how-to/tuning-guides/mi300x/workload.rst
index 6857eae1b..56ad0e98a 100644
--- a/docs/how-to/tuning-guides/mi300x/workload.rst
+++ b/docs/how-to/tuning-guides/mi300x/workload.rst
@@ -152,9 +152,7 @@ address any new bottlenecks that may emerge.
 
 ROCm provides a prebuilt optimized Docker image that has everything required to implement
 the tips in this section. It includes ROCm, vLLM, PyTorch, and tuning files in the CSV 
-format. For more information, see the guide to 
-`LLM inference performance validation with vLLM on the AMD Instinct™ MI300X accelerator <https://github.com/ROCm/MAD/blob/develop/benchmark/vllm/README.md>`_ 
-on the ROCm GitHub repository.
+format. For more information, see :doc:`/how-to/performance-validation/mi300x/vllm-benchmark`.
 
 .. _mi300x-profiling-tools:
 
@@ -378,11 +376,10 @@ Refer to `vLLM documentation <https://docs.vllm.ai/en/latest/models/performance.
 for additional performance tips. :ref:`fine-tuning-llms-vllm` describes vLLM
 usage with ROCm.
 
-ROCm provides a prebuilt optimized Docker image for validating the performance of LLM inference with vLLM 
-on the MI300X accelerator. The Docker image includes ROCm, vLLM, PyTorch, and tuning files in the CSV 
-format. For more information, see the guide to 
-`LLM inference performance validation with vLLM on the AMD Instinct™ MI300X accelerator <https://github.com/ROCm/MAD/blob/develop/benchmark/vllm/README.md>`_ 
-on the ROCm GitHub repository.
+ROCm provides a prebuilt optimized Docker image for validating the performance
+of LLM inference with vLLM on the MI300X accelerator. The Docker image includes
+ROCm, vLLM, PyTorch, and tuning files in the CSV format. For more information,
+see :doc:`/how-to/performance-validation/mi300x/vllm-benchmark`.
 
 Maximize throughput
 -------------------