From bb7af3351a133a46551e508b67b86fd54ca057cd Mon Sep 17 00:00:00 2001
From: Peter Park
Date: Thu, 8 May 2025 09:24:51 -0400
Subject: [PATCH] Fix incorrect throughput benchmark command in
 inference/vllm-benchmark.rst (#4723)

* update inference index to include pyt inference

* fix incorrect command in throughput benchmark

* wording
---
 docs/how-to/rocm-for-ai/inference/index.rst          | 4 +++-
 docs/how-to/rocm-for-ai/inference/vllm-benchmark.rst | 6 +++---
 2 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/docs/how-to/rocm-for-ai/inference/index.rst b/docs/how-to/rocm-for-ai/inference/index.rst
index 298014b6a..779c32381 100644
--- a/docs/how-to/rocm-for-ai/inference/index.rst
+++ b/docs/how-to/rocm-for-ai/inference/index.rst
@@ -20,6 +20,8 @@ training, fine-tuning, and inference. It leverages popular machine learning fram
 
 - :doc:`LLM inference frameworks `
 
-- :doc:`Performance testing `
+- :doc:`vLLM inference performance testing `
+
+- :doc:`PyTorch inference performance testing `
 
 - :doc:`Deploying your model `

diff --git a/docs/how-to/rocm-for-ai/inference/vllm-benchmark.rst b/docs/how-to/rocm-for-ai/inference/vllm-benchmark.rst
index df6454aa9..8d530778f 100644
--- a/docs/how-to/rocm-for-ai/inference/vllm-benchmark.rst
+++ b/docs/how-to/rocm-for-ai/inference/vllm-benchmark.rst
@@ -276,7 +276,7 @@ vLLM inference performance testing
 
    * Latency benchmark
 
-     Use this command to benchmark the latency of the {{model.model}} model on eight GPUs with the ``{{model.precision}}`` data type.
+     Use this command to benchmark the latency of the {{model.model}} model on eight GPUs with ``{{model.precision}}`` precision.
 
      .. code-block::
 
@@ -286,11 +286,11 @@ vLLM inference performance testing
 
    * Throughput benchmark
 
-     Use this command to throughput the latency of the {{model.model}} model on eight GPUs with the ``{{model.precision}}`` data type.
+     Use this command to benchmark the throughput of the {{model.model}} model on eight GPUs with ``{{model.precision}}`` precision.
 
      .. code-block:: shell
 
-        ./vllm_benchmark_report.sh -s latency -m {{model.model_repo}} -g 8 -d {{model.precision}}
+        ./vllm_benchmark_report.sh -s throughput -m {{model.model_repo}} -g 8 -d {{model.precision}}
 
      Find the throughput report at ``./reports_{{model.precision}}_vllm_rocm{{unified_docker.rocm_version}}/summary/{{model.model_repo.split('/', 1)[1] if '/' in model.model_repo else model.model_repo}}_throughput_report.csv``.
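
For reference, the fix this patch makes can be sketched as a small dry-run wrapper that prints the latency and throughput invocations without executing the benchmark script. This is a minimal sketch: `MODEL_REPO` and `PRECISION` here are hypothetical placeholder values, not values from the patch's templates.

```shell
#!/bin/sh
# Dry run: print the two benchmark commands rather than executing them.
# MODEL_REPO and PRECISION are hypothetical placeholders for this sketch.
MODEL_REPO="org/model"
PRECISION="float16"
NUM_GPUS=8

for scenario in latency throughput; do
    # The bug fixed by this patch: the throughput section previously passed
    # "-s latency"; the scenario flag must match the benchmark being run.
    echo "./vllm_benchmark_report.sh -s $scenario -m $MODEL_REPO -g $NUM_GPUS -d $PRECISION"
done
```

Running the loop makes the corrected pairing visible: each scenario name appears once as the `-s` argument, so the throughput report is produced by `-s throughput`, not by a second latency run.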