Docs: adding ray and llama.cpp live blog links (#5290)

2026-01-08 06:13:59 -05:00 · 2025-09-10 15:02:03 -04:00
parent 0840c14b6d
commit 3ca9cb1fcc
3 changed files with 15 additions and 3 deletions
--- a/.wordlist.txt
+++ b/.wordlist.txt
@@ -156,6 +156,7 @@ GEMMs
 GFLOPS
 GFortran
 GFXIP
+GGUF
 Gemma
 GiB
 GIM
--- a/docs/compatibility/ml-compatibility/llama-cpp-compatibility.rst
+++ b/docs/compatibility/ml-compatibility/llama-cpp-compatibility.rst
@@ -67,9 +67,14 @@ llama.cpp is also used in a range of real-world applications, including:
 - Various other AI applications use llama.cpp as their inference engine;  
  for a detailed list, see the `user interfaces (UIs) section <https://github.com/ggml-org/llama.cpp?tab=readme-ov-file#description>`__.

-Refer to the `AMD ROCm blog <https://rocm.blogs.amd.com/>`_,
+For more use cases and recommendations, refer to the `AMD ROCm blog <https://rocm.blogs.amd.com/>`__, 
 where you can search for llama.cpp examples and best practices to optimize your workloads on AMD GPUs.

+- The `Llama.cpp Meets Instinct: A New Era of Open-Source AI Acceleration <https://rocm.blogs.amd.com/ecosystems-and-partners/llama-cpp/README.html>`__
+  blog post outlines how the open-source llama.cpp framework enables efficient LLM inference, including interactive inference with ``llama-cli``,
+  server deployment with ``llama-server``, GGUF model preparation and quantization, performance benchmarking, and optimizations tailored for
+  AMD Instinct GPUs within the ROCm ecosystem.
+
 .. _llama-cpp-docker-compat:

 Docker image compatibility
--- a/docs/compatibility/ml-compatibility/ray-compatibility.rst
+++ b/docs/compatibility/ml-compatibility/ray-compatibility.rst
@@ -66,9 +66,15 @@ Use cases and recommendations
  GPUs. Follow this guide to get started with verl on AMD Instinct GPUs and 
  accelerate your RLHF training with ROCm-optimized performance.

+* The `Exploring Use Cases for Scalable AI: Implementing Ray with ROCm Support for Efficient ML Workflows 
+  <https://rocm.blogs.amd.com/artificial-intelligence/rocm-ray/README.html>`__
+  blog post describes key use cases such as training and inference for large language models (LLMs), 
+  model serving, hyperparameter tuning, reinforcement learning, and the orchestration of large-scale 
+  workloads using Ray in the ROCm environment.
+
 For more use cases and recommendations, see the AMD GPU tabs in the `Accelerator Support 
-topic <https://docs.ray.io/en/latest/ray-core/scheduling/accelerators.html#accelerator-support>`_ 
-of the Ray core documentation and refer to the `AMD ROCm blog <https://rocm.blogs.amd.com/>`_, 
+topic <https://docs.ray.io/en/latest/ray-core/scheduling/accelerators.html#accelerator-support>`__ 
+of the Ray core documentation and refer to the `AMD ROCm blog <https://rocm.blogs.amd.com/>`__, 
 where you can search for Ray examples and best practices to optimize your workloads on AMD GPUs.

 .. _ray-docker-compat: