.. meta::
   :description: How to use ROCm for AI inference workloads.
   :keywords: ROCm, AI, machine learning, LLM, AI inference, NLP, GPUs, usage, tutorial

****************************
Use ROCm for AI inference
****************************

AI inference is the process of using a trained machine learning model to make predictions or classifications on new data. In production, this commonly means running the model on real-time data and acting quickly on the predictions it returns.
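
As a concrete illustration, the following is a minimal sketch of a single inference step in PyTorch. It is not taken from the linked topics: the tiny ``torch.nn.Linear`` model stands in for a trained model, and the input shape is illustrative. On a ROCm build of PyTorch, supported AMD GPUs are addressed through the ``cuda`` device interface.

.. code-block:: python

   import torch

   # Stand-in for a trained model; in practice this would be loaded from a checkpoint.
   model = torch.nn.Linear(16, 4)

   # On a ROCm build of PyTorch, AMD GPUs are exposed through the "cuda" device name.
   device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
   model.to(device)
   model.eval()

   # New data to classify; the shape here is illustrative.
   new_data = torch.randn(1, 16, device=device)

   # Inference needs only a forward pass, so gradient tracking is disabled.
   with torch.no_grad():
       logits = model(new_data)

   # The predicted class is the index of the highest-scoring output.
   print(logits.argmax(dim=-1).item())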

Understanding the ROCm™ software platform's architecture and capabilities is vital for running AI inference. ROCm combines high-performance computing with efficient resource management, so inference workloads can deliver predictions and classifications on real-time data faster.

The following topics provide a comprehensive guide to setting up and deploying AI inference on AMD GPUs. This includes instructions on how to install ROCm, how to use Hugging Face Transformers to manage pre-trained models for natural language processing (NLP) tasks, how to validate vLLM on AMD Instinct™ MI300X GPUs, and how to deploy trained models in production environments.
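
For example, a pre-trained NLP model from the Hugging Face Hub can be run through the Transformers ``pipeline`` API. The snippet below is a minimal sketch rather than an excerpt from the linked topics; it assumes a ROCm-enabled PyTorch build and the ``transformers`` package are installed, and the model name is only an illustrative public checkpoint.

.. code-block:: python

   import torch
   from transformers import pipeline

   # device=0 selects the first GPU; on ROCm, AMD GPUs are reached through
   # PyTorch's CUDA device API, so no ROCm-specific flags are needed here.
   classifier = pipeline(
       "sentiment-analysis",
       model="distilbert-base-uncased-finetuned-sst-2-english",  # illustrative checkpoint
       device=0 if torch.cuda.is_available() else -1,
   )

   print(classifier("ROCm makes it straightforward to run inference on AMD GPUs."))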

The AI Developer Hub contains `AMD ROCm tutorials <https://rocm.docs.amd.com/projects/ai-developer-hub/en/latest/>`_ for training, fine-tuning, and inference. The tutorials leverage popular machine learning frameworks on AMD GPUs.

- :doc:`Installing ROCm and machine learning frameworks <../install>`
- :doc:`Running models from Hugging Face <hugging-face-models>`
- :doc:`LLM inference frameworks <llm-inference-frameworks>`
- :doc:`vLLM inference performance testing <benchmark-docker/vllm>`
- :doc:`PyTorch inference performance testing <benchmark-docker/pytorch-inference>`
- :doc:`SGLang inference performance testing <benchmark-docker/sglang>`
- :doc:`Deploying your model <deploy-your-model>`
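
As a quick orientation for the vLLM topics above, the following is a minimal sketch of vLLM's offline ``generate`` API. It assumes a ROCm-compatible vLLM build is installed; the model name is a small illustrative checkpoint, not a recommendation.

.. code-block:: python

   from vllm import LLM, SamplingParams

   # Load a small model for demonstration; larger models follow the same pattern.
   llm = LLM(model="facebook/opt-125m")

   # Sampling settings for short completions.
   sampling_params = SamplingParams(temperature=0.8, max_tokens=64)

   # Generate completions for a batch of prompts.
   outputs = llm.generate(["What is AI inference?"], sampling_params)
   for output in outputs:
       print(output.outputs[0].text)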