mirror of
https://github.com/ROCm/ROCm.git
synced 2026-02-11 06:55:06 -05:00
Rename Omnitools to ROCm Compute/Systems Profiler (#183)
* rename Omniperf and Omnitrace
* rename labels
* rename more labels
* update licenses and rocm-tools.md
* fix rocprof-sys ref
@@ -52,8 +52,6 @@ additional licenses. Please review individual repositories for more information.
|
||||
| [MIGraphX](https://github.com/ROCm/AMDMIGraphX/) | [MIT](https://github.com/ROCm/AMDMIGraphX/blob/develop/LICENSE) |
|
||||
| [MIOpen](https://github.com/ROCm/MIOpen/) | [MIT](https://github.com/ROCm/MIOpen/blob/develop/LICENSE.txt) |
|
||||
| [MIVisionX](https://github.com/ROCm/MIVisionX/) | [MIT](https://github.com/ROCm/MIVisionX/blob/develop/LICENSE.txt) |
|
||||
| [Omniperf](https://github.com/ROCm/omniperf) | [MIT](https://github.com/ROCm/omniperf/blob/main/LICENSE) |
|
||||
| [Omnitrace](https://github.com/ROCm/omnitrace) | [MIT](https://github.com/ROCm/omnitrace/blob/main/LICENSE) |
|
||||
| [rocAL](https://github.com/ROCm/rocAL) | [MIT](https://github.com/ROCm/rocAL/blob/develop/LICENSE.txt) |
|
||||
| [rocALUTION](https://github.com/ROCm/rocALUTION/) | [MIT](https://github.com/ROCm/rocALUTION/blob/develop/LICENSE.md) |
|
||||
| [rocBLAS](https://github.com/ROCm/rocBLAS/) | [MIT](https://github.com/ROCm/rocBLAS/blob/develop/LICENSE.md) |
|
||||
@@ -67,11 +65,13 @@ additional licenses. Please review individual repositories for more information.
|
||||
| [ROCm CMake](https://github.com/ROCm/rocm-cmake/) | [MIT](https://github.com/ROCm/rocm-cmake/blob/develop/LICENSE) |
|
||||
| [ROCm Communication Collectives Library (RCCL)](https://github.com/ROCm/rccl/) | [Custom](https://github.com/ROCm/rccl/blob/develop/LICENSE.txt) |
|
||||
| [ROCm-Core](https://github.com/ROCm/rocm-core) | [MIT](https://github.com/ROCm/rocm-core/blob/master/copyright) |
|
||||
| [ROCm Compute Profiler](https://github.com/ROCm/rocprofiler-compute) | [MIT](https://github.com/ROCm/rocprofiler-compute/blob/amd-staging/LICENSE) |
|
||||
| [ROCm Data Center (RDC)](https://github.com/ROCm/rdc/) | [MIT](https://github.com/ROCm/rdc/blob/develop/LICENSE) |
|
||||
| [ROCm-Device-Libs](https://github.com/ROCm/llvm-project/tree/amd-staging/amd/device-libs) | [The University of Illinois/NCSA](https://github.com/ROCm/llvm-project/blob/amd-staging/amd/device-libs/LICENSE.TXT) |
|
||||
| [ROCm-OpenCL-Runtime](https://github.com/ROCm/clr/tree/develop/opencl) | [MIT](https://github.com/ROCm/clr/blob/develop/opencl/LICENSE.txt) |
|
||||
| [ROCm Performance Primitives (RPP)](https://github.com/ROCm/rpp) | [MIT](https://github.com/ROCm/rpp/blob/develop/LICENSE) |
|
||||
| [ROCm SMI Lib](https://github.com/ROCm/rocm_smi_lib/) | [MIT](https://github.com/ROCm/rocm_smi_lib/blob/develop/License.txt) |
|
||||
| [ROCm Systems Profiler](https://github.com/ROCm/rocprofiler-systems) | [MIT](https://github.com/ROCm/rocprofiler-systems/blob/amd-staging/LICENSE) |
|
||||
| [ROCm Validation Suite](https://github.com/ROCm/ROCmValidationSuite/) | [MIT](https://github.com/ROCm/ROCmValidationSuite/blob/master/LICENSE) |
|
||||
| [rocPRIM](https://github.com/ROCm/rocPRIM/) | [MIT](https://github.com/ROCm/rocPRIM/blob/develop/LICENSE.txt) |
|
||||
| [ROCProfiler](https://github.com/ROCm/rocprofiler/) | [MIT](https://github.com/ROCm/rocprofiler/blob/amd-master/LICENSE) |
|
||||
|
||||
@@ -111,10 +111,10 @@ Accelerators and GPUs listed in the following table support compute workloads (n
|
||||
:doc:`ROCm Validation Suite <rocmvalidationsuite:index>`,rocm-6.2.4,rocm-6.2.2,rocm-6.1.0
|
||||
,,,
|
||||
PERFORMANCE TOOLS,,,
|
||||
:doc:`Omniperf <omniperf:index>`,2.0.1,2.0.1,N/A
|
||||
:doc:`Omnitrace <omnitrace:index>`,1.11.2,1.11.2,N/A
|
||||
:doc:`ROCm Bandwidth Test <rocm_bandwidth_test:index>`,1.4.0,1.4.0,1.4.0
|
||||
:doc:`ROCProfiler <rocprofiler:index>`,2.0.60204,2.0.60202,2.0.60100
|
||||
:doc:`ROCm Compute Profiler <rocprofiler-compute:index>`,2.0.1,2.0.1,N/A
|
||||
:doc:`ROCm Systems Profiler <rocprofiler-systems:index>`,1.11.2,1.11.2,N/A
|
||||
:doc:`ROCProfiler <rocprofiler:index>`,2.0.60202,2.0.60201,2.0.60100
|
||||
:doc:`ROCprofiler-SDK <rocprofiler-sdk:index>`,0.4.0,0.4.0,N/A
|
||||
:doc:`ROCTracer <roctracer:index>`,4.1.60204,4.1.60202,4.1.60100
|
||||
,,,
|
||||
|
||||
@@ -22,8 +22,8 @@ as well as other profiling and debugging suggestions.
|
||||
|
||||
* :ref:`ROCProfiler <mi300x-rocprof>`
|
||||
|
||||
* :ref:`Omniperf <mi300x-omniperf>`
|
||||
* :ref:`ROCm Compute Profiler <mi300x-rocprof-compute>`
|
||||
|
||||
* :ref:`Omnitrace <mi300x-omnitrace>`
|
||||
* :ref:`ROCm Systems Profiler <mi300x-rocprof-systems>`
|
||||
|
||||
* :ref:`ROCr Debug Agent <mi300x-rocr-debug-agent>`
|
||||
|
||||
@@ -67,7 +67,7 @@ When profiling indicates that GPUs are a performance bottleneck, delve deeper
|
||||
into kernel-level profiling. Tools such as the
|
||||
:ref:`ROCr Debug Agent <mi300x-rocr-debug-agent>`,
|
||||
:ref:`ROCProfiler <mi300x-rocprof>`, and
|
||||
:ref:`Omniperf <mi300x-omniperf>` offer detailed insights
|
||||
:ref:`ROCm Compute Profiler <mi300x-rocprof-compute>` offer detailed insights
|
||||
into GPU kernel execution. These tools can help isolate problematic GPU
|
||||
operations and provide data needed for targeted optimizations.
|
||||
|
||||
@@ -171,9 +171,9 @@ tools available depending on their specific profiling needs.
|
||||
:doc:`ROCProfiler <rocprofiler:index>`
|
||||
documentation.
|
||||
|
||||
* Omniperf builds upon ROCProfiler but provides more guided analysis.
|
||||
* ROCm Compute Profiler builds upon ROCProfiler but provides more guided analysis.
|
||||
For more information, see
|
||||
:doc:`Omniperf documentation <omniperf:index>`.
|
||||
:doc:`ROCm Compute Profiler documentation <rocprofiler-compute:index>`.
|
||||
|
||||
Refer to :doc:`/how-to/llm-fine-tuning-optimization/profiling-and-debugging`
|
||||
to explore commonly used profiling tools and their usage patterns.
|
||||
@@ -244,9 +244,9 @@ working with AMD Instinct accelerators have multiple tools depending on their sp
|
||||
|
||||
* :ref:`ROCProfiler <mi300x-rocprof>`
|
||||
|
||||
* :ref:`Omniperf <mi300x-omniperf>`
|
||||
* :ref:`ROCm Compute Profiler <mi300x-rocprof-compute>`
|
||||
|
||||
* :ref:`Omnitrace <mi300x-omnitrace>`
|
||||
* :ref:`ROCm Systems Profiler <mi300x-rocprof-systems>`
|
||||
|
||||
.. _mi300x-rocprof:
|
||||
|
||||
@@ -271,61 +271,61 @@ ability to collect timeline traces of the accelerator software stack as well as
|
||||
gives the user full access and control of raw performance profiling data, but requires extra effort to analyze the
|
||||
collected data.
|
||||
|
||||
.. _mi300x-omniperf:
|
||||
.. _mi300x-rocprof-compute:
|
||||
|
||||
Omniperf
|
||||
ROCm Compute Profiler
|
||||
^^^^^^^^^^^^^^^^^^^^^
|
||||
|
||||
:doc:`Omniperf <omniperf:index>` is a system performance profiler for high-performance computing (HPC) and
|
||||
machine learning (ML) workloads using Instinct accelerators. Under the hood, Omniperf uses
|
||||
:ref:`ROCProfiler <mi300x-rocprof>` to collect hardware performance counters. The Omniperf tool performs
|
||||
:doc:`ROCm Compute Profiler <rocprofiler-compute:index>` is a system performance profiler for high-performance computing (HPC) and
|
||||
machine learning (ML) workloads using Instinct accelerators. Under the hood, ROCm Compute Profiler uses
|
||||
:ref:`ROCProfiler <mi300x-rocprof>` to collect hardware performance counters. The ROCm Compute Profiler tool performs
|
||||
system profiling based on all approved hardware counters for Instinct
|
||||
accelerator architectures. It provides high-level performance analysis features including System Speed-of-Light, IP
|
||||
block Speed-of-Light, Memory Chart Analysis, Roofline Analysis, Baseline Comparisons, and more.
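As a rough illustration of what a roofline analysis computes, the sketch below applies the standard roofline bound: attainable throughput is the smaller of the peak compute rate and the memory bandwidth multiplied by the kernel's arithmetic intensity. The function name and the device numbers are hypothetical and chosen only to keep the arithmetic easy to follow; they are not taken from the profiler or from any real accelerator.

```python
def attainable_gflops(peak_gflops: float,
                      peak_bandwidth_gbs: float,
                      arithmetic_intensity: float) -> float:
    """Standard roofline bound.

    arithmetic_intensity is in FLOP per byte moved, so
    bandwidth (GB/s) * intensity (FLOP/byte) gives GFLOP/s.
    """
    memory_bound = peak_bandwidth_gbs * arithmetic_intensity
    return min(peak_gflops, memory_bound)


# Hypothetical device: 1000 GFLOP/s peak compute, 100 GB/s memory bandwidth.
PEAK_GFLOPS = 1000.0
PEAK_BANDWIDTH_GBS = 100.0

# The ridge point is the intensity at which the two limits meet.
ridge_point = PEAK_GFLOPS / PEAK_BANDWIDTH_GBS

low_intensity = attainable_gflops(PEAK_GFLOPS, PEAK_BANDWIDTH_GBS, 2.0)
high_intensity = attainable_gflops(PEAK_GFLOPS, PEAK_BANDWIDTH_GBS, 50.0)

print(f"ridge point: {ridge_point} FLOP/byte")
print(f"intensity 2.0  -> {low_intensity} GFLOP/s (memory-bound)")
print(f"intensity 50.0 -> {high_intensity} GFLOP/s (compute-bound)")
```

A kernel whose measured throughput sits well below this bound at its intensity is a candidate for further investigation.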
|
||||
|
||||
Omniperf takes the guesswork out of profiling by removing the need to provide text input files with lists of counters
|
||||
to collect and analyze raw CSV output files as is the case with ROC-profiler. Instead, Omniperf automates the collection
|
||||
ROCm Compute Profiler takes the guesswork out of profiling by removing the need to provide text input files with lists of counters
|
||||
to collect and analyze raw CSV output files as is the case with ROCProfiler. Instead, ROCm Compute Profiler automates the collection
|
||||
of all available hardware counters in one command and provides a graphical interface to help users understand and
|
||||
analyze bottlenecks and stressors for their computational workloads on AMD Instinct accelerators.
|
||||
|
||||
.. note::
|
||||
|
||||
Omniperf collects hardware counters in multiple passes, and will therefore re-run the application during each pass
|
||||
ROCm Compute Profiler collects hardware counters in multiple passes, and will therefore re-run the application during each pass
|
||||
to collect different sets of metrics.
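The multi-pass behavior described in the note can be pictured with a deliberately simplified sketch: if only a limited number of counters can be collected per run, the full counter list has to be split into groups, and the application is run once per group. The helper name ``plan_passes``, the per-pass limit, and the flat grouping are assumptions made for illustration; the real tool's grouping rules depend on the hardware and are not reproduced here. The counter names are used only as sample strings.

```python
from typing import Iterable, List


def plan_passes(counters: Iterable[str], max_per_pass: int) -> List[List[str]]:
    """Split a counter list into consecutive groups of at most max_per_pass.

    Hypothetical helper: each returned group stands for one re-run of the
    application that collects only the counters in that group.
    """
    if max_per_pass < 1:
        raise ValueError("max_per_pass must be at least 1")
    # Drop duplicates while preserving the requested order.
    ordered = list(dict.fromkeys(counters))
    return [ordered[i:i + max_per_pass]
            for i in range(0, len(ordered), max_per_pass)]


requested = ["SQ_WAVES", "SQ_INSTS_VALU", "TCC_HIT", "TCC_MISS", "GRBM_GUI_ACTIVE"]
passes = plan_passes(requested, max_per_pass=2)

for index, group in enumerate(passes, start=1):
    print(f"pass {index}: run the application and collect {group}")
```

Because every pass is a separate run, profiling time grows with the number of passes, and workloads with non-deterministic behavior may show small differences between passes.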
|
||||
|
||||
.. figure:: ../../../data/how-to/tuning-guides/omniperf-analysis.png
|
||||
.. figure:: ../../../data/how-to/tuning-guides/rocprof-compute-analysis.png
|
||||
|
||||
Omniperf memory chat analysis panel.
|
||||
ROCm Compute Profiler memory chart analysis panel.
|
||||
|
||||
In brief, Omniperf provides details about hardware activity for a particular GPU kernel. It also supports both
|
||||
In brief, ROCm Compute Profiler provides details about hardware activity for a particular GPU kernel. It also supports both
|
||||
a web-based GUI and a command-line analyzer, so you can use whichever you prefer.
|
||||
|
||||
.. _mi300x-omnitrace:
|
||||
.. _mi300x-rocprof-systems:
|
||||
|
||||
Omnitrace
|
||||
ROCm Systems Profiler
|
||||
^^^^^^^^^^^^^^^^^^^^^
|
||||
|
||||
:doc:`Omnitrace <omnitrace:index>` is a comprehensive profiling and tracing tool for parallel applications,
|
||||
:doc:`ROCm Systems Profiler <rocprofiler-systems:index>` is a comprehensive profiling and tracing tool for parallel applications,
|
||||
including HPC and ML packages, written in C, C++, Fortran, HIP, OpenCL, and Python which execute on the CPU or CPU and
|
||||
GPU. It is capable of gathering the performance information of functions through any combination of binary
|
||||
instrumentation, call-stack sampling, user-defined regions, and Python interpreter hooks.
|
||||
|
||||
Omnitrace supports interactive visualization of comprehensive traces in the web browser in addition to high-level
|
||||
ROCm Systems Profiler supports interactive visualization of comprehensive traces in the web browser in addition to high-level
|
||||
summary profiles with ``mean/min/max/stddev`` statistics. Beyond runtime
|
||||
information, Omnitrace supports the collection of system-level metrics such as CPU frequency, GPU temperature, and GPU
|
||||
information, ROCm Systems Profiler supports the collection of system-level metrics such as CPU frequency, GPU temperature, and GPU
|
||||
utilization. Process and thread level metrics such as memory usage, page faults, context switches, and numerous other
|
||||
hardware counters are also included.
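To make the ``mean/min/max/stddev`` summary concrete, the sketch below computes the same four statistics for a list of per-call timings using only the Python standard library. The function name, the sample timings, and the choice of the sample (rather than population) standard deviation are assumptions for illustration; the text above does not say which definition the profiler uses.

```python
import statistics
from typing import Dict, Sequence


def summarize(samples: Sequence[float]) -> Dict[str, float]:
    """Return mean/min/max/stddev for a non-empty sequence of measurements."""
    if not samples:
        raise ValueError("samples must not be empty")
    return {
        "mean": statistics.fmean(samples),
        "min": min(samples),
        "max": max(samples),
        # Sample standard deviation needs at least two values (assumed choice).
        "stddev": statistics.stdev(samples) if len(samples) > 1 else 0.0,
    }


# Hypothetical per-call durations of one function, in milliseconds.
durations_ms = [1.0, 2.0, 3.0, 4.0]
summary = summarize(durations_ms)

for name, value in summary.items():
    print(f"{name}: {value:.3f} ms")
```

A large standard deviation relative to the mean suggests that a function's cost varies between calls, which is usually worth inspecting in the timeline trace.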
|
||||
|
||||
.. tip::
|
||||
|
||||
When analyzing the performance of an application, it is best not to assume you know where the performance
|
||||
bottlenecks are and why they are happening. Omnitrace is the ideal tool for characterizing where optimization would
|
||||
bottlenecks are and why they are happening. ROCm Systems Profiler is the ideal tool for characterizing where optimization would
|
||||
have the greatest impact on the end-to-end execution of the application and to discover what else is happening on the
|
||||
system during a performance bottleneck.
|
||||
|
||||
.. figure:: ../../../data/how-to/tuning-guides/omnitrace-timeline.png
|
||||
.. figure:: ../../../data/how-to/tuning-guides/rocprof-systems-timeline.png
|
||||
|
||||
Omnitrace timeline trace example.
|
||||
ROCm Systems Profiler timeline trace example.
|
||||
|
||||
For detailed usage and examples of these tools, refer to the
|
||||
`Introduction to profiling tools for AMD hardware <https://rocm.blogs.amd.com/software-tools-optimization/profilers/README.html>`_
|
||||
|
||||
@@ -29,9 +29,9 @@
|
||||
:::{grid-item-card} Performance
|
||||
:class-body: rocm-card-banner rocm-hue-6
|
||||
|
||||
* {doc}`Omniperf <omniperf:index>`
|
||||
* {doc}`Omnitrace <omnitrace:index>`
|
||||
* {doc}`ROCm Bandwidth Test <rocm_bandwidth_test:index>`
|
||||
* {doc}`ROCm Compute Profiler <rocprofiler-compute:index>`
|
||||
* {doc}`ROCm Systems Profiler <rocprofiler-systems:index>`
|
||||
* {doc}`ROCProfiler <rocprofiler:index>`
|
||||
* {doc}`ROCprofiler-SDK <rocprofiler-sdk:index>`
|
||||
* {doc}`ROCTracer <roctracer:index>`
|
||||
|
||||
@@ -113,9 +113,9 @@ Performance
|
||||
.. csv-table::
|
||||
:header: "Component", "Description"
|
||||
|
||||
":doc:`Omniperf <omniperf:index>`", "System performance profiling tool for machine learning and HPC workloads"
|
||||
":doc:`Omnitrace <omnitrace:index>`", "Comprehensive profiling and tracing tool for HIP applications"
|
||||
":doc:`ROCm Bandwidth Test <rocm_bandwidth_test:index>`", "Captures the performance characteristics of buffer copying and kernel read/write operations"
|
||||
":doc:`ROCm Compute Profiler <rocprofiler-compute:index>`", "System performance profiling tool for machine learning and HPC workloads"
|
||||
":doc:`ROCm Systems Profiler <rocprofiler-systems:index>`", "Comprehensive profiling and tracing tool for HIP applications"
|
||||
":doc:`ROCProfiler <rocprofiler:index>`", "Profiling tool for HIP applications"
|
||||
":doc:`ROCprofiler-SDK <rocprofiler-sdk:index>`", "Toolkit for developing analysis tools for profiling and tracing GPU compute applications. This toolkit is in beta and subject to change"
|
||||
":doc:`ROCTracer <roctracer:index>`", "Intercepts runtime API calls and traces asynchronous activity"
|
||||
|
||||