Mirror of https://github.com/ROCm/ROCm.git (synced 2026-01-09 22:58:17 -05:00)
Compare commits: docs_fix_a...docs/6.0.0 (9 commits)
| Author | SHA1 | Date |
|---|---|---|
| | f9da490d91 | |
| | dcbc392be6 | |
| | 1645bc8c34 | |
| | 67727ccc04 | |
| | f6658eabde | |
| | f6bc5eb332 | |
| | dafff84ad7 | |
| | 3be264865d | |
| | 77acc99361 | |
32
.github/ISSUE_TEMPLATE/1_feature_request.yml
vendored
@@ -1,32 +0,0 @@
|
||||
name: Feature Suggestion
|
||||
description: Suggest an additional functionality, or new way of handling an existing functionality.
|
||||
title: "[Feature]: "
|
||||
|
||||
body:
|
||||
- type: markdown
|
||||
attributes:
|
||||
value: |
|
||||
Thank you for taking the time to make a suggestion!
|
||||
|
||||
- type: textarea
|
||||
attributes:
|
||||
label: Suggestion Description
|
||||
description: Describe your suggestion.
|
||||
validations:
|
||||
required: true
|
||||
- type: input
|
||||
attributes:
|
||||
label: Operating System
|
||||
description: (Optional) If this is for a specific OS, you can mention it here.
|
||||
placeholder: "e.g. Ubuntu"
|
||||
- type: input
|
||||
attributes:
|
||||
label: GPU
|
||||
description: (Optional) If this is for a specific GPU or GPU family, you can mention it here.
|
||||
placeholder: "e.g. MI200"
|
||||
- type: input
|
||||
attributes:
|
||||
label: ROCm Component
|
||||
description: (Optional) If this issue relates to a specific ROCm component, it can be mentioned here.
|
||||
placeholder: "e.g. rocBLAS"
|
||||
|
||||
1
.github/ISSUE_TEMPLATE/config.yml
vendored
@@ -1 +0,0 @@
|
||||
blank_issues_enabled: true
|
||||
180
.github/ISSUE_TEMPLATE/issue_report.yml
vendored
@@ -1,180 +0,0 @@
|
||||
name: Issue Report
|
||||
description: File a report for ROCm related issues on Linux and Windows. For issues pertaining to documentation or non-bug related, please open a blank issue located below.
|
||||
title: "[Issue]: "
|
||||
|
||||
body:
|
||||
- type: markdown
|
||||
attributes:
|
||||
value: |
|
||||
Thank you for taking the time to fill out this report!
|
||||
|
||||
You can acquire your OS, CPU, GPU (for filling out this report) with the following commands:
|
||||
|
||||
Linux:
|
||||
echo "OS:" && cat /etc/os-release | grep -E "^(NAME=|VERSION=)";
|
||||
echo "CPU: " && cat /proc/cpuinfo | grep "model name" | sort --unique;
|
||||
echo "GPU:" && /opt/rocm/bin/rocminfo | grep -E "^\s*(Name|Marketing Name)";
|
||||
|
||||
Windows:
|
||||
(Get-WmiObject Win32_OperatingSystem).Version
|
||||
(Get-WmiObject win32_Processor).Name
|
||||
(Get-WmiObject win32_VideoController).Name
|
||||
- type: textarea
|
||||
attributes:
|
||||
label: Problem Description
|
||||
description: Describe the issue you encountered.
|
||||
validations:
|
||||
required: true
|
||||
- type: input
|
||||
attributes:
|
||||
label: Operating System
|
||||
description: What is the name and version number of the OS?
|
||||
placeholder: "e.g. Ubuntu 22.04.3 LTS (Jammy Jellyfish)"
|
||||
validations:
|
||||
required: true
|
||||
- type: input
|
||||
attributes:
|
||||
label: CPU
|
||||
description: What CPU did you encounter the issue on?
|
||||
placeholder: "e.g. AMD Ryzen 9 5900HX with Radeon Graphics"
|
||||
validations:
|
||||
required: true
|
||||
- type: dropdown
|
||||
attributes:
|
||||
label: GPU
|
||||
description: What GPU(s) did you encounter the issue on (you can select multiple GPUs from the list)
|
||||
multiple: true
|
||||
options:
|
||||
- AMD Instinct MI300
|
||||
- AMD Instinct MI300A
|
||||
- AMD Instinct MI300X
|
||||
- AMD Instinct MI250X
|
||||
- AMD Instinct MI250
|
||||
- AMD Instinct MI210
|
||||
- AMD Instinct MI100
|
||||
- AMD Instinct MI50
|
||||
- AMD Instinct MI25
|
||||
- AMD Radeon Pro V620
|
||||
- AMD Radeon Pro VII
|
||||
- AMD Radeon RX 7900 XTX
|
||||
- AMD Radeon VII
|
||||
- AMD Radeon Pro W7900
|
||||
- AMD Radeon Pro W7800
|
||||
- AMD Radeon Pro W6800
|
||||
- AMD Radeon Pro W6600
|
||||
- AMD Radeon Pro W5500
|
||||
- AMD Radeon RX 7900 XT
|
||||
- AMD Radeon RX 7600
|
||||
- AMD Radeon RX 6950 XT
|
||||
- AMD Radeon RX 6900 XT
|
||||
- AMD Radeon RX 6800 XT
|
||||
- AMD Radeon RX 6800
|
||||
- AMD Radeon RX 6750
|
||||
- AMD Radeon RX 6700 XT
|
||||
- AMD Radeon RX 6700
|
||||
- AMD Radeon RX 6650 XT
|
||||
- AMD Radeon RX 6600 XT
|
||||
- AMD Radeon RX 6600
|
||||
- Other
|
||||
validations:
|
||||
required: true
|
||||
- type: input
|
||||
attributes:
|
||||
label: Other
|
||||
description: If you selected Other, please specify
|
||||
- type: dropdown
|
||||
attributes:
|
||||
label: ROCm Version
|
||||
description: What version(s) of ROCm did you encounter the issue on?
|
||||
multiple: true
|
||||
options:
|
||||
- ROCm 6.0.0
|
||||
- ROCm 5.7.1
|
||||
- ROCm 5.7.0
|
||||
- ROCm 5.6.0
|
||||
- ROCm 5.5.1
|
||||
- ROCm 5.5.0
|
||||
validations:
|
||||
required: true
|
||||
- type: dropdown
|
||||
attributes:
|
||||
label: ROCm Component
|
||||
description: (Optional) If this issue relates to a specific ROCm component, it can be mentioned here.
|
||||
options:
|
||||
- Other
|
||||
- AMDMIGraphX
|
||||
- amdsmi
|
||||
- aomp
|
||||
- aomp-extras
|
||||
- clang-ocl
|
||||
- clr
|
||||
- composable_kernel
|
||||
- flang
|
||||
- half
|
||||
- HIP
|
||||
- hipBLAS
|
||||
- HIPCC
|
||||
- hipCUB
|
||||
- HIP-Examples
|
||||
- hipFFT
|
||||
- hipfort
|
||||
- HIPIFY
|
||||
- hipSOLVER
|
||||
- hipSPARSE
|
||||
- hipTensor
|
||||
- llvm-project
|
||||
- MIOpen
|
||||
- MIVisionX
|
||||
- rccl
|
||||
- rdc
|
||||
- rocALUTION
|
||||
- rocBLAS
|
||||
- ROCdbgapi
|
||||
- rocFFT
|
||||
- ROCgdb
|
||||
- ROCK-Kernel-Driver
|
||||
- ROCm
|
||||
- rocm_bandwidth_test
|
||||
- rocm_smi_lib
|
||||
- rocm-cmake
|
||||
- ROCm-CompilerSupport
|
||||
- rocm-core
|
||||
- ROCm-Device-Libs
|
||||
- rocminfo
|
||||
- rocMLIR
|
||||
- ROCmValidationSuite
|
||||
- rocPRIM
|
||||
- rocprofiler
|
||||
- rocr_debug_agent
|
||||
- rocRAND
|
||||
- ROCR-Runtime
|
||||
- rocSOLVER
|
||||
- rocSPARSE
|
||||
- rocThrust
|
||||
- roctracer
|
||||
- ROCT-Thunk-Interface
|
||||
- rocWMMA
|
||||
- rpp
|
||||
- Tensile
|
||||
default: 32
|
||||
- type: textarea
|
||||
attributes:
|
||||
label: Steps to Reproduce
|
||||
description: (Optional) Detailed steps to reproduce the issue.
|
||||
validations:
|
||||
required: false
|
||||
|
||||
- type: textarea
|
||||
attributes:
|
||||
label: (Optional for Linux users) Output of /opt/rocm/bin/rocminfo --support
|
||||
description: The output of rocminfo --support could help to better address the problem.
|
||||
validations:
|
||||
required: false
|
||||
|
||||
- type: textarea
|
||||
attributes:
|
||||
label: Additional Information
|
||||
description: (Optional) Any additional information that is relevant, e.g. relevant environment variables, dockerfiles, log files, dmesg output (on Linux), etc.
|
||||
validations:
|
||||
required: false
|
||||
|
||||
22
.github/workflows/issue_retrieval.yml
vendored
Normal file
@@ -0,0 +1,22 @@
name: Issue retrieval

on:
  issues:
    types: [opened]

jobs:
  auto-retrieve:
    runs-on: ubuntu-latest
    steps:
      - name: Generate a token
        id: generate_token
        uses: actions/create-github-app-token@v1
        with:
          app_id: ${{ secrets.ACTION_APP_ID }}
          private_key: ${{ secrets.ACTION_PEM }}
      - name: 'Retrieve Issue'
        uses: abhimeda/rocm_issue_management@main
        with:
          authentication-token: ${{ steps.generate_token.outputs.token }}
          github-organization: 'ROCm'
          project-num: '6'
|
||||
@@ -3,16 +3,19 @@
|
||||
|
||||
version: 2
|
||||
|
||||
sphinx:
|
||||
configuration: docs/conf.py
|
||||
|
||||
formats: [htmlzip, pdf]
|
||||
build:
|
||||
os: ubuntu-22.04
|
||||
tools:
|
||||
python: "3.10"
|
||||
apt_packages:
|
||||
- "doxygen"
|
||||
- "graphviz" # For dot graphs in doxygen
|
||||
|
||||
python:
|
||||
install:
|
||||
- requirements: docs/sphinx/requirements.txt
|
||||
|
||||
build:
|
||||
os: ubuntu-20.04
|
||||
tools:
|
||||
python: "3.8"
|
||||
sphinx:
|
||||
configuration: docs/conf.py
|
||||
|
||||
formats: []
|
||||
|
||||
1279
CHANGELOG.md
File diff suppressed because it is too large
@@ -8,7 +8,7 @@
|
||||
|
||||
AMD values and encourages contributions to our code and documentation. If you want to contribute
|
||||
to our ROCm repositories, first review the following guidance. For documentation-specific information,
|
||||
see [Contributing to ROCm docs](./contribute-docs.md).
|
||||
see [Contributing to ROCm docs](https://rocm.docs.amd.com/en/latest/contribute/contribute-docs.html).
|
||||
|
||||
ROCm is a software stack made up of a collection of drivers, development tools, and APIs that enable
|
||||
GPU programming from low-level kernel to end-user applications. Because some of our components
|
||||
@@ -47,14 +47,13 @@ General issue guidelines:
|
||||
|
||||
### Pull requests
|
||||
|
||||
Our repositories typically use the **develop** branch an integration branch for new code so, when
|
||||
making a PR, target this branch.
|
||||
When you create a pull request, you should target the default branch. Our repositories typically use the **develop** branch as the default integration branch.
|
||||
|
||||
When creating a PR, use the following process. Note that each repository may include additional,
|
||||
project-specific steps. Refer to each repository's PR process for any additional steps.
|
||||
|
||||
* Identify the issue you want to fix
|
||||
* Target the **develop** branch for integration
|
||||
* Target the default branch (usually the **develop** branch) for integration
|
||||
* Ensure your code builds successfully
|
||||
* Each component has a suite of test cases to run; include the log of the successful test run in your PR
|
||||
* Do not break existing test cases
|
||||
@@ -73,7 +72,7 @@ terms of the LICENSE.txt file in the corresponding repository. Different reposit
|
||||
licenses.
|
||||
:::
|
||||
|
||||
You can look up each license on the [ROCm licensing](../about/licensing.md) page.
|
||||
You can look up each license on the [ROCm licensing](https://rocm.docs.amd.com/en/latest/about/license.html) page.
|
||||
|
||||
### New feature development
|
||||
|
||||
|
||||
@@ -47,6 +47,7 @@
|
||||
<project groups="mathlibs" name="Tensile" />
|
||||
<project groups="mathlibs" name="hipTensor" />
|
||||
<project groups="mathlibs" name="hipBLAS" />
|
||||
<project groups="mathlibs" name="hipBLASLt" />
|
||||
<project groups="mathlibs" name="rocFFT" />
|
||||
<project groups="mathlibs" name="hipFFT" />
|
||||
<project groups="mathlibs" name="rocRAND" />
|
||||
|
||||
@@ -1,6 +1,6 @@
|
||||
.. meta::
|
||||
:description: How ROCm uses PCIe atomics
|
||||
:keywords: PCIe, PCIe atomics, atomics, BAR memory
|
||||
:keywords: PCIe, PCIe atomics, atomics, BAR memory, AMD, ROCm
|
||||
|
||||
*****************************************************************************
|
||||
How ROCm uses PCIe atomics
|
||||
|
||||
@@ -2,7 +2,7 @@
|
||||
<meta charset="UTF-8">
|
||||
<meta name="description" content="Inference optimization with MIGraphX">
|
||||
<meta name="keywords" content="Inference optimization, MIGraphX, deep-learning, MIGraphX
|
||||
installation">
|
||||
installation, AMD, ROCm">
|
||||
</head>
|
||||
|
||||
# Inference optimization with MIGraphX
|
||||
|
||||
@@ -2,7 +2,7 @@
|
||||
<meta charset="UTF-8">
|
||||
<meta name="description" content="Inception V3 with PyTorch">
|
||||
<meta name="keywords" content="PyTorch, Inception V3, deep-learning, training data, optimization
|
||||
algorithm">
|
||||
algorithm, AMD, ROCm">
|
||||
</head>
|
||||
|
||||
# Deep learning: Inception V3 with PyTorch
|
||||
|
||||
@@ -1,6 +1,6 @@
|
||||
.. meta::
|
||||
:description: Using CMake
|
||||
:keywords: CMake, dependencies, HIP, C++
|
||||
:keywords: CMake, dependencies, HIP, C++, AMD, ROCm
|
||||
|
||||
*********************************
|
||||
Using CMake
|
||||
|
||||
@@ -1,7 +1,7 @@
|
||||
<head>
|
||||
<meta charset="UTF-8">
|
||||
<meta name="description" content="ROCm compilers disambiguation">
|
||||
<meta name="keywords" content="compilers, compiler naming">
|
||||
<meta name="keywords" content="compilers, compiler naming, AMD, ROCm">
|
||||
</head>
|
||||
|
||||
# ROCm compilers disambiguation
|
||||
|
||||
@@ -1,7 +1,8 @@
|
||||
<head>
|
||||
<meta charset="UTF-8">
|
||||
<meta name="description" content="ROCm Linux Filesystem Hierarchy Standard reorganization">
|
||||
<meta name="keywords" content="FHS, Linux Filesystem Hierarchy Standard, directory structure">
|
||||
<meta name="keywords" content="FHS, Linux Filesystem Hierarchy Standard, directory structure,
|
||||
AMD, ROCm">
|
||||
</head>
|
||||
|
||||
# ROCm Linux Filesystem Hierarchy Standard reorganization
|
||||
|
||||
@@ -1,7 +1,7 @@
|
||||
<head>
|
||||
<meta charset="UTF-8">
|
||||
<meta name="description" content="AMD Instinct MI100 microarchitecture">
|
||||
<meta name="keywords" content="Instinct, MI100, microarchitecture">
|
||||
<meta name="keywords" content="Instinct, MI100, microarchitecture, AMD, ROCm">
|
||||
</head>
|
||||
|
||||
# AMD Instinct™ MI100 microarchitecture
|
||||
|
||||
@@ -2,7 +2,7 @@
|
||||
<meta charset="UTF-8">
|
||||
<meta name="description" content="MI200 performance counters and metrics">
|
||||
<meta name="keywords" content="MI200, performance counters, counters, GRBM counters, GRBM,
|
||||
CPF counters, CPF, CPC counters, CPC, command processor counters, SPI counters, SPI">
|
||||
CPF counters, CPF, CPC counters, CPC, command processor counters, SPI counters, SPI, AMD, ROCm">
|
||||
</head>
|
||||
|
||||
# MI200 performance counters and metrics
|
||||
|
||||
@@ -1,7 +1,7 @@
|
||||
<head>
|
||||
<meta charset="UTF-8">
|
||||
<meta name="description" content="AMD Instinct MI250 microarchitecture">
|
||||
<meta name="keywords" content="Instinct, MI250, microarchitecture">
|
||||
<meta name="keywords" content="Instinct, MI250, microarchitecture, AMD, ROCm">
|
||||
</head>
|
||||
|
||||
# AMD Instinct™ MI250 microarchitecture
|
||||
@@ -33,8 +33,8 @@ Units (CU). The MI250 GCD has 104 active CUs. Each compute unit is further
|
||||
subdivided into four SIMD units that process SIMD instructions of 16 data
|
||||
elements per instruction (for the FP64 data type). This enables the CU to
|
||||
process 64 work items (a so-called “wavefront”) at a peak clock frequency of 1.7
|
||||
GHz. Therefore, the theoretical maximum FP64 peak performance per GCD is 45.3
|
||||
TFLOPS for vector instructions. The MI250 compute units also provide specialized
|
||||
GHz. Therefore, the theoretical maximum FP64 peak performance per GCD is 22.6
|
||||
TFLOPS for vector instructions. This equates to 45.3 TFLOPS for vector instructions for both GCDs together. The MI250 compute units also provide specialized
|
||||
execution units (also called matrix cores), which are geared toward executing
|
||||
matrix operations like matrix-matrix multiplications. For FP64, the peak
|
||||
performance of these units amounts to 90.5 TFLOPS.
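
The per-GCD vector figure can be reproduced from the numbers quoted above. The following is a minimal sketch of that arithmetic in C++. The helper name `peak_vector_tflops` and the constant names are hypothetical, and counting two floating-point operations per lane per cycle (one fused multiply-add) is an assumption that the text does not state, although it is consistent with the quoted 22.6 TFLOPS.

```cpp
// Assumption: each SIMD lane retires one fused multiply-add per cycle,
// which is counted as two floating-point operations.
constexpr double kFlopsPerLanePerCycle = 2.0;

// Hypothetical helper: theoretical peak vector throughput in TFLOPS.
constexpr double peak_vector_tflops(int compute_units, int simd_per_cu,
                                    int lanes_per_simd, double clock_ghz) {
    // lanes * FLOPs per lane per cycle * GHz gives GFLOPS; divide by 1000 for TFLOPS.
    return compute_units * simd_per_cu * lanes_per_simd *
           kFlopsPerLanePerCycle * clock_ghz / 1000.0;
}

// One MI250 GCD: 104 CUs, 4 SIMD units per CU, 16 FP64 lanes per SIMD, 1.7 GHz.
constexpr double kMi250GcdFp64Tflops = peak_vector_tflops(104, 4, 16, 1.7);

// The MI250 package contains two GCDs.
constexpr double kMi250Fp64Tflops = 2 * kMi250GcdFp64Tflops;
```

With these inputs the per-GCD value evaluates to roughly 22.6 TFLOPS and the two-GCD value to roughly 45.3 TFLOPS, matching the figures in the paragraph.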
|
||||
|
||||
@@ -2,7 +2,7 @@
|
||||
<meta charset="UTF-8">
|
||||
<meta name="description" content="GPU isolation techniques">
|
||||
<meta name="keywords" content="GPU isolation techniques, UUID, universally unique identifier,
|
||||
environment variables, virtual machines">
|
||||
environment variables, virtual machines, AMD, ROCm">
|
||||
</head>
|
||||
|
||||
# GPU isolation techniques
|
||||
|
||||
@@ -2,7 +2,7 @@
|
||||
<meta charset="UTF-8">
|
||||
<meta name="description" content="GPU memory">
|
||||
<meta name="keywords" content="GPU memory, VRAM, video random access memory, pageable
|
||||
memory, pinned memory, managed memory">
|
||||
memory, pinned memory, managed memory, AMD, ROCm">
|
||||
</head>
|
||||
|
||||
# GPU memory
|
||||
|
||||
@@ -2,16 +2,15 @@
|
||||
<meta charset="UTF-8">
|
||||
<meta name="description" content="Using the LLVM ASan on a GPU">
|
||||
<meta name="keywords" content="LLVM, ASan, address sanitizer, AddressSanitizer, instrumented
|
||||
libraries, instrumented applications">
|
||||
libraries, instrumented applications, AMD, ROCm">
|
||||
</head>
|
||||
|
||||
# Using the LLVM ASan on a GPU (beta release)
|
||||
# Using the AddressSanitizer on a GPU (beta release)
|
||||
|
||||
The LLVM AddressSanitizer (ASan) provides a process that allows developers to detect runtime addressing errors in applications and libraries. The detection is achieved using a combination of compiler-added instrumentation and runtime techniques, including function interception and replacement.
|
||||
|
||||
Until now, the LLVM ASan process was only available for traditional purely CPU applications. However, ROCm has extended this mechanism to additionally allow the detection of some addressing errors on the GPU in heterogeneous applications. Ideally, developers should treat heterogeneous HIP and OpenMP applications exactly like pure CPU applications. However, this simplicity has not been achieved yet.
|
||||
|
||||
This document provides documentation on using ROCm ASan.
|
||||
|
||||
For information about LLVM ASan, see the [LLVM documentation](https://clang.llvm.org/docs/AddressSanitizer.html).
|
||||
|
||||
:::{note}
|
||||
@@ -26,17 +25,28 @@ Recommendations for doing this are:
|
||||
|
||||
* Compile as many application and dependent library sources as possible using an AMD-built clang-based compiler such as `amdclang++`.
|
||||
* Add the following options to the existing compiler and linker options:
|
||||
|
||||
* `-fsanitize=address` - enables instrumentation
|
||||
|
||||
* `-shared-libsan` - use shared version of runtime
|
||||
|
||||
* `-g` - add debug info for improved reporting
|
||||
|
||||
* Explicitly use `xnack+` in the offload architecture option. For example, `--offload-arch=gfx90a:xnack+`
|
||||
|
||||
Other architectures are allowed, but their device code will not be instrumented and a warning will be emitted.
|
||||
|
||||
:::{tip}
|
||||
It is not an error to compile some files without ASan instrumentation, but doing so reduces the ability of the process to detect addressing errors. However, if the main program "`a.out`" does not directly depend on the ASan runtime (`libclang_rt.asan-x86_64.so`) after the build completes (check by running `ldd` (List Dynamic Dependencies) or `readelf`), the application will immediately report an error at runtime as described in the next section.
|
||||
:::
|
||||
|
||||
:::{note}
|
||||
When compiling OpenMP programs with ASan instrumentation, it is currently necessary to set the environment variable `LIBRARY_PATH` to `/opt/rocm-<version>/lib/llvm/lib/asan:/opt/rocm-<version>/lib/asan`. At runtime, it may be necessary to add `/opt/rocm-<version>/lib/llvm/lib/asan` to `LD_LIBRARY_PATH`.
|
||||
:::
|
||||
|
||||
### About compilation time
|
||||
|
||||
When `-fsanitize=address` is used, the LLVM compiler adds instrumentation code around every memory operation. This added code must be handled by all of the downstream components of the compiler toolchain and results in increased overall compilation time. This increase is especially evident in the AMDGPU device compiler and has in a few instances raised the compile time to an unacceptable level.
|
||||
When `-fsanitize=address` is used, the LLVM compiler adds instrumentation code around every memory operation. This added code must be handled by all downstream components of the compiler toolchain and results in increased overall compilation time. This increase is especially evident in the AMDGPU device compiler and has in a few instances raised the compile time to an unacceptable level.
|
||||
|
||||
There are a few options if the compile time becomes unacceptable:
|
||||
|
||||
@@ -56,7 +66,7 @@ For a complete ROCm GPU Sanitizer installation, including packages, instrumented
|
||||
## Using AMD-supplied ASan instrumented libraries
|
||||
|
||||
ROCm releases have optional packages that contain additional ASan instrumented builds of the ROCm libraries (usually found in `/opt/rocm-<version>/lib`). The instrumented libraries have identical names to the regular uninstrumented libraries, and are located in `/opt/rocm-<version>/lib/asan`.
|
||||
These additional libraries are built using the `amdclang++` and `hipcc` compilers, while some uninstrumented libraries are built with g++. The preexisting build options are used but, as described above, additional options are used: `-fsanitize=address`, `-shared-libsan` and `-g`.
|
||||
These additional libraries are built using the `amdclang++` and `hipcc` compilers, while some uninstrumented libraries are built with `g++`. The preexisting build options are used but, as described above, additional options are used: `-fsanitize=address`, `-shared-libsan` and `-g`.
|
||||
|
||||
These additional libraries avoid additional developer effort to locate repositories, identify the correct branch, check out the correct tags, and other efforts needed to build the libraries from the source. And they extend the ability of the process to detect addressing errors into the ROCm libraries themselves.
|
||||
|
||||
@@ -86,16 +96,25 @@ If it does not appear, when executed the application will quickly output an ASan
|
||||
|
||||
* Ensure that the application `llvm-symbolizer` can be executed, and that it is located in `/opt/rocm-<version>/llvm/bin`. This executable is not strictly required, but if found is used to translate ("symbolize") a host-side instruction address into a more useful function name, file name, and line number (assuming the application has been built to include debug information).
|
||||
|
||||
There is an environment variable, `ASAN_OPTIONS`, that can be used to adjust the runtime behavior of the ASAN runtime itself. There are more than a hundred "flags" that can be adjusted (see an old list at [flags](https://github.com/google/sanitizers/wiki/AddressSanitizerFlags)) but the default settings are correct and should be used in most cases. It must be noted that these options only affect the host ASAN runtime. The device runtime only currently supports the default settings for the few relevant options.
|
||||
There is an environment variable, `ASAN_OPTIONS`, that can be used to adjust the runtime behavior of the ASan runtime itself. There are more than a hundred "flags" that can be adjusted (see an old list at [flags](https://github.com/google/sanitizers/wiki/AddressSanitizerFlags)) but the default settings are correct and should be used in most cases. It must be noted that these options only affect the host ASan runtime. The device runtime only currently supports the default settings for the few relevant options.
|
||||
|
||||
There are two `ASAN_OPTION` flags of particular note.
|
||||
There are three `ASAN_OPTIONS` flags of note.
|
||||
|
||||
* `halt_on_error=0/1 default 1`.
|
||||
|
||||
This tells the ASAN runtime to halt the application immediately after detecting and reporting an addressing error. The default makes sense because the application has entered the realm of undefined behavior. If the developer wishes to have the application continue anyway, this option can be set to zero. However, the application and libraries should then be compiled with the additional option `-fsanitize-recover=address`. Note that the ROCm optional ASan instrumented libraries are not compiled with this option and if an error is detected within one of them, but halt_on_error is set to 0, more undefined behavior will occur.
|
||||
This tells the ASan runtime to halt the application immediately after detecting and reporting an addressing error. The default makes sense because the application has entered the realm of undefined behavior. If the developer wishes to have the application continue anyway, this option can be set to zero. However, the application and libraries should then be compiled with the additional option `-fsanitize-recover=address`. Note that the ROCm optional ASan instrumented libraries are not compiled with this option and if an error is detected within one of them, but halt_on_error is set to 0, more undefined behavior will occur.
|
||||
|
||||
* `detect_leaks=0/1 default 1`.
|
||||
This option directs the ASan runtime to enable the [Leak Sanitizer](https://clang.llvm.org/docs/LeakSanitizer.html) (LSAN). Unfortunately, for heterogeneous applications, this default will result in significant output from the leak sanitizer when the application exits due to allocations made by the language runtime which are not considered to be to be leaks. This output can be avoided by adding `detect_leaks=0` to the `ASAN_OPTIONS`, or alternatively by producing an LSAN suppression file (syntax described [here](https://github.com/google/sanitizers/wiki/AddressSanitizerLeakSanitizer)) and activating it with environment variable `LSAN_OPTIONS=suppressions=/path/to/suppression/file`. When using a suppression file, a suppression report is printed by default. The suppression report can be disabled by using the `LSAN_OPTIONS` flag `print_suppressions=0`.
|
||||
|
||||
This option directs the ASan runtime to enable the [Leak Sanitizer](https://clang.llvm.org/docs/LeakSanitizer.html) (LSan). For heterogeneous applications, this default results in significant output from the leak sanitizer when the application exits due to allocations made by the language runtime which are not considered to be leaks. This output can be avoided by adding `detect_leaks=0` to the `ASAN_OPTIONS`, or alternatively by producing an LSan suppression file (syntax described [here](https://github.com/google/sanitizers/wiki/AddressSanitizerLeakSanitizer)) and activating it with environment variable `LSAN_OPTIONS=suppressions=/path/to/suppression/file`. When using a suppression file, a suppression report is printed by default. The suppression report can be disabled by using the `LSAN_OPTIONS` flag `print_suppressions=0`.
|
||||
|
||||
* `quarantine_size_mb=N default 256`
|
||||
|
||||
This option defines the number of megabytes (MB) `N` of memory that the ASan runtime will hold after it is `freed` to detect use-after-free situations. This memory is unavailable for other purposes. The default of 256 MB may be too small to detect some use-after-free situations, especially given that the large size of many GPU memory allocations may push `freed` allocations out of quarantine before the attempted use.
|
||||
|
||||
:::{note}
|
||||
Setting the value of `quarantine_size_mb` larger may enable more problematic uses to be detected, but at the cost of reducing memory available for other purposes.
|
||||
:::
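
As an alternative to exporting `ASAN_OPTIONS` in every shell, upstream AddressSanitizer lets a program embed default flags by defining the `__asan_default_options` hook. The sketch below shows that pattern for two of the flags discussed above. It rests on several assumptions: that the ROCm host ASan runtime honors the upstream hook in the same way, that the constant name `kDefaultAsanOptions` is purely illustrative, and that the value `512` is an example rather than a recommendation. As noted above, such options affect only the host runtime.

```cpp
// Illustrative defaults only: silence leak reports from language-runtime
// allocations and enlarge the quarantine to 512 MB (example value).
constexpr char kDefaultAsanOptions[] = "detect_leaks=0:quarantine_size_mb=512";

// Upstream AddressSanitizer hook: if the program defines this function, the
// host ASan runtime reads the returned string as default flags at startup.
// Assumption: the ROCm-provided host runtime behaves like upstream here.
extern "C" const char* __asan_default_options() {
    return kDefaultAsanOptions;
}
```

In upstream ASan, flags supplied through the `ASAN_OPTIONS` environment variable are still read and are expected to override these embedded defaults, so the hook is best treated as a convenience rather than a way to lock settings.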
|
||||
|
||||
## Runtime overhead
|
||||
|
||||
@@ -134,7 +153,7 @@ instrumentation.
|
||||
|
||||
## Runtime reporting
|
||||
|
||||
It is not the intention of this document to provide a detailed explanation of all of the types of reports that can be output by the ASan runtime. Instead, the focus is on the differences between the standard reports for CPU issues, and reports for GPU issues.
|
||||
It is not the intention of this document to provide a detailed explanation of all the types of reports that can be output by the ASan runtime. Instead, the focus is on the differences between the standard reports for CPU issues, and reports for GPU issues.
|
||||
|
||||
An invalid address detection report for the CPU always starts with
|
||||
|
||||
@@ -181,7 +200,7 @@ or
|
||||
|
||||
currently may include one or two surprising CPU side tracebacks mentioning "`hostcall`". This is due to how `malloc` and `free` are implemented for GPU code and these call stacks can be ignored.
|
||||
|
||||
### Running with `rocgdb`
|
||||
## Running ASan with `rocgdb`
|
||||
|
||||
`rocgdb` can be used to further investigate ASan detected errors, with some preparation.
|
||||
|
||||
@@ -198,7 +217,7 @@ This is solved by setting environment variable `LD_PRELOAD` to the path to the A
|
||||
amdclang++ -print-file-name=libclang_rt.asan-x86_64.so
|
||||
```
|
||||
|
||||
It is also recommended to set the environment variable `HIP_ENABLE_DEFERRED_LOADING=0` before debugging HIP applications.
|
||||
You should also set the environment variable `HIP_ENABLE_DEFERRED_LOADING=0` before debugging HIP applications.
|
||||
|
||||
After starting `rocgdb` breakpoints can be set on the ASan runtime error reporting entry points of interest. For example, if an ASan error report includes
|
||||
|
||||
@@ -233,18 +252,176 @@ $ rocgdb <path to application>
|
||||
(gdb) c
|
||||
```
|
||||
|
||||
### Using ASan with a short HIP application
|
||||
## Using ASan with a short HIP application
|
||||
|
||||
Refer to the following example to use ASan with a short HIP application,
|
||||
Consider the following short demo of using AddressSanitizer with a HIP application:
|
||||
|
||||
https://github.com/Rmalavally/rocm-examples/blob/Rmalavally-patch-1/LLVM_ASAN/Using-Address-Sanitizer-with-a-Short-HIP-Application.md
|
||||
```C++
|
||||
|
||||
### Known issues with using GPU sanitizer
|
||||
#include <cstdlib>
|
||||
#include <hip/hip_runtime.h>
|
||||
|
||||
* Red zones must have limited size and it is possible for an invalid access to completely miss a red zone and not be detected.
|
||||
__global__ void
|
||||
set1(int *p)
|
||||
{
|
||||
int i = blockDim.x*blockIdx.x + threadIdx.x;
|
||||
p[i] = 1;
|
||||
}
|
||||
|
||||
int
|
||||
main(int argc, char **argv)
|
||||
{
|
||||
int m = std::atoi(argv[1]);
|
||||
int n1 = std::atoi(argv[2]);
|
||||
int n2 = std::atoi(argv[3]);
|
||||
int c = std::atoi(argv[4]);
|
||||
int *dp;
|
||||
hipMalloc(&dp, m*sizeof(int));
|
||||
hipLaunchKernelGGL(set1, dim3(n1), dim3(n2), 0, 0, dp);
|
||||
int *hp = (int*)malloc(c * sizeof(int));
|
||||
hipMemcpy(hp, dp, m*sizeof(int), hipMemcpyDeviceToHost);
|
||||
hipDeviceSynchronize();
|
||||
hipFree(dp);
|
||||
free(hp);
|
||||
std::puts("Done.");
|
||||
return 0;
|
||||
}
|
||||
```
|
||||
|
||||
This application will attempt to access invalid addresses for certain command line arguments. In particular, if `m < n1 * n2` some device threads will attempt to access
|
||||
unallocated device memory.
|
||||
|
||||
Or, if `c < m`, the `hipMemcpy` function will copy past the end of the `malloc` allocated memory.
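
The two conditions above can be checked for a given set of arguments before running the demo. The following sketch encodes them in a small helper. The struct `ExpectedErrors` and the function `expected_errors` are hypothetical names introduced for illustration and are not part of the demo or of any ROCm API.

```cpp
// Hypothetical helper, not part of the demo: predicts which addressing
// errors the sample's command-line arguments should trigger.
struct ExpectedErrors {
    bool device_overflow;  // some threads write past the hipMalloc'd buffer
    int  stray_threads;    // number of threads whose index is >= m
    bool host_overflow;    // hipMemcpy writes past the malloc'd host buffer
};

// The kernel launches n1 blocks of n2 threads and each thread writes one int,
// so n1 * n2 elements are written into a device buffer of m elements.
// The host buffer holds c elements but receives m elements from hipMemcpy.
constexpr ExpectedErrors expected_errors(int m, int n1, int n2, int c) {
    return ExpectedErrors{
        m < n1 * n2,
        m < n1 * n2 ? n1 * n2 - m : 0,
        c < m,
    };
}
```

For example, `expected_errors(100, 11, 10, 100)` reports a device-side overflow by 10 threads and no host-side overflow, while `expected_errors(100, 10, 10, 50)` reports only the host-side overflow.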
|
||||
|
||||
**Note**: The `hipcc` compiler is used here for simplicity.
|
||||
|
||||
Compiling without XNACK results in a warning.
|
||||
|
||||
```bash
|
||||
$ hipcc -g --offload-arch=gfx90a:xnack- -fsanitize=address -shared-libsan mini.hip -o mini
|
||||
clang++: warning: ignoring '-fsanitize=address' option for offload arch 'gfx90a:xnack-', as it is not currently supported there. Use it with an offload arch containing 'xnack+' instead [-Woption-ignored]
|
||||
```
|
||||
|
||||
The binary compiled above will run, but the GPU code will not be instrumented and the `m < n1 * n2` error will not be detected. Switching to `--offload-arch=gfx90a:xnack+` in the command above results in a warning-free compilation and an instrumented application. After setting `PATH`, `LD_LIBRARY_PATH` and `HSA_XNACK` as described earlier, a check of the binary with `ldd` yields the following:
|
||||
|
||||
```bash
$ ldd mini
        linux-vdso.so.1 (0x00007ffd1a5ae000)
        libclang_rt.asan-x86_64.so => /opt/rocm-6.1.0-99999/llvm/lib/clang/17.0.0/lib/linux/libclang_rt.asan-x86_64.so (0x00007fb9c14b6000)
        libamdhip64.so.5 => /opt/rocm-6.1.0-99999/lib/asan/libamdhip64.so.5 (0x00007fb9bedd3000)
        libstdc++.so.6 => /lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007fb9beba8000)
        libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007fb9bea59000)
        libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007fb9bea3e000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fb9be84a000)
        libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007fb9be844000)
        libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007fb9be821000)
        librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007fb9be817000)
        libamd_comgr.so.2 => /opt/rocm-6.1.0-99999/lib/asan/libamd_comgr.so.2 (0x00007fb9b4382000)
        libhsa-runtime64.so.1 => /opt/rocm-6.1.0-99999/lib/asan/libhsa-runtime64.so.1 (0x00007fb9b3b00000)
        libnuma.so.1 => /lib/x86_64-linux-gnu/libnuma.so.1 (0x00007fb9b3af3000)
        /lib64/ld-linux-x86-64.so.2 (0x00007fb9c2027000)
        libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x00007fb9b3ad7000)
        libtinfo.so.6 => /lib/x86_64-linux-gnu/libtinfo.so.6 (0x00007fb9b3aa7000)
        libelf.so.1 => /lib/x86_64-linux-gnu/libelf.so.1 (0x00007fb9b3a89000)
        libdrm.so.2 => /opt/amdgpu/lib/x86_64-linux-gnu/libdrm.so.2 (0x00007fb9b3a70000)
        libdrm_amdgpu.so.1 => /opt/amdgpu/lib/x86_64-linux-gnu/libdrm_amdgpu.so.1 (0x00007fb9b3a62000)
```
|
||||
|
||||
This confirms that the address sanitizer runtime is linked in and that the ASan-instrumented versions of the runtime libraries are used.
|
||||
Checking the `PATH` yields
|
||||
|
||||
```bash
$ which llvm-symbolizer
/opt/rocm-6.1.0-99999/llvm/bin/llvm-symbolizer
```
|
||||
|
||||
Lastly, a check of the OS kernel version yields
|
||||
|
||||
```bash
$ uname -rv
5.15.0-73-generic #80~20.04.1-Ubuntu SMP Wed May 17 14:58:14 UTC 2023
```
|
||||
|
||||
which indicates that the required HMM support (kernel version > 5.6) is available. This completes the necessary setup. Running with `m = 100`, `n1 = 11`, `n2 = 10` and `c = 100` should produce a report for an invalid access by the last 10 threads.
|
||||
|
||||
```bash
=================================================================
==3141==ERROR: AddressSanitizer: heap-buffer-overflow on amdgpu device 0 at pc 0x7fb1410d2cc4
WRITE of size 4 in workgroup id (10,0,0)
  #0 0x7fb1410d2cc4 in set1(int*) at /home/dave/mini/mini.cpp:0:10

Thread ids and accessed addresses:
00 : 0x7fb14371d190 01 : 0x7fb14371d194 02 : 0x7fb14371d198 03 : 0x7fb14371d19c 04 : 0x7fb14371d1a0 05 : 0x7fb14371d1a4 06 : 0x7fb14371d1a8 07 : 0x7fb14371d1ac
08 : 0x7fb14371d1b0 09 : 0x7fb14371d1b4

0x7fb14371d190 is located 0 bytes after 400-byte region [0x7fb14371d000,0x7fb14371d190)
allocated by thread T0 here:
    #0 0x7fb151c76828 in hsa_amd_memory_pool_allocate /work/dave/git/compute/external/llvm-project/compiler-rt/lib/asan/asan_interceptors.cpp:692:3
    #1 ...

    #12 0x7fb14fb99ec4 in hipMalloc /work/dave/git/compute/external/clr/hipamd/src/hip_memory.cpp:568:3
    #13 0x226630 in hipError_t hipMalloc<int>(int**, unsigned long) /opt/rocm-6.1.0-99999/include/hip/hip_runtime_api.h:8367:12
    #14 0x226630 in main /home/dave/mini/mini.cpp:19:5
    #15 0x7fb14ef02082 in __libc_start_main /build/glibc-SzIz7B/glibc-2.31/csu/../csu/libc-start.c:308:16

Shadow bytes around the buggy address:
  0x7fb14371cf00: ...

=>0x7fb14371d180: 00 00[fa]fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x7fb14371d200: ...

Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:       fa
  ...
==3141==ABORTING
```
|
||||
|
||||
Running with `m = 100`, `n1 = 10`, `n2 = 10` and `c = 99` should produce a report for an invalid copy.
|
||||
|
||||
```shell
=================================================================
==2817==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x514000150dcc at pc 0x7f5509551aca bp 0x7ffc90a7ae50 sp 0x7ffc90a7a610
WRITE of size 400 at 0x514000150dcc thread T0
    #0 0x7f5509551ac9 in __asan_memcpy /work/dave/git/compute/external/llvm-project/compiler-rt/lib/asan/asan_interceptors_memintrinsics.cpp:61:3
    #1 ...

    #9 0x7f5507462a28 in hipMemcpy_common(void*, void const*, unsigned long, hipMemcpyKind, ihipStream_t*) /work/dave/git/compute/external/clr/hipamd/src/hip_memory.cpp:637:10
    #10 0x7f5507464205 in hipMemcpy /work/dave/git/compute/external/clr/hipamd/src/hip_memory.cpp:642:3
    #11 0x226844 in main /home/dave/mini/mini.cpp:22:5
    #12 0x7f55067c3082 in __libc_start_main /build/glibc-SzIz7B/glibc-2.31/csu/../csu/libc-start.c:308:16
    #13 0x22605d in _start (/home/dave/mini/mini+0x22605d)

0x514000150dcc is located 0 bytes after 396-byte region [0x514000150c40,0x514000150dcc)
allocated by thread T0 here:
    #0 0x7f5509553dcf in malloc /work/dave/git/compute/external/llvm-project/compiler-rt/lib/asan/asan_malloc_linux.cpp:69:3
    #1 0x226817 in main /home/dave/mini/mini.cpp:21:21
    #2 0x7f55067c3082 in __libc_start_main /build/glibc-SzIz7B/glibc-2.31/csu/../csu/libc-start.c:308:16

SUMMARY: AddressSanitizer: heap-buffer-overflow /work/dave/git/compute/external/llvm-project/compiler-rt/lib/asan/asan_interceptors_memintrinsics.cpp:61:3 in __asan_memcpy
Shadow bytes around the buggy address:
  0x514000150b00: ...

=>0x514000150d80: 00 00 00 00 00 00 00 00 00[04]fa fa fa fa fa fa
  0x514000150e00: ...

Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:       fa
  ...
==2817==ABORTING
```
|
||||
|
||||
## Known issues with using GPU sanitizer
|
||||
|
||||
* Red zones must have limited size. It is possible for an invalid access to completely miss a red zone and not be detected.
|
||||
|
||||
* Lack of detection or false reports can be caused by the runtime not properly maintaining red zone shadows.
|
||||
|
||||
* Lack of detection on the GPU might also be due to the implementation not instrumenting accesses to all GPU specific address spaces. For example, in the current implementation accesses to "private" or "stack" variables on the GPU are not instrumented, and accesses to HIP shared variables (also known as "local data store" or "LDS") are also not instrumented.
|
||||
|
||||
* It can also be the case that a memory fault is hit for an invalid address even with the instrumentation. This is usually caused by the invalid address being so wild that its shadow address is outside of any memory region, and the fault actually occurs on the access to the shadow address. It is also possible to hit a memory fault for the `NULL` pointer. While address 0 does have a shadow location, it is not poisoned by the runtime.
|
||||
* It can also be the case that a memory fault is hit for an invalid address even with the instrumentation. This is usually caused by the invalid address being so wild that its shadow address is outside any memory region, and the fault actually occurs on the access to the shadow address. It is also possible to hit a memory fault for the `NULL` pointer. While address 0 does have a shadow location, it is not poisoned by the runtime.
|
||||
|
||||
@@ -38,7 +38,7 @@ latex_elements = {
|
||||
# configurations for PDF output by Read the Docs
|
||||
project = "ROCm Documentation"
|
||||
author = "Advanced Micro Devices, Inc."
|
||||
copyright = "Copyright (c) 2023-2024 Advanced Micro Devices, Inc. All rights reserved."
|
||||
copyright = "Copyright (c) 2024 Advanced Micro Devices, Inc. All rights reserved."
|
||||
version = "6.0.0"
|
||||
release = "6.0.0"
|
||||
setting_all_article_info = True
|
||||
@@ -50,7 +50,7 @@ article_pages = [
|
||||
{
|
||||
"file":"release",
|
||||
"os":["linux", "windows"],
|
||||
"date":"2023-07-27"
|
||||
"date":"2024-01-09"
|
||||
},
|
||||
|
||||
{"file":"install/windows/install-quick", "os":["windows"]},
|
||||
|
||||
@@ -1,7 +1,8 @@
|
||||
<head>
|
||||
<meta charset="UTF-8">
|
||||
<meta name="description" content="Building ROCm documentation">
|
||||
<meta name="keywords" content="documentation, Visual Studio Code, GitHub, command line">
|
||||
<meta name="keywords" content="documentation, Visual Studio Code, GitHub, command line,
|
||||
AMD, ROCm">
|
||||
</head>
|
||||
|
||||
# Building documentation
|
||||
|
||||
@@ -1,7 +1,7 @@
|
||||
<head>
|
||||
<meta charset="UTF-8">
|
||||
<meta name="description" content="Providing feedback for ROCm documentation">
|
||||
<meta name="keywords" content="documentation, pull request, GitHub">
|
||||
<meta name="keywords" content="documentation, pull request, GitHub, AMD, ROCm">
|
||||
</head>
|
||||
|
||||
# Providing feedback for ROCm documentation
|
||||
|
||||
@@ -1,7 +1,7 @@
|
||||
<head>
|
||||
<meta charset="UTF-8">
|
||||
<meta name="description" content="ROCm documentation toolchain">
|
||||
<meta name="keywords" content="documentation, toolchain, Sphinx, Doxygen, MyST">
|
||||
<meta name="keywords" content="documentation, toolchain, Sphinx, Doxygen, MyST, AMD, ROCm">
|
||||
</head>
|
||||
|
||||
# ROCm documentation toolchain
|
||||
|
||||
@@ -2,7 +2,7 @@
|
||||
<meta charset="UTF-8">
|
||||
<meta name="description" content="Deep learning using ROCm">
|
||||
<meta name="keywords" content="deep learning, frameworks, installation, PyTorch, TensorFlow,
|
||||
MAGMA">
|
||||
MAGMA, AMD, ROCm">
|
||||
</head>
|
||||
|
||||
# Deep learning guide
|
||||
|
||||
@@ -1,3 +1,7 @@
|
||||
.. meta::
|
||||
:description: GPU-enabled Message Passing Interface
|
||||
:keywords: Message Passing Interface, MPI, AMD, ROCm
|
||||
|
||||
***************************************************************************************************
|
||||
GPU-enabled Message Passing Interface
|
||||
***************************************************************************************************
|
||||
|
||||
@@ -1,7 +1,8 @@
|
||||
<head>
|
||||
<meta charset="UTF-8">
|
||||
<meta name="description" content="System debugging guide">
|
||||
<meta name="keywords" content="debug, system-level debug, debug flags, PCIe debug">
|
||||
<meta name="keywords" content="debug, system-level debug, debug flags, PCIe debug, AMD,
|
||||
ROCm">
|
||||
</head>
|
||||
|
||||
# System debugging guide
|
||||
|
||||
@@ -2,7 +2,7 @@
|
||||
<meta charset="UTF-8">
|
||||
<meta name="description" content="Tuning guides">
|
||||
<meta name="keywords" content="high-performance computing, HPC, Instinct accelerators,
|
||||
Radeon, tuning, tuning guide">
|
||||
Radeon, tuning, tuning guide, AMD, ROCm">
|
||||
</head>
|
||||
|
||||
# Tuning guides
|
||||
|
||||
@@ -2,7 +2,7 @@
|
||||
<meta charset="UTF-8">
|
||||
<meta name="description" content="MI100 high-performance computing and tuning guide">
|
||||
<meta name="keywords" content="MI100, high-performance computing, HPC, tuning, BIOS
|
||||
settings, NBIO">
|
||||
settings, NBIO, AMD, ROCm">
|
||||
</head>
|
||||
|
||||
# MI100 high-performance computing and tuning guide
|
||||
|
||||
@@ -2,7 +2,7 @@
|
||||
<meta charset="UTF-8">
|
||||
<meta name="description" content="MI200 high-performance computing and tuning guide">
|
||||
<meta name="keywords" content="MI200, high-performance computing, HPC, tuning, BIOS
|
||||
settings, NBIO">
|
||||
settings, NBIO, AMD, ROCm">
|
||||
</head>
|
||||
|
||||
# MI200 high-performance computing and tuning guide
|
||||
|
||||
@@ -1,7 +1,8 @@
|
||||
<head>
|
||||
<meta charset="UTF-8">
|
||||
<meta name="description" content="RDNA2 workstation tuning guide">
|
||||
<meta name="keywords" content="RDNA2, workstation tuning, BIOS settings, installation">
|
||||
<meta name="keywords" content="RDNA2, workstation tuning, BIOS settings, installation, AMD,
|
||||
ROCm">
|
||||
</head>
|
||||
|
||||
# RDNA2 workstation tuning guide
|
||||
|
||||
@@ -2,7 +2,7 @@
|
||||
<meta charset="UTF-8">
|
||||
<meta name="description" content="AMD ROCm documentation">
|
||||
<meta name="keywords" content="documentation, guides, installation, compatibility, support,
|
||||
reference">
|
||||
reference, ROCm, AMD">
|
||||
</head>
|
||||
|
||||
# AMD ROCm™ documentation
|
||||
@@ -100,7 +100,7 @@ Topic overviews & background information
|
||||
* [Compiler disambiguation](./conceptual/compiler-disambiguation.md)
|
||||
* [File structure (Linux FHS)](./conceptual/file-reorg.md)
|
||||
* [GPU isolation techniques](./conceptual/gpu-isolation.md)
|
||||
* [LLVN ASan](./conceptual/using-gpu-sanitizer.md)
|
||||
* [LLVM ASan](./conceptual/using-gpu-sanitizer.md)
|
||||
* [Using CMake](./conceptual/cmake-packages.rst)
|
||||
* [ROCm & PCIe atomics](./conceptual/More-about-how-ROCm-uses-PCIe-Atomics.rst)
|
||||
* [Inception v3 with PyTorch](./conceptual/ai-pytorch-inception.md)
|
||||
|
||||
@@ -3,7 +3,7 @@
|
||||
<meta name="description" content="ROCm API libraries & tools">
|
||||
<meta name="keywords" content="ROCm, API, libraries, tools, artificial intelligence, development,
|
||||
Communications, C++ primitives, Fast Fourier transforms, FFTs, random number generators, linear
|
||||
algebra">
|
||||
algebra, AMD">
|
||||
</head>
|
||||
|
||||
# ROCm API libraries & tools
|
||||
|
||||
@@ -2,7 +2,7 @@
|
||||
<meta charset="UTF-8">
|
||||
<meta name="description" content="Compiler reference guide">
|
||||
<meta name="keywords" content="compiler, hipCC, Clang, amdclang, optimizations, LLVM,
|
||||
rocm-llvm">
|
||||
rocm-llvm, AMD, ROCm">
|
||||
</head>
|
||||
|
||||
# Compiler reference guide
|
||||
|
||||
@@ -1,10 +1,17 @@
|
||||
<head>
|
||||
<meta charset="UTF-8">
|
||||
<meta name="description" content="ROCm release history">
|
||||
<meta name="keywords" content="documentation, release history, ROCm, AMD">
|
||||
</head>
|
||||
|
||||
# ROCm release history
|
||||
|
||||
| Version | Release Date |
|
||||
| Version | Release date |
|
||||
| ------- | ------------ |
|
||||
| [6.0.0](https://rocm.docs.amd.com/en/docs-6.0.0/) | Dec 15, 2023 |
|
||||
| [5.7.1](https://rocm.docs.amd.com/en/docs-5.7.1/) | Oct 13, 2023 |
|
||||
| [5.7.0](https://rocm.docs.amd.com/en/docs-5.7.0/) | Sep 15, 2023 |
|
||||
| [5.6.1](https://rocm.docs.amd.com/en/docs-5.6.1/) | Aug 29, 2023 |
|
||||
| [5.6.0](https://rocm.docs.amd.com/en/docs-5.6.0/) | Jun 28, 2023 |
|
||||
| [5.5.1](https://rocm.docs.amd.com/en/docs-5.5.1/) | May 24, 2023 |
|
||||
| [5.5.0](https://rocm.docs.amd.com/en/docs-5.5.0/) | May 1, 2023 |
|
||||
|
||||
@@ -91,7 +91,7 @@ subtrees:
|
||||
- file: conceptual/gpu-isolation.md
|
||||
title: GPU isolation techniques
|
||||
- file: conceptual/using-gpu-sanitizer.md
|
||||
title: LLVN ASan
|
||||
title: LLVM ASan
|
||||
- file: conceptual/cmake-packages.rst
|
||||
title: Using CMake
|
||||
- file: conceptual/More-about-how-ROCm-uses-PCIe-Atomics.rst
|
||||
|
||||
@@ -1 +1,2 @@
|
||||
rocm-docs-core==0.30.3
|
||||
rocm-docs-core==1.8.0
|
||||
sphinx-reredirects
|
||||
|
||||
@@ -1,114 +1,106 @@
|
||||
#
|
||||
# This file is autogenerated by pip-compile with Python 3.8
|
||||
# This file is autogenerated by pip-compile with Python 3.10
|
||||
# by the following command:
|
||||
#
|
||||
# pip-compile requirements.in
|
||||
#
|
||||
accessible-pygments==0.0.3
|
||||
accessible-pygments==0.0.5
|
||||
# via pydata-sphinx-theme
|
||||
alabaster==0.7.13
|
||||
alabaster==1.0.0
|
||||
# via sphinx
|
||||
babel==2.11.0
|
||||
babel==2.16.0
|
||||
# via
|
||||
# pydata-sphinx-theme
|
||||
# sphinx
|
||||
beautifulsoup4==4.11.2
|
||||
beautifulsoup4==4.12.3
|
||||
# via pydata-sphinx-theme
|
||||
breathe==4.34.0
|
||||
breathe==4.35.0
|
||||
# via rocm-docs-core
|
||||
certifi==2023.7.22
|
||||
certifi==2024.8.30
|
||||
# via requests
|
||||
cffi==1.15.1
|
||||
cffi==1.17.1
|
||||
# via
|
||||
# cryptography
|
||||
# pynacl
|
||||
charset-normalizer==2.1.1
|
||||
charset-normalizer==3.3.2
|
||||
# via requests
|
||||
click==8.1.3
|
||||
click==8.1.7
|
||||
# via sphinx-external-toc
|
||||
cryptography==41.0.3
|
||||
cryptography==43.0.1
|
||||
# via pyjwt
|
||||
deprecated==1.2.13
|
||||
deprecated==1.2.14
|
||||
# via pygithub
|
||||
docutils==0.19
|
||||
docutils==0.21.2
|
||||
# via
|
||||
# breathe
|
||||
# myst-parser
|
||||
# pydata-sphinx-theme
|
||||
# sphinx
|
||||
fastjsonschema==2.16.3
|
||||
fastjsonschema==2.20.0
|
||||
# via rocm-docs-core
|
||||
gitdb==4.0.10
|
||||
gitdb==4.0.11
|
||||
# via gitpython
|
||||
gitpython==3.1.30
|
||||
gitpython==3.1.43
|
||||
# via rocm-docs-core
|
||||
idna==3.4
|
||||
idna==3.10
|
||||
# via requests
|
||||
imagesize==1.4.1
|
||||
# via sphinx
|
||||
importlib-metadata==7.0.0
|
||||
# via sphinx
|
||||
importlib-resources==6.1.1
|
||||
# via rocm-docs-core
|
||||
jinja2==3.1.2
|
||||
jinja2==3.1.4
|
||||
# via
|
||||
# myst-parser
|
||||
# sphinx
|
||||
markdown-it-py==2.2.0
|
||||
markdown-it-py==3.0.0
|
||||
# via
|
||||
# mdit-py-plugins
|
||||
# myst-parser
|
||||
markupsafe==2.1.2
|
||||
markupsafe==2.1.5
|
||||
# via jinja2
|
||||
mdit-py-plugins==0.3.4
|
||||
mdit-py-plugins==0.4.2
|
||||
# via myst-parser
|
||||
mdurl==0.1.2
|
||||
# via markdown-it-py
|
||||
myst-parser==1.0.0
|
||||
myst-parser==4.0.0
|
||||
# via rocm-docs-core
|
||||
packaging==23.0
|
||||
packaging==24.1
|
||||
# via
|
||||
# pydata-sphinx-theme
|
||||
# sphinx
|
||||
pycparser==2.21
|
||||
pycparser==2.22
|
||||
# via cffi
|
||||
pydata-sphinx-theme==0.13.3
|
||||
pydata-sphinx-theme==0.15.4
|
||||
# via
|
||||
# rocm-docs-core
|
||||
# sphinx-book-theme
|
||||
pygithub==1.58.1
|
||||
pygithub==2.4.0
|
||||
# via rocm-docs-core
|
||||
pygments==2.15.0
|
||||
pygments==2.18.0
|
||||
# via
|
||||
# accessible-pygments
|
||||
# pydata-sphinx-theme
|
||||
# sphinx
|
||||
pyjwt[crypto]==2.6.0
|
||||
# via
|
||||
# pygithub
|
||||
# pyjwt
|
||||
pyjwt[crypto]==2.9.0
|
||||
# via pygithub
|
||||
pynacl==1.5.0
|
||||
# via pygithub
|
||||
pytz==2022.7.1
|
||||
# via babel
|
||||
pyyaml==6.0
|
||||
pyyaml==6.0.2
|
||||
# via
|
||||
# myst-parser
|
||||
# rocm-docs-core
|
||||
# sphinx-external-toc
|
||||
requests==2.31.0
|
||||
requests==2.32.3
|
||||
# via
|
||||
# pygithub
|
||||
# sphinx
|
||||
rocm-docs-core==0.30.3
|
||||
rocm-docs-core==1.8.0
|
||||
# via -r requirements.in
|
||||
smmap==5.0.0
|
||||
smmap==5.0.1
|
||||
# via gitdb
|
||||
snowballstemmer==2.2.0
|
||||
# via sphinx
|
||||
soupsieve==2.4
|
||||
soupsieve==2.6
|
||||
# via beautifulsoup4
|
||||
sphinx==5.3.0
|
||||
sphinx==8.0.2
|
||||
# via
|
||||
# breathe
|
||||
# myst-parser
|
||||
@@ -119,35 +111,40 @@ sphinx==5.3.0
|
||||
# sphinx-design
|
||||
# sphinx-external-toc
|
||||
# sphinx-notfound-page
|
||||
sphinx-book-theme==1.0.1
|
||||
# sphinx-reredirects
|
||||
sphinx-book-theme==1.1.3
|
||||
# via rocm-docs-core
|
||||
sphinx-copybutton==0.5.1
|
||||
sphinx-copybutton==0.5.2
|
||||
# via rocm-docs-core
|
||||
sphinx-design==0.4.1
|
||||
sphinx-design==0.6.1
|
||||
# via rocm-docs-core
|
||||
sphinx-external-toc==0.3.1
|
||||
sphinx-external-toc==1.0.1
|
||||
# via rocm-docs-core
|
||||
sphinx-notfound-page==0.8.3
|
||||
sphinx-notfound-page==1.0.4
|
||||
# via rocm-docs-core
|
||||
sphinxcontrib-applehelp==1.0.4
|
||||
sphinx-reredirects==0.1.5
|
||||
# via -r requirements.in
|
||||
sphinxcontrib-applehelp==2.0.0
|
||||
# via sphinx
|
||||
sphinxcontrib-devhelp==1.0.2
|
||||
sphinxcontrib-devhelp==2.0.0
|
||||
# via sphinx
|
||||
sphinxcontrib-htmlhelp==2.0.1
|
||||
sphinxcontrib-htmlhelp==2.1.0
|
||||
# via sphinx
|
||||
sphinxcontrib-jsmath==1.0.1
|
||||
# via sphinx
|
||||
sphinxcontrib-qthelp==1.0.3
|
||||
sphinxcontrib-qthelp==2.0.0
|
||||
# via sphinx
|
||||
sphinxcontrib-serializinghtml==1.1.5
|
||||
sphinxcontrib-serializinghtml==2.0.0
|
||||
# via sphinx
|
||||
typing-extensions==4.5.0
|
||||
# via pydata-sphinx-theme
|
||||
urllib3==1.26.13
|
||||
# via requests
|
||||
wrapt==1.14.1
|
||||
# via deprecated
|
||||
zipp==3.17.0
|
||||
tomli==2.0.1
|
||||
# via sphinx
|
||||
typing-extensions==4.12.2
|
||||
# via
|
||||
# importlib-metadata
|
||||
# importlib-resources
|
||||
# pydata-sphinx-theme
|
||||
# pygithub
|
||||
urllib3==2.2.3
|
||||
# via
|
||||
# pygithub
|
||||
# requests
|
||||
wrapt==1.16.0
|
||||
# via deprecated
|
||||
|
||||
@@ -1,7 +1,7 @@
|
||||
<head>
|
||||
<meta charset="UTF-8">
|
||||
<meta name="description" content="What is ROCm">
|
||||
<meta name="keywords" content="documentation, projects, introduction">
|
||||
<meta name="keywords" content="documentation, projects, introduction, ROCm, AMD">
|
||||
</head>
|
||||
|
||||
# What is ROCm?
|
||||
|
||||
@@ -2,6 +2,7 @@
|
||||
|
||||
## Pre-requisites
|
||||
|
||||
* Python 3.10
|
||||
* Create a GitHub Personal Access Token.
|
||||
* Tested with all the read-only permissions, but `public_repo`, `read:project`, `read:user`, and `repo:status` should be enough.
|
||||
* Copy the token somewhere safe.
|
||||
@@ -17,16 +18,16 @@
|
||||
* Run this for 5.6.0 (change for whatever version you require)
|
||||
* `GITHUB_ACCESS_TOKEN=my_token_here`
|
||||
|
||||
To generate the changelog from 5.0.0 up to and including 5.7.1:
|
||||
To generate the changelog from 5.0.0 up to and including 6.0.0:
|
||||
|
||||
```sh
|
||||
python3 tag_script.py -t $GITHUB_ACCESS_TOKEN --no-release --no-pulls --do-previous --compile_file ../../CHANGELOG.md --branch release/rocm-rel-5.7 5.7.1
|
||||
python3 tag_script.py -t $GITHUB_ACCESS_TOKEN --no-release --no-pulls --do-previous --compile_file ../../CHANGELOG.md --branch release/rocm-rel-6.0 6.0.0
|
||||
```
|
||||
|
||||
To generate the changelog only for 5.7.1:
|
||||
To generate the changelog only for 6.0.0:
|
||||
|
||||
```sh
|
||||
python3 tag_script.py -t $GITHUB_ACCESS_TOKEN --no-release --no-pulls --compile_file ../../CHANGELOG.md --branch release/rocm-rel-5.7 5.7.1
|
||||
python3 tag_script.py -t $GITHUB_ACCESS_TOKEN --no-release --no-pulls --compile_file ../../CHANGELOG.md --branch release/rocm-rel-6.0 6.0.0
|
||||
```
|
||||
|
||||
### Notes
|
||||
|
||||
@@ -281,7 +281,7 @@ Note: These complex operations are equivalent to corresponding types/functions o
|
||||
* `HIP_ROCclr`
|
||||
* NVIDIA platform
|
||||
* `HIP_PLATFORM_NVCC`
|
||||
* File directories in the clr repository are removed, for more details see https://github.com/ROCm-Developer-Tools/clr/blob/develop/hipamd/include/hip/hcc_detail and https://github.com/ROCm-Developer-Tools/clr/blob/develop/hipamd/include/hip/nvcc_detail
|
||||
* The `hcc_detail` and `nvcc_detail` directories in the clr repository are removed.
|
||||
* Deprecated gcnArch is removed from hip device struct `hipDeviceProp_t`.
|
||||
* Deprecated `enum hipMemoryType memoryType;` is removed from HIP struct `hipPointerAttribute_t` union.
|
||||
|
||||
|
||||