updates for SWDEV-459863 (#3114)

This commit is contained in:
randyh62
2024-05-10 08:55:46 -07:00
committed by GitHub
parent ef62da6cb3
commit 70f8dba582
5 changed files with 45 additions and 24 deletions

View File

@@ -0,0 +1,14 @@
<head>
<meta charset="UTF-8">
<meta name="description" content="AMD ROCm documentation">
<meta name="keywords" content="documentation, guides, installation, compatibility, support,
reference, ROCm, AMD">
</head>
# Using compiler features
The following topics describe using specific features of the compilation tools:
* [Using AddressSanitizer](./using-gpu-sanitizer.md)
* [Compiler disambiguation](./compiler-disambiguation.md)
* [OpenMP support in ROCm](../about/compatibility/openmp.md)

View File

@@ -5,7 +5,7 @@
libraries, instrumented applications, AMD, ROCm">
</head>
# Using the LLVM ASan on a GPU (beta release)
# Using the AddressSanitizer on a GPU (beta release)
The LLVM AddressSanitizer (ASan) provides a process that allows developers to detect runtime addressing errors in applications and libraries. The detection is achieved using a combination of compiler-added instrumentation and runtime techniques, including function interception and replacement.
Until now, the LLVM ASan process was only available for traditional purely CPU applications. However, ROCm has extended this mechanism to additionally allow the detection of some addressing errors on the GPU in heterogeneous applications. Ideally, developers should treat heterogeneous HIP and OpenMP applications exactly like pure CPU applications. However, this simplicity has not been achieved yet.
@@ -40,7 +40,7 @@ Other architectures are allowed, but their device code will not be instrumented
### About compilation time
When `-fsanitize=address` is used, the LLVM compiler adds instrumentation code around every memory operation. This added code must be handled by all of the downstream components of the compiler toolchain and results in increased overall compilation time. This increase is especially evident in the AMDGPU device compiler and has in a few instances raised the compile time to an unacceptable level.
When `-fsanitize=address` is used, the LLVM compiler adds instrumentation code around every memory operation. This added code must be handled by all downstream components of the compiler toolchain and results in increased overall compilation time. This increase is especially evident in the AMDGPU device compiler and has in a few instances raised the compile time to an unacceptable level.
There are a few options if the compile time becomes unacceptable:
@@ -90,7 +90,7 @@ If it does not appear, when executed the application will quickly output an ASan
* Ensure that the application `llvm-symbolizer` can be executed, and that it is located in `/opt/rocm-<version>/llvm/bin`. This executable is not strictly required, but if found is used to translate ("symbolize") a host-side instruction address into a more useful function name, file name, and line number (assuming the application has been built to include debug information).
There is an environment variable, `ASAN_OPTIONS`, that can be used to adjust the runtime behavior of the ASAN runtime itself. There are more than a hundred "flags" that can be adjusted (see an old list at [flags](https://github.com/google/sanitizers/wiki/AddressSanitizerFlags)) but the default settings are correct and should be used in most cases. It must be noted that these options only affect the host ASAN runtime. The device runtime only currently supports the default settings for the few relevant options.
There is an environment variable, `ASAN_OPTIONS`, that can be used to adjust the runtime behavior of the ASan runtime itself. There are more than a hundred "flags" that can be adjusted (see an old list at [flags](https://github.com/google/sanitizers/wiki/AddressSanitizerFlags)) but the default settings are correct and should be used in most cases. It must be noted that these options only affect the host ASan runtime. The device runtime only currently supports the default settings for the few relevant options.
There are two `ASAN_OPTION` flags of particular note.
@@ -100,7 +100,7 @@ This tells the ASan runtime to halt the application immediately after detecting
* `detect_leaks=0/1 default 1`.
This option directs the ASan runtime to enable the [Leak Sanitizer](https://clang.llvm.org/docs/LeakSanitizer.html) (LSAN). Unfortunately, for heterogeneous applications, this default will result in significant output from the leak sanitizer when the application exits due to allocations made by the language runtime which are not considered to be to be leaks. This output can be avoided by adding `detect_leaks=0` to the `ASAN_OPTIONS`, or alternatively by producing an LSAN suppression file (syntax described [here](https://github.com/google/sanitizers/wiki/AddressSanitizerLeakSanitizer)) and activating it with environment variable `LSAN_OPTIONS=suppressions=/path/to/suppression/file`. When using a suppression file, a suppression report is printed by default. The suppression report can be disabled by using the `LSAN_OPTIONS` flag `print_suppressions=0`.
This option directs the ASan runtime to enable the [Leak Sanitizer](https://clang.llvm.org/docs/LeakSanitizer.html) (LSan). Unfortunately, for heterogeneous applications, this default will result in significant output from the leak sanitizer when the application exits due to allocations made by the language runtime which are not considered to be leaks. This output can be avoided by adding `detect_leaks=0` to the `ASAN_OPTIONS`, or alternatively by producing an LSan suppression file (syntax described [here](https://github.com/google/sanitizers/wiki/AddressSanitizerLeakSanitizer)) and activating it with environment variable `LSAN_OPTIONS=suppressions=/path/to/suppression/file`. When using a suppression file, a suppression report is printed by default. The suppression report can be disabled by using the `LSAN_OPTIONS` flag `print_suppressions=0`.
## Runtime overhead
@@ -139,7 +139,7 @@ instrumentation.
## Runtime reporting
It is not the intention of this document to provide a detailed explanation of all of the types of reports that can be output by the ASan runtime. Instead, the focus is on the differences between the standard reports for CPU issues, and reports for GPU issues.
It is not the intention of this document to provide a detailed explanation of all the types of reports that can be output by the ASan runtime. Instead, the focus is on the differences between the standard reports for CPU issues, and reports for GPU issues.
An invalid address detection report for the CPU always starts with
@@ -203,7 +203,7 @@ This is solved by setting environment variable `LD_PRELOAD` to the path to the A
amdclang++ -print-file-name=libclang_rt.asan-x86_64.so
```
It is also recommended to set the environment variable `HIP_ENABLE_DEFERRED_LOADING=0` before debugging HIP applications.
You should also set the environment variable `HIP_ENABLE_DEFERRED_LOADING=0` before debugging HIP applications.
After starting `rocgdb` breakpoints can be set on the ASan runtime error reporting entry points of interest. For example, if an ASan error report includes
@@ -314,7 +314,7 @@ $ ldd mini
```
This confirms that the address sanitizer runtime is linked in, and the ASAN instrumented version of the runtime libraries are used.
This confirms that the address sanitizer runtime is linked in, and the ASan instrumented version of the runtime libraries are used.
Checking the `PATH` yields
```bash
@@ -404,10 +404,10 @@ Shadow byte legend (one shadow byte represents 8 application bytes):
### Known issues with using GPU sanitizer
* Red zones must have limited size and it is possible for an invalid access to completely miss a red zone and not be detected.
* Red zones must have limited size. It is possible for an invalid access to completely miss a red zone and not be detected.
* Lack of detection or false reports can be caused by the runtime not properly maintaining red zone shadows.
* Lack of detection on the GPU might also be due to the implementation not instrumenting accesses to all GPU specific address spaces. For example, in the current implementation accesses to "private" or "stack" variables on the GPU are not instrumented, and accesses to HIP shared variables (also known as "local data store" or "LDS") are also not instrumented.
* It can also be the case that a memory fault is hit for an invalid address even with the instrumentation. This is usually caused by the invalid address being so wild that its shadow address is outside of any memory region, and the fault actually occurs on the access to the shadow address. It is also possible to hit a memory fault for the `NULL` pointer. While address 0 does have a shadow location, it is not poisoned by the runtime.
* It can also be the case that a memory fault is hit for an invalid address even with the instrumentation. This is usually caused by the invalid address being so wild that its shadow address is outside any memory region, and the fault actually occurs on the access to the shadow address. It is also possible to hit a memory fault for the `NULL` pointer. While address 0 does have a shadow location, it is not poisoned by the runtime.

View File

@@ -96,6 +96,10 @@ Our documentation is organized into the following categories:
* [RDNA2](./how-to/tuning-guides/w6000-v620.md)
* [Setting up for deep learning with ROCm](./how-to/deep-learning-rocm.md)
* [GPU-enabled MPI](./how-to/gpu-enabled-mpi.rst)
* [Using compiler features](./conceptual/compiler-topics.md)
* [Using AddressSanitizer](./conceptual/using-gpu-sanitizer.md)
* [Compiler disambiguation](./conceptual/compiler-disambiguation.md)
* [OpenMP support in ROCm](./about/compatibility/openmp.md)
* [System level debugging](./how-to/system-debugging.md)
* [GitHub examples](https://github.com/amd/rocm-examples)
:::
@@ -111,15 +115,12 @@ Our documentation is organized into the following categories:
* [MI250](./conceptual/gpu-arch/mi250.md)
* [MI300](./conceptual/gpu-arch/mi300.md)
* [GPU memory](./conceptual/gpu-memory.md)
* [Compiler disambiguation](./conceptual/compiler-disambiguation.md)
* [File structure (Linux FHS)](./conceptual/file-reorg.md)
* [GPU isolation techniques](./conceptual/gpu-isolation.md)
* [LLVM ASan](./conceptual/using-gpu-sanitizer.md)
* [Using CMake](./conceptual/cmake-packages.rst)
* [ROCm & PCIe atomics](./conceptual/More-about-how-ROCm-uses-PCIe-Atomics.rst)
* [Inception v3 with PyTorch](./conceptual/ai-pytorch-inception.md)
* [Inference optimization with MIGraphX](./conceptual/ai-migraphx-optimization.md)
* [OpenMP support in ROCm](./about/compatibility/openmp.md)
:::
::::

View File

@@ -47,12 +47,6 @@ subtrees:
- caption: How to
entries:
- file: how-to/deep-learning-rocm.md
title: Deep learning
- file: how-to/gpu-enabled-mpi.rst
title: Using MPI
- file: how-to/system-debugging.md
title: Debugging
- file: how-to/tuning-guides.md
title: Tuning guides
subtrees:
@@ -63,6 +57,22 @@ subtrees:
title: MI200
- file: how-to/tuning-guides/w6000-v620.md
title: RDNA2
- file: how-to/deep-learning-rocm.md
title: Deep learning
- file: how-to/gpu-enabled-mpi.rst
title: Using MPI
- file: conceptual/compiler-topics.md
title: Using compiler features
subtrees:
- entries:
- file: conceptual/using-gpu-sanitizer.md
title: Using AddressSanitizer
- file: conceptual/compiler-disambiguation.md
title: Compiler disambiguation
- file: about/compatibility/openmp.md
title: OpenMP support
- file: how-to/system-debugging.md
title: Debugging
- url: https://github.com/amd/rocm-examples
title: GitHub examples
@@ -100,16 +110,10 @@ subtrees:
title: White paper
- file: conceptual/gpu-memory.md
title: GPU memory
- file: conceptual/compiler-disambiguation.md
title: Compiler disambiguation
- file: about/compatibility/openmp.md
title: OpenMP
- file: conceptual/file-reorg.md
title: File structure (Linux FHS)
- file: conceptual/gpu-isolation.md
title: GPU isolation techniques
- file: conceptual/using-gpu-sanitizer.md
title: LLVM ASan
- file: conceptual/cmake-packages.rst
title: Using CMake
- file: conceptual/More-about-how-ROCm-uses-PCIe-Atomics.rst