ROCm/docs/temp/troubleshooting.md
Lisa e87dba01c6 ROCm restructuring (#2521)
Co-authored-by: Sam Wu <samwu103@amd.com>
Co-authored-by: Saad Rahim (AMD) <44449863+saadrahim@users.noreply.github.com>
2023-10-06 15:42:11 -06:00

Troubleshooting

Q: What do I do if I get this error when trying to run PyTorch:

hipErrorNoBinaryForGPU: Unable to find code object for all current devices!

Ans: This error means that the installed PyTorch build, or one of its dependencies or libraries, was not compiled for the current GPU.

Workaround:

Follow these steps:

  1. Confirm that the hardware supports the ROCm stack. Refer to {ref}`linux-support` and {ref}`windows-support`.

  2. Determine the gfx target.

    rocminfo | grep gfx
    
  3. Check if PyTorch is compiled with the correct gfx target.

    # Locate the installed torch package, then list the code objects in its HIP library.
    TORCHDIR=$(dirname "$(python3 -c 'import torch; print(torch.__file__)')")
    roc-obj-ls -v "$TORCHDIR/lib/libtorch_hip.so" # check for gfx target
    
    If the gfx target of your GPU is not listed and you build PyTorch from
    source, recompile it with the correct gfx target. For wheels or Docker
    installations, contact ROCm support [^ROCm_issues].
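
The comparison in steps 2 and 3 can also be sketched in Python. This is a minimal sketch, not part of the ROCm tooling: the helper names `parse_gfx_targets`, `missing_targets`, and `check_this_machine` are hypothetical, and it assumes that `rocminfo` prints target names such as `gfx90a` and that `torch.cuda.get_arch_list()` reports the gfx targets on a ROCm build of PyTorch.

```python
import re
import subprocess


def parse_gfx_targets(text):
    """Return the sorted, de-duplicated gfx target names found in text."""
    return sorted(set(re.findall(r"\bgfx[0-9a-f]+\b", text)))


def missing_targets(device_targets, compiled_targets):
    """Return the device targets that the PyTorch build was not compiled for."""
    # Compiled entries may carry feature suffixes (for example "gfx90a:xnack-"),
    # so only the base name before the first ":" is compared.
    compiled = {target.split(":")[0] for target in compiled_targets}
    return [target for target in device_targets if target not in compiled]


def check_this_machine():
    """Compare the local GPU targets with the installed PyTorch build."""
    import torch  # imported lazily so the helpers above work without PyTorch

    rocminfo = subprocess.run(
        ["rocminfo"], capture_output=True, text=True, check=True
    ).stdout
    # Assumption: on ROCm builds, get_arch_list() returns gfx target names.
    return missing_targets(parse_gfx_targets(rocminfo), torch.cuda.get_arch_list())
```

Calling `check_this_machine()` on a ROCm system returns the targets that are present on the machine but absent from the build. An empty list suggests the build covers the installed GPU.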

Q: Why am I unable to access Docker or GPU in user accounts?

Ans: Ensure that the user belongs to the docker, video, and render Linux groups, as described in the ROCm Installation Guide at {ref}`linux_group_permissions`.
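
Group membership can be checked with a short script. The following is a minimal sketch that uses only the Python standard library on Linux. The names `REQUIRED_GROUPS`, `missing_groups`, and `current_user_groups` are hypothetical helpers introduced for illustration.

```python
import grp
import os

# Groups named in the answer above.
REQUIRED_GROUPS = ("docker", "video", "render")


def missing_groups(user_groups, required=REQUIRED_GROUPS):
    """Return the required groups that are absent from user_groups, in order."""
    present = set(user_groups)
    return [name for name in required if name not in present]


def current_user_groups():
    """Return the names of the groups the current process belongs to."""
    names = []
    for gid in set(os.getgroups()) | {os.getegid()}:
        try:
            names.append(grp.getgrgid(gid).gr_name)
        except KeyError:
            # A group ID without a name entry cannot match a required name.
            continue
    return names


if __name__ == "__main__":
    missing = missing_groups(current_user_groups())
    print("missing groups:", ", ".join(missing) if missing else "none")
```

Group changes normally apply only to new login sessions, so run the check again after logging out and back in.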

Q: Can I install PyTorch directly on bare metal?

Ans: Yes. Bare-metal installation of PyTorch is supported through wheels. Refer to Option 2: Install PyTorch Using Wheels Package, and see Installing PyTorch for more information.
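
After a wheel installation, one way to confirm that the installed package is a ROCm build is to inspect `torch.version.hip`, which is `None` on builds without HIP support. This is a minimal sketch. The function names `describe_build` and `describe_installed_torch` are hypothetical.

```python
def describe_build(hip_version, gpu_available):
    """Summarize a PyTorch build from two values collected beforehand."""
    if hip_version is None:
        return "not a ROCm build"
    if not gpu_available:
        return f"ROCm build (HIP {hip_version}), but no GPU is visible"
    return f"ROCm build (HIP {hip_version}), GPU is visible"


def describe_installed_torch():
    """Describe the PyTorch package installed in the current environment."""
    import torch  # imported lazily so describe_build works without PyTorch

    # ROCm builds expose GPUs through the torch.cuda namespace.
    return describe_build(torch.version.hip, torch.cuda.is_available())
```

Printing `describe_installed_torch()` in the environment where the wheel was installed gives a one-line summary of the build.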

Q: How do I profile PyTorch workloads?

Ans: Use the PyTorch Profiler to profile GPU kernels on ROCm.
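
A minimal sketch of such a profiling run follows, assuming a PyTorch version that provides `torch.profiler`. On ROCm builds, GPUs are exposed through the `torch.cuda` API and GPU kernels are recorded under `ProfilerActivity.CUDA`. The function name `profile_matmul` and the matrix multiplication workload are placeholders chosen for illustration.

```python
def profile_matmul(size=1024, steps=5):
    """Profile a few matrix multiplications and return a summary table."""
    import torch
    from torch.profiler import ProfilerActivity, profile

    use_gpu = torch.cuda.is_available()
    activities = [ProfilerActivity.CPU]
    if use_gpu:
        # On ROCm builds this activity records HIP GPU kernels.
        activities.append(ProfilerActivity.CUDA)

    device = "cuda" if use_gpu else "cpu"
    x = torch.randn(size, size, device=device)

    with profile(activities=activities) as prof:
        for _ in range(steps):
            x @ x
        if use_gpu:
            # Wait for queued GPU work so it is captured by the profiler.
            torch.cuda.synchronize()

    # Aggregate events by operator name and format the top rows as text.
    return prof.key_averages().table(sort_by="self_cpu_time_total", row_limit=10)
```

Calling `print(profile_matmul())` prints one row per operator. When a GPU is available, the table also includes device time columns.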