mirror of
https://github.com/ROCm/ROCm.git
synced 2026-04-05 03:01:17 -04:00
[FRONTEND] improve the process of finding libcuda.so and the error message (#1981)
`triton` uses the `whereis` command to find `libcuda.so`, but `whereis` is intended to locate binary, source, and manual page files. When `libcuda.so` is not set up properly, `whereis` returns `/usr/share/man/man7/libcuda.7`, which is not a useful place to look. This PR uses `ldconfig -p` to find `libcuda.so` reliably. In my case, I had a `libcuda.so.1` file with no `libcuda.so` link pointing to it, so `ld` could not find the library to link against. After creating the symlink, I was able to run `triton` successfully. The code now invokes `ldconfig -p` first and collects the entries that contain the string `libcuda.so`; these are candidate libraries to link against. If no literal `libcuda.so` file is found, it raises an error telling the user that a possible fix is to create a symlink.
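The parsing step described above can be sketched on its own. This is a minimal illustration rather than the code in the PR: the helper name `libcuda_candidates` and the sample text are hypothetical, and the sample only mimics the `name (flags) => path` line format that `ldconfig -p` prints.

```python
import os

# Sample text in the format printed by `ldconfig -p`.
# The entries are made up for illustration (assumption, not real output).
SAMPLE = (
    "\tlibcudart.so.12 (libc6,x86-64) => /usr/local/cuda/lib64/libcudart.so.12\n"
    "\tlibcuda.so.1 (libc6,x86-64) => /lib/x86_64-linux-gnu/libcuda.so.1\n"
    "\tlibc.so.6 (libc6,x86-64) => /lib/x86_64-linux-gnu/libc.so.6\n"
)


def libcuda_candidates(ldconfig_output):
    """Return the resolved path of every cache line that mentions libcuda.so."""
    # The last whitespace-separated token of each line is the resolved path.
    return [
        line.split()[-1]
        for line in ldconfig_output.splitlines()
        if "libcuda.so" in line
    ]


candidates = libcuda_candidates(SAMPLE)
# Directories that may hold (or need) a plain `libcuda.so` symlink.
candidate_dirs = sorted({os.path.dirname(path) for path in candidates})
```

Note that the substring test matches versioned names such as `libcuda.so.1` but not `libcudart.so.12`, which is why the versioned file shows up as a candidate even when the unversioned `libcuda.so` link is missing.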
@@ -18,8 +18,17 @@ def is_hip():
 
 @functools.lru_cache()
 def libcuda_dirs():
-    locs = subprocess.check_output(["whereis", "libcuda.so"]).decode().strip().split()[1:]
-    return [os.path.dirname(loc) for loc in locs]
+    libs = subprocess.check_output(["ldconfig", "-p"]).decode()
+    # each line looks like the following:
+    # libcuda.so.1 (libc6,x86-64) => /lib/x86_64-linux-gnu/libcuda.so.1
+    locs = [line.split()[-1] for line in libs.splitlines() if "libcuda.so" in line]
+    dirs = [os.path.dirname(loc) for loc in locs]
+    msg = 'libcuda.so cannot found!\n'
+    if locs:
+        msg += 'Possible files are located at %s.' % str(locs)
+        msg += 'Please create a symlink of libcuda.so to any of the file.'
+    assert any(os.path.exists(os.path.join(path, 'libcuda.so')) for path in dirs), msg
+    return dirs
 
 
 @functools.lru_cache()
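One possible variation on the check in this diff, shown only as a sketch: `assert` statements are skipped when Python runs with `-O`, so an explicit exception keeps the diagnostic in that mode. The function name `require_libcuda` and the choice of `RuntimeError` are assumptions made for illustration and are not part of this commit.

```python
import os


def require_libcuda(locs):
    """Return the directories of `locs` if one of them contains libcuda.so.

    `locs` is a list of library paths, e.g. parsed from `ldconfig -p`.
    Raises RuntimeError (hypothetical choice) with a hint otherwise.
    """
    dirs = [os.path.dirname(loc) for loc in locs]
    if any(os.path.exists(os.path.join(d, "libcuda.so")) for d in dirs):
        return dirs
    msg = "libcuda.so cannot be found!\n"
    if locs:
        msg += "Possible files are located at %s.\n" % str(locs)
        msg += "Please create a symlink named libcuda.so pointing to one of them."
    raise RuntimeError(msg)
```

On a machine like the one described in the commit message, the hinted fix would be something along the lines of `ln -s libcuda.so.1 libcuda.so` in the directory reported in the error, run with sufficient privileges.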