mirror of
https://github.com/ROCm/ROCm.git
synced 2026-01-09 22:58:17 -05:00
Removing Linux installation related content (#2673)
* Removing Linux installation related content * TOC updates * Removing added files * Line spacing on code block
@@ -1,90 +0,0 @@
|
||||
# Deploy ROCm Docker containers
|
||||
|
||||
## Prerequisites
|
||||
|
||||
Docker containers share the kernel with the host operating system, so the
ROCm kernel-mode driver must be installed on the host. Refer to
{ref}`linux-install-methods` for instructions on installing `amdgpu-dkms`. The
user-space parts of the ROCm stack (such as the HIP runtime or math libraries)
are loaded from the container image and don't need to be installed on the host.
|
||||
|
||||
(docker-access-gpus-in-container)=
|
||||
|
||||
## Accessing GPUs in containers
|
||||
|
||||
In order to access GPUs in a container (to run applications using HIP, OpenCL or
|
||||
OpenMP offloading) explicit access to the GPUs must be granted.
|
||||
|
||||
The ROCm runtimes make use of multiple device files:
|
||||
|
||||
* `/dev/kfd`: the main compute interface shared by all GPUs
|
||||
* `/dev/dri/renderD<node>`: direct rendering interface (DRI) devices for each
|
||||
GPU. **`<node>`** is a number for each card in the system starting from 128.
|
||||
|
||||
To expose these devices to a container, use the
[`--device`](https://docs.docker.com/engine/reference/commandline/run/#device)
option. For example, to allow access to all GPUs, expose `/dev/kfd` and all
`/dev/dri/renderD<node>` devices:
|
||||
|
||||
```shell
|
||||
docker run --device /dev/kfd --device /dev/dri/renderD128 --device /dev/dri/renderD129 ...
|
||||
```
|
||||
|
||||
More conveniently, instead of listing all devices, the entire `/dev/dri` folder
|
||||
can be exposed to the new container:
|
||||
|
||||
```shell
|
||||
docker run --device /dev/kfd --device /dev/dri ...
|
||||
```
|
||||
|
||||
Note that this gives more access than strictly required, as it also exposes the
|
||||
other device files found in that folder to the container.
|
||||
|
||||
(docker-restrict-gpus)=
|
||||
|
||||
### Restricting a container to a subset of the GPUs
|
||||
|
||||
If a `/dev/dri/renderD<node>` device is not exposed to a container, the
container cannot use the GPU associated with it. This lets you restrict a
container to any subset of devices.
|
||||
|
||||
For example, to allow the container to access only the first and third GPU,
start it like this:
|
||||
|
||||
```shell
|
||||
docker run --device /dev/kfd --device /dev/dri/renderD128 --device /dev/dri/renderD130 <image>
|
||||
```
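
The mapping from GPU position to render node can be scripted. The following is a minimal sketch, not part of the ROCm tooling: `gpu_device_flags` is a hypothetical helper, and it assumes that render nodes are numbered consecutively from 128 as described above, so that zero-based GPU index `i` corresponds to `/dev/dri/renderD<128+i>`. Check `/dev/dri` on your host before relying on that assumption.

```shell
# Hypothetical helper: print the --device flags for the given zero-based GPU indices.
# Assumes render nodes are numbered consecutively starting at 128.
gpu_device_flags() {
    flags="--device /dev/kfd"   # the compute interface is always required
    for idx in "$@"; do
        flags="$flags --device /dev/dri/renderD$((128 + idx))"
    done
    printf '%s\n' "$flags"
}

# Example: flags for the first and third GPU
gpu_device_flags 0 2
```

The printed flags can be pasted into a `docker run` command in place of the hand-written `--device` options.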
|
||||
|
||||
### Additional options
|
||||
|
||||
The performance of an application can vary depending on the assignment of GPUs
and CPUs to the task. Typically, `numactl` is installed as part of many HPC
applications to provide GPU/CPU mappings. The following Docker runtime option
supports memory mapping and can improve performance:
|
||||
|
||||
```shell
|
||||
--security-opt seccomp=unconfined
|
||||
```
|
||||
|
||||
This option is recommended for Docker containers running HPC applications:
|
||||
|
||||
```shell
|
||||
docker run --device /dev/kfd --device /dev/dri --security-opt seccomp=unconfined ...
|
||||
```
|
||||
|
||||
## Docker images in the ROCm ecosystem
|
||||
|
||||
### Base images
|
||||
|
||||
<https://github.com/RadeonOpenCompute/ROCm-docker> hosts images useful for users
|
||||
wishing to build their own containers leveraging ROCm. The built images are
|
||||
available from [Docker Hub](https://hub.docker.com/u/rocm). In particular
|
||||
`rocm/rocm-terminal` is a small image with the prerequisites to build HIP
|
||||
applications, but does not include any libraries.
|
||||
|
||||
### Applications
|
||||
|
||||
AMD provides pre-built images for various GPU-ready applications through its
|
||||
Infinity Hub at <https://www.amd.com/en/technologies/infinity-hub>.
|
||||
Examples for invoking each application and suggested parameters used for
|
||||
benchmarking are also provided there.
|
||||
@@ -1,64 +0,0 @@
|
||||
# MAGMA installation for ROCm
|
||||
|
||||
## MAGMA for ROCm
|
||||
|
||||
Matrix Algebra on GPU and Multicore Architectures (MAGMA) is a
|
||||
collection of next-generation dense linear algebra libraries that is designed
|
||||
for heterogeneous architectures, such as multiple GPUs and multi- or many-core
|
||||
CPUs.
|
||||
|
||||
MAGMA provides implementations for CUDA, HIP, Intel Xeon Phi, and OpenCL™. For
|
||||
more information, refer to
|
||||
[https://icl.utk.edu/magma/index.html](https://icl.utk.edu/magma/index.html).
|
||||
|
||||
### Using MAGMA for PyTorch
|
||||
|
||||
Tensors are fundamental to deep learning because they provide extensive
representational functionality and math operations. A tensor is
represented as a multidimensional matrix. MAGMA accelerates tensor operations
with a variety of solutions, including driver routines, computational routines,
BLAS routines, auxiliary routines, and utility routines.
|
||||
|
||||
### Building MAGMA from source
|
||||
|
||||
To build MAGMA from the source, follow these steps:
|
||||
|
||||
1. (Optional) To compile only for your microarchitecture (uarch), set:
|
||||
|
||||
```bash
|
||||
export PYTORCH_ROCM_ARCH=<uarch>
|
||||
```
|
||||
|
||||
`<uarch>` is the architecture reported by the `rocminfo` command.
|
||||
|
||||
2. Build MAGMA using the following script:
|
||||
|
||||
```bash
|
||||
export PYTORCH_ROCM_ARCH=<uarch>
|
||||
|
||||
# "install" hipMAGMA into /opt/rocm/magma by copying after build
|
||||
git clone https://bitbucket.org/icl/magma.git
|
||||
pushd magma
|
||||
# Fixes memory leaks of MAGMA found while executing linalg UTs
|
||||
git checkout 5959b8783e45f1809812ed96ae762f38ee701972
|
||||
cp make.inc-examples/make.inc.hip-gcc-mkl make.inc
|
||||
echo 'LIBDIR += -L$(MKLROOT)/lib' >> make.inc
|
||||
echo 'LIB += -Wl,--enable-new-dtags -Wl,--rpath,/opt/rocm/lib -Wl,--rpath,$(MKLROOT)/lib -Wl,--rpath,/opt/rocm/magma/lib' >> make.inc
|
||||
echo 'DEVCCFLAGS += --gpu-max-threads-per-block=256' >> make.inc
|
||||
export PATH="${PATH}:/opt/rocm/bin"
|
||||
if [[ -n "$PYTORCH_ROCM_ARCH" ]]; then
  amdgpu_targets=`echo $PYTORCH_ROCM_ARCH | sed 's/;/ /g'`
else
  amdgpu_targets=`rocm_agent_enumerator | grep -v gfx000 | sort -u | xargs`
fi
for arch in $amdgpu_targets; do
  echo "DEVCCFLAGS += --amdgpu-target=$arch" >> make.inc
done
|
||||
# hipcc with openmp flag may cause isnan() on __device__ not to be found; depending on context, compiler may attempt to match with host definition
|
||||
sed -i 's/^FOPENMP/#FOPENMP/g' make.inc
|
||||
make -f make.gen.hipMAGMA -j $(nproc)
|
||||
LANG=C.UTF-8 make lib/libmagma.so -j $(nproc) MKLROOT=/opt/conda
|
||||
make testing/testing_dgemm -j $(nproc) MKLROOT=/opt/conda
|
||||
popd
|
||||
mv magma /opt/rocm
|
||||
```
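
To see what the architecture loop in the script above contributes to `make.inc`, the following sketch reproduces it in isolation and prints the lines instead of appending them to a file. `arch_flags` is a hypothetical name used only for illustration, and the value `gfx906;gfx90a` is an example, not a recommendation.

```shell
# Illustration only: mirror the build script's loop, printing instead of writing to make.inc.
arch_flags() {
    # Split the semicolon-separated list the same way the build script does.
    for arch in $(printf '%s' "$1" | sed 's/;/ /g'); do
        printf 'DEVCCFLAGS += --amdgpu-target=%s\n' "$arch"
    done
}

# Example value (an assumption): two architectures separated by a semicolon
arch_flags "gfx906;gfx90a"
```

Each architecture in `PYTORCH_ROCM_ARCH` yields one `DEVCCFLAGS` line, so listing fewer architectures reduces the number of targets the device compiler builds for.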
|
||||
@@ -1,446 +0,0 @@
|
||||
# Installing PyTorch for ROCm
|
||||
|
||||
[PyTorch](https://pytorch.org/) is an open-source tensor library designed for deep learning. PyTorch on
|
||||
ROCm provides mixed-precision and large-scale training using our
|
||||
[MIOpen](https://github.com/ROCmSoftwarePlatform/MIOpen) and
|
||||
[RCCL](https://github.com/ROCmSoftwarePlatform/rccl) libraries.
|
||||
|
||||
To install [PyTorch for ROCm](https://pytorch.org/blog/pytorch-for-amd-rocm-platform-now-available-as-python-package/), you have the following options:
|
||||
|
||||
* [Use a Docker image with PyTorch pre-installed](#using-a-docker-image-with-pytorch-pre-installed)
|
||||
(recommended)
|
||||
* [Use a wheels package](#using-a-wheels-package)
|
||||
* [Use the PyTorch ROCm base Docker image](#using-the-pytorch-rocm-base-docker-image)
|
||||
* [Use the PyTorch upstream Docker file](#using-the-pytorch-upstream-docker-file)
|
||||
|
||||
For hardware, software, and third-party framework compatibility between ROCm and PyTorch, refer to:
|
||||
|
||||
* [GPU and OS support (Linux)](../about/compatibility/linux-support.md)
|
||||
* [Compatibility](../about/compatibility/3rd-party-support-matrix.md)
|
||||
|
||||
## Using a Docker image with PyTorch pre-installed
|
||||
|
||||
1. Download the latest public PyTorch Docker image
|
||||
([https://hub.docker.com/r/rocm/pytorch](https://hub.docker.com/r/rocm/pytorch)).
|
||||
|
||||
```bash
|
||||
docker pull rocm/pytorch:latest
|
||||
```
|
||||
|
||||
You can also download a specific and supported configuration with different user-space ROCm
|
||||
versions, PyTorch versions, and operating systems.
|
||||
|
||||
2. Start a Docker container using the image.
|
||||
|
||||
```bash
|
||||
docker run -it --cap-add=SYS_PTRACE --security-opt seccomp=unconfined \
|
||||
--device=/dev/kfd --device=/dev/dri --group-add video \
|
||||
--ipc=host --shm-size 8G rocm/pytorch:latest
|
||||
```
|
||||
|
||||
:::{note}
|
||||
This will automatically download the image if it does not exist on the host. You can also pass the `-v`
|
||||
argument to mount any data directories from the host onto the container.
|
||||
:::
|
||||
|
||||
(install_pytorch_wheels)=
|
||||
|
||||
## Using a wheels package
|
||||
|
||||
PyTorch supports ROCm software by providing tested wheels packages. To access this feature, go
|
||||
to [https://pytorch.org/get-started/locally/](https://pytorch.org/get-started/locally/). For the correct
|
||||
wheels command, you must select 'Linux', 'Python', 'pip', and 'ROCm' in the matrix.
|
||||
|
||||
1. Choose one of the following three options:
|
||||
|
||||
**Option 1:**
|
||||
|
||||
a. Download a base Docker image with the correct user-space ROCm version.
|
||||
| Base OS | Docker image | Link to Docker image |
|----------------|-----------------------------|----------------|
| Ubuntu 20.04 | `rocm/dev-ubuntu-20.04` | [https://hub.docker.com/r/rocm/dev-ubuntu-20.04](https://hub.docker.com/r/rocm/dev-ubuntu-20.04) |
| Ubuntu 22.04 | `rocm/dev-ubuntu-22.04` | [https://hub.docker.com/r/rocm/dev-ubuntu-22.04](https://hub.docker.com/r/rocm/dev-ubuntu-22.04) |
| CentOS 7 | `rocm/dev-centos-7` | [https://hub.docker.com/r/rocm/dev-centos-7](https://hub.docker.com/r/rocm/dev-centos-7) |
|
||||
|
||||
b. Pull the selected image.
|
||||
|
||||
```bash
|
||||
docker pull rocm/dev-ubuntu-20.04:latest
|
||||
```
|
||||
|
||||
c. Start a Docker container using the downloaded image.
|
||||
|
||||
```bash
|
||||
docker run -it --device=/dev/kfd --device=/dev/dri --group-add video rocm/dev-ubuntu-20.04:latest
|
||||
```
|
||||
|
||||
**Option 2:**
|
||||
|
||||
Select a base OS Docker image (check [OS compatibility](../about/compatibility/linux-support.md)).

Pull the selected base OS image (Ubuntu 20.04, for example).
|
||||
|
||||
```bash
|
||||
docker pull ubuntu:20.04
|
||||
```
|
||||
|
||||
Start a Docker container using the downloaded image.
|
||||
|
||||
```bash
|
||||
docker run -it --device=/dev/kfd --device=/dev/dri --group-add video ubuntu:20.04
|
||||
```
|
||||
|
||||
Install ROCm using the directions in the [Installation section](./linux/install.md).
|
||||
|
||||
**Option 3:**
|
||||
|
||||
Install on bare metal. Check [OS compatibility](../about/compatibility/linux-support.md) and install ROCm using the
|
||||
directions in the [Installation section](./linux/install.md).
|
||||
|
||||
2. Install the required dependencies for the wheels package.
|
||||
|
||||
```bash
|
||||
sudo apt update
|
||||
sudo apt install libjpeg-dev python3-dev python3-pip
|
||||
pip3 install wheel setuptools
|
||||
```
|
||||
|
||||
3. Install `torch`, `torchvision`, and `torchaudio`, as specified in the
|
||||
[installation matrix](https://pytorch.org/get-started/locally/).
|
||||
|
||||
:::{note}
|
||||
The following command uses the ROCm 5.6 PyTorch wheel. If you want a different version of ROCm,
|
||||
modify the command accordingly.
|
||||
:::
|
||||
|
||||
```bash
|
||||
pip3 install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/rocm5.6/
|
||||
```
|
||||
|
||||
4. (Optional) Use MIOpen kdb files with ROCm PyTorch wheels.
|
||||
|
||||
PyTorch uses [MIOpen](https://github.com/ROCmSoftwarePlatform/MIOpen) for machine learning
|
||||
primitives, which are compiled into kernels at runtime. Runtime compilation causes a small warm-up
|
||||
phase when starting PyTorch, and MIOpen kdb files contain precompiled kernels that can speed up
|
||||
application warm-up phases. For more information, refer to the
|
||||
{doc}`MIOpen installation page <miopen:install>`.
|
||||
|
||||
MIOpen kdb files can be used with ROCm PyTorch wheels. However, the kdb files need to be placed in
|
||||
a specific location with respect to the PyTorch installation path. A helper script simplifies this task by
|
||||
taking the ROCm version and GPU architecture as inputs. This works for Ubuntu and CentOS.
|
||||
|
||||
You can download the helper script here:
|
||||
[install_kdb_files_for_pytorch_wheels.sh](https://raw.githubusercontent.com/wiki/ROCmSoftwarePlatform/pytorch/files/install_kdb_files_for_pytorch_wheels.sh), or use:
|
||||
|
||||
`wget https://raw.githubusercontent.com/wiki/ROCmSoftwarePlatform/pytorch/files/install_kdb_files_for_pytorch_wheels.sh`
|
||||
|
||||
After installing ROCm PyTorch wheels, run the following code:
|
||||
|
||||
```bash
|
||||
# Optional: replace 'gfx90a' with your architecture
export GFX_ARCH=gfx90a

# Optional: replace 5.6 with your preferred ROCm version
export ROCM_VERSION=5.6

./install_kdb_files_for_pytorch_wheels.sh
|
||||
```
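
The wheel installation command in step 3 hard-codes ROCm 5.6 in its index URL. As a minimal sketch of how to adapt it, the hypothetical helper `rocm_wheel_index_url` below builds the URL from a version string. It assumes that other ROCm versions follow the same URL pattern as the 5.6 example; confirm the exact command for your version in the PyTorch installation matrix before using it.

```shell
# Hypothetical helper: build the nightly wheel index URL for a given ROCm version.
# Assumption: other versions use the same URL pattern as the ROCm 5.6 example.
rocm_wheel_index_url() {
    printf 'https://download.pytorch.org/whl/nightly/rocm%s/\n' "$1"
}

ROCM_VERSION=5.6
# Print the command for review rather than running it directly.
echo "pip3 install --pre torch torchvision torchaudio --index-url $(rocm_wheel_index_url "$ROCM_VERSION")"
```

Printing the command first makes it easy to compare against the installation matrix before running it.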
|
||||
|
||||
## Using the PyTorch ROCm base Docker image
|
||||
|
||||
The pre-built base Docker image has all dependencies installed, including:
|
||||
|
||||
* ROCm
|
||||
* Torchvision
|
||||
* Conda packages
|
||||
* The compiler toolchain
|
||||
|
||||
Additionally, a particular environment flag (`BUILD_ENVIRONMENT`) is set, which is used by the build
|
||||
scripts to determine the configuration of the build environment.
|
||||
|
||||
1. Download the Docker image. This is the base image, which does not contain PyTorch.
|
||||
|
||||
```bash
|
||||
docker pull rocm/pytorch:latest-base
|
||||
```
|
||||
|
||||
2. Start a Docker container using the downloaded image.
|
||||
|
||||
```bash
|
||||
docker run -it --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --device=/dev/kfd --device=/dev/dri --group-add video --ipc=host --shm-size 8G rocm/pytorch:latest-base
|
||||
```
|
||||
|
||||
You can also pass the `-v` argument to mount any data directories from the host onto the container.
|
||||
|
||||
3. Clone the PyTorch repository.
|
||||
|
||||
```bash
|
||||
cd ~
|
||||
git clone https://github.com/pytorch/pytorch.git
|
||||
cd pytorch
|
||||
git submodule update --init --recursive
|
||||
```
|
||||
|
||||
4. Set ROCm architecture (optional). The Docker image tag is `rocm/pytorch:latest-base`.
|
||||
|
||||
:::{note}
|
||||
By default in the `rocm/pytorch:latest-base` image, PyTorch builds simultaneously for the following
|
||||
architectures:
|
||||
* gfx900
|
||||
* gfx906
|
||||
* gfx908
|
||||
* gfx90a
|
||||
* gfx1030
|
||||
:::
|
||||
|
||||
If you want to compile _only_ for your microarchitecture (uarch), run:
|
||||
|
||||
```bash
|
||||
export PYTORCH_ROCM_ARCH=<uarch>
|
||||
```
|
||||
|
||||
Where `<uarch>` is the architecture reported by the `rocminfo` command.
|
||||
|
||||
To find your uarch, run:
|
||||
|
||||
```bash
|
||||
rocminfo | grep gfx
|
||||
```
|
||||
|
||||
5. Build PyTorch.
|
||||
|
||||
```bash
|
||||
./.ci/pytorch/build.sh
|
||||
```
|
||||
|
||||
This converts PyTorch sources for
|
||||
[HIP compatibility](https://www.amd.com/en/developer/rocm-hub/hip-sdk.html) and builds the
|
||||
PyTorch framework.
|
||||
|
||||
To check if your build is successful, run:
|
||||
|
||||
```bash
|
||||
echo $? # should return 0 if success
|
||||
```
|
||||
|
||||
## Using the PyTorch upstream Docker file
|
||||
|
||||
If you don't want to use a prebuilt base Docker image, you can build a custom base Docker image
|
||||
using scripts from the PyTorch repository. This uses a standard Docker image from operating system
|
||||
maintainers and installs all the required dependencies, including:
|
||||
|
||||
* ROCm
|
||||
* Torchvision
|
||||
* Conda packages
|
||||
* The compiler toolchain
|
||||
|
||||
1. Clone the PyTorch repository.
|
||||
|
||||
```bash
|
||||
cd ~
|
||||
git clone https://github.com/pytorch/pytorch.git
|
||||
cd pytorch
|
||||
git submodule update --init --recursive
|
||||
```
|
||||
|
||||
2. Build the PyTorch Docker image.
|
||||
|
||||
```bash
|
||||
cd .ci/docker
|
||||
./build.sh pytorch-linux-<os-version>-rocm<rocm-version>-py<python-version> -t rocm/pytorch:build_from_dockerfile
|
||||
```
|
||||
|
||||
Where:
|
||||
* `<os-version>`: `ubuntu20.04` (or `focal`), `ubuntu22.04` (or `jammy`), `centos7.5`, or `centos9`
|
||||
* `<rocm-version>`: `5.4`, `5.5`, or `5.6`
|
||||
* `<python-version>`: `3.8`-`3.11`
|
||||
|
||||
To verify that your image was successfully created, run:
|
||||
|
||||
`docker image ls rocm/pytorch:build_from_dockerfile`
|
||||
|
||||
If successful, the output looks like this:
|
||||
|
||||
```bash
|
||||
REPOSITORY TAG IMAGE ID CREATED SIZE
|
||||
rocm/pytorch build_from_dockerfile 17071499be47 2 minutes ago 32.8GB
|
||||
```
|
||||
|
||||
3. Start a Docker container using the image with the mounted PyTorch folder.
|
||||
|
||||
```bash
|
||||
docker run -it --cap-add=SYS_PTRACE --security-opt seccomp=unconfined \
    --user root --device=/dev/kfd --device=/dev/dri \
    --group-add video --ipc=host --shm-size 8G \
    -v ~/pytorch:/pytorch rocm/pytorch:build_from_dockerfile
|
||||
```
|
||||
|
||||
You can also pass the `-v` argument to mount any data directories from the host onto the container.
|
||||
|
||||
4. Go to the PyTorch directory.
|
||||
|
||||
```bash
|
||||
cd /pytorch
|
||||
```
|
||||
|
||||
5. Set ROCm architecture.
|
||||
|
||||
To determine your AMD architecture, run:
|
||||
|
||||
```bash
|
||||
rocminfo | grep gfx
|
||||
```
|
||||
|
||||
The result looks like this (for `gfx1030` architecture):
|
||||
|
||||
```bash
|
||||
Name: gfx1030
|
||||
Name: amdgcn-amd-amdhsa--gfx1030
|
||||
```
|
||||
|
||||
Set the `PYTORCH_ROCM_ARCH` environment variable to specify the architectures you want to
|
||||
build PyTorch for.
|
||||
|
||||
```bash
|
||||
export PYTORCH_ROCM_ARCH=<uarch>
|
||||
```
|
||||
|
||||
where `<uarch>` is the architecture reported by the `rocminfo` command.
|
||||
|
||||
6. Build PyTorch.
|
||||
|
||||
```bash
|
||||
./.ci/pytorch/build.sh
|
||||
```
|
||||
|
||||
This converts PyTorch sources for
|
||||
[HIP compatibility](https://www.amd.com/en/developer/rocm-hub/hip-sdk.html) and builds the
|
||||
PyTorch framework.
|
||||
|
||||
To check if your build is successful, run:
|
||||
|
||||
```bash
|
||||
echo $? # should return 0 if success
|
||||
```
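
The `rocminfo | grep gfx` step can be turned into a value for `PYTORCH_ROCM_ARCH` with a small filter. This is a sketch under stated assumptions: `extract_gfx_archs` is a hypothetical helper, it assumes architecture names consist of `gfx` followed by digits and lowercase hexadecimal letters (as in `gfx1030` or `gfx90a`), and it joins multiple architectures with semicolons, the separator that the MAGMA build script splits on. The sample input is the `rocminfo` output shown earlier.

```shell
# Hypothetical helper: reduce `rocminfo` output on stdin to a unique,
# semicolon-separated list of gfx architecture names.
extract_gfx_archs() {
    grep -o 'gfx[0-9a-f]*' | sort -u | paste -sd ';' -
}

# Example using the sample output shown above instead of a live `rocminfo` call
printf 'Name: gfx1030\nName: amdgcn-amd-amdhsa--gfx1030\n' | extract_gfx_archs
```

On a host with ROCm installed, `export PYTORCH_ROCM_ARCH="$(rocminfo | extract_gfx_archs)"` would apply the same filter to live output; inspect the result before building.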
|
||||
|
||||
## Testing the PyTorch installation
|
||||
|
||||
You can use PyTorch unit tests to validate your PyTorch installation. If you used a
|
||||
**prebuilt PyTorch Docker image from AMD ROCm Docker Hub** or installed an
|
||||
**official wheels package**, validation tests are not necessary.
|
||||
|
||||
If you want to manually run unit tests to validate your PyTorch installation fully, follow these steps:
|
||||
|
||||
1. Import the torch package in Python to test if PyTorch is installed and accessible.
|
||||
|
||||
:::{note}
|
||||
Do not run the following command in the PyTorch git folder.
|
||||
:::
|
||||
|
||||
```bash
|
||||
python3 -c 'import torch' 2> /dev/null && echo 'Success' || echo 'Failure'
|
||||
```
|
||||
|
||||
2. Check if the GPU is accessible from PyTorch. In the PyTorch framework, `torch.cuda` is a generic
mechanism to access the GPU. With ROCm, it accesses an AMD GPU only if one is available.
|
||||
|
||||
```bash
|
||||
python3 -c 'import torch; print(torch.cuda.is_available())'
|
||||
```
|
||||
|
||||
3. Run unit tests to validate the PyTorch installation fully.
|
||||
|
||||
:::{note}
|
||||
You must run the following command from the PyTorch home directory.
|
||||
:::
|
||||
|
||||
```bash
|
||||
PYTORCH_TEST_WITH_ROCM=1 python3 test/run_test.py --verbose \
|
||||
--include test_nn test_torch test_cuda test_ops \
|
||||
test_unary_ufuncs test_binary_ufuncs test_autograd
|
||||
```
|
||||
|
||||
This command ensures that the required environment variable is set to skip certain unit tests for
|
||||
ROCm. This also applies to wheel installs in a non-controlled environment.
|
||||
|
||||
:::{note}
|
||||
Make sure your PyTorch source code corresponds to the PyTorch wheel or the installation in the
|
||||
Docker image. Incompatible PyTorch source code can give errors when running unit tests.
|
||||
:::
|
||||
|
||||
Some tests may be skipped, as appropriate, based on your system configuration. ROCm doesn't
|
||||
support all PyTorch features; tests that evaluate unsupported features are skipped. Other tests might
|
||||
be skipped, depending on the host or GPU memory and the number of available GPUs.
|
||||
|
||||
If the compilation and installation are correct, all tests will pass.
|
||||
|
||||
4. Run individual unit tests.
|
||||
|
||||
```bash
|
||||
PYTORCH_TEST_WITH_ROCM=1 python3 test/test_nn.py --verbose
|
||||
```
|
||||
|
||||
You can replace `test_nn.py` with any other test set.
|
||||
|
||||
## Running a basic PyTorch example
|
||||
|
||||
The PyTorch examples repository provides basic examples that exercise the functionality of your
|
||||
framework.
|
||||
|
||||
Two of our favorite testing databases are:
|
||||
|
||||
* **MNIST** (Modified National Institute of Standards and Technology): A database of handwritten
|
||||
digits that can be used to train a Convolutional Neural Network for **handwriting recognition**.
|
||||
* **ImageNet**: A database of images that can be used to train a network for
|
||||
**visual object recognition**.
|
||||
|
||||
### MNIST PyTorch example
|
||||
|
||||
1. Clone the PyTorch examples repository.
|
||||
|
||||
```bash
|
||||
git clone https://github.com/pytorch/examples.git
|
||||
```
|
||||
|
||||
2. Go to the MNIST example folder.
|
||||
|
||||
```bash
|
||||
cd examples/mnist
|
||||
```
|
||||
|
||||
3. Follow the instructions in the `README.md` file in this folder to install the requirements. Then run:
|
||||
|
||||
```bash
|
||||
python3 main.py
|
||||
```
|
||||
|
||||
This generates the following output:
|
||||
|
||||
```bash
|
||||
...
|
||||
Train Epoch: 14 [58240/60000 (97%)] Loss: 0.010128
|
||||
Train Epoch: 14 [58880/60000 (98%)] Loss: 0.001348
|
||||
Train Epoch: 14 [59520/60000 (99%)] Loss: 0.005261
|
||||
|
||||
Test set: Average loss: 0.0252, Accuracy: 9921/10000 (99%)
|
||||
```
|
||||
|
||||
### ImageNet PyTorch example
|
||||
|
||||
1. Clone the PyTorch examples repository (if you didn't already do this step in the preceding MNIST example).
|
||||
|
||||
```bash
|
||||
git clone https://github.com/pytorch/examples.git
|
||||
```
|
||||
|
||||
2. Go to the ImageNet example folder.
|
||||
|
||||
```bash
|
||||
cd examples/imagenet
|
||||
```
|
||||
|
||||
3. Follow the instructions in the `README.md` file in this folder to install the requirements. Then run:
|
||||
|
||||
```bash
|
||||
python3 main.py
|
||||
```
|
||||
@@ -1,421 +0,0 @@
|
||||
# Introduction to Spack
|
||||
|
||||
Spack is a package management tool designed to support multiple software versions and
|
||||
configurations on a wide variety of platforms and environments. It was designed for large
|
||||
supercomputing centers, where many users share common software installations on clusters with
|
||||
exotic architectures using libraries that do not have a standard ABI. Spack is non-destructive: installing
|
||||
a new version does not break existing installations, so many configurations can coexist on the same
|
||||
system.
|
||||
|
||||
Most importantly, Spack is *simple*. It offers a simple *spec* syntax, so users can concisely specify
|
||||
versions and configuration options. Spack is also simple for package authors: package files are written
|
||||
in pure Python, and specs allow package authors to maintain a single file for many different builds of
|
||||
the same package. For more information on Spack, see
|
||||
[https://spack-tutorial.readthedocs.io/en/latest/](https://spack-tutorial.readthedocs.io/en/latest/).
|
||||
|
||||
## ROCm packages in Spack
|
||||
|
||||
| **Component** | **Spack Package Name** |
|
||||
|---------------------------|------------------------|
|
||||
| **rocm-cmake** | rocm-cmake |
|
||||
| **thunk** | hsakmt-roct |
|
||||
| **rocm-smi-lib** | rocm-smi-lib |
|
||||
| **hsa** | hsa-rocr-dev |
|
||||
| **lightning** | llvm-amdgpu |
|
||||
| **devicelibs** | rocm-device-libs |
|
||||
| **comgr** | comgr |
|
||||
| **rocclr (vdi)** | hip-rocclr |
|
||||
| **hipify_clang** | hipify-clang |
|
||||
| **hip (hip_in_vdi)** | hip |
|
||||
| **ocl (opencl_on_vdi)** | rocm-opencl |
|
||||
| **rocminfo** | rocminfo |
|
||||
| **clang-ocl** | rocm-clang-ocl |
|
||||
| **rccl** | rccl |
|
||||
| **atmi** | atmi |
|
||||
| **rocm_debug_agent** | rocm-debug-agent |
|
||||
| **rocm_bandwidth_test** | rocm-bandwidth-test |
|
||||
| **rocprofiler** | rocprofiler-dev |
|
||||
| **roctracer-dev-api** | roctracer-dev-api |
|
||||
| **roctracer** | roctracer-dev |
|
||||
| **dbgapi** | rocm-dbgapi |
|
||||
| **rocm-gdb** | rocm-gdb |
|
||||
| **openmp-extras** | rocm-openmp-extras |
|
||||
| **rocBLAS** | rocblas |
|
||||
| **hipBLAS** | hipblas |
|
||||
| **rocFFT** | rocfft |
|
||||
| **rocRAND** | rocrand |
|
||||
| **rocSPARSE** | rocsparse |
|
||||
| **hipSPARSE** | hipsparse |
|
||||
| **rocALUTION** | rocalution |
|
||||
| **rocSOLVER** | rocsolver |
|
||||
| **rocPRIM** | rocprim |
|
||||
| **rocThrust** | rocthrust |
|
||||
| **hipCUB** | hipcub |
|
||||
| **hipfort** | hipfort |
|
||||
| **ROCmValidationSuite** | rocm-validation-suite |
|
||||
| **MIOpenGEMM** | miopengemm |
|
||||
| **MIOpen (HIP variant)** | miopen-hip |
| **MIOpen (OpenCL variant)** | miopen-opencl |
|
||||
| **MIVisionX** | mivisionx |
|
||||
| **AMDMIGraphX** | migraphx |
|
||||
| **rocm-tensile** | rocm-tensile |
|
||||
| **hipfft** | hipfft |
|
||||
| **RDC** | rdc |
|
||||
| **hipsolver** | hipsolver |
|
||||
| **mlirmiopen** | mlirmiopen |
|
||||
|
||||
```{note}
|
||||
You must install all prerequisites before installing Spack.
|
||||
```
|
||||
|
||||
::::{tab-set}
|
||||
:::{tab-item} Ubuntu
|
||||
:sync: Ubuntu
|
||||
|
||||
```shell
|
||||
# Install some essential utilities:
|
||||
apt-get update
|
||||
apt-get install make patch bash tar gzip unzip bzip2 file gnupg2 git gawk
|
||||
apt-get update -y
|
||||
apt-get install -y xz-utils
|
||||
apt-get install build-essential
|
||||
apt-get install vim
|
||||
# Install Python:
|
||||
apt-get install python3
|
||||
apt-get install python3-pip
|
||||
# Install Compilers:
|
||||
apt-get install gcc
|
||||
apt-get install gfortran
|
||||
```
|
||||
|
||||
:::
|
||||
:::{tab-item} SLES
|
||||
:sync: SLES
|
||||
|
||||
```shell
|
||||
# Install some essential utilities:
|
||||
zypper update
|
||||
zypper install make patch bash tar gzip unzip bzip2 xz file gnupg2 git gawk
|
||||
zypper in -t pattern
|
||||
zypper install vim
|
||||
# Install Python:
|
||||
zypper install python3
|
||||
zypper install python3-pip
|
||||
# Install Compilers:
|
||||
zypper install gcc
|
||||
zypper install gcc-fortran
|
||||
zypper install gcc-c++
|
||||
```
|
||||
|
||||
:::
|
||||
:::{tab-item} CentOS
|
||||
:sync: CentOS
|
||||
|
||||
```shell
|
||||
# Install some essential utilities:
|
||||
yum update
|
||||
yum install make
|
||||
yum install patch bash tar gzip unzip bzip2 xz file gnupg2 git gawk
|
||||
yum group install "Development Tools"
|
||||
yum install vim
|
||||
# Install Python:
|
||||
yum install python3
|
||||
pip3 install --upgrade pip
|
||||
# Install compilers:
|
||||
yum install gcc
|
||||
yum install gcc-gfortran
|
||||
yum install gcc-c++
|
||||
```
|
||||
|
||||
:::
|
||||
::::
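
After installing the prerequisites, you can check that the tools are actually on the `PATH`. The following is a minimal sketch: `missing_tools` is a hypothetical helper, and the list of command names is an assumption based on the packages above (for example, the `xz-utils` package is assumed to provide the `xz` command).

```shell
# Hypothetical helper: print each named command that is not found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Assumed command names for the packages installed above; prints nothing if all are present.
missing_tools make patch tar gzip unzip bzip2 xz file git python3 gcc gfortran
```

An empty result means every listed command was found; any names printed point to packages that still need to be installed.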
|
||||
|
||||
## Steps to build ROCm components using Spack
|
||||
|
||||
1. To use the Spack package manager, clone the Spack project from GitHub.
|
||||
|
||||
```bash
|
||||
git clone https://github.com/spack/spack
|
||||
```
|
||||
|
||||
2. Initialize Spack.
|
||||
|
||||
The `setup-env.sh` script initializes the Spack environment.
|
||||
|
||||
```bash
|
||||
cd spack
|
||||
|
||||
. share/spack/setup-env.sh
|
||||
```
|
||||
|
||||
Spack commands are available once the above steps are completed. To list the available commands,
use `spack help`.
|
||||
|
||||
```bash
|
||||
root@ixt-rack-104:/spack# spack help
|
||||
```
|
||||
|
||||
## Using Spack to install ROCm components
|
||||
|
||||
1. `rocm-cmake`
|
||||
|
||||
Install the default variants and the latest version of `rocm-cmake`.
|
||||
|
||||
```bash
|
||||
spack install rocm-cmake
|
||||
```
|
||||
|
||||
To install a specific version of `rocm-cmake`, use:
|
||||
|
||||
```bash
|
||||
spack install rocm-cmake@<version number>
|
||||
```
|
||||
|
||||
For example, `spack install rocm-cmake@5.2.0`
|
||||
|
||||
2. `info`
|
||||
|
||||
The `info` command displays basic package information. It shows the preferred, safe, and
|
||||
deprecated versions, in addition to the available variants. It also shows the dependencies with other
|
||||
packages.
|
||||
|
||||
```bash
|
||||
spack info mivisionx
|
||||
```
|
||||
|
||||
For example:
|
||||
|
||||
```bash
|
||||
root@ixt-rack-104:/spack# spack info mivisionx
|
||||
CMakePackage: mivisionx
|
||||
|
||||
Description:
|
||||
MIVisionX toolkit is a set of comprehensive computer vision and machine
|
||||
intelligence libraries, utilities, and applications bundled into a
|
||||
single toolkit.
|
||||
|
||||
Homepage: <https://github.com/GPUOpen-ProfessionalCompute-Libraries/MIVisionX>
|
||||
|
||||
Preferred version:
|
||||
5.3.0 <https://github.com/GPUOpen-ProfessionalCompute-Libraries/MIVisionX/archive/rocm-5.3.0.tar.gz>
|
||||
|
||||
Safe versions:
|
||||
5.3.0 <https://github.com/GPUOpen-ProfessionalCompute-Libraries/MIVisionX/archive/rocm-5.3.0.tar.gz>
|
||||
5.2.3 <https://github.com/GPUOpen-ProfessionalCompute-Libraries/MIVisionX/archive/rocm-5.2.3.tar.gz>
|
||||
5.2.1 <https://github.com/GPUOpen-ProfessionalCompute-Libraries/MIVisionX/archive/rocm-5.2.1.tar.gz>
|
||||
5.2.0 <https://github.com/GPUOpen-ProfessionalCompute-Libraries/MIVisionX/archive/rocm-5.2.0.tar.gz>
|
||||
5.1.3 <https://github.com/GPUOpen-ProfessionalCompute-Libraries/MIVisionX/archive/rocm-5.1.3.tar.gz>
|
||||
5.1.0 <https://github.com/GPUOpen-ProfessionalCompute-Libraries/MIVisionX/archive/rocm-5.1.0.tar.gz>
|
||||
5.0.2 <https://github.com/GPUOpen-ProfessionalCompute-Libraries/MIVisionX/archive/rocm-5.0.2.tar.gz>
|
||||
5.0.0 <https://github.com/GPUOpen-ProfessionalCompute-Libraries/MIVisionX/archive/rocm-5.0.0.tar.gz>
|
||||
4.5.2 <https://github.com/GPUOpen-ProfessionalCompute-Libraries/MIVisionX/archive/rocm-4.5.2.tar.gz>
|
||||
4.5.0 <https://github.com/GPUOpen-ProfessionalCompute-Libraries/MIVisionX/archive/rocm-4.5.0.tar.gz>
|
||||
|
||||
Deprecated versions:
|
||||
4.3.1 <https://github.com/GPUOpen-ProfessionalCompute-Libraries/MIVisionX/archive/rocm-4.3.1.tar.gz>
|
||||
4.3.0 <https://github.com/GPUOpen-ProfessionalCompute-Libraries/MIVisionX/archive/rocm-4.3.0.tar.gz>
|
||||
4.2.0 <https://github.com/GPUOpen-ProfessionalCompute-Libraries/MIVisionX/archive/rocm-4.2.0.tar.gz>
|
||||
4.1.0 <https://github.com/GPUOpen-ProfessionalCompute-Libraries/MIVisionX/archive/rocm-4.1.0.tar.gz>
|
||||
4.0.0 <https://github.com/GPUOpen-ProfessionalCompute-Libraries/MIVisionX/archive/rocm-4.0.0.tar.gz>
|
||||
3.10.0 <https://github.com/GPUOpen-ProfessionalCompute-Libraries/MIVisionX/archive/rocm-3.10.0.tar.gz>
|
||||
3.9.0 <https://github.com/GPUOpen-ProfessionalCompute-Libraries/MIVisionX/archive/rocm-3.9.0.tar.gz>
|
||||
3.8.0 <https://github.com/GPUOpen-ProfessionalCompute-Libraries/MIVisionX/archive/rocm-3.8.0.tar.gz>
|
||||
3.7.0 <https://github.com/GPUOpen-ProfessionalCompute-Libraries/MIVisionX/archive/rocm-3.7.0.tar.gz>
|
||||
1.7 <https://github.com/GPUOpen-ProfessionalCompute-Libraries/MIVisionX/archive/1.7.tar.gz>
|
||||
|
||||
Variants:
|
||||
Name [Default] When Allowed values Description
|
||||
==================== ==== ==================== ==================================
|
||||
|
||||
build_type [Release] -- Release, Debug, CMake build type
|
||||
RelWithDebInfo
|
||||
hip [on] -- on, off Use HIP as backend
|
||||
ipo [off] -- on, off CMake interprocedural optimization
|
||||
opencl [off] -- on, off Use OPENCL as the backend
|
||||
|
||||
Build Dependencies:
|
||||
cmake ffmpeg libjpeg-turbo miopen-hip miopen-opencl miopengemm opencv openssl protobuf rocm-cmake rocm-opencl
|
||||
|
||||
Link Dependencies:
|
||||
miopen-hip miopen-opencl miopengemm openssl rocm-opencl
|
||||
|
||||
Run Dependencies:
|
||||
None
|
||||
|
||||
root@ixt-rack-104:/spack#
|
||||
```
|
||||
|
||||
## Installing variants for ROCm components
|
||||
|
||||
The variants listed above indicate that the `mivisionx` package is built by default with
`build_type=Release` and the `hip` backend, and without the `opencl` backend. The `Debug` and
`RelWithDebInfo` build types are also supported, as is building with `opencl` instead of `hip`.
|
||||
|
||||
For example:
|
||||
|
||||
```bash
|
||||
# The backend is hip, because it is the default.
spack install mivisionx build_type=Debug
# The backend is opencl; hip is disabled by the conflict defined in the recipe.
spack install mivisionx+opencl build_type=Debug
|
||||
```
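The spec syntax used in these commands can be sketched in a few lines of Python. The `spack_spec` helper below is hypothetical (it is not part of Spack) and only assembles the command-line string: `+name` enables a boolean variant, `~name` disables it, and `key=value` sets a valued variant such as `build_type`.

```python
def spack_spec(package, enabled=(), disabled=(), **valued):
    """Assemble a Spack spec string (hypothetical helper, not a Spack API)."""
    parts = [package]
    parts += [f"+{name}" for name in enabled]    # boolean variants switched on
    parts += [f"~{name}" for name in disabled]   # boolean variants switched off
    spec = "".join(parts)
    # Valued variants follow the package name, separated by spaces.
    for key, value in valued.items():
        spec += f" {key}={value}"
    return spec


# Rebuild the two commands shown above.
print("spack install " + spack_spec("mivisionx", build_type="Debug"))
print("spack install " + spack_spec("mivisionx", enabled=["opencl"], build_type="Debug"))
```

Generating the string this way is only a convenience for scripting; the resulting commands are the same ones you would type by hand.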
|
||||
|
||||
* `spack spec` command
|
||||
|
||||
To display the dependency tree, the `spack spec` command can be used with the same format.
|
||||
|
||||
For example:
|
||||
|
||||
```bash
|
||||
root@ixt-rack-104:/spack# spack spec mivisionx
|
||||
Input spec
|
||||
--------------------------------
|
||||
mivisionx
|
||||
|
||||
Concretized
|
||||
--------------------------------
|
||||
mivisionx@5.3.0%gcc@9.4.0+hip~ipo~opencl build_type=Release arch=linux-ubuntu20.04-skylake_avx512
|
||||
```
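To make the concretized line easier to read, the sketch below splits it into name, version, compiler, boolean variants, and `key=value` settings. It is an illustrative parser written for this one line format, not Spack's own spec parser, and the function name `parse_concretized` is an assumption.

```python
import re


def parse_concretized(spec):
    """Split one concretized spec line into its parts (illustrative only)."""
    head, *settings = spec.split()
    # head looks like: name@version%compiler+variant~variant
    name, version, compiler, flags = re.match(
        r"([^@]+)@([^%]+)%([^+~]+)(.*)", head
    ).groups()
    # "+x" means the variant is enabled, "~x" means it is disabled.
    variants = {v: sign == "+" for sign, v in re.findall(r"([+~])([\w-]+)", flags)}
    options = dict(item.split("=", 1) for item in settings)
    return {
        "name": name,
        "version": version,
        "compiler": compiler,
        "variants": variants,
        "options": options,
    }


line = ("mivisionx@5.3.0%gcc@9.4.0+hip~ipo~opencl "
        "build_type=Release arch=linux-ubuntu20.04-skylake_avx512")
for key, value in parse_concretized(line).items():
    print(f"{key}: {value}")
```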
|
||||
|
||||
## Creating an environment
|
||||
|
||||
You can create an environment with all the required components of your version.
|
||||
|
||||
1. In the root folder, create a new folder where you can create a `.yaml` file. This file is used to
create an environment.
|
||||
|
||||
```bash
|
||||
mkdir /localscratch
cd /localscratch
vi sample.yaml
|
||||
```
|
||||
|
||||
2. Add all the required components in the `sample.yaml` file:
|
||||
|
||||
```yaml
spack:
  concretization: separately
  packages:
    all:
      compiler: [gcc@8.5.0]
  specs:
  - matrix:
    - ['%gcc@8.5.0^cmake@3.19.7']
    - [rocm-cmake@5.3.2, rocm-dbgapi@5.3.2, rocm-debug-agent@5.3.2, rocm-gdb@5.3.2,
       rocminfo@5.3.2, rocm-opencl@5.3.2, rocm-smi-lib@5.3.2, rocm-tensile@5.3.2, rocm-validation-suite@4.3.1,
       rocprim@5.3.2, rocprofiler-dev@5.3.2, rocrand@5.3.2, rocsolver@5.3.2, rocsparse@5.3.2,
       rocthrust@5.3.2, roctracer-dev@5.3.2]
  view: true
|
||||
```
|
||||
|
||||
3. Once you've created the `.yaml` file, you can use it to create an environment.
|
||||
|
||||
```bash
|
||||
spack env create -d /localscratch/MyEnvironment /localscratch/sample.yaml
|
||||
```
|
||||
|
||||
4. Activate the environment.
|
||||
|
||||
```bash
|
||||
spack env activate /localscratch/MyEnvironment
|
||||
```
|
||||
|
||||
5. Verify that the environment lists all the component versions you want.
|
||||
|
||||
```bash
|
||||
# Lists all the components in the environment (with 0 installed at this point).
spack find
|
||||
```
|
||||
|
||||
6. Install all the components in the `.yaml` file.
|
||||
|
||||
```bash
|
||||
cd /localscratch/MyEnvironment
spack install -j 50
|
||||
```
|
||||
|
||||
7. Check that all components are successfully installed.
|
||||
|
||||
```bash
|
||||
spack find
|
||||
```
|
||||
|
||||
8. If you modify the `.yaml` file, you must deactivate the existing environment and create a new one for the modifications to take effect.
|
||||
|
||||
To deactivate, use:
|
||||
|
||||
```bash
|
||||
spack env deactivate
|
||||
```
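The `matrix` entry in `sample.yaml` pairs every entry of its first row with every entry of its second row. The sketch below illustrates only that cross product, using a three-package subset of the list in step 2. How Spack then merges each pair into a single spec is not modeled here, and `expand_matrix` is a hypothetical name.

```python
from itertools import product

# First matrix row: constraints applied to every package.
constraints = ["%gcc@8.5.0^cmake@3.19.7"]
# Second matrix row: a subset of the packages from sample.yaml.
packages = ["rocm-cmake@5.3.2", "rocminfo@5.3.2", "rocprim@5.3.2"]


def expand_matrix(*rows):
    """Return every combination that takes one entry from each row."""
    return list(product(*rows))


for constraint, package in expand_matrix(constraints, packages):
    print(f"{package} is requested with {constraint}")
```

With one constraint and sixteen packages, as in step 2, the environment ends up with sixteen root specs.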
|
||||
|
||||
## Create and apply a patch before installation
|
||||
|
||||
Spack installs ROCm packages after pulling the source code from GitHub and building it locally. In
|
||||
order to build a component with any modification to the source code, you must generate a patch and
|
||||
apply it before the build phase.
|
||||
|
||||
To generate a patch and build with the changes:
|
||||
|
||||
1. Stage the source code.
|
||||
|
||||
```bash
|
||||
# Pull the 5.2.0 release source code of hip. The output shows the path to the
# spack-src directory that contains the entire source code.

root@ixt-rack-104:/spack# spack stage hip@5.2.0
==> Fetching https://github.com/ROCm-Developer-Tools/HIP/archive/rocm-5.2.0.tar.gz
==> Fetching https://github.com/ROCm-Developer-Tools/hipamd/archive/rocm-5.2.0.tar.gz
==> Fetching https://github.com/ROCm-Developer-Tools/ROCclr/archive/rocm-5.2.0.tar.gz
|
||||
==> Moving resource stage
|
||||
source: /tmp/root/spack-stage/resource-hipamd-wzo5y6ysvmadyb5mvffr35galb6vjxb7/spack-src/
|
||||
destination: /tmp/root/spack-stage/spack-stage-hip-5.2.0-wzo5y6ysvmadyb5mvffr35galb6vjxb7/spack-src/hipamd
|
||||
==> Moving resource stage
|
||||
source: /tmp/root/spack-stage/resource-opencl-wzo5y6ysvmadyb5mvffr35galb6vjxb7/spack-src/
|
||||
destination: /tmp/root/spack-stage/spack-stage-hip-5.2.0-wzo5y6ysvmadyb5mvffr35galb6vjxb7/spack-src/opencl
|
||||
==> Moving resource stage
|
||||
source: /tmp/root/spack-stage/resource-rocclr-wzo5y6ysvmadyb5mvffr35galb6vjxb7/spack-src/
|
||||
destination: /tmp/root/spack-stage/spack-stage-hip-5.2.0-wzo5y6ysvmadyb5mvffr35galb6vjxb7/spack-src/rocclr
|
||||
==> Staged hip in /tmp/root/spack-stage/spack-stage-hip-5.2.0-wzo5y6ysvmadyb5mvffr35galb6vjxb7
|
||||
```
|
||||
|
||||
2. Change directory to `spack-src` inside the staged directory.
|
||||
|
||||
```bash
|
||||
root@ixt-rack-104:/spack# cd /tmp/root/spack-stage/spack-stage-hip-5.2.0-wzo5y6ysvmadyb5mvffr35galb6vjxb7
root@ixt-rack-104:/tmp/root/spack-stage/spack-stage-hip-5.2.0-wzo5y6ysvmadyb5mvffr35galb6vjxb7# cd spack-src/
|
||||
```
|
||||
|
||||
3. Create a new Git repository.
|
||||
|
||||
```bash
|
||||
root@ixt-rack-104:/tmp/root/spack-stage/spack-stage-hip-5.2.0-wzo5y6ysvmadyb5mvffr35galb6vjxb7/spack-src# git init
|
||||
```
|
||||
|
||||
4. Add the entire directory to the repository.
|
||||
|
||||
```bash
|
||||
root@ixt-rack-104:/tmp/root/spack-stage/spack-stage-hip-5.2.0-wzo5y6ysvmadyb5mvffr35galb6vjxb7/spack-src# git add .
|
||||
```
|
||||
|
||||
5. Make the required changes to the source code.
|
||||
|
||||
```bash
|
||||
root@ixt-rack-104:/tmp/root/spack-stage/spack-stage-hip-5.2.0-wzo5y6ysvmadyb5mvffr35galb6vjxb7/spack-src# vi hipamd/CMakeLists.txt   # Make the required changes in the source code
|
||||
```
|
||||
|
||||
6. Generate the patch using the `git diff` command.
|
||||
|
||||
```bash
|
||||
git diff > /spack/var/spack/repos/builtin/packages/hip/0001-modifications.patch
|
||||
```
|
||||
|
||||
7. Update the recipe with the patch file name and any conditions you want to apply.
|
||||
|
||||
```bash
|
||||
root@ixt-rack-104:/tmp/root/spack-stage/spack-stage-hip-5.2.0-wzo5y6ysvmadyb5mvffr35galb6vjxb7/spack-src# spack edit hip
|
||||
```
|
||||
|
||||
Provide the patch file name and the conditions for the patch:
|
||||
|
||||
`patch("0001-modifications.patch", when="@5.2.0")`
|
||||
|
||||
Spack applies `0001-modifications.patch` on the `5.2.0` release code before starting the `hip` build.
|
||||
|
||||
After each modification, you must update the recipe. If there is no change to the recipe, run
|
||||
`touch /spack/var/spack/repos/builtin/packages/hip/package.py`
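For readers who have not looked inside a patch file, the sketch below shows the unified-diff format that `git diff` writes. It uses Python's standard `difflib` module as a stand-in for Git, and the file contents (including the `MY_OPTION` line) are invented for illustration; they are not taken from the hip sources.

```python
import difflib

# Hypothetical "before" and "after" contents of hipamd/CMakeLists.txt.
original = ["cmake_minimum_required(VERSION 3.16)\n", "project(hip)\n"]
modified = original + ["set(MY_OPTION ON)\n"]


def make_patch(before, after, path):
    """Return a unified diff between two versions of one file."""
    diff = difflib.unified_diff(
        before, after, fromfile=f"a/{path}", tofile=f"b/{path}"
    )
    return "".join(diff)


patch_text = make_patch(original, modified, "hipamd/CMakeLists.txt")
print(patch_text)
```

Lines that start with `+` are added by the patch and lines that start with `-` are removed; Spack applies these changes to the staged source before the build.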
|
||||
@@ -1,192 +0,0 @@
|
||||
# TensorFlow for ROCm
|
||||
|
||||
TensorFlow is an open-source library for solving machine-learning,
|
||||
deep-learning, and artificial-intelligence problems. It can be used to solve
|
||||
many problems across different sectors and industries but primarily focuses on
|
||||
training and inference in neural networks. It is one of the most popular and
|
||||
in-demand frameworks and is very active in open source contribution and
|
||||
development.
|
||||
|
||||
:::{warning}
|
||||
ROCm 5.6 and 5.7 deviate from the standard practice of supporting the last three
|
||||
TensorFlow versions. This is due to incompatibilities between earlier TensorFlow
|
||||
versions and changes introduced in the ROCm 5.6 compiler. Refer to the following
|
||||
version support matrix:
|
||||
|
||||
| ROCm       | TensorFlow                           |
|:----------:|:------------------------------------:|
| 5.6.x      | 2.12                                 |
| 5.7.0      | 2.12, 2.13                           |
| Post-5.7.0 | Last three versions at ROCm release. |
|
||||
:::
|
||||
|
||||
For current TensorFlow ROCm version support, refer to the
|
||||
[ROCm fork of the TensorFlow repository](https://github.com/ROCmSoftwarePlatform/tensorflow-upstream/blob/develop-upstream/rocm_docs/tensorflow-rocm-release.md).
|
||||
|
||||
## Installing TensorFlow
|
||||
|
||||
The following sections contain options for installing TensorFlow.
|
||||
|
||||
### Option 1: using a Docker image
|
||||
|
||||
To install ROCm on bare metal, follow the section
|
||||
[Linux installation guide](../install/linux/install.md). The recommended option to
|
||||
get a TensorFlow environment is through Docker.
|
||||
|
||||
Using Docker provides portability and access to a prebuilt Docker container that
|
||||
has been rigorously tested within AMD. This might also save compilation time and
|
||||
should perform as tested without facing potential installation issues.
|
||||
Follow these steps:
|
||||
|
||||
1. Pull the latest public TensorFlow Docker image.
|
||||
|
||||
```bash
|
||||
docker pull rocm/tensorflow:latest
|
||||
```
|
||||
|
||||
2. Once you have pulled the image, run it by using the command below:
|
||||
|
||||
```bash
|
||||
docker run -it --network=host --device=/dev/kfd --device=/dev/dri \
|
||||
--ipc=host --shm-size 16G --group-add video --cap-add=SYS_PTRACE \
|
||||
--security-opt seccomp=unconfined rocm/tensorflow:latest
|
||||
```
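Before starting the container, it can help to confirm that the two device paths passed to `--device` exist on the host. The following is a minimal sketch; the helper name `missing_devices` is an assumption, and the check covers only whether the paths exist, not their permissions.

```python
import os

# Device paths passed to `docker run --device` in the command above.
REQUIRED_DEVICES = ("/dev/kfd", "/dev/dri")


def missing_devices(paths=REQUIRED_DEVICES):
    """Return the paths that do not exist on this host."""
    return [path for path in paths if not os.path.exists(path)]


missing = missing_devices()
if missing:
    print("Missing on host:", ", ".join(missing))
else:
    print("All device paths passed to --device are present.")
```

If a path is reported as missing, the kernel-mode driver is probably not installed or not loaded on the host.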
|
||||
|
||||
### Option 2: using a wheels package
|
||||
|
||||
To install TensorFlow using the wheels package, follow these steps:
|
||||
|
||||
1. Check the Python version.
|
||||
|
||||
```bash
|
||||
python3 --version
|
||||
```
|
||||
|
||||
| If:                                 | Then:                            |
|:-----------------------------------:|:--------------------------------:|
| The Python version is less than 3.7 | Upgrade Python.                  |
| The Python version is 3.7 or later  | Skip this step and go to Step 3. |
|
||||
|
||||
```{note}
|
||||
The supported Python versions are:
|
||||
|
||||
* 3.7
|
||||
* 3.8
|
||||
* 3.9
|
||||
* 3.10
|
||||
```
|
||||
|
||||
```bash
|
||||
sudo apt-get install python3.7 # or python3.8, python3.9, or python3.10
|
||||
```
|
||||
|
||||
2. Set up multiple Python versions using update-alternatives.
|
||||
|
||||
```bash
|
||||
update-alternatives --query python3
|
||||
sudo update-alternatives --install /usr/bin/python3 python3 /usr/bin/python[version] [priority]
|
||||
```
|
||||
|
||||
```{note}
|
||||
Follow the instruction in Step 2 for incompatible Python versions.
|
||||
```
|
||||
|
||||
```bash
|
||||
sudo update-alternatives --config python3
|
||||
```
|
||||
|
||||
3. Follow the screen prompts, and select the Python version installed in Step 2.
|
||||
|
||||
4. Install or upgrade PIP.
|
||||
|
||||
```bash
|
||||
sudo apt install python3-pip
|
||||
```
|
||||
|
||||
To install PIP, use the following:
|
||||
|
||||
```bash
|
||||
/usr/bin/python[version] -m pip install --upgrade pip
|
||||
```
|
||||
|
||||
Upgrade PIP for the Python version installed in Step 2:
|
||||
|
||||
```bash
|
||||
sudo pip3 install --upgrade pip
|
||||
```
|
||||
|
||||
5. Install TensorFlow for the Python version as indicated in Step 2.
|
||||
|
||||
```bash
|
||||
/usr/bin/python[version] -m pip install --user tensorflow-rocm==[wheel-version] --upgrade
|
||||
```
|
||||
|
||||
For a valid wheel version for a ROCm release, refer to the instruction below:
|
||||
|
||||
```bash
|
||||
sudo apt install rocm-libs rccl
|
||||
```
|
||||
|
||||
6. Update `protobuf` to 3.19 or lower.
|
||||
|
||||
```bash
|
||||
/usr/bin/python3.7 -m pip install protobuf==3.19.0
|
||||
sudo pip3 install tensorflow
|
||||
```
|
||||
|
||||
7. Set the environment variable `PYTHONPATH`.
|
||||
|
||||
```bash
|
||||
export PYTHONPATH="./.local/lib/python[version]/site-packages:$PYTHONPATH" #Use same python version as in step 2
|
||||
```
|
||||
|
||||
8. Install libraries.
|
||||
|
||||
```bash
|
||||
sudo apt install rocm-libs rccl
|
||||
```
|
||||
|
||||
9. Test installation.
|
||||
|
||||
```bash
|
||||
python3 -c 'import tensorflow' 2> /dev/null && echo 'Success' || echo 'Failure'
|
||||
```
|
||||
|
||||
```{note}
|
||||
For details on `tensorflow-rocm` wheels and ROCm version compatibility, see:
|
||||
[https://github.com/ROCmSoftwarePlatform/tensorflow-upstream/blob/develop-upstream/rocm_docs/tensorflow-rocm-release.md](https://github.com/ROCmSoftwarePlatform/tensorflow-upstream/blob/develop-upstream/rocm_docs/tensorflow-rocm-release.md)
|
||||
```
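The version check in Step 1 can also be scripted. The sketch below compares the running interpreter against the supported versions listed in the note (3.7 through 3.10); the function name `is_supported` is an assumption.

```python
import sys

# Supported (major, minor) versions, taken from the note in Step 1.
SUPPORTED = [(3, 7), (3, 8), (3, 9), (3, 10)]


def is_supported(version=None):
    """Return True if the (major, minor) pair is in the supported list."""
    major_minor = tuple((version or sys.version_info)[:2])
    return major_minor in SUPPORTED


current = "{}.{}".format(*sys.version_info[:2])
if is_supported():
    print(f"Python {current} is supported.")
else:
    print(f"Python {current} is not in the supported list; install a supported version.")
```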
|
||||
|
||||
## Test the TensorFlow installation
|
||||
|
||||
To test the installation of TensorFlow, run the container image as specified in
|
||||
the previous section Installing TensorFlow. Ensure you have access to the Python
|
||||
shell in the Docker container.
|
||||
|
||||
```bash
|
||||
python3 -c 'import tensorflow' 2> /dev/null && echo 'Success' || echo 'Failure'
|
||||
```
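A slightly more informative check can be written in Python. The sketch below first tests whether the `tensorflow` module can be found, and only then imports it and counts the GPUs that TensorFlow reports through `tf.config.list_physical_devices`. The helper names are assumptions, and the GPU count depends on the devices exposed to the container.

```python
import importlib.util


def module_installed(name):
    """Return True if the named top-level module can be found."""
    return importlib.util.find_spec(name) is not None


def tensorflow_report():
    """Summarize the TensorFlow installation without raising if it is absent."""
    if not module_installed("tensorflow"):
        return "Failure: tensorflow is not installed"
    import tensorflow as tf  # imported lazily so the check above runs first
    gpus = tf.config.list_physical_devices("GPU")
    return f"Success: TensorFlow {tf.__version__}, {len(gpus)} GPU(s) visible"


if __name__ == "__main__":
    print(tensorflow_report())
```

A report of zero GPUs with a successful import usually points at the container's device access rather than at the TensorFlow installation itself.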
|
||||
|
||||
## Run a basic TensorFlow example
|
||||
|
||||
The TensorFlow examples repository provides basic examples that exercise the
|
||||
framework's functionality. The MNIST database is a collection of handwritten
|
||||
digits that may be used to train a Convolutional Neural Network for handwriting
|
||||
recognition.
|
||||
|
||||
Follow these steps:
|
||||
|
||||
1. Clone the TensorFlow example repository.
|
||||
|
||||
```bash
|
||||
cd ~
|
||||
git clone https://github.com/tensorflow/models.git
|
||||
```
|
||||
|
||||
2. Install the dependencies of the code, and run the code.
|
||||
|
||||
```bash
|
||||
pip3 install -r requirement.txt
python mnist_tf.py
|
||||
```
|
||||
@@ -11,24 +11,10 @@ subtrees:
|
||||
|
||||
- caption: Installation
|
||||
entries:
|
||||
- file: install/windows/install-quick.md
|
||||
title: Quick start (Windows)
|
||||
- file: install/windows/install.md
|
||||
title: Windows install guide
|
||||
subtrees:
|
||||
- entries:
|
||||
- file: install/windows/windows-app-deployment-guidelines.md
|
||||
title: Application deployment guidelines
|
||||
- file: install/docker.md
|
||||
title: ROCm Docker containers
|
||||
- file: install/pytorch-install.md
|
||||
title: PyTorch for ROCm
|
||||
- file: install/tensorflow-install.md
|
||||
title: Tensorflow for ROCm
|
||||
- file: install/magma-install.md
|
||||
title: MAGMA for ROCm
|
||||
- file: install/spack-intro.md
|
||||
title: ROCm & Spack
|
||||
- file: ${project:linux-install}
|
||||
title: Install ROCm on Linux
|
||||
- file: ${project:windows-install}
|
||||
title: Install HIP SDK on Windows
|
||||
|
||||
- caption: Compatibility & support
|
||||
entries:
|
||||
@@ -42,8 +28,6 @@ subtrees:
|
||||
title: User/kernel space support
|
||||
- file: about/compatibility/docker-image-support-matrix.rst
|
||||
title: Docker
|
||||
- file: about/compatibility/openmp.md
|
||||
title: OpenMP
|
||||
|
||||
- caption: Release information
|
||||
entries:
|
||||
@@ -110,6 +94,8 @@ subtrees:
|
||||
title: GPU memory
|
||||
- file: conceptual/compiler-disambiguation.md
|
||||
title: Compiler disambiguation
|
||||
- file: about/compatibility/openmp.md
|
||||
title: OpenMP
|
||||
- file: conceptual/file-reorg.md
|
||||
title: File structure (Linux FHS)
|
||||
- file: conceptual/gpu-isolation.md
|
||||
|
||||