diff --git a/.github/workflows/markdownlint.yml b/.github/workflows/markdownlint.yml index d2aa8bc65..e4fdaf1e3 100644 --- a/.github/workflows/markdownlint.yml +++ b/.github/workflows/markdownlint.yml @@ -14,4 +14,4 @@ jobs: - name: Use markdownlint uses: actionshub/markdownlint@v3.1.3 with: - filesToIgnoreRegex: "CHANGELOG.md|CONTRIBUTING.md|README.md|packaging_guidelines.md|about.md|quick_start.md|index.md|gpu_os_support.md|magma_install.md|math.md|reference/rocmcc.md|docker.md|hip.md|hip_sdk_install_win.md|all_deploy_options.md|inception_casestudy.md|deep_learning_rocm.md|quick_start_windows.md|tensorflow_install.md|pytorch_install.md|mi250.md|openmp.md|c++_primitives.md|ai_tools.md|computer_vision.md|all.md|compilers.md|management_tools.md|validation_tools.md|installing_linux.md|file_reorg.md|prerequisites.md|quick_start_linux.md|docker_support_matrix.md|gpu_os_support.md|" + filesToIgnoreRegex: CHANGELOG.md diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md index 90e1b9773..ffa814193 100644 --- a/CONTRIBUTING.md +++ b/CONTRIBUTING.md @@ -17,36 +17,39 @@ GitHub repository. Our documentation includes both markdown and rst files. Markdown is encouraged over rst due to the lower barrier to participation. GitHub flavored markdown is preferred -for all submissions as it will render accurately on our GitHub repositories. For existing documentation, +for all submissions as it will render accurately on our GitHub repositories. For existing documentation, [MyST](https://myst-parser.readthedocs.io/en/latest/intro.html) markdown -is used to implement certain features unsupported in GitHub markdown. This is +is used to implement certain features unsupported in GitHub markdown. This is not encouraged for new documentation. AMD will transition -to stricter use of GitHub flavored markdown with a few caveats. ROCm documentation +to stricter use of GitHub flavored markdown with a few caveats. ROCm documentation also uses [sphinx-design](https://sphinx-design.readthedocs.io/en/latest/index.html) in our markdown and rst files. We also will use breathe syntax for doxygen documentation in our markdown files. Other design elements for effective HTML rendering of the documents may be added to our markdown files. Please see [GitHub](https://docs.github.com/en/get-started/writing-on-github/getting-started-with-writing-and-formatting-on-github)'s -guide on writing and formatting on GitHub as a starting point. +guide on writing and formatting on GitHub as a starting point. ROCm documentation adds additional requirements to markdown and rst based files as follows: - Level one headers are only used for page titles. There must be only one level -1 header per file for both Markdown and Restructured Text. + 1 header per file for both Markdown and Restructured Text. - Pass [markdownlint](https://github.com/markdownlint/markdownlint) check via -our automated github action on a Pull Request (PR). + our automated github action on a Pull Request (PR). ## Filenames and folder structure -Please use snake case for file names. Our documentation follows pitchfork for + +Please use snake case for file names. Our documentation follows pitchfork for folder structure. All documentation is in /docs except for special files like -the contributing guide in the / folder. All images used in the documentation are +the contributing guide in the / folder. All images used in the documentation are place in the /docs/data folder. -## How do provide feedback for for ROCm documentation? +## How to provide feedback for for ROCm documentation + There are three standard ways to provide feedback for this repository. ### Pull Request + All contributions to ROCm documentation should arrive via the [GitHub Flow](https://docs.github.com/en/get-started/quickstart/github-flow) targetting the develop branch of the repository. If you are unable to contribute @@ -95,6 +98,7 @@ python3 -mvenv .venv ``` Python versions known to build documentation: + - 3.8 - 3.9 - 3.10 @@ -108,15 +112,15 @@ the PR (`https://github.com/RadeonOpenCompute/ROCm/pull/`) will have a summary at the bottom. This requires the user be logged in to GitHub. - There, click `Show all checks` and `Details` of the Read the Docs pipeline. It -will take you to `https://readthedocs.com/projects/advanced-micro-devices-rocm/ -builds//` + will take you to `https://readthedocs.com/projects/advanced-micro-devices-rocm/ + builds//` - The list of commands shown are the exact ones used by CI to produce a render - of the documentation. + of the documentation. - There, click on the small blue link `View docs` (which is not the same as the -bigger button with the same text). It will take you to the built HTML site with -a URL of the form `https:// -advanced-micro-devices-demo--.com.readthedocs.build/projects/alpha/en -//`. + bigger button with the same text). It will take you to the built HTML site with + a URL of the form `https:// + advanced-micro-devices-demo--.com.readthedocs.build/projects/alpha/en + //`. ### Build the docs using VS Code @@ -147,12 +151,11 @@ resulting website show up on a locally serving web server. The settings in order are set for the following reasons: - Sets the root of the output website for live previews. Must be changed - alongside the `tasks.json` command. + alongside the `tasks.json` command. - Tells live server to wait with the update to give time for Sphinx to - regenerate site contents and not refresh before all is don. (Empirical value - ) + regenerate site contents and not refresh before all is don. (Empirical value) - Automatic virtual env activation is a nice touch, should you want to build - the site from the integrated terminal. + the site from the integrated terminal. 3. Add the following tasks in `.vscode/tasks.json` @@ -215,11 +218,11 @@ resulting website show up on a locally serving web server. ``` > (Implementation detail: two problem matchers were needed to be defined, - because VS Code doesn't tolerate some problem information being potentially - absent. While a single regex could match all types of errors, if a capture - group remains empty (the line number doesn't show up in all warning/error - messages) but the `pattern` references said empty capture group, VS Code - discards the message completely.) + > because VS Code doesn't tolerate some problem information being potentially + > absent. While a single regex could match all types of errors, if a capture + > group remains empty (the line number doesn't show up in all warning/error + > messages) but the `pattern` references said empty capture group, VS Code + > discards the message completely.) 4. Configure Python virtual environment (venv) @@ -241,3 +244,5 @@ resulting website show up on a locally serving web server. `.vscode/build/html/index.html` and select `Open with Live Server`. The contents should update on every rebuild without having to refresh the browser. + + diff --git a/README.md b/README.md index 85244a3fa..af56e3aae 100644 --- a/README.md +++ b/README.md @@ -7,7 +7,7 @@ repositories and the associated commit used to build the current ROCm release. The default.xml file uses the repo Manifest format. The develop branch of this repository contains content for the next -ROCm release. +ROCm release. ## How to build documentation via Sphinx diff --git a/docs/about.md b/docs/about.md index 96f64c028..5235089d7 100644 --- a/docs/about.md +++ b/docs/about.md @@ -9,7 +9,7 @@ yourself with our documentation toolchain. [ReadTheDocs](https://docs.readthedocs.io/en/stable/) is our frontend for the our documentation. By frontend, this is the tool that serves our HTML based -documentation to our end users. +documentation to our end users. ## Doxygen @@ -28,8 +28,9 @@ projects that believe markdown is not suitable should contact the documentation team prior to selecting rst. ### MyST -[Markedly Structured Text (MyST)](https://myst-tools.org/docs/spec) is an extended -flavor of Markdown ([https://commonmark.org/](CommonMark)) influenced by ReStructured + +[Markedly Structured Text (MyST)](https://myst-tools.org/docs/spec) is an extended +flavor of Markdown ([https://commonmark.org/](CommonMark)) influenced by ReStructured Text (RST) and Sphinx. It is intergrated via [`myst-parser`](https://myst-parser.readthedocs.io/en/latest/). A cheat sheet that showcases how to use the MyST syntax is available over at [the Jupyter diff --git a/docs/examples/inception_casestudy/inception_casestudy.md b/docs/examples/inception_casestudy/inception_casestudy.md index 5f8392fe7..3cb119bf7 100644 --- a/docs/examples/inception_casestudy/inception_casestudy.md +++ b/docs/examples/inception_casestudy/inception_casestudy.md @@ -176,7 +176,7 @@ Follow these steps: for line in f.readlines(): split_line = line.split('\t') val_dict[split_line[0]] = split_line[1] - + paths = glob.glob('./tiny-imagenet-200/val/images/*') for path in paths: file = path.split('/')[-1] @@ -184,13 +184,13 @@ Follow these steps: if not os.path.exists(target_folder + str(folder)): os.mkdir(target_folder + str(folder)) os.mkdir(target_folder + str(folder) + '/images') - + for path in paths: file = path.split('/')[-1] folder = val_dict[file] dest = target_folder + str(folder) + '/images/' + str(file) move(path, dest) - + rmdir('./tiny-imagenet-200/val/images') ``` @@ -201,7 +201,7 @@ Follow these steps: ```py import torch import os - import torchvision + import torchvision from torchvision import transforms from torchvision.transforms.functional import InterpolationMode ``` @@ -231,7 +231,7 @@ Follow these steps: 9. To smooth the image, use bilinear interpolation, a resampling method that uses the distance weighted average of the four nearest pixel values to estimate a new pixel value. ```py - interpolation = "bilinear" + interpolation = "bilinear" ``` The next parameters control the size to which the validation image is cropped and resized. @@ -244,7 +244,7 @@ Follow these steps: The pretrained Inception v3 model is chosen to be downloaded from torchvision. ```py - model_name = "inception_v3" + model_name = "inception_v3" pretrained = True ``` @@ -289,9 +289,9 @@ Follow these steps: ```py interpolation = InterpolationMode(interpolation) - + TRAIN_TRANSFORM_IMG = transforms.Compose([ - Normalizaing and standardardizing the image + Normalizaing and standardardizing the image transforms.RandomResizedCrop(train_crop_size, interpolation=interpolation), transforms.PILToTensor(), transforms.ConvertImageDtype(torch.float), @@ -310,16 +310,16 @@ Follow these steps: transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225] ) ]) - - dataset_test = torchvision.datasets.ImageFolder( - val_dir, + + dataset_test = torchvision.datasets.ImageFolder( + val_dir, transform=TEST_TRANSFORM_IMG ) - + print("Creating data loaders") train_sampler = torch.utils.data.RandomSampler(dataset) test_sampler = torch.utils.data.SequentialSampler(dataset_test) - + data_loader = torch.utils.data.DataLoader( dataset, batch_size=batch_size, @@ -327,7 +327,7 @@ Follow these steps: num_workers=num_workers, pin_memory=True ) - + data_loader_test = torch.utils.data.DataLoader( dataset_test, batch_size=batch_size, sampler=test_sampler, num_workers=num_workers, pin_memory=True ) @@ -445,10 +445,10 @@ Follow these steps: running_loss = 0 for step, (image, target) in enumerate(data_loader_test): image, target = image.to(device), target.to(device) - + output = model(image) loss = criterion(output, target) - + running_loss += loss.item() running_loss = running_loss / len(data_loader_test) print('Epoch: ', epoch, '| test loss : %0.4f' % running_loss ) @@ -548,7 +548,7 @@ Follow these steps: ```py import torch.nn as nn - import torch.nn.functional as F + import torch.nn.functional as F ``` 8. Define the CNN (Convolution Neural Networks) and relevant activation functions. @@ -564,7 +564,7 @@ Follow these steps: self.conv3 = nn.Conv2d(3, 6, 5) self.fc2 = nn.Linear(120, 84) self.fc3 = nn.Linear(84, 10) - + def forward(self, x): x = self.pool(F.relu(self.conv1(x))) x = self.pool(F.relu(self.conv2(x))) @@ -594,21 +594,21 @@ Follow these steps: ```py for epoch in range(2): # loop over the dataset multiple times - + running_loss = 0.0 for i, data in enumerate(train_loader, 0): # get the inputs; data is a list of [inputs, labels] inputs, labels = data - + # zero the parameter gradients optimizer.zero_grad() - + # forward + backward + optimize outputs = net(inputs) loss = criterion(outputs, labels) loss.backward() optimizer.step() - + # print statistics running_loss += loss.item() if i % 2000 == 1999: # print every 2000 mini-batches @@ -701,7 +701,7 @@ To understand the code step by step, follow these steps: 4. The model is tested against the test set, test_images, and test_labels arrays. ```py - fashion_mnist = tf.keras.datasets.fashion_mnist + fashion_mnist = tf.keras.datasets.fashion_mnist (train_images, train_labels), (test_images, test_labels) = fashion_mnist.load_data() ``` @@ -751,7 +751,7 @@ To understand the code step by step, follow these steps: ```py train_images = train_images / 255.0 - + test_images = test_images / 255.0 ``` @@ -823,16 +823,16 @@ To understand the code step by step, follow these steps: ```py test_loss, test_acc = model.evaluate(test_images, test_labels, verbose=2) - + print('\nTest accuracy:', test_acc) ``` 6. With the model trained, you can use it to make predictions about some images: the model's linear outputs and logits. Attach a softmax layer to convert the logits to probabilities, making it easier to interpret. ```py - probability_model = tf.keras.Sequential([model, + probability_model = tf.keras.Sequential([model, tf.keras.layers.Softmax()]) - + predictions = probability_model.predict(test_images) ``` @@ -856,20 +856,20 @@ To understand the code step by step, follow these steps: plt.grid(False) plt.xticks([]) plt.yticks([]) - + plt.imshow(img, cmap=plt.cm.binary) - + predicted_label = np.argmax(predictions_array) if predicted_label == true_label: color = 'blue' else: color = 'red' - + plt.xlabel("{} {:2.0f}% ({})".format(class_names[predicted_label], 100*np.max(predictions_array), class_names[true_label]), color=color) - + def plot_value_array(i, predictions_array, true_label): true_label = true_label[i] plt.grid(False) @@ -878,7 +878,7 @@ To understand the code step by step, follow these steps: thisplot = plt.bar(range(10), predictions_array, color="#777777") plt.ylim([0, 1]) predicted_label = np.argmax(predictions_array) - + thisplot[predicted_label].set_color('red') thisplot[true_label].set_color('blue') ``` @@ -930,7 +930,7 @@ To understand the code step by step, follow these steps: ```py # Add the image to a batch where it's the only member. img = (np.expand_dims(img,0)) - + print(img.shape) ``` @@ -938,9 +938,9 @@ To understand the code step by step, follow these steps: ```py predictions_single = probability_model.predict(img) - + print(predictions_single) - + plot_value_array(1, predictions_single[0], test_labels) _ = plt.xticks(range(10), class_names, rotation=45) plt.show() @@ -973,7 +973,7 @@ Follow these steps: import shutil import string import tensorflow as tf - + from tensorflow.keras import layers from tensorflow.keras import losses ``` @@ -982,7 +982,7 @@ Follow these steps: ```py url = "https://ai.stanford.edu/~amaas/data/sentiment/aclImdb_v1.tar.gz" - + dataset = tf.keras.utils.get_file("aclImdb_v1", url, untar=True, cache_dir='.', cache_subdir='') @@ -1061,12 +1061,12 @@ Follow these steps: 10. Next, create validation and test the dataset. Use the remaining 5,000 reviews from the training set for validation into two classes of 2,500 reviews each. ```py - raw_val_ds = tf.keras.utils.text_dataset_from_directory('aclImdb/train', + raw_val_ds = tf.keras.utils.text_dataset_from_directory('aclImdb/train', batch_size=batch_size,validation_split=0.2,subset='validation', seed=seed) - - raw_test_ds = + + raw_test_ds = tf.keras.utils.text_dataset_from_directory( - 'aclImdb/test', + 'aclImdb/test', batch_size=batch_size) ``` @@ -1107,7 +1107,7 @@ To prepare the data for training, follow these steps: def vectorize_text(text, label): text = tf.expand_dims(text, -1) return vectorize_layer(text), label - + text_batch, label_batch = next(iter(raw_train_ds)) first_review, first_label = text_batch[0], label_batch[0] print("Review", first_review) @@ -1143,7 +1143,7 @@ To prepare the data for training, follow these steps: ```py AUTOTUNE = tf.data.AUTOTUNE - + train_ds = train_ds.cache().prefetch(buffer_size=AUTOTUNE) val_ds = val_ds.cache().prefetch(buffer_size=AUTOTUNE) test_ds = test_ds.cache().prefetch(buffer_size=AUTOTUNE) @@ -1188,7 +1188,7 @@ To prepare the data for training, follow these steps: ```py loss, accuracy = model.evaluate(test_ds) - + print("Loss: ", loss) print("Accuracy: ", accuracy) ``` @@ -1209,9 +1209,9 @@ To prepare the data for training, follow these steps: val_acc = history_dict['val_binary_accuracy'] loss = history_dict['loss'] val_loss = history_dict['val_loss'] - + epochs = range(1, len(acc) + 1) - + # "bo" is for "blue dot" plt.plot(epochs, loss, 'bo', label='Training loss') # b is for "solid blue line" @@ -1220,7 +1220,7 @@ To prepare the data for training, follow these steps: plt.xlabel('Epochs') plt.ylabel('Loss') plt.legend() - + plt.show() ``` @@ -1250,11 +1250,11 @@ To prepare the data for training, follow these steps: model, layers.Activation('sigmoid') ]) - + export_model.compile( loss=losses.BinaryCrossentropy(from_logits=False), optimizer="adam", metrics=['accuracy'] ) - + # Test it with `raw_test_ds`, which yields raw strings loss, accuracy = export_model.evaluate(raw_test_ds) print(accuracy) @@ -1268,7 +1268,7 @@ To prepare the data for training, follow these steps: "The movie was okay.", "The movie was terrible..." ] - + export_model.predict(examples) ``` @@ -1296,7 +1296,7 @@ MIGraphX is a graph compiler focused on accelerating the Machine Learning infere - Constant propagation -After doing all these transformations, MIGraphX emits code for the AMD GPU by calling to MIOpen or rocBLAS or creating HIP kernels for a particular operator. MIGraphX can also target CPUs using DNNL or ZenDNN libraries. +After doing all these transformations, MIGraphX emits code for the AMD GPU by calling to MIOpen or rocBLAS or creating HIP kernels for a particular operator. MIGraphX can also target CPUs using DNNL or ZenDNN libraries. MIGraphX provides easy-to-use APIs in C++ and Python to import machine models in ONNX or TensorFlow. Users can compile, save, load, and run these models using MIGraphX's C++ and Python APIs. Internally, MIGraphX parses ONNX or TensorFlow models into internal graph representation where each operator in the model gets mapped to an operator within MIGraphX. Each of these operators defines various attributes such as: @@ -1351,7 +1351,7 @@ To use Docker, follow these steps: 2. The repo contains a Dockerfile from which you can build a Docker image as: ```bash - docker build -t migraphx . + docker build -t migraphx . ``` 3. Then to enter the development environment, use Docker run: @@ -1388,22 +1388,22 @@ Follow these steps: 2. The following script shows the usage of Python API to import the ONNX model, compile it, and run inference on it. Set LD_LIBRARY_PATH to /opt/rocm/ if required. ```py - # import migraphx and numpy + # import migraphx and numpy import migraphx import numpy as np - # import and parse inception model + # import and parse inception model model = migraphx.parse_onnx("inceptioni1.onnx") # compile model for the GPU target model.compile(migraphx.get_target("gpu")) # optionally print compiled model - model.print() - # create random input image + model.print() + # create random input image input_image = np.random.rand(1, 3, 299, 299).astype('float32') - # feed image to model, 'x.1` is the input param name + # feed image to model, 'x.1` is the input param name results = model.run({'x.1': input_image}) # get the results back result_np = np.array(results[0]) - # print the inferred class of the input image + # print the inferred class of the input image print(np.argmax(result_np)) ``` @@ -1422,7 +1422,7 @@ Follow these steps: #include #include #include - + int main(int argc, char** argv) { migraphx::program prog; @@ -1438,12 +1438,12 @@ Follow these steps: prog.compile(targ, comp_opts); // print the compiled program prog.print(); - // randomly generate input image + // randomly generate input image // of shape (1, 3, 299, 299) std::srand(unsigned(std::time(nullptr))); std::vector input_image(1*299*299*3); std::generate(input_image.begin(), input_image.end(), std::rand); - // users need to provide data for the input + // users need to provide data for the input // parameters in order to run inference // you can query into migraph program for the parameters migraphx::program_parameters prog_params; @@ -1453,7 +1453,7 @@ Follow these steps: prog_params.add(input, migraphx::argument(param_shapes[input], input_image.data())); // run inference auto outputs = prog.eval(prog_params); - // read back the output + // read back the output float* results = reinterpret_cast(outputs[0].data()); float* max = std::max_element(results, results + 1000); int answer = max - results; @@ -1466,16 +1466,16 @@ Follow these steps: ```py cmake_minimum_required(VERSION 3.5) project (CAI) - + set (CMAKE_CXX_STANDARD 14) set (EXAMPLE inception_inference) - + list (APPEND CMAKE_PREFIX_PATH /opt/rocm/hip /opt/rocm) find_package (migraphx) - + message("source file: " ${EXAMPLE}.cpp " ---> bin: " ${EXAMPLE}) add_executable(${EXAMPLE} ${EXAMPLE}.cpp) - + target_link_libraries(${EXAMPLE} migraphx::c) ``` @@ -1541,7 +1541,7 @@ Inference time: 0.029ms iterator : 9 Inference complete Inference time: 0.029ms - + ### TUNED ### iterator : 0 Inference complete @@ -1581,7 +1581,7 @@ The best inference performance through MIGraphX is conditioned upon having tuned Tuning is time consuming, and if the users have not performed tuning, they would see discrepancies between expected or claimed inference performance and actual inference performance. This has led to repetitive and time-consuming tuning tasks for each user. -MIGraphX introduces a feature, known as YModel, that stores the kernel config parameters found during tuning into a .mxr file. This ensures the same level of expected performance, even when a model is copied to a different user/system. +MIGraphX introduces a feature, known as YModel, that stores the kernel config parameters found during tuning into a .mxr file. This ensures the same level of expected performance, even when a model is copied to a different user/system. The YModel feature is available starting from ROCm 5.4.1 and UIF 1.1. diff --git a/docs/how_to/deep_learning_rocm.md b/docs/how_to/deep_learning_rocm.md index 2cc68a4db..3039d9d5e 100644 --- a/docs/how_to/deep_learning_rocm.md +++ b/docs/how_to/deep_learning_rocm.md @@ -10,8 +10,9 @@ align: center --- ROCm Compatible Frameworks Flowchart ``` + ## Frameworks Installation - - [How to Install PyTorch?](pytorch_install/pytorch_install) - - [How to Install Magma?](magma_install/magma_install) - - [How to Install Magma?](tensorflow_install/tensorflow_install) \ No newline at end of file +- [How to Install PyTorch?](pytorch_install/pytorch_install) +- [How to Install Magma?](magma_install/magma_install) +- [How to Install Magma?](tensorflow_install/tensorflow_install) diff --git a/docs/how_to/magma_install/magma_install.md b/docs/how_to/magma_install/magma_install.md index ff177c670..cb33492ea 100644 --- a/docs/how_to/magma_install/magma_install.md +++ b/docs/how_to/magma_install/magma_install.md @@ -89,4 +89,4 @@ Advanced Micro Devices, Inc., \[Online\]. Available: [https://github.com/ROCmSof Docker, \[Online\]. [https://docs.docker.com/get-started/overview/](https://docs.docker.com/get-started/overview/) -Torchvision, \[Online\]. Available [https://pytorch.org/vision/master/index.html?highlight=torchvision#module-torchvision](https://pytorch.org/vision/master/index.html?highlight=torchvision#module-torchvision) \ No newline at end of file +Torchvision, \[Online\]. Available [https://pytorch.org/vision/master/index.html?highlight=torchvision#module-torchvision](https://pytorch.org/vision/master/index.html?highlight=torchvision#module-torchvision) diff --git a/docs/how_to/manage_install/install_linux.md b/docs/how_to/manage_install/install_linux.md index 9a1d0e837..52aebb77d 100644 --- a/docs/how_to/manage_install/install_linux.md +++ b/docs/how_to/manage_install/install_linux.md @@ -26,10 +26,10 @@ For AMDGPU and ROCm installation using the installer script method on Linux distribution, follow these steps: 1. **Meet prerequisites** – Ensure the Prerequisites are met before downloading -and installing the installer using the installer script method. + and installing the installer using the installer script method. 2. **Download and install the installer script** – Ensure you download and -install the installer script from the recommended URL. + install the installer script from the recommended URL. ```{tip} The installer package is updated periodically to resolve known issues and add @@ -38,7 +38,7 @@ install the installer script from the recommended URL. ``` 3. **Use the installer script on Linux distributions** – Ensure you execute the -script for installing use cases. + script for installing use cases. ### Download and Install the Installer Script @@ -147,7 +147,7 @@ To install use cases specific to your requirements, use the installer - To install multiple use cases: ```shell - sudo amdgpu-install --usecase=hiplibsdk,rocm + sudo amdgpu-install --usecase=hiplibsdk,rocm ``` - To display a list of available use cases: @@ -164,7 +164,7 @@ To install use cases specific to your requirements, use the installer ```none If --usecase option is not present, the default selection is "graphics,opencl,hip" - + Available use cases: rocm(for users and developers requiring full ROCm stack) - OpenCL (ROCr/KFD based) runtime @@ -176,16 +176,16 @@ To install use cases specific to your requirements, use the installer lrt(for users of applications requiring ROCm runtime) - ROCm Compiler and device libraries - ROCr runtime and thunk - opencl(for users of applications requiring OpenCL on Vega or + opencl(for users of applications requiring OpenCL on Vega or later products) - ROCr based OpenCL - ROCm Language runtime - + openclsdk (for application developers requiring ROCr based OpenCL) - ROCr based OpenCL - ROCm Language runtime - development and SDK files for ROCr based OpenCL - + hip(for users of HIP runtime on AMD products) - HIP runtimes hiplibsdk (for application developers requiring HIP on AMD products) @@ -351,9 +351,9 @@ The functions of a package manager installation system are: - Grouping packages based on function - Extracting package archives - Ensuring a package is installed with all necessary packages and dependencies -are managed + are managed - From a remote repository, looking up, downloading, installing, or updating -existing packages + existing packages - Ensuring the authenticity and integrity of the package ### Installing ROCm on Linux Distributions @@ -362,27 +362,27 @@ For a fresh ROCm installation using the package manager method on a Linux distribution, follow the steps below: 1. **Meet prerequisites** – Ensure the Prerequisites are met before the ROCm -installation. + installation. 2. **Install kernel headers and development packages** – Ensure kernel headers -and development packages are installed on the system. + and development packages are installed on the system. 3. **Select the base URLs for AMDGPU and ROCm stack repository** – Ensure the -base URLs for AMDGPU and ROCm stack repositories are selected. + base URLs for AMDGPU and ROCm stack repositories are selected. 4. **Add the AMDGPU stack repository** – Ensure the AMDGPU stack repository is -added. + added. 5. **Install the kernel-mode driver and reboot the system** – Ensure the -kernel-mode driver is installed and the system is rebooted. + kernel-mode driver is installed and the system is rebooted. 6. **Add ROCm stack repository** – Ensure the ROCm stack repository is added. 7. **Install single-version or multiversion ROCm meta-packages** – Install the -desired meta-packages. + desired meta-packages. 8. **Verify installation for the applicable distributions** – Verify if the -installation is successful. + installation is successful. ```{important} You cannot install a kernel-mode driver in a Docker container. Refer to the @@ -428,8 +428,8 @@ To check the `kernel-headers` and `linux-modules-extra` package versions, follow these steps: 1. For the Ubuntu/Debian environment, execute the following command to verify -the kernel headers and development packages are installed with the respective -versions: + the kernel headers and development packages are installed with the + respective versions: ```shell sudo dpkg -l | grep linux-headers @@ -442,7 +442,7 @@ versions: ``` 2. Execute the following command to check whether the development packages are -installed: + installed: ```shell sudo dpkg -l | grep linux-modules-extra @@ -456,8 +456,8 @@ installed: ``` 3. If the supported version installation of Linux headers and development -packages are not installed on the system, execute the following command to -install the packages: + packages are not installed on the system, execute the following command + to install the packages: ```shell sudo apt install linux-headers-`uname -r` linux-modules-extra-`uname -r` @@ -492,26 +492,26 @@ install the packages: To add the AMDGPU stack repository, follow these steps: -::::{tab-set} -:::{tab-item} Ubuntu 20.04 -:sync: ubuntu-20.04 + ::::{tab-set} + :::{tab-item} Ubuntu 20.04 + :sync: ubuntu-20.04 ```shell echo 'deb [arch=amd64 signed-by=/etc/apt/trusted.gpg.d/rocm-keyring.gpg] https://repo.radeon.com/amdgpu/5.4.3/ubuntu focal main' | sudo tee /etc/apt/sources.list.d/amdgpu.list sudo apt update ``` -::: -:::{tab-item} Ubuntu 22.04 -:sync: ubuntu-22.04 + ::: + :::{tab-item} Ubuntu 22.04 + :sync: ubuntu-22.04 ```shell echo 'deb [arch=amd64 signed-by=/etc/apt/trusted.gpg.d/rocm-keyring.gpg] https://repo.radeon.com/amdgpu/5.4.3/ubuntu jammy main' | sudo tee /etc/apt/sources.list.d/amdgpu.list sudo apt update ``` -::: -:::: + ::: + :::: Install the kernel mode driver and reboot the system using the following commands: @@ -525,9 +525,9 @@ install the packages: To add the ROCm repository, use the following steps: -::::{tab-set} -:::{tab-item} Ubuntu 20.04 -:sync: ubuntu-20.04 + ::::{tab-set} + :::{tab-item} Ubuntu 20.04 + :sync: ubuntu-20.04 ```shell for ver in 5.0.2 5.1.4 5.2.5 5.3.3 5.4.3; do @@ -537,9 +537,9 @@ install the packages: sudo apt update ``` -::: -:::{tab-item} Ubuntu 22.04 -:sync: ubuntu-22.04 + ::: + :::{tab-item} Ubuntu 22.04 + :sync: ubuntu-22.04 ```shell for ver in 5.0.2 5.1.4 5.2.5 5.3.3 5.4.3; do @@ -549,8 +549,8 @@ install the packages: sudo apt update ``` -::: -:::: + ::: + :::: Install packages of your choice in a single-version ROCm install or in a multi-version ROCm install fashion. For more information on what @@ -596,7 +596,7 @@ To check the kernel headers and `linux-modules-extra` package versions, follow these steps: 1. To verify you have the supported version of the installed kernel headers, -type the following on the command line: + type the following on the command line: ```shell sudo yum list installed kernel-headers @@ -607,15 +607,15 @@ type the following on the command line: the same versions as the kernel. 2. The following command lists the development packages on your system. Verify -if the listed development package's version number matches the kernel version -number: + if the listed development package's version number matches the kernel + version number: ```shell sudo yum list installed kernel-devel ``` 3. If the supported version installation of kernel headers and development -packages does not exist on the system, execute the command below to install: + packages does not exist on the system, execute the command below to install: ```shell sudo yum install kernel-headers-`uname -r` kernel-devel-`uname -r` @@ -631,9 +631,9 @@ packages does not exist on the system, execute the command below to install: section. ``` -::::{tab-set} -:::{tab-item} RHEL 8.6 -:sync: RHEL-8.6 + ::::{tab-set} + :::{tab-item} RHEL 8.6 + :sync: RHEL-8.6 ```shell sudo tee --append /etc/yum.repos.d/amdgpu.repo < #include - + #define N 64 #pragma omp requires unified_shared_memory int main() { int n = N; int *a = new int[n]; int *b = new int[n]; - + for(int i = 0; i < n; i++) b[i] = i; - + #pragma omp target parallel for map(to:b[:n]) for(int i = 0; i < n; i++) a[i] = b[i]; - + for(int i = 0; i < n; i++) if(a[i] != i) printf("error at %d: expected %d, got %d\n", i, i+1, a[i]); - + return 0; } $ clang++ -O2 -target x86_64-pc-linux-gnu -fopenmp --offload-arch=gfx90a:xnack+ parallel_for.cpp @@ -231,7 +231,7 @@ See the example below, where the user builds the program using -msafe-fp-atomics double a = 0.0;. #pragma omp atomic hint(AMD_fast_fp_atomics) a = a + 1.0; - + double b = 0.0; #pragma omp atomic b = b + 1.0; @@ -260,11 +260,12 @@ Address Sanitizer is a memory error detector tool utilized by applications to de - Initialization order bugs **Features Supported on AMDGPU Platform (amdgcn-amd-amdhsa):** + - Heap buffer overflow - Global buffer overflow -**Software (Kernel/OS) Requirements:** Unified Shared Memory support with Xnack capability. See the section on [Unified Shared Memory](#unified-shared-memory) for prerequisites and details on Xnack. +**Software (Kernel/OS) Requirements:** Unified Shared Memory support with Xnack capability. See the section on [Unified Shared Memory](#unified-shared-memory) for prerequisites and details on Xnack. **Example:** @@ -276,7 +277,7 @@ void main() { ....... // Some program statements #pragma omp target map(to : A[0:N], B[0:N]) map(from: C[0:N]) { -#pragma omp parallel for +#pragma omp parallel for for(int i =0 ; i < N; i++){ C[i+10] = A[i] + B[i]; } // end of for loop @@ -290,7 +291,7 @@ See the complete sample code for heap buffer overflow [here](https://github.com/ - Global buffer overflow ```bash -#pragma omp declare target +#pragma omp declare target int A[N],B[N],C[N]; #pragma omp end declare target void main(){ @@ -300,7 +301,7 @@ void main(){ { #pragma omp target update to(A,B) #pragma omp target parallel for -for(int i=0; i @@ -153,7 +154,6 @@ to perform this optimization. Users can choose different levels of aggressiveness with which this optimization can be applied to the application, with 1 being the least aggressive and 7 being the most aggressive level. - :::{table} -fstruct-layout Values and Their Effects | -fstruct-layout value | Structure peeling | Pointer size after selective compression of self-referential pointers in structures, wherever safe | Type of structure fields eligible for compression | Whether compression performed under safety check | | ----------- | ----------- | ----------- | ----------- | ----------- | diff --git a/docs/reference/validation_tools.md b/docs/reference/validation_tools.md index 9b2d16aa2..485379773 100644 --- a/docs/reference/validation_tools.md +++ b/docs/reference/validation_tools.md @@ -4,14 +4,14 @@ :gutter: 1 :::{grid-item-card} [RVS](https://rocmdocs.amd.com/projects/RVS/en/latest/) -The ROCm Validation Suite is a system administrator’s and cluster manager's tool for detecting and troubleshooting common problems affecting AMD GPU(s) running in a high-performance computing environment, enabled using the ROCm software stack on a compatible platform. +The ROCm Validation Suite is a system administrator’s and cluster manager's tool for detecting and troubleshooting common problems affecting AMD GPU(s) running in a high-performance computing environment, enabled using the ROCm software stack on a compatible platform. - [Documentation](https://rocmdocs.amd.com/projects/RVS/en/latest/) ::: :::{grid-item-card} [TransferBench](https://rocmdocs.amd.com/projects/TransferBench/en/latest/) -TransferBench is a simple utility capable of benchmarking simultaneous transfers between user-specified devices (CPUs/GPUs). +TransferBench is a simple utility capable of benchmarking simultaneous transfers between user-specified devices (CPUs/GPUs). - [Documentation](https://rocmdocs.amd.com/projects/TransferBench/en/latest/) - [Changelog](https://github.com/ROCmSoftwarePlatform/TransferBench/blob/develop/CHANGELOG.md) diff --git a/docs/release/docker_support_matrix.md b/docs/release/docker_support_matrix.md index 5025d0678..3c85419e0 100644 --- a/docs/release/docker_support_matrix.md +++ b/docs/release/docker_support_matrix.md @@ -1,6 +1,6 @@ # Frameworks Support Matrix -The software support matrices for ROCm container releases is listed. +The software support matrices for ROCm container releases is listed. ## ROCm 5.6 diff --git a/docs/release/gpu_os_support.md b/docs/release/gpu_os_support.md index 508af73b2..32f4c0109 100644 --- a/docs/release/gpu_os_support.md +++ b/docs/release/gpu_os_support.md @@ -61,7 +61,6 @@ Use Driver Shipped with ROCm | AMD Radeon™ VII | Vega |Full | gfx906 | Supported | Unsupported | | AMD Radeon™ R9 Fury | Fiji |Full | gfx803 | Community | Unsupported | - ::: :::: @@ -73,7 +72,7 @@ Use Driver Shipped with ROCm :::{tab-item} AMD Instinct™ :sync: instinct -Instinct™ accelerators support the full stack available in ROCm. Instinct™ +Instinct™ accelerators support the full stack available in ROCm. Instinct™ accelerators are Linux only. ::: @@ -104,6 +103,7 @@ below: - HIP SDK includes the HIP Runtime and a selection of GPU libraries for compute. Please see [article](link) for details of HIP SDK. - HIP enables the use of the HIP Runtime only. + ::: :::: diff --git a/docs/release/licensing.md b/docs/release/licensing.md index 5494220bd..2ff866378 100644 --- a/docs/release/licensing.md +++ b/docs/release/licensing.md @@ -2,9 +2,9 @@ ROCm™ is released by Advanced Micro Devices, Inc. and is licensed per component separately. The following table is a list of ROCm components with links to their respective license -terms. These components may include third party components subject to +terms. These components may include third party components subject to additional licenses. Please review individual repositories for more information. -The table shows ROCm components, the name of license and link to the license terms. +The table shows ROCm components, the name of license and link to the license terms. The table is ordered to follow ROCm's manifest file. | Component | License | @@ -63,9 +63,9 @@ The table is ordered to follow ROCm's manifest file. | [aomp-extras](https://github.com/ROCm-Developer-Tools/aomp-extras/) | [MIT](https://github.com/ROCm-Developer-Tools/aomp-extras/blob/aomp-dev/LICENSE) | | [flang](https://github.com/ROCm-Developer-Tools/flang/) | [Apache 2.0](https://github.com/ROCm-Developer-Tools/flang/blob/master/LICENSE.txt) | -Open sourced ROCm components are released via public GitHub +Open sourced ROCm components are released via public GitHub repositories, packages on repo.radeon.com and other distribution channels. -Proprietary products are only available on repo.radeon.com. Currently, only +Proprietary products are only available on repo.radeon.com. Currently, only one component of ROCm, rocm-llvm-alt is governed by a proprietary license. Proprietary components are organized in a proprietary subdirectory in the package repositories to distinguish from open sourced packages. diff --git a/docs/understand/file_reorg.md b/docs/understand/file_reorg.md index a9ef3b047..06840bfad 100644 --- a/docs/understand/file_reorg.md +++ b/docs/understand/file_reorg.md @@ -13,13 +13,13 @@ distributions. Following is the ROCm proposed file structure. | -- lib | -- lib.so->lib.so.major->lib.so.major.minor.patch (public libaries to link with applications) - | -- + | -- | -- architecture dependent libraries and binaries used internally by components | -- cmake | -- | --.config.cmake | -- libexec - | -- + | -- | -- non ISA/architecture independent executables used internally by components | -- include | -- @@ -94,11 +94,12 @@ from the new location (/opt/rocm-xxx/include) as shown in the example below. The depreciation plan for backward compatibility wrapper header files is as follows -- #pragma message announcing deprecation – ROCm v5.2 release. -- #pragma message changed to #warning – Future release, tentatively ROCm v5.5. -- #warning changed to #error – Future release, tentatively ROCm v5.6. +- `#pragma` message announcing deprecation – ROCm v5.2 release. +- `#pragma` message changed to `#warning` – Future release, tentatively ROCm + v5.5. +- `#warning` changed to `#error` – Future release, tentatively ROCm v5.6. - Backward compatibility wrappers removed – Future release, tentatively ROCm -v6.0. + v6.0. ### Executable files @@ -145,19 +146,19 @@ will be deprecated in a future release. Application have to make sure to include correct header file and use correct search paths. 1. `#include` needs to be changed to -`#include ` + `#include ` - For eg: `#include ` needs to change -to `#include ` + For eg: `#include ` needs to change + to `#include ` 2. Any variable in cmake or makefiles pointing to component folder needs to -changed. + changed. - For eg: `VAR1=/opt/rocm/hip` needs to be changed to `VAR1=/opt/rocm` - `VAR2=/opt/rocm/hsa` needs to be changed to `VAR2=/opt/rocm` + For eg: `VAR1=/opt/rocm/hip` needs to be changed to `VAR1=/opt/rocm` + `VAR2=/opt/rocm/hsa` needs to be changed to `VAR2=/opt/rocm` 3. Any reference to `/opt/rocm//bin` or `/opt/rocm//lib` -needs to be changed to `/opt/rocm/bin` and `/opt/rocm/lib/` respectively. + needs to be changed to `/opt/rocm/bin` and `/opt/rocm/lib/` respectively. ## References diff --git a/docs/understand/installing_linux.md b/docs/understand/installing_linux.md index 37fe0a696..08c220909 100644 --- a/docs/understand/installing_linux.md +++ b/docs/understand/installing_linux.md @@ -26,11 +26,11 @@ It is customary for Linux installers to integrate into the system's package manager. There are two notable groups of package sources: - AMD-hosted repositories maintained by AMD available to register on supported -Linux distribution versions. For a complete list of AMD-supported platforms, -refer to the article: [GPU and OS Support](../release/gpu_os_support). + Linux distribution versions. For a complete list of AMD-supported platforms, + refer to the article: [GPU and OS Support](../release/gpu_os_support). - Distribution-hosted repositories maintained by the developer of said Linux -distribution. These require little to no setup from the user, but aren't tested -by AMD. For support on these installations, contact the relevant maintainers. + distribution. These require little to no setup from the user, but aren't tested + by AMD. For support on these installations, contact the relevant maintainers. AMD also provides installer scripts for those that wish to drive installations in a more manual fashion. @@ -71,12 +71,12 @@ The `amdgpu-install` script streamlines the installation process by: - Abstracting the distribution-specific package installation logic - Performing the repository setup - Allowing you to specify the use case and automating the installation of all -the required packages + the required packages - Installing multiple ROCm releases simultaneously on a system - Automating updating local repository information through enhanced -functionality of the amdgpu-install script + functionality of the amdgpu-install script - Performing post-install checks to verify whether the installation was -completed successfully + completed successfully - Upgrading the installed ROCm release - Uninstalling the installed single-version or multiversion ROCm releases @@ -125,8 +125,8 @@ The single-version ROCm installation refers to the following: The multiversion installation refers to the following: - Installation of multiple instances of the ROCm stack on a system. Extending -the package name and its dependencies with the release version adds the ability -to support multiple versions of packages simultaneously. + the package name and its dependencies with the release version adds the + ability to support multiple versions of packages simultaneously. - Use of versioned ROCm meta-packages. ```{note} diff --git a/docs/understand/installing_linux/package_manager_integration.md b/docs/understand/installing_linux/package_manager_integration.md index d71e71fbb..668724f0f 100644 --- a/docs/understand/installing_linux/package_manager_integration.md +++ b/docs/understand/installing_linux/package_manager_integration.md @@ -60,9 +60,9 @@ of required packages and libraries. **Example:** - rocm-hip-runtime is used to deploy on supported machines to execute HIP -applications. + applications. - rocm-hip-sdk contains runtime components to deploy and execute HIP -applications. + applications. ```{figure-md} meta-packages diff --git a/docs/understand/installing_linux/prerequisites.md b/docs/understand/installing_linux/prerequisites.md index 7fb98195c..831d6f94f 100644 --- a/docs/understand/installing_linux/prerequisites.md +++ b/docs/understand/installing_linux/prerequisites.md @@ -18,14 +18,14 @@ kernel version. Verify the Linux distribution using the following steps: 1. To obtain the Linux distribution information, type the following command on -your system from the Command Line Interface (CLI): + your system from the Command Line Interface (CLI): ```shell uname -m && cat /etc/*release ``` 2. Confirm that the obtained Linux distribution information matches with those -with [System Requirements](/release/gpu_os_support#os-support). + with [System Requirements](/release/gpu_os_support#os-support). **Example:** Running the command above on an Ubuntu system results in the following output: @@ -51,7 +51,7 @@ Verify the kernel version using the following steps: ``` 2. Confirm that the obtained kernel version information matches with System -Requirements. + Requirements. **Example:** The output of the command above lists the kernel version in the following format: @@ -97,7 +97,7 @@ To verify that your system has a ROCm-capable GPU, use these steps: ``` 2. Verify from the output that the listed product names match with the Product -Id given in the table above. + Id given in the table above. ## Confirm the System Has All the Required Tools and Packages Installed @@ -125,7 +125,7 @@ GPU resources. ``` 3. Use of the video group is recommended for all ROCm-supported operating -systems. + systems. ```{note} render group is required only for Ubuntu v20.04. diff --git a/mdlrc-style.rb b/mdlrc-style.rb index cda8491e4..5f20119ad 100644 --- a/mdlrc-style.rb +++ b/mdlrc-style.rb @@ -2,7 +2,16 @@ all # Extend line length rule 'MD013', :line_length => 99999 +# Use "1. 2. 3."-style numbered lists instead of "1. 1. 1." +rule 'MD029', :style => :ordered + # Allow in-line HTML exclude_rule 'MD033' exclude_rule 'MD041' + +# False positives, see: https://github.com/markdownlint/markdownlint/issues/374 +exclude_rule 'MD005' + +# False positives, see: https://github.com/markdownlint/markdownlint/issues/313 +exclude_rule 'MD007' \ No newline at end of file