Compare commits

67 Commits

Author SHA1 Message Date
Sam Wu
cd7898608c Update documentation requirements 2024-09-16 10:12:54 -08:00
Sam Wu
cb7fd65a13 Update documentation requirements 2024-06-06 16:58:45 -06:00
Sam Wu
7789c72b3c Fix RTD config 2024-05-02 08:55:38 -06:00
Sam Wu
0144f872d9 Update documentation requirements 2024-05-01 16:59:34 -06:00
Sam Wu
1084ccc9fc Update documentation requirements 2024-05-01 16:54:06 -06:00
Young Hui - AMD
bdf4024fcc update SLES 15.4 prerequisites for 5.6.0 (#2982)
* update SLES 15.4 prerequisites

* update wordlist from develop

* fix spelling errors

* more spelling fixes
2024-04-01 17:50:26 -04:00
Mátyás Aradi
c61e253c6e Update Linux and ROCm versions (#2870) 2024-02-05 09:37:35 -07:00
Mátyás Aradi
10a4078f82 Remove RHEL 8.6, because it is not supported (#2830) 2024-01-19 09:15:31 -07:00
Sam Wu
e8c0aa9ac0 fix linux installer instructions for rhel 9 2023-09-20 13:05:34 -06:00
Sam Wu
b1db9fb122 update rtd config 2023-08-29 14:46:13 -06:00
Nara
c3e8e15e51 doc: Update version in install guide to 5.6 (#2387) 2023-08-18 13:57:45 -06:00
Sam Wu
df0ee5a0ae add version to html title 2023-08-04 17:18:41 -06:00
Saad Rahim
6fb7b9f3b5 GPU support clarification (#2350) 2023-07-27 17:42:24 -06:00
Saad Rahim
7f8eede7d1 linting fix 2023-07-27 16:30:18 -06:00
Saad Rahim
0741268fd5 Updating GPU support list 2023-07-27 16:30:18 -06:00
Sam Wu
4ab3787abe Merge pull request #2345 from RadeonOpenCompute/docs/5.5.1
Docs/5.5.1 Sync into 5.6
2023-07-27 13:32:02 -06:00
Sam Wu
b4d3dde1a2 Update management_tools.md 2023-07-27 13:28:31 -06:00
Saad Rahim
b60afeeafe Update ai_tools.md 2023-07-27 13:28:21 -06:00
Saad Rahim
76af020540 Merge branch 'docs/5.6.0' into docs/5.5.1 2023-07-27 13:26:47 -06:00
Sam Wu
e96f137f44 fix merge conflict 2023-07-27 13:16:43 -06:00
Saad Rahim
4dd5cf1e59 fixing linting (#2343)
Co-authored-by: Sam Wu <sam.wu2@amd.com>
2023-07-27 13:11:50 -06:00
Sam Wu
fab4379715 Add 5.5.1 release notes (#2342)
* add 5.5.1 release notes

* fix markdown linting violations

* fix release notes

---------

Co-authored-by: Saad Rahim <44449863+saadrahim@users.noreply.github.com>
2023-07-27 12:43:11 -06:00
Sam Wu
d17a27ca84 set article info for windows pages (#2341) 2023-07-27 12:28:33 -06:00
Saad Rahim
ddb77b9dcf Merge branch 'docs/5.5.1' of github.com:saadrahim/ROCm into docs/5.5.1 2023-07-27 12:20:03 -06:00
Saad Rahim
52f52b7976 CI on docs branch 2023-07-27 12:18:58 -06:00
Saad Rahim
a35248bb77 Delete 5.5-win.md 2023-07-27 12:11:41 -06:00
Saad Rahim
9d05c49458 Delete #5.5-win.md# 2023-07-27 12:11:29 -06:00
Saad Rahim
419f674456 Windows release notes 2023-07-27 12:08:28 -06:00
Saad Rahim
e13e1d31c3 Adding Windows Installation Instructions (#2339) 2023-07-27 11:00:44 -06:00
Sam Wu
eb12f3f851 Changelog updates for 5.6.0 (#2306)
* remove typos in changelog

* add 5.6 release notes

* add amd smi changes for 5.6.0
2023-07-07 09:39:42 -06:00
Sam Wu
524f009280 Links for Reference pages (#2307)
* reorg toc to match all ref material page

* add links to docs, github, and changelogs
2023-07-07 09:37:15 -06:00
Rahul Garg
d23a85c707 Update backward incompatible planned changes in 5.5 (#2279)
* Update backward incompatible planned changes

* add planned changes to changelog

* update rocm-docs-core to v0.18.3

---------

Co-authored-by: Sam Wu <sam.wu2@amd.com>
2023-07-07 09:36:40 -06:00
Sam Wu
2786b32eea Update Links (#2240)
* update link to PCIe Gen 4 pdf

* fix broken links

* remove references to broken links

* fix spelling of data center
2023-07-07 09:35:55 -06:00
Sam Wu
2c828465f2 rocm-docs-core v0.18.3 2023-06-30 09:42:51 -06:00
Sam Wu
58b137d43e rocm-docs-core v0.18.3 2023-06-30 09:41:51 -06:00
Saad Rahim
a144653405 Fixing typos 2023-06-29 13:41:44 -06:00
Saad Rahim
85a4eca655 Fixing links for management tools 2023-06-29 13:31:58 -06:00
Saad Rahim
bdb527980a Fixing typo on GPU support tables for Radeon 2023-06-29 13:14:14 -06:00
Sam Wu
8e39a2a147 update release notes 2023-06-29 12:18:50 -06:00
Sam Wu
72c128f681 update project names for intersphinx 2023-06-29 11:30:50 -06:00
Sam Wu
284d024045 pdf configs 2023-06-29 11:24:57 -06:00
Sam Wu
da32369db1 config for pdf 2023-06-29 11:24:09 -06:00
Sam Wu
e70545bcd9 update release notes date 2023-06-29 11:21:30 -06:00
Sam Wu
3d88626dd4 update conf.py 2023-06-29 11:13:57 -06:00
Rahul Garg
0cfc1e480a Update backward incompatible planned changes in 5.5 (#2279)
* Update backward incompatible planned changes

* add planned changes to changelog

* update rocm-docs-core to v0.18.3

---------

Co-authored-by: Sam Wu <sam.wu2@amd.com>
2023-06-29 11:05:27 -06:00
Saad Rahim
f9aeee3e15 CLR manifest update and release note edit (#2299)
* removing deprecated libraries

* Release note fix

* manual updates

* Updating manifest for clr changes
2023-06-28 19:02:49 -06:00
Saad Rahim
6e50a85a93 removing deprecated libraries (#2298) 2023-06-28 17:33:07 -06:00
Saad Rahim
e8fdc582d8 Updating manifest for 5.6.0 release (#2297) 2023-06-28 17:13:00 -06:00
Saad Rahim
4df2273587 Table fix (#2296)
* Table fix

* Supported and unsupported tab fix
2023-06-28 16:47:18 -06:00
Saad Rahim
996f4a8c37 Compatibility Section for ROCm 5.6 (#2294)
* Update 3rd party compat for 5.6

* Update supported OS for 5.6

* Validated kernels

* linting

* missed GPU

* Update .wordlist.txt

---------

Co-authored-by: Máté Ferenc Nagy-Egri <mate@streamhpc.com>
2023-06-28 16:34:08 -06:00
Sam Wu
5bbe13fb75 Cherry pick changes from develop to 5.6 (#2295)
* Update Links (#2240)

* update link to PCIe Gen 4 pdf

* fix broken links

* remove references to broken links

* fix spelling of data center

* Fixing HIP link (#2236)

* Swati develop (#2245)

* Added deleted sections to openmp.md and other improvements

* Update openmp.md

Tagged `ICV`

* Solving indiscrepencies in openmp.md

There are apparently differences in the published document and information conveyed by the Dev. Fixed it.

* add new words to wordlist

---------

Co-authored-by: Sam Wu <sam.wu2@amd.com>

* fix rocm_smi_lib link in toc (#2260)

* ROCm FHS Reorganization, Backward Compatibility, and Versioning - rev (#2255)

* update requirements

---------

Co-authored-by: Saad Rahim <44449863+saadrahim@users.noreply.github.com>
Co-authored-by: srawat <120587655+SwRaw@users.noreply.github.com>
Co-authored-by: Ehud Sharlin <112672820+Ehud-Sharlin@users.noreply.github.com>
2023-06-28 16:30:19 -06:00
Saad Rahim
b899a3697c Further release notes (#2285)
* gfx906 GPU Maintenance Mode

* update changelog and release notes

* Final release notes

* Fix link

* update changelog and release notes

* rocgdb 13

---------

Co-authored-by: Sam Wu <sam.wu2@amd.com>
2023-06-28 16:23:38 -06:00
Sam Wu
f655458f87 Update release notes and changelog (#2274)
* update release notes for rocprofiler

* add release notes for rocgdb
2023-06-28 15:44:11 -06:00
Saad Rahim
8781a7706d Mi50 Maintenance Mode (#2277)
* gfx906 GPU Maintenance Mode

* update changelog and release notes

---------

Co-authored-by: Sam Wu <sam.wu2@amd.com>
2023-06-28 15:11:32 -06:00
dependabot[bot]
3643e8a6c2 Bump rocm-docs-core from 0.18.0 to 0.18.1 in /docs/sphinx (#2280)
Bumps [rocm-docs-core](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.18.0 to 0.18.1.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/RadeonOpenCompute/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.18.0...v0.18.1)

---
updated-dependencies:
- dependency-name: rocm-docs-core
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-06-27 19:23:38 -06:00
dependabot[bot]
02d86aa41b Bump rocm-docs-core from 0.17.2 to 0.18.0 in /docs/sphinx (#2278)
Bumps [rocm-docs-core](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.17.2 to 0.18.0.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/RadeonOpenCompute/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.17.2...v0.18.0)

---
updated-dependencies:
- dependency-name: rocm-docs-core
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-06-27 17:13:08 -06:00
dependabot[bot]
5615c90889 Bump rocm-docs-core from 0.17.1 to 0.17.2 in /docs/sphinx (#2276)
Bumps [rocm-docs-core](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.17.1 to 0.17.2.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/RadeonOpenCompute/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.17.1...v0.17.2)

---
updated-dependencies:
- dependency-name: rocm-docs-core
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-06-27 10:06:37 -06:00
Sam Wu
21e433e91f Update changelog and release notes with hipStreamGetDevice (#2259)
* docs: update changelog and release notes with hipStreamGetDevice

* docs: fix typos and add version update notes

* docs: add HIP changelog

* remove What's New section from changelog
2023-06-26 16:03:04 -06:00
Mészáros Gergely
8bf7cfdddc Add documentation on 5.6 support SLES 15.5 (#2271)
* docs: clean up SLES tab-sets

- Always use a tab-set for SLES 15.4
- In the toplevel SLES title don't say version 15
- harmonize the `:sync:` labels between documents

* docs: Misc fixes in installation

- Fix rocm repository url in the installer script installation for SLES
- Add a missing :sync: tab in installation prerequisites

* docs: add SLES 15.5 support to installation and OS support pages
2023-06-26 15:29:55 -06:00
Saad Rahim
e05ce21fb4 MIOpen kdb installation instructions for PyTorch warmup performance improvement (#2248) 2023-06-22 09:47:38 -06:00
Sam Wu
6b1fdeab82 rocm_smi_lib 2023-06-21 17:18:17 -06:00
Nara
c1a8c5b030 docs(deploy/linux): update install instructions to 5.6 (#2244) 2023-06-16 07:27:00 -06:00
Mészáros Gergely
014c904c4c Add RHEL 8.8 and 9.2 as supported distributions for 5.6 (#2242)
- add them to the os support table
- add install instructions for them
2023-06-14 07:07:50 -06:00
Nara
e8275e7fd3 ROCm 5.6 Changelog Updates (#2238)
* fix(manifest): fix missing remote entries in default.xml

* fix(autotag): fix issues when fetching non-standardized changelogs

* docs(changelog): updated changelog for ROCm 5.6
2023-06-14 07:06:49 -06:00
Alfin Auzikri
51af0be780 Update tensorflow_install.md (#2237)
* Update tensorflow_install.md

Fixed writing commands so that when executed by copy paste it doesn't cause an error.

* Update tensorflow_install.md

Following @saadrahim's suggestion of using "\" to signify a line break in bash.
2023-06-12 09:29:44 -06:00
Nagy-Egri Máté Ferenc
5e24832f3b Remove package pin from quick start quide (#2233)
* Remove package pin from quick start quide

When installing a single-package fashion, no version pinning is needed

* Add package pinning to quick start guide

Pinning the packages is required to make apt prefer the rocm packages
instead of the system ones when both provide the same package (e.g
`rocm-smi`).

* Removing Ubuntu 20.04 change

---------

Co-authored-by: Gergely Meszaros <gergely@streamhpc.com>
Co-authored-by: Saad Rahim <44449863+saadrahim@users.noreply.github.com>
2023-06-09 13:56:23 -06:00
srawat
6757f9dc56 Added specialized kernels to openmp.md (#2187)
* Added specialized kernels to openmp.md

A few formatting changes and addition of specialized kernels section at the end.

* Added Specialized kernels in openmp.md

Some formatting changes and addition of specialized kernels instead of no loop and cross team kernels

* Added specialized kernel to openmp.md

* Added specialized kernels to openmp.md

* Replaced the usage of uncertain clauses(may/might) in  openmp.md

* Attempt to align the table headings for environment variables in openmp.md

* Feedback from Dhruva

---------

Co-authored-by: Saad Rahim <44449863+saadrahim@users.noreply.github.com>
2023-06-08 10:00:51 -06:00
101 changed files with 3818 additions and 500 deletions

@@ -5,10 +5,14 @@ on:
    branches:
      - develop
      - main
      - 'docs/*'
      - 'roc**'
  pull_request:
    branches:
      - develop
      - main
      - 'docs/*'
      - 'roc**'
concurrency:
  group: ${{ github.ref }}-${{ github.workflow }}

@@ -3,12 +3,19 @@
version: 2
build:
os: ubuntu-22.04
tools:
python: "3.10"
apt_packages:
- "doxygen"
- "graphviz" # For dot graphs in doxygen
python:
install:
- requirements: docs/sphinx/requirements.txt
sphinx:
configuration: docs/conf.py
formats: [htmlzip, pdf, epub]
python:
version: "3.8"
install:
- requirements: docs/sphinx/requirements.txt
formats: []

@@ -1,29 +1,686 @@
# isv_deployment_win
AAC
ABI
# gpu_aware_mpi
ACE
ACEs
AccVGPR
AccVGPRs
ALU
AMD
AMDGPU
AMDGPUs
AMDMIGraphX
AMI
AOCC
AOMP
APIC
APIs
APU
ASIC
ASICs
ASan
ASm
ATI
AddressSanitizer
AlexNet
Arb
BLAS
BMC
BitCode
Blit
Bluefield
CCD
CDNA
CIFAR
CLI
CLion
CMake
CMakeLists
CMakePackage
CP
CPC
CPF
CPP
CPU
CPUs
CSC
CSE
CSV
CSn
CTests
CU
CUDA
CUs
CXX
Cavium
CentOS
ChatGPT
CoRR
Codespaces
Commitizen
CommonMark
Concretized
Conda
ConnectX
DGEMM
DKMS
DL
DMA
DNN
DNNL
DPM
DRI
DW
DWORD
Dask
DataFrame
DataLoader
DataParallel
DeepSpeed
Dependabot
DevCap
Dockerfile
Doxygen
ELMo
ENDPGM
EPYC
ESXi
FFT
FFTs
FFmpeg
FHS
FMA
FP
Filesystem
Flang
Fortran
Fuyu
GALB
GCD
GCDs
GCN
GDB
GDDR
GDR
GDS
GEMM
GEMMs
GFortran
GiB
GIM
GL
GLXT
GMI
GPG
GPR
GPT
GPU
GPU's
GPUs
GRBM
GenAI
GenZ
GitHub
Gitpod
HBM
HCA
MPI
MVAPICH
Mellanox's
NIC
OFED
OSU
OpenFabrics
PeerDirect
RDMA
UCX
ib_core
# linear algebra
HIPCC
HIPExtension
HIPIFY
HPC
HPCG
HPE
HPL
HSA
HWE
Haswell
Higgs
Hyperparameters
ICV
IDE
IDEs
IMDb
IOMMU
IOP
IOPM
IOV
IRQ
ISA
ISV
ISVs
ImageNet
InfiniBand
Inlines
IntelliSense
Intersphinx
Intra
Ioffe
JSON
Jupyter
KFD
KiB
KVM
Keras
Khronos
LAPACK
LCLK
LDS
LLM
LLMs
LLVM
LM
LSAN
LTS
LoRA
MEM
MERCHANTABILITY
MFMA
MiB
MIGraphX
MIOpen
MIOpenGEMM
MIVisionX
MLM
MMA
MMIO
MMIOH
MNIST
MPI
MSVC
MVAPICH
MVFFR
Makefile
Makefiles
Matplotlib
Megatron
Mellanox
Mellanox's
Meta's
MirroredStrategy
Multicore
Multithreaded
MyEnvironment
MyST
NBIO
NBIOs
NIC
NICs
NLI
NLP
NPS
NSP
NUMA
NVCC
NVIDIA
NVPTX
NaN
Nano
Navi
Noncoherently
NousResearch's
NumPy
OAM
OAMs
OCP
OEM
OFED
OMP
OMPI
OMPT
OMPX
ONNX
OSS
OSU
OpenCL
OpenCV
OpenFabrics
OpenGL
OpenMP
OpenSSL
OpenVX
PCI
PCIe
PEFT
PIL
PILImage
PRNG
PRs
PaLM
Pageable
PeerDirect
Perfetto
PipelineParallel
PnP
PowerShell
PyPi
PyTorch
Qcycles
RAII
RCCL
RDC
RDMA
RDNA
RHEL
ROC
ROCProfiler
ROCTracer
ROCclr
ROCdbgapi
ROCgdb
ROCk
ROCm
ROCmCC
ROCmSoftwarePlatform
ROCmValidationSuite
ROCr
RST
RW
Radeon
RelWithDebInfo
Req
Rickle
RoCE
Ryzen
SALU
SBIOS
SCA
SDK
SDMA
SDRAM
SENDMSG
SGPR
SGPRs
SHA
SIGQUIT
SIMD
SIMDs
SKU
SKUs
SLES
SMEM
SMI
SMT
SPI
SQs
SRAM
SRAMECC
SVD
SWE
SerDes
Shlens
Skylake
Softmax
Spack
Supermicro
Szegedy
TCA
TCC
TCI
TCIU
TCP
TCR
TF
TFLOPS
TPU
TPUs
TensorBoard
TensorFlow
TensorParallel
ToC
TorchAudio
TorchMIGraphX
TorchScript
TorchServe
TorchVision
TransferBench
TrapStatus
UAC
UC
UCC
UCX
UIF
USM
UTCL
UTIL
Uncached
Unhandled
VALU
VBIOS
VGPR
VGPRs
VM
VMEM
VMWare
VRAM
VSIX
VSkipped
Vanhoucke
Vulkan
WGP
WGPs
WX
WikiText
Wojna
Workgroups
Writebacks
XCD
XCDs
XGBoost
XGBoost's
XGMI
XT
XTX
Xeon
Xilinx
Xnack
Xteam
YAML
YML
YModel
ZeRO
ZenDNN
accuracies
activations
addr
alloc
allocator
allocators
amdgpu
api
atmi
atomics
autogenerated
avx
awk
backend
backends
benchmarking
bfloat
bilinear
bitsandbytes
blit
boson
bosons
buildable
bursty
bzip
cacheable
cd
centos
centric
changelog
chiplet
cmake
cmd
coalescable
codename
collater
comgr
completers
composable
concretization
config
conformant
convolutional
convolves
cpp
csn
cuBLAS
cuFFT
cuLIB
cuRAND
cuSOLVER
cuSPARSE
# tuning_guides
BMC
DGEMM
HPCG
HPL
IOPM
dataset
datasets
dataspace
datatype
datatypes
dbgapi
de
deallocation
denoise
denoised
denoises
denormalize
deserializers
detections
dev
devicelibs
devsel
dimensionality
disambiguates
distro
el
embeddings
enablement
endpgm
encodings
env
epilog
etcetera
ethernet
exascale
executables
ffmpeg
filesystem
fortran
galb
gcc
gdb
gfortran
gfx
githooks
github
gnupg
grayscale
gzip
heterogenous
hipBLAS
hipBLASLt
hipCUB
hipFFT
hipLIB
hipRAND
hipSOLVER
hipSPARSE
hipSPARSELt
hipTensor
hipamd
hipblas
hipcub
hipfft
hipfort
hipify
hipsolver
hipsparse
hpp
hsa
hsakmt
hyperparameter
ib_core
inband
incrementing
inferencing
inflight
init
initializer
inlining
installable
interprocedural
intra
invariants
invocating
ipo
kdb
latencies
libfabric
libjpeg
libs
linearized
linter
linux
llvm
localscratch
logits
lossy
macOS
matchers
microarchitecture
migraphx
miopen
miopengemm
mivisionx
mkdir
mlirmiopen
mtypes
mvffr
namespace
namespaces
numref
ocl
opencl
opencv
openmp
openssl
optimizers
os
pageable
parallelization
parameterization
passthrough
perfcounter
performant
perl
pragma
pre
prebuilt
precompiled
prefetch
prefetchable
preprocess
preprocessed
preprocessing
prequantized
prerequisites
profiler
protobuf
pseudorandom
py
quasirandom
queueing
rccl
rdc
reStructuredText
reformats
repos
representativeness
req
resampling
rescaling
reusability
roadmap
roc
rocAL
rocALUTION
rocBLAS
rocFFT
rocLIB
rocMLIR
rocPRIM
rocRAND
rocSOLVER
rocSPARSE
rocThrust
rocWMMA
rocalution
rocblas
rocclr
rocfft
rocm
rocminfo
rocprim
rocprof
rocprofiler
rocr
rocrand
rocsolver
rocsparse
rocthrust
roctracer
runtime
runtimes
sL
scalability
scalable
sendmsg
serializers
shader
sharding
sigmoid
sm
smi
softmax
spack
src
stochastically
strided
subdirectory
subexpression
subfolder
subfolders
supercomputing
tensorfloat
th
tokenization
tokenize
tokenized
tokenizer
tokenizes
toolchain
toolchains
toolset
toolsets
torchvision
tqdm
tracebacks
txt
uarch
uncached
uncorrectable
uninstallation
unsqueeze
unstacking
unswitching
untrusted
untuned
upvote
USM
UTCL
UTIL
utils
vL
variational
vdi
vectorizable
vectorization
vectorize
vectorized
vectorizer
vectorizes
vjxb
walkthrough
walkthroughs
wavefront
wavefronts
whitespaces
workgroup
workgroups
writeback
writebacks
wrreq
wzo
xargs
xz
yaml
ysvmadyb
zypper

@@ -15,14 +15,595 @@ The release notes for the ROCm platform.
-------------------
## ROCm 5.6.0
<!-- markdownlint-disable first-line-h1 -->
<!-- markdownlint-disable no-duplicate-header -->
<!-- markdownlint-disable header-increment -->
#### Release Highlights
ROCm 5.6 brings several AI software ecosystem improvements to our fast-growing user base. A few examples include:
- New documentation portal at https://rocm.docs.amd.com
- Ongoing software enhancements for LLMs, ensuring full compliance with the HuggingFace unit test suite
- OpenAI Triton, CuPy, HIP Graph support, and many other library performance enhancements
- Improved ROCm deployment and development tools, including CPU-GPU (rocGDB) debugger, profiler, and docker containers
- New pseudorandom generators are available in rocRAND. Added support for half-precision transforms in hipFFT/rocFFT. Added LU refactorization and linear system solver for sparse matrices in rocSOLVER.
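
One of the new rocRAND generators listed later in these notes is MT19937. As background, here is a minimal single-threaded C sketch of the published Mersenne Twister recurrence. It is not rocRAND code and does not use the rocRAND API; the names `mt19937_t`, `mt19937_seed`, and `mt19937_next` are hypothetical and exist only for this illustration.

```C
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MT_N 624 /* number of 32-bit words of state */
#define MT_M 397 /* offset used by the recurrence */

/* Hypothetical type for this sketch; not part of rocRAND. */
typedef struct {
    uint32_t state[MT_N];
    int index;
} mt19937_t;

/* Fill the state from a single 32-bit seed (reference initialization). */
static void mt19937_seed(mt19937_t *g, uint32_t seed)
{
    assert(g != NULL);
    g->state[0] = seed;
    for (int i = 1; i < MT_N; ++i) {
        uint32_t prev = g->state[i - 1];
        g->state[i] = 1812433253u * (prev ^ (prev >> 30)) + (uint32_t)i;
    }
    g->index = MT_N; /* force a state refresh on the first draw */
}

/* Return the next 32-bit output, refreshing the state when exhausted. */
static uint32_t mt19937_next(mt19937_t *g)
{
    assert(g != NULL);
    if (g->index >= MT_N) {
        for (int i = 0; i < MT_N; ++i) {
            /* Upper bit of word i joined with the lower 31 bits of word i+1. */
            uint32_t y = (g->state[i] & 0x80000000u)
                       | (g->state[(i + 1) % MT_N] & 0x7fffffffu);
            uint32_t next = g->state[(i + MT_M) % MT_N] ^ (y >> 1);
            if (y & 1u)
                next ^= 0x9908b0dfu;
            g->state[i] = next;
        }
        g->index = 0;
    }
    /* Tempering step applied to each raw state word. */
    uint32_t y = g->state[g->index++];
    y ^= y >> 11;
    y ^= (y << 7) & 0x9d2c5680u;
    y ^= (y << 15) & 0xefc60000u;
    y ^= y >> 18;
    return y;
}
```

Seeding with 5489, the default seed of the reference implementation, and calling `mt19937_next` twice yields 3499211612 and then 581869302. A GPU library has to organize state differently to serve many threads, so treat this only as a description of the underlying generator.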
#### OS and GPU Support Changes
- SLES15 SP5 support was added in this release. SLES15 SP3 support was dropped.
- AMD Instinct MI50, Radeon Pro VII, and Radeon VII products (collectively referred to as gfx906 GPUs) will enter maintenance mode starting Q3 2023, aligned with the ROCm 5.7 GA release date.
  - No new features or performance optimizations will be supported for the gfx906 GPUs beyond ROCm 5.7.
  - Bug fixes and critical security patches will continue to be supported for the gfx906 GPUs until Q2 2024 (End of Maintenance [EOM]), aligned with the closest ROCm release.
  - Bug fixes during the maintenance period will be made in the next ROCm point release.
  - Bug fixes will not be backported to older ROCm releases for this SKU.
  - Distro and operating system updates will continue as per the ROCm release cadence for gfx906 GPUs until EOM.
#### AMDSMI CLI 23.0.0.4
##### Added
- AMDSMI CLI tool enabled for Linux Bare Metal & Guest
- Package: amd-smi-lib
##### Known Issues
- Not all Error Correction Code (ECC) fields are currently supported
- RHEL 8 & SLES 15 have extra install steps
#### Kernel Modules (DKMS)
##### Fixes
- Stability fix for multi-GPU systems, reproducible via ROCm_Bandwidth_Test, as reported in [Issue 2198](https://github.com/RadeonOpenCompute/ROCm/issues/2198).
#### HIP 5.6 (For ROCm 5.6)
##### Optimizations
- Consolidation of hipamd, rocclr and OpenCL projects in clr
- Optimized lock for graph global capture mode
##### Added
- Added hipRTC support for amd_hip_fp16
- Added hipStreamGetDevice implementation to get the device associated with the stream
- Added HIP_AD_FORMAT_SIGNED_INT16 in hipArray formats
- hipArrayGetInfo for getting information about the specified array
- hipArrayGetDescriptor for getting 1D or 2D array descriptor
- hipArray3DGetDescriptor to get 3D array descriptor
##### Changed
- hipMallocAsync to return success for zero size allocation to match hipMalloc
- Separation of hipcc perl binaries from HIP project to hipcc project. hip-devel package depends on newly added hipcc package
- Consolidation of hipamd, ROCclr, and OpenCL repositories into a single repository called clr. Instructions are updated to build HIP from sources in the HIP Installation guide
- Removed hipBusBandwidth and hipCommander samples from hip-tests
##### Fixed
- Fixed regression in hipMemCpyParam3D when offset is applied
##### Known Issues
- Limited testing on xnack+ configuration
- Multiple HIP test failures (gpuvm fault or hangs)
- hipSetDevice and hipSetDeviceFlags APIs return hipErrorInvalidDevice instead of hipErrorNoDevice on a system without a GPU
- Known memory leak when code object files are loaded/unloaded via hipModuleLoad/hipModuleUnload APIs. Issue will be fixed in a future ROCm release
##### Upcoming changes in future release
- Removal of gcnarch from hipDeviceProp_t structure
- Addition of new fields in hipDeviceProp_t structure
  - maxTexture1D
  - maxTexture2D
  - maxTexture1DLayered
  - maxTexture2DLayered
  - sharedMemPerMultiprocessor
  - deviceOverlap
  - asyncEngineCount
  - surfaceAlignment
  - unifiedAddressing
  - computePreemptionSupported
  - uuid
- Removal of deprecated code
  - hip-hcc codes from hip code tree
- Correct hipArray usage in HIP APIs such as hipMemcpyAtoH and hipMemcpyHtoA
- HIPMEMCPY_3D fields correction (unsigned int -> size_t)
- Renaming of 'memoryType' in hipPointerAttribute_t structure to 'type'
#### ROCgdb-13 (For ROCm 5.6.0)
##### Optimized
- Improved performance when handling the end of a process with a large number of threads.
##### Known Issues
- On certain configurations, ROCgdb can show the following warning message:
`warning: Probes-based dynamic linker interface failed. Reverting to original interface.`
This does not affect ROCgdb's functionalities.
#### ROCprofiler (For ROCm 5.6.0)
In ROCm 5.6 the `rocprofilerv1` and `rocprofilerv2` include and library files of
ROCm 5.5 are split into separate files. The `rocmtools` files that were
deprecated in ROCm 5.5 have been removed.
| ROCm 5.6        | rocprofilerv1                       | rocprofilerv2                          |
|-----------------|-------------------------------------|----------------------------------------|
| **Tool script** | `bin/rocprof`                       | `bin/rocprofv2`                        |
| **API include** | `include/rocprofiler/rocprofiler.h` | `include/rocprofiler/v2/rocprofiler.h` |
| **API library** | `lib/librocprofiler64.so.1`         | `lib/librocprofiler64.so.2`            |
The ROCm Profiler Tool that uses `rocprofilerV1` can be invoked using the
following command:
```sh
$ rocprof …
```
To write a custom tool based on the `rocprofilerV1` API do the following:
```C
// main.c
#include <rocprofiler/rocprofiler.h> // Use the rocprofilerV1 API
int main(void) {
    // Use the rocprofilerV1 API
    return 0;
}
```
This can be built in the following manner:
```sh
$ gcc main.c -I/opt/rocm-5.6.0/include -L/opt/rocm-5.6.0/lib -lrocprofiler64
```
The resulting `a.out` will depend on
`/opt/rocm-5.6.0/lib/librocprofiler64.so.1`.
The ROCm Profiler that uses `rocprofilerV2` API can be invoked using the
following command:
```sh
$ rocprofv2 …
```
To write a custom tool based on the `rocprofilerV2` API do the following:
```C
// main.c
#include <rocprofiler/v2/rocprofiler.h> // Use the rocprofilerV2 API
int main(void) {
    // Use the rocprofilerV2 API
    return 0;
}
```
This can be built in the following manner:
```sh
$ gcc main.c -I/opt/rocm-5.6.0/include -L/opt/rocm-5.6.0/lib -lrocprofiler64-v2
```
The resulting `a.out` will depend on
`/opt/rocm-5.6.0/lib/librocprofiler64.so.2`.
##### Optimized
- Improved Test Suite
##### Added
- 'end_time' needs to be disabled in roctx_trace.txt
##### Fixed
- rocprof in ROCm/5.4.0 GPU selector broken.
- rocprof in ROCm/5.4.1 fails to generate kernel info.
- rocprof clobbers LD_PRELOAD.
### Library Changes in ROCm 5.6.0
| Library | Version |
|---------|---------|
| hipBLAS | ⇒ [1.0.0](https://github.com/ROCmSoftwarePlatform/hipBLAS/releases/tag/rocm-5.6.0) |
| hipCUB | ⇒ [2.13.1](https://github.com/ROCmSoftwarePlatform/hipCUB/releases/tag/rocm-5.6.0) |
| hipFFT | ⇒ [1.0.12](https://github.com/ROCmSoftwarePlatform/hipFFT/releases/tag/rocm-5.6.0) |
| hipSOLVER | ⇒ [1.8.0](https://github.com/ROCmSoftwarePlatform/hipSOLVER/releases/tag/rocm-5.6.0) |
| hipSPARSE | ⇒ [2.3.6](https://github.com/ROCmSoftwarePlatform/hipSPARSE/releases/tag/rocm-5.6.0) |
| MIOpen | ⇒ [2.19.0](https://github.com/ROCmSoftwarePlatform/MIOpen/releases/tag/rocm-5.6.0) |
| rccl | ⇒ [2.15.5](https://github.com/ROCmSoftwarePlatform/rccl/releases/tag/rocm-5.6.0) |
| rocALUTION | ⇒ [2.1.9](https://github.com/ROCmSoftwarePlatform/rocALUTION/releases/tag/rocm-5.6.0) |
| rocBLAS | ⇒ [3.0.0](https://github.com/ROCmSoftwarePlatform/rocBLAS/releases/tag/rocm-5.6.0) |
| rocFFT | ⇒ [1.0.23](https://github.com/ROCmSoftwarePlatform/rocFFT/releases/tag/rocm-5.6.0) |
| rocm-cmake | ⇒ [0.9.0](https://github.com/RadeonOpenCompute/rocm-cmake/releases/tag/rocm-5.6.0) |
| rocPRIM | ⇒ [2.13.0](https://github.com/ROCmSoftwarePlatform/rocPRIM/releases/tag/rocm-5.6.0) |
| rocRAND | ⇒ [2.10.17](https://github.com/ROCmSoftwarePlatform/rocRAND/releases/tag/rocm-5.6.0) |
| rocSOLVER | ⇒ [3.22.0](https://github.com/ROCmSoftwarePlatform/rocSOLVER/releases/tag/rocm-5.6.0) |
| rocSPARSE | ⇒ [2.5.2](https://github.com/ROCmSoftwarePlatform/rocSPARSE/releases/tag/rocm-5.6.0) |
| rocThrust | ⇒ [2.18.0](https://github.com/ROCmSoftwarePlatform/rocThrust/releases/tag/rocm-5.6.0) |
| rocWMMA | ⇒ [1.1.0](https://github.com/ROCmSoftwarePlatform/rocWMMA/releases/tag/rocm-5.6.0) |
| Tensile | ⇒ [4.37.0](https://github.com/ROCmSoftwarePlatform/Tensile/releases/tag/rocm-5.6.0) |
#### hipBLAS 1.0.0
hipBLAS 1.0.0 for ROCm 5.6.0
##### Changed
- added const qualifier to hipBLAS functions (swap, sbmv, spmv, symv, trsm) where missing
##### Removed
- removed support for deprecated hipblasInt8Datatype_t enum
- removed support for deprecated hipblasSetInt8Datatype and hipblasGetInt8Datatype functions
##### Deprecated
- in-place trmm is deprecated. It will be replaced by trmm which includes both in-place and
out-of-place functionality
#### hipCUB 2.13.1
hipCUB 2.13.1 for ROCm 5.6.0
##### Added
- Benchmarks for `BlockShuffle`, `BlockLoad`, and `BlockStore`.
##### Changed
- CUB backend references CUB and Thrust version 1.17.2.
- Improved benchmark coverage of `BlockScan` by adding `ExclusiveScan`, benchmark coverage of `BlockRadixSort` by adding `SortBlockedToStriped`, and benchmark coverage of `WarpScan` by adding `Broadcast`.
- Updated `docs` directory structure to match the standard of [rocm-docs-core](https://github.com/RadeonOpenCompute/rocm-docs-core).
##### Known Issues
- `BlockRadixRankMatch` is currently broken under the rocPRIM backend.
- `BlockRadixRankMatch` with a warp size that does not exactly divide the block size is broken under the CUB backend.
#### hipFFT 1.0.12
hipFFT 1.0.12 for ROCm 5.6.0
##### Added
- Implemented the hipfftXtMakePlanMany, hipfftXtGetSizeMany, hipfftXtExec APIs, to allow requesting half-precision transforms.
##### Changed
- Added --precision argument to benchmark/test clients. --double is still accepted but is deprecated as a method to request a double-precision transform.
#### hipSOLVER 1.8.0
hipSOLVER 1.8.0 for ROCm 5.6.0
##### Added
- Added compatibility API with hipsolverRf prefix
#### hipSPARSE 2.3.6
hipSPARSE 2.3.6 for ROCm 5.6.0
##### Added
- Added SpGEMM algorithms
##### Changed
- For hipsparseXbsr2csr and hipsparseXcsr2bsr, blockDim == 0 now returns HIPSPARSE_STATUS_INVALID_SIZE
#### MIOpen 2.19.0
MIOpen 2.19.0 for ROCm 5.6.0
##### Added
- ROCm 5.5 support for gfx1101 (Navi32)
##### Changed
- Tuning results for MLIR on ROCm 5.5
- Bumping MLIR commit to 5.5.0 release tag
##### Fixed
- Fix 3d convolution Host API bug
- [HOTFIX][MI200][FP16] Disabled ConvHipImplicitGemmBwdXdlops when FP16_ALT is required.
#### rccl 2.15.5
RCCL 2.15.5 for ROCm 5.6.0
##### Changed
- Compatibility with NCCL 2.15.5
- Unit test executable renamed to rccl-UnitTests
##### Added
- HW-topology aware binary tree implementation
- Experimental support for MSCCL
- New unit tests for hipGraph support
- NPKit integration
##### Fixed
- rocm-smi ID conversion
- Support for HIP_VISIBLE_DEVICES for unit tests
- Support for p2p transfers to non (HIP) visible devices
##### Removed
- Removed TransferBench from tools. Exists in standalone repo: https://github.com/ROCmSoftwarePlatform/TransferBench
#### rocALUTION 2.1.9
rocALUTION 2.1.9 for ROCm 5.6.0
##### Improved
- Fixed synchronization issues in level 1 routines
#### rocBLAS 3.0.0
rocBLAS 3.0.0 for ROCm 5.6.0
##### Optimizations
- Improved performance of Level 2 rocBLAS GEMV on gfx90a GPU for non-transposed problems having small matrices and larger batch counts. Performance enhanced for problem sizes when m and n <= 32 and batch_count >= 256.
- Improved performance of rocBLAS syr2k for single, double, and double-complex precision, and her2k for double-complex precision. Slightly improved performance for general sizes on gfx90a.
##### Added
- Added bf16 inputs and f32 compute support to Level 1 rocBLAS Extension functions axpy_ex, scal_ex and nrm2_ex.
##### Deprecated
- trmm inplace is deprecated. It will be replaced by trmm that has both inplace and out-of-place functionality
- rocblas_query_int8_layout_flag() is deprecated and will be removed in a future release
- rocblas_gemm_flags_pack_int8x4 enum is deprecated and will be removed in a future release
- rocblas_set_device_memory_size() is deprecated and will be replaced by a future function rocblas_increase_device_memory_size()
- rocblas_is_user_managing_device_memory() is deprecated and will be removed in a future release
##### Removed
- is_complex helper was deprecated and is now removed. Use rocblas_is_complex instead.
- The enum truncate_t and the value truncate were deprecated and are now removed. They are replaced by rocblas_truncate_t and rocblas_truncate, respectively.
- rocblas_set_int8_type_for_hipblas was deprecated and is now removed.
- rocblas_get_int8_type_for_hipblas was deprecated and is now removed.
##### Dependencies
- Added a build-only dependency on Python joblib, which is used by the Tensile build
- Fix for cmake install on some OSes when performed by install.sh -d --cmake_install
##### Fixed
- make trsm offset calculations 64 bit safe
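
To illustrate what "64 bit safe" means for an offset calculation, here is a small C sketch. It is not taken from rocBLAS; the helpers `offset_32` and `offset_64` are hypothetical and assume a column-major matrix whose column start is `col * lda`. Unsigned types are used so that the 32-bit wraparound is well defined in C.

```C
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: column offset computed entirely in 32 bits.
   The product wraps modulo 2^32 once it exceeds UINT32_MAX. */
static uint32_t offset_32(uint32_t col, uint32_t lda)
{
    assert(lda > 0); /* a leading dimension of zero is never valid */
    return col * lda;
}

/* Same offset with both operands widened before the multiplication,
   so the full product is kept. */
static uint64_t offset_64(uint32_t col, uint32_t lda)
{
    assert(lda > 0);
    return (uint64_t)col * (uint64_t)lda;
}
```

With `col = 70000` and `lda = 70000`, the true offset is 4,900,000,000 elements. `offset_64` returns that value, while `offset_32` wraps to 605,032,704 and would address the wrong element.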
##### Changed
- refactor rotg test code
#### rocFFT 1.0.23
rocFFT 1.0.23 for ROCm 5.6.0
##### Added
- Implemented half-precision transforms, which can be requested by passing rocfft_precision_half to rocfft_plan_create.
- Implemented a hierarchical solution map which saves how to decompose a problem and the kernels to be used.
- Implemented a first version of offline-tuner to support tuning kernels for C2C/Z2Z problems.
##### Changed
- Replaced std::complex with hipComplex data types for data generator.
- FFT plan dimensions are now sorted to be row-major internally where possible, which produces better plans if the dimensions were accidentally specified in a different order (column-major, for example).
- Added --precision argument to benchmark/test clients. --double is still accepted but is deprecated as a method to request a double-precision transform.
##### Fixed
- Fixed over-allocation of LDS in some real-complex kernels, which was resulting in kernel launch failure.
#### rocm-cmake 0.9.0
rocm-cmake 0.9.0 for ROCm 5.6.0
##### Added
- Added the option ROCM_HEADER_WRAPPER_WERROR
- Compile-time C macro in the wrapper headers causes errors to be emitted instead of warnings.
- Configure-time CMake option sets the default for the C macro.
#### rocPRIM 2.13.0
rocPRIM 2.13.0 for ROCm 5.6.0
##### Added
- New block level `radix_rank` primitive.
- New block level `radix_rank_match` primitive.
- Added a stable block sorting implementation. This can be used with `block_sort` by using the `block_sort_algorithm::stable_merge_sort` algorithm.
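
For readers unfamiliar with the term, "stable" means that elements with equal keys keep their original relative order. The following host-side C sketch illustrates that property with a plain recursive merge sort. It is not rocPRIM's block-level implementation, and the `item` type and function names are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical record: `key` is sorted on, `tag` records input order. */
typedef struct {
    int key;
    int tag;
} item;

/* Merge the sorted ranges [lo, mid) and [mid, hi) of `a` using `tmp`. */
static void merge_ranges(item *a, item *tmp, size_t lo, size_t mid, size_t hi)
{
    size_t i = lo, j = mid, k = lo;
    while (i < mid && j < hi) {
        /* "<=" prefers the left range on ties; this keeps the sort stable. */
        if (a[i].key <= a[j].key) {
            tmp[k++] = a[i++];
        } else {
            tmp[k++] = a[j++];
        }
    }
    while (i < mid) tmp[k++] = a[i++];
    while (j < hi) tmp[k++] = a[j++];
    memcpy(a + lo, tmp + lo, (hi - lo) * sizeof(item));
}

/* Sort a[lo, hi) by key; `tmp` must have room for at least `hi` items. */
void stable_merge_sort(item *a, item *tmp, size_t lo, size_t hi)
{
    if (hi - lo < 2) {
        return;
    }
    size_t mid = lo + (hi - lo) / 2;
    stable_merge_sort(a, tmp, lo, mid);
    stable_merge_sort(a, tmp, mid, hi);
    merge_ranges(a, tmp, lo, mid, hi);
}
```

Sorting the keys `2, 1, 2, 1` tagged `0, 1, 2, 3` yields tags `1, 3, 0, 2`: within each key, the earlier input element still comes first.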
##### Changed
- Improved the performance of `block_radix_sort` and `device_radix_sort`.
- Improved the performance of `device_merge_sort`.
- Updated `docs` directory structure to match the standard of [rocm-docs-core](https://github.com/RadeonOpenCompute/rocm-docs-core). Contributed by: [v01dXYZ](https://github.com/v01dXYZ).
##### Known Issues
- Disabled GPU error messages relating to incorrect warp operation usage with Navi GPUs on Windows, due to GPU printf performance issues on Windows.
- When `ROCPRIM_DISABLE_LOOKBACK_SCAN` is set, `device_scan` fails for input sizes bigger than `scan_config::size_limit`, which defaults to `std::numeric_limits<unsigned int>::max()`.
#### rocRAND 2.10.17
rocRAND 2.10.17 for ROCm 5.6.0
##### Added
- MT19937 pseudo random number generator based on M. Matsumoto and T. Nishimura, 1998, Mersenne Twister: A 623-dimensionally equidistributed uniform pseudorandom number generator.
- New benchmark for the device API using Google Benchmark, `benchmark_rocrand_device_api`, replacing `benchmark_rocrand_kernel`. `benchmark_rocrand_kernel` is deprecated and will be removed in a future version. Likewise, `benchmark_curand_host_api` is added to replace `benchmark_curand_generate` and `benchmark_curand_device_api` is added to replace `benchmark_curand_kernel`.
- experimental HIP-CPU feature
- ThreeFry pseudorandom number generator based on Salmon et al., 2011, "Parallel random numbers: as easy as 1, 2, 3".
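
For reference, the MT19937 algorithm named in the list above can be sketched on the host in plain C. This is a minimal illustration of the published 32-bit Mersenne Twister, not rocRAND's GPU implementation or API. The type and function names are hypothetical, and the seeding routine is the commonly used 2002 initialization, which is an assumption here since rocRAND's seeding may differ.

```c
#include <stdint.h>

#define MT_N 624 /* state size in 32-bit words */
#define MT_M 397 /* middle-word offset used by the twist */

/* Hypothetical generator state for this sketch. */
typedef struct {
    uint32_t state[MT_N];
    int index;
} mt19937_t;

/* Fill the state from a 32-bit seed (2002 initialization routine). */
void mt19937_seed(mt19937_t *g, uint32_t seed)
{
    g->state[0] = seed;
    for (int i = 1; i < MT_N; ++i) {
        uint32_t prev = g->state[i - 1];
        g->state[i] = 1812433253u * (prev ^ (prev >> 30)) + (uint32_t)i;
    }
    g->index = MT_N; /* force a twist on the first draw */
}

/* Return the next 32-bit output, regenerating the state when exhausted. */
uint32_t mt19937_next(mt19937_t *g)
{
    if (g->index >= MT_N) {
        for (int i = 0; i < MT_N; ++i) {
            uint32_t y = (g->state[i] & 0x80000000u)
                       | (g->state[(i + 1) % MT_N] & 0x7fffffffu);
            uint32_t next = g->state[(i + MT_M) % MT_N] ^ (y >> 1);
            if (y & 1u) {
                next ^= 0x9908b0dfu;
            }
            g->state[i] = next;
        }
        g->index = 0;
    }
    /* Tempering step. */
    uint32_t y = g->state[g->index++];
    y ^= y >> 11;
    y ^= (y << 7) & 0x9d2c5680u;
    y ^= (y << 15) & 0xefc60000u;
    y ^= y >> 18;
    return y;
}
```

Seeding with 5489 (the conventional default seed) and drawing repeatedly reproduces the reference sequence, which makes the sketch handy for cross-checking outputs of other MT19937 implementations that use the same seeding.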
##### Changed
- Python 2.7 is no longer officially supported.
#### rocSOLVER 3.22.0
rocSOLVER 3.22.0 for ROCm 5.6.0
##### Added
- LU refactorization for sparse matrices
- CSRRF_ANALYSIS
- CSRRF_SUMLU
- CSRRF_SPLITLU
- CSRRF_REFACTLU
- Linear system solver for sparse matrices
- CSRRF_SOLVE
- Added type `rocsolver_rfinfo` for use with sparse matrix routines
##### Optimized
- Improved the performance of BDSQR and GESVD when singular vectors are requested
##### Fixed
- BDSQR and GESVD should no longer hang when the input contains `NaN` or `Inf`
#### rocSPARSE 2.5.2
rocSPARSE 2.5.2 for ROCm 5.6.0
##### Improved
- Fixed a memory leak in csritsv
- Fixed a bug in csrsm and bsrsm
#### rocThrust 2.18.0
rocThrust 2.18.0 for ROCm 5.6.0
##### Fixed
- `lower_bound`, `upper_bound`, and `binary_search` failed to compile for certain types.
##### Changed
- Updated `docs` directory structure to match the standard of [rocm-docs-core](https://github.com/RadeonOpenCompute/rocm-docs-core).
#### rocWMMA 1.1.0
rocWMMA 1.1.0 for ROCm 5.6.0
##### Added
- Added cross-lane operation backends (Blend, Permute, Swizzle and Dpp)
- Added GPU kernels for rocWMMA unit test pre-process and post-process operations (fill, validation)
- Added performance gemm samples for half, single and double precision
- Added rocWMMA cmake versioning
- Added vectorized support in coordinate transforms
- Included ROCm smi for runtime clock rate detection
- Added fragment transforms for transpose and change data layout
##### Changed
- Default to GPU rocBLAS validation against rocWMMA
- Re-enabled int8 gemm tests on gfx9
- Upgraded to C++17
- Restructured unit test folder for consistency
- Consolidated rocWMMA samples common code
#### Tensile 4.37.0
Tensile 4.37.0 for ROCm 5.6.0
##### Added
- Added user driven tuning API
- Added decision tree fallback feature
- Added SingleBuffer + AtomicAdd option for GlobalSplitU
- DirectToVgpr support for fp16 and Int8 with TN orientation
- Added new test cases for various functions
- Added SingleBuffer algorithm for ZGEMM/CGEMM
- Added joblib for parallel map calls
- Added support for MFMA + LocalSplitU + DirectToVgprA+B
- Added asmcap check for MIArchVgpr
- Added support for MFMA + LocalSplitU
- Added frequency, power, and temperature data to the output
##### Optimizations
- Improved the performance of GlobalSplitU with SingleBuffer algorithm
- Reduced the running time of the extended and pre_checkin tests
- Optimized the Tailloop section of the assembly kernel
- Optimized complex GEMM (fixed vgpr allocation, unified CGEMM and ZGEMM code in MulMIoutAlphaToArch)
- Improved the performance of the second kernel of MultipleBuffer algorithm
##### Changed
- Updated custom kernels with 64-bit offsets
- Adapted 64-bit offset arguments for assembly kernels
- Improved temporary register re-use to reduce max sgpr usage
- Removed some restrictions on VectorWidth and DirectToVgpr
- Updated the dependency requirements for Tensile
- Changed the range of AssertSummationElementMultiple
- Modified the error messages for more clarity
- Changed DivideAndReminder to vectorStaticRemainder in case quotient is not used
- Removed dummy vgpr for vectorStaticRemainder
- Removed tmpVgpr parameter from vectorStaticRemainder/Divide/DivideAndReminder
- Removed qReg parameter from vectorStaticRemainder
##### Fixed
- Fixed tmp sgpr allocation to avoid over-writing values (alpha)
- 64-bit offset parameters for post kernels
- Fixed gfx908 CI test failures
- Fixed offset calculation to prevent overflow for large offsets
- Fixed issues when BufferLoad and BufferStore are equal to zero
- Fixed StoreCInUnroll + DirectToVgpr + no useInitAccVgprOpt mismatch
- Fixed DirectToVgpr + LocalSplitU + FractionalLoad mismatch
- Fixed the memory access error related to StaggerU + large stride
- Fixed ZGEMM 4x4 MatrixInst mismatch
- Fixed DGEMM 4x4 MatrixInst mismatch
- Fixed ASEM + GSU + NoTailLoop opt mismatch
- Fixed AssertSummationElementMultiple + GlobalSplitU issues
- Fixed ASEM + GSU + TailLoop inner unroll
-------------------
## ROCm 5.5.1
<!-- markdownlint-disable first-line-h1 -->
<!-- markdownlint-disable no-duplicate-header -->
### What's New in This Release
#### HIP SDK for Windows
AMD is pleased to announce the availability of the HIP SDK for Windows as part
of the ROCm platform. The
[HIP SDK OS and GPU support page](https://rocm.docs.amd.com/en/docs-5.5.1/release/windows_support.html)
lists the versions of Windows and GPUs validated by AMD. HIP SDK features on
Windows are described in detail in our
[What is ROCm?](https://rocm.docs.amd.com/en/docs-5.5.1/rocm.html#rocm-on-windows)
page and differ from the Linux feature set. Visit the
[Quick Start](https://rocm.docs.amd.com/en/docs-5.5.1/deploy/windows/quick_start.html#)
page to get started. Known issues are tracked on
[GitHub](https://github.com/RadeonOpenCompute/ROCm/issues?q=is%3Aopen+label%3A5.5.1+label%3A%22Verified+Issue%22+label%3AWindows).
#### HIP API Change
The following HIP API is updated in the ROCm v5.5.1 release,
The following HIP API is updated in the ROCm 5.5.1 release:
##### `hipDeviceSetCacheConfig`
@@ -37,10 +618,12 @@ The following HIP API is updated in the ROCm v5.5.1 release,
| hipFFT | [1.0.11](https://github.com/ROCmSoftwarePlatform/hipFFT/releases/tag/rocm-5.5.1) |
| hipSOLVER | [1.7.0](https://github.com/ROCmSoftwarePlatform/hipSOLVER/releases/tag/rocm-5.5.1) |
| hipSPARSE | [2.3.5](https://github.com/ROCmSoftwarePlatform/hipSPARSE/releases/tag/rocm-5.5.1) |
| MIOpen | [2.19.0](https://github.com/ROCmSoftwarePlatform/MIOpen/releases/tag/rocm-5.5.1) |
| rccl | [2.15.5](https://github.com/ROCmSoftwarePlatform/rccl/releases/tag/rocm-5.5.1) |
| rocALUTION | [2.1.8](https://github.com/ROCmSoftwarePlatform/rocALUTION/releases/tag/rocm-5.5.1) |
| rocBLAS | [2.47.0](https://github.com/ROCmSoftwarePlatform/rocBLAS/releases/tag/rocm-5.5.1) |
| rocFFT | [1.0.22](https://github.com/ROCmSoftwarePlatform/rocFFT/releases/tag/rocm-5.5.1) |
| rocm-cmake | [0.8.1](https://github.com/RadeonOpenCompute/rocm-cmake/releases/tag/rocm-5.5.1) |
| rocPRIM | [2.13.0](https://github.com/ROCmSoftwarePlatform/rocPRIM/releases/tag/rocm-5.5.1) |
| rocRAND | [2.10.17](https://github.com/ROCmSoftwarePlatform/rocRAND/releases/tag/rocm-5.5.1) |
| rocSOLVER | [3.21.0](https://github.com/ROCmSoftwarePlatform/rocSOLVER/releases/tag/rocm-5.5.1) |
@@ -80,6 +663,29 @@ The following hipcc changes are implemented in this release:
- `hipCommander` at <https://github.com/ROCm-Developer-Tools/hip-tests/tree/develop/samples/1_Utils/hipCommander>
Note that the samples will continue to be available in previous release branches.
- Removal of gcnarch from hipDeviceProp_t structure
- Addition of new fields in hipDeviceProp_t structure
- maxTexture1D
- maxTexture2D
- maxTexture1DLayered
- maxTexture2DLayered
- sharedMemPerMultiprocessor
- deviceOverlap
- asyncEngineCount
- surfaceAlignment
- unifiedAddressing
- computePreemptionSupported
- hostRegisterSupported
- uuid
- Removal of deprecated code
- hip-hcc codes from hip code tree
- Correct hipArray usage in HIP APIs such as hipMemcpyAtoH and hipMemcpyHtoA
- HIPMEMCPY_3D fields correction to avoid truncation of "size_t" to "unsigned int" inside hipMemcpy3D()
- Renaming of 'memoryType' in hipPointerAttribute_t structure to 'type'
- Correct hipGetLastError to return the last error instead of last API call's return code
- Update hipExternalSemaphoreHandleDesc to add "unsigned int reserved[16]"
- Correct handling of flag values in hipIpcOpenMemHandle for hipIpcMemLazyEnablePeerAccess
- Remove hiparray* and make it opaque with hipArray_t
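
The `size_t` to `unsigned int` truncation mentioned in the list above can be illustrated with a small C sketch. The struct and function names below are hypothetical stand-ins, not the actual HIP definitions, and the sketch assumes a platform where `size_t` is 64 bits wide.

```c
#include <stddef.h>

/* This sketch assumes a 64-bit size_t (typical for LP64/LLP64 targets). */
_Static_assert(sizeof(size_t) == 8, "sketch assumes 64-bit size_t");

/* Hypothetical descriptor with a narrow field (prone to truncation). */
typedef struct {
    unsigned int width_in_bytes;
} copy_desc_narrow;

/* Hypothetical descriptor with a widened field. */
typedef struct {
    size_t width_in_bytes;
} copy_desc_wide;

/* Store a byte count in the narrow field; the high 32 bits are dropped. */
unsigned int store_narrow(size_t width)
{
    copy_desc_narrow d;
    d.width_in_bytes = (unsigned int)width;
    return d.width_in_bytes;
}

/* Store a byte count in the wide field; the value is preserved. */
size_t store_wide(size_t width)
{
    copy_desc_wide d;
    d.width_in_bytes = width;
    return d.width_in_bytes;
}
```

For a width of 4 GiB plus 16 bytes, `store_narrow` keeps only 16, while `store_wide` returns the full value, which is the kind of silent size loss that widening the fields avoids.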
##### New HIP APIs in This Release
@@ -348,10 +954,12 @@ Multiple HIP directed tests fail.
| hipFFT | 1.0.10 ⇒ [1.0.11](https://github.com/ROCmSoftwarePlatform/hipFFT/releases/tag/rocm-5.5.0) |
| hipSOLVER | 1.6.0 ⇒ [1.7.0](https://github.com/ROCmSoftwarePlatform/hipSOLVER/releases/tag/rocm-5.5.0) |
| hipSPARSE | 2.3.3 ⇒ [2.3.5](https://github.com/ROCmSoftwarePlatform/hipSPARSE/releases/tag/rocm-5.5.0) |
| MIOpen | ⇒ [2.19.0](https://github.com/ROCmSoftwarePlatform/MIOpen/releases/tag/rocm-5.5.0) |
| rccl | 2.13.4 ⇒ [2.15.5](https://github.com/ROCmSoftwarePlatform/rccl/releases/tag/rocm-5.5.0) |
| rocALUTION | 2.1.3 ⇒ [2.1.8](https://github.com/ROCmSoftwarePlatform/rocALUTION/releases/tag/rocm-5.5.0) |
| rocBLAS | 2.46.0 ⇒ [2.47.0](https://github.com/ROCmSoftwarePlatform/rocBLAS/releases/tag/rocm-5.5.0) |
| rocFFT | 1.0.21 ⇒ [1.0.22](https://github.com/ROCmSoftwarePlatform/rocFFT/releases/tag/rocm-5.5.0) |
| rocm-cmake | 0.8.0 ⇒ [0.8.1](https://github.com/RadeonOpenCompute/rocm-cmake/releases/tag/rocm-5.5.0) |
| rocPRIM | 2.12.0 ⇒ [2.13.0](https://github.com/ROCmSoftwarePlatform/rocPRIM/releases/tag/rocm-5.5.0) |
| rocRAND | 2.10.16 ⇒ [2.10.17](https://github.com/ROCmSoftwarePlatform/rocRAND/releases/tag/rocm-5.5.0) |
| rocSOLVER | 3.20.0 ⇒ [3.21.0](https://github.com/ROCmSoftwarePlatform/rocSOLVER/releases/tag/rocm-5.5.0) |
@@ -441,6 +1049,24 @@ hipSPARSE 2.3.5 for ROCm 5.5.0
- Improved documentation
- Fixed a bug with deprecation messages when using gcc9 (Thanks @Maetveis)
#### MIOpen 2.19.0
MIOpen 2.19.0 for ROCm 5.5.0
##### Added
- ROCm 5.5 support for gfx1101 (Navi32)
##### Changed
- Tuning results for MLIR on ROCm 5.5
- Bumping MLIR commit to 5.5.0 release tag
##### Fixed
- Fix 3d convolution Host API bug
- [HOTFIX][MI200][FP16] Disabled ConvHipImplicitGemmBwdXdlops when FP16_ALT is required.
#### rccl 2.15.5
RCCL 2.15.5 for ROCm 5.5.0
@@ -552,6 +1178,18 @@ rocFFT 1.0.22 for ROCm 5.5.0
- Removed zero-length twiddle table allocations, which fixes errors from hipMallocManaged.
- Fixed incorrect freeing of HIP stream handles during twiddle computation when multiple devices are present.
#### rocm-cmake 0.8.1
rocm-cmake 0.8.1 for ROCm 5.5.0
##### Fixed
- ROCMInstallTargets: Added compatibility symlinks for included cmake files in `<ROCM>/lib/cmake/<PACKAGE>`.
##### Changed
- ROCMHeaderWrapper: The wrapper header deprecation message is now a deprecation warning.
#### rocPRIM 2.13.0
rocPRIM 2.13.0 for ROCm 5.5.0
@@ -867,6 +1505,7 @@ This issue is under investigation, and the known workaround is not to use -save-
| rocALUTION | [2.1.3](https://github.com/ROCmSoftwarePlatform/rocALUTION/releases/tag/rocm-5.4.3) |
| rocBLAS | [2.46.0](https://github.com/ROCmSoftwarePlatform/rocBLAS/releases/tag/rocm-5.4.3) |
| rocFFT | 1.0.20 ⇒ [1.0.21](https://github.com/ROCmSoftwarePlatform/rocFFT/releases/tag/rocm-5.4.3) |
| rocm-cmake | [0.8.0](https://github.com/RadeonOpenCompute/rocm-cmake/releases/tag/rocm-5.4.3) |
| rocPRIM | [2.12.0](https://github.com/ROCmSoftwarePlatform/rocPRIM/releases/tag/rocm-5.4.3) |
| rocRAND | [2.10.16](https://github.com/ROCmSoftwarePlatform/rocRAND/releases/tag/rocm-5.4.3) |
| rocSOLVER | [3.20.0](https://github.com/ROCmSoftwarePlatform/rocSOLVER/releases/tag/rocm-5.4.3) |
@@ -930,6 +1569,7 @@ This is a known issue and will be fixed in a future release.
| rocALUTION | [2.1.3](https://github.com/ROCmSoftwarePlatform/rocALUTION/releases/tag/rocm-5.4.2) |
| rocBLAS | [2.46.0](https://github.com/ROCmSoftwarePlatform/rocBLAS/releases/tag/rocm-5.4.2) |
| rocFFT | [1.0.20](https://github.com/ROCmSoftwarePlatform/rocFFT/releases/tag/rocm-5.4.2) |
| rocm-cmake | [0.8.0](https://github.com/RadeonOpenCompute/rocm-cmake/releases/tag/rocm-5.4.2) |
| rocPRIM | [2.12.0](https://github.com/ROCmSoftwarePlatform/rocPRIM/releases/tag/rocm-5.4.2) |
| rocRAND | [2.10.16](https://github.com/ROCmSoftwarePlatform/rocRAND/releases/tag/rocm-5.4.2) |
| rocSOLVER | [3.20.0](https://github.com/ROCmSoftwarePlatform/rocSOLVER/releases/tag/rocm-5.4.2) |
@@ -1019,6 +1659,7 @@ Maintenance update #3, combined with ROCm 5.4.1, now provides SRIOV virtualizati
| rocALUTION | [2.1.3](https://github.com/ROCmSoftwarePlatform/rocALUTION/releases/tag/rocm-5.4.1) |
| rocBLAS | [2.46.0](https://github.com/ROCmSoftwarePlatform/rocBLAS/releases/tag/rocm-5.4.1) |
| rocFFT | 1.0.19 ⇒ [1.0.20](https://github.com/ROCmSoftwarePlatform/rocFFT/releases/tag/rocm-5.4.1) |
| rocm-cmake | [0.8.0](https://github.com/RadeonOpenCompute/rocm-cmake/releases/tag/rocm-5.4.1) |
| rocPRIM | [2.12.0](https://github.com/ROCmSoftwarePlatform/rocPRIM/releases/tag/rocm-5.4.1) |
| rocRAND | [2.10.16](https://github.com/ROCmSoftwarePlatform/rocRAND/releases/tag/rocm-5.4.1) |
| rocSOLVER | [3.20.0](https://github.com/ROCmSoftwarePlatform/rocSOLVER/releases/tag/rocm-5.4.1) |
@@ -1135,6 +1776,8 @@ The `hipcc` and `hipconfig` Perl scripts are deprecated. In a future release, co
>
> There will be a transition period where the Perl scripts and compiled binaries are available before the scripts are removed. There will be no functional difference between the Perl scripts and their compiled binary counterpart. No user action is required. Once these are available, users can optionally switch to `hipcc.bin` and `hipconfig.bin`. The `hipcc`/`hipconfig` soft link will be assimilated to point from `hipcc`/`hipconfig` to the respective compiled binaries as the default option.
(5_4_0_filesystem_reorg_deprecation_notice)=
##### Linux Filesystem Hierarchy Standard for ROCm
ROCm packages have adopted the Linux foundation filesystem hierarchy standard in this release to ensure ROCm components follow open source conventions for Linux-based distributions. While moving to a new filesystem hierarchy, ROCm ensures backward compatibility with its 5.1 version or older filesystem hierarchy. See below for a detailed explanation of the new filesystem hierarchy and backward compatibility.
@@ -1245,9 +1888,8 @@ The test was incorrectly using the `hipDeviceAttributePageableMemoryAccess` devi
`hipHostMalloc()` allocates memory with fine-grained access by default when the environment variable `HIP_HOST_COHERENT=1` is used.
For more information, refer to the HIP Programming Guide at
For more information, refer to {doc}`hip:.doxygen/docBin/html/index`.
<https://docs.amd.com/bundle/HIP-Programming-Guide-v5.4/page/Introduction_to_HIP_Programming_Guide.html>
#### SoftHang with `hipStreamWithCUMask` test on AMD Instinct™
@@ -1278,6 +1920,7 @@ GPU IDs reported by ROCTracer and ROCProfiler or ROCm Tools are HSA Driver Node
| rocALUTION | 2.1.0 ⇒ [2.1.3](https://github.com/ROCmSoftwarePlatform/rocALUTION/releases/tag/rocm-5.4.0) |
| rocBLAS | 2.45.0 ⇒ [2.46.0](https://github.com/ROCmSoftwarePlatform/rocBLAS/releases/tag/rocm-5.4.0) |
| rocFFT | 1.0.18 ⇒ [1.0.19](https://github.com/ROCmSoftwarePlatform/rocFFT/releases/tag/rocm-5.4.0) |
| rocm-cmake | [0.8.0](https://github.com/RadeonOpenCompute/rocm-cmake/releases/tag/rocm-5.4.0) |
| rocPRIM | 2.11.0 ⇒ [2.12.0](https://github.com/ROCmSoftwarePlatform/rocPRIM/releases/tag/rocm-5.4.0) |
| rocRAND | 2.10.15 ⇒ [2.10.16](https://github.com/ROCmSoftwarePlatform/rocRAND/releases/tag/rocm-5.4.0) |
| rocSOLVER | 3.19.0 ⇒ [3.20.0](https://github.com/ROCmSoftwarePlatform/rocSOLVER/releases/tag/rocm-5.4.0) |
@@ -1605,6 +2248,7 @@ This issue is resolved with the following fixes to compilation failures:
| rocALUTION | [2.1.0](https://github.com/ROCmSoftwarePlatform/rocALUTION/releases/tag/rocm-5.3.3) |
| rocBLAS | [2.45.0](https://github.com/ROCmSoftwarePlatform/rocBLAS/releases/tag/rocm-5.3.3) |
| rocFFT | [1.0.18](https://github.com/ROCmSoftwarePlatform/rocFFT/releases/tag/rocm-5.3.3) |
| rocm-cmake | [0.8.0](https://github.com/RadeonOpenCompute/rocm-cmake/releases/tag/rocm-5.3.3) |
| rocPRIM | [2.11.0](https://github.com/ROCmSoftwarePlatform/rocPRIM/releases/tag/rocm-5.3.3) |
| rocRAND | [2.10.15](https://github.com/ROCmSoftwarePlatform/rocRAND/releases/tag/rocm-5.3.3) |
| rocSOLVER | [3.19.0](https://github.com/ROCmSoftwarePlatform/rocSOLVER/releases/tag/rocm-5.3.3) |
@@ -1674,6 +2318,7 @@ This issue is currently under investigation and will be resolved in a future rel
| rocALUTION | [2.1.0](https://github.com/ROCmSoftwarePlatform/rocALUTION/releases/tag/rocm-5.3.2) |
| rocBLAS | [2.45.0](https://github.com/ROCmSoftwarePlatform/rocBLAS/releases/tag/rocm-5.3.2) |
| rocFFT | [1.0.18](https://github.com/ROCmSoftwarePlatform/rocFFT/releases/tag/rocm-5.3.2) |
| rocm-cmake | [0.8.0](https://github.com/RadeonOpenCompute/rocm-cmake/releases/tag/rocm-5.3.2) |
| rocPRIM | [2.11.0](https://github.com/ROCmSoftwarePlatform/rocPRIM/releases/tag/rocm-5.3.2) |
| rocRAND | [2.10.15](https://github.com/ROCmSoftwarePlatform/rocRAND/releases/tag/rocm-5.3.2) |
| rocSOLVER | [3.19.0](https://github.com/ROCmSoftwarePlatform/rocSOLVER/releases/tag/rocm-5.3.2) |
@@ -1857,6 +2502,7 @@ Workaround: To avoid the system crash, add `amd_iommu=on iommu=pt` as the kernel
| rocALUTION | 2.0.3 ⇒ [2.1.0](https://github.com/ROCmSoftwarePlatform/rocALUTION/releases/tag/rocm-5.3.0) |
| rocBLAS | 2.44.0 ⇒ [2.45.0](https://github.com/ROCmSoftwarePlatform/rocBLAS/releases/tag/rocm-5.3.0) |
| rocFFT | 1.0.17 ⇒ [1.0.18](https://github.com/ROCmSoftwarePlatform/rocFFT/releases/tag/rocm-5.3.0) |
| rocm-cmake | ⇒ [0.8.0](https://github.com/RadeonOpenCompute/rocm-cmake/releases/tag/rocm-5.3.0) |
| rocPRIM | 2.10.14 ⇒ [2.11.0](https://github.com/ROCmSoftwarePlatform/rocPRIM/releases/tag/rocm-5.3.0) |
| rocRAND | 2.10.14 ⇒ [2.10.15](https://github.com/ROCmSoftwarePlatform/rocRAND/releases/tag/rocm-5.3.0) |
| rocSOLVER | 3.18.0 ⇒ [3.19.0](https://github.com/ROCmSoftwarePlatform/rocSOLVER/releases/tag/rocm-5.3.0) |
@@ -2039,6 +2685,21 @@ rocFFT 1.0.18 for ROCm 5.3.0
An example is 98^3 R2C out-of-place.
- Fixed bugs in SBRC_ERC type.
#### rocm-cmake 0.8.0
rocm-cmake 0.8.0 for ROCm 5.3.0
##### Fixed
- Fixed error in prerm scripts created by `rocm_create_package` that could break uninstall for packages using the `PTH` option.
##### Changed
- `ROCM_USE_DEV_COMPONENT` set to on by default for all platforms. This means that Windows will now generate runtime and devel packages by default
- ROCMInstallTargets now defaults `CMAKE_INSTALL_LIBDIR` to `lib` if not otherwise specified.
- Changed default Debian compression type to xz and enabled multi-threaded package compression.
- `rocm_create_package` will no longer warn upon failure to determine version of program rpmbuild.
#### rocPRIM 2.11.0
rocPRIM 2.11.0 for ROCm 5.3.0
@@ -2561,7 +3222,8 @@ The new APIs for virtual memory management are as follows:
hipError_t hipMemUnmap(void* ptr, size_t size);
```
For more information, refer to the HIP API documentation at <https://docs.amd.com/bundle/HIP_API_Guide/page/modules.html>
For more information, refer to the HIP API documentation at
{doc}`hip:.doxygen/docBin/html/modules`.
##### Planned HIP Changes in Future Releases
@@ -2577,7 +3239,8 @@ This release introduces a new ROCm C++ library for accelerating mixed precision
rocWMMA is released as a header library and includes test and sample projects to validate and illustrate example usages of the C++ API. GEMM matrix multiplication is used as primary validation given the heavy precedent for the library. However, the usage portfolio is growing significantly and demonstrates different ways rocWMMA may be consumed.
For more information, refer to <https://docs.amd.com/category/libraries>.
For more information, refer to
[Communication Libraries](../../../../docs/reference/gpu_libraries/communication.md).
#### OpenMP Enhancements in This Release
@@ -3171,7 +3834,8 @@ ROCDebugger Machine Interface (MI) extends support to lanes. The following enhan
- MI varobjs are now lane-aware.
For more information, refer to the ROC Debugger User Guide at <https://docs.amd.com>.
For more information, refer to the ROC Debugger User Guide at
{doc}`ROCgdb <rocgdb:index>`.
##### Enhanced - clone-inferior Command
@@ -3193,7 +3857,7 @@ This release includes support for AMD Radeon™ Pro W6800, in addition to other
- Various other bug fixes and performance improvements
For more information, see <https://docs.amd.com/bundle/MIOpen_gh-pages/page/releasenotes.html>
For more information, see {doc}`Documentation <miopen:index>`.
#### Checkpoint Restore Support With CRIU


@@ -15,40 +15,568 @@ The release notes for the ROCm platform.
-------------------
## ROCm 5.5.1
## ROCm 5.6.0
<!-- markdownlint-disable first-line-h1 -->
<!-- markdownlint-disable no-duplicate-header -->
### What's New in This Release
<!-- markdownlint-disable header-increment -->
#### Release Highlights
#### HIP API Change
ROCm 5.6 includes several AI software ecosystem improvements for our fast-growing user base. A few examples include:
The following HIP API is updated in the ROCm v5.5.1 release,
- New documentation portal at https://rocm.docs.amd.com
- Ongoing software enhancements for LLMs, ensuring full compliance with the HuggingFace unit test suite
- OpenAI Triton, CuPy, HIP Graph support, and many other library performance enhancements
- Improved ROCm deployment and development tools, including CPU-GPU (rocGDB) debugger, profiler, and docker containers
- New pseudorandom generators are available in rocRAND. Added support for half-precision transforms in hipFFT/rocFFT. Added LU refactorization and linear system solver for sparse matrices in rocSOLVER.
##### `hipDeviceSetCacheConfig`
#### OS and GPU Support Changes
- The return value for `hipDeviceSetCacheConfig` is updated from `hipErrorNotSupported` to `hipSuccess`
- SLES15 SP5 support was added in this release. SLES15 SP3 support was dropped.
- AMD Instinct MI50, Radeon Pro VII, and Radeon VII products (collectively referred to as gfx906 GPUs) will enter maintenance mode starting in Q3 2023, aligned with the ROCm 5.7 GA release date.
- No new features or performance optimizations will be supported for the gfx906 GPUs beyond ROCm 5.7
- Bug fixes and critical security patches will continue to be supported for the gfx906 GPUs until Q2 2024 (End of Maintenance [EOM]), aligned with the closest ROCm release
- Bug fixes during the maintenance will be made to the next ROCm point release
- Bug fixes will not be back ported to older ROCm releases for this SKU
- Distro / Operating system updates will continue as per the ROCm release cadence for gfx906 GPUs till EOM.
### Library Changes in ROCM 5.5.1
#### AMDSMI CLI 23.0.0.4
##### Added
- AMDSMI CLI tool enabled for Linux Bare Metal & Guest
- Package: amd-smi-lib
##### Known Issues
- not all Error Correction Code (ECC) fields are currently supported
- RHEL 8 & SLES 15 have extra install steps
#### Kernel Modules (DKMS)
##### Fixes
- Stability fix for multi-GPU systems, reproducible via ROCm_Bandwidth_Test, as reported in [Issue 2198](https://github.com/RadeonOpenCompute/ROCm/issues/2198).
#### HIP 5.6 (For ROCm 5.6)
##### Optimizations
- Consolidation of hipamd, rocclr and OpenCL projects in clr
- Optimized lock for graph global capture mode
##### Added
- Added hipRTC support for amd_hip_fp16
- Added hipStreamGetDevice implementation to get the device associated with the stream
- Added HIP_AD_FORMAT_SIGNED_INT16 in hipArray formats
- hipArrayGetInfo for getting information about the specified array
- hipArrayGetDescriptor for getting 1D or 2D array descriptor
- hipArray3DGetDescriptor to get 3D array descriptor
##### Changed
- hipMallocAsync to return success for zero size allocation to match hipMalloc
- Separation of hipcc perl binaries from HIP project to hipcc project. hip-devel package depends on newly added hipcc package
- Consolidation of hipamd, ROCclr, and OpenCL repositories into a single repository called clr. Instructions are updated to build HIP from sources in the HIP Installation guide
- Removed hipBusBandwidth and hipCommander samples from hip-tests
##### Fixed
- Fixed regression in hipMemCpyParam3D when offset is applied
##### Known Issues
- Limited testing on xnack+ configuration
- Multiple HIP test failures (gpuvm fault or hangs)
- hipSetDevice and hipSetDeviceFlags APIs return hipErrorInvalidDevice instead of hipErrorNoDevice, on a system without GPU
- Known memory leak when code object files are loaded/unloaded via hipModuleLoad/hipModuleUnload APIs. Issue will be fixed in a future ROCm release
##### Upcoming changes in future release
- Removal of gcnarch from hipDeviceProp_t structure
- Addition of new fields in hipDeviceProp_t structure
- maxTexture1D
- maxTexture2D
- maxTexture1DLayered
- maxTexture2DLayered
- sharedMemPerMultiprocessor
- deviceOverlap
- asyncEngineCount
- surfaceAlignment
- unifiedAddressing
- computePreemptionSupported
- uuid
- Removal of deprecated code
- hip-hcc codes from hip code tree
- Correct hipArray usage in HIP APIs such as hipMemcpyAtoH and hipMemcpyHtoA
- HIPMEMCPY_3D fields correction (unsigned int -> size_t)
- Renaming of 'memoryType' in hipPointerAttribute_t structure to 'type'
#### ROCgdb-13 (For ROCm 5.6.0)
##### Optimized
- Improved performance when handling the end of a process with a large number of threads.
##### Known Issues
- On certain configurations, ROCgdb can show the following warning message:
`warning: Probes-based dynamic linker interface failed. Reverting to original interface.`
This does not affect ROCgdb's functionalities.
#### ROCprofiler (For ROCm 5.6.0)
In ROCm 5.6 the `rocprofilerv1` and `rocprofilerv2` include and library files of
ROCm 5.5 are split into separate files. The `rocmtools` files that were
deprecated in ROCm 5.5 have been removed.
| ROCm 5.6 | rocprofilerv1 | rocprofilerv2 |
|-----------------|-------------------------------------|----------------------------------------|
| **Tool script** | `bin/rocprof` | `bin/rocprofv2` |
| **API include** | `include/rocprofiler/rocprofiler.h` | `include/rocprofiler/v2/rocprofiler.h` |
| **API library** | `lib/librocprofiler.so.1` | `lib/librocprofiler.so.2` |
The ROCm Profiler Tool that uses `rocprofilerV1` can be invoked using the
following command:
```sh
$ rocprof …
```
To write a custom tool based on the `rocprofilerV1` API do the following:
```C
// main.c
#include <rocprofiler/rocprofiler.h> // Use the rocprofilerV1 API
int main() {
  // Use the rocprofilerV1 API
  return 0;
}
```
This can be built in the following manner:
```sh
$ gcc main.c -I/opt/rocm-5.6.0/include -L/opt/rocm-5.6.0/lib -lrocprofiler64
```
The resulting `a.out` will depend on
`/opt/rocm-5.6.0/lib/librocprofiler64.so.1`.
The ROCm Profiler that uses `rocprofilerV2` API can be invoked using the
following command:
```sh
$ rocprofv2 …
```
To write a custom tool based on the `rocprofilerV2` API do the following:
```C
// main.c
#include <rocprofiler/v2/rocprofiler.h> // Use the rocprofilerV2 API
int main() {
  // Use the rocprofilerV2 API
  return 0;
}
```
This can be built in the following manner:
```sh
$ gcc main.c -I/opt/rocm-5.6.0/include -L/opt/rocm-5.6.0/lib -lrocprofiler64-v2
```
The resulting `a.out` will depend on
`/opt/rocm-5.6.0/lib/librocprofiler64.so.2`.
##### Optimized
- Improved Test Suite
##### Added
- 'end_time' needs to be disabled in roctx_trace.txt
##### Fixed
- rocprof in ROCm/5.4.0 gpu selector broken.
- rocprof in ROCm/5.4.1 fails to generate kernel info.
- rocprof clobbers LD_PRELOAD.
### Library Changes in ROCm 5.6.0
| Library | Version |
|---------|---------|
| hipBLAS | [0.54.0](https://github.com/ROCmSoftwarePlatform/hipBLAS/releases/tag/rocm-5.5.1) |
| hipCUB | [2.13.1](https://github.com/ROCmSoftwarePlatform/hipCUB/releases/tag/rocm-5.5.1) |
| hipFFT | [1.0.11](https://github.com/ROCmSoftwarePlatform/hipFFT/releases/tag/rocm-5.5.1) |
| hipSOLVER | [1.7.0](https://github.com/ROCmSoftwarePlatform/hipSOLVER/releases/tag/rocm-5.5.1) |
| hipSPARSE | [2.3.5](https://github.com/ROCmSoftwarePlatform/hipSPARSE/releases/tag/rocm-5.5.1) |
| rccl | [2.15.5](https://github.com/ROCmSoftwarePlatform/rccl/releases/tag/rocm-5.5.1) |
| rocALUTION | [2.1.8](https://github.com/ROCmSoftwarePlatform/rocALUTION/releases/tag/rocm-5.5.1) |
| rocBLAS | [2.47.0](https://github.com/ROCmSoftwarePlatform/rocBLAS/releases/tag/rocm-5.5.1) |
| rocFFT | [1.0.22](https://github.com/ROCmSoftwarePlatform/rocFFT/releases/tag/rocm-5.5.1) |
| rocPRIM | [2.13.0](https://github.com/ROCmSoftwarePlatform/rocPRIM/releases/tag/rocm-5.5.1) |
| rocRAND | [2.10.17](https://github.com/ROCmSoftwarePlatform/rocRAND/releases/tag/rocm-5.5.1) |
| rocSOLVER | [3.21.0](https://github.com/ROCmSoftwarePlatform/rocSOLVER/releases/tag/rocm-5.5.1) |
| rocSPARSE | [2.5.1](https://github.com/ROCmSoftwarePlatform/rocSPARSE/releases/tag/rocm-5.5.1) |
| rocThrust | [2.17.0](https://github.com/ROCmSoftwarePlatform/rocThrust/releases/tag/rocm-5.5.1) |
| rocWMMA | [1.0](https://github.com/ROCmSoftwarePlatform/rocWMMA/releases/tag/rocm-5.5.1) |
| Tensile | [4.36.0](https://github.com/ROCmSoftwarePlatform/Tensile/releases/tag/rocm-5.5.1) |
| hipBLAS | ⇒ [1.0.0](https://github.com/ROCmSoftwarePlatform/hipBLAS/releases/tag/rocm-5.6.0) |
| hipCUB | [2.13.1](https://github.com/ROCmSoftwarePlatform/hipCUB/releases/tag/rocm-5.6.0) |
| hipFFT | [1.0.12](https://github.com/ROCmSoftwarePlatform/hipFFT/releases/tag/rocm-5.6.0) |
| hipSOLVER | [1.8.0](https://github.com/ROCmSoftwarePlatform/hipSOLVER/releases/tag/rocm-5.6.0) |
| hipSPARSE | [2.3.6](https://github.com/ROCmSoftwarePlatform/hipSPARSE/releases/tag/rocm-5.6.0) |
| MIOpen | ⇒ [2.19.0](https://github.com/ROCmSoftwarePlatform/MIOpen/releases/tag/rocm-5.6.0) |
| rccl | ⇒ [2.15.5](https://github.com/ROCmSoftwarePlatform/rccl/releases/tag/rocm-5.6.0) |
| rocALUTION | ⇒ [2.1.9](https://github.com/ROCmSoftwarePlatform/rocALUTION/releases/tag/rocm-5.6.0) |
| rocBLAS | ⇒ [3.0.0](https://github.com/ROCmSoftwarePlatform/rocBLAS/releases/tag/rocm-5.6.0) |
| rocFFT | ⇒ [1.0.23](https://github.com/ROCmSoftwarePlatform/rocFFT/releases/tag/rocm-5.6.0) |
| rocm-cmake | ⇒ [0.9.0](https://github.com/RadeonOpenCompute/rocm-cmake/releases/tag/rocm-5.6.0) |
| rocPRIM | ⇒ [2.13.0](https://github.com/ROCmSoftwarePlatform/rocPRIM/releases/tag/rocm-5.6.0) |
| rocRAND | ⇒ [2.10.17](https://github.com/ROCmSoftwarePlatform/rocRAND/releases/tag/rocm-5.6.0) |
| rocSOLVER | ⇒ [3.22.0](https://github.com/ROCmSoftwarePlatform/rocSOLVER/releases/tag/rocm-5.6.0) |
| rocSPARSE | ⇒ [2.5.2](https://github.com/ROCmSoftwarePlatform/rocSPARSE/releases/tag/rocm-5.6.0) |
| rocThrust | ⇒ [2.18.0](https://github.com/ROCmSoftwarePlatform/rocThrust/releases/tag/rocm-5.6.0) |
| rocWMMA | ⇒ [1.1.0](https://github.com/ROCmSoftwarePlatform/rocWMMA/releases/tag/rocm-5.6.0) |
| Tensile | ⇒ [4.37.0](https://github.com/ROCmSoftwarePlatform/Tensile/releases/tag/rocm-5.6.0) |
## Older versions
#### hipBLAS 1.0.0
The release notes for older versions can be found in [the changelog](./CHANGELOG.md).
hipBLAS 1.0.0 for ROCm 5.6.0
##### Changed
- added const qualifier to hipBLAS functions (swap, sbmv, spmv, symv, trsm) where missing
##### Removed
- removed support for deprecated hipblasInt8Datatype_t enum
- removed support for deprecated hipblasSetInt8Datatype and hipblasGetInt8Datatype functions
##### Deprecated
- in-place trmm is deprecated. It will be replaced by trmm which includes both in-place and
out-of-place functionality
#### hipCUB 2.13.1
hipCUB 2.13.1 for ROCm 5.6.0
##### Added
- Benchmarks for `BlockShuffle`, `BlockLoad`, and `BlockStore`.
##### Changed
- CUB backend references CUB and Thrust version 1.17.2.
- Improved benchmark coverage of `BlockScan` by adding `ExclusiveScan`, benchmark coverage of `BlockRadixSort` by adding `SortBlockedToStriped`, and benchmark coverage of `WarpScan` by adding `Broadcast`.
- Updated `docs` directory structure to match the standard of [rocm-docs-core](https://github.com/RadeonOpenCompute/rocm-docs-core).
##### Known Issues
- `BlockRadixRankMatch` is currently broken under the rocPRIM backend.
- `BlockRadixRankMatch` with a warp size that does not exactly divide the block size is broken under the CUB backend.
#### hipFFT 1.0.12
hipFFT 1.0.12 for ROCm 5.6.0
##### Added
- Implemented the hipfftXtMakePlanMany, hipfftXtGetSizeMany, hipfftXtExec APIs, to allow requesting half-precision transforms.
##### Changed
- Added --precision argument to benchmark/test clients. --double is still accepted but is deprecated as a method to request a double-precision transform.
#### hipSOLVER 1.8.0
hipSOLVER 1.8.0 for ROCm 5.6.0
##### Added
- Added compatibility API with hipsolverRf prefix
#### hipSPARSE 2.3.6
hipSPARSE 2.3.6 for ROCm 5.6.0
##### Added
- Added SpGEMM algorithms
##### Changed
- For hipsparseXbsr2csr and hipsparseXcsr2bsr, blockDim == 0 now returns HIPSPARSE_STATUS_INVALID_SIZE
#### MIOpen 2.19.0
MIOpen 2.19.0 for ROCm 5.6.0
##### Added
- ROCm 5.5 support for gfx1101 (Navi32)
##### Changed
- Tuning results for MLIR on ROCm 5.5
- Bumping MLIR commit to 5.5.0 release tag
##### Fixed
- Fix 3d convolution Host API bug
- [HOTFIX][MI200][FP16] Disabled ConvHipImplicitGemmBwdXdlops when FP16_ALT is required.
#### rccl 2.15.5
RCCL 2.15.5 for ROCm 5.6.0
##### Changed
- Compatibility with NCCL 2.15.5
- Unit test executable renamed to rccl-UnitTests
##### Added
- HW-topology aware binary tree implementation
- Experimental support for MSCCL
- New unit tests for hipGraph support
- NPKit integration
##### Fixed
- rocm-smi ID conversion
- Support for HIP_VISIBLE_DEVICES for unit tests
- Support for p2p transfers to non (HIP) visible devices
##### Removed
- Removed TransferBench from tools. Exists in standalone repo: https://github.com/ROCmSoftwarePlatform/TransferBench
#### rocALUTION 2.1.9
rocALUTION 2.1.9 for ROCm 5.6.0
##### Improved
- Fixed synchronization issues in level 1 routines
#### rocBLAS 3.0.0
rocBLAS 3.0.0 for ROCm 5.6.0
##### Optimizations
- Improved performance of Level 2 rocBLAS GEMV on gfx90a GPU for non-transposed problems having small matrices and larger batch counts. Performance enhanced for problem sizes when m and n <= 32 and batch_count >= 256.
- Improved performance of rocBLAS syr2k for single, double, and double-complex precision, and her2k for double-complex precision. Slightly improved performance for general sizes on gfx90a.
##### Added
- Added bf16 inputs and f32 compute support to Level 1 rocBLAS Extension functions axpy_ex, scal_ex and nrm2_ex.
##### Deprecated
- trmm inplace is deprecated. It will be replaced by trmm that has both inplace and out-of-place functionality
- rocblas_query_int8_layout_flag() is deprecated and will be removed in a future release
- rocblas_gemm_flags_pack_int8x4 enum is deprecated and will be removed in a future release
- rocblas_set_device_memory_size() is deprecated and will be replaced by a future function rocblas_increase_device_memory_size()
- rocblas_is_user_managing_device_memory() is deprecated and will be removed in a future release
##### Removed
- is_complex helper was deprecated and is now removed. Use rocblas_is_complex instead.
- The enum truncate_t and the value truncate were deprecated and are now removed. They were replaced by rocblas_truncate_t and rocblas_truncate, respectively.
- rocblas_set_int8_type_for_hipblas was deprecated and is now removed.
- rocblas_get_int8_type_for_hipblas was deprecated and is now removed.
##### Dependencies
- added a build-only dependency on Python joblib, which is used by the Tensile build
- fix for cmake install on some OS when performed by install.sh -d --cmake_install
##### Fixed
- make trsm offset calculations 64 bit safe
##### Changed
- refactor rotg test code
#### rocFFT 1.0.23
rocFFT 1.0.23 for ROCm 5.6.0
##### Added
- Implemented half-precision transforms, which can be requested by passing rocfft_precision_half to rocfft_plan_create.
- Implemented a hierarchical solution map which saves how to decompose a problem and the kernels to be used.
- Implemented a first version of offline-tuner to support tuning kernels for C2C/Z2Z problems.
##### Changed
- Replaced std::complex with hipComplex data types for data generator.
- FFT plan dimensions are now sorted to be row-major internally where possible, which produces better plans if the dimensions were accidentally specified in a different order (column-major, for example).
- Added --precision argument to benchmark/test clients. --double is still accepted but is deprecated as a method to request a double-precision transform.
##### Fixed
- Fixed over-allocation of LDS in some real-complex kernels, which was resulting in kernel launch failure.
#### rocm-cmake 0.9.0
rocm-cmake 0.9.0 for ROCm 5.6.0
##### Added
- Added the option ROCM_HEADER_WRAPPER_WERROR
- Compile-time C macro in the wrapper headers causes errors to be emitted instead of warnings.
- Configure-time CMake option sets the default for the C macro.
#### rocPRIM 2.13.0
rocPRIM 2.13.0 for ROCm 5.6.0
##### Added
- New block level `radix_rank` primitive.
- New block level `radix_rank_match` primitive.
- Added a stable block sorting implementation. This can be used with `block_sort` by using the `block_sort_algorithm::stable_merge_sort` algorithm.
##### Changed
- Improved the performance of `block_radix_sort` and `device_radix_sort`.
- Improved the performance of `device_merge_sort`.
- Updated `docs` directory structure to match the standard of [rocm-docs-core](https://github.com/RadeonOpenCompute/rocm-docs-core). Contributed by: [v01dXYZ](https://github.com/v01dXYZ).
##### Known Issues
- Disabled GPU error messages relating to incorrect warp operation usage with Navi GPUs on Windows, due to GPU printf performance issues on Windows.
- When `ROCPRIM_DISABLE_LOOKBACK_SCAN` is set, `device_scan` fails for input sizes bigger than `scan_config::size_limit`, which defaults to `std::numeric_limits<unsigned int>::max()`.
#### rocRAND 2.10.17
rocRAND 2.10.17 for ROCm 5.6.0
##### Added
- MT19937 pseudo random number generator based on M. Matsumoto and T. Nishimura, 1998, Mersenne Twister: A 623-dimensionally equidistributed uniform pseudorandom number generator.
- New benchmark for the device API using Google Benchmark, `benchmark_rocrand_device_api`, replacing `benchmark_rocrand_kernel`. `benchmark_rocrand_kernel` is deprecated and will be removed in a future version. Likewise, `benchmark_curand_host_api` is added to replace `benchmark_curand_generate` and `benchmark_curand_device_api` is added to replace `benchmark_curand_kernel`.
- experimental HIP-CPU feature
- ThreeFry pseudorandom number generator based on Salmon et al., 2011, "Parallel random numbers: as easy as 1, 2, 3".
##### Changed
- Python 2.7 is no longer officially supported.
#### rocSOLVER 3.22.0
rocSOLVER 3.22.0 for ROCm 5.6.0
##### Added
- LU refactorization for sparse matrices
- CSRRF_ANALYSIS
- CSRRF_SUMLU
- CSRRF_SPLITLU
- CSRRF_REFACTLU
- Linear system solver for sparse matrices
- CSRRF_SOLVE
- Added type `rocsolver_rfinfo` for use with sparse matrix routines
##### Optimized
- Improved the performance of BDSQR and GESVD when singular vectors are requested
##### Fixed
- BDSQR and GESVD should no longer hang when the input contains `NaN` or `Inf`
#### rocSPARSE 2.5.2
rocSPARSE 2.5.2 for ROCm 5.6.0
##### Improved
- Fixed a memory leak in csritsv
- Fixed a bug in csrsm and bsrsm
#### rocThrust 2.18.0
rocThrust 2.18.0 for ROCm 5.6.0
##### Fixed
- `lower_bound`, `upper_bound`, and `binary_search` failed to compile for certain types.
##### Changed
- Updated `docs` directory structure to match the standard of [rocm-docs-core](https://github.com/RadeonOpenCompute/rocm-docs-core).
#### rocWMMA 1.1.0
rocWMMA 1.1.0 for ROCm 5.6.0
##### Added
- Added cross-lane operation backends (Blend, Permute, Swizzle and Dpp)
- Added GPU kernels for rocWMMA unit test pre-process and post-process operations (fill, validation)
- Added performance gemm samples for half, single and double precision
- Added rocWMMA cmake versioning
- Added vectorized support in coordinate transforms
- Included ROCm smi for runtime clock rate detection
- Added fragment transforms for transpose and change data layout
##### Changed
- Default to GPU rocBLAS validation against rocWMMA
- Re-enabled int8 gemm tests on gfx9
- Upgraded to C++17
- Restructured unit test folder for consistency
- Consolidated rocWMMA samples common code
#### Tensile 4.37.0
Tensile 4.37.0 for ROCm 5.6.0
##### Added
- Added user driven tuning API
- Added decision tree fallback feature
- Added SingleBuffer + AtomicAdd option for GlobalSplitU
- DirectToVgpr support for fp16 and Int8 with TN orientation
- Added new test cases for various functions
- Added SingleBuffer algorithm for ZGEMM/CGEMM
- Added joblib for parallel map calls
- Added support for MFMA + LocalSplitU + DirectToVgprA+B
- Added asmcap check for MIArchVgpr
- Added support for MFMA + LocalSplitU
- Added frequency, power, and temperature data to the output
##### Optimizations
- Improved the performance of GlobalSplitU with SingleBuffer algorithm
- Reduced the running time of the extended and pre_checkin tests
- Optimized the Tailloop section of the assembly kernel
- Optimized complex GEMM (fixed vgpr allocation, unified CGEMM and ZGEMM code in MulMIoutAlphaToArch)
- Improved the performance of the second kernel of MultipleBuffer algorithm
##### Changed
- Updated custom kernels with 64-bit offsets
- Adapted 64-bit offset arguments for assembly kernels
- Improved temporary register re-use to reduce max sgpr usage
- Removed some restrictions on VectorWidth and DirectToVgpr
- Updated the dependency requirements for Tensile
- Changed the range of AssertSummationElementMultiple
- Modified the error messages for more clarity
- Changed DivideAndReminder to vectorStaticRemainder in case quotient is not used
- Removed dummy vgpr for vectorStaticRemainder
- Removed tmpVgpr parameter from vectorStaticRemainder/Divide/DivideAndReminder
- Removed qReg parameter from vectorStaticRemainder
##### Fixed
- Fixed tmp sgpr allocation to avoid over-writing values (alpha)
- 64-bit offset parameters for post kernels
- Fixed gfx908 CI test failures
- Fixed offset calculation to prevent overflow for large offsets
- Fixed issues when BufferLoad and BufferStore are equal to zero
- Fixed StoreCInUnroll + DirectToVgpr + no useInitAccVgprOpt mismatch
- Fixed DirectToVgpr + LocalSplitU + FractionalLoad mismatch
- Fixed the memory access error related to StaggerU + large stride
- Fixed ZGEMM 4x4 MatrixInst mismatch
- Fixed DGEMM 4x4 MatrixInst mismatch
- Fixed ASEM + GSU + NoTailLoop opt mismatch
- Fixed AssertSummationElementMultiple + GlobalSplitU issues
- Fixed ASEM + GSU + TailLoop inner unroll

View File

@@ -12,44 +12,41 @@ fetch="https://github.com/GPUOpen-ProfessionalCompute-Libraries/" />
fetch="https://github.com/GPUOpen-Tools/" />
<remote name="KhronosGroup"
fetch="https://github.com/KhronosGroup/" />
<default revision="refs/tags/rocm-5.5.1"
<default revision="refs/tags/rocm-5.6.0"
remote="roc-github"
sync-c="true"
sync-j="4" />
<!--list of projects for ROCM-->
<project name="ROCK-Kernel-Driver" />
<project name="ROCT-Thunk-Interface" />
<project name="ROCR-Runtime" />
<project name="rocm_smi_lib" />
<project name="rocm-core" />
<project name="rocm-cmake" />
<project name="rocminfo" />
<project name="ROCK-Kernel-Driver" remote="roc-github" />
<project name="ROCT-Thunk-Interface" remote="roc-github" />
<project name="ROCR-Runtime" remote="roc-github" />
<project name="rocm_smi_lib" remote="roc-github" />
<project name="rocm-core" remote="roc-github" />
<project name="rocm-cmake" remote="roc-github" />
<project name="rocminfo" remote="roc-github" />
<project name="rocprofiler" remote="rocm-devtools" />
<project name="roctracer" remote="rocm-devtools" />
<project name="ROCm-OpenCL-Runtime" />
<project path="ROCm-OpenCL-Runtime/api/opencl/khronos/icd" name="OpenCL-ICD-Loader" remote="KhronosGroup" revision="6c03f8b58fafd9dd693eaac826749a5cfad515f8" />
<project name="clang-ocl" />
<project name="clang-ocl" remote="roc-github" />
<!--HIP Projects-->
<project name="HIP" remote="rocm-devtools" />
<project name="hipamd" remote="rocm-devtools" />
<project name="clr" remote="rocm-devtools" />
<project name="HIP-Examples" remote="rocm-devtools" />
<project name="ROCclr" remote="rocm-devtools" />
<project name="HIPIFY" remote="rocm-devtools" />
<project name="HIPCC" remote="rocm-devtools" />
<!-- The following projects are all associated with the AMDGPU LLVM compiler -->
<project name="llvm-project" />
<project name="ROCm-Device-Libs" />
<project name="atmi" />
<project name="ROCm-CompilerSupport" />
<project name="llvm-project" remote="roc-github" />
<project name="ROCm-Device-Libs" remote="roc-github" />
<project name="ROCm-CompilerSupport" remote="roc-github" />
<project name="rocr_debug_agent" remote="rocm-devtools" />
<project name="rocm_bandwidth_test" />
<project name="rocm_bandwidth_test" remote="roc-github" />
<project name="half" remote="rocm-swplat" revision="37742ce15b76b44e4b271c1e66d13d2fa7bd003e" />
<project name="RCP" remote="gpuopen-tools" revision="3a49405a1500067c49d181844ec90aea606055bb" />
<!-- gdb projects -->
<project name="ROCgdb" remote="rocm-devtools" />
<project name="ROCdbgapi" remote="rocm-devtools" />
<!-- ROCm Libraries -->
<project name="rdc" />
<project name="rdc" remote="roc-github" />
<project groups="mathlibs" name="rocBLAS" remote="rocm-swplat" />
<project groups="mathlibs" name="Tensile" remote="rocm-swplat" />
<project groups="mathlibs" name="hipBLAS" remote="rocm-swplat" />
@@ -61,7 +58,6 @@ fetch="https://github.com/KhronosGroup/" />
<project groups="mathlibs" name="hipSOLVER" remote="rocm-swplat" />
<project groups="mathlibs" name="hipSPARSE" remote="rocm-swplat" />
<project groups="mathlibs" name="rocALUTION" remote="rocm-swplat" />
<project name="MIOpenGEMM" remote="rocm-swplat" />
<project name="MIOpen" remote="rocm-swplat" />
<project groups="mathlibs" name="rccl" remote="rocm-swplat" />
<project name="MIVisionX" remote="gpuopen-libs" />

View File

@@ -5,9 +5,9 @@ Documentation is built using open source toolchains. Contributions to our
documentation are encouraged and welcome. As a contributor, please familiarize
yourself with our documentation toolchain.
## ReadTheDocs
## Read The Docs
[ReadTheDocs](https://docs.readthedocs.io/en/stable/) is our front end for the
[Read the Docs](https://docs.readthedocs.io/en/stable/) is our front end for the
documentation. By front end, we mean the tool that serves our HTML-based
documentation to our end users.

View File

@@ -14,12 +14,28 @@ shutil.copy2('../RELEASE.md','./release.md')
# Keep capitalization due to similar linking on GitHub's markdown preview.
shutil.copy2('../CHANGELOG.md','./CHANGELOG.md')
latex_engine = "xelatex"
# configurations for PDF output by Read the Docs
project = "ROCm Documentation"
author = "Advanced Micro Devices, Inc."
copyright = "Copyright (c) 2023 Advanced Micro Devices, Inc. All rights reserved."
version = "5.6.0"
release = "5.6.0"
setting_all_article_info = True
all_article_info_os = ["linux"]
all_article_info_os = ["linux", "windows"]
all_article_info_author = ""
# pages with specific settings
article_pages = [
{
"file":"release",
"os":["linux", "windows"],
"date":"2023-07-27"
},
{"file":"deploy/linux/index", "os":["linux"]},
{"file":"deploy/linux/install_overview", "os":["linux"]},
{"file":"deploy/linux/prerequisites", "os":["linux"]},
@@ -30,7 +46,20 @@ article_pages = [
{"file":"deploy/linux/package_manager_integration", "os":["linux"]},
{"file":"deploy/docker", "os":["linux"]},
{"file":"deploy/windows/cli/index", "os":["windows"]},
{"file":"deploy/windows/cli/install", "os":["windows"]},
{"file":"deploy/windows/cli/uninstall", "os":["windows"]},
{"file":"deploy/windows/cli/upgrade", "os":["windows"]},
{"file":"deploy/windows/gui/index", "os":["windows"]},
{"file":"deploy/windows/gui/install", "os":["windows"]},
{"file":"deploy/windows/gui/uninstall", "os":["windows"]},
{"file":"deploy/windows/gui/upgrade", "os":["windows"]},
{"file":"deploy/windows/index", "os":["windows"]},
{"file":"deploy/windows/prerequisites", "os":["windows"]},
{"file":"deploy/windows/quick_start", "os":["windows"]},
{"file":"release/gpu_os_support", "os":["linux"]},
{"file":"release/windows_support", "os":["windows"]},
{"file":"release/docker_support_matrix", "os":["linux"]},
{"file":"reference/gpu_libraries/communication", "os":["linux"]},
@@ -57,7 +86,7 @@ article_pages = [
external_toc_path = "./sphinx/_toc.yml"
docs_core = ROCmDocs("ROCm Documentation Home")
docs_core = ROCmDocs("ROCm 5.6.0 Documentation Home")
docs_core.setup()
external_projects_current_project = "rocm"

View File

@@ -21,7 +21,7 @@ Instructions for upgrading an existing ROCm installation.
:link: uninstall
:link-type: doc
Steps for removing ROCm packages libraries and tools.
Steps for removing ROCm packages, libraries and tools.
:::
::::

View File

@@ -18,8 +18,8 @@ following commands based on your distribution.
```shell
sudo apt update
wget https://repo.radeon.com/amdgpu-install/5.5.1/ubuntu/focal/amdgpu-install_5.5.50501-1_all.deb
sudo apt install ./amdgpu-install_5.5.50501-1_all.deb
wget https://repo.radeon.com/amdgpu-install/5.6/ubuntu/focal/amdgpu-install_5.6.50600-1_all.deb
sudo apt install ./amdgpu-install_5.6.50600-1_all.deb
```
:::
@@ -28,8 +28,8 @@ sudo apt install ./amdgpu-install_5.5.50501-1_all.deb
```shell
sudo apt update
wget https://repo.radeon.com/amdgpu-install/5.5.1/ubuntu/jammy/amdgpu-install_5.5.50501-1_all.deb
sudo apt install ./amdgpu-install_5.5.50501-1_all.deb
wget https://repo.radeon.com/amdgpu-install/5.6/ubuntu/jammy/amdgpu-install_5.6.50600-1_all.deb
sudo apt install ./amdgpu-install_5.6.50600-1_all.deb
```
:::
@@ -39,21 +39,21 @@ sudo apt install ./amdgpu-install_5.5.50501-1_all.deb
:sync: RHEL
::::{tab-set}
:::{tab-item} RHEL 8.6
:sync: RHEL-8.6
:sync: RHEL-8
```shell
sudo yum install https://repo.radeon.com/amdgpu-install/5.5.1/rhel/8.6/amdgpu-install-5.5.50501-1.el8.noarch.rpm
```
:::
:::{tab-item} RHEL 8.7
:sync: RHEL-8.7
:sync: RHEL-8
```shell
sudo yum install https://repo.radeon.com/amdgpu-install/5.5.1/rhel/8.7/amdgpu-install-5.5.50501-1.el8.noarch.rpm
sudo yum install https://repo.radeon.com/amdgpu-install/5.6/rhel/8.7/amdgpu-install-5.6.50600-1.el8.noarch.rpm
```
:::
:::{tab-item} RHEL 8.8
:sync: RHEL-8.8
:sync: RHEL-8
```shell
sudo yum install https://repo.radeon.com/amdgpu-install/5.6/rhel/8.8/amdgpu-install-5.6.50600-1.el8.noarch.rpm
```
:::
@@ -62,21 +62,38 @@ sudo yum install https://repo.radeon.com/amdgpu-install/5.5.1/rhel/8.7/amdgpu-in
:sync: RHEL-9
```shell
sudo yum install https://repo.radeon.com/amdgpu-install/5.5.1/rhel/9.1/amdgpu-install-5.5.50501-1.el8.noarch.rpm
sudo yum install https://repo.radeon.com/amdgpu-install/5.6/rhel/9.1/amdgpu-install-5.6.50600-1.el9.noarch.rpm
```
:::
:::{tab-item} RHEL 9.2
:sync: RHEL-9.2
:sync: RHEL-9
```shell
sudo yum install https://repo.radeon.com/amdgpu-install/5.6/rhel/9.2/amdgpu-install-5.6.50600-1.el9.noarch.rpm
```
:::
::::
:::::
:::::{tab-item} SUSE Linux Enterprise Server 15
:sync: SLES15
:::::{tab-item} SUSE Linux Enterprise Server
:sync: SLES
::::{tab-set}
:::{tab-item} Service Pack 4
:sync: SLES15-SP4
:::{tab-item} SLES 15.4
:sync: SLES-15.4
```shell
sudo zypper --no-gpg-checks install https://repo.radeon.com/amdgpu-install/5.5.1/sle/15.4/amdgpu-install-5.5.50501-1.noarch.rpm
sudo zypper --no-gpg-checks install https://repo.radeon.com/amdgpu-install/5.6/sle/15.4/amdgpu-install-5.6.50600-1.noarch.rpm
```
:::
:::{tab-item} SLES 15.5
:sync: SLES-15.5
```shell
sudo zypper --no-gpg-checks install https://repo.radeon.com/amdgpu-install/5.6/sle/15.5/amdgpu-install-5.6.50600-1.noarch.rpm
```
:::
@@ -155,9 +172,9 @@ the installer script will install packages in the single-version layout.
For the multi-version ROCm installation you must use the installer script from
the latest release of ROCm that you wish to install.
**Example:** If you want to install ROCm releases 5.3.3 and 5.5.1
**Example:** If you want to install ROCm releases 5.3.3 and 5.4.3
simultaneously, you are required to download the installer from the latest ROCm
release v5.5.1.
release v5.4.3.
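
As a minimal sketch of the rule above, the newest release in a list can be picked with a version-aware sort. The function name `latest_release` is hypothetical and not part of any ROCm tooling, and `sort -V` is assumed to be available (it is a GNU coreutils extension, not POSIX).

```shell
# Hypothetical helper: print the newest release among the arguments.
# Assumes a sort that supports -V (version sort), as in GNU coreutils.
latest_release() {
    printf '%s\n' "$@" | sort -V | tail -n 1
}

# The installer script should come from this release.
latest_release 5.3.3 5.4.3
```

A version-aware sort is used instead of a plain lexical sort so that a release such as 5.10.0 would rank above 5.4.3.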
### Add Required Repositories
@@ -242,14 +259,14 @@ sudo yum clean all
:::
::::
:::::
:::::{tab-item} SUSE Linux Enterprise Server 15
:sync: SLES15
:::::{tab-item} SUSE Linux Enterprise Server
:sync: SLES
```shell
for ver in 5.3.3 5.4.3; do
sudo tee --append /etc/zypp/repos.d/rocm.repo <<EOF
name=rocm
baseurl=https://repo.radeon.com/rocm/$ver/sle/15.4/main/x86_64
baseurl=https://repo.radeon.com/rocm/zyp/$ver/main
enabled=1
gpgcheck=1
gpgkey=https://repo.radeon.com/rocm/rocm.gpg.key
@@ -272,12 +289,12 @@ sudo amdgpu-install --usecase=rocm --rocmrelease=<release-number-3>
```
Following are examples of ROCm multi-version installation. The kernel-mode
driver, associated with the ROCm release v5.5.1, will be installed as its latest
driver, associated with the ROCm release v5.4.3, will be installed as its latest
release in the list.
```none
sudo amdgpu-install --usecase=rocm --rocmrelease=5.3.3
sudo amdgpu-install --usecase=rocm --rocmrelease=5.5.1
sudo amdgpu-install --usecase=rocm --rocmrelease=5.4.3
```
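
When several releases are involved, it may help to preview the commands before running them. The sketch below only prints one `amdgpu-install` invocation per release and installs nothing. The function name `print_install_commands` is hypothetical; the `--usecase` and `--rocmrelease` options are the ones shown above.

```shell
# Hypothetical dry-run helper: print, but do not execute, one
# amdgpu-install command for each release passed as an argument.
print_install_commands() {
    for rel in "$@"; do
        printf 'sudo amdgpu-install --usecase=rocm --rocmrelease=%s\n' "$rel"
    done
}

print_install_commands 5.3.3 5.4.3
```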
## Additional options

View File

@@ -52,8 +52,11 @@ To add the AMDGPU repository, follow these steps:
:sync: ubuntu-20.04
```shell
# version
ver=5.6
# amdgpu repository for focal
echo 'deb [arch=amd64 signed-by=/etc/apt/keyrings/rocm.gpg] https://repo.radeon.com/amdgpu/5.5.1/ubuntu focal main' \
echo "deb [arch=amd64 signed-by=/etc/apt/keyrings/rocm.gpg] https://repo.radeon.com/amdgpu/$ver/ubuntu focal main" \
| sudo tee /etc/apt/sources.list.d/amdgpu.list
sudo apt update
```
@@ -63,8 +66,11 @@ sudo apt update
:sync: ubuntu-22.04
```shell
# version
ver=5.6
# amdgpu repository for jammy
echo 'deb [arch=amd64 signed-by=/etc/apt/keyrings/rocm.gpg] https://repo.radeon.com/amdgpu/5.5.1/ubuntu jammy main' \
echo "deb [arch=amd64 signed-by=/etc/apt/keyrings/rocm.gpg] https://repo.radeon.com/amdgpu/$ver/ubuntu jammy main" \
| sudo tee /etc/apt/sources.list.d/amdgpu.list
sudo apt update
```
@@ -91,7 +97,7 @@ To add the ROCm repository, use the following steps:
```shell
# ROCm repositories for focal
for ver in 5.3.3 5.4.3 5.5.1; do
for ver in 5.3.3 5.4.3 5.5.1 5.6; do
echo "deb [arch=amd64 signed-by=/etc/apt/keyrings/rocm.gpg] https://repo.radeon.com/rocm/apt/$ver focal main" \
| sudo tee --append /etc/apt/sources.list.d/rocm.list
done
@@ -106,7 +112,7 @@ sudo apt update
```shell
# ROCm repositories for jammy
for ver in 5.3.3 5.4.3 5.5.1; do
for ver in 5.3.3 5.4.3 5.5.1 5.6; do
echo "deb [arch=amd64 signed-by=/etc/apt/keyrings/rocm.gpg] https://repo.radeon.com/rocm/apt/$ver jammy main" \
| sudo tee --append /etc/apt/sources.list.d/rocm.list
done
@@ -136,7 +142,7 @@ For a comprehensive list of meta-packages, refer to
- Sample Multi-version installation
```shell
sudo apt install rocm-hip-sdk5.5.1 rocm-hip-sdk5.3.3
sudo apt install rocm-hip-sdk5.6 rocm-hip-sdk5.3.3
```
:::::
@@ -152,15 +158,18 @@ section.
```
::::{tab-set}
:::{tab-item} RHEL 8.6
:sync: RHEL-8.6
:::{tab-item} RHEL 8.7
:sync: RHEL-8.7
:sync: RHEL-8
```shell
# version
ver=5.6
sudo tee /etc/yum.repos.d/amdgpu.repo <<EOF
[amdgpu]
name=amdgpu
baseurl=https://repo.radeon.com/amdgpu/5.5.1/rhel/8.6/main/x86_64/
baseurl=https://repo.radeon.com/amdgpu/$ver/rhel/8.7/main/x86_64/
enabled=1
priority=50
gpgcheck=1
@@ -171,15 +180,18 @@ sudo yum clean all
:::
:::{tab-item} RHEL 8.7
:sync: RHEL-8.7
:::{tab-item} RHEL 8.8
:sync: RHEL-8.8
:sync: RHEL-8
```shell
# version
ver=5.6
sudo tee /etc/yum.repos.d/amdgpu.repo <<EOF
[amdgpu]
name=amdgpu
baseurl=https://repo.radeon.com/amdgpu/5.5.1/rhel/8.7/main/x86_64/
baseurl=https://repo.radeon.com/amdgpu/$ver/rhel/8.8/main/x86_64/
enabled=1
priority=50
gpgcheck=1
@@ -195,10 +207,35 @@ sudo yum clean all
:sync: RHEL-9
```shell
# version
ver=5.6
sudo tee /etc/yum.repos.d/amdgpu.repo <<EOF
[amdgpu]
name=amdgpu
baseurl=https://repo.radeon.com/amdgpu/5.5.1/rhel/9.1/main/x86_64/
baseurl=https://repo.radeon.com/amdgpu/$ver/rhel/9.1/main/x86_64/
enabled=1
priority=50
gpgcheck=1
gpgkey=https://repo.radeon.com/rocm/rocm.gpg.key
EOF
sudo yum clean all
```
:::
:::{tab-item} RHEL 9.2
:sync: RHEL-9.2
:sync: RHEL-9
```shell
# version
ver=5.6
sudo tee /etc/yum.repos.d/amdgpu.repo <<EOF
[amdgpu]
name=amdgpu
baseurl=https://repo.radeon.com/amdgpu/$ver/rhel/9.2/main/x86_64/
enabled=1
priority=50
gpgcheck=1
@@ -228,7 +265,7 @@ To add the ROCm repository, use the following steps, based on your distribution:
:sync: RHEL-8
```shell
for ver in 5.3.3 5.4.3 5.5.1; do
for ver in 5.3.3 5.4.3 5.5.1 5.6; do
sudo tee --append /etc/yum.repos.d/rocm.repo <<EOF
[ROCm-$ver]
name=ROCm$ver
@@ -247,7 +284,7 @@ sudo yum clean all
:sync: RHEL-9
```shell
for ver in 5.3.3 5.4.3 5.5.1; do
for ver in 5.3.3 5.4.3 5.5.1 5.6; do
sudo tee --append /etc/yum.repos.d/rocm.repo <<EOF
[ROCm-$ver]
name=ROCm$ver
@@ -282,12 +319,12 @@ For a comprehensive list of meta-packages, refer to
- Sample Multi-version installation
```shell
sudo yum install rocm-hip-sdk5.5.1 rocm-hip-sdk5.3.3
sudo yum install rocm-hip-sdk5.6 rocm-hip-sdk5.3.3
```
:::::
:::::{tab-item} SUSE Linux Enterprise Server 15
:sync: SLES15
:::::{tab-item} SUSE Linux Enterprise Server
:sync: SLES
::::{rubric} 1. Add the AMDGPU Repository and Install the Kernel-mode Driver
::::
@@ -297,11 +334,18 @@ If you have a version of the kernel-mode driver installed, you may skip this
section.
```
::::{tab-set}
:::{tab-item} SLES 15.4
:sync: SLES-15.4
```shell
# version
ver=5.6
sudo tee /etc/zypp/repos.d/amdgpu.repo <<EOF
[amdgpu]
name=amdgpu
baseurl=https://repo.radeon.com/amdgpu/5.5.1/sle/15.4/main/x86_64
baseurl=https://repo.radeon.com/amdgpu/$ver/sle/15.4/main/x86_64
enabled=1
gpgcheck=1
gpgkey=https://repo.radeon.com/rocm/rocm.gpg.key
@@ -309,6 +353,28 @@ EOF
sudo zypper ref
```
:::
:::{tab-item} SLES 15.5
:sync: SLES-15.5
```shell
# version
ver=5.6
sudo tee /etc/zypp/repos.d/amdgpu.repo <<EOF
[amdgpu]
name=amdgpu
baseurl=https://repo.radeon.com/amdgpu/$ver/sle/15.5/main/x86_64
enabled=1
gpgcheck=1
gpgkey=https://repo.radeon.com/rocm/rocm.gpg.key
EOF
sudo zypper ref
```
:::
::::
Install the kernel mode driver and reboot the system using the following
commands:
@@ -323,7 +389,7 @@ sudo reboot
To add the ROCm repository, use the following steps:
```shell
for ver in 5.3.3 5.4.3 5.5.1; do
for ver in 5.3.3 5.4.3 5.5.1 5.6; do
sudo tee --append /etc/zypp/repos.d/rocm.repo <<EOF
[ROCm-$ver]
name=ROCm$ver
@@ -355,7 +421,7 @@ For a comprehensive list of meta-packages, refer to
- Sample Multi-version installation
```shell
sudo zypper --gpg-auto-import-keys install rocm-hip-sdk5.5.1 rocm-hip-sdk5.3.3
sudo zypper --gpg-auto-import-keys install rocm-hip-sdk5.6 rocm-hip-sdk5.3.3
```
:::::
@@ -392,7 +458,7 @@ but are generally useful. Verification of the install is advised.
2. Add binary paths to the `PATH` environment variable.
```shell
export PATH=$PATH:/opt/rocm-5.5.1/bin:/opt/rocm-5.5.1/opencl/bin
export PATH=$PATH:/opt/rocm-5.6/bin:/opt/rocm-5.6/opencl/bin
```
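
A slightly more defensive variant is sketched below, assuming the same install prefix as in the command above. The helper `append_to_path` is hypothetical; it appends a directory only if it exists and is not already listed in `PATH`, so sourcing it repeatedly from a shell profile does not add duplicates.

```shell
# Hypothetical helper: append a directory to PATH only when it exists
# and is not already present.
append_to_path() {
    dir=$1
    [ -d "$dir" ] || return 0
    case ":$PATH:" in
        *":$dir:"*) ;;                 # already listed, nothing to do
        *) PATH="$PATH:$dir" ;;
    esac
}

# Assumed install prefix; adjust to the release that is installed.
rocm_root=/opt/rocm-5.6
append_to_path "$rocm_root/bin"
append_to_path "$rocm_root/opencl/bin"
export PATH
```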
```{attention}

View File

@@ -114,8 +114,8 @@ sudo yum autoremove amdgpu-dkms
```
:::::
:::::{tab-item} SUSE Linux Enterprise Server 15
:sync: SLES15
:::::{tab-item} SUSE Linux Enterprise Server
:sync: SLES
::::{rubric} Uninstalling Specific Meta-packages
::::

View File

@@ -25,8 +25,11 @@ repository to the new release.
:sync: ubuntu-20.04
```shell
# version
version=5.6
# amdgpu repository for focal
echo 'deb [arch=amd64 signed-by=/etc/apt/keyrings/rocm.gpg] https://repo.radeon.com/amdgpu/5.5.1/ubuntu focal main' \
echo "deb [arch=amd64 signed-by=/etc/apt/keyrings/rocm.gpg] https://repo.radeon.com/amdgpu/$version/ubuntu focal main" \
| sudo tee /etc/apt/sources.list.d/amdgpu.list
sudo apt update
```
@@ -36,8 +39,11 @@ sudo apt update
:sync: ubuntu-22.04
```shell
# version
version=5.6
# amdgpu repository for jammy
echo 'deb [arch=amd64 signed-by=/etc/apt/keyrings/rocm.gpg] https://repo.radeon.com/amdgpu/5.5.1/ubuntu jammy main' \
echo "deb [arch=amd64 signed-by=/etc/apt/keyrings/rocm.gpg] https://repo.radeon.com/amdgpu/$version/ubuntu jammy main" \
| sudo tee /etc/apt/sources.list.d/amdgpu.list
sudo apt update
```
@@ -49,15 +55,18 @@ sudo apt update
:sync: RHEL
::::{tab-set}
:::{tab-item} RHEL 8.6
:sync: RHEL-8.6
:::{tab-item} RHEL 8.7
:sync: RHEL-8.7
:sync: RHEL-8
```shell
# version
version=5.6
sudo tee /etc/yum.repos.d/amdgpu.repo <<EOF
[amdgpu]
name=amdgpu
baseurl=https://repo.radeon.com/amdgpu/5.5.1/rhel/8.6/main/x86_64/
baseurl=https://repo.radeon.com/amdgpu/$version/rhel/8.7/main/x86_64/
enabled=1
priority=50
gpgcheck=1
@@ -67,15 +76,18 @@ sudo yum clean all
```
:::
:::{tab-item} RHEL 8.7
:sync: RHEL-8.7
:::{tab-item} RHEL 8.8
:sync: RHEL-8.8
:sync: RHEL-8
```shell
# version
version=5.6
sudo tee /etc/yum.repos.d/amdgpu.repo <<EOF
[amdgpu]
name=amdgpu
baseurl=https://repo.radeon.com/amdgpu/5.5.1/rhel/8.7/main/x86_64/
baseurl=https://repo.radeon.com/amdgpu/$version/rhel/8.8/main/x86_64/
enabled=1
priority=50
gpgcheck=1
@@ -90,10 +102,34 @@ sudo yum clean all
:sync: RHEL-9
```shell
# version
version=5.6
sudo tee /etc/yum.repos.d/amdgpu.repo <<EOF
[amdgpu]
name=amdgpu
baseurl=https://repo.radeon.com/amdgpu/5.5.1/rhel/9.1/main/x86_64/
baseurl=https://repo.radeon.com/amdgpu/$version/rhel/9.1/main/x86_64/
enabled=1
priority=50
gpgcheck=1
gpgkey=https://repo.radeon.com/rocm/rocm.gpg.key
EOF
sudo yum clean all
```
:::
:::{tab-item} RHEL 9.2
:sync: RHEL-9.2
:sync: RHEL-9
```shell
# version
version=5.6
sudo tee /etc/yum.repos.d/amdgpu.repo <<EOF
[amdgpu]
name=amdgpu
baseurl=https://repo.radeon.com/amdgpu/$version/rhel/9.2/main/x86_64/
enabled=1
priority=50
gpgcheck=1
@@ -105,14 +141,21 @@ sudo yum clean all
:::
::::
:::::
:::::{tab-item} SUSE Linux Enterprise Server 15
:sync: SLES15
:::::{tab-item} SUSE Linux Enterprise Server
:sync: SLES
::::{tab-set}
:::{tab-item} SLES 15.4
:sync: SLES-15.4
```shell
# version
version=5.6
sudo tee /etc/zypp/repos.d/amdgpu.repo <<EOF
[amdgpu]
name=amdgpu
baseurl=https://repo.radeon.com/amdgpu/5.5.1/sle/15.4/main/x86_64
baseurl=https://repo.radeon.com/amdgpu/$version/sle/15.4/main/x86_64
enabled=1
gpgcheck=1
gpgkey=https://repo.radeon.com/rocm/rocm.gpg.key
@@ -120,6 +163,24 @@ EOF
sudo zypper ref
```
:::
:::{tab-item} SLES 15.5
:sync: SLES-15.5
```shell
sudo tee /etc/zypp/repos.d/amdgpu.repo <<EOF
[amdgpu]
name=amdgpu
baseurl=https://repo.radeon.com/amdgpu/5.6/sle/15.5/main/x86_64
enabled=1
gpgcheck=1
gpgkey=https://repo.radeon.com/rocm/rocm.gpg.key
EOF
sudo zypper ref
```
:::
::::
:::::
::::::
@@ -147,8 +208,8 @@ sudo reboot
```
:::
:::{tab-item} SUSE Linux Enterprise Server 15
:sync: SLES15
:::{tab-item} SUSE Linux Enterprise Server
:sync: SLES
```shell
sudo zypper --gpg-auto-import-keys install amdgpu-dkms
@@ -172,7 +233,10 @@ repository to the new release.
:sync: ubuntu-20.04
```shell
echo "deb [arch=amd64 signed-by=/etc/apt/keyrings/rocm.gpg] https://repo.radeon.com/rocm/apt/5.5.1 focal main" \
# version
version=5.6
echo "deb [arch=amd64 signed-by=/etc/apt/keyrings/rocm.gpg] https://repo.radeon.com/rocm/apt/$version focal main" \
| sudo tee /etc/apt/sources.list.d/rocm.list
echo -e 'Package: *\nPin: release o=repo.radeon.com\nPin-Priority: 600' \
| sudo tee /etc/apt/preferences.d/rocm-pin-600
@@ -184,7 +248,10 @@ sudo apt update
:sync: ubuntu-22.04
```shell
echo "deb [arch=amd64 signed-by=/etc/apt/keyrings/rocm.gpg] https://repo.radeon.com/rocm/apt/5.5.1 jammy main" \
# version
version=5.6
echo "deb [arch=amd64 signed-by=/etc/apt/keyrings/rocm.gpg] https://repo.radeon.com/rocm/apt/$version jammy main" \
| sudo tee /etc/apt/sources.list.d/rocm.list
echo -e 'Package: *\nPin: release o=repo.radeon.com\nPin-Priority: 600' \
| sudo tee /etc/apt/preferences.d/rocm-pin-600
@@ -202,10 +269,13 @@ sudo apt update
:sync: RHEL-8
```shell
# version
version=5.6
sudo tee /etc/yum.repos.d/rocm.repo <<EOF
[ROCm-5.5.1]
name=ROCm5.5.1
baseurl=https://repo.radeon.com/rocm/rhel8/5.5.1/main
[ROCm-$version]
name=ROCm$version
baseurl=https://repo.radeon.com/rocm/rhel8/$version/main
enabled=1
priority=50
gpgcheck=1
@@ -219,10 +289,13 @@ sudo yum clean all
:sync: RHEL-9
```shell
# version
version=5.6
sudo tee /etc/yum.repos.d/rocm.repo <<EOF
[ROCm-5.5.1]
name=ROCm5.5.1
baseurl=https://repo.radeon.com/rocm/rhel9/5.5.1/main
[ROCm-$version]
name=ROCm$version
baseurl=https://repo.radeon.com/rocm/rhel9/$version/main
enabled=1
priority=50
gpgcheck=1
@@ -234,15 +307,18 @@ sudo yum clean all
:::
::::
:::::
:::::{tab-item} SUSE Linux Enterprise Server 15
:sync: SLES15
:::::{tab-item} SUSE Linux Enterprise Server
:sync: SLES
```shell
# version
version=5.6
sudo tee /etc/zypp/repos.d/rocm.repo <<EOF
[ROCm-5.5.1]
name=ROCm5.5.1
[ROCm-$version]
name=ROCm$version
name=rocm
baseurl=https://repo.radeon.com/rocm/zyp/5.5.1/main
baseurl=https://repo.radeon.com/rocm/zyp/$version/main
enabled=1
gpgcheck=1
gpgkey=https://repo.radeon.com/rocm/rocm.gpg.key
@@ -275,8 +351,8 @@ sudo yum update rocm-hip-sdk
```
:::
:::{tab-item} Suse Linux Enterprise Server 15
:sync: SLES15
:::{tab-item} SUSE Linux Enterprise Server
:sync: SLES
```shell
sudo zypper --gpg-auto-import-keys update rocm-hip-sdk


@@ -91,6 +91,7 @@ sudo rpm -ivh epel-release-latest-8.noarch.rpm
:::
:::{tab-item} RHEL 9
:sync: RHEL-9
```shell
wget https://dl.fedoraproject.org/pub/epel/epel-release-latest-9.noarch.rpm
@@ -110,14 +111,33 @@ sudo crb enable
```
:::::
:::::{tab-item} SUSE Linux Enterprise Server 15
:::::{tab-item} SUSE Linux Enterprise Server
:sync: SLES
Add the perl languages repository.
```shell
zypper addrepo https://download.opensuse.org/repositories/devel:languages:perl/SLE_15_SP4/devel:languages:perl.repo
```{note}
Mar 25, 2024: We currently need to install the Perl module from SLES 15 SP5 as a workaround. The module was removed for SLES 15 SP4.
```
::::{tab-set}
:::{tab-item} SLES 15.4
:sync: SLES-15.4
```shell
zypper addrepo https://download.opensuse.org/repositories/devel:/languages:/perl/15.5/devel:languages:perl.repo
```
:::
:::{tab-item} SLES 15.5
:sync: SLES-15.5
```shell
zypper addrepo https://download.opensuse.org/repositories/devel:/languages:/perl/15.5/devel:languages:perl.repo
```
:::
::::
:::::
::::::


@@ -29,11 +29,11 @@ wget https://repo.radeon.com/rocm/rocm.gpg.key -O - | \
```shell
# Kernel driver repository for focal
sudo tee /etc/apt/sources.list.d/amdgpu.list <<'EOF'
deb [arch=amd64 signed-by=/etc/apt/keyrings/rocm.gpg] https://repo.radeon.com/amdgpu/latest/ubuntu focal main
deb [arch=amd64 signed-by=/etc/apt/keyrings/rocm.gpg] https://repo.radeon.com/amdgpu/5.6/ubuntu focal main
EOF
# ROCm repository for focal
sudo tee /etc/apt/sources.list.d/rocm.list <<'EOF'
deb [arch=amd64 signed-by=/etc/apt/keyrings/rocm.gpg] https://repo.radeon.com/rocm/apt/debian focal main
deb [arch=amd64 signed-by=/etc/apt/keyrings/rocm.gpg] https://repo.radeon.com/rocm/apt/5.6 focal main
EOF
```
@@ -44,13 +44,14 @@ EOF
```shell
# Kernel driver repository for jammy
sudo tee /etc/apt/sources.list.d/amdgpu.list <<'EOF'
deb [arch=amd64 signed-by=/etc/apt/keyrings/rocm.gpg] https://repo.radeon.com/amdgpu/latest/ubuntu jammy main
deb [arch=amd64 signed-by=/etc/apt/keyrings/rocm.gpg] https://repo.radeon.com/amdgpu/5.6/ubuntu jammy main
EOF
# ROCm repository for jammy
sudo tee /etc/apt/sources.list.d/rocm.list <<'EOF'
deb [arch=amd64 signed-by=/etc/apt/keyrings/rocm.gpg] https://repo.radeon.com/rocm/apt/debian jammy main
echo -e 'Package: *\nPin: release o=repo.radeon.com\nPin-Priority: 600' | sudo tee /etc/apt/preferences.d/rocm-pin-600
deb [arch=amd64 signed-by=/etc/apt/keyrings/rocm.gpg] https://repo.radeon.com/rocm/apt/5.6 jammy main
EOF
# Prefer packages from the rocm repository over system packages
echo -e 'Package: *\nPin: release o=repo.radeon.com\nPin-Priority: 600' | sudo tee /etc/apt/preferences.d/rocm-pin-600
```
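The focal and jammy blocks differ only in the release codename. As an illustrative sketch, the two `deb` lines can be produced from a version and a codename. `render_apt_sources` is a hypothetical helper, not a ROCm or apt command, and it only prints text; it does not write to `/etc/apt`.

```shell
# Sketch: print the two apt source lines for a given release and codename.
# render_apt_sources is a hypothetical helper, not part of ROCm.
render_apt_sources() {
    version=$1    # for example 5.6
    codename=$2   # focal or jammy
    opts='[arch=amd64 signed-by=/etc/apt/keyrings/rocm.gpg]'
    # Kernel driver repository
    echo "deb $opts https://repo.radeon.com/amdgpu/$version/ubuntu $codename main"
    # ROCm repository
    echo "deb $opts https://repo.radeon.com/rocm/apt/$version $codename main"
}

render_apt_sources 5.6 jammy
```

On Ubuntu, the codename is available as `VERSION_CODENAME` in `/etc/os-release`, which can be passed as the second argument.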
:::
@@ -72,15 +73,15 @@ sudo apt update
::::
::::{tab-set}
:::{tab-item} RHEL 8.6
:sync: RHEL-8.6
:::{tab-item} RHEL 8.7
:sync: RHEL-8.7
```shell
# Add the amdgpu module repository for RHEL 8.6
# Add the amdgpu module repository for RHEL 8.7
sudo tee /etc/yum.repos.d/amdgpu.repo <<'EOF'
[amdgpu]
name=amdgpu
baseurl=https://repo.radeon.com/amdgpu/latest/rhel/8.6/main/x86_64
baseurl=https://repo.radeon.com/amdgpu/5.6/rhel/8.7/main/x86_64
enabled=1
gpgcheck=1
gpgkey=https://repo.radeon.com/rocm/rocm.gpg.key
@@ -89,7 +90,7 @@ EOF
sudo tee /etc/yum.repos.d/rocm.repo <<'EOF'
[rocm]
name=rocm
baseurl=https://repo.radeon.com/rocm/rhel8/latest/main
baseurl=https://repo.radeon.com/rocm/rhel8/5.6/main
enabled=1
priority=50
gpgcheck=1
@@ -99,15 +100,15 @@ EOF
:::
:::{tab-item} RHEL 8.7
:sync: RHEL-8.7
:::{tab-item} RHEL 8.8
:sync: RHEL-8.8
```shell
# Add the amdgpu module repository for RHEL 8.7
# Add the amdgpu module repository for RHEL 8.8
sudo tee /etc/yum.repos.d/amdgpu.repo <<'EOF'
[amdgpu]
name=amdgpu
baseurl=https://repo.radeon.com/amdgpu/latest/rhel/8.7/main/x86_64
baseurl=https://repo.radeon.com/amdgpu/5.6/rhel/8.8/main/x86_64
enabled=1
gpgcheck=1
gpgkey=https://repo.radeon.com/rocm/rocm.gpg.key
@@ -116,7 +117,7 @@ EOF
sudo tee /etc/yum.repos.d/rocm.repo <<'EOF'
[rocm]
name=rocm
baseurl=https://repo.radeon.com/rocm/rhel8/latest/main
baseurl=https://repo.radeon.com/rocm/rhel8/5.6/main
enabled=1
priority=50
gpgcheck=1
@@ -134,7 +135,7 @@ EOF
sudo tee /etc/yum.repos.d/amdgpu.repo <<'EOF'
[amdgpu]
name=amdgpu
baseurl=https://repo.radeon.com/amdgpu/latest/rhel/9.1/main/x86_64
baseurl=https://repo.radeon.com/amdgpu/5.6/rhel/9.1/main/x86_64
enabled=1
gpgcheck=1
gpgkey=https://repo.radeon.com/rocm/rocm.gpg.key
@@ -143,7 +144,34 @@ EOF
sudo tee /etc/yum.repos.d/rocm.repo <<'EOF'
[rocm]
name=rocm
baseurl=https://repo.radeon.com/rocm/rhel9/latest/main
baseurl=https://repo.radeon.com/rocm/rhel9/5.6/main
enabled=1
priority=50
gpgcheck=1
gpgkey=https://repo.radeon.com/rocm/rocm.gpg.key
EOF
```
:::
:::{tab-item} RHEL 9.2
:sync: RHEL-9.2
```shell
# Add the amdgpu module repository for RHEL 9.2
sudo tee /etc/yum.repos.d/amdgpu.repo <<'EOF'
[amdgpu]
name=amdgpu
baseurl=https://repo.radeon.com/amdgpu/5.6/rhel/9.2/main/x86_64
enabled=1
gpgcheck=1
gpgkey=https://repo.radeon.com/rocm/rocm.gpg.key
EOF
# Add the rocm repository for RHEL 9
sudo tee /etc/yum.repos.d/rocm.repo <<'EOF'
[rocm]
name=rocm
baseurl=https://repo.radeon.com/rocm/rhel9/5.6/main
enabled=1
priority=50
gpgcheck=1
@@ -170,8 +198,8 @@ sudo yum clean all
::::
::::{tab-set}
:::{tab-item} SLES 15 SP4
:sync: SLES15-SP4
:::{tab-item} SLES 15.4
:sync: SLES-15.4
```shell
@@ -179,7 +207,34 @@ sudo yum clean all
sudo tee /etc/zypp/repos.d/amdgpu.repo <<'EOF'
[amdgpu]
name=amdgpu
baseurl=https://repo.radeon.com/amdgpu/latest/sle/15.4/main/x86_64
baseurl=https://repo.radeon.com/amdgpu/5.6/sle/15.4/main/x86_64
enabled=1
gpgcheck=1
gpgkey=https://repo.radeon.com/rocm/rocm.gpg.key
EOF
# Add the rocm repository for SLES
sudo tee /etc/zypp/repos.d/rocm.repo <<'EOF'
[rocm]
name=rocm
baseurl=https://repo.radeon.com/rocm/zyp/zypper
enabled=1
priority=50
gpgcheck=1
gpgkey=https://repo.radeon.com/rocm/rocm.gpg.key
EOF
```
:::
:::{tab-item} SLES 15.5
:sync: SLES-15.5
```shell
# Add the amdgpu module repository for SLES 15.5
sudo tee /etc/zypp/repos.d/amdgpu.repo <<'EOF'
[amdgpu]
name=amdgpu
baseurl=https://repo.radeon.com/amdgpu/5.6/sle/15.5/main/x86_64
enabled=1
gpgcheck=1
gpgkey=https://repo.radeon.com/rocm/rocm.gpg.key


@@ -0,0 +1,31 @@
# Command Line Installation
::::{grid} 2 3 3 3
:gutter: 1
:::{grid-item-card} Install
:link: install
:link-type: doc
How to install ROCm?
:::
:::{grid-item-card} Upgrade
:link: upgrade
:link-type: doc
Instructions for upgrading an existing ROCm installation.
:::
:::{grid-item-card} Uninstall
:link: uninstall
:link-type: doc
Steps for removing ROCm packages and libraries.
:::
::::
## See Also
- {doc}`/release/gpu_os_support`


@@ -0,0 +1,56 @@
# Installation Using the Command Line Interface
The steps to install the HIP SDK for Windows are described in this document.
## System Requirements
The HIP SDK is supported on Windows 10 and 11. The HIP SDK may be installed on a
system without AMD GPUs to use the build toolchains. To run HIP applications, a
compatible GPU is required. Please see the supported GPU guide for more details.
## HIP SDK Installation
The command line installer is the same executable which is used by the graphical
front-end. Download the installer from the
[HIP-SDK download page](https://www.amd.com/en/developer/rocm-hub/hip-sdk.html).
The options supported by the command line interface are summarized in
{numref}`hip-sdk-cli-options`.
```{table} HIP SDK Command Line Options
:name: hip-sdk-cli-options
| **Install Option** | **Description** |
|:------------------:|:---------------:|
| `-install` | Command used to install packages, both driver and applications. No output to the screen. |
| `-install -boot` | Silent install with auto reboot. |
| `-install -log <absolute path>` | Write install result code to the specified log file. The specified log file must be on a local machine. Double quotes are needed if there are spaces in the log file path. |
| `-uninstall` | Command to uninstall all packages installed by this installer on the system. There is no option to specify which packages to uninstall. |
| `-uninstall -boot` | Silent uninstall with auto reboot. |
| `/?` or `/help` | Shows a brief description of all switch commands. |
```
```{note}
Unlike the graphical installer, the command line interface doesn't support
selectively installing parts of the SDK bundle. It's all or nothing.
```
### Launching the Installer From the Command Line
The installer is still a graphical application with a `WinMain` entry point, even
when called on the command line. This means that the application lifetime is
tied to a window, even on headless systems where that window may not be visible.
To launch the installer from PowerShell so that the call blocks until the
installer exits, use the following pattern:
```pwsh
Start-Process $InstallerExecutable -ArgumentList $InstallerArgs -NoNewWindow -Wait
```
```{important}
Running the installer requires Administrator Privileges.
```
For example, to install all components and write the result code to a log file
in the user profile directory:
```pwsh
Start-Process ~\Downloads\Setup.exe -ArgumentList '-install','-log',"${env:USERPROFILE}\installer_log.txt" -NoNewWindow -Wait
```


@@ -0,0 +1,48 @@
# Uninstallation Using the Command Line Interface
The steps to uninstall the HIP SDK for Windows are described in this document.
## HIP SDK Uninstallation
The command line installer is the same executable which is used by the graphical
front-end. The options supported by the command line interface are summarized in
{numref}`hip-sdk-cli-options`.
```{table} HIP SDK Command Line Options
:name: hip-sdk-cli-options
| **Install Option** | **Description** |
|:------------------:|:---------------:|
| `-install` | Command used to install packages, both driver and applications. No output to the screen. |
| `-install -boot` | Silent install with auto reboot. |
| `-install -log <absolute path>` | Write install result code to the specified log file. The specified log file must be on a local machine. Double quotes are needed if there are spaces in the log file path. |
| `-uninstall` | Command to uninstall all packages installed by this installer on the system. There is no option to specify which packages to uninstall. |
| `-uninstall -boot` | Silent uninstall with auto reboot. |
| `/?` or `/help` | Shows a brief description of all switch commands. |
```
```{note}
Unlike the graphical installer, the command line interface doesn't support
selectively installing parts of the SDK bundle. It's all or nothing.
```
### Launching the Installer From the Command Line
The installer is still a graphical application with a `WinMain` entry point, even
when called on the command line. This means that the application lifetime is
tied to a window, even on headless systems where that window may not be visible.
To launch the installer from PowerShell so that the call blocks until the
installer exits, use the following pattern:
```pwsh
Start-Process $InstallerExecutable -ArgumentList $InstallerArgs -NoNewWindow -Wait
```
```{important}
Running the installer requires Administrator Privileges.
```
For example, to uninstall all components:
```pwsh
Start-Process ~\Downloads\Setup.exe -ArgumentList '-uninstall' -NoNewWindow -Wait
```


@@ -0,0 +1,14 @@
# Upgrading Using the Command Line Interface
The steps to upgrade the HIP SDK for Windows are described in this document.
## HIP SDK Upgrade
To upgrade an existing installation of the HIP SDK without preserving the
previous version, first uninstall it, then install the new version following the
instructions in {doc}`/deploy/windows/cli/uninstall` and
{doc}`/deploy/windows/cli/install` using the old and new installers
respectively.
To upgrade by installing both versions side-by-side, just run the installer of
the newer version.


@@ -0,0 +1,31 @@
# Graphical Installation
::::{grid} 2 3 3 3
:gutter: 1
:::{grid-item-card} Install
:link: install
:link-type: doc
How to install ROCm?
:::
:::{grid-item-card} Upgrade
:link: upgrade
:link-type: doc
Instructions for upgrading an existing ROCm installation.
:::
:::{grid-item-card} Uninstall
:link: uninstall
:link-type: doc
Steps for removing ROCm packages and libraries.
:::
::::
## See Also
- {doc}`/release/gpu_os_support`


@@ -0,0 +1,163 @@
# Installation Using the Graphical Interface
The steps to install the HIP SDK for Windows are described in this document.
## System Requirements
The HIP SDK is supported on Windows 10 and 11. The HIP SDK may be installed on a
system without AMD GPUs to use the build toolchains. To run HIP applications, a
compatible GPU is required. Please see the supported GPU guide for more details.
## HIP SDK Installation
### Download the installer
Download the installer from the
[HIP-SDK download page](https://www.amd.com/en/developer/rocm-hub/hip-sdk.html).
### Launching the installer
To launch the AMD HIP SDK Installer, click the **Setup** icon shown in
{numref}`setup-icon`.
```{figure} /data/deploy/windows/000-setup-icon.png
:name: setup-icon
:alt: Icon with AMD arrow logo and User Account Control shield overlaid.
Setup Icon
```
The installer requires Administrator Privileges, so you may be greeted with a
User Account Control (UAC) pop-up. Click **Yes**.
```{figure} /data/deploy/windows/001-uac-dark.png
:name: uac-dark
:class: only-dark
:alt: User Account Control pop-up
User Account Control pop-up
```
```{figure} /data/deploy/windows/001-uac-light.png
:name: uac-light
:class: only-light
:alt: User Account Control pop-up
User Account Control pop-up
```
The installer executable will temporarily extract installer packages to `C:\AMD`
which it will remove after installation completes. This extraction is signified
by the "Initializing install" window in {numref}`init-install`.
```{figure} /data/deploy/windows/002-initializing.png
:name: init-install
:alt: Window with AMD arrow logo, futuristic background and progress counter.
Installer initialization window
```
The installer then detects your system configuration, as shown in
{numref}`detecting-system-components`, to decide which installable components
are applicable to your system.
```{figure} /data/deploy/windows/003-detecting-system-config.png
:name: detecting-system-components
:alt: Window with AMD arrow logo, futuristic background and activity indicator.
Installer initialization window.
```
### Customizing the install
When the installer launches, it displays a window that lets the user customize
the installation. By default, all components are selected for installation.
Refer to {numref}`installer-window` for an instance when the Select All option
is turned on.
```{figure} /data/deploy/windows/004-installer-window.png
:name: installer-window
:alt: Window with AMD arrow logo, futuristic background and activity indicator.
Installer initialization window.
```
#### HIP SDK Installer
The HIP SDK installation options are listed in {numref}`hip-sdk-options`.
```{table} HIP SDK Components for Installation
:name: hip-sdk-options
| **HIP Components** | **Install Type** | **Additional Options** |
|:------------------:|:----------------:|:----------------------:|
| HIP SDK Core | 5.5.0 | Install location |
| HIP Libraries | Full, Partial, None | Runtime, Development (Libs and headers) |
| HIP Runtime Compiler | Full, Partial, None | Runtime, Development (Headers) |
| HIP Ray Tracing | Full, Partial, None | Runtime, Development (Headers) |
| Visual Studio Plugin | Full, Partial, None | Visual Studio 2017, 2019, 2022 Plugin |
```
```{note}
The Select/DeSelect All option only applies to the installation of HIP SDK
components. To install the bundled AMD Display Driver, manually select the
install type.
```
```{tip}
Should you only wish to install a few select components,
DeSelecting All and then picking the individual components may be more
convenient.
```
#### AMD Display Driver
The HIP SDK installer bundles an AMD Radeon Software PRO 23.10 installer. The
supported install options are summarized by
{numref}`display-driver-install-options`:
```{table} AMD Display Driver Install Options
:name: display-driver-install-options
| **Install Option** | **Description** |
|:------------------:|:---------------:|
| Install Location | Location on disk to store driver files. |
| Install Type | The breadth of components to be installed. Refer to {numref}`display-driver-install-types` for details. |
| Factory Reset (Optional) | A Factory Reset will remove all prior versions of AMD HIP SDK and drivers. You will not be able to roll back to previously installed drivers. |
```
```{table} AMD Display Driver Install Types
:name: display-driver-install-types
| **Install Type** | **Description** |
|:----------------:|:---------------:|
| Full Install | Provides all AMD Software features and controls for gaming, recording, streaming, and tweaking the performance on your graphics hardware. |
| Minimal Install | Provides only the basic controls for AMD Software features and does not include advanced features such as performance tweaking or recording and capturing content. |
| Driver Only | Provides no user interface for AMD Software features. |
```
```{note}
You must perform a system restart for a complete installation of the
Display Driver.
```
### Installing Components
Please wait for the installation to complete, as shown in
{numref}`install-progress`.
```{figure} /data/deploy/windows/012-install-progress.png
:name: install-progress
:alt: Window with AMD arrow logo, futuristic background and progress meter.
Installation Progress
```
### Installation Complete
Once the installation is complete, the installer window may prompt you for a
system restart. Click **Restart** at the lower right corner, as shown in
{numref}`install-complete`.
```{figure} /data/deploy/windows/013-install-complete.png
:name: install-complete
:alt: Window with AMD arrow logo, futuristic background and completion notice.
Installation Complete
```
```{error}
Should the installer terminate due to unexpected circumstances, or the user
forcibly terminates the installer, the temporary directory created under
`C:\AMD` may be safely removed. Installed components will not depend on this
folder (unless the user specifies `C:\AMD` as an install folder explicitly).
```


@@ -0,0 +1,27 @@
# Uninstallation Using the Graphical Interface
The steps to uninstall the HIP SDK for Windows are described in this document.
## Uninstallation
All components, except the Visual Studio plug-in, should be uninstalled through
Control Panel -> Add/Remove Programs. To uninstall the Visual Studio extension,
refer to
<https://github.com/ROCm-Developer-Tools/HIP-VS/blob/master/README.md>.
Uninstallation of the HIP SDK components can be done through the Windows
Settings app. Navigate to "Apps > Installed apps", click the "..." on the far
right next to the component to uninstall, and click "Uninstall".
```{figure} /data/deploy/windows/014-uninstall-dark.png
:name: uninstall-dark
:class: only-dark
:alt: Installed apps section of the Settings app showing installed HIP SDK components.
Removing the SDK via the Settings app
```
```{figure} /data/deploy/windows/014-uninstall-light.png
:name: uninstall-light
:class: only-light
:alt: Installed apps section of the Settings app showing installed HIP SDK components.
Removing the SDK via the Settings app
```


@@ -0,0 +1,4 @@
# Upgrading Using the Graphical Interface
The steps to upgrade an existing HIP SDK installation for Windows are described
in this document.


@@ -0,0 +1,44 @@
# Deploy ROCm on Windows
Start with {doc}`/deploy/windows/quick_start` or follow the detailed
instructions below.
## Prepare to Install
::::{grid} 1 1 2 2
:gutter: 1
:::{grid-item-card} Prerequisites
:link: prerequisites
:link-type: doc
The prerequisites page lists the required steps to verify that the system
supports ROCm.
:::
::::
## Choose your install method
::::{grid} 1 1 2 2
:gutter: 1
:::{grid-item-card} Graphical Installation
:link: gui/index
:link-type: doc
Use the graphical front-end of the installer.
:::
:::{grid-item-card} Command Line Installation
:link: cli/index
:link-type: doc
Use the command line front-end of the installer.
:::
::::
## See Also
- {doc}`/release/gpu_os_support`


@@ -0,0 +1,74 @@
# Installation Prerequisites (Windows)
Before installing ROCm, perform the following steps to check that the system
meets all the requirements to proceed with the installation.
## Confirm the System Is Supported
The ROCm installation is supported only on specific host architectures, Windows
SKUs and update versions.
### Check the Windows SKU and Update Version on Your System
This section discusses obtaining information about the host architecture,
Windows SKU and update version.
#### Command Line Check
Verify the Windows SKU using the following steps:
1. To obtain the host architecture, Windows SKU, and update version, run the
following command from a PowerShell Command Line Interface (CLI):
```pwsh
Get-ComputerInfo | Format-Table CsSystemType,OSName,OSDisplayVersion
```
2. Confirm that the obtained information matches the entries listed in
{ref}`supported_skus`.
**Example:** Running the command above on a Windows system may result in the
following output:
```output
CsSystemType OsName OSDisplayVersion
------------ ------ ----------------
x64-based PC Microsoft Windows 11 Pro 22H2
```
#### Graphical Check
1. Open the Settings app.
```{figure} /data/deploy/windows/000-settings-dark.png
:name: settings-dark
:class: only-dark
:alt: Gear icon of the Windows Settings app
Windows Settings app icon
```
```{figure} /data/deploy/windows/000-settings-light.png
:name: settings-light
:class: only-light
:alt: Gear icon of the Windows Settings app
Windows Settings app icon
```
2. Navigate to **System > About**.
```{figure} /data/deploy/windows/001-about-dark.png
:name: about-dark
:class: only-dark
:alt: Settings app panel showing Device and OS information
Settings > About page
```
```{figure} /data/deploy/windows/001-about-light.png
:name: about-light
:class: only-light
:alt: Settings app panel showing Device and OS information
Settings > About page
```
3. Confirm that the obtained information matches the entries listed in
{ref}`supported_skus`.


@@ -0,0 +1,187 @@
# Quick Start (Windows)
The steps to install the HIP SDK for Windows are described in this document.
## System Requirements
The HIP SDK is supported on Windows 10 and 11. The HIP SDK may be installed on a
system without AMD GPUs to use the build toolchains. To run HIP applications, a
compatible GPU is required. Please see the supported GPU guide for more details.
## HIP SDK Installation
### Download the installer
Download the installer from the
[HIP-SDK download page](https://www.amd.com/en/developer/rocm-hub/hip-sdk.html).
### Launching the installer
To launch the AMD HIP SDK Installer, click the **Setup** icon shown in
{numref}`setup-icon`.
```{figure} /data/deploy/windows/000-setup-icon.png
:name: setup-icon
:alt: Icon with AMD arrow logo and User Account Control shield overlaid.
Setup Icon
```
The installer requires Administrator Privileges, so you may be greeted with a
User Account Control (UAC) pop-up. Click **Yes**.
```{figure} /data/deploy/windows/001-uac-dark.png
:name: uac-dark
:class: only-dark
:alt: User Account Control pop-up
User Account Control pop-up
```
```{figure} /data/deploy/windows/001-uac-light.png
:name: uac-light
:class: only-light
:alt: User Account Control pop-up
User Account Control pop-up
```
The installer executable will temporarily extract installer packages to `C:\AMD`
which it will remove after installation completes. This extraction is signified
by the "Initializing install" window in {numref}`init-install`.
```{figure} /data/deploy/windows/002-initializing.png
:name: init-install
:alt: Window with AMD arrow logo, futuristic background and progress counter.
Installer initialization window
```
The installer then detects your system configuration, as shown in
{numref}`detecting-system-components`, to decide which installable components
are applicable to your system.
```{figure} /data/deploy/windows/003-detecting-system-config.png
:name: detecting-system-components
:alt: Window with AMD arrow logo, futuristic background and activity indicator.
Installer initialization window.
```
### Customizing the install
When the installer launches, it displays a window that lets the user customize
the installation. By default, all components are selected for installation.
Refer to {numref}`installer-window` for an instance when the Select All option
is turned on.
```{figure} /data/deploy/windows/004-installer-window.png
:name: installer-window
:alt: Window with AMD arrow logo, futuristic background and activity indicator.
Installer initialization window.
```
#### HIP SDK Installer
The HIP SDK installation options are listed in {numref}`hip-sdk-options`.
```{table} HIP SDK Components for Installation
:name: hip-sdk-options
| **HIP Components** | **Install Type** | **Additional Options** |
|:------------------:|:----------------:|:----------------------:|
| HIP SDK Core | 5.5.0 | Install location |
| HIP Libraries | Full, Partial, None | Runtime, Development (Libs and headers) |
| HIP Runtime Compiler | Full, Partial, None | Runtime, Development (Headers) |
| HIP Ray Tracing | Full, Partial, None | Runtime, Development (Headers) |
| Visual Studio Plugin | Full, Partial, None | Visual Studio 2017, 2019, 2022 Plugin |
```
```{note}
The Select/DeSelect All option only applies to the installation of HIP SDK
components. To install the bundled AMD Display Driver, manually select the
install type.
```
```{tip}
Should you only wish to install a few select components,
DeSelecting All and then picking the individual components may be more
convenient.
```
#### AMD Display Driver
The HIP SDK installer bundles an AMD Radeon Software PRO 23.10 installer. The
supported install options are summarized by
{numref}`display-driver-install-options`:
```{table} AMD Display Driver Install Options
:name: display-driver-install-options
| **Install Option** | **Description** |
|:------------------:|:---------------:|
| Install Location | Location on disk to store driver files. |
| Install Type | The breadth of components to be installed. Refer to {numref}`display-driver-install-types` for details. |
| Factory Reset (Optional) | A Factory Reset will remove all prior versions of AMD HIP SDK and drivers. You will not be able to roll back to previously installed drivers. |
```
```{table} AMD Display Driver Install Types
:name: display-driver-install-types
| **Install Type** | **Description** |
|:----------------:|:---------------:|
| Full Install | Provides all AMD Software features and controls for gaming, recording, streaming, and tweaking the performance on your graphics hardware. |
| Minimal Install | Provides only the basic controls for AMD Software features and does not include advanced features such as performance tweaking or recording and capturing content. |
| Driver Only | Provides no user interface for AMD Software features. |
```
```{note}
You must perform a system restart for a complete installation of the
Display Driver.
```
### Installing Components
Please wait for the installation to complete, as shown in
{numref}`install-progress`.
```{figure} /data/deploy/windows/012-install-progress.png
:name: install-progress
:alt: Window with AMD arrow logo, futuristic background and progress meter.
Installation Progress
```
### Installation Complete
Once the installation is complete, the installer window may prompt you for a
system restart. Click **Restart** at the lower right corner, shown in
{numref}`install-complete`.
```{figure} /data/deploy/windows/013-install-complete.png
:name: install-complete
:alt: Window with AMD arrow logo, futuristic background and completion notice.
Installation Complete
```
```{error}
Should the installer terminate due to unexpected circumstances, or the user
forcibly terminates the installer, the temporary directory created under
`C:\AMD` may be safely removed. Installed components will not depend on this
folder (unless the user specifies `C:\AMD` as an install folder explicitly).
```
## Uninstallation
All components, except the Visual Studio plug-in, should be uninstalled through
Control Panel -> Add/Remove Programs. To uninstall the Visual Studio extension,
refer to
<https://github.com/ROCm-Developer-Tools/HIP-VS/blob/master/README.md>.
Uninstallation of the HIP SDK components can be done through the Windows
Settings app. Navigate to "Apps > Installed apps", click the "..." on the far
right next to the component to uninstall, and click "Uninstall".
```{figure} /data/deploy/windows/014-uninstall-dark.png
:name: uninstall-dark
:class: only-dark
:alt: Installed apps section of the Settings app showing installed HIP SDK components.
Removing the SDK via the Settings app
```
```{figure} /data/deploy/windows/014-uninstall-light.png
:name: uninstall-light
:class: only-light
:alt: Installed apps section of the Settings app showing installed HIP SDK components.
Removing the SDK via the Settings app
```

View File

@@ -24,7 +24,7 @@ MIGraphX is a graph compiler focused on accelerating the Machine Learning infere
After doing all these transformations, MIGraphX emits code for the AMD GPU by calling to MIOpen or rocBLAS or creating HIP kernels for a particular operator. MIGraphX can also target CPUs using DNNL or ZenDNN libraries.
MIGraphX provides easy-to-use APIs in C++ and Python to import machine models in ONNX or TensorFlow. Users can compile, save, load, and run these models using MIGraphX's C++ and Python APIs. Internally, MIGraphX parses ONNX or TensorFlow models into internal graph representation where each operator in the model gets mapped to an operator within MIGraphX. Each of these operators defines various attributes such as:
MIGraphX provides easy-to-use APIs in C++ and Python to import machine models in ONNX or TensorFlow. Users can compile, save, load, and run these models using MIGraphX C++ and Python APIs. Internally, MIGraphX parses ONNX or TensorFlow models into internal graph representation where each operator in the model gets mapped to an operator within MIGraphX. Each of these operators defines various attributes such as:
- Number of arguments
@@ -187,7 +187,7 @@ Follow these steps:
}
```
2. To compile this program, you can use CMake and you only need to link the `migraphx::c` library to use MIGraphX's C++ API. The following is the `CMakeLists.txt` file that can build the earlier example:
2. To compile this program, you can use CMake and you only need to link the `migraphx::c` library to use the MIGraphX C++ API. The following is the `CMakeLists.txt` file that can build the earlier example:
```cmake
cmake_minimum_required(VERSION 3.5)
@@ -327,7 +327,7 @@ To run generated `.mxr` files through `migraphx-driver`, use the following:
./path/to/migraphx-driver run --migraphx resnet50.mxr --enable-offload-copy
```
Alternatively, you can use MIGraphX's C++ or Python API to generate `.mxr` file. Refer to {numref}`image018` for an example.
Alternatively, you can use the MIGraphX C++ or Python API to generate `.mxr` file. Refer to {numref}`image018` for an example.
```{figure} ../../data/understand/deep_learning/image.018.png
:name: image018

View File

@@ -22,7 +22,7 @@ MPI project is an open source implementation of the Message Passing Interface
and industry partners.
Several MPI implementations can be made ROCm-aware by compiling them with
[Unified Communication Framework](http://www.openucx.org/) (UCX) support. One
[Unified Communication Framework](https://www.openucx.org/) (UCX) support. One
notable exception is MVAPICH2: It directly supports AMD GPUs without using UCX,
and you can download it [here](http://mvapich.cse.ohio-state.edu/downloads/).
Use the latest version of the MVAPICH2-GDR package.
@@ -32,7 +32,7 @@ whose goal is to provide a common set of communication interfaces that targets a
broad set of network programming models and interfaces. UCX is ROCm-aware, and
ROCm technologies are used directly to implement various network operation
primitives. For more details on the UCX design, refer to its
[documentation](http://www.openucx.org/documentation).
[documentation](https://www.openucx.org/documentation).
## Building UCX

View File

@@ -60,7 +60,7 @@ Follow these steps:
PyTorch supports the ROCm platform by providing tested wheels packages. To
access this feature, refer to
[https://pytorch.org/get-started/locally/](https://pytorch.org/get-started/locally/)
and choose the "ROCm" compute platform. {numref}`Installation-Matrix-from-Pytorch` is a matrix from <http://pytorch.org/> that illustrates the installation compatibility between ROCm and the PyTorch build.
and choose the "ROCm" compute platform. {numref}`Installation-Matrix-from-Pytorch` is a matrix from <https://pytorch.org/> that illustrates the installation compatibility between ROCm and the PyTorch build.
```{figure} ../../data/how_to/magma_install/image.006.png
:name: Installation-Matrix-from-Pytorch
@@ -83,7 +83,7 @@ To install PyTorch using the wheels package, follow these installation steps:
installation directions in the section
[Installation](../../deploy/linux/install.md). ROCm 5.2 is installed in
this example, as supported by the installation matrix from
<http://pytorch.org/>.
<https://pytorch.org/>.
or
@@ -299,7 +299,7 @@ USE_ROCM=1 MAX_JOBS=4 python3 setup.py install --user
### Test the PyTorch Installation
You can use PyTorch unit tests to validate a PyTorch installation. If using a
prebuilt PyTorch Docker image from AMD ROCm DockerHub or installing an official
prebuilt PyTorch Docker image from AMD ROCm Docker Hub or installing an official
wheels package, these tests are already run on those configurations.
Alternatively, you can manually run the unit tests to validate the PyTorch
installation fully.
@@ -405,6 +405,22 @@ Follow these steps:
python3 main.py
```
## Using MIOpen kdb files with ROCm PyTorch wheels
PyTorch uses MIOpen for machine learning primitives. These primitives are compiled into kernels at runtime. Runtime compilation causes a small warm-up phase when starting PyTorch. MIOpen kdb files contain precompiled kernels that can speed up the warm-up phase of an application. More information is available in the {doc}`MIOpen installation page <miopen:install>`.
MIOpen kdb files can be used with ROCm PyTorch wheels. However, the kdb files need to be placed in a specific location with respect to the PyTorch installation path. A helper script simplifies this task for the user. The script takes in the ROCm version and user's GPU architecture as inputs, and works for Ubuntu and CentOS.
Helper script: [install_kdb_files_for_pytorch_wheels.sh](https://raw.githubusercontent.com/wiki/ROCmSoftwarePlatform/pytorch/files/install_kdb_files_for_pytorch_wheels.sh)
Usage:
After installing ROCm PyTorch wheels:
1. [Optional] `export GFX_ARCH=gfx90a`
2. [Optional] `export ROCM_VERSION=5.5`
3. `./install_kdb_files_for_pytorch_wheels.sh`
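The usage steps above can be sketched as a short dry run. This is only an illustration: the `is_gfx_arch` function is a hypothetical helper and not part of the MIOpen or PyTorch tooling, the default values are the example values from the steps, and the final command is printed rather than executed.

```shell
# Hypothetical helper, not part of the MIOpen or PyTorch tooling: returns
# success when the argument looks like a gfx target name such as gfx90a.
is_gfx_arch() {
    case "$1" in
        gfx[0-9a-f]*) return 0 ;;
        *) return 1 ;;
    esac
}

# Fall back to the example values from the steps above when nothing is set.
GFX_ARCH="${GFX_ARCH:-gfx90a}"
ROCM_VERSION="${ROCM_VERSION:-5.5}"

if is_gfx_arch "$GFX_ARCH"; then
    export GFX_ARCH ROCM_VERSION
    # Dry run: print the command instead of executing the helper script.
    echo "GFX_ARCH=$GFX_ARCH ROCM_VERSION=$ROCM_VERSION ./install_kdb_files_for_pytorch_wheels.sh"
else
    echo "Unexpected GPU architecture: $GFX_ARCH" >&2
fi
```

Replacing the `echo` line with the script invocation itself would perform the actual installation, assuming the script has been downloaded into the current directory and made executable.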
## References
C. Szegedy, V. Vanhoucke, S. Ioffe, J. Shlens and Z. Wojna, "Rethinking the Inception Architecture for Computer Vision," CoRR, p. abs/1512.00567, 2015

View File

@@ -33,8 +33,8 @@ Follow these steps:
2. Once you have pulled the image, run it by using the command below:
```bash
docker run -it --network=host --device=/dev/kfd --device=/dev/dri
--ipc=host --shm-size 16G --group-add video --cap-add=SYS_PTRACE
docker run -it --network=host --device=/dev/kfd --device=/dev/dri \
--ipc=host --shm-size 16G --group-add video --cap-add=SYS_PTRACE \
--security-opt seccomp=unconfined rocm/tensorflow:latest
```
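The trailing backslashes keep the command a single invocation. Without them, the shell runs the first line by itself and then tries to execute `--ipc=host` as a separate command. The sketch below uses a hypothetical `count_args` function in place of `docker run`, so nothing is launched, to show all three lines arriving as one argument list.

```shell
# Hypothetical stand-in for `docker run`: reports how many arguments it received.
count_args() {
    echo "$#"
}

# The backslash-newline pairs join the three lines into one command.
count_args -it --network=host --device=/dev/kfd --device=/dev/dri \
    --ipc=host --shm-size 16G --group-add video --cap-add=SYS_PTRACE \
    --security-opt seccomp=unconfined rocm/tensorflow:latest
```

A backslash only continues the line when it is the very last character, so trailing spaces after it would break the command again.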

View File

@@ -275,7 +275,7 @@ sudo yum install cpupowerutils
:::
:::{tab-item} SUSE Linux Enterprise Server 15
:::{tab-item} SUSE Linux Enterprise Server
:sync: SLES
```shell
@@ -453,7 +453,7 @@ sudo yum install rocm-bandwidth-test
:::
:::{tab-item} SUSE Linux Enterprise Server 15
:::{tab-item} SUSE Linux Enterprise Server
:sync: SLES
```shell

View File

@@ -258,7 +258,7 @@ sudo yum install cpupowerutils
:::
:::{tab-item} SUSE Linux Enterprise Server 15
:::{tab-item} SUSE Linux Enterprise Server
:sync: SLES
```shell
@@ -436,7 +436,7 @@ sudo yum install rocm-bandwidth-test
:::
:::{tab-item} SUSE Linux Enterprise Server 15
:::{tab-item} SUSE Linux Enterprise Server
:sync: SLES
```shell

View File

@@ -7,20 +7,26 @@
AMD's library for high performance machine learning primitives.
- {doc}`Documentation <miopen:index>`
- [GitHub](https://github.com/ROCmSoftwarePlatform/MIOpen)
- [Changelog](https://github.com/ROCmSoftwarePlatform/MIOpen/blob/develop/CHANGELOG.md)
:::
:::{grid-item-card} {doc}`Composable Kernel <composable-kernel:index>`
:::{grid-item-card} {doc}`Composable Kernel <composable_kernel:index>`
Composable Kernel: Performance Portable Programming Model for Machine Learning Tensor Operators
- {doc}`Documentation <composable-kernel:index>`
- {doc}`Documentation <composable_kernel:index>`
- [GitHub](https://github.com/ROCmSoftwarePlatform/composable_kernel)
- [Changelog](https://github.com/ROCmSoftwarePlatform/composable_kernel/blob/develop/CHANGELOG.md)
:::
:::{grid-item-card} {doc}`MIGraphX <migraphx:index>`
:::{grid-item-card} {doc}`MIGraphX <amdmigraphx:index>`
AMD MIGraphX is AMD's graph inference engine that accelerates machine learning model inference.
- {doc}`Documentation <migraphx:index>`
- {doc}`Documentation <amdmigraphx:index>`
- [GitHub](https://github.com/ROCmSoftwarePlatform/AMDMIGraphX)
- [Changelog](https://github.com/ROCmSoftwarePlatform/AMDMIGraphX/blob/develop/CHANGELOG.md)
:::

View File

@@ -8,8 +8,9 @@
:::{grid-item-card} [HIP](./hip)
HIP is both AMD's GPU programming language extension and the GPU runtime.
- {doc}`hip:.doxygen/docBin/html/index`
- [Examples](https://github.com/amd/rocm-examples/tree/develop/HIP-Basic)
- {doc}`HIP <hip:index>`
- [HIP Examples](https://github.com/amd/rocm-examples/tree/develop/HIP-Basic)
- {doc}`HIPIFY <hipify:index>`
:::
@@ -42,8 +43,8 @@ Inter and intra-node communication is supported by the following projects:
Libraries related to AI.
- {doc}`MIOpen <miopen:index>`
- {doc}`Composable Kernel <composable-kernel:index>`
- {doc}`MIGraphX <migraphx:index>`
- {doc}`Composable Kernel <composable_kernel:index>`
- {doc}`MIGraphX <amdmigraphx:index>`
:::
@@ -64,6 +65,7 @@ Computer vision related projects.
:::{grid-item-card} [Compilers and Tools](compilers)
- [ROCmCC](/reference/rocmcc/rocmcc)
- {doc}`ROCdbgapi <rocdbgapi:index>`
- {doc}`ROCgdb <rocgdb:index>`
- {doc}`ROCProfiler <rocprofiler:rocprof>`
- {doc}`ROCTracer <roctracer:index>`
@@ -72,15 +74,15 @@ Computer vision related projects.
:::{grid-item-card} [Management Tools](management_tools)
- AMD SMI
- [ROCm SMI](https://rocmdocs.amd.com/projects/rocmsmi/en/latest/)
- {doc}`ROCm Datacenter Tool <rdc:index>`
- {doc}`AMD SMI <amdsmi:index>`
- {doc}`ROCm SMI <rocm_smi_lib:index>`
- {doc}`ROCm Data Center Tool <rdc:index>`
:::
:::{grid-item-card} [Validation Tools](validation_tools)
- {doc}`ROCm Validation Suite <rocm-validation-suite:index>`
- {doc}`ROCm Validation Suite <rocmvalidationsuite:index>`
- {doc}`TransferBench <transferbench:index>`
:::

View File

@@ -3,42 +3,46 @@
:::::{grid} 1 1 2 2
:gutter: 1
:::{grid-item-card} ROCmCC
:link: /reference/rocmcc/rocmcc
:link-type: doc
:::{grid-item-card} {doc}`ROCdbgapi <rocdbgapi:index>`
The AMD Debugger API is a library that provides all the support necessary for a
debugger and other tools to perform low level control of the execution and
inspection of execution state of AMD's commercially available GPU architectures.
- {doc}`Documentation <rocdbgapi:index>`
- [GitHub](https://github.com/ROCm-Developer-Tools/ROCdbgapi/)
:::
:::{grid-item-card} [ROCmCC](./rocmcc/rocmcc)
ROCmCC is a Clang/LLVM-based compiler. It is optimized for high-performance
computing on AMD GPUs and CPUs and supports various heterogeneous programming
models such as HIP, OpenMP, and OpenCL.
- [Documentation](./rocmcc/rocmcc)
:::
:::{grid-item-card} ROCgdb
:link: rocgdb:index
:link-type: doc
:::{grid-item-card} {doc}`ROCgdb <rocgdb:index>`
This is ROCgdb, the ROCm source-level debugger for Linux, based on GDB, the GNU source-level debugger.
- {doc}`Documentation <rocgdb:index>`
- [GitHub](https://github.com/ROCm-Developer-Tools/ROCgdb/)
:::
:::{grid-item-card} ROCProfiler
:link: rocprofiler:rocprof
:link-type: doc
:::{grid-item-card} {doc}`ROCProfiler <rocprofiler:rocprof>`
ROC profiler library. Profiling with performance counters and derived metrics. Library supports GFX8/GFX9. Hardware specific low-level performance analysis interface for profiling of GPU compute applications. The profiling includes hardware performance counters with complex performance metrics.
:::
:::{grid-item-card} ROCTracer
:link: roctracer:index
:link-type: doc
Callback/Activity Library for Performance tracing AMD GPU's
- {doc}`Documentation <rocprofiler:rocprof>`
- [GitHub](https://github.com/ROCm-Developer-Tools/rocprofiler/)
:::
:::{grid-item-card} ROCdbgapi
:link: rocdbgapi:index
:link-type: doc
The AMD Debugger API is a library that provides all the support necessary for a
debugger and other tools to perform low level control of the execution and
inspection of execution state of AMD's commercially available GPU architectures.
:::{grid-item-card} {doc}`ROCTracer <roctracer:index>`
Callback/Activity Library for Performance tracing AMD GPUs
- {doc}`Documentation <roctracer:index>`
- [GitHub](https://github.com/ROCm-Developer-Tools/roctracer)
:::

View File

@@ -7,6 +7,8 @@
MIVisionX toolkit is a set of comprehensive computer vision and machine intelligence libraries, utilities, and applications bundled into a single toolkit. AMD MIVisionX also delivers a highly optimized open-source implementation of the Khronos OpenVX™ and OpenVX™ Extensions.
- {doc}`Documentation <mivisionx:README>`
- [GitHub](https://github.com/GPUOpen-ProfessionalCompute-Libraries/MIVisionX/)
- [Changelog](https://github.com/GPUOpen-ProfessionalCompute-Libraries/MIVisionX/blob/master/CHANGELOG.md)
:::

View File

@@ -11,6 +11,7 @@ transforms, reductions, scans, etc. It also serves as a common back-end for
similar libraries found inside ROCm.
- {doc}`Documentation <rocprim:index>`
- [GitHub](https://github.com/ROCmSoftwarePlatform/rocPRIM/)
- [Changelog](https://github.com/ROCmSoftwarePlatform/rocPRIM/blob/develop/CHANGELOG.md)
- [Examples](https://github.com/amd/rocm-examples/tree/develop/Libraries/rocPRIM)
@@ -22,6 +23,7 @@ interface. Their CPU back-ends are identical, while the GPU back-end calls into
rocPRIM.
- {doc}`Documentation <rocthrust:index>`
- [GitHub](https://github.com/ROCmSoftwarePlatform/rocThrust)
- [Changelog](https://github.com/ROCmSoftwarePlatform/rocThrust/blob/develop/CHANGELOG.md)
- [Examples](https://github.com/amd/rocm-examples/tree/develop/Libraries/rocThrust)
@@ -32,6 +34,7 @@ hipCUB is a template library of algorithm primitives with a CUB-compatible
interface. Its back-end is rocPRIM.
- {doc}`Documentation <hipcub:index>`
- [GitHub](https://github.com/ROCmSoftwarePlatform/hipCUB)
- [Changelog](https://github.com/ROCmSoftwarePlatform/hipCUB/blob/develop/CHANGELOG.md)
- [Examples](https://github.com/amd/rocm-examples/tree/develop/Libraries/hipCUB)

View File

@@ -10,6 +10,7 @@ The collective operations are implemented using ring and tree algorithms and hav
throughput and latency.
- {doc}`Documentation <rccl:index>`
- [GitHub](https://github.com/ROCmSoftwarePlatform/rccl)
- [Changelog](https://github.com/ROCmSoftwarePlatform/rccl/blob/develop/CHANGELOG.md)
- [Examples](https://github.com/ROCmSoftwarePlatform/rccl/tree/develop/tools)

View File

@@ -9,6 +9,7 @@ ROCm libraries for FFT are as follows:
rocFFT is an AMD GPU optimized library for FFT.
- {doc}`Documentation <rocfft:index>`
- [GitHub](https://github.com/ROCmSoftwarePlatform/rocFFT)
- [Changelog](https://github.com/ROCmSoftwarePlatform/rocFFT/blob/develop/CHANGELOG.md)
:::
@@ -19,6 +20,7 @@ using rocFFT. hipFFT allows for a common interface for other non AMD GPU
FFT libraries.
- {doc}`Documentation <hipfft:index>`
- [GitHub](https://github.com/ROCmSoftwarePlatform/hipFFT)
- [Changelog](https://github.com/ROCmSoftwarePlatform/hipFFT/blob/develop/CHANGELOG.md)
:::

View File

@@ -9,6 +9,7 @@ ROCm libraries for linear algebra are as follows:
`rocBLAS` is an AMD GPU optimized library for BLAS (Basic Linear Algebra Subprograms).
- {doc}`Documentation <rocblas:index>`
- [GitHub](https://github.com/ROCmSoftwarePlatform/rocBLAS)
- [Changelog](https://github.com/ROCmSoftwarePlatform/rocBLAS/blob/develop/CHANGELOG.md)
- [Examples](https://github.com/amd/rocm-examples/tree/develop/Libraries/rocBLAS)
@@ -20,6 +21,7 @@ via `rocBLAS` and `rocSOLVER`. `hipBLAS` allows for a common interface for other
BLAS libraries.
- {doc}`Documentation <hipblas:index>`
- [GitHub](https://github.com/ROCmSoftwarePlatform/hipBLAS)
- [Changelog](https://github.com/ROCmSoftwarePlatform/hipBLAS/blob/develop/CHANGELOG.md)
:::
@@ -31,6 +33,7 @@ flexible API and extends functionalities beyond traditional BLAS library.
optimized generator as a back-end kernel provider.
- {doc}`Documentation <hipblaslt:index>`
- [GitHub](https://github.com/ROCmSoftwarePlatform/hipBLASLt)
- [Changelog](https://github.com/ROCmSoftwarePlatform/hipBLASLt/blob/develop/CHANGELOG.md)
:::
@@ -41,6 +44,7 @@ fine-grained parallelism on top of AMD's ROCm runtime and toolchains, targeting
modern CPU and GPU platforms.
- {doc}`Documentation <rocalution:index>`
- [GitHub](https://github.com/ROCmSoftwarePlatform/rocALUTION)
- [Changelog](https://github.com/ROCmSoftwarePlatform/rocALUTION/blob/develop/CHANGELOG.md)
:::
@@ -50,6 +54,7 @@ modern CPU and GPU platforms.
(MMA) problems into fragments and distributes these over GPU wavefronts.
- {doc}`Documentation <rocwmma:index>`
- [GitHub](https://github.com/ROCmSoftwarePlatform/rocWMMA)
- [Changelog](https://github.com/ROCmSoftwarePlatform/rocWMMA/blob/develop/CHANGELOG.md)
:::
@@ -58,6 +63,7 @@ modern CPU and GPU platforms.
`rocSOLVER` provides a subset of LAPACK (Linear Algebra Package) functionality on the ROCm platform.
- {doc}`Documentation <rocsolver:index>`
- [GitHub](https://github.com/ROCmSoftwarePlatform/rocSOLVER)
- [Changelog](https://github.com/ROCmSoftwarePlatform/rocSOLVER/blob/develop/CHANGELOG.md)
:::
@@ -67,6 +73,7 @@ modern CPU and GPU platforms.
as backends whilst exporting a unified interface.
- {doc}`Documentation <hipsolver:index>`
- [GitHub](https://github.com/ROCmSoftwarePlatform/hipSOLVER)
- [Changelog](https://github.com/ROCmSoftwarePlatform/hipSOLVER/blob/develop/CHANGELOG.md)
:::
@@ -75,6 +82,7 @@ as backends whilst exporting a unified interface.
`rocSPARSE` is a library to provide BLAS for sparse computations.
- {doc}`Documentation <rocsparse:index>`
- [GitHub](https://github.com/ROCmSoftwarePlatform/rocSPARSE)
- [Changelog](https://github.com/ROCmSoftwarePlatform/rocSPARSE/blob/develop/CHANGELOG.md)
:::
@@ -84,6 +92,7 @@ as backends whilst exporting a unified interface.
supporting both `rocSPARSE` and `cuSPARSE` as backends.
- {doc}`Documentation <hipsparse:index>`
- [GitHub](https://github.com/ROCmSoftwarePlatform/hipSPARSE)
- [Changelog](https://github.com/ROCmSoftwarePlatform/hipSPARSE/blob/develop/CHANGELOG.md)
:::

View File

@@ -7,6 +7,7 @@
rocRAND is an AMD GPU optimized library for pseudo-random number generators (PRNG).
- {doc}`Documentation <rocrand:index>`
- [GitHub](https://github.com/ROCmSoftwarePlatform/rocRAND/)
- [Changelog](https://github.com/ROCmSoftwarePlatform/rocRAND/blob/develop/CHANGELOG.md)
- [Examples](https://github.com/amd/rocm-examples/tree/develop/Libraries/rocRAND)
@@ -18,6 +19,7 @@ generation (PRNG) optimized for AMD GPUs using rocRAND. hipRAND allows for a
common interface for other non AMD GPU PRNG libraries.
- {doc}`Documentation <hiprand:index>`
- [GitHub](https://github.com/ROCmSoftwarePlatform/hipRAND/)
- [Changelog](https://github.com/ROCmSoftwarePlatform/hipRAND/blob/develop/CHANGELOG.md)
:::

View File

@@ -12,7 +12,8 @@ page introduces the HIP runtime and other HIP libraries and tools.
The HIP Runtime is used to enable GPU acceleration for all HIP language based
products.
- {doc}`hip:.doxygen/docBin/html/index`
- {doc}`Documentation <hip:index>`
- [GitHub](https://github.com/ROCm-Developer-Tools/HIP)
- [Examples](https://github.com/amd/rocm-examples/tree/develop/HIP-Basic)
:::
@@ -28,7 +29,9 @@ products.
HIPIFY assists with porting CUDA-based applications to the HIP Runtime.
Supported CUDA APIs are documented here as well.
- {doc}`Reference Manual <hipify:index>`
- {doc}`Documentation <hipify:index>`
- [GitHub](https://github.com/ROCm-Developer-Tools/HIPIFY/)
- [Changelog](https://github.com/ROCm-Developer-Tools/HIPIFY/blob/amd-staging/CHANGELOG.md)
:::

View File

@@ -3,28 +3,29 @@
:::::{grid} 1 1 3 3
:gutter: 1
:::{grid-item-card} AMD SMI
:::{grid-item-card} {doc}`AMD SMI <amdsmi:index>`
The AMD System Management Interface Library, or AMD SMI library, is a C library for Linux that provides a user space interface for applications to monitor and control AMD devices.
- {doc}`Documentation <amdsmi:index>`
- [GitHub](https://github.com/RadeonOpenCompute/amdsmi)
- [Examples](https://github.com/amd/go_amd_smi#example)
:::
:::{grid-item-card} [ROCm SMI](https://rocmdocs.amd.com/projects/rocmsmi/en/latest/)
:::{grid-item-card} {doc}`ROCm SMI LIB <rocm_smi_lib:index>`
This tool acts as a command line interface for manipulating and monitoring the AMD GPU kernel, and is intended to replace and deprecate the existing `rocm_smi.py` CLI tool. It uses `ctypes` to call the `rocm_smi_lib` API.
- [Documentation](https://rocmdocs.amd.com/projects/rocmsmi/en/latest/)
- {doc}`Documentation <rocm_smi_lib:index>`
- [GitHub](https://github.com/RadeonOpenCompute/rocm_smi_lib)
- [Examples](https://github.com/RadeonOpenCompute/rocm_smi_lib/tree/master/python_smi_tools)
:::
:::{grid-item-card} {doc}`ROCm Datacenter Tool <rdc:index>`
:::{grid-item-card} {doc}`ROCm Data Center Tool <rdc:index>`
The ROCm™ Data Center Tool simplifies the administration and addresses key infrastructure challenges in AMD GPUs in cluster and data center environments.
- {doc}`Documentation <rdc:index>`
- [GitHub](https://github.com/RadeonOpenCompute/rdc)
- [Changelog](https://github.com/RadeonOpenCompute/rdc/blob/master/CHANGELOG.md)
- [Examples](https://github.com/RadeonOpenCompute/rdc/tree/master/example)
:::

View File

@@ -53,11 +53,10 @@ that are required for target offload from an OpenMP program:
```
:::{note}
The Makefile in the example above uses a more classical and verbose set of flags
which can also be used:
The compiler also accepts the alternative offloading notation:
```bash
-fopenmp -fopenmp-targets=amdgcn-amd-amdhsa -Xopenmp-target=amdgcn-amd-amdhsa
-fopenmp -fopenmp-targets=amdgcn-amd-amdhsa -Xopenmp-target=amdgcn-amd-amdhsa -march=<gpu-arch>
```
:::
@@ -142,8 +141,8 @@ For more details on tracing, refer to the ROCm Profiling Tools document on
:::{table}
:widths: auto
| Environment Variable | Description |
| --------------------------- | ----------- |
| Environment Variable | Description |
| --------------------------- | ---------------------------- |
| `OMP_NUM_TEAMS` | The implementation chooses the number of teams for kernel launch. The user can change this number for performance tuning using this environment variable, subject to implementation limits. |
| `LIBOMPTARGET_KERNEL_TRACE` | This environment variable is used to print useful statistics for device operations. Setting it to 1 and running the program emits the name of every kernel launched, the number of teams and threads used, and the corresponding register usage. Setting it to 2 additionally emits timing information for kernel launches and data transfer operations between the host and the device. |
| `LIBOMPTARGET_INFO` | This environment variable is used to print informational messages from the device runtime as the program executes. Users can request fine-grain information by setting it to the value of 1 or higher and can set the value of -1 for complete information. |
@@ -158,6 +157,14 @@ implemented in the past releases.
(openmp_usm)=
### Asynchronous Behavior in OpenMP Target Regions
- Multithreaded offloading on the same device
The `libomptarget` plugin for GPU offloading allows creation of separate configurable HSA queues per chiplet, which enables two or more threads to concurrently offload to the same device.
- Parallel memory copy invocations
Implicit asynchronous execution of single target region enables parallel memory copy invocations.
### Unified Shared Memory
Unified Shared Memory (USM) provides a pointer-based approach to memory
@@ -178,39 +185,34 @@ with Xnack capability.
When enabled, Xnack capability allows GPU threads to access CPU (system) memory,
allocated with OS-allocators, such as `malloc`, `new`, and `mmap`. Xnack must be
enabled both at compile- and run-time. To enable Xnack support at compile-time,
the programmer should use
use:
```bash
--offload-arch=gfx908:xnack+
```
Or, equivalently
Or use the functionally equivalent Xnack-any option:
```bash
--offload-arch=gfx908
```
:::{note}
The second case is called Xnack-any and it is functionally equivalent to the
first case.
:::
At runtime, programmers enable Xnack functionality on a per-application basis
using an environment variable:
To enable Xnack functionality at runtime on a per-application basis,
use the following environment variable:
```bash
HSA_XNACK=1
```
When Xnack support is not needed, then applications can be built to maximize
resource utilization using:
When Xnack support is not needed:
- Build the applications to maximize resource utilization using:
```bash
--offload-arch=gfx908:xnack-
```
At runtime, the `HSA_XNACK` environment variable can be set to 0, as Xnack
functionality is not needed.
- At runtime, set the `HSA_XNACK` environment variable to 0.
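The compile-time and runtime settings above pair up as follows. The sketch is only a summary of that pairing: the function names `xnack_offload_flag` and `xnack_runtime_env` are hypothetical and not part of ROCm, and `gfx908` is simply the architecture used in the examples above.

```shell
# Hypothetical helper: prints the --offload-arch option for a given Xnack mode.
# $1 = GPU architecture (for example gfx908), $2 = on | any | off
xnack_offload_flag() {
    case "$2" in
        on)  echo "--offload-arch=$1:xnack+" ;;
        any) echo "--offload-arch=$1" ;;
        off) echo "--offload-arch=$1:xnack-" ;;
        *)   echo "unknown Xnack mode: $2" >&2; return 1 ;;
    esac
}

# Hypothetical helper: prints the matching runtime environment setting.
# $1 = on | any | off
xnack_runtime_env() {
    case "$1" in
        on|any) echo "HSA_XNACK=1" ;;
        off)    echo "HSA_XNACK=0" ;;
        *)      echo "unknown Xnack mode: $1" >&2; return 1 ;;
    esac
}

# Print the compile-time option and runtime setting for each mode.
for mode in on any off; do
    echo "$mode: $(xnack_offload_flag gfx908 "$mode") with $(xnack_runtime_env "$mode")"
done
```

Keeping the two settings in one place makes it harder to build with `xnack+` and then forget to set `HSA_XNACK=1` when running the application.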
#### Unified Shared Memory Pragma
@@ -264,7 +266,7 @@ The difference between the memory pages pointed to by these two variables is
that the pages pointed by “a” are in fine-grain memory, while the pages pointed
to by “b” are in coarse-grain memory during and after the execution of the
target region. This is accomplished in the OpenMP runtime library with calls to
the ROCR runtime to set the pages pointed by “b” as coarse grain.
the ROCr runtime to set the pages pointed by “b” as coarse grain.
### OMPT Target Support
@@ -431,43 +433,46 @@ for(int i=0; i<N; i++){
See the complete sample code for global buffer overflow
[here](https://github.com/ROCm-Developer-Tools/aomp/blob/aomp-dev/examples/tools/asan/global_buffer_overflow/openmp/vecadd-GBO.cpp).
### No-loop Kernel Generation
### Clang Compiler Option for Kernel Optimization
The No-loop kernel generation feature optimizes the compiler performance by
generating a specialized kernel for certain OpenMP Target Constructs such as
target teams distribute parallel for. The specialized kernel generation assumes
that every thread executes a single iteration of the user loop, which implies
that the runtime launches a total number of GPU threads equal to or greater than
the iteration space size of the target region loop. This allows the compiler to
generate code for the loop body without an enclosing loop, resulting in reduced
control-flow complexity and potentially better performance.
You can use the clang compiler option `-fopenmp-target-fast` for kernel optimization if certain constraints implied by its component options are satisfied. `-fopenmp-target-fast` enables the following options:
To enable the generation of the specialized kernel, follow these guidelines:
- `-fopenmp-target-ignore-env-vars`: It enables code generation of specialized kernels including No-loop and Cross-team reductions.
- Do not specify teams, threads, and schedule-related environment variables. The
`num_teams` or a `thread_limit` clause in an OpenMP target construct acts as
an override and prevents the generation of the specialized kernel. As the user
is unable to specify the number of teams and threads used within target
regions in the absence of the above-mentioned environment variables, the
runtime will select the best values for the launch configuration based on
runtime knowledge of the program.
- `-fopenmp-assume-no-thread-state`: It enables the compiler to assume that no thread in a parallel region modifies an Internal Control Variable (`ICV`), thus potentially reducing the device runtime code execution.
- Assert the absence of the above-mentioned environment variables by adding the
command-line option `-fopenmp-target-ignore-env-vars`. This option also allows
programmers to enable the No-loop functionality at lower optimization levels.
- `-fopenmp-assume-no-nested-parallelism`: It enables the compiler to assume that no thread in a parallel region encounters a parallel region, thus potentially reducing the device runtime code execution.
- Also, the No-loop functionality is automatically enabled when `-O3` or
`-Ofast` is used for compilation. To disable this feature, use
`-fno-openmp-target-ignore-env-vars`.
- `-O3` if no `-O*` is specified by the user.
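The relationship between `-fopenmp-target-fast` and its component options can be written down as a small shell sketch. The function `target_fast_components` is hypothetical: it only restates the list above and does not query the compiler.

```shell
# Hypothetical helper: lists the options that, per the list above,
# -fopenmp-target-fast enables. $1 is the user's -O* flag, if any.
target_fast_components() {
    echo "-fopenmp-target-ignore-env-vars"
    echo "-fopenmp-assume-no-thread-state"
    echo "-fopenmp-assume-no-nested-parallelism"
    # -O3 is only implied when the user gave no optimization level.
    if [ -z "${1:-}" ]; then
        echo "-O3"
    fi
}

target_fast_components        # no -O* given: the list ends with -O3
target_fast_components -O2    # user-chosen level: -O3 is not added
```

The second call shows why an explicit optimization level matters: a user-supplied `-O*` flag is kept, and `-O3` is not added on top of it.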
Note The compiler might not generate the No-loop kernel in certain scenarios
where the performance improvement is not substantial.
### Specialized Kernels
### Cross-Team Optimized Reductions
Clang will attempt to generate specialized kernels based on compiler options and OpenMP constructs. The following specialized kernels are supported:
In scenarios where a No-loop kernel is generated but the OpenMP construct has a
reduction clause, the compiler may generate optimized code utilizing efficient
Cross-Team (Xteam) communication. No separate user option is required, and there
is a significant performance improvement with Xteam reduction. New APIs for
Xteam reduction are implemented in the device runtime, and clang generates these
APIs automatically.
- No-Loop
- Big-Jump-Loop
- Cross-Team (Xteam) Reductions
To enable the generation of specialized kernels, follow these guidelines:
- Do not specify teams, threads, and schedule-related environment variables. The `num_teams` clause in an OpenMP target construct acts as an override and prevents the generation of the No-Loop kernel. If the specification of `num_teams` clause is a user requirement then clang tries to generate the Big-Jump-Loop kernel instead of the No-Loop kernel.
- Assert the absence of the teams, threads, and schedule-related environment variables by adding the command-line option `-fopenmp-target-ignore-env-vars`.
- To automatically enable the specialized kernel generation, use `-Ofast` or `-fopenmp-target-fast` for compilation.
- To disable specialized kernel generation, use `-fno-openmp-target-ignore-env-vars`.
#### No-Loop Kernel Generation
The No-loop kernel generation feature optimizes the compiler performance by generating a specialized kernel for certain OpenMP target constructs such as target teams distribute parallel for. The specialized kernel generation feature assumes every thread executes a single iteration of the user loop, which leads the runtime to launch a total number of GPU threads equal to or greater than the iteration space size of the target region loop. This allows the compiler to generate code for the loop body without an enclosing loop, resulting in reduced control-flow complexity and potentially better performance.
#### Big-Jump-Loop Kernel Generation
A No-Loop kernel is not generated if the OpenMP teams construct uses a `num_teams` clause. Instead, the compiler attempts to generate a different specialized kernel called the Big-Jump-Loop kernel. The compiler launches the kernel with a grid size determined by the number of teams specified by the OpenMP `num_teams` clause and the `blocksize` chosen either by the compiler or specified by the corresponding OpenMP clause.
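The following sketch shows a loop that carries a `num_teams` clause and would therefore be a candidate for the Big-Jump-Loop kernel rather than the No-Loop kernel. The function name `scale`, the team count of 64, and the use of `thread_limit` as the clause that sets the block size are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Writes out[i] = in[i] * factor for every element.
void scale(const std::vector<double>& in, std::vector<double>& out, double factor) {
    assert(in.size() == out.size());
    const std::size_t n = in.size();
    const double* src = in.data();
    double* dst = out.data();

    // num_teams fixes the number of teams, so the thread count no longer
    // tracks the iteration count and each thread may handle several iterations.
    #pragma omp target teams distribute parallel for \
        num_teams(64) thread_limit(256) \
        map(to: src[0:n]) map(from: dst[0:n])
    for (std::size_t i = 0; i < n; ++i) {
        dst[i] = src[i] * factor;
    }
}
```

With a fixed grid, the loop body must still be wrapped in a loop, which is why this case cannot use the No-Loop form.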
#### Xteam Optimized Reduction Kernel Generation
If the OpenMP construct has a reduction clause, the compiler attempts to generate optimized code by utilizing efficient Xteam communication. New APIs for Xteam reduction are implemented in the device runtime, and clang generates calls to them automatically.
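A minimal sketch of a loop with a reduction clause is shown below. The function name `dot` is hypothetical, and the code only shows the source-level construct; the Xteam runtime calls themselves are emitted by the compiler and are not visible in user code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns the dot product of a and b.
double dot(const std::vector<double>& a, const std::vector<double>& b) {
    assert(a.size() == b.size());
    const std::size_t n = a.size();
    const double* ap = a.data();
    const double* bp = b.data();
    double total = 0.0;

    // The reduction clause is what makes this loop a candidate for the
    // cross-team reduction; no extra user option is involved.
    #pragma omp target teams distribute parallel for \
        reduction(+: total) \
        map(to: ap[0:n], bp[0:n]) map(tofrom: total)
    for (std::size_t i = 0; i < n; ++i) {
        total += ap[i] * bp[i];
    }
    return total;
}
```

The explicit `map(tofrom: total)` keeps the result visible on the host regardless of how a given OpenMP version treats scalars in a target region by default.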


@@ -3,10 +3,12 @@
:::::{grid} 1 1 2 2
:gutter: 1
:::{grid-item-card} {doc}`RVS <rocm-validation-suite:index>`
:::{grid-item-card} {doc}`RVS <rocmvalidationsuite:index>`
The ROCm Validation Suite is a system administrator's and cluster manager's tool for detecting and troubleshooting common problems affecting AMD GPUs running in a high-performance computing environment, enabled using the ROCm software stack on a compatible platform.
- {doc}`Documentation <rocm-validation-suite:index>`
- {doc}`Documentation <rocmvalidationsuite:index>`
- [GitHub](https://github.com/ROCm-Developer-Tools/ROCmValidationSuite)
- [Changelog](https://github.com/ROCm-Developer-Tools/ROCmValidationSuite/blob/master/CHANGELOG.md)
:::
@@ -14,6 +16,7 @@ The ROCm Validation Suite is a system administrators and cluster manager's to
TransferBench is a simple utility capable of benchmarking simultaneous transfers between user-specified devices (CPUs/GPUs).
- {doc}`Documentation <transferbench:index>`
- [GitHub](https://github.com/ROCmSoftwarePlatform/TransferBench/)
- [Changelog](https://github.com/ROCmSoftwarePlatform/TransferBench/blob/develop/CHANGELOG.md)
- {doc}`transferbench:examples/index`


@@ -19,6 +19,7 @@ TensorFlow
| 5.3.x | 1.10.1, 1.11, 1.12.1, 1.13 | 2.8, 2.9, 2.10 | |
| 5.4.x | 1.10.1, 1.11, 1.12.1, 1.13 | 2.8, 2.9, 2.10, 2.11 | 2.5.4 |
| 5.5.x | 1.10.1, 1.11, 1.12.1, 1.13 | 2.10, 2.11 | 2.5.4 |
| 5.6 | 1.11, 1.12.1, 1.13.1 | 2.12 | 2.5.4 |
## Communication libraries
@@ -47,6 +48,7 @@ contemporary CUDA / NVIDIA HPC SDK alternatives.
| 5.3.x | 1.16 | 22.7 |
| 5.4.x | 1.16 | 22.9 |
| 5.5.x | 1.17 | 22.9 |
| 5.6 | 1.17.2 | 22.9 |
For the latest documentation of these libraries, refer to the
[associated documentation](../reference/gpu_libraries/c%2B%2B_primitives.md).


@@ -8,7 +8,7 @@ The software support matrices for ROCm container releases is listed.
#### `Ubuntu+ rocm5.6_internal_testing +169530b`
* [ROCm5.6](https://repo.radeon.com/rocm/apt/latest/)
* [ROCm5.6](https://repo.radeon.com/rocm/apt/5.6/)
* [Python 3.8](https://www.python.org/downloads/release/python-380/)
* [Torch 2.0.0](https://github.com/ROCmSoftwarePlatform/pytorch/tree/rocm5.6_internal_testing)
* [Apex 0.1](https://github.com/ROCmSoftwarePlatform/apex/tree/v0.1)
@@ -21,7 +21,7 @@ The software support matrices for ROCm container releases is listed.
#### `CentOS7+ rocm5.6_internal_testing +169530b`
* [ROCm5.6](https://repo.radeon.com/rocm/apt/latest/)
* [ROCm5.6](https://repo.radeon.com/rocm/apt/5.6/)
* [Python 3.8](https://www.python.org/downloads/release/python-380/)
* [Torch 2.0.0](https://github.com/ROCmSoftwarePlatform/pytorch/tree/rocm5.6_internal_testing)
* [Apex 0.1](https://github.com/ROCmSoftwarePlatform/apex/tree/v0.1)
@@ -31,7 +31,7 @@ The software support matrices for ROCm container releases is listed.
#### `1.13 +bfeb431`
* [ROCm5.6](https://repo.radeon.com/rocm/apt/latest/)
* [ROCm5.6](https://repo.radeon.com/rocm/apt/5.6/)
* [Python 3.8](https://www.python.org/downloads/release/python-380/)
* [Torch 1.13.1](https://github.com/ROCmSoftwarePlatform/pytorch/tree/release/1.13)
* [Apex 0.1](https://github.com/ROCmSoftwarePlatform/apex/tree/v0.1)
@@ -44,7 +44,7 @@ The software support matrices for ROCm container releases is listed.
#### `1.12 +05d5d04`
* [ROCm5.6](https://repo.radeon.com/rocm/apt/latest/)
* [ROCm5.6](https://repo.radeon.com/rocm/apt/5.6/)
* [Python 3.8](https://www.python.org/downloads/release/python-380/)
* [Torch 1.12.1](https://github.com/ROCmSoftwarePlatform/pytorch/tree/release/1.12)
* [Apex 0.1](https://github.com/ROCmSoftwarePlatform/apex/tree/v0.1)
@@ -59,7 +59,7 @@ The software support matrices for ROCm container releases is listed.
#### `tensorflow_develop-upstream-QA-rocm56 +c88a9f4`
* [ROCm5.6](https://repo.radeon.com/rocm/apt/latest/)
* [ROCm5.6](https://repo.radeon.com/rocm/apt/5.6/)
* [Python 3.9](https://www.python.org/downloads/release/python-390/)
* `tensorflow-rocm` 2.13.0
* [OFED 5.3](https://content.mellanox.com/ofed/MLNX_OFED-5.3-1.0.5.0/MLNX_OFED_LINUX-5.3-1.0.5.0-ubuntu20.04-x86_64.tgz)
@@ -69,7 +69,7 @@ The software support matrices for ROCm container releases is listed.
#### `r2.11-rocm-enhanced +5be4141`
* [ROCm5.6](https://repo.radeon.com/rocm/apt/latest/)
* [ROCm5.6](https://repo.radeon.com/rocm/apt/5.6/)
* [Python 3.9](https://www.python.org/downloads/release/python-390/)
* [`tensorflow-rocm` 2.11.0](https://pypi.org/project/tensorflow-rocm/2.11.0.540/)
* [OFED 5.3](https://content.mellanox.com/ofed/MLNX_OFED-5.3-1.0.5.0/MLNX_OFED_LINUX-5.3-1.0.5.0-ubuntu20.04-x86_64.tgz)
@@ -79,7 +79,7 @@ The software support matrices for ROCm container releases is listed.
#### `r2.10-rocm-enhanced +72789a3`
* [ROCm5.6](https://repo.radeon.com/rocm/apt/latest/)
* [ROCm5.6](https://repo.radeon.com/rocm/apt/5.6/)
* [Python 3.9](https://www.python.org/downloads/release/python-390/)
* [`tensorflow-rocm` 2.10.1](https://pypi.org/project/tensorflow-rocm/2.10.1.540/)
* [OFED 5.3](https://content.mellanox.com/ofed/MLNX_OFED-5.3-1.0.5.0/MLNX_OFED_LINUX-5.3-1.0.5.0-ubuntu20.04-x86_64.tgz)


@@ -1,18 +1,52 @@
# GPU and OS Support (Linux)
# GPU Support and OS Compatibility (Linux)
(supported_distributions)=
## Supported Distributions
## Supported Linux Distributions
AMD ROCm™ Platform supports the following Linux distributions.
| Distribution |Processor Architectures| Validated Kernel |
|--------------------|-----------------------|--------------------|
| RHEL 9.1 | x86-64 | 5.14 |
| RHEL 8.6 to 8.7 | x86-64 | 4.18 |
| SLES 15 SP4 | x86-64 | |
| Ubuntu 20.04.5 LTS | x86-64 | 5.15 |
| Ubuntu 22.04.1 LTS | x86-64 | 5.15, OEM 5.17 |
::::{tab-set}
:::{tab-item} Supported
| Distribution | Processor Architectures | Validated Kernel | Support |
| :----------- | :---------------------: | :--------------: | ------: |
| RHEL 9.2 | x86-64 | 5.14 (5.14.0-284.11.1.el9_2.x86_64) | ✅ |
| RHEL 9.1 | x86-64 | 5.14.0-284.11.1.el9_2.x86_64 | ✅ |
| RHEL 8.8 | x86-64 | 4.18.0-477.el8.x86_64 | ✅ |
| RHEL 8.7 | x86-64 | 4.18.0-425.10.1.el8_7.x86_64 | ✅ |
| SLES 15 SP5 | x86-64 | 5.14.21-150500.53-default | ✅ |
| SLES 15 SP4 | x86-64 | 5.14.21-150400.24.63-default | ✅ |
| Ubuntu 22.04.2 | x86-64 | 5.19.0-45-generic | ✅ |
| Ubuntu 20.04.5 | x86-64 | 5.15.0-75-generic | ✅ |
:::{versionadded} 5.6
- RHEL 8.8 and 9.2 support is added.
- SLES 15 SP5 support is added.
:::
:::{tab-item} Unsupported
| Distribution | Processor Architectures | Validated Kernel | Support |
| :----------- | :---------------------: | :--------------: | ------: |
| RHEL 9.0 | x86-64 | 5.14 | ❌ |
| RHEL 8.6 | x86-64 | 5.14 | ❌ |
| SLES 15 SP3 | x86-64 | 5.3 | ❌ |
| Ubuntu 22.04.0 | x86-64 | 5.15 LTS, 5.17 OEM | ❌ |
| Ubuntu 20.04.4 | x86-64 | 5.13 HWE, 5.13 OEM | ❌ |
| Ubuntu 22.04.1 | x86-64 | 5.15 LTS | ❌ |
:::
::::
- ✅: **Supported** - AMD performs full testing of all ROCm components on distro
GA image.
- ❌: **Unsupported** - AMD no longer performs builds and testing on these
previously supported distro GA images.
## Virtualization Support
@@ -26,7 +60,11 @@ ROCm supports virtualization for select GPUs only as shown below.
(supported_gpus)=
## GPU Support Table
## Linux Supported GPUs
The table below lists the supported Instinct™, Radeon Pro™ and Radeon™ GPUs.
Select a tab to switch between GPU product lines. If a GPU is not listed in
this table, it is not officially supported by AMD.
::::{tab-set}
@@ -59,6 +97,17 @@ Use Driver Shipped with ROCm
:::
:::{tab-item} Radeon™
:sync: radeonpro
[Use Radeon Pro Driver](https://www.amd.com/en/support/linux-drivers)
| Name | Architecture |[LLVM Target](https://www.llvm.org/docs/AMDGPUUsage.html#processors) | Support|
|:----:|:------------:|:--------------------------------------------------------------------:|:-------:|
| AMD Radeon™ VII | GCN5.1 | gfx906 | ✅ |
:::
::::
### Support Status


@@ -5,7 +5,6 @@ The following table is a list of ROCm components with links to their respective
terms. These components may include third party components subject to
additional licenses. Please review individual repositories for more information.
The table shows ROCm components, the name of license and link to the license terms.
The table is ordered to follow ROCm's manifest file.
<!-- spellcheck-disable -->
| Component | License |


@@ -0,0 +1,83 @@
# GPU and OS Support (Windows)
(supported_skus)=
## Supported SKUs
AMD ROCm™ Platform supports the following Windows SKUs.
| Distribution |Processor Architectures| Validated update |
|---------------------|-----------------------|--------------------|
| Windows 10 | x86-64 | 22H2 (GA) |
| Windows 11 | x86-64 | 22H2 (GA) |
| Windows Server 2022 | x86-64 | |
## Windows Supported GPUs
The table below lists the supported Radeon Pro™ and Radeon™ GPUs. Select a tab
to switch between GPU product lines. If a GPU is not listed in this table, it
is not officially supported by AMD.
::::{tab-set}
:::{tab-item} Radeon Pro™
:sync: radeonpro
| Name | Architecture |[LLVM Target](https://www.llvm.org/docs/AMDGPUUsage.html#processors) | Runtime | HIP SDK |
|:----:|:------------:|:--------------------------------------------------------------------:|:-------:|:----------------:|
| AMD Radeon Pro™ W7900 | RDNA3 | gfx1100 | ✅ | ✅ |
| AMD Radeon Pro™ W7800 | RDNA3 | gfx1100 | ✅ | ✅ |
| AMD Radeon Pro™ W6800 | RDNA2 | gfx1030 | ✅ | ✅ |
| AMD Radeon Pro™ W6600 | RDNA2 | gfx1032 | ✅ | ❌ |
| AMD Radeon Pro™ W5500 | RDNA1 | gfx1012 | ❌ | ❌ |
| AMD Radeon Pro™ VII | GCN5.1 | gfx906 | ❌ | ❌ |
:::
:::{tab-item} Radeon™
:sync: radeon
| Name | Architecture | [LLVM Target](https://www.llvm.org/docs/AMDGPUUsage.html#processors) | Runtime | HIP SDK |
|:----:|:------------:|:--------------------------------------------------------------------:|:-------:|:----------------:|
| AMD Radeon™ RX 7900 XTX | RDNA3 | gfx1100 | ✅ | ✅ |
| AMD Radeon™ RX 7900 XT | RDNA3 | gfx1100 | ✅ | ✅ |
| AMD Radeon™ RX 7600 | RDNA3 | gfx1100 | ✅ | ✅ |
| AMD Radeon™ RX 6950 XT | RDNA2 | gfx1030 | ✅ | ✅ |
| AMD Radeon™ RX 6900 XT | RDNA2 | gfx1030 | ✅ | ✅ |
| AMD Radeon™ RX 6800 XT | RDNA2 | gfx1030 | ✅ | ✅ |
| AMD Radeon™ RX 6800 | RDNA2 | gfx1030 | ✅ | ✅ |
| AMD Radeon™ RX 6750 | RDNA2 | gfx1032 | ✅ | ❌ |
| AMD Radeon™ RX 6700 XT | RDNA2 | gfx1032 | ✅ | ❌ |
| AMD Radeon™ RX 6700 | RDNA2 | gfx1032 | ✅ | ❌ |
| AMD Radeon™ RX 6650 XT | RDNA2 | gfx1032 | ✅ | ❌ |
| AMD Radeon™ RX 6600 XT | RDNA2 | gfx1032 | ✅ | ❌ |
| AMD Radeon™ RX 6600 | RDNA2 | gfx1032 | ✅ | ❌ |
:::
::::
### Component Support
ROCm components are described in the [reference](../reference/all) page. Support
on Windows is provided with two levels of enablement.
- **Runtime**: Runtime enables the use of the HIP/OpenCL runtimes only.
- **HIP SDK**: Runtime plus additional components, namely the libraries found under
[Math Libraries](../reference/gpu_libraries/math.md) and
[C++ Primitive Libraries](../reference/gpu_libraries/c%2B%2B_primitives.md).
Some [Math Libraries](../reference/gpu_libraries/math.md) are Linux exclusive;
please check the library details.
### Support Status
- ✅: **Supported** - AMD enables these GPUs in our software distributions for
the corresponding ROCm product.
- ⚠️: **Deprecated** - Support will be removed in a future release.
- ❌: **Unsupported** - This configuration is not enabled in our software
distributions.
## CPU Support
ROCm requires CPUs that support PCIe™ Atomics. CPUs released since the
1st generation AMD Zen and Intel™ Haswell architectures support PCIe Atomics.


@@ -18,7 +18,97 @@ integrated into ML frameworks such as PyTorch and TensorFlow. ROCm can be
deployed in many ways, including through the use of containers such as Docker,
Spack, and your own build from source.
ROCms goal is to allow our users to maximize their GPU hardware investment.
The goal of ROCm is to allow our users to maximize their GPU hardware investment.
ROCm is designed to help develop, test and deploy GPU accelerated HPC, AI,
scientific computing, CAD, and other applications in a free, open-source,
integrated and secure software ecosystem.
## ROCm on Windows
Starting with ROCm 5.5, the HIP SDK brings a subset of ROCm to developers on Windows.
The collection of features enabled on Windows is referred to as the HIP SDK.
These features allow developers to use the HIP runtime, HIP math libraries
and HIP Primitive libraries. The following table shows the differences
between Windows and Linux releases.
|Component|Linux|Windows|
|---------|-----|-------|
|Driver|Radeon Software for Linux |AMD Software Pro Edition|
|Compiler|`hipcc`/`amdclang++`|`hipcc`/`clang++`|
|Debugger|`rocgdb`|no debugger available|
|Profiler|`rocprof`|[Radeon GPU Profiler](https://gpuopen.com/rgp/)|
|Porting Tools|HIPIFY|Coming Soon|
|Runtime|HIP (Open Sourced)|HIP (closed source)|
|Math Libraries|Supported|Supported|
|Primitives Libraries|Supported|Supported|
|Communication Libraries|Supported|Not Available|
|AI Libraries|MIOpen, MIGraphX|Not Available|
|System Management|`rocm-smi-lib`, RDC, `rocminfo`|`amdsmi`, `hipInfo`|
|AI Frameworks|PyTorch, TensorFlow, etc.|Not Available|
|CMake HIP Language|Enabled|Unsupported|
|Visual Studio| Not applicable| Plugin Available|
|HIP Ray Tracing| Supported|Supported|
AMD continues to invest in Windows support and plans to release enhanced
features in subsequent revisions.
```{note}
The 5.5 Windows Installer collectively groups the Math and Primitives
libraries.
```
```{note}
GPU support on Windows and Linux may differ. You must refer to
Windows and Linux GPU support tables separately.
```
```{note}
HIP Ray Tracing is not distributed via ROCm in Linux.
```
### ROCm release versioning
Linux OS releases set the canonical version numbers for ROCm. Windows will
follow Linux version numbers as Windows releases are based on Linux ROCm
releases. However, not all Linux ROCm releases will have a corresponding Windows
release. The following table shows the ROCm releases on Windows and Linux. Releases
with both Windows and Linux are referred to as a joint release. Releases with
only Linux support are referred to as a skipped release from the Windows
perspective.
|Release version|Linux|Windows|
|---------------|-----|-------|
|5.5|✅|✅|
|5.6|✅|❌|
ROCm Linux releases are versioned following the Major.Minor.Patch
version number system. Windows releases will only be versioned with Major.Minor.
In general, Windows releases will trail Linux releases. Software developers that
wish to support both Linux and Windows using a single ROCm version should
refrain from upgrading ROCm unless there is a joint release.
### Windows Documentation implications
The ROCm documentation website contains both Windows and Linux documentation.
Just below each article title, a convenient article information section states
whether the page applies to Linux only, Windows only or both OSes. To find the
exact Windows documentation for a release of the HIP SDK, please view the ROCm documentation with the same
Major.Minor version number while ignoring the Patch version. The Patch version
only matters for Linux releases. For convenience,
Windows documentation will continue to be included in the overall ROCm
documentation for the skipped Windows releases.
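The matching rule described above (compare Major.Minor and ignore Patch) can be sketched as a small helper. The type and function names here (`RocmVersion`, `parse_version`, `same_docs_release`) are hypothetical and exist only for illustration; they are not part of any ROCm API.

```cpp
#include <cstdio>
#include <string>

// Holds the three numeric fields of a version string.
struct RocmVersion {
    int major_num = 0;
    int minor_num = 0;
    int patch_num = 0;
};

// Parses "Major.Minor" or "Major.Minor.Patch"; a missing Patch stays 0.
// Returns false when fewer than two numeric fields are present.
bool parse_version(const std::string& text, RocmVersion& out) {
    RocmVersion v;
    const int fields = std::sscanf(text.c_str(), "%d.%d.%d",
                                   &v.major_num, &v.minor_num, &v.patch_num);
    if (fields < 2) {
        return false;
    }
    out = v;
    return true;
}

// True when both versions share Major.Minor; the Patch field is ignored.
bool same_docs_release(const std::string& linux_version,
                       const std::string& windows_version) {
    RocmVersion a;
    RocmVersion b;
    return parse_version(linux_version, a) && parse_version(windows_version, b) &&
           a.major_num == b.major_num && a.minor_num == b.minor_num;
}
```

For example, `same_docs_release("5.5.1", "5.5")` is true because only the Patch field differs, while `same_docs_release("5.6.0", "5.5")` is false.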
Windows release notes will contain only information pertinent to Windows.
The software developer must read all the previous ROCm release notes (including
skipped ROCm versions on Windows) for information on all the changes present in
the Windows release.
### Windows Builds from Source
Not all source code required to build Windows from source is available under a
permissive open source license. Build instructions on Windows are only provided
for projects that can be built from source on Windows using a toolchain that
has closed source build prerequisites. The ROCm manifest file is not valid for
Windows. AMD does not release a manifest or tag our components in Windows.
Users may use corresponding Linux tags to build on Windows.


@@ -38,6 +38,32 @@ subtrees:
title: Upgrade
- file: deploy/linux/installer/uninstall
title: Uninstallation
- file: deploy/windows/quick_start
title: Windows Quick Start
- file: deploy/windows/index
title: Windows Overview
subtrees:
- entries:
- file: deploy/windows/prerequisites
title: Prerequisites
- file: deploy/windows/gui/index
subtrees:
- entries:
- file: deploy/windows/gui/install
title: Installation
- file: deploy/windows/gui/upgrade
title: Upgrade
- file: deploy/windows/gui/uninstall
title: Uninstallation
- file: deploy/windows/cli/index
subtrees:
- entries:
- file: deploy/windows/cli/install
title: Installation
- file: deploy/windows/cli/upgrade
title: Upgrade
- file: deploy/windows/cli/uninstall
title: Uninstallation
- file: deploy/docker
title: Docker
@@ -47,6 +73,7 @@ subtrees:
- file: CHANGELOG
title: Changelog
- file: release/gpu_os_support
- file: release/windows_support
- url: https://github.com/RadeonOpenCompute/ROCm/labels/Verified%20Issue
title: Known Issues
- file: release/compatibility
@@ -61,20 +88,6 @@ subtrees:
- caption: APIs and Reference
entries:
- file: reference/all
- file: reference/compilers
title: Compilers and Tools
subtrees:
- entries:
- file: reference/rocmcc/rocmcc
title: ROCmCC
- url: ${project:rocgdb}
title: ROCgdb
- url: ${project:rocprofiler}
title: rocprofiler
- url: ${project:roctracer}
title: roctracer
- url: ${project:rocdbgapi}
title: ROCdbgapi
- file: reference/hip
subtrees:
- entries:
@@ -82,8 +95,6 @@ subtrees:
url: ${project:hip}
- title: HIPify - Port Your Code
url: ${project:hipify}
- file: reference/openmp/openmp
title: OpenMP
- file: reference/gpu_libraries/math
title: Math Libraries
subtrees:
@@ -148,9 +159,9 @@ subtrees:
- title: MIOpen - Machine Intelligence
url: ${project:miopen}
- title: Composable Kernel
url: ${project:composable-kernel}
url: ${project:composable_kernel}
- title: MIGraphX - Graph Optimization
url: ${project:migraphx}
url: ${project:amdmigraphx}
- file: reference/computer_vision
subtrees:
- entries:
@@ -159,13 +170,29 @@ subtrees:
- entries:
- url: ${project:rocal}
title: rocAL
- file: reference/openmp/openmp
title: OpenMP
- file: reference/compilers
title: Compilers and Tools
subtrees:
- entries:
- file: reference/rocmcc/rocmcc
title: ROCmCC
- url: ${project:rocgdb}
title: ROCgdb
- url: ${project:rocprofiler}
title: rocprofiler
- url: ${project:roctracer}
title: roctracer
- url: ${project:rocdbgapi}
title: ROCdbgapi
- file: reference/management_tools
title: Management Tools
subtrees:
- entries:
- url: https://rocm.docs.amd.com/projects/amdsmi/en/{branch}/
title: AMD SMI
- url: https://rocm.docs.amd.com/projects/rocmsmi/en/{branch}/
- url: https://rocm.docs.amd.com/projects/rocm_smi_lib/en/{branch}/
title: ROCm SMI
- url: ${project:rdc}
title: ROCm Datacenter Tool
@@ -173,7 +200,7 @@ subtrees:
title: Validation Tools
subtrees:
- entries:
- url: ${project:rocm-validation-suite}
- url: ${project:rocmvalidationsuite}
title: RVS
- url: ${project:transferbench}
title: TransferBench


@@ -1 +1,2 @@
rocm-docs-core==0.13.4
rocm-docs-core==1.8.0
sphinx-reredirects


@@ -1,110 +1,106 @@
#
# This file is autogenerated by pip-compile with Python 3.8
# This file is autogenerated by pip-compile with Python 3.10
# by the following command:
#
# pip-compile sphinx/requirements.in
# pip-compile requirements.in
#
accessible-pygments==0.0.3
accessible-pygments==0.0.5
# via pydata-sphinx-theme
alabaster==0.7.13
alabaster==1.0.0
# via sphinx
babel==2.11.0
babel==2.16.0
# via
# pydata-sphinx-theme
# sphinx
beautifulsoup4==4.11.2
beautifulsoup4==4.12.3
# via pydata-sphinx-theme
breathe==4.34.0
breathe==4.35.0
# via rocm-docs-core
certifi==2022.12.7
certifi==2024.8.30
# via requests
cffi==1.15.1
cffi==1.17.1
# via
# cryptography
# pynacl
charset-normalizer==2.1.1
charset-normalizer==3.3.2
# via requests
click==8.1.3
click==8.1.7
# via sphinx-external-toc
cryptography==40.0.2
cryptography==43.0.1
# via pyjwt
deprecated==1.2.13
deprecated==1.2.14
# via pygithub
docutils==0.19
docutils==0.21.2
# via
# breathe
# myst-parser
# pydata-sphinx-theme
# sphinx
fastjsonschema==2.16.3
fastjsonschema==2.20.0
# via rocm-docs-core
gitdb==4.0.10
gitdb==4.0.11
# via gitpython
gitpython==3.1.30
gitpython==3.1.43
# via rocm-docs-core
idna==3.4
idna==3.10
# via requests
imagesize==1.4.1
# via sphinx
jinja2==3.1.2
jinja2==3.1.4
# via
# myst-parser
# sphinx
linkify-it-py==1.0.3
# via myst-parser
markdown-it-py==2.2.0
markdown-it-py==3.0.0
# via
# mdit-py-plugins
# myst-parser
markupsafe==2.1.2
markupsafe==2.1.5
# via jinja2
mdit-py-plugins==0.3.4
mdit-py-plugins==0.4.2
# via myst-parser
mdurl==0.1.2
# via markdown-it-py
myst-parser[linkify]==1.0.0
myst-parser==4.0.0
# via rocm-docs-core
packaging==23.0
packaging==24.1
# via
# pydata-sphinx-theme
# sphinx
pycparser==2.21
pycparser==2.22
# via cffi
pydata-sphinx-theme==0.13.3
pydata-sphinx-theme==0.15.4
# via
# rocm-docs-core
# sphinx-book-theme
pygithub==1.58.1
pygithub==2.4.0
# via rocm-docs-core
pygments==2.14.0
pygments==2.18.0
# via
# accessible-pygments
# pydata-sphinx-theme
# sphinx
pyjwt[crypto]==2.6.0
pyjwt[crypto]==2.9.0
# via pygithub
pynacl==1.5.0
# via pygithub
pytz==2022.7.1
# via babel
pyyaml==6.0
pyyaml==6.0.2
# via
# myst-parser
# rocm-docs-core
# sphinx-external-toc
requests==2.28.1
requests==2.32.3
# via
# pygithub
# sphinx
rocm-docs-core==0.13.4
rocm-docs-core==1.8.0
# via -r requirements.in
smmap==5.0.0
smmap==5.0.1
# via gitdb
snowballstemmer==2.2.0
# via sphinx
soupsieve==2.4
soupsieve==2.6
# via beautifulsoup4
sphinx==5.3.0
sphinx==8.0.2
# via
# breathe
# myst-parser
@@ -115,33 +111,40 @@ sphinx==5.3.0
# sphinx-design
# sphinx-external-toc
# sphinx-notfound-page
sphinx-book-theme==1.0.1
# sphinx-reredirects
sphinx-book-theme==1.1.3
# via rocm-docs-core
sphinx-copybutton==0.5.1
sphinx-copybutton==0.5.2
# via rocm-docs-core
sphinx-design==0.4.1
sphinx-design==0.6.1
# via rocm-docs-core
sphinx-external-toc==0.3.1
sphinx-external-toc==1.0.1
# via rocm-docs-core
sphinx-notfound-page==0.8.3
sphinx-notfound-page==1.0.4
# via rocm-docs-core
sphinxcontrib-applehelp==1.0.4
sphinx-reredirects==0.1.5
# via -r requirements.in
sphinxcontrib-applehelp==2.0.0
# via sphinx
sphinxcontrib-devhelp==1.0.2
sphinxcontrib-devhelp==2.0.0
# via sphinx
sphinxcontrib-htmlhelp==2.0.1
sphinxcontrib-htmlhelp==2.1.0
# via sphinx
sphinxcontrib-jsmath==1.0.1
# via sphinx
sphinxcontrib-qthelp==1.0.3
sphinxcontrib-qthelp==2.0.0
# via sphinx
sphinxcontrib-serializinghtml==1.1.5
sphinxcontrib-serializinghtml==2.0.0
# via sphinx
typing-extensions==4.5.0
# via pydata-sphinx-theme
uc-micro-py==1.0.1
# via linkify-it-py
urllib3==1.26.13
# via requests
wrapt==1.14.1
tomli==2.0.1
# via sphinx
typing-extensions==4.12.2
# via
# pydata-sphinx-theme
# pygithub
urllib3==2.2.3
# via
# pygithub
# requests
wrapt==1.16.0
# via deprecated


@@ -13,7 +13,7 @@ The full list of HSA system architecture platform requirements are here: `HSA Sy
The ROCm Platform uses the new PCI Express 3.0 (PCIe 3.0) features for Atomic Read-Modify-Write Transactions, which extend inter-processor synchronization mechanisms to I/O to support the defined set of HSA capabilities needed for queuing and signaling memory operations.
The new PCIe AtomicOps operate as completers for ``CAS`` (Compare and Swap), ``FetchADD``, ``SWAP`` atomics. The AtomicsOps are initiated by the
The new PCIe atomic operations operate as completers for ``CAS`` (Compare and Swap), ``FetchADD``, ``SWAP`` atomics. The atomic operations are initiated by the
I/O device, which supports 32-bit, 64-bit and 128-bit operands; the target address has to be naturally aligned to the operand size.
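The semantics of the three operations can be modeled on the host with ``std::atomic``. This is only an analogy under stated assumptions: the helper names (``model_fetch_add``, ``model_swap``, ``model_compare_and_swap``, ``is_naturally_aligned``) are hypothetical, the code touches ordinary host memory rather than issuing PCIe transactions, and it covers 64-bit operands only.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

// True when addr is a multiple of the operand size (4, 8, or 16 bytes).
bool is_naturally_aligned(std::uintptr_t addr, std::size_t operand_bytes) {
    assert(operand_bytes == 4 || operand_bytes == 8 || operand_bytes == 16);
    return addr % operand_bytes == 0;
}

// FetchADD: adds the operand to the target and returns the original value.
std::uint64_t model_fetch_add(std::atomic<std::uint64_t>& target, std::uint64_t add) {
    return target.fetch_add(add);
}

// SWAP: writes the operand unconditionally and returns the original value.
std::uint64_t model_swap(std::atomic<std::uint64_t>& target, std::uint64_t value) {
    return target.exchange(value);
}

// CAS: writes swap_value only if the target equals compare.
// Returns the value observed in the target before the operation.
std::uint64_t model_compare_and_swap(std::atomic<std::uint64_t>& target,
                                     std::uint64_t compare,
                                     std::uint64_t swap_value) {
    std::uint64_t observed = compare;
    // On failure, compare_exchange_strong stores the current value in observed.
    target.compare_exchange_strong(observed, swap_value);
    return observed;
}
```

In every case the requester gets back the original contents of the target location, which mirrors the completion that carries the result of a PCIe atomic operation.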
In ROCm, platform atomics are used in the following ways:
@@ -22,11 +22,11 @@ For ROCm the Platform atomics are used in ROCm in the following ways:
* Update HSA queues write_dispatch_id: 64 bit atomic add used by the CPU and GPU agent to support multi-writer queue insertions.
* Update HSA signals: 64-bit atomic operations are used for CPU and GPU synchronization.
The PCIe 3.0 AtomicOp feature allows atomic transactions to be requested by, routed through and completed by PCIe components. Routing and completion does not require software support. Component support for each is detectable via the DEVCAP2 register. Upstream bridges need to have AtomicOp routing enabled or the Atomic Operations will fail even though PCIe endpoint and PCIe I/O Devices has the capability to Atomics Operations.
The PCIe 3.0 atomic operations feature allows atomic transactions to be requested by, routed through and completed by PCIe components. Routing and completion do not require software support. Component support for each is detectable via the DEVCAP2 register. Upstream bridges need to have atomic operations routing enabled, or the atomic operations will fail even though the PCIe endpoint and PCIe I/O devices are capable of atomic operations.
To do AtomicOp routing capability between two or more Root Ports, each associated Root Port must indicate that capability via the AtomicOp Routing Supported bit in the Device Capabilities 2 register.
To support atomic operations routing between two or more Root Ports, each associated Root Port must indicate that capability via the atomic operations routing supported bit in the Device Capabilities 2 register.
If your system has a PCIe Express Switch it needs to support AtomicsOp routing. Again AtomicOp requests are permitted only if a components ``DEVCTL2.ATOMICOP_REQUESTER_ENABLE`` field is set. These requests can only be serviced if the upstream components support AtomicOp completion and/or routing to a component which does. AtomicOp Routing Support=1 Routing is supported, AtomicOp Routing Support=0 routing is not supported.
If your system has a PCI Express switch, it needs to support atomic operations routing. Atomic operations requests are permitted only if a component's ``DEVCTL2.ATOMICOP_REQUESTER_ENABLE`` field is set. These requests can only be serviced if the upstream components support atomic operations completion and/or routing to a component which does. Atomic operations routing support=1, routing is supported; Atomic operations routing support=0, routing is not supported.
An atomic operation is a non-posted transaction supporting 32-bit and 64-bit address formats; there must be a completion response containing the result of the operation. Errors associated with the operation (an uncorrectable error accessing the target location or carrying out the atomic operation) are signaled to the requester by setting the Completion Status field in the completion descriptor to Completer Abort (CA) or Unsupported Request (UR).
@@ -39,14 +39,14 @@ There are also a number of papers which talk about these new capabilities:
* `Atomic Read Modify Write Primitives by Intel <https://www.intel.es/content/dam/doc/white-paper/atomic-read-modify-write-primitives-i-o-devices-paper.pdf>`_
* `PCI express 3 Accelerator Whitepaper by Intel <https://www.intel.sg/content/dam/doc/white-paper/pci-express3-accelerator-white-paper.pdf>`_
* `Intel PCIe Generation 3 Hotchips Paper <https://www.hotchips.org/wp-content/uploads/hc_archives/hc21/1_sun/HC21.23.1.SystemInterconnectTutorial-Epub/HC21.23.131.Ajanovic-Intel-PCIeGen3.pdf>`_
* `PCIe Generation 4 Base Specification includes Atomics Operation <http://composter.com.ua/documents/PCI_Express_Base_Specification_Revision_4.0.Ver.0.3.pdf>`_
* `PCIe Generation 4 Base Specification includes Atomics Operation <https://astralvx.com/storage/2020/11/PCI_Express_Base_4.0_Rev0.3_February19-2014.pdf>`_
Other I/O devices with PCIe Atomics support
* `Mellanox ConnectX-5 InfiniBand Card <http://www.mellanox.com/related-docs/prod_adapter_cards/PB_ConnectX-5_VPI_Card.pdf>`_
* `Cray Aries Interconnect <http://www.hoti.org/hoti20/slides/Bob_Alverson.pdf>`_
* `Xilinx PCIe Ultrascale Whitepaper <https://www.xilinx.com/support/documentation/white_papers/wp464-PCIe-ultrascale.pdf>`_
* `Xilinx 7 Series Devices <https://www.xilinx.com/support/documentation/ip_documentation/pcie_7x/v3_1/pg054-7series-pcie.pdf>`_
* `Xilinx PCIe Ultrascale Whitepaper <https://docs.xilinx.com/v/u/8OZSA2V1b1LLU2rRCDVGQw>`_
* `Xilinx 7 Series Devices <https://docs.xilinx.com/v/u/1nfXeFNnGpA0ywyykvWHWQ>`_
Future bus technology with richer I/O Atomics Operation Support
@@ -54,8 +54,8 @@ Future bus technology with richer I/O Atomics Operation Support
New PCIe Endpoints with support beyond AMD Ryzen and EPYC CPU; Intel Haswell or newer CPUs with PCIe Generation 3.0 support.
* `Mellanox Bluefield SOC <http://www.mellanox.com/related-docs/npu-multicore-processors/PB_Bluefield_SoC.pdf>`_
* `Cavium Thunder X2 <http://www.cavium.com/ThunderX2_ARM_Processors.html>`_
* `Mellanox Bluefield SOC <https://docs.nvidia.com/networking/display/BlueFieldSWv25111213/BlueField+Software+Overview>`_
* `Cavium Thunder X2 <https://en.wikichip.org/wiki/cavium/thunderx2>`_
In ROCm, we also take advantage of PCIe ID based ordering technology for P2P when the GPU originates two writes to two different targets:
@@ -71,7 +71,7 @@ BAR Memory Overview
*******************
On a Xeon E5 based system, above-4GB PCIe addressing can be turned on in the BIOS. If it is, you need to set the MMIO base address (MMIOH Base) and range (MMIO High Size) in the BIOS.
In SuperMicro system in the system bios you need to see the following
On a Supermicro system, you need to see the following in the system BIOS:
* Advanced->PCIe/PCI/PnP configuration-> Above 4G Decoding = Enabled
@@ -79,7 +79,7 @@ In SuperMicro system in the system bios you need to see the following
* Advanced->PCIe/PCI/PnP Configuration->MMIO High Size = 256G
When we support Large Bar Capability there is a Large Bar Vbios which also disable the IO bar.
When we support Large BAR capability, there is a Large BAR VBIOS which also disables the I/O BAR.
For GFX9 and Vega10 which have Physical Address up 44 bit and 48 bit Virtual address.
@@ -118,32 +118,5 @@ Legend:
5 : Expansion ROM. This is required for the AMD driver software to access the GPU's video BIOS. This is currently fixed at 128KB.
Excepts form Overview of Changes to PCI Express 3.0
===================================================
By Mike Jackson, Senior Staff Architect, MindShare, Inc.
********************************************************
Atomic Operations Goal:
*************************
Support SMP-type operations across a PCIe network to allow for things like offloading tasks between CPU cores and accelerators like a GPU. The spec says this enables advanced synchronization mechanisms that are particularly useful with multiple producers or consumers that need to be synchronized in a non-blocking fashion. Three new atomic non-posted requests were added, plus the corresponding completion (the address must be naturally aligned with the operand size or the TLP is malformed):
* Fetch and Add uses one operand as the “add” value. Reads the target location, adds the operand, and then writes the result back to the original location.
* Unconditional Swap uses one operand as the “swap” value. Reads the target location and then writes the swap value to it.
* Compare and Swap uses 2 operands: first data is compare value, second is swap value. Reads the target location, checks it against the compare value and, if equal, writes the swap value to the target location.
* AtomicOpCompletion new completion to give the result so far atomic request and indicate that the atomicity of the transaction has been maintained.
Since AtomicOps are not locked they don't have the performance downsides of the PCI locked protocol. Compared to locked cycles, they provide “lower latency, higher scalability, advanced synchronization algorithms, and dramatically lower impact on other PCIe traffic.” The lock mechanism can still be used across a bridge to PCI or PCI-X to achieve the desired operation.
AtomicOps can go from device to device, device to host, or host to device. Each completer indicates whether it supports this capability and guarantees atomic access if it does. The ability to route AtomicOps is also indicated in the registers for a given port.
ID-based Ordering Goal:
*************************
Improve performance by avoiding stalls caused by ordering rules. For example, posted writes are never normally allowed to pass each other in a queue, but if they are requested by different functions, we can have some confidence that the requests are not dependent on each other. The previously reserved Attribute bit [2] is now combined with the RO bit to indicate ID ordering with or without relaxed ordering.
This only has meaning for memory requests, and is reserved for Configuration or IO requests. Completers are not required to copy this bit into a completion, and only use the bit if their enable bit is set for this operation.
To read more on PCIe Gen 3 new options https://www.mindshare.com/files/resources/PCIe%203-0.pdf
For more information, you can review
`Overview of Changes to PCI Express 3.0 <https://www.mindshare.com/files/resources/PCIe%203-0.pdf>`_.

View File

@@ -4,7 +4,7 @@ Using CMake
Most components in ROCm support CMake. Projects depending on header-only or
library components typically require CMake 3.5 or higher whereas those wanting
to make use of CMake's HIP language support will require CMake 3.21 or higher.
to make use of the CMake HIP language support will require CMake 3.21 or higher.
Finding Dependencies
====================
@@ -16,7 +16,7 @@ Finding Dependencies
<https://cmake.org/cmake/help/latest/command/find_package.html>`_ and the
`Using Dependencies Guide
<https://cmake.org/cmake/help/latest/guide/using-dependencies/index.html>`_
to get an overview of CMake's related facilities.
to get an overview of CMake related facilities.
In short, CMake supports finding dependencies in two ways:
@@ -28,7 +28,7 @@ In short, CMake supports finding dependencies in two ways:
regards needed to consume it.
ROCm predominantly relies on Config mode, one notable exception being the Module
driving the compilation of HIP programs on Nvidia runtimes. As such, when
driving the compilation of HIP programs on NVIDIA runtimes. As such, when
dependencies are not found in standard system locations, one either has to
instruct CMake to search for package config files in additional folders using
the ``CMAKE_PREFIX_PATH`` variable (a semi-colon separated list of filesystem
@@ -55,8 +55,8 @@ to the installation guides in these docs (`Linux <../deploy/linux/index.html>`_)
Using HIP in CMake
==================
ROCm componenents providing a C/C++ interface support being consumed using any
C/C++ toolchain that CMake knows how to drive. ROCm also supports CMake's HIP
ROCm components providing a C/C++ interface support being consumed using any
C/C++ toolchain that CMake knows how to drive. ROCm also supports the CMake HIP
language features, allowing users to program using the HIP single-source
programming model. When a program (or translation-unit) uses the HIP API without
compiling any GPU device code, HIP can be treated in CMake as a simple C/C++
@@ -172,7 +172,7 @@ all the flags necessary for device compilation.
.. note::
Compiling for the GPU device requires at least C++11.
This project can then be configured with for eg.
This project can then be configured with the following CMake commands:
- Windows: ``cmake -D CMAKE_CXX_COMPILER:PATH=${env:HIP_PATH}\bin\clang++.exe``
@@ -186,7 +186,7 @@ When using the CXX language support to compile HIP device code, selecting the
target GPU architectures is done via setting the ``GPU_TARGETS`` variable.
``CMAKE_HIP_ARCHITECTURES`` only exists when the HIP language is enabled. By
default, this is set to some subset of the currently supported architectures of
AMD ROCm. It can be set to eg. ``-D GPU_TARGETS="gfx1032;gfx1035"``.
AMD ROCm. It can be set with the CMake option ``-D GPU_TARGETS="gfx1032;gfx1035"``.
ROCm CMake Packages
-------------------
@@ -251,9 +251,9 @@ options.
IDEs supporting CMake (Visual Studio, Visual Studio Code, CLion, etc.) all came
up with their own way to register command-line fragments of different purpose in
a setup'n'forget fashion for quick assembly using graphical front-ends. This is
a setup-and-forget fashion for quick assembly using graphical front-ends. This is
all nice, but configurations aren't portable, nor can they be reused in
Continuous Intergration (CI) pipelines. CMake has condensed existing practice
Continuous Integration (CI) pipelines. CMake has condensed existing practice
into a portable JSON format that works in all IDEs and can be invoked from any
command-line. This is
`CMake Presets <https://cmake.org/cmake/help/latest/manual/cmake-presets.7.html>`_

View File

@@ -10,6 +10,6 @@ disambiguates compiler naming used throughout the documentation.
| `amdclang++` | Clang/LLVM-based compiler that is part of `rocm-llvm` package. The source code is available at <a href="https://github.com/RadeonOpenCompute/llvm-project" target="_blank">https://github.com/RadeonOpenCompute/llvm-project</a>. |
| AOCC | Closed-source clang-based compiler that includes additional CPU optimizations. Offered as part of ROCm via the `rocm-llvm-alt` package. See for details, <a href="https://developer.amd.com/amd-aocc/" target="_blank">https://developer.amd.com/amd-aocc/</a>. |
| HIP-Clang | Informal term for the `amdclang++` compiler |
| HIPify | Tools including `hipify-clang` and `hipify-perl`, used to automatically translate CUDA source code into portable HIP C++. The source code is available at <a href="https://github.com/ROCm-Developer-Tools/HIPIFY" target="_blank">https://github.com/ROCm-Developer-Tools/HIPIFY</a> |
| HIPIFY | Tools including `hipify-clang` and `hipify-perl`, used to automatically translate CUDA source code into portable HIP C++. The source code is available at <a href="https://github.com/ROCm-Developer-Tools/HIPIFY" target="_blank">https://github.com/ROCm-Developer-Tools/HIPIFY</a> |
| `hipcc` | HIP compiler driver. A utility that invokes `clang` or `nvcc` depending on the target and passes the appropriate include and library options for the target compiler and HIP infrastructure. The source code is available at <a href="https://github.com/ROCm-Developer-Tools/HIPCC" target="_blank">https://github.com/ROCm-Developer-Tools/HIPCC</a>. |
| ROCmCC | Clang/LLVM-based compiler. ROCmCC in itself is not a binary but refers to the overall compiler. |

View File

@@ -1,10 +1,12 @@
# Linux Folder Structure Reorganization
# ROCm FHS Reorganization
## Introduction
ROCm packages have adopted the Linux foundation file system hierarchy standard
to ensure ROCm components follow open source conventions for Linux-based
distributions. Following is the ROCm proposed file structure.
The ROCm platform has adopted the Linux Foundation [Filesystem Hierarchy Standard (FHS)](https://refspecs.linuxfoundation.org/FHS_3.0/fhs/index.html) to ensure ROCm is consistent with standard open source conventions. The following sections specify how current and future releases of ROCm adhere to FHS, how the previous ROCm filesystem is supported, and how improved versioning specifications are applied to ROCm.
## Adopting the Linux Foundation Filesystem Hierarchy Standard (FHS)
To standardize the ROCm directory structure and directory content layout, ROCm has adopted the [FHS](https://refspecs.linuxfoundation.org/FHS_3.0/fhs/index.html), adhering to open source conventions for Linux-based distributions. FHS ensures internal consistency within the ROCm stack, as well as external consistency with other systems and distributions. The ROCm proposed file structure is outlined below:
```none
/opt/rocm-<ver>
@@ -42,14 +44,13 @@ distributions. Following is the ROCm proposed file structure.
| -- architecture independent misc files
```
## Changes from earlier ROCm versions
## Changes From Earlier ROCm Versions
ROCm with the file reorganization is going to have a lean structure. Following
table gives the comparison with new and old folder structure.
The following table provides a brief overview of the new ROCm FHS layout, compared to the layout of earlier ROCm versions. Note that `/opt/` denotes the default ROCm installation path and should be replaced if the ROCm distribution is installed in a non-standard location.
```none
______________________________________________________
| New File Structure | Old File Structure |
| New ROCm Layout | Previous ROCm Layout |
|_____________________________|________________________|
| /opt/rocm-<ver> | /opt/rocm-<ver> |
| | -- bin | | -- bin |
@@ -72,39 +73,28 @@ table gives the comparison with new and old folder structure.
|______________________________________________________|
```
## ROCm File reorganization transition plan
## ROCm FHS Reorganization: Backward Compatibility
New file organization for ROCm was first introduced ROCm v5.2 release. Backward
compatibility was in place to make sure users had a chance to change their
applications using ROCm. ROCm has moved header files and libraries to its new
location as indicated in the above structure and included symbolic-link and
wrapper header files in its old location for backward compatibility.
The FHS file organization for ROCm was first introduced in the ROCm 5.2 release. Backward compatibility was implemented to make sure users could still run their ROCm applications while transitioning to the new FHS. ROCm has moved header files and libraries to their new locations as indicated in the above structure, and included symbolic links and wrapper header files in their old locations for backward compatibility. The following sections detail the ROCm backward compatibility implementation for wrapper header files, executable files, library files and CMake config files.
### Wrapper header files
### Wrapper Header Files
Wrapper header files are placed in the old location (
`/opt/rocm-xxx/<component>/include`) with a warning message to include files
from the new location (`/opt/rocm-xxx/include`) as shown in the example below.
`/opt/rocm-<ver>/<component>/include`) with a warning message to include files
from the new location (`/opt/rocm-<ver>/include`) as shown in the example below.
```cpp
#pragma message "This file is deprecated. Use file from include path /opt/rocm-ver/include/ and prefix with hip."
#include "hip/hip_runtime.h"
#include <hip/hip_runtime.h>
```
The deprecation plan for backward compatibility wrapper header files is as
follows
- Starting with the ROCm 5.2 release, backward compatibility wrapper header files are deprecated: a `#pragma` message announces a `#warning`.
- Starting with ROCm 6.0 (tentatively), backward compatibility for wrapper header files will be removed, and the `#pragma` message will announce an `#error`.
- `#pragma` message announcing deprecation ROCm v5.2 release.
- `#pragma` message changed to `#warning` Future release, tentatively ROCm
v5.5.
- `#warning` changed to `#error` Future release, tentatively ROCm v5.6.
- Backward compatibility wrappers removed Future release, tentatively ROCm
v6.0.
### Executable Files
### Executable files
Executable files are available in the `/opt/rocm-xxx/bin` folder. For backward
compatibility, the old library location (`/opt/rocm-xxx/<component>/bin`) has a
Executable files are available in the `/opt/rocm-<ver>/bin` folder. For backward
compatibility, the old executable location (`/opt/rocm-<ver>/<component>/bin`) has a
soft link to the executable at the new location. Soft links will be removed in a
future release, tentatively ROCm v6.0.
@@ -113,10 +103,10 @@ $ ls -l /opt/rocm/hip/bin/
lrwxrwxrwx 1 root root 24 Jan 1 23:32 hipcc -> ../../bin/hipcc
```
### Library files
### Library Files
Library files are available in the `/opt/rocm-xxx/lib` folder. For backward
compatibility, the old library location (`/opt/rocm-xxx/<component>/lib`) has a
Library files are available in the `/opt/rocm-<ver>/lib` folder. For backward
compatibility, the old library location (`/opt/rocm-<ver>/<component>/lib`) has a
soft link to the library at the new location. Soft links will be removed in a
future release, tentatively ROCm v6.0.
@@ -126,11 +116,11 @@ drwxr-xr-x 4 root root 4096 Jan 1 10:45 cmake
lrwxrwxrwx 1 root root 24 Jan 1 23:32 libamdhip64.so -> ../../lib/libamdhip64.so
```
### CMake Config files
### CMake Config Files
All CMake configuration files are available in the
`/opt/rocm-xxx/lib/cmake/<component>` folder. For backward compatibility, the
old CMake locations (`/opt/rocm-xxx/<component>/lib/cmake`) consist of a soft
`/opt/rocm-<ver>/lib/cmake/<component>` folder. For backward compatibility, the
old CMake locations (`/opt/rocm-<ver>/<component>/lib/cmake`) consist of a soft
link to the new CMake config. Soft links will be removed in a future release,
tentatively ROCm v6.0.
@@ -139,10 +129,10 @@ $ ls -l /opt/rocm/hip/lib/cmake/hip/
lrwxrwxrwx 1 root root 42 Jan 1 23:32 hip-config.cmake -> ../../../../lib/cmake/hip/hip-config.cmake
```
## Changes required in applications using ROCm
## Changes Required in Applications Using ROCm
Applications using ROCm are advised to use the new file paths, as the old files
will be deprecated in a future release. Application have to make sure to include
will be deprecated in a future release. Applications must include the
correct header files and use the correct search paths.
1. `#include<header_file.h>` needs to be changed to
@@ -158,10 +148,18 @@ correct header file and use correct search paths.
`VAR2=/opt/rocm/hsa` needs to be changed to `VAR2=/opt/rocm`
3. Any reference to `/opt/rocm/<component>/bin` or `/opt/rocm/<component>/lib`
needs to be changed to `/opt/rocm/bin` and `/opt/rocm/lib/` respectively.
needs to be changed to `/opt/rocm/bin` and `/opt/rocm/lib/`, respectively.
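The path change in item 3 can be sketched as a small helper. This is a minimal illustration, not part of ROCm: the function name `to_new_layout` and the regular expression are hypothetical, and the sketch assumes the installation root is `/opt/rocm` or `/opt/rocm-<ver>` and that the old layout always has the form `<root>/<component>/{bin,lib,include}/...`.

```python
import re

# Hypothetical pattern for the old per-component layout:
#   <root>/<component>/{bin,lib,include}[/...]
# The negative lookahead keeps paths that already use the new layout
# (for example /opt/rocm/lib/cmake/hip) from being rewritten.
_OLD_LAYOUT = re.compile(
    r"^(?P<root>/opt/rocm(?:-[^/]+)?)"
    r"/(?!(?:bin|lib|include)(?:/|$))(?P<component>[^/]+)"
    r"/(?P<kind>bin|lib|include)(?P<rest>/.*)?$"
)


def to_new_layout(path: str) -> str:
    """Drop the <component> segment from an old-layout ROCm path.

    Paths that do not match the old layout are returned unchanged.
    """
    match = _OLD_LAYOUT.match(path)
    if match is None:
        return path
    return f"{match['root']}/{match['kind']}{match['rest'] or ''}"


if __name__ == "__main__":
    for old in (
        "/opt/rocm/hip/bin/hipcc",
        "/opt/rocm/hip/lib/libamdhip64.so",
        "/opt/rocm/hip/lib/cmake/hip/hip-config.cmake",
    ):
        print(old, "->", to_new_layout(old))
```

A helper like this could be used to audit build scripts for stale search paths; it does not cover the environment-variable case in item 2, where the component directory itself is the value.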
## References
## Changes in Versioning Specifications
{ref}`ROCm deprecation warning <5_4_0_filesystem_reorg_deprecation_notice>`
To better manage ROCm dependency specifications and allow smoother ROCm releases while avoiding dependency conflicts, the ROCm platform shall adhere to the following scheme when numbering and incrementing ROCm file versions:
[Linux File System Standard](https://refspecs.linuxfoundation.org/fhs.shtml)
rocm-\<ver\>, where \<ver\> = \<x.y.z\>
x.y.z denotes MAJOR.MINOR.PATCH
z: PATCH - increment z when implementing backward compatible bug fixes.
y: MINOR - increment y when implementing minor changes that add functionality but are still backward compatible.
x: MAJOR - increment x when implementing major changes that are not backward compatible.
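The numbering scheme above can be illustrated with a short sketch. The names `RocmVersion`, `parse`, and `bump` are hypothetical and not part of any ROCm tooling. Resetting the lower fields to zero on a MINOR or MAJOR increment follows the usual semantic-versioning convention and is an assumption here; the text above does not state it.

```python
import re
from typing import NamedTuple


class RocmVersion(NamedTuple):
    """MAJOR.MINOR.PATCH triple as described in the scheme above."""

    major: int
    minor: int
    patch: int

    def __str__(self) -> str:
        return f"rocm-{self.major}.{self.minor}.{self.patch}"


_NAME = re.compile(r"^rocm-(\d+)\.(\d+)\.(\d+)$")


def parse(name: str) -> RocmVersion:
    """Parse a string of the form rocm-<x.y.z>."""
    match = _NAME.match(name)
    if match is None:
        raise ValueError(f"not a rocm-<x.y.z> name: {name!r}")
    return RocmVersion(*(int(part) for part in match.groups()))


def bump(version: RocmVersion, change: str) -> RocmVersion:
    """Return the next version for a 'patch', 'minor', or 'major' change.

    Assumption: lower fields reset to zero, as in semantic versioning.
    """
    if change == "patch":
        return version._replace(patch=version.patch + 1)
    if change == "minor":
        return RocmVersion(version.major, version.minor + 1, 0)
    if change == "major":
        return RocmVersion(version.major + 1, 0, 0)
    raise ValueError(f"unknown change kind: {change!r}")


if __name__ == "__main__":
    current = parse("rocm-5.6.0")
    for kind in ("patch", "minor", "major"):
        print(kind, "->", bump(current, kind))
```

Because `RocmVersion` is a tuple, two versions compare field by field, so `parse("rocm-5.6.0") < parse("rocm-6.0.0")` holds without extra code.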

View File

@@ -81,7 +81,6 @@ class TaggingArgs(argparse.Namespace):
def exclude(self) -> List[str]:
"""Get the excluded libraries plus defaults."""
defaults = [
"AMDMIGraphX",
"MIOpenGEMM",
"MIOpenKernels",
"MIOpenTensile",
@@ -236,9 +235,21 @@ def run_tagging():
    )
    # Find all the math libraries and their remotes.
    names_and_remotes = list(
        (entry.get("name"), entry.get("remote")) for entry in manifest_tree.findall(".//project[@groups='mathlibs']")
    )
    included_names = [
        "rocm-cmake",
        "MIOpen",
        "AMDMIGraphX",
        "rocprofiler"
    ]
    included_groups = [
        "mathlibs"
    ]
    projects = [ ]
    for project in manifest_tree.iterfind(".//project"):
        include = str(project.get("name")) in included_names
        if (project.get("name") in included_names) or (project.get("groups") in included_groups):
            projects.append(project)
    names_and_remotes = list((entry.get("name"), entry.get("remote")) for entry in projects)
    # Get all the relevant ROCm releases, and only the last version if not doing previous.
    minimum_version = "5.0.0" if args.previous else args.version
@@ -249,12 +260,17 @@ def run_tagging():
    for (version, release) in releases.items():
        for (_, library) in release.libraries.items():
            # Parse the changelog for each library and each version
            success = PROCESSORS[library.name](
                library,
                TEMPLATES[library.name],
                args.previous,
                Version(version) < Version(args.version)
            )
            try:
                success = PROCESSORS[library.name](
                    library,
                    TEMPLATES[library.name],
                    args.previous,
                    Version(version) < Version(args.version)
                )
            except Exception as e:
                success = False
                print(f"Exception parsing {library.name} for ROCm {version}")
                print(e)
            if not success:
                print(f"Error processing {library.name} for ROCm {version}")
                failed.append((version, library.name))
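The project selection introduced in this hunk can be tried in isolation. The sketch below is an assumption-laden reduction, not the script itself: it uses the standard library `xml.etree.ElementTree`, a made-up three-project manifest, and a hypothetical helper `select_projects`. It keeps the same test as the diff, so the `groups` attribute is compared as a whole string and a project listing several groups would not match.

```python
import xml.etree.ElementTree as ET

# Hypothetical manifest; project and remote names are for illustration only.
MANIFEST = """<manifest>
  <project name="rocBLAS" remote="rocm" groups="mathlibs"/>
  <project name="MIOpen" remote="rocm"/>
  <project name="ROCT-Thunk-Interface" remote="rocm"/>
</manifest>"""


def select_projects(tree, included_names, included_groups):
    """Return (name, remote) pairs for projects selected by name or group.

    Mirrors the condition in the diff: the name is in included_names, or
    the whole `groups` attribute is in included_groups.
    """
    return [
        (project.get("name"), project.get("remote"))
        for project in tree.iterfind(".//project")
        if project.get("name") in included_names
        or project.get("groups") in included_groups
    ]


if __name__ == "__main__":
    root = ET.fromstring(MANIFEST)
    selected = select_projects(
        root,
        included_names={"rocm-cmake", "MIOpen", "AMDMIGraphX", "rocprofiler"},
        included_groups={"mathlibs"},
    )
    print(selected)
```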

View File

@@ -32,15 +32,15 @@ The release notes for the ROCm platform.
| Library | Version |
|---------|---------|
{%- for lib_name, lib in release.libraries | dictsort %}
{%- if rocm_ver_by_lib_ver[lib_name][lib.lib_version] == version %}
{%- if rocm_ver_by_lib_ver[lib_name][lib.lib_version] == version and lib.lib_version %}
| {{ lib_name }} | {{prev_lib_ver[lib_name][lib.lib_version]}} ⇒ [{{ lib.lib_version }}]({{ lib.release_url }}) |
{%- else %}
{%- elif lib.lib_version %}
| {{ lib_name }} | [{{ lib.lib_version }}]({{ lib.release_url }}) |
{%- endif %}
{%- endfor %}
{%- for lib_name, lib in release.libraries | dictsort %}
{%- if rocm_ver_by_lib_ver[lib_name][lib.lib_version] == version %}
{%- if rocm_ver_by_lib_ver[lib_name][lib.lib_version] == version and lib.lib_version%}
#### {{lib_name}} {{lib.lib_version}}

View File

@@ -95,8 +95,6 @@ The `hipcc` and `hipconfig` Perl scripts are deprecated. In a future release, co
>
> There will be a transition period where the Perl scripts and compiled binaries are available before the scripts are removed. There will be no functional difference between the Perl scripts and their compiled binary counterpart. No user action is required. Once these are available, users can optionally switch to `hipcc.bin` and `hipconfig.bin`. The `hipcc`/`hipconfig` soft link will be assimilated to point from `hipcc`/`hipconfig` to the respective compiled binaries as the default option.
(5_4_0_filesystem_reorg_deprecation_notice)=
##### Linux Filesystem Hierarchy Standard for ROCm
ROCm packages have adopted the Linux foundation filesystem hierarchy standard in this release to ensure ROCm components follow open source conventions for Linux-based distributions. While moving to a new filesystem hierarchy, ROCm ensures backward compatibility with its 5.1 version or older filesystem hierarchy. See below for a detailed explanation of the new filesystem hierarchy and backward compatibility.

View File

@@ -26,6 +26,29 @@ The following hipcc changes are implemented in this release:
- `hipCommander` at <https://github.com/ROCm-Developer-Tools/hip-tests/tree/develop/samples/1_Utils/hipCommander>
Note that the samples will continue to be available in previous release branches.
- Removal of gcnarch from hipDeviceProp_t structure
- Addition of new fields in hipDeviceProp_t structure
  - maxTexture1D
  - maxTexture2D
  - maxTexture1DLayered
  - maxTexture2DLayered
  - sharedMemPerMultiprocessor
  - deviceOverlap
  - asyncEngineCount
  - surfaceAlignment
  - unifiedAddressing
  - computePreemptionSupported
  - hostRegisterSupported
  - uuid
- Removal of deprecated code
  - hip-hcc codes from hip code tree
- Correct hipArray usage in HIP APIs such as hipMemcpyAtoH and hipMemcpyHtoA
- HIPMEMCPY_3D fields correction to avoid truncation of "size_t" to "unsigned int" inside hipMemcpy3D()
- Renaming of 'memoryType' in hipPointerAttribute_t structure to 'type'
- Correct hipGetLastError to return the last error instead of last API call's return code
- Update hipExternalSemaphoreHandleDesc to add "unsigned int reserved[16]"
- Correct handling of flag values in hipIpcOpenMemHandle for hipIpcMemLazyEnablePeerAccess
- Remove hiparray* and make it opaque with hipArray_t
##### New HIP APIs in This Release
@@ -280,7 +303,3 @@ When user applications call `ncclCommAbort` to destruct communicators and then c
communicators repeatedly, subsequent communicators may fail to initialize.
This issue is under investigation and will be resolved in a future release.
#### Failures In HIP Directed Tests
Multiple HIP directed tests fail.

View File

@@ -2,9 +2,22 @@
<!-- markdownlint-disable no-duplicate-header -->
### What's New in This Release
#### HIP SDK for Windows
AMD is pleased to announce the availability of the HIP SDK for Windows as part
of the ROCm platform. The
[HIP SDK OS and GPU support page](https://rocm.docs.amd.com/en/docs-5.5.1/release/windows_support.html)
lists the versions of Windows and GPUs validated by AMD. HIP SDK features on
Windows are described in detail in our
[What is ROCm?](https://rocm.docs.amd.com/en/docs-5.5.1/rocm.html#rocm-on-windows)
page and differs from the Linux feature set. Visit
[Quick Start](https://rocm.docs.amd.com/en/docs-5.5.1/deploy/windows/quick_start.html#)
page to get started. Known issues are tracked on
[GitHub](https://github.com/RadeonOpenCompute/ROCm/issues?q=is%3Aopen+label%3A5.5.1+label%3A%22Verified+Issue%22+label%3AWindows).
#### HIP API Change
The following HIP API is updated in the ROCm v5.5.1 release,
The following HIP API is updated in the ROCm 5.5.1 release:
##### `hipDeviceSetCacheConfig`
