1613 Commits

Author SHA1 Message Date
randyh62
f500c32989 add quarantine_size_mb (#3264)
* add quarantine_size_mb

* Update docs/conceptual/using-gpu-sanitizer.md

Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com>

* Update docs/conceptual/using-gpu-sanitizer.md

Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com>

* format fix

* format fix again

* ASAN capitalization

* remove particular

* indent bullets

* Leo comments

---------

Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com>
2024-06-10 11:59:47 -07:00
Joseph Macaranas
f8151b6cb5 rocprofiler-register: Add unit testing (#3272)
Since this component uses the base pool, does not need GPU for testing and is very quick to run, unit testing can be done within the same job.
2024-06-10 11:29:47 -04:00
Joseph Macaranas
8db9220935 External CI: non-interactive apt upgrades (#3271) 2024-06-08 22:20:11 -04:00
alexxu-amd
30851e9c85 Merge pull request #3266 from ROCm/amd/alexxu12/aptScriptTypo
Fix a typo from .azuredevops/templates/steps/dependencies-other.yml
2024-06-07 13:36:37 -04:00
alexxu-amd
fdd0ed080b fix a typo 2024-06-07 13:29:14 -04:00
Joseph Macaranas
d3f634ea33 Remove branch filter for aomp pipeline trigger (#3258)
Previous filter was not triggering this CI pipeline when ROCm-Runtime build was triggered from a pipeline completion trigger of llvm-project.
2024-06-07 11:14:32 -04:00
Sam Wu
6c73abbaea Merge pull request #3262 from ROCm/bb-develop-6.1.2-pr
Add the manifest file for ROCm6.1.2
2024-06-06 17:07:14 -06:00
Sam Wu
c49877adc9 Merge branch 'roc-6.1.x' into develop 2024-06-06 17:06:13 -06:00
Sam Wu
49404d69f8 Merge pull request #3263 from ROCm/dependabot/pip/docs/sphinx/rocm-docs-core-1.4.0
Bump rocm-docs-core from 1.2.0 to 1.4.0 in /docs/sphinx
2024-06-06 14:18:31 -06:00
dependabot[bot]
d17e602769 Bump rocm-docs-core from 1.2.0 to 1.4.0 in /docs/sphinx
Bumps [rocm-docs-core](https://github.com/ROCm/rocm-docs-core) from 1.2.0 to 1.4.0.
- [Release notes](https://github.com/ROCm/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/ROCm/rocm-docs-core/compare/v1.2.0...v1.4.0)

---
updated-dependencies:
- dependency-name: rocm-docs-core
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-06-06 20:04:21 +00:00
Wang, Yanyao
2fdbc8b475 Add the manifest file for ROCm6.1.2 2024-06-06 12:44:08 -07:00
Peter Park
7d3fb25725 Update links in compat matrix and what-is-rocm (#3253)
* Update links in compat matrix and what-is-rocm

* Tensorflow -> TensorFlow

* Remove extra lines

* Revert "Remove extra lines"

This reverts commit 607c4323ac.

ROCm Debug Agent
2024-06-06 13:27:00 -04:00
Wang, Yanyao
b7c6671e06 Fix Markdown formate for the linter check 2024-06-05 13:44:50 -07:00
Wang, Yanyao
27bd772bbe Update the branch of ROCm repo after testing 2024-06-05 13:44:50 -07:00
Wang, Yanyao
68c45d30b5 Build ROCm from source 2024-06-05 13:44:50 -07:00
Sam Wu
35835c4289 Fix first link in compatibility matrix table (#3239)
* Fix first link in compatibility matrix table

* Revert "Fix first link in compatibility matrix table"

This reverts commit 069c5c116a.

* Remove sticky header and unused css

* Remove container from hardware specs matrix

---------

Co-authored-by: Peter Jun Park <peter.park@amd.com>
2024-06-05 15:48:27 -04:00
Wang, Yanyao
73b7b02c4f Fix Markdown formate for the linter check 2024-06-05 12:15:12 -07:00
Wang, Yanyao
ba7afa9808 Update the branch of ROCm repo after testing 2024-06-05 12:15:12 -07:00
Wang, Yanyao
ae6eac2823 Build ROCm from source 2024-06-05 12:15:12 -07:00
Young Hui - AMD
55bb127e9a fix links for MIVisionX (#3240) 2024-06-05 11:55:11 -04:00
Sam Wu
e65e9307f5 Add 6.1.2 to version list (#3238) 2024-06-05 11:25:35 -04:00
Peter Park
6494885359 Rename fine-tuning and optimization guide directory and fix index.md (#3242)
* Mv fine-tuning and optimization files

* Reorder index.md

* Rename images directory

* Fix internal links
2024-06-05 11:11:00 -04:00
Sam Wu
266f502010 Update manifest to 6.1.2 2024-06-05 11:06:24 -04:00
abhimeda
bf08674992 Built rccl using latest source code (#3230) 2024-06-04 17:50:36 -04:00
Sam Wu
17f12a11e7 Merge pull request #3234 from WBobby/roc-6.1.2-manifest
Update manifest file for ROCm6.1.2
rocm-6.1.2
2024-06-04 14:50:14 -06:00
Wang, Yanyao
b2f0f0acdf Update manifest file for ROCm6.1.2 2024-06-04 15:39:16 -05:00
Sam Wu
a11c0512e1 Merge branch 'docs/6.1.2' into roc-6.1.x 2024-06-04 14:38:59 -06:00
Sam Wu
eec71da8dd Merge pull request #3232 from ROCm/develop
Merge develop into roc-6.1.x
2024-06-04 14:36:34 -06:00
Sam Wu
39891fe185 Sync develop branch 2024-06-04 14:32:36 -06:00
Peter Park
14ee171649 Add OS support note (#91) 2024-06-04 14:11:01 -04:00
Peter Park
e7bff21d3e Add final fixes to 6.1.2 release notes and changelog (#90)
* Regenerate changelog

* Add component changelogs and known issue

Fix RELEASE.md headings

Update pub datestamp for 6.1.2

Add AMDSMI and ROCm SMI to 6.1.2 template

Add rccl and rocBLAS

Update intro blurb and headings

Add ROCm SMI fix

Add missed heading to AMDSMI

Update datestamp and release version number

Update version and release number

Add known issue re: MI300X error detection

Words

Add issue link

Rm GitHub issue link

Move known issue down

Update ki wording

Remove "this issue has been investigated ... " from known issue

Fix changelog h1

* Reorg known issue, upcoming changes, remove rocDecode tested configurations

* Add fixes from review

* Add fixed issue link

* Fix heading

* Remove known issue
2024-06-04 12:23:07 -04:00
Peter Park
6abe5b50a2 Merge pull request #3229 from peterjunpark/docs/6.1.2
docs/6.1.2: Update the links for rocminfo and rocm-bandwidth-test (#3213)
2024-06-04 08:12:15 -07:00
amitkumar-amd
df864f8f79 Update the links for rocminfo and rocm-bandwidth-test (#3213)
* Update the links for rocminfo and rocm-bandwidth-test

* Update the links for rocminfo and rocm-bandwidth-test

* Update the links for rocminfo and rocm-bandwidth-test

* Update links to intersphinx links

---------

Co-authored-by: Peter Jun Park <peter.park@amd.com>
2024-06-04 11:00:52 -04:00
amitkumar-amd
7290ce9030 Update the links for rocminfo and rocm-bandwidth-test (#3213)
* Update the links for rocminfo and rocm-bandwidth-test

* Update the links for rocminfo and rocm-bandwidth-test

* Update the links for rocminfo and rocm-bandwidth-test

* Update links to intersphinx links

---------

Co-authored-by: Peter Jun Park <peter.park@amd.com>
2024-06-04 10:59:22 -04:00
Peter Park
d6d18d7cd4 Merge pull request #3226 from peterjunpark/docs/6.1.2
docs/6.1.2: Add "Fine Tuning LLMs" how to guide (#3124)
2024-06-04 07:02:36 -07:00
Peter Park
30f10e0145 Update fine-tuning guide: title, improve readibility in code blocks, fix typos (#3222)
* Fix typo

* Add torchtune link

* Add newlines before comments in code blocks for readability

* Update title
2024-06-03 22:15:36 -04:00
Peter Park
1e55e01af3 Add "Fine Tuning LLMs" how to guide (#3124)
* Add Fine Tuning LLMs how to guide

* Reorg and refactor Fine-tuning LLMs with ROCm

Update index and headings

Fix formatting and update toc

Split out content from index to overview.rst

Add metadata

Clean up overview

Add inference sections, fix rst errors, clean up single-gpu-fine-tuning

Combine fine-tuning and inference guides

Fix some links and formatting

Update toc and add formatting fixes

Add ck kernel fusion content

Update toc

Clean up model quantization and acceleration

Add CK images

Clean up profiling

Update triton kernel performance optimization

Update llm inference frameworks guide

Disable automatic number of figures and tables in Sphinx conf

Change tabs to spaces

Change heading to end with -ing

Add link fixes and heading updates

Add rocprof/Omniperf/Omnitrace section

Update profiling and debugging guide

Add formatting fixes

Satisfy spellcheck

Fix words

Delete unused file

Finish overview

Clean up first 4 sections

Multi-gpu fine-tuning guide: slight fixes

Update toc

Remove tabs

Formatting fixes

* Minor wording updates

* Add some clean-up

* Update profiling and debugging gudie

* Fix Omnitrace link

* Update ck kernel fusion with latest

* Update CK formatting

* Fix perfetto link syntax

* Fix typos and add blurbs

* Add fixes to Triton optimization doc

* Tabify saving adapters / models section

* Fix linting errors - spellcheck

Fix spelling and grammar

Satisfy linter

Update wording in profiling guide

Add fixes to satisfy linter

More fixes for linting in Triton guide

More linting fixes

Spellcheck in CK guide

* Improve triton guide

Fix linting errors and optics

* Add occupancy / vgpr table

Change some wording

* Re-add tunableop

* Add missing indent in _toc.yml

* Remove ckProfiler references

* Add links to resources

* Add refs in CK optimization guide

* Rename files and fix internal links

* Organize tuning guides

Reorg triton

* Add compute unit diagram

* Remove AutoAWQ

* Add higher res image for Perfetto trace example

* Update link text

* Update fig nums

* Update some formatting

* Update "Inductor"

* Change "Inductor" to TorchInductor

* Add link to official TorchInductor docs
2024-06-03 22:15:13 -04:00
Peter Park
9a347aa168 Update fine-tuning guide: title, improve readibility in code blocks, fix typos (#3222)
* Fix typo

* Add torchtune link

* Add newlines before comments in code blocks for readability

* Update title
2024-06-03 22:11:19 -04:00
Peter Park
fed33835a0 Add "Fine Tuning LLMs" how to guide (#3124)
* Add Fine Tuning LLMs how to guide

* Reorg and refactor Fine-tuning LLMs with ROCm

Update index and headings

Fix formatting and update toc

Split out content from index to overview.rst

Add metadata

Clean up overview

Add inference sections, fix rst errors, clean up single-gpu-fine-tuning

Combine fine-tuning and inference guides

Fix some links and formatting

Update toc and add formatting fixes

Add ck kernel fusion content

Update toc

Clean up model quantization and acceleration

Add CK images

Clean up profiling

Update triton kernel performance optimization

Update llm inference frameworks guide

Disable automatic number of figures and tables in Sphinx conf

Change tabs to spaces

Change heading to end with -ing

Add link fixes and heading updates

Add rocprof/Omniperf/Omnitrace section

Update profiling and debugging guide

Add formatting fixes

Satisfy spellcheck

Fix words

Delete unused file

Finish overview

Clean up first 4 sections

Multi-gpu fine-tuning guide: slight fixes

Update toc

Remove tabs

Formatting fixes

* Minor wording updates

* Add some clean-up

* Update profiling and debugging gudie

* Fix Omnitrace link

* Update ck kernel fusion with latest

* Update CK formatting

* Fix perfetto link syntax

* Fix typos and add blurbs

* Add fixes to Triton optimization doc

* Tabify saving adapters / models section

* Fix linting errors - spellcheck

Fix spelling and grammar

Satisfy linter

Update wording in profiling guide

Add fixes to satisfy linter

More fixes for linting in Triton guide

More linting fixes

Spellcheck in CK guide

* Improve triton guide

Fix linting errors and optics

* Add occupancy / vgpr table

Change some wording

* Re-add tunableop

* Add missing indent in _toc.yml

* Remove ckProfiler references

* Add links to resources

* Add refs in CK optimization guide

* Rename files and fix internal links

* Organize tuning guides

Reorg triton

* Add compute unit diagram

* Remove AutoAWQ

* Add higher res image for Perfetto trace example

* Update link text

* Update fig nums

* Update some formatting

* Update "Inductor"

* Change "Inductor" to TorchInductor

* Add link to official TorchInductor docs
2024-06-03 14:04:33 -04:00
danielsu-amd
f52bc2bc68 External CI: Add rocBLAS dependency to rocSPARSE (#3216) 2024-06-03 13:41:30 -04:00
danielsu-amd
205790159d External CI: use pipelined rocm-core for rocprofiler (#3215) 2024-06-03 10:52:56 -04:00
Peter Park
9679a84a8b Add components, known issues, and fixed issues to 6.1.2 RN / CL (#87)
* Regenerate changelog

* Add component changelogs and known issue

Fix RELEASE.md headings

Update pub datestamp for 6.1.2

Add AMDSMI and ROCm SMI to 6.1.2 template

Add rccl and rocBLAS

Update intro blurb and headings

Add ROCm SMI fix

Add missed heading to AMDSMI

Update datestamp and release version number

Update version and release number

Add known issue re: MI300X error detection

Words

Add issue link

Rm GitHub issue link

Move known issue down

Update ki wording

Remove "this issue has been investigated ... " from known issue

Fix changelog h1
2024-06-03 08:51:38 -04:00
Sam Wu
d34f7d7777 Merge pull request #3210 from ROCm/dependabot/pip/docs/sphinx/requests-2.32.2
Bump requests from 2.31.0 to 2.32.2 in /docs/sphinx
2024-05-31 17:10:09 -06:00
dependabot[bot]
16fca72626 Bump requests from 2.31.0 to 2.32.2 in /docs/sphinx
Bumps [requests](https://github.com/psf/requests) from 2.31.0 to 2.32.2.
- [Release notes](https://github.com/psf/requests/releases)
- [Changelog](https://github.com/psf/requests/blob/main/HISTORY.md)
- [Commits](https://github.com/psf/requests/compare/v2.31.0...v2.32.2)

---
updated-dependencies:
- dependency-name: requests
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-05-31 23:02:26 +00:00
Sam Wu
1a6ce7f6e0 Merge pull request #3212 from ROCm/dependabot/pip/docs/sphinx/rocm-docs-core-1.2.0
Bump rocm-docs-core from 1.1.1 to 1.2.0 in /docs/sphinx
2024-05-31 17:01:03 -06:00
dependabot[bot]
35c17fcce5 Bump rocm-docs-core from 1.1.1 to 1.2.0 in /docs/sphinx
Bumps [rocm-docs-core](https://github.com/RadeonOpenCompute/rocm-docs-core) from 1.1.1 to 1.2.0.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v1.1.1...v1.2.0)

---
updated-dependencies:
- dependency-name: rocm-docs-core
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-05-31 22:07:13 +00:00
Sam Wu
bf19dd1dc8 Update RTD config 2024-05-31 15:18:53 -06:00
Sam Wu
5fec2e1ca4 Update documentation requirements 2024-05-31 13:49:14 -06:00
danielsu-amd
1975889da1 External CI: Remove redundant rocm_smi_lib pipeline ID (#3211) 2024-05-31 14:25:09 -04:00
Sam Wu
b9c4490f96 Merge branch 'roc-6.1.x' into docs/6.1.2 2024-05-31 11:59:44 -06:00