Commit Graph

2735 Commits

Author SHA1 Message Date
Peter Park
548d31f990 fix broken image in megatron-lm-v24.12-dev.rst (#5043) 2025-07-15 10:57:12 -04:00
Daniel Su
32f79a966b [Ex CI] fix MIOpen CK script again (#5034) 2025-07-14 14:18:24 -04:00
Daniel Su
393df3e05c [Ex CI] hipSPARSELt monorepo enablement (#5033) 2025-07-11 16:40:18 -04:00
Daniel Su
aa3cdcb3c3 [Ex CI] increase hipSPARSELt test timeout (#5028) 2025-07-10 12:04:06 -04:00
Pratik Basyal
e8bb027c20 HIP 7.0 upcoming changes blog link updated (#5021) 2025-07-10 09:53:44 -04:00
Pratik Basyal
544186aef8 ROCm for HPC table update for Develop (#5015) (#5016) (#5019)
* ROCm for HPC table update for 6.4.0 (#5015) (#5016)

* 6.4.0 updates synced

* Minor change

* Link update
2025-07-09 14:57:53 -04:00
Peter Park
22524eeaa5 fix xrefs in vllm-0.9.0.1-20250605.rst (#5017) 2025-07-09 14:38:24 -04:00
Peter Park
d471b04cd5 Update vLLM Docker doc for 07/02 2025-07-09 11:38:27 -04:00
Di Nguyen
1c7cff8a47 Merge pull request #5011 from ROCm/zenguyen/disable-device-merge-inplace-rocprim
[rocPRIM] Disable device_merge_inplace unit test for rocPRIM
2025-07-09 09:12:08 -06:00
Daniel Su
84c664074f [Ex CI] add OS to copyHIP filenames (#5012) 2025-07-09 10:37:23 -04:00
NguyenNhuDi
7c6083d840 disabled device_merge_inplace 2025-07-08 14:08:53 -06:00
Daniel Su
94099b1398 [Ex CI] rocPyDecode: fix test running (#5002) 2025-07-08 14:32:30 -04:00
Peter Park
3b3fc4894b Fix xrefs and Sphinx warnings in documentation
2025-07-08 13:22:53 -04:00
Daniel Su
8aba1d2318 [Ex CI] fix printed artifact download links (#4998) 2025-07-04 14:41:33 -04:00
Mirza Halilčević
e9e75cfc46 Merge pull request #4963 from ROCm/pybind11
Add pybind11 as a pip module requirement for azure
2025-07-04 13:35:24 +02:00
Peter Park
58b3ad0509 Fix Docker run commands in Megatron-LM Docker doc (#4996)
* fix megatron-lm docker run commands

* update --shm-size option
2025-07-02 14:19:27 -04:00
Daniel Su
523d8520f3 [Ex CI] rocBLAS: increase test timeout to 2 hours (#4995) 2025-07-02 12:16:50 -04:00
Peter Park
d0c8ba0805 Add Wan2.1 to PyTorch inference Docker documentation (#4984)
* add wan2.1 to pyt inference models

* update group name

* fix container tag

* fix group name

* change documented data type to bfloat16

* fix col width
2025-07-02 09:58:37 -04:00
ammallya
73de8a3e46 Removing failing checkout step 2025-07-01 11:25:17 -07:00
Daniel Su
1fc312f90f [Ex CI] fix hardcoded gfx in MIOpen CK script (#4993) 2025-06-30 15:34:54 -04:00
Daniel Su
fde2647ccd [Ex CI] migrate rocBLAS to monorepo (#4987) 2025-06-30 15:16:58 -04:00
Daniel Su
798c8debb5 [Ex CI] consolidate artifact extraction and deletion in deps-rocm (#4961) 2025-06-30 14:12:52 -04:00
dependabot[bot]
393ba600c2 Build(deps): Bump sphinx-sitemap from 2.6.0 to 2.7.2 in /docs/sphinx (#4985)
Bumps [sphinx-sitemap](https://github.com/jdillard/sphinx-sitemap) from 2.6.0 to 2.7.2.
- [Release notes](https://github.com/jdillard/sphinx-sitemap/releases)
- [Changelog](https://github.com/jdillard/sphinx-sitemap/blob/master/CHANGELOG.rst)
- [Commits](https://github.com/jdillard/sphinx-sitemap/compare/v2.6.0...v2.7.2)

---
updated-dependencies:
- dependency-name: sphinx-sitemap
  dependency-version: 2.7.2
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-06-30 09:33:28 -06:00
Daniel Su
c64c545b52 [Ex CI] hipBLASLt: build some archs on medium pool (#4986) 2025-06-30 11:32:35 -04:00
Daniel Su
76ee1d720f [Ex CI] rocAL: switch to medium pool (#4983) 2025-06-27 13:41:07 -04:00
Daniel Su
5adc040367 [Ex CI] migrate hipBLAS-common & hipBLASLt pipeline IDs (#4982) 2025-06-27 12:09:58 -04:00
Daniel Su
061da8f306 [Ex CI] enable almalinux8 and gfx1100 builds for hipBLASLt, rocBLAS, rocSOLVER (#4955) 2025-06-27 10:39:30 -04:00
Daniel Su
e26767bca6 [Ex CI] Tensile: add boost filesystem (#4980) 2025-06-27 10:38:31 -04:00
Daniel Su
7b6f1800d4 [Ex CI] fix miopen-get-ck for new artifact naming scheme (#4979) 2025-06-26 15:49:13 -04:00
Pratik Basyal
a6221937f2 KMD UMD support footnote update ROCm 640 (#4973) (#4976)
* KMD UMD support footnote update ROCm 640

* Historical footnote
2025-06-26 15:34:21 -04:00
Daniel Su
ac2df2961d [Ex CI] add component name to artifact download filter (#4974) 2025-06-26 13:55:03 -04:00
Mirza Halilcevic
9b102061f4 Add pybind11 as a pip module requirement for azure. 2025-06-24 08:06:52 -05:00
Daniel Su
f20e8dec8b [Ex CI] revert PRIM default branch to develop (#4960) 2025-06-23 16:35:02 -04:00
Daniel Su
10e9157f39 [Ex CI] allow rerun jobs to upload artifacts (#4959) 2025-06-23 15:37:52 -04:00
Daniel Su
a2ce6021cb [Ex CI] add more OSs to nightly build (#4958) 2025-06-23 15:13:11 -04:00
Peter Park
2196fc9a2f Fix pytorch training 25.6 doc (#4956)
* fix pytorch-training history

* fix pytorch-training

fix
2025-06-23 13:45:50 -04:00
Daniel Su
925689f89e [Ex CI] enable gfx1100 builds (#4954) 2025-06-23 11:26:35 -04:00
Peter Park
91a541f8b9 Update PyTorch training benchmark doc for v25.6 (#4950)
* update pytorch-training docker details

* add previous version

* add models data

* update models data id

* add models picker

* update data

* update fmt

fmt

* update data yaml

* update template

* update data

* fix

* fix vllm-0.6.4 broken link

* fix vllm history
2025-06-23 09:26:15 -04:00
Peter Park
34f8d57ece Organize version histories in ROCm for AI benchmark Docker docs (#4948)
* add vllm 0.8.3 20250415

update prev versions table

* add vllm previous versions page

* move index to vllm-history

* add standalone megatron-lm version history

* add pytorch training version history

* fix

* add vllm-0.4.3

* add vllm-0.6.4

* update vllm-history

* add vllm-0.7.3

* add vllm-0.6.6

* add notes

* fix vllm readme links

fix main page link

* add latest version to previous versions list

* add jax-maxtext history

* fix jax-maxtext history

* add pytorch-training history

* add link in jax-maxtext 25.4

* add megatron-lm history

* fix datatemplate path for vllm 0.8.3

* fix jax-maxtext history link

* update note about performance measurements

* add vllm 0.8.5_20250521 previous version

* consistency fixes
2025-06-20 15:01:38 -04:00
yugang-amd
55f95adc7c Update for vllm -06/10 (#4943) 2025-06-20 08:41:37 -04:00
Daniel Su
e05b1702d8 [Ex CI] fix experimental HIP to CLR triggers (#4946) 2025-06-19 12:56:53 -04:00
Daniel Su
4179042cf7 [Ex CI] add multi-OS support to copyHIP (#4945) 2025-06-19 12:15:22 -04:00
dependabot[bot]
ae2de81b79 Build(deps): Bump urllib3 from 2.4.0 to 2.5.0 in /docs/sphinx (#4942)
Bumps [urllib3](https://github.com/urllib3/urllib3) from 2.4.0 to 2.5.0.
- [Release notes](https://github.com/urllib3/urllib3/releases)
- [Changelog](https://github.com/urllib3/urllib3/blob/main/CHANGES.rst)
- [Commits](https://github.com/urllib3/urllib3/compare/2.4.0...2.5.0)

---
updated-dependencies:
- dependency-name: urllib3
  dependency-version: 2.5.0
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-06-19 09:03:29 -06:00
Daniel Su
efd6cec4a4 [Ex CI] disable downstream triggers for mathlibs not yet migrated (#4936) 2025-06-18 14:10:58 -04:00
Daniel Su
b65996587f [Ex CI] remove ALLOWED_PARTIAL_SUCCEED_BUILDS library variable (#4937) 2025-06-18 12:10:04 -04:00
yugang-amd
7b7eaf69f2 remove broken xref (#4939) 2025-06-18 10:15:53 -04:00
Daniel Su
4cfc8ddad2 [Ex CI] MIVisionX: add hipBLASLt to build deps (#4931) 2025-06-17 13:40:35 -04:00
Daniel Su
97ebbb227d [Ex CI] rocprof-sdk: add cmake, libsqlite3-dev (#4935) 2025-06-17 13:40:15 -04:00
Daniel Su
8c6a1726fe [Ex CI] remove old aqlprofile param in Pytorch (#4927) 2025-06-16 15:17:23 -04:00
Daniel Su
2656143c9e [Ex CI] fix ROCm versions (#4930) 2025-06-16 11:42:51 -04:00