github/ROCm - ROCm - AtHeartEngineering

mirror of https://github.com/ROCm/ROCm.git synced 2026-04-05 03:01:17 -04:00

Author	SHA1	Message	Date
jon-chuang	36859aebff	[DOCS] Add MLIR Autogenerated Docs to Sphinx Docs (#2234 ) Partially fixes: https://github.com/openai/triton/issues/2226 Here are some example renderings: ![Screenshot from 2023-09-04 18-39-20](https://github.com/openai/triton/assets/9093549/e9c4af04-aeae-4021-a8db-6a4a82b59ae7) ![Screenshot from 2023-09-04 18-39-30](https://github.com/openai/triton/assets/9093549/410391b8-e07e-4bed-909c-8ce5484072d1) ![Screenshot from 2023-09-04 18-39-41](https://github.com/openai/triton/assets/9093549/f1eaef95-66c1-4506-a153-c6069e2b5072)	2023-09-06 08:17:12 +00:00
Michael Melesse	c6d33dcebf	[ROCM] Core Functionality for AMD (#1983 ) * This PR adds a third-party backend for Triton that works on AMD. * It exposes much of the work that has been done in our [fork](https://github.com/ROCmSoftwarePlatform/triton). * Most unit tests in `test_core.py` pass. * It skips some unit tests for various reasons. * We plan to follow up with more PRs improving functionality and performance in the future. --------- Co-authored-by: Philippe Tillet <phil@openai.com>	2023-08-31 14:02:00 -07:00
Zahi Moudallal	bffb76e847	[DOCS] Fixing docs (#2221 )	2023-08-31 11:47:57 -07:00
Zahi Moudallal	f922ecbc29	[DOCS] Fixing dependency (#2219 )	2023-08-31 17:51:04 +00:00
Zahi Moudallal	cf31f4ddb2	[DOCS] Fixing docs by using sphinx-build instead of multiversion (#2217 )	2023-08-31 16:51:14 +00:00
Zahi Moudallal	5282ed890d	[CI] Add back pre-commit to nvidia CI job (#2159 )	2023-08-23 01:11:03 +00:00
Zahi Moudallal	3c8f959f91	[CI] Adding workflow_run (#2120 )	2023-08-18 23:58:41 -07:00
Zahi Moudallal	1faf93e6fb	[CI] Fix PR comment (#2131 )	2023-08-18 09:16:18 -07:00
Zahi Moudallal	b33f97a682	[CI] Fix bug in Compare Artifacts workflow (#2128 ) Forgot to remove this line	2023-08-17 18:06:36 -07:00
Zahi Moudallal	6f654cfbbf	[CI] Testing PR comment from another workflow (#2127 )	2023-08-17 17:34:59 -07:00
Zahi Moudallal	3fa6d51bc9	[CI] Adding new github workflow for testing (#2121 )	2023-08-17 15:32:38 -07:00
Zahi Moudallal	557b2d4b34	[CI] upload only test/unit/operators cache to artifacts and rely on kernel names in cache to compare artifacts (#2111 )	2023-08-16 20:34:40 -07:00
Zahi Moudallal	0312ed3473	[CI] Update kernels names (#2093 ) Co-authored-by: Philippe Tillet <phil@openai.com>	2023-08-14 19:41:41 -07:00
Philippe Tillet	facc1dcbac	[TESTS] better matmul unit testing (#2098 )	2023-08-13 17:54:32 -07:00
Zahi Moudallal	c309f7e57a	[CI] Skip PR if we paginate 30 times without finding a run_id (#2092 )	2023-08-11 15:31:46 -07:00
Philippe Tillet	3ec05fb023	[CI] H100 tests always use ENABLE_TMA=1 ENABLE_MMA_V3=1 (#2051 )	2023-08-07 19:32:55 -07:00
Philippe Tillet	54f1ac950e	[CI] disable AMD CI (#2045 )	2023-08-07 12:03:26 -07:00
Philippe Tillet	223c2d32a2	[CI] disable XPU tests (not compiling) (#2044 ) cc @EikanWang . I'm disabling this for now since it broke with the H100 merge, but please feel free to fix the compilation errors and submit a PR.	2023-08-07 11:56:16 -07:00
goostavz	f1512bded1	Initial code merge of Hopper support (#2036 ) The initial code merge of Nvidia Hopper features support. Please be aware that the code merge is not finished yet and troubleshooting is still ongoing. The new hardware features (GMMA, TMA, STMATRIX etc.) and automatic warp-specialization are experimental for now and turned off by default. We recommend trying them once version 3.0 is released. The work is contributed by: ben-zhang-609, bealwang, donproc, qliu93, jsh20, allatit23, LyricZhao, ivanyinwz, goostavz & yangjunpro from Nvidia, in cooperation with: ptillet, Jokeren, ThomasRaoux & zahimoud from OpenAI. Co-authored-by: Goostav Zhu <gzhu@nvidia.com>	2023-08-07 09:53:04 +08:00
Philippe Tillet	1db3bdc52e	[BACKEND] avoid code duplication for fully warp-synchronous reductions (#1978 )	2023-07-21 16:06:00 -07:00
Phil Tillet	c7757fae71	[GITHUB] tweak CODEOWNERS	2023-07-17 00:41:11 -07:00
Keren Zhou	571c92f2a8	[CI] Fix CI kernel compare (#1931 ) With this PR, we find the latest merged PR that successfully passed "Integration Tests".	2023-07-12 10:06:34 -07:00
Philippe Tillet	7e3ebbc4c8	[TESTING] now using cuda graphs for perf regression tests (#1925 )	2023-07-10 22:49:25 -07:00
Zahi Moudallal	2a2cbc352b	[CI] Fix failure due to not finding workflow run on first page (#1907 ) Co-authored-by: Philippe Tillet <phil@openai.com>	2023-07-06 12:48:36 -07:00
Thomas	787cdff0cd	[TESTS] Enable parallel pytest in CI for CUDA (#1905 ) Run most of the pytest suite in parallel, which speeds up CI from 36min to 10min for A100 and from 22min to 6min for H100. Some tests, such as the runtime tests, still need to run serially.	2023-07-06 11:40:33 -07:00
Zahi Moudallal	2c51dc00ab	[CI] Use PAT for PR comment in CI (#1839 ) Co-authored-by: Philippe Tillet <phil@openai.com>	2023-07-05 12:34:58 -04:00
Zahi Moudallal	62aef58c17	[CI] Split integration tests into two jobs (#1855 ) We need to split the CI into two jobs, nvidia (PR blocking) and third-party (PR non-blocking). This way we can guarantee that artifacts are uploaded for any PR that gets merged into `main`, and that `compare artifacts` job can just wait on the artifacts-uploading job.	2023-07-01 11:28:55 -07:00
Zahi Moudallal	6ad8cd52e7	[CI] Added IR reference-check github workflow (#1755 )	2023-06-22 18:00:40 -07:00
Zahi Moudallal	cfef4389f5	[CI] Add gpu arch to artifacts name (#1827 )	2023-06-22 16:45:50 -07:00
Nathaniel McVicar	d416d63152	[PUBLISHING] Skip nightly run if no commits (#1807 ) Previously the nightly run was failing to upload if there had been no commits since the previous night. Also moves time ahead 20 minutes to avoid hourly spike delays launching workflows.	2023-06-21 18:12:32 -07:00
Ashay Rane	2d774ab129	[CI] add workflow for building, testing, and uploading LLVM artifacts (#1777 ) This patch adds a GitHub workflow yaml file and a Docker file to build LLVM for the commit hash specified in the llvm-hash.txt file in the llvm-head branch. If the tests run successfully, the built artifacts are uploaded to Azure blob storage and their URL is available in the CI logs. These artifacts can then be used in the python/setup.py script for fetching the necessary LLVM binary objects while building Triton.	2023-06-20 09:26:19 -05:00
Philippe Tillet	9a2580de13	[CI] Added H100 node (#1779 )	2023-06-15 14:21:47 -07:00
Wang Weihan	b27a91a113	[FRONTEND] Enable triton to support register thirdparty backend at runtime (#1643 ) This PR provides a mechanism for supporting a third-party backend at runtime to generate the backend-specific code. The mechanism provides a common class to abstract the third-party backend logic and two essential functions to register and get the third-party backend at runtime. - `BaseBackend`: A common class to abstract the third-party backend logic - `register_backend`: Register a third-party backend with a given device type - `get_backend`: Get the third-party backend with a given device type Generally, a third-party backend must inherit from `BaseBackend` and implement all the member functions according to the backend characteristics. Once the backend implementation is ready, the third-party backend can invoke `register_backend` to register itself under a given device type. During kernel compilation and execution, the mechanism gets the registered backend to generate the kernel and launcher code for a given device. This PR adds a dummy backend to simulate a third-party backend and demonstrate the usage.
- [test_device_backend.py](https://github.com/openai/triton/pull/1643/files#diff-bbe4d50624f2d11bf17c878a1ed4d422918c124c182cf9357b993240c385bea1): To define a third-party backend and register the backend - [ExtensionBackend](https://github.com/openai/triton/pull/1643/files#diff-bbe4d50624f2d11bf17c878a1ed4d422918c124c182cf9357b993240c385bea1R123): Inherit from the `BaseBackend` and implement some specific logic like [filter out some compile stages](https://github.com/openai/triton/pull/1643/files#diff-bbe4d50624f2d11bf17c878a1ed4d422918c124c182cf9357b993240c385bea1R129-R135) - [Register the `ExtensionBackend` for `CPU`](https://github.com/openai/triton/pull/1643/files#diff-bbe4d50624f2d11bf17c878a1ed4d422918c124c182cf9357b993240c385bea1R279) - [extension_backend.c](https://github.com/openai/triton/pull/1643/files#diff-169c1d08b3a0a7b343cfa3258fbc32b47e0f6c46305a112652fa1bdaaec89d29): To provide the utility function to load kernel binary and get the backend properties.	2023-06-09 09:09:59 -07:00
Nathaniel McVicar	9101736da1	[PUBLISHING] Install Azure CLI for nightly builds (#1748 ) The Azure CLI wasn't available on the builders, so this resolves that issue and updates the docs to point to the new packages. Also stops publishing Python 3.6 wheels, as that version is out of support. Temporarily disables musllinux builds, as they are broken.	2023-06-07 19:39:53 -07:00
Zahi Moudallal	e3ab942c44	[CI] Update path to artifacts (#1749 )	2023-06-06 17:05:44 -07:00
Nathaniel McVicar	035381aa28	[PUBLISHING] Enable nightlies on Azure DevOps (#1729 ) This reenables nightly builds and uploads them to a public AzDO, where you can use pip to install a specific version or the latest as normal. For example: `python -m pip install -i https://aiinfra.pkgs.visualstudio.com/PublicPackages/_packaging/Triton-Nightly/pypi/simple/ triton-nightly`	2023-06-02 16:36:22 -07:00
Zahi Moudallal	0cd8f05e01	[CI] Upload CUDA test artifacts (#1645 )	2023-05-09 23:04:23 -07:00
Paul Ganssle	319af1fb65	[CI] Build wheels for musllinux (#1638 ) Ideally you would also build source distributions so that it is in principle possible to build `triton` on other platforms, but building `musllinux` wheels would at least help with openai/whisper#1328. I suspect you will also get people showing up at some point asking for `aarch64` wheels as well. It might be worth taking a look at the [`cibuildwheel` output matrix](https://cibuildwheel.readthedocs.io/en/stable/#what-does-it-do) to see what you are comfortable with shipping (particularly if you aren't shipping source distributions).	2023-05-09 01:04:13 -07:00
Philippe Tillet	aaeba98f13	[CI] no longer runs CI job on macos-10.15 (#1624 )	2023-05-05 12:13:33 -07:00
peterbell10	c71bf73f24	[BUILD] Use a persistent directory for cmake (#1548 ) Fixes #1545 `build_temp` is a temporary directory which `distutils` used to keep in the `./build` directory, but when `pyproject.toml` is present `pip` now puts it in `/tmp` and removes it at the end of the build. Instead, this creates a new permanent directory like `python/build/cmake.linux_x86_64-cpython-3.8` (the old name but with cmake instead of temp). While I was looking at the verbose pip output, I also noticed a bunch of warnings like ``` Python recognizes 'triton/runtime.backends' as an importable package, but it is not listed in the `packages` configuration of setuptools. 'triton/runtime.backends' has been automatically added to the distribution only because it may contain data files, but this behavior is likely to change in future versions of setuptools (and therefore is considered deprecated). ``` So I've also added these to the packages list. --------- Co-authored-by: Keren Zhou <kerenzhou@openai.com>	2023-04-20 16:38:44 -07:00
Phil Tillet	92d07d1b8e	[DOCS] Fixed up workflow	2023-04-13 16:05:29 -07:00
Philippe Tillet	2aad2336e9	[DOCS] Documentation job now uses A100 GPUs (#1522 )	2023-04-13 15:35:16 -07:00
Keren Zhou	fdf1c1f2a1	[DOCS] Fix documentation workflow (#1520 ) Co-authored-by: Phil Tillet <phil@openai.com>	2023-04-13 13:49:36 -07:00
Philippe Tillet	5b9119117b	[CI] No longer install triton in editable mode to run tests (#1476 )	2023-04-12 17:55:44 -07:00
Keren Zhou	272f23457a	[DOCS] Restore the documentation workflow (#1503 ) Not sure if it works at this moment, but at least we can restore the workflow first.	2023-04-11 13:36:15 -07:00
Michael Holman	e3a763872b	[CI] Avoid using EXTRA_INDEX_URL in pip install (#1424 ) While not currently a vulnerability, using this option can introduce risks of supply chain attacks, where an attacker adds a malicious package into the default public repository with the same name as one that you are installing. The best practice is to instead use --index-url when a package is not available in the default repo, so I've updated to use this option instead. For full disclosure, this is currently causing Microsoft's internal security checks to fail when building triton (which is why I care about this theoretical issue/best practice).	2023-03-27 20:34:45 +00:00
Xuehai Pan	c52219b5c3	[SETUP] avoid using deprecated `distutils` (#1400 ) Module [`distutils`](https://docs.python.org/3/library/distutils.html) is deprecated and will be removed in Python 3.12. Ref: - `distutils` documentation: > ## [distutils](https://docs.python.org/3/library/distutils.html#module-distutils) — Building and installing Python modules > [distutils](https://docs.python.org/3/library/distutils.html#module-distutils) is deprecated with removal planned for Python 3.12. - PEP 632 – Deprecate distutils module: > [PEP 632 – Deprecate distutils module](https://peps.python.org/pep-0632) ------ This PR removes references to `distutils` and replaces them with [`packaging`](https://pypi.org/project/packaging) and [`sysconfig`](https://docs.python.org/3/library/sysconfig.html). This alleviates potential breakage in the modern Python packaging system. Changes: - Removes references to `distutils` and replaces them with [`packaging`](https://pypi.org/project/packaging) and [`sysconfig`](https://docs.python.org/3/library/sysconfig.html) - Add `cmake` and `package` in `build-system.requires` to install necessary build dependencies prior to calling `setup.py`. - Minor changes: `multiprocessing.cpu_count() -> os.cpu_count()` and add PyPI classifiers. --------- Co-authored-by: Philippe Tillet <phil@openai.com>	2023-03-27 10:37:47 -07:00
Xuehai Pan	5b36cb48ad	[CI][TEST] update `pre-commit` hooks and use `pre-commit` for style tests in CI (#1409 ) Ref issue: - #1408 Changes: - Add `.editorconfig` - Add `pre-commit-hooks`: ```yaml - repo: https://github.com/pre-commit/pre-commit-hooks rev: v4.4.0 hooks: - id: check-symlinks - id: destroyed-symlinks - id: trailing-whitespace - id: end-of-file-fixer - id: check-yaml - id: check-toml - id: check-ast - id: check-added-large-files - id: check-merge-conflict - id: check-executables-have-shebangs - id: check-shebang-scripts-are-executable - id: detect-private-key - id: debug-statements ``` - Add `flake8` to `pre-commit` config and add `.flake8` file - Use `pre-commit` for style tests in CI - Run `pre-commit` and fix existing violations: - fix trailing spaces - fix end-of-files - fix mod file mode with `chmod -x` - run `autopep8` on existing code - fix `flake8` violations	2023-03-25 14:52:16 -07:00
Michael Melesse	a9c87245b4	[ROCM] Enable ROCM Backend #1 : Empty Kernel (#1312 ) This PR is the first in a series of PRs to import the changes that we have made to enable ROCM on [our fork](https://github.com/ROCmSoftwarePlatform/triton) of triton. The PR contains the major changes to the python frontend and enough changes to the c++ backend to allow compilation and running of the empty kernel. We use the ROCM CI added a few weeks ago to verify things. --------- Co-authored-by: Ronan Keryell <ronan@keryell.fr>	2023-03-24 17:18:27 -07:00
Philippe Tillet	7c7b769e37	[SETUP] Fixed dependencies (#1389 )	2023-03-22 16:15:35 -07:00
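
The runtime backend registration described in commit b27a91a113 (#1643) can be illustrated with a minimal, standalone sketch. Only the names `BaseBackend`, `register_backend`, `get_backend`, and `ExtensionBackend` come from the commit message; the `compile_kernel` method, the constructor signature, and the `"cpu"` device-type string are assumptions made for illustration, and none of this is Triton's actual implementation.

```python
from abc import ABC, abstractmethod
from typing import Dict, Optional, Type


class BaseBackend(ABC):
    """Stand-in for the common class that abstracts third-party backend logic."""

    def __init__(self, device_type: str) -> None:
        self.device_type = device_type

    @abstractmethod
    def compile_kernel(self, name: str, source: str) -> str:
        """Hypothetical hook: produce backend-specific code for a kernel."""


# Registry mapping a device type to its backend instance (assumed design).
_backends: Dict[str, BaseBackend] = {}


def register_backend(device_type: str, backend_cls: Type[BaseBackend]) -> None:
    """Register a backend class under the given device type."""
    _backends[device_type] = backend_cls(device_type)


def get_backend(device_type: str) -> Optional[BaseBackend]:
    """Return the backend registered for the device type, or None."""
    return _backends.get(device_type)


class ExtensionBackend(BaseBackend):
    """Dummy backend in the spirit of the one the PR adds for testing."""

    def compile_kernel(self, name: str, source: str) -> str:
        # Prefix the source with a comment naming the target device.
        return f"// {self.device_type} kernel {name}\n{source}"


register_backend("cpu", ExtensionBackend)
backend = get_backend("cpu")
```

In this sketch, a lookup for an unregistered device type returns `None`, so a caller can fall back to a built-in backend; whether Triton behaves the same way is not stated in the commit message.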


92 Commits