github/ROCm - ROCm - AtHeartEngineering

mirror of https://github.com/ROCm/ROCm.git synced 2026-02-21 03:00:39 -05:00

Author	SHA1	Message	Date
Jason Furmanek	a08dafe7fe	Initial commit to resolve merge conflicts	2023-11-20 22:41:03 +00:00
Jason Furmanek	5c87f363e4	Merge commit 'cb3d79a185e40c9d8a579bea07747a8a8d157d52' into ifu-231117 Conflicts: lib/Conversion/TritonGPUToLLVM/ElementwiseOpToLLVM.cpp lib/Conversion/TritonGPUToLLVM/TritonGPUToLLVM.cpp lib/Dialect/TritonGPU/IR/Dialect.cpp python/setup.py python/test/unit/language/assert_helper.py python/test/unit/operators/test_flash_attention.py python/test/unit/runtime/test_subproc.py python/triton/compiler/compiler.py python/triton/language/semantic.py python/triton/runtime/autotuner.py python/triton/runtime/jit.py python/tutorials/03-matrix-multiplication.py python/tutorials/05-layer-norm.py python/tutorials/06-fused-attention.py python/tutorials/11-grouped-gemm.py test/Conversion/tritongpu_to_llvm.mlir	2023-11-17 20:42:12 +00:00
Jason Furmanek	977d5aa267	Merge commit '721897fcc4f942aa97d2e9ba3787a5e213758177' into ifu-231108 Conflicts: bin/triton-translate.cpp lib/Conversion/TritonGPUToLLVM/ElementwiseOpToLLVM.cpp lib/Dialect/TritonGPU/Transforms/RemoveLayoutConversions.cpp python/triton/compiler/compiler.py python/triton/runtime/jit.py python/tutorials/06-fused-attention.py test/Conversion/tritongpu_to_llvm.mlir	2023-11-08 18:51:23 +00:00
Jason Furmanek	33151a860f	Merge commit 'ac9fa68d18c777e421bd3f6fb1ddcfd60b6fda33' into ifu-rebase-again Conflicts: .gitignore .gitmodules README.md bin/triton-translate.cpp include/triton/Dialect/TritonGPU/IR/TritonGPUAttrDefs.td include/triton/Target/AMDGCN/AMDGCNTranslation.h include/triton/Target/HSACO/HSACOTranslation.h lib/Analysis/Allocation.cpp lib/Analysis/Utility.cpp lib/Conversion/TritonGPUToLLVM/CMakeLists.txt lib/Conversion/TritonGPUToLLVM/ConvertLayoutOpToLLVM.cpp lib/Conversion/TritonGPUToLLVM/ReduceOpToLLVM.cpp lib/Conversion/TritonGPUToLLVM/ScanOpToLLVM.cpp lib/Conversion/TritonGPUToLLVM/Utility.cpp lib/Conversion/TritonGPUToLLVM/Utility.h lib/Dialect/TritonGPU/IR/Dialect.cpp lib/Dialect/TritonGPU/Transforms/RemoveLayoutConversions.cpp lib/Target/HSACO/CMakeLists.txt lib/Target/HSACO/HSACOTranslation.cpp lib/Target/LLVMIR/LLVMIRTranslation.cpp python/src/triton.cc python/test/unit/language/test_core.py python/test/unit/operators/test_flash_attention.py python/triton/compiler/compiler.py python/triton/compiler/make_launcher.py python/triton/language/semantic.py python/triton/runtime/jit.py python/tutorials/06-fused-attention.py python/tutorials/11-grouped-gemm.py test/Conversion/tritongpu_to_llvm.mlir	2023-11-06 23:10:10 +00:00
Justin Lebar	df08301e76	Reformat Python code with yapf. (#2589) I've added an option to yapf to do what we want for long lines, see https://github.com/google/yapf/pull/1177. We can now have a real Python formatter, yay! To make this PR, I ran my modified yapf over the repository, then looked over the full diff. Where yapf was mangling the param list of long function decls/calls (mostly kernels), I manually added `#` to put linebreaks where we want. I fixed up other formatting too -- mostly adding or removing a trailing comma from lists. Overall, trailing `#` was sufficient to get formatting similar to our current code. I didn't have to disable yapf anywhere. --------- Co-authored-by: Phil Tillet <phil@openai.com>	2023-11-02 20:44:17 -07:00
Goran Flegar	601b95cdbb	[DEPS] bump LLVM version to llvm/llvm-project@49af650 (#2570 ) Co-authored-by: Ashay Rane <ashay@users.noreply.github.com> Co-authored-by: khasanovaa <khasanovaaliya19@gmail.com>	2023-10-31 12:06:25 -07:00
Someone	cde42e6221	[BUILD] make cuda tools vendoring optional (#2546 )	2023-10-26 23:16:41 -07:00
Justin Lebar	9b4d91b132	Add TRITON_BUILD_WITH_ASAN envvar. (#2537 ) Note that asan doesn't work with programs that use the GPU, so this is only useful for running tools like triton-opt. I was not able to get msan working. libstdc++'s std::string implementation seems to use uninitialized memory in a way that seems safe but triggers an msan error. I tried and gave up on switching to libc++ and teaching msan to ignore this error.	2023-10-24 10:30:30 -07:00
Thomas Raoux	376acb610b	[BUILD] Fix macos x86 build (#2505 ) There was a mismatch in the llvm link name	2023-10-17 09:49:09 -07:00
Mehdi Amini	721897fcc4	upgrade llvm to `b1115f8c` (NFC) (#2403 ) Co-authored-by: Thomas Raoux <thomas.raoux@openai.com> Co-authored-by: Keren Zhou <kerenzhou@openai.com> Co-authored-by: Phil Tillet <phil@openai.com>	2023-10-16 16:38:49 -07:00
Jack Taylor	47563240f8	PyTorch triton branch synchronisation (#354 ) * Restructure ROCM Library Search Currently there are a handful of ROCM dependant files which are required for triton to run. The linker(ld.lld), the include files, and multiple hip/hsa shared objects. This change will provide three search areas to find these files. All in the same order. 1. third_party/rocm. This location is within the python/triton directory and is carried over when triton is built. IF all necessary files are in this location there will be no need to have ROCM installed at all on the system. 2. $ROCM_PATH environmental variable. If this exists it will override all other locations to find ROCM necessary files 3. /opt/rocm. The default location for ROCm installations. Finding one here will notify triton that ROCM is installed in this environment To ease with step 3. A new script scripts/amd/setup_rocm_libs.sh has been added to the repo. Executing this script will cause all necessary ROCM files to be downloaded from their respective packages on repo.radeon.com and installed in third_party/rocm. Allowing for triton to run without installing the full ROCM stack. setup_rocm_libs.sh takes a env_var ROCM_VERSION if a user wishes to install a ROCM version other than the default (currently 5.4.2) When triton whls are built to support Pytorch, method 3 will be used to stay in sync with PyTorch's approach of bringing along any libraries needed and not requiring ROCM to be installed. 
(cherry picked from commit e6aea90fb3e8218cb562e5d990719112d8282702) * Fix default rocm path Running into `fatal error: hip/hip_runtime.h: No such file or directory` with latest wheel due to incorrect directory for ROCm libs (cherry picked from commit 292bae625b113eb65c66cfe4442da7a6456c988a) * setup_rocm_libs.sh manylinux refactor (cherry picked from commit f995f314ada4606cb78dc6233cd9c8effc356191) * Set setup_rocm_libs.sh to be executable (cherry picked from commit 05d67b9418cacda0d356c2102d7c1a887948b013) * Revert to using numbered so files to fix upstream (cherry picked from commit 34f8189eae57a23cc15b4b4f032fe25757e0db8e) * Remove drm script --------- Co-authored-by: Jeff Daily <jeff.daily@amd.com>	2023-10-11 15:30:39 +01:00
Jason Furmanek	74fd8e9754	Merge commit '36fc54b6f28168d3644808bfe299f1ba06a36272' into ifu230908-2 Conflicts: .gitignore bin/triton-translate.cpp include/triton/Conversion/TritonGPUToLLVM/TritonGPUToLLVMPass.h include/triton/Dialect/TritonGPU/IR/TritonGPUAttrDefs.td include/triton/Dialect/TritonGPU/IR/TritonGPUDialect.td lib/Analysis/Utility.cpp lib/Conversion/TritonGPUToLLVM/ConvertLayoutOpToLLVM/SharedToDotOperandMMAv2.cpp lib/Conversion/TritonGPUToLLVM/DotOpToLLVM.cpp lib/Conversion/TritonGPUToLLVM/ElementwiseOpToLLVM.cpp lib/Conversion/TritonGPUToLLVM/ReduceOpToLLVM.cpp lib/Conversion/TritonGPUToLLVM/TritonGPUToLLVM.cpp lib/Conversion/TritonGPUToLLVM/TritonGPUToLLVMBase.h lib/Conversion/TritonGPUToLLVM/TritonGPUToLLVMPass.cpp lib/Conversion/TritonGPUToLLVM/Utility.h lib/Dialect/Triton/Transforms/RewriteTensorPointer.cpp lib/Dialect/TritonGPU/IR/Dialect.cpp lib/Dialect/TritonGPU/Transforms/AccelerateMatmul.cpp lib/Dialect/TritonGPU/Transforms/RemoveLayoutConversions.cpp lib/Target/LLVMIR/LLVMIRTranslation.cpp python/src/triton.cc python/test/unit/runtime/test_subproc.py python/triton/compiler/compiler.py python/triton/compiler/make_launcher.py python/triton/language/semantic.py python/triton/runtime/jit.py python/tutorials/06-fused-attention.py test/Conversion/triton_to_tritongpu.mlir test/Conversion/tritongpu_to_llvm.mlir test/TritonGPU/coalesce.mlir unittest/Conversion/TritonGPUToLLVM/CMakeLists.txt	2023-10-02 18:01:04 +00:00
Justin Lebar	9bf9c20f30	[DOCS] update build instructions, and add testing instrs. (#2400 ) - Note `wheel` as a build-time dependency. - Add tips for getting a faster build. - Add instructions for running tests. - Add flag to build with ccache. (Thanks to @ThomasRaoux for most of these instructions!)	2023-09-27 22:13:03 -07:00
Justin Lebar	2a3746bac5	[BUILD] use ninja (#2318 )	2023-09-18 15:30:06 -05:00
Philippe Tillet	e686b4d6d4	[FRONTEND] interpreter rewrite (#2321 ) This is a new interpreter mode that shares semantic analysis with the JIT'ed codepath and that the Triton core team is committed to maintain	2023-09-17 14:58:50 -07:00
Justin Lebar	073aa16379	[BUILD] use ninja (#2318 )	2023-09-17 02:08:04 -07:00
Zahi Moudallal	36087a108f	[FRONTEND] Added SASS to asm dict (#2280 )	2023-09-13 21:21:01 +00:00
jon-chuang	36859aebff	[DOCS] Add MLIR Autogenerated Docs to Sphinx Docs (#2234 ) Partially fixes: https://github.com/openai/triton/issues/2226 Here are some example renderings: ![Screenshot from 2023-09-04 18-39-20](https://github.com/openai/triton/assets/9093549/e9c4af04-aeae-4021-a8db-6a4a82b59ae7) ![Screenshot from 2023-09-04 18-39-30](https://github.com/openai/triton/assets/9093549/410391b8-e07e-4bed-909c-8ce5484072d1) ![Screenshot from 2023-09-04 18-39-41](https://github.com/openai/triton/assets/9093549/f1eaef95-66c1-4506-a153-c6069e2b5072)	2023-09-06 08:17:12 +00:00
Wang Weihan	e721911705	[FRONTEND] clean build directly when executing python setup.py clean (#2238 ) Current setup.py could not clean the build directly because the default build directly has been changed in `CMakeBuild`. This PR is to clean build directly in this regard.	2023-09-04 21:31:38 -07:00
darkbuck	a3df6068b4	[BACKEND] Minor fixes found when building triton with LLVM 17/main branches (#2089 ) - These minor fixes are not specific to interface changes from LLVM main or official llvm-17 branch and can be applied on triton main branch. - https://github.com/darkbuck/triton/tree/darkbuck/main/llvm-main-branch has extra changes to build again LLVM main branch build to enable me to work on other backends on the main branch only. That's the hobby effort and just FYR.	2023-08-16 01:18:06 +00:00
Alex Collins	4ed8381fdb	Linux arm64 support (#2003 ) We are interested in having python wheels for triton built for Linux arm64 platforms, such as NVIDIA's Grace CPU. This change is fairly simple, however: - It requires a linux arm64 build of LLVM to be available (see MR here: https://github.com/ptillet/triton-llvm-releases/pull/15) - For now my changes use the LLVM build hosted here: https://github.com/acollins3/triton-llvm-releases/releases/tag/llvm-17.0.0-c5dede880d17 - The Triton release process will need to be updated to include arm64 wheels. Is this something you have time to work on @ptillet? It would be difficult for me to update this part without more access permissions. With these changes, I managed to build a set of python wheels and have hosted them here for us to use in the meantime: https://github.com/acollins3/triton/releases/tag/triton-2.1.0-arm64	2023-08-08 12:39:41 +08:00
goostavz	f1512bded1	Initial code merge of Hopper support (#2036 ) The initial code merge of Nvidia Hopper features support. Please be aware that the code merge is not finished yet and the trouble-shooting is still ongoing. The new hardware features (GMMA, TMA, STMATRIX etc.) and automatic warp-specialization are experimental for now and turned off by default. It is recommended for a trial when version 3.0 is released. The work is contributed by: ben-zhang-609, bealwang, donproc, qliu93, jsh20, allatit23, LyricZhao, ivanyinwz, goostavz & yangjunpro from Nvidia, in cooperation with: ptillet, Jokeren, ThomasRaoux & zahimoud from OpenAI. Co-authored-by: Goostav Zhu <gzhu@nvidia.com>	2023-08-07 09:53:04 +08:00
Philippe Tillet	3452615d79	[BUILD] Reverted ptxas change and fixed bug in cache key computation (#1971 )	2023-07-19 20:58:24 -07:00
Philippe Tillet	8207eabd7b	[FRONTEND][OPTIMIZER] small perf improvements (#1945 )	2023-07-14 15:11:36 -07:00
Daniyal khan	b70d07aafe	[BUILD][DOCS] updated setup.py and documentation (#1930 )	2023-07-11 11:46:28 -07:00
Goran Flegar	938a6754b4	[BUILD] Export compile commands (#1854 ) This can be used by IDEs to figure out how to correctly compile individual sources and offer semantic code completion.	2023-06-28 14:11:59 +00:00
Wang Weihan	4d3a92f1b8	[BUILD] Make sure always build_ext first (#1819 ) The third-party backend might install its python package to the `triton/third_party` python package during the build process. But the `build_py` could be executed before the `build_ext`, and then `build_py` would only copy the `packages` defined in the `setup.py` w/o the third-party related packages as the third-party backend has not been built, which is triggered by `build_ext`. Therefore, this PR refined the build order a little bit to ensure `build_ext` always happens before `build_py`.	2023-06-22 13:32:03 -07:00
Philippe Tillet	b24dc19741	[FRONTEND] cleaned up symbol names (#1782 )	2023-06-14 18:55:32 -07:00
Wang Weihan	b27a91a113	[FRONTEND] Enable triton to support register thirdparty backend at runtime (#1643 ) This PR intends to provide a mechanism to support a third-party backend at runtime to generate the backend-specific code. The mechanism provided a common class to abstract the third-party backend logic and two essential functions to register and get the third-party backend at runtime. - `BaseBackend`: A common class to abstract the third-party backend logic - `register_backend`: Register a third-party backend with a given device type - `get_backend`: Get the third-party backend with a given device type Generally, a third-party backend must inherit from `BaseBackend` and implement all the member functions according to the backend characteristics. As long as the backend implementation is ready, the third-party backend can invoke `register_backend` to register it under a given device. During the kernel compilation and execution, the mechanism will get the registered backend to generate the kernel and launcher code for a given device. This PR added a dummy backend to simulate a third-party backend and demonstrate the usage. 
- [test_device_backend.py](https://github.com/openai/triton/pull/1643/files#diff-bbe4d50624f2d11bf17c878a1ed4d422918c124c182cf9357b993240c385bea1): To define a third-party backend and register the backend - [ExtensionBackend](https://github.com/openai/triton/pull/1643/files#diff-bbe4d50624f2d11bf17c878a1ed4d422918c124c182cf9357b993240c385bea1R123): Inherit from the `BaseBackend` and implement some specific logic like [filter out some compile stages](https://github.com/openai/triton/pull/1643/files#diff-bbe4d50624f2d11bf17c878a1ed4d422918c124c182cf9357b993240c385bea1R129-R135) - [Register the `ExtensionBackend` for `CPU`](https://github.com/openai/triton/pull/1643/files#diff-bbe4d50624f2d11bf17c878a1ed4d422918c124c182cf9357b993240c385bea1R279) - [extension_backend.c](https://github.com/openai/triton/pull/1643/files#diff-169c1d08b3a0a7b343cfa3258fbc32b47e0f6c46305a112652fa1bdaaec89d29): To provide the utility function to load kernel binary and get the backend properties.	2023-06-09 09:09:59 -07:00
Ingo Müller	0c4de8ab72	[DEPENDENCIES] Update LLVM to 17.0.0 (c5dede880d17) and port changes. (#1668 ) This depends on a [pending LLVM release](https://github.com/ptillet/triton-llvm-releases/pull/10). * Implement setCalleeFromCallable in CallOp. * Cast type to ShapedType for various getters. * Improve TritonDialect::materializeConstant due to breaking change in constructor of arith::ConstantOp. * Add OpaqueProperties argument in inferReturnTypes. Co-authored-by: Philippe Tillet <phil@openai.com>	2023-05-15 21:51:14 -07:00
Michaël Benesty	858a2f0a5e	[FRONTEND] Added interpreter mode (#1573 ) Simple mechanism to run Triton kernels on PyTorch for debugging purpose (upstream from Kernl). Todo: - random grid iteration - support of atomic ops - more unit tests - cover new APIs?	2023-05-08 14:28:20 -07:00
Philippe Tillet	d338521b65	[SETUP] Removing `torch` as a test dependency (#1632 ) circular dependency is causing troubles now that our interpreter depends on torch 2.0 ...	2023-05-07 12:29:19 -07:00
Philippe Tillet	ec242430d1	[THIRD_PARTY] bumped `ptxas` version to 12.1.105 (#1574 )	2023-04-24 16:49:31 -07:00
peterbell10	c71bf73f24	[BUILD] Use a persistent directory for cmake (#1548 ) Fixes #1545 `build_temp` is a temporary directory which `distutils` used to keep in the `./build` directory, but when `pyproject.toml` is present `pip` now puts it in `/tmp` and removes it at the end of the build. Instead, this creates a new permanent directory like `python/build/cmake.linux_x86_64-cpython-3.8` (the old name but with cmake instead of temp). While I was looking at the verbose pip output, I also noticed a bunch of warnings like ``` Python recognizes 'triton/runtime.backends' as an importable package, but it is not listed in the `packages` configuration of setuptools. 'triton/runtime.backends' has been automatically added to the distribution only because it may contain data files, but this behavior is likely to change in future versions of setuptools (and therefore is considered deprecated). ``` So I've also added these to the packages list. --------- Co-authored-by: Keren Zhou <kerenzhou@openai.com>	2023-04-20 16:38:44 -07:00
Daniil Fukalov	a90a2d864f	[BUILD] Add ability to build with clang+lld. (#1544 ) This way reduces build time with assertions enabled LLVM and dramatically speeds up triton's build with a "debug" LLVM. Co-authored-by: Philippe Tillet <phil@openai.com>	2023-04-18 21:20:12 +00:00
Philippe Tillet	e5c7d2a83c	[FRONTEND] cleaned up language; added frontend function for `globaltimer` special register (#1525 )	2023-04-14 15:29:27 -07:00
Philippe Tillet	c0d86d3b04	[RUNTIME] refactor driver (#1515 ) Improved separation between different backends	2023-04-12 23:50:44 -07:00
Phil Tillet	d7d62ddae9	Revert "[BUILD] Fixed typo in setup.py" This reverts commit `2931bb8195`.	2023-04-11 20:12:22 -07:00
Phil Tillet	2931bb8195	[BUILD] Fixed typo in setup.py	2023-04-11 20:09:09 -07:00
Philippe Tillet	e0d6f5f4f5	[BUILD] updated LLVM binaries (#1504 ) Co-authored-by: Christian Sigg <csigg@google.com>	2023-04-11 00:14:00 -07:00
Eta	577cafff0a	[BUILD] Add missing subpackages to build (#1475 ) The `triton/compiler`, `triton/runtime/driver`, and `triton/third_party` subpackages were missing from the distribution built with the old `setup.py` after #1464, causing an immediate error upon importing Triton with a non-editable installation. This change adds the missing Python subpackages and moves `triton/third_party` inclusion to `MANIFEST.in`, where it will automatically be included in wheels due to the existing `include_package_data` setup flag.	2023-04-04 22:41:08 -07:00
Philippe Tillet	053af4e9f8	[FRONTEND] Refactor file hierarchy (#1464 ) The purpose of this PR is to remove some circular dependencies and separate concerns better in the frontend. It's still not perfect -- `triton.compile` still includes a few runtime architecture-specific component, but at least much better than before. This PR still assumes that AMD only supports empty kernels right now. Other PRs will follow to make the frontend supports multiple devices in a more modular way.	2023-04-02 12:07:08 -07:00
Francisco Massa	c1b057eee9	[FRONTEND] Add option to specify number of compilation threads during Triton compilation (#1450 ) On some machines, the amount of available RAM might not be enough to compile Triton with `2 * num_cpus` parallelism. For example, CircleCI's `large` instance can't handle Triton compilation as is due to insufficient memory. Instead, I propose to take PyTorch's approach where we can define a [`MAX_JOBS` env var](`0e4ddc2b40/tools/setup_helpers/cmake.py (L366-L368)`) that gives the user the possibility to reduce (or increase) the parallelism during compilation. Co-authored-by: Philippe Tillet <phil@openai.com>	2023-03-31 11:34:18 -07:00
Philippe Tillet	fe76b12354	[BUILD] Back to cmake >= 3.18 (#1428 )	2023-03-27 16:47:34 -07:00
Xuehai Pan	c52219b5c3	[SETUP] avoid using deprecated `distutils` (#1400) Module [`distutils`](https://docs.python.org/3/library/distutils.html) is deprecated and will be removed in Python 3.12. Ref: - `distutils` documentation: > ## [distutils](https://docs.python.org/3/library/distutils.html#module-distutils) — Building and installing Python modules > [distutils](https://docs.python.org/3/library/distutils.html#module-distutils) is deprecated with removal planned for Python 3.12. - PEP 632 – Deprecate distutils module: > [PEP 632 – Deprecate distutils module](https://peps.python.org/pep-0632) ------ This PR removes references to `distutils` and replaces them with [`packaging`](https://pypi.org/project/packaging) and [`sysconfig`](https://docs.python.org/3/library/sysconfig.html). Alleviate potential breakage in the modern Python packaging system. Changes: - Removes references to `distutils` and replaces them with [`packaging`](https://pypi.org/project/packaging) and [`sysconfig`](https://docs.python.org/3/library/sysconfig.html) - Add `cmake` and `package` in `build-system.requires` to install necessary build dependencies prior to calling `setup.py`. - Minor changes: `multiprocessing.cpu_count() -> os.cpu_count()` and add PyPI classifiers. --------- Co-authored-by: Philippe Tillet <phil@openai.com>	2023-03-27 10:37:47 -07:00
Philippe Tillet	7c7b769e37	[SETUP] Fixed dependencies (#1389 )	2023-03-22 16:15:35 -07:00
Philippe Tillet	e4b2d1bc3d	[FRONTEND][BACKEND] no longer using indices for loops (#1370 )	2023-03-19 14:57:50 -07:00
Stonepia	109b5e2729	[BUILD] Fix the build bug when user use system package of llvm by setting `LLVM_SYSPATH` (#1336) When the user sets `LLVM_SYSPATH` to use a custom-built LLVM, the build throws an error because there is no version.txt under the custom build. This PR skips the version check if `LLVM_SYSPATH` is set. --------- Co-authored-by: Philippe Tillet <phil@openai.com>	2023-03-15 13:28:19 -07:00
Phil Tillet	773c29cfaa	[BUILD] Fix comment typo	2023-03-07 16:47:30 -08:00
Phil Tillet	305f99e614	[BUILD] Fixed typo in setup.py	2023-03-07 15:45:36 -08:00
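
Commit 47563240f8 above describes three places where ROCm files are looked up: the bundled `third_party/rocm` directory, `$ROCM_PATH`, and `/opt/rocm`. A minimal sketch of that lookup, following the order in which the commit message lists them, might look like the code below. The function names, the `package_dir` parameter, and the use of `include/hip/hip_runtime.h` as a marker file are assumptions made for illustration; this is not the code from the commit. The message also says `$ROCM_PATH` "will override all other locations", which this sketch does not model.

```python
import os
from pathlib import Path
from typing import List, Mapping, Optional

# Assumption: a directory counts as a usable ROCm tree if this header exists.
# The commit only mentions `hip/hip_runtime.h`; the `include/` prefix is a guess.
MARKER = Path("include") / "hip" / "hip_runtime.h"


def candidate_rocm_dirs(package_dir: Path, env: Mapping[str, str]) -> List[Path]:
    """Return the search locations in the order listed in the commit message."""
    candidates = [package_dir / "third_party" / "rocm"]  # 1. bundled copy
    rocm_path = env.get("ROCM_PATH")
    if rocm_path:
        candidates.append(Path(rocm_path))               # 2. $ROCM_PATH
    candidates.append(Path("/opt/rocm"))                 # 3. default install
    return candidates


def find_rocm(package_dir: Path, env: Optional[Mapping[str, str]] = None) -> Optional[Path]:
    """Return the first candidate containing the marker file, or None."""
    env = os.environ if env is None else env
    for directory in candidate_rocm_dirs(Path(package_dir), env):
        if (directory / MARKER).is_file():
            return directory
    return None
```

Calling `find_rocm(Path("python/triton"))` on a machine with none of the three locations populated returns `None`; a caller would then have to report that ROCm is unavailable.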


140 Commits
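
Commit b27a91a113 in the list above describes a runtime registration mechanism built from `BaseBackend`, `register_backend`, and `get_backend`. The standalone sketch below illustrates the general pattern only. Those three names come from the commit message; the signatures, the `compile_kernel` method, the module-level dictionary, and `DummyBackend` are hypothetical and should not be read as Triton's actual interface.

```python
from abc import ABC, abstractmethod
from typing import Dict, Optional, Type


class BaseBackend(ABC):
    """Common base class that a third-party backend would inherit from."""

    def __init__(self, device_type: str) -> None:
        self.device_type = device_type

    @abstractmethod
    def compile_kernel(self, source: str) -> str:
        """Hypothetical hook: turn kernel source into backend-specific code."""


# Hypothetical registry mapping a device type to its backend instance.
_backends: Dict[str, BaseBackend] = {}


def register_backend(device_type: str, backend_cls: Type[BaseBackend]) -> None:
    """Register a backend for a device type; the first registration wins."""
    if device_type not in _backends:
        _backends[device_type] = backend_cls(device_type)


def get_backend(device_type: str) -> Optional[BaseBackend]:
    """Return the backend registered for a device type, or None."""
    return _backends.get(device_type)


class DummyBackend(BaseBackend):
    """Stand-in backend that only tags the source with its device type."""

    def compile_kernel(self, source: str) -> str:
        return f"[{self.device_type}] {source}"


register_backend("cpu", DummyBackend)
```

With this in place, `get_backend("cpu")` returns the `DummyBackend` instance, and an unregistered device type yields `None`, so the caller can fall back to a built-in code path.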