For example, when given `--convert-triton-gpu-to-llvm="is-rocm=true"`,
`ConvertTritonGPUToLLVMPass` should generate ROCM-compatible LLVM.
Before this PR, transformation options passed on the command line were
not respected.
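For illustration, the `is-rocm=true` payload follows MLIR's textual pass-option syntax: `key=value` pairs attached to the pass name. The sketch below shows roughly how such a spec decomposes; it is a simplified illustration, not MLIR's actual `PassRegistry` parser, and `parse_pass_option_spec` is a hypothetical helper:

```python
def parse_pass_option_spec(spec: str):
    """Split a command-line pass spec such as
    convert-triton-gpu-to-llvm="is-rocm=true" into the pass name and a
    dict of its options. Simplified sketch only; real parsing is done
    by MLIR's pass infrastructure."""
    name, sep, opts = spec.partition("=")
    options = {}
    if sep:
        # Options arrive as space-separated key=value pairs.
        for pair in opts.strip('"').split():
            key, _, value = pair.partition("=")
            options[key] = value
    return name, options

print(parse_pass_option_spec('convert-triton-gpu-to-llvm="is-rocm=true"'))
# → ('convert-triton-gpu-to-llvm', {'is-rocm': 'true'})
```

The point of the fix is that the parsed options actually reach the pass constructor instead of being dropped.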
No functional changes intended, and it might slightly speed up the
build.
This allows a downstream Bazel build of Triton to avoid building a
number of dialects and passes that Triton doesn't need.
This is the initial code merge of Nvidia Hopper feature support. Please
be aware that the code merge is not finished yet and troubleshooting is
still ongoing. The new hardware features (GMMA, TMA, STMATRIX, etc.)
and automatic warp specialization are experimental for now and turned
off by default. We recommend trying them once version 3.0 is released.
The work is contributed by:
ben-zhang-609, bealwang, donproc, qliu93, jsh20, allatit23, LyricZhao,
ivanyinwz, goostavz & yangjunpro
from Nvidia, in cooperation with:
ptillet, Jokeren, ThomasRaoux & zahimoud
from OpenAI.
Co-authored-by: Goostav Zhu <gzhu@nvidia.com>
That library makes use of the `dladdr` function, so it eventually needs
to be linked with `-ldl`, which may not be done automatically. This
commit adds a link dependency on `${CMAKE_DL_LIBS}`, which is CMake's
portable way of specifying that library.
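As a minimal way to see what `dladdr` does, here is a sketch via Python's `ctypes` rather than the C++ code this commit touches. The `Dl_info` layout matches the glibc definition; note that on glibc >= 2.34 `dladdr` lives in libc itself rather than a separate libdl, which is exactly why letting `${CMAKE_DL_LIBS}` pick the right flag per platform is preferable to hard-coding `-ldl`:

```python
import ctypes

# dladdr() maps an address back to the shared object containing it.
class Dl_info(ctypes.Structure):
    _fields_ = [
        ("dli_fname", ctypes.c_char_p),  # path of the containing object
        ("dli_fbase", ctypes.c_void_p),  # base address it was loaded at
        ("dli_sname", ctypes.c_char_p),  # nearest symbol name
        ("dli_saddr", ctypes.c_void_p),  # address of that symbol
    ]

libc = ctypes.CDLL(None)  # handle to the running process's symbols
dladdr = libc.dladdr
dladdr.argtypes = [ctypes.c_void_p, ctypes.POINTER(Dl_info)]
dladdr.restype = ctypes.c_int

info = Dl_info()
addr = ctypes.cast(libc.printf, ctypes.c_void_p)  # any resolvable symbol
if dladdr(addr, ctypes.byref(info)):  # nonzero return means success
    print("printf resolved from:", info.dli_fname.decode())
```

In CMake the fix corresponds to `target_link_libraries(<target> PRIVATE ${CMAKE_DL_LIBS})`, which expands to `-ldl` where needed and to nothing on platforms that do not require it.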
Also try to switch API accesses to the new upstream APIs that
explicitly separate access to "discardable" and "inherent" attributes
(the latter being stored in properties now).
Generic accessors like `getAttr()`, `setAttr()`, and `setAttrs()` are
much more expensive and should be avoided.
The purpose of this PR is to remove some circular dependencies and
separate concerns better in the frontend. It's still not perfect --
`triton.compile` still includes a few runtime architecture-specific
components, but it is at least much better than before.
This PR still assumes that AMD only supports empty kernels right now.
Follow-up PRs will make the frontend support multiple devices in a
more modular way.
This PR is a first in a series of PRs to import the changes that we have
made to enable ROCM on [our
fork](https://github.com/ROCmSoftwarePlatform/triton) of triton.
The PR contains the major changes to the Python frontend and enough
changes to the C++ backend to allow compiling and running the empty
kernel. We use the ROCm CI added a few weeks ago to verify things.
---------
Co-authored-by: Ronan Keryell <ronan@keryell.fr>
* Cleaned up the pipeline pass. It now works when there are
element-wise ops between the load and the dot
* Made `splat` compatible with variables that have DotOperandLayout
* Moves rematerialization utils to separate Transforms/Utility.cpp file.
* Frontend:
- `int` kernel arguments are always signed
- Loop induction variable is now determined by integer promotion on
lb/ub/step
* Optimizer:
- Added new ExtractSliceOp that enforces 32-bit offsets
* Backend:
- Use 64-bit indices when lowering functions and control flow
- Removed `idx_val` macro and replaced it with `i32_val`
- Cleaned up comments
- Added a new ArithToIndex pass to make sure operations on indices are
done with the `index` dialect, which gets converted to LLVM separately
using a 64-bit target
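The integer-promotion rule for the induction variable can be modeled roughly as follows. This is a hypothetical `promote` helper for illustration, not Triton's actual frontend code, and signedness handling is omitted:

```python
# Hypothetical sketch of C-style integer promotion over lb/ub/step;
# Triton's real rule lives in the frontend's semantic analysis.
WIDTH = {"i1": 1, "i8": 8, "i16": 16, "i32": 32, "i64": 64}

def promote(*types):
    """Return the narrowest type that is at least 32 bits wide and at
    least as wide as every operand."""
    width = max(32, *(WIDTH[t] for t in types))
    return f"i{width}"

# The loop induction variable takes the promoted type of lb/ub/step:
print(promote("i32", "i64", "i32"))  # → i64
print(promote("i16", "i16", "i16"))  # → i32 (narrow types are promoted)
```

So a loop whose bound is an `i64` argument gets a 64-bit induction variable, while all-narrow bounds still yield a 32-bit one.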
- Dependent CUDA files (ptxas, cuda.h, libdevice.bc.10) are now packaged in
`triton/third_party/cuda`. `ptxas` is downloaded from the conda repo at
install time.
- Can now be built with an old glibc (such as the one used by manylinux2014)
This fix enables support on sm_90 (otherwise it would crash).
Logs like
> 'sm_90' is not a recognized processor for this target (ignoring
processor)

can be ignored and should be eliminated once the LLVM NVPTX backend is
updated.