- Significant simplification of the optimizer pipeline. The right MMA
version is now set directly after the coalescing pass. DotOperand
layouts no longer hold an `isRow` state and instead query it from
their parent.
- Moved several helpers from TritonGPUToLLVM/DotOpHelpers to
TritonGPUAttrDefs. All MMAv1 state is now queried from attributes.
- Logic for `getElemsPerThread` is no longer duplicated in the TypeConverter.
`_triton.runtime.num_sm`, `_triton.runtime.clock_rate`, and
`_triton.runtime.cc` no longer seem to exist; use the corresponding
methods from `get_max_tensorcore_tflops` in the same file instead.
* Cleaned up the pipeline pass. It now works when there are
elementwise ops between the load and the dot (see the sketch below).
* Made `splat` compatible with variables that have a DotOperandLayout.
* Moved rematerialization utils to a separate Transforms/Utility.cpp file.
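
To illustrate the pipelining fix above, here is a minimal, hypothetical
Triton kernel (illustrative names; it computes only the first output tile
and assumes K is a multiple of BLOCK_K) where an elementwise multiply sits
between the loads and the dot inside the K-loop, a pattern the pipeliner
previously could not handle:

```python
import triton
import triton.language as tl

@triton.jit
def matmul_scaled(a_ptr, b_ptr, c_ptr, scale, M, N, K,
                  BLOCK_M: tl.constexpr, BLOCK_N: tl.constexpr,
                  BLOCK_K: tl.constexpr):
    # Single-tile sketch: offsets for the first BLOCK_M x BLOCK_N tile.
    offs_m = tl.arange(0, BLOCK_M)
    offs_n = tl.arange(0, BLOCK_N)
    offs_k = tl.arange(0, BLOCK_K)
    acc = tl.zeros([BLOCK_M, BLOCK_N], dtype=tl.float32)
    for k in range(0, K, BLOCK_K):
        a = tl.load(a_ptr + offs_m[:, None] * K + (k + offs_k)[None, :])
        b = tl.load(b_ptr + (k + offs_k)[:, None] * N + offs_n[None, :])
        a = a * scale  # elementwise op between the load and the dot
        acc += tl.dot(a, b)
    tl.store(c_ptr + offs_m[:, None] * N + offs_n[None, :], acc)
```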
* Frontend:
- `int` kernel arguments are always signed
- The loop induction variable is now determined by integer promotion
on lb/ub/step
* Optimizer:
- Added a new ExtractSliceOp that enforces 32-bit offsets
* Backend:
- Use 64-bit indices when lowering functions and control flow
- Removed `idx_val` macro and replaced it with `i32_val`
- Cleaned up comments
- Added a new ArithToIndex pass to ensure that operations on indices
are done with the `index` dialect, which is converted to LLVM
separately using a 64-bit target
Per issue https://github.com/openai/triton/issues/1228, I believe we
are potentially exposed when a Triton executor (PyTorch, for example)
links in two or more `triton_.so` shared objects and each has a stub
for `_launch`.
This fix ensures the `_launch` function is bound locally to the calling
`__triton_launcher` and cannot be misused by another library.
Python 3.10 changes where packages are installed by default, causing
problems on Ubuntu with paths under `/usr/local`. See
[this](https://lists.debian.org/debian-python/2022/03/msg00039.html) and
[this](https://bugs.launchpad.net/ubuntu/+source/python3.10/+bug/1967920).
Triton seems to break under 3.10 because it looks for the Python
headers, but they are not under `/usr/local`; e.g. they are at
`/usr/include/python3.X` rather than `/usr/local/include/python3.X`.
Not 100% sure what's going on here since it's deep in python/pip, but I
think this should fix it. Otherwise, you have to hack around it in
Dockerfiles, e.g. `ENV DEB_PYTHON_INSTALL_LAYOUT=deb`, which breaks
things with the latest pip release.
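
One way to check where a given interpreter actually expects its C headers
(purely a diagnostic sketch, not part of this change):

```python
import sysconfig

# Directories where this interpreter expects Python.h to live; on
# Debian/Ubuntu Python 3.10 these point at /usr/include/python3.10
# rather than /usr/local/include/python3.10.
print(sysconfig.get_path("include"))
print(sysconfig.get_path("platinclude"))
```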
---------
Co-authored-by: Keren Zhou <kerenzhou@openai.com>
Fix issue https://github.com/openai/triton/issues/244:
- Check that `end` is greater than `start`.
- Check that the range fits in `int32`.
- Check that the number of elements is less than or equal to
`TRITON_MAX_TENSOR_NUMEL = 131072`.
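
A minimal sketch of the three checks (the helper name is hypothetical; the
real validation lives in the Triton frontend and may differ in wording and
placement):

```python
TRITON_MAX_TENSOR_NUMEL = 131072

def check_range(start: int, end: int) -> None:
    # 1. end must be strictly greater than start.
    if end <= start:
        raise ValueError(f"end ({end}) must be greater than start ({start})")
    # 2. Both bounds must fit in a signed 32-bit integer.
    if start < -2**31 or end > 2**31 - 1:
        raise ValueError("range must fit in int32")
    # 3. The element count is capped at TRITON_MAX_TENSOR_NUMEL.
    if end - start > TRITON_MAX_TENSOR_NUMEL:
        raise ValueError(
            f"range produces more than {TRITON_MAX_TENSOR_NUMEL} elements")
```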
---------
Co-authored-by: Philippe Tillet <phil@openai.com>
This pull request addresses a crash that occurs when casting to a
`tl.constexpr` type in the frontend.
More info and repro code are available in
https://github.com/openai/triton/issues/1221
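
For context, the crashing pattern was of roughly the following shape (a
hypothetical minimal sketch; see the linked issue for the actual repro):

```python
import triton
import triton.language as tl

@triton.jit
def cast_kernel(x_ptr, y_ptr, DTYPE: tl.constexpr):
    x = tl.load(x_ptr)
    # Casting to a dtype held in a tl.constexpr used to crash the frontend.
    y = x.to(DTYPE)
    tl.store(y_ptr, y)
```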
Make CMake happier; it doesn't like multiple `target_link_libraries`
definitions for the same name.
Use `find_package` instead on Windows for dlfcn-win32.
Set LLVM_SYS_PATH on Windows for the Python setup.
Debug build almost working; an AlwaysCreate error is still thrown.
Minor bug: the AutoTuner currently throws the following error when
certain configs go `OutOfResources` (e.g. the matmul example when
testing on GPUs with less shared memory).