github/ROCm - ROCm - AtHeartEngineering

mirror of https://github.com/ROCm/ROCm.git synced 2026-04-05 03:01:17 -04:00

Author	SHA1	Message	Date
Rohit Santhanam	cd9ae1cd36	Merge remote-tracking branch 'upstream/main' into triton-mlir-IFU-02232023	2023-02-23 21:41:54 +00:00
Eric Wang	320ae18093	[FRONTEND] Add error messages for arange (#1218 ) Fix issue https://github.com/openai/triton/issues/244 Check `end` is greater than `start`. Check if the range can fit in `int32`. Check the number of elements less than or equal to `TRITON_MAX_TENSOR_NUMEL = 131072`. --------- Co-authored-by: Philippe Tillet <phil@openai.com>	2023-02-22 00:37:28 +00:00
Rohit Santhanam	841784d1e3	Merge remote-tracking branch 'upstream/main' into upgrade_triton_mlir_rocm_to_llvm_head	2023-02-18 09:25:20 +00:00
Philippe Tillet	4d067f5120	[FRONTEND] Now emit an error for `tl.reshape`, instead of silently calling `tl.view` (#1212 )	2023-02-17 20:21:20 -08:00
Rohit Santhanam	a2416e0901	Merge remote-tracking branch 'upstream/main' into triton-mlir-IFU-02112023	2023-02-11 14:48:19 +00:00
Philippe Tillet	2aba985daa	[OPTIMIZER] Improved layout simplifications heuristics (#1168 )	2023-02-09 20:17:25 -08:00
fdrocha	972b761390	[FRONTEND] For __rshift__ operator, use arithmetic right shift if dtype is a signed int. (#1153 )	2023-02-06 10:26:17 +00:00
Rohit Santhanam	8cb6ab5b1a	Merge remote-tracking branch 'upstream/main' into triton_mlir_IFU_02022023	2023-02-02 22:54:53 +00:00
Philippe Tillet	8fea1fb478	[FRONTEND] Adding static range (#1130 ) Included: Revert "[BACKEND] Replace `mlir::topologicalSort` with a custom implementation (#1113)"	2023-01-31 18:04:19 -08:00
Philippe Tillet	c4b9d699d2	[FRONTEND][BACKEND] Fixed many bugs (#1122 ) - temporarily commenting assertion in `MemBar.cpp`. We need to fix this! but for now the following patches will unblock a number of users. - Fixed frontend codegen issue for If / For / While. Emit an error when replaced values' type mismatch. - Added "top level" codepath for if statements, which allows users to write patterns to exit early from kernels (e.g., `if cond1: if cond2: return else: ...`). Added associated codegen in TritonToTritonGPUPass - Added basic control flow tests - Pipeline pass is no longer activated when memory accesses can't be vectorized - Added missing magic methods to `constexpr` - Fixed issue in random.py: bitcast some values to uint when they need to be. - Added support for `Not` - Fixed nondeterministic compilation issue	2023-01-30 23:22:36 -08:00
Yan Chunwei	94b419c327	[FRONTEND] some tiny fix (#1120 )	2023-01-30 19:39:38 -08:00
Rohit Santhanam	2d0ee0fa0f	Merge remote-tracking branch 'upstream/main' into triton-mlir-IFU-01232023	2023-01-24 03:59:17 +00:00
Nishant Sikarwar	7687f85ca4	[FRONTEND] decorating static methods with @staticmethod (#1069 )	2023-01-17 14:35:06 -08:00
Keren Zhou	3f47e9aa0e	[BACKEND] Fix unrealized conversion for fp32 dot (#1051 )	2023-01-17 21:55:44 +00:00
Nishant Sikarwar	4a74d6eae9	[FRONTEND] replaced chains comparison operator with `in` (#1059 )	2023-01-15 20:14:35 +00:00
Rohit Santhanam	ce8adb92bd	Merge remote-tracking branch 'upstream/master' into triton-mlir-IFU-01142023	2023-01-14 19:19:58 +00:00
Keren Zhou	4023149ee3	[Frontend] Convert constexpr to value for store and load ops (#1030 ) Fixing problem 2 in https://github.com/openai/triton/issues/1017 Co-authored-by: Philippe Tillet <phil@openai.com>	2023-01-05 14:40:16 -05:00
Sophia Wisdom	411bacb2a8	[FRONTEND] Add logical operations on constexprs (#1033 )	2023-01-04 18:06:32 -08:00
Keren Zhou	c9e7385255	[FRONTEND] Fix 3d indexing (#1006 )	2023-01-04 14:59:31 +00:00
Keren Zhou	b5aafb0dab	[FRONTEND] Fix 3d indexing (#1006 )	2022-12-21 12:52:32 -08:00
Michael Melesse	41578a63d2	Merge remote-tracking branch 'upstream/triton-mlir' into triton-mlir-IFU	2022-12-21 12:53:03 -06:00
Philippe Tillet	20100a7254	Merge `triton-mlir` branch - Complete rewrite of the backend from scratch (#1004 ) This PR merges the `triton-mlir` branch, in which we have been quietly rewriting the Triton backend from scratch to increase maintainability, stability and ultimately performance. Changes to the runtime are minimal, and this new version aims to remain backward-compatible with the previous commit. The legacy backend is now officially deprecated, but can still be accessed via the `legacy-backend` tag. Co-authored-by: Keren Zhou <kerenzhou@openai.com> Co-authored-by: Yan Chunwei <yanchunwei@outlook.com> Co-authored-by: goostavz <109190422+goostavz@users.noreply.github.com> Co-authored-by: Shintaro Iwasaki <siwasaki@fb.com> Co-authored-by: Yan Da <dyanab@connect.ust.hk> Co-authored-by: Jun Yang <yangjunpro@gmail.com> Co-authored-by: Ian Bearman <ianb@microsoft.com> Co-authored-by: Jason Ansel <jansel@jansel.net> Co-authored-by: Qingyi Liu <qingyil@nvidia.com> Co-authored-by: ben-zhang-609 <110140741+ben-zhang-609@users.noreply.github.com> Co-authored-by: Chenggang Zhao <lyricz@yeah.net> Co-authored-by: ben-zhang-609 <benzh609@gmail.com> Co-authored-by: dongdongl <dongdongl@nvidia.com>	2022-12-21 01:30:50 -08:00
Philippe Tillet	899bb0a0e7	[FORMAT] Run `clang-format`, `autopep8` and `isort` (#1000 )	2022-12-20 17:47:34 -08:00
Keren Zhou	be2f70699c	[BACKEND][FRONTEND] Fix problems with test_matmul (#973 ) 1. Handle induction variable when step is negative 2. Restore async_wait that accidentally deleted 3. Add missing induction variable in prefetch 4. Add device property functions Co-authored-by: Philippe Tillet <Phil.Tillet@gmail.com>	2022-12-10 20:34:58 -08:00
Rohit Santhanam	dbe1b2aafb	AMDGCN fixes for libdevice.py.	2022-12-08 19:08:26 +00:00
Philippe Tillet	b2b793dfb5	[FRONTEND][BACKEND] Fixes for cat / reshape / addptr (#959 ) Most notably, this PR: - changes the traits (and assembly format) of addptr so it can handle offsets that have arbitrary integer width. - adds support for `cat`	2022-12-06 23:29:50 -08:00
Philippe Tillet	981aee7f1e	[FRONTEND] Frontend fixes for uint / for loops / random (#958 )	2022-12-06 20:25:47 -08:00
Philippe Tillet	115cd3ac47	[FRONTEND] Added `reshape` as an alias for `view` (for now) (#956 )	2022-12-06 09:57:05 -08:00
Philippe Tillet	532e10cf87	[FRONTEND][BACKEND] Clean-up transpositions (#953 )	2022-12-06 09:32:13 -08:00
Crutcher Dunnavant	189491727a	[FRONTEND] Extract and unify @builtin/@extern (#913 ) This change attaches builtin-ness as an explicit attribute, rather than a module prefix expectation. This permits us to source those builtins from multiple sub-modules (useful when some builtins are part of the true cyclic implementation core, and some are just useful library additions); but also prevents accidental inclusion of non-builtins that happen to be in the right library. Once the flag exists, and the compiler is using `is_builtin()` for decision making; the existence of the current `@extern` interface becomes isomorphic to `@builtin`; and the interface can be unified. Leaving `@extern` a thin-wrapper, and encouraging continued use of it, establishes future-proofing towards adding additional extern tracing, metric hooks, or scanning in the future. * Add `triton.impl` package to hold the core, order dependent impl details. * Extract `@builtin` and unify `@extern`; add `is_builtin()` * Add sense bit so that `@builtin` detection is less fragile. * Modify the compiler to use `is_builtin()`	2022-12-05 22:59:41 +00:00
Crutcher Dunnavant	e0072d210a	[FRONTEND] Propagate mypy types through @jit, @builtin, etc (#915 ) Changes to make decorated API methods no longer type-opaque. ``` $ echo 'import triton; reveal_type(triton.language.max)' | mypy /dev/stdin /dev/stdin:1: note: Revealed type is "def (input: Any, axis: Any, _builder: Any =) -> Any" Success: no issues found in 1 source file ```	2022-12-05 22:41:02 +00:00
Crutcher Dunnavant	2fa17588f7	[FRONTEND] Expand __init__ * imports, add __all__ (#912 ) Expand `from .foo import *` to full listings, and `__all__` sections. This reifies the module export listings, which is useful for code importing this module; without this, clients will need special `mypy` control pragmas for this library. This removes a number of `# flake8` control pragmas. Verified with `flake8`	2022-12-05 14:22:55 -08:00
Philippe Tillet	8edfe813a5	[FRONTEND][BACKEND] Added `trans` instruction; made flash attention bwd pass work (#943 )	2022-12-03 09:58:24 -08:00
Yang Hau	8650b4d1cb	[DRIVER] Fix typos (#939 )	2022-12-02 11:13:46 -08:00
Philippe Tillet	6461254fb5	[BACKEND] Make flash attention forward pass work (#928 ) This also simplifies BroadcastOp codegen	2022-11-30 10:13:24 +00:00
Qingyi Liu	9d31998a9d	[Triton-MLIR][BACKEND] Add argmin / argmax implementation for ReduceOp (#918 )	2022-11-27 22:59:27 -08:00
Philippe Tillet	4d64ffb5fe	[FRONTEND] Handle for loops with negative constant steps (#896 )	2022-11-20 11:37:38 +01:00
Chenggang Zhao	516a241234	[Triton-MLIR] Fix some typos (#874 ) Fix some typos	2022-11-13 18:15:53 -08:00
Chenggang Zhao	57fd1864a7	[Triton-MLIR] Support FP8 (#864 ) Co-authored-by: Superjomn <yanchunwei@outlook.com>	2022-11-10 15:53:06 +08:00
ben-zhang-609	5feb6e24f9	[Triton-MLIR]Add ptx vprintf support (#825 ) Not know how to write unit test for this feature. Co-authored-by: Yan Chunwei <yanchunwei@outlook.com>	2022-11-02 16:39:09 +08:00
Philippe Tillet	e61dc75942	[FRONTEND] Fixed inliner and got more tests to pass (#822 ) This adds a `DialectInlinerInterface` to the Triton dialect. This, along with a few other minor semantic changes, fixes our tests on call instructions. Also added the option to provide an "LLVM_SYSPATH" environment variable to link against a local build of LLVM; this was useful for debugging this issue.	2022-10-30 14:10:02 -07:00
Philippe Tillet	3e6cc6d66c	[FRONTEND] Made more tests pass (#805 )	2022-10-26 17:47:33 -07:00
Philippe Tillet	fcb228d1d4	Merge select commits from `master` branch into `triton-mlir` (#799 ) Co-authored-by: Keren Zhou <kerenzhou@openai.com> Co-authored-by: vesuppi <zt9465@gmail.com> Co-authored-by: Jason Ansel <jansel@jansel.net> Co-authored-by: daadaada <dyanab@connect.ust.hk> Co-authored-by: Anton Kostin <masguit42@users.noreply.github.com> Co-authored-by: Yunxing Dai <nov503@gmail.com> Co-authored-by: Shintaro Iwasaki <shintaro.iwasaki.work@gmail.com>	2022-10-24 14:52:37 -07:00
Philippe Tillet	bb0f9235d1	[OPTIMIZER] Made layout simplification pass efficient for fused attention kernels (#790 )	2022-10-21 16:52:15 -07:00
Shintaro Iwasaki	0d22d2bc03	[TritonMLIR] Disallow 0D tensor (#788 )	2022-10-19 10:34:32 -07:00
Yu Guo	71b46acc42	[IR] Added special-purpose `dequantize` instruction (#759 ) It is currently necessary for optimal performance in quantized workloads to add a special-purpose instruction in the IR. Backward compatibility with this instruction is NOT guaranteed.	2022-10-12 14:14:45 -07:00
Yan Chunwei	555f94f9b9	[triton-mlir][BACKEND] Support masked load/store (#657 ) This PR does - fix some bugs to support masked load/store, - refine frontend, and support the `and` and `or` syntax in mask(by extending the BoolOp in python ast.visitor), e.g. `tl.store(..., mask=offset<n and other_conditions)`, - add `arith.cmpI` and `arith.cmpF` op conversion in backend(required by mask), - add more test cases in vecadd.	2022-10-10 13:29:53 +08:00
Philippe Tillet	4a77dfb042	[FRONTEND] Complete rewrite of the runtime (#644 ) This PR completely rewrites the runtime of Triton to be more lean and clearly separate the compilation step from the just-in-time caching logic. This should substantially reduce launch overhead.	2022-09-18 08:51:48 -07:00
Shintaro Iwasaki	13669b46a6	[DOCS] Correct spelling (#665 ) This PR corrects spelling like #664 for Triton-MLIR. It should not break anything.	2022-09-16 15:07:34 -07:00
Shintaro Iwasaki	c668d6596e	[DOCS] Fix spelling (#664 ) This PR applies minor spelling fix in comments and string literals to `master`. It shouldn't hurt anything.	2022-09-16 12:26:40 -07:00
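
Commit 320ae18093 lists three checks for `arange`: `end` must be greater than `start`, the range must fit in `int32`, and the number of elements must not exceed `TRITON_MAX_TENSOR_NUMEL = 131072`. The following is a minimal sketch of those checks in plain Python, not the code from the commit. The helper name `check_arange`, the exact int32 bounds test, and the error wording are assumptions made for illustration; only the constant's name and value come from the commit message.

```python
# Illustrative sketch only; not the Triton implementation.
TRITON_MAX_TENSOR_NUMEL = 131072  # value quoted in the commit message

INT32_MIN = -(2**31)
INT32_MAX = 2**31 - 1


def check_arange(start: int, end: int) -> int:
    """Hypothetical helper: validate arange bounds and return the element count."""
    # Check 1: `end` must be greater than `start`.
    if end <= start:
        raise ValueError("arange's end argument must be greater than the start argument")
    # Check 2 (assumed interpretation): every value in [start, end) fits in int32.
    if start < INT32_MIN or end - 1 > INT32_MAX:
        raise ValueError("arange must fit in int32")
    # Check 3: the number of elements must not exceed the maximum tensor size.
    numel = end - start
    if numel > TRITON_MAX_TENSOR_NUMEL:
        raise ValueError(
            f"arange has {numel} elements, more than the maximum {TRITON_MAX_TENSOR_NUMEL}"
        )
    return numel
```

For example, `check_arange(0, 128)` returns the element count, while `check_arange(5, 5)` and `check_arange(0, 131073)` raise `ValueError`.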


106 Commits