Commit Graph

106 Commits

Author SHA1 Message Date
Rohit Santhanam
cd9ae1cd36 Merge remote-tracking branch 'upstream/main' into triton-mlir-IFU-02232023 2023-02-23 21:41:54 +00:00
Eric Wang
320ae18093 [FRONTEND] Add error messages for arange (#1218)
Fix issue https://github.com/openai/triton/issues/244

Check that `end` is greater than `start`.
Check that the range fits in `int32`.
Check that the number of elements is less than or equal to
`TRITON_MAX_TENSOR_NUMEL = 131072`.

---------

Co-authored-by: Philippe Tillet <phil@openai.com>
2023-02-22 00:37:28 +00:00
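The three checks listed in the commit above can be sketched in plain Python. This is only an illustration of the described validation, not the code from the PR: the function name `check_arange` and the error wording are hypothetical, and only the constant `TRITON_MAX_TENSOR_NUMEL = 131072` comes from the commit message.

```python
TRITON_MAX_TENSOR_NUMEL = 131072  # limit quoted in the commit message
INT32_MIN, INT32_MAX = -(2**31), 2**31 - 1


def check_arange(start: int, end: int) -> int:
    """Hypothetical validator: return the element count if all checks pass."""
    # Check 1: `end` must be greater than `start`.
    if end <= start:
        raise ValueError(f"arange: end ({end}) must be greater than start ({start})")
    # Check 2: both bounds must be representable as int32.
    if start < INT32_MIN or end > INT32_MAX:
        raise ValueError("arange: range must fit in int32")
    # Check 3: the number of elements must not exceed the tensor size limit.
    numel = end - start
    if numel > TRITON_MAX_TENSOR_NUMEL:
        raise ValueError(
            f"arange: {numel} elements exceeds TRITON_MAX_TENSOR_NUMEL "
            f"({TRITON_MAX_TENSOR_NUMEL})"
        )
    return numel
```

Raising early with a specific message for each case is the point of the change: each failing condition gets its own explanation instead of a generic failure later in compilation.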
Rohit Santhanam
841784d1e3 Merge remote-tracking branch 'upstream/main' into upgrade_triton_mlir_rocm_to_llvm_head 2023-02-18 09:25:20 +00:00
Philippe Tillet
4d067f5120 [FRONTEND] Now emit an error for tl.reshape, instead of silently calling tl.view (#1212) 2023-02-17 20:21:20 -08:00
Rohit Santhanam
a2416e0901 Merge remote-tracking branch 'upstream/main' into triton-mlir-IFU-02112023 2023-02-11 14:48:19 +00:00
Philippe Tillet
2aba985daa [OPTIMIZER] Improved layout simplifications heuristics (#1168) 2023-02-09 20:17:25 -08:00
fdrocha
972b761390 [FRONTEND] For __rshift__ operator, use arithmetic right shift if dtype is a signed int. (#1153) 2023-02-06 10:26:17 +00:00
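The distinction this commit draws between arithmetic and logical right shift can be illustrated in plain Python. The helper names and the 32-bit default width below are assumptions made for the example; they are not taken from the Triton source.

```python
def logical_rshift(value: int, shift: int, bits: int = 32) -> int:
    """Shift right filling with zeros, treating `value` as a `bits`-wide pattern."""
    return (value & ((1 << bits) - 1)) >> shift


def arithmetic_rshift(value: int, shift: int) -> int:
    """Shift right replicating the sign bit (Python's `>>` on ints already does this)."""
    return value >> shift


def rshift(value: int, shift: int, signed: bool, bits: int = 32) -> int:
    """Hypothetical dispatcher: arithmetic shift for signed dtypes, logical otherwise."""
    if signed:
        return arithmetic_rshift(value, shift)
    return logical_rshift(value, shift, bits)
```

For a signed input, `rshift(-8, 1, signed=True)` gives `-4`, preserving the sign. Treating the same bit pattern as unsigned, `rshift(-8, 1, signed=False)` gives `2147483644`.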
Rohit Santhanam
8cb6ab5b1a Merge remote-tracking branch 'upstream/main' into triton_mlir_IFU_02022023 2023-02-02 22:54:53 +00:00
Philippe Tillet
8fea1fb478 [FRONTEND] Adding static range (#1130)
Included: Revert "[BACKEND] Replace `mlir::topologicalSort` with a
custom implementation (#1113)"
2023-01-31 18:04:19 -08:00
Philippe Tillet
c4b9d699d2 [FRONTEND][BACKEND] Fixed many bugs (#1122)
- **Temporarily commented out an assertion in `MemBar.cpp`. We need to
fix this, but for now the following patches will unblock a number of
users.**
- Fixed frontend codegen issue for If / For / While. Emit an error when
the replaced values' types mismatch.
- Added "top level" codepath for if statements, which allows users to
write patterns to exit early from kernels (e.g., `if cond1: if cond2:
return else: ...`). Added associated codegen in TritonToTritonGPUPass
- Added basic control flow tests
- Pipeline pass is no longer activated when memory accesses can't be
vectorized
- Added missing magic methods to `constexpr`
- Fixed issue in random.py: bitcast some values to uint when they need
to be.
- Added support for `Not`
- Fixed nondeterministic compilation issue
2023-01-30 23:22:36 -08:00
Yan Chunwei
94b419c327 [FRONTEND] some tiny fix (#1120) 2023-01-30 19:39:38 -08:00
Rohit Santhanam
2d0ee0fa0f Merge remote-tracking branch 'upstream/main' into triton-mlir-IFU-01232023 2023-01-24 03:59:17 +00:00
Nishant Sikarwar
7687f85ca4 [FRONTEND] decorating static methods with @staticmethod (#1069) 2023-01-17 14:35:06 -08:00
Keren Zhou
3f47e9aa0e [BACKEND] Fix unrealized conversion for fp32 dot (#1051) 2023-01-17 21:55:44 +00:00
Nishant Sikarwar
4a74d6eae9 [FRONTEND] replaced chains comparison operator with in (#1059) 2023-01-15 20:14:35 +00:00
Rohit Santhanam
ce8adb92bd Merge remote-tracking branch 'upstream/master' into triton-mlir-IFU-01142023 2023-01-14 19:19:58 +00:00
Keren Zhou
4023149ee3 [Frontend] Convert constexpr to value for store and load ops (#1030)
Fixing problem 2 in https://github.com/openai/triton/issues/1017

Co-authored-by: Philippe Tillet <phil@openai.com>
2023-01-05 14:40:16 -05:00
Sophia Wisdom
411bacb2a8 [FRONTEND] Add logical operations on constexprs (#1033) 2023-01-04 18:06:32 -08:00
Keren Zhou
c9e7385255 [FRONTEND] Fix 3d indexing (#1006) 2023-01-04 14:59:31 +00:00
Keren Zhou
b5aafb0dab [FRONTEND] Fix 3d indexing (#1006) 2022-12-21 12:52:32 -08:00
Michael Melesse
41578a63d2 Merge remote-tracking branch 'upstream/triton-mlir' into triton-mlir-IFU 2022-12-21 12:53:03 -06:00
Philippe Tillet
20100a7254 Merge triton-mlir branch - Complete rewrite of the backend from scratch (#1004)
This PR merges the `triton-mlir` branch, in which we have been quietly
rewriting the Triton backend from scratch to increase maintainability,
stability and ultimately performance. Changes to the runtime are
minimal, and this new version aims to remain backward-compatible with
the previous commit. The legacy backend is now officially deprecated,
but can still be accessed via the `legacy-backend` tag.

Co-authored-by: Keren Zhou <kerenzhou@openai.com>
Co-authored-by: Yan Chunwei <yanchunwei@outlook.com>
Co-authored-by: goostavz <109190422+goostavz@users.noreply.github.com>
Co-authored-by: Shintaro Iwasaki <siwasaki@fb.com>
Co-authored-by: Yan Da <dyanab@connect.ust.hk>
Co-authored-by: Jun Yang <yangjunpro@gmail.com>
Co-authored-by: Ian Bearman <ianb@microsoft.com>
Co-authored-by: Jason Ansel <jansel@jansel.net>
Co-authored-by: Qingyi Liu <qingyil@nvidia.com>
Co-authored-by: ben-zhang-609 <110140741+ben-zhang-609@users.noreply.github.com>
Co-authored-by: Chenggang Zhao <lyricz@yeah.net>
Co-authored-by: ben-zhang-609 <benzh609@gmail.com>
Co-authored-by: dongdongl <dongdongl@nvidia.com>
2022-12-21 01:30:50 -08:00
Philippe Tillet
899bb0a0e7 [FORMAT] Run clang-format, autopep8 and isort (#1000) 2022-12-20 17:47:34 -08:00
Keren Zhou
be2f70699c [BACKEND][FRONTEND] Fix problems with test_matmul (#973)
1. Handle induction variable when step is negative
2. Restore async_wait that was accidentally deleted
3. Add missing induction variable in prefetch
4. Add device property functions

Co-authored-by: Philippe Tillet <Phil.Tillet@gmail.com>
2022-12-10 20:34:58 -08:00
Rohit Santhanam
dbe1b2aafb AMDGCN fixes for libdevice.py. 2022-12-08 19:08:26 +00:00
Philippe Tillet
b2b793dfb5 [FRONTEND][BACKEND] Fixes for cat / reshape / addptr (#959)
Most notably, this PR:
- changes the traits (and assembly format) of addptr so it can handle offsets that have arbitrary integer width.
- adds support for `cat`
2022-12-06 23:29:50 -08:00
Philippe Tillet
981aee7f1e [FRONTEND] Frontend fixes for uint / for loops / random (#958) 2022-12-06 20:25:47 -08:00
Philippe Tillet
115cd3ac47 [FRONTEND] Added reshape as an alias for view (for now) (#956) 2022-12-06 09:57:05 -08:00
Philippe Tillet
532e10cf87 [FRONTEND][BACKEND] Clean-up transpositions (#953) 2022-12-06 09:32:13 -08:00
Crutcher Dunnavant
189491727a [FRONTEND] Extract and unify @builtin/@extern (#913)
This change attaches builtin-ness as an explicit attribute, rather than
a module prefix expectation. This permits us to source those builtins
from multiple sub-modules (useful when some builtins are part of the
true cyclic implementation core, and some are just useful library
additions); but also prevents accidental inclusion of non-builtins that
happen to be in the right library.

Once the flag exists, and the compiler is using `is_builtin()` for
decision making; the existence of the current `@extern` interface
becomes isomorphic to `@builtin`; and the interface can be unified.

Leaving `@extern` a thin-wrapper, and encouraging continued use of it,
establishes future-proofing towards adding additional extern tracing,
metric hooks, or scanning in the future.

* Add `triton.impl` package to hold the core, order dependent impl
details.
 * Extract `@builtin` and unify `@extern`; add `is_builtin()`
   * Add sense bit so that `@builtin` detection is less fragile.
 * Modify the compiler to use `is_builtin()`
2022-12-05 22:59:41 +00:00
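The idea described in the commit above, marking builtin-ness with an explicit attribute rather than inferring it from the module path, can be sketched as follows. The names `builtin`, `extern` and `is_builtin` appear in the commit message; the attribute name `_BUILTIN_FLAG` and the bodies are assumptions for illustration, not the actual `triton.impl` code.

```python
from typing import Callable, TypeVar

F = TypeVar("F", bound=Callable[..., object])

# Hypothetical attribute name used as the explicit "sense bit".
_BUILTIN_FLAG = "__triton_builtin__"


def builtin(fn: F) -> F:
    """Mark `fn` as a builtin by setting an explicit attribute on it."""
    setattr(fn, _BUILTIN_FLAG, True)
    return fn


def extern(fn: F) -> F:
    """Thin wrapper over `builtin`, kept as a separate hook for future tracing."""
    return builtin(fn)


def is_builtin(fn: object) -> bool:
    """True only for callables that were explicitly marked, wherever they live."""
    return getattr(fn, _BUILTIN_FLAG, False) is True
```

Because the check looks at the function itself, a helper that merely lives in the same library is not treated as a builtin unless it was decorated.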
Crutcher Dunnavant
e0072d210a [FRONTEND] Propagate mypy types through @jit, @builtin, etc (#915)
Changes to make decorated API methods no longer type-opaque.

```
$ echo 'import triton; reveal_type(triton.language.max)' | mypy /dev/stdin
/dev/stdin:1: note: Revealed type is "def (input: Any, axis: Any, _builder: Any =) -> Any"
Success: no issues found in 1 source file
```
2022-12-05 22:41:02 +00:00
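One common way to keep a decorated function from becoming type-opaque is to annotate the decorator with a `TypeVar` bound to `Callable`, so the checker sees the same signature going in and coming out. The sketch below shows that general technique; the decorator name `passthrough` is hypothetical, and the commit message does not confirm that this is how the PR did it.

```python
import functools
from typing import Callable, TypeVar, cast

F = TypeVar("F", bound=Callable[..., object])


def passthrough(fn: F) -> F:
    """Hypothetical decorator that wraps `fn` but reports the original type."""

    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        return fn(*args, **kwargs)

    # The cast tells the type checker that the wrapper has fn's signature.
    return cast(F, wrapper)


@passthrough
def scale(x: float, factor: float = 2.0) -> float:
    return x * factor
```

With this annotation, `reveal_type(scale)` under `mypy` would report the declared parameters of `scale` rather than an untyped callable.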
Crutcher Dunnavant
2fa17588f7 [FRONTEND] Expand __init__ * imports, add __all__ (#912)
Expand `from .foo import *` to full listings, and `__all__` sections.

This reifies the module export listings, which is useful for code
importing this module; without this, clients will need special `mypy`
control pragmas for this library.

This removes a number of `# flake8` control pragmas.

Verified with `flake8`
2022-12-05 14:22:55 -08:00
Philippe Tillet
8edfe813a5 [FRONTEND][BACKEND] Added trans instruction; made flash attention bwd pass work (#943) 2022-12-03 09:58:24 -08:00
Yang Hau
8650b4d1cb [DRIVER] Fix typos (#939) 2022-12-02 11:13:46 -08:00
Philippe Tillet
6461254fb5 [BACKEND] Make flash attention forward pass work (#928)
This also simplifies BroadcastOp codegen
2022-11-30 10:13:24 +00:00
Qingyi Liu
9d31998a9d [Triton-MLIR][BACKEND] Add argmin / argmax implementation for ReduceOp (#918) 2022-11-27 22:59:27 -08:00
Philippe Tillet
4d64ffb5fe [FRONTEND] Handle for loops with negative constant steps (#896) 2022-11-20 11:37:38 +01:00
Chenggang Zhao
516a241234 [Triton-MLIR] Fix some typos (#874)
Fix some typos
2022-11-13 18:15:53 -08:00
Chenggang Zhao
57fd1864a7 [Triton-MLIR] Support FP8 (#864)
Co-authored-by: Superjomn <yanchunwei@outlook.com>
2022-11-10 15:53:06 +08:00
ben-zhang-609
5feb6e24f9 [Triton-MLIR]Add ptx vprintf support (#825)
Not sure how to write a unit test for this feature.

Co-authored-by: Yan Chunwei <yanchunwei@outlook.com>
2022-11-02 16:39:09 +08:00
Philippe Tillet
e61dc75942 [FRONTEND] Fixed inliner and got more tests to pass (#822)
This adds a `DialectInlinerInterface` to the Triton dialect. This, along
with a few other minor semantic changes, fixes our tests on call
instructions. Also added the option to provide an "LLVM_SYSPATH"
environment variable to link against a locally built LLVM; this was
useful for debugging this issue.
2022-10-30 14:10:02 -07:00
Philippe Tillet
3e6cc6d66c [FRONTEND] Made more tests pass (#805) 2022-10-26 17:47:33 -07:00
Philippe Tillet
fcb228d1d4 Merge select commits from master branch into triton-mlir (#799)
Co-authored-by: Keren Zhou <kerenzhou@openai.com>
Co-authored-by: vesuppi <zt9465@gmail.com>
Co-authored-by: Jason Ansel <jansel@jansel.net>
Co-authored-by: daadaada <dyanab@connect.ust.hk>
Co-authored-by: Anton Kostin <masguit42@users.noreply.github.com>
Co-authored-by: Yunxing Dai <nov503@gmail.com>
Co-authored-by: Shintaro Iwasaki <shintaro.iwasaki.work@gmail.com>
2022-10-24 14:52:37 -07:00
Philippe Tillet
bb0f9235d1 [OPTIMIZER] Made layout simplification pass efficient for fused attention kernels (#790) 2022-10-21 16:52:15 -07:00
Shintaro Iwasaki
0d22d2bc03 [TritonMLIR] Disallow 0D tensor (#788) 2022-10-19 10:34:32 -07:00
Yu Guo
71b46acc42 [IR] Added special-purpose dequantize instruction (#759)
It is currently necessary for optimal performance in quantized workloads to add a special-purpose instruction in the IR. Backward compatibility with this instruction is *NOT* guaranteed.
2022-10-12 14:14:45 -07:00
Yan Chunwei
555f94f9b9 [triton-mlir][BACKEND] Support masked load/store (#657)
This PR:

- fixes some bugs to support masked load/store,
- refines the frontend and supports the `and` and `or` syntax in masks (by
extending the BoolOp handling in the Python ast visitor), e.g. `tl.store(...,
mask=offset<n and other_conditions)`,
- adds `arith.cmpI` and `arith.cmpF` op conversion in the backend (required by
masks),
- adds more test cases in vecadd.
2022-10-10 13:29:53 +08:00
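The `and` / `or` support mentioned in the commit above relies on handling `BoolOp` nodes in a Python AST visitor. The toy evaluator below shows the mechanism on plain lists; the class `MaskEvaluator` and the function `eval_mask` are hypothetical and stand in for the real frontend code generator, which emits IR instead of computing values.

```python
import ast
from functools import reduce


class MaskEvaluator(ast.NodeVisitor):
    """Toy visitor: names map to lists or scalars, `and`/`or` combine elementwise."""

    def __init__(self, env):
        self.env = env

    def visit_Expression(self, node):
        return self.visit(node.body)

    def visit_Name(self, node):
        return self.env[node.id]

    def visit_Constant(self, node):
        return node.value

    def visit_Compare(self, node):
        # This sketch handles only a single `list < scalar` comparison.
        if len(node.ops) != 1 or not isinstance(node.ops[0], ast.Lt):
            raise NotImplementedError("only a single `<` comparison is supported")
        lhs = self.visit(node.left)
        rhs = self.visit(node.comparators[0])
        return [x < rhs for x in lhs]

    def visit_BoolOp(self, node):
        # `a and b and c` arrives as one BoolOp with three values; fold them pairwise.
        operands = [self.visit(value) for value in node.values]
        if isinstance(node.op, ast.And):
            return reduce(lambda a, b: [x and y for x, y in zip(a, b)], operands)
        return reduce(lambda a, b: [x or y for x, y in zip(a, b)], operands)


def eval_mask(expr: str, env: dict) -> list:
    """Parse `expr` as a single expression and evaluate it against `env`."""
    return MaskEvaluator(env).visit(ast.parse(expr, mode="eval"))
```

For example, `eval_mask("offset < n and offset < m", {"offset": [0, 1, 2, 3], "n": 3, "m": 2})` yields `[True, True, False, False]`, the elementwise conjunction of the two comparisons.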
Philippe Tillet
4a77dfb042 [FRONTEND] Complete rewrite of the runtime (#644)
This PR completely rewrites the runtime of Triton to be more lean and
clearly separate the compilation step from the just-in-time caching logic.
This should substantially reduce launch overhead.
2022-09-18 08:51:48 -07:00
Shintaro Iwasaki
13669b46a6 [DOCS] Correct spelling (#665)
This PR corrects spelling like #664 for Triton-MLIR. It should not break anything.
2022-09-16 15:07:34 -07:00
Shintaro Iwasaki
c668d6596e [DOCS] Fix spelling (#664)
This PR applies minor spelling fixes in comments and string literals to
`master`. It shouldn't hurt anything.
2022-09-16 12:26:40 -07:00