1291 Commits

Author SHA1 Message Date
Rohit Santhanam
cd9ae1cd36 Merge remote-tracking branch 'upstream/main' into triton-mlir-IFU-02232023 2023-02-23 21:41:54 +00:00
rsanthanam-amd
af1ed3a01f Merge pull request #130 from binarman/unittest_fix/Conversion/tritongpu_to_llvm/crashes
[Test][LIT] Fix Conversion/tritongpu_to_llvm.mlir crash
2023-02-23 08:31:31 -06:00
Alexander Efimov
c3f24143d2 fix basic_load atomic_add_f32 tests 2023-02-23 12:03:56 +01:00
Alexander Efimov
69d8c4756b add workflow for offline tests specific for amd build 2023-02-23 12:03:56 +01:00
Alexander Efimov
33667cd106 Specialize checks for async slice tests:
- basic_insert_slice_async_v4
- basic_insert_slice_async_v1
- basic_insert_slice_async_v1_multictas
2023-02-23 12:03:56 +01:00
Alexander Efimov
9ad7fec871 [Test][LIT] Fix Conversion/tritongpu_to_llvm.mlir crash
This PR disables the following subtests because they are PTX-specific:
- basic_async_wait
- convert_dot
- matmul_kernel_dot_operand_layout
- matmul884_kernel_dot_operand_layout
- matmul_tf32dot
2023-02-23 12:03:56 +01:00
Stonepia
a38d2defb8 [Build] Strip the static libraries symbol from the triton shared library (#1240)
This is to solve https://github.com/openai/triton/issues/1236

This commit hides the symbols of the shared libraries for
`libtriton.so`, so that when other objects link against `libtriton.so`,
their symbols do not conflict.
2023-02-22 23:39:31 -08:00
Chenggang Zhao
b5efa91e2a [Backend] Fix a bug in swizzling store (#1235)
The function calculates the swizzled address to **store** (not load), so
we should use `outOrder` instead of `inOrder`. Current tests do not
cover this case, but at NVIDIA, we have a case related to `sm_90` that
could trigger it. This was already discussed in the Slack channel with @Jokeren.
2023-02-22 19:13:21 -08:00
Douglas Lehr
729211a404 Ensure __triton_launcher calls right _launch. (#1229)
Per issue https://github.com/openai/triton/issues/1228. I believe we are
potentially exposed when a Triton executor (PyTorch, for example) links
in two or more `triton_.so` shared objects and each has a stub for
`_launch`.

This fix ensures the `_launch` function is tied locally to the calling
`__triton_launcher` and can't be misused by another library.
2023-02-23 00:16:36 +00:00
Keren Zhou
6a9316e69a [BACKEND] Clean up SCF -> CF conversion (#1234) 2023-02-22 23:49:47 +00:00
rsanthanam-amd
e7f84448bf Merge pull request #127 from dfukalov/dfukalov/work-3
[ROCM] Enable float16 and int8 types for FMA based `dot` implementation.
2023-02-22 16:39:04 -06:00
Philippe Tillet
0ec277efc5 [OPTIMIZER] cleaned, renamed and simplified some optimization passes (#1232)
This shouldn't actually change the behavior of Triton -- only clean things up.
2023-02-22 13:54:55 -08:00
Daniil Fukalov
2d678efb89 [ROCM] Enable float16 and int8 types for FMA based dot implementation.
By default, Triton generates MLIR with an f32 result for the tt.dot operation
on f16-typed operands, so .ttgir contains "tt.dot(f16,f16,f32)->f32" types.
However, the LLVM FMA instruction requires the same type for all three
operands, so the first two operands are implicitly cast f16->f32 as
"unrealized_conversion_cast struct{f16,f16,...}->struct{f32,f32}".
This change fixes incorrect generation of that implicit cast.
For int8-typed operands, the result operand is also cast after performing the dot.

As a next step to improve the FMA-based dot operation, target-specific FMA
intrinsics on f16 and int8 (e.g. fma(f16,f16,f16)->f16) could be used, perhaps
as an option.
2023-02-22 22:36:20 +01:00
rsanthanam-amd
4e7fa6f795 Merge pull request #134 from dllehr-amd/triton_static_fix
Add static to _launch function call
2023-02-22 00:30:45 -06:00
Philippe Tillet
ba0198326e [TESTS] make performance regression testing less strict (#1231) 2023-02-21 22:22:02 -08:00
Mihir Patel
6bef0c2bd6 [FRONTEND] Update path for headers to support Python 3.10 (#1123)
Python 3.10 changes where packages are installed by default; on Ubuntu
they now go into `/local`, which causes problems. See
[this](https://lists.debian.org/debian-python/2022/03/msg00039.html) and
[this](https://bugs.launchpad.net/ubuntu/+source/python3.10/+bug/1967920).
Triton seems to break when using 3.10 as it looks for the headers, but
the headers are not in `/local`, e.g. they are at
`/usr/include/python3.X` and not `/usr/local/include/python3.X`


Not 100% sure what's going on here since it's deep in python / pip, but
I think this should fix it. Otherwise, you have to hack around it in
dockerfiles, e.g. `ENV DEB_PYTHON_INSTALL_LAYOUT=deb`, which breaks
things with the release of pip that went.

---------

Co-authored-by: Keren Zhou <kerenzhou@openai.com>
2023-02-21 21:19:08 -08:00
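The header lookup described in the commit above can be sketched with the standard `sysconfig` module. This is a minimal illustration, not the patch itself: the helper name `python_include_dir` is hypothetical, and the assumption is that Debian/Ubuntu's `posix_local` install scheme (which points under `/usr/local`) should be mapped back to `posix_prefix` when locating headers.

```python
import sysconfig


def python_include_dir() -> str:
    """Return the directory expected to hold Python.h (hypothetical helper)."""
    # sysconfig.get_default_scheme() is public from Python 3.10 onward.
    get_scheme = getattr(sysconfig, "get_default_scheme", None)
    if get_scheme is None:
        # Older interpreters: rely on the default paths.
        return sysconfig.get_paths()["include"]

    scheme = get_scheme()
    # Assumption: Debian/Ubuntu patch in a "posix_local" scheme rooted at
    # /usr/local, while the headers live under the regular prefix.
    if scheme == "posix_local":
        scheme = "posix_prefix"
    return sysconfig.get_paths(scheme=scheme)["include"]


print(python_include_dir())
```

On a stock Ubuntu Python 3.10, this kind of lookup would be expected to yield `/usr/include/python3.10` rather than a path under `/usr/local`.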
Philippe Tillet
174f121c1c [TESTS] Added attention regression tests (#1227) 2023-02-21 20:22:36 -08:00
Douglas Lehr
a4ffd09177 Add static to _launch function call
See issue https://github.com/openai/triton/issues/1228
2023-02-21 23:04:33 -05:00
Phil Tillet
0192ab2178 Revert "[CI] Now only running CI when checks are requested in merge groups"
This reverts commit d023e1cb06.
2023-02-21 16:39:47 -08:00
Eric Wang
320ae18093 [FRONTEND] Add error messages for arange (#1218)
Fix issue https://github.com/openai/triton/issues/244

- Check that `end` is greater than `start`.
- Check that the range fits in `int32`.
- Check that the number of elements is less than or equal to
  `TRITON_MAX_TENSOR_NUMEL = 131072`.

---------

Co-authored-by: Philippe Tillet <phil@openai.com>
2023-02-22 00:37:28 +00:00
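The three `arange` checks listed in the commit above could look roughly like the following. This is a sketch under stated assumptions: `check_arange` and its error messages are hypothetical and are not Triton's actual frontend code; only the limit `TRITON_MAX_TENSOR_NUMEL = 131072` comes from the commit message.

```python
TRITON_MAX_TENSOR_NUMEL = 131072  # limit quoted in the commit message

INT32_MIN = -(2**31)
INT32_MAX = 2**31 - 1


def check_arange(start: int, end: int) -> int:
    """Validate arange bounds and return the element count (hypothetical helper)."""
    if end <= start:
        raise ValueError("arange's end argument must be greater than the start argument")
    if not (INT32_MIN <= start <= INT32_MAX and INT32_MIN <= end <= INT32_MAX):
        raise ValueError("arange's range must fit in int32")
    numel = end - start
    if numel > TRITON_MAX_TENSOR_NUMEL:
        raise ValueError(
            f"arange's number of elements must not exceed {TRITON_MAX_TENSOR_NUMEL}"
        )
    return numel


print(check_arange(0, 128))
```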
Phil Tillet
d023e1cb06 [CI] Now only running CI when checks are requested in merge groups 2023-02-21 16:34:25 -08:00
Philippe Tillet
307dde9cb5 [CI] revived regression tests (#1225) 2023-02-21 16:33:03 -08:00
Yu Guo
19228d88bc [FRONTEND][BACKEND] add env variable TRITON_LIBDEVICE_PATH (#1166)
We may compile kernels on remote machines that do not have a local
libdevice.10.bc.

Co-authored-by: Philippe Tillet <phil@openai.com>
2023-02-21 20:15:12 +00:00
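An environment-variable override of this kind is commonly resolved as below. This is a sketch only: the function `resolve_libdevice_path` and the example paths are hypothetical; the variable name `TRITON_LIBDEVICE_PATH` is the only detail taken from the commit.

```python
import os


def resolve_libdevice_path(default_path: str, env=None) -> str:
    """Prefer TRITON_LIBDEVICE_PATH when set, else use the bundled default.

    `env` is injectable for testing; it defaults to os.environ.
    """
    env = os.environ if env is None else env
    override = env.get("TRITON_LIBDEVICE_PATH")
    # An empty string is treated the same as "not set".
    return override if override else default_path


# Hypothetical paths, for illustration only.
print(resolve_libdevice_path("/opt/cuda/libdevice.10.bc", env={}))
print(
    resolve_libdevice_path(
        "/opt/cuda/libdevice.10.bc",
        env={"TRITON_LIBDEVICE_PATH": "/mnt/shared/libdevice.10.bc"},
    )
)
```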
rsanthanam-amd
e57d04481b Merge pull request #133 from binarman/cmake_policy_guard
Add if guard around policy in CMakeLists.txt
2023-02-21 09:13:41 -06:00
Alexander Efimov
15f0c8f4c9 Add if guard around policy in CMakeLists.txt
This fixes an error when building with old CMake versions.
2023-02-21 14:44:30 +01:00
Philippe Tillet
cdd59eae68 [CI] Added A100 runner; tentative merge queues support (#1224) 2023-02-21 01:37:56 -08:00
Michaël Benesty
940f394a35 [Frontend] fix crash on cast when dest is constexpr (#1222)
This pull request addresses a crash that occurs when casting to a
tl.constexpr type in the frontend.

More info and repro code available in:
https://github.com/openai/triton/issues/1221
2023-02-20 10:50:33 -08:00
rsanthanam-amd
b8270fa8c8 Merge pull request #129 from ROCmSoftwarePlatform/amdgpu_atomics_performance_enhancements
Performance enhancements for AMDGPU atomics.
2023-02-20 09:55:34 -06:00
Christian Sigg
17795a34ac [NFC] Remove null character (#1220) 2023-02-20 08:50:28 +00:00
Rohit Santhanam
4b56eb2fd4 Performance enhancements for AMDGPU atomics.
Add "agent" syncscope specification to prevent large performance loss
for gfx90a.

Add LLVM function attributes to enable fp32 atomic adds for archs that
support it.
2023-02-20 06:28:00 +00:00
Keren Zhou
123c687ed9 [BACKEND] Rewrite Membar to fit the CF dialect (#1213) 2023-02-19 14:54:33 -08:00
rsanthanam-amd
be2b51a8f1 Merge pull request #126 from ROCmSoftwarePlatform/upgrade_triton_mlir_rocm_to_llvm_head
Upgrade triton mlir rocm to llvm head
2023-02-19 07:22:05 -06:00
BillSchumacher
6b44d31ae4 [BUILD] windows and cmake compatibility. (#1214)
Make CMake happier: it doesn't like multiple `target_link_libraries`
definitions for the same name.

Use find_package instead on Windows for dlfcn-win32.
Set LLVM_SYS_PATH on Windows for Python setup.

The debug build is almost working; an AlwaysCreate error is still thrown.
2023-02-19 09:51:50 +00:00
Philippe Tillet
c1194bd237 [OPTIMIZER] Refined side-effect traits (#1216) 2023-02-19 01:21:19 -08:00
Rohit Santhanam
50e6329d01 Fix LIT tests. 2023-02-18 09:32:42 +00:00
Rohit Santhanam
841784d1e3 Merge remote-tracking branch 'upstream/main' into upgrade_triton_mlir_rocm_to_llvm_head 2023-02-18 09:25:20 +00:00
Arun A. Kumar
35d1c062b8 [FRONTEND] fix AutoTuner error when OutOfResources (#1208)
Minor bug: AutoTuner currently throws an error when certain
configs go OutOfResources (e.g. the matmul example when testing on GPUs
with less shared memory).
2023-02-18 07:29:33 +00:00
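The commit message does not say how the fix works. One common way to make an autotuner tolerate configs that exceed hardware limits is to treat them as infinitely slow, so they are never selected. In the sketch below, every name (`OutOfResources`, `pick_best_config`, `fake_bench`) is a hypothetical stand-in, not Triton's API.

```python
class OutOfResources(Exception):
    """Stand-in for the error raised when a config exceeds GPU resources."""


def pick_best_config(configs, bench):
    """Benchmark each config and return the fastest one that fits."""
    timings = {}
    for name in configs:
        try:
            timings[name] = bench(name)
        except OutOfResources:
            # Treat an oversized config as infinitely slow instead of failing.
            timings[name] = float("inf")
    return min(timings, key=timings.get)


def fake_bench(name):
    # Hypothetical benchmark: the "large" config does not fit on this GPU.
    if name == "large":
        raise OutOfResources("not enough shared memory")
    return {"small": 2.0, "medium": 1.5}[name]


print(pick_best_config(["small", "medium", "large"], fake_bench))
```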
rsanthanam-amd
e8f1003ff3 Merge pull request #125 from ROCmSoftwarePlatform/upgrade_hip_launcher
Upgrade HIP launcher code to be at parity with CUDA.
2023-02-17 23:48:22 -06:00
Rohit Santhanam
711370f008 Upgrade HIP launcher code to be at parity with CUDA. 2023-02-18 05:02:46 +00:00
Philippe Tillet
4d067f5120 [FRONTEND] Now emit an error for tl.reshape, instead of silently calling tl.view (#1212) 2023-02-17 20:21:20 -08:00
Christian Sigg
9ef4b5d773 Rebase to LLVM-head. (#1200)
Rebase to
37b7a60cd7
2023-02-17 13:16:11 -08:00
rsanthanam-amd
cd3b27f248 Merge pull request #116 from ROCmSoftwarePlatform/bitcode
use local bit code
2023-02-17 14:46:40 -06:00
Michael Melesse
2077c0723b local ROCM bitcode files
This is a combination of 6 commits.

use local bitcode

This is a combination of 3 commits.

add bit code to repo

update test

change bit code path

move bit code

update path

update scripts

update test

fix path issue
2023-02-17 14:10:34 -06:00
rsanthanam-amd
a3d41af77a Merge pull request #122 from ROCmSoftwarePlatform/remove_tmpname_from_hsaco_file_generation
Remove the unsafe tmpnam call to generate the HSACO file name.
2023-02-17 07:49:57 -06:00
rsanthanam-amd
8614210b54 Merge pull request #120 from binarman/unittest_fix/Target/tritongpu_to_hsaco
[Test][LIT] Fix tritongpu_to_hsaco.mlir test
2023-02-17 07:35:50 -06:00
Rohit Santhanam
1d8fd49254 Remove the unsafe tmpnam call to generate the HSACO file name. 2023-02-17 13:09:26 +00:00
Goran Flegar
3b72ebd199 [BACKEND] correctly propagate address spaces through GEP (#1207)
While doing it incorrectly works at the moment, this will cause a
validation error once we rebase Triton closer to LLVM head, since
validation for some LLVM Dialect ops got stricter.

Specifically, if we remove the shared memory space attribute, a
subsequent bitcast tries to add it back, which is illegal.
2023-02-17 11:19:03 +00:00
Philippe Tillet
969331aedd [BUILD] fixed setup.py on older glibc (#1206) 2023-02-16 19:43:18 -08:00
Philippe Tillet
8a4117a0f4 [FRONTEND] launcher module is now renamed from launcher to __triton_launcher (#1201)
Dynamically creating a module named `launcher` may conflict with other
modules of the same name in the user's environment.
2023-02-16 17:28:51 -08:00
Alexander Efimov
fce08c2ebb [Test][LIT] Fix tritongpu_to_hsaco.mlir test
This PR fixes the tritongpu_to_hsaco.mlir test
and a minor issue with the tritongpu_to_amdgcn test.
2023-02-16 20:15:52 +01:00