This PR disables the following sub-tests because they are PTX-specific:
- basic_async_wait
- convert_dot
- matmul_kernel_dot_operand_layout
- matmul884_kernel_dot_operand_layout
- matmul_tf32dot
This is to solve https://github.com/openai/triton/issues/1236
This commit hides the symbols of the shared libraries bundled into
`libtriton.so`, so that when other objects link against `libtriton.so`,
there are no symbol conflicts.
The function calculates the swizzled address to **store** (not load), so
we should use `outOrder` instead of `inOrder`. The current tests do not
cover this case, but at NVIDIA we have an `sm_90`-related case that
could trigger it. Already discussed in the Slack channel with @Jokeren.
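To illustrate why the order matters, here is a minimal, hypothetical sketch of an XOR-based shared-memory swizzle; the function name and parameters are invented for this example. The point is only that `row`/`col` must be taken in the order of the layout being written to (the `outOrder`), not the layout being read from:

```python
def swizzled_store_offset(row, col, cols_per_row, per_phase=1, max_phase=8):
    # XOR-based swizzle: derive a phase from the row and XOR it into the
    # column so consecutive rows hit different memory banks.
    phase = (row // per_phase) % max_phase
    return row * cols_per_row + (col ^ phase)
```

If the dimensions are read in the wrong order, the computed phase (and thus the bank pattern) is wrong, even though the code may appear to work on layouts where the two orders happen to coincide.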
Per issue https://github.com/openai/triton/issues/1228. I believe we are
potentially exposed when a Triton executor (PyTorch, for example) links
in two or more `triton_.so` shared objects, each of which has a stub for
`_launch`.
This fix ensures the `_launch` function is tied locally to the calling
`__triton_launcher` and can't be misused by another library.
By default, Triton generates MLIR in which `tt.dot` on f16-typed
operands produces an f32 result, so we have `tt.dot(f16,f16,f32)->f32`
types in `.ttgir`. But the LLVM FMA instruction requires the same type
for all three operands, so the first two operands are implicitly cast
f16->f32 as `unrealized_conversion_cast struct{f16,f16,...}->struct{f32,f32}`.
This change fixes incorrect generation of that implicit cast.
For int8-typed operands, the result operand is also cast after performing the dot.
As a next step to improve FMA-based dot, target-specific FMA intrinsics
for f16 and int8 (e.g. `fma(f16,f16,f16)->f16`) could be used, perhaps as
an option.
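A minimal sketch of the promotion using only the standard library (`f16_to_f32` and `fma_dot` are hypothetical helpers, not Triton APIs): each f16 element is widened to f32 so the multiply-accumulate runs on a single type, mirroring the `unrealized_conversion_cast` described above.

```python
import struct

def f16_to_f32(h: int) -> float:
    # Reinterpret a raw IEEE-754 half (given as a uint16 bit pattern) as a
    # wider float, mirroring the implicit f16->f32 promotion.
    return struct.unpack("<e", struct.pack("<H", h))[0]

def fma_dot(a_f16, b_f16, acc_f32):
    # All three FMA operands must share one type, so each f16 element is
    # widened before the multiply-accumulate; accumulation stays in f32.
    return acc_f32 + sum(f16_to_f32(x) * f16_to_f32(y)
                         for x, y in zip(a_f16, b_f16))
```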
Python 3.10 changed where packages are installed by default, causing
problems on Ubuntu, where installs end up under `/local`. See
[this](https://lists.debian.org/debian-python/2022/03/msg00039.html) and
[this](https://bugs.launchpad.net/ubuntu/+source/python3.10/+bug/1967920).
Triton seems to break when using 3.10 because it looks for the headers
under `/local`, but they are not there, e.g. they are at
`/usr/include/python3.X` rather than `/usr/local/include/python3.X`.
Not 100% sure what's going on here since it's deep in Python/pip, but
I think this should fix it. Otherwise, you have to hack around it in
Dockerfiles, e.g. `ENV DEB_PYTHON_INSTALL_LAYOUT=deb`, which breaks
with the pip release that just went out.
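A small sketch of the safer approach: query `sysconfig` for where the interpreter's headers actually live instead of assuming a `/usr/local` prefix.

```python
import sysconfig

# Ask the interpreter where its C headers actually live; on Debian/Ubuntu
# Python 3.10 this may be /usr/include/python3.10 even though pip installs
# packages under /usr/local.
include_dir = sysconfig.get_path("include")
print(include_dir)
```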
---------
Co-authored-by: Keren Zhou <kerenzhou@openai.com>
Fix issue https://github.com/openai/triton/issues/244.
- Check that `end` is greater than `start`.
- Check that the range fits in `int32`.
- Check that the number of elements is less than or equal to
`TRITON_MAX_TENSOR_NUMEL = 131072`.
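The three checks above can be sketched as follows (`check_arange` is a hypothetical stand-in for the frontend validation, not the actual Triton function):

```python
TRITON_MAX_TENSOR_NUMEL = 131072
INT32_MIN, INT32_MAX = -(2**31), 2**31 - 1

def check_arange(start: int, end: int) -> int:
    # 1. end must be greater than start
    if end <= start:
        raise ValueError("arange: end must be greater than start")
    # 2. the range must fit in int32
    if start < INT32_MIN or end > INT32_MAX:
        raise ValueError("arange: range must fit in int32")
    # 3. the number of elements must not exceed TRITON_MAX_TENSOR_NUMEL
    if end - start > TRITON_MAX_TENSOR_NUMEL:
        raise ValueError("arange: too many elements")
    return end - start
```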
---------
Co-authored-by: Philippe Tillet <phil@openai.com>
This pull request addresses a crash that occurs when casting to a
`tl.constexpr` type in the frontend.
More info and repro code available in:
https://github.com/openai/triton/issues/1221
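A hedged sketch of the kind of fix involved (the `Constexpr` class and `resolve_dtype` helper are invented for illustration; the real frontend types differ): unwrap a constexpr wrapper before using it as a cast target, so the crash path is never reached.

```python
class Constexpr:
    # Stand-in for tl.constexpr: a value wrapped at compile time.
    def __init__(self, value):
        self.value = value

def resolve_dtype(dtype):
    # If the cast target arrived wrapped in a constexpr (possibly nested),
    # unwrap it first instead of crashing on the wrapper type.
    while isinstance(dtype, Constexpr):
        dtype = dtype.value
    return dtype
```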
Add "agent" syncscope specification to prevent large performance loss
for gfx90a.
Add LLVM function attributes to enable fp32 atomic adds for archs that
support it.
Make CMake happier; it doesn't like multiple `target_link_libraries`
definitions for the same name.
Use `find_package` instead on Windows for dlfcn-win32.
Set `LLVM_SYS_PATH` on Windows for the Python setup.
Debug build almost working; an AlwaysCreate error is still thrown.
Minor bug: the autotuner currently throws an error when certain
configs go `OutOfResources` (e.g. the matmul example when testing on GPUs
with less shared memory).
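One way to make the autotuner tolerate such configs is to treat out-of-resources as an infinitely bad timing rather than a fatal error. A sketch under that assumption (`OutOfResources`, `autotune`, and the benchmark callback are simplified stand-ins for Triton's internals):

```python
import math

class OutOfResources(Exception):
    """Raised when a config exceeds hardware limits (e.g. shared memory)."""

def autotune(configs, bench_fn):
    # Benchmark every config; configs that run out of resources are kept
    # with an infinite timing so they can never win, instead of aborting
    # the whole tuning sweep.
    timings = {}
    for cfg in configs:
        try:
            timings[cfg] = bench_fn(cfg)
        except OutOfResources:
            timings[cfg] = math.inf
    return min(timings, key=timings.get)
```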
This is a combination of 6 commits.
use local bitcode
This is a combination of 3 commits.
add bit code to repo
update test
change bit code path
move bit code
update path
update scripts
update test
fix path issue
While doing it incorrectly happens to work at the moment, it will cause a
validation error once we rebase Triton closer to LLVM head, since
validation for some LLVM dialect ops has gotten stricter.
Specifically, if we remove the shared memory address space attribute, a
subsequent bitcast tries to add it back, which is illegal.