Commit Graph

7 Commits

Author SHA1 Message Date
Thomas Raoux
a7061e19b2 [BACKEND] Fix multiple bugs in WGMMA (#2457)
Fix dependencies in wgmma_wait op to prevent the scheduler from moving
it past the uses of wgmma accumulator. We need to explicitly represent
the dependency between the wait and the accumulator uses otherwise LLVM
is free to re-order those.
This allows us to remove a workaround to prevent the re-ordering. We can
also remove the wait op added in the loop during pipelining.

Also fix the descritpor calcuation for wgmma, we should calculate the
same descriptor for the whole warpgroup.
Added a workaround for a bug that was exposed by different timing due to
those changes. We shouldn't insert operations between the loop and
async_wait or we may have race conditions.
2023-10-06 17:59:28 -07:00
Thomas Raoux
6bc1d9e1be [BACKEND] Support MMA V3 with register operand (#2375)
MMA V3 support taking operand A from register. This helps for chained
matmul operations like in attention.
Add an optimization to use this mode when it helps and add the lowering
for it.
2023-09-25 10:43:54 -07:00
Thomas Raoux
31b0c52142 [FRONTEND][BACKEND] Add flag to control accumulation for fp8 (#2300)
Change the dot to allow taking an initial accumulator and add a flag
that will allow the compiler to accumulate in a lower precision than the
output type.
On Hopper this flag is on by default which allows accumualting with
lower precision.
This only affect Hopper fp8 dot.
2023-09-15 18:42:54 -07:00
Thomas Raoux
994f7e4460 [BACKEND] Remove dependency between NVGPU and TritonNvidiaGPU (#2282) 2023-09-12 11:02:20 -07:00
ivanyinwz
a539836876 Fix predicate for store tiled op (#2215)
The predicate of Store Tiled op was not set which caused a lot of perf
drop due to duplicated memory traffic in epilogue.
2023-09-04 02:27:00 +00:00
Zahi Moudallal
acbf716889 [BACKEND] Refactoring NVGPUToLLVMPass (#2158) 2023-09-01 23:40:31 +00:00
Zahi Moudallal
4d373aa103 [BACKEND] Remove HopperHelpers.c and replace with inline ptx and LLVM codegen (#2047) 2023-08-10 15:52:37 -07:00