Jason Furmanek
e5d7bb4fae
Initial commit to resolve merge conflicts
...
rename tl.float8e4 to tl.float8e4nv to align with upstream
ROCM IFU: Fix python arch issues
ROCM IFU: Fix kernel launcher
ROCM IFU: Fix merge conflicts
fix debug build
Set correct threadsPerCTA
2023-10-03 04:04:26 +00:00
Lixun Zhang
8d99331c89
Combine split_k and non split_k kernels in GEMM tuning API ( #344 )
2023-09-28 12:37:22 -05:00
Shucai Xiao
10795d8fd3
Fixed a bug related to split_k and prune unnecessary tuning space ( #332 )
...
* refine tuning script by adding prune_configs; also fixed a bug in generating tuning configs
* fixed a bug in returning the empty config
2023-09-21 23:47:14 -05:00
Shucai Xiao
fb3f2d6feb
refine gemm tuning scripts ( #309 )
...
* refine the gemm tuning scripts to reduce the tuning space and get better perf numbers
* added code to support tuning in full tuning space
* add a function to get best tuning config
* refine the matmul tutorial example to print out best tuning config for each input
* added even_k to gemm kernel heuristic for better performance
* address review comments
2023-09-07 08:09:11 -05:00
Shucai Xiao
1c86e3238a
remove adding multiple architectures to isa head and add gemm tuning scripts ( #261 )
...
* Remove adding multiple architectures to isa head
* Add mask for gpu memory load in scripts for tuning gemm 'script/amd/gemm/matmul.py'
* Move the scripts to a better place 'scripts/amd/gemm/'
2023-07-18 14:21:16 -05:00
Michael Melesse
275fead8e3
fix lit test
2023-05-12 15:37:08 -05:00
Michael Melesse
9cc141b12d
assume ROCM device
...
This is a combination of 7 commits.
use pyt nightly with root
repro with pytorch unit test
hardcode isROCM to true
set is_cuda to False
ignore cc arg
clean up
match triton-mlir branch
2023-05-04 16:46:59 -05:00
Michael Melesse
fdd2af8b38
fix workflow
...
This is a combination of 6 commits.
change github actions
install git
remove pre-commit
back to old install
use -e
clean up
2023-05-01 12:49:29 -05:00
Michael Melesse
13facab95f
fix lit tests
...
This is a combination of 3 commits.
fix build and test errors
fix lit test error
fix lit tests
2023-05-01 12:48:20 -05:00
Michael Melesse
d211cd7750
skip bad test
2023-04-17 13:12:34 -05:00
Michael Melesse
705d47d0dd
fix lit test issues
...
This is a combination of 6 commits.
install lit
fix lit test
fix lit test
fix aot lit issues
fix final lit tests
add lit tests
2023-04-17 11:46:37 -05:00
Michael Melesse
f50116208f
match masked load
2023-04-11 15:20:08 -05:00
Rahul Batra
c7ac25dc60
fix shift op
2023-04-10 15:05:45 -05:00
Rahul Batra
3d71a6a034
fix issues
2023-04-07 14:40:59 -05:00
Rohit Santhanam
dadc09623b
Replace hard coded ROCM paths with ROCM_PATH env var.
2023-03-06 03:20:38 +00:00
Michael Melesse
2077c0723b
local ROCM bitcode files
...
This is a combination of 6 commits.
use local bitcode
This is a combination of 3 commits.
add bit code to repo
update test
change bit code path
move bit code
update path
update scripts
update test
fix path issue
2023-02-17 14:10:34 -06:00
Daniil Fukalov
6b4687db34
[ROCM][scripts] Add script to build debug LLVM installation.
2023-01-13 00:41:57 +01:00
Michael Melesse
bcccbf7787
update test script
2022-12-24 10:25:50 -06:00
Michael Melesse
28bec3dc41
update test
2022-12-24 07:53:28 -06:00
Michael Melesse
3f8b402f8a
update script
2022-12-22 22:06:32 -06:00
Michael Melesse
9ff2f8b653
enable kernel launching
2022-12-22 21:59:47 -06:00
Michael Melesse
46357a92f2
label kernels correctly
2022-12-22 21:24:34 -06:00
Michael Melesse
34f95bc7d9
update scripts
2022-12-22 18:47:42 -06:00
Michael Melesse
8b1fb798e6
show segfaults
2022-12-22 16:29:49 -06:00
Michael Melesse
f06fdff372
add prints in c code
2022-12-22 16:20:46 -06:00
Michael Melesse
814a59a3d6
attempt launch
2022-12-22 08:29:20 -06:00
Michael Melesse
edd0df94dc
compiles
2022-12-21 13:48:56 -06:00
Michael Melesse
5e055a5165
add scripts
2022-12-21 13:13:24 -06:00