wmma: add CUDA tensor core and fix test_speed_v_torch failure (#3544)

Francis Lam
2024-03-01 17:51:02 -08:00
committed by GitHub
parent b3cdc11a58
commit e17f1821a7
8 changed files with 50 additions and 25 deletions

@@ -45,6 +45,7 @@ jobs:
       run: |
         DEBUG=2 EMULATE_METAL=1 FORWARD_ONLY=1 PYTHON=1 python3 ./test/test_linearizer.py TestLinearizer.test_tensor_cores
         DEBUG=2 EMULATE_HIP=1 FORWARD_ONLY=1 PYTHON=1 python3 ./test/test_linearizer.py TestLinearizer.test_tensor_cores
+        DEBUG=2 EMULATE_CUDA=1 FORWARD_ONLY=1 PYTHON=1 python3 ./test/test_linearizer.py TestLinearizer.test_tensor_cores
     - name: Test dtype with Python emulator
       run: DEBUG=2 PYTHON=1 python3 test/test_dtype.py
     - name: Test ops with Python emulator