ROCm/python/test/regression at 0d7e7532279e45672555e344646f5c19c3972331 - ROCm

mirror of https://github.com/ROCm/ROCm.git synced 2026-04-05 03:01:17 -04:00

Natalia Gimelshein 0d7e753227 [TESTING] use torch.int for autotuning cache (#840)

For stupid reasons, ops on int8 are 3 times slower than on int, and for
another set of stupid reasons we are not using cudaMemset for `zero_`,
so using an `int8` buffer in `do_bench` makes the benchmark slow.
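
A minimal sketch of the idea, not the actual `do_bench` code: the helper names `flush_buffer_numel` and `make_flush_buffer`, the 256 MiB default size, and the `"cuda"` device are assumptions made for illustration. It shows that a `torch.int` buffer covers the same number of bytes with a quarter of the elements an `int8` buffer would need.

```python
def flush_buffer_numel(cache_bytes: int, itemsize: int) -> int:
    """Return how many elements of `itemsize` bytes cover `cache_bytes` bytes."""
    if cache_bytes < 0 or itemsize <= 0:
        raise ValueError("cache_bytes must be >= 0 and itemsize must be > 0")
    # Ceiling division, so the buffer is never smaller than requested.
    return -(-cache_bytes // itemsize)


def make_flush_buffer(cache_bytes: int = 256 * 1024 * 1024, device: str = "cuda"):
    """Allocate a torch.int buffer of at least `cache_bytes` bytes (hypothetical helper)."""
    # Imported lazily so the sizing helper above works without PyTorch installed.
    import torch

    # torch.int is a 32-bit integer, so element_size() is 4 bytes.
    itemsize = torch.empty((), dtype=torch.int).element_size()
    numel = flush_buffer_numel(cache_bytes, itemsize)
    return torch.empty(numel, dtype=torch.int, device=device)
```

A benchmark loop would then call `zero_()` on the returned buffer before each timed run, so that every run starts from a cache that no longer holds the kernel's inputs.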

Co-authored-by: Philippe Tillet <phil@openai.com>

2022-11-04 18:05:16 -07:00

test_performance.py

[TESTING] use torch.int for autotuning cache (#840)

2022-11-04 18:05:16 -07:00