ROCm/python
Vinayak Gokhale c2766bbd5f Merge changes from upstream FA bwd kernel (#444)
* Add optimized FA bwd from upstream

* Add autotuning

* Change loads and stores to use block ptrs

* Cleanup
2024-01-05 15:12:05 -06:00