Beka Barbakadze 56b986da8b feat(cuda): new decomposition algorithm for pbs.
- removes 16 bit limitation on base_log
- optimizes shared memory use: the dedicated decomposition buffers are removed; for the amortized PBS, the rotated buffers are reused as the decomposition state buffer.
- adds a private test for the CUDA PBS, as already exists for the FFT backend.
2022-11-28 11:48:16 +01:00
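For readers unfamiliar with the decomposition step mentioned in the commit, here is a minimal Python sketch of a signed (balanced) gadget decomposition of a 64-bit torus value, the kind of operation a PBS (programmable bootstrapping) performs on each coefficient. It is an illustration under stated assumptions, not the CUDA kernel from this commit: the names `signed_decompose` and `recompose` are hypothetical, and the only link to the commit is that the sketch keeps a single running `state` and accepts a `base_log` above 16.

```python
# Illustrative sketch only: NOT the CUDA implementation from the commit.
# Function names and the 64-bit default width are assumptions.

def signed_decompose(value, base_log, level_count, width=64):
    """Split `value` into `level_count` signed digits in [-B/2, B/2),
    with B = 2**base_log, most significant level first."""
    assert base_log * level_count <= width
    mask = (1 << width) - 1
    base = 1 << base_log
    half = base >> 1
    drop = width - base_log * level_count  # low bits that are rounded away

    # Round to the closest representable value, modulo 2**width.
    state = value & mask
    if drop > 0:
        state = ((state + (1 << (drop - 1))) & mask) >> drop

    digits = []
    for _ in range(level_count):
        digit = state & (base - 1)   # take the lowest base_log bits
        state >>= base_log
        if digit >= half:            # re-center the digit, carry into the rest
            digit -= base
            state += 1
        digits.append(digit)
    digits.reverse()                 # most significant level first
    return digits


def recompose(digits, base_log, width=64):
    """Rebuild the rounded torus value from its signed digits."""
    mask = (1 << width) - 1
    total = 0
    for level, digit in enumerate(digits, start=1):
        total += digit << (width - base_log * level)
    return total & mask
```

With `base_log=23` and `level_count=2`, recomposing the digits gives back the input up to a rounding error of at most `2**17`, since the lowest 18 bits are rounded away. Keeping one running `state` instead of a separate output buffer per level is the general idea that makes buffer reuse possible, though how the commit maps this onto shared memory is not shown here.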
148 MiB
Languages
C++ 34.3%
Python 23.1%
MLIR 22.9%
Rust 14.6%
C 2.2%
Other 2.8%