Agnes Leroy
e10c2936d1
feat(cuda): support N=4096 and 8192 for the low latency bootstrap
2022-12-02 09:41:32 +01:00
Beka Barbakadze
c1f1b533ea
fix(cuda): fix pbs for 8192 polynomial_size
2022-12-01 13:05:28 +01:00
Agnes Leroy
921c0a6306
fix(cuda): fix N = 8192 support
2022-11-29 09:00:35 +01:00
Pedro Alves
68866766a4
feat(cuda): Adds a parameter to the CUDA host functions to pass the index of the GPU that should be used.
2022-11-28 15:11:46 +01:00
Agnes Leroy
f04a29aea4
chore(cuda): fix asserts regarding the base log value
2022-11-28 13:58:57 +01:00
Pedro Alves
739db73d46
feat(cuda): batch_fft_ggsw_vector uses global memory in case there is not enough space in the shared memory
2022-11-28 11:48:16 +01:00
Beka Barbakadze
56b986da8b
feat(cuda): new decomposition algorithm for pbs.
...
- removes 16 bit limitation on base_log
- optimizes shared memory use: dedicated decomposition buffers are no longer used; the rotated buffers are reused as the decomposition state buffer for the amortized PBS
- adds a private test for the CUDA PBS, as already exists for the FFT backend
2022-11-28 11:48:16 +01:00
Pedro Alves
d59b2f6dda
feat(cuda): Check for errors after each kernel launch.
2022-11-28 09:52:19 +01:00
Pedro Alves
9d25f9248d
feat(cuda): Implements vertical packing's blind rotation and sample extraction on the CUDA backend. Adds a private test for the CUDA vertical packing + blind rotation.
2022-11-21 09:30:26 +01:00
Agnes Leroy
f36f565b75
chore(cuda): replace casts with cuda intrinsics
2022-11-16 14:22:03 +01:00
Agnes Leroy
da654ee9cb
chore(core): fix clippy error in cuda backend and fix formatting
2022-11-10 14:46:49 +01:00
Pedro Alves
80f4ca7338
fix(cuda): Checks the cudaDevAttrMemoryPoolsSupported property to ensure that asynchronous allocation is supported
2022-11-10 14:46:49 +01:00
Agnes Leroy
553c2e6948
feat(cuda): add lwe / cleartext multiplication GPU acceleration
2022-11-09 15:31:36 +01:00
Pedro Alves
25f103f62d
feat(cuda): Refactor the low latency PBS to use asynchronous allocation.
2022-11-09 09:44:25 +01:00
Pedro Alves
0b58741fd4
feat(cuda): Refactor the amortized PBS to use asynchronous allocation.
2022-11-09 09:44:25 +01:00
Pedro Alves
cf222e9176
feat(cuda): encapsulate asynchronous allocation methods.
2022-11-09 09:44:25 +01:00
Agnes Leroy
13e77b2d8c
feat(cuda): implement lwe ciphertext / plaintext add in concrete-cuda
2022-11-08 09:49:21 +01:00
Agnes Leroy
d34f53b7ee
feat(cuda): implement LWE ciphertext addition on GPU
2022-10-28 13:59:53 +02:00
Agnes Leroy
bc66816341
feat(cuda): implement negation of an LWE ciphertext vector
2022-10-27 17:03:23 +02:00
Agnes Leroy
841c1e6952
refactor(cuda): remove SharedMemory
2022-10-27 15:43:18 +02:00
Agnes Leroy
17312bbd52
refactor(cuda): remove PolynomialFourier
2022-10-27 15:43:00 +02:00
Agnes Leroy
4445fcc7f1
chore(cuda): rename some variables to match concrete-core notations
...
- rename l_gadget and stop calling the low latency PBS when N is too large
- rename trlwe and trgsw
- rename lwe_mask_size into lwe_dimension
- rename lwe_in into lwe_array_in
- rename lwe_out into lwe_array_out
- rename decomp_level into level
- rename lwe_dimension_before/after into lwe_dimension_in/out
2022-10-19 10:26:08 +02:00
Agnes Leroy
c22aa3e4e9
chore(cuda): format sources and add check in ci
2022-10-19 10:26:08 +02:00
Agnes Leroy
acbad678ec
chore(cuda): add assert on glwe_dimension
2022-10-14 17:12:07 +02:00
Agnes Leroy
703c74401c
chore(cuda): add asserts on base log, poly size and num samples values
2022-10-14 17:12:07 +02:00
Pedro Alves
1a76cadaa8
feat(cuda): Implement Stream-Ordered Memory Allocator for CUDA's CMUX Tree
2022-10-10 17:09:13 +02:00
Pedro Alves
3d6524ccf3
chore(cuda): Remove old comments from concrete-cuda.
2022-10-10 16:21:38 +02:00
Beka Barbakadze
01ea1cf2f2
feat(cuda): add extract bits feature in concrete-cuda
...
- also, update decomposition algorithm for concrete-cuda keyswitch
2022-10-03 14:58:36 +02:00
Pedro Alves
4c1c26e1fa
feat(cuda): Implements the CMUX Tree on CUDA backend.
2022-09-22 09:10:50 +02:00
Agnes Leroy
ca2be27149
chore(cuda): add an error in case the size of data to copy is 0
2022-07-22 20:51:24 +02:00
Agnes Leroy
64521f6747
feat(cuda): introduce cuda acceleration for the pbs and keyswitch
...
- a new crate concrete-cuda is added to the repository, containing some CUDA implementations of the bootstrap and keyswitch and a Rust wrapper to call them
- a new backend_cuda is added to concrete-core, with dedicated entities whose memory is located on the GPU and engines that call the CUDA-accelerated functions
2022-06-27 09:10:20 +02:00