37 Commits

Author SHA1 Message Date
Agnes Leroy
75e9baae78 refactor(cuda): introduce scratch for low latency pbs 2023-02-22 16:49:43 +01:00
Agnes Leroy
5cd0cb5d19 refactor(cuda): introduce scratch for amortized PBS 2023-02-17 13:51:41 +01:00
Agnes Leroy
2a487ffbfd refactor(cuda): introduce scratch for blind rotation and sample extraction 2023-02-16 09:31:49 +01:00
Agnes Leroy
870d896ad9 refactor(cuda): introduce cmux tree scratch 2023-02-15 17:32:12 +01:00
Agnes Leroy
e6dfb588db refactor(cuda): prepare to introduce cmux tree scratch 2023-02-15 17:32:12 +01:00
Agnes Leroy
730274f156 refactor(cuda): create scratch function and cleanup for wop pbs 2023-02-14 09:21:30 +01:00
Agnes Leroy
2a299664e7 chore(cuda): refactor cuda errors, remove deprecated files 2023-02-08 14:12:55 +01:00
Beka Barbakadze
3cd48f0de2 feat(cuda): add a new fft algorithm.
- FFT can work for any polynomial size, as long as twiddles are provided.
- All the twiddles fit in the constant memory.
- Bit reverse is not used anymore, no more sw1 and sw2 arrays in constant memory.
- Real to complex compression algorithm is changed.
- Twiddle initialization functions are removed.
2023-02-08 00:49:44 +04:00
Agnes Leroy
bd9cbbc7af fix(cuda): fix asynchronous behaviour for pbs and wop pbs 2023-01-28 14:20:34 +01:00
Agnes Leroy
2fcc5b2d0f feat(cuda): add missing boolean gates 2023-01-25 09:30:50 +01:00
Beka Barbakadze
bc90576454 docs(cuda): add Rust doc for all concrete-cuda entry points 2023-01-25 09:30:26 +01:00
Agnes Leroy
8327cd7fff feat(cuda): add NOT and AND gates to the library 2023-01-09 15:28:52 -03:00
Agnes Leroy
29284b4260 chore(cuda): split wop pbs file and add entry point for wop pbs 2022-12-20 12:49:16 -03:00
Pedro Alves
e324f14c6b chore(cuda): Modifies the CBS+VP host function to fully parallelize the cmux tree and blind rotation. Also changes how the CMUX Tree handles the input LUTs to match the CPU version. 2022-12-16 16:29:59 +01:00
Pedro Alves
9bcf0f8a70 chore(cuda): Refactor device.cu functions to take pointers to cudaStream_t instead of void 2022-12-15 10:40:19 +01:00
Agnes Leroy
e4ba380594 fix(cuda): remove u32 support for cbs+vp entry point 2022-12-14 13:32:01 +01:00
Agnes Leroy
4da789abda feat(cuda): add a cbs+vp entry point
- fix bug in CBS as well
- update cuda benchmarks
2022-12-14 13:32:01 +01:00
Quentin Bourgerie
2db1ef6a56 fix(cuda): Include cuda_runtime.h in device.h to include the definition of cudaStream_t 2022-12-06 11:31:47 +01:00
Beka Barbakadze
0aedb1a4f4 feat(cuda): Add circuit bootstrap in the cuda backend
- Add FP-Keyswitch.
- Add entry points for cuda fp ksk in the public API.
- Add test for fp_ksk in cuda backend.
- Add fixture for bit extract

Co-authored-by: agnesLeroy <agnes.leroy@zama.ai>
2022-12-05 22:00:43 +01:00
Agnes Leroy
e10c2936d1 feat(cuda): support N=4096 and 8192 for the low latency bootstrap 2022-12-02 09:41:32 +01:00
Pedro Alves
68866766a4 feat(cuda): Adds a parameter in the CUDA host functions passing the gpu index that should be used. 2022-11-28 15:11:46 +01:00
Pedro Alves
9d25f9248d feat(cuda): Implements vertical packing's blind rotation and sample extraction on CUDA backend. Implements a private test for the CUDA vertical packing + blind rotation. 2022-11-21 09:30:26 +01:00
Pedro Alves
80f4ca7338 fix(cuda): Checks the cudaDevAttrMemoryPoolsSupported property to ensure that asynchronous allocation is supported 2022-11-10 14:46:49 +01:00
Agnes Leroy
553c2e6948 feat(cuda): add lwe / cleartext multiplication GPU acceleration 2022-11-09 15:31:36 +01:00
Pedro Alves
25f103f62d feat(cuda): Refactor the low latency PBS to use asynchronous allocation. 2022-11-09 09:44:25 +01:00
Pedro Alves
cf222e9176 feat(cuda): encapsulate asynchronous allocation methods. 2022-11-09 09:44:25 +01:00
Agnes Leroy
13e77b2d8c feat(cuda): implement lwe ciphertext / plaintext add in concrete-cuda 2022-11-08 09:49:21 +01:00
Quentin Bourgerie
b560ae6a72 fix(cuda): Remove extra ; in bootstrap header file that led to warnings 2022-11-07 14:01:32 +01:00
Agnes Leroy
d34f53b7ee feat(cuda): implement LWE ciphertext addition on GPU 2022-10-28 13:59:53 +02:00
Agnes Leroy
bc66816341 feat(cuda): implement negation of an LWE ciphertext vector 2022-10-27 17:03:23 +02:00
Agnes Leroy
4445fcc7f1 chore(cuda): rename some variables to match concrete-core notations
- rename l_gadget and stop calling low lat PBS with N too large
- rename trlwe and trgsw
- rename lwe_mask_size into lwe_dimension
- rename lwe_in into lwe_array_in
- rename lwe_out into lwe_array_out
- rename decomp_level into level
- rename lwe_dimension_before/after into lwe_dimension_in/out
2022-10-19 10:26:08 +02:00
Agnes Leroy
c22aa3e4e9 chore(cuda): format sources and add check in ci 2022-10-19 10:26:08 +02:00
Agnes Leroy
acbad678ec chore(cuda): add assert on glwe_dimension 2022-10-14 17:12:07 +02:00
Beka Barbakadze
01ea1cf2f2 feat(cuda): add extract bits feature in concrete-cuda
- also, update decomposition algorithm for concrete-cuda keyswitch
2022-10-03 14:58:36 +02:00
Pedro Alves
26f26a2132 fix(cuda): Add a conditional macro check to remove CUDA-specific definitions when not needed.
2022-09-27 09:43:01 +02:00
Pedro Alves
4c1c26e1fa feat(cuda): Implements the CMUX Tree on CUDA backend. 2022-09-22 09:10:50 +02:00
Agnes Leroy
64521f6747 feat(cuda): introduce cuda acceleration for the pbs and keyswitch
- a new crate concrete-cuda is added to the repository, containing some Cuda implementations for the bootstrap and keyswitch and a Rust wrapping to call them
- a new backend_cuda is added to concrete-core, with dedicated entities whose memory is located on the GPU and engines that call the Cuda accelerated functions
2022-06-27 09:10:20 +02:00