Commit Graph

426 Commits

Author SHA1 Message Date
Guillermo Oyarzun
c2e816a86c fix(gpu): change mininum number of elements in benches 2025-09-04 11:03:27 +02:00
Guillermo Oyarzun
baad6a6b49 feat(gpu): change broadcast lut to communicate the minimum possible 2025-09-03 15:20:58 +02:00
Guillermo Oyarzun
88c3df8331 feat(gpu): improve communication scheme 2025-09-03 15:20:58 +02:00
Pedro Alves
57ea3e3e88 chore(gpu): refactor the entry points for PBS in the backend 2025-08-29 16:46:27 -03:00
Pedro Alves
cad4070ebe fix(gpu): fix the decompression function signature in the backend 2025-08-29 21:09:40 +02:00
Pedro Alves
94d24e1f8b feat(gpu): implement the centered modulus switch technique to classical PBS 2025-08-29 11:38:26 -03:00
Pedro Alves
9a1c0f48f4 feat(gpu): implement 128-bit compression and add it to the integer API 2025-08-29 11:26:07 -03:00
Guillermo Oyarzun
ff29535eb0 feat(gpu): enable specialized pbs for 4_1_1 params 2025-08-29 10:19:45 +02:00
Andrei Stoian
c06b513182 chore(gpu): add valgrind and fix leaks 2025-08-28 14:21:57 +02:00
Nicolas Sarlin
fa48444611 chore(ci): update toolchain to nightly-2025-08-26 2025-08-28 08:41:48 +02:00
Andrei Stoian
71f427de9e chore(gpu): add assert macro 2025-08-27 10:32:43 +02:00
Enzo Di Maria
14063ca3b3 fix(gpu): fix perf of ilog2 backend 2025-08-26 14:53:08 +02:00
Andrei Stoian
f776c737a1 chore(gpu): fix typos 2025-08-25 10:02:07 +02:00
Guillermo Oyarzun
c1c7fe78ed fix(gpu): fix memory leak in count consecutive bits 2025-08-22 17:39:32 +02:00
Guillermo Oyarzun
827cea966b chore(gpu): fix nvtx labels and a comment 2025-08-21 18:02:53 +02:00
Enzo Di Maria
e5e54be4a4 refactor(gpu): moving unchecked_ilog2_async to the backend 2025-08-12 09:05:29 +02:00
Guillermo Oyarzun
4a3be71bd7 fix(gpu): create message extract lut only when needed 2025-08-11 10:38:31 +02:00
pgardratzama
afd8f58a8d feat(hpu): update backend to support multiple V80 device, id of v80 is its serial number
- update psi64 to replace fw with stable version (3.1.0), remove psi16.hpu
2025-08-07 14:58:39 +02:00
Guillermo Oyarzun
1b92bcf476 feat(gpu): extra optimizations for 2_2 params kernels and bugs fixes 2025-08-07 09:34:32 +02:00
Guillermo Oyarzun
79d5db66d4 feat(gpu): use warp level optimizations for fft 2025-08-07 09:34:32 +02:00
Guillermo Oyarzun
d741e55218 feat(gpu): write specialized pbs keybundle for 2_2 params 2025-08-07 09:34:32 +02:00
Guillermo Oyarzun
ef5a391dc2 feat(gpu): write specialized pbs accumulate for 2_2 params 2025-08-07 09:34:32 +02:00
Enzo Di Maria
d1c417bf71 refactor(gpu): cleaning compression 2025-08-07 09:31:55 +02:00
Enzo Di Maria
852a06b330 refactor(gpu): orpf with grouped processing and for multi-gpu 2025-08-05 09:58:25 +02:00
Mayeul@Zama
fe2dde0e0c chore(gpu): fix index type 2025-08-01 10:38:09 +02:00
Afounso Souza
e7e095b924 fix(gpu): fix typo
fix(gpu): fix typo
2025-08-01 10:21:54 +02:00
Andrei Stoian
7bf2ec6ff2 chore(gpu): fix warnings detection 2025-07-31 18:47:08 +02:00
Agnes Leroy
2d7e1b2293 chore(gpu): change active gpu count logic 2025-07-31 16:10:45 +01:00
Guillermo Oyarzun
a411e5720d fix(gpu): update soon deprecated version nvtx 2025-07-31 16:52:05 +02:00
Agnes Leroy
54d038ef30 chore(gpu): enhance scatter to check gpu count is ok 2025-07-31 13:11:52 +01:00
Guillermo Oyarzun
908922171d fix(gpu): remove unused pointer in squash and add some extra checks 2025-07-31 09:52:34 +01:00
Kendra Karol Sevilla
84f6a8082d fix(cuda): correct radix block mismatch check in LWE array validation 2025-07-31 09:28:48 +01:00
otc group
0bc59dca59 chore: fix typo in comment
chore: fix typo in comment
2025-07-31 09:20:49 +01:00
Agnes Leroy
09ffc39b15 fix(gpu): fix inconsistent types 2025-07-31 08:14:45 +01:00
Andrei Stoian
36eceaf05e feat(gpu): utility debug workflows in ci 2025-07-30 12:55:40 +01:00
cryptoraph
d78266e141 fix(cuda): correct typo in keyswitch error message 2025-07-28 17:09:56 +02:00
Pedro Alves
62e6504ef0 fix(gpu): fixes some wrong indexes used in cuda_set_device() 2025-07-28 12:43:13 +01:00
Pedro Alves
7ecda32b41 fix(gpu): refactor broadcast_lut() so make it less error prone 2025-07-17 10:58:52 +01:00
Agnes Leroy
c7785b7214 chore(gpu): add checks on noise levels in debug mode 2025-07-16 11:00:09 +01:00
Enzo Di Maria
a5c876fdac refactor(gpu): creating CudaScalarDivisorFFI for storing decomposed scalars and their metadata 2025-07-16 07:59:20 +01:00
Agnes Leroy
15762623d1 chore(gpu): minor refactor in sum ctxt 2025-07-10 16:24:02 +01:00
Beka Barbakadze
c6865ab880 fix(gpu): fix pbs128 multi-gpu bug
Signed-off-by: Beka Barbakadze <beka.barbakadze@zama.ai>
2025-07-10 15:54:27 +01:00
Enzo Di Maria
e376df2fa4 refactor(gpu): moving unsigned_scalar_div_rem and signed_scalar_div_rem to the backend 2025-07-10 09:24:13 +02:00
Enzo Di Maria
ba87f1ba5e chore(gpu): removing useless arguments 2025-07-08 16:17:51 +02:00
Nicolas Sarlin
7bcd6b94da chore: use script to pull hpu files 2025-07-07 13:10:55 +02:00
Nicolas Sarlin
57cbab9fe1 chore(backward): integrate backward compat data
Code is taken from
59a6179831

Adapted to make ci work
2025-07-07 13:10:55 +02:00
Agnes Leroy
48dfeb21dc chore(gpu): refactor size tracker to avoid future bugs 2025-07-04 14:37:02 +01:00
Pedro Alves
8c88678ee8 feat(gpu): implement 128-bit multi-bit PBS 2025-07-03 20:34:32 -03:00
JJ-hw
405fdec6b9 fix(hpu): Fix iop_propagate_msb_to_lsb_blockv: propagation in application was not done correctly 2025-07-03 14:31:59 +02:00
Agnes Leroy
b3355e2b2f chore(gpu): remove template from sum ciphertexts, add two missing delete 2025-07-03 12:51:29 +01:00