Guillermo Oyarzun
|
c2e816a86c
|
fix(gpu): change minimum number of elements in benches
|
2025-09-04 11:03:27 +02:00 |
|
Guillermo Oyarzun
|
baad6a6b49
|
feat(gpu): change broadcast lut to communicate the minimum possible
|
2025-09-03 15:20:58 +02:00 |
|
Guillermo Oyarzun
|
88c3df8331
|
feat(gpu): improve communication scheme
|
2025-09-03 15:20:58 +02:00 |
|
Pedro Alves
|
57ea3e3e88
|
chore(gpu): refactor the entry points for PBS in the backend
|
2025-08-29 16:46:27 -03:00 |
|
Pedro Alves
|
cad4070ebe
|
fix(gpu): fix the decompression function signature in the backend
|
2025-08-29 21:09:40 +02:00 |
|
Pedro Alves
|
94d24e1f8b
|
feat(gpu): implement the centered modulus switch technique to classical PBS
|
2025-08-29 11:38:26 -03:00 |
|
Pedro Alves
|
9a1c0f48f4
|
feat(gpu): implement 128-bit compression and add it to the integer API
|
2025-08-29 11:26:07 -03:00 |
|
Guillermo Oyarzun
|
ff29535eb0
|
feat(gpu): enable specialized pbs for 4_1_1 params
|
2025-08-29 10:19:45 +02:00 |
|
Andrei Stoian
|
c06b513182
|
chore(gpu): add valgrind and fix leaks
|
2025-08-28 14:21:57 +02:00 |
|
Nicolas Sarlin
|
fa48444611
|
chore(ci): update toolchain to nightly-2025-08-26
|
2025-08-28 08:41:48 +02:00 |
|
Andrei Stoian
|
71f427de9e
|
chore(gpu): add assert macro
|
2025-08-27 10:32:43 +02:00 |
|
Enzo Di Maria
|
14063ca3b3
|
fix(gpu): fix perf of ilog2 backend
|
2025-08-26 14:53:08 +02:00 |
|
Andrei Stoian
|
f776c737a1
|
chore(gpu): fix typos
|
2025-08-25 10:02:07 +02:00 |
|
Guillermo Oyarzun
|
c1c7fe78ed
|
fix(gpu): fix memory leak in count consecutive bits
|
2025-08-22 17:39:32 +02:00 |
|
Guillermo Oyarzun
|
827cea966b
|
chore(gpu): fix nvtx labels and a comment
|
2025-08-21 18:02:53 +02:00 |
|
Enzo Di Maria
|
e5e54be4a4
|
refactor(gpu): moving unchecked_ilog2_async to the backend
|
2025-08-12 09:05:29 +02:00 |
|
Guillermo Oyarzun
|
4a3be71bd7
|
fix(gpu): create message extract lut only when needed
|
2025-08-11 10:38:31 +02:00 |
|
pgardratzama
|
afd8f58a8d
|
feat(hpu): update backend to support multiple V80 device, id of v80 is its serial number
- update psi64 to replace fw with stable version (3.1.0), remove psi16.hpu
|
2025-08-07 14:58:39 +02:00 |
|
Guillermo Oyarzun
|
1b92bcf476
|
feat(gpu): extra optimizations for 2_2 params kernels and bugs fixes
|
2025-08-07 09:34:32 +02:00 |
|
Guillermo Oyarzun
|
79d5db66d4
|
feat(gpu): use warp level optimizations for fft
|
2025-08-07 09:34:32 +02:00 |
|
Guillermo Oyarzun
|
d741e55218
|
feat(gpu): write specialized pbs keybundle for 2_2 params
|
2025-08-07 09:34:32 +02:00 |
|
Guillermo Oyarzun
|
ef5a391dc2
|
feat(gpu): write specialized pbs accumulate for 2_2 params
|
2025-08-07 09:34:32 +02:00 |
|
Enzo Di Maria
|
d1c417bf71
|
refactor(gpu): cleaning compression
|
2025-08-07 09:31:55 +02:00 |
|
Enzo Di Maria
|
852a06b330
|
refactor(gpu): orpf with grouped processing and for multi-gpu
|
2025-08-05 09:58:25 +02:00 |
|
Mayeul@Zama
|
fe2dde0e0c
|
chore(gpu): fix index type
|
2025-08-01 10:38:09 +02:00 |
|
Afounso Souza
|
e7e095b924
|
fix(gpu): fix typo
|
2025-08-01 10:21:54 +02:00 |
|
Andrei Stoian
|
7bf2ec6ff2
|
chore(gpu): fix warnings detection
|
2025-07-31 18:47:08 +02:00 |
|
Agnes Leroy
|
2d7e1b2293
|
chore(gpu): change active gpu count logic
|
2025-07-31 16:10:45 +01:00 |
|
Guillermo Oyarzun
|
a411e5720d
|
fix(gpu): update soon deprecated version nvtx
|
2025-07-31 16:52:05 +02:00 |
|
Agnes Leroy
|
54d038ef30
|
chore(gpu): enhance scatter to check gpu count is ok
|
2025-07-31 13:11:52 +01:00 |
|
Guillermo Oyarzun
|
908922171d
|
fix(gpu): remove unused pointer in squash and add some extra checks
|
2025-07-31 09:52:34 +01:00 |
|
Kendra Karol Sevilla
|
84f6a8082d
|
fix(cuda): correct radix block mismatch check in LWE array validation
|
2025-07-31 09:28:48 +01:00 |
|
otc group
|
0bc59dca59
|
chore: fix typo in comment
|
2025-07-31 09:20:49 +01:00 |
|
Agnes Leroy
|
09ffc39b15
|
fix(gpu): fix inconsistent types
|
2025-07-31 08:14:45 +01:00 |
|
Andrei Stoian
|
36eceaf05e
|
feat(gpu): utility debug workflows in ci
|
2025-07-30 12:55:40 +01:00 |
|
cryptoraph
|
d78266e141
|
fix(cuda): correct typo in keyswitch error message
|
2025-07-28 17:09:56 +02:00 |
|
Pedro Alves
|
62e6504ef0
|
fix(gpu): fixes some wrong indexes used in cuda_set_device()
|
2025-07-28 12:43:13 +01:00 |
|
Pedro Alves
|
7ecda32b41
|
fix(gpu): refactor broadcast_lut() to make it less error prone
|
2025-07-17 10:58:52 +01:00 |
|
Agnes Leroy
|
c7785b7214
|
chore(gpu): add checks on noise levels in debug mode
|
2025-07-16 11:00:09 +01:00 |
|
Enzo Di Maria
|
a5c876fdac
|
refactor(gpu): creating CudaScalarDivisorFFI for storing decomposed scalars and their metadata
|
2025-07-16 07:59:20 +01:00 |
|
Agnes Leroy
|
15762623d1
|
chore(gpu): minor refactor in sum ctxt
|
2025-07-10 16:24:02 +01:00 |
|
Beka Barbakadze
|
c6865ab880
|
fix(gpu): fix pbs128 multi-gpu bug
Signed-off-by: Beka Barbakadze <beka.barbakadze@zama.ai>
|
2025-07-10 15:54:27 +01:00 |
|
Enzo Di Maria
|
e376df2fa4
|
refactor(gpu): moving unsigned_scalar_div_rem and signed_scalar_div_rem to the backend
|
2025-07-10 09:24:13 +02:00 |
|
Enzo Di Maria
|
ba87f1ba5e
|
chore(gpu): removing useless arguments
|
2025-07-08 16:17:51 +02:00 |
|
Nicolas Sarlin
|
7bcd6b94da
|
chore: use script to pull hpu files
|
2025-07-07 13:10:55 +02:00 |
|
Nicolas Sarlin
|
57cbab9fe1
|
chore(backward): integrate backward compat data
Code is taken from
59a6179831
Adapted to make ci work
|
2025-07-07 13:10:55 +02:00 |
|
Agnes Leroy
|
48dfeb21dc
|
chore(gpu): refactor size tracker to avoid future bugs
|
2025-07-04 14:37:02 +01:00 |
|
Pedro Alves
|
8c88678ee8
|
feat(gpu): implement 128-bit multi-bit PBS
|
2025-07-03 20:34:32 -03:00 |
|
JJ-hw
|
405fdec6b9
|
fix(hpu): Fix iop_propagate_msb_to_lsb_blockv: propagation in application was not done correctly
|
2025-07-03 14:31:59 +02:00 |
|
Agnes Leroy
|
b3355e2b2f
|
chore(gpu): remove template from sum ciphertexts, add two missing delete
|
2025-07-03 12:51:29 +01:00 |
|