Beka Barbakadze
80cacbd079
feat(gpu): add boolean bitops in cuda backend
2025-11-20 14:56:21 +01:00
Nicolas Sarlin
edb435bd46
chore: update msrv to 1.91.1
2025-11-20 09:29:37 +01:00
Agnes Leroy
df73c36cbf
fix(gpu): fix decomposition algorithm not matching the theory
2025-11-14 16:36:35 +01:00
Agnes Leroy
4f9f4982f6
fix(gpu): fix memory leak in rerand
2025-11-14 14:00:01 +01:00
pgardratzama
d38df76eb6
chore(hpu): adds a page about HPU PBS performances
2025-11-10 18:43:50 +01:00
pgardratzama
afaf761cdd
chore(hpu): adds 3 custom IOp to measure PBS performance on HPU and update trace parser to handle 32b timestamp wrap
2025-11-10 18:43:50 +01:00
pgardratzama
4eb4fa95e3
feat(hpu): new HPU bitstream with a few optimizations (GRAM arb, ALU nb, BSK manager)
2025-11-10 09:14:18 +01:00
Guillermo Oyarzun
12426573fa
fix(gpu): add upper bound to lwe_chunk_size calculation
2025-11-07 09:29:40 +01:00
Guillermo Oyarzun
6f105cd82e
fix(gpu): fix out of bounds in specialized classical pbs
2025-11-06 15:35:04 +01:00
Enzo Di Maria
4ff95e3a42
feat(gpu): AES 256
2025-11-05 13:37:08 +01:00
Baptiste Roux
f970031d33
chore(hpu): Update version of hw_regmap deps
This new version updates the Rust MSRV.
2025-11-04 15:26:27 +01:00
Arthur Meyre
00ce0deec9
chore: make typos version fixed
- add a script to properly install the correct version
- correct new typos
2025-11-03 14:58:23 +01:00
Enzo Di Maria
026cc376ed
refactor(gpu): multibit decompression
2025-10-30 08:59:10 +01:00
Pedro Alves
867f8fb579
feat(gpu): implement re-randomization
- exposed to integer and HL API
- test on the HL API
- benchmarks for GPU and CPU implementation
2025-10-29 17:55:45 -03:00
Guillermo Oyarzun
62780ac500
fix(gpu): fix decompression mem leak
2025-10-24 13:02:41 +02:00
Guillermo Oyarzun
e12638dabe
feat(gpu): extend specialized version to classical pbs
2025-10-22 09:20:40 +02:00
pgardratzama
79f1d22573
fix(hpu): scalar rot & shift were not doing anything and not tested in test/hpu.rs
2025-10-21 13:29:59 +02:00
pgardratzama
b918f77859
chore(hpu): add force_reload option in v80 config, remove added line in sim config
2025-10-21 13:29:59 +02:00
Helder Campos
054c5028a1
feat(hpu): Added the option to forcefully reload the HPU
2025-10-21 13:29:59 +02:00
Helder Campos
7b621e57b0
feat(hpu): LLT ROT/SHIFT IOPs
2025-10-21 13:29:59 +02:00
Agnes Leroy
b4b6275ca5
chore(gpu): remove device synchronize in drop for cudavec
2025-10-21 11:33:46 +02:00
Beka Barbakadze
39862c2861
fix(gpu): fix bug in are_all_comparison_blocks_true when number of blocks is 0
2025-10-20 13:26:50 +02:00
Guillermo Oyarzun
c22e63895e
fix(gpu): fix multi-gpu throughput benches with classical pbs
2025-10-16 17:55:10 +02:00
Enzo Di Maria
126e779533
refactor(gpu): oprf_unsigned_custom_range + tests
2025-10-16 09:31:01 +02:00
Enzo Di Maria
353237c0d6
refactor(gpu): oprf_unsigned_custom_range
2025-10-16 09:31:01 +02:00
Agnes Leroy
7bad509f9a
fix(gpu): fix perf regression introduced in cf3f25efdd
2025-10-16 09:21:05 +02:00
Agnes Leroy
cf3f25efdd
chore(gpu): add missing syncs in linearalgebra functions and aes
2025-10-14 09:23:11 +02:00
Agnes Leroy
c3ed1a7558
chore(gpu): internal renaming
2025-10-14 09:23:11 +02:00
Agnes Leroy
6347f25668
chore(gpu): synchronize after every release
2025-10-14 09:23:11 +02:00
Agnes Leroy
91b263d480
chore(gpu): split integer utilities file
2025-10-10 14:49:02 +02:00
Andrei Stoian
30938eec74
chore(gpu): use active streams in int_radix_lut
2025-10-09 21:59:15 +02:00
pgardratzama
ca4159f123
fix(hpu): fix overflow flag of OVF_MUL & OVF_MULS, also update simulation HPU config
2025-10-07 10:14:43 +02:00
Arthur Meyre
e07f07c4c8
chore: bump tfhe-cuda-backend to 0.12.0
2025-10-06 13:26:54 +02:00
Arthur Meyre
81cc0c31b4
chore: constrain bytemuck < 1.24.0 as we don't have avx512 updated code
2025-10-06 13:24:16 +02:00
Enzo Di Maria
f0f3dd76eb
feat(gpu): aes 128
2025-10-06 09:31:36 +02:00
Andrei Stoian
0604d237eb
chore(gpu): multi-gpu debug target
2025-10-03 16:48:42 +02:00
Agnes Leroy
f9e876730a
chore(gpu): remove support for drift noise reduction
2025-10-03 09:45:20 +02:00
pgardratzama
602c6faf8a
chore(hpu): update hpu-backend dependencies, fix pcc
2025-10-02 13:20:36 +02:00
pgardratzama
563502a6a6
chore(hpu): update tfhe-hpu-backend version, readme and run-on-hpu doc
2025-10-02 13:20:36 +02:00
pgardratzama
5f30569452
fix(hpu): update AMC firmware in bitstream to lower polling period
2025-10-02 13:20:36 +02:00
pgardratzama
39b81a8ded
feat(hpu): move to new bitstream at 400 MHz with GRAM_NB 3
- update SIMD_N and min_batch_size to 12, which seems to give better
latency and ERC20 throughput
- support IOp on several lines in ami /proc file
- reduce amount of ERC_20_SIMD per batch in HLAPI bench
2025-10-02 13:20:36 +02:00
pgardratzama
da223b36b6
fix(hpu): reduce polling period of backend on iop ack file from 10 to 2us
2025-10-02 13:20:36 +02:00
JJ-hw
db16276715
chore(hpu): Remove all references to U55C, which is not supported anymore.
2025-10-02 13:20:36 +02:00
pgardratzama
a59742f518
fix(hpu): UUID comparison is now done in lower case for both values (metadata, ami read)
2025-10-02 13:20:36 +02:00
Arthur Meyre
9fdaa983e3
chore: fix october typos
2025-10-01 14:32:41 +02:00
Agnes Leroy
71b45c14da
chore(gpu): refactor subset_first and subset
2025-09-30 12:21:39 +02:00
Beka Barbakadze
7549474aac
feat(gpu): implement optimized division algorithm for message_2_carry_2 when 4 or more GPUs are used
2025-09-29 15:16:34 +02:00
Agnes Leroy
23d46ba2bc
fix(gpu): fix oprf output degree
2025-09-29 08:33:25 +02:00
Agnes Leroy
daf0e79e4a
fix(gpu): fix get oprf size on gpu
2025-09-29 08:33:25 +02:00
Arthur Meyre
c5ad73865c
chore: prepare alpha.2
- bump tfhe-cuda-backend to 0.12.0-alpha.2
- bump tfhe to 1.4.0-alpha.2
2025-09-27 11:35:27 +02:00