Commit Graph

3590 Commits

Author SHA1 Message Date
Mayeul@Zama
5ee5569d0d chore: remove redundant gpu feature gate 2025-07-31 14:11:33 +01:00
Agnes Leroy
54d038ef30 chore(gpu): enhance scatter to check gpu count is ok 2025-07-31 13:11:52 +01:00
Agnes Leroy
b6e6abb066 chore(gpu): add corner case test for mul 2025-07-31 13:11:29 +01:00
Arthur Meyre
82a5cc7f2d chore(ci): increase timeout for noise checks 2025-07-31 12:00:15 +02:00
Guillermo Oyarzun
908922171d fix(gpu): remove unused pointer in squash and add some extra checks 2025-07-31 09:52:34 +01:00
Kendra Karol Sevilla
84f6a8082d fix(cuda): correct radix block mismatch check in LWE array validation 2025-07-31 09:28:48 +01:00
otc group
0bc59dca59 chore: fix typo in comment
2025-07-31 09:20:49 +01:00
Agnes Leroy
09ffc39b15 fix(gpu): fix inconsistent types 2025-07-31 08:14:45 +01:00
swarnabhasinha
099345df02 fix(api): Add min/max on owned types 2025-07-30 21:33:58 +02:00
Agnes Leroy
48c10e91f7 chore(gpu): fix gpu setup action for ci 2025-07-30 14:22:13 +01:00
Andrei Stoian
36eceaf05e feat(gpu): utility debug workflows in ci 2025-07-30 12:55:40 +01:00
Arthur Meyre
e8986cbd7c chore: setup CI for noise checks 2025-07-29 15:29:24 +02:00
Arthur Meyre
fc8063a59b test: add noise simulation framework for basic operators
- test secret key encryption + start of compute atomic pattern in shortint
- only supports classic PBS with drift mitigation currently
2025-07-29 15:29:24 +02:00
Arthur Meyre
65b034ef70 chore(core): update noise formulas 2025-07-29 15:29:24 +02:00
cryptoraph
2aa83c99ea fix(core_crypto): correct typo in GLWE keyswitch assertion message 2025-07-28 17:09:56 +02:00
cryptoraph
d78266e141 fix(cuda): correct typo in keyswitch error message 2025-07-28 17:09:56 +02:00
Pedro Alves
62e6504ef0 fix(gpu): fixes some wrong indexes used in cuda_set_device() 2025-07-28 12:43:13 +01:00
bigbear
209a8f1ad9 fix: correct typo in InvalidRangeError message 2025-07-25 17:18:57 +02:00
Guillermo Oyarzun
3621dd1ae7 chore(gpu): correct pfail in readme 2025-07-25 16:46:03 +02:00
Guillermo Oyarzun
b5a7199c15 chore(gpu): update cuda version in ci 2025-07-25 15:57:25 +02:00
Mayeul@Zama
09aaa4e045 chore: enable unused_imports lint in doctests 2025-07-23 11:09:23 +02:00
Mayeul@Zama
63203c58aa refactor(shortint): cleanup decompression key 2025-07-23 10:08:01 +02:00
Mayeul@Zama
03f8a134b3 chore: remove unused gitignore entry 2025-07-21 10:15:56 +02:00
Maximilian Hubert
981da1d3fc fix: fix typo 2025-07-18 18:16:52 +02:00
Nicolas Sarlin
0386090048 chore: missing from/into_raw_parts for noise squash comp priv key 2025-07-18 13:34:23 +02:00
Pedro Alves
7ecda32b41 fix(gpu): refactor broadcast_lut() to make it less error prone 2025-07-17 10:58:52 +01:00
tmontaigu
a4cfc12941 chore: add missing into/from_raw_parts for SQCompression 2025-07-17 11:20:30 +02:00
Alex Pikme
82b6f45785 fix: correct tuple element in izip! macro for 7 parameters 2025-07-17 10:54:45 +02:00
tmontaigu
82cc9d3884 docs: add UpgradeKeyChain 2025-07-16 17:57:10 +02:00
Agnes Leroy
c7785b7214 chore(gpu): add checks on noise levels in debug mode 2025-07-16 11:00:09 +01:00
Enzo Di Maria
a5c876fdac refactor(gpu): creating CudaScalarDivisorFFI for storing decomposed scalars and their metadata 2025-07-16 07:59:20 +01:00
Nicolas Sarlin
2d8ea2de16 feat(shortint): add pbs_order method to AtomicPatternKind 2025-07-15 17:35:47 +02:00
Andrei Stoian
494e0e0601 chore(gpu): add short op sequence test for GPU on PRs 2025-07-15 16:03:45 +02:00
tmontaigu
8c838da209 chore(integer): improve measurements
It seems that in
```rust
bench_group.bench_function(&bench_id, |b| {
  // some code
  b.iter(|| {
      // function to bench
  })
});
```
If we put code in the '// some code' part, it affects the measurements:
the slower this code is, the worse the measurements can be.

For many operations the gap is small (a few ms or no gap),
but for the division the gap was around 500ms.

To reduce this, we move out what we can. Moving the keycache access
is the most important part, as it costs around 70ms to 100ms.

A LazyCell is used so that the keycache is only accessed if the bench is
not filtered out. This is the behaviour we had before this commit, and the
behaviour we want to keep so that running specific benches via regex
selection stays fast.

Also, for clean input benches, we use `iter` instead of `iter_batched`,
as it makes more sense and should give more accurate results:
`iter_batched` timing includes other things than just the timing of the
function.
2025-07-15 12:46:38 +02:00
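The LazyCell approach described in commit 8c838da209 can be illustrated with a minimal sketch. It does not use criterion or the real keycache: `run_bench`, `setup_runs`, the bench name `integer::div`, and the `Vec<u64>` standing in for keys are all hypothetical, and the harness is assumed to call the routine several times when the bench name matches the filter. Under those assumptions, the expensive setup runs at most once for a selected bench and not at all for a filtered-out one.

```rust
use std::cell::{Cell, LazyCell};

/// Hypothetical stand-in for a bench harness: calls `routine` `rounds`
/// times, but only when `name` matches `filter`.
fn run_bench(name: &str, filter: &str, rounds: u32, mut routine: impl FnMut()) -> bool {
    if !name.contains(filter) {
        return false; // bench filtered out, routine never runs
    }
    for _ in 0..rounds {
        routine();
    }
    true
}

/// Returns how many times the expensive setup ran for the given filter.
fn setup_runs(filter: &str) -> u32 {
    let init_count = Cell::new(0u32);

    // Expensive setup (a stand-in for a keycache access), created outside
    // the routine and deferred until the first dereference.
    let keys = LazyCell::new(|| {
        init_count.set(init_count.get() + 1);
        (0u64..1024).collect::<Vec<u64>>()
    });

    run_bench("integer::div", filter, 5, || {
        // The first dereference initializes `keys`; later ones reuse it.
        let sum: u64 = keys.iter().sum();
        std::hint::black_box(sum);
    });

    init_count.get()
}

fn main() {
    println!("matching filter: setup ran {} time(s)", setup_runs("div"));
    println!("non-matching filter: setup ran {} time(s)", setup_runs("mul"));
}
```

`std::cell::LazyCell` is single-threaded. Setup that has to be shared across threads would need `std::sync::LazyLock` instead.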
tmontaigu
c13587b713 fix(integer): fix non-parallel prop with noisy block 2025-07-15 12:43:41 +02:00
tmontaigu
8dea5cf145 feat(integer): truncate carry prop on trivial zeros
This changes full_propagate_parallelized so that it does not propagate
the most significant blocks when they are trivial zeros.

This is a small performance improvement, especially interesting
when having a bunch of FheUintX data, cast to FheUintY (Y > X),
and summing them (e.g. n FheUint2, cast to FheUint32, then summed
to get the result on 32 bits)
2025-07-15 12:43:41 +02:00
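The idea in commit 8dea5cf145 can be illustrated on plaintext data. The sketch below is not the tfhe-rs implementation: `Block`, `BASE`, `propagate`, `to_value`, and `demo` are hypothetical names, blocks hold plain integers instead of ciphertexts, and 2 message bits per block is an assumed parameter. It only shows why propagation can stop early once the remaining most significant blocks are trivial zeros and no carry is pending.

```rust
/// Plaintext model of a radix block (hypothetical, not a tfhe-rs type).
#[derive(Clone, Copy, Debug, PartialEq)]
struct Block {
    value: u64,         // message plus any accumulated carry
    trivial_zero: bool, // true when the block is known to be a plain zero
}

const BASE: u64 = 4; // assumed: 2 message bits per block

/// Propagates carries from least to most significant block and returns
/// the number of blocks that were actually processed.
fn propagate(blocks: &mut [Block]) -> usize {
    // Number of blocks up to and including the last non-trivial one.
    let active = blocks
        .iter()
        .rposition(|b| !b.trivial_zero)
        .map_or(0, |i| i + 1);

    let mut carry = 0;
    let mut processed = 0;
    for (i, block) in blocks.iter_mut().enumerate() {
        // Only trivial zeros remain and no carry is pending: stop here.
        if i >= active && carry == 0 {
            break;
        }
        let total = block.value + carry;
        // A block that received a carry is no longer a trivial zero.
        block.trivial_zero = block.trivial_zero && carry == 0;
        block.value = total % BASE;
        carry = total / BASE;
        processed += 1;
    }
    processed
}

/// Recombines the blocks into the integer they represent.
fn to_value(blocks: &[Block]) -> u64 {
    blocks.iter().rev().fold(0, |acc, b| acc * BASE + b.value)
}

/// Two noisy low blocks followed by 14 trivial zeros, as after casting a
/// small integer to a 32-bit one. Returns (processed, before, after).
fn demo() -> (usize, u64, u64) {
    let mut blocks = vec![Block { value: 0, trivial_zero: true }; 16];
    blocks[0] = Block { value: 7, trivial_zero: false };
    blocks[1] = Block { value: 6, trivial_zero: false };
    let before = to_value(&blocks);
    let processed = propagate(&mut blocks);
    (processed, before, to_value(&blocks))
}

fn main() {
    let (processed, before, after) = demo();
    println!("processed {processed} of 16 blocks, value {before} -> {after}");
}
```

In this model the represented value is unchanged while only the low blocks, plus one block that absorbs the outgoing carry, are touched. In the encrypted setting each skipped block would presumably save the corresponding homomorphic work, but that cost model is not part of this sketch.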
Agnes Leroy
0d41b4f445 chore(gpu): add bench command for cuda and update weekly bench 2025-07-11 14:04:32 +01:00
Agnes Leroy
068cbc0f41 chore(gpu): add hl api noise squash latency and throughput bench 2025-07-11 14:04:32 +01:00
Agnes Leroy
f8947ddff3 chore(gpu): remove nightly schedule now that ci is lighter 2025-07-11 12:43:36 +01:00
Pedro Alves
1b98312e2c fix(gpu): fix regression on ERC20 throughput
- partially revert changes done in fd79c4f972
- transfers for the GPU case should be measured using sequential
  operations (without rayon!)
2025-07-11 08:57:19 +01:00
Pedro Alves
d3dd010deb fix(gpu): reduces number of elements in the ZK throughput benchmark 2025-07-11 08:57:01 +01:00
Agnes Leroy
15762623d1 chore(gpu): minor refactor in sum ctxt 2025-07-10 16:24:02 +01:00
Beka Barbakadze
c6865ab880 fix(gpu): fix pbs128 multi-gpu bug
Signed-off-by: Beka Barbakadze <beka.barbakadze@zama.ai>
2025-07-10 15:54:27 +01:00
Enzo Di Maria
e376df2fa4 refactor(gpu): moving unsigned_scalar_div_rem and signed_scalar_div_rem to the backend 2025-07-10 09:24:13 +02:00
Arthur Meyre
bd739c2d48 chore(docs): uniformize paths in docs to use "-" instead of "_"
- this is to avoid conflicts with gitbook
2025-07-09 14:36:04 +02:00
Pedro Alves
9960f5e8b6 fix(gpu): Fix expand bench on multi-gpus 2025-07-09 09:17:55 +01:00
Nicolas Sarlin
776f08b534 chore(ci): remove close_data_pr workflow 2025-07-09 09:31:29 +02:00
David Testé
ac13eed3b1 chore(ci): allow git lfs sync between repositories
Since the integration of the HPU backend, some Git LFS references need to be synced along with the rest of the codebase. The valtech-sd/git-sync action, which is a fork of wei/git-sync, allows pushing Git LFS references to another repository.
2025-07-09 09:07:48 +02:00
Arthur Meyre
17d3a492b6 chore: only run backward compat clippy on x86 machines
- older versions of the crates are only compilable with x86, disable on arm
for now
- revisit when the crates are split ?
2025-07-09 08:29:12 +02:00
Enzo Di Maria
ba87f1ba5e chore(gpu): removing useless arguments 2025-07-08 16:17:51 +02:00