Mayeul@Zama
5ee5569d0d
chore: remove redundant gpu feature gate
2025-07-31 14:11:33 +01:00
Agnes Leroy
54d038ef30
chore(gpu): enhance scatter to check gpu count is ok
2025-07-31 13:11:52 +01:00
Agnes Leroy
b6e6abb066
chore(gpu): add corner case test for mul
2025-07-31 13:11:29 +01:00
Arthur Meyre
82a5cc7f2d
chore(ci): increase timeout for noise checks
2025-07-31 12:00:15 +02:00
Guillermo Oyarzun
908922171d
fix(gpu): remove unused pointer in squash and add some extra checks
2025-07-31 09:52:34 +01:00
Kendra Karol Sevilla
84f6a8082d
fix(cuda): correct radix block mismatch check in LWE array validation
2025-07-31 09:28:48 +01:00
otc group
0bc59dca59
chore: fix typo in comment
2025-07-31 09:20:49 +01:00
Agnes Leroy
09ffc39b15
fix(gpu): fix inconsistent types
2025-07-31 08:14:45 +01:00
swarnabhasinha
099345df02
fix(api): Add min/max on owned types
2025-07-30 21:33:58 +02:00
Agnes Leroy
48c10e91f7
chore(gpu): fix gpu setup action for ci
2025-07-30 14:22:13 +01:00
Andrei Stoian
36eceaf05e
feat(gpu): utility debug workflows in ci
2025-07-30 12:55:40 +01:00
Arthur Meyre
e8986cbd7c
chore: setup CI for noise checks
2025-07-29 15:29:24 +02:00
Arthur Meyre
fc8063a59b
test: add noise simulation framework for basic operators
...
- test secret key encryption + start of compute atomic pattern in shortint
- only supports classic PBS with drift mitigation currently
2025-07-29 15:29:24 +02:00
Arthur Meyre
65b034ef70
chore(core): update noise formulas
2025-07-29 15:29:24 +02:00
cryptoraph
2aa83c99ea
fix(core_crypto): correct typo in GLWE keyswitch assertion message
2025-07-28 17:09:56 +02:00
cryptoraph
d78266e141
fix(cuda): correct typo in keyswitch error message
2025-07-28 17:09:56 +02:00
Pedro Alves
62e6504ef0
fix(gpu): fixes some wrong indexes used in cuda_set_device()
2025-07-28 12:43:13 +01:00
bigbear
209a8f1ad9
fix: correct typo in InvalidRangeError message
2025-07-25 17:18:57 +02:00
Guillermo Oyarzun
3621dd1ae7
chore(gpu): correct pfail in readme
2025-07-25 16:46:03 +02:00
Guillermo Oyarzun
b5a7199c15
chore(gpu): update cuda version in ci
2025-07-25 15:57:25 +02:00
Mayeul@Zama
09aaa4e045
chore: enable unused_imports lint in doctests
2025-07-23 11:09:23 +02:00
Mayeul@Zama
63203c58aa
refactor(shortint): cleanup decompression key
2025-07-23 10:08:01 +02:00
Mayeul@Zama
03f8a134b3
chore: remove unused gitignore entry
2025-07-21 10:15:56 +02:00
Maximilian Hubert
981da1d3fc
fix: fix typo
2025-07-18 18:16:52 +02:00
Nicolas Sarlin
0386090048
chore: missing from/into_raw_parts for noise squash comp priv key
2025-07-18 13:34:23 +02:00
Pedro Alves
7ecda32b41
fix(gpu): refactor broadcast_lut() to make it less error prone
2025-07-17 10:58:52 +01:00
tmontaigu
a4cfc12941
chore: add missing into/from_raw_parts for SQCompression
2025-07-17 11:20:30 +02:00
Alex Pikme
82b6f45785
fix: correct tuple element in izip! macro for 7 parameters
2025-07-17 10:54:45 +02:00
tmontaigu
82cc9d3884
docs: add UpgradeKeyChain
2025-07-16 17:57:10 +02:00
Agnes Leroy
c7785b7214
chore(gpu): add checks on noise levels in debug mode
2025-07-16 11:00:09 +01:00
Enzo Di Maria
a5c876fdac
refactor(gpu): creating CudaScalarDivisorFFI for storing decomposed scalars and their metadata
2025-07-16 07:59:20 +01:00
Nicolas Sarlin
2d8ea2de16
feat(shortint): add pbs_order method to AtomicPatternKind
2025-07-15 17:35:47 +02:00
Andrei Stoian
494e0e0601
chore(gpu): add short op sequence test for GPU on PRs
2025-07-15 16:03:45 +02:00
tmontaigu
8c838da209
chore(integer): improve measurements
...
It seems that in
```rust
bench_group.bench_function(&bench_id, |b| {
    // some code
    b.iter(|| {
        // function to bench
    })
});
```
If we put code in the '// some code' part, it affects the measurements:
the slower this code is, the worse the measurements can be.
For many operations the gap is small (a few ms or no gap),
but for the division the gap was around 500ms.
To reduce this, we move out what we can. Moving
the keycache access is the most important part, as it
costs around 70ms to 100ms.
A LazyCell is used so that the keycache is only accessed if the bench is not
filtered out. This is the behaviour we had before this commit, and the
behaviour we want to keep so that running specific benches via regex
selection stays fast.
Also, for clean input benches, we use `iter` instead of `iter_batched`,
as it makes more sense and should give more accurate results:
`iter_batched` timing includes other things than just the timing of the
function.
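A minimal sketch of the lazy keycache access described above, using only the standard library (`std::cell::LazyCell`, stable since Rust 1.80). `load_keys`, `bench_function` and `run_group` are hypothetical stand-ins for the keycache lookup, the bench filtering, and a bench group; they are not the tfhe-rs or criterion APIs.

```rust
use std::cell::{Cell, LazyCell};

/// Hypothetical stand-in for an expensive keycache lookup.
/// It counts how many times it was actually called.
fn load_keys(loads: &Cell<u32>) -> Vec<u64> {
    loads.set(loads.get() + 1);
    vec![1, 2, 3]
}

/// Hypothetical bench driver: runs `routine` only when `id` matches `filter`,
/// loosely imitating regex-based bench selection.
fn bench_function(id: &str, filter: &str, mut routine: impl FnMut()) {
    if id.contains(filter) {
        routine();
    }
}

/// Runs a small group of benches and returns how many times the keys were loaded.
fn run_group(filter: &str) -> u32 {
    let loads = Cell::new(0);
    // Declared outside the bench closures, but not evaluated yet.
    let keys = LazyCell::new(|| load_keys(&loads));

    for id in ["add", "mul", "div"] {
        bench_function(id, filter, || {
            // The first dereference triggers the load; later ones reuse the value.
            let k: &Vec<u64> = &keys;
            std::hint::black_box(k.iter().sum::<u64>());
        });
    }
    loads.get()
}
```

Under these assumptions, `run_group("div")` loads the keys once, while `run_group("nomatch")` never touches them, so filtered-out benches pay nothing for the lookup.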
2025-07-15 12:46:38 +02:00
tmontaigu
c13587b713
fix(integer): fix non-parallel prop with noisy block
2025-07-15 12:43:41 +02:00
tmontaigu
8dea5cf145
feat(integer): truncate carry prop on trivial zeros
...
This changes full_propagate_parallelized to not propagate
the most significant blocks when they are trivial zeros.
This is a small performance improvement, especially useful
when a bunch of FheUintX values are cast to FheUintY (Y > X)
and then summed (e.g. n FheUint2 values cast to FheUint32 and
summed to get a 32-bit result)
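The idea can be illustrated on a plaintext model. This is a sketch under stated assumptions, not the tfhe-rs implementation: `Block`, `MSG_MODULUS` and `propagate_truncated` are hypothetical names, `value` stands in for the encrypted content, and `degree` is assumed to be a public upper bound on it, with 0 marking a trivial zero. The decision to stop uses only the public bounds.

```rust
/// Assumed message modulus: 2 message bits per block.
const MSG_MODULUS: u64 = 4;

/// Plaintext stand-in for a radix block (hypothetical model).
#[derive(Clone, Copy, Debug, PartialEq)]
struct Block {
    /// Stands in for the encrypted content; may exceed MSG_MODULUS - 1.
    value: u64,
    /// Public upper bound on `value`; 0 means the block is a trivial zero.
    degree: u64,
}

/// Propagates carries from the least significant block upward and stops once
/// only trivial zeros remain and no carry can still be pending.
/// Returns the number of blocks that were processed.
fn propagate_truncated(blocks: &mut [Block]) -> usize {
    // Blocks up to and including the last non-trivial one must be processed.
    let significant = blocks
        .iter()
        .rposition(|b| b.degree != 0)
        .map_or(0, |i| i + 1);

    let mut carry = 0u64; // actual carry (hidden in the encrypted setting)
    let mut carry_bound = 0u64; // public upper bound on the carry
    let mut processed = 0;

    for (i, b) in blocks.iter_mut().enumerate() {
        // Only public information is used to decide when to stop.
        if i >= significant && carry_bound == 0 {
            break;
        }
        let total = b.value + carry;
        let total_bound = b.degree + carry_bound;
        b.value = total % MSG_MODULUS;
        b.degree = total_bound.min(MSG_MODULUS - 1);
        carry = total / MSG_MODULUS;
        carry_bound = total_bound / MSG_MODULUS;
        processed += 1;
    }
    processed
}
```

In this model, a 16-block value whose only non-trivial block holds 9 (with degree 9) is fully propagated after touching 2 blocks; the remaining 14 trivial zeros are left alone.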
2025-07-15 12:43:41 +02:00
Agnes Leroy
0d41b4f445
chore(gpu): add bench command for cuda and update weekly bench
2025-07-11 14:04:32 +01:00
Agnes Leroy
068cbc0f41
chore(gpu): add hl api noise squash latency and throughput bench
2025-07-11 14:04:32 +01:00
Agnes Leroy
f8947ddff3
chore(gpu): remove nightly schedule now that ci is lighter
2025-07-11 12:43:36 +01:00
Pedro Alves
1b98312e2c
fix(gpu): fix regression on ERC20 throughput
...
- partially revert changes done in fd79c4f972
- transfers for the GPU case should be measured using sequential
operations (without rayon!)
2025-07-11 08:57:19 +01:00
Pedro Alves
d3dd010deb
fix(gpu): reduces number of elements in the ZK throughput benchmark
2025-07-11 08:57:01 +01:00
Agnes Leroy
15762623d1
chore(gpu): minor refactor in sum ctxt
2025-07-10 16:24:02 +01:00
Beka Barbakadze
c6865ab880
fix(gpu): fix pbs128 multi-gpu bug
...
Signed-off-by: Beka Barbakadze <beka.barbakadze@zama.ai>
2025-07-10 15:54:27 +01:00
Enzo Di Maria
e376df2fa4
refactor(gpu): moving unsigned_scalar_div_rem and signed_scalar_div_rem to the backend
2025-07-10 09:24:13 +02:00
Arthur Meyre
bd739c2d48
chore(docs): uniformize paths in docs to use "-" instead of "_"
...
- this is to avoid conflicts with gitbook
2025-07-09 14:36:04 +02:00
Pedro Alves
9960f5e8b6
fix(gpu): Fix expand bench on multi-gpus
2025-07-09 09:17:55 +01:00
Nicolas Sarlin
776f08b534
chore(ci): remove close_data_pr workflow
2025-07-09 09:31:29 +02:00
David Testé
ac13eed3b1
chore(ci): allow git lfs sync between repositories
...
Since the integration of the HPU backend, some Git LFS references need to be synced along with the rest of the codebase. The valtech-sd/git-sync action, a fork of wei/git-sync, allows pushing Git LFS references to another repository.
2025-07-09 09:07:48 +02:00
Arthur Meyre
17d3a492b6
chore: only run backward compat clippy on x86 machines
...
- older versions of the crates only compile on x86, so disable on arm
for now
- revisit when the crates are split?
2025-07-09 08:29:12 +02:00
Enzo Di Maria
ba87f1ba5e
chore(gpu): removing useless arguments
2025-07-08 16:17:51 +02:00