pgardratzama
d2a570bdd6
chore: use if_then_zero only in HPU ERC20 whitepaper (to be updated when encrypt_trivial becomes available on HPU), add test of if_then_zero for both CPU & HPU
2026-01-06 16:55:07 +01:00
pgardratzama
ed84387bba
chore: ensure GPU ERC20 benches are not impacted while CPU & HPU use if_then_zero
2026-01-06 16:55:07 +01:00
Baptiste Roux
e645ee3397
feat: Add IfThenZero impl for Cpu
2026-01-06 16:55:07 +01:00
pgardratzama
569abd9a3b
fix(hpu): fix whitepaper erc20 for HPU using if_then_zero
2026-01-06 16:55:07 +01:00
Nicolas Sarlin
312ce494bf
chore(zk): add 1 * 64 benches with production CRS
2025-12-17 15:06:37 +01:00
Thomas Montaigu
d394af7f4d
chore: bump dyn-stack to 0.13
...
Notable changes:
- StackReq methods no longer return Result<StackReq, SizeOverflow>;
instead, StackReq contains the invalid state.
Now, it is when we create a PodBuffer that we can check whether the
size req is invalid, by catching errors when calling
`PodBuffer::try_new`. It's also possible to manually check that
`stack_req != StackReq::OVERFLOW`
- GlobalPodBuffer is now PodBuffer
2025-12-15 10:02:17 +01:00
Andrei Stoian
78d1ce18c1
feat(gpu): support keyswitch 64/32
2025-12-12 22:01:49 +01:00
Agnes Leroy
b7a706a3db
chore(bench): remove constraint in pcc to not use trivial name in bench
2025-12-11 14:41:07 +01:00
Agnes Leroy
8e4bec0b2a
chore(bench): modify whitepaper erc20 to match newest litepaper version
2025-12-11 14:41:07 +01:00
Enzo Di Maria
cf969ff930
refactor(gpu): creating benchmarks for match_value
2025-12-11 12:01:43 +01:00
David Testé
5eb4cc5a22
chore(bench): add fast benchmark capability for hlapi
...
Run only a small subset of the current benchmarks to speed up developer feedback
2025-12-09 11:34:53 +01:00
Agnes Leroy
100b4200c2
chore(gpu): update number of streams in erc20 throughput bench
2025-12-08 09:21:55 +01:00
David Testé
e85fd936d0
chore(bench): suffix hlapi ops bench with measured type name
2025-12-04 17:59:27 +01:00
Agnes Leroy
e6625521ad
chore(gpu): add the possibility to run classical bench for erc20 and dex
2025-12-02 15:59:40 +01:00
Andrei Stoian
e2063c8ef4
chore(gpu): bench KS latency batches
2025-11-27 17:32:44 +01:00
Nicolas Sarlin
01367368ed
chore(zk): do not bench zkv1 at the integer level
2025-11-25 17:20:06 +01:00
Nicolas Sarlin
33f77458e9
chore(zk): fix elements count for zk throughput benches
2025-11-25 17:20:06 +01:00
Arthur Meyre
caf5e9d879
chore: fix scalar benchmarks generating fixed values
...
- this would not give an average runtime for scalar benchmarks and for
small precisions could give super good timings (for lucky values)
- the timings for other precisions could still be favorable or unfavorable
depending on the value that was drawn
2025-11-25 14:23:55 +01:00
David Testé
6141ad2eee
chore(bench): fix bench prefix pattern for hlapi ops
...
To follow the standard used by other HLAPI benchmarks and ease parsing for data_extractor.
2025-11-24 17:56:10 +01:00
David Testé
b0393c0acb
chore(bench): run scalar ops in integer deduplicated cpu bench
2025-11-24 14:03:08 +01:00
David Testé
58378b7972
chore(bench): add dedicated targets for aes cuda benchmarks
2025-11-20 16:58:06 +01:00
David Testé
071e70c037
chore(bench): fix benchmark id pattern for aes and aes256
2025-11-19 17:23:05 +01:00
Mayeul@Zama
f9268b889f
chore(bench): revert print bench id
...
This reverts commit ef07963767.
2025-11-17 11:23:50 +01:00
Enzo Di Maria
54c8c5e020
chore(gpu): no crash with aes benches if oom error
2025-11-14 17:02:33 +01:00
David Testé
ef07963767
chore(bench): print bench id before running the benchmark
...
Done to circumvent criterion limitation regarding automatic
truncation of long benchmark ID.
Using a println() call we ensure the complete name is displayed
before benchmark execution to ease manual parsing and debugging.
2025-11-14 13:45:04 +01:00
David Testé
d53bf79592
chore(bench): fix naming order for erc20 hpu benchmarks
2025-11-10 11:46:41 +01:00
Enzo Di Maria
4ff95e3a42
feat(gpu): AES 256
2025-11-05 13:37:08 +01:00
David Testé
0c977a3996
chore(bench): insert params name in bench id for hlapi
...
To ease parsing and filtering by third parties.
2025-11-04 10:53:25 +01:00
Arthur Meyre
00ce0deec9
chore: make typos version fixed
...
- add a script to properly install the correct version
- correct new typos
2025-11-03 14:58:23 +01:00
David Testé
2a8885aa9f
chore(ci): run erc20 and dex throughput bench only on demand
...
Following the same pattern as other benchmarks.
2025-10-30 09:52:30 +01:00
Pedro Alves
867f8fb579
feat(gpu): implement re-randomization
...
- exposed to integer and HL API
- test on the HL API
- benchmarks for GPU and CPU implementation
2025-10-29 17:55:45 -03:00
David Testé
b0b49ae533
chore(bench): new parameters set to run core_crypto bench for docs
...
This creates extended parameters set to reflect what's displayed
in the documentation.
2025-10-27 17:25:41 +01:00
Pedro Alves
70773e442c
fix(gpu): fix 128-bit compression benchmark
2025-10-27 17:06:45 +01:00
Mayeul@Zama
777bbe437a
feat(shortint): add multi bit decompression
2025-10-24 09:28:17 +02:00
Arthur Meyre
23246f63f7
chore: update fast_dedup opset to match the latency benchmarks in the docs
...
- signed bench update
2025-10-23 10:42:19 +02:00
Arthur Meyre
11c79b5237
chore: update fast_dedup opset to match the latency benchmarks in the docs
2025-10-23 10:42:19 +02:00
Guillermo Oyarzun
e12638dabe
feat(gpu): extend specialized version to classical pbs
2025-10-22 09:20:40 +02:00
pgardratzama
f9c89212ea
fix(hpu): display name on shift looked wrong
2025-10-21 13:29:59 +02:00
Thomas Montaigu
0dd0ead4e2
chore(bench): remove trivial encryptions
...
They make the benches inaccurate
2025-10-20 12:26:44 +02:00
Agnes Leroy
c30835fc30
chore(gpu): remove async entry points for abs, add, sub, aes
2025-10-17 15:42:06 +02:00
Thomas Montaigu
498b0e6e5c
refactor: use BTreeMap as internals of KVStore
...
This is to make the order of the key and value lists
deterministic when compressing
2025-10-14 17:04:13 +02:00
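The motivation given for 498b0e6e5c can be illustrated with a minimal sketch using only the standard library. It is not the KVStore code: `split_entries` is a hypothetical helper, and plain `u32` keys with string values stand in for the real key and ciphertext types. It only shows that a `BTreeMap` yields its keys and values in sorted key order regardless of insertion order, so two stores with the same content produce identical lists.

```rust
use std::collections::BTreeMap;

/// Hypothetical helper (not part of the library): splits a map into two
/// parallel lists of keys and values, in the map's iteration order.
fn split_entries(store: &BTreeMap<u32, &'static str>) -> (Vec<u32>, Vec<&'static str>) {
    // BTreeMap iterates in ascending key order, so both lists are
    // deterministic and stay aligned with each other.
    let keys = store.keys().copied().collect();
    let values = store.values().copied().collect();
    (keys, values)
}

fn main() {
    // Same content, inserted in two different orders.
    let mut first = BTreeMap::new();
    first.insert(3, "c");
    first.insert(1, "a");
    first.insert(2, "b");

    let mut second = BTreeMap::new();
    second.insert(2, "b");
    second.insert(3, "c");
    second.insert(1, "a");

    // Both maps produce the same key and value lists.
    assert_eq!(split_entries(&first), split_entries(&second));
    println!("{:?}", split_entries(&first));
}
```

A `HashMap` makes no such ordering guarantee, which is presumably why the lists could differ between runs before this change.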
Thomas Montaigu
126138a59d
chore: only run KVStore benches on CPU
...
As it's the only backend that supports it
2025-10-08 11:52:14 +02:00
pgardratzama
3073d60f11
fix(hpu): work around a criterion assert by reducing number of elements on division & modulus throughput bench
2025-10-07 14:23:07 +02:00
pgardratzama
ab25919187
fix(hpu): throughput benchmarks were done 1 IOp per 1 IOp...
2025-10-07 10:14:43 +02:00
Enzo Di Maria
f0f3dd76eb
feat(gpu): aes 128
2025-10-06 09:31:36 +02:00
Thomas Montaigu
e523fd2cb6
feat: add KVStore to the high level api
...
* Added Value type name to crate::integer::KVStore impl of Named trait
as well as a bool to check we deserialize the correct value type
(Radix vs SignedRadix)
* Add KVStore to high_level_api
* Add KVStore hlapi benches
* Remove specialized `[add,mul,sub]_to_slot` as `map` is now the
intended API.
- mul_to_slot was way slower than using `map`
- add/sub_to_slot were a bit faster (~5% latency-wise), but returned
less information (no old_value, no new_value, no boolean to check
if the key matched)
- Some known improvements can be made to map, which should result in
it being better than add/sub_to_slot
* Add FheIntegerType trait to make the KVStore generic over
FheUint/FheInt, and should make GPU integration "easy"
2025-10-03 15:01:23 +02:00
Agnes Leroy
f9e876730a
chore(gpu): remove support for drift noise reduction
2025-10-03 09:45:20 +02:00
pgardratzama
39b81a8ded
feat(hpu): move to new bitstream at 400MHz with GRAM_NB 3
...
- update SIMD_N and min_batch_size to 12 which seems to give better
latency and ERC20 throughput
- support IOp on several lines in ami /proc file
- reduce amount of ERC_20_SIMD per batch in HLAPI bench
2025-10-02 13:20:36 +02:00
pgardratzama
2bf595d0e2
fix(hpu): missing bench numbers for less_than & less_or_equal because lower != less
2025-10-02 13:20:36 +02:00
David Testé
4ba1787e12
chore(bench): add crs size in zk-pke benchmark names
...
This is done to get more details about the benchmarks when parsing
results.
2025-09-16 16:06:41 +02:00