pgardratzama
569abd9a3b
fix(hpu): fix whitepaper erc20 for HPU using if_then_zero
2026-01-06 16:55:07 +01:00
Nicolas Sarlin
312ce494bf
chore(zk): add 1 * 64 benches with production CRS
2025-12-17 15:06:37 +01:00
Thomas Montaigu
d394af7f4d
chore: bump dyn-stack to 0.13
...
Notable changes:
- StackReq methods no longer return Result<StackReq, SizeOverflow>;
instead, StackReq contains the invalid state.
Now, it's when we create a PodBuffer that we can check/catch if the
size req is invalid by catching errors when calling
`PodBuffer::try_new`. It's also possible to manually check that
`stack_req != StackReq::OVERFLOW`
- GlobalPodBuffer is now PodBuffer
2025-12-15 10:02:17 +01:00
Andrei Stoian
78d1ce18c1
feat(gpu): support keyswitch 64/32
2025-12-12 22:01:49 +01:00
Agnes Leroy
b7a706a3db
chore(bench): remove constraint in pcc to not use trivial name in bench
2025-12-11 14:41:07 +01:00
Agnes Leroy
8e4bec0b2a
chore(bench): modify whitepaper erc20 to match newest litepaper version
2025-12-11 14:41:07 +01:00
Enzo Di Maria
cf969ff930
refactor(gpu): creating benchmarks for match_value
2025-12-11 12:01:43 +01:00
David Testé
5eb4cc5a22
chore(bench): add fast benchmark capability for hlapi
...
Run only a small subset of the current benchmarks to speed up developer feedback
2025-12-09 11:34:53 +01:00
Agnes Leroy
100b4200c2
chore(gpu): update number of streams in erc20 throughput bench
2025-12-08 09:21:55 +01:00
David Testé
e85fd936d0
chore(bench): suffix hlapi ops bench with measured type name
2025-12-04 17:59:27 +01:00
Agnes Leroy
e6625521ad
chore(gpu): add the possibility to run classical bench for erc20 and dex
2025-12-02 15:59:40 +01:00
Andrei Stoian
e2063c8ef4
chore(gpu): bench KS latency batches
2025-11-27 17:32:44 +01:00
Nicolas Sarlin
01367368ed
chore(zk): do not bench zkv1 at the integer level
2025-11-25 17:20:06 +01:00
Nicolas Sarlin
33f77458e9
chore(zk): fix elements count for zk throughput benches
2025-11-25 17:20:06 +01:00
Arthur Meyre
caf5e9d879
chore: fix scalar benchmarks generating fixed values
...
- this would not give an average runtime for scalar benchmarks and for
small precisions could give super good timings (for lucky values)
- the timings for other precisions could still be favorable or unfavorable
depending on the value that was drawn
2025-11-25 14:23:55 +01:00
David Testé
6141ad2eee
chore(bench): fix bench prefix pattern for hlapi ops
...
To follow the standard used by other HLAPI benchmarks and ease parsing for data_extractor.
2025-11-24 17:56:10 +01:00
David Testé
b0393c0acb
chore(bench): run scalar ops in integer deduplicated cpu bench
2025-11-24 14:03:08 +01:00
David Testé
58378b7972
chore(bench): add dedicated targets for aes cuda benchmarks
2025-11-20 16:58:06 +01:00
David Testé
071e70c037
chore(bench): fix benchmark id pattern for aes and aes256
2025-11-19 17:23:05 +01:00
Mayeul@Zama
f9268b889f
chore(bench): revert print bench id
...
This reverts commit ef07963767.
2025-11-17 11:23:50 +01:00
Enzo Di Maria
54c8c5e020
chore(gpu): no crash with aes benches if oom error
2025-11-14 17:02:33 +01:00
David Testé
ef07963767
chore(bench): print bench id before running the benchmark
...
Done to circumvent a criterion limitation regarding automatic
truncation of long benchmark IDs.
Using a println() call we ensure the complete name is displayed
before benchmark execution to ease manual parsing and debugging.
2025-11-14 13:45:04 +01:00
David Testé
d53bf79592
chore(bench): fix naming order for erc20 hpu benchmarks
2025-11-10 11:46:41 +01:00
Enzo Di Maria
4ff95e3a42
feat(gpu): AES 256
2025-11-05 13:37:08 +01:00
David Testé
0c977a3996
chore(bench): insert params name in bench id for hlapi
...
To ease parsing and filtering by third parties.
2025-11-04 10:53:25 +01:00
Arthur Meyre
00ce0deec9
chore: make typos version fixed
...
- add a script to properly install the correct version
- correct new typos
2025-11-03 14:58:23 +01:00
David Testé
2a8885aa9f
chore(ci): run erc20 and dex throughput bench only on demand
...
Following the same pattern as other benchmarks.
2025-10-30 09:52:30 +01:00
Pedro Alves
867f8fb579
feat(gpu): implement re-randomization
...
- exposed to integer and HL API
- test on the HL API
- benchmarks for GPU and CPU implementation
2025-10-29 17:55:45 -03:00
David Testé
b0b49ae533
chore(bench): new parameters set to run core_crypto bench for docs
...
This creates an extended parameter set to reflect what's displayed
in the documentation.
2025-10-27 17:25:41 +01:00
Pedro Alves
70773e442c
fix(gpu): fix 128-bit compression benchmark
2025-10-27 17:06:45 +01:00
Mayeul@Zama
777bbe437a
feat(shortint): add multi bit decompression
2025-10-24 09:28:17 +02:00
Arthur Meyre
23246f63f7
chore: update fast_dedup opset to match the latency benchmarks in the docs
...
- signed bench update
2025-10-23 10:42:19 +02:00
Arthur Meyre
11c79b5237
chore: update fast_dedup opset to match the latency benchmarks in the docs
2025-10-23 10:42:19 +02:00
Guillermo Oyarzun
e12638dabe
feat(gpu): extend specialized version to classical pbs
2025-10-22 09:20:40 +02:00
pgardratzama
f9c89212ea
fix(hpu): display name on shift looked wrong
2025-10-21 13:29:59 +02:00
Thomas Montaigu
0dd0ead4e2
chore(bench): remove trivial encryptions
...
They make the benches inaccurate
2025-10-20 12:26:44 +02:00
Agnes Leroy
c30835fc30
chore(gpu): remove async entry points for abs, add, sub, aes
2025-10-17 15:42:06 +02:00
Thomas Montaigu
498b0e6e5c
refactor: use BTreeMap as internals of KVStore
...
This is to make the order of the key and value lists
deterministic when compressing
2025-10-14 17:04:13 +02:00
Thomas Montaigu
126138a59d
chore: only run KVStore benches on CPU
...
As it's the only backend that supports them
2025-10-08 11:52:14 +02:00
pgardratzama
3073d60f11
fix(hpu): work-around a criterion assert by reducing number of elements on division & modulus throughput bench
2025-10-07 14:23:07 +02:00
pgardratzama
ab25919187
fix(hpu): throughput benchmarks were done 1 IOp per 1 IOp...
2025-10-07 10:14:43 +02:00
Enzo Di Maria
f0f3dd76eb
feat(gpu): aes 128
2025-10-06 09:31:36 +02:00
Thomas Montaigu
e523fd2cb6
feat: add KVStore to the high level api
...
* Added Value type name to crate::integer::KVStore impl of Named trait
as well as a bool to check we deserialize the correct value type
(Radix vs SignedRadix)
* Add KVStore to high_level_api
* Add KVStore hlapi benches
* Remove specialized `[add,mul,sub]_to_slot` as `map` is now the
intended API.
- mul_to_slot was way slower than using `map`
- add/sub_to_slot were a bit faster (~5% latency-wise), but returned
less information (no old_value, no new_value, no boolean to check
if the key matched)
- Some known improvements can be made to map, which should result in
it being better than add/sub_to_slot
* Add FheIntegerType trait to make the KVStore generic over
FheUint/FheInt, and should make GPU integration "easy"
2025-10-03 15:01:23 +02:00
Agnes Leroy
f9e876730a
chore(gpu): remove support for drift noise reduction
2025-10-03 09:45:20 +02:00
pgardratzama
39b81a8ded
feat(hpu): move to new bitstream at 400 MHz with GRAM_NB 3
...
- update SIMD_N and min_batch_size to 12 which seems to give better
latency and ERC20 throughput
- support IOp on several lines in ami /proc file
- reduce amount of ERC_20_SIMD per batch in HLAPI bench
2025-10-02 13:20:36 +02:00
pgardratzama
2bf595d0e2
fix(hpu): missing bench numbers for less_than & less_or_equal because lower != less
2025-10-02 13:20:36 +02:00
David Testé
4ba1787e12
chore(bench): add crs size in zk-pke benchmark names
...
This is done to get more details about the benchmarks when parsing
results.
2025-09-16 16:06:41 +02:00
David Testé
366d359441
chore(bench): measure ciphertext and key sizes at a large scale
...
Ciphertext sizes are measured at the HLAPI layer with several
parameter sets.
Key sizes are measured at the shortint level.
This benchmark now has its own dedicated GitHub workflow that
runs at least on the 24th of each month.
2025-09-16 15:43:36 +02:00
pgardratzama
757c2fc828
chore(hpu): make hpu integer bench fast by default
2025-09-10 22:24:31 +02:00
pgardratzama
4ff0d6cac2
feat(hpu): integer bench update (adds mod, div -> div_mod), erc20_simd simd batch size read from iop prototype
2025-09-10 22:24:31 +02:00