Commit Graph

109 Commits

Author SHA1 Message Date
Agnes Leroy
100b4200c2 chore(gpu): update number of streams in erc20 throughput bench 2025-12-08 09:21:55 +01:00
David Testé
e85fd936d0 chore(bench): suffix hlapi ops bench with measured type name 2025-12-04 17:59:27 +01:00
Agnes Leroy
e6625521ad chore(gpu): add the possibility to run classical bench for erc20 and dex 2025-12-02 15:59:40 +01:00
Andrei Stoian
e2063c8ef4 chore(gpu): bench KS latency batches 2025-11-27 17:32:44 +01:00
Nicolas Sarlin
f8a958663b chore(tfhe): rename nightly feature flag to avx512 2025-11-26 11:28:21 +01:00
Nicolas Sarlin
01367368ed chore(zk): do not bench zkv1 at the integer level 2025-11-25 17:20:06 +01:00
Nicolas Sarlin
33f77458e9 chore(zk): fix elements count for zk throughput benches 2025-11-25 17:20:06 +01:00
Arthur Meyre
caf5e9d879 chore: fix scalar benchmarks generating fixed values
- this would not give an average runtime for scalar benchmarks and for
small precisions could give super good timings (for lucky values)
- the timings for other precisions could still be favorable or unfavorable
depending on the value that was drawn
2025-11-25 14:23:55 +01:00
David Testé
6141ad2eee chore(bench): fix bench prefix pattern for hlapi ops
To follow the standard used by other HLAPI benchmarks and ease parsing for data_extractor.
2025-11-24 17:56:10 +01:00
David Testé
b0393c0acb chore(bench): run scalar ops in integer deduplicated cpu bench 2025-11-24 14:03:08 +01:00
David Testé
b3c3647530 chore(ci): add workflow to update documentation benchmark tables
This new workflow can trigger all the required benchmarks needed
to populate benchmarks tables in documentation.
It also can generate SVG tables and store them as artifacts.
Optionally, it can open a pull-request to update the current
tables in documentation.
2025-11-24 14:03:08 +01:00
David Testé
58378b7972 chore(bench): add dedicated targets for aes cuda benchmarks 2025-11-20 16:58:06 +01:00
David Testé
071e70c037 chore(bench): fix benchmark id pattern for aes and aes256 2025-11-19 17:23:05 +01:00
Mayeul@Zama
f9268b889f chore(bench): revert print bench id
This reverts commit ef07963767.
2025-11-17 11:23:50 +01:00
Enzo Di Maria
54c8c5e020 chore(gpu): no crash with aes benches if oom error 2025-11-14 17:02:33 +01:00
David Testé
ef07963767 chore(bench): print bench id before running the benchmark
Done to circumvent criterion limitation regarding automatic
truncation of long benchmark ID.
Using a println() call we ensure the complete name is displayed
before benchmark execution to ease manual parsing and debugging.
2025-11-14 13:45:04 +01:00
David Testé
d53bf79592 chore(bench): fix naming order for erc20 hpu benchmarks 2025-11-10 11:46:41 +01:00
Enzo Di Maria
4ff95e3a42 feat(gpu): AES 256 2025-11-05 13:37:08 +01:00
David Testé
0c977a3996 chore(bench): insert params name in bench id for hlapi
To ease parsing and filtering by third parties.
2025-11-04 10:53:25 +01:00
Arthur Meyre
00ce0deec9 chore: make typos version fixed
- add a script to properly install the correct version
- correct new typos
2025-11-03 14:58:23 +01:00
Nicolas Sarlin
83b82091bd chore: use common msrv for the workspace
Since cargo commands create a lock using the smallest msrv in the workspace, it
can prevent getting up-to-date dependencies
2025-10-31 09:31:43 +01:00
David Testé
2a8885aa9f chore(ci): run erc20 and dex throughput bench only on demand
Following the same pattern as other benchmarks.
2025-10-30 09:52:30 +01:00
Pedro Alves
867f8fb579 feat(gpu): implement re-randomization
- exposed to integer and HL API
- test on the HL API
- benchmarks for GPU and CPU implementation
2025-10-29 17:55:45 -03:00
Guillermo Oyarzun
0f0438c8cf feat(gpu): add 1_1 classical pbs params for specialized version 2025-10-29 09:18:18 +01:00
David Testé
b0b49ae533 chore(bench): new parameters set to run core_crypto bench for docs
This creates extended parameters set to reflect what's displayed
in the documentation.
2025-10-27 17:25:41 +01:00
Pedro Alves
70773e442c fix(gpu): fix 128-bit compression benchmark 2025-10-27 17:06:45 +01:00
Mayeul@Zama
777bbe437a feat(shortint): add multi bit decompression 2025-10-24 09:28:17 +02:00
Arthur Meyre
23246f63f7 chore: update fast_dedup opset to match the latency benchmarks in the docs
- signed bench update
2025-10-23 10:42:19 +02:00
Arthur Meyre
11c79b5237 chore: update fast_dedup opset to match the latency benchmarks in the docs 2025-10-23 10:42:19 +02:00
Guillermo Oyarzun
e12638dabe feat(gpu): extend specialized version to classical pbs 2025-10-22 09:20:40 +02:00
pgardratzama
f9c89212ea fix(hpu): display name on shift looked wrong 2025-10-21 13:29:59 +02:00
Agnes Leroy
b4b6275ca5 chore(gpu): remove device synchronize in drop for cudavec 2025-10-21 11:33:46 +02:00
Arthur Meyre
205b767fc1 chore: fix various target issues for benchmarks following renames
- renames were done to uniformize and make it easier to setup perf
regression measurements, some names were not updated this PR fixes that
2025-10-20 13:45:27 +02:00
Thomas Montaigu
0dd0ead4e2 chore(bench): remove trivial encryptions
It makes benches not accurate
2025-10-20 12:26:44 +02:00
Agnes Leroy
c30835fc30 chore(gpu): remove async entry points for abs, add, sub, aes 2025-10-17 15:42:06 +02:00
David Testé
206553e9ee chore(ci): check for performance regression and create report
After running performances regression benchmarks, a performance
changes checking is executed. It will fetch results data with an
external tool then it will look for anomaly in changes.
Finally it will produce a report as an issue comment with any
anomaly display in a Markdown array. A folded section of the
report message contains all the results from the benchmark.

Note that a fully custom benchmark triggered from an issue comment
would not generate a report. In addition HPU performance
regression benchmark is not supported yet.
2025-10-17 15:05:24 +02:00
Arthur Meyre
20a91337c1 chore: prepare v1.5 2025-10-16 15:23:36 +02:00
Thomas Montaigu
498b0e6e5c refactor: use BTreeMap as internals of KVStore
This is to make the order of the key and value lists
deterministic when compressing
2025-10-14 17:04:13 +02:00
Thomas Montaigu
126138a59d chore: only run KVStore benches on CPU
As its the only backend that supports it
2025-10-08 11:52:14 +02:00
pgardratzama
3073d60f11 fix(hpu): work-around a criterion assert by reducing number of elements on division & modulus throughput bench 2025-10-07 14:23:07 +02:00
pgardratzama
ab25919187 fix(hpu): throughput benchmarks were done 1 IOp per 1 IOp... 2025-10-07 10:14:43 +02:00
Nicolas Sarlin
6a676551d8 chore(shortint): add metaparams for ks32 2025-10-07 09:51:09 +02:00
Enzo Di Maria
f0f3dd76eb feat(gpu): aes 128 2025-10-06 09:31:36 +02:00
Thomas Montaigu
e523fd2cb6 feat: add KVStore to the high level api
* Added Value type name to crate::integer::KVStore impl of Named trait
  as well as a bool to check we deserialize the correct value type
  (Radix vs SignedRadix)
* Add KVStore to high_level_api
* Add KVStore hlapi benches
* Remove specialized `[add,mul,sub]_to_slot` as `map` is now the
  intended API.
    - mul_to_slot was way slower than using `map`
    - add/mul_to_slot were a bit faster (~5% latency-wise), but returned
      less information (no old_value, no new_value, no boolean to check)
      if the key matched
    - Some known improvement can be made to map, which should result in
      it being better than add/sub_to_slot
* Add FheIntegerType trait to make the KVStore generic over
  FheUint/FheInt, and should make GPU integration "easy"
2025-10-03 15:01:23 +02:00
Agnes Leroy
f9e876730a chore(gpu): remove support for drift noise reduction 2025-10-03 09:45:20 +02:00
pgardratzama
39b81a8ded feat(hpu): move to new bitstream at 400Mhz with GRAM_NB 3
- update SIMD_N and min_batch_size to 12 which seems to give better
  latency and ERC20 throughput
- support IOp on several lines in ami /proc file
- reduce amount of ERC_20_SIMD per batch in HLAPI bench
2025-10-02 13:20:36 +02:00
pgardratzama
2bf595d0e2 fix(hpu): missing bench numbers for less_than & less_or_equal because lower != less 2025-10-02 13:20:36 +02:00
David Testé
d397ea3a39 chore(bench): handle ks32 atomic pattern in key size measurements 2025-09-23 12:01:33 +02:00
Guillermo Oyarzun
022cb3b18a fix(gpu): avoid out of memory when benchmarking throughput 2025-09-19 14:44:12 +02:00
David Testé
4ba1787e12 chore(bench): add crs size in zk-pke benchmark names
This is done get more details about the benchmarks when parsing
results.
2025-09-16 16:06:41 +02:00