tfhe-rs

mirror of https://github.com/zama-ai/tfhe-rs.git synced 2026-01-07 22:04:10 -05:00

Author	SHA1	Message	Date
pgardratzama	d2a570bdd6	chore: uses if_then_zero only in HPU ERC20 whitepaper (to be updated when encrypt_trivial becomes available on HPU), adds test of if_then_zero for both CPU & HPU	2026-01-06 16:55:07 +01:00
pgardratzama	ed84387bba	chore: trying to ensure GPU ERC20 benches are not impacted while CPU & HPU use if_then_zero	2026-01-06 16:55:07 +01:00
Baptiste Roux	e645ee3397	feat: Add IfThenZero impl for Cpu	2026-01-06 16:55:07 +01:00
pgardratzama	569abd9a3b	fix(hpu): fix whitepaper erc20 for HPU using if_then_zero	2026-01-06 16:55:07 +01:00
Nicolas Sarlin	312ce494bf	chore(zk): add 1 * 64 benches with production CRS	2025-12-17 15:06:37 +01:00
Thomas Montaigu	d394af7f4d	chore: bump dyn-stack to 0.13 Notable changes: - StackReq methods no longer return Result<StackReq, SizeOverflow>; instead, StackReq contains the invalid state. Now, it's when we create a PodBuffer that we can check/catch if the size req is invalid by catching errors when calling `PodBuffer::try_new`. It's also possible to manually check that `stack_req != StackReq::OVERFLOW` - GlobalPodBuffer is now PodBuffer	2025-12-15 10:02:17 +01:00
Andrei Stoian	78d1ce18c1	feat(gpu): support keyswitch 64/32	2025-12-12 22:01:49 +01:00
Agnes Leroy	b7a706a3db	chore(bench): remove constraint in pcc to not use trivial name in bench	2025-12-11 14:41:07 +01:00
Agnes Leroy	8e4bec0b2a	chore(bench): modify whitepaper erc20 to match newest litepaper version	2025-12-11 14:41:07 +01:00
Enzo Di Maria	cf969ff930	refactor(gpu): creating benchmarks for match_value	2025-12-11 12:01:43 +01:00
David Testé	5eb4cc5a22	chore(bench): add fast benchmark capability for hlapi Run only a small subset of the current benchmarks to speed up developers feedback	2025-12-09 11:34:53 +01:00
Agnes Leroy	100b4200c2	chore(gpu): update number of streams in erc20 throughput bench	2025-12-08 09:21:55 +01:00
David Testé	e85fd936d0	chore(bench): suffix hlapi ops bench with measured type name	2025-12-04 17:59:27 +01:00
Agnes Leroy	e6625521ad	chore(gpu): add the possibility to run classical bench for erc20 and dex	2025-12-02 15:59:40 +01:00
Andrei Stoian	e2063c8ef4	chore(gpu): bench KS latency batches	2025-11-27 17:32:44 +01:00
Nicolas Sarlin	01367368ed	chore(zk): do not bench zkv1 at the integer level	2025-11-25 17:20:06 +01:00
Nicolas Sarlin	33f77458e9	chore(zk): fix elements count for zk throughput benches	2025-11-25 17:20:06 +01:00
Arthur Meyre	caf5e9d879	chore: fix scalar benchmarks generating fixed values - this would not give an average runtime for scalar benchmarks and for small precisions could give super good timings (for lucky values) - the timings for other precisions could still be favorable or unfavorable depending on the value that was drawn	2025-11-25 14:23:55 +01:00
David Testé	6141ad2eee	chore(bench): fix bench prefix pattern for hlapi ops To follow the standard used by other HLAPI benchmarks and ease parsing for data_extractor.	2025-11-24 17:56:10 +01:00
David Testé	b0393c0acb	chore(bench): run scalar ops in integer deduplicated cpu bench	2025-11-24 14:03:08 +01:00
David Testé	58378b7972	chore(bench): add dedicated targets for aes cuda benchmarks	2025-11-20 16:58:06 +01:00
David Testé	071e70c037	chore(bench): fix benchmark id pattern for aes and aes256	2025-11-19 17:23:05 +01:00
Mayeul@Zama	f9268b889f	chore(bench): revert print bench id This reverts commit `ef07963767`.	2025-11-17 11:23:50 +01:00
Enzo Di Maria	54c8c5e020	chore(gpu): no crash with aes benches if oom error	2025-11-14 17:02:33 +01:00
David Testé	ef07963767	chore(bench): print bench id before running the benchmark Done to circumvent criterion limitation regarding automatic truncation of long benchmark ID. Using a println() call we ensure the complete name is displayed before benchmark execution to ease manual parsing and debugging.	2025-11-14 13:45:04 +01:00
David Testé	d53bf79592	chore(bench): fix naming order for erc20 hpu benchmarks	2025-11-10 11:46:41 +01:00
Enzo Di Maria	4ff95e3a42	feat(gpu): AES 256	2025-11-05 13:37:08 +01:00
David Testé	0c977a3996	chore(bench): insert params name in bench id for hlapi To ease parsing and filtering by third parties.	2025-11-04 10:53:25 +01:00
Arthur Meyre	00ce0deec9	chore: make typos version fixed - add a script to properly install the correct version - correct new typos	2025-11-03 14:58:23 +01:00
David Testé	2a8885aa9f	chore(ci): run erc20 and dex throughput bench only on demand Following the same pattern as other benchmarks.	2025-10-30 09:52:30 +01:00
Pedro Alves	867f8fb579	feat(gpu): implement re-randomization - exposed to integer and HL API - test on the HL API - benchmarks for GPU and CPU implementation	2025-10-29 17:55:45 -03:00
David Testé	b0b49ae533	chore(bench): new parameters set to run core_crypto bench for docs This creates extended parameters set to reflect what's displayed in the documentation.	2025-10-27 17:25:41 +01:00
Pedro Alves	70773e442c	fix(gpu): fix 128-bit compression benchmark	2025-10-27 17:06:45 +01:00
Mayeul@Zama	777bbe437a	feat(shortint): add multi bit decompression	2025-10-24 09:28:17 +02:00
Arthur Meyre	23246f63f7	chore: update fast_dedup opset to match the latency benchmarks in the docs - signed bench update	2025-10-23 10:42:19 +02:00
Arthur Meyre	11c79b5237	chore: update fast_dedup opset to match the latency benchmarks in the docs	2025-10-23 10:42:19 +02:00
Guillermo Oyarzun	e12638dabe	feat(gpu): extend specialized version to classical pbs	2025-10-22 09:20:40 +02:00
pgardratzama	f9c89212ea	fix(hpu): display name on shift looked wrong	2025-10-21 13:29:59 +02:00
Thomas Montaigu	0dd0ead4e2	chore(bench): remove trivial encryptions It makes benches not accurate	2025-10-20 12:26:44 +02:00
Agnes Leroy	c30835fc30	chore(gpu): remove async entry points for abs, add, sub, aes	2025-10-17 15:42:06 +02:00
Thomas Montaigu	498b0e6e5c	refactor: use BTreeMap as internals of KVStore This is to make the order of the key and value lists deterministic when compressing	2025-10-14 17:04:13 +02:00
Thomas Montaigu	126138a59d	chore: only run KVStore benches on CPU As it's the only backend that supports it	2025-10-08 11:52:14 +02:00
pgardratzama	3073d60f11	fix(hpu): work-around a criterion assert by reducing number of elements on division & modulus throughput bench	2025-10-07 14:23:07 +02:00
pgardratzama	ab25919187	fix(hpu): throughput benchmarks were done 1 IOp per 1 IOp...	2025-10-07 10:14:43 +02:00
Enzo Di Maria	f0f3dd76eb	feat(gpu): aes 128	2025-10-06 09:31:36 +02:00
Thomas Montaigu	e523fd2cb6	feat: add KVStore to the high level api * Added Value type name to crate::integer::KVStore impl of Named trait as well as a bool to check we deserialize the correct value type (Radix vs SignedRadix) * Add KVStore to high_level_api * Add KVStore hlapi benches * Remove specialized `[add,mul,sub]_to_slot` as `map` is now the intended API. - mul_to_slot was way slower than using `map` - add/mul_to_slot were a bit faster (~5% latency-wise), but returned less information (no old_value, no new_value, no boolean to check) if the key matched - Some known improvement can be made to map, which should result in it being better than add/sub_to_slot * Add FheIntegerType trait to make the KVStore generic over FheUint/FheInt, and should make GPU integration "easy"	2025-10-03 15:01:23 +02:00
Agnes Leroy	f9e876730a	chore(gpu): remove support for drift noise reduction	2025-10-03 09:45:20 +02:00
pgardratzama	39b81a8ded	feat(hpu): move to new bitstream at 400Mhz with GRAM_NB 3 - update SIMD_N and min_batch_size to 12 which seems to give better latency and ERC20 throughput - support IOp on several lines in ami /proc file - reduce amount of ERC_20_SIMD per batch in HLAPI bench	2025-10-02 13:20:36 +02:00
pgardratzama	2bf595d0e2	fix(hpu): missing bench numbers for less_than & less_or_equal because lower != less	2025-10-02 13:20:36 +02:00
David Testé	4ba1787e12	chore(bench): add crs size in zk-pke benchmark names This is done to get more details about the benchmarks when parsing results.	2025-09-16 16:06:41 +02:00

96 Commits