Commit Graph

287 Commits

Author SHA1 Message Date
pgardratzama
39b81a8ded feat(hpu): move to new bitstream at 400Mhz with GRAM_NB 3
- update SIMD_N and min_batch_size to 12 which seems to give better
  latency and ERC20 throughput
- support IOp on several lines in ami /proc file
- reduce amount of ERC_20_SIMD per batch in HLAPI bench
2025-10-02 13:20:36 +02:00
David Testé
6494a82fb3 chore(ci): split cargo builds into several jobs
Post-commit checks was the bottleneck regarding running duration.
It's now split into 7 batches to improve parallelism.
Builds that are specific to Ubuntu are run in their own jobs, so
that only build_tfhe_full recipe call remains in the os matrix.
A final check is performed to ensure all the checks have passed,
this very job is used as branch protection rule.
2025-09-29 11:22:56 +02:00
Andrei Stoian
87c0d646a4 fix(gpu): coprocessor bench 2025-09-18 13:56:55 +02:00
Andrei Stoian
1dcc3c8c89 chore(gpu): structure to encapsulate streams 2025-09-18 09:43:17 +02:00
Nicolas Sarlin
1a2643d1da fix(ci): use precise wasm-bindgen version for the cli 2025-09-17 13:17:57 +02:00
David Testé
f8684d1f67 chore(ci): add regression benchmark workflow
Regression benchmarks are meant to be run in pull-request. They
can be launched in two flavors:
 * issue comment: using command like "/bench --backend cpu"
 * adding a label: `bench-perfs-cpu` or `bench-perfs-gpu`

Benchmark definitions are written in TOML and located at
ci/regression.toml.
While not exhaustive, it can be easily modified by reading the
embbeded documentation.

"/bench" commands are parsed by a Python script located at
ci/perf_regression.py. This script produces output files that
contains cargo commands and a shell script generating custom
environment variables. The Python script and generated files are
meant to be used only by the workflow
benchmark_perf_regression.yml.
2025-09-16 13:33:49 +02:00
Nicolas Sarlin
b4066df77f chore(ci): run cargo audit 2025-09-16 12:03:32 +02:00
Agnes Leroy
daee3f1850 chore(gpu): fix out of memory error in 4090 doc tests 2025-09-10 10:46:04 +02:00
pgardratzama
bd7df4a03b chore(hpu): enable hpu hlapi workflow and throughput bench in integer workflow 2025-09-05 10:42:36 +02:00
pgardratzama
c6aa1adbe7 chore(hpu): update benches to run new operations 2025-09-05 10:42:36 +02:00
Agnes Leroy
f62e5b3e3b chore(gpu): fix oom in 4090 tests 2025-08-28 16:12:52 +02:00
Andrei Stoian
c06b513182 chore(gpu): add valgrind and fix leaks 2025-08-28 14:21:57 +02:00
Nicolas Sarlin
44ac59099b chore(csprng): run clippy without the software-prng feature 2025-08-26 19:32:40 +02:00
David Testé
b3f1a85e1d chore(bench): write parameters to disk for hlapi operations 2025-08-13 18:34:26 +02:00
Arthur Meyre
1169096058 chore: fix whitespace in Makefile 2025-08-13 09:16:49 +02:00
Arthur Meyre
a63207af9e chore(ci): add MSRV build to check we are compliant with what we announce
- have to downgrade param_dedup edition as 1.84 cannot handle 2024 in a
workspace
2025-08-08 18:06:29 +02:00
Arthur Meyre
04d4ccc16c chore(ci): remove TFHE_SPEC from Makefile
- this is a leftover from a complicated attempt at backward compatibility
no need to keep this
2025-08-08 18:06:29 +02:00
Andrei Stoian
7bf2ec6ff2 chore(gpu): fix warnings detection 2025-07-31 18:47:08 +02:00
Arthur Meyre
82a5cc7f2d chore(ci): increase timeout for noise checks 2025-07-31 12:00:15 +02:00
Andrei Stoian
36eceaf05e feat(gpu): utility debug workflows in ci 2025-07-30 12:55:40 +01:00
Arthur Meyre
e8986cbd7c chore: setup CI for noise checks 2025-07-29 15:29:24 +02:00
Andrei Stoian
494e0e0601 chore(gpu): add short op sequence test for GPU on PRs 2025-07-15 16:03:45 +02:00
Agnes Leroy
068cbc0f41 chore(gpu): add hl api noise squash latency and throughput bench 2025-07-11 14:04:32 +01:00
Arthur Meyre
bd739c2d48 chore(docs): uniformize paths in docs to use "-" instead of "_"
- this is to avoid conflicts with gitbook
2025-07-09 14:36:04 +02:00
Arthur Meyre
17d3a492b6 chore: only run backward compat clippy on x86 machines
- older versions of the crates are only compilable with x86, disable on arm
for now
- revisit when the crates are split ?
2025-07-09 08:29:12 +02:00
Nicolas Sarlin
bb1ff363d3 chore(ci): use Cargo.lock for installed tools 2025-07-07 13:10:55 +02:00
Nicolas Sarlin
7bcd6b94da chore: use script to pull hpu files 2025-07-07 13:10:55 +02:00
Nicolas Sarlin
57cbab9fe1 chore(backward): integrate backward compat data
Code is taken from
59a6179831

Adapted to make ci work
2025-07-07 13:10:55 +02:00
Baptiste Roux
eb0b9643bb fix(hpu): Fix clippy_hpu_mockup makefile entry 2025-07-03 10:28:52 +02:00
Baptiste Roux
6432b98591 chore(mockup): Add clippy target for tfhe_hpu_mockup
Also fix all clippy lint
2025-07-02 14:41:41 +02:00
Nicolas Sarlin
950915a108 chore(ci): use the correct data branch in clippy_ws_tests 2025-07-01 14:18:10 +02:00
Andrei Stoian
5e6562878a chore(gpu): add cuda debug target for integer tests 2025-07-01 10:37:17 +02:00
Nicolas Sarlin
940a9ba860 chore(zk): enable tfhe-lints on zk pok 2025-06-27 14:34:25 +02:00
Nicolas Sarlin
ab0ec4a238 chore(zk): mark non-pke proofs as experimental 2025-06-10 17:07:33 +02:00
David Testé
b61f1d864c chore(ci): check ks32 parameters with lattice estimator
A small refactoring has been done to handle ciphertext modulus in a more convenient way.
2025-06-04 17:19:17 +02:00
tmontaigu
aca7e79585 feat(csprng): add Xof random generation
This adds a new kind of seed to the csprng

When created which such seed, the AES-CTR random generator
initialization changes:
- The AES-KEY used is initialized differently
- The AES-CTR starts with a CTR that may not be 0

The changes make it so that the counter still goes from 0..MAX,
but now the AES-CTR will encrypt the counter + some offset allowing
to keep the regular behavior and the new one
2025-06-04 09:57:18 +02:00
tmontaigu
c0e89a53ef fix(csprng): fix and endian for the counter
This commit fixes an endian (little) for the counter
representation of the counter used in the AES-CTR counter.

This is so that, the random bytes generated are the same not matter
the endian of the system.

A test case with known answers is added, as well as make command
to run the test in an emulated big-endian arch using the `cross`
utility.

This also include a small refactor where now the block cipher
do not encrypt `AesIndex`. This is done as it makes more sense
(AES encrypts bytes, not numbers), so this allows to move and centralize
the concept of endian as well a centralize where batch created.
2025-06-04 09:57:18 +02:00
David Testé
312952007f chore(ci): lock zizmor version to avoid breaking ci pipelines
Newer version of Zizmor can trigger errors due to new findings in workflows. To avoid breaking any ongoing pull-request, due to this unhandled update, zizmor version is locked.
2025-06-03 12:29:36 +02:00
tmontaigu
aa51b25313 chore(ci): fix test_user_docs run and add hpu
Due to #[cfg] before the test_user_docs module, the module would
not actually be compiled (thus run user doc test) unless all required
features where activated when running.

So we remove these cfg, as each hardware doc supports its own set of
features and its better to have a test fail because a feature is
missing rather than silently not run anything

Also, add commands and ci stuff to check HPU docs
2025-05-30 16:36:56 +02:00
Nicolas Sarlin
14e1ee5bd3 fix(gpu): build with hpu and zk features 2025-05-27 16:10:38 +02:00
Agnes Leroy
6e102b5fa1 chore(gpu): fix oom error in ci 2025-05-26 22:50:55 +02:00
Pedro Alves
408e81c45a feat(gpu): add support for GPU-accelerated expand on the HL Api
- includes documentation about GPU's accelerated expand on the HL API
- rework CudaKeySwitchingKey
- Cloning the key is no longer necessary on the HL API
2025-05-23 11:54:29 +02:00
Nicolas Sarlin
45fdba04b1 fix(gpu): allow to build with hpu feature enabled 2025-05-22 10:21:35 +02:00
Baptiste Roux
9ee8259002 feat(hpu): Add Hpu backend implementation
This backend abstract communication with Hpu Fpga hardware.
It define it's proper entities to prevent circular dependencies with
tfhe-rs.
Object lifetime is handle through Arc<Mutex<T>> wrapper, and enforce
that all objects currently alive in Hpu Hw are also kept valid on the
host side.

It contains the second version of HPU instruction set (HIS_V2.0):
* DOp have following properties:
  + Template as first class citizen
  + Support of Immediate template
  + Direct parser and conversion between Asm/Hex
  + Replace deku (and it's associated endianess limitation) by
  + bitfield_struct and manual parsing

* IOp have following properties:
  + Support various number of Destination
  + Support various number of Sources
  + Support various number of Immediat values
  + Support of multiple bitwidth (Not implemented yet in the Fpga
    firmware)

Details could be view in `backends/tfhe-hpu-backend/Readme.md`
2025-05-16 16:30:23 +02:00
Arthur Meyre
6cccaf3f66 chore: fix Makefile to specify toolchain for cargo xtask 2025-05-09 18:32:21 +02:00
David Testé
67ec4a28c1 chore(bench): move benchmarks to their own crate
This is done to speed-up compilation duration by avoiding
recompiling tfhe each time a modification is made in a benchmark
file.
2025-05-09 13:46:27 +02:00
Arthur Meyre
d05ee42629 chore: add param_dedup to alias redundant parameter defs across versions 2025-05-08 09:30:36 +02:00
David Testé
1ca14e6db0 chore(ci): add workflow security checks with zizmor 2025-05-06 14:06:17 +02:00
Agnes Leroy
97690ab3bd chore(gpu): write swap bench 2025-05-05 17:46:11 +02:00
Mayeul@Zama
2cbde1a56b chore(all): make clippy_rustdoc output less noisy 2025-04-18 16:03:00 +02:00