doc: start adding erc20 benchmarks

2026-01-09 14:47:56 -05:00 · 2025-10-22 10:33:08 +02:00
parent 23600eb8e1
commit 069c7334a9
7 changed files with 84 additions and 0 deletions
--- a/tfhe/docs/getting-started/benchmarks/cpu/README.md
+++ b/tfhe/docs/getting-started/benchmarks/cpu/README.md
@@ -9,4 +9,5 @@ All CPU benchmarks were launched on an `AWS hpc7a.96xlarge` instance equipped wi
 {% endhint %}

 * [Integer operations](cpu-integer-operations.md)
+* [ERC20](cpu-erc20.md)
 * [Programmable Bootstrapping](cpu-programmable-bootstrapping.md)
--- a/tfhe/docs/getting-started/benchmarks/cpu/cpu-erc20.md
+++ b/tfhe/docs/getting-started/benchmarks/cpu/cpu-erc20.md
@@ -0,0 +1,68 @@
+TFHE-rs is the underlying library of the Zama Confidential Blockchain Protocol. To illustrate real-world performance,
+consider an ERC20 transfer that executes the following sequence of operations:
+```rust
+use tfhe::{prelude::*, FheUint64}; // the prelude brings the FHE operation traits into scope
+fn erc20_transfer_whitepaper(
+    from_amount: &FheUint64,
+    to_amount: &FheUint64,
+    amount: &FheUint64,
+) -> (FheUint64, FheUint64) {
+    let has_enough_funds = from_amount.ge(amount);
+    let zero_amount = FheUint64::encrypt_trivial(0u64);
+    let amount_to_transfer = has_enough_funds.select(amount, &zero_amount);
+
+    let new_to_amount = to_amount + &amount_to_transfer;
+    let new_from_amount = from_amount - &amount_to_transfer;
+
+    (new_from_amount, new_to_amount)
+}
+```
+This is one way to compute an encrypted ERC20 transfer, but it is not the most efficient.
+The same transfer can be computed more efficiently by replacing the `select` operation with a multiplication by the comparison result cast to an integer:
+```rust
+use tfhe::{prelude::*, FheUint64}; // the prelude brings the FHE operation traits into scope
+fn erc20_transfer_no_cmux(
+    from_amount: &FheUint64,
+    to_amount: &FheUint64,
+    amount: &FheUint64,
+) -> (FheUint64, FheUint64) {
+    let has_enough_funds = from_amount.ge(amount);
+
+    let amount = amount * FheUint64::cast_from(has_enough_funds);
+
+    let new_to_amount = to_amount + &amount;
+    let new_from_amount = from_amount - &amount;
+
+    (new_from_amount, new_to_amount)
+}
+```
+An even more efficient way to compute an encrypted ERC20 transfer is to use the `overflowing_sub` operation as follows:
+```rust
+use tfhe::{prelude::*, FheUint64}; // the prelude brings the FHE operation traits into scope
+fn erc20_transfer_overflow(
+    from_amount: &FheUint64,
+    to_amount: &FheUint64,
+    amount: &FheUint64,
+) -> (FheUint64, FheUint64) {
+    let (new_from, did_not_have_enough) = from_amount.overflowing_sub(amount);
+    let did_not_have_enough = &did_not_have_enough;
+    let had_enough_funds = !did_not_have_enough;
+
+    let (new_from_amount, new_to_amount) = rayon::join(
+        || did_not_have_enough.if_then_else(from_amount, &new_from),
+        || to_amount + (amount * FheUint64::cast_from(had_enough_funds)),
+    );
+    (new_from_amount, new_to_amount)
+}
+```
+In a blockchain protocol, FHE operations would not be the only ones involved in computing the transfer:
+ciphertext compression and decompression, as well as rerandomization, would also be needed.
+Network communication would also introduce significant overhead.
+For simplicity, the focus here is only on the performance of the FHE operations.
+The latency and throughput of these three ERC20 FHE transfer implementations are compared in the following table:
+
+TODO add SVG
+
+The throughput shown here is the maximum that can be achieved with TFHE-rs on CPU, in an ideal scenario where all transactions are independent.
+In a blockchain protocol, the throughput would be limited by network latency and by the need to apply other operations
+(compression, decompression, ciphertext rerandomization).
--- a/tfhe/docs/getting-started/benchmarks/gpu/README.md
+++ b/tfhe/docs/getting-started/benchmarks/gpu/README.md
@@ -9,4 +9,5 @@ All GPU benchmarks were launched on H100 GPUs, and rely on the multithreaded PBS
 {% endhint %}

 * [Integer operations](gpu-integer-operations.md)
+* [ERC20](gpu-erc20.md)
 * [Programmable Bootstrapping](gpu-programmable-bootstrapping.md)
--- a/tfhe/docs/getting-started/benchmarks/gpu/gpu-erc20.md
+++ b/tfhe/docs/getting-started/benchmarks/gpu/gpu-erc20.md
@@ -0,0 +1,7 @@
+As with the [CPU benchmarks](../cpu/cpu-erc20.md), the latency and throughput of a confidential ERC20 token transfer can be measured on GPU.
+
+TODO add SVG
+
+The throughput shown here is the maximum that can be achieved with TFHE-rs on an 8xH100 GPU node, in an ideal scenario where all transactions are independent.
+In a blockchain protocol, the throughput would be limited by network latency and by the need to apply
+other operations (compression, decompression, rerandomization).
--- a/tfhe/docs/getting-started/benchmarks/hpu/README.md
+++ b/tfhe/docs/getting-started/benchmarks/hpu/README.md
@@ -10,3 +10,4 @@ All HPU benchmarks were launched on AMD Alveo v80 FPGAs.

 * [Integer operations](hpu-integer-operations.md)
 * [Programmable Bootstrapping](hpu-programmable-bootstrapping.md)
+* [ERC20](hpu-erc20.md)
--- a/tfhe/docs/getting-started/benchmarks/hpu/hpu-erc20.md
+++ b/tfhe/docs/getting-started/benchmarks/hpu/hpu-erc20.md
--- a/tfhe/src/test_user_docs.rs
+++ b/tfhe/src/test_user_docs.rs
@@ -15,6 +15,12 @@ mod test_cpu_doc {
        configuration_rust_configuration
    );

+    // BENCHMARKS
+    doctest!(
+        "../docs/getting-started/benchmarks/cpu/cpu-erc20.md",
+        benchmarks_cpu_erc20
+    );
+
    // FHE COMPUTATION

    // ADVANCED FEATURES