Vac Research Blog

Decentralized Message Layer Security (De-MLS) with Waku

2025-09-02T14:00:00.000Z

This post introduces de-MLS, a decentralized variant of Message Layer Security (MLS) that reimagines group messaging by replacing centralized delivery services with peer-to-peer protocols while retaining strong guarantees such as forward secrecy (FS) and post-compromise security (PCS).

Introduction

Secure Group Messaging (SGM) is resource-intensive when aiming for robust security features like forward secrecy (FS) and post-compromise security (PCS).

One straightforward approach to SGM is a pairwise group chat, where each pair of group members establishes a unique encryption key using Diffie-Hellman. While this method ensures security, it falls short in terms of practicality:

High storage requirements: Each participant must store encryption keys for every other participant.
Inefficient encryption: Each message must be encrypted separately for every participant, leading to significant computational overhead.
Inefficient message storage and delivery: Each separately encrypted message must then be sent over the wire, whatever this wire might be. Or stored in database.
Cumbersome group management: Adding or removing users and refreshing keys becomes increasingly inefficient as the group grows.

One scalable for Secure Group Messaging (SGM) is Message Layer Security (MLS), as standardized in RFC 9420. Leveraging TreeKEM, MLS organizes group members in a cryptographic tree structure, where each participant is responsible for maintaining specific parts of the tree.

While MLS offers scalability and strong security guarantees, its reliance on server-based delivery services poses limitations for fully decentralized environments.

In this post, we present the implementation details of the first version of Decentralized MLS (de-MLS) which is an SGM protocol. De-MLS can serve groups that cannot rely on central servers, such as journalists and activists seeking secure communication. It is also well suited for DAOs, where Ethereum-based authentication can restrict access to members holding a minimum ETH balance, and for NGOs or research consortia that prefer not to host their own servers while still requiring end-to-end encrypted group messaging. Decentralized MLS (de-MLS) satisfies the following features:

Decentralized
Scalable
End-to-end encrypted (E2EE)
FS and PCS provided
Ethereum authenticated

Background

MLS

The Message Layer Security (MLS) protocol offers scalable and secure group messaging protocol by organizing participants into a cryptographic tree structure, enabling efficient operations like adding or removing members with logarithmic time complexity relative to the group size. MLS provides strong security guarantees, including FS and PCS.

MLS assumes that two services are provided:

An Authentication Service (AS): It enables group members to authenticate the credentials presented by other group members.
A Delivery Service (DS) that routes MLS messages among the participants in the protocol in the correct order and manage the keyPackage of the users where the keyPackage is the objects that provide some public information about a user.

Despite its scalability, MLS has a notable limitation: it is inherently designed for server-based federated architectures for delivery service (DS), even when the servers themselves don't need to be trusted. To achieve a decentralized protocol, the functionality of DS must be reimagined to eliminate reliance on a central server while preserving the protocol's security properties. Thus, we proposed decentralized MLS (de-MLS), leveraging Waku nodes as peer-to-peer communication protocols to eliminate reliance on centralized servers.

Lastly, MLS operates on an epoch-based model, where group state changes (e.g., adding/removing users or key refreshes) occur between epochs that are always required to be conducted by a single entity. For example, if a user is removed in epoch E, the rest of the group members generate a new key in epoch E + 1 by passing the new entropy. The removed user cannot decrypt messages sent after epoch E + 1.

Waku

Waku is a decentralized messaging protocol designed for secure and efficient communication in peer-to-peer networks. It operates as a broadcast-based routing layer where content topics can be used to tag and filter messages. Users join channels by subscribing to specific content topics, which determine the scope and type of messages exchanged. This enables flexible and efficient communication patterns in a decentralized environment.

de-MLS

Decentralized MLS (de-MLS) is a peer-to-peer secure group messaging protocol that can work with any delivery service (DS) meeting a minimal set of requirements. In this post, we highlight its integration with Waku as the messaging protocol, while emphasizing that de-MLS itself remains agnostic to the underlying DS. Further technical details can be found in the de-MLS RFC.

Decentralization is achieved not only at the delivery service (DS) level but also within the authentication service (AS). Multiple special nodes named Steward in the group serve as authorized identities to authenticate users before they join or are removed from the group transparently.

de-MLS provides two different user management configurations, both utilizing the Waku protocol for DS:

Single Steward:
- A single authorized identity (Steward) manages the group, including removing or adding users with agreement among users by a voting-based consensus.
Multi-Steward:
- Multiple Stewards have equal authority to add or remove users.
- A consensus mechanism ensures consistency by resolving concurrent changes within the same epoch and preventing possible conflicts. In each epoch, all modifications are managed exclusively by a single Steward.

Note: We chose the term Steward to reflect the role of transparently coordinating and organizing passengers at stations, much like Stewards do in transit systems.

In multi-Steward settings, de-MLS requires a consensus among Stewards that have equal rights in the group since changes in an epoch in MLS are required to be conducted by a single identity, that is the Steward.

For the consensus integration, ongoing research explores two promising approaches:

On-chain consensus mechanisms: Outsourcing consensus to a smart contract solution for transparent and immutable agreement.
Off-chain consensus mechanisms: Utilizing off-chain consensus protocols to design efficient, decentralized protocols.

Waku Integration

Waku integration is a crucial step in the construction of de-MLS, aiming to replace traditional client-server communication with decentralized messaging. The specifics of Waku integration will be detailed in a separate RFC; for now, our main priority is the de-MLS RFC.

The main challenge in this transition is transforming the centralized Delivery Service (DS) into a decentralized equivalent, which performs two essential functions:

Message Delivery and Ordering: The DS is responsible not only for delivering messages to the correct recipients, but also for preserving the correct order of these messages, which is critical for the consistency of group state.
Key Package Management: The DS manages key packages, which are essential for adding members securely to a group.

To maintain a truly decentralized architecture, key packages cannot be stored in a centralized location. Initially, we considered using a smart contract (SC) as a decentralized substitute for server-side key package storage. However, this approach proved impractical. Blockchains are immutable by design—once data is written, it cannot be fully removed. This contradicts a core requirement of MLS: each key package must be used exactly once and then deleted, to prevent replay or reuse attacks. Instead, our solution is to require users to actively provide their key packages upon request, allowing validation at the moment of use without persistent storage. While this approach may lose some benefits of asynchronicity, we plan to address this in the future by introducing store nodes that can temporarily hold key packages. This ensures both compliance with MLS's security model and alignment with decentralized system principles.

Flow

The flow section explains the processes that when a user wants to join a group in both Steward and users side also their interactions. The flow of de-MLS is as follows:

1. Steward joins the welcome topic

The welcome topic is a topic created and monitored by the Steward for a specific secure messaging group, allowing any Waku node to subscribe permissionlessly. Being in the welcome topic does not imply group membership, it acts as a waiting room where users can send their key material, which the Steward listens for and processes before granting access to the secure group.

2. Group initialization

Steward initalizes a group with parameters such as cipher suite and group ID.

3. Emitting Group Anouncement (GA) by Steward

Steward creates group announcement (GA) periodically to the welcome channel that the users can find the who the Steward is. This will be important for the next step.

4. User joins the welcome topic

As first, the users who wants to be part of the decentralized MLS should subscribe the welcome channel. Then user can find the group name and also corresponding GA message from Steward. This GA message helps the user to create a valid keyPackages which define in section 10 in RFC9420 for the group.

5. User creates its key package

User creates the keyPackage and encrypt by public key of the Steward then send it to the Steward. Since the message is encrypted, stay secure though the welcome (permissionless) topic.

6. Steward receives the User's key package

Steward receives the user's keyPackage and decrypt it. After decrypted, Steward also verifies the validity of the keyPackage by signature verification. If the keyPackage is not valid, the Steward just drops the message, otherwise it moves to the next step which is proposal creation.

7. Creation of Voting proposals

Voting proposals are special MLS application messages that may come from any participant, including the Steward. In this context, any member can create a proposal corresponding to the user’s keyPackage. In regular MLS, proposals are automatically converted into commit messages, which can change the structure of the tree. However, in de-MLS, since the process is decentralized, proposals must be voted on before being converted into a commitment.

8. Voting for proposal

Voting applies decentralization by protecting small groups can control. Therefore, proposals must be voted on before committing. The consensus mechanism should be a lightweight consensus that cannot be a bottleneck for treeKEM scalability. Basically, the consensus returns the binary result for a given proposal. If voting result is NO, the proposal is dropped; otherwise, the Steward transforms it into an MLS proposal. MLS proposal message is a distinct type of MLS application message, where the Steward attaches the voting result instead of directly releasing a commit message.

9. Creating commit message

Commit messages are the messages that start new epochs. They include key and tree material that existing members can use to generate the new state of the tree.

After Steward gets the YES from consensus, Steward creates commit messages that injects new entropy for the existing group members.

10. Sending messages

After Steward creates and then sends two messages:

Commit message informs existing group member to update their key to align with the new member’s key for the upcoming epoch.
The welcome message informs the newly joined user to generate a group key that matches the key existing members will use in the upcoming epoch.

Although existing users had different group keys in the previous epoch and the new user had none, the Steward message ensures that both existing and new users converge on the same group key in the next epoch.

11. Applying welcome message

User can generate the next epoch group key by using the welcome message as well as existing users extract the same groupKey by using commit messages.

The commit message helps existing members generate the next group key Gk+1, while the welcome message helps the newly joining user generate the same Gk+1. This provides two important security properties:

Forward Secrecy (FS): The new user cannot read previous messages since they were encrypted with the old key Gk
Post-Compromise Security (PCS): If a user is removed from the group, they cannot read future messages since those messages will be encrypted with the new key Gk+1

Benchmark

This section presents the performance evaluation of de-MLS. One of the key advantages of the MLS protocol is its efficiency, as it eliminates the need for pairwise message exchanges between all participants. Instead, the decentralized DS enables the addition of new participants by sending only two messages to the group: a commit message and a welcome message. However, despite this advantage, the protocol does have certain bottlenecks, which are as follows:

Firstly, the Steward must receive the key packages from each member wishing to join the group. This process requires sequential message exchanges and involves computationally intensive tasks such as encryption, decryption, and digital signature verification. Even when multiple users are added to the group simultaneously, the process is essentially sequential. The tree structure is updated one user at a time, followed by sending the final commit message to the existing group members and a single welcome message to the new members.
Secondly adding a member to a group requires rebuilding the tree and computing new keys.

The following measurements were made as follows:

The time required for the entire sequence of receiving a user key package is presented here. This includes generating the Steward key, creating messages with signatures and encryption, and processing these messages.

Share Key Package - 1.8395 ms

Note that these measurements do not account for the time taken to forward messages.

The time required for creating the commit and welcome message from a ready-made package bunches is shown in this table.

Group Size (by users)	Time
10	1.8662 ms
100	14.124 ms
500	121.85 ms
1000	412.39 ms
5000	~ 15-20 s
10000	~ 1-1.5 min

The tests were conducted on the following configuration: Apple M3 Pro @ 4.05GHz and 12-Core CPU/18-Core GPU.

Here, the network latency and the time taken by users to apply the received commits are also excluded. These aspects are planned to be measured and evaluated in future work.

Potential drawbacks and countermeasures

Since de-MLS replace the servers by P2P, we could lose some good features of servers based MLS. In this section we present the potential drawbacks and possible countermeasures of de-MLS.

Offline users: keyPackages are provided by the users directly without any storing, this is required each user must be online for joining to a group.
- We can consider to use Waku sync nodes that are nodes has storing ability for a temporary storing of keyPackages.
DoS attack to Steward: Steward is known in welcome message from periodic group announcement message so Steward can be targeted for DoS attack.
- As always we consider to use Rate-Limiting Nullifier (RLN) with Waku to protect network from spam.
Message loss or delay : Because of P2P and consensus settings, message can be lost or delayed.,
- We can integrate reliability mechanisms to Waku such as scalable data sync (SDS)
- Consensus mechanism requires to provide liveness property against offline nodes, for example, it may provides default YES or NO options for a silent users who do not vote.
Enchanced authentication
- Ethereum authentication could be inefficient. We can configure the authentication mechanism for example asking minimum balance or etc.

Conclusion

To summarize, the approach to solving decentralized DS tasks with Waku can be outlined as shown in the comparison table:

Feature	MLS	de-MLS
Message Distribution	Messages are sent from the server to clients	Messages are sent by publishing/subscribing to pub-sub topics
Commit Message Handling	Relies on a server	Relies on a consensus and transparent Steward
Key Package Management	Key packages are stored and distributed by the server	Key packages are provided by the users themselves

Future Work

In the next iterations, the implementations are planned as following:

Dual-Consensus Multi-Steward Support: One consensus mechanism selects an Steward from all users, while a second governs group decisions among the elected Stewards
Consensus mechanism for handling concurrent changes within the same epoch
Key rotation support
Benchmarking for the multi-Steward configuration including the network time

References

[1] RFC 9420: The Messaging Layer Security (MLS) Protocol. Retrieved from https://datatracker.ietf.org/doc/rfc9420/
[2] OpenMLS. Retrived from https://github.com/openmls/openmls
[3] Waku. Retrived from https://waku.org/
[4] de-MLS. Retrived from https://github.com/vacp2p/de-mls/

Scaling libp2p GossipSub for Large Messages: An Evaluation of Performance Improvement Proposals

2025-08-12T12:00:00.000Z

The original GossipSub design emphasizes robustness, with less focus on message sizes. However, emerging use cases—such as Ethereum's EIP-4844 and data availability sampling (DAS) require rapid propagation of high data volumes, often in the form of large messages. Many ongoing research efforts attempt to find solutions for efficiently forwarding large messages through the GossipSub network. This post provides a concise overview and performance evaluation of some research initiatives aimed at improving GossipSub to meet the performance needs of modern P2P networks.

Overview

For each subscribed topic, GossipSub nodes maintain a full-message mesh of peers with a target degree $D$ , bounded by lower and upper thresholds $D_{low}$ and $D_{high}$ , respectively. In parallel, a metadata-only (gossip) mesh includes additional $K$ peers for metadata exchange. All messages flow through the full-message mesh, and metadata flows through the gossip mesh.

The metadata contains IHAVE messages, announcing IDs of seen messages. Upon receiving an IHAVE announcement about an unseen message ID, a receiver responds with an IWANT request to retrieve the announced message. IHAVE announcements serve multiple purposes:

Offer additional resilience in situations involving non-conforming peers or network partitions.
Speed up message propagation by allowing far-off peers to fetch overdue messages.

Since GossipSub v1.1 [1], replying to IWANT requests is optional, to safeguard honest peers from adversaries. This change encourages peers to make redundant IWANT requests for the same message. While this arrangement works well for small messages, it can be inefficient for bigger ones. This inefficiency arises from an increase in duplicates and a higher number of IWANT requests due to longer transmission times for large messages. As a result, we observe a significant rise in bandwidth utilization and message dissemination times across the network.

IDONTWANT messages in GossipSub v1.2 [2] help reduce some duplicates. However, further efforts are necessary to mitigate this problem [3, 4]. That is why many recent works focus on improving GossipSub's performance when handling large messages [5, 6, 7, 8, 9, 10, 11, 12].

In this post, we evaluate the push-pull phase transition (PPPT) mechanism [10], the GossipSub v1.4 proposal [11, 5], and the GossipSub v2.0 proposal [12] for performance against Gossipsub v1.2. For this purpose, we implement minimal proof-of-concept (PoC) versions of these protocols in nim-libp2p [13] and use the shadow simulator to provide performance evaluation results.

We begin by discussing the issues of message dissemination times and duplicate messages in GossipSub. Next, we provide a brief overview of the proposals under review, followed by the experiments and findings from our performance evaluations.

Message Transfer Time and Duplicates

Assuming uniform link characteristics, message dissemination to full-message mesh concludes in $\tau_D \approx (D \times \tau_{tx}) + \tau_p$ time, where $\tau_{tx} = \frac{S}{R}$ , with $S$ , $R$ , and $\tau_p$ being the message size, data rate, and link latency, respectively.

This simplifies network-wide dissemination time to $\tau_{N} \approx \tau_D \times h$ , with $h$ indicating the number of hops along the longest path. This implies that a tenfold increase in message size results in an eightyfold rise in $D \times \tau_{tx}$ for a mesh with $D = 8$ , which accumulates across $h$ hops. This leads to two fundamental problems:

A longer contention interval ( $\tau_D$ ) increases the chance that peers receive the same message from multiple mesh members during that interval,
leading to redundant transmissions and more duplicates. Reducing $D$ inadvertently increases $h$ — potentially slowing propagation overall.
Peers are unaware of ongoing message receptions and may generate redundant IWANT requests for the messages they are already receiving. Most of these requests are entertained by early message receivers, which increases message dissemination time by increasing the workload at peers situated along the optimal path.

Talking about duplicates, a network comprising $N$ peers, each with a degree $D$ , has a total of $\frac {N \times D}{2}$ edges (links), as every link connects two peers. Assuming that a message traverses every link exactly once, we get at least $\frac {N \times D}{2}$ transmissions.

Only $N-1$ transmissions are necessary for delivering a message to all peers. As a result, we get $\frac {N \times D}{2} -(N-1)$ duplicates in the network. We can simplify average duplicates received by a single peer to $\bar{d}_{min} \approx \frac{D}{2}-1$ . Here, $\bar{d}_{min}$ represents the lower bound on average duplicates because we assume that the send and receive operations are mutually exclusive. This assumption requires that message transmission times (and link latencies) are so small that no two peers simultaneously transmit the same message to each other.

However, a large message can noticeably increase the contention interval, which increases the likelihood that many peers will simultaneously transmit the same message to each other.

Authors in [14] explore the upper bound ( $\bar{d}_{max}$ ) on duplicates. They argue that a node can forward a received message to a maximum of $D-1$ peers while the original publisher sends it to $D$ peers. As a result, we get $N(D-1)+1$ transmissions in the network. Only $N-1$ transmissions are necessary to deliver a message to all peers. Remaining $N(D-1)+1-(N-1)$ transmissions are duplicates, which simplifies the upper bound on average duplicates received by each peer to $\bar{d}_{max} \approx D-2$ . This rise indicates that larger messages can lead to more duplicates due to longer contention intervals. It is essential to highlight that the impact of IWANT/IDONTWANT messages is not considered in $\bar{d}$ computations above.

Protocols Considered

Push-Pull Phase Transition (PPPT)

In PPPT, authors argue that most redundant transmissions occur during the later stages of message propagation. As a message traverses through the network, the peers forwarding the message should gradually reduce the number of mesh members that directly receive the message (push) and send immediate IHAVE announcements to the remaining peers in the mesh. The remaining mesh members can fetch any missing messages using IWANT requests (Pull). The authors also suggest two strategies to estimate the message propagation stage:

Include a hop count in the message header to identify the number of hops traversed by that message. When a peer forwards a received message, it performs a pull operation for a subset of mesh members that equals the specified hop count and a push operation for the remaining mesh members.
Infer the message propagation stage by looking into the number of received IHAVE announcements and duplicates for that message. Use this information to choose a balance between pull-based and push-based message forwarding.

The authors suggest that instead of simultaneously pushing a message to the selected peers, sequentially initiating transmission to each peer after a short delay enhances the likelihood of timely receiving a higher number of IDONTWANT requests.

Key Considerations and PoC Implementation

The use of hop count is a more effective and straightforward method for identifying the message propagation stage than relying on the duplicate count. However, this approach may compromise the publisher's anonymity and reveal information about the publisher and its mesh members. Additional due diligence may be needed to address these issues.

In the PoC implementation [15], we use hop count to determine the message propagation stage. When forwarding a message, every peer selects a subset of mesh members equal to the advertised hop count for pull operation and forwards the message to the remaining mesh members. If the advertised hop count exceeds the number of mesh members chosen for message forwarding, the sender relays the message to all selected peers.

GossipSub v1.4 Proposal

GossipSub v1.4 proposal considers longer large-message transfer times ( $\tau_D$ ) as contention intervals and argues that most duplicates occur during these intervals for two fundamental reasons:

Peers are unaware of ongoing message receptions and may generate redundant IWANT requests for messages they are already receiving.
Peers can send IDONTWANT announcements only after receiving the entire message. However, a large contention interval increases the likelihood that many redundant transmissions will start before IDONTWANT messages are issued.

GossipSub v1.4 proposal eliminates contention interval with help from two new control messages: PREAMBLE and IMRECEIVING. A PREAMBLE precedes every large message transmission. Upon receiving a preamble, a peer learns about the messages it is receiving and performs two actions:

Notify mesh members about ongoing message reception using an IMRECEIVING announcement. On receiving an IMRECEIVING announcement from a peer, mesh members defer sending the announced message to that peer.
Defer IWANT requests for messages that are currently being received. Peers also limit the outstanding IWANT requests for any message to one.

Key Considerations and PoC Implementation

The use of PREAMBLE/IMRECEIVING addresses the limitation of IDONTWANT messages. For instance, consider a peer $X$ begins receiving a message $m$ at time $t_1$ . It can transmit IDONTWANT only after receiving $m$ , i.e., at time $t_1+\tau_D$ . Therefore, it can not cancel any duplicate receptions of $m$ that start before $t1 + \tau_D + \tau_p$ . In contrast, IMRECEIVING announcements for $m$ start at $t_1 + \Delta$ , where $\Delta$ denotes PREAMBLE processing time and satisfies $\Delta \ll \tau_D$ . As a result, peer $X$ can eliminate all duplicate receptions of $m$ that start after $t_1 + \Delta +\tau_p$ , which noticeably reduces duplicates.

The use of PREAMBLE also allows deferring IWANT requests for messages we are already receiving, which can also improve message dissemination time by reducing the workload on peers along the optimal message forwarding path.

It is worth mentioning that a malicious peer can exploit this approach by sending a PREAMBLE and never completing (or deliberately delaying) the promised message transfer. The optional safety strategy in GossipSub v1.4 proposal suggests using a peer score threshold for PREAMBLE processing and a behavior penalty for broken promises. A timeout strategy helps recover such messages.

It is essential to mention that sending and processing of PREAMBLE and IMRECEIVING messages is optional. This flexibility allows for the use of custom safety strategies in various implementations. For example, the ongoing production-grade implementation of GossipSub v1.4 in nim-libp2p allows peers to ignore PREAMBLEs unless they come from mesh members with higher data rates (bandwidth estimation becomes trivial with PREAMBLEs) and good peer scores. This implementation also lets peers choose between a push or pull strategy for handling broken promises.

For the performance evaluations in this post, we utilize the PoC implementation of GossipSub v1.4 [16]. A complete, production-grade version is currently undergoing testing and validation [17].

GossipSub v2.0 Proposal

GossipSub v2.0 introduces a hybrid method for message dissemination that combines both push and pull strategies through two new control messages: IANNOUNCE and INEED. These messages are analogous to IHAVE and IWANT messages, respectively. However, IANNOUNCE messages are issued to the mesh members immediately after validating a received message without waiting for the heartbeat interval. Similarly, INEED requests are made exclusively to mesh members, and a peer generates only one INEED request for a received message.

The balance between push and pull approaches is determined by the $D_{announce}$ parameter. On receiving a message, a peer forwards it to $D - D_{announce}$ mesh members and sends IANNOUNCE messages to the remaining mesh peers. On receiving an IANNOUNCE for an unseen message, a peer can request it using an INEED message.

Key Considerations and PoC Implementation

The $D_{announce}$ parameter governs the balance between push and pull operations. Setting $D_{announce} = D$ results in a pull-only operation, which can eliminate duplicates at the cost of increased message dissemination time. In contrast, setting $D_{announce}$ to zero reverts to standard GossipSub v1.2 operation. The authors suggest setting $D_{announce} = D-1$ to moderately decrease dissemination time while incurring only a small number of duplicate transmissions.

It is important to note that malicious peers can exploit this approach by delaying or entirely omitting responses to INEED requests. Similarly, sending INEED requests to suboptimal or overwhelmed peers can further increase message dissemination time. The authors propose using a timeout strategy and negative peer scoring to address these issues. If a message transfer does not complete within the specified interval, the receiver decreases the sender's peer score and issues a new INEED request to an alternative mesh member.

For the performance evaluations in this post, we utilize the PoC implementation of GossipSub v2.0 [18] from nim-libp2p. We set $D_{announce} = D-1$ and allow any peer to send a single IWANT request for a message, only if it has not previously sent an INEED request for the same message.

Experiments

We conducted a series of experiments under various configurations to evaluate the performance of the GossipSub v1.4 proposal, the PPPT approach, and the GossipSub v2.0 proposal against the baseline GossipSub v1.2 protocol. To support these evaluations, we extended the nim-libp2p implementation to include minimal PoC implementations of the considered protocols [16, 15, 18]
and used the Shadow simulator [19] to carry out performance evaluations.

For GossipSub v1.4 and PPPT, we also report results from delayed forwarding, where peers introduce a short delay before relaying a message to every mesh member. This delay helps reduce the number of redundant transmissions by increasing the likelihood of timely receiving a higher number of IDONTWANT notifications.

We evaluate performance using network-wide message dissemination time (latency), network-wide bandwidth utilization (bandwidth), and the average number of duplicates received by a peer for every transmitted message. We also report the average number of IWANT requests transmitted by a peer for a single message.

For each experiment, we transmit multiple messages in the network. We average the network-wide dissemination time for these messages to report latency. Bandwidth refers to the total volume of traffic in the network, encompassing control messages and data transmissions (including duplicates and IWANT replies). A peer usually receives multiple copies of any transmitted message. Excluding the first received message, all copies are duplicates. We compute average duplicates received by a peer as $\frac{1}{N M} \sum_{j=1}^{M} \sum_{i=1}^{N} d_{i,j}$ where $N$ and $M$ denote the number of peers and the number of transmitted messages, respectively, and $d_{i,j}$ represents the number of duplicates received by peer $i$ for message $j$ . A similar mechanism computes average IWANT requests.

Three simulation scenarios are considered:

Scenario 1: The number of publishers and message size are kept constant while the network size gradually increases.
Scenario 2: The number of publishers and the network size remain constant while the message size gradually increases.
Scenario 3: The number of nodes and message size remain constant while the number of publishers gradually increases.

In all experiments, we transmit multiple messages such that every publisher sends exactly one message to the network. After a publisher transmits a message, each subsequent publisher waits for a specified interval (inter-message delay) before sending the next message.

Rotating publishers ensures that every message traverses a different path, which helps achieve fair performance evaluation. On the other hand, changing inter-message delays allows for creating varying traffic patterns. A shorter inter-message delay implies more messages can be in transit simultaneously, which helps evaluate performance against large message counts. A longer delay ensures every message is fully disseminated before introducing a new message. Similarly, increasing message size stresses the network. As a result, we evaluate performance across a broader range of use cases.

The simulation details are presented in the tables below. The experiments are conducted using the shadow simulator. We uniformly set peer bandwidths and link latencies between 40-200 Mbps and 40-130 milliseconds in five variations.

Table 1: Simulation Scenarios.

Experiment	No. of Nodes	No. of Publishers	Message Size (KB)	Inter-Message Delay (ms)
Scenario 1	3000, 6000, 9000, 12000	7	150	10000
Scenario 2	1500	10	200, 600, 1000, 1400, 1800	10000
Scenario 3	1500	25, 50, 75, 100, 125	50	50

Table 2: Simulation Parameters.

Parameter	Value	Parameter	Value
$D$	8	$D_{low}$	6
$D_{lazy}$	6	$D_{high}$	12
Gossip factor	0.05	Muxer	yamux
Heartbeat interval	1000 ms	Floodpublish	false
Peer Bandwidth	40-200 Mbps	Link Latency	40-130ms
$D_{announce}$ GossipSub v2.0	7	Forward delay in PPPT/v1.4 with delay	35 ms

Results

Scenario1: Increasing Network Size


Bandwidth	Latency	Average Duplicates	Average IWANT Requests

Scenario2: Increasing Message Size


Bandwidth	Latency	Average Duplicates	Average IWANT Requests

Scenario3: Increasing Number of Publishers


Bandwidth	Latency	Average Duplicates	Average IWANT Requests

Findings

The number of IWANT requests increases with the message size. Limiting ongoing IWANT requests for each message to one can be beneficial. Additionally, the use of message PREAMBLEs can help eliminate IWANT requests for messages that are already being received.
Pull-based approaches can substantially reduce bandwidth utilization, but may result in much longer message dissemination times. However, these approaches can achieve simultaneous propagation of multiple messages by implicitly rotating among outgoing messages. As a result, increasing the number of messages yields similar dissemination times.
Transition from push to pull operation during the later stages of message propagation can reduce bandwidth consumption, without compromising latency. However, determining the propagation stage is challenging. Methods like hop counts may compromise anonymity, while using IHAVE announcements can be misleading. For instance, in the case of large messages, peers may receive IHAVE announcements much earlier than the actual message spreads through the network.
Push-based approaches achieve the fastest message dissemination but often produce a higher number of duplicate messages. Employing mechanisms like PREAMBLE/IMRECEIVING messages for guided elimination of duplicate messages can significantly reduce bandwidth consumption. This reduction not only minimizes redundant transmissions but also decreases the overall message dissemination time by lessening the workload on peers located along optimal message forwarding paths.

Please feel free to join the discussion and leave feedback regarding this post in the VAC forum.

References

[1] GossipSub v1.1 Specifications

[2] GossipSub v1.2 Specifications

[3] Number of Duplicate Messages in Ethereum’s Gossipsub Network

[4] Impact of IDONTWANT in the Number of Duplicates

[5] PREAMBLE and IMRECEIVING for Improved Large Message Handling

[6] Staggering and Fragmentation for Improved Large Message Handling

[7] GossipSub for Big Messages

[8] FullDAS: Towards Massive Scalability with 32MB Blocks and Beyond

[9] Choke Extension for GossipSub

[10] PPPT: Fighting the GossipSub Overhead with Push-Pull Phase Transition

[11] GossipSub v1.4 Specifications Proposal

[12] GossipSub v2.0 Specifications Proposal

[13] Libp2p Implementation in nim

[14] The Streamr Network: Performance and Scalability

[15] PPPT: PoC Implementation in nim-libp2p

[16] GossipSub v1.4: PoC Implementation in nim-libp2p

[17] GossipSub v1.4: Production-Grade Implementation in nim-libp2p

[18] GossipSub v2.0: PoC Implementation in nim-libp2p

[19] nim-libp2p GossipSub Test Node

Zerokit optimizations: A performance journey

2025-07-25T18:30:00.000Z

Zerokit is a toolkit providing powerful zero-knowledge utilities, including a means to answer the question "How do you prevent spam when every message is anonymous?". Its use of the Merkle hash tree, combined Poseidon hasher are keys to the answer we seek here, and with other questions that ask the improbable. These technologies, however, can take a heavy toll on resources if not used correctly. What follows is a window into the efforts made to squeeze out optimizations, and culling of redundant resource use. A story of cripplingly slow performance meets engineering talent, we arrive at a place where Zerokit comes through, fast and efficient, ready to face the world.

Background

Our friends over at Waku are particularly enthusiastic about anonymous spam prevention technologies. They have been using the Rate Limiting Nullifier (RLN) tooling that Zerokit provides to enforce a message-rate policy among users—a crucial feature unless we want a community bombarded with "totally legit, not scams" messages on repeat. However, as is often the case with new technology, some problematic delays began to surface. Node recalculations, a common operation, were taking tens of seconds at the scales being tested and deployed—even exceeding 40 seconds at times. These delays accumulate, leading to total delays on the order of three hours under certain conditions.

Naturally, we couldn't just let this sit. While we've touched on the issue of spam prevention, it's important to recognize that this technology is foundational that challenges conventional wisdom on how things must be done. Does the idea of "smart contracts without gas" catch your attention? Don't hold your breath just yet: the really interesting applications of this tech will be dead in the water, unless we can meet the challenge put to us.

The Challenge

The plan of attack that the team put together was twofold: get rid of redundant operations and data taking up precious resources, and make the remaining operations go Blazingly Fast™.

Introducing the star of the show for part 1: The main point of having this tree is to generate proofs so that peers can verify the claims being made. That doesn’t require the whole Merkle tree, just a single path, from leaf to root. The engineering work took us in a direction where these paths were the primary context in which ZK proofs operated, relegating the tree itself to an off-network reference. This reduced the burden imposed on the network significantly. Updating the data on the tree has similarly reduced, with the exception being that the siblings of each node were retained. This is called the stateless approach.

Well, stateless in the context of proof generation and verification. This is the critical context when it comes to performance, and the stateless approach does a great job, but these proofs have to come from somewhere. Each participant still needs to maintain the Merkle tree in their local environment. Without this tree, one cannot generate proofs or verify the proofs provided to them. Fortunately, one does not need to track the entire tree, but can be limited to a subset of the tree needed. With millions of participants on-chain, this can make the difference needed to make Zerokit empowered technologies accessible to those running raspberry Pis. Combine this with the fact that the heavy lifting operations of proof gen/verification being modular and separate, each participant can optimise to run things according to the strengths and requirements of their native hardware, easing the way to allow each participants to run their tree implementation at the speed of mach-ludicrous.

Fortunately, the core of our already existing implementation was sane and well put together. Double-lucky for us, the talents of newly minted VAC/ACZ team members Sylvain and Vinh were readily available. Sylvain, with a solid background in the Rust programming language, having already graduated from the most challenging parts of its infamous learning curve. He quickly got to work zeroing in on some subtle performance pathologies. Something as simple as using a mutable iterator to change values directly. Clever use of references to avoid copying data, and other memory optimization techniques that can be hidden to those that cannot “see the matrix” when working in Rust lead to very promising bench-marking results.

Vinh, having recently graduated from his CS studies, was presented with the challenge of parrelising computations. For those not familiar with Rust, this might seem unreasonable, but thanks to the rayon crate, and Rusts promise of "fearless concurrency" afforded by its type and ownership system, this kind of refactor becomes surprisingly easy, even for a talented individual at the start of their career. Of particular note: These parallelisations have been made available to the browser. Browser threads are relatively now, and by diving into this bleeding-edge technology, and making use of libraries that are still in early development stages, Blazingly Fast™ is now available within the browser. With all that in the bag, all these performance gains are gift-wrapped in the use of browser-native WASM runtimes.

Well done, everyone!

The importance of benchmarks

No performance project is complete without high quality benchmark data. Writing a quick benchmark for tracking improvements through development is one thing, but having a system of telemetry that allows you to credibly assert claims of superior performance is what completes the project. With such credible claims in hand, these efforts can bring about significant impact on the field at large. The key word being credible. Credibility cannot depend on “trust me bro” (obviously). The truth of these claims must come out of the directed efforts of a multitude of thought-disciplines. The engineer must have a solid model to understand the nature of the system. The statistician sets the quality standards of the data. The Scientist must diligently put relevant hypothesis to the test. The advocate must see that the reports made reach out to where it makes the most impact, the list goes on. Much of this is out of scope for this article, and so I will treat you with a link. Here’s your chance to see a hardcore OS engineer at the top of their chosen field speak on the subject of their passion.

All this is to say we are not the only team implementing Merkle tree tech, which also includes the Poseidon hash function it needs. In order to be a premier research destination, key aspect of why VAC exists, the fruits of our labor is just the beginning. We must prove the merit of our efforts through comparative benchmarks that satisfies the skeptics and decision makers.

Comparative benchmarks are among the most high-stakes element of performance critical projects. Get it right, and quality research output can become industry standard technology. Get it wrong, and be ready to lose the trust the field has in you as your precious R&D fades into obscurity.

For the time being, our comparative benchmarks have been used internally to inform decision-makers. As benchmarks become standardised, independently verified and executed, this initial effort may be the first of many steps to a community-wide ecosystem. A thunderdome of benchmarks, leaving us with a single champion that cannot be denied, but which technology will claim this mantle? May the bits be ever in your favor...

Benchmarking with Rust's criterion crate

Rust, much like Nim, offers unadulterated, fine-grained, and direct control over performance, but with Rust, this control is even more immediate. With its sophisticated ownership model, powerful type system, and comprehensive tooling, Rust has earned an unrivaled reputation for enabling "fearless concurrency," ease of refactoring, and providing tooling that effectively "pair programs with you" to help avoid common pitfalls, includeing those of the performance veriety.

The criterion crate is considered the go-to library for micro-benchmarking within the Rust ecosystem, and is generally regarded as an invaluable tool for obtaining high-quality telemetry. Through its ergonomic idioms and well-thought-out API, writing high-quality benchmarks becomes straightforward once you become familiar with its features. Criterion helps avoid common traps such as inappropriate compiler optimizations, improper performance sampling, and failing to prune telemetry overhead. As is typical for the Rust ecosystem, the documentation is thorough, leaving little to be desired, and the examples are extensive, making the initial learning process a pleasant experience.

Most importantly, it automatically generates tables and graphs from this data, making the crucial task of analysis straightforward and accessible. At this point, we are ready to appreciate the results of our efforts.

Promising results

When it comes to Merkle trees, we have two elements to consider: The tree itself, and the hashing function that is plugged into it. In the benchmarks we put together for the benefit of internal processes, we put our implementation up against a corresponding FOSS implementation. Scenarios were developed to isolate key performance telemetry, obtain a statistically usable sampling, with the resulting data rendered into a human readable form that can be read with a reasonable degree of confidence: enjoy! The brief summary: It appears that our in house implementation consistently outperforms others, and we’ve decided to continue committing to the R&D of our in-house implementations. Congratulations to the Zerokit team for this accomplishment.

Despite the promising results, these “micro-benchmarks” form just some of the many pieces of the whole system performance when it comes to product needs. How the system performs as a whole is all that matters. This is a promising on it’s own, but watching the performance benefits being realized in the wild is the true goal.

Which brings us back to what started all this: Waku came to us with concerns about performance issues within Zerokit limiting the scope and scale in which it can be used. The engineering talent brought to bear on this issue has successfully achieved the performance goals needed, and the results of these effort have demonstrated there is merit in continuing our commitment to this project.

Conclusion

We’ve covered a story that starts with crippling performance bottlenecks in Waku, and ends on this high-note: The problematic performance scaling issues are no more, and in the process of resolving this critical pain-point, we have established internal benchmarks that allow us to confidently state that what we are doing, we are doing well. These accomplishments come down to a solid team effort. The open communication coming in from Waku, the talented engineers working together to bring their skills and contributions to bear, the community developed tools and prior works that allowed it all to happen, and those working quietly in the background providing the leadership, resources, and coordination needed to bring this all together. Two VAC/ACZ engineers in particular call for specific mention: Ekaterina for her role in taking lead in the R&D of the Zerokit ecosystem, and Sylvain for his efforts in squeezing out some impressive optimizations. Vinh for unleashing the power of multiple threads, not only for native, but for when running in the browser as well.

Perhaps you want to get involved! Maybe you have some ideas about what the community needs for standard benchmarks. Would you like to see another implementation added to the thunderdome? Raise an issue, or join us on our forum. We look forward to seeing your voice added.

This is just one story, coming out of one relatively small project from VAC research. The two driving directives of the team is to be a conduit of expertise within IFT, and to be a premier research destination within the domains we work in. You might be independent of IFT with an interest in what we do, an IFT core contributor, or anything in between: our services are at your disposal. Join us on discord to start the conversation, email one of our team members, or maybe you might hear a knock on your door, should something in your field of work catch our interest.

Nim in Logos - 1st Edition

2025-07-04T23:00:00.000Z

Welcome to the first edition of Nim in Logos — a newsletter covering major Nim features from Logos' perspective.

If you have comments or suggestions, feel free to reach out to the authors directly or start a thread in the Logos Discord server.

Nim 2.2 – Better Stability, Smarter Memory, and Smoother Development

The Nim 2.2 release series focuses on improving language stability, fixing long-standing bugs, and optimizing performance—particularly in the ORC memory management system. The latest patch in this series, version 2.2.4, continues to build on these goals.

Here are some of the key highlights from the 2.2 series:

More powerful generics and type expressions: Stabilization of generics, typedesc, and static types. These features now support arbitrary expressions that previously only worked in limited cases, making them more reliable.
Improved tuple unpacking: Tuple unpacking now supports discarding values using underscores (_) and allows inline type annotations for unpacked elements.
Memory leak fixes: Issues with memory leaks when using std/nre’s regular expressions or nested exceptions have been resolved.
More efficient async code: Futures no longer always copy data, resulting in better performance in asynchronous workflows.
String bug fixes: Several issues involving string and cstring usage have been corrected.

In addition to core language improvements:

NimSuggest stability: The language server has received multiple fixes, improving the experience in IDEs and editors that rely on NimSuggest for autocompletion and error checking.
Better code generation: Numerous issues related to invalid or broken C and C++ code generation and backend-specific bugs have been addressed, improving Nim’s reliability when targeting other languages.

You can read the full release announcement and changelog here

Error Handling in Nim: Why Results Beat Exceptions

Error handling is one of the most critical aspects of writing reliable software, yet it remains a contentious topic in many programming languages. In Nim, developers face a unique challenge: multiple error handling paradigms are supported, leading to confusion about which approach to choose. For robust, maintainable code, our answer at Logos is increasingly clear—favor Result types over exceptions.

The Exception Problem

While exceptions might seem convenient for quick scripts and prototypes, they introduce significant challenges in complex, long-running applications:

Silent API Changes: One of the most dangerous aspects of exception-based error handling is that changes deep within dependencies can break your code without any compile-time warning. When a function suddenly starts throwing a new exception type, your code may fail at runtime under exceptional circumstances—often when you least expect it.
Resource Management Issues: Exceptions create unpredictable control flow that can lead to resource leaks, security vulnerabilities, and unexpected crashes. When an exception unwinds the stack, resources may not be properly cleaned up.
Refactoring Difficulties: The compiler provides little assistance when working with exception-based code. Adding a new exception type breaks the ABI but leaves the API unchanged, making it nearly impossible to track down all the places that need updating.

The Result Advantage

The Result type offers a compelling alternative that makes error handling explicit, predictable, and compiler-verified:

# Enforce that no exceptions can't be raised in this module
{.push raises: [].}

import results

proc doSomething(): Result[void, string] =
# Implementation here

proc getRandomInt(): Result[int, string] =
# Implementation here

doSomething().isOkOr:

echo "Failed doing something, error: ", & error

randomInt = getRandomInt().valueOr:

echo "Failed getting random int, error: ", & error

Notice that this usage of Result is much more concise and easier to follow than try-except blocks

Best Practices for Result-Based Error Handling

Make Errors Explicit: Use Result when multiple failure paths exist and calling code needs to differentiate between them. This makes error handling visible at the call site and forces developers to consciously handle failure cases.
Handle Errors Locally: Address errors at each abstraction level rather than letting them bubble up through multiple layers. This prevents spurious abstraction leakage and keeps error handling logic close to where problems occur.
Use Exception Tracking: Enable exception tracking with {.push raises: [].} at the module level. This helps identify any remaining exception-throwing code and ensures new code follows the Result pattern.

When to Break the Rules

While Result should be your default choice, exceptions still have their place:

Assertions and Logic Errors: Use assertions for violated preconditions or situations where recovery isn't possible or expected.
Legacy Integration: When interfacing with exception-heavy libraries, you may need to use exceptions at integration boundaries, but convert them to Result types as quickly as possible. To ensure safe exception handling, explicitly declare which exceptions a procedure may raise using the {.raises: [SpecificException].} pragma.

Error handling in Nim continues to evolve, but the trend is clear: explicit error handling through Result types provides better safety, maintainability, and debugging experience than exceptions. By making errors part of your function signatures and forcing explicit handling at call sites, you create more robust software that fails gracefully and predictably.

Debugging in Nim

Nowadays, analyzing the behavior of a Nim program is not as straightforward as debugging a C++ application, for example.

GDB

GDB can be used, and step-by-step debugging with GDB and VSCode is possible. However, the interaction is not very smooth. You can set breakpoints in VSCode and press F5 to run the program up to the breakpoint and continue debugging from there. That said, the state of variables is not fully demangled. For example:

For that reason, GDB is not the preferred option in Logos

Logs - Chronicles

At Logos, we primarily debug Nim applications using log outputs. In particular, we make extensive use of the nim-chronicles library.

nim-chronicles is a robust logging library that automatically includes the following contextual information in each log entry:

Calling thread ID
Current timestamp
Log level (e.g., TRACE, DEBUG, INFO, WARN, ERROR, FATAL)
Source file name
Line number of the log statement

Additionally, chronicles supports attaching custom log messages along with relevant variable values, which proves especially useful for debugging. For instance, in the following example, the log message is "Configuration. Shards", and it includes the value of an additional variable, shard.

INF 2025-07-01 09:56:57.705+02:00 Configuration. Shards                      topics="waku conf" tid=28817 file=waku_conf.nim:147 shard=64

There are also useful techniques for displaying more detailed information about specific variables:

repr(p) — Returns a string representation of the variable p, providing a more comprehensive view of its contents.
name(typeof(p)) — Extracts the type of the variable p as a string. This is particularly helpful when working with pointers or generics.

Logs - echo

The echo statement in Nim serves as a basic debugging tool, although it is less powerful and flexible compared to nim-chronicles.

Besides, debugEcho is an interesting alternative, which behaves similarly to echo but it allows working on routines marked with no side effects.

Heaptrack

This technique enables precise insight into where memory is being consumed within a Nim application.

It is particularly useful for identifying potential memory leaks and is widely employed in nwaku (Nim Waku). For more details, refer to the documentation: Heaptrack Tutorial.

Formatting code in Nim

Maintaining a consistent code format is essential for readability and for facilitating clear diff comparisons during code reviews.

To support this, Logos strongly recommends using nph across all Nim projects.

The MDSECheck method: choosing secure square MDS matrices for P-SP-networks

2025-02-28T23:00:00.000Z

This article introduces MDSECheck method — a novel approach to checking square MDS matrices for unconditional security as the components of affine permutation layers of P-SP-networks.

Introduction

Maximum distance separable (MDS) matrices play a significant role in algebraic coding theory and symmetric cryptography. In particular, square MDS matrices are commonly used in affine permutation layers of partial substitution-permutation networks (P-SPNs). These are widespread designs of the modern symmetric ciphers and hash functions. A classic example of the latter is Poseidon [1], a well-known hash function used in zk-SNARK proving systems.

Square MDS matrices differ in terms of security that they are able to provide for P-SPNs. The use of some such matrices in certain P-SPNs may result in existence of infinitely long subspace trails of small period for the latter, which make them vulnerable to differential cryptanalysis [2].

Two methods for security checking of square MDS matrices for P-SPNs have been proposed in [2]. The first one, which is referred to as the three tests method in the rest of the article, is aimed at security checking for a specified structure of the substitution layer of a P-SPN. The second method, which is referred here as the sufficient test method, has been designed to determine whether a square MDS matrix satisfies a sufficient condition of being secure regardless of the structure of a P-SPN substitution layer, i.e. to check whether the matrix belongs to the class of square MDS matrices, which are referred to as unconditionally secure in the current article.

This article aims to introduce MDSECheck method — a novel approach to checking square MDS matrices for unconditional security, which has already been implemented in the Rust programming language as the library crate [3]. The next sections explain the notions mentioned above, describe the MDSECheck method as well as its mathematical foundations, provide a brief overview of the MDSECheck library crate and outline possible future research directions.

MDS matrix: how to define and construct

An $m$ x $n$ matrix $M$ over a finite field is called MDS, if and only if for distinct $n$ -dimensional column vectors $v_1$ and $v_2$ the column vectors $v_1 \: | \: M v_1$ and $v_2 \: | \: M v_2$ , where $|$ stands for vertical concatenation, do not coincide in $n$ or more components. The set of all possible column vectors $v \: | \: M v$ for some fixed matrix $M$ is a systematic MDS code, i.e. a linear code, which contains input symbols on their original positions and achieves the Singleton bound. The latter property results in good error-correction capability.

There are several equivalent definitions of MDS matrices, but the next one is especially useful for constructing them directly by means of algebraic methods. A matrix over a finite field is called MDS, if and only if all its square submatrices are nonsingular. The matrix entries and the matrix itself are also considered submatrices.

One of the most efficient and straightforward methods to directly construct an MDS matrix is generating a Cauchy matrix [4]. Such an $m$ x $n$ matrix is defined using $m$ -dimensional vector $x$ and $n$ -dimensional vector $y$ , for which all entries in the concatenation of $x$ and $y$ are distinct. The entries of the Cauchy matrix are described by the formula $M_{i, j} = 1 \: / \: (x_i - y_j)$ . It is obvious that any submatrix of a Cauchy matrix is also a Cauchy matrix. The Cauchy determinant formula [5] implies that every square Cauchy matrix is nonsingular. Thus, Cauchy matrices satisfy the second definition of MDS matrices.

Partial substitution-permutation networks

Describing SPNs in algebraic terms is convenient, so this approach has been chosen for this article. SPNs are designs of the symmetric cryptoprimitives, which operate on an internal state, which is represented as an $n$ -dimensional vector over some finite field, and update this state iteratively by means of the round transformations described below.

Each round begins with an optional update of the internal state by adding to its components some input data or extraction of some of these components as the output data. This optional step depends on the specific cryptoprimitive and the current round number. The next step is called the nonlinear substitution layer and lies in replacing the $i$ -th component of the internal state with $S_i(c)$ for each $i \in [1..n]$ , where $c$ is the component value and $S_i(x)$ is a nonlinear invertible function over the finite field. The function $S_i(x)$ is specific to the cryptoprimitive and called an S-Box. The final step, which is known as the affine permutation layer, replaces the internal state with $M X + c$ , where $X$ is the current internal state, $M$ is a nonsingular square matrix and $c$ is the vector of the round constants. The value of $c$ is specific to the cryptoprimitive and the current round number, while $M$ typically depends only on the cryptoprimitive. The data flow diagram for an SPN is given below.

..................................                         
   │        │        │        │                           
   ▼        ▼        ▼        ▼                           
┌────────────────────────────────┐                         
│ Optional addition / extraction │ <─────> Input / output
└──┬────────┬────────┬────────┬──┘                         
   ▼        ▼        ▼        ▼                            
┌─────┐  ┌─────┐  ┌─────┐  ┌─────┐                         
│S₁(x)│  │S₂(x)│  │ ... │  │Sₙ(x)│                         
└──┬──┘  └──┬──┘  └──┬──┘  └──┬──┘                         
   ▼        ▼        ▼        ▼                            
┌────────────────────────────────┐                         
│       Affine permutation       │                         
└──┬────────┬────────┬────────┬──┘                         
   ▼        ▼        ▼        ▼                            
┌────────────────────────────────┐                         
│ Optional addition / extraction │ <─────> Input / output
└──┬────────┬────────┬────────┬──┘                         
   ▼        ▼        ▼        ▼                            
┌─────┐  ┌─────┐  ┌─────┐  ┌─────┐                         
│S₁(x)│  │S₂(x)│  │ ... │  │Sₙ(x)│                         
└──┬──┘  └──┬──┘  └──┬──┘  └──┬──┘                         
   ▼        ▼        ▼        ▼                            
┌────────────────────────────────┐                         
│       Affine permutation       │                         
└──┬────────┬────────┬────────┬──┘                         
   ▼        ▼        ▼        ▼                            
..................................                         

Partial SPNs are modifications of SPNs, where for certain rounds some S-Boxes are replaced with the identity functions to reduce computational efforts [2]. For example, the nonlinear substitution layers of the partial rounds of Poseidon update only the first internal state component [1]. In the case of P-SPNs, security considerations commonly demand to choose $M$ as a square MDS matrix, because these matrices provide perfect diffusion property for the affine permutation layer [6]. Possessing this property means ensuring that any two $n$ -dimensional internal states, which differ in exactly $t$ components, are mapped by the affine permutation layer to two new internal states that differ in at least $n - t + 1$ components.

Square MDS matrix security check in the context of P-SPNs

Certain square MDS matrices should not be used in certain P-SPNs to avoid making them vulnerable to differential cryptanalysis, since it may exploit the existence of infinitely long subspace trails of small period for vulnerable P-SPNs. [2]. Such matrices are called insecure with respect to particular P-SPNs.

An infinitely long subspace trail of period $l$ exists for a P-SPN, if and only if there is a proper subspace of differences of internal state vectors, such that if for a pair of initial internal states the difference belongs to this subspace, then the difference for the new internal states, which are obtained from the initial ones by means of the same $l$ -round transformation, also belongs to this subspace [2].

Two methods for checking square MDS matrices for suitability for P-SPNs in terms of existence of infinitely long subspace trails have been proposed in [2]. The three tests method is aimed at checking whether using a specified matrix for a P-SPN with a specified structure of the substitution layer leads to existence of infinitely long subspace trails of period $p$ for this P-SPN for all $p$ no larger than a given $l$ . The sufficient test method has been designed to determine whether a square MDS matrix satisfies a sufficient condition of non-existence of infinitely long subspace trails of period $p$ for P-SPNs using this matrix for all $p$ no larger than a specified $l$ .

The sufficient test method is a direct consequence of Theorem 8 in [2] and consists in checking that the minimal polynomial of the $p$ -th power of the tested matrix has maximum degree and is irreducible for all $p \in [1..l]$ . The aforesaid sufficient non-existence condition is satisfied by the matrix, if and only if all the checks yield positive results.

It is convenient to define the unconditional P-SPN security level of the square MDS matrix as follows: this level is $l$ for the matrix $M$ , if and only if the minimal polynomials of $M$ , $M^2$ , $...$ , $M^l$ have maximum degree and are irreducible, but for $M^{l \: + \: 1}$ the minimal polynomial does not have this property. Using this definition, the purpose of the sufficient test method can be described as checking whether the unconditional P-SPN security level of the specified matrix is no less than a given bound.

MDSECheck method: getting rid of the matrix powers

The MDSECheck method, whose name is derived from the words "MDS", "security", "elaborated" and "check", has the same purpose as the sufficient test method, but achieves it differently. The differences of the first method from the latter and approaches to implementing them can be described as follows:

Computation and verification of minimal polynomials of $M^2$ , $M^3$ , $...$ , $M^l$ , where $M$ is the tested $n$ x $n$ matrix over $GF(q)$ and $l$ is the security level bound, has been replaced with checks for the corresponding powers of a root of the characteristic polynomial of $M$ for non-presence in nontrivial subfields of $GF(q^n)$ .
1. The non-presence check is performed without straightforward consideration of all nontrivial subfields of $GF(q^n)$ . The root is checked only for non-presence in the subfields $GF(q^{n \: / \: p_1})$ , $GF(q^{n \: / \: p_2})$ , $...$ , $GF(q^{n \: / \: p_d})$ , where $p_1$ , $p_2$ , $...$ , $p_d$ are all prime divisors of $n$ .
2. The non-presence check reuses some data computed during the checking for irreducibility the minimal polynomial of $M$ , which in this case coincides with $f(y)$ designating the characteristic polynomial of $M$ . The values of $y^{q^{n \: / \: p_j}} \mod f(y)$ are saved for each $j \in [1..d]$ during the irreducibility check to replace exponentiations with sequential computations of $(y^i)^{q^{n \: / \: p_j}} \mod f(y)$ from $(y^{(i \: - \: 1)})^{q^{n \: / \: p_j}} \mod f(y)$ as its product with $y^{q^{n \: / \: p_j}} \mod f(y)$ .
The check of the minimal polynomial of $M$ for irreducibility and maximum degree is performed without unconditional computation of this polynomial. This computation has been replaced with the Krylov method fragment, which consists in building and solving only one system of linear equations over $GF(q)$ . If $M$ has an irreducible minimal polynomial of maximum degree, then its coefficients are trivially determined from the system solution. If the system is degenerate, then the minimal polynomial of $M$ does not have such properties.

The correctness of the first distinctive feature can be proven as follows. Verifying that the minimal polynomial of a matrix is of maximum degree and irreducible is equivalent to verifying that the characteristic polynomial of this matrix is irreducible, because the minimal polynomial divides the characteristic one. Also, it is trivially proven that for a matrix with such a minimal polynomial it is equal to the characteristic polynomial. Thus, the required checks for the matrices $M^2$ , $M^3$ , $...$ , $M^l$ can be done by checking their characteristic polynomials for irreducibility.

Let $M$ be $n$ x $n$ matrix over $GF(q)$ , whose minimal polynomial is of maximum degree and irreducible. The statements in the previous paragraph imply that $f(y)$ , which is the $n$ -degree characteristic polynomial of $M$ , is irreducible. Consider $M$ over the extension field $GF(q^n)$ , which is the splitting field of $f(y)$ . Let $α \in GF(q^n)$ be a root of $f(y)$ . According to standard results from the Galois field theory, $α$ , $α^q$ , $α^{q^2}$ , $...$ , $α^{q^{n \: - \: 1}}$ are distinct roots of $f(y)$ [7]. Thus, these powers of $α$ are $n$ distinct eigenvalues of $M$ . Hence, due to matrix similarity properties, there is some matrix $S$ such that $S M S^{-1} = D$ , where $D$ is the diagonal matrix, whose nonzero elements are $α$ , $α^q$ , $α^{q^2}$ , $...$ , $α^{q^{n \: - \: 1}}$ . Therefore, $S M^i S^{-1} = D^i$ , so the roots of the characteristic polynomial of $M^i$ are $α^i$ , $(α^q)^i$ , $(α^{q^2})^i$ , $...$ , $(α^{q^{n \: - \: 1}})^i$ . If the minimal polynomial of $α^i$ has degree less than $n$ , then the characteristic polynomial of $M^i$ is divisible by this minimal polynomial, while $α^i$ lies in some nontrivial subfield of $GF(q^n)$ . One of the fields isomorphic to this subfield is a residue class ring of polynomials modulo the minimal polynomial of $α^i$ [7]. If the minimal polynomial of $α^i$ is of degree $n$ , then the characteristic polynomial of $M^i$ equals this minimal polynomial and therefore is irreducible, while $α^i$ does not lie in any nontrivial subfield of $GF(q^n)$ . In this case, $1$ , $α^i$ , $(α^i)^2$ , $...$ , $(α^i)^{n \: - \: 1}$ are linearly independent as distinct roots of an irreducible polynomial over a finite field [7], so any field containing $α^i$ has at least $q^n$ elements and therefore cannot be a trivial subfield of $GF(q^n)$ . Thus, checking the characteristic polynomials of the matrices $M^2$ , $M^3$ , $...$ , $M^l$ for irreducibility is equivalent to verifying that $α^2$ , $α^3$ , $...$ , $α^l$ do not lie in any nontrivial subfield of $GF(q^n)$ .

The last sentences of the two previous paragraphs imply the following: verifying that the minimal polynomials of $M^2$ , $M^3$ , $...$ , $M^l$ are of maximum degree and irreducible can be performed by verifying that the corresponding powers of a root of the characteristic polynomial of the $n$ x $n$ matrix $M$ over $GF(q)$ do not belong to any nontrivial subfield of $GF(q^n)$ . $\blacksquare$

The approaches to implementing the first distinctive feature can be explained and proven to be correct as follows. Since $GF(q^w)$ is a nontrivial subfield of $GF(q^u)$ if and only if $w$ divides $u$ and $w < u$ [7], the presence of some $ε$ in $GF(q^h)$ , which is a nontrivial subfield of $GF(q^n)$ , implies that $ε \in GF(q^{n \: / \: ν})$ for some prime $ν$ dividing $n$ , because $h$ divides the quotient of $n$ and some of its prime factors. Thus, checking that some value does not belong to subfields $GF(q^{n \: / \: p_1})$ , $GF(q^{n \: / \: p_2})$ , $...$ , $GF(q^{n \: / \: p_d})$ , where $p_1$ , $p_2$ , $...$ , $p_d$ are all prime divisors of $n$ , is equivalent to checking this value for non-presence in nontrivial subfields of $GF(q^n)$ .

Checking for irreducibility the minimal polynomial of $M$ is performed by means of Algorithm 2.2.9 in [8] and consists in sequential computation of $y^p \mod f(y)$ , $y^{p^2} \mod f(y)$ , $...$ , $y^{p^{\lfloor n \: / \: 2 \rfloor}} \mod f(y)$ and checking that $GCD(y^{p^i} \mod f(y) \: - \: y, f(y)) = 1$ for each $i \in [1..\lfloor n \: / \: 2 \rfloor]$ , where $f(y)$ is the characteristic polynomial of $M$ and coincides with the minimal polynomial in this case. The optimized root non-presence check is performed by checking that for each $i \in [2..l]$ for each $j \in [1..d]$ the value of $((y^i)^{q^{n \: / \: p_j}} - y^i) \mod f(y)$ is nonzero. This approach is based on the following standard results from the Galois field theory [7]:

$GF(q^n)$ is isomorphic to the residue class ring of univariate polynomials in $y$ modulo $f(y)$ , because at this point $f(y)$ is known to be irreducible, and some root of $f(y)$ is mapped by this isomorphism to the residue class the polynomial $y$ in this ring.
All elements of a finite field $GF(q^w)$ and only they are roots of $y^{q^w} - y$ .

The expression $((y^i)^{q^{n \: / \: p_j}} - y^i) \mod f(y)$ can be rewritten as $((y^{q^{n \: / \: p_j}})^i - y^i) \mod f(y)$ , which can be computed without exponentiation as the product of $(y^{(i \: - \: 1)})^{q^{n \: / \: p_j}} \mod f(y)$ and $y^{q^{n \: / \: p_j}} \mod f(y)$ , which has been saved during the irreducibility check.

The second distinctive feature can be explained and proven to be correct in following way. The $n$ x $n$ matrix $M$ does not have a minimal polynomial of maximum degree, if some Krylov subspace of order $n$ for it is not $n$ -dimensional. Indeed, the minimal polynomial of the matrix is divisible by the minimal polynomial of the restriction of this linear operator to an arbitrary subspace, and in the considered case the latter polynomial has degree less than $n$ , because the degree of the minimal polynomial of a linear operator cannot exceed the dimension of the subspace the operator acts on. Thus, an unconditional computation of the minimal polynomial of $M$ is not required to determine whether this polynomial is irreducible and has maximum degree. Using this computation has been replaced with the Krylov method fragment, which consists in choosing any nonzero $n$ -dimensional column vector $v$ and solving the system of linear equations $A X = b$ , where $A$ is an $n$ x $n$ matrix, whose columns are $v$ , $M v$ , $M^2 v$ , $...$ , $M^{n \: - \: 1} v$ , and $b$ is $M^n v$ . If $A$ is singular, the minimal polynomial of $M$ is reducible or does not have maximum degree, so checking $M$ has been accomplished; otherwise, $f(y)$ , which is the minimal and characteristic polynomial of $M$ , can be expressed as $y^n - X_{n \: - \: 1} y^{n \: - \: 1} - X_{n \: - \: 2} y^{n \: - \: 2} - … - X_1 y - X_0$ .

The steps of MDSECheck method can be summarized as follows:

The square MDS matrix $M$ over $GF(q)$ and the unconditional P-SPN security level bound $l$ are received as inputs.
The Krylov method fragment is used to compute the minimal polynomial of $M$ . If the computation fails, then $M$ is not unconditionally secure, so the check of $M$ is complete. If it succeeds, then the minimal polynomial has maximum degree and, therefore, coincides with the characteristic polynomial of $M$ .
Algorithm 2.2.9 is used to check for irreducibility the minimal polynomial of $M$ , which is also the characteristic polynomial of $M$ in this case. Some data computed during this step is saved to be reused at the next one. If the polynomial is reducible, then the check of $M$ is complete, because $M$ has been found to be not unconditionally secure.
The values of $α^2$ , $α^3$ , $...$ , $α^l$ , where $α$ is a root of the characteristic polynomial of $M$ , are sequentially checked for non-presence in nontrivial subfields of $GF(q^n)$ as described above. If $α^i$ belongs to some nontrivial subfield of $GF(q^n)$ , then the unconditional P-SPN security level of $M$ is $i \: - \: 1$ , so the check of $M$ is complete. If all the values do not belong to such a subfield, then the unconditional P-SPN security level is at least $l$ .

MDSECheck library crate: implementation in Rust

The library crate [3] provides tools for generating random square Cauchy MDS matrices over prime finite fields and applying the MDSECheck method to check such matrices for unconditional security. The used data types of field elements and polynomials are provided by the crates ark-ff [9] and ark-poly [10]. The auxiliary tools in the crate modules are accessible as well.

Generating by means of this crate a 10 x 10 MDS matrix, which is defined over the BN254 scalar field [11] and has unconditional P-SPN security level is 1000, takes less than 60 milliseconds on average for the laptop with the processor Intel® Core™ i9-14900HX, whose maximum clock frequency is 5.8 GHz.

Conclusion

The MDSECheck method proposed in this article is a novel approach to checking square MDS matrices for unconditional security as the components of affine permutation layers of P-SPNs. It has been implemented as a practical library crate for generating unconditionally secure square MDS matrices for P-SPNs over prime finite fields.

The future research directions may include theoretical and experimental studies of performance of approaches, which use the MDSECheck method to generate unconditionally secure square MDS matrices for P-SPNs.

References

L. Grassi, D. Khovratovich, C. Rechberger, A. Roy, M. Schofnegger. "POSEIDON: A New Hash Function for Zero-Knowledge Proof Systems (Updated Version)".
L. Grassi, C. Rechberger, M. Schofnegger. "Proving Resistance Against Infinitely Long Subspace Trails: How to Choose the Linear Layer".
The page "mdsecheck" on crates.io.
Y. Kumar, P. Mishra, S. Samanta, K. Chand Gupta, A. Gaur. "Construction of all MDS and involutory MDS matrices".
The page "Value of Cauchy Determinant" on proofwiki.org.
T. Silva, R. Dahab "MDS Matrices for Cryptography".
S. Huczynska, M. Neunhöffer. "Finite Fields"
R. Crandall, C. Pomerance. "Prime Numbers: A Computational Perspective" (2nd edition).
The page "ark-ff" on crates.io.
The page "ark-poly" on crates.io.
The page "ark-bn254" on crates.io.

Vac 2024 Year in Review

2025-01-09T18:30:00.000Z

In this post, we recap Vac's achievements in 2024 and look forward to 2025.

With 2024 now behind us and a new year ahead, Vac is proud to reflect on the milestones and breakthroughs that defined another year of researching and developing free and open digital public goods for the Institute of Free Technology and wider web3 ecosystem.

Vac comprises various subteams and service units, each with its own focus. Below, we celebrate each unit's achievements and look forward to its 2025 plans.

Nescience

Nescience is our state separation architecture that aims to enable private transactions and provide a general-purpose execution environment for classical applications.

Highlights

This year, the Nescience state separation architecture moved from exploration to real progress, taking significant steps towards building a functional and reliable system. The team focused on turning ideas into something real, testing the proposed architecture, and understanding its strengths and weaknesses.

ZkVM exploration and benchmarks
- Published deep reviews of 23 existing zkVMs
- Benchmarked the performance of the six zkVMs that best fit Nescience
Defined the NSSA architecture
- Brought clarity to NSSA’s design and explained the system’s architecture in a lengthy exploratory blog post
Built the sandboxed testnet
- Designed the first version of the node specification
- All core components (execution types, UTXOs, cryptographic primitives) implemented and being tested
- Testing the performance of all execution types in various scenarios

We also made progress on the essential parts of NSSA’s system, including:

Key protocol for secure key management
Execution types and circuits for reliable computation
UTXO specification to manage state transitions effectively
Cryptography module to ensure privacy and security

Looking forward

In 2025, the Nescience team plans to double down on what works, fix what doesn’t, and push NSSA closer to real-world use.

Sandboxed testnet data analysis – the sandboxed testnet will be our primary data source that we will analyse to identify issues, limitations, and areas for improvement.
Expanding the node – expand sandboxed components into a full node implementation with rigorous testing and iterative optimization (to bridge the gap between proof of concept and production readiness).
Finalizing the architecture and RFC – after completing NSSA’s architecture, we will draft an RFC to ensure transparency and enable greater collaboration with the broader ecosystem.
Testing real-life scenarios – applying NSSA to diverse, practical use cases to assess its adaptability, performance, and impact.
Ongoing optimization – ensure NSSA is robust, efficient, and ready to scale.

Token Economics (TKE)

The TKE Service Unit works closely with IFT portfolio projects to design and implement crypto-economic incentive structures.

Highlights

Formalized and implemented Codex economic incentives in the Litepaper and simulations
Orchestrated Status Network incentive structure and smart contract implementation
Started building Nomos’s economic model
Consulted and provided analysis of incentives for the Logos Operators ordinals project
Drove discussions on the economic sustainability of Waku; helped define RLN membership and its payment mechanism

Looking forward

In 2025, TKE will continue to support IFT portfolio projects, working toward economic sustainability while strengthening relationships within the organization. Additionally, the service unit aims to continue building its external reputation through partnerships and publications of relevant work on the Vac forum.

Quality Assurance (QA)

The QA Service Unit focuses on the development and execution of comprehensive test plans, including implementing unit and interoperability testing.

Highlights

Matured Waku interoperability testing framework with coverage for all major protocols and features
Began collaboration with Nomos, contributing to unit and integration testing
Partnered with the Status team to test message reliability under unstable network conditions

Looking forward

Extend collaboration with the Waku team on go-waku bindings and message reliability testing
Cement working relationship with the Nomos team through the building of an E2E testing framework for higher-level node validation
Work closely with Status’s QA team to enhance the functional testing framework
Continue work on nim-libp2p testing
Expand collaboration to additional projects

RFC

The RFC Service Unit takes on the responsibility of shepherding and editing specifications for IFT projects. The unit acts as a linchpin for ensuring standardized and interoperable protocols within the IFT ecosystem.

Highlights

Working to implement RFC culture across the IFT ecosystem
Began editorial work for several IFT portfolio projects: Status, Nomos, Waku, and Codex.
Reworked our standards with regard to writing RFCs to a consensus-oriented specification system

Looking forward

Continue to implement RFC culture across the IFT ecosystem
Broaden the number of RFCs produced – particularly for IFT portfolio projects nearing public releases
Include new projects with the rfc-index
Encourage external projects requiring RFCs to establish relationships with the service unit

Applied Cryptography and ZK (ACZ)

The ACZ Service Unit focuses on cryptographic solutions and zero-knowledge proofs, enhancing the security, privacy, and trustworthiness of IFT portfolio projects and contributing to the overall integrity and resilience of the decentralized web ecosystem.

Highlights

Researched a libp2p mix protocol and first proof-of-concept implementation (including ping and GossipSub over mix)
Researched a decentralized version of MLS (message layer security) with a first proof of concept
Released Zerokit v0.6.0 and v0.5.0
Added gnark RLN implementation
Released Stealth Address Kit v0.3.1, v0.2.0, and v0.1.0
Published:
- Verifying RLN Proofs in Light Clients with Subtrees
- RLN-v3: Towards a Flexible and Cost-Efficient Implementation

Looking forward

Ensure libp2p mix protocol is production-ready and support with the publishing of a paper and blog posts
Ensure decentralized MLS is production-ready and support with the publishing of a paper and blog posts
Begin explorations of additional research topics
Release Zerokit v0.7 and future versions

P2P

The P2P Service Unit specializes in peer-to-peer technologies and develops nim-libp2p, improves the libp2p GossipSub protocol, and assists IFT portfolio projects with the integration of P2P network layers.

Highlights

Analysis and work on libp2p GossipSub improvements
Published:
- Libp2p GossipSub IDONTWANT Message Performance Impact
PR to libp2p specifications about specific lib2p GossipSub improvements we researched and tested https://github.com/libp2p/specs/pull/654

Looking forward

Add new features to nim-libp2p: QUIC transport, web transport
Update specifications for libp2p GossipSub, aiming to significantly improve its performance

Distributed Systems Testing (DST)

The DST Service Unit’s primary objective is to assist IFT portfolio projects in understanding the scaling behavior of their nodes within larger networks. By conducting thorough regression testing, the DST unit helps ensure the reliability and stability of projects.

Highlights

DST compute resources transitioned from a hosted environment to a dedicated Vac Lab, enabling better customization of resources and adding significantly more compute power – enabled much higher and more stable simulations (several hundred nodes to several thousand) and enhanced environmental control.
Maintained monthly regression simulations for both Waku and Nim-libp2p, helping us to detect several issues and ensure that future versions do not introduce new ones.
Successfully simulated and obtained results for all Waku protocols, relaying feedback to the team.

Looking forward

More testing and simulations for Codex and Nomos
Develop useful tools for all IFT portfolio projects – e.g. a Log Parser tool and data dashboard

Nim

Several IFT portfolio projects use the Nim ecosystem for its efficiency. The Nim Service Unit is responsible for the development and maintenance of Nim tooling.

Highlights

Released Nim-libp2p (v1.7.1, v1.7.0, v1.6.0, v1.5.0, v1.4.0, v1.3.0, v1.2.0)
Introduced SAT solver to the Nimble package manager that significantly improves dependency resolution
Nim VSCode Extension
Stabilized Nim Language Server

Smart Contracts (SC)

Vac's Smart Contracts Service Unit ensures the smart contracts deployed across the various IFT portfolio projects are secure, robust, and aligned with project requirements.

Highlights

Deployed the SNT staking protocol testnet following Status's governance vote to develop SNT staking and Status Network
Wrote specifications for Codex's architectural components and Status's staking contracts
Delivered several learn-up sessions on a variety of topics for IFT contributors, including:
- Stealth addresses
- Tokenized vaults
- Rental NFTs
- Merkle trees
- Account abstraction
- EVM deep dive

Looking forward

Deploy the SNT staking protocol on the Status Network testnet
Encourage community security audits via contests
Provide smart contract consultation services for IFT portfolio products
Engage in more learn-up sessions to promote org-wide knowledge sharing.

Heading into 2025

This year has seen Vac involved with many research, development, and testing undertakings in support of IFT portfolio projects. The digital public goods that emerge from our efforts not only support the organization itself but are open and free to use by any project that would benefit.

As we move into 2025, we aim to nurture a stronger RFC culture across the IFT to encourage greater collaboration and knowledge sharing among portfolio projects. Our goal is to serve as an internal conduit of expertise within the organization, supported by a strong RFC culture, maintaining a repository of internal knowledge creation, and identifying and facilitating IFT project synergies. Such an approach should lead to greater efficiencies across the organization.

We also aim to establish a diverse research community around Vac, and our efforts in this regard are already underway. In the final quarter of 2024, Vac stepped up its collaboration with the libp2p community and made a concerted effort to engage the community on the Vac forum. In 2025, we aim to continue working closely with those communities to which we already have ties, such as the libp2p, Ethereum, and Nim ecosystems.

We look forward to continuing our journey with you!

Follow Vac on X, join us in the Vac Discord, or take part in the discussions on the Vac forum to stay up to date with our research and development progress.

Vac 101: Climbing Merkle Trees

2024-12-30T12:00:00.000Z

In this post, we introduce a crucial data structure used throughout web3.

Introduction

A large amount of data is swapped between users on a blockchain in the form of transactions. Over the entire life of a blockchain, the storage space required to maintain a copy of every transaction becomes untenable for most users. However, the integrity of a blockchain relies on a large pool of users that can validate the blockchain's history from its inception to its present state. The data representing the blockchain's state is compressed. This compression addresses the issue of scalability that would otherwise greatly restrict the pool of users.

Data compression alone is not the end goal. As mentioned, it is essential for users to be able to validate the blockchain's history. The property of compression and validation was solved in Bitcoin by the use of Merkle trees. Merkle trees were introduced first by Ralph Merkle in his dissertation [1]. A Merkle tree is a data structure that compresses a digest of data to a constant size while still providing a method for proving membership of elements of the digest. A previous rlog[2] described how Merkle trees with their proof of membership could be used for lightweight clients for RLN.

Tree structure

A tree is a special data structure that organizes nodes so that there is exactly one path between any two nodes. The trees that we consider can be arranged in layers with multiple nodes (children) merged into a single node (parent) in the preceding layer. A single node exists in the base layer; this special node is called the root node. The highest level of the tree consists of childless nodes called leaves.

A binary tree has one additional property: each nonleaf node has exactly two children nodes. That is, we assume that nodes in a binary tree are either a parent node with two children or a leaf. As strange as it sounds, each child node has exactly one parental node.

A binary tree with $2^n$ leaves consists of $n+1$ layers. Additionally, such a tree has $2^{n+1}-1$ nodes.

Merkle trees

A Merkle tree is a specialized tree in which each node contains the evaluation of a hash function. Merkle trees are usually taken to have a binary tree structure. As such, the presentation we provide in this section will be for binary trees.

Construction

In this section, we show how Merkle trees are constructed to compress a digest $D$ . Suppose that the digest $D$ consists of $2^n$ entries; we assume that the digest $D$ has this many entries since a Merkle tree is a binary tree. Additionally, each digest can be padded to ensure that $D$ has the desired number of entries.

Each leaf of the Merkle tree contains the hash of a digest entry. Each parent node contains the hash of the concatenation of their child nodes. Through this iterative construction, we reach the root of the tree. The value contained in the root node is called the root hash. The root hash is a compressed representation of the digest $D$ .

Each node in the Merkle tree is computed by taking a hash. Since a binary tree with $2^n$ leaves has $2^{n+1}-1$ nodes, then we need to evaluate $2^{n+1}-1$ hashes to construct the Merkle tree.

Merkle tree intregrity

A large quantity of data can be compressed to a single hash value. A natural question to ask is: could a clever party find another digest that yields a Merkle tree with the same root hash? If possible, this would compromise the ledger since the blockchain's history could be altered. Fortunately, Merkle trees are quite secure. In fact, Merkle trees can be used to both bind and hide a digest.

The Merkle tree is able to bind a digest with one of the properties of hash functions (see our previous Vac 101 [3] for information on hash functions). A hash function is collision resistant; it is infeasible for a malicious party to find two values share the same hash value.

This collision resistance property, essentially, fixes the input to each leaf and into their parent, their parent's parent, and so on.

In certain applications, it may be desirable for the digest of a Merkle tree to be kept confidential. This is achieved with the preimage resistant property of hash functions. A hash function is preimage resistant provided that it is difficult to reverse the hashing operation. It would be necessary for a malicious party to find preimages to each node starting from the root node to determine the original digest.

Now, we see that Merkle trees are secured structures that are tamper resistant.

Proof of membership

An interesting and critical property of Merkle trees is their ability to prove that any piece of data is part of its digest. This can be done with logarithmic storage and logarithmic computation time.

Suppose that we want to show that data $\ell$ is part of the Merkle tree's digest. Additionally, suppose that $\mathsf{hash}$ is the hash function used to construct the tree. We assume that the hash function $\mathsf{hash}$ can be computed in constant-time for any input.

Suppose that a prover provides data $\ell$ to a verifier, and tells the verifier that $\ell$ corresponds to the $i$ th leaf of the Merkle tree. For the verifier to be convinced that $\ell$ is part of the digest, he needs to be able to construct the tree's root hash using $\mathsf{hash}$ , $i$ , $\ell$ and some additional information from the prover. Specifically, the prover must provide the sibling hashes for each value that the verifier can compute. This enables the verifier to compute the parents of the siblings that the prover provides and the values that he was able to produce himself. The last of the computed parents is the root.

The leaf index $i$ indicates whether a hash value provided by the prover is a left or right sibling. This is done by looking at the binary expansion of $i$ .

The verifier can compute the leaf $h_0 = \mathsf{hash}(\ell)$ . Next, using $h_0$ 's sibling, $h'_0$ , provided by the prover, the verifier can compute $h_1 = \mathsf{hash}(h_0 \|h'_0)$ or $h_1 = \mathsf{hash}(h'_0 \| h_0)$ depending on whether $h'_0$ is a left or right sibling. This pathing continues until the verifier either successfully computes the root hash (in $n+1$ hashes) or fails to do so.

The prover has to provide $n$ sibling nodes for the proof of membership.

There is a key detail that is essential for the proof of membership to be secure. The root hash has to be provided to the verifier prior to the selection of data $\ell$ . Otherwise, the prover could generate a series of hash values (with the corresponding root hash) to forge a proof of membership.

Capped proof of membership

Polygon provides an implementation [4] of a shortened proof of membership with a slight modification. A specific layer of the Merkle tree is published instead of just the root hash. By doing this, a capped proof of membership is just the path from leaf to the published layer.

Extensions of Merkle trees

Merkle trees can be extended in multiple ways. In this section, we explore a select few of these extensions.

Sparse Merkle trees

A sparse Merkle tree (SMT) is a special Merkle tree that can be used to represent digests with nonconsecutive entries. Specifically, each digest entry has a particular leaf index. For simplicity, we assume that the index value is computed by taking the hash of the entry. We note that this is a sorted SMT.

Let $n$ denote the number of bits that a hash value can possess. This means that our SMT can have at most $2^n$ leaves.

An SMT is treated as a Merkle tree in which each entry is placed in the leaf corresponding to its hash value, and the other entries have a $\mathsf{null}$ marker inserted in. This means that we can prove membership in the way described. However, we can also prove nonmembership of an element by showing that $\mathsf{null}$ is located in the element's hash location. The crucial difference between a sorted and unsorted SMT is that the unsorted variant cannot be used to prove nonmembership.

We can take advantage of the sparse nature of SMTs to provide shortened proofs. Specifically, it is unlikely for entries to cluster together. Thus, it is efficient to maintain a list of values:

Null values
$d_0 := \mathsf{Hash(null)},$
$d_1 := \mathsf{Hash(d_0 \\|\\| d_0)},$
$d_2 := \mathsf{Hash(d_1 \\|\\| d_1)},$
$\vdots$
$d_{n-1} := \mathsf{Hash(d_{n-2} \\|\\| d_{n-2})}.$

Each of the $d_i$ 's represents the root hash of a Merkle tree with $2^i$ leaves containing $\mathsf{null}$ . These values can be used to shorten the time needed to construct an SMT and the length of proofs.

Proof of nonmembership

In the first Vac 101 [5], we examined Bloom and Cuckoo filters that could be used for proof of membership and nonmembership. However, the proof of membership may result in false positives due to collisions. This would affect nonmembership proofs as well. Sparse Merkle trees can be adapted to provide greater assurance that a given piece of data is not a member of the digest.

Why is sorting essential? The sorting mechanism of data can be arbitrarily chosen. However, it is essential that there are no gaps in the ordering. The maximum number of elements that could ever exist in the digest must be known. A simple method for this is to use a hash function to provide fingerprints to the data. Each hash using either SHA-256 or Keccak has 256-bits. Our entire digest could consist of a maximum of $2^{256}$ entries. This assumes that our digest does not contain collisions.

The fingerprint of a piece of data $\ell$ indicates which leaf of the SMT it is contained in. This means that a nonmembership of $\ell$ in the SMT becomes a matter of proving that $\mathsf{null}$ is contained in $\ell$ 's location.

It is crucial for the SMT to be sorted. Otherwise, a malicious party can append the entry $\ell$ to a random location. This allows for the malicious party to provide contradictory proofs that prove both membership and nonmembership. We note that the requirement that an SMT is sorted may be too strong of an assumption in centralized cases. However, sortedness is a necessary property of SMTs for decentralized systems.

Verkle Trees

A proof of membership grows in length as the Merkle tree grows. The most obvious approach to remedy this scalability issue is to use Merkle trees in which each node has more than two children. However, this does not fix the issue. A proof of membership in a $k$ -nary Merkle tree [6] (each node has $k$ children) has a proof size $\log_k(n)(k-1)$ . The multiple $k-1$ is the number of silbings that a node has on each layer. Hence, the proof size grows faster than a logarithmic function of the digest size.

An alternate approach is to use a different data structure: Verkle trees [6]. A Verkle tree replaces hash functions with polynomial commitments [7, 8]. We will explore Verkle trees in a future Vac 101 edition.

References

Large Message Handling in GossipSub: Potential Improvements

2024-10-31T12:00:00.000Z

Large Message Handling in GossipSub: Potential Improvements

Motivation

The challenge of large message transmissions in GossipSub leads to longer than expected network-wide message dissemination times (and relatively higher fluctuations). It is particularly relevant for applications that require on-time, network-wide dissemination of large messages, such as Ethereum and Waku [1,2].

This matter has been extensively discussed in the libp2p community [3, 4, 5, 6, 7, 8], and numerous improvements have been considered (or even incorporated) for the GossipSub protocol to enable efficient large-message propagation [3, 7, 9, 10].

Problem description

Sending a message to N peers involves approximately $\lceil \log_D(N) \rceil$ rounds, with approximately $(D-1)^{X-1} \times D$ transmissions in each round, where $X, D$ represent the round number and mesh size.

Transmitting to a higher number of peers (floodpublish) can theoretically reduce latency by increasing the transmissions in each round to $(D-1)^{X-1} \times (F+D)$ , where $F$ represents the number of peers included in floodpublish.

This arrangement works fine for relatively small/moderate message sizes. However, as message sizes increase, significant rises and fluctuations in network-wide message dissemination time are seen.

Interestingly, a higher $D$ or $F$ can also degrade performance in this situation.

Several aspects contribute to this behavior:

Ideally, a message transmission to a single peer concludes in $\tau_1 = \frac {L}{R}+P$ (ignoring any message processing time), where $L, R, P$ represent message size, data rate, and link latency. Therefore, the time required for sending a message on a 100Mbps link with 100ms latency jumps from $\tau_1^{10KB} = 100.8ms$ for a 10KB message to $\tau_1^{1MB} = 180ms$ for a 1MB message. For $D$ peers, the transmission time increases to $\tau_D^{1MB} = (80 \times D) + 100ms$ , triggering additional queuing delays (proportional to the transmission queue size) during each transmission round.
In practice, $\tau_1^{1MB}$ sometimes rises to several hundred milliseconds, further exaggerating the abovementioned queuing delays. This rise is because TCP congestion avoidance algorithms usually limit maximum in-flight bytes in a round trip time (RTT) based on the congestion window ( $C_{wnd}$ ) and maximum segment size (MSS) to approximately ${C_{wnd} \times MSS}$ , with $C_{wnd}$ rising with the data transfer for each flow. Consequently, sending the same message through a newly established (cold) connection takes longer. The message transfer time lowers as the $C_{wnd}$ grows. Therefore, performance-improvement practices such as floodpublish, frequent mesh adjustment, and lazy sending typically result in longer than expected message dissemination times for large messages (due to cold connections). It is also worth mentioning that some TCP variants reset their $C_{wnd}$ after different periods of inactivity.
Theoretically, the message transmission time to D peers $(\tau_D)$ remains the same even if the message is relayed sequentially to all peers or simultaneous transmissions are carried out, i.e., $\tau_D = \sum_{i=1}^{D} \tau_i$ However, sequential transmissions finish early for individual peers, allowing them to relay early. This may result in quicker network-wide message dissemination.
A realistic network comprises nodes with dissimilar capabilities (bandwidth, link latency, compute, etc.). As the message disseminates, it's not uncommon for some peers to receive it much earlier than others. Early gossip (IHAVE announcements) may bring in many IWANT requests to the early receivers (even from peers already receiving the same message), which adds to their workload.
A busy peer (with a sizeable outgoing message queue) will enqueue (or simultaneously transfer) newly scheduled outgoing messages. As a result, already scheduled messages are prioritized over those published by the peer itself, introducing a significant initial delay to the locally published messages. Enqueuing IWANT replies to the outgoing message queue can further exaggerate the problem. The lack of adaptiveness and standardization in outgoing message prioritization are key factors that can lead to noticeable inconsistency in message dissemination latency at each hop, even in similar network conditions.
Message size directly contributes to peers' workloads in terms of processing and transmission time. It also raises the probability of simultaneous redundant transmissions to the same peer, resulting in bandwidth wastage, congestion, and slow message propagation to the network. Moreover, the benefits of sequential message relaying can be compromised by prioritizing slow (or busy) peers.
Most use cases necessitate validating received messages before forwarding them to the next-hop peers. For a higher message transfer time $(\tau )$ , this store-and-forward delay accumulates across the hops traveled by the message.

Possible improvements

1. Minimizing transfer time for large messages

The impact of message size and achievable data rate on message transmit time $\tau$ is crucial as this time accumulates due to the store-and-forward delay introduced at intermediate hops.

Some possible improvements to minimize overall message dissemination latency include:

a. Message fragmentation

In a homogeneous network, network-wide message dissemination time (ignoring any processing delays) can be simplified to roughly $\delta \approx \delta_{Tx} + P_h$ , where $\delta_{Tx}$ represents accumulative message transmit time denoted as $\delta_{Tx} = \frac{S}{R} \times h$ , with $S, R$ being the data size and data rate, and $h, P_h$ being the number of hops in the longest path and message propagation time through the longest path.

Partitioning a large message into n fragments reduces a single fragment transmit time to $\frac{\delta_{Tx}}{n}$ . As a received fragment can be immediately relayed by the receiver (while the sender is still transmitting the remaining fragments), it reduces the transmit time to $\delta_{Tx} = \frac{S}{R} \times \frac{2h-1}{n}$ .

This time reduction is mainly attributed to the smaller store-and-forward delay involved in fragment transmissions.

However, it is worth noting that many applications require each fragment to be individually verifiable. At the same time, message fragmentation allows a malicious peer to never relay some fragments of a message, which can lead to a significant rise in the application's receive buffer size.

Therefore, message fragmentation requires a careful tradeoff analysis between time and risks.

b. Message staggering

Considering the same bandwidth, the time $\tau_D$ required for sending a message to D peers stays the same, even if we relay to all peers in parallel or send sequentially to the peers, i.e., $\tau_D = \sum_{i=1}^{D} \tau_i$ .

However, sequential relaying results in quicker message reception at individual peers ( $\tau_1 \approx \frac{\tau_D}{D}$ ) due to bandwidth concentration for a particular peer. So, the receiver can start relaying early to its mesh members while the original sender is still sending the message to other peers.

As a result, after every $\frac{\tau_D}{D}$ milliseconds, the number of peers receiving the message increases by $2^X\ \forall\ X \in \{0, D-1\}$ and by $\sum_{k=X-D}^{X-1} \lambda_k\ \forall\ X \geq D$ . Here, $X$ represents message transmission round $X = i \cdot \frac{\tau_D}{D} \mid i \in \mathbb{N}_0$ , and $\lambda_k$ represents the number of peers that received the message in round $k$ .

It is worth noting that a realistic network imposes certain constraints on staggering for peers. For instance, in a network with dissimilar peer capabilities, placing a slow peer (also in cases where many senders simultaneously select the same peer) at the head of the transmission queue may result in head-of-line blocking for the message queue.

At the same time, early receivers get many IWANT requests, increasing their workload.

c. Message prioritization for slow senders

A slow peer often struggles with a backlog of messages in the outgoing message queue(s) for mesh members. Any new message transmission at this stage (especially the locally published messages) gets delayed. Adaptive message-forwarding can help such peers prioritize traffic to minimize latency for essential message transfers.

For instance, any GossipSub peer will likely receive every message from multiple senders, leading to redundant transmissions [11]. Implementing efficient strategies (only for slow senders) like lazy sending and prioritizing locally published messages/IWANT replies over already queued messages can help minimize outgoing message queue sizes and optimize bandwidth for essential message transfers.

A peer can identify itself as a slow peer by using any bandwidth estimation approach or simply setting an outgoing message queue threshold for all mesh members.

Eliminating/deprioritizing some messages can lower a peer's score, but it also earns the peer an overall better score by achieving some early message transfers.
For instance, sending many near-first messages can only save a peer from a deficit penalty. On the other hand, sending only one message (assuming MeshMessageDeliveriesThreshold defaults to 1) as the first delivered message can add to the accumulative peer score.

2. Mitigating transport issues

Congestion avoidance algorithms used in various TCP versions directly influence achievable throughput and message transfer time as maximum unacknowledged in-flight bytes are based on the congestion window $(C_{wnd})$ size.

Rapid adaptation of $C_{wnd}$ to the available network conditions can help lower message dissemination latency.

Therefore, selecting a more suitable TCP variant like BBR, which is known for its ability to dynamically adjust the congestion window based on network conditions, can significantly enhance GossipSub's performance.

At the same time, parameters like receive window scaling and initial $C_{wnd}$ also impact message transfer time, but these are usually OS-specific system-wide choices.

One possible solution is to raise $C_{wnd}$ by exchanging data over the newly established connection. This data may involve useful details like peer exchange information and gossip to build initial trust, or GossipSub can use some dummy data to raise $C_{wnd}$ to a reasonable level.

It's important to understand that some TCP variants reset $C_{wnd}$ after specific periods of inactivity [12]. This can lead to a decline in TCP's performance for applications that generate traffic after intervals long enough to trigger the resetting of the congestion window.

Implementing straightforward measures like transport-level ping-pong messages can effectively mitigate this problem [13].

The limitations faced with $C_{wnd}$ scaling also impact some performance optimizations in GossipSub. For instance, floodpublishing is an optimization relying on additional transmissions by the publisher to minimize message dissemination latency.

However, a small $C_{wnd}$ value in (new/cold) TCP connections established with floodpublish peers significantly increases message transmission time [4]. Usually, these peers also receive the same message from other sources during this time, wasting the publisher's bandwidth.

The same is the case with IWANT replies.

Maintaining a bigger mesh (with warm TCP connections) and relaying to $D$ peers can be a better alternative to this problem.

3. Eliminating redundant transmissions

For every received packet, a peer makes roughly $D$ transmissions to contribute its fair share to the spread of messages. However, the fact that many recipients had already received the message (from some other peer) makes this message propagation inefficient.

Although the $D$ -spread is attributed to quicker dissemination and resilience against non-conforming peers, many potential solutions can still minimize redundant transmissions while preserving the resilience of GossipSub.

These solutions, ranging from probabilistic to more knowledgeful elimination of messages from the outgoing message queue, not only address the issue of redundancy but also provide an opportunity for bandwidth optimization, especially for resource-constrained peers.

For instance, an IDONTWANT message, a key component of GossipSub (v1.2) [10], can significantly reduce redundant transmissions.

It allows any node to notify its mesh members that it has already received a message, thereby preventing them from resending the same message. This functionality is useful when a node receives a message larger than a specified threshold.

In such cases, the node promptly informs its mesh peers about the successful reception of the message by sending IDONTWANT messages.

It's important to note that an IDONTWANT message is essentially an IHAVE message, but with a crucial difference, i.e., IHAVEs are only transmitted during the heartbeat intervals, whereas IDONTWANTs are sent immediately after receiving a large message.

This prompt notification helps curtail redundant large message transmissions without compromising the GossipSub resilience.

However, the use of IDONTWANT messages alone has an inherent limitation. For instance, a peer can only send an IDONTWANT after receiving the complete message.

A large message transmission consumes significant time. For example, transmitting a 1MB message at 100 Mbps bandwidth may consume 80 to several hundred milliseconds (depending upon $C_{wnd}$ and latency).

As a result, other mesh members may also start transmitting the same message during this interval. A few potential solutions include:

a. Staggering with IDONTWANT messages

As previously discussed, staggering can significantly reduce network-wide message dissemination latency. This is primarily due to the relatively smaller store-and-forward delays that are inherent in this approach.

Using both staggering and IDONTWANT messages can further enhance efficiency by reducing redundant transmissions. This is because a node only saturates its bandwidth for a small subset of mesh peers, leading to early transmissions and prompt IDONTWANT message notifications to the mesh members.

It is worth highlighting that staggering can be implemented in various ways.

For example, it can be applied to peers (peer staggering) where a node sequentially relays the same message to all peers one by one.

Alternatively, a node can send a different message to every peer (message staggering or rotational sending), allowing IDONTWANTs for other messages to arrive during this time. The message staggering approach is beneficial when several messages are introduced to the network within a short interval of time.

As the peers in staggered sending are sequentially covered (with a faster speed due to bandwidth concentration), this leads to another problem.

The early covered peers send IHAVE (during their heartbeat intervals) for the messages they have received. IHAVE announcements for newly received large messages trigger IWANTs from nodes (including those already receiving the same message), leading to an additional workload for early receivers [14].

Potential solutions to mitigate these problems include:

Defering IHAVE announcements for large messages.

Deferring IHAVE announcements can indirectly prioritize message transmission to the mesh peers over IWANT replies. However, deciding on a suitable deferred interval is crucial for optimal performance. One possible solution is to generate IHAVEs only after the message is relayed to all the mesh peers.

Defering IWANT requests for messages that are currently being received.

This requires prior knowledge of msgIDs for the messages under reception. Knowing the message length is also essential in deciding a suitable defer interval to handle situations where a sender starts sending a message and never completes the transmission.

Not issuing IWANT for a message if at least $K$ peers have transmitted IDONTWANT for the same message (as this indicates that these peers will eventually relay this message).

However, this approach can inadvertently empower a group of non-conforming mesh peers to send IDONTWANT for a message and never complete message transmission. A delayed IWANT, along with negative peer scoring, can remedy this problem.

b. IMReceiving message

A peer can issue an IDONTWANT only after it has received the entire message. However, a large message transmission may take several hundred milliseconds to complete. During this time, many other mesh members may start relaying the same message.

Therefore, the probability of simultaneously receiving the same message from multiple senders increases with the message size, significantly compromising the effectiveness of IDONTWANT messages.

Sending a short preamble (containing msgID and length) before the message transmission can provide valuable information about the message. If a receiver is already receiving the same message from another sender, the receiver can request to defer this transmission by sending a brief IMReceiving message.

An IDONTWANT from the receiver will indicate successful message reception. Otherwise, the waiting sender can initiate transmission after a specific wait interval.

However, waiting for IMReceiving after sending the preamble can delay the message transmission. On the other hand, proceeding with message transfer (after sending the preamble) leads to another problem: it is difficult to cancel ongoing message transmission after receiving IMReceiving for the same message.

To streamline this process, a peer can immediately send an IMReceiving message (for every received preamble), urging other mesh peers to defer sending the same message [15, 16].

The other peers can send this message if IDONTWANT is not received from the receiver during the wait interval. This approach can boost IDONTWANT benefits by considering ongoing transmissions for large messages.

While IMReceiving messages can bring about substantial improvements in terms of latency and bandwidth utilization, it's crucial to be aware of the potential risks.

A malicious user can exploit this approach to disrupt message transmission either by never completing a message or by intentionally sending a message at an extremely slow rate to numerous peers.

This could ultimately result in network-wide slow message propagation.

However, carefully calibrating the deferring interval (based on message size) and negative peer scoring can help mitigate these risks.

c. IDONTWANT message with reduced forwarding

It is common for slow peers to pile up outgoing message queues, especially for large message transfers. This results in a significant queuing delay for outgoing messages. Reduced message forwarding can help decrease the workload of slower peers.

On receiving a message longer than the specified threshold, a slow peer can relay it to only $K \in D$ peers and send an IDONTWANT message to all the peers in $D$ .

In this arrangement, the IDONTWANT message serves an additional purpose: to promptly announce data availability, reinforcing redundancy in the presence of adversaries.

When a peer receives an IDONTWANT for an unseen message, it learns about the new message and can request it by sending an IWANT request without waiting for the heartbeat (gossip) interval. As a result, a significantly smaller number of transmissions is sufficient for propagating the message to the entire network.

This approach conserves peer bandwidth by minimizing redundant transmissions while ensuring GossipSub resilience at the cost of one RTT (for missing peers).

Interestingly, curtailing queuing delays can also help lower network-wide message dissemination latency (for huge messages).

However, finding an appropriate value for $K$ is crucial for optimal performance. A smaller $K$ saves peer bandwidth, while a larger $K$ achieves quicker spread until outgoing message queues pile up. Setting $K = D_{low}$ can be one option.

It is worth mentioning that such behavior may negatively impact peer scoring (by missing message delivery rewards from $D-K$ peers). However, a minimized workload enables early message dissemination to the remaining peers. These early transmissions and randomized $K$ set selection can help achieve an overall better peer score.

4. Message prioritization

Despite the standardized specifications of the GossipSub protocol, the message forwarding mechanisms can significantly impact network-wide message dissemination latency and bandwidth utilization.

It is worth mentioning that every node is responsible for transmitting different types of packets, including control messages, locally published messages, messages received from mesh members, IWANT replies, etc.

As long as traffic volume is lower than the available data rate, the message forwarding mechanisms yield similar results due to negligible queuing delays.

However, when the traffic volume increases and exceeds the available peer bandwidth (even for short traffic bursts), the outgoing message queue(s) sizes rise, potentially impacting the network's performance.

In this scenario, FIFO-based traffic forwarding can lead to locally published messages being placed at the end of the outgoing message queue, introducing a queuing delay proportional to the queue size. The same applies to other delay-sensitive messages like IDONTWANT, PRUNE, etc.

On the other hand, the segregation of traffic into priority and non-priority queues can potentially starve low-priority messages. One possible solution is to use weighted queues for a fair spread of messages.

Message prioritization can be a powerful tool to ensure that important messages reach their intended recipients on time and allow for customizable message handling.

For example, staggering between peers and messages can be better managed by using priority queues. However, it is important to note that message prioritization also introduces additional complexity to the system, necessitating sophisticated algorithms for better message handling.

5. Maximizing benefits from IWANT messages

During heartbeat intervals, GossipSub nodes transmit IHAVE messages (carrying IDs of seen messages) to the peers not included in the full-message mesh. These peers can use IWANT messages to request any missing messages. A budget counter ensures these messages never exceed a specified threshold during each heartbeat interval.

The IHAVE/IWANT messages are a crucial tool in maintaining network connectivity. They bridge the information gap between nearby and far-off peers, ensuring that information can be disseminated to peers outside the mesh. This function is essential in protecting against network partitions and indirectly aids in safeguarding against Sybil and eclipse attacks.

However, it is essential to understand that high transmission times for large messages require careful due diligence when using IWANT messages for reasons not limited to:

A large message reception may take several hundred milliseconds to complete. During this time, an IHAVE message announcing the same message ID will trigger an IWANT request.
A peer can send IWANT requests for the same message to multiple nodes, leading to simultaneous transmissions of the same message.
Replying to (potentially many) IWANT requests can delay the transmission of the same message to mesh peers, resulting in lower peer scores and slower message propagation.

A few possible solutions to mitigate this problem may include:

Issuing IHAVE announcements only after the message is delivered to many mesh peers.
Allocating a volume-based budget to service IWANT requests during each heartbeat interval.
Deferring IWANT requests for messages that are currently being received.
Deferring IWANT requests if at least $K$ IDONTWANTs are received for the same message.
A large message transmission can yield high $C_{wnd}$ ; preferring such peers during mesh maintenance can be helpful.

Summary

This study investigates the pressing issue of considerable fluctuations and rises in network-wide dissemination times for large messages.

We delve into multiple factors, such as increased message transmit times, store-and-forward delays, congestion avoidance mechanisms, and prioritization between messages, to establish a comprehensive understanding of the problem.

The study also explores the performance of optimization efforts like floodpublishing, IHAVE/IWANT messages, and message forwarding strategies in the wake of large message transmissions.

A key finding is that most congestion avoidance algorithms lack optimization for peer-to-peer networks. Coupling this constraint with increased message transmission times results in notable store-and-forward delays accumulating at each hop.

Furthermore, the probabilistic message-forwarding nature of GossipSub further exacerbates the situation by utilizing a considerable share of available bandwidth on redundant transmissions.

Therefore, approaches focused on eliminating redundant transmissions (IDONTWANT, IMReceiving, lazy sending, etc.) can prove helpful. At the same time, strategies aimed at reducing store-and-forward delays (fragmentation, staggering, prioritization, etc.) can prove beneficial.

It is worth mentioning that many of the strategies suggested in this post are ideas at different stages. Some of these have already been explored and discussed to some extent [5, 17, 18]. We are nearing the completion of a comprehensive performance evaluation of these approaches and will soon share the results of our findings.

Please feel free to join the discussion and leave feedback regarding this post in the VAC forum.

References

[1] EIP-4844: Shard Blob Transactions. Retrieved from https://eips.ethereum.org/EIPS/eip-4844

[2] Message Propagation Times With Waku-RLN. Retrieved from https://docs.waku.org/research/research-and-studies/message-propagation/

[3] Lenient Flood Publishing. Retrieved from https://github.com/libp2p/rust-libp2p/pull/3666

[4] Disable Flood Publishing. Retrieved from https://github.com/sigp/lighthouse/pull/4383

[5] GossipSub for Big Messages. Retrieved from https://hackmd.io/X1DoBHtYTtuGqYg0qK4zJw

[6] GossipSub: Lazy Sending. Retrieved from https://github.com/status-im/nim-libp2p/issues/850

[7] GossipSub: Limit Flood Publishing. Retrieved from https://github.com/vacp2p/nim-libp2p/pull/911

[8] GossipSub: Lazy Prefix Detection. Retrieved from https://github.com/vacp2p/nim-libp2p/issues/859

[9] Potential Gossip Improvement List for EIP4844. Retrieved from https://hackmd.io/@gRwfloEASH6NWWS_KJxFGQ/B18wdnNDh

[10] GossipSub Specifications v1.2: IDONTWANT Message. Retrieved from https://github.com/libp2p/specs/blob/master/pubsub/gossipsub/gossipsub-v1.2.md?plain=1#L52

[11] Number of Duplicate Messages in Ethereum’s GossipSub Network. Retrieved from https://ethresear.ch/t/number-duplicate-messages-in-ethereums-gossipsub-network/19921

[12] TCP Congestion Control: Re-starting Idle Connections. Retrieved from https://datatracker.ietf.org/doc/html/rfc2581#section-4.1

[13] PING/PONG Control Messages. Retrieved from https://github.com/libp2p/specs/pull/558

[14] IHAVE/IWANT Message Impact. Retrieved from https://github.com/vacp2p/nim-libp2p/issues/1101

[15] Large Message Handling IDONTWANT + IMReceiving Messages. Retrieved from https://forum.vac.dev/t/large-message-handling-idontwant-imreceiving/281

[16] IDONTWANT Message Impact. Retrieved from https://forum.vac.dev/t/idontwant-message-impact/283

[17] IWANT Message Impact. Retrieved from https://forum.vac.dev/t/iwant-messages-may-have-negative-impact-on-message-dissemination-latency-for-large-messages/366

[18] IDONTWANT Message Performance. Retrieved from https://vac.dev/rlog/gsub-idontwant-perf-eval/

Libp2p GossipSub IDONTWANT Message Performance Impact

2024-10-28T12:00:00.000Z

This post provides quick insights into the IDONTWANT message performance and highlights minor tweaks that can further contribute to performance gains.

Overview

IDONTWANT messages are introduced to curtail redundant transmissions without compromising resilience. Cutting down on duplicates can potentially render two significant advantages:

Reducing bandwidth requirements
Reducing message dissemination time (latency)

For IDONTWANTs to be effective, they must be received and processed by the sender before the sender starts relaying the respective message.

Duplicates investigation reveals that the average time difference between the first message arrival and the first duplicate arrival is higher than the average round trip time in Ethereum's GossipSub network.

This allows for timely IDONTWANT reception and canceling of many duplicate transmissions, showing a potential for a significant drop in bandwidth utilization. On the other hand, lowering message dissemination time is only possible by minimizing queuing delays at busy peers.

Experiments

We conducted a series of experiments with different arrangements (changing heartbeat_interval and message size) to precisely identify the impact of IDONTWANT messages on bandwidth utilization and message dissemination time.

The experiments are performed on nim-libp2p using the shadow simulator. The peer bandwidth and link latency are uniformly set between 50-150 Mbps and 40-130 milliseconds in five stages.

In all experiments, ten messages are transmitted to the network, i.e., ten peers (publishers) are selected as the message transmitters. Every publisher transmits exactly one message, and inter-packet spacing (delay) is set to four seconds for each published message. For a fair assessment, we ensure that the publishers are uniformly selected from each bandwidth class.

At the start of each experiment, two additional messages are transmitted to increase the TCP $C_{wnd}$ . These messages are not included in latency computations.

The simulation details are presented in the table below.

Parameter	Value	Parameter	Value
Peers	2000	Publishers	10
Peer bandwidth	50-150 Mbps	Link latency	40-130 ms
Message size	1KB, 50KB, 500KB, 1MB	$D$	8
Heartbeat interval	700ms, 1000ms, 1500ms	$D_{low}$	6
FloodPublish	False	$D_{high}$	12
Gossip factor	0.05	Muxer	yamux

Findings

We use bandwidth utilization and latency as evaluation metrics. Bandwidth utilization represents total network-wide traffic (including gossip and other control messages). Latency refers to network-wide message dissemination time. The total number of IWANT requests and the number of message transmissions saved by IDONTWANT messages are also presented for detailed insights.

Experiments reveal that IDONTWANT messages yield a noticeable (up to 21%) drop in bandwidth utilization. A higher drop is seen with a higher heartbeat interval. Interestingly, a relatively low bandwidth reduction (12-20%) is seen for 1MB messages, compared to 500KB messages (18-21%).

This is because downloading a large message may consume several hundred milliseconds. During this time, a receiver will likely generate multiple IWANT requests for the same message, increasing bandwidth utilization.

Moreover, a peer can generate IDONTWANTs only after it has finished downloading the message. A longer download time will result in simultaneous reception of the same message from other mesh members.

These IWANT requests mainly overwhelm early message receivers, which can negatively impact message dissemination time on some occasions. Therefore, a similar message dissemination time is seen with and without IDONTWANT messages.

Similar results are seen on our large-scale deployment runs (running Waku nodes in Kubernetes).

Please feel free to join the discussion and leave feedback regarding this post in the VAC forum.

References

Vac 101: Transforming an Interactive Protocol to a Noninteractive Argument

2024-10-15T12:00:00.000Z

In this post, we introduce a common technique used to convert interactive protocols to their noninteractive variant.

Introduction

The set of interactive protocols form a class of protocols that consist of communication between two parties: the Prover and the Verifier. The Prover tries to convince the Verifier of a given claim. For example, the Prover may want to convince the Verifier that she owns a specific Unspent Transaction Output (UTXO); that is, the Prover possesses the ability to spend the UTXO. In many instances, there is information that the Prover does not wish to reveal to the Verifier. In our example, it is critical that the Prover does not provide the Verifier with the spending key associated with her UTXO. In addition to the Prover's claim and secret data, there is additional data, public parameters, that the claimed statement is expressed in terms of. The public parameters can be thought of as the basis for all similar claims.

In an interactive protocol, the Prover and the Verifier are in active communication. Specifically, the Prover and the Verifier exchange messages so that the Verifier can validate the Prover's claim. However, this communication is not practical for many applications. It is necessary that any party can verify the Prover's claim in decentralized systems. It is impractical for the Prover to be in active communication with a large number of verifying parties. Instead, it is desirable for the Prover to generate a proof on their own that can convince any party. To achieve this, it is necessary for the Prover to generate the Verifier's messages in such a way that the Prover cannot manipulate the Verifier's messages for her benefit. The Fiat-Shamir heuristic 1 is used for this purpose. Even though much of our discussion will focus on $\Sigma$ -protocols, the Fiat-Shamir heuristic is not limited to $\Sigma$ -protocols. The Fiat-Shamir heuristic has been applied to zk-SNARKs, but the security in this setting has been the subject of discussion and research in recent years. Block et al. 2 provide the first formal analysis of Fiat-Shamir heuristic in zk-SNARKs.

Sigma Protocols

A $\Sigma$ -protocol is a family of interactive protocols that consists of three publicly transmitted messages between the Prover and the Verifier. In particular, the protocol has the following framework:

Prover		Verifier
	$\stackrel{\mathsf{commitment}}{\longrightarrow}$
	$\stackrel{\mathsf{challenge}}{\longleftarrow}$
	$\stackrel{\mathsf{response}}{\longrightarrow}$

These three messages form the protocol's transcript: $(\mathsf{commitment}, \mathsf{challenge}, \mathsf{response})$ . The Verifier uses all three of these messages to validate the Prover's original claim. The Verifier's challenge should be selected uniform random from all possible challenges. Based on this selection, a dishonest Prover can only convince the Verifier with a negligible probability.

The Schnorr Protocol

The Schnorr protocol 3 is usually the first $\Sigma$ -protocol that one studies. Additionally, the Schnorr protocol can be used as an efficient signature scheme. The Schnorr protocol provides a framework that enables the Prover to convince the Verifier that: for group elements $g$ and $X$ , the Prover knows the power to raise $g$ to obtain $X$ . Specifically, the Prover possesses some integer $x$ so that $X = g^x$ . Cryptographic resources may use either multiplicative or additive notation for groups; we will use multiplicative notation. Briefly, the element $g$ being combined with itself in multiplicative notation is $g \cdot g = g^2$ , while in additive notation it is $g + g = 2g$ . We assume that our group is of prime order $p$ , and is sufficiently large to satisfy the discrete logarithm assumption.

The Schnorr protocol proceeds as follows:

Prover		Verifier
$t \in_R \mathbb{Z}_p$ , $T := g^t$	$\stackrel{T}{\longrightarrow}$
	$\stackrel{c}{\longleftarrow}$	$c \in_R \mathbb{Z}_p$
$z := t + xc$	$\stackrel{z}{\longrightarrow}$
		output 1 provided $g^z \stackrel{?}{=} T X^c$

Chaum-Pedersen protocol

A tuple of group elements $(U,V,W)$ is a DH-triple if and only if there exists some $x \in \mathbb{Z}_p$ so that $V = g^x$ and $W = U^x$ . The Chaum-Pedersen protocol provides a framework that enables a Prover to convince a Verifier that she possesses such a $x$ for a claimed DH-triple $(U,V,W)$ . The Chaum-Pedersen protocol proceeds as follows:

Prover		Verifier
$t \in_R \mathbb{Z}_p$ , $T := g^t$ , $S := U^t$	$\stackrel{T,S}{\longrightarrow}$
	$\stackrel{c}{\longleftarrow}$	$c \in_R \mathbb{Z}_p$
$z := t + xc$	$\stackrel{z}{\longrightarrow}$
		output 1 provided $g^z \stackrel{?}{=} T V^c$ and $U^z \stackrel{?}{=} SW^c$

Hash Functions

Cryptographic hash functions serve as the backbone to the Fiat-Shamir heuristic. A hash function, $\mathsf{Hash}$ , is a special function that takes in an arbitrary binary string and outputs a binary string of a predetermined fixed length. Specifically, $\mathsf{Hash} : \{0,1\}^* \rightarrow \{0,1\}^n$ .

The security of cryptographic hash functions will rely on certain tasks being computationally infeasible. A task is computationally infeasible provided that there is no deterministic algorithm that can conclude the task in polynomial-time.

A cryptographic hash function satisfies the following properties:

Succinct: The hash function should be easy to compute; the hash $\mathsf{Hash}({\bf{b}})$ can be efficiently computed for any binary string ${\bf{b}}$ .
Preimage Resistance: It should be computationally infeasible to work backwards given the output of a hash function. Let ${\bf{y}}$ be a binary string of length $n$ . It should be 'impossible' to find some binary string ${\bf{x}}$ so that ${\bf{y}} = \mathsf{Hash}({\bf{x}})$ .
Collision Resistance: It should be difficult to find two strings that hash to the same value. It should be computationally infeasible to find two binary strings ${\bf{x}_1}$ and ${\bf{x}_2}$ so that $\mathsf{Hash}({\bf{x}_1}) = \mathsf{Hash}({\bf{x}_2}).$

A related class of functions is one-way functions. A one-way function satisfies the first two conditions of a cryptographic hash function (succinct and preimage resistance). All cryptographic hash functions are a one-way functions. However, one-way functions do not necessarily satisfy collision-resistance. We will simply refer to cryptographic hash functions as hash functions for the rest of this blog. Commonly used hash functions include SHA-256 5, Keccak 6, and Poseidon 7.

The Fiat-Shamir heuristic

The Fiat-Shamir heuristic is the technique used to convert an interactive protocol to a noninteractive protocol. This is done by replacing each of the Verifier's messages with a hashed value. Specifically, the Prover generates the Verifier's message by evaluating the hash function $\mathsf{Hash}$ with the concatenation of all public values that appear in the protocol thus far. If $m_0, \dots, m_t$ denote the public values in the protocol thus far, then the Verifier's message is computed as $m_{t+1} := \mathsf{Hash}(m_0|| \cdots || m_t)$ .

Since $\mathsf{Hash}$ can be efficiently computed, and the messages $m_0, \dots, m_t$ are public, then any verifying party can compute $m_{t+1}$ . Critically, since $\mathsf{Hash}$ is preimage resistant and collision resistant, the Prover cannot manipulate her choices of the messages $m_0,\dots, m_t$ to influence the message $m_{t+1}$ . Hence, verifying parties can trust that $m_{t+1}$ is sufficiently random with respect to the preceding messages.

There are two variants of the Fiat-Shamir heuristic: weak and strong. The weak variant uses all of the publicly traded messages in computing the Verifier's messages but does not include the public parameters. However, in the strong variant all of the publicly traded messages and public parameters are used to compute the Verifier's messages. We will provide a discussion on issues that can arise from using the weak Fiat-Shamir heuristic.

Schnorr Protocol with the strong Fiat-Shamir

When the strong Fiat-Shamir heuristic is applied to the Schnorr protocol, the message $c = \mathsf{Hash}(g||X||T)$ . This choice of $c$ provides security since it should be computationally infeasible to find collisions for the outputs of $\mathsf{Hash}$ . Thus, $c$ fixes the group elements $g$ , $X$ and $T$ .

The elements that would be omitted in the hash by applying weak Fiat-Shamir heuristic are $g$ and $X$ .

Chaum-Pedersen Protocol with the strong Fiat-Shamir

The message $c = Hash(g||U||V||W||T||S)$ when the Prover applies the strong Fiat-Shamir heuristic to the Chaum-Pedersen protocol. The properties of $\mathsf{Hash}$ fixes the generator $g$ and the Prover's statement $(U,V,W)$ .

Improper use of the Fiat-Shamir heuristic

The Fiat-Shamir heuristic appears to be a fairly straightforward technique to implement. However, a subtle but serious issue that can occur in the application of the Fiat-Shamir heuristic has been a point of discussion for the past few years. The issue concerns what messages are included in the hash. In particular, are the public parameters used to compute the hash value?

Bernhard et al. 8 provide a discussion of Fiat-Shamir heuristic restricted to $\Sigma$ -protocols. In particular, Bernhard et al. discuss the pitfalls of the weak Fiat-Shamir heuristic. Recall that the strong Fiat-Shamir heuristic requires that the public parameters are included in the calculations of the Verifier's messages while the weak version does not. The inclusion of the public parameters in the hash evaluations fixes these public values for the entire protocol. This means that the Prover cannot retroactively change them.

The issues with the differences in the variants of the Fiat-Shamir heuristics has persisted since Bernhard et al.'s paper. In recent years, auditors from Trail of Bits and OpenZeppelin have released blogs (9, 10, 11, 12, 13) and papers (14, 15) describing specific vulnerabilities in zero-knowledge papers and repositories associated with the use of the weak Fiat-Shamir heuristic.

Trail of Bits coined the term FROZEN Heart to describe the use of weak Fiat-Shamir heuristic. Frozen comes from the phrase "FoRging Of ZEro kNowledge proofs", and Fiat-Shamir is the "heart" of transforming an interactive protocol to noninteractive protocol.

Now, we examine how weak Fiat-Shamir affects the Schnorr protocol and Chaum-Pedersen protocol.

Schnorr protocol with the weak Fiat-Shamir heuristic

For Schnorr, we will examine two variants: the first where we only include the Prover's claim $X$ but not the public parameter $g$ , and the second where we include the public parameter $g$ but not the Prover's claim $X$ .

Since we omit the generator $g \in \mathbb{G}$ from the computation for the message $c$ in our first approach, then $c = \mathsf{Hash}(X||T)$ .

Now, a malicious Prover can complete the transcript for the Schnorr protocol by selecting any $z \in_R \mathbb{Z}_p$ . Since, $g$ is not fixed as it was not included in the computation of $c$ . But, the malicious Prover needs the transcript $(T,c,z)$ to satisfy $g^z = TX^c$ . Hence, the malicious Prover can compute the generator $g = (TX^c)^{z^{-1}}.$

In our second approach, we omit the group element $X \in \mathbb{G}$ from the computation for the challenge $c$ . That is, $c = \mathsf{Hash}(g||T)$ .

As with the previous example, the malicious Prover takes a Schnorr transcript $(T,c,z)$ where $z \in_R \mathbb{Z}_p$ . It is necessary for the malicious Prover to find a value $X$ so that $g^z = TX^c$ . This can be acheived by computing $X = (g^z T^{-1})^{c^{-1}}$ .

Chaum-Pedersen protocol with the Fiat-Shamir heuristic

The Verifier's message $c = Hash(T,S)$ when weak Fiat-Shamir heuristic is applied. The Prover's triple $(U,V,W)$ and the generator $g$ are not fixed by $c$ . As such, a malicious Prover can generate values for $U,V,W$ , and $g$ that satisfy the Verifier's identity checks. In the case of a malicious Prover, $T$ and $S$ are randomly group elements instead of being computed using a value $t$ that the Prover selected. This means a malicious Prover must randomly select the value $z$ as well.

Given the values that have been fixed so far, each of the Verifier's identities consists of two unknowns. Hence, it is necessary to select one of these unknowns from each identity so that a malicious Prover can compute the last value. For instances, suppose that the malicious Prover randomly selects $V$ and $W$ . The malicious Prover can compute $g = (T V^c)^{1/z}$ and $V = (SW^c)^{1/z}$ . Thus, the malicious Prover has a claimed statement $(U,V,W)$ for generator $g$ that passes the Verifier's identities using weak Fiat-Shamir heuristic.

The omission of any of the values $U,V,W,$ and $g$ in the computation of $c$ allows a malicious Prover to forge a proof.

Conclusion

The Fiat-Shamir heuristic is an essential technique to convert an interactive protocol to a variant that does not require communication. Additionally, careful application of this technique is necessary to maintain the integrity of the system.

References

zkVM Testing Report: Evaluating Zero-Knowledge Virtual Machines for Nescience

2024-09-26T12:00:00.000Z

Following our initial exploration of zkVMs in our previous blog post [1], we have conducted a series of tests to identify the most suitable zkVM for the Nescience architecture [2]. This post outlines the testing process, results, and conclusions. Additionally, the full test suite and scripts can be found on our GitHub page [3], allowing others to replicate the results or explore the candidates further. Please note that we chose not to use hardware acceleration in our benchmarks, as our project is aimed at a broad audience. Particularly, we cannot assume AVX512 support by default, as it is typically available only in high-end CPUs.

We've shortlisted the following zkVMs for testing:

SP1 [4]
RISC0 [5]
Nexus [6]
ZkMIPS [7]
ZkWASM [8]
Valida [9]

Why these candidates?

When narrowing down the zkVMs, we focused on key factors:

True zero-knowledge functionality: The zkVMs had to demonstrate or be close to demonstrating the ability to generate and verify zero-knowledge proofs.
Performance baselines: We sought zkVMs with solid benchmarks in performance, particularly in speed and efficiency.
Specific functionalities: For Nescience, functionalities like lookup tables, precompiles, and recursive capabilities are critical.

We need a zkVM that supports these to enable robust project development.

Preliminary information on the candidates

SP1 is a performant, open-source zkVM that verifies the execution of arbitrary Rust (or any LLVM-compiled language) programs. SP1 utilizes Plonky3, enabling recursive proofs and supporting a wide range of cryptographic algorithms, including ECC-based ones like Groth16. While it supports aggregation, it appears not to support zero knowledge in a conventional manner.
RISC0 zkVM allows one to prove the correct execution of arbitrary Rust code. Built on a RISC-V architecture, it is inherently adaptable for implementing standard cryptographic hash functions such as SHA-256 and ECDSA. RISC0 employs STARKs, providing a security level of 98 bits. It supports multiple programming languages, including C and Rust, thanks to its compatibility with LLVM and WASM.
Nexus is a modular, extensible, open-source, highly parallelized, prover-optimized, and contributor-friendly zkVM written in Rust. It focuses on performance and security, using the Nova folding scheme, which is particularly effective for recursive proofs. Nexus also supports precompiles and targeted compilation, and besides Rust, it offers C++ support.
ZkMIPS is a general verifiable computing infrastructure based on Plonky2 and the MIPS microarchitecture, aiming to empower Ethereum as a global settlement layer. It can run arbitrary Rust code as well. Notably, zkMIPS is the only zkVM in this list that utilizes the MIPS opcode set.
ZkWASM adheres to and supports the unmodified standard WASM bytecode specification. Since Rust code can be compiled to WASM bytecode, one could theoretically run any Rust code on a zkWASM machine, providing flexibility and broad language support.
Valida is a STARK-based virtual machine aiming to improve upon the state of the art in several categories:
- Code reuse: The VM has a RISC-inspired instruction set, simplifying the targeting of conventional programming languages. A backend compiler is being developed to compile LLVM IR to the Valida ISA, enabling the proving of programs written in Rust, Go, C++, and others with minimal to no changes in source code.
- Prover performance: Engineered to maximize prover performance, Valida is compatible with a 31-bit field, restricted to degree 3 constraints, and features minimal instruction decoding. It operates directly on memory without general-purpose registers or a dedicated stack, utilizing newer lookup arguments to reduce trace overhead involved in cross-chip communication.
- Extensibility: Designed to be customizable, Valida can easily be extended to include an arbitrary number of user-defined instructions. Procedural macros are used to construct the desired machine at compile time, avoiding any runtime penalties.

Valida appears to be in the early stages of development but already showcases respectable performance metrics.

Testing plan

To thoroughly evaluate each zkVM, we devised a two-stage testing process:

Stage 1: Arithmetic operations

The first phase focused on evaluating the zkVMs’ ability to handle basic arithmetic operations: addition, subtraction, multiplication, division, modulus division, and square root calculations. We designed the test around heptagonal numbers, which required zkVMs to process multiple arithmetic operations simultaneously. By using this method, we could measure efficiency and speed in handling complex mathematical calculations – a crucial element for zkVM performance.
Stage 2: Memory consumption

For the second phase, we evaluated each zkVM’s ability to manage memory under heavy loads. We tested several data structures, including lists, hash maps, deques, queues, BTreeMaps, hash sets, and binary heaps. Each zkVM underwent tests for the following operations:
- Insert: How quickly can the zkVM add data to structures?
- Delete: Does the zkVM handle memory release effectively?
- Append: Can the zkVM efficiently grow data structures?
- Search: How fast and efficient is the zkVM when retrieving stored data?

The purpose of this stage was to identify any memory bottlenecks and to determine whether a zkVM could manage high-intensity tasks efficiently, something vital for the Nescience project’s complex, data-heavy processes.

Machine specifications

The tests were conducted on the following hardware configuration:

CPU: AMD EPYC 7713 "Milan" 64-core processor (128 threads total)
RAM: 600GiB DDR4 3200MHz ECC RAM, distributed across 16 DIMMs
Host OS: Proxmox 8.3
Hypervisor: KVM
Network layer: Open vSwitch
Machine model: Supermicro AS-2024US-TRT

Results

1. SP1

SP1 does not provide zero-knowledge capability in its proofs but delivers respectable performance, though slightly behind its main competitor. Memory leaks were minimal, staying below the 700 KB threshold. Interestingly, SP1 consumed more RAM during the basic arithmetic test than in memory allocation tests, showcasing the team's effective handling of memory under load. In the basic test, allocations were primarily in the 9-16 B, 33-64 B, and 65-128 B ranges. For memory allocations, most fell into the 129-256 B range.

Stage 1: Hept 100 test
- Proof size: 3.108 MB
- Proof time: 16.95 seconds

Stage 2: Vec 10000 test
- Proof size: 3.17 MB
- Proof time: 20.85 seconds

2. RISC0

RISC0 stands out with exceptional performance in proof size and generation time, ranking among the best (with the exception of Valida and zkWASM's basic test). It also handles memory well, with minor leaks under 0.5 MB and controlled RAM consumption staying below 2.2 GB. RISC0's memory allocations were consistent, primarily in the 17-32 B and 33-64 B ranges.

Stage 1: Hept 100 test
- Proof size: 217.4 KB
- Proof time: 9.73 seconds

Stage 2: Vec 10000 test
- Proof size: 217.4 KB
- Proof time: 16.63 seconds

Based on these results, RISC0 is a solid candidate for Nescience.

3. Nexus

Nexus' performance offers interesting insights into folding scheme-based zkVMs. Surprisingly, proof sizes remained consistent regardless of workload, with no significant memory leaks (under 700 KB). However, while RAM consumption increased slightly with higher workloads (up to 1.2 GB), Nexus performed poorly during memory allocation tests, making it unsuitable for our use case.

Allocation details:
- Basic test: Most allocations concentrated in 65-128 B
- Memory-heavy test: Allocations in the 129-256 B range
Stage 1: Hept 100 test
- Proof size: 46 MB
- Proof time: 12.06 seconds

Stage 2: Vec 10000 test
- Proof size: 46 MB
- Proof time: 56 minutes

4. ZkMIPS

ZkMIPS presents an intriguing case. While it shows good results in terms of proof size and generation time during the basic test, these come at the cost of significant RAM usage and memory leaks. The memory allocation test revealed a concerning 6.7 GB memory leak, with 0.5 GB leaked during the basic test. Despite this, RAM consumption (while high at 17+ GB) remains stable under higher workloads. Allocation sizes are spread across several ranges, with notable concentrations in the 17-32 B, 65-128 B, and 257-512 B slots.

Stage 1: Hept 100 test
- Proof size: 4.3 MB
- Proof time: 9.32 seconds

Stage 2: Vec 10000 test
- Proof size: 4.898 MB
- Proof time: 42.57 seconds

This zkVM provides mixed results with strong proof generation but concerning memory management issues.

5. ZkWASM

ZkWASM, unfortunately, performed poorly in both stages regarding proof size and generation time. RAM consumption was particularly high, exceeding 7 GB in the basic test, and an astounding 57 GB during memory allocation tests. Despite its impressive memory usage, the proof sizes were relatively large at 18 KB and 334 KB respectively. Allocation sizes were mainly concentrated in the 33-64 B range, with neighboring slots contributing small but notable amounts.

Stage 1: Hept 100 test
- Proof size: 18 KB
- Proof time: 42.7 seconds

Stage 2: Vec 10000 test
- Proof size: 334 KB
- Proof time: 323 seconds

6. Valida

Valida delivered impressive results in proof generation speed and size, with a proof size of 280 KB and a proof time of < 1 second. However, profiling was not possible due to Valida's limited Rust support. Valida currently compiles Rust using the LLVM backend, transpiling LLVM IR to leverage its C/C++ implementation, which leads to errors when handling Rust-specific data structures or dependencies. As a result, complex memory interactions couldn't be tested, and using Valida with Rust code is currently not advisable. A GitHub issue has been opened to address this.

Summary table

Stage 1

zkVM	Proof time	Proof size	Peak RAM consumption	Memory leaked
SP1	16.95 s	3.108 MB	2.1 GB	656.8 KB
RISC0	9.73 s	217.4 KB	1.9 GB	470.5 KB
Nexus	12.06 s	46 MB	9.7 MB	646.5 KB
ZkMIPS	9.32 s	4.3 MB	17.3 GB	453.8 MB
ZkWASM	42.7 s	18 KB	8.2 GB	259.4 KB
Valida	< 1 s	280 KB	N/A	N/A

Stage 2

zkVM	Proof time	Proof size	Peak RAM consumption	Memory leaked
SP1	20.85 s	3.17 MB	1.9 GB	616 KB
RISC0	16.63 s	217.4 KB	2.3 GB	485.3 KB
Nexus	56 m	46 MB	1.9 GB	616 KB
ZkMIPS	42.57 s	4.898 MB	18.9 GB	6.9 GB
ZkWASM	323 s	334 KB	58.8 GB	259.4 KB
Valida	N/A	N/A	N/A	N/A

Summary

After an extensive evaluation of six zkVM candidates for the Nescience project, RISC0 emerged as the top choice. It excels in both proof generation time and size while maintaining a reasonable memory footprint. With strong zero-knowledge proof capabilities and support for multiple programming languages, it aligns well with our project's needs for privacy, performance, and flexibility. Its overall balance between performance and efficiency makes it the most viable zkVM at this stage.

Valida, while promising with its potential for high prover performance, is still in early development and suffers from Rust integration issues. The current LLVM IR transpilation limitations mean it cannot handle complex memory interactions, disqualifying it for now. However, once its development matures, Valida could become a strong alternative, and we plan to revisit it as it evolves.

SP1, though initially interesting, failed to meet the zero-knowledge proof requirement. Its performance in arithmetic operations was respectable but insufficient to justify further consideration given its lack of ZK functionality – critical for our privacy-first objectives.

Nexus demonstrated consistent proof sizes and manageable memory usage, but its lackluster performance during memory-intensive tasks and its proof size (especially for larger workloads) disqualified it from being a top contender. While zkMIPS delivered solid proof times, the memory issues were too significant to ignore, making it unsuitable.

Finally, zkWASM exhibited the poorest results, struggling both in proof size and generation time. Despite its potential for WASM bytecode support, the excessive RAM consumption (up to 57 GB in the memory test) rendered it impractical for Nescience’s use case.

In conclusion, RISC0 is the best fit for Nescience at this stage, but Valida remains a future candidate as its development progresses.

In the future, we plan to compare RISC0 and SP1 with CUDA acceleration. Ideally, by that time, more zkVMs will include similar acceleration capabilities, enabling a fairer and more comprehensive comparison across platforms.

We’d love to hear your thoughts on our zkVM testing process and results! Do you agree with our conclusions, or do you think we missed a promising zkVM? We’re always open to feedback, insights, and suggestions from the community.

Join the discussion and share your perspectives on our forum or try out the tests yourself through our GitHub page!

References

[1] Exploring zkVMs: Which Projects Truly Qualify as Zero-Knowledge Virtual Machines? Retrieved from https://vac.dev/rlog/zkVM-explorations/

[2] Nescience: A User-Centric State-Separation Architecture. Retrieved from https://vac.dev/rlog/Nescience-state-separation-architecture

[3] Our GitHub Page for zkVM Testing. Retrieved from https://github.com/vacp2p/nescience-zkvm-testing

[4] Introducing SP1: A performant, 100% open-source, contributor-friendly zkVM. Retrieved from https://blog.succinct.xyz/introducing-sp1/

[5] The first general purpose zkVM. Retrieved from https://www.risczero.com/zkvm

[6] The Nexus 2.0 zkVM. Retrieved from https://docs.nexus.xyz/

[7] ZKM Architecture. Retrieved from https://docs.zkm.io/zkm-architecture

[8] ZK-WASM. Retrieved from https://delphinuslab.com/zk-wasm/

[9] Valida zkVM Design. Retrieved from https://delendum.xyz/writings/2023-05-10-zkvm-design.html

Exploring zkVMs: Which Projects Truly Qualify as Zero-Knowledge Virtual Machines?

2024-08-27T12:00:00.000Z

The blockchain space is rapidly evolving, and with it, new technologies are emerging that promise enhanced privacy, scalability, and security. As decentralized systems grow in complexity and usage, the need for secure and private computation has never been greater. Zero-knowledge virtual machines (zkVMs) are one such innovation, allowing for computations to be proven correct without revealing the underlying data. ZkVMs have enormous implications for privacy-preserving applications, decentralized finance (DeFi), and other blockchain-based use cases. However, as the term "zkVM" becomes more widely adopted, it is critical to distinguish between projects that truly satisfy the stringent requirements of a zkVM and those that do not.

What is a zkVM?

A zkVM is a virtual machine that combines the principles of cryptographic proof generation and privacy preservation with the computational model of traditional virtual machines. Essentially, a zkVM enables the execution of arbitrary programs while generating cryptographic proofs—specifically, zero-knowledge proofs (ZKPs)—that can verify the correctness of these computations without revealing any sensitive information. This ensures that computations can be trusted while protecting the privacy of the data involved. The key characteristics of a zkVM include:

Proof generation: The ability to produce ZKPs that verify the correct execution of programs. There are several types of cryptographic techniques used in zkVMs to generate these proofs, such as zk-SNARKs, zk-STARKs, and recursive proofs. A zkVM’s ability to generate these proofs determines how effectively it can ensure the integrity of computations in a privacy-preserving manner.
Privacy preservation: The system must maintain privacy, ensuring that only the proof is revealed, not the underlying computation or data. Privacy-preserving zkVMs allow users to maintain confidentiality without compromising the security or verifiability of their operations. However, not all zkVMs achieve the same level of privacy. Some may focus more on proof generation and scalability while deprioritizing privacy features, which can limit their use in certain privacy-sensitive applications.
Scalability and performance: zkVMs should offer scalable and efficient computation, leveraging advanced cryptographic techniques like zk-SNARKs, zk-STARKs, or recursive proofs. A zkVM's performance must also be measured in terms of latency (time to generate and verify a proof) and throughput (number of computations processed within a certain time frame).
Verifiable computation: The zkVM should be able to prove the execution of arbitrary programs in a secure and verifiable manner. Verifiable computation ensures that zkVMs can be deployed across a wide range of applications, from DeFi to private data-sharing platforms and more.

Why zkVMs matter

The rise of zkVMs is a crucial development for the future of blockchain and decentralized technologies. As more systems require the ability to scale while maintaining privacy and trust, zkVMs provide a powerful solution. They offer the potential to reshape the way decentralized applications (dapps) handle sensitive information, enabling them to be both efficient and private.

It is essential to distinguish between projects that fully realize the potential of zkVMs and those that do not. In the remainder of this post, we evaluate several zkVM projects, analyzing whether they satisfy the criteria for being classified as zkVMs based on our research.

Our methodology

We analyzed each project’s documentation, source code, and available benchmarks to determine whether they meet the definition of a zkVM. Our criteria focus on the key capabilities of zkVMs—proof generation, privacy, scalability, and integration with existing systems.

ZkVM project analysis

1. [SP1]

Overview: SP1 [1] is a developer-friendly zkVM designed to enable ZKP execution for LLVM-based languages like C, C++, Rust, and others. It supports a RISC-V-like instruction set architecture (ISA), which makes it compatible with various programming languages compiled through LLVM.
Main focus: The main focus of SP1 is scalability, open-source contributions, and accessibility for developers. It prioritizes performance over privacy, making it a good fit for environments where privacy isn't the primary concern.
Privacy: Not explicitly mentioned, making it less suitable for privacy-preserving applications.
Performance: SP1 has demonstrated up to 5.4x better performance than similar zkVMs like RISC0 for specific computations such as Fibonacci sequence generation.
Integration: SP1 is highly adaptable for rollups, light client verifiers, oracles, and even web2 projects like verifying the originality of images.
Conclusion: Yes, SP1 is a zkVM, but it does not prioritize zero-knowledge privacy, focusing more on scalability and performance.

2. [Nexus]

Overview: Nexus [2] is a highly modular zkVM designed to process up to a trillion CPU cycles per second. It relies on RISC-V instructions for computation, making it extensible and scalable. However, it currently lacks full ZKP capabilities due to its use of Spartan proofs.
Main focus: Nexus focuses on high performance and scalability, aiming to create an efficient execution environment for computationally intensive tasks.
Privacy: Although zero-knowledge privacy isn't the primary feature of Nexus, the project hints at potential privacy enhancements in the future.
Performance: Nexus has a high theoretical throughput, but it has yet to demonstrate benchmarks on zero-knowledge privacy.
Integration: Nexus is a good fit for high-performance environments that do not necessarily require full privacy.
Conclusion: Yes, Nexus qualifies as a zkVM in terms of scalability and proof generation, but it does not yet achieve full zero-knowledge privacy.

3. [RISC0]

Overview: Risc0 [3] is a general-purpose zkVM with strong developer support. It allows for the execution of Rust and C code on a RISC-V virtual machine and generates zk-SNARK and zk-STARK proofs for these computations.
Main focus: Risc0 is focused on ease of use for developers by abstracting away the complexities of circuit generation, making it accessible for a wide range of use cases.
Privacy: Full zero-knowledge privacy is supported via zk-SNARK and zk-STARK proofs, with Groth16 used for constant-size proof generation.
Performance: Risc0 offers strong benchmarks across different hardware setups, making it one of the most versatile zkVMs in terms of performance and scalability.
Integration: Risc0 integrates with several ecosystems, including Ethereum, and supports verifiable execution of Rust-based programs.
Conclusion: Yes, Risc0 qualifies as a zkVM, offering a balance of developer usability, scalability, and privacy.

4. [Powdr]

Overview: Powdr [4] is a toolkit for creating custom zkVMs. It allows developers to select from various front-end and back-end components to create zkVMs tailored to specific needs.
Main focus: Powdr is focused on providing a modular architecture for zkVM creation. It enables flexibility by allowing the combination of different ZK-proof backends like Halo2 or Valida.
Privacy: Powdr itself does not generate ZKPs, but it facilitates the creation of zkVMs that do.
Performance: The performance depends on the components chosen by the developer, as Powdr itself is more of a framework.
Integration: Powdr is highly customizable and can integrate with existing zkVM frameworks to extend their capabilities.
Conclusion: No, Powdr is not a zkVM itself, but it is a powerful tool for building customized zkVMs with different privacy and performance needs.

5. [ZkMIPS]

Overview: ZkMIPS [5] uses zk-STARKs to ensure privacy during computation, ensuring that private inputs are preserved while still proving correctness.
Performance: ZkMIPS is built for scalability, though explicit benchmarks are not widely published.
Integration: ZkMIPS can be integrated into systems that rely on MIPS architecture, making it versatile for legacy codebases that require privacy.
Conclusion: Yes, zkMIPS is a zkVM focused on scalability and privacy for MIPS-based architectures.

6. [Valida]

Overview: Valida [6] is a performance-oriented zkVM that generates proofs for programs using a custom ISA designed to optimize zkVM implementation. It uses Plonky3 for its proof system.
Main focus: Valida is centered around optimizing prover performance and extensibility, making it a valuable tool for generating proofs efficiently.
Privacy: While Valida is focused on performance, it does not prioritize zero-knowledge privacy as much as other zkVMs.
Performance: Valida has benchmarks indicating its performance advantages in proving computations quickly, particularly through parallel processing.
Integration: Valida is specialized and may not integrate as seamlessly into general-purpose systems, as it is optimized for performance over broad applicability.
Conclusion: Yes, Valida qualifies as a zkVM based on proof generation, but its lack of focus on privacy makes it less suitable for privacy-first use cases.

7. [Jolt]

Overview: Jolt [7] is a zkVM built to optimize prover performance using a modified Hyrax polynomial commitment system. It relies on RISC-V instructions for computation but falls short of full zero-knowledge capabilities.
Main focus: Jolt's main goal is to optimize the speed of proving program execution, making it suitable for high-performance applications where privacy isn't the primary concern.
Privacy: Jolt does not fully achieve zero-knowledge privacy due to the choice of polynomial commitment schemes.
Performance: Jolt offers strong performance, with benchmarks highlighting its ability to process proofs efficiently.
Integration: Jolt can be integrated with systems that prioritize speed over privacy, particularly where rapid proof generation is essential.
Conclusion: Yes, Jolt qualifies as a zkVM based on proof generation, though it does not provide full zero-knowledge privacy.

8. [ZkWASM]

Overview: ZkWASM [8] is a zkVM designed to execute WebAssembly (WASM) code in a privacy-preserving and scalable manner. It uses zk-SNARKs to prove the correctness of WASM program execution while ensuring privacy.
Main focus: ZkWASM focuses on scalability and privacy for WebAssembly, making it ideal for dapps that require verifiable computation without compromising privacy.
Privacy: Full zero-knowledge privacy is provided through zk-SNARKs, ensuring that the execution of WASM programs remains confidential.
Performance: ZkWASM is optimized for running WASM programs efficiently, with offchain computation and onchain verification to enhance performance.
Integration: ZkWASM is ideal for dapps, particularly those that use WebAssembly and require verifiable execution.
Conclusion: Yes, zkWASM qualifies as a zkVM, providing strong privacy, scalability, and verifiable execution for WebAssembly code.

9. [Aleo]

Overview: Aleo's [9] snarkVM converts code into Aleo instructions, which are then compiled into bytecode executable on its zkVM. Aleo emphasizes building private, scalable dapps.
Main focus: Aleo prioritizes privacy and scalability for dapps, providing a robust framework for developers building private dapps.
Privacy: Aleo offers full privacy through zk-SNARK proofs, making it suitable for building fully private applications.
Performance: Aleo focuses on scalability through efficient proof systems, though detailed performance benchmarks are not widely available.
Integration: Aleo is built for privacy-first dapps and integrates with other zkVM-based systems.
Conclusion: Yes, Aleo qualifies as a zkVM, offering a comprehensive solution for private and scalable dapps.

10. [Ola]

Overview: Ola [10] is a ZK-friendly, high-performance layer-2 (L2) rollup platform that is still under development. It is designed to execute computations offchain while generating validity proofs for these computations, ensuring that they are correctly executed without compromising security.
Privacy: Ola does not specifically prioritize privacy in the same way that zkVMs do. While it leverages ZKPs for scalability, its focus is on proving the correctness of transactions and computations rather than ensuring that the data remains private.
Performance: Ola is designed to achieve high performance, particularly in terms of transaction throughput.
Integration: Ola is designed to be interoperable with various layer-1 blockchains. The platform supports a hybrid ZK-rollup architecture and is expected to include bridges for cross-chain interoperability, enabling assets and data to move seamlessly between the layer-1 blockchain and the Ola rollup.
Conclusion: No, Ola is not a zkVM. While it leverages ZKPs (in the form of ZK-rollups) to ensure the validity of offchain computations, its primary focus is on scalability and performance rather than privacy or verifiable execution of arbitrary programs. Ola is more accurately described as a ZK-rollup platform aimed at improving transaction throughput and reducing transaction costs on layer-1 blockchains.

11. [Miden]

Overview: Miden zkVM [11] is a zk-STARK-based virtual machine that converts code into Miden VM instructions and proves the execution of these instructions with zero-knowledge privacy.
Main focus: Miden focuses on scalability and privacy for ZK-rollups, offering efficient proof generation for dapps.
Privacy: Miden ensures privacy for transactions and programs via zk-STARK proofs, making it suitable for private dapps.
Performance: Miden is optimized for scalability, with benchmarks showing its ability to handle up to 1,000 transactions per second (TPS).
Integration: Miden integrates well with ZK-rollup solutions, making it ideal for L2 scaling solutions on blockchains like Ethereum.
Conclusion: Yes, Miden qualifies as a zkVM, providing strong privacy and scalability for dapps and ZK-rollups.

12. [ZkOS]

Overview: ZkOS [12] is a verifiable operating system focused on running zkApps in a decentralized manner. It is built on the RISC-V architecture and aims to create a world computer where all untrusted executions can be verified.
Main focus: ZkOS is primarily designed to offer a proof-of-concept operating system where all executions can be verified in a trustless manner. However, its focus is more on the infrastructure for verifiable applications rather than being a traditional zkVM.
Privacy: ZkOS does not focus on privacy guarantees such as those found in zkVMs that generate ZKPs.
Performance: ZkOS focuses on the efficient execution of dapps, but performance benchmarks specific to ZKP generation are not provided.
Integration: ZkOS supports the execution of zkApps, but it is more of a verifiable operating system rather than a zkVM, making it distinct in its functionality.
Conclusion: No, zkOS is not a zkVM. It is a verifiable operating system focused on the infrastructure to support zkApps but does not directly generate ZKPs or focus on privacy preservation.

13. [Triton]

Overview: Triton [13] is a domain-specific language (DSL) and compiler designed primarily for high-performance GPU kernels, particularly those used in deep learning applications.
Main focus: The primary goal of Triton is to optimize computation for machine learning and GPU workloads. It is focused on enhancing performance and efficiency in processing data rather than on ZKPs or verifiable computation.
Privacy: Triton does not provide ZKPs or privacy features typically associated with zkVMs. Its focus is on high-performance computation rather than cryptographic verifiability.
Performance: Triton is highly optimized for GPU execution, offering significant improvements in performance for computationally intensive tasks such as those found in deep learning.
Integration: Triton is integrated with GPU-based computation environments and is highly specialized for optimizing low-level operations on hardware rather than being a general-purpose virtual machine.
Conclusion: No, Triton is not a zkVM. It is a specialized tool for optimizing GPU workloads, focusing on performance rather than privacy or ZKPs.

14. [Cairo]

Overview: Cairo zkVM [14] uses a custom language that compiles to an optimized STARK-based proof system, ensuring verifiable computation. It is primarily used in systems like Starknet.
Main focus: Cairo focuses on scalability and performance, using zk-STARK proofs to ensure the verifiable and secure execution of programs.
Privacy: Cairo provides privacy through zk-STARKs, but it focuses more on scalability and performance than privacy-first use cases.
Performance: Cairo is highly optimized for performance, making it well-suited for scalable applications on Starknet.
Integration: Cairo integrates deeply with systems like Starknet, supporting verifiable computation in a highly scalable and efficient manner.
Conclusion: Yes, Cairo qualifies as a zkVM, focusing on performance and verifiable execution while being ZK-friendly.

15. [SnarkOS]

Overview: SnarkOS [15] is a decentralized operating system designed to power Aleo's network, enabling secure and private dapps. It manages transactions and consensus, making it a critical infrastructure component for Aleo's zkVM-based ecosystem.
Main focus: SnarkOS primarily focuses on securing Aleo's network through consensus mechanisms and privacy-preserving transactions rather than acting as a zkVM that directly proves program execution.
Privacy: SnarkOS supports zero-knowledge privacy through its integration with Aleo's zkVM, but the operating system itself does not generate ZKPs for arbitrary computations.
Performance: SnarkOS is optimized for managing dapps on the Aleo network and handling private transactions, but its focus is more on infrastructure and consensus than on proof generation.
Integration: SnarkOS integrates seamlessly with Aleo's zkVM to support private dapps and transactions, but its primary role is as a consensus layer.
Conclusion: No, SnarkOS is not a zkVM. It serves as an operating system for Aleo's decentralized network, focusing on privacy and consensus rather than on generating ZKPs for computations.

16. [Lurk]

Overview: Lurk [16] is a Turing-complete programming language designed for recursive zk-SNARKs. It focuses on enabling developers to build complex, recursive ZKPs efficiently through a custom language tailored for verifiable computation.
Main focus: Lurk is centered around recursive proof generation rather than serving as a traditional virtual machine. Its purpose is to facilitate the creation of complex zk-SNARK-based proofs, making it a specialized tool for cryptographic proofs rather than general-purpose computation.
Privacy: Lurk is built for generating zk-SNARKs, which inherently provide privacy. However, Lurk itself is a language and not a zkVM that executes arbitrary programs and generates ZKPs for them.
Performance: Lurk is optimized for recursive zk-SNARK generation, but specific performance metrics are tied to its proof-generation capabilities rather than traditional execution environments.
Integration: Lurk is specialized for zk-SNARKs and may not easily integrate with other general-purpose systems, as it focuses on specific cryptographic tasks.
Conclusion: No, Lurk is not a zkVM. It is a programming language designed for recursive zk-SNARKs and focuses on proof generation rather than program execution in a virtual machine environment.

17. [Piecrust]

Overview: Piecrust [17] is a WASM-based zkVM designed to run on the Dusk Network. It supports concurrent execution and focuses on providing privacy and scalability for smart contracts.
Main focus: Piecrust is designed to provide private and efficient execution of smart contracts through the use of ZKPs.
Privacy: Piecrust supports ZK-friendly computations and enhances privacy through cryptographic primitives such as Merkle trees.
Performance: Piecrust is designed to be scalable and concurrent, allowing multiple sessions to run simultaneously, which improves overall performance.
Integration: Piecrust integrates with the Dusk Network and supports private smart contracts, making it ideal for dapps.
Conclusion: Yes, Piecrust qualifies as a zkVM, offering scalability, privacy, and support for succinct proof generation.

18. [Ceno]

Overview: Ceno [18] is a zkVM that provides a theoretical framework for reducing proving time by grouping common portions of code together. It uses recursive proofs to enhance prover efficiency.
Main focus: Ceno aims to optimize prover performance through recursive proofs, making it a powerful tool for handling complex computations efficiently.
Privacy: Ceno supports zero-knowledge privacy through recursive proofs and is designed to handle large-scale computations securely.
Performance: Ceno's recursive proof framework ensures that it can efficiently prove the execution of programs, reducing the time required for proof generation.
Integration: Ceno can be integrated into systems that require high efficiency and privacy, particularly those handling complex, repeated computations.
Conclusion: Yes, Ceno qualifies as a zkVM, providing efficient and private computation through the use of recursive proofs.

19. [Stellar]

Overview: Stellar [19] is a decentralized protocol designed to facilitate cross-border transactions between digital and fiat currencies.
Main focus: Stellar's primary goal is to improve financial transactions by enabling decentralized, low-cost currency transfers. It does not aim to provide ZKPs or run verifiable computations like a zkVM.
Privacy: Stellar focuses on confidentiality and security for financial transactions, but it does not employ ZKPs in the way zkVMs do for verifying computation without revealing data.
Performance: Stellar prioritizes the performance of financial transactions, ensuring low latency and high throughput across its decentralized network. However, this performance focus is specific to transactions rather than general-purpose program execution.
Integration: Stellar is designed for integration with financial systems, enabling currency conversions and transfers, but it is not built for executing smart contracts or verifiable computations.
Conclusion: No, Stellar is not a zkVM. It is a decentralized financial protocol focused on facilitating cross-border payments rather than verifiable or privacy-preserving computation.

20. [NovaNet]

Overview: NovaNet [20] is an open peer-to-peer network that aims to build upon concepts of non-uniform incremental verifiable computation.
Main focus: NovaNet's focus is on peer-to-peer networking and decentralized computing rather than on proving the execution of programs in a zero-knowledge manner.
Privacy: NovaNet does not provide ZKPs or privacy features typically associated with zkVMs. Its focus is on decentralized networking and computation.
Performance: NovaNet prioritizes efficient decentralized computation but does not focus on privacy or performance benchmarks related to ZKPs.
Integration: NovaNet is built for decentralized networks but is not designed to integrate with systems requiring verifiable computation or ZKP generation.
Conclusion: No, NovaNet is not a zkVM. It is a decentralized peer-to-peer network focused on distributed computing rather than zero-knowledge computation.

21. [ZkLLVM]

Overview: ZkLLVM [21] is a compiler that transforms C++ or Rust code into circuits for use in zk-SNARK or zk-STARK systems. Its primary purpose is to bridge high-level programming languages with ZKP systems by compiling code into arithmetic circuits that can be used to generate and verify proofs.
Main focus: ZkLLVM focuses on making ZKPs accessible to developers by enabling them to write code in familiar languages (C++, Rust) and then compile that code into ZK circuits.
Privacy: ZkLLVM enables the generation of ZKPs by compiling high-level code into ZK-compatible circuits. It plays a crucial role in privacy-preserving applications but does not act as a zkVM itself.
Performance: ZkLLVM allows for the performance of ZKPs to be closely tied to the complexity of the compiled circuits. The performance depends on the underlying zk-SNARK or zk-STARK system used.
Integration: ZkLLVM integrates with zk-SNARK and zk-STARK proof systems, making it useful for a variety of privacy-focused applications, but it does not serve as a zkVM for general-purpose computation.
Conclusion: No, zkLLVM is not a zkVM. It is a compiler that transforms high-level code into ZK circuits, enabling ZKPs but not acting as a virtual machine for executing and proving programs.

22. [ZkMove]

Overview: ZkMove [22] is a zkVM designed to execute smart contracts written in the Move language. It utilizes ZKPs to ensure that the execution of these contracts remains verifiable and secure.
Main focus: ZkMove focuses on privacy and verifiable execution for Move-based smart contracts, providing a framework for ZK-friendly computation.
Privacy: ZkMove ensures that smart contract execution remains private through ZKPs, making it suitable for privacy-preserving applications.
Performance: ZkMove is optimized for verifiable execution, ensuring that contracts can be proven correct while preserving privacy.
Integration: ZkMove integrates well with systems that use the Move language, particularly in environments that require private smart contract execution.
Conclusion: Yes, zkMove qualifies as a zkVM, offering ZK-friendly execution and privacy for smart contracts written in the Move language.

23. [O1VM]

Overview: O1VM [23] is a general-purpose zkVM developed by o1Labs. It is designed to prove the execution of MIPS programs efficiently through a combination of zk-SNARKs and specialized techniques like folding schemes and RAMLookups.
Main focus: O1VM focuses on scalability and verifiable computation for MIPS-based programs, making it a strong contender for executing and proving complex programs efficiently.
Privacy: O1VM ensures privacy through zk-SNARK proofs, keeping the details of the computation private while proving its correctness.
Performance: O1VM is optimized for handling long execution traces and complex computations, making it highly scalable.
Integration: O1VM integrates well with MIPS-based architectures and systems that require privacy-preserving computation.
Conclusion: Yes, o1VM qualifies as a zkVM, providing privacy, scalability, and strong proof generation for MIPS programs.

Summary of findings

Project name	ZkVM status	Zero knowledge	Reasoning/comments
SP1	Yes	No	Proves execution of LLVM-based programs but lacks privacy features.
Nexus	Yes	No	Strong proof generation but lacks zero-knowledge privacy due to Spartan.
Risc0	Yes	Yes	Supports full ZKP generation for Rust programs.
Powdr	No	Yes	Toolkit for creating custom zkVMs, not a zkVM itself.
ZkMIPS	Yes	Yes	Supports MIPS-like architecture with full zero-knowledge and proof generation.
Valida	Yes	No	Performance-focused zkVM, lacks privacy guarantees.
Jolt	Yes	No	Performance-focused zkVM, does not achieve zero-knowledge privacy.
ZkWASM	Yes	Yes	Full zero-knowledge and verifiable execution of WebAssembly code.
Aleo	Yes	Yes	Fully private and scalable dapps.
Ola	No	No	Primarily a ZK-rollup platform, not a zkVM, focusing on scalability and performance rather than privacy.
Miden	Yes	Yes	Zk-STARK-based zkVM with strong privacy and scalability.
ZkOS	No	No	Verifiable operating system focused on zkApps, not a zkVM.
Triton	No	No	Optimizes GPU workloads but not designed for ZKPs.
Cairo	Yes	ZK-friendly	Custom Rust-based language with zk-STARK proof generation.
SnarkOS	No	Yes	Decentralized OS for Aleo's network, focuses on consensus rather than verifiable computation.
Lurk	No	No	Programming language for recursive zk-SNARKs, not a zkVM.
Piecrust	Yes	ZK-friendly	ZkVM with recursive SNARK capabilities, focused on succinct proof generation.
Ceno	Yes	Yes	Theoretical zkVM improving prover efficiency through recursive proofs.
Stellar	No	No	Focuses on cross-border transactions, not ZK-proof generation or verifiable computation.
NovaNet	No	No	Peer-to-peer network focused on distributed computing, not zero-knowledge computation.
ZkLLVM	No	Yes, in some cases	Compiler for generating ZK-circuits, not a zkVM.
ZkMove	Yes	ZK-friendly	ZkVM supporting Move language with ZKP execution.
O1VM	Yes	Yes	MIPS-based zkVM with strong privacy, scalability, and proof generation.

Insights and conclusions

Our analysis reveals that many of the projects labeled as zkVMs do meet the core criteria for zkVMs, offering verifiable computation and proof generation as foundational features. However, a number of these projects fall short of delivering full zero-knowledge privacy. Projects like Risc0, Aleo, and Miden stand out as leading zkVM frameworks that balance proof generation, privacy, and scalability, offering strong platforms for developers seeking to build privacy-preserving applications.

Conversely, projects like SP1 and Nexus excel in generating verifiable proofs but currently lack comprehensive zero-knowledge privacy mechanisms. These platforms are excellent for scenarios where proof generation and scalability are paramount, but privacy is not a primary concern.

As zkVM technology continues to evolve, we expect to see more projects integrating enhanced privacy-preserving mechanisms while simultaneously improving performance and scalability. This ongoing development will likely broaden the application of zkVMs across the blockchain ecosystem, particularly in privacy-sensitive sectors such as finance, data security, and decentralized applications.

What are your thoughts on our zkVM analysis? Do you agree with our findings, or do you know of other zkVM projects that should be on our radar? We would love to hear your insights, questions, or suggestions! Feel free to join the discussion on our forum.

References

[1] Introducing SP1: A performant, 100% open-source, contributor-friendly zkVM. Retrieved from https://blog.succinct.xyz/introducing-sp1/

[2] The Nexus 2.0 zkVM. Retrieved from https://docs.nexus.xyz/

[3] The first general purpose zkVM. Retrieved from https://www.risczero.com/zkvm

[4] Powdr. Retrieved from https://docs.powdr.org/

[5] ZKM Architecture. Retrieved from https://docs.zkm.io/zkm-architecture

[6] Valida zkVM Design. Retrieved from https://delendum.xyz/writings/2023-05-10-zkvm-design.html

[7] Building Jolt: A fast, easy-to-use zkVM. Retrieved from https://a16zcrypto.com/posts/article/building-jolt/

[8] ZK-WASM. Retrieved from https://delphinuslab.com/zk-wasm/

[9] Aleo. Retrieved from https://aleo.org/blog/

[10] OlaVM Whitepaper V2. Retrieved from https://github.com/Sin7Y/olavm-whitepaper-v2/tree/master

[11] Polygon Miden VM. Retrieved from https://0xpolygonmiden.github.io/miden-vm/intro/main.html

[12] The Adventures of OS: Making a RISC-V Operating System using Rust. Retrieved from https://osblog.stephenmarz.com/index.html

[13] Triton VM. Retrieved from https://triton-vm.org/spec/

[14] How does the original Cairo VM work?. Retrieved from https://github.com/lambdaclass/cairo-vm/blob/main/docs/python_vm/README.md

[15] Aleo completes security audits of snarkOS & snarkVM. Retrieved from https://aleo.org/post/aleo-completes-security-audits-of-snarkos-and-snarkvm/

[16] Lurk zkVM. Retrieved from https://github.com/lurk-lab

[17] Piecrust VM. Retrieved from https://docs.rs/piecrust/latest/piecrust/

[18] Ceno: Non-uniform, Segment and Parallel Zero-knowledge Virtual Machine. Retrieved from https://eprint.iacr.org/2024/387

[19] ZkVM: a new design for fast, confidential smart contracts. Retrieved from https://stellar.org/blog/developers/zkvm-a-new-design-for-fast-confidential-smart-contracts

[20] Novanet. Retrieved from https://www.novanet.xyz/blog

[21] ZKLLVM. Retrieved from https://github.com/NilFoundation/zkLLVM

[22] zkMove 0.2.0 - Achieving Full Bytecode Compatibility with Move. Retrieved from https://www.zkmove.net/2023-06-20-zkMove-0.2.0-Achieving-Full-Bytecode-Compatibility-with-Move/

[23] O1VM. Retrieved from https://github.com/o1-labs/proof-systems/tree/master/o1vm

Nescience: A User-Centric State-Separation Architecture

2024-08-23T12:00:00.000Z

Nescience: A user-centric state-separation architecture.

Disclaimer: This content is a work in progress. Some components may be updated, changed, or expanded as new research findings become available.

In blockchain applications, privacy settings are typically predefined by developers, leaving users with limited control. This traditional, one-size-fits-all approach often leads to inefficiencies and potential privacy concerns as it fails to cater to the diverse needs of individual users. The Nescience state-separation architecture (NSSA) aims to address these issues by shifting privacy control from developers to users. NSSA introduces a flexible, user-centric approach that allows for customized privacy settings to better meet individual needs. This blog post will delve into the details of NSSA, including its different execution types, cryptographic foundations, and unique challenges.

Introducing NSSA: A user-centric approach

NSSA gives users control over their privacy settings by introducing shielded (which creates a layer of privacy for the outputs, and only the necessary details are shared) and deshielded (which reveal private details, making them publicly visible) execution types in addition to the traditional public and private modes. This flexibility allows users to customize their privacy settings to match their unique needs, whether they require high levels of confidentiality or more transparency. In NSSA, the system is divided into two states: public and private. The public state uses an account-based model while the private state employs a UTXO-based (unspent transaction output) model. Private executions within NSSA utilize UTXO exchanges, ensuring that transaction details remain confidential. The sequencer verifies these exchanges without accessing specific details, enhancing privacy by unlinking sender and receiver identities. Zero-knowledge proofs (ZKPs) allow users to prove transaction validity without revealing data, maintaining the integrity and confidentiality of private transactions. UTXOs contain assets such as balances, NFTs, or private storage data, and are stored in plaintext within Sparse Merkle trees (SMTs) in the private state and as hashes in the public state. This dual-storage approach keeps UTXO details confidential while allowing public verification through hashes, achieving a balance between privacy and transparency.

Implementing NSSA introduces unique challenges, particularly in cryptographic implementation and maintaining the integrity of private executions. These challenges are addressed through various solutions such as ZKPs, which ensure transaction validity without compromising privacy, and the dual-storage approach, which maintains confidentiality while enabling public verification. By allowing users to customize their privacy settings, NSSA enhances user experience and promotes wider adoption of private execution platforms. As we move towards a future where user-empowered privacy control is crucial, NSSA provides a flexible and user-centric solution that meets the diverse needs of blockchain users.

Why NSSA differs from other hybrid execution platforms

In many existing hybrid execution platforms, privacy settings are predefined by developers, often applying a one-size-fits-all approach that does not accommodate the diverse privacy needs of users. These platforms blend public and private states, but control over privacy remains with the application developers. While this approach is straightforward for developers (who bear the responsibility for any potential privacy leaks), it leaves users with no control over their own privacy settings. This rigidity becomes problematic as user needs evolve over time, or as new regulations necessitate changes to privacy configurations. In such cases, updates to decentralized applications are required to adjust privacy settings, which can disrupt the user experience and create friction.

NSSA addresses these limitations by introducing a groundbreaking concept: selective privacy. Unlike traditional platforms where privacy is static and determined by developers, selective privacy empowers users to dynamically choose their own privacy levels based on their unique needs and sensitivity. This flexibility is critical in a decentralized ecosystem where the diversity of users and use cases demands a more adaptable privacy solution.

In the NSSA model, users have the autonomy to select how they interact with decentralized applications (dapps) by choosing from four types of transaction executions: public, private, shielded, and deshielded. This model allows users to tailor their privacy settings on a per-transaction basis, selecting the most appropriate execution type for each specific interaction. For instance, a user concerned about data confidentiality might opt for a fully private transaction while another user, wary of privacy but seeking transparency, might choose a public execution.

While selective privacy may appear complex, especially for users who are not technically inclined, Nescience mitigates this by allowing the community or developers to establish best practices and recommended approaches. These guidelines provide users with an informed starting point, and over time, users can adjust their privacy settings as their preferences and trust in the platform evolve. Importantly, selective privacy gives users the right to alter their privacy level at any point in the future, ensuring that their privacy settings remain aligned with their needs as they change.

This approach not only empowers users but also facilitates greater adoption of dapps. Users who are skeptical about privacy concerns can initially engage with transparent transactions and gradually shift towards more private executions as they gain confidence in the system and vice versa for users who start with privacy but later find transparency beneficial for certain transactions. In this way, selective privacy bridges the gap between privacy and transparency, allowing for an optimal balance to emerge from the community’s collective preferences.

To liken this to open-source projects: in traditional systems, developers fix privacy rules much like immutable code—users must comply with these fixed settings. In contrast, with selective privacy, the rules are malleable and shaped by the users’ preferences, enabling the community to find the ideal balance between privacy and efficiency over time.

NSSA is distinct from traditional zero-knowledge (ZK) rollups in several key ways. One of the unique features of NSSA is its public execution type, which does not require ZKPs or a zero-knowledge virtual machine (zkVM). This provides a significant advantage in terms of scalability and efficiency as users can choose public executions for transactions that do not require enhanced privacy, avoiding the overhead associated with ZKP generation and verification.

Moreover, NSSA introduces two additional execution types—shielded and deshielded—which further distinguish it from traditional privacy-preserving rollups. These execution types allow for more nuanced control over privacy, giving users the ability to shield certain aspects of a transaction while deshielding others. This flexibility sets NSSA apart as a more adaptable and user-centric platform, catering to a wide range of privacy needs without imposing a one-size-fits-all solution.

By combining selective privacy with a flexible execution model, NSSA offers a more robust and adaptable framework for decentralized applications, ensuring that users maintain control over their privacy while benefiting from the security and efficiency of blockchain technology.

How Nescience state-separation architecture can be used

NSSA offers a flexible, privacy-preserving add-on that can be applied to existing dapps. One of the emerging trends in the blockchain space is that each dapp is expected to have its own rollup for efficiency, and it is estimated that Ethereum could see the deployment of different rollups in the near future. A key question arises: how many of these rollups will incorporate privacy? For dapp developers who want to offer flexible, user-centric privacy features, NSSA provides a solution through selective privacy.

Use case: Adding privacy to existing dapps

Consider a dapp running on a transparent network that offers no inherent privacy to its users. Converting this dapp to a privacy-preserving architecture from scratch would require significant effort, restructuring, and a deep understanding of cryptographic frameworks. However, with NSSA, the dapp does not need to undergo extensive changes. Instead, the Nescience state-separation model can be deployed as an add-on, offering selective privacy as an option for the dapp’s users.

This allows the dapp to retain its existing functionality while providing users with a choice between the traditional, transparent version and a new version with selective privacy features. With NSSA, the privacy settings are flexible, meaning users can tailor their level of privacy according to their individual needs while the dapp operates on its current infrastructure. This contrasts sharply with the typical approach, where dapps are either entirely transparent or fully private, with no flexibility for users to select their own privacy preferences.

Key advantage: Decoupling from the host chain

A key feature of NSSA is that it operates independently of the privacy characteristics of the host blockchain. Whether the host chain is fully transparent or fully private, the Nescience state-separation architecture can be deployed on top of it, offering users the ability to choose their own privacy settings. This decoupling from the host chain’s inherent privacy model is critical as it allows users to benefit from selective privacy even in environments that were not originally designed to offer it.

In fully private chains, NSSA allows users to selectively reveal transaction details when compliance with regulations or other requirements is necessary. In fully transparent chains, NSSA allows users to maintain privacy for specific transactions, offering flexibility that would not otherwise be possible.

Conclusion

NSSA provides a powerful tool for dapp developers who want to offer selective privacy to their users without the need for a complete overhaul of their existing systems. By deploying NSSA as an add-on, dapps can give users the ability to choose their own privacy settings whether they are operating on transparent or private blockchains. This flexibility makes NSSA a valuable option for any dapp looking to provide enhanced privacy options while maintaining efficiency and ease of use.

B. Design

In this section, we will delve into the core design components of the Nescience state-separation architecture, covering its key structural elements and the mechanisms that drive its functionality. We will explore the following topics:

Architecture's components: An in-depth look at the foundational building blocks of NSSA, including the public and private states, UTXO structures, zkVM, and smart contracts. These components work together to facilitate secure, flexible, and scalable transactions within the architecture.
General execution overview: We will outline the overall flow of transaction execution within NSSA, describing how users interact with the system and how the architecture supports various types of executions—public, private, shielded, and deshielded—while preserving privacy and efficiency.
Execution processes and UTXO management: This section will focus on the lifecycle of UTXOs within the architecture, from their generation to consumption. We will also cover the processes involved in managing UTXOs, including proof generation, state transitions, and ensuring transaction validity.

These topics will provide a comprehensive understanding of how NSSA enables flexible and secure interactions within dapps.

1. Architecture's components

NSSA introduces an advanced prototype execution framework designed to enhance privacy and security in blockchain applications. This framework integrates several essential components: the public state, private state, zkVM, various execution types, Nescience users, and smart contracts.

a) Public state

The public state in the NSSA is a fundamental component designed to hold all publicly accessible information within the blockchain network. This state is organized as a single Merkle tree structure, a sophisticated data structure that ensures efficient and secure data verification. The public state includes critical information such as user balances and the public storage data of smart contracts.

In an account-based model, the public state operates by storing each account or smart contract's public data as individual leaf nodes within the Merkle tree. When transactions occur, they directly modify the state by updating these leaf nodes. This direct modification ensures that the most current state of the network is always reflected accurately.

The Merkle tree structure is essential for maintaining data integrity. Each leaf node contains a hash of a data block, and each non-leaf node contains the hash of its child nodes. This hierarchical arrangement means that any change in the data will result in a change in the corresponding hash, making it easy to detect any tampering. The root hash, or Merkle root, is stored on the blockchain, providing a cryptographic guarantee of the data's integrity. This root hash serves as a single, concise representation of the entire state, enabling quick and reliable verification by any network participant.

Transparency is a key feature of the public state. All data stored within this state is openly accessible and verifiable by any participant in the network. This openness ensures that all transactions and state changes are visible and auditable, fostering trust and accountability. For example, user balances are publicly viewable, which helps ensure transparency and trust in the system. Similarly, public smart contract storage can be accessed and verified by anyone, making it suitable for applications that require public scrutiny and auditability, such as public record updates and some financial transactions.

The workflow of managing the public state involves several steps to ensure data integrity and transparency. When a user initiates a transaction involving public data, the relevant changes are proposed and applied to the public state tree. The transaction details, such as transferring funds between accounts or updating smart contract storage, update the corresponding leaf nodes in the Merkle tree. Following this, the hashes of the affected nodes are recalculated up to the root, ensuring that the entire tree accurately reflects the new state of the network. The updated Merkle root is then recorded on the blockchain, allowing all network participants to verify the integrity of the public state. Any discrepancy in the data will result in a mismatched root hash, signaling potential tampering or errors.

In summary, the public state in NSSA leverages the robustness of the Merkle tree structure to provide a secure, transparent, and verifiable environment for publicly accessible information. By operating on an account-based model and maintaining rigorous data integrity checks, the public state ensures that all transactions are transparent and trustworthy, laying a strong foundation for a reliable blockchain network.

b) Private state

The private state in the NSSA is a sophisticated system designed to maintain user privacy while ensuring transaction integrity. Each user has their own individual Merkle tree, which holds their private information such as balances and storage data. This structure is distinct from the public state, which uses an account-based model. Instead, the private state employs a UTXO-based model. In this model, each transaction output is a discrete unit that can be independently spent in future transactions. This design provides users with granular control over their transaction outputs.

A key aspect of maintaining privacy within the private state is the use of ZKPs. ZKPs allow transactions to be validated without revealing any underlying private data. This means that while the system can verify that a transaction is legitimate, the details of the transaction remain confidential. Only parties with the appropriate viewing key can access and reconstruct the user’s list of UTXOs, ensuring that sensitive information is protected.

The private state also employs a dual-storage approach to balance privacy and transparency. UTXOs are stored in plaintext within SMTs in the private state, providing detailed and accessible records for the user. In contrast, the public state only holds hashes of these UTXOs. This method ensures that while the public can verify the existence and integrity of private transactions through these hashes, they cannot access the specific details.

The workflow for a transaction in the private state begins with the user initiating a transaction involving their private data, such as transferring a private balance or updating private smart contract storage. The transaction involves spending existing UTXOs, represented as leaves in the Merkle tree, and creating new UTXOs, which are then appended to the user’s private list. The zkVM generates a ZKP to validate the transaction without revealing any private data, ensuring the transaction adheres to the system's rules.

Once the proof is generated, it is submitted to the sequencer, which verifies the transaction’s validity. Upon successful verification, the nullifier is added to the nullifier set, preventing double spending of the same UTXO. The use of ZKPs and nullifiers ensures that the private state maintains both security and privacy.

In summary, the private state in NSSA is meticulously designed to provide users with control over their private information while ensuring the security and integrity of transactions. By utilizing a UTXO-based model, individual Merkle trees, ZKPs, and a dual-storage system, NSSA achieves a balance between confidentiality and verifiability, making it a robust solution for managing private blockchain transactions.

c) ZkVM (zero-knowledge virtual machine)

The zkVM is a pivotal component in NSSA, designed to uphold the highest standards of privacy and security in blockchain transactions. Its primary function is to generate and aggregate ZKPs, enabling users to validate the correctness of their transactions without disclosing any underlying details. This capability is crucial for maintaining the confidentiality and integrity of sensitive data within the blockchain network.

ZKPs are sophisticated cryptographic protocols that allow one party, the prover, to convince another party, the verifier, that a certain statement is true, without revealing any information beyond the validity of the statement itself. In the context of the zkVM, this means users can prove their transactions are valid without exposing transaction specifics, such as amounts or parties involved. This process is essential for transactions within the private state, where maintaining confidentiality is paramount.

The generation of ZKPs involves intricate cryptographic computations. When a user initiates a transaction, the zkVM processes the transaction inputs and produces a proof that the transaction adheres to the protocol's rules. This proof must be robust enough to convince the verifier of the transaction's validity while preserving the privacy of the transaction details.

Performance optimization is another critical function of the zkVM. In a typical blockchain scenario, verifying multiple individual proofs can be computationally intensive and time consuming, potentially leading to network congestion and delays. To address this, the zkVM can aggregate multiple ZKPs into a single, consolidated proof. This aggregation significantly reduces the verification overhead as the verifier needs to check only one comprehensive proof rather than multiple individual ones. This efficiency is vital for maintaining high throughput and low latency in the blockchain network, ensuring that the system can handle a large volume of transactions swiftly and securely.

Furthermore, the zkVM's role extends beyond mere proof generation and aggregation. It also ensures that all transactions meet the required privacy and security standards before they are recorded on the blockchain. By interacting seamlessly with other components such as the public and private states, the zkVM ensures that any transaction, whether it involves public data, private data, or a mix of both, is thoroughly validated and secured.

In summary, the zkVM is essential for the NSSA, providing the cryptographic backbone necessary to support secure and private transactions. Its ability to generate and aggregate ZKPs not only preserves the confidentiality of user data but also enhances the overall efficiency and scalability of the blockchain network. By ensuring that all transactions are validated without revealing sensitive information, the zkVM upholds the integrity and trustworthiness of the Nescience blockchain system.

d) Execution types in NSSA

NSSA incorporates multiple execution types to cater to varying levels of privacy and security requirements. These execution types—public, private, shielded, and deshielded—are designed to provide users with flexible options for managing their transactions based on their specific privacy needs.

1. Public executions

Public executions are straightforward transactions that involve reading from and writing to the public state. In this model, data is openly accessible and verifiable by all participants in the network. Public executions do not require ZKPs since transparency is the primary goal. This execution type is ideal for non-sensitive transactions where public visibility is beneficial, such as updating public records, performing open financial transactions, or interacting with public smart contracts.

The workflow for a public execution starts with a user initiating a transaction that modifies public data. The transaction details are then used to update the relevant leaf nodes in the Merkle tree. As changes are made, the hashes of affected nodes are recalculated up to the root, ensuring that the entire tree reflects the most recent state. Finally, the updated Merkle root is recorded on the blockchain, making the new state publicly verifiable.

2. Private executions

Private executions are designed for confidential transactions, reading from and writing to the private state. These transactions require ZKPs to ensure that while the transaction details are validated, the actual data remains private. This execution type is suitable for scenarios where privacy is crucial, such as private financial transactions or sensitive data management within smart contracts.

In a private execution, the user initiates a transaction involving private data. The transaction spends existing UTXOs and creates new ones, all of which are represented as leaves in the Merkle tree. The zkVM generates a ZKP to validate the transaction without revealing private data. This proof is submitted to the sequencer, which verifies the proof to ensure the transaction's validity. Upon successful verification, the nullifier is added to the nullifier set, and the private state is updated with the new Merkle root.

3. Shielded executions

Shielded executions create a layer of privacy for the outputs by allowing interactions between the public and private states. When a transaction occurs in a shielded execution, details of the transaction are processed within the private state, ensuring that sensitive information remains confidential. Only the necessary details are shared with the public state, often in a masked or encrypted form. This approach allows for the validation of the transaction without revealing critical data, thus preserving the privacy of the involved parties.

The workflow for shielded executions begins with the user initiating a transaction that reads from the public state and prepares to write to the private state. Public data is accessed, and the private state is prepared to receive new data. The zkVM generates a ZKP to hide the receiver’s identity. This proof is submitted to the sequencer, which verifies the proof to ensure the transaction's validity. If valid, the private state is updated with the new data while the public state reflects the change without revealing private details. This type of execution is particularly useful for scenarios where the receiver’s identity needs to be hidden, such as in anonymous donation systems or confidential data storage.

4. Deshielded executions

Deshielded executions operate in the opposite manner of shielded executions, where data is read from the private state and written to the public state. This execution type is useful in situations where the sender's identity needs to be kept confidential while making the transaction results publicly visible.

In a deshielded execution, the user initiates a transaction that reads from the private state and prepares to write to the public state. Private data is accessed, and the transaction details are prepared. The zkVM generates a ZKP to hide the sender’s identity. This proof is then submitted to the sequencer, which verifies the proof to ensure the transaction's validity. Once verified, the public state is updated with the new data, reflecting the change while keeping the sender’s details confidential. This can be useful when transparency is needed, such as when auditing or proving certain aspects of a transaction to a wider audience. By selectively deshielding certain transactions, users can control what information is shared publicly, thus maintaining a balance between privacy and transparency as required by their specific use case.

Table of execution types

Type	Read from	Write to	ZKP required	Use case	Description
Public	Public state	Public state	No	Non-sensitive transactions requiring transparency.	Ideal for transactions that do not require privacy, ensuring full transparency.
Private	Private state	Private state	Yes	Confidential transactions needing privacy.	Suitable for transactions that require confidentiality. Ensures that transaction details remain private through the use of ZKPs.
Shielded	Public state	Private state	Yes	Transactions where the receiver’s identity needs to be hidden.	Hides the identity of the receiver while keeping the transaction details private. Suitable for anonymous donations or confidential data storage.
Deshielded	Private state	Public state	Yes	Transactions where the sender’s identity needs to be hidden.	Ensures the sender’s identity remains confidential while making the transaction results public. Suitable for confidential disbursements or anonymized data publication.

By supporting a range of execution types, NSSA provides a flexible and robust framework for managing privacy and security in blockchain transactions. Whether the need is for complete transparency, total privacy, or a balanced approach, NSSA's execution types allow users to select the level of confidentiality that best fits their requirements. This flexibility enhances the overall utility of the blockchain, making it suitable for a wide array of applications and use cases.

e) Nescience users

Nescience users are integral to the architecture, managing balances and assets within the blockchain network and invoking smart contracts with various privacy options. They can choose the appropriate execution type—public, private, shielded, or deshielded—based on their specific privacy and security needs.

Users handle both public and private balances. Public balances are visible to all network participants and suitable for non-sensitive transactions, while private balances are confidential and used for transactions requiring privacy. Digital wallets provide a user-friendly interface for managing these balances, assets, and transactions, allowing users to select the desired execution type seamlessly.

Security is ensured through the use of cryptographic keys, which authenticate and verify transactions. ZKPs maintain privacy by validating transaction correctness without revealing underlying data, ensuring sensitive information remains confidential even during verification.

The workflow for users involves initiating a transaction, preparing inputs, interacting with smart contracts, generating proofs if needed, and submitting the transaction to the sequencer for verification and state update. This flexible approach supports various use cases, from financial transactions and decentralized applications to data privacy management, allowing users to maintain control over their privacy settings.

By offering this high degree of flexibility and security, Nescience enables users to tailor their privacy settings to their specific needs, ensuring sensitive transactions remain confidential while non-sensitive ones are transparent. This integration of cryptographic keys and ZKPs provides a robust framework for a wide range of blockchain applications, enhancing both utility and trust within the network.

f) Smart contracts in NSSA

Smart contracts are a core feature of NSSA, providing a way to automate and execute predefined actions based on coded rules. Once deployed on the blockchain, these contracts become immutable, meaning their behavior cannot be altered. This ensures that they perform exactly as intended without the risk of tampering. Because the state and data of the contract are stored permanently on the blockchain, all interactions are fully transparent and auditable, creating a reliable and trustworthy environment.

One of the key strengths of smart contracts is their ability to automate processes. They are designed to automatically execute when specific conditions are met, reducing the need for manual oversight or intermediaries. For example, a smart contract might transfer funds when a certain deadline is reached or update a record once a task is completed. This self-executing nature makes them efficient and minimizes human error.

Smart contracts operate deterministically, meaning they will always produce the same result given the same inputs. This predictability is crucial for ensuring reliability, especially in complex systems. Additionally, they run in isolated environments on the blockchain, which enhances security by preventing unintended interactions with other processes.

Security is another critical feature of smart contracts. They leverage the underlying cryptographic protections of the blockchain, ensuring that every interaction is secure and authenticated. Before deployment, the contract code can be audited and verified to ensure it functions correctly. Once on the blockchain, the immutable nature of the code prevents unauthorized modifications, further ensuring the integrity of the system.

Running smart contracts requires computational resources, which are compensated through gas fees. These fees vary depending on the complexity of the operations within the contract. More resource-intensive contracts incur higher fees, which helps manage the computational load on the blockchain network.

The workflow of a smart contract begins with its development, where developers code the contract using languages like Rust. Once the code is compiled and deployed to the blockchain, it becomes a permanent part of the network. Users can then interact with the contract by sending transactions that invoke specific functions. The contract checks whether the required conditions are met, and if so, it automatically executes the specified actions, such as transferring tokens or updating data on the blockchain.

The benefits of smart contracts are numerous. They eliminate the need for intermediaries by providing a system where trust is built into the code itself. This not only reduces costs but also increases efficiency by automating repetitive processes. The inherent security of smart contracts, combined with their transparency—where every action is recorded and visible on the blockchain—makes them a powerful tool for ensuring accountability and trust in decentralized systems. They can be ideal for managing decentralized autonomous organizations (DAOs), where governance decisions are automated through coded rules.

By integrating smart contracts, NSSA offers a highly versatile, secure, and transparent framework that can support a wide range of applications across various industries, from finance to governance, supply chains, and more.

2. General execution overview

This section explains the execution process within NSSA, providing an overview of how it works from start to finish. It outlines the steps involved in each execution type, guiding the reader through the entire process from user interaction to completion.

The process begins when a user initiates a transaction by invoking a smart contract. This invocation involves selecting at least one of the four execution types: public, private, shielded, or deshielded. The choice of execution type determines how data will be read from and written to the blockchain, affecting the transaction's privacy and security levels. Each execution type caters to different privacy needs, allowing the user to tailor the transaction according to their specific requirements, whether it be full transparency or complete confidentiality.

Step 1: Smart contract selection and input creation

Smart contract selection: The user selects a smart contract they wish to invoke.
Input creation: The user creates a set of inputs required for the invocation by reading the necessary data from both the public and private states. This includes:
- Public data such as current account balances, public keys, and smart contract states.
- Private data such as private account balances and UTXOs.

Step 2: Choosing execution type

Execution type selection: The user selects the type of execution based on their privacy needs. The options include:
- Public execution: Suitable for transactions where transparency is desired.
- Private execution: Used when transaction details need to be confidential.
- Shielded execution: Hides the receiver's identity.
- Deshielded execution: Hides the sender's identity.
ZkVM requirement: If the execution involves private, shielded, or deshielded types, the user must call the zkVM to handle these confidential transactions. For purely public executions, the zkVM is not needed, and the user can directly transmit the transaction code to the sequencer.

Step 3: Calling zkVM for proof generation

ZkVM compilation: The user calls the zkVM to compile the smart contract with both public and private inputs.
- Kernel circuit proofs: The zkVM generates individual proofs for each execution type through kernel circuits.
- Proof aggregation: The zkVM aggregates these individual proofs into a single comprehensive proof, combining both private and public inputs.

Step 4: Transmitting public inputs and retaining private inputs

Retaining private inputs: The user keeps the private inputs secure and does not transmit them.
Revealing public inputs: The user transmits the following public inputs to the sequencer:
- Public inputs of the recursive proof
- Hashes of UTXOs
- Updates to the public state
- Transaction signature
- Nullifiers (to prevent double spending)

After completing these steps, the user's part of the execution is done, and the sequencer takes over the process.

Sequencer actions

Step 5: Proof verification

Proof and data reception: The sequencer receives the proof and public inputs from the user.
Verification process:
- For private, shielded, and deshielded executions, the sequencer verifies the proof using the provided public data.
- For public executions, the sequencer reruns the smart contract code with the provided inputs to check the results.
Validation: If both the zkVM proofs and public execution results are verified successfully, the sequencer collects the proof and public data to proceed. If verification fails, the process is aborted, and the transaction is rejected.

Step 6: Aggregating proofs and finalizing the block

Proof aggregation: The sequencer calls the zkVM again to aggregate all received proofs into one comprehensive proof to finalize the block.
Finalizing the block:
- Public state update: The sequencer updates the public state with the new transaction data.
- Nullifier tree update: Updates the nullifier tree to reflect the new state and prevent double spending.
- Synchronization mechanism: Runs synchronization mechanisms to ensure fairness and consistency across the network.
- UTXO validation: Validates the exchanged UTXOs to complete the transaction process.

This comprehensive process ensures that transactions are executed securely, with the appropriate level of privacy and state updates synchronized across the network.

Below, we outline the execution process of the four different execution types within NSSA:

Public execution:

Private execution:

Shielded execution:

Deshielded execution:

3. Execution processes and UTXO management

In Nescience state-separation architecture, UTXOs are key components for managing private data and assets. They serve as private entities that hold both storage and assets, facilitating secure and confidential transactions. UTXOs are utilized in three of the four execution types within NSSA: private, shielded, and deshielded executions. This section explores the lifecycle of UTXOs, detailing their generation, transfer, encryption, and eventual consumption within the private execution framework.

a) Components of a Nescience UTXO

A Nescience UTXO is a critical and versatile component of the private state in the Nescience state-separation architecture. It carries essential information that ensures its proper functionality within private execution, such as the owner, value, private storage slot, non-fungibles, and other cryptographic components. Below is a detailed breakdown of each component and its role in maintaining the integrity, security, and privacy of the system:

Owner: The owner component represents the public key of the entity that controls the UTXO. Only the owner can spend this UTXO, ensuring its security and privacy through public key cryptography. This means that the UTXO remains secure as only the rightful owner, using their private key, can generate valid signatures to authorize the transaction. For example, if Alice owns a UTXO linked to her public key, she must sign any transaction to spend it using her private key. This cryptographic protection ensures that only Alice can authorize spending the UTXO and transfer it to someone else, such as Bob.
Value: The value in a UTXO represents the balance or asset contained within it. This could be cryptocurrency, tokens, or other digital assets. The value ensures accurate accounting, preventing double spending and maintaining the overall integrity of the system. For instance, if Alice's UTXO has a value of 10 tokens, this represents her ownership of that amount within the network, and when spent, this value will be deducted from her UTXO and transferred accordingly.
Private storage slot: The private storage slot is an arbitrary and flexible storage space within the UTXO for Nescience applications. It allows users and smart contracts to store additional private data that is only accessible by the owner. This could be used to hold metadata, smart contract states, or user-specific information. For example, if a smart contract is holding private user data, this information is securely stored in the private storage slot and can only be accessed or modified by the owner, ensuring privacy and security.
Non-fungibles: Non-fungibles within the UTXO represent unique assets, such as NFTs (Non-Fungible Tokens). Each non-fungible asset is assigned a unique serial number or identifier within the UTXO, ensuring its distinctiveness and traceability. For example, if Alice owns a digital artwork represented as an NFT, the non-fungible component of the UTXO will store the unique identifier for this NFT, preventing duplication or forgery of the digital asset.
Random commitment key: The random commitment key (RCK) is a randomly generated number used to create a cryptographic commitment to the contents of the UTXO. This commitment ensures the integrity of the data without revealing any private information. By generating a random key for the commitment, the system ensures that even if someone observes the commitment, they cannot infer any details about the underlying UTXO. For example, RCK helps maintain confidentiality in the system while still allowing the verification of transactions.
Nullifier key: The Nullifier key is another randomly generated number, used to ensure that a UTXO is only spent once. When a UTXO is spent, its nullifier key is recorded in a nullifier set to prevent double spending. This key guarantees that once a UTXO is spent, it cannot be reused in another transaction, effectively nullifying it from future use. This mechanism is crucial for maintaining the security and integrity of the system, as it ensures that no UTXO can be spent more than once.

b) UTXO lifecycle: From generation to consumption

UTXOs in NSSA are created when a transaction outputs a specific value, asset, or data intended for future use. Once generated, these UTXOs become private entities owned by specific users, containing sensitive information such as balances, private data, or unique assets like NFTs.

To maintain the required level of confidentiality, UTXOs are encrypted and transferred anonymously across the network. This encryption process ensures that the data within each UTXO remains hidden from network participants, including the sequencer, while still allowing for verification and validation through ZKPs. These proofs enable the network to ensure that UTXOs are valid, prevent double spending, and maintain security, all without revealing any sensitive information.

When a user wishes to spend or transfer a UTXO, the lifecycle progresses towards its consumption. The user must prove ownership and validity of the UTXO through a ZKP, which is then verified by the sequencer. This process occurs in private, shielded, and deshielded executions, where confidentiality is a priority. Once the proof is validated, the UTXO is consumed, meaning it is marked as spent and cannot be reused, ensuring the integrity of the transaction and preventing double spending.

UTXOs are central to the private, shielded, and deshielded execution types in Nescience. In private executions, UTXOs are transferred securely between parties without revealing any details to the public state. In shielded executions, UTXOs are used to receive assets from the public state while keeping the recipient's identity confidential. Finally, in deshielded executions, UTXOs are used to send assets from the private state to the public state, while preserving the sender's anonymity.

Since UTXOs are not exchanged in public executions, this lifecycle analysis is focused solely on private, shielded, and deshielded executions, where privacy and confidentiality are essential. In these contexts, the careful management and transfer of UTXOs ensure that the users' private data and assets remain secure, while still allowing for seamless and confidential transactions within the network.

At this point, it's crucial to introduce two key components that will play a significant role in the next section: the ephemeral key and the nillifier.

Ephemeral key: The ephemeral key is embedded in the transaction message and plays a crucial role in maintaining privacy. It is used by the sender, alongside the receiver's public key, in a key agreement protocol to derive a shared secret. This shared secret is then employed to encrypt the transaction details, ensuring that only those with the receiver's viewing key can decrypt the transaction. By using the ephemeral key, the receiver can regenerate the shared secret, granting access to the transaction's contents. The sender generates the ephemeral key using their spending key and the UTXO's nullifier, reinforcing the security of the transaction. (more details in key management and addresses section)
Nullifier: A nullifier is a unique value tied to a specific UTXO, ensuring that it has not been spent before. Its uniqueness is essential, as a nullifier must never correspond to more than one UTXO—otherwise, even if both UTXOs are valid, only one could be spent. This would undermine the integrity of the system. To spend a UTXO, a proof must be provided showing that the nullifier does not already exist in the Nullifier Tree. Once the transaction is confirmed and included in the blockchain, the nullifier is added to the Nullifier Tree, preventing any future reuse of the same UTXO. A UTXO's nullifier is generated by combining the receiver's nullifier key with the transaction note's commitment, further ensuring its distinctiveness and security. (More details in nullifier tree section.)

I) UTXOs in private executions

In private executions within NSSA, transactions are handled ensuring maximum privacy by concealing all transaction details from the public state. This approach is particularly useful for confidential payments, where the identities of the sender and receiver, as well as the transaction amounts, must remain hidden. The process is powered by ZKPs, ensuring that only the involved parties have access to the transaction details while maintaining the integrity of the network.

Stages of private execution: Private executions operate in two key stages: UTXO consumption and UTXO creation. In the first stage, UTXOs from the private state are used as inputs for the transaction. In the second stage, new UTXOs are generated as outputs and stored back in the private state. Throughout this process, the details of the transaction are kept confidential and only shared between the sender and receiver.
Private transaction workflow (transaction initialization): The user initiates a private transaction by selecting the input UTXOs that will be spent and determining the output UTXOs to be created. This involves specifying the amounts to be transferred and the recipient’s private address (a divestified address that hides the recipient's public address from the network). The nullifier key and random number for commitments (RCK) are also generated at this stage to define how these UTXOs can be spent or nullified in the future by the receiver.
Proof generation and verification: Next, the zkVM generates a ZKP to validate the transaction. This proof includes both a membership proof for the input UTXOs, confirming their presence in the hashed UTXO tree, and a non-membership proof to ensure that the input UTXOs have not already been spent (i.e., they are not in the nullifier tree). The proof also confirms that the total input value matches the total output value, ensuring no discrepancies. The user then submits the proof, along with the necessary metadata, to the sequencer.
Shared secret and encryption: To maintain confidentiality, the sender uses the receiver’s divestified address to generate an ephemeral public key. This allows the creation of a shared secret between the sender and receiver. Using a key derivation function, a symmetric encryption key is generated from the shared secret. The input and output UTXOs are then encrypted using this symmetric key, ensuring that only the intended recipient can decrypt the data.
Broadcasting the transaction: The user broadcasts the encrypted UTXOs to the network, along with a commitment to the output UTXOs using Pedersen hashes. These committed UTXOs are sent to the sequencer, which updates the hashed UTXO tree without knowing the transaction details.
Decryption by the receiver: After the broadcast, the receiver attempts to decrypt the broadcast UTXOs using their symmetric key, derived from the ephemeral public key. If the receiver successfully decrypts a UTXO, it confirms ownership of that UTXO. The receiver then computes the nullifier for the UTXO and verifies its presence in the hashed UTXO tree and its absence from the nullifier tree, ensuring it has not been spent. Finally, the new UTXO is added to the receiver’s locally stored UTXO tree for future transactions.

Throughout the private execution process, the identities of both the sender and receiver, as well as all transaction details, remain hidden from the public. The use of ZKPs ensures that the integrity of the transaction is verified without revealing any sensitive information. At the end of the process, the network guarantees that no participant, aside from the sender and receiver, can deduce any details about the transaction or the involved parties.

II) UTXOs in shielded executions

In shielded executions, the interaction between public and private states provides a hybrid privacy model that balances transparency and confidentiality. This model is suitable for scenarios where the initial step, such as a public transaction, requires visibility, while subsequent actions, such as private asset management, need to remain confidential. One common use case is asset conversion—where a public token is converted into a private token. The conversion is visible on the public ledger, but subsequent transactions remain private.

a) How shielded executions work

Shielded executions operate in two distinct stages: first, there is a modification of the public state, and then new UTXOs are created and stored in the private state. Importantly, shielded executions do not consume UTXOs but instead mint them, as new UTXOs are created to reflect the changes in the private state. This structure demands ZKPs to ensure that the newly minted UTXOs are consistent with the modifications in the public state. Here’s a step-by-step breakdown of how the shielded execution process unfolds:

Transaction initiation: The user initiates a transaction that modifies the public state, such as converting a public token to a private token. The transaction alters the public state (e.g., balances or smart contract storage) while simultaneously preparing to mint new UTXOs in the private state.
Generating UTXOs: After modifying the public state, the system mints new UTXOs in the private state. These UTXOs must be securely created, ensuring their integrity and consistency with the initial public state modification. A ZKP is generated by the user to prove that these new UTXOs align with the changes made in the public state.
Key setup for privacy: The sender retrieves the receiver's address and uses it to create a shared secret through an ephemeral public key. This shared secret is then used to derive a symmetric key, which encrypts the output UTXOs. This encryption ensures that only the intended receiver can decrypt and access the UTXOs.
Broadcasting and verifying UTXOs: After encrypting the UTXOs, the sender broadcasts them to the network. The new hashed UTXOs are sent to the sequencer, which verifies the validity of the UTXOs and attaches them to the hashed UTXO tree within the private state. The public inputs for the ZKP circuits consist of the Pedersen-hashed UTXOs and the modifications in the public state.
Receiver's role: Once the UTXOs are broadcast, the receiver attempts to decrypt each UTXO using the symmetric key derived from the shared secret. If the decryption is successful, the UTXO belongs to the receiver. The receiver then verifies the UTXO’s validity by checking its inclusion in the hashed UTXO tree and ensuring that its nullifier has not yet been used.
Nullifier check and integration: To prevent double spending, the receiver computes the nullifier for the received UTXO and verifies that it is not already present in the nullifier tree. Once verified, the receiver adds the UTXO to their locally stored UTXO tree for future use in private transactions.

While shielded executions offer privacy, certain information is still exposed to the public state, such as the sender's identity. To further enhance privacy, the sender can create empty UTXOs—UTXOs that don’t belong to anyone but are included in the transaction to obfuscate the true details of the transaction. Though this approach increases the size of the data, it adds a layer of privacy by complicating the identification of meaningful transactions.

b) Summary of shielded execution flow

Stage 1 (public modification): The user modifies public state data, such as converting tokens from public to private. This stage is visible to the public.
Stage 2 (UTXO minting and privacy): New UTXOs are minted in the private state, encrypted, and broadcast to the network. The transaction remains private from this point forward, secured by ZKPs and cryptographic keys.
Receiver’s role: The receiver decrypts the UTXOs and verifies their validity, ensuring the UTXOs are not double spent and are ready for future transactions.

In summary, shielded executions enable a hybrid privacy model in Nescience, balancing public transparency and private confidentiality. They are well-suited for transactions requiring initial public visibility, such as asset conversions, while ensuring that subsequent actions remain secure and private within the network.

III) UTXOs in deshielded executions

In NSSA, deshielded executions offer a unique way to move data and assets from the private state to the public state, revealing previously private information in a controlled and verifiable manner. This type of execution allows for selective disclosure, ensuring transparency when needed while still maintaining the security and privacy of critical details through cryptographic techniques like ZKPs. Deshielded executions are particularly valuable for use cases such as regulatory compliance reporting, where specific transaction details must be revealed to meet legal requirements, while other sensitive transactions remain private.

a) Stages of deshielded executions

Stage 1 (UTXO consumption): The process begins in the private state, where UTXOs are consumed as inputs for the transaction. This involves gathering all necessary UTXOs that contain the assets or balances to be made public, as well as any associated private data stored in memory slots.
Stage 2 (public state modification): After the UTXOs are consumed, the transaction details are made public by modifying the public state. This update includes changes to the public balances, storage data, and any necessary public records. While the public state is updated, the sender’s identity and other sensitive information remain hidden, thanks to the privacy-preserving properties of ZKPs.

This model ensures that private data can be selectively revealed when needed, offering both flexibility and transparency. It is particularly useful for scenarios requiring auditing or compliance reporting, where specific details must be made publicly verifiable without exposing the entire history or contents of private transactions.

b) How deshielded executions work

The deshielded execution process starts when a user initiates a transaction using private UTXOs. The Nescience zkVM is called to generate a ZKP, which validates the transaction without revealing sensitive details such as the sender's identity or the specifics of the Nescience application being executed.

During the transaction, the UTXOs from the private state are consumed, meaning they are used up as inputs and will no longer be available for future transactions. Instead of generating new UTXOs, the transaction modifies the public state, updating the necessary balances or memory slots related to the transaction. Here’s a step-by-step breakdown of how the deshielded execution process unfolds:

Get receiver's public address: The sender first identifies the public address of the receiver, to which the information or assets will be made public.
Determine input UTXOs and public state modifications: The sender gathers all the input UTXOs needed for the transaction and determines the public state modifications necessary for the Nescience applications and token transfers involved.
Calculate nullifiers: Nullifiers are generated for each input UTXO, ensuring that these UTXOs cannot be reused or double spent. The nullifiers are derived from the corresponding UTXO commitments.
Call zkVM with deshielded circuits: The sender invokes the zkVM with deshielded kernel circuits, which generates the proof. The proof ensures that all input UTXOs are valid by verifying their membership in the UTXO tree and their non-membership in the nullifier tree, ensuring they haven’t been spent.
Generate and submit proof: The zkVM generates a ZKP that verifies the correctness of the transaction without revealing private details. The proof includes the nullifiers and the planned modifications to the public state.
Send proof to sequencer: The sender then sends the proof and any relevant public information to the sequencer. The sequencer is responsible for verifying the proof, updating the public state accordingly, and adding the nullifiers to the nullifier tree.

Once the proof and public information have been broadcast to the network, the receiver does not need to take any further action. The sequencer manages the public state updates and ensures that the transaction is properly executed. By the end of the deshielded execution, specific transaction details become publicly visible, such as the identity of the receiver and the outcome of the transaction. This allows participants in the public state to extract information about the transaction, including the receiver's identity and some details about the execution. While the receiver's identity is revealed, the sender's identity and sensitive transaction details remain hidden, thanks to the use of ZKPs. This makes deshielded executions ideal for cases where transparency is needed, but complete privacy is still a priority for certain elements of the transaction.

Summary of UTXO consumption in NSSA

In NSSA, consuming UTXOs is a critical step in maintaining the security and integrity of the blockchain by preventing double spending. When a UTXO is consumed, it is used as an input in a transaction, effectively marking it as spent. This ensures that the UTXO cannot be reused, preserving the integrity of the blockchain.

The process of consuming UTXOs: The process of consuming a UTXO begins when a user selects a UTXO from their private state. The user verifies the UTXO’s existence and ownership using their viewing key, ensuring that they are the legitimate owner of the UTXO. Once verified, the user generates two key cryptographic proofs:
- Membership proof: This proof confirms that the UTXO exists within the hashed UTXO tree, ensuring its validity within the system.
- Non-membership proof: This proof ensures that the UTXO has not been previously consumed by checking its absence in the nullifier tree, which tracks spent UTXOs.

To mark the UTXO as spent, a nullifier is generated. This nullifier is a unique cryptographic hash derived from the UTXO, which is then added to the nullifier tree in the public state. Adding the nullifier to the tree prevents the UTXO from being reused in future transactions, thus preventing double spending.

After generating the membership and non-membership proofs, the user compiles the transaction using the zkVM. The zkVM is responsible for generating the necessary ZKPs, which validate the transaction without revealing sensitive details. The compiled transaction, along with the proofs, is then submitted to the sequencer for verification.

The role of the sequencer: Once the transaction is submitted, the sequencer verifies the ZKPs to confirm that the transaction is valid. If the proofs are verified successfully, the sequencer updates both the private and public states to reflect the transaction. This includes updating the nullifier tree with the newly generated nullifier, ensuring that the UTXO is marked as spent and cannot be reused.

Example: Alice sending tokens to Bob

Consider an example where Alice wants to send 5 Nescience tokens to Bob using a private execution. Alice selects a UTXO from her private state that contains 5 Nescience tokens. She generates the necessary membership and non-membership proofs, ensuring that her UTXO exists in the system and has not been previously spent. Alice then creates a nullifier by hashing the UTXO and compiles the transaction with the zkVM.

Once Alice submits the transaction, the sequencer verifies the proofs and updates the blockchain by adding the nullifier to the nullifier tree and recording the transaction details. This ensures that Alice’s UTXO is marked as spent and cannot be used again, while Bob receives the 5 tokens.

The importance of nullifiers

Nullifiers are a key mechanism in preventing double spending. By marking consumed UTXOs as spent and tracking them in the nullifier tree, NSSA ensures that once a UTXO is used in a transaction, it cannot be reused in any future transactions. This process is fundamental to maintaining the integrity and security of the blockchain, as it guarantees that assets are only spent once and prevents potential attacks on the system.

In conclusion, the process of consuming UTXOs in NSSA combines cryptographic proofs, nullifiers, and ZKPs to ensure that transactions are secure, confidential, and free from the risks of double spending.

C. Cryptographic primitives in NSSA

In the NSSA, cryptographic primitives are the foundational elements that ensure the security, privacy, and efficiency of the state separation model. These cryptographic tools enable private transactions, secure data management, and robust verification processes across both public and private states. The architecture leverages a wide range of cryptographic mechanisms, including advanced hash functions, key management systems, tree structures, and ZKPs, to safeguard user data and maintain the integrity of transactions.

Cryptographic hash functions play a pivotal role in concealing UTXO details, generating nullifiers, and constructing sparse Merkle trees, which organize and verify data efficiently within the network. Key management and address generation further enhance the security of user assets and identity, ensuring that only authorized users can access and control their holdings.

The architecture also relies on specialized tree structures for organizing data, verifying the existence of UTXOs, and tracking nullifiers, which prevent double spending. Additionally, Nescience features a privacy-preserving zero-knowledge virtual machine (zk-zkVM), which allows users to prove the correctness of an execution without disclosing sensitive information. This enables private transactions and maintains confidentiality across the network.

As Nescience evolves, optional cryptographic mechanisms such as multi-party computation (MPC) may be integrated to enhance synchronization across privacy levels. This MPC-based synchronization mechanism is still under development and under review for potential inclusion in the system. Together, these cryptographic primitives form the backbone of Nescience’s security architecture, ensuring that users can transact and interact privately, securely, and efficiently.

In the following sections, we will explore each of these cryptographic components in detail, beginning with the role of hash functions.

a) Hash functions in Nescience

Hash functions are a foundational element of Nescience’s cryptographic framework, serving multiple critical roles that ensure the security, privacy, and efficiency of the system. One of the primary uses of hash functions in Nescience is to conceal sensitive details of UTXOs by converting them into fixed-size hashes. This process allows UTXO details to remain private, ensuring that sensitive information is not directly exposed on the blockchain, while still enabling their existence and integrity to be verified. Hashing the UTXO details allows the actual data to remain confidential, with the hashes stored in a global tree structure for efficient management and retrieval.

Additionally, hash functions are essential for generating nullifiers, which play a crucial role in preventing double spending. Nullifiers are created by hashing UTXOs and are used to mark them as spent, ensuring that they cannot be reused in subsequent transactions. These nullifiers are stored in a nullifier tree, and each transaction must prove that its UTXO’s nullifier is not already present in the tree before it can be processed. This ensures that the UTXO has not been spent before, maintaining the integrity of the transaction process.

Hash functions are also vital in the construction of sparse Merkle trees, which provide an efficient and secure method for verifying data within the blockchain. Sparse Merkle trees enable quick and reliable proofs of membership and non-membership, making them essential for verifying both UTXOs and nullifiers. By using hash functions to build these trees, Nescience can ensure the integrity of the data, as any tampering with the data would result in a change in the hash, making the manipulation detectable.

Another critical consideration in Nescience is the compatibility of hash functions with ZKPs. ZK-friendly hash functions are optimized for efficient computation within the constraints of ZK circuits, ensuring that they do not become a bottleneck in the proof generation or verification process. These hash functions maintain strong cryptographic security properties while enabling efficient computations in ZKP systems, which is essential for maintaining privacy and integrity within the ZK framework.

The primary advantage of using hash functions in Nescience is their ability to ensure that transaction details remain private while still allowing for verification of their validity. Furthermore, by integrating hash functions into Merkle trees, the blockchain data becomes tamper-proof, enabling quick and efficient verification processes that uphold the system’s security and privacy standards.

Use case: How to use the Pedersen hash to create the UTXO commitment

As mentioned in the UTXOs in private executions section, the user broadcasts the encrypted UTXOs to the network, along with a commitment to the output UTXOs using Pedersen hashes. The Pedersen hash is used to create the UTXO commitment. The Pedersen hash is a homomorphic commitment scheme that allows secure commitments while maintaining privacy and enabling proofs of correctness in transactions. The commitment formula is as follows:

$Commitment = C(UTXO,RCK) =g^{UTXO}⋅h^{RCK}$

In this formula, $g$ and $h$ are two generators of a cryptographic group where no known relationship exists between them. This ensures that the commitment is secure and computationally infeasible to reverse or manipulate without knowing the original UTXO components. The random number $RCK$ adds an additional layer of security by blinding the UTXO's contents, ensuring that the commitment doesn't leak any information about the underlying data.

Importance of homomorphic commitments

It is essential to use a homomorphic commitment like the Pedersen commitment for UTXOs because it allows for the verification of important properties in transactions, such as ensuring that the total input value of a transaction equals the total output value. This balance is crucial for preventing the unauthorized creation of funds or d discrepancies in transactions. A homomorphic commitment enables these proofs because of its additive properties. Specifically, the exponents in the commitment formula are additive, meaning that commitments can be combined and verified without revealing the individual components. For instance, if you have two UTXOs with commitments $C(UTXO_1,RCK_1)$ and $C(UTXO_2,RCK_2)$ , you can combine them and verify that the resulting commitment is valid without exposing the actual amounts.

This capability is leveraged through a modified version of the Schnorr protocol, which is used in conjunction with the Pedersen hash to verify the correctness of transactions. The Schnorr protocol allows users to prove, without revealing the actual values, that the sum of inputs equals the sum of outputs, ensuring that no funds are created or lost in the transaction.

Limitations of standard cryptographic hashes

Standard cryptographic hash functions, such as SHA-256, are not suitable for this purpose because they lack the algebraic structure needed for homomorphic properties. In particular, while SHA-256 provides strong security for general hashing purposes, it does not allow the additive properties that are required to perform the type of ZKPs used in Nescience for UTXO commitments. This is why the Pedersen hash is preferred, as it enables the secure and private execution of transactions while allowing for balance verification and other critical proofs.

Conclusion

By using homomorphic commitments like the Pedersen hash, NSSA ensures that UTXOs can be securely committed and validated without exposing sensitive information. The random component (RCK) adds an additional layer of security, and the additive properties of the Pedersen commitment enable powerful ZKPs that maintain the integrity of the system.

b) Key management and addresses in Nescience

NSSA utilizes different cryptographic schemes, such as public key encryption and digital signatures, to ensure secure private executions through the exchange of UTXOs. These schemes rely on a structured set of cryptographic keys, each serving a specific purpose in maintaining privacy, security, and control over assets. Here's a breakdown of the keys used in Nescience:

I. Spending key

The spending key is the fundamental secret key in NSSA, acting as the primary control mechanism for a user’s UTXOs and other digital assets. It plays a critical role in the cryptographic security of the system, ensuring that only the rightful owner can authorize and spend their assets.

Role of the spending key: The spending key is responsible for generating the user’s private keys, which are used in various cryptographic operations such as signing transactions and creating commitments. This hierarchical relationship means that the spending key sits at the root of a user’s key structure, safeguarding access to all associated private keys and, consequently, to the user’s assets. In Nescience’s privacy-focused model, the spending key is never exposed or shared outside the user’s control. Unlike other keys, it does not interact with the public state, kernel circuits, or even the ZKP system. This isolation ensures that the spending key remains completely private and inaccessible to external entities. By keeping the spending key separate from the operational aspects of the network, Nescience minimizes the risk of key leakage or compromise.
Generation and security of the spending key: The spending key is generated randomly from the scalar field, a large mathematical space that ensures uniqueness and cryptographic strength. This randomness is crucial because it prevents attackers from predicting or replicating the key, thereby safeguarding the user’s assets from unauthorized access: it is computationally infeasible for an attacker to guess or brute-force the key. Once the spending key is generated, it is securely stored by the user, typically in a hardware wallet or another secure storage mechanism that prevents unauthorized access.
Spending UTXOs with the spending key: The spending key’s primary function is to authorize the spending of UTXOs in private transactions. When a user initiates a transaction, the spending key is used to generate the necessary cryptographic proofs and signatures, ensuring that the transaction is valid and originates from the rightful owner. However, even though the spending key generates these proofs, it is never directly exposed during the transaction process. Instead, derived private keys handle the operational aspects while the spending key remains secure in the background. For example, when Alice decides to spend a UTXO in a private execution, her spending key generates the required private keys that will sign the transaction and ensure its validity. However, the spending key itself never appears in any public state or transaction data, preserving its confidentiality.
Ensuring security through isolation: One of the key security principles of the spending key is its isolation from the network. Since it never interacts with public-facing elements, such as the public state or kernel circuits, the risk of exposure is significantly reduced. This isolation ensures that even if other parts of the cryptographic infrastructure are compromised, the spending key remains protected, preventing unauthorized spending of UTXOs.

In summary, the spending key in Nescience is a powerful and carefully guarded element of the cryptographic system. It is the root key from which other private keys are derived, allowing users to spend their UTXOs securely and privately. Its isolation from the public state and its random generation from a secure scalar field ensures that the spending key remains protected, making it a cornerstone of security in NSSA.

II. Private keys

In Nescience, the private key is an essential cryptographic element responsible for facilitating various secure operations, such as generating commitments and signing transactions. While the spending key plays a foundational role in safeguarding access to UTXOs and assets, the private keys handle the operational aspects of transactions and cryptographic proofs. The private key consists of three critical components: ${private}_{key}.rsd$ , ${private}_{key}.rcm$ , and ${private}_{key}.sig$ , each serving a distinct purpose within the Nescience cryptographic framework.

${private}_{key}.rsd$ (random seed): The random seed ( ${private}_{key}.rsd$ ) is the first and foundational component of the private key. It is a value randomly chosen from the scalar field, which ensures its cryptographic security and unpredictability. This seed is generated using a random number generator, making it virtually impossible to predict or replicate. The random seed is essential because it is used to derive the other two components of the private key. By leveraging a secure random seed, Nescience ensures that the entire private key structure is rooted in randomness, preventing external entities from guessing or deriving the key through brute-force attacks. The strength of the random seed ensures the overall security of the private key and, consequently, the integrity of the user's transactions and commitments.
${private}_{key}.rcm$ (random commitment): The random commitment component ( ${private}_{key}.rcm$ ) is a crucial part of the private key used specifically in the commitment scheme. It acts as a blinding factor, adding a layer of security to commitments made by the user. The ${private}_{key}.rcm$ value is also drawn from the scalar field and is used to ensure that the commitment to any UTXO or other sensitive data remains confidential. The commitment scheme in Nescience requires the use of ${private}_{key}.rcm$ to create cryptographic commitments that bind the user to specific data (such as UTXO details) without revealing the actual data. The role of ${private}_{key}.rcm$ is to ensure that these commitments are non-malleable and secure, preventing anyone from modifying the committed data without detection. For instance, when Alice commits to a UTXO, ${private}_{key}.rcm$ is used to generate a Pedersen commitment that ensures the UTXO details are hidden but can still be verified cryptographically. This means that even though the actual UTXO details are concealed, their existence and integrity can be proven.
${private}_{key}.sig$ (signing key for transactions): The signing key ( ${private}_{key}.sig$ ) is the third and final component of the private key, used primarily for signing transactions. One possible approach is that Nescience employs Schnorr signatures, a cryptographic protocol known for its efficiency and security. In this case, the ${private}_{key}.sig$ component would generate Schnorr signatures that are used to authenticate transactions, ensuring that only the rightful owner of the private key can authorize the spending of UTXOs. Schnorr signatures are important as they provide a secure and non-repudiable method of verifying that a transaction was initiated by the legitimate owner of the assets. When Alice signs a transaction using her ${private}_{key}.sig$ , the corresponding public key allows others to verify that the transaction was indeed signed by Alice, without revealing her private key. This verification process ensures that all transactions are legitimate and prevents unauthorized entities from forging transactions or spending assets they do not control. Even if an attacker gains access to the signed transaction, they cannot reverse engineer the ${private}_{key}.sig$ , ensuring the security of Alice's future transactions.

Robustness of private keys in Nescience

Despite the critical role of the private key in the operation of NSSA, the system is designed to maintain security even in the event that the private key is compromised. This resilience is achieved through the integrity of the spending key, which is never exposed in the process of signing or committing. The spending key acts as the ultimate safeguard, ensuring that even if a private key component is compromised, the attacker cannot access or spend the user's assets without control over the spending key.

The architecture’s design, where private keys handle operational tasks but rely on the spending key for ultimate control, ensures a layered approach to security. This way, the system can mitigate the damage of a compromised private key by maintaining the inviolability of the user's assets.

Conclusion

In summary, the private key in Nescience consists of three interrelated components that together ensure secure transaction signing, commitment creation, and the protection of user data. The ${private}_{key}.rsd$ serves as the root from which the other key components are derived, ensuring randomness and security. The ${private}_{key}.rcm$ plays a crucial role in generating commitments, while ${private}_{key}.sig$ provides the signing capability needed for transaction authentication. Together, these components enable users to engage in private, secure transactions while preserving the integrity of their assets, even in the face of potential key compromise.

III. Public keys

Public keys in Nescience serve as the user's interface with the network, allowing for secure interaction and verification without exposing the user's private keys. Derived directly from the user's private keys, public keys play a crucial role in enabling cryptographic operations such as transaction verification, commitment schemes, and deterministic computations. The public key components correspond to their private key counterparts and ensure that transactions and commitments are securely processed and validated across the network.

${public}_{key}.sig$ (verifying Schnorr signatures):

The ${public}_{key}.sig$ is derived from the signing component of the private key ( ${private}_{key}.sig$ ) and is used for verifying Schnorr signatures. Schnorr signatures are used to authenticate transactions, ensuring that they have been signed by the legitimate owner of the private key. This public key is essentially a verification key, allowing others in the network to confirm that a specific transaction was indeed authorized by the user. When a transaction is broadcast to the network, ${public}_{key}.sig$ enables any participant to verify that the transaction’s signature matches the user’s private key without needing access to the private key itself. This mechanism prevents forgeries as only the legitimate owner with access to the private key can generate a valid Schnorr signature. For example, if Alice sends a transaction, she signs it with her private key ( ${private}_{key}.sig$ ). Bob, or any other network participant, can use Alice’s ${public}_{key}.sig$ to verify the signature. If the signature is valid, Bob can be confident that the transaction was authorized by Alice and not by an imposter.

${public}_{key}.rcm$ (commitment schemes)

The ${public}_{key}.rcm$ is derived from the commitment component of the private key ( ${private}_{key}.rcm$ ). It is used in the commitment schemes that underpin Nescience’s privacy-preserving architecture. Commitments are a crucial cryptographic technique that allows users to commit to a piece of data (such as a UTXO) without revealing the actual data, while still enabling proof of its integrity and existence. In Nescience, the ${public}_{key}.rcm$ is used as part of the Pedersen commitment scheme, where it functions as a public commitment to certain transaction details. Even though the actual values are hidden (thanks to the private key component), the commitment can still be verified by other network participants using ${public}_{key}.rcm$ . This enables secure and private transactions while maintaining the ability to verify that commitments are consistent with the original data. For instance, when Alice commits to a UTXO, she uses her private key to generate the commitment, and the ${public}_{key}.rcm$ is available to others to verify the commitment’s validity without revealing the underlying details.

${public}_{key}.sk(prf)$ (pseudorandom function)

The ${public}_{key}.sk(prf)$ is derived from a random field element within the private key and is used to generate the pseudorandom function (PRF) associated with the user's account. This PRF is essential for producing deterministic outputs based on the user’s keys and transaction data while ensuring that these outputs are unique to the user and cannot be predicted or replicated by others. The PRF is crucial in scenarios where the user needs to derive unique identifiers or values that are tied to their specific account, ensuring that these values remain consistent across different transactions or interactions without revealing sensitive information. For example, ${public}_{key}.sk(prf)$ may be used in generating deterministic yet secure addresses or transaction references, which can be linked to the user’s activity in a controlled manner. By using ${public}_{key}.sk(prf)$ , Nescience ensures that certain operations, like generating addresses or computing deterministic transaction outcomes, remain both private and cryptographically secure. The public key’s role in this process is to maintain consistency in these outputs while preventing unauthorized parties from reverse engineering the associated private keys or transaction data.

Summary

Public keys in Nescience are essential for secure interactions within the network. " ${public}_{key}.sig$ " allows others to verify that transactions were signed by the legitimate owner, ensuring the authenticity of every operation. " ${public}_{key}.rcm$ " enables secure and private commitment schemes, allowing participants to commit to transaction details without revealing sensitive information. Finally, " ${public}_{key}.sk(prf)$ " powers deterministic outputs through a pseudorandom function, ensuring that user-specific data remains consistent and secure throughout various transactions. Together, these public key components facilitate privacy, security, and trust within NSSA, enabling seamless interactions while safeguarding user data.

IV. Viewing key

The viewing key in NSSA is a specialized cryptographic key that allows a user to decrypt both incoming and outgoing transactions associated with their account. This key is designed to offer a degree of transparency to the user, enabling them to view the details of their transactions without compromising the security of their assets or granting control over those assets.

Role of the viewing key: The primary function of the viewing key is to provide visibility into transaction details while maintaining the integrity of private, shielded, or deshielded transactions. It enables the user to see the specifics of the transactions they are involved in—such as amounts transferred, asset types, and metadata—without exposing the sensitive transaction data to the broader network. For instance, if Alice has executed a private transaction with Bob, her viewing key allows her to decrypt and review the details of the transaction, ensuring that everything was processed correctly. This ability to audit her own transactions helps Alice maintain confidence in the integrity of her private interactions on the blockchain.
Security considerations: Despite its utility, the viewing key must be handled with care as its exposure could potentially compromise the user’s privacy. Although possessing the viewing key does not provide the ability to spend or sign transactions (that authority remains strictly with the spending key and private keys), it does allow anyone with access to the viewing key to decrypt the details of the user’s private transactions. This means that if the viewing key is leaked or stolen, the privacy guarantees of Nescience’s private, shielded, and deshielded executions could be undermined. Specifically, the viewing key could be used to link various transactions, breaking the unlinkability of private transactions. For example, an attacker with access to the viewing key could decrypt past and future transactions, exposing the relationships between different parties and transaction flows. To mitigate this risk, Nescience recommends that users treat their viewing key with the same level of protection as their private keys. It should be stored securely in encrypted hardware wallets or other secure storage solutions to prevent unauthorized access.
Balancing privacy and transparency: The viewing key provides an essential balance between privacy and transparency in NSSA. While it ensures that users can monitor their transaction history and verify the details of their private transactions, it does so without compromising the control of their funds. This allows users to maintain a transparent view of their interactions while keeping their assets secure. For example, if Alice is using shielded execution to transfer assets, her viewing key enables her to audit the transaction without allowing anyone else, including Bob or external observers, to see the specific details unless they also have access to the viewing key. Moreover, since the viewing key does not grant signing or spending authority, even if it were exposed, an attacker would still not be able to manipulate the user’s assets. However, to maintain the unlinkability and confidentiality of private transactions, the viewing key must be kept secure at all times.
Protecting transaction unlinkability: In private transactions, unlinkability is one of the core privacy guarantees. This property ensures that individual transactions cannot be correlated with each other or linked to the same user unless that user chooses to reveal the connection. The viewing key must be carefully protected to preserve this unlinkability, as its compromise could allow someone to map out a user’s private transaction history. For instance, in deshielded transactions, the viewing key allows the user to see which private UTXOs were consumed and how the public state was modified. If the viewing key is compromised, an attacker could potentially link private UTXOs across multiple transactions, unraveling the user’s privacy.

Conclusion

The viewing key in Nescience is a powerful tool for providing insight into both incoming and outgoing transactions without granting control over assets. It allows users to decrypt and verify their transaction details, maintaining transparency in their interactions. However, due to its potential to compromise privacy if exposed, the viewing key must be handled with great care. Proper security measures are necessary to protect the viewing key, ensuring that the unlinkability of private, shielded, and deshielded transactions remains intact. In this way, the viewing key offers a crucial balance between privacy and transparency within the Nescience ecosystem.

V. Ephemeral key

The ephemeral key is generated using a combination of the sender’s spending key and the UTXO's nullifier, ensuring that the key is unique to each transaction. The process can be informally described as follows:

Ephemeral key generation
Let $\rho$ denote the nullifier of the UTXO being consumed in the transaction. The sender uses the receiver’s public key component ${public}_{key}.sk(prf)$ , which is derived from the receiver’s private key, to compute an ephemeral secret key ( $esk$ ). The computation is based on the nullifier $\rho$ and a base value:

$esk = {public}_{key}.sk(prf((0,0,0,0) || \rho)$ This formula binds the secret key to the specific transaction, leveraging the receiver’s cryptographic identity and the unique properties of the UTXO being spent.

Deriving the ephemeral public key
After computing the ephemeral secret key ( $esk$ ), the next step is to derive the corresponding ephemeral public key (epk). This is done using the Key Agreement Protocol's DerivePublic algorithm, which generates the public key associated with the shared secret key. The ephemeral public key is computed as:

$epk = KA.DerivePublic(esk, gd)$

Here, ( $gd$ ) is the diversifier address associated with the receiver’s account. The diversifier address is computed from the receiver’s account using the DiversifierHash function:

$gd = receiver.DiversifierHash(d)$

The diversifier ( $d$ ) is a random value selected by the sender to add randomness to the process. This diversifier ensures that even if a single receiver is involved in multiple transactions, the derived keys remain distinct for each transaction. The value ( $d$ ) is included in the transaction note for transparency and reproducibility.

Establishing the shared secret
The shared secret, used to encrypt the transaction details, is derived from the key agreement between the sender’s ephemeral key and the receiver’s viewing key. Any party possessing the receiver’s viewing key can use it in conjunction with the ephemeral key to compute the shared secret, which is then used to decrypt the transaction. This ensures that only the intended recipient (or anyone with their viewing key) can access the transaction details.

Key components and protocol

The formal protocol for generating ephemeral keys closely follows this informal description but involves additional intermediate steps for converting values to binary sequences to fit implementation requirements. These steps are essential for ensuring compatibility with cryptographic algorithms used in NSSA. The protocol uses the following key components:

Nullifier ( $\rho$ ): Ensures that the ephemeral key is tied to the specific UTXO being consumed, preventing reuse of the key in future transactions.
Receiver’s public key ( ${public}_{key}.sk(prf)$ : Establishes the receiver's identity in the key generation process, ensuring that the shared secret can only be derived by the intended party.
Diversifier ( $d$ ): Adds randomness to the transaction, ensuring that keys remain unique across different transactions involving the same receiver.

The end result is an ephemeral key system that provides strong cryptographic guarantees for transaction privacy, leveraging key agreement protocols and secure cryptographic primitives to prevent unauthorized access to sensitive transaction data.

Conclusion

The ephemeral key in Nescience is a critical element for maintaining transaction confidentiality. It facilitates a secure key agreement between the sender and the receiver, allowing for the encryption of transaction details with a shared secret that can only be derived by the intended recipient. By incorporating the nullifier, receiver's public key, and diversifier address, the ephemeral key ensures that transaction privacy is preserved while preventing unauthorized access to transaction information, even in a complex, multi-party blockchain environment.

VI. Nescience addresses

Nescience’s dual address system is a core component of its privacy-focused architecture, designed to balance transparency and confidentiality across different types of transactions. The architecture provides each user or smart contract with both public addresses and private addresses, allowing them to participate in both open and confidential activities on the blockchain.

a) Public addresses

Public addresses in Nescience are visible to all participants on the network and reside within the public state. These addresses are essential for engaging in transparent and verifiable interactions, such as sending tokens or invoking smart contracts that are meant to be publicly auditable. Public addresses serve as the interface for users who need to engage with the transparent elements of the system, including public transactions or smart contracts that require public access.

They are analogous to traditional blockchain addresses seen in systems like Ethereum or Bitcoin, where every participant can see the address and the transactions associated with it. For example, when Alice wants to receive tokens from Bob in a public transaction, she can provide her public address, allowing Bob to send the tokens transparently. Anyone on the network can verify the transaction, providing accountability and trust in the public state.

Because public addresses are visible and auditable, they are typically used for interactions where privacy is not a concern or where transparency is desirable. This could include simple token transfers, public contract calls, or interactions with dapps that require public accountability, such as voting or governance systems.

b) Private addresses

In contrast, private addresses are designed for confidentiality and are not visible onchain. These addresses are used exclusively for private transactions and executions, ensuring that sensitive details—such as the sender, receiver, or amount transferred—remain hidden from the public state. Private addresses are a key feature of Nescience’s private, shielded, and deshielded execution models, where preserving the confidentiality of participants is crucial.

Users can generate an unlimited number of private addresses using their private keys. This flexibility allows users to compartmentalize their interactions, giving them the ability to provide different private addresses to different parties. For instance, Alice could create a unique private address for each entity she interacts with, thereby ensuring that her transactions remain isolated and difficult to trace. This feature enhances privacy by preventing any direct linkage between different transactions or activities associated with a single user.

Private addresses are not tied to the public state and are only accessible through the user’s private key infrastructure. Transactions involving private addresses are conducted within the confines of the private state and are only decrypted by the intended participants. For example, when Alice sends tokens to Bob using a private address, the details of that transaction remain confidential, accessible only to Alice and Bob, unless they choose to reveal it.

Role of the viewing key in private addresses: A key feature of Nescience’s private address system is the viewing key, which allows users to decrypt any transaction involving their private addresses. This capability provides oversight and transparency into the user’s private transactions, ensuring that they can monitor their own activity without exposing the details to the public. The viewing key does not compromise the security of the user's assets as it does not grant spending or signing authority. However, it does allow the user to audit and verify the accuracy of their private transactions, ensuring that everything proceeds as expected. For instance, Alice can use her viewing key to review the details of a private transaction she conducted with Bob, ensuring that the correct amount was transferred and that the transaction was properly processed. This functionality is critical for users who want to maintain control over their private interactions while still benefiting from transparency into their transaction history. The ability to generate multiple private addresses and decrypt them with the viewing key ensures that users can maintain compartmentalized privacy without sacrificing oversight.

Summary

Nescience’s dual address system—comprising public and private addresses—provides users with the flexibility to engage in both transparent and confidential transactions. Public addresses are visible onchain and are used for open, public interactions that require accountability and auditability. In contrast, private addresses are invisible onchain and are used for confidential transactions, enhancing privacy and security.

By allowing users to generate multiple private addresses, Nescience gives individuals control over the visibility of their transactions. Combined with the viewing key’s ability to decrypt transactions involving private addresses, the system ensures that users can maintain transparency over their private transactions without exposing sensitive information to the public state. This dual-address approach enables users to seamlessly switch between public and private interactions depending on their needs, providing a robust framework for both privacy and transparency in NSSA.

VII. Conclusion

Key management in NSSA is a carefully designed system that strikes an optimal balance between security, privacy, and flexibility. The architecture’s hierarchical structure, with distinct roles for the spending key, private keys, and public keys, ensures that users retain full control over their assets while maintaining the integrity of their transactions. The spending key, as the root of security, provides unassailable control over the user's UTXOs and assets, ensuring that only the rightful owner can authorize spending. Private keys, derived from the spending key, enable users to engage in cryptographic operations such as signing transactions and generating commitments without exposing sensitive information to the network.

The viewing key adds another layer of transparency, allowing users to decrypt and review their transactions without compromising their authority over their assets. While it provides a window into transaction history, the viewing key does not grant spending power, preserving the critical separation between visibility and control.

The dual system of public and private addresses gives users the flexibility to navigate between open, transparent transactions and confidential, privacy-protected activities. Public addresses allow users to engage in verifiable, public interactions while private addresses enable compartmentalized, secure transactions that remain hidden from the public eye. This dual-address framework ensures that users can seamlessly adapt to different privacy requirements, whether they are participating in public dapps or conducting sensitive financial operations.

Overall, Nescience’s cryptographic infrastructure is designed to empower users to engage confidently in both transparent and confidential activities. By providing flexible, secure key management and address systems, Nescience ensures that users can fully participate in the blockchain ecosystem without compromising their privacy or control. The architecture supports the nuanced needs of modern blockchain users, who require both the transparency of public interactions and the security of private transactions, all while maintaining the integrity and confidentiality of their assets.

c) Trees in NSSA

Trees in NSSA serve as verifiable databases, essential for maintaining privacy and security. Different types of trees are used for various purposes:

Global state tree: The global state tree is a single, public tree that holds all public assets and storage information. It acts as a central repository for all publicly accessible data on the blockchain. By organizing this data in a Merkle tree structure, the Global State Tree allows for efficient and secure verification of public information.
Hashed UTXO tree: The hashed UTXO tree is a public tree that contains hashes of all created UTXOs. When users wish to consume a UTXO, they provide a membership proof to demonstrate that the UTXO exists within this tree. This process ensures that only valid and existing UTXOs can be spent, maintaining the integrity of transactions. In fact, users generate membership proofs that verify the presence of specific UTXOs in the tree without revealing their actual data. The benefit here is that the Merkle tree structure allows for quick and efficient verification of UTXO existence.
UTXO trees (private states): Each user or smart contract has its private state stored in UTXO trees. These trees are kept as plaintext on the client’s local system (off-chain), ensuring privacy as sensitive information remains confidential. The private state includes all UTXOs owned by the user or the smart contract, and these are not directly exposed to the public blockchain. For instance, users have full control over their private state, which is not visible to other participants in the network.

In conclusion, the tree structures enable efficient verification of transaction validity without compromising privacy. By using Merkle trees, Nescience ensures that any tampering with the data can be easily detected. The efficient structure of these trees supports the scalability of the architecture, allowing it to handle a large number of transactions and data entries. By leveraging different types of trees, Nescience ensures efficient and secure management of both public and private states.

## d) Nullifier tree in Nescience

The nullifier tree is a fundamental component of NSSA, designed to prevent double spending by securely tracking all consumed UTXOs. This tree acts as a public ledger of spent UTXOs, ensuring that once a UTXO is consumed in a transaction, it cannot be reused in future transactions.

The primary function of the nullifier Tree is to store the nullifiers of all consumed UTXOs. By recording the nullifiers in a public tree, the system ensures that each UTXO is spent only once, thereby safeguarding the integrity of the entire network.

Ensuring non-membership and preventing double spending Before a user can consume a UTXO in a transaction, they must provide a non-membership proof. This proof demonstrates that the UTXO’s nullifier does not already exist in the Nullifier Tree, proving that the UTXO has not been spent before. If the UTXO’s nullifier is found in the tree, the system will reject the transaction, preventing double spending. The non-membership proof ensures that users cannot attempt to spend the same UTXO in multiple transactions. This mechanism is critical for maintaining the security and reliability of NSSA. The tree structure, which is typically built using a cryptographic tree like a Merkle tree, allows for efficient verification of nullifiers. Verifiers can quickly check whether a nullifier is present or absent in the tree, ensuring that each UTXO is only spent once.
Nullifier tree structure and operation The nullifier tree is likely structured as a Merkle tree, which is a cryptographic binary tree where each node represents the hash of its child nodes. This structure allows for efficient storage and verification of large sets of nullifiers as only the root hash of the tree needs to be stored on the blockchain. When a new nullifier is added to the tree, the tree is recalculated, and the root hash is updated. This process ensures that all consumed UTXOs are securely recorded. Each time a transaction consumes a UTXO, the nullifier is added to the Nullifier Tree, and the tree is updated to reflect this new entry. To verify that a UTXO has not been double spent, verifiers can use the tree’s root hash and a proof of inclusion or exclusion (membership or non-membership proof) to check whether the nullifier is present in the tree. For example, if Alice wants to spend a UTXO, she must prove that the nullifier associated with that UTXO is not already in the Nullifier Tree. She generates a non-membership proof that shows her nullifier is not recorded in the tree, and the transaction is allowed to proceed. Once the transaction is completed, the nullifier is added to the tree, ensuring that the UTXO cannot be used again.

Conclusion The Nullifier Tree is a crucial element of Nescience's security. By recording all consumed UTXOs and ensuring that nullifiers are unique, the tree prevents double spending and maintains the integrity of the blockchain. The non-membership proof mechanism guarantees that every transaction is validated against the tree. This structure supports the scalability and security of NSSA, providing a reliable method for verifying the validity of transactions while preventing malicious behavior.

e) Recursive-friendly privacy-preserving zk-zkVM

The development of the zk-zkVM in Nescience is a work in progress, as the architecture continues to evolve to support privacy-preserving transactions and efficient ZKP generation. The goal of the zk-zkVM is to seamlessly integrate with the Nescience state-separation architecture, ensuring that private transactions remain confidential while allowing the network to verify their validity without compromising privacy.

Currently, we are exploring and testing several existing zkVMs to identify the most suitable platform for our needs. Our focus is on finding a zkVM that not only supports the core features of Nescience, such as state separation and privacy, but also provides the efficiency and scalability required for a decentralized system. Once a suitable zkVM is chosen, we will begin implementing advanced privacy features on top of it, including support for confidential transactions, selective disclosure, and recursive proof aggregation.

The integration of these privacy-preserving features with an existing zkVM will enable Nescience to fully employ its state-separation architecture, ensuring that users can conduct private transactions with robust security and scalability. This approach will allow us to leverage the strengths of proven zkVM technologies while enhancing them with the unique privacy and state-separation capabilities that Nescience requires.

Privacy-preserving features: At its core, the zk-zkVM is designed with privacy in mind. One of the zk-zkVM’s standout privacy features is selective disclosure, which allows users to reveal only specific details of a transaction as needed. For example, a user could disclose the transaction amount while concealing the identities of the participants. The zk-zkVM employs advanced encryption techniques to protect this sensitive data. All transaction data is encrypted before being stored on the blockchain, so even if the data is intercepted, it cannot be deciphered without the appropriate decryption keys. Another of the crucial privacy-preserving features is the support for confidential transactions. Only the parties involved in the transaction can access the encrypted data. Furthermore, the zk-zkVM supports verifiable encryption, a powerful capability that allows encrypted data to be included in ZKPs without needing to decrypt it. This ensures that transaction details remain private while their correctness can still be proven.
Lightweight design for accessibility: The zk-zkVM is being designed to be lightweight and efficient, enabling it to run on standard consumer-grade hardware. This makes it accessible to a wide range of users without requiring specialized equipment or significant computational resources.
Faster proving time: To maintain a seamless user experience, especially during high transaction volumes, the zk-zkVM is being optimized for fast proving times. Fast proof generation is particularly important for ensuring that the system remains usable during periods of peak activity, preventing bottlenecks and maintaining the fluidity of the network.
Recursive-friendly operations: One of the most advanced features of the zk-zkVM will be its support for recursive operations. Recursion enables the aggregation of multiple proofs into a single proof, improving efficiency on both the client and sequencer sides of the network.
Client-side recursion (batch processing): When a single transaction involves multiple executions, each requiring its own ZKP, these individual proofs can be recursively aggregated before being sent to the sequencer. This reduces the overall data transmitted, enhancing the efficiency of the transaction process by compressing multiple proofs into a single package.
Sequencer-side recursion (reduced redundancy): The sequencer, which is responsible for processing transactions and creating verifiable blocks, collects transactions containing aggregated proofs. These proofs are further merged into a single comprehensive proof, ensuring that all transactions within a block are validated collectively. This process reduces redundancy and optimizes the blockchain’s efficiency by minimizing the size and complexity of the proofs required for verification.
Developer-friendly language: To foster widespread adoption and innovation within the Nescience ecosystem, the zk-zkVM would include a developer-friendly language. This high-level language simplifies the process of building applications that leverage state separation and privacy-preserving transactions. The language should offer extensive support for modular design, APIs, and SDKs, enabling developers to integrate their applications with the zk-zkVM more easily. By lowering the barrier to entry, Nescience encourages innovation and helps expand the range of privacy-preserving applications that can be built on its platform.

Conclusion

The zk-zkVM in Nescience is a powerful and versatile virtual machine that embodies the principles of privacy, efficiency, and scalability. By supporting ZKPs and integrating with advanced privacy technologies like homomorphic encryption. Its lightweight design allows it to run efficiently on standard hardware, promoting decentralization, and its recursive operations further enhance the system's scalability. With its developer-friendly language and fast proving times, the zk-zkVM is positioned as a key component in fostering the growth and adoption of privacy-preserving blockchain applications.

f) MPC-based synchronization mechanism (under review)

Nescience is developing an MPC-based synchronization mechanism to balance privacy and fairness between public and private execution types. This mechanism extracts common information from encrypted UTXOs without revealing private details, ensuring privacy and preventing UTXO linkage to users or specific transactions. It guarantees that public and private executions remain equitable, with the total input equaling the public output.

The mechanism employs MPC protocols to perform computations privately, ZKPs to verify correctness, and cryptographic protocols to secure data during synchronization. This ensures a consistent and fair environment for all users, regardless of their chosen privacy level. Currently, this feature is under development and review for potential inclusion depending on the research output and compatibility.

D. Future plans for Nescience

Nescience is committed to continuously evolving its architecture to ensure scalability, privacy, and security in a growing blockchain landscape. One of the primary goals is to integrate the zk-zkVM and the Nescience state-separation architecture into a fully functioning node, enabling efficient private transactions while maintaining network integrity.

Addressing scalability challenges: A key challenge facing Nescience is the increasing size of nullifier and hashed UTXO trees, which could impact network performance and scalability over time. To mitigate this, Nescience plans to adopt state-of-the-art scalable privacy techniques such as:
- Mutator sets: Dynamically adjusting data structures to manage the growth of the nullifier set efficiently.
- SNARK-based accumulators: Compressing data in a verifiable way to ensure that only relevant information is stored while maintaining cryptographic security.
- Pruning techniques: Periodically trimming unnecessary data from trees to maintain optimal size and performance, ensuring that the network scales logarithmically rather than exponentially as more transactions occur.

By implementing these approaches, Nescience aims to keep the size of its data structures manageable, ensuring that scalability does not come at the cost of performance or privacy.

Enhanced key management: Another critical focus for Nescience is improving key management to streamline operations and enhance security. The plan is to integrate the different keys used for signatures, addresses, UTXO encryption, and SNARK verification into a unified system. This integration will simplify key management for users while reducing the risk of security breaches caused by complex, disparate key systems. Nescience also plans to implement Hierarchical Deterministic (HD) keys, which allow users to derive multiple keys from a single seed, enhancing both security and usability. This approach reduces the complexity of managing multiple keys across various functions and provides an additional layer of protection for private transactions. Additionally, multi-signature schemes will be introduced, requiring multiple parties to authorize transactions. This feature increases security by reducing the likelihood of unauthorized access, ensuring that a single compromised key cannot lead to malicious transactions.
Integrating advanced cryptographic techniques: Nescience will integrate advanced cryptographic techniques, enhancing both privacy and scalability. Among these are:
- Homomorphic encryption: Allowing computations to be performed on encrypted data without the need to decrypt it, preserving privacy while enabling secure, complex data processing.
- Zero-knowledge rollups: Bundling multiple transactions into a single proof to reduce the amount of data processed and stored on the blockchain, significantly improving scalability without sacrificing security.

These cryptographic enhancements will ensure that Nescience can support a growing network while continuing to protect user privacy and maintaining high transaction throughput.

Long-term vision

The ultimate goal for Nescience is to deploy a fully operational node powered by zk-zkVM and the Nescience state-separation architecture. This node will handle complex, private transactions at scale while integrating all of the advanced cryptographic techniques outlined in the roadmap. Nescience aims to provide users with an infrastructure that balances privacy, security, and efficiency, ensuring the network remains resilient and capable of handling future demands.

By pursuing these future plans, Nescience is poised to not only address current challenges around scalability and key management but also lead the way in applying advanced cryptography to decentralized systems. This vision will help secure the long-term integrity and performance of the Nescience state-separation model as the blockchain grows and evolves.

References

[1] Nakamoto, S. (2008). Bitcoin: A Peer-to-Peer Electronic Cash System. Retrieved from https://bitcoin.org/bitcoin.pdf

[2] Sanchez, F. (2021). Cardano’s Extended UTXO accounting model. Retrieved from https://iohk.io/en/blog/posts/2021/03/11/cardanos-extended-utxo-accounting-model/

[3] Morgan, D. (2020). HD Wallets Explained: From High Level to Nuts and Bolts. Retrieved from https://medium.com/mycrypto/the-journey-from-mnemonic-phrase-to-address-6c5e86e11e14

[4] Wuille, P. (2012). Bitcoin Improvement Proposal (BIP) 32. Retrieved from https://github.com/bitcoin/bips/blob/master/bip-0032.mediawiki

[5] Sin7y Tech Review (29): Design Principles of Private Transactions in Aleo & Zcash. Retrieved from https://hackmd.io/@sin7y/rkxFXLkgs

[6] Sin7y Tech Review (33): Principles of private transactions and regulatory compliance issues. Retrieved from https://hackmd.io/@sin7y/S16RyFzZn

[7] Zcash Protocol Specification. Retrieved from https://zips.z.cash/protocol/protocol.pdf

[8] Anatomy of a Zcash Transaction. Retrieved from https://electriccoin.co/blog/anatomy-of-zcash

[9] The Penumbra Protocol: Notes, Nullifiers, and Trees. Retrieved from https://protocol.penumbra.zone/main/concepts/notes_nullifiers_trees.html

[10] Zero-knowledge Virtual Machine (ZKVM). Retrieved from https://medium.com/@abhilashkrish/zero-knowledge-virtual-machine-zkvm-95adc2082cfd

[11] What's a Sparse Merkle tree?. Retrieved from https://medium.com/@kelvinfichter/whats-a-sparse-merkle-tree-acda70aeb837

[12] Lecture 10: Accounts Model and Merkle Trees. Retrieved from https://web.stanford.edu/class/ee374/lec_notes/lec10.pdf

[13] The UTXO vs Account Model. Retrieved from https://www.horizen.io/academy/utxo-vs-account-model/

[14] Addresses and Value Pools in Zcash. Retrieved from https://zcash.readthedocs.io/en/latest/rtd_pages/addresses.html

Vac 101: Membership with Bloom Filters and Cuckoo Filters

2024-07-19T12:00:00.000Z

We examine two data structures: Bloom filters and Cuckoo filters.

Membership with Bloom Filters and Cuckoo Filters

The ability to efficiently query the membership of an element in a given data set is crucial. In certain applications, it is more important to output a result quickly than to have a 'perfect' result. In particular, false positives may be an acceptable tradeoff for speed. In this blog, we examine Bloom and Cuckoo data filters. Both of these filters are data structures that can be used for membership proofs.

Everyone is familiar with the process of creating a new account for various websites, whether it is an e-mail account or a social media account. Consider when you enter your desired username. Many sites provide real-time feedback, as you type, on the availability of a given string. In this scenario, it is necessary that the result is seemingly instant, regardless of the number of existing accounts. However, it is not important that the usernames that are flagged as unavailable are, in fact, in use. That is, it is sufficient to have a probabilistic check for membership.

Bloom filters and Cuckoo filters are data structures that can be used to accumulate data with a fixed amount of space. The associated filter $F$ for a digest of data $D$ can be queried to determine whether an element is (possibly) a member of $D$ :

0: The queried element is definitely not a member of digest $D$ .
1: The entry is possibly a member of the digest $D$ .

The algorithms associated with Bloom filters and Cuckoo filters, which we will discuss shortly, are deterministic. The possibility of false positives arises from the query algorithm.

Bloom filters

A Bloom filter is a data structure that can be used to accumulate an arbitrary amount of data with a fixed amount of space. Bloom filters have been a popular data structure for proof of non-membership due to their small storage size. Specifically, a Bloom filter consists of a binary string ${\bf{v}} \in \{0,1\}^n$ and $k$ hash functions $\{h_i: \{0,1\}^* \rightarrow \{0,\dots,n-1\}\}_{i=0}^{k-1}$ . We note that each hash function $h_i$ is used to determine an index of our binary string ${\bf{v}}$ to flip the associated bit to 1. The binary string ${\bf{v}}$ is initialized with every entry as 0. The hash functions do not need to be cryptographic hash functions.

Append: Suppose that we wish to add the element $x$ to the Bloom filter.
- Define the vector ${\bf{b}} \in \{0,\dots,n-1\}^k$ so that ${\bf{b}}[i] := h_i(x)$ for each $i \in \{0,\dots,k-1\}$ .
- Update the binary string ${\bf{v}}[{\bf{b}}[i]] \leftarrow 1$ for each $i \in \{0,\dots,k-1\}$ .
Query: Suppose that we wish to query the Bloom filter for element $y$ .
- Return 1 provided ${\bf{v}}[h_i(y)] = 1$ for every $i \in \{0,\dots,k-1\}$ . Otherwise, return 0.

The algorithm Query will output 1 for every element $y$ that has been added to the Bloom filter. This is a consequence of the Append algorithm. However, due to potential collisions over a set of hash functions, it is possible for false positives to occur. Moreover, the possibility of collisions makes it impossible to remove elements from the Bloom filter.

Complexity

The storage of a Bloom filter requires constant space. Specifically, the Bloom filter uses $n$ bits regardless of the size of the digest. So, regardless of the number of elements that we append, the Bloom filter will use $n$ bits. Further, if we assume that each of the $k$ hash functions runs in constant time, then we can append/query an entry in $O(k)$ .

Example

Suppose that $k = 3$ and $n = 10$ . Our Bloom filter is initialized as $\bf{v} = \begin{pmatrix}0&0&0&0&0&0&0&0&0&0\end{pmatrix}.$ Now, we will append the words $add$ , $sum$ , and $equal$ . Suppose that

$\begin{matrix} h_0(add) = 1 & h_1(add) = 4 & h_2(add) = 7\\ h_0(sum) = 9 & h_1(sum) = 2 & h_2(sum) = 1\\ h_0(equal) = 5 & h_1(equal) = 8 & h_2(equal) = 0. \end{matrix}$

After appending these words, the Bloom filter is $\bf{v} = \begin{pmatrix}1&1&1&0&1&1&0&1&1&1\end{pmatrix}.$

Now, suppose that we query the words $subtract$ and $multiple$ so that

$\begin{matrix} h_0(subtract) = 3 & h_1(subtract) = 5 & h_2(subtract) = 1\\ h_0(multiple) = 7 & h_1(multiple) = 1 & h_2(multiple) = 4\\ \end{matrix}$ .

The query for $subtract$ returns 0 since ${\bf{v}}[3]=0$ . On the other hand, the query for $multiple$ returns 1 since ${\bf{v}}[1]=1, {\bf{v}}[4] = 1$ , and ${\bf{v}}[7]=1$ . Even though $multiple$ was not used to generate the Bloom filter ${\bf{v}}$ , our query returns the false positive.

Probability of false positives

For our analysis, we will assume that the probabilities that arise in our analysis are independent. However, this assumption can be removed to gain the same approximation.

We note that for a single hash function, the probability that a specific bit is flipped to 1 is $1/n$ . So, the probability that the specific bit is not flipped by the hash function is $1-1/n$ . Applying our assumption that the $k$ hash functions are 'independent,' the probability that the specific bit is not flipped by any of the hash functions is $(1-1/n)^k$ .

Recall the calculus fact $\lim_{\infty} (1-1/n)^n = e^{-1}$ . That is, as we increase the number of bits that our Bloom filter uses, the approximate probability that a given bit is not flipped by any of the $k$ hash functions is $e^{-k/n}$ .

Suppose that $\ell$ entries have been added to the Bloom filter. The probability that a specific bit is still 0 after the $\ell$ entries have been added is approximately $e^{-\ell k/n}$ . The probability that a queried element is erroneously claimed as a member of the digest is approximately $(1-e^{-\ell k/n})^k$ .

The following table provides concrete values for these approximations.

$n$	$k$	$\ell$	$(1-e^{-\ell k/n})^k$
32	3	3	0.01474
32	3	7	0.11143
32	3	12	0.30802
32	3	17	0.50595
32	3	28	0.79804

Notice that the probability of false positives increases as the number of elements ( $\ell$ ) that have been added to the digest increases.

Sliding-Window Bloom filter

Our toy example and table illustrated an issue concerning Bloom filters. The number of entries that can be added to a Bloom filter is restricted by our choice of $k$ and $n$ . Not only does the probability that false positives will occur increase, but it is possible that our vector ${\bf{v}}$ can be a string of all 1s. Szepieniec and Værge proposed a modification to Bloom filters to handle this.

Instead of having a fixed number of bits for our Bloom filter, we dynamically allot memory based on the number of entries that have been added to the filter. Given a predetermined threshold ( $b$ ) for the number of entries, we shift our 'window' of flipping bits by $s$ bits. Note that this means that it is necessary to keep track of when a given entry is added to the digest. This means that querying the Sliding-Window Bloom filter will yield different results when different timestamps are used.

This can be done with $k$ hash functions as we used earlier. Alternatively, Szepieniec and Værge proposed using the same hash function but to produce $k$ entries in the current window. Specifically, we obtain the bits we wish to flip to 1s by computing $h(X || i)$ for each $i \in \{0,\dots, k-1\}$ and $X$ as we will define next. For Sliding-Window Bloom filters, $X$ is more than just the element we wish to append to the filter. Instead, $X$ consists of the element $x$ and a timestamp $t$ . The timestamp $t$ is used to locate the correct window for bits, as we see below:

Append: Suppose that we wish to add the element $x$ with timestamp $t$ to the Sliding-Window Bloom filter.
- Define the vector ${\bf{b}} \in \{0,\dots,n-1\}^k$ so that ${\bf{b}}[i] := h(x||t||i)$ for each $i \in \{0,\dots,k-1\}$ .
- Update the binary string ${\bf{v}}[{\bf{b}}[i]+\lfloor t/b \rfloor s] \leftarrow 1$ for each $i \in \{0,\dots,k-1\}$ .
Query: Suppose that we wish to query the Bloom filter for element $y$ with timestamp $t$ .
- Return 1 provided ${\bf{v}}[h(y||t||i) + \lfloor t/b \rfloor s] = 1$ for every $i \in \{0,\dots,k-1\}$ . Otherwise, return 0.

By incorporating a shifting window, we maintain efficient querying and appending at the cost of constant space. However, by losing constant space, we gain 'infinite' scalability.

Cuckoo filters

A Cuckoo filter is a data structure for probabilistic membership proofs based on Cuckoo hash tables. The specific design goal for Cuckoo filters is to address the inability to remove elements from a Bloom Filter. This is done by replacing a list of bits with a list of 'fingerprints.' A fingerprint can be thought of as the hash value for an entry in the digest. A Cuckoo filter is a fixed-length list of 'fingerprints.' If the maximum number of entries that a Cuckoo filter can hold is $n$ and a fingerprint occupies $f$ bits, then the Cuckoo filter occupies $nf$ bits.

Now, we describe the algorithms associated with the Cuckoo filter $C$ with hash function $hash(X)$ and fingerprint function $fingerprint(X)$ .

Append: Suppose that we wish to add the element $x$ to the Cuckoo filter.
- If either position $i_x := hash(x)$ or $j_x := i \otimes hash(fingerprint(x))$ of $C$ is empty, then $fingerprint(x)$ is inserted into an empty position.
- If both $i_x$ and $j_x$ are occupied with a fingerprint that is distinct from $fingerprint(x)$ , then we select either $i_x$ or $j_x$ to insert $fingerprint(x)$ . The fingerprint that had previously occupied this position cannot be discarried. Instead, we insert this fingerprint into its alternate location. This reshuffling process either ends with fingerprints all having their own bucket or one that cannot be inserted. In the case that we have a fingerprint that cannot be inserted, then the Cuckoo filter is overfilled.
Query: Suppose that we wish to query the Cuckoo filter for element $y$ .
- Return 1 provided $fingerprint(y)$ is either in position $i_y$ or $j_y$ .
Delete: Suppose that we wish to delete the element $y$ from the Cuckoo filter.
- If $y$ has been added to the Cuckoo filter, then $fingerprint(y)$ is either in position $i_y$ or $j_y$ . We remove $fingerprint(y)$ from the appropriate position.

We note that false positives in Cuckoo filters only occur when an element shares a fingerprint and hash with a value that has already been added to the Cuckoo filter.

Example

In this example, we will append the words $add$ , $sum$ , and $equal$ to a Cuckoo filter with 8 slots.

For each word $x$ , we compute two indices: $i_x := hash(x) \text{ and } j_x := hash(x) \otimes hash(fingerprint(x)).$ Suppose that we have the following values for our words:

word	$i_x$	$j_x$
$add$	$(0,1,0)$	$(1,0,0)$
$sum$	$(1,0,1)$	$(1,1,0)$
$equal$	$(0,1,0)$	$(1,0,1)$

For clarity of the example, we append the words directly to the buckets instead of fingerprints of our data.

	0	1	2	3	4	5	6	7
append $add$			$add$
append $sum$			$add$			$sum$

Notice that both of the buckets (2 and 5) that $equal$ can map to are occupied. So, we select one of these buckets (say 2) to insert $equal$ into. Then, we have to insert $add$ to its possible bucket (1). This leaves us with the Cuckoo filter:

0	1	2	3	4	5	6	7
	$add$	$equal$			$sum$

Complexity

Notice that deletions and queries to Cuckoo filters are done in constant time. Specifically, only two locations need to be checked for any data $x$ . Appends may require shuffling previously added elements to their alternate locations. As such, the append does not run in constant time.

Bloom filters vs Cuckoo filters

The design of Bloom filters is focused on space efficiency and quick query time. Even though they occupy constant space, Cuckoo filters require significantly more space for $n$ items than Bloom filters. The worst-case append in a Cuckoo filter is slower than the append in a Bloom filter. However, an append that does not require any shuffling in a Cuckoo filter can be quicker than appends in Bloom filters. Cuckoo filters make up for these disadvantages with quicker query time and the ability to delete entries. Further, the probability of false positives in Cuckoo filters is lower than the probability of false positives in Bloom filters.

Combining Filters with RLN

In a series of posts (1,2,3), various versons of rate limiting nullifiers (RLN) that are used by Waku has been discussed. RLN uses a sparse Merkle tree for the membership set. The computational power required to construct the Merkle tree prevent light clients from participating in verifying membership proofs. In Verifying RLN Proofs in Light Clients with Subtrees, it was proposed to move the membership set on-chain so that it would not be necessary for a light client to construct the entire Merkle tree locally. Unfortunately, the naive approach is not practical as the gas limit for a single call is too restrictive for an appropriately sized tree. Instead, it was proposed to make utilize of subtrees. In this section, we provide a discussion of an alternate solution for light clients by using filters for the membership set. The two parts of RLN that we will focus on are user registration and deletion.

Both Bloom and Cuckoo filters support user registration as this is can be done as an append. The fixed size of these filters would restrict the total number of users that can register. This can be migitated by using Sliding-Window Bloom filter as this supports system growth. The Sliding-Window can be adapted to Cuckoo filters as well. In the case of a Sliding-Window filter, an user would maintain the epoch of when they registered. The registration of new users to Bloom filters can be done in constant time which is a significant improvement over appending to subtrees. Unfortunately, the complexity of registration to Cuckoo filters cannot be as easily computed.

A user could be slashed from the RLN by sending too many messages in a given epoch. Unfortunately, Bloom filters do not support the deletion of members. Luckily, Cuckoo filters allow for deletions that can performed in constant time.

Cuckoo filter that use Sliding-Window could be used so that light clients are able to verify proofs of membership in the RLN. These proofs are not a substitute to the usual proofs that a heavy client can verify due to the allowance of false positives. However, with the allowance of false positives, a light client can participate in verification RLN proofs in an efficient manner.

References

RLN-v3: Towards a Flexible and Cost-Efficient Implementation

2024-05-13T12:00:00.000Z

Improving on the previous version of RLN by allowing dynamic epoch sizes.

Introduction

The premise of RLN-v3 is to have a variable message rate per variable epoch, which can be explained in the following way:

RLN-v1: “Alice can send 1 message per global epoch”

Practically, this is 1 msg/second
RLN-v2: “Alice can send x messages per global epoch”

Practically, this is x msg/second
RLN-v3: “Alice can send x messages within a time interval y chosen by herself. The funds she has to pay are affected by both the number of messages and the chosen time interval. Other participants can choose different time intervals fitting their specific needs.

Practically, this is x msg/y seconds

RLN-v3 allows higher flexibility and ease of payment/stake for users who have more predictable usage patterns and therefore, more predictable bandwidth usage on a p2p network (Waku, etc.).

For example:

An AMM that broadcasts bids, asks, and fills over Waku may require a lot of throughput in the smallest epoch possible and hence may register an RLN-v3 membership of 10000 msg/1 second. They could do this with RLN-v2, too.
Alice, a casual user of a messaging app built on Waku, who messages maybe 3-4 people infrequently during the day, may register an RLN-v3 membership of 100 msg/hour, which would not be possible in RLN-v2 considering the global epoch was set to 1 second. With RLN-v2, Alice would have to register with a membership of 1 msg/sec, which would translate to 3600 msg/hour. This is much higher than her usage and would result in her overpaying to stake into the membership set.
A sync service built over Waku, whose spec defines that it MUST broadcast a set of public keys every hour, may register an RLN-v3 membership of 1 msg/hour, cutting down the costs to enter the membership set earlier.

Theory

Modification to leaves set in the membership Merkle tree

To ensure that a user’s epoch size (user_epoch_limit) is included within their membership we must modify the user’s commitment/leaf in the tree to contain it. A user’s commitment/leaf in the tree is referred to as a rate_commitment, which was previously derived from their public key (identity_commitment) and their variable message rate (user_message_limit).

In RLN-v2:

rate\_commitment = poseidon([identity\_commitment, user\_message\_limit])

In RLN-v3:

rate\_commitment = poseidon([identity\_commitment, user\_message\_limit, user\_epoch\_limit])

Modification to circuit inputs

To detect double signaling, we make use of a circuit output nullifier, which remains the same if a user generates a proof with the same message_id and external_nullifier, where the external_nullifier and nullifier are defined as:

external\_nullifier = poseidon([epoch, rln\_identifier]) \\ nullifier = poseidon([identity\_secret, external\_nullifier, message\_id])

Where:

epoch is defined as the Unix epoch timestamp with seconds precision.
rln_identifier uniquely identifies an application for which a user submits a proof.
identity_secret is the private key of the user.
message_id is the sequence number of the user’s message within user_message_limit in an epoch.

In RLN-v2, the global epoch was 1 second, hence we did not need to perform any assertions to the epoch’s value inside the circuit, and the validation of the epoch was handled off-circuit (i.e., too old, too large, bad values, etc.).

In RLN-v3, we propose that the epoch that is passed into the circuit must be a valid multiple of user_epoch_limit since the user may pass in values of the epoch which do not directly correlate with the user_epoch_limit.

For example:

A user with user_epoch_limit of 120 passes in an epoch of 237 generates user_message_limit proofs with it, can increment the epoch by 1, and generate user_message_limit proofs with it, thereby allowing them to bypass the message per epoch restriction.

One could say that we could perform this validation outside of the circuit, but we maintain the user_epoch_limit as a private input to the circuit so that the user is not deanonymized by the anonymity set connected to that user_epoch_limit. Since user_epoch_limit is kept private, the verifier does not have access to that value and cannot perform validation on it.

If we ensure that the epoch is a multiple of user_epoch_limit, we have the following scenarios:

A user with user_epoch_limit of 120 passes in an epoch of 237. Proof generation fails since the epoch is not a multiple of user_epoch_limit.
A user with user_epoch_limit of 120 passes in an epoch of 240 and can generate user_message_limit proofs without being slashed.

Since we perform operations on the epoch, we must include it as a circuit input (previously, it was removed from the circuit inputs to RLN-v2).

Therefore, the new circuit inputs are as follows:

// unchanged
private identity_secret
private user_message_limit
private message_id
private pathElements[]
private pathIndices[]
public x // messageHash

// new/changed
private user_epoch_limit
private user_epoch_quotient // epoch/user_epoch_limit to assert within circuit
public epoch
public rln_identifier

The circuit outputs remain the same.

Additional circuit constraints

Since we accept the epoch, user_epoch_quotient, and user_epoch_limit, we must ensure that the relation between these 3 values is preserved. I.e.:
$epoch == user\_epoch\_limit * user\_epoch\_quotient$
To ensure no overflows/underflows occur in the above multiplication, we must constrain the inputs of epoch, user_epoch_quotient, and user_epoch_limit. We have assumed 3600 to be the maximum valid size of the user_epoch_quotient.

size(epoch) \leq 64\ bits \\ size(user\_epoch\_limit) \leq 12\ bits \\ user\_epoch\_limit \leq 3600 \\ user\_epoch\_limit \leq epoch \\ user\_epoch\_quotient < user\_epoch\_limit

Modifications to external epoch validation (Waku, etc.)

For receivers of an RLN-v3 proof to detect if a message is too old, we must use the higher bound of the user_epoch_limit, which has been set to 3600. The trade-off here is that we allow hour-old messages to propagate within the network.

Modifications to double signaling detection scheme (Waku, etc.)

For verifiers of RLN-v1/v2 proofs, a log of nullifiers seen in the last epoch is maintained, and if there is a match with a pre-existing nullifier, double signaling has been detected and the verifier MAY proceed to slash the spamming user.

With the RLN-v3 scheme, we need to increase the size of the nullifier log used, which previously cleared itself every second to the higher bound of the user_epoch_limit, which is 3600. Now, the RLN proof verifier must clear the nullifier log every 3600 seconds to satisfactorily detect double signaling.

The implementation

An implementation of the RLN-v3 scheme in gnark can be found here.

Comments on performance

Hardware: Macbook Air M2, 16GB RAM
Circuit: RLN-v3
Proving system: Groth16
Framework: gnark
Elliptic curve: bn254 (aka bn128) (not to be confused with the 254-bit Weierstrass curve)
Finite field: Prime-order subgroup of the group of points on the bn254 curve
Default Merkle tree height: 20
Hashing algorithm: Poseidon
Merkle tree: Sparse Indexed Merkle Tree

Proving

The proving time for the RLN-v3 circuit is 90ms for a single proof.

Verification

The verification time for the RLN-v3 circuit is 1.7ms for a single proof.

Conclusion

The RLN-v3 scheme introduces a new epoch-based message rate-limiting scheme to the RLN protocol. It enhances the user's flexibility in setting their message limits and cost-optimizes their stake.

Future work

Implementing the RLN-v3 scheme in Zerokit
Implementing the RLN-v3 scheme in Waku
Formal security analysis of the RLN-v3 scheme

References

Verifying RLN Proofs in Light Clients with Subtrees

2024-05-03T12:00:00.000Z

How resource-restricted devices can verify RLN proofs fast and efficiently.

Introduction

This post expands upon ideas described in the previous post, focusing on how resource-restricted devices can verify RLN proofs fast and efficiently.

Previously, it was required to fetch all the memberships from the smart contract, construct the merkle tree locally, and derive the merkle root, which is subsequently used to verify RLN proofs.

This process is not feasible for resource-restricted devices since it involves a lot of RPC calls, computation and fault tolerance. One cannot expect a mobile phone to fetch all the memberships from the smart contract and construct the merkle tree locally.

Constraints and requirements

An alternative solution to the one proposed in this post is to construct the merkle tree on-chain, and have the root accessible with a single RPC call. However, this approach increases gas costs for inserting new memberships and may not be feasible until it is optimized further with batching mechanisms, etc.

The other methods have been explored in more depth here.

Following are the requirements and constraints for the solution proposed in this post:

Cheap membership insertions.
As few RPC calls as possible to reduce startup time.
Merkle root of the tree is available on-chain.
No centralized services to sequence membership insertions.
Map inserted commitments to the block in which they were inserted.

Metrics on sync time for a tree with 2,653 leaves

The following metrics are based on the current implementation of RLN in the Waku gen0 network.

Test bench

Hardware: Macbook Air M2, 16GB RAM
Network: 120 Megabits/sec
Nwaku commit: e61e4ff
RLN membership set contract: 0xF471d71E9b1455bBF4b85d475afb9BB0954A29c4
Deployed block number: 4,230,716
RLN Membership set depth: 20
Hash function: PoseidonT3 (which is a gas guzzler)
Max size of the membership set: 2^20 = 1,048,576 leaves

Metrics

Time to sync the whole tree: 4 minutes
RPC calls: 702
Number of leaves: 2,653

One can argue that the time to sync the tree at the current state is not that bad. However, the number of RPC calls is a concern, which scales linearly with the number of blocks since the contract was deployed This is because the implementation fetches all events from the contract, chunking 2,000 blocks at a time. This is done to avoid hitting the block limit of 10,000 events per call, which is a limitation of popular RPC providers.

Proposed solution

From a theoretical perspective, one could construct the merkle tree on-chain, in a view call, in-memory. However, this is not feasible due to the gas costs associated with it.

To compute the root of a Merkle tree with $2^{20}$ leaves it costs approximately 2 billion gas. With Infura and Alchemy capping the gas limit to 350M and 550M gas respectively, it is not possible to compute the root of the tree in a single call.

Acknowledging that Polygon Miden and Penumbra both make use of a tiered commitment tree, we propose a similar approach for RLN.

A tiered commitment tree is a tree which is sharded into multiple smaller subtrees, each of which is a tree in itself. This allows scaling in terms of the number of leaves, as well as reducing state bloat by just storing the root of a subtree when it is full instead of all its leaves.

Here, the question arises: What is the maximum number of leaves in a subtree with which the root can be computed in a single call?

It costs approximately 217M gas to compute the root of a Merkle tree with $2^{10}$ leaves.

This is a feasible number for a single call, and hence we propose a tiered commitment tree with a maximum of $2^{10}$ leaves in a subtree and the number of subtrees is $2^{10}$ . Therefore, the maximum number of leaves in the tree is $2^{20}$ (the same as the current implementation).

Insertion

When a commitment is inserted into the tree it is first inserted into the first subtree. When the first subtree is full the next insertions go into the second subtree and so on.

Syncing

When syncing the tree, one only needs to fetch the roots of the subtrees. The root of the full tree can be computed in-memory or on-chain.

This allows us to derive the following relation:

number\_of\_rpc\_calls = number\_of\_filled\_subtrees + 1

This is a significant improvement over the current implementation, which requires fetching all the memberships from the smart contract.

Gas costs

The gas costs for inserting a commitment into the tree are the same as the current implementation except it consists of an extra SSTORE operation to store the shardIndex of the commitment.

Events

The events emitted by the contract are the same as the current implementation, appending the shardIndex of the commitment.

Proof of concept

A proof of concept implementation of the tiered commitment tree is available here, and is deployed on Sepolia at 0xE7987c70B54Ff32f0D5CBbAA8c8Fc1cAf632b9A5.

It is compatible with the current implementation of the RLN verifier.

Future work

Optimize the gas costs of the tiered commitment tree.
Explore using different number of leaves under a given node in the tree (currently set to 2).

Conclusion

The tiered commitment tree is a promising approach to reduce the number of RPC calls required to sync the tree and reduce the gas costs associated with computing the root of the tree. Consequently, it allows for a more scalable and efficient RLN verifier.

References

Strengthening Anonymous DoS Prevention with Rate Limiting Nullifiers in Waku

2023-11-07T12:00:00.000Z

Rate Limiting Nullifiers in practice, applied to an anonymous p2p network, like Waku.

Introduction

Rate Limiting Nullifier (RLN) is a zero-knowledge gadget that allows users to prove 2 pieces of information,

They belong to a permissioned membership set
Their rate of signaling abides by a fixed number that has been previously declared

The "membership set" introduced above, is in the form of a sparse, indexed merkle tree. This membership set can be maintained on-chain, off-chain or as a hybrid depending on the network's storage costs. Waku makes use of a hybrid membership set, where insertions are tracked in a smart contract. In addition, each Waku node maintains a local copy of the tree, which is updated upon each insertion.

Users register themselves with a hash of a locally generated secret, which is then inserted into the tree at the next available index. After having registered, users can prove their membership by proving their knowledge of the pre-image of the respective leaf in the tree. The leaf hashes are also referred to as commitments of the respective users. The actual proof is done by a Merkle Inclusion Proof, which is a type of ZK proof.

The circuit ensures that the user's secret does indeed hash to a leaf in the tree, and that the provided Merkle proof is valid.

After a User generates this Merkle proof, they can transmit it to other users, who can verify the proof. Including a message's hash within the proof generation, additionally guarantees integrity of that message.

A malicious user could generate multiple proofs per epoch. they generate multiple proofs per epoch. However, when multiple proofs are generated per epoch, the malicious user's secret is exposed, which strongly disincentivizes this attack. This mechanism is further described in malicious User secret interpolation mechanism

Note: This blog post describes rln-v1, which excludes the range check in favor of a global rate limit for all users, which is once per time window. This version is currently in use in waku-rln-relay.

RLN Protocol parameters

Given below is the set of cryptographic primitives, and constants that are used in the RLN protocol.

Proving System: groth16
Elliptic Curve: bn254 (aka bn128) (not to be confused with the 254 bit Weierstrass curve)
Finite Field: Prime-order subgroup of the group of points on the bn254 curve
Default Merkle Tree Height: 20
Hashing algorithm: Poseidon
Merkle Tree: Sparse Indexed Merkle Tree
Messages per epoch: 1
Epoch duration: 10 seconds

Malicious User secret interpolation mechanism

note: all the parameters mentioned below are elements in the finite field mentioned above.

The private inputs to the circuit are as follows: -

identitySecret: the randomly generated secret of the user
identityPathIndex: the index of the commitment derived from the secret
pathElements: elements included in the path to the index of the commitment

Following are the public inputs to the circuit -

x: hash of the signal to the finite field
rlnIdentifier: application-specific identifier which this proof is being generated for
epoch: the timestamp which this proof is being generated for

The outputs of the circuit are as follows: -

y: result of Shamir's secret sharing calculation
root: root of the Merkle tree obtained after applying the inclusion proof
nullifier: uniquely identifies a message, derived from rlnIdentifier, epoch, and the user's secret

With the above data in mind, following is the circuit pseudocode -

identityCommitment = Poseidon([identitySecret])
root = MerkleInclusionProof(identityCommitment, identityPathIndex, pathElements)
externalNullifier = Poseidon([epoch, rlnIdentifier])
a1 = Poseidon([identitySecret, externalNullifier])
y = identitySecret + a1 * x
nullifier = Poseidon([a1])

To interpolate the secret of a user who has sent multiple signals during the same epoch to the same rln-based application, we may make use of the following formula -

$a_1 = {(y_1 - y_2) \over (x_1 - x_2)}$

where $x_1$ , $y_1$ and $x_2$ , $y_2$ are shares from different messages

subsequently, we may use one pair of the shares, $x_1$ and $y_1$ to obtain the identitySecret

$identitySecret = y_1 - a_1 * x$

This enables RLN to be used for rate limiting with a global limit. For arbitrary limits, please refer to an article written by @curryrasul, rln-v2.

Waku's problem with DoS

In a decentralized, privacy focused messaging system like Waku, Denial of Service (DoS) vulnerabilities are very common, and must be addressed to promote network scale and optimal bandwidth utilization.

DoS prevention with user metadata

There are a couple of ways a user can be rate-limited, either -

IP Logging
KYC Logging

Both IP and KYC logging prevent systems from being truly anonymous, and hence, cannot be used as a valid DoS prevention mechanism for Waku.

RLN can be used as an alternative, which provides the best of both worlds, i.e a permissioned membership set, as well as anonymous signaling. However, we are bound by k-anonymity rules of the membership set.

Waku-RLN-Relay is a libp2p pubsub validator that verifies if a proof attached to a given message is valid. In case the proof is valid, the message is relayed.

Performance analysis

Test bench specs: AMD EPYC 7502P 32-Core, 4x32GB DDR4 Reg.ECC Memory

This simulation was conducted by @alrevuelta, and is described in more detail here.

The simulation included 100 waku nodes running in parallel.

Proof generation times -

Proof verification times -

A spammer node publishes 3000 msg/epoch, which is detected by all connected nodes, and subsequently disconnect to prevent further spam -

Security analysis

Barbulescu and Duquesne conclude that that the bn254 curve has only 100 bits of security. Since the bn254 curve has a small embedding degree, it is vulnerable to the MOV attack. However, the MOV attack is only applicable to pairings, and not to the elliptic curve itself. It is acceptable to use the bn254 curve for RLN, since the circuit does not make use of pairings.

An analysis on the number of rounds in the Poseidon hash function was done, which concluded that the hashing rounds should not be reduced,

The smart contracts have not been audited, and are not recommended for real world deployments yet.

Storage analysis

commitment\_size = 32\ bytes \\ tree\_height =20 \\ total\_leaves = 2^{20} \\ max\_tree\_size = total\_leaves * commitment\_size \\ max\_tree\_size = 2^{20} * 32 = 33,554,432 \\ ∴max\_tree\_size = 33.55\ megabytes

The storage overhead introduced by RLN is minimal. RLN only requires 34 megabytes of storage, which poses no problem on most end-user hardware, with the exception of IoT/microcontrollers. Still, we are working on further optimizations allowing proof generation without having to store the full tree.

The bare minimum requirements to run RLN

With proof generation time in sub-second latency, along with low storage overhead for the tree, it is possible for end users to generate and verify RLN proofs on a modern smartphone.

Following is a demo provided by @rramos that demonstrates waku-rln-relay used in react native.

Warning: The react native sdk will be deprecated soon, and the above demo should serve as a PoC for RLN on mobiles

RLN usage guide

Zerokit implements api's that allow users to handle operations to the tree, as well as generate/verify RLN proofs.

Our main implementation of RLN can be accessed via this Rust crate, which is documented here. It can used in other langugages via the FFI API, which is documented here. The usage of RLN in Waku is detailed in our RLN Implementers guide, which provides step-by-step instructions on how to run Waku-RLN-Relay.

Following is a diagram that will help understand the dependency tree -

Future work

Optimizations to zerokit for proof generation time.
Incrementing tree depth from 20 to 32, to allow more memberships.
Optimizations to the smart contract.
An ability to signal validity of a message in different time windows.
Usage of proving systems other than Groth16.

References

GossipSub Improvements: Evolution of Overlay Design and Message Dissemination in Unstructured P2P Networks

2023-11-06T12:00:00.000Z

GossipSub Improvements: Evolution of Overlay Design and Message Dissemination in Unstructured P2P Networks

Motivitation

We have been recently working on analyzing and improving the performance of the GossipSub protocol for large messages, as in the case of Ethereum Improvement Proposal EIP-4844. This work led to a comprehensive study of unstructured P2P networks. The intention was to identify the best practices that can serve as guidelines for performance improvement and scalability of P2P networks.

Introduction

Nodes in an unstructured p2p network form self-organizing overlay(s) on top of the IP infrastructure to facilitate different services like information dissemination, query propagation, file sharing, etc. The overlay(s) can be as optimal as a tree-like structure or as enforcing as a fully connected mesh.

Due to peer autonomy and a trustless computing environment, some peers may deviate from the expected operation or even leave the network. At the same time, the underlying IP layer is unreliable.

Therefore, tree-like overlays are not best suited for reliable information propagation. Moreover, tree-based solutions usually result in significantly higher message dissemination latency due to suboptimal branches.

Flooding-based solutions, on the other hand, result in maximum resilience against adversaries and achieve minimal message dissemination latency because the message propagates through all (including the optimal) paths. Redundant transmissions help maintain the integrity and security of the network in the presence of adversaries and high node failure but significantly increase network-wide bandwidth utilization, cramming the bottleneck links.

An efficient alternative is to lower the number of redundant transmissions by D-regular broadcasting, where a peer will likely receive (or relay) a message from up to $D$ random peers. Publishing through a D-regular overlay triggers approximately $N \times D$ transmissions. Reducing $D$ reduces the redundant transmissions but compromises reachability and latency. Sharing metadata through a K-regular overlay (where $K > D$ ) allows nodes to pull missing messages.

GossipSub [1] benefits from full-message (D-regular) and metadata-only (k-regular) overlays. Alternatively, a metadata-only overlay can be used, requiring a pull-based operation that significantly minimizes bandwidth utilization at the cost of increased latency.

Striking the right balance between parameters like $D, K$ , pull-based operation, etc., can yield application-specific performance tuning, but scalability remains a problem.

At the same time, many other aspects can significantly contribute to the network's performance and scalability. One option is to realize peers' suitability and continuously changing capabilities while forming overlays.

For instance, a low-bandwidth link near a publisher can significantly demean the entire network's performance. Reshuffling of peering links according to the changing network conditions can lead to superior performance.

Laying off additional responsibilities to more capable nodes (super nodes) can alleviate peer cramming, but it makes the network susceptible to adversaries/peer churn. Grouping multiple super nodes to form virtual node(s) can solve this problem.

Similarly, flat (single-tier) overlays cannot address the routing needs in large (geographically dispersed) networks.

Hierarchical (Multi-tier) overlays with different intra/inter-overlay routing solutions can better address these needs. Moreover, using message aggregation schemes for grouping multiple messages can save bandwidth and provide better resilience against adversaries/peer churn.

This article's primary objective is to investigate the possible choices that can empower an unstructured P2P network to achieve superior performance for the broadest set of applications. We look into different constraints imposed by application-specific needs (performance goals) and investigate various choices that can augment the network's performance. We explore overlay designs/freshness, peer selection approaches, message-relaying mechanisms, and resilience against adversaries/peer churn. We consider GossipSub a baseline protocol to explore various possibilities and decisively commit to the ones demonstrating superior performance. We also discuss the current state and, where applicable, propose a strategic plan for embedding new features to the GossipSub protocol.

GOAL1: Low Latency Operation

Different applications, like blockchain, streaming, etc., impose strict time bounds on network-wide message dissemination latency. A message delivered after the imposed time bounds is considered as dropped. An early message delivery in applications like live streaming can further enhance the viewing quality.

The properties and nature of the overlay network topology significantly impact the performance of services and applications executed on top of them. Studying and devising mechanisms for better overlay design and message dissemination is paramount to achieving superior performance.

Interestingly, shortest-path message delivery trees have many limitations:

Changing network dynamics requires a quicker and continuous readjustment of the multicast tree.
The presence of resource-constrained (bandwidth/compute, etc.) nodes in the overlay can result in congestion.
Node failure can result in partitions, making many segments unreachable.
Assuring a shortest-path tree-like structure requires a detailed view of the underlying (and continuously changing) network topology.

Solutions involve creating multiple random trees to add redundancy [2]. Alternatives involve building an overlay mesh and forwarding messages through the multicast delivery tree (eager push).

Metadata is shared through the overlay links so that the nodes can ask for missing messages (lazy push or pull-based operation) through the overlay links. New nodes are added from the overlay on node failure, but it requires non-faulty node selection.

GossipSub uses eager push (through overlay mesh) and lazy push (through IWANT messages).

The mesh degree $D_{Low} \leq D \leq D_{High}$ is crucial in deciding message dissemination latency. A smaller value for $D$ results in higher latency due to increased rounds, whereas a higher $D$ reduces latency on the cost of increased bandwidth. At the same time, keeping $D$ independent of the growing network size ( $N$ ) may increase network-wide message dissemination latency. Adjusting $D$ with $N$ maintains similar latency on the cost of increased workload for peers. Authors in [3] suggest only a logarithmic increase in $D$ to maintain a manageable workload for peers. In [4], it is reported that the average mesh degree should not exceed $D_{avg} = \ln(N) + C$ for an optimal operation, where $C$ is a small constant.

Moreover, quicker shuffling of peers results in better performance in the presence of resource-constrained nodes or node failure [4].

GOAL2: Considering Heterogeneity In Overlay Design

Random peering connections in P2P overlays represent a stochastic process. It is inherently difficult to precisely model the performance of such systems. Most of the research on P2P networks provides simulation results assuming nodes with similar capabilities. The aspect of dissimilar capabilities and resource-constrained nodes is less explored.

It is discussed in GOAL1 that overlay mesh results in better performance if $D_{avg}$ does not exceed $\ln(N) + C$ . Enforcing all the nodes to have approximately $\ln(N) + C$ peers makes resource-rich nodes under-utilized, while resource-constrained nodes are overloaded. At the same time, connecting high-bandwidth nodes through a low-bandwidth node undermines the network's performance. Ideally, the workload on any node should not exceed its available resources. A better solution involves a two-phased operation:

Every node computes its available bandwidth and selects a node degree $D$ proportional to its available bandwidth [4]. Different bandwidth estimation approaches are suggested in literature [5,6]. Simple bandwidth estimation approaches like variable packet size probing [6] yield similar results with less complexity. It is also worth mentioning that many nodes may want to allocate only a capped share of their bandwidth to the network. Lowering $D$ according to the available bandwidth can still prove helpful. Additionally, bandwidth preservation at the transport layer through approaches like µTP can be useful. To further conform to the suggested mesh-degree average $D_{avg}$ , every node tries achieving this average within its neighborhood, resulting in an overall similar $D_{avg}$ .
From the available local view, every node tries connecting peers with the lowest latency until $D$ connections are made. We suggest referring to the peering solution discussed in GOAL5 to avoid network partitioning.

The current GossipSub design considers homogeneous peers, and every node tries maintaining $D_{Low} \leq D \leq D_{High}$ connections.

GOAL3: Bandwidth Optimization

Redundant message transmissions are essential for handling adversaries/node failure. However, these transmissions result in traffic bursts, cramming many overlay links. This not only adds to the network-wide message dissemination latency but a significant share of the network's bandwidth is wasted on (usually) unnecessary transmissions. It is essential to explore solutions that can minimize the number of redundant transmissions while assuring resilience against node failures.

Many efforts have been made to minimize the impact of redundant transmissions. These solutions include multicast delivery trees, metadata sharing to enable pull-based operation, in-network information caching, etc. [7,8]. GossipSub employs a hybrid of eager push (message dissemination through the overlay) and lazy push (a pull-based operation by the nodes requiring information through IWANT messages).

A better alternative to simple redundant transmission is to use message aggregation [9,10,11] for the GossipSub protocol. As a result, redundant message transmissions can serve as a critical advantage of the GossipSub protocol. Suppose that we have three equal-length messages $x1, x2, x3$ . Assuming an XOR coding function, we know two trivial properties: $x1 \oplus x2 \oplus x2 = x1$ and $\vert x1 \vert = \vert x1 \oplus x2 \oplus x2 \vert$ .

This implies that instead of sending messages individually, we can encode and transmit composite message(s) to the network. The receiver can reconstruct the original message from encoded segments. As a result, fewer transmissions are sufficient for sending more messages to the network.

However, sharing linear combinations of messages requires organizing messages in intervals, and devising techniques to identify all messages belonging to each interval. In addition, combining messages from different publishers requires more complex arrangements, involving embedding publisher/message IDs, delayed forwarding (to accommodate more messages), and mechanisms to ensure the decoding of messages at all peers. Careful application-specific need analysis can help decide the benefits against the added complexity.

GOAL4: Handling Large Messages

Many applications require transferring large messages for their successful operation. For instance, database/blockchain transactions [12]. This introduces two challenges:

Redundant large message transmissions result in severe network congestion.
Message transmissions follow a store/forward process at all peers, which is inefficient in the case of large messages.

The above-mentioned challenges result in a noticeable increase in message dissemination latency and bandwidth wastage. Most of the work done for handling large messages involves curtailing redundant transmissions using multicast delivery trees, reducing the number of fanout nodes, employing in-network message caching, pull-based operation, etc.

Approaches like message aggregation also prove helpful in minimizing bandwidth wastage.

Our recent work on GossipSub improvements (still a work in progress) suggests the following solutions to deal with large message transmissions:

Using IDontWant message proposal [13] and staggered sending.

IDontWant message helps curtail redundant transmissions by letting other peers know we have already received the message. Staggered sending enables relaying the message to a short subset of peers in each round. We argue that simultaneously relaying a message to all peers hampers the effectiveness of the IDontWant message. Therefore, using the IDontWant message with staggered sending can yield better results by allowing timely reception and processing of IDontWant messages.
Message transmissions follow a store/forward process at all peers that is inefficient in the case of large messages. We can parallelize message transmission by partitioning large messages into smaller fragments, letting intermediate peers relay these fragments as soon as they receive them.

GOAL5: Scalability

P2P networks are inherently scalable because every incoming node brings in bandwidth and compute resources. In other words, we can keep adding nodes to the network as long as every incoming node brings at-least $R \times D$ bandwidth, where $R$ is average data arrival rate. It is worth mentioning that network-wide message dissemination requires at-least $\lceil \log_D (N) \rceil$ hops. Therefore, increasing network size increases message dissemination latency, assuming D is independent of the network size.

Additionally, problems like peer churn, adversaries, heterogeneity, distributed operation, etc., significantly hamper the network's performance. Most efforts for bringing scalability to the P2P systems have focused on curtailing redundant transmissions and flat overlay adjustments. Hierarchical overlay designs, on the other hand, are less explored.

Placing a logical structure in unstructured P2P systems can help scale P2P networks.

One possible solution is to use a hierarchical overlay inspired by the approaches [14,15,16]. An abstract operation of such overlay design is provided below:

Clustering nodes based on locality, assuming that such peers will have relatively lower intra-cluster latency and higher bandwidth. For this purpose, every node tries connecting peers with the lowest latency until $D$ connections are made or the cluster limit is reached.
A small subset of nodes having the highest bandwidth and compute resources is selected from each cluster. These super nodes form a fully connected mesh and jointly act as a virtual node, mitigating the problem of peer churn among super nodes.
Virtual nodes form a fully connected mesh to construct a hierarchical overlay. Each virtual node is essentially a collection of super nodes; a link to any of the constituent super nodes represents a link to the virtual node.
One possible idea is to use GossipSub for intra-cluster message dissemination and FloodSub for inter-cluster message dissemination.

Summary

Overlay acts as a virtual backbone for a P2P network. A flat overlay is more straightforward and allows effortless readjustment to application needs. On the other hand, a hierarchical overlay can bring scalability at the cost of increased complexity. Regardless of the overlay design, a continuous readjustment to appropriate peering links is essential for superior performance. At the same time, bandwidth preservation (through message aggregation, caching at strategic locations, metadata sharing, pull-based operation, etc.) can help minimize latency. However, problems like peer churn and in-network adversaries can be best alleviated through balanced redundant coverage, and frequent reshuffling of the peering links.

References

[1] D. Vyzovitis, Y. Napora, D. McCormick, D. Dias, and Y. Psaras, “Gossipsub: Attack-resilient message propagation in the filecoin and eth2. 0 networks,” arXiv preprint arXiv:2007.02754, 2020. Retrieved from https://arxiv.org/pdf/2007.02754.pdf
[2] M. Matos, V. Schiavoni, P. Felber, R. Oliveira, and E. Riviere, “Brisa: Combining efficiency and reliability in epidemic data dissemination,” in 2012 IEEE 26th International Parallel and Distributed Processing Symposium. IEEE, 2012, pp. 983–994. Retrieved from https://ieeexplore.ieee.org/abstract/document/6267905
[3] P. T. Eugster, R. Guerraoui, A. M. Kermarrec, and L. Massouli, “Epidemic information dissemination in distributed systems,” IEEE Computer, vol. 37, no. 5, 2004. Retrieved from https://infoscience.epfl.ch/record/83478/files/EugGueKerMas04IEEEComp.pdf
[4] D. Frey, “Epidemic protocols: From large scale to big data,” Ph.D. dissertation, Universite De Rennes 1, 2019. Retrieved from https://inria.hal.science/tel-02375909/document
[5] M. Jain and C. Dovrolis, “End-to-end available bandwidth: measurement methodology, dynamics, and relation with tcp throughput,” IEEE/ACM Transactions on networking, vol. 11, no. 4, pp. 537–549, 2003. Retrieved from https://ieeexplore.ieee.org/abstract/document/1224454
[6] R. Prasad, C. Dovrolis, M. Murray, and K. Claffy, “Bandwidth estimation: metrics, measurement techniques, and tools,” IEEE network, vol. 17, no. 6, pp. 27–35, 2003. Retrieved from https://ieeexplore.ieee.org/abstract/document/1248658
[7] D. Kostic, A. Rodriguez, J. Albrecht, and A. Vahdat, “Bullet: High bandwidth data dissemination using an overlay mesh,” in Proceedings of the nineteenth ACM symposium on Operating systems principles, 2003, pp. 282–297. Retrieved from https://dl.acm.org/doi/abs/10.1145/945445.945473
[8] V. Pai, K. Kumar, K. Tamilmani, V. Sambamurthy, and A. E. Mohr, “Chainsaw: Eliminating trees from overlay multicast,” in Peer-to-Peer Systems IV: 4th International Workshop, IPTPS 2005, Ithaca, NY, USA, February 24-25, 2005. Revised Selected Papers 4. Springer, 2005, pp. 127–140. Retrieved from https://link.springer.com/chapter/10.1007/11558989_12
[9] Y.-D. Bromberg, Q. Dufour, and D. Frey, “Multisource rumor spreading with network coding,” in IEEE INFOCOM 2019-IEEE Conference on Computer Communications. IEEE, 2019, pp. 2359–2367. Retrieved from https://ieeexplore.ieee.org/abstract/document/8737576
[10] B. Haeupler, “Analyzing network coding gossip made easy,” in Proceedings of the forty-third annual ACM symposium on Theory of computing, 2011, pp. 293–302. Retrieved from https://dl.acm.org/doi/abs/10.1145/1993636.1993676
[11] S. Yu and Z. Li, “Massive data delivery in unstructured peer-to-peer networks with network coding,” in 6th IEEE/ACIS International Conference on Computer and Information Science (ICIS 2007). IEEE, 2007, pp. 592–597. Retrieved from https://ieeexplore.ieee.org/abstract/document/4276446
[12] V. Buterin, D. Feist, D. Loerakker, G. Kadianakis, M. Garnett, M. Taiwo, and A. Dietrichs, “Eip-4844: Shard blob transactions scale data-availability of ethereum in a simple, forwards-compatible manner,” 2022. Retrieved from https://eips.ethereum.org/EIPS/eip-4844
[13] A. Manning, “Gossipsub extension for epidemic meshes (v1.2.0),” 2022. Retrieved from https://github.com/libp2p/specs/pull/413
[14] Z. Duan, C. Tian, M. Zhou, X. Wang, N. Zhang, H. Du, and L. Wang, “Two-layer hybrid peer-to-peer networks,” Peer-to-Peer Networking and Applications, vol. 10, pp. 1304–1322, 2017. Retrieved from https://link.springer.com/article/10.1007/s12083-016-0460-5
[15] W. Hao, J. Zeng, X. Dai, J. Xiao, Q. Hua, H. Chen, K.-C. Li, and H. Jin, “Blockp2p: Enabling fast blockchain broadcast with scalable peer-to-peer network topology,” in Green, Pervasive, and Cloud Computing: 14th International Conference, GPC 2019, Uberlandia, Brazil, May 26–28, 2019, Proceedings 14. Springer, 2019, pp. 223–237. Retrieved from https://link.springer.com/chapter/10.1007/978-3-030-19223-5_16
[16] H. Qiu, T. Ji, S. Zhao, X. Chen, J. Qi, H. Cui, and S. Wang, “A geography-based p2p overlay network for fast and robust blockchain systems,” IEEE Transactions on Services Computing, 2022. Retrieved from https://ieeexplore.ieee.org/abstract/document/9826458

Nescience - A zkVM leveraging hiding properties

2023-08-28T12:00:00.000Z

Nescience, a privacy-first blockchain zkVM.

Introduction

Nescience is a privacy-first blockchain project that aims to enable private transactions and provide a general-purpose execution environment for classical applications. The goals include creating a state separation architecture for public/private computation, designing a versatile virtual machine based on mainstream instruction sets, creating proofs for private state updates, implementing a kernel-based architecture for correct execution of private functions, and implementing core DeFi protocols such as AMMs and staking from a privacy perspective.

It intends to create a user experience that is similar to public blockchains, but with additional privacy features that users can leverage at will. To achieve this goal, Nescience will implement a versatile virtual machine that can be used to implement existing blockchain applications, while also enabling the development of privacy-centric protocols such as private staking and private DEXs.

To ensure minimal trust assumptions and prevent information leakage, Nescience proposes a proof system that allows users to create proofs for private state updates, while the verification of the proofs and the execution of the public functions inside the virtual machine can be delegated to an external incentivised prover.

It also aims to implement a seamless interaction between public and private state, enabling composability between contracts, and private and public functions. Finally, Nescience intends to implement permissive licensing, which means that the source code will be open-source, and developers will be able to use and modify the code without any restriction.

Our primary objective is the construction of the Zero-Knowledge Virtual Machine (zkVM). This document serves as a detailed exploration of the multifaceted challenges, potential solutions, and alternatives that lay ahead. Each step is a testament to our commitment to thoroughness; we systematically test various possibilities and decisively commit to the one that demonstrates paramount performance and utility. For instance, as we progress towards achieving Goal 2, we are undertaking a rigorous benchmarking of the Nova proof system against its contemporaries. Should Nova showcase superior performance metrics, we stand ready to integrate it as our proof system of choice. Through such meticulous approaches, we not only reinforce the foundation of our project but also ensure its scalability and robustness in the ever-evolving landscape of blockchain technology.

Goal 1: Create a State Separation Architecture

The initial goal revolves around crafting a distinctive architecture that segregates public and private computations, employing an account-based framework for the public state and a UTXO-based structure for the private state.

The UTXO model [1,2], notably utilized in Bitcoin, generates new UTXOs to serve future transactions, while the account-based paradigm assigns balances to accounts that transactions can modify. Although the UTXO model bolsters privacy by concealing comprehensive balances, the pursuit of a dual architecture mandates a meticulous synchronization of these state models, ensuring that private transactions remain inconspicuous in the wider public network state.

This task is further complicated by the divergent transaction processing methods intrinsic to each model, necessitating a thoughtful and innovative approach to harmonize their functionality. To seamlessly bring together the dual architecture, harmonizing the account-based model for public state with the UTXO-based model for private state, a comprehensive strategy is essential.

The concept of blending an account-based structure with a UTXO-based model for differentiating between public and private states is intriguing. It seeks to leverage the strengths of both models: the simplicity and directness of the account-based model with the privacy enhancements of the UTXO model.

Here's a breakdown and a potential strategy for harmonizing these models:

Rationale Behind the Dual Architecture:

Account-Based Model: This model is intuitive and easy to work with. Every participant has an account, and transactions directly modify the balances of these accounts. It's conducive for smart contracts and a broad range of applications.
UTXO-Based Model: This model treats every transaction as a new output, which can then be used as an input for future transactions. By not explicitly associating transaction outputs with user identities, it offers a degree of privacy.

Harmonizing the Two Systems:

Translation Layer
- Role: Interface between UTXO and account-based states.
- UTXO-to-Account Adapter: When UTXOs are spent, the adapter can translate these into the corresponding account balance modifications. This could involve creating a temporary 'pseudo-account' that mirrors the UTXO's attributes.
- Account-to-UTXO Adapter: When an account wishes to make a private transaction, it would initiate a process converting a part of its balance to a UTXO, facilitating a privacy transaction.
Unified Identity Management
- Role: Maintain a unified identity (or address) system that works across both state models, allowing users to easily manage their public and private states without requiring separate identities.
- Deterministic Wallets: Use Hierarchical Deterministic (HD) wallets [3,4], enabling users to generate multiple addresses (both UTXO and account-based) from a single seed. This ensures privacy while keeping management centralized for the user.
State Commitments
- Role: Use cryptographic commitments to commit to the state of both models. This can help in efficiently validating cross-model transactions.
- Verkle Trees: Verkle Trees combine Vector Commitment and the KZG polynomial commitment scheme to produce a structure that's efficient in terms of both proofs and verification. Verkle proofs are considerably small in size (less data to store and transmit), where Transaction and state verifications can be faster due to the smaller proof sizes and computational efficiencies.
- Mimblewimble-style Aggregation [5]: For UTXOs, techniques similar to those used in Mimblewimble can be used to aggregate transactions, keeping the state compact and enhancing privacy.
Batch Processing & Anonymity Sets
- Role: Group several UTXO-based private transactions into a single public account-based transaction. This can provide a level of obfuscation and can make synchronization between the two models more efficient.
- CoinJoin Technique [6]: As seen in Bitcoin, multiple users can combine their UTXO transactions into one, enhancing privacy.
- Tornado Cash Principle [7]: For account-based systems wanting to achieve privacy, methods like those used in Tornado Cash can be implemented, providing zk-SNARKs-based private transactions.
Event Hooks & Smart Contracts
- Role: Implement event-driven mechanisms that trigger specific actions in one model based on events in the other. For instance, a private transaction (UTXO-based) can trigger a corresponding public notification or event in the account-based model.
- Conditional Execution: Smart contracts could be set to execute based on events in the UTXO system. For instance, a smart contract might release funds (account-based) once a specific UTXO is spent.
- Privacy Smart Contracts: Using zk-SNARKs or zk-STARKs to bring privacy to the smart contract layer, allowing for private logic execution.

Challenges and Solutions

Synchronization Overhead
- Challenge: Combining two distinct transaction models creates an inherent synchronization challenge.
- State Channels: By allowing transactions to be conducted off-chain between participants, state channels can alleviate synchronization stresses. Only the final state needs to be settled on-chain, drastically reducing the amount of data and frequency of updates required.
- Sidechains: These act as auxiliary chains to the main blockchain. Transactions can be processed on the sidechain and then periodically synced with the main chain. This structure helps reduce the immediate load on the primary system.
- Checkpointing: Introduce periodic checkpoints where the two systems' states are verified and harmonized. This can ensure consistency without constant synchronization.
Double Spending
- Challenge: With two models operating in tandem, there's an increased risk of double-spending attacks.
- Multi-Signature Transactions: Implementing transactions that require signatures from both systems can prevent unauthorized movements.
- Cross-Verification Mechanisms: Before finalizing a transaction, it undergoes verification in both UTXO and account-based systems. If discrepancies arise, the transaction can be halted.
- Timestamping: By attaching a timestamp to each transaction, it's possible to order them sequentially, making it easier to spot and prevent double spending.
Complexity in User Experience
- Challenge: The dual model, while powerful, is inherently complex.
- Abstracted User Interfaces: Design UIs that handle the complexity behind the scenes, allowing users to make transactions without needing to understand the nuances of the dual model.
- Guided Tutorials: Offer onboarding tutorials to acquaint users with the system's features, especially emphasizing when and why they might choose one transaction type over the other.
- Feedback Systems: Implement systems where users can provide feedback on any complexities or challenges they encounter. This real-time feedback can be invaluable for iterative design improvements.
Security
- Challenge: Merging two systems can introduce unforeseen vulnerabilities.
- Threat Modeling: Regularly conduct threat modeling exercises to anticipate potential attack vectors, especially those that might exploit the interaction between the two systems.
- Layered Security Protocols: Beyond regular audits, introduce multiple layers of security checks. Each layer can act as a fail-safe if a potential threat bypasses another.
- Decentralized Watchtowers: These are third-party services that monitor the network for malicious activities. If any suspicious activity is detected, they can take corrective measures or raise alerts.
Gas & Fee Management:
- Challenge: A dual model can lead to convoluted fee structures.
- Dynamic Fee Adjustment: Implement algorithms that adjust fees based on network congestion and transaction type. This can ensure fairness and prevent network abuse.
- Fee Estimation Tools: Provide tools that can estimate fees before a transaction is initiated. This helps users understand potential costs upfront.
- Unified Gas Stations: Design platforms where users can purchase or allocate gas for both transaction types simultaneously, simplifying the gas acquisition process.

By addressing these challenges head-on with a detailed and systematic approach, it's possible to unlock the full potential of a dual-architecture system, combining the strengths of both UTXO and account-based models without their standalone limitations.

Aspect	Details
Harmony	- Advanced VM Development: Design tailored for private smart contracts. - Leverage Established Architectures: Use WASM or RISC-V to harness their versatile and encompassing nature suitable for zero-knowledge applications. - Support for UTXO & Account-Based Models: Enhance adaptability across various blockchain structures.
Challenges	- Adaptation Concerns: WASM and RISC-V weren't designed with zero-knowledge proofs as a primary focus, posing integration challenges. - Complexities with Newer Systems: Systems like (Super)Nova, STARKs, and Sangria are relatively nascent, adding another layer of intricacy to the integration. - Optimization Concerns: Ensuring that these systems are optimized for zero-knowledge proofs.
Proposed Solutions	- Integration of Nova: Consider Nova's proof system for its potential alignment with project goals. - Comprehensive Testing: Rigorously test and benchmark against alternatives like Halo2, Plonky, and Starky to validate choices. - Poseidon Recursion Technique: To conduct exhaustive performance tests, providing insights into each system's efficiency and scalability.

Goal 2: Virtual Machine Creation

The second goal entails the creation of an advanced virtual machine by leveraging established mainstream instruction sets like WASM or RISC-V. Alternatively, the objective involves pioneering a new, specialized instruction set meticulously optimized for Zero-Knowledge applications.

This initiative seeks to foster a versatile and efficient environment for executing computations within the privacy-focused context of the project. Both WASM and RISC-V exhibit adaptability to both UTXO and account-based models due to their encompassing nature as general-purpose instruction set architectures.

WASM, operating as a low-level virtual machine, possesses the capacity to execute code derived from a myriad of high-level programming languages, and boasts seamless integration across diverse blockchain platforms.

Meanwhile, RISC-V emerges as a versatile option, accommodating both models, and can be seamlessly integrated with secure enclaves like SGX or TEE, elevating the levels of security and privacy. However, it is crucial to acknowledge that employing WASM or RISC-V might present challenges, given their original design without specific emphasis on optimizing for Zero-Knowledge Proofs (ZKPs).

Further complexity arises with the consideration of more potent proof systems like (Super)Nova, STARKs, and Sangria, which, while potentially addressing optimization concerns, necessitate extensive research and testing due to their relatively nascent status within the field. This accentuates the need for a judicious balance between established options and innovative solutions in pursuit of an architecture harmoniously amalgamating privacy, security, and performance.

The ambition to build a powerful virtual machine tailored to zero-knowledge (ZK) applications is both commendable and intricate. The combination of two renowned instruction sets, WASM and RISC-V, in tandem with ZK, is an innovation that could redefine privacy standards in blockchain. Let's dissect the challenges and possibilities inherent in this goal:

Established Mainstream Instruction Sets - WASM and RISC-V
- Strengths:
  - WASM: Rooted in its ability to execute diverse high-level language codes, its potential for cross-chain compatibility makes it a formidable contender. Serving as a low-level virtual machine, its role in the blockchain realm is analogous to that of the Java Virtual Machine in the traditional computing landscape.
  - RISC-V: This open-standard instruction set architecture has made waves due to its customizable nature. Its adaptability to both UTXO and account-based structures coupled with its compatibility with trusted execution environments like SGX and TEE augments its appeal, especially in domains that prioritize security and privacy.
- Challenges: Neither WASM nor RISC-V was primarily designed with ZKPs in mind. While they offer flexibility, they might lack the necessary optimizations for ZK-centric tasks. Adjustments to these architectures might demand intensive R&D efforts.
Pioneering a New, Specialized Instruction Set
- Strengths: A bespoke instruction set can be meticulously designed from the ground up with ZK in focus, potentially offering unmatched performance and optimizations tailored to the project's requirements.
- Challenges: Crafting a new instruction set is a monumental task requiring vast resources, including expertise, time, and capital. It would also need to garner community trust and support over time.
Contemporary Proof Systems - (Super)Nova, STARKs, Sangria
- Strengths: These cutting-edge systems, being relatively new, might offer breakthrough cryptographic efficiencies that older systems lack: designed with modern challenges in mind, they could potentially bridge the gap where WASM and RISC-V might falter in terms of ZKP optimization.
- Challenges: Their nascent nature implies a dearth of exhaustive testing, peer reviews, and potentially limited community support. The unknowns associated with these systems could introduce unforeseen vulnerabilities or complexities. While they could offer optimizations that address challenges presented by WASM and RISC-V, their young status demands rigorous vetting and testing.

	Mainstream (WASM, RISC-V)	ZK-optimized (New Instruction Set)
Existing Tooling	YES	NO
Blockchain-focused	NO	YES
Performant	DEPENDS	YES

Optimization Concerns for WASM and RISC-V:

Cryptography Libraries: ZKP applications rely heavily on specific cryptographic primitives. Neither WASM nor RISC-V natively supports all of these primitives. Thus, a comprehensive library of cryptographic functions, optimized for these platforms, needs to be developed.
Parallel Execution: Given the heavy computational demands of ZKPs, leveraging parallel processing capabilities can optimize the time taken. Both WASM and RISC-V would need modifications to handle parallel execution of ZKP processes efficiently.
Memory Management: ZKP computations can sometimes require significant amounts of memory, especially during the proof generation phase. Fine-tuned memory management mechanisms are essential to prevent bottlenecks.

Emerging ZKP Optimized Systems Considerations:

Proof Size: Different systems generate proofs of varying sizes. A smaller proof size is preferable for blockchain applications to save on storage and bandwidth. The trade-offs between proof size, computational efficiency, and security need to be balanced.
Universality: Some systems can support any computational statement (universal), while others might be tailored to specific tasks. A universal system can be more versatile for diverse applications on the blockchain.
Setup Requirements: Certain ZKP systems, like zk-SNARKs, require a trusted setup, which can be a security concern. Alternatives like zk-STARKs don't have this requirement but come with other trade-offs.

Strategies for Integration:

Iterative Development: Given the complexities, an iterative development approach can be beneficial. Start with a basic integration of WASM or RISC-V for general tasks and gradually introduce specialized ZKP functionalities.
Benchmarking: Establish benchmark tests specifically for ZKP operations. This will provide continuous feedback on the performance of the system as modifications are made, ensuring optimization.
External Audits & Research: Regular checks from cryptographic experts and collaboration with academic researchers can help in staying updated and ensuring secure implementations.

Goal 3: Proofs Creation and Verification

The process of generating proofs for private state updates is vested in the hands of the user, aligning with our commitment to minimizing trust assumptions and enhancing privacy. Concurrently, the responsibility of verifying these proofs and executing public functions within the virtual machine can be effectively delegated to an external prover, a role that is incentivized to operate with utmost honesty and integrity. This intricate balance seeks to safeguard against information leakage, preserving the confidentiality of private transactions. Integral to this mechanism is the establishment of a robust incentivization framework.

To ensure the prover’s steadfast commitment to performing tasks with honesty, we should introduce a mechanism that facilitates both rewards for sincere behavior and penalties for any deviation from the expected standards. This two-pronged approach serves as a compelling deterrent against dishonest behavior and fosters an environment of accountability. In addition to incentivization, a crucial consideration is the economic aspect of verification and execution. The verification process has been intentionally designed to be more cost-effective than execution.

This strategic approach prevents potential malicious actors from exploiting the system by flooding it with spurious proofs, a scenario that could arise when the costs align favorably. By maintaining a cost balance that favors verification, we bolster the system’s resilience against fraudulent activities while ensuring its efficiency. In sum, our multifaceted approach endeavors to strike an intricate equilibrium between user-initiated proof creation, external verification, and incentivization. This delicate interplay of mechanisms ensures a level of trustworthiness that hinges on transparency, accountability, and economic viability.

As a result, we are poised to cultivate an ecosystem where users’ privacy is preserved, incentives are aligned, and the overall integrity of the system is fortified against potential adversarial actions. To achieve the goals of user-initiated proof creation, external verification, incentivization, and cost-effective verification over execution, several options and mechanisms can be employed:

User-Initiated Proof Creation: Users are entrusted with the generation of proofs for private state updates, thus ensuring greater privacy and reducing trust dependencies.
- Challenges:
  - Maintaining the quality and integrity of the proofs generated by users.
  - Ensuring that users have the tools and knowledge to produce valid proofs.
- Solutions:
  - Offer extensive documentation, tutorials, and user-friendly tools to streamline the proof-generation process.
  - Implement checks at the verifier's end to ensure the quality of proofs.
External Verification by Provers: An external prover verifies the proofs and executes public functions within the virtual machine.
- Challenges:
  - Ensuring that the external prover acts honestly.
  - Avoiding centralized points of failure.
- Solutions:
  - Adopt a decentralized verification approach, with multiple provers cross-verifying each other’s work.
  - Use reputation systems to rank provers based on their past performances, creating a trust hierarchy.
** Incentivization Framework:** A system that rewards honesty and penalizes dishonest actions, ensuring provers' commitment to the task.
- Challenges:
  - Determining the right balance of rewards and penalties.
  - Ensuring that the system cannot be gamed for undue advantage.
- Solutions¹:
  - Implement a dynamic reward system that adjusts based on network metrics and provers' performance.
  - Use a staking mechanism where provers need to lock up a certain amount of assets. Honest behavior earns rewards, while dishonest behavior could lead to loss of staked assets.
Economic Viability through Cost Dynamics: Making verification more cost-effective than execution to deter spamming and malicious attacks.
- Challenges:
  - Setting the right cost metrics for both verification and execution.
  - Ensuring that genuine users aren’t priced out of the system.
- Solutions:
  - Use a dynamic pricing model, adjusting costs in real-time based on network demand.
  - Implement gas-like mechanisms to differentiate operation costs and ensure fairness.
** Maintaining Trustworthiness:** Create a system that's transparent, holds all actors accountable, and is economically sound.
- Challenges:
  - Keeping the balance where users feel their privacy is intact, while provers feel incentivized.
  - Ensuring the system remains resilient against adversarial attacks.
- Solutions:
  - Implement layered checks and balances.
  - Foster community involvement, allowing them to participate in decision-making, potentially through a decentralized autonomous organization (DAO).

Each of these options can be combined or customized to suit the specific requirements of your project, striking a balance between user incentives, cost dynamics, and verification integrity. A thoughtful combination of these mechanisms ensures that the system remains robust, resilient, and conducive to the objectives of user-initiated proof creation, incentivized verification, and cost- effective validation.

Aspect	Details
Design Principle	- User Responsibility: Generating proofs for private state updates. - External Prover: Delegated the task of verifying proofs and executing public VM functions.
Trust & Privacy	- Minimized Trust Assumptions: Place proof generation in users' hands. - Enhanced Privacy: Ensure confidentiality of private transactions and prevent information leakage.
Incentivization Framework	- Rewards: Compensate honest behavior. - Penalties: Deter and penalize dishonest behavior.
Economic Considerations	- Verification vs. Execution: Make verification more cost-effective than execution to prevent spurious proofs flooding. - Cost Balance: Strengthen resilience against fraudulent activities and maintain efficiency.
Outcome	An ecosystem where: - Users' privacy is paramount. - Incentives are appropriately aligned. - The system is robust against adversarial actions.

Goal 4: Kernel-based Architecture Implementation

This goal centers on the establishment of a kernel-based architecture, akin to the model observed in ZEXE, to facilitate the attestation of accurate private function executions. This innovative approach employs recursion to construct a call stack, which is then validated through iterative recursive computations. At its core, this technique harnesses a recursive Succinct Non-Interactive Argument of Knowledge (SNARK) mechanism, where each function call’s proof accumulates within the call stack.

The subsequent verification of this stack’s authenticity leverages recursive SNARK validation. While this method offers robust verification of private function executions, it’s essential to acknowledge its associated intricacies.

The generation of SNARK proofs necessitates a substantial computational effort, which, in turn, may lead to elevated gas fees for users. Moreover, the iterative recursive computations could potentially exhibit computational expansion as the depth of recursion increases. This calls for a meticulous balance between the benefits of recursive verification and the resource implications it may entail.

In essence, Goal 4 embodies a pursuit of enhanced verification accuracy through a kernel-based architecture. By weaving recursion and iterative recursive computations into the fabric of our system, we aim to establish a mechanism that accentuates the trustworthiness of private function executions, while conscientiously navigating the computational demands that ensue.

To accomplish the goal of implementing a kernel-based architecture for recursive verification of private function executions, several strategic steps and considerations can be undertaken: recursion handling and depth management.

Recursion Handling

Call Stack Management:
- Implement a data structure to manage the call stack, recording each recursive function call’s details, parameters, and state.
_Proof Accumulation: _
- Design a mechanism to accumulate proof data for each function call within the call stack. This includes cryptographic commitments, intermediate results, and cryptographic challenges.
- Ensure that the accumulated proof data remains secure and tamper-resistant throughout the recursion process.
Intermediary SNARK Proofs:
- Develop an intermediary SNARK proof for each function call’s correctness within the call stack. This proof should demonstrate that the function executed correctly and produced expected outputs.
- Ensure that the intermediary SNARK proof for each recursive call can be aggregated and verified together, maintaining the integrity of the entire call stack.

Depth management

Depth Limitation:
- Define a threshold for the maximum allowable recursion depth based on the system’s computational capacity, gas limitations, and performance considerations.
- Implement a mechanism to prevent further recursion beyond the defined depth limit, safeguarding against excessive computational growth.
Graceful Degradation:
- Design a strategy for graceful degradation when the recursion depth approaches or reaches the defined limit. This may involve transitioning to alternative execution modes or optimization techniques.
- Communicate the degradation strategy to users and ensure that the system gracefully handles scenarios where recursion must be curtailed.
Resource Monitoring:
- Develop tools to monitor resource consumption (such as gas usage and computational time) as recursion progresses. Provide real-time feedback to users about the cost and impact of recursive execution.
Dynamic Depth Adjustment:
- Consider implementing adaptive depth management that dynamically adjusts the recursion depth based on network conditions, transaction fees, and available resources.
- Utilize algorithms to assess the optimal recursion depth for efficient execution while adhering to gas cost constraints.
Fallback Mechanisms:
- Create fallback mechanisms that activate if the recursion depth limit is reached or if the system encounters resource constraints. These mechanisms could involve alternative verification methods or delayed execution.
User Notifications:
- Notify users when the recursion depth limit is approaching, enabling them to make informed decisions about the complexity of their transactions and potential resource usage.

Goal 4 underscores the project's ambition to integrate the merits of a kernel-based architecture with recursive verifications to bolster the reliability of private function executions. While the approach promises robust outcomes, it's pivotal to maneuver through its intricacies with astute strategies, ensuring computational efficiency and economic viability. By striking this balance, the architecture can realize its full potential in ensuring trustworthy and efficient private function executions.

Goal 5: Seamless Interaction Design

Goal 5 revolves around the meticulous design of a seamless interaction between public and private states within the blockchain ecosystem. This objective envisions achieving not only composability between contracts but also the harmonious integration of private and public functions.

A notable challenge in this endeavor lies in the intricate interplay between public and private states, wherein the potential linkage of a private transaction to a public one raises concerns about unintended information leakage.

The essence of this goal entails crafting an architecture that facilitates the dynamic interaction of different states while ensuring that the privacy and confidentiality of private transactions remain unbreached. This involves the formulation of mechanisms that enable secure composability between contracts, guaranteeing the integrity of interactions across different layers of functionality.

A key focus of this goal is to surmount the challenge of information leakage by implementing robust safeguards. The solution involves devising strategies to mitigate the risk of revealing private transaction details when connected to corresponding public actions. By creating a nuanced framework that com- partmentalizes private and public interactions, the architecture aims to uphold privacy while facilitating seamless interoperability.

Goal 5 encapsulates a multifaceted undertaking, calling for the creation of an intricate yet transparent framework that empowers users to confidently engage in both public and private functions, without compromising the confidentiality of private transactions. The successful realization of this vision hinges on a delicate blend of architectural ingenuity, cryptographic sophistication, and user-centric design.

To achieve seamless interaction between public and private states, composability, and privacy preservation, a combination of solutions and approaches can be employed. In the table below, a comprehensive list of solutions that address these objectives:

Solution Category	Description
Layer 2 Solutions	Employ zk-Rollups, Optimistic Rollups, and state channels to handle private interactions off-chain and settle them on-chain periodically. Boost scalability and cut transaction costs.
Intermediary Smart Contracts	Craft smart contracts as intermediaries for secure public-private interactions. Use these to manage data exchange confidentially.
Decentralized Identity & Pseudonymity	Implement decentralized identity systems for pseudonymous interactions. Validate identity using cryptographic proofs.
Confidential Sidechains & Cross-Chain	Set up confidential sidechains and employ cross-chain protocols to ensure private and composability across blockchains.
Temporal Data Structures	Create chronological data structures for secure interactions. Utilize cryptographic methods for data integrity and privacy.
Homomorphic Encryption & MPC	Apply homomorphic encryption and MPC for computations on encrypted data and interactions between state layers.
Commit-Reveal Schemes	Introduce commit-reveal mechanisms for private transactions, revealing data only post necessary public actions.
Auditability & Verifiability	Use on-chain tools for auditing and verifying interactions. Utilize cryptographic commitments for third-party validation.
Data Fragmentation & Sharding	Fragment data across shards for private interactions and curtailed data exposure. Bridge shards securely with cryptography.
Ring Signatures & CoinJoin	Incorporate ring signatures and CoinJoin protocols to mask transaction details and mix transactions collaboratively.

Goal 6: Integration of DeFi Protocols with a Privacy-Preserving Framework

The primary aim of Goal 6 is to weave key DeFi protocols, such as AMMs and staking, into a user-centric environment that accentuates privacy. This endeavor comes with inherent challenges, especially considering the heterogeneity of existing DeFi protocols, predominantly built on Ethereum. These variations in programming languages and VMs exacerbate the quest for interoperability. Furthermore, the success and functionality of DeFi protocols is closely tied to liquidity, which in turn is influenced by user engagement and the amount of funds locked into the system.

Strategic Roadmap for Goal 6

** Pioneering Privacy-Centric DeFi Models: ** Initiate the development of AMMs and staking solutions that are inherently protective of users' transactional privacy and identity.
** Specialized Smart Contracts with Privacy: ** Architect distinct smart contracts infused with privacy elements, setting the stage for secure user interactions within this new, confidential DeFi landscape.
** Optimized User Interfaces: ** Craft interfaces that resonate with user needs, simplifying the journey through the private DeFi space without compromising on security.
** Tackling Interoperability: **
- Deploy advanced bridge technologies and middleware tools to foster efficient data exchanges and guarantee operational harmony across a spectrum of programming paradigms and virtual environments.
- Design and enforce universal communication guidelines that bridge the privacy-centric DeFi entities with the larger DeFi world seamlessly.
** Enhancing and Sustaining Liquidity: **
- Unveil innovative liquidity stimuli and yield farming incentives, compelling users to infuse liquidity into the private DeFi space.
- Incorporate adaptive liquidity frameworks that continually adjust based on the evolving market demands, ensuring consistent liquidity.
- Forge robust alliances with other DeFi stalwarts, jointly maximizing liquidity stores and honing sustainable token distribution strategies.
** Amplifying Community Engagement:** Design and roll out enticing incentive schemes to rally users behind privacy-focused AMMs and staking systems, thereby nurturing a vibrant, privacy-advocating DeFi community.

Through the integration of these approaches, we aim to achieve Goal 6, providing users with a privacy-focused platform for engaging effortlessly in core DeFi functions such as AMMs and staking, all while effectively overcoming the obstacles related to interoperability and liquidity concerns.

Summary of the Architecture

In our quest to optimize privacy, we're proposing a Zero-Knowledge Virtual Machine (Zkvm) that harnesses the power of Zero-Knowledge Proofs (ZKPs). These proofs ensure that while private state data remains undisclosed, public state transitions can still be carried out and subsequently verified by third parties. This blend of public and private state is envisaged to be achieved through a state tree representing the public state, while the encrypted state leaves stand for the private state. Each user's private state indicates validity through the absence of a corresponding nullifier. A nullifier is a unique cryptographic value generated in privacy-preserving blockchain transactions to prevent double-spending, ensuring that each private transaction is spent only once without revealing its details.

Private functions' execution mandates users to offer a proof underscoring the accurate execution of all encapsulated private calls. For validating a singular private function call, we're leaning into the kernel-based model inspired by the ZEXE protocol. Defined as kernel circuits, these functions validate the correct execution of each private function call. Due to their recursive circuit structure, a succession of private function calls can be executed by calculating proofs in an iterative manner. Execution-relevant data, like private and public call stacks and additions to the state tree, are incorporated as public inputs.

Our method integrates the verification keys for these functions within a merkle tree. Here's the innovation: a user's ZKP showcases the existence of the verification key in this tree, yet keeps the executed function concealed. The unique function identifier can be presented as the verification key, with all contracts merkleized for hiding functionalities.

We suggest a nuanced shift from the ZEXE protocol's identity function, which crafts an identity for smart contracts delineating its behavior, access timeframes, and other functionalities. Instead of the ZEXE protocol's structure, our approach pivots to a method anchored in the security of a secret combined with the uniqueness from hashing with the contract address. The underlying rationale is straightforward: the sender, equipped with a unique nonce and salt for the transaction, hashes the secret, payload, nonce, and salt. This result is then hashed with the contract address for the final value. The hash function's unidirectional nature ensures that the input cannot be deduced easily from its output. A specific concern, however, is the potential repetition of secret and payload values across transactions, which could jeopardize privacy. Yet, by embedding the function's hash within the hash of the contract address, users can validate a specific function's execution without divulging the function, navigating this limitation.

Alternative routes do exist: We could employ signature schemes like ECDSA, focusing on uniqueness and authenticity, albeit at the cost of complex key management. Fully Homomorphic Encryption (FHE) offers another pathway, enabling function execution on encrypted data, or Multi-Party Computation (MPC) which guarantees non-disclosure of function or inputs. Yet, integrating ZKPs with either FHE or MPC presents a challenge. Combining cryptographic functions like SHA-3 and BLAKE2 can also bolster security and uniqueness. It's imperative to entertain these alternatives, especially when hashing might not serve large input/output functions effectively or might fall short in guaranteeing uniqueness.

Current State

Our aim is to revolutionize the privacy and security paradigms through Nescience. As we strive to set milestones and achieve groundbreaking advancements, our current focus narrows onto the realization of Goal 2 and Goal 3.

Our endeavors to build a powerful virtual machine tailored for Zero-Knowledge applications have led us down the path of rigorous exploration and testing. We believe that integrating the right proof system is pivotal to our project's success, which brings us to Nova [8]. In our project journey, we have opted to integrate the Nova proof system, recognizing its potential alignment with our overarching goals. However, as part of our meticulous approach to innovation and optimization, we acknowledge the need to thoroughly examine Nova’s performance capabilities, particularly due to its status as a pioneering and relatively unexplored proof system.

This critical evaluation entails a comprehensive process of benchmarking and comparative analysis [9], pitting Nova against other prominent proof systems in the field, including Halo2 [10], Plonky2 [11], and Starky [12]. This ongoing and methodical initiative is designed to ensure a fair and impartial assessment, enabling us to draw meaningful conclusions about Nova’s strengths and limitations in relation to its counterparts. By leveraging the Poseidon recursion technique, we are poised to conduct an exhaustive performance test that delves into intricate details. Through this testing framework, we aim to discern whether Nova possesses the potential to outshine its contemporaries in terms of efficiency, scalability, and overall performance. The outcome of this rigorous evaluation will be pivotal in shaping our strategic decisions moving forward. Armed with a comprehensive understanding of Nova’s performance metrics vis-à-vis other proof systems, we can confidently chart a course that maximizes the benefits of our project’s optimization efforts.

Moreover, as we ambitiously pursue the establishment of a robust mechanism for proof creation and verification, our focus remains resolute on preserving user privacy, incentivizing honest behaviour, and ensuring the cost-effective verification of transactions. At the heart of this endeavor is our drive to empower users by allowing them the autonomy of generating proofs for private state updates, thereby reducing dependencies and enhancing privacy. We would like to actively work on providing comprehensive documentation, user-friendly tools, and tutorials to aid users in this intricate process.

Parallelly, we're looking into decentralized verification processes, harnessing the strength of multiple external provers that cross-verify each other's work. Our commitment is further cemented by our efforts to introduce a dynamic reward system that adjusts based on network metrics and prover performance. This intricate balance, while challenging, aims to fortify our system against potential adversarial actions, aligning incentives, and preserving the overall integrity of the project.

References

[1] Nakamoto, S. (2008). Bitcoin: A Peer-to-Peer Electronic Cash System. Retrieved from https://bitcoin.org/bitcoin.pdf

[2] Sanchez, F. (2021). Cardano’s Extended UTXO accounting model. Retrived from https://iohk.io/en/blog/posts/2021/03/11/cardanos-extended-utxo-accounting-model/

[3] Morgan, D. (2020). HD Wallets Explained: From High Level to Nuts and Bolts. Retrieved from https://medium.com/mycrypto/the-journey-from-mnemonic-phrase-to-address-6c5e86e11e14

[4] Wuille, P. (012). Bitcoin Improvement Proposal (BIP) 44. Retrieved from https://github.com/bitcoin/bips/blob/master/bip-0032.mediawiki

[5] Jedusor, T. (2020). Introduction to Mimblewimble and Grin. Retrieved from https://github.com/mimblewimble/grin/blob/master/doc/intro.md

[6] Bitcoin's official wiki overview of the CoinJoin method. Retrieved from https://en.bitcoin.it/wiki/CoinJoin

[7] TornadoCash official Github page. Retrieved from https://github.com/tornadocash/tornado-classic-ui

[8] Kothapalli, A., Setty, S., Tzialla, I. (2021). Nova: Recursive Zero-Knowledge Arguments from Folding Schemes. Retrieved from https://eprint.iacr.org/2021/370

[9] ZKvm Github page. Retrieved from https://github.com/vacp2p/zk-explorations

[10] Electric Coin Company (2020). Explaining Halo 2. Retrieved from https://electriccoin.co/blog/explaining-halo-2/

[11] Polygon Labs (2022). Introducing Plonky2. Retrieved from https://polygon.technology/blog/introducing-plonky2

[12] StarkWare (2021). ethSTARK Documentation. Retrieved from https://eprint.iacr.org/2021/582

Incentive Mechanisms:
- Token Rewards: Design a token-based reward system where honest provers are compensated with tokens for their verification services. This incentivizes participation and encourages integrity.
- Staking and Slashing: Introduce a staking mechanism where provers deposit tokens as collateral. Dishonest behavior results in slashing (partial or complete loss) of the staked tokens, while honest actions are rewarded.
- Proof of Work/Proof of Stake: Implement a proof-of-work or proof-of- stake consensus mechanism for verification, aligning incentives with the blockchain’s broader consensus mechanism.
↩

Device Pairing in Js-waku and Go-waku

2023-04-24T12:00:00.000Z

Device pairing and secure message exchange using Waku and noise protocol.

As the world becomes increasingly connected through the internet, the need for secure and reliable communication becomes paramount. In this article it is described how the Noise protocol can be used as a key-exchange mechanism for Waku.

Recently, this feature was introduced in js-waku and go-waku, providing a simple API for developers to implement secure communication protocols using the Noise Protocol framework. These open-source libraries provide a solid foundation for building secure and decentralized applications that prioritize data privacy and security.

This functionality is designed to be simple and easy to use, even for developers who are not experts in cryptography. The library offers a clear and concise API that abstracts away the complexity of the Noise Protocol framework and provides an straightforward interface for developers to use. Using this, developers can effortlessly implement secure communication protocols on top of their JavaScript and Go applications, without having to worry about the low-level details of cryptography.

One of the key benefits of using Noise is that it provides end-to-end encryption, which means that the communication between two parties is encrypted from start to finish. This is essential for ensuring the security and privacy of sensitive information

Device Pairing

In today's digital world, device pairing has become an integral part of our lives. Whether it's connecting our smartphones with other computers or web applications, the need for secure device pairing has become more crucial than ever. With the increasing threat of cyber-attacks and data breaches, it's essential to implement secure protocols for device pairing to ensure data privacy and prevent unauthorized access.

To demonstrate how device pairing can be achieved using Waku and Noise, we have examples available at https://examples.waku.org/noise-js/. You can try pairing different devices, such as mobile and desktop, via a web application. This can be done by scanning a QR code or opening a URL that contains the necessary data for a secure handshake.

The process works as follows:

Actors:

Alice the initiator
Bob the responder

The first step in achieving secure device pairing using Noise and Waku is for Bob generate the pairing information which could be transmitted out-of-band. For this, Bob opens https://examples.waku.org/noise-js/ and a QR code is generated, containing the data required to do the handshake. This pairing QR code is timeboxed, meaning that after 2 minutes, it will become invalid and a new QR code must be generated
Alice scans the QR code using a mobile phone. This will open the app with the QR code parameters initiating the handshake process which is described in WAKU2-DEVICE-PAIRING. These messages are exchanged between two devices over Waku to establish a secure connection. The handshake messages consist of three main parts: the initiator's message, the responder's message, and the final message, which are exchanged to establish a secure connection. While using js-noise, the developer is abstracted of this process, since the messaging happens automatically depending on the actions performed by the actors in the pairing process.
Both Alice and Bob will be asked to verify each other's identity. This is done by confirming if an 8-digits authorization code match in both devices. If both actors confirm that the authorization code is valid, the handshake concludes succesfully
Alice and Bob receive a set of shared keys that can be used to start exchanging encrypted messages. The shared secret keys generated during the handshake process are used to encrypt and decrypt messages sent between the devices. This ensures that the messages exchanged between the devices are secure and cannot be intercepted or modified by an attacker.

The above example demonstrates device pairing using js-waku. Additionally, You can also try building and experimenting with other noise implementations like nwaku, or go-waku, with an example available at https://github.com/waku-org/go-waku/tree/master/examples/noise in which the same flow described before is done with Bob (the receiver) using go-waku instead of js-waku.

Conclusion

With its easy to use API built on top of the Noise Protocol framework and the LibP2P networking stack, if you are a developer looking to implement secure messaging in their applications that are both decentralized and censorship resistant, Waku is definitely an excellent choice worth checking out!

Waku is also Open source with a MIT and APACHEv2 licenses, which means that developers are encouraged to contribute code, report bugs, and suggest improvements to make it even better.

Don't hesitate to try the live example at https://examples.waku.org/noise-js and build your own webapp using https://github.com/waku-org/js-noise, https://github.com/waku-org/js-waku and https://github.com/waku-org/go-waku. This will give you a hands-on experience of implementing secure communication protocols using the Noise Protocol framework in a practical setting. Happy coding!

Vac Research Blog

Decentralized Message Layer Security (De-MLS) with Waku

Introduction​

Background​

MLS​

Waku​

de-MLS​

Waku Integration​

Flow​

1. Steward joins the welcome topic​

2. Group initialization​

3. Emitting Group Anouncement (GA) by Steward​

4. User joins the welcome topic​

5. User creates its key package​

6. Steward receives the User's key package​

7. Creation of Voting proposals​

8. Voting for proposal​

9. Creating commit message​

10. Sending messages​

11. Applying welcome message​

Benchmark​

Potential drawbacks and countermeasures​

Conclusion​

Future Work​

References​

Scaling libp2p GossipSub for Large Messages: An Evaluation of Performance Improvement Proposals

Overview​

Message Transfer Time and Duplicates​

Protocols Considered​

Push-Pull Phase Transition (PPPT)​

Key Considerations and PoC Implementation​

GossipSub v1.4 Proposal​

Key Considerations and PoC Implementation​

GossipSub v2.0 Proposal​

Key Considerations and PoC Implementation​

Experiments​

Results​

Scenario1: Increasing Network Size​

Scenario2: Increasing Message Size​

Scenario3: Increasing Number of Publishers​

Findings​

References​

Zerokit optimizations: A performance journey

Background​

The Challenge​

The importance of benchmarks​

Benchmarking with Rust's criterion crate​

Promising results​

Conclusion​

Nim in Logos - 1st Edition

Nim 2.2 – Better Stability, Smarter Memory, and Smoother Development​

Error Handling in Nim: Why Results Beat Exceptions​

The Exception Problem​

The Result Advantage​

Best Practices for Result-Based Error Handling​

When to Break the Rules​

Debugging in Nim​

GDB​

Logs - Chronicles​

Logs - echo​

Heaptrack​

Formatting code in Nim​

The MDSECheck method: choosing secure square MDS matrices for P-SP-networks

Introduction​

MDS matrix: how to define and construct​

Partial substitution-permutation networks​

Square MDS matrix security check in the context of P-SPNs​

MDSECheck method: getting rid of the matrix powers​

MDSECheck library crate: implementation in Rust​

Conclusion​

References​

Vac 2024 Year in Review

Nescience​

Highlights​

Looking forward​

Token Economics (TKE)​

Highlights​

Looking forward​

Quality Assurance (QA)​

Highlights​

Introduction

Background

MLS

Waku

de-MLS

Waku Integration

Flow

1. Steward joins the welcome topic

2. Group initialization

3. Emitting Group Anouncement (GA) by Steward

4. User joins the welcome topic

5. User creates its key package

6. Steward receives the User's key package

7. Creation of Voting proposals

8. Voting for proposal

9. Creating commit message

10. Sending messages

11. Applying welcome message

Benchmark

Potential drawbacks and countermeasures

Conclusion

Future Work

References

Overview

Message Transfer Time and Duplicates

Protocols Considered

Push-Pull Phase Transition (PPPT)

Key Considerations and PoC Implementation

GossipSub v1.4 Proposal

Key Considerations and PoC Implementation

GossipSub v2.0 Proposal

Key Considerations and PoC Implementation

Experiments

Results

Scenario1: Increasing Network Size

Scenario2: Increasing Message Size

Scenario3: Increasing Number of Publishers

Findings

References

Background

The Challenge

The importance of benchmarks

Benchmarking with Rust's criterion crate

Promising results

Conclusion

Nim 2.2 – Better Stability, Smarter Memory, and Smoother Development

Error Handling in Nim: Why Results Beat Exceptions

The Exception Problem

The Result Advantage

Best Practices for Result-Based Error Handling

When to Break the Rules

Debugging in Nim

GDB

Logs - Chronicles

Logs - echo

Heaptrack

Formatting code in Nim

Introduction

MDS matrix: how to define and construct

Partial substitution-permutation networks

Square MDS matrix security check in the context of P-SPNs

MDSECheck method: getting rid of the matrix powers

MDSECheck library crate: implementation in Rust

Conclusion

References

Nescience

Highlights

Looking forward

Token Economics (TKE)

Highlights

Looking forward

Quality Assurance (QA)

Highlights

Looking forward

RFC

Highlights

Looking forward

Applied Cryptography and ZK (ACZ)

Highlights

Looking forward