mirror of
https://github.com/vacp2p/vac.dev.git
synced 2026-01-08 22:28:01 -05:00
new URL structure for root pages and research posts, refs #115
173
rlog/2019-07-19-p2p-data-sync-for-mobile.mdx
Normal file
@@ -0,0 +1,173 @@
---
title: 'P2P Data Sync for Mobile'
date: 2019-07-19 12:00:00
authors: oskarth
published: true
slug: p2p-data-sync-for-mobile
categories: research
image: /img/mvds_interactive.png

toc_min_heading_level: 2
toc_max_heading_level: 5
---

A research log. Reliable and decentralized, pick two.

<!--truncate-->

Together with decanus, I've been working on the problem of data sync lately.

In building p2p messaging systems, one problem you quickly come across is the problem of reliably transmitting data. If there's no central server with high availability guarantees, you can't meaningfully guarantee that data has been transmitted. One way of solving this problem is through a synchronization protocol.

There are many synchronization protocols out there, and I won't go into detail here on how they differ from our approach. Some common examples are Git and BitTorrent, but there are also projects like IPFS, Swarm, Dispersy, Matrix, Briar, SSB, etc.
## Problem motivation

Why do we want to do p2p sync for mobile phones in the first place? There are three components to that question. The first is the value of decentralization and peer-to-peer, the second is why we'd want to reliably sync data at all, and the third is why mobile phones and other resource-restricted devices.

### Why p2p?

For decentralization and p2p, there are both technical and social/philosophical reasons. Technically, having a user-run network means it can scale with the number of users. Data locality is also improved if you query data that's close to you, similar to distributed CDNs. The throughput is also improved if there are more places to get data from.

Socially and philosophically, there are several ways to think about it. Open and decentralized networks also relate to the idea of open standards: compare, for example, the longevity of AOL with that of IRC or BitTorrent. One is run by a company and is shut down as soon as it stops being profitable, the others live on. Additionally, control of data and infrastructure is increasingly becoming a liability. By having a network with no one in control, everyone is. It's ultimately a form of democratization, more similar to the organic social structures that existed before the big Internet companies. This leads to properties such as censorship resistance and coercion resistance, where we limit the impact a third party might have on a voluntary interaction between individuals or a group of people. Examples of such third-party impact are plentiful in the world of Facebook, YouTube, Twitter and WeChat.

### Why reliably sync data?

At the risk of stating the obvious, reliably syncing data is a requirement for many problem domains. You don't get this by default in a p2p world, as it is unreliable, with nodes permissionlessly joining and leaving the network. In some cases you can get away with only ephemeral data, but usually you want some kind of guarantee. This is a must for a reliable group chat experience, for example, where messages are expected to arrive in a timely fashion and in some reasonable order. The same is true for messages that represent financial transactions, and so on.

### Why mobile phones?

Most devices people use daily are mobile phones. It's important to provide them with the same, or at least similar, guarantees as more traditional p2p nodes that might run on a desktop computer. The alternative is to rely on gateways, which share many of the drawbacks of centralized control and are prone to censorship, control and surveillance.

More generally, resource-restricted devices can differ in their capabilities. One example is smartphones, but others are desktops, routers, Raspberry Pis, POS systems, and so on. The number and diversity of devices are exploding, and it's useful to be able to leverage this for various types of infrastructure. The alternative is to centralize on big cloud providers, which also lends itself to a lack of democratization, to censorship, etc.
## Minimal Requirements

For requirements or design goals for a solution, here's what we came up with.

1. MUST sync data reliably between devices. By reliably we mean having the ability to deal with messages being out of order, dropped, duplicated, or delayed.

2. MUST NOT rely on any centralized services for reliability. By centralized services we mean any single point of failure that isn’t one of the endpoint devices.

3. MUST allow for mobile-friendly usage. By mobile-friendly we mean devices that are resource restricted, mostly-offline and often changing network.

4. MAY use helper services in order to be more mobile-friendly. Examples of helper services are decentralized file storage solutions such as IPFS and Swarm. These help with availability and latency of data for mostly-offline devices.

5. MUST have the ability to provide causal consistency. By causal consistency we mean the commonly accepted definition in distributed systems literature. This means messages that are causally related can achieve a partial ordering.

6. MUST support ephemeral messages that don’t need replication. That is, allow for messages that don’t need to be reliably transmitted but still need to be transmitted between devices.

7. MUST allow for privacy-preserving messages and extreme data loss. By privacy-preserving we mean things such as exploding messages (self-destructing messages). By extreme data loss we mean the ability for two trusted devices to recover from a deliberate or accidental removal of data.

8. MUST be agnostic to whatever transport it is running on. It should not rely on specific semantics of the transport it is running on, nor be tightly coupled with it. This means a transport can be swapped out without loss of reliability between devices.
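To make requirements 1 and 5 a bit more concrete, here is a minimal sketch of one way a receiver could tolerate out-of-order and duplicated messages while still delivering causally related messages in order. The `Message` and `CausalBuffer` names, and the idea of carrying parent IDs in each message, are assumptions made for illustration; they are not taken from the MVDS spec.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    """Hypothetical message: an ID plus the IDs it causally depends on."""
    msg_id: str
    parents: tuple = ()


class CausalBuffer:
    """Delivers a message only after all of its parents have been delivered."""

    def __init__(self):
        self.delivered = []  # message IDs in delivery order
        self.pending = {}    # msg_id -> Message still waiting on parents

    def receive(self, msg):
        if msg.msg_id in self.delivered or msg.msg_id in self.pending:
            return  # duplicate: ignore
        self.pending[msg.msg_id] = msg
        self._flush()

    def _flush(self):
        # Keep delivering until no pending message has all parents delivered.
        progressed = True
        while progressed:
            progressed = False
            for msg in list(self.pending.values()):
                if all(p in self.delivered for p in msg.parents):
                    self.delivered.append(msg.msg_id)
                    del self.pending[msg.msg_id]
                    progressed = True


buf = CausalBuffer()
# Messages arrive out of order, and "b" arrives twice.
arrivals = [Message("c", ("b",)), Message("b", ("a",)),
            Message("b", ("a",)), Message("a")]
for m in arrivals:
    buf.receive(m)
print(buf.delivered)
```

Messages with no causal relationship to each other can be delivered in any order here, which is why the result is only a partial ordering.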
## MVDS - a minimum viable version

The first minimum viable version is in an alpha stage. It has a [specification](https://rfc.vac.dev/spec/2) and an [implementation](https://github.com/vacp2p/mvds), and we have deployed it in a [console client](https://github.com/status-im/status-console-client) for end-to-end functionality. It's heavily inspired by the [Bramble Sync Protocol](https://code.briarproject.org/briar/briar-spec/blob/master/protocols/BSP.md).

The spec is fairly minimal. You have nodes that exchange records over some secure transport. These records are of different types, such as `OFFER`, `MESSAGE`, `REQUEST`, and `ACK`. A peer keeps track of the state of each message for each node it is interacting with. There's also logic for message retransmission with exponential delay. The positive ACK and retransmission model is quite similar to how TCP is designed.

There are two different modes of syncing: interactive and batch mode. See the sequence diagrams below.

Interactive mode:

![Interactive mode](/img/mvds_interactive.png)

Batch mode:

![Batch mode](/img/mvds_batch.png)

Which mode should you choose? It's a tradeoff between latency and bandwidth. If you want to minimize latency, batch mode is better. If you care about preserving bandwidth, interactive mode is better. The choice is up to each node.
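The per-peer state tracking and retransmission with exponential delay described above can be sketched roughly as follows. This is only an illustration: the `PeerMessageState` class, its field names and the epoch-based timing are assumptions, not the data structures from the spec or the implementation.

```python
from dataclasses import dataclass
from enum import Enum


class RecordType(Enum):
    """The record types named in the post."""
    OFFER = "offer"
    REQUEST = "request"
    MESSAGE = "message"
    ACK = "ack"


@dataclass
class PeerMessageState:
    """Hypothetical bookkeeping for one message sent to one peer."""
    message_id: str
    send_count: int = 0
    next_send_epoch: int = 0
    acked: bool = False

    def should_send(self, epoch):
        return not self.acked and epoch >= self.next_send_epoch

    def mark_sent(self, epoch, base_delay=1, multiplier=2):
        # Exponential delay: the wait grows by `multiplier` after every send.
        self.send_count += 1
        self.next_send_epoch = epoch + base_delay * multiplier ** (self.send_count - 1)

    def mark_acked(self):
        # A positive ACK stops all further retransmissions.
        self.acked = True


state = PeerMessageState("msg-1")
sent_at = []
for epoch in range(20):
    if state.should_send(epoch):
        sent_at.append(epoch)
        state.mark_sent(epoch)
    if epoch == 10:
        state.mark_acked()  # pretend the ACK arrives at epoch 10
print(sent_at)
```

In this toy run the message is retransmitted with growing gaps until the ACK arrives, after which nothing more is sent.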
### Basic simulation

Initial ad hoc bandwidth and latency testing shows some issues with a naive approach. Running with the [default simulation settings](https://github.com/vacp2p/mvds/):

- communicating nodes: 2
- nodes using interactive mode: 2
- interval between messages: 5s
- time node is offline: 90%
- nodes each node is sharing with: 2

we notice a [huge overhead](https://notes.status.im/7QYa4b6bTH2wMk3HfAaU0w#). More specifically, we see a ~5 minute latency overhead and a bandwidth multiplier of 100-1000x, i.e. 2-3 orders of magnitude, just for receiving a message with interactive mode, without acks.

Now, that seems terrible. A moment of reflection will reveal why that is. If each node is offline uniformly 90% of the time, that means that each record will be lost 90% of the time. Since interactive mode requires offer, request, payload (and then ack), that's three links just for Bob to receive the actual message.

Each failed attempt implies another retransmission. That means we have an expected overhead of `(1/0.1)^3 = 1000` to receive a message in interactive mode. The latency follows naturally from that, with the retransmission logic.
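The back-of-the-envelope estimate above can be written down as a tiny helper. It simply encodes the heuristic from the text (each link succeeds independently with the same probability, and every failure restarts the exchange); the function name is made up for this sketch.

```python
def expected_transmissions(p_success, links):
    """Expected overhead when `links` consecutive records must all get
    through, each succeeding independently with probability `p_success`."""
    return (1 / p_success) ** links


# Interactive mode: offer, request, payload = three links at 10% each.
interactive = expected_transmissions(0.1, 3)
# Two links (e.g. message and ack) at 10% each, for comparison.
two_links = expected_transmissions(0.1, 2)
print(round(interactive), round(two_links))
```

This is the same heuristic, not a more rigorous model, so it inherits the same limitations.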
### Mostly-offline devices

The problem above hints at requirements 3 and 4. While we did get reliable syncing (requirement 1), it came at a big cost.

There are a few ways of getting around this issue. One is having a _store and forward_ model, where some intermediary node picks up (encrypted) messages and forwards them to the recipient. This is what we have in production right now at Status.

Another, arguably more pure and robust, way is having a _remote log_, where the actual data is spread over some decentralized storage layer, and you have a mutable reference to find the latest messages, similar to DNS.

What they both have in common is that they act as a sort of highly-available cache to smooth over the non-overlapping connection windows between two endpoints. Neither of them is _required_ to get reliable data transmission.
### Basic calculations for bandwidth multiplier

While we do want better simulations, and this is a work in progress, we can also look at the above scenarios using some basic calculations. This allows us to build a better intuition and reason about the problem without having to write code. Let's start with some assumptions:

- two nodes exchanging a single message in batch mode
- 10% uniformly random uptime for each node
- in HA cache case, 100% uptime of a piece of infrastructure C
- retransmission every epoch (with constant or exponential backoff)
- only looking at average (p50) case
#### First case, no helper services

A sends a message to B, and B acks it.

```
A message -> B (10% chance of arrival)
A <- ack B (10% chance of arrival)
```

With a constant backoff, A will send messages at epoch `1, 2, 3, ...`. With exponential backoff and a multiplier of 2, this would be `1, 2, 4, 8, ...`. Let's assume constant backoff for now, as this is what will influence the success rate and thus the bandwidth multiplier.

There's a difference between _time to receive_ and _time to stop sending_. Assuming each send attempt is independent, it takes on average 10 epochs for A's message to arrive at B. Furthermore:

1. A will send messages until it receives an ACK.
2. B will send an ACK if it receives a message.

Only one in ten of A's messages reaches B, and only one in ten of B's acks reaches A. To get an average of one ack through, A therefore needs to send 100 messages, and B sends on average 10 acks. That's a multiplier of roughly 100, which is roughly what we saw with the simulation above for receiving a message in interactive mode.
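As a rough sanity check of the numbers in this first case, here is a small Monte Carlo sketch. It assumes exactly the simplified model above (constant backoff with one send per epoch, and independent 10% arrival chances for both message and ack); the function name and the fixed seed are choices made for this example, and it is not the simulation linked earlier.

```python
import random


def simulate_exchange(p_arrival=0.1, rng=random):
    """One A -> B exchange with constant backoff (one send per epoch).

    Returns (messages sent by A, acks sent by B) up to the point where
    A receives its first ack and stops retransmitting.
    """
    sends, acks = 0, 0
    while True:
        sends += 1
        if rng.random() < p_arrival:      # message reaches B
            acks += 1                     # B acks every message it receives
            if rng.random() < p_arrival:  # ack reaches A
                return sends, acks


rng = random.Random(42)  # fixed seed so runs are repeatable
trials = 20_000
results = [simulate_exchange(rng=rng) for _ in range(trials)]
avg_sends = sum(s for s, _ in results) / trials
avg_acks = sum(a for _, a in results) / trials
print(f"A sends ~{avg_sends:.0f} messages, B sends ~{avg_acks:.0f} acks")
```

Under these assumptions the averages should land near the 100 messages and 10 acks estimated above.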
#### Second case, high-availability caching layer

Let's introduce a helper node or piece of infrastructure, C. Whenever A or B sends a message, it also sends it to C. Whenever A or B comes online, it queries for messages with C.

```
A message -> B (10% chance of arrival)
A message -> C (100% chance of arrival)
B <- req/res -> C (100% chance of arrival)
A <- ack B (10% chance of arrival)
C <- ack B (100% chance of arrival)
A <- req/res -> C (100% chance of arrival)
```

What's the probability that A's messages will arrive at B? Directly, it's still 10%. But we can assume it's 100% that C picks up the message. (Giving C a 90% success rate doesn't materially change the numbers.)

B will pick up A's message from C after an average of 10 epochs. Then B will send an ack to A, which will also be picked up by C 100% of the time. Once A comes online again, it'll query C and receive B's ack.

Assuming we use exponential backoff with a multiplier of 2, A will send a message directly to B at epoch `1, 2, 4, 8` (assuming it is online). At this point, epoch `10`, B will be online in the average case. These direct sends will likely fail, but B will pick the message up from C and send one ack, both directly to A and to be picked up by C. Once A comes online, it'll query C and receive the ack from B, which means it won't do any more retransmits.

How many messages have been sent? Not counting interactions with C, A sends 4 (at most) and B 1. Depending on whether the interaction with C is direct or indirect (i.e. multicast), the factor for interaction with C will be ~2. This means the total bandwidth multiplier is likely to be `<10`, which is a lot more acceptable.

Since the syncing semantics are end-to-end, this is without relying on the reliability of C.
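The count of direct sends used in this second case follows from the backoff schedule. Here is a small sketch of that schedule; the `send_epochs` helper is hypothetical and only reproduces the `1, 2, 3, ...` and `1, 2, 4, 8, ...` sequences mentioned in the text.

```python
def send_epochs(until_epoch, multiplier=2):
    """Epochs at which A (re)transmits, starting at epoch 1.

    multiplier=1 gives constant backoff (1, 2, 3, ...);
    multiplier=2 gives exponential backoff (1, 2, 4, 8, ...).
    """
    epochs, epoch, gap = [], 1, 1
    while epoch <= until_epoch:
        epochs.append(epoch)
        epoch += gap
        gap *= multiplier
    return epochs


exponential = send_epochs(10)             # direct sends before epoch 10
constant = send_epochs(10, multiplier=1)  # what constant backoff would cost
print(exponential, len(exponential), len(constant))
```

With exponential backoff A has made at most four direct attempts by the time B is expected to come online, compared with ten under constant backoff.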
#### Caveat

Note that both of these are probabilistic arguments. They are also based on heuristics. More formal analysis would be desirable, as well as better simulations to experimentally verify them. In fact, the calculations could very well be wrong!
## Future work

There are many enhancements that can be made and are desirable. Let's outline a few.

1. Data sync clients. Examples of actual usage of data sync, with more interesting domain semantics. This also includes usage of sequence numbers and DAGs to know what content is missing and ought to be synced.

2. Remote log. As alluded to above, this is necessary. It needs a clearer specification and solid proofs of concept.

3. More efficient ways of syncing with a large number of nodes. When the number of nodes goes up, the algorithmic complexity doesn't look great. This also touches on things such as ambient content discovery.

4. More robust simulations and real-world deployments. The existing simulation is ad hoc, and there are many improvements that can be made to gain more confidence and identify issues. Additionally, better formal analysis.

5. Example usage over multiple transports, including things like sneakernet and meshnets. The described protocol is designed to work over unstructured, structured and private p2p networks. In some cases it can leverage differences in topology, such as multicast or direct connections.
93
rlog/2019-08-02-vac-overview.mdx
Normal file
@@ -0,0 +1,93 @@
---
layout: post
name: 'Vac - A Rough Overview'
title: 'Vac - A Rough Overview'
date: 2019-08-02 12:00:00
authors: oskarth
published: true
slug: vac-overview
categories: research
---

Vac is a modular peer-to-peer messaging stack, with a focus on secure messaging. Overview of terms, stack and open problems.

<!--truncate-->

Vac is a **modular peer-to-peer messaging stack, with a focus on secure messaging**. What does that mean? Let's unpack it a bit.
## Basic terms

_messaging stack_. While the initial focus is on [data sync](https://vac.dev/p2p-data-sync-for-mobile), we are concerned with all layers in the stack. That means all the way from underlying transports, p2p overlays and routing, to initial trust establishment and semantics for things like group chat. The ultimate goal is to give application developers the tools they need to provide secure messaging for their users, so they can focus on their domain expertise.

_modular_. Unlike many other secure messaging applications, our goal is not to have a tightly coupled set of protocols, nor is it to reinvent the wheel. Instead, we aim to provide options at each layer in the stack, and build on the shoulders of giants, putting a premium on interoperability. It's similar in philosophy to projects such as [libp2p](https://libp2p.io/) or [Substrate](https://www.parity.io/substrate/) in that regard. Each choice comes with different trade-offs, and these look different for different applications.

_peer-to-peer_. The protocols we work on are pure p2p, and aim to minimize centralization. This too is in opposition to many initiatives in the secure messaging space.

_messaging_. By messaging we mean messaging in a generalized sense. This includes both human-to-human communication and machine-to-machine communication. By messaging we also mean something more fundamental than text messages; we include things like transactions (state channels, etc.) under this moniker.

_secure messaging_. Outside of traditional notions of secure messaging, such as ensuring end-to-end encryption, forward secrecy, avoiding MITM attacks, etc., we are also concerned with two other forms of secure messaging. We call these _private messaging_ and _censorship-resistance_. Private messaging means viewing privacy as a security property, with all that entails. Censorship resistance ties into being p2p, but also means allowing for transports and overlays that can't easily be censored by port blocking, traffic analysis, and similar.

_Vāc_. A Vedic goddess of speech. It also hints at being a vaccine.
## Protocol stack

What does this stack look like? We take inspiration from [core](https://tools.ietf.org/html/rfc793) [internet architecture](https://www.ietf.org/rfc/rfc1122.txt), existing [survey work](https://css.csail.mit.edu/6.858/2020/readings/secure-messaging.pdf) and other [efforts](https://code.briarproject.org/briar/briar/wikis/A-Quick-Overview-of-the-Protocol-Stack) that have been done to decompose the problem into orthogonal pieces. Each layer provides its own set of properties and only interacts with the layers it is adjacent to. Note that this is a rough sketch.

| Layer / Protocol    | Purpose                           | Examples             |
| ------------------- | --------------------------------- | -------------------- |
| Application layer   | End user semantics                | 1:1 chat, group chat |
| Data Sync           | Data consistency                  | MVDS, BSP            |
| Secure Transport    | Confidentiality, PFS, etc         | Double Ratchet, MLS  |
| Transport Privacy   | Transport and metadata protection | Whisper, Tor, Mixnet |
| P2P Overlay         | Overlay routing, NAT traversal    | devp2p, libp2p       |
|                     |                                   |                      |
| Trust Establishment | Establishing end-to-end trust     | TOFU, web of trust   |

As an example, end user semantics such as group chat or moderation capabilities can largely work regardless of specific choices further down the stack. Similarly, using a mesh network or Tor doesn't impact the use of Double Ratchet at the Secure Transport layer.

Data Sync plays a similar role to what TCP does at the transport layer in a traditional Internet architecture, and for some applications something more like UDP is likely to be desirable.

In terms of specific properties and trade-offs at each layer, we'll go deeper into them as we study them. For now, this is best treated as a rough sketch or mental map.
## Problems and rough priorities

With all the pieces involved, this is quite an undertaking. Luckily, a lot of pieces are already in place and can be either incorporated as-is or iterated on. For the medium and long term, here's a rough sketch of priorities and open problems.

1. **Better data sync.** While the current [MVDS](https://rfc.vac.dev/spec/2/) works, it is lacking in a few areas:

   - Lack of remote log for mostly-offline devices
   - Better scalability for multi-user chat contexts
   - Better usability in terms of application layer usage and supporting more transports

2. **Better transport layer support.** Currently MVDS runs primarily over Whisper, which has a few issues:

   - scalability, being able to run with many nodes
   - spam-resistance, proof of work is a poor mechanism for heterogeneous devices
   - no incentivized infrastructure, leading to centralized choke points

In addition to these most immediate concerns, there are other open problems. Some of these overlap with the above.

3. **Adaptive nodes.** Better support for resource restricted devices and nodes of varying capabilities. Light connection strategy for resources and guarantees. Security games to outsource processing with guarantees.

4. **Incentivized and spam-resistant messaging.** Reasons to run infrastructure and not relying on altruistic nodes. For spam resistance, in p2p multicast spam is a big attack vector due to amplification. There are a few interesting directions here, such as EigenTrust, proof of burn with micropayments, and leveraging zero-knowledge proofs.

5. **Strong privacy guarantees at transport privacy layer**. More rigorous privacy guarantees and explicit trade-offs for metadata protection. Includes Mixnet.
6. **Censorship-resistant and robust P2P overlay**. NAT traversal; running in the browser; mesh networks; pluggable transports for traffic obfuscation.

7. **Scalable and decentralized conversational security.** Strong security guarantees, such as forward secrecy and post-compromise security, for large group chats. Includes projects such as MLS and extending Double Ratchet.

8. **Better trust establishment and key handling**. Avoiding MITM attacks while still enabling a good user experience. Protecting against ghost users in group chat and providing better ways to do key handling.

There is also a set of more general problems that touch multiple layers:

9. **Ensuring modularity and interoperability**. Providing interfaces that allow for existing and new protocols to be used at each layer of the stack.

10. **Better specifications**. Machine-readable and formally verified specifications. More rigorous analysis of exact guarantees and behaviors. Exposing work in such a way that it can be analyzed by academics.

11. **Better simulations**. Providing infrastructure and tooling to be able to test protocols in adverse environments and at scale.

12. **Enabling excellent user experience**. A big reason for the lack of widespread adoption of secure messaging is the fact that more centralized, insecure methods provide a better user experience. Given that incentives can align better for users interested in secure messaging, providing an even better user experience should be doable.

---

We've got some work to do. Come help us if you want. See you in the next update!
113
rlog/2019-10-04-remote-log.mdx
Normal file
@@ -0,0 +1,113 @@
---
layout: post
name: 'P2P Data Sync with a Remote Log'
title: 'P2P Data Sync with a Remote Log'
date: 2019-10-04 12:00:00
authors: oskarth
published: true
slug: remote-log
categories: research
summary:
image: /img/remote-log.png
---

A research log. Asynchronous P2P messaging? Remote logs to the rescue!

<!--truncate-->

A big problem when doing end-to-end data sync between mobile nodes is that most devices are offline most of the time. With a naive approach, you quickly run into issues of 'ping-pong' behavior, where messages have to be constantly retransmitted. We saw some basic calculations of what this bandwidth multiplier looks like in a [previous post](https://vac.dev/p2p-data-sync-for-mobile).

While you could do some background processing, this is really battery-draining, and on iOS these capabilities are limited. A better approach is to loosen the constraint that two nodes need to be online at the same time. How do we do this? There are two main approaches: one is the _store and forward_ model, and the other is a _remote log_.

In the _store and forward_ model, we use an intermediate node that forwards messages on behalf of the recipient. In the _remote log_ model, you instead replicate the data onto some decentralized storage, and have a mutable reference to the latest state, similar to DNS. While both work, the latter is somewhat more elegant and "pure", as it has less strict requirements on an individual node's uptime. Both act as a highly-available cache to smooth over non-overlapping connection windows between endpoints.

In this post we are going to describe how such a remote log schema could work. Specifically, how it enhances p2p data sync and takes care of the [following requirements](https://vac.dev/p2p-data-sync-for-mobile):

> 3. MUST allow for mobile-friendly usage. By mobile-friendly we mean devices
> that are resource restricted, mostly-offline and often changing network.

> 4. MAY use helper services in order to be more mobile-friendly. Examples of
> helper services are decentralized file storage solutions such as IPFS and
> Swarm. These help with availability and latency of data for mostly-offline
> devices.
## Remote log

A remote log is a replication of a local log. This means a node can read data from a node that is offline.

The spec is in an early draft stage and can be found [here](https://github.com/vacp2p/specs/pull/16). A very basic [spike](<https://en.wikipedia.org/wiki/Spike_(software_development)>) / proof-of-concept can be found [here](https://github.com/vacp2p/research/tree/master/remote_log).
### Definitions

| Term       | Definition                                                                |
| ---------- | ------------------------------------------------------------------------- |
| CAS        | Content-addressed storage. Stores data that can be addressed by its hash. |
| NS         | Name system. Associates mutable data to a name.                           |
| Remote log | Replication of a local log at a different location.                       |
### Roles

There are four fundamental roles:

1. Alice
2. Bob
3. Name system (NS)
4. Content-addressed storage (CAS)

The _remote log_ is the data format of what is stored in the name system.

"Bob" can represent anything from 0 to N participants. Unlike Alice, Bob only needs read-only access to NS and CAS.

### Flow

![notification](/img/remote-log.png)
### Data format

The remote log lets receiving nodes know what data they are missing. Depending on the specific requirements and capabilities of the nodes and name system, the information can be referred to differently. We distinguish between three rough modes:

1. Fully replicated log
2. Normal sized page with CAS mapping
3. "Linked list" mode - minimally sized page with CAS mapping

A remote log is simply a mapping from message identifiers to their corresponding address in a CAS:

| Message Identifier (H1) | CAS Hash (H2) |
| ----------------------- | ------------- |
| H1_3                    | H2_3          |
| H1_2                    | H2_2          |
| H1_1                    | H2_1          |
|                         |               |
| _address to next page_  |               |

The numbers here correspond to messages. Optionally, the content itself can be included, just like it normally would be sent over the wire. This bypasses the need for a dedicated CAS and additional round-trips, with a trade-off in bandwidth usage.

| Message Identifier (H1) | Content |
| ----------------------- | ------- |
| H1_3                    | C3      |
| H1_2                    | C2      |
| H1_1                    | C1      |
|                         |         |
| _address to next page_  |         |

Both patterns can be used in parallel, e.g. by storing the last `k` messages directly and using CAS pointers for the rest. Together with the `next_page` semantics, this gives users flexibility in terms of bandwidth and latency/indirection, all the way from a simple linked list to a fully replicated log. The latter is useful for things like backups on durable storage.
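To illustrate the data format described above, here is a minimal sketch of a page that stores the last `k` messages inline and the rest as CAS pointers. Everything in it is an assumption made for illustration: the `Page` class, the in-memory dictionary standing in for a CAS, and the use of SHA-256 for both message identifiers and CAS addresses are not taken from the draft spec.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Optional


def h(data: bytes) -> str:
    """Stand-in hash function (assumption: SHA-256 hex digest)."""
    return hashlib.sha256(data).hexdigest()


@dataclass
class Page:
    """Hypothetical remote-log page, newest entries first."""
    inline: dict = field(default_factory=dict)    # message id -> content
    pointers: dict = field(default_factory=dict)  # message id -> CAS hash
    next_page: Optional[str] = None               # address of the next page


def build_page(messages, cas, k=1, next_page=None):
    """Store the last `k` messages inline and the rest as CAS pointers."""
    page = Page(next_page=next_page)
    for i, content in enumerate(reversed(messages)):
        msg_id = h(b"id:" + content)  # placeholder for the real identifier
        if i < k:
            page.inline[msg_id] = content
        else:
            cas_hash = h(content)
            cas[cas_hash] = content   # "upload" to content-addressed storage
            page.pointers[msg_id] = cas_hash
    return page


def read_page(page, cas):
    """Resolve every entry on a page to its content."""
    out = dict(page.inline)
    for msg_id, cas_hash in page.pointers.items():
        out[msg_id] = cas[cas_hash]   # one extra lookup per pointer
    return out


cas = {}  # in-memory stand-in for a CAS such as Swarm or IPFS
page = build_page([b"C1", b"C2", b"C3"], cas, k=1)
contents = read_page(page, cas)
print(len(page.inline), len(page.pointers), sorted(contents.values()))
```

Setting `k` to the number of messages corresponds to the fully replicated log, while `k=0` with one entry per page approaches the "linked list" mode.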
### Interaction with MVDS

[vac.mvds.Message](https://rfc.vac.dev/spec/2/#payloads) payloads are the only payloads that MUST be uploaded. Other message types MAY be uploaded, depending on the implementation.
## Future work

The spec is still in an early draft stage, so it is expected to change. The same goes for the proof of concept. More work is needed on getting a fully featured proof of concept with specific CAS and NS instances, e.g. Swarm and Swarm Feeds, or IPFS and IPNS, or something else.

For data sync in general:

- Make consistency guarantees more explicit for app developers with support for sequence numbers and DAGs, as well as the ability to send non-synced messages. E.g. ephemeral typing notifications, linear/sequential history and causal consistency/DAG history
- Better semantics and scalability for multi-user sync contexts, e.g. CRDTs and joining multiple logs together
- Better usability in terms of application layer usage (data sync clients) and supporting more transports

---

PS1. Thanks to everyone who submitted great [logo proposals](https://explorer.bounties.network/bounty/3389) for Vac!

PS2. Next week, on October 10th, decanus and I will be presenting Vac at [Devcon](https://devcon.org/agenda), come say hi :)
153
rlog/2019-11-08-feasibility-semaphore-rate-limiting-zksnarks.mdx
Normal file
@@ -0,0 +1,153 @@
---
layout: post
name: 'Feasibility Study: Semaphore rate limiting through zkSNARKs'
title: 'Feasibility Study: Semaphore rate limiting through zkSNARKs'
date: 2019-11-08 12:00:00
authors: oskarth
published: true
slug: feasibility-semaphore-rate-limiting-zksnarks
categories: research
image: /img/peacock-signaling.jpg
discuss: https://forum.vac.dev/t/discussion-feasibility-study-semaphore-rate-limiting-through-zksnarks/21

toc_min_heading_level: 2
toc_max_heading_level: 5
---

A research log. Zero knowledge signaling as a rate limiting mechanism to prevent spam in p2p networks.

<!--truncate-->

**tldr: Moon math promising for solving spam in Whisper, but to get there we need to invest more in performance work and technical upskilling.**
## Motivating problem
|
||||
|
||||
In open p2p networks for messaging, one big problem is spam-resistance. Existing solutions, such as Whisper's proof of work, are insufficient, especially for heterogeneous nodes. Other reputation-based approaches might not be desirable, due to issues around arbitrary exclusion and privacy.
|
||||
|
||||
One possible solution is to use a right-to-access staking-based method, where a node is only able to send a message, signal, at a certain rate, and otherwise they can be slashed. One problem with this is in terms of privacy-preservation, where we specifically don't want a user to be tied to a specific payment or unique fingerprint.
|
||||
|
||||
### Related problems
|
||||
|
||||
In addition to the above, there are a lot of related problems that share similarities in terms of their structure and proposed solution.
|
||||
|
||||
- Private transactions ([Zcash](https://z.cash/), [AZTEC](https://www.aztecprotocol.com/))
|
||||
- Private voting ([Semaphore](https://github.com/kobigurk/semaphore))
|
||||
- Private group membership (Semaphore)
|
||||
- Layer 2 scaling, possibly layer 1 ([ZK Rollup](https://ethresear.ch/t/on-chain-scaling-to-potentially-500-tx-sec-through-mass-tx-validation/3477); StarkWare/Eth2-3)
|
||||
|
||||
## Overview
|
||||
|
||||
## Basic terminology
|
||||
|
||||
A _zero-knowledge proof_ allows a _prover_ to show a _verifier_ that they know something, without revealing what that something is. This means you can do trust-minimized computation that is also privacy preserving. As a basic example, instead of showing your ID when going to a bar, you simply give the doorman a proof that you are over 18, without revealing anything else about yourself.
|
||||
|
||||
_zkSNARKs_ are a form of zero-knowledge proof. There are many types of zero-knowledge proofs, and the field is evolving rapidly. They come with various trade-offs in terms of things such as: trusted setup, cryptographic assumptions, proof/verification key size, proof/verification time, proof size, etc. See section below for more.
|
||||
|
||||
_Semaphore_ is a framework/library/construct on top of zkSNARKs. It allows for zero-knowledge signaling, specifically on top of Ethereum. This means an approved user can broadcast some arbitrary string without revealing their identity, given some specific constraints. An approved user is someone who has been added to a certain Merkle tree. See [current Github home](https://github.com/kobigurk/semaphore) for more.
|
||||
|
||||
_Circom_ is a DSL for writing arithmetic circuits that can be used in zkSNARKs, similar to how you might write a NAND gate. See [Github](https://github.com/iden3/circom) for more.
|
||||
|
||||
## Basic flow
|
||||
|
||||
We start with a private voting example, and then extend it to the slashable rate limiting example.
|
||||
|
||||
1. A user registers an identity (arbitrary keypair), along with a small fee, to a smart contract. This adds them to a Merkle tree and allows them to prove that they are a member of that group, without revealing who they are.
|
||||
|
||||
2. When a user wants to send a message, they compute a zero-knowledge proof. The proof ensures certain invariants, has some _public outputs_, and can be verified by anyone (including a smart contract).
3. Any node can verify the proof, including smart contracts on chain (as of the Byzantium hard fork). Additionally, a node can have rules for the public output. In the case of voting, one such rule is that a specific output hash has to be equal to some predefined value, such as "2020-01-01 vote on Foo Bar for president".
|
||||
4. Because of how the proof is constructed, and the rules around output values, this ensures that: a user is part of the approved set of voters and that a user can only vote once.
|
||||
5. As a consequence of above, we have a system where registered users can only vote once, no one can see who voted for what, and this can all be proven and verified.
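
The "vote only once" rule in steps 3-5 can be sketched without any zero-knowledge machinery. The following is a minimal illustration under stated assumptions, not Semaphore's actual implementation: SHA-256 stands in for whatever circuit-friendly hash a real circuit would use, and the names `nullifier` and `Verifier` are hypothetical. In the real system the nullifier is a public output whose correctness is guaranteed by the proof, so the identity secret is never shown to the verifier.

```python
import hashlib


def h(*parts: bytes) -> bytes:
    """SHA-256 over length-prefixed parts (stand-in for a SNARK-friendly hash)."""
    digest = hashlib.sha256()
    for part in parts:
        digest.update(len(part).to_bytes(4, "big"))
        digest.update(part)
    return digest.digest()


def nullifier(identity_secret: bytes, external_nullifier: bytes) -> bytes:
    """Deterministic per (identity, vote): the same voter on the same vote
    always produces the same value, without exposing the identity itself."""
    return h(identity_secret, external_nullifier)


class Verifier:
    """Tracks the nullifiers already seen for one predefined vote."""

    def __init__(self, external_nullifier: bytes):
        self.external_nullifier = external_nullifier
        self.seen = set()

    def accept(self, claimed_external: bytes, nullifier_hash: bytes) -> bool:
        # Rule on the public output: the signal must target this exact vote.
        if claimed_external != self.external_nullifier:
            return False
        # A repeated nullifier means the same identity is voting twice.
        if nullifier_hash in self.seen:
            return False
        self.seen.add(nullifier_hash)
        return True


vote = b"2020-01-01 vote on Foo Bar for president"
verifier = Verifier(vote)
first = verifier.accept(vote, nullifier(b"alice-secret", vote))
second = verifier.accept(vote, nullifier(b"alice-secret", vote))
other = verifier.accept(vote, nullifier(b"bob-secret", vote))
```

Here the first vote from each identity is accepted and the repeated one is rejected, while the verifier only ever sees opaque hashes.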
|
||||
|
||||
### Rate limiting example
|
||||
|
||||
In the case of rate limiting, we do want nodes to send multiple messages. This changes steps 3-5 above somewhat.
|
||||
|
||||
_NOTE: It is a bit more involved than this, and if we precompute proofs the flow might look a bit different. But the general idea is the same_.
|
||||
|
||||
1. Instead of having a rule that you can only vote once, we have a rule that you can only send one message per epoch. An epoch here can be every second, as defined by UTC date time ±20s.
|
||||
2. Additionally, if a user sends more than one message per epoch, one of the public outputs is a random share of a private key. Using Shamir's Secret Sharing (similar to a multisig) with a 2-of-3 threshold as an example: in the normal case only 1 of 3 key shares is revealed, which is insufficient to reconstruct the key. In the case where two messages are sent in an epoch, 2 of 3 shares are probably revealed, which is sufficient to reconstruct the key (unless the same random share is revealed twice).
3. This means any untrusted user who detects a spamming user can use the revealed shares to access the private key corresponding to the spammer's funds in the contract, and thus slash them.
|
||||
|
||||
4. As a consequence of the above, we have a system where registered users can only send X messages per epoch, and no one can see who is sending which messages. Additionally, if a user violates the rate limit, they can be punished and any user can profit from it.
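
The slashing idea can be sketched with a threshold of two shares. This is a minimal sketch under assumptions, not the scheme as implemented anywhere: the prime `P`, the function names, and the choice to derive each share's x-coordinate from the message hash are all illustrative. The private key is the constant term of a line; every message in an epoch reveals one point on that line, so one message reveals nothing useful and two distinct messages reveal the key.

```python
import hashlib
from typing import Tuple

# Field modulus: 2**127 - 1 is a Mersenne prime (illustrative choice).
P = 2**127 - 1

Share = Tuple[int, int]


def share_for_message(secret: int, epoch_coeff: int, message: bytes) -> Share:
    """Return one point (x, y) on the line y = secret + epoch_coeff * x (mod P).

    x is derived from the message, so sending the same message twice
    yields the same share and reveals nothing new.
    """
    x = int.from_bytes(hashlib.sha256(message).digest(), "big") % P
    y = (secret + epoch_coeff * x) % P
    return x, y


def recover_secret(a: Share, b: Share) -> int:
    """Interpolate the line through two distinct shares and evaluate at x = 0."""
    (x1, y1), (x2, y2) = a, b
    if x1 == x2:
        raise ValueError("need two distinct shares")
    slope = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    return (y1 - slope * x1) % P


secret_key = 123456789        # key controlling the staked funds
coeff = 987654321             # fresh per epoch, known only to the sender
s1 = share_for_message(secret_key, coeff, b"hello")
s2 = share_for_message(secret_key, coeff, b"spam spam spam")
recovered = recover_secret(s1, s2)
```

Anyone who observes both shares in the same epoch can compute `recovered`, which equals `secret_key`. The modular inverse via `pow(x, -1, P)` requires Python 3.8 or later.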
|
||||
|
||||
### Briefly on scope of 'approved users'
|
||||
|
||||
In the case of an application like Status, this construct can either be a global StatusNetwork group, or one per chat, or network, etc. It can be applied both at the network and user level. There are no specific limitations on where or who deploys this, and it is thus more of a UX consideration.
|
||||
|
||||
## Technical details
|
||||
|
||||
For a fairly self-contained set of examples above, see exploration in [Vac research repo](https://github.com/vacp2p/research/blob/master/zksnarks/semaphore/src/hello.js). Note that the Shamir secret sharing is not inside the SNARK, but out-of-band for now.
|
||||
|
||||
The [current version](https://github.com/kobigurk/semaphore) of Semaphore uses NodeJS and [Circom](https://github.com/iden3/circom) from Iden3 for SNARKs.
|
||||
|
||||
For more on the rate limiting idea, see [ethresearch post](https://ethresear.ch/t/semaphore-rln-rate-limiting-nullifier-for-spam-prevention-in-anonymous-p2p-setting/5009/).
|
||||
|
||||
## Feasibility
|
||||
|
||||
The above repo was used to exercise the basic paths and to gain intuition about feasibility. Based on it and related reading, we outline a few blockers and things that require further study.
|
||||
|
||||
### Technical feasibility
|
||||
|
||||
#### Proof time
|
||||
|
||||
Proving time for Semaphore (<https://github.com/kobigurk/semaphore>) zkSNARKs using circom, groth and snarkjs is currently far too long: generating a proof takes on the order of ~10 minutes. With Websnark, it is likely to take 30s, which might still be too long. We should experiment with native code on mobile here.
|
||||
|
||||
See [details](https://github.com/vacp2p/research/issues/7).
|
||||
|
||||
#### Proving key size
|
||||
|
||||
Prover key size is ~110 MB for Semaphore. Assuming it is embedded on a mobile device, it bloats the APK a lot. The current APK size is ~30 MB, and even that might be high for people with limited bandwidth.
|
||||
|
||||
See [details](https://github.com/vacp2p/research/issues/8).
|
||||
|
||||
#### Trusted setup
|
||||
|
||||
With zkSNARKs, a trusted setup is required to generate prover and verifier keys. As part of this setup, a toxic parameter lambda is generated. If a party gets access to this lambda, they can prove anything. This means people using zkSNARKs usually hold an elaborate MPC ceremony to ensure this parameter doesn't get discovered.
|
||||
|
||||
See [details](https://github.com/vacp2p/research/issues/9).
|
||||
|
||||
#### Shamir logic in SNARK
|
||||
|
||||
For [Semaphore RLN](https://ethresear.ch/t/semaphore-rln-rate-limiting-nullifier-for-spam-prevention-in-anonymous-p2p-setting/5009) we need to embed the Shamir logic inside the SNARK in order to do slashing for spam. Currently the [implementation](https://github.com/vacp2p/research/blob/master/zksnarks/semaphore/src/hello.js#L450) is trusted and very hacky.
|
||||
|
||||
See [details](https://github.com/vacp2p/research/issues/10).
|
||||
|
||||
#### End to end integration
|
||||
|
||||
The [current code](https://github.com/vacp2p/research/blob/master/zksnarks/semaphore/src/hello.js) is standalone and doesn't touch multiple users, a deployed contract with Merkle tree and verification, actual transactions, a mocked network, adding/removing members, etc. There are bound to be edge cases and unknown unknowns here.
|
||||
|
||||
See [details](https://github.com/vacp2p/research/issues/11).
|
||||
|
||||
#### Licensing issues
|
||||
|
||||
Currently Circom [uses a GPL license](https://github.com/iden3/circom/blob/master/COPYING), which can get tricky when it comes to the App Store etc.
|
||||
|
||||
See [details](https://github.com/vacp2p/research/issues/12).
|
||||
|
||||
#### Alternative ZKPs?
|
||||
|
||||
Some of the isolated blockers for zkSNARKs ([#7](https://github.com/vacp2p/research/issues/7), [#8](https://github.com/vacp2p/research/issues/8), [#9](https://github.com/vacp2p/research/issues/9)) might be mitigated by the use of other ZKP technology. However, they likely have their own issues.
|
||||
|
||||
See [details](https://github.com/vacp2p/research/issues/13).
|
||||
|
||||
### Social feasibility
|
||||
|
||||
#### Technical skill
|
||||
|
||||
zkSNARKs and related technologies are quite new. Learning how they work and getting an intuition for them requires individuals to dedicate a lot of time to studying them. This means we must make gaining competence in these technologies a priority if we wish to use them to our advantage.
|
||||
|
||||
#### Time and resources
|
||||
|
||||
In order for this and related projects (such as private transactions) to get anywhere, it must be made an explicit area of focus for an extended period of time.
|
||||
|
||||
## General thoughts
|
||||
|
||||
Similar to Whisper, and in line with moving towards protocol and infrastructure, we need to upskill and invest resources into this. This doesn't mean developing all of the technologies ourselves, but gaining enough competence to leverage and extend existing solutions by the growing ZKP community.
|
||||
|
||||
For example, this might also include leveraging largely ready-made solutions such as AZTEC for private transactions; more fundamental research into ZK rollup and similar; using Semaphore for private group membership and private voting; a Nim-based wrapper around Bellman, etc.
|
||||
|
||||
## Acknowledgement
|
||||
|
||||
Thanks to Barry Whitehat for patient explanation and pointers. Thanks to WJ for helping with runtime issues.
|
||||
|
||||
_Peacock header image from [Tonos](<https://en.wikipedia.org/wiki/File:Flickr_-_lo.tangelini_-_Tonos_(1).jpg>)._
|
||||
297
rlog/2019-12-03-fixing-whisper-with-waku.mdx
Normal file
@@ -0,0 +1,297 @@
|
||||
---
|
||||
layout: post
|
||||
name: 'Fixing Whisper with Waku'
|
||||
title: 'Fixing Whisper with Waku'
|
||||
date: 2019-12-03 12:00:00
|
||||
authors: oskarth
|
||||
published: true
|
||||
slug: fixing-whisper-with-waku
|
||||
categories: research
|
||||
image: /img/whisper_scalability.png
|
||||
discuss: https://forum.vac.dev/t/discussion-fixing-whisper-with-waku/27
|
||||
---
|
||||
|
||||
A research log. Why Whisper doesn't scale and how to fix it.
|
||||
|
||||
<!--truncate-->
|
||||
|
||||
This post will introduce Waku. Waku is a fork of Whisper that attempts to
|
||||
address some of Whisper's shortcomings in an iterative fashion. We will also
|
||||
introduce a theoretical scaling model for Whisper that shows why it doesn't
|
||||
scale, and what can be done about it.
|
||||
|
||||
## Introduction
|
||||
|
||||
Whisper is a gossip-based communication protocol or an ephemeral key-value store
|
||||
depending on which way you look at it. Historically speaking, it is the
|
||||
messaging pillar of [Web3](http://gavwood.com/dappsweb3.html), together with
|
||||
Ethereum for consensus and Swarm for storage.
|
||||
|
||||
Whisper, being a somewhat esoteric protocol and with some fundamental issues,
|
||||
hasn't seen a lot of usage. However, applications such as Status are using it,
|
||||
and have been making minor ad hoc modifications to it to make it run on mobile
|
||||
devices.
|
||||
|
||||
What are these fundamental issues? In short:
|
||||
|
||||
1. scalability, most immediately when it comes to bandwidth usage
|
||||
2. spam-resistance, proof of work is a poor mechanism for heterogeneous nodes
|
||||
3. no incentivized infrastructure, leading to centralized choke points
|
||||
4. lack of formal and unambiguous specification makes it hard to analyze and implement
|
||||
5. running over devp2p, which limits where it can run and how
|
||||
|
||||
In this post, we'll focus on the first problem, which is scalability through bandwidth usage.
|
||||
|
||||
## Whisper theoretical scalability model
|
||||
|
||||
_(Feel free to skip this section if you want to get right to the results)._
|
||||
|
||||
There's widespread implicit knowledge that Whisper "doesn't scale", but it is less understood exactly why. This theoretical model attempts to encode some of its characteristics, specifically for a use case such as the one by Status (see [Status Whisper usage spec](https://specs.status.im/spec/3)).
|
||||
|
||||
### Caveats
|
||||
|
||||
First, some caveats: this model likely contains bugs, has wrong assumptions, or completely misses certain dimensions. However, it acts as a form of existence proof for unscalability, with clear reasons.
|
||||
|
||||
If certain assumptions are wrong, then we can challenge them and reason about them in isolation. It doesn’t mean things will definitely work as the model predicts, or that there aren’t unknown unknowns.
|
||||
|
||||
The model also only deals with receiving bandwidth for end nodes, uses mostly static assumptions of averages, and doesn’t deal with spam resistance, privacy guarantees, accounting, intermediate node or network wide failures.
|
||||
|
||||
### Goals
|
||||
|
||||
1. Ensure network scales by being user or usage bound, as opposed to bandwidth growing in proportion to network size.
|
||||
2. Stay within a reasonable bandwidth limit for limited data plans.
|
||||
3. Do the above without materially impacting existing nodes.
|
||||
|
||||
The model proceeds through various cases, each with clear assumptions behind it, starting from the most naive assumptions. It shows results for 100 users, 10k users and 1m users.
|
||||
|
||||
### Model
|
||||
|
||||
```
|
||||
Case 1. Only receiving messages meant for you [naive case]
|
||||
|
||||
Assumptions:
|
||||
- A1. Envelope size (static): 1024kb
|
||||
- A2. Envelopes / message (static): 10
|
||||
- A3. Received messages / day (static): 100
|
||||
- A4. Only receiving messages meant for you.
|
||||
|
||||
For 100 users, receiving bandwidth is 1000.0KB/day
|
||||
For 10k users, receiving bandwidth is 1000.0KB/day
|
||||
For 1m users, receiving bandwidth is 1000.0KB/day
|
||||
|
||||
------------------------------------------------------------
|
||||
|
||||
Case 2. Receiving messages for everyone [naive case]
|
||||
|
||||
Assumptions:
|
||||
- A1. Envelope size (static): 1024kb
|
||||
- A2. Envelopes / message (static): 10
|
||||
- A3. Received messages / day (static): 100
|
||||
- A5. Received messages for everyone.
|
||||
|
||||
For 100 users, receiving bandwidth is 97.7MB/day
|
||||
For 10k users, receiving bandwidth is 9.5GB/day
|
||||
For 1m users, receiving bandwidth is 953.7GB/day
|
||||
|
||||
------------------------------------------------------------
|
||||
|
||||
Case 3. All private messages go over one discovery topic [naive case]
|
||||
|
||||
Assumptions:
|
||||
- A1. Envelope size (static): 1024kb
|
||||
- A2. Envelopes / message (static): 10
|
||||
- A3. Received messages / day (static): 100
|
||||
- A6. Proportion of private messages (static): 0.5
|
||||
- A7. Public messages only received by relevant recipients (static).
|
||||
- A8. All private messages are received by everyone (same topic) (static).
|
||||
|
||||
For 100 users, receiving bandwidth is 49.3MB/day
|
||||
For 10k users, receiving bandwidth is 4.8GB/day
|
||||
For 1m users, receiving bandwidth is 476.8GB/day
|
||||
|
||||
------------------------------------------------------------
|
||||
|
||||
Case 4. All private messages are partitioned into shards [naive case]
|
||||
|
||||
Assumptions:
|
||||
- A1. Envelope size (static): 1024kb
|
||||
- A2. Envelopes / message (static): 10
|
||||
- A3. Received messages / day (static): 100
|
||||
- A6. Proportion of private messages (static): 0.5
|
||||
- A7. Public messages only received by relevant recipients (static).
|
||||
- A9. Private messages partitioned across partition shards (static), n=5000
|
||||
|
||||
For 100 users, receiving bandwidth is 1000.0KB/day
|
||||
For 10k users, receiving bandwidth is 1.5MB/day
|
||||
For 1m users, receiving bandwidth is 98.1MB/day
|
||||
|
||||
------------------------------------------------------------
|
||||
|
||||
Case 5. 4 + Bloom filter with false positive rate
|
||||
|
||||
Assumptions:
|
||||
- A1. Envelope size (static): 1024kb
|
||||
- A2. Envelopes / message (static): 10
|
||||
- A3. Received messages / day (static): 100
|
||||
- A6. Proportion of private messages (static): 0.5
|
||||
- A7. Public messages only received by relevant recipients (static).
|
||||
- A9. Private messages partitioned across partition shards (static), n=5000
|
||||
- A10. Bloom filter size (m) (static): 512
|
||||
- A11. Bloom filter hash functions (k) (static): 3
|
||||
- A12. Bloom filter elements, i.e. topics, (n) (static): 100
|
||||
- A13. Bloom filter assuming optimal k choice (sensitive to m, n).
|
||||
- A14. Bloom filter false positive proportion of full traffic, p=0.1
|
||||
|
||||
For 100 users, receiving bandwidth is 10.7MB/day
|
||||
For 10k users, receiving bandwidth is 978.0MB/day
|
||||
For 1m users, receiving bandwidth is 95.5GB/day
|
||||
|
||||
NOTE: Traffic extremely sensitive to bloom false positives
|
||||
This completely dominates network traffic at scale.
|
||||
With p=1% we get 10k users ~100MB/day and 1m users ~10gb/day)
|
||||
|
||||
------------------------------------------------------------
|
||||
|
||||
Case 6. Case 5 + Benign duplicate receives
|
||||
|
||||
Assumptions:
|
||||
- A1. Envelope size (static): 1024kb
|
||||
- A2. Envelopes / message (static): 10
|
||||
- A3. Received messages / day (static): 100
|
||||
- A6. Proportion of private messages (static): 0.5
|
||||
- A7. Public messages only received by relevant recipients (static).
|
||||
- A9. Private messages partitioned across partition shards (static), n=5000
|
||||
- A10. Bloom filter size (m) (static): 512
|
||||
- A11. Bloom filter hash functions (k) (static): 3
|
||||
- A12. Bloom filter elements, i.e. topics, (n) (static): 100
|
||||
- A13. Bloom filter assuming optimal k choice (sensitive to m, n).
|
||||
- A14. Bloom filter false positive proportion of full traffic, p=0.1
|
||||
- A15. Benign duplicate receives factor (static): 2
|
||||
- A16. No bad envelopes, bad PoW, expired, etc (static).
|
||||
|
||||
For 100 users, receiving bandwidth is 21.5MB/day
|
||||
For 10k users, receiving bandwidth is 1.9GB/day
|
||||
For 1m users, receiving bandwidth is 190.9GB/day
|
||||
|
||||
------------------------------------------------------------
|
||||
|
||||
Case 7. 6 + Mailserver under good conditions; small bloom fp; mostly offline
|
||||
|
||||
Assumptions:
|
||||
- A1. Envelope size (static): 1024kb
|
||||
- A2. Envelopes / message (static): 10
|
||||
- A3. Received messages / day (static): 100
|
||||
- A6. Proportion of private messages (static): 0.5
|
||||
- A7. Public messages only received by relevant recipients (static).
|
||||
- A9. Private messages partitioned across partition shards (static), n=5000
|
||||
- A10. Bloom filter size (m) (static): 512
|
||||
- A11. Bloom filter hash functions (k) (static): 3
|
||||
- A12. Bloom filter elements, i.e. topics, (n) (static): 100
|
||||
- A13. Bloom filter assuming optimal k choice (sensitive to m, n).
|
||||
- A14. Bloom filter false positive proportion of full traffic, p=0.1
|
||||
- A15. Benign duplicate receives factor (static): 2
|
||||
- A16. No bad envelopes, bad PoW, expired, etc (static).
|
||||
- A17. User is offline p% of the time (static) p=0.9
|
||||
- A18. No bad request, dup messages for mailservers; overlap perfect (static).
|
||||
- A19. Mailserver requests can change false positive rate to be p=0.01
|
||||
|
||||
For 100 users, receiving bandwidth is 3.9MB/day
|
||||
For 10k users, receiving bandwidth is 284.8MB/day
|
||||
For 1m users, receiving bandwidth is 27.8GB/day
|
||||
|
||||
------------------------------------------------------------
|
||||
|
||||
Case 8. No metadata protection w bloom filter; 1 node connected; static shard
|
||||
|
||||
Aka waku mode.
|
||||
|
||||
Next step up is to either only use contact code, or shard more aggressively.
|
||||
Note that this requires change of other nodes behavior, not just local node.
|
||||
|
||||
Assumptions:
|
||||
- A1. Envelope size (static): 1024kb
|
||||
- A2. Envelopes / message (static): 10
|
||||
- A3. Received messages / day (static): 100
|
||||
- A6. Proportion of private messages (static): 0.5
|
||||
- A7. Public messages only received by relevant recipients (static).
|
||||
- A9. Private messages partitioned across partition shards (static), n=5000
|
||||
|
||||
For 100 users, receiving bandwidth is 1000.0KB/day
|
||||
For 10k users, receiving bandwidth is 1.5MB/day
|
||||
For 1m users, receiving bandwidth is 98.1MB/day
|
||||
|
||||
------------------------------------------------------------
|
||||
```
|
||||
|
||||
See [source](https://github.com/vacp2p/research/tree/master/whisper_scalability)
|
||||
for more detail on the model and its assumptions.
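
The arithmetic behind cases 1 to 5 can be reproduced in a few lines. This is a sketch reconstructed from the printed assumptions and results, not the linked source itself; the function and constant names are hypothetical. One assumption worth flagging: the printed figures only come out right if the envelope size in A1 is read as 1024 bytes, so that is what the sketch uses.

```python
ENVELOPE_SIZE = 1024        # bytes; the value that reproduces the printed results (A1)
ENVELOPES_PER_MESSAGE = 10  # A2
MESSAGES_PER_DAY = 100      # A3
PRIVATE_PROPORTION = 0.5    # A6
SHARDS = 5000               # A9
BLOOM_FALSE_POSITIVE = 0.1  # A14

# Bytes per day one user receives if they only get their own messages.
LOAD_PER_USER = ENVELOPE_SIZE * ENVELOPES_PER_MESSAGE * MESSAGES_PER_DAY


def case1(n_users):
    """Only receiving messages meant for you."""
    return LOAD_PER_USER


def case2(n_users):
    """Receiving messages for everyone."""
    return LOAD_PER_USER * n_users


def case3(n_users):
    """Public messages are targeted; all private messages share one topic."""
    public = LOAD_PER_USER * (1 - PRIVATE_PROPORTION)
    private = LOAD_PER_USER * PRIVATE_PROPORTION * n_users
    return public + private


def case4(n_users):
    """Private messages are spread over SHARDS partition topics."""
    public = LOAD_PER_USER * (1 - PRIVATE_PROPORTION)
    users_per_shard = max(1, n_users / SHARDS)
    private = LOAD_PER_USER * PRIVATE_PROPORTION * users_per_shard
    return public + private


def case5(n_users):
    """Case 4 plus bloom filter false positives drawn from all traffic."""
    return case4(n_users) + BLOOM_FALSE_POSITIVE * case2(n_users)


def human(n_bytes):
    """Format a byte count using 1024-based units, one decimal place."""
    for unit in ("B", "KB", "MB"):
        if n_bytes < 1024:
            return f"{n_bytes:.1f}{unit}"
        n_bytes /= 1024
    return f"{n_bytes:.1f}GB"


for n in (100, 10_000, 1_000_000):
    print(n, human(case4(n)), human(case5(n)))
```

Case 6 is case 5 multiplied by the duplicate factor of 2. The sketch makes the main point visible: the false positive term scales with total network traffic (`case2`), so it dominates everything else as the network grows.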
|
||||
|
||||
### Takeaways
|
||||
|
||||
1. Whisper as it currently works doesn’t scale, and we quickly run into unacceptable bandwidth usage.
|
||||
2. A few factors contribute to this, but largely it boils down to noisy topic usage and the use of bloom filters. Duplicate envelopes (e.g. see [Whisper vs PSS](https://our.status.im/whisper-pss-comparison/)) and bad envelopes are also factors, but this depends a bit more on specific deployment configurations.
|
||||
3. Waku mode (case 8) is an additional capability that doesn’t require other nodes to change, for nodes that put a premium on performance.
|
||||
4. The next bottleneck after this is the partitioned topics (app/network specific), which either need to grow gracefully (and potentially quickly), or an alternative way of consuming those messages needs to be devised.
|
||||
|
||||

|
||||
|
||||
The results are summarized in the graph above. Notice the log-log scale. The
|
||||
colored backgrounds correspond to the following bandwidth usage:
|
||||
|
||||
- Blue: <10mb/d (<~300mb/month)
|
||||
- Green: <30mb/d (<~1gb/month)
|
||||
- Yellow: <100mb/d (<~3gb/month)
|
||||
- Red: >100mb/d (>3gb/month)
|
||||
|
||||
These ranges are somewhat arbitrary, but are based on [user
|
||||
requirements](https://github.com/status-im/status-react/issues/9081) for users
|
||||
on a limited data plan, with comparable usage for other messaging apps.
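
As a small illustration of how the bands above apply to the model output, the thresholds can be written as a lookup. The function name is hypothetical, and treating a value of exactly 100mb/d as red is an assumption, since the list leaves that boundary open.

```python
def usage_band(mb_per_day: float) -> str:
    """Map daily receive bandwidth (in MB) to the colour bands of the graph."""
    if mb_per_day < 10:
        return "blue"
    if mb_per_day < 30:
        return "green"
    if mb_per_day < 100:
        return "yellow"
    return "red"


# Case 8 (Waku mode) versus case 5 (bloom filter), using figures from the model.
waku_10k = usage_band(1.5)      # case 8, 10k users
waku_1m = usage_band(98.1)      # case 8, 1m users
bloom_10k = usage_band(978.0)   # case 5, 10k users
```

With these inputs, Waku mode stays blue at 10k users and only reaches yellow at 1m users, while the bloom filter case is already red at 10k users.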
|
||||
|
||||
## Introducing Waku
|
||||
|
||||
### Motivation for a new protocol
|
||||
|
||||
Apps such as Status will likely use something like Whisper for the foreseeable
future, and we want to enable them, with minimal changes, to support more
users on mobile devices without bandwidth exploding.
|
||||
|
||||
Additionally, there's not a clear cut alternative that maps cleanly to the
|
||||
desired use cases (p2p, multicast, privacy-preserving, open, etc).
|
||||
|
||||
We are actively researching, developing and collaborating with more greenfield
|
||||
approaches. It is likely that Waku will either converge to those, or Waku will
|
||||
lay the groundwork (clear specs, common issues/components) necessary to make
|
||||
switching to another protocol easier. In this project we want to emphasize
|
||||
iterative work with results on the order of weeks.
|
||||
|
||||
### Briefly on Waku mode
|
||||
|
||||
- Doesn’t impact existing clients, it’s just a separate node and capability.
|
||||
- Other nodes can still use Whisper as is, like a full node.
|
||||
- Sacrifices metadata protection and incurs higher connectivity/availability requirements for scalability
|
||||
|
||||
**Requirements:**
|
||||
|
||||
- Exposes API to get messages for a list of topics (no bloom filter)
|
||||
- Way of being identified as a Waku node (e.g. through version string)
|
||||
- Option to statically encode this node in app, e.g. similar to custom bootnodes/mailserver
|
||||
- Only node that needs to be connected to, possibly as Whisper relay / mailserver hybrid
|
||||
|
||||
**Provides:**
|
||||
|
||||
- likely provides scalability of up to 10k users and beyond
|
||||
- with some enhancements to partition topic logic, can possibly scale up to 1m users (app/network specific)
|
||||
|
||||
**Caveats:**
|
||||
|
||||
- hasn’t been tested in a large-scale simulation
|
||||
- other network and intermediate node bottlenecks might become apparent (e.g. full bloom filter and private cluster capacity; can likely be dealt with in isolation using known techniques, e.g. load balancing) (deployment specific)
|
||||
|
||||
### Progress so far
|
||||
|
||||
In short, we have a [Waku version 0 spec up](https://rfc.vac.dev/spec/5) as well as a [PoC](https://github.com/status-im/nim-eth/pull/120) for backwards compatibility. In the coming weeks, we are going to solidify the specs and get a more fully featured PoC for [Waku mode](https://github.com/status-im/nim-eth/pull/114). See [rough roadmap](https://github.com/vacp2p/pm/issues/5), project board [link deprecated] and progress thread on the [Vac forum](https://forum.vac.dev/t/waku-project-and-progress/24).
|
||||
|
||||
The spec has been rewritten for clarity, with ABNF grammar and less ambiguous language. The spec also incorporates several previously [ad hoc implemented features](https://rfc.vac.dev/spec/6/#additional-capabilities), such as light nodes and mailserver/client support. This has already caught a few incompatibilities between the `geth` (Go), `status/whisper` (Go) and `nim-eth` (Nim) versions, specifically around light node usage and the handshake.
|
||||
|
||||
If you are interested in this effort, please check out [our forum](https://forum.vac.dev/) for questions, comments and proposals. We already have some discussion for better [spam protection](https://forum.vac.dev/t/stake-priority-based-queuing/26) (see [previous post](https://vac.dev/feasibility-semaphore-rate-limiting-zksnarks) for a more complex but privacy-preserving proposal), something that is likely going to be addressed in future versions of Waku, along with many other fixes and enhancements.
|
||||
149
rlog/2020-02-14-waku-update.mdx
Normal file
@@ -0,0 +1,149 @@
|
||||
---
|
||||
layout: post
|
||||
name: 'Waku Update'
|
||||
title: 'Waku Update'
|
||||
date: 2020-02-14 12:00:00
|
||||
authors: oskarth
|
||||
published: true
|
||||
slug: waku-update
|
||||
categories: research
|
||||
image: /img/waku_infrastructure_sky.jpg
|
||||
discuss: https://forum.vac.dev/t/waku-update-where-are-we-at/34
|
||||
---
|
||||
|
||||
A research log. What's the current state of Waku? How many users does it support? What are the bottlenecks? What's next?
|
||||
|
||||
<!--truncate-->
|
||||
|
||||
Waku is our fork of Whisper where we address the shortcomings of Whisper in an iterative manner. We've seen in a [previous post](https://vac.dev/fixing-whisper-with-waku) that Whisper doesn't scale, and why. In this post we'll talk about what the current state of Waku is, how many users it can support, and future plans.
|
||||
|
||||
## Current state
|
||||
|
||||
**Specs:**
|
||||
|
||||
We released [Waku spec v0.3](https://rfc.vac.dev/spec/6) this week! You can see the full changelog [here](https://rfc.vac.dev/spec/6/#changelog).
|
||||
|
||||
The main change from 0.2 is making the handshake more flexible. This enables us to communicate topic interest immediately without ambiguity. We also did the following:
|
||||
|
||||
- added recommendation for DNS based discovery
|
||||
- added an upgradability and compatibility policy
|
||||
- cut the spec up into several components
|
||||
|
||||
We cut the spec up into several components to make Vac as modular as possible. The components right now are:
|
||||
|
||||
- Waku (main spec), currently in [version 0.3.0](https://rfc.vac.dev/spec/6)
|
||||
- Waku envelope data field, currently in [version 0.1.0](https://rfc.vac.dev/spec/7)
|
||||
- Waku mailserver, currently in [version 0.2.0](https://rfc.vac.dev/spec/8)
|
||||
|
||||
We can probably factor these out further as the main spec is getting quite big, but this is good enough for now.
|
||||
|
||||
**Clients:**
|
||||
|
||||
There are currently two clients that implement Waku v0.3, these are [Nimbus (Update: now nim-waku)](https://github.com/status-im/nim-waku) in Nim and [status-go](https://github.com/status-im/status-go) in Go.
|
||||
|
||||
For more details on what each client does and doesn't support, you can follow the [work in progress checklist](https://github.com/vacp2p/pm/issues/7).
|
||||
|
||||
Work is currently in progress to integrate it into the [Status core app](https://github.com/status-im/status-react/pull/9949). Waku is expected to be part of their upcoming 1.1 release (see [Status app roadmap (link deprecated)](https://trello.com/b/DkxQd1ww/status-app-roadmap)).
|
||||
|
||||
**Simulation:**
|
||||
|
||||
We have a [simulation](https://github.com/status-im/nim-waku/blob/master/waku/v1/node/quicksim.nim) that verifies - or rather, fails to falsify - our [scalability model](https://vac.dev/fixing-whisper-with-waku). More on the simulation and what it shows below.
|
||||
|
||||
## How many users does Waku support?
|
||||
|
||||
This is our current understanding of how many users a network running Waku can support. Specifically in the context of the Status chat app, since that's the most immediate consumer of Waku. It should generalize fairly well to most deployments.
|
||||
|
||||
**tl;dr (for Status app):**
|
||||
|
||||
- beta: 100 DAU
|
||||
- v1: 1k DAU
|
||||
- v1.1 (waku only): 10k DAU (up to x10 with deployment hotfixes)
|
||||
- v1.2 (waku+dns): 100k DAU (can optionally be folded into v1.1)
|
||||
|
||||
_Assuming 10 concurrent users = 100 DAU. Estimate uncertainty increases for each order of magnitude until real-world data is observed._
|
||||
|
||||
As far as we know right now, these are the bottlenecks we have:
|
||||
|
||||
- Immediate bottleneck - Receive bandwidth for end user clients (aka ‘Fixing Whisper with Waku’)
|
||||
- Very likely bottleneck - Nodes and cluster capacity (aka ‘DNS based node discovery’)
|
||||
- Conjecture but not unlikely to appear - Full node traffic (aka ‘the routing / partition problem’)
|
||||
|
||||
We've already seen the first bottleneck being discussed in the initial post. Dean wrote a post on [DNS based discovery](https://vac.dev/dns-based-discovery) which explains how we will address the likely second bottleneck. More on the third one in future posts.
|
||||
|
||||
For more details on these bottlenecks, see [Scalability estimate: How many users can Waku and the Status app support?](https://discuss.status.im/t/scalability-estimate-how-many-users-can-waku-and-the-status-app-support/1514).
|
||||
|
||||
## Simulation
|
||||
|
||||
The ultimate test is real-world usage. Until then, we have a simulation thanks to Kim De Mey from the Nimbus team!
|
||||
|
||||

|
||||
|
||||
We have two network topologies, star and full mesh. Both networks have 6 full nodes, one traditional light node with bloom filter, and one Waku light node.
|
||||
|
||||
One of the full nodes sends 1 envelope over 1 of the 100 topics that the two light nodes subscribe to. After that, it sends 10000 envelopes over random topics.
|
||||
|
||||
For the light node, the bloom filter is set to almost 10% false positives (bloom filter: n=100, k=3, m=512). The tables below show the number of valid and invalid envelopes received by the different nodes.
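
The "almost 10%" figure follows from the standard bloom filter approximation $p \approx (1 - e^{-kn/m})^k$. A minimal sketch, with a hypothetical function name, that plugs in the parameters used here:

```python
import math


def bloom_false_positive_rate(m_bits: int, k_hashes: int, n_elements: int) -> float:
    """Standard approximation for a bloom filter's false positive probability."""
    return (1 - math.exp(-k_hashes * n_elements / m_bits)) ** k_hashes


p = bloom_false_positive_rate(m_bits=512, k_hashes=3, n_elements=100)

# Expected number of the 10000 random-topic envelopes that pass the filter.
expected_matches = 10000 * p

print(f"false positive rate: {p:.3f}, expected matches: {expected_matches:.0f}")
```

This gives a rate a little under 9%, so on the order of 870 of the 10000 random-topic envelopes would be expected to pass the filter, which is in the same range as the 815 and 803 envelopes the light node receives in the two runs.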
|
||||
|
||||
**Star network:**
|
||||
|
||||
| Description     | Peers | Valid | Invalid |
| --------------- | ----- | ----- | ------- |
| Master node     | 7     | 10001 | 0       |
| Full node 1     | 3     | 10001 | 0       |
| Full node 2     | 1     | 10001 | 0       |
| Full node 3     | 1     | 10001 | 0       |
| Full node 4     | 1     | 10001 | 0       |
| Full node 5     | 1     | 10001 | 0       |
| Light node      | 2     | 815   | 0       |
| Waku light node | 2     | 1     | 0       |
|
||||
|
||||
**Full mesh:**
|
||||
|
||||
| Description     | Peers | Valid | Invalid |
| --------------- | ----- | ----- | ------- |
| Full node 0     | 7     | 10001 | 20676   |
| Full node 1     | 7     | 10001 | 9554    |
| Full node 2     | 5     | 10001 | 23304   |
| Full node 3     | 5     | 10001 | 11983   |
| Full node 4     | 5     | 10001 | 24425   |
| Full node 5     | 5     | 10001 | 23472   |
| Light node      | 2     | 803   | 803     |
| Waku light node | 2     | 1     | 1       |
|
||||
|
||||
Things to note:
|
||||
|
||||
- Whisper light node with ~10% false positive gets ~10% of total traffic
|
||||
- Waku light node gets ~1000x fewer envelopes than Whisper light node
|
||||
- Full mesh results in a lot more duplicate messages, except for the Waku light node
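The gap between the two light nodes can be reproduced in a few lines. The following is a toy sketch, not the Nim simulation linked below: `bit_positions` is a made-up SHA-256 based mapping rather than Whisper's actual topic-to-bloom function, and the topic-interest node is modeled as a plain exact-match set.

```python
import hashlib
import random

M_BITS = 512  # bloom filter size in bits
K = 3         # bits set per topic


def bit_positions(topic: bytes):
    # Illustrative mapping only: NOT Whisper's actual topic-to-bloom function.
    digest = hashlib.sha256(topic).digest()
    return [int.from_bytes(digest[2 * i:2 * i + 2], "big") % M_BITS for i in range(K)]


def simulate(n_topics=100, n_envelopes=10_000, seed=1):
    rng = random.Random(seed)

    def random_topic() -> bytes:
        return rng.getrandbits(32).to_bytes(4, "big")  # 4-byte topic

    # Topics both light nodes are interested in.
    subscribed = set()
    while len(subscribed) < n_topics:
        subscribed.add(random_topic())

    # Bloom-filter light node: the union of all bits set by its topics.
    bloom = set()
    for topic in subscribed:
        bloom.update(bit_positions(topic))

    bloom_hits = 0  # envelopes a bloom-filter light node would receive
    exact_hits = 0  # envelopes a topic-interest light node would receive
    for _ in range(n_envelopes):
        topic = random_topic()
        if all(pos in bloom for pos in bit_positions(topic)):
            bloom_hits += 1
        if topic in subscribed:
            exact_hits += 1
    return bloom_hits, exact_hits


bloom_hits, exact_hits = simulate()
print("envelopes passing the bloom filter:", bloom_hits)
print("envelopes matching a subscribed topic exactly:", exact_hits)
```

With random 4-byte topics an exact match is very unlikely, while the bloom filter lets through a sizeable fraction of unrelated envelopes, which is the effect the tables above show.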
|
||||
|
||||
Run the simulation yourself [here](https://github.com/status-im/nim-waku/blob/master/waku/v1/node/quicksim.nim). The parameters are configurable, and it is integrated with Prometheus and Grafana.
|
||||
|
||||
## Difference between Waku and Whisper
|
||||
|
||||
Summary of main differences between Waku v0 spec and Whisper v6, as described in [EIP-627](https://eips.ethereum.org/EIPS/eip-627):
|
||||
|
||||
- Handshake/Status message not compatible with shh/6 nodes; specifying options as association list
|
||||
- Include topic-interest in Status handshake
|
||||
- Upgradability policy
|
||||
- `topic-interest` packet code
|
||||
- RLPx subprotocol is changed from shh/6 to waku/0.
|
||||
- Light node capability is added.
|
||||
- Optional rate limiting is added.
|
||||
- Status packet has following additional parameters: light-node, confirmations-enabled and rate-limits
|
||||
- Mail Server and Mail Client functionality is now part of the specification.
|
||||
- P2P Message packet contains a list of envelopes instead of a single envelope.
|
||||
|
||||
## Next steps and future plans
|
||||
|
||||
Several challenges remain to make Waku a robust and suitable base communication protocol. Here we outline a few challenges that we are addressing and will continue to work on:
|
||||
|
||||
- scalability of the network
|
||||
- incentivized infrastructure and spam-resistance
|
||||
- build with resource restricted devices in mind, including nodes being mostly offline
|
||||
|
||||
For the third bottleneck, a likely candidate for fixing this is Kademlia routing. This is similar to what is done in [Swarm's](https://www.ethswarm.org/) PSS. We are in the early stages of experimenting with this over libp2p in [nim-libp2p](https://github.com/status-im/nim-libp2p). More on this in a future post!
|
||||
|
||||
## Acknowledgements
|
||||
|
||||
_Image from "caged sky" by mh.xbhd.org is licensed under CC BY 2.0 (https://ccsearch.creativecommons.org/photos/a9168311-78de-4cb7-a6ad-f92be8361d0e)_
|
||||
95
rlog/2020-02-7-dns-based-discovery.mdx
Normal file
@@ -0,0 +1,95 @@
|
||||
---
|
||||
layout: post
|
||||
name: 'DNS Based Discovery'
|
||||
title: 'DNS Based Discovery'
|
||||
date: 2020-02-7 12:00:00
|
||||
authors: dean
|
||||
published: true
|
||||
slug: dns-based-discovery
|
||||
categories: research
|
||||
---
|
||||
|
||||
A look at EIP-1459 and the benefits of DNS based discovery.
|
||||
|
||||
<!--truncate-->
|
||||
|
||||
Discovery in p2p networks is the process of how nodes find each other and specific resources they are looking for. Popular discovery protocols, such as [Kademlia](https://pdos.csail.mit.edu/~petar/papers/maymounkov-kademlia-lncs.pdf) which utilizes a [distributed hash table](https://en.wikipedia.org/wiki/Distributed_hash_table) or DHT, are highly inefficient for resource restricted devices. These methods use short connection windows, and it is quite battery intensive to keep establishing connections. Additionally, we cannot expect a mobile phone for example to synchronize an entire DHT using cellular data.
|
||||
|
||||
Another issue is how we do the initial bootstrapping. In other words, how does a client find its first node to then discover the rest of the network? In most applications, including Status right now, this is done with a [static list of nodes](https://specs.status.im/spec/1#bootstrapping) that a client can connect to.
|
||||
|
||||
In summary, we have a static list that provides us with nodes we can connect to which then allows us to discover the rest of the network using something like Kademlia. But what we need is something that can easily be mutated, guarantees a certain amount of security, and is efficient for resource restricted devices. Ideally our solution would also be robust and scalable.
|
||||
|
||||
How do we do this?
|
||||
|
||||
Enter [EIP 1459: Node Discovery via DNS](https://eips.ethereum.org/EIPS/eip-1459), which is one of the strategies we are using for discovering Waku nodes. [EIP-1459](https://eips.ethereum.org/EIPS/eip-1459) is a DNS-based discovery protocol that stores [merkle trees](https://en.wikipedia.org/wiki/Merkle_tree) in DNS records which contain connection information for nodes.
|
||||
|
||||
_Waku is our fork of Whisper. Oskar recently wrote an [entire post](https://vac.dev/fixing-whisper-with-waku) explaining it. In short, Waku is our method of fixing the shortcomings of Whisper in a more iterative fashion. You can find the specification [here](https://rfc.vac.dev/spec/6/)_
|
||||
|
||||
DNS-based methods for bootstrapping p2p networks are quite popular. Even Bitcoin uses it, but it uses a concept called DNS seeds, which are just DNS servers that are configured to return a list of randomly selected nodes from the network upon being queried. This means that although these seeds are hardcoded in the client, the IP addresses of actual nodes do not have to be.
|
||||
|
||||
```console
> dig dnsseed.bluematt.me +short
129.226.73.12
107.180.78.111
169.255.56.123
91.216.149.28
85.209.240.91
66.232.124.232
207.55.53.96
86.149.241.168
193.219.38.57
190.198.210.139
74.213.232.234
158.181.226.33
176.99.2.207
202.55.87.45
37.205.10.3
90.133.4.73
176.191.182.3
109.207.166.232
45.5.117.59
178.211.170.2
160.16.0.30
```
|
||||
|
||||
The above displays the result of querying one of these DNS seeds. All the nodes are stored as [`A` records](https://simpledns.plus/help/a-records) for the given domain name. This is quite a simple solution which Bitcoin almost solely relies on since removing the [IRC bootstrapping method in v0.8.2](https://en.bitcoin.it/wiki/Network#IRC).
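From a client's point of view, using such a seed amounts to resolving the name and picking a few of the returned addresses. The snippet below is a minimal sketch of that idea, not how any Bitcoin client actually implements it; `bootstrap_candidates` and its parameters are hypothetical names, and the resolver is injectable so the function can be exercised without network access.

```python
import random
import socket


def bootstrap_candidates(seed_host, count=8, resolve=socket.gethostbyname_ex):
    """Return up to `count` IPv4 addresses advertised by a DNS seed."""
    # socket.gethostbyname_ex returns (canonical_name, aliases, ipv4_addresses).
    _name, _aliases, addresses = resolve(seed_host)
    candidates = list(dict.fromkeys(addresses))  # de-duplicate, keep order
    random.shuffle(candidates)  # avoid always dialing the same peers first
    return candidates[:count]


# Example call (performs a real DNS lookup; the seed may be unreachable or retired):
# bootstrap_candidates("dnsseed.bluematt.me")
```

Note that nothing here lets the client check that the returned addresses are the ones the seed operator intended to publish.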
|
||||
|
||||
What makes this DNS based discovery useful? It allows us to have a mutable list of bootstrap nodes without needing to ship a new version of the client every time a list is mutated. It also allows for a more lightweight method of discovering nodes, something very important for resource restricted devices.
|
||||
|
||||
Additionally, DNS provides us with a robust and scalable infrastructure. This is due to its hierarchical architecture. This hierarchical architecture also already makes it distributed such that the failure of one DNS server does not result in us no longer being able to resolve our name.
|
||||
|
||||
As with every solution though, there is a trade-off. Because the list is stored under a DNS name, an adversary would simply need to censor the DNS records for that specific name. This would prevent any new client trying to join the network from being able to do so.
|
||||
|
||||
One thing you notice when looking at [EIP-1459](https://eips.ethereum.org/EIPS/eip-1459) is that it is a lot more technically complex than Bitcoin's way of doing this. So if Bitcoin uses this simple method and has proven that it works, why did we need a new method?
|
||||
|
||||
There are multiple reasons, but the main one is **security**. In the Bitcoin example, an attacker could create a new list and no one querying would be able to tell. This is however mitigated in [EIP-1459](https://eips.ethereum.org/EIPS/eip-1459) where we can verify the integrity of the entire returned list by storing an entire merkle tree in the DNS records.
|
||||
|
||||
Let's dive into this. Firstly, a client that is using these DNS records for discovery must know the public key corresponding to the private key controlled by the entity creating the list. This is because the entire list is signed using a secp256k1 private key, giving the client the ability to authenticate the list and know that it has not been tampered with by some external party.
|
||||
|
||||
So that already makes this a lot safer than the method Bitcoin uses. But how are these lists even stored? As previously stated they are stored using **merkle trees** as follows:
|
||||
|
||||
- The root of the tree is stored in a [`TXT` record](https://simpledns.plus/help/txt-records). This record contains the tree's root hash, a sequence number which is incremented every time the tree is updated, and a signature as stated above.
|
||||
|
||||
Additionally, there is also a root hash of a second tree called a **link tree**, which contains references to different lists. This link tree allows us to delegate trust and build a graph of multiple merkle trees stored across multiple DNS names.
|
||||
|
||||
The sequence number ensures that an attacker cannot replace a tree with an older version because when a client reads the tree, they should ensure that the sequence number is greater than the last synchronized version.
|
||||
|
||||
- Using the root hash for the tree, we can find the merkle tree's first branch, which is also stored in a `TXT` record. The branch record contains all the hashes of the branch's leaves.
|
||||
|
||||
- Once a client starts reading all the leaves, they can find one of two things: either a new branch record leading them further down the tree, or an Ethereum Node Record (ENR), which means they now have the address of a node to connect to! To learn more about Ethereum node records you can have a look at [EIP-778](https://eips.ethereum.org/EIPS/eip-778), or read a short blog post I wrote explaining them [here](https://dean.eigenmann.me/blog/2020/01/21/network-addresses-in-ethereum/#enr).
|
||||
|
||||
Below is the zone file taken from the [EIP-1459](https://eips.ethereum.org/EIPS/eip-1459), displaying how this looks in practice.
|
||||
|
||||
```
; name ttl class type content
@ 60 IN TXT enrtree-root:v1 e=JWXYDBPXYWG6FX3GMDIBFA6CJ4 l=C7HRFPF3BLGF3YR4DY5KX3SMBE seq=1 sig=o908WmNp7LibOfPsr4btQwatZJ5URBr2ZAuxvK4UWHlsB9sUOTJQaGAlLPVAhM__XJesCHxLISo94z5Z2a463gA
C7HRFPF3BLGF3YR4DY5KX3SMBE 86900 IN TXT enrtree://AM5FCQLWIZX2QFPNJAP7VUERCCRNGRHWZG3YYHIUV7BVDQ5FDPRT2@morenodes.example.org
JWXYDBPXYWG6FX3GMDIBFA6CJ4 86900 IN TXT enrtree-branch:2XS2367YHAXJFGLZHVAWLQD4ZY,H4FHT4B454P6UXFD7JCYQ5PWDY,MHTDO6TMUBRIA2XWG5LUDACK24
2XS2367YHAXJFGLZHVAWLQD4ZY 86900 IN TXT enr:-HW4QOFzoVLaFJnNhbgMoDXPnOvcdVuj7pDpqRvh6BRDO68aVi5ZcjB3vzQRZH2IcLBGHzo8uUN3snqmgTiE56CH3AMBgmlkgnY0iXNlY3AyNTZrMaECC2_24YYkYHEgdzxlSNKQEnHhuNAbNlMlWJxrJxbAFvA
H4FHT4B454P6UXFD7JCYQ5PWDY 86900 IN TXT enr:-HW4QAggRauloj2SDLtIHN1XBkvhFZ1vtf1raYQp9TBW2RD5EEawDzbtSmlXUfnaHcvwOizhVYLtr7e6vw7NAf6mTuoCgmlkgnY0iXNlY3AyNTZrMaECjrXI8TLNXU0f8cthpAMxEshUyQlK-AM0PW2wfrnacNI
MHTDO6TMUBRIA2XWG5LUDACK24 86900 IN TXT enr:-HW4QLAYqmrwllBEnzWWs7I5Ev2IAs7x_dZlbYdRdMUx5EyKHDXp7AV5CkuPGUPdvbv1_Ms1CPfhcGCvSElSosZmyoqAgmlkgnY0iXNlY3AyNTZrMaECriawHKWdDRk2xeZkrOXBQ0dfMFLHY4eENZwdufn1S1o
```
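To make the traversal concrete, here is a minimal sketch of how a client might walk the example zone above. It reads records from an in-memory dict instead of DNS, and it deliberately skips the two security-critical steps (verifying the secp256k1 signature on the root, and checking that each record matches the hash it is stored under), since those need libraries outside the Python standard library. The function names are hypothetical, and treating only a strictly lower sequence number as stale is an assumption of this sketch.

```python
# TXT records copied from the example zone; a real client would fetch them via DNS.
ZONE = {
    "@": "enrtree-root:v1 e=JWXYDBPXYWG6FX3GMDIBFA6CJ4 l=C7HRFPF3BLGF3YR4DY5KX3SMBE seq=1 sig=o908WmNp7LibOfPsr4btQwatZJ5URBr2ZAuxvK4UWHlsB9sUOTJQaGAlLPVAhM__XJesCHxLISo94z5Z2a463gA",
    "C7HRFPF3BLGF3YR4DY5KX3SMBE": "enrtree://AM5FCQLWIZX2QFPNJAP7VUERCCRNGRHWZG3YYHIUV7BVDQ5FDPRT2@morenodes.example.org",
    "JWXYDBPXYWG6FX3GMDIBFA6CJ4": "enrtree-branch:2XS2367YHAXJFGLZHVAWLQD4ZY,H4FHT4B454P6UXFD7JCYQ5PWDY,MHTDO6TMUBRIA2XWG5LUDACK24",
    "2XS2367YHAXJFGLZHVAWLQD4ZY": "enr:-HW4QOFzoVLaFJnNhbgMoDXPnOvcdVuj7pDpqRvh6BRDO68aVi5ZcjB3vzQRZH2IcLBGHzo8uUN3snqmgTiE56CH3AMBgmlkgnY0iXNlY3AyNTZrMaECC2_24YYkYHEgdzxlSNKQEnHhuNAbNlMlWJxrJxbAFvA",
    "H4FHT4B454P6UXFD7JCYQ5PWDY": "enr:-HW4QAggRauloj2SDLtIHN1XBkvhFZ1vtf1raYQp9TBW2RD5EEawDzbtSmlXUfnaHcvwOizhVYLtr7e6vw7NAf6mTuoCgmlkgnY0iXNlY3AyNTZrMaECjrXI8TLNXU0f8cthpAMxEshUyQlK-AM0PW2wfrnacNI",
    "MHTDO6TMUBRIA2XWG5LUDACK24": "enr:-HW4QLAYqmrwllBEnzWWs7I5Ev2IAs7x_dZlbYdRdMUx5EyKHDXp7AV5CkuPGUPdvbv1_Ms1CPfhcGCvSElSosZmyoqAgmlkgnY0iXNlY3AyNTZrMaECriawHKWdDRk2xeZkrOXBQ0dfMFLHY4eENZwdufn1S1o",
}

ROOT_PREFIX = "enrtree-root:v1 "
BRANCH_PREFIX = "enrtree-branch:"


def parse_root(txt):
    """Split a root record into ENR tree root, link tree root, seq and signature."""
    if not txt.startswith(ROOT_PREFIX):
        raise ValueError("not an enrtree-root record")
    fields = dict(part.split("=", 1) for part in txt[len(ROOT_PREFIX):].split())
    return {
        "enr_root": fields["e"],
        "link_root": fields["l"],
        "seq": int(fields["seq"]),
        "sig": fields["sig"],  # NOT verified in this sketch
    }


def collect_enrs(lookup, last_seq=0):
    """Walk the ENR tree and return (seq, list of ENR strings).

    `lookup` maps a record name to its TXT content.
    """
    root = parse_root(lookup("@"))
    if root["seq"] < last_seq:
        # Assumption: only a strictly older tree is treated as a rollback.
        raise ValueError("tree is older than the last synchronized version")

    enrs = []
    pending = [root["enr_root"]]
    while pending:
        record = lookup(pending.pop())
        if record.startswith(BRANCH_PREFIX):
            # A branch lists the names of its children.
            children = record[len(BRANCH_PREFIX):].split(",")
            pending.extend(child for child in children if child)
        elif record.startswith("enr:"):
            enrs.append(record)  # a leaf: a node we could connect to
        else:
            raise ValueError("unexpected record in ENR tree")
    return root["seq"], enrs


seq, enrs = collect_enrs(ZONE.__getitem__)
print("sequence number:", seq)
print("node records found:", len(enrs))
```

The link tree (the `l=` root) would be followed in the same way, except that its leaves are `enrtree://` entries pointing at other signed lists rather than node records.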
|
||||
|
||||
All of this has already been introduced into go-ethereum with the pull request [#20094](https://github.com/ethereum/go-ethereum/pull/20094), created by Felix Lange. There's a lot of tooling around it that already exists too which is really cool. So if your project is written in Golang and wants to use this, it's relatively simple! Additionally, here's a proof of concept that shows what this might look like with libp2p on [github](https://github.com/decanus/dns-discovery).
|
||||
|
||||
I hope this was a helpful explainer into DNS based discovery, and shows [EIP-1459](https://eips.ethereum.org/EIPS/eip-1459)'s benefits over more traditional DNS-based discovery schemes.
|
||||
261
rlog/2020-04-16-wechat-replacement-need.mdx
Normal file
@@ -0,0 +1,261 @@
|
||||
---
|
||||
layout: post
|
||||
name: 'What Would a WeChat Replacement Need?'
|
||||
title: 'What Would a WeChat Replacement Need?'
|
||||
date: 2020-04-16 12:00:00
|
||||
authors: oskarth
|
||||
published: true
|
||||
slug: wechat-replacement-need
|
||||
categories: research
|
||||
image: /img/tianstatue.jpg
|
||||
discuss: https://forum.vac.dev/t/discussion-what-would-a-wechat-replacement-need/42
|
||||
---
|
||||
|
||||
What would a self-sovereign, private, censorship-resistant and open alternative to WeChat look like?
|
||||
|
||||
<!--truncate-->
|
||||
|
||||
What would it take to replace WeChat? More specifically, what would a self-sovereign, private, censorship-resistant and open alternative look like? One that allows people to communicate, coordinate and transact freely.
|
||||
|
||||
## Background
|
||||
|
||||
### What WeChat provides to the end-user
|
||||
|
||||
Let's first look at some of the things that WeChat provides. It is a lot:
|
||||
|
||||
- **Messaging:** 1:1 and group chat. Text, as well as voice and video. Post gifs. Share location.
|
||||
- **Group chat:** Limited to 500 people; above 100 people, members need to verify with a bank account. Also has group video chat and QR code to join a group.
|
||||
- **Timeline/Moments:** Post comments with attachments and have people like/comment on it.
|
||||
- **Location Discovery:** See WeChat users that are nearby.
|
||||
- **Profile:** Nickname and profile picture; can alias people.
|
||||
- **"Broadcast" messages:** Send one message to many contacts, up to 200 people (spam limited).
|
||||
- **Contacts:** Max 5000 contacts (people get around it with multiple accounts and sim cards).
|
||||
- **App reach:** Many different web apps, extensions, native apps, etc. Scan QR code to access web app from phone.
|
||||
- **Selective posting:** Decide who can view your posts and who can view your comments on other people's post.
|
||||
- **Transact:** Send money gifts through red envelopes.
|
||||
- **Transact:** Use WeChat pay to transfer money to friends and businesses; linked account with Alipay that is connected to your bank account.
|
||||
- **Services:** Find taxis and get notifications; book flights, train tickets, hotels etc.
|
||||
- **Mini apps:** API for all kinds of apps that allow you to provide services etc.
|
||||
- **Picture in picture:** allowing you to have a video call while using the app.
|
||||
|
||||
And much more. I'm not going to go through it all in detail, and there are probably many things I don't know about WeChat since I'm not a heavy user living in mainland China.
|
||||
|
||||
### How WeChat works - a toy model
|
||||
|
||||
This is an overly simplistic model of how WeChat works, but it is sufficient for our purposes. This general design applies to most traditional client-server apps today.
|
||||
|
||||
To sign up for an account you need a phone number or equivalent. To get access to some features you need to verify your identity further, for example with official ID and/or bank account.
|
||||
|
||||
When you sign up, this creates an entry in the WeChat server, from now on treated as a black box. You authenticate with that box, and that's where you get your messages from. If you go online, the app asks that box for messages you have received while you were offline. If you log in from a different app, your contacts and conversations are synced from that box.
|
||||
|
||||
The box gives you an account, it deals with routing to your contacts, it stores messages and attachments and gives access to mini apps that people have uploaded. For transacting money, there is a partnership with a different company that has a different box which talks to your bank account.
|
||||
|
||||
This is done in such a way that they can support a billion users with the features above, no sweat.
|
||||
|
||||
Whoever controls that box can see who you are talking with and what the content of those messages is. There is no end to end encryption. If WeChat/Tencent disagrees with you for some reason they can ban you. This means you can't interact with the box under that name anymore.
|
||||
|
||||
## What do we want?
|
||||
|
||||
We want something that is self-sovereign, private, censorship-resistant and open that allows individuals and groups of people to communicate and transact freely. To explore what this means in more detail, without getting lost in the weeds, we provide the following list of properties. A lot of these are tied together, and some fall out of the other requirements. Some of them stand in slight opposition to each other.
|
||||
|
||||
**Self-sovereign identity**. Exercise authority within your own sphere. If you aren't harming anyone, you should be able to have an account and communicate with other people.
|
||||
|
||||
**Pseudonymity, and ideally total anonymity**. Not having your identity tied to your real name (e.g. through phone number, bank account, ID, etc). This allows people to act more freely without being overly worried about censorship and coercion in the real world. While total anonymity is even more desirable - especially to break multiple hops to a true-name action - real-world constraints sometimes make this more challenging.
|
||||
|
||||
**Private and secure communication**. Your communication and who you transact with should be for your eyes only. This includes transactions (transfer of value) as a form of communication.
|
||||
|
||||
**Censorship-resistance**. Not being able to easily censor individuals on the platform, whether at an individual, group or collective level. Not having single points of failure that allow service to be disrupted.
|
||||
|
||||
**Decentralization**. Partly falls out of censorship-resistance and other properties. If infrastructure isn't decentralized it means there's a single point of failure that can be disrupted. This is more of a tool than a goal on its own, but it is an important tool.
|
||||
|
||||
**Built for mass adoption**. Includes scalability, UX (latency, reliability, bandwidth consumption, UI etc), and allowing for people to stick around. One way of doing this is to allow users to discover people they want to talk to.
|
||||
|
||||
**Scalability**. Infrastructure needs to support a lot of users to be a viable alternative. Like, a billion of them (eventually).
|
||||
|
||||
**Fundamentals in place to support great user experience**. Aside from good UI and distribution, fundamentals such as latency, bandwidth usage, consistency etc must support great UX for this to be a viable alternative.
|
||||
|
||||
**Works for resource restricted devices, including smartphones**. Most people will use a smartphone to use this. This means it has to work well on them and similar devices, without becoming a second-class citizen where we ignore properties such as censorship-resistance and privacy. Some concession to reality will be necessary due to additional constraints, which leads us to...
|
||||
|
||||
**Adaptive nodes**. Nodes will have different capabilities, and perhaps at different times. To maintain a lot of the properties described here it is desirable if as many participants as possible are first-class citizens. If a phone is switching from a limited data plan to a WiFi network or from battery to AC power it can do more useful work, and so on. Likewise for a laptop with a lot of free disk space and spare compute power, etc.
|
||||
|
||||
**Sustainable**. If there's no centralized, top down ad-driven model, this means all the infrastructure has to be sustainable somehow. Since these are individual entities, this means it has to be paid for. While altruistic modes and similar can be used, this likely requires some form of incentivization scheme for useful services provided in the network. Related: free rider problem.
|
||||
|
||||
**Spam resistant**. Relates to sustainability, scalability and built for mass adoption. Made more difficult by pseudonymous identity due to whitewashing attacks.
|
||||
|
||||
**Trust-minimized**. To know that properties are provided for and aren't compromised, various ways of minimizing trust requirements are useful. This also relates to mass adoption and social cohesion. Examples include: open and audited protocols, open source, reproducible builds, etc. This also relates to how mini apps are provided for, since we may not know their source but want to be able to use them anyway.
|
||||
|
||||
**Open source**. Related to above, where we must be able to inspect the software to know that it functions as advertised and hasn't been compromised, e.g. by uploading private data to a third party.
|
||||
|
||||
Some of these are graded and a bit subtle, i.e.:
|
||||
|
||||
- Censorship resistance would ideally be able to absorb Internet shutdowns. This would require an extensive MANET/meshnet infrastructure, which while desirable, requires a lot of challenges to be overcome to be feasible.
|
||||
- Privacy would ideally make all actions (optionally) totally anonymous, though this may incur undue costs on bandwidth and latency, which impacts user experience.
|
||||
- Decentralization: certain topologies, such as DHTs, are efficient and quite decentralized but still have some centralized aspects, which make them attackable in various ways. Ditto for blockchains compared with bearer instruments which requires some coordinating infrastructure, compared with naturally occurring assets such as precious metals.
|
||||
- "Discover people" and striving for "total anonymity" might initially seem incompatible. The idea is to provide for sane defaults, and then allow people to decide how much information they want to disclose. This is the essence of privacy.
|
||||
- Users often want _some_ form of moderation to get a good user experience, which can be seen as a form of censorship. The idea is to raise the bar on the basics, the fundamental infrastructure. If individuals or specific communities want certain moderation mechanisms, that is still a compatible requirement.
|
||||
|
||||
### Counterpoint 1
|
||||
|
||||
We could refute the above by saying that the design goals are undesirable. We want a system where people can censor others, and where everyone is tied to their real identity. Or we could say something like, freedom of speech is a general concept, and it doesn't apply to Internet companies, even if they provide a vital service. You can survive without it and you should've read the terms of service. This roughly characterizes the mainstream view.
|
||||
|
||||
An additional factor here is the idea that a group of people know more about what's good for you than you do, so they are protecting you.
|
||||
|
||||
### Counterpoint 2
|
||||
|
||||
We could agree with all these design goals, but think they are too extreme in terms of their requirements. For example, we could operate as a non profit, take donations and volunteers, and then host the whole infrastructure ourselves. We could say we are in a friendly jurisdiction, so we won't be a single point of failure. Since we are working on this and maybe even our designs are open, you can trust us and we'll provide service and infrastructure that gives you what you want without having to pay for it or solve all these complex problems of decentralized computation and so on. If you don't trust us for some reason, you shouldn't use us regardless. Also, this is better than status quo. And we are more likely to survive by doing this, either by taking shortcuts or by being less ambitious in terms of scope.
|
||||
|
||||
## Principal components
|
||||
|
||||
There are many ways to skin a cat, but this is one way of breaking down the problem. We have a general direction with the properties listed above, together with some understanding of how WeChat works for the everyday user. Now the question is, what infrastructure do we need to support this? How do we achieve the above properties, or at least get closer to them? We want to figure out the necessary building blocks, and one way of doing this is to map out likely necessary components.
|
||||
|
||||
### Background: Ethereum and Web3 stack
|
||||
|
||||
It is worth noting that a lot of the required infrastructure has been developed, at least as concepts, in the original Ethereum / Web3 vision. In it there is Ethereum for consensus/compute/transact, storage through Swarm, and communication through Whisper. That said, the main focus has been on the Ethereum blockchain itself, and a lot of things have happened in the last 5y+ with respect to technology around privacy and scalability. It is worth revisiting things from a fresh point of view, with the WeChat alternative in mind as a clear use case.
|
||||
|
||||
### Account - self-sovereign identity and the perils of phone numbers
|
||||
|
||||
Starting from the most basic: what is an account and how do you get one? With most internet services today, WeChat and almost all popular messaging apps included, you need to sign up with some centralized authority. Usually you also have to verify this with some data that ties this account to you as an individual, e.g. by requiring a phone number, which in most jurisdictions [^1] means giving out your real ID. This also means you can be banned from using the service by a somewhat arbitrary process, with no due process.
|
||||
|
||||
Now, we could argue these app providers can do what they want. And they are right, in a very narrow sense. As apps like WeChat (and Google) become general-purpose platforms, they become more and more ingrained in our everyday lives. They start to provide utilities that we absolutely require to work to go about our day, such as paying for food or transportation. This means we need a higher standard than this.
|
||||
|
||||
Justifications for requiring phone numbers are usually centered around three claims:
|
||||
|
||||
1. Avoiding spam
|
||||
2. Tying your account to your real name, for various reasons
|
||||
3. Using it as a commonly shared identifier for social network discovery
|
||||
|
||||
Of course, many services require more than phone numbers. E.g. email, other forms of personal data such as voice recording, linking a bank account, and so on.
|
||||
|
||||
In contrast, a self-sovereign system would allow you to "create an account" completely on your own. This can easily be done with public key cryptography, and it also paves the way for end-to-end encryption to make your messages private.
|
||||
|
||||
The main issue with this is that you need to get more creative about avoiding spam (e.g. through whitewashing attacks), and ideally there is some other form of social discovery mechanism.
|
||||
|
||||
Just having a public key as an account isn't enough though. If it goes through a central server, then nothing is stopping that server from arbitrarily blocking requests related to that public key. Of course, this also depends on how transparent such requests are. Fundamentally, lest we rely completely on goodwill, there needs to be multiple actors by which you can use the service. This naturally points to decentralization as a requirement. See counterpoint.
|
||||
|
||||
Even so, if the system is closed source we don't know what it is doing. Perhaps the app communicating is also uploading data to another place, or somehow making it possible to see who is who and act accordingly.
|
||||
|
||||
You might notice that just one simple property, self-sovereign identity, leads to a slew of other requirements and properties. You might also notice that WeChat is far from alone in this, even if their identity requirements might be a bit more stringent than, say, Telegram. Their control aspects are also a bit more extreme, at least for someone with western sensibilities [^2].
|
||||
|
||||
Most user facing applications have similar issues: Google Apps/FB/Twitter etc. For popular tools that have this built in, we can look at git, which is truly decentralized and has a keypair at the bottom. It is for a very specific technical domain, and even then people rely on Github. Key management is fairly difficult even for technical people, and for normal people even more so. Banks are generally far behind on this tech, relying on arcane procedures and special purpose hardware for 2FA. That's another big issue.
|
||||
|
||||
Let's shift gears a bit and talk about some other functional requirements.
|
||||
|
||||
### Routing - packets from A to B
|
||||
|
||||
In order to get a lot of the features WeChat provides, we need the ability to do three things: communicate, store data, and transact with people. We need a bit more than that, but let's focus on this for now.
|
||||
|
||||
To communicate with people, in the base case, we need to go from one phone to another phone that is separated by a large distance. This requires some form of routing. The most natural platform to build this on is the existing Internet, though not the only one. Most phones are resource restricted, and are only "on" for brief periods of time. This is needed to preserve battery and bandwidth. Additionally, the Internet uses IPs as endpoints, which change as a phone moves through space. NAT punching etc isn't always perfect either. This means we need a way to get a message from one public key to another, and through some intermediate nodes. We can think of these nodes as a form of service network. Similar to how a power grid works, or phone lines, or a collection of ISPs.
|
||||
|
||||
One important property here is to ensure we don't end up in a situation like the centralized capture scenario above, something we've seen with centralized ISPs [^3] [^4] where they can choose which traffic is good and which is bad. We want to allow the use of different service nodes, just like if a restaurant gives you food poisoning you can go to the one next door and then the first one goes out of business after a while. And the circle of life continues.
|
||||
|
||||
We shouldn't be naive though, and think that this is something nodes are likely to do for free. They need to be adequately compensated for their services, through some form of incentivization scheme. That can either be monetary, or as in the case of Bittorrent, more of a barter situation where you use game theory to coordinate with strangers [^5], and some form of reputation attached to it (for private trackers).
|
||||
|
||||
There are many ways of doing routing, and we won't go into too much technical detail here. Suffice it to say that you likely want both a structured and unstructured alternative, and that these come with several trade-offs when it comes to efficiency, metadata protection, ability to incentivize, compatibility with existing topologies, and suitability for mobile phones (mostly offline, bandwidth restricted, not directly connectable). Expect more on this in a future article.
|
||||
|
||||
Some of these considerations naturally lead us into the storage and transaction components.
|
||||
|
||||
### Storage - available and persistent for later
|
||||
|
||||
If mobile phones are mostly offline, we need some way to store these messages so they can be retrieved when online again. The same goes for various kinds of attachments as well, and for when people are switching devices. A user might control their timeline, but in the WeChat case that timeline is stored on Tencent's servers, and queried from there as well. This naturally needs to happen by some other service nodes. In the WeChat case, and for most IMs, the way these servers are paid for is through some indirect ad mechanism. The entity controlling these ads and so on is the same one as the one operating the servers for storage. A more direct model with different entities would see these services being compensated for their work.
|
||||
|
||||
We also need storage for attachments, mini-apps, as well as a way of understanding the current state of consensus when it comes to the compute/transact module. In the WeChat case, this state is completely handled by the bank institution or one of their partners, such as Alibaba. When it comes to bearer instruments like cash, no state needs to be kept as that's a direct exchange in the physical world. This isn't directly compatible with transferring value over a distance.
|
||||
|
||||
All of this state requires availability and persistence. It should be done in a trust-minimized and decentralized fashion, which requires some form of incentivization for keeping data around. If it isn't, you are relying on social cohesion, which breaks down at very large scales.
|
||||
|
||||
Since data will be spread out across multiple nodes, you need a way to sync data and transfer it in the network, as well as a way to add data to it and query data from it. All of this requires a routing component.
|
||||
|
||||
To make it more censorship resistant it might be better to keep it as a general-purpose store, i.e. individuals don't need to know what they are storing. Otherwise, you naturally end up in a situation where individual nodes can be pressured to not store certain content.
|
||||
|
||||
### Messaging - from me to you to all of us (not them)
|
||||
|
||||
This builds on top of routing, but it has a slightly different focus. The goal is to allow for individuals and groups to communicate in a private, secure and censorship-resistant manner.
|
||||
|
||||
It also needs to provide a decent interface to the end user, in terms of dealing seamlessly with offline messages and providing reliable and timely messaging.
|
||||
|
||||
In order to get closer to the ideal of total anonymity, it is useful to be able to hide metadata of who is talking to whom. This applies to both normal communication as well as transactions. Ideally, no one but the parties involved can see who is taking part in a conversation. This can be achieved through various techniques such as mixnets, anonymous credentials, private information retrieval, and so on. Many of these techniques have a fundamental trade-off with latency and bandwidth, something that is a big concern for mobile phones. Being able to do some form of tuning, in an adaptive node manner, depending on your threat model and current capabilities is useful here.
|
||||
|
||||
The baseline here is pseudonymity, and having tools to allow individuals to "cut off" ties to their real world identity and transactions. People act differently in different circles in the real world, and this should be mimicked online as well. Your company, family or government shouldn't be able to know what exactly you use your paycheck for, and who you are talking to.
|
||||
|
||||
### Compute - transact, contract and settle
|
||||
|
||||
The most immediate need here is transaction from A to B. Direct exchange. There is also a more indirect need for private lawmaking and contracting.
|
||||
|
||||
We talked about routing and storage and how they likely need to be incentivized to work properly. How are they going to be compensated? While this could in theory work via the existing banking system and so on, this would be rather heavy. It'd also very likely require tying your identifier to your legal name, something that goes against what we want to achieve. What we want is something that acts more as right-to-access, similar to the way cash functions in a society [^6]. I pay for a fruit with something that is valuable to you and then I'm on my way.
|
||||
|
||||
While there might be other candidates, such as pre-paid debit cards and so on, this transaction mode pretty much requires a cryptocurrency component. The alternative is to do it on a reputation basis, which might work for small communities, due to social cohesion, but quickly deteriorates for large ones [^7]. Ad hoc models like private Bittorrent trackers are centralized and easy to censor.
|
||||
|
||||
Now, none of the existing cryptocurrency models are ideal. They also all suffer from lack of widespread use, and it is difficult to get onboarded to them in the first place. Transactions in Bitcoin are slow. Ethereum is faster and has more capabilities, but it still suffers from linking payments over time, which makes the privacy part of this more difficult. Zcash, Monero and similar are interesting, but also require more use. For Zcash, shielded transactions appear to only account for less than 2% of all transactions in 2019 [^8] [^9].
|
||||
|
||||
Another dimension is what sets general purpose cryptocurrencies like Ethereum apart. Aside from just paying from A to B, you can encode rules about when something should be paid out and not. This is very useful for doing a form of private lawmaking, contracting, for setting up service agreements with these nodes. If there's no trivial recourse as in the meatspace world, where you know someone's name and you can sue them, you need a different kind of model.
|
||||
|
||||
What makes something like Zcash interesting is that it works more like digital cash. Instead of leaving a public trail for everyone, where someone can see where you got the initial money from and then trace you across various usage, for Zcash every hop is privacy preserving.
|
||||
|
||||
To fulfill the general goals of being censorship resistant and secure, it is also vital that the system being used stays online and can't be easily disrupted. That points to disintermediation, as opposed to using gateways and exchanges. This is a case where something like cash, or gold, is more direct, since no one can censor this transaction without being physically present where this direct exchange is taking place. However, like before, this doesn't work over distance.
|
||||
|
||||
### Secure chat - just our business
|
||||
|
||||
Similar to the messaging module above. The distinction here is that we assume the network part has already taken place. Here we are interested in keeping the contents of messages private, so that means confidentiality/end-to-end encryption, integrity, authentication, as well as forward secrecy and plausible deniability. This means that even if some actor gets hold of some private key material, or confiscates your phone, there is some level of...ephemerality to your conversations. Another issue here is scalable private group chat.
|
||||
|
||||
### Extensible mini apps
|
||||
|
||||
This relates to the compute and storage module above. Essentially we want to provide mini apps as in WeChat, but to do so in a way that is compatible with what we want to achieve more generally. This allows individuals and small businesses to create small tools for various purposes, and coordinate with strangers. E.g. booking a cab or getting insurance, and so on.
|
||||
|
||||
This has a higher dependency on the contracting/general computation aspect. I.e. often it isn't only a transaction, but you might want to encode some specific rules here that strangers can abide by without having too high trust requirements. As a simple example: escrows.
|
||||
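To make the escrow example a little more concrete, here is a minimal sketch of the kind of rule that could be encoded. It is a plain Python model, not a smart contract, and every name in it (`Escrow`, `State`, the method names, the parties) is hypothetical and chosen only for illustration.

```python
from enum import Enum


class State(Enum):
    AWAITING_PAYMENT = "awaiting payment"
    AWAITING_DELIVERY = "awaiting delivery"
    COMPLETE = "complete"
    REFUNDED = "refunded"


class Escrow:
    """Holds the buyer's funds until one of the agreed rules releases them."""

    def __init__(self, buyer: str, seller: str, amount: int) -> None:
        self.buyer = buyer
        self.seller = seller
        self.amount = amount
        self.state = State.AWAITING_PAYMENT

    def deposit(self, sender: str, amount: int) -> None:
        if self.state is not State.AWAITING_PAYMENT:
            raise RuntimeError("no deposit expected in this state")
        if sender != self.buyer or amount != self.amount:
            raise ValueError("only the buyer can deposit the agreed amount")
        self.state = State.AWAITING_DELIVERY

    def confirm_delivery(self, sender: str) -> str:
        """Buyer confirms receipt; returns the party that gets paid."""
        self._require(sender == self.buyer)
        self.state = State.COMPLETE
        return self.seller

    def refund(self, sender: str) -> str:
        """Seller cancels the deal; returns the party that gets the funds back."""
        self._require(sender == self.seller)
        self.state = State.REFUNDED
        return self.buyer

    def _require(self, allowed: bool) -> None:
        # Funds can only move while they are held, and only by the right party.
        if self.state is not State.AWAITING_DELIVERY or not allowed:
            raise PermissionError("action not allowed for this party or state")


deal = Escrow(buyer="alice", seller="bob", amount=10)
deal.deposit("alice", 10)
print(deal.confirm_delivery("alice"), deal.state)
```

The point of the sketch is that neither party has to trust the other: the rules decide who can release or return the funds, and a real deployment would enforce them through the compute module rather than a local object.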
|
||||
This also needs an open API that anyone can use. It should be properly secured, so using one doesn't compromise the rest of the system it is operating in. To be censorship resistant it requires the routing and storage component to work properly.
|
||||
|
||||
## Where are we now?
|
||||
|
||||
Let's look back at some of the desirable properties we set out in the beginning and see how close we are to building out the necessary components. Is it realistic at all or just a pipe dream? We'll see that there are many building blocks in place, and there's reason for hope.
|
||||
|
||||
**Self-sovereign identity**. Public key crypto and web-of-trust-like constructs make this possible.
|
||||
|
||||
**Pseudonymity, and ideally total anonymity**. Pseudonymity can largely be achieved with public key crypto and open systems that allow for permissionless participation. For transactions, pseudonymity exists in most cryptocurrencies. The challenge is linkage across time, especially when interfacing with other "legacy" systems. There are stronger constructs that are actively being worked on and are promising here, such as mixnets (Nym), mixers (Wasabi Wallet, Tornado.Cash) and zero knowledge proofs (Zcash, Ethereum, Starkware). This area of applied research has exploded over the last few years.
|
||||
|
||||
**Private and secure communication**. Signal has pioneered a lot of this, following OTR. Double Ratchet, X3DH. E2EE is the minimum these days, and properties like PFS and PD are getting better. For metadata protection, you have Tor, with its faults, and more active research on mixnets and private information retrieval, etc.
|
||||
|
||||
**Censorship-resistance**. This covers a lot of ground across the spectrum. Technologies like Bittorrent, Bitcoin/Ethereum, Tor obfuscated transports, E2EE by default, partial mesh networks in production, and the ability to move/replicate host machines more quickly have all made this more of a reality than it used to be. Of course, techniques such as deep packet inspection and internet shutdowns have increased as well.
|
||||
|
||||
**Decentralization**. Cryptocurrencies, projects like libp2p and IPFS. Need to be mindful here of many projects that claim decentralization but are still vulnerable to single points of failures, such as relying on gateways.
|
||||
|
||||
**Built for mass adoption**. This one is more subjective. There's definitely a lot of work to be done here, both when it comes to fundamental performance, key management and things like social discoverability. Directionally these things are improving and becoming easier for the average person, but there is a lot to be done here.
|
||||
|
||||
**Scalability**. With projects like Ethereum 2.0 and IPFS, more and more resources are being put into this, both at the consensus/compute layer as well as the networking (gossip, scalable Kademlia) layer. Also various layer 2 solutions for transactions.
|
||||
|
||||
**Fundamentals in place to support great user experience**. Similar to built for mass adoption. As scalability becomes more important, more applied research is being done in the p2p area to improve things like latency, bandwidth.
|
||||
|
||||
**Works for resource restricted devices, including smartphones**. Work in progress and not enough focus here, generally an afterthought. Also have stateless clients etc.
|
||||
|
||||
**Adaptive nodes**. See above. With subprotocols and capabilities in Ethereum and libp2p, this is getting easier.
|
||||
|
||||
**Sustainable**. Token economics is a thing. While a lot of it won't stay around, there are many more projects working on making themselves dispensable. Being open source, having an engaged community and enabling users to run their own infrastructure. Users as stakeholders.
|
||||
|
||||
**Spam resistant**. Tricky problem if you want to be pseudonymous, but some signs of hope with incentivization mechanisms, zero knowledge based signaling, etc. Together with various forms of rate limiting and better controlling of topology and network amplification. And just generally being battle-tested by real world attacks, such as historical Ethereum DDoS attacks.
|
||||
|
||||
**Trust minimized**. Bitcoin. Zero knowledge provable computation. Open source. Reproducible builds. Signed binaries. Incentive compatible structures. Independent audits. Still a lot of work, but getting better.
|
||||
|
||||
**Open source**. Big and only getting bigger. Including mainstream companies.
|
||||
|
||||
## What's next?
|
||||
|
||||
We've looked at what WeChat provides and what we'd like an alternative to look like. We've also seen a few principal modules that are necessary to achieve those goals. To achieve all of this is a daunting task, and one might call it overly ambitious. We've also seen how far we've come with some of the goals, and how a lot of the pieces are there, in one form or another. Then it is a question of putting them all together in the right mix.
|
||||
|
||||
The good news is that a lot of people are working on all these building blocks and thinking about these problems. Compared to a few years ago we've come quite far when it comes to p2p infrastructure, privacy, security, scalability, and general developer mass and mindshare. If you want to join us in building some of these building blocks, and assembling them, check out our forum.
|
||||
|
||||
PS. We are [hiring protocol engineers](https://status.im/our_team/open_positions.html). DS
|
||||
|
||||
## Acknowledgements
|
||||
|
||||
Corey, Dean, Jacek.
|
||||
|
||||
## References
|
||||
|
||||
[^1]: Mandatory SIM card registration laws: https://privacyinternational.org/long-read/3018/timeline-sim-card-registration-laws
|
||||
[^2]: On WeChat keyword censorship: https://citizenlab.ca/2016/11/wechat-china-censorship-one-app-two-systems/
|
||||
[^3]: Net Neutrality: https://www.eff.org/issues/net-neutrality
|
||||
[^4]: ISP centralization: https://ilsr.org/repealing-net-neutrality-puts-177-million-americans-at-risk/
|
||||
[^5]: Incentives Build Robustness in BitTorrent: bittorrent.org/bittorrentecon.pdf
|
||||
[^6]: The Case for Electronic Cash: https://coincenter.org/files/2019-02/the-case-for-electronic-cash-coin-center.pdf
|
||||
[^7]: Money, blockchains, and social scalability: http://unenumerated.blogspot.com/2017/02/money-blockchains-and-social-scalability.html
|
||||
[^8]: Zcash private transactions (partial paywall): https://www.theblockcrypto.com/genesis/48413/an-analysis-of-zcashs-private-transactions
|
||||
[^9]: Shielded transactions usage (stats page 404s): https://z.cash/support/faq/
|
||||
87
rlog/2020-04-27-feasibility-discv5.mdx
Normal file
@@ -0,0 +1,87 @@
|
||||
---
|
||||
layout: post
|
||||
name: 'Feasibility Study: Discv5'
|
||||
title: 'Feasibility Study: Discv5'
|
||||
date: 2020-04-27 12:00:00
|
||||
authors: dean
|
||||
published: true
|
||||
slug: feasibility-discv5
|
||||
categories: research
|
||||
discuss: https://discuss.status.im/t/discv5-feasibility-study/1632
|
||||
---
|
||||
|
||||
Looking at discv5 and the theoretical numbers behind finding peers.
|
||||
|
||||
<!--truncate-->
|
||||
|
||||
> Disclaimer: some of the numbers found in this write-up could be inaccurate. They are based on the author's current understanding of the theoretical parts of the protocol itself and are meant to provide a rough overview rather than binding numbers.
|
||||
|
||||
This post serves as a more authoritative overview of the discv5 study. For a more discursive post providing more context, make sure to check out the corresponding [discuss post](https://discuss.status.im/t/discv5-feasibility-study/1632). Additionally, if you are unfamiliar with discv5, check out my previous write-up: ["From Kademlia to Discv5"](https://vac.dev/kademlia-to-discv5).
|
||||
|
||||
## Motivating Problem
|
||||
|
||||
The discovery method currently used by [Status](https://status.im) is made up of various components and grew over time to solve a mix of problems. We want to simplify this while maintaining some of the properties we currently have.
|
||||
|
||||
Namely, we want to ensure censorship resistance to state-level adversaries. One of the issues Status had, which caused them to add to their discovery method, was the fact that addresses from providers like AWS and GCP were blocked both in Russia and China. Additionally, one of the main factors required is the ability to function on resource restricted devices.
|
||||
|
||||
Considering we are talking about resource restricted devices, let's look at the implications and what we need to consider:
|
||||
|
||||
- **Battery consumption** - constant connections like websockets consume a lot of battery life.
|
||||
- **CPU usage** - certain discovery methods may be CPU intensive, slowing an app down and making it unusable.
|
||||
- **Bandwidth consumption** - a lot of users will be using data plans, so the discovery method needs to be efficient in order to accommodate those users without using up significant portions of their data plans.
|
||||
- **Short connection windows** - the discovery algorithm needs to be low latency, meaning it needs to return results fast. This is because many users will only have the app open for a short amount of time.
|
||||
- **Not publicly connectable** - There is a good chance that most resource restricted devices are not publicly connectable.
|
||||
|
||||
For a node to be able to participate as both a provider and a consumer in the discovery method, meaning a node both reads from other nodes' stored DHTs and hosts the DHT for other nodes to read from, it needs to be publicly connectable. This means another node must be able to connect to some public IP of the given node.
|
||||
|
||||
With devices that are behind a NAT, this is easier said than done. Mobile devices in particular are often stuck behind a symmetric NAT when connected to 4G LTE networks, drastically reducing the success rate of NAT traversal. Keeping this in mind, it becomes obvious that most resource restricted devices will be consumers rather than providers due to this technical limitation.
|
||||
|
||||
In order to answer our questions, we formulated the problem with a simple method for testing. The "needle in a haystack" problem was formulated to figure out how easily a specific node can be found within a given network. This issue was fully formulated in [vacp2p/research#15](https://github.com/vacp2p/research/issues/15).
|
||||
|
||||
## Overview
|
||||
|
||||
The main thing we wanted to investigate was the overhead of finding a peer. This means we wanted to look at the bandwidth, latency and effectiveness of this. There are 2 methods which we can use to find a peer:
|
||||
|
||||
- We can find a peer with a specific ID, using normal lookup methods as documented by Kademlia.
|
||||
- We can find a peer that advertises a capability. This is possible using either capabilities advertised in the ENR or through [topic tables](https://github.com/ethereum/devp2p/blob/master/discv5/discv5-theory.md#topic-advertisement).
|
||||
|
||||
## Feasibility
|
||||
|
||||
To be able to investigate the feasibility of discv5, we used various methods including rough calculations which can be found in the [notebook](https://vac.dev/discv5-notebook/), and a simulation isolated in [vacp2p/research#19](https://github.com/vacp2p/research/pull/19).
|
||||
|
||||
### CPU & Memory Usage
|
||||
|
||||
The experimental discv5 has already been used within Status; however, what was noticed was that the CPU and memory usage was rather high. It should therefore be investigated whether this is still the case and, if it is, where it stems from. Additionally, it is worth looking at whether or not this is the case with both the Go and Nim implementations.
|
||||
|
||||
See details: [vacp2p/research#31](https://github.com/vacp2p/research/issues/31)
|
||||
|
||||
### NAT on Cellular Data
|
||||
|
||||
If a peer is not publicly connectable it cannot participate in the DHT both ways. A lot of mobile phones are behind symmetric NATs, which make UDP hole-punching close to impossible. It should be investigated whether or not mobile phones will be able to participate both ways and if there are good methods for doing hole-punching.
|
||||
|
||||
See details: [vacp2p/research#29](https://github.com/vacp2p/research/issues/29)
|
||||
|
||||
### Topic Tables
|
||||
|
||||
Topic Tables allow us to efficiently find nodes given a specific topic. However, they are not implemented in the [status-im/nim-eth](https://github.com/status-im/nim-eth/) implementation, nor are they fully finalized in the spec. These are important if the network grows past a size where the concentration of specific nodes is relatively low, making them hard to find.
|
||||
|
||||
See details: [vacp2p/research#26](https://github.com/vacp2p/research/issues/26)
|
||||
|
||||
### Finding a node
|
||||
|
||||
It is important to note that if a network is relatively small, e.g. 100-500 nodes, then finding a node given a specific address is relatively manageable. Additionally, if the concentration of a specific capability in a network is reasonable, then finding a node advertising its capabilities using an ENR rather than the topic table is also manageable. A reasonable concentration would be, for example, 10%, which would give us an 80% chance of getting a node with that capability in the first lookup request. This can be explored more using our [discv5 notebook](https://vac.dev/discv5-notebook/#Needle-in-a-haystack-with-ENR-records-indicating-capabilities).
|
||||
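The 80% figure can be approximated with a quick back-of-the-envelope calculation. The sketch below assumes that a single lookup response returns 16 nodes (a typical Kademlia bucket size, assumed here rather than taken from the text) and that the capability is spread uniformly and independently across nodes. The function name is hypothetical.

```python
def chance_of_hit(concentration: float, nodes_per_response: int = 16) -> float:
    """Probability that at least one returned node has the capability."""
    # Each returned node lacks the capability with probability
    # (1 - concentration); a miss requires all of them to lack it.
    return 1.0 - (1.0 - concentration) ** nodes_per_response


for share in (0.01, 0.05, 0.10, 0.25):
    print(f"{share:.0%} concentration -> {chance_of_hit(share):.1%} chance")
```

Under these assumptions a 10% concentration yields roughly an 81% chance of a hit in the first response, which is in line with the number quoted above. Lower concentrations fall off quickly, which is where topic tables become interesting.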
|
||||
## Results
|
||||
|
||||
Research has shown that finding a node in the DHT has a relatively low effect on bandwidth, both inbound and outbound. For example, when trying to find a node in a network of 100 nodes, it would take roughly 5668 bytes total. Additionally, if we assume 100ms latency per request, the lookup would take ≈ 300ms, corresponding to 3 requests to find a specific node.
|
||||
|
||||
## General Thoughts
|
||||
|
||||
One of the main blockers right now is figuring out what the CPU and memory usage of discv5 is on mobile phones. This is a large blocker as it affects one of the core problems for us. We need to consider whether discv5 is an upgrade, as it allows us to simplify our current discovery process, or if it is too much of an overhead for resource restricted devices. The topic table feature could largely enhance discovery; however, it is not yet implemented. Provided that CPU and memory usage isn't too high, discv5 could probably be used, as the other issues are more "features" than large scale issues. Implementing it would already reduce the ability of state level adversaries to censor our nodes.
|
||||
|
||||
## Acknowledgements
|
||||
|
||||
- Oskar Thoren
|
||||
- Dmitry Shmatko
|
||||
- Kim De Mey
|
||||
- Corey Petty
|
||||
124
rlog/2020-04-9-kademlia-to-discv5.mdx
Normal file
@@ -0,0 +1,124 @@
|
||||
---
|
||||
layout: post
|
||||
name: 'From Kademlia to Discv5'
|
||||
title: 'From Kademlia to Discv5'
|
||||
date: 2020-04-9 16:00:00
|
||||
authors: dean
|
||||
published: true
|
||||
slug: kademlia-to-discv5
|
||||
categories: research
|
||||
---
|
||||
|
||||
A quick history of discovery in peer-to-peer networks, along with a look into discv4 and discv5, detailing what they are, how they work and where they differ.
|
||||
|
||||
<!--truncate-->
|
||||
|
||||
If you've been working on Ethereum or adjacent technologies you've probably heard of [discv4](https://github.com/ethereum/devp2p/blob/master/discv4.md) or [discv5](https://github.com/ethereum/devp2p/blob/master/discv5/discv5.md). But what are they actually? How do they work and what makes them different? To answer these questions, we need to start at the beginning. This post assumes little prior knowledge of the subject, so it should be accessible to anyone.
|
||||
|
||||
## The Beginning
|
||||
|
||||
Let's start right at the beginning: the problem of discovery and organization of nodes in peer-to-peer networks.
|
||||
|
||||
Early P2P file sharing technologies, such as Napster, would share information about who holds what file using a single server. A node would connect to the central server and give it a list of the files it owns. Another node would then connect to that central server, find a node that has the file it is looking for and contact that node. This however was a flawed system -- it was vulnerable to attacks and left a single party open to lawsuits.
|
||||
|
||||
It became clear that another solution was needed, and after years of research and experimentation, we were given the distributed hash table or DHT.
|
||||
|
||||
## Distributed Hash Tables
|
||||
|
||||
In 2001, four new protocols for such DHTs were conceived, [Tapestry](https://pdos.csail.mit.edu/~strib/docs/tapestry/tapestry_jsac03.pdf), [Chord](https://pdos.csail.mit.edu/papers/chord:sigcomm01/chord_sigcomm.pdf), [CAN](https://people.eecs.berkeley.edu/~sylvia/papers/cans.pdf) and [Pastry](http://rowstron.azurewebsites.net/PAST/pastry.pdf), all of which made various trade-offs and changes in their core functionality, giving them unique characteristics.
|
||||
|
||||
But as said, they're all DHTs. So what is a DHT?
|
||||
|
||||
A distributed hash table (DHT) is essentially a distributed key-value list. Nodes participating in the DHT can easily retrieve the value for a key.
|
||||
|
||||
If we have a network with 9 key-value pairs and 3 nodes, ideally each node would store 3 (optimally 6 for redundancy) of those key-value pairs, meaning that if a key-value pair were to be updated, only part of the network would be responsible for ensuring that it is. The idea is that any node in the network would know where to find the specific key-value pair it is looking for based on how things are distributed amongst the nodes.
|
||||
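As a rough illustration of the 9-pairs-on-3-nodes example, the sketch below assigns every key to the two nodes whose identifiers are closest to it. The hashing scheme, the 32-bit identifiers, the use of XOR as the closeness measure and all names are assumptions made for this example. Real DHTs differ in how they measure closeness, and because the identifiers are hashes the split will not be perfectly even.

```python
import hashlib
from typing import Dict, List


def dht_id(name: str) -> int:
    """Map a name to a 32-bit identifier (shortened for readability)."""
    return int.from_bytes(hashlib.sha256(name.encode()).digest()[:4], "big")


def responsible_nodes(key: str, nodes: List[str], replicas: int = 2) -> List[str]:
    """Return the `replicas` nodes whose IDs are closest to the key's ID."""
    key_id = dht_id(key)
    return sorted(nodes, key=lambda node: dht_id(node) ^ key_id)[:replicas]


nodes = ["node-a", "node-b", "node-c"]
keys = [f"key-{i}" for i in range(9)]

# Every node can compute this mapping locally, so any node knows whom to ask.
placement: Dict[str, List[str]] = {key: responsible_nodes(key, nodes) for key in keys}

for key, holders in placement.items():
    print(key, "->", holders)
```

With two replicas per key, the 9 pairs result in 18 stored copies, i.e. 6 per node on average, matching the redundancy figure above.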
|
||||
## Kademlia
|
||||
|
||||
So now that we know what DHTs are, let's get to Kademlia, the predecessor of discv4. Kademlia was created by Petar Maymounkov and David Mazières in 2002. I will naively say that this is probably one of the most popular and most used DHT protocols. It's quite simple in how it works, so let's look at it.
|
||||
|
||||
In Kademlia, nodes and values are arranged by distance (in a very mathematical definition). This distance is not a geographical one, but rather based on identifiers. How far two identifiers are from each other is calculated using some distance function.
|
||||
|
||||
Kademlia uses an `XOR` as its distance function. An `XOR` is a function that outputs `true` only when inputs differ. Here is an example with some binary identifiers:
|
||||
|
||||
```
XOR 10011001
    00110010
    --------
    10101011
```
|
||||
|
||||
In decimal numbers, the example above means that the distance between `153` and `50` is `171`.
|
||||
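The same calculation can be reproduced in a couple of lines. This is only a sketch with 8-bit identifiers as in the example above; the function name is hypothetical.

```python
def xor_distance(a: int, b: int) -> int:
    """Kademlia-style distance: the bitwise XOR of two identifiers."""
    return a ^ b


a, b = 0b10011001, 0b00110010  # 153 and 50 in decimal
d = xor_distance(a, b)
print(f"{a:08b} xor {b:08b} = {d:08b} ({d})")
```

Note that the distance is itself just another number in the identifier space, which is what makes it cheap to compute and compare.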
|
||||
There are several reasons why `XOR` was taken:
|
||||
|
||||
1. The distance from one ID to itself will be `0`.
|
||||
2. Distance is symmetric, A to B is the same as B to A.
|
||||
3. Follows the triangle inequality: if `A`, `B` and `C` are points on a triangle, then the distance from `A` to `B` is less than or equal to the distance from `A` to `C` plus the distance from `B` to `C`.
|
||||
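These three properties can be checked exhaustively for a small identifier space. The sketch below uses 5-bit identifiers purely to keep the check fast; the size is an arbitrary choice for this example.

```python
from itertools import product

BITS = 5  # small ID space so that checking every combination stays cheap
ids = range(2 ** BITS)


def distance(a: int, b: int) -> int:
    return a ^ b


# 1. The distance from an ID to itself is 0.
identity = all(distance(a, a) == 0 for a in ids)

# 2. The distance is symmetric.
symmetry = all(distance(a, b) == distance(b, a) for a, b in product(ids, repeat=2))

# 3. The triangle inequality holds for every triple of IDs.
triangle = all(
    distance(a, b) <= distance(a, c) + distance(c, b)
    for a, b, c in product(ids, repeat=3)
)

print(identity, symmetry, triangle)
```

The triangle inequality follows from the fact that `a ^ b` equals `(a ^ c) ^ (c ^ b)`, and the XOR of two numbers is never larger than their sum.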
|
||||
In summary, this distance function allows a node to decide what is "close" to it and make decisions based on that "closeness".
|
||||
|
||||
Kademlia nodes store a routing table. This table contains multiple lists. Each subsequent list contains nodes which are a little further distanced than the ones included in the previous list. Nodes maintain detailed knowledge about nodes closest to them, and the further away a node is, the less knowledge the node maintains about it.
|
||||
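A minimal sketch of such a routing table is shown below. It assumes 8-bit identifiers and a bucket size of 3, both chosen only to keep the output small, and it leaves out the eviction and liveness checks a real implementation needs. The class and method names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List

BUCKET_SIZE = 3  # hypothetical; kept small so the effect is easy to see


class RoutingTable:
    def __init__(self, own_id: int) -> None:
        self.own_id = own_id
        self.buckets: Dict[int, List[int]] = defaultdict(list)

    def bucket_index(self, node_id: int) -> int:
        # Index of the highest bit in which the two IDs differ.
        return (self.own_id ^ node_id).bit_length() - 1

    def add(self, node_id: int) -> bool:
        """Store the node if it is new and its bucket still has room."""
        if node_id == self.own_id:
            return False
        bucket = self.buckets[self.bucket_index(node_id)]
        if node_id in bucket or len(bucket) >= BUCKET_SIZE:
            return False
        bucket.append(node_id)
        return True


table = RoutingTable(own_id=0b10011001)
for candidate in range(256):  # pretend every possible ID is online
    table.add(candidate)

for index in sorted(table.buckets):
    print(index, [f"{node:08b}" for node in table.buckets[index]])
```

Bucket `i` covers `2**i` possible identifiers but holds at most `BUCKET_SIZE` of them, so the nearest buckets are known completely while the farthest bucket stores only 3 out of 128 candidates.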
|
||||
So let's say I want to find a specific node. What I would do is go to any node which I already know and ask them for all their neighbours closest to my target. I repeat this process for the returned neighbours until I find my target.
|
||||
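The lookup described above can be simulated in a few lines. This is a simplified sketch: it follows a single path instead of querying several nodes in parallel, it builds every routing table from global knowledge of a toy network, and the constants (16-bit IDs, 200 nodes, 3 contacts per bucket) as well as all function names are assumptions made for this example.

```python
import random
from typing import Dict, List

BITS = 16
K = 3  # contacts kept per bucket and nodes returned per query (hypothetical)


def build_table(own_id: int, all_ids: List[int]) -> List[int]:
    """Keep at most K contacts per distance bucket, as a routing table would."""
    buckets: Dict[int, List[int]] = {}
    for other in all_ids:
        if other == own_id:
            continue
        index = (own_id ^ other).bit_length() - 1
        bucket = buckets.setdefault(index, [])
        if len(bucket) < K:
            bucket.append(other)
    return [node for bucket in buckets.values() for node in bucket]


def find_node(table: List[int], target: int) -> List[int]:
    """What a queried node answers: its K known contacts closest to the target."""
    return sorted(table, key=lambda node: node ^ target)[:K]


def lookup(tables: Dict[int, List[int]], start: int, target: int) -> List[int]:
    """Repeatedly ask the closest node seen so far; return the path taken."""
    path = [start]
    current = start
    while current != target:
        closest = find_node(tables[current], target)[0]
        if closest ^ target >= current ^ target:
            break  # no progress, give up
        current = closest
        path.append(current)
    return path


rng = random.Random(42)
ids = rng.sample(range(2 ** BITS), 200)
tables = {node: build_table(node, ids) for node in ids}

start, target = ids[0], ids[-1]
path = lookup(tables, start, target)
print(f"reached {path[-1]} (target {target}) after {len(path) - 1} queries")
```

Because every node keeps at least one contact in each non-empty bucket, each query clears at least the highest differing bit, so the number of queries is bounded by the identifier length rather than by the size of the network.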
|
||||
The same thing happens for values. Values have a certain distance from nodes and their IDs are structured the same way so we can calculate this distance. If I want to find a value, I simply look for the neighbours closest to that value's key until I find the one storing said value.
|
||||
|
||||
For Kademlia nodes to support these functions, there are several messages with which the protocol communicates.
|
||||
|
||||
- `PING` - Used to check whether a node is still running.
|
||||
- `STORE` - Stores a value with a given key on a node.
|
||||
- `FINDNODE` - Returns the nodes closest to a given ID.
|
||||
- `FINDVALUE` - The same as `FINDNODE`, except if a node stores the specific value it will return it directly.
|
||||
|
||||
_This is a **very** simplified explanation of Kademlia and skips various important details. For the full description, make sure to check out the [paper](https://pdos.csail.mit.edu/~petar/papers/maymounkov-kademlia-lncs.pdf) or a more in-depth [design specification](http://xlattice.sourceforge.net/components/protocol/kademlia/specs.html)_
|
||||
|
||||
## Discv4
|
||||
|
||||
Now after that history lesson, we finally get to discv4 (which stands for discovery v4), Ethereum's current node discovery protocol. The protocol itself is essentially based off of Kademlia, however it does away with certain aspects of it. For example, it does away with any usage of the value part of the DHT.
|
||||
|
||||
Kademlia is mainly used for the organisation of the network, so we only use the routing table to locate other nodes. Due to the fact that discv4 doesn't use the value portion of the DHT at all, we can throw away the `FINDVALUE` and `STORE` commands described by Kademlia.
|
||||
|
||||
The lookup method previously described by Kademlia describes how a node gets its peers. A node contacts some node and asks it for the nodes closest to itself. It does so until it can no longer find any new nodes.
|
||||
|
||||
Additionally, discv4 adds mutual endpoint verification. This is meant to ensure that a peer calling `FINDNODE` also participates in the discovery protocol.
|
||||
|
||||
Finally, all discv4 nodes are expected to maintain up-to-date ENR records. These contain information about a node. They can be requested from any node using a discv4-specific packet called `ENRRequest`.
|
||||
|
||||
_If you want some more details on ENRs, check out one of my posts ["Network Addresses in Ethereum"](https://dean.eigenmann.me/blog/2020/01/21/network-addresses-in-ethereum/)_
|
||||
|
||||
Discv4 comes with its own range of problems however. Let's look at a few of them.
|
||||
|
||||
Firstly, the way discv4 works right now, there is no way to differentiate between node sub-protocols. This means for example that an Ethereum node could add an Ethereum Classic Node, Swarm or Whisper node to its DHT without realizing that it is invalid until more communication has happened. This inability to differentiate sub-protocols makes it harder to find specific nodes, such as Ethereum nodes with light-client support.
|
||||
|
||||
Next, in order to prevent replay attacks, discv4 uses timestamps. This however can lead to various issues when a host's clock is wrong. For more details, see the ["Known Issues"](https://github.com/ethereum/devp2p/blob/master/discv4.md#known-issues-in-the-current-version) section of the discv4 specification.
|
||||
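To see why a wrong clock is a problem, consider the following sketch of an expiration check. It assumes that each packet carries an absolute expiration timestamp and that a receiver drops packets whose expiration lies in the past. The 20 second window and the function name are hypothetical.

```python
import time
from typing import Optional


def is_expired(expiration: int, now: Optional[float] = None) -> bool:
    """A packet is rejected once its expiration timestamp lies in the past."""
    current = time.time() if now is None else now
    return expiration < current


# The sender stamps the packet to expire 20 seconds after sending.
sent_at = 1_000_000
expiration = sent_at + 20

# Receiver clock in sync, packet arrives after 5 seconds: accepted.
print(is_expired(expiration, now=sent_at + 5))

# Receiver clock runs 60 seconds fast, same 5 second transit: rejected.
print(is_expired(expiration, now=sent_at + 65))
```

The second packet is perfectly fresh, yet it is dropped because the two hosts disagree about the current time. The reverse also holds: a receiver whose clock lags behind accepts packets for longer than intended.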
|
||||
Finally, we have an issue with the way mutual endpoint verification works. Messages can get dropped and there is no way to tell if both peers have verified each other. This means that we could consider our peer verified while it does not consider us verified, causing it to drop our `FINDNODE` packet.
|
||||
|
||||
## Discv5
|
||||
|
||||
Finally, let's look at discv5, the next iteration of discv4 and the discovery protocol which will be used by Eth 2.0. It aims at fixing various issues present in discv4.
|
||||
|
||||
The first change is the way `FINDNODE` works. In traditional Kademlia as well as in discv4, we pass an identifier. However, in discv5 we instead pass the logarithmic distance, meaning that a `FINDNODE` request gets a response containing all nodes at the specified logarithmic distance from the called node.
|
||||
|
||||
Logarithmic distance means we first calculate the distance and then run it through our log base 2 function. See:
|
||||
|
||||
```
log2(A xor B)
```
|
||||
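As a small worked example, the sketch below computes the logarithmic distance for the two identifiers used earlier. It assumes that only the integer part of the logarithm is kept; the exact convention is defined by the discv5 specification, and the function name is hypothetical.

```python
import math


def log_distance(a: int, b: int) -> int:
    """Integer part of log2(a xor b); undefined for identical IDs."""
    distance = a ^ b
    if distance == 0:
        raise ValueError("the log distance of an ID to itself is undefined")
    # bit_length() - 1 is floor(log2(distance)) without floating point error.
    return distance.bit_length() - 1


print(log_distance(153, 50))       # XOR distance is 171
print(math.floor(math.log2(171)))  # cross-check using the math module
```

All identifiers that first differ from ours in the same bit share a logarithmic distance, so a single value addresses a whole bucket of the routing table instead of one specific node.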
|
||||
And the second, more important change, is that discv5 aims at solving one of the biggest issues of discv4: the differentiation of sub-protocols. It does this by adding topic tables. Topic tables are [first in first out](<https://en.wikipedia.org/wiki/FIFO_(computing_and_electronics)>) lists that contain nodes which have advertised that they provide a specific service. Nodes get themselves added to this list by registering `ads` on their peers.
|
||||
|
||||
As of writing, there is still an issue with this proposal. There is currently no efficient way for a node to place `ads` on multiple peers, since it would require separate requests for every peer which is inefficient in a large-scale network.
|
||||
|
||||
Additionally, it is unclear how many peers a node should place these `ads` on and exactly which peers to place them on. For more details, check out the issue [devp2p#136](https://github.com/ethereum/devp2p/issues/136).
|
||||
|
||||
There are a number of smaller changes to the protocol, but they are less important and were therefore omitted from this summary.
|
||||
|
||||
Nevertheless, discv5 still does not resolve a couple issues present in discv4, such as unreliable endpoint verification. As of writing this post, there is currently no new method in discv5 to improve the endpoint verification process.
|
||||
|
||||
As you can see, discv5 is still a work in progress and has a few large challenges to overcome. However, if it does overcome them, it will most likely be a large improvement over more naive Kademlia implementations.
|
||||
|
||||
---
|
||||
|
||||
Hopefully this article helped explain what these discovery protocols are and how they work. If you're interested in their full specifications you can find them on [GitHub](https://github.com/ethereum/devp2p).
|
||||
252
rlog/2020-07-01-waku-v2-pitch.mdx
Normal file
@@ -0,0 +1,252 @@
|
||||
---
|
||||
layout: post
|
||||
name: "What's the Plan for Waku v2?"
|
||||
title: "What's the Plan for Waku v2?"
|
||||
date: 2020-07-01 12:00:00
|
||||
authors: oskarth
|
||||
published: true
|
||||
slug: waku-v2-plan
|
||||
categories: research
|
||||
image: /img/status_scaling_model_fig4.png
|
||||
discuss: https://forum.vac.dev/t/waku-version-2-pitch/52
|
||||
---
|
||||
|
||||
Read about our plans for Waku v2, moving to libp2p, better routing, adaptive nodes and accounting!
|
||||
|
||||
<!--truncate-->
|
||||
|
||||
**tldr: The Waku network is fragile and doesn't scale. Here's how to solve it.**
|
||||
|
||||
_NOTE: This post was originally written with Status as a primary use case in mind, which reflects how we talk about some problems here. However, Waku v2 is a general-purpose private p2p messaging protocol, especially for people running in resource restricted environments._
|
||||
|
||||
## Problem
|
||||
|
||||
The Waku network is fragile and doesn't scale.
|
||||
|
||||
As [Status](https://status.im) moves into a user-acquisition phase and improves retention rates for users, the infrastructure needs to keep up, specifically when it comes to messaging.
|
||||
|
||||
Based on user acquisition models, the initial goal is to support 100k DAU in September, with demand growing from there.
|
||||
|
||||
With the Status Scaling Model we have studied the current bottlenecks as a function of concurrent users (CCU) and daily active users (DAU). Here are the conclusions.
|
||||
|
||||
**1. Connection limits**. With 100 full nodes we reach ~10k CCU based on connection limits. This can primarily be addressed by increasing the number of nodes (cluster or user operated). This assumes node discovery works. It is also worth investigating the limitations of max number of connections, though this is likely to be less relevant for user-operated nodes. For a user-operated network, this means 1% of users have to run a full node. See Fig 1-2.
|
||||
|
||||
**2. Bandwidth as a bottleneck**. We notice that memory usage appears not to be the primary bottleneck for full nodes, and the bottleneck is still bandwidth. To support 10k DAU with full nodes at an amplification factor of 25, the required Internet speed is ~50 Mbps, which is a fast home Internet connection. For ~100k DAU only cloud-operated nodes can keep up (500 Mbps). See Fig 3-5.
|
||||
|
||||
**3. Amplification factors**. Reducing amplification factors with better routing would have a high impact, but it is likely we'd need additional measures as well, such as topic sharding or similar. See Fig 8-13.
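To get a feel for how these numbers relate, here is a back-of-the-envelope sketch. It assumes that the required bandwidth grows linearly with both DAU and amplification factor, calibrated on the 10k DAU / factor 25 / ~50 Mbps data point above. The linearity and the function name `required_mbps` are simplifying assumptions for illustration; the scaling model itself is more detailed.

```python
# Calibration point taken from the conclusions above.
BASELINE_DAU = 10_000
BASELINE_AMPLIFICATION = 25
BASELINE_MBPS = 50.0


def required_mbps(dau: int, amplification: float) -> float:
    """Estimate full-node bandwidth, assuming linear scaling in both inputs."""
    return (
        BASELINE_MBPS
        * (dau / BASELINE_DAU)
        * (amplification / BASELINE_AMPLIFICATION)
    )
```

Under this assumption `required_mbps(100_000, 25)` lands at about 500 Mbps, matching the figure above, and halving the amplification factor halves the requirement. This is why reducing amplification is so attractive.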
|
||||
|
||||
Figure 1-5:
|
||||
|
||||

|
||||

|
||||

|
||||

|
||||

|
||||
|
||||
See <https://colab.research.google.com/drive/1Fz-oxRxxAFPpM1Cowpnb0nT52V1-yeRu#scrollTo=Yc3417FUJJ_0> for the full report.
|
||||
|
||||
What we need to do is:
|
||||
|
||||
1. Reduce amplification factors
|
||||
2. Get more user-run full nodes
|
||||
|
||||
Doing this means the Waku network will be able to scale, and to do so in the right way, in a robust fashion. What would a fragile way of scaling be? Increasing our reliance on a Status Pte Ltd operated cluster, which would paint us into a corner where we:
|
||||
|
||||
- keep increasing requirements for Internet speed for full nodes
|
||||
- are vulnerable to censorship and attacks
|
||||
- have to control the topology in an artificial manner to keep up with load
|
||||
- basically re-invent a traditional centralized client-server app with extra steps
|
||||
- deliberately ignore most of our principles
|
||||
- risk the network being shut down when we run out of cash
|
||||
|
||||
## Appetite
|
||||
|
||||
Our initial risk appetite for this is 6 weeks for a small team.
|
||||
|
||||
The idea is that we want to make tangible progress towards the goal in a limited period of time, as opposed to getting bogged down in trying to find a theoretically perfect generalized solution. Fixed time, variable scope.
|
||||
|
||||
It is likely some elements of a complete solution will be done separately. See later sections for that.
|
||||
|
||||
## Solution
|
||||
|
||||
There are two main parts of the solution. One is to reduce amplification factors, and the other is incentivization to get more user run full nodes with desktop, etc.
|
||||
|
||||
What does a full node provide? It provides connectivity to the network, can act as a bandwidth "barrier" and offers high or reasonably high availability. What this means right now is essentially topic interest and storing historical messages.
|
||||
|
||||
The goal here is to improve the status quo, not to get a perfect solution from the get go. All of this can be iterated on further, for stronger guarantees, as well as replaced by other new modules.
|
||||
|
||||
Let's first look at the baseline, and then go into some of the tracks and their phases. Track 1 is best done first, after which track 2 and 3 can be executed in parallel. Track 1 gives us more options for track 2 and 3. The work in track 1 is currently more well-defined, so it is likely the specifics of track 2 and 3 will get refined at a later stage.
|
||||
|
||||
## Baseline
|
||||
|
||||
Here's where we are at now. In reality, the amplification factor is likely even worse than this (15 in the graph below), up to 20-30, especially with an open network, where we can't easily control connectivity and availability of nodes. Left unchecked, with a full mesh, it could even go as high as x100, though this is likely excessive and can be dialed down. See scaling model for more details.
|
||||
|
||||

|
||||
|
||||
## Track 1 - Move to libp2p
|
||||
|
||||
Moving to PubSub over libp2p wouldn't improve amplification per se, but it would be a stepping stone. Why? It paves the way for GossipSub, and would be a checkpoint on this journey. Additionally, FloodSub and GossipSub are compatible, and very likely so are other future forms of PubSub such as GossipSub 1.1 (hardened/more secure), EpiSub, forwarding Kademlia / PubSub over Kademlia, etc. Not to mention security. This would also give us access to the larger libp2p ecosystem (multiple protocols, better encryption, quic, running in the browser, security audits, etc, etc), as well as be a joint piece of infrastructure used for Eth2 in Nimbus. More wood behind fewer arrows.
|
||||
|
||||
See more on libp2p PubSub here: <https://docs.libp2p.io/concepts/publish-subscribe/>
|
||||
|
||||
As part of this move, there are a few individual pieces that are needed.
|
||||
|
||||
### 1. FloodSub
|
||||
|
||||
This is essentially what Waku over libp2p would look like in its most basic form.
|
||||
|
||||
One difference that is worth noting is that the app topics would **not** be the same as Waku topics. Why? In Waku we currently don't use topics for routing between full nodes, but only for edge/light nodes in the form of topic interest. In FloodSub, these topics are used for routing.
|
||||
|
||||
Why can't we use Waku topics for routing directly? PubSub over libp2p isn't built for rare and ephemeral topics, and nodes have to explicitly subscribe to a topic. See topic sharding section for more on this.
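The distinction between the two kinds of topics can be sketched as follows. Full nodes make their routing decision on the PubSub (app) topic, while the Waku topic travels inside the envelope and is only used to serve topic interest for edge/light nodes. The names `Envelope`, `APP_TOPIC`, `should_relay` and `matches_interest` are hypothetical and chosen for this illustration only.

```python
from dataclasses import dataclass

APP_TOPIC = "waku"  # hypothetical name of the PubSub (app) topic


@dataclass(frozen=True)
class Envelope:
    waku_topic: str  # fine-grained, possibly rare and ephemeral
    payload: bytes


def should_relay(pubsub_topic: str, subscriptions: set[str]) -> bool:
    """Full nodes route on the PubSub topic they explicitly subscribed to."""
    return pubsub_topic in subscriptions


def matches_interest(envelope: Envelope, topic_interest: set[str]) -> bool:
    """Light nodes are served by filtering on the Waku topic."""
    return envelope.waku_topic in topic_interest
```

With this split, the set of PubSub topics stays small and stable, while Waku topics can come and go without any node having to subscribe to them explicitly.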
|
||||
|
||||

|
||||
|
||||
Moving to FloodSub over libp2p would also be an opportunity to clean up and simplify some components that are no longer needed in the Waku v1 protocol, see point below.
|
||||
|
||||
Very experimental and incomplete libp2p support can be found in the nim-waku repo under v2: <https://github.com/status-im/nim-waku>
|
||||
|
||||
### 2. Simplify the protocol
|
||||
|
||||
Due to Waku's origins in Whisper, devp2p and as a standalone protocol, there is a lot of stuff that has accumulated (<https://rfc.vac.dev/spec/6/>). Not all of it serves its purpose anymore. For example, do we still need RLP here when we have Protobuf messages? What about extremely low PoW when we have peer scoring? What about key management / encryption when we have encryption at libp2p and Status protocol level?
|
||||
|
||||
Not everything has to be done in one go, but being minimalist at this stage will keep the protocol lean and make us more adaptable.
|
||||
|
||||
The essential characteristic that has to be maintained is that we don't need to change the upper layers, i.e. we still deal with (Waku) topics and some envelope like data unit.
|
||||
|
||||
### 3. Core integration
|
||||
|
||||
As early as possible we want to integrate with Core via Stimbus in order to mitigate risk and catch integration issues early in the process. What this looks like in practice is some set of APIs, similar to how Whisper and Waku were working in parallel, and an experimental feature behind a toggle in core/desktop.
|
||||
|
||||
### 4. Topic interest behavior
|
||||
|
||||
While we target full node traffic here, we want to make sure we maintain the existing bandwidth requirements for light nodes that Waku v1 addressed (<https://vac.dev/fixing-whisper-with-waku>). This means implementing topic interest in the form of Waku topics. Note that this would be separate from the app topics noted above.
|
||||
|
||||
### 5. Historical message caching
|
||||
|
||||
Basically what mailservers are currently doing. This likely looks slightly different in a libp2p world. This is another opportunity to simplify things with a basic REQ-RESP architecture, as opposed to the roundabout way things are now. Again, not everything has to be done in one go but there's no reason to reimplement a poor API if we don't have to.
|
||||
|
||||
Also see section below on adaptive nodes and capabilities.
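To illustrate what a basic REQ-RESP shape for historical messages could look like, here is a minimal sketch. A client sends one request naming the topics and time range it cares about, and the storing node answers with the matching messages. All names here (`StoredMessage`, `HistoryRequest`, `handle_history_request`) and the choice of query fields are assumptions for the sake of the example, not an existing API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StoredMessage:
    topic: str
    timestamp: int  # seconds since epoch, an assumed unit
    payload: bytes


@dataclass(frozen=True)
class HistoryRequest:
    topics: frozenset[str]
    start: int  # inclusive
    end: int  # inclusive


def handle_history_request(
    store: list[StoredMessage], request: HistoryRequest
) -> list[StoredMessage]:
    """Answer one request with one response: matching messages, oldest first."""
    matches = [
        message
        for message in store
        if message.topic in request.topics
        and request.start <= message.timestamp <= request.end
    ]
    return sorted(matches, key=lambda message: message.timestamp)
```

The point of the sketch is the interaction pattern: one request, one response, with no need to route the historical messages back through the regular relay path.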
|
||||
|
||||
### 6. Waku v1 <\> Libp2p bridge
|
||||
|
||||
To make the transition complete, there has to be a bridge mode between current Waku and libp2p. This is similar to what was done for Whisper and Waku, and allows any nodes in the network to upgrade to Waku v2 at their leisure. For example, this would likely look different for Core, Desktop, Research and developers.
|
||||
|
||||
## Track 2 - Better routing
|
||||
|
||||
This is where we improve the amplification factors.
|
||||
|
||||
### 1. GossipSub
|
||||
|
||||
This is a subprotocol of FloodSub in the libp2p world. Moving to GossipSub would allow traffic between full nodes to go from an amplification factor of ~25 to ~6. This basically creates a mesh of stable bidirectional connections, together with some gossiping capabilities outside of this view.
|
||||
|
||||
Explaining how GossipSub works is out of scope of this document. It is implemented in nim-libp2p and used by Nimbus as part of Eth2. You can read the specs here in more detail if you are interested: <https://github.com/libp2p/specs/blob/master/pubsub/gossipsub/gossipsub-v1.0.md> and <https://github.com/libp2p/specs/blob/master/pubsub/gossipsub/gossipsub-v1.1.md>
|
||||
|
||||

|
||||
|
||||

|
||||

|
||||

|
||||

|
||||
|
||||
While we technically could implement this over existing Waku, we'd have to re-implement it, and we'd lose out on all the other benefits libp2p would provide, as well as the ecosystem of people and projects working on improving the scalability and security of these protocols.
|
||||
|
||||
### 2. Topic sharding
|
||||
|
||||
This one is slightly more speculative in terms of its ultimate impact. The basic idea is to split the application topic into N shards, say 10, and then each full node can choose which shards to listen to. This can reduce amplification factors by another factor of 10.
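One way to picture the sharding is a deterministic mapping from a Waku topic to one of N shard topics, so that every node agrees on where a message belongs. The sketch below uses a hash modulo N for this. The hashing scheme, the shard naming and the function names are all assumptions for illustration; the text above does not prescribe how topics are assigned to shards.

```python
import hashlib

NUM_SHARDS = 10  # "say 10", as in the text above


def shard_for(waku_topic: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a Waku topic to a shard index in a deterministic way."""
    digest = hashlib.sha256(waku_topic.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def shard_topic(app_topic: str, shard: int) -> str:
    """Name of the PubSub topic carrying a given shard (hypothetical format)."""
    return f"{app_topic}/shard-{shard}"
```

A full node that only subscribes to, say, two of the ten shard topics would then only relay the traffic that hashes into those two shards.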
|
||||
|
||||

|
||||
|
||||

|
||||

|
||||
|
||||
Note that this means a light node that listens to several topics would have to be connected to more full nodes to get connectivity. For a more exotic version of this, see <https://forum.vac.dev/t/rfc-topic-propagation-extension-to-libp2p-pubsub/47>
|
||||
|
||||
This is orthogonal to the choice of FloodSub or GossipSub, but due to GossipSub's more dynamic nature it is likely best combined with it.
|
||||
|
||||
### 3. Other factors
|
||||
|
||||
Not a primary focus, but worth a look. Looking at the scaling model, there might be other easy wins to improve overall bandwidth consumption between full nodes. For example, can we reduce envelope size by a significant factor?
|
||||
|
||||
## Track 3 - Accounting and user-run nodes
|
||||
|
||||
This is where we make sure the network isn't fragile, become a true p2p app, get our users excited and engaged, and allow us to scale the network without creating an even bigger cluster.
|
||||
|
||||
To work in practice, this has a soft dependency on node discovery such as DNS based discovery (<https://eips.ethereum.org/EIPS/eip-1459>) or Discovery v5 (<https://vac.dev/feasibility-discv5>).
|
||||
|
||||
### 1. Adaptive nodes and capabilities
|
||||
|
||||
We want to make the gradation between light nodes, full nodes, storing (partial set of) historical messages, only acting for a specific shard, etc more flexible and explicit. This is required to identify and discover the nodes you want. See <https://github.com/vacp2p/specs/issues/87>
|
||||
|
||||
Depending on how the other tracks come together, this design should allow for a desktop node to identify as a full relaying node for some app topic shard, but also express Waku topic interest and retrieve historical messages itself.
|
||||
|
||||
E.g. Disc v5 can be used to supply node properties through ENR.
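As a sketch of how such node properties could be expressed, capabilities can be modelled as a set of flags packed into a single byte and stored under a key in a node record. The capability names, the `"capabilities"` key and the one-byte encoding are hypothetical and only meant to show the idea, not an agreed-upon ENR format.

```python
from enum import Flag, auto


class Capability(Flag):
    """Hypothetical capability flags a node could advertise."""

    RELAY = auto()
    STORE = auto()
    FILTER = auto()


def encode(capabilities: Capability) -> bytes:
    """Pack the flags into one byte, e.g. for a key/value node record."""
    return bytes([capabilities.value])


def decode(raw: bytes) -> Capability:
    """Recover the flags from the one-byte encoding."""
    return Capability(raw[0])


# A desktop node that relays and stores, but does not serve filters.
record = {"capabilities": encode(Capability.RELAY | Capability.STORE)}
```

A peer reading this record can check `Capability.STORE in decode(record["capabilities"])` before asking the node for historical messages.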
|
||||
|
||||
### 2. Accounting
|
||||
|
||||
This is based on a few principles:
|
||||
|
||||
1. Some nodes contribute a lot more than other nodes in the network
|
||||
2. We can account for the difference in contribution in some fashion
|
||||
3. We want to incentivize nodes to tell the truth and not to lie
|
||||
|
||||
Accounting here is a stepping stone, where accounting is the raw data upon which some settlement later occurs. It can have various forms of granularity. See <https://forum.vac.dev/t/accounting-for-resources-in-waku-and-beyond/31> for discussion.
|
||||
|
||||
We also note that in GossipSub, the mesh is bidirectional. Additionally, nodes misreporting doesn't appear to be a high-priority issue. What is an issue is having people run full nodes in the first place. There are a few points to that. It has to be possible in the end-user UX, nodes have to be discovered, and it has to be profitable/visible that you are contributing. UX and discovery are out of scope for this work, whereas visibility/accounting is part of this scope. Settlement is a stretch goal here.
|
||||
|
||||
The general shape of the solution is inspired by the Swarm model, where we do accounting separate from settlement. It doesn't require any specific proofs, but nodes are incentivized to tell the truth in the following way:
|
||||
|
||||
1. Both full node and light node do accounting in a pairwise, local fashion
|
||||
2. If a light node doesn't ultimately pay or lies about reporting, they get disconnected (e.g.)
|
||||
3. If a full node doesn't provide its service the light node may pick another full node (e.g.)
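The pairwise, local accounting in the list above can be sketched as a small ledger that each node keeps about its peers. The class name `PeerLedger`, the abstract "units" of service and the disconnect threshold are hypothetical choices for this illustration; the text leaves the granularity of accounting open.

```python
from collections import defaultdict


class PeerLedger:
    """Local, per-peer accounting kept by a single node.

    A positive balance means the peer has received more service than it
    has settled for. Units are abstract (e.g. envelopes delivered).
    """

    def __init__(self, disconnect_threshold: int):
        self.disconnect_threshold = disconnect_threshold
        self.balances: defaultdict[str, int] = defaultdict(int)

    def record_service(self, peer: str, units: int) -> None:
        """We provided `units` of service to `peer`."""
        self.balances[peer] += units

    def record_settlement(self, peer: str, units: int) -> None:
        """`peer` settled `units` (initially this could be a mock message)."""
        self.balances[peer] -= units

    def should_disconnect(self, peer: str) -> bool:
        """Policy hook: drop peers whose unsettled balance grows too large."""
        return self.balances[peer] > self.disconnect_threshold
```

Note that nothing here requires proofs: each side simply keeps its own books, and disagreement shows up as one side deciding to disconnect or pick another peer.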
|
||||
|
||||
While accounting for individual resource usage is useful, for the ultimate end user experience we can ideally account for other things such as:
|
||||
|
||||
- end to end delivery
|
||||
- online time
|
||||
- completeness of storage
|
||||
|
||||
This can be gradually enhanced and strengthened, for example with proofs, consistency checks, Quality of Service, reputation systems. See <https://discuss.status.im/t/network-incentivisation-first-draft/1037> for one attempt to provide stronger guarantees with periodic consistency checks and a shared fund mechanism. And <https://forum.vac.dev/t/incentivized-messaging-using-validity-proofs/51> for using validity proofs and removing liveness requirement for settlement.
|
||||
|
||||
All of this is optional at this stage, because our goal here is to improve the status quo for user run nodes. Accounting at this stage should be visible and correspond to the net benefit a node provides to another.
|
||||
|
||||
As a concrete example: a light node has some topic interest and cares about historical messages on some topic. A full node communicates envelopes as they come in, communicates its high availability (online time) and stores/forwards stored messages. Both nodes have this information, and if they agree, settlement (initially just a mock message) can be sending a payment to an address at some time interval / over some defined volume. See future sections for how this can be improved upon.
|
||||
|
||||
Also see below in section 4, using constructs such as eigentrust as a local reputation mechanism.
|
||||
|
||||
### 3. Relax high availability requirement
|
||||
|
||||
If we want desktop nodes to participate in the storing of historical messages, high availability is a problem. It is a problem for any node, especially if they lie about it, but assuming they are honest it is still an issue.
|
||||
|
||||
By being connected to multiple nodes, we can get an overlapping online window. Then these can be combined together to get consistency. This is obviously experimental and would need to be tested before being deployed, but if it works it'd be very useful.
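A minimal sketch of combining overlapping windows: query several storing nodes, then merge their partial histories and drop duplicates by message ID. It assumes each message has a stable ID and a timestamp, and the function name `merge_histories` is made up for this example. It does not address nodes that return conflicting or dishonest data.

```python
def merge_histories(
    *histories: list[tuple[str, int]],
) -> list[tuple[str, int]]:
    """Merge (message_id, timestamp) lists from several nodes.

    Duplicates are dropped by message ID and the result is ordered by
    timestamp, with the ID as a tie-breaker.
    """
    seen: dict[str, int] = {}
    for history in histories:
        for message_id, timestamp in history:
            # Keep the first timestamp seen for each message ID.
            seen.setdefault(message_id, timestamp)
    return sorted(seen.items(), key=lambda item: (item[1], item[0]))
```

If node A was online for the first half of a period and node B for the second half, with some overlap, merging their responses yields the full history even though neither node was highly available on its own.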
|
||||
|
||||
Additionally or alternatively, instead of putting a high requirement on message availability, focus on detection of missing information. This likely requires re-thinking how we do data sync / replication.
|
||||
|
||||
### 4. Incentivize light and full nodes to tell the truth (policy, etc)
|
||||
|
||||
In the accounting phase it is largely assumed nodes are honest. What happens when they lie, and how do we incentivize them to be honest? In the case of Bittorrent this is done with tit-for-tat, however this is a different kind of relationship. What follows are some examples of how this can be done.
|
||||
|
||||
For light nodes:
|
||||
|
||||
- if they don't pay or report honestly, they get disconnected
|
||||
- prepayment (especially to "high value" nodes)
|
||||
|
||||
For full nodes:
|
||||
|
||||
- multiple nodes reporting to agree, where truth becomes a Schelling point
|
||||
- use eigentrust
|
||||
- staking for discovery visibility with slashing
|
||||
|
||||
### 5. Settlement PoC
|
||||
|
||||
Can be done after phase 2 if so desired. Basically integrate payments based on accounting and policy.
|
||||
|
||||
## Out of scope
|
||||
|
||||
1. We assume the Status Base model requirements are accurate.
|
||||
2. We assume Core will improve retention rates.
|
||||
3. We assume the Stimbus production team will enable integration of nim-waku.
|
||||
4. We assume Discovery mechanisms such as DNS and Discovery v5 will be worked on separately.
|
||||
5. We assume Core will, at some point, provide a UX for integrating payment of services.
|
||||
6. We assume the desktop client is sufficiently usable.
|
||||
7. We assume Core and Infra will investigate ways of improving MaxPeers.
|
||||
110
rlog/2020-09-28-waku-v2-update.mdx
Normal file
@@ -0,0 +1,110 @@
|
||||
---
|
||||
layout: post
|
||||
name: 'Waku v2 Update'
|
||||
title: 'Waku v2 Update'
|
||||
date: 2020-09-28 12:00:00
|
||||
authors: oskarth
|
||||
published: true
|
||||
slug: waku-v2-update
|
||||
categories: research
|
||||
image: /img/vac.png
|
||||
discuss: https://forum.vac.dev/t/discussion-waku-v2-update/56
|
||||
---
|
||||
|
||||
A research log. Read on to find out what is going on with Waku v2, a messaging protocol. What has been happening? What is coming up next?
|
||||
|
||||
<!--truncate-->
|
||||
|
||||
It has been a while since the last post. It is time for an update on Waku v2. Aside from getting more familiar with libp2p (specifically nim-libp2p) and some vacation, what have we been up to? In this post we'll talk about what we've gotten done since last time, and briefly talk about immediate next steps and future. But first, a recap.
|
||||
|
||||
## Recap
|
||||
|
||||
In the last post ([Waku v2 plan](https://vac.dev/waku-v2-plan)) we explained the rationale of Waku v2 - the current Waku network is fragile and doesn't scale. To solve this, Waku v2 aims to reduce amplification factors and get more user run nodes. We broke the work down into three separate tracks.
|
||||
|
||||
1. Track 1 - Move to libp2p
|
||||
2. Track 2 - Better routing
|
||||
3. Track 3 - Accounting and user-run nodes
|
||||
|
||||
We also outlined various rough components for each track. The primary initial focus is track 1. This means things like: moving to FloodSub, simplifying the protocol, core integration, topic interest behavior, historical message caching, and a Waku v1<\>v2 bridge.
|
||||
|
||||
## Current state
|
||||
|
||||
Let's talk about the state of specs and our main implementation nim-waku. Then we'll go over our recent testnet, Nangang, and finish off with a Web PoC.
|
||||
|
||||
## Specs
|
||||
|
||||
After some back and forth on how to best structure things, we ended up breaking down the specs into a few pieces. While Waku v2 is best thought of as a cohesive whole in terms of its capabilities, it is made up of several protocols. Here's a list of the current specs and their status:
|
||||
|
||||
- [Main spec](https://rfc.vac.dev/spec/10/) (draft)
|
||||
- [Relay protocol spec](https://rfc.vac.dev/spec/11/) (draft)
|
||||
- [Filter protocol spec](https://rfc.vac.dev/spec/12) (raw)
|
||||
- [Store protocol spec](https://rfc.vac.dev/spec/13) (raw)
|
||||
- [Bridge spec](https://rfc.vac.dev/spec/15/) (raw)
|
||||
|
||||
Raw means there is not yet an implementation that corresponds fully to the spec, and draft means there is an implementation that corresponds to the spec. In the interest of space, we won't go into too much detail on the specs here except to note a few things:
|
||||
|
||||
- The relay spec is essentially a thin wrapper on top of PubSub/FloodSub/GossipSub
|
||||
- The filter protocol corresponds to the previous light client mode in Waku v1
|
||||
- The store protocol corresponds to the previous mailserver construct in Waku v1
|
||||
|
||||
The filter and store protocol allow for adaptive nodes, i.e. nodes that have various capabilities. For example, a node being mostly offline, or having limited bandwidth capacity. The bridge spec outlines how to bridge the Waku v1 and v2 networks.
|
||||
|
||||
## Implementation
|
||||
|
||||
The main implementation we are working on is [nim-waku](https://github.com/status-im/nim-waku/). This builds on top of libraries such as [nim-libp2p](https://github.com/status-im/nim-libp2p) and others that the [Nimbus team](https://nimbus.team/) have been working on as part of their Ethereum 2.0 client.
|
||||
|
||||
Currently nim-waku implements the relay protocol, and is close to implementing filter and store protocol. It also exposes a [Nim Node API](https://github.com/status-im/nim-waku/blob/master/docs/api/v2/node.md) that allows libraries such as [nim-status](https://github.com/status-im/status-nim) to use it. Additionally, there is also a rudimentary JSON RPC API for command line scripting.
|
||||
|
||||
## Nangang testnet
|
||||
|
||||
Last week we launched a very rudimentary internal testnet called Nangang. The goal was to test basic connectivity and make sure things work end to end. It didn't have things like: client integration, encryption, bridging, multiple clients, store/filter protocol, or even a real interface. What it did do is allow Waku developers to "chat" via RPC calls and looking in the log output. Doing this meant we exposed and fixed a few blockers, such as connection issues, deployment, topic subscription management, protocol and node integration, and basic scripting/API usage. After this, we felt confident enough to upgrade the main and relay spec to "draft" status.
|
||||
|
||||
## Waku Web PoC
|
||||
|
||||
As a bonus, we wanted to see what it'd take to get Waku running in a browser. This is a very powerful capability that enables a lot of use cases, and something that libp2p enables with its multiple transport support.
|
||||
|
||||
Using the current stack, with nim-waku, would require quite a lot of ground work with WASM, WebRTC, Websockets support etc. Instead, we decided to take a shortcut and hack together a JS implementation called [Waku Web Chat](https://github.com/vacp2p/waku-web-chat/). This quick hack wouldn't be possible without the people behind [js-libp2p-examples](https://github.com/libp2p/js-libp2p-examples/) and [js-libp2p](https://github.com/libp2p/js-libp2p) and all its libraries. These are people like Jacob Heun, Vasco Santos, and Cayman Nava. Thanks!
|
||||
|
||||
It consists of a browser implementation, a NodeJS implementation and a bootstrap server that acts as a signaling server for WebRTC. It is largely a bastardized version of GossipSub, and while it isn't completely to spec, it does allow messages originating from a browser to eventually end up at a nim-waku node, and vice versa. Which is pretty cool.
|
||||
|
||||
## Coming up
|
||||
|
||||
Now that we know what the current state is, what is still missing? What are the next steps?
|
||||
|
||||
## Things that are missing
|
||||
|
||||
While we are getting closer to closing out work for track 1, there are still a few things missing from the initial scope:
|
||||
|
||||
1. Store and filter protocols need to be finished. This means basic spec, implementation, API integration and proven to work in a testnet. All of these are work in progress and expected to be done very soon. Once the store protocol is done in a basic form, it needs further improvements to make it production ready, at least on a spec/basic implementation level.
|
||||
|
||||
2. Core integration was mentioned in scope for track 1 initially. This work has stalled a bit, largely due to organizational bandwidth and priorities. While there is a Nim Node API that in theory is ready to be used, having it be used in e.g. Status desktop or mobile app is a different matter. The team responsible for this at Status ([status-nim](https://github.com/status-im/status-nim)) has been making progress on getting nim-waku v1 integrated, and is expected to look into nim-waku v2 integration soon. One thing that makes this especially tricky is the difference in interface between Waku v1 and v2, which brings us to...
|
||||
|
||||
3. Companion spec for encryption. As part of simplifying the protocol, the routing is decoupled from the encryption in v2 ([1](https://github.com/vacp2p/specs/issues/158), [2](https://github.com/vacp2p/specs/issues/181)). There are multiple layers of encryption at play here, and we need to figure out a design that makes sense for various use cases (dapps using Waku on their own, Status app, etc).
|
||||
|
||||
4. Bridge implementation. The spec is done and we know how it should work, but it needs to be implemented.
|
||||
|
||||
5. General tightening up of specs and implementation.
|
||||
|
||||
While this might seem like a lot, a lot has been done already, and the majority of the remaining tasks are more amenable to being pursued in parallel with other efforts. It is also worth mentioning that parts of track 2 and 3 have been started, in the form of moving to GossipSub (amplification factors) and basics of adaptive nodes (multiple protocols). This is in addition to things like Waku Web which were not part of the initial scope.
|
||||
|
||||
## Upcoming
|
||||
|
||||
Aside from the things mentioned above, what is coming up next? There are a few areas of interest, mentioned in no particular order. For track 2 and 3, see previous post for more details.
|
||||
|
||||
1. Better routing (track 2). While we are already building on top of GossipSub, we still need to explore things like topic sharding in more detail to further reduce amplification factors.
|
||||
|
||||
2. Accounting and user-run nodes (track 3). With store and filter protocol getting ready, we can start to implement accounting and light connection game for incentivization in a bottom up and iterative manner.
|
||||
|
||||
3. Privacy research. Study better and more rigorous privacy guarantees. E.g. how FloodSub/GossipSub behaves for common threat models, and how custom packet format can improve things like unlinkability.
|
||||
|
||||
4. zkSnarks RLN for spam protection and incentivization. We studied this [last year](https://vac.dev/feasibility-semaphore-rate-limiting-zksnarks) and recent developments have made this relevant to study again. Create an [experimental spec/PoC](https://github.com/vacp2p/specs/issues/189) as an extension to the relay protocol. Kudos to Barry Whitehat and others like Kobi Gurkan and Koh Wei Jie for pushing this!
|
||||
|
||||
5. Ethereum M2M messaging. Being able to run in the browser opens up a lot of doors, and there is an opportunity here to enable things like a decentralized WalletConnect, multi-sig transactions, voting and similar use cases. This was the original goal of Whisper, and we'd like to deliver on that.
|
||||
|
||||
As you can tell, quite a lot of things! Luckily, we have two people joining as protocol engineers soon, which will bring much needed support for the current team of ~2-2.5 people. More details to come in further updates.
|
||||
|
||||
---
|
||||
|
||||
If you are feeling adventurous and want to use early stage alpha software, check out the [docs](https://github.com/status-im/nim-waku/tree/master/docs). If you want to read the specs, head over to [Waku spec](https://rfc.vac.dev/spec/10/). If you want to talk with us, join us on [Status](https://get.status.im/chat/public/vac) or on [Telegram](https://t.me/vacp2p) (they are bridged).
|
||||
192
rlog/2020-11-10-waku-v2-ethereum-messaging.mdx
Normal file
@@ -0,0 +1,192 @@
|
||||
---
|
||||
layout: post
|
||||
name: '[Talk] Vac, Waku v2 and Ethereum Messaging'
|
||||
title: '[Talk] Vac, Waku v2 and Ethereum Messaging'
|
||||
date: 2020-11-10 12:00:00
|
||||
authors: oskarth
|
||||
published: true
|
||||
slug: waku-v2-ethereum-messaging
|
||||
categories: research
|
||||
image: /img/taipei_ethereum_meetup_slide.png
|
||||
discuss: https://forum.vac.dev/t/discussion-talk-vac-waku-v2-and-ethereum-messaging/60
|
||||
---
|
||||
|
||||
Talk from Taipei Ethereum Meetup. Read on to find out about our journey from Whisper to Waku v2, as well as how Waku v2 can be useful for Ethereum Messaging.
|
||||
|
||||
<!--truncate-->
|
||||
|
||||
_The following post is a transcript of the talk given at the [Taipei Ethereum meetup, November 5](https://www.meetup.com/Taipei-Ethereum-Meetup/events/274033344/). There is also a [video recording](https://www.youtube.com/watch?v=lUDy1MoeYnI)._
|
||||
|
||||
---
|
||||
|
||||
## 0. Introduction
|
||||
|
||||
Hi! My name is Oskar and I'm the protocol research lead at Vac. This talk will be divided into two parts. First I'll talk about the journey from Whisper, to Waku v1 and now to Waku v2. Then I'll talk about messaging in Ethereum. After this talk, you should have an idea of what Waku v2 is, the problems it is trying to solve, as well as where it can be useful for messaging in Ethereum.
|
||||
|
||||
## PART 1 - VAC AND THE JOURNEY FROM WHISPER TO WAKU V1 TO WAKU V2
|
||||
|
||||
## 1. Vac intro
|
||||
|
||||
First, what is Vac? Vac grew out of our efforts at Status to create a window onto Ethereum and a secure messenger. Vac is a modular protocol stack for p2p secure messaging, paying special attention to resource restricted devices, privacy and censorship resistance.
|
||||
|
||||
Today we are going to talk mainly about Waku v2, which is the transport privacy / routing aspect of the Vac protocol stack. It sits "above" the p2p overlay (such as libp2p, which deals with transports) and below a conversational security layer that deals with message encryption (for example, using Double Ratchet).
|
||||
|
||||
## 2. Whisper to Waku v1
|
||||
|
||||
In the beginning, there was Whisper. Whisper was part of the holy trinity of Ethereum. You had Ethereum for consensus/computation, Whisper for messaging, and Swarm for storage.
|
||||
|
||||
However, for various reasons, Whisper didn't get the attention it deserved. Development dwindled, it promised too much and it suffered from many issues, such as being extremely inefficient and not being suitable for running on e.g. mobile phones. Despite this, Status used it in its app from around 2017 to 2019. As far as I know, it was one of very few, if not the only, production uses of Whisper.
|
||||
|
||||
In an effort to solve some of its immediate problems, we forked Whisper into Waku and formalized it with a proper specification. This solved immediate bandwidth issues for light nodes, introduced rate limiting for better spam protection, improved historical message support, etc.
|
||||
|
||||
If you are interested in this journey, check out the [EthCC talk Dean and I gave in Paris earlier this year](https://www.youtube.com/watch?v=6lLT33tsJjs).
|
||||
|
||||
Status upgraded to Waku v1 in early 2020. What next?
|
||||
|
||||
## 3. Waku v1 to v2
|
||||
|
||||
We were far from done. The changes we had made were quite incremental and done in order to get tangible improvements as quickly as possible. This meant we couldn't address more fundamental issues related to full node routing scalability, running with libp2p for more transports, better security, better spam protection and incentivization.
|
||||
|
||||
This kickstarted Waku v2 efforts, which is what we've been working on since July. This work was and is initially centered around a few pieces:
|
||||
|
||||
(a) Moving to libp2p
|
||||
|
||||
(b) Better routing
|
||||
|
||||
(c) Accounting and user-run nodes
|
||||
|
||||
The general theme was: making the Waku network more scalable and robust.
|
||||
|
||||
We also did a scalability study to show at what point the network would run into issues, due to the inherent lack of routing that Whisper and Waku v1 provided.
|
||||
|
||||
You can read more about this [here](https://vac.dev/waku-v2-plan).
|
||||
|
||||
## 3.5 Waku v2 - Design goals
|
||||
|
||||
Taking a step back, what problem does Waku v2 attempt to solve compared to all the other solutions that exist out there? What type of applications should use it and why? We have the following design goals:
|
||||
|
||||
1. **Generalized messaging**. Many applications require some form of messaging protocol to communicate between different subsystems or different nodes. This messaging can be human-to-human or machine-to-machine or a mix.
|
||||
|
||||
2. **Peer-to-peer**. These applications sometimes have requirements that make them suitable for peer-to-peer solutions.
|
||||
|
||||
3. **Resource restricted**. These applications often run in constrained environments, where resources or the environment is restricted in some fashion. E.g.:
|
||||
|
||||
- limited bandwidth, CPU, memory, disk, battery, etc
|
||||
- not being publicly connectable
|
||||
- only being intermittently connected; mostly-offline
|
||||
|
||||
4. **Privacy**. These applications have a desire for some privacy guarantees, such as pseudonymity, metadata protection in transit, etc.
|
||||
|
||||
We also want to achieve these goals in a modular fashion, meaning you can find a reasonable trade-off depending on your exact requirements. For example, you usually have to trade off some bandwidth to get metadata protection, and vice versa.
|
||||
|
||||
The concept of designing for resource restricted devices also leads to the concept of adaptive nodes, where you have more of a continuum between full nodes and light nodes. For example, if you switch your phone from mobile data to WiFi you might be able to handle more bandwidth, and so on.
|
||||
|
||||
## 4. Waku v2 - Breakdown
|
||||
|
||||
Where is Waku v2 at now, and how is it structured?
|
||||
|
||||
It is running over libp2p and we had our second internal testnet last week or so. As a side note, we name our testnets after subway stations in Taipei, the first one being Nangang, and the most recent one being Dingpu.
|
||||
|
||||
The main implementation is written in Nim using nim-libp2p, which is also powering Nimbus, an Ethereum 2 client. There is also a PoC for running Waku v2 in the browser. On a spec level, we have the following specifications that correspond to the components that make up Waku v2:
|
||||
|
||||
- Waku v2 - this is the main spec that explains the goals of providing generalized messaging, in a p2p context, with a focus on privacy and running on resource restricted devices.
|
||||
- Relay - this is the main PubSub spec that provides better routing. It builds on top of GossipSub, which is what Eth2 heavily relies on as well.
|
||||
- Store - this is a 1-1 protocol for light nodes to get historical messages, if they are mostly-offline.
|
||||
- Filter - this is a 1-1 protocol for light nodes that are bandwidth restricted to only (or mostly) get messages they care about.
|
||||
- Message - this explains the payload, to get some basic encryption and content topics. It corresponds roughly to envelopes in Whisper/Waku v1.
|
||||
- Bridge - this explains how to do bridging between Waku v1 and Waku v2 for compatibility.
|
||||
|
||||
Right now, all protocols, with the exception of bridge, are in draft mode, meaning they have been implemented but are not yet being relied upon in production.
|
||||
|
||||
You can read more about the breakdown in this [update](https://vac.dev/waku-v2-update), though some progress has been made since then, as well as in the [main Waku v2 spec](https://rfc.vac.dev/spec/10).
|
||||
|
||||
## 5. Waku v2 - Upcoming
|
||||
|
||||
What's coming up next? There are a few things.
|
||||
|
||||
For Status to use it in production, it needs to be integrated into the main app using the Nim Node API. The bridge also needs to be implemented and tested.
|
||||
|
||||
For other users, we are currently overhauling the API to allow usage from, e.g., a browser. To make this experience great, there are also a few underlying infrastructure things that we need in nim-libp2p, such as a more secure HTTP server in Nim, Websockets and WebRTC support.
|
||||
|
||||
We also made some changes to the level at which content encryption happens, and this needs to be made easier to use in the API. This means you can use a node without giving your keys to it, which is useful in some environments.
|
||||
|
||||
More generally, beyond getting to production-ready use, there are a few bigger pieces that we are working on or will work on soon. These are things like:
|
||||
|
||||
- Better scaling, by using topic sharding.
|
||||
- Accounting and user-run nodes, to account for and incentivize full nodes.
|
||||
- Stronger and more rigorous privacy guarantees, e.g. through study of GossipSub, unlinkable packet formats, etc.
|
||||
- Rate Limit Nullifier for privacy preserving spam protection, a la what Barry Whitehat has presented before.
|
||||
|
||||
We are also working on better support for Ethereum M2M Messaging, which is what I'll talk about next.
|
||||
|
||||
## PART 2 - ETHEREUM MESSAGING
|
||||
|
||||
A lot of what follows is inspired by exploratory work that John Lea has done at Status, previously Head of UX Architecture at Ubuntu.
|
||||
|
||||
## 6. Ethereum Messaging - Why?
|
||||
|
||||
It is easy to think that Waku v2 is only for human to human messaging, since that's how Waku is currently primarily used in the Status app. However, the goal is to be useful for generalized messaging, which includes other types of information as well as machine to machine messaging.
|
||||
|
||||
What is Ethereum M2M messaging? Going back to the Holy Trinity of Ethereum/Whisper/Swarm, the messaging component was seen as something that could facilitate messages between dapps and act as a building block. This can help with things such as:
|
||||
|
||||
- Reducing on-chain transactions
|
||||
- Reduce latency for operations
|
||||
- Decentralize centrally coordinated services (like WalletConnect)
|
||||
- Improve UX of dapps
|
||||
- Broadcast live information
|
||||
- A message transport layer for state channels
|
||||
|
||||
And so on.
|
||||
|
||||
## 7. Ethereum Messaging - Why? (Cont)
|
||||
|
||||
What are some examples of practical problems that Waku, used for Ethereum Messaging, could solve?
|
||||
|
||||
- Multisig transfers only needing one on chain transaction
|
||||
- DAO votes only needing one on chain transaction
|
||||
- Giving dapps ability to direct push notifications to users
|
||||
- Giving users ability to directly respond to requests from dapps
|
||||
- Decentralized Wallet Connect
|
||||
|
||||
Etc.
|
||||
|
||||
## 8. What's needed to deliver this?
|
||||
|
||||
We can break it down into four actors:
|
||||
|
||||
- Decentralized M2M messaging system (Waku)
|
||||
- Native wallets (Argent, Metamask, Status, etc)
|
||||
- Dapps that benefit from M2M messaging
|
||||
- Users whose problems are being solved
|
||||
|
||||
Each of these has a bunch of requirements in turn. The messaging system needs to be decentralized, scalable, robust, etc. Wallets need support for messaging layer, dapps need to integrate this, etc.
|
||||
|
||||
This is a lot! Growing adoption is a challenge. There is a catch 22 in terms of justifying development efforts for wallets, when no dapps need it, and likewise for dapps when no wallets support Waku. In addition to this, there must be proven usage of Waku before it can be relied on, etc. How can we break this up into smaller pieces of work?
|
||||
|
||||
## 9. Breaking up the problem and a high level roadmap
|
||||
|
||||
We can start small. It doesn't need to be used for critical features first. A more hybrid approach can be taken where it acts more as a nice-to-have.
|
||||
|
||||
1. Forking Whisper and solving its scalability, spam, and other issues.
|
||||
This is a work in progress. What we talked about in part 1.
|
||||
2. Expose messaging API for Dapp developers.
|
||||
3. Implement decentralized version of WalletConnect.
|
||||
Currently wallets connect to dapps with a centralized service. Great UX.
|
||||
4. Solve DAO/Multi-Sig coordination problem.
|
||||
E.g. send message to wallet-derived key when it is time to sign a transaction.
|
||||
5. Extend dapp-to-user and user-to-dapp communication to more dapps.
|
||||
Use lessons learned and examples to drive adoption for wallets/dapps.
|
||||
|
||||
And then build up from there.
|
||||
|
||||
## 10. We are hiring!
|
||||
|
||||
A lot of this will happen in JavaScript and browsers, since that's the primary environment for a lot of wallets and dapps. We are currently hiring for a Waku JS Wallet integration lead to help push this effort further.
|
||||
|
||||
Come talk to me after or [apply here](https://status.im/our_team/open_positions.html?gh_jid=2385338).
|
||||
|
||||
That's it! You can find us on Status, Telegram, vac.dev. I'm on twitter [here](https://twitter.com/oskarth).
|
||||
|
||||
Questions?
|
||||
|
||||
---
|
||||
225
rlog/2021-03-03-rln-relay.mdx
Normal file
@@ -0,0 +1,225 @@
|
||||
---
|
||||
layout: post
|
||||
name: 'Privacy-preserving p2p economic spam protection in Waku v2'
|
||||
title: 'Privacy-preserving p2p economic spam protection in Waku v2'
|
||||
date: 2021-03-05 12:00:00
|
||||
authors: sanaz
|
||||
published: true
|
||||
slug: rln-relay
|
||||
categories: research
|
||||
image: /img/rain.png
|
||||
discuss: https://forum.vac.dev/t/privacy-preserving-p2p-economic-spam-protection-in-waku-v2-with-rate-limiting-nullfiers/66
|
||||
|
||||
toc_min_heading_level: 2
|
||||
toc_max_heading_level: 5
|
||||
---
|
||||
|
||||
This post is going to give you an overview of how spam protection can be achieved in Waku Relay through rate-limiting nullifiers. We will cover a summary of spam-protection methods in centralized and p2p systems, and the solution overview and details of the economic spam-protection method. The open issues and future steps are discussed in the end.
|
||||
|
||||
<!--truncate-->
|
||||
|
||||
## Introduction
|
||||
|
||||
This post is going to give you an overview of how spam protection can be achieved in Waku Relay protocol[^2] through Rate-Limiting Nullifiers[^3] [^4] or RLN for short.
|
||||
|
||||
Let me give a little background about Waku (v2)[^1]. Waku is a privacy-preserving peer-to-peer (p2p) messaging protocol for resource-restricted devices. Being p2p means that Waku relies on **no** central server. Instead, peers collaboratively deliver messages in the network. Waku uses GossipSub[^16] as the underlying routing protocol (as of the writing of this post). At a high level, GossipSub is based on a publisher-subscriber architecture. That is, _peers congregate around topics they are interested in and can send messages to topics. Each message gets delivered to all peers subscribed to the topic_. In GossipSub, a peer has a constant number of direct connections/neighbors. In order to publish a message, the author forwards its message to a subset of its neighbors. The neighbors proceed similarly until the message has propagated through the network of subscribed peers. The message publishing and routing procedures are part of the Waku Relay[^17] protocol.
|
||||

|
||||
|
||||
## What do we mean by spamming?
|
||||
|
||||
In centralized messaging systems, a spammer usually indicates an entity that uses the messaging system to send an unsolicited message (spam) to large numbers of recipients. However, in Waku with a p2p architecture, spam messages not only affect the recipients but also all the other peers involved in the routing process as they have to spend their computational power/bandwidth/storage capacity on processing spam messages. As such, we define a spammer as an entity that uses the messaging system to publish a large number of messages in a short amount of time. The messages issued in this way are called spam. In this definition, we disregard the intention of the spammer as well as the content of the message and the number of recipients.
|
||||
|
||||
## Possible Solutions
|
||||
|
||||
Has the spamming issue been addressed before? Of course yes! Here is an overview of the spam protection techniques with their trade-offs and use-cases. In this overview, we distinguish between protection techniques that are targeted for centralized messaging systems and those for p2p architectures.
|
||||
|
||||
### Centralized Messaging Systems
|
||||
|
||||
In traditional centralized messaging systems, spam usually signifies unsolicited messages sent in bulk or messages with malicious content like malware. Protection mechanisms include
|
||||
|
||||
- authentication through some piece of personally identifiable information e.g., phone number
|
||||
- checksum-based filtering to protect against messages sent in bulk
|
||||
- challenge-response systems
|
||||
- content filtering on the server or via a proxy application
|
||||
|
||||
These methods exploit the fact that the messaging system is centralized and a global view of the users' activities is available based on which spamming patterns can be extracted and defeated accordingly. Moreover, users are associated with an identifier e.g., a username which enables the server to profile each user e.g., to detect suspicious behavior like spamming. Such profiling possibility is against the user's anonymity and privacy.
|
||||
|
||||
Among the techniques enumerated above, authentication through phone numbers is a somewhat economic-incentive measure, as providing multiple valid phone numbers is expensive for the attacker. Notice that while an expensive authentication method can reduce the number of accounts owned by a single spammer, it cannot address the spam issue entirely. This is because the spammer can still send bulk messages through one single account. For this approach to be effective, a centralized mediator is essential. That is why such a solution would not fit p2p environments, where no centralized control exists.
|
||||
|
||||
### P2P Systems
|
||||
|
||||
What about spam prevention in p2p messaging platforms? There are two techniques, namely _Proof of Work_[^8], deployed by Whisper[^9], and the _Peer scoring_[^6] method (a reputation-based approach), adopted by libp2p. However, each of these solutions has its own shortcomings for real-life use-cases, as explained below.
|
||||
|
||||
#### Proof of work
|
||||
|
||||
The idea behind Proof of Work (POW)[^8] is to make messaging a computationally costly operation, hence lowering the messaging rate of **all** peers, including spammers. Specifically, the message publisher has to solve a puzzle: find a nonce such that the hash of the message concatenated with the nonce has at least z leading zeros, where z is known as the difficulty of the puzzle. Since the hash function is one-way, peers have to brute-force the nonce, which means computing many hashes and is therefore computationally heavy. While solving the puzzle is computationally expensive, it is comparatively cheap to verify the solution.
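To make the puzzle concrete, here is a minimal sketch in Python. It assumes SHA-256 as the hash, counts leading zero *bits*, and encodes the nonce as 8 big-endian bytes; these are illustrative choices, not the exact Whisper PoW definition, and the function names are hypothetical.

```python
import hashlib
from itertools import count


def leading_zero_bits(digest: bytes) -> int:
    """Count the leading zero bits of a digest."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        bits += 8 - byte.bit_length()
        break
    return bits


def puzzle_hash(message: bytes, nonce: int) -> bytes:
    # Assumption: the nonce is appended as 8 big-endian bytes.
    return hashlib.sha256(message + nonce.to_bytes(8, "big")).digest()


def solve(message: bytes, z: int) -> int:
    """Brute-force a nonce; expected cost is about 2**z hash evaluations."""
    for nonce in count():
        if leading_zero_bits(puzzle_hash(message, nonce)) >= z:
            return nonce


def verify(message: bytes, nonce: int, z: int) -> bool:
    """Verification costs a single hash evaluation."""
    return leading_zero_bits(puzzle_hash(message, nonce)) >= z


nonce = solve(b"hello waku", 12)
print(nonce, verify(b"hello waku", nonce, 12))
```

The asymmetry is visible in the code: `solve` loops until it gets lucky, while `verify` hashes once. Raising `z` by one roughly doubles the publisher's work, which is exactly what hurts resource-restricted peers.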
|
||||
|
||||
POW is also used as the underlying mining algorithm in the Ethereum and Bitcoin blockchains. There, the goal is to contain the mining speed and allow the decentralized network to come to a consensus, or agree on things like account balances and the order of transactions.
|
||||
|
||||
While the use of POW makes perfect sense in the Ethereum / Bitcoin blockchains, it shows practical issues in heterogeneous p2p messaging systems with resource-restricted peers. Some peers won't be able to carry out the designated computation and will be effectively excluded. Such exclusion proved to be a practical issue in applications like Status, which used to rely on POW for spam-protection, to the extent that the difficulty level had to be set close to zero.
|
||||
|
||||
#### Peer Scoring
|
||||
|
||||
The peer scoring method[^6] utilized by libp2p limits the number of messages issued by a peer in connection to another peer. That is, each peer monitors all the peers to which it is directly connected and adjusts their messaging quota, i.e., whether to route their messages or not, depending on their past activities. For example, if a peer detects that its neighbor is sending more than x messages per month, it can drop that neighbor's quota to z·x, where z is less than one. The shortcoming of this solution is that scoring is based on peers' local observations, and the score is defined in relation to one single peer. This leaves room for an attack where a spammer makes connections to k peers in the system and publishes k·(x-1) messages by exploiting all of its k connections. Another attack scenario is through botnets consisting of a large number of bots, e.g., a million. The attacker rents a botnet and inserts each bot as a legitimate peer into the network, and each can publish x-1 messages per month[^7].
|
||||
|
||||
#### Economic-Incentive Spam protection
|
||||
|
||||
Is this the end of our spam-protection journey? Shall we simply give up and leave spammers be? Certainly not!
|
||||
Waku RLN-Relay gives us a p2p spam-protection method which:
|
||||
|
||||
- suits **p2p** systems and does not rely on any central entity.
|
||||
- is **efficient**, i.e., it has no unreasonable computational, storage, memory, or bandwidth requirements. As such, it fits a network of **heterogeneous** peers.
|
||||
- respects users' **privacy**, unlike reputation-based and centralized methods.
|
||||
- deploys **economic incentives** to contain spammers' activity. Namely, there is a financial sacrifice for those who want to spam the system. How? Follow along ...
|
||||
|
||||
We devise a general rule to save everyone's life and that is
|
||||
|
||||
**No one can publish more than M messages per epoch without being financially charged!**
|
||||
|
||||
We set M to 1 for now, but this can be any arbitrary value. You may be thinking "This is too restrictive! Only one per epoch?". Don't worry, we set the epoch to a reasonable value so that it does not slow down the communication of innocent users but will make the life of spammers harder! Epoch here can be every second, as defined by UTC date-time +-20s.
|
||||
|
||||
The remainder of this post is all about how to enforce this limit on each user's messaging rate, as well as how to impose the financial cost when the limit gets violated. This brings us to Rate Limiting Nullifiers and how we integrate this technique into Waku v2 (specifically, the Waku Relay protocol) to protect our valuable users against spammers.
|
||||
|
||||
## Technical Terms
|
||||
|
||||
**Zero-knowledge proof**: A zero-knowledge proof (ZKP)[^14] allows a _prover_ to show a _verifier_ that they know something, without revealing what that something is. This means you can do trust-minimized computation that is also privacy-preserving. As a basic example, instead of showing your ID when going to a bar, you simply give the doorman proof that you are over 18, without showing your ID. In this write-up, by ZKP we essentially mean zkSNARK[^15], which is one of the many types of ZKPs.
|
||||
|
||||
**Threshold Secret Sharing Scheme**: (m,n) threshold secret sharing is a method by which you can split a secret value s into n pieces in such a way that s can be reconstructed from any m of the pieces (m <= n), while fewer than m pieces reveal nothing about s. The economic-incentive spam protection utilizes a (2,n) secret sharing realized by the Shamir Secret Sharing Scheme[^13].
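As an illustration of the (2,n) case, the sketch below hides the secret as the intercept of a random line over a prime field: every share is a point on the line, and any two points determine it. The modulus `P` and the function names are assumptions made for this example; the RLN construction uses the scalar field of its zkSNARK curve rather than this prime.

```python
import secrets

# Assumption: a Mersenne prime used purely for illustration.
P = 2**127 - 1


def split(secret: int, n: int):
    """Split `secret` into n shares such that any 2 of them reconstruct it."""
    slope = secrets.randbelow(P)  # random degree-1 coefficient
    # Share i is the point (i, secret + slope * i) on the line.
    return [(x, (secret + slope * x) % P) for x in range(1, n + 1)]


def reconstruct(share_a, share_b) -> int:
    """Recover the intercept (the secret) from two distinct points."""
    (x1, y1), (x2, y2) = share_a, share_b
    slope = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    return (y1 - slope * x1) % P


shares = split(123456789, 5)
print(reconstruct(shares[0], shares[3]) == 123456789)
```

A single share is one point on an unknown line, so every candidate secret remains equally likely; that is the property that keeps an honest publisher's key safe.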
|
||||
|
||||
## Overview: Economic-Incentive Spam protection through Rate Limiting Nullifiers
|
||||
|
||||
**Context**: We started the idea of economic-incentive spam protection more than a year ago and conducted a feasibility study to identify blockers and unknowns. The results are published in our prior [post](https://vac.dev/feasibility-semaphore-rate-limiting-zksnarks). Since then major progress has been made and the prior identified blockers that are listed below are now addressed. Kudos to [Barry WhiteHat](https://github.com/barryWhiteHat), [Onur Kilic](https://github.com/kilic), [Koh Wei Jie](https://github.com/weijiekoh/perpetualpowersoftau) for all of their hard work, research, and development which made this progress possible.
|
||||
|
||||
- the proof time[^22], which was initially on the order of minutes (~10 mins) and is now about 0.5 seconds
|
||||
- the prover key size[^21] which was initially ~110MB and now is ~3.9MB
|
||||
- the lack of Shamir logic[^19] which is now implemented and part of the RLN repository[^4]
|
||||
- the concern regarding the potential multi-party computation for the trusted setup of zkSNARKs which got resolved[^20]
|
||||
- the lack of end-to-end integration, which we have now made possible and implemented, and which we present in this post. New blockers also surfaced during the e2e integration, which we discuss in the [Feasibility and Open Issues](#feasibility-and-open-issues) section.
|
||||
|
||||
Now that you have more context, let's see how the final solution works. The fundamental point is to make it economically costly to send more than your share of messages and to do so in a privacy-preserving and e2e fashion. To do that we have the following components:
|
||||
|
||||
- **Group**: We manage all the peers inside a large group (later we can split peers into smaller groups, but for now consider only one). The group management is done via a smart contract which is devised for this purpose and is deployed on the Ethereum blockchain.
|
||||
- **Membership**: To be able to send messages, and specifically for the published messages to get routed by all the peers, publishing peers have to register to the group. Membership involves setting up a public and private key pair (think of it as a username and password). The private key remains on the user's side, but the public key becomes part of the group information on the contract (publicly available) and everyone has access to it. Public keys are not human-generated (like usernames); instead they are random numbers and, as such, do not reveal any information about the owner (think of public keys as pseudonyms). Registration is mandatory for users who want to publish a message. However, users who only want to listen to messages are more than welcome and do not have to register in the group.
|
||||
- **Membership fee**: Membership is not free! Each peer has to lock a certain amount of funds during registration (this means peers need an Ethereum account with sufficient balance). These funds are safely stored on the contract and remain intact unless the peer attempts to break the rules and publish more than one message per epoch.
|
||||
- **Zero-knowledge Proof of membership**: Do you want your message to get routed to its destination? Fine, but you have to prove that you are a member of the group (sorry, no one can escape the registration phase!). Now, you may be wondering: should I attach my public key to my message to prove my membership? Absolutely not! We said that our solution respects privacy! Membership proofs are done in a zero-knowledge manner, that is, each message carries a cryptographic proof asserting that "the message is generated by one of the current members of the group", so your identity remains private and your anonymity is preserved!
|
||||
- **Slashing through secret sharing**: Till now it does not seem like we can catch spammers, right? Yes, you are right! Now comes the exciting part: detecting spammers and slashing them. The core idea behind slashing is that each publishing peer (not routing peers!) has to embed a secret share of its private key inside the message. The secret share is deterministically computed over the private key and the current epoch. The content of this share is harmless for the peer's privacy (it looks random) unless the peer attempts to publish more than one message in the same epoch, hence disclosing more than one secret share of its private key. Indeed, two distinct shares of the private key under the same epoch are enough to reconstruct the entire private key. Then what should you do with the recovered private key? Hurry up! Go to the contract, use the recovered private key to withdraw the associated funds, and get rich!! Are you wondering what happens if spammers attach junk values instead of valid secret shares? Of course, that wouldn't be cool! So there is a zero-knowledge proof for this as well, in which the publishing peer has to prove that the secret shares are generated correctly.
|
||||
|
||||
A high-level overview of the economic spam protection is shown in Figure 1.
|
||||
|
||||
## Flow
|
||||
|
||||
In this section, we describe the flow of the economic-incentive spam detection mechanism from the viewpoint of a single peer. An overview of this flow is provided in Figure 3.
|
||||
|
||||
## Setup and Registration
|
||||
|
||||
A peer willing to publish a message is required to register. Registration is moderated through a smart contract deployed on the Ethereum blockchain. The state of the contract contains the list of registered members' public keys. An overview of registration is illustrated in Figure 2.
|
||||
|
||||
For the registration, a peer creates a transaction that sends x amount of Ether to the contract. The peer who has the "private key" `sk` associated with that deposit is able to withdraw x Ether by providing a valid proof. Note that `sk` is initially known only to the owning peer; however, it may get exposed to other peers if the owner attempts to spam the system, i.e., sends more than one message per epoch.
|
||||
The following relation holds between `sk` and `pk`: `pk = H(sk)`, where `H` denotes a hash function.
|
||||

|
||||
|
||||
## Maintaining the membership Merkle Tree
|
||||
|
||||
The ZKP of membership that we mentioned before relies on the representation of the entire group as a [Merkle Tree](/#). The tree construction and maintenance is delegated to the peers (the initial idea was to keep the tree on the chain as part of the contract; however, the cost associated with member deletion and insertion was unreasonably high, please see [Feasibility and Open Issues](#feasibility-and-open-issues) for more details). As such, each peer needs to build the tree locally and sync itself with the contract updates (peer insertion and deletion) to mirror them on its tree.
|
||||
Two pieces of information of the tree are important as they enable peers to generate zero-knowledge proofs. One is the root of the tree and the other is the membership proof (or the authentication path). The tree root is public information whereas the membership proof is private data (or more precisely the index of the peer in the tree).
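The two pieces of information can be sketched as follows. This is a minimal illustration that assumes SHA-256 as `H`, a power-of-two number of leaves, and hypothetical helper names; the RLN implementation uses a SNARK-friendly hash and its own tree layout.

```python
import hashlib


def h(*parts: bytes) -> bytes:
    """Stand-in hash function (assumption: SHA-256 over the concatenation)."""
    return hashlib.sha256(b"".join(parts)).digest()


def build_levels(leaves):
    """Return every level of the tree, leaves first, root level last."""
    levels = [list(leaves)]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([h(prev[i], prev[i + 1]) for i in range(0, len(prev), 2)])
    return levels


def auth_path(levels, index):
    """Sibling hashes on the way from the leaf at `index` up to the root."""
    path = []
    for level in levels[:-1]:
        path.append(level[index ^ 1])  # sibling of the current node
        index //= 2
    return path


def root_from_path(leaf, index, path):
    """Recompute the root from a leaf, its index, and its authentication path."""
    node = leaf
    for sibling in path:
        node = h(node, sibling) if index % 2 == 0 else h(sibling, node)
        index //= 2
    return node


# Four registered members; each public key is pk = H(sk).
sks = [bytes([i + 1]) * 32 for i in range(4)]
pks = [h(sk) for sk in sks]
levels = build_levels(pks)
root = levels[-1][0]
path = auth_path(levels, 2)
print(root_from_path(pks[2], 2, path) == root)
```

In the actual protocol the path check is not performed in the clear: it is one of the statements proven inside the zero-knowledge proof, so routers learn that *some* leaf hashes up to the public root without learning which one.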
|
||||
|
||||
## Publishing
|
||||
|
||||
In order to publish at a given epoch, each message must carry a proof i.e., a zero-knowledge proof signifying that the publishing peer is a registered member, and has not exceeded the messaging rate at the given epoch.
|
||||
|
||||
Recall that the messaging rate is enforced by embedding a secret share of the peer's `sk` in the message, together with a ZKP that the secret share is constructed correctly. As for the secret sharing part, the peer generates the following data:
|
||||
|
||||
1. `shareX`
|
||||
2. `shareY`
|
||||
3. `nullifier`
|
||||
|
||||
The pair (`shareX`, `shareY`) is a secret share of `sk`, generated using the Shamir secret sharing scheme. Having two such distinct pairs for an identical `nullifier` results in full disclosure of the peer's `sk` and hence the loss of the associated deposit. Note that the `nullifier` is a deterministic value derived from `sk` and `epoch`; therefore any two messages issued by the same peer (i.e., using the same `sk`) for the same `epoch` are guaranteed to have identical `nullifier`s.
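One way to picture how these three values fit together is sketched below. The derivations used here (the line's slope as a hash of `sk` and `epoch`, `shareX` as a hash of the message, and the `nullifier` as a hash of the slope) are assumptions for illustration, with SHA-256 reduced modulo a prime standing in for the circuit's hash; consult the RLN specification for the exact definitions.

```python
import hashlib

# Assumption: illustrative prime field, not the field used by RLN.
P = 2**127 - 1


def to_bytes(value: int) -> bytes:
    return value.to_bytes(16, "big")


def h_int(*parts: bytes) -> int:
    """Stand-in hash to a field element."""
    return int.from_bytes(hashlib.sha256(b"".join(parts)).digest(), "big") % P


def make_shares(sk: int, epoch: int, message: bytes):
    """Return (shareX, shareY, nullifier) for one message."""
    slope = h_int(to_bytes(sk), to_bytes(epoch))  # fixed per (sk, epoch)
    share_x = h_int(message)                      # differs per message
    share_y = (sk + slope * share_x) % P          # point on the line
    nullifier = h_int(to_bytes(slope))            # same for the whole epoch
    return share_x, share_y, nullifier


def recover_sk(first, second) -> int:
    """Recover sk from two distinct shares with the same nullifier."""
    (x1, y1), (x2, y2) = first, second
    slope = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    return (y1 - slope * x1) % P


sk = 424242
x1, y1, n1 = make_shares(sk, 100, b"first message")
x2, y2, n2 = make_shares(sk, 100, b"second message")
print(n1 == n2, recover_sk((x1, y1), (x2, y2)) == sk)
```

Because the slope is fixed for a given `sk` and `epoch`, a second message in the same epoch yields a second point on the same line, and the line's intercept is `sk`. In a different epoch the slope, and therefore the `nullifier`, changes, so shares from different epochs cannot be combined.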
|
||||
|
||||
Finally, the peer generates a zero-knowledge proof `zkProof` asserting the membership of the peer in the group and the correctness of the attached secret share (`shareX`, `shareY`) and the `nullifier`. In order to generate a valid proof, the peer needs to have two private inputs i.e., its `sk` and its authentication path. Other inputs are the tree root, epoch, and the content of the message.
|
||||
|
||||
**Privacy Hint:** Note that the authentication path of each peer depends on the recent list of members (hence changes when new peers register or leave). As such, it is recommended (and necessary for privacy/anonymity) that the publisher updates her authentication path based on the latest status of the group and attempts the proof using the updated version.
|
||||
|
||||
An overview of the publishing procedure is provided in Figure 3.
|
||||
|
||||
## Routing
|
||||
|
||||
Upon the receipt of a message, the routing peer needs to decide whether to route it or not. This decision relies on the following factors:
|
||||
|
||||
1. If the epoch value attached to the message differs from the routing peer's current epoch by more than a reasonable gap, then the message must be dropped (this prevents a newly registered peer from spamming the system by messaging for all the past epochs).
|
||||
2. The message MUST contain a valid proof, which the routing peer verifies.
|
||||
If the preceding checks are passed successfully, then the message is relayed. In case of an invalid proof, the message is dropped. If spamming is detected, the publishing peer gets slashed (see [Spam Detection and Slashing](#Spam-Detection-and-Slashing)).
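The epoch check in the first factor could look like the following minimal sketch. The threshold `MAX_EPOCH_GAP` and the function name are hypothetical; the text above only requires that the gap be "reasonable" and does not fix a value.

```javascript
// Hypothetical threshold: how many epochs a message may deviate from the
// routing peer's own epoch before it is dropped.
const MAX_EPOCH_GAP = 2;

// Returns true when the message epoch is close enough to the local epoch.
function epochWithinGap(messageEpoch, localEpoch, maxGap = MAX_EPOCH_GAP) {
  return Math.abs(localEpoch - messageEpoch) <= maxGap;
}
```

Taking the absolute difference also rejects messages stamped far in the future, which is a design choice of this sketch rather than something the text mandates.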
|
||||
|
||||
An overview of the routing procedure is provided in Figure 3.
|
||||
|
||||
### Spam Detection and Slashing
|
||||
|
||||
In order to enable local spam detection and slashing, routing peers MUST record the `nullifier`, `shareX`, and `shareY` of every incoming message, provided that it is not spam and carries a valid proof. To do so, the peer should follow these steps.
|
||||
|
||||
1. The routing peer first verifies the `zkProof` and drops the message if not verified.
|
||||
2. Otherwise, it checks whether a message with an identical `nullifier` has already been relayed.
|
||||
- a) If such a message exists and its `shareX` and `shareY` components are different from those of the incoming message, then slashing takes place (if the `shareX` and `shareY` fields of the previously relayed message are identical to those of the incoming message, then the message is a duplicate and shall be dropped).
|
||||
- b) If none found, then the message gets relayed.
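The steps above can be sketched as a small in-memory log keyed by `nullifier`. Proof verification is out of scope here and is passed in as a boolean; the class name `NullifierLog` and the returned action strings are hypothetical labels for illustration, not part of any Waku implementation.

```javascript
class NullifierLog {
  constructor() {
    // nullifier -> { shareX, shareY } of the first relayed message
    this.seen = new Map();
  }

  // Decides what to do with an incoming message.
  // Returns "drop-invalid", "drop-duplicate", "slash" or "relay".
  handle(message, proofIsValid) {
    // Step 1: messages without a valid proof are dropped and not recorded.
    if (!proofIsValid) return "drop-invalid";

    // Step 2: look for a previously relayed message with the same nullifier.
    const previous = this.seen.get(message.nullifier);
    if (previous === undefined) {
      // Step 2b: first message for this nullifier, record it and relay.
      this.seen.set(message.nullifier, {
        shareX: message.shareX,
        shareY: message.shareY,
      });
      return "relay";
    }

    // Step 2a: same nullifier seen before, compare the shares.
    const sameShare =
      previous.shareX === message.shareX && previous.shareY === message.shareY;
    return sameShare ? "drop-duplicate" : "slash";
  }
}
```

Because a `nullifier` is tied to one epoch, a real node would presumably also prune entries for epochs that fall outside the accepted gap; that is left out of this sketch.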
|
||||
|
||||
An overview of the slashing procedure is provided in Figure 3.
|
||||

|
||||
|
||||
## Feasibility and Open Issues
|
||||
|
||||
We've come a long way since a year ago: the earlier blockers are resolved and we now have an end-to-end implementation. We learned a lot and identified further issues and unknowns, some of which block the path to production. The identified issues are summarized below.
|
||||
|
||||
## Storage overhead per peer
|
||||
|
||||
Currently, peers are supposed to maintain the entire tree locally and it imposes storage overhead which is linear in the size of the group (see this [issue](https://github.com/vacp2p/research/issues/57)[^11] for more details). One way to cope with this is to use the light-node and full-node paradigm in which only a subset of peers who are more resourceful retain the tree whereas the light nodes obtain the necessary information by interacting with the full nodes. Another way to approach this problem is through a more storage efficient method (as described in this research issue[^12]) where peers store a partial view of the tree instead of the entire tree. Keeping the partial view lowers the storage complexity to O(log(N)) where N is the size of the group. There are still unknown unknowns to this solution, as such, it must be studied further to become fully functional.
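To get a feel for the difference between linear and logarithmic storage, the following sketch compares the size of a full binary Merkle tree with the size of a single authentication path. The 32-byte node size is an assumption, and the partial view described in the research issue may keep more than just the path, so the second number is only an order-of-magnitude estimate.

```javascript
// Assumed size of one tree node (one hash) in bytes.
const HASH_BYTES = 32;

// A full binary tree of the given depth has 2^depth leaves
// and 2^(depth + 1) - 1 nodes in total.
function fullTreeBytes(depth) {
  const nodes = 2 ** (depth + 1) - 1;
  return nodes * HASH_BYTES;
}

// An authentication path holds one sibling hash per level.
function authPathBytes(depth) {
  return depth * HASH_BYTES;
}
```

For a group of 2^20 members, `fullTreeBytes(20)` is roughly 64 MiB, while `authPathBytes(20)` is 640 bytes.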
|
||||
|
||||
## Cost-effective way of member insertion and deletion
|
||||
|
||||
Currently, the cost associated with RLN-Relay membership is around 30 USD[^10]. We aim at finding a more cost-effective approach. Please feel free to share with us your solution ideas in this regard in this [issue](https://github.com/vacp2p/research/issues/56).
|
||||
|
||||
## Exceeding the messaging rate via multiple registrations
|
||||
|
||||
While the solution relies on an economic incentive to discourage spamming, there is still an **expensive attack**[^23] that a spammer can launch to break the messaging rate limit. That is, the attacker can pay for multiple legitimate registrations, e.g., k of them, and is then able to publish k messages per epoch. We believe that the higher the membership fee, the less probable such an attack becomes, and hence the stronger the spam protection. Following this argument, the high membership fee (which we listed above as an open problem) can in fact contribute to a better protection level.
|
||||
|
||||
## Conclusion and Future Steps
|
||||
|
||||
As discussed in this post, Waku RLN Relay can achieve privacy-preserving economic spam protection through rate-limiting nullifiers. The idea is to financially discourage peers from publishing more than one message per epoch. Specifically, exceeding the messaging rate results in a financial charge. Those who violate this rule are called spammers and their messages are spam. The identification of spammers does not rely on any central entity. Also, the financial punishment of spammers is cryptographically guaranteed.
|
||||
In this solution, privacy is guaranteed since: 1) Peers do not have to disclose any piece of personally identifiable information in any phase i.e., neither in the registration nor in the messaging phase 2) Peers can prove that they have not exceeded the messaging rate in a zero-knowledge manner and without leaving any trace to their membership accounts.
|
||||
Furthermore, all the computations are light, hence this solution fits a heterogeneous p2p messaging system. Note that the zero-knowledge proof parts are handled through zkSNARKs, and the benchmarking results can be found in the RLN benchmark report[^5].
|
||||
|
||||
**Future steps**:
|
||||
|
||||
We are still at the PoC level, and the development is in progress. As our future steps,
|
||||
|
||||
- we would like to evaluate the running time of the Merkle tree operations. Indeed, the need to store the Merkle tree locally on each peer was one of the unknowns discovered during this PoC, and concrete benchmarking results are not yet available.
|
||||
- We would also like to pursue our storage-efficient Merkle Tree maintenance solution in order to lower the storage overhead of peers.
|
||||
- In line with the storage optimization, the full-node light-node structure is another path to follow.
|
||||
- Another possible improvement is to replace the membership contract with a distributed group management scheme e.g., through distributed hash tables. This is to address possible performance issues that the interaction with the Ethereum blockchain may cause. For example, the registration transactions are subject to delay as they have to be mined before being visible in the state of the membership contract. This means peers have to wait for some time before being able to publish any message.
|
||||
|
||||
## Acknowledgement
|
||||
|
||||
Thanks to Onur Kılıç for his explanation and pointers and for assisting with development and runtime issues. Also thanks to Barry Whitehat for his time and insightful comments. Special thanks to Oskar Thoren for his constructive comments and his guides during the development of this PoC and the writeup of this post.
|
||||
|
||||
## References
|
||||
|
||||
[^1]: Waku v2: https://rfc.vac.dev/spec/10/
|
||||
[^2]: RLN-Relay specification: https://rfc.vac.dev/spec/17/
|
||||
[^3]: RLN documentation: [https://hackmd.io/tMTLMYmTR5eynw2lwK9n1w?both](https://hackmd.io/tMTLMYmTR5eynw2lwK9n1w?both)
|
||||
[^4]: RLN repositories: [https://github.com/kilic/RLN](https://github.com/kilic/RLN) and [https://github.com/kilic/rlnapp](https://github.com/kilic/rlnapp)
|
||||
[^5]: RLN Benchmark: [https://hackmd.io/tMTLMYmTR5eynw2lwK9n1w?view#Benchmarks](https://hackmd.io/tMTLMYmTR5eynw2lwK9n1w?view#Benchmarks)
|
||||
[^6]: Peer Scoring: [https://github.com/libp2p/specs/blob/master/pubsub/gossipsub/gossipsub-v1.1.md#peer-scoring](https://github.com/libp2p/specs/blob/master/pubsub/gossipsub/gossipsub-v1.1.md#peer-scoring)
|
||||
[^7]: Peer scoring security issues: [https://github.com/vacp2p/research/issues/44](https://github.com/vacp2p/research/issues/44)
|
||||
[^8]: Proof of work: [http://www.infosecon.net/workshop/downloads/2004/pdf/clayton.pdf](http://www.infosecon.net/workshop/downloads/2004/pdf/clayton.pdf) and [https://link.springer.com/content/pdf/10.1007/3-540-48071-4_10.pdf](https://link.springer.com/content/pdf/10.1007/3-540-48071-4_10.pdf)
|
||||
[^9]: EIP-627 Whisper: https://eips.ethereum.org/EIPS/eip-627
|
||||
[^10]: Cost-effective way of member insertion and deletion: [https://github.com/vacp2p/research/issues/56](https://github.com/vacp2p/research/issues/56)
|
||||
[^11]: Storage overhead per peer: [https://github.com/vacp2p/research/issues/57](https://github.com/vacp2p/research/issues/57)
|
||||
[^12]: Storage-efficient Merkle Tree maintenance: [https://github.com/vacp2p/research/pull/54](https://github.com/vacp2p/research/pull/54)
|
||||
[^13]: Shamir Secret Sharing Scheme: [https://en.wikipedia.org/wiki/Shamir%27s_Secret_Sharing](https://en.wikipedia.org/wiki/Shamir%27s_Secret_Sharing)
|
||||
[^14]: Zero Knowledge Proof: [https://dl.acm.org/doi/abs/10.1145/3335741.3335750](https://dl.acm.org/doi/abs/10.1145/3335741.3335750) and [https://en.wikipedia.org/wiki/Zero-knowledge_proof](https://en.wikipedia.org/wiki/Zero-knowledge_proof)
|
||||
[^15]: zkSNARKs: [https://link.springer.com/chapter/10.1007/978-3-662-49896-5_11](https://link.springer.com/chapter/10.1007/978-3-662-49896-5_11) and [https://coinpare.io/whitepaper/zcash.pdf](https://coinpare.io/whitepaper/zcash.pdf)
|
||||
[^16]: GossipSub: [https://docs.libp2p.io/concepts/publish-subscribe/](https://docs.libp2p.io/concepts/publish-subscribe/)
|
||||
[^17]: Waku Relay: https://rfc.vac.dev/spec/11/
|
||||
[^18]: Prior blockers of RLN-Relay: [https://vac.dev/feasibility-semaphore-rate-limiting-zksnarks](https://vac.dev/feasibility-semaphore-rate-limiting-zksnarks)
|
||||
[^19]: The lack of Shamir secret sharing in zkSNARKs: [https://github.com/vacp2p/research/issues/10](https://github.com/vacp2p/research/issues/10)
|
||||
[^20]: The MPC required for zkSNARKs trusted setup: [https://github.com/vacp2p/research/issues/9](https://github.com/vacp2p/research/issues/9)
|
||||
[^21]: Prover key size: [https://github.com/vacp2p/research/issues/8](https://github.com/vacp2p/research/issues/8)
|
||||
[^22]: zkSNARKs proof time: [https://github.com/vacp2p/research/issues/7](https://github.com/vacp2p/research/issues/7)
|
||||
[^23]: Attack on the messaging rate: [https://github.com/vacp2p/specs/issues/251](https://github.com/vacp2p/specs/issues/251)
|
||||
181
rlog/2021-06-04-presenting-js-waku.mdx
Normal file
@@ -0,0 +1,181 @@
|
||||
---
|
||||
layout: post
|
||||
name: 'Presenting JS-Waku: Waku v2 in the Browser'
|
||||
title: 'Presenting JS-Waku: Waku v2 in the Browser'
|
||||
date: 2021-06-04 12:00:00
|
||||
authors: franck
|
||||
published: true
|
||||
slug: presenting-js-waku
|
||||
categories: platform
|
||||
image: /img/js-waku-gist.png
|
||||
discuss: https://forum.vac.dev/t/discussion-presenting-js-waku-waku-v2-in-the-browser/81
|
||||
---
|
||||
|
||||
JS-Waku is bringing Waku v2 to the browser. Learn what we achieved so far and what is next in our pipeline!
|
||||
|
||||
<!--truncate-->
|
||||
|
||||
For the past 3 months, we have been working on bringing Waku v2 to the browser.
|
||||
Our aim is to empower dApps with Waku v2, and it led to the creation of a new library.
|
||||
We believe now is a good time to introduce it!
|
||||
|
||||
## Waku v2
|
||||
|
||||
First, let's review what Waku v2 is and what problem it is trying to solve.
|
||||
|
||||
Waku v2 comes from a need to have a more scalable, better optimised solution for the Status app to achieve decentralised
|
||||
communications on resource restricted devices (i.e., mobile phones).
|
||||
|
||||
The Status chat feature was initially built over Whisper.
|
||||
However, Whisper has a number of caveats which make it inefficient for mobile phones.
|
||||
For example, with Whisper, all devices are receiving all messages which is not ideal for limited data plans.
|
||||
|
||||
To remediate this, a Waku mode (then Waku v1), based on devp2p, was introduced.
|
||||
To further enable web and restricted resource environments, Waku v2 was created based on libp2p.
|
||||
The migration of the Status chat feature to Waku v2 is currently in progress.
|
||||
|
||||
We see the need for such a solution in the broader Ethereum ecosystem, beyond Status.
|
||||
This is why we are building Waku v2 as a decentralised communication platform for all to use and build on.
|
||||
If you want to read more about Waku v2 and what it aims to achieve,
|
||||
check out [What's the Plan for Waku v2?](/waku-v2-plan).
|
||||
|
||||
Since last year, we have been busy defining and implementing Waku v2 protocols in [nim-waku](https://github.com/status-im/nim-waku),
|
||||
from which you can build [wakunode2](https://github.com/status-im/nim-waku#wakunode).
|
||||
Wakunode2 is an adaptive and modular Waku v2 node,
|
||||
it allows users to run their own node and use the Waku v2 protocols they need.
|
||||
The nim-waku project doubles as a library, that can be used to add Waku v2 support to native applications.
|
||||
|
||||
## Waku v2 in the browser
|
||||
|
||||
We believe that dApps and wallets can benefit from the Waku network in several ways.
|
||||
For some dApps, it makes sense to enable peer-to-peer communications.
|
||||
For others, machine-to-machine communications would be a great asset.
|
||||
For example, in the case of a DAO,
|
||||
Waku could be used for gas-less voting.
|
||||
This would enable the DAO to notify its users of a new vote,
|
||||
and users to vote without interacting with the blockchain and spending gas.
|
||||
|
||||
[Murmur](https://github.com/status-im/murmur) was the first attempt to bring Whisper to the browser,
|
||||
acting as a bridge between devp2p and libp2p.
|
||||
Once Waku v2 was started and there was a native implementation on top of libp2p,
|
||||
a [chat POC](https://github.com/vacp2p/waku-web-chat) was created to demonstrate the potential of Waku v2
|
||||
in a web environment.
|
||||
It showed how using js-libp2p with a few modifications enabled access to the Waku v2 network.
|
||||
There were still some unresolved challenges.
|
||||
For example, nim-waku only supports TCP connections, which are not supported by browser applications.
|
||||
Hence, to connect to other nodes, the POC was connecting to a NodeJS proxy application using websockets,
|
||||
which in turn could connect to wakunode2 via TCP.
|
||||
|
||||
However, to enable dApp and Wallet developers to easily integrate Waku in their product,
|
||||
we need to give them a library that is easy to use and works out of the box:
|
||||
introducing [JS-Waku](https://github.com/status-im/js-waku).
|
||||
|
||||
JS-Waku is a JavaScript library that allows your dApp, wallet or other web app to interact with the Waku v2 network.
|
||||
It is available right now on [npm](https://www.npmjs.com/package/js-waku):
|
||||
|
||||
`npm install js-waku`.
|
||||
|
||||
As it is written in TypeScript, types are included in the npm package to allow easy integration with TypeScript, ClojureScript and other typed languages that compile to JavaScript.
|
||||
|
||||
Key Waku v2 protocols are already available:
|
||||
[message](https://rfc.vac.dev/spec/14/), [store](https://rfc.vac.dev/spec/13/), [relay](https://rfc.vac.dev/spec/11/) and [light push](https://rfc.vac.dev/spec/19/),
|
||||
enabling your dApp to:
|
||||
|
||||
- Send and receive near-instant messages on the Waku network (relay),
|
||||
- Query nodes for messages that may have been missed, e.g. due to poor cellular network (store),
|
||||
- Send messages with confirmations (light push).
|
||||
|
||||
JS-Waku needs to operate in the same context from which Waku v2 was born:
|
||||
a restricted environment where connectivity or uptime are not guaranteed;
|
||||
JS-Waku brings Waku v2 to the browser.
|
||||
|
||||
## Achievements so far
|
||||
|
||||
We focused the past month on developing a [ReactJS Chat App](https://status-im.github.io/js-waku/).
|
||||
The aim was to create enough building blocks in JS-Waku to enable this showcase web app that
|
||||
we now [use for dogfooding](https://github.com/status-im/nim-waku/issues/399) purposes.
|
||||
|
||||
Most of the effort was on getting familiar with the [js-libp2p](https://github.com/libp2p/js-libp2p) library
|
||||
that we heavily rely on.
|
||||
JS-Waku is the second implementation of the Waku v2 protocol,
|
||||
so a lot of effort on interoperability was needed.
|
||||
For example, to ensure compatibility with the nim-waku reference implementation,
|
||||
we run our [tests against wakunode2](https://github.com/status-im/js-waku/blob/90c90dea11dfd1277f530cf5d683fb92992fe141/src/lib/waku_relay/index.spec.ts#L137) as part of the CI.
|
||||
|
||||
This interoperability effort helped solidify the current Waku v2 specifications
|
||||
by clarifying the usage of topics
|
||||
([#327](https://github.com/vacp2p/rfc/issues/327), [#383](https://github.com/vacp2p/rfc/pull/383)),
|
||||
fixing discrepancies between specs and nim-waku
|
||||
([#418](https://github.com/status-im/nim-waku/issues/418), [#419](https://github.com/status-im/nim-waku/issues/419))
|
||||
and fixing small nim-waku & nim-libp2p bugs
|
||||
([#411](https://github.com/status-im/nim-waku/issues/411), [#439](https://github.com/status-im/nim-waku/issues/439)).
|
||||
|
||||
To fully access the Waku network, JS-Waku needs to enable web apps to connect to nim-waku nodes.
|
||||
A standard way to do so is using secure websockets as it is not possible to connect directly to a TCP port from the browser.
|
||||
Unfortunately websocket support is not yet available in [nim-libp2p](https://github.com/status-im/nim-libp2p/issues/407) so
|
||||
we ended up deploying [websockify](https://github.com/novnc/websockify) alongside wakunode2 instances.
|
||||
|
||||
As we built the [web chat app](https://github.com/status-im/js-waku/tree/main/examples/web-chat),
|
||||
we were able to fine tune the API to provide a simple and succinct interface.
|
||||
You can start a node, connect to other nodes and send a message in less than ten lines of code:
|
||||
|
||||
```javascript
|
||||
import { getStatusFleetNodes, Waku, WakuMessage } from 'js-waku'
|
||||
|
||||
const waku = await Waku.create({})
|
||||
|
||||
const nodes = await getStatusFleetNodes()
|
||||
await Promise.all(nodes.map((addr) => waku.dial(addr)))
|
||||
|
||||
const msg = WakuMessage.fromUtf8String(
|
||||
'Here is a message!',
|
||||
'/my-cool-app/1/my-use-case/proto',
|
||||
)
|
||||
await waku.relay.send(msg)
|
||||
```
|
||||
|
||||
We have also put a bounty at [0xHack](https://0xhack.dev/) for using JS-Waku
|
||||
and ran a [workshop](https://www.youtube.com/watch?v=l77j0VX75QE).
|
||||
We were thrilled to have a couple of hackers create new software using our libraries.
|
||||
One of the projects aimed to create a decentralised, end-to-end encrypted messenger app,
|
||||
similar to what the [ETH-DM](https://rfc.vac.dev/spec/20/) protocol aims to achieve.
|
||||
Another project was a decentralised Twitter platform.
|
||||
Such projects allow us to prioritize the work on JS-Waku and understand how DevEx can be improved.
|
||||
|
||||
As more developers use JS-Waku, we will evolve the API to allow for more custom and fine-tuned usage of the network
|
||||
while preserving this out of the box experience.
|
||||
|
||||
## What's next?
|
||||
|
||||
Next, we are directing our attention towards [Developer Experience](https://github.com/status-im/js-waku/issues/68).
|
||||
We already have [documentation](https://www.npmjs.com/package/js-waku) available but we want to provide more:
|
||||
[Tutorials](https://github.com/status-im/js-waku/issues/56), various examples
|
||||
and showing how [JS-Waku can be used with Web3](https://github.com/status-im/js-waku/issues/72).
|
||||
|
||||
By prioritizing DevEx we aim to enable JS-Waku integration in dApps and wallets.
|
||||
We think JS-Waku builds a strong case for machine-to-machine (M2M) communications.
|
||||
The first use cases we are looking into are dApp notifications:
|
||||
enabling dApps to notify their users directly in their wallets!
|
||||
This leverages Waku as a decentralised infrastructure and standard so that users do not have to open their dApp to be notified
|
||||
of events such as DAO voting.
|
||||
|
||||
We already have some POCs in the pipeline to enable voting and polling on the Waku network,
|
||||
allowing users to save gas by **not** broadcasting each individual vote on the blockchain.
|
||||
|
||||
To facilitate said applications, we are looking at improving integration with Web3 providers by providing examples
|
||||
of signing, validating, encrypting and decrypting messages using Web3.
|
||||
Waku is privacy conscious, so we will also provide signature and encryption examples decoupled from users' Ethereum identity.
|
||||
|
||||
As you can read, we have grand plans for JS-Waku and Waku v2.
|
||||
There is a lot to do, and we would love some help so feel free to
|
||||
check out the new role in our team:
|
||||
[js-waku: Wallet & Dapp Integration Developer](https://status.im/our_team/jobs.html?gh_jid=3157894).
|
||||
We also have a number of [positions](https://status.im/our_team/jobs.html) open to work on Waku protocol and nim-waku.
|
||||
|
||||
If you are as excited as us by JS-Waku, why not build a dApp with it?
|
||||
You can find documentation on the [npmjs page](https://www.npmjs.com/package/js-waku).
|
||||
|
||||
Whether or not you are a developer, you can come chat with us using [WakuJS Web Chat](https://status-im.github.io/js-waku/)
|
||||
or [chat2](https://github.com/status-im/nim-waku/blob/master/docs/tutorial/chat2.md).
|
||||
You can get support in #dappconnect-support on [Vac Discord](https://discord.gg/j5pGbn7MHZ) or [Telegram](https://t.me/dappconnectsupport).
|
||||
If you have any ideas on how Waku could enable a specific dapp or use case, do share, we are always keen to hear it.
|
||||
248
rlog/2021-08-06-coscup-waku-ethereum.mdx
Normal file
@@ -0,0 +1,248 @@
|
||||
---
|
||||
layout: post
|
||||
name: '[Talk at COSCUP] Vac, Waku v2 and Ethereum Messaging'
|
||||
title: '[Talk at COSCUP] Vac, Waku v2 and Ethereum Messaging'
|
||||
date: 2021-08-06 12:00:00
|
||||
authors: oskarth
|
||||
published: true
|
||||
slug: waku-v2-ethereum-coscup
|
||||
categories: research
|
||||
image: /img/coscup-waku/talk.png
|
||||
discuss: https://forum.vac.dev/t/discussion-talk-at-coscup-vac-waku-v2-and-ethereum-messaging/95
|
||||
---
|
||||
|
||||
Learn more about Waku v2, its origins, goals, protocols, implementation and ongoing research. Understand how it is used and how it can be useful for messaging in Ethereum.
|
||||
|
||||
<!--truncate-->
|
||||
|
||||
_This is the English version of a talk originally given in Chinese at COSCUP in Taipei._
|
||||
|
||||
[video recording with Chinese and English subtitles.](https://www.youtube.com/watch?v=s0ATpQ4_XFc)
|
||||
|
||||
---
|
||||
|
||||
## Introduction
|
||||
|
||||
Hi everyone!
|
||||
|
||||
Today I'll talk to you about Waku v2. What it is, what problems it is solving,
|
||||
and how it can be useful for things such as messaging in Ethereum. First, let me
|
||||
start with some brief background.
|
||||
|
||||
## Brief history and background
|
||||
|
||||
Back when Ethereum got started, there used to be this concept of the "holy
|
||||
trinity". You had Ethereum for compute/consensus, Swarm for storage, and Whisper
|
||||
for messaging. This is partly where the term Web3 comes from.
|
||||
|
||||
Status started out as an app with the goal of being a window onto Ethereum and
|
||||
a secure messenger. As one of the few, if not the only, apps using Whisper in
|
||||
production, not to mention on a mobile phone, we quickly realized there were
|
||||
problems with the underlying protocols and infrastructure. Protocols such as
|
||||
Whisper weren't quite ready for prime time yet when it came to things such as
|
||||
scalability and working in the real world.
|
||||
|
||||
As we started addressing some of these challenges, and moved from app
|
||||
development to focusing on protocols, research and infrastructure, we created
|
||||
Vac. Vac is an R&D unit doing protocol research focused on creating modular p2p
|
||||
messaging protocols for private, secure, censorship resistant communication.
|
||||
|
||||
I won't go into too much detail on the issues with Whisper, if you are
|
||||
interested in this check out this talk
|
||||
[here](https://www.youtube.com/watch?v=6lLT33tsJjs) or this
|
||||
[article](https://vac.dev/fixing-whisper-with-waku).
|
||||
|
||||
In a nutshell, we forked Whisper to address immediate shortcomings and this
|
||||
became Waku v1. Waku v2 is a complete rethink, implemented from scratch on top
|
||||
of libp2p. This will be the subject of today's talk.
|
||||
|
||||
## Waku v2
|
||||
|
||||
### Overview
|
||||
|
||||
Waku v2 is a privacy-preserving peer-to-peer messaging protocol for resource
|
||||
restricted devices. We can look at Waku v2 as several things:
|
||||
|
||||
- Set of protocols
|
||||
- Set of implementations
|
||||
- Network of nodes
|
||||
|
||||
Let's first look at what the goals are.
|
||||
|
||||
### Goals
|
||||
|
||||
Waku v2 provides a PubSub based messaging protocol with the following
|
||||
characteristics:
|
||||
|
||||
1. **Generalized messaging**. Applications that require a messaging protocol to
|
||||
communicate human to human, machine to machine, or a mix.
|
||||
2. **Peer-to-peer**. For applications that require a p2p solution.
|
||||
3. **Resource restricted**. For example, running with limited bandwidth, being
|
||||
mostly-offline, or in a browser.
|
||||
4. **Privacy**. Applications that have privacy requirements, such as pseudonymity,
|
||||
metadata protection, etc.
|
||||
|
||||
And to provide these properties in a modular fashion, where applications can
|
||||
choose their desired trade-offs.
|
||||
|
||||
### Protocols
|
||||
|
||||
Waku v2 consists of several protocols. Here we highlight a few of the most
|
||||
important ones:
|
||||
|
||||
- 10/WAKU2 - main specification, details how all the pieces fit together
|
||||
- 11/RELAY - thin layer on top of GossipSub for message dissemination
|
||||
- 13/STORE - fetching of historical messages
|
||||
- 14/MESSAGE - message payload
|
||||
|
||||
This is the recommended subset for a minimal Waku v2 client.
|
||||
|
||||
In addition to this there are many other types of specifications at various
|
||||
stages of maturity, such as: content based filtering, bridge mode to Waku v1,
|
||||
JSON RPC API, zkSNARKS based spam protection with RLN, accounting and
|
||||
settlements with SWAP, fault-tolerant store nodes, recommendations around topic
|
||||
usage, and more.
|
||||
|
||||
See https://rfc.vac.dev/ for a full overview.
|
||||
|
||||
### Implementations
|
||||
|
||||
Waku v2 consists of multiple implementations. This allows for client diversity,
|
||||
makes it easier to strengthen the protocols, and allows people to use Waku v2 in
|
||||
different contexts.
|
||||
|
||||
- nim-waku - the reference client written in Nim, most full-featured.
|
||||
- js-waku - allow usage of Waku v2 from browsers, focus on interacting with dapps.
|
||||
- go-waku - subset of Waku v2 to ease integration into the Status app.
|
||||
|
||||
### Testnet Huilong and dogfooding
|
||||
|
||||
In order to test the protocol we have set up a testnet across all implementations
|
||||
called Huilong. Yes, that's the Taipei subway station!
|
||||
|
||||

|
||||
|
||||
Among us core devs we have disabled the main #waku Discord channel used for
|
||||
development, and people run their own node connected to this toy chat application.
|
||||
|
||||
Feel free to join and say hi! Instructions can be found here:
|
||||
|
||||
- [nim-waku chat](https://github.com/status-im/nim-waku/blob/master/docs/tutorial/chat2.md)
|
||||
|
||||
- [js-waku chat](https://status-im.github.io/js-waku/)
|
||||
|
||||
- [go-waku chat](https://github.com/status-im/go-waku/tree/master/examples/chat2)
|
||||
|
||||
### Research
|
||||
|
||||
While Waku v2 is being used today, we are actively researching improvements.
|
||||
Since the design is modular, we can gracefully introduce new capabilities. Some
|
||||
of these research areas are:
|
||||
|
||||
- Privacy-preserving spam protection using zkSNARKs and RLN
|
||||
- Accounting and settlement of resource usage to incentivize nodes to provide services with SWAP
|
||||
- State synchronization for store protocol to make it easier to run a store node without perfect uptime
|
||||
- Better node discovery
|
||||
- More rigorous privacy analysis
|
||||
- Improving interaction with wallets and dapps
|
||||
|
||||
## Use cases
|
||||
|
||||
Let's look at where Waku v2 is and can be used.
|
||||
|
||||
### Prelude: Topics in Waku v2
|
||||
|
||||
To give some context, there are two different types of topics in Waku v2. One is
|
||||
a PubSub topic, for routing. The other is a content topic, which is used for
|
||||
content based filtering. Here's an example of the default PubSub topic:
|
||||
|
||||
`/waku/2/default-waku/proto`
|
||||
|
||||
This is recommended as it increases privacy for participants and it is stored by
|
||||
default, however this is up to the application.
|
||||
|
||||
The second type of topic is a content topic, which is application specific. For
|
||||
example, here's the content topic used in our testnet:
|
||||
|
||||
`/toychat/2/huilong/proto`
|
||||
|
||||
For more on topics, see https://rfc.vac.dev/spec/23/
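Both example topics above share a four-segment shape. A minimal parser for that shape might look as follows, assuming the segments are an application name, a version, a topic name, and an encoding; the function name and the field names in the returned object are labels chosen for this sketch.

```javascript
// Splits a topic of the form /{application}/{version}/{name}/{encoding}
// into its four segments and rejects anything that does not fit.
function parseContentTopic(topic) {
  const parts = topic.split("/");
  // A well-formed topic starts with "/", so parts[0] is the empty string
  // and exactly four non-empty segments follow.
  const wellFormed =
    parts.length === 5 &&
    parts[0] === "" &&
    parts.slice(1).every((segment) => segment !== "");
  if (!wellFormed) {
    throw new Error(`Malformed content topic: ${topic}`);
  }
  const [, application, version, name, encoding] = parts;
  return { application, version, name, encoding };
}
```

Applied to `/toychat/2/huilong/proto`, this yields the application `toychat`, version `2`, name `huilong` and encoding `proto`.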
|
||||
|
||||
### Status app
|
||||
|
||||
In the Status protocol, content topics - topics in Whisper/Waku v1 - are used for several things:
|
||||
|
||||
- Contact code topic to discover X3DH bundles for perfect forward secrecy
|
||||
- Partitioned into N (currently 5000) content topics to balance privacy with efficiency
|
||||
- Public chats correspond to hash of the plaintext name
|
||||
- Negotiated topic for 1:1 chat with DHKE derived content topic
|
||||
|
||||
See more here https://specs.status.im/spec/10
|
||||
|
||||
Currently, the Status app is in the process of migrating to and testing Waku v2.
|
||||
|
||||
### DappConnect: Ethereum messaging
|
||||
|
||||
It is easy to think of Waku as being for human messaging, since that's how it is
|
||||
primarily used in the Status app, but the goal is to be useful for generalized
|
||||
messaging, which includes Machine-To-Machine (M2M) messaging.
|
||||
|
||||
Recall the concept of the holy trinity with Ethereum/Swarm/Whisper and Web3 that
|
||||
we mentioned in the beginning. Messaging can be used as a building block for
|
||||
dapps, wallets, and users to communicate with each other. It can be used for
|
||||
things such as:
|
||||
|
||||
- Multisig and DAO vote transactions only needing one on-chain operation
|
||||
- Giving dapps ability to send push notifications to users
|
||||
- Giving users ability to directly respond to requests from dapps
|
||||
- Decentralized WalletConnect
|
||||
- Etc
|
||||
|
||||
Basically anything that requires communication and doesn't have to be on-chain.
|
||||
|
||||
### WalletConnect v2
|
||||
|
||||
WalletConnect is an open protocol for connecting dapps to wallets with a QR
|
||||
code. Version 2 is using Waku v2 as a communication channel to do so in a
|
||||
decentralized and private fashion.
|
||||
|
||||

|
||||
|
||||
For more, see: https://docs.walletconnect.org/v/2.0/tech-spec
|
||||
|
||||
WalletConnect v2 is currently in late alpha using Waku v2.
|
||||
|
||||
### More examples
|
||||
|
||||
- Gasless voting and vote aggregation off-chain
|
||||
- Dapp games using Waku as player discovery mechanism
|
||||
- Send encrypted message to someone with an Ethereum key
|
||||
- <Your dapp here>
|
||||
|
||||
These are all things that are in progress / proof of concept stage.
|
||||
|
||||
## Contribute
|
||||
|
||||
We'd love to see contributions of any form!
|
||||
|
||||
- You can play with it here: [nim-waku chat](https://github.com/status-im/nim-waku/blob/master/docs/tutorial/chat2.md) (/ [js-waku browser chat](https://status-im.github.io/js-waku/))
|
||||
- Use Waku to build a dapp: [js-waku docs](https://status-im.github.io/js-waku/docs/)
|
||||
- Contribute to code: [js-waku](https://github.com/status-im/js-waku) / [nim-waku](https://github.com/status-im/nim-waku)
|
||||
- Contribute to specs: [vacp2p/rfc](https://github.com/vacp2p/rfc)
|
||||
- We are hiring: Wallet & Dapp Integration Developer, Distributed Systems Engineer, Protocol Engineer, Protocol Researcher - all [job listings](https://status.im/our_team/jobs.html)
|
||||
- Join our new [Discord](https://discord.gg/bJCTqS5H)
|
||||
|
||||
## Conclusion
|
||||
|
||||
In this talk we've gone over the original vision for Web3 and how Waku came to
|
||||
be. We've also looked at what Waku v2 aims to do. We looked at its protocols,
|
||||
implementations, the current testnet, and, briefly, some ongoing
|
||||
research for Vac.
|
||||
|
||||
We've also looked at some specific use cases for Waku. First we looked at how
|
||||
Status uses it with different topics. Then we looked at how it can be useful for
|
||||
messaging in Ethereum, including for things like WalletConnect.
|
||||
|
||||
I hope this talk gives you a better idea of what Waku is and why it exists, and
|
||||
that it inspires you to contribute, either to Waku itself or by using it in your
|
||||
own project!
|
||||
221
rlog/2021-10-25-waku-v1-vs-waku-v2.mdx
Normal file
@@ -0,0 +1,221 @@
|
||||
---
|
||||
layout: post
|
||||
name: 'Waku v1 vs Waku v2: Bandwidth Comparison'
|
||||
title: 'Waku v1 vs Waku v2: Bandwidth Comparison'
|
||||
date: 2021-11-03 10:00:00
|
||||
authors: hanno
|
||||
published: true
|
||||
slug: waku-v1-v2-bandwidth-comparison
|
||||
categories: research
|
||||
image: /img/waku1-vs-waku2/waku1-vs-waku2-overall-network-size.png
|
||||
discuss: https://forum.vac.dev/t/discussion-waku-v1-vs-waku-v2-bandwidth-comparison/110
|
||||
---
|
||||
|
||||
A local comparison of bandwidth profiles showing significantly improved scalability in Waku v2 over Waku v1.
|
||||
|
||||
<!--truncate-->
|
||||
|
||||
## Background
|
||||
|
||||
The [original plan](https://vac.dev/waku-v2-plan) for Waku v2 suggested theoretical improvements in resource usage over Waku v1,
|
||||
mainly as a result of the improved amplification factors provided by GossipSub.
|
||||
In its turn, [Waku v1 proposed improvements](https://vac.dev/fixing-whisper-with-waku) over its predecessor, Whisper.
|
||||
|
||||
Given that Waku v2 is aimed at resource restricted environments,
|
||||
we are specifically interested in its scalability and resource usage characteristics.
|
||||
However, the theoretical performance improvements of Waku v2 over Waku v1
|
||||
have never been properly benchmarked and tested.
|
||||
|
||||
Although we're working towards a full performance evaluation of Waku v2,
|
||||
this would require significant planning and resources,
|
||||
if it were to simulate "real world" conditions faithfully and measure bandwidth and resource usage across different network connections,
|
||||
robustness against attacks/losses, message latencies, etc.
|
||||
(There already exists a fairly comprehensive [evaluation of GossipSub v1.1](https://research.protocol.ai/publications/gossipsub-v1.1-evaluation-report/vyzovitis2020.pdf),
|
||||
on which [`11/WAKU2-RELAY`](https://rfc.vac.dev/spec/11/) is based.)
|
||||
|
||||
As a starting point,
|
||||
this post contains a limited and local comparison of the _bandwidth_ profile (only) between Waku v1 and Waku v2.
|
||||
It reuses and adapts existing network simulations for [Waku v1](https://github.com/status-im/nim-waku/blob/master/waku/v1/node/quicksim.nim) and [Waku v2](https://github.com/status-im/nim-waku/blob/master/waku/v2/node/quicksim2.nim)
|
||||
and compares bandwidth usage for similar message propagation scenarios.
|
||||
|
||||
## Theoretical improvements in Waku v2
|
||||
|
||||
Messages are propagated in Waku v1 using [flood routing](<https://en.wikipedia.org/wiki/Flooding_(computer_networking)>).
|
||||
This means that every peer will forward every new incoming message to all its connected peers (except the one it received the message from).
|
||||
This inevitably leads to redundant message duplication (termed the _amplification factor_),
|
||||
wasting bandwidth and resources.
|
||||
What's more, we expect this effect to worsen the larger the network becomes,
|
||||
as each _connection_ will receive a copy of each message,
|
||||
rather than a single copy per peer.
|
||||
|
||||
Message routing in Waku v2 follows the `libp2p` _GossipSub_ protocol,
|
||||
which lowers amplification factors by only sending full message contents to a subset of connected peers.
|
||||
As a Waku v2 network grows, each peer will limit its number of full-message ("mesh") peerings -
|
||||
`libp2p` suggests a maximum of `12` such connections per peer.
|
||||
This allows much better scalability than a flood-routed network.
|
||||
From time to time, a Waku v2 peer will send metadata about the messages it has seen to other peers ("gossip" peers).
|
||||
|
||||
See [this explainer](https://hackmd.io/@vac/main/%2FYYlZYBCURFyO_ZG1EiteWg#11WAKU2-RELAY-gossipsub) for a more detailed discussion.
|
||||
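The difference between the two routing strategies can be sketched with a back-of-the-envelope count of full-message copies per published message in a fully connected network. This is a simplified model under stated assumptions, not a description of either implementation: it ignores GossipSub control and gossip traffic, assumes the publisher also sends only to its mesh peers, and the function names are made up for illustration. The cap of `12` is the maximum mentioned above.

```python
def flood_transmissions(n: int) -> int:
    """Copies sent per message when every peer floods a full mesh of n peers."""
    if n < 2:
        return 0
    # The publisher sends to n-1 peers; each of those forwards to the n-2
    # peers other than the one it first received the message from.
    return (n - 1) + (n - 1) * (n - 2)


def capped_mesh_transmissions(n: int, max_mesh_peers: int = 12) -> int:
    """Upper bound on copies when each peer has at most max_mesh_peers full-message peers."""
    if n < 2:
        return 0
    d = min(n - 1, max_mesh_peers)
    # The publisher sends to d mesh peers; every other peer forwards to at
    # most d-1 mesh peers (all but the one it received the message from).
    return d + (n - 1) * (d - 1)


for n in (10, 30, 50, 85, 150):
    flood = flood_transmissions(n)
    mesh = capped_mesh_transmissions(n)
    print(f"{n:>3} nodes: flood={flood:>5}  capped mesh<={mesh:>4}  ratio={flood / mesh:.1f}")
```

In this model the two counts coincide while the network is small enough that every connection is a full-message peering, and diverge once the cap takes effect.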
|
||||
## Methodology
|
||||
|
||||
The results below contain only some scenarios that provide an interesting contrast between Waku v1 and Waku v2.
|
||||
For example, [star network topologies](https://en.wikipedia.org/wiki/Star_network) do not show a substantial difference between Waku v1 and Waku v2.
|
||||
This is because each peer relies on a single connection to the central node for every message,
|
||||
which barely requires any routing:
|
||||
each connection receives a copy of every message for both Waku v1 and Waku v2.
|
||||
Hybrid topologies similarly show a difference between Waku v1 and Waku v2 only for network segments with [mesh-like connections](https://en.wikipedia.org/wiki/Mesh_networking),
|
||||
where routing decisions need to be made.
|
||||
|
||||
For this reason, the following approach applies to all iterations:
|
||||
|
||||
1. Simulations are run **locally**.
|
||||
This limits the size of possible scenarios due to local resource constraints,
|
||||
but is a way to quickly get an approximate comparison.
|
||||
2. Nodes are treated as a **blackbox** for which we only measure bandwidth,
|
||||
using an external bandwidth monitoring tool.
|
||||
In other words, we do not consider differences in the size of the envelope (for v1) or the message (for v2).
|
||||
3. Messages are published at a rate of **50 new messages per second** to each network,
|
||||
except where explicitly stated otherwise.
|
||||
4. Each message propagated in the network carries **8 bytes** of random payload, which is **encrypted**.
|
||||
The same symmetric key cryptographic algorithm (with the same keys) is used in both Waku v1 and v2.
|
||||
5. Traffic in each network is **generated from 10 nodes** (randomly selected) and published in a round-robin fashion to **10 topics** (content topics for Waku v2).
|
||||
In practice, we found no significant difference in _average_ bandwidth usage when tweaking these two parameters (the number of traffic generating nodes and the number of topics).
|
||||
6. Peers are connected in a decentralized **full mesh topology**,
|
||||
i.e. each peer is connected to every other peer in the network.
|
||||
Waku v1 is expected to flood all messages across all existing connections.
|
||||
Waku v2 gossipsub will GRAFT some of these connections for full-message peerings,
|
||||
with the rest being gossip-only peerings.
|
||||
7. After running each iteration, we **verify that messages propagated to all peers** (comparing the number of published messages to the metrics logged by each peer).
|
||||
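The traffic parameters in the list above can be summarised in a small sketch. This is not the code of the actual simulations (those are the `quicksim` scripts linked earlier); `build_schedule`, `ScheduledMessage`, `all_peers_received` and the `topic-N` names are hypothetical. Cycling through the publishing nodes in turn is an assumption, as is the idea that each peer logs a count of unique messages seen. Encryption of the payload is left out.

```python
import itertools
import os
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class ScheduledMessage:
    send_at: float   # seconds since the start of the run
    publisher: int   # index of the node that publishes the message
    topic: str       # hypothetical topic name
    payload: bytes   # random payload, before encryption


def build_schedule(node_count, duration_s, rate_per_s=50, publishers=10,
                   topics=10, payload_bytes=8, seed=None):
    """Plan a run: fixed message rate, random publishers, round-robin topics."""
    rng = random.Random(seed)
    # Pick the traffic-generating nodes at random.
    chosen = rng.sample(range(node_count), k=min(publishers, node_count))
    publisher_cycle = itertools.cycle(chosen)
    # Topics are visited in round-robin order.
    topic_cycle = itertools.cycle(f"topic-{i}" for i in range(topics))
    total = int(duration_s * rate_per_s)
    return [
        ScheduledMessage(
            send_at=i / rate_per_s,
            publisher=next(publisher_cycle),
            topic=next(topic_cycle),
            payload=os.urandom(payload_bytes),
        )
        for i in range(total)
    ]


def all_peers_received(published: int, received_per_peer: dict[int, int]) -> bool:
    """Check that every peer reports as many messages as were published."""
    return all(count == published for count in received_per_peer.values())
```

A run would be considered valid only if `all_peers_received` holds for the counts collected from every peer's metrics.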
|
||||
For Waku v1, nodes are configured as "full" nodes (i.e. with full bloom filter),
|
||||
while Waku v2 nodes are `relay` nodes, all subscribing and publishing to the same PubSub topic.
|
||||
|
||||
## Network size comparison
|
||||
|
||||
### Iteration 1: 10 nodes
|
||||
|
||||
Let's start with a small network of 10 nodes only and see how Waku v1 bandwidth usage compares to that of Waku v2.
|
||||
At this small scale we don't expect to see improved bandwidth usage in Waku v2 over Waku v1,
|
||||
since all connections, for both Waku v1 and Waku v2, will be full-message connections.
|
||||
The number of connections is low enough that Waku v2 nodes will likely GRAFT all connections to full-message peerings,
|
||||
essentially flooding every message on every connection in a similar fashion to Waku v1.
|
||||
If our expectations are confirmed, it helps validate our methodology,
|
||||
showing that it gives more or less equivalent results between Waku v1 and Waku v2 networks.
|
||||
|
||||

|
||||
|
||||
Sure enough, the figure shows that in this small-scale setup,
|
||||
Waku v1 actually has a lower per-peer bandwidth usage than Waku v2.
|
||||
One reason for this may be the larger overall proportion of control messages in a gossipsub-routed network such as Waku v2.
|
||||
These play a larger role when the total network traffic is comparatively low, as in this iteration.
|
||||
Also note that the average bandwidth remains more or less constant as long as the rate of published messages remains stable.
|
||||
|
||||
### Iteration 2: 30 nodes
|
||||
|
||||
Now, let's run the same scenario for a larger network of highly-connected nodes, this time consisting of 30 nodes.
|
||||
At this point, the Waku v2 nodes will start pruning some connections to limit the number of full-message peerings (to a maximum of `12`),
|
||||
while the Waku v1 nodes will continue flooding messages to all connected peers.
|
||||
We therefore expect to see a somewhat improved bandwidth usage in Waku v2 over Waku v1.
|
||||
|
||||

|
||||
|
||||
Bandwidth usage in Waku v2 has increased only slightly from the smaller network of 10 nodes (hovering between 2000 and 3000 kbps).
|
||||
This is because there are only a few more full-message peerings than before.
|
||||
Compare this to the much higher increase in bandwidth usage for Waku v1, which now requires more than 4000 kbps on average.
|
||||
|
||||
### Iteration 3: 50 nodes
|
||||
|
||||
For an even larger network of 50 highly connected nodes,
|
||||
the divergence between Waku v1 and Waku v2 is even larger.
|
||||
The following figure shows comparative average bandwidth usage for a throughput of 50 messages per second.
|
||||
|
||||

|
||||
|
||||
Average bandwidth usage (for the same message rate) has remained roughly the same for Waku v2 as it was for 30 nodes,
|
||||
indicating that the number of full-message peerings per node has not increased.
|
||||
|
||||
### Iteration 4: 85 nodes
|
||||
|
||||
We already see a clear trend in the bandwidth comparisons above,
|
||||
so let's confirm by running the test once more for a network of 85 nodes.
|
||||
Due to local resource constraints, the effective throughput for Waku v1 falls below 50 messages per second,
|
||||
so the v1 results below have been normalized and are therefore approximate.
|
||||
The local Waku v2 simulation maintains the message throughput rate without any problems.
|
||||
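The exact normalisation method is not spelled out here. One plausible approach, shown purely as an assumption, is to scale the measured bandwidth linearly from the effective message rate up to the target rate. The function name and parameters below are hypothetical.

```python
def normalise_bandwidth(measured_kbps: float, effective_rate: float,
                        target_rate: float = 50.0) -> float:
    """Scale a bandwidth measurement to the target message rate.

    Assumes bandwidth grows linearly with message rate, which is only an
    approximation (control traffic, for instance, does not scale this way).
    """
    if effective_rate <= 0:
        raise ValueError("effective_rate must be positive")
    return measured_kbps * (target_rate / effective_rate)
```

Any result obtained this way should be read as approximate, as noted above for the v1 figures.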
|
||||

|
||||
|
||||
### Iteration 5: 150 nodes
|
||||
|
||||
Finally, we simulate message propagation in a network of 150 nodes.
|
||||
Due to local resource constraints, we run this simulation at a lower rate -
|
||||
35 messages per second -
|
||||
and for a shorter amount of time.
|
||||
|
||||

|
||||
|
||||
Notice how the Waku v1 bandwidth usage is now more than 10 times worse than that of Waku v2.
|
||||
This is to be expected, as each Waku v1 node will try to flood each new message to 149 other peers,
|
||||
while the Waku v2 nodes limit their full-message peerings to no more than 12.
|
||||
|
||||
### Discussion
|
||||
|
||||
Let's summarize average bandwidth growth against network growth for a constant message propagation rate.
|
||||
Since we are particularly interested in how Waku v1 compares to Waku v2 in terms of bandwidth usage,
|
||||
the results are normalised to the Waku v2 average bandwidth usage for each network size.
|
||||
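The normalisation described above amounts to dividing the Waku v1 average by the Waku v2 average at each network size. A minimal sketch follows, with a hypothetical function name and no measured values included:

```python
def normalise_to_v2(v1_kbps: dict[int, float],
                    v2_kbps: dict[int, float]) -> dict[int, float]:
    """Return v1 bandwidth as a multiple of v2 bandwidth, keyed by network size.

    Only network sizes present in both inputs are compared.
    """
    ratios = {}
    for size in sorted(v1_kbps.keys() & v2_kbps.keys()):
        if v2_kbps[size] <= 0:
            raise ValueError(f"non-positive v2 bandwidth for {size} nodes")
        ratios[size] = v1_kbps[size] / v2_kbps[size]
    return ratios
```

A ratio of 1.0 means both versions use the same average bandwidth at that network size; values above 1.0 favour Waku v2.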
|
||||

|
||||
|
||||
Extrapolation is a dangerous game,
|
||||
but it's safe to deduce that the divergence will only grow for even larger network topologies.
|
||||
Although control signalling contributes more towards overall bandwidth for Waku v2 networks,
|
||||
this effect becomes less noticeable for larger networks.
|
||||
For network segments with more than ~18 densely connected nodes,
|
||||
the advantage of using Waku v2 over Waku v1 becomes clear.
|
||||
|
||||
## Network traffic comparison
|
||||
|
||||
The analysis above controls the average message rate while network size grows.
|
||||
In reality, however, active users (and therefore message rates) are likely to grow in conjunction with the network.
|
||||
This will have an effect on bandwidth for both Waku v1 and Waku v2, though not in equal measure.
|
||||
Consider the impact of an increasing rate of messages in a network of constant size:
|
||||
|
||||

|
||||
|
||||
The _rate_ of increase in bandwidth for Waku v2 is slower than that for Waku v1 for a corresponding increase in message propagation rate.
|
||||
In fact, for a network of 30 densely-connected nodes,
|
||||
if the message propagation rate increases by 1 per second,
|
||||
Waku v1 requires an increased average bandwidth of almost 70 kbps at each node.
|
||||
A similar traffic increase in Waku v2 requires on average 40 kbps more bandwidth per peer, just over half that of Waku v1.
|
||||
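The arithmetic behind "just over half" can be made explicit. The sketch below uses only the two approximate figures quoted above for a 30-node network and assumes the relationship stays linear over the range of interest; the constant and function names are hypothetical.

```python
# Approximate extra bandwidth (kbps per node) for each additional
# message per second, as quoted above for 30 densely connected nodes.
V1_KBPS_PER_MSG_RATE = 70.0
V2_KBPS_PER_MSG_RATE = 40.0


def extra_bandwidth_kbps(rate_increase: float, kbps_per_msg_rate: float) -> float:
    """Extra average bandwidth per node for a given increase in messages/s."""
    return rate_increase * kbps_per_msg_rate


# Ratio of the two slopes: roughly 0.57, i.e. just over half.
v2_to_v1 = V2_KBPS_PER_MSG_RATE / V1_KBPS_PER_MSG_RATE
print(f"Waku v2 needs about {v2_to_v1:.0%} of the extra bandwidth Waku v1 needs")
```

For example, under this linear assumption, 10 additional messages per second would cost each node roughly 700 kbps in Waku v1 and roughly 400 kbps in Waku v2.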
|
||||
## Conclusions
|
||||
|
||||
- **Waku v2 scales significantly better than Waku v1 in terms of average bandwidth usage**,
|
||||
especially for densely connected networks.
|
||||
- E.g. for a network consisting of **150** or more densely connected nodes,
|
||||
Waku v2 provides more than **10x** better average bandwidth usage rates than Waku v1.
|
||||
- As the network continues to scale, both in absolute terms (number of nodes) and in network traffic (message rates), the disparity between Waku v2 and Waku v1 becomes even larger.
|
||||
|
||||
## Future work
|
||||
|
||||
Now that we've confirmed that Waku v2's bandwidth improvements over its predecessor match theory,
|
||||
we can proceed to a more in-depth characterisation of Waku v2's resource usage.
|
||||
Some questions that we want to answer include:
|
||||
|
||||
- What proportion of Waku v2's bandwidth usage is used to propagate _payload_ versus bandwidth spent on _control_ messaging to maintain the mesh?
|
||||
- To what extent is message latency (time until a message is delivered to its destination) affected by network size and message rate?
|
||||
- How _reliable_ is message delivery in Waku v2 for different network sizes and message rates?
|
||||
- What are the resource usage profiles of other Waku v2 protocols (e.g. [`12/WAKU2-FILTER`](https://rfc.vac.dev/spec/12/) and [`19/WAKU2-LIGHTPUSH`](https://rfc.vac.dev/spec/19/))?
|
||||
|
||||
Our aim is to get ever closer to a "real world" understanding of Waku v2's performance characteristics,
|
||||
identify and fix vulnerabilities
|
||||
and continually improve the efficiency of our suite of protocols.
|
||||
|
||||
## References
|
||||
|
||||
- [Evaluation of GossipSub v1.1](https://research.protocol.ai/publications/gossipsub-v1.1-evaluation-report/vyzovitis2020.pdf)
|
||||
- [Fixing Whisper with Waku](https://vac.dev/fixing-whisper-with-waku)
|
||||
- [GossipSub vs flood routing](https://hackmd.io/@vac/main/%2FYYlZYBCURFyO_ZG1EiteWg#11WAKU2-RELAY-gossipsub)
|
||||
- [Network topologies: star](https://www.techopedia.com/definition/13335/star-topology#:~:text=Star%20topology%20is%20a%20network,known%20as%20a%20star%20network.)
|
||||
- [Network topologies: mesh](https://en.wikipedia.org/wiki/Mesh_networking)
|
||||
- [Waku v2 original plan](https://vac.dev/waku-v2-plan)
|
||||
221
rlog/2021-12-03-ethics-surveillance-tech.mdx
Normal file
@@ -0,0 +1,221 @@
|
||||
---
|
||||
layout: post
|
||||
name: 'Opinion: Pseudo-ethics in the Surveillance Tech Industry'
|
||||
title: 'Opinion: Pseudo-ethics in the Surveillance Tech Industry'
|
||||
date: 2021-12-03 10:00:00
|
||||
authors: circe
|
||||
published: true
|
||||
slug: ethics-surveillance-tech
|
||||
categories: research
|
||||
summary:
|
||||
image: /img/vac.png
|
||||
discuss:
|
||||
---
|
||||
|
||||
A look at typical ethical shortfalls in the global surveillance tech industry.
|
||||
|
||||
<!--truncate-->
|
||||
|
||||
_This is an opinion piece by pseudonymous contributor, circe._
|
||||
|
||||
## Preface
|
||||
|
||||
The Vac team aims to provide a public good in the form of freely available, open source tools and protocols for decentralized communication.
|
||||
As such, we value our independence and the usefulness of our protocols for a wide range of applications.
|
||||
At the same time, we realize that all technical development, including ours, has a moral component.
|
||||
As a diverse team we are guided by a shared devotion to the principles of human rights and liberty.
|
||||
This explains why we place such a high premium on security, censorship-resistance and privacy -
|
||||
a stance we [share with the wider Status Network](https://our.status.im/our-principles/).
|
||||
The post below takes a different approach from our usual more technical analyses,
|
||||
by starting to peel back the curtain on the ethical shortfalls of the global surveillance tech industry.
|
||||
|
||||
## Spotlight on an industry
|
||||
|
||||
[Apple's announcement](https://www.apple.com/newsroom/2021/11/apple-sues-nso-group-to-curb-the-abuse-of-state-sponsored-spyware/) of their lawsuit against Israel's NSO Group
|
||||
marks the latest in a series of recent setbacks for the surveillance tech company.
|
||||
In early November, the [United States blacklisted the firm](https://public-inspection.federalregister.gov/2021-24123.pdf),
|
||||
citing concerns about the use of their spyware by foreign governments targeting civilians such as "journalists, businesspeople, activists" and more.
|
||||
The company is already [embroiled in a lawsuit with WhatsApp](https://www.reuters.com/article/us-facebook-cyber-whatsapp-nsogroup-idUSKBN1X82BE)
|
||||
over their exploit of the chat app's video calling service to install malware on target devices.
|
||||
NSO Group's most infamous product, [Pegasus](https://forbiddenstories.org/case/the-pegasus-project/), operates as a hidden exploit installed on victims' mobile phones,
|
||||
sometimes without even requiring as much as an unguarded click on a malicious link.
|
||||
It has the potential to lay bare, and report to its owners, _everything_ within the reach of the infected device.
|
||||
For most people this amounts to a significant portion of their private lives and thoughts.
|
||||
Pegasus can read your private messages (even encrypted), collect your passwords, record calls, track your location and access your device's microphone and camera.
|
||||
No activity or application on an infected phone would be hidden.
|
||||
|
||||
The latest controversies perhaps owe less to the novelty of the revelations -
|
||||
the existence of Pegasus has been known to civil activists [since at least 2016](https://www.bbc.com/news/technology-37192670).
|
||||
Rather, the public was reminded again of the potential scope of surveillance tech
|
||||
in the indiscriminate use of Pegasus on private citizens.
|
||||
This has far-reaching implications for human freedoms worldwide.
|
||||
Earlier this year, a [leaked list of over 50,000 targets](https://www.theguardian.com/world/2021/jul/18/revealed-leak-uncovers-global-abuse-of-cyber-surveillance-weapon-nso-group-pegasus), or possible targets, of Pegasus included
|
||||
the phone numbers of human rights advocates, independent journalists, lawyers and political activists.
|
||||
This should have come as no surprise.
|
||||
The type of autocratically inclined agents, and governments, who would venture to buy and use such invasive cyber-arms often target those they find politically inconvenient.
|
||||
Pegasus, and similar technologies, simply extend the reach and capacity of such individuals and governments -
|
||||
no border or distance, no political rank or social advantage, no sanctity of profession or regard for dignity,
|
||||
provide any indemnity from becoming a victim.
|
||||
Your best hope is to remain uninteresting enough to escape consideration.
|
||||
|
||||
The NSO Group has, of course, denied allegations of culpability and questions the authenticity of the list.
|
||||
At this stage, the latter is almost beside the point:
|
||||
Amnesty International's cybersecurity team, Security Lab, _did_ find [forensic evidence of Pegasus](https://www.amnesty.org/en/latest/research/2021/07/forensic-methodology-report-how-to-catch-nso-groups-pegasus/#_ftn1) on the phones of several volunteers whose numbers appeared on the original list,
|
||||
including those of journalists and human rights activists.
|
||||
(Security Lab has since opened up their [infection finding tool](https://github.com/mvt-project/mvt) to the public.)
|
||||
French intelligence has similarly [inspected and confirmed](https://www.theguardian.com/news/2021/aug/02/pegasus-spyware-found-on-journalists-phones-french-intelligence-confirms) infection of at least three devices belonging to journalists.
|
||||
The phones of several people who were close to the Saudi journalist, Jamal Khashoggi, were [confirmed hacked](https://www.bbc.com/news/world-57891506)
|
||||
both before and after Khashoggi's brutal murder at the Saudi consulate in Istanbul in 2018.
|
||||
[More reports](https://www.theguardian.com/news/2021/sep/21/hungary-journalist-daniel-nemeth-phones-infected-with-nso-pegasus-spyware) of confirmed Pegasus hacks are still published with some regularity.
|
||||
It is now an open secret that many authoritarian governments have bought Pegasus.
|
||||
It's not difficult to extrapolate from existing reports and such clients' track records
|
||||
what the potential injuries to human freedoms are that they can inflict with access to such a powerful cyberweapon.
|
||||
|
||||
## A typical response
|
||||
|
||||
[NSO's response](https://www.theguardian.com/news/2021/jul/18/response-from-nso-and-governments) to the allegations follows a textbook approach
|
||||
of avoiding earnest ethical introspection on the manufacturing, and selling, of cyber-arms.
|
||||
Firstly, shift ethical responsibility to a predetermined process, a list of checkboxes of your own making.
|
||||
The Group, for example, claims to sell only to "vetted governments", following a classification process
|
||||
of which they have now [published some procedural details](https://www.nsogroup.com/wp-content/uploads/2021/06/ReportBooklet.pdf) but no tangible criteria.
|
||||
The next step is to reaffirm continuously, and repetitively, your dedication to the _legal_ combat against crime,
|
||||
["legitimate law enforcement agencies"](https://www.nsogroup.com/wp-content/uploads/2021/06/ReportBooklet.pdf) (note the almost tautological phrasing),
|
||||
adherence to international arms trade laws,
|
||||
compliance clauses in customer contracts, etc.
|
||||
Thirdly, having been absolved of any moral suspicions that might exist about product and process,
|
||||
from conception to engineering to trade,
|
||||
distance yourself from the consequences of its use in the world.
|
||||
["NSO does not operate its technology, does not collect, nor possesses, nor has any access to any kind of data of its customers."](https://www.theguardian.com/news/2021/jul/18/response-from-nso-and-governments)
|
||||
It is interesting that directly after this statement they claim with contradictory confidence that
|
||||
their "technology was not associated in any way with the heinous murder of Jamal Khashoggi".
|
||||
The unapologetic tone seems hardly appropriate when the same document confirms that the Group had to
|
||||
shut down customers' systems due to "confirmed misuse" and have had to do so "multiple times" in the past.
|
||||
Given all this, the response manages to evade any serious interrogation of the "vetting" process itself,
|
||||
which forced the company to reject "approximately 15% of potential new opportunities for Pegasus" in one year.
|
||||
Courageous.
|
||||
|
||||
We have heard this all before.
|
||||
There exists a multi-billion dollar industry of private companies and engineering firms [thriving on proceeds](https://www.economist.com/business/2019/12/12/offering-software-for-snooping-to-governments-is-a-booming-business) from
|
||||
selling surveillance tools and cyber-arms to dubious agencies and foreign governments.
|
||||
In turn, the most power-hungry and oppressive regimes often _rely_ on such technological innovations -
|
||||
for which they lack the in-country engineering expertise -
|
||||
to maintain control, suppress uprisings, intimidate opposing journalists, and track their citizens.
|
||||
It's a lucrative business opportunity, and resourceful companies have sprung up everywhere to supply this demand,
|
||||
often in countries where citizens, including employees of the company, would be horrified if they were similarly subject to the oppressions of their own products.
|
||||
When, in 2014, Italy's _HackingTeam_ were questioned by the United Nations about their (then alleged) selling of spyware to Sudan,
|
||||
which would have been a contravention of the UN's weapon export ban,
|
||||
they simply replied that their product was not controlled as a weapon and therefore not subject to such scrutiny.
|
||||
They remained within their legal bounds, technically.
|
||||
Furthermore, they similarly shifted ethical responsibility to external standards of legitimacy,
|
||||
claiming their ["software is not sold to governments that are blacklisted by the EU, the US, NATO, and similar international organizations"](https://citizenlab.ca/2014/02/mapping-hacking-teams-untraceable-spyware/).
|
||||
When the company themselves were [hacked in 2015](https://www.wired.com/2015/07/hacking-team-breach-shows-global-spying-firm-run-amok/),
|
||||
revelations (confirmations, that is) of widespread misuse by repressive governments were damaging enough to force them to disappear and rebrand as Memento Labs.
|
||||
[Their website](https://www.mem3nt0.com/en/) boasts an impressive list of statutes, regulations, procedures, export controls and legal frameworks,
|
||||
all of which the rebranded hackers proudly comply with.
|
||||
Surely no further ethical scrutiny is necessary?
|
||||
|
||||
## Ethics != the law
|
||||
|
||||
### The law is trailing behind
|
||||
|
||||
Such recourse to the _legality_ of your action as ethical justification is moot for several reasons.
|
||||
The first is glaringly obvious -
|
||||
our laws are ill-equipped to address the implications of modern technology.
|
||||
Legal systems are a cumbersome inheritance built over generations.
|
||||
This is especially true of the statutes and regulations governing international trade, behind which these companies so often hide.
|
||||
Our best legal systems are trailing miles behind the technology for which we seek guidelines.
|
||||
Legislators are still struggling to make sense of technologies like face recognition,
|
||||
the repercussions of smart devices acting "on their own" and biases in algorithms.
|
||||
To claim you are performing ethical due diligence by resorting to an outdated and incomplete system of legal codes is disingenuous.
|
||||
|
||||
### The law depends on ethics
|
||||
|
||||
The second reason is more central to my argument,
|
||||
and an important flaw in these sleight of hand justifications appearing from time to time in the media.
|
||||
Ethics can in no way be confused as synonymous with legality or legitimacy.
|
||||
These are incommensurable concepts.
|
||||
In an ideal world, of course, the law is meant to track the minimum standards of ethical conduct in a society.
|
||||
Laws are often drafted exactly from some ethical, and practical, impulse to minimize harmful conduct
|
||||
and provide for corrective and punitive measures where transgressions do occur.
|
||||
The law, however, has a much narrower scope than ethics.
|
||||
It can be just or unjust.
|
||||
In fact, it is in need of ethics to constantly reform.
|
||||
Ethics and values are born out of collective self-reflection.
|
||||
They develop in our conversation with ourselves and others about the type of society we strive for.
|
||||
As such, an ethical worldview summarizes our deepest intuitions about how we should live and measure our impact on the world.
|
||||
For this reason, ethics is primarily enforced by social and internal pressures, not legal boundaries -
|
||||
our desire to do what _ought_ to be done, however we define that.
|
||||
Ethics is therefore a much grander scheme than global legal systems
|
||||
and the diplomatic frameworks that grant legitimacy to governments.
|
||||
These are but one limited outflow of the human aspiration to form societies in accordance with our ideologies and ethics.
|
||||
|
||||
### International law is vague and exploitable
|
||||
|
||||
Of course, the cyber-arms trade has a favorite recourse, _international_ law, which is even more limited.
|
||||
Since such products are seldom sold to governments and agencies within the country of production,
|
||||
it enables a further distancing from consequences.
|
||||
Many private surveillance companies are based in fairly liberal societies with (seemingly) strict emphases on human rights in their domestic laws.
|
||||
International laws are much more complicated - for opportunists a synonym for "more grey areas in which to hide".
|
||||
Company conduct can now be governed, and excused, by a system that follows
|
||||
the whims of autocrats with exploitative intent and vastly different ethical conceptions from the company's purported aims.
|
||||
International law, and the ways it is most often enforced by way of, say, UN-backed sanctions,
|
||||
have long been shaped by the compromises of international diplomacy.
|
||||
To be blunt: these laws are weak and subject to exactly the sort of narrow interests behind which mercenaries have always hidden.
|
||||
The surveillance tech industry is no exception.
|
||||
|
||||
## Conclusion
|
||||
|
||||
My point is simple:
|
||||
selling cyber-arms with the potential to become vast tools of oppression to governments and bodies with blatant histories of human rights violations,
|
||||
and all but the publicly announced intention to continue operating in this way,
|
||||
is categorically unconscionable.
|
||||
This seems obvious no matter what ethics system you argue from,
|
||||
provided it harbors any consideration for human dignity and freedom.
|
||||
It is a sign of poor moral discourse that such recourses to law and legitimacy are often considered synonymous with ethical justification.
|
||||
"_I have acted within the bounds of law_", _"We supply only to legitimate law enforcement agencies"_, etc. are no substitutes.
|
||||
Ethical conduct requires an honest evaluation of an action against some conception of "the good",
|
||||
however you define that.
|
||||
Too often the surveillance tech industry precisely sidesteps this question,
|
||||
both in internal processes and external rationalisations to a concerned public.
|
||||
|
||||
John Locke, he of the life-liberty-and-property, articulated the idea that government exists solely through the consent of the governed.
|
||||
Towards the end of the 17th century, he wrote in his _Second Treatise of Civil Government_,
|
||||
"[w]henever legislators endeavor to take away,
|
||||
and destroy the property of the people, or to reduce them to slavery under arbitrary power,
|
||||
they put themselves in a state of war with the people, who are thereupon absolved from any further obedience".
|
||||
The inference is straightforward and humanist in essence:
|
||||
legitimacy is not something that is conferred by governments and institutions.
|
||||
Rather, they derive their legitimacy from us, their citizens, holding them to standards of ethics and societal ideals.
|
||||
This legitimacy only remains intact as long as this mandate is honored and continuously extended by a well-informed public.
|
||||
This is the principle of informed consent on which all reciprocal ethics is based.
|
||||
|
||||
The surveillance tech industry may well have nothing more or less noble in mind than profit-making within legal bounds
|
||||
when developing and selling their products.
|
||||
However, when such companies are revealed again and again to have supplied tools of gross human rights violations to known human rights violators,
|
||||
they will do well to remember that ethics always _precedes_ requirements of legality and legitimacy.
|
||||
It is a fallacy to take normative guidance from the concept of "legitimacy"
|
||||
if the concept itself depends on such normative guidelines for definition.
|
||||
Without examining the ethical standards by which institutions, governments, and laws were created,
|
||||
no value-judgements about their legitimacy can be made.
|
||||
Hiding behind legal compliance as a substitute for moral justification is not enough.
|
||||
Targets of increasingly invasive governmental snooping are too often chosen precisely to suppress the mechanisms from which the legitimacy of such governments flows:
|
||||
the consent of ordinary civilians.
|
||||
Free and fair elections, free speech, free media, freedom of thought are all at risk.
|
||||
|
||||
## References
|
||||
|
||||
- [Status Principles](https://our.status.im/our-principles/)
|
||||
- [Federal Register: Addition of Certain Entities to the Entity List](https://public-inspection.federalregister.gov/2021-24123.pdf)
|
||||
- [forbiddenstories.org: The Pegasus Project](https://forbiddenstories.org/case/the-pegasus-project/)
|
||||
- [theguardian.com: The Pegasus Project](https://www.theguardian.com/news/series/pegasus-project)
|
||||
- [amnesty.org Forensic Methodology Report: How to catch NSO Group’s Pegasus](https://www.amnesty.org/en/latest/research/2021/07/forensic-methodology-report-how-to-catch-nso-groups-pegasus/#_ftn1)
|
||||
- [Apple sues NSO Group to curb the abuse of state-sponsored spyware](https://www.apple.com/newsroom/2021/11/apple-sues-nso-group-to-curb-the-abuse-of-state-sponsored-spyware/)
|
||||
- [bbc.com: Who are the hackers who cracked the iPhone?](https://www.bbc.com/news/technology-37192670)
|
||||
- [bbc.com: Pegasus: Who are the alleged victims of spyware targeting?](https://www.bbc.com/news/world-57891506)
|
||||
- [citizenlab.ca: Mapping Hacking Team’s “Untraceable” Spyware](https://citizenlab.ca/2014/02/mapping-hacking-teams-untraceable-spyware/)
|
||||
- [economist.com: Offering software for snooping to governments is a booming business](https://www.economist.com/business/2019/12/12/offering-software-for-snooping-to-governments-is-a-booming-business)
|
||||
- [Memento Labs](https://www.mem3nt0.com/en/)
|
||||
- [Mobile Verification Toolkit to identify compromised devices](https://github.com/mvt-project/mvt)
|
||||
- [NSO Group: Transparency and Responsibility Report 2021](https://www.nsogroup.com/wp-content/uploads/2021/06/ReportBooklet.pdf)
|
||||
- [reuters.com: WhatsApp sues Israel's NSO for allegedly helping spies hack phones around the world](https://www.reuters.com/article/us-facebook-cyber-whatsapp-nsogroup-idUSKBN1X82BE)
|
||||
- [wired.com: Hacking Team Breach Shows a Global Spying Firm Run Amok](https://www.wired.com/2015/07/hacking-team-breach-shows-global-spying-firm-run-amok/)
|
||||
296
rlog/2022-04-12-introducing-nwaku.mdx
Normal file
@@ -0,0 +1,296 @@
|
||||
---
|
||||
layout: post
|
||||
name: 'Introducing nwaku'
|
||||
title: 'Introducing nwaku'
|
||||
date: 2022-04-12 10:00:00
|
||||
authors: hanno
|
||||
published: true
|
||||
slug: introducing-nwaku
|
||||
categories: research
|
||||
image: /img/vac.png
|
||||
discuss: https://forum.vac.dev/
|
||||
toc_min_heading_level: 2
|
||||
toc_max_heading_level: 5
|
||||
---
|
||||
|
||||
Introducing nwaku, a Nim-based Waku v2 client, including a summary of recent developments and preview of current and future focus areas.
|
||||
|
||||
<!--truncate-->
|
||||
|
||||
## Background
|
||||
|
||||
If you've been following our [research log](https://vac.dev/research-log/),
|
||||
you'll know that many things have happened in the world of Waku v2 since [our last general update](/waku-v2-ethereum-coscup).
|
||||
In line with our [long term goals](https://vac.dev/#about),
|
||||
we've introduced new protocols,
|
||||
tweaked our existing protocols
|
||||
and expanded our team.
|
||||
We've also shown [in a series of practical experiments](/waku-v1-v2-bandwidth-comparison) that Waku v2 does indeed deliver on some of the [theoretical advantages](/waku-v2-plan) it was designed to have over its predecessor, Waku v1.
|
||||
A [sustainability and business workshop](https://forum.vac.dev/t/vac-sustainability-and-business-workshop/116) led to the formulation of a clearer vision for Vac as a team.
|
||||
|
||||
From the beginning, our protocol development has been complemented by various client implementations of these protocols,
|
||||
first in [Nim](https://github.com/status-im/nim-waku),
|
||||
but later also in [JavaScript](https://github.com/status-im/js-waku)
|
||||
and [Go](https://github.com/status-im/go-waku).
|
||||
A follow-up post will clarify the purposes, similarities and differences between these three clients.
|
||||
The [Nim client](https://github.com/status-im/nim-waku/tree/d2fccb5220144893f994a67f2cc26661247f101f/waku/v2) is our reference implementation,
|
||||
developed by the research team in parallel with the specs
|
||||
and building on a home-grown implementation of [`libp2p`](https://github.com/status-im/nim-libp2p).
|
||||
The Nim client is suitable to run as [a standalone adaptive node](/waku-update),
|
||||
managed by individual operators
|
||||
or as an encapsulated service node in other applications.
|
||||
This post looks at some recent developments within the Nim client.
|
||||
|
||||
## 1. _**nim-waku**_ is now known as _**nwaku**_
|
||||
|
||||
Pronounced NWHA-koo.
|
||||
You may already have seen us refer to "`nwaku`" on Vac communication channels,
|
||||
but it is now official:
|
||||
The `nim-waku` Waku v2 client has been named `nwaku`.
|
||||
Why? Well, we needed a recognizable name for our client that could easily be referred to in everyday conversations
|
||||
and `nim-waku` just didn't roll off the tongue.
|
||||
We've followed the example of the closely related [`nimbus` project](https://github.com/status-im/nimbus-eth2) to find a punchier name
|
||||
that explicitly links the client to both the Waku set of protocols and the Nim language.
|
||||
|
||||
## 2. Improvements in stability and performance
|
||||
|
||||
The initial implementation of Waku v2 demonstrated how the suite of protocols can be applied
|
||||
to form a generalized, peer-to-peer messaging network,
|
||||
while addressing a wide range of adaptive requirements.
|
||||
This allowed us to lift several protocol [specifications](https://rfc.vac.dev/) from `raw` to `draft` status,
|
||||
indicating that a reference implementation exists for each.
|
||||
However, as internal dogfooding increased and more external applications started using `nwaku`,
|
||||
we stepped up our focus on the client's stability and performance.
|
||||
This is especially true where we want `nwaku` to run unsupervised in a production environment
|
||||
without any degradation in the services it provides.
|
||||
|
||||
Some of the more significant productionization efforts over the last couple of months included:
|
||||
|
||||
1. Reworking the `store` implementation to maintain stable memory usage
|
||||
while storing historical messages
|
||||
and serving multiple clients querying history simultaneously.
|
||||
Previously, a `store` node would see gradual service degradation
|
||||
due to inefficient memory usage when responding to history queries.
|
||||
Queries that often took longer than 8 minutes now complete in under 100 ms.
|
||||
|
||||
2. Improved peer management.
|
||||
For example, `filter` nodes will now remove unreachable clients after a number of connection failures,
|
||||
whereas they would previously keep accumulating dead peers.
|
||||
|
||||
3. Improved disk usage.
|
||||
`nwaku` nodes that persist historical messages on disk now manage their own storage size based on the `--store-capacity` option.
|
||||
This can significantly improve node start-up times.
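
To make the capacity-based pruning in item 3 more concrete, here is a minimal Python sketch of a store that keeps only the most recent messages up to a fixed capacity. It is an illustration only: `nwaku` itself is written in Nim, and the `BoundedStore` and `StoredMessage` names, as well as the oldest-first eviction policy, are assumptions made for this example rather than a description of the actual implementation.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class StoredMessage:
    """Hypothetical record for one historical message."""
    timestamp: int
    content_topic: str
    payload: bytes


class BoundedStore:
    """Keeps at most `capacity` messages, evicting the oldest first (assumed policy)."""

    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        # A deque with maxlen silently drops the oldest entry when full.
        self._messages = deque(maxlen=capacity)

    def put(self, message: StoredMessage) -> None:
        self._messages.append(message)

    def query(self, content_topic: str) -> list:
        """Return stored messages for one content topic, oldest first."""
        return [m for m in self._messages if m.content_topic == content_topic]

    def __len__(self) -> int:
        return len(self._messages)


store = BoundedStore(capacity=2)
for t in range(3):
    store.put(StoredMessage(t, "/app/1/chat/proto", b"hello"))
```

With a capacity of 2, storing three messages leaves only the two newest available to history queries, so memory and disk usage stay bounded regardless of how long the node runs.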
|
||||
|
||||
More stability issues may be addressed in future as `nwaku` matures,
|
||||
but we've noticed a marked improvement in the reliability of running `nwaku` nodes.
|
||||
These include environments where `nwaku` nodes are expected to run with a long uptime.
|
||||
Vac currently operates two long-running fleets of `nwaku` nodes, `wakuv2.prod` and `wakuv2.test`,
|
||||
for internal dogfooding and
|
||||
to serve as experimental bootstrapping nodes.
|
||||
Status has also recently deployed similar fleets for production and testing based on `nwaku`.
|
||||
Our goal is to have `nwaku` be stable, performant and flexible enough
|
||||
to be an attractive option for operators to run and maintain their own Waku v2 nodes.
|
||||
See also the [future work](#future-work) section below for more on our general goal of _`nwaku` for operators_.
|
||||
|
||||
## 3. Improvements in interoperability
|
||||
|
||||
We've implemented several features that improve `nwaku`'s usability in different environments
|
||||
and its interoperability with other Waku v2 clients.
|
||||
One major step forward here was adding support for both secure and unsecured WebSocket connections as `libp2p` transports.
|
||||
This allows direct connectivity with `js-waku`
|
||||
and paves the way for native browser usage.
|
||||
We've also added support for parsing and resolving DNS-type `multiaddrs`,
|
||||
i.e. multiaddress protocol schemes [`dns`, `dns4`, `dns6` and `dnsaddr`](https://github.com/multiformats/multiaddr/blob/b746a7d014e825221cc3aea6e57a92d78419990f/protocols.csv#L8-L11).
|
||||
A `nwaku` node can now also be [configured with its own IPv4 DNS domain name](https://github.com/status-im/nim-waku/tree/d2fccb5220144893f994a67f2cc26661247f101f/waku/v2#configuring-a-domain-name)
|
||||
allowing dynamic IP address allocation without impacting a node's reachability by its peers.
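
As a rough illustration of what recognizing a DNS-type multiaddress involves, the following Python sketch scans a multiaddress string for one of the four DNS protocol schemes and extracts the domain name. The function name, the example domain and the peer ID placeholder are hypothetical, and the plain string splitting is a simplification of real multiaddr parsing, which a client would normally delegate to its `libp2p` library.

```python
from typing import Optional, Tuple

# The four DNS-type multiaddr protocol schemes mentioned above.
DNS_PROTOCOLS = {"dns", "dns4", "dns6", "dnsaddr"}


def dns_component(multiaddr: str) -> Optional[Tuple[str, str]]:
    """Return (protocol, domain) for the first DNS-type component, or None.

    Simplified: walks adjacent path segments instead of using a full
    multiaddr codec, so it does not validate the rest of the address.
    """
    parts = multiaddr.strip("/").split("/")
    for name, value in zip(parts, parts[1:]):
        if name in DNS_PROTOCOLS:
            return name, value
    return None


# Hypothetical addresses; "PEER_ID" stands in for a real peer identifier.
by_name = dns_component("/dns4/node.example.org/tcp/60000/p2p/PEER_ID")
by_ip = dns_component("/ip4/192.0.2.10/tcp/60000/p2p/PEER_ID")
```

Here `by_name` holds the scheme and domain that still need to be resolved, while `by_ip` is `None` because the address already contains an IP literal.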
|
||||
|
||||
## 4. Peer discovery
|
||||
|
||||
_Peer discovery_ is the method by which nodes become aware of each other’s existence.
|
||||
The question of peer discovery in a Waku v2 network has been a focus area since the protocol was first conceptualized.
|
||||
Since then several different approaches to discovery have been proposed and investigated.
|
||||
We've implemented three discovery mechanisms in `nwaku` so far:
|
||||
|
||||
### DNS-based discovery
|
||||
|
||||
`nwaku` nodes can retrieve an authenticated, updateable list of peers via DNS to bootstrap connection to a Waku v2 network.
|
||||
Our implementation is based on [EIP-1459](https://eips.ethereum.org/EIPS/eip-1459).
|
||||
|
||||
### GossipSub peer exchange
|
||||
|
||||
[GossipSub Peer Exchange (PX)](https://github.com/libp2p/specs/blob/10712c55ab309086a52eec7d25f294df4fa96528/pubsub/gossipsub/gossipsub-v1.1.md#prune-backoff-and-peer-exchange) is a GossipSub v1.1 mechanism
|
||||
whereby a pruning peer may provide a pruned peer with a set of alternative peers
|
||||
where it can connect to reform its mesh.
|
||||
This is a very suitable mechanism to gradually discover more peers
|
||||
from an initial connection to a small set of bootstrap peers.
|
||||
It is enabled in a `nwaku` node by default.
|
||||
|
||||
### Waku Node Discovery Protocol v5
|
||||
|
||||
This is a DHT-based discovery mechanism adapted to store and relay _node records_.
|
||||
Our implementation is based on [Ethereum's Discovery v5 protocol](https://github.com/ethereum/devp2p/blob/fa6428ada7385c13551873b2ae6ad2457c228eb8/discv5/discv5-theory.md)
|
||||
with some [minor modifications](https://rfc.vac.dev/spec/33/) to isolate our discovery network from that of Ethereum.
|
||||
The decision to separate the Waku Discovery v5 network from Ethereum's was made on considerations of lookup efficiency.
|
||||
This comes at a possible tradeoff in network resilience.
|
||||
We are considering merging with the Ethereum Discovery v5 network in future,
|
||||
or even implementing a hybrid solution.
|
||||
[This post](https://forum.vac.dev/t/waku-v2-discv5-roadmap-discussion/121/8) explains the decision and future steps.
|
||||
|
||||
## 5. Spam protection using RLN
|
||||
|
||||
An early addition to our suite of protocols was [an extension of `11/WAKU-RELAY`](https://rfc.vac.dev/spec/32/)
|
||||
that provided spam protection using [Rate Limiting Nullifiers (RLN)](https://rfc.vac.dev/spec/32/).
|
||||
The `nwaku` client now contains a working demonstration and integration of RLN relay.
|
||||
Check out [this tutorial](https://github.com/status-im/nim-waku/blob/ee96705c7fbe4063b780ac43b7edee2f6c4e351b/docs/tutorial/rln-chat2-live-testnet.md) to see the protocol in action using a toy chat application built on `nwaku`.
|
||||
We'd love for people to join us in dogfooding RLN spam protection as part of our operator incentive testnet.
|
||||
Feel free to join our [Vac Discord](https://discord.gg/KNj3ctuZvZ) server
|
||||
and head to the `#rln` channel for more information.
|
||||
|
||||
## Future work
|
||||
|
||||
As we continue working towards our goal of a fully decentralized, generalized and censorship-resistant messaging protocol,
|
||||
these are some of the current and future focus areas for `nwaku`:
|
||||
|
||||
### Reaching out to operators:
|
||||
|
||||
We are starting to push for operators to run and maintain their own Waku v2 nodes,
|
||||
preferably contributing to the default Waku v2 network as described by the default pubsub topic (`/waku/2/default-waku/proto`).
|
||||
Amongst other things, a large fleet of stable operator-run Waku v2 nodes will help secure the network,
|
||||
provide valuable services to a variety of applications
|
||||
and ensure the future sustainability of both Vac as a research organization and the Waku suite of protocols.
|
||||
|
||||
We are targeting `nwaku` as the main option for operator-run nodes.
|
||||
Specifically, we aim to provide through `nwaku`:
|
||||
|
||||
1. a lightweight and robust Waku v2 client.
|
||||
This client must be first in line to support innovative and new Waku v2 protocols,
|
||||
but configurable enough to serve the adaptive needs of various operators.
|
||||
2. an easy-to-follow guide for operators to configure,
|
||||
set up and maintain their own nodes
|
||||
3. a set of operator-focused tools to monitor and maintain a running node
|
||||
|
||||
### Better conversational security layer guarantees
|
||||
|
||||
Conversational security guarantees in Waku v2 are currently designed around the Status application.
|
||||
Developers building their own applications on top of Waku would therefore
|
||||
either have to reimplement a set of tools similar to Status
|
||||
or build their own security solutions on the application layer above Waku.
|
||||
We are working on [a set of features](https://github.com/vacp2p/research/issues/97) built into Waku
|
||||
that will provide the general security properties Waku users may desire
|
||||
and do so in a modern and simple way.
|
||||
This is useful for applications outside of Status that want similar security guarantees.
|
||||
As a first step, we've already made good progress toward [integrating noise handshakes](https://forum.vac.dev/t/noise-handshakes-as-key-exchange-mechanism-for-waku2/130) as a key exchange mechanism in Waku v2.
|
||||
|
||||
### Protocol incentivization
|
||||
|
||||
We want to design incentivization around our protocols to encourage desired behaviors in the Waku network,
|
||||
rewarding nodes providing costly services
|
||||
and punishing adversarial actions.
|
||||
This will increase the overall security of the network
|
||||
and encourage operators to run their own Waku nodes.
|
||||
In turn, the sustainability of Vac as an organization will be better guaranteed.
|
||||
As such, protocol incentivization was a major focus in our recent [Vac Sustainability and Business Workshop](https://forum.vac.dev/t/vac-sustainability-and-business-workshop/).
|
||||
Our first step here is to finish integrating RLN relay into Waku
|
||||
with blockchain interaction to manage members,
|
||||
punish spammers
|
||||
and reward spam detectors.
|
||||
After this, we want to design monetary incentivization for providers of `store`, `lightpush` and `filter` services.
|
||||
This may also tie into a reputation mechanism for service nodes based on a network-wide consensus on service quality.
|
||||
A big challenge for protocol incentivization is doing it in a private fashion,
|
||||
so we can keep similar metadata protection guarantees as the Waku base layer.
|
||||
This ties into our focus on [Zero Knowledge tech](https://forum.vac.dev/t/vac-3-zk/97).
|
||||
|
||||
### Improved store capacity
|
||||
|
||||
The `nwaku` store currently serves as an efficient in-memory store for historical messages,
|
||||
dimensioned by the maximum number of messages the store node is willing to keep.
|
||||
This makes the `nwaku` store appropriate for keeping history over a short term
|
||||
without any time-based guarantees,
|
||||
but with the advantage of providing fast responses to history queries.
|
||||
Some applications, such as Status, require longer-term historical message storage
|
||||
with time-based dimensioning
|
||||
to guarantee that messages will be stored for a specified minimum period.
|
||||
Because of the relatively high cost of memory compared to disk space,
|
||||
a higher capacity store, with time guarantees, should operate as a disk-only database of historical messages.
|
||||
This is an ongoing effort.
|
||||
|
||||
### Multipurpose discovery
|
||||
|
||||
In addition to [the three discovery methods](#4-peer-discovery) already implemented in `nwaku`,
|
||||
we are working on improving discovery on at least three fronts:
|
||||
|
||||
#### _Capability discovery:_
|
||||
|
||||
Waku v2 nodes may be interested in peers with specific capabilities, for example:
|
||||
|
||||
1. peers within a specific pubsub topic mesh,
|
||||
2. peers with **store** capability,
|
||||
3. **store** peers with x days of history for a specific content topic, etc.
|
||||
|
||||
Capability discovery entails mechanisms by which such capabilities can be advertised and discovered/negotiated.
|
||||
One major hurdle to overcome is the increased complexity of finding a node with specific capabilities within the larger network (a needle in a haystack).
|
||||
See the [original problem statement](https://github.com/vacp2p/rfc/issues/429) for more.
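
The local half of this problem, selecting suitable peers once their capabilities are known, can be sketched in a few lines of Python. The `PeerRecord` type, the capability strings and the peer names below are assumptions for illustration; how capabilities are actually encoded, advertised and searched for across the network is exactly what is still being worked out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PeerRecord:
    """Hypothetical view of a discovered peer and what it advertises."""
    peer_id: str
    capabilities: frozenset


def peers_with(records, *required):
    """Return the records that advertise every required capability."""
    wanted = set(required)
    return [r for r in records if wanted <= r.capabilities]


records = [
    PeerRecord("peer-a", frozenset({"relay"})),
    PeerRecord("peer-b", frozenset({"relay", "store"})),
    PeerRecord("peer-c", frozenset({"relay", "store", "filter"})),
]

store_peers = peers_with(records, "store")
store_and_filter = peers_with(records, "store", "filter")
```

This linear filter only works over records a node already holds. The harder part described above is finding such peers in the wider network without first collecting records for most of it.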
|
||||
|
||||
#### _Improvements in Discovery v5_
|
||||
|
||||
Of the implemented discovery methods,
|
||||
Discovery v5 best addresses our need for a decentralized and scalable discovery mechanism.
|
||||
With the basic implementation done,
|
||||
there are some improvements planned for Discovery v5,
|
||||
including methods to increase security such as merging with the Ethereum Discovery v5 network,
|
||||
introducing explicit NAT traversal
|
||||
and utilizing [topic advertisement](https://github.com/ethereum/devp2p/blob/fa6428ada7385c13551873b2ae6ad2457c228eb8/discv5/discv5-theory.md#topic-advertisement).
|
||||
The [Waku v2 Discovery v5 Roadmap](https://forum.vac.dev/t/waku-v2-discv5-roadmap-discussion/121) contains more details.
|
||||
|
||||
#### _Generalized peer exchange_
|
||||
|
||||
`nwaku` already implements [GossipSub peer exchange](https://github.com/libp2p/specs/blob/10712c55ab309086a52eec7d25f294df4fa96528/pubsub/gossipsub/gossipsub-v1.1.md#prune-backoff-and-peer-exchange).
|
||||
We now need a general request-response mechanism outside of GossipSub
|
||||
by which a node may learn about other Waku v2 nodes
|
||||
by requesting and receiving a list of peers from a neighbor.
|
||||
This could, for example, be a suitable way for resource-restricted devices to request a stronger peer
|
||||
to perform a random Discovery v5 lookup on their behalf
|
||||
or simply to be informed of a subset of the peers known to that neighbor.
|
||||
See [this issue](https://github.com/vacp2p/rfc/issues/495) for more.
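
A minimal sketch of such a request-response exchange, assuming a responder that simply returns a random subset of the peers it knows, could look as follows in Python. The message types, field names and the `handle_request` function are hypothetical and do not reflect a specified Waku v2 protocol.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class PeerExchangeRequest:
    """Hypothetical request: how many peers the requester would like."""
    max_peers: int


@dataclass(frozen=True)
class PeerExchangeResponse:
    """Hypothetical response: a subset of the responder's known peers."""
    peers: tuple


def handle_request(request, known_peers, rng=None):
    """Answer a request with a random sample of known peers."""
    if rng is None:
        rng = random.Random()
    # Never return more peers than requested or than are known.
    count = max(0, min(request.max_peers, len(known_peers)))
    return PeerExchangeResponse(peers=tuple(rng.sample(list(known_peers), count)))


known = ["peer-a", "peer-b", "peer-c", "peer-d"]
response = handle_request(PeerExchangeRequest(max_peers=2), known)
```

A resource-restricted node would send the request to a neighbor and add the returned peers to its own peer store, without running a discovery protocol itself.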
|
||||
|
||||
---
|
||||
|
||||
This concludes a general outline of some of the main recent developments in the `nwaku` client
|
||||
and a summary of the current and future focus areas.
|
||||
Much more is happening behind the scenes, of course,
|
||||
so for more information, or to join the conversation,
|
||||
feel free to join our [Vac Discord](https://discord.gg/KNj3ctuZvZ) server
|
||||
or to check out the [`nwaku` repo on Github](https://github.com/status-im/nim-waku).
|
||||
You can also view the changelog for past releases [here](https://github.com/status-im/nim-waku/releases).
|
||||
|
||||
## References
|
||||
|
||||
- [17/WAKU-RLN-RELAY](https://rfc.vac.dev/spec/17/)
|
||||
- [32/RLN](https://rfc.vac.dev/spec/32/)
|
||||
- [33/WAKU2-DISCV5](https://rfc.vac.dev/spec/33/)
|
||||
- [Capabilities advertising](https://github.com/vacp2p/rfc/issues/429)
|
||||
- [Configuring a domain name](https://github.com/status-im/nim-waku/tree/d2fccb5220144893f994a67f2cc26661247f101f/waku/v2#configuring-a-domain-name)
|
||||
- [Conversational security](https://github.com/vacp2p/research/issues/97)
|
||||
- [Discovery v5 Topic Advertisement](https://github.com/ethereum/devp2p/blob/fa6428ada7385c13551873b2ae6ad2457c228eb8/discv5/discv5-theory.md#topic-advertisement)
|
||||
- [EIP-1459](https://eips.ethereum.org/EIPS/eip-1459)
|
||||
- [GossipSub Peer Exchange](https://github.com/libp2p/specs/blob/10712c55ab309086a52eec7d25f294df4fa96528/pubsub/gossipsub/gossipsub-v1.1.md#prune-backoff-and-peer-exchange)
|
||||
- [go-waku](https://github.com/status-im/go-waku)
|
||||
- [js-waku](https://github.com/status-im/js-waku)
|
||||
- [`multiaddr` formats](https://github.com/multiformats/multiaddr/blob/b746a7d014e825221cc3aea6e57a92d78419990f/protocols.csv#L8-L11)
|
||||
- [nimbus-eth2](https://github.com/status-im/nimbus-eth2)
|
||||
- [nim-libp2p](https://github.com/status-im/nim-libp2p)
|
||||
- [nim-waku](https://github.com/status-im/nim-waku)
|
||||
- [nim-waku releases](https://github.com/status-im/nim-waku/releases)
|
||||
- [Node Discovery Protocol v5 - Theory](https://github.com/ethereum/devp2p/blob/fa6428ada7385c13551873b2ae6ad2457c228eb8/discv5/discv5-theory.md)
|
||||
- [Noise handshakes](https://forum.vac.dev/t/noise-handshakes-as-key-exchange-mechanism-for-waku2/130)
|
||||
- [RLN tutorial](https://github.com/status-im/nim-waku/blob/ee96705c7fbe4063b780ac43b7edee2f6c4e351b/docs/tutorial/rln-chat2-live-testnet.md)
|
||||
- [Vac <3 ZK](https://forum.vac.dev/t/vac-3-zk/97)
|
||||
- [Vac About page](https://vac.dev/#about)
|
||||
- [Vac Research log](https://vac.dev/research-log/)
|
||||
- [Vac RFC site](https://rfc.vac.dev/)
|
||||
- [Vac Sustainability and Business Workshop](https://forum.vac.dev/t/vac-sustainability-and-business-workshop/)
|
||||
- [Waku Update](/waku-update)
|
||||
- [Waku v1 vs Waku v2: Bandwidth Comparison](/waku-v1-v2-bandwidth-comparison)
|
||||
- [Waku v2 Peer Exchange](https://github.com/vacp2p/rfc/issues/495)
|
||||
- [Waku v2 Discovery v5 Roadmap](https://forum.vac.dev/t/waku-v2-discv5-roadmap-discussion/121)
|
||||
- [What's the Plan for Waku v2?](/waku-v2-plan)
|
||||
366
rlog/2022-05-09-ambient-peer-discovery.mdx
Normal file
@@ -0,0 +1,366 @@
|
||||
---
|
||||
layout: post
|
||||
name: 'Waku v2 Ambient Peer Discovery'
|
||||
title: 'Waku v2 Ambient Peer Discovery'
|
||||
date: 2022-05-09 10:00:00
|
||||
authors: kaiserd
|
||||
published: true
|
||||
slug: wakuv2-apd
|
||||
categories: research
|
||||
image: /img/waku_v2_discv5_random_walk_estimation.svg
|
||||
discuss: https://forum.vac.dev/t/discussion-waku-v2-ambient-peer-discovery/133
|
||||
_includes: [math]
|
||||
---
|
||||
|
||||
Introducing and discussing ambient peer discovery methods currently used by Waku v2, as well as future plans in this area.
|
||||
|
||||
<!--truncate-->
|
||||
|
||||
[Waku v2](https://rfc.vac.dev/spec/10/) comprises a set of modular protocols for secure, privacy preserving communication.
|
||||
Avoiding centralization, these protocols exchange messages over a P2P network layer.
|
||||
In order to build a P2P network, participating nodes first have to discover peers within this network.
|
||||
This is where [_ambient peer discovery_](https://docs.libp2p.io/concepts/publish-subscribe/#discovery) comes into play:
|
||||
it allows nodes to find peers, making it an integral part of any decentralized application.
|
||||
|
||||
In this post the term _node_ refers to _our_ endpoint or the endpoint that takes action,
|
||||
while the term _peer_ refers to other endpoints in the P2P network.
|
||||
These endpoints can be any device connected to the Internet: e.g. servers, PCs, notebooks, mobile devices, or applications like a browser.
|
||||
As such, nodes and peers are the same kind of entity. We distinguish the terms only for ease of explanation, without loss of generality.
|
||||
|
||||
In Waku's modular design, ambient peer discovery is an umbrella term for mechanisms that allow nodes to find peers.
|
||||
Various ambient peer discovery mechanisms are supported, and each is specified as a separate protocol.
|
||||
Where do these protocols fit into Waku's protocol stack?
|
||||
The P2P layer of Waku v2 builds on [libp2p gossipsub](https://github.com/libp2p/specs/blob/10712c55ab309086a52eec7d25f294df4fa96528/pubsub/gossipsub/README.md).
|
||||
Nodes participating in a gossipsub protocol manage a mesh network that is used for routing messages.
|
||||
This mesh network is an [unstructured P2P network](https://en.wikipedia.org/wiki/Peer-to-peer#Unstructured_networks)
|
||||
offering high robustness and resilience against attacks.
|
||||
Gossipsub implements many improvements that overcome the shortcomings typically associated with unstructured P2P networks, e.g. inefficient flooding-based routing.
|
||||
The gossipsub mesh network is managed in a decentralized way, which requires each node to know other participating peers.
|
||||
Waku v2 may use any combination of its ambient discovery protocols to find appropriate peers.
|
||||
|
||||
Summarizing, Waku v2 comprises a _peer management layer_ based on libp2p gossipsub,
|
||||
which manages the peers of nodes, and an _ambient peer discovery layer_,
|
||||
which provides information about peers to the peer management layer.
|
||||
|
||||
We focus on ambient peer discovery methods that are in line with our goal of building a fully decentralized, generalized, privacy-preserving and censorship-resistant messaging protocol.
|
||||
Some of these protocols still need adjustments to adhere to our privacy and anonymity requirements. For now, we focus on operational stability and feasibility.
|
||||
However, when choosing techniques, we pay attention to selecting mechanisms that can feasibly be tweaked for privacy in future research efforts.
|
||||
Because of the modular design and the fact that Waku v2 has several discovery methods at its disposal, we could even remove a protocol if future evaluation finds that it does not meet our standards.
|
||||
|
||||
This post covers the current state and future considerations of ambient peer discovery for Waku v2,
|
||||
and gives reasons for the changes and modifications we made or plan to make.
|
||||
The ambient peer discovery protocols currently supported by Waku v2 are a modified version of Ethereum's [Discovery v5](https://github.com/ethereum/devp2p/blob/6b0abc3d956a626c28dce1307ee9f546db17b6bd/discv5/discv5.md)
|
||||
and [DNS-based discovery](https://vac.dev/dns-based-discovery).
|
||||
Waku v2 further supports [gossipsub's peer exchange protocol](https://github.com/libp2p/specs/blob/10712c55ab309086a52eec7d25f294df4fa96528/pubsub/gossipsub/gossipsub-v1.1.md#prune-backoff-and-peer-exchange).
|
||||
In addition, we plan to introduce protocols for general peer exchange and capability discovery, respectively.
|
||||
The former allows resource restricted nodes to outsource querying for peers to stronger peers,
|
||||
the latter allows querying peers for their supported capabilities.
|
||||
Besides these new protocols, we are working on integrating capability discovery in our existing ambient peer discovery protocols.
|
||||
|
||||
## Static Node Lists
|
||||
|
||||
The simplest method of learning about peers in a P2P network is via static node lists.
|
||||
These can be given to nodes as start-up parameters or listed in a config-file.
|
||||
They can also be provided in a script-parseable format, e.g. in JSON.
|
||||
While this method of providing bootstrap nodes is very easy to implement, it requires static peers, which introduce centralized elements.
|
||||
Also, updating static peer information introduces significant administrative overhead:
|
||||
code and/or config files have to be updated and released.
|
||||
Typically, static node lists only hold a small number of bootstrap nodes, which may lead to high load on these nodes.
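
As an illustration of the script-parseable variant, the following Python sketch reads a static node list from a JSON document. The `staticNodes` key, the example addresses and the `PEER_ID_*` placeholders are assumptions for this example and do not describe a configuration format used by any particular Waku v2 client.

```python
import json

# Hypothetical config file content; the key name and addresses are illustrative.
CONFIG_TEXT = """
{
  "staticNodes": [
    "/dns4/boot-01.example.org/tcp/60000/p2p/PEER_ID_1",
    "/dns4/boot-02.example.org/tcp/60000/p2p/PEER_ID_2"
  ]
}
"""


def load_static_nodes(config_text: str) -> list:
    """Parse a JSON config and return its list of static node addresses."""
    config = json.loads(config_text)
    nodes = config.get("staticNodes", [])
    if not isinstance(nodes, list) or not all(isinstance(n, str) for n in nodes):
        raise ValueError("staticNodes must be a list of address strings")
    return nodes


static_nodes = load_static_nodes(CONFIG_TEXT)
```

The sketch also makes the drawback visible: changing the set of bootstrap nodes means editing and redistributing this file to every node that uses it.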
|
||||
|
||||
## DNS-based Discovery
|
||||
|
||||
Compared to static node lists,
|
||||
[DNS-based discovery](https://vac.dev/dns-based-discovery) (specified in [EIP-1459](https://eips.ethereum.org/EIPS/eip-1459))
|
||||
provides a more dynamic way of discovering bootstrap nodes.
|
||||
It is very efficient, can easily be handled by resource restricted devices and provides very good availability.
|
||||
In addition to a naive DNS approach, Ethereum's DNS-based discovery introduces efficient authentication leveraging [Merkle trees](https://en.wikipedia.org/wiki/Merkle_tree).
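
The idea behind Merkle-tree authentication can be sketched in Python as follows: a single root hash commits to the whole node list, so a publisher only needs to sign that root, and any change to a record changes the root. This is a generic binary Merkle tree over SHA-256 chosen for illustration; it is not the record encoding or hash function defined in EIP-1459, and the record strings are placeholders.

```python
import hashlib
from typing import Sequence


def _hash(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(leaves: Sequence[bytes]) -> bytes:
    """Compute the root of a simple binary Merkle tree over the given leaves."""
    if not leaves:
        raise ValueError("at least one leaf is required")
    level = [_hash(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2 == 1:
            # Duplicate the last hash so every node has a sibling.
            level.append(level[-1])
        level = [_hash(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


# Placeholder node records; real lists would contain encoded node records.
records = [b"record:node-a", b"record:node-b", b"record:node-c"]
root = merkle_root(records)
tampered_root = merkle_root([b"record:node-a", b"record:node-x", b"record:node-c"])
```

A client that trusts the signed root can fetch records piece by piece and check each against the hashes on its path, instead of having to trust every DNS response individually.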
|
||||
|
||||
A further advantage over static node lists is the separation of code/release management and bootstrap node management.
|
||||
However, changing and updating the list of bootstrap nodes still requires administrative privileges because DNS records have to be added or updated.
|
||||
|
||||
While this method of discovery still requires centralized elements,
|
||||
node list management can be delegated to various DNS zones managed by other entities, which mitigates centralization.
|
||||
|
||||
## Discovery V5
|
||||
|
||||
A much more dynamic method of ambient peer discovery is [Discovery v5](https://github.com/ethereum/devp2p/blob/6b0abc3d956a626c28dce1307ee9f546db17b6bd/discv5/discv5.md), which is Ethereum's peer discovery protocol.
|
||||
It is based on the [Kademlia](https://en.wikipedia.org/wiki/Kademlia) distributed hashtable (DHT).
|
||||
An [introduction to discv5 and its history](https://vac.dev/kademlia-to-discv5), and a [discv5 Waku v2 feasibility study](https://vac.dev/feasibility-discv5)
|
||||
can be found in previous posts on this research log.
|
||||
|
||||
We use Discovery v5 as an ambient peer discovery method for Waku v2 because it is decentralized, efficient, actively researched, and has web3 as its main application area.
|
||||
Discv5 also offers mitigation techniques for various attacks, which we cover later in this post.
|
||||
|
||||
Using a DHT (structured P2P network) as a means for ambient peer discovery, while using the gossipsub mesh network (unstructured P2P network) for transmitting actual messages,
|
||||
Waku v2 leverages advantages from both worlds.
|
||||
One of the main benefits of DHTs is offering a global view over participating nodes.
|
||||
This, in turn, allows sampling random sets of nodes which is important for equally distributing load.
|
||||
Gossipsub, on the other hand, offers great robustness and resilience against attacks.
|
||||
Even if discv5 discovery fails in the event of a DoS attack, Waku v2 can still operate by switching to different discovery methods.
|
||||
|
||||
Discovery methods that use separate P2P networks still depend on bootstrapping,
|
||||
which Waku v2 does via parameters on start-up or via DNS-based discovery.
|
||||
This might raise the question of why such discovery methods are beneficial.
|
||||
The answer lies in the aforementioned global view of DHTs. Without discv5 and similar methods, the bootstrap nodes are used as part of the gossipsub mesh.
|
||||
This might put heavy load on these nodes and, further, might open pathways to inference attacks.
|
||||
Discv5, on the other hand, uses the bootstrap nodes merely as an entry to the discovery network and can provide random sets of nodes (sampled from a global view)
|
||||
for bootstrapping or expanding the mesh.
|
||||
|
||||
### DHT Background
|
||||
|
||||
Distributed Hash Tables are a class of structured P2P overlay networks.
|
||||
A DHT can be seen as a distributed node set of which each node is responsible for a part of the hash space.
|
||||
In contrast to unstructured P2P networks, e.g. the mesh network maintained by gossipsub,
|
||||
DHTs have a global view over the node set and the hash space (assuming the participating nodes behave well).
|
||||
|
||||
DHTs are susceptible to various kinds of attacks, especially [Sybil attacks](https://en.wikipedia.org/wiki/Sybil_attack)
|
||||
and [eclipse attacks](https://www.usenix.org/conference/usenixsecurity15/technical-sessions/presentation/heilman).
|
||||
While security aspects have been addressed in various research papers, general practical solutions are not available.
|
||||
However, discv5 introduced various practical mitigation techniques.
|
||||
|
||||
### Random Walk Discovery
|
||||
|
||||
While discv5 is based on the Kademlia DHT, it only uses the _distributed node set_ aspect of DHTs.
|
||||
It does not map values (items) into the distributed hash space.
|
||||
This makes sense, because the main purpose of discv5 is discovering other nodes that support discv5, which are expected to be Ethereum nodes.
|
||||
Ethereum nodes that want to discover other Ethereum nodes simply query the discv5 network for a random set of peers.
|
||||
If Waku v2 did the same, only a small subset of the retrieved nodes would support Waku v2.
|
||||
|
||||
A first naive solution for Waku v2 discv5 discovery is
|
||||
|
||||
- retrieve a random node set, which is achieved by querying for a set of randomly chosen node IDs
|
||||
- filter the returned nodes on the query path based on Waku v2 capability via the [Waku v2 ENR](https://rfc.vac.dev/spec/31/)
|
||||
- repeat until enough Waku v2 capable nodes are found
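The three steps above can be sketched as a small simulation. This is only an illustration under simplifying assumptions: a real discv5 lookup walks the DHT towards randomly chosen node IDs, whereas here each query is modelled as a uniform random sample of `k` nodes, and the hypothetical `is_waku` callback stands in for the Waku v2 ENR check. All names are made up for this sketch.

```python
import random


def random_walk_discovery(network, is_waku, wanted, k=16, rng=None, max_queries=10_000):
    """Repeat random queries until `wanted` Waku-capable nodes are found.

    network     -- list of node IDs known to the (simulated) discovery network
    is_waku     -- hypothetical predicate standing in for the ENR capability check
    wanted      -- number of Waku-capable nodes we want to find
    k           -- size of the random sample returned by one query
    max_queries -- safety bound so the loop always terminates
    """
    rng = rng or random.Random()
    found = set()
    queries = 0
    while len(found) < wanted and queries < max_queries:
        sample = rng.sample(network, k)  # step 1: retrieve a random node set
        queries += 1
        found.update(node for node in sample if is_waku(node))  # step 2: filter
    return found, queries  # step 3 is the loop condition above


# Simulated network: 10,000 nodes, of which 1% support Waku v2.
network = list(range(10_000))
waku_nodes = set(range(100))
found, queries = random_walk_discovery(
    network, waku_nodes.__contains__, wanted=3, rng=random.Random(1)
)
print(f"found {len(found)} Waku nodes after {queries} queries")
```

Lowering the share of Waku nodes in the simulated network makes the number of queries grow quickly, which is the needle-in-the-haystack effect described next.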
|
||||
|
||||
This query process boils down to random walk discovery, which is very resilient against attacks, but also very inefficient if the number of nodes supporting the desired capability is small.
|
||||
We refer to this as the needle-in-the-haystack problem.
|
||||
|
||||
### Random Walk Performance Estimation
|
||||
|
||||
This subsection provides a rough estimation of the overhead introduced by random walk discovery.
|
||||
|
||||
Given the following parameters:
|
||||
|
||||
- $n$ number of total nodes participating in discv5
|
||||
- $p$ percentage of nodes supporting Waku
|
||||
- $W$ the event of having at least one Waku node in a random sample
|
||||
- $k$ the size of a random sample (default = 16)
|
||||
- $\alpha$ the number of parallel queries started
|
||||
- $b$ bits per hop
|
||||
- $q$ the number of queries
|
||||
|
||||
A query takes $\log_{2^b} n$ hops to retrieve a random sample of nodes.
|
||||
|
||||
$P(W) = 1 - (1-p/100)^k$ is the probability of having at least one Waku node in the sample.
|
||||
|
||||
$P(W^q) = 1 - (1-p/100)^{kq}$ is the probability of having at least one Waku node in the union of $q$ samples.
|
||||
|
||||
Expressing this in terms of $q$, we can write:
|
||||
$$P(W^q) = 1 - (1-p/100)^{kq} \iff q = \log_{(1-p/100)^k}(1-P(W^q))$$
|
||||
|
||||
Figure 1 shows a log-log plot for $P(W^q) = 90\%$.
|
||||
|
||||

|
||||
|
||||
Assuming $p=0.1$, we would need
|
||||
|
||||
$$0.9 = 1 - (1-0.1/100)^{16q} \Rightarrow q \approx 144$$
|
||||
|
||||
queries to get a Waku node with 90% probability, which, at 18 hops per query (for $b=1$), leads to $\approx 144 \cdot 18 = 2592$ overlay hops.
|
||||
Choosing $b=3$ would reduce the number to $\approx 144 \cdot 6 = 864$.
|
||||
Even when choosing $\alpha = 10$ we would have to wait at least 80 RTTs.
|
||||
This effort is just for retrieving a single Waku node. Ideally, we want at least 3 Waku nodes for bootstrapping a Waku relay.
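The arithmetic above can be reproduced with a few lines of Python. This is a rough sketch of the estimate only; the function names are made up for illustration, and the hop counts per query (18 for $b=1$, 6 for $b=3$) are taken from the numbers used in the text rather than derived from a concrete network size $n$.

```python
import math


def queries_needed(p_percent, k=16, target=0.9):
    """Solve 1 - (1 - p/100)^(k*q) = target for q."""
    miss_per_query = (1 - p_percent / 100) ** k  # probability of no Waku node in one sample
    return math.log(1 - target) / math.log(miss_per_query)


def total_hops(p_percent, hops_per_query, k=16, target=0.9):
    """Overlay hops needed to find one Waku node with probability `target`."""
    return round(queries_needed(p_percent, k, target)) * hops_per_query


q = queries_needed(0.1)
print(round(q))                  # number of queries for p = 0.1
print(total_hops(0.1, 18))       # overlay hops with b = 1
print(total_hops(0.1, 6))        # overlay hops with b = 3
print(total_hops(0.1, 6) / 10)   # sequential rounds with alpha = 10
```

Dividing the hop count by $\alpha$ only gives a lower bound on the waiting time, because it assumes all parallel queries are perfectly overlapped.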
|
||||
|
||||
[The discv5 doc](https://github.com/ethereum/devp2p/blob/6b0abc3d956a626c28dce1307ee9f546db17b6bd/discv5/discv5-theory.md#ad-placement-and-topic-radius) roughly estimates $p=1\%$ to be the threshold for acceptably efficient random walk discovery.
|
||||
This is in line with our estimation:
|
||||
|
||||
$$0.9 = 1 - (1-1/100)^{16q} \Rightarrow q \approx 14$$
|
||||
|
||||
The number of necessary queries is, for small $p$, approximately inversely proportional to the percentage $p$ of Waku nodes.
|
||||
The number of hops per query is logarithmically dependent on $n$.
|
||||
Thus, random walk searching is inefficient for small percentages $p$.
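As a short worked step (not part of the original estimate), the expression for $q$ can be rewritten with natural logarithms and approximated using $\ln(1-x) \approx -x$ for small $x$:

$$q = \frac{\ln(1-P(W^q))}{k \cdot \ln(1-p/100)} \approx \frac{-100 \cdot \ln(1-P(W^q))}{k \cdot p}$$

For $P(W^q) = 0.9$ and $k = 16$ this gives $q \approx 14.4 / p$, which matches both values computed above: $q \approx 144$ for $p = 0.1$ and $q \approx 14$ for $p = 1$. Halving the share of Waku nodes therefore roughly doubles the number of queries.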
|
||||
Still, random walks are more resilient against attacks.
|
||||
|
||||
We can conclude that a Waku node concentration below 1% renders vanilla discv5 unfit for our needs.
|
||||
Our current solution and future plans for solving this issue are covered in the next subsections.
|
||||
|
||||
### Simple Solution: Separate Discovery Network
|
||||
|
||||
The simple solution we currently use for [Waku v2 discv5](https://rfc.vac.dev/spec/33/) is a separate discv5 network.
|
||||
All (well behaving) nodes in this network support Waku v2, resulting in a very high query efficiency.
|
||||
However, this solution reduces resilience because the difficulty of attacking a DHT scales with the number of participating nodes.
|
||||
|
||||
### Discv5 Topic Discovery
|
||||
|
||||
We did not base our solution on the [current version of discv5 topic discovery](https://github.com/ethereum/devp2p/blob/master/discv5/discv5-theory.md#topic-advertisement),
|
||||
because, similar to random walk discovery, it suffers from poor performance for relatively rare capabilities/topics.
|
||||
|
||||
However, there is [ongoing research](https://github.com/harnen/service-discovery-paper) in discv5 topic discovery which is close to ideas we explored when pondering efficient and resilient Waku discv5 solutions.
|
||||
We keep a close eye on this research, give feedback, and make suggestions, as we plan to switch to this version of topic discovery in the future.
|
||||
|
||||
In a nutshell, topic discovery will manage separate routing tables for each topic.
|
||||
These topic specific tables are initialized with nodes from the discv5 routing table.
|
||||
While the buckets of the discv5 routing table represent distance intervals from the node's `node ID`, the topic table buckets represent distance intervals from `topic ID`s.
|
||||
|
||||
Nodes that want to register a topic try to register that topic at one random peer per bucket.
|
||||
This leads to registering the topic at peers in closer and closer neighbourhoods around the topic ID, which
|
||||
yields a very efficient and resilient compromise between random walk discovery and DHT discovery.
|
||||
Peers in larger neighbourhoods around the topic ID are less efficient to discover but more resilient against eclipse attacks, and vice versa.
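The bucket idea can be illustrated with a minimal sketch. It assumes Kademlia-style XOR distance and uses the bit length of the XOR as the bucket index; the function names and the tiny integer IDs are hypothetical and chosen for readability, and the sketch ignores the registration limits and waiting times of the actual protocol.

```python
import random
from collections import defaultdict


def log_distance(a, b):
    """Bucket index: bit length of the XOR distance (0 only if a == b)."""
    return (a ^ b).bit_length()


def registration_targets(topic_id, known_nodes, rng=None):
    """Pick one random peer per distance bucket around `topic_id`."""
    rng = rng or random.Random()
    buckets = defaultdict(list)
    for node_id in known_nodes:
        distance = log_distance(topic_id, node_id)
        if distance > 0:  # skip a node whose ID equals the topic ID
            buckets[distance].append(node_id)
    # Small bucket index = close neighbourhood, large index = far neighbourhood.
    return {d: rng.choice(nodes) for d, nodes in sorted(buckets.items())}


targets = registration_targets(0b0000, [0b0001, 0b0010, 0b0011, 0b1000, 0b1001])
print(targets)
```

Each further bucket covers twice as large a part of the ID space as the previous one, so registering once per bucket places progressively denser registrations around the topic ID.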
|
||||
|
||||
Further, this works well with the overload and DoS protection discv5 employs.
|
||||
Discv5 limits the number of nodes registered per topic on a single peer. Further, discv5 enforces a waiting time before nodes can register topics at peers.
|
||||
So, for popular topics, a node might fail to register the topic in a close neighbourhood.
|
||||
However, because the topic is popular (has a high occurrence percentage $p$), it can still be efficiently discovered.
|
||||
|
||||
In the future, we also plan to integrate Waku v2 capability discovery, which will not only allow asking for nodes that support Waku v2,
|
||||
but asking for Waku v2 nodes supporting specific Waku v2 protocols like filter or store.
|
||||
For the store protocol we envision sub-capabilities reflecting message topics and time frames of messages.
|
||||
We will also investigate related security implications.
|
||||
|
||||
### Attacks on DHTs
|
||||
|
||||
In this post, we only briefly describe common attacks on DHTs.
|
||||
These attacks are mainly used for denial of service (DoS),
|
||||
but can also be used as parts of more sophisticated attacks, e.g. deanonymization attacks.
|
||||
A future post on this research log will cover security aspects of ambient peer discovery with a focus on privacy and anonymity.
|
||||
|
||||
_Sybil Attack_
|
||||
|
||||
The power of an attacker in a DHT is proportional to the number of controlled nodes.
|
||||
Controlling nodes comes at a high resource cost and/or requires controlling a botnet via a preliminary attack.
|
||||
|
||||
In a Sybil attack, an attacker generates lots of virtual node identities.
|
||||
This allows the attacker to control a large portion of the ID space in a DHT at a relatively low cost.
|
||||
Sybil attacks are especially powerful when the attacker can freely choose the IDs of generated nodes,
|
||||
because this allows positioning at chosen points in the DHT.
|
||||
|
||||
Because Sybil attacks amplify the power of many attacks against DHTs,
|
||||
making Sybil attacks as difficult as possible is the basis for resilient DHT operation.
|
||||
The typical abstract mitigation approach is binding node identities to physical network interfaces.
|
||||
To some extent, this can be achieved by introducing IP address based limits.
|
||||
Further, generating node IDs can be bound by proof of work (PoW),
|
||||
which, however, comes with a set of shortcomings, e.g. relatively high costs on resource restricted devices.
|
||||
[The discv5 doc](https://github.com/ethereum/devp2p/blob/6b0abc3d956a626c28dce1307ee9f546db17b6bd/discv5/discv5-rationale.md#sybil-and-eclipse-attacks)
|
||||
describes both Sybil and eclipse attacks, as well as concrete mitigation techniques employed by discv5.
|
||||
|
||||
_Eclipse Attack_
|
||||
|
||||
In an eclipse attack, nodes controlled by the attacker poison the routing tables of other nodes in a way that parts of the DHT become eclipsed, i.e. invisible.
|
||||
When a controlled node is asked for the next step in a path,
|
||||
it provides another controlled node as the next step,
|
||||
effectively navigating the querying node around or away from certain areas of the DHT.
|
||||
While several mitigation techniques have been researched, there is no definitive protection against eclipse attacks available as of yet.
|
||||
One mitigation technique is increasing $\alpha$, the number of parallel queries, and following each concurrent path independently for the lookup.
|
||||
|
||||
The eclipse attack becomes very powerful in combination with a successful Sybil attack;
|
||||
especially when the attacker can freely choose the position of the Sybil nodes.
|
||||
|
||||
The aforementioned new topic discovery of discv5 provides a good balance between protection against eclipse attacks and query performance.
|
||||
|
||||
## Peer Exchange Protocol
|
||||
|
||||
While discv5 based ambient peer discovery has many desirable properties, resource restricted nodes and nodes behind restrictive NAT setups cannot run discv5 satisfactorily.
|
||||
With these nodes in mind, we started working on a simple _peer exchange protocol_ based on ideas proposed [here](https://github.com/libp2p/specs/issues/222).
|
||||
The peer exchange protocol will allow nodes to ask peers for additional peers.
|
||||
Similar to discv5, the peer exchange protocol will also support capability discovery.
|
||||
|
||||
The new peer exchange protocol can be seen as a simple replacement for the [Rendezvous protocol](https://github.com/libp2p/specs/blob/10712c55ab309086a52eec7d25f294df4fa96528/rendezvous/README.md), which Waku v2 does not support.
|
||||
While the rendezvous protocol involves nodes registering at rendezvous peers, the peer exchange protocol simply allows nodes to ask any peer for a list of peers (with a certain set of capabilities).
|
||||
Rendezvous tends to introduce centralized elements as rendezvous peers have a super-peer role.
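To make the request/response idea concrete, here is a minimal sketch of how a node might answer a peer exchange request with a capability filter. The protocol was not yet specified at the time of writing, so every name here (`PeerRecord`, `Node`, `handle_peer_request`, the capability strings) is hypothetical and not taken from any Waku v2 implementation.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class PeerRecord:
    """Hypothetical record a node keeps about a known peer."""
    peer_id: str
    capabilities: frozenset


@dataclass
class Node:
    known_peers: list = field(default_factory=list)

    def handle_peer_request(self, required=frozenset(), limit=5):
        """Return up to `limit` known peers that support all `required` capabilities."""
        matches = [p for p in self.known_peers if required <= p.capabilities]
        return matches[:limit]


node = Node([
    PeerRecord("peer-a", frozenset({"relay"})),
    PeerRecord("peer-b", frozenset({"relay", "store"})),
    PeerRecord("peer-c", frozenset({"relay", "filter"})),
])
store_peers = node.handle_peer_request(required=frozenset({"store"}))
print([p.peer_id for p in store_peers])
```

In contrast to rendezvous, no registration step appears here: any node that knows other peers can answer such a request.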
|
||||
|
||||
In the future, we will investigate resource usage of [Waku v2 discv5](https://rfc.vac.dev/spec/33/) and provide suggestions for minimal resources nodes should have to run discv5 satisfactorily.
|
||||
|
||||
## Further Protocols Related to Discovery
|
||||
|
||||
Waku v2 comprises further protocols related to ambient peer discovery. We briefly mention them for context, even though they are not strictly ambient peer discovery protocols.
|
||||
|
||||
### Gossipsub Peer Exchange Protocol
|
||||
|
||||
Gossipsub provides an integrated [peer exchange](https://github.com/libp2p/specs/blob/10712c55ab309086a52eec7d25f294df4fa96528/pubsub/gossipsub/gossipsub-v1.1.md#prune-backoff-and-peer-exchange) mechanism which is also supported by Waku v2.
|
||||
Gossipsub peer exchange works in a _push_ manner. Nodes send peer lists to peers they prune from the active mesh.
|
||||
This pruning is part of the gossipsub peer management, blurring the boundaries of _peer management_ and _ambient peer discovery_.
|
||||
|
||||
We will investigate anonymity implications of this protocol and might disable it in favour of more anonymity-preserving protocols.
|
||||
Sending a list of peers discloses information about the sending node.
|
||||
We consider restricting these peer lists to cached peers that are currently not used in the active gossipsub mesh.
|
||||
|
||||
### Capability Negotiation
|
||||
|
||||
Some of the ambient peer discovery methods used by Waku2 will support capability discovery.
|
||||
This allows narrowing down the set of retrieved peers to peers that support specific capabilities.
|
||||
This is efficient because it avoids establishing connections to nodes that we are not interested in.
|
||||
|
||||
However, the ambient discovery interface does not require capability discovery, which will lead to nodes having peers with unknown capabilities in their peer lists.
|
||||
We work on a _capability negotiation protocol_ which allows nodes to ask peers
|
||||
|
||||
- for their complete list of capabilities, and
|
||||
- whether they support a specific capability
|
||||
|
||||
We will investigate security implications, especially when sending full capability lists.
|
||||
|
||||
## NAT traversal
|
||||
|
||||
For [NAT traversal](https://docs.libp2p.io/concepts/nat/), Waku v2 currently supports the port mapping protocols [UPnP](https://en.wikipedia.org/wiki/Universal_Plug_and_Play) and [NAT-PMP](https://datatracker.ietf.org/doc/html/rfc6886) / [PCP](https://datatracker.ietf.org/doc/html/rfc6887).
|
||||
|
||||
In the future, we plan to add support for parts of [ICE](https://datatracker.ietf.org/doc/html/rfc8445), e.g. [STUN](https://datatracker.ietf.org/doc/html/rfc7350).
|
||||
We do not plan to support [TURN](https://www.rfc-editor.org/rfc/rfc5928) because TURN relays would introduce a centralized element.
|
||||
A modified decentralized version of TURN featuring incentivization might be an option in the future;
|
||||
strong peers could offer a relay service similar to TURN.
|
||||
|
||||
There are [plans to integrate more NAT traversal into discv5](https://github.com/ethereum/devp2p/issues/199), in which we might participate.
|
||||
So far, the only traversal technique supported by discv5 is nodes receiving their external IP address in pong messages.
|
||||
|
||||
While NAT traversal is very important, adding more NAT traversal techniques is not a priority at the moment.
|
||||
Nodes behind restrictive symmetric NAT setups cannot be discovered, but they can still discover peers in less restrictive setups.
|
||||
While we want as many nodes as possible to be discoverable via ambient peer discovery, two nodes behind a restrictive symmetric NAT can still exchange Waku v2 messages if they have discovered a shared peer.
|
||||
This is one of the nice resilience related properties of flooding based routing algorithms.
|
||||
|
||||
For mobile nodes, which suffer from changing IP addresses and double NAT setups, we plan to use the peer exchange protocol to ask peers for more peers.
|
||||
Besides saving resources on resource restricted devices, this approach works as long as peers are in less restrictive environments.
|
||||
|
||||
## Conclusion and Future Prospects
|
||||
|
||||
_Ambient peer discovery_ is an integral part of decentralized applications. It allows nodes to learn about peers in the network.
|
||||
As of yet, Waku v2 supports DNS-based discovery and a slightly modified version of discv5.
|
||||
We are working on further protocols, including a peer exchange protocol that allows resource restricted nodes to ask stronger peers for peer lists.
|
||||
Further, we are working on adding capability discovery to our ambient discovery protocols, allowing nodes to find peers with desired properties.
|
||||
|
||||
These protocols can be combined in a modular way and allow Waku v2 nodes to build a strong and resilient mesh network,
|
||||
even if some discovery methods are not available in a given situation.
|
||||
|
||||
We will investigate security properties of these discovery mechanisms with a focus on privacy and anonymity in a future post on this research log.
|
||||
As an outlook we can already state that DHT approaches typically allow inferring information about the querying node.
|
||||
Further, sending peer lists allows inferring the position of a node within the mesh, and by extension information about the node.
|
||||
Waku v2 already provides some mitigation, because the mesh for transmitting actual messages, and the peer discovery network are separate.
|
||||
To mitigate information leakage by transmitting peer lists, we plan to only reply with lists of peers that nodes do not use in their active meshes.
|
||||
|
||||
---
|
||||
|
||||
## References
|
||||
|
||||
- [Waku v2](https://rfc.vac.dev/spec/10/)
|
||||
- [libp2p gossipsub](https://github.com/libp2p/specs/blob/10712c55ab309086a52eec7d25f294df4fa96528/pubsub/gossipsub/README.md)
|
||||
- [unstructured P2P network](https://en.wikipedia.org/wiki/Peer-to-peer#Unstructured_networks)
|
||||
- [ambient peer discovery](https://docs.libp2p.io/concepts/publish-subscribe/#discovery)
|
||||
- [Discovery v5](https://github.com/ethereum/devp2p/blob/6b0abc3d956a626c28dce1307ee9f546db17b6bd/discv5/discv5.md)
|
||||
- [Kademlia](https://en.wikipedia.org/wiki/Kademlia)
|
||||
- [Discv5 history](https://vac.dev/kademlia-to-discv5)
|
||||
- [Discv5 Waku v2 feasibility study](https://vac.dev/feasibility-discv5)
|
||||
- [DNS-based discovery](https://vac.dev/dns-based-discovery)
|
||||
- [EIP-1459](https://eips.ethereum.org/EIPS/eip-1459)
|
||||
- [Merkle trees](https://en.wikipedia.org/wiki/Merkle_tree)
|
||||
- [Sybil attack](https://en.wikipedia.org/wiki/Sybil_attack)
|
||||
- [eclipse attack](https://www.usenix.org/conference/usenixsecurity15/technical-sessions/presentation/heilman)
|
||||
- [Waku v2 ENR](https://rfc.vac.dev/spec/31/)
|
||||
- [Discv5 topic discovery](https://github.com/ethereum/devp2p/blob/6b0abc3d956a626c28dce1307ee9f546db17b6bd/discv5/discv5-theory.md#ad-placement-and-topic-radius)
|
||||
- [Discv5 paper](https://github.com/harnen/service-discovery-paper)
|
||||
- [Discv5 vs Sybil and eclipse attacks](https://github.com/ethereum/devp2p/blob/6b0abc3d956a626c28dce1307ee9f546db17b6bd/discv5/discv5-rationale.md#sybil-and-eclipse-attacks)
|
||||
- [peer exchange idea](https://github.com/libp2p/specs/issues/222)
|
||||
- [Rendezvous protocol](https://github.com/libp2p/specs/blob/10712c55ab309086a52eec7d25f294df4fa96528/rendezvous/README.md)
|
||||
- [Waku v2 discv5](https://rfc.vac.dev/spec/33/)
|
||||
- [Gossipsub peer exchange](https://github.com/libp2p/specs/blob/10712c55ab309086a52eec7d25f294df4fa96528/pubsub/gossipsub/gossipsub-v1.1.md#prune-backoff-and-peer-exchange)
|
||||
- [NAT traversal](https://docs.libp2p.io/concepts/nat/)
|
||||
- [UPnP](https://en.wikipedia.org/wiki/Universal_Plug_and_Play)
|
||||
- [NAT-PMP](https://datatracker.ietf.org/doc/html/rfc6886)
|
||||
- [PCP](https://datatracker.ietf.org/doc/html/rfc6887)
|
||||
- [Discv5 topic efficiency issue](https://github.com/ethereum/devp2p/issues/199)
|
||||
314
rlog/2022-05-17-noise.mdx
Normal file
@@ -0,0 +1,314 @@
|
||||
---
|
||||
layout: post
|
||||
name: 'Noise handshakes as key-exchange mechanism for Waku'
|
||||
title: 'Noise handshakes as key-exchange mechanism for Waku'
|
||||
date: 2022-05-17 10:00:00
|
||||
authors: s1fr0
|
||||
published: true
|
||||
slug: wakuv2-noise
|
||||
categories: research
|
||||
summary:
|
||||
image: /img/noise/NM.png
|
||||
discuss: https://forum.vac.dev/t/discussion-noise-handshakes-as-key-exchange-mechanism-for-waku/137
|
||||
_includes: [math]
|
||||
---
|
||||
|
||||
We provide an overview of the Noise Protocol Framework as a tool to design efficient and secure key-exchange mechanisms in Waku2.
|
||||
|
||||
<!--truncate-->
|
||||
|
||||
## Introduction
|
||||
|
||||
In this post we will provide an overview of how [Waku v2](https://rfc.vac.dev/spec/10/) users can adopt [Noise handshakes](http://www.noiseprotocol.org/noise.html) to agree on cryptographic keys used to securely encrypt messages.
|
||||
|
||||
This process belongs to the class of _key-exchange_ mechanisms, consisting of all those protocols that, with different levels of complexity and security guarantees, allow two parties to publicly agree on a secret without letting anyone else know what this secret is.
|
||||
|
||||
But why do we need key-exchange mechanisms in the first place?
|
||||
|
||||
With the advent of [public-key cryptography](https://en.wikipedia.org/wiki/Public-key_cryptography), it became possible to decouple encryption from decryption through the use of two distinct cryptographic keys: one _public_, which is used to encrypt information and can be made available to anyone, and one _private_ (kept secret), which enables decryption of messages encrypted with its corresponding public key. The same does not hold for [symmetric encryption schemes](https://en.wikipedia.org/wiki/Symmetric-key_algorithm), where the same key is used for both encryption and decryption and hence, unlike a public key, cannot be publicly revealed.
|
||||
|
||||
In order to address specific application needs, many different public-key, symmetric and hybrid cryptographic schemes were designed: [Waku v1](https://rfc.vac.dev/spec/6/) and [Waku v2](https://rfc.vac.dev/spec/10/), which inherit part of their design from the Ethereum messaging protocol [Whisper](https://ethereum.org/en/developers/docs/networking-layer/#whisper), provide [support](https://rfc.vac.dev/spec/26/) for both public-key primitives ([`ECIES`](https://en.wikipedia.org/wiki/Integrated_Encryption_Scheme), [`ECDSA`](https://en.wikipedia.org/wiki/Elliptic_Curve_Digital_Signature_Algorithm)) and symmetric primitives ([`AES-256-GCM`](https://en.wikipedia.org/wiki/Galois/Counter_Mode), [`KECCAK-256`](https://en.wikipedia.org/wiki/SHA-3)), used to sign, hash, encrypt and decrypt exchanged messages.
|
||||
|
||||
In principle, when communications employ public-key based encryption schemes (`ECIES`, in the case of Waku), there is no need for a key-agreement among parties: messages can be directly encrypted using the recipient's public key before being sent over the network. However, public-key encryption and decryption primitives are usually very inefficient at processing large amounts of data, and this may constitute a bottleneck for many of today's applications. Symmetric encryption schemes such as `AES-256-GCM`, on the other hand, are much more efficient, but the encryption/decryption key needs to be shared among users before any encrypted message is exchanged.
|
||||
|
||||
To counter the downsides given by each of these two approaches while taking advantage of their strengths, hybrid constructions were designed. In these, public-key primitives are employed to securely agree on a secret key which, in turn, is used with a symmetric cipher for encrypting messages. In other words, such constructions specify a (public-key based) key-agreement mechanism!
|
||||
|
||||
Waku, up to [payload version 1](https://rfc.vac.dev/spec/14/#payload-encryption), neither implements nor recommends any protocol for exchanging symmetric ciphers' keys, leaving this task to the application layer. It is important to note that the kind of key-agreement employed has a direct impact on the security properties that can be guaranteed for later encrypted messages, while security requirements usually depend on the specific application for which encryption is needed in the first place.
|
||||
|
||||
In this regard, [Status](https://status.im), which builds on top of Waku, [implements](https://specs.status.im/spec/5) a custom version of the [X3DH](https://signal.org/docs/specifications/x3dh/) key-agreement protocol, in order to allow users to instantiate end-to-end encrypted communication channels. However, although such a solution is optimal when applied to (distributed) E2E encrypted chats, it is not flexible enough to fit or simplify the variety of applications Waku aims to address.
|
||||
Hence, proposing and implementing one or a few key-agreements which provide certain (presumably _strong_) security guarantees would inevitably degrade the performance of all those applications for which, given their security requirements, more tailored and efficient key-exchange mechanisms can be employed.
|
||||
|
||||
Guided by different examples, in the following sections we will give an overview of Noise, a protocol framework we are [currently integrating](https://rfc.vac.dev/spec/35/) in Waku, for building secure key-agreements between two parties. One of the great advantages of using Noise is that it is possible to add support for new key-exchanges by just specifying users' actions from a predefined list, requiring no or only minimal modifications to existing implementations. Furthermore, Noise provides a framework to systematically analyze protocols' security properties and the corresponding attacker threat models. This makes it possible not only to easily design new key-agreements optimized for the specific applications we want to address, but also to easily analyze or even [formally verify](https://noiseexplorer.com/) any such custom protocol!
|
||||
|
||||
We believe that with its enormous flexibility and features, Noise represents a perfect candidate for bringing key-exchange mechanisms to Waku.
|
||||
|
||||
## The Diffie-Hellman Key-exchange
|
||||
|
||||
The formalization of modern public-key cryptography started with the pioneering work of Whitfield Diffie and Martin Hellman, who detailed one of the earliest known key-agreement protocols: the famous [Diffie-Hellman Key-Exchange](https://en.wikipedia.org/wiki/Diffie%E2%80%93Hellman_key_exchange).
|
||||
|
||||
Diffie-Hellman (DH) key-exchange is largely used today and represents the main cryptographic building block on which Noise handshakes' security is based.
|
||||
|
||||
In turn, the security of DH is based on a mathematical problem called [discrete logarithm](https://en.wikipedia.org/wiki/Discrete_logarithm) which is believed to be hard when the agreement is practically instantiated using certain [elliptic curves](https://en.wikipedia.org/wiki/Elliptic_curve) $E$ defined over finite fields $\mathbb{F}_p$.
|
||||
|
||||
Informally, a DH exchange between Alice and Bob proceeds as follows:
|
||||
|
||||
- Alice picks a secret scalar $s_A\in\mathbb{F}_p$ and computes, using the underlying [curve's arithmetic](https://en.wikipedia.org/wiki/Elliptic_curve_point_multiplication), the point $P_A = s_A\cdot P\in E(\mathbb{F}_p)$ for a certain pre-agreed public generator $P$ of the elliptic curve $E(\mathbb{F}_p)$. She then sends $P_A$ to Bob.
|
||||
- Similarly, Bob picks a secret scalar $s_B\in\mathbb{F}_p$, computes $P_B = s_B\cdot P\in E(\mathbb{F}_p)$ and sends $P_B$ to Alice.
|
||||
- By commutativity of scalar multiplication, both Alice and Bob can now compute the point $P_{AB} = s_As_B\cdot P$, using the elliptic curve point received from the other party and their secret scalar.
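The exchange above can be traced with a toy example. The sketch below uses the tiny textbook curve $y^2 = x^3 + 2x + 2$ over $\mathbb{F}_{17}$ with generator $(5, 1)$ of order 19; these parameters are an assumption made purely for illustration and offer no security at all. Real deployments use standardized curves through a vetted library.

```python
import secrets

# Toy curve y^2 = x^3 + A*x + B over F_P (illustrative parameters only, NOT secure).
P_MOD, A, B = 17, 2, 2
GENERATOR = (5, 1)   # pre-agreed public generator P
ORDER = 19           # number of points generated by GENERATOR (including infinity)


def point_add(p1, p2):
    """Add two curve points; None represents the point at infinity."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    x1, y1 = p1
    x2, y2 = p2
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return None  # p2 is the inverse of p1
    if p1 == p2:
        slope = (3 * x1 * x1 + A) * pow((2 * y1) % P_MOD, -1, P_MOD)
    else:
        slope = (y2 - y1) * pow((x2 - x1) % P_MOD, -1, P_MOD)
    slope %= P_MOD
    x3 = (slope * slope - x1 - x2) % P_MOD
    y3 = (slope * (x1 - x3) - y1) % P_MOD
    return (x3, y3)


def scalar_mult(k, point):
    """Compute k * point with double-and-add."""
    result, addend = None, point
    while k > 0:
        if k & 1:
            result = point_add(result, addend)
        addend = point_add(addend, addend)
        k >>= 1
    return result


# Alice and Bob pick secret scalars and publish the corresponding points.
s_a = secrets.randbelow(ORDER - 1) + 1
s_b = secrets.randbelow(ORDER - 1) + 1
P_A = scalar_mult(s_a, GENERATOR)
P_B = scalar_mult(s_b, GENERATOR)

# Each side combines its own secret scalar with the point received from the other.
shared_alice = scalar_mult(s_a, P_B)
shared_bob = scalar_mult(s_b, P_A)
print(shared_alice == shared_bob)
```

Both sides end up with the same point because $s_A \cdot (s_B \cdot P) = s_B \cdot (s_A \cdot P)$; on a curve this small an attacker could of course recover the scalars by brute force.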
|
||||
|
||||
The assumed hardness of computing discrete logarithms in the elliptic curve ensures that it is not possible to compute $s_A$ or $s_B$ from $P_A$ and $P_B$, respectively. Another security assumption (named [Computational Diffie-Hellman assumption](https://en.wikipedia.org/wiki/Computational_Diffie%E2%80%93Hellman_assumption)) ensures that it is not possible to compute $P_{AB}$ from $P$, $P_A$ and $P_B$. Hence the point $P_{AB}$ shared by Alice and Bob at the end of the above protocol cannot be efficiently computed by an attacker intercepting $P_A$ and $P_B$, and can then be used to generate a secret to be later employed, for example, as a symmetric encryption key.
|
||||
|
||||
On a side note, this protocol shows the interplay between two components typical of public-key based schemes: the scalars $s_A$ and $s_B$ can be seen as _private keys_ associated with the _public keys_ $P_A$ and $P_B$, respectively, which allow only Alice and Bob to compute the shared secret point $P_{AB}$.
|
||||
|
||||
## Ephemeral and Static Public Keys
|
||||
|
||||
Although we assumed that it is practically impossible for an attacker to compute the randomly picked secret scalar from the corresponding public elliptic curve point, it may happen that such a scalar gets compromised or can be guessed due to a faulty random number generator. In such cases, an attacker will be able to recover the final shared secret and all encryption keys derived from it, with catastrophic consequences for the privacy of exchanged messages.
|
||||
|
||||
To mitigate such issues, multiple DH operations can be combined using two different types of exchanged elliptic curve points or, better, _public keys_: _ephemeral keys_, that is, random keys used only once in a DH operation, and long-term _static keys_, used mainly for authentication purposes since they are employed multiple times.
|
||||
|
||||
Just to provide an example, let us suppose Alice and Bob perform the following custom DH-based key-exchange protocol:
|
||||
|
||||
- Alice generates an ephemeral key $E_A=e_A\cdot P$ by picking a random scalar $e_A$ and sends $E_A$ to Bob;
|
||||
- Similarly, Bob generates an ephemeral key $E_B=e_B\cdot P$ and sends $E_B$ to Alice;
|
||||
- Alice and Bob compute $E_{AB} = e_Ae_B \cdot P$ and from it derive a secret encryption key $k$.
|
||||
- Bob sends to Alice his static key $S_B = s_B\cdot P$ encrypted with $k$.
|
||||
- Alice encrypts with $k$ her static key $S_A = s_A\cdot P$ and sends it to Bob.
|
||||
- Alice and Bob decrypt the received static keys, compute the secret $S_{AB} = s_As_B \cdot P$ and use it together with $E_{AB}$ to derive a new encryption key $\tilde{k}$ to be later used with a symmetric cipher.
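The message flow above can be sketched in code. To keep the example short, it swaps the elliptic curve for classic Diffie-Hellman in the multiplicative group modulo a prime, derives keys with SHA-256, and "encrypts" the static keys with a hash-derived XOR stream. The prime, the generator, and the helpers `kdf` and `xor_cipher` are assumptions made for illustration and are not secure choices.

```python
import hashlib
import secrets

P = 2**127 - 1   # toy-sized prime modulus (illustration only, NOT secure)
G = 3            # assumed public generator


def keygen():
    """Return (private scalar, public key)."""
    priv = secrets.randbelow(P - 2) + 1
    return priv, pow(G, priv, P)


def dh(priv, peer_pub):
    """One Diffie-Hellman operation."""
    return pow(peer_pub, priv, P)


def kdf(*shared_secrets):
    """Hash all DH results into one symmetric key."""
    h = hashlib.sha256()
    for s in shared_secrets:
        h.update(s.to_bytes(16, "big"))
    return h.digest()


def xor_cipher(key, data):
    """Toy stream cipher for up to 32 bytes: XOR with a hash-derived keystream."""
    stream = hashlib.sha256(key + b"stream").digest()
    return bytes(a ^ b for a, b in zip(data, stream))


# Steps 1-3: exchange ephemeral keys and derive k.
e_a, E_A = keygen()
e_b, E_B = keygen()
E_AB_alice, E_AB_bob = dh(e_a, E_B), dh(e_b, E_A)
k_alice, k_bob = kdf(E_AB_alice), kdf(E_AB_bob)

# Steps 4-5: exchange static keys encrypted under k.
s_a, S_A = keygen()
s_b, S_B = keygen()
enc_S_B = xor_cipher(k_bob, S_B.to_bytes(16, "big"))
enc_S_A = xor_cipher(k_alice, S_A.to_bytes(16, "big"))

# Step 6: decrypt the static keys and mix both DH results into the final key.
S_B_received = int.from_bytes(xor_cipher(k_alice, enc_S_B), "big")
S_A_received = int.from_bytes(xor_cipher(k_bob, enc_S_A), "big")
k_tilde_alice = kdf(E_AB_alice, dh(s_a, S_B_received))
k_tilde_bob = kdf(E_AB_bob, dh(s_b, S_A_received))
print(k_tilde_alice == k_tilde_bob)
```

Because the final key hashes both $E_{AB}$ and $S_{AB}$, knowing only the static private scalars is not enough to recompute it.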
|
||||
|
||||
In this protocol, if Alice's and/or Bob's static keys get compromised, it would not be possible to derive the final secret key $\tilde{k}$, since at least one ephemeral key among $E_A$ and $E_B$ has to be compromised too in order to recover the secret $E_{AB}$. Furthermore, since Alice's and Bob's long-term static keys are encrypted, an attacker intercepting exchanged (encrypted) public keys will not be able to link such communication to Alice or Bob, unless one of the ephemeral keys is compromised (and, even in such a case, none of the messages encrypted under the key $\tilde{k}$ can be decrypted).
|
||||
|
||||
## The Noise Protocol Framework
|
||||
|
||||
In the previous section we gave some intuition on how multiple DH operations over users' ephemeral and static public keys can be combined to create different key-exchange protocols.
|
||||
|
||||
The [Noise Protocol Framework](http://www.noiseprotocol.org/noise.html) defines various rules for building custom key-exchange protocols while allowing easy analysis of the security properties and threat models provided, given the type and order of the DH operations employed.
|
||||
|
||||
In Noise terminology, a key-agreement or _Noise protocol_ consists of one or more _Noise handshakes_. During a Noise handshake, Alice and Bob exchange multiple (handshake) messages containing their ephemeral keys and/or static keys. These public keys are then used to perform a handshake-dependent sequence of Diffie-Hellman operations, whose results are all hashed into a shared secret key. Similarly to what we have seen above, after a handshake is complete, each party will use the derived secret key to send and receive [authenticated encrypted data](https://en.wikipedia.org/wiki/Authenticated_encryption) by employing a symmetric cipher.
|
||||
|
||||
Depending on the _handshake pattern_ adopted, different security guarantees can be provided on messages encrypted using a handshake-derived key.
|
||||
|
||||
The Noise handshakes we support in Waku all provide the following security properties:
|
||||
|
||||
- **Confidentiality**: the adversary should not be able to learn what data is being sent between Alice and Bob.
|
||||
- **Strong forward secrecy**: an active adversary cannot decrypt messages nor infer any information on the employed encryption key, even in the case he has access to Alice's and Bob's long-term private keys (during or after their communication).
|
||||
- **Authenticity**: the adversary should not be able to cause either Alice or Bob to accept messages coming from a party other than their original senders.
|
||||
- **Integrity**: the adversary should not be able to cause Alice or Bob to accept data that has been tampered with.
|
||||
- **Identity-hiding**: once a secure communication channel is established, a passive adversary should not be able to link exchanged encrypted messages to their corresponding sender and recipient by knowing their long-term static keys.
|
||||
|
||||
We refer to the [Noise specification](http://www.noiseprotocol.org/noise.html) for more formal security definitions and precise threat models relative to the [Noise handshake patterns supported in Waku](#Supported-Noise-Handshakes-in-Waku).
|
||||
|
||||
## Message patterns
|
||||
|
||||
Noise handshakes involving DH operations over ephemeral and static keys can be succinctly sketched using the following set of _handshake message tokens_: `e`,`s`,`ee`,`se`,`es`,`ss`.
|
||||
|
||||
Tokens employing single letters denote (the type of) users' public keys: `e` refers to randomly generated ephemeral key(s), while `s` indicates the users' long-term static key(s).
|
||||
|
||||
Two-letter tokens, instead, denote DH operations over the two users' public keys the token refers to: the left token letter refers to the handshake _initiator's_ public key, while the right token letter refers to the _responder's_ public key. Thus, if Alice started a handshake with Bob, the `es` token concisely represents a DH operation between Alice's ephemeral key `e` and Bob's static key `s`.
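The left/right convention can be made concrete with a small helper. This is a minimal sketch; the function name `dh_operands` and its return convention are assumptions made for illustration and do not come from the Noise specification.

```python
def dh_operands(token, is_initiator):
    """Return (own private key type, remote public key type) for a DH token.

    The left letter of the token names the initiator's key and the right
    letter the responder's key; 'e' is ephemeral and 's' is static.
    """
    if len(token) != 2 or any(letter not in "es" for letter in token):
        raise ValueError(f"not a DH token: {token!r}")
    initiator_key, responder_key = token
    if is_initiator:
        return initiator_key, responder_key
    return responder_key, initiator_key
```

For the `es` token, the initiator combines its own ephemeral private key with the responder's static public key, while the responder combines its own static private key with the initiator's ephemeral public key; both obtain the same DH secret.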
|
||||
|
||||
Since users need to share (or pre-share) the corresponding public keys in order to perform any DH operation, Noise compactly represents message exchanges using the two directions `->` and `<-`, where `->` denotes a message (arbitrary data and/or DH public keys) from the initiator to the responder, and `<-` denotes the opposite.
|
||||
|
||||
Hence a _message pattern_ consisting of a direction and one or multiple tokens, such as `<- e, s, es`, has to be interpreted one token at a time: in this example, the responder is sending his ephemeral and static keys to the initiator and is then executing a DH operation over the initiator's ephemeral key `e` (shared in a previously exchanged message pattern) and his own static key `s`. On the other hand, such a message also indicates that the initiator received the responder's ephemeral and static keys `e` and `s`, respectively, and performed a DH operation over his own ephemeral key and the responder's just received static key `s`. In this way, at the end of each processed message pattern both parties will be able to derive the same shared secret, which is eventually used to update any derived symmetric encryption keys computed so far.
|
||||
|
||||
In some cases, DH public keys employed in a handshake are pre-shared before the handshake itself starts. In order to chronologically separate exchanged keys and DH operations performed before and during a handshake, Noise employs the `...` delimiter.
|
||||
|
||||
For example, the following message patterns
|
||||
|
||||
```
<- e
...
-> e, ee
```
|
||||
|
||||
indicate that the initiator knew the responder's ephemeral key before sending his own ephemeral key and executing a DH operation between both parties' ephemeral keys (similarly, the responder receives the initiator's ephemeral key and does an `ee` DH operation).
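To make the notation more tangible, the following sketch parses a handshake pattern written in this notation into pre-messages and handshake messages. The function name `parse_pattern` and the tuple layout it returns are hypothetical choices for illustration, not part of Noise or of any Waku implementation.

```python
def parse_pattern(text):
    """Split a handshake pattern into (pre_messages, messages).

    Each message is a (direction, tokens) pair, for example
    ("->", ["e", "ee"]). Lines before a "..." delimiter are pre-messages.
    """
    pre_messages, messages = [], []
    lines = [line.strip() for line in text.strip().splitlines() if line.strip()]
    # Without a "..." delimiter, every line is a regular handshake message.
    current = pre_messages if "..." in lines else messages
    for line in lines:
        if line == "...":
            current = messages
            continue
        direction, _, rest = line.partition(" ")
        if direction not in ("->", "<-"):
            raise ValueError(f"unknown direction in line: {line!r}")
        tokens = [token.strip() for token in rest.split(",") if token.strip()]
        current.append((direction, tokens))
    return pre_messages, messages


pre, msgs = parse_pattern("""
<- e
...
-> e, ee
""")
```

Here `pre` holds the responder's pre-shared ephemeral key and `msgs` holds the single handshake message sent by the initiator.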
|
||||
|
||||
At this point it should be clear how such notation is able to compactly represent a large variety of DH based key-agreements. Nevertheless, we can easily define additional tokens and processing rules in order to address specific applications and security requirements, such as the [`psk`](http://www.noiseprotocol.org/noise.html#handshake-tokens) token used to process arbitrary pre-shared key material.
|
||||
|
||||
As an example of Noise's flexibility, the custom protocol we detailed [above](#Ephemeral-and-Static-Public-Keys) can be concisely represented as _(Alice is on the left)_:
|
||||
|
||||
```
-> e
<- e, ee, s
-> s, ss
```
|
||||
|
||||
where after each DH operation an encryption key is derived (along with the secrets computed by all previously executed DH operations) in order to encrypt/decrypt any subsequent sent/received message.
|
||||
|
||||
Another example is the possibility of replicating within Noise the well-established Signal [X3DH](https://signal.org/docs/specifications/x3dh/) key-agreement protocol, which makes Noise a general framework for designing and studying the security of many practical and widespread DH-based key-exchange protocols.
|
||||
|
||||
## The Noise State Objects
|
||||
|
||||
We mentioned multiple times that parties derive an encryption key each time they perform a DH operation, but how does this work in more detail?
|
||||
|
||||
Noise defines three _state objects_: a _Handshake State_, a _Symmetric State_ and a _Cipher State_, each nested inside the previous one and instantiated during the execution of a handshake.
|
||||
|
||||
The Handshake State object stores the user's and other party's received ephemeral and static keys (if any) and embeds a Symmetric State object.
|
||||
|
||||
The Symmetric State, instead, stores a handshake hash value `h`, iteratively updated with any message read/received and DH secret computed, and a chaining key `ck`, updated using a key derivation function every time a DH secret is computed. This object further embeds a Cipher State.
|
||||
|
||||
Lastly, the Cipher State stores a symmetric encryption key `k` and a counter `n` used to encrypt and decrypt messages exchanged during the handshake (not only static keys, but also arbitrary payloads). The key and counter are refreshed every time the chaining key is updated.
|
||||
|
||||
While processing each handshake's message pattern token, all these objects are updated according to some specific _processing rules_ which employ a combination of public-key primitives, hash and key-derivation functions and symmetric ciphers. It is important to note, however, that at the end of each processed message pattern, the two users will share the same Symmetric and Cipher State embedded in their respective Handshake States.
|
||||
|
||||
Once a handshake is complete, users derive two new Cipher States and can then discard the Handshake State object (and, thus, the embedded Symmetric State and Cipher State objects) employed during the handshake.
|
||||
|
||||
These two Cipher States are used to encrypt and decrypt all outbound and inbound after-handshake messages, respectively, and only these messages are granted the confidentiality, authenticity, integrity and identity-hiding properties we detailed above.
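The interplay between the Symmetric State and the Cipher State can be sketched as follows. This is a simplified illustration and not the `nwaku` implementation: it assumes SHA-256 and HMAC-based key derivation, always hashes the protocol name at initialization, and leaves out the Handshake State and the actual encryption. The class and method names are chosen for readability and are assumptions.

```python
import hashlib
import hmac


def hkdf2(chaining_key, input_key_material):
    """Derive two 32-byte outputs from the chaining key (HMAC-SHA256)."""
    temp_key = hmac.new(chaining_key, input_key_material, hashlib.sha256).digest()
    out1 = hmac.new(temp_key, b"\x01", hashlib.sha256).digest()
    out2 = hmac.new(temp_key, out1 + b"\x02", hashlib.sha256).digest()
    return out1, out2


class CipherState:
    """Holds the symmetric key k and the counter n."""

    def __init__(self, k=None):
        self.k = k
        self.n = 0


class SymmetricState:
    """Holds the handshake hash h, the chaining key ck and a Cipher State."""

    def __init__(self, protocol_name):
        self.h = hashlib.sha256(protocol_name).digest()
        self.ck = self.h
        self.cipher = CipherState()

    def mix_hash(self, data):
        # Fold handshake data (e.g. a transmitted public key) into h.
        self.h = hashlib.sha256(self.h + data).digest()

    def mix_key(self, dh_secret):
        # Update ck with a new DH secret and refresh the key and counter.
        self.ck, temp_k = hkdf2(self.ck, dh_secret)
        self.cipher = CipherState(temp_k)

    def split(self):
        # At handshake completion: one Cipher State per direction.
        k1, k2 = hkdf2(self.ck, b"")
        return CipherState(k1), CipherState(k2)
```

If both parties feed the same sequence of public keys and DH secrets into `mix_hash` and `mix_key`, they end up with identical `h`, `ck` and `k`, and `split` gives them the same pair of keys, one used for each direction of the after-handshake communication.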
|
||||
|
||||
For more details on processing rules, we refer to the [Noise specification](http://www.noiseprotocol.org/noise.html).
|
||||
|
||||
## Supported Noise Handshakes in Waku
|
||||
|
||||
The Noise handshakes supported in Waku address four typical scenarios that occur when an encrypted communication channel between Alice and Bob is to be created:
|
||||
|
||||
- Alice and Bob know each other's static key;
- Alice knows Bob's static key;
- Alice and Bob share no key material and they don't know each other's static key;
- Alice and Bob share some key material, but they don't know each other's static key.
|
||||
|
||||
The possibility of basing handshakes on the reciprocal knowledge parties have of each other allows designing Noise handshakes that can quickly reach the desired level of security on exchanged encrypted messages while keeping the number of interactions between Alice and Bob to a minimum.
|
||||
|
||||
Nonetheless, due to the purely _token-based_ nature of handshake processing rules, implementations can easily add support for any custom handshake pattern with minor modifications, in case more specific application use-cases need to be addressed.
|
||||
|
||||
On a side note, we already mentioned that identity-hiding properties can be guaranteed against a passive attacker that only reads the communication occurring between Alice and Bob. However, an active attacker who compromised one party's static key and actively interferes with the parties' exchanged messages, may lower the identity-hiding security guarantees provided by some handshake patterns. In our security model we exclude such adversary, but, for completeness, in the following we report a summary of possible de-anonymization attacks that can be performed by such an active attacker.
|
||||
|
||||
For more details on supported handshakes and on how these are implemented in Waku, we refer to the [35/WAKU2-NOISE](https://rfc.vac.dev/spec/35/) RFC.
|
||||
|
||||
### The K1K1 Handshake
|
||||
|
||||
If Alice and Bob know each other's static key (e.g., these are public or were already exchanged in a previous handshake), they MAY execute a `K1K1` handshake. In Noise notation _(Alice is on the left)_ this can be sketched as:
|
||||
|
||||
```
K1K1:
-> s
<- s
...
-> e
<- e, ee, es
-> se
```
|
||||
|
||||
We note that here only ephemeral keys are exchanged. This handshake is useful in case Alice needs to instantiate a new separate encrypted communication channel with Bob, e.g. opening multiple parallel connections, file transfers, etc.
|
||||
|
||||
**Security considerations on identity-hiding (active attacker)**: no static key is transmitted, but an active attacker impersonating Alice can check candidates for Bob's static key.
|
||||
|
||||
### The XK1 Handshake
|
||||
|
||||
Here, Alice knows how to initiate a communication with Bob and she knows his public static key: such discovery can be achieved, for example, through a publicly accessible register of users' static keys, smart contracts, or through a previous public/private advertisement of Bob's static key.
|
||||
|
||||
A Noise handshake pattern that suits this scenario is `XK1`:
|
||||
|
||||
```
XK1:
<- s
...
-> e
<- e, ee, es
-> s, se
```
|
||||
|
||||
Within this handshake, Alice and Bob reciprocally authenticate their static keys `s` using ephemeral keys `e`. We note that while Bob's static key is assumed to be known to Alice (and hence is not transmitted), Alice's static key is sent to Bob encrypted with a key derived from both parties' ephemeral keys and Bob's static key.
|
||||
|
||||
**Security considerations on identity-hiding (active attacker)**: Alice's static key is encrypted with forward secrecy to an authenticated party. An active attacker initiating the handshake can check candidates for Bob's static key against recorded/accepted exchanged handshake messages.
|
||||
|
||||
### The XX and XXpsk0 Handshakes
|
||||
|
||||
If Alice is not aware of any static key belonging to Bob (and Bob does not know any static key belonging to Alice), she can execute an `XX` handshake, where each party tran**X**mits to the other its own static key.
|
||||
|
||||
The handshake goes as follows:
|
||||
|
||||
```
XX:
-> e
<- e, ee, s, es
-> s, se
```
|
||||
|
||||
We note that the main difference with `XK1` is that in the second step Bob sends Alice his own static key, encrypted with a key obtained from an ephemeral-ephemeral Diffie-Hellman exchange.
|
||||
|
||||
This handshake can be slightly changed in case Alice and Bob pre-share some secret `psk`, which can be used to strengthen their mutual authentication during the handshake execution. One of the resulting protocols, called `XXpsk0`, goes as follows:
|
||||
|
||||
```
XXpsk0:
-> psk, e
<- e, ee, s, es
-> s, se
```
|
||||
|
||||
The main difference with `XX` is that Alice's and Bob's static keys, when transmitted, would be encrypted with a key derived from `psk` as well.
|
||||
|
||||
**Security considerations on identity-hiding (active attacker)**: Alice's static key is encrypted with forward secrecy to an authenticated party for both `XX` and `XXpsk0` handshakes. In `XX`, Bob's static key is encrypted with forward secrecy but is transmitted to a non-authenticated user, who may therefore be an active attacker. In `XXpsk0`, instead, Bob's static key is protected by forward secrecy to a partially authenticated party (through the pre-shared secret `psk` but not through any static key), provided that `psk` was not previously compromised (in that case, the identity-hiding properties provided by the `XX` handshake apply).
|
||||
|
||||
## Session Management and Multi-Device Support
|
||||
|
||||
When two users complete a Noise handshake, an encryption/decryption session - or _Noise session_ - consisting of two Cipher States is instantiated.
|
||||
|
||||
By identifying each Noise session with a `session-id` derived from the handshake's cryptographic material, we can take advantage of the [PubSub/GossipSub](https://github.com/libp2p/specs/tree/master/pubsub) protocols used by Waku for relaying messages in order to manage instantiated Noise sessions.
|
||||
|
||||
The core idea is to exchange after-handshake messages (encrypted with a Cipher State specific to the Noise session), over a content topic derived from the (secret) `session-id` the corresponding session refers to.
|
||||
|
||||
This allows decoupling the handshaking phase from the actual encrypted communication, thus improving users' identity-hiding capabilities.
|
||||
|
||||
Furthermore, by publicly revealing a value derived from `session-id` on the corresponding session content topic, a Noise session can be marked as _stale_, enabling peers to save resources by discarding any [stored](https://rfc.vac.dev/spec/13/) message sent to that content topic.
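As a rough illustration of deriving a content topic from a secret `session-id`, consider the sketch below. Everything in it is hypothetical: the domain-separation labels, the topic string layout and the function names are made up for this example, and the actual derivation is the one defined in the Waku RFCs.

```python
import hashlib


def session_id_from_handshake(handshake_hash):
    """Derive a secret session-id from handshake material (hypothetical)."""
    return hashlib.sha256(b"session-id" + handshake_hash).digest()


def content_topic_for_session(session_id):
    """Derive a public content topic that does not reveal the session-id."""
    digest = hashlib.sha256(b"content-topic" + session_id).hexdigest()
    # Hypothetical topic layout, used here only for illustration.
    return f"/noise-session-demo/1/{digest}/proto"
```

Since both parties know the same handshake material, they compute the same content topic independently, while an observer who only sees the topic cannot recover the `session-id` from it.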
|
||||
|
||||
One relevant aspect in today's applications is the possibility for users to employ different devices in their communications. In some cases, this is non-trivial to achieve since, for example, encrypted messages might be required to be synced on different devices which do not necessarily share the necessary key material for decryption and may be temporarily offline.
|
||||
|
||||
We address this by requiring each user's device to instantiate multiple Noise sessions: either with all of the user's other devices, which, in turn, all together share a single Noise session with the other party, or directly with all of the other party's devices.
|
||||
|
||||
We named these two approaches $N11M$ and $NM$, respectively, which are in turn loosely based on the paper [“Multi-Device for Signal”](https://eprint.iacr.org/2019/1363.pdf) and [Signal’s Sesame Algorithm](https://signal.org/docs/specifications/sesame/).
|
||||
|
||||

|
||||
|
||||
Informally, in the $N11M$ session management scheme, once the first Noise session between any of Alice’s and Bob’s devices is instantiated, its session information is securely propagated to all other devices using previously instantiated Noise sessions. Hence, all devices are able to send and receive new messages on the content topic associated with that session.
|
||||
|
||||

|
||||
|
||||
In the $NM$ session management scheme, instead, all pairs of Alice's and Bob's devices have a distinct Noise session: a message is then sent from the currently-in-use sender’s device to all recipient’s devices, by properly encrypting and sending it to the content topics of each corresponding Noise session. If sent messages should be available on all sender’s devices as well, we require each pair of sender’s devices to instantiate a Noise session used for syncing purposes.
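The number of sessions required by the $NM$ scheme follows directly from this description and can be enumerated with a short sketch. The function names and device labels below are hypothetical and only serve the illustration.

```python
from itertools import combinations, product


def nm_sessions(sender_devices, recipient_devices, sync_sent_messages=False):
    """Enumerate the Noise sessions needed in the NM scheme.

    Returns (cross, sync): one session per (sender device, recipient device)
    pair, plus, optionally, one session per pair of the sender's own devices
    for syncing sent messages.
    """
    cross = list(product(sender_devices, recipient_devices))
    sync = list(combinations(sender_devices, 2)) if sync_sent_messages else []
    return cross, sync


def fan_out(active_device, cross_sessions):
    """Sessions the currently-in-use device must encrypt a message for."""
    return [pair for pair in cross_sessions if pair[0] == active_device]


cross, sync = nm_sessions(["a1", "a2"], ["b1", "b2", "b3"], sync_sent_messages=True)
```

With two devices for Alice and three for Bob, this enumerates six cross-device sessions and one sync session among Alice's devices; a message sent from `a1` is encrypted three times, once for each of Bob's devices.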
|
||||
|
||||
For more technical details on how Noise sessions are instantiated and managed within these two mechanisms and the different trade-offs provided by the latter, we refer to [37/WAKU2-NOISE-SESSIONS](https://rfc.vac.dev/spec/37/).
|
||||
|
||||
## Conclusions
|
||||
|
||||
In this post we provided an overview of Noise, a protocol framework for designing Diffie-Hellman based key-exchange mechanisms allowing systematic security and threat model analysis.
|
||||
|
||||
The flexibility provided by Noise components not only allows fully replicating, with the same security guarantees, well-established key-exchange primitives such as X3DH, currently employed by Status [5/TRANSPORT-SECURITY](https://specs.status.im/spec/5), but also enables optimizations based on the reciprocal knowledge parties have of each other, while easing the security analysis and (formal) verification of protocols.
|
||||
|
||||
Furthermore, different handshakes can be combined and executed one after another, a feature that is particularly useful to authenticate multiple static keys employed by different applications, but also to ease key revocation.
|
||||
|
||||
The possibility of managing Noise sessions over multiple devices, and the fact that handshakes can be concretely instantiated using modern, fast and secure cryptographic primitives such as [ChaChaPoly](https://datatracker.ietf.org/doc/html/rfc7539) and [BLAKE2b](https://datatracker.ietf.org/doc/html/rfc7693), make Noise one of the best candidates for efficiently and securely addressing the many different needs of applications built on top of Waku that require key agreement.
|
||||
|
||||
## Future steps
|
||||
|
||||
The available [implementation](https://github.com/status-im/nwaku/tree/master/waku/v2/waku_noise) of Noise in `nwaku`, although mostly complete, is still in its testing phase. As future steps we would like to:
|
||||
|
||||
- have an extensively tested and robust Noise implementation;
|
||||
- formalize, implement and test the performance of the two proposed $N11M$ and $NM$ session management mechanisms and their suitability for common use-case scenarios;
|
||||
- provide Waku network nodes with a native protocol to readily support key-exchanges, strongly-encrypted communication and multi-device session management mechanisms with little to no interaction besides applications' connection requests.
|
||||
|
||||
## References
|
||||
|
||||
- [6/WAKU1](https://rfc.vac.dev/spec/6/)
|
||||
- [10/WAKU2](https://rfc.vac.dev/spec/10/)
|
||||
- [13/WAKU2-STORE](https://rfc.vac.dev/spec/13/)
|
||||
- [26/WAKU-PAYLOAD](https://rfc.vac.dev/spec/26/)
|
||||
- [35/WAKU2-NOISE](https://rfc.vac.dev/spec/35/)
|
||||
- [37/WAKU2-NOISE-SESSIONS](https://rfc.vac.dev/spec/37/)
|
||||
- [5/TRANSPORT-SECURITY](https://specs.status.im/spec/5)
|
||||
- [The PubSub/GossipSub Protocols](https://github.com/libp2p/specs/tree/master/pubsub)
|
||||
- [The Noise Protocol Framework](http://www.noiseprotocol.org/noise.html)
|
||||
- [The X3DH Key-agreement Protocol](https://signal.org/docs/specifications/x3dh/)
|
||||
- [“Multi-Device for Signal”](https://eprint.iacr.org/2019/1363.pdf)
|
||||
- [Signal’s Sesame Algorithm](https://signal.org/docs/specifications/sesame/)
|
||||
- [Public-key cryptography](https://en.wikipedia.org/wiki/Public-key_cryptography)
|
||||
- [Elliptic curves](https://en.wikipedia.org/wiki/Elliptic_curve)
|
||||
- [Elliptic Curve point multiplication](https://en.wikipedia.org/wiki/Elliptic_curve_point_multiplication)
|
||||
- [Symmetric key algorithm](https://en.wikipedia.org/wiki/Symmetric-key_algorithm)
|
||||
- [Authenticated encryption](https://en.wikipedia.org/wiki/Authenticated_encryption)
|
||||
- [Diffie-Hellman Key-Exchange](https://en.wikipedia.org/wiki/Diffie%E2%80%93Hellman_key_exchange)
|
||||
- [The Discrete Logarithm Problem](https://en.wikipedia.org/wiki/Discrete_logarithm)
|
||||
- [Computational Diffie-Hellman Assumption](https://en.wikipedia.org/wiki/Computational_Diffie%E2%80%93Hellman_assumption)
|
||||
- [The ECIES Encryption Algorithm](https://en.wikipedia.org/wiki/Integrated_Encryption_Scheme)
|
||||
- [The ECDSA Signature Algorithm](https://en.wikipedia.org/wiki/Elliptic_Curve_Digital_Signature_Algorithm)
|
||||
- [The Galois/Counter Mode of operation](https://en.wikipedia.org/wiki/Galois/Counter_Mode)
|
||||
- [The ChaChaPoly AEAD Cipher](https://datatracker.ietf.org/doc/html/rfc7539)
|
||||
- [The BLAKE2b Hash Function](https://datatracker.ietf.org/doc/html/rfc7693)
|
||||
- [The SHA-3 Hash Function](https://en.wikipedia.org/wiki/SHA-3)
|
||||
380
rlog/2022-07-22-relay-anonymity.mdx
Normal file
@@ -0,0 +1,380 @@
|
||||
---
|
||||
layout: post
|
||||
name: 'Waku Privacy and Anonymity Analysis Part I: Definitions and Waku Relay'
|
||||
title: 'Waku Privacy and Anonymity Analysis Part I: Definitions and Waku Relay'
|
||||
date: 2022-07-22 10:00:00
|
||||
authors: kaiserd
|
||||
published: true
|
||||
slug: wakuv2-relay-anon
|
||||
categories: research
|
||||
image: /img/anonymity_trilemma.svg
|
||||
discuss: https://forum.vac.dev/t/discussion-waku-privacy-and-anonymity-analysis/149
|
||||
_includes: [math]
|
||||
|
||||
toc_min_heading_level: 2
|
||||
toc_max_heading_level: 5
|
||||
---
|
||||
|
||||
Introducing a basic threat model and privacy/anonymity analysis for the Waku v2 relay protocol.
|
||||
|
||||
<!--truncate-->
|
||||
|
||||
[Waku v2](https://rfc.vac.dev/spec/10/) enables secure, privacy-preserving communication using a set of modular P2P protocols.
|
||||
Waku v2 also aims at protecting the user's anonymity.
|
||||
This post is the first in a series about Waku v2 security, privacy, and anonymity.
|
||||
The goal is to eventually have a full privacy and anonymity analysis for each of the Waku v2 protocols, as well as covering the interactions of various Waku v2 protocols.
|
||||
This provides transparency with respect to Waku's current privacy and anonymity guarantees, and also identifies weak points that we have to address.
|
||||
|
||||
In this post, we first give an informal description of security, privacy and anonymity in the context of Waku v2.
|
||||
For each definition, we summarize Waku's current guarantees regarding the respective property.
|
||||
We also provide attacker models, an attack-based threat model, and a first anonymity analysis of [Waku v2 relay](https://rfc.vac.dev/spec/11/) within the respective models.
|
||||
|
||||
Waku comprises many protocols that can be combined in a modular way.
|
||||
For our privacy and anonymity analysis, we start with the relay protocol because it is at the core of Waku v2, enabling Waku's publish-subscribe approach to P2P messaging.
|
||||
In its current form, Waku relay is a minor extension of [libp2p GossipSub](https://github.com/libp2p/specs/blob/master/pubsub/gossipsub/README.md).
|
||||
|
||||
![libp2p GossipSub types of peering](/img/libp2p_gossipsub_types_of_peering.png)
|
||||
|
||||
## Informal Definitions: Security, Privacy, and Anonymity
|
||||
|
||||
The concepts of security, privacy, and anonymity are linked and have quite a bit of overlap.
|
||||
|
||||
### Security
|
||||
|
||||
Of the three, [Security](https://en.wikipedia.org/wiki/Information_security) has the clearest agreed upon definition,
|
||||
at least regarding its key concepts: _confidentiality_, _integrity_, and _availability_.
|
||||
|
||||
- confidentiality: data is not disclosed to unauthorized entities.
|
||||
- integrity: data is not modified by unauthorized entities.
|
||||
- availability: data is available, i.e. accessible by authorized entities.
|
||||
|
||||
While these are the key concepts, the definition of information security has been extended over time including further concepts,
|
||||
e.g. [authentication](https://en.wikipedia.org/wiki/Authentication) and [non-repudiation](https://en.wikipedia.org/wiki/Non-repudiation).
|
||||
We might cover these in future posts.
|
||||
|
||||
### Privacy
|
||||
|
||||
Privacy allows users to choose which data and information
|
||||
|
||||
- they want to share
|
||||
- and with whom they want to share it.
|
||||
|
||||
This includes data and information that is associated with and/or generated by users.
|
||||
Protected data also comprises metadata that might be generated without users being aware of it.
|
||||
This means, no further information about the sender or the message is leaked.
|
||||
Metadata that is protected as part of the privacy-preserving property does not cover protecting the identities of sender and receiver.
|
||||
Identities are protected by the [anonymity property](#anonymity).
|
||||
|
||||
Often privacy is realized by the confidentiality property of security.
|
||||
This neither makes privacy and security the same, nor makes one a subcategory of the other.
|
||||
While security is abstract itself (its properties can be realized in various ways), privacy lives on a more abstract level using security properties.
|
||||
Privacy typically does not use integrity and availability.
|
||||
An adversary who has no access to the private data, because the message has been encrypted, could still alter the message.
|
||||
|
||||
Waku offers confidentiality via secure channels set up with the help of the [Noise Protocol Framework](https://noiseprotocol.org/).
|
||||
Using these secure channels, message content is only disclosed to the intended receivers.
|
||||
They also provide good metadata protection properties.
|
||||
However, we do not have a metadata protection analysis yet,
|
||||
which is part of our privacy/anonymity roadmap.
|
||||
|
||||
### Anonymity
|
||||
|
||||
Privacy and anonymity are closely linked.
|
||||
Both the identity of a user and data that allows inferring a user's identity should be part of the privacy policy.
|
||||
For the purpose of analysis, we want to have a clearer separation between these concepts.
|
||||
|
||||
We define anonymity as _unlinkability of users' identities and their shared data and/or actions_.
|
||||
|
||||
We subdivide anonymity into _receiver anonymity_ and _sender anonymity_.
|
||||
|
||||
#### Receiver Anonymity
|
||||
|
||||
We define receiver anonymity as _unlinkability of users' identities and the data they receive and/or related actions_.
|
||||
The data transmitted via Waku relay must be a [Waku message](https://rfc.vac.dev/spec/14/), which contains a content topic field.
|
||||
Because each message is associated with a content topic, and each receiver is interested in messages with specific content topics,
|
||||
receiver anonymity in the context of Waku corresponds to _subscriber-topic unlinkability_.
|
||||
An example for the "action" part of our receiver anonymity definition is subscribing to a specific topic.
|
||||
|
||||
The Waku message's content topic is not related to the libp2p pubsub topic.
|
||||
For now, Waku uses a single libp2p pubsub topic, which means messages are propagated via a single mesh of peers.
|
||||
With this, the receiver discloses its participation in Waku on the gossipsub layer.
|
||||
We will leave the analysis of libp2p gossipsub to a future article within this series, and only provide a few hints and pointers here.
|
||||
|
||||
Waku offers k-anonymity regarding content topic interest in the global adversary model.
|
||||
[K-anonymity](https://en.wikipedia.org/wiki/K-anonymity) in the context of Waku means an attacker can link receivers to content topics with a maximum certainty of $1/k$.
|
||||
The larger $k$, the less certainty the attacker gains.
|
||||
Receivers basically hide in a pool of $k$ content topics, any subset of which could be topics they subscribed to.
|
||||
The attacker does not know which of those the receiver actually subscribed to,
|
||||
and the receiver enjoys [plausible deniability](https://en.wikipedia.org/wiki/Plausible_deniability#Use_in_cryptography) regarding content topic subscription.
|
||||
Assuming there are $n$ Waku content topics, a receiver has $n$-anonymity with respect to association to a specific content topic.
|
||||
|
||||
Technically, Waku allows distributing messages over several libp2p pubsub topics.
|
||||
This yields $k$-anonymity, assuming $k$ content topics share the same pubsub topic.
|
||||
However, if done wrongly, such sharding of pubsub topics can breach anonymity.
|
||||
A formal specification of anonymity-preserving topic sharding building on the concepts of [partitioned topics](https://specs.status.im/spec/10#partitioned-topic) is part of our roadmap.
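The effect of sharding on $k$-anonymity can be illustrated with a small calculation. This is only a sketch under the assumptions stated above: the attacker is taken to observe which pubsub topic a receiver participates in, and the topic names and the function name `linking_certainty` are invented for the example.

```python
from collections import Counter
from fractions import Fraction


def linking_certainty(content_to_pubsub):
    """Map each pubsub topic to the attacker's maximum certainty 1/k.

    k is the number of content topics that share the given pubsub topic.
    """
    k_per_pubsub = Counter(content_to_pubsub.values())
    return {pubsub: Fraction(1, k) for pubsub, k in k_per_pubsub.items()}


# Hypothetical sharding: four content topics share one pubsub topic,
# while a fifth content topic sits alone on its own pubsub topic.
sharding = {
    "chat-1": "shard-a",
    "chat-2": "shard-a",
    "chat-3": "shard-a",
    "chat-4": "shard-a",
    "chat-5": "shard-b",
}
certainty = linking_certainty(sharding)
```

In this example, receivers on `shard-a` can be linked to a specific content topic with a certainty of at most $1/4$, whereas a receiver on `shard-b` is linked to `chat-5` with certainty $1$, which shows how a poorly chosen sharding can breach receiver anonymity.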
|
||||
|
||||
Also, Waku is not directly concerned with 1:1 communication, so for this post, 1:1 communication is out of scope.
|
||||
Channels for 1:1 communication can be implemented on top of Waku relay.
|
||||
In the future, a 1:1 communication protocol might be added to Waku.
|
||||
Similar to topic sharding, it would maintain receiver anonymity leveraging [partitioned topics](https://specs.status.im/spec/10#partitioned-topic).
|
||||
|
||||
#### Sender Anonymity
|
||||
|
||||
We define sender anonymity as _unlinkability of users' identities and the data they send and/or related actions_.
|
||||
Because the data in the context of Waku is Waku messages, sender anonymity corresponds to _sender-message unlinkability_.
|
||||
|
||||
In summary, Waku offers weak sender anonymity because of [Waku's strict no-sign policy](https://rfc.vac.dev/spec/11/#signature-policy),
|
||||
which has its origins in the [Ethereum consensus specs](https://github.com/ethereum/consensus-specs/blob/dev/specs/phase0/p2p-interface.md#why-are-we-using-the-strictnosign-signature-policy).
|
||||
[17/WAKU-RLN-RELAY](https://rfc.vac.dev/spec/17/) and [18/WAKU2-SWAP](https://rfc.vac.dev/spec/18/) mitigate replay and injection attacks.
|
||||
|
||||
Waku currently does not offer sender anonymity in stronger attacker models, nor can it protect against targeted attacks in weaker attacker models like the single or multi node attacker.
|
||||
We will cover this in more detail in later sections.
|
||||
|
||||
### Anonymity Trilemma
|
||||
|
||||
[The Anonymity trilemma](https://freedom.cs.purdue.edu/projects/trilemma.html) states that only two out of _strong anonymity_, _low bandwidth_, and _low latency_ can be guaranteed in the global on-net attacker model.
|
||||
Waku's goal, being a modular set of protocols, is to offer any combination of two out of these three properties, as well as blends.
|
||||
An example for blending is an adjustable number of pubsub topics and peers in the respective pubsub topic mesh; this allows tuning the trade-off between anonymity and bandwidth.
|
||||
|
||||

|
||||
|
||||
A fourth factor that influences [the anonymity trilemma](https://freedom.cs.purdue.edu/projects/trilemma.html) is _frequency and patterns_ of messages.
|
||||
The more messages there are, and the more randomly distributed they are, the better the anonymity protection offered by a given anonymous communication protocol.
|
||||
So, incentivising users to use the protocol, for instance by lowering entry barriers, helps protect the anonymity of all users.
|
||||
The frequency/patterns factor is also related to the k-anonymity described above.
|
||||
|
||||
### Censorship Resistance
|
||||
|
||||
Another security related property that Waku aims to offer is censorship resistance.
|
||||
Censorship resistance guarantees that users can participate even if an attacker tries to deny them access.
|
||||
So, censorship resistance ties into the availability aspect of security.
|
||||
In the context of Waku that means users should be able to send messages as well as receive all messages they are interested in,
|
||||
even if an attacker tries to prevent them from disseminating messages or tries to deny them access to messages.
|
||||
|
||||
Currently, Waku only guarantees censorship resistance in the weak single node attacker model.
|
||||
While currently employed secure channels mitigate targeted censorship, e.g. blocking specific content topics,
|
||||
general censorship resistance in strong attacker models is part of our roadmap.
|
||||
Among other options, we will investigate [Pluggable Transports](https://www.pluggabletransports.info/about/) in future articles.
|
||||
|
||||
## Attacker Types
|
||||
|
||||
The following lists various attacker types with varying degrees of power.
|
||||
The more power an attacker has, the more difficult it is to gain the respective attacker position.
|
||||
|
||||
Each attacker type comes in a passive and an active variant.
|
||||
While a passive attacker can stay hidden and is not suspicious,
|
||||
the respective active attacker has more (or at least the same) deanonymization power.
|
||||
|
||||
We also distinguish between internal and external attackers.
|
||||
|
||||
### Internal
|
||||
|
||||
With respect to Waku relay, an internal attacker participates in the same pubsub topic as its victims.
|
||||
Without additional measures on higher layer protocols, access to an internal position is easy to get.
|
||||
|
||||
#### Single Node
|
||||
|
||||
This attacker controls a single node.
|
||||
Because this position corresponds to normal usage of Waku relay, it is trivial to obtain.
|
||||
|
||||
#### Multi Node
|
||||
|
||||
This attacker controls several nodes. We assume a smaller static number of controlled nodes.
|
||||
The multi node position can be achieved relatively easily by setting up multiple nodes.
|
||||
Botnets might be leveraged to increase the number of available hosts.
|
||||
Multi node attackers could use [Sybil attacks](https://en.wikipedia.org/wiki/Sybil_attack) to increase the number of controlled nodes.
|
||||
A countermeasure is for nodes to only accept libp2p gossipsub graft requests from peers with different IP addresses, or even different subnets.
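
The subnet-based countermeasure can be sketched as follows. This is a minimal illustration, not libp2p's API: `accept_graft`, `subnet_of`, and the `/24` prefix length are assumptions made for this example.

```python
import ipaddress


def subnet_of(ip: str, prefix_len: int = 24):
    """Return the network that contains `ip` for the given prefix length."""
    # strict=False lets us pass a host address instead of a network address.
    return ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)


def accept_graft(mesh_peer_ips, candidate_ip: str, prefix_len: int = 24) -> bool:
    """Hypothetical policy: accept a graft only if no mesh peer shares the candidate's subnet."""
    candidate_net = subnet_of(candidate_ip, prefix_len)
    return all(subnet_of(ip, prefix_len) != candidate_net for ip in mesh_peer_ips)


mesh = ["203.0.113.7", "198.51.100.20"]
print(accept_graft(mesh, "203.0.113.99"))  # False: same /24 as an existing mesh peer
print(accept_graft(mesh, "192.0.2.5"))     # True: subnet not yet represented
```

A policy like this raises the cost of a multi node attack, because the attacker needs hosts in many distinct subnets; it does not stop an attacker who already has them, for example via a botnet.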
|
||||
|
||||
#### Linearly Scaling Nodes
|
||||
|
||||
This attacker controls a number of nodes that scales linearly with the number of nodes in the network.
|
||||
This attacker is especially interesting to investigate in the context of DHT security,
|
||||
which Waku uses for ambient peer discovery.
|
||||
|
||||
### External
|
||||
|
||||
An external attacker can only see encrypted traffic (protected by a secure channel set up with [Noise](https://rfc.vac.dev/spec/35/)).
|
||||
Because an internal position can be easily obtained,
|
||||
in practice external attackers would mount combined attacks that leverage both internal and external attacks.
|
||||
We cover this more below when describing attacks.
|
||||
|
||||
#### Local
|
||||
|
||||
A local attacker has access to communication links in a local network segment.
|
||||
This could be a rogue access point (with routing capability).
|
||||
|
||||
#### AS
|
||||
|
||||
An AS attacker controls a single AS (autonomous system).
|
||||
A passive AS attacker can listen to traffic on arbitrary links within the AS.
|
||||
An active AS attacker can drop, inject, and alter traffic on arbitrary links within the AS.
|
||||
|
||||
In practice, a malicious ISP would be considered as an AS attacker.
|
||||
A malicious ISP could also easily set up a set of nodes at specific points in the network,
|
||||
gaining internal attack power similar to a strong multi node attacker.
|
||||
|
||||
#### Global On-Net
|
||||
|
||||
A global on-net attacker has complete overview over the whole network.
|
||||
A passive global attacker can listen to traffic on all links,
|
||||
while the active global attacker basically carries the traffic: it can freely drop, inject, and alter traffic at all positions in the network.
|
||||
This basically corresponds to the [Dolev-Yao model](https://en.wikipedia.org/wiki/Dolev%E2%80%93Yao_model).
|
||||
|
||||
An entity with this power would, in practice, also have the power of the internal linearly scaling nodes attacker.
|
||||
|
||||
## Attack-based Threat Analysis
|
||||
|
||||
The following lists various attacks including the weakest attacker model in which the attack can be successfully performed.
|
||||
The respective attack can be performed in all stronger attacker models as well.
|
||||
|
||||
An attack is considered more powerful if it can be successfully performed in a weaker attacker model.
|
||||
|
||||
If not stated otherwise, we look at these attacks with respect to their capability to deanonymize the message sender.
|
||||
|
||||
### Scope
|
||||
|
||||
In this post, we introduce a simple tightly scoped threat model for Waku v2 Relay, which will be extended in the course of this article series.
|
||||
|
||||
In this first post, we will look at the relay protocol in isolation.
|
||||
Even though many threats arise from layers Waku relay is based on, and layers that in turn live on top of relay,
|
||||
we want to first look at relay in isolation because it is at the core of Waku v2.
|
||||
Addressing and trying to solve all security issues of a complex system at once is an overwhelming task, which is why we focus on the soundness of relay first.
|
||||
|
||||
This also goes well with the modular design philosophy of Waku v2, as layers of varying levels of security guarantees can be built on top of relay, all of which can rely on the guarantees that Waku provides.
|
||||
Instead of looking at a multiplicative explosion of possible interactions, we look at the core in this article, and cover the most relevant combinations in future posts.
|
||||
|
||||
Further restricting the scope, we will look at the data field of a relay message as a black box.
|
||||
In a second article on Waku v2 relay, we will look into the data field, which according to the [specification of Waku v2 relay](https://rfc.vac.dev/spec/11/#message-fields) must be a [Waku v2 message](https://rfc.vac.dev/spec/14/).
|
||||
We only consider messages with version field `2`, which indicates that the payload has to be encoded using [35/WAKU2-NOISE](https://rfc.vac.dev/spec/35/).
|
||||
|
||||
### Prerequisite: Get a Specific Position in the Network
|
||||
|
||||
Some attacks require the attacker node(s) to be in a specific position in the network.
|
||||
In most cases, this corresponds to trying to get into the mesh peer list for the desired pubsub topic of the victim node.
|
||||
|
||||
In libp2p gossipsub, and by extension Waku v2 relay, nodes can simply send a graft message for the desired topic to the victim node.
|
||||
If the victim node still has open slots, the attacker gets the desired position.
|
||||
This only requires the attacker to know the gossipsub multiaddress of the victim node.
|
||||
|
||||
A linearly scaling nodes attacker can leverage DHT based discovery systems to boost the probability of malicious nodes being returned, which in turn significantly increases the probability of attacker nodes ending up in the peer lists of victim nodes.
|
||||
[Waku v2 discv5](https://vac.dev/wakuv2-apd) will employ countermeasures that mitigate the amplifying effect this attacker type can achieve.
|
||||
|
||||
### Replay Attack
|
||||
|
||||
In the scope we defined above, Waku v2 is resilient against replay attacks.
|
||||
GossipSub nodes, and by extension Waku relay nodes, feature a `seen` cache, and only relay messages they have not seen before.
|
||||
Further, replay attacks will be punished by [RLN](https://rfc.vac.dev/spec/17/) and [SWAP](https://rfc.vac.dev/spec/18/).
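
To illustrate the `seen` cache idea, here is a minimal sketch. It assumes a size-bounded cache and a message ID derived with SHA-256; real GossipSub implementations typically expire entries by time and compute message IDs in their own way, so `SeenCache` and `first_time` are illustrative names only.

```python
import hashlib
from collections import OrderedDict


class SeenCache:
    """Bounded set of message IDs; the oldest entries are evicted first."""

    def __init__(self, capacity: int = 1024):
        self.capacity = capacity
        self._ids = OrderedDict()

    def first_time(self, message: bytes) -> bool:
        """Return True if the message is new (and remember it), False if it was seen before."""
        msg_id = hashlib.sha256(message).digest()  # assumed message ID function
        if msg_id in self._ids:
            return False  # replayed or duplicate message: do not relay again
        self._ids[msg_id] = None
        if len(self._ids) > self.capacity:
            self._ids.popitem(last=False)  # evict the oldest ID
        return True


cache = SeenCache()
print(cache.first_time(b"hello"))  # True: relay it
print(cache.first_time(b"hello"))  # False: already seen, drop it
```

The bounded size shows the limit of this protection on its own: a message replayed after its ID has been evicted would be relayed again, which is where rate limiting mechanisms such as RLN come in.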
|
||||
|
||||
### Neighbourhood Surveillance
|
||||
|
||||
This attack can be performed by a single node attacker that is connected to all peers of the victim node $v$ with respect to a specific topic mesh.
|
||||
The attacker also has to be connected to $v$.
|
||||
In this position, the attacker will receive messages $m_v$ sent by $v$ both on the direct path from $v$, and on indirect paths relayed by peers of $v$.
|
||||
It will also receive messages $m_x$ that are not sent by $v$. These messages $m_x$ are relayed by both $v$ and the peers of $v$.
|
||||
Messages that are received (significantly) faster from $v$ than from any other of $v$'s peers are very likely messages that $v$ sent,
|
||||
because for these messages the attacker is one hop closer to the source.
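
The timing heuristic described above can be sketched as a simple comparison of arrival times. The function name, the peer IDs, and the `margin` threshold are assumptions for illustration; a real attacker would calibrate the threshold from latency measurements.

```python
def likely_sent_by_victim(arrivals: dict, victim: str, margin: float) -> bool:
    """Guess whether one message originated at `victim`.

    `arrivals` maps a peer ID to the time (in seconds) at which the observer
    received this message from that peer. The message is attributed to the
    victim if the copy from the victim arrived at least `margin` seconds
    before the copy from any other peer.
    """
    if victim not in arrivals:
        return False
    others = [t for peer, t in arrivals.items() if peer != victim]
    if not others:
        return False  # nothing to compare against
    return min(others) - arrivals[victim] >= margin


# Copy from v arrives well before the copies relayed by v's peers.
print(likely_sent_by_victim({"v": 0.010, "p1": 0.045, "p2": 0.050}, "v", 0.02))
# Copy from p1 arrives first, so v most likely only relayed the message.
print(likely_sent_by_victim({"v": 0.040, "p1": 0.012, "p2": 0.045}, "v", 0.02))
```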
|
||||
|
||||
The attacker can (periodically) measure latency between itself and $v$, and between itself and the peers of $v$ to get more accurate estimates for the expected timings.
|
||||
An AS attacker (and if the topology allows, even a local attacker) could also learn the latency between $v$ and its well-behaving peers.
|
||||
An active AS attacker could also increase the latency between $v$ and its peers to make the timing differences more prominent.
|
||||
This, however, might lead to $v$ switching to other peers.
|
||||
|
||||
This attack cannot (reliably) distinguish messages $m_v$ sent by $v$ from messages $m_y$ relayed by peers of $v$ the attacker is not connected to.
|
||||
Still, there are hop-count variations that might be leveraged.
|
||||
Messages $m_v$ always have a hop-count of 1 on the path from $v$ to the attacker, while all other paths are longer.
|
||||
Messages $m_y$ might have the same hop-count on the path from $v$ as well as on other paths.
|
||||
|
||||
### Controlled Neighbourhood
|
||||
|
||||
If a multi node attacker manages to control all peers of the victim node, it can trivially tell which messages originated from $v$.
|
||||
|
||||
### Observing Messages
|
||||
|
||||
If Waku relay was not protected with Noise, the AS attacker could simply check for messages leaving $v$ which have not been relayed to $v$.
|
||||
These are the messages sent by $v$.
|
||||
Waku relay protects against this attack by employing secure channels set up using Noise.
|
||||
|
||||
### Correlation
|
||||
|
||||
Monitoring all traffic (in an AS or globally) allows the attacker to identify traffic correlated with messages originating from $v$.
|
||||
This (alone) does not allow an external attacker to learn which message $v$ sent, but it allows identifying the respective traffic propagating through the network.
|
||||
The more traffic in the network, the lower the success rate of this attack.
|
||||
|
||||
Combined with just a few nodes controlled by the attacker, the actual message associated with the correlated traffic can eventually be identified.
|
||||
|
||||
### DoS
|
||||
|
||||
An active single node attacker could run a disruption attack by
|
||||
|
||||
- (1) dropping messages that should be relayed
|
||||
- (2) flooding neighbours with bogus messages
|
||||
|
||||
While (1) has a negative effect on availability, the impact is not significant.
|
||||
A linearly scaling botnet attacker, however, could significantly disrupt the network with such an attack.
|
||||
(2) is thwarted by [RLN](https://rfc.vac.dev/spec/17/).
|
||||
[SWAP](https://rfc.vac.dev/spec/18/) also helps mitigate DoS attacks.
|
||||
|
||||
A local attacker can DoS Waku by dropping all Waku traffic within its controlled network segment.
|
||||
An AS attacker can DoS Waku within its authority, while a global attacker can DoS the whole network.
|
||||
Countermeasures include censorship resistance techniques like [Pluggable Transports](https://www.pluggabletransports.info/about/).
|
||||
|
||||
## Summary and Future Work
|
||||
|
||||
Currently, Waku v2 relay offers k-anonymity with respect to receiver anonymity.
|
||||
This also includes k-anonymity towards legitimate members of the same topic.
|
||||
|
||||
Waku v2 relay offers sender anonymity in the single node attacker model with its [strict no sign policy](https://rfc.vac.dev/spec/11/#signature-policy).
|
||||
Currently, Waku v2 does not guarantee sender anonymity in the multi node and stronger attacker models.
|
||||
However, we are working on modular anonymity-preserving protocols and building blocks as part of our privacy/anonymity roadmap.
|
||||
The goal is to allow tunable anonymity with respect to trade offs between _strong anonymity_, _low bandwidth_, and _low latency_.
|
||||
Not all of these can be fully guaranteed at once, as [the anonymity trilemma](https://freedom.cs.purdue.edu/projects/trilemma.html) states.
|
||||
Some applications have specific requirements, e.g. low latency, which require a compromise on anonymity.
|
||||
Anonymity-preserving mechanisms we plan to investigate and eventually specify as pluggable anonymity protocols for Waku comprise
|
||||
|
||||
- [Dandelion++](https://arxiv.org/abs/1805.11060) for lightweight anonymity;
|
||||
- [onion routing](https://en.wikipedia.org/wiki/Onion_routing) as a building block adding a low latency anonymization layer;
|
||||
- [a mix network](https://en.wikipedia.org/wiki/Mix_network) for providing strong anonymity (on top of onion routing) even in the strongest attacker model at the cost of higher latency.
|
||||
|
||||
These pluggable anonymity-preserving protocols will form a sub-set of the Waku v2 protocol set.
|
||||
As an intermediate step, we might directly employ Tor for onion-routing, and [Nym](https://nymtech.net/) as a mix-net layer.
|
||||
|
||||
In future research log posts, we will cover further Waku v2 protocols and identify anonymity problems that will be added to our roadmap.
|
||||
These protocols comprise
|
||||
|
||||
- [13/WAKU2-STORE](https://rfc.vac.dev/spec/13/), which can violate receiver anonymity as it allows filtering by content topic.
|
||||
A countermeasure is using the content topic exclusively for local filters.
|
||||
- [12/WAKU2-FILTER](https://rfc.vac.dev/spec/12/), which discloses nodes' interest in topics;
|
||||
- [19/WAKU2-LIGHTPUSH](https://rfc.vac.dev/spec/19/), which also discloses nodes' interest in topics and links the lightpush client as the sender of a message to the lightpush service node;
|
||||
- [21/WAKU2-FTSTORE](https://rfc.vac.dev/spec/21/), which discloses nodes' interest in specific time ranges, which allows inferring information like online times.
|
||||
|
||||
While these protocols are not necessary for the operation of Waku v2, and can be seen as pluggable features,
|
||||
we aim to provide alternatives without the cost of lowering the anonymity level.
|
||||
|
||||
## References
|
||||
|
||||
- [10/WAKU2](https://rfc.vac.dev/spec/10/)
|
||||
- [11/WAKU2-RELAY](https://rfc.vac.dev/spec/11/)
|
||||
- [libp2p GossipSub](https://github.com/libp2p/specs/blob/master/pubsub/gossipsub/README.md)
|
||||
- [Security](https://en.wikipedia.org/wiki/Information_security)
|
||||
- [Authentication](https://en.wikipedia.org/wiki/Authentication)
|
||||
- [Non-repudiation](https://en.wikipedia.org/wiki/Non-repudiation)
|
||||
- [Noise Protocol Framework](https://noiseprotocol.org/)
|
||||
- [plausible deniability](https://en.wikipedia.org/wiki/Plausible_deniability#Use_in_cryptography)
|
||||
- [Waku v2 message](https://rfc.vac.dev/spec/14/)
|
||||
- [partitioned topics](https://specs.status.im/spec/10#partitioned-topic)
|
||||
- [Sybil attack](https://en.wikipedia.org/wiki/Sybil_attack)
|
||||
- [Dolev-Yao model](https://en.wikipedia.org/wiki/Dolev%E2%80%93Yao_model)
|
||||
- [35/WAKU2-NOISE](https://rfc.vac.dev/spec/35/)
|
||||
- [33/WAKU2-DISCV5](https://vac.dev/wakuv2-apd)
|
||||
- [strict no sign policy](https://github.com/ethereum/consensus-specs/blob/dev/specs/phase0/p2p-interface.md#why-are-we-using-the-strictnosign-signature-policy)
|
||||
- [Waku v2 strict no sign policy](https://rfc.vac.dev/spec/11/#signature-policy)
|
||||
- [17/WAKU-RLN-RELAY](https://rfc.vac.dev/spec/17/)
|
||||
- [anonymity trilemma](https://freedom.cs.purdue.edu/projects/trilemma.html)
|
||||
- [18/WAKU2-SWAP](https://rfc.vac.dev/spec/18/)
|
||||
- [Pluggable Transports](https://www.pluggabletransports.info/about/)
|
||||
- [Nym](https://nymtech.net/)
|
||||
- [Dandelion++](https://arxiv.org/abs/1805.11060)
|
||||
- [13/WAKU2-STORE](https://rfc.vac.dev/spec/13/)
|
||||
- [12/WAKU2-FILTER](https://rfc.vac.dev/spec/12/)
|
||||
- [19/WAKU2-LIGHTPUSH](https://rfc.vac.dev/spec/19/)
|
||||
- [21/WAKU2-FTSTORE](https://rfc.vac.dev/spec/21/)
|
||||
493
rlog/2022-11-04-building-privacy-protecting-infrastructure.mdx
Normal file
@@ -0,0 +1,493 @@
|
||||
---
|
||||
layout: post
|
||||
name: 'Building Privacy-Protecting Infrastructure'
|
||||
title: 'Building Privacy-Protecting Infrastructure'
|
||||
date: 2022-11-04 12:00:00
|
||||
authors: oskarth
|
||||
published: true
|
||||
slug: building-privacy-protecting-infrastructure
|
||||
categories: research
|
||||
image: /img/building_private_infra_intro.png
|
||||
discuss: https://forum.vac.dev/t/discussion-building-privacy-protecting-infrastructure/161
|
||||
---
|
||||
|
||||
What is privacy-protecting infrastructure? Why do we need it and how can we build it? We'll look at Waku, the communication layer for Web3. We'll see how it uses ZKPs to incentivize and protect the Waku network. We'll also look at Zerokit, a library that makes it easier to use ZKPs in different environments. After reading this, I hope you'll better understand the importance of privacy-protecting infrastructure and how we can build it.
|
||||
|
||||
<!--truncate-->
|
||||
|
||||
_This write-up is based on a talk given at DevCon 6 in Bogota, a video can be found [here](https://www.youtube.com/watch?v=CW1DYJifdhs)_
|
||||
|
||||
### Intro
|
||||
|
||||
In this write-up, we are going to talk about building privacy-protecting
|
||||
infrastructure. What is it, why do we need it and how can we build it?
|
||||
|
||||
We'll look at Waku, the communication layer for Web3. We'll look at how we are
|
||||
using Zero Knowledge (ZK) technology to incentivize and protect the Waku
|
||||
network. We'll also look at Zerokit, a library we are writing to make ZKP easier
|
||||
to use in different environments.
|
||||
|
||||
At the end of this write-up, I hope you'll come away with an understanding of
|
||||
the importance of privacy-protecting infrastructure and how we can build it.
|
||||
|
||||
### About
|
||||
|
||||
First, briefly about Vac. We build public good protocols for the decentralized
|
||||
web, with a focus on privacy and communication. We do applied research based on
|
||||
which we build protocols, libraries and publications. We are also the custodians
|
||||
of protocols that reflect a set of principles.
|
||||
|
||||

|
||||
|
||||
It has its origins in the [Status app](https://status.im/) and trying to improve
|
||||
the underlying protocols and infrastructure. We build [Waku](https://waku.org/),
|
||||
among other things.
|
||||
|
||||
### Why build privacy-protecting infrastructure?
|
||||
|
||||
Privacy is the power to selectively reveal yourself. It is a requirement for
|
||||
freedom and self-determination.
|
||||
|
||||
Just like you need decentralization in order to get censorship-resistance, you
|
||||
need privacy to enable freedom of expression.
|
||||
|
||||
To build applications that are decentralized and privacy-protecting, you need
|
||||
the base layer, the infrastructure itself, to have those properties.
|
||||
|
||||
We see this a lot. It is easier to make trade-offs at the application layer than
|
||||
doing them at the base layer. You can build custodial solutions on top of a
|
||||
decentralized and non-custodial network where participants control their own
|
||||
keys, but you can't do the opposite.
|
||||
|
||||
If you think about it, buildings can be seen as a form of privacy-protecting
|
||||
infrastructure. It is completely normal and obvious in many ways, but when it
|
||||
comes to the digital realm our mental models and way of speaking about it haven't
|
||||
caught up yet for most people.
|
||||
|
||||
I'm not going too much more into the need for privacy or what happens when you
|
||||
don't have it, but suffice it to say it is an important property for any open
|
||||
society.
|
||||
|
||||
When we have conversations, true peer-to-peer offline conversations, we can talk
|
||||
privately. If we use cash to buy things we can do commerce privately.
|
||||
|
||||
On the Internet, great as it is, there are a lot of forces that make this
|
||||
natural state of things not the default. Big Tech has turned users into a
|
||||
commodity, a product, and monetized users' attention for advertising. To
|
||||
optimize for your attention they need to surveil your habits and activities, and
|
||||
hence breach your privacy. As opposed to more old-fashioned models, where
|
||||
someone is buying a useful service from a company and the incentives are more
|
||||
aligned.
|
||||
|
||||
We need to build credibly neutral infrastructure that protects your privacy at
|
||||
the base layer, in order to truly enable applications that are
|
||||
censorship-resistant and encourage meaningful freedom of expression.
|
||||
|
||||
### Web3 infrastructure
|
||||
|
||||
Infrastructure is what lies underneath. Many ways of looking at this but I'll
|
||||
keep it simple as per the original Web3 vision. You had Ethereum for
|
||||
compute/consensus, Swarm for storage, and Whisper for messaging. Waku has taken
|
||||
over the mantle from Whisper and is a lot more
|
||||
[usable](https://vac.dev/fixing-whisper-with-waku) today than Whisper ever was,
|
||||
for many reasons.
|
||||
|
||||

|
||||
|
||||
On the privacy-front, we see how Ethereum is struggling. It is a big UX problem,
|
||||
especially when you try to add privacy back "on top". It takes a lot of effort
|
||||
and it is easier to censor. We see this with recent action around Tornado Cash.
|
||||
Compare this with something like Zcash or Monero, where privacy is there by
|
||||
default.
|
||||
|
||||
There are also problems when it comes to the p2p networking side of things, for
|
||||
example with Ethereum validator privacy and hostile actors and jurisdictions. If
|
||||
someone can easily find out where a certain validator is physically located,
|
||||
that's a problem in many parts of the world. Being able to have stronger
|
||||
privacy-protection guarantees would be very useful for high-value targets.
|
||||
|
||||
This doesn't begin to touch on the so called "dapps" that make a lot of
|
||||
sacrifices in how they function, from the way domains work, to how websites are
|
||||
hosted and the reliance on centralized services for communication. We see this
|
||||
time and time again, where centralized, single points of failure systems work
|
||||
for a while, but then eventually fail.
|
||||
|
||||
In many cases an individual user might not care enough though, and for platforms
|
||||
the lure to take shortcuts is strong. That is why it is important to be
|
||||
principled, but also pragmatic in terms of the trade-offs that you allow on top.
|
||||
We'll touch more on this in the design goals around modularity that Waku has.
|
||||
|
||||
### ZK for privacy-protecting infrastructure
|
||||
|
||||
ZKPs are a wonderful new tool. Just like smart contracts enable programmable
|
||||
money, ZKPs allow us to express fundamentally new things. In line with the great
|
||||
tradition of trust-minimization, we can prove statements while revealing the
|
||||
absolute minimum information necessary. This fits the definition of privacy, the
|
||||
power to selectively reveal yourself, perfectly. I'm sure I don't need to tell
|
||||
anyone reading this but this is truly revolutionary. The technology is advancing
|
||||
extremely fast and often it is our imagination that is the limit.
|
||||
|
||||

|
||||
|
||||
### Waku
|
||||
|
||||
What is Waku? It is a set of modular protocols for p2p communication. It has a
|
||||
focus on privacy, security and being able to run anywhere. It is the spiritual
|
||||
successor to Whisper.
|
||||
|
||||
By modular we mean that you can pick and choose protocols and how you use them
|
||||
depending on constraints and trade-offs. For example, bandwidth usage vs
|
||||
privacy.
|
||||
|
||||
It is designed to work in resource restricted environments, such as mobile
|
||||
phones and in web browsers. It is important that infrastructure meets users
|
||||
where they are and supports their real-world use cases. Just like you don't need
|
||||
your own army and a castle to have your own private bathroom, you shouldn't need
|
||||
to have a powerful always-on node to get reasonable privacy and
|
||||
censorship-resistance. We might call this self-sovereignty.
|
||||
|
||||
### Waku - adaptive nodes
|
||||
|
||||
One way of looking at Waku is as an open service network. There are nodes with
|
||||
varying degrees of capabilities and requirements. For example when it comes to
|
||||
bandwidth usage, storage, uptime, privacy requirements, latency requirements,
|
||||
and connectivity restrictions.
|
||||
|
||||
We have a concept of adaptive nodes that can run a variety of protocols. A node
|
||||
operator can choose which protocols they want to run. Naturally, there'll be
|
||||
some nodes that do more consumption and other nodes that do more provisioning.
|
||||
This gives rise to the idea of a service network, where services are provided
|
||||
for and consumed.
|
||||
|
||||

|
||||
|
||||
### Waku - protocol interactions
|
||||
|
||||
There are many protocols that interact. Waku Relay protocol is based on libp2p
|
||||
GossipSub for p2p messaging. We have filter for bandwidth-restricted nodes to
|
||||
only receive a subset of messages. Lightpush for nodes with short connection
|
||||
windows to push messages into the network. Store for nodes that want to retrieve
|
||||
historical messages.
|
||||
|
||||
On the payload layer, we provide support for Noise handshakes/key-exchanges.
|
||||
This means that as a developer, you can get end-to-end encryption and expected
|
||||
guarantees out of the box. We have support for setting up a secure channel from
|
||||
scratch, and all of this makes it much easier to provide Signal's Double Ratchet at
the protocol level. We also have experimental support for
|
||||
multi-device usage. Similar features have existed in for example the Status app
|
||||
for a while, but with this we make it easier for any platform using Waku to use
|
||||
it.
|
||||
|
||||
There are other protocols too, related to peer discovery, topic usage, etc. See
|
||||
[specs](https://rfc.vac.dev/) for more details.
|
||||
|
||||

|
||||
|
||||
### Waku - Network
|
||||
|
||||
For the Waku network, there are a few problems. For example, when it comes to
|
||||
network spam and incentivizing service nodes. We want to address these while
|
||||
keeping privacy-guarantees of the base layer. I'm going to go into both of
|
||||
these.
|
||||
|
||||
The spam problem arises on the gossip layer when anyone can overwhelm the
|
||||
network with messages. The service incentivization is a problem when nodes don't
|
||||
directly benefit from the provisioning of a certain service. This can happen if
|
||||
they are not using the protocol directly themselves as part of normal operation,
|
||||
or if they aren't socially inclined to provide a certain service. This depends a
|
||||
lot on how an individual platform decides to use the network.
|
||||
|
||||

|
||||
|
||||
### Dealing with network spam and RLN Relay
|
||||
|
||||
Since the p2p relay network is open to anyone, there is a problem with spam. If
|
||||
we look at existing solutions for dealing with spam in traditional messaging
|
||||
systems, a lot of entities like Google, Facebook, Twitter, Telegram, Discord use
|
||||
phone number verification. While this is largely sybil-resistant, it is
|
||||
centralized and not private at all.
|
||||
|
||||
Historically, Whisper used PoW which isn't good for heterogeneous networks.
|
||||
Peer scoring is open to sybil attacks and doesn't directly address spam
|
||||
protection in an anonymous p2p network.
|
||||
|
||||
The key idea here is to use RLN for private economic spam protection using
|
||||
zkSNARKs.
|
||||
|
||||
I'm not going to go into too much detail of RLN here. If you are interested, I
|
||||
gave a [talk](https://www.youtube.com/watch?v=g41nHQ0mLoA) in Amsterdam at
|
||||
Devconnect about this. We have some write-ups on RLN
|
||||
[here](https://vac.dev/rln-relay) by Sanaz who has been pushing a lot of this
|
||||
from our side. There's also another talk at Devcon by Tyler going into RLN in
|
||||
more detail. Finally, here's the [RLN spec](https://rfc.vac.dev/spec/32/).
|
||||
|
||||
I'll briefly go over what it is, the interface and circuit and then talk about
|
||||
how it is used in Waku.
|
||||
|
||||
### RLN - Overview and Flow
|
||||
|
||||
RLN stands for Rate Limiting Nullifier. It is an anonymous rate limiting
|
||||
mechanism based on zkSNARKs. By rate limiting we mean you can only send N
|
||||
messages in a given period. By anonymity we mean that you can't link a message to
|
||||
a publisher. We can think of it as a voting booth, where you are only allowed to
|
||||
vote once every election.
|
||||
|
||||

|
||||
|
||||
It can be used for spam protection in p2p messaging systems, and also rate
|
||||
limiting in general, such as for a decentralized captcha.
|
||||
|
||||
There are three parts to it. You register somewhere, then you can signal and
|
||||
finally there's a verification/slashing phase. You put some capital at risk,
|
||||
either economic or social, and if you double signal you get slashed.
|
||||
|
||||
### RLN - Circuit
|
||||
|
||||
Here's what the private and public inputs to the circuit look like. The identity
|
||||
secret is generated locally, and we create an identity commitment that is
|
||||
inserted into a Merkle tree. We then use Merkle proofs to prove membership.
|
||||
A registered member can only signal once for a given epoch or external nullifier,
|
||||
for example every ten seconds in Unix time. The RLN identifier is for a specific RLN
|
||||
app.
|
||||
|
||||
We also see what the circuit output looks like. This is calculated locally. `y`
|
||||
is a share of the secret equation, and the (internal) nullifier acts as a unique
|
||||
fingerprint for a given app/user/epoch combination. How do we calculate `y` and
|
||||
the internal nullifier?
|
||||
|
||||
```
|
||||
// Private input
|
||||
signal input identity_secret;
|
||||
signal input path_elements[n_levels][1];
|
||||
signal input identity_path_index[n_levels];
|
||||
|
||||
// Public input
|
||||
signal input x; // signal_hash
|
||||
signal input epoch; // external_nullifier
|
||||
signal input rln_identifier;
|
||||
|
||||
// Circuit output
|
||||
signal output y;
|
||||
signal output root;
|
||||
signal output nullifier;
|
||||
```
|
||||
|
||||
### RLN - Shamir's secret sharing
|
||||
|
||||
This is done using [Shamir's secret
|
||||
sharing](https://en.wikipedia.org/wiki/Shamir%27s_Secret_Sharing). Shamir’s
|
||||
secret sharing is based on the idea of splitting a secret into shares. This is how
|
||||
we enable slashing of funds.
|
||||
|
||||
In this case, we have two shares. If a given identity `a_0` signals twice in the
same epoch/external nullifier, `a_1` is the same. For a given RLN app,
`internal_nullifier` then stays the same as well. `x` is the signal hash, which
differs between the two messages, and `y` is public, so two points on the same
line are known and we can reconstruct `identity_secret`. With the identity
secret revealed, this gives access to e.g. financial stake.
|
||||
|
||||
```
|
||||
a_0 = identity_secret // secret S
|
||||
a_1 = poseidonHash([a_0, external_nullifier])
|
||||
|
||||
y = a_0 + x * a_1
|
||||
|
||||
internal_nullifier = poseidonHash([a_1, rln_identifier])
|
||||
```
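
As a worked example of the reconstruction step: two shares `(x_1, y_1)` and `(x_2, y_2)` from the same epoch lie on the line `y = a_0 + x * a_1`, so `a_1 = (y_1 - y_2) / (x_1 - x_2)` and `a_0 = y_1 - x_1 * a_1`, with all arithmetic done in a prime field. The sketch below assumes the BN254 scalar field for the modulus and uses SHA-256 as a stand-in for Poseidon; `h`, `make_share` and `recover_secret` are hypothetical helper names.

```python
import hashlib

# Assumed field: the BN254 scalar field order commonly used with Circom circuits.
P = 21888242871839275222246405745257275088548364400416034343698204186575808495617


def h(*values: int) -> int:
    """SHA-256 stand-in for poseidonHash, reduced into the field (assumption)."""
    data = b"".join(v.to_bytes(32, "big") for v in values)
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % P


def make_share(identity_secret: int, external_nullifier: int, x: int):
    """Compute the public share (x, y) for one signal."""
    a_0 = identity_secret
    a_1 = h(a_0, external_nullifier)
    y = (a_0 + x * a_1) % P
    return x, y


def recover_secret(share_1, share_2) -> int:
    """Recover a_0 from two distinct shares that lie on the same line."""
    (x1, y1), (x2, y2) = share_1, share_2
    if x1 == x2:
        raise ValueError("shares must come from two different signals")
    slope = (y1 - y2) * pow((x1 - x2) % P, -1, P) % P  # this is a_1
    return (y1 - x1 * slope) % P


secret = 123456789
epoch = 42

# Two different messages in the same epoch leak the secret.
share_a = make_share(secret, epoch, h(1))
share_b = make_share(secret, epoch, h(2))
print(recover_secret(share_a, share_b) == secret)
```

A single share per epoch reveals nothing useful about `a_0`, since infinitely many lines pass through one point. Shares from different epochs use a different `a_1`, so they cannot be combined this way.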
|
||||
|
||||

|
||||
|
||||
### RLN Relay
|
||||
|
||||
This is how RLN is used with Relay/GossipSub protocol. A node registers and
|
||||
locks up funds, and after that it can send messages. It publishes a message
|
||||
containing the Zero Knowledge proof and some other details.
|
||||
|
||||
Each relayer node listens to the membership contract for new members, and it
|
||||
also keeps track of relevant metadata and merkle tree. Metadata is needed to be
|
||||
able to detect double signaling and perform slashing.
|
||||
|
||||
Before forwarding a message, it does some verification checks to ensure there
|
||||
are no duplicate messages, the ZKP is valid and no double signaling has occurred. It
|
||||
is worth noting that this can be combined with peer scoring, for example for
|
||||
duplicate messages or invalid ZK proofs.
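
The bookkeeping behind these checks can be sketched roughly as follows. This is a simplified illustration, not the nwaku implementation: ZK proof verification is left out, the ten-second epoch is taken from the example earlier in this post, and `Signal`, `NullifierLog` and `epoch_of` are hypothetical names.

```python
from dataclasses import dataclass

EPOCH_SECONDS = 10  # assumed epoch length, matching the earlier example


def epoch_of(unix_time: int) -> int:
    """Map a Unix timestamp to its epoch number."""
    return unix_time // EPOCH_SECONDS


@dataclass(frozen=True)
class Signal:
    epoch: int
    nullifier: int  # internal nullifier from the proof
    x: int          # signal hash
    y: int          # share of the secret


class NullifierLog:
    """Metadata a relayer keeps in order to spot double signaling."""

    def __init__(self):
        self._seen = {}  # (epoch, nullifier) -> (x, y)

    def check(self, msg: Signal) -> str:
        key = (msg.epoch, msg.nullifier)
        share = (msg.x, msg.y)
        previous = self._seen.get(key)
        if previous is None:
            self._seen[key] = share
            return "relay"          # first signal for this member and epoch
        if previous == share:
            return "duplicate"      # same message seen again, just drop it
        return "double_signal"      # two different shares, slashing is possible


log = NullifierLog()
first = Signal(epoch_of(1_700_000_003), nullifier=7, x=11, y=101)
print(log.check(first))
print(log.check(Signal(first.epoch, nullifier=7, x=12, y=202)))
```

In the `double_signal` case the relayer holds two shares for the same nullifier, which is exactly the input needed for the secret reconstruction described in the Shamir section.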
|
||||
|
||||
In line with Waku's goals of modularity, RLN Relay is applied on a specific subset
|
||||
of pubsub and content topics. You can think of it as an extra secure channel.
|
||||
|
||||

|
||||
|
||||
### RLN Relay cross-client testnet
|
||||
|
||||
Where are we with RLN Relay deployment? We've recently launched our second
|
||||
testnet. This is using RLN Relay with a smart contract on Goerli. It integrates
|
||||
with our example p2p chat application, and it does so through three different
|
||||
clients, nwaku, go-waku and js-waku for browsers. This is our first p2p
|
||||
cross-client testnet for RLN Relay.
|
||||
|
||||
Here's a [video](https://www.youtube.com/watch?v=-vVrJWW0fls) that shows a user
|
||||
registering in a browser, signaling through JS-Waku. It then gets relayed to a
|
||||
nwaku node, that verifies the proof. The second
|
||||
[video](https://www.youtube.com/watch?v=Xz5q2ZhkFYs) shows what happens in the
|
||||
spam case. When more than one message is sent in a given epoch, it detects it as
|
||||
spam and discards it. Slashing hasn't been implemented fully yet in the client
|
||||
and is a work in progress.
|
||||
|
||||
If you are curious and want to participate, you can join the effort on our [Vac
|
||||
Discord](https://discord.gg/PQFdubGt6d). We also have
|
||||
[tutorials](https://github.com/status-im/nwaku/blob/master/docs/tutorial/rln-chat-cross-client.md)
|
||||
setup for all clients so you can play around with it.
|
||||
|
||||
As part of this, and to make it work in multiple different environments, we've
|
||||
also been developing a new library called Zerokit. I'll talk about this a bit
|
||||
later.
|
||||
|
||||
### Private settlement / Service credentials
|
||||
|
||||
Going back to the service network idea, let's talk about service credentials.
|
||||
The idea behind service credentials and private settlement is to enable two
|
||||
actors to pay for and provide services without compromising their privacy. We do
|
||||
not want the payment to create a direct public link between the service provider
|
||||
and requester.
|
||||
|
||||
Recall the Waku service network illustration with adaptive nodes that choose
|
||||
which protocols they want to run. Many of these protocols aren't very heavy and
|
||||
just work by default. For example the relay protocol is enabled by default.
|
||||
Other protocols are much heavier to provide, such as storing historical
|
||||
messages.
|
||||
|
||||
It is desirable to have additional incentives for this, especially for platforms
|
||||
that aren't community-based where some level of altruism can be assumed (e.g.
|
||||
Status Communities, or WalletConnect cloud infrastructure).
|
||||
|
||||
You have a node Alice that is often offline and wants to consume historical
|
||||
messages on some specific content topics. You have another node Bob that runs a
|
||||
server at home where they store historical messages for the last several weeks.
|
||||
Bob is happy to provide this service for free because he's excited about running
|
||||
privacy-preserving infrastructure and he's using it himself, but his node is
|
||||
getting overwhelmed by freeloaders and he feels like he should be paid something
|
||||
for continuing to provide this service.
|
||||
|
||||
Alice deposits some funds in a smart contract which registers it in a tree,
|
||||
similar to certain other private settlement mechanisms. A fee is taken or
|
||||
burned. In exchange, she gets a set of tokens or service credentials. When she
|
||||
wants to do a query with some criteria, she sends this to Bob. Bob responds with
|
||||
size of response, cost, and receiver address. Alice then sends a proof of
|
||||
delegation of a service token as a payment. Bob verifies the proof and resolves
|
||||
the query.
|
||||
|
||||
The end result is that Alice has consumed some service from Bob, and Bob has
|
||||
received payment for this. There's no direct transaction link between Alice and
|
||||
Bob, and gas fees can be minimized by extending the period before settling on
|
||||
chain.
|
||||
|
||||
This can be complemented with altruistic service provisioning, for example by
|
||||
splitting the peer pool into two slots, or only providing a few cheap queries
|
||||
for free.
|
||||
|
||||
This approach is general, and can be applied to any kind of
request/response service provisioning that we want to keep private.
|
||||
|
||||
This isn't a perfect solution, but it is an incremental improvement on top of
|
||||
the status quo. It can be augmented with more advanced techniques such as better
|
||||
non-repudiable node reputation, proof of correct service provisioning, etc.
|
||||
|
||||
We are currently in the raw spec / proof of concept stage of this. We expect to
|
||||
launch a testnet of this later this year or early next year.
|
||||
|
||||

|
||||
|
||||
### Zerokit
|
||||
|
||||
[Zerokit](https://github.com/vacp2p/zerokit) is a set of Zero Knowledge modules,
|
||||
written in Rust and designed to be used in many different environments. The
|
||||
initial goal is to get the best of both worlds with Circom/Solidity/JS and
|
||||
Rust/ZK ecosystem. This enables people to leverage Circom-based constructs from
|
||||
non-JS environments.
|
||||
|
||||
For the RLN module, it is using Circom circuits via ark-circom and Rust for
|
||||
scaffolding. It exposes a C FFI API that can be used through other system
|
||||
programming environments, like Nim and Go. It also exposes an experimental WASM
|
||||
API that can be used through web browsers.
|
||||
|
||||
Waku is p2p infrastructure running in many different environments, such as
|
||||
Nim/JS/Go/Rust, so this is a requirement for us.
|
||||
|
||||
The strengths of Circom and JS are access to Dapp developers, tooling, generating
verification code, circuits etc. The strengths of Rust are that it is systems-based and
easy to interface with other language runtimes such as Nim, Go and C. It also
gives access to other Rust ZK ecosystems such as arkworks. This opens the door for
|
||||
using other constructs, such as Halo2. This becomes especially relevant for
|
||||
constructs where you don't want to do a trusted setup or where circuits are more
|
||||
complex/custom and performance requirements are higher.
|
||||
|
||||
In general with Zerokit, we want to make it easy to build and use ZKP in a
|
||||
multitude of environments, such as mobile phones and web browsers. Currently it
|
||||
is too complex to write privacy-protecting infrastructure with ZKPs considering
|
||||
all the languages and tools you have to learn, from JS, Solidity and Circom to
|
||||
Rust, WASM and FFI. And that isn't even touching on things like secure key
|
||||
storage or mobile dev. Luckily more and more projects are working on this,
|
||||
including writing DSLs etc. It'd also be exciting if we can make a useful
|
||||
toolstack for JS-less ZK dev to reduce cognitive overhead, similar to what we
|
||||
have with something like Foundry.
|
||||
|
||||
### Other research
|
||||
|
||||
I also want to mention a few other things we are doing. One thing is
|
||||
[protocol specifications](https://rfc.vac.dev/). We think this is very important
|
||||
for p2p infra, and we see a lot of other projects that claim to do p2p
|
||||
infrastructure but they aren't clear about guarantees or how stable something
|
||||
is. That makes it hard to have multiple implementations, to collaborate across
|
||||
different projects, and to analyze things objectively.
|
||||
|
||||
Related to that is publishing [papers](https://vac.dev/publications). We've put
|
||||
out three so far, related to Waku and RLN-Relay. This makes it easier to
|
||||
interface with academia. There are a lot of good researchers out there and we want
|
||||
to build a better bridge between academia and industry.
|
||||
|
||||
Another thing is [network](https://vac.dev/wakuv2-relay-anon)
|
||||
[privacy](https://github.com/vacp2p/research/issues/107). Waku is modular with
|
||||
respect to privacy guarantees, and there are a lot of knobs to turn here
|
||||
depending on specific deployments. For example, if you are running the full
|
||||
relay protocol you currently have much stronger receiver anonymity than if you
|
||||
are running filter protocol from a bandwidth or connectivity-restricted node.
|
||||
|
||||
We aim to make this pluggable depending on user needs. E.g. mixnets such as Nym
|
||||
come with some trade-offs but are a useful tool in the arsenal. A good mental
|
||||
model to keep in mind is the anonymity trilemma, where you can only pick 2/3 out
|
||||
of low latency, low bandwidth usage and strong anonymity.
|
||||
|
||||
We are currently exploring [Dandelion-like
|
||||
additions](https://github.com/vacp2p/research/issues/119) to the relay/gossip
|
||||
protocol, which would provide for stronger sender anonymity, especially in a
|
||||
multi-node/botnet attacker model. As part of this we are looking into different
|
||||
parameter choices and general possibilities for lower latency usage. This could
|
||||
make it more amenable for latency sensitive environments, such as validator
|
||||
privacy, for specific threat models. The general theme here is we want to be
|
||||
rigorous with the guarantees we provide, under what conditions and for what
|
||||
threat models.
|
||||
|
||||
Another thing mentioned earlier is [Noise payload
|
||||
encryption](https://vac.dev/wakuv2-noise), and specifically things like allowing
|
||||
for pairing different devices with e.g. QR codes. This makes it easier for
|
||||
developers to provide secure messaging in many realistic scenarios in a
|
||||
multi-device world.
|
||||
|
||||

|
||||
|
||||
### Summary
|
||||
|
||||
We've gone over what privacy-protecting infrastructure is, why we want it and
|
||||
how we can build it. We've seen how ZK is a fundamental building block for this.
|
||||
We've looked at Waku, the communication layer for Web3, and how it uses Zero
|
||||
Knowledge proofs to stay private and function better. We've also looked at
|
||||
Zerokit and how we can make it easier to do ZKP in different environments.
|
||||
|
||||
Finally we also looked at some other research we've been doing. All of the
|
||||
things mentioned in this article, and more, are available as
|
||||
[write-ups](https://vac.dev/research), [specs](https://rfc.vac.dev/), or
|
||||
discussions on our [forum](https://forum.vac.dev/) or [GitHub](https://github.com/vacp2p/).
|
||||
|
||||
If you find any of this exciting to work on, feel free to reach out on our
|
||||
Discord. We are also [hiring](https://jobs.status.im/), and we have started
|
||||
expanding into other privacy infrastructure tech like private and provable
|
||||
computation with ZK-WASM.
|
||||
191
rlog/2022-11-08-waku-for-all-decentralize-applications.mdx
Normal file
@@ -0,0 +1,191 @@
|
||||
---
|
||||
layout: post
|
||||
name: 'Waku for All Decentralized Applications and Infrastructures'
|
||||
title: 'Waku for All Decentralized Applications and Infrastructures'
|
||||
date: 2022-11-08 00:00:00
|
||||
authors: franck
|
||||
published: true
|
||||
slug: waku-for-all
|
||||
categories: waku, dapp, infrastructure, public good, platform, operator
|
||||
image: /img/black-waku-logo-with-name.png
|
||||
discuss: https://forum.vac.dev/t/discussion-waku-for-all-decentralized-applications-and-infrastructures/163
|
||||
---
|
||||
|
||||
Waku is an open communication protocol and network. Decentralized apps and infrastructure can use Waku for their
|
||||
communication needs. It is designed to enable dApps and decentralized infrastructure projects to have secure, private,
|
||||
scalable communication. Waku is available in several languages and platforms, from Web to mobile to desktop to cloud.
|
||||
Initially, we pushed Waku adoption in the Web ecosystem. We then learned that Waku is usable in a variety of complex applications
|
||||
and infrastructure projects. We have prioritized our effort to make Waku usable on various platforms and environments.
|
||||
|
||||
<!--truncate-->
|
||||
|
||||
## Background
|
||||
|
||||
We have built Waku to be the communication layer for Web3. Waku is a collection of protocols to choose from for your
|
||||
messaging needs. It enables secure, censorship-resistant, privacy-preserving, spam-protected communication for its users.
|
||||
It is designed to run on any device, from mobile to the cloud.
|
||||
|
||||
Waku is available on many systems and environments and used by several applications and SDKs for decentralized communications.
|
||||
|
||||
This involved research efforts in various domains: conversational security, protocol incentivization, zero-knowledge,
|
||||
etc.
|
||||
|
||||
Waku uses novel technologies. Hence, we knew that early dogfooding of Waku was necessary, even if research
was still _in progress_ [[1]](#references). Thus, as soon as Waku protocols and software were usable, we started to push
|
||||
for the adoption of Waku. This started back in 2021.
|
||||
|
||||
Waku is the communication component of the Web3 trifecta. This trifecta was Ethereum (contracts), Swarm
|
||||
(storage) and Whisper (communication). Hence, it made sense to first target dApps which already use one of the pillars:
|
||||
Ethereum.
|
||||
|
||||
As most dApps are web apps, we started the development of [js-waku for the browser](https://vac.dev/presenting-js-waku).
|
||||
|
||||
Once ready, we reached out to dApps to integrate Waku, added [prizes to hackathons](https://twitter.com/waku_org/status/1451400128791605254?s=20&t=Zhc0BEz6RVLkE_SeE6UyFA)
|
||||
and gave [talks](https://docs.wakuconnect.dev/docs/presentations/).
|
||||
|
||||
We also assumed we would see patterns in the usage of Waku, which we would facilitate with the help of
|
||||
[SDKs](https://github.com/status-im/wakuconnect-vote-poll-sdk).
|
||||
|
||||
Finally, we created several web apps:
|
||||
[examples](https://docs.wakuconnect.dev/docs/examples/)
|
||||
and [PoCs](https://github.com/status-im/gnosis-safe-waku).
|
||||
|
||||
By discussing with Waku users and watching it being used, we learned a few facts:
|
||||
|
||||
1. The potential use cases for Waku are varied and many:
|
||||
|
||||
- Wallet <> dApp communication: [WalletConnect](https://medium.com/walletconnect/walletconnect-v2-0-protocol-whats-new-3243fa80d312), [XMTP](https://xmtp.org/docs/dev-concepts/architectural-overview/)
|
||||
- Off-chain (and private) marketplace:
|
||||
[RAILGUN](https://twitter.com/RAILGUN_Project/status/1556780629848727552?s=20&t=NEKQJiJAfg5WJqvuF-Ym_Q) &
|
||||
[Decentralized Uber](https://twitter.com/TheBojda/status/1455557282318721026)
|
||||
- Signature exchange for a multi-sig wallet: [Gnosis Safe x Waku](https://github.com/status-im/gnosis-safe-waku)
|
||||
- Off-chain Game moves/actions: [Super Card Game (EthOnline 2021)](https://showcase.ethglobal.com/ethonline2021/super-card-game)
|
||||
- Decentralized Pastebin: [Debin](https://debin.io/)
|
||||
|
||||
2. Many projects are interested in having an embedded chat in their dApp,
|
||||
3. There are complex applications that need Waku as a solution. Taking RAILGUN as an example:
|
||||
|
||||
- Web wallet
|
||||
- \+ React Native mobile wallet
|
||||
- \+ NodeJS node/backend.
|
||||
|
||||
(1) means that it is not that easy to create SDKs for common use cases.
|
||||
|
||||
(2) was a clear candidate for an SDK. Yet, building a chat app is a complex task. Hence, the Status app team tackled
|
||||
this in the form of [Status Web](https://github.com/status-im/status-web/).
|
||||
|
||||
Finally, (3) was the most important lesson. We learned that multi-tier applications need Waku for decentralized and
|
||||
censorship-resistant communications. For these projects, js-waku is simply not enough. They need Waku to work in their
|
||||
Golang backend, Unity desktop game and React Native mobile app.
|
||||
|
||||
We understood that we should see the whole Waku software suite
|
||||
([js-waku](https://github.com/waku-org/js-waku),
|
||||
[nwaku](https://github.com/status-im/nwaku),
|
||||
[go-waku](https://github.com/status-im/go-waku),
|
||||
[waku-react-native](https://github.com/waku-org/waku-react-native),
|
||||
[etc](https://github.com/waku-org)) as an asset for its success.
|
||||
We should also not limit outreach, marketing and documentation efforts to the web, but target all platforms.
|
||||
|
||||
From a market perspective, we identified several actors:
|
||||
|
||||
- platforms: Projects that use Waku to handle communication,
|
||||
- operators: Operators run Waku nodes and are incentivized to do so,
|
||||
- developers: Developers are usually part of a platform or solo hackers learning Web3,
|
||||
- contributors: Developers and researchers with interests in decentralization, privacy, censorship-resistance,
|
||||
zero-knowledge, etc.
|
||||
|
||||
## Waku for All Decentralized Applications and Infrastructures
|
||||
|
||||
In 2022, we shifted our focus to make the various Waku implementations **usable and used**.
|
||||
|
||||
We made Waku [multi-platform](https://github.com/status-im/go-waku/tree/master/examples).
|
||||
|
||||
We shifted Waku positioning to leverage all Waku implementations and better serve the user's needs:
|
||||
|
||||
- Running a node for your projects and want to use Waku? Use [nwaku](https://github.com/status-im/nwaku).
|
||||
- Going mobile? Use [Waku React Native](https://github.com/status-im/waku-react-native).
|
||||
- C++ Desktop Game? Use [go-waku's C-Bindings](https://github.com/status-im/go-waku/tree/master/examples/c-bindings).
|
||||
- Web app? Use [js-waku](https://github.com/status-im/js-waku).
|
||||
|
||||
We are consolidating the documentation for all implementations on a single website ([work in progress](https://github.com/waku-org/waku.org/issues/15))
|
||||
to improve developer experience.
|
||||
|
||||
This year, we also started the _operator outreach_ effort to push for users to run their own Waku nodes. We have
|
||||
recently concluded our [first operator trial run](https://github.com/status-im/nwaku/issues/828).
|
||||
[Nwaku](https://vac.dev/introducing-nwaku)'s documentation, stability and performance have improved. It is now easier to
|
||||
run your [own Waku node](https://github.com/status-im/nwaku/tree/master/docs/operators).
|
||||
|
||||
Today, operator wannabes most likely run their own nodes to support or use the Waku network.
|
||||
We are [dogfooding](https://twitter.com/oskarth/status/1582027828295790593?s=20&t=DPEP6fXK6KWbBjV5EBCBMA)
|
||||
[Waku RLN](https://github.com/status-im/nwaku/issues/827), our novel economic spam protection protocol,
|
||||
and looking at [incentivizing the Waku Store protocol](https://github.com/vacp2p/research/issues/99).
|
||||
This way, we are adding reasons to run your own Waku node.
|
||||
|
||||
For those who were following us in 2021, know that we are retiring the _Waku Connect_ branding in favour of the _Waku_
|
||||
branding.
|
||||
|
||||
## Waku for Your Project
|
||||
|
||||
As discussed, Waku is now available on various platforms. The question remains: How can Waku benefit **your** project?
|
||||
|
||||
Here are a couple of use cases we recently investigated:
|
||||
|
||||
## Layer-2 Decentralization
|
||||
|
||||
Most roll-ups ([[2] [3]](#references)) use a centralized sequencer or equivalent. Running several sequencers is not as straightforward as running several execution nodes.
|
||||
Waku can help:
|
||||
|
||||
- Provide a neutral marketplace for a mempool: If sequencers compete for L2 tx fees, they may not be incentivized to
|
||||
share transactions with other sequencers. Waku nodes can act as a neutral network to enable all sequencers to access
|
||||
transactions.
|
||||
- Enable censorship-resistant wallet<>L2 communication,
|
||||
- Provide a rate-limiting mechanism for spam protection: Using [RLN](https://rfc.vac.dev/spec/32/) to prevent DDoS.
|
||||
|
||||
## Device pairing and communication
|
||||
|
||||
With [Waku Device Pairing](https://rfc.vac.dev/spec/43/), a user can set up a secure encrypted communication channel
|
||||
between their devices. As this channel would operate over Waku, it would be censorship-resistant and privacy preserving.
|
||||
These two devices could be:
|
||||
|
||||
- Ethereum node and mobile phone to access a remote admin panel,
|
||||
- Alice's phone and Bob's phone for any kind of secure communication,
|
||||
- Mobile wallet and desktop/browser dApp for transaction and signature exchange.
|
||||
|
||||
Check [js-waku#950](https://github.com/waku-org/js-waku/issues/950) for the latest update on this.
|
||||
|
||||
## Get Involved
|
||||
|
||||
Developer? Grab any of the Waku implementations and integrate it in your app: https://waku.org/platform.
|
||||
|
||||
Researcher? See https://vac.dev/contribute to participate in Waku research.
|
||||
|
||||
Tech-savvy? Try to run your own node: https://waku.org/operator.
|
||||
|
||||
Otherwise, play around with the various [web examples](https://github.com/waku-org/js-waku-examples#readme).
|
||||
|
||||
If you want to help, we are [hiring](https://jobs.status.im/)!
|
||||
|
||||
## Moving Forward
|
||||
|
||||
What you can expect next:
|
||||
|
||||
- [Scalability and performance studies](https://forum.vac.dev/t/waku-v2-scalability-studies/142/9) and improvement across Waku software,
|
||||
- [New websites](https://github.com/waku-org/waku.org/issues/15) to easily find documentation about Waku and its implementations,
|
||||
- New Waku protocols implemented in all code bases and cross client PoCs
|
||||
([noise](https://rfc.vac.dev/spec/35/), [noise-sessions](https://rfc.vac.dev/spec/37/),
|
||||
[waku-rln-relay](https://rfc.vac.dev/spec/17/), etc),
|
||||
- Easier to [run your own waku node](https://github.com/status-im/nwaku/issues/828), more operator trials,
|
||||
- Dogfooding and Improvement of existing protocols (e.g. [Waku Filter](https://github.com/vacp2p/rfc/issues?q=is%3Aissue+is%3Aopen+sort%3Aupdated-desc++12%2FWAKU2-FILTER)),
|
||||
- Continue our focus on Waku portability: Browser,
|
||||
[Raspberry Pi Zero](https://twitter.com/richardramos_me/status/1574405469912932355?s=20&t=DPEP6fXK6KWbBjV5EBCBMA) and other restricted-resource environments,
|
||||
- More communication & marketing effort around Waku and the Waku developer community.
|
||||
|
||||
---
|
||||
|
||||
## References
|
||||
|
||||
- \[1\] Waku is modular; it is a suite of protocols; hence some Waku protocols may be mature, while
|
||||
new protocols are still being designed. Which means that research continues to be _ongoing_ while
|
||||
Waku is already used in production.
|
||||
- [[2]](https://community.optimism.io/docs/how-optimism-works/#block-production) The Optimism Foundation runs the only block producer on the Optimism network.
|
||||
- [[3]](https://l2beat.com/) Top 10 L2s are documented as having a centralized operator.
|
||||
136
rlog/2023-04-03-waku-as-a-network.mdx
Normal file
@@ -0,0 +1,136 @@
|
||||
---
|
||||
layout: post
|
||||
name: 'The Future of Waku Network: Scaling, Incentivization, and Heterogeneity'
|
||||
title: 'The Future of Waku Network: Scaling, Incentivization, and Heterogeneity'
|
||||
date: 2023-04-03 00:00:00
|
||||
authors: franck
|
||||
published: true
|
||||
slug: future-of-waku-network
|
||||
categories: platform, operator, network
|
||||
image: /img/black-waku-logo-with-name.png
|
||||
discuss: https://forum.vac.dev/t/discussion-the-future-of-waku-network-scaling-incentivization-and-heterogeneity/173
|
||||
hide_table_of_contents: false
|
||||
---
|
||||
|
||||
Learn how the Waku Network is evolving through scaling, incentivization, and diverse ecosystem development and what the future might look like.
|
||||
|
||||
<!--truncate-->
|
||||
|
||||
Waku is preparing for production with a focus on the Status Communities use case. In this blog post, we will provide an
|
||||
overview of recent discussions and research outputs, aiming to give you a better understanding of what the Waku network
|
||||
may look like in terms of scaling and incentivization.
|
||||
|
||||
## DOS Mitigation for Status Communities
|
||||
|
||||
Waku is actively exploring DOS mitigation mechanisms suitable for Status Communities. While RLN
|
||||
(Rate Limiting Nullifiers) remains the go-to DOS protection solution due to its privacy-preserving and
|
||||
censorship-resistant properties, there is still more work to be done. We are excited to collaborate with PSE
|
||||
(Privacy & Scaling Explorations) in this endeavor. Learn more about their latest progress in this [tweet](https://twitter.com/CPerezz19/status/1640373940634939394?s=20).
|
||||
|
||||
## A Heterogeneous Waku Network
|
||||
|
||||
As we noted in a previous [forum post](https://forum.vac.dev/t/waku-payment-models/166/3), Waku's protocol
|
||||
incentivization model needs to be flexible to accommodate various business models. Flexibility ensures that projects
|
||||
can choose how they want to use Waku based on their specific needs.
|
||||
|
||||
### Reversing the Incentivization Question
|
||||
|
||||
Traditionally, the question of incentivization revolves around how to incentivize operators to run nodes. We'd like to
|
||||
reframe the question and instead ask, "How do we pay for the infrastructure?"
|
||||
|
||||
Waku does not intend to offer a free lunch.
|
||||
Ethereum's infrastructure is supported by transaction fees and inflation, with validators receiving rewards from both sources.
|
||||
However, this model does not suit a communication network like Waku.
|
||||
Users and platforms would not want to pay for every single message they send. Additionally, Waku aims to support instant
|
||||
ephemeral messages that do not require consensus or long-term storage.
|
||||
|
||||
Projects that use Waku to enable user interactions, whether for chat messages, gaming, private DeFi, notifications, or
|
||||
inter-wallet communication, may have different value extraction models. Some users might provide services for the
|
||||
project and expect to receive value by running nodes, while others may pay for the product or run infrastructure to
|
||||
contribute back. Waku aims to support each of these use cases, which means there will be various ways to "pay for the
|
||||
infrastructure."
|
||||
|
||||
In [his talk](https://vac.dev/building-privacy-protecting-infrastructure), Oskar addressed two strategies: RLN and service credentials.
|
||||
|
||||
### RLN and Service Credentials
|
||||
|
||||
RLN enables DOS protection across the network in a privacy-preserving and permission-less manner: stake in a contract,
|
||||
and you can send messages.
|
||||
|
||||
Service credentials establish a customer-provider relationship. Users might pay to have messages they are interested in
|
||||
stored and served by a provider. Alternatively, a community owner could pay a service provider to host their community.
|
||||
|
||||
Providers could offer trial or limited free services to Waku users, similar to Slack or Discord. Once a trial is expired or outgrown,
|
||||
a community owner could pay for more storage or bandwidth, similar to Slack's model.
|
||||
Alternatively, individual users could contribute financially, akin to Discord's Server Boost, or by sharing their own
|
||||
resources with their community.
|
||||
|
||||
We anticipate witnessing various scenarios across the spectrum: from users sharing resources to users paying for access to the network and everything in between.
|
||||
|
||||
## Waku Network: Ethereum or Cosmos?
|
||||
|
||||
Another perspective is to consider whether the Waku network will resemble Ethereum or Cosmos.
|
||||
|
||||
For those not familiar with the difference between the two, here is a very concise summary:
|
||||
|
||||
- Ethereum is a set of protocols and software that are designed to operate on one common network and infrastructure
|
||||
- Cosmos is a set of protocols and software (SDKs) designed to be deployed in separate yet interoperable networks and infrastructures by third parties
|
||||
|
||||
We want Waku to be decentralized to provide censorship resistance and privacy-preserving communication.
|
||||
If each application has to deploy its own network, we will not achieve this goal.
|
||||
Therefore, we aim for Waku to be not only an open source set of protocols, but also a shared infrastructure that anyone can leverage to build applications on top, with some guarantees in terms of decentralization and anonymity.
|
||||
This approach is closer in spirit to Ethereum than Cosmos.
|
||||
Do note that, similarly to Ethereum, anyone is free to take Waku software and protocols and deploy their own network.
|
||||
|
||||
Yet, because of the difference in the fee model, the Waku Network is unlikely to be as unified as Ethereum's.
|
||||
We currently assume that there will be separate gossipsub networks with different funding models.
|
||||
Since there is no consensus on Waku, each individual operator can decide which network to support, enabling Waku to maintain its permission-less property.
|
||||
|
||||
Most likely, the Waku network will be heterogeneous, and node operators will choose the incentivization model they prefer.
|
||||
|
||||
## Scalability and Discovery Protocols
|
||||
|
||||
To enable scalability, the flow of messages in the Waku network will be divided in shards,
|
||||
so that not every node has to forward every message of the whole network.
|
||||
Discovery protocols will facilitate users connecting to the right nodes to receive the messages they are interested in.
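
As a rough illustration of the sharding idea, a node could derive a shard index from a message's content topic and only relay traffic for the shards it subscribes to. The mapping below is a hypothetical sketch and not the scheme defined in 51/WAKU2-RELAY-SHARDING: the hash choice, the shard count and the names `shard_for` and `should_relay` are all assumptions.

```python
import hashlib


def shard_for(content_topic: str, shard_count: int) -> int:
    """Deterministically map a content topic to a shard index (hypothetical scheme)."""
    digest = hashlib.sha256(content_topic.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def should_relay(content_topic: str, subscribed_shards: set, shard_count: int) -> bool:
    """A node only forwards messages that belong to shards it subscribes to."""
    return shard_for(content_topic, shard_count) in subscribed_shards


SHARD_COUNT = 8  # assumed number of shards
topic = "/my-app/1/chat/proto"
my_shards = {shard_for(topic, SHARD_COUNT)}

print(should_relay(topic, my_shards, SHARD_COUNT))
```

Because every node computes the same index for the same topic, a discovery layer only needs to advertise which shards a node serves for users to find the right peers.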
|
||||
|
||||
Different shards could be subject to a variety of rate limiting techniques (globally, targeted to that shard or something in-between).
|
||||
|
||||
Marketplace protocols may also be developed to help operators understand how they can best support the network and where
|
||||
their resources are most needed. However, we are still far from establishing or even asserting that such a marketplace will be needed.

## Open Problems

Splitting traffic between shards reduces bandwidth consumption for every Waku Relay node.
This improvement increases the likelihood that users with home connections can participate and contribute to the gossipsub network without encountering issues.

However, it does not cap traffic.
There are still open problems regarding how to guarantee that someone can use Waku with lower Internet bandwidth or run critical services, such as a validation node, on the same connection.

We have several ongoing initiatives:

- Analyzing the Status Community protocol to confirm efficient usage of Waku [[4]](https://github.com/vacp2p/research/issues/177)
- Simulating the Waku Network to measure actual bandwidth usage [[5]](https://github.com/waku-org/pm/issues/2)
- Segregating chat messages from control and media messages [[6]](https://rfc.vac.dev/spec/57/#control-message-shards)

The final solution will likely be a combination of protocols that reduce bandwidth usage or mitigate the risk of DoS attacks, providing flexibility for users and platforms to enable the best experience.

## The Evolving Waku Network

The definition of the "Waku Network" will likely change over time. In the near future, it will transition from a single
gossipsub network to a sharded set of networks unified by a common discovery layer. This change will promote scalability
and allow various payment models to coexist within the Waku ecosystem.

In conclusion, the future of the Waku Network entails growth, incentivization, and heterogeneity while steadfastly
maintaining its core principles. As Waku continues to evolve, we expect it to accommodate a diverse range of use cases
and business models, all while preserving privacy, resisting censorship, avoiding surveillance, and remaining accessible
to devices with limited resources.

## References

1. [51/WAKU2-RELAY-SHARDING](https://rfc.vac.dev/spec/51/)
2. [57/STATUS-Simple-Scaling](https://rfc.vac.dev/spec/57/)
3. [58/RLN-V2](https://rfc.vac.dev/spec/58/)
4. [Scaling Status Communities: Potential Problems](https://github.com/vacp2p/research/issues/177)
5. [Waku Network Testing](https://github.com/waku-org/pm/issues/2)
6. [57/STATUS-Simple-Scaling: Control Message Shards](https://rfc.vac.dev/spec/57/#control-message-shards)
61
rlog/2023-04-24-device-pairing-in-js-waku-and-go-waku.mdx
Normal file
@@ -0,0 +1,61 @@
---
layout: post
name: 'Device Pairing in Js-waku and Go-waku'
title: 'Device Pairing in Js-waku and Go-waku'
date: 2023-04-24 12:00:00
authors: rramos
published: true
slug: device-pairing-in-js-waku-and-go-waku
categories: platform
---

Device pairing and secure message exchange using Waku and the Noise protocol.

<!--truncate-->

As the world becomes increasingly connected through the internet, the need for secure and reliable communication becomes paramount. [This article](https://vac.dev/wakuv2-noise) describes how the Noise protocol can be used as a key-exchange mechanism for Waku.

Recently, this feature was introduced in [js-waku](https://github.com/waku-org/js-noise) and [go-waku](https://github.com/waku-org/go-waku), providing a simple API for developers to implement secure communication protocols using the Noise Protocol framework. These open-source libraries provide a solid foundation for building secure and decentralized applications that prioritize data privacy and security.

This functionality is designed to be simple and easy to use, even for developers who are not experts in cryptography. The library offers a clear and concise API that abstracts away the complexity of the Noise Protocol framework and provides a straightforward interface for developers to use. Using this, developers can implement secure communication protocols in their JavaScript and Go applications without having to worry about the low-level details of cryptography.

One of the key benefits of using Noise is that it provides end-to-end encryption, which means that the communication between two parties is encrypted from start to finish. This is essential for ensuring the security and privacy of sensitive information.

### Device Pairing

In today's digital world, device pairing has become an integral part of our lives. Whether we are connecting our smartphones with other computers or with web applications, secure device pairing is more crucial than ever. With the increasing threat of cyber-attacks and data breaches, it's essential to implement secure protocols for device pairing to ensure data privacy and prevent unauthorized access.

To demonstrate how device pairing can be achieved using Waku and Noise, we have examples available at https://examples.waku.org/noise-js/. You can try pairing different devices, such as mobile and desktop, via a web application. This can be done by scanning a QR code or opening a URL that contains the necessary data for a secure handshake.

The process works as follows:

Actors:

- Alice the initiator
- Bob the responder

1. The first step in achieving secure device pairing using Noise and Waku is for Bob to generate the pairing information, which can be transmitted out-of-band. To do this, Bob opens https://examples.waku.org/noise-js/ and a QR code is generated containing the data required for the handshake. This pairing QR code is timeboxed: after 2 minutes it becomes invalid and a new QR code must be generated.
2. Alice scans the QR code using a mobile phone. This opens the app with the QR code parameters and initiates the handshake process described in [43/WAKU2-DEVICE-PAIRING](https://rfc.vac.dev/spec/43/#protocol-flow). The handshake messages are exchanged between the two devices over Waku to establish a secure connection, and consist of three main parts: the initiator's message, the responder's message, and the final message. When using js-noise, the developer does not have to handle this exchange, since the messaging happens automatically depending on the actions performed by the actors in the pairing process.
3. Both Alice and Bob are asked to verify each other's identity by confirming that an 8-digit authorization code matches on both devices. If both actors confirm that the authorization code is valid, the handshake concludes successfully.
4. Alice and Bob receive a set of shared keys that can be used to start exchanging encrypted messages. The shared secret keys generated during the handshake are used to encrypt and decrypt messages sent between the devices, so that an attacker cannot read or modify them.

The above example demonstrates device pairing using js-waku. You can also try building and experimenting with other Noise implementations, such as nwaku or go-waku. An example is available at https://github.com/waku-org/go-waku/tree/master/examples/noise, in which the same flow is performed with Bob (the responder) using go-waku instead of js-waku.
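
Steps 1 and 3 of the pairing flow described earlier can be sketched roughly as follows. This is not the js-noise or go-waku API, and it is not the derivation specified in 43/WAKU2-DEVICE-PAIRING: the names `PairingInfo`, `is_expired`, and `authorization_code`, and the choice of SHA-256 over a handshake transcript, are assumptions made only to illustrate a timeboxed pairing payload and a short code that both devices can compare.

```python
import hashlib
import time
from dataclasses import dataclass, field
from typing import Optional

QR_VALIDITY_SECONDS = 120  # the pairing QR code is described as valid for 2 minutes


@dataclass
class PairingInfo:
    """Hypothetical out-of-band payload that Bob encodes in the QR code."""
    responder_public_key: bytes
    created_at: float = field(default_factory=time.time)

    def is_expired(self, now: Optional[float] = None) -> bool:
        """Return True once the payload is older than the validity window."""
        current = time.time() if now is None else now
        return current - self.created_at > QR_VALIDITY_SECONDS


def authorization_code(handshake_transcript: bytes) -> str:
    """Derive an 8-digit code from data both devices saw (assumed derivation)."""
    digest = hashlib.sha256(handshake_transcript).digest()
    value = int.from_bytes(digest[:8], "big") % 10**8
    return f"{value:08d}"


# Both devices hash the same transcript, so they display the same code
# and the users can confirm the match on each device.
transcript = b"initiator-message" + b"responder-message"
alice_code = authorization_code(transcript)
bob_code = authorization_code(transcript)
print(alice_code == bob_code)
```

In a real pairing, the code would be derived from handshake state as defined by the specification, and a mismatch on either device should abort the handshake.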

### Conclusion

If you are a developer looking to implement secure messaging in applications that are both decentralized and censorship resistant, Waku, with its easy-to-use API built on top of the Noise Protocol framework and the libp2p networking stack, is definitely an excellent choice worth checking out!

Waku is also open source, released under the MIT and Apache 2.0 licenses, which means that developers are encouraged to contribute code, report bugs, and suggest improvements to make it even better.

Don't hesitate to try the live example at https://examples.waku.org/noise-js and build your own webapp using https://github.com/waku-org/js-noise, https://github.com/waku-org/js-waku and https://github.com/waku-org/go-waku. This will give you hands-on experience of implementing secure communication protocols using the Noise Protocol framework in a practical setting. Happy coding!

### References

- [Noise handshakes as key-exchange mechanism for Waku](https://vac.dev/wakuv2-noise)
- [Noise Protocols for Waku Payload Encryption](https://rfc.vac.dev/spec/35/)
- [Session Management for Waku Noise](https://rfc.vac.dev/spec/37/)
- [Device pairing and secure transfers with Noise](https://rfc.vac.dev/spec/43/)
- [go-waku Noise example](https://github.com/waku-org/go-waku/tree/master/examples/noise)
- [js-waku Noise example](https://github.com/waku-org/js-waku-examples/tree/master/examples/noise-js)
- [js-noise](https://github.com/waku-org/js-noise/)
- [go-noise](https://github.com/waku-org/go-noise/)
44
rlog/authors.yml
Normal file
@@ -0,0 +1,44 @@
circe:
  name: 'Circe'
  twitter: 'vacp2p'
  github: 'thecirce'

dean:
  name: 'Dean'
  twitter: 'DeanEigenmann'
  github: 'decanus'
  website: 'https://dean.eigenmann.me'

franck:
  name: 'Franck'
  twitter: 'fryorcraken'
  github: 'fryorcraken'

hanno:
  name: 'Hanno Cornelius'
  twitter: '4aelius'
  github: 'jm-clius'

kaiserd:
  name: 'Daniel'
  github: 'kaiserd'

oskarth:
  name: 'Oskar'
  twitter: 'oskarth'
  github: 'oskarth'

rramos:
  name: 'Richard'
  twitter: 'richardramos_me'
  github: 'richard-ramos'
  website: 'https://richard-ramos.github.io/'

s1fr0:
  name: 's1fr0'
  github: 's1fr0'

sanaz:
  name: 'Sanaz'
  twitter: 'sanaz2016'
  github: 'staheri14'
|
||||