codex/raw/codex-block-exchange.md

---
title: CODEX-BLOCK-EXCHANGE
name: Codex Block Exchange Protocol
status: raw
category: Standards Track
tags: codex, block-exchange, p2p, data-distribution
editor: Codex Team
contributors:
---

## Abstract

The Block Exchange (BE) is a core Codex component responsible for
peer-to-peer content distribution across the network.
It manages the sending and receiving of data blocks between nodes,
enabling efficient data sharing and retrieval.
This specification defines both an internal service interface and a
network protocol for referring to and providing data blocks.
Blocks are uniquely identifiable by means of an address and represent
fixed-length chunks of arbitrary data.

## Semantics

The keywords "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in
[RFC 2119](https://www.ietf.org/rfc/rfc2119.txt).

### Definitions

| Term | Description |
|------|-------------|
| **Block** | Fixed-length chunk of arbitrary data, uniquely identifiable |
| **Standalone Block** | Self-contained block addressed by SHA256 hash (CID) |
| **Dataset Block** | Block in an ordered set, addressed by dataset CID + index |
| **Block Address** | Unique identifier for standalone/dataset addressing |
| **WantList** | List of block requests sent by a peer |
| **Block Delivery** | Transmission of block data from one peer to another |
| **Block Presence** | Indicator of whether a peer has a requested block |
| **Merkle Proof** | Proof verifying dataset block position correctness |
| **CID** | Content Identifier - hash-based identifier for content |
| **Multicodec** | Self-describing format identifier for data encoding |
| **Multihash** | Self-describing hash format |

## Motivation

The Block Exchange module serves as the fundamental layer for content
distribution in the Codex network.
It provides primitives for requesting and delivering blocks of data
between peers, supporting both standalone blocks and blocks that are
part of larger datasets.
The protocol is designed to work over libp2p streams and integrates
with Codex's discovery, storage, and payment systems.

When a peer wishes to obtain a block, it registers the block's unique
address with the Block Exchange, which is then in charge of procuring
the block: finding a peer that has it, if any, and downloading it.
The Block Exchange also serves requests from peers for blocks that the
local node holds.

**Discovery Separation:** Throughout this specification we assume that
if a peer wants a block, then the peer has the means to locate and
connect to peers which either: (1) have the block; or (2) are
reasonably expected to obtain the block in the future.
In practical implementations, the Block Exchange will typically require
the support of an underlying discovery service, e.g., the Codex DHT,
to look up such peers, but this is beyond the scope of this document.

The protocol supports two distinct block types to accommodate different
use cases: standalone blocks for independent data chunks and dataset
blocks for ordered collections of data that form larger structures.

## Block Format

The Block Exchange protocol supports two types of blocks:

### Standalone Blocks

Standalone blocks are self-contained pieces of data addressed by their
SHA256 content identifier (CID).
These blocks are independent and do not reference any larger structure.

**Properties:**

- Addressed by content hash (SHA256)
- Default size: 64 KiB
- Self-contained and independently verifiable

### Dataset Blocks

Dataset blocks are part of ordered sets and are addressed by a
`(datasetCID, index)` tuple.
The datasetCID refers to the Merkle tree root of the entire dataset,
and the index indicates the block's position within that dataset.

Formally, we can define a block as a tuple consisting of raw data and
its content identifier: `(data: seq[byte], cid: Cid)`, where standalone
blocks are addressed by `cid`, and dataset blocks can be addressed
either by `cid` or a `(datasetCID, index)` tuple.

**Properties:**

- Addressed by `(treeCID, index)` tuple
- Part of a Merkle tree structure
- Require a Merkle proof for verification
- Must be uniformly sized within a dataset
- Final blocks MUST be zero-padded if incomplete

### Block Specifications

All blocks in the Codex Block Exchange protocol adhere to the
following specifications (a CID-construction sketch follows the table):

| Property | Value | Description |
|----------|-------|-------------|
| Default Block Size | 64 KiB | Standard size for data blocks |
| Multicodec | `codex-block` (0xCD02) | Format identifier |
| Multihash | `sha2-256` (0x12) | Hash algorithm for addressing |
| Padding Requirement | Zero-padding | Incomplete final blocks padded |
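
As an illustration of these constants, the sketch below zero-pads a
final block and assembles a binary CIDv1 from the multicodec and
multihash values in the table. The varint and CIDv1 layout follow the
multiformats specifications; this is an illustrative sketch, not the
normative encoding.

```python
import hashlib

BLOCK_SIZE = 64 * 1024    # default block size from the table above
SHA2_256 = 0x12           # multihash code from the table above
CODEX_BLOCK = 0xCD02      # multicodec code from the table above

def varint(n: int) -> bytes:
    """Unsigned LEB128 varint, as used by the multiformats specs."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        out.append(byte | (0x80 if n else 0))
        if not n:
            return bytes(out)

def pad_final_block(data: bytes) -> bytes:
    """Zero-pad an incomplete final dataset block to the fixed size."""
    if len(data) > BLOCK_SIZE:
        raise ValueError("data exceeds the block size")
    return data + b"\x00" * (BLOCK_SIZE - len(data))

def block_cid(data: bytes) -> bytes:
    """Binary CIDv1 = <version 1><multicodec><multihash> over the data."""
    digest = hashlib.sha256(data).digest()
    multihash = varint(SHA2_256) + varint(len(digest)) + digest
    return varint(1) + varint(CODEX_BLOCK) + multihash
```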

## Service Interface

The Block Exchange module exposes two core primitives for
block management:

### `requestBlock`

```python
async def requestBlock(address: BlockAddress) -> Block
```

Registers a block address for retrieval and returns the block data
when available.
This function can be awaited by the caller until the block is retrieved
from the network or local storage.

**Parameters:**

- `address`: BlockAddress - The unique address identifying the block
to retrieve

**Returns:**

- `Block` - The retrieved block data

### `cancelRequest`

```python
async def cancelRequest(address: BlockAddress) -> bool
```

Cancels a previously registered block request.

**Parameters:**

- `address`: BlockAddress - The address of the block request to cancel

**Returns:**

- `bool` - True if the cancellation was successful, False otherwise
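
A minimal caller-side sketch of these two primitives: it bounds a
`requestBlock` call with a timeout and withdraws the registration via
`cancelRequest` on expiry. The `bx` service handle and the construction
of `address` are implementation-specific assumptions.

```python
import asyncio

async def fetch_block(bx, address, timeout_s: float = 30.0):
    """Await a block, cancelling the registration if it takes too long."""
    try:
        return await asyncio.wait_for(bx.requestBlock(address), timeout_s)
    except asyncio.TimeoutError:
        # Withdraw the want so connected peers stop tracking it.
        cancelled = await bx.cancelRequest(address)
        if not cancelled:
            # The request may have completed or been cancelled already.
            pass
        raise
```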

## Dependencies

The Block Exchange module depends on and interacts with several other
Codex components:

| Component | Purpose |
|-----------|---------|
| **Discovery Module** | DHT-based peer discovery for locating nodes |
| **Local Store (Repo)** | Persistent block storage for local blocks |
| **Advertiser** | Announces block availability to the network |
| **Network Layer** | libp2p connections and stream management |

## Protocol Specification

### Protocol Identifier

The Block Exchange protocol uses the following libp2p protocol
identifier:

```text
/codex/blockexc/1.0.0
```

### Connection Model

The protocol operates over libp2p streams.
When a node wants to communicate with a peer:

1. The initiating node dials the peer using the protocol identifier
2. A bidirectional stream is established
3. Both sides can send and receive messages on this stream
4. Messages are encoded using Protocol Buffers
5. The stream remains open for the duration of the exchange session
6. Peers track active connections in a peer context store

The protocol handles peer lifecycle events:

- **Peer Joined**: When a peer connects, it is added to the active
peer set
- **Peer Departed**: When a peer disconnects gracefully, its context
is cleaned up
- **Peer Dropped**: When a peer connection fails, it is removed from
the active set

### Message Format

All messages use Protocol Buffers encoding for serialization.
The main message structure supports multiple operation types in a
single message.

#### Main Message Structure

```protobuf
message Message {
  Wantlist wantlist = 1;
  repeated BlockDelivery payload = 3;
  repeated BlockPresence blockPresences = 4;
  int32 pendingBytes = 5;
  AccountMessage account = 6;
  StateChannelUpdate payment = 7;
}
```

**Fields:**

- `wantlist`: Block requests from the sender
- `payload`: Block deliveries (actual block data)
- `blockPresences`: Availability indicators for requested blocks
- `pendingBytes`: Number of bytes pending delivery
- `account`: Account information for micropayments
- `payment`: State channel update for payment processing

#### Block Address

The BlockAddress structure supports both standalone and dataset
block addressing:

```protobuf
message BlockAddress {
  bool leaf = 1;
  bytes treeCid = 2;  // Present when leaf = true
  uint64 index = 3;   // Present when leaf = true
  bytes cid = 4;      // Present when leaf = false
}
```

**Fields:**

- `leaf`: Indicates whether this is a dataset block (true) or a
standalone block (false)
- `treeCid`: Merkle tree root CID (present when `leaf = true`)
- `index`: Position of the block within the dataset (present when
`leaf = true`)
- `cid`: Content identifier of the block (present when `leaf = false`)

**Addressing Modes:**

- **Standalone Block** (`leaf = false`): Direct CID reference to a
standalone content block
- **Dataset Block** (`leaf = true`): Reference to a block within an
ordered set, identified by a Merkle tree root and an index.
The Merkle root may refer to either a regular dataset, or a dataset
that has undergone erasure coding (both modes are sketched below)
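
A hypothetical Python mirror of the two addressing modes, useful for
showing which fields accompany each mode; the field names follow the
protobuf message above, and the constructor helpers are illustrative.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class BlockAddress:
    """Illustrative mirror of the protobuf BlockAddress message."""
    leaf: bool
    treeCid: Optional[bytes] = None  # set when leaf is True
    index: Optional[int] = None      # set when leaf is True
    cid: Optional[bytes] = None      # set when leaf is False

def standalone_address(cid: bytes) -> BlockAddress:
    """Standalone mode: a direct CID reference."""
    return BlockAddress(leaf=False, cid=cid)

def dataset_address(treeCid: bytes, index: int) -> BlockAddress:
    """Dataset mode: (Merkle root, position) within an ordered set."""
    return BlockAddress(leaf=True, treeCid=treeCid, index=index)
```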

#### WantList

The WantList communicates which blocks a peer desires to receive:

```protobuf
message Wantlist {
  enum WantType {
    wantBlock = 0;
    wantHave = 1;
  }

  message Entry {
    BlockAddress address = 1;
    int32 priority = 2;
    bool cancel = 3;
    WantType wantType = 4;
    bool sendDontHave = 5;
  }

  repeated Entry entries = 1;
  bool full = 2;
}
```

**WantType Values:**

- `wantBlock (0)`: Request full block delivery
- `wantHave (1)`: Request availability information only (presence check)

**Entry Fields:**

- `address`: The block being requested
- `priority`: Request priority (currently always 0)
- `cancel`: If true, cancels a previous want for this block
- `wantType`: Specifies whether full block or presence is desired
  - `wantHave (1)`: Only check if peer has the block
  - `wantBlock (0)`: Request full block data
- `sendDontHave`: If true, peer should respond even if it doesn't have
the block

**WantList Fields:**

- `entries`: List of block requests
- `full`: If true, replaces all previous entries; if false, delta update

**Delta Updates:**

WantLists support delta updates for efficiency.
When `full = false`, entries represent additions or modifications to
the existing WantList rather than a complete replacement.
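
A sketch of the receiver-side bookkeeping these semantics imply,
assuming the node keeps a map from block address to entry; the `Entry`
fields are as in the protobuf above.

```python
def apply_wantlist(current: dict, entries, full: bool) -> dict:
    """Fold a received Wantlist into the wants tracked for a peer.

    A full update replaces the tracked set outright; a delta update
    adds or overwrites entries, and entries with cancel=True remove
    a previously registered want.
    """
    wants = {} if full else dict(current)
    for entry in entries:
        if entry.cancel:
            wants.pop(entry.address, None)
        else:
            wants[entry.address] = entry
    return wants
```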

#### Block Delivery

Block deliveries contain the actual block data along with verification
information:

```protobuf
message BlockDelivery {
  bytes cid = 1;
  bytes data = 2;
  BlockAddress address = 3;
  bytes proof = 4;
}
```

**Fields:**

- `cid`: Content identifier of the block
- `data`: Raw block data (up to 100 MiB)
- `address`: The BlockAddress identifying this block
- `proof`: Merkle proof (CodexProof) verifying block correctness
(required for dataset blocks)

**Merkle Proof Verification:**

When delivering dataset blocks (`address.leaf = true`):

- The delivery MUST include a Merkle proof (CodexProof)
- The proof verifies that the block at the given index is correctly
part of the Merkle tree identified by the tree CID
- This applies to all datasets, irrespective of whether they have been
erasure-coded or not
- Recipients MUST verify the proof before accepting the block
- Invalid proofs result in block rejection (a verification sketch
follows this list)
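
To make the verification step concrete, here is a generic binary
SHA-256 Merkle proof check. It is an illustrative stand-in: the actual
CodexProof encoding and hash construction are defined by the proving
specification, not here.

```python
import hashlib
from typing import Sequence

def _h(left: bytes, right: bytes) -> bytes:
    return hashlib.sha256(left + right).digest()

def verify_merkle_proof(root: bytes, index: int, leaf_hash: bytes,
                        siblings: Sequence[bytes]) -> bool:
    """Check that leaf_hash sits at position `index` under `root`.

    `siblings` lists the sibling hashes from the leaf level upwards.
    """
    node = leaf_hash
    for sibling in siblings:
        if index & 1:          # current node is a right child
            node = _h(sibling, node)
        else:                  # current node is a left child
            node = _h(node, sibling)
        index >>= 1
    return node == root
```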

#### Block Presence

Block presence messages indicate whether a peer has or does not have a
requested block:

```protobuf
enum BlockPresenceType {
  presenceHave = 0;
  presenceDontHave = 1;
}

message BlockPresence {
  BlockAddress address = 1;
  BlockPresenceType type = 2;
  bytes price = 3;
}
```

**Fields:**

- `address`: The block address being referenced
- `type`: Whether the peer has the block or not
- `price`: Price of the block (UInt256 format)

#### Payment Messages

Payment-related messages for micropayments using Nitro state channels.

**Account Message:**

```protobuf
message AccountMessage {
  bytes address = 1;  // Ethereum address to which payments should be made
}
```

**Fields:**

- `address`: Ethereum address for receiving payments

**State Channel Update:**

```protobuf
message StateChannelUpdate {
  bytes update = 1;  // Signed Nitro state, serialized as JSON
}
```

**Fields:**

- `update`: Nitro state channel update containing payment information

## Security Considerations

### Block Verification

- All dataset blocks MUST include and verify Merkle proofs before acceptance
- For standalone blocks, recipients MUST verify that the CID matches
the SHA256 hash of the data
- Peers SHOULD immediately reject blocks that fail verification

### DoS Protection

- Implementations SHOULD limit the number of concurrent block requests per peer
- Implementations SHOULD implement rate limiting for WantList updates
- Large WantLists MAY be rejected to prevent resource exhaustion

### Data Integrity

- All blocks MUST be validated before being stored or forwarded
- Zero-padding in dataset blocks MUST be verified to prevent data corruption
- Block sizes MUST be validated against protocol limits

### Privacy Considerations

- Block requests reveal information about what data a peer is seeking
- Implementations MAY implement request obfuscation strategies
- Presence information can leak storage capacity details

## Rationale

### Design Decisions

**Two-Tier Block Addressing:**
The protocol supports both standalone and dataset blocks to accommodate
different use cases.
Standalone blocks are simpler and don't require Merkle proofs, while
dataset blocks enable efficient verification of blocks from large
datasets without requiring possession of the entire dataset.

**WantList Delta Updates:**
Supporting delta updates reduces bandwidth consumption when peers only
need to modify a small portion of their wants, which is common in
long-lived connections.

**Separate Presence Messages:**
Decoupling presence information from block delivery allows peers to
quickly assess availability without waiting for full block transfers.

**Fixed Block Size:**
The 64 KiB default block size balances efficient network transmission
with manageable memory overhead.

**Zero-Padding Requirement:**
Requiring zero-padding for incomplete dataset blocks ensures uniform
block sizes within datasets, simplifying Merkle tree construction and
verification.

**Protocol Buffers:**
Using Protocol Buffers provides efficient serialization, forward
compatibility, and wide language support.

## Copyright

Copyright and related rights waived via
[CC0](https://creativecommons.org/publicdomain/zero/1.0/).

## References

### Normative

- [RFC 2119](https://www.ietf.org/rfc/rfc2119.txt) - Key words for use
in RFCs to Indicate Requirement Levels
- **libp2p**: <https://libp2p.io>
- **Protocol Buffers**: <https://protobuf.dev>
- **Multihash**: <https://multiformats.io/multihash/>
- **Multicodec**: <https://github.com/multiformats/multicodec>

### Informative

- **Codex Documentation**: <https://docs.codex.storage>
- **Codex Block Exchange Module Spec**:
<https://github.com/codex-storage/codex-docs-obsidian/blob/main/10%20Notes/Specs/Block%20Exchange%20Module%20Spec.md>
- **Merkle Trees**: <https://en.wikipedia.org/wiki/Merkle_tree>
- **Content Addressing**:
<https://en.wikipedia.org/wiki/Content-addressable_storage>

codex/raw/codex-marketplace.md

---
slug: codex-marketplace
title: CODEX-MARKETPLACE
name: Codex Storage Marketplace
status: raw
category: Standards Track
tags: codex, storage, marketplace, smart-contract
editor: Codex Team and Dmitriy Ryajov <dryajov@status.im>
contributors:
- Mark Spanbroek <mark@codex.storage>
- Adam Uhlíř <adam@codex.storage>
- Eric Mastro <eric@codex.storage>
- Jimmy Debe <jimmy@status.im>
- Filip Dimitrijevic <filip@status.im>
---

## Abstract

Codex Marketplace and its interactions are defined by a smart contract deployed on an EVM-compatible blockchain. This specification describes these interactions for the various roles within the network.

The document is intended for implementors of Codex nodes.

## Semantics

The keywords "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC 2119](https://www.ietf.org/rfc/rfc2119.txt).

### Definitions

| Terminology | Description |
|----------------------------|-----------------------------------------------------------------------------------------------|
| Storage Provider (SP) | A node in the Codex network that provides storage services to the marketplace. |
| Validator | A node that assists in identifying missing storage proofs. |
| Client | A node that interacts with other nodes in the Codex network to store, locate, and retrieve data. |
| Storage Request or Request | A request created by a client node to persist data on the Codex network. |
| Slot or Storage Slot | A space allocated by the storage request to store a piece of the request's dataset. |
| Smart Contract | A smart contract implementing the marketplace functionality. |
| Token | The ERC20-based token used within the Codex network. |

## Motivation

The Codex network aims to create a peer-to-peer storage engine with robust data durability, data persistence guarantees, and a comprehensive incentive structure.

The marketplace is a critical component of the Codex network, serving as a platform where all involved parties interact to ensure data persistence. It provides mechanisms to enforce agreements and facilitate data repair when SPs fail to fulfill their duties.

Implemented as a smart contract on an EVM-compatible blockchain, the marketplace enables various scenarios where nodes assume one or more roles to maintain a reliable persistence layer for users. This specification details these interactions.

The marketplace contract manages storage requests, maintains the state of allocated storage slots, and orchestrates SP rewards, collaterals, and storage proofs.

A node that wishes to participate in the Codex persistence layer MUST implement one or more roles described in this document.

### Roles

A node can assume one of three main roles in the network: client, SP, or validator.

A client is a potentially short-lived node in the network with the purpose of persisting its data in the Codex persistence layer.

An SP is a long-lived node providing storage for clients in exchange for profit. To ensure a reliable, robust service for clients, SPs are required to periodically provide proofs that they are persisting the data.

A validator ensures that SPs have submitted valid proofs for each period in which the smart contract required a proof for the slots filled by the SP.

---

## Part I: Protocol Specification

This part defines the **normative requirements** for the Codex Marketplace protocol. All implementations MUST comply with these requirements to participate in the Codex network. The protocol is defined by smart contract interactions on an EVM-compatible blockchain.

## Storage Request Lifecycle

The diagram below depicts the lifecycle of a storage request:
```text
                        ┌───────────┐
                        │ Cancelled │
                        └───────────┘
                              ▲
                              │ Not all
                              │ Slots filled
                              │
┌───────────┐    ┌────────────┴───────┐               ┌─────────┐
│ Submitted ├───►│ Slots Being Filled ├──────────────►│ Started │
└───────────┘    └────────────────────┘   All Slots   └────┬────┘
                                          Filled           │
                                                           │
                                  ┌────────────────────────┘
                          Proving ▼
┌────────────────────────────────────────────────────────────┐
│                                                            │
│     Proof submitted                                        │
│   ┌──────────────────────────►  All good                   │
│   │                                                        │
│  Proof required                                            │
│   │                                                        │
│   │  Proof missed                                          │
│   └──────────────────────────►  After some time slashed    │
│                                 eventually Slot freed      │
│                                                            │
└────────────┬─┬─────────────────────────────────────────────┘
             │ │                                   ▲
             │ │                                   │
             │ │  SP kicked out and Slot freed  ┌──┴─────────────┐
    All good │ ├───────────────────────────────►│ Repair process │
Time ran out │ │                                └────────────────┘
             │ │
             │ │  Too many Slots freed          ┌────────┐
             │ └───────────────────────────────►│ Failed │
             ▼                                  └────────┘
        ┌──────────┐
        │ Finished │
        └──────────┘
```

## Client Role

A node implementing the client role mediates the persistence of data within the Codex network.

A client has two primary responsibilities:

- Requesting storage from the network by sending a storage request to the smart contract.
- Withdrawing funds from the storage requests previously created by the client.

### Creating Storage Requests

When a user prompts the client node to create a storage request, the client node SHOULD receive the input parameters for the storage request from the user.

To create a request to persist a dataset on the Codex network, client nodes MUST split the dataset into data chunks, $(c_1, c_2, c_3, \ldots, c_{n})$. Using the erasure coding method and the provided input parameters, the data chunks are encoded and distributed over a number of slots. The applied erasure coding method MUST use the [Reed-Solomon algorithm](https://hackmd.io/FB58eZQoTNm-dnhu0Y1XnA). The final slot roots and other metadata MUST be placed into a `Manifest` (TODO: Manifest RFC). The CID for the `Manifest` MUST then be used as the `cid` for the stored dataset.

After the dataset is prepared, a client node MUST call the smart contract function `requestStorage(request)`, providing the desired request parameters in the `request` parameter. The `request` parameter is of type `Request`:

```solidity
struct Request {
  address client;
  Ask ask;
  Content content;
  uint64 expiry;
  bytes32 nonce;
}

struct Ask {
  uint256 proofProbability;
  uint256 pricePerBytePerSecond;
  uint256 collateralPerByte;
  uint64 slots;
  uint64 slotSize;
  uint64 duration;
  uint64 maxSlotLoss;
}

struct Content {
  bytes cid;
  bytes32 merkleRoot;
}
```

The table below describes the attributes of `Request` and its associated types (the token amounts derived from these fields are sketched after the table):

| attribute | type | description |
|-----------|------|-------------|
| `client` | `address` | The Codex node requesting storage. |
| `ask` | `Ask` | Parameters of the Request. |
| `content` | `Content` | The dataset that will be hosted with the storage request. |
| `expiry` | `uint64` | Timeout in seconds during which all the slots have to be filled, otherwise the Request will get cancelled. The final deadline timestamp is calculated at the moment the transaction is mined. |
| `nonce` | `bytes32` | Random value to differentiate from other requests with the same parameters. It SHOULD be a random byte array. |
| `pricePerBytePerSecond` | `uint256` | Amount of tokens that will be awarded to SPs for finishing the storage request. It MUST be an amount of tokens offered per slot per second per byte. The Ethereum address that submits the `requestStorage()` transaction MUST have [approval](https://docs.openzeppelin.com/contracts/2.x/api/token/erc20#IERC20-approve-address-uint256-) for the transfer of at least an amount of tokens equivalent to the full reward (`pricePerBytePerSecond * duration * slots * slotSize`). |
| `collateralPerByte` | `uint256` | The amount of tokens per byte of slot size that SPs submit when they fill slots. Collateral is then slashed or forfeited if SPs fail to provide the service requested by the storage request (more information in the [Slashing](#slashing) section). |
| `proofProbability` | `uint256` | Determines the average frequency with which a proof is required within a period: $\frac{1}{proofProbability}$. SPs are required to provide proofs of storage to the marketplace contract when challenged. To prevent hosts from only coming online when proofs are required, the frequency at which proofs are requested from SPs is stochastic and is influenced by the `proofProbability` parameter. |
| `duration` | `uint64` | Total duration of the storage request in seconds. It MUST NOT exceed the limit specified in the configuration `config.requestDurationLimit`. |
| `slots` | `uint64` | The number of requested slots. The slots will all have the same size. |
| `slotSize` | `uint64` | Amount of storage per slot in bytes. |
| `maxSlotLoss` | `uint64` | Max slots that can be lost without the data being considered lost. |
| `cid` | `bytes` | An identifier used to locate the Manifest representing the dataset. It MUST be a [CIDv1](https://github.com/multiformats/cid#cidv1), SHA-256 [multihash](https://github.com/multiformats/multihash), and the data it represents SHOULD be discoverable in the network, otherwise the request will eventually be cancelled. |
| `merkleRoot` | `bytes32` | Merkle root of the dataset, used to verify storage proofs. |
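
A small arithmetic sketch of the two token amounts derived from these
fields: the full-reward allowance a client needs before
`requestStorage()`, and the per-slot collateral an SP needs before
`fillSlot()` (the latter formula also appears in the Filling Slots
section below). `ask` is any object carrying the `Ask` fields.

```python
def full_reward(ask) -> int:
    """Allowance the client needs for requestStorage():
    pricePerBytePerSecond * duration * slots * slotSize."""
    return (ask.pricePerBytePerSecond * ask.duration
            * ask.slots * ask.slotSize)

def slot_collateral(ask) -> int:
    """Allowance an SP needs to fill one slot:
    collateralPerByte * slotSize."""
    return ask.collateralPerByte * ask.slotSize
```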

#### Renewal of Storage Requests

It should be noted that the marketplace does not support extending requests. If the user wants to extend the duration of a request, a new request with the same CID MUST be [created](#creating-storage-requests) **before the original request completes**.

This ensures that the data will continue to persist in the network at the time when the new (or existing) SPs need to retrieve the complete dataset to fill the slots of the new request.

### Monitoring and State Management

Client nodes MUST implement the following smart contract interactions for monitoring and state management:

- **getRequest(requestId)**: Retrieve the full `StorageRequest` data from the marketplace. This function is used for recovery and state verification after restarts or failures.

- **requestState(requestId)**: Query the current state of a storage request. Used for monitoring request progress and determining the appropriate client actions.

- **requestExpiresAt(requestId)**: Query when the request will expire if not fulfilled.

- **getRequestEnd(requestId)**: Query when a fulfilled request will end (used to determine when to call `freeSlot` or `withdrawFunds`).

Client nodes MUST subscribe to the following marketplace events:

- **RequestFulfilled(requestId)**: Emitted when a storage request has enough filled slots to start. Clients monitor this event to determine when their request becomes active and transitions from the submission phase to the active phase.

- **RequestFailed(requestId)**: Emitted when a storage request fails due to proof failures or other reasons. Clients observe this event to detect failed requests and initiate fund withdrawal.

### Withdrawing Funds

The client node MUST monitor the status of the requests it created. When a storage request enters the `Cancelled`, `Failed`, or `Finished` state, the client node MUST initiate the withdrawal of the remaining or refunded funds from the smart contract using the `withdrawFunds(requestId)` function.

Request states are determined as follows:

- The request is considered `Cancelled` if no `RequestFulfilled(requestId)` event is observed before the deadline returned by the `requestExpiresAt(requestId)` function.
- The request is considered `Failed` when the `RequestFailed(requestId)` event is observed.
- The request is considered `Finished` after the time returned by the `getRequestEnd(requestId)` function has elapsed (a monitoring sketch follows this list).
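
A client-side sketch of these rules, assuming a thin `market` wrapper
that exposes the contract functions named above; the `fulfilled` and
`failed` flags reflect observed `RequestFulfilled`/`RequestFailed`
events, and timestamps are Unix seconds.

```python
def maybe_withdraw(market, request_id, fulfilled: bool,
                   failed: bool, now: int) -> bool:
    """Withdraw funds once the request is Cancelled, Failed, or Finished."""
    cancelled = not fulfilled and now >= market.requestExpiresAt(request_id)
    finished = fulfilled and now >= market.getRequestEnd(request_id)
    if cancelled or failed or finished:
        market.withdrawFunds(request_id)
        return True
    return False
```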

## Storage Provider Role

A Codex node acting as an SP persists data across the network by hosting slots requested by clients in their storage requests.

The following tasks need to be considered when hosting a slot:

- Filling a slot
- Proving
- Repairing a slot
- Collecting the request reward and collateral

### Filling Slots

When a new request is created, the `StorageRequested(requestId, ask, expiry)` event is emitted with the following properties:

- `requestId` - the ID of the request.
- `ask` - the specification of the request parameters. For details, see the definition of the `Request` type in the [Creating Storage Requests](#creating-storage-requests) section above.
- `expiry` - a Unix timestamp specifying when the request will be canceled if all slots are not filled by then.

It is then up to the SP node to decide, based on the emitted parameters and the node operator's configuration, whether it wants to participate in the request and attempt to fill its slot(s) (note that one SP can fill more than one slot). If the SP node decides to ignore the request, no further action is required. However, if the SP decides to fill a slot, it MUST follow the remaining steps described below.

The node acting as an SP MUST decide which slot, specified by the slot index, it wants to fill. The SP MAY attempt to fill more than one slot. To fill a slot, the SP MUST first reserve the slot in the smart contract using `reserveSlot(requestId, slotIndex)`. If reservations for this slot are full, or if the SP has already reserved the slot, the transaction will revert. If the reservation was unsuccessful, then the SP is not allowed to fill the slot. If the reservation was successful, the node MUST then download the slot data using the CID of the manifest (**TODO: Manifest RFC**) and the slot index. The CID is specified in `request.content.cid`, which can be retrieved from the smart contract using `getRequest(requestId)`. Then, the node MUST generate a proof over the downloaded data (**TODO: Proving RFC**).

When the proof is ready, the SP MUST call `fillSlot()` on the smart contract with the following REQUIRED parameters:

- `requestId` - the ID of the request.
- `slotIndex` - the slot index that the node wants to fill.
- `proof` - the `Groth16Proof` proof structure, generated over the slot data.

The Ethereum address of the SP node from which the transaction originates MUST have [approval](https://docs.openzeppelin.com/contracts/2.x/api/token/erc20#IERC20-approve-address-uint256-) for the transfer of at least the amount of tokens required as collateral for the slot (`collateralPerByte * slotSize`).

If the proof delivered by the SP is invalid or the slot was already filled by another SP, the transaction will revert. Otherwise, a `SlotFilled(requestId, slotIndex)` event is emitted. If the transaction is successful, the SP SHOULD transition into the **proving** state, where it will need to submit proofs of data possession when challenged by the smart contract.

It should be noted that if the SP node observes a `SlotFilled` event for the slot it is currently downloading the dataset for or generating the proof for, the slot has been filled by another node in the meantime. In response, the SP SHOULD stop its current operation and attempt to fill a different, unfilled slot.
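
The happy path of this flow, sketched end to end. The `market`,
`store`, and `prover` handles, the `TxReverted` exception, and the
download/prove helpers are assumptions of this sketch; downloading and
proof generation are specified in separate RFCs.

```python
class TxReverted(Exception):
    """Stand-in for an EVM transaction revert surfaced by the wrapper."""

async def try_fill_slot(market, store, prover, request_id, slot_index) -> bool:
    """Reserve, download, prove, and fill one slot (illustrative)."""
    try:
        await market.reserveSlot(request_id, slot_index)
    except TxReverted:
        return False                       # reservations full or duplicate
    request = await market.getRequest(request_id)
    data = await store.download_slot(request.content.cid, slot_index)
    proof = await prover.prove(data)       # Groth16Proof (Proving RFC)
    try:
        await market.fillSlot(request_id, slot_index, proof)
    except TxReverted:
        return False                       # invalid proof or slot taken
    return True                            # now in the proving state
```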

### Proving

Once an SP fills a slot, it MUST submit proofs to the marketplace contract when a challenge is issued by the contract. SPs SHOULD detect that a proof is required for the current period using the `isProofRequired(slotId)` function, or that it will be required using the `willProofBeRequired(slotId)` function in the case that the [proving clock pointer is in downtime](https://github.com/codex-storage/codex-research/blob/41c4b4409d2092d0a5475aca0f28995034e58d14/design/storage-proof-timing.md).

Once an SP knows it has to provide a proof, it MUST obtain the proof challenge using `getChallenge(slotId)` and incorporate the challenge into proof generation as described in the Proving RFC (**TODO: Proving RFC**).

When the proof is generated, it MUST be submitted by calling the `submitProof(slotId, proof)` smart contract function.
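
One plausible per-period proving loop built from these three calls;
the `market` and `prover` handles, `prove_with_challenge`, and the
naive sleep-based period alignment are assumptions of this sketch.

```python
import asyncio

async def proving_loop(market, prover, slot_id, period_s: int):
    """Each period: check whether a proof is due, then generate and submit."""
    while True:
        if await market.isProofRequired(slot_id):
            challenge = await market.getChallenge(slot_id)
            proof = await prover.prove_with_challenge(slot_id, challenge)
            await market.submitProof(slot_id, proof)
        await asyncio.sleep(period_s)   # naive alignment to period boundaries
```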

#### Slashing

There is a slashing scheme orchestrated by the smart contract to incentivize correct behavior and proper proof submissions by SPs. This scheme is configured at the smart contract level and applies uniformly to all participants in the network. The configuration of the slashing scheme can be obtained via the `configuration()` contract call.

The slashing works as follows:

- When an SP misses a proof and a validator triggers detection of this event using the `markProofAsMissing()` call, the SP is slashed by `config.collateral.slashPercentage` **of the originally required collateral** (hence the slashing amount is always the same for a given request).
- If the number of slashes exceeds `config.collateral.maxNumberOfSlashes`, the slot is freed, the remaining collateral is burned, and the slot is offered to other nodes for repair. The smart contract also emits the `SlotFreed(requestId, slotIndex)` event.

If, at any time, the number of freed slots exceeds the value specified by the `request.ask.maxSlotLoss` parameter, the dataset is considered lost, and the request is deemed _failed_. The collateral of all SPs that hosted the slots associated with the storage request is burned, and the `RequestFailed(requestId)` event is emitted.
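
A worked arithmetic sketch of the slashing rule: each slash takes a
fixed percentage of the *originally* required collateral, so the
remaining collateral after `n` slashes falls linearly in `n`. Parameter
names mirror the configuration fields above; the percentage encoding is
an assumption.

```python
def collateral_after_slashes(original_collateral: int,
                             slash_percentage: int,
                             num_slashes: int) -> int:
    """Remaining collateral after num_slashes missed proofs.

    Each slash burns slash_percentage% of the *original* collateral,
    not of the remaining balance, so the amount per slash is constant.
    """
    per_slash = original_collateral * slash_percentage // 100
    return max(original_collateral - per_slash * num_slashes, 0)

# Example: 1000 tokens collateral, 10% per slash, 3 misses -> 700 left.
assert collateral_after_slashes(1000, 10, 3) == 700
```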

### Repair

When a slot is freed due to too many missed proofs, which SHOULD be detected by listening to the `SlotFreed(requestId, slotIndex)` event, an SP node can decide whether to participate in repairing the slot. Similar to filling a slot, the node SHOULD consider the operator's configuration when making this decision. The SP that originally hosted the slot but failed to comply with proving requirements MAY also participate in the repair. However, by refilling the slot, the SP **will not** recover its original collateral and must submit new collateral using the `fillSlot()` call.

The repair process is similar to filling slots. If the original slot dataset is no longer present in the network, the SP MAY use erasure coding to reconstruct the dataset. Reconstructing the original slot dataset requires retrieving other pieces of the dataset stored in other slots belonging to the request. For this reason, the node that successfully repairs a slot is entitled to an additional reward. (**TODO: Implementation**)

The repair process proceeds as follows:

1. The SP observes the `SlotFreed` event and decides to repair the slot.
2. The SP MUST reserve the slot with the `reserveSlot(requestId, slotIndex)` call. For more information see the [Filling Slots](#filling-slots) section.
3. The SP MUST download the chunks of data required to reconstruct the freed slot's data. The node MUST use the [Reed-Solomon algorithm](https://hackmd.io/FB58eZQoTNm-dnhu0Y1XnA) to reconstruct the missing data.
4. The SP MUST generate a proof over the reconstructed data.
5. The SP MUST call the `fillSlot()` smart contract function with the same parameters and collateral allowance as described in the [Filling Slots](#filling-slots) section.

### Collecting Funds

An SP node SHOULD monitor the requests and the associated slots it hosts.

When a storage request enters the `Cancelled`, `Finished`, or `Failed` state, the SP node SHOULD call the `freeSlot(slotId)` smart contract function.

The aforementioned storage request states (`Cancelled`, `Finished`, and `Failed`) can be detected as follows:

- A storage request is considered `Cancelled` if no `RequestFulfilled(requestId)` event is observed within the time indicated by the `expiry` request parameter. Note that a `RequestCancelled` event may also be emitted, but the node SHOULD NOT rely on this event to assert the request expiration, as the `RequestCancelled` event is not guaranteed to be emitted at the time of expiry.
- A storage request is considered `Finished` when the time indicated by the value returned from the `getRequestEnd(requestId)` function has elapsed.
- A node concludes that a storage request has `Failed` upon observing the `RequestFailed(requestId)` event.

For each of the states listed above, the funds are handled as follows:

- In the `Cancelled` state, the collateral is returned along with a proportional payout based on the time the node actually hosted the dataset before the expiry was reached.
- In the `Finished` state, the full reward for hosting the slot, along with the collateral, is collected.
- In the `Failed` state, no funds are collected. The reward is returned to the client, and the collateral is burned. The slot is removed from the list of slots and is no longer included in the list of slots returned by the `mySlots()` function.

## Validator Role

In a blockchain, a contract cannot change its state without a transaction and gas initiating the state change. Therefore, the marketplace smart contract requires an external trigger to periodically check and confirm that a storage proof has been delivered by the SP. This is where the validator role is essential.

The validator role is fulfilled by nodes that help to verify that SPs have submitted the required storage proofs.

It is the smart contract that checks whether the proof requested from an SP has been delivered; the validator only triggers the decision-making function in the smart contract. To incentivize validators, they receive a reward each time they correctly mark a proof as missing, corresponding to the percentage of the slashed collateral defined by `config.collateral.validatorRewardPercentage`.

Each time a validator observes the `SlotFilled` event, it SHOULD add the slot reported in the event to its list of watched slots. Then, after the end of each period, the validator has up to `config.proofs.timeout` seconds (a configuration parameter retrievable with `configuration()`) to validate all the slots. If a slot lacks the required proof, the validator SHOULD call the `markProofAsMissing(slotId, period)` function on the smart contract. This function validates the correctness of the claim and, if the claim is correct, sends a reward to the validator.

If validating all the slots observed by the validator is not feasible within the specified `timeout`, the validator MAY choose to validate only a subset of the observed slots (a validation-loop sketch follows).
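
A per-period validator sketch over a watched-slot set; `market` is an
assumed contract wrapper, and `missed_proof(slot_id, period)` stands in
for whatever off-chain bookkeeping the validator uses to detect a
missing submission.

```python
import time

async def validate_period(market, watched_slots, period, timeout_s: int):
    """After a period ends, mark missing proofs within the timeout window."""
    deadline = time.monotonic() + timeout_s
    for slot_id in watched_slots:
        if time.monotonic() >= deadline:
            break   # out of time: only a subset gets validated this round
        # missed_proof() is an assumed helper combining the proof
        # requirement for the period with observed submissions.
        if await market.missed_proof(slot_id, period):
            await market.markProofAsMissing(slot_id, period)
```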

---

## Part II: Implementation Suggestions

> **IMPORTANT**: The sections above (Abstract through Validator Role) define the normative Codex Marketplace protocol requirements. All implementations MUST comply with those protocol requirements to participate in the Codex network.
>
> **The sections below are non-normative**. They document implementation approaches used in the nim-codex reference implementation. These are suggestions to guide implementors but are NOT required by the protocol. Alternative implementations MAY use different approaches as long as they satisfy the protocol requirements defined in Part I.

## Implementation Suggestions

This section describes implementation approaches used in reference implementations. These are **suggestions and not normative requirements**. Implementations are free to use different internal architectures, state machines, and data structures as long as they correctly implement the protocol requirements defined above.

### Storage Provider Implementation

The nim-codex reference implementation provides a complete Storage Provider implementation with state machine management, slot queueing, and resource management. This section documents the nim-codex approach.

#### State Machine

The Sales module implements a deterministic state machine for each slot, progressing through the following states:

1. **SalePreparing** - Find a matching availability and create a reservation
2. **SaleSlotReserving** - Reserve the slot on the marketplace
3. **SaleDownloading** - Stream and persist the slot's data
4. **SaleInitialProving** - Wait for a stable challenge and generate the initial proof
5. **SaleFilling** - Compute collateral and fill the slot
6. **SaleFilled** - Post-filling operations and expiry updates
7. **SaleProving** - Generate and submit proofs periodically
8. **SalePayout** - Free the slot and calculate collateral
9. **SaleFinished** - Terminal success state
10. **SaleFailed** - Free the slot on the market and transition to error
11. **SaleCancelled** - Cancellation path
12. **SaleIgnored** - Sale ignored (no matching availability or other conditions)
13. **SaleErrored** - Terminal error state
14. **SaleUnknown** - Recovery state for crash recovery
15. **SaleProvingSimulated** - Proving with injected failures for testing

All states move to `SaleErrored` if an error is raised.

##### SalePreparing

- Find a matching availability based on the following criteria: `freeSize`, `duration`, `collateralPerByte`, `minPricePerBytePerSecond` and `until`
- Create a reservation
- Move to `SaleSlotReserving` if successful
- Move to `SaleIgnored` if no availability is found or if `BytesOutOfBoundsError` is raised because no space is available
- Move to `SaleFailed` on a `RequestFailed` event from the `marketplace`
- Move to `SaleCancelled` when the cancellation timer, set to the storage contract expiry, elapses

##### SaleSlotReserving

- Check if the slot can be reserved
- Move to `SaleDownloading` if successful
- Move to `SaleIgnored` if `SlotReservationNotAllowedError` is raised or the slot cannot be reserved. The collateral is returned.
- Move to `SaleFailed` on a `RequestFailed` event from the `marketplace`
- Move to `SaleCancelled` when the cancellation timer, set to the storage contract expiry, elapses

##### SaleDownloading

- Select the correct data expiry:
  - When the request is started, the request end date is used
  - Otherwise the expiry date is used
- Stream and persist data via `onStore`
- For each written batch, release bytes from the reservation
- Move to `SaleInitialProving` if successful
- Move to `SaleFailed` on a `RequestFailed` event from the `marketplace`
- Move to `SaleCancelled` when the cancellation timer, set to the storage contract expiry, elapses
- Move to `SaleFilled` on a `SlotFilled` event from the `marketplace`

##### SaleInitialProving

- Wait for a stable initial challenge
- Produce the initial proof via `onProve`
- Move to `SaleFilling` if successful
- Move to `SaleFailed` on a `RequestFailed` event from the `marketplace`
- Move to `SaleCancelled` when the cancellation timer, set to the storage contract expiry, elapses

##### SaleFilling

- Get the slot collateral
- Fill the slot
- Move to `SaleFilled` if successful
- Move to `SaleIgnored` on `SlotStateMismatchError`. The collateral is returned.
- Move to `SaleFailed` on a `RequestFailed` event from the `marketplace`
- Move to `SaleCancelled` when the cancellation timer, set to the storage contract expiry, elapses

##### SaleFilled

- Ensure that the current host has filled the slot by checking the signer address
- Notify by calling the `onFilled` hook
- Call `onExpiryUpdate` to change the data expiry from the expiry date to the request end date
- Move to `SaleProving` (or `SaleProvingSimulated` for simulated mode)
- Move to `SaleFailed` on a `RequestFailed` event from the `marketplace`
- Move to `SaleCancelled` when the cancellation timer, set to the storage contract expiry, elapses

##### SaleProving

- For each period: fetch the challenge, call `onProve`, and submit the proof
- Move to `SalePayout` when the slot request ends
- Re-raise `SlotFreedError` when the slot is freed
- Raise `SlotNotFilledError` when the slot is not filled
- Move to `SaleFailed` on a `RequestFailed` event from the `marketplace`
- Move to `SaleCancelled` when the cancellation timer, set to the storage contract expiry, elapses

##### SaleProvingSimulated

- Submit invalid proofs every `N` periods (`failEveryNProofs` in configuration) to test failure scenarios

##### SalePayout

- Get the current collateral and try to free the slot to ensure that the slot is freed after payout
- Forward the returned collateral to cleanup
- Move to `SaleFinished` if successful
- Move to `SaleFailed` on a `RequestFailed` event from the `marketplace`
- Move to `SaleCancelled` when the cancellation timer, set to the storage contract expiry, elapses

##### SaleFinished

- Call the `onClear` hook
- Call the `onCleanUp` hook

##### SaleFailed

- Free the slot
- Move to `SaleErrored` with the failure message

##### SaleCancelled

- Ensure that the node hosting the slot frees the slot
- Call the `onClear` hook
- Call the `onCleanUp` hook with the current collateral

##### SaleIgnored

- Call the `onCleanUp` hook with the current collateral

##### SaleErrored

- Call the `onClear` hook
- Call the `onCleanUp` hook

##### SaleUnknown

- Recovery entry: get the on-chain state and jump to the appropriate state

#### Slot Queue

The slot queue schedules slot work and instantiates one `SalesAgent` per item with bounded concurrency.

- Accepts `(requestId, slotIndex, …)` items and orders them by priority
- Spawns one `SalesAgent` for each dequeued item (one item per agent)
- Caps concurrent agents at `maxWorkers`
- Supports pause/resume
- Allows controlled requeue when an agent finishes with `reprocessSlot`

##### Slot Ordering

The criteria are applied in the following order:

1) **Unseen before seen** - Items that have not been seen are dequeued first.
2) **More profitable first** - Higher `profitability` wins. `profitability` is `duration * pricePerSlotPerSecond`.
3) **Less collateral first** - The item with the smaller `collateral` wins.
4) **Later expiry first** - If both items carry an `expiry`, the one with the greater timestamp wins.

Within a single request, per-slot items are shuffled before enqueuing so the default slot-index order does not influence priority (a sort-key sketch follows).
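
These four rules translate directly into a Python sort key (ascending
sort), with the shuffle applied per request before pushing. The item
field names follow the text above; the list-based queue is a
simplification of nim-codex's internal structure.

```python
import random

def queue_key(item):
    """Ascending sort key encoding the four ordering rules."""
    return (
        item.seen,             # False sorts before True: unseen first
        -item.profitability,   # duration * pricePerSlotPerSecond, higher first
        item.collateral,       # smaller collateral first
        -(item.expiry or 0),   # later expiry first, when expiry is carried
    )

def push_request_slots(queue: list, items: list) -> None:
    """Shuffle one request's per-slot items, then keep the queue sorted."""
    random.shuffle(items)
    queue.extend(items)
    queue.sort(key=queue_key)
```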
|
||||
|
||||
##### Pause / Resume
|
||||
|
||||
When the Slot queue processes an item with `seen = true`, it means that the item was already evaluated against the current availabilities and did not match.
|
||||
To avoid draining the queue with untenable requests (due to insufficient availability), the queue pauses itself.
|
||||
|
||||
The queue resumes when:
|
||||
|
||||
- `OnAvailabilitySaved` fires after an availability update that increases one of: `freeSize`, `duration`, `minPricePerBytePerSecond`, or `totalRemainingCollateral`.
|
||||
- A new unseen item (`seen = false`) is pushed.
|
||||
- `unpause()` is called explicitly.
|
||||
|
||||
##### Reprocess
|
||||
|
||||
Availability matching occurs in `SalePreparing`.
|
||||
If no availability fits at that time, the sale is ignored with `reprocessSlot` to true, meaning that the slot is added back to the queue with the flag `seen` to true.
|
||||
|
||||
##### Startup
|
||||
|
||||
On `SlotQueue.start()`, the sales module first deletes reservations associated with inactive storage requests, then starts a new `SalesAgent` for each active storage request:
|
||||
|
||||
- Fetch the active `on-chain` active slots.
|
||||
- Delete the local reservations for slots that are not in the active list.
|
||||
- Create a new agent for each slot and assign the `onCleanUp` callback.
|
||||
- Start the agent in the `SaleUnknown` state.
|
||||
|
||||
#### Main Behaviour
|
||||
|
||||
When a new slot request is received, the sales module extracts the pair `(requestId, slotIndex, …)` from the request.
|
||||
A `SlotQueueItem` is then created with metadata such as `profitability`, `collateral`, `expiry`, and the `seen` flag set to `false`.
|
||||
This item is pushed into the `SlotQueue`, where it will be prioritised according to the ordering rules.
|
||||
|
||||
#### SalesAgent
|
||||
|
||||
SalesAgent is the instance that executes the state machine for a single slot.
|
||||
|
||||
- Executes the sale state machine across the slot lifecycle
|
||||
- Holds a `SalesContext` with dependencies and host hooks
|
||||
- Supports crash recovery via the `SaleUnknown` state
|
||||
- Handles errors by entering `SaleErrored`, which runs cleanup routines
|
||||
|
||||
#### SalesContext
|
||||
|
||||
SalesContext is a container for dependencies used by all sales.
|
||||
|
||||
- Provides external interfaces: `Market` (marketplace) and `Clock`
|
||||
- Provides access to `Reservations`
|
||||
- Provides host hooks: `onStore`, `onProve`, `onExpiryUpdate`, `onClear`, `onSale`
|
||||
- Shares the `SlotQueue` handle for scheduling work
|
||||
- Provides configuration such as `simulateProofFailures`
|
||||
- Passed to each `SalesAgent`
|
||||
|
||||
#### Marketplace Subscriptions
|
||||
|
||||
The sales module subscribes to on-chain events to keep the queue and agents consistent.
|
||||
|
||||
##### StorageRequested
|
||||
|
||||
When the marketplace signals a new request, the sales module:
|
||||
|
||||
- Computes collateral for free slots.
|
||||
- Creates per-slot `SlotQueueItem` entries (one per `slotIndex`) with `seen = false`.
|
||||
- Pushes the items into the `SlotQueue`.
|
||||
|
||||
##### SlotFreed
|
||||
|
||||
When the marketplace signals a freed slot (needs repair), the sales module:
|
||||
|
||||
- Retrieves the request data for the `requestId`.
|
||||
- Computes collateral for repair.
|
||||
- Creates a `SlotQueueItem`.
|
||||
- Pushes the item into the `SlotQueue`.
|
||||
|
||||
##### RequestCancelled
|
||||
|
||||
When a request is cancelled, the sales module removes all queue items for that `requestId`.
|
||||
|
||||
##### RequestFulfilled
|
||||
|
||||
When a request is fulfilled, the sales module removes all queue items for that `requestId` and notifies active agents bound to the request.
|
||||
|
||||
##### RequestFailed
|
||||
|
||||
When a request fails, the sales module removes all queue items for that `requestId` and notifies active agents bound to the request.
|
||||
|
||||
##### SlotFilled
|
||||
|
||||
When a slot is filled, the sales module removes the queue item for that specific `(requestId, slotIndex)` and notifies the active agent for that slot.
|
||||
|
||||
##### SlotReservationsFull
|
||||
|
||||
When the marketplace signals that reservations are full, the sales module removes the queue item for that specific `(requestId, slotIndex)`.
|
||||
|
||||
#### Reservations
|
||||
|
||||
The Reservations module manages both Availabilities and Reservations.
|
||||
When an Availability is created, it reserves bytes in the storage module so no other modules can use those bytes.
|
||||
Before a dataset for a slot is downloaded, a Reservation is created, and the freeSize of the Availability is reduced.
|
||||
When bytes are downloaded, the reservation of those bytes in the storage module is released.
|
||||
Accounting of both reserved bytes in the storage module and freeSize in the Availability are cleaned up upon completion of the state machine.
|
||||
|
||||
```mermaid
|
||||
graph TD
|
||||
A[Availability] -->|creates| R[Reservation]
|
||||
A -->|reserves bytes in| SM[Storage Module]
|
||||
R -->|reduces| AF[Availability.freeSize]
|
||||
R -->|downloads data| D[Dataset]
|
||||
D -->|releases bytes to| SM
|
||||
TC[Terminal State] -->|triggers cleanup| C[Cleanup]
|
||||
C -->|returns bytes to| AF
|
||||
C -->|deletes| R
|
||||
C -->|returns collateral to| A
|
||||
```
|
||||
|
||||

#### Hooks

- **onStore**: streams data into the node's storage
- **onProve**: produces proofs for initial and periodic proving
- **onExpiryUpdate**: notifies the client node of a change in the expiry data
- **onSale**: notifies that the host is now responsible for the slot
- **onClear**: notification emitted once the state machine has concluded; used to reconcile Availability bytes and reserved bytes in the storage module
- **onCleanUp**: cleanup hook called in terminal states to release resources, delete reservations, and return collateral to availabilities

#### Error Handling

- Always catch `CancelledError` from `nim-chronos`, log it at trace level, and exit gracefully
- Catch `CatchableError`, log it, and route to `SaleErrored`

#### Cleanup

Cleanup releases resources held by a sales agent and optionally requeues the slot.

- Return reserved bytes to the availability if a reservation exists
- Delete the reservation and return any remaining collateral
- If `reprocessSlot` is true, push the slot back into the queue marked as seen
- Remove the agent from the sales set and track the removal future

#### Resource Management Approach

The nim-codex implementation uses Availabilities and Reservations to manage local storage resources:

##### Reservation Management

- Maintain `Availability` and `Reservation` records locally
- Match incoming slot requests to available capacity using prioritisation rules
- Lock capacity and collateral when creating a reservation
- Release reserved bytes progressively during download and free all remaining resources in terminal states

**Note:** Availabilities and Reservations are completely local to the Storage Provider implementation and are not visible at the protocol level. They provide one approach to managing storage capacity, but other implementations may use different resource management strategies.

---

> **Protocol Compliance Note**: The Storage Provider implementation described above is specific to nim-codex. The only normative requirements for Storage Providers are defined in the [Storage Provider Role](#storage-provider-role) section of Part I. Implementations must satisfy those protocol requirements but may use completely different internal designs.

### Client Implementation

The nim-codex reference implementation provides a complete Client implementation with state machine management for storage request lifecycles. This section documents the nim-codex approach.

The nim-codex implementation uses a state machine pattern to manage purchase lifecycles, providing deterministic state transitions, explicit terminal states, and recovery support. The state machine definitions (state identifiers, transitions, state descriptions, requirements, data models, and interfaces) are documented in the subsections below.

> **Note**: The Purchase module terminology and state machine design are specific to the nim-codex implementation. The protocol only requires that clients interact with the marketplace smart contract as specified in the Client Role section.

#### State Identifiers

- PurchasePending: `pending`
- PurchaseSubmitted: `submitted`
- PurchaseStarted: `started`
- PurchaseFinished: `finished`
- PurchaseErrored: `errored`
- PurchaseCancelled: `cancelled`
- PurchaseFailed: `failed`
- PurchaseUnknown: `unknown`

#### General Rules for All States

- If a `CancelledError` is raised, the state machine logs the cancellation message and takes no further action.
- If a `CatchableError` is raised, the state machine moves to `errored` with the error message.
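
A sketch of the state identifiers as a Nim enum; nim-codex models each state as a separate type, so this enum is purely illustrative:

```nim
type
  PurchaseState = enum
    psPending = "pending"
    psSubmitted = "submitted"
    psStarted = "started"
    psFinished = "finished"
    psErrored = "errored"
    psCancelled = "cancelled"
    psFailed = "failed"
    psUnknown = "unknown"

func isTerminal(s: PurchaseState): bool =
  # `finished`, `cancelled`, and `errored` are the terminal states;
  # `failed` still transitions onward to `errored`.
  s in {psFinished, psCancelled, psErrored}
```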

#### State Transitions

```text
                            |
                            v
   ------------------------ unknown
  |                        /     /
  v                       v     /
pending ----> submitted ----> started ---------> finished <----/
   \            \                                    /
    \            ------------> failed <------------/
     \                           /
      --> cancelled <-----------------------------
```

**Note:**

- Any state can transition to `errored` upon a `CatchableError`.
- `failed` is an intermediate state before `errored`.
- `finished`, `cancelled`, and `errored` are terminal states.

#### State Descriptions

**Pending State (`pending`)**

A storage request is being created by making an on-chain call. If the storage request creation fails, the state machine moves to the `errored` state with the corresponding error.

**Submitted State (`submitted`)**

The storage request has been created and the purchase waits for the request to start. When it starts, an on-chain `RequestFulfilled` event is emitted, triggering the subscription callback, and the state machine moves to the `started` state. If the expiry is reached before the callback is called, the state machine moves to the `cancelled` state.

**Started State (`started`)**

The purchase is active and waits until the end of the request, defined by the storage request parameters, before moving to the `finished` state. A subscription is made to the marketplace to be notified about request failure. If a request failure is notified, the state machine moves to `failed`.

Marketplace subscription signature:

```nim
method subscribeRequestFailed*(market: Market, requestId: RequestId, callback: OnRequestFailed): Future[Subscription] {.base, async.}
```

**Finished State (`finished`)**

The purchase is considered successful and cleanup routines are called. The purchase module calls `marketplace.withdrawFunds` to release the funds locked by the marketplace:

```nim
method withdrawFunds*(market: Market, requestId: RequestId) {.base, async: (raises: [CancelledError, MarketError]).}
```

After that, the purchase is done; no more states are called and the state machine stops successfully.

**Failed State (`failed`)**

If the marketplace emits a `RequestFailed` event, the state machine moves to the `failed` state and the purchase module calls `marketplace.withdrawFunds` (same signature as above) to release the funds locked by the marketplace. After that, the state machine moves to `errored`.

**Cancelled State (`cancelled`)**

The purchase is cancelled and the purchase module calls `marketplace.withdrawFunds` (same signature as above) to release the funds locked by the marketplace. After that, the purchase is terminated; no more states are called and the state machine stops with the reason of failure as the error.

**Errored State (`errored`)**

The purchase is terminated; no more states are called and the state machine stops with the reason of failure as the error.

**Unknown State (`unknown`)**

The purchase is in recovery mode, meaning that its state has to be determined. The purchase module calls the marketplace to get the request data (`getRequest`) and the request state (`requestState`):

```nim
method getRequest*(market: Market, id: RequestId): Future[?StorageRequest] {.base, async: (raises: [CancelledError]).}

method requestState*(market: Market, requestId: RequestId): Future[?RequestState] {.base, async.}
```

Based on this information, it moves to the corresponding next state.
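
The recovery decision might look like the sketch below, reusing the illustrative `PurchaseState` enum from earlier; the `RequestState` enum here is an assumption standing in for the marketplace contract's request states:

```nim
type
  RequestState = enum
    rsNew, rsStarted, rsCancelled, rsFinished, rsFailed

func recoverState(onChain: RequestState): PurchaseState =
  # Map the on-chain request state to the next purchase state.
  case onChain
  of rsNew: psSubmitted       # request exists but has not started yet
  of rsStarted: psStarted
  of rsCancelled: psCancelled
  of rsFinished: psFinished
  of rsFailed: psFailed       # `failed` then transitions to `errored`
```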

> **Note**: Functional and non-functional requirements for the client role are summarized in the [Codex Marketplace Specification](https://github.com/codex-storage/codex-spec/blob/master/specs/marketplace.md). The requirements listed below are specific to the nim-codex Purchase module implementation.

#### Functional Requirements

##### Purchase Definition

- Every purchase MUST represent exactly one `StorageRequest`
- The purchase MUST have a unique, deterministic identifier `PurchaseId` derived from `requestId`
- It MUST be possible to restore any purchase from its `requestId` after a restart
- A purchase is considered expired when the expiry timestamp in its `StorageRequest` is reached before the request starts, i.e., before the `marketplace` emits a `RequestFulfilled` event

##### State Machine Progression

- New purchases MUST start in the `pending` state (submission flow)
- Recovered purchases MUST start in the `unknown` state (recovery flow)
- The state machine MUST progress step-by-step until a deterministic terminal state is reached
- The choice of terminal state MUST be based on the `RequestState` returned by the `marketplace`

##### Failure Handling

- On marketplace failure events, the purchase MUST immediately transition to `errored` without retries
- If a `CancelledError` is raised, the state machine MUST log the cancellation and stop further processing
- If a `CatchableError` is raised, the state machine MUST transition to `errored` and record the error

#### Non-Functional Requirements

##### Execution Model

A purchase MUST be handled by a single thread; only one worker SHOULD process a given purchase instance at a time.

##### Reliability

`load` supports recovery after process restarts.

##### Performance

State transitions should be non-blocking; all I/O is async.

##### Logging

All state transitions and errors should be clearly logged for traceability.

##### Safety

- Avoid side effects during `new` other than initialising internal fields; on-chain interactions are delegated to states using the `marketplace` dependency.
- Define a retry policy for external calls.

##### Testing

- Unit tests check that each state handles success and error properly.
- Integration tests check that a full purchase flows correctly through states.

---

> **Protocol Compliance Note**: The Client implementation described above is specific to nim-codex. The only normative requirements for Clients are defined in the [Client Role](#client-role) section of Part I. Implementations must satisfy those protocol requirements but may use completely different internal designs.

---

## Copyright

Copyright and related rights waived via [CC0](https://creativecommons.org/publicdomain/zero/1.0/).

## References

### Normative

- [RFC 2119](https://www.ietf.org/rfc/rfc2119.txt) - Key words for use in RFCs to Indicate Requirement Levels
- [Reed-Solomon algorithm](https://hackmd.io/FB58eZQoTNm-dnhu0Y1XnA) - Erasure coding algorithm used for data encoding
- [CIDv1](https://github.com/multiformats/cid#cidv1) - Content Identifier specification
- [multihash](https://github.com/multiformats/multihash) - Self-describing hashes
- [Proof-of-Data-Possession](https://hackmd.io/2uRBltuIT7yX0CyczJevYg?view) - Zero-knowledge proof system for storage verification
- [Original Codex Marketplace Spec](https://github.com/codex-storage/codex-spec/blob/master/specs/marketplace.md) - Source specification for this document

### Informative

- [Codex Implementation](https://github.com/codex-storage/nim-codex) - Reference implementation in Nim
- [Codex market implementation](https://github.com/codex-storage/nim-codex/blob/master/codex/market.nim) - Marketplace module implementation
- [Codex Sales Component Spec](https://github.com/codex-storage/codex-docs-obsidian/blob/main/10%20Notes/Specs/Component%20Specification%20-%20Sales.md) - Storage Provider implementation details
- [Codex Purchase Component Spec](https://github.com/codex-storage/codex-docs-obsidian/blob/main/10%20Notes/Specs/Component%20Specification%20-%20Purchase.md) - Client implementation details
- [Nim Chronos](https://github.com/status-im/nim-chronos) - Async/await framework for Nim
- [Storage proof timing design](https://github.com/codex-storage/codex-research/blob/41c4b4409d2092d0a5475aca0f28995034e58d14/design/storage-proof-timing.md) - Proof timing mechanism

377
codex/raw/community-history.md
Normal file

@@ -0,0 +1,377 @@

---
title: CODEX-COMMUNITY-HISTORY
name: Codex Community History
status: raw
tags: codex
editor:
contributors:
- Jimmy Debe <jimmy@status.im>
---

## Abstract

This document describes how nodes in Status Communities archive historical message data of their communities
using the [BitTorrent protocol](https://www.bittorrent.org/beps/bep_0003.html),
without being bound by the time range limit of [13/WAKU2-STORE](../../waku/standards/core/13/store.md) nodes.
It also describes how the archives are distributed to community members via the [Status network](https://status.network/),
so they can fetch them and
gain access to a complete message history.

## Background

Messages are stored by [13/WAKU2-STORE](../../waku/standards/core/13/store.md) nodes for a configurable time range,
which is limited by the overall storage provided by a [13/WAKU2-STORE](../../waku/standards/core/13/store.md) node.
Messages older than that period are no longer provided by [13/WAKU2-STORE](../../waku/standards/core/13/store.md) nodes,
making it impossible for other nodes to request historical messages that go beyond that time range.
This raises issues in the case of Status communities,
where recently joined members of a community are not able to request complete message histories of the community channels.

### Terminology

| Name | Description |
| ---- | -------------- |
| Waku node | A [10/WAKU2](../../waku/standards/core/10/waku.md) node that implements [11/WAKU2-RELAY](../../waku/standards/core/11/relay.md) |
| Store node | A [10/WAKU2](../../waku/standards/core/10/waku.md) node that implements [13/WAKU2-STORE](../../waku/standards/core/13/store.md) |
| Waku network | A group of [10/WAKU2](../../waku/standards/core/10/waku.md) nodes forming a graph, connected via [11/WAKU2-RELAY](../../waku/standards/core/11/relay.md) |
| Status user | A Status account that is used in a Status consumer product, such as Status Mobile or Status Desktop |
| Status node | A Status client run by a Status application |
| Control node | A Status node that owns the private key for a Status community |
| Community member | A Status user that is part of a Status community, not owning the private key of the community |
| Community member node | A Status node with message archive capabilities enabled, run by a community member |
| Live messages | [14/WAKU2-MESSAGE](../../waku/standards/core/14/message.md) received through the Waku network |
| BitTorrent client | A program implementing the BitTorrent protocol |
| Torrent/Torrent file | A file containing metadata about data to be downloaded by BitTorrent clients |
| Magnet link | A link encoding the metadata provided by a torrent file (Magnet URI scheme) |

## Specification

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD",
"SHOULD NOT", "RECOMMENDED", "MAY", and
"OPTIONAL" in this document are to be interpreted as described in [RFC 2119](https://www.ietf.org/rfc/rfc2119.txt).

### Message History Archive

Message history archives are represented as `WakuMessageArchive` and
created from a [14/WAKU2-MESSAGE](../../waku/standards/core/14/message.md) exported from the local database.
The following describes the protocol buffer for `WakuMessageArchive`:

``` protobuf
syntax = "proto3";

message WakuMessageArchiveMetadata {
  uint32 version = 1; // proto3 has no uint8 type; a single-byte version is carried as uint32
  uint64 from = 2;
  uint64 to = 3;
  repeated string content_topic = 4;
}

message WakuMessageArchive {
  uint32 version = 1;
  WakuMessageArchiveMetadata metadata = 2;
  repeated WakuMessage messages = 3; // `WakuMessage` is provided by 14/WAKU2-MESSAGE
  bytes padding = 4;
}
```

The `from` field SHOULD contain a `timestamp` of the time range's lower bound.
This parallels the `timestamp` of a `WakuMessage`.
The `to` field SHOULD contain a `timestamp` of the time range's upper bound.
The `content_topic` field MUST contain a list of all community channel `contentTopic`s.
The `messages` field MUST contain all messages that belong in the archive, given its `from`,
`to`, and `content_topic` fields.

The `padding` field MUST contain the number of zero bytes needed so that the overall byte size
of the protobuf-encoded `WakuMessageArchive` is a multiple of the `pieceLength` used to divide the data into pieces.
This is needed for seamless encoding and
decoding of archival data when interacting with BitTorrent,
as explained in [creating message archive torrents](#creating-message-archive-torrents).
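
The padding rule can be sketched as below. This is simplified: the spec places the padding inside the `padding` field and re-encodes, whereas this sketch simply appends zero bytes to an already encoded byte string:

```nim
proc padToPieceLength(encoded: seq[byte], pieceLength: int): seq[byte] =
  result = encoded
  let remainder = result.len mod pieceLength
  if remainder != 0:
    # Append zero bytes up to the next piece boundary.
    result.add newSeq[byte](pieceLength - remainder)

# Example: a 21-byte archive with pieceLength 10 is padded to 30 bytes.
doAssert padToPieceLength(newSeq[byte](21), 10).len == 30
```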

#### Message History Archive Index

Control nodes MUST provide message archives for the entire community history.
The entire history consists of a set of `WakuMessageArchive`,
where each archive contains a subset of historical `WakuMessage` for a time range of seven days.
All the `WakuMessageArchive` are concatenated into a single file as a byte string, see [Ensuring reproducible data pieces](#ensuring-reproducible-data-pieces).

Control nodes MUST create a message history archive index,
`WakuMessageArchiveIndex` with metadata,
that allows receiving nodes to only fetch the message history archives they are interested in.

##### WakuMessageArchiveIndex

``` protobuf
syntax = "proto3";

message WakuMessageArchiveIndexMetadata {
  uint32 version = 1;
  WakuMessageArchiveMetadata metadata = 2;
  uint64 offset = 3;
  uint64 num_pieces = 4;
}

message WakuMessageArchiveIndex {
  map<string, WakuMessageArchiveIndexMetadata> archives = 1;
}
```

A `WakuMessageArchiveIndex` is a map where each key is the KECCAK-256 hash of the `WakuMessageArchiveIndexMetadata`
derived from a 7-day archive,
and the value is the `WakuMessageArchiveIndexMetadata` corresponding to that archive.

The `offset` field MUST contain the position at which the message history archive starts in the byte string
of the total message archive data.
This MUST be the sum of the lengths, in bytes, of all previously created message archives, see [creating message archive torrents](#creating-message-archive-torrents).

The control node MUST update the `WakuMessageArchiveIndex` every time it creates one or
more `WakuMessageArchive`s and bundle it into a new torrent.
For every created `WakuMessageArchive`,
there MUST be a `WakuMessageArchiveIndexMetadata` entry in the `archives` field of the `WakuMessageArchiveIndex`.
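
The offset rule reduces to a cumulative sum of archive sizes, as in this sketch (sizes are illustrative):

```nim
proc archiveOffsets(sizes: seq[uint64]): seq[uint64] =
  var running = 0'u64
  for size in sizes:
    result.add running   # offset of this archive in the concatenated file
    running += size

doAssert archiveOffsets(@[30'u64, 50, 20]) == @[0'u64, 30, 80]
```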

### Creating Message Archive Torrents

Control nodes MUST create a .torrent file containing metadata for all message history archives.
To create a .torrent file, and
later serve the message archive data on the BitTorrent network,
control nodes MUST store the necessary data in dedicated files on the file system.

A torrent's source folder MUST contain the following two files:

- `data`: Contains all protobuf encoded `WakuMessageArchive`s (as byte strings)
concatenated in ascending order based on their time range
- `index`: Contains the protobuf encoded `WakuMessageArchiveIndex`

Control nodes SHOULD store these files in a dedicated folder that is identifiable via a community identifier.

### Ensuring Reproducible Data Pieces

The control node MUST ensure that the byte string of the newly encoded data
is equal to the byte string of the previously generated message archive torrent,
followed by the latest seven days worth of messages encoded as `WakuMessageArchive`.
The size of the data therefore grows every seven days, as it is append-only.

Control nodes MUST ensure that the byte size
of every individual protobuf-encoded `WakuMessageArchive`
is a multiple of `pieceLength`, using the padding field.
If the `WakuMessageArchive` is not a multiple of `pieceLength`,
its padding field MUST be filled with zero bytes and
the `WakuMessageArchive` MUST be re-encoded until its size becomes a multiple of `pieceLength`.

This is necessary because the content of the data file will be split into pieces of `pieceLength` when the torrent file is created,
and the SHA1 hash of every piece is then stored in the torrent file and
later used by other nodes to request the data for each individual data piece.

By fitting message archives into a multiple of `pieceLength` and
filling the possible remaining space with zero bytes,
control nodes prevent the next message archive from occupying the remaining space of the last piece,
which would result in a different SHA1 hash for that piece.

Example: without padding.
Let `WakuMessageArchive` "A1" be of size 20 bytes:

``` text
 0 11 22 33 44 55 66 77 88 99
10 11 12 13 14 15 16 17 18 19
```

With a `pieceLength` of 10 bytes, A1 will fit into 20 / 10 = 2 pieces:

```text
 0 11 22 33 44 55 66 77 88 99 // piece[0] SHA1: 0x123
10 11 12 13 14 15 16 17 18 19 // piece[1] SHA1: 0x456
```

Example: with padding.
Let `WakuMessageArchive` "A2" be of size 21 bytes:

```text
 0 11 22 33 44 55 66 77 88 99
10 11 12 13 14 15 16 17 18 19
20
```

With a `pieceLength` of 10 bytes,
A2 will fit into 21 / 10 = 2 pieces.

The remainder will introduce a third piece:

```text
 0 11 22 33 44 55 66 77 88 99 // piece[0] SHA1: 0x123
10 11 12 13 14 15 16 17 18 19 // piece[1] SHA1: 0x456
20                            // piece[2] SHA1: 0x789
```

The next `WakuMessageArchive` "A3" will be appended ("#3") to the existing data and
occupy the remaining space of the third data piece.

The piece at index 2 will now produce a different SHA1 hash:

```text
 0 11 22 33 44 55 66 77 88 99 // piece[0] SHA1: 0x123
10 11 12 13 14 15 16 17 18 19 // piece[1] SHA1: 0x456
20 #3 #3 #3 #3 #3 #3 #3 #3 #3 // piece[2] SHA1: 0xeef
#3 #3 #3 #3 #3 #3 #3 #3 #3 #3 // piece[3]
```

By filling up the remaining space of the third piece of A2 with zero bytes, using its padding field,
it is guaranteed that its SHA1 will stay the same:

```text
 0 11 22 33 44 55 66 77 88 99 // piece[0] SHA1: 0x123
10 11 12 13 14 15 16 17 18 19 // piece[1] SHA1: 0x456
20  0  0  0  0  0  0  0  0  0 // piece[2] SHA1: 0x999
#3 #3 #3 #3 #3 #3 #3 #3 #3 #3 // piece[3]
#3 #3 #3 #3 #3 #3 #3 #3 #3 #3 // piece[4]
```
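
The examples above hinge on pieces being hashed independently, as this sketch illustrates; any piece-hash function could stand in here for BitTorrent's SHA1:

```nim
proc pieces(data: seq[byte], pieceLength: int): seq[seq[byte]] =
  # Split the data file into fixed-size pieces (last one may be short).
  var i = 0
  while i < data.len:
    result.add data[i ..< min(i + pieceLength, data.len)]
    i += pieceLength

# Appending a new archive after zero-padding adds whole new pieces;
# the byte content of earlier pieces, and hence their hashes, is unchanged.
let before = pieces(newSeq[byte](30), 10)
let after  = pieces(newSeq[byte](30) & newSeq[byte](20), 10)
doAssert after[0 ..< before.len] == before
```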

### Seeding Message History Archives

The control node MUST seed the generated torrent until a new `WakuMessageArchive` is created.

The control node SHOULD NOT seed torrents for older message history archives.
Only one torrent at a time SHOULD be seeded.

### Creating Magnet Links

Once a torrent file for all message archives is created,
the control node MUST derive a magnet link,
following the Magnet URI scheme, using the underlying [BitTorrent protocol](https://www.bittorrent.org/beps/bep_0003.html) client.

#### Message Archive Distribution

Message archives are available via the BitTorrent network as they are being seeded by the control node.
Other community member nodes will download the message archives from the BitTorrent network,
after receiving a magnet link that contains a message archive index.

The control node MUST send magnet links containing message archives and
the message archive index to a special community channel.
The `content_topic` of that special channel follows this format:

``` text
/{application-name}/{version-of-the-application}/{content-topic-name}/{encoding}
```

All messages sent with this special channel's `content_topic` MUST be instances of `ApplicationMetadataMessage`,
with a [62/STATUS-PAYLOADS](../../status/62/payloads.md) of `CommunityMessageArchiveIndex`.

Only the control node MAY post to the special channel.
Other messages on this specified channel MUST be ignored by clients.
Community members MUST NOT have permission to send messages to the special channel.
However, community member nodes MUST subscribe to the special channel,
to receive a [14/WAKU2-MESSAGE](../../waku/standards/core/14/message.md) containing magnet links for message archives.

#### Canonical Message Histories

Only control nodes are allowed to distribute messages with magnet links,
via the special channel for magnet link exchange.
Status nodes MUST ignore all messages in the special channel that aren't signed by a control node.
Since the magnet links are created from the control node's database
(and previously distributed archives),
the message history provided by the control node becomes the canonical message history and
single source of truth for the community.

Community member nodes MUST replace messages in their local database with the messages extracted from archives
within the same time range.
Messages that the control node didn't receive MUST be removed and
are no longer part of the message history of interest,
even if they already existed in a community member node's database.

### Fetching Message History Archives

The process of fetching message history:

1. Receive the message archive index magnet link as described in [Message archive distribution](#message-archive-distribution),
2. Download the index file from the torrent, then determine which message archives to download,
3. Download the individual archives.

Community member nodes subscribe to the special channel of the control nodes that publish magnet links for message history archives.
There are two RECOMMENDED scenarios in which community member nodes can receive such a magnet link message from the special channel:

1. The member node receives it via live messages, by listening to the special channel.
2. The member node requests messages for a time range of up to 30 days from store nodes
(this is the case when a new community member joins a community).

#### Downloading Message Archives

When community member nodes receive a message with a `CommunityMessageHistoryArchive` [62/STATUS-PAYLOADS](../../status/62/payloads.md),
they MUST extract the `magnet_uri`.
They SHOULD then pass it to their underlying BitTorrent client to fetch the latest message history archive index,
which is the index file of the torrent, see [Creating message archive torrents](#creating-message-archive-torrents).

Due to the nature of distributed systems,
there's no guarantee that a received message is the "last" message.
This is especially true when community member nodes request historical messages from store nodes.
Therefore, community member nodes MUST wait for 20 seconds after receiving the last `CommunityMessageArchive`,
before they start extracting the magnet link to fetch the latest archive index.

Once a message history archive index is downloaded and
parsed back into `WakuMessageArchiveIndex`,
community member nodes use a local lookup table to determine which of the listed archives are missing,
using the KECCAK-256 hashes stored in the index.

For this lookup to work,
member nodes MUST store the KECCAK-256 hashes
of the `WakuMessageArchiveIndexMetadata` provided by the index file,
for all of the message history archives that have been downloaded, in their local database.

Given a `WakuMessageArchiveIndex`, member nodes can access individual `WakuMessageArchiveIndexMetadata` to download individual archives.

Community member nodes MUST choose one of the following options:

1. Download all archives: Request and download all data pieces for the data provided by the torrent
(this is the case for new community member nodes that haven't downloaded any archives yet).
2. Download only the latest archive: Request and
download all pieces starting at the offset of the latest `WakuMessageArchiveIndexMetadata`
(this is the case for any member node that has already downloaded all previous history and
is now interested in only the latest archive).
3. Download specific archives: Look at the `from` and
`to` fields of every `WakuMessageArchiveIndexMetadata` and
determine the pieces for archives of a specific time range
(this can be the case for member nodes that have recently joined the network and
are only interested in a subset of the complete history).
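
The missing-archive lookup is a set difference over metadata hashes, as in this sketch (hash values are illustrative strings):

```nim
import std/[sets, sequtils]

proc missingArchives(indexHashes: seq[string],
                     localHashes: HashSet[string]): seq[string] =
  # Keep only the archives whose metadata hash is not stored locally yet.
  indexHashes.filterIt(it notin localHashes)

let local = toHashSet(@["0x123", "0x456"])
doAssert missingArchives(@["0x123", "0x456", "0x789"], local) == @["0x789"]
```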

#### Storing Historical Messages

When message archives are fetched,
community member nodes MUST unwrap the resulting `WakuMessage` instances into `ApplicationMetadataMessage` instances
and store them in their local database.
Community member nodes SHOULD NOT store the wrapped `WakuMessage` messages.

All messages within the same time range MUST be replaced with the messages provided by the message history archive.

Community member nodes MUST ignore the expiration state of each archive message.

### Security Considerations

#### Multiple Community Owners

It is possible for control nodes to export the private key of their owned community and
pass it to other users so they become control nodes as well.
This means it's possible for multiple control nodes to exist for one community.

This might conflict with the assumption that the control node serves as a single source of truth.
Multiple control nodes can have different message histories.
Not only will multiple control nodes multiply the amount of archive index messages being distributed to the network,
but they might also contain different sets of magnet links and their corresponding hashes.
Even if just a single message is missing in one of the histories,
the hashes presented in the archive indices will look completely different,
resulting in the community member node downloading the corresponding archive.
This might be identical to an archive that was already downloaded,
except for that one message.

## Copyright

Copyright and related rights waived via [CC0](https://creativecommons.org/publicdomain/zero/1.0/).

## References

- [13/WAKU2-STORE](../../waku/standards/core/13/store.md)
- [BitTorrent protocol](https://www.bittorrent.org/beps/bep_0003.html)
- [Status network](https://status.network/)
- [10/WAKU2](../../waku/standards/core/10/waku.md)
- [11/WAKU2-RELAY](../../waku/standards/core/11/relay.md)
- [14/WAKU2-MESSAGE](../../waku/standards/core/14/message.md)
- [62/STATUS-PAYLOADS](../../status/62/payloads.md)

@@ -40,7 +40,7 @@ Request For Comments specification process managed by the Vac service department

## License

Copyright (c) 2008-24 the Editor and Contributors.
Copyright (c) 2008-26 the Editor and Contributors.

This Specification is free software;
you can redistribute it and/or

531
vac/raw/mix.md

@@ -120,6 +120,18 @@ contents as packets are forwarded hop-by-hop.

Sphinx packets are fixed-size and indistinguishable from one another, providing
unlinkability and metadata protection.

- **Initialization Vector (IV)**
A fixed-length input used to initialize block ciphers to add randomness to the
encryption process. It ensures that encrypting the same plaintext with the same
key produces different ciphertexts. The IV is not secret but must be unique for
each encryption.

- **Single-Use Reply Block (SURB)**
A pre-computed Sphinx header that encodes a return path back to the sender.
SURBs are generated by the sender and included in the Sphinx packet sent to the recipient.
It enables the recipient to send anonymous replies,
without learning the sender’s identity, the return path, or the forwarding delays.

## 3. Motivation and Background

libp2p enables modular peer-to-peer applications, but it lacks built-in support

@@ -194,7 +206,7 @@ describe the core components of a mix network.

The Mix Protocol relies on two core design elements to achieve sender
unlinkability and metadata
protection: a mixing strategy and a a cryptographically secure mix packet
protection: a mixing strategy and a cryptographically secure mix packet
format.

### 4.1 Mixing Strategy

@@ -297,7 +309,7 @@ layered encryption and per-hop delay provides resistance to traffic analysis and

enables message-level
unlinkability.

Unlike typical custom libp2p protocols, the Mix protocol is stateless—it
Unlike typical custom libp2p protocols, the Mix Protocol is stateless—it
does not establish
persistent streams, negotiate protocols, or maintain sessions. Each message is
self-contained

@@ -458,8 +470,8 @@ multistream handshake for protocol negotiation.

While libp2p supports multiplexing multiple streams over a single transport
connection using
stream muxers such as mplex and yamux, it does not natively support reusing a
stream over multiple
stream muxers such as mplex and yamux, it does not natively support reusing the
same stream over multiple
message transmissions. However, stream reuse may be desirable in the mixnet
setting to reduce overhead
and avoid hitting per protocol stream limits between peers.

@@ -512,9 +524,9 @@ adversarial routing bias.

While no existing mechanism provides unbiased sampling by default,
[Waku’s ambient discovery](https://rfc.vac.dev/waku/standards/core/33/discv5/)
— an extension
—an extension
over [Discv5](https://github.com/ethereum/devp2p/blob/master/discv5/discv5.md)
— demonstrates
—demonstrates
an approximate solution. It combines topic-based capability advertisement with
periodic
peer sampling. A similar strategy could potentially be adapted for the Mix

@@ -618,7 +630,7 @@ _unobservability_ where

a passive adversary cannot determine whether a node is sending real messages or
not.

In the Mix Protocol, cover traffic is limited to _loop messages_ — dummy
In the Mix Protocol, cover traffic is limited to _loop messages_—dummy
Sphinx packets
that follow a valid mix path and return to the originating node. These messages
carry no application

@@ -627,7 +639,7 @@ routing behavior.

Cover traffic MAY be generated by either mix nodes or senders. The strategy for
generating
such traffic — such as timing and frequency — is pluggable and not
such traffic—such as timing and frequency—is pluggable and not
specified
in this document.

@@ -667,9 +679,9 @@ deter participation by compromised or transient peers.

The Mix Protocol does not mandate any form of payment, token exchange, or
accounting. More
sophisticated economic models — such as stake-based participation,
sophisticated economic models—such as stake-based participation,
credentialed relay networks,
or zero-knowledge proof-of-contribution systems — MAY be layered on top of
or zero-knowledge proof-of-contribution systems—MAY be layered on top of
the protocol or
enforced via external coordination.

@@ -779,6 +791,11 @@ destination origin protocol instance.

The node MUST NOT retain decrypted content after forwarding.

The routing behavior described in this section relies on the use of
Sphinx packets to preserve unlinkability and confidentiality across
hops. The next section specifies their structure, cryptographic
components, and construction.

## 8. Sphinx Packet Format

The Mix Protocol uses the Sphinx packet format to enable unlinkable, multi-hop

@@ -789,8 +806,10 @@ encapsulated in a Sphinx packet constructed by the initiating mix node. The

packet is encrypted in
layers such that each hop in the mix path can decrypt exactly one layer and
obtain the next-hop
routing information and delay value, without learning the complete path or the
routing information and forwarding delay, without learning the complete path or the
message origin.
Only the final hop learns the destination, which is encoded in the innermost
routing layer.

Sphinx packets are self-contained and indistinguishable on the wire, providing
strong metadata

@@ -820,12 +839,10 @@ subsections.

### 8.1 Packet Structure Overview

Each Sphinx packet consists of three fixed-length header fields — $α$,
$β$, and $γ$ —
followed by a fixed-length encrypted payload $δ$. Together, these components
enable per-hop message
processing with strong confidentiality and integrity guarantees in a stateless
and unlinkable manner.
Each Sphinx packet consists of three fixed-length header fields—$α$,
$β$, and $γ$—followed by a fixed-length encrypted payload $δ$.
Together, these components enable per-hop message processing with strong
confidentiality and integrity guarantees in a stateless and unlinkable manner.

- **$α$ (Alpha)**: An ephemeral public value. Each mix node uses its private key
and $α$ to

@@ -833,8 +850,9 @@ derive a shared session key for that hop. This session key is used to decrypt

and process
one layer of the packet.
- **$β$ (Beta)**: The nested encrypted routing information. It encodes the next
hop address, the forwarding delay,
integrity check $γ$ for the next hop, and the $β$ for subsequent hops.
hop address, the forwarding delay, integrity check $γ$ for the next hop, and
the $β$ for subsequent hops. At the final hop, $β$ encodes the destination
address and fixed-length zero padding to preserve uniform size.
- **$γ$ (Gamma)**: A message authentication code computed over $β$ using the
session key derived
from $α$. It ensures header integrity at each hop.

@@ -843,12 +861,12 @@ fixed maximum length and

encrypted in layers corresponding to each hop in the mix path.

At each hop, the mix node derives the session key from $α$, verifies the header
integrity
using $γ$, decrypts one layer of $β$ to extract the next hop and delay, and
decrypts one layer
of $δ$. It then constructs a new packet with updated values of $α$, $β$, $γ$,
and $δ$, and
forwards it to the next hop.
integrity using $γ$, decrypts one layer of $β$ to extract the next hop and
delay, and decrypts one layer of $δ$. It then constructs a new packet with
updated values of $α$, $β$, $γ$, and $δ$, and forwards it to the next hop. At
the final hop, the mix node decrypts the innermost layer of $β$ and $δ$, which
yields the destination address and the original application message
respectively.

All Sphinx packets are fixed in size and indistinguishable on the wire. This
uniform format,

@@ -866,20 +884,29 @@ This section defines the cryptographic primitives used in Sphinx packet

construction and processing.

- **Security Parameter**: All cryptographic operations target a minimum of
$\kappa = 128$ bits of
$κ = 128$ bits of
security, balancing performance with resistance to modern attacks.

- **Elliptic Curve Group $\mathbb{G}$**:
  - **Curve**: Curve25519
  - **Notation**: Let $g$ denote the canonical base point (generator) of $\mathbb{G}$.
  - **Purpose**: Used for deriving a Diffie–Hellman-style shared key at each hop
using $α$.
  - **Representation**: Small 32-byte group elements, efficient for both
encryption and key exchange.
  - **Scalar Field**: The curve is defined over the finite field
$\mathbb{Z}_q$, where $q = 2^{252} + 27742317777372353535851937790883648493$.
Ephemeral exponents used in Sphinx packet construction are selected uniformly
at random from $\mathbb{Z}_q^*$, the multiplicative subgroup of $\mathbb{Z}_q$.

- **Hash Function**:
  - **Construction**: SHA-256
  - **Notation**: The hash function is denoted by $H(\cdot)$ in subsequent sections.

- **Key Derivation Function (KDF)**:
  - **Purpose**: To derive encryption keys, IVs, and MAC keys from the shared
session key at each hop.
  - **Construction**: SHA-256 hash with output truncated to 128 bits.
  - **Construction**: SHA-256 hash with output truncated to $128$ bits.
  - **Key Derivation**: The KDF key separation labels (_e.g.,_ `"aes_key"`,
`"mac_key"`)
are fixed strings and MUST be agreed upon across implementations.

@@ -889,9 +916,457 @@ are fixed strings and MUST be agreed upon across implementations.

- **Keys and IVs**: Each derived from the session key for the hop using the KDF.

- **Message Authentication Code (MAC)**:
  - **Construction**: HMAC-SHA-256 with output truncated to 128 bits.
  - **Construction**: HMAC-SHA-256 with output truncated to $128$ bits.
  - **Purpose**: To compute $γ$ for each hop.
  - **Key**: Derived using KDF from the session key for the hop.

These primitives are used consistently throughout packet construction and
decryption, as described in the following sections.

### 8.3 Packet Component Sizes

This section defines the size of each component in a Sphinx packet, deriving them
from the security parameter and protocol parameters introduced earlier. All Sphinx
packets MUST be fixed in length to ensure uniformity and indistinguishability on
the wire. The serialized packet is structured as follows:

```text
+--------+----------+--------+----------+
|   α    |    β     |   γ    |    δ     |
|  32 B  | variable |  16 B  | variable |
+--------+----------+--------+----------+
```

#### 8.3.1 Header Field Sizes

The header consists of the fields $α$, $β$, and $γ$, totaling a fixed size per
maximum path length:

- **$α$ (Alpha)**: 32 bytes
The size of $α$ is determined by the elliptic curve group representation used
(Curve25519), which encodes group elements as 32-byte values.

- **$β$ (Beta)**: $((t + 1)r + 1)κ$ bytes
The size of $β$ depends on:
  - **Maximum path length ($r$)**: The recommended value of $r=5$ balances
bandwidth versus anonymity tradeoffs.
  - **Combined address and delay width ($tκ$)**: The recommended $t=6$
accommodates standard libp2p relay multiaddress representations plus a
2-byte delay field. While the actual multiaddress and delay fields may be
shorter, they are padded to $tκ$ bytes to maintain fixed field size. The
structure and rationale for the $tκ$ block and its encoding are specified in
[Section 8.4](#84-address-and-delay-encoding).

    Note: This expands on the original
[Sphinx packet format](https://cypherpunks.ca/~iang/pubs/Sphinx_Oakland09.pdf),
which embeds a fixed $κ$-byte mix node identifier per hop in $β$.
The Mix Protocol generalizes this to $tκ$ bytes to accommodate libp2p
multiaddresses and forwarding delays while preserving the cryptographic
properties of the original design.

  - **Per-hop $γ$ size ($κ$)** (defined below): Accounts for the integrity tag
included with each hop’s routing information.

  Using the recommended values of $r=5$ and $t=6$, the resulting $β$ size is
$576$ bytes. At the final hop, $β$ encodes the destination address in the
first $tκ-2$ bytes and the remaining bytes are zero-padded.

- **$γ$ (Gamma)**: $16$ bytes
The size of $γ$ equals the security parameter $κ$, providing a $κ$-bit integrity
tag at each hop.

Thus, the total header length is:

$`
\begin{aligned}
|Header| &= |α| + |β| + |γ| \\
&= 32 + ((t + 1)r + 1)κ + 16
\end{aligned}
`$

Notation: $|x|$ denotes the size (in bytes) of field $x$.

Using the recommended values of $r = 5$ and $t = 6$, the header size is:

$`
\begin{aligned}
|Header| &= 32 + 576 + 16 \\
&= 624\ bytes
\end{aligned}
`$
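
A quick check of these sizes with the recommended parameters $r = 5$, $t = 6$, and $κ = 16$ bytes from this section:

```nim
const
  kappa = 16                          # security parameter in bytes
  r = 5                               # maximum path length
  t = 6                               # address + delay width multiplier
  alphaLen = 32                       # Curve25519 group element
  gammaLen = kappa                    # per-hop integrity tag
  betaLen = ((t + 1) * r + 1) * kappa # nested routing information
  headerLen = alphaLen + betaLen + gammaLen

doAssert betaLen == 576
doAssert headerLen == 624
```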

#### 8.3.2 Payload Size

This subsection defines the size of the encrypted payload $δ$ in a Sphinx packet.

$δ$ contains the application message, padded to a fixed maximum length to ensure
all packets are indistinguishable on the wire. The size of $δ$ is calculated as:

$`
\begin{aligned}
|δ| &= TotalPacketSize - HeaderSize
\end{aligned}
`$

The recommended total packet size is $4608$ bytes, chosen to:

- Accommodate larger libp2p application messages, such as those commonly
observed in Status chat using Waku (typically ~4KB payloads),
- Allow inclusion of additional data such as SURBs without requiring fragmentation,
- Maintain reasonable per-hop processing and bandwidth overhead.

This recommended total packet size of $4608$ bytes yields:

$`
\begin{aligned}
|δ| &= 4608 - 624 \\
&= 3984\ bytes
\end{aligned}
`$

Implementations MUST account for payload extensions, such as SURBs,
when determining the maximum message size that can be encapsulated in a
single Sphinx packet. Details on SURBs are defined in
[Section X.X].

The following subsection defines the padding and fragmentation requirements for
ensuring this fixed-size constraint.

#### 8.3.3 Padding and Fragmentation

Implementations MUST ensure that all messages shorter than the maximum payload size
are padded before Sphinx encapsulation to ensure that all packets are
indistinguishable on the wire. Messages larger than the maximum payload size MUST
be fragmented by the origin protocol or top-level application before being passed
to the Mix Protocol. Reassembly is the responsibility of the consuming application,
not the Mix Protocol.

#### 8.3.4 Anonymity Set Considerations

The fixed maximum packet size is a configurable parameter. Protocols or
applications that choose to configure a different packet size (either larger or
smaller than the default) MUST be aware that using unique or uncommon packet sizes
can reduce their effective anonymity set to only other users of the same size.
Implementers SHOULD align with widely used defaults to maximize anonymity set size.

Similarly, parameters such as $r$ and $t$ are configurable. Changes to these
parameters affect header size and therefore impact payload size if the total packet
size remains fixed. However, if such changes alter the total packet size on the
wire, the same anonymity set considerations apply.

The following subsection defines how the next-hop or destination address and
forwarding delay are encoded within $β$ to enable correct routing and mixing
behavior.

### 8.4 Address and Delay Encoding

Each hop’s $β$ includes a fixed-size block containing the next-hop address and
the forwarding delay, except for the final hop, which encodes the destination
address and a delay-sized zero padding. This section defines the structure and
encoding of that block.

The combined address and delay block MUST be exactly $tκ$ bytes in length,
as defined in [Section 8.3.1](#831-header-field-sizes), regardless of the
actual address or delay values. The first $(tκ - 2)$ bytes MUST encode the
address, and the final $2$ bytes MUST encode the forwarding delay.
This fixed-length encoding ensures that packets remain indistinguishable on
the wire and prevents correlation attacks based on routing metadata structure.

Implementations MAY use any address and delay encoding format agreed upon
by all participating mix nodes, as long as the combined length is exactly
$tκ$ bytes. The encoding format MUST be interpreted consistently by all
nodes within a deployment.

For interoperability, a recommended default encoding format involves:

- Encoding the next-hop or destination address as a libp2p multiaddress:
  - To keep the address block compact while allowing relay connectivity, each mix
node is limited to one IPv4 circuit relay multiaddress. This ensures that most
nodes can act as mix nodes, including those behind NATs or firewalls.
  - In libp2p terms, this combines transport addresses with multiple peer
identities to form an address that describes a relay circuit:
`/ip4/<ipv4>/tcp/<port>/p2p/<relayPeerID>/p2p-circuit/p2p/<relayedPeerID>`.
Variants may include directly reachable peers and transports such as
`/quic-v1`, depending on the mix node's supported stack.
  - IPv6 support is deferred, as it adds $16$ bytes just for the IP field.
  - Future revisions may extend this format to support IPv6 or DNS-based
multiaddresses.

  With these constraints, the recommended encoding layout is:
  - IPv4 address (4 bytes)
  - Protocol identifier, _e.g.,_ TCP or QUIC (1 byte)
  - Port number (2 bytes)
  - Peer IDs (39 bytes, post-Base58 decoding)

- Encoding the forwarding delay as an unsigned 16-bit integer (2 bytes) in
milliseconds, using big-endian network byte order.

If the encoded address or delay is shorter than its respective allocated
field, it MUST be padded with zeros. If it exceeds the allocated size, it
MUST be rejected or truncated according to the implementation policy.
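
A sketch of the recommended $(tκ - 2)$-byte address plus 2-byte delay block, using $κ = 16$ and $t = 6$ from Section 8.3.1; real implementations would serialize an actual multiaddress rather than take pre-encoded bytes:

```nim
const addrLen = 6 * 16 - 2        # (tκ - 2) = 94 bytes for the encoded address
const blockLen = addrLen + 2      # tκ = 96 bytes in total

proc encodeAddrAndDelay(encodedAddr: openArray[byte],
                        delayMs: uint16): array[blockLen, byte] =
  doAssert encodedAddr.len <= addrLen, "address exceeds allocated field"
  # Address first, implicitly zero-padded to 94 bytes ...
  for i, b in encodedAddr: result[i] = b
  # ... then the delay as a big-endian unsigned 16-bit integer.
  result[addrLen] = byte(delayMs shr 8)
  result[addrLen + 1] = byte(delayMs and 0xff)
```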

Note: Future versions of the Mix Protocol may support address compression by
encoding only the peer identifier and relying on external peer discovery
mechanisms to retrieve full multiaddresses at runtime. This would allow for
more compact headers and greater address flexibility, but requires fast and
reliable lookup support across deployments. This design is out of scope for
the current version.

With the field sizes and encoding conventions established, the next section describes
how a mix node constructs a complete Sphinx packet when initiating the Mix Protocol.

### 8.5 Packet Construction

This section defines how a mix node constructs a Sphinx packet when initiating
the Mix Protocol on behalf of a local origin protocol instance.
The construction process wraps the message in a sequence of encryption
layers—one for each hop—such that only the corresponding mix node
can decrypt its layer and retrieve the routing instructions for that hop.

#### 8.5.1 Inputs

To initiate the Mix Protocol, the origin protocol instance submits a message
to the Mix Entry Layer on the same node. This layer forwards it to the local
Mix Protocol instance, which constructs a Sphinx packet
using the following REQUIRED inputs:

- **Application message**: The serialized message provided by the origin
protocol instance. The Mix Protocol instance applies any configured spam
protection mechanism and attaches one or two SURBs prior to encapsulating
the message in the Sphinx packet. The initiating node MUST ensure that
the resulting payload size does not exceed the maximum supported size
defined in [Section 8.3.2](#832-payload-size).
- **Origin protocol codec**: The libp2p protocol string corresponding to the
origin protocol instance. This is included in the payload so that
the exit node can route the message to the intended destination protocol
after decryption.
- **Mix path length $L$**: The number of mix nodes to include in the path.
The mix path MUST consist of at least three hops, each representing a
distinct mix node.
- **Destination address $Δ$**: The routing address of the intended recipient
of the message. This address is encoded in $(tκ - 2)$ bytes as defined in
[Section 8.4](#84-address-and-delay-encoding) and revealed only at the last hop.
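
The REQUIRED input checks might look like the sketch below; the `MixInputs` type and `maxMessageLen` constant are illustrative, with the budget taken from step 1 of Section 8.5.2:

```nim
const maxMessageLen = 3968        # padded application message budget

type MixInputs = object
  message: seq[byte]              # serialized application message
  codec: string                   # origin protocol string
  pathLen: int                    # L, number of mix hops
  destination: seq[byte]          # encoded (tκ - 2)-byte address

proc validate(inputs: MixInputs) =
  doAssert inputs.pathLen >= 3, "mix path MUST have at least three hops"
  doAssert inputs.message.len <= maxMessageLen, "message exceeds payload budget"
  doAssert inputs.destination.len == 94, "destination must be (tκ - 2) = 94 bytes"
```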
#### 8.5.2 Construction Steps
|
||||
|
||||
This subsection defines how the initiating mix node constructs a complete
|
||||
Sphinx packet using the inputs defined in
|
||||
[Section 8.5.1](#851-inputs). The construction MUST
|
||||
follow the cryptographic structure defined in
|
||||
[Section 8.1](#81-packet-structure-overview), use the primitives specified in
|
||||
[Section 8.2](#82-cryptographic-primitives), and adhere to the component sizes
|
||||
and encoding formats from [Section 8.3](#83-packet-component-sizes) and
|
||||
[Section 8.4](#84-address-and-delay-encoding).
|
||||
|
||||
The construction MUST proceed as follows:
|
||||
|
||||
1. **Prepare Application Message**
|
||||
|
||||
- Apply any configured spam protection mechanism (_e.g.,_ PoW, VDF, RLN)
|
||||
to the serialized message. Spam protection mechanisms are pluggable as defined
|
||||
in [Section 6.3](#63-spam-protection).
|
||||
- Attach one or more SURBs, if required. Their format and processing are
|
||||
specified in [Section X.X].
|
||||
- Append the origin protocol codec.
|
||||
- Pad the result to the maximum application message length of $3968$ bytes
|
||||
using a deterministic padding scheme. This value is derived from the fixed
|
||||
payload size in [Section 8.3.2](#832-payload-size) ($3984$ bytes) minus the
|
||||
security parameter $κ = 16$ bytes defined in
|
||||
[Section 8.2](#82-cryptographic-primitives). The chosen scheme MUST yield a
|
||||
fixed-size padded output and MUST be consistent across all mix nodes to
|
||||
ensure correct interpretation during unpadding. For example, schemes that
|
||||
explicitly encode the padding length and prepend zero-valued padding bytes
|
||||
MAY be used.
|
||||
- Let the resulting message be $m$.

2. **Select A Mix Path**

- First obtain an unbiased random sample of live, routable mix nodes using
some discovery mechanism. The choice of discovery mechanism is
deployment-specific as defined in [Section 6.1](#61-discovery). The
discovery mechanism MUST be unbiased and provide, at a minimum, the
multiaddress and X25519 public key of each mix node.
- From this sample, choose a random mix path of length $L \geq 3$. As defined
in [Section 2](#2-terminology), a mix path is a non-repeating sequence of
mix nodes (see the sketch after this step).
- For each hop $i \in \{0 \ldots L-1\}$:
  - Retrieve the multiaddress and corresponding X25519 public key $y_i$ of
  the $i$-th mix node.
  - Encode the multiaddress in $(tκ - 2)$ bytes as defined in
  [Section 8.4](#84-address-and-delay-encoding). Let the resulting encoded
  multiaddress be $\mathrm{addr}_i$.
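
A minimal sketch of this selection step, assuming a hypothetical `MixNode`
record carrying the two fields discovery is required to provide; sampling
without replacement yields the non-repeating sequence the spec requires:

```python
import random
from dataclasses import dataclass

rng = random.SystemRandom()  # OS-backed randomness as a CSPRNG stand-in

@dataclass
class MixNode:
    multiaddr: str        # hypothetical discovery record fields
    x25519_pubkey: bytes

def select_mix_path(candidates: list[MixNode], L: int = 3) -> list[MixNode]:
    if L < 3:
        raise ValueError("the mix path MUST consist of at least three hops")
    return rng.sample(candidates, L)  # without replacement: non-repeating
```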

3. **Wrap Plaintext Payload In Sphinx Packet**

a. **Compute Ephemeral Secrets**

- Choose a random private exponent $x \in_R \mathbb{Z}_q^*$.
- Initialize:

$`
\begin{aligned}
α_0 &= g^x \\
s_0 &= y_0^x \\
b_0 &= H(α_0\ |\ s_0)
\end{aligned}
`$

- For each hop $i$ (from $1$ to $L-1$), compute:

$`
\begin{aligned}
α_i &= α_{i-1}^{b_{i-1}} \\
s_i &= y_{i}^{x\prod_{j=0}^{i-1} b_{j}} \\
b_i &= H(α_i\ |\ s_i)
\end{aligned}
`$

Note that the length of $α_i$ is $32$ bytes as defined in
[Section 8.3](#83-packet-component-sizes).
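
The recurrence can be exercised with a toy sketch: a multiplicative group of
integers modulo a prime stands in for the protocol's X25519 group, and the
hash-to-exponent mapping is illustrative only.

```python
import hashlib

p = 2**127 - 1  # toy modulus; NOT a secure group, illustration only
g = 3           # toy generator
q = p - 1       # toy stand-in for the group order

def H(alpha: int, s: int) -> int:
    digest = hashlib.sha256(alpha.to_bytes(16, "big") + s.to_bytes(16, "big")).digest()
    return int.from_bytes(digest, "big") % q

def hop_secrets(x: int, y: list[int]):
    """Given private exponent x and per-hop public keys y_i, return
    (alpha_i, s_i, b_i) for every hop, following the recurrence above."""
    alphas, shared, blinds = [], [], []
    exponent = x                         # accumulates x * b_0 * ... * b_{i-1}
    for y_i in y:
        alpha = pow(g, exponent, p)      # alpha_i = alpha_{i-1}^{b_{i-1}} = g^exponent
        s = pow(y_i, exponent, p)        # s_i = y_i^exponent
        b = H(alpha, s)
        alphas.append(alpha); shared.append(s); blinds.append(b)
        exponent = (exponent * b) % q    # fold in the next blinding factor
    return alphas, shared, blinds
```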

b. **Compute Per-Hop Filler Strings**

Filler strings are encrypted strings that are appended to the header during
encryption. They ensure that the header length remains constant across hops,
regardless of the position of a node in the mix path.

To compute the sequence of filler strings, perform the following steps:

- Initialize $Φ_0 = \epsilon$ (empty string).
- For each $i$ (from $1$ to $L-1$):

  - Derive per-hop AES key and IV:

$`
\begin{array}{l}
Φ_{\mathrm{aes\_key}_{i-1}} =
\mathrm{KDF}(\text{"aes\_key"} \mid s_{i-1})\\
Φ_{\mathrm{iv}_{i-1}} =
\mathrm{KDF}(\text{"iv"} \mid s_{i-1})
\end{array}
`$

  - Compute the filler string $Φ_i$ using $\text{AES-CTR}^\prime_i$,
  which is AES-CTR encryption with the keystream starting from
  index $((t+1)(r-i)+t+2)\kappa$:

$`
\begin{array}{l}
Φ_i = \mathrm{AES\text{-}CTR}'_i\bigl(Φ_{\mathrm{aes\_key}_{i-1}},
Φ_{\mathrm{iv}_{i-1}}, Φ_{i-1} \mid 0_{(t+1)\kappa} \bigr),
\text{where $0_x$ denotes the string of $0$ bits of length $x$.}
\end{array}
`$

Note that the length of $Φ_i$ is $(t+1)i\kappa$.
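
A sketch of this step, assuming the `cryptography` package and a stand-in
`kdf`; `aes_ctr_keystream` realises $\mathrm{AES\text{-}CTR}'$ by discarding
the first `offset` keystream bytes. The sizes `KAPPA`, `T` and `R` are
illustrative placeholders for $κ$, $t$ and $r$.

```python
import hashlib
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

KAPPA, T, R = 16, 3, 5  # illustrative values for kappa, t and r

def kdf(label: bytes, s: bytes) -> bytes:
    # stand-in for the KDF of Section 8.2
    return hashlib.sha256(label + s).digest()[:16]

def aes_ctr_keystream(key: bytes, iv: bytes, offset: int, length: int) -> bytes:
    # AES-CTR', i.e. the keystream starting at byte index `offset`
    enc = Cipher(algorithms.AES(key), modes.CTR(iv)).encryptor()
    return enc.update(b"\x00" * (offset + length))[offset:]

def fillers(shared: list[bytes], L: int) -> bytes:
    phi = b""                                     # phi_0 = empty string
    for i in range(1, L):
        key = kdf(b"aes_key", shared[i - 1])
        iv = kdf(b"iv", shared[i - 1])
        offset = ((T + 1) * (R - i) + T + 2) * KAPPA
        plain = phi + b"\x00" * ((T + 1) * KAPPA)
        ks = aes_ctr_keystream(key, iv, offset, len(plain))
        phi = bytes(a ^ b for a, b in zip(plain, ks))
    return phi                                    # length (T+1)*(L-1)*KAPPA
```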

c. **Construct Routing Header**

The routing header as defined in
[Section 8.1](#81-packet-structure-overview) is the encrypted structure
that carries the forwarding instructions for each hop. It ensures that a
mix node can learn only its immediate next hop and forwarding delay without
inferring the full path.

Filler strings computed in the previous step are appended during encryption
to ensure that the header length remains constant across hops. This prevents
a node from distinguishing its position in the path based on header size.

To construct the routing header, perform the following steps for each hop
$i = L-1$ down to $0$, recursively:

- Derive per-hop AES key, MAC key, and IV:

$`
\begin{array}{l}
β_{\mathrm{aes\_key}_i} =
\mathrm{KDF}(\text{"aes\_key"} \mid s_i)\\
\mathrm{mac\_key}_i =
\mathrm{KDF}(\text{"mac\_key"} \mid s_{i})\\
β_{\mathrm{iv}_i} =
\mathrm{KDF}(\text{"iv"} \mid s_i)
\end{array}
`$

- Set the per-hop two-byte encoded delay $\mathrm{delay}_i$ as defined in
[Section 8.4](#84-address-and-delay-encoding):
  - If this is the final hop (_i.e.,_ $i = L - 1$), encode two bytes of zero padding.
  - For all other hops $i < L - 1$, sample a forwarding delay
  using the delay strategy configured by the application and encode it in two bytes.
  The delay strategy is pluggable as defined in [Section 6.2](#62-delay-strategy).

- Using the derived keys and encoded forwarding delay, compute the nested
encrypted routing information $β_i$:

  - If $i = L-1$ (_i.e.,_ exit node):

$`
\begin{array}{l}
β_i = \mathrm{AES\text{-}CTR}\bigl(β_{\mathrm{aes\_key}_i},
β_{\mathrm{iv}_i}, Δ \mid \mathrm{delay}_i \mid 0_{((t+1)(r-L)+2)\kappa}
\bigr) \bigm| Φ_{L-1}
\end{array}
`$

  - Otherwise (_i.e.,_ intermediary node):

$`
\begin{array}{l}
β_i = \mathrm{AES\text{-}CTR}\bigl(β_{\mathrm{aes\_key}_i},
β_{\mathrm{iv}_i}, \mathrm{addr}_{i+1} \mid \mathrm{delay}_i
\mid γ_{i+1} \mid β_{i+1 \, [0 \ldots (r(t+1) - t)\kappa - 1]} \bigr)
\end{array}
`$

Note that the length of $\beta_i$ is $(r(t+1)+1)\kappa$ for $0 \leq i \leq L-1$,
as defined in [Section 8.3](#83-packet-component-sizes).

- Compute the message authentication code $γ_i$:

$`
\begin{array}{l}
γ_i = \mathrm{HMAC\text{-}SHA\text{-}256}\bigl(\mathrm{mac\_key}_i,
β_i \bigr)
\end{array}
`$

Note that the length of $\gamma_i$ is $\kappa$ as defined in
[Section 8.3](#83-packet-component-sizes).
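
One layer of this recursion might look as follows (same stand-in `kdf` as in
the previous sketch); the caller assembles `routing_info` per the formulas
above and appends $Φ_{L-1}$ at the exit hop. Truncating the HMAC output to
$κ$ bytes is an assumption based on the $γ$ size given in
[Section 8.3](#83-packet-component-sizes).

```python
import hashlib
import hmac
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

KAPPA = 16

def kdf(label: bytes, s: bytes, n: int = 16) -> bytes:
    # stand-in for the KDF of Section 8.2
    return hashlib.sha256(label + s).digest()[:n]

def wrap_header_layer(s_i: bytes, routing_info: bytes) -> tuple[bytes, bytes]:
    """Encrypt one layer of routing information and MAC it:
    returns (beta_i, gamma_i)."""
    aes_key = kdf(b"aes_key", s_i)
    iv = kdf(b"iv", s_i)
    mac_key = kdf(b"mac_key", s_i, 32)
    enc = Cipher(algorithms.AES(aes_key), modes.CTR(iv)).encryptor()
    beta = enc.update(routing_info)
    gamma = hmac.new(mac_key, beta, hashlib.sha256).digest()[:KAPPA]
    return beta, gamma
```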

d. **Encrypt Payload**

The encrypted payload $δ$ contains the message $m$ defined in Step 1,
prepended with a $κ$-byte string of zeros. It is encrypted in layers such that
each hop in the mix path removes exactly one layer using the per-hop session
key. This ensures that only the final hop (_i.e.,_ the exit node) can fully
recover $m$, validate its integrity, and forward it to the destination.

To compute the encrypted payload, perform the following steps for each hop
$i = L-1$ down to $0$, recursively:

- Derive per-hop AES key and IV:

$`
\begin{array}{l}
δ_{\mathrm{aes\_key}_i} =
\mathrm{KDF}(\text{"δ\_aes\_key"} \mid s_i)\\
δ_{\mathrm{iv}_i} =
\mathrm{KDF}(\text{"δ\_iv"} \mid s_i)
\end{array}
`$

- Using the derived keys, compute the encrypted payload $δ_i$:

  - If $i = L-1$ (_i.e.,_ exit node):

$`
\begin{array}{l}
δ_i = \mathrm{AES\text{-}CTR}\bigl(δ_{\mathrm{aes\_key}_i},
δ_{\mathrm{iv}_i}, 0_{\kappa} \mid m
\bigr)
\end{array}
`$

  - Otherwise (_i.e.,_ intermediary node):

$`
\begin{array}{l}
δ_i = \mathrm{AES\text{-}CTR}\bigl(δ_{\mathrm{aes\_key}_i},
δ_{\mathrm{iv}_i}, δ_{i+1} \bigr)
\end{array}
`$
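
A sketch of the layered payload encryption, again assuming the
`cryptography` package and the stand-in `kdf` from the previous sketches: the
exit layer wraps $0_κ \mid m$ and every earlier hop adds one AES-CTR layer.

```python
import hashlib
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

KAPPA = 16

def kdf(label: bytes, s: bytes) -> bytes:
    # stand-in for the KDF of Section 8.2
    return hashlib.sha256(label + s).digest()[:16]

def encrypt_payload(m: bytes, shared: list[bytes]) -> bytes:
    """Compute delta_0 given the padded message m and the per-hop
    session keys s_0 ... s_{L-1} (innermost layer first)."""
    delta = b"\x00" * KAPPA + m                  # exit node recovers 0_kappa | m
    for i in range(len(shared) - 1, -1, -1):     # i = L-1 down to 0
        key = kdf("δ_aes_key".encode(), shared[i])
        iv = kdf("δ_iv".encode(), shared[i])
        enc = Cipher(algorithms.AES(key), modes.CTR(iv)).encryptor()
        delta = enc.update(delta)                # add one layer per hop
    return delta
```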

205 vac/raw/sds.md

@@ -58,6 +58,9 @@ other participants using the corresponding message ID.
* **Participant ID:**
Each participant has a globally unique, immutable ID
visible to other participants in the communication.
* **Sender ID:**
The **Participant ID** of the original sender of a message,
often coupled with a **Message ID**.

## Wire protocol

@@ -75,6 +78,8 @@ syntax = "proto3";
message HistoryEntry {
string message_id = 1; // Unique identifier of the SDS message, as defined in `Message`
optional bytes retrieval_hint = 2; // Optional information to help remote parties retrieve this SDS message; for example, a Waku deterministic message hash or routing payload hash

optional string sender_id = 3; // Participant ID of the original message sender. Only populated if using the optional SDS Repair extension
}

message Message {
@@ -84,6 +89,9 @@ message Message {
optional uint64 lamport_timestamp = 10; // Logical timestamp for causal ordering in channel
repeated HistoryEntry causal_history = 11; // List of preceding message IDs that this message causally depends on. Generally 2 or 3 message IDs are included.
optional bytes bloom_filter = 12; // Bloom filter representing received message IDs in channel

repeated HistoryEntry repair_request = 13; // Capped list of history entries missing from sender's causal history. Only populated if using the optional SDS Repair extension.

optional bytes content = 20; // Actual content of the message
}
```

@@ -102,9 +110,11 @@ These fields MAY be left unset in the case of [ephemeral messages](#ephemeral-me
The message `content` MAY be left empty for [periodic sync messages](#periodic-sync-message),
otherwise it MUST contain the application-level content.

> **_Note:_** Close readers may notice that,
outside of filtering messages originating from the sender itself,
the `sender_id` field is not used for much.
Its importance is expected to increase once a p2p retrieval mechanism is added to SDS,
as is planned for the protocol.

### Participant state

@@ -289,6 +299,197 @@ Upon reception,
ephemeral messages SHOULD be delivered immediately without buffering for causal dependencies
or including in the local log.

### SDS Repair (SDS-R)

SDS Repair (SDS-R) is an optional extension module for SDS,
allowing participants in a communication to collectively repair gaps in causal history (missing messages),
ideally within a limited time window.
Since SDS-R acts as coordinated rebroadcasting of missing messages,
which involves all participants of the communication,
it is best suited to the limited use case of repairing relatively recent missed dependencies.
It is not meant to replace mechanisms for long-term consistency,
such as peer-to-peer syncing or the use of a high-availability centralised cache (Store node).

#### SDS-R message fields

SDS-R adds the following fields to SDS messages:

* `sender_id` in `HistoryEntry`:
the original message sender's participant ID.
This is used to determine the group of participants who will respond to a repair request.
* `repair_request` in `Message`:
a capped list of history entries missing for the message sender
and for which it's requesting a repair.

#### SDS-R participant state

SDS-R adds the following to each participant state:

* Outgoing **repair request buffer**:
a list of locally missing `HistoryEntry`s,
each mapped to a future request timestamp, `T_req`,
after which this participant will request a repair if the missing dependency has not been repaired by then.
`T_req` is computed as a pseudorandom backoff from the timestamp when the dependency was detected missing.
[Determining `T_req`](#determine-t_req) is described below.
We RECOMMEND that the outgoing repair request buffer be kept in ascending order of `T_req`.

* Incoming **repair request buffer**:
a list of locally available `HistoryEntry`s
that were requested for repair by a remote participant
AND for which this participant might be an eligible responder,
each mapped to a future response timestamp, `T_resp`,
after which this participant will rebroadcast the corresponding requested `Message` if no other participant has rebroadcast it by then.
`T_resp` is computed as a pseudorandom backoff from the timestamp when the repair was first requested.
[Determining `T_resp`](#determine-t_resp) is described below.
We describe below how a participant can [determine if they're an eligible responder](#determine-response-group) for a specific repair request.

* Augmented local history log:
for each message ID kept in the local log for which the participant could be a repair responder,
the full SDS `Message` must be cached rather than just the message ID,
in case this participant is called upon to rebroadcast the message.
We describe below how a participant can [determine if they're an eligible responder](#determine-response-group) for a specific message.

**_Note:_** The required state can likely be reduced significantly in future
by requiring that a responding participant _reconstruct_ the original `Message` when rebroadcasting,
rather than the simpler, but heavier, approach of caching the entire received `Message` content in local history.

#### SDS-R global state

For a specific channel (that is, within a specific SDS-controlled communication),
the following SDS-R configuration state SHOULD be common to all participants in the conversation:

* `T_min`: the _minimum_ time period to wait before a missing causal entry can be repaired.
We RECOMMEND a value of at least 30 seconds.
* `T_max`: the _maximum_ time period over which missing causal entries can be repaired.
We RECOMMEND a value of between 120 and 600 seconds.

Furthermore, to avoid a broadcast storm with multiple participants responding to a repair request,
participants in a single channel MAY be divided into discrete response groups.
Participants will only respond to a repair request if they are in the response group for that request.
The global `num_response_groups` variable configures the number of response groups for this communication.
Its use is described below.
A reasonable default value for `num_response_groups` is one response group for every `128` participants.
In other words, if the (roughly) expected number of participants is expressed as `num_participants`, then
`num_response_groups = num_participants div 128 + 1`.
This means that if there are fewer than 128 participants in a communication,
they will all belong to the same response group.

We RECOMMEND that the global state variables `T_min`, `T_max` and `num_response_groups`
be set _statically_ for a specific SDS-R application,
based on the expected number of group participants and volume of traffic.

**_Note:_** Future versions of this protocol will recommend dynamic global SDS-R variables,
based on the current number of participants.

#### SDS-R send message

SDS-R adds the following steps when sending a message:

Before broadcasting a message,

* the participant SHOULD populate the `repair_request` field in the message
with _eligible_ entries from the outgoing repair request buffer.
An entry is eligible to be included in a `repair_request`
if its corresponding request timestamp, `T_req`, has expired (in other words,
`T_req <= current_time`).
The maximum number of repair request entries to include is up to the application.
We RECOMMEND that this quota be filled by the eligible entries from the outgoing repair request buffer with the lowest `T_req`.
We RECOMMEND a maximum of 3 entries.
If there are no eligible entries in the buffer,
this optional field MUST be left unset.
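
A sketch of this selection, modelling the outgoing buffer as a map from
message ID to `T_req` (a real implementation would track full `HistoryEntry`
values):

```python
import time

MAX_REPAIR_ENTRIES = 3  # RECOMMENDED cap

def eligible_repair_requests(outgoing_buffer: dict[str, float]) -> list[str]:
    """Return up to MAX_REPAIR_ENTRIES expired entries, lowest T_req first."""
    now = time.time()
    expired = [(t_req, mid) for mid, t_req in outgoing_buffer.items() if t_req <= now]
    return [mid for _, mid in sorted(expired)[:MAX_REPAIR_ENTRIES]]
```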

#### SDS-R receive message

On receiving a message,

* the participant MUST remove entries matching the received message ID from its _outgoing_ repair request buffer.
This ensures that the participant does not request repairs for dependencies that have now been met.
* the participant MUST remove entries matching the received message ID from its _incoming_ repair request buffer.
This ensures that the participant does not respond to repair requests that another participant has already responded to.
* the participant SHOULD add any unmet causal dependencies to its outgoing repair request buffer against a unique `T_req` timestamp for each entry.
It MUST compute the `T_req` for each such `HistoryEntry` according to the steps outlined in [_Determine T_req_](#determine-t_req).
* for each item in the `repair_request` field:
  * the participant MUST remove entries matching the repair message ID from its own outgoing repair request buffer.
  This limits the number of participants that will request a common missing dependency.
  * if the participant has the requested `Message` in its local history _and_ is an eligible responder for the repair request,
  it SHOULD add the request to its incoming repair request buffer against a unique `T_resp` timestamp for that entry.
  It MUST compute the `T_resp` for each such repair request according to the steps outlined in [_Determine T_resp_](#determine-t_resp).
  It MUST determine if it's an eligible responder for a repair request according to the steps outlined in [_Determine response group_](#determine-response-group).

#### Determine T_req

A participant determines the repair request timestamp, `T_req`,
for a missing `HistoryEntry` as follows:

```text
T_req = current_time + hash(participant_id, message_id) % (T_max - T_min) + T_min
```

where `current_time` is the current timestamp,
`participant_id` is the participant's _own_ participant ID
(not the `sender_id` in the missing `HistoryEntry`),
`message_id` is the missing `HistoryEntry`'s message ID,
and `T_min` and `T_max` are as set out in [SDS-R global state](#sds-r-global-state).

This allows `T_req` to be pseudorandomly and linearly distributed as a backoff of between `T_min` and `T_max` from the current time.

> **_Note:_** placing `T_req` values on an exponential backoff curve will likely be more appropriate and is left for a future improvement.
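
A sketch of the computation, with SHA-256 over the concatenated IDs standing
in for `hash`, which this specification does not pin down:

```python
import hashlib
import time

def _h(*parts: str) -> int:
    # stand-in hash; SDS does not mandate a specific function
    return int.from_bytes(hashlib.sha256("".join(parts).encode()).digest(), "big")

def t_req(participant_id: str, message_id: str,
          t_min: int = 30, t_max: int = 300) -> float:
    backoff = _h(participant_id, message_id) % (t_max - t_min) + t_min
    return time.time() + backoff
```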

#### Determine T_resp

A participant determines the repair response timestamp, `T_resp`,
for a `HistoryEntry` that it could repair as follows:

```text
distance = hash(participant_id) XOR hash(sender_id)
T_resp = current_time + distance*hash(message_id) % T_max
```

where `current_time` is the current timestamp,
`participant_id` is the participant's _own_ (local) participant ID,
`sender_id` is the requested `HistoryEntry`'s sender ID,
`message_id` is the requested `HistoryEntry`'s message ID,
and `T_max` is as set out in [SDS-R global state](#sds-r-global-state).

We first calculate the logical `distance` between the local `participant_id` and
the original `sender_id`.
If this participant is the original sender, the `distance` will be `0`.
It should then be clear that the original sender will have a response backoff time of `0`,
making it the most likely responder.
The `T_resp` values for other eligible participants will be pseudorandomly and
linearly distributed as a backoff of up to `T_max` from the current time.

> **_Note:_** placing `T_resp` values on an exponential backoff curve will likely be more appropriate and
is left for a future improvement.
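
A sketch with the same stand-in hash as above; note that the backoff is `0`
when the local participant is the original sender:

```python
import hashlib
import time

def _h(s: str) -> int:
    # stand-in hash; SDS does not mandate a specific function
    return int.from_bytes(hashlib.sha256(s.encode()).digest(), "big")

def t_resp(participant_id: str, sender_id: str, message_id: str,
           t_max: int = 300) -> float:
    distance = _h(participant_id) ^ _h(sender_id)  # XOR distance, 0 for the sender
    backoff = (distance * _h(message_id)) % t_max
    return time.time() + backoff
```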

#### Determine response group

Given a message with `sender_id` and `message_id`,
a participant with `participant_id` is in the response group for that message if

```text
hash(participant_id, message_id) % num_response_groups == hash(sender_id, message_id) % num_response_groups
```

where `num_response_groups` is as set out in [SDS-R global state](#sds-r-global-state).
This ensures that a participant will always be in the response group for its own published messages.
It also allows participants to determine immediately, on first reception of a message or
a history entry, if they are in the associated response group.
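
A sketch of the membership check, with the same stand-in hash as in the
previous sketches:

```python
import hashlib

def _h(*parts: str) -> int:
    # stand-in hash; SDS does not mandate a specific function
    return int.from_bytes(hashlib.sha256("".join(parts).encode()).digest(), "big")

def in_response_group(participant_id: str, sender_id: str,
                      message_id: str, num_response_groups: int) -> bool:
    return (_h(participant_id, message_id) % num_response_groups ==
            _h(sender_id, message_id) % num_response_groups)
```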

#### SDS-R incoming repair request buffer sweep

An SDS-R participant MUST periodically check whether any requests in the **incoming** repair request buffer are due for a response.
For each item in the buffer,
the participant SHOULD broadcast the corresponding `Message` from local history
if its corresponding response timestamp, `T_resp`, has expired
(in other words, `T_resp <= current_time`).

#### SDS-R periodic sync message

If the participant is due to send a periodic sync message,
it SHOULD send the message according to [SDS-R send message](#sds-r-send-message)
if there are any eligible items in the outgoing repair request buffer,
regardless of whether other participants have also recently broadcast a periodic sync message.

## Copyright

Copyright and related rights waived via [CC0](https://creativecommons.org/publicdomain/zero/1.0/).

@@ -2,7 +2,7 @@
slug: 21
title: 21/WAKU2-FAULT-TOLERANT-STORE
name: Waku v2 Fault-Tolerant Store
-status: draft
+status: deleted
editor: Sanaz Taheri <sanaz@status.im>
contributors:
---

189 waku/standards/core/31/enr.md Normal file
@@ -0,0 +1,189 @@
---
slug: 31
title: 31/WAKU2-ENR
name: Waku v2 usage of ENR
status: draft
tags: [waku/core-protocol]
editor: Franck Royer <franck@status.im>
contributors:
---

## Abstract

This specification describes the usage of the ENR (Ethereum Node Records)
format for [10/WAKU2](../10/waku2.md) purposes.
The ENR format is defined in [EIP-778](https://eips.ethereum.org/EIPS/eip-778) [[4]](#references).

This specification is an extension of EIP-778;
ENRs used in Waku MUST adhere to both EIP-778 and 31/WAKU2-ENR.

## Motivation

EIP-1459 with the usage of ENR has been implemented [[1]](#references) [[2]](#references) as a discovery protocol for Waku.

EIP-778 specifies a number of pre-defined keys.
However, the usage of these keys alone does not allow for certain transport capabilities to be encoded,
such as Websocket.
Currently, Waku nodes running in a browser only support the websocket transport protocol.
Hence, new ENR keys need to be defined to allow for the encoding of transport protocols other than raw TCP.

### Usage of Multiaddr Format Rationale

One solution would be to define new keys such as `ws` to encode the websocket port of a node.
However, we expect new transport protocols to be added over time, such as QUIC.
Hence, this would only provide a short-term solution until another specification would need to be added.

Moreover, secure websocket involves SSL certificates.
SSL certificates are only valid for a given domain and IP,
so an ENR containing the following information:

- secure websocket port
- ipv4 fqdn
- ipv4 address
- ipv6 address

would carry some ambiguity: is the certificate securing the websocket port valid for the ipv4 fqdn?
The ipv4 address?
The ipv6 address?

The [10/WAKU2](../10/waku2.md) protocol family is built on the [libp2p](https://github.com/libp2p/specs) protocol stack.
Hence, it uses [multiaddr](https://github.com/multiformats/multiaddr) to format network addresses.

Directly storing one or several multiaddresses in the ENR would fix the issues listed above:

- multiaddr is self-describing and supports addresses for any network protocol:
no new specification would be needed to support encoding other transport protocols in an ENR.
- multiaddr contains both the host and port information,
allowing the ambiguity previously described to be resolved.

## Wire Format

### `multiaddrs` ENR key

We define a `multiaddrs` key.

- The value MUST be a list of binary encoded multiaddrs, each prefixed by its size.
- The size of each multiaddr MUST be encoded as a big-endian unsigned 16-bit integer (2 bytes).
- The `secp256k1` value MUST be present on the record;
`secp256k1` is defined in [EIP-778](https://eips.ethereum.org/EIPS/eip-778) and
contains the compressed secp256k1 public key.
- The node's peer id SHOULD be deduced from the `secp256k1` value.
- The multiaddresses SHOULD NOT contain a peer id, except for circuit relay addresses.
- For raw TCP & UDP connection details,
[EIP-778](https://eips.ethereum.org/EIPS/eip-778) pre-defined keys SHOULD be used;
the keys `tcp`, `udp`, `ip` (and `tcp6`, `udp6`, `ip6` for IPv6)
are enough to convey all necessary information.
- To save space, the `multiaddrs` key SHOULD only be used for connection details that cannot be represented using the [EIP-778](https://eips.ethereum.org/EIPS/eip-778) pre-defined keys.
- The 300-byte size limit defined by [EIP-778](https://eips.ethereum.org/EIPS/eip-778) still applies;
in practice, it is possible to encode about 3 multiaddresses in an ENR; more or
fewer could be encoded depending on the size of each multiaddress.
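
A sketch of the encoding and decoding, operating on multiaddrs that are
already in their binary representation (producing that representation is the
job of a multiaddr library):

```python
import struct

def encode_multiaddrs(addrs: list[bytes]) -> bytes:
    # each binary multiaddr is prefixed by its size as a big-endian uint16
    out = b""
    for addr in addrs:
        out += struct.pack(">H", len(addr)) + addr
    return out

def decode_multiaddrs(value: bytes) -> list[bytes]:
    addrs, i = [], 0
    while i < len(value):
        (size,) = struct.unpack_from(">H", value, i)
        addrs.append(value[i + 2 : i + 2 + size])
        i += 2 + size
    return addrs
```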

### Usage

#### Many connection types

Alice is a Waku node operator who runs a node that supports inbound connections for the following protocols:

- TCP 10101 on `1.2.3.4`
- UDP 20202 on `1.2.3.4`
- TCP 30303 on `1234:5600:101:1::142`
- UDP 40404 on `1234:5600:101:1::142`
- Secure Websocket on `wss://example.com:443/`
- QUIC on `quic://quic.example.com:443/`
- A circuit relay address `/ip4/1.2.3.4/tcp/55555/p2p/QmRelay/p2p-circuit/p2p/QmAlice`

Alice SHOULD structure the ENR for her node as follows:

| key | value |
| --- | --- |
| `tcp` | `10101` |
| `udp` | `20202` |
| `tcp6` | `30303` |
| `udp6` | `40404` |
| `ip` | `1.2.3.4` |
| `ip6` | `1234:5600:101:1::142` |
| `secp256k1` | Alice's compressed secp256k1 public key, 33 bytes |
| `multiaddrs` | `len1 \| /dns4/example.com/tcp/443/wss \| len2 \| /dns4/quic.example.com/tcp/443/quic \| len3 \| /ip4/1.2.3.4/tcp/55555/p2p/QmRelay` |

Where, in `multiaddrs`:

- `|` is the concatenation operator,
- `len1` is the length of the `/dns4/example.com/tcp/443/wss` byte representation,
- `len2` is the length of the `/dns4/quic.example.com/tcp/443/quic` byte representation,
- `len3` is the length of the `/ip4/1.2.3.4/tcp/55555/p2p/QmRelay` byte representation.

Notice that the `/p2p-circuit` component is not stored, but,
since circuit relay addresses are the only ones containing a `p2p` component,
it is safe to assume that any address containing this component is a circuit relay address.
Decoding this type of multiaddress would require appending the `/p2p-circuit` component.

#### Raw TCP only

Bob is a node operator who runs a node that supports inbound connections for the following protocols:

- TCP 10101 on `1.2.3.4`

Bob SHOULD structure the ENR for his node as follows:

| key | value |
| --- | --- |
| `tcp` | `10101` |
| `ip` | `1.2.3.4` |
| `secp256k1` | Bob's compressed secp256k1 public key, 33 bytes |

As Bob's node's connection details can be represented with EIP-778's pre-defined keys only,
there is no need to use the `multiaddrs` key.

### Limitations

Currently, the only supported key type is `secp256k1`.

Support for other elliptic curve cryptography, such as `ed25519`, MAY be added in the future.

### `waku2` ENR key

We define a `waku2` field key:

- The value MUST be an 8-bit flag field,
where bits set to `1` indicate `true` and
bits set to `0` indicate `false` for the relevant flags.
- The flag values already defined are set out below,
with `bit 7` the most significant bit and `bit 0` the least significant bit.

| bit 7 | bit 6 | bit 5 | bit 4 | bit 3 | bit 2 | bit 1 | bit 0 |
| --- | --- | --- | --- | --- | --- | --- | --- |
| `undef` | `undef` | `undef` | `sync` | `lightpush` | `filter` | `store` | `relay` |

- In the scheme above, the flags `sync`, `lightpush`, `filter`, `store` and
`relay` correlate with support for the protocols of the same name.
If a protocol is not supported, the corresponding field MUST be set to `false`.
Indicating positive support for any specific protocol is OPTIONAL,
though it MAY be required by the relevant application or discovery process.
- Flags marked as `undef` are not yet defined.
These SHOULD be set to `false` by default.
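
A sketch of encoding and decoding the flag field, using the bit positions
from the table above:

```python
# bit positions from the table (bit 0 is the least significant bit)
FLAGS = {"relay": 0, "store": 1, "filter": 2, "lightpush": 3, "sync": 4}

def encode_waku2(supported: set[str]) -> int:
    field = 0
    for name in supported:
        field |= 1 << FLAGS[name]   # set the protocol's bit to true
    return field                    # fits in the 8-bit flag field

def decode_waku2(field: int) -> set[str]:
    return {name for name, bit in FLAGS.items() if field & (1 << bit)}

assert decode_waku2(encode_waku2({"relay", "store"})) == {"relay", "store"}
```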

### Key Usage

- A Waku node MAY choose to populate the `waku2` field for enhanced discovery capabilities,
such as indicating supported protocols.
Such a node MAY indicate support for any specific protocol by setting the corresponding flag to `true`.
- Waku nodes that want to participate in [Node Discovery Protocol v5](https://github.com/vacp2p/rfc-index/blob/main/waku/standards/core/33/discv5.md) [[5]](#references), however,
MUST implement the `waku2` key with at least one flag set to `true`.
- Waku nodes that discover other participants using Discovery v5
MUST filter out participant records that do not implement this field or
do not have at least one flag set to `true`.
- In addition, such nodes MAY choose to filter participants on specific flags
(such as supported protocols),
or further interpret the `waku2` field as required by the application.

## Copyright

Copyright and related rights waived via [CC0](https://creativecommons.org/publicdomain/zero/1.0/).

## References

- [1](../10/waku2.md)
- [2](https://github.com/status-im/nim-waku/pull/690)
- [3](https://github.com/vacp2p/rfc/issues/462#issuecomment-943869940)
- [4](https://eips.ethereum.org/EIPS/eip-778)
- [5](https://github.com/ethereum/devp2p/blob/master/discv5/discv5.md)