Compare commits

...

22 Commits

Author SHA1 Message Date
Cofson
85b3200ed9 Update purchase spec metadata: rename to PURCHASING and add contributors 2025-10-25 11:35:08 +02:00
Cofson
f32711607d 2nd refactor: Separate protocol requirements from implementation details 2025-10-13 21:55:12 +02:00
Cofson
1f73c9190d Refactor: Separate protocol requirements from implementation details 2025-10-13 21:50:14 +02:00
Cofson
652adbc85f Improved Abstract section 2025-10-10 09:09:43 +02:00
Cofson
eec236a39f Added Filip in contributors 2025-10-08 19:54:52 +02:00
Cofson
e49e587dd5 Created codex/raw/codex-purchase.md file 2025-10-08 19:10:03 +02:00
AkshayaMani
36be428cdd RFC Refactor: Sphinx Packet Format (#173) 2025-10-05 20:21:16 -04:00
Hanno Cornelius
6672c5bedf docs: update lamport timestamps to uint64, pegged to current time (#196)
Lamport timestamps should remain close to current time (in milliseconds)
for new joiners to be able to have their messages ordered reasonably
close to other messages in the channel.

This means that:
- the `timestamp` field should be large enough to accommodate
millisecond resolution timestamps, i.e. `uint64` (see
https://github.com/vacp2p/rfc-index/pull/195 for reasoning)
- the lamport timestamp should be updated before sending _each_ message
to `max(timeNowInMs, current_lamport_timestamp + 1)`.

The current spec only indicated that Lamport timestamps should be
_initialised_ to current time, which means that the logical timestamp
would soon drift from current time.
2025-10-02 14:07:29 +01:00
0xc1c4da
422b7ec3d4 Add the Notion reference for Nomos specifications (#190)
Ideally these specifications were tracked as RFCs, but in the meantime
we should have a link to the Notion.

---------

Co-authored-by: Cofson <41572590+Cofson@users.noreply.github.com>
Co-authored-by: Jimmy Debe <91767824+jimstir@users.noreply.github.com>
Co-authored-by: Cofson <dimitrijevic.filip92@gmail.com>
2025-09-26 17:17:18 +02:00
Cofson
51ef4cd533 added nomos/raw/nomosda-network.md (#160)
added nomosda-network.md draft file to nomos/raw folder

---------

Co-authored-by: Hanno Cornelius <68783915+jm-clius@users.noreply.github.com>
Co-authored-by: fryorcraken <110212804+fryorcraken@users.noreply.github.com>
Co-authored-by: shash256 <111925100+shash256@users.noreply.github.com>
Co-authored-by: seugu <99656002+seugu@users.noreply.github.com>
Co-authored-by: Ekaterina Broslavskaya <seemenkina@gmail.com>
Co-authored-by: Jimmy Debe <91767824+jimstir@users.noreply.github.com>
Co-authored-by: kaiserd <1684595+kaiserd@users.noreply.github.com>
2025-09-25 20:15:42 +02:00
Cofson
53dfb97bc7 Created nomos/raw/sdp.md draft (#157)
Created sdp.md draft on nomos/raw folder
2025-09-25 20:13:36 +02:00
Cofson
39d6f07d4f added nomos/raw/nomosda-encoding.md draft (#156)
Created nomosda-encoding.md draft on nomos/raw/ folder

---------

Co-authored-by: AkshayaMani <AkshayaMani@users.noreply.github.com>
Co-authored-by: Jimmy Debe <91767824+jimstir@users.noreply.github.com>
2025-09-25 20:12:28 +02:00
Cofson
aa8a3b0c65 Created nomos/raw/p2p-network-bootstrapping.md draft (#175)
Created p2p-network-bootstrapping.md draft file in nomos/raw folder
2025-09-25 20:10:51 +02:00
Cofson
cfb3b78c71 Created nomos/raw/p2p-nat-solution.md draft (#174)
Created p2p-nat-solution.md draft file in nomos/raw folder
2025-09-25 20:08:57 +02:00
Cofson
34bbd7af90 Created nomos/raw/hardware-requirements.md file (#172)
Created hardware-requirements.md file in nomos/raw folder
2025-09-25 20:04:03 +02:00
Cofson
a3a5b91df3 Created nomos/raw/p2p-network.md file (#169)
Created p2p-network.md file in nomos/raw folder
2025-09-25 20:02:31 +02:00
fryorcraken
b1da70386e fix: use milliseconds for Lamport timestamp initialization (#179)
Changed Lamport timestamp initialization from nanoseconds to
milliseconds. The current time in nanoseconds exceeds JavaScript's
Number.MAX_SAFE_INTEGER, making nanosecond precision unsuitable for
JavaScript implementations. Milliseconds provide sufficient precision
while remaining well within safe integer bounds for decades to come.
2025-09-15 20:23:58 +10:00
seugu
f051117d37 VAC-RAW/Consensus-hashgraphlike RFC (#142)
This simple, scalable and decentralized consensus is for using in
decentralization MLS RFC.

todo:
- [x] solve lints
- [x] refine the RFC: adding liveness and time expiration section, also
default counting silent peers (peers that dont participate the voting)
- [x] add references
- [x] add license
- [x] first round reviewing 
- [x] second round reviewing
- [x] last round review

---------

Co-authored-by: Ekaterina Broslavskaya <seemenkina@gmail.com>
2025-09-15 10:06:24 +03:00
Jimmy Debe
3505da6bd6 sds lint fix (#177)
Fix lint issue in sds.md
2025-08-22 14:53:34 +02:00
seugu
3b968ccce3 VAC/RAW/ ETH-MLS-OFFCHAIN RFC (#166)
The first version of decentralized MLS (de-MLS) aka ETH-MLS-OFFCHAIN
RFC.

---------

Co-authored-by: Ekaterina Broslavskaya <seemenkina@gmail.com>
Co-authored-by: Jimmy Debe <91767824+jimstir@users.noreply.github.com>
Co-authored-by: kaiserd <1684595+kaiserd@users.noreply.github.com>
2025-08-21 13:33:59 +03:00
Hanno Cornelius
536d31b5b7 docs: re-add sender ID to messages (#170)
Re-added the concept of a participant ID and the corresponding
`sender_id` field in each SDS message.

This is useful to filter "self-triggered" messages as described in
https://github.com/waku-org/js-waku/pull/2528

However, this will also be a necessary part of the protocol once p2p
message exchange is added.

---------

Co-authored-by: fryorcraken <110212804+fryorcraken@users.noreply.github.com>
Co-authored-by: shash256 <111925100+shash256@users.noreply.github.com>
2025-08-19 16:38:42 +01:00
fryorcraken
4361e2958f Add implementation recommendation for metadata (#168)
Based on recent learnings.

---------

Co-authored-by: Hanno Cornelius <68783915+jm-clius@users.noreply.github.com>
2025-07-31 12:51:04 +10:00
14 changed files with 3369 additions and 44 deletions

415
codex/raw/codex-purchase.md Normal file
View File

@@ -0,0 +1,415 @@
---
slug: codex-purchase
title: CODEX-PURCHASING
name: Codex Purchasing Module
status: raw
category: Standards Track
tags: codex, storage, marketplace, state-machine
editor: Codex Team and Filip Dimitrijevic <filip@status.im>
contributors:
- Adam Uhlíř <adam.u@status.im>
- Arnaud DeVille <arnaud@status.im>
- Mark Spanbroek <mark@spanbroek.net>
- Eric Mastro <ericmastro@status.im>
- Filip Dimitrijevic <filip@status.im>
---
## Abstract
This specification defines the client role in the [Codex](https://github.com/codex-storage/nim-codex) marketplace protocol. A client is a Codex node that purchases storage by creating storage requests, submitting them to the marketplace smart contract, and managing the request lifecycle until completion or failure.
The specification outlines the protocol-level requirements for clients, including storage request creation, marketplace interactions, fund management, and request state monitoring. Implementation approaches, such as state machines and recovery mechanisms, are provided as suggestions.
## Background
In the Codex marketplace protocol, clients purchase storage by creating and submitting storage requests to the marketplace smart contract. Each storage request specifies the data to be stored (identified by a CID), storage parameters (duration, collateral requirements, proof parameters), and the number of storage slots needed.
The marketplace smart contract manages the storage request lifecycle, including request fulfillment, slot management, and fund distribution. Clients must interact with the marketplace to submit requests, monitor request state, and withdraw funds when requests complete or fail.
This specification defines what the protocol requires from clients. The marketplace smart contract governs these requirements—how individual Codex nodes implement them internally (state machines, recovery mechanisms, etc.) is an implementation detail covered in the Implementation Suggestions section.
## Client Protocol Requirements
This section defines the normative requirements for client implementations to participate in the Codex marketplace.
### Storage Request Creation
Clients MUST create storage requests with the following parameters:
- **client**: The client's address (identifying the requester)
- **ask**: Storage parameters including proof probability, pricing, collateral, slots, slot size, duration, and maximum slot loss
- **content**: Content identification using CID and merkle root
- **expiry**: Expiration timestamp for the request (uint64)
- **nonce**: Unique nonce value to ensure request uniqueness
### Marketplace Interactions
Clients MUST implement the following smart contract interactions:
- **requestStorage(request)**: Submit a new storage request to the marketplace along with required payment. Returns a unique `requestId` for tracking the request.
- **getRequest(requestId)**: Retrieve the full `StorageRequest` data from the marketplace. Used for recovery and state verification.
- **requestState(requestId)**: Query the current state of a storage request. Used for recovery and monitoring request progress.
- **withdrawFunds(requestId)**: Withdraw funds locked by the marketplace when a request completes, is cancelled, or fails. Clients MUST call this function to release locked funds.
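A non-normative sketch of this client-side interface in Python follows; the `MarketplaceClient` name, the snake_case method names, and the simplified types are illustrative only — the protocol defines the contract calls, not a language binding.
```python
from dataclasses import dataclass
from enum import Enum
from typing import Protocol


class RequestState(Enum):
    NEW = 0
    STARTED = 1
    CANCELLED = 2
    FINISHED = 3
    FAILED = 4


@dataclass
class StorageRequest:
    client: bytes   # requester address
    ask: dict       # pricing, collateral, slots, slot size, duration, max slot loss
    content: dict   # CID and merkle root
    expiry: int     # uint64 expiration timestamp
    nonce: bytes    # unique nonce


class MarketplaceClient(Protocol):
    def request_storage(self, request: StorageRequest, payment: int) -> bytes:
        """Submit a request with payment; returns the requestId."""
        ...

    def get_request(self, request_id: bytes) -> StorageRequest:
        """Retrieve the full StorageRequest (used for recovery)."""
        ...

    def request_state(self, request_id: bytes) -> RequestState:
        """Query the current state of the request."""
        ...

    def withdraw_funds(self, request_id: bytes) -> None:
        """Release funds locked by the marketplace."""
        ...
```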
### Event Subscriptions
Clients MUST subscribe to the following marketplace events:
- **RequestFulfilled(requestId)**: Emitted when a storage request has enough filled slots to start. Clients monitor this event to determine when their request becomes active.
- **RequestFailed(requestId)**: Emitted when a storage request fails due to proof failures or other reasons. Clients observe this event to detect failed requests and initiate fund withdrawal.
### Request Lifecycle Management
Clients MUST monitor storage requests through their lifecycle:
- After submitting a request with `requestStorage`, clients wait for the `RequestFulfilled` event or expiry timeout
- Once fulfilled, clients monitor for the request duration to complete successfully or for a `RequestFailed` event
- When a request completes (finished), is cancelled (expired before fulfillment), or fails, clients MUST call `withdrawFunds` to release locked funds
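A sketch of this lifecycle, reusing the illustrative `MarketplaceClient` interface above; `wait_for_fulfilled` and `wait_for_outcome` are hypothetical event-subscription helpers, not part of the protocol.
```python
def run_purchase(client, request, payment, wait_for_fulfilled, wait_for_outcome):
    request_id = client.request_storage(request, payment)

    # Wait for the RequestFulfilled event or the request's expiry timestamp.
    if not wait_for_fulfilled(request_id, deadline=request.expiry):
        client.withdraw_funds(request_id)   # cancelled: expired before fulfillment
        return "cancelled"

    # Monitor until the request duration completes or RequestFailed is emitted.
    outcome = wait_for_outcome(request_id)  # "finished" or "failed"
    client.withdraw_funds(request_id)       # always release locked funds
    return outcome
```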
### Fund Management
Clients MUST:
- Provide sufficient payment when calling `requestStorage` to cover the storage request cost (based on slots, duration, and pricing parameters)
- Call `withdrawFunds` when a request completes, is cancelled, or fails to release any locked funds
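For illustration, the payment can be estimated from the `ask` parameters. The formula below is an assumption inferred from the listed fields (per-byte-per-second pricing over slot size, duration, and slot count), not a normative definition.
```python
def request_cost(price_per_byte_per_second: int,
                 slot_size: int, duration: int, slots: int) -> int:
    # Assumed pricing model: each slot costs price-per-byte-per-second
    # x slot size in bytes x duration in seconds; the request pays for all slots.
    price_per_slot = price_per_byte_per_second * slot_size * duration
    return price_per_slot * slots


# Example: 1 token unit per byte per second, 1 GiB slots, 1 day, 5 slots.
total = request_cost(1, 2**30, 86_400, 5)
```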
## Implementation Suggestions
This section describes implementation approaches used in the nim-codex reference implementation. These are suggestions and not normative requirements.
The nim-codex implementation uses a state machine pattern to manage purchase lifecycles, providing deterministic state transitions, explicit terminal states, and recovery support. The state machine definitions (state identifiers, transitions, state descriptions, requirements, data models, and interfaces) are documented in the subsections below.
> **Note**: The Purchase module terminology and state machine design are specific to the nim-codex implementation. The protocol only requires that clients interact with the marketplace smart contract as specified in the Client Protocol Requirements section.
### State Identifiers
- PurchasePending: `pending`
- PurchaseSubmitted: `submitted`
- PurchaseStarted: `started`
- PurchaseFinished: `finished`
- PurchaseErrored: `errored`
- PurchaseCancelled: `cancelled`
- PurchaseFailed: `failed`
- PurchaseUnknown: `unknown`
### General Rules for All States
- If a `CancelledError` is raised, the state machine logs the cancellation message and takes no further action.
- If a `CatchableError` is raised, the state machine moves to `errored` with the error message.
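These two rules can be sketched as a wrapper around each state's body; `CancelledError` and `CatchableError` are the Nim exception types, approximated here in Python for illustration only.
```python
import logging


class CancelledError(Exception):
    """Stand-in for the Nim CancelledError."""


def run_state(state_fn, move_to_errored):
    try:
        return state_fn()
    except CancelledError as exc:
        logging.info("purchase cancelled: %s", exc)  # log and take no further action
        return None
    except Exception as exc:                         # stand-in for CatchableError
        return move_to_errored(str(exc))             # move to `errored` with the message
```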
### State Transitions
```text
                                           |
                                           v
                ------------------------- unknown
               |                     /        /
               v                    v        /
pending ----> submitted ----> started ---------> finished <----/
                  \               \                             /
                   \               -------------> failed <-----/
                    \                                          /
                     --> cancelled <---------------------------
```
**Note:**
Any state can transition to errored upon a `CatchableError`.
`failed` is an intermediate state before `errored`.
`finished`, `cancelled`, and `errored` are terminal states.
### State Descriptions
**Pending State (`pending`)**
A storage request is being created by making an `on-chain` call. If the storage request creation fails, the state machine moves to the `errored` state with the corresponding error.
**Submitted State (`submitted`)**
The storage request has been created and the purchase waits for the request to start. When it starts, an `on-chain` event `RequestFulfilled` is emitted, triggering the subscription callback, and the state machine moves to the `started` state. If the expiry is reached before the callback is called, the state machine moves to the `cancelled` state.
**Started State (`started`)**
The purchase is active and waits until the end of the request, defined by the storage request parameters, before moving to the `finished` state. A subscription is made to the marketplace to be notified about request failure. If a request failure is notified, the state machine moves to `failed`.
Marketplace subscription signature:
```nim
method subscribeRequestFailed*(market: Market, requestId: RequestId, callback: OnRequestFailed): Future[Subscription] {.base, async.}
```
**Finished State (`finished`)**
The purchase is considered successful and cleanup routines are called. The purchase module calls `marketplace.withdrawFunds` to release the funds locked by the marketplace:
```nim
method withdrawFunds*(market: Market, requestId: RequestId) {.base, async: (raises: [CancelledError, MarketError]).}
```
After that, the purchase is done; no further states run and the state machine stops successfully.
**Failed State (`failed`)**
If the marketplace emits a `RequestFailed` event, the state machine moves to the `failed` state and the purchase module calls `marketplace.withdrawFunds` (same signature as above) to release the funds locked by the marketplace. After that, the state machine moves to `errored`.
**Cancelled State (`cancelled`)**
The purchase is cancelled and the purchase module calls `marketplace.withdrawFunds` to release the funds locked by the marketplace (same signature as above). After that, the purchase is terminated; no further states run and the state machine stops, reporting the cancellation as an error.
**Errored State (`errored`)**
The purchase is terminated; no further states run and the state machine stops, recording the reason for the failure as its error.
**Unknown State (`unknown`)**
The purchase is in recovery mode, meaning that the state has to be determined. The purchase module calls the marketplace to get the request data (`getRequest`) and the request state (`requestState`):
```nim
method getRequest*(market: Market, id: RequestId): Future[?StorageRequest] {.base, async: (raises: [CancelledError]).}
method requestState*(market: Market, requestId: RequestId): Future[?RequestState] {.base, async.}
```
Based on this information, it moves to the corresponding next state.
> **Note**: Functional and non-functional requirements for the client role are summarized in the [Codex Marketplace Specification](https://github.com/codex-storage/codex-spec/blob/master/specs/marketplace.md). The requirements listed below are specific to the nim-codex Purchase module implementation.
### Functional Requirements
#### Purchase Definition
- Every purchase MUST represent exactly one `StorageRequest`
- The purchase MUST have a unique, deterministic identifier `PurchaseId` derived from `requestId`
- It MUST be possible to restore any purchase from its `requestId` after a restart
- A purchase is considered expired when the expiry timestamp in its `StorageRequest` is reached before the request starts, i.e., before a `RequestFulfilled` event is emitted by the `marketplace`
#### State Machine Progression
- New purchases MUST start in the `pending` state (submission flow)
- Recovered purchases MUST start in the `unknown` state (recovery flow)
- The state machine MUST progress step-by-step until a deterministic terminal state is reached
- The choice of terminal state MUST be based on the `RequestState` returned by the `marketplace`
#### Failure Handling
- On marketplace failure events, the purchase MUST immediately transition to `errored` without retries
- If a `CancelledError` is raised, the state machine MUST log the cancellation and stop further processing
- If a `CatchableError` is raised, the state machine MUST transition to `errored` and record the error
### Non-Functional Requirements
#### Execution Model
A purchase MUST be handled by a single thread; only one worker SHOULD process a given purchase instance at a time.
#### Reliability
`load` supports recovery after process restarts.
#### Performance
State transitions should be non-blocking; all I/O is async.
#### Logging
All state transitions and errors should be clearly logged for traceability.
#### Safety
- Avoid side effects during `new` other than initialising internal fields; `on-chain` interactions are delegated to states using `marketplace` dependency.
- A retry policy for external calls should be implemented.
#### Testing
- Unit tests check that each state handles success and error properly.
- Integration tests check that a full purchase flows correctly through states.
### Data Models (Wire Format)
#### Purchase
```nim
Purchase* = ref object of Machine
  future*: Future[void]
  market*: Market
  clock*: Clock
  requestId*: RequestId
  request*: ?StorageRequest
```
Extends the `Machine` type from the asyncstatemachine framework. Contains a `future` for async state machine execution, references to the `market` and `clock` dependencies, the `requestId` identifier, and an optional `request` field for the `StorageRequest`.
#### StorageRequest
```nim
StorageRequest* = object
  client* {.serialize.}: Address
  ask* {.serialize.}: StorageAsk
  content* {.serialize.}: StorageContent
  expiry* {.serialize.}: uint64
  nonce*: Nonce
```
Contains the `client` address, storage parameters in `ask`, content identification in `content`, an `expiry` timestamp (uint64), and a unique `nonce`. Fields are marked with `{.serialize.}` for serialization support.
#### StorageAsk
```nim
StorageAsk* = object
  proofProbability* {.serialize.}: UInt256
  pricePerBytePerSecond* {.serialize.}: UInt256
  collateralPerByte* {.serialize.}: UInt256
  slots* {.serialize.}: uint64
  slotSize* {.serialize.}: uint64
  duration* {.serialize.}: uint64
  maxSlotLoss* {.serialize.}: uint64
```
Defines storage agreement parameters: `proofProbability` and pricing in `pricePerBytePerSecond`, `collateralPerByte` for collateral requirements, `slots` and `slotSize` for storage distribution, `duration` for agreement length, and `maxSlotLoss` for fault tolerance. Numeric fields use `UInt256` for large values or `uint64` for counts and durations.
#### StorageContent
```nim
StorageContent* = object
  cid* {.serialize.}: Cid
  merkleRoot*: array[32, byte]
```
Identifies content to be stored using a `cid` (Content Identifier) and a `merkleRoot` (32-byte array) for verification purposes.
#### RequestId
```nim
RequestId* = distinct array[32, byte]
```
A distinct 32-byte array type used as a unique identifier for storage requests. Provides type safety through Nim's distinct types.
#### Nonce
```nim
Nonce* = distinct array[32, byte]
```
A distinct 32-byte array type used as a unique nonce value in `StorageRequest` to ensure request uniqueness.
### Interfaces
| Interface (Nim) | Description | Input | Output |
| --------------- | ----------- | ----- | ------ |
| `func new(_: type Purchase, requestId: RequestId, market: Market, clock: Clock): Purchase` | Construct a purchase from a storage request identifier. Used to recover a purchase. | `requestId: RequestId`, `market: Market`, `clock: Clock` | `Purchase` |
| `func new(_: type Purchase, request: StorageRequest, market: Market, clock: Clock): Purchase` | Create a purchase from a full `StorageRequest`. | `request: StorageRequest`, `market: Market`, `clock: Clock` | `Purchase` |
| `proc start*(purchase: Purchase)` | Start the state machine in pending mode (new on-chain submission flow). | `purchase: Purchase` | `void` |
| `proc load*(purchase: Purchase)` | Start the state machine in unknown mode (restore/recover after restart). | `purchase: Purchase` | `void` |
| `proc wait*(purchase: Purchase) {.async.}` | Await terminal state: completes on success or raises on failure. | `purchase: Purchase` | `Future[void]` |
| `func id*(purchase: Purchase): PurchaseId` | Stable identifier derived from `requestId`. | `purchase: Purchase` | `PurchaseId` |
| `func finished*(purchase: Purchase): bool` | Check whether the purchase completed successfully. | `purchase: Purchase` | `bool` |
| `func error*(purchase: Purchase): ?(ref CatchableError)` | Get error if the purchase failed. | `purchase: Purchase` | `Option[ref CatchableError]` |
| `func state*(purchase: Purchase): ?string` | Query current state name via the state machine. | `purchase: Purchase` | `Option[string]` |
| `proc hash*(x: PurchaseId): Hash` | Compute hash for `PurchaseId`. | `x: PurchaseId` | `Hash` |
| `proc ==*(x, y: PurchaseId): bool` | Equality comparison for `PurchaseId`. | `x: PurchaseId`, `y: PurchaseId` | `bool` |
| `proc toHex*(x: PurchaseId): string` | Hex string representation of `PurchaseId`. | `x: PurchaseId` | `string` |
| `method run*(state: PurchasePending, machine: Machine): Future[?State] {.async: (raises: []).}` | Submit request to market. | `state: PurchasePending`, `machine: Machine` | `Future[Option[State]]` |
| `method run*(state: PurchaseSubmitted, machine: Machine): Future[?State] {.async: (raises: []).}` | Await purchase start. | `state: PurchaseSubmitted`, `machine: Machine` | `Future[Option[State]]` |
| `method run*(state: PurchaseStarted, machine: Machine): Future[?State] {.async: (raises: []).}` | Run the purchase. | `state: PurchaseStarted`, `machine: Machine` | `Future[Option[State]]` |
| `method run*(state: PurchaseFinished, machine: Machine): Future[?State] {.async: (raises: []).}` | Purchase completed. | `state: PurchaseFinished`, `machine: Machine` | `Future[Option[State]]` |
| `method run*(state: PurchaseErrored, machine: Machine): Future[?State] {.async: (raises: []).}` | Purchase failed. | `state: PurchaseErrored`, `machine: Machine` | `Future[Option[State]]` |
| `method run*(state: PurchaseCancelled, machine: Machine): Future[?State] {.async: (raises: []).}` | Purchase cancelled or timed out. | `state: PurchaseCancelled`, `machine: Machine` | `Future[Option[State]]` |
| `method run*(state: PurchaseUnknown, machine: Machine): Future[?State] {.async: (raises: []).}` | Recover a purchase. | `state: PurchaseUnknown`, `machine: Machine` | `Future[Option[State]]` |
### Dependencies
The Purchase module requires the following dependencies:
- **marketplace** - Used for `requestStorage`, `withdrawFunds`, and subscriptions for request events
- **clock** - Provides timing utilities for expiry and scheduling logic
- **nim-chronos** - Async runtime for futures, awaiting I/O, and cancellation handling
- **asyncstatemachine** - Base state machine framework for implementing the purchase lifecycle
- **hashes** - Standard Nim hashing for `PurchaseId`
- **questionable** - Used for optional fields like `request`
### State Machine Implementation
Based on the specification, implementations should use the `asyncstatemachine` framework as the base for the Purchase state machine. The `Purchase` object extends `Machine` from this framework.
### Recovery Implementation
The `unknown` state implements recovery by:
1. Calling `marketplace.getRequest` to retrieve the `StorageRequest` data
2. Calling `marketplace.requestState` to determine the current `RequestState`
3. Transitioning to the appropriate state based on the marketplace response
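A sketch of this dispatch, reusing the illustrative client interface from earlier; the `RequestState` names and their mapping onto purchase states are assumptions for illustration only.
```python
# Illustrative mapping from the marketplace RequestState to the next purchase state.
RECOVERY_TRANSITIONS = {
    "New": "submitted",      # request exists on-chain but has not started yet
    "Started": "started",
    "Cancelled": "cancelled",
    "Finished": "finished",
    "Failed": "failed",
}


def recover(client, request_id):
    request = client.get_request(request_id)    # 1. retrieve the StorageRequest
    state = client.request_state(request_id)    # 2. determine the current RequestState
    # 3. transition to the corresponding state (state assumed to be its name as a string)
    return request, RECOVERY_TRANSITIONS[state]
```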
### Fund Withdrawal
Terminal states (`finished`, `failed`, `cancelled`) all call `marketplace.withdrawFunds` to release locked funds. Implementations must ensure this call is made before the state machine stops.
## Security/Privacy Considerations
### Safety Requirements
The specification defines the following safety requirements:
- Implementations must avoid side effects during `new` other than initialising internal fields
- `on-chain` interactions must be delegated to states using `marketplace` dependency
- A retry policy for external calls should be implemented
### State Machine Safety
The deterministic state machine design ensures:
- Every purchase reaches exactly one terminal state (`finished`, `cancelled`, or `errored`)
- The `failed` state is an intermediate state that always transitions to `errored`
- State transitions follow the defined state diagram without cycles in terminal states
### Fund Management Safety
All terminal paths (`finished`, `failed`, `cancelled`) call `marketplace.withdrawFunds` as specified in the state descriptions. This ensures locked funds are released when purchases complete or fail.
### Recovery Safety
The `unknown` state enables recovery after restarts by querying the marketplace for current state. The specification requires that it must be possible to restore any purchase from its `requestId` after a restart.
## Rationale
This specification is based on the Purchase module component specification from the Codex project.
### State Machine Pattern
The specification uses a state machine pattern for managing purchase lifecycles. This design provides:
- **Deterministic state transitions**: As documented in the state diagram, purchases follow well-defined paths
- **Explicit terminal states**: Three terminal states (`finished`, `cancelled`, `errored`) clearly indicate final outcomes
- **Recovery support**: The `unknown` state enables restoration after process restarts as specified in the functional requirements
### Marketplace Abstraction
The marketplace abstraction allows for different backend implementations (on-chain or custom storage systems). The specification notes that "the abstraction could be replaceable: it could point to a different backend or custom storage system in future implementations."
This design separates purchase lifecycle management from marketplace implementation details, with all on-chain interactions delegated to states using the `marketplace` dependency as specified in the safety requirements.
### Async State Machine Framework
The specification requires the `asyncstatemachine` dependency for the base state machine framework and `nim-chronos` for async runtime. The `Purchase` object extends `Machine` from this framework, and state transitions are non-blocking with async I/O as specified in the performance requirements.
## Copyright
Copyright and related rights waived via [CC0](https://creativecommons.org/publicdomain/zero/1.0/).
## References
### normative
- **Codex Marketplace Specification**: [Marketplace Spec](https://github.com/codex-storage/codex-spec/blob/master/specs/marketplace.md) - Defines the client role and protocol requirements
- **Codex Purchase Implementation**: [GitHub - codex-storage/nim-codex](https://github.com/codex-storage/nim-codex)
- **Codex Documentation**: [Codex Docs - Component Specification - Purchase](https://github.com/codex-storage/codex-docs-obsidian/blob/main/10%20Notes/Specs/Component%20Specification%20-%20Purchase.md)
### informative
- **Nim Chronos**: [GitHub - status-im/nim-chronos](https://github.com/status-im/nim-chronos) - Async/await framework for Nim
- **RFC 2119**: Key words for use in RFCs to Indicate Requirement Levels
- **Codex Marketplace**: Smart contract or backend system managing storage agreements

View File

@@ -2,5 +2,5 @@
Nomos is building a secure, flexible, and
scalable infrastructure for developers creating applications for the network state.
To learn more about current Nomos protocols under discussion,
head over to [Nomos Specs](https://github.com/logos-co/nomos-specs).
Published specifications are currently available here:
[Nomos Specifications](https://nomos-tech.notion.site/project).

View File

@@ -0,0 +1,170 @@
---
title: NOMOSDA-ENCODING
name: NomosDA Encoding Protocol
status: raw
category:
tags: data-availability
editor: Daniel Sanchez-Quiros <danielsq@status.im>
contributors:
- Daniel Kashepava <danielkashepava@status.im>
- Álvaro Castro-Castilla <alvaro@status.im>
- Filip Dimitrijevic <filip@status.im>
- Thomas Lavaur <thomaslavaur@status.im>
- Mehmet Gonen <mehmet@status.im>
---
## Introduction
This document describes the encoding and verification processes of NomosDA, which is the data availability (DA) solution used by the Nomos blockchain. NomosDA provides an assurance that all data from Nomos blobs are accessible and verifiable by every network participant.
This document presents an implementation specification describing how:
- Encoders encode blobs they want to upload to the Data Availability layer.
- Other nodes implement the verification of blobs that were already uploaded to DA.
## Definitions
- **Encoder**: An encoder is any actor who performs the encoding process described in this document. This involves committing to the data, generating proofs, and submitting the result to the DA layer.
In the Nomos architecture, the rollup sequencer typically acts as the encoder, but the role is not exclusive and any actor in the DA layer can also act as an encoder.
- **Verifier**: Verifies its portion of the distributed blob data as per the verification protocol. In the Nomos architecture, the DA nodes act as the verifiers.
## Overview
In the encoding stage, the encoder takes the DA parameters and the padded blob data and creates an initial matrix of data chunks. This matrix is expanded using Reed-Solomon coding and various commitments and proofs are created for the data.
When a verifier receives a sample, it verifies the data it receives from the encoder and broadcasts the information if the data is verified. Finally, the verifier stores the sample data for the required length of time.
## Construction
The encoder and verifier use the [NomosDA cryptographic protocol](https://www.notion.so/NomosDA-Cryptographic-Protocol-1fd261aa09df816fa97ac81304732e77?pvs=21) to carry out their respective functions. These functions are implemented as abstracted and configurable software entities that allow the original data to be encoded and verified via high-level operations.
### Glossary
| Name | Description | Representation |
| --- | --- | --- |
| `Commitment` | Commitment as per the [NomosDA Cryptographic Protocol](https://www.notion.so/NomosDA-Cryptographic-Protocol-1fd261aa09df816fa97ac81304732e77?pvs=21) | `bytes` |
| `Proof` | Proof as per the [NomosDA Cryptographic Protocol](https://www.notion.so/NomosDA-Cryptographic-Protocol-1fd261aa09df816fa97ac81304732e77?pvs=21) | `bytes` |
| `ChunksMatrix` | Matrix of chunked data. Each chunk is **31 bytes**. Row and column sizes depend on the encoding requirements. | `List[List[bytes]]` |
### Encoder
An encoder takes a set of parameters and the blob data, and creates a matrix of chunks that it uses to compute the necessary cryptographic data. It produces the set of Reed-Solomon (RS) encoded data, the commitments, and the proofs that are needed prior to [dispersal](https://www.notion.so/NomosDA-Dispersal-1fd261aa09df815288c9caf45ed72c95?pvs=21).
```mermaid
flowchart LR
A[DaEncoderParams] -->|Input| B(Encoder)
I[31bytes-padded-input] -->|Input| B
B -->|Creates| D[Chunks matrix]
D --> |Input| C[NomosDA encoding]
C --> E{Encoded data📄}
```
#### Encoding Process
The encoder executes the encoding process as follows:
1. The encoder takes the following input parameters:
```python
class DAEncoderParams:
    column_count: usize
    bytes_per_field_element: usize
```
| Name | Description | Representation |
| --- | --- | --- |
| `column_count` | The number of subnets available for dispersal in the system | `usize`, `int` in Python |
| `bytes_per_field_element` | The number of bytes per data chunk. This is set to 31 bytes. Each chunk has 31 bytes rather than 32 to ensure that the chunk value does not exceed the maximum value on the [BLS12-381 elliptic curve](https://electriccoin.co/blog/new-snark-curve/). | `usize`, `int` in Python |
2. The encoder also includes the blob data to be encoded, which must be of a size that is a multiple of `bytes_per_field_element` bytes. Clients are responsible for padding the data so it fits this constraint.
3. The encoder splits the data into `bytes_per_field_element`-sized chunks. It then arranges these chunks into rows and columns, creating a matrix (see the sketch after this list).
    a. The number of columns of the matrix must match the `column_count` parameter, taking into account the `rs_expansion_factor` (currently fixed to 2).
        i. This means that the size of each row in this matrix is `(bytes_per_field_element*column_count)/rs_expansion_factor`.
    b. The number of rows depends on the size of the data.
4. The data is encoded as per [the cryptographic details](https://www.notion.so/NomosDA-Cryptographic-Protocol-1fd261aa09df816fa97ac81304732e77?pvs=21).
5. The encoder provides the encoded data set:
| Name | Description | Representation |
| --- | --- | --- |
| `data` | Original data | `bytes` |
| `chunked_data` | Matrix before RS expansion | `ChunksMatrix` |
| `extended_matrix` | Matrix after RS expansion | `ChunksMatrix` |
| `row_commitments` | Commitments for each matrix row | `List[Commitment]` |
| `combined_column_proofs` | Proofs for each matrix column | `List[Proof]` |
```python
class EncodedData:
    data: bytes
    chunked_data: ChunksMatrix
    extended_matrix: ChunksMatrix
    row_commitments: List[Commitment]
    combined_column_proofs: List[Proof]
```
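The chunk arrangement described in step 3 can be sketched as follows. This is a non-normative illustration: it only builds the pre-expansion `chunked_data` matrix and assumes `rs_expansion_factor = 2` as stated above.
```python
from typing import List


def chunkify(data: bytes, bytes_per_field_element: int, column_count: int,
             rs_expansion_factor: int = 2) -> List[List[bytes]]:
    # Split the padded data into `bytes_per_field_element`-sized chunks and
    # arrange them into rows of column_count / rs_expansion_factor chunks each.
    assert len(data) % bytes_per_field_element == 0, "data must be padded by the client"
    chunks = [data[i:i + bytes_per_field_element]
              for i in range(0, len(data), bytes_per_field_element)]
    chunks_per_row = column_count // rs_expansion_factor
    return [chunks[i:i + chunks_per_row] for i in range(0, len(chunks), chunks_per_row)]


# Example: 2048 columns after expansion -> 1024 chunks per pre-expansion row.
matrix = chunkify(b"\x00" * (31 * 1024), bytes_per_field_element=31, column_count=2048)
assert len(matrix) == 1 and len(matrix[0]) == 1024
```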
#### Encoder Limits
NomosDA does not impose a fixed limit on blob size at the encoding level. However, protocols that involve resource-intensive operations must include upper bounds to prevent abuse. In the case of NomosDA, blob size limits are expected to be enforced, as part of the protocol's broader responsibility for resource management and fairness.
Larger blobs naturally result in higher computational and bandwidth costs, particularly for the encoder, who must compute a proof for each column. Without size limits, malicious clients could exploit the system by attempting to stream unbounded data to DA nodes. Since payment is provided before blob dispersal, DA nodes are protected from performing unnecessary work. This enables the protocol to safely accept very large blobs, as the primary computational cost falls on the encoder. The protocol can accommodate generous blob sizes in practice, while rejecting only absurdly large blobs, such as those exceeding 1 GB, to prevent denial-of-service attacks and ensure network stability.
To mitigate this, the protocol defines acceptable blob size limits, and DA implementations enforce local mitigation strategies, such as flagging or blacklisting clients that violate these constraints.
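A minimal sketch of such a local guard; the 1 GB figure is only the example cited above, and the real bound is a protocol or deployment parameter.
```python
MAX_BLOB_SIZE = 1 << 30  # illustrative limit only, not a normative value


def accept_blob(blob: bytes, client_id: str, blacklist: set) -> bool:
    # Local mitigation: reject oversized blobs and flag the offending client.
    if len(blob) > MAX_BLOB_SIZE:
        blacklist.add(client_id)
        return False
    return True
```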
### Verifier
A verifier checks the proper encoding of data blobs it receives. A verifier executes the verification process as follows:
1. The verifier receives a `DAShare` with the required verification data:
| Name | Description | Representation |
| --- | --- | --- |
| `column` | Column chunks (31 bytes) from the encoded matrix | `List[bytes]` |
| `column_idx` | Column id (`0..2047`). It is directly related to the `subnetworks` in the [network specification](https://www.notion.so/NomosDA-Network-Specification-1fd261aa09df81188e76cb083791252d?pvs=21). | `u16`, unsigned int of 16 bits. `int` in Python |
| `combined_column_proof` | Proof of the random linear combination of the column elements. | `Proof` |
| `row_commitments` | Commitments for each matrix row | `List[Commitment]` |
| `blob_id` | This is computed as the hash (**blake2b**) of `row_commitments` | `bytes` |
2. Upon receiving the above data it verifies the column data as per the [cryptographic details](https://www.notion.so/NomosDA-Cryptographic-Protocol-1fd261aa09df816fa97ac81304732e77?pvs=21). If the verification is successful, the node triggers the [replication protocol](https://www.notion.so/NomosDA-Subnetwork-Replication-1fd261aa09df811d93f8c6280136bfbb?pvs=21) and stores the blob.
```python
from hashlib import blake2b


class DAShare:
    column: Column
    column_idx: u16
    combined_column_proof: Proof
    row_commitments: List[Commitment]

    def blob_id(self) -> BlobId:
        hasher = blake2b(digest_size=32)
        for c in self.row_commitments:
            hasher.update(bytes(c))
        return hasher.digest()
```
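A sketch of the verifier's handling of an incoming `DAShare`; `verify_column`, `replicate`, and `store` are placeholder callables for the cryptographic column check, the replication protocol, and local storage, and are not defined by this document.
```python
def handle_share(share, verify_column, replicate, store) -> bool:
    # Verify the column against the row commitments and the combined column proof.
    if not verify_column(share.column, share.column_idx,
                         share.combined_column_proof, share.row_commitments):
        return False
    replicate(share)               # trigger the subnetwork replication protocol
    store(share.blob_id(), share)  # keep the sample for the required period
    return True
```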
### Verification Logic
```mermaid
sequenceDiagram
participant N as Node
participant S as Subnetwork Column N
loop For each incoming blob column
N-->>N: If blob is valid
N-->>S: Replication
N->>N: Stores blob
end
```
## Details
The encoder and verifier processes described above make use of a variety of cryptographic functions to facilitate the correct verification of column data by verifiers. These functions rely on primitives such as polynomial commitments and Reed-Solomon erasure codes, the details of which are outside the scope of this document. These details, as well as introductions to the cryptographic primitives being used, can be found in the NomosDA Cryptographic Protocol:
[NomosDA Cryptographic Protocol](https://www.notion.so/NomosDA-Cryptographic-Protocol-1fd261aa09df816fa97ac81304732e77?pvs=21)
## References
- Encoder Specification: [GitHub/encoder.py](https://github.com/logos-co/nomos-specs/blob/master/da/encoder.py)
- Verifier Specification: [GitHub/verifier.py](https://github.com/logos-co/nomos-specs/blob/master/da/verifier.py)
- Cryptographic protocol: [NomosDA Cryptographic Protocol](https://www.notion.so/NomosDA-Cryptographic-Protocol-1fd261aa09df816fa97ac81304732e77?pvs=21)
## Copyright
Copyright and related rights waived via [CC0](https://creativecommons.org/publicdomain/zero/1.0/).

View File

@@ -0,0 +1,255 @@
---
title: NOMOS-DA-NETWORK
name: NomosDA Network
status: raw
category:
tags: network, data-availability, da-nodes, executors, sampling
editor: Daniel Sanchez Quiros <danielsq@status.im>
contributors:
- Álvaro Castro-Castilla <alvaro@status.im>
- Daniel Kashepava <danielkashepava@status.im>
- Gusto Bacvinka <augustinas@status.im>
- Filip Dimitrijevic <filip@status.im>
---
## Introduction
NomosDA is the scalability solution protocol for data availability within the Nomos network.
This document delineates the protocol's structure at the network level,
identifies participants,
and describes the interactions among its components.
Please note that this document does not delve into the cryptographic aspects of the design.
For comprehensive details on the cryptographic operations,
a detailed specification is a work in progress.
## Objectives
NomosDA was created to ensure that data from Nomos zones is distributed, verifiable, immutable, and accessible.
At the same time, it is optimised for the following properties:
- **Decentralization**: NomosDA's data availability guarantees must be achieved with minimal trust assumptions
and minimal reliance on centralised actors. Therefore,
permissioned DA schemes involving a Data Availability Committee (DAC) had to be avoided in the design.
Schemes that require some nodes to download the entire blob data were also off the list
due to the disproportionate role played by these “supernodes”.
- **Scalability**: NomosDA is intended to be a bandwidth-scalable protocol, ensuring that its functions are maintained as the Nomos network grows. Therefore, NomosDA was designed to minimise the amount of data sent to participants, reducing the communication bottleneck and allowing more parties to participate in the DA process.
To achieve the above properties, NomosDA splits up zone data and
distributes it among network participants,
with cryptographic properties used to verify the data's integrity.
A major feature of this design is that parties who wish to receive an assurance of data availability
can do so very quickly and with minimal hardware requirements.
However, this comes at the cost of additional complexity and resources required by more integral participants.
## Requirements
In order to ensure that the above objectives are met,
the NomosDA network requires a group of participants
that undertake a greater burden in terms of active involvement in the protocol.
Recognising that not all node operators can do so,
NomosDA assigns different roles to different kinds of participants,
depending on their ability and willingness to contribute more computing power
and bandwidth to the protocol.
It was therefore necessary for NomosDA to be implemented as an opt-in Service Network.
Because the NomosDA network has an arbitrary amount of participants,
and the data is split into a fixed number of portions (see the [Encoding & Verification Specification](https://www.notion.so/NomosDA-Encoding-Verification-4d8ca269e96d4fdcb05abc70426c5e7c)),
it was necessary to define exactly how each portion is assigned to a participant who will receive and verify it.
This assignment algorithm must also be flexible enough to ensure smooth operation in a variety of scenarios,
including where there are more or fewer participants than the number of portions.
## Overview
### Network Participants
The NomosDA network includes three categories of participants:
- **Executors**: Tasked with the encoding and dispersal of data blobs.
- **DA Nodes**: Receive and verify the encoded data,
subsequently temporarily storing it for further network validation through sampling.
- **Light Nodes**: Employ sampling to ascertain data availability.
### Network Distribution
The NomosDA network is segmented into `num_subnets` subnetworks.
These subnetworks represent subsets of peers from the overarching network,
each responsible for a distinct portion of the distributed encoded data.
Peers in the network may engage in one or multiple subnetworks,
contingent upon network size and participant count.
### Sub-protocols
The NomosDA protocol consists of the following sub-protocols:
- **Dispersal**: Describes how executors distribute encoded data blobs to subnetworks.
[NomosDA Dispersal](https://www.notion.so/NomosDA-Dispersal-1818f96fb65c805ca257cb14798f24d4?pvs=21)
- **Replication**: Defines how DA nodes distribute encoded data blobs within subnetworks.
[NomosDA Subnetwork Replication](https://www.notion.so/NomosDA-Subnetwork-Replication-1818f96fb65c80119fa0e958a087cc2b?pvs=21)
- **Sampling**: Used by sampling clients (e.g., light clients) to verify the availability of previously dispersed
and replicated data.
[NomosDA Sampling](https://www.notion.so/NomosDA-Sampling-1538f96fb65c8031a44cf7305d271779?pvs=21)
- **Reconstruction**: Describes gathering and decoding dispersed data back into its original form.
[NomosDA Reconstruction](https://www.notion.so/NomosDA-Reconstruction-1828f96fb65c80b2bbb9f4c5a0cf26a5?pvs=21)
- **Indexing**: Tracks and exposes blob metadata on-chain.
[NomosDA Indexing](https://www.notion.so/NomosDA-Indexing-1bb8f96fb65c8044b635da9df20c2411?pvs=21)
## Construction
### NomosDA Network Registration
Entities wishing to participate in NomosDA must declare their role via [SDP](https://www.notion.so/Final-Draft-Validator-Role-Protocol-17b8f96fb65c80c69c2ef55e22e29506) (Service Declaration Protocol).
Once declared, they're accounted for in the subnetwork construction.
This enables participation in:
- Dispersal (as executor)
- Replication & sampling (as DA node)
- Sampling (as light node)
### Subnetwork Assignment
The NomosDA network comprises `num_subnets` subnetworks,
which are virtual in nature.
A subnetwork is a subset of peers grouped together so that nodes know whom they should connect with;
these groupings are tasked with executing the dispersal and replication sub-protocols.
In each subnetwork, participants establish a fully connected overlay,
ensuring that all nodes maintain permanent connections with peers in the same subnetwork
for the lifetime of the SDP set.
Nodes refer to nodes in the Data Availability SDP set to ascertain their connectivity requirements across subnetworks.
#### Assignment Algorithm
The concrete distribution algorithm is described in the following specification:
[DA Subnetwork Assignation](https://www.notion.so/DA-Subnetwork-Assignation-217261aa09df80fc8bb9cf46092741ce)
## Executor Connections
Each executor maintains a connection with one peer per subnetwork,
necessitating at least `num_subnets` stable and healthy connections.
Executors are expected to allocate adequate resources to sustain these connections.
An example algorithm for peer selection would be:
```python
from typing import Sequence, Set

PeerId = str  # placeholder peer identifier type for this example


def select_peers(
    subnetworks: Sequence[Set[PeerId]],
    filtered_subnetworks: Set[int],
    filtered_peers: Set[PeerId],
) -> Set[PeerId]:
    result = set()
    for i, subnetwork in enumerate(subnetworks):
        available_peers = subnetwork - filtered_peers
        # Pick one available peer from every subnetwork that is not filtered out.
        if i not in filtered_subnetworks and available_peers:
            result.add(next(iter(available_peers)))
    return result
```
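For illustration, a hypothetical invocation of `select_peers`, with peer identifiers shown as plain strings:
```python
subnetworks = [{"peerA", "peerB"}, {"peerC"}, {"peerD", "peerE"}]

# Subnetwork 1 is skipped and peerA is excluded; one peer is picked from each remaining subnetwork.
peers = select_peers(subnetworks, filtered_subnetworks={1}, filtered_peers={"peerA"})
assert len(peers) == 2
```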
## NomosDA Protocol Steps
### Dispersal
1. The NomosDA protocol is initiated by executors
who perform data encoding as outlined in the [Encoding Specification](https://www.notion.so/NomosDA-Encoding-Verification-4d8ca269e96d4fdcb05abc70426c5e7c).
2. Executors prepare and distribute each encoded data portion
to its designated subnetwork (from `0` to `num_subnets - 1`).
3. Executors might opt to perform sampling to confirm successful dispersal.
4. Post-dispersal, executors publish the dispersed `blob_id` and metadata to the mempool. <!-- TODO: add link to dispersal document-->
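A sketch of the dispersal loop implied by steps 2 and 4; `send_column` and `publish_to_mempool` are hypothetical transport and mempool helpers, not part of this specification.
```python
def disperse(encoded_columns, subnet_peers, send_column, publish_to_mempool,
             blob_id: bytes, metadata: dict) -> None:
    # Step 2: send column i to one connected peer of subnetwork i (0..num_subnets-1).
    for idx, column in enumerate(encoded_columns):
        peer = next(iter(subnet_peers[idx]))
        send_column(peer, idx, column)
    # Step 4: publish the dispersed blob_id and metadata to the mempool.
    publish_to_mempool(blob_id, metadata)
```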
### Replication
DA nodes receive columns from dispersal or replication
and validate the data encoding.
Upon successful validation,
they replicate the validated column to connected peers within their subnetwork.
Replication occurs once per blob; subsequent validations of the same blob are discarded.
### Sampling
1. Sampling is [invoked based on the node's current role](https://www.notion.so/1538f96fb65c8031a44cf7305d271779?pvs=25#15e8f96fb65c8006b9d7f12ffdd9a159).
2. The node selects `sample_size` random subnetworks
and queries each for the availability of the corresponding column for the sampled blob. Sampling is deemed successful only if all queried subnetworks respond affirmatively (see the sketch after the sequence diagram below).
- If `num_subnets` is 2048, `sample_size` is [20 as per the sampling research](https://www.notion.so/1708f96fb65c80a08c97d728cb8476c3?pvs=25#1708f96fb65c80bab6f9c6a946940078)
```mermaid
sequenceDiagram
SamplingClient ->> DANode_1: Request
DANode_1 -->> SamplingClient: Response
SamplingClient ->>DANode_2: Request
DANode_2 -->> SamplingClient: Response
SamplingClient ->> DANode_n: Request
DANode_n -->> SamplingClient: Response
```
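The sampling check described above can be sketched as follows; `query` stands in for the per-subnetwork availability request and is not defined by this document.
```python
import random


def sample_blob(blob_id: bytes, num_subnets: int, sample_size: int, query) -> bool:
    # Pick `sample_size` distinct random subnetworks and query each for its column.
    chosen = random.sample(range(num_subnets), sample_size)
    # Sampling succeeds only if every queried subnetwork responds affirmatively.
    return all(query(subnet_idx, blob_id) for subnet_idx in chosen)


# Example with the parameters above: 2048 subnetworks, sample size 20.
# ok = sample_blob(blob_id, num_subnets=2048, sample_size=20, query=availability_query)
```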
### Network Schematics
The overall network and protocol interactions are represented by the following diagram:
```mermaid
flowchart TD
subgraph Replication
subgraph Subnetwork_N
N10 -->|Replicate| N20
N20 -->|Replicate| N30
N30 -->|Replicate| N10
end
subgraph ...
end
subgraph Subnetwork_0
N1 -->|Replicate| N2
N2 -->|Replicate| N3
N3 -->|Replicate| N1
end
end
subgraph Sampling
N9 -->|Sample 0| N2
N9 -->|Sample S| N20
end
subgraph Dispersal
Executor -->|Disperse| N1
Executor -->|Disperse| N10
end
```
## Details
### Network specifics
The NomosDA network is engineered for connection efficiency.
Executors manage numerous open connections,
utilizing their resource capabilities.
DA nodes, with their resource constraints,
are designed to maximize connection reuse.
NomosDA uses [multiplexed](https://docs.libp2p.io/concepts/transports/quic/#quic-native-multiplexing) streams over [QUIC](https://docs.libp2p.io/concepts/transports/quic/) connections.
For each sub-protocol, a stream protocol ID is defined to negotiate the protocol,
triggering the specific protocol once established:
- Dispersal: /nomos/da/{version}/dispersal
- Replication: /nomos/da/{version}/replication
- Sampling: /nomos/da/{version}/sampling
Through these multiplexed streams,
DA nodes can utilize the same connection for all sub-protocols.
This, combined with virtual subnetworks (membership sets),
ensures the overlay node distribution is scalable for networks of any size.
## References
- [Encoding Specification](https://www.notion.so/NomosDA-Encoding-Verification-4d8ca269e96d4fdcb05abc70426c5e7c)
- [Encoding & Verification Specification](https://www.notion.so/NomosDA-Encoding-Verification-4d8ca269e96d4fdcb05abc70426c5e7c)
- [NomosDA Dispersal](https://www.notion.so/NomosDA-Dispersal-1818f96fb65c805ca257cb14798f24d4?pvs=21)
- [NomosDA Subnetwork Replication](https://www.notion.so/NomosDA-Subnetwork-Replication-1818f96fb65c80119fa0e958a087cc2b?pvs=21)
- [DA Subnetwork Assignation](https://www.notion.so/DA-Subnetwork-Assignation-217261aa09df80fc8bb9cf46092741ce)
- [NomosDA Sampling](https://www.notion.so/NomosDA-Sampling-1538f96fb65c8031a44cf7305d271779?pvs=21)
- [NomosDA Reconstruction](https://www.notion.so/NomosDA-Reconstruction-1828f96fb65c80b2bbb9f4c5a0cf26a5?pvs=21)
- [NomosDA Indexing](https://www.notion.so/NomosDA-Indexing-1bb8f96fb65c8044b635da9df20c2411?pvs=21)
- [SDP](https://www.notion.so/Final-Draft-Validator-Role-Protocol-17b8f96fb65c80c69c2ef55e22e29506)
- [invoked based on the node's current role](https://www.notion.so/1538f96fb65c8031a44cf7305d271779?pvs=25#15e8f96fb65c8006b9d7f12ffdd9a159)
- [20 as per the sampling research](https://www.notion.so/1708f96fb65c80a08c97d728cb8476c3?pvs=25#1708f96fb65c80bab6f9c6a946940078)
- [multiplexed](https://docs.libp2p.io/concepts/transports/quic/#quic-native-multiplexing)
- [QUIC](https://docs.libp2p.io/concepts/transports/quic/)
## Copyright
Copyright and related rights waived via [CC0](https://creativecommons.org/publicdomain/zero/1.0/).

View File

@@ -0,0 +1,236 @@
---
title: P2P-HARDWARE-REQUIREMENTS
name: Nomos p2p Network Hardware Requirements Specification
status: raw
category: infrastructure
tags: [hardware, requirements, nodes, validators, services]
editor: Daniel Sanchez-Quiros <danielsq@status.im>
contributors:
- Filip Dimitrijevic <filip@status.im>
---
## Abstract
This specification defines the hardware requirements for running various types of Nomos blockchain nodes. Hardware needs vary significantly based on the node's role, from lightweight verification nodes to high-performance Zone Executors. The requirements are designed to support diverse participation levels while ensuring network security and performance.
## Motivation
The Nomos network is designed to be inclusive and accessible across a wide range of hardware configurations. By defining clear hardware requirements for different node types, we enable:
1. **Inclusive Participation**: Allow users with limited resources to participate as Light Nodes
2. **Scalable Infrastructure**: Support varying levels of network participation based on available resources
3. **Performance Optimization**: Ensure adequate resources for computationally intensive operations
4. **Network Security**: Maintain network integrity through properly resourced validator nodes
5. **Service Quality**: Define requirements for optional services that enhance network functionality
**Important Notice**: These hardware requirements are preliminary and subject to revision based on implementation testing and real-world network performance data.
## Specification
### Node Types Overview
Hardware requirements vary based on the node's role and services:
- **Light Node**: Minimal verification with minimal resources
- **Basic Bedrock Node**: Standard validation participation
- **Service Nodes**: Enhanced capabilities for optional network services
### Light Node
Light Nodes provide network verification with minimal resource requirements, suitable for resource-constrained environments.
**Target Use Cases:**
- Mobile devices and smartphones
- Single-board computers (Raspberry Pi, etc.)
- IoT devices with network connectivity
- Users with limited hardware resources
**Hardware Requirements:**
| Component | Specification |
|-----------|---------------|
| **CPU** | Low-power processor (smartphone/SBC capable) |
| **Memory (RAM)** | 512 MB |
| **Storage** | Minimal (few GB) |
| **Network** | Reliable connection, 1 Mbps free bandwidth |
### Basic Bedrock Node (Validator)
Basic validators participate in Bedrock consensus using typical consumer hardware.
**Target Use Cases:**
- Individual validators on consumer hardware
- Small-scale validation operations
- Entry-level network participation
**Hardware Requirements:**
| Component | Specification |
|-----------|---------------|
| **CPU** | 2 cores, 2 GHz modern multi-core processor |
| **Memory (RAM)** | 1 GB minimum |
| **Storage** | SSD with 100+ GB free space, expandable |
| **Network** | Reliable connection, 1 Mbps free bandwidth |
### Service-Specific Requirements
Nodes can optionally run additional Bedrock Services that require enhanced resources beyond basic validation.
#### Data Availability (DA) Service
DA Service nodes store and serve data shares for the network's data availability layer.
**Service Role:**
- Store blockchain data and blob data long-term
- Serve data shares to requesting nodes
- Maintain high availability for data retrieval
**Additional Requirements:**
| Component | Specification | Rationale |
|-----------|---------------|-----------|
| **CPU** | Same as Basic Bedrock Node | Standard processing needs |
| **Memory (RAM)** | Same as Basic Bedrock Node | Standard memory needs |
| **Storage** | **Fast SSD, 500+ GB free** | Long-term chain and blob storage |
| **Network** | **High bandwidth (10+ Mbps)** | Concurrent data serving |
| **Connectivity** | **Stable, accessible external IP** | Direct peer connections |
**Network Requirements:**
- Capacity to handle multiple concurrent connections
- Stable external IP address for direct peer access
- Low latency for efficient data serving
#### Blend Protocol Service
Blend Protocol nodes provide anonymous message routing capabilities.
**Service Role:**
- Route messages anonymously through the network
- Provide timing obfuscation for privacy
- Maintain multiple concurrent connections
**Additional Requirements:**
| Component | Specification | Rationale |
|-----------|---------------|-----------|
| **CPU** | Same as Basic Bedrock Node | Standard processing needs |
| **Memory (RAM)** | Same as Basic Bedrock Node | Standard memory needs |
| **Storage** | Same as Basic Bedrock Node | Standard storage needs |
| **Network** | **Stable connection (10+ Mbps)** | Multiple concurrent connections |
| **Connectivity** | **Stable, accessible external IP** | Direct peer connections |
**Network Requirements:**
- Low-latency connection for effective message blending
- Stable connection for timing obfuscation
- Capability to handle multiple simultaneous connections
#### Executor Network Service
Zone Executors perform the most computationally intensive work in the network.
**Service Role:**
- Execute Zone state transitions
- Generate zero-knowledge proofs
- Process complex computational workloads
**Critical Performance Note**: Zone Executors perform the heaviest computational work in the network. High-performance hardware is crucial for effective participation and may provide competitive advantages in execution markets.
**Hardware Requirements:**
| Component | Specification | Rationale |
|-----------|---------------|-----------|
| **CPU** | **Very high-performance multi-core processor** | Zone logic execution and ZK proving |
| **Memory (RAM)** | **32+ GB strongly recommended** | Complex Zone execution requirements |
| **Storage** | Same as Basic Bedrock Node | Standard storage needs |
| **GPU** | **Highly recommended/often necessary** | Efficient ZK proof generation |
| **Network** | **High bandwidth (10+ Mbps)** | Data dispersal and high connection load |
**GPU Requirements:**
- **NVIDIA**: CUDA-enabled GPU (RTX 3090 or equivalent recommended)
- **Apple**: Metal-compatible Apple Silicon
- **Performance Impact**: Strong GPU significantly reduces proving time
**Network Requirements:**
- Support for **2048+ direct UDP connections** to DA Nodes (for blob publishing)
- High bandwidth for data dispersal operations
- Stable connection for continuous operation
*Note: DA Nodes utilizing [libp2p](https://docs.libp2p.io/) connections need sufficient capacity to receive and serve data shares over many connections.*
## Implementation Requirements
### Minimum Requirements
All Nomos nodes MUST meet:
1. **Basic connectivity** to the Nomos network via [libp2p](https://docs.libp2p.io/)
2. **Adequate storage** for their designated role
3. **Sufficient processing power** for their service level
4. **Reliable network connection** with appropriate bandwidth for [QUIC](https://docs.libp2p.io/concepts/transports/quic/) transport
### Optional Enhancements
Node operators MAY implement:
- Hardware redundancy for critical services
- Enhanced cooling for high-performance configurations
- Dedicated network connections for service nodes utilizing [libp2p](https://docs.libp2p.io/) protocols
- Backup power systems for continuous operation
### Resource Scaling
Requirements may vary based on:
- **Network Load**: Higher network activity increases resource demands
- **Zone Complexity**: More complex Zones require additional computational resources
- **Service Combinations**: Running multiple services simultaneously increases requirements
- **Geographic Location**: Network latency affects optimal performance requirements
## Security Considerations
### Hardware Security
1. **Secure Storage**: Use encrypted storage for sensitive node data
2. **Network Security**: Implement proper firewall configurations
3. **Physical Security**: Secure physical access to node hardware
4. **Backup Strategies**: Maintain secure backups of critical data
### Performance Security
1. **Resource Monitoring**: Monitor resource usage to detect anomalies
2. **Redundancy**: Plan for hardware failures in critical services
3. **Isolation**: Consider containerization or virtualization for service isolation
4. **Update Management**: Maintain secure update procedures for hardware drivers
## Performance Characteristics
### Scalability
- **Light Nodes**: Minimal resource footprint, high scalability
- **Validators**: Moderate resource usage, network-dependent scaling
- **Service Nodes**: High resource usage, specialized scaling requirements
### Resource Efficiency
- **CPU Usage**: Optimized algorithms for different hardware tiers
- **Memory Usage**: Efficient data structures for constrained environments
- **Storage Usage**: Configurable retention policies and compression
- **Network Usage**: Adaptive bandwidth utilization based on [libp2p](https://docs.libp2p.io/) capacity and [QUIC](https://docs.libp2p.io/concepts/transports/quic/) connection efficiency
## References
1. [libp2p protocol](https://docs.libp2p.io/)
2. [QUIC protocol](https://docs.libp2p.io/concepts/transports/quic/)
## Copyright
Copyright and related rights waived via [CC0](https://creativecommons.org/publicdomain/zero/1.0/).


@@ -0,0 +1,377 @@
---
title: P2P-NAT-SOLUTION
name: Nomos P2P Network NAT Solution Specification
status: raw
category: networking
tags: [nat, traversal, autonat, upnp, pcp, nat-pmp]
editor: Antonio Antonino <antonio@status.im>
contributors:
- Álvaro Castro-Castilla <alvaro@status.im>
- Daniel Sanchez-Quiros <danielsq@status.im>
- Petar Radovic <petar@status.im>
- Gusto Bacvinka <augustinas@status.im>
- Youngjoon Lee <youngjoon@status.im>
- Filip Dimitrijevic <filip@status.im>
---
## Abstract
This specification defines a comprehensive NAT (Network Address Translation) traversal solution for the Nomos P2P network. The solution enables nodes to automatically determine their NAT status and establish both outbound and inbound connections regardless of network configuration. The strategy combines [AutoNAT](https://github.com/libp2p/specs/blob/master/autonat/autonat-v2.md), dynamic port mapping protocols, and continuous verification to maximize public reachability while maintaining decentralized operation.
## Motivation
Network Address Translation presents a critical challenge for Nomos participants, particularly those operating on consumer hardware without technical expertise. The Nomos network requires a NAT traversal solution that:
1. **Automatic Operation**: Works out-of-the-box without user configuration
2. **Inclusive Participation**: Enables nodes on consumer hardware to participate effectively
3. **Decentralized Approach**: Leverages the existing Nomos P2P network rather than centralized services
4. **Progressive Fallback**: Escalates through increasingly complex protocols as needed
5. **Dynamic Adaptation**: Handles changing network environments and configurations
The solution must ensure that nodes can both establish outbound connections and accept inbound connections from other peers, maintaining network connectivity across diverse NAT configurations.
## Specification
### Terminology
- **Public Node**: A node that is publicly reachable via a public IP address or valid port mapping
- **Private Node**: A node that is not publicly reachable due to NAT/firewall restrictions
- **Dialing**: The process of establishing a connection using the [libp2p protocol](https://docs.libp2p.io/) stack
- **NAT Status**: Whether a node is publicly reachable or hidden behind NAT
### Key Design Principles
#### Optional Configuration
The NAT traversal strategy must work out-of-the-box whenever possible. Users who do not want to engage in configuration should only need to install the node software package. However, users requiring full control must be able to configure every aspect of the strategy.
#### Decentralized Operation
The solution leverages the existing Nomos P2P network for coordination rather than relying on centralized third-party services. This maintains the decentralized nature of the network while providing necessary NAT traversal capabilities.
#### Progressive Fallback
The protocol begins with lightweight checks and escalates through more complex and resource-intensive protocols. Failure at any step moves the protocol to the next stage in the strategy, ensuring maximum compatibility across network configurations.
#### Dynamic Network Environment
Unless explicitly configured for static addresses, each node's public or private status is assumed to be dynamic. A once publicly-reachable node can become unreachable and vice versa, requiring continuous monitoring and adaptation.
### Node Discovery Considerations
The Nomos public network encourages participation from a large number of nodes, many deployed through simple installation procedures. Some nodes will not achieve Public status, but the discovery protocol must track these peers and allow other nodes to discover them. This prevents network partitioning and ensures Private nodes remain accessible to other participants.
### NAT Traversal Protocol
#### Protocol Requirements
**Each node MUST:**
- Run an [AutoNAT](https://github.com/libp2p/specs/blob/master/autonat/autonat-v2.md) client, except for nodes statically configured as Public
- Use the [Identify protocol](https://github.com/libp2p/specs/blob/master/identify/README.md) to advertise support for:
- `/nomos/autonat/2/dial-request` for main network
- `/nomos-testnet/autonat/2/dial-request` for public testnet
- `/nomos/autonat/2/dial-back` and `/nomos-testnet/autonat/2/dial-back` respectively
#### NAT State Machine
The NAT traversal process follows a multi-phase state machine:
```mermaid
graph TD
Start@{shape: circle, label: "Start"} -->|Preconfigured public IP or port mapping| StaticPublic[Statically configured as<br/>**Public**]
subgraph Phase0 [Phase 0]
Start -->|Default configuration| Boot
end
subgraph Phase1 [Phase 1]
Boot[Bootstrap and discover AutoNAT servers]--> Inspect
Inspect[Inspect own IP addresses]-->|At least 1 IP address in the public range| ConfirmPublic[AutoNAT]
end
subgraph Phase2 [Phase 2]
Inspect -->|No IP addresses in the public range| MapPorts[Port Mapping Client<br/>UPnP/NAT-PMP/PCP]
MapPorts -->|Successful port map| ConfirmMapPorts[AutoNAT]
end
ConfirmPublic -->|Node's IP address reachable by AutoNAT server| Public[**Public** Node]
ConfirmPublic -->|Node's IP address not reachable by AutoNAT server or Timeout| MapPorts
ConfirmMapPorts -->|Mapped IP address and port reachable by AutoNAT server| Public
ConfirmMapPorts -->|Mapped IP address and port not reachable by AutoNAT server or Timeout| Private
MapPorts -->|Failure or Timeout| Private[**Private** Node]
subgraph Phase3 [Phase 3]
Public -->Monitor
Private --> Monitor
end
Monitor[Network Monitoring] -->|Restart| Inspect
```
### Phase Implementation
#### Phase 0: Bootstrapping and Identifying Public Nodes
If the node is statically configured by the operator to be Public, the procedure stops here.
The node utilizes bootstrapping and discovery mechanisms to find other Public nodes. The [Identify protocol](https://github.com/libp2p/specs/blob/master/identify/README.md) confirms which detected Public nodes support [AutoNAT v2](https://github.com/libp2p/specs/blob/master/autonat/autonat-v2.md).
#### Phase 1: NAT Detection
The node starts an [AutoNAT](https://github.com/libp2p/specs/blob/master/autonat/autonat-v2.md) client and inspects its own addresses. For each public IP address, the node verifies public reachability via [AutoNAT](https://github.com/libp2p/specs/blob/master/autonat/autonat-v2.md). If any public IP addresses are confirmed, the node assumes Public status and moves to Phase 3. Otherwise, it continues to Phase 2.
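As an illustration, the sketch below shows the Phase 1 address inspection under stated assumptions: `autonat_confirms_reachability` is a hypothetical callable standing in for the AutoNAT v2 dial-back exchange, and addresses are given as plain IP strings.
```python
import ipaddress

def detect_public_addresses(own_addresses, autonat_confirms_reachability):
    """Phase 1 sketch: return the subset of own addresses confirmed as publicly reachable.

    `own_addresses` is a list of IP address strings; `autonat_confirms_reachability`
    is a hypothetical helper wrapping the AutoNAT v2 dial-back request.
    """
    confirmed = []
    for addr in own_addresses:
        ip = ipaddress.ip_address(addr)
        # Only addresses in the public (globally routable) range are candidates.
        if ip.is_global and autonat_confirms_reachability(addr):
            confirmed.append(addr)
    return confirmed

# If `confirmed` is non-empty, the node assumes Public status and moves to Phase 3;
# otherwise it proceeds to Phase 2 (automated port mapping).
```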
#### Phase 2: Automated Port Mapping
The node attempts to secure port mapping on the default gateway using:
- **[PCP](https://datatracker.ietf.org/doc/html/rfc6887)** (Port Control Protocol) - Most reliable
- **[NAT-PMP](https://datatracker.ietf.org/doc/html/rfc6886)** (NAT Port Mapping Protocol) - Second most reliable
- **[UPnP-IGD](https://datatracker.ietf.org/doc/html/rfc6970)** (Universal Plug and Play Internet Gateway Device) - Most widely deployed
**Port Mapping Algorithm:**
```python
def try_port_mapping():
# Step 1: Get the local IPv4 address
local_ip = get_local_ipv4_address()
# Step 2: Get the default gateway IPv4 address
gateway_ip = get_default_gateway_address()
# Step 3: Abort if local or gateway IP could not be determined
if not local_ip or not gateway_ip:
return "Mapping failed: Unable to get local or gateway IPv4"
# Step 4: Probe the gateway for protocol support
supports_pcp = probe_pcp(gateway_ip)
supports_nat_pmp = probe_nat_pmp(gateway_ip)
supports_upnp = probe_upnp(gateway_ip) # Optional for logging
# Step 5-9: Try protocols in order of reliability
# PCP (most reliable) -> NAT-PMP -> UPnP -> fallback attempts
protocols = [
(supports_pcp, try_pcp_mapping),
(supports_nat_pmp, try_nat_pmp_mapping),
(True, try_upnp_mapping), # Always try UPnP
(not supports_pcp, try_pcp_mapping), # Fallback
(not supports_nat_pmp, try_nat_pmp_mapping) # Last resort
]
for supported, mapping_func in protocols:
if supported:
mapping = mapping_func(local_ip, gateway_ip)
if mapping:
return mapping
return "Mapping failed: No protocol succeeded"
```
If mapping succeeds, the node uses [AutoNAT](https://github.com/libp2p/specs/blob/master/autonat/autonat-v2.md) to confirm public reachability. Upon confirmation, the node assumes Public status. Otherwise, it assumes Private status.
**Port Mapping Sequence:**
```mermaid
sequenceDiagram
box Node
participant AutoNAT Client
participant NAT State Machine
participant Port Mapping Client
end
participant Router
alt Mapping is successful
Note left of AutoNAT Client: Phase 2
Port Mapping Client ->> +Router: Requests new mapping
Router ->> Port Mapping Client: Confirms new mapping
Port Mapping Client ->> NAT State Machine: Mapping secured
NAT State Machine ->> AutoNAT Client: Requests confirmation<br/>that mapped address<br/>is publicly reachable
alt Node asserts Public status
AutoNAT Client ->> NAT State Machine: Mapped address<br/>is publicly reachable
Note left of AutoNAT Client: Phase 3<br/>Network Monitoring
else Node asserts Private status
AutoNAT Client ->> NAT State Machine: Mapped address<br/>is not publicly reachable
Note left of AutoNAT Client: Phase 3<br/>Network Monitoring
end
else Mapping fails, node asserts Private status
Note left of AutoNAT Client: Phase 2
Port Mapping Client ->> Router: Requests new mapping
Router ->> Port Mapping Client: Refuses new mapping or Timeout
Port Mapping Client ->> NAT State Machine: Mapping failed
Note left of AutoNAT Client: Phase 3<br/>Network Monitoring
end
```
#### Phase 3: Network Monitoring
Unless explicitly configured, nodes must monitor their network status and restart from Phase 1 when changes are detected.
**Public Node Monitoring:**
A Public node must restart when:
- [AutoNAT](https://github.com/libp2p/specs/blob/master/autonat/autonat-v2.md) client no longer confirms public reachability
- A previously successful port mapping is lost or refresh fails
**Private Node Monitoring:**
A Private node must restart when:
- It gains a new public IP address
- Port mapping is likely to succeed (gateway change, sufficient time passed)
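A non-normative sketch of the Phase 3 monitoring decision follows; the `node` object and its helpers (`autonat_confirms_reachability`, `refresh_port_mapping`, `has_new_public_ip`, `port_mapping_likely_to_succeed`, `restart_from_phase_1`) are illustrative names, not part of this specification.
```python
import time

def monitor(node, check_interval_s=300):
    """Phase 3 sketch: periodically re-evaluate NAT status and restart from Phase 1 on change."""
    while True:
        time.sleep(check_interval_s)
        if node.status == "public":
            # Public nodes restart when public reachability or an existing mapping is lost.
            if not node.autonat_confirms_reachability() or not node.refresh_port_mapping():
                node.restart_from_phase_1()
        else:
            # Private nodes restart when a new public IP appears or mapping may now succeed.
            if node.has_new_public_ip() or node.port_mapping_likely_to_succeed():
                node.restart_from_phase_1()
```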
**Network Monitoring Sequence:**
```mermaid
sequenceDiagram
participant AutoNAT Server
box Node
participant AutoNAT Client
participant NAT State Machine
participant Port Mapping Client
end
participant Router
Note left of AutoNAT Server: Phase 3<br/>Network Monitoring
par Refresh mapping and monitor changes
loop periodically refreshes mapping
Port Mapping Client ->> Router: Requests refresh
Router ->> Port Mapping Client: Confirms mapping refresh
end
break Mapping is lost, the node loses Public status
Router ->> Port Mapping Client: Refresh failed or mapping dropped
Port Mapping Client ->> NAT State Machine: Mapping lost
NAT State Machine ->> NAT State Machine: Restart
end
and Monitor public reachability of mapped addresses
loop periodically checks public reachability
AutoNAT Client ->> AutoNAT Server: Requests dialback
AutoNAT Server ->> AutoNAT Client: Dialback successful
end
break
AutoNAT Server ->> AutoNAT Client: Dialback failed or Timeout
AutoNAT Client ->> NAT State Machine: Public reachability lost
NAT State Machine ->> NAT State Machine: Restart
end
end
Note left of AutoNAT Server: Phase 1
```
### Public Node Responsibilities
**A Public node MUST:**
- Run an [AutoNAT](https://github.com/libp2p/specs/blob/master/autonat/autonat-v2.md) server
- Listen on and advertise via [Identify protocol](https://github.com/libp2p/specs/blob/master/identify/README.md) its publicly reachable [multiaddresses](https://github.com/libp2p/specs/blob/master/addressing/README.md):
`/{public_peer_ip}/udp/{port}/quic-v1/p2p/{public_peer_id}`
- Periodically renew port mappings according to protocol recommendations
- Maintain high availability for [AutoNAT](https://github.com/libp2p/specs/blob/master/autonat/autonat-v2.md) services
### Peer Dialing
Other peers can always dial a Public peer using its publicly reachable [multiaddresses](https://github.com/libp2p/specs/blob/master/addressing/README.md):
`/{public_peer_ip}/udp/{port}/quic-v1/p2p/{public_peer_id}`
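For illustration, a minimal sketch of assembling such a dial address, assuming `public_peer_ip` already carries its multiaddr protocol prefix (e.g. `ip4/...` or `dns/...`); the example values are hypothetical.
```python
def dial_multiaddr(public_peer_ip: str, port: int, public_peer_id: str) -> str:
    """Build the publicly reachable multiaddress used to dial a Public peer.

    `public_peer_ip` is assumed to include its protocol prefix,
    e.g. "ip4/198.51.100.7" or "dns/node.example.org".
    """
    return f"/{public_peer_ip}/udp/{port}/quic-v1/p2p/{public_peer_id}"

# Example (illustrative values):
# dial_multiaddr("ip4/198.51.100.7", 4242, "QmYyQSo1c1Ym7orWxLYvCrM2EmxFTANf8wXmmE7DWjhx5N")
# -> "/ip4/198.51.100.7/udp/4242/quic-v1/p2p/QmYyQSo1c1Ym7orWxLYvCrM2EmxFTANf8wXmmE7DWjhx5N"
```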
## Implementation Requirements
### Mandatory Components
All Nomos nodes MUST implement:
1. **[AutoNAT](https://github.com/libp2p/specs/blob/master/autonat/autonat-v2.md) client** for NAT status detection
2. **Port mapping clients** for [PCP](https://datatracker.ietf.org/doc/html/rfc6887), [NAT-PMP](https://datatracker.ietf.org/doc/html/rfc6886), and [UPnP-IGD](https://datatracker.ietf.org/doc/html/rfc6970)
3. **[Identify protocol](https://github.com/libp2p/specs/blob/master/identify/README.md)** for capability advertisement
4. **Network monitoring** for status change detection
### Optional Enhancements
Nodes MAY implement:
- Custom port mapping retry strategies
- Enhanced network change detection
- Advanced [AutoNAT](https://github.com/libp2p/specs/blob/master/autonat/autonat-v2.md) server load balancing
- Backup connectivity mechanisms
### Configuration Parameters
#### [AutoNAT](https://github.com/libp2p/specs/blob/master/autonat/autonat-v2.md) Configuration
```yaml
autonat:
client:
dial_timeout: 15s
max_peer_addresses: 16
throttle_global_limit: 30
throttle_peer_limit: 3
server:
dial_timeout: 30s
max_peer_addresses: 16
throttle_global_limit: 30
throttle_peer_limit: 3
```
#### Port Mapping Configuration
```yaml
port_mapping:
pcp:
timeout: 30s
lifetime: 7200s # 2 hours
retry_interval: 300s
nat_pmp:
timeout: 30s
lifetime: 7200s
retry_interval: 300s
upnp:
timeout: 30s
lease_duration: 7200s
retry_interval: 300s
```
## Security Considerations
### NAT Traversal Security
1. **Port Mapping Validation**: Verify that requested port mappings are actually created
2. **[AutoNAT](https://github.com/libp2p/specs/blob/master/autonat/autonat-v2.md) Server Trust**: Implement peer reputation for [AutoNAT](https://github.com/libp2p/specs/blob/master/autonat/autonat-v2.md) servers
3. **Gateway Communication**: Secure communication with NAT devices
4. **Address Validation**: Validate public addresses before advertisement
### Privacy Considerations
1. **IP Address Exposure**: Public nodes necessarily expose IP addresses
2. **Traffic Analysis**: Monitor for patterns that could reveal node behavior
3. **Gateway Information**: Minimize exposure of internal network topology
### Denial of Service Protection
1. **[AutoNAT](https://github.com/libp2p/specs/blob/master/autonat/autonat-v2.md) Rate Limiting**: Implement request throttling for [AutoNAT](https://github.com/libp2p/specs/blob/master/autonat/autonat-v2.md) services
2. **Port Mapping Abuse**: Prevent excessive port mapping requests
3. **Resource Exhaustion**: Limit concurrent NAT traversal attempts
## Performance Characteristics
### Scalability
- **[AutoNAT](https://github.com/libp2p/specs/blob/master/autonat/autonat-v2.md) Server Load**: Distributed across Public nodes
- **Port Mapping Overhead**: Minimal ongoing resource usage
- **Network Monitoring**: Efficient periodic checks
### Reliability
- **Fallback Mechanisms**: Multiple protocols ensure high success rates
- **Continuous Monitoring**: Automatic recovery from connectivity loss
- **Protocol Redundancy**: Multiple port mapping protocols increase reliability
## References
1. [Multiaddress spec](https://github.com/libp2p/specs/blob/master/addressing/README.md)
2. [Identify protocol spec](https://github.com/libp2p/specs/blob/master/identify/README.md)
3. [AutoNAT v2 protocol spec](https://github.com/libp2p/specs/blob/master/autonat/autonat-v2.md)
4. [Circuit Relay v2 protocol spec](https://github.com/libp2p/specs/blob/master/relay/circuit-v2.md)
5. [PCP - RFC 6887](https://datatracker.ietf.org/doc/html/rfc6887)
6. [NAT-PMP - RFC 6886](https://datatracker.ietf.org/doc/html/rfc6886)
7. [UPnP IGD - RFC 6970](https://datatracker.ietf.org/doc/html/rfc6970)
## Copyright
Copyright and related rights waived via [CC0](https://creativecommons.org/publicdomain/zero/1.0/).


@@ -0,0 +1,185 @@
---
title: P2P-NETWORK-BOOTSTRAPPING
name: Nomos P2P Network Bootstrapping Specification
status: raw
category: networking
tags: [p2p, networking, bootstrapping, peer-discovery, libp2p]
editor: Daniel Sanchez-Quiros <danielsq@status.im>
contributors:
- Álvaro Castro-Castilla <alvaro@status.im>
- Petar Radovic <petar@status.im>
- Gusto Bacvinka <augustinas@status.im>
- Antonio Antonino <antonio@status.im>
- Youngjoon Lee <youngjoon@status.im>
- Filip Dimitrijevic <filip@status.im>
---
## Introduction
Nomos network bootstrapping is the process by which a new node discovers peers and synchronizes with the existing decentralized network. It ensures that a node can:
1. **Discover Peers**: Find other active nodes in the network.
2. **Establish Connections**: Securely connect to trusted peers.
3. **Negotiate (libp2p) Protocols**: Ensure that other peers support the protocols the node requires.
## Overview
The Nomos P2P network bootstrapping strategy relies on a designated subset of **bootstrap nodes** to facilitate secure and efficient node onboarding. These nodes serve as the initial entry points for new network participants.
### Key Design Principles
#### Trusted Bootstrap Nodes
A curated set of publicly announced and highly available nodes ensures reliability during initial peer discovery. These nodes are configured with elevated connection limits to handle a high volume of incoming bootstrapping requests from new participants.
#### Node Configuration & Onboarding
New node operators must explicitly configure their instances with the addresses of bootstrap nodes. This configuration may be preloaded or dynamically fetched from a trusted source to minimize manual setup.
#### Network Integration
Upon initialization, the node establishes connections with the bootstrap nodes and begins participating in Nomos networking protocols. Through these connections, the node discovers additional peers, synchronizes with the network state, and engages in protocol-specific communication (e.g., consensus, block propagation).
### Security & Decentralization Considerations
**Trust Minimization**: While bootstrap nodes provide initial connectivity, the network rapidly transitions to decentralized peer discovery to prevent over-reliance on any single entity.
**Authenticated Announcements**: The identities and addresses of bootstrap nodes are publicly verifiable to mitigate impersonation attacks. From [libp2p documentation](https://docs.libp2p.io/concepts/transports/quic/#quic-in-libp2p):
> To authenticate each others' peer IDs, peers encode their peer ID into a self-signed certificate, which they sign using their host's private key.
**Dynamic Peer Management**: After bootstrapping, nodes continuously refine their peer lists to maintain a resilient and distributed network topology.
This approach ensures **rapid, secure, and scalable** network participation while preserving the decentralized ethos of the Nomos protocol.
## Protocol
### Protocol Overview
The bootstrapping protocol follows libp2p conventions for peer discovery and connection establishment. Implementation details are handled by the underlying libp2p stack with Nomos-specific configuration parameters.
### Bootstrapping Process
#### Step-by-Step bootstrapping process
1. **Node Initial Configuration**: New nodes load pre-configured bootstrap node addresses. Addresses may be `IP` or `DNS` based, embedded in a compatible [libp2p PeerId multiaddress](https://docs.libp2p.io/concepts/fundamentals/peers/#peer-ids-in-multiaddrs). Node operators may choose to advertise more than one address; how they do so is out of scope for this protocol. For example:
`/ip4/198.51.100.0/udp/4242/p2p/QmYyQSo1c1Ym7orWxLYvCrM2EmxFTANf8wXmmE7DWjhx5N` or
`/dns/foo.bar.net/udp/4242/p2p/QmYyQSo1c1Ym7orWxLYvCrM2EmxFTANf8wXmmE7DWjhx5N`
2. **Secure Connection**: The node establishes connections to the bootstrap nodes' announced addresses and verifies network identity and protocol compatibility.
3. **Peer Discovery**: The node requests and receives validated peer lists from bootstrap nodes. Each entry includes connectivity details, as provided by the peer discovery protocol that engages after the initial connection.
4. **Network Integration**: The node iteratively connects to discovered peers, gradually building its set of peer connections.
5. **Protocol Engagement**: The node establishes the required protocol channels (gossip/consensus/sync) and begins participating in network operations.
6. **Ongoing Maintenance**: The node continuously evaluates and refreshes peer connections, ideally removing the connection to the bootstrap node itself. Bootstrap nodes may choose to close the connection on their side to preserve availability for other nodes. (A minimal sketch of this flow follows the sequence diagram below.)
```mermaid
sequenceDiagram
participant Nomos Network
participant Node
participant Bootstrap Node
Node->>Node: Fetches bootstrapping addresses
loop Interacts with bootstrap node
Node->>+Bootstrap Node: Connects
Bootstrap Node->>-Node: Sends discovered peers information
end
loop Connects to Network participants
Node->>Nomos Network: Engages in connections
Node->>Nomos Network: Negotiates protocols
end
loop Ongoing maintenance
Node-->>Nomos Network: Evaluates peer connections
alt Bootstrap connection no longer needed
Node-->>Bootstrap Node: Disconnects
else Bootstrap enforces disconnection
Bootstrap Node-->>Node: Disconnects
end
end
```
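A minimal, non-normative sketch of the flow above; the helpers passed in (`dial`, `request_peers`, `negotiate_protocols`) are illustrative names and not part of this specification.
```python
def bootstrap(bootstrap_addrs, dial, request_peers, negotiate_protocols, min_peers=8):
    """Sketch of the bootstrapping flow, assuming hypothetical libp2p-style helpers.

    `bootstrap_addrs` is the pre-configured list of bootstrap multiaddresses,
    e.g. "/dns/foo.bar.net/udp/4242/p2p/QmYyQSo1c1Ym7orWxLYvCrM2EmxFTANf8wXmmE7DWjhx5N".
    """
    known_peers = set()
    # Steps 2-3: connect to bootstrap nodes and collect validated peer lists.
    for addr in bootstrap_addrs:
        conn = dial(addr)                        # secure QUIC connection, identity verified
        known_peers.update(request_peers(conn))  # peer discovery engages after the connection
    # Steps 4-5: iteratively connect to discovered peers and negotiate protocols.
    connections = []
    for peer in known_peers:
        if len(connections) >= min_peers:
            break
        conn = dial(peer)
        negotiate_protocols(conn)                # gossip / consensus / sync channels
        connections.append(conn)
    # Step 6: bootstrap connections can now be dropped as part of ongoing maintenance.
    return connections
```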
## Implementation Details
The bootstrapping process for the Nomos p2p network uses the **QUIC** transport as specified in the Nomos network specification.
Bootstrapping is separate from the network's peer discovery protocol: it assumes that a discovery protocol engages as soon as the connection to the bootstrap node is established. The Nomos network currently uses `kademlia` as its first peer discovery approach, so this assumption holds.
### Bootstrap Node Requirements
Bootstrap nodes MUST fulfill the following requirements:
- **High Availability**: Maintain uptime of 99.5% or higher
- **Connection Capacity**: Support minimum 1000 concurrent connections
- **Geographic Distribution**: Deploy across multiple regions
- **Protocol Compatibility**: Support all required Nomos network protocols
- **Security**: Implement proper authentication and rate limiting
### Network Configuration
Bootstrap node addresses are distributed through:
- **Hardcoded addresses** in node software releases
- **DNS seeds** for dynamic address resolution
- **Community-maintained lists** with cryptographic verification
## Security Considerations
### Trust Model
Bootstrap nodes operate under a **minimal trust model**:
- Nodes verify peer identities through cryptographic authentication
- Bootstrap connections are temporary and replaced by organic peer discovery
- No single bootstrap node can control network participation
### Attack Mitigation
**Sybil Attack Protection**: Bootstrap nodes implement connection limits and peer verification to prevent malicious flooding.
**Eclipse Attack Prevention**: Nodes connect to multiple bootstrap nodes and rapidly diversify their peer connections.
**Denial of Service Resistance**: Rate limiting and connection throttling protect bootstrap nodes from resource exhaustion attacks.
## Performance Characteristics
### Bootstrapping Metrics
- **Initial Connection Time**: Target < 30 seconds to first bootstrap node
- **Peer Discovery Duration**: Discover minimum viable peer set within 2 minutes
- **Network Integration**: Full protocol engagement within 5 minutes
### Resource Requirements
#### Bootstrap Nodes
- Memory: Minimum 4GB RAM
- Bandwidth: 100 Mbps sustained
- Storage: 50GB available space
#### Regular Nodes
- Memory: 512MB for bootstrapping process
- Bandwidth: 10 Mbps during initial sync
- Storage: Minimal requirements
## References
- P2P Network Specification (internal document)
- [libp2p QUIC Transport](https://docs.libp2p.io/concepts/transports/quic/)
- [libp2p Peer IDs and Addressing](https://docs.libp2p.io/concepts/fundamentals/peers/)
- [Ethereum bootnodes](https://ethereum.org/en/developers/docs/nodes-and-clients/bootnodes/)
- [Bitcoin peer discovery](https://developer.bitcoin.org/devguide/p2p_network.html#peer-discovery)
- [Cardano nodes connectivity](https://docs.cardano.org/stake-pool-operators/node-connectivity)
- [Cardano peer sharing](https://www.coincashew.com/coins/overview-ada/guide-how-to-build-a-haskell-stakepool-node/part-v-tips/implementing-peer-sharing)
## Copyright
Copyright and related rights waived via [CC0](https://creativecommons.org/publicdomain/zero/1.0/).

nomos/raw/p2p-network.md Normal file

@@ -0,0 +1,307 @@
---
title: NOMOS-P2P-NETWORK
name: Nomos P2P Network Specification
status: draft
category: networking
tags: [p2p, networking, libp2p, kademlia, gossipsub, quic]
editor: Daniel Sanchez-Quiros <danielsq@status.im>
contributors:
- Filip Dimitrijevic <filip@status.im>
---
## Abstract
This specification defines the peer-to-peer (P2P) network layer for Nomos blockchain nodes. The network serves as the comprehensive communication infrastructure enabling transaction dissemination through mempool and block propagation. The specification leverages established libp2p protocols to ensure robust, scalable performance with low bandwidth requirements and minimal latency while maintaining accessibility for diverse hardware configurations and network environments.
## Motivation
The Nomos blockchain requires a reliable, scalable P2P network that can:
1. **Support diverse hardware**: From laptops to dedicated servers across various operating systems and geographic locations
2. **Enable inclusive participation**: Allow non-technical users to operate nodes with minimal configuration
3. **Maintain connectivity**: Ensure nodes remain reachable even with limited connectivity or behind NAT/routers
4. **Scale efficiently**: Support large-scale networks (10k+ nodes) with eventual consistency
5. **Provide low-latency communication**: Enable efficient transaction and block propagation
## Specification
### Network Architecture Overview
The Nomos P2P network addresses three critical challenges:
- **Peer Connectivity**: Mechanisms for peers to join and connect to the network
- **Peer Discovery**: Enabling peers to locate and identify network participants
- **Message Transmission**: Facilitating efficient message exchange across the network
### Transport Protocol
#### QUIC Protocol Transport
The Nomos network employs **[QUIC protocol](https://docs.libp2p.io/concepts/transports/quic/)** as the primary transport protocol, leveraging the [libp2p protocol](https://docs.libp2p.io/) implementation.
**Rationale for [QUIC protocol](https://docs.libp2p.io/concepts/transports/quic/):**
- Rapid connection establishment
- Enhanced NAT traversal capabilities (UDP-based)
- Built-in multiplexing simplifies configuration
- Production-tested reliability
### Peer Discovery
#### Kademlia DHT
The network utilizes libp2p's Kademlia Distributed Hash Table (DHT) for peer discovery.
**Protocol Identifiers:**
- **Mainnet**: `/nomos/kad/1.0.0`
- **Testnet**: `/nomos-testnet/kad/1.0.0`
**Features:**
- Proximity-based peer discovery heuristics
- Distributed peer routing table
- Resilient to network partitions
- Automatic peer replacement
#### Identify Protocol
Complements Kademlia by enabling peer information exchange.
**Protocol Identifiers:**
- **Mainnet**: `/nomos/identify/1.0.0`
- **Testnet**: `/nomos-testnet/identify/1.0.0`
**Capabilities:**
- Protocol support advertisement
- Peer capability negotiation
- Network interoperability enhancement
#### Future Considerations
The current Kademlia implementation is acknowledged as interim. Future improvements target:
- Lightweight design without full DHT overhead
- Highly-scalable eventual consistency
- Support for 10k+ nodes with minimal resource usage
### NAT Traversal
The network implements comprehensive NAT traversal solutions to ensure connectivity across diverse network configurations.
**Objectives:**
- Configuration-free peer connections
- Support for users with varying technical expertise
- Enable nodes on standard consumer hardware
**Implementation:**
- Tailored solutions based on user network configuration
- Automatic NAT type detection and adaptation
- Fallback mechanisms for challenging network environments
*Note: Detailed NAT traversal specifications are maintained in a separate document.*
### Message Dissemination
#### Gossipsub Protocol
Nomos employs **gossipsub** for reliable message propagation across the network.
**Integration:**
- Seamless integration with Kademlia peer discovery
- Automatic peer list updates
- Efficient message routing and delivery
#### Topic Configuration
**Mempool Dissemination:**
- **Mainnet**: `/nomos/mempool/0.1.0`
- **Testnet**: `/nomos-testnet/mempool/0.1.0`
**Block Propagation:**
- **Mainnet**: `/nomos/cryptarchia/0.1.0`
- **Testnet**: `/nomos-testnet/cryptarchia/0.1.0`
#### Network Parameters
**Peering Degree:**
- **Minimum recommended**: 8 peers
- **Rationale**: Ensures redundancy and efficient propagation
- **Configurable**: Nodes may adjust based on resources and requirements
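As an illustration, the sketch below groups the topic identifiers from this specification per network; the `gossipsub.subscribe` call is a hypothetical stand-in for whatever pubsub API an implementation exposes, not a description of a concrete library.
```python
# Topic strings are taken from this specification.
GOSSIPSUB_TOPICS = {
    "mainnet": ["/nomos/mempool/0.1.0", "/nomos/cryptarchia/0.1.0"],
    "testnet": ["/nomos-testnet/mempool/0.1.0", "/nomos-testnet/cryptarchia/0.1.0"],
}

MIN_PEERING_DEGREE = 8  # minimum recommended mesh degree

def join_topics(gossipsub, network="mainnet"):
    """Subscribe to the mempool and block propagation topics for the chosen network.

    `gossipsub` is assumed to expose a `subscribe(topic)` method; this is a sketch only.
    """
    for topic in GOSSIPSUB_TOPICS[network]:
        gossipsub.subscribe(topic)
```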
### Bootstrapping
#### Initial Network Entry
New nodes connect to the network through designated bootstrap nodes.
**Process:**
1. Connect to known bootstrap nodes
2. Obtain initial peer list through Kademlia
3. Establish gossipsub connections
4. Begin participating in network protocols
**Bootstrap Node Requirements:**
- High availability and reliability
- Geographic distribution
- Version compatibility maintenance
### Message Encoding
All network messages follow the Nomos Wire Format specification for consistent encoding and decoding across implementations.
**Key Properties:**
- Deterministic serialization
- Efficient binary encoding
- Forward/backward compatibility support
- Cross-platform consistency
*Note: Detailed wire format specifications are maintained in a separate document.*
## Implementation Requirements
### Mandatory Protocols
All Nomos nodes MUST implement:
1. **Kademlia DHT** for peer discovery
2. **Identify protocol** for peer information exchange
3. **Gossipsub** for message dissemination
### Optional Enhancements
Nodes MAY implement:
- Advanced NAT traversal techniques
- Custom peering strategies
- Enhanced message routing optimizations
### Network Versioning
Protocol versions follow semantic versioning:
- **Major version**: Breaking protocol changes
- **Minor version**: Backward-compatible enhancements
- **Patch version**: Bug fixes and optimizations
## Configuration Parameters
### Implementation Note
**Current Status**: The Nomos P2P network implementation uses hardcoded libp2p protocol parameters for optimal performance and reliability. While the node configuration file (`config.yaml`) contains network-related settings, the core libp2p protocol parameters (Kademlia DHT, Identify, and Gossipsub) are embedded in the source code.
### Node Configuration
The following network parameters are configurable via `config.yaml`:
#### Network Backend Settings
```yaml
network:
backend:
host: 0.0.0.0
port: 3000
node_key: <node_private_key>
initial_peers: []
```
#### Protocol-Specific Topics
**Mempool Dissemination:**
- **Mainnet**: `/nomos/mempool/0.1.0`
- **Testnet**: `/nomos-testnet/mempool/0.1.0`
**Block Propagation:**
- **Mainnet**: `/nomos/cryptarchia/0.1.0`
- **Testnet**: `/nomos-testnet/cryptarchia/0.1.0`
### Hardcoded Protocol Parameters
The following libp2p protocol parameters are currently hardcoded in the implementation:
#### Peer Discovery Parameters
- **Protocol identifiers** for Kademlia DHT and Identify protocols
- **DHT routing table** configuration and query timeouts
- **Peer discovery intervals** and connection management
#### Message Dissemination Parameters
- **Gossipsub mesh parameters** (peer degree, heartbeat intervals)
- **Message validation** and caching settings
- **Topic subscription** and fanout management
#### Rationale for Hardcoded Parameters
1. **Network Stability**: Prevents misconfigurations that could fragment the network
2. **Performance Optimization**: Parameters are tuned for the target network size and latency requirements
3. **Security**: Reduces attack surface by limiting configurable network parameters
4. **Simplicity**: Eliminates need for operators to understand complex P2P tuning
## Security Considerations
### Network-Level Security
1. **Peer Authentication**: Utilize libp2p's built-in peer identity verification
2. **Message Validation**: Implement application-layer message validation
3. **Rate Limiting**: Protect against spam and DoS attacks
4. **Blacklisting**: Mechanism for excluding malicious peers
### Privacy Considerations
1. **Traffic Analysis**: Gossipsub provides some resistance to traffic analysis
2. **Metadata Leakage**: Minimize identifiable information in protocol messages
3. **Connection Patterns**: Randomize connection timing and patterns
### Denial of Service Protection
1. **Resource Limits**: Impose limits on connections and message rates
2. **Peer Scoring**: Implement reputation-based peer management
3. **Circuit Breakers**: Automatic protection against resource exhaustion
### Node Configuration Example
[Nomos Node Configuration](https://github.com/logos-co/nomos/blob/master/nodes/nomos-node/config.yaml) provides an example node configuration file.
## Performance Characteristics
### Scalability
- **Target Network Size**: 10,000+ nodes
- **Message Latency**: Sub-second for critical messages
- **Bandwidth Efficiency**: Optimized for limited bandwidth environments
### Resource Requirements
- **Memory Usage**: Minimal DHT routing table overhead
- **CPU Usage**: Efficient cryptographic operations
- **Network Bandwidth**: Adaptive based on node role and capacity
## References
Original working document, from Nomos Notion: [P2P Network Specification](https://nomos-tech.notion.site/P2P-Network-Specification-206261aa09df81db8100d5f410e39d75).
1. [libp2p Specifications](https://docs.libp2p.io/)
2. [QUIC Protocol Specification](https://docs.libp2p.io/concepts/transports/quic/)
3. [Kademlia DHT](https://docs.libp2p.io/concepts/discovery-routing/kaddht/)
4. [Gossipsub Protocol](https://github.com/libp2p/specs/tree/master/pubsub/gossipsub)
5. [Identify Protocol](https://github.com/libp2p/specs/blob/master/identify/README.md)
6. [Nomos Implementation](https://github.com/logos-co/nomos) - Reference implementation and source code
7. [Nomos Node Configuration](https://github.com/logos-co/nomos/blob/master/nodes/nomos-node/config.yaml) - Example node configuration
## Copyright
Copyright and related rights waived via [CC0](https://creativecommons.org/publicdomain/zero/1.0/).

nomos/raw/sdp.md Normal file

@@ -0,0 +1,345 @@
---
title: NOMOS-SDP
name: Nomos Service Declaration Protocol Specification
status: raw
category:
tags: participation, validators, declarations
editor: Marcin Pawlowski <marcin@status.im>
contributors:
- Mehmet <mehmet@status.im>
- Daniel Sanchez Quiros <danielsq@status.im>
- Álvaro Castro-Castilla <alvaro@status.im>
- Thomas Lavaur <thomaslavaur@status.im>
- Filip Dimitrijevic <filip@status.im>
- Gusto Bacvinka <augustinas@status.im>
- David Rusu <davidrusu@status.im>
---
## Introduction
This document defines a mechanism enabling validators to declare their participation in specific protocols that require a known and agreed-upon list of participants. Some examples of this are Data Availability and the Blend Network. We create a single repository of identifiers which is used to establish secure communication between validators and provide services. Before being admitted to the repository, the validator proves that it locked at least a minimum stake.
## Requirements
The requirements for the protocol are defined as follows:
- A declaration must be backed by a confirmation that the sender of the declaration owns a certain value of the stake.
- A declaration is valid until it is withdrawn or is not used for a service-specific amount of time.
## Overview
The SDP enables nodes to declare their eligibility to serve a specific service in the system, and withdraw their declarations.
### Protocol Actions
The protocol defines the following actions:
- **Declare**: A node sends a declaration that confirms its willingness to provide a specific service, which is confirmed by locking a threshold of stake.
- **Active**: A node marks that its participation in the protocol is active according to the service-specific activity logic. This action enables the protocol to monitor the node's activity. We utilize this as a non-intrusive differentiator of node activity. It is crucial to exclude inactive nodes from the set of active nodes, as it enhances the stability of services.
- **Withdraw**: A node withdraws its declaration and stops providing a service.
The logic of the protocol is straightforward:
1. A node sends a declaration message for a specific service and proves it has a minimum stake.
2. The declaration is registered on the ledger, and the node can commence its service according to the service-specific service logic.
3. After a service-specific service-providing time, the node confirms its activity.
4. The node must confirm its activity with a service-specific minimum frequency; otherwise, its declaration is inactive.
5. After the service-specific locking period, the node can send a withdrawal message, and its declaration is removed from the ledger, which means that the node will no longer provide the service.
💡 The protocol messages are subject to finality, which means they become part of the immutable ledger only after a delay. The length of this delay is defined by the consensus.
## Construction
In this section, we present the main constructions of the protocol. First, we start with data definitions. Second, we describe the protocol actions. Finally, we present part of the Bedrock Mantle design responsible for storing and processing SDP-related messages and data.
### Data
In this section, we discuss and define data types, messages, and their storage.
#### Service Types
We define the following services which can be used for service declaration:
- `BN`: for Blend Network service.
- `DA`: for Data Availability service.
```python
class ServiceType(Enum):
BN="BN" # Blend Network
DA="DA" # Data Availability
```
A declaration can be generated for any of the services above. Any declaration that is not one of the above must be rejected. The number of services might grow in the future.
#### Minimum Stake
The minimum stake is a global value that defines the minimum stake a node must have to perform any service.
The `MinStake` is a structure that holds the value of the stake `stake_threshold` and the block number it was set at: `timestamp`.
```python
class MinStake:
stake_threshold: StakeThreshold
timestamp: BlockNumber
```
The `stake_thresholds` is a structure aggregating all defined `MinStake` values.
```python
stake_thresholds: list[MinStake]
```
For more information on how the minimum stake is calculated, please refer to the Nomos documentation.
#### Service Parameters
The service parameters structure defines the parameters set necessary for correctly handling interaction between the protocol and services. Each of the service types defined above must be mapped to a set of the following parameters:
- `session_length` defines the session length expressed as the number of blocks; the sessions are counted from block `timestamp`.
- `lock_period` defines the minimum time (as a number of sessions) during which the declaration cannot be withdrawn; this time must include the period necessary for finalizing the declaration (which might be implicit) and provision of a service for at least a single session; it can be expressed as the number of blocks by multiplying its value by the `session_length`.
- `inactivity_period` defines the maximum time (as a number of sessions) during which an activation message must be sent; otherwise, the declaration is considered inactive; it can be expressed as the number of blocks by multiplying its value by the `session_length`.
- `retention_period` defines the time (as a number of sessions) after which the declaration can be safely deleted by the Garbage Collection mechanism; it can be expressed as the number of blocks by multiplying its value by the `session_length`.
- `timestamp` defines the block number at which the parameter was set.
```python
class ServiceParameters:
session_length: NumberOfBlocks
lock_period: NumberOfSessions
inactivity_period: NumberOfSessions
retention_period: NumberOfSessions
timestamp: BlockNumber
```
The `parameters` is a structure aggregating all defined `ServiceParameters` values.
```python
parameters: list[ServiceParameters]
```
#### Identifiers
We define the following set of identifiers which are used for service-specific cryptographic operations:
- `provider_id`: used to sign the SDP messages and to establish secure links between validators; it is `Ed25519PublicKey`.
- `zk_id`: used for zero-knowledge operations by the validator that includes rewarding ([Zero Knowledge Signature Scheme (ZkSignature)](https://www.notion.so/Zero-Knowledge-Signature-Scheme-ZkSignature-21c261aa09df8119bfb2dc74a3430df6?pvs=21)).
#### Locators
A `Locator` is the address of a validator which is used to establish secure communication between validators. It follows the [multiaddr addressing scheme from libp2p](https://docs.libp2p.io/concepts/fundamentals/addressing/), but it must contain only the location part and must not contain the node identity (`peer_id`).
The `provider_id` must be used as the node identity. Therefore, the `Locator` must be completed by adding the `provider_id` at the end of it, which makes the `Locator` usable in the context of libp2p.
The length of the `Locator` is restricted to 329 characters.
The syntax of every `Locator` entry must be validated.
**A common formatting of every** `Locator` **must be applied to keep it unambiguous, so that deterministic ID generation works consistently.** At a minimum, the `Locator` must contain only lower-case letters, and every part of the address must be explicit (no implicit defaults).
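A minimal, non-normative sketch of these checks is shown below; the exact normalization procedure is not fixed by this document, so the helper names and the textual encoding of `provider_id` are assumptions for illustration only.
```python
MAX_LOCATOR_LENGTH = 329

def validate_locator(locator: str) -> bool:
    """Illustrative Locator checks: lower-case only, length-bounded, no peer identity.

    This is a sketch of the rules above, not a complete multiaddr validator.
    """
    if len(locator) > MAX_LOCATOR_LENGTH:
        return False
    if locator != locator.lower():
        return False                 # only lower-case letters are allowed
    if "/p2p/" in locator:
        return False                 # the node identity must not be part of the Locator
    return locator.startswith("/")   # must be a multiaddr-style location part

def complete_locator(locator: str, provider_id_text: str) -> str:
    """Append the provider_id so the Locator becomes usable in a libp2p context.

    `provider_id_text` is an assumed textual encoding of the Ed25519 provider_id.
    """
    return f"{locator}/p2p/{provider_id_text}"
```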
#### Declaration Message
The construction of the declaration message is as follows:
```python
class DeclarationMessage:
service_type: ServiceType
locators: list[Locator]
provider_id: Ed25519PublicKey
zk_id: ZkPublicKey
```
The `locators` list length must be limited to reduce the potential for abuse. Therefore, the length of the list cannot be longer than 8.
The message must be signed by the `provider_id` key to prove ownership of the key that is used for network-level authentication of the validator. The message is also signed by the `zk_id` key (by default all Mantle transactions are signed with `zk_id` key).
#### Declaration Storage
Only valid declaration messages can be stored on the ledger. We define the `DeclarationInfo` as follows:
```python
class DeclarationInfo:
service: ServiceType
provider_id: Ed25519PublicKey
zk_id: ZkPublicKey
locators: list[Locator]
created: BlockNumber
active: BlockNumber
withdrawn: BlockNumber
nonce: Nonce
```
Where:
- `service` defines the service type of the declaration;
- `provider_id` is an `Ed25519PublicKey` used to sign the message by the validator;
- `zk_id` is used for zero-knowledge operations by the validator that includes rewarding ([Zero Knowledge Signature Scheme (ZkSignature)](https://www.notion.so/Zero-Knowledge-Signature-Scheme-ZkSignature-21c261aa09df8119bfb2dc74a3430df6?pvs=21));
- `locators` are a copy of the `locators` from the `DeclarationMessage`;
- `created` refers to the block number of the block that contained the declaration;
- `active` refers to the latest block number for which the active message was sent (it is set to `created` by default);
- `withdrawn` refers to the block number for which the service declaration was withdrawn (it is set to 0 by default).
- The `nonce` must be set to 0 for the declaration message and must increase monotonically with every subsequent message sent for the `declaration_id`.
We also define the `declaration_id` (of a `DeclarationId` type) that is the unique identifier of `DeclarationInfo`, calculated as a hash of the concatenation of `service`, `provider_id`, `zk_id`, and `locators`. The implementation of the hash function is `blake2b` using 256 bits of the output.
```python
declaration_id = Hash(service||provider_id||zk_id||locators)
```
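For illustration, a sketch of this computation using `blake2b` with a 256-bit output; the byte-level serialization of each field (raw key bytes for the identifiers, UTF-8 for the locators) is an assumption, as this document only fixes the hash function and the field order.
```python
import hashlib

def declaration_id(service: bytes, provider_id: bytes, zk_id: bytes, locators: list[str]) -> bytes:
    """Sketch of declaration_id = blake2b-256(service || provider_id || zk_id || locators)."""
    h = hashlib.blake2b(digest_size=32)  # 256-bit output
    h.update(service)
    h.update(provider_id)
    h.update(zk_id)
    for locator in locators:
        h.update(locator.encode("utf-8"))  # assumed UTF-8 encoding of each Locator
    return h.digest()
```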
The `declaration_id` is not stored as part of the `DeclarationInfo` but it is used to index it.
All `DeclarationInfo` references are stored in the `declarations` and are indexed by `declaration_id`.
```python
declarations: list[declaration_id]
```
#### Active Message
The construction of the active message is as follows:
```python
class ActiveMessage:
declaration_id: DeclarationId
nonce: Nonce
metadata: Metadata
```
where `metadata` is a service-specific node activeness metadata.
The message must be signed by the `zk_id` key associated with the `declaration_id`.
The `nonce` must increase monotonically by every message sent for the `declaration_id`.
#### Withdraw Message
The construction of the withdraw message is as follows:
```python
class WithdrawMessage:
declaration_id: DeclarationId
nonce: Nonce
```
The message must be signed by the `zk_id` key from the `declaration_id`.
The `nonce` must increase monotonically by every message sent for the `declaration_id`.
#### Indexing
Every event must be correctly indexed to enable lighter synchronization of the changes. Therefore, we index every `declaration_id` according to `EventType`, `ServiceType`, and `Timestamp`. Where `EventType = { "created", "active", "withdrawn" }` follows the type of the message.
```python
events = {
event_type: {
service_type: {
timestamp: {
declarations: list[declaration_id]
}
}
}
}
```
### Protocol
#### Declare
The Declare action associates a validator with a service it wants to provide. It requires sending a valid `DeclarationMessage` (as defined in Declaration Message), which is then processed (as defined below) and stored (as defined in Declaration Storage).
The declaration message is considered valid when all of the following are met:
- The sender meets the stake requirements.
- The `declaration_id` is unique.
- The sender knows the secret behind the `provider_id` identifier.
- The length of the `locators` list must not be longer than 8.
- The `nonce` is increasing monotonically.
If all of the above conditions are fulfilled, then the message is stored on the ledger; otherwise, the message is discarded.
#### Active
The Active action enables marking the provider as actively providing a service. It requires sending a valid `ActiveMessage` (as defined in Active Message), which is relayed to the service-specific node activity logic (as indicated by the service type in Common SDP Structures).
The Active action updates the `active` value of the `DeclarationInfo`, which means that it also activates inactive (but not expired) providers.
The SDP active action logic is:
1. A node sends an `ActiveMessage` transaction.
2. The `ActiveMessage` is verified by the SDP logic.
a. The `declaration_id` returns an existing `DeclarationInfo`.
b. The transaction containing `ActiveMessage` is signed by the `zk_id`.
c. The `withdrawn` from the `DeclarationInfo` is set to zero.
d. The `nonce` is increasing monotonically.
3. If any of these conditions fail, discard the message and stop processing.
4. The message is processed by the service-specific activity logic alongside the `active` value indicating the period since the last active message was sent. The `active` value comes from the `DeclarationInfo`.
5. If the service-specific activity logic approves the node active message, then the `active` field of the `DeclarationInfo` is set to the current block height.
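A non-normative sketch of this verification, assuming a hypothetical `ledger` lookup keyed by `declaration_id`, a precomputed `zk_id` signature check, and a service-specific activity hook:
```python
def process_active_message(msg, signature_valid, ledger, service_activity_logic, current_block):
    """Sketch of the Active action: verify the message, then delegate to the service logic.

    `ledger` maps declaration_id -> DeclarationInfo; `service_activity_logic` is the
    hypothetical service-specific hook; `signature_valid` is the result of checking the
    zk_id signature on the enclosing transaction.
    """
    info = ledger.get(msg.declaration_id)
    # Steps 2a-2d: declaration exists, signed by zk_id, not withdrawn, nonce increases.
    if info is None or not signature_valid:
        return False
    if info.withdrawn != 0 or msg.nonce <= info.nonce:
        return False
    # Step 4: the service-specific logic sees how long ago the last active message was.
    if not service_activity_logic(msg.metadata, last_active=info.active):
        return False
    # Step 5: record activity at the current block height.
    info.active = current_block
    info.nonce = msg.nonce
    return True
```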
#### Withdraw
The withdraw action enables a withdrawal of a service declaration. It requires sending a valid `WithdrawMessage` (as defined in Withdraw Message). The withdrawal cannot happen before the end of the locking period, which is defined as the number of blocks counted since `created`. This lock period is stored as `lock_period` in the Service Parameters.
The logic of the withdraw action is:
1. A node sends a `WithdrawMessage` transaction.
2. The `WithdrawMessage` is verified by the SDP logic:
a. The `declaration_id` returns an existing `DeclarationInfo`.
b. The transaction containing `WithdrawMessage` is signed by the `zk_id`.
c. The `withdrawn` from `DeclarationInfo` is set to zero.
d. The `nonce` is increasing monotonically.
3. If any of the above is not correct, then discard the message and stop.
4. Set the `withdrawn` from the `DeclarationInfo` to the current block height.
5. Unlock the stake.
#### Garbage Collection
The protocol requires a garbage collection mechanism that periodically removes unused `DeclarationInfo` entries.
The logic of garbage collection is:
For every `DeclarationInfo` in the `declarations` set, remove the entry if either:
1. The declaration has been withdrawn and is past the retention period: `withdrawn + retention_period < current_block_height`.
2. The entry is inactive beyond the inactivity and retention periods: `active + inactivity_period + retention_period < current_block_height`.
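A sketch of this check follows; it assumes the session-denominated periods are first converted to blocks via `session_length` (as described under Service Parameters) and that condition 1 applies only to declarations that have actually been withdrawn.
```python
def should_collect(info, params, current_block_height):
    """Return True if a DeclarationInfo can be removed by garbage collection.

    `params` is the ServiceParameters entry for info.service; retention and inactivity
    periods are given in sessions and converted to blocks via session_length.
    """
    retention_blocks = params.retention_period * params.session_length
    inactivity_blocks = params.inactivity_period * params.session_length
    # Condition 1: withdrawn and past the retention period.
    if info.withdrawn != 0 and info.withdrawn + retention_blocks < current_block_height:
        return True
    # Condition 2: inactive beyond the inactivity and retention periods.
    if info.active + inactivity_blocks + retention_blocks < current_block_height:
        return True
    return False
```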
#### Query
The protocol must enable querying the ledger in at least the following manner:
- `GetAllProviderId(timestamp)`, returns all `provider_id`s associated with the `timestamp`.
- `GetAllProviderIdSince(timestamp)`, returns all `provider_id`s since the `timestamp`.
- `GetAllDeclarationInfo(timestamp)`, returns all `DeclarationInfo` entries associated with the `timestamp`.
- `GetAllDeclarationInfoSince(timestamp)`, returns all `DeclarationInfo` entries since the `timestamp`.
- `GetDeclarationInfo(provider_id)`, returns the `DeclarationInfo` entry identified by the `provider_id`.
- `GetDeclarationInfo(declaration_id)`, returns the `DeclarationInfo` entry identified by the `declaration_id`.
- `GetAllServiceParameters(timestamp)`, returns all entries of the `ServiceParameters` store for the requested `timestamp`.
- `GetAllServiceParametersSince(timestamp)`, returns all entries of the `ServiceParameters` store since the requested `timestamp`.
- `GetServiceParameters(service_type, timestamp)`, returns the service parameter entry from the `ServiceParameters` store of a `service_type` for a specified `timestamp`.
- `GetMinStake(timestamp)`, returns the `MinStake` structure at the requested `timestamp`.
- `GetMinStakeSince(timestamp)`, returns a set of `MinStake` structures since the requested `timestamp`.
The query must return an error if the retention period for the declaration has passed and the requested information is no longer available.
The list of queries may be extended.
Every query must return information for a finalized state only.
### Mantle and ZK Proof
For more information about the Mantle and ZK proofs, please refer to [Mantle Specification](https://www.notion.so/Mantle-Specification-21c261aa09df810c8820fab1d78b53d9?pvs=21).
## Appendix
### Future Improvements
Refer to the [Mantle Specification](https://www.notion.so/Mantle-Specification-21c261aa09df810c8820fab1d78b53d9?pvs=21) for a list of potential improvements to the protocol.
## References
- Mantle and ZK Proof: [Mantle Specification](https://www.notion.so/Mantle-Specification-21c261aa09df810c8820fab1d78b53d9?pvs=21)
- Ed25519 Digital Signatures: [RFC 8032](https://datatracker.ietf.org/doc/html/rfc8032)
- BLAKE2b Cryptographic Hash: [RFC 7693](https://datatracker.ietf.org/doc/html/rfc7693)
- libp2p Multiaddr: [Addressing Specification](https://docs.libp2p.io/concepts/fundamentals/addressing/)
- Zero Knowledge Signatures: [ZkSignature Scheme](https://www.notion.so/Zero-Knowledge-Signature-Scheme-ZkSignature-21c261aa09df8119bfb2dc74a3430df6?pvs=21)
## Copyright
Copyright and related rights waived via [CC0](https://creativecommons.org/publicdomain/zero/1.0/).


@@ -0,0 +1,252 @@
---
title: HASHGRAPHLIKE CONSENSUS
name: Hashgraphlike Consensus Protocol
status: raw
category: Standards Track
tags:
editor: Ugur Sen <ugur@status.im>
contributors:
- seemenkina <ekaterina@status.im>
---
## Abstract
This document specifies a scalable, decentralized, and Byzantine Fault Tolerant (BFT)
consensus mechanism inspired by Hashgraph, designed for binary decision-making in P2P networks.
## Motivation
Consensus is one of the essential components of decentralization.
In particular, in decentralized group messaging applications it is used for
binary decision-making to govern the group.
Therefore, each user contributes to the decision-making process.
Besides achieving decentralization, the consensus mechanism MUST be strong and scalable:
- Under the assumption of at least `2/3` honest users in the network,
each user MUST conclude the same decision.
- Message propagation in the network MUST occur within `O(log n)` rounds,
where `n` is the total number of peers,
in order to preserve the scalability of the messaging application.
## Format Specification
The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”,
“SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” in this document
are to be interpreted as described in [2119](https://www.ietf.org/rfc/rfc2119.txt).
## Flow
Any user in the group initializes the consensus by creating a proposal.
Next, the user broadcasts the proposal to the whole network.
Upon receiving the proposal, each user validates it
and adds its vote (yes or no) along with its signature and timestamp.
The user then sends the proposal and vote to a random peer in a P2P setup,
or to a subscribed gossipsub channel if gossip-based messaging is used.
Therefore, each user first validates the signatures and then adds its new vote.
Each message sent counts as a round.
After `log(n)` rounds, all users in the network have the others' votes,
provided at least `2/3` of the users are honest, where honesty means following the protocol.
In general, the voting-based consensus consists of the following phases:
1. Initialization of voting
2. Exchanging votes across the rounds
3. Counting the votes
### Assumptions
- The users in the P2P network can discover other nodes, or they subscribe to the same channel in a gossipsub setup.
- We MAY have non-reliable (silent) nodes.
- Proposal owners MUST know the number of voters.
## 1. Initialization of voting
A user initializes the voting with the proposal payload which is
implemented using [protocol buffers v3](https://protobuf.dev/) as follows:
```protobuf
syntax = "proto3";
package vac.voting;
message Proposal {
string name = 10; // Proposal name
string payload = 11; // Proposal description
uint32 proposal_id = 12; // Unique identifier of the proposal
bytes proposal_owner = 13; // Public key of the creator
repeated Vote votes = 14; // Vote list in the proposal
uint32 expected_voters_count = 15; // Maximum number of distinct voters
uint32 round = 16; // Number of Votes
uint64 timestamp = 17; // Creation time of proposal
uint64 expiration_time = 18; // The time interval that the proposal is active.
bool liveness_criteria_yes = 19; // Indicates how silent peers' votes are treated
}
message Vote {
uint32 vote_id = 20; // Unique identifier of the vote
bytes vote_owner = 21; // Voter's public key
uint32 proposal_id = 22; // Linking votes and proposals
int64 timestamp = 23; // Time when the vote was cast
bool vote = 24; // Vote bool value (true/false)
bytes parent_hash = 25; // Hash of previous owner's Vote
bytes received_hash = 26; // Hash of previous received Vote
bytes vote_hash = 27; // Hash of all previously defined fields in Vote
bytes signature = 28; // Signature of vote_hash
}
```
To initiate a consensus for a proposal,
a user MUST complete all the fields in the proposal, including attaching its `vote`
and the `payload` that shows the purpose of the proposal.
Notably, `parent_hash` and `received_hash` are empty strings because there is no previous or received hash.
The initialization phase ends when the user who created the proposal sends it
to a random peer in the network or publishes it to the dedicated gossipsub channel.
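A minimal, non-normative sketch of this initialization step in Python, assuming classes generated from the schema above (the module name `voting_pb2`, the `sign` callback, and the exact field serialization inside `vote_hash` are illustrative assumptions, not part of this specification):

```python
import time
from hashlib import sha256

# `voting_pb2` is assumed to be generated from the schema above with protoc;
# the module name and the `sign` callback are illustrative, not normative.
import voting_pb2


def vote_hash(vote) -> bytes:
    """Illustrative hash over all previously defined Vote fields (field 27)."""
    fields = b"".join([
        vote.vote_id.to_bytes(4, "big"),
        vote.vote_owner,
        vote.proposal_id.to_bytes(4, "big"),
        vote.timestamp.to_bytes(8, "big", signed=True),
        b"\x01" if vote.vote else b"\x00",
        vote.parent_hash,
        vote.received_hash,
    ])
    return sha256(fields).digest()


def init_proposal(my_pubkey: bytes, sign, name: str, payload: str,
                  proposal_id: int, expected_voters: int, expiration_ms: int):
    """Creates a proposal carrying the initiator's own vote (Section 1)."""
    now_ms = int(time.time() * 1000)
    proposal = voting_pb2.Proposal(
        name=name, payload=payload, proposal_id=proposal_id,
        proposal_owner=my_pubkey, expected_voters_count=expected_voters,
        round=1, timestamp=now_ms, expiration_time=expiration_ms,
        liveness_criteria_yes=True)
    vote = proposal.votes.add()
    vote.vote_id = 1
    vote.vote_owner = my_pubkey
    vote.proposal_id = proposal_id
    vote.timestamp = now_ms
    vote.vote = True
    vote.parent_hash = b""      # no previous vote by this owner
    vote.received_hash = b""    # nothing received yet
    vote.vote_hash = vote_hash(vote)
    vote.signature = sign(vote.vote_hash)
    return proposal
```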
## 2. Exchanging votes across the peers
Once a peer receives the proposal message `P_1` from a 1-1 connection or a gossipsub channel, it performs the following checks:
1. Check the signature of each vote in the proposal; in particular, for proposal `P_1`,
verify the signature of `V_1` where `V_1 = P_1.votes[0]` with `V_1.signature` and `V_1.vote_owner`
2. Do `parent_hash` check: If there are repeated votes from the same sender,
check that the hash of the former vote is equal to the `parent_hash` of the later vote.
3. Do `received_hash` check: If there are multiple votes in a proposal, check that the hash of a vote is equal to the `received_hash` of the next one.
4. After successful verification of the signature and hashes, the receiving peer proceeds to generate `P_2` containing a new vote `V_2` as follows:
4.1. Add its public key as `P_2.vote_owner`.
4.2. Set `timestamp`.
4.3. Set boolean `vote`.
4.4. Define `V_2.parent_hash` as empty if this peer has no previous vote in the proposal; otherwise, set it to the hash of the peer's previous vote.
4.5. Set `V_2.received_hash = hash(P_1.votes[0])`.
4.6. Set `proposal_id` for the `vote`.
4.7. Calculate `vote_hash` by hash of all previously defined fields in Vote:
`V_2.vote_hash = hash(vote_id, owner, proposal_id, timestamp, vote, parent_hash, received_hash)`
4.8. Sign `vote_hash` with the private key corresponding to the public key in `vote_owner`, then set `V_2.signature` to the resulting signature.
5. Create `P_2` by adding `V_2` as follows:
5.1. Assign `P_2.name`, `P_2.proposal_id`, and `P_2.proposal_owner` to be identical to those in `P_1`.
5.2. Add the `V_2` to the `P_2.Votes` list.
5.3. Increase the round by one, namely `P_2.round = P_1.round + 1`.
5.4. Verify that the proposal has not expired by checking that `current_time - P_1.timestamp < P_1.expiration_time`.
If this does not hold, other peers ignore the message.
After the peer creates the proposal `P_2` with its vote `V_2`,
it sends `P_2` to a random peer in the network or
publishes it to the dedicated gossipsub channel.
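A minimal sketch of this receive-verify-extend step, under the same assumptions as the initialization sketch above (generated `voting_pb2` classes, a `verify(pubkey, signature, message)` helper, and the illustrative `vote_hash` function); it is not a normative implementation:

```python
import time
from hashlib import sha256


def msg_hash(vote) -> bytes:
    # Illustrative hash of a whole Vote message; in practice a deterministic
    # serialization MUST be agreed upon by all peers.
    return sha256(vote.SerializeToString()).digest()


def verify_proposal(p1, verify) -> bool:
    """Signature, parent_hash and received_hash checks from Section 2."""
    last_by_owner = {}
    prev = None
    for v in p1.votes:
        if v.vote_hash != vote_hash(v):                        # helper from the previous sketch
            return False
        if not verify(v.vote_owner, v.signature, v.vote_hash):
            return False
        if v.parent_hash != last_by_owner.get(v.vote_owner, b""):  # parent_hash chain
            return False
        if v.received_hash != (msg_hash(prev) if prev else b""):   # received_hash chain
            return False
        last_by_owner[v.vote_owner] = msg_hash(v)
        prev = v
    return True


def extend_proposal(p1, my_pubkey: bytes, my_vote: bool, sign):
    """Builds P_2 = P_1 plus a new vote V_2 and increments the round (steps 4-5)."""
    p2 = type(p1)()
    p2.CopyFrom(p1)
    v2 = p2.votes.add()
    v2.vote_id = len(p1.votes) + 1              # illustrative identifier scheme
    v2.vote_owner = my_pubkey
    v2.proposal_id = p1.proposal_id
    v2.timestamp = int(time.time() * 1000)
    v2.vote = my_vote
    own = [v for v in p1.votes if v.vote_owner == my_pubkey]
    v2.parent_hash = msg_hash(own[-1]) if own else b""
    v2.received_hash = msg_hash(p1.votes[-1])   # hash of the last received vote
    v2.vote_hash = vote_hash(v2)
    v2.signature = sign(v2.vote_hash)
    p2.round = p1.round + 1
    return p2
```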
## 3. Determining the result
Because consensus depends on meeting a quorum threshold,
each peer MUST verify the accumulated votes to determine whether the necessary conditions have been satisfied.
The voting result is set to YES if a majority among at least `2n/3` distinct peers vote YES.
To verify, the `findDistinctVoter` method processes the proposal by traversing its `Votes` list to determine the number of unique voters.
If this method returns true, the peer proceeds with strong validation,
which ensures that if any honest peer reaches a decision,
no other honest peer can arrive at a conflicting result.
1. Check each `signature` in the vote as shown in the [Section 2](#2-exchanging-votes-across-the-peers).
2. Check the `parent_hash` chain: if there are multiple votes from the same owner, namely `vote_i` and `vote_i+1` respectively,
the parent hash of `vote_i+1` should be the hash of `vote_i`.
3. Check the `received_hash` chain: each received hash of `vote_i+1` should be equal to the hash of `vote_i`.
4. Check the `timestamp` to protect against replay attacks.
In particular, the `timestamp` cannot be older than the defined freshness threshold.
5. Check that the liveness criteria defined in the Liveness section are satisfied.
If a proposal is verified by all the checks,
the `countVote` method counts each YES vote from the list of Votes.
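The counting step is described only behaviourally (`findDistinctVoter`, `countVote`), so the helpers below are an illustrative interpretation in the same Python style as the earlier sketches, not a normative API:

```python
def distinct_voters(proposal) -> int:
    """Number of unique voters found by traversing the Votes list."""
    return len({bytes(v.vote_owner) for v in proposal.votes})


def count_votes(proposal):
    """Counts YES and NO using each distinct voter's most recent vote."""
    latest = {}
    for v in proposal.votes:            # votes appear in arrival order
        latest[bytes(v.vote_owner)] = v.vote
    yes = sum(1 for val in latest.values() if val)
    return yes, len(latest) - yes
```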
## 4. Properties
The consensus mechanism satisfies liveness and security properties as follows:
### Liveness
Liveness refers to the ability of the protocol to eventually reach a decision when sufficient honest participation is present.
In this protocol, if `n > 2` and more than `n/2` of the votes among at least `2n/3` distinct peers are YES,
then the consensus result is defined as YES; otherwise, when `n ≤ 2`, unanimous agreement (100% YES votes) is required.
The peer calculates the result locally as shown in the [Section 3](#3-determining-the-result).
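An illustrative reading of this decision rule in Python; returning `None` for the not-yet-decidable case is an assumption made for the sketch, not part of the specification:

```python
def consensus_result(yes: int, no: int, distinct: int, n: int):
    """Returns True (YES), False (NO), or None while the quorum is not yet met."""
    if n <= 2:
        if distinct < n:
            return None
        return yes == n                 # small groups require unanimous YES
    if 3 * distinct < 2 * n:            # fewer than 2n/3 distinct voters so far
        return None
    return yes > n / 2                  # more than n/2 YES votes => YES, otherwise NO
```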
From the [hashgraph property](https://hedera.com/learning/hedera-hashgraph/what-is-hashgraph-consensus),
if a node could calculate the result of a proposal,
it implies that no peer can calculate the opposite of the result.
Still, reliability issues can cause some situations where peers cannot receive enough messages,
so they cannot calculate the consensus result.
Rounds are incremented each time a peer adds its vote and sends the new proposal.
Collecting votes from `2n/3` distinct peers within the required number of rounds is achieved in two ways:
1. `2n/3` rounds in pure P2P networks
2. `2` rounds in gossipsub
Since the message complexity is `O(1)` in the gossipsub channel,
in case the network has reliability issues,
the second round is used by peers that could not receive all the messages from the first round.
If an honest and online peer has received at least one vote but not enough to reach consensus,
it MAY continue to propagate its own vote — and any votes it has received — to support message dissemination.
This process can continue beyond the expected round count,
as long as it remains within the expiration time defined in the proposal.
The expiration time acts as a soft upper bound to ensure that consensus is either reached or aborted within a bounded timeframe.
#### Equality of votes
An equality of votes occurs when, after verifying at least `2n/3` distinct voters and
applying `liveness_criteria_yes`, the number of YES and NO votes is equal.
Handling ties is an application-level decision. The application MUST define a deterministic tie policy:
- RETRY: re-run the vote with a new `proposal_id`, optionally adjusting parameters.
- REJECT: abort the proposal and return the voting result as NO.
The chosen policy SHOULD be made consistent for all peers via the proposal's `payload` to ensure convergence on the same outcome.
### Silent Node Management
Silent nodes are nodes that do not cast a YES or NO vote.
There are two possible ways of counting silent peers' votes:
1. **Silent peers mean YES:**
Silent peers are counted as YES votes, if the application requires explicit NO votes for a strong rejection.
2. **Silent peers mean NO:**
Silent peers are counted as NO votes, if the application requires explicit YES votes for a strong acceptance.
The proposal defaults to the first option: silent peers' votes are counted as YES, that is, `liveness_criteria_yes` is set to true by default.
### Security
This RFC uses cryptographic primitives to prevent the
following malicious behaviours:
- Vote forgery attempt: creating unsigned invalid votes
- Inconsistent voting: a malicious peer submits conflicting votes (e.g., YES to some peers and NO to others)
in different stages of the protocol, violating vote consistency and attempting to undermine consensus.
- Integrity breaking attempt: tampering with history by changing previous votes.
- Replay attack: storing old votes to maliciously reuse them in fresh voting.
## 5. Copyright
Copyright and related rights waived via [CC0](https://creativecommons.org/publicdomain/zero/1.0/)
## 6. References
- [Hedera Hashgraph](https://hedera.com/learning/hedera-hashgraph/what-is-hashgraph-consensus)
- [Gossip about gossip](https://docs.hedera.com/hedera/core-concepts/hashgraph-consensus-algorithms/gossip-about-gossip)
- [Simple implementation of hashgraph consensus](https://github.com/conanwu777/hashgraph)

vac/raw/eth-mls-offchain.md Normal file

@@ -0,0 +1,241 @@
---
title: ETH-MLS-OFFCHAIN
name: Secure channel setup using decentralized MLS and Ethereum accounts
status: raw
category: Standards Track
tags:
editor: Ugur Sen [ugur@status.im](mailto:ugur@status.im)
contributors: seemenkina [ekaterina@status.im](mailto:ekaterina@status.im)
---
## Abstract
The following document specifies an Ethereum-authenticated, scalable,
and decentralized secure group messaging application built on a
Messaging Layer Security (MLS) backend.
Decentralization here means that each user is a node in a P2P network and
each user has a voice in any change to the group.
This is achieved by integrating a consensus mechanism.
Lastly, this RFC can also be referred to as de-MLS,
decentralized MLS, to emphasize its deviation
from the centralized trust assumptions of traditional MLS deployments.
## Motivation
Group messaging is a fundamental part of digital communication,
yet most existing systems depend on centralized servers,
which introduce risks around privacy, censorship, and unilateral control.
In restrictive settings, servers can be blocked or surveilled;
in more open environments, users still face opaque moderation policies,
data collection, and exclusion from decision-making processes.
To address this, we propose a decentralized, scalable peer-to-peer
group messaging system where each participant runs a node, contributes
to message propagation, and takes part in governance autonomously.
Group membership changes are decided collectively through a lightweight
partially synchronous, fault-tolerant consensus protocol without a centralized identity.
This design enables truly democratic group communication and is well-suited
for use cases like activist collectives, research collaborations, DAOs, support groups,
and decentralized social platforms.
## Format Specification
The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”,
“SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” in this document
are to be interpreted as described in [2119](https://www.ietf.org/rfc/rfc2119.txt).
### Assumptions
- The nodes in the P2P network can discover other nodes or will connect to other nodes when subscribing to the same topic in a gossipsub setup.
- We MAY have non-reliable (silent) nodes.
- We MUST have a consensus that is lightweight, scalable and finalized in a specific time.
## Roles
The three roles used in de-MLS are as follows:
- `node`: Nodes are members of the network without being part of any secure group messaging session.
- `member`: Members are special nodes in the secure group messaging session who
hold the current group key.
- `steward`: Stewards are special and transparent members of the secure group
messaging session who organize the changes resulting from voted proposals.
## MLS Background
de-MLS builds on an MLS backend, so the MLS services and other MLS components
are taken from the original [MLS specification](https://datatracker.ietf.org/doc/rfc9420/), with or without modifications.
### MLS Services
MLS operates with two services: the authentication service (AS) and the delivery service (DS).
The authentication service enables group members to authenticate the credentials presented by other group members.
The delivery service routes MLS messages among the nodes or
members in the protocol in the correct order and
manages the users' `keyPackage` objects,
where a `keyPackage` provides public information about a user.
### MLS Objects
The following section presents the MLS objects and components used in this RFC:
`Epoch`: Fixed time intervals in which the group state, as defined by the members, changes;
see section 3.4 in [MLS RFC 9420](https://datatracker.ietf.org/doc/rfc9420/).
`MLS proposal message:` Members MUST receive the proposal message prior to the
corresponding commit message that initiates a new epoch with key changes,
in order to ensure the intended security properties, section 12.1 in [MLS RFC 9420](https://datatracker.ietf.org/doc/rfc9420/).
Here, the add and remove proposals are used.
`Application message`: This message type is used for arbitrary encrypted communication between group members.
It is restricted by [MLS RFC 9420](https://datatracker.ietf.org/doc/rfc9420/): if there is a pending proposal,
application messages should be held back.
Note that since MLS is server-based, this delay between proposal and commit messages is normally very small.
`Commit message:` After members receive the proposals regarding group changes,
the committer, who may be any member of the group, as specified in [MLS RFC 9420](https://datatracker.ietf.org/doc/rfc9420/),
generates the necessary key material for the next epoch, including the appropriate welcome messages
for new joiners and new entropy for removed members. In this RFC, committers MUST be stewards.
### de-MLS Objects
`Voting proposal`: Similar to MLS proposals, but processed only if approved through a voting process.
They function as application messages in the MLS group,
allowing the steward to collect them without halting the protocol.
## Flow
The general flow is as follows:
- A steward initializes a group just once, and then sends out Group Announcements (GA) periodically.
- Meanwhile, each `node` creates and sends its `credential`, which includes a `keyPackage`.
- Each `member` creates `voting proposals` and sends them to the MLS group during epoch E.
- Meanwhile, the `steward` collects finalized `voting proposals` from the MLS group, converts them into
`MLS proposals`, and sends them with the corresponding `commit messages`.
- Eventually, with the commit messages, all members start the next epoch E+1.
## Creating Voting Proposal
A `member` MAY initialize the voting with the proposal payload,
which is implemented using [protocol buffers v3](https://protobuf.dev/) as follows:
```protobuf
syntax = "proto3";

message Proposal {
  string name = 10;                  // Proposal name
  string payload = 11;               // Describes what the vote is for
  int32 proposal_id = 12;            // Unique identifier of the proposal
  bytes proposal_owner = 13;         // Public key of the creator
  repeated Vote votes = 14;          // Vote list in the proposal
  int32 expected_voters_count = 15;  // Maximum number of distinct voters
  int32 round = 16;                  // Number of votes (rounds) accumulated so far
  int64 timestamp = 17;              // Creation time of the proposal
  int64 expiration_time = 18;        // Time interval during which the proposal is active
  bool liveness_criteria_yes = 19;   // Defines how silent peers' votes are counted
}
```
```protobuf
message Vote {
  int32 vote_id = 20;       // Unique identifier of the vote
  bytes vote_owner = 21;    // Voter's public key
  int64 timestamp = 22;     // Time when the vote was cast
  bool vote = 23;           // Vote boolean value (true/false)
  bytes parent_hash = 24;   // Hash of the same owner's previous Vote
  bytes received_hash = 25; // Hash of the previously received Vote
  bytes vote_hash = 26;     // Hash of all previously defined fields in Vote
  bytes signature = 27;     // Signature over vote_hash
}
```
The voting proposal MAY include adding a `node` or removing a `member`.
After the `member` creates the voting proposal,
it is emitted to the network via an MLS `Application message` with a lightweight,
epoch-based voting mechanism such as the [hashgraphlike consensus](https://github.com/vacp2p/rfc-index/blob/consensus-hashgraph-like/vac/raw/consensus-hashgraphlike.md).
This consensus result MUST be finalized within the epoch as YES or NO.
If the voting result is YES, the voting proposal will be converted into
an MLS proposal by the `steward`, followed by a commit message that starts the new epoch.
## Creating welcome message
When an `MLS proposal message` is created by the `steward`,
a `commit message` SHOULD follow to the members,
as in section 12.4 of [MLS RFC 9420](https://datatracker.ietf.org/doc/rfc9420/).
In order for a new `member` joining the group to synchronize with the current members
who received the `commit message`,
the `steward` sends a welcome message to the node joining as the new `member`,
as in section 12.4.3.1 of [MLS RFC 9420](https://datatracker.ietf.org/doc/rfc9420/).
## Single steward
The naive way to create decentralized secure group messaging is to have a single transparent `steward`
who only applies the changes that result from the voting.
This largely follows the general flow and the voting proposal and welcome message creation sections above.
1. A single `steward` initializes a group with group parameters
as in section 8.1 (Group Context) of [MLS RFC 9420](https://datatracker.ietf.org/doc/rfc9420/).
2. The `steward` creates a group announcement (GA) according to the previous step and
broadcasts it to the whole network periodically. The GA message is visible to all `nodes` in the network.
3. Each `node` that wants to become a member needs to obtain this announcement and create a `credential`
that includes a `keyPackage`, as specified in [MLS RFC 9420](https://datatracker.ietf.org/doc/rfc9420/) section 10.
4. The `steward` aggregates all `KeyPackages` and utilizes them to provision group additions for new members,
based on the outcome of the voting process.
5. Any `member` can create `voting proposals` for adding or removing users,
and present them for voting in the MLS group as application messages.
However, unlimited use of `voting proposals` within the group may be misused by
malicious or overly active members.
Therefore, an application-level constraint can be introduced to limit the number
or frequency of proposals initiated by each member to prevent spam or abuse.
6. Meanwhile, the `steward` collects the `voting proposals` finalized within epoch `E`
that have received affirmative votes from members via application messages.
The `steward` discards proposals that did not receive a majority of "YES" votes.
Since voting proposals are transmitted as application messages, omitting them does not affect
the protocol's correctness or consistency.
7. The `steward` converts all approved `voting proposals` into
corresponding `MLS proposals` and a `commit message`, and
transmits both in a single operation as in [MLS RFC 9420](https://datatracker.ietf.org/doc/rfc9420/) section 12.4,
including welcome messages for the new members. The `commit message` therefore ends the previous epoch and creates the new one.
A sketch of the steward's processing in steps 6 and 7 is given after this list.
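The sketch below illustrates steps 6 and 7 in Python; because this document does not prescribe a concrete MLS library API, every helper is injected as a callable and is purely a placeholder, not a real function of any implementation:

```python
def steward_epoch_tick(group, epoch, collect_finalized, count_votes,
                       to_mls_proposal, commit_and_welcome, broadcast, send_welcome):
    """Steps 6 and 7 of the single-steward flow.

    All callables are injected because no concrete MLS library API is
    specified here; they are placeholders, not real functions.
    """
    # Step 6: keep only voting proposals with a YES majority; the rest are
    # discarded, which is safe because they travelled as application messages.
    approved = []
    for proposal in collect_finalized(group, epoch):
        yes, no = count_votes(proposal)   # outcome of the hashgraphlike consensus
        if yes > no:
            approved.append(proposal)

    if not approved:
        return  # nothing to commit; the group stays in the current epoch

    # Step 7: convert to MLS add/remove proposals, commit, and welcome new members.
    mls_proposals = [to_mls_proposal(p) for p in approved]
    commit, welcomes = commit_and_welcome(group, mls_proposals)
    broadcast(group, mls_proposals, commit)   # single operation, RFC 9420 section 12.4
    for welcome in welcomes:
        send_welcome(welcome)
```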
## Multi stewards
Decentralization has already been achieved in the previous section.
However, to improve availability and ensure censorship resistance,
the single-steward protocol is extended to a multi-steward architecture.
In this design, each epoch is coordinated by a designated steward,
operating under the same protocol as the single-steward model.
Thus, the multi-steward approach primarily defines how steward roles
rotate across epochs while preserving the underlying structure and logic of the original protocol.
Two variants of the multi-steward design are introduced to address different system requirements.
### Multi steward with single consensus
In this model, all group modifications, such as adding or removing members,
must be approved through consensus by all participants,
including the steward assigned for epoch `E`.
A configuration with multiple stewards operating under a shared consensus protocol offers
increased decentralization and stronger protection against censorship.
However, this benefit comes with reduced operational efficiency.
The model is therefore best suited for small groups that value
decentralization and censorship resistance more than performance.
### Multi steward with two consensuses
The two-consensus model offers improved efficiency with a trade-off in decentralization.
In this design, group changes require consensus only among the stewards, rather than all members.
Regular members participate by periodically selecting the stewards but do not take part in each decision.
This structure enables faster coordination since consensus is achieved within a smaller group of stewards.
It is particularly suitable for large user groups, where involving every member in each decision would be impractical.
## Copyright
Copyright and related rights waived via [CC0](https://creativecommons.org/publicdomain/zero/1.0/)
### References
- [MLS RFC 9420](https://datatracker.ietf.org/doc/rfc9420/)
- [Hashgraphlike Consensus](https://github.com/vacp2p/rfc-index/blob/consensus-hashgraph-like/vac/raw/consensus-hashgraphlike.md)
- [vacp2p/de-mls](https://github.com/vacp2p/de-mls)


@@ -120,6 +120,18 @@ contents as packets are forwarded hop-by-hop.
Sphinx packets are fixed-size and indistinguishable from one another, providing
unlinkability and metadata protection.
- **Initialization Vector (IV)**
A fixed-length input used to initialize block ciphers to add randomness to the
encryption process. It ensures that encrypting the same plaintext with the same
key produces different ciphertexts. The IV is not secret but must be unique for
each encryption.
- **Single-Use Reply Block (SURB)**
A pre-computed Sphinx header that encodes a return path back to the sender.
SURBs are generated by the sender and included in the Sphinx packet sent to the recipient.
A SURB enables the recipient to send anonymous replies
without learning the sender's identity, the return path, or the forwarding delays.
## 3. Motivation and Background
libp2p enables modular peer-to-peer applications, but it lacks built-in support
@@ -194,7 +206,7 @@ describe the core components of a mix network.
The Mix Protocol relies on two core design elements to achieve sender
unlinkability and metadata
protection: a mixing strategy and a cryptographically secure mix packet
format.
### 4.1 Mixing Strategy
@@ -297,7 +309,7 @@ layered encryption and per-hop delay provides resistance to traffic analysis and
enables message-level
unlinkability.
Unlike typical custom libp2p protocols, the Mix Protocol is stateless&mdash;it
does not establish
persistent streams, negotiate protocols, or maintain sessions. Each message is
self-contained
@@ -458,8 +470,8 @@ multistream handshake for protocol negotiation.
While libp2p supports multiplexing multiple streams over a single transport
connection using
stream muxers such as mplex and yamux, it does not natively support reusing the
same stream over multiple
message transmissions. However, stream reuse may be desirable in the mixnet
setting to reduce overhead
and avoid hitting per protocol stream limits between peers.
@@ -512,9 +524,9 @@ adversarial routing bias.
While no existing mechanism provides unbiased sampling by default,
[Wakus ambient discovery](https://rfc.vac.dev/waku/standards/core/33/discv5/)
&mdash;an extension
over [Discv5](https://github.com/ethereum/devp2p/blob/master/discv5/discv5.md)
&mdash;demonstrates
an approximate solution. It combines topic-based capability advertisement with
periodic
peer sampling. A similar strategy could potentially be adapted for the Mix
@@ -618,7 +630,7 @@ _unobservability_ where
a passive adversary cannot determine whether a node is sending real messages or
not.
In the Mix Protocol, cover traffic is limited to _loop messages_&mdash;dummy
Sphinx packets
that follow a valid mix path and return to the originating node. These messages
carry no application
@@ -627,7 +639,7 @@ routing behavior.
Cover traffic MAY be generated by either mix nodes or senders. The strategy for
generating
such traffic&mdash;such as timing and frequency&mdash;is pluggable and not
specified
in this document.
@@ -667,9 +679,9 @@ deter participation by compromised or transient peers.
The Mix Protocol does not mandate any form of payment, token exchange, or
accounting. More
sophisticated economic models&mdash;such as stake-based participation,
credentialed relay networks,
or zero-knowledge proof-of-contribution systems&mdash;MAY be layered on top of
the protocol or
enforced via external coordination.
@@ -779,6 +791,11 @@ destination origin protocol instance.
The node MUST NOT retain decrypted content after forwarding.
The routing behavior described in this section relies on the use of
Sphinx packets to preserve unlinkability and confidentiality across
hops. The next section specifies their structure, cryptographic
components, and construction.
## 8. Sphinx Packet Format
The Mix Protocol uses the Sphinx packet format to enable unlinkable, multi-hop
@@ -789,8 +806,10 @@ encapsulated in a Sphinx packet constructed by the initiating mix node. The
packet is encrypted in
layers such that each hop in the mix path can decrypt exactly one layer and
obtain the next-hop
routing information and forwarding delay, without learning the complete path or the
message origin.
Only the final hop learns the destination, which is encoded in the innermost
routing layer.
Sphinx packets are self-contained and indistinguishable on the wire, providing
strong metadata
@@ -820,12 +839,10 @@ subsections.
### 8.1 Packet Structure Overview
Each Sphinx packet consists of three fixed-length header fields&mdash;$α$,
$β$, and $γ$&mdash;followed by a fixed-length encrypted payload $δ$.
Together, these components enable per-hop message processing with strong
confidentiality and integrity guarantees in a stateless and unlinkable manner.
- **$α$ (Alpha)**: An ephemeral public value. Each mix node uses its private key
and $α$ to
@@ -833,8 +850,9 @@ derive a shared session key for that hop. This session key is used to decrypt
and process
one layer of the packet.
- **$β$ (Beta)**: The nested encrypted routing information. It encodes the next
hop address, the forwarding delay, integrity check $γ$ for the next hop, and
the $β$ for subsequent hops. At the final hop, $β$ encodes the destination
address and fixed-length zero padding to preserve uniform size.
- **$γ$ (Gamma)**: A message authentication code computed over $β$ using the
session key derived
from $α$. It ensures header integrity at each hop.
@@ -843,12 +861,12 @@ fixed maximum length and
encrypted in layers corresponding to each hop in the mix path.
At each hop, the mix node derives the session key from $α$, verifies the header
integrity using $γ$, decrypts one layer of $β$ to extract the next hop and
delay, and decrypts one layer of $δ$. It then constructs a new packet with
updated values of $α$, $β$, $γ$, and $δ$, and forwards it to the next hop. At
the final hop, the mix node decrypts the innermost layer of $β$ and $δ$, which
yields the destination address and the original application message
respectively.
All Sphinx packets are fixed in size and indistinguishable on the wire. This
uniform format,
@@ -866,20 +884,29 @@ This section defines the cryptographic primitives used in Sphinx packet
construction and processing.
- **Security Parameter**: All cryptographic operations target a minimum of
$κ = 128$ bits of
security, balancing performance with resistance to modern attacks.
- **Elliptic Curve Group $\mathbb{G}$**:
- **Curve**: Curve25519
- **Notation**: Let $g$ denote the canonical base point (generator) of $\mathbb{G}$.
- **Purpose**: Used for deriving DiffieHellman-style shared key at each hop
using $α$.
- **Representation**: Small 32-byte group elements, efficient for both
encryption and key exchange.
- **Scalar Field**: The curve is defined over the finite field
$\mathbb{Z}_q$, where $q = 2^{252} + 27742317777372353535851937790883648493$.
Ephemeral exponents used in Sphinx packet construction are selected uniformly
at random from $\mathbb{Z}_q^*$, the multiplicative subgroup of $\mathbb{Z}_q$.
- **Hash Function**:
- **Construction**: SHA-256
- **Notation**: The hash function is denoted by $H(\cdot)$ in subsequent sections.
- **Key Derivation Function (KDF)**:
- **Purpose**: To derive encryption keys, IVs, and MAC key from the shared
session key at each hop.
- **Construction**: SHA-256 hash with output truncated to $128$ bits.
- **Key Derivation**: The KDF key separation labels (_e.g.,_ `"aes_key"`,
`"mac_key"`)
are fixed strings and MUST be agreed upon across implementations.
@@ -889,9 +916,457 @@ are fixed strings and MUST be agreed upon across implementations.
- **Keys and IVs**: Each derived from the session key for the hop using the KDF.
- **Message Authentication Code (MAC)**:
- **Construction**: HMAC-SHA-256 with output truncated to $128$ bits.
- **Purpose**: To compute $γ$ for each hop.
- **Key**: Derived using KDF from the session key for the hop.
These primitives are used consistently throughout packet construction and
decryption, as described in the following sections.
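For illustration only (the exact label and concatenation conventions are whatever implementations agree upon, per the key-separation note above), the KDF and MAC described here could look like the following Python sketch:

```python
import hashlib
import hmac

KAPPA = 16  # security parameter κ in bytes (128 bits)

def kdf(label: bytes, session_key: bytes) -> bytes:
    """SHA-256 over a fixed label and the per-hop session key, truncated to κ bytes."""
    return hashlib.sha256(label + session_key).digest()[:KAPPA]

def mac(mac_key: bytes, beta: bytes) -> bytes:
    """HMAC-SHA-256 over β, truncated to κ bytes; this is the per-hop γ."""
    return hmac.new(mac_key, beta, hashlib.sha256).digest()[:KAPPA]
```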
### 8.3 Packet Component Sizes
This section defines the size of each component in a Sphinx packet, deriving them
from the security parameter and protocol parameters introduced earlier. All Sphinx
packets MUST be fixed in length to ensure uniformity and indistinguishability on
the wire. The serialized packet is structured as follows:
```text
+--------+----------+--------+----------+
| α | β | γ | δ |
| 32 B | variable | 16 B | variable |
+--------+----------+--------+----------+
```
#### 8.3.1 Header Field Sizes
The header consists of the fields $α$, $β$, and $γ$, totaling a fixed size per
maximum path length:
- **$α$ (Alpha)**: 32 bytes
The size of $α$ is determined by the elliptic curve group representation used
(Curve25519), which encodes group elements as 32-byte values.
- **$β$ (Beta)**: $((t + 1)r + 1)κ$ bytes
The size of $β$ depends on:
- **Maximum path length ($r$)**: The recommended value of $r=5$ balances
bandwidth versus anonymity tradeoffs.
- **Combined address and delay width ($tκ$)**: The recommended $t=6$
accommodates standard libp2p relay multiaddress representations plus a
2-byte delay field. While the actual multiaddress and delay fields may be
shorter, they are padded to $tκ$ bytes to maintain fixed field size. The
structure and rationale for the $tκ$ block and its encoding are specified in
[Section 8.4](#84-address-and-delay-encoding).
Note: This expands on the original
[Sphinx packet format](https://cypherpunks.ca/~iang/pubs/Sphinx_Oakland09.pdf),
which embeds a fixed $κ$-byte mix node identifier per hop in $β$.
The Mix Protocol generalizes this to $tκ$ bytes to accommodate libp2p
multiaddresses and forwarding delays while preserving the cryptographic
properties of the original design.
- **Per-hop $γ$ size ($κ$)** (defined below): Accounts for the integrity tag
included with each hop's routing information.
Using the recommended value of $r=5$ and $t=6$, the resulting $β$ size is
$576$ bytes. At the final hop, $β$ encodes the destination address in the
first $tκ-2$ bytes and the remaining bytes are zero-padded.
- **$γ$ (Gamma)**: $16$ bytes
The size of $γ$ equals the security parameter $κ$, providing a $κ$-bit integrity
tag at each hop.
Thus, the total header length is:
$`
\begin{aligned}
|Header| &= α + β + γ \\
&= 32 + ((t + 1)r + 1)κ + 16
\end{aligned}
`$
Notation: $|x|$ denotes the size (in bytes) of field $x$.
Using the recommended value of $r = 5$ and $t = 6$, the header size is:
$`
\begin{aligned}
|Header| &= 32 + 576 + 16 \\
&= 624 \ bytes
\end{aligned}
`$
#### 8.3.2 Payload Size
This subsection defines the size of the encrypted payload $δ$ in a Sphinx packet.
$δ$ contains the application message, padded to a fixed maximum length to ensure
all packets are indistinguishable on the wire. The size of $δ$ is calculated as:
$`
\begin{aligned}
|δ| &= TotalPacketSize - HeaderSize
\end{aligned}
`$
The recommended total packet size is $4608$ bytes, chosen to:
- Accommodate larger libp2p application messages, such as those commonly
observed in Status chat using Waku (typically ~4KB payloads),
- Allow inclusion of additional data such as SURBs without requiring fragmentation,
- Maintain reasonable per-hop processing and bandwidth overhead.
This recommended total packet size of $4608$ bytes yields:
$`
\begin{aligned}
Payload &= 4608 - 624 \\
&= 3984\ bytes
\end{aligned}
`$
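For concreteness, the arithmetic above can be reproduced directly from the formulas in Sections 8.3.1 and 8.3.2 with the recommended parameters; the short Python sanity check below only restates those formulas:

```python
KAPPA = 16                 # security parameter κ in bytes
R, T = 5, 6                # recommended maximum path length r and width factor t
ALPHA_SIZE = 32            # Curve25519 group element
GAMMA_SIZE = KAPPA         # per-hop integrity tag
TOTAL_PACKET_SIZE = 4608   # recommended total packet size

beta_size = ((T + 1) * R + 1) * KAPPA              # 576 bytes
header_size = ALPHA_SIZE + beta_size + GAMMA_SIZE  # 624 bytes
payload_size = TOTAL_PACKET_SIZE - header_size     # 3984 bytes

assert (beta_size, header_size, payload_size) == (576, 624, 3984)
```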
Implementations MUST account for payload extensions, such as SURBs,
when determining the maximum message size that can be encapsulated in a
single Sphinx packet. Details on SURBs are defined in
[Section X.X].
The following subsection defines the padding and fragmentation requirements for
ensuring this fixed-size constraint.
#### 8.3.3 Padding and Fragmentation
Implementations MUST ensure that all messages shorter than the maximum payload size
are padded before Sphinx encapsulation to ensure that all packets are
indistinguishable on the wire. Messages larger than the maximum payload size MUST
be fragmented by the origin protocol or top-level application before being passed
to the Mix Protocol. Reassembly is the responsibility of the consuming application,
not the Mix Protocol.
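One deterministic padding scheme that satisfies this requirement (explicitly encoding the padding length and prepending zero-valued bytes, as suggested later in Section 8.5.2, step 1) is sketched below; it is an example, not a mandated format:

```python
def pad(message: bytes, padded_len: int) -> bytes:
    """Prepends a 2-byte big-endian padding length followed by zero bytes."""
    pad_len = padded_len - 2 - len(message)
    if pad_len < 0:
        raise ValueError("message too large; fragment before Sphinx encapsulation")
    return pad_len.to_bytes(2, "big") + b"\x00" * pad_len + message

def unpad(padded: bytes) -> bytes:
    """Recovers the original message from its padded form."""
    pad_len = int.from_bytes(padded[:2], "big")
    return padded[2 + pad_len:]
```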
#### 8.3.4 Anonymity Set Considerations
The fixed maximum packet size is a configurable parameter. Protocols or
applications that choose to configure a different packet size (either larger or
smaller than the default) MUST be aware that using unique or uncommon packet sizes
can reduce their effective anonymity set to only other users of the same size.
Implementers SHOULD align with widely used defaults to maximize anonymity set size.
Similarly, parameters such as $r$ and $t$ are configurable. Changes to these
parameters affect header size and therefore impact payload size if the total packet
size remains fixed. However, if such changes alter the total packet size on the
wire, the same anonymity set considerations apply.
The following subsection defines how the next-hop or destination address and
forwarding delay are encoded within $β$ to enable correct routing and mixing
behavior.
### 8.4 Address and Delay Encoding
Each hop's $β$ includes a fixed-size block containing the next-hop address and
the forwarding delay, except for the final hop, which encodes the destination
address and a delay-sized zero padding. This section defines the structure and
encoding of that block.
The combined address and delay block MUST be exactly $tκ$ bytes in length,
as defined in [Section 8.3.1](#831-header-field-sizes), regardless of the
actual address or delay values. The first $(tκ - 2)$ bytes MUST encode the
address, and the final $2$ bytes MUST encode the forwarding delay.
This fixed-length encoding ensures that packets remain indistinguishable on
the wire and prevents correlation attacks based on routing metadata structure.
Implementations MAY use any address and delay encoding format agreed upon
by all participating mix nodes, as long as the combined length is exactly
$tκ$ bytes. The encoding format MUST be interpreted consistently by all
nodes within a deployment.
For interoperability, a recommended default encoding format involves:
- Encoding the next-hop or destination address as a libp2p multi-address:
- To keep the address block compact while allowing relay connectivity, each mix
node is limited to one IPv4 circuit relay multiaddress. This ensures that most
nodes can act as mix nodes, including those behind NATs or firewalls.
- In libp2p terms, this combines transport addresses with multiple peer
identities to form an address that describes a relay circuit:
```text
/ip4/<ipv4>/tcp/<port>/p2p/<relayPeerID>/p2p-circuit/p2p/<relayedPeerID>
```
Variants may include directly reachable peers and transports such as
`/quic-v1`, depending on the mix node's supported stack.
- IPv6 support is deferred, as it adds $16$ bytes just for the IP field.
- Future revisions may extend this format to support IPv6 or DNS-based
multiaddresses.
With these constraints, the recommended encoding layout is:
- IPv4 address (4 bytes)
- Protocol identifier _e.g.,_ TCP or QUIC (1 byte)
- Port number (2 bytes)
- Peer IDs (39 bytes, post-Base58 decoding)
- Encoding the forwarding delay as an unsigned 16-bit integer (2 bytes) in
milliseconds, using big endian network byte order.
If the encoded address or delay is shorter than its respective allocated
field, it MUST be padded with zeros. If it exceeds the allocated size, it
MUST be rejected or truncated according to the implementation policy.
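A sketch of packing an already-encoded address and a forwarding delay into the fixed $tκ$-byte block; the zero-padding and big-endian two-byte delay follow the rules above, while stripping trailing zeros on decode is a simplifying assumption of this example:

```python
T_KAPPA = 6 * 16  # tκ bytes: combined address and delay block size (t=6, κ=16)

def encode_addr_delay(address: bytes, delay_ms: int) -> bytes:
    """Packs an encoded address and a 2-byte big-endian delay into exactly tκ bytes."""
    addr_field = T_KAPPA - 2
    if len(address) > addr_field:
        raise ValueError("encoded address exceeds the allocated field")
    if not 0 <= delay_ms < 2**16:
        raise ValueError("delay must fit in an unsigned 16-bit integer")
    # Zero-pad the address to its fixed width, then append the delay.
    return address.ljust(addr_field, b"\x00") + delay_ms.to_bytes(2, "big")

def decode_addr_delay(block: bytes) -> tuple[bytes, int]:
    """Splits a tκ-byte block back into address bytes and delay (in ms)."""
    # Assumes encoded addresses carry no trailing zero bytes of their own.
    addr = block[:T_KAPPA - 2].rstrip(b"\x00")
    delay_ms = int.from_bytes(block[T_KAPPA - 2:], "big")
    return addr, delay_ms
```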
Note: Future versions of the Mix Protocol may support address compression by
encoding only the peer identifier and relying on external peer discovery
mechanisms to retrieve full multiaddresses at runtime. This would allow for
more compact headers and greater address flexibility, but requires fast and
reliable lookup support across deployments. This design is out of scope for
the current version.
With the field sizes and encoding conventions established, the next section describes
how a mix node constructs a complete Sphinx packet when initiating the Mix Protocol.
### 8.5 Packet Construction
This section defines how a mix node constructs a Sphinx packet when initiating
the Mix Protocol on behalf of a local origin protocol instance.
The construction process wraps the message in a sequence of encryption
layers&mdash;one for each hop&mdash;such that only the corresponding mix node
can decrypt its layer and retrieve the routing instructions for that hop.
#### 8.5.1 Inputs
To initiate the Mix Protocol, the origin protocol instance submits a message
to the Mix Entry Layer on the same node. This layer forwards it to the local
Mix Protocol instance, which constructs a Sphinx packet
using the following REQUIRED inputs:
- **Application message**: The serialized message provided by the origin
protocol instance. The Mix Protocol instance applies any configured spam
protection mechanism and attaches one or two SURBs prior to encapsulating
the message in the Sphinx packet. The initiating node MUST ensure that
the resulting payload size does not exceed the maximum supported size
defined in [Section 8.3.2](#832-payload-size).
- **Origin protocol codec**: The libp2p protocol string corresponding to the
origin protocol instance. This is included in the payload so that
the exit node can route the message to the intended destination protocol
after decryption.
- **Mix Path length $L$**: The number of mix nodes to include in the path.
The mix path MUST consist of at least three hops, each representing a
distinct mix node.
- **Destination address $Δ$**: The routing address of the intended recipient
of the message. This address is encoded in $(tκ - 2)$ bytes as defined in
[Section 8.4](#84-address-and-delay-encoding) and revealed only at the last hop.
#### 8.5.2 Construction Steps
This subsection defines how the initiating mix node constructs a complete
Sphinx packet using the inputs defined in
[Section 8.5.1](#851-inputs). The construction MUST
follow the cryptographic structure defined in
[Section 8.1](#81-packet-structure-overview), use the primitives specified in
[Section 8.2](#82-cryptographic-primitives), and adhere to the component sizes
and encoding formats from [Section 8.3](#83-packet-component-sizes) and
[Section 8.4](#84-address-and-delay-encoding).
The construction MUST proceed as follows:
1. **Prepare Application Message**
- Apply any configured spam protection mechanism (_e.g.,_ PoW, VDF, RLN)
to the serialized message. Spam protection mechanisms are pluggable as defined
in [Section 6.3](#63-spam-protection).
- Attach one or more SURBs, if required. Their format and processing are
specified in [Section X.X].
- Append the origin protocol codec.
- Pad the result to the maximum application message length of $3968$ bytes
using a deterministic padding scheme. This value is derived from the fixed
payload size in [Section 8.3.2](#832-payload-size) ($3984$ bytes) minus the
security parameter $κ = 16$ bytes defined in
[Section 8.2](#82-cryptographic-primitives). The chosen scheme MUST yield a
fixed-size padded output and MUST be consistent across all mix nodes to
ensure correct interpretation during unpadding. For example, schemes that
explicitly encode the padding length and prepend zero-valued padding bytes
MAY be used.
- Let the resulting message be $m$.
2. **Select A Mix Path**
- First obtain an unbiased random sample of live, routable mix nodes using
some discovery mechanism. The choice of discovery mechanism is
deployment-specific as defined in [Section 6.1](#61-discovery). The
discovery mechanism MUST be unbiased and provide, at a minimum, the
multiaddress and X25519 public key of each mix node.
- From this sample, choose a random mix path of length $L \geq 3$. As defined
in [Section 2](#2-terminology), a mix path is a non-repeating sequence of
mix nodes.
- For each hop $i \in \{0 \ldots L-1\}$:
- Retrieve the multiaddress and corresponding X25519 public key $y_i$ of
the $i$-th mix node.
- Encode the multiaddress in $(tκ - 2)$ bytes as defined in
[Section 8.4](#84-address-and-delay-encoding). Let the resulting encoded
multiaddress be $\mathrm{addr\_i}$.
3. **Wrap Plaintext Payload In Sphinx Packet**
a. **Compute Ephemeral Secrets**
- Choose a random private exponent $x \in_R \mathbb{Z}_q^*$.
- Initialize:
$`
\begin{aligned}
α_0 &= g^x \\
s_0 &= y_0^x \\
b_0 &= H(α_0\ |\ s_0)
\end{aligned}
`$
- For each hop $i$ (from $1$ to $L-1$), compute:
$`
\begin{aligned}
α_i &= α_{i-1}^{b_{i-1}} \\
s_i &= y_{i}^{x\prod_{j=0}^{i-1} b_{j}} \\
b_i &= H(α_i\ |\ s_i)
\end{aligned}
`$
Note that the length of $α_i$ is $32$ bytes as defined in
[Section 8.3](#83-packet-component-sizes).
b. **Compute Per-Hop Filler Strings**
Filler strings are encrypted strings that are appended to the header during
encryption. They ensure that the header length remains constant across hops,
regardless of the position of a node in the mix path.
To compute the sequence of filler strings, perform the following steps:
- Initialize $Φ_0 = \epsilon$ (empty string).
- For each $i$ (from $1$ to $L-1$):
- Derive per-hop AES key and IV:
$`
\begin{array}{l}
Φ_{\mathrm{aes\_key}_{i-1}} =
\mathrm{KDF}(\text{"aes\_key"} \mid s_{i-1})\\
Φ_{\mathrm{iv}_{i-1}} =
\mathrm{KDF}(\text{"iv"} \mid s_{i-1})
\end{array}
`$
- Compute the filler string $Φ_i$ using $\text{AES-CTR}^\prime_i$,
which is AES-CTR encryption with the keystream starting from
index $((t+1)(r-i)+t+2)\kappa$ :
$`
\begin{array}{l}
Φ_i = \mathrm{AES\text{-}CTR}'_i\bigl(Φ_{\mathrm{aes\_key}_{i-1}},
Φ_{\mathrm{iv}_{i-1}}, Φ_{i-1} \mid 0_{(t+1)\kappa} \bigr),
\text{where notation $0_x$ defines the string of $0$ bits of length $x$.}
\end{array}
`$
Note that the length of $Φ_i$ is $(t+1)i\kappa$.
c. **Construct Routing Header**
The routing header as defined in
[Section 8.1](#81-packet-structure-overview) is the encrypted structure
that carries the forwarding instructions for each hop. It ensures that a
mix node can learn only its immediate next hop and forwarding delay without
inferring the full path.
Filler strings computed in the previous step are appended during encryption
to ensure that the header length remains constant across hops. This prevents
a node from distinguishing its position in the path based on header size.
To construct the routing header, perform the following steps for each hop
$i = L-1$ down to $0$, recursively:
- Derive per-hop AES key, MAC key, and IV:
$`
\begin{array}{l}
β_{\mathrm{aes\_key}_i} =
\mathrm{KDF}(\text{"aes\_key"} \mid s_i)\\
\mathrm{mac\_key}_i =
\mathrm{KDF}(\text{"mac\_key"} \mid s_{i})\\
β_{\mathrm{iv}_i} =
\mathrm{KDF}(\text{"iv"} \mid s_i)
\end{array}
`$
- Set the per-hop two-byte encoded delay $\mathrm{delay}_i$ as defined in
[Section 8.4](#84-address-and-delay-encoding):
- If final hop (_i.e.,_ $i = L - 1$), encode a two-byte zero padding.
- For all other hops $i,\ i < L - 1$, sample a forwarding delay
using the delay strategy configured by the application and encode it in two bytes.
The delay strategy is pluggable as defined in [Section 6.2](#62-delay-strategy).
- Using the derived keys and encoded forwarding delay, compute the nested
encrypted routing information $β_i$:
- If $i = L-1$ (_i.e.,_ exit node):
$`
\begin{array}{l}
β_i = \mathrm{AES\text{-}CTR}\bigl(β_{\mathrm{aes\_key}_i},
β_{\mathrm{iv}_i}, Δ \mid \mathrm{delay}_i \mid 0_{((t+1)(r-L)+2)\kappa}
\bigr) \bigm| Φ_{L-1}
\end{array}
`$
- Otherwise (_i.e.,_ intermediary node):
$`
\begin{array}{l}
β_i = \mathrm{AES\text{-}CTR}\bigl(β_{\mathrm{aes\_key}_i},
β_{\mathrm{iv}_i}, \mathrm{addr}_{i+1} \mid \mathrm{delay}_i
\mid γ_{i+1} \mid β_{i+1 \, [0 \ldots (r(t+1) - t)\kappa - 1]} \bigr)
\end{array}
`$
Note that the length of $\beta_i$ is $(r(t+1)+1)\kappa$, $0 \leq i \leq L-1$
as defined in [Section 8.3](#83-packet-component-sizes).
- Compute the message authentication code $γ_i$:
$`
\begin{array}{l}
γ_i = \mathrm{HMAC\text{-}SHA\text{-}256}\bigl(\mathrm{mac\_key}_i,
β_i \bigr)
\end{array}
`$
Note that the length of $\gamma_i$ is $\kappa$ as defined in
[Section 8.3](#83-packet-component-sizes).
d. **Encrypt Payload**
The encrypted payload $δ$ contains the message $m$ defined in Step 1,
prepended with a $κ$-byte string of zeros. It is encrypted in layers such that
each hop in the mix path removes exactly one layer using the per-hop session
key. This ensures that only the final hop (_i.e.,_ the exit node) can fully
recover $m$, validate its integrity, and forward it to the destination.
To compute the encrypted payload, perform the following steps for each hop
$i = L-1$ down to $0$, recursively:
- Derive per-hop AES key and IV:
$`
\begin{array}{l}
δ_{\mathrm{aes\_key}_i} =
\mathrm{KDF}(\text{"δ\_aes\_key"} \mid s_i)\\
δ_{\mathrm{iv}_i} =
\mathrm{KDF}(\text{"δ\_iv"} \mid s_i)
\end{array}
`$
- Using the derived keys, compute the encrypted payload $δ_i$:
- If $i = L-1$ (_i.e.,_ exit node):
$`
\begin{array}{l}
δ_i = \mathrm{AES\text{-}CTR}\bigl(δ_{\mathrm{aes\_key}_i},
δ_{\mathrm{iv}_i}, 0_{\kappa} \mid m
\bigr)
\end{array}
`$
- Otherwise (_i.e.,_ intermediary node):
$`
\begin{array}{l}
δ_i = \mathrm{AES\text{-}CTR}\bigl(δ_{\mathrm{aes\_key}_i},
δ_{\mathrm{iv}_i}, δ_{i+1} \bigr)
\end{array}
`$


@@ -55,6 +55,9 @@ but improves scalability by reducing direct interactions between participants.
Each message has a globally unique, immutable ID (or hash).
Messages can be requested from the high-availability caches or
other participants using the corresponding message ID.
* **Participant ID:**
Each participant has a globally unique, immutable ID
visible to other participants in the communication.
## Wire protocol
@@ -75,17 +78,18 @@ message HistoryEntry {
}
message Message {
string sender_id = 1; // Participant ID of the message sender
string message_id = 2; // Unique identifier of the message
string channel_id = 3; // Identifier of the channel to which the message belongs
optional uint64 lamport_timestamp = 10; // Logical timestamp for causal ordering in channel
repeated HistoryEntry causal_history = 11; // List of preceding message IDs that this message causally depends on. Generally 2 or 3 message IDs are included.
optional bytes bloom_filter = 12; // Bloom filter representing received message IDs in channel
optional bytes content = 20; // Actual content of the message
}
```
The sending participant MUST include its own globally unique identifier in the `sender_id` field.
In addition, it MUST include a globally unique identifier for the message in the `message_id` field,
likely based on a message hash.
The `channel_id` field MUST be set to the identifier of the channel of group communication
that is being synchronized.
@@ -98,12 +102,20 @@ These fields MAY be left unset in the case of [ephemeral messages](#ephemeral-me
The message `content` MAY be left empty for [periodic sync messages](#periodic-sync-message),
otherwise it MUST contain the application-level content
> **_Note:_** Close readers may notice that, outside of filtering messages originating from the sender itself,
the `sender_id` field is not used for much.
Its importance is expected to increase once a p2p retrieval mechanism is added to SDS, as is planned for the protocol.
### Participant state
Each participant MUST maintain:
* A Lamport timestamp for each channel of communication,
initialized to current epoch time in millisecond resolution.
The Lamport timestamp is increased as described in the [protocol steps](#protocol-steps)
to maintain a logical ordering of events while staying close to the current epoch time.
This allows the messages from new joiners to be correctly ordered with other recent messages,
without these new participants first having to synchronize past messages to discover the current Lamport timestamp.
* A bloom filter for received message IDs per channel.
The bloom filter SHOULD be rolled over and
recomputed once it reaches a predefined capacity of message IDs.
@@ -136,8 +148,11 @@ the `lamport_timestamp`, `causal_history` and `bloom_filter` fields.
Before broadcasting a message:
* the participant MUST set its local Lamport timestamp
to the maximum between the current value + `1`
and the current epoch time in milliseconds.
In other words, the local Lamport timestamp is set to `max(timeNowInMs, current_lamport_timestamp + 1)`.
* the participant MUST include the increased Lamport timestamp in the message's `lamport_timestamp` field.
* the participant MUST determine the preceding few message IDs in the local history
and include these in an ordered list in the `causal_history` field.
The number of message IDs to include in the `causal_history` depends on the application.
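A minimal sketch of this timestamp update rule (illustrative Python; variable names are not normative):

```python
import time

def next_lamport_timestamp(current_lamport_timestamp: int) -> int:
    """Returns the Lamport timestamp to attach to the next outgoing message."""
    time_now_in_ms = int(time.time() * 1000)
    return max(time_now_in_ms, current_lamport_timestamp + 1)
```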
@@ -157,6 +172,8 @@ of unacknowledged outgoing messages.
Upon receiving a message,
* the participant SHOULD ignore the message if it has a `sender_id` matching its own.
* the participant MAY deduplicate the message by comparing its `message_id` to previously received message IDs.
* the participant MUST [review the ACK status](#review-ack-status) of messages
in its unacknowledged outgoing buffer
using the received message's causal history and bloom filter.
@@ -240,7 +257,8 @@ participants SHOULD periodically send sync messages to maintain state.
These sync messages:
* MUST be sent with empty content
* MUST include a Lamport timestamp increased to `max(timeNowInMs, current_lamport_timestamp + 1)`,
where `timeNowInMs` is the current epoch time in milliseconds.
* MUST include causal history and bloom filter according to regular message rules
* MUST NOT be added to the unacknowledged outgoing buffer
* MUST NOT be included in causal histories of subsequent messages


@@ -3,9 +3,10 @@ slug: 66
title: 66/WAKU2-METADATA
name: Waku Metadata Protocol
status: draft
editor: Franck Royer <franck@status.im>
contributors:
- Filip Dimitrijevic <filip@status.im>
- Alvaro Revuelta <alrevuelta@status.im>
---
## Abstract
@@ -15,16 +16,19 @@ that can be associated with a [10/WAKU2](/waku/standards/core/10/waku2.md) node.
## Metadata Protocol
The keywords “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”,
“NOT RECOMMENDED”, “MAY”, and “OPTIONAL” in this document are to be interpreted as described in [RFC 2119](https://www.ietf.org/rfc/rfc2119.txt).
Waku specifies a req/resp protocol that provides information about the node's capabilities.
Such metadata MAY be used by other peers for subsequent actions such as light protocol requests or disconnection.
The node that makes the request
includes its metadata so that the receiver is aware of it,
without requiring another round trip.
The parameters are the following:
* `clusterId`: Unique identifier of the cluster that the node is running in.
* `shards`: Shard indexes that the node is subscribed to via [`11/WAKU2-RELAY`](/waku/standards/core/11/relay.md).
***Protocol Identifier***
@@ -48,6 +52,51 @@ message WakuMetadataResponse {
}
```
## Implementation Suggestions
### Triggering Metadata Request
A node SHOULD proceed with metadata request upon first connection to a remote node.
A node SHOULD use the remote node's libp2p peer id as identifier for this heuristic.
A node MAY proceed with metadata request upon reconnection to a remote peer.
A node SHOULD store the remote peer's metadata information for future reference.
A node MAY implement a TTL regarding a remote peer's metadata, and refresh it upon expiry by initiating another metadata request.
It is RECOMMENDED to set the TTL to 6 hours.
A node MAY trigger a metadata request after receiving an error response from a remote node
stating that it does not support a specific cluster or shard.
For example, when using a request-response service such as [`19/WAKU2-LIGHTPUSH`](/waku/standards/core/19/lightpush.md).
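A sketch of this connection-time heuristic, assuming an in-memory cache keyed by libp2p peer id and the RECOMMENDED 6-hour TTL; the structure is illustrative, not the API of any Waku implementation:

```python
import time

METADATA_TTL_SECONDS = 6 * 60 * 60  # RECOMMENDED TTL of 6 hours

class MetadataCache:
    def __init__(self):
        self._entries = {}  # peer_id -> (metadata, stored_at)

    def should_request(self, peer_id: str) -> bool:
        """True on first connection to a peer or when the stored metadata has expired."""
        entry = self._entries.get(peer_id)
        if entry is None:
            return True
        _, stored_at = entry
        return time.time() - stored_at > METADATA_TTL_SECONDS

    def store(self, peer_id: str, metadata) -> None:
        """Keeps the remote peer's metadata for future reference."""
        self._entries[peer_id] = (metadata, time.time())
```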
### Providing Cluster Id
A node MUST include its cluster id in its metadata payload.
It is RECOMMENDED for a node to operate on a single cluster id.
### Providing Shard Information
* Nodes that mount [`11/WAKU2-RELAY`](/waku/standards/core/11/relay.md) MAY include the shards they are subscribed to in their metadata payload.
* Shard-relevant services are message-related services,
such as [`13/WAKU2-STORE`](/waku/standards/core/13/store.md), [12/WAKU2-FILTER](/waku/standards/core/12/filter.md)
and [`19/WAKU2-LIGHTPUSH`](/waku/standards/core/19/lightpush.md)
but not [`34/WAKU2-PEER-EXCHANGE`](/waku/standards/core/34/peer-exchange.md)
* Nodes that mount [`11/WAKU2-RELAY`](/waku/standards/core/11/relay.md) and a shard-relevant service SHOULD include the shards they are subscribed to in their metadata payload.
* Nodes that do not mount [`11/WAKU2-RELAY`](/waku/standards/core/11/relay.md) SHOULD NOT include any shard information.
### Using Cluster Id
When reading the cluster id of a remote peer, the local node MAY disconnect if its own cluster id is different from the remote peer's.
### Using Shard Information
It is NOT RECOMMENDED to disconnect from a peer solely because their shard information differs from the local node's.
Ahead of doing a shard-relevant request,
a node MAY use the previously received metadata shard information to select a peer that supports the targeted shard.
For non-shard-relevant requests, a node SHOULD NOT discriminate against a peer based on metadata shard information.
## Copyright
Copyright and related rights waived via