docs: finalize Section 8 Sphinx Packet Construction and Handling (#202)

This PR builds on PR #173 and completes the remaining construction and
runtime processing logic in `Section 8` of the Mix Protocol RFC. It
finalizes the last steps of packet construction (`Section 8.5.2 step 3.
e–f`) and introduces the complete mix node handler logic in `Section
8.6`, including intermediary and exit processing.
It clearly separates construction, role determination, and processing
logic.

### Changes Introduced in This PR

- **8.5.2 Construction Steps (Final Steps Added)**
  - Sphinx packet construction
    - [x] Assemble Final Packet
    - [x] Transmit Packet
    
- **8.6 Sphinx Packet Handling**
  - [x] **8.6.1 Shared Preprocessing**
- Derives session key, validates replay tag and MAC, decrypts
header/payload
  - [x] **8.6.2 Node Role Determination**
- Inspects decrypted header prefix and padding to classify node as
intermediary or exit
  - [x] **8.6.3 Intermediary Processing**
    - Parses next hop address and mean delay
    - Updates ephemeral key and routing fields
    - Samples actual forwarding delay and transmits packet
    - Erases all temporary state.
  - [x] **8.6.4 Exit Processing**
    - Verifies payload padding and extracts destination address
    - Parses and validates application-layer message
- Hands off to Exit Layer along with origin protocol codec and
destination address

### Highlights
  - Explicit role determination via zero-delay and padding inspection
  - Fully decoupled construction and handling logic
  - Forwarding delay behavior updated:
    - Sender selects per-hop mean delay
    - Mix node samples actual delay using pluggable distribution

---------

Co-authored-by: kaiserd <1684595+kaiserd@users.noreply.github.com>
This commit is contained in:
AkshayaMani
2025-12-10 07:24:23 -05:00
committed by GitHub
parent aaf158aa59
commit 9d11a22901

View File

@@ -87,8 +87,7 @@ A per-message flag set by the origin protocol to indicate that a message should
be routed using
the Mix Protocol or not.
Only messages with mixify set are forwarded to the Mix Entry Layer.
Other messages SHOULD be routed using the origin protocols default behavior.
Other messages SHOULD be routed using the origin protocols default behavior.
The phrases 'messages to be mixified', 'to mixify a message' and related
variants are used
informally throughout this document to refer to messages that either have the
@@ -1100,8 +1099,10 @@ For interoperability, a recommended default encoding format involves:
- Port number (2 bytes)
- Peer IDs (39 bytes, post-Base58 decoding)
- Encoding the forwarding delay as an unsigned 16-bit integer (2 bytes) in
milliseconds, using big endian network byte order.
- Encoding the forwarding delay as an unsigned 16-bit integer (2 bytes),
representing the mean delay in milliseconds for the configured delay
distribution, using big endian network byte order.
The delay distribution is pluggable, as defined in [Section 6.2](#62-delay-strategy).
If the encoded address or delay is shorter than its respective allocated
field, it MUST be padded with zeros. If it exceeds the allocated size, it
@@ -1165,20 +1166,24 @@ The construction MUST proceed as follows:
1. **Prepare Application Message**
- Apply any configured spam protection mechanism (_e.g.,_ PoW, VDF, RLN)
to the serialized message. Spam protection mechanisms are pluggable as defined
in [Section 6.3](#63-spam-protection).
to the serialized message. Spam protection mechanisms are pluggable as defined
in [Section 6.3](#63-spam-protection).
- Attach one or more SURBs, if required. Their format and processing are
specified in [Section X.X].
- Append the origin protocol codec.
specified in [Section X.X].
- Append the origin protocol codec in a format that enables the exit node to
reliably extract it during parsing. A recommended encoding approach is to
prefix the codec string with its length, encoded as a compact varint field
limited to two bytes. Regardless of the scheme used, implementations MUST
agree on the format within a deployment to ensure deterministic decoding.
- Pad the result to the maximum application message length of $3968$ bytes
using a deterministic padding scheme. This value is derived from the fixed
payload size in [Section 8.3.2](#832-payload-size) ($3984$ bytes) minus the
security parameter $κ = 16$ bytes defined in
[Section 8.2](#82-cryptographic-primitives). The chosen scheme MUST yield a
fixed-size padded output and MUST be consistent across all mix nodes to
ensure correct interpretation during unpadding. For example, schemes that
explicitly encode the padding length and prepend zero-valued padding bytes
MAY be used.
using a deterministic padding scheme. This value is derived from the fixed
payload size in [Section 8.3.2](#832-payload-size) ($3984$ bytes) minus the
security parameter $κ = 16$ bytes defined in
[Section 8.2](#82-cryptographic-primitives). The chosen scheme MUST yield a
fixed-size padded output and MUST be consistent across all mix nodes to
ensure correct interpretation during unpadding. For example, schemes that
explicitly encode the padding length and prepend zero-valued padding bytes
MAY be used.
- Let the resulting message be $m$.
2. **Select A Mix Path**
@@ -1220,8 +1225,8 @@ The construction MUST proceed as follows:
\end{aligned}
`$
Note that the length of $α_i$ is $32$ bytes as defined in
[Section 8.3](#83-packet-component-sizes).
Note that the length of $α_i$ is $32$ bytes, $0 \leq i \leq L-1$ as defined in
[Section 8.3.1](#831-header-field-sizes).
b. **Compute Per-Hop Filler Strings**
Filler strings are encrypted strings that are appended to the header during
@@ -1246,17 +1251,17 @@ The construction MUST proceed as follows:
- Compute the filler string $Φ_i$ using $\text{AES-CTR}^\prime_i$,
which is AES-CTR encryption with the keystream starting from
index $((t+1)(r-i)+t+2)\kappa$ :
index $((t+1)(r-i)+t+2)κ$ :
$`
\begin{array}{l}
Φ_i = \mathrm{AES\text{-}CTR}'_i\bigl(Φ_{\mathrm{aes\_key}_{i-1}},
Φ_{\mathrm{iv}_{i-1}}, Φ_{i-1} \mid 0_{(t+1)\kappa} \bigr),
Φ_{\mathrm{iv}_{i-1}}, Φ_{i-1} \mid 0_{(t+1)κ} \bigr),\; \; \;
\text{where notation $0_x$ defines the string of $0$ bits of length $x$.}
\end{array}
`$
Note that the length of $Φ_i$ is $(t+1)i\kappa$.
Note that the length of $Φ_i$ is $(t+1)iκ$, $0 \leq i \leq L-1$.
c. **Construct Routing Header**
The routing header as defined in
@@ -1286,21 +1291,22 @@ The construction MUST proceed as follows:
`$
- Set the per hop two-byte encoded delay $\mathrm{delay}_i$ as defined in
[Section 8.4](#84-address-and-delay-encoding):
[Section 8.4](#84-address-and-delay-encoding):
- If final hop (_i.e.,_ $i = L - 1$), encode two byte zero padding.
- For all other hop $i,\ i < L - 1$, sample a forwarding delay
using the delay strategy configured by the application and encode it in two bytes.
The delay strategy is pluggable as defined in [Section 6.2](#62-delay-strategy).
- For all other hop $i,\ i < L - 1$, select the mean forwarding delay
for the delay strategy configured by the application, and encode it as a
two-byte value. The delay strategy is pluggable, as defined in
[Section 6.2](#62-delay-strategy).
- Using the derived keys and encoded forwarding delay, compute the nested
encrypted routing information $β_i$:
encrypted routing information $β_i$:
- If $i = L-1$ (_i.e.,_ exit node):
$`
\begin{array}{l}
β_i = \mathrm{AES\text{-}CTR}\bigl(β_{\mathrm{aes\_key}_i},
β_{\mathrm{iv}_i}, Δ \mid \mathrm{delay}_i \mid 0_{((t+1)(r-L)+2)\kappa}
β_{\mathrm{iv}_i}, Δ \mid \mathrm{delay}_i \mid 0_{((t+1)(r-L)+2)κ}
\bigr) \bigm| Φ_{L-1}
\end{array}
`$
@@ -1310,13 +1316,15 @@ The construction MUST proceed as follows:
$`
\begin{array}{l}
β_i = \mathrm{AES\text{-}CTR}\bigl(β_{\mathrm{aes\_key}_i},
β_{\mathrm{iv}_i}, \mathrm{addr}_{i+1} \mid $\mathrm{delay}_i$
\mid γ_{i+1} \mid β_{i+1 \, [0 \ldots (r(t+1) - t)\kappa - 1]} \bigr)
β_{\mathrm{iv}_i}, \mathrm{addr}_{i+1} \mid \mathrm{delay}_i
\mid γ_{i+1} \mid β_{i+1 \, [0 \ldots (r(t+1) - t)κ - 1]} \bigr),\; \; \;
\text{where notation $X_{[a \ldots b]}$ denotes the substring of $X$
from byte offset $a$ to $b$, inclusive, using zero-based indexing.}
\end{array}
`$
Note that the length of $\beta_i$ is $(r(t+1)+1)\kappa$, $0 \leq i \leq L-1$
as defined in [Section 8.3](#83-packet-component-sizes).
Note that the length of $\beta_i$ is $(r(t+1)+1)κ$, $0 \leq i \leq L-1$
as defined in [Section 8.3.1](#831-header-field-sizes).
- Compute the message authentication code $γ_i$:
@@ -1327,8 +1335,8 @@ The construction MUST proceed as follows:
\end{array}
`$
Note that the length of $\gamma_i$ is $\kappa$ as defined in
[Section 8.3](#83-packet-component-sizes).
Note that the length of $\gamma_i$ is $κ$, $0 \leq i \leq L-1$ as defined in
[Section 8.3.1](#831-header-field-sizes).
d. **Encrypt Payload**
The encrypted payload $δ$ contains the message $m$ defined in Step 1,
@@ -1357,7 +1365,7 @@ The construction MUST proceed as follows:
$`
\begin{array}{l}
δ_i = \mathrm{AES\text{-}CTR}\bigl(δ_{\mathrm{aes\_key}_i},
δ_{\mathrm{iv}_i}, 0_{\kappa} \mid m
δ_{\mathrm{iv}_i}, 0_{κ} \mid m
\bigr)
\end{array}
`$
@@ -1370,3 +1378,376 @@ The construction MUST proceed as follows:
δ_{\mathrm{iv}_i}, δ_{i+1} \bigr)
\end{array}
`$
Note that the length of $\delta_i$, $0 \leq i \leq L-1$ is $|m| + κ$ bytes.
Given that the derived size of $\delta_i$ is $3984$ bytes as defined in
[Section 8.3.2](#832-payload-size), this allows $m$ to be of length
$3984-16 = 3968$ bytes as defined in Step 1.
e. **Assemble Final Packet**
The final Sphinx packet is structured as defined in
[Section 8.3](#83-packet-component-sizes):
```text
α = α_0 // 32 bytes
β = β_0 // 576 bytes
γ = γ_0 // 16 bytes
δ = δ_0 // 3984 bytes
```
Serialize the final packet using a consistent format and
prepare it for transmission.
f. **Transmit Packet**
- Sample a randomized delay from the same distribution family used for
per-hop delays (in Step 3.e.) with an independently chosen mean.
This delay prevents timing correlation when multiple Sphinx packets are
sent in quick succession. Such bursts may occur when an upstream protocol
fragments a large message, or when several messages are sent close together.
- After the randomized delay elapses, transmit the serialized packet to
the first hop via a libp2p stream negotiated under the
`"/mix/1.0.0"` protocol identifier.
Implementations MAY reuse an existing stream to the first hop as
described in [Section 5.5](#55-stream-management-and-multiplexing), if
doing so does not introduce any observable linkability between the
packets.
Once a Sphinx packet is constructed and transmitted by the initiating node, it is
processed hop-by-hop by the remaining mix nodes in the path. Each node receives
the packet over a libp2p stream negotiated under the `"/mix/1.0.0"` protocol.
The following subsection defines the per-hop packet handling logic expected of
each mix node, depending on whether it acts as an intermediary or an exit.
### 8.6 Sphinx Packet Handling
Each mix node MUST implement a handler for incoming data received over
libp2p streams negotiated under the `"/mix/1.0.0"` protocol identifier.
The incoming stream may have been reused by the previous hop, as described
in [Section 5.5](#55-stream-management-and-multiplexing). Implementations
MUST ensure that packet handling remains stateless and unlinkable,
regardless of stream reuse.
Upon receiving the stream payload, the node MUST interpret it as a Sphinx packet
and process it in one of two roles&mdash;intermediary or exit&mdash; as defined in
[Section 7.3](#73-sphinx-packet-receiving-and-processing). This section defines
the exact behavior for both roles.
#### 8.6.1 Shared Preprocessing
Upon receiving a stream payload over a libp2p stream, the mix node MUST first
deserialize it into a Sphinx packet `(α, β, γ, δ)`.
The deserialized fields MUST match the sizes defined in [Section 8.5.2](#852-construction-steps)
step 3.e., and the total packet length MUST match the fixed packet size defined in
[Section 8.3.2](#832-payload-size).
If the stream payload does not match the expected length, it MUST be discarded and
the processing MUST terminate.
After successful deserialization, the mix node performs the following steps:
1. **Derive Session Key**
Let $x \in \mathbb{Z}_q^*$ denote the node's X25519 private key.
Compute the shared secret $s = α^x$.
2. **Check for Replays**
- Compute the tag $H(s)$.
- If the tag exists in the node's table of previously seen tags,
discard the packet and terminate processing.
- Otherwise, store the tag in the table.
The table MAY be flushed when the node rotates its private key.
Implementations SHOULD perform this cleanup securely and automatically.
3. **Check Header Integrity**
- Derive the MAC key from the session secret $s$:
$`
\begin{array}{l}
\mathrm{mac\_key} =
\mathrm{KDF}(\text{"mac\_key"} \mid s)
\end{array}
`$
- Verify the integrity of the routing header:
$`
\begin{array}{l}
γ \stackrel{?}{=} \mathrm{HMAC\text{-}SHA\text{-}256}(\mathrm{mac\_key},
β)
\end{array}
`$
If the check fails, discard the packet and terminate processing.
4. **Decrypt One Layer of the Routing Header**
- Derive the routing header AES key and IV from the session secret $s$:
$`
\begin{array}{l}
β_{\mathrm{aes\_key}} =
\mathrm{KDF}(\text{"aes\_key"} \mid s)\\
β_{\mathrm{iv}} =
\mathrm{KDF}(\text{"iv"} \mid s)
\end{array}
`$
- Decrypt the suitably padded $β$ to obtain the routing block $B$ for this hop:
$`
\begin{array}{l}
B = \mathrm{AES\text{-}CTR}\bigl(β_{\mathrm{aes\_key}},
β_{\mathrm{iv}}, β \mid 0_{(t+1)κ}
\bigr)
\end{array}
`$
This step removes the filler string appended during header encryption in
[Section 8.5.2](#852-construction-steps) step 3.c. and
yields the plaintext routing information for this hop.
The routing block $B$ MUST be parsed according to the rules and field layout
defined in [Section 8.6.2](#862-node-role-determination) to determine
whether the current node is an intermediary or the exit.
5. **Decrypt One Layer of the Payload**
- Derive the payload AES key and IV from the session secret $s$:
$`
\begin{array}{l}
δ_{\mathrm{aes\_key}} =
\mathrm{KDF}(\text{"δ\_aes\_key"} \mid s)\\
δ_{\mathrm{iv}} =
\mathrm{KDF}(\text{"δ\_iv"} \mid s)
\end{array}
`$
- Decrypt one layer of the encrypted payload $δ$:
$`
\begin{array}{l}
δ' = \mathrm{AES\text{-}CTR}\bigl(δ_{\mathrm{aes\_key}},
δ_{\mathrm{iv}}, δ \bigr)
\end{array}
`$
The resulting $δ'$ is the decrypted payload for this hop and MUST be
interpreted depending on the parsed node's role, determined by $B$, as
described in [Section 8.6.2](#862-node-role-determination).
#### 8.6.2 Node Role Determination
As described in [Section 8.6.1](#861-shared-preprocessing), the mix node
obtains the routing block $B$ by decrypting one layer of the encrypted
header $β$.
At this stage, the node MUST determine whether it is an intermediary
or the exit based on the prefix of $B$, in accordance with the construction of
$β_i$ defined in [Section 8.5.2](#852-construction-steps) step 3.c.:
- If the first $(tκ - 2)$ bytes of $B$ contain a nonzero-encoded
address, immediately followed by a two-byte zero delay,
and then $((t + 1)(r - L) + t + 2)κ$ bytes of all-zero padding,
process the packet as an exit.
- Otherwise, process the packet as an intermediary.
The following subsections define the precise behavior for each case.
#### 8.6.3 Intermediary Processing
Once the node determines its role as an intermediary following the steps in
[Section 8.6.2](#862-node-role-determination), it MUST perform the following
steps to interpret routing block $B$ and decrypted payload $δ'$ obtained in
[Section 8.6.1](#861-shared-preprocessing):
1. **Parse Routing Block**
Parse the routing block $B$ according to the $β_i$, $i \neq L - 1$ construction
defined in [Section 8.5.2](#852-construction-steps) step 3.c.:
- Extract the first $(tκ - 2)$ bytes of $B$ as the next hop address $\mathrm{addr}$
$`
\begin{array}{l}
\mathrm{addr} = B_{[0\ldots(tκ - 2) - 1]}
\end{array}
`$
- Extract next two bytes as the mean delay $\mathrm{delay}$
$`
\begin{array}{l}
\mathrm{delay} = B_{[(tκ - 2)\ldots{tκ} - 1]}
\end{array}
`$
- Extract next $κ$ bytes as the next hop MAC $γ'$
$`
\begin{array}{l}
γ' = B_{[tκ\ldots(t + 1)κ - 1]}
\end{array}
`$
- Extract next $(r(t+1)+1)κ$ bytes as the next hop routing information $β'$
$`
\begin{array}{l}
β' = B_{[(t + 1)κ\ldots(r(t +1 ) + t + 2)\kappa - 1]}
\end{array}
`$
If parsing fails, discard the packet and terminate processing.
2. **Update Header Fields**
Update the header fields according to the construction steps
defined in [Section 8.5.2](#852-construction-steps):
- Compute the next hop ephemeral public value $α'$, deriving the blinding factor
$b$ from the shared secret $s$ computed in
[Section 8.6.1](#861-shared-preprocessing) step 1.
$`
\begin{aligned}
b &= H(α\ |\ s) \\
α' &= α^b
\end{aligned}
`$
- Use the $β'$ and $γ'$ extracted in Step 1. as the routing information and
MAC respectively in the outgoing packet.
3. **Update Payload**
Use the decrypted payload $δ'$ computed in
[Section 8.6.1](#861-shared-preprocessing) step 5. as the payload in the
outgoing packet.
4. **Assemble Final Packet**
The final Sphinx packet is structured as defined in
[Section 8.3](#83-packet-component-sizes):
```text
α = α' // 32 bytes
β = β' // 576 bytes
γ = γ' // 16 bytes
δ = δ' // 3984 bytes
```
Serialize $α'$ using the same format used in
[Section 8.5.2](#852-construction-steps). The remaining fields are
already fixed-length buffers and do not require further
transformation.
5. **Transmit Packet**
- Interpret the $\mathrm{addr}$ and $\mathrm{delay}$ extracted in
Step 1. according to the encoding format used during construction in
[Section 8.5.2](#852-construction-steps) Step 3.c.
- Sample the actual forwarding delay from the configured delay distribution,
using the decoded mean delay value as the distribution parameter.
- After the forwarding delay elapses, transmit the serialized packet to
the next hop address via a libp2p stream negotiated under the `"/mix/1.0.0"`
protocol identifier.
Implementations MAY reuse an existing stream to the next hop as
described in [Section 5.5](#55-stream-management-and-multiplexing), if
doing so does not introduce any observable linkability between the
packets.
6. **Erase State**
- After transmission, erase all temporary values securely from memory,
including session keys, decrypted content, and routing metadata.
- If any error occurs&mdash;such as malformed header, invalid delay, or
failed stream transmission&mdash;silently discard the packet and do not
send any error response.
#### 8.6.4 Exit Processing
Once the node determines its role as an exit following the steps in
[Section 8.6.2](#862-node-role-determination), it MUST perform the following
steps to interpret routing block $B$ and decrypted payload $δ'$ obtained in
[Section 8.6.1](#861-shared-preprocessing):
1. **Parse Routing Block**
Parse the routing block $B$ according to the $β_i$, $i = L - 1$
construction defined in [Section 8.5.2](#852-construction-steps) step 3.c.:
- Extract first $(tκ - 2)$ bytes of $B$ as the destination address $Δ$
$`
\begin{array}{l}
Δ = B_{[0\ldots(tκ - 2) - 1]}
\end{array}
`$
2. **Recover Padded Application Message**
- Verify the decrypted payload $δ'$ computed in
[Section 8.6.1](#861-shared-preprocessing) step 5.:
$`
\begin{array}{l}
δ'_{[0\ldots{κ} - 1]} \stackrel{?}{=} 0_{κ}
\end{array}
`$
If the check fails, discard $δ'$ and terminate processing.
- Extract rest of the bytes of $δ'$ as the padded application message $m$:
$`
\begin{array}{l}
m = δ'_{[κ\ldots]},\; \; \;
\text{where notation $X_{[a \ldots]}$ denotes the substring of $X$
from byte offset $a$ to the end of the string using zero-based indexing.}
\end{array}
`$
3. **Extract Application Message**
Interpret recovered $m$ according to the construction steps
defined in [Section 8.5.2](#852-construction-steps) step 1.:
- First, unpad $m$ using the deterministic padding scheme defined during
construction.
- Next, parse the unpadded message deterministically to extract:
- optional spam protection proof
- zero or more SURBs
- the origin protocol codec
- the serialized application message
- Parse and deserialize the metadata fields required for spam validation,
SURB extraction, and protocol codec identification, consistent with the
format and extensions applied by the initiator.
The application message itself MUST remain serialized.
- If parsing fails at any stage, discard $m$ and terminate processing.
4. **Handoff to Exit Layer**
- Hand off the serialized application message, the origin protocol codec, and
destination address $Δ$ (extracted in step 1.) to the local Exit layer for
further processing and delivery.
- The Exit Layer is responsible for establishing a client-only connection and
forwarding the message to the destination. Implementations MAY reuse an
existing stream to the destination, if doing so does not introduce any
observable linkability between forwarded messages.