RFC 0003 - Peer Routing Records
- Start Date: 2019-10-04
- Related Issues:
Abstract
This RFC proposes a method for distributing peer routing records, which contain a peer's publicly reachable listen addresses and may be extended in the future to contain additional metadata relevant to routing. This serves a similar purpose to Ethereum Node Records (ENRs). Like ENRs, libp2p routing records should be extensible, so that we can add information relevant to as-yet unknown use cases.
The record described here does not include a signature, but it is expected to be serialized and wrapped in a signed envelope, which will prove the identity of the issuing peer. The dialer can then prioritize self-certified addresses over addresses from an unknown origin.
Problem Statement
All libp2p peers keep a "peer store", which maps peer ids to a set of known addresses for each peer. When the application layer wants to contact a peer, the dialer will pull addresses from the peer store and try to initiate a connection on one or more addresses.
Addresses for a peer can come from a variety of sources. If we have already made a connection to a peer, the libp2p identify protocol will inform us of other addresses that they are listening on. We may also discover their address by querying the DHT, checking a fixed "bootstrap list", or perhaps through a pubsub message or an application-specific protocol.
In the case of the identify protocol, we can be fairly certain that the addresses originate from the peer we're speaking to, assuming that we're using a secure, authenticated communication channel. However, more "ambient" discovery methods such as DHT traversal and pubsub depend on potentially untrustworthy third parties to relay address information.
Even in the case of receiving addresses via the identify protocol, our confidence that the address came directly from the peer is not actionable, because the peer store does not track the origin of an address. Once added to the peer store, all addresses are considered equally valid, regardless of their source.
We would like to have a means of distributing verifiable address records, which we can prove originated from the addressed peer itself. We also need a way to track the "provenance" of an address within libp2p's internal components such as the peer store. Once those pieces are in place, we will also need a way to prioritize addresses based on their authenticity, with the most strict strategy being to only dial certified addresses.
Complications
While producing a signed record is fairly trivial, there are a few aspects to this problem that complicate things.
- Addresses are not static. A given peer may have several addresses at any given time, and the set of addresses can change at arbitrary times.
- Peers may not know their own addresses. It's often impossible to automatically infer one's own public address, and peers may need to rely on third-party peers to inform them of their observed public addresses.
- A peer may inadvertently or maliciously sign an address that they do not control. In other words, a signature isn't a guarantee that a given address is valid.
- Some addresses may be ambiguous. For example, addresses on a private subnet are valid within that subnet but are useless on the public internet.
The first point can be addressed by having records contain a sequence number that increases monotonically when new records are issued, and by having newer records replace older ones.
The other points, while worth thinking about, are out of scope for this RFC. However, we can take care to make our records extensible so that we can add additional metadata in the future. Some thoughts along these lines are in the Future Work section below.
Address Record Format
Here's a protobuf that might work:
// RoutingState contains the listen addresses for a peer at a particular point in time.
message RoutingState {
  // AddressInfo wraps a multiaddr. In the future, it may be extended to
  // contain additional metadata, such as "routability" (whether an address is
  // local or global, etc).
  message AddressInfo {
    bytes multiaddr = 1;
  }
  // the peer id of the subject of the record (who these addresses belong to).
  bytes peer_id = 1;
  // A monotonically increasing sequence number, used for record ordering.
  uint64 seq = 2;
  // All current listen addresses
  repeated AddressInfo addresses = 4;
}
The AddressInfo wrapper message is used instead of a bare multiaddr to allow
us to extend addresses with additional metadata in the future.
The seq field contains a sequence number that MUST increase monotonically as
new records are created. Newer records MUST have a higher seq value than older
records. To avoid persisting state across restarts, implementations MAY use Unix
epoch time as the seq value. However, they MUST NOT attempt to interpret a
seq value from another peer as a valid timestamp.
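The ordering rule can be sketched as follows. This is a minimal illustration, not part of the proposal: the helper names next_seq and should_replace are hypothetical, and the fallback to last_seq + 1 is one possible way to keep seq strictly increasing when two records are issued within the same second.

```python
import time


def next_seq(last_seq: int, now=None) -> int:
    """Return a seq value strictly greater than last_seq.

    Uses Unix epoch seconds when the clock has advanced past last_seq, and
    falls back to last_seq + 1 otherwise (same second, or clock stepped back).
    """
    current = int(time.time()) if now is None else int(now)
    return max(current, last_seq + 1)


def should_replace(existing_seq, incoming_seq: int) -> bool:
    """Accept an incoming record only if it is strictly newer than the stored one.

    existing_seq is None when no record is stored for the peer yet. The values
    are compared as opaque integers and never interpreted as timestamps.
    """
    return existing_seq is None or incoming_seq > existing_seq
```

A receiver only ever compares seq values; it does not need to know whether the issuer derived them from a clock or from a counter.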
Example
{
  peer_id: "QmAlice...",
  seq: 1570215229,
  addresses: [
    {
      multiaddr: "/ip4/1.2.3.4/tcp/42/p2p/QmAlice",
    },
    {
      multiaddr: "/ip4/10.0.1.2/tcp/42/p2p/QmAlice",
    }
  ]
}
Certification / Verification
This structure can be serialized and contained in a signed envelope, which lets us issue "self-certified" address records that are signed by the peer that the addresses belong to.
To produce a "self-certified" address, a peer will construct a RoutingState
containing all of their publicly-reachable listen addresses. A peer SHOULD only
include addresses that it believes are routable via the public internet, ideally
having confirmed that this is the case via some external mechanism such as a
successful AutoNAT dial-back.
In some cases we may want to include localhost or LAN-local addresses; for example, when testing the DHT using many processes on a single machine. To support this, implementations may use a global runtime configuration flag or environment variable to control whether local addresses will be included.
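One way such a switch could look is sketched below. Everything here is an assumption made for illustration: the environment variable name ROUTING_ALLOW_LOCAL_ADDRS is hypothetical, multiaddrs are handled as plain strings, and only ip4/ip6 addresses are classified (anything else, such as DNS-based addresses, is conservatively treated as non-public).

```python
import ipaddress
import os


def is_public(multiaddr: str) -> bool:
    """True if the multiaddr starts with a globally routable ip4/ip6 component."""
    parts = multiaddr.strip("/").split("/")
    if len(parts) < 2 or parts[0] not in ("ip4", "ip6"):
        return False  # this sketch cannot classify other address types
    try:
        return ipaddress.ip_address(parts[1]).is_global
    except ValueError:
        return False  # malformed IP literal


def addrs_to_certify(listen_addrs, env=os.environ):
    """Select the listen addresses that would go into a RoutingState.

    Local addresses are kept only when the (hypothetical) override is set.
    """
    allow_local = env.get("ROUTING_ALLOW_LOCAL_ADDRS") == "1"
    return [a for a in listen_addrs if allow_local or is_public(a)]
```

The filter only approximates "believed to be routable"; it does not replace an external confirmation such as an AutoNAT dial-back.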
Once the RoutingState has been constructed, it should be serialized to a byte
string and wrapped in a signed envelope. The peer_id contained in the record
MUST be derivable from the public_key field of the envelope. If the peer id
derived from the envelope's public_key does not match the peer_id of the
routing record, the record MUST be rejected as invalid.
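The ownership check might be sketched as follows. This is an illustration under stated assumptions: derive_peer_id is a hypothetical stand-in that hashes the key with SHA-256, whereas real libp2p peer ids are multihashes of the serialized public key; the RoutingState class is an already-decoded record rather than a protobuf message.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class RoutingState:
    """Decoded routing record (hypothetical in-memory form)."""
    peer_id: bytes
    seq: int
    addresses: tuple


def derive_peer_id(public_key: bytes) -> bytes:
    # Stand-in only: a bare SHA-256 digest keeps the sketch short. It is not
    # the real libp2p peer id derivation.
    return hashlib.sha256(public_key).digest()


def check_record_owner(envelope_public_key: bytes, record: RoutingState) -> None:
    """Raise ValueError if the envelope key does not match the record's peer_id."""
    if derive_peer_id(envelope_public_key) != record.peer_id:
        raise ValueError("envelope public_key does not match record peer_id")
```

This check is separate from signature validation: a record can carry a valid signature and still be rejected because the signer is not the peer the record describes.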
Signed Envelope Domain
Signed envelopes require a "domain separation" string that defines the scope or purpose of a signature.
When wrapping a RoutingState in a signed envelope, the domain string MUST be
libp2p-routing-state.
Signed Envelope Payload Type
Signed envelopes contain a payload_type field that indicates how to interpret
the contents of the envelope.
Ideally, we should define a new multicodec for routing records, so that we can
identify them in a few bytes. While we're still spec'ing and working on the
initial implementation, we can use the UTF-8 string
"/libp2p/routing-state-record" as the payload_type value.
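To show how the domain string and payload_type could enter the signature, here is a sketch of building the bytes to be signed. It assumes the signed envelope signs a concatenation in which each field is prefixed with its length as an unsigned varint; that layout belongs to the signed envelope specification and should be confirmed there. The function names are hypothetical, and the signing step itself is omitted.

```python
DOMAIN = b"libp2p-routing-state"
PAYLOAD_TYPE = b"/libp2p/routing-state-record"


def uvarint(n: int) -> bytes:
    """Encode a non-negative integer as an unsigned LEB128 varint."""
    if n < 0:
        raise ValueError("uvarint cannot encode negative numbers")
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def signing_input(payload: bytes, domain=DOMAIN, payload_type=PAYLOAD_TYPE) -> bytes:
    """Concatenate domain, payload_type and payload, each length-prefixed."""
    parts = (domain, payload_type, payload)
    return b"".join(uvarint(len(p)) + p for p in parts)
```

Because the domain string is part of the signed bytes, a signature produced for a routing record cannot be replayed as a signature over the same payload in another domain.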
Peer Store APIs
We will need to add a few methods to the peer store:
- AddCertifiedAddrs(envelope) -> Maybe<Error>
  - Add a self-certified address, wrapped in a signed envelope. This should
    validate the envelope signature and store the envelope for future reference.
    If any certified addresses already exist for the peer, only accept the new
    envelope if it has a greater seq value than existing envelopes.
- CertifiedAddrs(peer_id) -> Set<Multiaddr>
  - Return the set of self-certified addresses for the given peer id.
- SignedRoutingState(peer_id) -> Maybe<SignedEnvelope>
  - Retrieve the signed envelope that was most recently added to the peer store for the given peer, if any exists.
And possibly:
- IsCertified(peer_id, multiaddr) -> Boolean
  - Has a particular address been self-certified by the given peer?
We'll also need a method that constructs a new RoutingState containing our
listen addresses and wraps it in a signed envelope. This may belong on the Host
instead of the peer store, since it needs access to the private signing key.
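The peer store methods above could behave roughly like this in-memory sketch. It is not a proposed implementation: DecodedEnvelope is a hypothetical container for an envelope whose payload has already been decoded, signature checking is delegated to a caller-supplied verify function, and Maybe<Error> is modelled as None or an error string.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DecodedEnvelope:
    # Hypothetical: a real peer store would keep the raw signed envelope and
    # decode the RoutingState payload from it.
    peer_id: bytes
    seq: int
    addresses: tuple
    raw: bytes = b""


class CertifiedAddrBook:
    """In-memory sketch of the proposed peer store methods."""

    def __init__(self, verify):
        self._verify = verify  # verify(envelope) -> bool, supplied by caller
        self._envelopes = {}   # peer_id -> most recently accepted envelope

    def add_certified_addrs(self, envelope):
        """Return None on success, or an error string."""
        if not self._verify(envelope):
            return "invalid envelope signature"
        current = self._envelopes.get(envelope.peer_id)
        if current is not None and envelope.seq <= current.seq:
            return "stale record"
        self._envelopes[envelope.peer_id] = envelope
        return None

    def certified_addrs(self, peer_id):
        env = self._envelopes.get(peer_id)
        return set(env.addresses) if env else set()

    def signed_routing_state(self, peer_id):
        return self._envelopes.get(peer_id)

    def is_certified(self, peer_id, multiaddr):
        return multiaddr in self.certified_addrs(peer_id)
```

Keeping the whole envelope rather than only the addresses lets a peer hand the signed record on to others, who can verify it independently.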
Dialing Strategies
Once self-certified addresses are available via the peer store, we can update the dialer to prefer using them when possible. Some systems may want to only dial self-certified addresses, so we should include some configuration options to control whether non-certified addresses are acceptable.
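A dialer preference of this kind might be sketched as below. The function name order_for_dial and the certified_only option are hypothetical, and addresses are treated as plain strings.

```python
def order_for_dial(known_addrs, certified_addrs, certified_only=False):
    """Return addresses in dial order: self-certified first, then the rest.

    known_addrs is every address the peer store holds for the peer;
    certified_addrs is the subset backed by a signed record. With
    certified_only=True, non-certified addresses are dropped entirely.
    """
    certified = [a for a in known_addrs if a in certified_addrs]
    if certified_only:
        return certified
    uncertified = [a for a in known_addrs if a not in certified_addrs]
    return certified + uncertified
```

With certified_only=True the result may be empty, in which case a strict dialer would have to fail the dial rather than fall back to addresses of unknown origin.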
Changes to core libp2p protocols
How should these records be published to the DHT? Are there backward compatibility issues with older unsigned address records? Perhaps we could simply publish these under a different key prefix.
Should we update identify and mDNS discovery to use signed records?
Future Work
Some things that were originally considered in this RFC were trimmed so that we can focus on delivering a basic self-certified record, which is a pressing need.
This includes a notion of "routability", which could be used to communicate whether a given address is global (reachable via the public internet), LAN-local, etc. We may also want to include some kind of confidence score or priority ranking, so that peers can communicate which addresses they would prefer other peers to use.
To allow these fields to be added in the future, we wrap multiaddrs in the
AddressInfo message instead of having the addresses field be a list of "raw"
multiaddrs.
Another potentially useful extension would be a compact protocol table or bloom
filter that could be used to test whether a peer supports a given protocol
before interacting with them directly. This could be added as a new field in the
RoutingState message.