trim scope & rename to "routing records"
@@ -1,267 +0,0 @@
# RFC 0003 - Address Records with Metadata

- Start Date: 2019-10-04
- Related Issues:
  - [libp2p/issues/47](https://github.com/libp2p/libp2p/issues/47)
  - [go-libp2p/issues/436](https://github.com/libp2p/go-libp2p/issues/436)

## Abstract

This RFC proposes a method for distributing address records, which contain a
peer's publicly reachable listen addresses, as well as some metadata that can
help other peers categorize addresses and prioritize them when dialing.

The record described here does not include a signature, but it is expected to
be serialized and wrapped in a [signed envelope][envelope-rfc], which will
prove the identity of the issuing peer. The dialer can then prioritize
self-certified addresses over addresses from an unknown origin.

## Problem Statement

All libp2p peers keep a "peer store" (called a peer book in some
implementations), which maps [peer ids][peer-id-spec] to a set of known
addresses for each peer. When the application layer wants to contact a peer, the
dialer will pull addresses from the peer store and try to initiate a connection
on one or more addresses.

Addresses for a peer can come from a variety of sources. If we have already made
a connection to a peer, the libp2p [identify protocol][identify-spec] will
inform us of other addresses that they are listening on. We may also discover
their address by querying the DHT, checking a fixed "bootstrap list", or perhaps
through a pubsub message or an application-specific protocol.

In the case of the identify protocol, we can be fairly certain that the
addresses originate from the peer we're speaking to, assuming that we're using a
secure, authenticated communication channel. However, more "ambient" discovery
methods such as DHT traversal and pubsub depend on potentially untrustworthy
third parties to relay address information.

Even in the case of receiving addresses via the identify protocol, our
confidence that the address came directly from the peer is not actionable, because
the peer store does not track the origin of an address. Once added to the peer
store, all addresses are considered equally valid, regardless of their source.

We would like to have a means of distributing _verifiable_ address records,
which we can prove originated from the addressed peer itself. We also need a way to
track the "provenance" of an address within libp2p's internal components such as
the peer store. Once those pieces are in place, we will also need a way to
prioritize addresses based on their authenticity, with the most strict strategy
being to only dial certified addresses.

### Complications

While producing a signed record is fairly trivial, there are a few aspects to
this problem that complicate things.

1. Addresses are not static. A given peer may have several addresses at any given
   time, and the set of addresses can change at arbitrary times.
2. Peers may not know their own addresses. It's often impossible to automatically
   infer one's own public address, and peers may need to rely on third party
   peers to inform them of their observed public addresses.
3. A peer may inadvertently or maliciously sign an address that they do not
   control. In other words, a signature isn't a guarantee that a given address is
   valid.
4. Some addresses may be ambiguous. For example, addresses on a private subnet
   are valid within that subnet but are useless on the public internet.

The first point implies that the address record should include some kind of
temporal component, so that newer records can replace older ones as the state
changes over time. This could be a timestamp and/or a simple sequence number
that each node increments whenever they publish a new record.

The second and third points highlight the limits of certifying information that
is itself uncertain. While a signature can prove that the addresses originated
from the peer, it cannot prove that the addresses are correct or useful. Given
the asymmetric nature of real-world NATs, it's often the case that a peer is
_less likely_ to have correct information about its own address than an outside
observer, at least initially.

This suggests that we should include some measure of "confidence" in our
records, so that peers can distribute addresses that they are not fully certain
are correct, while still asserting that they created the record. For example,
when requesting a dial-back via the [AutoNAT service][autonat], a peer could
send a "provisional" address record. When the AutoNAT peer confirms the address,
that address could be marked as confirmed and advertised in a new record.

Regarding the fourth point about ambiguous addresses, it would also be desirable
for the address record to include a notion of "routability," which would
indicate how "accessible" the address is likely to be. This would allow us to
mark an address as "LAN-only," if we know that it is not mapped to a publicly
reachable address but would still like to distribute it to local peers.

## Address Record Format

Here's a protobuf that might work:

```protobuf
// Routability indicates the "scope" of an address, meaning how visible
// or accessible it is. This allows us to distinguish between LAN and
// WAN addresses.
//
// Side Note: we could potentially have a GLOBAL_RELAY case, which would
// make it easy to prioritize non-relay addresses in the dialer. Bit of
// a mix of concerns though.
enum Routability {
  // catch-all default / unknown scope
  UNKNOWN = 1;

  // another process on the same machine
  LOOPBACK = 2;

  // a local area network
  LOCAL = 3;

  // public internet
  GLOBAL = 4;

  // reserved for future use
  INTERPLANETARY = 100;
}


// Confidence indicates how much we believe in the validity of the
// address.
enum Confidence {
  // default, unknown confidence. we don't know one way or another
  UNKNOWN = 1;

  // INVALID means we know that this address is invalid and should be deleted
  INVALID = 2;

  // UNCONFIRMED means that we suspect this address is valid, but we haven't
  // fully confirmed that we're reachable.
  UNCONFIRMED = 3;

  // CONFIRMED means that we fully believe this address is valid.
  // Each node / implementation can have their own criteria for confirmation.
  CONFIRMED = 4;
}

// AddressInfo is a multiaddr plus some metadata.
message AddressInfo {
  bytes multiaddr = 1;
  Routability routability = 2;
  Confidence confidence = 3;
}

// AddressState contains the listen addresses (and their metadata)
// for a peer at a particular point in time.
//
// Although this record contains a wall-clock `issuedAt` timestamp,
// there are no guarantees about node clocks being in sync or correct.
// As such, the `issuedAt` field should be considered informational,
// and `version` should be preferred when ordering records.
message AddressState {
  // the peer id of the subject of the record.
  bytes subjectPeer = 1;

  // `version` is an increment-only counter that can be used to
  // order AddressState records chronologically. Newer records
  // MUST have a higher `version` than older records, but there
  // can be gaps between version numbers.
  uint64 version = 2;

  // The `issuedAt` timestamp stores the creation time of this record in
  // seconds from the unix epoch, according to the issuer's clock. There
  // are no guarantees about clock sync or correctness. SHOULD NOT be used
  // to order AddressState records; use `version` instead.
  uint64 issuedAt = 3;

  // All current listen addresses and their metadata.
  repeated AddressInfo addresses = 4;
}
```

The idea with the structure above is that you send some metadata along with your
addresses: your "routability", and your own confidence in the validity of the
address. This is wrapped in an `AddressInfo` struct along with the address
itself.

Then you have a big list of `AddressInfo`s, which we put in an `AddressState`.
An `AddressState` identifies the `subjectPeer`, which is the peer that the
record is about, to whom the addresses belong. It also includes a `version`
number, so that we can replace earlier `AddressState`s with newer ones, and a
timestamp for informational purposes.

#### Example

Here's an example. Alice has an address that she thinks is publicly reachable
but has not confirmed. She also has a LAN-local address that she knows is valid,
but not routable via the public internet:

```javascript
{
  subjectPeer: "QmAlice...",
  version: 23456,
  issuedAt: 1570215229,

  addresses: [
    {
      addr: "/ip4/1.2.3.4/tcp/42/p2p/QmAlice",
      routability: "GLOBAL",
      confidence: "UNCONFIRMED"
    },
    {
      addr: "/ip4/10.0.1.2/tcp/42/p2p/QmAlice",
      routability: "LOCAL",
      confidence: "CONFIRMED"
    }
  ]
}
```

If Alice wants to publish her address to a public shared resource like a DHT,
she should omit `LOCAL` and other unreachable addresses, and peers should
likewise filter out `LOCAL` addresses from public sources.

## Certification / Verification

This structure can be contained in a [signed envelope][envelope-rfc], which lets
us issue "self-certified" address records that are signed by the `subjectPeer`.

## Peer Store APIs

This section is a WIP, and I'd love input.

We need to figure out how to surface the address metadata in the peerstore APIs.

In go, extending the [`AddrInfo`
struct](https://github.com/libp2p/go-libp2p-core/blob/master/peer/addrinfo.go)
to include metadata seems like a decent place to start, and js likewise has
[js-peer-info](https://github.com/libp2p/js-peer-info) that could be extended.

When storing this metadata internally, we may want to make a distinction between
the remote peer's confidence in an address and our own confidence; we may decide
an address is invalid when the remote peer thinks otherwise. One idea is to have
our local confidence just be a numeric score (for easy sorting) that takes the
remote peer's confidence value as an input.

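
As a rough illustration of the numeric-score idea, the sketch below folds the remote peer's `Confidence` value and our own dial history into one sortable number. The weights, the `localScore` and `sortByLocalScore` functions, and the `dialSuccesses` / `dialFailures` counters are all hypothetical; this RFC does not define a scoring formula.

```javascript
// Hypothetical base weights for the remote peer's advertised Confidence value.
const REMOTE_WEIGHT = { UNKNOWN: 1, INVALID: 0, UNCONFIRMED: 2, CONFIRMED: 4 };

// Combine the remote peer's claim with our own dial history into one
// sortable number. Higher means "try this address sooner".
function localScore(addrInfo, history = { dialSuccesses: 0, dialFailures: 0 }) {
  const base = REMOTE_WEIGHT[addrInfo.confidence] ?? REMOTE_WEIGHT.UNKNOWN;
  // Our own observations outweigh the remote claim (assumed weights):
  // each success adds 2, each failure subtracts 3, floored at zero.
  const adjusted = base + 2 * history.dialSuccesses - 3 * history.dialFailures;
  return Math.max(0, adjusted);
}

// Return a copy of the address list, sorted from highest to lowest score.
// `historyFor(addr)` may return undefined when we have never dialed `addr`.
function sortByLocalScore(addrInfos, historyFor = () => undefined) {
  const score = (info) => localScore(info, historyFor(info.addr));
  return [...addrInfos].sort((a, b) => score(b) - score(a));
}
```

With this shape, an address the remote peer marks `CONFIRMED` can still sink below an `UNCONFIRMED` one after we fail to dial it, which is the distinction between remote and local confidence described above.
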

The go [AddrBook
interface](https://github.com/libp2p/go-libp2p-core/blob/master/peerstore/peerstore.go#L89)
would also need to be updated - it currently deals with "raw" multiaddrs, and
the only metadata exposed is a TTL for expiration. Changing this interface seems
like a fairly big refactor to me, especially with the implementation in another
repo. I'd love if some gophers could weigh in on a good way forward.

## Dialing Strategies

Once we're surfacing routability info alongside addresses, the dialer can decide
to optionally prioritize addresses it thinks are most likely to be reachable. We
can also add an option to only dial self-certified addresses, although that
likely won't be practical until self-certified addresses become commonplace.

## Changes to core libp2p protocols

How to publish these to the DHT? Are there backward compatibility issues with
older unsigned address records? Maybe we just publish these to a different key
prefix...

Should we update identify and mDNS discovery to use signed records?


[identify-spec]: ../identify/README.md
[peer-id-spec]: ../peer-ids/peer-ids.md
[autonat]: https://github.com/libp2p/specs/issues/180
[ipld]: https://ipld.io/
[ipld-schema-schema]: https://github.com/ipld/specs/blob/master/schemas/schema-schema.ipldsch
[envelope-rfc]: ./0002-signed-envelopes.md

237
RFC/0003-routing-records.md
Normal file
@@ -0,0 +1,237 @@
# RFC 0003 - Peer Routing Records

- Start Date: 2019-10-04
- Related Issues:
  - [libp2p/issues/47](https://github.com/libp2p/libp2p/issues/47)
  - [go-libp2p/issues/436](https://github.com/libp2p/go-libp2p/issues/436)

## Abstract

This RFC proposes a method for distributing peer routing records, which contain
a peer's publicly reachable listen addresses, and may be extended in the future
to contain additional metadata relevant to routing. This serves a similar
purpose to [Ethereum Node Records][eip-778]. Like ENR records, libp2p routing
records should be extensible, so that we can add information relevant to as-yet
unknown use cases.

The record described here does not include a signature, but it is expected to
be serialized and wrapped in a [signed envelope][envelope-rfc], which will
prove the identity of the issuing peer. The dialer can then prioritize
self-certified addresses over addresses from an unknown origin.

## Problem Statement

All libp2p peers keep a "peer store", which maps [peer ids][peer-id-spec] to a
set of known addresses for each peer. When the application layer wants to
contact a peer, the dialer will pull addresses from the peer store and try to
initiate a connection on one or more addresses.

Addresses for a peer can come from a variety of sources. If we have already made
a connection to a peer, the libp2p [identify protocol][identify-spec] will
inform us of other addresses that they are listening on. We may also discover
their address by querying the DHT, checking a fixed "bootstrap list", or perhaps
through a pubsub message or an application-specific protocol.

In the case of the identify protocol, we can be fairly certain that the
addresses originate from the peer we're speaking to, assuming that we're using a
secure, authenticated communication channel. However, more "ambient" discovery
methods such as DHT traversal and pubsub depend on potentially untrustworthy
third parties to relay address information.

Even in the case of receiving addresses via the identify protocol, our
confidence that the address came directly from the peer is not actionable, because
the peer store does not track the origin of an address. Once added to the peer
store, all addresses are considered equally valid, regardless of their source.

We would like to have a means of distributing _verifiable_ address records,
which we can prove originated from the addressed peer itself. We also need a way to
track the "provenance" of an address within libp2p's internal components such as
the peer store. Once those pieces are in place, we will also need a way to
prioritize addresses based on their authenticity, with the most strict strategy
being to only dial certified addresses.

### Complications

While producing a signed record is fairly trivial, there are a few aspects to
this problem that complicate things.

1. Addresses are not static. A given peer may have several addresses at any given
   time, and the set of addresses can change at arbitrary times.
2. Peers may not know their own addresses. It's often impossible to automatically
   infer one's own public address, and peers may need to rely on third party
   peers to inform them of their observed public addresses.
3. A peer may inadvertently or maliciously sign an address that they do not
   control. In other words, a signature isn't a guarantee that a given address is
   valid.
4. Some addresses may be ambiguous. For example, addresses on a private subnet
   are valid within that subnet but are useless on the public internet.

The first point can be addressed by having records contain a sequence number
that increases monotonically when new records are issued, and by having newer
records replace older ones.

The other points, while worth thinking about, are out of scope for this RFC.
However, we can take care to make our records extensible so that we can add
additional metadata in the future. Some thoughts along these lines are in the
[Future Work section below](#future-work).

## Address Record Format

Here's a protobuf that might work:

```protobuf

// RoutingRecord contains the listen addresses for a peer at a particular point in time.
message RoutingRecord {
  // AddressInfo wraps a multiaddr. In the future, it may be extended to
  // contain additional metadata, such as "routability" (whether an address is
  // local or global, etc).
  message AddressInfo {
    bytes multiaddr = 1;
  }

  // the peer id of the subject of the record (who these addresses belong to).
  bytes subjectPeer = 1;

  // A monotonically increasing sequence number, used for record ordering.
  uint64 seq = 2;

  // All current listen addresses
  repeated AddressInfo addresses = 4;
}
```

The `AddressInfo` wrapper message is used instead of a bare multiaddr to allow
us to extend addresses with additional metadata [in the future](#future-work).

The `seq` field contains a sequence number that MUST increase monotonically as
new records are created. Newer records MUST have a higher `seq` value than older
records. To avoid persisting state across restarts, implementations MAY use unix
epoch time as the `seq` value, however they MUST NOT attempt to interpret a
`seq` value from another peer as a valid timestamp.

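
The `seq` rules can be sketched as follows. This is a minimal illustration, not part of the spec: the function names `nextSeq` and `shouldReplace` are made up for this example, `BigInt` stands in for the `uint64` field, and the choice to step past the last issued value when the clock has not advanced is an assumption about how an implementation might keep `seq` strictly increasing within one process lifetime.

```javascript
// Pick the `seq` for a new record. Unix time in seconds avoids persisting a
// counter across restarts, but two records issued within the same second (or
// a clock that jumps backwards) would otherwise collide, so we always move
// strictly past the last value issued by this process.
function nextSeq(lastIssuedSeq, nowMs = Date.now()) {
  const unixSeconds = BigInt(Math.floor(nowMs / 1000));
  return unixSeconds > lastIssuedSeq ? unixSeconds : lastIssuedSeq + 1n;
}

// Decide whether an incoming record should replace the one we hold.
// `seq` is compared purely as a number; it is never read as a timestamp.
function shouldReplace(existing, incoming) {
  return existing === undefined || incoming.seq > existing.seq;
}
```

A freshly started node would call `nextSeq(0n)`; a record with an equal or lower `seq` than the stored one is ignored.
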

#### Example

```javascript
{
  subjectPeer: "QmAlice...",
  seq: 1570215229,

  addresses: [
    {
      addr: "/ip4/1.2.3.4/tcp/42/p2p/QmAlice",
    },
    {
      addr: "/ip4/10.0.1.2/tcp/42/p2p/QmAlice",
    }
  ]
}
```


## Certification / Verification

This structure can be contained in a [signed envelope][envelope-rfc], which lets
us issue "self-certified" address records that are signed by the `subjectPeer`.

To produce a "self-certified" address, a peer will construct a `RoutingRecord`
containing all of their publicly-reachable listen addresses. A peer SHOULD only
include addresses that it believes are routable via the public internet, ideally
having confirmed that this is the case via some external mechanism such as a
successful AutoNAT dial-back.

In some cases we may want to include localhost or LAN-local addresses; for
example, when testing the DHT using many processes on a single machine. To
support this, implementations may use a global runtime configuration flag or
environment variable to control whether local addresses will be included.

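
A minimal sketch of that address selection is shown below. The names `isLocalAddr`, `addrsForRecord`, and the `includeLocal` option are hypothetical, and the check only covers IPv4 loopback and RFC 1918 private ranges in string-form multiaddrs; a real implementation would also need to handle IPv6, link-local, and DNS addresses.

```javascript
// Hypothetical helper: true when a multiaddr string starts with an IPv4
// loopback or RFC 1918 private address. Anything else (including IPv6 and
// DNS addresses) is treated as public in this sketch.
function isLocalAddr(multiaddr) {
  const match = /^\/ip4\/(\d+)\.(\d+)\.\d+\.\d+(\/|$)/.exec(multiaddr);
  if (!match) return false;
  const a = Number(match[1]);
  const b = Number(match[2]);
  return (
    a === 127 ||                          // loopback, 127.0.0.0/8
    a === 10 ||                           // 10.0.0.0/8
    (a === 172 && b >= 16 && b <= 31) ||  // 172.16.0.0/12
    (a === 192 && b === 168)              // 192.168.0.0/16
  );
}

// Select the addresses to put in a RoutingRecord. `includeLocal` stands in
// for the runtime flag or environment variable described above.
function addrsForRecord(listenAddrs, { includeLocal = false } = {}) {
  return includeLocal
    ? [...listenAddrs]
    : listenAddrs.filter((addr) => !isLocalAddr(addr));
}
```
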

Once the `RoutingRecord` has been constructed, it should be serialized to a byte
string and wrapped in a [signed envelope][envelope-rfc]. The `publicKey` field
of the envelope MUST be consistent with the `subjectPeer` peer id for the record
to be considered valid.

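
The consistency check could look roughly like the sketch below. Everything except `publicKey` and `subjectPeer` is an assumption made for illustration: the `contents` field name, the `validateRoutingEnvelope` function, and the three injected helpers (`verifySignature`, `peerIdFromPublicKey`, `decodeRecord`) are placeholders for logic defined by the signed envelope and peer id specs.

```javascript
// Validate a signed envelope that claims to carry a RoutingRecord.
// The helpers are injected because their real implementations live elsewhere:
//   verifySignature(envelope)   -> boolean
//   peerIdFromPublicKey(pubKey) -> peer id, comparable with ===
//   decodeRecord(bytes)         -> { subjectPeer, seq, addresses }
function validateRoutingEnvelope(envelope, helpers) {
  const { verifySignature, peerIdFromPublicKey, decodeRecord } = helpers;
  if (!verifySignature(envelope)) {
    throw new Error("invalid envelope signature");
  }
  // `contents` is an assumed name for the envelope's payload bytes.
  const record = decodeRecord(envelope.contents);
  // The key that signed the envelope must belong to the peer the record
  // describes; otherwise anyone could publish addresses for someone else.
  if (peerIdFromPublicKey(envelope.publicKey) !== record.subjectPeer) {
    throw new Error("envelope public key does not match subjectPeer");
  }
  return record;
}
```
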

### Signed Envelope Domain

Signed envelopes require a "domain separation" string that defines the "scope"
or purpose of a signature.

When wrapping a `RoutingRecord` in a signed envelope, the domain string MUST be
`libp2p-routing-record`.

### Signed Envelope Type Hint

Signed envelopes contain a "type hint" that indicates how to interpret the
contents of the envelope.

Ideally, we should define a new multicodec for routing records, so that we can
identify them in a few bytes. While we're still spec'ing and working on the
initial implementation, we can use the UTF-8 string `"/libp2p/routing-record"`
as the type hint value.

## Peer Store APIs

We will need to add a few methods to the peer store:

- `AddCertifiedAddrs(envelope) -> Maybe<Error>`
  - Add a self-certified address, wrapped in a signed envelope. This should
    validate the envelope signature & store the envelope for future reference.
    If any certified addresses already exist for the peer, only accept the new
    envelope if it has a greater `seq` value than existing envelopes.

- `CertifiedAddrs(peerId) -> Set<Multiaddr>`
  - return the set of self-certified addresses for the given peer id

And possibly:

- `IsCertified(peerId, multiaddr) -> Boolean`
  - has a particular address been self-certified by the given peer?

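
The methods listed above might be sketched as a small in-memory store. This is an illustration under stated assumptions rather than a proposed implementation: the `CertifiedAddrBook` class and the injected `openEnvelope` function are hypothetical, with `openEnvelope(envelope)` assumed to verify the signature and return the decoded record (`{ subjectPeer, seq, addresses }`) or throw.

```javascript
// Minimal in-memory sketch of the certified-address part of a peer store.
class CertifiedAddrBook {
  constructor(openEnvelope) {
    this.openEnvelope = openEnvelope; // hypothetical verifier + decoder
    this.entries = new Map();         // peer id -> { envelope, record }
  }

  // Returns undefined on success, or an Error (the Maybe<Error> above).
  addCertifiedAddrs(envelope) {
    let record;
    try {
      record = this.openEnvelope(envelope);
    } catch (err) {
      return err;
    }
    const existing = this.entries.get(record.subjectPeer);
    // Only accept a record with a strictly greater seq than the stored one.
    if (existing && record.seq <= existing.record.seq) {
      return new Error("record is not newer than the stored one");
    }
    // Keep the envelope itself so it can be referenced or re-shared later.
    this.entries.set(record.subjectPeer, { envelope, record });
    return undefined;
  }

  // The set of self-certified addresses for a peer (empty if none are known).
  certifiedAddrs(peerId) {
    const entry = this.entries.get(peerId);
    return new Set(entry ? entry.record.addresses.map((a) => a.multiaddr) : []);
  }

  isCertified(peerId, multiaddr) {
    return this.certifiedAddrs(peerId).has(multiaddr);
  }
}
```
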


We'll also need a method that constructs a new `RoutingRecord` containing our
listen addresses and wraps it in a signed envelope. This may belong on the Host
instead of the peer store, since it needs access to the private signing key.

## Dialing Strategies

Once self-certified addresses are available via the peer store, we can update
the dialer to prefer using them when possible. Some systems may want to _only_
dial self-certified addresses, so we should include some configuration options
to control whether non-certified addresses are acceptable.

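
One possible shape for that preference is sketched below. The `dialOrder` function and the `certifiedOnly` option are hypothetical names; `certified` is assumed to be the set of self-certified addresses the peer store holds for the peer.

```javascript
// Order candidate addresses for dialing. `known` is every address the peer
// store holds for the peer; `certified` is the Set of self-certified ones.
function dialOrder(known, certified, { certifiedOnly = false } = {}) {
  const preferred = known.filter((addr) => certified.has(addr));
  if (certifiedOnly) return preferred; // strict mode: never dial the rest
  const rest = known.filter((addr) => !certified.has(addr));
  return [...preferred, ...rest];      // certified first, original order kept
}
```

In the strict mode an empty result means the peer cannot be dialed until a certified record for it arrives.
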

## Changes to core libp2p protocols

How to publish these to the DHT? Are there backward compatibility issues with
older unsigned address records? Maybe we just publish these to a different key
prefix...

Should we update identify and mDNS discovery to use signed records?

## Future Work

Some things that were originally considered in this RFC were trimmed so that we
can focus on delivering a basic self-certified record, which is a pressing need.

This includes a notion of "routability", which could be used to communicate
whether a given address is global (reachable via the public internet),
LAN-local, etc. We may also want to include some kind of confidence score or
priority ranking, so that peers can communicate which addresses they would
prefer other peers to use.

To allow these fields to be added in the future, we wrap multiaddrs in the
`AddressInfo` message instead of having the `addresses` field be a list of "raw"
multiaddrs.

Another potentially useful extension would be a compact protocol table or bloom
filter that could be used to test whether a peer supports a given protocol
before interacting with them directly. This could be added as a new field in the
`RoutingRecord` message.



[identify-spec]: ../identify/README.md
[peer-id-spec]: ../peer-ids/peer-ids.md
[autonat]: https://github.com/libp2p/specs/issues/180
[ipld]: https://ipld.io/
[ipld-schema-schema]: https://github.com/ipld/specs/blob/master/schemas/schema-schema.ipldsch
[envelope-rfc]: ./0002-signed-envelopes.md
[eip-778]: https://eips.ethereum.org/EIPS/eip-778