Merge master into feat/peer-ids

Yusef Napora
2019-06-19 09:51:44 -04:00
25 changed files with 1596 additions and 129 deletions


@@ -0,0 +1,233 @@
# libp2p specification framework lifecycle: maturity level and status
> Author: @raulk
> Revision: r0, 2019-05-21
## Prelude
Our goal is to design a framework to foster rapid and incremental libp2p
specification development, aiming to reduce the barrier to entry for new
ideas, increasing the throughput of ideation and crystallisation of
breakthrough novel proposals, promoting their evolution and adoption within
the ecosystem, while maximising consensus through a common policy for
progression across lifecycle stages.
This document defines the policies that regulate the specification lifecycle. Our
ideas are partially inspired by the W3C Process [0].
## Definitions
We employ two axes to describe the stage of a specification within its
lifecycle:
* Maturity level: classifies the specification in terms of completeness,
demonstrability of implementation, community acceptance, and level of
technical detail.
We characterize specifications along a three-level, progressive scale:
* `Level 1: Working Draft`
* `Level 2: Candidate Recommendation`
* `Level 3: Recommendation`
* Status: classifies the operativeness of the specification.
* `Active`
* `Deprecated`
* `Terminated`
### Applicability matrix
Not all statuses are relevant to all maturity levels. This matrix defines the
applicability:
| | **Active** | **Deprecated** | **Terminated** |
| ----------------------------: | :--------: | :------------: | :------------: |
| **Working Draft** | ✔ | | ✔ |
| **Candidate Recommendation** | ✔ | ✔ | |
| **Recommendation** | ✔ | ✔ | |
### Abbreviations
To abbreviate the lifecycle stage of a specification, we combine the maturity
level and status into a two-character string:
```
<abbrv maturity level> ::= "1" | "2" | "3"
<abbrv status> ::= "A" | "D" | "T"
<abbrv lifecycle stage> ::= <abbrv maturity level> <abbrv status>
// example: 1A (Working Draft / Active), 2D (Candidate Recommendation / Deprecated).
```
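As an illustration of how the abbreviation grammar and the applicability matrix fit together, here is a minimal Python sketch. The names `MATURITY`, `STATUS`, `APPLICABLE` and `parse_stage` are hypothetical and not defined by this framework; the mappings are taken from the definitions above.

```python
from typing import Tuple

# Hypothetical lookup tables (illustrative names), filled in from the definitions above.
MATURITY = {"1": "Working Draft", "2": "Candidate Recommendation", "3": "Recommendation"}
STATUS = {"A": "Active", "D": "Deprecated", "T": "Terminated"}

# Applicability matrix: which statuses are valid for each maturity level.
APPLICABLE = {
    "Working Draft": {"Active", "Terminated"},
    "Candidate Recommendation": {"Active", "Deprecated"},
    "Recommendation": {"Active", "Deprecated"},
}


def parse_stage(abbrv: str) -> Tuple[str, str]:
    """Expand a two-character stage such as "1A"; reject invalid combinations."""
    if len(abbrv) != 2 or abbrv[0] not in MATURITY or abbrv[1] not in STATUS:
        raise ValueError(f"malformed lifecycle stage: {abbrv!r}")
    maturity, status = MATURITY[abbrv[0]], STATUS[abbrv[1]]
    if status not in APPLICABLE[maturity]:
        raise ValueError(f"{status} does not apply to {maturity}")
    return maturity, status
```

With this sketch, `parse_stage("2D")` yields the pair `Candidate Recommendation` / `Deprecated`, while `parse_stage("1D")` is rejected because a `Working Draft` cannot be `Deprecated`.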
### Document headers
We use the following nomenclature in document headers to denote its current
lifecycle stage:
```
<full maturity level> ::= "Working Draft" | "Candidate Recommendation" | "Recommendation"
<full status> ::= "Active" | "Deprecated" | "Terminated"
<lifecycle header> ::= <abbrv lifecycle stage> " " <full maturity level> " / " <full status>
// example: 1A Working Draft / Active.
```
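A tool that lints document headers could check that the abbreviation and the full names agree. The following is a sketch only: `HEADER_RE` and `is_consistent_header` are hypothetical names, and the regular expression is one possible reading of the grammar above.

```python
import re

MATURITY = {"1": "Working Draft", "2": "Candidate Recommendation", "3": "Recommendation"}
STATUS = {"A": "Active", "D": "Deprecated", "T": "Terminated"}

# <abbrv lifecycle stage> " " <full maturity level> " / " <full status>
HEADER_RE = re.compile(
    r"^([123])([ADT]) (Working Draft|Candidate Recommendation|Recommendation)"
    r" / (Active|Deprecated|Terminated)$"
)


def is_consistent_header(header: str) -> bool:
    """Return True when the header parses and its abbreviation matches the full names."""
    match = HEADER_RE.match(header)
    if match is None:
        return False
    level, status_code, maturity, status = match.groups()
    return MATURITY[level] == maturity and STATUS[status_code] == status
```

This check covers syntax and consistency only; whether the status is applicable to the maturity level is a separate question, answered by the applicability matrix.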
## Maturity levels
### Level 1: Working Draft
The specification of the system, process, protocol or item is under
development.
This level is lightweight and mostly self-directed by the author. We aim to
reduce the barrier to entry, and it's designed to allow for iterative
experimentation, discovery, and pivoting.
We do not enforce a hard template, so as to preserve authors' expressiveness
and creativity.
We enter this level by posting an `Initial Working Draft` that covers:
* context: what is the current situation or a brief overview of the
environment the specification targets.
* motivation: why this specification is relevant, and how it advances the
status quo.
* scope and rationale: what areas of the technical system the specification
impacts.
* goals: what we expect to achieve (positively and negatively) as a result
of implementing the specification.
* expected feature set: a summary/enumeration of features the spec provides.
* tentative technical directions: how we plan to materialise the
specification in terms of system design.
Upon submission of an `Initial Working Draft`, a minimum of three (3) libp2p
contributors are required to express interest and commitment to shepherd and
advise the author(s) throughout the specification process.
The resulting group will constitute the _Interest Group_, formed by consensus,
barring blocking, binding community feedback. We encourage the _Interest
Group_ to be heterogeneous yet relevant, and hold representation for libp2p
implementation teams across various languages.
The _Interest Group_ will be responsible for expediently awarding the review
approvals or feedback necessary to transition the specification across stages.
The `Initial Working Draft` shall be reviewed by the _Interest Group_ in no
more than 5 working days. Should there be no defects in form, content or
serious technical soundness issues, the `Initial Working Draft` will be
accepted and merged.
Ideas deemed controversial or breaking, and those that garner subjective
opposition, will still be accepted in order to give them a venue to grow,
mature and iterate.
Once the `Initial Working Draft` is merged, the author may continue revising
and evolving their specification by self-approving their own *Pull Requests*.
To facilitate open progress tracking and observability, as the `Working Draft`
evolves, the author(s) SHOULD assemble a checklist of items that are pending
specification, explicitly stating which items are compulsory for promoting the
spec to a `Candidate Recommendation`.
As a `Working Draft` evolves and shows promise to exit this stage towards a
`Candidate Recommendation`, the _Interest Group_ shall be expanded by two (2)
additional members, comprising a total of five (5).
We MAY use GitHub's
[`CODEOWNERS`](https://help.github.com/en/articles/about-code-owners) feature
to enforce per-spec approval policies automatically.
A `Working Draft` can be in either `Active` or `Terminated` status.
### Level 2: Candidate Recommendation
The changes proposed in the specification are considered plausible and
desirable.
The specification document itself is technically complete. It defines wire
level formats for interoperability, error codes, algorithms, data structures,
heuristics, behaviours, etc., in a way that it is sufficient to enable
contributors to develop an interoperable implementation.
There is at least ONE implementation conforming to the specification. That
implementation serves as the _Reference Implementation_.
The promotion from a `Working Draft` to a `Candidate Recommendation` is done
via a *Pull Request* that is reviewed by the _Interest Group_, allowing 10
working days to elapse to collect feedback from the libp2p community at large.
A `Candidate Recommendation` can be in either `Active` or `Deprecated` status.
### Level 3: Recommendation
There are at least TWO implementations conforming to the specification, with
demonstrated cross-interoperability. This is the supreme stage in the
lifecycle of a specification.
The promotion from a `Candidate Recommendation` to a `Recommendation` is done
via a *Pull Request* that is reviewed by the _Interest Group_, allowing 10
working days to elapse to collect feedback from the libp2p community at large.
A `Recommendation` can be in either `Active` or `Deprecated` status.
## Status
### Active
The specification is actively being worked on (`Working Draft`), or it is
actively encouraged for adoption by implementers (`Candidate Recommendation`,
`Recommendation`).
This is the entry status for all `Initial Working Drafts`, and is the default
status until some event triggers deprecation or termination.
### Deprecated
The specification is no longer applicable and the community actively
discourages new implementations from being built, unless requirements for
backwards-compatibility are in force.
Transition to this stage is usually triggered when a new version of a related
specification superseding this one reaches the `Candidate Recommendation`
stage.
The transition from the `Active` status to the `Deprecated` status is
performed via a *Pull Request* that is reviewed by the _Interest Group_,
allowing 5 working days to elapse to collect feedback from the libp2p
community at large.
### Terminated
A specification at the `Working Draft` maturity level aged without amassing
consensus in a timely fashion, and was therefore terminated by the
procedure below.
Procedure for termination: In order to motivate accountability, efficiency and
order, a specification that stays at the `Working Draft` maturity level for
more than 4 months after its initial approval will be transitioned to the
`Terminated` status automatically.
The author or _Interest Group_ can request an extension up to 2 times (making
for a cumulative runway of 12 months). Extensions will be granted by consensus
if there's evidence of progress and continued author commitment. We consider this
an implicit checkpoint to resolve issues that prevent the specification from
making progress.
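The termination timeline can be sketched in Python as follows. This is illustrative only: `add_months` and `termination_date` are hypothetical helpers, and the 4-month length of each extension is an assumption inferred from the 12-month cumulative runway (4 + 2 × 4); the text does not state it explicitly.

```python
import calendar
from datetime import date

TERM_MONTHS = 4       # from the text: 4 months after initial approval
MAX_EXTENSIONS = 2    # from the text: up to 2 extensions
EXTENSION_MONTHS = 4  # assumption: inferred from the 12-month cumulative runway


def add_months(start: date, months: int) -> date:
    """Add calendar months, clamping the day to the end of the target month."""
    index = start.month - 1 + months
    year, month = start.year + index // 12, index % 12 + 1
    day = min(start.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def termination_date(approved: date, extensions: int = 0) -> date:
    """Date on which a Working Draft would be terminated, given granted extensions."""
    if not 0 <= extensions <= MAX_EXTENSIONS:
        raise ValueError("between 0 and 2 extensions can be granted")
    return add_months(approved, TERM_MONTHS + extensions * EXTENSION_MONTHS)
```

For a draft approved on 2019-05-21, this gives 2019-09-21 with no extensions and 2020-05-21 with both extensions granted.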
---
## Interest Group membership changes
Changes in the membership of an _Interest Group_ are possible at any time.
While we don't maintain a comprehensive enumeration of reasons, common sense
applies.
They include events like waning dedication/commitment of members, changes in
technical relevance, or violations of the [community code of
conduct](https://github.com/ipfs/community/blob/master/code-of-conduct.md).
## References
[0] W3.org. (2019). World Wide Web Consortium Process Document. [online]
Available at: https://www.w3.org/2019/Process-20190301/ [Accessed 21 May
2019].


@@ -0,0 +1,166 @@
# Document Header for libp2p Specs
> A standard document header to indicate spec maturity, status & ownership.
| Lifecycle Stage | Maturity | Status | Latest Revision |
|-----------------|---------------|--------|-----------------|
| 1A | Working Draft | Active | r0, 2019-05-28 |
Authors: [@yusefnapora]
Interest Group: [@raulk], [@vyzo], [@mgoelzer], [@jacobheun], [@tomaka]
[@yusefnapora]: https://github.com/yusefnapora
[@raulk]: https://github.com/raulk
[@vyzo]: https://github.com/vyzo
[@mgoelzer]: https://github.com/mgoelzer
[@jacobheun]: https://github.com/jacobheun
[@tomaka]: https://github.com/tomaka
See the [lifecycle document][lifecycle-spec] for context about maturity level
and spec status.
[lifecycle-spec]: https://github.com/libp2p/specs/blob/master/00-framework-01-spec-lifecycle.md
## Motivation
The [maturity and lifecycle spec][lifecycle-spec] defines levels of maturity for
libp2p specs, as well as the states that a spec can be in at a given time. It
also introduces the notion of an `Interest Group`, which is a set of libp2p
community members that have expressed interest in the spec and are willing to
help move it forward in its evolution.
This document defines a header format to convey this key status information in
an easy-to-read manner.
## Example
```markdown
# Spec title
> An optional one-liner summary of the spec
| Lifecycle Stage | Maturity | Status | Latest Revision |
|-----------------|--------------------------|--------|-----------------|
| 2A | Candidate Recommendation | Active | r0, 2019-05-27 |
Authors: [@author1], [@author2]
Interest Group: [@interested1], [@interested2]
[@author1]: https://github.com/author1
[@author2]: https://github.com/author2
[@interested1]: https://github.com/interested1
[@interested2]: https://github.com/interested2
See the [lifecycle document][lifecycle-spec] for context about maturity level
and spec status.
[lifecycle-spec]: https://github.com/libp2p/specs/blob/master/00-framework-01-spec-lifecycle.md
```
## Format
### Title and Short Intro
Each spec begins with its title, formatted as an H1 markdown header.
The title can optionally be followed by a short block-quote introducing the
spec, which serves as a subtitle and should be a maximum of one or two lines.
### Status Table
The main status information is contained in a markdown table, using the [table
syntax][gfm-tables] supported by [GitHub Flavored Markdown][gfm-spec].
The status table consists of a single row, with a header containing the field
names.
Example:
```markdown
| Lifecycle Stage | Maturity | Status | Latest Revision |
|-----------------|---------------|--------|-----------------|
| 1A | Working Draft | Active | r0, 2019-05-27 |
```
The following fields are all required:
- `Lifecycle Stage`
- The [abbreviated lifecycle stage][abbrev-stage-definition] that the spec is
currently in. This must match the `Maturity` and `Status` fields.
- `Maturity`
- The full name of the maturity level that the spec is currently in.
- Valid values are: `Working Draft`, `Candidate Recommendation`,
`Recommendation`.
- `Status`
- The full name of the status that the spec is currently in.
- For `Candidate Recommendation` or `Recommendation` specs, valid values are
`Active` and `Deprecated`.
- For `Working Draft` specs, valid values are: `Active` and `Terminated`.
- `Latest Revision`
- A revision number and date to indicate when the spec was last modified,
separated by a comma.
- Revision numbers start with lowercase `r` followed by an integer, which gets
bumped whenever the spec is modified by merging a new PR.
- Revision numbers start at `r0` when the spec is first merged.
- Dates are formatted according to [ISO 8601](https://xkcd.com/1179/).
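The field rules above lend themselves to an automated check. Below is a minimal Python sketch of such a check; `parse_status_row` and `REVISION_RE` are hypothetical names, and allowing optional whitespace after the comma in `Latest Revision` is an assumption based on the examples in this document.

```python
import re
from datetime import date

MATURITY = {"1": "Working Draft", "2": "Candidate Recommendation", "3": "Recommendation"}
STATUS = {"A": "Active", "D": "Deprecated", "T": "Terminated"}
# Lowercase "r", an integer, a comma, then a YYYY-MM-DD date.
REVISION_RE = re.compile(r"r(\d+),\s*(\d{4}-\d{2}-\d{2})")


def parse_status_row(row: str) -> dict:
    """Validate the single data row of the status table and return its parsed fields."""
    cells = [cell.strip() for cell in row.strip().strip("|").split("|")]
    if len(cells) != 4:
        raise ValueError("expected exactly four fields")
    stage, maturity, status, revision = cells
    if (len(stage) != 2 or MATURITY.get(stage[0]) != maturity
            or STATUS.get(stage[1]) != status):
        raise ValueError("Lifecycle Stage does not match Maturity and Status")
    allowed = ({"Active", "Terminated"} if maturity == "Working Draft"
               else {"Active", "Deprecated"})
    if status not in allowed:
        raise ValueError(f"{status} is not a valid status for {maturity}")
    match = REVISION_RE.fullmatch(revision)
    if match is None:
        raise ValueError("Latest Revision must look like 'r0, 2019-05-27'")
    return {
        "stage": stage,
        "revision": int(match.group(1)),
        # fromisoformat raises ValueError for impossible dates such as 2019-13-40
        "date": date.fromisoformat(match.group(2)),
    }
```

The function takes only the data row, not the header or separator rows, so a caller would first pick the third line of the table.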
### Authors and Interest Group
After the status table, spec Authors and Interest Group members are listed.
Authors and Interest Group members are referenced by their GitHub handles
(with a leading `@` symbol), and are presented as a comma-separated list of links
to GitHub profiles.
To make the list readable in the markdown source, we use the [shortcut reference
link syntax][gfm-shortcut-refs], which allows us to wrap the author name in
square brackets in the list and define the link target below. Although this may
appear redundant when viewing in github.com with the GitHub renderer, it's
necessary to avoid losing information when viewing elsewhere.
For example:
```markdown
Authors: [@author1], [@author2]
Interest Group: [@interested1], [@interested2]
[@author1]: https://github.com/author1
[@author2]: https://github.com/author2
[@interested1]: https://github.com/interested1
[@interested2]: https://github.com/interested2
```
The Authors and Interest Group lists must be separated by a blank line, which
causes them to render as distinct paragraphs.
When proposing a new `Working Draft` where the Interest Group is unknown, use
`TBD` to indicate that the group is To Be Determined:
```markdown
Interest Group: TBD
```
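For anyone generating headers from a script, the Authors and Interest Group block could be rendered along these lines. This is a sketch under assumptions: `render_people` is a hypothetical helper, and handles are passed without the leading `@`.

```python
from typing import List


def render_people(authors: List[str], interest_group: List[str]) -> str:
    """Render the Authors / Interest Group lines plus their link definitions."""
    def links(handles: List[str]) -> str:
        return ", ".join(f"[@{handle}]" for handle in handles)

    # An unknown Interest Group is written as TBD, per the rule above.
    group_line = links(interest_group) if interest_group else "TBD"
    # dict.fromkeys keeps first-seen order while dropping duplicate handles.
    unique = dict.fromkeys(authors + interest_group)
    definitions = [f"[@{handle}]: https://github.com/{handle}" for handle in unique]
    lines = [f"Authors: {links(authors)}", "", f"Interest Group: {group_line}", ""]
    return "\n".join(lines + definitions)
```

The empty strings in `lines` produce the blank lines that keep the two lists, and the link definitions, as separate paragraphs in the rendered output.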
### Link to Lifecycle Doc
Finally, the header should contain a link to the [lifecycle
spec][lifecycle-spec] so that readers can get up to speed on the definitions
used in the header. To avoid having to keep track of relative paths within the
specs repo, an absolute URL is preferred when linking to the specs document.
Here's an example that can be copy/pasted directly:
```markdown
See the [lifecycle document][lifecycle-spec] for context about maturity level
and spec status.
[lifecycle-spec]: https://github.com/libp2p/specs/blob/master/00-framework-01-spec-lifecycle.md
```
[abbrev-stage-definition]: ./00-framework-01-spec-lifecycle.md#abbreviations
[gfm-tables]: https://help.github.com/en/articles/organizing-information-with-tables
[gfm-spec]: https://github.github.com/gfm/
[gfm-shortcut-refs]: https://github.github.com/gfm/#shortcut-reference-link

146
README.md

@@ -1,103 +1,95 @@
# libp2p specification
<h1 align="center">
<img src="https://raw.githubusercontent.com/libp2p/libp2p/a13997787e57d40d6315b422afbe1ceb62f45511/logo/libp2p-logo.png" alt="libp2p logo"/>
</h1>
[![](https://img.shields.io/badge/made%20by-Protocol%20Labs-blue.svg?style=flat-square)](http://ipn.io)
[![](https://img.shields.io/badge/project-libp2p-blue.svg?style=flat-square)](http://github.com/libp2p/libp2p)
[![](https://img.shields.io/badge/freenode-%23ipfs-blue.svg?style=flat-square)](http://webchat.freenode.net/?channels=%23ipfs)
<a href="http://protocol.ai"><img src="https://img.shields.io/badge/made%20by-Protocol%20Labs-blue.svg?style=flat-square" /></a>
<a href="http://libp2p.io/"><img src="https://img.shields.io/badge/project-libp2p-yellow.svg?style=flat-square" /></a>
<a href="http://webchat.freenode.net/?channels=%23libp2p"><img src="https://img.shields.io/badge/freenode-%23libp2p-yellow.svg?style=flat-square" /></a>
<a href="https://discuss.libp2p.io"><img src="https://img.shields.io/discourse/https/discuss.libp2p.io/posts.svg" /></a>
> This document presents `libp2p`, a modularized and extensible network stack to overcome the networking challenges faced when doing peer-to-peer applications. `libp2p` is used by IPFS as its networking library.
## Overview
Authors:
This repository contains the specifications for [`libp2p`](https://libp2p.io), a
framework and suite of protocols for building peer-to-peer network applications.
libp2p has several [implementations][libp2p_implementations], with more in development.
- [Juan Benet](https://github.com/jbenet)
- [David Dias](https://github.com/diasdavid)
The main goal of this repository is to provide accurate reference documentation
for the aspects of libp2p that are independent of language or implementation.
This includes wire protocols, addressing conventions, and other "network level"
concerns.
Reviewers:
## Status
- `N/A`
The specifications for libp2p are currently incomplete, and we have recently
[defined a process][spec_lifecycle] for categorizing specs according to their
maturity and status. Many of the existing specs linked below are not yet
categorized according to this framework, however, they will soon be updated for
consistency.
## Abstract
This document replaces an earlier RFC, which still contains much useful
information and is helpful for understanding the libp2p design philosophy. It is
available at [_archive/README.md](./_archive/README.md).
This describes the [IPFS](https://ipfs.io/) network protocol. The network layer provides point-to-point transports (reliable and unreliable) between any two IPFS nodes in the network.
## Specification Index
This document defines the spec implemented in `libp2p`.
This index contains links to all the spec documents that are currently merged.
If documents are moved to new locations within the repository, this index will
be updated to reflect the new locations.
## Status of this spec ![](https://img.shields.io/badge/status-wip-orange.svg?style=flat-square)
### Specs Framework
## Organization of this document
These specs define processes for the specification framework itself, such as the
expected lifecycle and document formatting.
This RFC is organized by chapters described on the *Table of contents* section. Each of the chapters can be found in its own file.
- [Spec Lifecycle][spec_lifecycle] - The process for introducing, revising and
adopting specs.
- [Document Header][spec_header] - A standard document header for libp2p specs.
## Table of contents
### Protocols
- [1 Introduction](1-introduction.md)
- [1.1 Motivation](1-introduction.md#11-motivation)
- [1.2 Goals](1-introduction.md#12-goals)
- [2 An analysis the state of the art in network stacks](2-state-of-the-art.md)
- [2.1 The client-server model](2-state-of-the-art.md#21-the-client-server-model)
- [2.2 Categorizing the network stack protocols by solutions](2-state-of-the-art.md#22-categorizing-the-network-stack-protocols-by-solutions)
- [2.3 Current shortcomings](2-state-of-the-art.md#23-current-shortcomings)
- [3 Requirements](3-requirements.md)
- [3.1 Transport agnostic](3-requirements.md#34-transport-agnostic)
- [3.2 Multi-multiplexing](3-requirements.md#35-multi-multiplexing)
- [3.3 Encryption](3-requirements.md#33-encryption)
- [3.4 NAT traversal](3-requirements.md#31-nat-traversal)
- [3.5 Relay](3-requirements.md#32-relay)
- [3.6 Enable several network topologies](3-requirements.md#36-enable-several-network-topologies)
- [3.7 Resource discovery](3-requirements.md#37-resource-discovery)
- [3.8 Messaging](3-requirements.md#38-messaging)
- [3.9 Naming](3-requirements.md#38-naming)
- [4 Architecture](4-architecture.md)
- [4.1 Peer Routing](4-architecture.md#41-peer-routing)
- [4.2 Swarm](4-architecture.md#42-swarm)
- [4.3 Distributed Record Store](4-architecture.md#43-distributed-record-store)
- [4.4 Discovery](4-architecture.md#44-discovery)
- [4.5 Messaging](4-architecture.md#45-messaging)
- [4.5.1 PubSub](4-architecture.md#451-pubsub)
- [4.6 Naming](4-architecture.md#46-naming)
- [4.6.1 IPRS](4-architecture.md#461-iprs)
- [4.6.2 IPNS](4-architecture.md#462-ipns)
- [5 Data structures](5-datastructures.md)
- [6 Interfaces](6-interfaces.md)
- [6.1 libp2p](6-interfaces.md#61-libp2p)
- [6.1 Transport](6-interfaces.md)
- [6.2 Connection](6-interfaces.md)
- [6.3 Stream Multiplexer](6-interfaces.md)
- [6.3 Swarm](6-interfaces.md#63-swarm)
- [6.5 Peer Discovery](6-interfaces.md#65-peer-discovery)
- [6.2 Peer Routing](6-interfaces.md#62-peer-routing)
- [6.2 Content Routing](6-interfaces.md#62-peer-routing)
- [6.3.1 Distributed Record Store](6-interfaces.md#64-distributed-record-store)
- [6.6 libp2p interface and UX](6-interfaces.md#66-libp2p-interface-and-ux)
- [7 Properties](7-properties.md)
- [7.1 Communication Model - Streams](7-properties.md#71-communication-model---streams)
- [7.2 Ports - Constrained Entrypoints](7-properties.md#72-ports---constrained-entrypoints)
- [7.3 Transport Protocol](7-properties.md#73-transport-protocols)
- [7.4 Non-IP Networks](7-properties.md#74-non-ip-networks)
- [7.5 On the wire](7-properties.md#75-on-the-wire)
- [7.5.1 Protocol-Multiplexing](7-properties.md#751-protocol-multiplexing)
- [7.5.2 multistream - self-describing protocol stream](7-properties.md#752-multistream---self-describing-protocol-stream)
- [7.5.3 multistream-selector - self-describing protocol stream selector](7-properties.md#753-multistream-selector---self-describing-protocol-stream-selector)
- [7.5.4 Stream Multiplexing](7-properties.md#754-stream-multiplexing)
- [7.5.5 Portable Encodings](7-properties.md#755-portable-encodings)
- [7.5.6 Secure Communications](7-properties.md#756-secure-communications)
- [8 Implementations](8-implementations.md)
- [9 References](9-references.md)
These specs define wire protocols that are used by libp2p for connectivity,
security, multiplexing, and other purposes.
## Other specs that haven't made to the main document
- [identify][spec_identify] - Exchange keys and addresses with other peers
- [mplex][spec_mplex] - The friendly stream multiplexer
- [pnet][spec_pnet] - Private networking in libp2p using pre-shared keys
- [pubsub][spec_pubsub] - PubSub interface for libp2p
- [gossipsub][spec_gossipsub] - An extensible baseline PubSub protocol
- [episub][spec_episub] - Proximity Aware Epidemic PubSub for libp2p
- [relay][spec_relay] - Circuit Switching for libp2p (similar to TURN)
- [rendezvous][spec_rendezvous] - Rendezvous Protocol for generalized
peer discovery
- [secio][spec_secio] - SECIO, a transport security protocol for libp2p
- [tls][spec_tls] - The libp2p TLS Handshake (TLS 1.3+)
- [Relay](/relay)
- [PubSub](/pubsub)
## Contribute
## Contributions
Please contribute! [Dive into the issues](https://github.com/libp2p/specs/issues)!
Thanks for your interest in improving libp2p! We welcome contributions from all
interested parties. Please take a look at the [Spec Lifecycle][spec_lifecycle]
document to get a feel for how the process works, and [open an
issue](https://github.com/libp2p/specs/issues/new) if there's work you'd like to
discuss.
Please be aware that all interactions related to multiformats are subject to the IPFS [Code of Conduct](https://github.com/ipfs/community/blob/master/code-of-conduct.md).
For discussions about libp2p that aren't specific to a particular spec, or if
you feel an issue isn't the appropriate place for your topic, please join our
[discussion forum](https://discuss.libp2p.io) and post a new topic in the
[contributor's section](https://discuss.libp2p.io/c/contributors).
## License
[CC-BY-SA 3.0 License](https://creativecommons.org/licenses/by-sa/3.0/us/) © Protocol Labs Inc.
[libp2p_implementations]: https://libp2p.io/implementations
[spec_lifecycle]: 00-framework-01-spec-lifecycle.md
[spec_header]: 00-framework-02-document-header.md
[spec_identify]: ./identify/README.md
[spec_mplex]: ./mplex/README.md
[spec_pnet]: ./pnet/README.md
[spec_pubsub]: ./pubsub/README.md
[spec_gossipsub]: ./pubsub/gossipsub/README.md
[spec_episub]: ./pubsub/gossipsub/episub.md
[spec_relay]: ./relay/README.md
[spec_rendezvous]: ./rendezvous/README.md
[spec_secio]: ./secio/README.md
[spec_tls]: ./tls/tls.md


@@ -5,6 +5,9 @@ While developing [IPFS, the InterPlanetary FileSystem](https://ipfs.io/), we cam
In order to build this library, we focused on tackling problems independently, creating less complex solutions with powerful abstractions that, when composed, can offer an environment for a peer-to-peer application to work successfully.
| ⚠️ Warning: parts of this document are incomplete and out of date. Please see [this issue](https://github.com/libp2p/specs/issues/156), and look for deprecation notices throughout. ⚠️ |
| --- |
## 1.1 Motivation
`libp2p` is the result of our collective experience of building a distributed system, in that it puts responsibility on developers to decide how they want an app to interoperate with others in the network, and favors configuration and extensibility instead of making assumptions about the network setup.


@@ -22,7 +22,7 @@ We even learned how to hide all the complexity of a distributed system behind ga
## 2.2 Categorizing the network stack protocols by solutions
Before diving into the `libp2p` protocols, it is important to understand the large diversity of protocols already in wide use and deployment that help maintain today's simple abstractions. For example, when one thinks about an HTTP connection, one might naively just think that HTTP/TCP/IP are the main protocols involved, but in reality many more protocols participate, depending on the usage, the networks involved, and so on. Protocols like DNS, DHCP, ARP, OSPF, Ethernet, 802.11 (Wi-Fi) and many others get involved. Looking inside ISPs' own networks would reveal dozens more.
Before diving into the `libp2p` protocols, it is important to understand the large diversity of protocols already in wide use and deployment that help maintain today's simple abstractions. For example, when one thinks about an HTTP connection, one might naively just think that HTTP/TCP/IP are the main protocols involved, but in reality many more protocols participate, depending on the usage, the networks involved, and so on. Protocols like DNS, DHCP(v6), ARP, NDISC, OSPF, Ethernet, 802.11 (Wi-Fi) and many others get involved. Looking inside ISPs' own networks would reveal dozens more.
Additionally, it's worth noting that the traditional 7-layer OSI model characterization does not fit `libp2p`. Instead, we categorize protocols based on their role, i.e. the problem they solve. The upper layers of the OSI model are geared towards point-to-point links between applications, whereas the `libp2p` protocols speak more towards various sizes of networks, with various properties, under various different security models. Different `libp2p` protocols can have the same role (in the OSI model, this would be "address the same layer"), meaning that multiple protocols can run simultaneously, all addressing one role (instead of one-protocol-per-layer in traditional OSI stacking). For example, bootstrap lists, mDNS, DHT discovery, and PEX are all forms of the role "Peer Discovery"; they can coexist and even synergize.
@@ -42,7 +42,8 @@ Additionally, it's worth noting that the traditional 7-layer OSI model character
### 2.2.3 Discovering other peers or services
- ARP
- DHCP
- NDISC
- DHCP(v6)
- DNS
- Onion


@@ -1,6 +1,9 @@
4 Architecture
==============
| ⚠️ Warning: this section is incomplete, and parts of it are out of date. Please see [this issue](https://github.com/libp2p/specs/issues/156) to track progress on improving it. ⚠️ |
| --- |
`libp2p` was designed around the Unix Philosophy of creating small components that are easy to understand and test. These components should also be able to be swapped in order to accommodate different technologies or scenarios and also make it feasible to upgrade them over time.
Although different peers can support different protocols depending on their capabilities, any peer can act as a dialer and/or a listener for connections from other peers, connections that once established can be reused from both ends, removing the distinction between clients and servers.
@@ -91,7 +94,9 @@ Follows [IPRS spec](/IPRS.md).
### 4.4.1 mDNS-discovery
mDNS-discovery is a Discovery Protocol that uses [mDNS](https://en.wikipedia.org/wiki/Multicast_DNS) over local area networks. It emits mDNS beacons to find if there are more peers available. Local area network peers are very useful to peer-to-peer protocols, because of their low latency links.
mDNS-discovery is a Discovery Protocol that uses [mDNS](https://en.wikipedia.org/wiki/Multicast_DNS) over local area networks with zero configuration. Local area network peers are very useful to peer-to-peer protocols, because of their low latency links.
The [mDNS-discovery](discovery/mdns.md) specification describes how to use mDNS to discover other peers.
mDNS-discovery is a standalone protocol and does not depend on any other `libp2p` protocol. mDNS-discovery can yield peers available in the local area network, without relying on other infrastructure. This is particularly useful in intranets, networks disconnected from the Internet backbone, and networks which temporarily lose links.


@@ -1,6 +1,9 @@
6 Interfaces
============
| ⚠️ Warning: this section is incomplete, and parts of it are out of date. Please see [this issue](https://github.com/libp2p/specs/issues/156) to track progress on improving it. ⚠️ |
| --- |
`libp2p` is a collection of several protocols working together to offer a common solid interface that can talk with any other network addressable process. This is made possible by shimming currently existing protocols and implementations into a set of explicit interfaces: Peer Routing, Discovery, Stream Muxing, Transports, Connections and so on.
## 6.1 libp2p


@@ -1,6 +1,9 @@
7 Properties
============
| ⚠️ Warning: this section is incomplete, and parts of it are out of date. Please see [this issue](https://github.com/libp2p/specs/issues/156) to track progress on improving it. ⚠️ |
| --- |
## 7.1 Communication Model - Streams
The Network layer handles all the problems of connecting to a peer, and exposes
@@ -104,18 +107,18 @@ We have the **hard constraint** of making IPFS work across _any_ duplex stream (
To make this work, IPFS has to solve a few problems:
- [Protocol Multiplexing](#751-protocol-multiplexing) - running multiple protocols over the same stream
- [multistream](#752-multistream-self-describing-protocol-stream) - self-describing protocol streams
- [multistream-select](#753-multistream-selector-self-describing-protocol-stream-selector) - a self-describing protocol selector
- [multistream](#752-multistream---self-describing-protocol-stream) - self-describing protocol streams
- [multistream-select](#753-multistream-selector---self-describing-protocol-stream-selector) - a self-describing protocol selector
- [Stream Multiplexing](#754-stream-multiplexing) - running many independent streams over the same wire
- [Portable Encodings](#755-portable-encodings) - using portable serialization formats
- [Secure Communications](#756-secure-communication) - using ciphersuites to establish security and privacy (like TLS)
- [Secure Communications](#756-secure-communications) - using ciphersuites to establish security and privacy (like TLS)
### 7.5.1 Protocol-Multiplexing
Protocol Multiplexing means running multiple different protocols over the same stream. This could happen sequentially (one after the other), or concurrently (at the same time, with their messages interleaved). We achieve protocol multiplexing using three pieces:
- [multistream](#752-multistream-self-describing-protocol-stream) - self-describing protocol streams
- [multistream-select](#753-multistream-selector-self-describing-protocol-stream-selector) - a self-describing protocol selector
- [multistream](#752-multistream---self-describing-protocol-stream) - self-describing protocol streams
- [multistream-select](#753-multistream-selector---self-describing-protocol-stream-selector) - a self-describing protocol selector
- [Stream Multiplexing](#754-stream-multiplexing) - running many independent streams over the same wire
### 7.5.2 multistream - self-describing protocol stream

104
_archive/README.md Normal file

@@ -0,0 +1,104 @@
# libp2p specification
<h1 align="center">
<img src="https://raw.githubusercontent.com/libp2p/libp2p/a13997787e57d40d6315b422afbe1ceb62f45511/logo/libp2p-logo.png" alt="libp2p logo"/>
</h1>
<a href="http://protocol.ai"><img src="https://img.shields.io/badge/made%20by-Protocol%20Labs-blue.svg?style=flat-square" /></a>
<a href="http://libp2p.io/"><img src="https://img.shields.io/badge/project-libp2p-yellow.svg?style=flat-square" /></a>
<a href="http://webchat.freenode.net/?channels=%23libp2p"><img src="https://img.shields.io/badge/freenode-%23libp2p-yellow.svg?style=flat-square" /></a>
<a href="https://waffle.io/libp2p/libp2p"><img src="https://img.shields.io/badge/pm-waffle-yellow.svg?style=flat-square" /></a>
> This document presents `libp2p`, a modularized and extensible network stack to overcome the networking challenges faced when building peer-to-peer applications. `libp2p` is used by IPFS as its networking library.
Authors:
- [Juan Benet](https://github.com/jbenet)
- [David Dias](https://github.com/daviddias)
Reviewers:
- `N/A`
## Abstract
This describes the [IPFS](https://ipfs.io/) network protocol. The network layer provides point-to-point transports (reliable and unreliable) between any two IPFS nodes in the network.
This document defines the spec implemented in `libp2p`.
## Status of this spec ![](https://img.shields.io/badge/status-wip-orange.svg?style=flat-square)
## Organization of this document
This RFC is organized into chapters, listed in the *Table of contents* section. Each chapter can be found in its own file.
## Table of contents
- [1 Introduction](1-introduction.md)
- [1.1 Motivation](1-introduction.md#11-motivation)
- [1.2 Goals](1-introduction.md#12-goals)
- [2 An analysis of the state of the art in network stacks](2-state-of-the-art.md)
- [2.1 The client-server model](2-state-of-the-art.md#21-the-client-server-model)
- [2.2 Categorizing the network stack protocols by solutions](2-state-of-the-art.md#22-categorizing-the-network-stack-protocols-by-solutions)
- [2.3 Current shortcomings](2-state-of-the-art.md#23-current-shortcomings)
- [3 Requirements](3-requirements.md)
- [3.1 Transport agnostic](3-requirements.md#34-transport-agnostic)
- [3.2 Multi-multiplexing](3-requirements.md#35-multi-multiplexing)
- [3.3 Encryption](3-requirements.md#33-encryption)
- [3.4 NAT traversal](3-requirements.md#31-nat-traversal)
- [3.5 Relay](3-requirements.md#32-relay)
- [3.6 Enable several network topologies](3-requirements.md#36-enable-several-network-topologies)
- [3.7 Resource discovery](3-requirements.md#37-resource-discovery)
- [3.8 Messaging](3-requirements.md#38-messaging)
- [3.9 Naming](3-requirements.md#38-naming)
- [4 Architecture](4-architecture.md)
- [4.1 Peer Routing](4-architecture.md#41-peer-routing)
- [4.2 Swarm](4-architecture.md#42-swarm)
- [4.3 Distributed Record Store](4-architecture.md#43-distributed-record-store)
- [4.4 Discovery](4-architecture.md#44-discovery)
- [4.5 Messaging](4-architecture.md#45-messaging)
- [4.5.1 PubSub](4-architecture.md#451-pubsub)
- [4.6 Naming](4-architecture.md#46-naming)
- [4.6.1 IPRS](4-architecture.md#461-iprs)
- [4.6.2 IPNS](4-architecture.md#462-ipns)
- [5 Data structures](5-datastructures.md)
- [6 Interfaces](6-interfaces.md)
- [6.1 libp2p](6-interfaces.md#61-libp2p)
- [6.1 Transport](6-interfaces.md)
- [6.2 Connection](6-interfaces.md)
- [6.3 Stream Multiplexer](6-interfaces.md)
- [6.3 Swarm](6-interfaces.md#63-swarm)
- [6.5 Peer Discovery](6-interfaces.md#65-peer-discovery)
- [6.2 Peer Routing](6-interfaces.md#62-peer-routing)
- [6.2 Content Routing](6-interfaces.md#62-peer-routing)
- [6.3.1 Distributed Record Store](6-interfaces.md#64-distributed-record-store)
- [6.6 libp2p interface and UX](6-interfaces.md#66-libp2p-interface-and-ux)
- [7 Properties](7-properties.md)
- [7.1 Communication Model - Streams](7-properties.md#71-communication-model---streams)
- [7.2 Ports - Constrained Entrypoints](7-properties.md#72-ports---constrained-entrypoints)
- [7.3 Transport Protocols](7-properties.md#73-transport-protocols)
- [7.4 Non-IP Networks](7-properties.md#74-non-ip-networks)
- [7.5 On the wire](7-properties.md#75-on-the-wire)
- [7.5.1 Protocol-Multiplexing](7-properties.md#751-protocol-multiplexing)
- [7.5.2 multistream - self-describing protocol stream](7-properties.md#752-multistream---self-describing-protocol-stream)
- [7.5.3 multistream-selector - self-describing protocol stream selector](7-properties.md#753-multistream-selector---self-describing-protocol-stream-selector)
- [7.5.4 Stream Multiplexing](7-properties.md#754-stream-multiplexing)
- [7.5.5 Portable Encodings](7-properties.md#755-portable-encodings)
- [7.5.6 Secure Communications](7-properties.md#756-secure-communications)
- [8 Implementations](8-implementations.md)
- [9 References](9-references.md)
## Other specs that haven't made it into the main document
- [Relay](/relay)
- [PubSub](/pubsub)
## Contribute
Please contribute! [Dive into the issues](https://github.com/libp2p/specs/issues)!
Please be aware that all interactions related to multiformats are subject to the IPFS [Code of Conduct](https://github.com/ipfs/community/blob/master/code-of-conduct.md).
## License
[CC-BY-SA 3.0 License](https://creativecommons.org/licenses/by-sa/3.0/us/) © Protocol Labs Inc.

131
discovery/mdns.md Normal file

@@ -0,0 +1,131 @@
# Multicast DNS (mDNS)
Author: Richard Schneider (makaretu@gmail.com)
## Overview
The goal is to allow peers to discover each other when on the same local network with zero configuration. mDNS uses a multicast system of DNS records; this allows all peers on the local network to see all query responses.
Conceptually, it is very simple. When a peer starts (or detects a network change), it sends a query for all peers. As responses come in, the peer adds the other peers' information into its local database of peers.
## Definitions
- `service-name` is the DNS Service Discovery (DNS-SD) service name for all peers. It is defined as `_p2p._udp.local`.
- `host-name` is the fully qualified name of the peer. It is derived from the peer's name and `p2p.local`.
- `peer-name` is the case-insensitive unique identifier of the peer, and is less than 64 characters. It is normally the base-32 encoding of the peer's ID.
If the encoding of the peer's ID exceeds 63 characters, then the [Split at 63rd character](https://github.com/ipfs/in-web-browsers/issues/89#issue-341357014) workaround can be used.
If a [private network](https://github.com/libp2p/specs/blob/master/pnet/Private-Networks-PSK-V1.md) is in use, then the `service-name` contains the base-16 encoding of the network's fingerprint as in `_p2p-X._udp.local`.
This prevents public and private networks from discovering each other's peers.
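The definitions above can be sketched in a few lines of Python. This is only an illustration, assuming the peer ID arrives in its usual base58btc text form and that the `peer-name` is the lower-case, unpadded base-32 encoding of the raw ID bytes (as in the worked example later in this document). The helper names `b58decode`, `peer_name` and `service_name` are hypothetical, and the 63-character split workaround is not covered.

```python
import base64
from typing import Optional

B58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def b58decode(text: str) -> bytes:
    """Decode a base58btc string into raw bytes."""
    number = 0
    for char in text:
        number = number * 58 + B58_ALPHABET.index(char)
    body = number.to_bytes((number.bit_length() + 7) // 8, "big")
    # Each leading '1' character stands for one leading zero byte.
    leading_zeros = len(text) - len(text.lstrip("1"))
    return b"\x00" * leading_zeros + body


def peer_name(peer_id: str) -> str:
    """Base-32 encode the raw peer ID (lower case and unpadded by assumption)."""
    raw = b58decode(peer_id)
    return base64.b32encode(raw).decode("ascii").rstrip("=").lower()


def service_name(fingerprint: Optional[bytes] = None) -> str:
    """Return the DNS-SD service name, scoped to a private network if given."""
    if fingerprint is None:
        return "_p2p._udp.local"
    # Private networks embed the base-16 fingerprint: _p2p-X._udp.local
    return f"_p2p-{fingerprint.hex()}._udp.local"
```

Because a private network's fingerprint becomes part of the service name, a query for `_p2p._udp.local` never matches peers advertising `_p2p-X._udp.local`.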
## Peer Discovery
### Request
To find all peers, a DNS message is sent with the question `_p2p._udp.local PTR`. Peers will then start responding with their details.
Note that a peer must respond to its own query. This allows other peers to passively discover it.
### Response
On receipt of a `find all peers` query, a peer sends a DNS response message (QR = 1) that contains the **answer**
```
<service-name> PTR <peer-name>.<service-name>
```
The **additional records** of the response contain the peer's discovery details:
```
<peer-name>.<service-name> TXT "dnsaddr=..."
```
The TXT record contains the multiaddresses that the peer is listening on. Each multiaddress is a TXT attribute with the form `dnsaddr=/.../p2p/QmId`. Multiple `dnsaddr` attributes and/or TXT records are allowed.
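As a rough sketch of how a receiver might read these attributes, the function below collects the multiaddresses from the strings of one or more TXT records. It assumes the strings have already been extracted from the DNS message by some DNS library; `dnsaddrs_from_txt` is a hypothetical name, and matching the key case-insensitively follows the DNS-SD convention for TXT keys.

```python
def dnsaddrs_from_txt(txt_strings):
    """Return the multiaddresses carried as dnsaddr= attributes."""
    addrs = []
    for entry in txt_strings:
        key, sep, value = entry.partition("=")
        # Skip attributes without a value and attributes with other keys.
        if sep and key.lower() == "dnsaddr" and value:
            addrs.append(value)
    return addrs
```

Unrelated attributes are skipped rather than treated as errors, so a peer can add other TXT attributes without breaking discovery.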
## DNS Service Discovery
DNS-SD support is not needed for peers to discover each other. However, it is extremely useful for network administrators to discover what is running on the network.
### Meta Query
This allows discovery of all services. The question is `_services._dns-sd._udp.local PTR`.
A peer responds with the answer
```
_services._dns-sd._udp.local PTR <service-name>
```
### Find All Response
On receipt of a `find all peers` query, the following **additional records** should be included
```
<peer-name>.<service-name> SRV ... <host-name>
<host-name> A <ipv4 address>
<host-name> AAAA <ipv6 address>
```
### Gotchas
Many existing tools ignore the Additional Records, and always send individual queries for the peer's discovery details. To accommodate this, a peer should respond to the following queries:
- `<peer-name>.<service-name> SRV`
- `<peer-name>.<service-name> TXT`
- `<host-name> A`
- `<host-name> AAAA`
## Issues
[ ] mDNS requires link-local addresses. Loopback and "NAT busting" addresses should not be sent and must be ignored on receipt?
## References
- [RFC 1035 - Domain Names (DNS)](https://tools.ietf.org/html/rfc1035)
- [RFC 6762 - Multicast DNS](https://tools.ietf.org/html/rfc6762)
- [RFC 6763 - DNS-Based Service Discovery](https://tools.ietf.org/html/rfc6763)
- [Multiaddr](https://github.com/multiformats/multiaddr)
## Worked Examples
Assuming that `peer-id` is `QmQusTXc1Z9C1mzxsqC9ZTFXCgSkpBRGgW4Jk2QYHxKE22`, then the `peer-name` is `ciqcmoputolsfsigvm7nx5fwkko2eq26h46qhbj6o4co7uyn2f2srdy` (base32 encoding of the peer ID).
To make the examples more readable `id` and `name` are used.
### Meta Query
Goal: find all services on the local network.
#### Question
```
_services._dns-sd._udp.local PTR
```
#### Answer
```
_services._dns-sd._udp.local IN PTR _p2p._udp.local
```
### Find All Peers
Goal: find all peers on the local network.
#### Question
```
_p2p._udp.local PTR
```
#### Answer
```
_p2p._udp.local IN PTR `name`._p2p._udp.local
```
#### Additional Records
- `name`._p2p._udp.local IN TXT dnsaddr=/ip6/fe80::7573:b0a8:46b0:bfea/tcp/4001/ipfs/`id`
- `name`._p2p._udp.local IN TXT dnsaddr=/ip4/192.168.178.21/tcp/4001/ipfs/`id`

84
identify/README.md Normal file

@@ -0,0 +1,84 @@
# Identify v1.0.0
The identify protocol is used to exchange basic information with other peers
in the network, including addresses, public keys, and capabilities.
There are two variations of the identify protocol, `identify` and `identify/push`.
### `identify`
The `identify` protocol has the protocol id `/ipfs/id/1.0.0`, and it is used
to query remote peers for their information.
The protocol works by opening a stream to the remote peer you want to query, using
`/ipfs/id/1.0.0` as the protocol id string. The peer being identified responds by returning
an `Identify` message and closes the stream.
### `identify/push`
The `identify/push` protocol has the protocol id `/ipfs/id/push/1.0.0`, and it is used
to inform known peers about changes that occur at runtime.
When a peer's basic information changes, for example, because they've obtained a new
public listen address, they can use `identify/push` to inform others about the new
information.
The push variant works by opening a stream to each remote peer you want to update, using
`/ipfs/id/push/1.0.0` as the protocol id string. When the remote peer accepts the stream,
the local peer will send an `Identify` message and close the stream.
Upon receiving the pushed `Identify` message, the remote peer should update their local
metadata repository with the information from the message. Note that missing fields
should be ignored, as peers may choose to send partial updates containing only the fields
whose values have changed.
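A minimal sketch of the partial-update rule, assuming the decoded `Identify` message is represented as a plain dictionary keyed by the protobuf field names. The function name `apply_identify_push` is hypothetical. Treating an empty repeated field as "not sent" is also an assumption, made because protobuf cannot distinguish an empty list from an absent one.

```python
REPEATED_FIELDS = {"listenAddrs", "protocols"}


def apply_identify_push(stored: dict, pushed: dict) -> dict:
    """Return a copy of the stored metadata updated with the pushed fields."""
    merged = dict(stored)
    for field, value in pushed.items():
        # A missing optional field decodes to None: keep the stored value.
        if value is None:
            continue
        # An empty repeated field is treated as "not sent" (assumption).
        if field in REPEATED_FIELDS and not value:
            continue
        merged[field] = value
    return merged
```

Returning a new dictionary rather than mutating the stored one keeps the previous metadata available if the update later needs to be rejected.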
## The Identify Message
```protobuf
message Identify {
optional string protocolVersion = 5;
optional string agentVersion = 6;
optional bytes publicKey = 1;
repeated bytes listenAddrs = 2;
optional bytes observedAddr = 4;
repeated string protocols = 3;
}
```
### protocolVersion
The protocol version identifies the family of protocols used by the peer.
The current protocol version is `ipfs/0.1.0`; if the protocol major or minor
version does not match the protocol used by the initiating peer, then the connection
is considered unusable and the peer must close the connection.
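The compatibility rule can be sketched as a comparison of the major and minor components. This is an illustration only: the function name is hypothetical, and requiring the family prefix (the part before the last `/`) to match as well is an assumption that the text above does not state explicitly.

```python
def protocol_versions_compatible(local: str, remote: str) -> bool:
    """True when family, major and minor match; the patch level is ignored."""

    def parse(version: str):
        family, _, number = version.rpartition("/")
        major, minor, *_ = number.split(".")
        return family, major, minor

    try:
        return parse(local) == parse(remote)
    except ValueError:
        # A version string without major.minor components cannot match.
        return False
```

A caller would close the connection when this returns `False` for the `protocolVersion` received from the remote peer.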
### agentVersion
This is a free-form string, identifying the implementation of the peer.
The usual format is `agent-name/version`, where `agent-name` is
the name of the program or library and `version` is its semantic version.
### publicKey
This is the public key of the peer, marshalled in binary form as specified
in [peer-ids](../peer-ids).
### listenAddrs
These are the addresses on which the peer is listening as multi-addresses.
### observedAddr
This is the connection source address of the stream initiating peer as observed by the peer
being identified; it is a multi-address. The initiator can use this address to infer
the existence of a NAT and its public address.
For example, in the case of a TCP/IP transport the observed addresses will be of the form
`/ip4/x.x.x.x/tcp/xx`. In the case of a circuit relay connection, the observed address will
be of the form `/p2p/QmRelay/p2p-circuit`. In the case of onion transport, there is no
observable source address.
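One way the initiator might use the observed address is sketched below. It is a simplified heuristic, not part of the protocol: it assumes multiaddresses in their text form (the message itself carries them in binary), looks only at the leading `/ip4` or `/ip6` component, and ignores wildcard listen addresses. The name `likely_behind_nat` is hypothetical.

```python
def likely_behind_nat(observed_addr: str, listen_addrs: list) -> bool:
    """Guess whether a NAT sits between us and the peer that observed us."""

    def ip_part(addr: str):
        parts = addr.strip("/").split("/")
        if len(parts) >= 2 and parts[0] in ("ip4", "ip6"):
            return parts[0], parts[1]
        return None

    observed = ip_part(observed_addr)
    if observed is None:
        # Relay or other non-IP address: nothing can be inferred.
        return False
    local = {ip_part(addr) for addr in listen_addrs}
    # An observed IP we are not listening on suggests address translation.
    return observed not in local
```

When the heuristic fires, the IP in the observed address is a candidate for the node's public address.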
### protocols
This is a list of protocols supported by the peer.


@@ -2,7 +2,7 @@
> This repo contains the spec of mplex, the friendly Stream Multiplexer (that works in 3 languages!)
Mplex is a Stream Multiplexer protocol used by js-ipfs and go-ipfs in their implementations. The origins of this protocol are based in [multiplex](https://github.com/maxogden/multiplex), the JavaScript only Stream Multiplexer. After many battle field tests, we felt the need to improve and fix some of its bugs and mechanics, resulting on this new version used by libp2p.
Mplex is a Stream Multiplexer protocol used by js-ipfs and go-ipfs in their implementations. The origins of this protocol are based in [multiplex](https://github.com/maxogden/multiplex), the JavaScript-only Stream Multiplexer. After many battle field tests, we felt the need to improve and fix some of its bugs and mechanics, resulting on this new version used by libp2p.
This document will attempt to define a specification for the wire protocol and algorithm used in both implementations.
@@ -12,7 +12,7 @@ Implementations in:
- [JavaScript](https://github.com/libp2p/js-libp2p-mplex)
- [Go](https://github.com/libp2p/go-mplex)
- [Rust](https://github.com/libp2p/rust-libp2p/tree/master/multiplex-rs)
- [Rust](https://github.com/libp2p/rust-libp2p/tree/master/muxers/mplex)
## Message format
@@ -53,26 +53,25 @@ Mplex operates over a reliable ordered pipe between two peers, such as a TCP soc
To open a new stream, first allocate a new stream ID. Then, send a message with the flag set to `NewStream`, the ID set to the newly allocated stream ID, and the data of the message set to the name of the stream.
Stream names are purely for interfaces and are not otherwise considered by the protocol. An empty string may also be used for the stream name, and they may also be repeated (using the same stream name for every stream is valid). Reusing a stream ID after closing a stream may result in undefined behaviour.
Stream names are purely for debugging purposes and are not otherwise considered by the protocol. An empty string may also be used for the stream name, and they may also be repeated (using the same stream name for every stream is valid). Reusing a stream ID after closing a stream may result in undefined behaviour.
The party that opens a stream is called the stream initiator. This is used to identify whether the message comes from a channel opened locally or remotely. Thus, the stream initiator always uses even flags and stream receivers uses odd flags.
The party that opens a stream is called the stream initiator. Both parties can open a substream with the same ID, therefore this distinction is used to identify whether each message concerns the channel opened locally or remotely.
### Writing to a stream
To write data to a stream, one must send a message with the flag `MessageReceiver` (1) or `MessageInitiator` (2) (depending on whether or not the writer is the receiver or sender). The data field should contain the data you wish to write to the stream, up to 1MiB per message.
To write data to a stream, one must send a message with the flag `MessageReceiver` (1) or `MessageInitiator` (2) (depending on whether or not the writer is the one initiating the stream). The data field should contain the data you wish to write to the stream, up to 1MiB per message.
### Closing a stream
Mplex supports half-closed streams. Closing a stream closes it for writing and closes the remote end for reading but allows writing in the other direction.
To close a stream, send a message with a zero length body and a `CloseReceiver` (3) or `CloseInitiator` (4) flag (depending on whether or not the closer is the receiver or sender). Writing to a stream after it has been closed should result
in an error. Reading from a remote-closed stream should return all data send before closing the stream and then EOF thereafter.
To close a stream, send a message with a zero length body and a `CloseReceiver` (3) or `CloseInitiator` (4) flag (depending on whether or not the closer is the one initiating the stream). Writing to a stream after it has been closed is a protocol violation. Reading from a remote-closed stream should return all data sent before closing the stream and then EOF thereafter.
### Resetting a stream
To immediately close a stream for both reading and writing, use reset. This should generally only be used on error; during normal operation, both sides should close instead.
To reset a stream, send a message with a zero length body and a `ResetReceiver` (5) or `ResetInitiator` (6) flag. Reset must not block and must immediately close both ends of the stream for both reading and writing. All current and future reads and writes must return errors (*not* EOF) and any data queued or in flight should be dropped.
To reset a stream, send a message with a zero length body and a `ResetReceiver` (5) or `ResetInitiator` (6) flag. Reset must immediately close both ends of the stream for both reading and writing. Writing to a stream after it has been reset is a protocol violation. Since reset is generally sent when an error happens, all future reads from a reset stream should return an error (*not* EOF).
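The flags used for writing, closing and resetting can be illustrated with a small frame encoder. This sketch assumes the header layout from the Message format section of the mplex spec (stream ID shifted left by three bits and OR-ed with the flag, with both header and data length written as unsigned varints) and `NewStream` having the value 0. The Python names are hypothetical.

```python
from enum import IntEnum


class Flag(IntEnum):
    NEW_STREAM = 0
    MESSAGE_RECEIVER = 1
    MESSAGE_INITIATOR = 2
    CLOSE_RECEIVER = 3
    CLOSE_INITIATOR = 4
    RESET_RECEIVER = 5
    RESET_INITIATOR = 6


MAX_DATA = 1 << 20  # up to 1 MiB of data per message


def uvarint(value: int) -> bytes:
    """Encode an unsigned integer as a base-128 varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_frame(stream_id: int, flag: Flag, data: bytes = b"") -> bytes:
    """Build one mplex message: header varint, length varint, then data."""
    if len(data) > MAX_DATA:
        raise ValueError("data exceeds 1 MiB per message")
    header = (stream_id << 3) | flag
    return uvarint(header) + uvarint(len(data)) + data


def write_flag(is_initiator: bool) -> Flag:
    """Pick the data flag according to who opened the stream."""
    return Flag.MESSAGE_INITIATOR if is_initiator else Flag.MESSAGE_RECEIVER
```

Close and reset frames use the same encoder with an empty body, for example `encode_frame(stream_id, Flag.CLOSE_INITIATOR)`.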
## Implementation notes


@@ -1,6 +1,6 @@
# PubSub interface for libp2p
Revision: draft 1, 2017-02-17
Revision: draft 2, 2019-02-01
Authors:
- whyrusleeping (why@ipfs.io)
@@ -21,12 +21,14 @@ You can find information about the PubSub research and notes in the following re
- https://github.com/libp2p/research-pubsub
- https://github.com/libp2p/pubsub-notes
Implementations:
## Implementations
- FloodSub, simple flooding pubsub (2017)
- [libp2p/go-floodsub](https://github.com/libp2p/go-floodsub/pull/67), [libp2p/js-libp2p-floodsub](http://github.com/libp2p/js-libp2p-floodsub), [libp2p/rust-libp2p/floodsub](https://github.com/libp2p/rust-libp2p/tree/master/floodsub)
- [libp2p/go-libp2p-pubsub/floodsub.go](https://github.com/libp2p/go-libp2p-pubsub/blob/master/floodsub.go);
- [libp2p/js-libp2p-floodsub](http://github.com/libp2p/js-libp2p-floodsub);
- [libp2p/rust-libp2p/floodsub](https://github.com/libp2p/rust-libp2p/tree/master/protocols/floodsub)
- GossipSub, extensible baseline pubsub (2018)
- [gossipsub](./gossipsub)
- [gossipsub](https://github.com/libp2p/specs/tree/master/pubsub/gossipsub#implementation-status)
- [EpiSub](https://github.com/libp2p/specs/blob/master/pubsub/gossipsub/episub.md), an epidemic broadcast tree router (defined 2018, not yet started as of Oct 2018)
## The RPC
@@ -64,6 +66,8 @@ message Message {
optional bytes data = 2;
optional bytes seqno = 3;
repeated string topicIDs = 4;
optional bytes signature = 5;
optional bytes key = 6;
}
```
@@ -74,30 +78,69 @@ done to allow content to be routed through a swarm of pubsubbing peers.
The `data` field is an opaque blob of data, it can contain any data that the
publisher wants it to.
The `seqno` field is a linearly increasing number that is unique among messages
originating from each given peer. No two messages on a pubsub topic from the
same peer should have the same `seqno` value, however messages from different
peers may have the same sequence number, so this number alone cannot be used to
address messages. (Notably the 'timecache' in use by the floodsub
implementation uses the concatenation of the `seqno` and the `from` field.)
The `seqno` field is a 64-bit big-endian uint that is a linearly increasing
number that is unique among messages originating from each given peer. No two
messages on a pubsub topic from the same peer should have the same `seqno`
value, however messages from different peers may have the same sequence number,
so this number alone cannot be used to address messages. Notably the
'timecache' in use by the go implementation contains a `message_id`,
which is constructed from the concatenation of the `seqno` and the `from`
fields. This `message_id` is then unique among messages. It was also proposed
in [#116](https://github.com/libp2p/specs/issues/116) to use a `message_hash`,
however, it was noted: "a potential caveat with using hashes instead of seqnos:
the peer won't be able to send identical messages (e.g. keepalives) within the
timecache interval, as they will get rejected as duplicates."
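A small sketch of the `seqno` encoding and the derived `message_id`. The concatenation order (`seqno` first, then `from`) follows the wording above and is an assumption, since implementations may order the two fields differently. The function names are hypothetical.

```python
import struct


def encode_seqno(counter: int) -> bytes:
    """Encode the per-peer counter as a 64-bit big-endian unsigned integer."""
    return struct.pack(">Q", counter)


def message_id(seqno: bytes, from_peer: bytes) -> bytes:
    """Combine seqno and sender so the ID is unique across peers."""
    return seqno + from_peer


def is_duplicate(seen: set, seqno: bytes, from_peer: bytes) -> bool:
    """Record the message ID and report whether it was already seen."""
    mid = message_id(seqno, from_peer)
    if mid in seen:
        return True
    seen.add(mid)
    return False
```

Because the sender is part of the ID, two peers can reuse the same sequence number without their messages being mistaken for duplicates.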
The `topicIDs` field specifies a set of topics that this message is being
published to.
Note that messages are currently *not* signed. This will come in the near
future.
The `signature` and `key` fields are used for message signing, as explained below.
The size of the `Message` should be limited, for example to 1 MiB, and the limit
may be configurable (see [issue 118](https://github.com/libp2p/specs/issues/118)
for more information). Messages over this size should be rejected.
Note that for applications where state such as messages is
stored, such as blockchains, it is suggested to have some kind of storage
economics (see e.g.
[here](https://ethresear.ch/t/draft-position-paper-on-resource-pricing/2838),
[here](https://ethresear.ch/t/ethereum-state-rent-for-eth-1-x-pre-eip-document/4378)
and
[here](https://ethresear.ch/t/improving-the-ux-of-rent-with-a-sleeping-waking-mechanism/1480)).
## Message Signing
Messages can be optionally signed, and it is up to the peer whether to accept and forward
unsigned messages.
For signing purposes, the `signature` and `key` fields are used:
- The `signature` field contains the signature.
- The `key` field contains the signing key when it cannot be inlined in the source peer ID.
When present, it must match the peer ID.
The signature is computed over the marshalled message protobuf _excluding_ the key field.
The protobuf blob is prefixed by the string `libp2p-pubsub:` before signing.
When signature validation fails for a signed message, the implementation must
drop the message and omit propagation. Locally, it may treat this event in whichever
manner it wishes (e.g. logging).
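The signing rules can be sketched as follows. This is an illustration under assumptions: the caller has already marshalled the message with the `key` field (and the not-yet-computed `signature`) left out, and `verify` is a hypothetical callable standing in for the real public-key check. Neither function is taken from an existing library.

```python
SIGN_PREFIX = b"libp2p-pubsub:"


def signing_payload(marshalled_message: bytes) -> bytes:
    """Prefix the marshalled protobuf before signing or verifying."""
    return SIGN_PREFIX + marshalled_message


def accept_message(marshalled_message, signature, verify, allow_unsigned=False):
    """Decide whether a message may be propagated.

    verify(payload, signature) is a placeholder for the signature check
    against the sender's public key.
    """
    if signature is None:
        # Accepting unsigned messages is a local policy decision.
        return allow_unsigned
    # A failed validation means the message is dropped, not forwarded.
    return bool(verify(signing_payload(marshalled_message), signature))
```

The prefix binds the signature to pubsub, so a signature produced by the same key in another context cannot be replayed as a pubsub message.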
## The Topic Descriptor
The topic descriptor message is used to define various options and parameters
of a topic. It currently specifies the topic's human readable name, its
authentication options, and its encryption options.
authentication options, and its encryption options. The `AuthOpts` and `EncOpts`
of the topic descriptor message are not used in current implementations, but
may be used in future. For clarity, this is added as a comment in the file,
and may be removed once used.
The `TopicDescriptor` protobuf is as follows:
```protobuf
message TopicDescriptor {
optional string name = 1;
// AuthOpts and EncOpts are unused as of Oct 2018, but
// are planned to be used in future.
optional AuthOpts auth = 2;
optional EncOpts enc = 3;
@@ -125,9 +168,17 @@ message TopicDescriptor {
}
```
The `name` field is a string used to identify or mark the topic, It can be
The `name` field is a string used to identify or mark the topic. It can be
descriptive or random or anything that the creator chooses.
Note that instead of using `TopicDescriptor.name`, for privacy reasons the
`TopicDescriptor` struct may be hashed, and used as the topic ID. Another
option is to use a CID as a topic ID. While a consensus has not been reached,
for forwards and backwards compatibility, using an enum `TopicID` that allows
custom types in variants (i.e. `Name`, `hashedTopicDescriptor`, `CID`)
may be the most suitable option if it is available within an implementation's
language (otherwise it would be implementation defined).
The `auth` field specifies how authentication will work for this topic. Only
authenticated peers may publish to a given topic. See 'AuthOpts' below for
details.
@@ -179,3 +230,18 @@ Web Of Trust publishing. Messages are encrypted with some certificate or
certificate chain shared amongst trusted peers. (Spec writer's note: this is the
least clearly defined option and my description here may be wildly incorrect,
needs checking).
## Topic Validation
Implementations MUST support attaching _validators_ to topics.
_Validators_ have access to the `Message` and can apply any logic to determine its validity.
When propagating a message for a topic, implementations will invoke all validators attached
to that topic, and will continue propagation if and only if all validations pass.
In its simplest form, a _validator_ is a function with signature `(peer.ID, *Message) => bool`,
where the return value is `true` if validation passes, and `false` otherwise.
Local handling of failed validation is left up to the implementation (e.g. logging).
Implementations MAY allow dynamically adding and removing _validators_ at runtime.
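A minimal sketch of a validator registry, assuming validators are plain callables taking the peer ID and the message and returning a boolean. The class and method names are hypothetical, and a real router would call `should_propagate` for every topic the message is published to.

```python
class ValidatorRegistry:
    """Keeps the validators attached to each topic."""

    def __init__(self):
        self._validators = {}

    def add(self, topic, validator):
        """Attach a validator to a topic at runtime."""
        self._validators.setdefault(topic, []).append(validator)

    def remove(self, topic, validator):
        """Detach a previously attached validator."""
        self._validators[topic].remove(validator)

    def should_propagate(self, topic, peer_id, message) -> bool:
        """Propagate only if every validator attached to the topic passes."""
        validators = self._validators.get(topic, [])
        return all(validate(peer_id, message) for validate in validators)
```

A topic with no validators attached propagates everything, because `all()` over an empty sequence is `True`.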


@@ -39,9 +39,9 @@ profiles.
## Implementation status
- Go: [libp2p/go-floodsub#67](https://github.com/libp2p/go-floodsub/pull/67) (experimental)
- JS: not yet started
- Rust: not yet started
- Go: [libp2p/go-libp2p-pubsub/gossipsub.go](https://github.com/libp2p/go-libp2p-pubsub/blob/master/gossipsub.go) (experimental)
- JS: [ChainSafeSystems/gossipsub-js](https://github.com/ChainSafeSystems/gossipsub-js) work in progress; check branches and PRs.
- Rust: [libp2p/rust-libp2p#898](https://github.com/libp2p/rust-libp2p/pull/898) implements the spec but is missing some features. [libp2p/rust-libp2p#767](https://github.com/libp2p/rust-libp2p/pull/767) is an alternative, partial implementation that differs slightly from the spec (see [#142](https://github.com/libp2p/specs/issues/142) for details).
- Gerbil: [vyzo/gerbil-simsub](https://github.com/vyzo/gerbil-simsub) (simulator)
@@ -54,12 +54,13 @@ It implements pubsub in the most basic manner, with two defining aspects:
### Ambient Peer Discovery
With ambient peer discovery, the function is pushed outside the scope
of the protocol. Instead, it relies on ambient connection events to
perform peer discovery via protocol identification. Whenever a new
peer is connected, the protocol checks to see if the peer implements
floodsub, and if so it sends a hello packet that announces the topics
that it is currently subscribing to.
With ambient peer discovery, the function is pushed outside the scope of the
protocol. Instead, the mechanism for discovering peers is provided for by the
environment. In practice, this can be embodied by DHT walks, rendezvous
points, etc. This protocol relies on the ambient connection events produced by
such mechanisms. Whenever a new peer is connected, the protocol checks to see
if the peer implements floodsub and/or gossipsub, and if so, it sends it a
hello packet that announces the topics that it is currently subscribing to.
This allows the peer to maintain soft overlays for all topics of
interest. The overlay is maintained by exchanging subscription
@@ -234,11 +235,14 @@ delay in the overlay with some healthy margin.
Topic membership is controlled by two operations supported by the
router, as part of the pubsub api:
- On `JOIN(topic)` the router joins the topic. In order to do so, it
selects `D` peers from `peers.gossipsub[topic]`, adds them to `mesh[topic]`
and notifies them with a `GRAFT(topic)` control message. If it already has
`fanout` peers in the topic, then it selects those peers as the
initial mesh peers.
- On `JOIN(topic)` the router joins the topic. If it already has `D` peers
among the `fanout` peers of the topic, it adds them to `mesh[topic]`
and notifies them with a `GRAFT(topic)` control message. Otherwise, if the
fanout holds fewer than `D` peers for the topic (call this number `x`, which
is zero when the topic is not in the fanout), it still adds those `x` peers
as above, selects the remaining `D-x` peers from `peers.gossipsub[topic]`,
and likewise adds them to `mesh[topic]` and notifies them with a
`GRAFT(topic)` control message.
- On `LEAVE(topic)` the router leaves the topic. It notifies the peers in
`mesh[topic]` with a `PRUNE(topic)` message and forgets `mesh[topic]`.
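The two membership operations can be sketched as below. This is an illustration under assumptions: peers are represented as strings, `mesh`, `fanout` and `gossipsub_peers` are dictionaries from topic to a set of peers, and control messages are returned as tuples instead of being sent. When the fanout holds more than `D` peers, taking the first `D` in sorted order is an arbitrary choice made here, and the sketch leaves `fanout` untouched.

```python
import random


def join(topic, mesh, fanout, gossipsub_peers, d, rng=random):
    """Fill mesh[topic] from the fanout first, then from other gossipsub peers."""
    selected = sorted(fanout.get(topic, ()))[:d]
    missing = d - len(selected)
    if missing > 0:
        # Top up with the remaining D-x peers, chosen at random.
        pool = sorted(set(gossipsub_peers.get(topic, ())) - set(selected))
        selected += rng.sample(pool, min(missing, len(pool)))
    mesh[topic] = set(selected)
    return [("GRAFT", topic, peer) for peer in selected]


def leave(topic, mesh):
    """Forget mesh[topic] and return the PRUNE messages to send."""
    peers = mesh.pop(topic, set())
    return [("PRUNE", topic, peer) for peer in sorted(peers)]
```

Passing a seeded `random.Random` instance as `rng` makes the peer selection reproducible in tests.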
@@ -286,7 +290,7 @@ application layer), then it proceeds similarly to the payload reaction:
### Heartbeat
The router periodically runs a hearbeat procedure, which is
The router periodically runs a heartbeat procedure, which is
responsible for maintaining the mesh, emitting gossip, and shifting
the message cache.
@@ -325,7 +329,7 @@ for each topic in mesh+fanout:
let mids be mcache.window[topic]
if mids is not empty:
select D peers from peers.gossipsub[topic]
for each peer not in mesh[topic]
for each peer not in mesh[topic] or fanout[topic]
emit IHAVE(mids)
shift the mcache


@@ -268,7 +268,7 @@ with the contents of the message. Care should be taken for issues
with transitive connectivity due to NAT. If
a node cannot connect to the originating node for a `SHUFFLEREPLY`,
then it should not perform the shuffle. Similarly, the originating
node could time out waiting for a shuffle reply and try with again
node could time out waiting for a shuffle reply and try again
with a lower TTL, until a TTL of zero reuses the connection in the
case of NATed hosts.
@@ -320,7 +320,7 @@ to the cache, and pushes the message id to the lazy notification queue.
The loop runs a short periodic timer, with a period in the order of
0.1s for gossiping message summaries. Every time it fires, the node
flushes the lazy notification queue with all the recently received
message ids in an `IHAVE` message to its lazy peers. The `IHAVE`
message ids in an `IHAVE` message to its lazy peers. The `IHAVE`
notifications summarize recent messages the node has seen and have not
propagated through the eager links.
@@ -394,7 +394,7 @@ Management protocol communicates these changes to the broadcast loop via
`NeighborUp` and `NeighborDown` notifications.
When a new node is added to the active list, the broadcast loop receives
a `NeighborUp` notifications; it simply adds the node to the eager peer
a `NeighborUp` notification; it simply adds the node to the eager peer
list. On the other hand, when a node is removed with a `NeighborDown`
notification, the loop has to consider if the node was an eager or lazy
peer. If the node was a lazy peer, it doesn't need to do anything as the
@@ -526,6 +526,6 @@ Broadcast protocol:
The authors do suggest lazy aggregation as a possible optimization nonetheless.
- `GRAFT` messages similarly aggregate multiple message requests.
- Missing messages and overlay repair are managed by a single background timer instead of
of creating timers left and right for every missing message; that's impractical from an
creating timers left and right for every missing message; that's impractical from an
implementation point of view, at least in Go.
- There is no provision for eager overlay repair on `NeighborDown` messages in Plumtree.


@@ -6,7 +6,6 @@
- [js-libp2p-circuit](https://github.com/libp2p/js-libp2p-circuit)
- [go-libp2p-circuit](https://github.com/libp2p/go-libp2p-circuit)
- [rust-libp2p](https://github.com/libp2p/rust-libp2p/tree/master/relay)
## Table of Contents

222
rendezvous/README.md Normal file

@@ -0,0 +1,222 @@
# Rendezvous Protocol
Author: vyzo
Revision: DRAFT; 2019-01-18
The protocol described in this specification is intended to provide a
lightweight mechanism for generalized peer discovery. It can be used
for bootstrap purposes, real time peer discovery, application specific
routing, and so on. Any node implementing the rendezvous protocol can
act as a rendezvous point, allowing the discovery of relevant peers in
a decentralized fashion.
## Use Cases
Depending on the application, the protocol could be used in the
following contexts:
- During bootstrap, a node can use known rendezvous points to discover
peers that provide critical services. For instance, rendezvous can
be used to discover circuit relays for connectivity restricted
nodes.
- During initialization, a node can use rendezvous to discover
peers to connect with the rest of the application. For instance,
rendezvous can be used to discover pubsub peers within a topic.
- In a real time setting, applications can poll rendezvous points in
order to discover new peers in a timely fashion.
- In an application specific routing setting, rendezvous points can be
used to progressively discover peers that can answer specific queries
or host shards of content.
### Replacing ws-star-rendezvous
We intend to replace ws-star-rendezvous with a few rendezvous daemons
and a fleet of p2p-circuit relays. Real-time applications will
utilize rendezvous both for bootstrap and in a real-time setting.
During bootstrap, rendezvous will be used to discover circuit relays
that provide connectivity for browser nodes. Subsequently, rendezvous
will be utilized throughout the lifetime of the application for real
time peer discovery by registering and polling rendezvous points.
This allows us to replace a fragile centralized component with a
horizontally scalable ensemble of daemons.
### Rendezvous and pubsub
Rendezvous can be naturally combined with pubsub for effective
real-time discovery. At a basic level, rendezvous can be used to
bootstrap pubsub: nodes can utilize rendezvous in order to discover
their peers within a topic. Alternatively, pubsub can also be used as
a mechanism for building rendezvous services. In this scenario, a
number of rendezvous points can federate using pubsub for internal
real-time distribution, while still providing a simple interface to
clients.
## The Protocol
The rendezvous protocol provides facilities for real-time peer
discovery within application specific namespaces. Peers connect to the
rendezvous point and register their presence in one or more
namespaces. It is not allowed to register arbitrary peers in a
namespace; only the peer initiating the registration can register
itself.
Peers registered with the rendezvous point can be discovered by other
nodes by querying the rendezvous point. The query specifies the
namespace for limiting application scope and optionally a maximum
number of peers to return. The namespace can be omitted in the query,
which asks for all peers registered to the rendezvous point.
The query can also include a cookie, obtained from the response to a
previous query, such that only registrations that weren't included in
the previous response will be returned. This allows peers to
progressively refresh their network view without overhead, which
greatly simplifies real time discovery. It also allows for pagination
of query responses, so that large numbers of peer registrations can be
managed.
### Registration Lifetime
Registration lifetime is controlled by an optional TTL parameter in
the `REGISTER` message. If a TTL is specified, then the registration
persists until the TTL expires. If no TTL was specified, then a default
of 2hrs is implied. A rendezvous point may impose its own upper bound
on TTL, but that bound must be at least 72hrs. If the TTL of a registration
is inadmissible, the rendezvous point may reject the registration with
an `E_INVALID_TTL` status.
Peers can refresh their registrations at any time with a new
`REGISTER` message; the TTL of the new message supersedes previous
registrations. Peers can also cancel existing registrations at any
time with an explicit `UNREGISTER` message.
The registration response includes the actual TTL of the registration,
so that peers know when to refresh.
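The TTL rules above can be sketched as a small admission check. This is a minimal illustration, not part of the protocol: the function name `effective_ttl` and the `max_ttl` parameter are hypothetical, treating non-positive TTLs as inadmissible is an assumption, and the sketch chooses to reject over-long TTLs (the spec says the rendezvous point *may* do so).

```python
DEFAULT_TTL = 2 * 60 * 60         # 2hrs, implied when REGISTER carries no TTL
MIN_UPPER_BOUND = 72 * 60 * 60    # a rendezvous point's upper bound must be >= 72hrs

# Values taken from the ResponseStatus enum in the protobuf definition.
OK = 0
E_INVALID_TTL = 102


def effective_ttl(requested, max_ttl=MIN_UPPER_BOUND):
    """Return (status, ttl_in_seconds) for a REGISTER request.

    `requested` is the TTL from the REGISTER message, or None if omitted.
    `max_ttl` is this rendezvous point's own upper bound (hypothetical knob).
    """
    if max_ttl < MIN_UPPER_BOUND:
        raise ValueError("upper bound on TTL must be at least 72hrs")
    if requested is None:
        return OK, DEFAULT_TTL
    # Assumption: zero or negative TTLs are treated as inadmissible.
    if requested <= 0 or requested > max_ttl:
        return E_INVALID_TTL, 0
    return OK, requested
```

The second element of the result is what the rendezvous point would echo back in `RegisterResponse.ttl`, so that the peer knows when to refresh.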
### Interaction
Clients `A` and `B` connect to the rendezvous point `R` and register for namespace
`my-app` with a `REGISTER` message:
```
A -> R: REGISTER{my-app, {QmA, AddrA}}
R -> A: {OK}
B -> R: REGISTER{my-app, {QmB, AddrB}}
R -> B: {OK}
```
Client `C` connects and registers for namespace `another-app`:
```
C -> R: REGISTER{another-app, {QmC, AddrC}}
R -> C: {OK}
```
Another client `D` can discover peers in `my-app` by sending a `DISCOVER` message; the
rendezvous point responds with the list of current peer registrations and a cookie.
```
D -> R: DISCOVER{ns: my-app}
R -> D: {[REGISTER{my-app, {QmA, AddrA}}
          REGISTER{my-app, {QmB, AddrB}}],
         c1}
```
If `D` wants to discover all peers registered with `R`, then it can omit the namespace
in the query:
```
D -> R: DISCOVER{}
R -> D: {[REGISTER{my-app, {QmA, AddrA}}
          REGISTER{my-app, {QmB, AddrB}}
          REGISTER{another-app, {QmC, AddrC}}],
         c2}
```
If `D` wants to progressively poll for real time discovery, it can use
the cookie obtained from a previous response in order to only ask for
new registrations.
So here we consider a new client `E` registering after the first query,
and a subsequent query that discovers just that peer by including the cookie:
```
E -> R: REGISTER{my-app, {QmE, AddrE}}
R -> E: {OK}
D -> R: DISCOVER{ns: my-app, cookie: c1}
R -> D: {[REGISTER{my-app, {QmE, AddrE}}],
c3}
```
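The interaction above can be modelled with a small in-memory rendezvous point. This is only a sketch under stated assumptions: the spec treats the cookie as opaque bytes, and encoding it as a big-endian sequence number is one possible choice made here for illustration. The class and method names are hypothetical, and TTL expiry is left out.

```python
import struct


class RendezvousPoint:
    """Toy rendezvous point: registrations are ordered by a global counter."""

    def __init__(self):
        self._seq = 0
        self._regs = {}  # (namespace, peer_id) -> (sequence number, addrs)

    def register(self, ns, peer_id, addrs):
        # A refresh replaces the old entry and receives a new sequence number.
        self._seq += 1
        self._regs[(ns, peer_id)] = (self._seq, list(addrs))

    def unregister(self, ns, peer_id):
        self._regs.pop((ns, peer_id), None)

    def discover(self, ns=None, limit=None, cookie=None):
        """Return (registrations, cookie); registrations are (ns, id, addrs)."""
        # Assumption: the cookie is the last sequence number already returned.
        after = struct.unpack(">Q", cookie)[0] if cookie else 0
        matches = sorted(
            (seq, reg_ns, peer_id, addrs)
            for (reg_ns, peer_id), (seq, addrs) in self._regs.items()
            if seq > after and (ns is None or reg_ns == ns)
        )
        if limit is not None:
            matches = matches[:limit]
        last = matches[-1][0] if matches else after
        regs = [(reg_ns, peer_id, addrs) for _, reg_ns, peer_id, addrs in matches]
        return regs, struct.pack(">Q", last)
```

Because the cookie records how far the previous response got, the same mechanism serves both incremental polling and pagination with `limit`. A side effect of this particular encoding is that a refreshed registration shows up again in the next poll.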
### Proof of Work
The protocol as described so far is susceptible to spam attacks from
adversarial actors who generate a large number of peer identities and
register under a namespace of interest (e.g., the relay namespace). This
can be mitigated by requiring a Proof of Work scheme for client
registrations.
This is TBD before finalizing the spec.
### Protobuf
```protobuf
message Message {
  enum MessageType {
    REGISTER = 0;
    REGISTER_RESPONSE = 1;
    UNREGISTER = 2;
    DISCOVER = 3;
    DISCOVER_RESPONSE = 4;
  }

  enum ResponseStatus {
    OK                  = 0;
    E_INVALID_NAMESPACE = 100;
    E_INVALID_PEER_INFO = 101;
    E_INVALID_TTL       = 102;
    E_INVALID_COOKIE    = 103;
    E_NOT_AUTHORIZED    = 200;
    E_INTERNAL_ERROR    = 300;
    E_UNAVAILABLE       = 400;
  }

  message PeerInfo {
    optional bytes id = 1;
    repeated bytes addrs = 2;
  }

  message Register {
    optional string ns = 1;
    optional PeerInfo peer = 2;
    optional int64 ttl = 3; // in seconds
  }

  message RegisterResponse {
    optional ResponseStatus status = 1;
    optional string statusText = 2;
    optional int64 ttl = 3; // in seconds
  }

  message Unregister {
    optional string ns = 1;
    optional bytes id = 2;
  }

  message Discover {
    optional string ns = 1;
    optional int64 limit = 2;
    optional bytes cookie = 3;
  }

  message DiscoverResponse {
    repeated Register registrations = 1;
    optional bytes cookie = 2;
    optional ResponseStatus status = 3;
    optional string statusText = 4;
  }

  optional MessageType type = 1;
  optional Register register = 2;
  optional RegisterResponse registerResponse = 3;
  optional Unregister unregister = 4;
  optional Discover discover = 5;
  optional DiscoverResponse discoverResponse = 6;
}
```

338
secio/README.md Normal file

@@ -0,0 +1,338 @@
# SECIO 1.0.0
> A stream security transport for libp2p. Streams wrapped by SECIO use secure
> sessions to encrypt all traffic.
| Lifecycle Stage | Maturity Level | Status | Latest Revision |
|-----------------|----------------|--------|-----------------|
| 3A | Recommendation | Active | r0, 2019-05-27 |
Authors: [@jbenet], [@bigs], [@yusefnapora]
Interest Group: [@Stebalien], [@richardschneider], [@tomaka], [@raulk]
[@jbenet]: https://github.com/jbenet
[@bigs]: https://github.com/bigs
[@yusefnapora]: https://github.com/yusefnapora
[@Stebalien]: https://github.com/Stebalien
[@richardschneider]: https://github.com/richardschneider
[@tomaka]: https://github.com/tomaka
[@raulk]: https://github.com/raulk
See the [lifecycle document](../00-framework-01-spec-lifecycle.md) for context
about maturity level and spec status.
## Table of Contents
- [SECIO 1.0.0](#secio-100)
- [Table of Contents](#table-of-contents)
- [Implementations](#implementations)
- [Algorithm Support](#algorithm-support)
- [Exchanges](#exchanges)
- [Ciphers](#ciphers)
- [Hashes](#hashes)
- [Data Structures](#data-structures)
- [Protocol](#protocol)
- [Prerequisites](#prerequisites)
- [Message framing](#message-framing)
- [Proposal Generation](#proposal-generation)
- [Determining Roles and Algorithms](#determining-roles-and-algorithms)
- [Key Exchange](#key-exchange)
- [Key marshaling](#key-marshaling)
- [Shared Secret Generation](#shared-secret-generation)
- [Key Stretching](#key-stretching)
- [Creating the Cipher and HMAC signer](#creating-the-cipher-and-hmac-signer)
- [Initiate Secure Channel](#initiate-secure-channel)
- [Secure Message Framing](#secure-message-framing)
- [Initial Packet Verification](#initial-packet-verification)
## Implementations
- [js-libp2p-secio](https://github.com/libp2p/js-libp2p-secio)
- [go-secio](https://github.com/libp2p/go-libp2p-secio)
- [rust-libp2p](https://github.com/libp2p/rust-libp2p/tree/master/protocols/secio)
## Algorithm Support
SECIO allows participating peers to support a subset of the following
algorithms.
### Exchanges
The following elliptic curves are used for ephemeral key generation:
- P-256
- P-384
- P-521
### Ciphers
The following symmetric ciphers are used for encryption of messages once
the SECIO channel is established:
- AES-256
- AES-128
Note that current versions of `go-libp2p` support the Blowfish cipher; however,
support for Blowfish will be dropped in future releases and should not be
considered part of the SECIO spec.
### Hashes
The following hash algorithms are used for key stretching and for HMACs once
the SECIO channel is established:
- SHA256
- SHA512
## Data Structures
The SECIO wire protocol features two message types defined in the version 2 syntax of the
[protobuf description language](https://developers.google.com/protocol-buffers/docs/proto).
```protobuf
syntax = "proto2";

message Propose {
  optional bytes rand = 1;
  optional bytes pubkey = 2;
  optional string exchanges = 3;
  optional string ciphers = 4;
  optional string hashes = 5;
}

message Exchange {
  optional bytes epubkey = 1;
  optional bytes signature = 2;
}
```
These two messages, `Propose` and `Exchange`, are the only serialized types
required to implement SECIO.
## Protocol
### Prerequisites
Prior to undertaking the SECIO handshake described below, it is assumed that
we have already established a dedicated bidirectional channel between both
parties, and that both have agreed to proceed with the SECIO handshake
using [multistream-select][multistream-select] or some other form of protocol
negotiation.
### Message framing
All messages sent over the wire are prefixed with the message length in bytes,
encoded as an unsigned variable length integer as defined
by the [multiformats unsigned-varint spec][unsigned-varint].
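As an illustration of this framing, the sketch below encodes and decodes a length prefix as an unsigned varint (7 data bits per byte, least significant group first, high bit set on every byte except the last). The helper names are hypothetical, and the sketch does not enforce any upper limit on the encoded length.

```python
def encode_uvarint(n):
    """Encode a non-negative integer as an unsigned varint."""
    if n < 0:
        raise ValueError("unsigned varints cannot encode negative numbers")
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)         # final byte, high bit clear
            return bytes(out)


def decode_uvarint(buf):
    """Decode a varint from the start of buf; return (value, bytes_consumed)."""
    value = 0
    shift = 0
    for i, byte in enumerate(buf):
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return value, i + 1
        shift += 7
    raise ValueError("truncated varint")


def frame(message):
    """Prefix a serialized message with its length in bytes."""
    return encode_uvarint(len(message)) + message


def unframe(buf):
    """Split one framed message off buf; return (message, remaining bytes)."""
    length, consumed = decode_uvarint(buf)
    end = consumed + length
    if len(buf) < end:
        raise ValueError("incomplete message")
    return buf[consumed:end], buf[end:]
```

For example, a 300-byte message is prefixed with the two bytes `0xac 0x02`.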
### Proposal Generation
SECIO channel negotiation begins with a proposal phase.
Each side will construct a `Propose` protobuf message (as defined [above](#data-structures)),
setting the fields as follows:
| field | value |
|-------------|--------------------------------------------------------------------------------------|
| `rand` | A 16 byte random nonce, generated using the most secure means available |
| `pubkey` | The sender's public key, serialized [as described in the peer-id spec][peer-id-spec] |
| `exchanges` | A list of supported [key exchanges](#exchanges) as a comma-separated string |
| `ciphers` | A list of supported [ciphers](#ciphers) as a comma-separated string |
| `hashes` | A list of supported [hashes](#hashes) as a comma-separated string |
Both parties serialize this message and send it over the wire. If either party
has prior knowledge of the other party's peer id, they may attempt to validate
that the given public key can be used to generate the same peer id, and may
close the connection if there is a mismatch.
### Determining Roles and Algorithms
Next, the peers use a deterministic formula to compute their roles in the coming
exchanges. Each peer computes:
```
oh1 := sha256(concat(remotePeerPubKeyBytes, myNonce))
oh2 := sha256(concat(myPubKeyBytes, remotePeerNonce))
```
Where `myNonce` is the `rand` component of the local peer's `Propose` message,
and `remotePeerNonce` is the `rand` field from the remote peer's proposal.
With these hashes, determine which peer's preferences to favor. This peer will
be referred to as the "preferred peer". If `oh1 == oh2`, then the peer is
communicating with itself and should return an error. If `oh1 < oh2`, use the
remote peer's preferences. If `oh1 > oh2`, prefer the local peer's preferences.
Given our preference, we now sort through each of the `exchanges`, `ciphers`,
and `hashes` provided by both peers, selecting the first item from our preferred
peer's set that is also shared by the other peer.
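The role computation and algorithm selection can be sketched as follows. Two points are assumptions rather than statements from the text above: the hashes are compared as lexicographically ordered byte strings, and a missing common algorithm is treated as an error. The function names are hypothetical.

```python
import hashlib


def local_is_preferred(my_pubkey, my_nonce, remote_pubkey, remote_nonce):
    """Return True if the local peer's preferences win, False for the remote's."""
    oh1 = hashlib.sha256(remote_pubkey + my_nonce).digest()
    oh2 = hashlib.sha256(my_pubkey + remote_nonce).digest()
    if oh1 == oh2:
        raise ValueError("peer appears to be communicating with itself")
    # Assumption: "<" and ">" mean lexicographic comparison of the digests.
    return oh1 > oh2


def select_best(local_preferred, local_list, remote_list):
    """Pick the first entry of the preferred peer's comma-separated list
    that the other peer also supports."""
    local = local_list.split(",")
    remote = remote_list.split(",")
    first, second = (local, remote) if local_preferred else (remote, local)
    for candidate in first:
        if candidate in second:
            return candidate
    raise ValueError("no algorithm supported by both peers")
```

Both peers evaluate the same two hashes with the roles of `oh1` and `oh2` exchanged, so they reach opposite answers from `local_is_preferred` and therefore agree on which peer is preferred and on the selected exchange, cipher, and hash.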
### Key Exchange
Now the peers prepare a key exchange.
Both peers generate an ephemeral keypair using the elliptic curve algorithm that was
chosen from the proposed `exchanges` in the previous step.
With keys generated, both peers create an `Exchange` message. First, they start by
generating a "corpus" that they will sign.
```
corpus := concat(myProposalBytes, remotePeerProposalBytes, ephemeralPubKey)
```
The `corpus` is then signed using the permanent private key associated with the local
peer's peer id, producing a byte array `signature`.
| field | value |
|-------------|---------------------------------------------------------------------------|
| `epubkey` | The ephemeral public key, marshaled as described [below](#key-marshaling) |
| `signature` | The `signature` of the `corpus` described above |
The peers serialize their `Exchange` messages and write them over the wire. Upon
receiving the remote peer's `Exchange`, the local peer will compute the remote peer's
expected `corpus` using the known proposal bytes and the ephemeral public key sent by
the remote peer in the `Exchange`. The `signature` can then be validated using the
permanent public key of the remote peer obtained in the initial proposal.
Peers MUST close the connection if the signature does not validate.
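The order of the proposal bytes in the corpus depends on whose signature is involved, which is easy to get wrong. The sketch below only builds the byte strings; signing and verification with the peers' permanent keys are omitted, and the function names are hypothetical.

```python
def local_corpus(my_proposal, remote_proposal, my_epubkey):
    """Bytes the local peer signs with its permanent private key."""
    return my_proposal + remote_proposal + my_epubkey


def expected_remote_corpus(my_proposal, remote_proposal, remote_epubkey):
    """Bytes the remote peer is expected to have signed.

    From the remote peer's point of view its own proposal comes first,
    so the two proposals appear in the opposite order.
    """
    return remote_proposal + my_proposal + remote_epubkey
```

The remote peer's `signature` is then checked against `expected_remote_corpus(...)` using the public key from its `Propose` message.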
#### Key marshaling
Within the `Exchange` message, ephemeral public keys are marshaled into the
uncompressed form specified in section 4.3.6 of ANSI X9.62.
This is the behavior provided by the go standard library's
[`elliptic.Marshal`](https://golang.org/pkg/crypto/elliptic/#Marshal) function.
### Shared Secret Generation
Peers now generate their shared secret by combining their ephemeral private key with the
remote peer's ephemeral public key.
First, the remote ephemeral public key is unmarshaled into a point on the elliptic curve
used in the agreed-upon exchange algorithm. If the point is not valid for the agreed-upon
curve, secret generation fails and the connection must be closed.
The remote ephemeral public key is then combined with the local ephemeral private key
by means of elliptic curve scalar multiplication. The result of the multiplication is
the shared secret, which will then be stretched to produce MAC and cipher keys, as
described in the next section.
### Key Stretching
The key stretching process uses an HMAC algorithm to derive encryption and MAC keys
and a stream cipher initialization vector from the shared secret.
Key stretching produces the following three values for each peer:
- A MAC key used to initialize an HMAC algorithm for message verification
- A cipher key used to initialize a block cipher
- An initialization vector (IV), used to generate a CTR stream cipher from the block cipher
The key stretching function will return two data structures `k1` and `k2`, each containing
the three values above.
Before beginning the stretching process, the size of the IV and cipher key are determined
according to the agreed-upon cipher algorithm. The sizes (in bytes) used are as follows:
| cipher type | cipher key size | IV size |
|-------------|-----------------|---------|
| AES-128 | 16 | 16 |
| AES-256 | 32 | 16 |
The generated MAC key will always have a size of 20 bytes.
Once the sizes are known, we can compute the total size of the output we need to generate
as `outputSize := 2 * (ivSize + cipherKeySize + macKeySize)`.
The stretching algorithm will then proceed as follows:
First, an HMAC instance is initialized using the agreed upon hash function and shared secret.
A fixed seed value of `"key expansion"` (encoded into bytes as UTF-8) is fed into the HMAC
to produce an initial digest `a`.
Then, the following process repeats until `outputSize` bytes have been generated:
- reset the HMAC instance or generate a new one using the same hash function and shared secret
- compute digest `b` by feeding `a` and the seed value into the HMAC:
- `b := hmac_digest(concat(a, "key expansion"))`
- append `b` to previously generated output (if any).
- if, after appending `b`, the generated output exceeds `outputSize`, the output is truncated to `outputSize` and generation ends.
- reset the HMAC and feed `a` into it, producing a new value for `a` to be used in the next iteration
- `a = hmac_digest(a)`
- repeat until `outputSize` is reached
Having generated `outputSize` bytes, the output is then split into six parts to
produce the final return values `k1` and `k2`:
```
| k1.IV | k1.CipherKey | k1.MacKey | k2.IV | k2.CipherKey | k2.MacKey |
```
The size of each field is determined by the cipher key and IV sizes detailed above.
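The stretching procedure can be sketched with Python's standard `hmac` module. This is an illustrative reading of the steps above, not a reference implementation; the function name and the dictionary layout of `k1` and `k2` are hypothetical.

```python
import hashlib
import hmac

SEED = b"key expansion"  # fixed seed, UTF-8 encoded
MAC_KEY_SIZE = 20        # bytes


def stretch_keys(shared_secret, hash_name, cipher_key_size, iv_size=16):
    """Derive (k1, k2) from the shared secret.

    hash_name is a hashlib name such as "sha256" or "sha512";
    cipher_key_size is 16 for AES-128 and 32 for AES-256.
    """
    def digest(data):
        # A fresh HMAC keyed with the shared secret for every digest.
        return hmac.new(shared_secret, data, hash_name).digest()

    half = iv_size + cipher_key_size + MAC_KEY_SIZE
    output_size = 2 * half

    a = digest(SEED)              # initial digest `a`
    output = b""
    while len(output) < output_size:
        output += digest(a + SEED)    # b := hmac_digest(concat(a, seed))
        a = digest(a)                 # a = hmac_digest(a)
    output = output[:output_size]     # truncate any excess

    def split(chunk):
        return {
            "iv": chunk[:iv_size],
            "cipher_key": chunk[iv_size:iv_size + cipher_key_size],
            "mac_key": chunk[iv_size + cipher_key_size:],
        }

    return split(output[:half]), split(output[half:])
```

For AES-128 the total output is `2 * (16 + 16 + 20) = 104` bytes, which takes four SHA256 digests (128 bytes) before truncation.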
### Creating the Cipher and HMAC signer
With `k1` and `k2` computed, swap the two values if the remote peer is the
preferred peer. After swapping if necessary, `k1` becomes the local peer's key
and `k2` the remote peer's key.
Each peer now generates an HMAC signer using the agreed upon algorithm and the
`MacKey` produced by the key stretcher.
Each peer will also initialize the agreed-upon block cipher using the generated
`CipherKey`, and will then initialize a CTR stream cipher from the block cipher
using the generated initialization vector `IV`.
### Initiate Secure Channel
With the cipher and HMAC signer created, the secure channel is ready to be
opened.
#### Secure Message Framing
To communicate over the channel, peers send packets containing an encrypted
body and an HMAC signature of the encrypted body.
The encrypted body is produced by applying the stream cipher initialized
previously to an arbitrary plaintext message payload. The encrypted data
is then fed into the HMAC signer to produce the HMAC signature.
Once the encrypted body and HMAC signature are known, they are concatenated
together, and their combined length is prefixed to the resulting payload.
Each packet is of the form:
```
[uint32 length of packet | encrypted body | hmac signature of encrypted body]
```
The packet length is in bytes, and it is encoded as an unsigned 32-bit integer
in network (big endian) byte order.
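A sketch of building and checking a packet follows. Encryption is out of scope here: `encrypted_body` is assumed to be the output of the CTR stream cipher, and the function names are hypothetical. The MAC length is taken from the agreed hash function.

```python
import hmac
import struct


def seal(encrypted_body, mac_key, hash_name="sha256"):
    """Build [uint32 length | encrypted body | hmac of encrypted body]."""
    tag = hmac.new(mac_key, encrypted_body, hash_name).digest()
    payload = encrypted_body + tag
    return struct.pack(">I", len(payload)) + payload  # big-endian uint32


def open_packet(packet, mac_key, hash_name="sha256"):
    """Check the length prefix and the HMAC; return the encrypted body."""
    if len(packet) < 4:
        raise ValueError("packet too short for a length prefix")
    (length,) = struct.unpack(">I", packet[:4])
    payload = packet[4:4 + length]
    if len(payload) != length:
        raise ValueError("incomplete packet")
    tag_size = hmac.new(mac_key, b"", hash_name).digest_size
    if length < tag_size:
        raise ValueError("packet too short to contain an HMAC")
    body, tag = payload[:-tag_size], payload[-tag_size:]
    expected = hmac.new(mac_key, body, hash_name).digest()
    if not hmac.compare_digest(tag, expected):  # constant-time comparison
        raise ValueError("HMAC verification failed")
    return body
```

The returned body still has to be decrypted with the stream cipher. The HMAC is verified over the ciphertext, so a tampered packet is rejected before any decryption takes place.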
#### Initial Packet Verification
The first packet transmitted by each peer must be the remote peer's nonce.
Each peer will decrypt the message body and validate the HMAC signature,
comparing the decrypted output to the nonce received in the initial
`Propose` message. If either peer is unable to validate the initial
packet against the known nonce, they must abort the connection.
If both peers successfully validate the initial packet, the secure channel has
been opened and is ready for use, using the framing rules described
[above](#secure-message-framing).
[peer-id-spec]: https://github.com/libp2p/specs/peer-ids/peer-ids.md
[multistream-select]: https://github.com/multiformats/multistream-select
[unsigned-varint]: https://github.com/multiformats/unsigned-varint


@@ -0,0 +1,47 @@
# Design considerations for the libp2p TLS Handshake
## Requirements
There are two main requirements that prevent us from using the straightforward way to run a TLS handshake (which would be to simply use the host key to create a self-signed certificate).
1. We want to use different key types: RSA, ECDSA, Ed25519, and Secp256k1 (and maybe more in the future?).
2. We want to be able to send the key type along with the key (see https://github.com/libp2p/specs/issues/111).
The first point is problematic in practice, because Go currently only supports RSA and ECDSA certificates. Support for Ed25519 was planned for Go 1.12, but was deferred recently, and the Go team is now evaluating interest in this in order to prioritize their work, so this might or might not happen in Go 1.13. I'm not aware of any plans for Secp256k1 at the moment.
The second requirement implies that we might want to add some additional (free-form) information to the handshake, and we need to find a field to stuff that into.
The handshake protocol described here:
* supports arbitrary keys, independent of the signature algorithms implemented by the TLS library in use
* defines how future versions of this protocol might be negotiated without requiring any out-of-band information or additional roundtrips
## Design Choices
### TLS 1.3 - What about older versions?
The handshake protocol requires TLS 1.3 support. This means that the handshake between two peers that have never communicated before will typically complete in just a single roundtrip. With older TLS versions, a handshake typically takes two roundtrips. By not specifying support for older TLS versions, we increase performance and simplify the protocol.
### Why we're not using the host key for the certificate
The current proposal uses a self-signed certificate to carry the host's public key in the libp2p Public Key Extension. The key used to generate the self-signed certificate has no relationship with the host key. This key can be generated for every single connection, or can be generated at boot time.
One optimisation that was considered when designing the protocol was to use the libp2p host key to generate the certificate in the case of RSA and ECDSA keys (which we can assume to be supported signature schemes by all peers). That would have allowed us to strip the host key and the signature from the key extension, in order to
1. reduce the size of the certificate and
2. reduce the number of signature verifications the peer has to perform from 2 to 1.
The protocol does not include this optimisation, because
1. Assuming that the peer uses an ECDSA key for generating the self-signed certificate, this only saves about 150 bytes if the host key is an ECDSA key as well, and it even slightly increases the size of the certificate in case of an RSA host key. Furthermore, for ECDSA keys, the size of all handshake messages combined is less than 900 bytes, so having a slightly larger certificate won't require us to send more (TCP / QUIC) packets.
2. For a client, the number of signature verifications shouldn't pose a problem, since it controls the rate of its dials. Only for servers this might be a problem, since a malicious client could force a server to waste resources on signature verification. However, this is not a particularly interesting DoS vector, since the client's certificate is sent in its second flight (after receiving the ServerHello and the server's certificate), so it requires the attacker to actually perform most of the TLS handshake, including encrypting the certificate chain with a key that's tied to that handshake.
### Versioning - How we could roll out a new version of this protocol in the future
An earlier version of this document included a version negotiation mechanism. While it is a desirable property to be able to change things in the future, it also adds a lot of complexity.
To keep things simple, the current proposal does not include a version negotiation mechanism. A future version of this protocol might:
1. Change the format in which the keys are transmitted. An x509 extension has an ID (the Object Identifier, OID), so we can use a new OID if we want to change the way we encode information. x509 certificates allow us to include multiple extensions, so we can even send the old and the new version during a transition period. In the handshake protocol defined here, peers are required to skip over extensions that they don't understand.
2. For more involved changes, a new version might (ab)use the SNI field in the ClientHello to announce support for new versions. To allow for this to work, the current version requires clients not to send anything in the SNI field and servers to completely ignore this field, no matter what its contents are.

67
tls/tls.md Normal file

@@ -0,0 +1,67 @@
# libp2p TLS Handshake
## Introduction
This document describes how [TLS 1.3](https://tools.ietf.org/html/rfc8446) is used to secure libp2p connections. Endpoints authenticate to their peers by encoding their public key into an x509 certificate extension. The protocol described here allows peers to use arbitrary key types, not constrained to those for which signing of x509 certificates is specified.
## Handshake Protocol
The libp2p handshake uses TLS 1.3 (and higher). Endpoints MUST NOT negotiate lower TLS versions.
During the handshake, peers authenticate each other's identity as described in [Peer Authentication](#peer-authentication). Endpoints MUST verify the peer's identity. Specifically, this means that servers MUST require client authentication during the TLS handshake, and MUST abort a connection attempt if the client fails to provide the requested authentication information.
## Peer Authentication
In order to be able to use arbitrary key types, peers don't use their host key to sign the x509 certificate they send during the handshake. Instead, the host key is encoded into the [libp2p Public Key Extension](#libp2p-public-key-extension), which is carried in a self-signed certificate. The key used to generate and sign this certificate SHOULD NOT be related to the host's key. Endpoints MAY generate a new key and certificate for every connection attempt, or they MAY reuse the same key and certificate for multiple connections. Endpoints MUST choose a key that will allow the peer to verify the certificate (i.e. choose a signature algorithm that the peer supports), and SHOULD use a key type which allows for efficient signature computation and which reduces the combined size of the certificate and the signature.
Endpoints MUST NOT send a certificate chain that contains more than one certificate. The certificate MUST have NotBefore and NotAfter fields set such that the certificate is valid at the time it is received by the peer. When receiving the certificate chain, an endpoint MUST check these conditions and abort the connection attempt if the presented certificate is not yet valid or if it is expired.
The certificate MUST contain the [libp2p Public Key Extension](#libp2p-public-key-extension). If this extension is missing, endpoints MUST abort the connection attempt. The certificate MAY contain other extensions; implementations MUST ignore extensions with unknown OIDs.
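The receiver-side checks described above can be summarized in a short sketch. It assumes the certificate chain has already been parsed into plain dictionaries (a hypothetical representation, not the API of any TLS library) and uses the extension OID defined in this document; verifying the certificate's self-signature and the contents of the extension is not shown.

```python
from datetime import datetime, timezone

LIBP2P_PUBLIC_KEY_EXT_OID = "1.3.6.1.4.1.53594.1.1"


def check_certificate_chain(chain, now=None):
    """Apply the chain-level checks and return the raw extension value.

    `chain` is a list of dicts with the keys "not_before" and "not_after"
    (timezone-aware datetimes) and "extensions" (a dict from OID to bytes).
    """
    now = now or datetime.now(timezone.utc)
    if len(chain) != 1:
        raise ValueError("expected exactly one certificate")
    cert = chain[0]
    if now < cert["not_before"]:
        raise ValueError("certificate is not yet valid")
    if now > cert["not_after"]:
        raise ValueError("certificate has expired")
    try:
        # Extensions with other OIDs are never looked at, i.e. ignored.
        return cert["extensions"][LIBP2P_PUBLIC_KEY_EXT_OID]
    except KeyError:
        raise ValueError("libp2p Public Key Extension is missing")
```

Any `ValueError` here corresponds to aborting the connection attempt.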
Note for clients: Since clients complete the TLS handshake immediately after sending the certificate (and the TLS ClientFinished message), the handshake will appear as having succeeded before the server had the chance to verify the certificate. In this state, the client can already send application data. If certificate verification fails on the server side, the server will close the connection without processing any data that the client sent.
### libp2p Public Key Extension
In order to prove ownership of its host key, an endpoint sends two values:
- the public host key
- a signature performed using the private host key
The public host key allows the peer to calculate the peer ID of the peer it is connecting to. Clients MUST verify that the peer ID derived from the certificate matches the peer ID they intended to connect to, and MUST abort the connection if there is a mismatch.
The peer signs the concatenation of the string "libp2p-tls-handshake:" and the public key that it used to generate the certificate carrying the libp2p Public Key Extension using its private host key. This signature provides cryptographic proof that the peer was in possession of the private host key at the time the certificate was signed. Peers MUST verify the signature, and abort the connection attempt if signature verification fails.
The public host key and the signature are ASN.1-encoded into the SignedKey data structure, which is carried in the libp2p Public Key Extension. The libp2p Public Key Extension is an x509 extension with the Object Identifier 1.3.6.1.4.1.53594.1.1.
```asn1
SignedKey ::= SEQUENCE {
  publicKey BIT STRING,
  signature BIT STRING
}
```
The publicKey field of SignedKey contains the public host key of the endpoint, encoded using the following protobuf.
```protobuf
enum KeyType {
  RSA = 0;
  Ed25519 = 1;
  Secp256k1 = 2;
  ECDSA = 3;
}

message PublicKey {
  required KeyType Type = 1;
  required bytes Data = 2;
}
```
TODO: PublicKey.Data looks underspecified. Define precisely how to marshal the key.
## Future Extensibility
Future versions of this handshake protocol MAY use the Server Name Indication in the ClientHello as defined in [RFC 6066, section 3](https://tools.ietf.org/html/rfc6066) to announce their support for other versions. In order to keep this flexibility for future versions, clients that only support the version of the handshake defined in this document MUST NOT send any value in the Server Name Indication. Servers that only support this version MUST ignore this field; specifically, they MUST NOT check if it was empty.