bubble up and add logo

This commit is contained in:
David Dias
2015-11-25 15:03:24 +00:00
commit 8fc1640e06
19 changed files with 854 additions and 0 deletions

30
1-introduction.md Normal file

@@ -0,0 +1,30 @@
1 Introduction
==============
While building IPFS, the InterPlanetary FileSystem[?], we learned about the many challenges of running a distributed file system on top of heterogeneous devices with different network setups and capabilities. During this process, we had to revisit the whole network stack and devise solutions to overcome the obstacles imposed by the design decisions of its several layers and protocols, without breaking compatibility or recreating technologies.
In order to build this library, we focused on tackling problems independently, creating less complex solutions with powerful abstractions that, when composed, offer an environment in which a Peer-to-Peer application can work successfully.
## 1.1 Motivation
`libp2p` is the result of our collective experience of building a distributed system. It puts the responsibility on developers to decide how they want their app to interoperate with others in the network, favoring configuration and extensibility over assumptions about the network setup.
In essence, a peer using libp2p should be able to communicate with another peer using different transports, including connection relay, and talk over different protocols, negotiated on demand.
## 1.2 Goals
Our goals for the libp2p specification and its implementations are:
- Enable the use of various:
  - transports: TCP, UDP, SCTP, UDT, uTP, QUIC, SSH, etc.
  - authenticated transports: TLS, DTLS, CurveCP, SSH
- Efficient use of sockets (connection reuse)
- Enable communications between peers to be multiplexed over one socket (avoiding handshake overhead)
- Enable multiple protocols and their respective versions to be used between peers, using a negotiation process
- Be backwards compatible
- Work in current systems
- Use the current network technologies to their best capability
- Have NAT Traversal
- Enable connections to be relayed
- Enable encrypted channels
- Efficient use of underlying transport (e.g. native stream muxing, native auth, etc)

78
2-state-of-the-art.md Normal file

@@ -0,0 +1,78 @@
2 An analysis of the State of the Art in Network Stacks
=======================================================
This section presents an analysis of the available protocols and architectures for a Network Stack. The goal is to provide the foundations from which to draw conclusions and to understand libp2p's requirements and architecture.
## 2.1 The client-server model
The client-server model implies that the parties at the two ends of the channel have different roles, support different services and/or have different capabilities; in other words, they speak different protocols.
Building client-server applications has been the natural tendency for a number of reasons:
- The bandwidth inside a DataCenter is considerably high compared to the one available for clients to connect between each other
- DataCenter resources are considerably cheaper, due to efficient usage and bulk stocking
- Enables easier methods for the developer and system admin to have a fine grained control over the application
- Reduces the number of heterogeneous systems to be handled (although it is still considerably high)
- Systems like NAT make it really hard for client machines to find and talk with each other, forcing a developer to perform very clever hacks to traverse these obstacles.
- Protocols started to be designed with the assumption that a developer will create a client-server application from the start.
We even learned how to hide all of the complexity of a distributed system behind gateways on the Internet, using protocols that were designed to perform a point-to-point operation, such as HTTP, making the cascade of service calls made for each request opaque to the application.
`libp2p` offers a move away from the client-server model towards dialer-listener interactions, where it is not implicit which of the entities, dialer or listener, has which capabilities or is enabled to perform which actions. Setting up a connection between two applications today is a multilayered problem to solve, and these connections should not be biased towards a single purpose; instead, they should support several other protocols working on top of the established connection. In a client-server model, a server sending data without a prior request from the client is known as a push model, which typically adds more complexity; in a dialer-listener model, both entities can perform requests independently.
## 2.2 Categorizing the network stack protocols by solutions
Before diving into the libp2p protocols, it is important to understand the large diversity of protocols already in wide use and deployment that help maintain today's simple abstractions. For example, when one thinks about an HTTP connection, one might naively just think of HTTP/TCP/IP as the main protocols involved, but in reality many more participate, all depending on the usage, the networks involved, and so on. Protocols like DNS, DHCP, ARP, OSPF, Ethernet, 802.11 (WiFi), ... and many others get involved. Looking inside ISPs' own networks would reveal dozens more.
Additionally, it's worth noting that the traditional 7-layer OSI model characterization does not fit libp2p. Instead, we categorize protocols based on their role, the problem they solve. The upper layers of the OSI model are geared towards point-to-point links between applications, whereas the libp2p protocols speak more towards various sizes of networks, with various properties, under various security models. Different libp2p protocols can have the same role (in the OSI model, this would be "address the same layer"), meaning that multiple protocols can run simultaneously, all addressing one role (instead of one-protocol-per-layer in traditional OSI stacking). For example, bootstrap lists, mDNS, DHT Discovery, and PEX are all forms of the role "Peer Discovery"; they can coexist and even synergize.
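As a rough illustration of several protocols sharing one role, the Go sketch below merges the results of multiple discovery mechanisms behind a single interface. The `Discovery` interface, the `staticDiscovery` type, and the peer names are hypothetical and exist only for this example; they are not part of any libp2p implementation.

```go
package main

import (
	"fmt"
	"sort"
)

// Discovery is a hypothetical interface for the "Peer Discovery" role.
type Discovery interface {
	FindPeers() []string
}

// staticDiscovery stands in for a real mechanism (bootstrap list, mDNS, ...).
type staticDiscovery []string

func (s staticDiscovery) FindPeers() []string { return s }

// discoverAll runs every mechanism and returns the deduplicated, sorted union.
func discoverAll(mechanisms ...Discovery) []string {
	seen := map[string]bool{}
	var peers []string
	for _, m := range mechanisms {
		for _, p := range m.FindPeers() {
			if !seen[p] {
				seen[p] = true
				peers = append(peers, p)
			}
		}
	}
	sort.Strings(peers)
	return peers
}

func main() {
	bootstrap := staticDiscovery{"peerA", "peerB"}
	mdns := staticDiscovery{"peerB", "peerC"}
	fmt.Println(discoverAll(bootstrap, mdns))
}
```

Because every mechanism satisfies the same interface, adding another one (for example a DHT-based discovery) would not change the caller.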
### 2.2.1 Establishing the physical Link
- ethernet
- wifi
- bluetooth
- usb
### 2.2.2 Addressing a machine or process
- IPv4
- IPv6
- Hidden Addressing, like SDP
### 2.2.3 Discovering other peers or services
- ARP
- DHCP
- DNS
- Onion
### 2.2.4 Routing messages through the Network
- RIP(1, 2)
- OSPF
- PPP
- Tor
- I2P
- cjdns
### 2.2.5 Transport
- TCP
- UDP
- UDT
- QUIC
- WebRTC DataChannel
### 2.2.6 Agreed semantics for applications to talk to each other
- RMI
- Remoting
- RPC
- HTTP
## 2.3 Current Shortcomings
Although we currently have a panoply of protocols available for our services to communicate, the abundance and variety of solutions is also a shortfall. It is currently difficult for an application to support, and be available through, several transports (e.g. the lack of a TCP/UDP stack in browser applications).
There is also no 'presence linking', meaning that there is no way for a peer to announce itself on several transports such that other peers can be sure that it is always the same peer.

104
3-requirements.md Normal file

@@ -0,0 +1,104 @@
3 Requirements and considerations
=================================
## 3.1 NAT traversal
Network Address Translation is ubiquitous in the internet. Not only are most consumer devices behind many layers of NATs, but most datacenter nodes are often behind NAT for security or virtualization reasons. As we move into containerized deployments, this is getting worse. IPFS implementations SHOULD provide a way to traverse NATs, otherwise it is likely that operation will be affected. Even nodes meant to run with real IP addresses must implement NAT traversal techniques, as they may need to establish connections to peers behind NAT.
libp2p accomplishes full NAT traversal using an ICE-like protocol. It is not exactly ICE, as ipfs networks provide the possibility of relaying communications over the IPFS protocol itself, for coordinating hole-punching or even relaying communication.
It is recommended that implementations use one of the many NAT traversal libraries available, such as `libnice`, `libwebrtc`, or `natty`. However, NAT traversal must be interoperable.
## 3.2 Relay
Unfortunately, due to symmetric NATs, container and VM NATs, and other impossible-to-bypass NATs, libp2p MUST fall back to relaying communication to establish a full connectivity graph. To be complete, implementations MUST support relay, though it SHOULD be optional and able to be turned off by end users.
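The fallback behaviour described above can be sketched in Go as follows. This is only an illustration under stated assumptions: `dialFunc`, `dialWithFallback`, `blockedDial`, and `relayDial` are hypothetical names, and connections are represented as plain strings to keep the example self-contained.

```go
package main

import (
	"errors"
	"fmt"
)

// dialFunc is a hypothetical signature for one way of reaching a peer.
type dialFunc func(peer string) (string, error)

// dialWithFallback tries a direct dial first and relays only if that fails
// and the end user has not turned relaying off.
func dialWithFallback(peer string, direct, relay dialFunc, relayEnabled bool) (string, error) {
	conn, err := direct(peer)
	if err == nil {
		return conn, nil
	}
	if !relayEnabled {
		return "", fmt.Errorf("direct dial failed and relay is disabled: %w", err)
	}
	return relay(peer)
}

// blockedDial simulates a direct path that cannot be established,
// for example because of a symmetric NAT.
func blockedDial(peer string) (string, error) {
	return "", errors.New("direct dial blocked by NAT")
}

// relayDial simulates a connection established through a relay.
func relayDial(peer string) (string, error) {
	return "relayed:" + peer, nil
}

func main() {
	conn, err := dialWithFallback("peerX", blockedDial, relayDial, true)
	fmt.Println(conn, err)

	_, err = dialWithFallback("peerX", blockedDial, relayDial, false)
	fmt.Println(err)
}
```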
## 3.3 Encryption
Communications on libp2p may be:
- **encrypted**
- **signed** (not encrypted)
- **clear** (not encrypted, not signed)
We take both security and performance seriously. We recognize that encryption is not viable for some in-datacenter high performance use cases.
We recommend that:
- implementations encrypt all communications by default
- implementations are audited
- unless absolutely necessary, users normally operate with encrypted communications only.
libp2p uses ciphersuites like TLS.
**NOTE:** we do not use TLS directly, because we do not want the CA system baggage. Most TLS implementations are very big. Since the libp2p model begins with keys, libp2p only needs to apply ciphers. This is a minimal portion of the whole TLS standard.
## 3.4 Transport Agnostic
libp2p is transport agnostic, so it can run over any transport protocol. It does not even depend on IP; it may run on top of NDN, XIA, and other new internet architectures.
In order to reason about possible transports, libp2p uses [multiaddr](https://github.com/jbenet/multiaddr), a self-describing addressing format. This makes it possible for libp2p to treat addresses opaquely everywhere in the system, and have support for various transport protocols in the network layer. The actual format of addresses in libp2p is `ipfs-addr`, a multiaddr that ends with an ipfs nodeid. For example, these are all valid `ipfs-addrs`:
```
# ipfs over tcp over ipv6 (typical tcp)
/ip6/fe80::8823:6dff:fee7:f172/tcp/4001/ipfs/QmYJyUMAcXEw1b5bFfbBbzYu5wyyjLMRHXGUkCXpag74Fu
# ipfs over utp over udp over ipv4 (udp-shimmed transport)
/ip4/162.246.145.218/udp/4001/utp/ipfs/QmYJyUMAcXEw1b5bFfbBbzYu5wyyjLMRHXGUkCXpag74Fu
# ipfs over ipv6 (unreliable)
/ip6/fe80::8823:6dff:fee7:f172/ipfs/QmYJyUMAcXEw1b5bFfbBbzYu5wyyjLMRHXGUkCXpag74Fu
# ipfs over tcp over ip4 over tcp over ip4 (proxy)
/ip4/162.246.145.218/tcp/7650/ip4/192.168.0.1/tcp/4001/ipfs/QmYJyUMAcXEw1b5bFfbBbzYu5wyyjLMRHXGUkCXpag74Fu
# ipfs over ethernet (no ip)
/ether/ac:fd:ec:0b:7c:fe/ipfs/QmYJyUMAcXEw1b5bFfbBbzYu5wyyjLMRHXGUkCXpag74Fu
```
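To make the layering in these addresses concrete, here is a minimal Go sketch that splits such a string into protocol/value pairs. It is not the multiaddr library's parser: the `hasValue` table, the `component` type, and `parseAddr` are assumptions made for this example, and the table only covers the protocols that appear in the examples above.

```go
package main

import (
	"fmt"
	"strings"
)

// hasValue lists the protocols used in the examples above and whether each
// is followed by a value. This table is an assumption made for the sketch.
var hasValue = map[string]bool{
	"ip4": true, "ip6": true, "tcp": true, "udp": true,
	"ether": true, "ipfs": true, "utp": false, "udt": false,
}

// component is one protocol step of an address, e.g. {"tcp", "4001"}.
type component struct {
	Proto, Value string
}

// parseAddr splits a multiaddr-style string into its components.
func parseAddr(addr string) ([]component, error) {
	if !strings.HasPrefix(addr, "/") {
		return nil, fmt.Errorf("address must start with '/': %q", addr)
	}
	parts := strings.Split(addr[1:], "/")
	var out []component
	for i := 0; i < len(parts); i++ {
		needsValue, known := hasValue[parts[i]]
		if !known {
			return nil, fmt.Errorf("unknown protocol %q", parts[i])
		}
		c := component{Proto: parts[i]}
		if needsValue {
			if i+1 >= len(parts) {
				return nil, fmt.Errorf("protocol %q is missing its value", parts[i])
			}
			i++
			c.Value = parts[i]
		}
		out = append(out, c)
	}
	return out, nil
}

func main() {
	comps, err := parseAddr("/ip4/162.246.145.218/udp/4001/utp/ipfs/QmYJyUMAcXEw1b5bFfbBbzYu5wyyjLMRHXGUkCXpag74Fu")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, c := range comps {
		fmt.Println(c.Proto, c.Value)
	}
}
```

Reading the components from left to right gives the encapsulation order: each protocol runs on top of the one before it.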
**Note:** at this time, no unreliable implementations exist. The protocol's interface for defining and using unreliable transport has not been defined.
**TODO:** define how unreliable transport would work. base it on webrtc.
## 3.5 Multi-Multiplexing
The libp2p Protocol is a collection of multiple protocols. In order to conserve resources, and to make connectivity easier, libp2p can perform all its operations through a single port, such as TCP or UDP port, depending on the transports used. libp2p can multiplex its many protocols through point-to-point connections. This multiplexing is for both reliable streams and unreliable datagrams.
libp2p is pragmatic. It seeks to be usable in as many settings as possible, to be modular and flexible to fit various use cases, and to force as few choices as possible. Thus the libp2p network layer provides what we're loosely referring to as "multi-multiplexing":
- can multiplex multiple listen network interfaces
- can multiplex multiple transport protocols
- can multiplex multiple connections per peer
- can multiplex multiple client protocols
- can multiplex multiple streams per protocol, per connection (SPDY, HTTP2, QUIC, SSH)
- has flow control (backpressure, fairness)
- encrypts each connection with a different ephemeral key
To give an example, imagine a single IPFS node that:
- listens on a particular TCP/IP address
- listens on a different TCP/IP address
- listens on a SCTP/UDP/IP address
- listens on a UDT/UDP/IP address
- has multiple connections to another node X
- has multiple connections to another node Y
- has multiple streams open per connection
- multiplexes streams over http2 to node X
- multiplexes streams over ssh to node Y
- one protocol mounted on top of libp2p uses one stream per peer
- one protocol mounted on top of libp2p uses multiple streams per peer
Not providing this level of flexibility makes it impossible to use libp2p in various platforms, use cases, or network setups. It is not important that all implementations support all choices; what is critical is that the spec is flexible enough to allow implementations to use precisely what they need. This ensures that complex user or application constraints do not rule out libp2p as an option.
## 3.6 Enable several network topologies
Different systems have different requirements, and with those come different topologies. In the P2P literature, these topologies are enumerated as: Unstructured, Structured, Hybrid and Centralised.
Centralised topologies are the most common in Web Application infrastructures; they require a given service or services to be present at all times in a known static location, so that other services can access them. Unstructured networks are P2P networks where the topology is completely random, or at least non-deterministic, while structured networks have an implicit way of organizing themselves. Hybrid networks are a mix of the last two.
With this in consideration, libp2p must be ready to perform different routing mechanisms and peer discovery, in order to build the routing tables that will enable services to propagate messages or to find each other.
## 3.7 Resource Discovery
libp2p also solves the problem of discovering resources inside a network through Records. A record is a unit of data that can be digitally signed, timestamped and/or used with other methods to give it an ephemeral validity. These Records hold pieces of information, such as the location or availability of resources present in the network. These resources can be data, storage, CPU cycles and other types of services.
libp2p must not put a constraint on the location of resources; instead, it must offer ways to find them easily in the network or through a side channel.
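As a minimal sketch of a signed record with ephemeral validity, the Go example below signs a record with an Ed25519 key and rejects it once it expires or is modified. The `Record` layout, its payload encoding, and the helper names are assumptions made for illustration; the actual record format is not defined in this section.

```go
package main

import (
	"crypto/ed25519"
	"fmt"
	"time"
)

// Record is a hypothetical signed record with a limited validity window.
type Record struct {
	Key       string
	Value     []byte
	Expires   time.Time
	Signature []byte
}

// payload returns the bytes covered by the signature.
func (r *Record) payload() []byte {
	return []byte(fmt.Sprintf("%s|%x|%d", r.Key, r.Value, r.Expires.Unix()))
}

// NewRecord builds a record valid for ttl and signs it with priv.
func NewRecord(priv ed25519.PrivateKey, key string, value []byte, now time.Time, ttl time.Duration) *Record {
	r := &Record{Key: key, Value: value, Expires: now.Add(ttl)}
	r.Signature = ed25519.Sign(priv, r.payload())
	return r
}

// Valid reports whether the record is unexpired at now and correctly signed.
func (r *Record) Valid(pub ed25519.PublicKey, now time.Time) bool {
	return now.Before(r.Expires) && ed25519.Verify(pub, r.payload(), r.Signature)
}

// checkRecord exercises three cases: fresh, expired, and tampered with.
func checkRecord() (fresh, expired, tampered bool) {
	pub, priv, err := ed25519.GenerateKey(nil) // nil selects crypto/rand
	if err != nil {
		panic(err)
	}
	now := time.Now()
	rec := NewRecord(priv, "/providers/example", []byte("peerA"), now, time.Hour)
	fresh = rec.Valid(pub, now)
	expired = rec.Valid(pub, now.Add(2*time.Hour))
	rec.Value = []byte("peerB") // modify the record after signing
	tampered = rec.Valid(pub, now)
	return
}

func main() {
	fmt.Println(checkRecord())
}
```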

134
4-architecture.md Normal file

@@ -0,0 +1,134 @@
4 Architecture
==============
libp2p was designed around the Unix Philosophy: creating smaller components that are easier to understand and to test. These components should also be swappable, in order to accommodate different technologies or scenarios and to make the system upgradable over time.
Although different Peers can support different protocols depending on their capabilities, any Peer can act as a dialer and/or a listener for connections from other Peers. Once established, these connections can be reused from both ends, removing the distinction between clients and servers.
The libp2p interface acts as a thin veneer over a multitude of subsystems that are required in order for peers to be able to communicate. These subsystems are allowed to be built on top of other subsystems as long as they respect the standardized interface. The main areas where these subsystems fit are:
- Peer Routing - Mechanism to find a Peer in a network. This Routing can be done recursively, iteratively or even in a broadcast/multicast mode.
- Swarm - Handles everything that touches the 'opening a stream' part of libp2p, from protocol muxing and stream muxing to NAT Traversal and Connection Relaying, while being multitransport.
- Distributed Record Store - A system to store and distribute records. Records are small entries used by other systems for signaling, establishing links, announcing peers or content, and so on. It has a similar role to DNS in the broader internet.
- Discovery - Finding or identifying other peers in the network.
Each of these subsystems exposes a well-known interface (see chapter 6 for Interfaces) and may use the others in order to fulfil its goal. A global overview of the system is:
```
┌─────────────────────────────────────────────────────────────────────────────────┐
│ libp2p │
└─────────────────────────────────────────────────────────────────────────────────┘
┌─────────────────┐┌─────────────────┐┌──────────────────────────┐┌───────────────┐
│ Peer Routing ││ Swarm ││ Distributed Record Store ││ Discovery │
└─────────────────┘└─────────────────┘└──────────────────────────┘└───────────────┘
```
## 4.1 Peer Routing
A Peer Routing subsystem exposes an interface to identify the peers to which a message should be routed in the DHT. It receives a key and must return one or more PeerInfo objects.
We present two examples of possible Peer Routing subsystems, the first based on the Kademlia DHT and the second based on mDNS. Nevertheless, other Peer Routing mechanisms can be implemented, as long as they fulfil the same expectations and interface.
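One possible shape for such an interface, sketched in Go, is shown below. The names `PeerRouting`, `FindPeers`, and `tableRouting`, as well as the fields of `PeerInfo`, are assumptions made for this example and not the standardized interface; a real subsystem would consult a DHT or mDNS rather than a fixed table.

```go
package main

import "fmt"

// PeerInfo holds a peer's ID and its known addresses (fields are illustrative).
type PeerInfo struct {
	ID    string
	Addrs []string
}

// PeerRouting is a hypothetical Go rendering of the interface described
// above: given a key, return one or more PeerInfo objects.
type PeerRouting interface {
	FindPeers(key string) ([]PeerInfo, error)
}

// tableRouting is a toy implementation backed by a fixed lookup table.
type tableRouting map[string][]PeerInfo

func (t tableRouting) FindPeers(key string) ([]PeerInfo, error) {
	peers, ok := t[key]
	if !ok || len(peers) == 0 {
		return nil, fmt.Errorf("no peers known for key %q", key)
	}
	return peers, nil
}

// exampleRouting returns a routing subsystem that knows a single key.
func exampleRouting() PeerRouting {
	return tableRouting{
		"some-key": {{ID: "peerA", Addrs: []string{"/ip4/10.0.0.1/tcp/4001"}}},
	}
}

func main() {
	r := exampleRouting()
	peers, err := r.FindPeers("some-key")
	fmt.Println(peers, err)

	_, err = r.FindPeers("missing")
	fmt.Println(err)
}
```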
### 4.1.1 kad-routing
kad-routing implements the Kademlia Routing table, where each peer holds a set of k-buckets, each of them containing several PeerInfo from other peers in the network.
```
┌──────────────────────────────────────────────────────────────┐
│ Peer Routing │
│ │
│┌──────────────┐┌────────────────┐┌──────────────────────────┐│
││ kad-routing ││ mDNS-routing ││ other-routing-mechanisms ││
││ ││ ││ ││
││ ││ ││ ││
│└──────────────┘└────────────────┘└──────────────────────────┘│
└──────────────────────────────────────────────────────────────┘
```
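In Kademlia, the k-bucket a remote peer falls into is derived from the XOR distance between the two IDs, which is commonly computed as the length of their shared bit prefix. The Go sketch below shows that computation; hashing the peer IDs with SHA-256 and the function names are assumptions made for this example.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"math/bits"
)

// commonPrefixLen returns the number of leading bits shared by a and b,
// which equals the number of leading zero bits of a XOR b.
func commonPrefixLen(a, b [32]byte) int {
	for i := range a {
		if x := a[i] ^ b[i]; x != 0 {
			return i*8 + bits.LeadingZeros8(x)
		}
	}
	return len(a) * 8
}

// bucketIndex maps a remote peer to a k-bucket of the local peer. Hashing
// the IDs with SHA-256 is an assumption made for this sketch.
func bucketIndex(local, remote string) int {
	return commonPrefixLen(sha256.Sum256([]byte(local)), sha256.Sum256([]byte(remote)))
}

func main() {
	fmt.Println("bucket for peerB as seen from peerA:", bucketIndex("peerA", "peerB"))
}
```

A longer shared prefix means a smaller XOR distance, so peers in higher-numbered buckets are closer to the local peer.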
### 4.1.2 mDNS-routing
mDNS-routing uses mDNS probes to identify whether peers in the local area network have a given key or are simply present.
## 4.2 Swarm
### 4.2.1 Stream Muxer
The stream muxer must implement the interface offered by [abstract-stream-muxer](https://github.com/diasdavid/abstract-stream-muxer).
### 4.2.2 Protocol Muxer
Protocol muxing is handled at the application level instead of the conventional way at the port level (where different services/protocols listen on different ports). This enables several protocols to be muxed over the same socket, saving the cost of doing NAT traversal for more than one port.
Protocol multiplexing is done through [`multistream`](https://github.com/jbenet/multistream), a protocol to negotiate different types of streams (protocols) using [`multicodec`](https://github.com/jbenet/multicodec).
### 4.2.3 Transport
### 4.2.4 Crypto
### 4.2.5 Identify
**Identify** is one of the protocols mounted on top of swarm, our Connection handler. It follows and respects the same pattern as any other protocol when it comes to mounting it on top of swarm. Identify enables us to trade listenAddrs and observedAddrs between peers, which is crucial for the working of IPFS: since every open socket implements REUSEPORT, an observedAddr reported by another peer can enable a third peer to connect to us, as the port will already be open and redirected to us on the NAT.
### 4.2.6 Relay
## 4.3 Distributed Record Store
### 4.3.1 Record
Follows [IPRS](https://github.com/ipfs/specs/tree/master/records)
### 4.3.2 abstract-record-store
### 4.3.3 kad-record-store
### 4.3.4 mDNS-record-store
### 4.3.5 s3-record-store
## 4.4 Discovery
### 4.4.1 mDNS-discovery
mDNS-discovery is a Discovery Protocol that uses mDNS (link to wikipedia) over local area networks. It emits mDNS beacons to find if there are more peers available. Local area network peers are very useful to peer-to-peer protocols, as low latency links are very useful.
mDNS-discovery is a standalone protocol and does not depend on any other libp2p protocol. mDNS-discovery can yield peers available in the local area network, without relying on other infrastructure. This is particularly useful in intranets, networks disconnected from the internet backbone, and networks that temporarily lose links.
mDNS-discovery can be configured per-service (i.e. discover only peers participating in a specific protocol, like IPFS), and with private networks (discover peers belonging to a private network).
We are exploring ways to make mDNS-discovery beacons encrypted (so that other nodes in the local network cannot discern what service is being used), though the nature of mDNS will always reveal local IP addresses.
Privacy Note: mDNS advertises in local area networks, which reveals IP addresses to listeners in the same local network. It is not recommended to use this with privacy-sensitive applications or oblivious routing protocols.
### 4.4.2 random-walk
Random-Walk is a Discovery Protocol for DHTs (and other protocols with routing tables). It makes random DHT queries in order to learn about a large number of peers quickly. This causes the DHT (or other protocol) to converge much faster, at the expense of a small load at the very beginning.
### 4.4.3 bootstrap-list
Bootstrap-List is a Discovery Protocol that uses local storage to cache the addresses of highly stable (and somewhat trusted) peers available in the network. This allows protocols to "find the rest of the network". This is essentially the same way that DNS bootstraps itself (though note that changing the DNS bootstrap list -- the "dot domain" addresses -- is not easy to do, by design).
- The list should be stored in long-term local storage, whatever that means to the local node (e.g. to disk)
- Protocols can ship a default list hardcoded or along with the standard code distribution (like DNS)
- In most cases (and certainly in the case of IPFS) the Bootstrap-List should be user configurable, as users may wish to establish separate networks, or place their reliance and trust in specific nodes.
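The points above can be sketched in Go as a default list that a user-supplied configuration overrides. The function name, the one-address-per-line format with `#` comments, and the placeholder addresses are all assumptions made for illustration; reading the configuration from long-term storage is left out.

```go
package main

import (
	"fmt"
	"strings"
)

// defaultBootstrap is a hypothetical list shipped with the code. The
// addresses are placeholders, not real bootstrap nodes.
var defaultBootstrap = []string{
	"/ip4/192.0.2.1/tcp/4001/ipfs/QmPlaceholderPeerA",
	"/ip4/192.0.2.2/tcp/4001/ipfs/QmPlaceholderPeerB",
}

// bootstrapList returns the peers listed in userConfig (one address per
// line, blank lines and '#' comments ignored), or the defaults if none.
func bootstrapList(userConfig string) []string {
	var peers []string
	for _, line := range strings.Split(userConfig, "\n") {
		line = strings.TrimSpace(line)
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		peers = append(peers, line)
	}
	if len(peers) == 0 {
		return defaultBootstrap
	}
	return peers
}

func main() {
	fmt.Println(bootstrapList(""))
	fmt.Println(bootstrapList("# my private network\n/ip4/10.0.0.1/tcp/4001/ipfs/QmPlaceholderMine\n"))
}
```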

13
5-datastructures.md Normal file

@@ -0,0 +1,13 @@
5 Datastructures
================
The network protocol deals with these datastructures:
- a `PrivateKey`, the private key of a node.
- a `PublicKey`, the public key of a node.
- a `PeerID`, a hash of a node's public key.
- a `Node`[1], has a PeerID, and open connections to other `Nodes`.
- a `Connection`, a point-to-point link between two Nodes (muxes 1 or more streams)
- a `Stream`, a duplex message channel.
[1] currently called `PeerHost` in go-ipfs.
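To illustrate how a `PeerID` relates to a key pair, here is a small Go sketch that derives an ID by hashing a public key. The key type (Ed25519) and the plain hex-encoded SHA-256 digest are simplifying assumptions for this example; section 6 describes the PeerID as being encoded in multihash.

```go
package main

import (
	"crypto/ed25519"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// PeerID is a hash of a node's public key. Plain hex-encoded SHA-256 is a
// simplification made for this sketch.
type PeerID string

// peerIDFromKey derives the PeerID for the given public key.
func peerIDFromKey(pub ed25519.PublicKey) PeerID {
	sum := sha256.Sum256(pub)
	return PeerID(hex.EncodeToString(sum[:]))
}

// newIdentity generates a fresh key pair and the PeerID derived from it.
func newIdentity() (ed25519.PublicKey, ed25519.PrivateKey, PeerID) {
	pub, priv, err := ed25519.GenerateKey(nil) // nil selects crypto/rand
	if err != nil {
		panic(err)
	}
	return pub, priv, peerIDFromKey(pub)
}

func main() {
	_, _, id := newIdentity()
	fmt.Println("PeerID:", id)
}
```

Because the ID is derived from the public key, a peer that proves ownership of the matching private key also proves that the ID is its own.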

139
6-interfaces.md Normal file

@@ -0,0 +1,139 @@
6 Interfaces
============
## 6.1 libp2p
## 6.2 Peer Routing
## 6.3 Swarm
~~The network is abstracted through the swarm which presents a simplified interface for the remaining layers to have access to the network. This interface should look like:~~
- `sw.addTransport(transport, [options, dialOptions, listenOptions])` - Add a transport to be supported by this swarm instance. Swarm expects it to implement the [abstract-transport](https://github.com/diasdavid/abstract-transport) interface.
- `sw.addUpgrade(connUpgrade, [options])` - A connection upgrade must be able to receive and return something that implements the [abstract-connection](https://github.com/diasdavid/abstract-connection) interface.
- `sw.addStreamMuxer(streamMuxer, [options])` - Upgrading a connection to use a stream muxer is still considered an upgrade, but a special case since once this connection is applied, the returned obj will implement the abstract-stream-muxer interface.
- `sw.dial(PeerInfo, options, protocol, callback)` - PeerInfo should contain the ID of the peer and its respective multiaddrs known.
- `sw.handleProtocol(protocol, handlerFunction)` - enable a protocol to be registered, so that another peer can open a stream to talk with us to that specific protocol
The following figure represents how the network level pieces, are tied together:
```
┌ ─ ─ ─ ─ ┌ ─ ─ ─ ─ ┌ ─ ─ ─ ─ ┌───────────┐
mounted │ mounted │ mounted ││Identify │
│protocol │protocol │protocol │(mounted │
1 │ 2 │ ... ││ protocol) │
└ ─ ─ ─ ─ └ ─ ─ ─ ─ └ ─ ─ ─ ─ └───────────┘
┌─────────────────────────────────────────┐
│ swarm │
└─────────────────────────────────────────┘
┌─────────────────────────────────────────┐
│ connection │
└─────────────────────────────────────────┘
┌───────────────┐┌───────────┐┌───────────┐
│Transport ││multistream││ stream │
│(TCP, UDP, etc)││ ││ muxer │
└───────────────┘└───────────┘│┌ ─ ─ ─ ─ ┐│
│ spdy │
│└ ─ ─ ─ ─ ┘│
│┌ ─ ─ ─ ─ ┐│
│ multiplex │
│└ ─ ─ ─ ─ ┘│
│┌ ─ ─ ─ ─ ┐│
│ QUIC │
│└ ─ ─ ─ ─ ┘│
│┌ ─ ─ ─ ─ ┐│
│ others │
│└ ─ ─ ─ ─ ┘│
└───────────┘
```
## 6.4 Distributed Record Store
----------------------------
OLD
The network protocol's interface has two parts:
1. the _client interface_, for clients (e.g. higher layers of IPFS)
2. the _service interface_, for remote peers (e.g. other IPFS nodes)
### 4.1 Client Interface
The **Client Interface** is exposed to the higher layers of IPFS. It is the entry point for other parts to open + handle streams.
This type system represents the interface exposed to clients. Actual implementations will likely be more complicated, but they should aim to cover this.
```go
type PrivateKey interface {
	PublicKey() PublicKey
	Sign(data []byte) Signature
	Decrypt(ciphertext []byte) (plaintext []byte)
}

type PublicKey interface {
	PeerID() PeerID
	Verify(Signature) (ok bool)
	Encrypt(plaintext []byte) (ciphertext []byte)
}

// PeerID is a hash of a PublicKey, encoded in multihash
// It represents the identity of a node.
type PeerID Multihash

// Node is a peer in the network. It is both a client and server.
// Users may open streams to remote peers, or set handlers for protocols.
type Node interface {
	// ID returns the PeerID of this Node
	ID() PeerID

	// NewStream creates a new stream to given peerID.
	// It may have to establish a new connection to given peer.
	// (This includes finding the addresses of a peer, and NAT Traversal.)
	NewStream(Protocol, PeerID) (Stream, error)

	// SetStreamHandler sets a callback for remote-opened streams for a protocol
	// Thus clients register "protocol handlers", much like URL route handlers
	SetStreamHandler(Protocol, StreamHandler)

	// Raw connections are not exported to the user, only streams.
}

type StreamHandler func(Stream)
```
TODO: incorporate unreliable message / packet streams.
### 4.2 Protocol Interface
The network protocol consists of:
- Any secure, reliable, stream transport:
- a reliable transport protocol (TCP, QUIC, SCTP, UDT, UTP, ...)
- a secure PKI based transport protocol (SSH, TLS, ...)
- a stream transport (with flow control, etc) (HTTP2, SSH, QUIC)
- Protocol stream framing, to multiplex services
- Auxiliary protocols for connectivity:
- Identify - exchange node information
- NAT - NAT Traversal (ICE)
- Relay - for when NAT Traversal fails
Both the transport and stream muxer are pluggable. Unless
constraints dictate otherwise, implementations SHOULD implement TCP and HTTP/2
for interoperability. These are the default
- any reliable transport protocol
- a secure channel encryption
- a stream multiplexor with flow control (e.g. HTTP/2, SPDY, QUIC, SSH)
- every stream protocol header
(TODO: unreliable transport)

192
7-properties.md Normal file

@@ -0,0 +1,192 @@
7 Properties
============
## 7.1 Communication Model - Streams
The Network layer handles all the problems of connecting to a peer, and exposes
simple bidirectional streams. Users can both open a new stream
(`NewStream()`) and register a stream handler (`SetStreamHandler`). The user
is then free to implement whatever wire messaging protocol she desires. This
makes it easy to build peer-to-peer protocols, as the complexities of
connectivity, multi-transport support, flow control, and so on, are handled.
To help capture the model, consider that:
- `NewStream` is similar to making a Request in an HTTP client.
- `SetStreamHandler` is similar to registering a URL handler in an HTTP server
So a protocol, such as a DHT, could:
```go
node := p2p.NewNode(peerid)

// register a handler, here it is simply echoing everything.
node.SetStreamHandler("/helloworld", func(s Stream) {
	io.Copy(s, s)
})

// make a request.
buf1 := []byte("Hello World!")
buf2 := make([]byte, len(buf1))
stream, err := node.NewStream("/helloworld", peerid) // open a new stream
if err != nil {
	log.Fatal(err) // could not reach the peer or open the stream
}
stream.Write(buf1)        // write to the remote
io.ReadFull(stream, buf2) // read back exactly len(buf1) bytes
fmt.Println(string(buf2)) // print what was sent back
```
## 7.2 Ports - Constrained Entrypoints
In the internet of 2015, we have a processing model where a program may be
running without the ability to open multiple -- or even single -- network
ports. Most hosts are behind NAT, whether of household ISP variety or new
containerized data-center type. And some programs may even be running in
browsers, with no ability to open sockets directly (sort of). This presents
challenges to completely peer-to-peer networks that aspire to connect _any_
hosts together -- whether they're running on a page in the browser, or in
a container within a container.
IPFS only needs a single channel of communication with the rest of the
network. This may be a single TCP or UDP port, or a single connection
through Websockets or WebRTC. In a sense, the role of the TCP/UDP network
stack -- i.e. multiplexing applications and connections -- may now be forced
to happen at the application level.
## 7.3 Transport Protocols
IPFS is transport agnostic. It can run on any transport protocol. The
`ipfs-addr` format (which is an ipfs-specific
[multiaddr](https://github.com/jbenet/multiaddr)) describes the transport.
For example:
```sh
# ipv4 + tcp
/ip4/10.1.10.10/tcp/29087/ipfs/QmVcSqVEsvm5RR9mBLjwpb2XjFVn5bPdPL69mL8PH45pPC
# ipv6 + tcp
/ip6/2601:9:4f82:5fff:aefd:ecff:fe0b:7cfe/tcp/1031/ipfs/QmRzjtZsTqL1bMdoJDwsC6ZnDX1PW1vTiav1xewHYAPJNT
# ipv4 + udp + udt
/ip4/104.131.131.82/udp/4001/udt/ipfs/QmaCpDMGvV2BGHeYERUEnRQAwe3N8SzbUtfsmvsqQLuvuJ
# ipv4 + udp + utp
/ip4/104.131.67.168/udp/1038/utp/ipfs/QmU184wLPg7afQjBjwUUFkeJ98Fp81GhHGurWvMqwvWEQN
```
IPFS delegates the transport dialing to a multiaddr-based network pkg, such
as [go-multiaddr-net](https://github.com/jbenet/go-multiaddr-net). It is
advisable to build modules like this in other languages, and scope the
implementation of other transport protocols.
Some of the transport protocols we will be using:
- UTP
- UDT
- SCTP
- WebRTC (SCTP, etc)
- Websockets
- TCP Remy
## 7.4 Non-IP Networks
Efforts like [NDN](http://named-data.net) and
[XIA](http://www.cs.cmu.edu/~xia/) are new architectures for the internet,
which are closer to the model IPFS uses than what IP provides today. IPFS
will be able to operate on top of these architectures trivially, as there
are no assumptions made about the network stack in the protocol. Implementations
will likely need to change, but changing implementations is vastly easier than
changing protocols.
## 7.5 On the wire
We have the **hard constraint** of making IPFS work across _any_ duplex stream (an outgoing and an incoming stream pair, any arbitrary connection) and work on _any_ platform.
To make this work, IPFS has to solve a few problems:
- [Protocol Multiplexing](#protocol-multiplexing) - running multiple protocols over the same stream
- [multistream](#multistream) - self-describing protocol streams
- [multistream-select](#multistream-select) - a self-describing protocol selector
- [Stream Multiplexing](#stream-multiplexing) - running many independent streams over the same wire.
- [Portable Encodings](#portable-encodings) - using portable serialization formats
- [Secure Communications](#secure-communication) - using ciphersuites to establish security and privacy (like TLS).
### 7.5.1 Protocol-Multiplexing
Protocol Multiplexing means running multiple different protocols over the same stream. This could happen sequentially (one after the other), or concurrently (at the same time, with their messages interleaved). We achieve protocol multiplexing using three pieces:
- [multistream](#multistream) - self-describing protocol streams
- [multistream-select](#multistream-select) - a self-describing protocol selector
- [Stream Multiplexing](#stream-multiplexing) - running many independent streams over the same wire.
### 7.5.2 multistream - self-describing protocol stream
[multistream](https://github.com/jbenet/multistream) is a self-describing protocol stream format. It is extremely simple. Its goal is to define a way to add headers to protocols that describe the protocol itself. It is similar to adding a version to a protocol, but far more explicit.
For example:
```
/ipfs/QmVXZiejj3sXEmxuQxF2RjmFbEiE9w7T82xDn3uYNuhbFb/ipfs-dht/0.2.3
<dht-message>
<dht-message>
...
```
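The header idea can be sketched in a few lines. This is a minimal illustration, not the multistream wire format as defined by this document: the varint length prefix and the trailing newline are assumptions made for the example, and `encode_header` / `read_header` are hypothetical helper names.

```python
import io


def encode_varint(n: int) -> bytes:
    """Encode a non-negative integer as an unsigned LEB128 varint."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def read_varint(stream) -> int:
    """Read an unsigned LEB128 varint from a binary stream."""
    shift = 0
    result = 0
    while True:
        raw = stream.read(1)
        if not raw:
            raise EOFError("stream ended inside a varint")
        result |= (raw[0] & 0x7F) << shift
        if not raw[0] & 0x80:
            return result
        shift += 7


def encode_header(path: str) -> bytes:
    """Frame a protocol path as <varint length><path>\\n (assumed framing)."""
    line = path.encode("utf-8") + b"\n"
    return encode_varint(len(line)) + line


def read_header(stream) -> str:
    """Read one header back; the stream is left positioned at the payload."""
    length = read_varint(stream)
    line = stream.read(length)
    if len(line) != length or not line.endswith(b"\n"):
        raise ValueError("malformed header")
    return line[:-1].decode("utf-8")
```

After `read_header` returns, the remaining bytes on the stream belong to the protocol named by the header, so the caller can hand the stream to whichever handler understands that path.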
### 7.5.3 multistream-selector - self-describing protocol stream selector
[multistream-select](https://github.com/jbenet/multistream/tree/master/multistream-select) is a simple [multistream](https://github.com/jbenet/multistream) protocol that allows listing and selecting other protocols. This means that Protomux has a list of registered protocols, listens for one, and then _nests_ (or upgrades) the connection to speak the registered protocol. This takes direct advantage of multistream: it enables interleaving multiple protocols, as well as inspecting what protocols might be spoken by the remote endpoint.
For example:
```
/ipfs/QmdRKVhvzyATs3L6dosSb6w8hKuqfZK2SyPVqcYJ5VLYa2/multistream-select/0.3.0
/ipfs/QmVXZiejj3sXEmxuQxF2RjmFbEiE9w7T82xDn3uYNuhbFb/ipfs-dht/0.2.3
<dht-message>
<dht-message>
...
```
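The listing and selecting behaviour described above could look roughly like the following sketch. It assumes that a listener accepts a proposal by echoing the path back and refuses with the token `na`; that convention, the `ProtocolSelector` class and its method names are illustrative assumptions, not part of this section.

```python
from typing import Callable, Dict, Iterable, List, Optional, Tuple

Handler = Callable[[bytes], bytes]

NOT_AVAILABLE = "na"  # assumed refusal token, not defined in this section


class ProtocolSelector:
    """Keeps a list of registered protocols and picks one per connection."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Handler] = {}

    def register(self, path: str, handler: Handler) -> None:
        self._handlers[path] = handler

    def list_protocols(self) -> List[str]:
        """Lets the remote endpoint inspect what may be spoken here."""
        return sorted(self._handlers)

    def respond(self, proposed: str) -> str:
        """Echo the path to accept it, or answer with the refusal token."""
        return proposed if proposed in self._handlers else NOT_AVAILABLE

    def negotiate(self, proposals: Iterable[str]) -> Optional[Tuple[str, Handler]]:
        """Walk the dialer's proposals in order and return the first match."""
        for path in proposals:
            if self.respond(path) == path:
                return path, self._handlers[path]
        return None
```

Once `negotiate` returns a handler, the connection is treated as "upgraded": every following byte is interpreted by the selected protocol.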
### 7.5.4 Stream Multiplexing
Stream Multiplexing is the process of multiplexing (or combining) many different streams into a single one. This is a complicated subject because it enables protocols to run concurrently over the same wire, and all sorts of notions regarding fairness, flow control, head-of-line blocking, etc. start affecting the protocols. In practice, stream multiplexing is well understood and there are many stream multiplexing protocols. To name a few:
- HTTP/2
- SPDY
- QUIC
- SSH
IPFS nodes are free to support whatever stream multiplexors they wish, on top of the default one. The default one is there to enable even the simplest of nodes to speak multiple protocols at once. The default multiplexor will be HTTP/2 (or maybe QUIC?), but implementations for it are sparse, so we are beginning with SPDY. We simply select which protocol to use with a multistream header.
For example:
```
/ipfs/QmdRKVhvzyATs3L6dosSb6w8hKuqfZK2SyPVqcYJ5VLYa2/multistream-select/0.3.0
/ipfs/Qmb4d8ZLuqnnVptqTxwqt3aFqgPYruAbfeksvRV1Ds8Gri/spdy/3
<spdy-header-opening-a-stream-0>
/ipfs/QmVXZiejj3sXEmxuQxF2RjmFbEiE9w7T82xDn3uYNuhbFb/ipfs-dht/0.2.3
<dht-message>
<dht-message>
<spdy-header-opening-a-stream-1>
/ipfs/QmVXZiejj3sXEmxuQxF2RjmFbEiE9w7T82xDn3uYNuhbFb/ipfs-bitswap/0.3.0
<bitswap-message>
<bitswap-message>
<spdy-header-selecting-stream-0>
<dht-message>
<dht-message>
<dht-message>
<dht-message>
<spdy-header-selecting-stream-1>
<bitswap-message>
<bitswap-message>
<bitswap-message>
<bitswap-message>
...
```
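The interleaving shown above can be imitated with a toy framing scheme. The sketch below is not SPDY or any real multiplexer: the frame layout (a 4-byte stream id followed by a 2-byte payload length) and the `mux` / `demux` names are assumptions chosen only to show how messages from several streams share one wire and are separated again.

```python
import struct
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Assumed toy frame header: stream id (uint32) + payload length (uint16),
# both big-endian.
FRAME_HEADER = struct.Struct(">IH")


def mux(frames: Iterable[Tuple[int, bytes]]) -> bytes:
    """Serialize (stream_id, payload) pairs onto a single byte string."""
    wire = bytearray()
    for stream_id, payload in frames:
        wire += FRAME_HEADER.pack(stream_id, len(payload)) + payload
    return bytes(wire)


def demux(wire: bytes) -> Dict[int, List[bytes]]:
    """Split the wire back into per-stream message lists, preserving order."""
    streams: Dict[int, List[bytes]] = defaultdict(list)
    offset = 0
    while offset < len(wire):
        stream_id, length = FRAME_HEADER.unpack_from(wire, offset)
        offset += FRAME_HEADER.size
        payload = wire[offset:offset + length]
        if len(payload) != length:
            raise ValueError("truncated frame")
        streams[stream_id].append(payload)
        offset += length
    return dict(streams)
```

A real multiplexer also has to deal with the fairness, flow-control and head-of-line blocking concerns mentioned earlier, which this sketch ignores entirely.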
### 7.5.5 Portable Encodings
In order to be ubiquitous, we _must_ use hyper-portable format encodings, those that are easy to use in various other platforms. Ideally these encodings are well-tested in the wild, and widely used. There may be cases where multiple encodings have to be supported (and hence we may need a [multicodec](https://github.com/jbenet/multicodec) self-describing encoding), but this has so far not been needed.
For now, we use [protobuf](https://github.com/google/protobuf) for all protocol messages exclusively, but other good candidates are [capnp](https://capnproto.org), [bson](http://bsonspec.org/), [ubjson](http://ubjson.org/).
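To illustrate what a self-describing encoding buys, the sketch below prefixes each message with the name of the codec that produced it, so a receiver can pick the right decoder. It is not the multicodec format: the `/json` header path, the newline separator and the `CODECS` table are hypothetical, and JSON from the standard library stands in for protobuf only to keep the example dependency-free.

```python
import json
from typing import Any, Callable, Dict, Tuple

Codec = Tuple[Callable[[Any], bytes], Callable[[bytes], Any]]

# Hypothetical codec table keyed by a header path.
CODECS: Dict[str, Codec] = {
    "/json": (
        lambda obj: json.dumps(obj, sort_keys=True).encode("utf-8"),
        lambda raw: json.loads(raw.decode("utf-8")),
    ),
}


def encode(codec: str, obj: Any) -> bytes:
    """Serialize obj and prefix it with the codec path and a newline."""
    serialize, _ = CODECS[codec]
    return codec.encode("utf-8") + b"\n" + serialize(obj)


def decode(message: bytes) -> Tuple[str, Any]:
    """Read the codec path, then decode the body with the matching codec."""
    header, _, body = message.partition(b"\n")
    codec = header.decode("utf-8")
    if codec not in CODECS:
        raise ValueError("unsupported codec: " + codec)
    _, deserialize = CODECS[codec]
    return codec, deserialize(body)
```

Supporting another encoding then only means adding one entry to the table; senders and receivers do not need to agree on it out of band.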
### 7.5.6 Secure Communications
The wire protocol is -- of course -- wrapped with encryption. We use ciphersuites similar to TLS. This is explained further in the [network spec](./#encryption).

40
8-implementations.md Normal file
View File

@@ -0,0 +1,40 @@
8 Implementations
=================
This is a list of known libp2p module implementations. They are components that respect the interfaces and expectations defined in the "Interfaces" chapter, and can be composed to make a working libp2p library.
## 8.1 libp2p
- https://github.com/diasdavid/node-libp2p
- https://github.com/diasdavid/node-ipfs-logger
## 8.2 Peer Discovery
- https://github.com/diasdavid/node-ipfs-mdns
- https://github.com/diasdavid/node-ipfs-railing
- https://github.com/diasdavid/node-ipfs-random-walk
## 8.3 Peer Routing
- https://github.com/diasdavid/node-ipfs-kad-router
## 8.4 Swarm
- https://github.com/diasdavid/node-ipfs-swarm
- https://github.com/diasdavid/node-ipfs-ping
- https://github.com/diasdavid/node-spdy-stream-muxer
- https://github.com/diasdavid/abstract-stream-muxer
## 8.5 Record Store
- https://github.com/diasdavid/node-ipfs-record
- https://github.com/diasdavid/node-ipfs-record-store
## 8.6 Data Structures
- https://github.com/diasdavid/node-ipfs-peer-id
- https://github.com/diasdavid/node-ipfs-peer
## 8.7 Implementations of Specs libp2p depends on
- https://github.com/diasdavid/node-multistream

1
9-references.md Normal file
View File

@@ -0,0 +1 @@
- State of Peer-to-Peer (P2P) Communication across Network Address Translators (NATs) https://tools.ietf.org/html/rfc5128

68
README.md Normal file
View File

@@ -0,0 +1,68 @@
RFC - libp2p
============
![](https://raw.githubusercontent.com/diasdavid/specs/libp2p-spec/protocol/network/figs/logo.png)
Authors:
- [Juan Benet](https://github.com/jbenet)
- [David Dias](https://github.com/diasdavid)
Reviewers:
> tl;dr; This document presents libp2p, a modularized and extensible network stack to overcome the networking challenges faced when building Peer-to-Peer applications. libp2p is used by IPFS as its networking library.
* * *
# Abstract
This describes the IPFS network protocol. The network layer provides point-to-point transports (reliable and unreliable) between any two IPFS nodes in the network.
This document defines the spec implemented in libp2p.
# Status of this spec
> **This spec is a Work In Progress (WIP).**
# Organization of this document
This RFC is organized into the chapters listed in the `Table of Contents` section. Each chapter can be found in its own file.
# Table of Contents
- [1 Introduction](/protocol/network/1-introduction.md)
- [1.1 Motivation](/protocol/network/1-introduction.md#11-motivation)
- [1.2 Goals](/protocol/network/1-introduction.md#12-goals)
- [2 Overview of current Network Stack](/protocol/network/2-state-of-the-art.md)
- [2.1 Client Server model](/protocol/network/2-state-of-the-art.md#21-the-client-server-model)
- [2.2 Categorizing the Network Stack protocols by solutions](/protocol/network/2-state-of-the-art.md#22-categorizing-the-network-stack-protocols-by-solutions)
- [2.3 Current Shortcomings](/protocol/network/2-state-of-the-art.md#23-current-shortcommings)
- [3 Requirements](/protocol/network/3-requirements.md)
- [3.1 NAT traversal](/protocol/network/3-requirements.md#31-nat-traversal)
- [3.2 Relay](/protocol/network/3-requirements.md#32-relay)
- [3.3 Encryption](/protocol/network/3-requirements.md#33-encryption)
- [3.4 Transport Agnostic](/protocol/network/3-requirements.md#34-transport-agnostic)
- [3.5 Multi-Multiplexing](/protocol/network/3-requirements.md#35-multi-multiplexing)
- [4 Architecture](/protocol/network/4-architecture.md)
- [4.1 Peer Routing](/protocol/network/4-architecture.md#41-peer-routing)
- [4.2 Swarm](/protocol/network/4-architecture.md#42-swarm)
- [4.3 Distributed Record Store](/protocol/network/4-architecture.md#43-distributed-record-store)
- [5 Datastructures](/protocol/network/5-datastructures.md)
- [6 Interfaces](/protocol/network/6-interfaces.md)
- [6.1 libp2p](/protocol/network/6-interfaces.md#61-libp2p)
- [6.2 Peer Routing](/protocol/network/6-interfaces.md#62-peer-routing)
- [6.3 Swarm](/protocol/network/6-interfaces.md#63-swarm)
- [6.4 Distributed Record Store](/protocol/network/6-interfaces.md#64-distributed-record-store)
- [7 Properties](/protocol/network/7-properties.md)
- [7.1 Communication Model - Streams](/protocol/network/7-properties.md#71-communication-model---streams)
- [7.2 Ports - Constrained Entrypoints](/protocol/network/7-properties.md#72-ports---constrained-entrypoints)
- [7.3 Transport Protocols](/protocol/network/7-properties.md#73-transport-protocols)
- [7.4 Non-IP Networks](/protocol/network/7-properties.md#74-non-ip-networks)
- [7.5 On the wire](/protocol/network/7-properties.md#75-on-the-wire)
- [7.5.1 Protocol-Multiplexing](/protocol/network/7-properties.md#751-protocol-multiplexing)
- [7.5.2 multistream - self-describing protocol stream](/protocol/network/7-properties.md#752-multistream---self-describing-protocol-stream)
- [7.5.3 multistream-selector - self-describing protocol stream selector](/protocol/network/7-properties.md#753-multistream-selector---self-describing-protocol-stream-selector)
- [7.5.4 Stream Multiplexing](/protocol/network/7-properties.md#754-stream-multiplexing)
- [7.5.5 Portable Encodings](/protocol/network/7-properties.md#755-portable-encodings)
- [8 Implementations](/protocol/network/8-implementations.md)
- [9 References](/protocol/network/9-references.md)

BIN
figs/architecture-1.monopic Normal file

Binary file not shown.

6
figs/architecture-1.txt Normal file
View File

@@ -0,0 +1,6 @@
┌─────────────────────────────────────────────────────────────────────────────────┐
│                                     libp2p                                      │
└─────────────────────────────────────────────────────────────────────────────────┘
┌─────────────────┐┌─────────────────┐┌──────────────────────────┐┌───────────────┐
│  Peer Routing   ││      Swarm      ││ Distributed Record Store ││   Discovery   │
└─────────────────┘└─────────────────┘└──────────────────────────┘└───────────────┘

BIN
figs/logo.png Normal file

Binary file not shown.


BIN
figs/logo.sketch Normal file

Binary file not shown.

13
figs/logo.svg Normal file
View File

@@ -0,0 +1,13 @@
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<svg width="483px" height="204px" viewBox="0 0 483 204" version="1.1" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:sketch="http://www.bohemiancoding.com/sketch/ns">
<!-- Generator: Sketch 3.3.3 (12072) - http://www.bohemiancoding.com/sketch -->
<title>logo</title>
<desc>Created with Sketch.</desc>
<defs></defs>
<g id="Page-1" stroke="none" stroke-width="1" fill="none" fill-rule="evenodd" sketch:type="MSPage">
<text id="libp2p" sketch:type="MSTextLayer" font-family="Arial Black" font-size="144" font-weight="686">
<tspan x="0" y="158" fill="#62D1D4">lib</tspan>
<tspan x="191.953125" y="158" fill="#001C57">p2p</tspan>
</text>
</g>
</svg>


BIN
figs/overview.monopic Normal file

Binary file not shown.

27
figs/overview.txt Normal file
View File

@@ -0,0 +1,27 @@
┌ ─ ─ ─ ─ ┌ ─ ─ ─ ─ ┌ ─ ─ ─ ─ ┌───────────┐
 mounted │ mounted │ mounted ││Identify   │
│protocol │protocol │protocol │(mounted   │
    1    │    2    │   ...   ││ protocol) │
└ ─ ─ ─ ─ └ ─ ─ ─ ─ └ ─ ─ ─ ─ └───────────┘
┌─────────────────────────────────────────┐
│                  swarm                  │
└─────────────────────────────────────────┘
┌─────────────────────────────────────────┐
│               connection                │
└─────────────────────────────────────────┘
┌───────────────┐┌───────────┐┌───────────┐
│Transport      ││multistream││  stream   │
│(TCP, UDP, etc)││           ││   muxer   │
└───────────────┘└───────────┘│┌ ─ ─ ─ ─ ┐│
                              │   spdy    │
                              │└ ─ ─ ─ ─ ┘│
                              │┌ ─ ─ ─ ─ ┐│
                              │ multiplex │
                              │└ ─ ─ ─ ─ ┘│
                              │┌ ─ ─ ─ ─ ┐│
                              │   QUIC    │
                              │└ ─ ─ ─ ─ ┘│
                              │┌ ─ ─ ─ ─ ┐│
                              │  others   │
                              │└ ─ ─ ─ ─ ┘│
                              └───────────┘

BIN
figs/peer-routing.monopic Normal file

Binary file not shown.

9
figs/peer-routing.txt Normal file
View File

@@ -0,0 +1,9 @@
┌──────────────────────────────────────────────────────────────┐
│                         Peer Routing                         │
│                                                              │
│┌──────────────┐┌────────────────┐┌──────────────────────────┐│
││ kad-routing  ││  mDNS-routing  ││ other-routing-mechanisms ││
││              ││                ││                          ││
││              ││                ││                          ││
│└──────────────┘└────────────────┘└──────────────────────────┘│
└──────────────────────────────────────────────────────────────┘