"Minor edits with grammar, clarity and links" (#91)

This commit is contained in:
James Ray
2018-08-23 17:39:24 +10:00
committed by vyzo
parent 7a78aa642d
commit 75eeb8f177

@@ -12,27 +12,30 @@ specialized routers, which may add protocol messages and gossip in
order to provide behaviour optimized for specific application
profiles.
<!-- toc -->
<!-- START doctoc generated TOC please keep comment here to allow auto update -->
<!-- DON'T EDIT THIS SECTION, INSTEAD RE-RUN doctoc TO UPDATE -->
**Contents**
- [Implementation status](#implementation-status)
- [In the beginning was floodsub](#in-the-beginning-was-floodsub)
* [Ambient Peer Discovery](#ambient-peer-discovery)
* [Flood routing](#flood-routing)
* [Retrospective](#retrospective)
- [Controlling the flood](#controlling-the-flood)
* [randomsub: A random message router](#randomsub-a-random-message-router)
* [meshsub: An overlay mesh router](#meshsub-an-overlay-mesh-router)
* [gossipsub: The gossiping mesh router](#gossipsub-the-gossiping-mesh-router)
- [The gossipsub protocol](#the-gossipsub-protocol)
* [Control messages](#control-messages)
* [Router state](#router-state)
* [Topic membership](#topic-membership)
* [Message processing](#message-processing)
* [Heartbeat](#heartbeat)
* [Control message piggybacking](#control-message-piggybacking)
* [Protobuf](#protobuf)
- [gossipsub: An extensible baseline pubsub protocol](#gossipsub-an-extensible-baseline-pubsub-protocol)
- [Implementation status](#implementation-status)
- [In the beginning was floodsub](#in-the-beginning-was-floodsub)
- [Ambient Peer Discovery](#ambient-peer-discovery)
- [Flood routing](#flood-routing)
- [Retrospective](#retrospective)
- [Controlling the flood](#controlling-the-flood)
- [randomsub: A random message router](#randomsub-a-random-message-router)
- [meshsub: An overlay mesh router](#meshsub-an-overlay-mesh-router)
- [gossipsub: The gossiping mesh router](#gossipsub-the-gossiping-mesh-router)
- [The gossipsub protocol](#the-gossipsub-protocol)
- [Control messages](#control-messages)
- [Router state](#router-state)
- [Topic membership](#topic-membership)
- [Message processing](#message-processing)
- [Heartbeat](#heartbeat)
- [Control message piggybacking](#control-message-piggybacking)
- [Protobuf](#protobuf)
<!-- tocstop -->
<!-- END doctoc generated TOC please keep comment here to allow auto update -->
## Implementation status
@@ -46,8 +49,8 @@ profiles.
The initial pubsub experiment in libp2p was `floodsub`.
It implements pubsub in the most basic manner, with two defining aspects:
- ambient peer discovery.
- most basic routing; flooding.
- ambient peer discovery; and
- most basic routing: flooding.
### Ambient Peer Discovery
@@ -55,8 +58,8 @@ With ambient peer discovery, the function is pushed outside the scope
of the protocol. Instead, it relies on ambient connection events to
perform peer discovery via protocol identification. Whenever a new
peer is connected, the protocol checks to see if the peer implements
floodsub, and if so it sends a hello packet announcing the topics it
is currently subscribing.
floodsub, and if so it sends a hello packet that announces the topics
that it is currently subscribing to.
This allows the peer to maintain soft overlays for all topics of
interest. The overlay is maintained by exchanging subscription
@@ -94,12 +97,15 @@ highly desirable properties:
The problem, however, is that messages don't just follow the minimum
latency paths; they follow all edges, thus creating a flood. The
outbound degree of the network is unbounded. This creates a problem
for individual densely connected nodes, as they may have a large
number of connected peers and cannot afford the bandwidth to forward
all these pubsub messages. Similary, the amplification factor is only
bounded by the sum of degrees of all nodes in the overlay, which
creates a scaling problem for densely connected overlays at large.
outbound degree of the network is unbounded, whereas we want it to be
bounded in order to reduce bandwidth requirements and increase
decentralization and scalability. In other words, this unbounded
outbound degree creates a problem for individual densely connected
nodes, as they may have a large number of connected peers and cannot
afford the bandwidth to forward all these pubsub messages. Similarly,
the amplification factor is only bounded by the sum of degrees of all
nodes in the overlay, which creates a scaling problem for densely
connected overlays at large.
## Controlling the flood
@@ -111,10 +117,10 @@ globally controls the amplification factor.
### randomsub: A random message router
Let's first consider the simplest bounded floodsub variant, which we
call `randomsub`. In this construction, the router is still stateless
call `randomsub`. In this construction, the router is still stateless,
apart from a list of known peers in the topic. But instead of
forwarding messages to all peers, it forwards to a random subset up to
`D` peers, where `D` is the desired degree of the network.
forwarding messages to all peers, it forwards to a random subset of up
to `D` peers, where `D` is the desired degree of the network.
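As a rough illustration of the selection step described above, the following sketch picks a random subset of up to `D` peers for forwarding. The function name `randomsub_targets` and the exclusion of the message source are assumptions made for the example; they are not taken from any implementation.

```python
import random

def randomsub_targets(topic_peers, source, d, rng=random):
    """Pick up to `d` random peers to forward to.

    Excluding `source` is an assumption of this sketch, so that the
    message is not echoed back to the peer it came from.
    """
    candidates = [p for p in topic_peers if p != source]
    # If fewer than `d` candidates are known, forward to all of them.
    return rng.sample(candidates, min(d, len(candidates)))

peers = ["A", "B", "C", "D", "E", "F", "G", "H"]
targets = randomsub_targets(peers, source="A", d=3)
```

Because the subset is drawn afresh for every message, two consecutive messages will usually travel along different routes.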
The problem with this construction is that the message propagation
patterns are non-deterministic. This results in extreme message route
@@ -158,7 +164,7 @@ admissible mesh degree bounds.
### gossipsub: The gossiping mesh router
The meshsub router offsers a baseline construction with good amplification
The meshsub router offers a baseline construction with good amplification
control properties, which we augment with _gossip_ about message flow.
The gossip is emitted to random subsets of peers not in the mesh, similar
to randomsub, and it allows us to propagate _metadata_ about message flow
@@ -171,7 +177,7 @@ The router can use this metadata to improve the mesh, for instance an
[episub](episub.md) router built on top of gossipsub can create
epidemic broadcast trees. Beyond that, the metadata can restart
message transmission at different points in the overlay to rectify
downstream message loss. Or it can simply jump hops oppurtunistically
downstream message loss. Or it can simply jump hops opportunistically
and accelerate message transmission for peers who are at some distance
in the mesh.
@@ -191,33 +197,33 @@ towards them.
### Control messages
The protocol defines four control messages:
- `GRAFT`: graft a mesh link; notifies the peer that it has been added to the local mesh view.
- `PRUNE`: prune a mesh link; notifies the peer that it has been removed from the local mesh view.
- `IHAVE`: gossip; notifies the peer that the following messages were recently seen and are available on request.
- `IWANT`: requests the transmission of messages announced in an `IHAVE` message.
- `GRAFT`: graft a mesh link; this notifies the peer that it has been added to the local mesh view.
- `PRUNE`: prune a mesh link; this notifies the peer that it has been removed from the local mesh view.
- `IHAVE`: gossip; this notifies the peer that the following messages were recently seen and are available on request.
- `IWANT`: request transmission of messages announced in an `IHAVE` message.
### Router state
The router maintains the following state:
- `peers`: a set of all known peers; `peers.gossipsub` denotes the gossipsub peers
while `peers.floodsub` denotes the floodsub peers.
- `mesh`: the overlay meshes as a map of topic to lists of peers.
- `mesh`: the overlay meshes as a map of topics to lists of peers.
- `fanout`: the mesh peers to which we are publishing without topic membership,
as a map of topic to list of peers.
- `seen`: this is the timed message id cache, which tracks seen messages.
as a map of topics to lists of peers.
- `seen`: this is the timed message ID cache, which tracks seen messages.
- `mcache`: a message cache that contains the messages for the last few
heartbeat ticks.
The message cache is a data structure that stores windows of message ids
The message cache is a data structure that stores windows of message IDs
and the corresponding messages. It supports the following operations:
- `mcache.put(m)`: adds a message to the current window and the cache.
- `mcache.get(id)`: retrieves a message from the cache by its id, if it is still present.
- `mcache.window()`: retrieves the message id for messages in the current history window.
- `mcache.get(id)`: retrieves a message from the cache by its ID, if it is still present.
- `mcache.window()`: retrieves the message IDs for messages in the current history window.
- `mcache.shift()`: shifts the current window, discarding messages older than the
history length of the cache.
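The four operations above could be sketched roughly as follows. This is a minimal illustration, not the Go implementation: the class name, the explicit `msg_id` argument to `put`, and the reading of `window()` as "all IDs in the retained windows, newest first" are assumptions.

```python
from collections import deque

class MessageCache:
    """Hypothetical windowed message cache; names are illustrative only."""

    def __init__(self, history_length):
        self.history_length = history_length
        # windows[0] is the current window; older windows follow it.
        self.windows = deque([[]])
        self.messages = {}

    def put(self, msg_id, message):
        # Add the message to the cache and its ID to the current window.
        self.messages[msg_id] = message
        self.windows[0].append(msg_id)

    def get(self, msg_id):
        # Returns None once the message has been shifted out.
        return self.messages.get(msg_id)

    def window(self):
        # IDs of all messages still retained, newest window first.
        return [mid for w in self.windows for mid in w]

    def shift(self):
        # Open a new current window and drop windows beyond the history length.
        self.windows.appendleft([])
        while len(self.windows) > self.history_length:
            for mid in self.windows.pop():
                self.messages.pop(mid, None)

cache = MessageCache(history_length=2)
cache.put("m1", b"first")
cache.shift()
cache.put("m2", b"second")
```

Calling `shift()` once per heartbeat tick would give the "last few heartbeat ticks" retention described for `mcache`.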
The timed message id cache is the flow control mechanism. It tracks
the message ids of seen message for the last couple of minutes. It is
The timed message ID cache is the flow control mechanism. It tracks
the message IDs of seen messages for the last couple of minutes. It is
separate from `mcache` for implementation reasons in Go (the `seen`
cache is inherited from the pubsub framework), but they could be the
same data structure.
@@ -237,8 +243,8 @@ router, as part of the pubsub api:
Note that the router can publish messages without topic membership. In order
to maintain stable routes in that case, it maintains, in the `fanout` map, a list of peers
for each topic it has published to. If the router does not publish any
messages for some time, then the `fanout` peers for the topic are forgotten, so
this is soft state.
messages of a topic for some time, then the `fanout` peers for that topic are
forgotten, so this is soft state.
Also note that as part of the pubsub API, the peer emits `SUBSCRIBE`
and `UNSUBSCRIBE` control messages to all its peers whenever it joins
@@ -251,8 +257,8 @@ implementation would have to implement those control messages.
Upon receiving a message, the router first processes the payload of the message.
If it contains a valid message that has not been previously seen, then
it publishes the message:
- it forwards the message to every peer in `peers.floodsub[topic]`, provided it's not
the source of the message
- It forwards the message to every peer in `peers.floodsub[topic]`, provided it's not
the source of the message.
- It forwards the message to every peer in `mesh[topic]`, provided it's not the
source of the message.
@@ -268,10 +274,10 @@ After processing the payload, it then processes the control messages in the enve
When the router publishes a message that originates from the router itself (at the
application layer), then it proceeds similarly to the payload reaction:
- it forwards the message to every peer in `peers.floodsub[topic]`.
- if it is subscribed to the topic, then it must have a set of peers in `mesh[topic]`,
- It forwards the message to every peer in `peers.floodsub[topic]`.
- If it is subscribed to the topic, then it must have a set of peers in `mesh[topic]`,
to which the message is forwarded.
- if it is not subscribed to the topic, it then forwards the message to
- If it is not subscribed to the topic, it then forwards the message to
the peers in `fanout[topic]`. If this set is empty, it chooses `D` peers from
`peers.gossipsub[topic]` to become the new `fanout[topic]` peers and forwards
to them.
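The publishing rules above can be sketched as a single target-selection function. The function name and argument layout are hypothetical, and the sketch assumes that "subscribed to the topic" is equivalent to the topic having an entry in `mesh`.

```python
import random

def publish_targets(topic, peers_floodsub, peers_gossipsub, mesh, fanout, d, rng=random):
    """Return the set of peers that a locally originated message is sent to.

    All maps are topic -> set of peer IDs. `fanout` is updated in place
    when a new fanout set has to be chosen.
    """
    # Floodsub peers in the topic always receive the message.
    targets = set(peers_floodsub.get(topic, ()))
    if topic in mesh:
        # Subscribed (assumption: a mesh entry exists): forward to the mesh.
        targets |= set(mesh[topic])
    else:
        # Not subscribed: use the fanout peers, choosing D new ones if empty.
        if not fanout.get(topic):
            candidates = list(peers_gossipsub.get(topic, ()))
            fanout[topic] = set(rng.sample(candidates, min(d, len(candidates))))
        targets |= fanout[topic]
    return targets

floodsub = {"news": {"f1"}}
gossipsub = {"news": {"g1", "g2", "g3", "g4"}}
subscribed = publish_targets("news", floodsub, gossipsub, {"news": {"g1", "g2"}}, {}, d=2)
fanout_state = {}
unsubscribed = publish_targets("news", floodsub, gossipsub, {}, fanout_state, d=2)
```

A second publish without subscription reuses `fanout_state["news"]`, which is what keeps the routes stable between messages.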
@@ -286,7 +292,8 @@ The `mesh` is maintained exactly as prescribed by `meshsub`:
```
for each topic in mesh:
if |mesh[topic]| < D_low:
select D - |mesh[topic]| peers from peers.gossipsub[topic] - mesh[topic]
select D - |mesh[topic]| peers from peers.gossipsub[topic] - mesh[topic]
// i.e. not including those that are already in the topic mesh.
for each new peer:
add peer to mesh[topic]
emit GRAFT(topic) control message to peer
@@ -298,15 +305,15 @@ for each topic in mesh:
emit PRUNE(topic) control message to peer
```
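One way to read the mesh maintenance pseudocode is sketched below. The undersubscribed branch follows the pseudocode directly; the oversubscribed branch (pruning back down to `D` once the mesh exceeds an upper bound, called `d_high` here) is an assumption inferred from the trailing `PRUNE` line, as is the requirement that `d_low <= d <= d_high`. Control messages are returned as tuples rather than sent.

```python
import random

def maintain_mesh(mesh, peers_gossipsub, d, d_low, d_high, rng=random):
    """Rebalance each topic mesh and return the control messages to emit.

    `mesh` and `peers_gossipsub` map topic -> set of peer IDs; `mesh` is
    modified in place. Assumes d_low <= d <= d_high.
    """
    control = []
    for topic, members in mesh.items():
        if len(members) < d_low:
            # Candidates exclude peers that are already in the topic mesh.
            candidates = list(peers_gossipsub.get(topic, set()) - members)
            wanted = min(d - len(members), len(candidates))
            for peer in rng.sample(candidates, wanted):
                members.add(peer)
                control.append(("GRAFT", topic, peer))
        elif len(members) > d_high:
            # Assumed branch: prune random peers until D remain.
            for peer in rng.sample(list(members), len(members) - d):
                members.remove(peer)
                control.append(("PRUNE", topic, peer))
    return control

known = {"news": set("abcdefgh")}
small_mesh = {"news": {"a"}}
grafts = maintain_mesh(small_mesh, known, d=4, d_low=3, d_high=6)
large_mesh = {"news": set("abcdefgh")}
prunes = maintain_mesh(large_mesh, known, d=4, d_low=3, d_high=6)
```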
The `fanout` map is maintained by keeping track of last published time
The `fanout` map is maintained by keeping track of the last published time
for each topic:
```
for each topic in fanout:
if last published time > TTL
remove topic from fanout
else if |fanout[topic]| < D
select D - |fanout[topic]| peers from peers.gossipsub[topic] - fanout[topic]
add the peers to fanout[topic]
if time since last published > ttl
remove topic from fanout
else if |fanout[topic]| < D
select D - |fanout[topic]| peers from peers.gossipsub[topic] - fanout[topic]
add the peers to fanout[topic]
```
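The fanout maintenance step might look roughly like this in practice. The `last_published` map, the explicit `now` argument, and the function name are assumptions introduced for the sketch; times are plain numbers in seconds.

```python
import random

def maintain_fanout(fanout, last_published, peers_gossipsub, d, ttl, now, rng=random):
    """Expire stale fanout topics and top up the remaining ones to D peers.

    `fanout` and `peers_gossipsub` map topic -> set of peer IDs;
    `last_published` maps topic -> time of the last publish.
    Both `fanout` and `last_published` are modified in place.
    """
    for topic in list(fanout):  # copy the keys, since entries may be deleted
        if now - last_published[topic] > ttl:
            # Nothing published recently: forget the fanout peers (soft state).
            del fanout[topic]
            del last_published[topic]
        elif len(fanout[topic]) < d:
            candidates = list(peers_gossipsub.get(topic, set()) - fanout[topic])
            missing = d - len(fanout[topic])
            fanout[topic].update(rng.sample(candidates, min(missing, len(candidates))))

fanout = {"old": {"a"}, "new": {"a"}}
last_published = {"old": 0.0, "new": 95.0}
known = {"old": set("abcd"), "new": set("abcd")}
maintain_fanout(fanout, last_published, known, d=3, ttl=60.0, now=100.0)
```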
Gossip is emitted by selecting peers for each topic that are not already part
@@ -327,7 +334,6 @@ gossip for simplicity, but this is not normative. A separate parameter
factor, which allows for tuning the tradeoff between eager and lazy
transmission of messages.
### Control message piggybacking
Gossip and other control messages do not have to be transmitted on
@@ -336,7 +342,7 @@ any other message in the regular flow, for any topic. This can lead to
message rate reduction whenever there is some correlated flow between
topics, and can be significant for densely connected peers.
For piggyback implementation details, consult the go implementation.
For piggyback implementation details, consult the [Go implementation](https://github.com/libp2p/go-floodsub/blob/master/gossipsub.go).
### Protobuf