Add an additional byte to SessionBitFlags to accommodate SESSION_REFINE
and reduce the risk of logic errors.
Additionally:
* `!SESSION_SEED & !SESSION_REFINE` is now referred to as `SESSION_DEFAULT`
* `!SESSION_REFINE` is now referred to as `SESSION_NET`.
* `SESSION_ALL` has been deleted since it was conceptually outdated
* Binaries have been updated.
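
As an illustration of the flag layout described above, here is a minimal sketch. The concrete bit positions, the `u32` width, the `SESSION_INBOUND`/`SESSION_OUTBOUND`/`SESSION_MANUAL` names and the `applies_to` helper are assumptions made for the example; only `SESSION_SEED`, `SESSION_REFINE`, `SESSION_DEFAULT` and `SESSION_NET` are named in this message.

```rust
/// Hypothetical flag type: one bit per session kind (width is an assumption).
pub type SessionBitFlag = u32;

// Assumed bit positions, for illustration only.
pub const SESSION_INBOUND: SessionBitFlag = 0b00001;
pub const SESSION_OUTBOUND: SessionBitFlag = 0b00010;
pub const SESSION_MANUAL: SessionBitFlag = 0b00100;
pub const SESSION_SEED: SessionBitFlag = 0b01000;
pub const SESSION_REFINE: SessionBitFlag = 0b10000;

/// Every session except seed and refine.
pub const SESSION_DEFAULT: SessionBitFlag = !SESSION_SEED & !SESSION_REFINE;
/// Every session except refine.
pub const SESSION_NET: SessionBitFlag = !SESSION_REFINE;

/// Hypothetical helper: does a protocol registered with `mask`
/// apply to a session of kind `session`?
pub fn applies_to(mask: SessionBitFlag, session: SessionBitFlag) -> bool {
    (mask & session) != 0
}
```

With named masks, call sites no longer need to spell out the negations by hand, which is where logic errors tend to creep in.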
* Move greylist refinery and ping self process into Session.
* Remove hosts/ submodule and return store to net/hosts.rs
* Temporarily disable Gold/ Grey list upgrade and downgrade (this will move into RefineSession)
This leads to cleaner code, since depending on the use case we still do
different things with the HostState following move_host(). However, it
does mean that unregister() has to be called manually in some cases.
Previously there was a bug, which happened very rarely, in which:
> Outbound and Manual Session are waiting on a stop signal
> Outbound/ Manual receives a stop signal, de-registers channel (in move_host)
> Channel is selected by Slot 1 to be connected to, state is changed to Connect
> remove_sub_on_stop() receives a stop signal, de-registers channel
> Channel is selected by Slot 5 to be connected to, state is changed to Connect
> Slot 1 connects, state is changed to Connected
> Slot 5 connects -> panic!
To avoid this happening, we move unregister() out of move_host and perform the sequence:
recv stop signal -> move_host to greylist (if outbound/manual) -> unregister()
We do this inside the shared method remove_sub_on_stop to ensure the execution path always happens in the same way.
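
The ordering above can be sketched roughly as follows. This is a simplified, single-threaded model, not the actual implementation: the `Hosts` struct, its fields, the `SessionKind` and `List` enums and all signatures are hypothetical. Only the names move_host, unregister and remove_sub_on_stop come from this message.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SessionKind { Inbound, Outbound, Manual }

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum List { Grey, White, Gold }

/// Hypothetical, heavily simplified host store.
#[derive(Default)]
pub struct Hosts {
    /// Addresses that currently hold a state (e.g. "Connect", "Connected").
    registry: HashMap<String, &'static str>,
    /// Which hostlist each address currently lives on.
    lists: HashMap<String, List>,
}

impl Hosts {
    /// Fails if the address already holds a state, so two slots
    /// cannot select the same channel at once.
    pub fn try_register(&mut self, addr: &str, state: &'static str) -> bool {
        if self.registry.contains_key(addr) {
            return false;
        }
        self.registry.insert(addr.to_string(), state);
        true
    }

    /// Moves the host between lists. Note: it no longer unregisters.
    pub fn move_host(&mut self, addr: &str, dest: List) {
        self.lists.insert(addr.to_string(), dest);
    }

    pub fn unregister(&mut self, addr: &str) {
        self.registry.remove(addr);
    }

    pub fn list_of(&self, addr: &str) -> Option<List> {
        self.lists.get(addr).copied()
    }
}

/// Single shared stop path: move first (outbound/manual only),
/// then unregister exactly once.
pub fn remove_sub_on_stop(hosts: &mut Hosts, addr: &str, kind: SessionKind) {
    if matches!(kind, SessionKind::Outbound | SessionKind::Manual) {
        hosts.move_host(addr, List::Grey);
    }
    hosts.unregister(addr);
}
```

Because the address is released in exactly one place, a second slot cannot pick it up between two separate de-registrations.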
In Monero, nodes broadcast addrs from their whitelist. Receiving nodes
save the information on their greylist.
This is to ensure that honest nodes only broadcast active (i.e. whitelist)
nodes to the network. Dishonest nodes can send garbage info through
the hostlist, and therefore all information received from other nodes
is considered hostile and placed in the greylist, until we independently
verify it is accessible via the refinery.
Previously, darkfi deviated from this design as follows:
* Since peers on the greylist that do not match our transports never
enter the refinery, we assume that the greylist consists of
unsupported transports.
* We broadcast the greylist in ProtocolAddr, in an attempt to
ensure that all transports are propagated.
Rather than simply assuming the greylist contains unsupported
transports, it is better to assume the greylist is hostile (since it
comes from other nodes).
We create a `darklist` specifically for storing unknown/ unsupported
transports. When we receive information from other peers, unsupported
addrs are added to our `darklist`, which is then broadcast to other
peers in ProtocolAddr. This fulfils the requirement (of broadcasting all
transports) without also involving honest peers in the propagation of
hostile info.
Specifically:
* Hostile peers can still broadcast garbage info in their gold, white
and dark lists.
* Since info from other nodes is potentially hostile, honest peers save
this info on their greylist and do not broadcast it to other peers
unless a) it passes the refinery, b) we connect to it in an outbound
session, or c) we do not support this transport.
* There is a potential attack in which an attacker could fill their
darklist with garbage e.g. Nym addresses, and honest nodes that do not
support Nym will continue sharing these addresses via the dark list.
The hostile peers will continue to be shared until a Nym-supporting
node receives them and they pass via the refinery.
* Note that this attack is less severe, since provided the nodes stay
on the Dark list they are ignored by the refinery and outbound connect
loop and do not eat up resources of the node. The only time it will
potentially cause pressure on a node is if e.g. a Nym node receives
a list of hostile fake Nym addresses and they enter its greylist,
causing it to refine many garbage addresses and potentially slowing
its ability to make outbound connections. The latter can be prevented
by increasing the settings `anchor_connect_count` and
`white_connection_percent` (meaning outbound connections will not
select from the greylist, or will select from it less often).
* Since there exists a potential attack vector of garbage entries in the
Dark list, we limit the Dark list size to 1000 peers.
* This also means that supporting all transports is the best setup,
since it increases the security of the network (wrt the dark list).
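
To make the greylist/darklist split concrete, here is a minimal sketch of how a received address might be sorted by transport. The `HostLists` type, the `store_received` method, the URL-style `scheme://` parsing and the drop-oldest eviction policy are all assumptions for illustration; only the 1000-entry limit and the greylist/darklist roles come from this message.

```rust
use std::collections::VecDeque;

/// Dark list size limit stated above.
pub const DARKLIST_MAX: usize = 1000;

/// Hypothetical container for the two lists involved.
pub struct HostLists {
    /// Transport schemes this node supports, e.g. "tcp" or "tor".
    supported: Vec<String>,
    /// Untrusted addrs awaiting the refinery. Not broadcast.
    pub greylist: Vec<String>,
    /// Addrs on transports we cannot verify. Broadcast as-is.
    pub darklist: VecDeque<String>,
}

impl HostLists {
    pub fn new(supported: &[&str]) -> Self {
        Self {
            supported: supported.iter().map(|s| s.to_string()).collect(),
            greylist: Vec::new(),
            darklist: VecDeque::new(),
        }
    }

    /// Sort an address received from another peer into the right list.
    pub fn store_received(&mut self, addr: &str) {
        // Assumes addresses look like "scheme://host:port".
        let scheme = match addr.split_once("://") {
            Some((scheme, _)) => scheme,
            None => return, // malformed, drop it
        };

        if self.supported.iter().any(|s| s.as_str() == scheme) {
            // Supported transport: hostile until refined.
            if !self.greylist.iter().any(|a| a.as_str() == addr) {
                self.greylist.push(addr.to_string());
            }
        } else {
            // Unsupported transport: keep for propagation, bounded in size.
            if self.darklist.iter().any(|a| a.as_str() == addr) {
                return;
            }
            if self.darklist.len() >= DARKLIST_MAX {
                // Eviction policy is an assumption: drop the oldest entry.
                self.darklist.pop_front();
            }
            self.darklist.push_back(addr.to_string());
        }
    }
}
```

The cap bounds how much garbage a hostile peer can get an honest node to relay, at the cost of possibly evicting legitimate entries.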
Fixes several bugs:
1. Gold list upgrades were getting blocked since it required the
following state change that was not permitted: Connected() -> Move
We have fixed this by enabling this state change and making Move
take an Option<ChannelPtr> so that we can immediately reset Gold
upgrades to Connected(ChannelPtr) once the upgrade has successfully
completed.
2. Previously we were not downgrading peers when they disconnected. This
has now been fixed.
3. Previously move_host() was not properly atomic. While HostState
protects a single host from being misused (like being Refined and Moved
at the same time), HostState does not protect the hostlists
themselves from being written to, creating race conditions when hosts
are being removed from hostlists, like so:
thread1: assert!(get_index(host_a) == 0)
thread2: assert!(get_index(host_b) == 1)
thread1: remove(0)
thread2: remove(1) -> panic!
We resolve this by moving write locks higher up in the code so that
the entire sequence of looking up an index and removing it is
atomic.
4. Manual session had a bug in which we proceeded to establish a
Connector with an address even if try_register() failed. This has
been fixed. We now only try to connect to addresses that are valid,
otherwise we wait outbound_connect_timeout and retry, up to
manual_attempt_limit times.
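
The fix in item 3 can be sketched as below, using `std::sync::RwLock` as a stand-in for whatever lock the hostlists actually use. The `HostList` alias and the `remove_host` and `remove_concurrently` functions are hypothetical names for this example. The point is that the index lookup and the removal happen under one write guard, so no other thread can shift indices in between.

```rust
use std::sync::{Arc, RwLock};
use std::thread;

/// Hypothetical shared hostlist.
pub type HostList = Arc<RwLock<Vec<String>>>;

/// Look up the index and remove the entry under a single write lock.
/// Returns true if the host was present.
pub fn remove_host(list: &HostList, addr: &str) -> bool {
    // The guard is held from the lookup through the removal.
    let mut guard = list.write().unwrap();
    match guard.iter().position(|a| a.as_str() == addr) {
        Some(idx) => {
            guard.remove(idx);
            true
        }
        None => false,
    }
}

/// Remove several hosts from separate threads and count the successes.
pub fn remove_concurrently(list: &HostList, addrs: Vec<String>) -> usize {
    let handles: Vec<_> = addrs
        .into_iter()
        .map(|addr| {
            let list = Arc::clone(list);
            thread::spawn(move || remove_host(&list, &addr))
        })
        .collect();

    handles
        .into_iter()
        .map(|h| h.join().unwrap())
        .filter(|removed| *removed)
        .count()
}
```

If the lookup instead took a read lock and the removal a separate write lock, the index could be stale by the time `remove` runs, which is the panic shown in the trace above.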
The usage of a second p2p network for miners was a premature optimization for faster block propagation between block producers. In reality we don't know if it's required yet, so we eliminate the extra complexity it introduces.