mirror of
https://github.com/vacp2p/vac.dev.git
synced 2026-01-06 21:34:17 -05:00
496 lines
30 KiB
Plaintext
496 lines
30 KiB
Plaintext
---
|
||
title: 'Large Message Handling in GossipSub: Potential Improvements'
|
||
date: 2024-10-31 12:00:00
|
||
authors: farooq
|
||
published: true
|
||
slug: gsub-largemsg-improvements
|
||
categories: research
|
||
discuss: https://forum.vac.dev/t/large-message-handling-in-gossipsub-potential-improvements/375
|
||
|
||
|
||
toc_min_heading_level: 2
|
||
toc_max_heading_level: 5
|
||
---
|
||
|
||
Large Message Handling in GossipSub: Potential Improvements
|
||
|
||
{/* truncate */}
|
||
|
||
## Motivation
|
||
The challenge of large message transmissions in GossipSub leads to longer than expected network-wide message dissemination times (and relatively higher fluctuations).
|
||
It is particularly relevant for applications that require on-time, network-wide dissemination of large messages,
|
||
such as Ethereum and Waku [[1](https://eips.ethereum.org/EIPS/eip-4844),[2](https://docs.waku.org/research/research-and-studies/message-propagation/)].
|
||
|
||
|
||
This matter has been extensively discussed in the libp2p community [[3](https://github.com/libp2p/rust-libp2p/pull/3666),
|
||
[4](https://github.com/sigp/lighthouse/pull/4383),
|
||
[5](https://hackmd.io/X1DoBHtYTtuGqYg0qK4zJw),
|
||
[6](https://github.com/status-im/nim-libp2p/issues/850),
|
||
[7](https://github.com/vacp2p/nim-libp2p/pull/911),
|
||
[8](https://github.com/vacp2p/nim-libp2p/issues/859)],
|
||
and numerous improvements have been considered (or even incorporated) for the GossipSub protocol to enable efficient large-message propagation
|
||
[[3](https://github.com/libp2p/rust-libp2p/pull/3666),
|
||
[7](https://github.com/vacp2p/nim-libp2p/pull/911),
|
||
[9](https://hackmd.io/@gRwfloEASH6NWWS_KJxFGQ/B18wdnNDh),
|
||
[10](https://github.com/libp2p/specs/blob/master/pubsub/gossipsub/gossipsub-v1.2.md?plain=1#L52)].
|
||
|
||
## Problem description
|
||
|
||
Sending a message to N peers involves approximately $\lceil \log_D(N) \rceil$ rounds,
|
||
with approximately $(D-1)^{X-1} \times D$ transmissions in each round,
|
||
where $X, D$ represent the round number and mesh size.
|
||
|
||
Transmitting to a higher number of peers (floodpublish) can theoretically reduce latency by increasing the transmissions in each round to $(D-1)^{X-1} \times (F+D)$,
|
||
where $F$ represents the number of peers included in floodpublish.
|
||
|
||
This arrangement works fine for relatively small/moderate message sizes.
|
||
However, as message sizes increase, significant rises and fluctuations in network-wide message dissemination time are seen.
|
||
|
||
Interestingly, a higher $D$ or $F$ can also degrade performance in this situation.
|
||
|
||
Several aspects contribute to this behavior:
|
||
|
||
1. Ideally, a message transmission to a single peer concludes in $\tau_1 = \frac {L}{R}+P$ (ignoring any message processing time),
|
||
where $L, R, P$ represent message size, data rate, and link latency.
|
||
Therefore, the time required for sending a message on a 100Mbps link with 100ms latency
|
||
jumps from $\tau_1^{10KB} = 100.8ms$ for a 10KB message to $\tau_1^{1MB} = 180ms$ for a 1MB message.
|
||
For $D$ peers, the transmission time increases to $\tau_D^{1MB} = (80 \times D) + 100ms$,
|
||
triggering additional queuing delays (proportional to the transmission queue size) during each transmission round.
|
||
|
||
2. In practice, $\tau_1^{1MB}$ sometimes rises to several hundred milliseconds,
|
||
further exaggerating the abovementioned queuing delays.
|
||
This rise is because TCP congestion avoidance algorithms usually limit maximum in-flight bytes in a round trip time (RTT)
|
||
based on the congestion window ($C_{wnd}$) and maximum segment size (MSS) to approximately ${C_{wnd} \times MSS}$,
|
||
with $C_{wnd}$ rising with the data transfer for each flow.
|
||
Consequently, sending the same message through a newly established (cold) connection takes longer.
|
||
The message transfer time lowers as the $C_{wnd}$ grows.
|
||
Therefore, performance-improvement practices such as floodpublish, frequent mesh adjustment, and lazy sending
|
||
typically result in longer than expected message dissemination times for large messages (due to cold connections).
|
||
It is also worth mentioning that some TCP variants reset their $C_{wnd}$ after different periods of inactivity.
|
||
|
||
3. Theoretically, the message transmission time to D peers $(\tau_D)$ remains the same
|
||
even if the message is relayed sequentially to all peers or simultaneous transmissions are carried out,
|
||
i.e., $\tau_D = \sum_{i=1}^{D} \tau_i$
|
||
However, sequential transmissions finish early for individual peers, allowing them to relay early.
|
||
This may result in quicker network-wide message dissemination.
|
||
|
||
|
||
4. A realistic network comprises nodes with dissimilar capabilities (bandwidth, link latency, compute, etc.).
|
||
As the message disseminates, it's not uncommon for some peers to receive it much earlier than others.
|
||
Early gossip (IHAVE announcements) may bring in many IWANT requests to the early receivers (even from peers already receiving the same message),
|
||
which adds to their workload.
|
||
|
||
5. A busy peer (with a sizeable outgoing message queue) will enqueue (or simultaneously transfer) newly scheduled outgoing messages.
|
||
As a result, already scheduled messages are prioritized over those published by the peer itself,
|
||
introducing a significant initial delay to the locally published messages.
|
||
Enqueuing IWANT replies to the outgoing message queue can further exaggerate the problem.
|
||
The lack of adaptiveness and standardization in outgoing message prioritization are key factors that can lead to noticeable inconsistency
|
||
in message dissemination latency at each hop, even in similar network conditions.
|
||
|
||
6. Message size directly contributes to peers' workloads in terms of processing and transmission time.
|
||
It also raises the probability of simultaneous redundant transmissions to the same peer,
|
||
resulting in bandwidth wastage, congestion, and slow message propagation to the network.
|
||
Moreover, the benefits of sequential message relaying can be compromised by prioritizing slow (or busy) peers.
|
||
|
||
7. Most use cases necessitate validating received messages before forwarding them to the next-hop peers.
|
||
For a higher message transfer time $(\tau )$, this store-and-forward delay accumulates across the hops traveled by the message.
|
||
|
||
## Possible improvements
|
||
|
||
|
||
### 1. Minimizing transfer time for large messages
|
||
|
||
The impact of message size and achievable data rate on message transmit time $\tau$ is crucial
|
||
as this time accumulates due to the store-and-forward delay introduced at intermediate hops.
|
||
|
||
Some possible improvements to minimize overall message dissemination latency include:
|
||
|
||
#### a. Message fragmentation
|
||
|
||
In a homogeneous network, network-wide message dissemination time (ignoring any processing delays)
|
||
can be simplified to roughly $\delta \approx \delta_{Tx} + P_h$, where $\delta_{Tx}$ represents accumulative message transmit time denoted as $\delta_{Tx} = \frac{S}{R} \times h$,
|
||
with $S, R$ being the data size and data rate,
|
||
and $h, P_h$ being the number of hops in the longest path and message propagation time through the longest path.
|
||
|
||
Partitioning a large message into n fragments reduces a single fragment transmit time to $\frac{\delta_{Tx}}{n}$.
|
||
As a received fragment can be immediately relayed by the receiver (while the sender is still transmitting the remaining fragments),
|
||
it reduces the transmit time to $\delta_{Tx} = \frac{S}{R} \times \frac{2h-1}{n}$.
|
||
|
||
This time reduction is mainly attributed to the smaller store-and-forward delay involved in fragment transmissions.
|
||
|
||
However, it is worth noting that many applications require each fragment to be individually verifiable.
|
||
At the same time, message fragmentation allows a malicious peer to never relay some fragments of a message,
|
||
which can lead to a significant rise in the application's receive buffer size.
|
||
|
||
Therefore, message fragmentation requires a careful tradeoff analysis between time and risks.
|
||
|
||
#### b. Message staggering
|
||
|
||
Considering the same bandwidth, the time $\tau_D$ required for sending a message to D peers stays the same,
|
||
even if we relay to all peers in parallel or send sequentially to the peers, i.e., $\tau_D = \sum_{i=1}^{D} \tau_i$.
|
||
|
||
However, sequential relaying results in quicker message reception at individual peers ($\tau_1 \approx \frac{\tau_D}{D}$) due to bandwidth concentration for a particular peer.
|
||
So, the receiver can start relaying early to its mesh members while the original sender is still sending the message to other peers.
|
||
|
||
|
||
As a result, after every $\frac{\tau_D}{D}$ milliseconds,
|
||
the number of peers receiving the message increases by $2^X\ \forall\ X \in \{0, D-1\}$ and by $\sum_{k=X-D}^{X-1} \lambda_k\ \forall\ X \geq D$.
|
||
Here, $X$ represents message transmission round $X = i \cdot \frac{\tau_D}{D} \mid i \in \mathbb{N}_0$, and $\lambda_k$ represents the number of peers that received the message in round $k$.
|
||
|
||
It is worth noting that a realistic network imposes certain constraints on staggering for peers.
|
||
For instance, in a network with dissimilar peer capabilities,
|
||
placing a slow peer (also in cases where many senders simultaneously select the same peer) at the head of the transmission queue
|
||
may result in head-of-line blocking for the message queue.
|
||
|
||
At the same time, early receivers get many IWANT requests, increasing their workload.
|
||
|
||
#### c. Message prioritization for slow senders
|
||
|
||
A slow peer often struggles with a backlog of messages in the outgoing message queue(s) for mesh members.
|
||
Any new message transmission at this stage (especially the locally published messages) gets delayed.
|
||
Adaptive message-forwarding can help such peers prioritize traffic to minimize latency for essential message transfers.
|
||
|
||
For instance, any GossipSub peer will likely receive every message from multiple senders,
|
||
leading to redundant transmissions [[11](https://ethresear.ch/t/number-duplicate-messages-in-ethereums-gossipsub-network/19921)].
|
||
Implementing efficient strategies (only for slow senders) like lazy sending
|
||
and prioritizing locally published messages/IWANT replies over already queued messages
|
||
can help minimize outgoing message queue sizes and optimize bandwidth for essential message transfers.
|
||
|
||
A peer can identify itself as a slow peer by using any bandwidth estimation approach
|
||
or simply setting an outgoing message queue threshold for all mesh members.
|
||
|
||
|
||
Eliminating/deprioritizing some messages can lower a peer's score,
|
||
but it also earns the peer an overall better score by achieving some early message transfers.
|
||
For instance, sending many near-first messages can only save a peer from a deficit penalty.
|
||
On the other hand, sending only one message (assuming MeshMessageDeliveriesThreshold defaults to 1)
|
||
as the first delivered message can add to the accumulative peer score.
|
||
|
||
### 2. Mitigating transport issues
|
||
|
||
Congestion avoidance algorithms used in various TCP versions directly influence achievable throughput and message transfer time
|
||
as maximum unacknowledged in-flight bytes are based on the congestion window $(C_{wnd})$ size.
|
||
|
||
|
||
Rapid adaptation of $C_{wnd}$ to the available network conditions can help lower message dissemination latency.
|
||
|
||
Therefore, selecting a more suitable TCP variant like BBR,
|
||
which is known for its ability to dynamically adjust the congestion window based on network conditions,
|
||
can significantly enhance GossipSub's performance.
|
||
|
||
At the same time, parameters like receive window scaling and initial $C_{wnd}$ also impact message transfer time,
|
||
but these are usually OS-specific system-wide choices.
|
||
|
||
One possible solution is to raise $C_{wnd}$ by exchanging data over the newly established connection.
|
||
This data may involve useful details like peer exchange information and gossip to build initial trust,
|
||
or GossipSub can use some dummy data to raise $C_{wnd}$ to a reasonable level.
|
||
|
||
It's important to understand that some TCP variants reset $C_{wnd}$ after specific periods of inactivity [[12](https://datatracker.ietf.org/doc/html/rfc2581#section-4.1)].
|
||
This can lead to a decline in TCP's performance for applications
|
||
that generate traffic after intervals long enough to trigger the resetting of the congestion window.
|
||
|
||
Implementing straightforward measures like transport-level ping-pong messages can effectively mitigate this problem [[13](https://github.com/libp2p/specs/pull/558)].
|
||
|
||
|
||
The limitations faced with $C_{wnd}$ scaling also impact some performance optimizations in GossipSub.
|
||
For instance, floodpublishing is an optimization relying on additional transmissions by the publisher to minimize message dissemination latency.
|
||
|
||
However, a small $C_{wnd}$ value in (new/cold) TCP connections established with floodpublish peers significantly increases message transmission time
|
||
[[4](https://github.com/sigp/lighthouse/pull/4383)].
|
||
Usually, these peers also receive the same message from other sources during this time, wasting the publisher's bandwidth.
|
||
|
||
The same is the case with IWANT replies.
|
||
|
||
Maintaining a bigger mesh (with warm TCP connections) and relaying to $D$ peers can be a better alternative to this problem.
|
||
|
||
### 3. Eliminating redundant transmissions
|
||
|
||
For every received packet, a peer makes roughly $D$ transmissions to contribute its fair share to the spread of messages.
|
||
However, the fact that many recipients had already received the message (from some other peer)
|
||
makes this message propagation inefficient.
|
||
|
||
Although the $D$-spread is attributed to quicker dissemination and resilience against non-conforming peers,
|
||
many potential solutions can still minimize redundant transmissions
|
||
while preserving the resilience of GossipSub.
|
||
|
||
|
||
These solutions, ranging from probabilistic to more knowledgeful elimination of messages from the outgoing message queue,
|
||
not only address the issue of redundancy but also provide an opportunity for bandwidth optimization,
|
||
especially for resource-constrained peers.
|
||
|
||
For instance, an IDONTWANT message, a key component of GossipSub (v1.2) [[10](https://github.com/libp2p/specs/blob/master/pubsub/gossipsub/gossipsub-v1.2.md?plain=1#L52)],
|
||
can significantly reduce redundant transmissions.
|
||
|
||
It allows any node to notify its mesh members that it has already received a message,
|
||
thereby preventing them from resending the same message.
|
||
This functionality is useful when a node receives a message larger than a specified threshold.
|
||
|
||
In such cases, the node promptly informs its mesh peers about the successful reception of the message by sending IDONTWANT messages.
|
||
|
||
It's important to note that an IDONTWANT message is essentially an IHAVE message, but with a crucial difference,
|
||
i.e., IHAVEs are only transmitted during the heartbeat intervals, whereas IDONTWANTs are sent immediately after receiving a large message.
|
||
|
||
This prompt notification helps curtail redundant large message transmissions without compromising the GossipSub resilience.
|
||
|
||
However, the use of IDONTWANT messages alone has an inherent limitation.
|
||
For instance, a peer can only send an IDONTWANT after receiving the complete message.
|
||
|
||
A large message transmission consumes significant time.
|
||
For example, transmitting a 1MB message at 100 Mbps bandwidth may consume 80 to several hundred milliseconds (depending upon $C_{wnd}$ and latency).
|
||
|
||
As a result, other mesh members may also start transmitting the same message during this interval.
|
||
A few potential solutions include:
|
||
|
||
#### a. Staggering with IDONTWANT messages
|
||
|
||
As previously discussed, [staggering](#b-message-staggering) can significantly reduce network-wide message dissemination latency.
|
||
This is primarily due to the relatively smaller store-and-forward delays that are inherent in this approach.
|
||
|
||
Using both staggering and IDONTWANT messages can further enhance efficiency by reducing redundant transmissions.
|
||
This is because a node only saturates its bandwidth for a small subset of mesh peers,
|
||
leading to early transmissions and prompt IDONTWANT message notifications to the mesh members.
|
||
|
||
It is worth highlighting that staggering can be implemented in various ways.
|
||
|
||
For example, it can be applied to peers (peer staggering)
|
||
where a node sequentially relays the same message to all peers one by one.
|
||
|
||
Alternatively, a node can send a different message to every peer (message staggering or rotational sending),
|
||
allowing IDONTWANTs for other messages to arrive during this time.
|
||
The message staggering approach is beneficial when several messages are introduced to the network within a short interval of time.
|
||
|
||
As the peers in staggered sending are sequentially covered
|
||
(with a faster speed due to bandwidth concentration), this leads to another problem.
|
||
|
||
The early covered peers send IHAVE (during their heartbeat intervals) for the messages they have received.
|
||
IHAVE announcements for newly received large messages trigger IWANTs from nodes
|
||
(including those already receiving the same message),
|
||
leading to an additional workload for early receivers [[14](https://github.com/vacp2p/nim-libp2p/issues/1101)].
|
||
|
||
Potential solutions to mitigate these problems include:
|
||
|
||
|
||
1) Defering IHAVE announcements for large messages.
|
||
|
||
Deferring IHAVE announcements can indirectly prioritize message transmission to the mesh peers over IWANT replies.
|
||
However, deciding on a suitable deferred interval is crucial for optimal performance.
|
||
One possible solution is to generate IHAVEs only after the message is relayed to all the mesh peers.
|
||
|
||
2) Defering IWANT requests for messages that are currently being received.
|
||
|
||
This requires [prior knowledge of msgIDs](#b-imreceiving-message) for the messages under reception.
|
||
Knowing the message length is also essential in deciding a suitable defer interval
|
||
to handle situations where a sender starts sending a message and never completes the transmission.
|
||
|
||
3) Not issuing IWANT for a message if at least $K$ peers have transmitted IDONTWANT for the same message
|
||
(as this indicates that these peers will eventually relay this message).
|
||
|
||
However, this approach can inadvertently empower a group of non-conforming mesh peers to send IDONTWANT for a message and never complete message transmission.
|
||
A delayed IWANT, along with negative peer scoring, can remedy this problem.
|
||
|
||
#### b. IMReceiving message
|
||
|
||
A peer can issue an IDONTWANT only after it has received the entire message.
|
||
However, a large message transmission may take several hundred milliseconds to complete.
|
||
During this time, many other mesh members may start relaying the same message.
|
||
|
||
Therefore, the probability of simultaneously receiving the same message from multiple senders increases with the message size,
|
||
significantly compromising the effectiveness of IDONTWANT messages.
|
||
|
||
Sending a short preamble (containing msgID and length) before the message transmission can provide valuable information about the message.
|
||
If a receiver is already receiving the same message from another sender,
|
||
the receiver can request to defer this transmission by sending a brief IMReceiving message.
|
||
|
||
An IDONTWANT from the receiver will indicate successful message reception. Otherwise, the waiting sender can initiate transmission after a specific wait interval.
|
||
|
||
However, waiting for IMReceiving after sending the preamble can delay the message transmission.
|
||
On the other hand, proceeding with message transfer (after sending the preamble) leads to another problem:
|
||
it is difficult to cancel ongoing message transmission after receiving IMReceiving for the same message.
|
||
|
||
|
||
To streamline this process, a peer can immediately send an IMReceiving message (for every received preamble),
|
||
urging other mesh peers to defer sending the same message [[15](https://forum.vac.dev/t/large-message-handling-idontwant-imreceiving/281),
|
||
[16](https://forum.vac.dev/t/idontwant-message-impact/283)].
|
||
|
||
The other peers can send this message if IDONTWANT is not received from the receiver during the wait interval.
|
||
This approach can boost IDONTWANT benefits by considering ongoing transmissions for large messages.
|
||
|
||
While IMReceiving messages can bring about substantial improvements in terms of latency and bandwidth utilization,
|
||
it's crucial to be aware of the potential risks.
|
||
|
||
A malicious user can exploit this approach to disrupt message transmission
|
||
either by never completing a message or by intentionally sending a message at an extremely slow rate to numerous peers.
|
||
|
||
|
||
This could ultimately result in network-wide slow message propagation.
|
||
|
||
However, carefully calibrating the deferring interval (based on message size) and negative peer scoring can help mitigate these risks.
|
||
|
||
#### c. IDONTWANT message with reduced forwarding
|
||
|
||
It is common for slow peers to pile up outgoing message queues,
|
||
especially for large message transfers.
|
||
This results in a significant queuing delay for outgoing messages.
|
||
Reduced message forwarding can help decrease the workload of slower peers.
|
||
|
||
On receiving a message longer than the specified threshold,
|
||
a slow peer can relay it to only $K \in D$ peers and send an IDONTWANT message to all the peers in $D$.
|
||
|
||
In this arrangement, the IDONTWANT message serves an additional purpose:
|
||
to promptly announce data availability, reinforcing redundancy in the presence of adversaries.
|
||
|
||
When a peer receives an IDONTWANT for an unseen message,
|
||
it learns about the new message and can request it by sending an IWANT request without waiting for the heartbeat (gossip) interval.
|
||
As a result, a significantly smaller number of transmissions is sufficient for propagating the message to the entire network.
|
||
|
||
This approach conserves peer bandwidth by minimizing redundant transmissions
|
||
while ensuring GossipSub resilience at the cost of one RTT (for missing peers).
|
||
|
||
Interestingly, curtailing queuing delays can also help lower network-wide message dissemination latency (for huge messages).
|
||
|
||
|
||
However, finding an appropriate value for $K$ is crucial for optimal performance.
|
||
A smaller $K$ saves peer bandwidth, while a larger $K$ achieves quicker spread until outgoing message queues pile up.
|
||
Setting $K = D_{low}$ can be one option.
|
||
|
||
It is worth mentioning that such behavior may negatively impact peer scoring (by missing message delivery rewards from $D-K$ peers).
|
||
However, a minimized workload enables early message dissemination to the remaining peers.
|
||
These early transmissions and randomized $K$ set selection can help achieve an overall better peer score.
|
||
|
||
### 4. Message prioritization
|
||
|
||
Despite the standardized specifications of the GossipSub protocol,
|
||
the message forwarding mechanisms can significantly impact network-wide message dissemination latency and bandwidth utilization.
|
||
|
||
It is worth mentioning that every node is responsible for transmitting different types of packets,
|
||
including control messages, locally published messages, messages received from mesh members, IWANT replies, etc.
|
||
|
||
As long as traffic volume is lower than the available data rate,
|
||
the message forwarding mechanisms yield similar results due to negligible queuing delays.
|
||
|
||
However, when the traffic volume increases and exceeds the available peer bandwidth (even for short traffic bursts),
|
||
the outgoing message queue(s) sizes rise, potentially impacting the network's performance.
|
||
|
||
In this scenario, FIFO-based traffic forwarding can lead to locally published messages being placed at the end of the outgoing message queue,
|
||
introducing a queuing delay proportional to the queue size.
|
||
The same applies to other delay-sensitive messages like IDONTWANT, PRUNE, etc.
|
||
|
||
On the other hand, the segregation of traffic into priority and non-priority queues can potentially starve low-priority messages.
|
||
One possible solution is to use weighted queues for a fair spread of messages.
|
||
|
||
Message prioritization can be a powerful tool to ensure that important messages reach their intended recipients on time
|
||
and allow for customizable message handling.
|
||
|
||
For example, staggering between peers and messages can be better managed by using priority queues.
|
||
However, it is important to note that message prioritization also introduces additional complexity to the system,
|
||
necessitating sophisticated algorithms for better message handling.
|
||
|
||
### 5. Maximizing benefits from IWANT messages
|
||
During heartbeat intervals, GossipSub nodes transmit IHAVE messages (carrying IDs of seen messages) to the peers not included in the full-message mesh.
|
||
These peers can use IWANT messages to request any missing messages.
|
||
A budget counter ensures these messages never exceed a specified threshold during each heartbeat interval.
|
||
|
||
The IHAVE/IWANT messages are a crucial tool in maintaining network connectivity.
|
||
They bridge the information gap between nearby and far-off peers,
|
||
ensuring that information can be disseminated to peers outside the mesh.
|
||
This function is essential in protecting against network partitions and indirectly aids in safeguarding against Sybil and eclipse attacks.
|
||
|
||
|
||
However, it is essential to understand that high transmission times for large messages
|
||
require careful due diligence when using IWANT messages for reasons not limited to:
|
||
|
||
1) A large message reception may take several hundred milliseconds to complete.
|
||
During this time, an IHAVE message announcing the same message ID will trigger an IWANT request.
|
||
|
||
2) A peer can send IWANT requests for the same message to multiple nodes,
|
||
leading to simultaneous transmissions of the same message.
|
||
|
||
3) Replying to (potentially many) IWANT requests can delay the transmission of the same message to mesh peers,
|
||
resulting in lower peer scores and slower message propagation.
|
||
|
||
A few possible solutions to mitigate this problem may include:
|
||
|
||
|
||
1) Issuing IHAVE announcements only after the message is delivered to many mesh peers.
|
||
|
||
2) Allocating a volume-based budget to service IWANT requests during each heartbeat interval.
|
||
|
||
3) Deferring IWANT requests for messages that are currently being received.
|
||
|
||
4) Deferring IWANT requests if at least $K$ IDONTWANTs are received for the same message.
|
||
|
||
5) A large message transmission can yield high $C_{wnd}$; preferring such peers during mesh maintenance can be helpful.
|
||
|
||
## Summary
|
||
This study investigates the pressing issue of considerable fluctuations and rises in network-wide dissemination times for large messages.
|
||
|
||
|
||
We delve into multiple factors,
|
||
such as increased message transmit times, store-and-forward delays, congestion avoidance mechanisms, and prioritization between messages,
|
||
to establish a comprehensive understanding of the problem.
|
||
|
||
The study also explores the performance of optimization efforts
|
||
like floodpublishing, IHAVE/IWANT messages, and message forwarding strategies in the wake of large message transmissions.
|
||
|
||
A key finding is that most congestion avoidance algorithms lack optimization for peer-to-peer networks.
|
||
Coupling this constraint with increased message transmission times
|
||
results in notable store-and-forward delays accumulating at each hop.
|
||
|
||
Furthermore, the probabilistic message-forwarding nature of GossipSub further exacerbates the situation
|
||
by utilizing a considerable share of available bandwidth on redundant transmissions.
|
||
|
||
|
||
Therefore, approaches focused on eliminating redundant transmissions
|
||
(IDONTWANT, IMReceiving, lazy sending, etc.) can prove helpful.
|
||
At the same time, strategies aimed at reducing store-and-forward delays
|
||
(fragmentation, staggering, prioritization, etc.) can prove beneficial.
|
||
|
||
|
||
It is worth mentioning that many of the strategies suggested in this post are ideas at different stages.
|
||
Some of these have already been explored and discussed to some extent [[5](https://hackmd.io/X1DoBHtYTtuGqYg0qK4zJw),
|
||
[17](https://forum.vac.dev/t/iwant-messages-may-have-negative-impact-on-message-dissemination-latency-for-large-messages/366),
|
||
[18](https://vac.dev/rlog/gsub-idontwant-perf-eval/)].
|
||
We are nearing the completion of a comprehensive performance evaluation of these approaches and will soon share the results of our findings.
|
||
|
||
Please feel free to join the discussion and leave feedback regarding this post in the
|
||
[VAC forum](https://forum.vac.dev/t/large-message-handling-in-gossipsub-potential-improvements/375).
|
||
|
||
|
||
## References
|
||
[1] EIP-4844: Shard Blob Transactions. Retrieved from https://eips.ethereum.org/EIPS/eip-4844
|
||
|
||
[2] Message Propagation Times With Waku-RLN. Retrieved from https://docs.waku.org/research/research-and-studies/message-propagation/
|
||
|
||
[3] Lenient Flood Publishing. Retrieved from https://github.com/libp2p/rust-libp2p/pull/3666
|
||
|
||
[4] Disable Flood Publishing. Retrieved from https://github.com/sigp/lighthouse/pull/4383
|
||
|
||
[5] GossipSub for Big Messages. Retrieved from https://hackmd.io/X1DoBHtYTtuGqYg0qK4zJw
|
||
|
||
[6] GossipSub: Lazy Sending. Retrieved from https://github.com/status-im/nim-libp2p/issues/850
|
||
|
||
[7] GossipSub: Limit Flood Publishing. Retrieved from https://github.com/vacp2p/nim-libp2p/pull/911
|
||
|
||
[8] GossipSub: Lazy Prefix Detection. Retrieved from https://github.com/vacp2p/nim-libp2p/issues/859
|
||
|
||
[9] Potential Gossip Improvement List for EIP4844. Retrieved from https://hackmd.io/@gRwfloEASH6NWWS_KJxFGQ/B18wdnNDh
|
||
|
||
[10] GossipSub Specifications v1.2: IDONTWANT Message. Retrieved from https://github.com/libp2p/specs/blob/master/pubsub/gossipsub/gossipsub-v1.2.md?plain=1#L52
|
||
|
||
[11] Number of Duplicate Messages in Ethereum’s GossipSub Network. Retrieved from https://ethresear.ch/t/number-duplicate-messages-in-ethereums-gossipsub-network/19921
|
||
|
||
[12] TCP Congestion Control: Re-starting Idle Connections. Retrieved from https://datatracker.ietf.org/doc/html/rfc2581#section-4.1
|
||
|
||
[13] PING/PONG Control Messages. Retrieved from https://github.com/libp2p/specs/pull/558
|
||
|
||
[14] IHAVE/IWANT Message Impact. Retrieved from https://github.com/vacp2p/nim-libp2p/issues/1101
|
||
|
||
[15] Large Message Handling IDONTWANT + IMReceiving Messages. Retrieved from https://forum.vac.dev/t/large-message-handling-idontwant-imreceiving/281
|
||
|
||
[16] IDONTWANT Message Impact. Retrieved from https://forum.vac.dev/t/idontwant-message-impact/283
|
||
|
||
[17] IWANT Message Impact. Retrieved from https://forum.vac.dev/t/iwant-messages-may-have-negative-impact-on-message-dissemination-latency-for-large-messages/366
|
||
|
||
[18] IDONTWANT Message Performance. Retrieved from https://vac.dev/rlog/gsub-idontwant-perf-eval/
|
||
|