
Waku Scaling

Note: This analysis (and the underlying model) is concerned with currently active users. The total number of users of an app's multicast group using Waku relay could be a factor of 10 to 100 higher (estimated based on @Menduist's Discord checks).

Note: The analysis focuses on ingress traffic per node. For Waku relay nodes, the egress traffic is (almost exactly) the same as this ingress traffic. So, to consider both ingress and egress load on relay nodes, we need to multiply results by 2.

Note: In this analysis we assume nodes to be part of a protocol stack that has an app layer (e.g. Status Desktop), which originates messages. We simply say "a node sends messages", even though it is not the gossipsub layer itself, but the app layer using the gossipsub node.

Many thanks to @Menduist for the helpful discussions, input, and feedback.

Analysis

We first look at the scaling properties of a single pubsub topic, which corresponds to a single shard. This is relevant for scaling the number of nodes within a single multicast group, and by extension the number of active users in a single Status Community. We also look at the scaling properties of a Waku network as a whole, including sharding.

Single Shard

The load on nodes scales linearly with the number of nodes in a shard. So, there is a limit on the number of nodes (currently observed at around 10k) that can feasibly be active in a single shard. The degree d of the assumed regular topology graph adds to the scaling factor. While this is asymptotically irrelevant, it has a significant practical impact.

  • The (constant) degree d limits ingress load growth per node to O(n).
  • A higher d adds load but reduces latency.
  • An unbounded d yields the best latency but infeasible load.
  • Gossip reduces latency at the cost of load.

Topology and Duplicate Messages

We assume a d-regular graph topology. The topology affects the number of duplicate message transmissions, as well as the latency.

Duplicate messages cause significant additional load. Gossipsub reduces the number of duplicate messages by

  • not sending a message back to the sender
  • not relaying a message if it has already been relayed (using a seen cache)

While these techniques are very helpful, there are duplicate messages that cannot be avoided by them. Assume node A is connected to B and C, and B and C are connected to D. A relays a message to both B and C, both of which will relay the message to D, so D receives the message twice. With local routing decisions (and no additional control messages), the number of edges in the graph is a lower bound for the number of hop-to-hop transmissions of a single message. A d-regular graph has \frac{n \times d}{2} edges (proof here).

However, there is another practical problem (thanks @Menduist for pointing this out): Two nodes that both just got the message might relay the message to each other before either of them registers the receipt.

So, in practice a message can traverse the same edge twice. Each node actually sends the message to d-1 peers, which leads to up to n \times (d-1) hop-to-hop transmissions per message. Waiting before transmitting can lower this number to somewhere between \frac{n \times d}{2} and n \times (d-1).
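
To make these bounds concrete, here is a minimal Python sketch (not part of ./waku_scaling.py; the function names are illustrative) that computes both bounds for given n and d:

```python
# Illustrative sketch: bounds on hop-to-hop transmissions of a single message
# in a d-regular graph with n nodes.

def transmissions_lower_bound(n: int, d: int) -> float:
    """Every edge carries the message at least once: (n * d) / 2 edges."""
    return n * d / 2


def transmissions_upper_bound(n: int, d: int) -> int:
    """Every node forwards to d - 1 peers without waiting: n * (d - 1)."""
    return n * (d - 1)


print(transmissions_lower_bound(10_000, 6))  # 30000.0
print(transmissions_upper_bound(10_000, 6))  # 50000
```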

Gossip

Gossip is sent once per heartbeat interval (1s) in the form of an IHAVE message, indicating that a node has a certain message. An IHAVE for a message is sent as long as that message is in one of the last mcache_gossip = 3 windows, where the window length corresponds to the heartbeat interval. So, effectively, each relayed message triggers mcache_gossip = 3 IHAVE messages. Each IHAVE is transmitted to d_lazy = 6 peers, so each relayed message triggers 3 * 6 = 18 IHAVE messages (see the estimation of the size of an IHAVE). If a receiving node actually wants the message, it sends an IWANT and receives the actual message. In the model, we introduce avg_ratio_gossip_replys as the ratio of IHAVEs that are followed by an IWANT. This follow-up causes an additional load of (gossip_message_size + message_size) * avg_ratio_gossip_replys.
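
As an illustration of this gossip overhead, the following hedged sketch computes the additional load per relayed message; the parameter names mirror the text (mcache_gossip, d_lazy, avg_ratio_gossip_replys), but the function itself is illustrative and not taken from ./waku_scaling.py:

```python
# Illustrative sketch: additional load (in KB) that gossip adds per relayed
# message, following the description above.

def gossip_load_per_message_kb(
    message_size_kb: float = 2.05,
    gossip_message_size_kb: float = 0.05,   # size of an IHAVE/IWANT
    mcache_gossip: int = 3,                 # gossip windows per message
    d_lazy: int = 6,                        # gossip peers per heartbeat
    avg_ratio_gossip_replys: float = 0.01,  # ratio of IHAVEs answered by an IWANT
) -> float:
    ihave_count = mcache_gossip * d_lazy    # 3 * 6 = 18 IHAVEs per message
    ihave_load = ihave_count * gossip_message_size_kb
    # Each answered IHAVE additionally costs an IWANT plus the full message.
    reply_load = ihave_count * avg_ratio_gossip_replys * (
        gossip_message_size_kb + message_size_kb
    )
    return ihave_load + reply_load


print(gossip_load_per_message_kb())  # ~1.28 KB of gossip overhead per message
```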

Multi Shard

Sharding allows bounding the maximum load a node/user is exposed to. Instead of scaling with the number of nodes in the network, the load scales with the number of nodes in all shards the node is part of.

Latency and Bandwidth Analysis

Assuming uniform link characteristics, message dissemination to the full-message mesh concludes in \tau_d \approx (d \times \tau_{tx}) + \tau_p time, where \tau_{tx} = \frac{S}{R}, with S, R, and \tau_p being the message size, data rate, and link latency, respectively. This simplifies network-wide dissemination time to \tau_{n} \approx \tau_d \times h, with h indicating the number of hops along the longest path, which can be roughly estimated as h \approx \lceil \log_d(n) \rceil.
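
A minimal sketch of this dissemination-time estimate, assuming the formulas above and example parameter values (the values are illustrative, not the ones used for the model calculations further below):

```python
import math

# Illustrative sketch of the dissemination-time estimate tau_n ~ tau_d * h.

def dissemination_time_s(n: int, d: int, msg_size_bytes: float,
                         rate_bps: float, link_latency_s: float) -> float:
    tau_tx = (msg_size_bytes * 8) / rate_bps  # serialization time S / R
    tau_d = d * tau_tx + link_latency_s       # time to cover one full-message mesh
    h = math.ceil(math.log(n, d))             # h ~ ceil(log_d(n))
    return tau_d * h


# Example: 10k nodes, degree 6, 2.05 KB message, 30 Mbps links, 100 ms link latency
print(dissemination_time_s(10_000, 6, 2.05 * 1024, 30e6, 0.1))
```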

During each hop, roughly (d-1)^{x-1} \times d transmissions are made, where x represents the hop number and satisfies 1 \leq x \leq h. A publisher transmitting to a higher number of peers (floodpublish) can theoretically lower latency (\tau_n) by increasing the transmissions in each round to (d-1)^{x-1} \times (f+d), where f represents the number of peers included in floodpublish. Similarly, the use of IWANT messages can also reduce latency by enabling distant peers to promptly retrieve overdue messages. However, the impact of floodpublish and IWANT messages is not considered in the latency computations above.

Regarding bandwidth utilization, a network comprising n peers, each with degree d, has a total of \frac{n \times d}{2} edges (links), as every link connects two peers. Assuming that a message traverses every link exactly once, we get at least \frac{n \times d}{2} transmissions for each message. Only n-1 transmissions are necessary for delivering a message to all peers. As a result, we get \frac{n \times d}{2} - (n-1) duplicates in the network. This simplifies the average number of duplicates received by a single peer to \bar{d}_{min} \approx \frac{d}{2} - 1. Here, \bar{d}_{min} represents the lower bound on average duplicates because we assume that the send and receive operations are mutually exclusive. This assumption requires that message transmission times (and link latencies) are so small that no two peers simultaneously transmit the same message to each other.

However, a large message can noticeably increase the contention interval, which increases the likelihood that many peers will simultaneously transmit the same message to each other. Regarding the maximum bandwidth utilization, we assume that all peers forward every received message to d-1 peers, while the original publisher sends it to d peers. As a result, we get n(d-1)+1 transmissions in the network. Only n-1 transmissions are necessary to deliver a message to all peers. The remaining n(d-1)+1-(n-1) = n(d-2)+2 transmissions are duplicates, which simplifies the upper bound on average duplicates received by each peer to \bar{d}_{max} \approx d-2. This indicates that larger messages can lead to more duplicates due to longer contention intervals. It is essential to highlight that the impact of IWANT/IDONTWANT messages is not considered in the \bar{d} computations above.
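
The two bounds on average duplicates per peer can be checked with a small sketch (illustrative, not from ./waku_scaling.py):

```python
# Illustrative sketch of the bounds on average duplicates received per peer.

def avg_duplicates_bounds(n: int, d: int) -> tuple:
    # Lower bound: every edge is used exactly once -> n * d / 2 transmissions.
    dup_min = (n * d / 2 - (n - 1)) / n        # ~ d/2 - 1
    # Upper bound: every peer forwards to d - 1 peers, the publisher to d.
    dup_max = (n * (d - 1) + 1 - (n - 1)) / n  # ~ d - 2
    return dup_min, dup_max


print(avg_duplicates_bounds(10_000, 6))  # (~2.0, ~4.0)
```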

Latency and bandwidth properties based on our current analysis are presented below.

Model Calculations

(generated with ./waku_scaling.py)

Load case 1 (store load; corresponds to received load per naive light node)

Assumptions/Simplifications:

  • A01. Message size (static): 2.05KB
  • A02. Messages sent per node per hour (static) (assuming no spam; but also no rate limiting.): 5
  • A03. The network topology is a d-regular graph of degree (static): 6
  • A04. Messages outside of Waku Relay are not considered, e.g. store messages.
  • A07. Single shard (i.e. single pubsub mesh)
  • A21. Store nodes do not store duplicate messages.

For 100 users, receiving bandwidth is 1.0MB/hour
For 10k users, receiving bandwidth is 100.0MB/hour
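
A minimal sketch of this load case, assuming each message is received/stored exactly once (the function name and the KB-to-MB conversion factor of 1024 are assumptions made to match the printed figures):

```python
# Illustrative sketch of load case 1: every message is stored/received once.

def store_load_mb_per_hour(n_users: int,
                           msg_size_kb: float = 2.05,
                           msgs_per_user_per_hour: int = 5) -> float:
    total_kb = n_users * msgs_per_user_per_hour * msg_size_kb
    return total_kb / 1024  # KB -> MB (binary)


print(store_load_mb_per_hour(100))     # ~1.0 MB/hour
print(store_load_mb_per_hour(10_000))  # ~100.0 MB/hour
```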


Load case 2 (received load per node)

Assumptions/Simplifications:

  • A01. Message size (static): 2.05KB
  • A02. Messages sent per node per hour (static) (assuming no spam; but also no rate limiting.): 5
  • A03. The network topology is a d-regular graph of degree (static): 6
  • A04. Messages outside of Waku Relay are not considered, e.g. store messages.
  • A05. Messages are only sent once along an edge. (requires delays before sending)
  • A07. Single shard (i.e. single pubsub mesh)
  • A31. Gossip is not considered.

For 100 users, receiving bandwidth is 3.0MB/hour
For 10k users, receiving bandwidth is 300.0MB/hour


Load case 3 (received load per node)

Assumptions/Simplifications:

  • A01. Message size (static): 2.05KB
  • A02. Messages sent per node per hour (static) (assuming no spam; but also no rate limiting.): 5
  • A03. The network topology is a d-regular graph of degree (static): 6
  • A04. Messages outside of Waku Relay are not considered, e.g. store messages.
  • A06. Messages are sent to all d-1 neighbours as soon as receiving a message (current operation)
  • A07. Single shard (i.e. single pubsub mesh)
  • A31. Gossip is not considered.

For 100 users, receiving bandwidth is 5.0MB/hour
For 10k users, receiving bandwidth is 500.0MB/hour
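
The following sketch reproduces load cases 2 and 3, assuming each node receives d/2 copies per message under A05 (one transmission per edge) and d-1 copies under A06 (immediate forwarding); it is illustrative and not taken from ./waku_scaling.py:

```python
# Illustrative sketch of load cases 2 and 3: copies of each message received
# per relay node under the two forwarding assumptions (A05 vs A06).

def relay_load_mb_per_hour(n_users: int, copies_per_message: float,
                           msg_size_kb: float = 2.05,
                           msgs_per_user_per_hour: int = 5) -> float:
    messages_per_hour = n_users * msgs_per_user_per_hour
    return messages_per_hour * copies_per_message * msg_size_kb / 1024


d = 6
print(relay_load_mb_per_hour(100, d / 2))     # case 2 (A05): ~3.0 MB/hour
print(relay_load_mb_per_hour(100, d - 1))     # case 3 (A06): ~5.0 MB/hour
print(relay_load_mb_per_hour(10_000, d - 1))  # case 3 (A06): ~500 MB/hour
```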


Load case 4 (received load per node incl. gossip)

Assumptions/Simplifications:

  • A01. Message size (static): 2.05KB
  • A02. Messages sent per node per hour (static) (assuming no spam; but also no rate limiting.): 5
  • A03. The network topology is a d-regular graph of degree (static): 6
  • A04. Messages outside of Waku Relay are not considered, e.g. store messages.
  • A06. Messages are sent to all d-1 neighbours as soon as receiving a message (current operation)
  • A07. Single shard (i.e. single pubsub mesh)
  • A32. Gossip message size (IHAVE/IWANT) (static): 0.05KB
  • A33. Ratio of IHAVEs followed-up by an IWANT (incl. the actual requested message): 0.01

For 100 users, receiving bandwidth is 5.6MB/hour
For 10k users, receiving bandwidth is 563.0MB/hour
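
A hedged sketch of this case, adding the per-message gossip overhead (18 IHAVEs plus answered IWANTs) to the relay load of case 3; the exact accounting in ./waku_scaling.py may differ slightly:

```python
# Illustrative sketch of load case 4: relay load (d-1 copies per message) plus
# the per-message gossip overhead (18 IHAVEs, plus answered IWANTs).

def relay_plus_gossip_mb_per_hour(n_users: int, d: int = 6,
                                  msg_size_kb: float = 2.05,
                                  msgs_per_user_per_hour: int = 5,
                                  gossip_msg_size_kb: float = 0.05,
                                  mcache_gossip: int = 3, d_lazy: int = 6,
                                  avg_ratio_gossip_replys: float = 0.01) -> float:
    messages_per_hour = n_users * msgs_per_user_per_hour
    relay_kb = messages_per_hour * (d - 1) * msg_size_kb
    ihaves = mcache_gossip * d_lazy  # 18 IHAVEs per message
    gossip_kb = messages_per_hour * (
        ihaves * gossip_msg_size_kb
        + ihaves * avg_ratio_gossip_replys * (gossip_msg_size_kb + msg_size_kb)
    )
    return (relay_kb + gossip_kb) / 1024


print(relay_plus_gossip_mb_per_hour(100))     # ~5.6 MB/hour
print(relay_plus_gossip_mb_per_hour(10_000))  # ~563 MB/hour
```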


Load case 5 (received load per node with IDONTWANT messages, excl. gossip)

Assumptions/Simplifications:

  • A01. Message size (static): 2.05KB
  • A02. Messages sent per node per hour (static) (assuming no spam; but also no rate limiting.): 5
  • A03. The network topology is a d-regular graph of degree (static): 6
  • A04. Messages outside of Waku Relay are not considered, e.g. store messages.
  • A06. Messages are sent to all d-1 neighbours as soon as receiving a message (current operation)
  • A07. Single shard (i.e. single pubsub mesh)
  • A16. There exists at most one peer edge between any two nodes.
  • A17. The peer network is connected.
  • A34. Gossip message size for IDONTWANT (static): 0.05KB
  • A35. Ratio of messages that are big enough to trigger an IDONTWANT response: 0.2
  • A36. Ratio of big messages that are avoided due to IDONTWANT: 1.67
  • A37. Size of messages large enough to trigger IDONTWANT (static): 6.14KB

For 100 users, receiving bandwidth is 4.0MB/hour
For 10k users, receiving bandwidth is 400.4MB/hour


load sharding case 1 (received load per node incl. gossip)

Assumptions/Simplifications:

  • A01. Message size (static): 2.05KB
  • A02. Messages sent per node per hour (static) (assuming no spam; but also no rate limiting.): 5
  • A03. The network topology is a d-regular graph of degree (static): 6
  • A04. Messages outside of Waku Relay are not considered, e.g. store messages.
  • A06. Messages are sent to all d-1 neighbours as soon as receiving a message (current operation)
  • A08. Multiple shards; mapping of content topic (multicast group) to shard is 1 to 1
  • A09. Max number of nodes per shard (static) 10000
  • A10. Number of shards a given node is part of (static) 3
  • A11. Number of nodes in the network is variable. These nodes are distributed evenly over 3 shards. Once all of these shards have 10000 nodes, new shards are spawned. These new shards have no influence on this model, because the nodes we look at are not part of these new shards.
  • A32. Gossip message size (IHAVE/IWANT) (static): 0.05KB
  • A33. Ratio of IHAVEs followed-up by an IWANT (incl. the actual requested message): 0.01

For 100 users, receiving bandwidth is 5.7MB/hour
For 10k users, receiving bandwidth is 563.0MB/hour
For 1m users, receiving bandwidth is 1689.0MB/hour
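
A sketch of this sharding case, assuming the load is determined by the nodes visible across the 3 shards a node is part of, each capped at 10000 nodes; the per-shard rounding (math.ceil) is an assumption, so small network sizes may deviate slightly from the model output:

```python
import math

# Illustrative sketch of load sharding case 1: a node is part of 3 shards,
# each capped at 10000 nodes, so its load is bounded regardless of network size.

def sharded_load_mb_per_hour(n_users: int, shards_per_node: int = 3,
                             max_nodes_per_shard: int = 10_000,
                             d: int = 6, msg_size_kb: float = 2.05,
                             msgs_per_user_per_hour: int = 5) -> float:
    nodes_per_shard = min(math.ceil(n_users / shards_per_node),
                          max_nodes_per_shard)
    visible_nodes = nodes_per_shard * shards_per_node
    # Per-message cost per node: d-1 relayed copies + 18 IHAVEs + answered IWANTs.
    per_msg_kb = (d - 1) * msg_size_kb + 18 * 0.05 + 18 * 0.01 * (0.05 + msg_size_kb)
    return visible_nodes * msgs_per_user_per_hour * per_msg_kb / 1024


print(sharded_load_mb_per_hour(100))        # ~5.7 MB/hour
print(sharded_load_mb_per_hour(10_000))     # ~563 MB/hour
print(sharded_load_mb_per_hour(1_000_000))  # ~1689 MB/hour
```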


load sharding case 2 (received load per node incl. gossip and 1:1 chat)

Assumptions/Simplifications:

  • A01. Message size (static): 2.05KB
  • A02. Messages sent per node per hour (static) (assuming no spam; but also no rate limiting.): 5
  • A03. The network topology is a d-regular graph of degree (static): 6
  • A04. Messages outside of Waku Relay are not considered, e.g. store messages.
  • A06. Messages are sent to all d-1 neighbours as soon as receiving a message (current operation)
  • A08. Multiple shards; mapping of content topic (multicast group) to shard is 1 to 1
  • A09. Max number of nodes per shard (static) 10000
  • A10. Number of shards a given node is part of (static) 3
  • A11. Number of nodes in the network is variable. These nodes are distributed evenly over 3 shards. Once all of these shards have 10000 nodes, new shards are spawned. These new shards have no influence on this model, because the nodes we look at are not part of these new shards.
  • A12. Including 1:1 chat. Messages sent to a given user are sent into a 1:1 shard associated with that user's node. Effectively, 1:1 chat adds a receive load corresponding to one additional shard a given node has to be part of.
  • A13. 1:1 chat messages sent per node per hour (static): 5
  • A14. 1:1 chat shards are filled one by one (not evenly distributed over the shards). This acts as an upper bound and overestimates the 1:1 load for lower node counts.
  • A32. Gossip message size (IHAVE/IWANT) (static): 0.05KB
  • A33. Ratio of IHAVEs followed-up by an IWANT (incl. the actual requested message): 0.01

For 100 users, receiving bandwidth is 11.3MB/hour
For 10k users, receiving bandwidth is 1126.0MB/hour
For 1m users, receiving bandwidth is 2252.0MB/hour
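
A sketch of this case, assuming 1:1 chat adds one additional shard's worth of load on top of load sharding case 1 (A12-A14); the rounding details are assumptions, so small network sizes may deviate slightly from the model output:

```python
import math

# Illustrative sketch of load sharding case 2: the community load of sharding
# case 1 plus one additional shard's worth of 1:1 chat traffic (A12-A14).

def sharded_load_with_1to1_mb_per_hour(n_users: int, d: int = 6,
                                       msgs_per_user_per_hour: int = 5) -> float:
    # Per-message cost per node: d-1 relayed copies + 18 IHAVEs + answered IWANTs.
    per_msg_kb = (d - 1) * 2.05 + 18 * 0.05 + 18 * 0.01 * (0.05 + 2.05)
    community_nodes = min(math.ceil(n_users / 3), 10_000) * 3  # 3 community shards
    one_to_one_nodes = min(n_users, 10_000)  # 1:1 shards filled one by one (A14)
    visible_nodes = community_nodes + one_to_one_nodes
    return visible_nodes * msgs_per_user_per_hour * per_msg_kb / 1024


print(sharded_load_with_1to1_mb_per_hour(10_000))     # ~1126 MB/hour
print(sharded_load_with_1to1_mb_per_hour(1_000_000))  # ~2252 MB/hour
```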


load sharding case 3 (received load per naive light node)

Assumptions/Simplifications:

  • A01. Message size (static): 2.05KB
  • A02. Messages sent per node per hour (static) (assuming no spam; but also no rate limiting.): 5
  • A03. The network topology is a d-regular graph of degree (static): 6
  • A04. Messages outside of Waku Relay are not considered, e.g. store messages.
  • A06. Messages are sent to all d-1 neighbours as soon as receiving a message (current operation)
  • A08. Multiple shards; mapping of content topic (multicast group) to shard is 1 to 1
  • A09. Max number of nodes per shard (static) 10000
  • A10. Number of shards a given node is part of (static) 3
  • A15. Naive light node. Requests all messages in shards that have (large) 1:1 mapped multicast groups the light node is interested in.
  • A32. Gossip message size (IHAVE/IWANT) (static): 0.05KB
  • A33. Ratio of IHAVEs followed-up by an IWANT (incl. the actual requested message): 0.01

For 100 users, receiving bandwidth is 1.0MB/hour
For 10k users, receiving bandwidth is 100.0MB/hour
For 1m users, receiving bandwidth is 300.0MB/hour


Latency case 1 :: Topology: 6-regular graph. No gossip (note: gossip would help here)

Assumptions/Simplifications:

  • A03. The network topology is a d-regular graph of degree (static): 6
  • A41. Delay is calculated based on an upper bound of the expected distance.
  • A42. Average delay per hop (static): 0.1s.
  • A43. Average peer bandwidth (static): 30Mbps.

For 100 users, the expected network-wide dissemination time is 0.301 s
For 10k users, the expected network-wide dissemination time is 0.733 s (max with sharding)
For 1m users, the expected network-wide dissemination time is 18.578 s (in a single shard)


vs old model

We assume a node is not relaying messages, but only sending.

This is useful for establishing a lower bound independent of network topology, but underestimates the load by a factor (which is practically relevant). It ignores duplicate receives.

This is why we switch to messages_sent_per_day instead of assuming a number of received messages. The number of messages received depends on the topology.

todo

  • add case:
    • load: Status control messages
    • load: light node with selective requests

Future Work

  • new case class: peers per node (various cases)
    • this matters because too many peers -> too many connections
  • generalize from averages to distributions (e.g. multicast group sizes, latency, ...)
  • distributed store