equations/discussion for latency, bandwidth and duplicates added. some equations corrected!

ufarooqstatus
2025-09-28 23:55:13 +02:00
parent b367397a8b
commit 7ce805c040


@@ -50,14 +50,14 @@ Assume node A is connected to B and C, and B and C are connected to D.
A relays a message to both B and C, both of which will relay the message to D, so D receives the message twice.
With local routing decisions (and no additional control messages),
the number of edges in the graph is a lower bound for the number of hop-to-hop transmissions of a single message.
A d-regular graph has $\frac{(n \times d)}{2}$ edges (proof [here](https://proofwiki.org/wiki/Number_of_Edges_of_Regular_Graph)).
However, there is another practical problem (thanks @Menduist for pointing this out):
Two nodes that both just got the message might relay the message to each other before either of them registers the receipt.
So, practically a message can go twice over the same edge.
Each node actually sends the message to `d-1` peers, which leads to a duplication factor of $n \times (d-1)$ hop-to-hop transmissions per message.
Waiting before transmitting can lower this bound to somewhere between $\frac{(n \times d)}{2}$ and $n \times (d-1)$.
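As a quick sanity check, the sketch below evaluates both bounds for illustrative values of $n$ and $d$ (the numbers are assumptions, not measurements):

```python
# Sketch: bounds on hop-to-hop transmissions per message in a d-regular mesh.
# n (network size) and d (mesh degree) are illustrative assumptions.
n, d = 10_000, 8

lower = n * d // 2     # every edge carries the message exactly once
upper = n * (d - 1)    # every node forwards the message to d-1 peers

print(f"lower bound (edges): {lower}")   # 40000
print(f"upper bound n*(d-1): {upper}")   # 70000
```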
#### gossip
@@ -77,11 +77,47 @@ The follow-up causes an additional load of: `(gossip_message_size + message_size
Sharding makes it possible to bound the maximum load a node/user is exposed to.
Instead of scaling with the number of nodes in the network, the load scales with the number of nodes in all shards the node is part of.
## Latency and Bandwidth Analysis
In our first analysis, based on an upper bound on inter-node distance in d-regular graphs, latency properties are good (see results below).
However, this needs further investigation, especially with regard to the worst case and the upper percentiles.
Future work comprises a latency distribution (most likely, we will focus on other parts for the scaling MVP).
Assuming uniform link characteristics, message dissemination to the full-message mesh concludes in
$\tau_d \approx (d \times \tau_{tx}) + \tau_p$ time, where $\tau_{tx} = \frac{S}{R}$,
with $S$, $R$, and $\tau_p$ being the message size, data rate, and link latency, respectively.
This simplifies network-wide dissemination time to $\tau_{n} \approx \tau_d \times h$,
with $h$ indicating the number of hops along the longest path,
which can be roughly estimated as $h \approx \lceil \log_d(n) \rceil$.
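As a minimal worked example, the sketch below evaluates $\tau_d$, $h$, and $\tau_n$; all parameter values ($S$, $R$, $\tau_p$, $n$, $d$) are illustrative assumptions, not measurements:

```python
# Sketch: network-wide dissemination time estimate from the formulas above.
import math

S = 1_000_000        # message size in bits (~125 KB), assumed
R = 100_000_000      # data rate in bits/s (100 Mbps), assumed
tau_p = 0.05         # one-way link latency in seconds, assumed
n, d = 10_000, 8     # network size and mesh degree, assumed

tau_tx = S / R                    # time to transmit one message on a link
tau_d = d * tau_tx + tau_p        # time to cover one full-message mesh
h = math.ceil(math.log(n, d))     # rough estimate of hops along the longest path
tau_n = tau_d * h                 # network-wide dissemination time

print(f"tau_d = {tau_d:.3f} s, h = {h}, tau_n = {tau_n:.3f} s")
# tau_d = 0.130 s, h = 5, tau_n = 0.650 s
```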
During each hop, roughly $(d-1)^{x-1} \times d$ transmissions are made,
where $x$ represents the hop number and satisfies $1 \leq x \leq h$.
A publisher transmitting to a higher number of peers (floodpublish) can theoretically lower latency ($\tau_n$)
by increasing the transmissions in each round to $(d-1)^{x-1} \times (f+d)$,
where $f$ represents the number of peers included in floodpublish.
Similarly, the use of IWANT messages can also reduce latency by enabling distant peers to promptly retrieve overdue messages.
However, the impact of floodpublish and IWANT messages is not considered in the latency computations above.
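To illustrate the per-hop transmission counts, the sketch below enumerates both cases; $d$, $h$, and the floodpublish fan-out $f$ are illustrative assumptions:

```python
# Sketch: approximate transmissions per hop, with and without floodpublish.
d, h, f = 8, 5, 16   # mesh degree, hop count, floodpublish fan-out (assumed)

for x in range(1, h + 1):
    mesh_only = (d - 1) ** (x - 1) * d         # mesh forwarding only
    with_flood = (d - 1) ** (x - 1) * (f + d)  # publisher also floods to f extra peers
    print(f"hop {x}: mesh ~{mesh_only}, with floodpublish ~{with_flood}")
```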
Regarding bandwidth utilization, a network comprising $n$ peers, each with degree $d$,
has a total of $\frac {n \times d}{2}$ edges (links), as every link connects two peers.
Assuming that a message traverses every link exactly once,
we get at least $\frac {n \times d}{2}$ transmissions for each message.
Only $n-1$ transmissions are necessary for delivering a message to all peers.
As a result, we get $\frac {n \times d}{2} -(n-1)$ duplicates in the network.
Dividing by the $n$ peers, the average number of duplicates received by a single peer simplifies to $\bar{d}_{min} \approx \frac{d}{2}-1$.
Here, $\bar{d}_{min}$ represents the lower bound on average duplicates
because we assume that the send and receive operations are mutually exclusive.
This assumption requires that message transmission times (and link latencies) are so small that
no two peers simultaneously transmit the same message to each other.
However, a large message can noticeably increase the contention interval,
which increases the likelihood that many peers will simultaneously transmit the same message to each other.

For the maximum bandwidth utilization,
we assume that all peers forward every received message to $d-1$ peers
while the original publisher sends it to $d$ peers.
As a result, we get $n(d-1)+1$ transmissions in the network.
Only $n-1$ transmissions are necessary to deliver a message to all peers.
The remaining $n(d-1)+1-(n-1) = n(d-2)+2$ transmissions are duplicates,
which simplifies the upper bound on the average number of duplicates received by each peer to $\bar{d}_{max} \approx d-2$.
This rise from $\bar{d}_{min}$ to $\bar{d}_{max}$ indicates that larger messages can lead to more duplicates due to longer contention intervals.
Note that the impact of IWANT/IDONTWANT messages is not considered in the $\bar{d}$ computations above.
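To make the two bounds concrete, the sketch below evaluates $\bar{d}_{min}$ and $\bar{d}_{max}$ for illustrative values of $n$ and $d$ (assumptions, not measurements):

```python
# Sketch: average duplicates per peer, lower and upper bounds from the analysis above.
n, d = 10_000, 8     # network size and mesh degree, assumed

tx_min = n * d // 2           # every link used exactly once
tx_max = n * (d - 1) + 1      # all peers forward to d-1 peers, publisher to d
useful = n - 1                # transmissions strictly needed to reach all peers

dup_min_avg = (tx_min - useful) / n   # ~ d/2 - 1
dup_max_avg = (tx_max - useful) / n   # ~ d - 2

print(f"avg duplicates per peer: {dup_min_avg:.2f} (min) to {dup_max_avg:.2f} (max)")
# avg duplicates per peer: 3.00 (min) to 6.00 (max)
```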
Latency and bandwidth properties based on our current analysis are presented below.
## Model Calculations