mirror of
https://github.com/vacp2p/vac-book.git
synced 2026-01-09 21:37:59 -05:00
feat: move status scaling research from hackmd
55
scratch/dos-prevention/status-community-semaphore.md
Normal file
@@ -0,0 +1,55 @@
# Community DoS protection with Semaphore

## Background

This proposal takes inspiration from [Alvaro's DoS protection scheme, i.e., splitting app-level and network-level validation](https://github.com/vacp2p/research/issues/164#issuecomment-1418967178).

waku-rln-relay[^1][^2], in its current state, is not production ready. There are open problems being worked on, including, but not limited to, robustness[^3], circuit security[^4], and performance[^5].

After these open problems are addressed, we may look at using RLN for community DoS protection.

RLN is based on Semaphore[^6], with the added functionality of slashing. However, slashing is very rarely required when the spam comes from a *trusted* set of peers.

> *trusted* implies that the peer has been verified (out of band) before being added to a set of trusted peers.

Semaphore's latest version[^7] has been audited and can be considered safe to use in production.

## Assumptions

- There is no need to slash a community member spamming messages.
- We simply route the messages if the member belongs to the Semaphore group, and drop them if not.

## Optional

An optional prerequisite for this solution is to use the Waku Message UID (MUID)[^8] and to update how the `COMMUNITY_DESCRIPTION` message is disseminated.

- Currently, the `COMMUNITY_DESCRIPTION` message is sent at fixed intervals *and* whenever the metadata of the community has changed.
- When a new member is added to the community, their public key is appended to a list of members' public keys, which is broadcast with the `COMMUNITY_DESCRIPTION` message.

This change proposes that the `COMMUNITY_DESCRIPTION` broadcast at fixed intervals merely carries a reference to the latest change in the community metadata, i.e., the MUID of the message that was broadcast when the change happened and that includes the change.

Sending the MUID reduces message sizes, and old messages can be retrieved through Waku store lookups.

## Working

This method requires all members to keep a local copy of a sparse Merkle tree containing the commitments of the community's members (each commitment is sent out of band to the community owner). Having the MUID of the latest tree state broadcast by the community owner is very useful here.

Whenever a member wishes to send a message, they attach a Semaphore proof to it (increasing message size by *TODO*).

Messages with missing or invalid proofs can be dropped, thereby reducing their propagation through the network.

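The relay-side rule described in this section can be sketched as a small validator. This is only an illustration under stated assumptions: it does not call any real Semaphore library, and `CommunityMessage`, `ProofVerifier` and `should_relay` are hypothetical names. The actual proof check is passed in as a callable so that the routing decision stays independent of the proving system.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class CommunityMessage:
    payload: bytes
    proof: Optional[bytes]  # serialized Semaphore proof, None if absent


# Hypothetical verifier signature: (proof, payload, group_root) -> bool.
ProofVerifier = Callable[[bytes, bytes, bytes], bool]


def should_relay(msg: CommunityMessage, group_root: bytes, verify: ProofVerifier) -> bool:
    """Relay only messages whose proof verifies against the local group root."""
    if msg.proof is None:
        return False  # no proof attached: drop
    return verify(msg.proof, msg.payload, group_root)
```

A node would call `should_relay` with the root of its local copy of the member tree and drop the message whenever the result is `False`.
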
## Conclusion

RLN and Semaphore-based DoS protection can be used in tandem: RLN for channels with slow mode enabled, and Semaphore for all other channels.

## References

[^1]: https://rfc.vac.dev/spec/17/
[^2]: https://rfc.vac.dev/spec/32/
[^3]: https://github.com/waku-org/nwaku/issues/1501
[^4]: https://github.com/Rate-Limiting-Nullifier/rln-circuits/pull/7
[^5]: https://github.com/waku-org/nwaku/issues/1501
[^6]: https://semaphore.appliedzkp.org/
[^7]: https://github.com/semaphore-protocol/semaphore/releases/tag/v3.0.0
[^8]: https://github.com/waku-org/pm/issues/9
48
scratch/status-scaling/community-description-napkin-math.md
Normal file
@@ -0,0 +1,48 @@
# Napkin math for Community description message sizes

1. Number of chats/channels: 25
2. Strings: 32 bytes (ens_name, display_name, magnet_uri, etc)
3. All keys: 32 bytes
4. Each member has a profile image
5. Each member has socials
6. Each member is granted access to all chats

Fixed overhead:

organization_id + clock + name + description + magnet_uri + permissions + chats + organization_identity + encryption_key + (ChatIdentity fields) ~ 728 bytes

Variable overhead:

number_of_members * ((1 grant * number_of_chats) + 1 image + 1 social link)

# Calculations

- 100 members
    - variable overhead: 104000 bytes
    - total size: 104728 bytes = 104.73 kb

- 1000 members
    - variable overhead: 1040000 bytes
    - total size: 1040728 bytes = 1.04 mb

- 10000 members
    - variable overhead: 10400000 bytes
    - total size: 10400728 bytes = 10.4 mb

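The arithmetic behind these estimates can be reproduced with a few lines of Python. This is a sketch, not part of the original notes: the per-member cost of 1040 bytes is an assumption derived from the 100-member figure (104000 / 100), and the constant and function names are hypothetical.

```python
FIXED_OVERHEAD_BYTES = 728          # fixed overhead quoted above
PER_MEMBER_BYTES = 104_000 // 100   # 1040 bytes, implied by the 100-member figure


def estimated_size_bytes(members: int) -> int:
    """Estimated CommunityDescription size: fixed part plus a linear per-member part."""
    return FIXED_OVERHEAD_BYTES + members * PER_MEMBER_BYTES


for members in (100, 1_000, 10_000):
    size = estimated_size_bytes(members)
    print(f"{members:>6} members: {size:>10} bytes = {size / 1000:.2f} kb")
```

Because the fixed part is only 728 bytes, the estimate is effectively linear in the number of members.
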
After accessing telemetry =>

- 139 members (math)
    - variable overhead: 144560 bytes
    - total size: 145288 bytes = 145.29 kb

- 139 members (actual)
    - variable overhead: ?
    - total size: 346049 bytes = 346 kb

At approximately 401 members, we will cross the configured maximum message size (1mb). Scaling the measured size linearly gives the following napkin math for 1000 and 10000 members respectively:

- 1000 members
    - total size: 2.489 mb

- 10000 members
    - total size: 24.89 mb

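The extrapolation from the telemetry figure can be sketched as follows. It assumes that the message size scales linearly with the member count and that "1mb" means 10^6 bytes; both are assumptions, as are the names used.

```python
MEASURED_BYTES = 346_049        # measured size for the 139-member community
MEASURED_MEMBERS = 139
MAX_MESSAGE_BYTES = 1_000_000   # assumption: "1mb" read as 10**6 bytes

bytes_per_member = MEASURED_BYTES / MEASURED_MEMBERS


def projected_size_bytes(members: int) -> float:
    """Linear projection of the measured size to another member count."""
    return members * bytes_per_member


# Largest member count whose projected size still fits in one message.
members_at_limit = int(MAX_MESSAGE_BYTES // bytes_per_member)

for members in (1_000, 10_000):
    print(f"{members} members: {projected_size_bytes(members) / 1_000_000:.3f} mb")
print(f"limit reached after {members_at_limit} members")
```
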
81
scratch/status-scaling/community-description-optimization.md
Normal file
@@ -0,0 +1,81 @@
# Optimizing the `CommunityDescription` dissemination

## Context

This document describes a solution for using Sparse Merkle Trees (SMT) on IPFS to distribute members' public keys in an organization. The solution allows organizations/communities to efficiently manage and verify the membership of their members in a trustless manner.

This is done to reduce the network overhead of broadcasting the `CommunityDescription` message, which grows with the number of members in an organization/community.

## Method

The proposed solution is to use Sparse Merkle Trees (SMT) on IPFS to distribute members' public keys.

The SMT is constructed with a set of leaf nodes, where each leaf node represents the public key of a member. The SMT can be updated by adding or removing leaf nodes as members are added to or removed from the organization. The root hash is then recalculated, and the new root hash is used to identify the SMT on the IPFS network.

The SMT can be stored on IPFS by adding the root hash to the IPFS network. The root hash can then be shared with the members of the organization, so they can retrieve the SMT from IPFS.

When a member wants to verify the membership of another member, they can use the SMT's proof mechanism to verify the presence of that member's public key in the SMT.

Therefore, the `CommunityDescription` protobuf changes from -

```protobuf=
message CommunityDescription {
  uint64 clock = 1;
  repeated bytes members = 2;
  OrganisationPermissions permissions = 3;
  ChatMessageIdentity identity = 5;
  repeated OrganisationChat chats = 6;
  // ... other fields
}
```

to -

```diff=
 message CommunityDescription {
   uint64 clock = 1;
-  repeated bytes members = 2;
+  bytes members = 2; // Note: we should be able to change repeated bytes to bytes as the wire type of both is the same (Type 2) (I may be wrong here)
   OrganisationPermissions permissions = 3;
   ChatMessageIdentity identity = 5;
   repeated OrganisationChat chats = 6;
 }
```

> Note: I have yet to explore viability of this solution for the other `repeated` field which may hold large amounts of data (chats)

## Napkin Math

- 100 members
    - 100 leaf nodes
    - 7 levels
    - Max 128 leaf nodes
    - Storage required: 100 * 32 = 3,200 bytes
- 1000 members
    - 1000 leaf nodes
    - 10 levels
    - Max 1024 leaf nodes
    - Storage required: 1000 * 32 = 32,000 bytes = 32 kb
- 10,000 members
    - 10,000 leaf nodes
    - 14 levels
    - Max 16384 leaf nodes
    - Storage required: 10000 * 32 = 320,000 bytes = 320 kb
- 100,000 members
    - 100,000 leaf nodes
    - 17 levels
    - Max 131072 leaf nodes
    - Storage required: 100000 * 32 = 3,200,000 bytes = 3.2 mb

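The figures in this list can be checked with a short script. This is a sketch under the assumptions that the tree is a plain binary tree sized to the smallest power of two that holds all members, and that only the 32-byte leaves are counted towards storage, as in the list; `tree_stats` is a hypothetical helper name.

```python
LEAF_SIZE_BYTES = 32  # one 32-byte public key per leaf, as in the list above


def tree_stats(members: int) -> tuple[int, int, int]:
    """Return (levels, max_leaves, leaf_storage_bytes) for a binary tree."""
    if members < 1:
        raise ValueError("members must be positive")
    # Smallest number of levels such that 2**levels >= members.
    levels = (members - 1).bit_length()
    return levels, 2 ** levels, members * LEAF_SIZE_BYTES


for members in (100, 1_000, 10_000, 100_000):
    levels, max_leaves, storage = tree_stats(members)
    print(f"{members:>7} members: {levels} levels, max {max_leaves} leaves, {storage} bytes")
```

Counting internal nodes as well would roughly double the storage, since a full binary tree with `2**levels` leaves has `2**(levels + 1) - 1` nodes in total.
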
The storage required is relatively small, and membership can be verified easily by the nodes.

The size of the `CommunityDescription` remains constant regardless of the number of members in the community.

> I have not verified the integrity of this math, please help!

## Security considerations

- Anyone can update the tree, but the owner distributes the `CommunityDescription`; hence, the owner becomes a single point of failure. If the owner node is compromised, an arbitrary CID can be distributed, leading community members to believe that there is a different set of members.
    - This can be mitigated by a few members keeping the member tree in memory and computing the root hash themselves. If the local CID matches the CID distributed by the owner, then the members can verify that the computation was done correctly.

## Future Work

- The storage can be done on a variety of platforms which have support for content-addressable storage (Codex?)

@@ -0,0 +1,94 @@
# Waku pubsub topic sharding

## Context
The following document provides an overview of the Waku pubsub topic sharding method, which is based on the [35/WAKU2-NOISE](https://rfc.vac.dev/spec/35/) and [23/WAKU2-TOPICS](https://rfc.vac.dev/spec/23/#23waku2-topics) RFCs.

## Method
The Waku pubsub topic sharding method is based on the use of a shared secret key derived from a Diffie-Hellman key exchange, and a deterministic hash function.
The method is described as follows:

1. The two parties, Alice and Bob, establish a shared secret key using a Diffie-Hellman key exchange.

2. The shared secret key is used as an input to a deterministic hash function, such as SHA256, to generate a new topic for the next message.

$$
pubsub\_topic = sha256(shared\_sk \ mod \ privacy\_parameter)
$$

3. For each subsequent message, the shared secret key (which has been recomputed) is used as the input to the hash function again to generate a new topic.

4. (optional) To ensure the topic is unique, a nonce is concatenated to the shared secret key before hashing.

$$
pubsub\_topic = sha256((shared\_sk \ mod \ privacy\_parameter) ⌢ nonce)
$$

5. (optional) To ensure the topic is different for each message, a counter is concatenated to the shared secret key before hashing.

$$
pubsub\_topic = sha256((shared\_sk ⌢ ctr) \ mod \ privacy\_parameter)
$$

6. (optional) To reduce computational overhead of calculating a new pubsub topic for each message, a time window can be agreed upon between Alice and Bob.
For example, a 24hr window can be negotiated if it is within the constraints that both Alice and Bob agree to.

Therefore, the pubsub topic is calculated in the following manner

$$
pubsub\_topic = sha256((shared\_sk ⌢ timestamp) \ mod \ privacy\_parameter)
$$

Note that this approach would require both Alice and Bob to keep track of the shared secret key used at the start of the negotiation to derive the next pubsub topic.
This cache can be erased when the negotiation process restarts at the end of the window.

The `privacy_parameter` can be set by the peers, and must be agreed on beforehand between Alice and Bob.
The lower the `privacy_parameter` is, the smaller the set of possible pubsub topics becomes, which offers k-anonymity based privacy benefits.
The higher the `privacy_parameter` is, the larger the set of possible pubsub topics becomes; therefore, performance is increased at the cost of privacy.

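To make the role of `privacy_parameter` concrete, here is a minimal sketch of the base derivation `sha256(shared_sk mod privacy_parameter)`, with an optional suffix for the nonce variant. The document does not specify encodings, so the following are assumptions made purely for illustration: the function name `derive_pubsub_topic`, reading the key as a big-endian unsigned integer, the byte width used to serialize the reduced value, and the hex encoding of the resulting topic.

```python
import hashlib


def derive_pubsub_topic(shared_sk: bytes, privacy_parameter: int, suffix: bytes = b"") -> str:
    """Hypothetical helper: sha256((shared_sk mod privacy_parameter) || suffix) as hex."""
    if privacy_parameter < 1:
        raise ValueError("privacy_parameter must be a positive integer")
    # Assumption: the key is read as a big-endian unsigned integer before reduction.
    reduced = int.from_bytes(shared_sk, "big") % privacy_parameter
    # Assumption: the reduced value is serialized with just enough bytes to hold it.
    width = max(1, (privacy_parameter.bit_length() + 7) // 8)
    return hashlib.sha256(reduced.to_bytes(width, "big") + suffix).hexdigest()


# With privacy_parameter = 4, every key maps onto one of at most 4 topics.
keys = [i.to_bytes(32, "big") for i in range(100)]
topics = {derive_pubsub_topic(key, 4) for key in keys}
print(len(topics))
```

Because the reduction happens before hashing, the number of distinct topics is bounded by `privacy_parameter`: a small value makes many key pairs share a topic, while a large value spreads them over more topics.
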
It's important to note that the shared secret key changes with each message sent, as per Noise processing rules.
This allows updates to the shared secret key by hashing the result of an ephemeral-ephemeral Diffie-Hellman exchange every 1-RTT communication. Therefore, even if privacy guarantees are lost with one message, they can be regained with the next message, if Alice signals to Bob (or vice versa) to change the `privacy_parameter`.

One can argue that this approach may lead to failure in forming a mesh. This concern is addressed under the assumptions that:
- the number of community-run nodes is high, and they participate in meshing
- peers are incentivised to relay messages on a pubsub topic (incentivisation is yet to be solved)

If any of these assumptions are invalid, the peers can resort to using a lower `privacy_parameter`, which would result in a commonly used pubsub topic, and piggyback on other peers that have a vested interest in that pubsub topic.

## Security Considerations
The Waku pubsub topic sharding method has several potential security considerations that must be taken into account, including:

1. The security of the shared secret key depends on the security of the Diffie-Hellman key exchange, which can be vulnerable to various attacks if not implemented correctly.

2. The privacy benefits of k-anonymity depend on the value of the privacy parameter. A low value of the privacy parameter may provide stronger privacy guarantees, but at the cost of lower performance. A high value of the privacy parameter may provide better performance, but at the cost of weaker privacy guarantees.

3. An attacker who is able to obtain the shared secret key or the privacy parameter may be able to determine the topic of a message and potentially intercept it.

4. The technique of updating the shared secret key with each message can be useful in protecting against eavesdropping, but it also requires additional computation and communication overhead.

5. The technique of updating the shared secret key with each message also requires a secure signaling mechanism to signal the privacy parameter between Alice and Bob.

Overall, it is important to carefully consider the trade-offs between performance and privacy when implementing the Waku pubsub topic sharding method, and to ensure that the key exchange and signaling mechanisms are implemented securely.

## Future research

- Peer incentivisation
- Check if this model can apply to [5/SECURE-TRANSPORT](https://specs.status.im/spec/5)'s cryptography

## Appendix A - Multicast group chats

In the context of Status Communities, where single ratchet encryption is used, the nonce used could be the channel name, thereby sharding communities at the channel level.

$$
pubsub\_topic = sha256((ratchet\_key ⌢ channel\_name) \ mod \ privacy\_parameter)
$$

One benefit of using this method is that all community participants are aware of the derivation, and can derive the set of pubsub topics every time there is a key change (when a member is added to or removed from a community).

## Appendix B - Message reconciliation and ordering

A method for reconciling and ordering messages across different pubsub topics while preserving privacy is as follows:

The sender can include a unique, monotonically increasing sequence number for each message.
This allows the receiver to order the messages based on the sequence number, regardless of which pubsub topic the messages were sent on.

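The receiver-side ordering can be sketched as below. The sketch assumes a single sender whose sequence numbers are unique, and treats a repeated sequence number as a duplicate delivery; `ReceivedMessage` and `reconcile` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReceivedMessage:
    seq: int            # sender-assigned, monotonically increasing
    pubsub_topic: str   # topic the message arrived on
    payload: bytes


def reconcile(messages: list[ReceivedMessage]) -> list[ReceivedMessage]:
    """Merge messages from all topics into sender order, dropping duplicates by seq."""
    unique = {m.seq: m for m in messages}
    return [unique[seq] for seq in sorted(unique)]
```

The pubsub topic plays no role in the ordering, which is why messages spread across many topics can still be reassembled.
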
The fault-tolerant characteristic of the underlying store protocol is out of scope for this spec.
552
scratch/status-scaling/status-telemetry-analysis.md
Normal file
@@ -0,0 +1,552 @@
# Status Telemetry Analysis

> All the following data comes from Status desktop clients running version `0.10.0rc`

> There are ~30 members of the status community running this version

## Methodology

The [Status superset](https://superset.infra.status.im/superset/dashboard/13/?native_filters_key=IaOrPqizciMolxZRbRhk0vAFWWeVt8jxeuTzyiawNKIXpuJRqn5E9KvSnvOoUgGa) runs on top of the Postgres database that holds all the telemetry data.

These queries were run directly on the database, with the following index added -

```sql=
CREATE INDEX idx_receivedmessages_messagetype_messagesize_chatid
ON receivedmessages (messagetype, messagesize, chatid);
```

which speeds up the queries on the sizes of each `messagetype` being broadcast

## Main query

`messagetypes` ordered by average size, descending -

```sql
SELECT
  messagetype,
  AVG(messagesize)
FROM
  receivedmessages
WHERE
  messagesize > 0
GROUP BY
  messagetype
ORDER BY
  AVG(messagesize) DESC
LIMIT
  30;
```

which yields

```
            messagetype             |          avg
------------------------------------+-----------------------
 MEMBERSHIP_UPDATE_MESSAGE          |    67950.887645733805
 COMMUNITY_DESCRIPTION              |    48434.633391221031
 UNKNOWN                            |    32869.262119978472
 BACKUP                             |    29259.738297689599
 COMMUNITY_REQUEST_TO_JOIN_RESPONSE |    24488.108695652174
 SYNC_PROFILE_PICTURE               |    19550.000000000000
 CONTACT_CODE_ADVERTISEMENT         | 7421.5858845617061659
 CHAT_IDENTITY                      | 6109.9454022988505747
 CHAT_MESSAGE                       | 4670.8508814547473626
 SYNC_INSTALLATION_COMMUNITY        | 3597.6666666666666667
 CONTACT_UPDATE                     | 1981.9735152487961477
 EDIT_MESSAGE                       | 1616.5792079207920792
 ACCEPT_CONTACT_REQUEST             | 1337.4227272727272727
 SYNC_CHAT_MESSAGES_READ            | 1114.2974738675958188
 SYNC_INSTALLATION_CONTACT          | 1094.3000000000000000
 RETRACT_CONTACT_REQUEST            |  945.1025641025641026
 SYNC_ACTIVITY_CENTER_READ          |  842.8750000000000000
 REQUEST_CONTACT_VERIFICATION       |  773.0000000000000000
 SYNC_CHAT_REMOVED                  |  754.0000000000000000
 SYNC_CONTACT_REQUEST_DECISION      |  746.5000000000000000
 DELETE_MESSAGE                     |  735.8549450549450549
 COMMUNITY_ARCHIVE_MAGNETLINK       |  636.1341371514694800
 EMOJI_REACTION                     |  606.1040987716219604
 PIN_MESSAGE                        |  483.2560000000000000
 PAIR_INSTALLATION                  |  459.7571428571428571
 STATUS_UPDATE                      |  338.4547629848038104
 PUSH_NOTIFICATION_QUERY            |  137.0000000000000000
 COMMUNITY_REQUEST_TO_JOIN          |  125.1176470588235294
 COMMUNITY_CANCEL_REQUEST_TO_JOIN   |  122.0000000000000000
 COMMUNITY_REQUEST_TO_LEAVE         |  112.0000000000000000
```

Ignoring `UNKNOWN`, we will run queries for the top 5, i.e.

- MEMBERSHIP_UPDATE_MESSAGE
- COMMUNITY_DESCRIPTION
- BACKUP
- COMMUNITY_REQUEST_TO_JOIN_RESPONSE
- SYNC_PROFILE_PICTURE

## Queries

### 1. `MEMBERSHIP_UPDATE_MESSAGE`

1. Window of `MEMBERSHIP_UPDATE_MESSAGE` sizes sent by different groups(?) -

```sql
SELECT
  left(chatid, 20) as trunc_chat_id,
  messagetype,
  messagesize,
  sentat
FROM
  receivedmessages
WHERE
  messagetype = 'MEMBERSHIP_UPDATE_MESSAGE'
  AND messagesize > 0
ORDER BY sentat DESC -- to get the latest sizes
LIMIT
  10;
```

which yields

```
    trunc_chat_id     |        messagetype        | messagesize |   sentat
----------------------+---------------------------+-------------+------------
 contact-discovery-13 | MEMBERSHIP_UPDATE_MESSAGE |       15769 | 1677850449
 contact-discovery-13 | MEMBERSHIP_UPDATE_MESSAGE |       15502 | 1677850436
 0x045e098c95c9639719 | MEMBERSHIP_UPDATE_MESSAGE |       45282 | 1677850426
 contact-discovery-85 | MEMBERSHIP_UPDATE_MESSAGE |       45320 | 1677850397
 0x0413cd82a2df9c6c4b | MEMBERSHIP_UPDATE_MESSAGE |       22780 | 1677850357
 contact-discovery-13 | MEMBERSHIP_UPDATE_MESSAGE |       67856 | 1677850357
 contact-discovery-38 | MEMBERSHIP_UPDATE_MESSAGE |       67856 | 1677850357
 0x0413cd82a2df9c6c4b | MEMBERSHIP_UPDATE_MESSAGE |       22780 | 1677850356
 contact-discovery-85 | MEMBERSHIP_UPDATE_MESSAGE |       45320 | 1677850356
 contact-discovery-48 | MEMBERSHIP_UPDATE_MESSAGE |       67856 | 1677850356
```

2. Average `MEMBERSHIP_UPDATE_MESSAGE` size -

```sql
SELECT
  AVG(messagesize)
FROM
  receivedmessages
WHERE
  messagetype = 'MEMBERSHIP_UPDATE_MESSAGE'
  AND messagesize > 0;
```

which yields

```
        avg
--------------------
 67937.954692116552
```

~ 67kb

3. Checking the broadcast frequency is not relevant here; the message is sent ad hoc, afaik

4. Number of `MEMBERSHIP_UPDATE_MESSAGE` in 1 day -

```sql
SELECT
  COUNT(*)
FROM
  receivedmessages
WHERE
  messagetype = 'MEMBERSHIP_UPDATE_MESSAGE'
  AND messagesize > 0
  -- 24-hour window (86400 s); BETWEEN needs the lower bound first
  AND sentat BETWEEN 1677764270
  AND 1677850670;
```

which yields

```
 count
-------
  2577
```

which amounts to 2577 * 67kb ≈ 172mb in one day. Can someone from the Status team explain this message type's usage?

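The daily volume can be recomputed from the two query results above. This is a sketch with a hypothetical helper name; it uses the unrounded average rather than the truncated 67 kb, so it lands slightly above the ~172 mb figure.

```python
MESSAGES_PER_DAY = 2577                 # count returned by the 1-day query
AVG_SIZE_BYTES = 67_937.954692116552    # average size returned by query 2


def daily_volume_bytes(count: int, avg_size_bytes: float) -> float:
    """Total bytes received per day for one message type."""
    return count * avg_size_bytes


volume = daily_volume_bytes(MESSAGES_PER_DAY, AVG_SIZE_BYTES)
print(f"{volume / 1_000_000:.1f} mb per day")
```

Either way, the result is on the order of 170 to 175 mb per day for this message type alone.
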
### 2. `COMMUNITY_DESCRIPTION`

1. Window of `COMMUNITY_DESCRIPTION` sizes sent by different communities -

```sql=
SELECT
  left(chatid, 20) as trunc_chat_id,
  messagetype,
  messagesize,
  sentat
FROM
  receivedmessages
WHERE
  messagetype = 'COMMUNITY_DESCRIPTION'
  AND messagesize >= 70000
ORDER BY sentat DESC -- to get the latest sizes
LIMIT
  10;
```

which yields

```
    trunc_chat_id     |      messagetype      | messagesize |   sentat
----------------------+-----------------------+-------------+------------
 0x0269b18891d3b42ebd | COMMUNITY_DESCRIPTION |       87531 | 1676369752
 0x0269b18891d3b42ebd | COMMUNITY_DESCRIPTION |       87531 | 1676442625
 0x03c6552a70bc9d9407 | COMMUNITY_DESCRIPTION |      247176 | 1676317357
 0x03c6552a70bc9d9407 | COMMUNITY_DESCRIPTION |      247176 | 1676313458
 0x03c6552a70bc9d9407 | COMMUNITY_DESCRIPTION |      247176 | 1676325158
 0x03c6552a70bc9d9407 | COMMUNITY_DESCRIPTION |      247176 | 1676321259
 0x03c6552a70bc9d9407 | COMMUNITY_DESCRIPTION |      247176 | 1676309857
 0x03dcc6838078722b8c | COMMUNITY_DESCRIPTION |      318522 | 1676495674
 0x0269b18891d3b42ebd | COMMUNITY_DESCRIPTION |       87531 | 1676446526
 0x03c6552a70bc9d9407 | COMMUNITY_DESCRIPTION |      247176 | 1676496141
```

2. Average `COMMUNITY_DESCRIPTION` size sent by the Status Community

```sql=
SELECT
  AVG(messagesize)
FROM
  receivedmessages
WHERE
  messagetype = 'COMMUNITY_DESCRIPTION'
  AND messagesize >= 70000
  AND chatid = '0x03073514d4c14a7d10ae9fc9b0f05abc904d84166a6ac80add58bf6a3542a4e50a';
```

> note: the chatid can be derived using `{"method":"wakuext_joinedCommunities"}` in the node management tab. Thanks @rramos.eth!

which yields

```
         avg
---------------------
 346049.563314711359
```

~346kb, which is off the [estimation](https://hackmd.io/Dru3ULQSS2-II2WwkWcosg?both) by ~200kb. In this worse scenario, ~401 members will lead to a 1mb message size.

3. Median time difference between each broadcast of `COMMUNITY_DESCRIPTION` -

```sql
SELECT
  PERCENTILE_CONT(0.5) WITHIN GROUP(
    ORDER BY
      diff
  ) as median
FROM
  (
    SELECT
      sentat,
      sentat - lag(sentat) over (
        order by
          sentat
      ) as diff
    FROM
      receivedmessages
    WHERE
      chatid = '0x03073514d4c14a7d10ae9fc9b0f05abc904d84166a6ac80add58bf6a3542a4e50a'
      AND messagetype = 'COMMUNITY_DESCRIPTION'
    ORDER BY
      sentat DESC
    LIMIT
      1000
  ) q
WHERE
  diff > 0;
```

which yields

```
 median
--------
   3632
```

The message is broadcast ~ every hour

### 3. `BACKUP`

1. Window of `BACKUP` sizes sent by different peers

```sql
SELECT
  left(chatid, 20) as trunc_chat_id,
  messagetype,
  messagesize,
  sentat
FROM
  receivedmessages
WHERE
  messagetype = 'BACKUP'
  AND messagesize > 0
ORDER BY sentat desc
LIMIT
  10;
```

which yields

```
    trunc_chat_id     | messagetype | messagesize |   sentat
----------------------+-------------+-------------+------------
 contact-discovery-04 | BACKUP      |        2264 | 1677848347
 contact-discovery-04 | BACKUP      |         608 | 1677848347
 contact-discovery-04 | BACKUP      |         119 | 1677848347
 contact-discovery-04 | BACKUP      |         738 | 1677848347
 contact-discovery-04 | BACKUP      |         109 | 1677848347
 contact-discovery-04 | BACKUP      |         109 | 1677848347
 contact-discovery-04 | BACKUP      |         109 | 1677848347
 contact-discovery-04 | BACKUP      |         112 | 1677848347
 contact-discovery-04 | BACKUP      |         612 | 1677848347
 contact-discovery-04 | BACKUP      |        9391 | 1677848347
```

2. Average size of the `BACKUP` message -

```sql=
SELECT
  messagetype,
  AVG(messagesize)
FROM
  (
    SELECT
      messagetype,
      messagesize
    FROM
      receivedmessages
    WHERE
      messagetype = 'BACKUP'
      AND messagesize > 0 -- desktop clients using non-0.10.0rc set this to 0
    LIMIT
      1000
  ) AS subq
GROUP BY
  messagetype;
```

which yields

```
 messagetype |        avg
-------------+--------------------
 BACKUP      | 33749.652000000000
```

~ 33kb. With this result, the backup protocol seems fine for now without ipfs pinning/another backup mechanism. However, as the number of communities each person belongs to grows, this will not scale, and will require changes.

3. Average time difference between each broadcast of `BACKUP`

> note: this query makes use of a random chatid being used to backup. Should probably be refactored

> note: for some reason, some backup messages are being broadcasted with very low intervals to the same receiverkeyuid. Assumed due to different types of backup messages being sent.

```sql
SELECT
  AVG(diff)
FROM
  (
    SELECT
      sentat,
      sentat - lag(sentat) over (
        order by
          sentat
      ) as diff
    FROM
      receivedmessages
    WHERE
      chatid = 'contact-discovery-04f2e7b3394dcbf0d03bba954dbedc0bd561951a8c31a00320c6b99a40145e655da6e689d19ced6a314c6dde31ce96feb0166f19472c2a1edbeeb939e3282bce14'
      AND messagetype = 'BACKUP'
      AND receiverkeyuid = '0x45ab6ed9f2461720737f0f3095a40d3b0c2475fe5ea4c57bc8b9fe293afcffbc'
    ORDER BY
      sentat DESC
    LIMIT
      1000000
  ) q
WHERE
  diff > 0;
```

which yields

```
          avg
-----------------------
 9205.8181818181818182
```

~ 2.5 hours

### 4. `COMMUNITY_REQUEST_TO_JOIN_RESPONSE`

1. Window of `COMMUNITY_REQUEST_TO_JOIN_RESPONSE` sizes -

```sql
SELECT
  left(chatid, 20) as trunc_chat_id,
  messagetype,
  messagesize,
  sentat
FROM
  receivedmessages
WHERE
  messagetype = 'COMMUNITY_REQUEST_TO_JOIN_RESPONSE'
  AND messagesize > 0
ORDER BY sentat desc
LIMIT
  10;
```

which yields

```
    trunc_chat_id     |            messagetype             | messagesize |   sentat
----------------------+------------------------------------+-------------+------------
 contact-discovery-35 | COMMUNITY_REQUEST_TO_JOIN_RESPONSE |         828 | 1677185497
 contact-discovery-28 | COMMUNITY_REQUEST_TO_JOIN_RESPONSE |       36636 | 1677145598
 contact-discovery-35 | COMMUNITY_REQUEST_TO_JOIN_RESPONSE |        1281 | 1676917897
 contact-discovery-38 | COMMUNITY_REQUEST_TO_JOIN_RESPONSE |         677 | 1676892082
 contact-discovery-38 | COMMUNITY_REQUEST_TO_JOIN_RESPONSE |         958 | 1676887818
 contact-discovery-38 | COMMUNITY_REQUEST_TO_JOIN_RESPONSE |         677 | 1676887804
 contact-discovery-28 | COMMUNITY_REQUEST_TO_JOIN_RESPONSE |       36636 | 1676877610
 contact-discovery-45 | COMMUNITY_REQUEST_TO_JOIN_RESPONSE |        1110 | 1676877544
 contact-discovery-28 | COMMUNITY_REQUEST_TO_JOIN_RESPONSE |       36496 | 1676655568
 contact-discovery-45 | COMMUNITY_REQUEST_TO_JOIN_RESPONSE |        1110 | 1676655552
```

2. Average size of the `COMMUNITY_REQUEST_TO_JOIN_RESPONSE` message -

```sql
SELECT
  messagetype,
  AVG(messagesize)
FROM
  (
    SELECT
      messagetype,
      messagesize
    FROM
      receivedmessages
    WHERE
      messagetype = 'COMMUNITY_REQUEST_TO_JOIN_RESPONSE'
      AND messagesize > 0 -- desktop clients using non-0.10.0rc set this to 0
    LIMIT
      1000
  ) AS subq
GROUP BY
  messagetype;
```

which yields

```
            messagetype             |        avg
------------------------------------+--------------------
 COMMUNITY_REQUEST_TO_JOIN_RESPONSE | 24488.108695652174
```

~24kb

3. Checking the broadcast frequency is not relevant here; the message is sent ad hoc, afaik

4. Number of `COMMUNITY_REQUEST_TO_JOIN_RESPONSE` in 1 day -

```sql
SELECT
  COUNT(*)
FROM
  receivedmessages
WHERE
  messagetype = 'COMMUNITY_REQUEST_TO_JOIN_RESPONSE'
  AND messagesize > 0
  AND sentat BETWEEN 1677099097
  AND 1677185497;
```

which yields

```
 count
-------
     2
```

### 5. `SYNC_PROFILE_PICTURE`

1. Window of `SYNC_PROFILE_PICTURE` sizes -

```sql
SELECT
  left(chatid, 20) as trunc_chat_id,
  messagetype,
  messagesize,
  sentat
FROM
  receivedmessages
WHERE
  messagetype = 'SYNC_PROFILE_PICTURE'
  AND messagesize > 0
ORDER BY sentat desc
LIMIT
  10;
```

which yields

```
    trunc_chat_id     |     messagetype      | messagesize |   sentat
----------------------+----------------------+-------------+------------
 0x0461f576da67dc0bca | SYNC_PROFILE_PICTURE |       19550 | 1677162569
 0x0461f576da67dc0bca | SYNC_PROFILE_PICTURE |       19550 | 1677162327
 0x0461f576da67dc0bca | SYNC_PROFILE_PICTURE |       19550 | 1677162099
 0x0461f576da67dc0bca | SYNC_PROFILE_PICTURE |       19550 | 1677162038
 0x0461f576da67dc0bca | SYNC_PROFILE_PICTURE |       19550 | 1677162005
```

2. Average size of the `SYNC_PROFILE_PICTURE` message -

```sql
SELECT
  messagetype,
  AVG(messagesize)
FROM
  (
    SELECT
      messagetype,
      messagesize
    FROM
      receivedmessages
    WHERE
      messagetype = 'SYNC_PROFILE_PICTURE'
      AND messagesize > 0 -- desktop clients using non-0.10.0rc set this to 0
    LIMIT
      1000
  ) AS subq
GROUP BY
  messagetype;
```

which yields

```
     messagetype      |        avg
----------------------+--------------------
 SYNC_PROFILE_PICTURE | 19550.000000000000
```

3. Checking the broadcast frequency is not relevant here; the message is sent ad hoc, afaik

4. Number of `SYNC_PROFILE_PICTURE` in 1 day -

```sql
SELECT
  COUNT(*)
FROM
  receivedmessages
WHERE
  messagetype = 'SYNC_PROFILE_PICTURE'
  AND messagesize > 0
  AND sentat BETWEEN 1677076169
  AND 1677162569;
```

which yields

```
 count
-------
     5
```

## Conclusion

It is assumed that `UNKNOWN` messages originate from 1:1 chats.

The major bandwidth usage comes from `MEMBERSHIP_UPDATE_MESSAGE` and `COMMUNITY_DESCRIPTION`.

It is recommended to optimize these payloads to address the scaling problems.