Mirror of https://github.com/vacp2p/roadmap.git, synced 2026-01-10 08:08:06 -05:00.
Commit: NCT - Commitments (major draft) (#3)
content/dst/codex/codex-comparison.md (new file, 152 lines)
@@ -0,0 +1,152 @@
---
title: Codex Comparison
tags:
- "2024q4"
- "dst"
- "codex"
draft: false
description: "Measure Codex against systems like IPFS, BitTorrent etc and see how it compares. Primarily BitTorrent."
---

`vac:dst:codex:codex-comparison`

Measure Codex against systems such as IPFS and BitTorrent and see how it compares. Primarily BitTorrent.

## Description

We will compare Codex to other systems such as IPFS and BitTorrent to see how it performs.

We will compare on metrics such as:

* Time to first byte
* Bandwidth usage
* Stability
* Reliability

Most importantly, we will run a head-to-head speed test comparing the download speeds of Codex against other systems. This will allow us to understand where Codex needs improvement and where it stands right now in terms of suitability for different use cases.

We will support the Conduit of Expertise narrative directly by providing valuable insights to Codex that allow them to understand how Codex performs in comparison to common and popular systems in the "altruistic" space.

Specifically, we will:

* Accelerate Codex reaching competitiveness with BitTorrent, or find out what is and isn't possible to do.
* Answer the simple question: "Is Codex faster than BitTorrent?" and in doing so, allow that to become a yes one day 😀
* Test the reliability of Codex in automated and highly stressful benchmarks that push its limits and reveal its shortcomings.
* Improve the RFC culture by allowing us to reuse the work we do here to build future scenarios that can test complicated situations and requirements in a repeatable way.
## Task List

### Matrices Deployments

* fully qualified name: <vac:dst:codex:codex-comparison:matrices-deployments>
* owner: Wings
* status: 50%
* start-date: 2024/10/01
* end-date: 2024/10/11

#### Description

Expand upon the current deployment work that uses Kubernetes manifests to deploy and measure complex simulations, by implementing a combination of ArgoCD (or a similar deployment tool) and standardised Helm, Kustomize or plain manifests, and by devising a way to both script and control simulations in a repeatable, easy way.

Build a system that can deploy and measure a matrix of different scenarios and configurations.

It must allow multiple unrelated deployments, such as nwaku and gowaku, to exist and interact in the course of a single test.

#### Deliverables

* Example Helm charts or Kustomize bases for deploying Codex.
* Customisations to those Helm charts or Kustomize bases that allow tuning them to meet specific scenarios, such as the number of nodes or the amount of data.
* Automated systems for running a matrix of tests and measuring them.

This builds on prior DST work (the ArgoCD effort), which also benefits from this work.
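As a sketch of what the matrix automation could look like, the fragment below expands a scenario matrix into per-deployment Helm `--set` overrides. The value names (`nodes`, `dataSize`) and the chart path are illustrative assumptions, not anything the actual charts define.

```python
from itertools import product

# Illustrative scenario axes; the real charts may expose different values.
NODE_COUNTS = [2, 8, 16, 32]
DATA_SIZES = ["100Mi", "1Gi", "5Gi"]

def helm_set_args(nodes: int, data_size: str) -> list[str]:
    """Build `--set` overrides for one cell of the scenario matrix.
    The value key names here are hypothetical."""
    return [
        "--set", f"nodes={nodes}",
        "--set", f"dataSize={data_size}",
    ]

def expand_matrix() -> list[list[str]]:
    """Expand the full matrix into one argument list per deployment."""
    return [helm_set_args(n, s) for n, s in product(NODE_COUNTS, DATA_SIZES)]

if __name__ == "__main__":
    for args in expand_matrix():
        # One `helm upgrade --install` invocation per matrix cell.
        print("helm upgrade --install codex-sim ./codex-chart", " ".join(args))
```

The same expansion could just as easily emit Kustomize overlays or ArgoCD Application manifests; the point is that the matrix is data, not hand-written configuration.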
### Control BitTorrent

* fully qualified name: <vac:dst:codex:codex-comparison:control-bittorrent>
* owner: Wings
* status: 0%
* start-date: 2024/10/10
* end-date: 2024/10/14

Pick a BitTorrent client that is Dockerisable and scriptable. The current main candidate is Deluge, with qBittorrent as a possible alternative.

Find a sane way to control and script BitTorrent behaviour, such as distributing a torrent file to the set of peers under test and automating stopping, starting, and otherwise manipulating torrents, as a separate process from launching the initial client swarm. Flexibility and consistency are the goal.

Implement those controls and start using them to build towards the wider Commitment.

#### Deliverables

* Selected BitTorrent client and explained reasons for the choice.
* Built a Dockerised image if there isn't one already.
* Implemented this into a test scenario of some kind and proven that we can script a scenario.
* A report on what we learned from the process.
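To illustrate the kind of scripting this task involves, here is a minimal stdlib-only sketch against qBittorrent's Web API (v2), one of the candidate clients; Deluge would need the equivalent calls against its own RPC interface, and the host/port values are placeholders.

```python
import urllib.parse
import urllib.request

def endpoint(base: str, name: str) -> str:
    """Build a Web API URL such as http://host:8080/api/v2/torrents/pause."""
    return f"{base.rstrip('/')}/api/v2/{name}"

def post(base: str, name: str, fields: dict, cookie: str = "") -> bytes:
    """POST a form-encoded request to one client in the swarm."""
    data = urllib.parse.urlencode(fields).encode()
    req = urllib.request.Request(endpoint(base, name), data=data)
    if cookie:
        req.add_header("Cookie", cookie)  # session cookie from auth/login
    with urllib.request.urlopen(req) as resp:
        return resp.read()

def pause_all(base: str, cookie: str) -> None:
    """Pause every torrent on a client, e.g. to stage a coordinated start."""
    post(base, "torrents/pause", {"hashes": "all"}, cookie)

def resume_all(base: str, cookie: str) -> None:
    """Resume every torrent, e.g. to start all peers at the same moment."""
    post(base, "torrents/resume", {"hashes": "all"}, cookie)
```

Driving each container in the swarm through such a wrapper keeps torrent manipulation separate from launching the clients, which is exactly the split this task calls for.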
### k8sified Tracker

* fully qualified name: <vac:dst:codex:codex-comparison:k8sified-tracker>
* owner: Wings
* status: 0%
* start-date: 2024/10/15
* end-date: 2024/10/17

Make a BitTorrent tracker work within Kubernetes, controllable by API calls.

Most likely this will simply involve adding auth to an existing Deluge or similar API, and passing requests through the existing API.

#### Deliverables

* BitTorrent trackers compared, the best one selected, and the reasons for the choice recorded.
* Chosen tracker is Dockerised.
* Chosen tracker is scriptable.
* Finished script and Docker container can realistically be used in a test scenario.
### Build/Test Scenarios

* fully qualified name: <vac:dst:codex:codex-comparison:build-test-scenarios>
* owner: Wings
* status: 0%
* start-date: 2024/10/15
* end-date: 2024/12/31

Use the work done in Matrices Deployments and Control BitTorrent to build and test a set of scenarios that can be used to test Codex.

We will target these things to compare:

**Modes**: BitTorrent, Codex Erasure-Coded, Codex Non-Erasure-Coded

**Swarm Size**:
* total size: 2, 8, 16, 32
* seeders: 1, 2, 4, 8, 16
* file size: 100 MB, 1 GB, 5 GB

We will compare a matrix of file sizes, seeders and total swarm sizes, and build a flexible test harness on top of Matrices Deployments and Control BitTorrent to run the tests.
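The matrix above can be enumerated mechanically. The sketch below assumes that cells with at least as many seeders as total peers are skipped; that filter is our reading of the matrix, not something the plan states.

```python
from itertools import product

MODES = ["bittorrent", "codex-ec", "codex-no-ec"]
SWARM_SIZES = [2, 8, 16, 32]
SEEDERS = [1, 2, 4, 8, 16]
FILE_SIZES_MB = [100, 1_000, 5_000]

def scenarios() -> list[dict]:
    """Enumerate every runnable cell of the comparison matrix.

    Assumption: a scenario only makes sense with fewer seeders than
    total peers, so the remaining combinations are filtered out.
    """
    return [
        {"mode": m, "swarm": n, "seeders": s, "file_mb": f}
        for m, n, s, f in product(MODES, SWARM_SIZES, SEEDERS, FILE_SIZES_MB)
        if s < n
    ]

if __name__ == "__main__":
    # The number of test runs the harness must schedule.
    print(len(scenarios()))
```

Under that assumption the harness has 117 runs to schedule (13 valid swarm/seeder pairs, times 3 modes, times 3 file sizes), which gives a concrete sense of how long a full sweep takes.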
#### Deliverables

* A completely automated end-to-end test scenario that can be used to test Codex against BitTorrent.
* A report on the results of the tests and the conclusions we can draw from them.
* Hard numbers on what Codex is capable of, and how swarm sizes and other parameters affect performance, latency and other metrics.
content/dst/codex/codex-scaling.md (new file, 132 lines)
@@ -0,0 +1,132 @@
---
title: Codex Scaling
tags:
- "2024q4"
- "dst"
- "codex"
draft: false
description: "Improve Codex's scaling abilities and our understanding of them, using scientific testing and experiments, leading to better scaling. Compare to other systems. Support the testnet efforts by providing base capacity. Measure speed, latency and other metrics. Give hard numbers on Codex vs BitTorrent."
---

`vac:dst:codex:codex-scaling`

## Description

Use real world testing, theoretical analysis and simulation to determine and improve Codex's scaling properties.

Find the limits of Codex's capabilities and measure them in different scenarios.

We will allow Codex to scale to support large scale use cases, test how it behaves in large 100TB+ testnet deployments and in various deployment setups, and we will help make Codex more scalable in the first place.

We will support the Conduit of Expertise narrative directly by providing valuable insights to Codex and the ability to theorise, reason about, test, measure and improve the performance, stability and scalability of Codex.

These efforts will contribute to the Conduit of Expertise narrative in these ways:

* Accelerate the adoption, development and productising of Codex by providing support to the Codex team in the form of real world testing, improving their efficiency and effectiveness in building a better product.

* Improve the RFC culture by allowing for faster and easier development of RFCs, with the aid of rapidly accelerated insights into how an RFC in development will perform as it is expanded and goes through the draft process.

* Allow easier post-mortem analysis of the success or relative performance of a given RFC: does this change use more or less bandwidth? Did it improve things? Seeing the effects of changes at scale allows us to usefully wrap up and conclude an RFC process, and to document and absorb what we learned into further improvements.

Further, we will contribute both directly and indirectly to the Premier Research destination narrative by helping Codex build a stable base on which other research and interesting use cases can be built.
## Task List

### Deploy Base Capacity

* fully qualified name: <vac:dst:codex:codex-scaling:deploy-base-capacity>
* owner: Wings
* status: 99%
* start-date: 2024/10/05
* end-date: 2024/10/31

#### Description

Deploy a large set of base capacity to the Codex testnet and keep it online, stable and, where possible, protected from data loss.

It will consist of 50 nodes with 10 TB of data each.

#### Deliverables

* Helm chart adapted to Vaclab and used to deploy the nodes.
* 50 nodes running and adopted into the testnet.
* Downloads/uploads tested and working for at least 3 selected nodes.
* Ongoing monitoring (not a one-time thing).
* 500 TB of overall capacity provided to the network.
### How Fast Is Codex?

* fully qualified name: <vac:dst:codex:codex-scaling:how-fast-is-codex>
* owner: Wings
* status: 0%
* start-date: 2024/10/18
* end-date: 2024/10/21

#### Description

Related to Codex Comparison, we simply want to find out how fast Codex is at various tasks, under different kinds of stress and load.

We will use the Base Capacity.

We will test and compare the following:

* Upload speed (1 client)
* Download speed
* Time to first byte
* Time to 50%
* Time to 90%
* Time to 100%

We would also like to collect all data from the items in this matrix:

**Benchmark conditions**:
* total size: 2, 8, 16, 32
* seeders: 1, 2, 4, 8, 16
* file size: 100 MB, 1 GB, 5 GB
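The time-to-X% metrics above all fall out of a single download progress trace. A minimal sketch, assuming the trace is a time-ordered list of (seconds, bytes downloaded) samples:

```python
def time_to_fraction(trace, total_bytes, fraction):
    """Return the first sample time at which at least `fraction` of
    `total_bytes` had been downloaded, or None if never reached.

    `trace` is a time-ordered list of (seconds, bytes_done) samples.
    """
    threshold = total_bytes * fraction
    for seconds, bytes_done in trace:
        if bytes_done >= threshold:
            return seconds
    return None

def summarise(trace, total_bytes):
    """Compute the metrics listed above from one download trace."""
    return {
        # First byte: threshold of exactly one byte out of the total.
        "time_to_first_byte": time_to_fraction(trace, total_bytes, 1 / total_bytes),
        "time_to_50%": time_to_fraction(trace, total_bytes, 0.50),
        "time_to_90%": time_to_fraction(trace, total_bytes, 0.90),
        "time_to_100%": time_to_fraction(trace, total_bytes, 1.00),
    }
```

The resolution of these numbers is bounded by the sampling interval, so the harness should sample progress more often than the smallest latency it wants to resolve.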
#### Deliverables

- [ ] Reports on how each item in the matrix performed.
- [ ] A general writeup.
content/dst/ift/deployer-tool.md (new file, 81 lines)
@@ -0,0 +1,81 @@
---
title: Deployer Tool
tags:
- "2024q4"
- "dst"
- "ift"
draft: false
description: "Develop, test, demonstrate and graduate a tool or method for reliably deploying, measuring and scaling arbitrary sets of software that needs testing and validation"
---

`vac:dst:ift:deployer-tool`

## Description

We will develop, test, demonstrate and graduate (productionise) a tool or method for reliably deploying, measuring and scaling arbitrary sets of software that need testing and validation, such as Waku, Codex and Nomos.

The tool will be used to improve the developer experience of deploying these systems at various scales, including automation, metrics, and the ability to change a running simulation as needed.

It should support arbitrary Helm and Kustomize charts, allowing us to use well defined configurations in the form of Kubernetes resources, managed by modular bundles that can be swapped in and out as needed.

This will allow us to do all of our other work more easily, letting us focus on providing value to the IFT ecosystem. Through this, the Conduit of Expertise narrative is supported: by increasing our efficiency, our capabilities, and the reliability with which we can repeat our experiments and research, we can provide better insights and data to the teams we work with, allowing them to make better decisions.
## Task List

### ArgoCD Or Similar

* fully qualified name: `vac:dst:ift:argocd-or-similar`
* owner: Wings
* status: 80%
* start-date: 2024/10/04
* end-date: 2024/12/31

#### Description

Get ArgoCD or a similar tool up and running.

Use it to demonstrate deploying an nwaku simulation from a Git repo with a Helm chart or plain manifests in it.

#### Deliverables

* The demonstrated ability to run an nwaku simulation.
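For illustration, registering such a simulation with ArgoCD's CLI might look like the sketch below. The repo URL, chart path and namespace are placeholders, not actual project values.

```python
import subprocess

def argocd_create_cmd(name: str, repo: str, path: str, namespace: str) -> list[str]:
    """Build an `argocd app create` invocation pointing at a Git repo
    containing a Helm chart or plain manifests. All values passed in
    here are placeholders for the real simulation repo."""
    return [
        "argocd", "app", "create", name,
        "--repo", repo,
        "--path", path,
        "--dest-server", "https://kubernetes.default.svc",
        "--dest-namespace", namespace,
        "--sync-policy", "automated",  # keep the deployment in sync with Git
    ]

if __name__ == "__main__":
    cmd = argocd_create_cmd(
        "nwaku-sim",
        "https://example.invalid/simulations.git",  # placeholder repo URL
        "charts/nwaku",
        "nwaku-sim",
    )
    subprocess.run(cmd, check=True)
```

With `--sync-policy automated`, changing the chart in Git re-deploys the simulation, which is the Git-repo-driven workflow this task demonstrates.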
### Working Matrices

* fully qualified name: `vac:dst:ift:working-matrices`
* owner: Wings
* status: 0%
* start-date: 2024/10/04
* end-date: 2024/12/31

#### Description

Ensure that deployment matrices work once `ArgoCD Or Similar` is completed.

Test some basic deployments and record findings.

#### Deliverables

* A report on the findings of the tests and the current state of the deployment matrices.
* A deployment matrix tool, or a set of instructions/documentation.
* Deployments tested and working with a 3x3 matrix of different configurations.
content/dst/ift/vaclab.md (new file, 189 lines)
@@ -0,0 +1,189 @@
---
title: VacLab
tags:
- "2024q4"
- "dst"
- "ift"
draft: false
description: "Scale and apply the VacLab to IFT's needs. Anticipate untapped use cases and needs from other teams. Achieve 25% real world time usage."
---

`vac:dst:ift:vaclab`

## Description

The VacLab is a resource provided to Vac by Riff Labs Limited, intended to help us perform detailed simulations and deployments of distributed systems at scale, along with the systems and dependencies that surround them.

With the VacLab reaching a maturity where it can comfortably be used to advance IFT's research, development, testing efforts and quality control, we want to ensure it is used to its full potential and that teams know the resource exists. If a team sees a way their work could be improved by collaborating with the DST team through the VacLab, they should feel comfortable reaching out to us, and confident that we will try to help them, based on our attitude and, more importantly, the track record and results we will produce using these tools and our expertise.

The lab will be treated as an IaaS-style service at first, with the raw underlying infrastructure being developed in partnership with Riff Labs, who handle the details of making that IaaS layer available to us and reliable.

As the lab matures, we will transition to supporting a more PaaS or even SaaS software model, where as much as possible is accessible to the IFT ecosystem and teams to use and benefit from, without them needing to concern themselves with the details of the underlying infrastructure or be blocked by the need to build and manage their own.

We will move towards self service testing and deployment, and by doing so unblock and accelerate the development, R&D and productionisation of IFT's projects by providing a safe and reliable place to experiment and test.

The lab will continue to provide significant efficiency benefits in terms of cost versus output when compared to cloud providers, and even to traditional on-premises infrastructure. It does this by using many independent, cheaper nodes rather than larger, more powerful vertically scaled machines, building on second hand equipment, and "patching around" the unreliability of individual hardware by ensuring everything remains resilient and reliable even in the face of individual failures. In doing so, it will continue to reduce and control the costs of testing our systems at scale.

Through the use of the VacLab we will support the Conduit of Expertise narrative by:

* Providing a unique capability to the IFT ecosystem that would not otherwise be available to them, lowering the barrier to entry for teams needing research, or services that require infrastructure, by reducing the cost and removing the need for them to obtain it themselves through cloud providers that offer less flexibility and direct control.

* Using our knowledge of what is possible with these resources, based on who is already using them, and applying that knowledge to intuit new use cases that will unlock better collaboration between teams and the DST, driving and accelerating the development of IFT projects such as Waku, Codex, Nomos and more.

* Accelerating initiatives by providing the means, capability and encouragement to test every aspect of anything that can be tested in a simulation, across every team and use case that is interested in doing so, up to the limits of what the DST team can support.

We will also support the Premier Research destination narrative by:

* Allowing public access to non-sensitive telemetry and metrics from non-sensitive systems such as Codex storage nodes, and potentially even probes that measure the state of networks such as The Waku Network and Status.
## Task List

### Status Page Known

* fully qualified name: `vac:dst:ift:vaclab:status-page-known`
* owner: Wings
* status: 80%
* start-date: 2024/12/01
* end-date: 2024/12/07

#### Description

A status page for the VacLab that has wide acceptance and use by anyone who wants to know the current status of the VacLab and its availability.

#### Deliverables

* Status page is created, hosted on the lab, and made available to users.
* Status page reflects reality and is accepted by users as a good fit for their needs.
* Status page sees widespread use among its users.
* Build an external probe and a fallback status page that can be used in case everything else is down.
### Better Time Slicing

* fully qualified name: `vac:dst:ift:vaclab:better-time-slicing`
* owner: Wings
* status: 0%
* start-date: 2024/06/01
* end-date: 2024/12/31

#### Description

Do a better job of time slicing the lab.

#### Deliverables

* A report on the current state of time slicing in the lab.
* A plan for how to improve time slicing in the lab.
* A timeline for implementing the plan.
* Measurable improvements in usage of the lab, aiming for an initial target of 25% of real world time being used for useful workloads and tests.

Later repeats of the VacLab commitment will aim to improve this to 50%, then 75%, then as far as possible given the limits of the underlying infrastructure and our actual needs.
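Measuring the real world time target is straightforward once workload runs are recorded as intervals. A minimal sketch, assuming runs are logged as (start, end) pairs; merging overlaps avoids double-counting concurrent workloads:

```python
def utilisation(runs, window_start, window_end):
    """Fraction of wall-clock time in [window_start, window_end) covered
    by at least one workload run. `runs` is a list of (start, end) pairs
    in any consistent unit, e.g. unix seconds."""
    # Clamp runs to the measurement window and drop empty ones.
    clamped = sorted(
        (max(s, window_start), min(e, window_end))
        for s, e in runs
        if min(e, window_end) > max(s, window_start)
    )
    covered = 0
    cursor = window_start
    for s, e in clamped:
        s = max(s, cursor)  # skip time already counted by an earlier run
        if e > s:
            covered += e - s
            cursor = e
    return covered / (window_end - window_start)
```

Running this over a quarter's scheduler logs gives a single number to compare directly against the 25%, 50% and 75% targets.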
### Train Lab Staff
<!-- Technically somewhat external, and will be done outside of the normal DST cadence, but will be managed so as not to disrupt other work. -->

* fully qualified name: `vac:dst:ift:vaclab:train-lab-staff`
* owner: Wings
* status: 30%
* start-date: 2024/12/01
* end-date: 2024/12/31

#### Description

Fully dedicate all time outside of core DST deliverable work to training Michaela, the VacLab (Riff Labs Perth) custodian, in all aspects of not just managing the VacLab, but also supporting DST's work that utilises it, with the focus of improving both the reliability of the lab and the quality of the systems testing service it provides.

For practical reasons, this must be done in person in Perth.

This time will also be used to improve the reliability and capabilities of the VacLab as a platform for IFT's research and development needs.

It must not impact other work outside of this task.

#### Deliverables

- [ ] Full automation for anything we know needs doing regularly.
- [ ] Automated patching for security updates (Debian, Authentik, SeaweedFS).
- [ ] Secure key management and rotation automation (for SSH keys).
- [ ] Michaela fully comfortable operating the lab independently.
- [ ] A report on what was learned in this process and how we believe it improved VacLab support and operations.
- [ ] Improvements to the lab that are documented, implemented and recorded.
### Automation Uplift
<!-- Technically somewhat external, and will be done outside of the normal DST cadence, but will be managed so as not to disrupt other work. -->

* fully qualified name: `vac:dst:ift:vaclab:automation-uplift`
* owner: Wings
* status: 0%
* start-date: 2024/12/01
* end-date: 2024/12/31

#### Description

Significantly improve the automation and management of the VacLab, freeing up resources for Wings to focus on other work.

#### Deliverables

- [ ] Full automation for anything we know needs doing regularly.
- [ ] Automated patching for security updates (Debian, Authentik, SeaweedFS).
- [ ] Secure key management and rotation automation (for SSH keys).
- [ ] A report on what was learned in this process and how we believe it improved VacLab support and operations:
  - What was automated? Why? What did that change?
  - What remains manual and needs improving?
- [ ] Improvements to the lab that are documented, implemented and recorded.
content/dst/ift/visualiser-tool.md (new file, 85 lines)
@@ -0,0 +1,85 @@
---
title: Visualiser Tool
tags:
- "2024q4"
- "dst"
- "ift"
draft: false
description: "Develop tools or frameworks suitable for visualising the state of arbitrary distributed systems. This initial iteration must support Waku visualisation, but the future intention is to support any system which is log compatible with the Visualiser Tools."
---

`vac:dst:ift:visualiser-tool`

## Description

Develop tools or frameworks suitable for visualising the state of arbitrary distributed systems. This initial iteration must support Waku visualisation, but the future intention is to support any system which is log compatible with the Visualiser Tools.

We will demonstrate the usefulness of, and the unique understanding offered by, such a tool in showing how a p2p network behaves under different conditions, from its inception through its middle state to its eventual end.

By providing a way to visualise p2p network behaviour and message propagation, we will help support the Conduit of Research narrative, giving the Waku team a way to intuitively understand how network propagation actually occurs, and how it is affected by different factors.

It will also provide a way to test RFCs that affect aspects of Waku that are visible in a simulation but hard to observe in the real world, or that would otherwise require a significant investment of time and mental effort in logs that don't provide a visual way of analysing large scale behaviours.
## Task List

### debug-visualiser

* fully qualified name: `vac:dst:ift:visualiser-tool:debug-visualiser`
* owner: Alberto
* status: 60%
* start-date: 2024/06/01
* end-date: 2024/12/31

#### Description

The debug visualiser is designed to allow for digging into the interactions, packet flow and behaviour of distributed systems, initially Waku.

It is intended to be "interesting and deep, not pretty or wide".

#### Deliverables

- [ ] https://github.com/vacp2p/dst-live-visualiser
### live-visualiser

* fully qualified name: `vac:dst:ift:visualiser-tool:live-visualiser`
* owner: Wings
* status: 99%
* start-date: 2024/06/01
* end-date: 2024/12/31

#### Description

The live visualiser is designed to allow for digging into the interactions, packet flow and behaviour of distributed systems, initially Waku.

In contrast to the debug visualiser, it is intended to be "pretty and wide": it runs in realtime alongside the network and shows you the network in a way that is easy to understand and interpret, especially for those not previously familiar with peer to peer technologies or networks.

#### Deliverables

- [ ] https://github.com/vacp2p/dst-live-visualiser
@@ -9,12 +9,23 @@ tags:
---

### `ift`
* [[dst/ift/deployer-tool|deployer-tool]]
* [ ] [[dst/ift/deployer-tool|deployer-tool]]
* [ ] [[dst/ift/visualiser-tool|visualiser-tool]]
* [ ] [[dst/ift/vaclab|vaclab]]

### `waku`
* [ ] [[dst/waku/waku-scaling|waku-scaling]]

### `codex`
* [ ] [[dst/codex/codex-scaling|codex-scaling]]
* [ ] [[dst/codex/codex-comparison|codex-comparison]]

<!--
### `nomos`
* [ ] [[dst/nomos/nomos-scaling|nomos-scaling]]
-->

### `vac`
* [ ] [[dst/vac/libp2p-evaluation|libp2p-evaluation]]
content/dst/nomos/nomos-scaling.md (new file, 79 lines)
@@ -0,0 +1,79 @@
---
title: Nomos Scaling
tags:
- "2024q4"
- "dst"
- "nomos"
draft: false
description: "Help Nomos understand and improve its properties. Improve privacy and security, and improve scaling properties."
---

`vac:dst:nomos:nomos-scaling`

## Description

Use real world testing, theoretical analysis and simulation to determine and improve Nomos's scaling properties. Find the limits of Nomos's capabilities and measure them in different scenarios.

We will measure the real world speeds and latency of Nomos's mixnet, and therefore what use cases it is able to support.

We will support the Conduit of Expertise narrative directly by providing valuable insights to Nomos and the ability to theorise, reason about, test, measure and improve the performance, stability and scalability of Nomos.

These efforts will contribute to the Conduit of Expertise narrative in these ways:

* Help Nomos ship a more scalable mixnet, unlocking capabilities across IFT's teams and ecosystem and allowing more use cases to be supported and understood. This will also help spur on outside adoption and contributions.
* Improve the RFC culture by allowing for faster and easier development of RFCs, with the aid of rapidly accelerated insights into how an RFC in development will perform as it is expanded and goes through the draft process.
* Allow easier post-mortem analysis of the success or relative performance of a given RFC: does this change use more or less bandwidth? Did it improve things? Seeing the effects of changes at scale allows us to usefully wrap up and conclude an RFC process, and to document and absorb what we learned into further improvements.
## Task List

### Mixnet benchmarking

* fully qualified name: <vac:dst:nomos:nomos-benchmarking>
* owner: Alberto
* status: 0%
* start-date: <yyyy/mm/dd>
* end-date: <yyyy/mm/dd>

#### Description

Measure the speed and reliability of Nomos's mixnet, benchmarking it against other mixnets and a selection of real world use cases.

#### Deliverables

* Benchmarks done
* Report published with all relevant details
### RFC analysis (recurring)

* fully qualified name: <vac:dst:nomos:rfc-analysis>
* owner: Alberto
* status: 0%
* start-date: <yyyy/mm/dd>
* end-date: <yyyy/mm/dd>

#### Description

Analyse the performance of RFCs that have an expected effect on the network's performance and scaling properties, using the benchmarking tools and real world measurements.

#### Deliverables

* Analysis done
* Report published with all relevant details
* RFC's GitHub issue updated with links to the analysis and results
content/dst/vac/libp2p-evaluation.md (new file, 79 lines)
@@ -0,0 +1,79 @@
---
title: Libp2p Evaluation
tags:
- "2024q4"
- "dst"
- "vac"
draft: false
description: "Test libp2p on a regular basis and look for regressions, learn scaling properties and run scaling studies, understand the limits of libp2p and its behaviour. Deliver hard numbers and actionable insights. Do this monthly, reliably, with strong documentation of findings."
---

`vac:dst:vac:libp2p-evaluation`

## Description

Test libp2p on a regular basis and look for regressions, learn scaling properties and run scaling studies, and understand the limits of libp2p and its behaviour.

We want to learn specific, actionable information about libp2p's behaviour and how it evolves over time, with each new release and with each thing we are specifically asked to check and test.

We will use a combination of real world testing, theoretical analysis and simulation to determine and measure the success, side effects and other factors of libp2p and its evolution.

We will support the Conduit of Expertise narrative directly by analysing and evaluating new libp2p releases and their features, both with regard to the features they have today and with regard to how that compares to past behaviour.

We will:

* Enable improvements to libp2p by allowing for repeatable, measurable and real world insights into libp2p, all the way from theory to practice and back.
* Reduce the risk of a libp2p regression making it into a new release of our products.

Additionally, these efforts will contribute to the Premier Research destination narrative by:

* Improving and strengthening our relationship with the libp2p team, thus increasing the reach and influence of the IFT, and improving the chances that we successfully grow our ecosystem's products and collaborations, especially with those we want to work with externally.
### Regression testing (recurring)

* fully qualified name: `vac:dst:vac:libp2p-evaluation:regression-testing`
* owner: Alberto
* status: N/A
* start-date: N/A
* end-date: N/A

#### Description
Run different scenarios
and collect evidence and data
of libp2p's behaviour.

Test for known regressions
that have occurred in the past
and ensure they don't happen again.

#### Deliverables
* Analysis done
* Report published with all relevant details
* RFC's GitHub issue updated
with links to the analysis and results.
281
content/dst/waku/waku-scaling.md
Normal file
@@ -0,0 +1,281 @@
---
title: Waku Scaling
tags:
- "2024q4"
- "dst"
- "waku"
draft: false
description: "Use real world testing,
theoretical analysis and simulation
to determine and improve Waku's scaling properties.
Find the limits of Waku's capabilities
and measure them in different scenarios.
Deliver hard numbers and actionable insights.
Confirm or reject our ideas."
---

`vac:dst:waku:waku-scaling`

## Description
Use real world testing,
theoretical analysis and simulation
to determine and improve Waku's scaling properties.
Find the limits of Waku's capabilities
and measure them in different scenarios.
Deliver hard numbers and actionable insights.
Confirm or reject our ideas.

Through this we will, among other things,
research and find the limits of Waku's capabilities
and measure them in different scenarios.
We will work with the Waku team to improve and measure Waku
and allow for deep examination of a wide range of networks,
at sizes anywhere from small (< 500 nodes)
to midscale (500-5,000 nodes)
to large (10,000+ nodes).

We will in some ways
provide a parallel to the Vac QA team's efforts:
while their focus is on individual low-level components
or individual parts of Waku
and other software within the IFT ecosystem,
ours will be on the real world behaviour of Waku as a whole system,
at different scales and with different configurations,
mesh structures and shapes,
and on how that maps to our theoretical work.

We will support the Conduit of Expertise narrative directly
by providing valuable insights to Waku
and the ability to theorise about, reason about, test, measure and improve
the performance, stability and scalability of Waku.

These efforts will contribute to the Conduit of Expertise narrative in these ways:

* Accelerate improvements to Waku,
improving the developer community's experience and satisfaction
both inside and outside of IFT's ecosystem,
by allowing repeatable, measurable and real world insights into Waku,
all the way from theory to practice and back.

* Improve the RFC culture by allowing for faster and easier development of RFCs
with the aid of rapidly accelerated insights
into how an RFC in development
will perform as it is expanded
and as it goes through the draft process.

* Allow easier post-mortem analysis
of the success or relative performance
of a given RFC:
does this change use more or less bandwidth?
Did it improve things?
Seeing the effects of changes at scale
allows for a greater ability
to usefully wrap up and conclude an RFC process,
and to document and absorb what we learned
into further improvements.

## Task List

### High Scalability Waku Demonstration

* fully qualified name: `vac:dst:waku:waku-scaling:high-scalability-waku-demonstration`
* owner: Wings
* status: 95%
* start-date: 2024/03/01
* end-date: 2024/11/01

#### Description
Demonstrate a working, real world, large scale Waku network.

Measure its performance
and attempt to support the assertion
that Waku is a scalable solution
that can work in networks at sizes
that push the limits of what our theoretical work predicted is possible.

Specifically, we want to deploy a 10,000 node Waku network
and measure its performance in terms of message delivery,
bandwidth usage and other metrics.
We want to deliver a report on what we learned,
what we tested and what we found.

The report should include analysis of the performance of Waku at extreme scale,
providing significant supporting evidence
that Waku can in fact scale to these sizes and perform reliably.

#### Deliverables
- [x] An infrastructure setup, whether on-prem or cloud,
that can support deployments of a 10,000 node Waku network.

- [x] https://github.com/vacp2p/10ksim - a working set of bundled and compatible Kubernetes manifests
that allow a Waku network of up to 10,000 nodes
to be reliably created and measured.
The manifests should be compatible with [vac|dst|deployer-tool|deployer-tool]
and flexible.

- [ ] A useful set of measurements taken with the monitoring system and tooling we have available.

- [ ] The monitoring system stays stable the entire time, providing useful information and metrics.
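One way to summarise message delivery from such a run is to compare the set of published message IDs against what each node reports as received. The sketch below is a hypothetical illustration of that calculation; the names and log shapes are assumptions, not the actual 10ksim tooling or format:

```python
# Hypothetical sketch: summarise message delivery from simulation logs.
# Node names and log shapes are illustrative, not the real 10ksim format.

def delivery_stats(published: set, received_by_node: dict) -> dict:
    """Compute per-run delivery metrics from message IDs."""
    nodes = len(received_by_node)
    # A message counts as delivered to a node if its ID appears in that node's log.
    per_node_rate = {
        node: len(seen & published) / len(published)
        for node, seen in received_by_node.items()
    }
    # Messages that reached every single node in the run.
    fully_delivered = sum(
        1 for msg in published
        if all(msg in seen for seen in received_by_node.values())
    )
    return {
        "nodes": nodes,
        "mean_delivery_rate": sum(per_node_rate.values()) / nodes,
        "fully_delivered": fully_delivered,
        "published": len(published),
    }

# Toy example with three nodes and four published messages:
published = {"m1", "m2", "m3", "m4"}
logs = {
    "node-0": {"m1", "m2", "m3", "m4"},
    "node-1": {"m1", "m2", "m3"},
    "node-2": {"m1", "m2", "m3", "m4"},
}
stats = delivery_stats(published, logs)
```

At 10,000 nodes the same two numbers (mean per-node delivery rate and the count of fully delivered messages) give a compact headline figure for the report.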

### Test Store Protocol At Scale

* fully qualified name: `vac:dst:waku:waku-scaling:test-store-protocol-at-scale`
* owner: Alberto
* status: 0%
* start-date: 2024/10/07
* end-date: 2024/10/11

#### Description
Test the Store protocol at scale.

#### Deliverables
- [ ] A report on the results of the test,
including analysis, data and metrics.
- [ ] A list of any issues encountered.
- [ ] Hard data and metrics from the simulation.

### High Churn Relay+Store Reliability

* fully qualified name: `vac:dst:waku:waku-scaling:high-churn-relay-store-reliability`
* owner: Alberto
* status: 0%
* start-date: 2024/09/01
* end-date: 2024/12/31

#### Description
If nodes go online/offline, we should be able to retrieve missing messages from the Store.

#### Deliverables
- [ ] A report on the results of the test,
including analysis, data and metrics.
- [ ] A list of any issues encountered.
- [ ] Hard data and metrics from the simulation.
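A churn scenario like this can be driven by a deterministic schedule of up/down events, so the same run is repeatable across implementations. The sketch below is a minimal illustration; the churn rate, offline duration and event shape are assumptions, not the actual simulation tooling:

```python
import random

# Hypothetical sketch: generate a reproducible churn schedule for a test run.
# The churn rate and 30 s offline window are illustrative assumptions.

def churn_schedule(node_count: int, duration_s: int, churn_rate: float, seed: int = 42):
    """Return a sorted list of (time_s, node_id, event), event in {'down', 'up'}.

    Each second, every node goes offline with probability churn_rate and
    comes back 30 seconds later, at which point Store retrieval of the
    messages missed while offline can be checked.
    """
    rng = random.Random(seed)  # fixed seed => the same schedule every run
    events = []
    for t in range(duration_s):
        for node in range(node_count):
            if rng.random() < churn_rate:
                events.append((t, node, "down"))
                events.append((t + 30, node, "up"))
    return sorted(events)

schedule = churn_schedule(node_count=100, duration_s=600, churn_rate=0.001)
```

Replaying the same schedule against runs with and without Store nodes present isolates the Store protocol's contribution to reliability under churn.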

### Relay/DiscV5 Resources in Heterogeneous Clusters

* fully qualified name: `vac:dst:waku:waku-scaling:relay-discv5-resources-in-heterogenous-clusters`
* owner: Wings
* status: 0%
* start-date: 2024/09/01
* end-date: 2024/12/31

#### Description
Measure Relay bandwidth usage
and DiscV5 bandwidth usage
in heterogeneous clusters
involving different node implementations
such as nwaku and go-waku.

#### Deliverables
- [ ] A report on the results of each test, including analysis, data and metrics.
- [ ] A list of any issues encountered.
- [ ] Hard data and metrics from the simulation.

### Waku Shard Reliability vs Scale

* fully qualified name: `vac:dst:waku:waku-scaling:waku-shard-reliability-vs-scale`
* owner: Alberto
* status: 0%
* start-date: 2024/09/01
* end-date: 2024/12/31

#### Description
Test Waku shard behaviour and stability with various numbers of shards.

Choose a matrix to test for and then test for it.

#### Deliverables
- [ ] Matrix/exact deployment script defined
- [ ] A report on the results of each test, including analysis, data and metrics.
- [ ] A list of any issues encountered.
- [ ] Hard data and metrics from the simulation.
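Once the matrix is chosen, it can be enumerated up front so that each configuration maps to exactly one deployment run. The sketch below shows the shape of such an enumeration; the parameter names and values are illustrative assumptions, not the agreed matrix:

```python
from itertools import product

# Hypothetical sketch: enumerate a shard-scaling test matrix.
# Parameter values are illustrative assumptions, not the agreed matrix.

shard_counts = [1, 2, 4, 8]
node_counts = [100, 500, 1000]
msg_rates_per_s = [1, 10]

# Full cross product: one dict per deployment run.
matrix = [
    {"shards": s, "nodes": n, "msg_rate": r}
    for s, n, r in product(shard_counts, node_counts, msg_rates_per_s)
]
# 4 * 3 * 2 = 24 configurations in total.
```

Writing the matrix down as data like this also satisfies the first deliverable directly: the same structure can be fed to the deployment script and attached to the report.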

### Filter and Lightpush Tests

Test the performance, reliability and behaviour
of the Filter and Lightpush protocols at scale.

Confirm their stability and reliability at various scales.

Adjust the specific tests involved
in response to the Waku team's direction
and the discoveries we make during the course of this work.

* fully qualified name: `vac:dst:waku:waku-scaling:filter-lightpush-tests`
* owner: Alberto
* status: 0%
* start-date: 2024/10/18
* end-date: 2024/10/25

#### Description
Test the Filter and Lightpush protocols at scale.

#### Deliverables
- [ ] A report on the current reliability and performance of the protocols at scale.
- [ ] Any issues encountered filed.
- [ ] Hard data and metrics from the simulation.

### Measure DiscV5 Bandwidth with Waku Discovery

* fully qualified name: `vac:dst:waku:waku-scaling:measure-discv5-bandwidth-with-waku-discovery`
* owner: Alberto
* status: 0%
* start-date: 2024/09/01
* end-date: 2024/12/31

#### Description
Measure the bandwidth usage of Waku discovery when using the DiscV5 protocol.

#### Deliverables
- [ ] A report on what we learnt.
- [ ] Hard data and metrics from the simulation.
- [ ] A documentation page with analysis, results and notes.

### Partial PeX Experimental Analysis

* fully qualified name: `vac:dst:waku:waku-scaling:partial-pex-experimental-analysis`
* owner: Alberto
* status: 0%
* start-date: 2024/09/01
* end-date: 2024/12/31

#### Description
Produce and run an experimental test environment
where a partial subset of the nodes
use Waku's Peer Exchange protocol
to share information about other nodes in the network.

Measure the bandwidth usage of DiscV5 on the nodes that use PeX
and compare it to the DiscV5 bandwidth usage of the nodes that do not.

Measure overall bandwidth usage and record conclusions as to the impact of PeX.

#### Deliverables
- [ ] DiscV5 bandwidth comparison report: PeX vs no-PeX
- [ ] Overall bandwidth usage comparison report
- [ ] Recorded conclusions as to the impact of PeX.
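The PeX-vs-no-PeX comparison reduces to summarising per-node DiscV5 bandwidth for each group and reporting the relative difference. A minimal sketch of that summary step follows; the sample numbers are made up, and the real inputs would come from the simulation metrics:

```python
from statistics import mean

# Hypothetical sketch: compare DiscV5 bandwidth between PeX and non-PeX groups.
# The sample values below are made up; real numbers come from the metrics.

def compare_groups(pex_kbps: list, no_pex_kbps: list) -> dict:
    """Summarise per-node DiscV5 bandwidth (kbit/s) for the two groups."""
    pex_mean, base_mean = mean(pex_kbps), mean(no_pex_kbps)
    return {
        "pex_mean_kbps": pex_mean,
        "no_pex_mean_kbps": base_mean,
        # Negative means the PeX group used less DiscV5 bandwidth.
        "relative_change": (pex_mean - base_mean) / base_mean,
    }

result = compare_groups(pex_kbps=[8.0, 9.0, 7.0], no_pex_kbps=[12.0, 11.0, 13.0])
```

The `relative_change` figure is the single number the comparison report would lead with, with the per-node distributions behind it as supporting data.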

### Mixed Environment Analysis

* fully qualified name: `vac:dst:waku:waku-scaling:mixed-environment-analysis`
* owner: Alberto
* status: 0%
* start-date: 2024/09/01
* end-date: 2024/12/31

#### Description
Measure Relay resource usage with a mix of nodes
using the resource-restricted device reliability protocols and Peer Exchange:
a small number of nwaku nodes serve the Store, Lightpush and Filter protocols
and a high number of clients consume them.
For example, 6-10 service nodes, 200 relay nodes and 1,000 light nodes.
This should include the impact of connection and node churn on reliability
for both relay and light clients.

#### Deliverables
- [ ] A report on the findings, measurements and results.
- [ ] A list of any issues encountered.
- [ ] Analysis and actionable insights or conclusions.
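The node mix for such a run can be captured as one declarative spec, from which the per-service-node client load falls out directly. The counts below follow the example in the description; the field names are assumptions for illustration:

```python
# Hypothetical sketch: declare the mixed-environment node counts and derive
# the client load per service node. Field names are assumptions; the counts
# follow the example in the description above.

mix = {"service": 8, "relay": 200, "light": 1000}  # 8 sits in the 6-10 range

total_nodes = sum(mix.values())
# Every light node consumes Store/Lightpush/Filter from some service node,
# so this is the average number of light clients each service node carries.
light_clients_per_service = mix["light"] / mix["service"]

assert 6 <= mix["service"] <= 10
```

Sweeping `mix["service"]` across the 6-10 range then shows how quickly per-service-node load grows as the service side shrinks, which is exactly the reliability pressure this task wants to measure under churn.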

<!-- Most recently blocked by metrics scaling issues, nearly through them -->