**What type of PR is this?**
Feature
What does this PR do? Why is it needed?
This PR takes @MarcoPolo 's PR at
https://github.com/OffchainLabs/prysm/pull/16130 to completion with
tests.
The description on his PR:
"""
a relatively small change to optimize network send order.
Without this, network writes tend to prioritize sending data for one
column to all peers before sending data for later columns (e.g for two
columns and 4 peers per column it would send A,A,A,A,B,B,B,B). With
batch publishing we can change the write order to round robin across
columns (e.g. A,B,A,B,A,B,A,B).
In cases where the process is sending at a rate over the network limit,
this approach allows at least some copies of the column to propagate
through the network. In early simulations with bandwidth limits of
50mbps for the publisher, this improved dissemination by ~20-30%.
"""
See the issue for some more context.
**Which issues(s) does this PR fix?**
Fixes https://github.com/OffchainLabs/prysm/issues/16129
Other notes for review
Acknowledgements
- [x] I have read
[CONTRIBUTING.md](https://github.com/prysmaticlabs/prysm/blob/develop/CONTRIBUTING.md).
- [ ] I have included a uniquely named [changelog fragment
file](https://github.com/prysmaticlabs/prysm/blob/develop/CONTRIBUTING.md#maintaining-changelogmd).
- [x] I have added a description with sufficient context for reviewers
to understand this PR.
- [ ] I have tested that my changes work as expected and I added a
testing plan to the PR description (if applicable).
---------
Co-authored-by: Marco Munizaga <git@marcopolo.io>
Co-authored-by: Kasey Kirkham <kasey@users.noreply.github.com>
Co-authored-by: kasey <489222+kasey@users.noreply.github.com>
Co-authored-by: Preston Van Loon <pvanloon@offchainlabs.com>
Co-authored-by: Manu NALEPA <enalepa@offchainlabs.com>
* wip
* fixing tests
* adding script to update workspace for eth clients
* updating test sepc to 1.6.0 and fixing broadcaster test
* fix specrefs
* more ethspecify fixes
* still trying to fix ethspecify
* fixing attestation tests
* fixing sha for consensus specs
* removing script for now until i have something more standard
* fixing more p2p tests
* fixing discovery tests
* attempting to fix discovery test flakeyness
* attempting to fix port binding issue
* more attempts to fix flakey tests
* Revert "more attempts to fix flakey tests"
This reverts commit 25e8183703.
* Revert "attempting to fix port binding issue"
This reverts commit 583df8000d.
* Revert "attempting to fix discovery test flakeyness"
This reverts commit 3c76525870.
* Revert "fixing discovery tests"
This reverts commit 8c701bf3b9.
* Revert "fixing more p2p tests"
This reverts commit 140d5db203.
* Revert "fixing attestation tests"
This reverts commit 26ded244cb.
* fixing attestation tests
* fixing more p2p tests
* fixing discovery tests
* attempting to fix discovery test flakeyness
* attempting to fix port binding issue
* more attempts to fix flakey tests
* changelog
* fixing import
* adding some missing dependencies, but TestService_BroadcastAttestationWithDiscoveryAttempts is still failing
* attempting to fix test
* reverting test as it migrated to other pr
* reverting test
* fixing test from merge
* Fix `TestService_BroadcastAttestationWithDiscoveryAttempts`.
* Fix again `TestService_Start_OnlyStartsOnce`.
* fixing TestListenForNewNodes
* removing manual set of fulu epoch
* missed a few
* fixing subnet test
* Update beacon-chain/rpc/eth/config/handlers_test.go
Co-authored-by: Preston Van Loon <pvanloon@offchainlabs.com>
* removing a few more missed spots of reverting fulu epoch setting
* updating test name based on feedback
* fixing rest apis, they actually need the setting of the epoch due to the guard
---------
Co-authored-by: Manu NALEPA <enalepa@offchainlabs.com>
Co-authored-by: Preston Van Loon <pvanloon@offchainlabs.com>
* PeerDAS: Implement sync
* Fix Potuz's comment.
* Fix Potuz's comment.
* Fix Potuz's comment.
* Fix Potuz's comment.
* Fix Potuz's comment.
* Implement `TestFetchDataColumnSidecarsFromPeers`.
* Implement `TestSelectPeers`.
* Fix James' comment.
* Fix flakiness in `TestSelectPeers`.
* Revert "Fix Potuz's comment."
This reverts commit c45230b455.
* Revert "Fix James' comment."
This reverts commit a3f919205a.
* `selectPeers`: Avoid map with key but empty value.
* Fix Potuz's comment.
* Add DataColumnStorage and SubscribeAllDataSubnets flag.
* getBlobsV2: retry if reconstruction isnt successful
* test: engine client and sync package, metrics
* lint: fmt and log capitalisation
* lint: return error when it is not nil
* config: make retry interval configurable
* sidecar: recover function and different context for retrying
* lint: remove unused field
* beacon: default retry interval
* reconstruct: load once, correctly deliver the result to all waiting goroutines
* reconstruct: simplify multi goroutine case and avoid race condition
* engine: remove isDataAlreadyAvailable function
* sync: no goroutine, getblobsv2 in absence of block as well, wrap error
* exec: hardcode retry interval
* da: non blocking checks
* sync: remove unwanted checks
* execution: fix test
* execution: retry atomicity test
* da: updated IsDataAvailable
* sync: remove unwanted tests
* bazel: bazel run //:gazelle -- fix
* blockchain: fix CustodyGroupCount return
* lint: formatting
* lint: lint and use unused metrics
* execution: retry logic inside ReconstructDataColumnSidecars itself
* lint: format
* execution: ensure the retry actually happens when it needs to
* execution: ensure single responsibility, execution should not do DA check
* sync: don't call ReconstructDataColumnSidecars if not required
* blockchain: move IsDataAvailable interface to blockchain package
* execution: make reconstructSingleflight part of the service struct
* blockchain: cleaner DA check
* lint: formatting and remove confusing comment
* sync: fix lint, test and add extra test for when data is actually not available
* sync: new appropriate mock service
* execution: edge case - delete activeRetries on success
* execution: use service context instead of function's for retry
* blockchain: get variable samplesPerSlot only when required
* remove redundant function and fix name
* fix test
* fix more tests
* put samplesPerSlot at appropriate place
* tidy up IsDataAvailable
* correct bad merge
* fix bad merge
* remove redundant flag option
* refactor to deduplicate sidecar construction code
* - Add godocs
- Rename some functions to be closer to the spec
- Add err in return of commitments
* Replace mutating public method (but only internally used) `Populate` but private not mutating method `extract`.
* Implement a unique `processDataColumnSidecarsFromExecution` instead 2 separate functions from block and from sidecar.
* `ReceiveBlock`: Wrap errors.
* Remove useless tests.
* `ConstructionPopulator`: Add tests.
* Fix tests
* Move functions to be consistent with blobs.
* `fetchCellsAndProofsFromExecution`: Avoid useless flattening.
* `processDataColumnSidecarsFromExecution`: Stop using DB cache.
---------
Co-authored-by: Manu NALEPA <enalepa@offchainlabs.com>
Co-authored-by: Kasey Kirkham <kasey@users.noreply.github.com>
* Add entry for sequence number in chain-metadata bucket & Basic getter/setter
* Mark p2p-metadata flag as deprecated
* Fix metaDataFromConfig: use DB instead to get seqnum
* Save sequence number after updating the metadata
* Fix beacon-chain/p2p unit tests: add DB in config
* Add changelog
* Add ReadOnlyDatabaseWithSeqNum
* Code suggestion from Manu
* Remove seqnum getter at interface
---------
Co-authored-by: james-prysm <90280386+james-prysm@users.noreply.github.com>
* Subnets subscription: Avoid dynamic subscribing blocking in case not enough peers per subnets are found.
* `subscribeWithParameters`: Use struct to avoid too many function parameters (no functional changes).
* Optimise subnets search.
Currently, when we are looking for peers in let's say data column sidecars subnets 3, 6 and 7, we first look for peers in subnet 3.
If, during the crawling, we meet some peers with subnet 6, we discard them (because we are exclusively looking for peers with subnet 3).
When we are happy, we start again with peers with subnet 6.
This commit optimizes that by looking for peers with satisfy our constraints in one look.
* Fix James' comment.
* Fix James' comment.
* Fix James' comment.
* Fix James' commnet.
* Fix James' comment.
* Fix James' comment.
* Fix James's comment.
* Simplify following James' comment.
* Fix James' comment.
* Update beacon-chain/sync/rpc_goodbye.go
Co-authored-by: Preston Van Loon <pvanloon@offchainlabs.com>
* Update config/params/config.go
Co-authored-by: Preston Van Loon <pvanloon@offchainlabs.com>
* Update beacon-chain/sync/subscriber.go
Co-authored-by: Preston Van Loon <pvanloon@offchainlabs.com>
* Fix Preston's comment.
* Fix Preston's comment.
* `TestService_BroadcastDataColumn`: Re-add sleep 50 ms.
* Fix Preston's comment.
* Update beacon-chain/p2p/subnets.go
Co-authored-by: Preston Van Loon <pvanloon@offchainlabs.com>
---------
Co-authored-by: Preston Van Loon <pvanloon@offchainlabs.com>
* Migrate Prysm repo to Offchain Labs organization ahead of Pectra upgrade v6
* Replace prysmaticlabs with OffchainLabs on general markdowns
* Update mock
* Gazelle and add mock.go to excluded generated mock file
* updating the goethereum dependency
* fixing dependencies
* reverting workspace
* more fixes, work in progress
* trying with upgraded geth version
* fixing deprecated functions except for the time related ones on eth1 distance due to time issues
* fixing time issues
* gaz
* fixing test and upgrading some dependencies and reverting others
* Disable cgo in hid, delete old vendored usb library
* changelog
* rolling back dependencies
* fixing go mod tidy
* Geth v1.13.6
* fix tests
* Add ping interval, set to 500ms for tests. This didnt work
* Update to v1.14.8
* Spread it out to different bootnodes
* Fix it
* Remove Memsize
* Update all out of date dependencies
* Fix geth body change
* Fix Test
* Fix Build
* Fix Tests
* Fix Tests Again
* Fix Tests Again
* Fix Tests
* Fix Test
* Copy USB Package for HID
* Push it up
* Finally fix all tests with felix's changes
* updating geth dependency
* Update go-ethereum to v1.14.11
* fixing import
* reverting blob change
* fixing Implicit memory aliasing in for loop.
* WIP changes
* wip getting a little further on e2e runs
* getting a little further
* getting a little further
* setting everything to capella
* more partial fixes
* more fixes but still WIP
* fixing access list transactions"
* some cleanup
* making configs dynamic
* reverting time
* skip lower bound in builder
* updating to geth v1.14.12
* fixing verify blob to pointer
* go mod tidy
* fixing linting
* missed removing another terminal difficulty item
* another missed update
* updating more dependencies to fix cicd
* fixing holiman dependency update
* downgrading geth to 1.14.11 due to p2p loop issue
* reverting builder middleware caused by downgrade
* fixing more rollback issues
* upgrading back to 1.14.12 after discussing with preston
* mod tidy
* gofmt
* partial review feedback
* trying to start e2e from bellatrix instead
* reverting some changes
---------
Co-authored-by: Preston Van Loon <preston@pvl.dev>
Co-authored-by: nisdas <nishdas93@gmail.com>
Rationale:
Before this commit, the internal loop exited if:
- the expected amount of peers is found, or,
- the iterator returns `false` (exhaustion), or
- `batchSize` iterations are done.
The issue with the iterations count is, in case not enough
peer are found AND `iterator.Next` always returns `true`,
we don't control WHEN the loop is going to stop.
The root cause is we don't control the time needed to
run the `iterator.Next` function, which is a function of
`devp2P (geth)`.
The value of `batchSize (2000)` was chosen arbitrarily.
It turns out the time needed to run `iterator.Next` can go from a few micro seconds to a few hundreds of milliseconds.
==> In small networks (example: E2E tests), it was possible for the loop not to exit during several dozen of seconds.
With this commit, we replace the `batchSize` by a `batchPeriod`, ensuring the loop will never
run longer than `batchPeriod`, even in a small network.
Co-authored-by: Nishant Das <nishdas93@gmail.com>
* `listenForNewNodes` and `FindPeersWithSubnet`: Stop using `Readnodes` and use iterator instead.
It avoids infinite loop in small devnets.
* Update beacon-chain/p2p/discovery.go
Co-authored-by: Sammy Rosso <15244892+saolyn@users.noreply.github.com>
---------
Co-authored-by: Sammy Rosso <15244892+saolyn@users.noreply.github.com>
* Add Current Changes To Routine
* Add In New Test
* Add Feature Flag
* Add Discovery Rebooter feature
* Do Not Export Mutex And Use Zero Value Mutex
* Wrap Error For Better Debugging
* Fix Function Name and Add Specific Test For it
* Manu's Review
* `subscribeStaticWithSubnets`: Fix docstring.
* `buildOptions`: Avoid `options` mutations.
* `dv5Cfg`: Avoid mutation.
* `RefreshENR`: Use default for all but Phase0.
* `udp4`, `udp6`: Create enum.
* `p2p.Config`: `BootstrapNodeAddr`==> `BootstrapNodeAddrs`.
* `p2p.Config`: `Discv5BootStrapAddr` ==> `Discv5BootStrapAddrs`.
* `TestScorers_BadResponses_Score`: Improve.
* `BeaconNode`: Avoid mutation.
* `TestStore_TrustedPeers`: Remove blankline.
* Remove blank identifiers.
* `privKey`: Keep the majority of code with low indentation.
* `P2PPreregistration`: Return error instead of fatal log.
* `parseBootStrapAddrs` => `ParseBootStrapAddrs` (export)
* `p2p.Config`: Remove `BootstrapNodeAddrs`.
* `NewService`: Avoid mutation when possible.
* `Service`: Remove blank identifier.
* `buildOptions`: Avoid `log.Fatalf` (make deepsource happy).
* `registerGRPCGateway`: Use `net.JoinHostPort` (make deepsource happy).
* `registerBuilderService`: Make deepsource happy.
* `scorers`: Add `NoLock` suffix (make deepsource happy).
* `scorerr`: Add some `NoLock`suffixes (making deepsource happy).
* `discovery_test.go`. Remove init.
Rationale:
`rand.Seed` is deprecated: As of Go 1.20 there is no reason to call Seed with a random value. Programs that call Seed with a known value to get a specific sequence of results should use New(NewSource(seed)) to obtain a local random generator.
This makes deepsource happy as well.
* `createListener`: Reduce cyclomatic complexity (make deepsource happy).
* `startDB`: Reduce cyclomatic complexity (make deepsource happy).
* `main`: Log a FATAL on error.
This way, the error message is very readable.
Before this commit, the error message is the less readable
message in the logs.
* `New`: Reduce cyclomatic complexity (make deepsource happy).
* `main`: Avoid `App` mutation, and make deepsource happy.
* Update beacon-chain/node/node.go
Co-authored-by: Sammy Rosso <15244892+saolyn@users.noreply.github.com>
* `bootnodes` ==> `BootNodes` (Fix PR comment).
* Remove duplicate `configureFastSSZHashingAlgorithm` since already done in `configureBeacon`. (Fix PR comment)
* Add `TestCreateLocalNode`. (PR comment fix.)
* `startModules` ==> `startBaseServices (Fix PR comment).
* `buildOptions` return errors consistently.
* `New`: Change ordering.
---------
Co-authored-by: Sammy Rosso <15244892+saolyn@users.noreply.github.com>
* First take at updating everything to v5
* Patch gRPC gateway to use prysm v5
Fix patch
* Update go ssz
---------
Co-authored-by: Preston Van Loon <pvanloon@offchainlabs.com>
* Update libp2p to support go 1.19
* gaz
* go mod tidy
* Only update the minimum deps
* go mod tidy
* revert .bazelrc
* Update go-libp2p to v0.22.0 and update import paths (#11440)
* Fix import paths
* Fix go-libp2p-peerstore import
* Bazel updates
* fix
* revert some comments changes
* revert some comment stuff
* fix dependency issues
* vendor problematic library
* use your brain
* remove
* tests
Co-authored-by: Marco Munizaga <marco@marcopolo.io>
Co-authored-by: Nishant Das <nishdas93@gmail.com>
* Value assigned to a variable is never read before being overwritten
* The result of append is not used anywhere
* Suspicious assignment of range-loop vars detected
* Unused method receiver detected
* Revert "Auxiliary commit to revert individual files from 54edcb445484a2e5d79612e19af8e949b8861253"
This reverts commit bbd1e1beabf7b0c5cfc4f514dcc820062ad6c063.
* Method modifies receiver
* Fix test
* Duplicate imports detected
* Incorrectly formatted error string
* Types of function parameters can be combined
* One more "Unused method receiver detected"
* Unused parameter detected in function