* wip
* fixing tests
* adding script to update workspace for eth clients
* updating test spec to 1.6.0 and fixing broadcaster test
* fix specrefs
* more ethspecify fixes
* still trying to fix ethspecify
* fixing attestation tests
* fixing sha for consensus specs
* removing script for now until i have something more standard
* fixing more p2p tests
* fixing discovery tests
* attempting to fix discovery test flakeyness
* attempting to fix port binding issue
* more attempts to fix flakey tests
* Revert "more attempts to fix flakey tests"
This reverts commit 25e8183703.
* Revert "attempting to fix port binding issue"
This reverts commit 583df8000d.
* Revert "attempting to fix discovery test flakeyness"
This reverts commit 3c76525870.
* Revert "fixing discovery tests"
This reverts commit 8c701bf3b9.
* Revert "fixing more p2p tests"
This reverts commit 140d5db203.
* Revert "fixing attestation tests"
This reverts commit 26ded244cb.
* fixing attestation tests
* fixing more p2p tests
* fixing discovery tests
* attempting to fix discovery test flakeyness
* attempting to fix port binding issue
* more attempts to fix flakey tests
* changelog
* fixing import
* adding some missing dependencies, but TestService_BroadcastAttestationWithDiscoveryAttempts is still failing
* attempting to fix test
* reverting test as it migrated to other pr
* reverting test
* fixing test from merge
* Fix `TestService_BroadcastAttestationWithDiscoveryAttempts`.
* Fix again `TestService_Start_OnlyStartsOnce`.
* fixing TestListenForNewNodes
* removing manual set of fulu epoch
* missed a few
* fixing subnet test
* Update beacon-chain/rpc/eth/config/handlers_test.go
Co-authored-by: Preston Van Loon <pvanloon@offchainlabs.com>
* removing a few more missed spots of reverting fulu epoch setting
* updating test name based on feedback
* fixing rest apis, they actually need the setting of the epoch due to the guard
---------
Co-authored-by: Manu NALEPA <enalepa@offchainlabs.com>
Co-authored-by: Preston Van Loon <pvanloon@offchainlabs.com>
* Add a lock for p2p computation of active validator count and limit only to topics that need it.
* Changelog fragment
* Update gossip_scoring_params.go
Wrap errors
* Define TCP and QUIC as `InternetProtocol` (no functional change).
* Group types. (No functional changes)
* Rename variables and use range syntax.
* Add `p2pMaxPeers` and `p2pPeerCountDirectionType` metrics
* `p2p_subscribed_topic_peer_total`: Reset to avoid dangling values.
* `validateConfig`:
- Use `Warning` with fields instead of `Warnf`.
- Avoid both modifying the input value in place and returning it.
* Add `p2p_minimum_peers_per_subnet` metric.
* `beaconConfig` => `cfg`.
https://github.com/OffchainLabs/prysm/pull/15880#discussion_r2436826215
* Add changelog
---------
Co-authored-by: james-prysm <90280386+james-prysm@users.noreply.github.com>
* Update Earliest available slot when pruning
* bazel run //:gazelle -- fix
* custodyUpdater interface to avoid import cycle
* bazel run //:gazelle -- fix
* simplify test
* separation of concerns
* debug log for updating eas
* UpdateEarliestAvailableSlot function in CustodyManager
* fix test
* UpdateEarliestAvailableSlot function for FakeP2P
* lint
* UpdateEarliestAvailableSlot instead of UpdateCustodyInfo + check for Fulu
* fix test and lint
* bugfix: enforce minimum retention period in pruner
* remove MinEpochsForBlockRequests function and use from config
* remove modifying earliest_available_slot after data column pruning
* correct earliestAvailableSlot validation: allow backfill decrease but prevent increase within MIN_EPOCHS_FOR_BLOCK_REQUESTS
* lint
* bazel run //:gazelle -- fix
* lint and remove unwanted debug logs
* Return a wrapped error, and let the caller decide what to do
* fix tests because updateEarliestSlot returns error now
* avoid re-doing computation in the test function
* lint and correct changelog
* custody updater should be a mandatory part of the pruner service
* ensure never increase eas if we are in the block requests window
* slot level granularity edge case
* update the value stored in the DB
* log tidy up
* use errNoCustodyInfo
* allow earliestAvailableSlot edit when custodyGroupCount doesn't change
* undo the minimal config change
* add context to CustodyGroupCount after merging from develop
* cosmetic change
* shift responsibility from caller to callee, protection for updateEarliestSlot. UpdateEarliestAvailableSlot returns cgc
* allow increase in earliestAvailableSlot only when custodyGroupCount also increases
* remove CustodyGroupCount as it is no longer needed as UpdateEarliestAvailableSlot returns cgc now
* proper place for log and name refactor
* test for Nil custody info
* allow decreasing earliest slot in DB (just like in memory)
* invert if statement to make more readable
* UpdateEarliestAvailableSlot for DB (equivalent of p2p's UpdateEarliestAvailableSlot) & undo changes made to UpdateCustodyInfo
* in UpdateEarliestAvailableSlot, no need to return unused values
* no need to log stored group count
* log.WithField instead of log.WithFields
* `findPeersWithSubnets`: If the `filter` function returns an error for a given peer, log an error and skip the peer instead of aborting the whole function.
* `computeIndicesByRootByPeer`: If the loop returns an error for a given peer, log an error and skip the peer instead of aborting the whole function.
* Add changelog.
---------
Co-authored-by: james-prysm <90280386+james-prysm@users.noreply.github.com>
* Revert "`createLocalNode`: Wait before retrying to retrieve the custody group count if not present. (#15735)"
This reverts commit 4585cdc932.
* Revert "Fix no custody info available at start (#15732)"
This reverts commit 80eba4e6dd.
* Add context to `EarliestAvailableSlot` and `CustodyGroupCount` (no functional change).
* Remove double imports.
* `EarliestAvailableSlot` and `CustodyGroupCount`: Wait for custody info to be initialized.
* ignore version/digest mismatch if far future
* bonus: this log generates a lot of noise, bump it down to trace
* unit test
---------
Co-authored-by: Kasey Kirkham <kasey@users.noreply.github.com>
* Change wrap message to avoid the could not...: could not...: could not... effect.
Reference: https://github.com/uber-go/guide/blob/master/style.md#error-wrapping.
* Log: remove period at the end of the latest sentence.
* Dirty quick fix to ensure that the custody group count is set at P2P service start.
A real fix would involve a chan to implement a proper synchronization scheme.
* Add changelog.
* `Broadcasted data column sidecar` log: Add `blobCount`.
* `broadcastAndReceiveDataColumns`: Broadcast and receive data columns in parallel.
* `ProposeBeaconBlock`: First broadcast/receive block, and then sidecars.
* `broadcastReceiveBlock`: Add log.
* Add changelog
* Fix deadlock-option 1.
* Fix deadlock-option 2.
* Take notifier out of the critical section
* only compute common info once, for all sidecars
---------
Co-authored-by: Kasey Kirkham <kasey@users.noreply.github.com>
* PeerDAS: Implement sync
* Fix Potuz's comment.
* Fix Potuz's comment.
* Fix Potuz's comment.
* Fix Potuz's comment.
* Fix Potuz's comment.
* Implement `TestFetchDataColumnSidecarsFromPeers`.
* Implement `TestSelectPeers`.
* Fix James' comment.
* Fix flakiness in `TestSelectPeers`.
* Revert "Fix Potuz's comment."
This reverts commit c45230b455.
* Revert "Fix James' comment."
This reverts commit a3f919205a.
* `selectPeers`: Avoid map with key but empty value.
* Fix Potuz's comment.
* Add DataColumnStorage and SubscribeAllDataSubnets flag.
* getBlobsV2: retry if reconstruction isn't successful
* test: engine client and sync package, metrics
* lint: fmt and log capitalisation
* lint: return error when it is not nil
* config: make retry interval configurable
* sidecar: recover function and different context for retrying
* lint: remove unused field
* beacon: default retry interval
* reconstruct: load once, correctly deliver the result to all waiting goroutines
* reconstruct: simplify multi goroutine case and avoid race condition
* engine: remove isDataAlreadyAvailable function
* sync: no goroutine, getblobsv2 in absence of block as well, wrap error
* exec: hardcode retry interval
* da: non blocking checks
* sync: remove unwanted checks
* execution: fix test
* execution: retry atomicity test
* da: updated IsDataAvailable
* sync: remove unwanted tests
* bazel: bazel run //:gazelle -- fix
* blockchain: fix CustodyGroupCount return
* lint: formatting
* lint: lint and use unused metrics
* execution: retry logic inside ReconstructDataColumnSidecars itself
* lint: format
* execution: ensure the retry actually happens when it needs to
* execution: ensure single responsibility, execution should not do DA check
* sync: don't call ReconstructDataColumnSidecars if not required
* blockchain: move IsDataAvailable interface to blockchain package
* execution: make reconstructSingleflight part of the service struct
* blockchain: cleaner DA check
* lint: formatting and remove confusing comment
* sync: fix lint, test and add extra test for when data is actually not available
* sync: new appropriate mock service
* execution: edge case - delete activeRetries on success
* execution: use service context instead of function's for retry
* blockchain: get variable samplesPerSlot only when required
* remove redundant function and fix name
* fix test
* fix more tests
* put samplesPerSlot at appropriate place
* tidy up IsDataAvailable
* correct bad merge
* fix bad merge
* remove redundant flag option
* refactor to deduplicate sidecar construction code
* - Add godocs
- Rename some functions to be closer to the spec
- Add err in return of commitments
* Replace the mutating public (but only internally used) method `Populate` with the private, non-mutating method `extract`.
* Implement a unique `processDataColumnSidecarsFromExecution` instead of 2 separate functions (one from the block and one from the sidecar).
* `ReceiveBlock`: Wrap errors.
* Remove useless tests.
* `ConstructionPopulator`: Add tests.
* Fix tests
* Move functions to be consistent with blobs.
* `fetchCellsAndProofsFromExecution`: Avoid useless flattening.
* `processDataColumnSidecarsFromExecution`: Stop using DB cache.
---------
Co-authored-by: Manu NALEPA <enalepa@offchainlabs.com>
Co-authored-by: Kasey Kirkham <kasey@users.noreply.github.com>
* `computeIndicesByRootByPeer`: Add 1 slack epoch regarding peer head slot.
* `FetchDataColumnSidecars`: Switch mode.
Before this commit, this function returned an error if at least ONE requested sidecar could not be retrieved.
Now, this function retrieves what it can (best effort mode) and returns an additional value which is the map of missing sidecars after running this function.
It is now the role of the caller to check this extra returned value and decide what to do in case some requested sidecars are still missing.
* `fetchOriginDataColumnSidecars`: Optimize
Before this commit, when running `fetchOriginDataColumnSidecars`, all the missing sidecars had to be retrieved in a single shot for the sidecars to be considered available. The issue was that if, for example, `sync.FetchDataColumnSidecars` returned all but one sidecar, the returned sidecars were NOT saved, and on the next iteration, all the previously fetched sidecars had to be requested again (from peers).
After this commit, we greedily save all fetched sidecars, solving this issue.
* Initial sync: Do not fetch data column sidecars before the retention period.
* Implement perfect peerdas syncing.
* Add changelog.
* Fix James' comment.
* Fix James' comment.
* Fix James' comment.
* Fix James' comment.
* Fix James' comment.
* Fix James' comment.
* Fix James' comment.
* Update beacon-chain/sync/data_column_sidecars.go
Co-authored-by: Potuz <potuz@prysmaticlabs.com>
* Update beacon-chain/sync/data_column_sidecars.go
Co-authored-by: Potuz <potuz@prysmaticlabs.com>
* Update beacon-chain/sync/data_column_sidecars.go
Co-authored-by: Potuz <potuz@prysmaticlabs.com>
* Update after Potuz's comment.
* Fix Potuz's commit.
* Fix James' comment.
---------
Co-authored-by: Potuz <potuz@prysmaticlabs.com>
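The best-effort mode of `FetchDataColumnSidecars` and the greedy saving in `fetchOriginDataColumnSidecars` described above can be sketched as follows. This is a simplified illustration, not the actual Prysm code: `Root`, `fetchBestEffort`, `fetchAndStore`, and the `available` maps that simulate what peers can serve are all hypothetical.

```go
package main

import "fmt"

// Root is a simplified stand-in for a block root (hypothetical type).
type Root string

// fetchBestEffort returns the column indices that could be retrieved and,
// separately, the ones that are still missing. It does not fail just because
// some sidecars are unavailable. `available` simulates what peers can serve.
func fetchBestEffort(requested map[Root][]uint64, available map[Root]map[uint64]bool) (fetched, missing map[Root][]uint64) {
	fetched = make(map[Root][]uint64)
	missing = make(map[Root][]uint64)
	for root, indices := range requested {
		for _, idx := range indices {
			if available[root][idx] {
				fetched[root] = append(fetched[root], idx)
			} else {
				missing[root] = append(missing[root], idx)
			}
		}
	}
	return fetched, missing
}

// fetchAndStore saves whatever each round returns and re-requests only what
// is still missing, so already fetched sidecars are never requested twice.
func fetchAndStore(requested map[Root][]uint64, rounds []map[Root]map[uint64]bool) (stored, missing map[Root][]uint64) {
	stored = make(map[Root][]uint64)
	missing = requested
	for _, available := range rounds {
		if len(missing) == 0 {
			break
		}
		var fetched map[Root][]uint64
		fetched, missing = fetchBestEffort(missing, available)
		for root, indices := range fetched {
			stored[root] = append(stored[root], indices...)
		}
	}
	return stored, missing
}

func main() {
	requested := map[Root][]uint64{"a": {0, 1, 2}}
	rounds := []map[Root]map[uint64]bool{
		{"a": {0: true, 2: true}}, // first round: column 1 is unavailable
		{"a": {1: true}},          // second round: only column 1 is requested
	}
	stored, missing := fetchAndStore(requested, rounds)
	fmt.Println(len(stored["a"]), len(missing))
}
```

The caller decides what to do with the returned `missing` map, which mirrors the shift of responsibility described in the commit message.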
* Add flag for colocation whitelisting. --p2p-ip-colocation-whitelist
This change updates the peer IP colocation checking to respect the
configured CIDR whitelist (--p2p-ip-colocation-whitelist flag).
Changes:
- Added IPColocationWhitelist field to peers.StatusConfig
- Added ipColocationWhitelist field to Status struct to store parsed IPNets
- Parse CIDR strings into net.IPNet in NewStatus constructor
- Updated isfromBadIP method to skip colocation limits for whitelisted IPs
- Pass IPColocationWhitelist from Service config when creating Status
The IP colocation whitelist allows operators to exempt specific IP ranges
from the colocation limit, useful for deployments with known trusted
address ranges or legitimate node clustering.
Only check if an IP is in the whitelist when the colocation limit
is actually exceeded, rather than checking for every IP. This is
more efficient and matches the intended behavior.
* Changelog fragment
* Apply suggestion from @nalepae
Co-authored-by: Manu NALEPA <enalepa@offchainlabs.com>
* Apply suggestion from @nalepae
Co-authored-by: Manu NALEPA <enalepa@offchainlabs.com>
* @kasey feedback: Move IP colocation parsing to the node construction
---------
Co-authored-by: Manu NALEPA <enalepa@offchainlabs.com>
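A minimal sketch of the whitelist logic described above, assuming a simplified per-IP peer count. `parseWhitelist`, `isBadIP`, and the `peersPerIP` map are hypothetical names for illustration and are not the actual `peers.Status` implementation; only `net.ParseCIDR`, `net.ParseIP`, and `(*net.IPNet).Contains` are real standard library calls.

```go
package main

import (
	"fmt"
	"net"
)

// parseWhitelist converts CIDR strings into networks once, at construction time.
func parseWhitelist(cidrs []string) ([]*net.IPNet, error) {
	nets := make([]*net.IPNet, 0, len(cidrs))
	for _, c := range cidrs {
		_, ipNet, err := net.ParseCIDR(c)
		if err != nil {
			return nil, fmt.Errorf("parse CIDR %q: %w", c, err)
		}
		nets = append(nets, ipNet)
	}
	return nets, nil
}

// isBadIP reports whether an IP exceeds the colocation limit. The whitelist
// is consulted only when the limit is actually exceeded.
func isBadIP(ipStr string, peersPerIP map[string]int, limit int, whitelist []*net.IPNet) bool {
	if peersPerIP[ipStr] <= limit {
		return false
	}
	ip := net.ParseIP(ipStr)
	if ip == nil {
		return true // assumption of this sketch: an unparsable address is not exempt
	}
	for _, ipNet := range whitelist {
		if ipNet.Contains(ip) {
			return false
		}
	}
	return true
}

func main() {
	whitelist, err := parseWhitelist([]string{"10.0.0.0/8"})
	if err != nil {
		panic(err)
	}
	counts := map[string]int{"10.1.2.3": 10, "8.8.8.8": 10, "1.1.1.1": 2}
	for _, ip := range []string{"10.1.2.3", "8.8.8.8", "1.1.1.1"} {
		fmt.Println(ip, isBadIP(ip, counts, 5, whitelist))
	}
}
```

Checking the count first keeps the common path cheap, since most IPs never reach the limit and never trigger a whitelist scan.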
* `startBaseServices`: Warm data column storage cache.
* `TestFindPeers_NodeDeduplication`: Use `t.context`.
* `BUILD.bazel`: Move `# gazelle.ignore` to the top of the file.
Rationale: This directive is applied to the whole file, regardless of its position in the file.
* Improve `TestConstructGenericBeaconBlock`: Courtesy of Terence
* Add `TestDataColumnStoragePath_FlagSpecified`.
* `appFlags`: Move `flags.SubscribeAllDataSubnets` (cosmetic).
* `appFlags`: Add `storage.DataColumnStoragePathFlag`.
* Add changelog.
* adding what I think could be a fix for find peer
* removing unneeded comment
* unit tests
* linting
* gofmt
* changelog
* Update beacon-chain/p2p/discovery_test.go
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* Update changelog/james-prysm_fix-find-peers.md
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* fixing test import
* applying suggestions
* fixing typo
* manu feedback
* accidentally checked in files
* addressing manu's edgecase, old bug
* moving tests from service-test.go to subnets_test.go and adding coverage for receiving bad existing node with higher seq
* cleanup
* updating for clarity
* missingPeerCount should increment if we are removing the peer from map
* manu's recommendations on defective subnet rollback edge case
* rollback introduced too much complication as well as a new bug so we are removing it
---------
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* Add entry for sequence number in chain-metadata bucket & Basic getter/setter
* Mark p2p-metadata flag as deprecated
* Fix metaDataFromConfig: use DB instead to get seqnum
* Save sequence number after updating the metadata
* Fix beacon-chain/p2p unit tests: add DB in config
* Add changelog
* Add ReadOnlyDatabaseWithSeqNum
* Code suggestion from Manu
* Remove seqnum getter at interface
---------
Co-authored-by: james-prysm <90280386+james-prysm@users.noreply.github.com>
* Log when downscoring a peer.
* `validateSequenceNumber`: Downscore peer in function, clarify and add logs
* `AddConnectionHandler`: Send majority code to the outer scope (no functional change).
* `disconnectBadPeer`: Improve log.
* `sendRPCStatusRequest`: Improve log.
* `findPeersWithSubnets`: Add preventive peer filtering.
(As done in `s.findPeers`.)
* `Stop`: Use one `defer` for the whole function.
Reminder: `defer`s are executed backwards.
* `Stop`: Send a goodbye message to all connected peers when stopping the service.
Before this commit, stopping the service did not send a goodbye message to the connected peers. The issue with this approach is that the peers still think we are alive, and behave accordingly by trying to communicate with us. Unfortunately, because we are offline, we cannot respond. Because of that, the peers start to downscore us, and then ban us. As a consequence, when we restart, the peers refuse our connection requests.
By sending a goodbye message when stopping the service, we ensure the peers stop expecting anything from us. When restarting, everything is all right.
* `ConnectedF` and `DisconnectedF`: Workaround very probable libp2p bug by preventing outbound connection to very recently disconnected peers.
* Fix James' comment.
* Fix James' comment.
* Fix James' comment.
* Fix James' comment.
* Fix James' comment.
* `AddDisconnectionHandler`: Handle multiple close calls to `DisconnectedF` for the same peer.
* Subnets subscription: Avoid blocking dynamic subscription when not enough peers per subnet are found.
* `subscribeWithParameters`: Use struct to avoid too many function parameters (no functional changes).
* Optimise subnets search.
Currently, when we are looking for peers in let's say data column sidecars subnets 3, 6 and 7, we first look for peers in subnet 3.
If, during the crawling, we meet some peers with subnet 6, we discard them (because we are exclusively looking for peers with subnet 3).
When we are happy, we start again with peers with subnet 6.
This commit optimizes that by looking for peers that satisfy our constraints in a single pass.
* Fix James' comment.
* Fix James' comment.
* Fix James' comment.
* Fix James' comment.
* Fix James' comment.
* Fix James' comment.
* Fix James's comment.
* Simplify following James' comment.
* Fix James' comment.
* Update beacon-chain/sync/rpc_goodbye.go
Co-authored-by: Preston Van Loon <pvanloon@offchainlabs.com>
* Update config/params/config.go
Co-authored-by: Preston Van Loon <pvanloon@offchainlabs.com>
* Update beacon-chain/sync/subscriber.go
Co-authored-by: Preston Van Loon <pvanloon@offchainlabs.com>
* Fix Preston's comment.
* Fix Preston's comment.
* `TestService_BroadcastDataColumn`: Re-add sleep 50 ms.
* Fix Preston's comment.
* Update beacon-chain/p2p/subnets.go
Co-authored-by: Preston Van Loon <pvanloon@offchainlabs.com>
---------
Co-authored-by: Preston Van Loon <pvanloon@offchainlabs.com>
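The single-pass subnet search described above can be sketched as follows. This is an illustration under simplifying assumptions, not the actual discovery code: `peer`, `findPeersForSubnets`, and the pre-collected `crawl` slice are hypothetical, and `minPerSubnet` is assumed to be positive.

```go
package main

import "fmt"

// peer is a hypothetical, simplified view of a discovered node.
type peer struct {
	id      string
	subnets map[uint64]bool
}

// findPeersForSubnets walks the crawl results once and credits each peer to
// every wanted subnet it serves, instead of crawling once per subnet.
func findPeersForSubnets(crawl []peer, wanted []uint64, minPerSubnet int) map[uint64][]string {
	result := make(map[uint64][]string, len(wanted))
	remaining := make(map[uint64]int, len(wanted))
	for _, s := range wanted {
		remaining[s] = minPerSubnet
	}
	for _, p := range crawl {
		if len(remaining) == 0 {
			break // every subnet has enough peers
		}
		for s := range remaining {
			if !p.subnets[s] {
				continue
			}
			result[s] = append(result[s], p.id)
			remaining[s]--
			if remaining[s] == 0 {
				delete(remaining, s) // deleting during range is safe in Go
			}
		}
	}
	return result
}

func main() {
	crawl := []peer{
		{id: "p1", subnets: map[uint64]bool{6: true}},
		{id: "p2", subnets: map[uint64]bool{3: true, 7: true}},
		{id: "p3", subnets: map[uint64]bool{3: true}},
	}
	fmt.Println(findPeersForSubnets(crawl, []uint64{3, 6, 7}, 1))
}
```

In this example, the peer serving subnet 6 is kept even though it is met first, and the crawl stops before `p3` because all three subnets are already satisfied.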
* Convert genesis times from seconds to time.Time
* Fixing failed forkchoice tests in a new commit so it doesn't get worse
Fixing failed spectest tests in a new commit so it doesn't get worse
Fixing forkchoice tests, then spectests
* Fixing forkchoice tests, then spectests. Now asking for help...
* Fix TestForkChoice_GetProposerHead
* Fix broken build
* Resolve TODO(preston) items
* Changelog fragment
* Resolve TODO(preston) items again
* Resolve lint issues
* Use consistent field names for sinceSlotStart (no spaces)
* Manu's feedback
* Renamed StartTime -> UnsafeStartTime, marked as deprecated because it doesn't handle overflow scenarios.
Renamed SlotTime -> StartTime
Renamed SlotAt -> At
Handled the error in cases where StartTime was used.
@james-prysm feedback
* Revert beacon-chain/blockchain/receive_block_test.go from 1b7844de
* Fixing issues after rebase
* Accepted suggestions from @potuz
* Remove CanonicalHeadSlot from merge conflicts
---------
Co-authored-by: potuz <potuz@prysmaticlabs.com>
* Add log capitalization analyzer and apply fixes across codebase
Implements a new nogo analyzer to enforce proper log message capitalization and applies the fixes to all affected log statements throughout the beacon chain, validator, and supporting components.
Co-Authored-By: Claude <noreply@anthropic.com>
* Radek's feedback
---------
Co-authored-by: Claude <noreply@anthropic.com>