In some cases the head state is guaranteed to have the same shuffling and
active indices as the target checkpoint's state, namely when the previous
dependent root coincides with the target checkpoint's.
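A minimal sketch of that guard, assuming a hypothetical helper and placeholder roots; the actual check in Prysm lives in the attestation/state-fetching code paths:
```
package main

import "fmt"

// canUseHeadShuffling sketches the guard described above: when the dependent
// root that seeded the target epoch's shuffling is the same for the head
// state and for the target checkpoint, the head's active indices and
// committee shuffling can be reused. Both arguments are placeholders.
func canUseHeadShuffling(headDependentRoot, targetDependentRoot [32]byte) bool {
	return headDependentRoot == targetDependentRoot
}

func main() {
	var a, b [32]byte
	fmt.Println(canUseHeadShuffling(a, b)) // true: identical roots
}
```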
**What type of PR is this?**
Other
**What does this PR do? Why is it needed?**
This pull request removes `NUMBER_OF_COLUMNS` and
`MAX_CELLS_IN_EXTENDED_MATRIX` configuration.
**Other notes for review**
Please read commit by commit, with commit messages.
**Acknowledgements**
- [x] I have read
[CONTRIBUTING.md](https://github.com/prysmaticlabs/prysm/blob/develop/CONTRIBUTING.md).
- [x] I have included a uniquely named [changelog fragment
file](https://github.com/prysmaticlabs/prysm/blob/develop/CONTRIBUTING.md#maintaining-changelogmd).
- [x] I have added a description to this PR with sufficient context for
reviewers to understand this PR.
**What type of PR is this?**
Feature
**What does this PR do? Why is it needed?**
| Feature | Semi-Supernode | Supernode |
| ----------------------- | ------------------------- | ------------------------ |
| **Custody Groups** | 64 | 128 |
| **Data Columns** | 64 | 128 |
| **Storage** | ~50% | ~100% |
| **Blob Reconstruction** | Yes (via Reed-Solomon) | No reconstruction needed |
| **Flag** | `--semi-supernode` | `--supernode` |
| **Can serve all blobs** | Yes (with reconstruction) | Yes (directly) |
**Note:** if your validators' total effective balance results in a higher
custody requirement than the semi-supernode's, that requirement overrides
the semi-supernode default.
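A rough sketch of that override, assuming a hypothetical `validatorCustodyRequirement` derived from total effective balance; the actual flag handling in Prysm may differ:
```
package main

import "fmt"

const semiSupernodeCustodyGroups = 64 // custody groups implied by --semi-supernode

// effectiveCustodyGroups sketches the note above: the node custodies the
// larger of the semi-supernode target and whatever the validators' total
// effective balance requires. validatorCustodyRequirement is hypothetical.
func effectiveCustodyGroups(validatorCustodyRequirement uint64) uint64 {
	if validatorCustodyRequirement > semiSupernodeCustodyGroups {
		return validatorCustodyRequirement
	}
	return semiSupernodeCustodyGroups
}

func main() {
	fmt.Println(effectiveCustodyGroups(96)) // 96: validator requirement wins
	fmt.Println(effectiveCustodyGroups(8))  // 64: semi-supernode floor wins
}
```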
cgc=64 from @nalepae
Pro:
- We are useful to the network
- Lower likelihood of disconnection
- Straightforward to implement
Con:
- We cannot revert to a full node
- We have to serve incoming RPC requests corresponding to 64 columns
Tested the following using this Kurtosis setup:
```
participants:
# Super-nodes
- el_type: geth
el_image: ethpandaops/geth:master
cl_type: prysm
vc_image: gcr.io/offchainlabs/prysm/validator:latest
cl_image: gcr.io/offchainlabs/prysm/beacon-chain:latest
count: 2
cl_extra_params:
- --supernode
vc_extra_params:
- --verbosity=debug
# Full-nodes
- el_type: geth
el_image: ethpandaops/geth:master
cl_type: prysm
vc_image: gcr.io/offchainlabs/prysm/validator:latest
cl_image: gcr.io/offchainlabs/prysm/beacon-chain:latest
count: 2
validator_count: 1
cl_extra_params:
- --semi-supernode
vc_extra_params:
- --verbosity=debug
additional_services:
- dora
- spamoor
spamoor_params:
image: ethpandaops/spamoor:master
max_mem: 4000
spammers:
- scenario: eoatx
config:
throughput: 200
- scenario: blobs
config:
throughput: 20
network_params:
fulu_fork_epoch: 0
withdrawal_type: "0x02"
preset: mainnet
global_log_level: debug
```
```
curl -H "Accept: application/json" http://127.0.0.1:32961/eth/v1/node/identity
{"data":{"peer_id":"16Uiu2HAm7xzhnGwea8gkcxRSC6fzUkvryP6d9HdWNkoeTkj6RSqw","enr":"enr:-Ni4QIH5u2NQz17_pTe9DcCfUyG8TidDJJjIeBpJRRm4ACQzGBpCJdyUP9eGZzwwZ2HS1TnB9ACxFMQ5LP5njnMDLm-GAZqZEXjih2F0dG5ldHOIAAAAAAAwAACDY2djQIRldGgykLZy_whwAAA4__________-CaWSCdjSCaXCErBAAE4NuZmSEAAAAAIRxdWljgjLIiXNlY3AyNTZrMaECulJrXpSOBmCsQWcGYzQsst7r3-Owlc9iZbEcJTDkB6qIc3luY25ldHMFg3RjcIIyyIN1ZHCCLuA","p2p_addresses":["/ip4/172.16.0.19/tcp/13000/p2p/16Uiu2HAm7xzhnGwea8gkcxRSC6fzUkvryP6d9HdWNkoeTkj6RSqw","/ip4/172.16.0.19/udp/13000/quic-v1/p2p/16Uiu2HAm7xzhnGwea8gkcxRSC6fzUkvryP6d9HdWNkoeTkj6RSqw"],"discovery_addresses":["/ip4/172.16.0.19/udp/12000/p2p/16Uiu2HAm7xzhnGwea8gkcxRSC6fzUkvryP6d9HdWNkoeTkj6RSqw"],"metadata":{"seq_number":"3","attnets":"0x0000000000300000","syncnets":"0x05","custody_group_count":"64"}}}
```
```
curl -s http://127.0.0.1:32961/eth/v1/debug/beacon/data_column_sidecars/head | jq '.data | length'
64
```
```
curl -X 'GET' \
'http://127.0.0.1:32961/eth/v1/beacon/blobs/head' \
-H 'accept: application/json'
```
**Which issues(s) does this PR fix?**
Fixes #
**Other notes for review**
**Acknowledgements**
- [x] I have read [CONTRIBUTING.md](https://github.com/prysmaticlabs/prysm/blob/develop/CONTRIBUTING.md).
- [x] I have included a uniquely named [changelog fragment file](https://github.com/prysmaticlabs/prysm/blob/develop/CONTRIBUTING.md#maintaining-changelogmd).
- [x] I have added a description to this PR with sufficient context for reviewers to understand this PR.
---------
Co-authored-by: Preston Van Loon <pvanloon@offchainlabs.com>
Co-authored-by: james-prysm <jhe@offchainlabs.com>
Co-authored-by: Manu NALEPA <enalepa@offchainlabs.com>
* stop emitting payload attribute events during late block handling when we are not proposing the next slot
* Change the behavior to not even enter FCU if we are not proposing next slot
* init
* reverting some functions
* rolling back a change and fixing linting
* wip
* wip
* fixing test
* breaking up proofs and cells for cleaner code
* fixing test and type
* fixing safe conversion
* fixing test
* fixing more tests
* fixing even more tests
* fix the 0 indices option
* adding a test for coverage
* small test update
* changelog
* radek's suggestions
* Update beacon-chain/core/peerdas/validator.go
Co-authored-by: Manu NALEPA <enalepa@offchainlabs.com>
* addressing comments on kzg package
* addressing suggestions for reconstruction
* more manu feedback items
* removing unneeded files
* removing unneeded setter
---------
Co-authored-by: james-prysm <jhe@offchainlabs.com>
Co-authored-by: Manu NALEPA <enalepa@offchainlabs.com>
* Only use head if it's compatible with target
* Allow blocks from the previous epoch to be viable for checkpoints
* Add feature flag to make it configurable
* fix tests
* @satushh's review
* Manu's nit
* Use fields in logs
* Implement `AvailableBlocks`.
* `blobSidecarByRootRPCHandler`: Do not serve a sidecar if the corresponding block is not available.
* `dataColumnSidecarByRootRPCHandler`: Do not do extra work if only needed for TRACE logging.
* `TestDataColumnSidecarsByRootRPCHandler`: Re-arrange (no functional change).
* `TestDataColumnSidecarsByRootRPCHandler`: Save blocks corresponding to sidecars into DB.
* `dataColumnSidecarByRootRPCHandler`: Do not serve a sidecar if the corresponding block is not available.
* Add changelog
* `TestDataColumnSidecarsByRootRPCHandler`: Use `assert` instead of `require` in goroutines.
https://github.com/stretchr/testify?tab=readme-ov-file#require-package
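A minimal example of that pattern, using testify as in the linked README section; the test name and values are illustrative:
```
package example

import (
	"sync"
	"testing"

	"github.com/stretchr/testify/assert"
)

// Inside goroutines spawned by a test, use assert (records the failure)
// rather than require (calls t.FailNow, which must only run on the test's
// own goroutine).
func TestWorkersProduceExpectedValue(t *testing.T) {
	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			got := 42 // stand-in for real work
			assert.Equal(t, 42, got)
		}()
	}
	wg.Wait()
}
```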
* Define TCP and QUIC as `InternetProtocol` (no functional change).
* Group types. (No functional changes)
* Rename variables and use range syntax.
* Add `p2pMaxPeers` and `p2pPeerCountDirectionType` metrics
* `p2p_subscribed_topic_peer_total`: Reset to avoid dangling values (see the sketch after this list).
* `validateConfig`:
- Use `Warning` with fields instead of `Warnf`.
- Avoid both modifying the input value in place and returning it.
* Add `p2p_minimum_peers_per_subnet` metric.
* `beaconConfig` => `cfg`.
https://github.com/OffchainLabs/prysm/pull/15880#discussion_r2436826215
* Add changelog
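A hedged sketch of the reset pattern referenced above, using `prometheus/client_golang`; the metric name matches the list, the rest is illustrative:
```
package main

import (
	"fmt"

	"github.com/prometheus/client_golang/prometheus"
)

func main() {
	// Illustrative gauge; the real metric is registered in Prysm's p2p package.
	topicPeers := prometheus.NewGaugeVec(prometheus.GaugeOpts{
		Name: "p2p_subscribed_topic_peer_total",
		Help: "Number of peers subscribed to each topic.",
	}, []string{"topic"})

	topicPeers.WithLabelValues("old_topic").Set(3)

	// Reset drops every previously set label combination, so topics we no
	// longer subscribe to do not keep reporting stale ("dangling") values.
	topicPeers.Reset()
	topicPeers.WithLabelValues("current_topic").Set(12)

	fmt.Println("metric repopulated after reset")
}
```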
---------
Co-authored-by: james-prysm <90280386+james-prysm@users.noreply.github.com>
* Update Earliest available slot when pruning
* bazel run //:gazelle -- fix
* custodyUpdater interface to avoid import cycle
* bazel run //:gazelle -- fix
* simplify test
* separation of concerns
* debug log for updating eas
* UpdateEarliestAvailableSlot function in CustodyManager
* fix test
* UpdateEarliestAvailableSlot function for FakeP2P
* lint
* UpdateEarliestAvailableSlot instead of UpdateCustodyInfo + check for Fulu
* fix test and lint
* bugfix: enforce minimum retention period in pruner
* remove MinEpochsForBlockRequests function and use from config
* remove modifying earliest_available_slot after data column pruning
* correct earliestAvailableSlot validation: allow backfill decrease but prevent increase within MIN_EPOCHS_FOR_BLOCK_REQUESTS (sketched after this list)
* lint
* bazel run //:gazelle -- fix
* lint and remove unwanted debug logs
* Return a wrapped error, and let the caller decide what to do
* fix tests because updateEarliestSlot returns error now
* avoid re-doing computation in the test function
* lint and correct changelog
* custody updater should be a mandatory part of the pruner service
* ensure never increase eas if we are in the block requests window
* slot level granularity edge case
* update the value stored in the DB
* log tidy up
* use errNoCustodyInfo
* allow earliestAvailableSlot edit when custodyGroupCount doesn't change
* undo the minimal config change
* add context to CustodyGroupCount after merging from develop
* cosmetic change
* shift responsibility from caller to callee, protection for updateEarliestSlot. UpdateEarliestAvailableSlot returns cgc
* allow increase in earliestAvailableSlot only when custodyGroupCount also increases
* remove CustodyGroupCount as it is no longer needed as UpdateEarliestAvailableSlot returns cgc now
* proper place for log and name refactor
* test for Nil custody info
* allow decreasing earliest slot in DB (just like in memory)
* invert if statement to make more readable
* UpdateEarliestAvailableSlot for DB (equivalent of p2p's UpdateEarliestAvailableSlot) & undo changes made to UpdateCustodyInfo
* in UpdateEarliestAvailableSlot, no need to return unused values
* no need to log stored group count
* log.WithField instead of log.WithFields
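A sketch of the earliest-available-slot rules described in this list, with illustrative names (not the actual Prysm API):
```
package main

import (
	"errors"
	"fmt"
)

// validateEarliestSlotUpdate sketches the rules above: decreases (backfill)
// are always allowed; increases are refused unless the custody group count
// also increased, and never past the start of the
// MIN_EPOCHS_FOR_BLOCK_REQUESTS retention window.
func validateEarliestSlotUpdate(current, proposed, retentionStartSlot uint64, cgcIncreased bool) error {
	if proposed <= current {
		return nil // backfill moved the earliest available slot backwards
	}
	if !cgcIncreased {
		return errors.New("refusing to increase earliest available slot: custody group count unchanged")
	}
	if proposed > retentionStartSlot {
		return errors.New("refusing to increase earliest available slot into the mandatory retention window")
	}
	return nil
}

func main() {
	fmt.Println(validateEarliestSlotUpdate(200, 100, 150, false)) // nil: decrease is always fine
	fmt.Println(validateEarliestSlotUpdate(100, 140, 150, true))  // nil: increase with cgc increase, before the window
	fmt.Println(validateEarliestSlotUpdate(100, 200, 150, true))  // error: would enter the retention window
}
```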
h/t to the NuConstruct team for reporting this. The event feed incorrectly
sends the epoch transition flag on head events when the first slot of the
epoch is missed (or when a reorg crosses the epoch transition).
Co-authored-by: james-prysm <90280386+james-prysm@users.noreply.github.com>
* feature: Use service context and continue on slasher attestation errors
* Create Galoretka_feature-slasher-feed-use-service-ctx
* Rename Galoretka_feature-slasher-feed-use-service-ctx to Galoretka_feature-slasher-feed-use-service-ctx.md
---------
Co-authored-by: james-prysm <90280386+james-prysm@users.noreply.github.com>
* Revert "`createLocalNode`: Wait before retrying to retrieve the custody group count if not present. (#15735)"
This reverts commit 4585cdc932.
* Revert "Fix no custody info available at start (#15732)"
This reverts commit 80eba4e6dd.
* Add context to `EarliestAvailableSlot` and `CustodyGroupCount` (no functional change).
* Remove double imports.
* `EarliestAvailableSlot` and `CustodyGroupCount`: Wait for custody info to be initialized.
* Sort sidecars by index before calling `RecoverCellsAndKZGProofs` (a minimal sorting sketch follows this list).
Reason: starting with `c-kzg-4844` v2.1.2, the library requires its input to be sorted.
* Update `c-kzg-4844` to `v2.1.3`
* Update `c-kzg-4844` to `v2.1.5`
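A minimal sorting sketch for the first item above, assuming a sidecar type with a `ColumnIndex` field; the actual type and recovery call live in Prysm's peerdas/kzg code:
```
package main

import (
	"fmt"
	"sort"
)

// sidecar is a stand-in for the real data column sidecar type.
type sidecar struct {
	ColumnIndex uint64
}

func main() {
	sidecars := []sidecar{{ColumnIndex: 7}, {ColumnIndex: 2}, {ColumnIndex: 5}}

	// Starting with c-kzg-4844 v2.1.2 the recovery routine expects its input
	// ordered by column index, so sort before handing the cells over to
	// RecoverCellsAndKZGProofs (call elided here).
	sort.Slice(sidecars, func(i, j int) bool {
		return sidecars[i].ColumnIndex < sidecars[j].ColumnIndex
	})
	fmt.Println(sidecars) // [{2} {5} {7}]
}
```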
* additional log information around invalid payloads
* fix test with reversed require.ErrorIs args
---------
Co-authored-by: Kasey Kirkham <kasey@users.noreply.github.com>
* `reconstructSaveBroadcastDataColumnSidecars`: Use `s.columnIndicesToSample` instead of recreating its content.
* Rename `custodyColumns` ==> `columnIndicesToSample`.
* `DataColumnStorage.Save`: Remove wrong godoc.
* Implement `receiveDataColumnSidecars` and transform `receiveDataColumnSidecar` as a subcase of the plural version.
* `dataColumnSubscriber`: Add godoc and remove only once used variable.
* `processDataColumnSidecarsFromExecution`: Use single flight directly in the function,
so the caller no longer has the responsibility of dealing with multiple simultaneous calls.
* `processDataColumnSidecarsFromReconstruction`: Guard with a single flight.
In `dataColumnSubscriber`, trigger `processDataColumnSidecarsFromReconstruction` and `processDataColumnSidecarsFromExecution` in parallel.
Stop when the first of them succeeds.
* `processDataColumnSidecarsFromExecution`: Use `receiveDataColumnSidecars` instead of `receiveDataColumnSidecar`.
* Implement and use `broadcastAndReceiveUnseenDataColumnSidecars`.
* Add changelog.
* Fix James' comment.
* Fix James' comment.
* `processDataColumnSidecarsFromReconstruction`: Log reconstruction duration.
* Change InsertChain
InsertChain has used `ROBlock` since #14571, which allows it to insert the
last block of the chain as well. We change the semantics of InsertChain
to include all blocks and take them in increasing order (a small ordering
sketch follows this list).
* Fix tests
* Use slices.Reverse
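A small sketch of the new ordering contract, assuming blocks arrive newest-first and must be handed to InsertChain oldest-first; names and values are illustrative:
```
package main

import (
	"fmt"
	"slices"
)

func main() {
	// Block slots fetched newest-first, as a backwards fetch might produce.
	slots := []uint64{103, 102, 101, 100}

	// InsertChain now expects every block of the batch, including the last
	// one, in increasing order, so reverse before inserting.
	slices.Reverse(slots)
	fmt.Println(slots) // [100 101 102 103]
}
```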
* PeerDAS: Implement sync
* Fix Potuz's comment.
* Fix Potuz's comment.
* Fix Potuz's comment.
* Fix Potuz's comment.
* Fix Potuz's comment.
* Implement `TestFetchDataColumnSidecarsFromPeers`.
* Implement `TestSelectPeers`.
* Fix James' comment.
* Fix flakiness in `TestSelectPeers`.
* Revert "Fix Potuz's comment."
This reverts commit c45230b455.
* Revert "Fix James' comment."
This reverts commit a3f919205a.
* `selectPeers`: Avoid map with key but empty value.
* Fix Potuz's comment.
* Add DataColumnStorage and SubscribeAllDataSubnets flag.
* getBlobsV2: retry if reconstruction isn't successful
* test: engine client and sync package, metrics
* lint: fmt and log capitalisation
* lint: return error when it is not nil
* config: make retry interval configurable
* sidecar: recover function and different context for retrying
* lint: remove unused field
* beacon: default retry interval
* reconstruct: load once, correctly deliver the result to all waiting goroutines (a singleflight sketch follows this list)
* reconstruct: simplify multi goroutine case and avoid race condition
* engine: remove isDataAlreadyAvailable function
* sync: no goroutine, getblobsv2 in absence of block as well, wrap error
* exec: hardcode retry interval
* da: non blocking checks
* sync: remove unwanted checks
* execution: fix test
* execution: retry atomicity test
* da: updated IsDataAvailable
* sync: remove unwanted tests
* bazel: bazel run //:gazelle -- fix
* blockchain: fix CustodyGroupCount return
* lint: formatting
* lint: lint and use unused metrics
* execution: retry logic inside ReconstructDataColumnSidecars itself
* lint: format
* execution: ensure the retry actually happens when it needs to
* execution: ensure single responsibility, execution should not do DA check
* sync: don't call ReconstructDataColumnSidecars if not required
* blockchain: move IsDataAvailable interface to blockchain package
* execution: make reconstructSingleflight part of the service struct
* blockchain: cleaner DA check
* lint: formatting and remove confusing comment
* sync: fix lint, test and add extra test for when data is actually not available
* sync: new appropriate mock service
* execution: edge case - delete activeRetries on success
* execution: use service context instead of function's for retry
* blockchain: get variable samplesPerSlot only when required
* remove redundant function and fix name
* fix test
* fix more tests
* put samplesPerSlot at appropriate place
* tidy up IsDataAvailable
* correct bad merge
* fix bad merge
* remove redundant flag option
* refactor to deduplicate sidecar construction code
* - Add godocs
- Rename some functions to be closer to the spec
- Add err in return of commitments
* Replace the mutating public (but only internally used) method `Populate` with the private, non-mutating method `extract`.
* Implement a single `processDataColumnSidecarsFromExecution` instead of 2 separate functions from block and from sidecar.
* `ReceiveBlock`: Wrap errors.
* Remove useless tests.
* `ConstructionPopulator`: Add tests.
* Fix tests
* Move functions to be consistent with blobs.
* `fetchCellsAndProofsFromExecution`: Avoid useless flattening.
* `processDataColumnSidecarsFromExecution`: Stop using DB cache.
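A hedged sketch of the single-flight behaviour mentioned in the reconstruct/getBlobsV2 items above, using `golang.org/x/sync/singleflight`; the key and reconstruction body are placeholders:
```
package main

import (
	"fmt"

	"golang.org/x/sync/singleflight"
)

var group singleflight.Group

// reconstructOnce sketches "load once, deliver the result to all waiting
// goroutines": concurrent callers for the same block root share a single
// reconstruction attempt and all receive its result.
func reconstructOnce(blockRoot string) (int, error) {
	v, err, _ := group.Do(blockRoot, func() (interface{}, error) {
		// Placeholder for the expensive cell/proof reconstruction.
		return 128, nil
	})
	if err != nil {
		return 0, err
	}
	return v.(int), nil
}

func main() {
	n, _ := reconstructOnce("0xabc")
	fmt.Println(n) // 128
}
```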
---------
Co-authored-by: Manu NALEPA <enalepa@offchainlabs.com>
Co-authored-by: Kasey Kirkham <kasey@users.noreply.github.com>
* create lc cache to track branches
* save lc stuff
* remove finalized data from LC cache on finalization
* read lc stuff
* edit tests
* changelog
* linter
* address comments
* address comments 2
* address comments 3
* address comments 4
* lint
* address comments 5 x_x
* set beacon lcStore to mimic registrable services
* clean up the error propagation
* pass the state to saveLCBootstrap since it's not saved in db yet
* propose block changes from peerdas branch
* breaking out broadcast code into its own helper, changing fulu broadcast for rest api to properly send datasidecars
* renamed validate blobsidecars to validate blobs, and added check for max blobs
* gofmt
* adding in batch verification for blobs
* changelog
* adding kzg tests, moving new kzg functions to validation.go
* linting and other small fixes
* fixing linting issues and adding some proposer tests
* missing dependencies
* fixing test
* fixing more tests
* gaz
* removed return on broadcast data columns
* more cleanup and unit test adjustments
* missed removal of unneeded field
* adding data column receiver initialization
* Update beacon-chain/rpc/eth/beacon/handlers.go
Co-authored-by: Manu NALEPA <enalepa@offchainlabs.com>
* partial review feedback from manu
* gaz
* reverting some code to peerdas as I don't believe the broadcast code needs to be reused
* missed removal of build dependency
* fixing tests and adding another test based on manu's suggestion
* fixing linting
* Update beacon-chain/rpc/eth/beacon/handlers.go
Co-authored-by: Radosław Kapka <rkapka@wp.pl>
* Update beacon-chain/blockchain/kzg/validation.go
Co-authored-by: Radosław Kapka <rkapka@wp.pl>
* radek's review changes
* adding missed test
---------
Co-authored-by: Manu NALEPA <enalepa@offchainlabs.com>
Co-authored-by: Radosław Kapka <rkapka@wp.pl>
* Swap the wrong arguments in a call
I saw that the names of the passed arguments and the ones of the
function parameters don't match, so I suspect that it's a bug.
* Add changelog
* Add validation for the fillInForkChoiceMissingBlocks checkpoints.
* Add test for checkpoint epoch validation in fillInForkChoiceMissingBlocks.
* Use a sentinel error rather than error string
---------
Co-authored-by: kasey <489222+kasey@users.noreply.github.com>
Co-authored-by: Preston Van Loon <preston@pvl.dev>
By default when starting a node, we load the finalized checkpoint from
the db and set it as head. When the chain has not been finalizing for a
while and the user does not start from the latest head, it may still be
beneficial to start from the latest justified checkpoint, which has to be
a descendant of the finalized one.
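A minimal sketch of that startup choice, with illustrative names; the descent check stands in for a real ancestry lookup:
```
package main

import "fmt"

// chooseStartupCheckpoint sketches the behaviour described above: start from
// the latest justified checkpoint when it is a descendant of the finalized
// checkpoint, otherwise fall back to the finalized checkpoint.
func chooseStartupCheckpoint(finalizedRoot, justifiedRoot [32]byte, justifiedDescendsFromFinalized bool) [32]byte {
	if justifiedDescendsFromFinalized {
		return justifiedRoot
	}
	return finalizedRoot
}

func main() {
	var finalized, justified [32]byte
	justified[0] = 0x01
	chosen := chooseStartupCheckpoint(finalized, justified, true)
	fmt.Printf("starting from root %x...\n", chosen[:4])
}
```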
* initialize genesis data asap at node start
* add genesis validation tests with embedded state verification
* Add test for hardcoded mainnet genesis validator root and time from init() function
* Add test for UnmarshalState in encoding/ssz/detect/configfork.go
* Add tests for genesis.Initialize
* Move genesis/embedded to genesis/internal/embedded
* Gazelle / BUILD fix
* James feedback
* Fix lint
* Revert lock
---------
Co-authored-by: Kasey <kasey@users.noreply.github.com>
Co-authored-by: terence tsao <terence@prysmaticlabs.com>
Co-authored-by: Preston Van Loon <preston@pvl.dev>
* Fix race on ReceiveBlock
When two `ReceiveBlock` routines are triggered with the same block (this can
happen if one is triggered over gossip and the other in init-sync), the second
routine may believe it is syncing the block for the first time. This is
because the `blocksBeingSynced` cache is not checked to be set, and the block
may not yet have been put in forkchoice by the first routine.
In the normal case this causes no trouble, as the second forkchoice insertion
is a no-op by design. However, if the second routine times out or hits any
processing error (for example, the engine returns an error if we try to send
FCU to an older head), it will attempt to remove the inserted block from
forkchoice. This bricks the node: forkchoice refuses to remove a valid node,
but the root is removed unconditionally from the db, so the node ends up with
a root that is not in the db yet remains in forkchoice.
This PR just prevents the race.
As a follow-up, perhaps we can gate the db rollback function to nodes that are
effectively not in forkchoice; alternatively, force removal from forkchoice
when rolling back from the db (although this version is complicated due to
possible accounting issues in forkchoice).
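A hedged sketch of the duplicate guard this PR relies on; `inFlightBlocks` is illustrative, not the actual `blocksBeingSynced` implementation:
```
package main

import (
	"fmt"
	"sync"
)

// inFlightBlocks sketches the blocksBeingSynced guard described above: the
// second routine that tries to process the same root bails out instead of
// re-inserting and later rolling back a block the first routine owns.
type inFlightBlocks struct {
	mu    sync.Mutex
	roots map[[32]byte]struct{}
}

func newInFlightBlocks() *inFlightBlocks {
	return &inFlightBlocks{roots: make(map[[32]byte]struct{})}
}

// tryAcquire returns false if another routine is already syncing this root.
func (s *inFlightBlocks) tryAcquire(root [32]byte) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	if _, busy := s.roots[root]; busy {
		return false
	}
	s.roots[root] = struct{}{}
	return true
}

func (s *inFlightBlocks) release(root [32]byte) {
	s.mu.Lock()
	defer s.mu.Unlock()
	delete(s.roots, root)
}

func main() {
	s := newInFlightBlocks()
	var root [32]byte
	fmt.Println(s.tryAcquire(root)) // true: first routine proceeds
	fmt.Println(s.tryAcquire(root)) // false: duplicate routine bails out
	s.release(root)
}
```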
* Fix lint
Currently the payload attribute event is triggered in
`forkchoiceUpdateWithExecution`. However, when we import an early block,
we do not call this function; we make two calls to FCU instead. The first one
is on a locked path at the end of `postBlockProcess`, and it is made
without any payload attributes to avoid updating the shuffling caches.
The second call is made in `handleSecondFCUCall`, which calls
`notifyForkchoiceUpdate` directly, bypassing
`forkchoiceUpdateWithExecution`, but this call is the one that actually
computes the payload attributes. So the event handler is never called
with the new attributes.
This PR moves the event trigger to the same place where we actually call
FCU with the computed payload attributes.
A consideration about the forkchoice locking logic: since the calls are
always in a goroutine, the routine will wait for forkchoice to be unlocked
before proceeding anyway.
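A rough sketch of where the event now fires, with illustrative names; the real call site is Prysm's `notifyForkchoiceUpdate` together with the blockchain service's event feed:
```
package main

import "fmt"

// notifyForkchoiceUpdate sketches the change: the payload attributes event is
// emitted right where FCU is actually sent with computed attributes, so the
// early-block path (which bypasses forkchoiceUpdateWithExecution) still fires
// it. The emit callback stands in for the event feed.
func notifyForkchoiceUpdate(attrs *string, emit func(string)) {
	// ... send forkchoiceUpdated to the execution engine here ...
	if attrs != nil {
		emit(*attrs) // fire the event alongside the FCU call that carries attributes
	}
}

func main() {
	attrs := "payload attributes for the next slot"
	notifyForkchoiceUpdate(&attrs, func(a string) { fmt.Println("event:", a) })
}
```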
Co-authored-by: james-prysm <90280386+james-prysm@users.noreply.github.com>