<!-- Thanks for sending a PR! Before submitting:
1. If this is your first PR, check out our contribution guide here
https://docs.prylabs.network/docs/contribute/contribution-guidelines
You will then need to sign our Contributor License Agreement (CLA),
which will show up as a comment from a bot in this pull request after
you open it. We cannot review code without a signed CLA.
2. Please file an associated tracking issue if this pull request is
non-trivial and requires context for our team to understand. All
features and most bug fixes should have
an associated issue with a design discussed and decided upon. Small bug
fixes and documentation improvements don't need issues.
3. New features and bug fixes must have tests. Documentation may need to
be updated. If you're unsure what to update, send the PR, and we'll
discuss
in review.
4. Note that PRs updating dependencies and new Go versions are not
accepted.
Please file an issue instead.
5. A changelog entry is required for user facing issues.
-->
**What type of PR is this?**
## Summary
This PR implements gRPC fallback support for the validator client,
allowing it to automatically switch between multiple beacon node
endpoints when the primary node becomes unavailable or unhealthy.
## Changes
- Added `grpcConnectionProvider` to manage multiple gRPC connections
with circular failover
- Validator automatically detects unhealthy beacon nodes and switches to
the next available endpoint
- Health checks verify both node responsiveness AND sync status before
accepting a node
- Improved logging to only show "Found fully synced beacon node" when an
actual switch occurs (reduces log noise)
I removed the old middleware that uses gRPC's built in load balancer
because:
- gRPC's pick_first load balancer doesn't provide sync-status-aware
failover
- The validator needs to ensure it connects to a fully synced node, not
just a reachable one
## Test Scenario
### Setup
Deployed a 4-node Kurtosis testnet with local validator connecting to 2
beacon nodes:
```yaml
# kurtosis-grpc-fallback-test.yaml
participants:
- el_type: nethermind
cl_type: prysm
validator_count: 128 # Keeps chain advancing
- el_type: nethermind
cl_type: prysm
validator_count: 64
- el_type: nethermind
cl_type: prysm
validator_count: 64 # Keeps chain advancing
- el_type: nethermind
cl_type: prysm
validator_count: 64 # Keeps chain advancing
network_params:
fulu_fork_epoch: 0
seconds_per_slot: 6
```
Local validator started with:
```bash
./validator --beacon-rpc-provider=127.0.0.1:33005,127.0.0.1:33012 ...
```
### Test 1: Primary Failover (cl-1 → cl-2)
1. Stopped cl-1 beacon node
2. Validator detected failure and switched to cl-2
**Logs:**
```
WARN Beacon node is not responding, switching host currentHost=127.0.0.1:33005 nextHost=127.0.0.1:33012
DEBUG Trying gRPC endpoint newHost=127.0.0.1:33012 previousHost=127.0.0.1:33005
INFO Failover succeeded: connected to healthy beacon node failedAttempts=[127.0.0.1:33005] newHost=127.0.0.1:33012 previousHost=127.0.0.1:33005
```
**Result:** ✅ PASSED - Validator continued submitting attestations on
cl-2
### Test 2: Circular Failover (cl-2 → cl-1)
1. Restarted cl-1, stopped cl-2
2. Validator detected failure and switched back to cl-1
**Logs:**
```
WARN Beacon node is not responding, switching host currentHost=127.0.0.1:33012 nextHost=127.0.0.1:33005
DEBUG Trying gRPC endpoint newHost=127.0.0.1:33005 previousHost=127.0.0.1:33012
INFO Failover succeeded: connected to healthy beacon node failedAttempts=[127.0.0.1:33012] newHost=127.0.0.1:33005 previousHost=127.0.0.1:33012
```
**Result:** ✅ PASSED - Circular fallback works correctly
## Key Log Messages
| Log Level | Message | Source |
|-----------|---------|--------|
| WARN | "Beacon node is not responding, switching host" |
`changeHost()` in validator.go |
| INFO | "Switched gRPC endpoint" | `SetHost()` in
grpc_connection_provider.go |
| INFO | "Found fully synced beacon node" | `FindHealthyHost()` in
validator.go (only on actual switch) |
## Test Plan
- [x] Verify primary failover (cl-1 → cl-2)
- [x] Verify circular failover (cl-2 → cl-1)
- [x] Verify validator continues producing attestations after switch
- [x] Verify "Found fully synced beacon node" only logs on actual switch
(not every health check)
**What does this PR do? Why is it needed?**
**Which issues(s) does this PR fix?**
Fixes # https://github.com/OffchainLabs/prysm/pull/7133
**Other notes for review**
**Acknowledgements**
- [x] I have read
[CONTRIBUTING.md](https://github.com/prysmaticlabs/prysm/blob/develop/CONTRIBUTING.md).
- [x] I have included a uniquely named [changelog fragment
file](https://github.com/prysmaticlabs/prysm/blob/develop/CONTRIBUTING.md#maintaining-changelogmd).
- [x] I have added a description with sufficient context for reviewers
to understand this PR.
- [x] I have tested that my changes work as expected and I added a
testing plan to the PR description (if applicable).
---------
Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
Co-authored-by: Radosław Kapka <rkapka@wp.pl>
Co-authored-by: Manu NALEPA <enalepa@offchainlabs.com>
* Migrate Prysm repo to Offchain Labs organization ahead of Pectra upgrade v6
* Replace prysmaticlabs with OffchainLabs on general markdowns
* Update mock
* Gazelle and add mock.go to excluded generated mock file
* `EpochFromString`: Use already defined `Uint64FromString` function.
* `Test_uint64FromString` => `Test_FromString`
This test function tests more functions than `Uint64FromString`.
* Slashing protection history: Remove unreachable code.
The function `NewKVStore` creates, via `kv.UpdatePublicKeysBuckets`,
a new item in the `proposal-history-bucket-interchange`.
IMO there is no real reason to prefer `proposal` than `attestation`
as a prefix for this bucket, but this is the way it is done right now
and renaming the bucket will probably be backward incompatible.
An `attestedPublicKey` cannot exist without
the corresponding `proposedPublicKey`.
Thus, the `else` portion of code removed in this commit is not reachable.
We raise an error if we get there.
This is also probably the reason why the removed `else` portion was not
tested.
* `NewKVStore`: Switch items in `createBuckets`.
So the order corresponds to `schema.go`
* `slashableAttestationCheck`: Fix comments and logs.
* `ValidatorClient.db`: Use `iface.ValidatorDB`.
* BoltDB database: Implement `GraffitiFileHash`.
* Filesystem database: Creates `db.go`.
This file defines the following structs:
- `Store`
- `Graffiti`
- `Configuration`
- `ValidatorSlashingProtection`
This files implements the following public functions:
- `NewStore`
- `Close`
- `Backup`
- `DatabasePath`
- `ClearDB`
- `UpdatePublicKeysBuckets`
This files implements the following private functions:
- `slashingProtectionDirPath`
- `configurationFilePath`
- `configuration`
- `saveConfiguration`
- `validatorSlashingProtection`
- `saveValidatorSlashingProtection`
- `publicKeys`
* Filesystem database: Creates `genesis.go`.
This file defines the following public functions:
- `GenesisValidatorsRoot`
- `SaveGenesisValidatorsRoot`
* Filesystem database: Creates `graffiti.go`.
This file defines the following public functions:
- `SaveGraffitiOrderedIndex`
- `GraffitiOrderedIndex`
* Filesystem database: Creates `migration.go`.
This file defines the following public functions:
- `RunUpMigrations`
- `RunDownMigrations`
* Filesystem database: Creates proposer_settings.go.
This file defines the following public functions:
- `ProposerSettings`
- `ProposerSettingsExists`
- `SaveProposerSettings`
* Filesystem database: Creates `attester_protection.go`.
This file defines the following public functions:
- `EIPImportBlacklistedPublicKeys`
- `SaveEIPImportBlacklistedPublicKeys`
- `SigningRootAtTargetEpoch`
- `LowestSignedTargetEpoch`
- `LowestSignedSourceEpoch`
- `AttestedPublicKeys`
- `CheckSlashableAttestation`
- `SaveAttestationForPubKey`
- `SaveAttestationsForPubKey`
- `AttestationHistoryForPubKey`
* Filesystem database: Creates `proposer_protection.go`.
This file defines the following public functions:
- `HighestSignedProposal`
- `LowestSignedProposal`
- `ProposalHistoryForPubKey`
- `ProposalHistoryForSlot`
- `ProposedPublicKeys`
* Ensure that the filesystem store implements the `ValidatorDB` interface.
* `slashableAttestationCheck`: Check the database type.
* `slashableProposalCheck`: Check the database type.
* `slashableAttestationCheck`: Allow usage of minimal slashing protection.
* `slashableProposalCheck`: Allow usage of minimal slashing protection.
* `ImportStandardProtectionJSON`: Check the database type.
* `ImportStandardProtectionJSON`: Allow usage of min slashing protection.
* Implement `RecursiveDirFind`.
* Implement minimal<->complete DB conversion.
3 public functions are implemented:
- `IsCompleteDatabaseExisting`
- `IsMinimalDatabaseExisting`
- `ConvertDatabase`
* `setupDB`: Add `isSlashingProtectionMinimal` argument.
The feature addition is located in `validator/node/node_test.go`.
The rest of this commit consists in minimal slashing protection testing.
* `setupWithKey`: Add `isSlashingProtectionMinimal` argument.
The feature addition is located in `validator/client/propose_test.go`.
The rest of this commit consists in tests wrapping.
* `setup`: Add `isSlashingProtectionMinimal` argument.
The added feature is located in the `validator/client/propose_test.go`
file.
The rest of this commit consists in tests wrapping.
* `initializeFromCLI` and `initializeForWeb`: Factorize db init.
* Add `convert-complete-to-minimal` command.
* Creates `--enable-minimal-slashing-protection` flag.
* `importSlashingProtectionJSON`: Check database type.
* `exportSlashingProtectionJSON`: Check database type.
* `TestClearDB`: Test with minimal slashing protection.
* KeyManager: Test with minimal slashing protection.
* RPC: KeyManager: Test with minimal slashing protection.
* `convert-complete-to-minimal`: Change option names.
Options were:
- `--source` (for source data directory), and
- `--target` (for target data directory)
However, since this command deals with slashing protection, which has
source (epochs) and target (epochs), the initial option names may confuse
the user.
In this commit:
`--source` ==> `--source-data-dir`
`--target` ==> `--target-data-dir`
* Set `SlashableAttestationCheck` as an iface method.
And delete `CheckSlashableAttestation` from iface.
* Move helpers functions in a more general directory.
No functional change.
* Extract common structs out of `kv`.
==> `filesystem` does not depend anymore on `kv`.
==> `iface` does not depend anymore on `kv`.
==> `slashing-protection` does not depend anymore on `kv`.
* Move `ValidateMetadata` in `validator/helpers`.
* `ValidateMetadata`: Test with mock.
This way, we can:
- Avoid any circular import for tests.
- Implement once for all `iface.ValidatorDB` implementations
the `ValidateMetadata`function.
- Have tests (and coverage) of `ValidateMetadata`in
its own package.
The ideal solution would have been to implement `ValidateMetadata` as
a method with the `iface.ValidatorDB`receiver.
Unfortunately, golang does not allow that.
* `iface.ValidatorDB`: Implement ImportStandardProtectionJSON.
The whole purpose of this commit is to avoid the `switch validatorDB.(type)`
in `ImportStandardProtectionJSON`.
* `iface.ValidatorDB`: Implement `SlashableProposalCheck`.
* Remove now useless `slashableProposalCheck`.
* Delete useless `ImportStandardProtectionJSON`.
* `file.Exists`: Detect directories and return an error.
Before, `Exists` was only able to detect if a file exists.
Now, this function takes an extra `File` or `Directory` argument.
It detects either if a file or a directory exists.
Before, if an error was returned by `os.Stat`, the the file was
considered as non existing.
Now, it is treated as a real error.
* Replace `os.Stat` by `file.Exists`.
* Remove `Is{Complete,Minimal}DatabaseExisting`.
* `publicKeys`: Add log if unexpected file found.
* Move `{Source,Target}DataDirFlag`in `db.go`.
* `failedAttLocalProtectionErr`: `var`==> `const`
* `signingRoot`: `32`==> `fieldparams.RootLength`.
* `validatorClientData`==> `validator-client-data`.
To be consistent with `slashing-protection`.
* Add progress bars for `import` and `convert`.
* `parseBlocksForUniquePublicKeys`: Move in `db/kv`.
* helpers: Remove unused `initializeProgressBar` function.
* First take at updating everything to v5
* Patch gRPC gateway to use prysm v5
Fix patch
* Update go ssz
---------
Co-authored-by: Preston Van Loon <pvanloon@offchainlabs.com>
* adding fix for nethermind's findings on gaslimit =0 on some default setups
* adding in default gaslimit check
* fixing linting on complexity
* fixing cognitive complexity linting marker
* fixing unit test and bug with referencing
---------
Co-authored-by: Nishant Das <nishdas93@gmail.com>
* WIP
* WIP
* adding in migration function
* updating mock validator and gaz
* adding descriptive logs
* fixing mocking
* fixing tests
* fixing mock
* adding changes to handle enable builder settings
* fixing tests and edge case
* reduce cognative complexity of function
* further reducing cognative complexity on function
* WIP
* fixing unit test on migration
* adding more tests
* gaz and fix unit test
* fixing deepsource issues
* fixing more deesource issues missed previously
* removing unused reciever name
* WIP fix to migration logic
* fixing loging info
* reverting migration logic, converting logic to address issues discussed on slack, adding unit tests
* adding test for builder setting only not saved to db
* addressing comment
* fixing flag
* removing accidently missed deprecated flags
* rolling back mock on pr
* fixing fmt linting
* updating comments based on feedback
* Update config/features/flags.go
Co-authored-by: Sammy Rosso <15244892+saolyn@users.noreply.github.com>
* fixing based on feedback on PR
* Update config/validator/service/proposer_settings.go
Co-authored-by: Preston Van Loon <pvanloon@offchainlabs.com>
* Update validator/client/runner.go
Co-authored-by: Preston Van Loon <pvanloon@offchainlabs.com>
* Update validator/db/kv/proposer_settings.go
Co-authored-by: Preston Van Loon <pvanloon@offchainlabs.com>
* adding additional logs to clear up some steps based on feedback
* fixing log
* deepsource
* adding comments based on review feedback
---------
Co-authored-by: Raul Jordan <raul@prysmaticlabs.com>
Co-authored-by: Sammy Rosso <15244892+saolyn@users.noreply.github.com>
Co-authored-by: Preston Van Loon <pvanloon@offchainlabs.com>
* adding filter for validator registration
* adding new filter logic based on validator status
* make sure to check status each time
* WIP unit testing
* fixing unit tests
* adding ux improvement
* addressing nishant's comments
* cleanup for already slice error
---------
Co-authored-by: Raul Jordan <raul@prysmaticlabs.com>
* moving voluntary exit command to prysmctl
* fixing linting
* fixing imports
* removing unused import:
* refactoring and adding a new unit test
* rolling back some changes
* fixing parameters
* fixing tests
* adding to main prysm ctl
* refactoring how wallet function works and ux to validator exit
* adding interop support
* fixing bazel build
* fixing if statement
* fixing beaconcha.in printout
* fixing web3signer bug, missing signing slot info
* fixing deep source issues
* fixing bazel package visibility for prysmctl
* gaz
Co-authored-by: Raul Jordan <raul@prysmaticlabs.com>
* fixing ux on propertynaming, introducing placeholder property
* reverting some refactors
* Update debug.go
* Update debug.go
* rolling back change on file
* adding new unit tests and renaming flags
* renaming variable
* changing gaslimit to validator registration
* adding new flag to enable validator registration for suggested fee recipient
* making sure default gaslimit is set
* Update cmd/validator/flags/flags.go
Co-authored-by: terencechain <terence@prysmaticlabs.com>
Co-authored-by: Raul Jordan <raul@prysmaticlabs.com>
Co-authored-by: terencechain <terence@prysmaticlabs.com>
Co-authored-by: prylabs-bulldozer[bot] <58059840+prylabs-bulldozer[bot]@users.noreply.github.com>
* Startinb builder service and interface
* Get header from builder
* Add get builder block
* Single validator registration
* Add mev-builder http cli flag
* Add method to verify registration signature
* Add builder registration
* Add submit validator registration
* suporting yaml
* fix yaml unmarshaling
* rolling back some changes from unmarshal from file
* adding yaml support
* adding register validator support
* added new validator requests into client/validator
* fixing gofmt
* updating flags and including gas limit, unit tests are still broken
* fixing bazel
* more name changes and fixing unit tests
* fixing unit tests and renaming functions
* fixing unit tests and renaming to match changes
* adding new test for yaml
* fixing bazel linter
* reverting change on validator service proto
* adding clarifying logs
* renaming function name to be more descriptive
* renaming variable
* rolling back some files that will be added from the builder-1 branch
* reverting more
* more reverting
* need placeholder
* need placeholder
* fixing unit test
* fixing unit test
* fixing unit test
* fixing unit test
* fixing more unit tests
* fixing more unit tests
* rolling back mockgen
* fixing bazel
* rolling back changes
* removing duplicate function
* fixing client mock
* removing unused type
* fixing missing brace
* fixing bad field name
* fixing bazel
* updating naming
* fixing bazel
* fixing unit test
* fixing bazel linting
* unhandled err
* fixing gofmt
* simplifying name based on feedback
* using corrected function
* moving default fee recipient and gaslimit to beaconconfig
* missing a few constant changes
* fixing bazel
* fixing more missed default renames
* fixing more constants in tests
* fixing bazel
* adding update proposer setting per epoch
* refactoring to reduce complexity
* adding unit test for proposer settings
* Update validator/client/validator.go
Co-authored-by: terencechain <terence@prysmaticlabs.com>
* trying out renaming based on feedback
* adjusting based on review comments
* making tests more appropriate
* fixing bazel
* updating flag description based on review feedback
* addressing review feedback
* switching to pushing at start of epoch for more time
* adding new unit test and properly throwing error
* switching keys in error to count
* fixing log variable
* resolving conflict
* resolving more conflicts
* adjusting error message
Co-authored-by: terence tsao <terence@prysmaticlabs.com>
Co-authored-by: Kasey Kirkham <kasey@users.noreply.github.com>
* adding checksum check at validator client and beacon node
* adding validation and logs on validator client startup
* moving the log and validation
* fixing unit tests
* adding test for back checksum on validator client
* fixing bazel
* addressing comments
* fixing log display
* Update beacon-chain/node/config.go
* Apply suggestions from code review
* breaking up lines
* fixing unit test
* ugh another fix to unit test
Co-authored-by: Raul Jordan <raul@prysmaticlabs.com>
Co-authored-by: prylabs-bulldozer[bot] <58059840+prylabs-bulldozer[bot]@users.noreply.github.com>
* send proposer data to config
* wip implementation with file and url based config import
* improving logic on get validator index
* fix function
* optimizing function for map and address bug
* fixing log
* update cache if it doesn't exist
* updating flags
* initial unit test scaffold
* fixing validator to call rpc call, removed temporary dependency
* adding the API calls for the runner
* fixing broken build
* fixing deepsource
* fixing interface
* fixing fatal
* fixing more deepsource issues
* adding test placeholders
* updating proposer config to add validation
* changing how if statement throws error
* removing unneeded validation, validating in a different way
* wip improving tests
* more unit test work
* fixing unit test
* fixing unit tests and edge cases
* adding unit tests and adjusting how the config is created
* fixing bazel builds
* fixing proto generation
* fixing imports
* fixing unit tests
* Update cmd/validator/flags/flags.go
Co-authored-by: Raul Jordan <raul@prysmaticlabs.com>
* updating flags based on comments, fixing unit tests
* fixing bazel
* removing unneeded function
* fixing unit tests
* refactors and unit test fixes based on comments
* fixing bazel build
* refactor the cache out fo the fee recipient function
* adding usecase for multiple fee recipient
* refactor burn name
* Update validator/client/validator.go
Co-authored-by: Raul Jordan <raul@prysmaticlabs.com>
* Update validator/client/validator.go
Co-authored-by: Raul Jordan <raul@prysmaticlabs.com>
* Update validator/client/validator.go
Co-authored-by: Raul Jordan <raul@prysmaticlabs.com>
* fixing bug with validator index based on code review
* edited flag descriptions to better communicate usage
* fixing manual reference to flag name
* fixing code review comments
* fixing linting
* Update validator/client/validator.go
Co-authored-by: Raul Jordan <raul@prysmaticlabs.com>
* addressing comments and renaming functions
* fixing linting
* Update cmd/validator/flags/flags.go
* Update cmd/validator/flags/flags.go
* Update cmd/validator/flags/flags.go
* improving comments
Co-authored-by: Raul Jordan <raul@prysmaticlabs.com>
* initial commit
* adding first testcase wip
* fixing test
* adding more unit tests
* adding bazel file
* adding more unit tests and file checks
* addressing comments
* refactoring based on comments
* added bazel
* fixing build
* adding in fixes for url
* fixing gazelle
* fixing wrong keymanager kind
* adding required scheme to urls
* fixing another unit test
* removing unused file
* adding new commit to retrigger deepsource ci
* imported -> local
* reverting to the state of the PR at eb1e3c3d1
* use changes from develop
* del
* rem patch
* patch
* rename to local
* gazelle
* add back build
* imported rename
* gaz
* local
* merge fix + remove proto changes
* comment revert
* build
* gofmt and one new reference
* gofmt pt2
* Update validator/accounts/wallet_edit.go
Co-authored-by: Preston Van Loon <preston@prysmaticlabs.com>
* Update validator/rpc/accounts.go
* rename
* gaz
Co-authored-by: Raul Jordan <raul@prysmaticlabs.com>
Co-authored-by: Preston Van Loon <preston@prysmaticlabs.com>
Co-authored-by: james-prysm <90280386+james-prysm@users.noreply.github.com>
* validator cmd
* imports
* more imports
* e2e viz
* alias
* use native alias
* add actual
* fix macro
* work on fix e2e
* add viz
* gaz
Co-authored-by: Preston Van Loon <preston@prysmaticlabs.com>
* att sign validations
* eliminate old cached methods and use a simple approach in the db
* redefined db methods
* db package builds
* add multilock to attest and propose
* gaz
* removed concurrency tests that are no longer relevant
* add cache to db functions for attesting history checks
* passing
* add in feature flag --disable-attesting-history-db-cache
* remove lock
* Revert "remove lock"
This reverts commit b1a65020e4.
* comment
* gaz
* fix with text
* fix also for validator and add test
* handle error
* fix another test
* handle error
* gazelle
* fix bug in slasher as well
* gazelle