13 Commits

Author SHA1 Message Date
james-prysm
cf94ccbf72 node fallback cleanup (#16316)
**What type of PR is this?**

 Other

**What does this PR do? Why is it needed?**

Follow-up to https://github.com/OffchainLabs/prysm/pull/16215. This PR
improves logging, fixes stuttering in package naming, adds
unit tests, and deduplicates fallback node code.

**Which issue(s) does this PR fix?**

Fixes a potential race when reconnecting very quickly to the same host
while a stale connection to it still exists.
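
A minimal sketch of one way to avoid this kind of race, assuming a provider that holds a single connection behind a mutex. The `conn`, `provider`, and `Reconnect` names are hypothetical and are not taken from this PR; the point is only that the stale connection is closed and replaced under the same lock, even when the host is unchanged.

```go
package main

import (
	"fmt"
	"sync"
)

// conn is a hypothetical stand-in for a network connection.
type conn struct {
	host   string
	closed bool
}

// Close marks the connection as closed.
func (c *conn) Close() { c.closed = true }

// provider holds at most one live connection, guarded by mu.
// This is an illustrative type, not Prysm's connection provider.
type provider struct {
	mu  sync.Mutex
	cur *conn
}

// Reconnect closes any existing connection and installs a fresh one,
// even when host is unchanged, so a stale connection is never reused.
func (p *provider) Reconnect(host string) *conn {
	p.mu.Lock()
	defer p.mu.Unlock()
	if p.cur != nil {
		p.cur.Close()
	}
	p.cur = &conn{host: host}
	return p.cur
}

func main() {
	p := &provider{}
	first := p.Reconnect("127.0.0.1:33005")
	second := p.Reconnect("127.0.0.1:33005") // same host, reconnecting quickly
	fmt.Println(first.closed, second.closed) // true false
}
```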

**Other notes for review**

**Acknowledgements**

- [x] I have read
[CONTRIBUTING.md](https://github.com/prysmaticlabs/prysm/blob/develop/CONTRIBUTING.md).
- [x] I have included a uniquely named [changelog fragment
file](https://github.com/prysmaticlabs/prysm/blob/develop/CONTRIBUTING.md#maintaining-changelogmd).
- [x] I have added a description with sufficient context for reviewers
to understand this PR.
- [x] I have tested that my changes work as expected and I added a
testing plan to the PR description (if applicable).
2026-02-04 15:59:42 +00:00
james-prysm
641d90990d grpc fallback improvements (#16215)

**What type of PR is this?**

## Summary

This PR implements gRPC fallback support for the validator client,
allowing it to automatically switch between multiple beacon node
endpoints when the primary node becomes unavailable or unhealthy.

## Changes

- Added `grpcConnectionProvider` to manage multiple gRPC connections
with circular failover
- Validator automatically detects unhealthy beacon nodes and switches to
the next available endpoint
- Health checks verify both node responsiveness AND sync status before
accepting a node
- Improved logging to only show "Found fully synced beacon node" when an
actual switch occurs (reduces log noise)
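
The circular failover described above can be sketched as a ring over the configured endpoints. This is an illustration only: `hostRing`, `newHostRing`, `Current`, and `Next` are hypothetical names, not the API of `grpcConnectionProvider`.

```go
package main

import (
	"errors"
	"fmt"
)

// hostRing cycles through a fixed list of endpoints.
// All names here are hypothetical; this is not Prysm's grpcConnectionProvider.
type hostRing struct {
	hosts   []string
	current int
}

// newHostRing rejects an empty list so Current and Next are always valid.
func newHostRing(hosts []string) (*hostRing, error) {
	if len(hosts) == 0 {
		return nil, errors.New("at least one host is required")
	}
	return &hostRing{hosts: hosts}, nil
}

// Current returns the endpoint in use.
func (r *hostRing) Current() string { return r.hosts[r.current] }

// Next advances to the following endpoint, wrapping to the first after the last.
func (r *hostRing) Next() string {
	r.current = (r.current + 1) % len(r.hosts)
	return r.hosts[r.current]
}

func main() {
	ring, err := newHostRing([]string{"127.0.0.1:33005", "127.0.0.1:33012"})
	if err != nil {
		panic(err)
	}
	fmt.Println(ring.Current()) // 127.0.0.1:33005
	fmt.Println(ring.Next())    // 127.0.0.1:33012
	fmt.Println(ring.Next())    // 127.0.0.1:33005 (wrapped around)
}
```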


I removed the old middleware that uses gRPC's built-in load balancer
because:

- gRPC's pick_first load balancer doesn't provide sync-status-aware
failover
- The validator needs to ensure it connects to a fully synced node, not
just a reachable one
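
To illustrate the difference between "reachable" and "acceptable", here is a small sketch of a sync-status-aware selection. The `nodeStatus` type, the `probe` callback, and `findHealthyHost` are assumptions made for this example; the real `FindHealthyHost()` in validator.go may look different.

```go
package main

import "fmt"

// nodeStatus is a hypothetical snapshot of a beacon node health probe.
type nodeStatus struct {
	Reachable bool
	Syncing   bool
}

// acceptable reports whether a node is both reachable and fully synced.
func acceptable(s nodeStatus) bool { return s.Reachable && !s.Syncing }

// findHealthyHost tries each host once, starting at index start (assumed
// non-negative) and wrapping around, and returns the first acceptable host.
// probe is supplied by the caller and stands in for a real health check.
func findHealthyHost(hosts []string, start int, probe func(string) nodeStatus) (string, bool) {
	for i := 0; i < len(hosts); i++ {
		h := hosts[(start+i)%len(hosts)]
		if acceptable(probe(h)) {
			return h, true
		}
	}
	return "", false
}

func main() {
	statuses := map[string]nodeStatus{
		"127.0.0.1:33005": {Reachable: true, Syncing: true},  // up, but still syncing
		"127.0.0.1:33012": {Reachable: true, Syncing: false}, // up and synced
	}
	probe := func(h string) nodeStatus { return statuses[h] }
	hosts := []string{"127.0.0.1:33005", "127.0.0.1:33012"}
	host, ok := findHealthyHost(hosts, 0, probe)
	fmt.Println(host, ok) // 127.0.0.1:33012 true
}
```

A plain pick-first strategy would stop at the first reachable host in this example, which is the one that is still syncing.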

## Test Scenario

### Setup
Deployed a 4-node Kurtosis testnet with a local validator connecting to 2
beacon nodes:

```yaml
# kurtosis-grpc-fallback-test.yaml
participants:
  - el_type: nethermind
    cl_type: prysm
    validator_count: 128  # Keeps chain advancing
  - el_type: nethermind
    cl_type: prysm
    validator_count: 64
  - el_type: nethermind
    cl_type: prysm
    validator_count: 64   # Keeps chain advancing
  - el_type: nethermind
    cl_type: prysm
    validator_count: 64   # Keeps chain advancing

network_params:
  fulu_fork_epoch: 0
  seconds_per_slot: 6
```

Local validator started with:
```bash
./validator --beacon-rpc-provider=127.0.0.1:33005,127.0.0.1:33012 ...
```
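
For illustration, a comma-separated flag value like the one above could be split into individual endpoints as follows. This is a sketch under assumptions: `parseEndpoints` is a hypothetical helper, and how Prysm actually parses `--beacon-rpc-provider` is not shown in this PR description.

```go
package main

import (
	"fmt"
	"strings"
)

// parseEndpoints splits a comma-separated list of host:port values,
// trims surrounding whitespace, and drops empty entries.
// Hypothetical helper for illustration only.
func parseEndpoints(value string) []string {
	var out []string
	for _, p := range strings.Split(value, ",") {
		if p = strings.TrimSpace(p); p != "" {
			out = append(out, p)
		}
	}
	return out
}

func main() {
	endpoints := parseEndpoints("127.0.0.1:33005,127.0.0.1:33012")
	fmt.Println(len(endpoints)) // 2
	fmt.Println(endpoints[0])   // 127.0.0.1:33005
	fmt.Println(endpoints[1])   // 127.0.0.1:33012
}
```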

### Test 1: Primary Failover (cl-1 → cl-2)

1. Stopped cl-1 beacon node
2. Validator detected failure and switched to cl-2

**Logs:**
```
WARN  Beacon node is not responding, switching host currentHost=127.0.0.1:33005 nextHost=127.0.0.1:33012
DEBUG Trying gRPC endpoint newHost=127.0.0.1:33012 previousHost=127.0.0.1:33005
INFO  Failover succeeded: connected to healthy beacon node failedAttempts=[127.0.0.1:33005] newHost=127.0.0.1:33012 previousHost=127.0.0.1:33005
```

**Result:** PASSED - Validator continued submitting attestations on
cl-2

### Test 2: Circular Failover (cl-2 → cl-1)

1. Restarted cl-1, stopped cl-2
2. Validator detected failure and switched back to cl-1

**Logs:**
```
WARN  Beacon node is not responding, switching host currentHost=127.0.0.1:33012 nextHost=127.0.0.1:33005
DEBUG Trying gRPC endpoint newHost=127.0.0.1:33005 previousHost=127.0.0.1:33012
INFO  Failover succeeded: connected to healthy beacon node failedAttempts=[127.0.0.1:33012] newHost=127.0.0.1:33005 previousHost=127.0.0.1:33012
```

**Result:** PASSED - Circular fallback works correctly

## Key Log Messages

| Log Level | Message | Source |
|-----------|---------|--------|
| WARN | "Beacon node is not responding, switching host" | `changeHost()` in validator.go |
| INFO | "Switched gRPC endpoint" | `SetHost()` in grpc_connection_provider.go |
| INFO | "Found fully synced beacon node" | `FindHealthyHost()` in validator.go (only on actual switch) |

## Test Plan

- [x] Verify primary failover (cl-1 → cl-2)
- [x] Verify circular failover (cl-2 → cl-1)
- [x] Verify validator continues producing attestations after switch
- [x] Verify "Found fully synced beacon node" only logs on actual switch
(not every health check)

**What does this PR do? Why is it needed?**

**Which issue(s) does this PR fix?**

Fixes https://github.com/OffchainLabs/prysm/pull/7133



---------

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
Co-authored-by: Radosław Kapka <rkapka@wp.pl>
Co-authored-by: Manu NALEPA <enalepa@offchainlabs.com>
2026-02-02 14:51:56 +00:00
Bastin
92bd211e4d upgrade v6 to v7 (#15989)
* upgrade v6 to v7

* changelog

* update-go-ssz
2025-11-06 16:16:23 +00:00
Preston Van Loon
62fec4d1f3 Replace context.Background with testing.TB.Context where possible (#15416)
* Replace context.Background with testing.TB.Context where possible

* Fix failing tests
2025-06-16 22:09:18 +00:00
james-prysm
8c324cc491 validator client: adding in get duties v2 (#15380)
* adding in get duties v2

* gaz

* missed definition

* removing comment

* updating description
2025-06-05 15:49:57 +00:00
terence
774b9a7159 Migrate Prysm repo to Offchain Labs organization ahead of Pectra V6 (#15140)
* Migrate Prysm repo to Offchain Labs organization ahead of Pectra upgrade v6

* Replace prysmaticlabs with OffchainLabs on general markdowns

* Update mock

* Gazelle and add mock.go to excluded generated mock file
2025-04-10 15:40:39 +00:00
james-prysm
c735ed2e32 Remove use of committee list from validator client (#15039)
* wip

* fixing unit tests

* changing is aggregator function

* wip

* fully removing the use of committee from validator client, adding a wrapper type for duties

* fixing tests

* fixing linting

* fixing more tests

* changelog

* adding some more tests

* Update proto/prysm/v1alpha1/validator.go

Co-authored-by: Radosław Kapka <rkapka@wp.pl>

* radek's feedback

* removing accidentally checked in

---------

Co-authored-by: Radosław Kapka <rkapka@wp.pl>
2025-03-25 16:25:42 +00:00
james-prysm
d6ae838bbf replace receive slot with event stream (#13563)
* WIP

* event stream wip

* returning nil

* temp removing some tests

* wip health checks

* fixing conflicts

* updating fields based on linting

* fixing more errors

* fixing mocks

* fixing more mocks

* fixing more linting

* removing white space for lint

* fixing log format

* gaz

* reverting changes on grpc

* fixing unit tests

* adding in tests for health tracker and event stream

* adding more tests for streaming slot

* gaz

* Update api/client/event/event_stream.go

Co-authored-by: Radosław Kapka <rkapka@wp.pl>

* review comments

* Update validator/client/runner.go

Co-authored-by: Radosław Kapka <rkapka@wp.pl>

* Update validator/client/validator.go

Co-authored-by: Radosław Kapka <rkapka@wp.pl>

* Update validator/client/validator.go

Co-authored-by: Radosław Kapka <rkapka@wp.pl>

* Update validator/client/validator.go

Co-authored-by: Radosław Kapka <rkapka@wp.pl>

* Update validator/client/validator.go

Co-authored-by: Radosław Kapka <rkapka@wp.pl>

* Update validator/client/beacon-api/beacon_api_validator_client.go

Co-authored-by: Radosław Kapka <rkapka@wp.pl>

* Update validator/client/validator.go

Co-authored-by: Radosław Kapka <rkapka@wp.pl>

* Update validator/client/validator.go

Co-authored-by: Radosław Kapka <rkapka@wp.pl>

* addressing radek comments

* Update validator/client/validator.go

Co-authored-by: Radosław Kapka <rkapka@wp.pl>

* addressing review feedback

* moving things to below next slot ticker

* fixing tests

* update naming

* adding TODO comment

* Update api/client/beacon/health.go

Co-authored-by: Radosław Kapka <rkapka@wp.pl>

* addressing comments

* fixing broken linting

* fixing more import issues

* fixing more import issues

* linting

* updating based on radek's comments

* addressing more comments

* fixing nogo error

* fixing duplicate import

* gaz

* adding radek's review suggestion

* Update proto/prysm/v1alpha1/node.proto

Co-authored-by: Preston Van Loon <pvanloon@offchainlabs.com>

* preston review comments

* Update api/client/event/event_stream.go

Co-authored-by: Preston Van Loon <pvanloon@offchainlabs.com>

* Update validator/client/validator.go

Co-authored-by: Preston Van Loon <pvanloon@offchainlabs.com>

* addressing some more preston review items

* fixing tests for linting

* fixing missed linting

* updating based on feedback to simplify

* adding interface check at the top

* reverting some comments

* cleaning up instantiations

* reworking the health tracker

* fixing linting

* fixing more linting to adhere to interface

* adding interface check at the top of the file

* fixing unit tests

* attempting to fix dependency cycle

* addressing radek's comment

* Update validator/client/beacon-api/beacon_api_validator_client.go

Co-authored-by: Preston Van Loon <pvanloon@offchainlabs.com>

* adding more tests and feedback items

* fixing TODO comment

---------

Co-authored-by: Radosław Kapka <rkapka@wp.pl>
Co-authored-by: Preston Van Loon <pvanloon@offchainlabs.com>
2024-03-13 13:01:05 +00:00
Sammy Rosso
4ff91bebf8 Switch gomock library (#13639)
* Update gomock

* Update mockgen

* Gaz

* Go mod

* Cleanup

* Regenerate gomock

* Manually fix import
2024-02-21 18:37:17 +00:00
terence
5a66807989 Update to V5 (#13622)
* First take at updating everything to v5

* Patch gRPC gateway to use prysm v5

Fix patch

* Update go ssz

---------

Co-authored-by: Preston Van Loon <pvanloon@offchainlabs.com>
2024-02-15 05:46:47 +00:00
terencechain
d17996f8b0 Update to V4 🚀 (#12134)
* Update V3 from V4

* Fix build v3 -> v4

* Update ssz

* Update beacon_chain.pb.go

* Fix formatter import

* Update update-mockgen.sh comment to v4

* Fix conflicts. Pass build and tests

* Fix test
2023-03-17 18:52:56 +00:00
Patrice Vignola
dbeb3ee886 Onboard validator's Beacon REST API usage to e2e tests (#11704)
* WIP

* WIP

* WIP

* WIP

* WIP

* WIP

* WIP

* Onboard validator's Beacon REST API usage to e2e tests

* Remove unused variables

* Remove use_beacon_api tags

* Fix DeepSource errors

* Revert unneeded changes

* Revert evaluator changes

* Revert import reordering

* Address PR comments

* Remove all REST API e2e tests except minimal one

* Fix validator pointing to nonexistent beacon node port

Co-authored-by: Radosław Kapka <rkapka@wp.pl>
2022-12-08 14:38:56 +00:00
Patrice Vignola
0d4b98cd0a Add REST implementation for Validator's WaitForChainStart (#11654)
* Add REST implementation for Validator's WaitForChainStart

* Add missing error mapping

* Add missing bazel dependency

* Add missing tests

* Address PR comments

* Replace EventErrorJson with DefaultErrorJson

* Add tests for WaitForChainStart

* Refactor tests

* Address PR comments

* Add gazelle:build_tag use_beacon_api comment in BUILD.bazel

* Address PR comments

* Address PR comments

Co-authored-by: Radosław Kapka <rkapka@wp.pl>
2022-11-22 12:12:55 +00:00