Commit Graph

14 Commits

Author SHA1 Message Date
james-prysm
641d90990d grpc fallback improvements (#16215)
<!-- Thanks for sending a PR! Before submitting:

1. If this is your first PR, check out our contribution guide here
https://docs.prylabs.network/docs/contribute/contribution-guidelines
You will then need to sign our Contributor License Agreement (CLA),
which will show up as a comment from a bot in this pull request after
you open it. We cannot review code without a signed CLA.
2. Please file an associated tracking issue if this pull request is
non-trivial and requires context for our team to understand. All
features and most bug fixes should have
an associated issue with a design discussed and decided upon. Small bug
   fixes and documentation improvements don't need issues.
3. New features and bug fixes must have tests. Documentation may need to
be updated. If you're unsure what to update, send the PR, and we'll
discuss
   in review.
4. Note that PRs updating dependencies and new Go versions are not
accepted.
   Please file an issue instead.
5. A changelog entry is required for user facing issues.
-->

**What type of PR is this?**

## Summary

This PR implements gRPC fallback support for the validator client,
allowing it to automatically switch between multiple beacon node
endpoints when the primary node becomes unavailable or unhealthy.

## Changes

- Added `grpcConnectionProvider` to manage multiple gRPC connections
with circular failover
- Validator automatically detects unhealthy beacon nodes and switches to
the next available endpoint
- Health checks verify both node responsiveness AND sync status before
accepting a node
- Improved logging to only show "Found fully synced beacon node" when an
actual switch occurs (reduces log noise)


I removed the old middleware that uses gRPC's built in load balancer
because:

- gRPC's pick_first load balancer doesn't provide sync-status-aware
failover
- The validator needs to ensure it connects to a fully synced node, not
just a reachable one

## Test Scenario

### Setup
Deployed a 4-node Kurtosis testnet with local validator connecting to 2
beacon nodes:

```yaml
# kurtosis-grpc-fallback-test.yaml
participants:
  - el_type: nethermind
    cl_type: prysm
    validator_count: 128  # Keeps chain advancing
  - el_type: nethermind
    cl_type: prysm
    validator_count: 64
  - el_type: nethermind
    cl_type: prysm
    validator_count: 64   # Keeps chain advancing
  - el_type: nethermind
    cl_type: prysm
    validator_count: 64   # Keeps chain advancing

network_params:
  fulu_fork_epoch: 0
  seconds_per_slot: 6
```

Local validator started with:
```bash
./validator --beacon-rpc-provider=127.0.0.1:33005,127.0.0.1:33012 ...
```

### Test 1: Primary Failover (cl-1 → cl-2)

1. Stopped cl-1 beacon node
2. Validator detected failure and switched to cl-2

**Logs:**
```
WARN  Beacon node is not responding, switching host currentHost=127.0.0.1:33005 nextHost=127.0.0.1:33012
DEBUG Trying gRPC endpoint newHost=127.0.0.1:33012 previousHost=127.0.0.1:33005
INFO  Failover succeeded: connected to healthy beacon node failedAttempts=[127.0.0.1:33005] newHost=127.0.0.1:33012 previousHost=127.0.0.1:33005
```

**Result:**  PASSED - Validator continued submitting attestations on
cl-2

### Test 2: Circular Failover (cl-2 → cl-1)

1. Restarted cl-1, stopped cl-2
2. Validator detected failure and switched back to cl-1

**Logs:**
```
WARN  Beacon node is not responding, switching host currentHost=127.0.0.1:33012 nextHost=127.0.0.1:33005
DEBUG Trying gRPC endpoint newHost=127.0.0.1:33005 previousHost=127.0.0.1:33012
INFO  Failover succeeded: connected to healthy beacon node failedAttempts=[127.0.0.1:33012] newHost=127.0.0.1:33005 previousHost=127.0.0.1:33012
```

**Result:**  PASSED - Circular fallback works correctly

## Key Log Messages

| Log Level | Message | Source |
|-----------|---------|--------|
| WARN | "Beacon node is not responding, switching host" |
`changeHost()` in validator.go |
| INFO | "Switched gRPC endpoint" | `SetHost()` in
grpc_connection_provider.go |
| INFO | "Found fully synced beacon node" | `FindHealthyHost()` in
validator.go (only on actual switch) |

## Test Plan

- [x] Verify primary failover (cl-1 → cl-2)
- [x] Verify circular failover (cl-2 → cl-1)
- [x] Verify validator continues producing attestations after switch
- [x] Verify "Found fully synced beacon node" only logs on actual switch
(not every health check)

**What does this PR do? Why is it needed?**

**Which issues(s) does this PR fix?**

Fixes # https://github.com/OffchainLabs/prysm/pull/7133


**Other notes for review**

**Acknowledgements**

- [x] I have read
[CONTRIBUTING.md](https://github.com/prysmaticlabs/prysm/blob/develop/CONTRIBUTING.md).
- [x] I have included a uniquely named [changelog fragment
file](https://github.com/prysmaticlabs/prysm/blob/develop/CONTRIBUTING.md#maintaining-changelogmd).
- [x] I have added a description with sufficient context for reviewers
to understand this PR.
- [x] I have tested that my changes work as expected and I added a
testing plan to the PR description (if applicable).

---------

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
Co-authored-by: Radosław Kapka <rkapka@wp.pl>
Co-authored-by: Manu NALEPA <enalepa@offchainlabs.com>
2026-02-02 14:51:56 +00:00
Bastin
6b5ba5ad01 Switch logging from using prefixes to the new package path format (#16059)
#### This PR sets the foundation for the new logging features.

---

The goal of this big PR is the following:
1. Adding a log.go file to every package:
[_commit_](54f6396d4c)
- Writing a bash script that adds the log.go file to every package that
imports logrus, except the excluded packages, configured at the top of
the bash script.
- the log.go file creates a log variable and sets a field called
`package` to the full path of that package.
- I have tried to fix every error/problem that came from mass generation
of this file. (duplicate declarations, different prefix names, etc...)
- some packages had the log.go file from before, and had some helper
functions in there as well. I've moved all of them to a `log_helpers.go`
file within each package.

2. Create a CI rule which verifies that:
[_commit_](b799c3a0ef)
- every package which imports logrus, also has a log.go file, except the
excluded packages.
- the `package` field of each log.go variable, has the correct path. (to
detect when we move a package or change it's name)
- I pushed a commit with a manually changed log.go file to trigger the
ci check failure and it worked.

3. Alter the logging system to read the prefix from this `package` field
for every log while outputing:
[_commit_](b0c7f1146c)
- some packages have/want/need a different log prefix than their package
name (like `kv`). This can be solved by keeping a map of package paths
to prefix names somewhere.
    
    
---

**Some notes:**
- Please review everything carefully.
- I created the `prefixReplacement` map and populated the data that I
deemed necessary. Please check it and complain if something doesn't make
sense or is missing. I attached at the bottom, the list of all the
packages that used to use a different name than their package name as
their prefix.
- I have chosen to mark some packages to be excluded from this whole
process. They will either not log anything, or log without a prefix, or
log using their previously defined prefix. See the list of exclusions in
the bottom.
- I fixed all the tests that failed because of this change. These were
failing because they were expecting the old prefix to be in the
generated logs. I have changed those to expect the new `package` field
instead. This might not be a great solution. Ideally we might want to
remove this from the tests so they only test for relevant fields in the
logs. but this is a problem for another day.
- Please run the node with this config, and mention if you see something
weird in the logs. (use different verbosities)
- The CI workflow uses a script that basically runs the
`hack/gen-logs.sh` and checks that the git diff is zero. that script is
`hack/check-logs.sh`. This means that if one runs this script locally,
it will not actually _check_ anything, rather than just regenerate the
log.go files and fix any mistake. This might be confusing. Please
suggest solutions if you think it's a problem.

---

**A list of packages that used a different prefix than their package
names for their logs:**

- beacon-chain/cache/depositsnapshot/ package depositsnapshot, prefix
"cache"
- beacon-chain/core/transition/log.go — package transition, prefix
"state"
  - beacon-chain/db/kv/log.go — package kv, prefix "db"
- beacon-chain/db/slasherkv/log.go — package slasherkv, prefix
"slasherdb"
- beacon-chain/db/pruner/pruner.go — package pruner, prefix "db-pruner"
- beacon-chain/light-client/log.go — package light_client, prefix
"light-client"
- beacon-chain/operations/attestations/log.go — package attestations,
prefix "pool/attestations"
- beacon-chain/operations/slashings/log.go — package slashings, prefix
"pool/slashings"
  - beacon-chain/rpc/core/log.go — package core, prefix "rpc/core"
- beacon-chain/rpc/eth/beacon/log.go — package beacon, prefix
"rpc/beaconv1"
- beacon-chain/rpc/eth/validator/log.go — package validator, prefix
"beacon-api"
- beacon-chain/rpc/prysm/v1alpha1/beacon/log.go — package beacon, prefix
"rpc"
- beacon-chain/rpc/prysm/v1alpha1/validator/log.go — package validator,
prefix "rpc/validator"
- beacon-chain/state/stategen/log.go — package stategen, prefix
"state-gen"
- beacon-chain/sync/checkpoint/log.go — package checkpoint, prefix
"checkpoint-sync"
- beacon-chain/sync/initial-sync/log.go — package initialsync, prefix
"initial-sync"
  - cmd/prysmctl/p2p/log.go — package p2p, prefix "prysmctl-p2p"
  - config/features/log.go -- package features, prefix "flags"
  - io/file/log.go — package file, prefix "fileutil"
  - proto/prysm/v1alpha1/log.go — package eth, prefix "protobuf"
- validator/client/beacon-api/log.go — package beacon_api, prefix
"beacon-api"
  - validator/db/kv/log.go — package kv, prefix "db"
  - validator/db/filesystem/db.go — package filesystem, prefix "db"
- validator/keymanager/derived/log.go — package derived, prefix
"derived-keymanager"
- validator/keymanager/local/log.go — package local, prefix
"local-keymanager"
- validator/keymanager/remote-web3signer/log.go — package
remote_web3signer, prefix "remote-keymanager"
- validator/keymanager/remote-web3signer/internal/log.go — package
internal, prefix "remote-web3signer-
    internal"
- beacon-chain/forkchoice/doubly... prefix is
"forkchoice-doublylinkedtree"
  
  
  
**List of excluded directories (their subdirectories are also
excluded):**
  ```
  EXCLUDED_PATH_PREFIXES=(
      "testing"
      "validator/client/testutil"
      "beacon-chain/p2p/testing"
      "beacon-chain/rpc/eth/config"
      "beacon-chain/rpc/prysm/v1alpha1/debug"
      "tools"
      "runtime"
      "monitoring"
      "io"
      "cmd"
      ".well-known"
      "changelog"
      "hack"
      "specrefs"
      "third_party"
      "bazel-out"
      "bazel-bin"
      "bazel-prysm"
      "bazel-testlogs"
      "build"
      ".github"
      ".jj"
      ".idea"
      ".vscode"
)
```
2026-01-05 14:15:20 +00:00
Preston Van Loon
2fd6bd8150 Add golang.org/x/tools modernize static analyzer and fix violations (#15946)
* Ran gopls modernize to fix everything

go run golang.org/x/tools/gopls/internal/analysis/modernize/cmd/modernize@latest -fix -test ./...

* Override rules_go provided dependency for golang.org/x/tools to v0.38.0.

To update this, checked out rules_go, then ran `bazel run //go/tools/releaser -- upgrade-dep -mirror=false org_golang_x_tools` and copied the patches.

* Fix buildtag violations and ignore buildtag violations in external

* Introduce modernize analyzer package.

* Add modernize "any" analyzer.

* Fix violations of any analyzer

* Add modernize "appendclipped" analyzer.

* Fix violations of appendclipped

* Add modernize "bloop" analyzer.

* Add modernize "fmtappendf" analyzer.

* Add modernize "forvar" analyzer.

* Add modernize "mapsloop" analyzer.

* Add modernize "minmax" analyzer.

* Fix violations of minmax analyzer

* Add modernize "omitzero" analyzer.

* Add modernize "rangeint" analyzer.

* Fix violations of rangeint.

* Add modernize "reflecttypefor" analyzer.

* Fix violations of reflecttypefor analyzer.

* Add modernize "slicescontains" analyzer.

* Add modernize "slicessort" analyzer.

* Add modernize "slicesdelete" analyzer. This is disabled by default for now. See https://go.dev/issue/73686.

* Add modernize "stringscutprefix" analyzer.

* Add modernize "stringsbuilder" analyzer.

* Fix violations of stringsbuilder analyzer.

* Add modernize "stringsseq" analyzer.

* Add modernize "testingcontext" analyzer.

* Add modernize "waitgroup" analyzer.

* Changelog fragment

* gofmt

* gazelle

* Add modernize "newexpr" analyzer.

* Disable newexpr until go1.26

* Add more details in WORKSPACE on how to update the override

* @nalepae feedback on min()

* gofmt

* Fix violations of forvar
2025-11-14 01:27:22 +00:00
Bastin
92bd211e4d upgrade v6 to v7 (#15989)
* upgrade v6 to v7

* changelog

* update-go-ssz
2025-11-06 16:16:23 +00:00
Preston Van Loon
62fec4d1f3 Replace context.Background with testing.TB.Context where possible (#15416)
* Replace context.Background with testing.TB.Context where possible

* Fix failing tests
2025-06-16 22:09:18 +00:00
terence
774b9a7159 Migrate Prysm repo to Offchain Labs organization ahead of Pectra V6 (#15140)
* Migrate Prysm repo to Offchain Labs organization ahead of Pectra upgrade v6

* Replace prysmaticlabs with OffchainLabs on general markdowns

* Update mock

* Gazelle and add mock.go to excluded generated mock file
2025-04-10 15:40:39 +00:00
Nishant Das
5d6a406829 Update to Go 1.23 (#14818)
* Update to Go 1.23

* Update bazel version

* Update rules_go

* Use toolchains_protoc

* Update go_honnef_go_tools

* Update golang.org/x/tools

* Fix violations of SA3000

* Update errcheck by re-exporting the upstream repo

* Remove problematic ginkgo and gomega test helpers. Rewrote tests without these test libraries.

* Update go to 1.23.5

* gofmt with go1.23.5

* Revert Patch

* Unclog

* Update for go 1.23 support

* Fix Lint Issues

* Gazelle

* Fix Build

* Fix Lint

* no lint

* Fix lint

* Fix lint

* Disable intrange

* Preston's review

---------

Co-authored-by: Preston Van Loon <preston@pvl.dev>
2025-01-24 04:53:23 +00:00
james-prysm
45fd3eb1bf gRPC Gateway Removal (#14089)
* wip passing e2e

* reverting temp comment

* remove unneeded comments

* fixing merge errors

* fixing more bugs from merge

* fixing test

* WIP moving code around and fixing tests

* unused linting

* gaz

* temp removing these tests as we need placeholder/wrapper APIs for them with the removal of the gateway

* attempting to remove dependencies to gRPC gateway , 1 mroe left in deps.bzl

* renaming flags and other gateway services to http

* goimport

* fixing deepsource

* git mv

* Update validator/package/validator.yaml

Co-authored-by: Radosław Kapka <rkapka@wp.pl>

* Update validator/package/validator.yaml

Co-authored-by: Radosław Kapka <rkapka@wp.pl>

* Update cmd/beacon-chain/flags/base.go

Co-authored-by: Radosław Kapka <rkapka@wp.pl>

* Update cmd/beacon-chain/flags/base.go

Co-authored-by: Radosław Kapka <rkapka@wp.pl>

* Update cmd/beacon-chain/flags/base.go

Co-authored-by: Radosław Kapka <rkapka@wp.pl>

* addressing feedback

* missed lint

* renaming import

* reversal based on feedback

* fixing web ui registration

* don't require mux handler

* gaz

* removing gRPC service from validator completely, merged with http service, renames are a work in progress

* updating go.sum

* linting

* trailing white space

* realized there was more cleanup i could do with code reuse

* adding wrapper for routes

* reverting version

* fixing dependencies from merging develop

* gaz

* fixing unit test

* fixing dependencies

* reverting unit test

* fixing conflict

* updating change log

* Update log.go

Co-authored-by: Preston Van Loon <pvanloon@offchainlabs.com>

* gaz

* Update api/server/httprest/server.go

Co-authored-by: Preston Van Loon <pvanloon@offchainlabs.com>

* addressing some feedback

* forgot to remove deprecated flag in usage

* gofmt

* fixing test

* fixing deepsource issue

* moving deprecated flag and adding timeout handler

* missed removal of a flag

* fixing test:

* Update CHANGELOG.md

Co-authored-by: Radosław Kapka <rkapka@wp.pl>

* addressing feedback

* updating comments based on feedback

* removing unused field for now, we can add it back in if we need to use the option

* removing unused struct

* changing api-timeout flag based on feedback

---------

Co-authored-by: Radosław Kapka <rkapka@wp.pl>
Co-authored-by: Preston Van Loon <pvanloon@offchainlabs.com>
2024-09-04 15:40:31 +00:00
terence
5a66807989 Update to V5 (#13622)
* First take at updating everything to v5

* Patch gRPC gateway to use prysm v5

Fix patch

* Update go ssz

---------

Co-authored-by: Preston Van Loon <pvanloon@offchainlabs.com>
2024-02-15 05:46:47 +00:00
james-prysm
0266609bf6 bugfix : Eth-Consensus-Version header on response header (#12600)
* adding in custom header

* adding in parsing for middleware

* fixing casing

* add handling on error as well

* changing how error is handled for header

* changing how error is handled

* fixing casing

* Update beacon-chain/rpc/eth/beacon/blocks.go

Co-authored-by: Radosław Kapka <rkapka@wp.pl>

* Update beacon-chain/rpc/eth/beacon/blinded_blocks.go

Co-authored-by: Radosław Kapka <rkapka@wp.pl>

* Update beacon-chain/rpc/eth/beacon/blinded_blocks.go

Co-authored-by: Radosław Kapka <rkapka@wp.pl>

* Update beacon-chain/rpc/eth/beacon/blocks.go

Co-authored-by: Radosław Kapka <rkapka@wp.pl>

* fixing unit tests and review comment

* making some constants consistent in 1 file

* fixing missed blinded blocks

* fixing constants in custom handler tests

---------

Co-authored-by: Radosław Kapka <rkapka@wp.pl>
Co-authored-by: Raul Jordan <raul@prysmaticlabs.com>
2023-07-11 13:27:23 -05:00
terencechain
d17996f8b0 Update to V4 🚀 (#12134)
* Update V3 from V4

* Fix build v3 -> v4

* Update ssz

* Update beacon_chain.pb.go

* Fix formatter import

* Update update-mockgen.sh comment to v4

* Fix conflicts. Pass build and tests

* Fix test
2023-03-17 18:52:56 +00:00
Raul Jordan
d077483577 Add V3 Suffix to All Prysm Packages (#11083)
* v3 import renamings

* tidy

* fmt

* rev

* Update beacon-chain/core/epoch/precompute/reward_penalty_test.go

* Update beacon-chain/core/helpers/validators_test.go

* Update beacon-chain/db/alias.go

* Update beacon-chain/db/alias.go

* Update beacon-chain/db/alias.go

* Update beacon-chain/db/iface/BUILD.bazel

* Update beacon-chain/db/kv/kv.go

* Update beacon-chain/db/kv/state.go

* Update beacon-chain/rpc/prysm/v1alpha1/validator/attester_test.go

* Update beacon-chain/rpc/prysm/v1alpha1/validator/attester_test.go

* Update beacon-chain/sync/initial-sync/service.go

* fix deps

* fix bad replacements

* fix bad replacements

* change back

* gohashtree version

* fix deps

Co-authored-by: Nishant Das <nishdas93@gmail.com>
Co-authored-by: Potuz <potuz@prysmaticlabs.com>
2022-08-16 12:20:13 +00:00
Raul Jordan
a9a4bb9163 Move Shared/Testutil into Testing (#9659)
* move testutil

* util pkg

* build

* gaz

Co-authored-by: prylabs-bulldozer[bot] <58059840+prylabs-bulldozer[bot]@users.noreply.github.com>
2021-09-23 18:53:46 +00:00
Raul Jordan
8f0008c45a Move Related Packages Into API/ Pkg (#9619)
* begin on api pkg

* build

* build

Co-authored-by: prylabs-bulldozer[bot] <58059840+prylabs-bulldozer[bot]@users.noreply.github.com>
2021-09-16 19:55:51 +00:00