94 Commits

Author SHA1 Message Date
Ragnar
51e1ede62c docs: add comprehensive usage documentation to docker-compose.local.yml (#8441)
- Replace TODO comment with detailed usage instructions 
- Add step-by-step guide for local monitoring setup
- Include port information and service URLs
- Explain host.docker.internal configuration
2025-09-30 17:22:00 -04:00
Julien
95ce0442c3 chore: added docker support for osx (#6696)
* chore: added docker support for osx

* chore: address comments

* chore: address comments

* Update docker-compose.yml

Co-authored-by: Nico Flaig <nflaig@protonmail.com>

* chore: address comments

---------

Co-authored-by: Nico Flaig <nflaig@protonmail.com>
2024-05-24 17:18:35 +02:00
Nazar Hussain
bbfdcb4cbe chore(docker): security upgrade grafana/grafana from 8.5.25 to 8.5.27 (#5924)
fix: docker/grafana/Dockerfile to reduce vulnerabilities

The following vulnerabilities are fixed with an upgrade:
- https://snyk.io/vuln/SNYK-ALPINE315-NCURSES-5606598
- https://snyk.io/vuln/SNYK-ALPINE315-NCURSES-5606598
- https://snyk.io/vuln/SNYK-ALPINE315-OPENSSL-5661569
- https://snyk.io/vuln/SNYK-ALPINE315-OPENSSL-5661569
- https://snyk.io/vuln/SNYK-ALPINE315-OPENSSL-5788364

Co-authored-by: snyk-bot <snyk-bot@snyk.io>
2023-09-11 12:01:56 +02:00
Snyk bot
b8c239f020 chore: [Snyk] Security upgrade grafana/grafana from 8.5.22 to 8.5.25 (#5571)
* fix: docker/grafana/Dockerfile to reduce vulnerabilities

The following vulnerabilities are fixed with an upgrade:
- https://snyk.io/vuln/SNYK-ALPINE315-OPENSSL-3368753
- https://snyk.io/vuln/SNYK-ALPINE315-OPENSSL-3368753
- https://snyk.io/vuln/SNYK-ALPINE315-OPENSSL-5291790
- https://snyk.io/vuln/SNYK-ALPINE315-OPENSSL-5291790

* Update docker/grafana/Dockerfile

---------

Co-authored-by: Cayman <caymannava@gmail.com>
2023-05-30 19:17:46 +00:00
Snyk bot
7e34b462b6 [Snyk] Security upgrade grafana/grafana from 8.5.20 to 8.5.22 (#5321)
fix: docker/grafana/Dockerfile to reduce vulnerabilities

The following vulnerabilities are fixed with an upgrade:
- https://snyk.io/vuln/SNYK-ALPINE315-E2FSPROGS-3339845
- https://snyk.io/vuln/SNYK-ALPINE315-OPENSSL-3314621
- https://snyk.io/vuln/SNYK-ALPINE315-OPENSSL-3314621
- https://snyk.io/vuln/SNYK-ALPINE315-OPENSSL-3314622
- https://snyk.io/vuln/SNYK-ALPINE315-OPENSSL-3314629
2023-03-30 09:19:59 -04:00
Lion - dapplion
d1cddb7ca8 Add dockerized metrics local setup (#5173)
* Add dockerized metrics local setup

* Review PR
2023-02-20 11:08:46 -05:00
Nazar Hussain
91473d3724 [Snyk] Security upgrade grafana/grafana from 8.5.16 to 8.5.20 (#5166)
* fix: docker/grafana/Dockerfile to reduce vulnerabilities

The following vulnerabilities are fixed with an upgrade:
- https://snyk.io/vuln/SNYK-ALPINE315-OPENSSL-3314621
- https://snyk.io/vuln/SNYK-ALPINE315-OPENSSL-3314622
- https://snyk.io/vuln/SNYK-ALPINE315-OPENSSL-3314628
- https://snyk.io/vuln/SNYK-ALPINE315-OPENSSL-3314629
- https://snyk.io/vuln/SNYK-ALPINE315-OPENSSL-3314629

* Update docker/grafana/Dockerfile

---------

Co-authored-by: snyk-bot <snyk-bot@snyk.io>
Co-authored-by: Cayman <caymannava@gmail.com>
2023-02-20 14:42:35 +01:00
Nico Flaig
914397acd5 Fix prometheus in local metrics docker setup (#5123) 2023-02-10 11:01:31 -05:00
Snyk bot
a0da8cd996 [Snyk] Security upgrade grafana/grafana from 8.4.2 to 8.5.16 (#5071)
fix: docker/grafana/Dockerfile to reduce vulnerabilities

The following vulnerabilities are fixed with an upgrade:
- https://snyk.io/vuln/SNYK-ALPINE315-OPENSSL-2426331
- https://snyk.io/vuln/SNYK-ALPINE315-OPENSSL-2426331
- https://snyk.io/vuln/SNYK-ALPINE315-OPENSSL-2426331
- https://snyk.io/vuln/SNYK-ALPINE315-ZLIB-2434420
- https://snyk.io/vuln/SNYK-ALPINE315-ZLIB-2976173
2023-01-31 06:48:34 +08:00
Nico Flaig
0582f418ae Use job name to differentiate beacon and validator metrics (#4952)
Grafana dashboards now use the "job" default label instead of "scrape_location"
custom label to differentiate between beacon and validator metrics.

The prometheus config files are updated to use the expected job name.
2023-01-12 12:01:20 -05:00
Micah Zoltu
5b9a78e636 Moves Prometheus entrypoint to a non-volume directory. (#4982)
Putting the entrypoint script in `/prometheus` caused problems with updating the docker image as the entrypoint was written to the volume when it was first created and then wouldn't be overwritten by future images.  Moving the entrypoint script into a non-mounted location avoids this problem.

I also added a VOLUME line to make it clear to anyone reading the dockerfile that `/prometheus` is intended to be a mounted volume, which may also help some tools auto-discover volumes and I think docker may even automatically create a volume when the image is run (not sure on the last).  Generally speaking, I believe it is a good practice to flag intended mount points in the Dockerfile.

Note: Things will work without this change, but if the entrypoint.sh file ever changes in a future image users will have a bad time when updating to the new image from an old one so ideally this should happen sooner rather than later (ideally before a lot of people start using this published image).
2023-01-06 09:12:21 -05:00
Micah Zoltu
f85e1c953f Fixes prometheus entrypoint. (#4975)
Unfortunately I got this wrong in my previous PR.  This should fix it though (tested the build locally and it works/runs).
2023-01-05 09:25:21 -05:00
Micah Zoltu
84b798dac4 Publishes Grafana and Prometheus images for monitoring. (#4927) 2023-01-04 10:04:16 -05:00
Lion - dapplion
7c54815d42 Build from source by default (#4359)
* Optimize build from source

* Build from source by default

* Don't remove source

* Print layer history in CI
2022-08-05 09:31:49 -05:00
dadepo
a987fa9aa2 Add doppelganger support (#3883)
* exposing liveness endpoint from beacon node

* Added test for the liveness endpoint

* update description of test

* Correct computing end slot for an epoch

* Added Doppelganger service which polls the beacon nodes liveness endpoint (WIP)

* detect doppelganger based on data from liveness endpoint

* simplify doppelganger registration

* Add doppelganger guide to signing of blocks

* Fix test by only creating an instance - which starts doppelganger protection - only if it is enabled

* doppelganger added to signing attestation

* filter duties by safe keys

* minor renaming

* make epoch to miss attestation configurable

* a better way to setting default value for doppelgangerEpochsToCheck

* a better way to setting default value for doppelgangerEpochsToCheck also in tests

* Move the doppelganger protection into the validatorStore methods

* Adding a test to confirm validator is shutdown in case of a doppelganger detection

* Added test for when doppelganger is on but no duplicate live keys

* Fix type errors in test after merging master

* Testing that signing block proposer is only allowed after doppelganger period has elapsed

* Testing that attesting is only allowed after doppelganger period has elapsed

* Fix type errors in tests

* Fix lint

* Some renaming, removal of unused code etc

* Improvement to test

* compute timeout used in tests

* increase test timeout

* merged in master and fix conflicts

* minor formatting

* Move getLiveness api from lodestar namespace to validator

* Minor style related fixes and reverting unnecessary changes in return statements

* Do not start doppelganger if not after first slot of genesis epoch

* adding proposer to seenBlockProposers on import of block by the beacon node

* de-duplicate doppelganger options

* fix check-types errors

* do not detect activity of same vc as doppelganger

* Renamed ImportBlockModules variable back to PR. Will rename in another PR

* removing remainingEpochsToCheck and making enableDoppelganger an optional cli option

* fix lint errors

* Including block attesters in the liveness

* Undo renaming

* Return data directly. No need to resolve it in a promise

* fix lint

* fix integration test for the liveness endpoint

* Initial metrics to measure duration of doppelganger check

* Merged in master and fix merge conflicts

* Added doppelganger status to metrics

* Fixing more errors

* making sure generateAttestationData is setting the current slot correctly

* renamed metrics with a vc_ prefix to be inline with naming convention

* having custom bucket instead for vc_doppelganger_check_time_seconds

* Fix lint

* Attempting to expose the vc metrics endpoint also in local metrics setup. Still need to check on server if this works

* Adding epoch as a label to the doppelganger metrics

* Revert "Adding epoch as a label to the doppelganger metrics"

This reverts commit 993a4e81e6.

* Fix logging errors to console to show currentEpoch

* correct path in get liveness endpoint

* should now be observing attesters

* optimise observing block attesters

* fix pruning on finalized

* added some code re-use in tests

* on registering set status as unverified for metrics

* comment to explain the doppelganger status

* Added to metrics when doppelganger detected

* Update packages/lodestar/src/api/impl/validator/index.ts

Co-authored-by: Cayman <caymannava@gmail.com>

* Fix test by updating assertion in test to correspond to changes made to code

* fix lint errors

* Renamed enableDoppelganger => doppelgangerProtectionEnabled

* Fix lint

* fixing compile error after merging in unstable.

* Review PR

* Revert yarn-lock changes

* Review lodestar code

* removing use of unneeded Array.from

* No need to throw. Pollutes the log in an expected scenario

* include doppelganger in filter of relevant duties

* Simplify pollLiveness

* Remove unnecessary diff

* Assert doppelganger safe

* Review PR

* replace validator.stop with validator.close since validator.stop has been removed

* Update test to reflect that validator.start has been removed

* Should fix ci: check-types, test, lint and e2e

* Move options to validator

* TODO processed. Removed check introduced in the doppelganger process

* Add doppelganger unit tests

* Validate remoteKeys API data

* Fix spelling

* Log that doppelganger is enabled

* Fetch pubkeys not known yet

* Add createAttesterDuty

* Log to info doppelganger process

* Add processShutdownCallback

* Disable full e2e test in lodestar for doppelganger

* Fix test types

* Use same epoch value in test as another value could be misleading

* Remove doppelgangerStatusMetrics

* Review PR

Co-authored-by: Cayman <caymannava@gmail.com>
Co-authored-by: dapplion <35266934+dapplion@users.noreply.github.com>
2022-06-29 20:46:43 +02:00
Lion - dapplion
15d8ae2c2c Simplify gitData and version guessing (#3992)
Don't print double slash in version string

Dont add git-data.json to NPM releases

Write git-data.json only in from source docker build

Remove numCommits

Test git-data.json generation from within the test

Move comment

Revert "Dont add git-data.json to NPM releases"

This reverts commit 5fe2d388825f3e3a834058478071e8364b0d761c.

Simplify gitData and version guessing

Run cmd
2022-05-10 12:07:27 +02:00
Lion - dapplion
d0a78a209c Re-org dashboards into common folder (#3905) 2022-04-10 21:07:08 +05:30
Lion - dapplion
55180870d9 Add Gossipsub debug charts (#3904) 2022-04-10 18:33:44 +05:30
dadepo
3e1f9b26fa Support running metrics via (grafana/prometheus) on macos (#3868)
* Support running local monitoring (grafana/prometheus) via docker on macos

* create separate prometheus/grafana config files since host.docker.internal does not work cross-platform

* removed extra_hosts
2022-04-04 00:39:35 +05:30
Lion - dapplion
f63ae9d4f0 Add node exporter metrics section (#3816)
* Add node exporter metrics section

* Set exemplar false
2022-03-01 09:54:19 -06:00
Lion - dapplion
0efbf1d671 Add more range sync metrics (#3803)
* Add more sync metrics

* Bump to 8.4.2

* Lock Grafana version

* Add Sync - Range charts

* Set exemplar false
2022-02-28 22:39:26 +05:30
Lion - dapplion
fa5ffea989 Update Grafana dashboard (#3795)
* Update Grafana dashboard

* Add common datasource uid

* Set exemplar to false
2022-02-28 10:48:23 +05:30
Lion - dapplion
9a58ce3c8b Run prettier on entire repo (#3720) 2022-02-07 09:57:26 -06:00
g11tech
0e47cd31af backfill sync from an anchor checkpoint state (#3384)
* Rebased mpetrunic's PR #2637 with fixes on current master

* fixing the remove peer error

* refactoring to solve sync stuck issues on not anchored kind of errors

* read from db, validate wsCheckpoint

* backfill sequences in db to skip redoing previous backfill work

* syncrange improvs

* feedback cleanup, modular refac of sync function and metrics update

* cleanup

* Grafana Dashboard

* renaming sequences to ranges

* rebase cleanup

* shortening comment

* using initialize from's return as the anchorState

* Fix metrics

* Add Aborted enum value in lodestar_backfill_sync_status

* Only use JSDoc comment notation for JSDocs

* Simplify nullish values to be only null

* WIP

* refactoring the backfill sync, with parent-child linkage verification of last previous unverified finalized or wscheckpoint block

* cleanup and simplification of checkpoint/prev finalized checks

* initializing backfillwritten to avoid previous overwriting with an ahead value

* prev finalized or wscheckpoint lookup fix

* missing initialization

* better assignment of prev fin or ws checkpoint

* don't verify sig on genesis block

* making the extractPreviousFinOrWsCheckpoint lighter

* simplification of extractPreviousFinOrWsCheckpoint

* improving messaging

* metric for prev fin or ws block slot validation

* dashboard entry for prev fin or ws checkpoint slot for validation

* dashed line for prev fin or ws slot for better clarity

* comments cleanup and always backfill

Co-authored-by: dapplion <35266934+dapplion@users.noreply.github.com>
2022-01-06 18:32:38 +01:00
Lion - dapplion
434891d2af Reduce max old space size to 4096 (#3516) 2021-12-15 12:30:43 +01:00
Lion - dapplion
f5d6f9f2e7 Add multi-node dashboard (#3503)
* Add ad hoc filter to dashboard

* Add multinode dashboard

* Enable auto-refresh

* Scan all dashboards for exemplar
2021-12-11 09:26:40 -06:00
g11tech
1c5047c126 changing points to line in grafana graphs (#3458) 2021-11-30 08:46:46 +01:00
Cayman
f82e0730c7 Update nodejs to v16 - gallium (#3440) 2021-11-20 23:42:59 +01:00
tuyennhv
ac812ab38b Monitor validator balance (#3430)
* Monitor validator balance per epoch

* Grafana: add Correct Head Percentage panel
2021-11-16 09:25:08 +07:00
tuyennhv
40d47c1121 Grafana Dashboard: add Precompute Epoch Transition panels (#3421)
* Grafana Dashboard: add Precompute Epoch Transition panels

* Apply rate() to counter metrics
2021-11-10 15:00:01 +01:00
Lion - dapplion
6b4f9ed104 Use python3 in Dockerfile (#3426) 2021-11-10 14:59:48 +01:00
Lion - dapplion
08dbb21538 Refactor discovery consumer (#3403)
* Integrate discv5 into discovery consumer

* Start discovery

* Update test types

* Add metrics for find node queries

* Add cachedENRsSize metric

* Add dashboard

* Track dropped ENRs

* Track peersToConnect metric

* Improve metrics

* Set exemplar to false

* More charts

* Fix e2e tests

* Tune charts

* WIP test

* Uncomment retry

* Track count of sync peers

* Review libp2p options

* Disable libp2p latency monitor

* Improve PeerManager peer data

* Overshoot when connecting to peers

* Skip discv5 e2e test
2021-11-04 12:03:21 -05:00
g11tech
13b61a32d2 prefixing lodestar to lodestar metrics, suffixing with a quantifier like count,total if missing in gauge metrics (#3404) 2021-11-01 09:19:33 +07:00
g11tech
861f6e4531 regen fn metrics collapse and negative cache artifacts fix (#3261) 2021-10-28 17:11:14 +02:00
g11tech
694a6562a6 tracking UnhandledPromiseRejection(s) (#3386) 2021-10-25 16:28:42 -05:00
tuyennhv
20d4cab311 Epoch transition count metric (#3310)
* Epoch transition count metric

* Fix exemplar

* Address PR comment
2021-10-06 13:07:49 +07:00
tuyennhv
957dfb2aff Add Unknown Block Sync metrics to Grafana dashboard (#3244)
* Add Unknown Block Sync metrics to Grafana dashboard

* Revert title and uid
2021-09-24 18:58:58 +02:00
tuyennhv
f86a4381bd Add Gossip Block metrics (#3214)
* Add Gossip Block metrics

* Fix lint

* Fix check types

* Capture seenTimestamp before gossip queues

* Fix merge issue

* Calculate elappsedTimeTillProcessed in gossip handler

Co-authored-by: dapplion <35266934+dapplion@users.noreply.github.com>
2021-09-21 10:48:36 +02:00
g11tech
1041a5d2a0 updating Discv5 legends (#3181) 2021-09-16 09:27:28 +02:00
Cayman
e612b00a94 Add discv5 metrics to grafana dashboard (#3103)
* Add discv5 metrics to grafana dashboard

* Fix lint error

* Fix metrics

Co-authored-by: dapplion <35266934+dapplion@users.noreply.github.com>
2021-09-13 22:25:57 +02:00
g11tech
ec7e4192e1 rearrangement of regen fn stats panel (#3119) 2021-09-13 10:49:15 +02:00
g11tech
2c9ebf748c switching off exemplar in queries (#3114)
Co-authored-by: gajinder <gajinder@g11>
2021-09-12 12:06:45 +02:00
g11tech
71b44e667b regen metrics reference impl (#2852)
* state cache and checkpoint cache metrics across all entrypoints

* Reduce diff

* refac regen metrics based on the latest jobprocessor queue

* regen cache dashboard

* regen fn stats

* removing labels from the cache metrics

* additional state/checkpoint state cache add, size metrics

* grafana dashboard update as well as new metrics for state and statecheckpoint

* Review PR

Co-authored-by: Lion - dapplion <35266934+dapplion@users.noreply.github.com>
2021-08-28 15:36:02 +02:00
Lion - dapplion
cd6e6a4f3b Remove postinstall script (#3027) 2021-08-26 19:52:32 -05:00
Lion - dapplion
0add9d11d6 Update Grafana dashboard (#2947) 2021-08-11 09:58:29 +02:00
Lion - dapplion
8a61ea0a86 Harden docker setup (#2891)
* Disable anonymous login in Grafana

* Don't expose API port by default

* Remove cli link in package.json

* Move NODE_OPTIONS to ENV to split beacon_node and validator limits

* Revert "Remove cli link in package.json"

This reverts commit 74c9b2ec9a.
2021-07-27 09:45:51 -05:00
Lion - dapplion
6fe84e427f Add generic batch BLS verification (#2801)
* Buffer jobs in BLS queue + batch them

* Mark some gossip objects as batchable

* Chunkify batchable

* Guard against missing jobResult

* Add more comments and review

* Update BLS grafana charts

* Just assert error happens

* Fix chart equation

* Don't call metric.inc() with 0
2021-07-26 15:20:22 -05:00
Lion - dapplion
623994c533 Add head drift chart (#2844) 2021-07-19 10:23:58 -05:00
Lion - dapplion
34f55490e5 Review gossipsub handlers (#2803)
* Handle onAttestation error

* Simplify gossip validation fns

* Move gossip topic handling to validate functions

* Re-org gossip handlers

* Cleanup

* Override validate function completely

* Fix tests

* Add StrictNoSign validation

* Add gossipMeshPeersBySyncCommitteeSubnet metric

* Handle multiple forks in meshPeers metrics

* Update tests

* Rename allForksAfterAltair

* Fix merge issues

* Fix merge issues in e2e tests
2021-07-07 08:48:16 +07:00
Lion - dapplion
fde84b56bb Remove gitData script from docker build (#2815) 2021-07-06 13:24:09 -05:00