Node Operator Observability
Due by March 28, 2025
0% complete
The end goal of this milestone is a Grafana dashboard that we can open source and provide to node operators. We will need to add a number of new metrics to populate that dashboard.
It should contain:
Sync Metrics
- Number of connected peers
- Number of messages received through sync (per peer)
- Success/failure metrics for sync API requests (per peer)
- Rate of messages received through sync (messages per second)
- Success/failure metrics for validating messages and storing to the DB
- Percentage of sync jobs (peers to sync from) that are succeeding or failing at a given time
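
As a rough illustration of the per-peer success/failure counts and the "percentage of sync jobs succeeding" figure listed above, here is a minimal in-memory sketch. The `SyncStats` class, its method names, and the peer IDs are hypothetical and not part of the node's code; a real implementation would export these values through whatever metrics library the node uses rather than keep them in a Python object.

```python
from collections import Counter
from typing import Dict, Optional


class SyncStats:
    """Hypothetical tally of sync outcomes, keyed by peer."""

    def __init__(self) -> None:
        # (peer_id, "success" | "failure") -> number of sync API requests
        self.requests: Counter = Counter()
        # peer_id -> whether that peer's sync job is currently healthy
        self.job_healthy: Dict[str, bool] = {}

    def record_request(self, peer_id: str, ok: bool) -> None:
        """Count one sync API request and remember the job's latest state."""
        outcome = "success" if ok else "failure"
        self.requests[(peer_id, outcome)] += 1
        self.job_healthy[peer_id] = ok

    def percent_jobs_succeeding(self) -> Optional[float]:
        """Share of sync jobs whose latest request succeeded, in percent.

        Returns None when no jobs have been seen, so a dashboard can show
        "no data" instead of a misleading 0%.
        """
        if not self.job_healthy:
            return None
        healthy = sum(1 for ok in self.job_healthy.values() if ok)
        return 100.0 * healthy / len(self.job_healthy)
```

Treating a job as "succeeding" when its most recent request succeeded is only one possible definition; a windowed error rate per peer would be an equally reasonable choice.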
Indexer Metrics
- Gauge of current block being indexed
- Gauge of current block on chain
- Distance between the current block on chain and the indexer's progress (how far behind the head of the chain we are)
- Number of messages processed (per contract)
- Message validation and storage error/success rates
- Number of retryable errors encountered in the message storer
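
The "distance" metric above can be derived from the two block gauges. A minimal sketch, assuming both gauges are plain block numbers; the function name `indexer_lag` is hypothetical:

```python
def indexer_lag(chain_head_block: int, indexed_block: int) -> int:
    """Number of blocks the indexer is behind the head of the chain.

    The two gauges are sampled at slightly different moments, so the
    indexed block can briefly appear ahead of the sampled chain head.
    The result is clamped at zero so the panel never shows negative lag.
    """
    return max(chain_head_block - indexed_block, 0)
```

If both gauges are exported, the same subtraction could instead be done in the dashboard query, which would avoid maintaining a third metric.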
API Metrics
- Number of requests processed (tagged by method and by response status code)
- Response time in ms (tagged by method)
- Response status code for all unary methods
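
One way to picture the request-count and response-time metrics above is a wrapper around each unary handler. This is a sketch only: `ApiMetrics`, the `"OK"`/`"ERROR"` status labels, and the method names in the test are assumptions for illustration, and a real dashboard would more likely be fed by latency histograms than by raw lists of samples.

```python
import time
from collections import Counter, defaultdict
from typing import Any, Callable, DefaultDict, List


class ApiMetrics:
    """Hypothetical recorder for per-method request counts and latencies."""

    def __init__(self) -> None:
        # (method, status) -> number of requests processed
        self.requests: Counter = Counter()
        # method -> observed response times in milliseconds
        self.latencies_ms: DefaultDict[str, List[float]] = defaultdict(list)

    def observe(self, method: str, status: str, duration_ms: float) -> None:
        """Record one finished request."""
        self.requests[(method, status)] += 1
        self.latencies_ms[method].append(duration_ms)

    def timed(self, method: str, handler: Callable[..., Any], *args: Any) -> Any:
        """Run a unary handler, recording its status and response time."""
        start = time.perf_counter()
        status = "OK"
        try:
            return handler(*args)
        except Exception:
            # Simplification: every exception maps to one "ERROR" label.
            status = "ERROR"
            raise
        finally:
            elapsed_ms = (time.perf_counter() - start) * 1000.0
            self.observe(method, status, elapsed_ms)
```

Recording in the `finally` clause means failed requests still contribute a count and a latency sample, which keeps the error-rate and response-time panels consistent with each other.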
Anything else we decide is important along the way