use prometheus metrics instead of opencensus #251

Open
gammazero wants to merge 3 commits into main from prometheus-metrics

@gammazero
Collaborator

Use of prometheus metrics is cleaner.

Comment thread metrics/metrics.go
Aggregation: view.Sum(),
}
CacheEvictions = prometheus.NewGauge(
prometheus.GaugeOpts{

I wonder whether this should be a counter. It may be a bit tricky, since from what I can see it may require creating a custom collector, but tools such as Grafana's Explore view use the metric type to suggest functions that work well with counters.

Collaborator Author

It would make sense as a counter, but I cannot do that because I need to read this as a value from the cache implementation.

Comment thread metrics/pebble_metrics.go
var (
cacheTag, _ = tag.NewKey("cache")
flushCount = prometheus.NewGauge(
prometheus.GaugeOpts{

Another counter candidate ;)

Collaborator Author

I cannot use a counter for this, because it is a value that is retrieved from pebble.

Comment thread metrics/metrics.go
CacheMultihashes = stats.Int64("core/cache/multihashes", "Number of cached multihashes", stats.UnitDimensionless)
CacheValues = stats.Int64("core/cache/values", "Number of cached values", stats.UnitDimensionless)
CacheEvictions = stats.Int64("core/cache/evictions", "Number of indexes evicted from cache", stats.UnitDimensionless)
CacheMisuse = stats.Int64("core/cache/misuse", "Cache clears due to high value to multihash ratio (indexer misuse)", stats.UnitDimensionless)

This seems to be removed - what is the reason?

Collaborator Author

They are not removed, just represented as Prometheus counters or gauges. For example, CacheHits is defined just below this comment.

Comment thread metrics/metrics.go
Measure: StoreSize,
Aggregation: view.LastValue(),
}
GetIndexLatency = prometheus.NewGauge(

Let's use a histogram for that.

Collaborator Author

Can we create the histogram in grafana instead, and just track the raw value here? If we can do that, then the viewer can configure whatever bucket sizes work best for them to see interesting data.
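
One consideration for the question above: a gauge exposes only the value current at scrape time, so lookups that finish between scrapes never reach Prometheus, and Grafana can bucket only the samples it receives. The sketch below is a small stdlib-only simulation of that difference, not the client_golang API. The function names, the latency values, and the bucket bounds are made up for illustration.

```go
package main

import "fmt"

// lastValueAtScrapes models a gauge that is set on every lookup: each
// scrape sees only the most recent value. scrapeEvery is the number of
// observations that occur between two scrapes.
func lastValueAtScrapes(obs []float64, scrapeEvery int) []float64 {
	if scrapeEvery < 1 {
		return nil
	}
	var seen []float64
	for i := scrapeEvery - 1; i < len(obs); i += scrapeEvery {
		seen = append(seen, obs[i])
	}
	return seen
}

// cumulativeBuckets models histogram buckets: every observation
// increments each bucket whose upper bound is >= the value. bounds must
// be sorted ascending. The final slot is the implicit +Inf bucket, which
// counts every observation.
func cumulativeBuckets(obs, bounds []float64) []uint64 {
	counts := make([]uint64, len(bounds)+1)
	for _, v := range obs {
		for i, ub := range bounds {
			if v <= ub {
				counts[i]++
			}
		}
		counts[len(bounds)]++
	}
	return counts
}

func main() {
	// Hypothetical lookup latencies in milliseconds, with one slow outlier.
	latencies := []float64{1, 250, 2, 3, 2, 1}

	// With a scrape after every third lookup, the 250 ms lookup is never seen.
	fmt.Println("gauge samples:", lastValueAtScrapes(latencies, 3))

	// Bucket counts are updated on every lookup, so the outlier is retained.
	fmt.Println("bucket counts:", cumulativeBuckets(latencies, []float64{5, 50, 500}))
}
```

The trade-off raised in the reply still holds: bucket bounds chosen in code cannot be changed later from the dashboard, so they need to be picked with the expected latency range in mind.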

Comment thread metrics/metrics.go
GetIndexLatency = stats.Float64("core/get_index_latency", "Internal lookup time for a single index", stats.UnitMilliseconds)
IngestMultihashes = stats.Int64("core/ingest_multihashes", "Number of multihashes put into the indexer", stats.UnitDimensionless)
RemovedProviders = stats.Int64("core/removed_providers", "Number of providers removed from indexer", stats.UnitDimensionless)
StoreSize = stats.Int64("core/storage_size", "Bytes of storage used to store the indexed content", stats.UnitBytes)

That one (StoreSize) and the two below it (DHMultihashLatency and DHMetadataLatency) are missing from the new metrics set. What was the reason?

Collaborator Author

StoreSize was removed because it is not used. Since some stores do not provide any indication of how much data they store, we look at disk usage instead.

DHMultihashLatency and DHMetadataLatency are not used anywhere AFAIK. storetheindex keeps metrics on the multihash/value write latency for all backends, whether using dhstore, pebble, or something else.
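
The thread does not show how disk usage is measured. As a rough illustration only, one way to approximate it is to sum the sizes of regular files under the store's data directory. The sketch below assumes that approach; `dirSize` and `sampleDirSize` are hypothetical helpers, and the result is apparent file size rather than blocks allocated on disk.

```go
package main

import (
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// dirSize returns the total apparent size, in bytes, of all regular
// files under root. Directories and symlinks are not counted.
func dirSize(root string) (int64, error) {
	var total int64
	err := filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if !d.Type().IsRegular() {
			return nil
		}
		info, err := d.Info()
		if err != nil {
			return err
		}
		total += info.Size()
		return nil
	})
	return total, err
}

// sampleDirSize builds a throwaway directory holding a 10-byte file and
// a 5-byte file in a subdirectory, then measures it with dirSize.
func sampleDirSize() (int64, error) {
	dir, err := os.MkdirTemp("", "store-size-sketch")
	if err != nil {
		return 0, err
	}
	defer os.RemoveAll(dir)

	if err := os.WriteFile(filepath.Join(dir, "a.dat"), make([]byte, 10), 0o600); err != nil {
		return 0, err
	}
	sub := filepath.Join(dir, "sub")
	if err := os.Mkdir(sub, 0o700); err != nil {
		return 0, err
	}
	if err := os.WriteFile(filepath.Join(sub, "b.dat"), make([]byte, 5), 0o600); err != nil {
		return 0, err
	}
	return dirSize(dir)
}

func main() {
	n, err := sampleDirSize()
	if err != nil {
		panic(err)
	}
	fmt.Println("bytes used:", n)
}
```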

@gammazero requested a review from byo on April 8, 2026, 11:02

2 participants