Skip to content

Change metric naming, unit, and http fields #213

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 4 commits into from
Closed
Changes from 3 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
28 changes: 14 additions & 14 deletions internal/scti/handlers.go
Original file line number Diff line number Diff line change
Expand Up @@ -68,48 +68,48 @@ var (
lastSCTTimestamp *prometheus.GaugeVec // origin => value
reqsCounter *prometheus.CounterVec // origin, op => value
rspsCounter *prometheus.CounterVec // origin, op, code => value
rspLatency *prometheus.HistogramVec // origin, op, code => value
rspDuration *prometheus.HistogramVec // origin, op, code => value
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Out of curiosity, what's the thinking behind the name change?
What I think this metric represents is called latency distribution in, er, other automatic monitoring systems we're intimately familiar with :)

If you're intent on changing it, "response duration" sounds a bit odd to my ears, but how about rspTime? Also, the description of the metric still explains it in terms of Latency of responses, so I guess that needs updating to match.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apologies, this PR lacked something very important: a link to https://prometheus.io/docs/practices/naming/, I've now added it. I'm 100% onboard with your comments and this looked a bit odd to me at first sight, being used to other monitoring systems... but it seems to be the prometheus way.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the intent of the naming doc you linked was the http_ prefix there, and the ..._request_duration_seconds was just an example.

I found this "counter" (hah!) example in the histogram section of the prom docs:

A common example would be the time it takes to reply to a request, called latency.

So I think there's latitude to fit in with styles we're familiar with.

)

// setupMetrics initializes all the exported metrics.
func setupMetrics() {
// TODO(phboneff): add metrics for deduplication and chain storage.
knownLogs = promauto.NewGaugeVec(
prometheus.GaugeOpts{
Name: "known_logs",
Name: "ct_known_logs_info",
Help: "Set to 1 for known logs",
},
[]string{"origin"})
lastSCTTimestamp = promauto.NewGaugeVec(
prometheus.GaugeOpts{
Name: "last_sct_timestamp",
Help: "Time of last SCT in ms since epoch",
Name: "ct_sct_last_timestamp_seconds",
Help: "Time of last SCT in seconds since epoch",
},
[]string{"origin"})
lastSCTIndex = promauto.NewGaugeVec(
prometheus.GaugeOpts{
Name: "last_sct_index",
Name: "ct_sct_last_index",
Help: "Index of last SCT",
},
[]string{"origin"})
reqsCounter = promauto.NewCounterVec(
prometheus.CounterOpts{
Name: "http_reqs",
Name: "ct_http_requests_total",
Help: "Number of requests",
},
[]string{"origin", "ep"})
[]string{"origin", "operation"})
rspsCounter = promauto.NewCounterVec(
prometheus.CounterOpts{
Name: "http_rsps",
Name: "ct_http_responses_total",
Help: "Number of responses",
},
[]string{"origin", "op", "code"})
rspLatency = promauto.NewHistogramVec(
[]string{"origin", "operation", "code"})
rspDuration = promauto.NewHistogramVec(
prometheus.HistogramOpts{
Name: "http_latency",
Name: "ct_http_request_duration_seconds",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: variable naming is in terms of "response", but metric name is now in terms of "request" - would be good to keep them consistent if possible.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh yeah, good point!

Help: "Latency of responses in seconds",
},
[]string{"origin", "op", "code"})
[]string{"origin", "operation", "code"})
}

// entrypoints is a list of entrypoint names as exposed in statistics/logging.
Expand Down Expand Up @@ -140,7 +140,7 @@ func (a appHandler) ServeHTTP(w http.ResponseWriter, r *http.Request) {
a.opts.RequestLog.origin(logCtx, a.log.origin)
defer func() {
latency := a.opts.TimeSource.Now().Sub(startTime).Seconds()
rspLatency.WithLabelValues(label0, label1, strconv.Itoa(statusCode)).Observe(latency)
rspDuration.WithLabelValues(label0, label1, strconv.Itoa(statusCode)).Observe(latency)
}()
klog.V(2).Infof("%s: request %v %q => %s", a.log.origin, r.Method, r.URL, a.name)
// TODO(phboneff): add a.Method directly on the handler path and remove this test.
Expand Down Expand Up @@ -352,7 +352,7 @@ func addChainInternal(ctx context.Context, opts *HandlerOptions, log *log, w htt
}
klog.V(3).Infof("%s: %s <= SCT", log.origin, method)
if sct.Timestamp == timeMillis {
lastSCTTimestamp.WithLabelValues(log.origin).Set(float64(sct.Timestamp))
lastSCTTimestamp.WithLabelValues(log.origin).Set(float64(sct.Timestamp) / 1000)
lastSCTIndex.WithLabelValues(log.origin).Set(float64(idx))
}

Expand Down
Loading