
Use linear buckets in some places #3384

Merged
ndr-ds merged 1 commit into main from 02-20-use_linear_buckets_in_some_places on Feb 24, 2025

Conversation

@ndr-ds (Contributor) commented on Feb 21, 2025

Motivation

We currently use exponential buckets everywhere. They have the advantage of generating fewer buckets, which can be cheaper on Grafana Cloud and similar backends, but they also make the buckets very wide, which makes our Prometheus data less accurate.

Proposal

I need to spend some time later looking at the testnet data for these metrics to fine-tune the buckets, but for now I'm just changing the proxy/server latencies to use linear buckets instead.
I also changed the default starting value to 0.001, as 1 microsecond should be enough for most of what we're measuring; this also takes at least one bucket away from the metrics that use the default.
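
For reference, here's a minimal sketch of what this looks like with the `prometheus` crate (the metric name, label, and exact bucket parameters below are illustrative, not the values in this PR):

```rust
// Illustrative sketch using the `prometheus` crate directly; our own metric
// helpers are not shown here, and the name, label, and bucket parameters are
// made up.
use prometheus::{histogram_opts, register_histogram_vec, HistogramVec};

fn proxy_latency_histogram() -> HistogramVec {
    // linear_buckets(start, width, count): constant-width buckets, e.g.
    // starting at 1 ms and growing by 5 ms per bucket.
    let buckets = prometheus::linear_buckets(1.0, 5.0, 50).expect("valid bucket parameters");
    register_histogram_vec!(
        histogram_opts!(
            "proxy_request_latency_ms",
            "Proxy request latency in milliseconds",
            buckets
        ),
        &["method"]
    )
    .expect("metric can be registered")
}
```

Unlike `exponential_buckets`, every bucket produced this way has the same width, so the interpolation error on quantile queries stays bounded.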

Test Plan

Run a validator locally and check that the metrics are exported with the new buckets.

Release Plan

  • Nothing to do / These changes follow the usual release cycle.

@ndr-ds force-pushed the 02-20-use_linear_buckets_in_some_places branch from 3e55e27 to 9faa4f1 on February 21, 2025 02:09
@ndr-ds force-pushed the 02-19-have_one_thread_per_chain_on_benchmark branch from 43ce4d1 to f48ee51 on February 21, 2025 14:52
@ndr-ds force-pushed the 02-20-use_linear_buckets_in_some_places branch from 9faa4f1 to f0d0d79 on February 21, 2025 14:52
@ndr-ds force-pushed the 02-19-have_one_thread_per_chain_on_benchmark branch from f48ee51 to fd0602b on February 21, 2025 14:53
@ndr-ds force-pushed the 02-20-use_linear_buckets_in_some_places branch from f0d0d79 to 5dcb117 on February 21, 2025 14:53
@ndr-ds changed the base branch from 02-19-have_one_thread_per_chain_on_benchmark to graphite-base/3384 on February 21, 2025 15:28
@ndr-ds force-pushed the 02-20-use_linear_buckets_in_some_places branch from 5dcb117 to 6984d86 on February 21, 2025 15:28
@ndr-ds force-pushed the graphite-base/3384 branch from fd0602b to cfefa52 on February 21, 2025 15:28
@ndr-ds changed the base branch from graphite-base/3384 to main on February 21, 2025 15:28
@ndr-ds force-pushed the 02-20-use_linear_buckets_in_some_places branch from 6984d86 to a44c253 on February 21, 2025 15:28
@MathieuDutSik (Contributor) left a comment

I do not use the buckets; instead I use the sum and the counts, which gives me more raw data.

That being said, while linear scaling might be better in many cases, I think exponential is better for latency, so I do not like the global change from exponential to linear.

@ndr-ds (Contributor, Author) commented on Feb 21, 2025

Why is exponential better for latency? It gives us less accurate latency values, as I explained in the PR description.

@ndr-ds force-pushed the 02-20-use_linear_buckets_in_some_places branch from d4bb1fb to a44c253 on February 21, 2025 20:20
@MathieuDutSik (Contributor) commented

> Why is exponential better for latency? It gives us less accurate latency values, as I explained in the PR description.

I thought latency followed more of an exponential curve, but if you see it differently, fine.
But what about the number of buckets? Could we end up with too much data in Prometheus?

If it were up to me, I would remove the buckets altogether; I see them mostly as something useful for presentation.
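
Concretely, something like this is all I really need (illustrative sketch, no buckets involved); on the dashboard side the same thing is usually `rate(..._sum) / rate(..._count)`:

```rust
// Illustrative sketch only: average latency derived from a histogram's sample
// sum and count, independent of the bucket layout.
use prometheus::Histogram;

fn average_latency_ms(histogram: &Histogram) -> Option<f64> {
    let count = histogram.get_sample_count();
    if count == 0 {
        None
    } else {
        Some(histogram.get_sample_sum() / count as f64)
    }
}
```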

@ndr-ds (Contributor, Author) commented on Feb 24, 2025

The issue here is that the wider our buckets, the less accurate our quantiles (p50, p90, p99, etc.) will be, which is not good, because those are what we'll be looking at in our dashboards to monitor the system. So we need to make our buckets as narrow as possible without generating too many of them, which can be expensive when using Grafana Cloud, for example. These histograms were starting at 0.0001 ms with base-3 exponential growth. Now they start at 1 ms and grow linearly, for both the proxy and server latencies. We're probably not interested in latencies below 1 ms for these, so this is fine. This generates fewer buckets than before, and the buckets are narrower, so our data will be more accurate.
So this is a win on every front, IMHO.
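
To make the width difference concrete, here's an illustrative comparison (the bucket counts and the 10 ms linear width are made-up numbers, not the exact values in this PR):

```rust
// Illustrative only: the bucket counts and the 10 ms width are assumptions,
// not the exact values used in this PR.
fn main() {
    // Old style: base-3 exponential growth starting at 0.0001 ms. The last
    // bucket here spans roughly 160 ms to 480 ms, i.e. it is ~320 ms wide.
    let exponential = prometheus::exponential_buckets(0.0001, 3.0, 15).expect("valid parameters");

    // New style: linear buckets starting at 1 ms. Every bucket is exactly
    // 10 ms wide, so the quantile interpolation error stays bounded.
    let linear = prometheus::linear_buckets(1.0, 10.0, 14).expect("valid parameters");

    println!("exponential edges (ms): {exponential:?}");
    println!("linear edges (ms):      {linear:?}");
}
```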

@MathieuDutSik (Contributor) left a comment

I approve because there are fewer buckets with this system than with the preceding one.

@ndr-ds (Contributor, Author) commented on Feb 24, 2025

Merge activity

  • Feb 24, 11:56 AM EST: A user started a stack merge that includes this pull request via Graphite.
  • Feb 24, 11:56 AM EST: A user merged this pull request with Graphite.

@ndr-ds merged commit 55b9e8d into main on Feb 24, 2025
47 checks passed
@ndr-ds deleted the 02-20-use_linear_buckets_in_some_places branch on February 24, 2025 16:56