feat: Add waiting duration metric to query gate by AdeshDeshmukh · Pull Request #18378 · prometheus/prometheus

AdeshDeshmukh · 2026-03-26T16:55:37Z

The query gate limits concurrent requests but we had no visibility into how long requests wait when the limit is hit.

This adds a histogram metric to track waiting duration, so operators can see if the gate is becoming a bottleneck and whether they need to increase the concurrency limit.

The metric is named 'prometheus_query_gate_waiting_duration_seconds' and uses standard histogram buckets. Waiting time is measured from when Start() is called until the request acquires a gate slot.

This includes comprehensive tests covering normal operation, context cancellation, and metric recording.

Fixes: prometheus#11365

Signed-off-by: Test User <[email protected]>

ogulcanaydogan

Hi @AdeshDeshmukh — I also have an open PR for this issue (#18355).

A few observations on this approach:

- Global metric via promauto: The histogram is a package-level singleton, which means it can't be customized per caller and is harder to test (can't verify observations through a test registry). #18355 uses the prometheus.Registerer pattern (like util/notifications) so the caller controls naming and registration.
- No New() signature change: This keeps backward compatibility, but it also means the metric is always registered, even if the gate is used in a context where metrics aren't wanted.
- Metric naming: prometheus_query_gate_waiting_duration_seconds assumes the gate is only used for queries. The remote read handler also uses it, so a more generic name (or caller-provided prefix) might be better.

Happy to collaborate on converging the approaches — the core logic (measure time.Since(start) in Start()) is the same in both PRs.

AdeshDeshmukh requested a review from a team as a code owner March 26, 2026 16:55

AdeshDeshmukh requested a review from cristiangreco March 26, 2026 16:55

AdeshDeshmukh mentioned this pull request Mar 26, 2026

Gate needs a waiting duration metric #11365

Open

AdeshDeshmukh force-pushed the add-gate-waiting-metric branch from 4b17969 to 92241cf Compare March 26, 2026 16:58

ogulcanaydogan reviewed Mar 27, 2026

bboreham mentioned this pull request Apr 7, 2026

remote: add waiting duration metric to remote read handler gate #18450

Closed

This was referenced Apr 14, 2026

remote: add gate waiting duration metric #18491

Open

util/gate: add waiting duration histogram metric #18509

Closed

AdeshDeshmukh wants to merge 1 commit into prometheus:main from AdeshDeshmukh:add-gate-waiting-metric
