Scrape: add string interning to scrape cache #18501

Open

bobrik wants to merge 1 commit into prometheus:main from bobrik:ivan/intern-scrape-cache
Conversation

@bobrik (Contributor) commented Apr 9, 2026

A lot of memory is used by scrape cache:

```
(pprof) list addRef
Total: 99.23GB
ROUTINE ======================== github.com/prometheus/prometheus/scrape.(*scrapeCache).addRef in scrape/scrape.go
    6.78GB     6.78GB (flat, cum)  6.83% of Total
         .          .   1085:func (c *scrapeCache) addRef(met []byte, ref storage.SeriesRef, lset labels.Labels, hash uint64) (ce *cacheEntry) {
         .          .   1086:   if ref == 0 {
         .          .   1087:           return nil
         .          .   1088:   }
    1.73GB     1.73GB   1089:   ce = &cacheEntry{ref: ref, lastIter: c.iter, lset: lset, hash: hash}
    5.04GB     5.04GB   1090:   c.series[string(met)] = ce
         .          .   1091:   return ce
         .          .   1092:}
         .          .   1093:
         .          .   1094:func (c *scrapeCache) addDropped(met []byte) {
         .          .   1095:   iter := c.iter
```

Here `met` is a timeseries identity as it comes from the scrape, so it is very much shared between instances. Instead of storing 255 instances of this coming from every target:

```
edgeworker_request_internalExceptions{isActor="true",exceptionType="overloaded",stableId="cloudflare/cf_imgresize",plan="ss",cordon="paid"}
```

one can store 255 pointer-sized handles to a single shared copy, which is a lot more compact.

I built a vanilla Prometheus binary and a patched binary and let them both scrape the same targets in a datacenter with `sum(scrape_samples_scraped)` of about 10 million.

After running both in parallel for a relatively short period of time:

* Control:

```
(pprof) top30
Showing nodes accounting for 33.55GB, 94.28% of 35.58GB total
Dropped 702 nodes (cum <= 0.18GB)
Showing top 30 nodes out of 95
      flat  flat%   sum%        cum   cum%
    4.84GB 13.60% 13.60%     4.84GB 13.60%  github.com/prometheus/prometheus/model/labels.(*Builder).Labels
    4.10GB 11.53% 25.14%     4.10GB 11.53%  github.com/prometheus/prometheus/scrape.(*scrapeCache).addRef
    3.90GB 10.96% 36.09%     4.58GB 12.88%  github.com/prometheus/prometheus/tsdb.newMemSeries (inline)
    3.56GB  9.99% 46.09%     3.56GB  9.99%  github.com/prometheus/prometheus/tsdb/chunkenc.NewXORChunk (inline)
    2.99GB  8.40% 54.49%     2.99GB  8.40%  github.com/prometheus/prometheus/tsdb/index.appendWithExponentialGrowth[go.shape.uint64] (inline)
    2.41GB  6.78% 61.27%     2.41GB  6.78%  github.com/prometheus/prometheus/tsdb.(*txRing).add
    1.37GB  3.84% 65.11%     1.37GB  3.84%  github.com/prometheus/prometheus/scrape.(*scrapeCache).trackStaleness
    1.31GB  3.68% 68.79%     1.31GB  3.68%  github.com/prometheus/prometheus/scrape.(*scrapeCache).addDropped
    1.22GB  3.44% 72.23%     1.22GB  3.44%  github.com/prometheus/prometheus/scrape.NewManager.func1
    1.10GB  3.08% 75.31%     1.10GB  3.08%  github.com/prometheus/prometheus/tsdb/chunkenc.(*XORChunk).Appender
    1.08GB  3.05% 78.36%     5.76GB 16.19%  github.com/prometheus/prometheus/tsdb.(*memSeries).cutNewHeadChunk
    0.91GB  2.56% 80.93%     0.92GB  2.58%  github.com/prometheus/prometheus/tsdb.(*memSeries).mmapChunks
    0.70GB  1.97% 82.90%     0.70GB  1.97%  github.com/prometheus/prometheus/scrape.(*scrapeCache).setHelp
    0.68GB  1.92% 84.82%     0.68GB  1.92%  github.com/prometheus/prometheus/tsdb.newTxRing (inline)
    0.56GB  1.59% 86.41%     0.56GB  1.59%  github.com/prometheus/prometheus/tsdb.(*seriesHashmap).set
    0.55GB  1.55% 87.95%     5.70GB 16.02%  github.com/prometheus/prometheus/tsdb.(*stripeSeries).getOrSet
    0.35GB  0.99% 88.94%     0.36GB  1.01%  golang.org/x/net/trace.NewEventLog
    0.28GB  0.78% 89.73%     0.28GB  0.78%  github.com/prometheus/prometheus/tsdb.NewCircularExemplarStorage
    0.26GB  0.72% 90.45%     0.26GB  0.72%  github.com/prometheus/prometheus/model/labels.(*ScratchBuilder).Labels
    0.25GB  0.71% 91.16%     0.25GB  0.71%  bufio.NewReaderSize (inline)
    0.23GB  0.66% 91.81%     0.23GB  0.66%  bufio.NewWriterSize (inline)
    0.20GB  0.57% 92.39%     0.20GB  0.57%  github.com/prometheus/prometheus/tsdb.(*blockSeriesSet).At
    0.18GB  0.51% 92.89%     0.90GB  2.52%  github.com/prometheus/prometheus/promql.(*evaluator).rangeEval
    0.12GB  0.34% 93.23%     0.47GB  1.33%  github.com/prometheus/prometheus/promql.expandSeriesSet
    0.11GB   0.3% 93.53%     8.67GB 24.38%  github.com/prometheus/prometheus/tsdb.(*headAppender).Append
    0.10GB  0.29% 93.82%     1.88GB  5.28%  github.com/prometheus/prometheus/rules.(*Group).Eval.func1
    0.05GB  0.15% 93.97%     1.38GB  3.89%  github.com/prometheus/prometheus/promql.(*evaluator).eval
    0.05GB  0.15% 94.12%     0.92GB  2.59%  net/http.(*Transport).dialConn
    0.04GB  0.11% 94.22%     8.57GB 24.10%  github.com/prometheus/prometheus/tsdb.(*headAppender).getOrCreate
    0.02GB 0.059% 94.28%     0.18GB  0.52%  github.com/prometheus/prometheus/promql.(*evaluator).VectorBinop
```
* Test:

```
(pprof) top30
Showing nodes accounting for 28.31GB, 94.43% of 29.98GB total
Dropped 684 nodes (cum <= 0.15GB)
Showing top 30 nodes out of 96
      flat  flat%   sum%        cum   cum%
    4.41GB 14.72% 14.72%     4.41GB 14.72%  github.com/prometheus/prometheus/model/labels.(*Builder).Labels
    3.74GB 12.46% 27.18%     4.38GB 14.63%  github.com/prometheus/prometheus/tsdb.newMemSeries (inline)
    3.15GB 10.51% 37.69%     3.15GB 10.51%  github.com/prometheus/prometheus/tsdb/chunkenc.NewXORChunk (inline)
    2.98GB  9.93% 47.61%     2.98GB  9.93%  github.com/prometheus/prometheus/tsdb/index.appendWithExponentialGrowth[go.shape.uint64] (inline)
    1.93GB  6.42% 54.03%     1.93GB  6.42%  github.com/prometheus/prometheus/tsdb.(*txRing).add
    1.88GB  6.28% 60.31%     1.88GB  6.28%  bytes.growSlice
    1.63GB  5.45% 65.76%     1.64GB  5.46%  github.com/prometheus/prometheus/scrape.(*scrapeCache).addRef
    1.17GB  3.90% 69.66%     1.17GB  3.90%  github.com/prometheus/prometheus/scrape.(*scrapeCache).trackStaleness
    0.97GB  3.23% 72.89%     5.03GB 16.78%  github.com/prometheus/prometheus/tsdb.(*memSeries).cutNewHeadChunk
    0.89GB  2.95% 75.84%     0.89GB  2.95%  github.com/prometheus/prometheus/tsdb/chunkenc.(*XORChunk).Appender
    0.68GB  2.28% 78.12%     0.71GB  2.36%  github.com/prometheus/prometheus/tsdb.(*memSeries).mmapChunks
    0.65GB  2.17% 80.29%     0.65GB  2.17%  github.com/prometheus/prometheus/tsdb.newTxRing (inline)
    0.56GB  1.88% 82.17%     5.50GB 18.35%  github.com/prometheus/prometheus/tsdb.(*stripeSeries).getOrSet
    0.55GB  1.84% 84.02%     0.55GB  1.84%  github.com/prometheus/prometheus/tsdb.(*seriesHashmap).set
    0.42GB  1.40% 85.42%     0.42GB  1.40%  github.com/prometheus/prometheus/scrape.NewManager.func1
    0.36GB  1.20% 86.61%     0.36GB  1.21%  github.com/prometheus/prometheus/scrape.(*scrapeCache).setHelp
    0.33GB  1.09% 87.70%     0.34GB  1.13%  golang.org/x/net/trace.NewEventLog
    0.28GB  0.92% 88.62%     0.28GB  0.92%  github.com/prometheus/prometheus/tsdb.NewCircularExemplarStorage
    0.26GB  0.88% 89.50%     0.26GB  0.88%  github.com/prometheus/prometheus/scrape.(*scrapeCache).addDropped
    0.24GB   0.8% 90.30%     0.24GB   0.8%  bufio.NewReaderSize (inline)
    0.24GB  0.79% 91.09%     0.24GB  0.79%  bufio.NewWriterSize (inline)
    0.20GB  0.68% 91.77%     8.95GB 29.86%  github.com/prometheus/prometheus/tsdb.(*headAppender).Append
    0.19GB  0.63% 92.40%     0.19GB  0.63%  internal/stringslite.Clone
    0.18GB  0.59% 92.99%     8.69GB 28.97%  github.com/prometheus/prometheus/tsdb.(*headAppender).getOrCreate
    0.17GB  0.58% 93.57%     0.17GB  0.58%  github.com/prometheus/prometheus/tsdb/encoding.(*Encbuf).PutString
    0.10GB  0.33% 93.90%     0.26GB  0.86%  unique.(*canonMap[go.shape.string]).LoadOrStore
    0.06GB  0.22% 94.11%     0.91GB  3.05%  net/http.(*Transport).dialConn
    0.04GB  0.13% 94.24%     0.38GB  1.27%  github.com/prometheus/prometheus/promql.(*evaluator).eval
    0.04GB  0.12% 94.37%     0.18GB  0.59%  github.com/prometheus/prometheus/promql.expandSeriesSet
    0.02GB  0.06% 94.43%     0.52GB  1.72%  github.com/prometheus/prometheus/rules.(*Group).Eval.func1
```

Key changes:

* `addRef`: 4.10GB -> 1.63GB
* `addDropped`: 1.31GB -> 0.26GB
* `setHelp`: 0.70GB -> 0.36GB

Given that the Go GC carries significant overhead, these numbers should be multiplied by roughly 1.5-2x to estimate the actual memory saved as seen from the OS perspective.

Which issue(s) does the PR fix:

Release notes for end users (ALL commits must be considered).

[PERF] Scrape: optimize memory usage for scrape cache.

@bobrik bobrik requested a review from a team as a code owner April 9, 2026 19:12
@bobrik bobrik requested a review from machine424 April 9, 2026 19:12
Comment thread scrape/scrape.go
```
  if e.Help != string(help) {
-         e.Help = string(help)
+         e.help = unique.Make(string(help))
+         e.Help = e.help.Value()
```
bobrik (Contributor, Author):

This awkwardness enables no breaking changes for the public model that exposes string fields:
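The pattern under discussion, keeping the exported string field while backing it with an interned handle, can be sketched as follows (the field and method shapes are simplified assumptions, not the PR's exact code):

```go
package main

import (
	"fmt"
	"unique"
)

// metaEntry keeps the public string field while storing the canonical
// copy behind an unexported interned handle, so API consumers see no change.
type metaEntry struct {
	Help string                // public field, unchanged type
	help unique.Handle[string] // interned backing copy
}

func (e *metaEntry) setHelp(help []byte) {
	if e.Help != string(help) { // the compiler avoids allocating for this comparison
		e.help = unique.Make(string(help))
		e.Help = e.help.Value() // Value returns the canonical shared string
	}
}

func main() {
	a, b := &metaEntry{}, &metaEntry{}
	a.setHelp([]byte("Total requests."))
	b.setHelp([]byte("Total requests."))
	fmt.Println(a.Help == b.Help) // true: both fields refer to the same canonical text
}
```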

@bobrik (Contributor, Author) commented Apr 9, 2026

Mostly opening as an RFC first, will fix CI once there's some consensus on this being a good idea.

Signed-off-by: Ivan Babrou <[email protected]>
@bobrik bobrik force-pushed the ivan/intern-scrape-cache branch from 7dc8e26 to b0fd424 Compare April 9, 2026 20:58
@roidelapluie (Member):

/prombench main

@prombot (Contributor) commented Apr 10, 2026

⏱️ Welcome to Prometheus Benchmarking Tool. ⏱️

Compared versions: PR-18501 and main

After the successful deployment (check status here), the benchmarking results can be viewed at:

Available Commands:

  • To restart benchmark: /prombench restart main
  • To stop benchmark: /prombench cancel
  • To print help: /prombench help

@roidelapluie (Member):

/prombench cancel

@prombot (Contributor) commented Apr 13, 2026

Benchmark cancel is in progress.

@bboreham (Member) left a comment:

Good idea. Prombench showed a small increase in CPU (which is expected) but no decrease in memory (which was also expected). However we don't always trust Prombench.

> it is very much shared between instances

To nitpick this wording, I would say it is expected that instances of the same program export the same series.

Comment thread scrape/scrape.go
```
  // Parsed string to an entry with information about the actual label set
  // and its storage reference.
- series map[string]*cacheEntry
+ series map[unique.Handle[string]]*cacheEntry
```
bboreham (Member):
I would add a note that unique is used to share memory across targets.

Comment thread scrape/scrape.go
```
  metaMtx  sync.Mutex // Mutex is needed due to api touching it when metadata is queried.
- metadata map[string]*metaEntry // metadata by metric family name.
+ metadata map[unique.Handle[string]]*metaEntry // metadata by metric family name.
```
bboreham (Member):
I wonder whether metadata is worth doing, since it is one per family not one per series.

Comment thread scrape/scrape.go

```
  func (c *scrapeCache) get(met []byte) (*cacheEntry, bool, bool) {
-         e, ok := c.series[string(met)]
+         e, ok := c.series[unique.Make(string(met))]
```
bboreham (Member):
Suggest that `unique.Make(string(met))` is passed in and cached in the caller so we don't recompute it on line 1016, etc. The map-lookup pattern for `[]byte` -> string conversion is a specific compiler optimisation.

A contributor commented:

👋 @bobrik is away but I promised him I'll try this patch on our production workload to get a better idea of mem/cpu impact (at least in our environment). I'll deploy the patch with all the comments addressed and will report back here.

@afurm commented Apr 14, 2026

The unique.Make(string(met)) calls in get-path functions (get, getDropped, setType, setHelp) mean every cache lookup now allocates a Handle, even if the key already exists in the map. Since Handle is essentially a pointer to an interned string, is there a measurable cost from the repeated Make calls on the hot scrape path? Was the allocation cost of Handle creation considered alongside the memory savings from string deduplication?

@bboreham (Member):

> now allocates a Handle

Handle should be on the stack, no?

I think:

  • previously we would do a map[string] lookup for the drop cache, possibly followed by a map[string] lookup for the series cache.
  • after this PR and my suggested modification we will do a string->Handle lookup once, then a map[Handle] lookup for the drop cache possibly followed by a map[Handle] lookup for the series cache.

So the number of string hashes should come down, and some extra Handle hashes should be added.

Would be good to see some benchmark results.

@afurm commented Apr 15, 2026

The calls in get-path functions (get, getDropped, setType, setHelp) mean every cache lookup now creates a Handle, even if the key already exists in the map. Since a Handle is pointer-sized, stack allocation is cheap but not free, especially in hot scrape loops running every 15s across thousands of targets.

Agreed that caching the Handle at the call site would reduce repeated Make calls. But the bigger concern is whether the metadata maps (metadata, metaEntry.help/unit) justify the added unique.Make overhead at all — metadata is per family, not per series, so deduplication wins there are much smaller.

@bboreham (Member):

Generally all stack allocations in a function will happen at the same time, with a single addition to the stack pointer.
I don't think this is a realistic concern.

I already suggested to remove the metadata changes.

6 participants