NewXORChunk over-allocation #18502

@bobrik

Description

We see a lot of memory allocated by NewXORChunk:

(pprof) list NewXORChunk
Total: 99.23GB
ROUTINE ======================== github.com/prometheus/prometheus/tsdb/chunkenc.NewXORChunk in tsdb/chunkenc/xor.go
   11.85GB    11.85GB (flat, cum) 11.94% of Total
         .          .     66:func NewXORChunk() *XORChunk {
    8.93GB     8.93GB     67:   b := make([]byte, chunkHeaderSize, chunkAllocationSize)
    2.92GB     2.92GB     68:   return &XORChunk{b: bstream{stream: b, count: 0}}
         .          .     69:}
         .          .     70:
         .          .     71:func (c *XORChunk) Reset(stream []byte) {
         .          .     72:   c.b.Reset(stream)
         .          .     73:}

Since the memory is attributed to make() rather than to an append somewhere downstream, I conclude that we either allocate exactly the amount needed or we over-allocate to the point that the slice never needs to grow. I suspect it's the latter.

To test this theory, I changed chunkAllocationSize from 128 down to 3, the minimum allowed based on the other make calls that use chunkAllocationSize. I then ran the vanilla and patched versions side by side with the same config for a while to compare the effects.

  • Baseline:
(pprof) top20
Showing nodes accounting for 32.51GB, 91.70% of 35.45GB total
Dropped 710 nodes (cum <= 0.18GB)
Showing top 20 nodes out of 86
      flat  flat%   sum%        cum   cum%
    4.83GB 13.64% 13.64%     4.83GB 13.64%  github.com/prometheus/prometheus/model/labels.(*Builder).Labels
    4.12GB 11.64% 25.27%     4.12GB 11.64%  github.com/prometheus/prometheus/scrape.(*scrapeCache).addRef
    3.88GB 10.95% 36.22%     4.57GB 12.89%  github.com/prometheus/prometheus/tsdb.newMemSeries (inline)
    3.54GB  9.99% 46.21%     3.54GB  9.99%  github.com/prometheus/prometheus/tsdb/chunkenc.NewXORChunk (inline)
    3.09GB  8.70% 54.92%     3.09GB  8.70%  github.com/prometheus/prometheus/tsdb/index.appendWithExponentialGrowth[go.shape.uint64]
    2.72GB  7.66% 62.58%     2.72GB  7.66%  github.com/prometheus/prometheus/tsdb.(*txRing).add
    1.37GB  3.85% 66.43%     1.37GB  3.85%  github.com/prometheus/prometheus/scrape.(*scrapeCache).trackStaleness
    1.30GB  3.67% 70.10%     1.30GB  3.67%  github.com/prometheus/prometheus/scrape.(*scrapeCache).addDropped
    1.09GB  3.09% 73.19%     5.70GB 16.07%  github.com/prometheus/prometheus/tsdb.(*memSeries).cutNewHeadChunk
    1.06GB  2.99% 76.18%     1.06GB  2.99%  github.com/prometheus/prometheus/scrape.NewManager.func1
    1.04GB  2.93% 79.11%     1.04GB  2.93%  github.com/prometheus/prometheus/tsdb/chunkenc.(*XORChunk).Appender
    0.84GB  2.38% 81.49%     0.85GB  2.39%  github.com/prometheus/prometheus/tsdb.(*memSeries).mmapChunks
    0.73GB  2.07% 83.56%     0.73GB  2.07%  github.com/prometheus/prometheus/scrape.(*scrapeCache).setHelp
    0.69GB  1.94% 85.50%     0.69GB  1.94%  github.com/prometheus/prometheus/tsdb.newTxRing (inline)
    0.54GB  1.52% 87.03%     5.65GB 15.93%  github.com/prometheus/prometheus/tsdb.(*stripeSeries).getOrSet
    0.54GB  1.51% 88.54%     0.54GB  1.51%  github.com/prometheus/prometheus/tsdb.(*seriesHashmap).set
    0.37GB  1.04% 89.58%     0.38GB  1.07%  golang.org/x/net/trace.NewEventLog
    0.28GB  0.78% 90.36%     0.28GB  0.78%  github.com/prometheus/prometheus/tsdb.NewCircularExemplarStorage
    0.24GB  0.68% 91.05%     0.24GB  0.68%  bufio.NewReaderSize
    0.23GB  0.66% 91.70%     0.23GB  0.66%  bufio.NewWriterSize
  • Test:
(pprof) top20
Showing nodes accounting for 30.34GB, 92.20% of 32.91GB total
Dropped 640 nodes (cum <= 0.16GB)
Showing top 20 nodes out of 95
      flat  flat%   sum%        cum   cum%
    4.91GB 14.92% 14.92%     4.91GB 14.92%  github.com/prometheus/prometheus/model/labels.(*Builder).Labels
    4.13GB 12.55% 27.46%     4.13GB 12.55%  github.com/prometheus/prometheus/scrape.(*scrapeCache).addRef
    3.97GB 12.07% 39.53%     4.69GB 14.26%  github.com/prometheus/prometheus/tsdb.newMemSeries (inline)
    3.06GB  9.31% 48.84%     3.06GB  9.31%  github.com/prometheus/prometheus/tsdb/index.appendWithExponentialGrowth[go.shape.uint64]
    2.49GB  7.56% 56.40%     2.49GB  7.56%  github.com/prometheus/prometheus/tsdb.(*txRing).add
    1.38GB  4.19% 60.60%     1.38GB  4.19%  github.com/prometheus/prometheus/scrape.(*scrapeCache).trackStaleness
    1.35GB  4.10% 64.69%     1.35GB  4.10%  github.com/prometheus/prometheus/scrape.(*scrapeCache).addDropped
    1.09GB  3.32% 68.02%     1.09GB  3.32%  github.com/prometheus/prometheus/tsdb/chunkenc.(*XORChunk).Appender
    1.09GB  3.31% 71.33%     2.89GB  8.79%  github.com/prometheus/prometheus/tsdb.(*memSeries).cutNewHeadChunk
    0.94GB  2.85% 74.18%     0.94GB  2.85%  github.com/prometheus/prometheus/scrape.NewManager.func1
    0.87GB  2.66% 76.84%     0.88GB  2.67%  github.com/prometheus/prometheus/tsdb.(*memSeries).mmapChunks
    0.87GB  2.65% 79.49%     0.87GB  2.65%  github.com/prometheus/prometheus/tsdb/chunkenc.(*bstream).writeByte
    0.72GB  2.19% 81.67%     0.72GB  2.19%  github.com/prometheus/prometheus/tsdb.newTxRing (inline)
    0.69GB  2.10% 83.78%     0.69GB  2.10%  github.com/prometheus/prometheus/tsdb/chunkenc.NewXORChunk (inline)
    0.69GB  2.10% 85.87%     0.69GB  2.10%  github.com/prometheus/prometheus/scrape.(*scrapeCache).setHelp
    0.55GB  1.67% 87.55%     5.78GB 17.57%  github.com/prometheus/prometheus/tsdb.(*stripeSeries).getOrSet
    0.54GB  1.64% 89.19%     0.54GB  1.64%  github.com/prometheus/prometheus/tsdb.(*seriesHashmap).set
    0.38GB  1.15% 90.34%     0.38GB  1.15%  github.com/prometheus/prometheus/tsdb/chunkenc.(*bstream).writeBit
    0.33GB  1.02% 91.36%     0.35GB  1.05%  golang.org/x/net/trace.NewEventLog
    0.28GB  0.84% 92.20%     0.28GB  0.84%  github.com/prometheus/prometheus/tsdb.(*headIndexReader).SortedPostings

There are two relevant lines:

  • chunkenc.NewXORChunk: 3.54GB -> 0.69GB
  • chunkenc.(*XORChunk).Appender: 1.04GB -> 1.09GB

Overall it's a drop from 4.58GB to 1.78GB, which seems massive for a heap size of ~32GB. Note that (*bstream).writeByte (0.87GB) and (*bstream).writeBit (0.38GB) now show up in the top 20, presumably because appends have to grow the slice, but even counting those the net saving is substantial.

I'm happy to make a PR to reduce the capacity argument, but it also raises the question of why we allocate a stream eagerly to begin with. Perhaps it would be more economical to allocate on the first append.
