Added setting "min_bytes_to_use_mmap_io"#8520

Merged
alexey-milovidov merged 13 commits into master from read-mmap
Jan 4, 2020

@alexey-milovidov alexey-milovidov commented Jan 3, 2020

Changelog category (leave one):

  • Experimental Feature

Changelog entry (up to a few sentences, required except for Non-significant/Documentation categories):
Added experimental setting min_bytes_to_use_mmap_io. It allows reading large files without copying data from kernel to userspace. The setting is disabled by default. The recommended threshold is about 64 MB, because mmap/munmap is slow.
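
To make the threshold idea concrete, here is a minimal Python sketch, not the code from this PR. The function name `read_range` is hypothetical, and comparing the threshold against the whole file size is an assumption made for illustration. Note that slicing a Python mapping still copies bytes into a new object, so the sketch shows only the decision logic, not the zero-copy benefit.

```python
import mmap
import os


def read_range(path, offset, length, min_bytes_to_use_mmap_io=0):
    """Return (data, method) for `length` bytes starting at `offset`.

    A threshold of 0 disables mmap, matching the setting's description.
    Comparing against the file size is an assumption of this sketch.
    """
    file_size = os.path.getsize(path)
    use_mmap = (
        min_bytes_to_use_mmap_io != 0
        and file_size >= min_bytes_to_use_mmap_io
    )
    with open(path, "rb") as f:
        if use_mmap:
            # Length 0 maps the whole file, read-only.
            with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapping:
                return mapping[offset:offset + length], "mmap"
        # Ordinary path: seek and read through a regular file object.
        f.seek(offset)
        return f.read(length), "read"
```

With a threshold of 64 * 1024 * 1024, small files take the ordinary `read` path and pay no mmap/munmap cost, which is the reasoning behind the recommended value.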

Detailed description (optional):
Caveats:

  • if the same data is read from multiple threads, multiple mappings are created;
  • memory used by the mappings is not accounted for;
  • if only a subset of a file is read, the whole file is still mapped;
  • unlike ordinary IO, the number of threads is not lowered when reads are slow.
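
The third caveat can be illustrated with a small Python sketch of the alternative: mapping only the needed range instead of the whole file. This is not what the PR does. The helper names `aligned_window` and `read_partial` are hypothetical, and the sketch assumes the requested range lies entirely inside the file.

```python
import mmap


def aligned_window(offset, length, granularity=mmap.ALLOCATIONGRANULARITY):
    """Return (map_offset, map_length, skip) covering [offset, offset + length).

    A mapping offset must be a multiple of the allocation granularity,
    so the window starts at the aligned boundary at or before `offset`.
    """
    map_offset = (offset // granularity) * granularity
    skip = offset - map_offset
    return map_offset, skip + length, skip


def read_partial(path, offset, length):
    """Map only the aligned window around the requested range.

    Assumes offset + length does not exceed the file size.
    """
    map_offset, map_length, skip = aligned_window(offset, length)
    with open(path, "rb") as f:
        with mmap.mmap(
            f.fileno(), map_length, access=mmap.ACCESS_READ, offset=map_offset
        ) as mapping:
            return mapping[skip:skip + length]
```

Mapping a window keeps the mapped size proportional to the data actually read, at the cost of one mmap/munmap pair per range.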

Possible further work:

  • the number of IO parameters has increased, so it would be better to consolidate them into a separate struct;
  • the various ways of doing file IO should be extracted into an "IO engine" abstraction, with the engines available through a factory (for example, this is needed to implement a userspace page cache);
  • independent ReadBuffers and WriteBuffers for different columns that read data on demand are not the best way to do IO; it may be better for them to use shared data (some IO context) and to set up in advance what data is going to be read (for example, this is needed to control readahead).
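
The first two items could be sketched roughly as follows. This is a hypothetical Python illustration, not a design from the PR: `ReadSettings`, `register_engine`, `create_reader`, `choose_engine`, and the `buffer_size` field are all invented names; only `min_bytes_to_use_mmap_io` comes from the setting itself.

```python
import mmap
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class ReadSettings:
    """Hypothetical bundle of IO parameters passed around as one value."""
    buffer_size: int = 1 << 20         # hypothetical field
    min_bytes_to_use_mmap_io: int = 0  # 0 disables mmap, as in the setting


Reader = Callable[[str, ReadSettings], bytes]
_ENGINES: Dict[str, Reader] = {}


def register_engine(name: str) -> Callable[[Reader], Reader]:
    """Decorator that registers a reader function under `name`."""
    def decorator(fn: Reader) -> Reader:
        _ENGINES[name] = fn
        return fn
    return decorator


@register_engine("read")
def read_engine(path: str, settings: ReadSettings) -> bytes:
    # Ordinary buffered reads in chunks of settings.buffer_size.
    with open(path, "rb") as f:
        return b"".join(iter(lambda: f.read(settings.buffer_size), b""))


@register_engine("mmap")
def mmap_engine(path: str, settings: ReadSettings) -> bytes:
    # Whole-file read-only mapping; fails on empty files.
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapping:
            return mapping[:]


def create_reader(name: str) -> Reader:
    """Factory lookup; raises KeyError for an unknown engine name."""
    return _ENGINES[name]


def choose_engine(estimated_size: int, settings: ReadSettings) -> str:
    """Pick an engine name from the size estimate and the threshold."""
    threshold = settings.min_bytes_to_use_mmap_io
    return "mmap" if threshold and estimated_size >= threshold else "read"
```

A caller would then write `create_reader(choose_engine(size, settings))(path, settings)`, so adding a new engine means registering one more function rather than threading another parameter through every call site.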

@alexey-milovidov alexey-milovidov merged commit 42226b1 into master Jan 4, 2020
@alesapin alesapin added the pr-feature Pull request with new product feature label Jan 20, 2020

bobrik commented Jun 10, 2020

@alexey-milovidov should it work for skipping indices as well? I'm trying:

set min_bytes_to_use_mmap_io = 1024
select *
  from system.settings
 where name = 'min_bytes_to_use_mmap_io'
┌─name─────────────────────┬─value─┬─changed─┬─description──────────────────────────────────────────────────────────────────────────────────────────────────────┬─min──┬─max──┬─readonly─┬─type──────────┐
│ min_bytes_to_use_mmap_io │ 1024  │       1 │ The minimum number of bytes for reading the data with mmap option during SELECT queries execution. 0 - disabled. │ ᴺᵁᴸᴸ │ ᴺᵁᴸᴸ │        0 │ SettingUInt64 │
└──────────────────────────┴───────┴─────────┴──────────────────────────────────────────────────────────────────────────────────────────────────────────────────┴──────┴──────┴──────────┴───────────────┘

And still seeing:

[image: screenshot not preserved]

It doesn't look like any of the mmap-related functions are called:

$ sudo /usr/share/bcc/tools/funccount /state/var/lib/docker/overlay2/8794eb76eb8d76868fc1602075b0d920abece7c257ece060216aeecf23d115de/merged/usr/bin/clickhouse:*MMap*
Tracing 18 functions for "/state/var/lib/docker/overlay2/8794eb76eb8d76868fc1602075b0d920abece7c257ece060216aeecf23d115de/merged/usr/bin/clickhouse:*MMap*"... Hit Ctrl-C to end.
^C
FUNC                                    COUNT
Detaching...

Tracing does work for the CompressedReadBuffer-related functions:

$ sudo /usr/share/bcc/tools/funccount /state/var/lib/docker/overlay2/8794eb76eb8d76868fc1602075b0d920abece7c257ece060216aeecf23d115de/merged/usr/bin/clickhouse:*CompressedReadBuffer*
Tracing 28 functions for "/state/var/lib/docker/overlay2/8794eb76eb8d76868fc1602075b0d920abece7c257ece060216aeecf23d115de/merged/usr/bin/clickhouse:*CompressedReadBuffer*"... Hit Ctrl-C to end.
^C
FUNC                                    COUNT
_ZN2DB28CompressedReadBufferFromFileD0Ev    11288
_ZN2DB24CompressedReadBufferBaseD1Ev    11739
_ZN2DB28CompressedReadBufferFromFileC1ENSt3__110unique_ptrINS_22ReadBufferFromFileBaseENS1_14default_deleteIS3_EEEE    11832
_ZN2DB24CompressedReadBufferBaseC2EPNS_10ReadBufferE    11930
_ZN2DB28CompressedReadBufferFromFile4seekEmm    14121
_ZN2DB28CompressedReadBufferFromFile8nextImplEv   224130
_ZN2DB24CompressedReadBufferBase10decompressEPcmm   234479
_ZN2DB24CompressedReadBufferBase18readCompressedDataERmS1_   234735
_ZN2DB28CompressedReadBufferFromFile7readBigEPcm   289238

Coming here from #10787 to keep that issue strictly io_uring related.

alexey-milovidov commented

It works only for data files, not indices.

bobrik commented Jun 11, 2020

I'm struggling to make it work for data files as well:

SET min_bytes_to_use_mmap_io = 1024
SELECT *
FROM system.settings
WHERE name = 'min_bytes_to_use_mmap_io'
┌─name─────────────────────┬─value─┬─changed─┬─description──────────────────────────────────────────────────────────────────────────────────────────────────────┬─min──┬─max──┬─readonly─┬─type──────────┐
│ min_bytes_to_use_mmap_io │ 1024  │       1 │ The minimum number of bytes for reading the data with mmap option during SELECT queries execution. 0 - disabled. │ ᴺᵁᴸᴸ │ ᴺᵁᴸᴸ │        0 │ SettingUInt64 │
└──────────────────────────┴───────┴─────────┴──────────────────────────────────────────────────────────────────────────────────────────────────────────────────┴──────┴──────┴──────────┴───────────────┘
SELECT sum(length(tags))
FROM jaeger_index_v2
PREWHERE (service = 'nginx-fl') AND (antiTimestamp <= (-toUnixTimestamp(now() - (3600 * 6)))) AND (antiTimestamp >= (-toUnixTimestamp(now())))
┌─sum(length(tags))─┐
│       17687286885 │
└───────────────────┘

1 rows in set. Elapsed: 38.125 sec. Processed 2.07 billion rows, 602.56 GB (54.35 million rows/s., 15.80 GB/s.)

[image: screenshot not preserved]

Is there a reason for not supporting indices, or is it just a temporary limitation of the current implementation?

alexey-milovidov commented

@bobrik #11955
