Polymorphic parts (in-memory format) #10697
Conversation
src/Storages/StorageMergeTree.cpp (outdated):

    merger_mutator.renameMergedTemporaryPart(new_part, future_part.parts, nullptr);

    DataPartsVector parts_to_remove_immediately;
Review comment: I think we need to move this logic into grabOldParts of MergeTree. Maybe introduce an additional setting for the lifetime of in-memory parts.
    else if (type == MergeTreeDataPartType::WIDE)
        return std::make_shared<MergeTreeDataPartWide>(*this, name, part_info, volume, relative_path);
    else if (type == MergeTreeDataPartType::IN_MEMORY)
        return std::make_shared<MergeTreeDataPartInMemory>(*this, name, part_info, volume, relative_path);
Review comment: Seems like the in-memory part also requires disk space?
    if (part_in_memory && getSettings()->in_memory_parts_enable_wal)
    {
        auto wal = getWriteAheadLog();
        wal->addPart(part_in_memory->block, part_in_memory->name);
Review comment: We don't create any disk reservations for this, so our WAL is not accounted in our disk space. Maybe we can use a reservation from the part?

Reply: Now write_ahead_log_max_bytes is reserved.
    }
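The fix mentioned in the reply, reserving the WAL's maximum size up front, can be sketched with a minimal byte-budget model (this is an illustration only; the real code goes through ClickHouse's IDisk/IReservation interfaces, which are not shown):

```cpp
#include <cstddef>
#include <stdexcept>

// Minimal stand-in for a disk with a limited amount of free space.
struct Disk
{
    size_t free_bytes;

    // Reserve `bytes` up front; throw if the disk cannot hold them.
    size_t reserve(size_t bytes)
    {
        if (bytes > free_bytes)
            throw std::runtime_error("cannot reserve space for WAL");
        free_bytes -= bytes;
        return bytes;
    }
};

// Reserve the whole WAL budget once, so WAL growth is accounted
// in disk space before any in-memory part is appended to it.
size_t reserveWal(Disk & disk, size_t write_ahead_log_max_bytes)
{
    return disk.reserve(write_ahead_log_max_bytes);
}
```

Reserving the maximum once avoids per-append accounting, at the cost of pessimistically holding space the WAL may never use.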
    /// Calculates uncompressed sizes in memory.
    void MergeTreeDataPartInMemory::calculateEachColumnSizesOnDisk(ColumnSizeByName & each_columns_size, ColumnSize & total_size) const
Review comment: I'd expect this method to return zero :) Maybe we have to rename it?
    for (const auto & part : inserted_parts)
    {
        auto part_in_memory = asInMemoryPart(part);
        if (!part_in_memory->waitUntilMerged(in_memory_parts_timeout))
Review comment: And what should the user do with this error? I think most users will just retry on this error and duplicate the data.

Reply: I've removed this functionality altogether. It's better to implement synchronous insert that depends on fsync of the WAL. I think this guarantee will be enough.
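For context, a waitUntilMerged-style timed wait is typically built on std::condition_variable::wait_for. A hypothetical self-contained sketch (the name and the surrounding part class are assumptions, not the actual removed code):

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>

// Hypothetical helper: block until the part has been merged to disk,
// or give up after `timeout`. Returns true if the merge happened in time.
struct MergedFlag
{
    std::mutex mutex;
    std::condition_variable cv;
    bool merged = false;

    bool waitUntilMerged(std::chrono::milliseconds timeout)
    {
        std::unique_lock lock(mutex);
        // wait_for returns the predicate's value on timeout, so spurious
        // wakeups and timeouts are both handled correctly.
        return cv.wait_for(lock, timeout, [this] { return merged; });
    }

    void notifyMerged()
    {
        {
            std::lock_guard lock(mutex);
            merged = true;
        }
        cv.notify_all();
    }
};
```

The problem the reviewer raises is inherent to this design: on timeout the caller cannot tell whether the data is durable, so retrying risks duplicates, which is why the PR replaced the wait with a synchronous, fsync-backed WAL write.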
From "Yandex synchronization check" report:
alesapin left a comment: LGTM, but we have to finish the throttler in the next PR.
Performance differences seem not related to the changes.
* master: (27 commits)
  - Whitespaces
  - Fix typo
  - Fix UBSan report in base64
  - Correct default secure port for clickhouse-benchmark ClickHouse#11044
  - Remove test with bug ClickHouse#10697
  - Update in-functions.md (ClickHouse#12430)
  - Allow nullable key in MergeTree
  - Update arithmetic-functions.md
  - [docs] add rabbitmq docs (ClickHouse#12326)
  - Lower block sizes and look what will happen ClickHouse#9248
  - Fix lifetime_bytes/lifetime_rows for Buffer direct block write
  - Retrigger CI
  - Fix up test_mysql_protocol failed
  - Implement lifetime_rows/lifetime_bytes for Buffer engine
  - Add comment regarding proxy tunnel usage in PocoHTTPClient.cpp
  - Add lifetime_rows/lifetime_bytes interface (exported via system.tables)
  - Tiny IStorage refactoring
  - Trigger integration-test-runner image rebuild.
  - Delete log.txt
  - Fix test_mysql_client/test_python_client error
  - ...
I hereby agree to the terms of the CLA available at: https://yandex.ru/legal/cla/?lang=en
Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
Added a new in-memory format of parts in MergeTree-family tables, which stores data in memory. Parts are written to disk at the first merge. A part will be created in the in-memory format if its size in rows or bytes is below the thresholds min_rows_for_compact_part and min_bytes_for_compact_part. Optional support of a write-ahead log is also available; it is enabled by default and controlled by the setting in_memory_parts_enable_wal.
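The threshold logic in the changelog entry can be sketched as follows (a simplified model: the real code also distinguishes compact vs. wide on-disk parts via further thresholds such as min_rows_for_wide_part, which are elided here):

```cpp
#include <cstddef>

enum class PartFormat { InMemory, Compact, Wide };

// Sketch of the format choice at insert time: a part stays in memory
// if its row count *or* byte size is below the compact-part thresholds;
// otherwise it is written in an on-disk format.
PartFormat choosePartFormat(
    size_t rows, size_t bytes,
    size_t min_rows_for_compact_part, size_t min_bytes_for_compact_part)
{
    if (rows < min_rows_for_compact_part || bytes < min_bytes_for_compact_part)
        return PartFormat::InMemory;
    return PartFormat::Compact; // or Wide, depending on further thresholds
}
```

Setting both thresholds to 0 (their defaults) disables the in-memory format entirely, since no part can be below a zero threshold.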