Allow replicas to send some buckets out of order#80179

Merged
nickitat merged 39 commits into master from out_of_order_buckets on Aug 13, 2025

@nickitat
Member

@nickitat nickitat commented May 13, 2025

Changelog category (leave one):

  • Performance Improvement

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):

Added new logic (controlled by the setting enable_producing_buckets_out_of_order_in_aggregation, enabled by default) that allows replicas to send some buckets out of order during memory-efficient aggregation. When some aggregation buckets take significantly longer to merge than others, this improves performance by letting the initiator merge buckets with higher bucket IDs in the meantime. The downside is potentially higher memory usage (it shouldn't be significant).


Motivation was described here: https://github.com/ClickHouse/clickhouse-private/issues/22937#issuecomment-2731022465

[Image: Screenshot 2025-06-04 at 13 48 38]

The core of the change is in src/Processors/Transforms/AggregatingTransform.cpp and src/Processors/Transforms/MergingAggregatedMemoryEfficientTransform.cpp.
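
The effect described in the changelog entry can be illustrated with a toy model. This is only a sketch under simplifying assumptions, not the logic in the two files above: each replica is reduced to a list of times at which its buckets finish merging, and the initiator is assumed to be able to merge a bucket as soon as every replica has sent it. The names send_times and mergeable_at are hypothetical and do not exist in ClickHouse.

```python
from itertools import accumulate


def send_times(ready_at, out_of_order):
    """Time at which a replica can send each bucket.

    ready_at[i] is the (hypothetical) time at which bucket i finishes
    merging on the replica. With strict ordering, bucket i cannot be sent
    before buckets 0..i-1, so its send time is the running maximum.
    """
    if out_of_order:
        return list(ready_at)
    return list(accumulate(ready_at, max))


def mergeable_at(replicas, out_of_order):
    """Time at which the initiator holds bucket i from every replica."""
    sent = [send_times(ready_at, out_of_order) for ready_at in replicas]
    return [max(times) for times in zip(*sent)]


replicas = [
    [1, 10, 2, 3],  # replica A: bucket 1 is slow to merge
    [1, 2, 3, 4],   # replica B: no slow buckets
]

strict = mergeable_at(replicas, out_of_order=False)
relaxed = mergeable_at(replicas, out_of_order=True)

print(strict)   # [1, 10, 10, 10]
print(relaxed)  # [1, 10, 3, 4]
```

In this model the slow bucket 1 on replica A delays buckets 2 and 3 under strict ordering, while with out-of-order sending the initiator can start on them earlier. The model ignores the memory cost mentioned in the changelog entry: buckets received ahead of the slow one presumably have to be held somewhere until they are merged.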

@clickhouse-gh
Contributor

clickhouse-gh bot commented May 13, 2025

Workflow [PR], commit [6ec8781]

@clickhouse-gh clickhouse-gh bot added the pr-not-for-changelog label (This PR should not be mentioned in the changelog) May 13, 2025
@nickitat nickitat marked this pull request as ready for review June 4, 2025 12:06
@clickhouse-gh clickhouse-gh bot added the pr-performance label (Pull request with some performance improvements) and removed the pr-not-for-changelog label (This PR should not be mentioned in the changelog) Jun 4, 2025
@nickitat nickitat force-pushed the out_of_order_buckets branch from 22d84e9 to 407e14d Compare July 30, 2025 18:58
@nickitat nickitat requested a review from devcrafter August 1, 2025 11:57
@nickitat
Member Author

nickitat commented Aug 1, 2025

Stateless tests (amd_asan, distributed plan, sequential) - #84876
Stateless tests (amd_binary, ParallelReplicas, s3 storage, parallel) - #84771

)

# big number of granules + low total size in bytes = super tiny granules = big min_marks_per_task
# => big mark_segment_size will be chosen. it is not required to be big, just not equal to the default
Member

@devcrafter devcrafter Aug 5, 2025

What does this comment refer to in the test? I don't see that we set any settings for MergeTree.

@nickitat
Member Author

Stateless tests (amd_binary, old analyzer, s3 storage, DatabaseReplicated, parallel) - #54748
Integration tests (amd_tsan, 6/6) - #85563

@nickitat nickitat added this pull request to the merge queue Aug 13, 2025
Merged via the queue into master with commit 7387328 Aug 13, 2025
121 of 123 checks passed
@nickitat nickitat deleted the out_of_order_buckets branch August 13, 2025 18:15
@robot-ch-test-poll4 robot-ch-test-poll4 added the pr-synced-to-cloud label (The PR is synced to the cloud repo) Aug 13, 2025
baibaichen pushed a commit to Kyligence/gluten that referenced this pull request Aug 14, 2025
baibaichen pushed a commit to Kyligence/gluten that referenced this pull request Aug 15, 2025
baibaichen pushed a commit to Kyligence/gluten that referenced this pull request Aug 16, 2025
baibaichen pushed a commit to Kyligence/gluten that referenced this pull request Aug 18, 2025
nickitat added a commit that referenced this pull request Aug 18, 2025
@nickitat
Member Author

#85844

github-merge-queue bot pushed a commit that referenced this pull request Aug 19, 2025
Revert "Merge pull request #80179 from ClickHouse/out_of_order_buckets"
github-merge-queue bot pushed a commit that referenced this pull request Sep 12, 2025

Labels

pr-performance: Pull request with some performance improvements
pr-synced-to-cloud: The PR is synced to the cloud repo


3 participants