Fix two-level aggregation when using Merge over Distributed #87687
nickitat merged 6 commits into ClickHouse:master
@nickitat @devcrafter Would you mind taking a look, since you recently worked on #80179?
|
I believe we recently had a fix for the same issue, but in a different place: #78500
|
Please check this: https://s3.amazonaws.com/clickhouse-test-reports/json.html?PR=87687&sha=56424693c790ac83770d2f367bbd740ae0605450&name_0=PR&name_1=Bugfix+validation+%28functional+tests%29
|
Thanks @nickitat for taking a look.
Yeah, we saw this fix and initially thought it would solve the issue, but it didn't. The exception is the same, but I think what we are seeing here is related to the Merge engine narrowing the pipe, not to the parallelized reading from storage.
Yeah the test failed because of https://github.com/ClickHouse/ClickHouse/pull/80179/files#diff-1806a5c1f13b491c615e887366acf1daac6d85b1528be423e3762705482c80fbR574, so that there was no more |
|
Right now the problem is that the test takes too long: https://s3.amazonaws.com/clickhouse-test-reports/json.html?PR=87687&sha=334d98b71965ac4918c166e1cf388e0243ac93e5&name_0=PR&name_1=Stateless+tests+%28amd_asan%2C+flaky+check%29
I was able to speed it up a bit by using sync distributed inserts, but I ended up tagging it as Does the actual fix in |
5cbec5b
…merge-distributed Fix two-level aggregation when using Merge over Distributed
Antalya 25.8.14 backport of ClickHouse#87687 - Fix two-level aggregation when using Merge over Distributed
We (Cloudflare) noticed that some `GROUP BY` queries referencing a `Merge` table over `Distributed` tables fail with a logical error from `SortingAggregatedTransform`: `Code: 49. DB::Exception: SortingAggregatedTransform already got bucket with number 237. (LOGICAL_ERROR)`. Apparently the processor is not getting the buckets in the expected order when a `Merge` table is involved.

When comparing with an old version that doesn't have this problem, we noticed a small difference in the pipeline, which is the number of inputs to `GroupingAggregatedTransform`.

We believe this is because of bff832c, where the max number of streams is not set to `max_distributed_connections` anymore.

When `ReadFromMerge::initializePipeline` calls `pipeline.narrow()`, this doesn't seem to preserve the order of buckets that is expected by `SortingAggregatedTransform`.

I'm not sure if the change I made in `ReadFromMerge` is a good solution, but based on the test I also added in this PR, it seems to fix the problem.

I was also able to reproduce the issue on 25.8. On master things seem to have changed a bit because of #80179, so that `SortingAggregatedTransform` doesn't seem to be part of the pipeline anymore.

Changelog category (leave one):

Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):

Fixed two-level aggregation when using `Merge` over `Distributed`.

Documentation entry for user-facing changes
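
To make the suspected failure mode in the description above easier to picture, here is a toy Python model. It is not ClickHouse code. It assumes, purely for illustration, that each remote stream delivers its two-level buckets in ascending order and that narrowing concatenates several streams onto one output. The names `source`, `narrow`, `check_input`, and `LogicalError` are hypothetical, and the real pipeline logic may differ.

```python
from itertools import chain


class LogicalError(Exception):
    """Toy stand-in for a LOGICAL_ERROR exception."""


def source(num_buckets):
    """One upstream stream: yields bucket ids 0..num_buckets-1 in ascending order."""
    return iter(range(num_buckets))


def narrow(streams, width):
    """Toy narrowing (an assumption, not the real implementation):
    spread the streams over `width` outputs and concatenate each group."""
    groups = [streams[i::width] for i in range(width)]
    return [chain(*group) for group in groups]


def check_input(stream):
    """Consume one input and raise if the same bucket id arrives twice on it."""
    seen = set()
    for bucket in stream:
        if bucket in seen:
            raise LogicalError(f"already got bucket with number {bucket}")
        seen.add(bucket)
    return len(seen)


if __name__ == "__main__":
    # One output per stream: every input sees each bucket exactly once.
    for out in narrow([source(3) for _ in range(4)], 4):
        check_input(out)

    # Narrowed to two outputs: each output replays the bucket sequence twice.
    try:
        for out in narrow([source(3) for _ in range(4)], 2):
            check_input(out)
    except LogicalError as e:
        print(e)
```

In this model, keeping one output per stream preserves the per-input bucket order, while narrowing four streams to two makes the first output restart from bucket 0 after it has already delivered it. That is the kind of ordering violation the description attributes to `pipeline.narrow()`.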