Skip to content

Fix two-level aggregation when using Merge over Distributed#87687

Merged
nickitat merged 6 commits intoClickHouse:masterfrom
c-end:two-level-aggregation-merge-distributed
Oct 10, 2025
Merged

Fix two-level aggregation when using Merge over Distributed#87687
nickitat merged 6 commits intoClickHouse:masterfrom
c-end:two-level-aggregation-merge-distributed

Conversation

@c-end
Copy link
Contributor

@c-end c-end commented Sep 26, 2025

We (Cloudflare) noticed that some GROUP BY queries referencing a Merge table over Distributed tables fail with a logical error from SortingAggregatedTransform: Code: 49. DB::Exception: SortingAggregatedTransform already got bucket with number 237. (LOGICAL_ERROR). Apparently the processor is not getting the buckets in the expected order when a Merge table is involved. When comparing with an old version, that doesn't have this problem, we noticed a small difference in the pipeline, which is the number of inputs to GroupingAggregatedTransform:

(Expression)
ExpressionTransform × 4
  (MergingAggregated)
  Resize 1 → 4
    SortingAggregatedTransform 4 → 1
      MergingAggregatedBucketTransform × 4
        Resize 1 → 4
          GroupingAggregatedTransform 8 → 1 <-- the number of inputs no longer matches the number of remote sources here
            (ReadFromMerge)

We believe this is because of bff832c, where the max number of streams is not set to max_distributed_connections anymore.

When ReadFromMerge::initializePipeline calls pipeline.narrow(), this doesn't seem to preserve the order of buckets that is expected by SortingAggregatedTransform.

I'm not sure if the change I made in ReadFromMerge is a good solution, but based on the test I also added in this PR, it seems to fix the problem.

I was also able to reproduce the issue on 25.8. On master things seem to have changed a bit because of #80179, so that SortingAggregatedTransform doesn't seem to be part of the pipeline anymore.

Changelog category (leave one):

  • Bug Fix (user-visible misbehavior in an official stable release)

Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):

Fixed two-level aggregation when using Merge over Distributed.

Documentation entry for user-facing changes

  • Documentation is written (mandatory for new features)

@c-end
Copy link
Contributor Author

c-end commented Sep 26, 2025

@nickitat @devcrafter Would you mind taking a look, since you recently worked on #80179?

@thevar1able thevar1able added the can be tested Allows running workflows for external contributors label Sep 26, 2025
@clickhouse-gh
Copy link
Contributor

clickhouse-gh bot commented Sep 26, 2025

Workflow [PR], commit [70da360]

@clickhouse-gh clickhouse-gh bot added the pr-bugfix Pull request with bugfix, not backported by default label Sep 26, 2025
@nickitat nickitat self-assigned this Sep 30, 2025
@nickitat
Copy link
Member

nickitat commented Oct 3, 2025

We recently had a fix for the same issue, I believe. But in a different place: #78500

@nickitat
Copy link
Member

nickitat commented Oct 3, 2025

Pls check this: https://s3.amazonaws.com/clickhouse-test-reports/json.html?PR=87687&sha=56424693c790ac83770d2f367bbd740ae0605450&name_0=PR&name_1=Bugfix+validation+%28functional+tests%29
The new test failed, but for an orthogonal reason. Until we fix the test, the whole set of checks won't run.

@c-end
Copy link
Contributor Author

c-end commented Oct 6, 2025

Thanks @nickitat for taking a look.

We recently had a fix for the same issue, I believe. But in a different place: #78500

Yeah we saw this fix and initially thought it would solve the issue, but it didn't. The exception is the same, but I think what we are seeing here is related to the Merge engine narrowing the pipe, not the parallelized reading from storage.

Pls check this: https://s3.amazonaws.com/clickhouse-test-reports/json.html?PR=87687&sha=56424693c790ac83770d2f367bbd740ae0605450&name_0=PR&name_1=Bugfix+validation+%28functional+tests%29 The new test failed, but for an orthogonal reason. Until we fix the test, the whole set of checks won't run.

Yeah the test failed because of https://github.com/ClickHouse/ClickHouse/pull/80179/files#diff-1806a5c1f13b491c615e887366acf1daac6d85b1528be423e3762705482c80fbR574, so that there was no more SortingAggregatedTransform processor. I've updated the test and I'm still able to reproduce the issue.
If you have an idea on how to simplify the test, please let me know.

@nickitat
Copy link
Member

nickitat commented Oct 9, 2025

@c-end
Copy link
Contributor Author

c-end commented Oct 10, 2025

Right now the problem is that the test is too long: https://s3.amazonaws.com/clickhouse-test-reports/json.html?PR=87687&sha=334d98b71965ac4918c166e1cf388e0243ac93e5&name_0=PR&name_1=Stateless+tests+%28amd_asan%2C+flaky+check%29

I was able to speed it up a bit by using sync distributed inserts, but I ended up tagging it as long. It only takes a few seconds in the other runs, so I hope that's fine.

Does the actual fix in StorageMerge look good to you? It would be great if this can be merged soon, because it's actually a critical bug for us. We cannot deploy any recent ClickHouse version to our production clusters that use Merge over Distributed.

@nickitat nickitat added this pull request to the merge queue Oct 10, 2025
Merged via the queue into ClickHouse:master with commit 5cbec5b Oct 10, 2025
121 of 123 checks passed
@robot-ch-test-poll robot-ch-test-poll added the pr-synced-to-cloud The PR is synced to the cloud repo label Oct 10, 2025
mkmkme pushed a commit to Altinity/ClickHouse that referenced this pull request Jan 21, 2026
…merge-distributed

Fix two-level aggregation when using Merge over Distributed
Enmk added a commit to Altinity/ClickHouse that referenced this pull request Jan 26, 2026
Antalya 25.8.14 backport of ClickHouse#87687 - Fix two-level aggregation when using Merge over Distributed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

can be tested Allows running workflows for external contributors pr-bugfix Pull request with bugfix, not backported by default pr-synced-to-cloud The PR is synced to the cloud repo

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants