Skip to content

Fix assertion failure in CRoaring on self-merge of NumericIndexedVector#99976

Merged
Avogar merged 3 commits intoClickHouse:masterfrom
Desel72:fix/issue-99704
Mar 19, 2026
Merged

Fix assertion failure in CRoaring on self-merge of NumericIndexedVector#99976
Avogar merged 3 commits intoClickHouse:masterfrom
Desel72:fix/issue-99704

Conversation

@Desel72
Copy link
Copy Markdown
Contributor

@Desel72 Desel72 commented Mar 18, 2026

Fix assertion failure in CRoaring roaring_bitmap_xor_inplace on self-merge of NumericIndexedVector

Closes #99704

The executeAggregateMultiply uses exponentiation by squaring, which calls function->merge(vec_from[i], vec_from[i], arena) (self-merge) on even iterations. For BSINumericIndexedVector, this calls pointwiseAddInplace where this == &rhs. Since shallowCopyFrom shares Roaring bitmap pointers, the full adder loop performs sum->rb_xor(*addend) on aliased bitmaps, triggering assert(x1 != x2) in debug builds and producing incorrect results (A XOR A = 0) in release builds.

Added a self-merge guard to both pointwiseAddInplace and pointwiseSubtractInplace that deep-copies the argument when this == &rhs.

Changelog category (leave one):

  • Critical Bug Fix (crash, data loss, RBAC)

Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):

Fix assertion failure (exception in debug builds, incorrect results in release builds) when multiplying NumericIndexedVector aggregate states by an even integer constant, caused by self-XOR on aliased Roaring bitmaps in pointwiseAddInplace.

Documentation entry for user-facing changes

  • Documentation is written (mandatory for new features)

…f-merge of NumericIndexedVector

The `executeAggregateMultiply` function uses exponentiation by squaring,
which calls `function->merge(vec_from[i], vec_from[i], arena)` (self-merge)
on even iterations. For `BSINumericIndexedVector`, this calls
`pointwiseAddInplace` where `this == &rhs`. Since `shallowCopyFrom` shares
Roaring bitmap pointers, the full adder loop performs `sum->rb_xor(*addend)`
on aliased bitmaps, triggering `assert(x1 != x2)` in debug builds and
producing incorrect results (A XOR A = 0) in release builds.

Add a self-merge guard to both `pointwiseAddInplace` and
`pointwiseSubtractInplace` that deep-copies the argument when `this == &rhs`.

ClickHouse#99704

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
@Desel72
Copy link
Copy Markdown
Contributor Author

Desel72 commented Mar 18, 2026

Hi @PedroTadim @alexey-milovidov @Algunenano
Could you review this PR?

-- https://github.com/ClickHouse/ClickHouse/issues/99704

-- `multiply` triggers self-merge via exponentiation by squaring (even branch).
SELECT arrayJoin([numericIndexedVectorToMap(
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the test reproduces the issue.

Copy link
Copy Markdown
Member

@alexey-milovidov alexey-milovidov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@alexey-milovidov alexey-milovidov self-assigned this Mar 19, 2026
@alexey-milovidov alexey-milovidov added the can be tested Allows running workflows for external contributors label Mar 19, 2026
@clickhouse-gh
Copy link
Copy Markdown
Contributor

clickhouse-gh bot commented Mar 19, 2026

Workflow [PR], commit [bf80514]

Summary:


AI Review

Summary

This PR fixes a real correctness bug in NumericIndexedVector state multiplication by guarding self-aliasing in BSINumericIndexedVector::pointwiseAddInplace (and symmetrically in pointwiseSubtractInplace) via a deep-copy path before in-place bitmap XOR operations. The change is small, targeted, and matches the reported failure mode in issue #99704. I did not find new blocker/major issues in the diff.

Missing context

  • ⚠️ Full CI results are not yet available (several checks are still in progress), so this review is code-focused and not based on complete CI logs.

ClickHouse Rules

Item Status Notes
Deletion logging
Serialization versioning
Core-area scrutiny
No test removal
Experimental gate
No magic constants
Backward compatibility
SettingsChangesHistory.cpp
Safe rollout
Compilation time

Final Verdict

  • Status: ✅ Approve

@clickhouse-gh clickhouse-gh bot added pr-critical-bugfix pr-must-backport Pull request should be backported intentionally. Use this label with great care! labels Mar 19, 2026
@alexey-milovidov
Copy link
Copy Markdown
Member

@Desel72, the test has failed.

Desel72 and others added 2 commits March 19, 2026 12:54
…pected output

The function signature is (index, value), so (100, 1) produces {100:2}
after multiply(2, ...), matching the reference file.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
@Desel72
Copy link
Copy Markdown
Contributor Author

Desel72 commented Mar 19, 2026

@Desel72, the test has failed.

@alexey-milovidov I've solved this.
Could you please solve #99977 problem?

@antaljanosbenjamin
Copy link
Copy Markdown
Member

Could you please solve #99977 problem?
What should be solved there? You opened two PRs with the same content. One of the was closed as duplicate. What do you expect to be done?

I just don't see what are we missing here.

@Desel72
Copy link
Copy Markdown
Contributor Author

Desel72 commented Mar 19, 2026

Could you please solve #99977 problem?
What should be solved there? You opened two PRs with the same content. One of the was closed as duplicate. What do you expect to be done?

I just don't see what are we missing here.

@antaljanosbenjamin I made a mistake. Sorry for that. I mean you can consider this #99977 (comment)?

@antaljanosbenjamin
Copy link
Copy Markdown
Member

Okay, let me try to address your comment.

Closed PR is really vulnerable for me.

I feel sorry about this, but there is nothing wrong about it. You have two PRs with the same changes. One of them has to be closed, it doesn't make sense to merge both of them, because they address the same issue as far as I can see.

I want you to make this to draft PR. Could you please help me this time?

Why would be a draft PR better than a closed one? It is better to close one of them and do not let it wonder around for the future.

I will never make a mistake. I will be a crazy contributor. I am really sorry. Please help me.

You don't need to worry. You are very welcome to contribute to ClickHouse as anybody else. A bunch of people opened duplicate PRs, it is not a big deal.

@Desel72
Copy link
Copy Markdown
Contributor Author

Desel72 commented Mar 19, 2026

Okay, let me try to address your comment.

Closed PR is really vulnerable for me.

I feel sorry about this, but there is nothing wrong about it. You have two PRs with the same changes. One of them has to be closed, it doesn't make sense to merge both of them, because they address the same issue as far as I can see.

I want you to make this to draft PR. Could you please help me this time?

Why would be a draft PR better than a closed one? It is better to close one of them and do not let it wonder around for the future.

I will never make a mistake. I will be a crazy contributor. I am really sorry. Please help me.

You don't need to worry. You are very welcome to contribute to ClickHouse as anybody else. A bunch of people opened duplicate PRs, it is not a big deal.

@antaljanosbenjamin Can I contact with you on telegram?

@clickhouse-gh
Copy link
Copy Markdown
Contributor

clickhouse-gh bot commented Mar 19, 2026

LLVM Coverage Report

Metric Baseline Current Δ
Lines 83.80% 83.80% +0.00%
Functions 23.90% 23.90% +0.00%
Branches 76.30% 76.30% +0.00%

PR changed lines: PR changed-lines coverage: 71.43% (15/21, 0 noise lines excluded)
Diff coverage report
Uncovered code

@antaljanosbenjamin
Copy link
Copy Markdown
Member

Sorry, but I don't understand what is the problem, therefore I don't see any reason to contact me in private.

@Desel72
Copy link
Copy Markdown
Contributor Author

Desel72 commented Mar 19, 2026

Sorry, but I don't understand what is the problem, therefore I don't see any reason to contact me in private.

Sorry for that. That is because of my personal problem..

@Avogar Avogar added this pull request to the merge queue Mar 19, 2026
Merged via the queue into ClickHouse:master with commit e829a5b Mar 19, 2026
163 checks passed
@robot-ch-test-poll robot-ch-test-poll added the pr-must-backport-synced The `*-must-backport` labels are synced into the cloud Sync PR label Mar 19, 2026
robot-clickhouse added a commit that referenced this pull request Mar 19, 2026
robot-clickhouse added a commit that referenced this pull request Mar 19, 2026
robot-clickhouse added a commit that referenced this pull request Mar 19, 2026
robot-clickhouse added a commit that referenced this pull request Mar 19, 2026
robot-clickhouse added a commit that referenced this pull request Mar 19, 2026
@robot-clickhouse robot-clickhouse added the pr-synced-to-cloud The PR is synced to the cloud repo label Mar 19, 2026
clickhouse-gh bot added a commit that referenced this pull request Mar 19, 2026
Backport #99976 to 26.2: Fix assertion failure in CRoaring on self-merge of NumericIndexedVector
clickhouse-gh bot added a commit that referenced this pull request Mar 19, 2026
Backport #99976 to 26.1: Fix assertion failure in CRoaring on self-merge of NumericIndexedVector
clickhouse-gh bot added a commit that referenced this pull request Mar 19, 2026
Backport #99976 to 26.3: Fix assertion failure in CRoaring on self-merge of NumericIndexedVector
alexey-milovidov added a commit that referenced this pull request Mar 19, 2026
Backport #99976 to 25.12: Fix assertion failure in CRoaring on self-merge of NumericIndexedVector
@robot-clickhouse-ci-2 robot-clickhouse-ci-2 added the pr-backports-created Backport PRs are successfully created, it won't be processed by CI script anymore label Mar 19, 2026
Avogar added a commit that referenced this pull request Mar 24, 2026
Backport #99976 to 25.8: Fix assertion failure in CRoaring on self-merge of NumericIndexedVector
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

can be tested Allows running workflows for external contributors pr-backports-created Backport PRs are successfully created, it won't be processed by CI script anymore pr-critical-bugfix pr-must-backport Pull request should be backported intentionally. Use this label with great care! pr-must-backport-synced The `*-must-backport` labels are synced into the cloud Sync PR pr-synced-to-cloud The PR is synced to the cloud repo

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Assertion `x1 != x2' failed (STID: 2794-3a2f)

7 participants