Fix zero‑copy unlock check before part dir move#95597
Merged
alexey-milovidov merged 2 commits intoClickHouse:masterfrom Feb 6, 2026
Merged
Fix zero‑copy unlock check before part dir move#95597alexey-milovidov merged 2 commits intoClickHouse:masterfrom
alexey-milovidov merged 2 commits intoClickHouse:masterfrom
Conversation
This was referenced Jan 30, 2026
Open
Contributor
|
Workflow [PR], commit [0d2272c] Summary: ❌
|
Contributor
There was a problem hiding this comment.
Pull request overview
This PR fixes a regression in zero-copy replication introduced by #94262, where shared parts could be prematurely deleted before other replicas finished fetching them. The issue arose because the "can we delete shared blobs?" check was being performed after the part directory was moved, at which point the refcount file was no longer accessible at the expected path.
Changes:
- Move the
can_remove_callback()evaluation to occur before the directory move operation - Cache the result in
can_remove_descriptionfor use after the move - Add explanatory comments documenting the zero-copy reference check requirement
tiandiwonder
reviewed
Feb 2, 2026
This was referenced Feb 2, 2026
Contributor
Author
|
failures are unrelated. @tiandiwonder ? |
Contributor
Yes. |
tiandiwonder
approved these changes
Feb 6, 2026
Contributor
Author
|
@aalexfvk you may also be interested. |
Contributor
Author
|
Should we push it to the branches where #94262 was backported? |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Changelog category (leave one):
Changelog entry (a [user-readable short description]
Fix regression in (experimental) zero‑copy replication introduced by #94262 where shared parts could be deleted before other replicas finished fetching them.
Documentation entry for user-facing changes
#94262 fixes a data race between part removal and
system.partsby stoppingDataPartStorageOnDiskBase::remove()from updatingpart_diraftermoveDirectory(). Previously the code didpart_dir = delete_tmp_*, but that write could race withgetFullPath()reads. The fix removed the write, sopart_dirstays unchanged during removal, eliminating the race.But that change accidentally broke zero‑copy cleanup: the “can we delete shared blobs?” check (
unlockSharedData) reads a refcount file viapart_dir. After the move,part_dirnow points to the old location, so the check fails and the code assumes the part is temporary and deletes blobs without ZooKeeper checks. The fix is to evaluate the “can remove” decision before the directory is moved, while the refcount file is still visible, and then reuse that decision after the move.This matches existing nearby branches that already call
can_remove_callback()before disk operations.No new tests added; zero‑copy replication test coverage has been removed in newer branches. Change was validated internally.
P.S. This should be backported to the branches where #94262 was backported, since it introduced the regression.