Avoid race condition for updating system.distribution_queue values #43406
Merged: tavplubix merged 1 commit into ClickHouse:master on Dec 5, 2022
Conversation
Previously it was possible to have a race while updating files_count/bytes_count: INSERT updates these counters from one thread, while the same metrics are updated from the filesystem in a separate thread. Even though access is synchronized with a mutex, the mutex avoids only the data race on the variables, not the logical race, since getFiles() from a separate thread can increment the counters and a later addAndSchedule() will increment them again.

Here you can find an example of this race [1].

[1]: https://pastila.nl/?00950e00/41a3c7bbb0a7e75bd3f2922c58b02334

Note that I analyzed logs from a production system with lots of async Distributed INSERTs and everything is OK there, even though the logs contain the following:

2022.11.20 02:21:15.459483 [ 11528 ] {} <Trace> v21.dist_out.DirectoryMonitor: Files set to 35 (was 34)
2022.11.20 02:21:15.459515 [ 11528 ] {} <Trace> v21.dist_out.DirectoryMonitor: Bytes set to 4035418 (was 3929008)
2022.11.20 02:21:15.819488 [ 11528 ] {} <Trace> v21.dist_out.DirectoryMonitor: Files set to 1 (was 2)
2022.11.20 02:21:15.819502 [ 11528 ] {} <Trace> v21.dist_out.DirectoryMonitor: Bytes set to 190072 (was 296482)

As you can see, one update first increases the counters and the next update decreases them (and 4035418 - 3929008 == 296482 - 190072 == 106410).

Refs: ClickHouse#23885
Reported-by: @tavplubix
Signed-off-by: Azat Khuzhin <[email protected]>
tavplubix approved these changes on Nov 21, 2022
azat (Member, Author) commented on Jan 4, 2023:

Retries failed, but the first time it failed because the process got killed and the container had been destroyed, with no info in the logs about this (maybe OOM, but why?). Note: 137 is SIGKILL (128 + 9).
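The 137 mentioned in the comment follows the standard shell convention: a process killed by signal N is reported with exit status 128 + N, and SIGKILL is signal 9.

```shell
# A process killed by signal N exits with status 128 + N as seen by the shell;
# SIGKILL is 9, so a SIGKILL'ed process reports 137.
sh -c 'kill -KILL $$'   # the child kills itself with SIGKILL
echo $?                 # prints 137
```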
azat (Member, Author) commented:

This patch introduces a bug: now these counters are not initialized on server start (not a huge deal, but still). I've rewritten the code completely to avoid readdir before each send (you may think of this readdir as a synchronization method between INSERT and the thread that sends data to remote nodes); see #44922.
azat added a commit to azat/ClickHouse that referenced this pull request on Jan 5, 2023:
In ClickHouse#43406 metrics were broken for a clean start, since they were not initialized from disk, while metrics for broken files were never initialized from disk at all. Fix this and rework how DirectoryMonitor works with the file system:

- do not iterate over the directory before each send; do this only once on init, after which the map of files is updated by INSERTs
- call fs::create_directories() from the ctor for the "broken" folder to avoid excessive calls
- cache "broken" paths

This patch also fixes a possible issue where current_batch can be processed multiple times (the second time would be an exception): after processing an existing current_batch.txt, you should remove it immediately.

Plus this patch implicitly fixes logging issues that reported an incorrect number of files in case of error (see ClickHouse#44907 for details).

Signed-off-by: Azat Khuzhin <[email protected]>
Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
Avoid race condition for updating system.distribution_queue values