Skip to content

S3Queue auxiliary Zookeeper support#95203

Merged
kssenii merged 3 commits intoClickHouse:masterfrom
lesandie:s3_queue_aux_zk
Jan 28, 2026
Merged

S3Queue auxiliary Zookeeper support#95203
kssenii merged 3 commits intoClickHouse:masterfrom
lesandie:s3_queue_aux_zk

Conversation

@lesandie
Copy link
Contributor

Changelog category (leave one):

  • Improvement

Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):

S3Queue auxiliary Zookeeper support using keeper_path setting from s3Queue

This will allow to do workload separation for heavy zk usage modes like unordered, by creating a zk/keeper specifically for some S3Queue tables and creating an S3queue specifying keeper_path with an auxiliary zookeeper like in a ReplicatedMergeTree table:

'auxiliary_zookeeper:/clickhouse/s3queue/my_s3queue_table'

Added integration tests and doc

Documentation entry for user-facing changes

  • Documentation is written (mandatory for new features)

@lesandie
Copy link
Contributor Author

lesandie commented Jan 26, 2026

This is a clean PR based on: #93786

@kssenii I will test this again, but LGTM.

Thanks for your feedback!

@kssenii kssenii added the can be tested Allows running workflows for external contributors label Jan 26, 2026
@kssenii kssenii self-assigned this Jan 26, 2026
@clickhouse-gh
Copy link
Contributor

clickhouse-gh bot commented Jan 26, 2026

Workflow [PR], commit [e39170f]

Summary:

job_name test_name status info comment
Performance Comparison (amd_release, master_head, 1/6) failure
Check Results failure IGNORED

@clickhouse-gh clickhouse-gh bot added the pr-improvement Pull request with some product improvements label Jan 26, 2026
Comment on lines -91 to -101

/// Use partition key for bucketing when bucketing_mode is PARTITION
/// This ensures files from the same partition always go to the same bucket
if (bucketing_mode == ObjectStorageQueueBucketingMode::PARTITION)
std::string key = path;
if (bucketing_mode == ObjectStorageQueueBucketingMode::PARTITION && hasPartitioningMode(partitioning_mode))
{
auto partition_key = getPartitionKey(path, partitioning_mode, parser);
return sipHash64(partition_key) % buckets_num;
if (!partition_key.empty())
key = std::move(partition_key);
}

/// Default hash the full file path
return sipHash64(path) % buckets_num;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's make it back to the old diff

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK will do.

settings[ObjectStorageQueueSetting::mode] = table_metadata.mode;
settings[ObjectStorageQueueSetting::after_processing] = table_metadata.after_processing;
settings[ObjectStorageQueueSetting::keeper_path] = zk_path;
if (zookeeper_name == zkutil::DEFAULT_ZOOKEEPER_NAME)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we also check if it is empty here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes I saw this, the empty() one which was fixed but somehow it got through, I was gonna fix it. The thing here is that as @ianton-ru said, why we need to check for empty here if it has been already initialized?.

https://github.com/ClickHouse/ClickHouse/pull/95203/changes#diff-74462022be60177caa2aad87d8e3b8d2abf451799319621a44f12a1907f3c5c1R165

I was going to remove the empty condition from all.

Also since this was supposed to be fixed and it got through I'm going to review past comments and see if they are set.

@lesandie
Copy link
Contributor Author

lesandie commented Jan 27, 2026

@kssenii

Seems like we're ok 😄

Codex pulled the review comments from both PRs via gh and cross‑checked the current branch.

Summary:

Applied (matches current branch):

  • StorageObjectStorageQueue.h: removed unused getZooKeeperName() accessor. Comment by @kssenii addressed.
  • StorageObjectStorageQueue.h/cpp: chooseZooKeeperPath out‑param renamed to result_zookeeper_name, and zookeeper_name/zookeeper_path kept adjacent in signatures. Comments by @kssenii / @ianton-ru addressed.
  • StorageObjectStorageQueue.cpp: relative keeper_path now uses s3queue_default_zookeeper_path even with aux prefix, addressing “Why avoid zk_path_prefix with auxiliary zk?”.
  • ObjectStorageQueueOrderedFileMetadata.h/cpp: zookeeper_name moved into BucketInfo per @kssenii.
  • ObjectStorageQueueMetadata.cpp: all Keeper ops inside retry loops now obtain client via getZooKeeper() (reinit on retry). Addresses @kssenii / @ianton-ru comments.
  • ObjectStorageQueueOrderedFileMetadata.cpp: multi‑get and exists checks are wrapped in retry loops @ianton-ru .
  • ObjectStorageQueueMetadataFactory.cpp: metadata key uses makeMetadataKey() directly; no redundant re‑build; empty zookeeper name checks removed.
    Docs s3queue.md: keeper_path description uses separate paragraphs/lines per style feedback.

@kssenii
Copy link
Member

kssenii commented Jan 28, 2026

I've resolved a conflict in private sync, see CH Inc sync - Waiting for status to be reported in checks. When it finishes running tests, this PR can be merged.

@kssenii kssenii enabled auto-merge January 28, 2026 12:04
@lesandie lesandie changed the title S3Queue auxiliary Zookeeper support (clean) S3Queue auxiliary Zookeeper support Jan 28, 2026
@lesandie
Copy link
Contributor Author

I've resolved a conflict in private sync, see CH Inc sync - Waiting for status to be reported in checks. When it finishes running tests, this PR can be merged.

@kssenii Seems like CH Inc sync test passed successfully

@kssenii kssenii added this pull request to the merge queue Jan 28, 2026
Merged via the queue into ClickHouse:master with commit b63af24 Jan 28, 2026
132 of 134 checks passed
@robot-clickhouse robot-clickhouse added the pr-synced-to-cloud The PR is synced to the cloud repo label Jan 28, 2026
zvonand added a commit to Altinity/ClickHouse that referenced this pull request Feb 4, 2026
25.8.15 Backport of ClickHouse#95203: S3Queue auxiliary Zookeeper support
zvonand added a commit to Altinity/ClickHouse that referenced this pull request Feb 5, 2026
25.8.15 Backport of ClickHouse#95203: S3Queue auxiliary Zookeeper support
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

can be tested Allows running workflows for external contributors pr-improvement Pull request with some product improvements pr-synced-to-cloud The PR is synced to the cloud repo

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants