Fix attaching Replicated DBs when the interserver host changed after restarting#93779

Merged
tuanpach merged 6 commits into ClickHouse:master from tuanpach:fix-attach-replicated-db-when-interserver-host-changed
Jan 14, 2026

Conversation

@tuanpach
Member

@tuanpach tuanpach commented Jan 9, 2026

Changelog category (leave one):

  • Bug Fix (user-visible misbehavior in an official stable release)

Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):

Fix attaching Replicated DBs when the interserver host changed after restarting.

After ClickHouse restarts, the interserver host might change. However, for Replicated DBs, we store the host_id, which includes the host, in the replica_path. When attaching after a restart, the stored host_id no longer matches the current one, and an error is thrown.

In this PR, if the host_id does not match but the UUID is the same, we set the replica_path data to the new host_id.
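
The decision described above can be modeled in a few lines. This is only a rough sketch, not the code in `DatabaseReplicated.cpp`: it assumes the host_id ends with `:<uuid>`, uses a plain dict in place of ZooKeeper, and the names `attach_replica`, `uuid_of`, and the example path are hypothetical.

```python
class ReplicaAlreadyExists(Exception):
    """Stand-in for the REPLICA_ALREADY_EXISTS error code."""


def uuid_of(host_id: str) -> str:
    # Assumption: the database UUID is the last ':'-separated field of host_id.
    return host_id.rpartition(":")[2]


def attach_replica(keeper: dict, replica_path: str, current_host_id: str) -> bool:
    """Compare the stored host_id with the current one.

    Returns True if the stored value was rewritten, False if it already matched.
    Raises ReplicaAlreadyExists when the UUIDs differ.
    """
    stored_host_id = keeper[replica_path]
    if stored_host_id == current_host_id:
        return False  # nothing changed across the restart

    if uuid_of(stored_host_id) != uuid_of(current_host_id):
        # Different database identity: keep the original error behavior.
        raise ReplicaAlreadyExists(
            f"host ID '{stored_host_id}' does not match '{current_host_id}'"
        )

    # Same UUID, different address: overwrite the stored host_id.
    keeper[replica_path] = current_host_id
    return True


# Hypothetical path and host names, for illustration only.
keeper = {"/db/replicas/shard1|replica1": "old-host:9000:uuid-a"}
attach_replica(keeper, "/db/replicas/shard1|replica1", "new-host:9000:uuid-a")
```

After the call, the dict holds the new host_id; a second call with the same arguments is a no-op.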

Documentation entry for user-facing changes

  • Documentation is written (mandatory for new features)

@clickhouse-gh
Contributor

clickhouse-gh bot commented Jan 9, 2026

Workflow [PR], commit [0aa9484]

Summary:

| job_name | test_name | status | info | comment |
|---|---|---|---|---|
| Integration tests (arm_binary, distributed plan, 2/4) | | failure | | |
| | test_scheduler_cpu_preemptive/test.py::test_independent_pools[cpu-slot-preemption-timeout-60s] | FAIL | cidb | |
| BuzzHouse (amd_debug) | | failure | | |
| | Logical error: 'Inconsistent AST formatting: the query: (STID: 1941-1bfa) | FAIL | cidb, issue | |
| BuzzHouse (amd_tsan) | | failure | | |
| | Logical error: Bad cast from type A to B (STID: 1635-3e67) | FAIL | cidb, issue | |

@clickhouse-gh clickhouse-gh bot added the pr-bugfix Pull request with bugfix, not backported by default label Jan 9, 2026
Contributor

Copilot AI left a comment

Pull request overview

This PR fixes an issue where Replicated databases fail to attach after a ClickHouse restart if the interserver host changed. The problem occurs because the host_id (which includes the host address) is stored in the replica_path in ZooKeeper, and a mismatch after restart causes an error. The solution is to update the host_id in ZooKeeper when the UUID matches but the host_id differs.

Changes:

  • Added logic to parse and compare the UUID from the stored host_id in ZooKeeper
  • When a host_id mismatch occurs but the UUID matches, update the host_id in ZooKeeper instead of throwing an error
  • Added integration test to verify the fix works when interserver_http_host changes after database creation
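
To illustrate the first item, here is a hedged sketch of what extracting the UUID from a stored host_id could look like. It is not the PR's `parseHostID`; the signature, the `HostID` type, and the `address:uuid` layout are assumptions made for the example. Splitting from the right keeps any colons inside the address part (such as a port) intact.

```python
import uuid
from typing import NamedTuple, Optional


class HostID(NamedTuple):
    address: str        # everything before the last ':' (assumed layout)
    db_uuid: uuid.UUID  # trailing database UUID


def parse_host_id(host_id: str) -> Optional[HostID]:
    """Split a host_id into address and UUID; return None if it is malformed."""
    address, sep, tail = host_id.rpartition(":")
    if not sep or not address:
        return None  # no ':' separator or empty address part
    try:
        return HostID(address, uuid.UUID(tail))
    except ValueError:
        return None  # trailing field is not a valid UUID


# Hypothetical host names, for illustration only.
stored = parse_host_id("node1.example:9009:123e4567-e89b-12d3-a456-426614174000")
current = parse_host_id("node1.internal:9009:123e4567-e89b-12d3-a456-426614174000")
same_database = (
    stored is not None and current is not None and stored.db_uuid == current.db_uuid
)
```

Returning `None` for malformed input lets the caller fall back to the original error path instead of guessing at the identity of the stored replica.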

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

| File | Description |
|---|---|
| src/Databases/DatabaseReplicated.cpp | Added parseHostID function to extract UUID from host_id string and logic to update ZooKeeper host_id when UUID matches but address changed |
| tests/integration/test_replicated_database_interserver_host/test.py | Refactored config update logic into helper function and added test case for interserver host change scenario |

tuanpach and others added 2 commits January 12, 2026 09:09
Replace replica_host_id in the log

Co-authored-by: Copilot <[email protected]>
}

if (uuid_in_keeper != db_uuid)
throw Exception(
Contributor

Make it clearer?
throw Exception(
ErrorCodes::REPLICA_ALREADY_EXISTS,
"Replica {} of shard {} of replicated database at {} already exists. Replica host ID ('{}') does not match current host ID ('{}'). "
"A replica with the same name exists but with different node identity.",
replica_name, shard_name, zookeeper_path, replica_host_id, host_id);

Member Author

I don't see much difference. It also introduces the new terms "name" and "node identity" here.

replica_host_id,
host_id);

// After restarting, InterserverIOAddress might change.
Contributor

List the possible address change scenarios?

Contributor

@tiandiwonder tiandiwonder left a comment

Is there an issue for it, such as an on-call issue?

@tuanpach
Member Author

Is there an issue for it, such as an on-call issue?

There is a cross link: #89693

It is part of the issue.

Update the comment, explain why the InterserverIOAddress might change.
@tuanpach
Member Author

test_scheduler_cpu_preemptive/test.py::test_independent_pools[cpu-slot-preemption-timeout-60s]

@tuanpach tuanpach added this pull request to the merge queue Jan 14, 2026
Merged via the queue into ClickHouse:master with commit 2bc5af7 Jan 14, 2026
128 of 132 checks passed
@tuanpach tuanpach deleted the fix-attach-replicated-db-when-interserver-host-changed branch January 14, 2026 06:28
@robot-ch-test-poll3 robot-ch-test-poll3 added the pr-synced-to-cloud The PR is synced to the cloud repo label Jan 14, 2026
@tuanpach tuanpach added the pr-must-backport Pull request should be backported intentionally. Use this label with great care! label Jan 23, 2026
@robot-clickhouse robot-clickhouse added the pr-must-backport-synced The `*-must-backport` labels are synced into the cloud Sync PR label Jan 23, 2026
robot-ch-test-poll added a commit that referenced this pull request Jan 23, 2026
Cherry pick #93779 to 25.10: Fix attaching Replicated DBs when the interserver host changed after restarting
robot-clickhouse added a commit that referenced this pull request Jan 23, 2026
robot-ch-test-poll added a commit that referenced this pull request Jan 23, 2026
Cherry pick #93779 to 25.11: Fix attaching Replicated DBs when the interserver host changed after restarting
robot-clickhouse added a commit that referenced this pull request Jan 23, 2026
robot-ch-test-poll added a commit that referenced this pull request Jan 23, 2026
Cherry pick #93779 to 25.12: Fix attaching Replicated DBs when the interserver host changed after restarting
robot-clickhouse added a commit that referenced this pull request Jan 23, 2026
clickhouse-gh bot added a commit that referenced this pull request Jan 23, 2026
Backport #93779 to 25.10: Fix attaching Replicated DBs when the interserver host changed after restarting
clickhouse-gh bot added a commit that referenced this pull request Jan 23, 2026
Backport #93779 to 25.11: Fix attaching Replicated DBs when the interserver host changed after restarting
clickhouse-gh bot added a commit that referenced this pull request Jan 23, 2026
Backport #93779 to 25.12: Fix attaching Replicated DBs when the interserver host changed after restarting
robot-clickhouse-ci-1 added a commit that referenced this pull request Jan 28, 2026
Cherry pick #93779 to 25.3: Fix attaching Replicated DBs when the interserver host changed after restarting
robot-clickhouse added a commit that referenced this pull request Jan 28, 2026
robot-clickhouse-ci-1 added a commit that referenced this pull request Jan 28, 2026
Cherry pick #93779 to 25.8: Fix attaching Replicated DBs when the interserver host changed after restarting
robot-clickhouse added a commit that referenced this pull request Jan 28, 2026
@robot-ch-test-poll2 robot-ch-test-poll2 added the pr-backports-created Backport PRs are successfully created, it won't be processed by CI script anymore label Jan 28, 2026
tuanpach added a commit that referenced this pull request Jan 28, 2026
Backport #93779 to 25.8: Fix attaching Replicated DBs when the interserver host changed after restarting
tuanpach added a commit that referenced this pull request Jan 28, 2026
Backport #93779 to 25.3: Fix attaching Replicated DBs when the interserver host changed after restarting
Labels

  • pr-backports-created: Backport PRs are successfully created, it won't be processed by CI script anymore
  • pr-bugfix: Pull request with bugfix, not backported by default
  • pr-must-backport: Pull request should be backported intentionally. Use this label with great care!
  • pr-must-backport-synced: The `*-must-backport` labels are synced into the cloud Sync PR
  • pr-synced-to-cloud: The PR is synced to the cloud repo

6 participants