Skip to content

Improve restoring of access entities' dependencies #2#69563

Merged
alexey-milovidov merged 14 commits intoClickHouse:masterfrom
vitlibar:restore-access-dependencies-2
Sep 28, 2024
Merged

Improve restoring of access entities' dependencies #2#69563
alexey-milovidov merged 14 commits intoClickHouse:masterfrom
vitlibar:restore-access-dependencies-2

Conversation

@vitlibar
Copy link
Member

Changelog category:

  • Improvement

Changelog entry:

Follow-up to #69346
Point 4 described there will work now as well:

For example if before making a backup a role was granted to a user:

CREATE USER u1;
CREATE ROLE r1;
GRANT r1 TO u1;

then during RESTORE the following cases are possible:

  1. User u1 and role r1 both don't exist. Then RESTORE will recreate them both and that grant too.
  2. User u1 and role r1 both exist. Then RESTORE (with default settings) will skip them both and that grant too.
  3. User u1 doesn't exist and role r1 exist. Then RESTORE (with default settings) will recreate user u1, and that grant too.
  4. User u1 exists and role r1 doesn't exist. Then RESTORE (with default settings) will recreate role r1, and (after this PR) that grant too.

@robot-ch-test-poll robot-ch-test-poll added the pr-improvement Pull request with some product improvements label Sep 12, 2024
@robot-clickhouse
Copy link
Member

robot-clickhouse commented Sep 12, 2024

This is an automated comment for commit 8a7a411 with description of existing statuses. It's updated for the latest CI running

✅ Click here to open a full report in a separate page

Successful checks
Check nameDescriptionStatus
AST fuzzerRuns randomly generated queries to catch program errors. The build type is optionally given in parenthesis. If it fails, ask a maintainer for help✅ success
BuildsThere's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS✅ success
ClickBenchRuns [ClickBench](https://github.com/ClickHouse/ClickBench/) with instant-attach table✅ success
Compatibility checkChecks that clickhouse binary runs on distributions with old libc versions. If it fails, ask a maintainer for help✅ success
Docker keeper imageThe check to build and optionally push the mentioned image to docker hub✅ success
Docker server imageThe check to build and optionally push the mentioned image to docker hub✅ success
Docs checkBuilds and tests the documentation✅ success
Fast testNormally this is the first check that is ran for a PR. It builds ClickHouse and runs most of stateless functional tests, omitting some. If it fails, further checks are not started until it is fixed. Look at the report to see which tests fail, then reproduce the failure locally as described here✅ success
Flaky testsChecks if new added or modified tests are flaky by running them repeatedly, in parallel, with more randomization. Functional tests are run 100 times with address sanitizer, and additional randomization of thread scheduling. Integration tests are run up to 10 times. If at least once a new test has failed, or was too long, this check will be red. We don't allow flaky tests, read the doc✅ success
Install packagesChecks that the built packages are installable in a clear environment✅ success
Integration testsThe integration tests report. In parenthesis the package type is given, and in square brackets are the optional part/total tests✅ success
Performance ComparisonMeasure changes in query performance. The performance test report is described in detail here. In square brackets are the optional part/total tests✅ success
Stateful testsRuns stateful functional tests for ClickHouse binaries built in various configurations -- release, debug, with sanitizers, etc✅ success
Stateless testsRuns stateless functional tests for ClickHouse binaries built in various configurations -- release, debug, with sanitizers, etc✅ success
Stress testRuns stateless functional tests concurrently from several clients to detect concurrency-related errors✅ success
Style checkRuns a set of checks to keep the code style clean. If some of tests failed, see the related log from the report✅ success
Unit testsRuns the unit tests for different release types✅ success
Upgrade checkRuns stress tests on server version from last release and then tries to upgrade it to the version from the PR. It checks if the new server can successfully startup without any errors, crashes or sanitizer asserts✅ success

@vitlibar vitlibar added the do not test disable testing on pull request label Sep 12, 2024
@vitlibar vitlibar force-pushed the restore-access-dependencies-2 branch 2 times, most recently from 1bb4e16 to 5732a7d Compare September 13, 2024 14:53
@vitlibar vitlibar removed the do not test disable testing on pull request label Sep 13, 2024
@vitlibar vitlibar force-pushed the restore-access-dependencies-2 branch 7 times, most recently from 5bae8f5 to 4d5859e Compare September 14, 2024 07:07
@qoega
Copy link
Member

qoega commented Sep 16, 2024

CI all green

@qoega qoega marked this pull request as ready for review September 16, 2024 08:40
@vitlibar
Copy link
Member Author

I'd like to merge #69346 first.

@qoega qoega added pr-must-backport Pull request should be backported intentionally. Use this label with great care! pr-must-backport-cloud labels Sep 16, 2024
@vitlibar vitlibar force-pushed the restore-access-dependencies-2 branch 2 times, most recently from eb5c7b1 to e44e2c9 Compare September 19, 2024 07:53
@vitlibar vitlibar requested a review from pufit September 19, 2024 07:53
@vitlibar vitlibar force-pushed the restore-access-dependencies-2 branch from e44e2c9 to 9e35eb6 Compare September 19, 2024 10:31
@vitlibar vitlibar force-pushed the restore-access-dependencies-2 branch from 9e35eb6 to 443ff06 Compare September 19, 2024 10:43
@vitlibar vitlibar force-pushed the restore-access-dependencies-2 branch from 443ff06 to 8a7a411 Compare September 19, 2024 17:30
@vitlibar
Copy link
Member Author

@pufit Please review

Copy link
Member

@pufit pufit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, looks good!

std::vector<UUID> findAll() const { return findAll(EntityClassT::TYPE); }
/// Returns the identifiers of all the entities in the storage.
template <typename EntityClassT = IAccessEntity>
std::vector<UUID> findAll() const;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we use it somewhere?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, see https://github.com/ClickHouse/ClickHouse/pull/69563/files#diff-5b07fbe0e42dd2e297b2aaa50ebb5545f132ab5c3884d38b94c4d7cb06f7151dR913

When we're making a backup:

if (!entities_ids_set.contains(id) && possible_dependent->hasDependencies(entities_ids_set))
{
auto dependent = possible_dependent->clone();
dependent->clearAllExceptDependencies();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does it do exactly? Why do we need to remove everything here?

Copy link
Member Author

@vitlibar vitlibar Sep 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is to store dependent access entities. For example, if we store roles:

BACKUP TABLE system.roles ...

then after this PR we'll have additional DEPENDENTS section in the backup with information about which users use those roles. This information must not contain the credentials of these users.

For example, if a role was granted to a user and if we backup system.roles then file /data/system/roles/access-<UUID>.txt inside the backup will look like this:

fb4c9f55-3c15-32b3-4c5b-ea6de397c48e	ROLE	role1
ATTACH ROLE role1 SETTINGS custom_x = 'x';

DEPENDENTS
7647694e-1f79-5d14-dd2c-29d976450ba1	USER	user1
ATTACH USER user1 DEFAULT ROLE ID('fb4c9f55-3c15-32b3-4c5b-ea6de397c48e');
ATTACH GRANT ID('fb4c9f55-3c15-32b3-4c5b-ea6de397c48e') TO user1;

So this information is to restore role role1 and grant it to existing user user1, and not to restore user1 - so we don't have to store the credentials of user1 here.

struct FilePathsAndHost
{
std::unordered_set<String> file_paths;
std::set<String> file_paths;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why does it need to be sorted?

Copy link
Member Author

@vitlibar vitlibar Sep 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because in BackupCoordinationReplicatedAccess::getFilePaths() we get the first file_path in this list and we want it to be the same on each host. This list of file paths contains the same elements on each host and if the list is sorted then we can be sure that the first element will be also the same on each host.


if (found)
{
for (const auto & role_id : src.roles_with_admin_option)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

copy_if?


if (found)
{
for (auto it = roles_with_admin_option.begin(); it != roles_with_admin_option.end();)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

erase_if?

@vitlibar
Copy link
Member Author

vitlibar commented Sep 28, 2024

I'm not sure about backport though - this PR has quite a bunch of changes, it adds new behavior and it even changes the format of some files stored inside a backup a little (in a backward-compatible manner).

@alexey-milovidov
Copy link
Member

Yes, let's close backports.

@robot-ch-test-poll4 robot-ch-test-poll4 added the pr-backports-created Backport PRs are successfully created, it won't be processed by CI script anymore label Sep 28, 2024
robot-ch-test-poll3 added a commit that referenced this pull request Sep 28, 2024
Backport #69563 to 24.9: Improve restoring of access entities' dependencies #2
@vitlibar vitlibar deleted the restore-access-dependencies-2 branch November 27, 2024 21:02
@robot-ch-test-poll1 robot-ch-test-poll1 added the pr-backports-created-cloud deprecated label, NOOP label Nov 27, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

pr-backports-created Backport PRs are successfully created, it won't be processed by CI script anymore pr-backports-created-cloud deprecated label, NOOP pr-improvement Pull request with some product improvements pr-must-backport Pull request should be backported intentionally. Use this label with great care! pr-synced-to-cloud The PR is synced to the cloud repo

Projects

None yet

Development

Successfully merging this pull request may close these issues.

9 participants