
[CANCELED] Ability to preserve user for distributed queries#11391

Closed
azat wants to merge 4 commits into ClickHouse:master from azat:dist-preserve-initial_user

Conversation

@azat (Member) commented Jun 2, 2020

Changelog category (leave one):

  • Improvement

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
Preserve initial_user for Distributed queries if user was not specified in remote_servers

Fixes: #6843

@azat azat marked this pull request as draft June 2, 2020 23:42
@blinkov blinkov added the pr-improvement Pull request with some product improvements label Jun 2, 2020
@qoega qoega added the doc-alert label Jun 3, 2020
@azat azat changed the title Preserve initial_user for Distributed queries if user is not specified in remote_servers Preserve initial_user for Distributed queries if user was not specified in remote_servers Jun 3, 2020
@azat azat force-pushed the dist-preserve-initial_user branch from 64e2cf4 to 586c3b3 Compare June 3, 2020 23:33
@azat azat marked this pull request as ready for review June 4, 2020 08:20
@abyss7 abyss7 self-assigned this Jun 4, 2020
@filimonov (Contributor) commented:

Related:
#9751
#8926

@abyss7 (Contributor) commented Jun 5, 2020

It doesn't look good in some cases.

  • What to do if the propagated user is not defined on the whole cluster?
  • How to handle situations when the incoming connection to the initiator is secured, but the outgoing one is not?
  • How to provide backward compatibility for current production clusters where all the settings and constraints are properly defined only for the 'default' user? Especially for the 1st case.

Also, it won't work if we implement any other authentication options, so it may only be a temporary solution for now.

My suggestion is to put this behavior behind a config option like <propagate_plain_user_password>1</propagate_plain_user_password>.
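As a sketch, such a gate could live in the server config. The option name is taken from the comment above; the root element and placement are assumptions, not an agreed design:

```xml
<!-- Illustrative sketch only: the option name comes from the comment above,
     and the root element/placement is an assumption. With the gate off (the
     default), distributed queries would keep using the per-shard user from
     remote_servers (or 'default'), preserving current behavior. -->
<yandex>
    <propagate_plain_user_password>1</propagate_plain_user_password>
</yandex>
```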

@abyss7 (Contributor) left a comment:

Put the new behavior under some config option.

@abyss7 (Contributor) left a comment:

Also you should make sure that the raw password isn't stored in RAM on a long-term basis, i.e. it should be sent to the remote server and then erased from memory immediately.

@azat azat changed the title Preserve initial_user for Distributed queries if user was not specified in remote_servers Ability to preserve user for distributed queries Jun 5, 2020
@azat (Member, Author) commented Jun 5, 2020

How to handle situations when incoming connection to the initiator is secured, but outcoming is not?

This PR does not change this; am I missing something?

How to provide backward compatibility for current production clusters where all the settings and constraints are properly defined only for the 'default' user? Especially for the 1st case.

Agree, backward compatibility should be defined.
I went without it for two reasons: simplicity (I forgot to mark this PR as RFC) and a misreading of #6843 (comment)

Also, it won't work if we implement any other authentication options, so it may only be a temporary solution for now.

Yes, and as @filimonov noted, there can be some issues for Kerberos/LDAP (since they require some keys IIRC).

Also you should make sure that the raw password isn't stored in RAM on a long-term basis, i.e. it should be sent to the remote server and then erased from memory immediately.

But it is already stored in ClientInfo, is that so significant? (Context::setUser)

Actually there is another way to go, suggested by @filimonov in #9751.
It looks like it will resolve all the issues (and remove the extra reconnect); any thoughts on this?

@azat azat marked this pull request as draft June 5, 2020 19:58
@abyss7 (Contributor) commented Jun 6, 2020

How to handle situations when the incoming connection to the initiator is secured, but the outgoing one is not?

This PR does not change this; am I missing something?

Yes, but AFAIK the current implementation never re-sends passwords.

Also you should make sure that the raw password isn't stored in RAM on a long-term basis, i.e. it should be sent to the remote server and then erased from memory immediately.

But it is already stored in ClientInfo, is that so significant? (Context::setUser)

Actually yes, so it's not getting worse.

Actually there is another way to go, suggested by @filimonov in #9751.
It looks like it will resolve all the issues (and remove the extra reconnect); any thoughts on this?

Do you plan to continue this PR anyway, or will you postpone it?

@azat (Member, Author) commented Jun 6, 2020

Do you plan to continue this PR anyway, or will you postpone it?

Sure, I will continue (I marked it as draft to avoid attracting extra attention until it is reworked/fixed).

Right now I have two options:

  • just introduce options to make it backward compatible (simple)
  • introduce proxy/impersonated users (as suggested in "Impersonate users in cluster" #9751; any cons to it?), but it will require more time

The proxy user does not have the mentioned limitations, but it looks a little oversized for this kind of support; I guess I will try it and see how it looks eventually.

@azat (Member, Author) commented Jun 7, 2020

Also, it won't work if we implement any other authentication options, so it may only be a temporary solution for now.

Actually, the same thing is used in StorageReplicatedMergeTree::sendRequestToLeaderReplica:

/// A: Don't think so, since the user should be the same for subsequent queries.
if (current_user != new_current_user)
{
    current_user = new_current_user;
@azat (Member, Author) left a comment:

Also, this is not the only place where the connection is established; another one is ConnectionPoolWithFailover->getServerRevision, and during this it may use a stale current_user.

(gdb) bt
0  DB::Connection::connect (this=0x7f7b09ab1118, timeouts=...) at ../src/Client/Connection.cpp:89
1  0x000000000e219675 in DB::Connection::getServerRevision (this=0x7f7b09ab1118, timeouts=...) at ../src/Client/Connection.cpp:286
2  0x000000000e22fac3 in DB::ConnectionPoolWithFailover::tryGetEntry (this=0x7f791fcff258, pool=..., timeouts=..., fail_message=...,
   settings=0x7f74adecf3b0, table_to_check=0x7f787e5b4bb0) at ../src/Client/ConnectionPoolWithFailover.cpp:232
3  0x000000000e230856 in DB::ConnectionPoolWithFailover::<>::operator() () at ../src/Client/ConnectionPoolWithFailover.cpp:153
...
9  std::__1::function<PoolWithFailoverBase<DB::IConnectionPool>::TryResult ()>::operator()() const () at ../contrib/libcxx/include/functional:2473
10 PoolWithFailoverBase<DB::IConnectionPool>::getMany() () at ../src/Common/PoolWithFailoverBase.h:225
11 0x000000000e230b8b in DB::ConnectionPoolWithFailover::getManyImpl() () at ../src/Client/ConnectionPoolWithFailover.cpp:211
12 0x000000000e23129f in DB::ConnectionPoolWithFailover::getManyChecked () at ../src/Client/ConnectionPoolWithFailover.cpp:156
13 0x000000000d6759b3 in DB::RemoteQueryExecutor::<lambda()>::operator() () at ../src/Interpreters/StorageID.h:77
...
20 DB::RemoteQueryExecutor::sendQuery (this=0x7f74adecf020) at ../src/DataStreams/RemoteQueryExecutor.cpp:149
...
46 0x00007f7b22308fb7 in start_thread (arg=<optimized out>) at pthread_create.c:486

@azat azat changed the title Ability to preserve user for distributed queries [CANCELED] Ability to preserve user for distributed queries Jun 18, 2020
@azat azat closed this Jun 24, 2020
@azat azat deleted the dist-preserve-initial_user branch October 2, 2020 19:37

Labels

pr-improvement Pull request with some product improvements

Development

Successfully merging this pull request may close these issues.

Adding "use 'initial_user' option" to user specification in cluster definition

5 participants