
Add number of errors to ignore while choosing replicas#11669

Merged
akuzm merged 10 commits into ClickHouse:master from azat:distributed_replica_error_ignore
Jun 22, 2020

Conversation

@azat
Member

@azat azat commented Jun 14, 2020

Changelog category (leave one):

  • Improvement

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
Add number of errors to ignore while choosing replicas (distributed_replica_error_ignore)

Detailed description / Documentation draft:
This allows avoiding a switch to another replica in case of an error
(since the error can be temporary).

Refs: #10564

Details

Previous HEAD's:

  • 09f8f7227297989aaed45d8fc943eafa01bab779
  • c851407ffe0624f54567c71666a74874a8de8acf

@azat azat marked this pull request as draft June 14, 2020 22:18
@blinkov blinkov added the pr-improvement Pull request with some product improvements label Jun 14, 2020
@azat azat force-pushed the distributed_replica_error_ignore branch from 09f8f72 to c851407 Compare June 17, 2020 18:02
@azat azat marked this pull request as ready for review June 17, 2020 18:02
@akuzm
Contributor

akuzm commented Jun 19, 2020

FWIW, right now there is distributed_replica_error_cap, which is the upper limit on the error count, so how about renaming these settings (not sure what the policy for settings renames is):

  • distributed_replica_error_ignore -> distributed_replica_error_limit_low
  • distributed_replica_error_cap -> distributed_replica_error_limit_high

Renaming settings is almost never good -- we strive to be backwards compatible where possible.

Speaking about the names themselves, "cap" is a saturation limit for the number of errors we track for a replica. It's limited so that the error number decays in a sane time, and the replica is not considered broken forever. I think it's not complementary to your new setting, and they don't form any meaningful range, so giving them complementary names would be misleading. I'd name your new setting "distributed_replica_max_ignored_errors". Although it's kind of hard to tell from these names which limit is which...
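The distinction akuzm draws between the two limits can be sketched as follows. This is a hypothetical Python illustration; the names ERROR_CAP, MAX_IGNORED_ERRORS, and all the functions are mine, not the actual ClickHouse implementation:

```python
# "cap" saturates the tracked error count so it can decay back to zero
# in bounded time; "max ignored errors" is subtracted before comparing
# replicas, so a few transient errors do not demote a replica.

ERROR_CAP = 1000          # saturation limit (cf. distributed_replica_error_cap)
MAX_IGNORED_ERRORS = 2    # errors to forgive (cf. distributed_replica_max_ignored_errors)

def record_error(count: int) -> int:
    """Increment a replica's error count, saturating at the cap."""
    return min(count + 1, ERROR_CAP)

def effective_errors(count: int) -> int:
    """Error count used when choosing a replica: small counts are ignored."""
    return max(0, count - MAX_IGNORED_ERRORS)

def pick_replica(error_counts: dict) -> str:
    """Prefer the replica with the fewest non-ignored errors."""
    return min(error_counts, key=lambda r: effective_errors(error_counts[r]))
```

With MAX_IGNORED_ERRORS = 2, replicas with one or two recorded errors compare equal to a clean replica, which is the point of the new setting: the cap bounds the counter from above, while the ignore threshold forgives small counts from below, and the two do not form a range.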

@akuzm akuzm self-assigned this Jun 19, 2020
@azat
Member Author

azat commented Jun 19, 2020

Renaming settings is almost never good -- we strive to be backwards compatible where possible.

Will remember.

Speaking about the names themselves, "cap" is a saturation limit for the number of errors we track for a replica. It's limited so that the error number decays in a sane time, and the replica is not considered broken forever. I think it's not complementary to your new setting, and they don't form any meaningful range, so giving them complementary names would be misleading.

Yeah, I had similar thoughts when I was writing this. OK, it was a bad idea anyway.

I'd name your new setting "distributed_replica_max_ignored_errors". Although it's kind of hard to tell from these names which limit is which...

Ok, will rename

@azat azat force-pushed the distributed_replica_error_ignore branch 2 times, most recently from e525fd1 to 7ac46d7 Compare June 20, 2020 08:12
@azat azat force-pushed the distributed_replica_error_ignore branch from 7ac46d7 to 9bfda65 Compare June 20, 2020 08:21
At startup, the server loads its configuration files.

However, ConfigReloader does not know about the already loaded files (its
files list is empty), so it will always reload the configuration just
after the server starts (plus the 2-second reload timeout).

On configuration reload the clusters will be re-created, so some
internal state will be reset:
- error_count
- last_used (round_robin)

If the reload happens during the round_robin test, it will start
querying from the beginning, so let's issue a config reload just after
startup to avoid a reload in the middle of test execution.
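The workaround described in the commit message above amounts to a single statement issued right after the node starts (a sketch; the actual test drives this through the integration-test framework):

```sql
-- Issue an explicit reload right after the node starts, so that the
-- periodic ConfigReloader does not fire mid-test and reset the
-- clusters' error_count / last_used state.
SYSTEM RELOAD CONFIG;
```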
@azat
Member Author

azat commented Jun 21, 2020

ClickHouse build check — 11/17 builds are OK

Some CI issues
The only difference between b8ee2ea (everything is green) and bd45592 (this build fails) is the integration test anyway

@alexey-milovidov
Member

I'm not sure how admin is supposed to choose the value of this setting.

@nvartolomei
Contributor

nvartolomei commented Jun 21, 2020

I think a better way to balance things would be to track the rate of errors (instead of an absolute value) with a sliding window and allow a small variability in the rate of errors between replicas.

The ignore value added here seems to help only in the beginning, before the error counter reaches that value; after that the behaviour will be exactly the same as it currently is, until the counters come back to values less than this minimum constant.

2c

@alexey-milovidov
Member

@nvartolomei The calculated number of errors is already a sliding "window" with exponential smoothing.
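A minimal sketch of the exponential smoothing referred to here, assuming a simple half-life decay (the real accounting in ClickHouse differs in detail; the function name and default of 60 seconds, matching distributed_replica_error_half_life at the time of this PR, are illustrative):

```python
def decayed_error_count(count: float, seconds_passed: float,
                        half_life: float = 60.0) -> float:
    """Error count after exponential decay: the tracked count halves
    every `half_life` seconds, so old errors fade instead of
    accumulating forever (cf. distributed_replica_error_half_life)."""
    return count * 0.5 ** (seconds_passed / half_life)
```

This is why the absolute counter already behaves like a sliding window: an error recorded long ago contributes almost nothing to the current comparison between replicas.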

@azat
Member Author

azat commented Jun 21, 2020

I'm not sure how admin is supposed to choose the value of this setting.

via config.xml?

@alexey-milovidov
Member

What value will be good: 1 or 10 or 100?

@azat
Member Author

azat commented Jun 22, 2020

It depends, and it should be adjusted only after looking at system.clusters (which has the error_count rate) plus the server logs (for issues around distributed queries).

I would say that 100 definitely is not good, but 1-10 may be a good idea.

The rate of queries and the time for a replica's error_count to recover (distributed_replica_error_half_life) should also be taken into account.

(at first glance this can be enough)
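As a concrete illustration of the guidance above, the setting can be placed in a settings profile (the value 5 is just an example to be tuned against system.clusters and the server logs; distributed_replica_error_half_life is shown for context with its default of 60 seconds):

```xml
<profiles>
    <default>
        <!-- Forgive up to 5 accumulated errors per replica before
             load balancing starts to avoid it (example value). -->
        <distributed_replica_max_ignored_errors>5</distributed_replica_max_ignored_errors>
        <!-- Tracked errors halve every 60 seconds. -->
        <distributed_replica_error_half_life>60</distributed_replica_error_half_life>
    </default>
</profiles>
```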

@akuzm
Contributor

akuzm commented Jun 22, 2020

P.S. I try to keep PRs free of cleanups/refactoring to reduce the number of changes (lots of changes are hard to review), but it looks like this is not a problem for you, so I will consider deviating from this rule in some cases

That's generally a good idea if there is a non-trivial amount of refactoring. One way to do it is maintain logically separate changes in separate commits -- it works well with the github interface. For very big PRs, we can even merge some commits earlier, to have less conflicts. It's not very convenient for the developer though -- lots of fiddling with interactive rebase.

@azat
Member Author

azat commented Jun 22, 2020

That's generally a good idea if there is a non-trivial amount of refactoring.

And sometimes even trivial may be a problem (especially for backporting)

One way to do it is maintain logically separate changes in separate commits -- it works well with the github interface.

Well, keeping separate changes in separate commits is a rule of thumb, but this also does not solve issues with conflicts and so on.

lots of fiddling with interactive rebase.

ccache helps at least here :)

@akuzm akuzm merged commit e76941b into ClickHouse:master Jun 22, 2020
@azat azat deleted the distributed_replica_error_ignore branch June 23, 2020 16:39
azat added a commit to azat/ClickHouse that referenced this pull request Jun 29, 2020
Fixed in: ClickHouse#11669 bd45592 ("Fix
test_distributed_load_balancing flaps (due to config reload)")
