Control nesting level for shards skipping and disallow non-deterministic functions #11715
…stic func
An example of such a function is rand(). This patch disables only optimize_skip_unused_shards, i.e. the INSERT code path is not changed, so it will work as before.
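A rough sketch of why a non-deterministic sharding key is incompatible with shard skipping. This is a toy model in Python, not ClickHouse code: `NUM_SHARDS`, `shard_for`, `load` and `count_matches` are hypothetical names, and modulo sharding is assumed only for illustration. With a deterministic key, a query that filters on the key can safely visit a single shard; with a rand()-like key, the rows may live anywhere, so pruning would silently drop results.

```python
import random

NUM_SHARDS = 4  # hypothetical cluster size, for illustration only


def shard_for(key: int) -> int:
    """Deterministic sharding: the same key always maps to the same shard."""
    return key % NUM_SHARDS


def load(keys, deterministic: bool, seed: int = 0):
    """Distribute rows over shards, either by key or at random (models rand())."""
    rng = random.Random(seed)
    shards = [[] for _ in range(NUM_SHARDS)]
    for key in keys:
        idx = shard_for(key) if deterministic else rng.randrange(NUM_SHARDS)
        shards[idx].append(key)
    return shards


def count_matches(shards, key: int, shard_ids) -> int:
    """Count rows equal to `key` on the shards that the query actually visits."""
    return sum(shards[i].count(key) for i in shard_ids)


keys = [7] * 100 + [8] * 100
det = load(keys, deterministic=True)
rnd = load(keys, deterministic=False)

pruned = [shard_for(7)]            # what shard skipping would visit
all_shards = range(NUM_SHARDS)     # what a full fan-out visits

# Pruning is safe only when placement is deterministic.
print(count_matches(det, 7, pruned) == count_matches(det, 7, all_shards))
print(count_matches(rnd, 7, pruned) == count_matches(rnd, 7, all_shards))
```

In this model the INSERT side is untouched by the decision to prune or not: `load` behaves the same either way, and only the set of shards visited by the read changes.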
…queries
P.S. It looks like settings can be converted between SettingUInt64 and SettingBool without breaking the binary protocol. FWIW, it may be a good idea to change the semantics of the settings as follows (though I guess that changing semantics is not a good idea, and it is better to add new settings and deprecate the old ones):
- optimize_skip_unused_shards -- accepts the nesting level on which the optimization will work
- force_skip_optimize_shards_nesting -- accepts the nesting level on which the optimization will work
Previously there was no check that optimize_skip_unused_shards was working for the first level; use a cluster with an unavailable shard to guarantee this.
Yes.
Much better than
alexey-milovidov
left a comment
The code LGTM,
let's apply changes to make settings more convenient.
BTW, what is the setup when nested Distributed tables are required?
- optimize_skip_unused_shards_nesting (allows controlling the nesting level for the shard-skipping optimization)
- force_skip_optimize_shards_nesting (allows controlling the nesting level for checking whether shards were skipped or not)
- deprecates force_optimize_skip_unused_shards_no_nested
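To make the idea of a nesting level concrete, here is a minimal Python sketch of the kind of check such a setting implies. It is a model, not the actual implementation: the function name `optimization_applies` is hypothetical, and the semantics (0 means no depth limit, N means the logic applies only while the Distributed nesting depth is at most N) are an assumption that should be checked against the settings documentation.

```python
def optimization_applies(nesting_limit: int, depth: int) -> bool:
    """Return True if the shard-skipping logic should run at this depth.

    depth: 1 for the query against the outermost Distributed table,
           2 for a Distributed table that it reads from, and so on.
    nesting_limit: ASSUMED semantics, 0 = no limit,
                   N = apply only while depth <= N.
    """
    if depth < 1:
        raise ValueError("depth starts at 1")
    if nesting_limit < 0:
        raise ValueError("nesting_limit must be non-negative")
    return nesting_limit == 0 or depth <= nesting_limit


# Show which depths are covered for a few limits.
for limit in (0, 1, 2):
    print(limit, [optimization_applies(limit, d) for d in (1, 2, 3)])
```

Under this assumption the same predicate would serve both settings: one decides whether skipping is attempted at a given depth, the other whether a failure to skip at that depth is treated as an error.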
I was going to add a section about the use case to the documentation (since there can be some interesting use cases and also caveats), but did not manage to find time for this. Anyway, let me try to explain it briefly here:
Suppose you have lots of nodes (say 1000) in the cluster. And one more advantage arises when you need to expand the cluster (basically all of the above can be solved without nesting, but it will require some trickery), since in this case you can just add new nodes to a smaller cluster without any data re-sharding.
P.S. Note that this is a brief description that does not account for some possible issues/aspects.
This is Ok.
A few more small improvements around distributed querying.
Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
- optimize_skip_unused_shards_nesting (allows controlling the nesting level for the shard-skipping optimization)
- force_skip_optimize_shards_nesting (allows controlling the nesting level for checking whether shards were skipped or not)
- force_optimize_skip_unused_shards_no_nested (force_skip_optimize_shards_nesting should be used instead)
- optimize_skip_unused_shards if sharding_key has a non-deterministic function (e.g. rand(); note that this does not change anything on the INSERT side)
Details
HEAD: