Optimization to skip unused shards for Distributed engine #1

Open
makimat wants to merge 7 commits into master from pull-3592

@makimat commented Nov 21, 2018

This pull request adds the distributed_optimize_skip_select_on_unused_shards setting. When it is enabled, a SELECT from a Distributed table tries to constant-fold the sharding expression using the constraints in the WHERE condition, and determines whether the query can be sent to only a subset of the cluster's nodes.
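To make the idea concrete, here is a minimal sketch in Python rather than the C++ of this pull request. It assumes equally weighted shards, a shard index computed as the sharding expression modulo the shard count, and a WHERE clause that has already been reduced to a finite list of constants for the sharding column. The names shards_for_values and user_id are hypothetical.

```python
from typing import Callable, Iterable, Set


def shards_for_values(
    sharding_expr: Callable[[int], int],
    candidate_values: Iterable[int],
    shard_count: int,
) -> Set[int]:
    """Evaluate the sharding expression for every constant the WHERE clause allows.

    Assumption: shards are equally weighted, so the target shard is simply
    the expression value modulo the number of shards.
    """
    return {sharding_expr(v) % shard_count for v in candidate_values}


# Hypothetical table sharded by user_id over 4 shards.
def by_user_id(user_id: int) -> int:
    return user_id


# WHERE user_id = 42
print(shards_for_values(by_user_id, [42], 4))       # {2}
# WHERE user_id IN (1, 5, 9)
print(shards_for_values(by_user_id, [1, 5, 9], 4))  # {1}
```

In both cases only one of the four shards would need to receive the query.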

This helps to:

  • reduce latency and increase availability, because fewer nodes are involved
  • increase cluster throughput, because nodes don't run queries that will return nothing

It assumes that data is sharded according to the sharding key. That might not be true, which is why the optimization isn't enabled by default.
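A small illustration of why this assumption matters, again as a hedged Python sketch: the shard layout, the user_id key, and the modulo placement rule are all made up for the example. One row was placed on a shard that does not match its key, so the pruned query silently misses it.

```python
SHARD_COUNT = 2


def expected_shard(user_id: int) -> int:
    """Shard that the (assumed) sharding key user_id maps to."""
    return user_id % SHARD_COUNT


# Rows stored per shard. Row 7 sits on shard 0 although 7 % 2 == 1,
# for example because it was inserted into the local table directly.
shards = {0: [2, 4, 7], 1: [1, 3]}


def count_all_shards(user_id: int) -> int:
    """Query every shard, as happens without the optimization."""
    return sum(rows.count(user_id) for rows in shards.values())


def count_pruned(user_id: int) -> int:
    """Query only the shard the sharding key points to."""
    return shards[expected_shard(user_id)].count(user_id)


print(count_all_shards(7), count_pruned(7))  # 1 0
print(count_all_shards(4), count_pruned(4))  # 1 1
```

For correctly placed rows both strategies agree. For the misplaced row the pruned query returns nothing.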

Limitations:

  • handles only filtering by columns, composed with AND and OR; it doesn't handle queries like SELECT ... FROM ... WHERE jumpConsistentHash(x, 2) = 1
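The limitation can be pictured with a toy condition tree. This is an illustration only, not the code from this pull request: the tuple encoding, the "opaque" node type, and the rule that the shard index is the column value modulo the shard count are all assumptions. Equality on the sharding column yields a set of candidate values, AND intersects, OR unions, and anything else is treated as unknown and falls back to querying every shard.

```python
from typing import Any, Optional, Set, Tuple

Cond = Tuple[Any, ...]


def possible_values(cond: Cond, key: str) -> Optional[Set[int]]:
    """Return the values the sharding column may take, or None if unconstrained."""
    op = cond[0]
    if op == "eq":
        _, column, value = cond
        return {value} if column == key else None
    if op == "and":
        left = possible_values(cond[1], key)
        right = possible_values(cond[2], key)
        if left is None:
            return right
        if right is None:
            return left
        return left & right
    if op == "or":
        left = possible_values(cond[1], key)
        right = possible_values(cond[2], key)
        if left is None or right is None:
            return None  # one branch is unconstrained, so the whole OR is
        return left | right
    return None  # unsupported predicate, e.g. a function applied to the column


def shards_to_query(cond: Cond, key: str, shard_count: int) -> Set[int]:
    values = possible_values(cond, key)
    if values is None:
        return set(range(shard_count))  # cannot prune: query every shard
    return {v % shard_count for v in values}


# WHERE x = 1 OR x = 2
print(shards_to_query(("or", ("eq", "x", 1), ("eq", "x", 2)), "x", 4))  # {1, 2}
# WHERE x = 5 AND y = 9
print(shards_to_query(("and", ("eq", "x", 5), ("eq", "y", 9)), "x", 4))  # {1}
# WHERE jumpConsistentHash(x, 2) = 1
print(shards_to_query(("opaque", "jumpConsistentHash(x, 2) = 1"), "x", 4))  # {0, 1, 2, 3}
```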

NB: the last time I wrote C++ was 10 years ago

I hereby agree to the terms of the CLA available at: https://yandex.ru/legal/cla/?lang=en

@softagram-bot
Softagram Impact Report for pull/1 (head commit: 0bc9776)

⭐ Visual Overview

Changed elements and changed dependencies.
[Figure: Changed dependencies]

⭐ Change Impact

How the changed files are used by the rest of the project.
[Figure: Impacted files]

