Multicorn aggregation/grouping pushdown support #1
Merged
mildbyte
reviewed
Dec 14, 2021
"""Convert a list of Multicorn quals to an ElasticSearch query"""
ignore_columns = ignore_columns or []

# Aggregation/grouping queries
mildbyte
reviewed
Dec 14, 2021
    }
}

if aggs is not None:
Can we be in a situation where aggs is None and group_clauses isn't? Is it basically something like SELECT a, b, c FROM T GROUP BY a, b, c which is the same as SELECT DISTINCT a, b, c FROM T?
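The equivalence raised in this question can be checked with a small sketch. It uses Python's built-in sqlite3 module rather than PostgreSQL, and the table T and its rows are made up purely for illustration:

```python
import sqlite3

# In-memory database with a toy table T; the rows are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE T (a INTEGER, b TEXT, c REAL)")
conn.executemany(
    "INSERT INTO T VALUES (?, ?, ?)",
    [(1, "x", 0.5), (1, "x", 0.5), (2, "y", 1.5), (2, "y", 2.5)],
)

# Grouping on every selected column with no aggregate functions...
grouped = set(conn.execute("SELECT a, b, c FROM T GROUP BY a, b, c").fetchall())
# ...yields the same set of rows as SELECT DISTINCT on those columns.
distinct = set(conn.execute("SELECT DISTINCT a, b, c FROM T").fetchall())

# Compare as sets because neither query guarantees a row order.
assert grouped == distinct
```

Both queries collapse the duplicated (1, 'x', 0.5) row, which is why a grouping-only pushdown needs no aggregate functions at all.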
Author
Yes, we can; that is a good example. Here's a concrete one:
sgr@localhost:splitgraph> explain select column5, column4 from es.iris group by column4, column5
+------------------------------------------------------------------------------------------+
| QUERY PLAN                                                                               |
|------------------------------------------------------------------------------------------|
| Foreign Scan (cost=1.00..1.00 rows=1 width=1)                                            |
| Multicorn: Elasticsearch query to <Elasticsearch([{'host': 'es01-test', 'port': 9200}])> |
| Multicorn: Query: {                                                                      |
|     "aggs": {                                                                            |
|         "group_buckets": {                                                               |
|             "composite": {                                                               |
|                 "sources": [                                                             |
|                     {                                                                    |
|                         "column5": {                                                     |
|                             "terms": {                                                   |
|                                 "field": "column5"                                       |
|                             }                                                            |
|                         }                                                                |
|                     },                                                                   |
|                     {                                                                    |
|                         "column4": {                                                     |
|                             "terms": {                                                   |
|                                 "field": "column4"                                       |
|                             }                                                            |
|                         }                                                                |
|                     }                                                                    |
|                 ],                                                                       |
|                 "size": 1000                                                             |
|             }                                                                            |
|         }                                                                                |
|     }                                                                                    |
| }                                                                                        |
+------------------------------------------------------------------------------------------+
EXPLAIN
Time: 0.012s
sgr@localhost:splitgraph> select column5, column4 from es.iris group by column4, column5
+-----------------+---------+
| column5 | column4 |
|-----------------+---------|
| Iris-setosa | 0.1 |
| Iris-setosa | 0.2 |
| Iris-setosa | 0.3 |
| Iris-setosa | 0.4 |
| Iris-setosa | 0.5 |
| Iris-setosa | 0.6 |
| Iris-versicolor | 1.0 |
| Iris-versicolor | 1.1 |
| Iris-versicolor | 1.2 |
| Iris-versicolor | 1.3 |
| Iris-versicolor | 1.4 |
| Iris-versicolor | 1.5 |
| Iris-versicolor | 1.6 |
| Iris-versicolor | 1.7 |
| Iris-versicolor | 1.8 |
| Iris-virginica | 1.4 |
| Iris-virginica | 1.5 |
| Iris-virginica | 1.6 |
| Iris-virginica | 1.7 |
| Iris-virginica | 1.8 |
| Iris-virginica | 1.9 |
| Iris-virginica | 2.0 |
| Iris-virginica | 2.1 |
| Iris-virginica | 2.2 |
| Iris-virginica | 2.3 |
| Iris-virginica | 2.4 |
| Iris-virginica | 2.5 |
+-----------------+---------+
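For reference, the grouping-only query shown in the plan above could be produced by something like the following. This is a minimal sketch, not the code from this PR: build_group_query is a hypothetical helper name, and only the request body shape (a composite aggregation named group_buckets with one terms source per column and size 1000) is taken from the EXPLAIN output.

```python
def build_group_query(group_clauses, size=1000):
    """Hypothetical helper: ES request body for a grouping-only pushdown."""
    # One "terms" value source per grouping column, in the order given.
    sources = [{column: {"terms": {"field": column}}} for column in group_clauses]
    return {
        "aggs": {
            "group_buckets": {
                "composite": {"sources": sources, "size": size}
            }
        }
    }


# Reproduces the body from the plan above for the two iris columns.
query = build_group_query(["column5", "column4"])
```

Because no aggregate functions are requested, the composite aggregation carries no sub-aggregations; each returned bucket key is one distinct (column5, column4) combination.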
mildbyte
approved these changes
Dec 14, 2021
Would be nice to have some unit tests for this ES query handling (converting aggs/group_clauses into ES queries and back).
UPD Dec 21: the tests will live in the splitgraph repo (including checking the ES queries), so this is fine.
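A test of the "and back" direction might look roughly like the sketch below. It assumes the standard Elasticsearch composite aggregation response layout (aggregations, then the aggregation name, then buckets, each with a key object); buckets_to_rows is a hypothetical name, and the sample response is hand-written for illustration rather than captured from a real cluster.

```python
def buckets_to_rows(response, columns):
    """Hypothetical helper: turn composite aggregation buckets into row dicts."""
    buckets = response["aggregations"]["group_buckets"]["buckets"]
    # Each bucket key maps source names (here, column names) to group values.
    return [{column: bucket["key"][column] for column in columns} for bucket in buckets]


# Hand-written response fragment; doc_count values are placeholders.
sample_response = {
    "aggregations": {
        "group_buckets": {
            "buckets": [
                {"key": {"column5": "Iris-setosa", "column4": 0.1}, "doc_count": 1},
                {"key": {"column5": "Iris-setosa", "column4": 0.2}, "doc_count": 1},
            ]
        }
    }
}

rows = buckets_to_rows(sample_response, ["column5", "column4"])
assert rows == [
    {"column5": "Iris-setosa", "column4": 0.1},
    {"column5": "Iris-setosa", "column4": 0.2},
]
```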
Enable aggregation/grouping support offered in Multicorn through the accompanying PR splitgraph/Multicorn#1.
can_pushdown_upperrel with relevant details so that Multicorn can decide whether and what to push to the Python side.
Here are two instructive examples of the translated aggregation queries:
GROUP BY
GROUP BY
CU-1t1wycg