Skip to content

Initial explain#11873

Merged
alexey-milovidov merged 27 commits intomasterfrom
initial-explain
Jul 7, 2020
Merged

Initial explain#11873
alexey-milovidov merged 27 commits intomasterfrom
initial-explain

Conversation

@KochetovNicolai
Copy link
Member

@KochetovNicolai KochetovNicolai commented Jun 22, 2020

I hereby agree to the terms of the CLA available at: https://yandex.ru/legal/cla/?lang=en

Changelog category (leave one):

  • New Feature

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
Initial implementation of EXPLAIN query. Syntax: EXPLAIN SELECT .... This fixes #1118.

Usage:

EXPLAIN [AST | SYNTAX | PLAN | PIPELINE] [setting = value, ...] SELECT ... [FORMAT ...]

EXPLAIN AST

Dump query AST.

EXPLAIN AST SELECT 1

SelectWithUnionQuery (children 1)
 ExpressionList (children 1)
  SelectQuery (children 1)
   ExpressionList (children 1)
    Literal UInt64_1

EXPLAIN SYNTAX

Return query after syntax optimisations.

EXPLAIN SYNTAX SELECT * FROM system.numbers AS a, system.numbers AS b, system.numbers AS c

SELECT 
    `--a.number` AS `a.number`,
    `--b.number` AS `b.number`,
    number AS `c.number`
FROM 
(
    SELECT 
        number AS `--a.number`,
        b.number AS `--b.number`
    FROM system.numbers AS a
    CROSS JOIN system.numbers AS b
) AS `--.s`
CROSS JOIN system.numbers AS c

EXPLAIN PLAN

Dump query plan steps.

EXPLAIN SELECT sum(number) FROM numbers(10) GROUP BY number % 4

Union
  Expression (Projection)
    Expression (Before ORDER BY and SELECT)
      Aggregating
        Expression (Before GROUP BY)
          ReadFromStorage (Read from SystemNumbers)

Settings:

  • header - print output header for step. Default: 0.
  • description - print step description. Default: 1.
  • actions - print detailed information about step actions. Default: 0.

EXPLAIN PIPELINE

EXPLAIN PIPELINE SELECT sum(number) FROM numbers_mt(100000) GROUP BY number % 4

(Union)
Converting
  (Expression)
  ExpressionTransform
    (Expression)
    ExpressionTransform
      (Aggregating)
      Resize 321
        AggregatingTransform × 32
          (Expression)
          ExpressionTransform × 32
            (ReadFromStorage)
            NumbersMt × 32 01

Settings:

  • header - print header for each output port. Default: 0.
  • graph - use DOT graph description language. Default: 0.
  • compact - print graph in compact mode if graph is enabled. Default: 1.

@KochetovNicolai
Copy link
Member Author

:) explain select sum(number) from numbers(10) union all select sum(number) from numbers(10) order by sum(number)

EXPLAIN SELECT sum(number)
FROM numbers(10)
UNION ALL
SELECT sum(number)
FROM numbers(10)
ORDER BY sum(number) ASC
FORMAT TSV

Union
  Expression (Projection)
    Expression (Before ORDER BY and SELECT)
      Aggregating
        Expression (Before GROUP BY)
          ReadFromStorage (Read from SystemNumbers)
  Expression (Projection)
    MergingSorted (Merge sorted streams before ORDER BY)
      MergeSorting (Merge sorted blocks before ORDER BY)
        PartialSorting (Sort each block before ORDER BY)
          Expression (Before ORDER BY and SELECT)
            Aggregating
              Expression (Before GROUP BY)
                ReadFromStorage (Read from SystemNumbers)

@blinkov blinkov added doc-alert pr-feature Pull request with new product feature labels Jun 22, 2020
@alexey-milovidov
Copy link
Member

MergingSorted (Merge sorted streams before ORDER BY)
MergeSorting (Merge sorted blocks before ORDER BY)
PartialSorting (Sort each block before ORDER BY)

Actually not before ORDER BY but for ORDER BY.

@filimonov
Copy link
Contributor

BTW how about command to show processors pipeline graph in a graphviz format? I saw it in pipeline stuck exception, can it also be shown on demand?

@KochetovNicolai
Copy link
Member Author

BTW how about command to show processors pipeline graph in a graphviz format? I saw it in pipeline stuck exception, can it also be shown on demand?

Yes, I will add it too.

@KochetovNicolai KochetovNicolai marked this pull request as ready for review June 27, 2020 14:03
@alexey-milovidov
Copy link
Member

Integration tests

Broken in #11903

else if (s_syntax.ignore(pos, expected))
kind = ASTExplainQuery::ExplainKind::AnalyzedSyntax;
else if (s_pipeline.ignore(pos, expected))
kind = ASTExplainQuery::ExplainKind::QueryPipeline;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Usage:
EXPLAIN [AST | SYNTAX | PLAN | PIPELINE] [setting = value, ...] SELECT ... [FORMAT ...]

Looks like PLAN is not supported during parsing, or it wasn't not intended to be supported?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've forgotten about it indeed.

@azat
Copy link
Member

azat commented Jul 2, 2020

@KochetovNicolai thoughts on showing max_threads/max_distributed_connections/.. somehow?

i.e.:

SET max_distributed_connections = 10;
explain select * from remote('127.{1,2,3}', system.one);

Union
  Expression (Projection)
    ReadFromStorage x 10 (Read from Distributed)

There were some issues with these settings, when calculation was incorrect will help with debugging and will be useful in tests too

@KochetovNicolai
Copy link
Member Author

KochetovNicolai commented Jul 2, 2020

thoughts on showing max_threads/max_distributed_connections/.. somehow?

Nice idea!
I don't know where to show this information though.
Maybe print it before plan, under another one option. @azat what do you think?

@azat
Copy link
Member

azat commented Jul 2, 2020

Maybe print it before plan, under another one optio

I was thinking about showing it in the form of ReadFromStorage x 10 (Read from Distributed) (x10 part), but showing it before, under option looks good too!

@alexey-milovidov alexey-milovidov self-assigned this Jul 5, 2020
{
WriteBuffer & out;
size_t offset = 0;
const size_t ident = 2;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

indent, not ident

ASTPtr ast_settings;

static String toString(ExplainKind kind)
static String toString(ExplainKind kind, bool old_syntax)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can safely remove the old AST query. It's not for users and no backward compatibility is needed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree about 'we don' t care about backward compatibility fot that'. At the same time Ast and analyze are useful in some (rare) cases (for example to look on rewritten query with joins and lot a of aliases) it if possible it better to keep them.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAICS It is used only for printing formatted query

  • ANALYZE = EXPLAIN SYNTAX
  • AST = EXPLAIN AST

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ANALYZE = EXPLAIN SYNTAX
AST = EXPLAIN AST

It's correct.

I mainly kept old syntax because it was used in tests. I probably may rewrite all that tests and remove compatibility. I would rather do it in a separate pr.

@alexey-milovidov
Copy link
Member

01193_metadata_loading

Flaky test.

@alexey-milovidov alexey-milovidov merged commit b4c2c49 into master Jul 7, 2020
@alexey-milovidov alexey-milovidov deleted the initial-explain branch July 7, 2020 22:59
bool first = true;
for (const auto & action : expression->getActions())
{
settings.out << prefix << (first ? "Actions: "
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@KochetovNicolai EXPLAIN actions=1 ... you can see row-level policies, while regular user cannot query them w/o access_management:

EXPLAIN actions = 1                                                                                                                                                                   SELECT *      
FROM prewhere_filter 
┌─explain───────────────────────────────────────────────────┐
│ Union                                                     │
│   Expression (Projection)                                 │
│   Actions: PROJECT a, b                                   │
│     Expression (Before ORDER BY and SELECT)               │
│     Actions: PROJECT a, b                                 │
│       Filter (Row-level security filter)                  │         
│       Filter column: equals(b, 1)                         │                                                                                                                         │       Actions: PROJECT a, b                               │
│                ADD 1 UInt8 Const(UInt8)                   │                                                                                                                         
│                FUNCTION equals(b, 1) UInt8 = equals(b, 1) │ <-- this
│                REMOVE 1                                   │
│         ReadFromStorage (Read from MergeTree)             │
└───────────────────────────────────────────────────────────┘

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It can be considered normal.
The user can see how data was filtered and the fact that data is being filtered but they cannot see the data that was filtered out.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, but then what is the purpose of restricting access to the system.row_policies/show row policies ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It prevents to show policies for all other users?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right, ok then

size_t count;
};

using Edge = std::vector<EdgeData>;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@KochetovNicolai , does it mean that there could be multiple edges (lines) connecting two nodes in the graph? Do you have such an example query? Thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

pr-feature Pull request with new product feature

Projects

None yet

Development

Successfully merging this pull request may close these issues.

feature request: EXPLAIN output

6 participants