Fix function execution over const and LowCardinality with GROUP BY const for analyzer#59986
Conversation
|
This is an automated comment for commit 0e9c648 with description of existing statuses. It's updated for the latest CI running ❌ Click here to open a full report in a separate page Successful checks
|
7c2b8c9 to
9a6c9e5
Compare
9a6c9e5 to
ae839af
Compare
|
AFAIU the tests fail because of |
ae839af to
6bf203c
Compare
|
Not all of them, need to take a more careful look. |
|
@azat What do you mean? Both tests in |
6bf203c to
a84dca3
Compare
Yeah, you are right, I guess this comment was for some old failure, that had been fixed in Anyway, I've fixed the remaining tests, let's see. |
|
…nst for analyzer
Consider the following example:
SELECT concatWithSeparator('|', 'a', concatWithSeparator('|', CAST('x', 'LowCardinality(String)'))) GROUP BY 'a'
Under analyzer it fails, UBsan report:
==15121==WARNING: MemorySanitizer: use-of-uninitialized-value
...
8 0x5555601880ed in void DB::FormatStringImpl::format<true, false>() /src/ch/clickhouse/src/Functions/formatString.h:125:21
9 0x55556017aeb8 in void DB::FormatStringImpl::formatExecute<>() /src/ch/clickhouse/src/Functions/formatString.h:30:13
10 0x555560196779 in DB::()::ConcatWithSeparatorImpl<>::executeImpl() const /src/ch/clickhouse/src/Functions/concatWithSeparator.cpp:151:9
11 0x55555a2ad5b7 in DB::FunctionToExecutableFunctionAdaptor::executeImpl() const /src/ch/clickhouse/src/Functions/IFunctionAdaptor.h:21:26
12 0x555584312297 in DB::IExecutableFunction::executeWithoutLowCardinalityColumns() const /src/ch/clickhouse/src/Functions/IFunction.cpp:249:15
13 0x555584317640 in DB::IExecutableFunction::executeWithoutSparseColumns() const /src/ch/clickhouse/src/Functions/IFunction.cpp:283:24
14 0x55558431bf5c in DB::IExecutableFunction::execute() const /src/ch/clickhouse/src/Functions/IFunction.cpp:380:16
15 0x555587bf3e20 in DB::executeAction() /src/ch/clickhouse/src/Interpreters/ExpressionActions.cpp:613:60
Uninitialized value was created by a heap allocation
...
6 0x55558b1c1a05 in DB::ColumnString::reserve(unsigned long) /src/ch/clickhouse/src/Columns/ColumnString.cpp:494:13
7 0x55558980095d in DB::prepareOutputBlockColumns() /src/ch/clickhouse/src/Interpreters/AggregationUtils.cpp:32:25
The problem is that during query analysis
(QueryAnalyzer::resolveFunction()), the return value of the function had
been executed as LowCardinality(String), but the 'a' argument that is
passed to the concatWithSeparator() is not-const, because it had been
reused from the GROUP BY step, and this causes UB, since column 'a' does
not have enough rows (it should have 2 rows, since LowCardinality always
contains the default, while it has only 1).
v2: fix GROUPING SETs
Signed-off-by: Azat Khuzhin <[email protected]>
Now it preserves the original header in case of GROUP BY const, though not for remote queries. Signed-off-by: Azat Khuzhin <[email protected]>
Fixes:
select toNullable(os_name) AS os_name, count() from (SELECT CAST('iphone' AS Enum8('iphone' = 1, 'android' = 2)) AS os_name) group by os_name WITH TOTALS settings allow_experimental_analyzer=1
Code: 10. DB::Exception: Received from localhost:9000. DB::Exception: Not found column __table1.os_name: in block toNullable(__table1.os_name) Nullable(Enum8('iphone' = 1, 'android' = 2)) Nullable(size = 0, Int8(size = 0), UInt8(size = 0)), count() UInt64 UInt64(size = 0). (NOT_FOUND_COLUMN_IN_BLOCK)
Signed-off-by: Azat Khuzhin <[email protected]>
Signed-off-by: Azat Khuzhin <[email protected]>
|
Test failures does related
Looking into it |
- 00757_enum_defaults - TOTALS - 02699_polygons_sym_difference_rollup - TOTALS - 02579_fill_empty_chunk - GROUP BY constX with arrayJoin(constX) Signed-off-by: Azat Khuzhin <[email protected]>
…imizer_bugs The behaviour of GROUP BY const in analyzer is slightly different, it still preserves the const property from the subqueries, and that's why it simply do not execute filters after GROUP BY, but simply one time for the const. The initial bug was about predicate pushdown not about GROUP BY const, so I will update the test. I've also tested the initial bug with fiddle and the new test is idential: - 19.17 (without the fix) - https://fiddle.clickhouse.com/312a48f4-6a5b-411d-a4bb-4b1a757effaf - 20.2 (with the fix) - https://fiddle.clickhouse.com/4e1ad2e5-1c03-4ce7-9ac3-e878ccfae83a Refs: ClickHouse#8503 Refs: ClickHouse#5682 Signed-off-by: Azat Khuzhin <[email protected]>
a84dca3 to
0e9c648
Compare
DetailsSELECT
2 AS x,
arrayJoin([NULL, NULL, NULL])
GROUP BY
GROUPING SETS (
(0),
([NULL, NULL, NULL]))
ORDER BY x ASC
SETTINGS allow_experimental_analyzer = 1, enable_positional_arguments = 0
Query id: 9b7f209d-f10a-4f35-9815-0c7fc76c9862
┌─x─┬─arrayJoin([NULL, NULL, NULL])─┐
│ 2 │ ᴺᵁᴸᴸ │
│ 2 │ ᴺᵁᴸᴸ │
│ 2 │ ᴺᵁᴸᴸ │
└───┴───────────────────────────────┘
┌─x─┬─arrayJoin([NULL, NULL, NULL])─┐
│ 2 │ ᴺᵁᴸᴸ │
│ 2 │ ᴺᵁᴸᴸ │
│ 2 │ ᴺᵁᴸᴸ │
└───┴───────────────────────────────┘
SELECT
2 AS x,
arrayJoin([NULL, NULL, NULL])
GROUP BY
GROUPING SETS (
(0),
([NULL, NULL, NULL]))
ORDER BY x ASC
SETTINGS allow_experimental_analyzer = 0, enable_positional_arguments = 0
Query id: 4ddcfaa0-6f06-45e6-aab1-96606536c536
┌─x─┬─arrayJoin([NULL, NULL, NULL])─┐
│ 2 │ ᴺᵁᴸᴸ │
│ 2 │ ᴺᵁᴸᴸ │
│ 2 │ ᴺᵁᴸᴸ │
└───┴───────────────────────────────┘
|
|
|
@novikd, @azat looks like this PR has introduced a crash that happens quite often in the CI: Examples: |
|
@tavplubix most likely fixed with #61717 |
Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
Fix function execution over const and LowCardinality with GROUP BY const for analyzer
Consider the following example:
Under analyzer it fails, UBsan report:
The problem is that during query analysis
(QueryAnalyzer::resolveFunction()), the return value of the function had been executed as LowCardinality(String), but the 'a' argument that is passed to the concatWithSeparator() is not-const, because it had been reused from the GROUP BY step, and this causes UB, since column 'a' does not have enough rows (it should have 2 rows, since LowCardinality always contains the default, while it has only 1).
So to address this we need to take column as-is for aggregation.
Cc: @kitaisreal
Fixes: #59918
Fixes: #59684
Fixes: #59779