Crash in IN function where columns have different types and many columns are involved #89367
Query: the query matches (Account, Symbol) against (Symbol, Account), that is, UInt32 against String. Some (not all) Symbol values can be cast to UInt32. Backtrace: the version is close to recent master.
|
I have a repro, but it is not a small one, so I don't think it makes sense to create a test based on it.
|
Hello, |
|
Hello @KochetovNicolai , |
|
Workflow [PR], commit [5d4d21b] Summary: ❌
|
{
    /// We cannot afford filtering rows because
    /// sortBlock relies on equal number of rows in all columns
    transformed_set_columns[set_element_index] = std::move(nullable_set_column);
This does not look correct to me. I think the code is written in a way that tuple columns and PK columns must have the same type, and we have to apply a cast one way or another.
The proper solution seems to be creating a common nullmask, because if one tuple component can't be converted to the desired PK type, the whole tuple can't match.
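The common-nullmask idea from this comment can be sketched roughly as follows. This is a standalone illustration, not ClickHouse code: `NullableColumnSketch`, `buildCommonNullMask`, and `dropMaskedRows` are hypothetical names, and plain `std::vector` stands in for the real column classes. The point is that a row is dropped from every tuple component if any component failed to convert, so all columns keep the same number of rows.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical stand-in for a nullable column: values plus a per-row null flag.
struct NullableColumnSketch
{
    std::vector<std::uint32_t> values;
    std::vector<bool> is_null; // true where the conversion to the key type failed
};

// A row of the tuple is unusable if ANY of its components is null.
std::vector<bool> buildCommonNullMask(const std::vector<NullableColumnSketch> & columns)
{
    if (columns.empty())
        return {};

    const std::size_t rows = columns.front().is_null.size();
    std::vector<bool> common(rows, false);

    for (const auto & column : columns)
    {
        // All tuple components must describe the same rows.
        assert(column.is_null.size() == rows);
        for (std::size_t row = 0; row < rows; ++row)
            if (column.is_null[row])
                common[row] = true;
    }
    return common;
}

// Remove the masked rows from every column using the same mask,
// so the columns still have equal sizes afterwards.
void dropMaskedRows(std::vector<NullableColumnSketch> & columns, const std::vector<bool> & mask)
{
    for (auto & column : columns)
    {
        NullableColumnSketch kept;
        for (std::size_t row = 0; row < mask.size(); ++row)
        {
            if (!mask[row])
            {
                kept.values.push_back(column.values[row]);
                kept.is_null.push_back(false);
            }
        }
        column = std::move(kept);
    }
}
```

Filtering with one shared mask, rather than filtering each column by its own null flags, is what keeps the row counts aligned for any later step that sorts the columns together.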
Hello @KochetovNicolai , thank you for looking into this.
What we are doing here is exactly applying a cast:
    ColumnPtr nullable_set_column = castColumnAccurateOrNull({set_column, set_element_type, {}}, key_column_type);
    ...
    if (set_columns.size() > 1)
    {
        transformed_set_columns[set_element_index] = std::move(nullable_set_column);
        continue;
What we are not doing is applying a filter.
Does it make sense?
Regarding 'creating a common nullmask', I agree, it looks like a mature approach. I hesitated to go this way because it makes the fix a bit more complex.
If you like, I can do this in another PR or switch to it in this PR.
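To illustrate the "cast but do not filter" behavior described in this reply, here is a minimal standalone sketch. It does not use the real `castColumnAccurateOrNull`; `parseUInt32OrNull` and `castColumnOrNull` are hypothetical helpers, and `std::optional` plays the role of a nullable value. Values that cannot be represented as UInt32 become null, but the output keeps one entry per input row.

```cpp
#include <cassert>
#include <charconv>
#include <cstdint>
#include <optional>
#include <string>
#include <system_error>
#include <vector>

// Hypothetical analogue of an "accurate or null" String -> UInt32 cast:
// returns a value only if the whole string is a number that fits into UInt32.
std::optional<std::uint32_t> parseUInt32OrNull(const std::string & text)
{
    std::uint32_t value = 0;
    const char * begin = text.data();
    const char * end = begin + text.size();

    const auto [ptr, ec] = std::from_chars(begin, end, value);
    if (ec != std::errc{} || ptr != end)
        return std::nullopt; // not a number, trailing garbage, or out of range
    return value;
}

// Cast every row; failed rows become null instead of being removed,
// so the result always has the same number of rows as the input.
std::vector<std::optional<std::uint32_t>> castColumnOrNull(const std::vector<std::string> & column)
{
    std::vector<std::optional<std::uint32_t>> result;
    result.reserve(column.size());
    for (const auto & text : column)
        result.push_back(parseUInt32OrNull(text));
    return result;
}
```

Because no rows are removed, every tuple component produced this way has the same length, which is the property the code comment about `sortBlock` depends on.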
|
Hello @KochetovNicolai, one of the test failures is caused by this PR; it must be listed in parallel_replicas_blacklist.txt. The others seem unrelated.
33a028c
Crash in IN function where columns have different types and many columns are involved
25.8.12 Backport of ClickHouse#89367: Crash in IN function where columns have different types and many columns are involved
25.8.13 Backport of ClickHouse#89367: Crash in IN function where columns have different types and many columns are involved
25.8.15 Backport of ClickHouse#89367: Crash in IN function where columns have different types and many columns are involved
…in_in_function 24.8.14 Backport of ClickHouse#89367 - Crash in IN function where columns have different types and many columns are involved
Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):
Possible crash/undefined behavior in the IN function when primary key column types differ from the column types on the right side of IN. Example: SELECT string_column, int_column FROM test_table WHERE (string_column, int_column) IN (SELECT '5', 'not a number'). The problem appears when many rows are selected and some rows contain incompatible types.