RIGHT and FULL JOIN for MergeJoin #12118

Merged
4ertus2 merged 11 commits into ClickHouse:master from 4ertus2:joins
Jul 10, 2020

@4ertus2
Contributor

@4ertus2 4ertus2 commented Jul 3, 2020

I hereby agree to the terms of the CLA available at: https://yandex.ru/legal/cla/?lang=en

Changelog category (leave one):

  • New Feature

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
Support RIGHT and FULL JOIN when join_algorithm=partial_merge is set. Only ALL strictness is supported (ANY, SEMI, ANTI, and ASOF are not).

Detailed description / Documentation draft:
During the equi-join phase, MergeJoin lazily builds a used-rows bitmap for each right block (kept in memory, ~1 bit per row). After the equi-join phase, a special stream appends the rows that were not joined. Some common logic is extracted from the HashJoin additional stream into a common NonJoined class and reused.
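
The two phases described above can be illustrated with a minimal, self-contained sketch. This is not the code from this PR: `RightBlock`, `joinPhase`, and `nonJoinedPhase` are hypothetical names, a block is reduced to a single sorted key column, the lookup uses a binary search instead of a real merge cursor, and missing sides are shown as `std::nullopt` rather than ClickHouse's actual defaults/NULL handling. It only shows the idea of a lazily allocated per-block bitmap followed by a pass that emits the unused right rows.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical stand-in for one sorted right-side block: a single key column.
struct RightBlock
{
    std::vector<int64_t> keys;  // sorted ascending
    std::vector<bool> used;     // stays empty until the first match (lazy), ~1 bit per row

    void markUsed(size_t row)
    {
        if (used.empty())
            used.assign(keys.size(), false);  // allocate the bitmap on first use
        used[row] = true;
    }

    bool isUsed(size_t row) const { return !used.empty() && used[row]; }
};

// (left key, right key); a missing side is std::nullopt in this sketch.
using JoinedRow = std::pair<std::optional<int64_t>, std::optional<int64_t>>;

// Phase 1: equi-join with ALL strictness (every matching pair is emitted).
// Matched right rows are recorded in the block's bitmap.
// keep_unmatched_left = true corresponds to FULL JOIN, false to RIGHT JOIN.
std::vector<JoinedRow> joinPhase(
    const std::vector<int64_t> & left,
    std::vector<RightBlock> & right_blocks,
    bool keep_unmatched_left)
{
    std::vector<JoinedRow> out;
    for (int64_t key : left)
    {
        bool matched = false;
        for (auto & block : right_blocks)
        {
            auto range = std::equal_range(block.keys.begin(), block.keys.end(), key);
            for (auto it = range.first; it != range.second; ++it)
            {
                block.markUsed(static_cast<size_t>(it - block.keys.begin()));
                out.emplace_back(key, *it);
                matched = true;
            }
        }
        if (!matched && keep_unmatched_left)
            out.emplace_back(key, std::nullopt);
    }
    return out;
}

// Phase 2: the "non-joined" pass. Every right row whose bit was never set
// is appended with an empty left side.
std::vector<JoinedRow> nonJoinedPhase(const std::vector<RightBlock> & right_blocks)
{
    std::vector<JoinedRow> out;
    for (const auto & block : right_blocks)
        for (size_t row = 0; row < block.keys.size(); ++row)
            if (!block.isUsed(row))
                out.emplace_back(std::nullopt, block.keys[row]);
    return out;
}
```

In this sketch, a RIGHT JOIN result would be the output of `joinPhase(left, blocks, false)` followed by `nonJoinedPhase(blocks)`; passing `true` instead gives the FULL JOIN shape. A block that never produces a match keeps an empty bitmap, so all of its rows are emitted in the second pass.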

@blinkov blinkov added the doc-alert and pr-feature labels Jul 3, 2020
@nikitamikhaylov nikitamikhaylov self-assigned this Jul 6, 2020
@nikitamikhaylov
Member

@4ertus2 Please look at the perf test. Why can't we run the test SELECT COUNT() FROM ints l RIGHT JOIN ints r USING i64 WHERE i32 = 20042 on the old server?
https://github.com/ClickHouse/ClickHouse/blob/master/docker/test/performance-comparison/README.md#partial-queries

@akuzm
Contributor

akuzm commented Jul 10, 2020

> @4ertus2 Please look at the perf test. Why can't we run the test SELECT COUNT() FROM ints l RIGHT JOIN ints r USING i64 WHERE i32 = 20042 on the old server?
> https://github.com/ClickHouse/ClickHouse/blob/master/docker/test/performance-comparison/README.md#partial-queries

left-server-log.log says:

2020.07.10 08:39:29.281734 [ 315 ] {joins_in_memory_pmj.query24.prewarm0} <Error> executeQuery: Code: 48, e.displayText() = DB::Exception: Not supported. PartialMergeJoin supports LEFT and INNER JOINs kinds. (version 20.6.1.4040 (official build)) (from [::1]:40738) (in query: SELECT COUNT() FROM ints l RIGHT JOIN ints r USING i64 WHERE i32 = 20042), Stack trace (when copying this message, always include the lines below):

Probably I should show these messages somewhere...
The variance is bad but it's universally bad for these queries anyway.

@4ertus2
Contributor Author

4ertus2 commented Jul 10, 2020

This PR adds RIGHT JOIN support for partial merge join, so it's expected that a version without it cannot run a query with this JOIN.

@4ertus2 4ertus2 merged commit 6b26842 into ClickHouse:master Jul 10, 2020
traceon added a commit to traceon/ClickHouse that referenced this pull request Jul 11, 2020