Conversation
|
Workflow [PR], commit [b5ca921] Summary: ❌
|
| QueryPipeline pipeline = QueryPipeline(std::move(chain)); | ||
|
|
||
| pipeline.setNumThreads(max_insert_threads); | ||
| pipeline.setNumThreads(max_threads); |
There was a problem hiding this comment.
Would this change make the max_insert_threads setting not work as expected? Or is the reasoning that this is not a INSERT SELECT so we should not use the max_insert_threads setting?
There was a problem hiding this comment.
lets define what max_insert_threads does first.
max_insert_threads makes pipeline more concurrent, that it, no more. It has really bad naming. The threads count should be defined by max_threads.
There was a problem hiding this comment.
This line has been introduced by me and by mistake. The existed tests cover only insert-select queries, so I missed it. In short future I will remove/rename max_insert_threads setting
There was a problem hiding this comment.
yes, I see that it is not in the insert-select code path. Thanks for explaining.
| create table testX (A Int64) engine=MergeTree order by tuple(); | ||
|
|
||
| create materialized view testXA engine=MergeTree order by tuple() as select sleep(0.1) from testX; | ||
| create materialized view testXB engine=MergeTree order by tuple() as select sleep(0.2), throwIf(A=1) from testX; |
There was a problem hiding this comment.
What is the purpose of throwIf(A=1)?
There was a problem hiding this comment.
It would fail with exception when A = 1.
Not strictly necessary here, But I prefer when different mv behave differently in tests cases.
Cherry pick #88339 to 25.7: fix threads count for inserts
Cherry pick #88339 to 25.8: fix threads count for inserts
Cherry pick #88339 to 25.9: fix threads count for inserts
Backport #88339 to 25.7: fix threads count for inserts
Backport #88339 to 25.8: fix threads count for inserts
Backport #88339 to 25.9: fix threads count for inserts
| QueryPipeline pipeline = QueryPipeline(std::move(chain)); | ||
|
|
||
| pipeline.setNumThreads(max_insert_threads); | ||
| pipeline.setNumThreads(max_threads); |
There was a problem hiding this comment.
@CheSema Should we respect parallel_view_processing as well and use max_threads only if parallel_view_processing=1?
There was a problem hiding this comment.
No, we should not, parallel_view_processing just turning max_insert_threads into 1 when materilized views are involved. parallel_view_processing was added to avoid some bugs with deduplication which are fixed now.
It would be more likely, but in general there is no guaranty about order even with 1 thread. |
- decrease max threads from 10 to 7 - increase the number of rows from 50 to 200 Both of these changes should increase the chance that the query can utilize all threads. Furthermore, we also accept if the query utilizes one less thread. It is still more than `max_insert_threads`, so it still effectively checks that `max_insert_threads` doesn't constrain the number of threads. See #88339 (comment) for more information.
It is bad for performance. We should use
max_threadsfor all inserts.Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):
...
Documentation entry for user-facing changes