Skip to content

Distributed execution: better split tasks#87508

Merged
scanhex12 merged 42 commits intoClickHouse:masterfrom
scanhex12:distributed_execution_better_spread
Nov 14, 2025
Merged

Distributed execution: better split tasks#87508
scanhex12 merged 42 commits intoClickHouse:masterfrom
scanhex12:distributed_execution_better_spread

Conversation

@scanhex12
Copy link
Member

@scanhex12 scanhex12 commented Sep 23, 2025

Changelog category (leave one):

  • Performance Improvement

Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):

Distributed execution: better split tasks by row groups IDs, not by files.

Documentation entry for user-facing changes

  • Documentation is written (mandatory for new features)

@scanhex12 scanhex12 marked this pull request as draft September 23, 2025 13:34
@clickhouse-gh
Copy link
Contributor

clickhouse-gh bot commented Sep 23, 2025

Workflow [PR], commit [e0d5e8d]

Summary:

job_name test_name status info comment
Stateless tests (amd_binary, ParallelReplicas, s3 storage, sequential) failure
03711_deduplication_blocks_part_log FAIL cidb
Integration tests (amd_asan, old analyzer, 2/6) failure
test_external_http_authenticator/test.py::test_user_from_config_basic_auth_pass FAIL cidb
test_external_http_authenticator/test.py::test_user_create_basic_auth_pass FAIL cidb
test_external_http_authenticator/test.py::test_basic_auth_failed FAIL cidb
test_external_http_authenticator/test.py::test_header_failed FAIL cidb
test_external_http_authenticator/test.py::test_session_settings_from_auth_response FAIL cidb
BuzzHouse (amd_debug) failure
Buzzing result failure cidb
BuzzHouse (amd_ubsan) failure
Buzzing result failure cidb
Performance Comparison (amd_release, master_head, 3/6) failure
Check Results failure

@clickhouse-gh clickhouse-gh bot added the pr-improvement Pull request with some product improvements label Sep 23, 2025
@alexey-milovidov
Copy link
Member

alexey-milovidov commented Sep 24, 2025

@scanhex12,

It should be implemented in the following way, not related to row groups or data lakes:

  • define a setting, desired job size in bytes;
  • add two new fields to the protocol for distributed files processing: bucket num and bucket size;
  • when a file is larger than this threshold, we pass the bucket number from 0 to N-1 for processing;
  • the bucket number defines a deterministic split of the file into smaller parts for processing;
  • however, this split will depend on the actual structure of the file, and sometimes certain buckets could be empty or the distribution can be uneven;
  • this will work for Parquet, ORC, and could be extended to any other formats.

The coordinator knows file sizes and, therefore, knows how many buckets each file will need. Then distributes the work for each bucket in each file. We can say "virtual buckets" to better understand the concept, because the actual split of the file and the way how it is split is abstracted.

@scanhex12 scanhex12 force-pushed the distributed_execution_better_spread branch from d200cb8 to f5f4240 Compare September 25, 2025 22:10
@scanhex12 scanhex12 marked this pull request as ready for review September 25, 2025 22:10
@Avogar Avogar self-assigned this Sep 26, 2025
@scanhex12 scanhex12 force-pushed the distributed_execution_better_spread branch from a3d7980 to fe3b556 Compare October 7, 2025 13:33
@divanik divanik self-assigned this Oct 9, 2025
Copy link
Member

@Avogar Avogar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a few small comments and I am ready to approve

Copy link
Member

@Avogar Avogar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work! Just some very minor final suggestions

@scanhex12
Copy link
Member Author

scanhex12 commented Nov 14, 2025

Latest times measurements on integration test with parquet v3 (first time is bucket-level splitting):

Check times 0.5894033908843994, 0.6704092025756836

@scanhex12 scanhex12 enabled auto-merge November 14, 2025 21:51
@scanhex12 scanhex12 added this pull request to the merge queue Nov 14, 2025
Merged via the queue into ClickHouse:master with commit 4bed2ad Nov 14, 2025
123 of 130 checks passed
@scanhex12 scanhex12 deleted the distributed_execution_better_spread branch November 14, 2025 22:11
@robot-clickhouse-ci-1 robot-clickhouse-ci-1 added the pr-synced-to-cloud The PR is synced to the cloud repo label Nov 14, 2025
@alesapin alesapin added pr-performance Pull request with some performance improvements and removed pr-improvement Pull request with some product improvements labels Nov 25, 2025
zvonand pushed a commit to Altinity/ClickHouse that referenced this pull request Dec 16, 2025
…ion_better_spread

Distributed execution: better split tasks
zvonand pushed a commit to Altinity/ClickHouse that referenced this pull request Dec 17, 2025
…ion_better_spread

Distributed execution: better split tasks
zvonand pushed a commit to Altinity/ClickHouse that referenced this pull request Dec 19, 2025
…ion_better_spread

Distributed execution: better split tasks
zvonand added a commit to Altinity/ClickHouse that referenced this pull request Dec 22, 2025
Antalya 25.8.12 Backport of ClickHouse#87508: Distributed execution: better split tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

pr-performance Pull request with some performance improvements pr-synced-to-cloud The PR is synced to the cloud repo

Projects

None yet

Development

Successfully merging this pull request may close these issues.

9 participants