Multiple disks/volumes for storing data for send in Distributed engine#8756

Merged
alexey-milovidov merged 2 commits into ClickHouse:master from azat:distributed_storage_configuration
Jan 25, 2020

@azat azat commented Jan 20, 2020

Changelog category (leave one):

  • New Feature

Changelog entry (up to few sentences, required except for Non-significant/Documentation categories):
Multiple disks/volumes for storing data for send in Distributed engine

Detailed description (optional):

Usage example:

CREATE TABLE foo (key Int) Engine=Distributed(test_shard_localhost, currentDatabase(), some_table, key%2, 'default');

Follow-up: #4918

v2: use storage_policy instead of introducing a separate distributed_storage_policy
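
A minimal sketch of how the new argument might be used with a non-default policy. The policy name `jbod_policy` and the table name `foo_jbod` are hypothetical and do not come from this PR. The policy would have to be defined in the server's storage configuration before the table is created. The first query assumes the `system.storage_policies` table is available on the server.

```sql
-- List the storage policies known to the server, to confirm the name exists.
SELECT policy_name, volume_name, disks
FROM system.storage_policies;

-- 'jbod_policy' is a hypothetical policy name used for illustration only.
-- The fifth argument selects the policy used for storing data pending send.
CREATE TABLE foo_jbod (key Int)
ENGINE = Distributed(test_shard_localhost, currentDatabase(), some_table, key % 2, 'jbod_policy');
```

Passing `'default'`, as in the usage example above, should keep the pending data on the default disk.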

@filimonov filimonov added the comp-multidisk Storages & storage policies label Jan 21, 2020
@azat azat force-pushed the distributed_storage_configuration branch 4 times, most recently from 9b920e7 to 8b4b65d January 24, 2020 04:28
@azat azat changed the title Multiple disks/volumes for storing data for send in Distributed engine [WIP] Multiple disks/volumes for storing data for send in Distributed engine Jan 24, 2020

azat commented Jan 24, 2020

#8750 has been merged; this PR is rebased against master (WIP dropped)

azat commented Jan 24, 2020

  • PVS check - Unable to start the analysis on this file.
  • Integration tests -- test_cluster_copier (does not look related to this change)

@azat azat force-pushed the distributed_storage_configuration branch from 8b4b65d to bfe6931 January 25, 2020 11:07

azat commented Jan 25, 2020

Integration tests -- test_cluster_copier (does not look related to this change)

Even though I cannot reproduce it when running manually, it fails consistently on CI; digging into the logs

azat added 2 commits January 25, 2020 20:52
Distributed() now has a 5th argument -- the policy name (for storing
data to send):

  CREATE TABLE foo (key Int) Engine=Distributed(test_shard_localhost, currentDatabase(), some_table, key%2, 'default');
@azat azat force-pushed the distributed_storage_configuration branch from bfe6931 to 5c641b7 January 25, 2020 17:53

azat commented Jan 25, 2020

2020.01.25 17:51:31.841341 [ 1 ] {} ClusterCopier: Will retry: Code: 49, e.displayText() = DB::Exception: Disk path must ends with '/', but '/var/log/clickhouse-server/copier/clickhouse-copier_20200125175131_66' doesn't., Stack trace (when copying this message, always include the lines below):

Fixed

(It passed in my setup because of stale build objects)

@alexey-milovidov alexey-milovidov merged commit 2df93a6 into ClickHouse:master Jan 25, 2020
@azat azat deleted the distributed_storage_configuration branch January 26, 2020 08:17
@CurtizJ CurtizJ added the pr-feature Pull request with new product feature label Jan 30, 2020
azat added a commit to azat/ClickHouse that referenced this pull request Mar 29, 2020
azat added a commit to azat/ClickHouse that referenced this pull request Apr 19, 2020
…ibuted sends

After ClickHouse#8756 the problem with 1 thread for each (distributed table, disk)
for distributed sends became even worse (since there can be multiple
disks), so use a predefined thread pool for these tasks, which can be
controlled with the background_distributed_schedule_pool_size knob.
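
As a rough illustration, the current value of that knob could be inspected as shown below. This assumes the setting is exposed through `system.settings` on the server version in use. On other versions it may be configured elsewhere and the query may return no rows.

```sql
-- Show the configured pool size and whether it differs from the default.
SELECT name, value, changed
FROM system.settings
WHERE name = 'background_distributed_schedule_pool_size';
```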
@alexey-milovidov alexey-milovidov self-assigned this Jun 28, 2023