Multiple paths (disks/volumes) for storing temporary data support #8750
Ok. BTW, do you consider the option to use existing |
At first I went this way (and just has |
Got it, I want to have separate default disk path (since default selector uses |
@alexey-milovidov So, the question is, what is preferable? I did the same thing for the distributed engine (not submitted yet), by adding another separate |
Not sure that I understand this completely, but looks like it's not an issue:
Probably one issue is that |
The issue is that there is |
Always using |
Ok
Looks like the following is failing in upstream too:
And |
e1a5f07 to cf4e9fd
Rebased ( |
Looks like the test failures are not related to these changes, or am I missing something?
Hm,
@alexey-milovidov so |
So after looking into this again, I'm not sure which way this should go:
IOW, maybe this should be left as-is? @alexey-milovidov
So, the issue is that after we select a disk, we have to append tmp to its path, but that is too error-prone because we have to add it in multiple places: external aggregation, external sorting, joins, distributed tables... Do I understand correctly?
Ok, now it looks reasonable.
Yes
Ok. Let's proceed with the first variant of implementation.
This patch adds the <tmp_policy> config directive, which defines the policy to use for storing temporary files. If it is not set (the default), <tmp_path> is used.
Also, tmp_policy has some limitations:
- move_factor is ignored
- keep_free_space_bytes is ignored
- max_data_part_size_bytes is ignored
- must have exactly one volume
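The fallback and the limitations listed in the commit message can be illustrated with a small Python sketch. This is not ClickHouse code: the dictionary layout (`disks`, `policies`, `volumes`) and the function name `resolve_tmp_paths` are assumptions made only for illustration. Only the names `tmp_policy`, `tmp_path`, `move_factor`, `keep_free_space_bytes`, and `max_data_part_size_bytes` come from the commit message.

```python
# Hypothetical in-memory config layout; not ClickHouse's actual parser or API.
IGNORED_SETTINGS = ("move_factor", "keep_free_space_bytes", "max_data_part_size_bytes")


def resolve_tmp_paths(config):
    """Return (paths, ignored): the directories to use for temporary files and
    the names of settings that are present but ignored for tmp_policy."""
    policy_name = config.get("tmp_policy")
    if policy_name is None:
        # Default case: no policy is set, so the single tmp_path is used.
        return [config["tmp_path"]], []

    policy = config["policies"][policy_name]
    volumes = policy["volumes"]
    if len(volumes) != 1:
        raise ValueError(
            f"tmp_policy {policy_name!r} must have exactly one volume, got {len(volumes)}"
        )
    (volume,) = volumes.values()
    disks = [config["disks"][name] for name in volume["disks"]]

    # Report ignored settings wherever they appear (policy, volume, or disk).
    scopes = [policy, volume, *disks]
    ignored = sorted({key for scope in scopes for key in IGNORED_SETTINGS if key in scope})
    return [disk["path"] for disk in disks], ignored
```

With this layout, a config without `tmp_policy` resolves to its `tmp_path` alone, while a policy with one volume spanning several disks resolves to every disk path in that volume. A policy with two volumes is rejected.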
cf4e9fd to c9cc1ef
Rebased to include #8790 (to make the tests pass)
225a819 to 4f86861
Changelog category (leave one):
Changelog entry (up to few sentences, required except for Non-significant/Documentation categories):
Support storage policy (<tmp_policy>) for storing temporary data.
Follow-up for: #4918