Optimize memory consumption during backup to S3 #45188
vitlibar merged 5 commits into ClickHouse:master
Conversation
Force-pushed 1f8e6d0 to 39153b0
I like that approach. It is actually the best way when we upload a file from a local disk to S3: when you copy a file, you know its size. Why do you still use an exponentially growing part upload size instead of some fixed size? And when you copy from S3 to S3, why do you prefer seekable reading from S3 over UploadPartCopyRequest? My concern is the following: with seekable reading you read a file twice, first to calculate the SHA hash and second to upload it to S3. Since this read is made with a fixed buffer size, memory consumption is low. Still, I think it is worth doing with the existing code in order not to have one more implementation based on UploadPartCopyRequest.
Yes you're right, that code was copied from
I don't quite understand. I don't prefer seekable reading over
Sorry. Indeed, you don't. You have a dedicated implementation for S3-to-S3 with UploadPartCopyRequest. This is great.
I did that. Now an uploaded file is split into parts of the same size.
Force-pushed 87e4f70 to a934fbf
…S3 to decrease memory consumption.
Correctly use the AbortMultipartUpload request. Support std::ios_base::end in StdStreamBufFromReadBuffer::seekpos().
Force-pushed a934fbf to 1c84518
```cpp
{
    task->req = fillUploadPartRequest(part_number, part_offset, part_size);
    ...
    schedule([this, task, task_finish_notify]()
```
Help me, please, to understand one thing here: how many concurrent uploads are run? Is it a limited thread pool?
My concern is that it runs a lot of tasks, each of which makes a new connection. It could run out of available file descriptors or hit another limit inside ClickHouse.
Compared with WriteBufferFromS3: WriteBufferFromS3 uses threads to write the data it has consumed, and if consuming is not fast, it just uploads in the background with one thread. However, WriteBufferFromS3 is also able to produce a lot of tasks. Here the situation is different: a lot of tasks are started at once, and each allocates at least 1 MiB (buffer) plus a new connection.
It's normally used with IOThreadPool. That pool is limited, of course: by default it runs up to 100 threads, so at most 100 parallel uploads. In practice the number must be even smaller, because IOThreadPool is also used for other tasks.
In fact it should use its own thread pools, backups_thread_pool and restores_thread_pool, so I changed the code to use the correct thread pools. Those thread pools are created specifically for backups and restores.
```cpp
for (size_t part_number = 1; position < end_position; ++part_number)
{
    if (multipart_upload_aborted)
```
Isn't it better to also check multipart_upload_aborted right before processUploadPartRequest, in order to stop work as fast as possible if a non-retryable error occurs?
My concern is that the code has to wait for all tasks that have been scheduled. If 1000 tasks were scheduled and only 100 of them are running, each finishing with a connection timeout, it would take 10 × timeout to finish the upload with an error.
It seems like the queue size is pretty big:

```cpp
auto queue_size = config.getUInt(".threadpool_writer_queue_size", 1000000);
```

and all tasks are pushed to the scheduler's queue at once. If the first task fails, there is no need to run the rest.
> Isn't it better to also check multipart_upload_aborted right before processUploadPartRequest?

Done, I changed the code.
Force-pushed 633d795 to 1a680b0
The test
… WriteBufferFromS3RequestsErrors Ref: ClickHouse#45188
```cpp
StdStreamFromReadBuffer(std::unique_ptr<ReadBuffer> buf, size_t size) : Base(&stream_buf), stream_buf(std::move(buf), size) { }
StdStreamFromReadBuffer(ReadBuffer & buf, size_t size) : Base(&stream_buf), stream_buf(buf, size) { }
```
I have some questions about this.

Base(&stream_buf) will call the following constructor:

```cpp
template <class _CharT, class _Traits>
basic_streambuf<_CharT, _Traits>::basic_streambuf()
    : __binp_(nullptr),
      __ninp_(nullptr),
      __einp_(nullptr),
      __bout_(nullptr),
      __nout_(nullptr),
      __eout_(nullptr)
{
}

template <class _CharT, class _Traits>
basic_streambuf<_CharT, _Traits>::basic_streambuf(const basic_streambuf& __sb)
    : __loc_(__sb.__loc_),
      __binp_(__sb.__binp_),
      __ninp_(__sb.__ninp_),
      __einp_(__sb.__einp_),
      __bout_(__sb.__bout_),
      __nout_(__sb.__nout_),
      __eout_(__sb.__eout_)
{
}
```

So `__ninp_` and `__einp_` are nullptr. But std::iostream will call sgetc() to get a char, so we will always call underflow(), which does not advance the position:

```cpp
inline _LIBCPP_HIDE_FROM_ABI_AFTER_V1
int_type sgetc() {
    if (__ninp_ == __einp_)
        return underflow();
    return traits_type::to_int_type(*__ninp_);
}
```
Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
Optimize memory consumption during backup to S3: files are now copied to S3 directly, without using WriteBufferFromS3 (which could use a lot of memory). This PR fixes #43083.