KAFKA-9373: Improve shutdown performance via lazy accessing the offset and time indices#7900
Closed
efeg wants to merge 2 commits intoapache:trunkfrom
Closed
KAFKA-9373: Improve shutdown performance via lazy accessing the offset and time indices#7900efeg wants to merge 2 commits intoapache:trunkfrom
efeg wants to merge 2 commits intoapache:trunkfrom
Conversation
…ndices. KAFKA-7283 enabled lazy mmap on index files by initializing indices on-demand rather than performing costly disk/memory operations when creating all indices on broker startup. This helped reducing the startup time of brokers. However, segment indices are still created on closing segments, regardless of whether they needs to be closed or not. This patch: * Improves shutdown performance via lazy accessing the offset and time indices. * Eliminates redundant disk accesses and memory mapped operations while deleting or renaming files that back segment indices. * Prevents illegal accesses to underlying indices of a closed segment, which would lead to memory leaks due to recreation of the underlying memory mapped objects.
ijuma
reviewed
Jan 7, 2020
| * leaks due to recreation of underlying memory mapped object. | ||
| * | ||
| * Methods of this class are thread safe. Make sure to check `AbstractIndex` subclasses documentation | ||
| * to establish their thread safety. |
Member
There was a problem hiding this comment.
Why has the thread safety comment been removed?
Member
|
Nice optimization. I think we should consider delegating to the underlying implementations instead of adding logic to the wrapper. |
Member
|
@efeg Any thoughts on the above? |
Contributor
Author
|
@ijuma Thanks for the review and sorry for the delay on addressing the comments! -- I will try to update the PR by next week. |
Member
|
@efeg thanks, looking forward to it. |
308c917 to
9984743
Compare
3 tasks
Member
ijuma
added a commit
to ijuma/kafka
that referenced
this pull request
Mar 26, 2020
…ndexes KAFKA-7283 enabled lazy mmap on index files by initializing indices on-demand rather than performing costly disk/memory operations when creating all indices on broker startup. This helped reducing the startup time of brokers. However, segment indices are still created on closing segments, regardless of whether they need to be closed or not. This is a cleaned up version of apache#7900, which was submitted by @efeg. It eliminates unnecessary disk accesses and memory map operations while deleting, renaming or closing offset and time indexes. In a cluster with 31 brokers, where each broker has 13K to 20K segments, @efeg and team observed up to 2 orders of magnitude faster LogManager shutdown times - i.e. dropping the LogManager shutdown time of each broker from 10s of seconds to 100s of milliseconds. To avoid confusion between `renameTo` and `setFile`, I replaced the latter with the more restricted updateParentDir` (it turns out that's all we need). Co-authored-by: Adem Efe Gencer <[email protected]> Co-authored-by: Ismael Juma <[email protected]>
ijuma
added a commit
that referenced
this pull request
Mar 26, 2020
…ndexes (#8346) KAFKA-7283 enabled lazy mmap on index files by initializing indices on-demand rather than performing costly disk/memory operations when creating all indices on broker startup. This helped reducing the startup time of brokers. However, segment indices are still created on closing segments, regardless of whether they need to be closed or not. This is a cleaned up version of #7900, which was submitted by @efeg. It eliminates unnecessary disk accesses and memory map operations while deleting, renaming or closing offset and time indexes. In a cluster with 31 brokers, where each broker has 13K to 20K segments, @efeg and team observed up to 2 orders of magnitude faster LogManager shutdown times - i.e. dropping the LogManager shutdown time of each broker from 10s of seconds to 100s of milliseconds. To avoid confusion between `renameTo` and `setFile`, I replaced the latter with the more restricted updateParentDir` (it turns out that's all we need). Reviewers: Jun Rao <[email protected]>, Andrew Choi <[email protected]> Co-authored-by: Adem Efe Gencer <[email protected]> Co-authored-by: Ismael Juma <[email protected]>
Member
|
Thanks for the PR, we merged it via #8346. |
Contributor
Author
|
Thanks for the updated PR @ijuma! |
efeg
pushed a commit
to linkedin/kafka
that referenced
this pull request
Jun 1, 2020
…ng unnecessary loading of indexes (apache#8346) TICKET = LI_DESCRIPTION = KAFKA-7283 enabled lazy mmap on index files by initializing indices on-demand rather than performing costly disk/memory operations when creating all indices on broker startup. This helped reducing the startup time of brokers. However, segment indices are still created on closing segments, regardless of whether they need to be closed or not. This is a cleaned up version of apache#7900, which was submitted by @efeg. It eliminates unnecessary disk accesses and memory map operations while deleting, renaming or closing offset and time indexes. In a cluster with 31 brokers, where each broker has 13K to 20K segments, @efeg and team observed up to 2 orders of magnitude faster LogManager shutdown times - i.e. dropping the LogManager shutdown time of each broker from 10s of seconds to 100s of milliseconds. To avoid confusion between `renameTo` and `setFile`, I replaced the latter with the more restricted updateParentDir` (it turns out that's all we need). Reviewers: Jun Rao <[email protected]>, Andrew Choi <[email protected]> Co-authored-by: Adem Efe Gencer <[email protected]> Co-authored-by: Ismael Juma <[email protected]> EXIT_CRITERIA = HASH [222726d]
gitlw
pushed a commit
to linkedin/kafka
that referenced
this pull request
Jun 12, 2020
…ng unnecessary loading of indexes (apache#8346) TICKET = LI_DESCRIPTION = KAFKA-7283 enabled lazy mmap on index files by initializing indices on-demand rather than performing costly disk/memory operations when creating all indices on broker startup. This helped reducing the startup time of brokers. However, segment indices are still created on closing segments, regardless of whether they need to be closed or not. This is a cleaned up version of apache#7900, which was submitted by @efeg. It eliminates unnecessary disk accesses and memory map operations while deleting, renaming or closing offset and time indexes. In a cluster with 31 brokers, where each broker has 13K to 20K segments, @efeg and team observed up to 2 orders of magnitude faster LogManager shutdown times - i.e. dropping the LogManager shutdown time of each broker from 10s of seconds to 100s of milliseconds. To avoid confusion between `renameTo` and `setFile`, I replaced the latter with the more restricted updateParentDir` (it turns out that's all we need). Reviewers: Jun Rao <[email protected]>, Andrew Choi <[email protected]> Co-authored-by: Adem Efe Gencer <[email protected]> Co-authored-by: Ismael Juma <[email protected]> EXIT_CRITERIA = HASH [222726d]
gitlw
pushed a commit
to linkedin/kafka
that referenced
this pull request
Jun 13, 2020
…ng unnecessary loading of indexes (apache#8346) TICKET = LI_DESCRIPTION = KAFKA-7283 enabled lazy mmap on index files by initializing indices on-demand rather than performing costly disk/memory operations when creating all indices on broker startup. This helped reducing the startup time of brokers. However, segment indices are still created on closing segments, regardless of whether they need to be closed or not. This is a cleaned up version of apache#7900, which was submitted by @efeg. It eliminates unnecessary disk accesses and memory map operations while deleting, renaming or closing offset and time indexes. In a cluster with 31 brokers, where each broker has 13K to 20K segments, @efeg and team observed up to 2 orders of magnitude faster LogManager shutdown times - i.e. dropping the LogManager shutdown time of each broker from 10s of seconds to 100s of milliseconds. To avoid confusion between `renameTo` and `setFile`, I replaced the latter with the more restricted updateParentDir` (it turns out that's all we need). Reviewers: Jun Rao <[email protected]>, Andrew Choi <[email protected]> Co-authored-by: Adem Efe Gencer <[email protected]> Co-authored-by: Ismael Juma <[email protected]> EXIT_CRITERIA = HASH [222726d]
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
KAFKA-7283 enabled lazy mmap on index files by initializing indices on-demand rather than performing costly disk/memory operations when creating all indices on broker startup. This helped reducing the startup time of brokers. However, segment indices are still created on closing segments, regardless of whether they need to be closed or not.
This patch:
In our evaluations in a cluster with 31 brokers, where each broker has 13K to 20K segments, we observed up to 2 orders of magnitude faster LogManager shutdown times with this patch -- i.e. dropping the LogManager shutdown time of each broker from 10s of seconds to 100s of milliseconds.
Committer Checklist (excluded from commit message)