tsdb: Find the last series ID on startup from the last series id file and WAL scan#18333
Conversation
Signed-off-by: Rushabh Mehta <[email protected]>
Signed-off-by: Rushabh Mehta <[email protected]>
98ce6d0 to
fc1fca8
Compare
Signed-off-by: Rushabh Mehta <[email protected]>
|
The failing CI linter test is due to a recent push by somebody else to a file I have not touched. |
Signed-off-by: Rushabh Mehta <[email protected]>
codesome
left a comment
There was a problem hiding this comment.
Good work! Most of the comments are just styling nits.
| state, err := h.readSeriesState() | ||
| if err != nil { | ||
| h.logger.Warn("Failed to read series state file, falling back to slow ID initialization", "err", err) | ||
| } else if state != nil { |
There was a problem hiding this comment.
Don't need the else
| } else if state != nil { | |
| } | |
| if state != nil { |
| if h.opts.EnableFastStartup { | ||
| state, err := h.readSeriesState() | ||
| if err != nil { | ||
| h.logger.Warn("Failed to read series state file, falling back to slow ID initialization", "err", err) |
There was a problem hiding this comment.
Let's keep it simple and only do fast startup if this file exists. Let's not keep an option to scan all files for series ID for the initial implementation of this feature.
| h.logger.Warn("Failed to read series state file, falling back to slow ID initialization", "err", err) | |
| h.logger.Warn("Failed to read series state file, skipping the fast startup", "err", err) |
There was a problem hiding this comment.
I am assuming once this feature is ready to go, we will change this up so that a new instance can infact make use of it.
| h.lastSeriesID.Store(state.LastSeriesID) | ||
| h.logger.Info("Fast startup: clean shutdown detected, restored last series ID", "id", state.LastSeriesID) | ||
| } else { | ||
| h.logger.Info("Fast startup: unclean shutdown detected, performing bounded reverse scan", "from_segment", endAt, "to_segment", state.LastWALSegment) |
There was a problem hiding this comment.
| h.logger.Info("Fast startup: unclean shutdown detected, performing bounded reverse scan", "from_segment", endAt, "to_segment", state.LastWALSegment) | |
| h.logger.Info("Fast startup: unclean shutdown detected, performing WAL scan", "from_segment", endAt, "to_segment", state.LastWALSegment) |
| } else { | ||
| h.logger.Info("Fast startup: unclean shutdown detected, performing bounded reverse scan", "from_segment", endAt, "to_segment", state.LastWALSegment) | ||
| if err := h.findLastSeriesID(state, endAt); err != nil { | ||
| h.logger.Warn("Bounded reverse scan failed, falling back to slow ID initialization", "err", err) |
There was a problem hiding this comment.
| h.logger.Warn("Bounded reverse scan failed, falling back to slow ID initialization", "err", err) | |
| h.logger.Error("Fast startup: WAL scan failed, skipping fast startup", "err", err) |
| if err := h.findLastSeriesID(state, endAt); err != nil { | ||
| h.logger.Warn("Bounded reverse scan failed, falling back to slow ID initialization", "err", err) | ||
| } else { | ||
| h.logger.Info("Fast startup: bounded reverse scan completed", "id", h.lastSeriesID.Load()) |
There was a problem hiding this comment.
| h.logger.Info("Fast startup: bounded reverse scan completed", "id", h.lastSeriesID.Load()) | |
| h.logger.Info("Fast startup: WAL scan completed", "last_series_id", h.lastSeriesID.Load()) |
| require.Equal(t, uint64(2), head.lastSeriesID.Load(), "Bounded scan should find the highest ID (2) from segment 1") | ||
| } | ||
|
|
||
| func TestHead_FastStartup_CleanShutdown(t *testing.T) { |
There was a problem hiding this comment.
The WAL replay is still running so we can't really test if we are loading the right series id using the new mechanism. We can drop this test for now and come back to it later to test the mechanism end-to-end.
There was a problem hiding this comment.
That is true, this test is better for later.
| // Get the current max segment number. | ||
| endSegment, _, err := head.wal.LastSegmentAndOffset() | ||
| require.NoError(t, err) | ||
| require.Equal(t, 1, endSegment) |
There was a problem hiding this comment.
I am not sure if the first file is 0 or 1. How about use wlog.Segments to get the first and last segment?
| require.Equal(t, expectedState, *state, "read state should match written state") | ||
| } | ||
|
|
||
| func TestHead_FindLastSeriesIDBounded(t *testing.T) { |
There was a problem hiding this comment.
I'd add another case here: before adding metric="B", add another sample for metric="A" in the second segment and then findLastSeriesID() on two files should give 1, and after adding metric=B the same call (scan on both files) should give 2 as the last ID.
| func TestHead_FindLastSeriesIDBounded(t *testing.T) { | |
| func TestHead_FindLastSeriesID(t *testing.T) { |
| require.Equal(t, 0, state.LastWALSegment, "LastWALSegment should remain 0") | ||
| } | ||
|
|
||
| func TestHead_ReadSeriesState(t *testing.T) { |
There was a problem hiding this comment.
| func TestHead_ReadSeriesState(t *testing.T) { | |
| func TestHead_ReadSeriesStateFile(t *testing.T) { |
…move test Signed-off-by: Rushabh Mehta <[email protected]>
Signed-off-by: Rushabh Mehta <[email protected]>
|
@codesome I think I have fixed all the nits + the changes you requested, have another look. This is looking good so far, thanks for your detailed reviews. |
codesome
left a comment
There was a problem hiding this comment.
LGTM! Just a couple of nits and one more test case and we are ready to merge.
| h.lastSeriesID.Store(state.LastSeriesID) | ||
| h.logger.Info("Fast startup: clean shutdown detected, restored last series ID", "last_series_id", state.LastSeriesID) | ||
| } else { | ||
| h.logger.Info("Fast startup: unclean shutdown detected, performing WAL scan", "from_segment", endAt, "to_segment", state.LastWALSegment) |
There was a problem hiding this comment.
| h.logger.Info("Fast startup: unclean shutdown detected, performing WAL scan", "from_segment", endAt, "to_segment", state.LastWALSegment) | |
| h.logger.Info("Fast startup: unclean shutdown detected, performing WAL scan", "from_segment", state.LastWALSegment, "to_segment", endAt) |
| // Scanning both files should now return 2 | ||
| id, err = head.findLastSeriesID(mockState, last) | ||
| require.NoError(t, err) | ||
| require.Equal(t, uint64(2), id, "Should find ID 2 after new series was created in segment 2") |
There was a problem hiding this comment.
Let's add one more case duplicating these lines but with mockState.LastWALSegment = last. It should still return 2 because on unclean shutdown it should still scan the last file.
| var highestID chunks.HeadSeriesRef | ||
| var found bool | ||
|
|
||
| // Read the segment forwards |
There was a problem hiding this comment.
| // Read the segment forwards | |
| // Read the segment forwards. |
Signed-off-by: Rushabh Mehta <[email protected]>
|
Thanks for the merge! Onto the final PR of this idea. |
Which issue(s) does the PR fix:
NA
Does this PR introduce a user-facing change?