Fix infinite loop in GetPreExistingChunksIdsAsync when records exceed MaxTopCount #7311
Merged
adamsitnik merged 3 commits into main on Feb 16, 2026
Co-authored-by: adamsitnik <[email protected]>
Copilot AI changed the title from "[WIP] Fix bugs in GetPreExistingChunksIdsAsync implementation" to "Fix infinite loop in GetPreExistingChunksIdsAsync when records exceed MaxTopCount" on Feb 16, 2026
roji approved these changes on Feb 16, 2026
src/Libraries/Microsoft.Extensions.DataIngestion/Writers/VectorStoreWriter.cs
Pull request overview
This PR fixes an infinite loop bug in GetPreExistingChunksIdsAsync that occurred when a document had more than 1,000 (MaxTopCount) pre-existing chunks. The method was repeatedly fetching the same records without proper pagination.
Changes:
- Add `Skip = keys.Count` parameter to `GetAsync` call to properly paginate through results
- Add comprehensive test with 2,500 chunks to verify pagination across multiple batches works correctly
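The failure mode and the fix can be illustrated with a small simulation. This is a minimal sketch in Python, not the actual C# code from `VectorStoreWriter.cs`: `get_records`, `collect_keys`, and the `max_calls` guard are hypothetical names invented for the illustration, and the loop shape (stop on a short page) is an assumption about how the method paginates.

```python
MAX_TOP_COUNT = 1000  # mirrors the MaxTopCount limit named in the PR


def get_records(store, top, skip=0):
    """Hypothetical stand-in for a GetAsync-style call: returns one page of keys."""
    return store[skip:skip + top]


def collect_keys(store, use_skip, max_calls=10):
    """Collect keys page by page; max_calls keeps the demo from looping forever."""
    keys = []
    for calls in range(1, max_calls + 1):
        # The fix: skip the records already collected. Without it, skip stays 0.
        skip = len(keys) if use_skip else 0
        page = get_records(store, MAX_TOP_COUNT, skip)
        keys.extend(page)
        if len(page) < MAX_TOP_COUNT:
            return keys, calls  # short page: nothing left to fetch
    return None, max_calls  # never saw a short page: the real loop would not end


store = [f"chunk-{i}" for i in range(2500)]
fixed_keys, fixed_calls = collect_keys(store, use_skip=True)
buggy_keys, buggy_calls = collect_keys(store, use_skip=False)

print(fixed_calls, len(fixed_keys))  # 3 2500
print(buggy_keys, buggy_calls)       # None 10
```

With 2,500 records the skipping variant finishes in three calls (1,000 + 1,000 + 500), while the variant without a skip keeps receiving the same full first page and never reaches the termination condition.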
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| src/Libraries/Microsoft.Extensions.DataIngestion/Writers/VectorStoreWriter.cs | Adds options: new() { Skip = keys.Count } parameter to GetAsync call to properly skip already-fetched records during pagination |
| test/Libraries/Microsoft.Extensions.DataIngestion.Tests/Writers/VectorStoreWriterTests.cs | Adds test IncrementalIngestion_WithManyRecords_DeletesAllPreExistingChunks that creates 2500 chunks to verify pagination works correctly when records exceed MaxTopCount of 1000 |
`GetPreExistingChunksIdsAsync` would loop infinitely when a document had more pre-existing chunks than `MaxTopCount` (1,000), repeatedly fetching and adding the same records without pagination.
Changes
- Add `options: new() { Skip = keys.Count }` to `GetAsync` to properly paginate through results
- Add test `IncrementalIngestion_WithManyRecords_DeletesAllPreExistingChunks` that creates 2,500 chunks to verify pagination across multiple batches