git: worktree, optimize infiles function for very large repos#1853
git: worktree, optimize infiles function for very large repos#1853pjbgf merged 3 commits intogo-git:releases/v5.xfrom
Conversation
There was a problem hiding this comment.
Pull request overview
This PR optimizes the inFiles function for very large repositories by replacing a linear search (O(n*m) complexity) with a map-based lookup (O(n+m) complexity). The optimization is particularly beneficial when dealing with large numbers of changes and file filters during git reset operations.
Changes:
- Replaced slice-based linear search in
inFileswith O(1) map lookup - Added
buildFilePathMaphelper function to create file path maps with cleaned paths - Pre-allocated
removedFilesslice capacity inresetIndexfor better memory efficiency
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| } | ||
|
|
||
| func inFiles(files []string, v string) bool { | ||
| // inFiles checks if the given file is in the list of files. The incoming filepaths in files should be cleaned before calling this function. |
There was a problem hiding this comment.
The comment states "The incoming filepaths in files should be cleaned before calling this function", but this is misleading. The function already cleans the input parameter v on line 442, and the map parameter files is already built with cleaned paths by buildFilePathMap. Consider revising the comment to clarify that the map is expected to contain pre-cleaned paths (which is handled by buildFilePathMap), and that the function will clean the lookup key v internally.
| // inFiles checks if the given file is in the list of files. The incoming filepaths in files should be cleaned before calling this function. | |
| // inFiles checks if the given file is in the list of files. The files map is expected | |
| // to contain pre-cleaned paths (as produced by buildFilePathMap), and this function | |
| // will clean the lookup key v before checking for membership. |
|
Is it possible to have a benchmark or is there an existing benchmark that get better when using this PR? |
@cedric-appdirect I agree benchmarking performance changes is a good practice. On this case specifically I'm slighly less worried due to the change itself. That being said, I'm planning to have some level of performance regression tests into CI in the near future. |
This PR contains the following updates: | Package | Change | [Age](https://docs.renovatebot.com/merge-confidence/) | [Confidence](https://docs.renovatebot.com/merge-confidence/) | |---|---|---|---| | [github.com/go-git/go-git/v5](https://github.com/go-git/go-git) | `v5.16.5` -> `v5.17.0` |  |  | --- ### Release Notes <details> <summary>go-git/go-git (github.com/go-git/go-git/v5)</summary> ### [`v5.17.0`](https://github.com/go-git/go-git/releases/tag/v5.17.0) [Compare Source](go-git/go-git@v5.16.5...v5.17.0) #### What's Changed - build: Update module github.com/go-git/go-git/v5 to v5.16.5 \[SECURITY] (releases/v5.x) by [@​go-git-renovate](https://github.com/go-git-renovate)\[bot] in [#​1839](go-git/go-git#1839) - git: worktree, optimize infiles function for very large repos by [@​k-anshul](https://github.com/k-anshul) in [#​1853](go-git/go-git#1853) - git: Add strict checks for supported extensions by [@​pjbgf](https://github.com/pjbgf) in [#​1861](go-git/go-git#1861) - backport, git: Improve Status() speed with new index.ModTime check by [@​cedric-appdirect](https://github.com/cedric-appdirect) in [#​1862](go-git/go-git#1862) - storage: filesystem, Avoid overwriting loose obj files by [@​pjbgf](https://github.com/pjbgf) in [#​1864](go-git/go-git#1864) **Full Changelog**: <go-git/go-git@v5.16.5...v5.17.0> </details> --- ### Configuration 📅 **Schedule**: Branch creation - Between 12:00 AM and 03:59 AM ( * 0-3 * * * ) (UTC), Automerge - Between 12:00 AM and 03:59 AM ( * 0-3 * * * ) (UTC). 🚦 **Automerge**: Disabled by config. Please merge this manually once you are satisfied. ♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox. 🔕 **Ignore**: Close this PR and you won't be reminded about this update again. --- - [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check this box --- This PR has been generated by [Renovate Bot](https://github.com/renovatebot/renovate). <!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiI0My41LjAiLCJ1cGRhdGVkSW5WZXIiOiI0My41LjAiLCJ0YXJnZXRCcmFuY2giOiJtYWluIiwibGFiZWxzIjpbIktpbmQvRGVwZW5kZW5jeVVwZGF0ZSIsInJ1bi1lbmQtdG8tZW5kLXRlc3RzIl19--> Reviewed-on: https://code.forgejo.org/forgejo/runner/pulls/1410 Reviewed-by: Mathieu Fenniak <[email protected]> Co-authored-by: Renovate Bot <[email protected]> Co-committed-by: Renovate Bot <[email protected]>
This PR contains the following updates: | Package | Type | Update | Change | OpenSSF | |---|---|---|---|---| | [github.com/go-git/go-git/v5](https://github.com/go-git/go-git) | require | minor | `v5.16.5` → `v5.17.0` | [](https://securityscorecards.dev/viewer/?uri=github.com/go-git/go-git) | --- >⚠️ **Warning** > > Some dependencies could not be looked up. Check the Dependency Dashboard for more information. --- ### Release Notes <details> <summary>go-git/go-git (github.com/go-git/go-git/v5)</summary> ### [`v5.17.0`](https://github.com/go-git/go-git/releases/tag/v5.17.0) [Compare Source](go-git/go-git@v5.16.5...v5.17.0) #### What's Changed - build: Update module github.com/go-git/go-git/v5 to v5.16.5 \[SECURITY] (releases/v5.x) by [@​go-git-renovate](https://github.com/go-git-renovate)\[bot] in [#​1839](go-git/go-git#1839) - git: worktree, optimize infiles function for very large repos by [@​k-anshul](https://github.com/k-anshul) in [#​1853](go-git/go-git#1853) - git: Add strict checks for supported extensions by [@​pjbgf](https://github.com/pjbgf) in [#​1861](go-git/go-git#1861) - backport, git: Improve Status() speed with new index.ModTime check by [@​cedric-appdirect](https://github.com/cedric-appdirect) in [#​1862](go-git/go-git#1862) - storage: filesystem, Avoid overwriting loose obj files by [@​pjbgf](https://github.com/pjbgf) in [#​1864](go-git/go-git#1864) **Full Changelog**: <go-git/go-git@v5.16.5...v5.17.0> </details> --- ### Configuration 📅 **Schedule**: Branch creation - At 12:00 AM through 04:59 AM and 10:00 PM through 11:59 PM, Monday through Friday ( * 0-4,22-23 * * 1-5 ), Only on Sunday and Saturday ( * * * * 0,6 ) (UTC), Automerge - At any time (no schedule defined). 🚦 **Automerge**: Disabled by config. Please merge this manually once you are satisfied. ♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox. 🔕 **Ignore**: Close this PR and you won't be reminded about this update again. --- - [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check this box --- This PR has been generated by [Renovate Bot](https://github.com/renovatebot/renovate). <!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiI0My4xNS4yIiwidXBkYXRlZEluVmVyIjoiNDMuMTUuMiIsInRhcmdldEJyYW5jaCI6Im1haW4iLCJsYWJlbHMiOlsiS2luZC9EZXBlbmRlbmNpZXMiXX0=--> Reviewed-on: https://altlinux.space/stapler/stplr/pulls/333 Co-authored-by: Renovate Bot <[email protected]> Co-committed-by: Renovate Bot <[email protected]>
Backport for #1852.
Closes #1845.