Step over surrogate pairs on zero-lenth matches (fixes #100134)#100482
Merged
alexdima merged 1 commit intomicrosoft:masterfrom Jun 22, 2020
Merged
Step over surrogate pairs on zero-lenth matches (fixes #100134)#100482alexdima merged 1 commit intomicrosoft:masterfrom
alexdima merged 1 commit intomicrosoft:masterfrom
Conversation
IllusionMH
commented
Jun 18, 2020
| // we attempt to recover from that by advancing by one | ||
| this._searchRegex.lastIndex += 1; | ||
| // we attempt to recover from that by advancing by two if surrogate pair found and by one otherwise | ||
| if (strings.getNextCodePoint(text, textLength, this._searchRegex.lastIndex) > 0xFFFF) { |
Contributor
Author
There was a problem hiding this comment.
This one looks closest alternative to "missing" isSurrogatePair helper (without recreating logic that involves isHighSurrogate and isLowSurrogate)
Member
|
Looks great! Thank you! |
Contributor
Author
|
@alexdima thank you for review! Could you please provide more insights why when there is large number of matches in file (no in string with current match) it will re-evaluate matches ( Had no problem with small files with few matches - Find next can step over emoji and doesn't trigger |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to subscribe to this conversation on GitHub.
Already have an account?
Sign in.
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
This PR fixes #100134
Problem lies in fact that regexp with
uflag will changelastIndexto the start of surrogate pair if started after high pair:Which ended attempts to step forward for zero-length matches in endless loop.
This PR takes into account surrogate pairs and propagates
lastIndexaccordingly.Endless loop during find matches is fixed in this PR.
Not in this PR:
In some cases (for files with a lot of matches) it's impossible to use Find next button to get to the next match after surrogate pair. Can give a try in this PR if needed, but would prefer to leave it out of scope of this PR
(new fallback logic not triggered in that case, but should use similar strategy - remember last position and advance for zero-length matches, but this should survive
reset)What do you think?
Inconsistency of handling matches of for surrogate pairs between JS RegExp's with
uflag and ripgrepI can create issue with more details but simple example will be for file with 1💻 replace using RegExp
1.- Find in files breaks surrogate, while replace in editor replaces whole emoji