Skip to content

Fix detection of orphaned emStrong delimiters#2203

Merged
UziTech merged 6 commits intomarkedjs:masterfrom
calculuschild:Fix2202
Sep 14, 2021
Merged

Fix detection of orphaned emStrong delimiters#2203
UziTech merged 6 commits intomarkedjs:masterfrom
calculuschild:Fix2202

Conversation

@calculuschild
Copy link
Copy Markdown
Contributor

Marked version:

3.0.3

Markdown flavor: CommonMark

Description

The check for orphaned delimiters (i.e., to skip over the _ in **em_strong**) looks for an isolated delimiter between two pairs of the opposite delimiter symbol (inside a strong of the other symbol). However, this was inadvertently including matches for at any point in the string, leading to the second _ here being counted as an orphan between the middle two asterisk pairs: _**foo**_ **bar**

This PR makes the orphan checker start from the beginning of the string to keep it from over-eagerly finding orphans.

Contributor

  • Test(s) exist to ensure functionality and minimize regression (if no tests added, list tests covering this PR); or,
  • no tests required for this PR.
  • If submitting new feature, it has been documented in the appropriate places.

Committer

In most cases, this should be a different person than the contributor.

@vercel
Copy link
Copy Markdown

vercel bot commented Sep 9, 2021

This pull request is being automatically deployed with Vercel (learn more).
To see the status of your deployment, click below or on the icon next to each commit.

🔍 Inspect: https://vercel.com/markedjs/markedjs/7DUkk3e967QRWWFt8FxsXq3WLJNm
✅ Preview: https://markedjs-git-fork-calculuschild-fix2202-markedjs.vercel.app

Copy link
Copy Markdown
Member

@UziTech UziTech left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a test for *__foo__* __bar__ as well to make sure that is also fixed

Copy link
Copy Markdown
Contributor

@davisjam davisjam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No REDOS issues added here.

@UziTech
Copy link
Copy Markdown
Member

UziTech commented Sep 9, 2021

@calculuschild does this make this while loop irrelevant since endReg will only match once at the beginning of the line?

while ((match = endReg.exec(maskedSrc)) != null) {

@calculuschild
Copy link
Copy Markdown
Contributor Author

@calculuschild does this make this while loop irrelevant since endReg will only match once at the beginning of the line?

No. The ^ beginning of line only applies to the "orphan" checker. https://regexper.com/#%5ECheckOrphan%7CotherwiseCase1%7CotherwiseCase2

The other matches need to keep looping through until we find a valid end delimiter run.

@calculuschild
Copy link
Copy Markdown
Contributor Author

calculuschild commented Sep 9, 2021

Hmm. Hold up. I just realized this case will still be broken when a false orphan occurs later in the line:

_**foo**_ **bar** _**foo**_

Needs a bit more work.

Edit: This is actually fine. False alarm; Typo in the test file.

@calculuschild
Copy link
Copy Markdown
Contributor Author

Ah, nevermind. Seems to work just fine. Updated the tests to include those cases though for future reference.

@UziTech
Copy link
Copy Markdown
Member

UziTech commented Sep 9, 2021

This seems to break _**foo_bar**_

Edit: actually it looks like it is already broken but still after this pr
demo

@calculuschild
Copy link
Copy Markdown
Contributor Author

This seems to break _**foo_bar**_

I've got a fix for that too.

@calculuschild
Copy link
Copy Markdown
Contributor Author

calculuschild commented Sep 9, 2021

@UziTech Using a lookahead so the orphan checker doesn't actually consume that last set of asterisks/underscores (similar to the way the rest of the regex works) which was preventing detection of the ending delimiter.

Added tests for both the asterisk and underscore case.

@UziTech
Copy link
Copy Markdown
Member

UziTech commented Sep 10, 2021

It looks like *__foo*bar__* is not supposed to work the same way.

demo

I'm ok with calling that unspecified behavior (since it is not good markdown) but we probably shouldn't have a test for it.

@calculuschild
Copy link
Copy Markdown
Contributor Author

Huh. That's really odd.

Ok, I have removed that test.

Copy link
Copy Markdown
Member

@UziTech UziTech left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work!

@UziTech UziTech merged commit 7792adc into markedjs:master Sep 14, 2021
@github-actions
Copy link
Copy Markdown

🎉 This PR is included in version 3.0.4 🎉

The release is available on:

Your semantic-release bot 📦🚀

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Wrong Parsing when mixing asterisk and underline

4 participants