Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix exponential backoff in multiline string regexp #157

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

abgruszecki
Copy link

Hello there!

We've found an issue where emacs would freeze when using scala-mode scrolling to line https://github.com/lampepfl/dotty/blob/master/compiler/src/dotty/tools/dotc/typer/Namer.scala#L1199. If you put the point at the beginning of the trailing """ and call scala-syntax:looking-at-simplePattern-beginning, Emacs will freeze. I've traced this down to the scala-syntax:multiLineStringLiteral-re regexp.

Since the middle of the replaced regexp was a greedy match, it required exponential time to match (or fail to match) a very long literal.

After making * lazy, things no longer freeze.

I tried to make the change as unintrusive as possible. scala-syntax:multiLineStringLiteral-start-re could probably be changed as well. Let me know if I should continue to dig into this, I'd be willing to do it if I had some help.

Since the middle of the replaced regexp was a greedy match,
it required exponential time to match (or fail to match)
a very long literal.

For instance, if we had a `"""` followed by ~700 lines of code
that didn't contain a multine string, trying to match that with the
regexp would not terminate.
@CLAassistant
Copy link

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

@hvesalai
Copy link
Owner

Thank you for the PR. Yes please modify the scala-syntax:multiLineStringLiteral-start-re, it doesn't have to be greedy.

@abgruszecki
Copy link
Author

Just checked, making that regex lazy breaks highlighting. Sample: https://github.com/lampepfl/dotty/blob/master/compiler/test/dotty/tools/vulpix/ParallelTesting.scala#L95. Everything from that line onwards is HL-ed as string. Does this ring a bell? I'll look into it later.

@hvesalai
Copy link
Owner

Ah, I was wrong in my analysis. The start re actually matches the whole string, not just the start.

\\(\"?\"?[^\"]\\)* This is saying that for * time, skip over things that are anything but """ (so "" and something else is ok).

@hvesalai
Copy link
Owner

Now, if you make that reluctant, then it will stop before the end of the multi-line string, and thus the next """ (which is supposed to end the string) is seen as a start of the next string.

@smarter smarter mentioned this pull request Dec 30, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants