[Webkit-unassigned] [Bug 124319] New: Regex runtime depends on input size when it shouldn't

bugzilla-daemon at webkit.org bugzilla-daemon at webkit.org
Wed Nov 13 15:53:30 PST 2013


https://bugs.webkit.org/show_bug.cgi?id=124319

           Summary: Regex runtime depends on input size when it shouldn't
           Product: WebKit
           Version: 528+ (Nightly build)
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: Normal
          Priority: P3
         Component: JavaScriptCore
        AssignedTo: webkit-unassigned at lists.webkit.org
        ReportedBy: peter.isza at gmail.com
                CC: barraclough at apple.com, msaboff at apple.com


Created an attachment (id=216876)
 --> (https://bugs.webkit.org/attachment.cgi?id=216876&action=review)
Regexp runtime tester script

I have run the following regular expressions on strings that contain only letter 'a':
  1. /^b+/
  2. /^(b)+/
  3. /^(?:b+)/
  4. /^(?:(?:b)+)/

Obviously, there should be no matches. The regex engine should quit after reading the first character, so the running time shouldn't depend on what's after the first letter in the string. But in case #2 and #4 it does depend on it (linearly).

The bug appears in Safari and Firefox (and doesn't appear in Chrome, IE and Opera).

It turns out that in cases #2 and #4 YARR can't compile the expression, so it is using the interpreter. Fair enough.

Then it turns out that the interpreter uses some magic backtracking algorithm instead of a finite state automaton. Alright.

This magic algorithm seems to look for the matches first, and then decides whether the match is in the beginning or not:
if (((matchBegin && term.anchors.m_bol)
             || ((matchEnd != input.end()) && term.anchors.m_eol))
            && !pattern->m_multiline)
            return false;

Could you please fix this?

-- 
Configure bugmail: https://bugs.webkit.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.



More information about the webkit-unassigned mailing list