[Webkit-unassigned] [Bug 258706] JS markdown parser performs 50x slower in JSC compared to V8, likely due to regex

bugzilla-daemon at webkit.org
Wed Mar 27 08:03:38 PDT 2024


https://bugs.webkit.org/show_bug.cgi?id=258706

--- Comment #5 from Michael Saboff <msaboff at apple.com> ---
(In reply to Michael Saboff from comment #2)

> The reason we don’t support back references for ignore case 16bit JIT’ing is
> due to the complicated case folding rules for some Unicode characters. 
> Again there are two possible options for addressing this bug.
> 1) If the RegExp contains back references, allow the back reference if the
> referenced group’s contents are easily case folded.  8 bit characters would
> be easily handled by this fix.
> 2) Completely handle Unicode case folding.  This could be built upon the
> work of the first alternative.  A full implementation of this approach would
> require calling out to a case folding helper for some patterns.  This helper
> could be generated as needed.

JIT support for ignore-case backreferences on ARM64 and X86-64 landed in https://commits.webkit.org/276681@main via https://bugs.webkit.org/show_bug.cgi?id=271617.
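
For readers unfamiliar with the construct, here is a minimal sketch of an ignore-case backreference. The pattern and the helper name hasRepeatedWord are made up for illustration and are not taken from the markdown parser in this bug; whether a given pattern is actually JIT-compiled depends on the engine version.

```javascript
// Hypothetical pattern: a word followed by the same word again, with the
// backreference \1 compared case-insensitively because of the "i" flag.
const repeatedWord = /\b(\w+) \1\b/i;

// Returns true when the text contains a doubled word such as "The the".
function hasRepeatedWord(text) {
  return repeatedWord.test(text);
}

console.log(hasRepeatedWord("The the cat")); // true: "the" matches "The" ignoring case
console.log(hasRepeatedWord("The cat sat")); // false: no doubled word
```

The captured text here is plain ASCII, which is the easily case-folded situation described in option 1 of the quoted comment.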

> The first non-zero based variable counted parenthesis issue could be addressed at least two ways:
> 1) With some RegExp rewriting, e.g. changing (?:\*[ \t]*){3,} to (?:\*[ \t]*){3}(?:\*[ \t]*)*.  We currently do this for one or more variable counted parens, ie (?:\*[ \t]*)+
> 2) Some more involved work to properly handle the fixed non-zero count of variable counted parens in the JIT directly.

I am working toward the second option: supporting non-zero-based variable-counted parentheses directly in the Yarr JIT.
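
As a rough illustration of the rewriting described in option 1 of the quoted text, the sketch below checks that the two forms accept the same inputs. The ^ and $ anchors and the sample strings are assumptions added here so that whole-string matching makes the comparison meaningful; they are not part of the patterns quoted above.

```javascript
// Original form: a non-zero minimum count with no upper bound.
const original = /^(?:\*[ \t]*){3,}$/;
// Rewritten form: a fixed count of 3 followed by zero or more repetitions.
const rewritten = /^(?:\*[ \t]*){3}(?:\*[ \t]*)*$/;

// Hypothetical sample inputs, chosen to cover matching and non-matching cases.
const samples = ["***", "* * *", "**", "* * * *\t*", "*a*"];

for (const s of samples) {
  // Both patterns should give the same answer for every input.
  console.log(JSON.stringify(s), original.test(s), rewritten.test(s));
}
```

This only demonstrates that the two patterns agree on these samples; it says nothing about how either form is compiled or how fast it runs in a particular engine.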
