[webkit-changes] [WebKit/WebKit] d43e2a: [JSC] Introduce SubjectSampler to heuristically pi...

Yusuke Suzuki noreply at github.com
Fri Feb 10 18:07:48 PST 2023


  Branch: refs/heads/main
  Home:   https://github.com/WebKit/WebKit
  Commit: d43e2a23b5f0474c36868e01d6e8242d30cc2b1a
      https://github.com/WebKit/WebKit/commit/d43e2a23b5f0474c36868e01d6e8242d30cc2b1a
  Author: Yusuke Suzuki <ysuzuki at apple.com>
  Date:   2023-02-10 (Fri, 10 Feb 2023)

  Changed paths:
    M Source/JavaScriptCore/runtime/RegExp.cpp
    M Source/JavaScriptCore/runtime/RegExp.h
    M Source/JavaScriptCore/runtime/RegExpInlines.h
    M Source/JavaScriptCore/yarr/YarrJIT.cpp
    M Source/JavaScriptCore/yarr/YarrJIT.h

  Log Message:
  -----------
  [JSC] Introduce SubjectSampler to heuristically pick BM search in RegExp
https://bugs.webkit.org/show_bug.cgi?id=252065
rdar://105284820

Reviewed by Mark Lam.

The effectiveness of Boyer-Moore (BM) search depends on whether we can pick a good anchor character that rarely appears in
the actual text. If we pick a character that appears very frequently in the text, BM search gains little, or even slows
down RegExp performance, since it adds extra searching code.

So in this patch, we adopt an idea from V8 Irregexp, which samples 128 characters of the text at compile time and uses
the character frequencies as weights to pick a better Boyer-Moore search character. Our weight calculation is simpler than
V8's, and it is effective in our benchmarks.
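
The sampling idea described above can be illustrated with a minimal sketch. This is not the SubjectSampler code from this
patch: the names ToySubjectSampler and pickAnchor, the 8-bit-only table, the choice to sample the first 128 characters,
and the "lowest count wins" rule are all assumptions made for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical sketch only; not the JSC::Yarr::SubjectSampler implementation.
class ToySubjectSampler {
public:
    // Sample size taken from the log message; sampling the prefix is an assumption.
    static constexpr std::size_t sampleLimit = 128;

    // Count character occurrences in the first sampleLimit characters of the subject.
    void sample(std::string_view subject)
    {
        std::size_t count = subject.size() < sampleLimit ? subject.size() : sampleLimit;
        for (std::size_t i = 0; i < count; ++i) {
            ++m_counts[static_cast<unsigned char>(subject[i])];
            ++m_total;
        }
    }

    // Number of times c was seen in the sampled characters.
    std::size_t frequency(unsigned char c) const { return m_counts[c]; }

    // Total number of characters sampled so far.
    std::size_t total() const { return m_total; }

private:
    std::array<std::size_t, 256> m_counts { };
    std::size_t m_total { 0 };
};

// Among the candidate anchor characters, return the one seen least often in the
// sample. Ties go to the earliest candidate. This weighting is an assumption.
inline char pickAnchor(const ToySubjectSampler& sampler, std::string_view candidates)
{
    assert(!candidates.empty());
    char best = candidates[0];
    for (char c : candidates) {
        if (sampler.frequency(static_cast<unsigned char>(c)) < sampler.frequency(static_cast<unsigned char>(best)))
            best = c;
    }
    return best;
}
```

With a subject dominated by a few characters, a chooser like this steers the search toward a rarer character, which is the
situation in which a BM-style skip loop can advance furthest per probe.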

This patch improves JetStream2/regex-dna-SP by 5-10%.

* Source/JavaScriptCore/runtime/RegExp.cpp:
(JSC::RegExp::compile):
(JSC::RegExp::compileMatchOnly):
* Source/JavaScriptCore/runtime/RegExp.h:
* Source/JavaScriptCore/runtime/RegExpInlines.h:
(JSC::RegExp::compileIfNecessary):
(JSC::RegExp::matchInline):
(JSC::RegExp::compileIfNecessaryMatchOnly):
* Source/JavaScriptCore/yarr/YarrJIT.cpp:
(JSC::Yarr::SubjectSampler::SubjectSampler):
(JSC::Yarr::SubjectSampler::frequency const):
(JSC::Yarr::SubjectSampler::sample):
(JSC::Yarr::SubjectSampler::dump const):
(JSC::Yarr::SubjectSampler::is8Bit const):
(JSC::Yarr::SubjectSampler::add):
(JSC::Yarr::BoyerMooreInfo::findBestCharacterSequence const):
(JSC::Yarr::BoyerMooreInfo::findWorthwhileCharacterSequenceForLookahead const):
(JSC::Yarr::jitCompile):
* Source/JavaScriptCore/yarr/YarrJIT.h:

Canonical link: https://commits.webkit.org/260142@main
