[webkit-changes] [WebKit/WebKit] d43e2a: [JSC] Introduce SubjectSampler to heuristically pi...
Yusuke Suzuki
noreply at github.com
Fri Feb 10 18:07:48 PST 2023
Branch: refs/heads/main
Home: https://github.com/WebKit/WebKit
Commit: d43e2a23b5f0474c36868e01d6e8242d30cc2b1a
https://github.com/WebKit/WebKit/commit/d43e2a23b5f0474c36868e01d6e8242d30cc2b1a
Author: Yusuke Suzuki <ysuzuki at apple.com>
Date: 2023-02-10 (Fri, 10 Feb 2023)
Changed paths:
M Source/JavaScriptCore/runtime/RegExp.cpp
M Source/JavaScriptCore/runtime/RegExp.h
M Source/JavaScriptCore/runtime/RegExpInlines.h
M Source/JavaScriptCore/yarr/YarrJIT.cpp
M Source/JavaScriptCore/yarr/YarrJIT.h
Log Message:
-----------
[JSC] Introduce SubjectSampler to heuristically pick BM search in RegExp
https://bugs.webkit.org/show_bug.cgi?id=252065
rdar://105284820
Reviewed by Mark Lam.
The effectiveness of BoyerMoore search depends on whether we can pick a good anchor character that rarely appears in the
actual text. If we pick a character that appears very frequently in the text, BM search is not effective, and it can even
slow down RegExp performance because it adds extra searching code.
So in this patch, we integrate an idea from V8 Irregexp, which samples 128 characters of the text at compile time and uses
character frequency as a weight to pick a better BoyerMoore search character. Our weight calculation is simpler than V8's,
and it is effective in our benchmarks.
This patch improves JetStream2/regex-dna-SP by 5-10%.
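The sampling idea described above can be illustrated with a minimal standalone sketch. This is not the code from the patch: FrequencySampler and pickRarestCandidate are hypothetical names, and sampling the first 128 characters, handling 8-bit characters only, and using a plain sum of counts as the weight are all assumptions made for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string_view>
#include <vector>

// Hypothetical sampler for illustration; not JSC::Yarr::SubjectSampler.
class FrequencySampler {
public:
    // Sample length taken from the log message above.
    static constexpr std::size_t sampleSize = 128;

    // Count 8-bit characters in at most the first sampleSize characters
    // of the subject (which part of the text is sampled is an assumption).
    void sample(std::string_view subject)
    {
        std::size_t limit = subject.size() < sampleSize ? subject.size() : sampleSize;
        for (std::size_t i = 0; i < limit; ++i) {
            ++m_counts[static_cast<unsigned char>(subject[i])];
            ++m_total;
        }
    }

    // Number of sampled occurrences of the given character.
    unsigned frequency(unsigned char character) const { return m_counts[character]; }

    // Total number of characters sampled so far.
    unsigned total() const { return m_total; }

private:
    std::array<unsigned, 256> m_counts { };
    unsigned m_total { 0 };
};

// Each candidate lists the characters the pattern may match at one offset.
// Its weight is the summed sampled frequency of those characters (an assumed,
// simplified weight). The candidate with the lowest weight is expected to
// appear least often in the subject, so it makes the better search anchor.
inline std::size_t pickRarestCandidate(const FrequencySampler& sampler, const std::vector<std::string_view>& candidates)
{
    assert(!candidates.empty());
    std::size_t best = 0;
    unsigned bestWeight = ~0u;
    for (std::size_t i = 0; i < candidates.size(); ++i) {
        unsigned weight = 0;
        for (char c : candidates[i])
            weight += sampler.frequency(static_cast<unsigned char>(c));
        if (weight < bestWeight) {
            bestWeight = weight;
            best = i;
        }
    }
    return best;
}
```

With a subject dominated by one character, a candidate made of that character gets a high weight and is avoided, which matches the motivation given in the log message.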
* Source/JavaScriptCore/runtime/RegExp.cpp:
(JSC::RegExp::compile):
(JSC::RegExp::compileMatchOnly):
* Source/JavaScriptCore/runtime/RegExp.h:
* Source/JavaScriptCore/runtime/RegExpInlines.h:
(JSC::RegExp::compileIfNecessary):
(JSC::RegExp::matchInline):
(JSC::RegExp::compileIfNecessaryMatchOnly):
* Source/JavaScriptCore/yarr/YarrJIT.cpp:
(JSC::Yarr::SubjectSampler::SubjectSampler):
(JSC::Yarr::SubjectSampler::frequency const):
(JSC::Yarr::SubjectSampler::sample):
(JSC::Yarr::SubjectSampler::dump const):
(JSC::Yarr::SubjectSampler::is8Bit const):
(JSC::Yarr::SubjectSampler::add):
(JSC::Yarr::BoyerMooreInfo::findBestCharacterSequence const):
(JSC::Yarr::BoyerMooreInfo::findWorthwhileCharacterSequenceForLookahead const):
(JSC::Yarr::jitCompile):
* Source/JavaScriptCore/yarr/YarrJIT.h:
Canonical link: https://commits.webkit.org/260142@main