[webkit-changes] [WebKit/WebKit] 9ed7d9: [JSC] Implement RegExp Duplicate Named Capture Groups

Michael Saboff noreply at github.com
Wed Feb 22 12:01:32 PST 2023


  Branch: refs/heads/main
  Home:   https://github.com/WebKit/WebKit
  Commit: 9ed7d9c4e36677ce84c2bd2b2b225d240365fa19
      https://github.com/WebKit/WebKit/commit/9ed7d9c4e36677ce84c2bd2b2b225d240365fa19
  Author: Michael Saboff <msaboff at apple.com>
  Date:   2023-02-22 (Wed, 22 Feb 2023)

  Changed paths:
    A JSTests/stress/regexp-duplicate-named-captures.js
    M JSTests/test262/config.yaml
    M LayoutTests/js/regexp-named-capture-groups-expected.txt
    M LayoutTests/js/script-tests/regexp-named-capture-groups.js
    M Source/JavaScriptCore/runtime/RegExp.cpp
    M Source/JavaScriptCore/runtime/RegExp.h
    M Source/JavaScriptCore/runtime/RegExpInlines.h
    M Source/JavaScriptCore/runtime/RegExpMatchesArray.h
    M Source/JavaScriptCore/runtime/StringPrototype.cpp
    M Source/JavaScriptCore/yarr/Yarr.h
    M Source/JavaScriptCore/yarr/YarrInterpreter.cpp
    M Source/JavaScriptCore/yarr/YarrInterpreter.h
    M Source/JavaScriptCore/yarr/YarrJIT.cpp
    M Source/JavaScriptCore/yarr/YarrJITRegisters.h
    M Source/JavaScriptCore/yarr/YarrParser.h
    M Source/JavaScriptCore/yarr/YarrPattern.cpp
    M Source/JavaScriptCore/yarr/YarrPattern.h

  Log Message:
  -----------
  [JSC] Implement RegExp Duplicate Named Capture Groups
https://bugs.webkit.org/show_bug.cgi?id=252553
rdar://100335581

Reviewed by Yusuke Suzuki.

This change implements RegExp Duplicate Named Capture Groups (https://github.com/tc39/ecma262/tree/duplicate-named-capture-groups).
Duplicate named captures have a unique positive ID like subpatternIds.  When matching, either in the interpreter or JIT,
space for one unsigned value is added in the "output" vector for each duplicate named capture.  When a subpattern that is one of the
duplicates for a named capture matches, that subpattern's ID is saved in its duplicate named capture location of the output
vector.  When such a subpattern doesn't match or backtracks, a zero is written in the named capture's output location.
When either matching a backreference for a named capture or creating the match result, the participating subpattern for a
duplicate named capture is indirectly accessed via the named capture's output location.
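
The output-vector scheme described above can be illustrated with a minimal standalone sketch.  It is not the Yarr code: the
MatchOutput type, its member names, and the exact slot layout (start/end pairs for each subpattern followed by one slot per
duplicate named capture, with IDs starting at 1) are assumptions made for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the matcher's "output" vector; all names are illustrative.
struct MatchOutput {
    unsigned numSubpatterns;     // capture groups, not counting subpattern 0 (the whole match)
    std::vector<unsigned> slots; // [start, end] pairs, then one slot per duplicate named capture

    MatchOutput(unsigned subpatterns, unsigned duplicateNamedGroups)
        : numSubpatterns(subpatterns)
        , slots((subpatterns + 1) * 2 + duplicateNamedGroups, 0)
    {
    }

    // Assumed layout: duplicate named capture IDs start at 1 and follow the offset pairs.
    std::size_t slotForGroup(unsigned groupId) const
    {
        return (numSubpatterns + 1) * 2 + groupId - 1;
    }

    // A subpattern belonging to the duplicate named capture matched: store its offsets
    // and record its ID in the named capture's slot.
    void recordMatch(unsigned subpatternId, unsigned groupId, unsigned start, unsigned end)
    {
        slots[subpatternId * 2] = start;
        slots[subpatternId * 2 + 1] = end;
        slots[slotForGroup(groupId)] = subpatternId;
    }

    // The subpattern failed to match or was backtracked: write zero in the named capture's slot.
    void clearMatch(unsigned groupId) { slots[slotForGroup(groupId)] = 0; }

    // Backreferences and result creation go through this indirection; 0 means no participant.
    unsigned participatingSubpattern(unsigned groupId) const { return slots[slotForGroup(groupId)]; }
};
```

Zero works as the "no participant" marker in this sketch because subpattern 0 is the whole match and is never a member of a
named capture.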

YarrPattern now contains a map from a duplicate named capture to a vector of unsigneds.  The first value in that vector is
that duplicate named group's unique ID.  The remaining values are the subpatternIds that are part of the named group.
There is also a vector where the value at subpatternId index contains its duplicate named group's ID.  These allow for
finding the subpatternIds for a given duplicate named capture group as well as finding the duplicate named capture ID
for a given subpatternId.
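
The two lookup structures described above might be sketched as follows.  This is an illustration under assumptions, not the
YarrPattern implementation: the DuplicateNamedGroups type and its members are hypothetical, std::map and std::string stand in
for whatever containers the real code uses, and the caller is assumed to add only names that are actually duplicated.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical bookkeeping for duplicate named captures; all names are illustrative.
struct DuplicateNamedGroups {
    // name -> { groupId, subpatternId, subpatternId, ... }
    std::map<std::string, std::vector<unsigned>> byName;
    // index = subpatternId, value = duplicate named group ID (0 when the subpattern has none)
    std::vector<unsigned> groupIdForSubpattern;
    unsigned nextGroupId = 1; // IDs are positive so that 0 can mean "none"

    void add(const std::string& name, unsigned subpatternId)
    {
        auto& entry = byName[name];
        if (entry.empty())
            entry.push_back(nextGroupId++); // first value is the group's unique ID
        entry.push_back(subpatternId);      // remaining values are member subpatternIds
        if (groupIdForSubpattern.size() <= subpatternId)
            groupIdForSubpattern.resize(subpatternId + 1, 0);
        groupIdForSubpattern[subpatternId] = entry.front();
    }

    // Forward lookup: the subpatternIds that share a name.
    std::vector<unsigned> subpatternsFor(const std::string& name) const
    {
        auto it = byName.find(name);
        if (it == byName.end())
            return {};
        return std::vector<unsigned>(it->second.begin() + 1, it->second.end());
    }

    // Reverse lookup: the duplicate named group ID for a subpatternId, or 0.
    unsigned groupIdFor(unsigned subpatternId) const
    {
        return subpatternId < groupIdForSubpattern.size() ? groupIdForSubpattern[subpatternId] : 0;
    }
};
```

Keeping the group ID as the first element of each vector means a single map lookup yields both the ID and the members, while
the reverse vector gives constant-time access from a subpatternId during matching.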

Updated existing LayoutTests/js/regexp-named-capture-groups.html to no longer fail with duplicate capture names.
Enabled duplicate named capture tests in test262.

* JSTests/test262/config.yaml: Enabled regexp-duplicate-named-groups
* JSTests/stress/regexp-duplicate-named-captures.js: Added.
(arrayToString):
(objectToString):
(dumpValue):
(compareArray):
(compareGroups):
(testRegExp):
(testRegExpSyntaxError):
(testRegExp.x.a):
(testRegExp.a.x):
* LayoutTests/js/regexp-named-capture-groups-expected.txt: Removed test for syntax error with a duplicate group name.
* LayoutTests/js/script-tests/regexp-named-capture-groups.js: Removed test for syntax error with a duplicate group name.
* Source/JavaScriptCore/runtime/RegExp.cpp:
(JSC::RegExp::finishCreation):
* Source/JavaScriptCore/runtime/RegExp.h:
* Source/JavaScriptCore/runtime/RegExpInlines.h:
(JSC::RegExp::matchInline):
* Source/JavaScriptCore/runtime/RegExpMatchesArray.h:
(JSC::createRegExpMatchesArray):
* Source/JavaScriptCore/runtime/StringPrototype.cpp:
(JSC::substituteBackreferencesSlow):
(JSC::replaceUsingRegExpSearch):
* Source/JavaScriptCore/yarr/Yarr.h:
* Source/JavaScriptCore/yarr/YarrInterpreter.cpp:
(JSC::Yarr::Interpreter::ParenthesesDisjunctionContext::ParenthesesDisjunctionContext):
(JSC::Yarr::Interpreter::ParenthesesDisjunctionContext::restoreOutput):
(JSC::Yarr::Interpreter::ParenthesesDisjunctionContext::getDisjunctionContext):
(JSC::Yarr::Interpreter::ParenthesesDisjunctionContext::backupOffsetForDuplicateNamedGroup):
(JSC::Yarr::Interpreter::ParenthesesDisjunctionContext::allocationSize):
(JSC::Yarr::Interpreter::allocParenthesesDisjunctionContext):
(JSC::Yarr::Interpreter::matchBackReference):
(JSC::Yarr::Interpreter::backtrackBackReference):
(JSC::Yarr::Interpreter::recordParenthesesMatch):
(JSC::Yarr::Interpreter::resetMatches):
(JSC::Yarr::Interpreter::parenthesesDoBacktrack):
(JSC::Yarr::Interpreter::matchParenthesesOnceBegin):
(JSC::Yarr::Interpreter::matchParenthesesOnceEnd):
(JSC::Yarr::Interpreter::backtrackParenthesesOnceBegin):
(JSC::Yarr::Interpreter::backtrackParenthesesOnceEnd):
(JSC::Yarr::Interpreter::matchParentheses):
(JSC::Yarr::Interpreter::backtrackParentheses):
(JSC::Yarr::ByteCompiler::compile):
(JSC::Yarr::ByteCompiler::atomBackReference):
(JSC::Yarr::ByteCompiler::atomParentheticalAssertionBegin):
(JSC::Yarr::ByteCompiler::atomParentheticalAssertionEnd):
(JSC::Yarr::ByteCompiler::atomParenthesesSubpatternEnd):
(JSC::Yarr::ByteCompiler::atomParenthesesOnceEnd):
(JSC::Yarr::ByteCompiler::atomParenthesesTerminalEnd):
(JSC::Yarr::ByteCompiler::emitDisjunction):
(JSC::Yarr::ByteTermDumper::dumpTerm):
* Source/JavaScriptCore/yarr/YarrInterpreter.h:
(JSC::Yarr::ByteTerm::ByteTerm):
(JSC::Yarr::ByteTerm::ParentheticalAssertionBegin):
(JSC::Yarr::ByteTerm::ParentheticalAssertionEnd):
(JSC::Yarr::ByteTerm::containsAnyCaptures):
(JSC::Yarr::ByteTerm::subpatternId):
(JSC::Yarr::ByteTerm::duplicateNamedGroupId):
(JSC::Yarr::ByteTerm::firstSubpatternId):
(JSC::Yarr::ByteTerm::lastSubpatternId):
(JSC::Yarr::BytecodePattern::BytecodePattern):
(JSC::Yarr::BytecodePattern::hasDuplicateNamedCaptureGroups const):
(JSC::Yarr::BytecodePattern::offsetForDuplicateNamedGroupId):
* Source/JavaScriptCore/yarr/YarrJIT.cpp:
(JSC::Yarr::loadSubPatternIdForDuplicateNamedGroup):
(JSC::Yarr::loadSubPattern):
(JSC::Yarr::loadSubPatternEnd):
(JSC::Yarr::YarrGenerator):
(JSC::Yarr::compile):
(JSC::Yarr::compileInline):
(JSC::Yarr::dumpCompileFailure): Deleted.
(JSC::Yarr::jitCompile): Deleted.
(JSC::Yarr::jitCompileInlinedTest): Deleted.
* Source/JavaScriptCore/yarr/YarrJITRegisters.h:
* Source/JavaScriptCore/yarr/YarrParser.h:
(JSC::Yarr::Parser::NamedCaptureGroups::NamedCaptureGroups):
(JSC::Yarr::Parser::NamedCaptureGroups::contains):
(JSC::Yarr::Parser::NamedCaptureGroups::isEmpty):
(JSC::Yarr::Parser::NamedCaptureGroups::reset):
(JSC::Yarr::Parser::NamedCaptureGroups::nextAlternative):
(JSC::Yarr::Parser::NamedCaptureGroups::pushParenthesis):
(JSC::Yarr::Parser::NamedCaptureGroups::popParenthesis):
(JSC::Yarr::Parser::NamedCaptureGroups::add):
(JSC::Yarr::Parser::parseEscape):
(JSC::Yarr::Parser::parseParenthesesBegin):
(JSC::Yarr::Parser::parseParenthesesEnd):
(JSC::Yarr::Parser::parseTokens):
(JSC::Yarr::Parser::parse):
(JSC::Yarr::Parser::handleIllegalReferences):
(JSC::Yarr::Parser::containsIllegalNamedForwardReference):
(JSC::Yarr::Parser::resetForReparsing):
* Source/JavaScriptCore/yarr/YarrPattern.cpp:
(JSC::Yarr::YarrPatternConstructor::UnresolvedForwardReference::UnresolvedForwardReference):
(JSC::Yarr::YarrPatternConstructor::UnresolvedForwardReference::hasNamedGroup):
(JSC::Yarr::YarrPatternConstructor::UnresolvedForwardReference::namedGroup):
(JSC::Yarr::YarrPatternConstructor::namedCaptureGroupIdForName):
(JSC::Yarr::YarrPatternConstructor::tryConvertingForwardReferencesToBackreferences):
(JSC::Yarr::YarrPatternConstructor::atomParenthesesSubpatternBegin):
(JSC::Yarr::YarrPatternConstructor::atomParenthesesEnd):
(JSC::Yarr::YarrPatternConstructor::atomNamedBackReference):
(JSC::Yarr::YarrPatternConstructor::atomNamedForwardReference):
(JSC::Yarr::YarrPatternConstructor::disjunction):
(JSC::Yarr::YarrPatternConstructor::setupDuplicateNamedCaptures):
(JSC::Yarr::YarrPattern::compile):
* Source/JavaScriptCore/yarr/YarrPattern.h:
(JSC::Yarr::PatternTerm::ForwardReference):
(JSC::Yarr::YarrPattern::resetForReparsing):
(JSC::Yarr::YarrPattern::offsetVectorBaseForNamedCaptures const):
(JSC::Yarr::YarrPattern::offsetsSize const):
(JSC::Yarr::YarrPattern::offsetForDuplicateNamedGroupId):
(JSC::Yarr::YarrPattern::hasDuplicateNamedCaptureGroups const):
(JSC::Yarr::BackTrackInfoBackReference::backReferenceSizeIndex):

Canonical link: https://commits.webkit.org/260692@main



