Timestamp:
Sep 7, 2017, 4:13:38 PM
Author:
[email protected]
Message:

Add support for RegExp named capture groups
https://bugs.webkit.org/show_bug.cgi?id=176435

Reviewed by Filip Pizlo.

Source/JavaScriptCore:

Added parsing for both naming a captured parenthesis and using a named group in
a back reference. Also added support for using named groups with String.prototype.replace().

This patch does not throw syntax errors, as described in the current spec text, for the two
cases of malformed back references in String.prototype.replace(), because I believe that
doing so is inconsistent with the current semantics for handling other malformed replacement
tokens. I filed an issue requesting this change to the proposed spec and also filed
a FIXME bug: https://bugs.webkit.org/show_bug.cgi?id=176434.
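
To make the replacement-string behavior described above concrete, here is a minimal sketch in standard C++. It is not the JavaScriptCore code: the function name `expandNamedReferences` and the use of `std::string` and `std::unordered_map` are assumptions made for illustration. It also assumes that "consistent with other malformed replacement tokens" means a `$<name>` token with no closing `>` or with an unknown name is copied through unchanged rather than raising an error.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>

// Hypothetical helper, not the JavaScriptCore implementation: expands
// "$<name>" tokens in a replacement template from a name -> captured-text map.
// Assumption: a token with no closing '>' or with an unknown group name is
// copied to the output verbatim instead of raising an error.
std::string expandNamedReferences(
    const std::string& replacement,
    const std::unordered_map<std::string, std::string>& namedCaptures)
{
    std::string result;
    std::size_t i = 0;
    while (i < replacement.size()) {
        // Look for the two-character introducer "$<".
        if (replacement[i] == '$' && i + 1 < replacement.size() && replacement[i + 1] == '<') {
            std::size_t close = replacement.find('>', i + 2);
            if (close != std::string::npos) {
                std::string name = replacement.substr(i + 2, close - (i + 2));
                auto it = namedCaptures.find(name);
                if (it != namedCaptures.end()) {
                    // Well-formed reference to a known group: substitute it.
                    result += it->second;
                    i = close + 1;
                    continue;
                }
            }
            // Malformed or unknown reference: fall through and copy literally.
        }
        result += replacement[i];
        ++i;
    }
    return result;
}
```

With captures `year = "2017"` and `month = "09"`, the template `$<month>/$<year>` expands to `09/2017`, while `$<year` (no closing bracket) is left untouched.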

This patch does not implement strength reduction in the optimizing JITs for named capture
groups. Filed https://bugs.webkit.org/show_bug.cgi?id=176464.

  • dfg/DFGAbstractInterpreterInlines.h:

(JSC::DFG::AbstractInterpreter<AbstractStateType>::executeEffects):

  • dfg/DFGStrengthReductionPhase.cpp:

(JSC::DFG::StrengthReductionPhase::handleNode):

  • runtime/CommonIdentifiers.h:
  • runtime/JSGlobalObject.cpp:

(JSC::JSGlobalObject::init):
(JSC::JSGlobalObject::haveABadTime):

  • runtime/JSGlobalObject.h:

(JSC::JSGlobalObject::regExpMatchesArrayWithGroupsStructure const):

  • runtime/RegExp.cpp:

(JSC::RegExp::finishCreation):

  • runtime/RegExp.h:
  • runtime/RegExpMatchesArray.cpp:

(JSC::createStructureImpl):
(JSC::createRegExpMatchesArrayWithGroupsStructure):
(JSC::createRegExpMatchesArrayWithGroupsSlowPutStructure):

  • runtime/RegExpMatchesArray.h:

(JSC::createRegExpMatchesArray):

  • runtime/StringPrototype.cpp:

(JSC::substituteBackreferencesSlow):
(JSC::replaceUsingRegExpSearch):

  • yarr/YarrParser.h:

(JSC::Yarr::Parser::CharacterClassParserDelegate::atomNamedBackReference):
(JSC::Yarr::Parser::parseEscape):
(JSC::Yarr::Parser::parseParenthesesBegin):
(JSC::Yarr::Parser::tryConsumeUnicodeEscape):
(JSC::Yarr::Parser::tryConsumeIdentifierCharacter):
(JSC::Yarr::Parser::isIdentifierStart):
(JSC::Yarr::Parser::isIdentifierPart):
(JSC::Yarr::Parser::tryConsumeGroupName):

  • yarr/YarrPattern.cpp:

(JSC::Yarr::YarrPatternConstructor::atomParenthesesSubpatternBegin):
(JSC::Yarr::YarrPatternConstructor::atomNamedBackReference):
(JSC::Yarr::YarrPattern::errorMessage):

  • yarr/YarrPattern.h:

(JSC::Yarr::YarrPattern::reset):

  • yarr/YarrSyntaxChecker.cpp:

(JSC::Yarr::SyntaxChecker::atomParenthesesSubpatternBegin):
(JSC::Yarr::SyntaxChecker::atomNamedBackReference):
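
The group-name entry points listed above for yarr/YarrParser.h (tryConsumeGroupName, isIdentifierStart, isIdentifierPart) suggest a parser that reads an identifier up to a closing `>`. The sketch below shows that idea only. It is ASCII-only, ignores the Unicode escapes that tryConsumeUnicodeEscape presumably handles, and every name and signature in it is hypothetical rather than taken from the patch.

```cpp
#include <cstddef>
#include <string>

// ASCII-only approximations of identifier classification (assumption: the
// real parser also accepts Unicode identifier characters and \u escapes).
bool isIdentifierStartASCII(char c)
{
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || c == '_' || c == '$';
}

bool isIdentifierPartASCII(char c)
{
    return isIdentifierStartASCII(c) || (c >= '0' && c <= '9');
}

// Hypothetical helper: tries to read "name>" starting at pattern[pos], i.e.
// just after the "(?<" of a named group or the "\k<" of a named back reference.
// On success, stores the name, advances pos past '>' and returns true.
// On failure, leaves pos and name untouched and returns false.
bool tryConsumeGroupNameASCII(const std::string& pattern, std::size_t& pos, std::string& name)
{
    std::size_t cursor = pos;
    if (cursor >= pattern.size() || !isIdentifierStartASCII(pattern[cursor]))
        return false;

    std::string parsed(1, pattern[cursor++]);
    while (cursor < pattern.size()) {
        char c = pattern[cursor++];
        if (c == '>') {
            pos = cursor;
            name = parsed;
            return true;
        }
        if (!isIdentifierPartASCII(c))
            return false;
        parsed += c;
    }
    return false; // Ran off the end without finding '>'.
}
```

Restoring the position on failure lets the caller report a syntax error or reinterpret the input without the parser having consumed half a name.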

Source/WebCore:

Implemented stub routines to support named capture groups. These are no-ops,
just like those for numbered capture groups.

No new tests as this is covered by existing tests.

  • contentextensions/URLFilterParser.cpp:

(WebCore::ContentExtensions::PatternParser::atomNamedBackReference):
(WebCore::ContentExtensions::PatternParser::atomParenthesesSubpatternBegin):

LayoutTests:

New regression tests.

  • js/regexp-named-capture-groups-expected.txt: Added.
  • js/regexp-named-capture-groups.html: Added.
  • js/script-tests/regexp-named-capture-groups.js: Added.
File:
1 edited

  • trunk/Source/JavaScriptCore/runtime/RegExp.h

    --- r221160
    +++ r221769
    @@ -80,4 +80,24 @@
         unsigned numSubpatterns() const { return m_numSubpatterns; }
     
    +    bool hasNamedCaptures()
    +    {
    +        return !m_captureGroupNames.isEmpty();
    +    }
    +
    +    String getCaptureGroupName(unsigned i)
    +    {
    +        if (!i || m_captureGroupNames.size() <= i)
    +            return String();
    +        return m_captureGroupNames[i];
    +    }
    +
    +    unsigned subpatternForName(String groupName)
    +    {
    +        auto it = m_namedGroupToParenIndex.find(groupName);
    +        if (it == m_namedGroupToParenIndex.end())
    +            return 0;
    +        return it->value;
    +    }
    +
         bool hasCode()
         {
    @@ -135,4 +155,6 @@
         const char* m_constructionError;
         unsigned m_numSubpatterns;
    +    Vector<String> m_captureGroupNames;
    +    HashMap<String, unsigned> m_namedGroupToParenIndex;
     #if ENABLE(REGEXP_TRACING)
         double m_rtMatchOnlyTotalSubjectStringLen;
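
The accessors added to RegExp.h keep two views of the same data: a vector from group index to name and a map from name to group index, with 0 doubling as "no such group" because index 0 is the whole match. The following standalone sketch mirrors that lookup logic using standard containers in place of WTF's `Vector`, `HashMap`, and `String`. The class name `NamedCaptureTable` and the `addNamedGroup` method are hypothetical; the changeset excerpt does not show how the real members are populated.

```cpp
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for the named-group bookkeeping shown in the diff,
// written against the standard library rather than WTF types.
class NamedCaptureTable {
public:
    // Assumed population step (not shown in the changeset excerpt): records
    // that capture group `index` (1-based) carries `name`.
    void addNamedGroup(unsigned index, const std::string& name)
    {
        if (m_captureGroupNames.size() <= index)
            m_captureGroupNames.resize(index + 1);
        m_captureGroupNames[index] = name;
        m_namedGroupToParenIndex[name] = index;
    }

    bool hasNamedCaptures() const
    {
        return !m_captureGroupNames.empty();
    }

    // Returns the empty string for index 0 (the whole match), for an
    // out-of-range index, and for groups that were never given a name.
    std::string getCaptureGroupName(unsigned i) const
    {
        if (!i || m_captureGroupNames.size() <= i)
            return std::string();
        return m_captureGroupNames[i];
    }

    // Returns 0 when the name is unknown; 0 is never a valid capture group
    // index, so it can serve as the "not found" value.
    unsigned subpatternForName(const std::string& groupName) const
    {
        auto it = m_namedGroupToParenIndex.find(groupName);
        if (it == m_namedGroupToParenIndex.end())
            return 0;
        return it->second;
    }

private:
    std::vector<std::string> m_captureGroupNames;
    std::unordered_map<std::string, unsigned> m_namedGroupToParenIndex;
};
```

Keeping both directions lets a match result be labeled by iterating over indices, while a `\k<name>` or `$<name>` reference can be resolved with a single hash lookup.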