Changeset 225695 in webkit for trunk/Source/JavaScriptCore


Timestamp:
Dec 8, 2017, 12:32:42 PM
Author:
[email protected]
Message:

YARR: JIT RegExps with greedy parenthesized sub patterns
https://bugs.webkit.org/show_bug.cgi?id=180538

Reviewed by JF Bastien.

This patch adds JIT support for regular expressions containing greedy counted
parentheses. An example expression that couldn't be JIT'ed before is /q(a|b)*q/.

Just like in the interpreter, expressions with nested parenthetical subpatterns
require saving the results of previous matches of the parentheses contents along
with any associated state. This saved state is needed in case we have to
backtrack. This state is called ParenContext within the code. Space allocated
for each ParenContext is managed using a simple block allocator within the JIT'ed
code. The raw space managed by this allocator is passed into the JIT'ed function.
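
A simple block allocator of the kind this paragraph describes can be sketched in plain C++. This is only an illustration under stated assumptions: the real allocator is emitted as machine code by YarrGenerator, and the names below (FixedBlockPool, FreeBlock, allocate, release) are hypothetical and do not appear in the patch. The sketch carves a caller-supplied buffer into equal-sized blocks, threads them onto a free list, and reports exhaustion by returning nullptr.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical illustration only; these names are not from YarrJIT.cpp.
// A free block stores nothing but a link to the next free block.
struct FreeBlock {
    FreeBlock* next;
};

class FixedBlockPool {
public:
    // Carves the caller-supplied buffer into blockSize-byte blocks and pushes
    // each one onto the free list. The buffer must be suitably aligned.
    FixedBlockPool(void* buffer, std::size_t bufferSize, std::size_t blockSize)
    {
        assert(blockSize >= sizeof(FreeBlock));
        assert(blockSize % alignof(FreeBlock) == 0);
        char* base = static_cast<char*>(buffer);
        std::size_t blockCount = bufferSize / blockSize;
        for (std::size_t i = 0; i < blockCount; ++i)
            m_freeList = new (base + i * blockSize) FreeBlock { m_freeList };
    }

    // Pops a block off the free list, or returns nullptr when the pool is
    // exhausted (the situation in which the JIT'ed code would have to bail out).
    void* allocate()
    {
        if (!m_freeList)
            return nullptr;
        FreeBlock* block = m_freeList;
        m_freeList = block->next;
        return block;
    }

    // Pushes a block back onto the free list so it can be reused.
    void release(void* block)
    {
        m_freeList = new (block) FreeBlock { m_freeList };
    }

private:
    FreeBlock* m_freeList = nullptr;
};
```

Because every block has the same size, allocation and release are each a couple of pointer moves, which is what makes this scheme practical to emit inline in generated code.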

Since this fixed-size space may be exceeded, this patch adds a fallback mechanism.
If the JIT'ed code exhausts all its ParenContext space, it returns a new error
code, JSRegExpJITCodeFailure. The caller will then byte-compile and interpret the
expression.
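
The fallback path can be sketched as follows. This is a minimal model, not the code in RegExpInlines.h: FallbackMatcher, MatchFn, and kJITCodeFailure are hypothetical names, the two engines are stand-in callables, and the only detail taken from the changeset is that the failure code has the value -2 and that byte-compilation happens lazily, at most once.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>

// Mirrors the value given to JSRegExpJITCodeFailure in this changeset.
constexpr int kJITCodeFailure = -2;

// Stand-in for a match engine: returns a match offset, -1 for no match,
// or kJITCodeFailure when the JIT'ed code gives up. (Hypothetical type.)
using MatchFn = std::function<int(const std::string&)>;

class FallbackMatcher {
public:
    FallbackMatcher(MatchFn jit, MatchFn interpreter)
        : m_jit(std::move(jit))
        , m_interpreter(std::move(interpreter))
    {
    }

    int match(const std::string& input)
    {
        int result = m_jit(input);
        if (result != kJITCodeFailure)
            return result;

        // JIT'ed code couldn't handle this input, so punt to the interpreter.
        compileBytecodeIfNecessary();
        return m_interpreter(input);
    }

    unsigned bytecodeCompileCount() const { return m_compileCount; }

private:
    // Models lazy byte-compilation: the work is done only on first failure.
    void compileBytecodeIfNecessary()
    {
        if (m_hasBytecode)
            return;
        m_hasBytecode = true;
        ++m_compileCount;
    }

    MatchFn m_jit;
    MatchFn m_interpreter;
    bool m_hasBytecode = false;
    unsigned m_compileCount = 0;
};
```

The point of the design is that patterns which never exhaust their ParenContext space never pay for byte-compilation at all.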

Due to increased register usage by the parenthesis handling code, the use of
registers by the JIT engine was restructured, with registers used for Unicode
pattern matching replaced with constants.
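
To illustrate what those Unicode values are, here is a sketch of UTF-16 surrogate-pair decoding using compile-time constants. The constant names are borrowed from the register declarations in YarrJIT.cpp and the values are the standard UTF-16 ones; readCodePoint is a hypothetical helper, and whether the generated code uses exactly these immediates on every platform is an assumption, not something this changeset description states.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Standard UTF-16 values. Keeping them as constants means no register has to
// hold them across the match.
constexpr std::uint32_t supplementaryPlanesBase = 0x10000;
constexpr std::uint32_t surrogateTagMask = 0xfc00;
constexpr std::uint32_t leadingSurrogateTag = 0xd800;
constexpr std::uint32_t trailingSurrogateTag = 0xdc00;

// Hypothetical helper: decodes the code point starting at input[index] and
// reports how many 16-bit units it consumed. Unpaired surrogates are
// returned unchanged as a single unit.
char32_t readCodePoint(const char16_t* input, std::size_t length, std::size_t index, std::size_t& unitsRead)
{
    std::uint32_t lead = input[index];
    unitsRead = 1;
    if ((lead & surrogateTagMask) != leadingSurrogateTag || index + 1 >= length)
        return lead;

    std::uint32_t trail = input[index + 1];
    if ((trail & surrogateTagMask) != trailingSurrogateTag)
        return lead;

    unitsRead = 2;
    return ((lead - leadingSurrogateTag) << 10) + (trail - trailingSurrogateTag) + supplementaryPlanesBase;
}
```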

Reworked some of the context structures that are used across the interpreter
and JIT implementations to make them a little more uniform and to handle the
needs of JIT'ing the new parentheses forms.

To help with development and debugging of this code, the code that dumps
compiled patterns was enhanced. Also added the ability to dump interpreter ByteCodes.

  • runtime/RegExp.cpp:

(JSC::byteCodeCompilePattern):
(JSC::RegExp::byteCodeCompileIfNecessary):
(JSC::RegExp::compile):
(JSC::RegExp::compileMatchOnly):

  • runtime/RegExp.h:
  • runtime/RegExpInlines.h:

(JSC::RegExp::matchInline):

  • testRegExp.cpp:

(parseRegExpLine):
(runFromFiles):

  • yarr/Yarr.h:
  • yarr/YarrInterpreter.cpp:

(JSC::Yarr::ByteCompiler::compile):
(JSC::Yarr::ByteCompiler::dumpDisjunction):

  • yarr/YarrJIT.cpp:

(JSC::Yarr::YarrGenerator::ParenContextSizes::ParenContextSizes):
(JSC::Yarr::YarrGenerator::ParenContextSizes::numSubpatterns):
(JSC::Yarr::YarrGenerator::ParenContextSizes::frameSlots):
(JSC::Yarr::YarrGenerator::ParenContext::sizeFor):
(JSC::Yarr::YarrGenerator::ParenContext::nextOffset):
(JSC::Yarr::YarrGenerator::ParenContext::beginOffset):
(JSC::Yarr::YarrGenerator::ParenContext::matchAmountOffset):
(JSC::Yarr::YarrGenerator::ParenContext::subpatternOffset):
(JSC::Yarr::YarrGenerator::ParenContext::savedFrameOffset):
(JSC::Yarr::YarrGenerator::initParenContextFreeList):
(JSC::Yarr::YarrGenerator::allocatePatternContext):
(JSC::Yarr::YarrGenerator::freePatternContext):
(JSC::Yarr::YarrGenerator::savePatternContext):
(JSC::Yarr::YarrGenerator::restorePatternContext):
(JSC::Yarr::YarrGenerator::tryReadUnicodeCharImpl):
(JSC::Yarr::YarrGenerator::storeToFrame):
(JSC::Yarr::YarrGenerator::generateJITFailReturn):
(JSC::Yarr::YarrGenerator::clearMatches):
(JSC::Yarr::YarrGenerator::generate):
(JSC::Yarr::YarrGenerator::backtrack):
(JSC::Yarr::YarrGenerator::opCompileParenthesesSubpattern):
(JSC::Yarr::YarrGenerator::generateEnter):
(JSC::Yarr::YarrGenerator::generateReturn):
(JSC::Yarr::YarrGenerator::YarrGenerator):
(JSC::Yarr::YarrGenerator::compile):

  • yarr/YarrJIT.h:

(JSC::Yarr::YarrCodeBlock::execute):

  • yarr/YarrPattern.cpp:

(JSC::Yarr::indentForNestingLevel):
(JSC::Yarr::dumpUChar32):
(JSC::Yarr::dumpCharacterClass):
(JSC::Yarr::PatternTerm::dump):
(JSC::Yarr::YarrPattern::dumpPattern):

  • yarr/YarrPattern.h:

(JSC::Yarr::PatternTerm::containsAnyCaptures):
(JSC::Yarr::BackTrackInfoParenthesesOnce::returnAddressIndex):
(JSC::Yarr::BackTrackInfoParentheses::beginIndex):
(JSC::Yarr::BackTrackInfoParentheses::returnAddressIndex):
(JSC::Yarr::BackTrackInfoParentheses::matchAmountIndex):
(JSC::Yarr::BackTrackInfoParentheses::patternContextHeadIndex):
(JSC::Yarr::BackTrackInfoAlternative::offsetIndex): Deleted.

Location:
trunk/Source/JavaScriptCore
Files:
11 edited

  • trunk/Source/JavaScriptCore/ChangeLog

    r225693 r225695  
     12017-12-08  Michael Saboff  <[email protected]>
     2
     3        YARR: JIT RegExps with greedy parenthesized sub patterns
     4        https://bugs.webkit.org/show_bug.cgi?id=180538
     5
     6        Reviewed by JF Bastien.
     7
     8        This patch adds JIT support for regular expressions containing greedy counted
     9        parenthesis.  An example expression that couldn't be JIT'ed before is /q(a|b)*q/.
     10
     11        Just like in the interpreter, expressions with nested parenthetical subpatterns
     12        require saving the results of previous matches of the parentheses contents along
     13        with any associated state.  This saved state is needed in the case that we need
     14        to backtrack.  This state is called ParenContext within the code space allocated
     15        for this ParenContext is managed using a simple block allocator within the JIT'ed
     16        code.  The raw space managed by this allocator is passed into the JIT'ed function.
     17
     18        Since this fixed sized space may be exceeded, this patch adds a fallback mechanism.
     19        If the JIT'ed code exhausts all its ParenContext space, it returns a new error
     20        JSRegExpJITCodeFailure.  The caller will then bytecompile and interpret the
     21        expression.
     22
     23        Due to increased register usage by the parenthesis handling code, the use of
     24        registers by the JIT engine was restructured, with registers used for Unicode
     25        pattern matching replaced with constants.
     26
     27        Reworked some of the context structures that are used across the interpreter
     28        and JIT implementations to make them a little more uniform and to handle the
     29        needs of JIT'ing the new parentheses forms.
     30
     31        To help with development and debugging of this code, compiled patterns dumping
     32        code was enhanced.  Also added the ability to also dump interpreter ByteCodes.
     33
     34        * runtime/RegExp.cpp:
     35        (JSC::byteCodeCompilePattern):
     36        (JSC::RegExp::byteCodeCompileIfNecessary):
     37        (JSC::RegExp::compile):
     38        (JSC::RegExp::compileMatchOnly):
     39        * runtime/RegExp.h:
     40        * runtime/RegExpInlines.h:
     41        (JSC::RegExp::matchInline):
     42        * testRegExp.cpp:
     43        (parseRegExpLine):
     44        (runFromFiles):
     45        * yarr/Yarr.h:
     46        * yarr/YarrInterpreter.cpp:
     47        (JSC::Yarr::ByteCompiler::compile):
     48        (JSC::Yarr::ByteCompiler::dumpDisjunction):
     49        * yarr/YarrJIT.cpp:
     50        (JSC::Yarr::YarrGenerator::ParenContextSizes::ParenContextSizes):
     51        (JSC::Yarr::YarrGenerator::ParenContextSizes::numSubpatterns):
     52        (JSC::Yarr::YarrGenerator::ParenContextSizes::frameSlots):
     53        (JSC::Yarr::YarrGenerator::ParenContext::sizeFor):
     54        (JSC::Yarr::YarrGenerator::ParenContext::nextOffset):
     55        (JSC::Yarr::YarrGenerator::ParenContext::beginOffset):
     56        (JSC::Yarr::YarrGenerator::ParenContext::matchAmountOffset):
     57        (JSC::Yarr::YarrGenerator::ParenContext::subpatternOffset):
     58        (JSC::Yarr::YarrGenerator::ParenContext::savedFrameOffset):
     59        (JSC::Yarr::YarrGenerator::initParenContextFreeList):
     60        (JSC::Yarr::YarrGenerator::allocatePatternContext):
     61        (JSC::Yarr::YarrGenerator::freePatternContext):
     62        (JSC::Yarr::YarrGenerator::savePatternContext):
     63        (JSC::Yarr::YarrGenerator::restorePatternContext):
     64        (JSC::Yarr::YarrGenerator::tryReadUnicodeCharImpl):
     65        (JSC::Yarr::YarrGenerator::storeToFrame):
     66        (JSC::Yarr::YarrGenerator::generateJITFailReturn):
     67        (JSC::Yarr::YarrGenerator::clearMatches):
     68        (JSC::Yarr::YarrGenerator::generate):
     69        (JSC::Yarr::YarrGenerator::backtrack):
     70        (JSC::Yarr::YarrGenerator::opCompileParenthesesSubpattern):
     71        (JSC::Yarr::YarrGenerator::generateEnter):
     72        (JSC::Yarr::YarrGenerator::generateReturn):
     73        (JSC::Yarr::YarrGenerator::YarrGenerator):
     74        (JSC::Yarr::YarrGenerator::compile):
     75        * yarr/YarrJIT.h:
     76        (JSC::Yarr::YarrCodeBlock::execute):
     77        * yarr/YarrPattern.cpp:
     78        (JSC::Yarr::indentForNestingLevel):
     79        (JSC::Yarr::dumpUChar32):
     80        (JSC::Yarr::dumpCharacterClass):
     81        (JSC::Yarr::PatternTerm::dump):
     82        (JSC::Yarr::YarrPattern::dumpPattern):
     83        * yarr/YarrPattern.h:
     84        (JSC::Yarr::PatternTerm::containsAnyCaptures):
     85        (JSC::Yarr::BackTrackInfoParenthesesOnce::returnAddressIndex):
     86        (JSC::Yarr::BackTrackInfoParentheses::beginIndex):
     87        (JSC::Yarr::BackTrackInfoParentheses::returnAddressIndex):
     88        (JSC::Yarr::BackTrackInfoParentheses::matchAmountIndex):
     89        (JSC::Yarr::BackTrackInfoParentheses::patternContextHeadIndex):
     90        (JSC::Yarr::BackTrackInfoAlternative::offsetIndex): Deleted.
     91
    1922017-12-08  Joseph Pecoraro  <[email protected]>
    293
  • trunk/Source/JavaScriptCore/runtime/RegExp.cpp

    r223010 r225695  
    272272}
    273273
     274
     275static std::unique_ptr<Yarr::BytecodePattern> byteCodeCompilePattern(VM* vm, Yarr::YarrPattern& pattern)
     276{
     277    return Yarr::byteCompile(pattern, &vm->m_regExpAllocator, &vm->m_regExpAllocatorLock);
     278}
     279
     280void RegExp::byteCodeCompileIfNecessary(VM* vm)
     281{
     282    if (m_regExpBytecode)
     283        return;
     284
     285    Yarr::YarrPattern pattern(m_patternString, m_flags, &m_constructionError, vm->stackLimit());
     286    if (m_constructionError) {
     287        RELEASE_ASSERT_NOT_REACHED();
     288#if COMPILER_QUIRK(CONSIDERS_UNREACHABLE_CODE)
     289        m_state = ParseError;
     290        return;
     291#endif
     292    }
     293    ASSERT(m_numSubpatterns == pattern.m_numSubpatterns);
     294
     295    m_regExpBytecode = byteCodeCompilePattern(vm, pattern);
     296}
     297
    274298void RegExp::compile(VM* vm, Yarr::YarrCharSize charSize)
    275299{
     
    304328#endif
    305329
     330    if (Options::dumpCompiledRegExpPatterns())
     331        dataLog("Can't JIT this regular expression: \"", m_patternString, "\"\n");
     332
    306333    m_state = ByteCode;
    307     m_regExpBytecode = Yarr::byteCompile(pattern, &vm->m_regExpAllocator, &vm->m_regExpAllocatorLock);
     334    m_regExpBytecode = byteCodeCompilePattern(vm, pattern);
    308335}
    309336
     
    357384#endif
    358385
     386    if (Options::dumpCompiledRegExpPatterns())
     387        dataLog("Can't JIT this regular expression: \"", m_patternString, "\"\n");
     388
    359389    m_state = ByteCode;
    360     m_regExpBytecode = Yarr::byteCompile(pattern, &vm->m_regExpAllocator, &vm->m_regExpAllocatorLock);
     390    m_regExpBytecode = byteCodeCompilePattern(vm, pattern);
    361391}
    362392
  • trunk/Source/JavaScriptCore/runtime/RegExp.h

    r221769 r225695  
    141141    RegExpState m_state;
    142142
     143    void byteCodeCompileIfNecessary(VM*);
     144
    143145    void compile(VM*, Yarr::YarrCharSize);
    144146    void compileIfNecessary(VM&, Yarr::YarrCharSize);
  • trunk/Source/JavaScriptCore/runtime/RegExpInlines.h

    r219702 r225695  
    111111    int result;
    112112#if ENABLE(YARR_JIT)
     113#ifdef JIT_ALL_PARENS_EXPRESSIONS
     114    char patternContextBuffer[patternContextBufferSize];
     115#define EXTRA_JIT_PARAMS  , patternContextBuffer, patternContextBufferSize
     116#else
     117#define EXTRA_JIT_PARAMS
     118#endif
     119
    113120    if (m_state == JITCode) {
    114121        if (s.is8Bit())
    115             result = m_regExpJITCode.execute(s.characters8(), startOffset, s.length(), offsetVector).start;
     122            result = m_regExpJITCode.execute(s.characters8(), startOffset, s.length(), offsetVector EXTRA_JIT_PARAMS).start;
    116123        else
    117             result = m_regExpJITCode.execute(s.characters16(), startOffset, s.length(), offsetVector).start;
     124            result = m_regExpJITCode.execute(s.characters16(), startOffset, s.length(), offsetVector EXTRA_JIT_PARAMS).start;
     125
     126        if (result == Yarr::JSRegExpJITCodeFailure) {
     127            // JIT'ed code couldn't handle expression, so punt back to the interpreter.
     128            byteCodeCompileIfNecessary(&vm);
     129            result = Yarr::interpret(m_regExpBytecode.get(), s, startOffset, reinterpret_cast<unsigned*>(offsetVector));
     130        }
     131
    118132#if ENABLE(YARR_JIT_DEBUG)
    119133        matchCompareWithInterpreter(s, startOffset, offsetVector, result);
     
    200214
    201215#if ENABLE(YARR_JIT)
     216#ifdef JIT_ALL_PARENS_EXPRESSIONS
     217    char patternContextBuffer[patternContextBufferSize];
     218#define EXTRA_JIT_PARAMS  , patternContextBuffer, patternContextBufferSize
     219#else
     220#define EXTRA_JIT_PARAMS
     221#endif
     222
     223    MatchResult result;
     224
    202225    if (m_state == JITCode) {
    203         MatchResult result = s.is8Bit() ?
    204             m_regExpJITCode.execute(s.characters8(), startOffset, s.length()) :
    205             m_regExpJITCode.execute(s.characters16(), startOffset, s.length());
     226        if (s.is8Bit())
     227            result = m_regExpJITCode.execute(s.characters8(), startOffset, s.length() EXTRA_JIT_PARAMS);
     228        else
     229            result = m_regExpJITCode.execute(s.characters16(), startOffset, s.length() EXTRA_JIT_PARAMS);
     230
    206231#if ENABLE(REGEXP_TRACING)
    207232        if (!result)
    208233            m_rtMatchOnlyFoundCount++;
    209234#endif
    210         return result;
     235        if (result.start != static_cast<size_t>(Yarr::JSRegExpJITCodeFailure))
     236            return result;
     237
     238        // JIT'ed code couldn't handle expression, so punt back to the interpreter.
     239        byteCodeCompileIfNecessary(&vm);
    211240    }
    212241#endif
  • trunk/Source/JavaScriptCore/testRegExp.cpp

    r217108 r225695  
    316316}
    317317
    318 static RegExp* parseRegExpLine(VM& vm, char* line, int lineLength)
     318static RegExp* parseRegExpLine(VM& vm, char* line, int lineLength, const char** regexpError)
    319319{
    320320    StringBuilder pattern;
    321    
     321
    322322    if (line[0] != '/')
    323323        return 0;
     
    331331
    332332    RegExp* r = RegExp::create(vm, pattern.toString(), regExpFlags(line + i));
    333     if (r->isValid())
    334         return r;
    335     return nullptr;
     333    if (!r->isValid()) {
     334        *regexpError = r->errorMessage();
     335        return nullptr;
     336    }
     337    return r;
    336338}
    337339
     
    432434        char* linePtr = 0;
    433435        unsigned int lineNumber = 0;
     436        const char* regexpError = nullptr;
    434437
    435438        while ((linePtr = fgets(&lineBuffer[0], MaxLineLength, testCasesFile))) {
     
    445448
    446449            if (linePtr[0] == '/') {
    447                 regexp = parseRegExpLine(vm, linePtr, lineLength);
     450                regexp = parseRegExpLine(vm, linePtr, lineLength, &regexpError);
     451                if (!regexp) {
     452                    failures++;
     453                    fprintf(stderr, "Failure on line %u. '%s' %s\n", lineNumber, linePtr, regexpError);
     454                }
    448455            } else if (linePtr[0] == ' ') {
    449456                RegExpTest* regExpTest = parseTestLine(linePtr, lineLength);
     
    462469                tests++;
    463470                regexp = 0; // Reset the live regexp to avoid confusing other subsequent tests
    464                 bool successfullyParsed = parseRegExpLine(vm, linePtr + 1, lineLength - 1);
     471                bool successfullyParsed = parseRegExpLine(vm, linePtr + 1, lineLength - 1, &regexpError);
    465472                if (successfullyParsed) {
    466473                    failures++;
    467                     fprintf(stderr, "Failure on line %u. '%s' is not a valid regexp\n", lineNumber, linePtr + 1);
     474                    fprintf(stderr, "Failure on line %u. '%s' %s\n", lineNumber, linePtr + 1, regexpError);
    468475                }
    469476            }
  • trunk/Source/JavaScriptCore/yarr/Yarr.h

    r223081 r225695  
    3737#define YarrStackSpaceForBackTrackInfoAlternative 1 // One per alternative.
    3838#define YarrStackSpaceForBackTrackInfoParentheticalAssertion 1
    39 #define YarrStackSpaceForBackTrackInfoParenthesesOnce 1 // Only for !fixed quantifiers.
     39#define YarrStackSpaceForBackTrackInfoParenthesesOnce 2
    4040#define YarrStackSpaceForBackTrackInfoParenthesesTerminal 1
    41 #define YarrStackSpaceForBackTrackInfoParentheses 2
     41#define YarrStackSpaceForBackTrackInfoParentheses 4
    4242#define YarrStackSpaceForDotStarEnclosure 1
    4343
     
    5353    JSRegExpNoMatch = 0,
    5454    JSRegExpErrorNoMatch = -1,
    55     JSRegExpErrorHitLimit = -2,
    56     JSRegExpErrorNoMemory = -3,
    57     JSRegExpErrorInternal = -4
     55    JSRegExpJITCodeFailure = -2,
     56    JSRegExpErrorHitLimit = -3,
     57    JSRegExpErrorNoMemory = -4,
     58    JSRegExpErrorInternal = -5,
    5859};
    5960
  • trunk/Source/JavaScriptCore/yarr/YarrInterpreter.cpp

    r225683 r225695  
    2828#include "YarrInterpreter.h"
    2929
     30#include "Options.h"
    3031#include "SuperSampler.h"
    3132#include "Yarr.h"
     
    16701671        regexEnd();
    16711672
     1673#ifndef NDEBUG
     1674        if (Options::dumpCompiledRegExpPatterns())
     1675            dumpDisjunction(m_bodyDisjunction.get());
     1676#endif
     1677
    16721678        return std::make_unique<BytecodePattern>(WTFMove(m_bodyDisjunction), m_allParenthesesInfo, m_pattern, allocator, lock);
    16731679    }
     
    18291835        return beginTerm;
    18301836    }
    1831 
    1832 #ifndef NDEBUG
    1833     void dumpDisjunction(ByteDisjunction* disjunction)
    1834     {
    1835         dataLogF("ByteDisjunction(%p):\n\t", disjunction);
    1836         for (unsigned i = 0; i < disjunction->terms.size(); ++i)
    1837             dataLogF("{ %d } ", disjunction->terms[i].type);
    1838         dataLogF("\n");
    1839     }
    1840 #endif
    18411837
    18421838    void closeAlternative(int beginTerm)
     
    21122108        }
    21132109    }
     2110#ifndef NDEBUG
     2111    void dumpDisjunction(ByteDisjunction* disjunction, unsigned nesting = 0)
     2112    {
     2113        PrintStream& out = WTF::dataFile();
     2114
     2115        unsigned termIndexNest = 0;
     2116
     2117        if (!nesting) {
     2118            out.printf("ByteDisjunction(%p):\n", disjunction);
     2119            nesting = 1;
     2120        } else {
     2121            termIndexNest = nesting - 1;
     2122            nesting = 2;
     2123        }
     2124
     2125        auto outputTermIndexAndNest = [&](size_t index, unsigned termNesting) {
     2126            for (unsigned nestingDepth = 0; nestingDepth < termIndexNest; nestingDepth++)
     2127                out.print("  ");
     2128            out.printf("%4lu", index);
     2129            for (unsigned nestingDepth = 0; nestingDepth < termNesting; nestingDepth++)
     2130                out.print("  ");
     2131        };
     2132
     2133        auto dumpQuantity = [&](ByteTerm& term) {
     2134            if (term.atom.quantityType == QuantifierFixedCount && term.atom.quantityMinCount == 1 && term.atom.quantityMaxCount == 1)
     2135                return;
     2136
     2137            out.print(" {", term.atom.quantityMinCount);
     2138            if (term.atom.quantityMinCount != term.atom.quantityMaxCount) {
     2139                if (term.atom.quantityMaxCount == UINT_MAX)
     2140                    out.print(",inf");
     2141                else
     2142                    out.print(",", term.atom.quantityMaxCount);
     2143            }
     2144            out.print("}");
     2145            if (term.atom.quantityType == QuantifierGreedy)
     2146                out.print(" greedy");
     2147            else if (term.atom.quantityType == QuantifierNonGreedy)
     2148                out.print(" non-greedy");
     2149        };
     2150
     2151        auto dumpCaptured = [&](ByteTerm& term) {
     2152            if (term.capture())
     2153                out.print(" captured (#", term.atom.subpatternId, ")");
     2154        };
     2155
     2156        auto dumpInverted = [&](ByteTerm& term) {
     2157            if (term.invert())
     2158                out.print(" inverted");
     2159        };
     2160
     2161        auto dumpInputPosition = [&](ByteTerm& term) {
     2162            out.printf(" inputPosition %u", term.inputPosition);
     2163        };
     2164
     2165        auto dumpCharacter = [&](ByteTerm& term) {
     2166            out.print(" ");
     2167            dumpUChar32(out, term.atom.patternCharacter);
     2168        };
     2169
     2170        auto dumpCharClass = [&](ByteTerm& term) {
     2171            out.print(" ");
     2172            dumpCharacterClass(out, &m_pattern, term.atom.characterClass);
     2173        };
     2174
     2175        for (size_t idx = 0; idx < disjunction->terms.size(); ++idx) {
     2176            ByteTerm term = disjunction->terms[idx];
     2177
     2178            bool outputNewline = true;
     2179
     2180            switch (term.type) {
     2181            case ByteTerm::TypeBodyAlternativeBegin:
     2182                outputTermIndexAndNest(idx, nesting++);
     2183                out.print("BodyAlternativeBegin");
     2184                if (term.alternative.onceThrough)
     2185                    out.print(" onceThrough");
     2186                break;
     2187            case ByteTerm::TypeBodyAlternativeDisjunction:
     2188                outputTermIndexAndNest(idx, nesting - 1);
     2189                out.print("BodyAlternativeDisjunction");
     2190                break;
     2191            case ByteTerm::TypeBodyAlternativeEnd:
     2192                outputTermIndexAndNest(idx, --nesting);
     2193                out.print("BodyAlternativeEnd");
     2194                break;
     2195            case ByteTerm::TypeAlternativeBegin:
     2196                outputTermIndexAndNest(idx, nesting++);
     2197                out.print("AlternativeBegin");
     2198                break;
     2199            case ByteTerm::TypeAlternativeDisjunction:
     2200                outputTermIndexAndNest(idx, nesting - 1);
     2201                out.print("AlternativeDisjunction");
     2202                break;
     2203            case ByteTerm::TypeAlternativeEnd:
     2204                outputTermIndexAndNest(idx, --nesting);
     2205                out.print("AlternativeEnd");
     2206                break;
     2207            case ByteTerm::TypeSubpatternBegin:
     2208                outputTermIndexAndNest(idx, nesting++);
     2209                out.print("SubpatternBegin");
     2210                break;
     2211            case ByteTerm::TypeSubpatternEnd:
     2212                outputTermIndexAndNest(idx, --nesting);
     2213                out.print("SubpatternEnd");
     2214                break;
     2215            case ByteTerm::TypeAssertionBOL:
     2216                outputTermIndexAndNest(idx, nesting);
     2217                out.print("AssertionBOL");
     2218                break;
     2219            case ByteTerm::TypeAssertionEOL:
     2220                outputTermIndexAndNest(idx, nesting);
     2221                out.print("AssertionEOL");
     2222                break;
     2223            case ByteTerm::TypeAssertionWordBoundary:
     2224                outputTermIndexAndNest(idx, nesting);
     2225                out.print("AssertionWordBoundary");
     2226                break;
     2227            case ByteTerm::TypePatternCharacterOnce:
     2228                outputTermIndexAndNest(idx, nesting);
     2229                out.print("PatternCharacterOnce");
     2230                dumpInverted(term);
     2231                dumpInputPosition(term);
     2232                dumpCharacter(term);
     2233                dumpQuantity(term);
     2234                break;
     2235            case ByteTerm::TypePatternCharacterFixed:
     2236                outputTermIndexAndNest(idx, nesting);
     2237                out.print("PatternCharacterFixed");
     2238                dumpInverted(term);
     2239                dumpInputPosition(term);
     2240                dumpCharacter(term);
     2241                out.print(" {", term.atom.quantityMinCount, "}");
     2242                break;
     2243            case ByteTerm::TypePatternCharacterGreedy:
     2244                outputTermIndexAndNest(idx, nesting);
     2245                out.print("PatternCharacterGreedy");
     2246                dumpInverted(term);
     2247                dumpInputPosition(term);
     2248                dumpCharacter(term);
     2249                dumpQuantity(term);
     2250                break;
     2251            case ByteTerm::TypePatternCharacterNonGreedy:
     2252                outputTermIndexAndNest(idx, nesting);
     2253                out.print("PatternCharacterNonGreedy");
     2254                dumpInverted(term);
     2255                dumpInputPosition(term);
     2256                dumpCharacter(term);
     2257                dumpQuantity(term);
     2258                break;
     2259            case ByteTerm::TypePatternCasedCharacterOnce:
     2260                outputTermIndexAndNest(idx, nesting);
     2261                out.print("PatternCasedCharacterOnce");
     2262                break;
     2263            case ByteTerm::TypePatternCasedCharacterFixed:
     2264                outputTermIndexAndNest(idx, nesting);
     2265                out.print("PatternCasedCharacterFixed");
     2266                break;
     2267            case ByteTerm::TypePatternCasedCharacterGreedy:
     2268                outputTermIndexAndNest(idx, nesting);
     2269                out.print("PatternCasedCharacterGreedy");
     2270                break;
     2271            case ByteTerm::TypePatternCasedCharacterNonGreedy:
     2272                outputTermIndexAndNest(idx, nesting);
     2273                out.print("PatternCasedCharacterNonGreedy");
     2274                break;
     2275            case ByteTerm::TypeCharacterClass:
     2276                outputTermIndexAndNest(idx, nesting);
     2277                out.print("CharacterClass");
     2278                dumpInverted(term);
     2279                dumpInputPosition(term);
     2280                dumpCharClass(term);
     2281                dumpQuantity(term);
     2282                break;
     2283            case ByteTerm::TypeBackReference:
     2284                outputTermIndexAndNest(idx, nesting);
     2285                out.print("BackReference #", term.atom.subpatternId);
     2286                dumpQuantity(term);
     2287                break;
     2288            case ByteTerm::TypeParenthesesSubpattern:
     2289                outputTermIndexAndNest(idx, nesting);
     2290                out.print("ParenthesesSubpattern");
     2291                dumpCaptured(term);
     2292                dumpInverted(term);
     2293                dumpInputPosition(term);
     2294                dumpQuantity(term);
     2295                out.print("\n");
     2296                outputNewline = false;
     2297                dumpDisjunction(term.atom.parenthesesDisjunction, nesting);
     2298                break;
     2299            case ByteTerm::TypeParenthesesSubpatternOnceBegin:
     2300                outputTermIndexAndNest(idx, nesting++);
     2301                out.print("ParenthesesSubpatternOnceBegin");
     2302                dumpCaptured(term);
     2303                dumpInverted(term);
     2304                dumpInputPosition(term);
     2305                break;
     2306            case ByteTerm::TypeParenthesesSubpatternOnceEnd:
     2307                outputTermIndexAndNest(idx, --nesting);
     2308                out.print("ParenthesesSubpatternOnceEnd");
     2309                break;
     2310            case ByteTerm::TypeParenthesesSubpatternTerminalBegin:
     2311                outputTermIndexAndNest(idx, nesting++);
     2312                out.print("ParenthesesSubpatternTerminalBegin");
     2313                dumpInverted(term);
     2314                dumpInputPosition(term);
     2315                break;
     2316            case ByteTerm::TypeParenthesesSubpatternTerminalEnd:
     2317                outputTermIndexAndNest(idx, --nesting);
     2318                out.print("ParenthesesSubpatternTerminalEnd");
     2319                break;
     2320            case ByteTerm::TypeParentheticalAssertionBegin:
     2321                outputTermIndexAndNest(idx, nesting++);
     2322                out.print("ParentheticalAssertionBegin");
     2323                dumpInverted(term);
     2324                dumpInputPosition(term);
     2325                break;
     2326            case ByteTerm::TypeParentheticalAssertionEnd:
     2327                outputTermIndexAndNest(idx, --nesting);
     2328                out.print("ParentheticalAssertionEnd");
     2329                break;
     2330            case ByteTerm::TypeCheckInput:
     2331                outputTermIndexAndNest(idx, nesting);
     2332                out.print("CheckInput ", term.checkInputCount);
     2333                break;
     2334            case ByteTerm::TypeUncheckInput:
     2335                outputTermIndexAndNest(idx, nesting);
     2336                out.print("UncheckInput ", term.checkInputCount);
     2337                break;
     2338            case ByteTerm::TypeDotStarEnclosure:
     2339                outputTermIndexAndNest(idx, nesting);
     2340                out.print("DotStarEnclosure");
     2341                break;
     2342            }
     2343            if (outputNewline)
     2344                out.print("\n");
     2345        }
     2346    }
     2347#endif
    21142348
    21152349private:
     
    21532387COMPILE_ASSERT(sizeof(BackTrackInfoParentheticalAssertion) == (YarrStackSpaceForBackTrackInfoParentheticalAssertion * sizeof(uintptr_t)), CheckYarrStackSpaceForBackTrackInfoParentheticalAssertion);
    21542388COMPILE_ASSERT(sizeof(BackTrackInfoParenthesesOnce) == (YarrStackSpaceForBackTrackInfoParenthesesOnce * sizeof(uintptr_t)), CheckYarrStackSpaceForBackTrackInfoParenthesesOnce);
    2155 COMPILE_ASSERT(sizeof(Interpreter<UChar>::BackTrackInfoParentheses) == (YarrStackSpaceForBackTrackInfoParentheses * sizeof(uintptr_t)), CheckYarrStackSpaceForBackTrackInfoParentheses);
     2389COMPILE_ASSERT(sizeof(Interpreter<UChar>::BackTrackInfoParentheses) <= (YarrStackSpaceForBackTrackInfoParentheses * sizeof(uintptr_t)), CheckYarrStackSpaceForBackTrackInfoParentheses);
    21562390
    21572391
  • trunk/Source/JavaScriptCore/yarr/YarrJIT.cpp

    r225683 r225695  
    5959#define HAVE_INITIAL_START_REG
    6060#elif CPU(ARM64)
     61    // Argument registers
    6162    static const RegisterID input = ARM64Registers::x0;
    6263    static const RegisterID index = ARM64Registers::x1;
    6364    static const RegisterID length = ARM64Registers::x2;
    6465    static const RegisterID output = ARM64Registers::x3;
    65 
    66     static const RegisterID regT0 = ARM64Registers::x4;
    67     static const RegisterID regT1 = ARM64Registers::x5;
    68     static const RegisterID regUnicodeInputAndTrail = ARM64Registers::x6;
    69     static const RegisterID regUnicodeTemp = ARM64Registers::x7;
    70     static const RegisterID initialStart = ARM64Registers::x8;
    71     static const RegisterID supplementaryPlanesBase = ARM64Registers::x9;
    72     static const RegisterID surrogateTagMask = ARM64Registers::x10;
    73     static const RegisterID leadingSurrogateTag = ARM64Registers::x11;
    74     static const RegisterID trailingSurrogateTag = ARM64Registers::x12;
     66    static const RegisterID freelistRegister = ARM64Registers::x4;
     67    static const RegisterID freelistSizeRegister = ARM64Registers::x5;
     68
     69    // Scratch registers
     70    static const RegisterID regT0 = ARM64Registers::x6;
     71    static const RegisterID regT1 = ARM64Registers::x7;
     72    static const RegisterID regT2 = ARM64Registers::x8;
     73    static const RegisterID remainingMatchCount = ARM64Registers::x9;
     74    static const RegisterID regUnicodeInputAndTrail = ARM64Registers::x10;
     75    static const RegisterID initialStart = ARM64Registers::x11;
     76    static const RegisterID supplementaryPlanesBase = ARM64Registers::x12;
     77    static const RegisterID surrogateTagMask = ARM64Registers::x13;
     78    static const RegisterID leadingSurrogateTag = ARM64Registers::x14;
     79    static const RegisterID trailingSurrogateTag = ARM64Registers::x15;
    7580
    7681    static const RegisterID returnRegister = ARM64Registers::x0;
     
    106111#elif CPU(X86_64)
    107112#if !OS(WINDOWS)
     113    // Argument registers
    108114    static const RegisterID input = X86Registers::edi;
    109115    static const RegisterID index = X86Registers::esi;
    110116    static const RegisterID length = X86Registers::edx;
    111117    static const RegisterID output = X86Registers::ecx;
     118    static const RegisterID freelistRegister = X86Registers::r8;
     119    static const RegisterID freelistSizeRegister = X86Registers::r9; // Only used during initialization.
    112120#else
    113121    // If the return value doesn't fit in 64bits, its destination is pointed by rcx and the parameters are shifted.
     
    120128#endif
    121129
     130    // Scratch registers
    122131    static const RegisterID regT0 = X86Registers::eax;
    123132#if !OS(WINDOWS)
    124     static const RegisterID regT1 = X86Registers::r8;
     133    static const RegisterID regT1 = X86Registers::r9;
     134    static const RegisterID regT2 = X86Registers::r10;
    125135#else
    126136    static const RegisterID regT1 = X86Registers::ecx;
     137    static const RegisterID regT2 = X86Registers::edi;
    127138#endif
    128139
    129140    static const RegisterID initialStart = X86Registers::ebx;
    130141#if !OS(WINDOWS)
    131     static const RegisterID regUnicodeInputAndTrail = X86Registers::r9;
    132     static const RegisterID regUnicodeTemp = X86Registers::r10;
     142    static const RegisterID remainingMatchCount = X86Registers::r12;
    133143#else
    134     static const RegisterID regUnicodeInputAndTrail = X86Registers::esi;
    135     static const RegisterID regUnicodeTemp = X86Registers::edi;
    136 #endif
    137     static const RegisterID supplementaryPlanesBase = X86Registers::r12;
    138     static const RegisterID surrogateTagMask = X86Registers::r13;
     144    static const RegisterID remainingMatchCount = X86Registers::esi;
     145#endif
     146    static const RegisterID regUnicodeInputAndTrail = X86Registers::r13;
    139147    static const RegisterID leadingSurrogateTag = X86Registers::r14;
    140148    static const RegisterID trailingSurrogateTag = X86Registers::r15;
     
    143151    static const RegisterID returnRegister2 = X86Registers::edx;
    144152
     153    const TrustedImm32 supplementaryPlanesBase = TrustedImm32(0x10000);
     154    const TrustedImm32 surrogateTagMask = TrustedImm32(0xfffffc00);
    145155#define HAVE_INITIAL_START_REG
    146156#define JIT_UNICODE_EXPRESSIONS
     157#endif
     158
     159#ifdef JIT_ALL_PARENS_EXPRESSIONS
     160    struct ParenContextSizes {
     161        size_t m_numSubpatterns;
     162        size_t m_frameSlots;
     163
     164        ParenContextSizes(size_t numSubpatterns, size_t frameSlots)
     165            : m_numSubpatterns(numSubpatterns)
     166            , m_frameSlots(frameSlots)
     167        {
     168        }
     169
     170        size_t numSubpatterns() { return m_numSubpatterns; }
     171
     172        size_t frameSlots() { return m_frameSlots; }
     173    };
     174
     175    struct ParenContext {
     176        struct ParenContext* next;
     177        uint32_t begin;
     178        uint32_t matchAmount;
     179        struct Subpatterns {
     180            unsigned start;
     181            unsigned end;
     182        } subpatterns[0];
     183        uintptr_t frameSlots[0];
     184
     185        static size_t sizeFor(ParenContextSizes& parenContextSizes)
     186        {
     187            return sizeof(ParenContext) + sizeof(Subpatterns) * parenContextSizes.numSubpatterns() + sizeof(uintptr_t) * parenContextSizes.frameSlots();
     188        }
     189
     190        static ptrdiff_t nextOffset()
     191        {
     192            return offsetof(ParenContext, next);
     193        }
     194
     195        static ptrdiff_t beginOffset()
     196        {
     197            return offsetof(ParenContext, begin);
     198        }
     199
     200        static ptrdiff_t matchAmountOffset()
     201        {
     202            return offsetof(ParenContext, matchAmount);
     203        }
     204
     205        static ptrdiff_t subpatternOffset(size_t subpattern)
     206        {
     207            return offsetof(ParenContext, subpatterns) + (subpattern - 1) * sizeof(Subpatterns);
     208        }
     209
     210        static ptrdiff_t savedFrameOffset(ParenContextSizes& parenContextSizes)
     211        {
     212            return offsetof(ParenContext, subpatterns) + (parenContextSizes.numSubpatterns()) * sizeof(Subpatterns);
     213        }
     214    };
     215
     216    void initParenContextFreeList()
     217    {
     218        RegisterID parenContextPointer = regT0;
     219        RegisterID nextParenContextPointer = regT2;
     220
     221        size_t parenContextSize = ParenContext::sizeFor(m_parenContextSizes);
     222
     223        parenContextSize = WTF::roundUpToMultipleOf<sizeof(uintptr_t)>(parenContextSize);
     224
     225        // Check that the paren context is a reasonable size.
     226        if (parenContextSize > INT16_MAX)
     227            m_abortExecution.append(jump());
     228
     229        Jump emptyFreeList = branchTestPtr(Zero, freelistRegister);
     230        move(freelistRegister, parenContextPointer);
     231        addPtr(TrustedImm32(parenContextSize), freelistRegister, nextParenContextPointer);
     232        addPtr(freelistRegister, freelistSizeRegister);
     233        subPtr(TrustedImm32(parenContextSize), freelistSizeRegister);
     234
     235        Label loopTop(this);
     236        Jump initDone = branchPtr(Above, nextParenContextPointer, freelistSizeRegister);
     237        storePtr(nextParenContextPointer, Address(parenContextPointer, ParenContext::nextOffset()));
     238        move(nextParenContextPointer, parenContextPointer);
     239        addPtr(TrustedImm32(parenContextSize), parenContextPointer, nextParenContextPointer);
     240        jump(loopTop);
     241
     242        initDone.link(this);
     243        storePtr(TrustedImmPtr(0), Address(parenContextPointer, ParenContext::nextOffset()));
     244        emptyFreeList.link(this);
     245    }
     246
     247    void allocatePatternContext(RegisterID result)
     248    {
     249        m_abortExecution.append(branchTestPtr(Zero, freelistRegister));
     250        sub32(TrustedImm32(1), remainingMatchCount);
     251        m_hitMatchLimit.append(branchTestPtr(Zero, remainingMatchCount));
     252        move(freelistRegister, result);
     253        loadPtr(Address(freelistRegister, ParenContext::nextOffset()), freelistRegister);
     254    }
     255
     256    void freePatternContext(RegisterID headPtrRegister, RegisterID newHeadPtrRegister)
     257    {
     258        loadPtr(Address(headPtrRegister, ParenContext::nextOffset()), newHeadPtrRegister);
     259        storePtr(freelistRegister, Address(headPtrRegister, ParenContext::nextOffset()));
     260        move(headPtrRegister, freelistRegister);
     261    }
     262
     263    void savePatternContext(RegisterID patternContextReg, RegisterID tempReg, unsigned firstSubpattern, unsigned lastSubpattern, unsigned subpatternBaseFrameLocation)
     264    {
     265        store32(index, Address(patternContextReg, ParenContext::beginOffset()));
     266        loadFromFrame(subpatternBaseFrameLocation + BackTrackInfoParentheses::matchAmountIndex(), tempReg);
     267        store32(tempReg, Address(patternContextReg, ParenContext::matchAmountOffset()));
     268        if (compileMode == IncludeSubpatterns) {
     269            for (unsigned subpattern = firstSubpattern; subpattern <= lastSubpattern; subpattern++) {
     270                loadPtr(Address(output, (subpattern << 1) * sizeof(unsigned)), tempReg);
     271                storePtr(tempReg, Address(patternContextReg, ParenContext::subpatternOffset(subpattern)));
     272                clearSubpatternStart(subpattern);
     273            }
     274        }
     275        subpatternBaseFrameLocation += YarrStackSpaceForBackTrackInfoParentheses;
     276        for (unsigned frameLocation = subpatternBaseFrameLocation; frameLocation < m_parenContextSizes.frameSlots(); frameLocation++) {
     277            loadFromFrame(frameLocation, tempReg);
     278            storePtr(tempReg, Address(patternContextReg, ParenContext::savedFrameOffset(m_parenContextSizes) + frameLocation * sizeof(uintptr_t)));
     279        }
     280    }
     281
     282    void restorePatternContext(RegisterID patternContextReg, RegisterID tempReg, unsigned firstSubpattern, unsigned lastSubpattern, unsigned subpatternBaseFrameLocation)
     283    {
     284        load32(Address(patternContextReg, ParenContext::beginOffset()), index);
     285        storeToFrame(index, subpatternBaseFrameLocation + BackTrackInfoParentheses::beginIndex());
     286        load32(Address(patternContextReg, ParenContext::matchAmountOffset()), tempReg);
     287        storeToFrame(tempReg, subpatternBaseFrameLocation + BackTrackInfoParentheses::matchAmountIndex());
     288        if (compileMode == IncludeSubpatterns) {
     289            for (unsigned subpattern = firstSubpattern; subpattern <= lastSubpattern; subpattern++) {
     290                loadPtr(Address(patternContextReg, ParenContext::subpatternOffset(subpattern)), tempReg);
     291                storePtr(tempReg, Address(output, (subpattern << 1) * sizeof(unsigned)));
     292            }
     293        }
     294        subpatternBaseFrameLocation += YarrStackSpaceForBackTrackInfoParentheses;
     295        for (unsigned frameLocation = subpatternBaseFrameLocation; frameLocation < m_parenContextSizes.frameSlots(); frameLocation++) {
     296            loadPtr(Address(patternContextReg, ParenContext::savedFrameOffset(m_parenContextSizes) + frameLocation * sizeof(uintptr_t)), tempReg);
     297            storeToFrame(tempReg, frameLocation);
     298        }
     299    }
    147300#endif
    148301
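Reviewer sketch (not part of the changeset): the free-list handling that initParenContextFreeList(), allocatePatternContext() and freePatternContext() emit as machine code can be modelled in plain C++ roughly as below. FreeNode, initFreeList, allocate and release are hypothetical names, the size guard on the buffer is an addition of this sketch, and the match-count limit is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>
#include <vector>

// Hypothetical stand-in for ParenContext; only the link field matters here.
struct FreeNode {
    FreeNode* next;
};

// Thread fixed-size blocks of a raw buffer into a singly linked list.
// Precondition: buffer is suitably aligned and blockSize is a multiple of
// alignof(FreeNode). Returns nullptr when no block fits (guard added here).
inline FreeNode* initFreeList(void* buffer, size_t bufferSize, size_t blockSize)
{
    assert(blockSize >= sizeof(FreeNode));
    if (!buffer || bufferSize < blockSize)
        return nullptr;

    char* base = static_cast<char*>(buffer);
    char* lastStart = base + bufferSize - blockSize; // last address a block may start at

    FreeNode* head = new (base) FreeNode { nullptr };
    FreeNode* current = head;
    for (char* next = base + blockSize; next <= lastStart; next += blockSize) {
        FreeNode* node = new (next) FreeNode { nullptr };
        current->next = node;
        current = node;
    }
    return head;
}

// Pop the head block. nullptr means the pool is exhausted; the JIT'ed code
// takes its abort path in that situation instead of returning a pointer.
inline FreeNode* allocate(FreeNode*& freeList)
{
    FreeNode* result = freeList;
    if (result)
        freeList = result->next;
    return result;
}

// Push a block back onto the head of the free list.
inline void release(FreeNode*& freeList, FreeNode* node)
{
    node->next = freeList;
    freeList = node;
}
```

Because allocation and release only touch the list head, each costs a single load or store plus a register move, which is presumably why a free list was chosen for code emitted inline.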
     
    355508        JumpList notUnicode;
    356509        load16Unaligned(regUnicodeInputAndTrail, resultReg);
    357         and32(surrogateTagMask, resultReg, regUnicodeTemp);
    358         notUnicode.append(branch32(NotEqual, regUnicodeTemp, leadingSurrogateTag));
     510        and32(surrogateTagMask, resultReg, regT2);
     511        notUnicode.append(branch32(NotEqual, regT2, leadingSurrogateTag));
    359512        addPtr(TrustedImm32(2), regUnicodeInputAndTrail);
    360         getEffectiveAddress(BaseIndex(input, length, TimesTwo), regUnicodeTemp);
    361         notUnicode.append(branchPtr(AboveOrEqual, regUnicodeInputAndTrail, regUnicodeTemp));
     513        getEffectiveAddress(BaseIndex(input, length, TimesTwo), regT2);
     514        notUnicode.append(branch32(AboveOrEqual, regUnicodeInputAndTrail, regT2));
    362515        load16Unaligned(Address(regUnicodeInputAndTrail), regUnicodeInputAndTrail);
    363         and32(surrogateTagMask, regUnicodeInputAndTrail, regUnicodeTemp);
    364         notUnicode.append(branch32(NotEqual, regUnicodeTemp, trailingSurrogateTag));
     516        and32(surrogateTagMask, regUnicodeInputAndTrail, regT2);
     517        notUnicode.append(branch32(NotEqual, regT2, trailingSurrogateTag));
    365518        sub32(leadingSurrogateTag, resultReg);
    366519        sub32(trailingSurrogateTag, regUnicodeInputAndTrail);
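Reviewer sketch (not part of the changeset): the checks in this hunk correspond to ordinary UTF-16 surrogate-pair decoding. The version below uses the two constants introduced by the patch (0x10000 and 0xfffffc00). The tag values 0xd800 and 0xdc00 are the standard UTF-16 ones and are assumed here, since the hunk does not show how the tag registers are loaded. The final shift-and-add is the standard combination and is likewise assumed for the lines the hunk omits. DecodedCharacter and readCharacter are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr uint32_t supplementaryPlanesBase = 0x10000; // from the patch
constexpr uint32_t surrogateTagMask = 0xfffffc00;     // from the patch
constexpr uint32_t leadingSurrogateTag = 0xd800;      // assumed: standard UTF-16 value
constexpr uint32_t trailingSurrogateTag = 0xdc00;     // assumed: standard UTF-16 value

struct DecodedCharacter {
    uint32_t codePoint;
    size_t unitsConsumed;
};

// Read one character starting at input[index]. Precondition: index < length.
// Falls back to the single code unit whenever a full pair is not present.
inline DecodedCharacter readCharacter(const char16_t* input, size_t length, size_t index)
{
    uint32_t lead = input[index];
    if ((lead & surrogateTagMask) != leadingSurrogateTag)
        return { lead, 1 }; // not a leading surrogate
    if (index + 1 >= length)
        return { lead, 1 }; // no room for a trailing unit
    uint32_t trail = input[index + 1];
    if ((trail & surrogateTagMask) != trailingSurrogateTag)
        return { lead, 1 }; // unpaired leading surrogate

    uint32_t codePoint = ((lead - leadingSurrogateTag) << 10)
        + (trail - trailingSurrogateTag) + supplementaryPlanesBase;
    return { codePoint, 2 };
}
```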
     
    422575        poke(imm, frameLocation);
    423576    }
     577
     578#if CPU(ARM64) || CPU(X86_64)
     579    void storeToFrame(TrustedImmPtr imm, unsigned frameLocation)
     580    {
     581        poke(imm, frameLocation);
     582    }
     583#endif
    424584
    425585    DataLabelPtr storeToFrameWithPatch(unsigned frameLocation)
     
    468628    }
    469629
    470     // Used to record subpatters, should only be called if compileMode is IncludeSubpatterns.
     630    void generateJITFailReturn()
     631    {
     632        if (m_abortExecution.empty() && m_hitMatchLimit.empty())
     633            return;
     634
     635        JumpList finishExiting;
     636        if (!m_abortExecution.empty()) {
     637            m_abortExecution.link(this);
     638            move(TrustedImmPtr((void*)static_cast<size_t>(-2)), returnRegister);
     639            finishExiting.append(jump());
     640        }
     641
     642        if (!m_hitMatchLimit.empty()) {
     643            m_hitMatchLimit.link(this);
     644            move(TrustedImmPtr((void*)static_cast<size_t>(-1)), returnRegister);
     645        }
     646
     647        finishExiting.link(this);
     648        removeCallFrame();
     649        move(TrustedImm32(0), returnRegister2);
     650        generateReturn();
     651    }
     652
     653    // Used to record subpatterns, should only be called if compileMode is IncludeSubpatterns.
    471654    void setSubpatternStart(RegisterID reg, unsigned subpattern)
    472655    {
     
    486669        // FIXME: should be able to ASSERT(compileMode == IncludeSubpatterns), but then this function is conditionally NORETURN. :-(
    487670        store32(TrustedImm32(-1), Address(output, (subpattern << 1) * sizeof(int)));
     671    }
     672
     673    void clearMatches(unsigned subpattern, unsigned lastSubpattern)
     674    {
     675        for (; subpattern <= lastSubpattern; subpattern++)
     676            clearSubpatternStart(subpattern);
    488677    }
    489678
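Reviewer sketch (not part of the changeset): generateJITFailReturn() places -2 in returnRegister on the m_abortExecution path, which is the path taken when the ParenContext pool is exhausted. The commit message says the caller then byte-compiles and interprets the expression. A minimal model of that fallback is shown below. matchWithFallback and the two callbacks are hypothetical names, and treating -2 as the JIT-failure sentinel is inferred from the hunk rather than taken from a header.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>

// Value moved into returnRegister on the m_abortExecution path in this hunk.
constexpr intptr_t jitCodeFailure = -2;

// Hypothetical driver: try the JIT'ed matcher first and rerun the match with
// the interpreter only when the JIT reports that it could not finish.
inline intptr_t matchWithFallback(
    const std::function<intptr_t()>& runJIT,
    const std::function<intptr_t()>& runInterpreter,
    bool& usedInterpreter)
{
    usedInterpreter = false;
    intptr_t result = runJIT();
    if (result != jitCodeFailure)
        return result; // any other value is passed through unchanged
    usedInterpreter = true;
    return runInterpreter();
}
```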
     
    530719        OpNestedAlternativeEnd,
    531720        // Used for alternatives in subpatterns where there is only a single
    532         // alternative (backtrackingis easier in these cases), or for alternatives
     721        // alternative (backtracking is easier in these cases), or for alternatives
    533722        // which never need to be backtracked (those in parenthetical assertions,
    534723        // terminal subpatterns).
     
    542731        OpParenthesesSubpatternTerminalBegin,
    543732        OpParenthesesSubpatternTerminalEnd,
     733        // Used to wrap generic captured matches
     734        OpParenthesesSubpatternBegin,
     735        OpParenthesesSubpatternEnd,
    544736        // Used to wrap parenthetical assertions.
    545737        OpParentheticalAssertionBegin,
     
    17691961                if (op.m_op == OpNestedAlternativeNext) {
    17701962                    unsigned parenthesesFrameLocation = term->frameLocation;
    1771                     unsigned alternativeFrameLocation = parenthesesFrameLocation;
    1772                     if (term->quantityType != QuantifierFixedCount)
    1773                         alternativeFrameLocation += YarrStackSpaceForBackTrackInfoParenthesesOnce;
    1774                     op.m_returnAddress = storeToFrameWithPatch(alternativeFrameLocation);
     1963                    op.m_returnAddress = storeToFrameWithPatch(parenthesesFrameLocation + BackTrackInfoParentheses::returnAddressIndex());
    17751964                }
    17761965
     
    18192008                if (op.m_op == OpNestedAlternativeEnd) {
    18202009                    unsigned parenthesesFrameLocation = term->frameLocation;
    1821                     unsigned alternativeFrameLocation = parenthesesFrameLocation;
    1822                     if (term->quantityType != QuantifierFixedCount)
    1823                         alternativeFrameLocation += YarrStackSpaceForBackTrackInfoParenthesesOnce;
    1824                     op.m_returnAddress = storeToFrameWithPatch(alternativeFrameLocation);
     2010                    op.m_returnAddress = storeToFrameWithPatch(parenthesesFrameLocation + BackTrackInfoParentheses::returnAddressIndex());
    18252011                }
    18262012
     
    19642150                }
    19652151
    1966                 // We know that the match is non-zero, we can accept it  and
     2152                // We know that the match is non-zero, we can accept it and
    19672153                // loop back up to the head of the subpattern.
    19682154                jump(beginOp.m_reentry);
     
    19712157                // do so once the subpattern cannot match any more.
    19722158                op.m_reentry = label();
     2159                break;
     2160            }
     2161
     2162            // OpParenthesesSubpatternBegin/End
     2163            //
     2164            // These nodes support generic subpatterns.
     2165            case OpParenthesesSubpatternBegin: {
     2166#ifdef JIT_ALL_PARENS_EXPRESSIONS
     2167                PatternTerm* term = op.m_term;
     2168                unsigned parenthesesFrameLocation = term->frameLocation;
     2169
     2170                // Upon entry to a Greedy quantified set of parentheses, store the index.
     2171                // We'll use this for two purposes:
     2172                //  - To indicate which iteration we are on of matching the remainder of
     2173                //    the expression after the parentheses - the first, including the
     2174                //    match within the parentheses, or the second having skipped over them.
     2175                //  - To check for empty matches, which must be rejected.
     2176                //
     2177                // At the head of a NonGreedy set of parentheses we'll immediately set the
     2178                // value on the stack to -1 (indicating a match skipping the subpattern),
     2179                // and plant a jump to the end. We'll also plant a label to backtrack to
     2180                // to reenter the subpattern later, with a store to set up index on the
     2181                // second iteration.
     2182                //
     2183                // FIXME: for capturing parens, could use the index in the capture array?
     2184                if (term->quantityType == QuantifierGreedy || term->quantityType == QuantifierNonGreedy) {
     2185                    storeToFrame(TrustedImm32(0), parenthesesFrameLocation + BackTrackInfoParentheses::matchAmountIndex());
     2186                    storeToFrame(TrustedImmPtr(0), parenthesesFrameLocation + BackTrackInfoParentheses::patternContextHeadIndex());
     2187
     2188                    if (term->quantityType == QuantifierNonGreedy) {
     2189                        storeToFrame(TrustedImm32(-1), parenthesesFrameLocation + BackTrackInfoParentheses::beginIndex());
     2190                        op.m_jumps.append(jump());
     2191                    }
     2192                   
     2193                    op.m_reentry = label();
     2194                    RegisterID currPatternContextReg = regT0;
     2195                    RegisterID newPatternContextReg = regT1;
     2196
     2197                    loadFromFrame(parenthesesFrameLocation + BackTrackInfoParentheses::patternContextHeadIndex(), currPatternContextReg);
     2198                    allocatePatternContext(newPatternContextReg);
     2199                    storePtr(currPatternContextReg, newPatternContextReg);
     2200                    storeToFrame(newPatternContextReg, parenthesesFrameLocation + BackTrackInfoParentheses::patternContextHeadIndex());
     2201                    savePatternContext(newPatternContextReg, regT2, term->parentheses.subpatternId, term->parentheses.lastSubpatternId, parenthesesFrameLocation);
     2202                    storeToFrame(index, parenthesesFrameLocation + BackTrackInfoParentheses::beginIndex());
     2203                }
     2204
     2205                // If the parentheses are capturing, store the starting index value to the
     2206                // captures array, offsetting as necessary.
     2207                //
     2208                // FIXME: could avoid offsetting this value in JIT code, apply
     2209                // offsets only afterwards, at the point the results array is
     2210                // being accessed.
     2211                if (term->capture() && compileMode == IncludeSubpatterns) {
     2212                    const RegisterID indexTemporary = regT0;
     2213                    unsigned inputOffset = (m_checkedOffset - term->inputPosition).unsafeGet();
     2214                    if (term->quantityType == QuantifierFixedCount)
     2215                        inputOffset += term->parentheses.disjunction->m_minimumSize;
     2216                    if (inputOffset) {
     2217                        move(index, indexTemporary);
     2218                        sub32(Imm32(inputOffset), indexTemporary);
     2219                        setSubpatternStart(indexTemporary, term->parentheses.subpatternId);
     2220                    } else
     2221                        setSubpatternStart(index, term->parentheses.subpatternId);
     2222                }
     2223#else // !JIT_ALL_PARENS_EXPRESSIONS
     2224                RELEASE_ASSERT_NOT_REACHED();
     2225#endif
     2226                break;
     2227            }
     2228            case OpParenthesesSubpatternEnd: {
     2229#ifdef JIT_ALL_PARENS_EXPRESSIONS
     2230                PatternTerm* term = op.m_term;
     2231                unsigned parenthesesFrameLocation = term->frameLocation;
     2232
     2233                // Runtime ASSERT to make sure that the nested alternative handled the
     2234                // "no input consumed" check.
     2235                if (!ASSERT_DISABLED && term->quantityType != QuantifierFixedCount && !term->parentheses.disjunction->m_minimumSize) {
     2236                    Jump pastBreakpoint;
     2237                    pastBreakpoint = branch32(NotEqual, index, Address(stackPointerRegister, parenthesesFrameLocation * sizeof(void*)));
     2238                    abortWithReason(YARRNoInputConsumed);
     2239                    pastBreakpoint.link(this);
     2240                }
     2241
     2242                const RegisterID countTemporary = regT1;
     2243
     2244                YarrOp& beginOp = m_ops[op.m_previousOp];
     2245                loadFromFrame(parenthesesFrameLocation + BackTrackInfoParentheses::matchAmountIndex(), countTemporary);
     2246                add32(TrustedImm32(1), countTemporary);
     2247                storeToFrame(countTemporary, parenthesesFrameLocation + BackTrackInfoParentheses::matchAmountIndex());
     2248
     2249                // If the parentheses are capturing, store the ending index value to the
     2250                // captures array, offsetting as necessary.
     2251                //
     2252                // FIXME: could avoid offsetting this value in JIT code, apply
     2253                // offsets only afterwards, at the point the results array is
     2254                // being accessed.
     2255                if (term->capture() && compileMode == IncludeSubpatterns) {
     2256                    const RegisterID indexTemporary = regT0;
     2257                   
     2258                    unsigned inputOffset = (m_checkedOffset - term->inputPosition).unsafeGet();
     2259                    if (inputOffset) {
     2260                        move(index, indexTemporary);
     2261                        sub32(Imm32(inputOffset), indexTemporary);
     2262                        setSubpatternEnd(indexTemporary, term->parentheses.subpatternId);
     2263                    } else
     2264                        setSubpatternEnd(index, term->parentheses.subpatternId);
     2265                }
     2266
     2267                // If the parentheses are quantified Greedy then add a label to jump back
     2268                // to if we get a failed match from after the parentheses. For NonGreedy
     2269                // parentheses, link the jump from before the subpattern to here.
     2270                if (term->quantityType == QuantifierGreedy) {
     2271                    if (term->quantityMaxCount != quantifyInfinite)
     2272                        branch32(Below, countTemporary, Imm32(term->quantityMaxCount.unsafeGet())).linkTo(beginOp.m_reentry, this);
     2273                    else
     2274                        jump(beginOp.m_reentry);
     2275                   
     2276                    op.m_reentry = label();
     2277                } else if (term->quantityType == QuantifierNonGreedy) {
     2278                    YarrOp& beginOp = m_ops[op.m_previousOp];
     2279                    beginOp.m_jumps.link(this);
     2280                }
     2281#else // !JIT_ALL_PARENS_EXPRESSIONS
     2282                RELEASE_ASSERT_NOT_REACHED();
     2283#endif
    19732284                break;
    19742285            }
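Reviewer sketch (not part of the changeset): the Begin/End pair above saves a context per iteration of the group and restores the most recent one when the remainder of the pattern fails. The simplified model below shows that idea for a group that consumes one character per iteration. SavedContext and matchGreedyGroup are hypothetical names. Unlike the JIT, this sketch saves a context only after an iteration succeeds and ignores captures, frame slots and quantityMaxCount.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical analogue of the begin/matchAmount fields of ParenContext.
struct SavedContext {
    size_t begin;
    unsigned matchAmount;
};

constexpr size_t noMatch = static_cast<size_t>(-1);

// Match (c1|c2|...)* followed by a literal suffix, starting at index.
// bodyChars lists the single characters the group accepts.
// Returns the index just past the match, or noMatch.
inline size_t matchGreedyGroup(const std::string& input, size_t index,
    const std::string& bodyChars, const std::string& suffix)
{
    std::vector<SavedContext> contexts;
    unsigned matchAmount = 0;

    // Forward pass: take as many iterations as possible, saving state each time.
    while (index < input.size() && bodyChars.find(input[index]) != std::string::npos) {
        contexts.push_back({ index, matchAmount });
        ++index;
        ++matchAmount;
    }

    // Try the remainder; on failure give back one iteration and retry.
    for (;;) {
        if (input.compare(index, suffix.size(), suffix) == 0)
            return index + suffix.size();
        if (contexts.empty())
            return noMatch;
        index = contexts.back().begin;
        matchAmount = contexts.back().matchAmount;
        contexts.pop_back();
    }
}
```

For the input "qabbq" with the group accepting 'a' and 'b' and the suffix "bq", the forward pass consumes "abb", the suffix fails at the final 'q', and restoring one saved context lets the suffix match.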
     
    23922703                    // Plant a jump to the return address.
    23932704                    unsigned parenthesesFrameLocation = term->frameLocation;
    2394                     unsigned alternativeFrameLocation = parenthesesFrameLocation;
    2395                     if (term->quantityType != QuantifierFixedCount)
    2396                         alternativeFrameLocation += YarrStackSpaceForBackTrackInfoParenthesesOnce;
    2397                     loadFromFrameAndJump(alternativeFrameLocation);
     2705                    loadFromFrameAndJump(parenthesesFrameLocation + BackTrackInfoParentheses::returnAddressIndex());
    23982706
    23992707                    // Link the DataLabelPtr associated with the end of the last
     
    24262734                ASSERT(term->quantityMaxCount == 1);
    24272735
    2428                 // We only need to backtrack to thispoint if capturing or greedy.
     2736                // We only need to backtrack to this point if capturing or greedy.
    24292737                if ((term->capture() && compileMode == IncludeSubpatterns) || term->quantityType == QuantifierGreedy) {
    24302738                    m_backtrackingState.link(this);
     
    24602768                    // (in which case the flag value on the stack will be -1).
    24612769                    unsigned parenthesesFrameLocation = term->frameLocation;
    2462                     Jump hadSkipped = branch32(Equal, Address(stackPointerRegister, parenthesesFrameLocation * sizeof(void*)), TrustedImm32(-1));
     2770                    Jump hadSkipped = branch32(Equal, Address(stackPointerRegister, (parenthesesFrameLocation + BackTrackInfoParenthesesOnce::beginIndex()) * sizeof(void*)), TrustedImm32(-1));
    24632771
    24642772                    if (term->quantityType == QuantifierGreedy) {
     
    25042812                break;
    25052813
     2814            // OpParenthesesSubpatternBegin/End
     2815            //
     2816            // When we are backtracking back out of a capturing subpattern we need
     2817            // to clear the start index in the matches output array, to record that
     2818            // this subpattern has not been captured.
     2819            //
     2820            // When backtracking back out of a Greedy quantified subpattern we need
     2821            // to catch this, and try running the remainder of the alternative after
     2822            // the subpattern again, skipping the parentheses.
     2823            //
     2824            // Upon backtracking back into a quantified set of parentheses we need to
     2825            // check whether we were currently skipping the subpattern. If not, we
     2826            // can backtrack into them, if we were we need to either backtrack back
     2827            // out of the start of the parentheses, or jump back to the forwards
     2828            // matching start, depending on whether the match is Greedy or NonGreedy.
     2829            case OpParenthesesSubpatternBegin: {
     2830#ifdef JIT_ALL_PARENS_EXPRESSIONS
     2831                PatternTerm* term = op.m_term;
     2832                unsigned parenthesesFrameLocation = term->frameLocation;
     2833
     2834                if (term->quantityType == QuantifierGreedy) {
     2835                    m_backtrackingState.link(this);
     2836
     2837                    if (term->quantityType == QuantifierGreedy) {
     2838                        RegisterID currPatternContextReg = regT0;
     2839                        RegisterID newPatternContextReg = regT1;
     2840
     2841                        loadFromFrame(parenthesesFrameLocation + BackTrackInfoParentheses::patternContextHeadIndex(), currPatternContextReg);
     2842
     2843                        restorePatternContext(currPatternContextReg, regT2, term->parentheses.subpatternId, term->parentheses.lastSubpatternId, parenthesesFrameLocation);
     2844
     2845                        freePatternContext(currPatternContextReg, newPatternContextReg);
     2846                        storeToFrame(newPatternContextReg, parenthesesFrameLocation + BackTrackInfoParentheses::patternContextHeadIndex());
     2847                        const RegisterID countTemporary = regT0;
     2848                        loadFromFrame(parenthesesFrameLocation + BackTrackInfoParentheses::matchAmountIndex(), countTemporary);
     2849                        Jump zeroLengthMatch = branchTest32(Zero, countTemporary);
     2850
     2851                        sub32(TrustedImm32(1), countTemporary);
     2852                        storeToFrame(countTemporary, parenthesesFrameLocation + BackTrackInfoParentheses::matchAmountIndex());
     2853
     2854                        jump(m_ops[op.m_nextOp].m_reentry);
     2855
     2856                        zeroLengthMatch.link(this);
     2857
     2858                        // Clear the flag in the stackframe indicating we didn't run through the subpattern.
     2859                        storeToFrame(TrustedImm32(-1), parenthesesFrameLocation + BackTrackInfoParentheses::beginIndex());
     2860
     2861                        jump(m_ops[op.m_nextOp].m_reentry);
     2862                    }
     2863
     2864                    // If Greedy, jump to the end.
     2865                    if (term->quantityType == QuantifierGreedy) {
     2866                        // A backtrack from after the parentheses, when skipping the subpattern,
     2867                        // will jump back to here.
     2868                        op.m_jumps.link(this);
     2869                    }
     2870
     2871                    m_backtrackingState.fallthrough();
     2872                }
     2873#else // !JIT_ALL_PARENS_EXPRESSIONS
     2874                RELEASE_ASSERT_NOT_REACHED();
     2875#endif
     2876                break;
     2877            }
     2878            case OpParenthesesSubpatternEnd: {
     2879#ifdef JIT_ALL_PARENS_EXPRESSIONS
     2880                PatternTerm* term = op.m_term;
     2881
     2882                if (term->quantityType != QuantifierFixedCount) {
     2883                    m_backtrackingState.link(this);
     2884
     2885                    // Check whether we should backtrack back into the parentheses, or if we
     2886                    // are currently in a state where we had skipped over the subpattern
     2887                    // (in which case the flag value on the stack will be -1).
     2888                    unsigned parenthesesFrameLocation = term->frameLocation;
     2889                    Jump hadSkipped = branch32(Equal, Address(stackPointerRegister, (parenthesesFrameLocation  + BackTrackInfoParentheses::beginIndex()) * sizeof(void*)), TrustedImm32(-1));
     2890
     2891                    if (term->quantityType == QuantifierGreedy) {
     2892                        // For Greedy parentheses, we skip after having already tried going
     2893                        // through the subpattern, so if we get here we're done.
     2894                        YarrOp& beginOp = m_ops[op.m_previousOp];
     2895                        beginOp.m_jumps.append(hadSkipped);
     2896                    } else {
     2897                        // For NonGreedy parentheses, we try skipping the subpattern first,
     2898                        // so if we get here we need to try running through the subpattern
     2899                        // next. Jump back to the start of the parentheses in the forwards
     2900                        // matching path.
     2901                        ASSERT(term->quantityType == QuantifierNonGreedy);
     2902                        YarrOp& beginOp = m_ops[op.m_previousOp];
     2903                        hadSkipped.linkTo(beginOp.m_reentry, this);
     2904                    }
     2905
     2906                    m_backtrackingState.fallthrough();
     2907                }
     2908
     2909                m_backtrackingState.append(op.m_jumps);
     2910#else // !JIT_ALL_PARENS_EXPRESSIONS
     2911                RELEASE_ASSERT_NOT_REACHED();
     2912#endif
     2913                break;
     2914            }
     2915
    25062916            // OpParentheticalAssertionBegin/End
    25072917            case OpParentheticalAssertionBegin: {
     
    25632973    // of a set of alternatives wrapped in an outer set of nodes for
    25642974    // the parentheses.
    2565     // Supported types of parentheses are 'Once' (quantityMaxCount == 1)
    2566     // and 'Terminal' (non-capturing parentheses quantified as greedy
    2567     // and infinite).
     2975    // Supported types of parentheses are 'Once' (quantityMaxCount == 1),
     2976    // 'Terminal' (non-capturing parentheses quantified as greedy
     2977    // and infinite), and 0 based greedy quantified parentheses.
    25682978    // Alternatives will use the 'Simple' set of ops if either the
    25692979    // subpattern is terminal (in which case we will never need to
     
    25852995        // failure in the second.
    25862996        if (term->quantityMinCount && term->quantityMinCount != term->quantityMaxCount) {
     2997            if (Options::dumpCompiledRegExpPatterns())
     2998                dataLogF("Can't JIT a variable counted parenthesis with a non-zero minimum\n");
    25872999            m_shouldFallBack = true;
    25883000            return;
     
    26033015            parenthesesEndOpCode = OpParenthesesSubpatternTerminalEnd;
    26043016        } else {
     3017#ifdef JIT_ALL_PARENS_EXPRESSIONS
     3018            // We only handle generic parenthesis with greedy counts.
     3019            if (term->quantityType != QuantifierGreedy) {
     3020                // This subpattern is not supported by the JIT.
     3021                m_shouldFallBack = true;
     3022                return;
     3023            }
     3024
     3025            m_containsNestedSubpatterns = true;
     3026
     3027            // Select the 'Generic' nodes.
     3028            parenthesesBeginOpCode = OpParenthesesSubpatternBegin;
     3029            parenthesesEndOpCode = OpParenthesesSubpatternEnd;
     3030
     3031            // If there is more than one alternative we cannot use the 'simple' nodes.
     3032            if (term->parentheses.disjunction->m_alternatives.size() != 1) {
     3033                alternativeBeginOpCode = OpNestedAlternativeBegin;
     3034                alternativeNextOpCode = OpNestedAlternativeNext;
     3035                alternativeEndOpCode = OpNestedAlternativeEnd;
     3036            }
     3037#else
    26053038            // This subpattern is not supported by the JIT.
    26063039            m_shouldFallBack = true;
    26073040            return;
     3041#endif
    26083042        }
    26093043
     
    28323266            push(X86Registers::ebx);
    28333267
    2834         if (m_decodeSurrogatePairs) {
     3268#ifdef JIT_ALL_PARENS_EXPRESSIONS
     3269        if (m_containsNestedSubpatterns) {
    28353270#if OS(WINDOWS)
    28363271            push(X86Registers::edi);
     
    28383273#endif
    28393274            push(X86Registers::r12);
     3275        }
     3276#endif
     3277
     3278        if (m_decodeSurrogatePairs) {
    28403279            push(X86Registers::r13);
    28413280            push(X86Registers::r14);
    28423281            push(X86Registers::r15);
    28433282
    2844             move(TrustedImm32(0x10000), supplementaryPlanesBase);
    2845             move(TrustedImm32(0xfffffc00), surrogateTagMask);
    28463283            move(TrustedImm32(0xd800), leadingSurrogateTag);
    28473284            move(TrustedImm32(0xdc00), trailingSurrogateTag);
     
    29133350            pop(X86Registers::r14);
    29143351            pop(X86Registers::r13);
     3352        }
     3353
     3354#ifdef JIT_ALL_PARENS_EXPRESSIONS
     3355        if (m_containsNestedSubpatterns) {
    29153356            pop(X86Registers::r12);
    29163357#if OS(WINDOWS)
     
    29193360#endif
    29203361        }
     3362#endif
    29213363
    29223364        if (m_pattern.m_saveInitialStartValue)
     
    29503392        , m_unicodeIgnoreCase(m_pattern.unicode() && m_pattern.ignoreCase())
    29513393        , m_canonicalMode(m_pattern.unicode() ? CanonicalMode::Unicode : CanonicalMode::UCS2)
     3394#ifdef JIT_ALL_PARENS_EXPRESSIONS
     3395        , m_containsNestedSubpatterns(false)
     3396        , m_parenContextSizes(compileMode == IncludeSubpatterns ? m_pattern.m_numSubpatterns : 0, m_pattern.m_body->m_callFrameSize)
     3397#endif
    29523398    {
    29533399    }
     
    29623408#endif
    29633409
     3410        // We need to compile before generating code since we set flags based on compilation that
     3411        // are used during generation.
     3412        opCompileBody(m_pattern.m_body);
     3413       
     3414        if (m_shouldFallBack) {
     3415            jitObject.setFallBack(true);
     3416            return;
     3417        }
     3418       
    29643419        generateEnter();
    29653420
     
    29683423        hasInput.link(this);
    29693424
     3425#ifdef JIT_ALL_PARENS_EXPRESSIONS
     3426        if (m_containsNestedSubpatterns)
     3427            move(TrustedImm32(matchLimit), remainingMatchCount);
     3428#endif
     3429
    29703430        if (compileMode == IncludeSubpatterns) {
    29713431            for (unsigned i = 0; i < m_pattern.m_numSubpatterns + 1; ++i)
     
    29783438        initCallFrame();
    29793439
     3440#ifdef JIT_ALL_PARENS_EXPRESSIONS
     3441        if (m_containsNestedSubpatterns)
     3442            initParenContextFreeList();
     3443#endif
     3444       
    29803445        if (m_pattern.m_saveInitialStartValue) {
    29813446#ifdef HAVE_INITIAL_START_REG
     
    29863451        }
    29873452
    2988         opCompileBody(m_pattern.m_body);
    2989 
    2990         if (m_shouldFallBack) {
    2991             jitObject.setFallBack(true);
    2992             return;
    2993         }
    2994 
    29953453        generate();
    29963454        backtrack();
    29973455
    29983456        generateTryReadUnicodeCharacterHelper();
     3457
     3458        generateJITFailReturn();
    29993459
    30003460        LinkBuffer linkBuffer(*this, REGEXP_CODE_ID, JITCompilationCanFail);
     
    30413501    bool m_unicodeIgnoreCase;
    30423502    CanonicalMode m_canonicalMode;
     3503#ifdef JIT_ALL_PARENS_EXPRESSIONS
     3504    bool m_containsNestedSubpatterns;
     3505    ParenContextSizes m_parenContextSizes;
     3506#endif
     3507    JumpList m_abortExecution;
     3508    JumpList m_hitMatchLimit;
    30433509    Vector<Call> m_tryReadUnicodeCharacterCalls;
    30443510    Label m_tryReadUnicodeCharacterEntry;
  • trunk/Source/JavaScriptCore/yarr/YarrJIT.h

    r221052 r225695  
    3939#endif
    4040
     41#if CPU(ARM64) || (CPU(X86_64) && !OS(WINDOWS))
     42#define JIT_ALL_PARENS_EXPRESSIONS
     43constexpr size_t patternContextBufferSize = 8192; // Space caller allocates to save nested parenthesis context
     44#endif
     45
    4146namespace JSC {
    4247
     
    4853class YarrCodeBlock {
    4954#if CPU(X86_64) || CPU(ARM64)
     55#ifdef JIT_ALL_PARENS_EXPRESSIONS
     56    typedef MatchResult (*YarrJITCode8)(const LChar* input, unsigned start, unsigned length, int* output, void* freeParenContext, unsigned parenContextSize) YARR_CALL;
     57    typedef MatchResult (*YarrJITCode16)(const UChar* input, unsigned start, unsigned length, int* output, void* freeParenContext, unsigned parenContextSize) YARR_CALL;
     58    typedef MatchResult (*YarrJITCodeMatchOnly8)(const LChar* input, unsigned start, unsigned length, void*, void* freeParenContext, unsigned parenContextSize) YARR_CALL;
     59    typedef MatchResult (*YarrJITCodeMatchOnly16)(const UChar* input, unsigned start, unsigned length, void*, void* freeParenContext, unsigned parenContextSize) YARR_CALL;
     60#else
    5061    typedef MatchResult (*YarrJITCode8)(const LChar* input, unsigned start, unsigned length, int* output) YARR_CALL;
    5162    typedef MatchResult (*YarrJITCode16)(const UChar* input, unsigned start, unsigned length, int* output) YARR_CALL;
    5263    typedef MatchResult (*YarrJITCodeMatchOnly8)(const LChar* input, unsigned start, unsigned length) YARR_CALL;
    5364    typedef MatchResult (*YarrJITCodeMatchOnly16)(const UChar* input, unsigned start, unsigned length) YARR_CALL;
     65#endif
    5466#else
    5567    typedef EncodedMatchResult (*YarrJITCode8)(const LChar* input, unsigned start, unsigned length, int* output) YARR_CALL;
     
    8294    void set16BitCodeMatchOnly(MacroAssemblerCodeRef matchOnly) { m_matchOnly16 = matchOnly; }
    8395
     96#ifdef JIT_ALL_PARENS_EXPRESSIONS
     97    MatchResult execute(const LChar* input, unsigned start, unsigned length, int* output, void* freeParenContext, unsigned parenContextSize)
     98    {
     99        ASSERT(has8BitCode());
     100        return MatchResult(reinterpret_cast<YarrJITCode8>(m_ref8.code().executableAddress())(input, start, length, output, freeParenContext, parenContextSize));
     101    }
     102
     103    MatchResult execute(const UChar* input, unsigned start, unsigned length, int* output, void* freeParenContext, unsigned parenContextSize)
     104    {
     105        ASSERT(has16BitCode());
     106        return MatchResult(reinterpret_cast<YarrJITCode16>(m_ref16.code().executableAddress())(input, start, length, output, freeParenContext, parenContextSize));
     107    }
     108
     109    MatchResult execute(const LChar* input, unsigned start, unsigned length, void* freeParenContext, unsigned parenContextSize)
     110    {
     111        ASSERT(has8BitCodeMatchOnly());
     112        return MatchResult(reinterpret_cast<YarrJITCodeMatchOnly8>(m_matchOnly8.code().executableAddress())(input, start, length, 0, freeParenContext, parenContextSize));
     113    }
     114
     115    MatchResult execute(const UChar* input, unsigned start, unsigned length, void* freeParenContext, unsigned parenContextSize)
     116    {
     117        ASSERT(has16BitCodeMatchOnly());
     118        return MatchResult(reinterpret_cast<YarrJITCodeMatchOnly16>(m_matchOnly16.code().executableAddress())(input, start, length, 0, freeParenContext, parenContextSize));
     119    }
     120#else
    84121    MatchResult execute(const LChar* input, unsigned start, unsigned length, int* output)
    85122    {
     
    105142        return MatchResult(reinterpret_cast<YarrJITCodeMatchOnly16>(m_matchOnly16.code().executableAddress())(input, start, length));
    106143    }
     144#endif
    107145
    108146#if ENABLE(REGEXP_TRACING)
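
Reviewer note on the new `freeParenContext` / `parenContextSize` arguments: the JIT'ed code receives a raw, fixed-size buffer (`patternContextBufferSize` bytes) and manages it itself. The sketch below shows one way such a buffer could be carved into a free list of equal-sized blocks. It is not the code the JIT emits; `ParenContextSketch`, `ParenContextPool`, `allocate` and `release` are hypothetical names, and the block size is an assumption. The point it illustrates is that allocation can fail once the buffer is exhausted, which is the condition under which the patch falls back to the interpreter.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical stand-in for a saved parenthesis context. Only the link
// field is modelled; the real context also holds saved match state.
struct ParenContextSketch {
    ParenContextSketch* next;
};

// Hypothetical pool: splits a caller-provided buffer into fixed-size
// blocks and threads them onto a singly linked free list.
class ParenContextPool {
public:
    ParenContextPool(void* buffer, size_t bufferSize, size_t blockSize)
    {
        assert(buffer);
        // Round the block size up so every block can hold an aligned link.
        const size_t align = alignof(ParenContextSketch);
        if (blockSize < sizeof(ParenContextSketch))
            blockSize = sizeof(ParenContextSketch);
        blockSize = (blockSize + align - 1) / align * align;

        char* base = static_cast<char*>(buffer);
        for (size_t used = 0; used + blockSize <= bufferSize; used += blockSize)
            m_freeList = new (base + used) ParenContextSketch { m_freeList };
    }

    // Returns nullptr when the pool is exhausted; a caller would treat
    // that as "give up and use the slower path".
    ParenContextSketch* allocate()
    {
        ParenContextSketch* block = m_freeList;
        if (block) {
            m_freeList = block->next;
            block->next = nullptr;
        }
        return block;
    }

    // Pushes a block back onto the head of the free list.
    void release(ParenContextSketch* block)
    {
        block->next = m_freeList;
        m_freeList = block;
    }

private:
    ParenContextSketch* m_freeList = nullptr;
};
```

With a 256-byte buffer and 64-byte blocks this pool hands out four contexts and then returns `nullptr`; releasing one context makes the next `allocate()` succeed again.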
  • trunk/Source/JavaScriptCore/yarr/YarrPattern.cpp

    r225683 r225695  
    829829                term.frameLocation = currentCallFrameSize;
    830830                if (term.quantityMaxCount == 1 && !term.parentheses.isCopy) {
    831                     if (term.quantityType != QuantifierFixedCount)
    832                         currentCallFrameSize += YarrStackSpaceForBackTrackInfoParenthesesOnce;
     831                    currentCallFrameSize += YarrStackSpaceForBackTrackInfoParenthesesOnce;
    833832                    error = setupDisjunctionOffsets(term.parentheses.disjunction, currentCallFrameSize, currentInputPosition.unsafeGet(), currentCallFrameSize);
    834833                    if (error)
     
    846845                } else {
    847846                    term.inputPosition = currentInputPosition.unsafeGet();
    848                     unsigned ignoredCallFrameSize;
    849                     error = setupDisjunctionOffsets(term.parentheses.disjunction, 0, currentInputPosition.unsafeGet(), ignoredCallFrameSize);
     847                    currentCallFrameSize += YarrStackSpaceForBackTrackInfoParentheses;
     848                    error = setupDisjunctionOffsets(term.parentheses.disjunction, currentCallFrameSize, currentInputPosition.unsafeGet(), currentCallFrameSize);
    850849                    if (error)
    851850                        return error;
    852                     currentCallFrameSize += YarrStackSpaceForBackTrackInfoParentheses;
    853851                }
    854852                // Fixed count of 1 could be accepted, if they have a fixed size *AND* if all alternatives are of the same length.
     
    11861184}
    11871185
    1188 static void indentForNestingLevel(PrintStream& out, unsigned nestingDepth)
     1186void indentForNestingLevel(PrintStream& out, unsigned nestingDepth)
    11891187{
    11901188    out.print("    ");
     
    11931191}
    11941192
    1195 static void dumpUChar32(PrintStream& out, UChar32 c)
     1193void dumpUChar32(PrintStream& out, UChar32 c)
    11961194{
    11971195    if (c >= ' '&& c <= 0xff)
     
    11991197    else
    12001198        out.printf("0x%04x", c);
     1199}
     1200
     1201void dumpCharacterClass(PrintStream& out, YarrPattern* pattern, CharacterClass* characterClass)
     1202{
     1203    if (characterClass == pattern->anyCharacterClass())
     1204        out.print("<any character>");
     1205    else if (characterClass == pattern->newlineCharacterClass())
     1206        out.print("<newline>");
     1207    else if (characterClass == pattern->digitsCharacterClass())
     1208        out.print("<digits>");
     1209    else if (characterClass == pattern->spacesCharacterClass())
     1210        out.print("<whitespace>");
     1211    else if (characterClass == pattern->wordcharCharacterClass())
     1212        out.print("<word>");
     1213    else if (characterClass == pattern->wordUnicodeIgnoreCaseCharCharacterClass())
     1214        out.print("<unicode ignore case>");
     1215    else if (characterClass == pattern->nondigitsCharacterClass())
     1216        out.print("<non-digits>");
     1217    else if (characterClass == pattern->nonspacesCharacterClass())
     1218        out.print("<non-whitespace>");
     1219    else if (characterClass == pattern->nonwordcharCharacterClass())
     1220        out.print("<non-word>");
     1221    else if (characterClass == pattern->nonwordUnicodeIgnoreCaseCharCharacterClass())
     1222        out.print("<unicode non-ignore case>");
     1223    else {
     1224        bool needMatchesRangesSeperator = false;
     1225
     1226        auto dumpMatches = [&] (const char* prefix, Vector<UChar32> matches) {
     1227            size_t matchesSize = matches.size();
     1228            if (matchesSize) {
     1229                if (needMatchesRangesSeperator)
     1230                    out.print(",");
     1231                needMatchesRangesSeperator = true;
     1232
     1233                out.print(prefix, ":(");
     1234                for (size_t i = 0; i < matchesSize; ++i) {
     1235                    if (i)
     1236                        out.print(",");
     1237                    dumpUChar32(out, matches[i]);
     1238                }
     1239                out.print(")");
     1240            }
     1241        };
     1242
     1243        auto dumpRanges = [&] (const char* prefix, Vector<CharacterRange> ranges) {
     1244            size_t rangeSize = ranges.size();
     1245            if (rangeSize) {
     1246                if (needMatchesRangesSeperator)
     1247                    out.print(",");
     1248                needMatchesRangesSeperator = true;
     1249
     1250                out.print(prefix, " ranges:(");
     1251                for (size_t i = 0; i < rangeSize; ++i) {
     1252                    if (i)
     1253                        out.print(",");
     1254                    CharacterRange range = ranges[i];
     1255                    out.print("(");
     1256                    dumpUChar32(out, range.begin);
     1257                    out.print("..");
     1258                    dumpUChar32(out, range.end);
     1259                    out.print(")");
     1260                }
     1261                out.print(")");
     1262            }
     1263        };
     1264
     1265        out.print("[");
     1266        dumpMatches("ASCII", characterClass->m_matches);
     1267        dumpRanges("ASCII", characterClass->m_ranges);
     1268        dumpMatches("Unicode", characterClass->m_matchesUnicode);
     1269        dumpRanges("Unicode", characterClass->m_rangesUnicode);
     1270        out.print("]");
     1271    }
    12011272}
    12021273
     
    12401311    indentForNestingLevel(out, nestingDepth);
    12411312
    1242     if (invert() && (type != TypeParenthesesSubpattern && type != TypeParentheticalAssertion))
    1243         out.print("not ");
     1313    if (type != TypeParenthesesSubpattern && type != TypeParentheticalAssertion) {
     1314        if (invert())
     1315            out.print("not ");
     1316    }
    12441317
    12451318    switch (type) {
     
    12551328    case TypePatternCharacter:
    12561329        out.printf("character ");
     1330        out.printf("inputPosition %u ", inputPosition);
    12571331        if (thisPattern->ignoreCase() && isASCIIAlpha(patternCharacter)) {
    12581332            dumpUChar32(out, toASCIIUpper(patternCharacter));
     
    13761450            out.print(",terminal");
    13771451
    1378         if (quantityMaxCount != 1 || parentheses.isCopy || quantityType != QuantifierFixedCount)
    1379             out.println(",frame location ", frameLocation);
    1380         else
    1381             out.println();
     1452        out.println(",frame location ", frameLocation);
    13821453
    13831454        if (parentheses.disjunction->m_alternatives.size() > 1) {
    13841455            indentForNestingLevel(out, nestingDepth + 1);
    13851456            unsigned alternativeFrameLocation = frameLocation;
    1386             if (quantityType != QuantifierFixedCount)
     1457            if (quantityMaxCount == 1 && !parentheses.isCopy)
    13871458                alternativeFrameLocation += YarrStackSpaceForBackTrackInfoParenthesesOnce;
     1459            else if (parentheses.isTerminal)
     1460                alternativeFrameLocation += YarrStackSpaceForBackTrackInfoParenthesesTerminal;
     1461            else
     1462                alternativeFrameLocation += YarrStackSpaceForBackTrackInfoParentheses;
    13881463            out.println("alternative list,frame location ", alternativeFrameLocation);
    13891464        }
     
    14621537    }
    14631538    out.print(":\n");
     1539    if (m_body->m_callFrameSize)
     1540        out.print("    callframe size: ", m_body->m_callFrameSize, "\n");
    14641541    m_body->dump(out, this);
    14651542}
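
Reviewer note on the `setupDisjunctionOffsets` change in this file: for generic parentheses the patch now reserves the parentheses' own backtrack slots first and then lays out the nested disjunction starting from the updated frame size, instead of laying the disjunction out from 0 and discarding its size. A toy model of that ordering follows. `TermSketch`, `layoutFrame` and `kParenthesesSlots` are hypothetical, the slot count of 4 is an assumption taken from the four fields of `BackTrackInfoParentheses`, and non-parenthesis terms are treated as needing no slots to keep the example short.

```cpp
#include <cassert>
#include <vector>

// Assumed slot count for one generic parentheses term.
constexpr unsigned kParenthesesSlots = 4;

// Hypothetical, heavily simplified pattern term.
struct TermSketch {
    bool isParentheses = false;
    unsigned frameLocation = 0;
    std::vector<TermSketch> nested; // contents of the parentheses, if any
};

// Assigns a frame location to every term, starting at `start`, and
// returns the frame size needed once all terms have been laid out.
unsigned layoutFrame(std::vector<TermSketch>& terms, unsigned start)
{
    unsigned current = start;
    for (TermSketch& term : terms) {
        term.frameLocation = current;
        if (term.isParentheses) {
            // Reserve this term's own slots first...
            current += kParenthesesSlots;
            // ...then place the nested terms after them, letting the
            // nested layout grow the overall frame size.
            current = layoutFrame(term.nested, current);
        }
    }
    return current;
}
```

For one parenthesized group nested inside another, this places the outer group at slot 0, the inner group at slot 4, and reports a total frame size of 8 slots.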
  • trunk/Source/JavaScriptCore/yarr/YarrPattern.h

    r225683 r225695  
    228228        return m_capture;
    229229    }
    230    
     230
     231    bool containsAnyCaptures()
     232    {
     233        ASSERT(this->type == TypeParenthesesSubpattern);
     234        return parentheses.lastSubpatternId >= parentheses.subpatternId;
     235    }
     236
    231237    void quantify(unsigned count, QuantifierType type)
    232238    {
     
    550556};
    551557
     558    void indentForNestingLevel(PrintStream&, unsigned);
     559    void dumpUChar32(PrintStream&, UChar32);
     560    void dumpCharacterClass(PrintStream&, YarrPattern*, CharacterClass*);
     561
    552562    struct BackTrackInfoPatternCharacter {
    553563        uintptr_t begin; // Only needed for unicode patterns
     
    575585
    576586    struct BackTrackInfoAlternative {
    577         uintptr_t offset;
    578 
    579         static unsigned offsetIndex() { return offsetof(BackTrackInfoAlternative, offset) / sizeof(uintptr_t); }
     587        union {
     588            uintptr_t offset;
     589        };
    580590    };
    581591
     
    588598    struct BackTrackInfoParenthesesOnce {
    589599        uintptr_t begin;
     600        uintptr_t returnAddress;
    590601
    591602        static unsigned beginIndex() { return offsetof(BackTrackInfoParenthesesOnce, begin) / sizeof(uintptr_t); }
     603        static unsigned returnAddressIndex() { return offsetof(BackTrackInfoParenthesesOnce, returnAddress) / sizeof(uintptr_t); }
    592604    };
    593605
     
    598610    };
    599611
     612    struct BackTrackInfoParentheses {
     613        uintptr_t begin;
     614        uintptr_t returnAddress;
     615        uintptr_t matchAmount;
     616        uintptr_t patternContextHead;
     617
     618        static unsigned beginIndex() { return offsetof(BackTrackInfoParentheses, begin) / sizeof(uintptr_t); }
     619        static unsigned returnAddressIndex() { return offsetof(BackTrackInfoParentheses, returnAddress) / sizeof(uintptr_t); }
     620        static unsigned matchAmountIndex() { return offsetof(BackTrackInfoParentheses, matchAmount) / sizeof(uintptr_t); }
     621        static unsigned patternContextHeadIndex() { return offsetof(BackTrackInfoParentheses, patternContextHead) / sizeof(uintptr_t); }
     622    };
     623
    600624} } // namespace JSC::Yarr
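
Reviewer note on the `*Index()` helpers added above: each helper converts a field's byte offset into a slot index, and the JIT adds that index to the term's `frameLocation` to address the field on the stack (as in the `Address(stackPointerRegister, (parenthesesFrameLocation + beginIndex()) * sizeof(void*))` expression in YarrJIT.cpp). The standalone sketch below mirrors the struct from this diff under a different name; `frameByteOffset` is a hypothetical helper added only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Mirror of the struct added in this changeset, renamed to make clear
// that it is an illustration and not the JavaScriptCore definition.
struct BackTrackInfoParenthesesSketch {
    uintptr_t begin;
    uintptr_t returnAddress;
    uintptr_t matchAmount;
    uintptr_t patternContextHead;

    static unsigned beginIndex() { return offsetof(BackTrackInfoParenthesesSketch, begin) / sizeof(uintptr_t); }
    static unsigned returnAddressIndex() { return offsetof(BackTrackInfoParenthesesSketch, returnAddress) / sizeof(uintptr_t); }
    static unsigned matchAmountIndex() { return offsetof(BackTrackInfoParenthesesSketch, matchAmount) / sizeof(uintptr_t); }
    static unsigned patternContextHeadIndex() { return offsetof(BackTrackInfoParenthesesSketch, patternContextHead) / sizeof(uintptr_t); }
};

// Hypothetical helper: byte offset from the stack pointer of a given
// slot, for a term whose backtrack info starts at `frameLocation`.
inline size_t frameByteOffset(unsigned frameLocation, unsigned slotIndex)
{
    return (static_cast<size_t>(frameLocation) + slotIndex) * sizeof(void*);
}
```

Because every field is one `uintptr_t` wide, the indices are simply 0 through 3 in declaration order, so a term at frame location 4 keeps its `matchAmount` in slot 6.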