Timestamp:
Sep 18, 2010, 2:04:15 PM (15 years ago)
Author:
[email protected]
Message:

2010-09-18 Michael Saboff <[email protected]>

Reviewed by Gavin Barraclough.

Added code to unroll regular expressions containing ^.
Alternatives that begin with ^ are tagged during parsing
and rolled up in the containing subexpression structs.
After parsing, a regular expression flagged as containing
a ^ (a.k.a. BOL) is processed further in optimizeBOL().
A copy of the disjunction is made, excluding alternatives that
are rooted with BOL. The original alternatives are flagged
to be executed only once. The copies of the other alternatives are
added to the original expression.
If every original alternative is rooted with BOL, there
won't be any looping alternatives.
The JIT generator will emit code accordingly, executing the
original alternatives once and then looping over the
alternatives that aren't anchored with a BOL (if any).
https://bugs.webkit.org/show_bug.cgi?id=45787
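
The unrolling described above can be sketched roughly as follows. This is a minimal illustration, not the code from RegexCompiler.cpp: the Alternative and Disjunction structs and the optimizeBOLSketch() function are hypothetical, simplified stand-ins for the Yarr pattern structs, and only the three steps named in the message (copy the non-BOL alternatives, flag the originals as once-through, append the copies) are modeled.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified stand-ins for the Yarr pattern structs.
struct Alternative {
    std::string source;          // textual form, e.g. "^abc"
    bool startsWithBOL = false;  // tagged during parsing when rooted with ^
    bool onceThrough = false;    // set by the sketch below
};

struct Disjunction {
    std::vector<Alternative> alternatives;
};

// Sketch of the unrolling step (assumed shape, not the WebKit implementation).
// Returns the number of looping alternatives appended to the disjunction.
std::size_t optimizeBOLSketch(Disjunction& body)
{
    // 1. Copy every alternative that is not rooted with BOL.
    std::vector<Alternative> looping;
    for (const Alternative& alt : body.alternatives) {
        if (!alt.startsWithBOL)
            looping.push_back(alt);
    }

    // 2. Flag all original alternatives to be executed only once.
    for (Alternative& alt : body.alternatives)
        alt.onceThrough = true;

    // 3. Append the copies; they keep onceThrough == false, so a matcher
    //    would loop over them at later start positions.
    body.alternatives.insert(body.alternatives.end(), looping.begin(), looping.end());
    return looping.size();
}
```

For a pattern shaped like ^a|b, this sketch yields three alternatives: ^a and b flagged once-through, followed by a looping copy of b. If every alternative starts with ^, nothing is appended and there is no loop.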

  • yarr/RegexCompiler.cpp: (JSC::Yarr::RegexPatternConstructor::assertionBOL): (JSC::Yarr::RegexPatternConstructor::atomParenthesesEnd): (JSC::Yarr::RegexPatternConstructor::copyDisjunction): (JSC::Yarr::RegexPatternConstructor::copyTerm): (JSC::Yarr::RegexPatternConstructor::optimizeBOL): (JSC::Yarr::compileRegex):
  • yarr/RegexJIT.cpp: (JSC::Yarr::RegexGenerator::generateDisjunction):
  • yarr/RegexPattern.h: (JSC::Yarr::PatternAlternative::PatternAlternative): (JSC::Yarr::PatternAlternative::setOnceThrough): (JSC::Yarr::PatternAlternative::onceThrough): (JSC::Yarr::PatternDisjunction::PatternDisjunction): (JSC::Yarr::RegexPattern::RegexPattern): (JSC::Yarr::RegexPattern::reset):

2010-09-18 Michael Saboff <[email protected]>

Reviewed by Gavin Barraclough.

Added new tests to check for proper handling of ^ in multiline
regular expressions. Added as part of
https://bugs.webkit.org/show_bug.cgi?id=45787

  • fast/js/regexp-bol-with-multiline-expected.txt: Added.
  • fast/js/regexp-bol-with-multiline.html: Added.
  • fast/js/script-tests/regexp-bol-with-multiline.js: Added.
File:
1 edited

  • trunk/JavaScriptCore/yarr/RegexPattern.h

--- trunk/JavaScriptCore/yarr/RegexPattern.h (r60273)
+++ trunk/JavaScriptCore/yarr/RegexPattern.h (r67790)
@@ -208,4 +208,8 @@
     PatternAlternative(PatternDisjunction* disjunction)
         : m_parent(disjunction)
+        , m_onceThrough(false)
+        , m_hasFixedSize(false)
+        , m_startsWithBOL(false)
+        , m_containsBOL(false)
     {
     }
@@ -221,4 +225,14 @@
         ASSERT(m_terms.size());
         m_terms.shrink(m_terms.size() - 1);
+    }
+
+    void setOnceThrough()
+    {
+        m_onceThrough = true;
+    }
+
+    bool onceThrough()
+    {
+        return m_onceThrough;
     }
 
@@ -226,5 +240,8 @@
     PatternDisjunction* m_parent;
     unsigned m_minimumSize;
-    bool m_hasFixedSize;
+    bool m_onceThrough : 1;
+    bool m_hasFixedSize : 1;
+    bool m_startsWithBOL : 1;
+    bool m_containsBOL : 1;
 };
 
@@ -232,4 +249,5 @@
     PatternDisjunction(PatternAlternative* parent = 0)
         : m_parent(parent)
+        , m_hasFixedSize(false)
     {
     }
@@ -270,7 +288,8 @@
         : m_ignoreCase(ignoreCase)
         , m_multiline(multiline)
+        , m_containsBackreferences(false)
+        , m_containsBOL(false)
         , m_numSubpatterns(0)
         , m_maxBackReference(0)
-        , m_containsBackreferences(false)
         , newlineCached(0)
         , digitsCached(0)
@@ -295,4 +314,5 @@
 
         m_containsBackreferences = false;
+        m_containsBOL = false;
 
         newlineCached = 0;
@@ -358,9 +378,10 @@
     }
 
-    bool m_ignoreCase;
-    bool m_multiline;
+    bool m_ignoreCase : 1;
+    bool m_multiline : 1;
+    bool m_containsBackreferences : 1;
+    bool m_containsBOL : 1;
     unsigned m_numSubpatterns;
     unsigned m_maxBackReference;
-    bool m_containsBackreferences;
     PatternDisjunction* m_body;
     Vector<PatternDisjunction*, 4> m_disjunctions;