Changeset 43038 in webkit for trunk/JavaScriptCore


Ignore:
Timestamp:
Apr 29, 2009, 6:34:08 PM (16 years ago)
Author:
[email protected]
Message:

2009-04-29 Gavin Barraclough <[email protected]>

Reviewed by Oliver "Peg-Leg" Hunt.

Coallesce input checking and reduce futzing with the index position
between alternatives and iterations of the main loop of a regex,
when run in YARR.

Consider the following regex: /foo|bar/


Prior to this patch, this will be implemented something like this pseudo-code description:


loop:

check_for_available_input(3) this increments the index by 3, for the first alterantive.

if (available) { test "foo" }

decrement_index(3)
check_for_available_input(3) this increments the index by 3, for the second alterantive.

if (available) { test "bar" }

decrement_index(3)
check_for_available_input(1) can we loop again?

if (available) { goto loop }

With these changes it will look more like this:

check_for_available_input(3) this increments the index by 3, for the first alterantive.
if (!available) { goto fail }

loop:

test "foo"
test "bar"
check_for_available_input(1) can we loop again?

if (available) { goto loop }

fail:

This gives about a 5% gain on v8-regex, no change on Sunspider.

  • yarr/RegexJIT.cpp: (JSC::Yarr::RegexGenerator::TermGenerationState::linkAlternativeBacktracksTo): (JSC::Yarr::RegexGenerator::generateDisjunction):
Location:
trunk/JavaScriptCore
Files:
2 edited

Legend:

Unmodified
Added
Removed
  • trunk/JavaScriptCore/ChangeLog

    r43037 r43038  
     12009-04-29  Gavin Barraclough  <[email protected]>
     2
     3        Reviewed by Oliver "Peg-Leg" Hunt.
     4
     5        Coallesce input checking and reduce futzing with the index position
     6        between alternatives and iterations of the main loop of a regex,
     7        when run in YARR.
     8
     9        Consider the following regex:  /foo|bar/
     10       
     11        Prior to this patch, this will be implemented something like this pseudo-code description:
     12       
     13        loop:
     14            check_for_available_input(3) // this increments the index by 3, for the first alterantive.
     15                if (available) { test "foo" }
     16            decrement_index(3)
     17            check_for_available_input(3) // this increments the index by 3, for the second alterantive.
     18                if (available) { test "bar" }
     19            decrement_index(3)
     20            check_for_available_input(1) // can we loop again?
     21                if (available) { goto loop }
     22
     23        With these changes it will look more like this:
     24
     25            check_for_available_input(3) // this increments the index by 3, for the first alterantive.
     26            if (!available) { goto fail }
     27        loop:
     28            test "foo"
     29            test "bar"
     30            check_for_available_input(1) // can we loop again?
     31                if (available) { goto loop }
     32        fail:
     33
     34
     35        This gives about a 5% gain on v8-regex, no change on Sunspider.
     36
     37        * yarr/RegexJIT.cpp:
     38        (JSC::Yarr::RegexGenerator::TermGenerationState::linkAlternativeBacktracksTo):
     39        (JSC::Yarr::RegexGenerator::generateDisjunction):
     40
    1412009-04-29  Oliver Hunt  <[email protected]>
    242
  • trunk/JavaScriptCore/yarr/RegexJIT.cpp

    r42853 r43038  
    378378            isBackTrackGenerated = false;
    379379            backTrackJumps.link(masm);
     380        }
     381        void linkAlternativeBacktracksTo(Label label, MacroAssembler* masm)
     382        {
     383            isBackTrackGenerated = false;
     384            backTrackJumps.linkTo(label, masm);
    380385        }
    381386        void propagateBacktrackingFrom(TermGenerationState& nestedParenthesesState, MacroAssembler* masm)
     
    10831088    {
    10841089        TermGenerationState state(disjunction, 0);
     1090        state.resetAlternative();
     1091
     1092        // Plant a check to see if there is sufficient input available to run the first alternative.
     1093        // Jumping back to the label 'firstAlternative' will get to this check, jumping to
     1094        // 'firstAlternativeInputChecked' will jump directly to matching the alternative having
     1095        // skipped this check.
    10851096
    10861097        Label firstAlternative(this);
    10871098
    1088         for (state.resetAlternative(); state.alternativeValid(); state.nextAlternative()) {
     1099        // check availability for the next alternative
     1100        int countCheckedForCurrentAlternative = 0;
     1101        int countToCheckForFirstAlternative = 0;
     1102        bool hasShorterAlternatives = false;
     1103        JumpList notEnoughInputForPreviousAlternative;
     1104
     1105        if (state.alternativeValid()) {
    10891106            PatternAlternative* alternative = state.alternative();
    1090 
     1107            countToCheckForFirstAlternative = alternative->m_minimumSize;
     1108            state.checkedTotal += countToCheckForFirstAlternative;
     1109            if (countToCheckForFirstAlternative)
     1110                notEnoughInputForPreviousAlternative.append(jumpIfNoAvailableInput(countToCheckForFirstAlternative));
     1111            countCheckedForCurrentAlternative = countToCheckForFirstAlternative;
     1112        }
     1113
     1114        Label firstAlternativeInputChecked(this);
     1115
     1116        while (state.alternativeValid()) {
     1117            // Track whether any alternatives are shorter than the first one.
     1118            hasShorterAlternatives = hasShorterAlternatives || (countCheckedForCurrentAlternative < countToCheckForFirstAlternative);
     1119
     1120            PatternAlternative* alternative = state.alternative();
    10911121            optimizeAlternative(alternative);
    1092 
    1093             // check availability for the first alternative
    1094             int countToCheck = alternative->m_minimumSize;
    1095             if (countToCheck) {
    1096                 state.addBacktrackJump(jumpIfNoAvailableInput(countToCheck));
    1097                 state.checkedTotal += countToCheck;
    1098             }
    10991122
    11001123            for (state.resetTerm(); state.termValid(); state.nextTerm())
     
    11171140            generateReturn();
    11181141
     1142            state.nextAlternative();
     1143
     1144            // if there are any more alternatives, plant the check for input before looping.
     1145            if (state.alternativeValid()) {
     1146                PatternAlternative* nextAlternative = state.alternative();
     1147                int countToCheckForNextAlternative = nextAlternative->m_minimumSize;
     1148
     1149                if (countCheckedForCurrentAlternative > countToCheckForNextAlternative) { // CASE 1: current alternative was longer than the next one.
     1150                    // If we get here, there the last input checked failed.
     1151                    notEnoughInputForPreviousAlternative.link(this);
     1152
     1153                    // Check if sufficent input available to run the next alternative
     1154                    notEnoughInputForPreviousAlternative.append(jumpIfNoAvailableInput(countToCheckForNextAlternative - countCheckedForCurrentAlternative));
     1155                    // We are now in the correct state to enter the next alternative; this add is only required
     1156                    // to mirror and revert operation of the sub32, just below.
     1157                    add32(Imm32(countCheckedForCurrentAlternative - countToCheckForNextAlternative), index);
     1158
     1159                    // If we get here, there the last input checked passed.
     1160                    state.linkAlternativeBacktracks(this);
     1161                    // No need to check if we can run the next alternative, since it is shorter -
     1162                    // just update index.
     1163                    sub32(Imm32(countCheckedForCurrentAlternative - countToCheckForNextAlternative), index);
     1164                } else if (countCheckedForCurrentAlternative < countToCheckForNextAlternative) { // CASE 2: next alternative is longer than the current one.
     1165                    // If we get here, there the last input checked failed.
     1166                    // If there is insufficient input to run the current alternative, and the next alternative is longer,
     1167                    // then there is definitely not enough input to run it - don't even check. Just adjust index, as if
     1168                    // we had checked.
     1169                    notEnoughInputForPreviousAlternative.link(this);
     1170                    add32(Imm32(countToCheckForNextAlternative - countCheckedForCurrentAlternative), index);
     1171                    notEnoughInputForPreviousAlternative.append(jump());
     1172
     1173                    // The next alternative is longer than the current one; check the difference.
     1174                    state.linkAlternativeBacktracks(this);
     1175                    notEnoughInputForPreviousAlternative.append(jumpIfNoAvailableInput(countToCheckForNextAlternative - countCheckedForCurrentAlternative));
     1176                } else { // CASE 3: Both alternatives are the same length.
     1177                    ASSERT(countCheckedForCurrentAlternative == countToCheckForNextAlternative);
     1178
     1179                    // If the next alterative is the same length as this one, then no need to check the input -
     1180                    // if there was sufficent input to run the current alternative then there is sufficient
     1181                    // input to run the next one; if not, there isn't.
     1182                    state.linkAlternativeBacktracks(this);
     1183                }
     1184
     1185                state.checkedTotal -= countCheckedForCurrentAlternative;
     1186                countCheckedForCurrentAlternative = countToCheckForNextAlternative;
     1187                state.checkedTotal += countCheckedForCurrentAlternative;
     1188            }
     1189        }
     1190       
     1191        // If we get here, all Alternatives failed...
     1192
     1193        state.checkedTotal -= countCheckedForCurrentAlternative;
     1194
     1195        // How much more input need there be to be able to retry from the first alternative?
     1196        // examples:
     1197        //   /yarr_jit/ or /wrec|pcre/
     1198        //     In these examples we need check for one more input before looping.
     1199        //   /yarr_jit|pcre/
     1200        //     In this case we need check for 5 more input to loop (+4 to allow for the first alterative
     1201        //     being four longer than the last alternative checked, and another +1 to effectively move
     1202        //     the start position along by one).
     1203        //   /yarr|rules/ or /wrec|notsomuch/
     1204        //     In these examples, provided that there was sufficient input to have just been matching for
     1205        //     the second alternative we can loop without checking for available input (since the second
     1206        //     alternative is longer than the first).  In the latter example we need to decrement index
     1207        //     (by 4) so the start position is only progressed by 1 from the last iteration.
     1208        int incrementForNextIter = (countToCheckForFirstAlternative - countCheckedForCurrentAlternative) + 1;
     1209
     1210        // First, deal with the cases where there was sufficient input to try the last alternative.
     1211        if (incrementForNextIter > 0) // We need to check for more input anyway, fall through to the checking below.
    11191212            state.linkAlternativeBacktracks(this);
    1120 
    1121             if (countToCheck) {
    1122                 sub32(Imm32(countToCheck), index);
    1123                 state.checkedTotal -= countToCheck;
    1124             }
    1125         }
    1126        
    1127         // If we get here, all Alternatives failed.
    1128 
    1129         add32(Imm32(1), index);
    1130         if (!m_pattern.m_body->m_hasFixedSize)
    1131             poke(index, m_pattern.m_body->m_callFrameSize);
    1132         checkInput().linkTo(firstAlternative, this);
     1213        else if (m_pattern.m_body->m_hasFixedSize && !incrementForNextIter) // No need to update anything, link these backtracks straight to the to pof the loop!
     1214            state.linkAlternativeBacktracksTo(firstAlternativeInputChecked, this);
     1215        else { // no need to check the input, but we do have some bookkeeping to do first.
     1216            state.linkAlternativeBacktracks(this);
     1217
     1218            // Where necessary update our preserved start position.
     1219            if (!m_pattern.m_body->m_hasFixedSize) {
     1220                move(index, regT0);
     1221                sub32(Imm32(countCheckedForCurrentAlternative - 1), regT0);
     1222                poke(regT0, m_pattern.m_body->m_callFrameSize);
     1223            }
     1224
     1225            // Update index if necessary, and loop (without checking).
     1226            if (incrementForNextIter)
     1227                add32(Imm32(incrementForNextIter), index);
     1228            jump().linkTo(firstAlternativeInputChecked, this);
     1229        }
     1230
     1231        notEnoughInputForPreviousAlternative.link(this);
     1232        // Update our idea of the start position, if we're tracking this.
     1233        if (!m_pattern.m_body->m_hasFixedSize) {
     1234            if (countCheckedForCurrentAlternative - 1) {
     1235                move(index, regT0);
     1236                sub32(Imm32(countCheckedForCurrentAlternative - 1), regT0);
     1237                poke(regT0, m_pattern.m_body->m_callFrameSize);
     1238            } else
     1239                poke(index, m_pattern.m_body->m_callFrameSize);
     1240        }
     1241        // Check if there is sufficent input to run the first alternative again.
     1242        jumpIfAvailableInput(incrementForNextIter).linkTo(firstAlternativeInputChecked, this);
     1243        // No - insufficent input to run the first alteranative, are there any other alternatives we
     1244        // might need to check?  If so, the last check will have left the index incremented by
     1245        // (countToCheckForFirstAlternative + 1), so we need test whether countToCheckForFirstAlternative
     1246        // LESS input is available, to have the effect of just progressing the start position by 1
     1247        // from the last iteration.  If this check passes we can just jump up to the check associated
     1248        // with the first alternative in the loop.  This is a bit sad, since we'll end up trying the
     1249        // first alternative again, and this check will fail (otherwise the check planted just above
     1250        // here would have passed).  This is a bit sad, however it saves trying to do something more
     1251        // complex here in compilation, and in the common case we should end up coallescing the checks.
     1252        //
     1253        // FIXME: a nice improvement here may be to stop trying to match sooner, based on the least
     1254        // of the minimum-alterantive-lengths.  E.g. if I have two alternatives of length 200 and 150,
     1255        // and a string of length 100, we'll end up looping index from 0 to 100, checking whether there
     1256        // is sufficient input to run either alternative (constantly failing).  If there had been only
     1257        // one alternative, or if the shorter alternative had come first, we would have terminated
     1258        // immediately. :-/
     1259        if (hasShorterAlternatives)
     1260            jumpIfAvailableInput(-countToCheckForFirstAlternative).linkTo(firstAlternative, this);
     1261        // index will now be a bit garbled (depending on whether 'hasShorterAlternatives' is true,
     1262        // it has either been incremented by 1 or by (countToCheckForFirstAlternative + 1) ...
     1263        // but since we're about to return a failure this doesn't really matter!)
    11331264
    11341265        unsigned frameSize = m_pattern.m_body->m_callFrameSize;
Note: See TracChangeset for help on using the changeset viewer.