Changeset 17862 in webkit for trunk/JavaScriptCore/kjs/lexer.cpp


Timestamp:
Nov 20, 2006, 12:24:22 PM
Author:
ap
Message:

2006-11-20 W. Andy Carrel <[email protected]>

Reviewed by Maciej.

http://bugs.webkit.org/show_bug.cgi?id=11501
REGRESSION: \u no longer escapes metacharacters in RegExps
http://bugs.webkit.org/show_bug.cgi?id=11502
Serializing RegExps doesn't preserve Unicode escapes

JavaScriptCore:

  • kjs/lexer.cpp: (Lexer::Lexer): (Lexer::setCode): (Lexer::shift): (Lexer::scanRegExp): Push \u parsing back down into the RegExp object rather than in the parser. This backs out r17354 in favor of a new fix that better matches the behavior of other browsers.
  • kjs/lexer.h:
  • kjs/regexp.cpp: (KJS::RegExp::RegExp): (KJS::sanitizePattern): (KJS::isHexDigit): (KJS::convertHex): (KJS::convertUnicode):
  • kjs/regexp.h: Translate \u escaped unicode characters for the benefit of pcre.
  • kjs/ustring.cpp: (KJS::UString::append): Fix failure to increment length on the first UChar appended to a UString that was copy-on-write.
  • tests/mozilla/ecma_2/RegExp/properties-001.js: Adjust tests back to the uniform standards.
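
To illustrate the regexp.cpp entry above, here is a minimal sketch of translating \uXXXX escapes before a pattern is handed to a regex engine. It is not the code from this changeset: `translateUnicodeEscapes`, `hexValue`, and `isMetacharacter` are hypothetical names, `std::u16string` stands in for UString, and the metacharacter set is an assumption. The idea shown is that a decoded character which happens to be a metacharacter stays escaped, so it still matches literally.

```cpp
#include <cstddef>
#include <string>

// Illustrative helpers; the names echo the changelog, the bodies are assumptions.
static bool isHexDigit(char16_t c)
{
    return (c >= u'0' && c <= u'9') || (c >= u'a' && c <= u'f') || (c >= u'A' && c <= u'F');
}

static int hexValue(char16_t c)
{
    if (c >= u'0' && c <= u'9') return c - u'0';
    if (c >= u'a' && c <= u'f') return c - u'a' + 10;
    return c - u'A' + 10;
}

// Assumed set of characters that need a backslash to match literally.
static bool isMetacharacter(char16_t c)
{
    static const std::u16string meta = u"\\^$.|?*+()[]{}";
    return meta.find(c) != std::u16string::npos;
}

// Replace each well-formed \uXXXX with the character it names. Any other
// backslash pair (including an escaped backslash) is copied through unchanged.
std::u16string translateUnicodeEscapes(const std::u16string& pattern)
{
    std::u16string out;
    std::size_t i = 0;
    while (i < pattern.size()) {
        char16_t c = pattern[i];
        if (c != u'\\' || i + 1 >= pattern.size()) {
            out += c; // ordinary character, or a lone trailing backslash
            ++i;
            continue;
        }
        bool isUnicodeEscape = pattern[i + 1] == u'u' && i + 5 < pattern.size()
            && isHexDigit(pattern[i + 2]) && isHexDigit(pattern[i + 3])
            && isHexDigit(pattern[i + 4]) && isHexDigit(pattern[i + 5]);
        if (!isUnicodeEscape) {
            out += c; // keep the escape pair verbatim
            out += pattern[i + 1];
            i += 2;
            continue;
        }
        char16_t decoded = static_cast<char16_t>((hexValue(pattern[i + 2]) << 12)
            | (hexValue(pattern[i + 3]) << 8) | (hexValue(pattern[i + 4]) << 4)
            | hexValue(pattern[i + 5]));
        if (isMetacharacter(decoded))
            out += u'\\'; // \u002A must match a literal '*', not act as a quantifier
        out += decoded;
        i += 6;
    }
    return out;
}
```

Under these assumptions, the pattern text `a\u0062c` becomes `abc`, while `\u002A` becomes `\*`, which corresponds to the behavior described in bug 11501.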

LayoutTests:

  • fast/js/kde/RegExp-expected.txt:
  • fast/js/regexp-unicode-handling-expected.txt: Adjust these test results to passing as a result of other included changes in this revision.
File:
1 edited

  • trunk/JavaScriptCore/kjs/lexer.cpp

--- trunk/JavaScriptCore/kjs/lexer.cpp (r17372)
+++ trunk/JavaScriptCore/kjs/lexer.cpp (r17862)
@@ -62,5 +62,5 @@
     bol(true),
 #endif
-    current(0), next1(0), next2(0), next3(0), next4(0),
+    current(0), next1(0), next2(0), next3(0),
     strings(0), numStrings(0), stringsCapacity(0),
     identifiers(0), numIdentifiers(0), identifiersCapacity(0)
@@ -120,5 +120,4 @@
   next2 = (length > 2) ? code[2].uc : -1;
   next3 = (length > 3) ? code[3].uc : -1;
-  next4 = (length > 4) ? code[4].uc : -1;
 }
 
@@ -132,6 +131,5 @@
     next1 = next2;
     next2 = next3;
-    next3 = next4;
-    next4 = (pos + 4 < length) ? code[pos+4].uc : -1;
+    next3 = (pos + 3 < length) ? code[pos + 3].uc : -1;
   }
 }
@@ -837,19 +835,6 @@
     else if (current != '/' || lastWasEscape == true || inBrackets == true)
     {
-        if (lastWasEscape) {
-          // deal with unicode escapes in inline regexps
-          if (current == 'u') {
-            if (isHexDigit(next1) && isHexDigit(next2) &&
-                isHexDigit(next3) && isHexDigit(next4)) {
-              record16(convertUnicode(next1, next2, next3, next4));
-              shift(5);
-              lastWasEscape = false;
-              continue;
-            } else
-              // this wasn't unicode after all
-              record16('\\');
-          }
-        } else {
-          // keep track of '[' and ']'
+        // keep track of '[' and ']'
+        if (!lastWasEscape) {
           if ( current == '[' && !inBrackets )
             inBrackets = true;
@@ -857,7 +842,5 @@
             inBrackets = false;
         }
-        // don't want to capture the '\' for unicode escapes
-        if (current != '\\' || next1 != 'u')
-          record16(current);
+        record16(current);
         lastWasEscape =
             !lastWasEscape && (current == '\\');
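
The scan loop left by this diff can be sketched as a standalone function. This is a simplified stand-in and not the lexer itself: `scanRegExpBody` and its signature are hypothetical, `std::u16string` replaces the lexer's buffers, and the `']'` branch is an assumption because that line is not shown in the diff. The point is that every character, including the text of a \u escape, is now recorded verbatim, and only the escape and bracket state decide where the literal ends.

```cpp
#include <string>

// Hypothetical stand-in for the regexp scan: `source` starts just after the
// opening '/'. On success, `pattern` holds the text up to the closing '/'.
bool scanRegExpBody(const std::u16string& source, std::u16string& pattern)
{
    pattern.clear();
    bool lastWasEscape = false;
    bool inBrackets = false;

    for (char16_t current : source) {
        if (current == u'\r' || current == u'\n')
            return false; // a line break ends the literal without a closing '/'
        if (current == u'/' && !lastWasEscape && !inBrackets)
            return true; // unescaped '/' outside a character class closes it

        // keep track of '[' and ']' (the ']' branch is assumed, see above)
        if (!lastWasEscape) {
            if (current == u'[' && !inBrackets)
                inBrackets = true;
            else if (current == u']' && inBrackets)
                inBrackets = false;
        }

        pattern += current; // recorded as-is; "\u0041" stays six characters
        lastWasEscape = !lastWasEscape && current == u'\\';
    }
    return false; // ran out of input before the closing '/'
}
```

With this sketch, scanning `a\u0041[/]\//g` yields the pattern `a\u0041[/]\/` unchanged, leaving the escape for the RegExp object to interpret.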