lexer.cpp

Timestamp:

Nov 20, 2006, 12:24:22 PM

Author:

Message:

2006-11-20 W. Andy Carrel <[email protected]>

Reviewed by Maciej.

http://bugs.webkit.org/show_bug.cgi?id=11501
REGRESSION: \u no longer escapes metacharacters in RegExps
http://bugs.webkit.org/show_bug.cgi?id=11502
Serializing RegExps doesn't preserve Unicode escapes

JavaScriptCore:

kjs/lexer.cpp: (Lexer::Lexer): (Lexer::setCode): (Lexer::shift): (Lexer::scanRegExp): Push \u parsing back down into the RegExp object rather than in the parser. This backs out r17354 in favor of a new fix that better matches the behavior of other browsers.

kjs/lexer.h:
kjs/regexp.cpp: (KJS::RegExp::RegExp): (KJS::sanitizePattern): (KJS::isHexDigit): (KJS::convertHex): (KJS::convertUnicode):
kjs/regexp.h: Translate \u escaped unicode characters for the benefit of pcre.

kjs/ustring.cpp: (KJS::UString::append): Fix failure to increment length on the first UChar appended to a UString that was copy-on-write.

tests/mozilla/ecma_2/RegExp/properties-001.js: Adjust tests back to the uniform standards.

LayoutTests:

fast/js/kde/RegExp-expected.txt:
fast/js/regexp-unicode-handling-expected.txt: Adjust these test results to passing as a result of other included changes in this revision.
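
As a rough illustration of what "translate \u escaped unicode characters for the benefit of pcre" could look like, here is a minimal sketch. The helper names `isHexDigit`, `convertHex`, and `convertUnicode` are borrowed from the change log above, but their signatures are assumptions, as this page does not show kjs/regexp.cpp. `translateUnicodeEscapes` and `isMetacharacter` are hypothetical names, and backslash-escaping a translated metacharacter is one possible policy in the spirit of bug 11501, not necessarily what `KJS::sanitizePattern` does.

```cpp
#include <cstddef>
#include <string>

// Assumed signatures; the real KJS helpers are not shown on this page.
static bool isHexDigit(char16_t c)
{
    return (c >= u'0' && c <= u'9') || (c >= u'a' && c <= u'f') || (c >= u'A' && c <= u'F');
}

static unsigned convertHex(char16_t c)
{
    if (c >= u'0' && c <= u'9')
        return c - u'0';
    if (c >= u'a' && c <= u'f')
        return c - u'a' + 10;
    return c - u'A' + 10;
}

static char16_t convertUnicode(char16_t a, char16_t b, char16_t c, char16_t d)
{
    return static_cast<char16_t>((convertHex(a) << 12) | (convertHex(b) << 8) |
                                 (convertHex(c) << 4) | convertHex(d));
}

// Hypothetical: ASCII characters that carry special meaning in a pattern.
static bool isMetacharacter(char16_t c)
{
    static const std::u16string meta = u"\\^$.|?*+()[]{}/";
    return meta.find(c) != std::u16string::npos;
}

// Hypothetical stand-in for the pattern translation step: replaces each
// well-formed \uXXXX with the code unit it names and copies everything
// else through unchanged.
std::u16string translateUnicodeEscapes(const std::u16string& pattern)
{
    std::u16string out;
    std::size_t i = 0;
    while (i < pattern.size()) {
        if (pattern[i] == u'\\' && i + 1 < pattern.size()) {
            if (pattern[i + 1] == u'u' && i + 5 < pattern.size() &&
                isHexDigit(pattern[i + 2]) && isHexDigit(pattern[i + 3]) &&
                isHexDigit(pattern[i + 4]) && isHexDigit(pattern[i + 5])) {
                char16_t ch = convertUnicode(pattern[i + 2], pattern[i + 3],
                                             pattern[i + 4], pattern[i + 5]);
                // Assumed policy: keep a translated metacharacter literal.
                if (isMetacharacter(ch))
                    out.push_back(u'\\');
                out.push_back(ch);
                i += 6;
                continue;
            }
            // Any other escape: copy both characters so that an escaped
            // backslash followed by "u0041" is not mistaken for \u0041.
            out.push_back(pattern[i]);
            out.push_back(pattern[i + 1]);
            i += 2;
            continue;
        }
        out.push_back(pattern[i++]);
    }
    return out;
}
```

Doing the translation at pattern-construction time, rather than in the lexer, means the source text of the literal can be kept verbatim for serialization, which is the concern raised in bug 11502.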

File: 1 edited

trunk/JavaScriptCore/kjs/lexer.cpp (modified) (5 diffs)

trunk/JavaScriptCore/kjs/lexer.cpp

-              r17372
+              r17862
     bol(true),
 #endif
-    current(0), next1(0), next2(0), next3(0), next4(0),
+    current(0), next1(0), next2(0), next3(0),
     strings(0), numStrings(0), stringsCapacity(0),
     identifiers(0), numIdentifiers(0), identifiersCapacity(0)
 …
   next2 = (length > 2) ? code[2].uc : -1;
   next3 = (length > 3) ? code[3].uc : -1;
-  next4 = (length > 4) ? code[4].uc : -1;
 }
 …
     next1 = next2;
     next2 = next3;
-    next3 = next4;
-    next4 = (pos + 4 < length) ? code[pos+4].uc : -1;
+    next3 = (pos + 3 < length) ? code[pos + 3].uc : -1;
   }
 }
 …
     else if (current != '/' || lastWasEscape == true || inBrackets == true)
     {
-        if (lastWasEscape) {
-          // deal with unicode escapes in inline regexps
-          if (current == 'u') {
-            if (isHexDigit(next1) && isHexDigit(next2) &&
-                isHexDigit(next3) && isHexDigit(next4)) {
-              record16(convertUnicode(next1, next2, next3, next4));
-              shift(5);
-              lastWasEscape = false;
-              continue;
-            } else
-              // this wasn't unicode after all
-              record16('\\');
-          }
-        } else {
-          // keep track of '[' and ']'
+        // keep track of '[' and ']'
+        if (!lastWasEscape) {
           if ( current == '[' && !inBrackets )
             inBrackets = true;
 …
             inBrackets = false;
         }
-        // don't want to capture the '\' for unicode escapes
-        if (current != '\\' || next1 != 'u')
-          record16(current);
+        record16(current);
         lastWasEscape =
         lastWasEscape =
             !lastWasEscape && (current == '\\');
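
To make the post-change control flow easier to follow, the loop can be sketched as a standalone function. This is only an approximation under stated assumptions: `scanRegExpBody` is a hypothetical name, `std::string` stands in for the lexer's 16-bit buffer and `record16`, the `']'` test is a guess at the line elided by `…` in the diff, and the line-terminator check is assumed rather than taken from this page.

```cpp
#include <cstddef>
#include <string>

// Hypothetical standalone version of the loop after this change.
// Scans from `pos` (just past the opening '/') and copies characters
// verbatim into `pattern`, including any \uXXXX escapes, until it reaches
// an unescaped '/' that is not inside a character class.
// Returns true if the closing '/' was found.
bool scanRegExpBody(const std::string& src, std::size_t& pos, std::string& pattern)
{
    bool lastWasEscape = false;
    bool inBrackets = false;
    pattern.clear();

    while (pos < src.size()) {
        char current = src[pos];
        // Assumption: a line terminator ends the literal unsuccessfully.
        if (current == '\n' || current == '\r')
            return false;

        if (current != '/' || lastWasEscape || inBrackets) {
            // keep track of '[' and ']'
            if (!lastWasEscape) {
                if (current == '[' && !inBrackets)
                    inBrackets = true;
                // Assumed counterpart of the line elided in the diff.
                if (current == ']' && inBrackets)
                    inBrackets = false;
            }
            // Every character is recorded as-is; no \u handling here.
            pattern.push_back(current);
            lastWasEscape = !lastWasEscape && (current == '\\');
            ++pos;
        } else {
            ++pos; // consume the closing '/'
            return true;
        }
    }
    return false; // ran out of input before the closing '/'
}
```

Because the loop no longer interprets `\u`, the lexer only needs three characters of lookahead, which matches the removal of `next4` in the earlier hunks.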

Changeset 17862 in webkit for trunk/JavaScriptCore/kjs/lexer.cpp