Timestamp: Nov 3, 2007, 9:40:32 AM
Author: Darin Adler
Message:

JavaScriptCore:

Reviewed by Maciej.

These changes cause us to match the JavaScript specification and pass the
fast/js/kde/encode_decode_uri.html test.

  • kjs/function.cpp: (KJS::encode): Call the UTF-8 string conversion in its new strict mode, throwing an exception if there are malformed UTF-16 surrogate pairs in the text.
  • kjs/ustring.h: Added a strict version of the UTF-8 string conversion.
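  • kjs/ustring.cpp:
    (KJS::decodeUTF8Sequence): Removed code to disallow U+FFFE and U+FFFF; while those might be illegal in some sense, they aren't supposed to get any special handling in the place where this function is currently used.
    (KJS::UString::UTF8String): Added the strictness: an optional bool* utf16WasGood out-parameter that is set to false when a surrogate code unit (0xD800 to 0xDFFF) reaches the 3-byte encoding path. The no-argument overload is kept and forwards with a null pointer.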
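
To illustrate the strict mode described above, here is a minimal standalone sketch of a UTF-16 to UTF-8 conversion that reports unpaired surrogates through an optional out-parameter. It is not the KJS code: the function name toUTF8 and the use of std::u16string and std::string are assumptions made for illustration, and only the utf16WasGood parameter name is taken from the changeset.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not the KJS API): converts UTF-16 code units to UTF-8.
// If utf16WasGood is non-null, it is set to true up front and cleared when an
// unpaired surrogate is seen. The conversion itself stays lenient: the lone
// surrogate is still written out as a 3-byte sequence.
std::string toUTF8(const std::u16string& in, bool* utf16WasGood = nullptr)
{
    if (utf16WasGood)
        *utf16WasGood = true;

    std::string out;
    out.reserve(in.size() * 3); // worst case: 3 bytes per UTF-16 code unit

    for (std::size_t i = 0; i < in.size(); ++i) {
        char32_t c = in[i];
        if (c < 0x80) {
            out += static_cast<char>(c); // 1 byte: plain ASCII
        } else if (c < 0x800) {
            out += static_cast<char>((c >> 6) | 0xC0);   // C0 is the 2-byte flag
            out += static_cast<char>((c & 0x3F) | 0x80); // low 6 bits
        } else if (c >= 0xD800 && c <= 0xDBFF && i + 1 < in.size()
                   && in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF) {
            // Valid surrogate pair: combine into one code point, emit 4 bytes.
            char32_t sc = 0x10000 + ((c - 0xD800) << 10) + (in[i + 1] - 0xDC00);
            out += static_cast<char>((sc >> 18) | 0xF0); // F0 is the 4-byte flag
            out += static_cast<char>(((sc >> 12) & 0x3F) | 0x80);
            out += static_cast<char>(((sc >> 6) & 0x3F) | 0x80);
            out += static_cast<char>((sc & 0x3F) | 0x80);
            ++i; // consumed the low surrogate as well
        } else {
            // Any surrogate that reaches this branch is unpaired.
            if (utf16WasGood && c >= 0xD800 && c <= 0xDFFF)
                *utf16WasGood = false;
            out += static_cast<char>((c >> 12) | 0xE0); // E0 is the 3-byte flag
            out += static_cast<char>(((c >> 6) & 0x3F) | 0x80);
            out += static_cast<char>((c & 0x3F) | 0x80);
        }
    }
    return out;
}
```

A caller in the position of KJS::encode would pass a bool, check it after the conversion, and raise its error when the flag comes back false. Callers that do not care can omit the argument and get the old lenient behavior.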

LayoutTests:

Reviewed by Maciej.

  • fast/js/kde/resources/encode_decode_uri.js: Rewrote the test to cover edges better, and use the should functions in a way that makes failures easier to understand.
  • fast/js/kde/encode_decode_uri-expected.txt: Updated.
File:
1 edited

  • trunk/JavaScriptCore/kjs/ustring.cpp

--- trunk/JavaScriptCore/kjs/ustring.cpp (r27201)
+++ trunk/JavaScriptCore/kjs/ustring.cpp (r27406)
@@ -1348,7 +1348,4 @@
     if (c >= 0xD800 && c <= 0xDFFF)
       return -1;
-    // Backwards BOM and U+FFFF should never appear in UTF-8 data.
-    if (c == 0xFFFE || c == 0xFFFF)
-      return -1;
     return c;
   }
@@ -1370,6 +1367,9 @@
 }
 
-CString UString::UTF8String() const
-{
+CString UString::UTF8String(bool* utf16WasGood) const
+{
+  if (utf16WasGood)
+    *utf16WasGood = true;
+
   // Allocate a buffer big enough to hold all the characters.
   const int length = size();
@@ -1394,4 +1394,6 @@
       ++i;
     } else {
+      if (utf16WasGood && c >= 0xD800 && c <= 0xDFFF)
+        *utf16WasGood = false;
       *p++ = (char)((c >> 12) | 0xE0); // E0 is the 3-byte flag for UTF-8
       *p++ = (char)(((c >> 6) | 0x80) & 0xBF); // next 6 bits, with high bit set
@@ -1406,3 +1408,9 @@
 }
 
+CString UString::UTF8String() const
+{
+    return UTF8String(0);
+}
+
+
 } // namespace KJS
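
The effect of the first hunk (dropping the U+FFFE and U+FFFF check while keeping the surrogate check) can be shown with a small standalone decoder for the 3-byte UTF-8 case. This is a sketch under assumptions, not the body of KJS::decodeUTF8Sequence: the name decodeThreeByteUTF8, its signature, and the overlong check are illustrative choices; only the surrogate rejection and the -1 error return mirror the lines visible in the diff.

```cpp
// Hypothetical helper (not the KJS function): decodes exactly one 3-byte
// UTF-8 sequence starting at s. Returns the code point, or -1 if the
// sequence is malformed, overlong, or encodes a surrogate.
int decodeThreeByteUTF8(const unsigned char* s)
{
    // Lead byte must look like 1110xxxx.
    if ((s[0] & 0xF0) != 0xE0)
        return -1;
    // Both trailing bytes must look like 10xxxxxx.
    if ((s[1] & 0xC0) != 0x80 || (s[2] & 0xC0) != 0x80)
        return -1;

    int c = ((s[0] & 0x0F) << 12) | ((s[1] & 0x3F) << 6) | (s[2] & 0x3F);

    // Overlong form: the value would have fit in one or two bytes.
    if (c < 0x800)
        return -1;
    // UTF-16 surrogates must not appear in UTF-8 data.
    if (c >= 0xD800 && c <= 0xDFFF)
        return -1;
    // No special case for U+FFFE or U+FFFF: they decode like any other
    // code point.
    return c;
}
```

With this shape, the bytes EF BF BF decode to 0xFFFF and EF BF BE decode to 0xFFFE, while ED A0 80 (an encoded surrogate) is still rejected.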