Changeset 7223 in webkit for trunk/JavaScriptCore/kjs/regexp.cpp


Timestamp:
Aug 10, 2004, 2:35:09 PM
Author:
darin
Message:

JavaScriptCore:

Reviewed by Dave.

  • switch PCRE to do UTF-16 directly instead of converting to/from UTF-8 for speed
  • pcre/pcre.h: Added PCRE_UTF16 switch, set to 1. Added pcre_char typedef, which is char or uint16_t depending on the mode, and used it as appropriate in the 7 public functions that need it.
  • pcre/pcre.c: Add UTF-16 support to all functions.
  • pcre/study.c: Ditto.
  • pcre/internal.h: Added ichar typedef, which is unsigned char or uint16_t depending on the mode. Changed declarations to use symbolic constants and typedefs so we size things to ichar when needed.
  • pcre/maketables.c: (pcre_maketables): Change code to make tables that are sized to 16-bit characters instead of 8-bit.
  • pcre/get.c: (pcre_copy_substring): Use pcre_char instead of char. (pcre_get_substring_list): Ditto. (pcre_free_substring_list): Ditto. (pcre_get_substring): Ditto. (pcre_free_substring): Ditto.
  • pcre/dftables.c: (main): Used a bit more const, and use ICHAR sizes instead of hard-coding 8-bit table sizes.
  • pcre/chartables.c: Regenerated.
  • kjs/ustring.h: Remove functions that convert UTF-16 to/from UTF-8 offsets.
  • kjs/ustring.cpp: Change the shared empty string to have a unicode pointer that is not null. The null string still has a null pointer. This prevents us from passing a null through to the regular expression engine (which results in a null error even when the string length is 0).
  • kjs/regexp.cpp: (KJS::RegExp::RegExp): Null-terminate the pattern and pass it. (KJS::RegExp::match): Use the 16-bit string directly, no need to convert to UTF-8.
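
The typedef switch described for pcre/pcre.h and pcre/internal.h can be pictured roughly as below. This is only a minimal sketch: SKETCH_UTF16, sketch_char, sketch_ichar and sketch_strlen are hypothetical names made up for illustration, standing in for PCRE_UTF16, pcre_char and ichar, and the real headers are not reproduced here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the PCRE_UTF16 switch mentioned in the message.
#define SKETCH_UTF16 1

#if SKETCH_UTF16
typedef uint16_t sketch_char;   // public character type (cf. pcre_char)
typedef uint16_t sketch_ichar;  // internal character type (cf. ichar)
#else
typedef char sketch_char;
typedef unsigned char sketch_ichar;
#endif

// Length of a null-terminated pattern, counted in code units of the
// selected character type rather than in bytes.
static size_t sketch_strlen(const sketch_char* s)
{
    size_t n = 0;
    while (s[n] != 0)
        ++n;
    return n;
}
```

Code written against the typedefs rather than against char compiles in either mode, which is presumably why the message talks about sizing things to ichar "when needed".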

WebCore:

Reviewed by Dave.

  • switch PCRE to do UTF-16 directly instead of converting to/from UTF-8 for speed
  • kwq/KWQRegExp.mm: (QRegExp::KWQRegExpPrivate::compile): Null-terminate the pattern and pass it. (QRegExp::match): Use the 16-bit string directly, no need to convert to UTF-8.
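
Both the KJS and KWQ changes null-terminate the pattern before compiling it, and the ustring.cpp change avoids handing a null pointer to the engine for an empty string. A rough sketch of that idea, assuming a counted UTF-16 buffer as input; nullTerminatedCopy is a hypothetical helper, not a function from either code base:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Copies a counted UTF-16 string and appends a terminating 0 code unit.
// The returned buffer always holds at least one element, so data() is
// non-null even when the input is empty (or null with length 0).
std::vector<uint16_t> nullTerminatedCopy(const uint16_t* data, size_t length)
{
    std::vector<uint16_t> buffer;
    buffer.reserve(length + 1);
    if (length > 0)
        buffer.insert(buffer.end(), data, data + length);
    buffer.push_back(0);
    return buffer;
}
```

The result's data() pointer could then be passed wherever a null-terminated 16-bit pattern is expected, while the subject string for matching can stay a pointer plus length.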
File:
1 edited

  • trunk/JavaScriptCore/kjs/regexp.cpp

--- trunk/JavaScriptCore/kjs/regexp.cpp	(r4837)
+++ trunk/JavaScriptCore/kjs/regexp.cpp	(r7223)
@@ -43,5 +43,8 @@
   const char *errorMessage;
   int errorOffset;
-  _regex = pcre_compile(p.UTF8String().c_str(), options, &errorMessage, &errorOffset, NULL);
+  UString nullTerminated(p);
+  char null(0);
+  nullTerminated.append(null);
+  _regex = pcre_compile(reinterpret_cast<const uint16_t *>(nullTerminated.data()), options, &errorMessage, &errorOffset, NULL);
   if (!_regex) {
 #ifndef NDEBUG
@@ -120,7 +123,5 @@
   }
 
-  const CString buffer(s.UTF8String());
-  convertUTF16OffsetsToUTF8Offsets(buffer.c_str(), &i, 1);
-  const int numMatches = pcre_exec(_regex, NULL, buffer.c_str(), buffer.size(), i, 0, offsetVector, offsetVectorSize);
+  const int numMatches = pcre_exec(_regex, NULL, reinterpret_cast<const uint16_t *>(s.data()), s.size(), i, 0, offsetVector, offsetVectorSize);
 
   if (numMatches < 0) {
@@ -133,6 +134,4 @@
     return UString::null();
   }
-
-  convertUTF8OffsetsToUTF16Offsets(buffer.c_str(), offsetVector, (numMatches == 0 ? 1 : numMatches) * 2);
 
   *pos = offsetVector[0];
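
The removed conversion calls were needed because a position counted in UTF-16 code units is generally not the same number as the matching position counted in UTF-8 bytes. The sketch below shows the kind of per-offset work that matching on the 16-bit string directly avoids. It is an illustration only: utf8OffsetForUTF16Offset is a hypothetical function, and the actual convertUTF16OffsetsToUTF8Offsets implementation is not shown in this changeset.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Returns the UTF-8 byte offset that corresponds to a UTF-16 code-unit
// offset, by walking the string and summing the encoded size of each
// character that lies before the offset.
size_t utf8OffsetForUTF16Offset(const uint16_t* s, size_t length, size_t utf16Offset)
{
    size_t bytes = 0;
    size_t i = 0;
    while (i < utf16Offset && i < length) {
        const uint16_t u = s[i];
        if (u < 0x80) {
            bytes += 1;               // ASCII: 1 byte
            i += 1;
        } else if (u < 0x800) {
            bytes += 2;               // U+0080..U+07FF: 2 bytes
            i += 1;
        } else if (u >= 0xD800 && u <= 0xDBFF && i + 1 < length
                   && s[i + 1] >= 0xDC00 && s[i + 1] <= 0xDFFF) {
            bytes += 4;               // surrogate pair: 2 units, 4 bytes
            i += 2;
        } else {
            bytes += 3;               // rest of the BMP (an unpaired
            i += 1;                   // surrogate is counted as 3 bytes here)
        }
    }
    return bytes;
}
```

Each conversion is a linear scan over the text before the offset, and the old code ran one conversion on the start index before pcre_exec and another over the whole offset vector afterwards.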