Timestamp:
Nov 3, 2007, 10:22:44 PM
Author:
Darin Adler
Message:

JavaScriptCore:

Reviewed by Maciej.

A first step toward removing the PCRE features we don't use.
This gives a 0.8% speedup on SunSpider, and a 6.5% speedup on
the SunSpider regular expression test.

Replaced the public interface with one that doesn't use the
name PCRE. Removed code we don't need for JavaScript and various
configurations we don't use. This is in preparation for still
more changes in the future. We'll probably switch to C++ and
make some even more significant changes to the regexp engine
to get some additional speed.

There's probably additional unused stuff that I haven't
deleted yet.

This does mean that our PCRE is now a fork, but I think that's
not really a big deal.

  • JavaScriptCore.exp: Remove the 5 old entry points and add the 3 new entry points for WebCore's direct use of the regular expression engine.
  • kjs/config.h: Remove the USE(PCRE16) define. I decided to flip its sense and now there's a USE(POSIX_REGEX) instead, which should probably not be set by anyone. Maybe later we'll just get rid of it altogether.
  • kjs/regexp.h:
  • kjs/regexp.cpp: (KJS::RegExp::RegExp): Switch to new jsRegExp function names and defines. Cut down on the number of functions used. (KJS::RegExp::~RegExp): Ditto. (KJS::RegExp::match): Ditto.
  • pcre/dftables.c: (main): Get rid of ctype_letter and ctype_meta, which are unused.
  • pcre/pcre-config.h: Get rid of EBCDIC, PCRE_DATA_SCOPE, const, size_t, HAVE_STRERROR, HAVE_MEMMOVE, HAVE_BCOPY, NEWLINE, POSIX_MALLOC_THRESHOLD, NO_RECURSE, SUPPORT_UCP, SUPPORT_UTF8, and JAVASCRIPT. These are all no longer configurable in our copy of the library.
  • pcre/pcre.h: Remove the macro-based kjs prefix hack, the PCRE version macros, PCRE_UTF16, the code to set up PCRE_DATA_SCOPE, the include of <stdlib.h>, and most of the constants and functions defined in this header. Changed the naming scheme to use a JSRegExp prefix rather than a pcre prefix. In the future, we'll probably change this to be a C++ header.
  • pcre/pcre_compile.c: Removed all unused code branches, including many whole functions and various byte codes. Kept changes outside of removal to a minimum. (check_escape): (first_significant_code): (find_fixedlength): (find_recurse): (could_be_empty_branch): (compile_branch): (compile_regex): (is_anchored): (is_startline): (find_firstassertedchar): (jsRegExpCompile): Renamed from pcre_compile2 and changed the parameters around a bit. (jsRegExpFree): Added.
  • pcre/pcre_exec.c: Removed many unused opcodes and variables. Also started tearing down the NO_RECURSE mechanism since it's now the default. In some cases, values kept in the explicit frame could be turned into plain old local variables, along with other similar small optimizations. (pchars): (match_ref): (match): Changed parameters quite a bit since it's now not used recursively. (jsRegExpExecute): Renamed from pcre_exec.
  • pcre/pcre_internal.h: Get rid of PCRE_DEFINITION, PCRE_SPTR, PCRE_IMS, PCRE_ICHANGED, PCRE_NOPARTIAL, PCRE_STUDY_MAPPED, PUBLIC_OPTIONS, PUBLIC_EXEC_OPTIONS, PUBLIC_DFA_EXEC_OPTIONS, PUBLIC_STUDY_OPTIONS, MAGIC_NUMBER, 16 of the opcodes, _pcre_utt, _pcre_utt_size, _pcre_try_flipped, _pcre_ucp_findprop, and _pcre_valid_utf8. Also moved pcre_malloc and pcre_free here.
  • pcre/pcre_maketables.c: Changed to only compile in dftables. Also got rid of many of the tables that we don't use.
  • pcre/pcre_tables.c: Removed the unused Unicode property tables.
  • pcre/pcre_ucp_searchfuncs.c: Removed everything except for _pcre_ucp_othercase.
  • pcre/pcre_xclass.c: (_pcre_xclass): Removed unneeded support for classes based on Unicode properties.
  • wtf/FastMallocPCRE.cpp: Removed unused bits. It would be good to eliminate this completely, but we need the regular expression code to be C++ first.
  • pcre/pcre_fullinfo.c:
  • pcre/pcre_get.c:
  • pcre/ucp.h: Files that are no longer needed. I didn't remove them with this check-in, because I didn't want to modify all the project files.

WebCore:

Reviewed by Maciej.

  • page/Frame.cpp: (WebCore::Frame::matchLabelsAgainstElement):
  • page/mac/FrameMac.mm: (WebCore::Frame::matchLabelsAgainstElement): Remove use of ":digit:" syntax. This hasn't worked for some time. Use "\d" instead.
  • platform/RegularExpression.h: Remove the unused cap function. We can add it back later if we find we need it.
  • platform/RegularExpression.cpp: (WebCore::RegularExpression::Private::compile): Update for JavaScriptCore regular expression entry point changes. (WebCore::RegularExpression::Private::~Private): Ditto. (WebCore::RegularExpression::match): Remove the code to set PCRE_NOTBOL. This means that regular expressions with metacharacters like ^ in them won't work any more with non-whole-string searches, but we don't use any regular expressions like that.
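
The effect of dropping a "not beginning of line" flag can be illustrated with POSIX <regex.h>, whose REG_NOTBOL flag plays a similar role to PCRE_NOTBOL. This is only a sketch of the general behavior, not the JavaScriptCore or WebCore code; the helper name search_from and its parameters are made up for the illustration.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: searches for `pattern` in `text` starting at `offset`.
   Returns 1 on a match, 0 on no match, and -1 if the pattern does not compile.
   When `not_bol` is nonzero, REG_NOTBOL tells the engine that the start of the
   substring is not the start of a line, so ^ cannot match there. */
static int search_from(const char *pattern, const char *text, size_t offset, int not_bol)
{
    regex_t re;
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;
    int rc = regexec(&re, text + offset, 0, NULL, not_bol ? REG_NOTBOL : 0);
    regfree(&re);
    return rc == 0 ? 1 : 0;
}
```

Under these assumptions, search_from("^b", "ab", 1, 0) reports a match because the engine treats offset 1 as the start of the string, while passing a nonzero not_bol suppresses it. Patterns without ^ behave the same either way, which is why removing the flag is harmless when no such patterns are used.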

  • trunk/JavaScriptCore/pcre/pcre_maketables.c

--- trunk/JavaScriptCore/pcre/pcre_maketables.c (r26697)
+++ trunk/JavaScriptCore/pcre/pcre_maketables.c (r27419)
@@ -39,15 +39,4 @@
 
 
-/* This module contains the external function pcre_maketables(), which builds
-character tables for PCRE in the current locale. The file is compiled on its
-own as part of the PCRE library. However, it is also included in the
-compilation of dftables.c, in which case the macro DFTABLES is defined. */
-
-
-#ifndef DFTABLES
-#include "pcre_internal.h"
-#endif
-
-
 /*************************************************
 *           Create PCRE character tables         *
@@ -64,5 +53,5 @@
 */
 
-const unsigned char *
+static const unsigned char *
 pcre_maketables(void)
 {
@@ -97,17 +86,10 @@
 
 memset(p, 0, cbit_length);
-for (i = 0; i < 256; i++)
+for (i = 0; i < 128; i++)
   {
   if (isdigit(i)) p[cbit_digit  + i/8] |= 1 << (i&7);
-  if (isupper(i)) p[cbit_upper  + i/8] |= 1 << (i&7);
-  if (islower(i)) p[cbit_lower  + i/8] |= 1 << (i&7);
   if (isalnum(i)) p[cbit_word   + i/8] |= 1 << (i&7);
   if (i == '_')   p[cbit_word   + i/8] |= 1 << (i&7);
   if (isspace(i)) p[cbit_space  + i/8] |= 1 << (i&7);
-  if (isxdigit(i))p[cbit_xdigit + i/8] |= 1 << (i&7);
-  if (isgraph(i)) p[cbit_graph  + i/8] |= 1 << (i&7);
-  if (isprint(i)) p[cbit_print  + i/8] |= 1 << (i&7);
-  if (ispunct(i)) p[cbit_punct  + i/8] |= 1 << (i&7);
-  if (iscntrl(i)) p[cbit_cntrl  + i/8] |= 1 << (i&7);
   }
 p += cbit_length;
@@ -117,23 +99,13 @@
 within regexes. */
 
-for (i = 0; i < 256; i++)
+for (i = 0; i < 128; i++)
   {
   int x = 0;
-  if (
-#if !JAVASCRIPT
-      *i != 0x0b &&
-#endif
-        isspace(i)) x += ctype_space;
-  if (isalpha(i)) x += ctype_letter;
+  if (isspace(i)) x += ctype_space;
   if (isdigit(i)) x += ctype_digit;
   if (isxdigit(i)) x += ctype_xdigit;
   if (isalnum(i) || i == '_') x += ctype_word;
-
-  /* Note: strchr includes the terminating zero in the characters it considers.
-  In this instance, that is ok because we want binary zero to be flagged as a
-  meta-character, which in this sense is any character that terminates a run
-  of data characters. */
-
-  if (strchr("*+?{^.$|()[", i) != 0) x += ctype_meta; *p++ = x; }
+  *p++ = x;
+  }
 
 return yield;
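
The two loops in this diff build a per-class bitmap (one bit per character) and a per-character flag byte. A standalone sketch of the same technique, limited to the 128 ASCII code points as in the new revision, might look like the following. The names build_class_bitmap, bitmap_contains, build_ctypes, the FLAG_* values, and the 32-byte bitmap size are assumptions made for this illustration and are not taken from pcre_internal.h.

```c
#include <ctype.h>
#include <string.h>

/* Assumed sizes: 32 bytes hold 256 bits, but only the first 128 are filled. */
enum { ASCII_LIMIT = 128, CLASS_BYTES = 32 };

/* Illustrative flag values; the real ctype_* constants may differ. */
enum { FLAG_SPACE = 0x01, FLAG_DIGIT = 0x04, FLAG_XDIGIT = 0x08, FLAG_WORD = 0x10 };

/* Sets bit (c & 7) of byte c / 8 for every ASCII character accepted by
   is_member, mirroring the p[cbit_x + i/8] |= 1 << (i&7) pattern. */
static void build_class_bitmap(unsigned char bitmap[CLASS_BYTES], int (*is_member)(int))
{
    memset(bitmap, 0, CLASS_BYTES);
    for (int i = 0; i < ASCII_LIMIT; i++)
        if (is_member(i))
            bitmap[i / 8] |= (unsigned char)(1 << (i & 7));
}

/* Tests a single bit; characters outside the table are never members. */
static int bitmap_contains(const unsigned char bitmap[CLASS_BYTES], unsigned c)
{
    return c < 256 && (bitmap[c / 8] & (1 << (c & 7))) != 0;
}

/* Fills one flag byte per ASCII character, as the second loop does. */
static void build_ctypes(unsigned char ctypes[ASCII_LIMIT])
{
    for (int i = 0; i < ASCII_LIMIT; i++) {
        unsigned char x = 0;
        if (isspace(i)) x |= FLAG_SPACE;
        if (isdigit(i)) x |= FLAG_DIGIT;
        if (isxdigit(i)) x |= FLAG_XDIGIT;
        if (isalnum(i) || i == '_') x |= FLAG_WORD;
        ctypes[i] = x;
    }
}
```

Calling build_class_bitmap(bitmap, isdigit) and then bitmap_contains(bitmap, '7') checks membership with one shift and one mask. Stopping at 128 keeps every argument to the <ctype.h> functions inside the ASCII range, leaving the upper half of each bitmap zeroed.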