Timestamp:
Nov 3, 2007, 10:22:44 PM
Author:
Darin Adler
Message:

JavaScriptCore:

Reviewed by Maciej.

A first step toward removing the PCRE features we don't use.
This gives a 0.8% speedup on SunSpider, and a 6.5% speedup on
the SunSpider regular expression test.

Replaced the public interface with one that doesn't use the
name PCRE. Removed code we don't need for JavaScript and various
configurations we don't use. This is in preparation for still
more changes in the future. We'll probably switch to C++ and
make some even more significant changes to the regexp engine
to get some additional speed.

There's probably additional unused stuff that I haven't
deleted yet.

This does mean that our PCRE is now a fork, but I think that's
not really a big deal.

  • JavaScriptCore.exp: Remove the 5 old entry points and add the 3 new entry points for WebCore's direct use of the regular expression engine.
  • kjs/config.h: Remove the USE(PCRE16) define. I decided to flip its sense and now there's a USE(POSIX_REGEX) instead, which should probably not be set by anyone. Maybe later we'll just get rid of it altogether.
  • kjs/regexp.h:
  • kjs/regexp.cpp: (KJS::RegExp::RegExp): Switch to new jsRegExp function names and defines. Cut down on the number of functions used. (KJS::RegExp::~RegExp): Ditto. (KJS::RegExp::match): Ditto.
  • pcre/dftables.c: (main): Get rid of ctype_letter and ctype_meta, which are unused.
  • pcre/pcre-config.h: Get rid of EBCDIC, PCRE_DATA_SCOPE, const, size_t, HAVE_STRERROR, HAVE_MEMMOVE, HAVE_BCOPY, NEWLINE, POSIX_MALLOC_THRESHOLD, NO_RECURSE, SUPPORT_UCP, SUPPORT_UTF8, and JAVASCRIPT. These are all no longer configurable in our copy of the library.
  • pcre/pcre.h: Remove the macro-based kjs prefix hack, the PCRE version macros, PCRE_UTF16, the code to set up PCRE_DATA_SCOPE, the include of <stdlib.h>, and most of the constants and functions defined in this header. Changed the naming scheme to use a JSRegExp prefix rather than a pcre prefix. In the future, we'll probably change this to be a C++ header.
  • pcre/pcre_compile.c: Removed all unused code branches, including many whole functions and various byte codes. Kept changes outside of removal to a minimum. (check_escape): (first_significant_code): (find_fixedlength): (find_recurse): (could_be_empty_branch): (compile_branch): (compile_regex): (is_anchored): (is_startline): (find_firstassertedchar): (jsRegExpCompile): Renamed from pcre_compile2 and changed the parameters around a bit. (jsRegExpFree): Added.
  • pcre/pcre_exec.c: Removed many unused opcodes and variables. Also started tearing down the NO_RECURSE mechanism since it's now the default. In some cases there were things in the explicit frame that could be turned into plain old local variables, along with other similar small optimizations. (pchars): (match_ref): (match): Changed parameters quite a bit since it's now not used recursively. (jsRegExpExecute): Renamed from pcre_exec.
  • pcre/pcre_internal.h: Get rid of PCRE_DEFINITION, PCRE_SPTR, PCRE_IMS, PCRE_ICHANGED, PCRE_NOPARTIAL, PCRE_STUDY_MAPPED, PUBLIC_OPTIONS, PUBLIC_EXEC_OPTIONS, PUBLIC_DFA_EXEC_OPTIONS, PUBLIC_STUDY_OPTIONS, MAGIC_NUMBER, 16 of the opcodes, _pcre_utt, _pcre_utt_size, _pcre_try_flipped, _pcre_ucp_findprop, and _pcre_valid_utf8. Also moved pcre_malloc and pcre_free here.
  • pcre/pcre_maketables.c: Changed to only compile in dftables. Also got rid of many of the tables that we don't use.
  • pcre/pcre_tables.c: Removed the unused Unicode property tables.
  • pcre/pcre_ucp_searchfuncs.c: Removed everything except for _pcre_ucp_othercase.
  • pcre/pcre_xclass.c: (_pcre_xclass): Removed unneeded support for classes based on Unicode properties.
  • wtf/FastMallocPCRE.cpp: Removed unused bits. It would be good to eliminate this completely, but we need the regular expression code to be C++ first.
  • pcre/pcre_fullinfo.c:
  • pcre/pcre_get.c:
  • pcre/ucp.h: Files that are no longer needed. I didn't remove them with this check-in, because I didn't want to modify all the project files.
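The pcre_exec.c entry above mentions that match() is no longer used recursively and that backtracking state lives in an explicit frame. As a rough illustration of that general technique only, here is a toy matcher that backtracks through a heap-allocated stack instead of the call stack. Everything in it (the name matchIterative, the tiny pattern language with '.' and 'x*') is hypothetical and is not taken from pcre_exec.c.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Toy whole-string matcher supporting literals, '.', and a trailing '*'
// on a single item. Hypothetical sketch, not the PCRE algorithm.
bool matchIterative(const std::string& pattern, const std::string& text)
{
    // Each frame is a pending alternative: (pattern position, text position).
    std::vector<std::pair<std::size_t, std::size_t>> frames;
    frames.push_back(std::make_pair(std::size_t(0), std::size_t(0)));

    while (!frames.empty()) {
        std::size_t p = frames.back().first;
        std::size_t t = frames.back().second;
        frames.pop_back();

        for (;;) {
            if (p == pattern.size()) {
                if (t == text.size())
                    return true;
                break; // pattern exhausted too early: backtrack
            }
            bool starred = p + 1 < pattern.size() && pattern[p + 1] == '*';
            bool charMatches = t < text.size()
                && (pattern[p] == '.' || pattern[p] == text[t]);
            if (starred) {
                // Save the "consume one more character" alternative on the
                // explicit stack, then try skipping the starred item.
                if (charMatches)
                    frames.push_back(std::make_pair(p, t + 1));
                p += 2;
                continue;
            }
            if (!charMatches)
                break; // dead end: backtrack to the next saved frame
            ++p;
            ++t;
        }
    }
    return false;
}
```

A recursive version would call itself where this one pushes a frame; keeping the frames in a std::vector means deep backtracking consumes heap memory rather than call-stack space.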

WebCore:

Reviewed by Maciej.

  • page/Frame.cpp: (WebCore::Frame::matchLabelsAgainstElement):
  • page/mac/FrameMac.mm: (WebCore::Frame::matchLabelsAgainstElement): Remove use of ":digit:" syntax. This hasn't worked for some time. Use "\d" instead.
  • platform/RegularExpression.h: Remove the unused cap function. We can add it back later if we find we need it.
  • platform/RegularExpression.cpp: (WebCore::RegularExpression::Private::compile): Update for JavaScriptCore regular expression entry point changes. (WebCore::RegularExpression::Private::~Private): Ditto. (WebCore::RegularExpression::match): Remove the code to set PCRE_NOTBOL. This means that regular expressions with metacharacters like ^ in them won't work any more with non-whole-string searches, but we don't use any regular expressions like that.
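The Frame.cpp entries above switch digit matching to "\d". A minimal sketch of that escape, using std::regex with its default ECMAScript grammar as a stand-in for the JavaScriptCore engine (an assumption made for illustration; containsDigit is a hypothetical helper, not a WebCore function):

```cpp
#include <regex>
#include <string>

// Hypothetical helper: true if `label` contains at least one decimal digit.
bool containsDigit(const std::string& label)
{
    // std::regex defaults to the ECMAScript grammar, where \d matches [0-9].
    static const std::regex digit(R"(\d)");
    return std::regex_search(label, digit);
}
```

For example, containsDigit("Address line 2") is true and containsDigit("Address") is false.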
File:
1 edited

  • trunk/JavaScriptCore/pcre/pcre-config.h

    r26697 r27419  
    1 
    2 /* On Unix-like systems config.in is converted by "configure" into config.h.
    3 Some other environments also support the use of "configure". PCRE is written in
    4 Standard C, but there are a few non-standard things it can cope with, allowing
    5 it to run on SunOS4 and other "close to standard" systems.
    6 
    7 On a non-Unix-like system you should just copy this file into config.h, and set
    8 up the macros the way you need them. You should normally change the definitions
    9 of HAVE_STRERROR and HAVE_MEMMOVE to 1. Unfortunately, because of the way
    10 autoconf works, these cannot be made the defaults. If your system has bcopy()
    11 and not memmove(), change the definition of HAVE_BCOPY instead of HAVE_MEMMOVE.
    12 If your system has neither bcopy() nor memmove(), leave them both as 0; an
    13 emulation function will be used. */
    14 
    15 /* If you are compiling for a system that uses EBCDIC instead of ASCII
    16 character codes, define this macro as 1. On systems that can use "configure",
    17 this can be done via --enable-ebcdic. */
    18 
    19 #ifndef EBCDIC
    20 #define EBCDIC 0
    21 #endif
    22 
    23 /* If you are compiling for a system other than a Unix-like system or Win32,
    24 and it needs some magic to be inserted before the definition of a function that
    25 is exported by the library, define this macro to contain the relevant magic. If
    26 you do not define this macro, it defaults to "extern" for a C compiler and
    27 "extern C" for a C++ compiler on non-Win32 systems. This macro apears at the
    28 start of every exported function that is part of the external API. It does not
    29 appear on functions that are "external" in the C sense, but which are internal
    30 to the library. */
    31 
    32 #define PCRE_DATA_SCOPE extern
    33 
    34 /* Define the following macro to empty if the "const" keyword does not work. */
    35 
    36 #undef const
    37 
    38 /* Define the following macro to "unsigned" if <stddef.h> does not define
    39 size_t. */
    40 
    41 #undef size_t
    42 
    43 /* The following two definitions are mainly for the benefit of SunOS4, which
    44 does not have the strerror() or memmove() functions that should be present in
    45 all Standard C libraries. The macros HAVE_STRERROR and HAVE_MEMMOVE should
    46 normally be defined with the value 1 for other systems, but unfortunately we
    47 cannot make this the default because "configure" files generated by autoconf
    48 will only change 0 to 1; they won't change 1 to 0 if the functions are not
    49 found. */
    50 
    51 #define HAVE_STRERROR 1
    52 #define HAVE_MEMMOVE  1
    53 
    54 /* There are some non-Unix-like systems that don't even have bcopy(). If this
    55 macro is false, an emulation is used. If HAVE_MEMMOVE is set to 1, the value of
    56 HAVE_BCOPY is not relevant. */
    57 
    58 #define HAVE_BCOPY    0
    59 
    60 /* The value of NEWLINE determines the newline character. The default is to
    61 leave it up to the compiler, but some sites want to force a particular value.
    62 On Unix-like systems, "configure" can be used to override this default. */
    63 
    64 #ifndef NEWLINE
    65 #define NEWLINE '\n'
    66 #endif
    67 
    68 1 /* The value of LINK_SIZE determines the number of bytes used to store links as
    69 2 offsets within the compiled regex. The default is 2, which allows for compiled
     
    73 6 to override this default. */
    74 7
    75 #ifndef LINK_SIZE
    76 8 #define LINK_SIZE   2
    77 #endif
    78 
    79 /* When calling PCRE via the POSIX interface, additional working storage is
    80 required for holding the pointers to capturing substrings because PCRE requires
    81 three integers per substring, whereas the POSIX interface provides only two. If
    82 the number of expected substrings is small, the wrapper function uses space on
    83 the stack, because this is faster than using malloc() for each call. The
    84 threshold above which the stack is no longer used is defined by POSIX_MALLOC_
    85 THRESHOLD. On systems that support it, "configure" can be used to override this
    86 default. */
    87 
    88 #ifndef POSIX_MALLOC_THRESHOLD
    89 #define POSIX_MALLOC_THRESHOLD 10
    90 #endif
    91 
    92 /* PCRE uses recursive function calls to handle backtracking while matching.
    93 This can sometimes be a problem on systems that have stacks of limited size.
    94 Define NO_RECURSE to get a version that doesn't use recursion in the match()
    95 function; instead it creates its own stack by steam using pcre_recurse_malloc()
    96 to obtain memory from the heap. For more detail, see the comments and other
    97 stuff just above the match() function. On systems that support it, "configure"
    98 can be used to set this in the Makefile (use --disable-stack-for-recursion). */
    99 
    100 #define NO_RECURSE
    101 9
    102 10 /* The value of MATCH_LIMIT determines the default number of times the internal
     
    108 16 override this default default. */
    109 17
    110 #ifndef MATCH_LIMIT
    111 18 #define MATCH_LIMIT 10000000
    112 #endif
    113 19
    114 20 /* The above limit applies to all calls of match(), whether or not they
     
    121 27 "configure" can be used to override this default default. */
    122 28
    123 #ifndef MATCH_LIMIT_RECURSION
    124 29 #define MATCH_LIMIT_RECURSION MATCH_LIMIT
    125 #endif
    126 
    127 #define SUPPORT_UCP 1
    128 #define SUPPORT_UTF8 1
    129 
    130 #define JAVASCRIPT 1
    131 30
    132 31 /* End */
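The hunks above drop the #ifndef guards around LINK_SIZE, MATCH_LIMIT, and MATCH_LIMIT_RECURSION while keeping the #define lines. A small sketch of what that difference means for a build, using hypothetical EXAMPLE_ macros rather than the real names so nothing here is mistaken for the header itself:

```cpp
// Stands in for a value supplied earlier, e.g. -DEXAMPLE_GUARDED=4 on the
// compiler command line (an assumption made for illustration).
#define EXAMPLE_GUARDED 4

// Guarded form, as in r26697: an earlier definition wins over the default.
#ifndef EXAMPLE_GUARDED
#define EXAMPLE_GUARDED 2
#endif

// Unguarded form, as in r27419: the header fixes the value itself.
#define EXAMPLE_FIXED 2

constexpr int guardedValue() { return EXAMPLE_GUARDED; }
constexpr int fixedValue() { return EXAMPLE_FIXED; }

static_assert(guardedValue() == 4, "earlier definition overrides the guarded default");
static_assert(fixedValue() == 2, "unguarded value comes from the header");
```

With the guard gone, an earlier definition that differs from the header's value becomes a macro redefinition, which compilers diagnose, instead of a silent override. That fits the commit message's note that these settings are no longer configurable.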