source: webkit/trunk/JavaScriptCore/kjs/regexp.h@ 25625

Last change on this file since 25625 was 24453, checked in by darin, 18 years ago

Reviewed by Geoff.

  • fix <rdar://problem/5345440> PCRE computes wrong length for expressions with quantifiers on named recursion or subexpressions

It's challenging to implement proper preflighting for compiling these advanced features.
But we don't want them in the JavaScript engine anyway.

Turned off the following features of PCRE (some of these are simply parsed and not implemented):

\C \E \G \L \N \P \Q \U \X \Z
\e \l \p \u \z
[::] .. [==]
(?#) (?<=) (?<!) (?>)
(?C) (?P) (?R)
(?0) (and 1-9)
(?imsxUX)

Added the following:

\u \v

Because of \v, the js1_2/regexp/special_characters.js test now passes.
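
As a rough illustration of what the two added escapes mean in JavaScript, here is a minimal sketch that decodes `\uXXXX` and `\v` in a UTF-16 string. It is not the PCRE code from this change: `decodeEscapes` and `hexValue` are hypothetical names, `isHexDigit` only echoes the declaration in regexp.h, and the choice to copy any other escape through unchanged is an assumption made to keep the example short.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper; the name echoes regexp.h but the body is an assumption.
static bool isHexDigit(char16_t c)
{
    return (c >= u'0' && c <= u'9') || (c >= u'a' && c <= u'f') || (c >= u'A' && c <= u'F');
}

// Hypothetical helper: value of a single hex digit (caller checks isHexDigit first).
static unsigned hexValue(char16_t c)
{
    if (c >= u'0' && c <= u'9')
        return c - u'0';
    if (c >= u'a' && c <= u'f')
        return c - u'a' + 10;
    return c - u'A' + 10;
}

// Decodes \uXXXX (exactly four hex digits) and \v (U+000B, vertical tab).
// Any other backslash pair, including a malformed \u, is copied unchanged.
std::u16string decodeEscapes(const std::u16string& in)
{
    std::u16string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        if (in[i] != u'\\' || i + 1 >= in.size()) {
            out += in[i];
            continue;
        }
        char16_t next = in[i + 1];
        if (next == u'v') {
            out += static_cast<char16_t>(0x000B);
            ++i;
            continue;
        }
        if (next == u'u' && i + 5 < in.size()
            && isHexDigit(in[i + 2]) && isHexDigit(in[i + 3])
            && isHexDigit(in[i + 4]) && isHexDigit(in[i + 5])) {
            unsigned value = (hexValue(in[i + 2]) << 12) | (hexValue(in[i + 3]) << 8)
                           | (hexValue(in[i + 4]) << 4) | hexValue(in[i + 5]);
            out += static_cast<char16_t>(value);
            i += 5;
            continue;
        }
        // Unrecognized escape: keep the backslash and the escaped character together,
        // so that an escaped backslash is never misread as the start of a new escape.
        out += in[i];
        out += next;
        ++i;
    }
    return out;
}
```

With this sketch, the six-character input `\u2013` becomes the single code unit U+2013, while an escaped backslash followed by `v` is left alone.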

To be conservative, I left some features that JavaScript doesn't want, such as
\012 and \x{2013}, in place. We can revisit these later; they're not closely enough
related to the incorrect preflighting to remove now.

I also didn't try to remove unused opcodes and remove code from the execution engine.
That could save code size and speed things up a bit, but it would require more changes.

  • kjs/regexp.h:
  • kjs/regexp.cpp: (KJS::RegExp::RegExp): Remove the sanitizePattern workaround for lack of \u support, since the PCRE code now has \u support.
  • pcre/pcre-config.h: Set JAVASCRIPT to 1.
  • pcre/pcre_internal.h: Added ESC_v.
  • pcre/pcre_compile.c: Added a different escape table for when JAVASCRIPT is set that omits all the escapes we don't want interpreted and includes '\v'. (check_escape): Put !JAVASCRIPT around the code for '\l', '\L', '\N', '\u', and '\U', and added code to handle '\u2013' inside JAVASCRIPT. (compile_branch): Put a !JAVASCRIPT #if around all the code implementing the features we don't want. (pcre_compile2): Ditto.
  • tests/mozilla/expected.html: Updated since js1_2/regexp/special_characters.js now passes.
  • Property svn:eol-style set to native
File size: 2.1 KB
// -*- c-basic-offset: 2 -*-
/*
 * This file is part of the KDE libraries
 * Copyright (C) 1999-2000 Harri Porten ([email protected])
 *
 * This library is free software; you can redistribute it and/or
 * modify it under the terms of the GNU Lesser General Public
 * License as published by the Free Software Foundation; either
 * version 2 of the License, or (at your option) any later version.
 *
 * This library is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
 * Lesser General Public License for more details.
 *
 * You should have received a copy of the GNU Lesser General Public
 * License along with this library; if not, write to the Free Software
 * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
 *
 */

#ifndef _KJS_REGEXP_H_
#define _KJS_REGEXP_H_

#include <sys/types.h>

#include "config.h"

#if HAVE(PCREPOSIX)
#include <pcre.h>
#else // POSIX regex - not so good...
extern "C" { // bug with some libc5 distributions
#include <regex.h>
}
#endif // HAVE(PCREPOSIX)

#include "ustring.h"

namespace KJS {

  class RegExp {
  public:
    enum { None = 0, Global = 1, IgnoreCase = 2, Multiline = 4 };

    RegExp(const UString &pattern, int flags = None);
    ~RegExp();

    int flags() const { return m_flags; }
    bool isValid() const { return !m_constructionError; }
    const char* errorMessage() const { return m_constructionError; }

    UString match(const UString &s, int i, int *pos = 0, int **ovector = 0);
    unsigned subPatterns() const { return m_numSubPatterns; }

  private:
#if HAVE(PCREPOSIX)
    pcre *m_regex;
#else
    regex_t m_regex;
#endif
    int m_flags;
    char* m_constructionError;
    unsigned m_numSubPatterns;

    RegExp(const RegExp &);
    RegExp &operator=(const RegExp &);

    static bool isHexDigit(UChar);
    static unsigned char convertHex(int);
    static unsigned char convertHex(int, int);
    static UChar convertUnicode(UChar, UChar, UChar, UChar);
  };

} // namespace

#endif
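
The flag values in the enum above are distinct powers of two, so they can be combined into the single `int` that the constructor accepts. The following sketch shows that bitmask idea in isolation. It is a stand-in rather than the real `KJS::RegExp`: the enum is copied from the header, while `RegExpFlags`, `parseFlags`, and `hasFlag` are hypothetical names introduced only for illustration.

```cpp
#include <string>

// Stand-in copy of the flag values from regexp.h; not the real KJS::RegExp class.
enum RegExpFlags { None = 0, Global = 1, IgnoreCase = 2, Multiline = 4 };

// Hypothetical helper: maps a JavaScript flag string such as "gi" to a bitmask.
// Returns -1 for an unknown or repeated flag character.
int parseFlags(const std::string& text)
{
    int flags = None;
    for (char c : text) {
        int bit;
        switch (c) {
        case 'g': bit = Global; break;
        case 'i': bit = IgnoreCase; break;
        case 'm': bit = Multiline; break;
        default: return -1;
        }
        if (flags & bit)
            return -1; // the same flag was given twice
        flags |= bit;
    }
    return flags;
}

// Hypothetical helper: tests whether a single flag is set in a mask.
bool hasFlag(int flags, int bit)
{
    return (flags & bit) != 0;
}
```

Under these assumptions, `parseFlags("gi")` yields `Global | IgnoreCase`, which is the kind of value one would pass as the `flags` argument of the constructor declared in the header.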