source: webkit/trunk/JavaScriptCore/kjs/lexer.cpp@ 1024

Last change on this file since 1024 was 1024, checked in by darin, 23 years ago

Merged KDE 3.0 final code in and:

JavaScriptCore:

  • kjs/internal.cpp:
  • kjs/property_map.cpp:
  • kjs/ustring.h: Removed some unneeded <config.h> includes so we are more similar to the real KDE sources.

Merged changes from KDE 3.0 final and did some build fixes.

  • kjs/grammar.*: Regenerated.
  • kjs/*.lut.h: Regenerated.

WebCore:

  • src/kdelibs/khtml/rendering/render_text.cpp: (TextSlave::printDecoration): Remove some minor gratuitous diffs vs. KDE.
  • src/kdelibs/khtml/rendering/render_text.cpp: (TextSlave::printDecoration): Richard updated to reflect changes in KDE.
  • src/kdelibs/khtml/css/css_valueimpl.cpp: (FontFamilyValueImpl::FontFamilyValueImpl): Fix comment.
  • src/kdelibs/khtml/css/cssstyleselector.cpp: Remove some gratuitous diffs vs. KDE.
  • src/kdelibs/khtml/html/html_objectimpl.cpp: (HTMLEmbedElementImpl::parseAttribute): Remove unneeded copy from KWQ's early days.
  • src/kdelibs/khtml/html/html_tableimpl.cpp: (HTMLTableElementImpl::parseAttribute), (HTMLTablePartElementImpl::parseAttribute): Remove unneeded copy from KWQ's early days.
  • src/kdelibs/khtml/html/htmltokenizer.cpp: (HTMLTokenizer::processToken): Redo the APPLE_CHANGES ifdef here.
  • src/kdelibs/khtml/khtmlpart_p.h: Update to latest KDE.
  • src/kdelibs/khtml/khtmlview.cpp: (KHTMLView::KHTMLView): Add ifdef APPLE_CHANGES. (KHTMLView::~KHTMLView): Add ifdef APPLE_CHANGES. (KHTMLView::print): Remove code left in here during merge process.
  • src/kwq/KWQKHTMLPart.mm: Remove unused setFontSizes(), fontSizes(), and resetFontSizes(). After the merge is landed, remove more.
  • src/libwebcore.exp: Export updateStyleSelector() for WebKit.

Fix text so it displays at the right font size.

  • src/kdelibs/khtml/css/cssstyleselector.cpp: (CSSStyleSelector::computeFontSizes): Apply the same SCREEN_RESOLUTION hack here that we do elsewhere.
  • src/kdelibs/khtml/rendering/font.cpp: (Font::width): Use kMin instead of max (oops). (Font::update): Turn off font database chicanery.
  • src/kwq/KWQKHTMLPart.mm: (KHTMLPart::zoomFactor): Use zoom factor 100, not 1.

More fixes so text displays (still at wrong font size).

  • src/kdelibs/khtml/rendering/font.cpp: (max): New helper. (Font::drawText): Simplified implementation for now. (Font::width): Simplified implementation for now.
  • src/kwq/KWQColorGroup.mm: Reinstated QCOLOR_GROUP_SIZE.
  • src/kwq/qt/qfontmetrics.h: Removed charWidth and changed _width to take QChar *.
  • src/kwq/KWQFontMetrics.mm: Removed charWidth and changed _width to take QChar *.

Merged changes from KDE 3.0 final. Other fixes to get things compiling.

  • src/kdelibs/khtml/css/css_valueimpl.cpp: (CSSStyleDeclarationImpl::setProperty): Fix unused variable.
  • src/kdelibs/khtml/khtmlview.cpp: (KHTMLView::contentsContextMenuEvent): Fix unused variable.
  • src/kdelibs/khtml/rendering/font.cpp: (Font::drawText), (Font::width), (Font::update): Disable special "nbsp" logic for now. We can reenable it if necessary.
  • src/kdelibs/khtml/rendering/render_replaced.cpp: Fix mismerge.
  • src/kdelibs/khtml/rendering/render_text.cpp: (RenderText::nodeAtPoint): Fix unused variable.
  • src/kwq/KWQApplication.mm: (QDesktopWidget::width), (QApplication::desktop): Fix mismerge.
  • src/kwq/KWQColorGroup.mm: Fix QCOLOR_GROUP_SIZE.
  • src/kwq/KWQFontMetrics.mm: (QFontMetrics::lineSpacing): New. (QFontMetrics::width): Remove unused optimization.
  • src/kwq/qt/qfontmetrics.h: Add lineSpacing().

Merged changes from previous merge pass.

2002-03-25 Darin Adler <Darin Adler>

Last bit of making stuff compile and link. Probably will drop the merge now
and take it up again when it's time to merge in KDE 3.0 final.

  • src/kwq/KWQEvent.mm: (QFocusEvent::reason): New.
  • src/kwq/KWQPainter.mm: (QPainter::drawText): New overload.

2002-03-25 Darin Adler <Darin Adler>

  • src/kdelibs/khtml/rendering/font.cpp: (Font::width): Make it call _width so we don't lose the optimization.
  • src/kwq/KWQApplication.mm: (QDesktopWidget::screenNumber): New. (QDesktopWidget::screenGeometry): New. (QApplication::style): New.
  • src/kwq/KWQColorGroup.mm: (QColorGroup::highlight): New. (QColorGroup::highlightedText): New.
  • src/kwq/KWQFont.mm: (QFont::setPixelSize): New.
  • src/kwq/KWQFontMetrics.mm: (QFontMetrics::charWidth): New.
  • src/kwq/KWQKGlobal.mm: (KGlobal::locale): Implement. (KLocale::KLocale): New. (KLocale::languageList): New.
  • src/kwq/KWQKHTMLPart.mm: (KHTMLPart::sheetUsed): New. (KHTMLPart::setSheetUsed): New. (KHTMLPart::zoomFactor): New.
  • src/kwq/KWQKHTMLSettings.mm: (KHTMLSettings::mediumFontSize): New.
  • src/kwq/KWQScrollView.mm: (QScrollView::childX): New. (QScrollView::childY): New.
  • src/kwq/qt/qapplication.h: style() returns a QStyle &.
  • src/kwq/qt/qpalette.h: Add Highlight and HighlightedText.

2002-03-24 Darin Adler <Darin Adler>

More compiling. Still won't link.

  • src/kdelibs/khtml/khtmlview.cpp: Disable printing and drag and drop code.
  • src/kdelibs/khtml/rendering/render_text.cpp: (TextSlave::printDecoration): Temporarily turn off our smarter underlining since it relies on access to the string, and TextSlave doesn't have that any more. (RenderText::nodeAtPoint): Get rid of a workaround we don't need any more for a bug that was fixed by KDE folks.
  • src/kwq/KWQApplication.mm: (QApplication::desktop): Make the desktop be a QDesktopWidget.
  • src/kwq/qt/qnamespace.h: Add MetaButton.
  • src/kwq/qt/qtooltip.h: Add a maybeTip virtual function member and a virtual destructor.

2002-03-24 Darin Adler <Darin Adler>

Some fixes to get more stuff to compile.

  • src/kdelibs/khtml/ecma/kjs_dom.cpp: (DOMDocument::getValueProperty): Don't try to look at the private m_bComplete to display "complete". Just do "loading" and "loaded".
  • src/kdelibs/khtml/khtmlpart_p.h: #ifdef this all out for APPLE_CHANGES.
  • src/kdelibs/khtml/rendering/font.cpp: (Font::update): Add an explicit cast to int to avoid float -> int warning.
  • src/kdelibs/khtml/rendering/render_table.cpp: (RenderTable::calcColMinMax): Add an explicit cast to int to avoid uint compared with int warning.
  • src/kdelibs/khtml/xml/dom_docimpl.cpp: (DocumentImpl::recalcStyleSelector): Use sheetUsed() and setSheetUsed() functions on KHTMLPart instead of getting at private fields the way the real KDE code does.
  • src/kwq/KWQKHTMLPart.h: Declare zoomFactor(), sheetUsed(), and setSheetUsed().
  • src/kwq/KWQStyle.h: Add PM_DefaultFramWidth as another metric.
  • src/kwq/kdecore/klocale.h: Add languageList().
  • src/kwq/khtml/khtml_settings.h: Add mediumFontSize().
  • src/kwq/qt/qapplication.h: Add style() and QDesktopWidget.
  • src/kwq/qt/qevent.h: Add reason().
  • src/kwq/qt/qfont.h: Add setPixelSize(int).
  • src/kwq/qt/qfontmetrics.h: Add charWidth() and _charWidth() functions.
  • src/kwq/qt/qpainter.h: Add drawText() overload with position parameter.
  • src/kwq/qt/qpalette.h: Add highlight() and highlightedText().
  • src/kwq/qt/qscrollview.h: Add childX() and childY().
  • src/kwq/KWQApplication.mm: Change KWQDesktopWidget to QDesktopWidget.

WebKit:

  • WebView.subproj/IFPreferences.h:
  • WebView.subproj/IFPreferences.mm: (+[IFPreferences load]): Remove the old WebKitFontSizes preference. (-[IFPreferences mediumFontSize]), (-[IFPreferences setMediumFontSize:]): New.
  • WebView.subproj/IFWebView.mm: (-[IFWebView reapplyStyles]): Call updateStyleSelector() instead of recalcStyle().

Merged changes from previous merge branch.

2002-03-25 Darin Adler <Darin Adler>

  • WebView.subproj/IFPreferences.mm: (+[IFPreferences load]): Add WebKitMediumFontSizePreferenceKey.

WebBrowser:

  • Preferences.subproj/TextPreferences.m: (-[TextPreferences defaultFontSize]), (-[TextPreferences setDefaultFontSize:]): Just get and set the new mediumFontSize preference rather than doing the whole fontSizes preference dance.
  • Property svn:eol-style set to native
  • Property svn:keywords set to Author Date Id Revision
File size: 18.9 KB
// -*- c-basic-offset: 2 -*-
/*
 * This file is part of the KDE libraries
 * Copyright (C) 1999-2000 Harri Porten ([email protected])
 *
 * This library is free software; you can redistribute it and/or
 * modify it under the terms of the GNU Library General Public
 * License as published by the Free Software Foundation; either
 * version 2 of the License, or (at your option) any later version.
 *
 * This library is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
 * Library General Public License for more details.
 *
 * You should have received a copy of the GNU Library General Public License
 * along with this library; see the file COPYING.LIB. If not, write to
 * the Free Software Foundation, Inc., 59 Temple Place - Suite 330,
 * Boston, MA 02111-1307, USA.
 *
 */

#ifdef HAVE_CONFIG_H
#include <config.h>
#endif

#include <ctype.h>
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <assert.h>

#include "value.h"
#include "object.h"
#include "types.h"
#include "interpreter.h"
#include "nodes.h"
#include "lexer.h"
#include "ustring.h"
#include "lookup.h"
#include "internal.h"

// we can't specify the namespace in yacc's C output, so do it here
using namespace KJS;

static Lexer *currLexer = 0;

#ifndef KDE_USE_FINAL
#include "grammar.h"
#endif

#include "lexer.lut.h"

extern YYLTYPE yylloc; // global bison variable holding token info

// a bridge for yacc from the C world to C++
int kjsyylex()
{
  return Lexer::curr()->lex();
}

Lexer::Lexer()
  : yylineno(1),
    size8(128), size16(128), restrKeyword(false),
    eatNextIdentifier(false), stackToken(-1), lastToken(-1), pos(0),
    code(0), length(0),
#ifndef KJS_PURE_ECMA
    bol(true),
#endif
    current(0), next1(0), next2(0), next3(0)
{
  // allocate space for read buffers
  buffer8 = new char[size8];
  buffer16 = new UChar[size16];
  currLexer = this;

}

Lexer::~Lexer()
{
  delete [] buffer8;
  delete [] buffer16;
}

Lexer *Lexer::curr()
{
  if (!currLexer) {
    // create singleton instance
    currLexer = new Lexer();
  }
  return currLexer;
}

void Lexer::setCode(const UChar *c, unsigned int len)
{
  yylineno = 1;
  restrKeyword = false;
  delimited = false;
  eatNextIdentifier = false;
  stackToken = -1;
  lastToken = -1;
  pos = 0;
  code = c;
  length = len;
  skipLF = false;
  skipCR = false;
#ifndef KJS_PURE_ECMA
  bol = true;
#endif

  // read first characters
  current = (length > 0) ? code[0].unicode() : 0;
  next1 = (length > 1) ? code[1].unicode() : 0;
  next2 = (length > 2) ? code[2].unicode() : 0;
  next3 = (length > 3) ? code[3].unicode() : 0;
}

void Lexer::shift(unsigned int p)
{
  while (p--) {
    pos++;
    current = next1;
    next1 = next2;
    next2 = next3;
    next3 = (pos + 3 < length) ? code[pos+3].unicode() : 0;
  }
}

// called on each new line
void Lexer::nextLine()
{
  yylineno++;
#ifndef KJS_PURE_ECMA
  bol = true;
#endif
}

void Lexer::setDone(State s)
{
  state = s;
  done = true;
}

int Lexer::lex()
{
  int token = 0;
  state = Start;
  unsigned short stringType = 0; // either single or double quotes
  pos8 = pos16 = 0;
  done = false;
  terminator = false;
  skipLF = false;
  skipCR = false;

  // did we push a token on the stack previously ?
  // (after an automatic semicolon insertion)
  if (stackToken >= 0) {
    setDone(Other);
    token = stackToken;
    stackToken = 0;
  }

  while (!done) {
    if (skipLF && current != '\n') // found \r but not \n afterwards
      skipLF = false;
    if (skipCR && current != '\r') // found \n but not \r afterwards
      skipCR = false;
    if (skipLF || skipCR) // found \r\n or \n\r -> eat the second one
    {
      skipLF = false;
      skipCR = false;
      shift(1);
    }
    switch (state) {
    case Start:
      if (isWhiteSpace()) {
        // do nothing
      } else if (current == '/' && next1 == '/') {
        shift(1);
        state = InSingleLineComment;
      } else if (current == '/' && next1 == '*') {
        shift(1);
        state = InMultiLineComment;
      } else if (current == 0) {
        if (!terminator && !delimited) {
          // automatic semicolon insertion if program incomplete
          token = ';';
          stackToken = 0;
          setDone(Other);
        } else
          setDone(Eof);
      } else if (isLineTerminator()) {
        nextLine();
        terminator = true;
        if (restrKeyword) {
          token = ';';
          setDone(Other);
        }
      } else if (current == '"' || current == '\'') {
        state = InString;
        stringType = current;
      } else if (isIdentLetter(current)) {
        record16(current);
        state = InIdentifier;
      } else if (current == '0') {
        record8(current);
        state = InNum0;
      } else if (isDecimalDigit(current)) {
        record8(current);
        state = InNum;
      } else if (current == '.' && isDecimalDigit(next1)) {
        record8(current);
        state = InDecimal;
#ifndef KJS_PURE_ECMA
        // <!-- marks the beginning of a line comment (for www usage)
      } else if (current == '<' && next1 == '!' &&
                 next2 == '-' && next3 == '-') {
        shift(3);
        state = InSingleLineComment;
        // same for -->
      } else if (bol && current == '-' && next1 == '-' && next2 == '>') {
        shift(2);
        state = InSingleLineComment;
#endif
      } else {
        token = matchPunctuator(current, next1, next2, next3);
        if (token != -1) {
          setDone(Other);
        } else {
          // cerr << "encountered unknown character" << endl;
          setDone(Bad);
        }
      }
      break;
    case InString:
      if (current == stringType) {
        shift(1);
        setDone(String);
      } else if (current == 0 || isLineTerminator()) {
        setDone(Bad);
      } else if (current == '\\') {
        state = InEscapeSequence;
      } else {
        record16(current);
      }
      break;
    // Escape Sequences inside of strings
    case InEscapeSequence:
      if (isOctalDigit(current)) {
        if (current >= '0' && current <= '3' &&
            isOctalDigit(next1) && isOctalDigit(next2)) {
          record16(convertOctal(current, next1, next2));
          shift(2);
          state = InString;
        } else if (isOctalDigit(current) && isOctalDigit(next1)) {
          record16(convertOctal('0', current, next1));
          shift(1);
          state = InString;
        } else if (isOctalDigit(current)) {
          record16(convertOctal('0', '0', current));
          state = InString;
        } else {
          setDone(Bad);
        }
      } else if (current == 'x')
        state = InHexEscape;
      else if (current == 'u')
        state = InUnicodeEscape;
      else {
        record16(singleEscape(current));
        state = InString;
      }
      break;
    case InHexEscape:
      if (isHexDigit(current) && isHexDigit(next1)) {
        state = InString;
        record16(convertHex(current, next1));
        shift(1);
      } else if (current == stringType) {
        record16('x');
        shift(1);
        setDone(String);
      } else {
        record16('x');
        record16(current);
        state = InString;
      }
      break;
    case InUnicodeEscape:
      if (isHexDigit(current) && isHexDigit(next1) &&
          isHexDigit(next2) && isHexDigit(next3)) {
        record16(convertUnicode(current, next1, next2, next3));
        shift(3);
        state = InString;
      } else if (current == stringType) {
        record16('u');
        shift(1);
        setDone(String);
      } else {
        setDone(Bad);
      }
      break;
    case InSingleLineComment:
      if (isLineTerminator()) {
        nextLine();
        terminator = true;
        if (restrKeyword) {
          token = ';';
          setDone(Other);
        } else
          state = Start;
      } else if (current == 0) {
        setDone(Eof);
      }
      break;
    case InMultiLineComment:
      if (current == 0) {
        setDone(Bad);
      } else if (isLineTerminator()) {
        nextLine();
      } else if (current == '*' && next1 == '/') {
        state = Start;
        shift(1);
      }
      break;
    case InIdentifier:
      if (isIdentLetter(current) || isDecimalDigit(current)) {
        record16(current);
        break;
      }
      setDone(Identifier);
      break;
    case InNum0:
      if (current == 'x' || current == 'X') {
        record8(current);
        state = InHex;
      } else if (current == '.') {
        record8(current);
        state = InDecimal;
      } else if (current == 'e' || current == 'E') {
        record8(current);
        state = InExponentIndicator;
      } else if (isOctalDigit(current)) {
        record8(current);
        state = InOctal;
      } else if (isDecimalDigit(current)) {
        record8(current);
        state = InDecimal;
      } else {
        setDone(Number);
      }
      break;
    case InHex:
      if (isHexDigit(current)) {
        record8(current);
      } else {
        setDone(Hex);
      }
      break;
    case InOctal:
      if (isOctalDigit(current)) {
        record8(current);
      }
      else if (isDecimalDigit(current)) {
        record8(current);
        state = InDecimal;
      } else
        setDone(Octal);
      break;
    case InNum:
      if (isDecimalDigit(current)) {
        record8(current);
      } else if (current == '.') {
        record8(current);
        state = InDecimal;
      } else if (current == 'e' || current == 'E') {
        record8(current);
        state = InExponentIndicator;
      } else
        setDone(Number);
      break;
    case InDecimal:
      if (isDecimalDigit(current)) {
        record8(current);
      } else if (current == 'e' || current == 'E') {
        record8(current);
        state = InExponentIndicator;
      } else
        setDone(Number);
      break;
    case InExponentIndicator:
      if (current == '+' || current == '-') {
        record8(current);
      } else if (isDecimalDigit(current)) {
        record8(current);
        state = InExponent;
      } else
        setDone(Bad);
      break;
    case InExponent:
      if (isDecimalDigit(current)) {
        record8(current);
      } else
        setDone(Number);
      break;
    default:
      assert(!"Unhandled state in switch statement");
    }

    // move on to the next character
    if (!done)
      shift(1);
#ifndef KJS_PURE_ECMA
    if (state != Start && state != InSingleLineComment)
      bol = false;
#endif
  }

  // no identifiers allowed directly after numeric literal, e.g. "3in" is bad
  if ((state == Number || state == Octal || state == Hex)
      && isIdentLetter(current))
    state = Bad;

  // terminate string
  buffer8[pos8] = '\0';

#ifdef KJS_DEBUG_LEX
  fprintf(stderr, "line: %d ", lineNo());
  fprintf(stderr, "yytext (%x): ", buffer8[0]);
  fprintf(stderr, "%s ", buffer8);
#endif

  double dval = 0;
  if (state == Number) {
    dval = strtod(buffer8, 0L);
  } else if (state == Hex) { // scan hex numbers
    // TODO: support long unsigned int
    unsigned int i = 0; // stays 0 if sscanf() converts nothing
    sscanf(buffer8, "%x", &i);
    dval = i;
    state = Number;
  } else if (state == Octal) { // scan octal number
    unsigned int ui = 0; // stays 0 if sscanf() converts nothing
    sscanf(buffer8, "%o", &ui);
    dval = ui;
    state = Number;
  }

#ifdef KJS_DEBUG_LEX
  switch (state) {
  case Eof:
    printf("(EOF)\n");
    break;
  case Other:
    printf("(Other)\n");
    break;
  case Identifier:
    printf("(Identifier)/(Keyword)\n");
    break;
  case String:
    printf("(String)\n");
    break;
  case Number:
    printf("(Number)\n");
    break;
  default:
    printf("(unknown)");
  }
#endif

  if (state != Identifier && eatNextIdentifier)
    eatNextIdentifier = false;

  restrKeyword = false;
  delimited = false;
  yylloc.first_line = yylineno; // ???
  yylloc.last_line = yylineno;

  switch (state) {
  case Eof:
    token = 0;
    break;
  case Other:
    if(token == '}' || token == ';') {
      delimited = true;
    }
    break;
  case Identifier:
    if ((token = Lookup::find(&mainTable, buffer16, pos16)) < 0) {
      // Lookup for keyword failed, means this is an identifier
      // Apply anonymous-function hack below (eat the identifier)
      if (eatNextIdentifier) {
        eatNextIdentifier = false;
        UString debugstr(buffer16, pos16);
        fprintf(stderr,"Anonymous function hack: eating identifier %s\n",debugstr.ascii());
        token = lex();
        break;
      }
      /* TODO: close leak on parse error. same holds true for String */
      kjsyylval.ustr = new UString(buffer16, pos16);
      token = IDENT;
      break;
    }

    eatNextIdentifier = false;
    // Hack for "f = function somename() { ... }", too hard to get into the grammar
    if (token == FUNCTION && lastToken == '=' )
      eatNextIdentifier = true;

    if (token == CONTINUE || token == BREAK ||
        token == RETURN || token == THROW)
      restrKeyword = true;
    break;
  case String:
    kjsyylval.ustr = new UString(buffer16, pos16);
    token = STRING;
    break;
  case Number:
    kjsyylval.dval = dval;
    token = NUMBER;
    break;
  case Bad:
    fprintf(stderr, "yylex: ERROR.\n");
    return -1;
  default:
    assert(!"unhandled enumeration value in switch");
    return -1;
  }
  lastToken = token;
  return token;
}

bool Lexer::isWhiteSpace() const
{
  return (current == ' ' || current == '\t' ||
          current == 0x0b || current == 0x0c);
}

bool Lexer::isLineTerminator()
{
  bool cr = (current == '\r');
  bool lf = (current == '\n');
  if (cr)
    skipLF = true;
  else if (lf)
    skipCR = true;
  return cr || lf;
}

bool Lexer::isIdentLetter(unsigned short c)
{
  /* TODO: allow other legitimate unicode chars */
  return (c >= 'a' && c <= 'z' ||
          c >= 'A' && c <= 'Z' ||
          c == '$' || c == '_');
}

bool Lexer::isDecimalDigit(unsigned short c)
{
  return (c >= '0' && c <= '9');
}

bool Lexer::isHexDigit(unsigned short c) const
{
  return (c >= '0' && c <= '9' ||
          c >= 'a' && c <= 'f' ||
          c >= 'A' && c <= 'F');
}

bool Lexer::isOctalDigit(unsigned short c) const
{
  return (c >= '0' && c <= '7');
}

int Lexer::matchPunctuator(unsigned short c1, unsigned short c2,
                           unsigned short c3, unsigned short c4)
{
  if (c1 == '>' && c2 == '>' && c3 == '>' && c4 == '=') {
    shift(4);
    return URSHIFTEQUAL;
  } else if (c1 == '=' && c2 == '=' && c3 == '=') {
    shift(3);
    return STREQ;
  } else if (c1 == '!' && c2 == '=' && c3 == '=') {
    shift(3);
    return STRNEQ;
  } else if (c1 == '>' && c2 == '>' && c3 == '>') {
    shift(3);
    return URSHIFT;
  } else if (c1 == '<' && c2 == '<' && c3 == '=') {
    shift(3);
    return LSHIFTEQUAL;
  } else if (c1 == '>' && c2 == '>' && c3 == '=') {
    shift(3);
    return RSHIFTEQUAL;
  } else if (c1 == '<' && c2 == '=') {
    shift(2);
    return LE;
  } else if (c1 == '>' && c2 == '=') {
    shift(2);
    return GE;
  } else if (c1 == '!' && c2 == '=') {
    shift(2);
    return NE;
  } else if (c1 == '+' && c2 == '+') {
    shift(2);
    if (terminator)
      return AUTOPLUSPLUS;
    else
      return PLUSPLUS;
  } else if (c1 == '-' && c2 == '-') {
    shift(2);
    if (terminator)
      return AUTOMINUSMINUS;
    else
      return MINUSMINUS;
  } else if (c1 == '=' && c2 == '=') {
    shift(2);
    return EQEQ;
  } else if (c1 == '+' && c2 == '=') {
    shift(2);
    return PLUSEQUAL;
  } else if (c1 == '-' && c2 == '=') {
    shift(2);
    return MINUSEQUAL;
  } else if (c1 == '*' && c2 == '=') {
    shift(2);
    return MULTEQUAL;
  } else if (c1 == '/' && c2 == '=') {
    shift(2);
    return DIVEQUAL;
  } else if (c1 == '&' && c2 == '=') {
    shift(2);
    return ANDEQUAL;
  } else if (c1 == '^' && c2 == '=') {
    shift(2);
    return XOREQUAL;
  } else if (c1 == '%' && c2 == '=') {
    shift(2);
    return MODEQUAL;
  } else if (c1 == '|' && c2 == '=') {
    shift(2);
    return OREQUAL;
  } else if (c1 == '<' && c2 == '<') {
    shift(2);
    return LSHIFT;
  } else if (c1 == '>' && c2 == '>') {
    shift(2);
    return RSHIFT;
  } else if (c1 == '&' && c2 == '&') {
    shift(2);
    return AND;
  } else if (c1 == '|' && c2 == '|') {
    shift(2);
    return OR;
  }

  switch(c1) {
  case '=':
  case '>':
  case '<':
  case ',':
  case '!':
  case '~':
  case '?':
  case ':':
  case '.':
  case '+':
  case '-':
  case '*':
  case '/':
  case '&':
  case '|':
  case '^':
  case '%':
  case '(':
  case ')':
  case '{':
  case '}':
  case '[':
  case ']':
  case ';':
    shift(1);
    return static_cast<int>(c1);
  default:
    return -1;
  }
}

unsigned short Lexer::singleEscape(unsigned short c) const
{
  switch(c) {
  case 'b':
    return 0x08;
  case 't':
    return 0x09;
  case 'n':
    return 0x0A;
  case 'v':
    return 0x0B;
  case 'f':
    return 0x0C;
  case 'r':
    return 0x0D;
  case '"':
    return 0x22;
  case '\'':
    return 0x27;
  case '\\':
    return 0x5C;
  default:
    return c;
  }
}

unsigned short Lexer::convertOctal(unsigned short c1, unsigned short c2,
                                   unsigned short c3) const
{
  return ((c1 - '0') * 64 + (c2 - '0') * 8 + c3 - '0');
}

unsigned char Lexer::convertHex(unsigned short c)
{
  if (c >= '0' && c <= '9')
    return (c - '0');
  else if (c >= 'a' && c <= 'f')
    return (c - 'a' + 10);
  else
    return (c - 'A' + 10);
}

unsigned char Lexer::convertHex(unsigned short c1, unsigned short c2)
{
  return ((convertHex(c1) << 4) + convertHex(c2));
}

UChar Lexer::convertUnicode(unsigned short c1, unsigned short c2,
                            unsigned short c3, unsigned short c4)
{
  return UChar((convertHex(c1) << 4) + convertHex(c2),
               (convertHex(c3) << 4) + convertHex(c4));
}

void Lexer::record8(unsigned short c)
{
  assert(c <= 0xff);

  // enlarge buffer if full
  if (pos8 >= size8 - 1) {
    char *tmp = new char[2 * size8];
    memcpy(tmp, buffer8, size8 * sizeof(char));
    delete [] buffer8;
    buffer8 = tmp;
    size8 *= 2;
  }

  buffer8[pos8++] = (char) c;
}

void Lexer::record16(UChar c)
{
  // enlarge buffer if full
  if (pos16 >= size16 - 1) {
    UChar *tmp = new UChar[2 * size16];
    memcpy(tmp, buffer16, size16 * sizeof(UChar));
    delete [] buffer16;
    buffer16 = tmp;
    size16 *= 2;
  }

  buffer16[pos16++] = c;
}

bool Lexer::scanRegExp()
{
  pos16 = 0;
  bool lastWasEscape = false;
  bool inBrackets = false;

  while (1) {
    if (isLineTerminator() || current == 0)
      return false;
    else if (current != '/' || lastWasEscape == true || inBrackets == true)
    {
      // keep track of '[' and ']'
      if ( !lastWasEscape ) {
        if ( current == '[' && !inBrackets )
          inBrackets = true;
        if ( current == ']' && inBrackets )
          inBrackets = false;
      }
      record16(current);
      lastWasEscape =
        !lastWasEscape && (current == '\\');
    }
    else { // end of regexp
      pattern = UString(buffer16, pos16);
      pos16 = 0;
      shift(1);
      break;
    }
    shift(1);
  }

  while (isIdentLetter(current)) {
    record16(current);
    shift(1);
  }
  flags = UString(buffer16, pos16);

  return true;
}
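
The lexer in this listing does not index into the source buffer while matching tokens. It keeps a four-character window (`current`, `next1`, `next2`, `next3`) that `Lexer::shift()` slides forward, padding with 0 once the input runs out. The block below is a minimal standalone sketch of that idea and is not code from KJS: the `Window` struct and its `at()` helper are hypothetical names introduced for illustration, and `std::u16string` stands in for the `UChar` buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for the lexer's lookahead state (not part of KJS).
struct Window {
  std::u16string code;   // owned copy of the input; the real lexer keeps a pointer
  std::size_t pos;       // index of `current` within `code`
  unsigned short current, next1, next2, next3;

  explicit Window(const std::u16string &c) : code(c), pos(0) {
    // Prime the window, as setCode() does for the first four characters.
    current = at(0);
    next1 = at(1);
    next2 = at(2);
    next3 = at(3);
  }

  // Returns the code unit at index i, or 0 past the end of input.
  unsigned short at(std::size_t i) const {
    return i < code.size() ? static_cast<unsigned short>(code[i]) : 0;
  }

  // Slide the window forward by p characters, one step at a time.
  void shift(unsigned int p) {
    while (p--) {
      pos++;
      current = next1;
      next1 = next2;
      next2 = next3;
      next3 = at(pos + 3);
    }
  }
};
```

As a usage example, constructing `Window w(u"a>>>=b")` and calling `w.shift(1)` leaves `>`, `>`, `>`, `=` in the window, which is the state in which `matchPunctuator()` would report `URSHIFTEQUAL` and consume all four characters.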
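
The escape-sequence helpers in the listing (`convertOctal`, `convertHex`, `convertUnicode`) turn digit characters into a single code unit by plain positional arithmetic. The sketch below restates that arithmetic as free functions so it can be checked in isolation. The function names are hypothetical, and it assumes that the two-argument `UChar` constructor used by `convertUnicode` takes the high byte first and the low byte second.

```cpp
#include <cassert>

// Value of one hex digit; the caller is expected to have checked that c is
// a hex digit, as the lexer does with isHexDigit() before converting.
unsigned hexValue(unsigned short c) {
  if (c >= '0' && c <= '9')
    return c - '0';
  if (c >= 'a' && c <= 'f')
    return c - 'a' + 10;
  return c - 'A' + 10;
}

// "\ooo": three octal digits, weights 64, 8 and 1.
unsigned short octalEscape(unsigned short c1, unsigned short c2,
                           unsigned short c3) {
  return static_cast<unsigned short>((c1 - '0') * 64 + (c2 - '0') * 8 + (c3 - '0'));
}

// "\xHH": two hex digits forming one byte.
unsigned short hexEscape(unsigned short c1, unsigned short c2) {
  return static_cast<unsigned short>((hexValue(c1) << 4) + hexValue(c2));
}

// "\uHHHH": two bytes combined, high byte first (assumed UChar(high, low) order).
unsigned short unicodeEscape(unsigned short c1, unsigned short c2,
                             unsigned short c3, unsigned short c4) {
  return static_cast<unsigned short>((hexEscape(c1, c2) << 8) | hexEscape(c3, c4));
}
```

Under these assumptions, `octalEscape('1', '0', '1')` and `hexEscape('4', '1')` both yield 65, the code for `A`, and `unicodeEscape('2', '0', 'A', 'C')` yields 0x20AC.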