source: webkit/trunk/JavaScriptCore/kjs/lexer.cpp@ 34273

Last change on this file since 34273 was 34273, checked in by [email protected], 17 years ago

Made the starting line number of scripts be 1-based throughout the engine.

JavaScriptCore:

2008-05-30 Timothy Hatcher <[email protected]>

Made the starting line number of scripts be 1-based throughout the engine.
This cleans up script line numbers so they are all consistent now and fixes
some cases where script execution was shown as off by one line in the debugger.

No change in SunSpider.

Reviewed by Oliver Hunt.

  • API/minidom.c: (main): Pass a line number of 1 instead of 0 to parser().parse().
  • API/testapi.c: (main): Ditto. Also removed a FIXME and changed an assertEqualsAsNumber to use 1 instead of 2 for the line number.
  • VM/Machine.cpp: (KJS::callEval): Pass a line number of 1 instead of 0. (KJS::Machine::debug): Use firstLine for WillExecuteProgram instead of lastLine. Use lastLine for DidExecuteProgram instead of firstLine.
  • kjs/DebuggerCallFrame.cpp: (KJS::DebuggerCallFrame::evaluate): Pass a line number of 1 instead of 0 to parser().parse().
  • kjs/Parser.cpp: (KJS::Parser::parse): ASSERT startingLineNumber is greater than 0. Change the startingLineNumber to be 1 if it was less than or equal to 0. This is needed for release builds to maintain compatibility with the JavaScriptCore API.
  • kjs/function.cpp: (KJS::globalFuncEval): Pass a line number of 1 instead of 0 to parser().parse().
  • kjs/function_object.cpp: (FunctionObjectImp::construct): Pass a line number of 1 instead of 0 to construct().
  • kjs/lexer.cpp: (Lexer::setCode): Made yylineno = startingLineNumber instead of adding 1.
  • kjs/testkjs.cpp: (functionRun): Pass a line number of 1 instead of 0 to Interpreter::evaluate(). (functionLoad): Ditto. (prettyPrintScript): Ditto. (runWithScripts): Ditto.
  • profiler/Profiler.cpp: (WebCore::createCallIdentifier): Removed a plus 1 of startingLineNumber.
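
The 1-based convention described in the entries above can be sketched roughly as follows. This is a minimal illustration, not the actual code: `normalizeStartingLineNumber` and `LineCounter` are hypothetical names that do not appear in the engine, and the clamp only mirrors the release-build behaviour described for KJS::Parser::parse.

```cpp
#include <cassert>

// Hypothetical helper (not the real KJS::Parser::parse signature):
// callers that still pass 0 or a negative number get line 1.
static int normalizeStartingLineNumber(int startingLineNumber)
{
    return startingLineNumber <= 0 ? 1 : startingLineNumber;
}

// Hypothetical stand-in for the lexer's line counter. The counter starts
// at the starting line number itself, with no extra "+ 1" adjustment.
struct LineCounter {
    int line;

    explicit LineCounter(int startingLineNumber)
        : line(normalizeStartingLineNumber(startingLineNumber))
    {
        assert(line >= 1); // line numbers are 1-based from here on
    }

    void nextLine() { ++line; }
};
```

With this shape, a script handed in at line 0 and a script handed in at line 1 both report their first statement on line 1.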

WebCore:

2008-05-30 Timothy Hatcher <[email protected]>

Made the starting line number of scripts be 1-based throughout the engine.
This cleans up script line numbers so they are all consistent now and fixes
some cases where script execution was shown as off by one line in the debugger.

Doing this also exposed a bug where a JSLazyEventListener created in XHTML or SVG
documents would always have a line number of 0. This change fixes that bug
so that all the SVG and XHTML tests pass.

All layout tests pass.

Reviewed by Oliver Hunt.

  • bindings/js/kjs_events.cpp: (WebCore::JSLazyEventListener::JSLazyEventListener): Set the line number to 1 if it was passed in as 0. This can happen when listeners are created with a setAttribute call from JavaScript. (WebCore::JSLazyEventListener::parseCode): Add a FIXME about the URL being incorrect when listeners are created with a setAttribute call from JavaScript.
  • bindings/js/kjs_events.h: Remove the default value for lineNumber, since no callers need it.
  • bindings/objc/WebScriptObject.mm: (-[WebScriptObject evaluateWebScript:]): Pass a line number of 1 instead of 0 to Interpreter::evaluate().
  • bridge/NP_jsobject.cpp: (_NPN_Evaluate): Ditto.
  • bridge/jni/jni_jsobject.mm: (JavaJSObject::eval): Ditto.
  • dom/XMLTokenizer.cpp: (WebCore::XMLTokenizer::startElementNs): Call KJSProxy::setEventHandlerLineno() around the call to handleElementAttributes, so any JSLazyEventListener created from those attributes has a line number. (WebCore::XMLTokenizer::endElementNs): Remove a minus 1 of the line number. (WebCore::XMLTokenizer::notifyFinished): Pass a line number of 1 instead of 0. (WebCore::XMLTokenizer::parseEndElement): Remove a minus 1 of the line number.
  • html/HTMLScriptElement.cpp: (WebCore::HTMLScriptElement::evaluateScript): Add a FIXME about the starting line number being incorrect in some cases when this function is called.
  • html/HTMLTokenizer.cpp: (WebCore::HTMLTokenizer::parseSpecial): Add a plus 1 to the line number when setting scriptStartLineno so it is 1-based. Same for calling setEventHandlerLineno(). (WebCore::HTMLTokenizer::processToken): Ditto.
  • html/HTMLTokenizer.h: Change the default line number on scriptExecution() to 1 from 0.
  • loader/FrameLoader.cpp: (FrameLoader::executeIfJavaScriptURL): Pass a line number of 1 instead of 0 to executeScript().
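
The two WebCore-side adjustments listed above (adding 1 to a tokenizer line and defaulting a missing listener line to 1) might look roughly like this. Both function names are hypothetical, and the sketch assumes the tokenizer counts lines from 0 internally, which the "plus 1" wording suggests but the log does not state outright.

```cpp
#include <cassert>

// Hypothetical: converts a tokenizer line (assumed 0-based) to the
// 1-based starting line handed to the script engine.
static int scriptStartLineFromTokenizerLine(int zeroBasedTokenizerLine)
{
    assert(zeroBasedTokenizerLine >= 0);
    return zeroBasedTokenizerLine + 1;
}

// Hypothetical: listeners created through setAttribute from JavaScript
// arrive with a line number of 0, so fall back to line 1.
static int listenerLineNumber(int lineNumber)
{
    return lineNumber == 0 ? 1 : lineNumber;
}
```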

WebKitTools:

2008-05-30 Timothy Hatcher <[email protected]>

Made the starting line number of scripts be 1-based throughout the engine.
This cleans up script line numbers so they are all consistent now.

Reviewed by Oliver Hunt.

  • DumpRenderTree/mac/ObjCController.m: (runJavaScriptThread): Pass a line number of 1 instead of 0 to JSEvaluateScript.
  • DumpRenderTree/pthreads/JavaScriptThreadingPthreads.cpp: (runJavaScriptThread): Ditto.
  • DumpRenderTree/win/DumpRenderTree.cpp: (runJavaScriptThread): Ditto.
  • Property svn:eol-style set to native
File size: 22.3 KB
/*
 *  Copyright (C) 1999-2000 Harri Porten ([email protected])
 *  Copyright (C) 2006, 2007, 2008 Apple Inc. All Rights Reserved.
 *  Copyright (C) 2007 Cameron Zwarich ([email protected])
 *
 *  This library is free software; you can redistribute it and/or
 *  modify it under the terms of the GNU Library General Public
 *  License as published by the Free Software Foundation; either
 *  version 2 of the License, or (at your option) any later version.
 *
 *  This library is distributed in the hope that it will be useful,
 *  but WITHOUT ANY WARRANTY; without even the implied warranty of
 *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
 *  Library General Public License for more details.
 *
 *  You should have received a copy of the GNU Library General Public License
 *  along with this library; see the file COPYING.LIB.  If not, write to
 *  the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor,
 *  Boston, MA 02110-1301, USA.
 *
 */

#include "config.h"
#include "lexer.h"

#include "dtoa.h"
#include "function.h"
#include "nodes.h"
#include "NodeInfo.h"
#include <ctype.h>
#include <limits.h>
#include <string.h>
#include <wtf/Assertions.h>
#include <wtf/unicode/Unicode.h>

#if USE(MULTIPLE_THREADS)
#include <wtf/ThreadSpecific.h>
#endif

using namespace WTF;
using namespace Unicode;

// we can't specify the namespace in yacc's C output, so do it here
using namespace KJS;

#ifndef KDE_USE_FINAL
#include "grammar.h"
#endif

#include "lookup.h"
#include "lexer.lut.h"

// a bridge for yacc from the C world to C++
int kjsyylex(void* lvalp, void* llocp, void* lexer)
{
    return static_cast<Lexer*>(lexer)->lex(lvalp, llocp);
}

namespace KJS {

static bool isDecimalDigit(int);

static const size_t initialReadBufferCapacity = 32;
static const size_t initialStringTableCapacity = 64;

Lexer& lexer()
{
#if USE(MULTIPLE_THREADS)
    static ThreadSpecific<Lexer> staticLexer;
    return *staticLexer;
#else
    static Lexer staticLexer;
    return staticLexer;
#endif
}

Lexer::Lexer()
    : yylineno(1)
    , restrKeyword(false)
    , eatNextIdentifier(false)
    , stackToken(-1)
    , lastToken(-1)
    , pos(0)
    , code(0)
    , length(0)
    , atLineStart(true)
    , current(0)
    , next1(0)
    , next2(0)
    , next3(0)
    , mainTable(KJS::mainTable)
{
    m_buffer8.reserveCapacity(initialReadBufferCapacity);
    m_buffer16.reserveCapacity(initialReadBufferCapacity);
    m_strings.reserveCapacity(initialStringTableCapacity);
    m_identifiers.reserveCapacity(initialStringTableCapacity);
}

Lexer::~Lexer()
{
    delete[] mainTable.table;
}

void Lexer::setCode(int startingLineNumber, PassRefPtr<SourceProvider> source)
{
    yylineno = startingLineNumber;
    restrKeyword = false;
    delimited = false;
    eatNextIdentifier = false;
    stackToken = -1;
    lastToken = -1;

    pos = 0;
    m_source = source;
    code = m_source->data();
    length = m_source->length();
    skipLF = false;
    skipCR = false;
    error = false;
    atLineStart = true;

    // read first characters
    shift(4);
}

void Lexer::shift(unsigned p)
{
    // ECMA-262 calls for stripping Cf characters here, but we only do this for BOM,
    // see <https://bugs.webkit.org/show_bug.cgi?id=4931>.

    while (p--) {
        current = next1;
        next1 = next2;
        next2 = next3;
        do {
            if (pos >= length) {
                pos++;
                next3 = -1;
                break;
            }
            next3 = code[pos++];
        } while (next3 == 0xFEFF);
    }
}

// called on each new line
void Lexer::nextLine()
{
    yylineno++;
    atLineStart = true;
}

void Lexer::setDone(State s)
{
    state = s;
    done = true;
}

int Lexer::lex(void* p1, void* p2)
{
    YYSTYPE* lvalp = static_cast<YYSTYPE*>(p1);
    YYLTYPE* llocp = static_cast<YYLTYPE*>(p2);
    int token = 0;
    state = Start;
    unsigned short stringType = 0; // either single or double quotes
    m_buffer8.clear();
    m_buffer16.clear();
    done = false;
    terminator = false;
    skipLF = false;
    skipCR = false;

    // did we push a token on the stack previously ?
    // (after an automatic semicolon insertion)
    if (stackToken >= 0) {
        setDone(Other);
        token = stackToken;
        stackToken = 0;
    }

    while (!done) {
        if (skipLF && current != '\n') // found \r but not \n afterwards
            skipLF = false;
        if (skipCR && current != '\r') // found \n but not \r afterwards
            skipCR = false;
        if (skipLF || skipCR) { // found \r\n or \n\r -> eat the second one
            skipLF = false;
            skipCR = false;
            shift(1);
        }
        switch (state) {
        case Start:
            if (isWhiteSpace()) {
                // do nothing
            } else if (current == '/' && next1 == '/') {
                shift(1);
                state = InSingleLineComment;
            } else if (current == '/' && next1 == '*') {
                shift(1);
                state = InMultiLineComment;
            } else if (current == -1) {
                if (!terminator && !delimited) {
                    // automatic semicolon insertion if program incomplete
                    token = ';';
                    stackToken = 0;
                    setDone(Other);
                } else
                    setDone(Eof);
            } else if (isLineTerminator()) {
                nextLine();
                terminator = true;
                if (restrKeyword) {
                    token = ';';
                    setDone(Other);
                }
            } else if (current == '"' || current == '\'') {
                state = InString;
                stringType = static_cast<unsigned short>(current);
            } else if (isIdentStart(current)) {
                record16(current);
                state = InIdentifierOrKeyword;
            } else if (current == '\\') {
                state = InIdentifierStartUnicodeEscapeStart;
            } else if (current == '0') {
                record8(current);
                state = InNum0;
            } else if (isDecimalDigit(current)) {
                record8(current);
                state = InNum;
            } else if (current == '.' && isDecimalDigit(next1)) {
                record8(current);
                state = InDecimal;
                // <!-- marks the beginning of a line comment (for www usage)
            } else if (current == '<' && next1 == '!' &&
                       next2 == '-' && next3 == '-') {
                shift(3);
                state = InSingleLineComment;
                // same for -->
            } else if (atLineStart && current == '-' && next1 == '-' && next2 == '>') {
                shift(2);
                state = InSingleLineComment;
            } else {
                token = matchPunctuator(lvalp->intValue, current, next1, next2, next3);
                if (token != -1) {
                    setDone(Other);
                } else {
                    // cerr << "encountered unknown character" << endl;
                    setDone(Bad);
                }
            }
            break;
        case InString:
            if (current == stringType) {
                shift(1);
                setDone(String);
            } else if (isLineTerminator() || current == -1) {
                setDone(Bad);
            } else if (current == '\\') {
                state = InEscapeSequence;
            } else {
                record16(current);
            }
            break;
        // Escape Sequences inside of strings
        case InEscapeSequence:
            if (isOctalDigit(current)) {
                if (current >= '0' && current <= '3' &&
                    isOctalDigit(next1) && isOctalDigit(next2)) {
                    record16(convertOctal(current, next1, next2));
                    shift(2);
                    state = InString;
                } else if (isOctalDigit(current) && isOctalDigit(next1)) {
                    record16(convertOctal('0', current, next1));
                    shift(1);
                    state = InString;
                } else if (isOctalDigit(current)) {
                    record16(convertOctal('0', '0', current));
                    state = InString;
                } else {
                    setDone(Bad);
                }
            } else if (current == 'x')
                state = InHexEscape;
            else if (current == 'u')
                state = InUnicodeEscape;
            else if (isLineTerminator()) {
                nextLine();
                state = InString;
            } else {
                record16(singleEscape(static_cast<unsigned short>(current)));
                state = InString;
            }
            break;
        case InHexEscape:
            if (isHexDigit(current) && isHexDigit(next1)) {
                state = InString;
                record16(convertHex(current, next1));
                shift(1);
            } else if (current == stringType) {
                record16('x');
                shift(1);
                setDone(String);
            } else {
                record16('x');
                record16(current);
                state = InString;
            }
            break;
        case InUnicodeEscape:
            if (isHexDigit(current) && isHexDigit(next1) && isHexDigit(next2) && isHexDigit(next3)) {
                record16(convertUnicode(current, next1, next2, next3));
                shift(3);
                state = InString;
            } else if (current == stringType) {
                record16('u');
                shift(1);
                setDone(String);
            } else {
                setDone(Bad);
            }
            break;
        case InSingleLineComment:
            if (isLineTerminator()) {
                nextLine();
                terminator = true;
                if (restrKeyword) {
                    token = ';';
                    setDone(Other);
                } else
                    state = Start;
            } else if (current == -1) {
                setDone(Eof);
            }
            break;
        case InMultiLineComment:
            if (current == -1) {
                setDone(Bad);
            } else if (isLineTerminator()) {
                nextLine();
            } else if (current == '*' && next1 == '/') {
                state = Start;
                shift(1);
            }
            break;
        case InIdentifierOrKeyword:
        case InIdentifier:
            if (isIdentPart(current))
                record16(current);
            else if (current == '\\')
                state = InIdentifierPartUnicodeEscapeStart;
            else
                setDone(state == InIdentifierOrKeyword ? IdentifierOrKeyword : Identifier);
            break;
        case InNum0:
            if (current == 'x' || current == 'X') {
                record8(current);
                state = InHex;
            } else if (current == '.') {
                record8(current);
                state = InDecimal;
            } else if (current == 'e' || current == 'E') {
                record8(current);
                state = InExponentIndicator;
            } else if (isOctalDigit(current)) {
                record8(current);
                state = InOctal;
            } else if (isDecimalDigit(current)) {
                record8(current);
                state = InDecimal;
            } else {
                setDone(Number);
            }
            break;
        case InHex:
            if (isHexDigit(current)) {
                record8(current);
            } else {
                setDone(Hex);
            }
            break;
        case InOctal:
            if (isOctalDigit(current)) {
                record8(current);
            } else if (isDecimalDigit(current)) {
                record8(current);
                state = InDecimal;
            } else
                setDone(Octal);
            break;
        case InNum:
            if (isDecimalDigit(current)) {
                record8(current);
            } else if (current == '.') {
                record8(current);
                state = InDecimal;
            } else if (current == 'e' || current == 'E') {
                record8(current);
                state = InExponentIndicator;
            } else
                setDone(Number);
            break;
        case InDecimal:
            if (isDecimalDigit(current)) {
                record8(current);
            } else if (current == 'e' || current == 'E') {
                record8(current);
                state = InExponentIndicator;
            } else
                setDone(Number);
            break;
        case InExponentIndicator:
            if (current == '+' || current == '-') {
                record8(current);
            } else if (isDecimalDigit(current)) {
                record8(current);
                state = InExponent;
            } else
                setDone(Bad);
            break;
        case InExponent:
            if (isDecimalDigit(current)) {
                record8(current);
            } else
                setDone(Number);
            break;
        case InIdentifierStartUnicodeEscapeStart:
            if (current == 'u')
                state = InIdentifierStartUnicodeEscape;
            else
                setDone(Bad);
            break;
        case InIdentifierPartUnicodeEscapeStart:
            if (current == 'u')
                state = InIdentifierPartUnicodeEscape;
            else
                setDone(Bad);
            break;
        case InIdentifierStartUnicodeEscape:
            if (!isHexDigit(current) || !isHexDigit(next1) || !isHexDigit(next2) || !isHexDigit(next3)) {
                setDone(Bad);
                break;
            }
            token = convertUnicode(current, next1, next2, next3);
            shift(3);
            if (!isIdentStart(token)) {
                setDone(Bad);
                break;
            }
            record16(token);
            state = InIdentifier;
            break;
        case InIdentifierPartUnicodeEscape:
            if (!isHexDigit(current) || !isHexDigit(next1) || !isHexDigit(next2) || !isHexDigit(next3)) {
                setDone(Bad);
                break;
            }
            token = convertUnicode(current, next1, next2, next3);
            shift(3);
            if (!isIdentPart(token)) {
                setDone(Bad);
                break;
            }
            record16(token);
            state = InIdentifier;
            break;
        default:
            ASSERT(!"Unhandled state in switch statement");
        }

        // move on to the next character
        if (!done)
            shift(1);
        if (state != Start && state != InSingleLineComment)
            atLineStart = false;
    }

    // no identifiers allowed directly after numeric literal, e.g. "3in" is bad
    if ((state == Number || state == Octal || state == Hex) && isIdentStart(current))
        state = Bad;

    // terminate string
    m_buffer8.append('\0');

#ifdef KJS_DEBUG_LEX
    fprintf(stderr, "line: %d ", lineNo());
    fprintf(stderr, "yytext (%x): ", m_buffer8[0]);
    fprintf(stderr, "%s ", m_buffer8.data());
#endif

    double dval = 0;
    if (state == Number) {
        dval = strtod(m_buffer8.data(), 0L);
    } else if (state == Hex) { // scan hex numbers
        const char* p = m_buffer8.data() + 2;
        while (char c = *p++) {
            dval *= 16;
            dval += convertHex(c);
        }

        if (dval >= mantissaOverflowLowerBound)
            dval = parseIntOverflow(m_buffer8.data() + 2, p - (m_buffer8.data() + 3), 16);

        state = Number;
    } else if (state == Octal) { // scan octal number
        const char* p = m_buffer8.data() + 1;
        while (char c = *p++) {
            dval *= 8;
            dval += c - '0';
        }

        if (dval >= mantissaOverflowLowerBound)
            dval = parseIntOverflow(m_buffer8.data() + 1, p - (m_buffer8.data() + 2), 8);

        state = Number;
    }

#ifdef KJS_DEBUG_LEX
    switch (state) {
    case Eof:
        printf("(EOF)\n");
        break;
    case Other:
        printf("(Other)\n");
        break;
    case Identifier:
        printf("(Identifier)/(Keyword)\n");
        break;
    case String:
        printf("(String)\n");
        break;
    case Number:
        printf("(Number)\n");
        break;
    default:
        printf("(unknown)");
    }
#endif

    if (state != Identifier)
        eatNextIdentifier = false;

    restrKeyword = false;
    delimited = false;
    llocp->first_line = yylineno; // ???
    llocp->last_line = yylineno;

    switch (state) {
    case Eof:
        token = 0;
        break;
    case Other:
        if (token == '}' || token == ';')
            delimited = true;
        break;
    case Identifier:
        // Apply anonymous-function hack below (eat the identifier).
        if (eatNextIdentifier) {
            eatNextIdentifier = false;
            token = lex(lvalp, llocp);
            break;
        }
        lvalp->ident = makeIdentifier(m_buffer16);
        token = IDENT;
        break;
    case IdentifierOrKeyword:
        lvalp->ident = makeIdentifier(m_buffer16);
        if ((token = mainTable.value(*lvalp->ident)) < 0) {
            // Lookup for keyword failed, means this is an identifier.
            token = IDENT;
            break;
        }
        // Hack for "f = function somename() { ... }"; too hard to get into the grammar.
        eatNextIdentifier = token == FUNCTION && lastToken == '=';
        if (token == CONTINUE || token == BREAK || token == RETURN || token == THROW)
            restrKeyword = true;
        break;
    case String:
        lvalp->string = makeUString(m_buffer16);
        token = STRING;
        break;
    case Number:
        lvalp->doubleValue = dval;
        token = NUMBER;
        break;
    case Bad:
#ifdef KJS_DEBUG_LEX
        fprintf(stderr, "yylex: ERROR.\n");
#endif
        error = true;
        return -1;
    default:
        ASSERT(!"unhandled enumeration value in switch");
        error = true;
        return -1;
    }
    lastToken = token;
    return token;
}

bool Lexer::isWhiteSpace() const
{
    return current == '\t' || current == 0x0b || current == 0x0c || isSeparatorSpace(current);
}

bool Lexer::isLineTerminator()
{
    bool cr = (current == '\r');
    bool lf = (current == '\n');
    if (cr)
        skipLF = true;
    else if (lf)
        skipCR = true;
    return cr || lf || current == 0x2028 || current == 0x2029;
}

bool Lexer::isIdentStart(int c)
{
    return (category(c) & (Letter_Uppercase | Letter_Lowercase | Letter_Titlecase | Letter_Modifier | Letter_Other))
        || c == '$' || c == '_';
}

bool Lexer::isIdentPart(int c)
{
    return (category(c) & (Letter_Uppercase | Letter_Lowercase | Letter_Titlecase | Letter_Modifier | Letter_Other
                           | Mark_NonSpacing | Mark_SpacingCombining | Number_DecimalDigit | Punctuation_Connector))
        || c == '$' || c == '_';
}

static bool isDecimalDigit(int c)
{
    return (c >= '0' && c <= '9');
}

bool Lexer::isHexDigit(int c)
{
    return (c >= '0' && c <= '9')
        || (c >= 'a' && c <= 'f')
        || (c >= 'A' && c <= 'F');
}

bool Lexer::isOctalDigit(int c)
{
    return (c >= '0' && c <= '7');
}

int Lexer::matchPunctuator(int& charPos, int c1, int c2, int c3, int c4)
{
    if (c1 == '>' && c2 == '>' && c3 == '>' && c4 == '=') {
        shift(4);
        return URSHIFTEQUAL;
    } else if (c1 == '=' && c2 == '=' && c3 == '=') {
        shift(3);
        return STREQ;
    } else if (c1 == '!' && c2 == '=' && c3 == '=') {
        shift(3);
        return STRNEQ;
    } else if (c1 == '>' && c2 == '>' && c3 == '>') {
        shift(3);
        return URSHIFT;
    } else if (c1 == '<' && c2 == '<' && c3 == '=') {
        shift(3);
        return LSHIFTEQUAL;
    } else if (c1 == '>' && c2 == '>' && c3 == '=') {
        shift(3);
        return RSHIFTEQUAL;
    } else if (c1 == '<' && c2 == '=') {
        shift(2);
        return LE;
    } else if (c1 == '>' && c2 == '=') {
        shift(2);
        return GE;
    } else if (c1 == '!' && c2 == '=') {
        shift(2);
        return NE;
    } else if (c1 == '+' && c2 == '+') {
        shift(2);
        if (terminator)
            return AUTOPLUSPLUS;
        else
            return PLUSPLUS;
    } else if (c1 == '-' && c2 == '-') {
        shift(2);
        if (terminator)
            return AUTOMINUSMINUS;
        else
            return MINUSMINUS;
    } else if (c1 == '=' && c2 == '=') {
        shift(2);
        return EQEQ;
    } else if (c1 == '+' && c2 == '=') {
        shift(2);
        return PLUSEQUAL;
    } else if (c1 == '-' && c2 == '=') {
        shift(2);
        return MINUSEQUAL;
    } else if (c1 == '*' && c2 == '=') {
        shift(2);
        return MULTEQUAL;
    } else if (c1 == '/' && c2 == '=') {
        shift(2);
        return DIVEQUAL;
    } else if (c1 == '&' && c2 == '=') {
        shift(2);
        return ANDEQUAL;
    } else if (c1 == '^' && c2 == '=') {
        shift(2);
        return XOREQUAL;
    } else if (c1 == '%' && c2 == '=') {
        shift(2);
        return MODEQUAL;
    } else if (c1 == '|' && c2 == '=') {
        shift(2);
        return OREQUAL;
    } else if (c1 == '<' && c2 == '<') {
        shift(2);
        return LSHIFT;
    } else if (c1 == '>' && c2 == '>') {
        shift(2);
        return RSHIFT;
    } else if (c1 == '&' && c2 == '&') {
        shift(2);
        return AND;
    } else if (c1 == '|' && c2 == '|') {
        shift(2);
        return OR;
    }

    switch (c1) {
    case '=':
    case '>':
    case '<':
    case ',':
    case '!':
    case '~':
    case '?':
    case ':':
    case '.':
    case '+':
    case '-':
    case '*':
    case '/':
    case '&':
    case '|':
    case '^':
    case '%':
    case '(':
    case ')':
    case '[':
    case ']':
    case ';':
        shift(1);
        return static_cast<int>(c1);
    case '{':
        charPos = pos - 4;
        shift(1);
        return OPENBRACE;
    case '}':
        charPos = pos - 4;
        shift(1);
        return CLOSEBRACE;
    default:
        return -1;
    }
}

unsigned short Lexer::singleEscape(unsigned short c)
{
    switch (c) {
    case 'b':
        return 0x08;
    case 't':
        return 0x09;
    case 'n':
        return 0x0A;
    case 'v':
        return 0x0B;
    case 'f':
        return 0x0C;
    case 'r':
        return 0x0D;
    case '"':
        return 0x22;
    case '\'':
        return 0x27;
    case '\\':
        return 0x5C;
    default:
        return c;
    }
}

unsigned short Lexer::convertOctal(int c1, int c2, int c3)
{
    return static_cast<unsigned short>((c1 - '0') * 64 + (c2 - '0') * 8 + c3 - '0');
}

unsigned char Lexer::convertHex(int c)
{
    if (c >= '0' && c <= '9')
        return static_cast<unsigned char>(c - '0');
    if (c >= 'a' && c <= 'f')
        return static_cast<unsigned char>(c - 'a' + 10);
    return static_cast<unsigned char>(c - 'A' + 10);
}

unsigned char Lexer::convertHex(int c1, int c2)
{
    return ((convertHex(c1) << 4) + convertHex(c2));
}

UChar Lexer::convertUnicode(int c1, int c2, int c3, int c4)
{
    unsigned char highByte = (convertHex(c1) << 4) + convertHex(c2);
    unsigned char lowByte = (convertHex(c3) << 4) + convertHex(c4);
    return (highByte << 8 | lowByte);
}

void Lexer::record8(int c)
{
    ASSERT(c >= 0);
    ASSERT(c <= 0xff);
    m_buffer8.append(static_cast<char>(c));
}

void Lexer::record16(int c)
{
    ASSERT(c >= 0);
    ASSERT(c <= USHRT_MAX);
    record16(UChar(static_cast<unsigned short>(c)));
}

void Lexer::record16(UChar c)
{
    m_buffer16.append(c);
}

bool Lexer::scanRegExp()
{
    m_buffer16.clear();
    bool lastWasEscape = false;
    bool inBrackets = false;

    while (1) {
        if (isLineTerminator() || current == -1)
            return false;
        else if (current != '/' || lastWasEscape || inBrackets) {
            // keep track of '[' and ']'
            if (!lastWasEscape) {
                if (current == '[' && !inBrackets)
                    inBrackets = true;
                if (current == ']' && inBrackets)
                    inBrackets = false;
            }
            record16(current);
            lastWasEscape = !lastWasEscape && (current == '\\');
        } else { // end of regexp
            m_pattern = UString(m_buffer16);
            m_buffer16.clear();
            shift(1);
            break;
        }
        shift(1);
    }

    while (isIdentPart(current)) {
        record16(current);
        shift(1);
    }
    m_flags = UString(m_buffer16);

    return true;
}

void Lexer::clear()
{
    deleteAllValues(m_strings);
    Vector<UString*> newStrings;
    newStrings.reserveCapacity(initialStringTableCapacity);
    m_strings.swap(newStrings);

    deleteAllValues(m_identifiers);
    Vector<KJS::Identifier*> newIdentifiers;
    newIdentifiers.reserveCapacity(initialStringTableCapacity);
    m_identifiers.swap(newIdentifiers);

    Vector<char> newBuffer8;
    newBuffer8.reserveCapacity(initialReadBufferCapacity);
    m_buffer8.swap(newBuffer8);

    Vector<UChar> newBuffer16;
    newBuffer16.reserveCapacity(initialReadBufferCapacity);
    m_buffer16.swap(newBuffer16);

    m_pattern = 0;
    m_flags = 0;
}

Identifier* Lexer::makeIdentifier(const Vector<UChar>& buffer)
{
    KJS::Identifier* identifier = new KJS::Identifier(buffer.data(), buffer.size());
    m_identifiers.append(identifier);
    return identifier;
}

UString* Lexer::makeUString(const Vector<UChar>& buffer)
{
    UString* string = new UString(buffer);
    m_strings.append(string);
    return string;
}

} // namespace KJS
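
The four-character lookahead window that Lexer::shift() maintains in the listing above can be tried out in isolation with a sketch like the one below. `LookaheadWindow` is a hypothetical stand-alone model, not part of KJS: it assumes a `std::u16string` in place of the SourceProvider buffer, and keeps only the window fields and the byte order mark skipping.

```cpp
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical model of the lexer's lookahead window (not the real KJS::Lexer).
struct LookaheadWindow {
    std::u16string code;
    std::size_t pos = 0;
    int current = 0;
    int next1 = 0;
    int next2 = 0;
    int next3 = 0;

    explicit LookaheadWindow(std::u16string source)
        : code(std::move(source))
    {
        shift(4); // prime the window, as setCode() does in the listing
    }

    void shift(unsigned count)
    {
        while (count--) {
            current = next1;
            next1 = next2;
            next2 = next3;
            do {
                if (pos >= code.size()) {
                    ++pos;
                    next3 = -1; // end-of-input sentinel
                    break;
                }
                next3 = code[pos++];
            } while (next3 == 0xFEFF); // skip byte order marks only
        }
    }
};
```

Constructing the window over `u"a\uFEFFb"` leaves `current` at 'a' and `next1` at 'b', with the byte order mark dropped and the remaining slots holding the -1 sentinel.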