lexer.cpp@ 34273

Last change on this file since 34273 was 34273, checked in by [email protected], 17 years ago

Made the starting line number of scripts be 1-based throughout the engine.

2008-05-30 Timothy Hatcher <[email protected]>

Made the starting line number of scripts be 1-based throughout the engine.
This cleans up script line numbers so they are all consistent now and fixes
some cases where script execution was shown as off by one line in the debugger.

No change in SunSpider.

Reviewed by Oliver Hunt.

API/minidom.c: (main): Pass a line number of 1 instead of 0 to parser().parse().
API/testapi.c: (main): Ditto. Also removed a FIXME and changed an assertEqualsAsNumber to use 1 instead of 2 for the line number.
VM/Machine.cpp: (KJS::callEval): Pass a line number of 1 instead of 0. (KJS::Machine::debug): Use firstLine for WillExecuteProgram instead of lastLine. Use lastLine for DidExecuteProgram instead of firstLine.
kjs/DebuggerCallFrame.cpp: (KJS::DebuggerCallFrame::evaluate): Pass a line number of 1 instead of 0 to parser().parse().
kjs/Parser.cpp: (KJS::Parser::parse): ASSERT startingLineNumber is greater than 0. Change the startingLineNumber to be 1 if it was less than or equal to 0. This is needed for release builds to maintain compatibility with the JavaScriptCore API.
kjs/function.cpp: (KJS::globalFuncEval): Pass a line number of 1 instead of 0 to parser().parse().
kjs/function_object.cpp: (FunctionObjectImp::construct): Pass a line number of 1 instead of 0 to construct().
kjs/lexer.cpp: (Lexer::setCode): Made yylineno = startingLineNumber instead of adding 1.
kjs/testkjs.cpp: (functionRun): Pass a line number of 1 instead of 0 to Interpreter::evaluate(). (functionLoad): Ditto. (prettyPrintScript): Ditto. (runWithScripts): Ditto.
profiler/Profiler.cpp: (WebCore::createCallIdentifier): Removed a plus 1 of startingLineNumber.
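
The Parser.cpp and lexer.cpp entries above can be illustrated with a small standalone sketch. It is not the engine's code: `normalizeStartingLine` and `lastLineNumber` are hypothetical helpers, and the only behavior taken from the ChangeLog is that a starting line of 0 or less is treated as 1 and that the line counter starts at the given starting line rather than one past it.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: mirrors the release-build fallback described for
// KJS::Parser::parse, where a starting line <= 0 is treated as line 1.
int normalizeStartingLine(int startingLineNumber)
{
    return startingLineNumber <= 0 ? 1 : startingLineNumber;
}

// Hypothetical helper: returns the line number of the last line of `source`,
// counting from the (normalized) starting line and adding one per line
// terminator, the way a lexer that sets yylineno = startingLineNumber would.
int lastLineNumber(const std::string& source, int startingLineNumber)
{
    int line = normalizeStartingLine(startingLineNumber);
    for (std::size_t i = 0; i < source.size(); ++i) {
        char c = source[i];
        if (c == '\r' || c == '\n') {
            // Treat a \r\n or \n\r pair as a single line terminator.
            if (i + 1 < source.size()) {
                char n = source[i + 1];
                if ((c == '\r' && n == '\n') || (c == '\n' && n == '\r'))
                    ++i;
            }
            ++line;
        }
    }
    return line;
}
```

With this convention a three-line script that starts at line 1 ends at line 3, so the first and last line numbers reported to a debugger are both 1-based.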

WebCore:

2008-05-30 Timothy Hatcher <[email protected]>

Made the starting line number of scripts be 1-based throughout the engine.
This cleans up script line numbers so they are all consistent now and fixes
some cases where script execution was shown as off by one line in the debugger.

Doing this also exposed a bug where a JSLazyEventListener created in an XHTML or SVG
document would always have a line number of 0. This change fixes that bug so that
all the SVG and XHTML tests pass.

All layout tests pass.

Reviewed by Oliver Hunt.

bindings/js/kjs_events.cpp: (WebCore::JSLazyEventListener::JSLazyEventListener): Set the line number to 1 if it was passed in as 0. This can happen when listeners are created with a setAttribute call from JavaScript. (WebCore::JSLazyEventListener::parseCode): Add a FIXME about the URL being incorrect when listeners are created with a setAttribute call from JavaScript.
bindings/js/kjs_events.h: Remove the default value for lineNumber, since no callers need it.
bindings/objc/WebScriptObject.mm: (-[WebScriptObject evaluateWebScript:]): Pass a line number of 1 instead of 0 to Interpreter::evaluate().
bridge/NP_jsobject.cpp: (_NPN_Evaluate): Ditto.
bridge/jni/jni_jsobject.mm: (JavaJSObject::eval): Ditto.
dom/XMLTokenizer.cpp: (WebCore::XMLTokenizer::startElementNs): Call KJSProxy::setEventHandlerLineno() around the call to handleElementAttributes, so that any JSLazyEventListener created from those attributes has a line number. (WebCore::XMLTokenizer::endElementNs): Remove a minus 1 of the line number. (WebCore::XMLTokenizer::notifyFinished): Pass a line number of 1 instead of 0. (WebCore::XMLTokenizer::parseEndElement): Remove a minus 1 of the line number.
html/HTMLScriptElement.cpp: (WebCore::HTMLScriptElement::evaluateScript): Add a FIXME about the starting line number being incorrect in some cases when this function is called.
html/HTMLTokenizer.cpp: (WebCore::HTMLTokenizer::parseSpecial): Add a plus 1 to the line number when setting scriptStartLineno so it is 1-based. Same for calling setEventHandlerLineno(). (WebCore::HTMLTokenizer::processToken): Ditto.
html/HTMLTokenizer.h: Change the default line number on scriptExecution() to 1 from 0.
loader/FrameLoader.cpp: (FrameLoader::executeIfJavaScriptURL): Pass a line number of 1 instead of 0 to executeScript().
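
The WebCore entries above describe two related ideas: publishing a line number while an element's attributes are processed, and falling back to line 1 when a listener is created without one. A minimal sketch of that pattern follows. Every name in it (`HandlerLineContext`, `ScopedHandlerLine`, `LazyListener`, `toOneBasedLine`) is hypothetical, and the scope guard is an assumption about what calling a setter "around" another call could look like, not the way KJSProxy::setEventHandlerLineno() is actually used.

```cpp
// Hypothetical stand-in for the script proxy: remembers the line of the
// element whose attributes are currently being processed (0 = unknown).
struct HandlerLineContext {
    int eventHandlerLineno = 0;
};

// Hypothetical RAII guard: publishes a 1-based line number for the duration
// of a scope and restores the previous value afterwards.
class ScopedHandlerLine {
public:
    ScopedHandlerLine(HandlerLineContext& context, int oneBasedLine)
        : m_context(context)
        , m_previous(context.eventHandlerLineno)
    {
        m_context.eventHandlerLineno = oneBasedLine;
    }
    ~ScopedHandlerLine() { m_context.eventHandlerLineno = m_previous; }

    ScopedHandlerLine(const ScopedHandlerLine&) = delete;
    ScopedHandlerLine& operator=(const ScopedHandlerLine&) = delete;

private:
    HandlerLineContext& m_context;
    int m_previous;
};

// Hypothetical listener: falls back to line 1 when no line is known, for
// example when it is created by a setAttribute call from JavaScript.
struct LazyListener {
    int lineNumber;
    explicit LazyListener(int line) : lineNumber(line > 0 ? line : 1) {}
};

// Converts a tokenizer's 0-based line counter to a 1-based script line.
inline int toOneBasedLine(int zeroBasedLine) { return zeroBasedLine + 1; }
```

A listener created inside the guarded scope picks up the element's line; one created outside it sees 0 and falls back to 1.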

WebKitTools:

2008-05-30 Timothy Hatcher <[email protected]>

Made the starting line number of scripts be 1-based throughout the engine.
This cleans up script line numbers so they are all consistent now.

Reviewed by Oliver Hunt.

DumpRenderTree/mac/ObjCController.m: (runJavaScriptThread): Pass a line number of 1 instead of 0 to JSEvaluateScript.
DumpRenderTree/pthreads/JavaScriptThreadingPthreads.cpp: (runJavaScriptThread): Ditto.
DumpRenderTree/win/DumpRenderTree.cpp: (runJavaScriptThread): Ditto.

Property svn:eol-style set to native

File size: 22.3 KB

Line
1	/*
2	* Copyright (C) 1999-2000 Harri Porten ([email protected])
3	* Copyright (C) 2006, 2007, 2008 Apple Inc. All Rights Reserved.
4	* Copyright (C) 2007 Cameron Zwarich ([email protected])
5	*
6	* This library is free software; you can redistribute it and/or
7	* modify it under the terms of the GNU Library General Public
8	* License as published by the Free Software Foundation; either
9	* version 2 of the License, or (at your option) any later version.
10	*
11	* This library is distributed in the hope that it will be useful,
12	* but WITHOUT ANY WARRANTY; without even the implied warranty of
13	* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
14	* Library General Public License for more details.
15	*
16	* You should have received a copy of the GNU Library General Public License
17	* along with this library; see the file COPYING.LIB. If not, write to
18	* the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor,
19	* Boston, MA 02110-1301, USA.
20	*
21	*/
22
23	#include "config.h"
24	#include "lexer.h"
25
26	#include "dtoa.h"
27	#include "function.h"
28	#include "nodes.h"
29	#include "NodeInfo.h"
30	#include <ctype.h>
31	#include <limits.h>
32	#include <string.h>
33	#include <wtf/Assertions.h>
34	#include <wtf/unicode/Unicode.h>
35
36	#if USE(MULTIPLE_THREADS)
37	#include <wtf/ThreadSpecific.h>
38	#endif
39
40	using namespace WTF;
41	using namespace Unicode;
42
43	// we can't specify the namespace in yacc's C output, so do it here
44	using namespace KJS;
45
46	#ifndef KDE_USE_FINAL
47	#include "grammar.h"
48	#endif
49
50	#include "lookup.h"
51	#include "lexer.lut.h"
52
53	// a bridge for yacc from the C world to C++
54	int kjsyylex(void* lvalp, void* llocp, void* lexer)
55	{
56	return static_cast<Lexer*>(lexer)->lex(lvalp, llocp);
57	}
58
59	namespace KJS {
60
61	static bool isDecimalDigit(int);
62
63	static const size_t initialReadBufferCapacity = 32;
64	static const size_t initialStringTableCapacity = 64;
65
66	Lexer& lexer()
67	{
68	#if USE(MULTIPLE_THREADS)
69	static ThreadSpecific<Lexer> staticLexer;
70	return *staticLexer;
71	#else
72	static Lexer staticLexer;
73	return staticLexer;
74	#endif
75	}
76
77	Lexer::Lexer()
78	: yylineno(1)
79	, restrKeyword(false)
80	, eatNextIdentifier(false)
81	, stackToken(-1)
82	, lastToken(-1)
83	, pos(0)
84	, code(0)
85	, length(0)
86	, atLineStart(true)
87	, current(0)
88	, next1(0)
89	, next2(0)
90	, next3(0)
91	, mainTable(KJS::mainTable)
92	{
93	m_buffer8.reserveCapacity(initialReadBufferCapacity);
94	m_buffer16.reserveCapacity(initialReadBufferCapacity);
95	m_strings.reserveCapacity(initialStringTableCapacity);
96	m_identifiers.reserveCapacity(initialStringTableCapacity);
97	}
98
99	Lexer::~Lexer()
100	{
101	delete[] mainTable.table;
102	}
103
104	void Lexer::setCode(int startingLineNumber, PassRefPtr<SourceProvider> source)
105	{
106	yylineno = startingLineNumber;
107	restrKeyword = false;
108	delimited = false;
109	eatNextIdentifier = false;
110	stackToken = -1;
111	lastToken = -1;
112
113	pos = 0;
114	m_source = source;
115	code = m_source->data();
116	length = m_source->length();
117	skipLF = false;
118	skipCR = false;
119	error = false;
120	atLineStart = true;
121
122	// read first characters
123	shift(4);
124	}
125
126	void Lexer::shift(unsigned p)
127	{
128	// ECMA-262 calls for stripping Cf characters here, but we only do this for BOM,
129	// see <https://bugs.webkit.org/show_bug.cgi?id=4931>.
130
131	while (p--) {
132	current = next1;
133	next1 = next2;
134	next2 = next3;
135	do {
136	if (pos >= length) {
137	pos++;
138	next3 = -1;
139	break;
140	}
141	next3 = code[pos++];
142	} while (next3 == 0xFEFF);
143	}
144	}
145
146	// called on each new line
147	void Lexer::nextLine()
148	{
149	yylineno++;
150	atLineStart = true;
151	}
152
153	void Lexer::setDone(State s)
154	{
155	state = s;
156	done = true;
157	}
158
159	int Lexer::lex(void* p1, void* p2)
160	{
161	YYSTYPE* lvalp = static_cast<YYSTYPE*>(p1);
162	YYLTYPE* llocp = static_cast<YYLTYPE*>(p2);
163	int token = 0;
164	state = Start;
165	unsigned short stringType = 0; // either single or double quotes
166	m_buffer8.clear();
167	m_buffer16.clear();
168	done = false;
169	terminator = false;
170	skipLF = false;
171	skipCR = false;
172
173	// did we push a token on the stack previously ?
174	// (after an automatic semicolon insertion)
175	if (stackToken >= 0) {
176	setDone(Other);
177	token = stackToken;
178	stackToken = 0;
179	}
180
181	while (!done) {
182	if (skipLF && current != '\n') // found \r but not \n afterwards
183	skipLF = false;
184	if (skipCR && current != '\r') // found \n but not \r afterwards
185	skipCR = false;
186	if (skipLF || skipCR) // found \r\n or \n\r -> eat the second one
187	{
188	skipLF = false;
189	skipCR = false;
190	shift(1);
191	}
192	switch (state) {
193	case Start:
194	if (isWhiteSpace()) {
195	// do nothing
196	} else if (current == '/' && next1 == '/') {
197	shift(1);
198	state = InSingleLineComment;
199	} else if (current == '/' && next1 == '*') {
200	shift(1);
201	state = InMultiLineComment;
202	} else if (current == -1) {
203	if (!terminator && !delimited) {
204	// automatic semicolon insertion if program incomplete
205	token = ';';
206	stackToken = 0;
207	setDone(Other);
208	} else
209	setDone(Eof);
210	} else if (isLineTerminator()) {
211	nextLine();
212	terminator = true;
213	if (restrKeyword) {
214	token = ';';
215	setDone(Other);
216	}
217	} else if (current == '"' || current == '\'') {
218	state = InString;
219	stringType = static_cast<unsigned short>(current);
220	} else if (isIdentStart(current)) {
221	record16(current);
222	state = InIdentifierOrKeyword;
223	} else if (current == '\\') {
224	state = InIdentifierStartUnicodeEscapeStart;
225	} else if (current == '0') {
226	record8(current);
227	state = InNum0;
228	} else if (isDecimalDigit(current)) {
229	record8(current);
230	state = InNum;
231	} else if (current == '.' && isDecimalDigit(next1)) {
232	record8(current);
233	state = InDecimal;
234	// <!-- marks the beginning of a line comment (for www usage)
235	} else if (current == '<' && next1 == '!' &&
236	next2 == '-' && next3 == '-') {
237	shift(3);
238	state = InSingleLineComment;
239	// same for -->
240	} else if (atLineStart && current == '-' && next1 == '-' && next2 == '>') {
241	shift(2);
242	state = InSingleLineComment;
243	} else {
244	token = matchPunctuator(lvalp->intValue, current, next1, next2, next3);
245	if (token != -1) {
246	setDone(Other);
247	} else {
248	// cerr << "encountered unknown character" << endl;
249	setDone(Bad);
250	}
251	}
252	break;
253	case InString:
254	if (current == stringType) {
255	shift(1);
256	setDone(String);
257	} else if (isLineTerminator() || current == -1) {
258	setDone(Bad);
259	} else if (current == '\\') {
260	state = InEscapeSequence;
261	} else {
262	record16(current);
263	}
264	break;
265	// Escape Sequences inside of strings
266	case InEscapeSequence:
267	if (isOctalDigit(current)) {
268	if (current >= '0' && current <= '3' &&
269	isOctalDigit(next1) && isOctalDigit(next2)) {
270	record16(convertOctal(current, next1, next2));
271	shift(2);
272	state = InString;
273	} else if (isOctalDigit(current) && isOctalDigit(next1)) {
274	record16(convertOctal('0', current, next1));
275	shift(1);
276	state = InString;
277	} else if (isOctalDigit(current)) {
278	record16(convertOctal('0', '0', current));
279	state = InString;
280	} else {
281	setDone(Bad);
282	}
283	} else if (current == 'x')
284	state = InHexEscape;
285	else if (current == 'u')
286	state = InUnicodeEscape;
287	else if (isLineTerminator()) {
288	nextLine();
289	state = InString;
290	} else {
291	record16(singleEscape(static_cast<unsigned short>(current)));
292	state = InString;
293	}
294	break;
295	case InHexEscape:
296	if (isHexDigit(current) && isHexDigit(next1)) {
297	state = InString;
298	record16(convertHex(current, next1));
299	shift(1);
300	} else if (current == stringType) {
301	record16('x');
302	shift(1);
303	setDone(String);
304	} else {
305	record16('x');
306	record16(current);
307	state = InString;
308	}
309	break;
310	case InUnicodeEscape:
311	if (isHexDigit(current) && isHexDigit(next1) && isHexDigit(next2) && isHexDigit(next3)) {
312	record16(convertUnicode(current, next1, next2, next3));
313	shift(3);
314	state = InString;
315	} else if (current == stringType) {
316	record16('u');
317	shift(1);
318	setDone(String);
319	} else {
320	setDone(Bad);
321	}
322	break;
323	case InSingleLineComment:
324	if (isLineTerminator()) {
325	nextLine();
326	terminator = true;
327	if (restrKeyword) {
328	token = ';';
329	setDone(Other);
330	} else
331	state = Start;
332	} else if (current == -1) {
333	setDone(Eof);
334	}
335	break;
336	case InMultiLineComment:
337	if (current == -1) {
338	setDone(Bad);
339	} else if (isLineTerminator()) {
340	nextLine();
341	} else if (current == '*' && next1 == '/') {
342	state = Start;
343	shift(1);
344	}
345	break;
346	case InIdentifierOrKeyword:
347	case InIdentifier:
348	if (isIdentPart(current))
349	record16(current);
350	else if (current == '\\')
351	state = InIdentifierPartUnicodeEscapeStart;
352	else
353	setDone(state == InIdentifierOrKeyword ? IdentifierOrKeyword : Identifier);
354	break;
355	case InNum0:
356	if (current == 'x' || current == 'X') {
357	record8(current);
358	state = InHex;
359	} else if (current == '.') {
360	record8(current);
361	state = InDecimal;
362	} else if (current == 'e' || current == 'E') {
363	record8(current);
364	state = InExponentIndicator;
365	} else if (isOctalDigit(current)) {
366	record8(current);
367	state = InOctal;
368	} else if (isDecimalDigit(current)) {
369	record8(current);
370	state = InDecimal;
371	} else {
372	setDone(Number);
373	}
374	break;
375	case InHex:
376	if (isHexDigit(current)) {
377	record8(current);
378	} else {
379	setDone(Hex);
380	}
381	break;
382	case InOctal:
383	if (isOctalDigit(current)) {
384	record8(current);
385	}
386	else if (isDecimalDigit(current)) {
387	record8(current);
388	state = InDecimal;
389	} else
390	setDone(Octal);
391	break;
392	case InNum:
393	if (isDecimalDigit(current)) {
394	record8(current);
395	} else if (current == '.') {
396	record8(current);
397	state = InDecimal;
398	} else if (current == 'e' || current == 'E') {
399	record8(current);
400	state = InExponentIndicator;
401	} else
402	setDone(Number);
403	break;
404	case InDecimal:
405	if (isDecimalDigit(current)) {
406	record8(current);
407	} else if (current == 'e' || current == 'E') {
408	record8(current);
409	state = InExponentIndicator;
410	} else
411	setDone(Number);
412	break;
413	case InExponentIndicator:
414	if (current == '+' || current == '-') {
415	record8(current);
416	} else if (isDecimalDigit(current)) {
417	record8(current);
418	state = InExponent;
419	} else
420	setDone(Bad);
421	break;
422	case InExponent:
423	if (isDecimalDigit(current)) {
424	record8(current);
425	} else
426	setDone(Number);
427	break;
428	case InIdentifierStartUnicodeEscapeStart:
429	if (current == 'u')
430	state = InIdentifierStartUnicodeEscape;
431	else
432	setDone(Bad);
433	break;
434	case InIdentifierPartUnicodeEscapeStart:
435	if (current == 'u')
436	state = InIdentifierPartUnicodeEscape;
437	else
438	setDone(Bad);
439	break;
440	case InIdentifierStartUnicodeEscape:
441	if (!isHexDigit(current) || !isHexDigit(next1) || !isHexDigit(next2) || !isHexDigit(next3)) {
442	setDone(Bad);
443	break;
444	}
445	token = convertUnicode(current, next1, next2, next3);
446	shift(3);
447	if (!isIdentStart(token)) {
448	setDone(Bad);
449	break;
450	}
451	record16(token);
452	state = InIdentifier;
453	break;
454	case InIdentifierPartUnicodeEscape:
455	if (!isHexDigit(current) || !isHexDigit(next1) || !isHexDigit(next2) || !isHexDigit(next3)) {
456	setDone(Bad);
457	break;
458	}
459	token = convertUnicode(current, next1, next2, next3);
460	shift(3);
461	if (!isIdentPart(token)) {
462	setDone(Bad);
463	break;
464	}
465	record16(token);
466	state = InIdentifier;
467	break;
468	default:
469	ASSERT(!"Unhandled state in switch statement");
470	}
471
472	// move on to the next character
473	if (!done)
474	shift(1);
475	if (state != Start && state != InSingleLineComment)
476	atLineStart = false;
477	}
478
479	// no identifiers allowed directly after numeric literal, e.g. "3in" is bad
480	if ((state == Number || state == Octal || state == Hex) && isIdentStart(current))
481	state = Bad;
482
483	// terminate string
484	m_buffer8.append('\0');
485
486	#ifdef KJS_DEBUG_LEX
487	fprintf(stderr, "line: %d ", lineNo());
488	fprintf(stderr, "yytext (%x): ", m_buffer8[0]);
489	fprintf(stderr, "%s ", m_buffer8.data());
490	#endif
491
492	double dval = 0;
493	if (state == Number) {
494	dval = strtod(m_buffer8.data(), 0L);
495	} else if (state == Hex) { // scan hex numbers
496	const char* p = m_buffer8.data() + 2;
497	while (char c = *p++) {
498	dval *= 16;
499	dval += convertHex(c);
500	}
501
502	if (dval >= mantissaOverflowLowerBound)
503	dval = parseIntOverflow(m_buffer8.data() + 2, p - (m_buffer8.data() + 3), 16);
504
505	state = Number;
506	} else if (state == Octal) { // scan octal number
507	const char* p = m_buffer8.data() + 1;
508	while (char c = *p++) {
509	dval *= 8;
510	dval += c - '0';
511	}
512
513	if (dval >= mantissaOverflowLowerBound)
514	dval = parseIntOverflow(m_buffer8.data() + 1, p - (m_buffer8.data() + 2), 8);
515
516	state = Number;
517	}
518
519	#ifdef KJS_DEBUG_LEX
520	switch (state) {
521	case Eof:
522	printf("(EOF)\n");
523	break;
524	case Other:
525	printf("(Other)\n");
526	break;
527	case Identifier:
528	printf("(Identifier)/(Keyword)\n");
529	break;
530	case String:
531	printf("(String)\n");
532	break;
533	case Number:
534	printf("(Number)\n");
535	break;
536	default:
537	printf("(unknown)");
538	}
539	#endif
540
541	if (state != Identifier)
542	eatNextIdentifier = false;
543
544	restrKeyword = false;
545	delimited = false;
546	llocp->first_line = yylineno; // ???
547	llocp->last_line = yylineno;
548
549	switch (state) {
550	case Eof:
551	token = 0;
552	break;
553	case Other:
554	if (token == '}' || token == ';')
555	delimited = true;
556	break;
557	case Identifier:
558	// Apply anonymous-function hack below (eat the identifier).
559	if (eatNextIdentifier) {
560	eatNextIdentifier = false;
561	token = lex(lvalp, llocp);
562	break;
563	}
564	lvalp->ident = makeIdentifier(m_buffer16);
565	token = IDENT;
566	break;
567	case IdentifierOrKeyword:
568	lvalp->ident = makeIdentifier(m_buffer16);
569	if ((token = mainTable.value(*lvalp->ident)) < 0) {
570	// Lookup for keyword failed, means this is an identifier.
571	token = IDENT;
572	break;
573	}
574	// Hack for "f = function somename() { ... }"; too hard to get into the grammar.
575	eatNextIdentifier = token == FUNCTION && lastToken == '=';
576	if (token == CONTINUE || token == BREAK || token == RETURN || token == THROW)
577	restrKeyword = true;
578	break;
579	case String:
580	lvalp->string = makeUString(m_buffer16);
581	token = STRING;
582	break;
583	case Number:
584	lvalp->doubleValue = dval;
585	token = NUMBER;
586	break;
587	case Bad:
588	#ifdef KJS_DEBUG_LEX
589	fprintf(stderr, "yylex: ERROR.\n");
590	#endif
591	error = true;
592	return -1;
593	default:
594	ASSERT(!"unhandled enumeration value in switch");
595	error = true;
596	return -1;
597	}
598	lastToken = token;
599	return token;
600	}
601
602	bool Lexer::isWhiteSpace() const
603	{
604	return current == '\t' || current == 0x0b || current == 0x0c || isSeparatorSpace(current);
605	}
606
607	bool Lexer::isLineTerminator()
608	{
609	bool cr = (current == '\r');
610	bool lf = (current == '\n');
611	if (cr)
612	skipLF = true;
613	else if (lf)
614	skipCR = true;
615	return cr || lf || current == 0x2028 || current == 0x2029;
616	}
617
618	bool Lexer::isIdentStart(int c)
619	{
620	return (category(c) & (Letter_Uppercase | Letter_Lowercase | Letter_Titlecase | Letter_Modifier | Letter_Other))
621	|| c == '$' || c == '_';
622	}
623
624	bool Lexer::isIdentPart(int c)
625	{
626	return (category(c) & (Letter_Uppercase | Letter_Lowercase | Letter_Titlecase | Letter_Modifier | Letter_Other
627	| Mark_NonSpacing | Mark_SpacingCombining | Number_DecimalDigit | Punctuation_Connector))
628	|| c == '$' || c == '_';
629	}
630
631	static bool isDecimalDigit(int c)
632	{
633	return (c >= '0' && c <= '9');
634	}
635
636	bool Lexer::isHexDigit(int c)
637	{
638	return (c >= '0' && c <= '9' ||
639	c >= 'a' && c <= 'f' ||
640	c >= 'A' && c <= 'F');
641	}
642
643	bool Lexer::isOctalDigit(int c)
644	{
645	return (c >= '0' && c <= '7');
646	}
647
648	int Lexer::matchPunctuator(int& charPos, int c1, int c2, int c3, int c4)
649	{
650	if (c1 == '>' && c2 == '>' && c3 == '>' && c4 == '=') {
651	shift(4);
652	return URSHIFTEQUAL;
653	} else if (c1 == '=' && c2 == '=' && c3 == '=') {
654	shift(3);
655	return STREQ;
656	} else if (c1 == '!' && c2 == '=' && c3 == '=') {
657	shift(3);
658	return STRNEQ;
659	} else if (c1 == '>' && c2 == '>' && c3 == '>') {
660	shift(3);
661	return URSHIFT;
662	} else if (c1 == '<' && c2 == '<' && c3 == '=') {
663	shift(3);
664	return LSHIFTEQUAL;
665	} else if (c1 == '>' && c2 == '>' && c3 == '=') {
666	shift(3);
667	return RSHIFTEQUAL;
668	} else if (c1 == '<' && c2 == '=') {
669	shift(2);
670	return LE;
671	} else if (c1 == '>' && c2 == '=') {
672	shift(2);
673	return GE;
674	} else if (c1 == '!' && c2 == '=') {
675	shift(2);
676	return NE;
677	} else if (c1 == '+' && c2 == '+') {
678	shift(2);
679	if (terminator)
680	return AUTOPLUSPLUS;
681	else
682	return PLUSPLUS;
683	} else if (c1 == '-' && c2 == '-') {
684	shift(2);
685	if (terminator)
686	return AUTOMINUSMINUS;
687	else
688	return MINUSMINUS;
689	} else if (c1 == '=' && c2 == '=') {
690	shift(2);
691	return EQEQ;
692	} else if (c1 == '+' && c2 == '=') {
693	shift(2);
694	return PLUSEQUAL;
695	} else if (c1 == '-' && c2 == '=') {
696	shift(2);
697	return MINUSEQUAL;
698	} else if (c1 == '*' && c2 == '=') {
699	shift(2);
700	return MULTEQUAL;
701	} else if (c1 == '/' && c2 == '=') {
702	shift(2);
703	return DIVEQUAL;
704	} else if (c1 == '&' && c2 == '=') {
705	shift(2);
706	return ANDEQUAL;
707	} else if (c1 == '^' && c2 == '=') {
708	shift(2);
709	return XOREQUAL;
710	} else if (c1 == '%' && c2 == '=') {
711	shift(2);
712	return MODEQUAL;
713	} else if (c1 == '|' && c2 == '=') {
714	shift(2);
715	return OREQUAL;
716	} else if (c1 == '<' && c2 == '<') {
717	shift(2);
718	return LSHIFT;
719	} else if (c1 == '>' && c2 == '>') {
720	shift(2);
721	return RSHIFT;
722	} else if (c1 == '&' && c2 == '&') {
723	shift(2);
724	return AND;
725	} else if (c1 == '|' && c2 == '|') {
726	shift(2);
727	return OR;
728	}
729
730	switch(c1) {
731	case '=':
732	case '>':
733	case '<':
734	case ',':
735	case '!':
736	case '~':
737	case '?':
738	case ':':
739	case '.':
740	case '+':
741	case '-':
742	case '*':
743	case '/':
744	case '&':
745	case '|':
746	case '^':
747	case '%':
748	case '(':
749	case ')':
750	case '[':
751	case ']':
752	case ';':
753	shift(1);
754	return static_cast<int>(c1);
755	case '{':
756	charPos = pos - 4;
757	shift(1);
758	return OPENBRACE;
759	case '}':
760	charPos = pos - 4;
761	shift(1);
762	return CLOSEBRACE;
763	default:
764	return -1;
765	}
766	}
767
768	unsigned short Lexer::singleEscape(unsigned short c)
769	{
770	switch(c) {
771	case 'b':
772	return 0x08;
773	case 't':
774	return 0x09;
775	case 'n':
776	return 0x0A;
777	case 'v':
778	return 0x0B;
779	case 'f':
780	return 0x0C;
781	case 'r':
782	return 0x0D;
783	case '"':
784	return 0x22;
785	case '\'':
786	return 0x27;
787	case '\\':
788	return 0x5C;
789	default:
790	return c;
791	}
792	}
793
794	unsigned short Lexer::convertOctal(int c1, int c2, int c3)
795	{
796	return static_cast<unsigned short>((c1 - '0') * 64 + (c2 - '0') * 8 + c3 - '0');
797	}
798
799	unsigned char Lexer::convertHex(int c)
800	{
801	if (c >= '0' && c <= '9')
802	return static_cast<unsigned char>(c - '0');
803	if (c >= 'a' && c <= 'f')
804	return static_cast<unsigned char>(c - 'a' + 10);
805	return static_cast<unsigned char>(c - 'A' + 10);
806	}
807
808	unsigned char Lexer::convertHex(int c1, int c2)
809	{
810	return ((convertHex(c1) << 4) + convertHex(c2));
811	}
812
813	UChar Lexer::convertUnicode(int c1, int c2, int c3, int c4)
814	{
815	unsigned char highByte = (convertHex(c1) << 4) + convertHex(c2);
816	unsigned char lowByte = (convertHex(c3) << 4) + convertHex(c4);
817	return (highByte << 8 | lowByte);
818	}
819
820	void Lexer::record8(int c)
821	{
822	ASSERT(c >= 0);
823	ASSERT(c <= 0xff);
824	m_buffer8.append(static_cast<char>(c));
825	}
826
827	void Lexer::record16(int c)
828	{
829	ASSERT(c >= 0);
830	ASSERT(c <= USHRT_MAX);
831	record16(UChar(static_cast<unsigned short>(c)));
832	}
833
834	void Lexer::record16(UChar c)
835	{
836	m_buffer16.append(c);
837	}
838
839	bool Lexer::scanRegExp()
840	{
841	m_buffer16.clear();
842	bool lastWasEscape = false;
843	bool inBrackets = false;
844
845	while (1) {
846	if (isLineTerminator() || current == -1)
847	return false;
848	else if (current != '/' || lastWasEscape == true || inBrackets == true)
849	{
850	// keep track of '[' and ']'
851	if (!lastWasEscape) {
852	if ( current == '[' && !inBrackets )
853	inBrackets = true;
854	if ( current == ']' && inBrackets )
855	inBrackets = false;
856	}
857	record16(current);
858	lastWasEscape =
859	!lastWasEscape && (current == '\\');
860	} else { // end of regexp
861	m_pattern = UString(m_buffer16);
862	m_buffer16.clear();
863	shift(1);
864	break;
865	}
866	shift(1);
867	}
868
869	while (isIdentPart(current)) {
870	record16(current);
871	shift(1);
872	}
873	m_flags = UString(m_buffer16);
874
875	return true;
876	}
877
878	void Lexer::clear()
879	{
880	deleteAllValues(m_strings);
881	Vector<UString*> newStrings;
882	newStrings.reserveCapacity(initialStringTableCapacity);
883	m_strings.swap(newStrings);
884
885	deleteAllValues(m_identifiers);
886	Vector<KJS::Identifier*> newIdentifiers;
887	newIdentifiers.reserveCapacity(initialStringTableCapacity);
888	m_identifiers.swap(newIdentifiers);
889
890	Vector<char> newBuffer8;
891	newBuffer8.reserveCapacity(initialReadBufferCapacity);
892	m_buffer8.swap(newBuffer8);
893
894	Vector<UChar> newBuffer16;
895	newBuffer16.reserveCapacity(initialReadBufferCapacity);
896	m_buffer16.swap(newBuffer16);
897
898	m_pattern = 0;
899	m_flags = 0;
900	}
901
902	Identifier* Lexer::makeIdentifier(const Vector<UChar>& buffer)
903	{
904	KJS::Identifier* identifier = new KJS::Identifier(buffer.data(), buffer.size());
905	m_identifiers.append(identifier);
906	return identifier;
907	}
908
909	UString* Lexer::makeUString(const Vector<UChar>& buffer)
910	{
911	UString* string = new UString(buffer);
912	m_strings.append(string);
913	return string;
914	}
915
916	} // namespace KJS
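
The four-character lookahead that Lexer::shift() maintains (current, next1, next2, next3, with byte-order marks skipped and -1 used past the end of input) can be tried in isolation with the sketch below. `LookaheadWindow` is a hypothetical class written for illustration; it uses `std::u16string` in place of the engine's SourceProvider and is not part of the file above.

```cpp
#include <cstddef>
#include <string>

// Hypothetical standalone version of the lexer's sliding lookahead window.
class LookaheadWindow {
public:
    explicit LookaheadWindow(const std::u16string& code)
        : m_code(code)
    {
        // Prime the window so that `current` holds the first character.
        shift(4);
    }

    // Advances the window by `count` characters.
    void shift(unsigned count)
    {
        while (count--) {
            current = next1;
            next1 = next2;
            next2 = next3;
            do {
                if (m_pos >= m_code.size()) {
                    ++m_pos;
                    next3 = -1; // end-of-input marker
                    break;
                }
                next3 = m_code[m_pos++];
            } while (next3 == 0xFEFF); // skip byte-order marks
        }
    }

    int current = 0;
    int next1 = 0;
    int next2 = 0;
    int next3 = 0;

private:
    std::u16string m_code;
    std::size_t m_pos = 0;
};
```

Because the window always holds four characters, multi-character tokens such as `>>>=` or `<!--` can be recognized by comparing the four members directly, without indexing back into the source buffer.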


source: webkit/trunk/JavaScriptCore/kjs/lexer.cpp@ 34273
