Timestamp:
Jul 2, 2010, 3:31:40 PM
Author:
[email protected]
Message:

2010-07-02 Oliver Hunt <[email protected]>

Reviewed by Geoffrey Garen.

Move BOM handling out of the lexer and parser
https://bugs.webkit.org/show_bug.cgi?id=41539

Doing the BOM stripping in the lexer meant that we could
end up having to strip the BOMs from a source multiple times.
To deal with this we now require all strings provided by
a SourceProvider to already have had the BOMs stripped.
This also simplifies some of the lexer logic.

  • parser/Lexer.cpp: (JSC::Lexer::setCode): (JSC::Lexer::sourceCode):
  • parser/SourceProvider.h: (JSC::SourceProvider::SourceProvider): (JSC::UStringSourceProvider::create): (JSC::UStringSourceProvider::getRange): (JSC::UStringSourceProvider::UStringSourceProvider):
  • wtf/text/StringImpl.h: (WebCore::StringImpl::copyStringWithoutBOMs):
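
The contract described above (a provider hands out text that is already BOM-free, so the lexer never has to strip it again) can be illustrated with a small standalone sketch. It uses `std::u16string` rather than WebKit's string classes, and `StrippedSourceProvider` and `kByteOrderMark` are hypothetical names, not part of the JSC::SourceProvider API.

```cpp
#include <algorithm>
#include <string>
#include <utility>

constexpr char16_t kByteOrderMark = 0xFEFF;

// Hypothetical stand-in for a source provider; not the JSC::SourceProvider API.
class StrippedSourceProvider {
public:
    explicit StrippedSourceProvider(std::u16string source)
        : m_source(std::move(source))
    {
        // Strip every U+FEFF once, at construction, so that no consumer
        // of source() ever has to repeat the work.
        m_source.erase(std::remove(m_source.begin(), m_source.end(), kByteOrderMark),
                       m_source.end());
    }

    const std::u16string& source() const { return m_source; }

private:
    std::u16string m_source;
};
```

With the stripping done in the constructor, any number of lexing passes over `source()` see the same BOM-free text.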

2010-07-02 Oliver Hunt <[email protected]>

Reviewed by Geoffrey Garen.

Move BOM handling out of the lexer and parser
https://bugs.webkit.org/show_bug.cgi?id=41539

Update WebCore to ensure that SourceProviders don't
produce strings with BOMs in them.

  • bindings/js/ScriptSourceProvider.h: (WebCore::ScriptSourceProvider::ScriptSourceProvider):
  • bindings/js/StringSourceProvider.h: (WebCore::StringSourceProvider::StringSourceProvider):
  • loader/CachedScript.cpp: (WebCore::CachedScript::CachedScript): (WebCore::CachedScript::script):
  • loader/CachedScript.h: (WebCore::CachedScript::): CachedScript now stores decoded data with the BOMs stripped, and caches the presence of BOMs across memory purges.
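
One way to read "caches the presence of BOMs across memory purges" is that the decoded text can be thrown away while a small flag about BOMs is retained, so a later re-decode knows what to expect. The sketch below is only an illustration of that idea under stated assumptions: `CachedText`, `BOMState`, and `purge()` are hypothetical names, the "decode" step is a plain copy, and none of this is taken from WebCore::CachedScript.

```cpp
#include <algorithm>
#include <string>
#include <utility>

// Hypothetical cache entry; names are illustrative, not WebCore's.
class CachedText {
public:
    explicit CachedText(std::u16string raw) : m_raw(std::move(raw)) { }

    const std::u16string& text()
    {
        if (!m_decoded) {
            m_text = m_raw; // stands in for decoding the raw bytes
            // Skip the strip pass only when an earlier decode found no BOMs.
            if (m_bomState != BOMState::Absent) {
                auto newEnd = std::remove(m_text.begin(), m_text.end(), char16_t(0xFEFF));
                m_bomState = newEnd == m_text.end() ? BOMState::Absent : BOMState::Present;
                m_text.erase(newEnd, m_text.end());
            }
            m_decoded = true;
        }
        return m_text;
    }

    // Drop the decoded copy to save memory; the BOM state is kept.
    void purge()
    {
        m_text.clear();
        m_text.shrink_to_fit();
        m_decoded = false;
    }

    bool knownToHaveBOMs() const { return m_bomState == BOMState::Present; }

private:
    enum class BOMState { Unknown, Present, Absent };

    std::u16string m_raw;
    std::u16string m_text;
    BOMState m_bomState = BOMState::Unknown;
    bool m_decoded = false;
};
```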
File:
1 edited

  • trunk/JavaScriptCore/wtf/text/StringImpl.h

    --- trunk/JavaScriptCore/wtf/text/StringImpl.h (r60332)
    +++ trunk/JavaScriptCore/wtf/text/StringImpl.h (r62410)
    @@ -258,4 +258,33 @@
         }
     
    +    PassRefPtr<StringImpl> copyStringWithoutBOMs(bool definitelyHasBOMs, bool& hasBOMs)
    +    {
    +        static const UChar byteOrderMark = 0xFEFF;
    +        size_t i = 0;
    +        if (!definitelyHasBOMs) {
    +            hasBOMs = false;
    +            // ECMA-262 calls for stripping all Cf characters, but we only strip BOM characters.
    +            // See <https://bugs.webkit.org/show_bug.cgi?id=4931> for details.
    +            for (; i < m_length; i++) {
    +                if (UNLIKELY(m_data[i] == byteOrderMark)) {
    +                    hasBOMs = true;
    +                    break;
    +                }
    +            }
    +            if (!hasBOMs)
    +                return this;
    +        }
    +        Vector<UChar> result;
    +        result.reserveInitialCapacity(m_length);
    +        for (; i < m_length; i++)
    +            result.append(m_data[i]);
    +        for (; i < m_length; i++) {
    +            UChar c = m_data[i];
    +            if (c != byteOrderMark)
    +                result.append(c);
    +        }
    +        return StringImpl::adopt(result);
    +    }
    +
         // Returns a StringImpl suitable for use on another thread.
         PassRefPtr<StringImpl> crossThreadString();
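
Read literally, the first copy loop in the hunk above starts at the index where the scan stopped rather than at 0. It therefore appears to copy the tail of the string, BOMs included, and to leave nothing for the filtering loop. The standalone sketch below shows what the two phases seem intended to do: copy the BOM-free prefix verbatim, then filter the remainder. It is written against `std::u16string`, and `copyWithoutBOMs` is a hypothetical free function, not the StringImpl member.

```cpp
#include <cstddef>
#include <string>

constexpr char16_t byteOrderMark = 0xFEFF;

// Hypothetical analogue of copyStringWithoutBOMs operating on std::u16string.
std::u16string copyWithoutBOMs(const std::u16string& input, bool definitelyHasBOMs, bool& hasBOMs)
{
    std::size_t firstBOM = 0;
    if (!definitelyHasBOMs) {
        // Scan phase: locate the first BOM, if there is one.
        firstBOM = input.find(byteOrderMark);
        hasBOMs = firstBOM != std::u16string::npos;
        if (!hasBOMs)
            return input; // nothing to strip
    }
    std::u16string result;
    result.reserve(input.size());
    // Everything before the first BOM is known to be clean: copy it verbatim.
    result.append(input, 0, firstBOM);
    // Filter phase: copy the rest, skipping BOM characters.
    for (std::size_t i = firstBOM; i < input.size(); ++i) {
        if (input[i] != byteOrderMark)
            result.push_back(input[i]);
    }
    return result;
}
```

As in the original signature, `hasBOMs` is only written when `definitelyHasBOMs` is false; a caller passing `true` is assumed to know the answer already.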