Changeset 65468 in webkit for trunk/JavaScriptCore/runtime


Ignore:
Timestamp:
Aug 16, 2010, 4:31:33 PM (15 years ago)
Author:
[email protected]
Message:

Bug 44080 - String find/reverseFind methods need tidying up
These methods have a couple of problems with their interface, and implementation.

Reviewed by Sam Weinig

These methods take and int index, and return an int - however this is problematic
since on 64-bit string indices may have a full 32-bit range. This spills out into
surrounding code, which unsafely casts string indices from unsigned to int. Code
checking the result of these methods check for a mix of "== -1", "< 0", and
"== notFound". Clean this up by changing these methods to take an unsigned
starting index, and return a size_t. with a failed match indicated by notFound.
reverseFind also has a special meaning for the starting index argument, in that a
negative index is interpreted as an offset back from the end of the string. Remove
this functionality, in the (1!) case where it is used we should just calculate the
offset by subtracting from the string's length.

The implementation has a few problems too. The code is not in webkit style, in
using assorted abbreviations in variable names, and implementations of similar
find methods with differing argument types were unnecessarily inconsistent. When
find is passed const char* data the string would be handled as latin1 (zero
extended to UTF-16) for all characters but the first; this is sign extended.
Case-insensitive find is broken for unicode strings; the hashing optimization is
not unicode safe, and could result in false negatives.

Unify UString find methods to match String.

JavaScriptCore:

(JSC::escapeQuotes):

  • bytecompiler/NodesCodegen.cpp:

(JSC::substitute):

  • runtime/JSString.cpp:

(JSC::JSString::replaceCharacter):

  • runtime/RegExp.cpp:

(JSC::RegExp::RegExp):

  • runtime/RegExpKey.h:

(JSC::RegExpKey::getFlagsValue):

  • runtime/StringPrototype.cpp:

(JSC::substituteBackreferencesSlow):
(JSC::substituteBackreferences):
(JSC::stringProtoFuncReplace):
(JSC::stringProtoFuncIndexOf):
(JSC::stringProtoFuncLastIndexOf):
(JSC::stringProtoFuncSplit):

  • runtime/UString.cpp:
  • runtime/UString.h:

(JSC::UString::find):
(JSC::UString::reverseFind):

  • wtf/text/AtomicString.h:

(WTF::AtomicString::find):

  • wtf/text/StringImpl.cpp:

(WTF::StringImpl::find):
(WTF::StringImpl::findCaseInsensitive):
(WTF::StringImpl::reverseFind):
(WTF::StringImpl::reverseFindCaseInsensitive):
(WTF::StringImpl::endsWith):
(WTF::StringImpl::replace):

  • wtf/text/StringImpl.h:

(WTF::StringImpl::startsWith):

  • wtf/text/WTFString.cpp:

(WTF::String::split):

  • wtf/text/WTFString.h:

(WTF::String::find):
(WTF::String::reverseFind):
(WTF::String::findCaseInsensitive):
(WTF::String::reverseFindCaseInsensitive):
(WTF::String::contains):
(WTF::find):
(WTF::reverseFind):

WebCore:

  • css/CSSSelector.cpp:

(WebCore::CSSSelector::RareData::parseNth):

  • css/CSSStyleDeclaration.cpp:

(WebCore::CSSStyleDeclaration::setProperty):

  • css/CSSStyleSelector.cpp:

(WebCore::CSSStyleSelector::SelectorChecker::checkOneSelector):

  • dom/Document.cpp:

(WebCore::Document::getImageMap):

  • editing/CompositeEditCommand.cpp:

(WebCore::CompositeEditCommand::inputText):

  • editing/InsertTextCommand.cpp:

(WebCore::InsertTextCommand::input):

  • editing/TextIterator.cpp:

(WebCore::TextIterator::handleTextBox):

  • editing/TypingCommand.cpp:

(WebCore::TypingCommand::insertText):

  • editing/markup.cpp:

(WebCore::fillContainerFromString):
(WebCore::createFragmentFromText):

  • html/File.cpp:

(WebCore::File::Init):

  • html/HTMLAnchorElement.cpp:

(WebCore::HTMLAnchorElement::setHost):

  • html/HTMLEmbedElement.cpp:

(WebCore::HTMLEmbedElement::parseMappedAttribute):

  • html/HTMLFormControlElement.cpp:

(WebCore::HTMLTextFormControlElement::isPlaceholderEmpty):

  • html/HTMLObjectElement.cpp:

(WebCore::HTMLObjectElement::parseMappedAttribute):

  • inspector/InspectorDOMAgent.cpp:

(WebCore::InspectorDOMAgent::performSearch):

  • loader/CrossOriginPreflightResultCache.cpp:

(WebCore::parseAccessControlAllowList):

  • loader/MainResourceLoader.cpp:

(WebCore::MainResourceLoader::substituteMIMETypeFromPluginDatabase):

  • loader/appcache/ApplicationCacheStorage.cpp:

(WebCore::parseHeader):
(WebCore::parseHeaders):

  • loader/icon/IconFetcher.cpp:

(WebCore::parseIconLink):

  • page/DOMWindow.cpp:

(WebCore::DOMWindow::parseModalDialogFeatures):

  • page/SecurityOrigin.cpp:

(WebCore::SecurityOrigin::createFromDatabaseIdentifier):

  • page/UserContentURLPattern.cpp:

(WebCore::UserContentURLPattern::parse):

  • page/XSSAuditor.cpp:

(WebCore::XSSAuditor::findInRequest):

  • platform/ContentType.cpp:

(WebCore::ContentType::parameter):
(WebCore::ContentType::type):

  • platform/KURL.cpp:

(WebCore::KURL::lastPathComponent):
(WebCore::KURL::setProtocol):
(WebCore::decodeURLEscapeSequences):
(WebCore::substituteBackslashes):
(WebCore::mimeTypeFromDataURL):

  • platform/Length.cpp:

(WebCore::newCoordsArray):
(WebCore::newLengthArray):

  • platform/LinkHash.cpp:

(WebCore::findSlashDotDotSlash):
(WebCore::findSlashSlash):
(WebCore::findSlashDotSlash):
(WebCore::cleanPath):

  • platform/MIMETypeRegistry.cpp:

(WebCore::MIMETypeRegistry::getMIMETypeForPath):

  • platform/SchemeRegistry.cpp:

(WebCore::SchemeRegistry::shouldTreatURLAsLocal):

  • platform/graphics/MediaPlayer.cpp:

(WebCore::MediaPlayer::load):

  • platform/mac/DragImageMac.mm:

(WebCore::createDragImageIconForCachedImage):

  • platform/network/CredentialStorage.cpp:

(WebCore::protectionSpaceMapKeyFromURL):
(WebCore::findDefaultProtectionSpaceForURL):

  • platform/network/HTTPParsers.cpp:

(WebCore::skipWhiteSpace):
(WebCore::skipToken):
(WebCore::parseHTTPRefresh):
(WebCore::filenameFromHTTPContentDisposition):
(WebCore::findCharsetInMediaType):
(WebCore::parseXSSProtectionHeader):
(WebCore::extractReasonPhraseFromHTTPStatusLine):

  • platform/network/ResourceResponseBase.cpp:

(WebCore::ResourceResponseBase::isAttachment):
(WebCore::parseCacheHeader):

  • rendering/RenderEmbeddedObject.cpp:

(WebCore::RenderEmbeddedObject::updateWidget):

  • storage/Entry.cpp:

(WebCore::Entry::Entry):

  • svg/SVGFont.cpp:

(WebCore::isCompatibleGlyph):

  • svg/SVGURIReference.cpp:

(WebCore::SVGURIReference::getTarget):

  • svg/animation/SVGSMILElement.cpp:

(WebCore::SVGSMILElement::parseClockValue):
(WebCore::SVGSMILElement::parseCondition):

  • xml/XPathFunctions.cpp:

(WebCore::XPath::FunSubstringBefore::evaluate):
(WebCore::XPath::FunSubstringAfter::evaluate):
(WebCore::XPath::FunTranslate::evaluate):
(WebCore::XPath::FunLang::evaluate):

  • xml/XPathParser.cpp:

(WebCore::XPath::Parser::expandQName):

Location:
trunk/JavaScriptCore/runtime
Files:
6 edited

Legend:

Unmodified
Added
Removed
  • trunk/JavaScriptCore/runtime/JSString.cpp

    r65177 r65468  
    109109{
    110110    if (!isRope()) {
    111         unsigned matchPosition = m_value.find(character);
    112         if (matchPosition == UString::NotFound)
     111        size_t matchPosition = m_value.find(character);
     112        if (matchPosition == notFound)
    113113            return JSValue(this);
    114114        return jsString(exec, m_value.substr(0, matchPosition), replacement, m_value.substr(matchPosition + 1));
     
    120120    size_t fiberCount = 0;
    121121    StringImpl* matchString = 0;
    122     int matchPosition = -1;
     122    size_t matchPosition = notFound;
    123123    for (RopeIterator it(m_other.m_fibers.data(), m_fiberCount); it != end; ++it) {
    124124        ++fiberCount;
     
    128128        StringImpl* string = *it;
    129129        matchPosition = string->find(character);
    130         if (matchPosition == -1)
     130        if (matchPosition == notFound)
    131131            continue;
    132132        matchString = string;
  • trunk/JavaScriptCore/runtime/RegExp.cpp

    r65188 r65468  
    5757    // String::match and RegExpObject::match.
    5858    if (!flags.isNull()) {
    59         if (flags.find('g') != UString::NotFound)
     59        if (flags.find('g') != notFound)
    6060            m_flagBits |= Global;
    61         if (flags.find('i') != UString::NotFound)
     61        if (flags.find('i') != notFound)
    6262            m_flagBits |= IgnoreCase;
    63         if (flags.find('m') != UString::NotFound)
     63        if (flags.find('m') != notFound)
    6464            m_flagBits |= Multiline;
    6565    }
  • trunk/JavaScriptCore/runtime/RegExpKey.h

    r65286 r65468  
    7474    {
    7575        flagsValue = 0;
    76         if (flags.find('g') != UString::NotFound)
     76        if (flags.find('g') != notFound)
    7777            flagsValue += 4;
    78         if (flags.find('i') != UString::NotFound)
     78        if (flags.find('i') != notFound)
    7979            flagsValue += 2;
    80         if (flags.find('m') != UString::NotFound)
     80        if (flags.find('m') != notFound)
    8181            flagsValue += 1;
    8282        return flagsValue;
  • trunk/JavaScriptCore/runtime/StringPrototype.cpp

    r65177 r65468  
    152152// ------------------------------ Functions --------------------------
    153153
    154 static NEVER_INLINE UString substituteBackreferencesSlow(const UString& replacement, const UString& source, const int* ovector, RegExp* reg, unsigned i)
     154static NEVER_INLINE UString substituteBackreferencesSlow(const UString& replacement, const UString& source, const int* ovector, RegExp* reg, size_t i)
    155155{
    156156    Vector<UChar> substitutedReplacement;
     
    208208        offset = i + 1;
    209209        substitutedReplacement.append(source.characters() + backrefStart, backrefLength);
    210     } while ((i = replacement.find('$', i + 1)) != UString::NotFound);
     210    } while ((i = replacement.find('$', i + 1)) != notFound);
    211211
    212212    if (replacement.length() - offset)
     
    219219static inline UString substituteBackreferences(const UString& replacement, const UString& source, const int* ovector, RegExp* reg)
    220220{
    221     unsigned i = replacement.find('$', 0);
    222     if (UNLIKELY(i != UString::NotFound))
     221    size_t i = replacement.find('$', 0);
     222    if (UNLIKELY(i != notFound))
    223223        return substituteBackreferencesSlow(replacement, source, ovector, reg, i);
    224224    return replacement;
     
    430430   
    431431    const UString& source = sourceVal->value(exec);
    432     unsigned matchPos = source.find(patternString);
    433 
    434     if (matchPos == UString::NotFound)
     432    size_t matchPos = source.find(patternString);
     433
     434    if (matchPos == notFound)
    435435        return JSValue::encode(sourceVal);
    436436
     
    543543    }
    544544
    545     unsigned result = s.find(u2, pos);
    546     if (result == UString::NotFound)
     545    size_t result = s.find(u2, pos);
     546    if (result == notFound)
    547547        return JSValue::encode(jsNumber(exec, -1));
    548548    return JSValue::encode(jsNumber(exec, result));
     
    572572#endif
    573573
    574     unsigned result = s.rfind(u2, static_cast<unsigned>(dpos));
    575     if (result == UString::NotFound)
     574    size_t result = s.reverseFind(u2, static_cast<unsigned>(dpos));
     575    if (result == notFound)
    576576        return JSValue::encode(jsNumber(exec, -1));
    577577    return JSValue::encode(jsNumber(exec, result));
     
    737737                result->put(exec, i++, jsSingleCharacterSubstring(exec, s, p0++));
    738738        } else {
    739             unsigned pos;
    740            
    741             while (i != limit && (pos = s.find(u2, p0)) != UString::NotFound) {
     739            size_t pos;
     740            while (i != limit && (pos = s.find(u2, p0)) != notFound) {
    742741                result->put(exec, i++, jsSubstring(exec, s, p0, pos - p0));
    743742                p0 = pos + u2.length();
  • trunk/JavaScriptCore/runtime/UString.cpp

    r65344 r65468  
    414414}
    415415
    416 unsigned UString::find(const UString& f, unsigned pos) const
    417 {
    418     unsigned fsz = f.length();
    419 
    420     if (fsz == 1) {
    421         UChar ch = f[0];
    422         const UChar* end = characters() + length();
    423         for (const UChar* c = characters() + pos; c < end; c++) {
    424             if (*c == ch)
    425                 return static_cast<unsigned>(c - characters());
    426         }
    427         return NotFound;
    428     }
    429 
    430     unsigned sz = length();
    431     if (sz < fsz)
    432         return NotFound;
    433     if (fsz == 0)
    434         return pos;
    435     const UChar* end = characters() + sz - fsz;
    436     unsigned fsizeminusone = (fsz - 1) * sizeof(UChar);
    437     const UChar* fdata = f.characters();
    438     unsigned short fchar = fdata[0];
    439     ++fdata;
    440     for (const UChar* c = characters() + pos; c <= end; c++) {
    441         if (c[0] == fchar && !memcmp(c + 1, fdata, fsizeminusone))
    442             return static_cast<unsigned>(c - characters());
    443     }
    444 
    445     return NotFound;
    446 }
    447 
    448 unsigned UString::find(UChar ch, unsigned pos) const
    449 {
    450     const UChar* end = characters() + length();
    451     for (const UChar* c = characters() + pos; c < end; c++) {
    452         if (*c == ch)
    453             return static_cast<unsigned>(c - characters());
    454     }
    455 
    456     return NotFound;
    457 }
    458 
    459 unsigned UString::rfind(const UString& f, unsigned pos) const
    460 {
    461     unsigned sz = length();
    462     unsigned fsz = f.length();
    463     if (sz < fsz)
    464         return NotFound;
    465     if (pos > sz - fsz)
    466         pos = sz - fsz;
    467     if (fsz == 0)
    468         return pos;
    469     unsigned fsizeminusone = (fsz - 1) * sizeof(UChar);
    470     const UChar* fdata = f.characters();
    471     for (const UChar* c = characters() + pos; c >= characters(); c--) {
    472         if (*c == *fdata && !memcmp(c + 1, fdata + 1, fsizeminusone))
    473             return static_cast<unsigned>(c - characters());
    474     }
    475 
    476     return NotFound;
    477 }
    478 
    479 unsigned UString::rfind(UChar ch, unsigned pos) const
    480 {
    481     if (isEmpty())
    482         return NotFound;
    483     if (pos + 1 >= length())
    484         pos = length() - 1;
    485     for (const UChar* c = characters() + pos; c >= characters(); c--) {
    486         if (*c == ch)
    487             return static_cast<unsigned>(c - characters());
    488     }
    489 
    490     return NotFound;
    491 }
    492 
    493416UString UString::substr(unsigned pos, unsigned len) const
    494417{
  • trunk/JavaScriptCore/runtime/UString.h

    r65344 r65468  
    105105    static UString number(double);
    106106
    107 
     107    // Find a single character or string, also with match function & latin1 forms.
     108    size_t find(UChar c, unsigned start = 0) const
     109        { return m_impl ? m_impl->find(c, start) : notFound; }
     110    size_t find(const UString& str, unsigned start = 0) const
     111        { return m_impl ? m_impl->find(str.impl(), start) : notFound; }
     112    size_t find(const char* str, unsigned start = 0) const
     113        { return m_impl ? m_impl->find(str, start) : notFound; }
     114
     115    // Find the last instance of a single character or string.
     116    size_t reverseFind(UChar c, unsigned start = UINT_MAX) const
     117        { return m_impl ? m_impl->reverseFind(c, start) : notFound; }
     118    size_t reverseFind(const UString& str, unsigned start = UINT_MAX) const
     119        { return m_impl ? m_impl->reverseFind(str.impl(), start) : notFound; }
    108120
    109121    double toDouble(bool tolerateTrailingJunk, bool tolerateEmptyString) const;
     
    115127    uint32_t toStrictUInt32(bool* ok = 0) const;
    116128
    117     static const unsigned NotFound = 0xFFFFFFFFu;
    118     unsigned find(const UString& f, unsigned pos = 0) const;
    119     unsigned find(UChar, unsigned pos = 0) const;
    120     unsigned rfind(const UString& f, unsigned pos) const;
    121     unsigned rfind(UChar, unsigned pos) const;
    122 
    123     UString substr(unsigned pos = 0, unsigned len = 0xFFFFFFFF) const;
     129    UString substr(unsigned pos = 0, unsigned len = UINT_MAX) const;
    124130
    125131private:
Note: See TracChangeset for help on using the changeset viewer.