[Webkit-unassigned] [Bug 50038] Add word-prefix search options to the text search API

bugzilla-daemon at webkit.org bugzilla-daemon at webkit.org
Mon Nov 29 10:46:14 PST 2010


https://bugs.webkit.org/show_bug.cgi?id=50038


mitz at webkit.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #74790|0                           |1
        is obsolete|                            |
  Attachment #75040|                            |review?
               Flag|                            |




--- Comment #6 from mitz at webkit.org  2010-11-29 10:46:14 PST ---
Created an attachment (id=75040)
 --> (https://bugs.webkit.org/attachment.cgi?id=75040&action=review)
WebCore (ICU code path only) implementation, WebKit/mac WebKit2 API, DumpRenderTree/mac test support and tests

(In reply to comment #5)
> (From update of attachment 74790 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=74790&action=review
> 
> r=me, assuming you get this to build on all platforms before landing it

Thanks for the review. This version includes more build system changes.

> 
> > WebCore/editing/Editor.h:333
> > +    bool findString(const String&, FindOptions);
> >      bool findString(const String&, bool forward, bool caseFlag, bool wrapFlag, bool startInSelection);
> 
> Longer term it might be nice to return the version with four bool arguments. Maybe even put a FIXME reminding us that we want to do that?

Done.

> 
> > WebCore/editing/FindOptions.h:34
> > +    TreatMedialCapAsWordStart = 1 << 2,
> 
> Might want a comment nearby explaining what medial capitals are. Why the abbreviation “cap” for capital?

Added a comment. Changed  to “capital”.

> 
> > WebCore/editing/TextIterator.cpp:1841
> > +    static const char latin1SeparatorTable[256] = {
> 
> Is char the optimal type here? If bool is not good on some platforms, perhaps unsigned char is slightly better than char?

Changed to bool but left the values as 0s and 1s.

> 
> > WebCore/editing/TextIterator.cpp:1863
> > +    return U_GET_GC_MASK(character) & (U_GC_S_MASK | U_GC_P_MASK | U_GC_Z_MASK | U_GC_CF_MASK);
> 
> Should we make this part non-inline?

Done, although the name might not be ideal.

> 
> > WebCore/editing/TextIterator.cpp:1923
> > +        if (m_prefixLength)
> > +            m_prefixLength -= min(m_prefixLength, m_buffer.size() - m_overlap);
> 
> I don’t think you need the if here. The min already has a suitable branch. Unless the if is a performance optimization.

Removed the if for now.

> 
> > WebCore/editing/TextIterator.cpp:2044
> > +            U16_FWD_1(m_buffer.data(), offset, size);
> 
> I suspect the non-ICU-using platforms don’t have this macro. You may need to beef up the header they use to get some of the ICU macros.

Done.

> 
> > WebCore/editing/TextIterator.cpp:2062
> > +    size_t wordBreakSearchStart = start + length;
> > +    while (wordBreakSearchStart > start)
> > +        wordBreakSearchStart = findNextWordFromIndex(m_buffer.data(), m_buffer.size(), wordBreakSearchStart, false /* backwards */);
> > +    return wordBreakSearchStart == start;
> 
> Is this sufficiently efficient when called repeatedly?

Don’t know.

> 
> > WebCore/editing/TextIterator.cpp:2109
> > +        if (m_prefixLength)
> > +            m_prefixLength -= min(m_prefixLength, size - overlap);
> 
> Again, I suspect the if is not needed.

Removed the if.

> 
> > WebCore/editing/TextIterator.cpp:2128
> > +    if (m_prefixLength)
> > +        m_prefixLength -= min<size_t>(m_prefixLength, matchStart + 1);
> 
> Yet again, the if is not necessary.

Removed it.

> 
> > WebCore/editing/TextIterator.cpp:2469
> > +            int wordBoundaryContextStart = backwardsIterator.length();
> > +            U16_BACK_1(backwardsIterator.characters(), 0, wordBoundaryContextStart);
> > +            wordBoundaryContextStart = startOfLastWordBoundaryContext(backwardsIterator.characters(), wordBoundaryContextStart);
> > +            int contextLength = backwardsIterator.length() - wordBoundaryContextStart;
> 
> Is int really the right type here?

I deferred to startOfLastWordBoundary and SimplifiedBackwardsTextIterator::length().

> 
> > WebCore/editing/TextIterator.cpp:2515
> > -    {
> > -        CharacterIterator findIterator(range, TextIteratorEntersTextControls);
> > -        matchLength = findPlainText(findIterator, target, forward, caseSensitive, matchStart);
> > -        if (!matchLength)
> > -            return collapsedToBoundary(range, forward);
> > -    }
> > +
> > +    CharacterIterator findIterator(range, TextIteratorEntersTextControls);
> > +    matchLength = findPlainText(findIterator, target, options, matchStart);
> > +    if (!matchLength)
> > +        return collapsedToBoundary(range, !(options & Backwards));
> 
> You dropped the braces that make the character iterator fall out of scope early. Why? Was that not a helpful optimization?

I thought I was cleaning up and didn’t realize the performance implication. Reinstated the braces.

> 
> Another way to accomplish the same thing is to use a temporary for the character iterator instead of a named local.
> 
> > WebCore/editing/TextIterator.h:62
> >  String plainText(const Range*, TextIteratorBehavior defaultBehavior = TextIteratorDefaultBehavior);
> >  UChar* plainTextToMallocAllocatedBuffer(const Range*, unsigned& bufferLength, bool isDisplayString, TextIteratorBehavior defaultBehavior = TextIteratorDefaultBehavior);
> 
> We should remove that misleading argument name.

Done.

> 
> > WebCore/editing/TextIterator.h:64
> > +PassRefPtr<Range> findPlainText(const Range*, const String&, FindOptions);
> >  PassRefPtr<Range> findPlainText(const Range*, const String&, bool forward, bool caseSensitive);
> 
> As above, should we consider deprecating and then removing the bool version?

Yes. Added comments to that effect.

> 
> > WebCore/page/Page.h:205
> > +        bool findString(const String&, FindOptions);
> >          bool findString(const String&, TextCaseSensitivity, FindDirection, bool shouldWrap);
> > +        unsigned int markAllMatchesForText(const String&, FindOptions, bool shouldHighlight, unsigned);
> >          unsigned int markAllMatchesForText(const String&, TextCaseSensitivity, bool shouldHighlight, unsigned);
> 
> Should we deprecate the TextCaseSensitivity versions?

Ditto.

> 
> > LayoutTests/editing/text-iterator/findString.html:12
> > +        log("Searching for \u2018" + target + "\u2019 " + (text.length <= 64 ? "in \u2018" + text + "\u2019 " : "in long string ") + "with options [" + options.join(", ") + "]:");
> 
> Why not use straight quotes instead of curly?

Tough question.

-- 
Configure bugmail: https://bugs.webkit.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.


More information about the webkit-unassigned mailing list