[webkit-reviews] review granted: [Bug 50038] Add word-prefix search options to the text search API : [Attachment 74790] WebCore (ICU code path only) implementation, WebKit/mac WebKit2 API, DumpRenderTree/mac test support and tests

bugzilla-daemon at webkit.org bugzilla-daemon at webkit.org
Sat Nov 27 18:00:56 PST 2010


Darin Adler <darin at apple.com> has granted mitz at webkit.org's request for review:
Bug 50038: Add word-prefix search options to the text search API
https://bugs.webkit.org/show_bug.cgi?id=50038

Attachment 74790: WebCore (ICU code path only) implementation, WebKit/mac
WebKit2 API, DumpRenderTree/mac test support and tests
https://bugs.webkit.org/attachment.cgi?id=74790&action=review

------- Additional Comments from Darin Adler <darin at apple.com>
View in context: https://bugs.webkit.org/attachment.cgi?id=74790&action=review

r=me, assuming you get this to build on all platforms before landing it

> WebCore/editing/Editor.h:333
> +    bool findString(const String&, FindOptions);
>      bool findString(const String&, bool forward, bool caseFlag, bool
wrapFlag, bool startInSelection);

Longer term it might be nice to return the version with four bool arguments.
Maybe even put a FIXME reminding us that we want to do that?

> WebCore/editing/FindOptions.h:34
> +    TreatMedialCapAsWordStart = 1 << 2,

Might want a comment nearby explaining what medial capitals are. Why the
abbreviation “cap” for capital?

> WebCore/editing/TextIterator.cpp:1841
> +    static const char latin1SeparatorTable[256] = {

Is char the optimal type here? If bool is not good on some platforms, perhaps
unsigned char is slightly better than char?

> WebCore/editing/TextIterator.cpp:1863
> +    return U_GET_GC_MASK(character) & (U_GC_S_MASK | U_GC_P_MASK |
U_GC_Z_MASK | U_GC_CF_MASK);

Should we make this part non-inline?

> WebCore/editing/TextIterator.cpp:1923
> +	   if (m_prefixLength)
> +	       m_prefixLength -= min(m_prefixLength, m_buffer.size() -
m_overlap);

I don’t think you need the if here. The min already has a suitable branch.
Unless the if is a performance optimization.

> WebCore/editing/TextIterator.cpp:2044
> +	       U16_FWD_1(m_buffer.data(), offset, size);

I suspect the non-ICU-using platforms don’t have this macro. You may need to
beef up the header they use to get some of the ICU macros.

> WebCore/editing/TextIterator.cpp:2062
> +    size_t wordBreakSearchStart = start + length;
> +    while (wordBreakSearchStart > start)
> +	   wordBreakSearchStart = findNextWordFromIndex(m_buffer.data(),
m_buffer.size(), wordBreakSearchStart, false /* backwards */);
> +    return wordBreakSearchStart == start;

Is this sufficiently efficient when called repeatedly?

> WebCore/editing/TextIterator.cpp:2109
> +	   if (m_prefixLength)
> +	       m_prefixLength -= min(m_prefixLength, size - overlap);

Again, I suspect the if is not needed.

> WebCore/editing/TextIterator.cpp:2128
> +    if (m_prefixLength)
> +	   m_prefixLength -= min<size_t>(m_prefixLength, matchStart + 1);

Yet again, the if is not necessary.

> WebCore/editing/TextIterator.cpp:2469
> +	       int wordBoundaryContextStart = backwardsIterator.length();
> +	       U16_BACK_1(backwardsIterator.characters(), 0,
wordBoundaryContextStart);
> +	       wordBoundaryContextStart =
startOfLastWordBoundaryContext(backwardsIterator.characters(),
wordBoundaryContextStart);
> +	       int contextLength = backwardsIterator.length() -
wordBoundaryContextStart;

Is int really the right type here?

> WebCore/editing/TextIterator.cpp:2515
> -    {
> -	   CharacterIterator findIterator(range,
TextIteratorEntersTextControls);
> -	   matchLength = findPlainText(findIterator, target, forward,
caseSensitive, matchStart);
> -	   if (!matchLength)
> -	       return collapsedToBoundary(range, forward);
> -    }
> +
> +    CharacterIterator findIterator(range, TextIteratorEntersTextControls);
> +    matchLength = findPlainText(findIterator, target, options, matchStart);
> +    if (!matchLength)
> +	   return collapsedToBoundary(range, !(options & Backwards));

You dropped the braces that make the character iterator fall out of scope
early. Why? Was that not a helpful optimization?

Another way to accomplish the same thing is to use a temporary for the
character iterator instead of a named local.

> WebCore/editing/TextIterator.h:62
>  String plainText(const Range*, TextIteratorBehavior defaultBehavior =
TextIteratorDefaultBehavior);
>  UChar* plainTextToMallocAllocatedBuffer(const Range*, unsigned&
bufferLength, bool isDisplayString, TextIteratorBehavior defaultBehavior =
TextIteratorDefaultBehavior);

We should remove that misleading argument name.

> WebCore/editing/TextIterator.h:64
> +PassRefPtr<Range> findPlainText(const Range*, const String&, FindOptions);
>  PassRefPtr<Range> findPlainText(const Range*, const String&, bool forward,
bool caseSensitive);

As above, should we consider deprecating and then removing the bool version?

> WebCore/page/Page.h:205
> +	   bool findString(const String&, FindOptions);
>	   bool findString(const String&, TextCaseSensitivity, FindDirection,
bool shouldWrap);
> +	   unsigned int markAllMatchesForText(const String&, FindOptions, bool
shouldHighlight, unsigned);
>	   unsigned int markAllMatchesForText(const String&,
TextCaseSensitivity, bool shouldHighlight, unsigned);

Should we deprecate the TextCaseSensitivity versions?

> LayoutTests/editing/text-iterator/findString.html:12
> +	   log("Searching for \u2018" + target + "\u2019 " + (text.length <= 64
? "in \u2018" + text + "\u2019 " : "in long string ") + "with options [" +
options.join(", ") + "]:");

Why not use straight quotes instead of curly?


More information about the webkit-reviews mailing list