[webkit-reviews] review granted: [Bug 50038] Add word-prefix search options to the text search API : [Attachment 74790] WebCore (ICU code path only) implementation, WebKit/mac WebKit2 API, DumpRenderTree/mac test support and tests
bugzilla-daemon at webkit.org
bugzilla-daemon at webkit.org
Sat Nov 27 18:00:56 PST 2010
Darin Adler <darin at apple.com> has granted mitz at webkit.org's request for review:
Bug 50038: Add word-prefix search options to the text search API
https://bugs.webkit.org/show_bug.cgi?id=50038
Attachment 74790: WebCore (ICU code path only) implementation, WebKit/mac
WebKit2 API, DumpRenderTree/mac test support and tests
https://bugs.webkit.org/attachment.cgi?id=74790&action=review
------- Additional Comments from Darin Adler <darin at apple.com>
View in context: https://bugs.webkit.org/attachment.cgi?id=74790&action=review
r=me, assuming you get this to build on all platforms before landing it
> WebCore/editing/Editor.h:333
> + bool findString(const String&, FindOptions);
> bool findString(const String&, bool forward, bool caseFlag, bool
wrapFlag, bool startInSelection);
Longer term it might be nice to return the version with four bool arguments.
Maybe even put a FIXME reminding us that we want to do that?
> WebCore/editing/FindOptions.h:34
> + TreatMedialCapAsWordStart = 1 << 2,
Might want a comment nearby explaining what medial capitals are. Why the
abbreviation “cap” for capital?
> WebCore/editing/TextIterator.cpp:1841
> + static const char latin1SeparatorTable[256] = {
Is char the optimal type here? If bool is not good on some platforms, perhaps
unsigned char is slightly better than char?
> WebCore/editing/TextIterator.cpp:1863
> + return U_GET_GC_MASK(character) & (U_GC_S_MASK | U_GC_P_MASK |
U_GC_Z_MASK | U_GC_CF_MASK);
Should we make this part non-inline?
> WebCore/editing/TextIterator.cpp:1923
> + if (m_prefixLength)
> + m_prefixLength -= min(m_prefixLength, m_buffer.size() -
m_overlap);
I don’t think you need the if here. The min already has a suitable branch.
Unless the if is a performance optimization.
> WebCore/editing/TextIterator.cpp:2044
> + U16_FWD_1(m_buffer.data(), offset, size);
I suspect the non-ICU-using platforms don’t have this macro. You may need to
beef up the header they use to get some of the ICU macros.
> WebCore/editing/TextIterator.cpp:2062
> + size_t wordBreakSearchStart = start + length;
> + while (wordBreakSearchStart > start)
> + wordBreakSearchStart = findNextWordFromIndex(m_buffer.data(),
m_buffer.size(), wordBreakSearchStart, false /* backwards */);
> + return wordBreakSearchStart == start;
Is this sufficiently efficient when called repeatedly?
> WebCore/editing/TextIterator.cpp:2109
> + if (m_prefixLength)
> + m_prefixLength -= min(m_prefixLength, size - overlap);
Again, I suspect the if is not needed.
> WebCore/editing/TextIterator.cpp:2128
> + if (m_prefixLength)
> + m_prefixLength -= min<size_t>(m_prefixLength, matchStart + 1);
Yet again, the if is not necessary.
> WebCore/editing/TextIterator.cpp:2469
> + int wordBoundaryContextStart = backwardsIterator.length();
> + U16_BACK_1(backwardsIterator.characters(), 0,
wordBoundaryContextStart);
> + wordBoundaryContextStart =
startOfLastWordBoundaryContext(backwardsIterator.characters(),
wordBoundaryContextStart);
> + int contextLength = backwardsIterator.length() -
wordBoundaryContextStart;
Is int really the right type here?
> WebCore/editing/TextIterator.cpp:2515
> - {
> - CharacterIterator findIterator(range,
TextIteratorEntersTextControls);
> - matchLength = findPlainText(findIterator, target, forward,
caseSensitive, matchStart);
> - if (!matchLength)
> - return collapsedToBoundary(range, forward);
> - }
> +
> + CharacterIterator findIterator(range, TextIteratorEntersTextControls);
> + matchLength = findPlainText(findIterator, target, options, matchStart);
> + if (!matchLength)
> + return collapsedToBoundary(range, !(options & Backwards));
You dropped the braces that make the character iterator fall out of scope
early. Why? Was that not a helpful optimization?
Another way to accomplish the same thing is to use a temporary for the
character iterator instead of a named local.
> WebCore/editing/TextIterator.h:62
> String plainText(const Range*, TextIteratorBehavior defaultBehavior =
TextIteratorDefaultBehavior);
> UChar* plainTextToMallocAllocatedBuffer(const Range*, unsigned&
bufferLength, bool isDisplayString, TextIteratorBehavior defaultBehavior =
TextIteratorDefaultBehavior);
We should remove that misleading argument name.
> WebCore/editing/TextIterator.h:64
> +PassRefPtr<Range> findPlainText(const Range*, const String&, FindOptions);
> PassRefPtr<Range> findPlainText(const Range*, const String&, bool forward,
bool caseSensitive);
As above, should we consider deprecating and then removing the bool version?
> WebCore/page/Page.h:205
> + bool findString(const String&, FindOptions);
> bool findString(const String&, TextCaseSensitivity, FindDirection,
bool shouldWrap);
> + unsigned int markAllMatchesForText(const String&, FindOptions, bool
shouldHighlight, unsigned);
> unsigned int markAllMatchesForText(const String&,
TextCaseSensitivity, bool shouldHighlight, unsigned);
Should we deprecate the TextCaseSensitivity versions?
> LayoutTests/editing/text-iterator/findString.html:12
> + log("Searching for \u2018" + target + "\u2019 " + (text.length <= 64
? "in \u2018" + text + "\u2019 " : "in long string ") + "with options [" +
options.join(", ") + "]:");
Why not use straight quotes instead of curly?
More information about the webkit-reviews
mailing list