[webkit-reviews] review granted: [Bug 107337] BackgroundHTMLParser should be able to atomize well-known strings : [Attachment 192152] Patch
bugzilla-daemon at webkit.org
bugzilla-daemon at webkit.org
Fri Mar 8 11:08:48 PST 2013
Adam Barth <abarth at webkit.org> has granted Eric Seidel <eric at webkit.org>'s
request for review:
Bug 107337: BackgroundHTMLParser should be able to atomize well-known strings
https://bugs.webkit.org/show_bug.cgi?id=107337
Attachment 192152: Patch
https://bugs.webkit.org/attachment.cgi?id=192152&action=review
------- Additional Comments from Adam Barth <abarth at webkit.org>
View in context: https://bugs.webkit.org/attachment.cgi?id=192152&action=review
> Source/WebCore/ChangeLog:20
> + median= 443.726500002 ms, stdev= 7.25002679952 ms, min=
430.244000047 ms, max= 455.511000007 ms
> + to:
> + median= 427.849500004 ms, stdev= 9.96967058292 ms, min=
417.914000049 ms, max= 461.528000014 ms
You're reporting the median, but not the mean. We usually work with the mean.
> Source/WebCore/html/parser/BackgroundHTMLParser.cpp:125
> + // FIXME: Using CaseFoldingHash::equal instead of equalIgnoringCase, as
equalIgnoringCase
> + // wants non-const StringImpl* (even though it never modifies them).
Maybe we should change equalIgnoringCase to use a const StringImpl* ?
> Source/WebCore/html/parser/HTMLIdentifier.cpp:49
> + return table;
Should we ASSERT(isMainThread() || !table.isEmpty()) ?
> Source/WebCore/html/parser/HTMLIdentifier.cpp:126
> + // We expect some hash collisions, but only for identical strings.
> + // Since all of these names are AtomicStrings pointers should be
equal.
> + ASSERT_UNUSED(addResult, !addResult.isNewEntry || name ==
addResult.iterator->value.second);
Is this just dumb luck? What happens when we happen to have two HTML tag names
with the same hash value? We should at least have a comment here that explain
what to do when that happens.
> Source/WebCore/html/parser/HTMLIdentifier.cpp:133
> + // This is safe to call from any thread, but must be called before
> + // using HTMLIdentifier.
I would just ASSERT(isMainThread()) since that's easier and how we plan to use
it anyway.
> Source/WebCore/html/parser/HTMLIdentifier.h:51
> + if (vector.size() <= maxNameLength) {
Why not put this branch into findIndex? That would simply this code
dramatically.
> Source/WebCore/html/parser/HTMLIdentifier.h:62
> + if (width == Likely8Bit)
> + m_string = StringImpl::create8BitIfPossible(vector);
> + else if (width == Force8Bit)
> + m_string = String::make8BitFrom16BitSource(vector);
> + else
> + m_string = String(vector);
This work should probably be done by String. It's useful in other places.
After these two changes, this function just becomes:
m_index = findIndex(vector.data(), vector.size());
if (m_index != invalidIndex)
return;
m_string = String(vector, width);
> Source/WebCore/html/parser/HTMLParserIdioms.cpp:286
> + // Note: This hash compare is not correct for all possible StringImpls,
but
> + // since we're always comparing against constants, it's OK.
Why is it not correct? How can two strings be equal if their hashes aren't
equal?
More information about the webkit-reviews
mailing list