[webkit-reviews] review granted: [Bug 74828] The HTML parser doesn't enforce the "Noah's Ark condition" from the HTML5 spec : [Attachment 119792] Patch

bugzilla-daemon at webkit.org bugzilla-daemon at webkit.org
Sun Dec 18 18:22:18 PST 2011


Darin Adler <darin at apple.com> has granted Adam Barth <abarth at webkit.org>'s
request for review:
Bug 74828: The HTML parser doesn't enforce the "Noah's Ark condition" from the
HTML5 spec
https://bugs.webkit.org/show_bug.cgi?id=74828

Attachment 119792: Patch
https://bugs.webkit.org/attachment.cgi?id=119792&action=review

------- Additional Comments from Darin Adler <darin at apple.com>
View in context: https://bugs.webkit.org/attachment.cgi?id=119792&action=review


> Source/WebCore/ChangeLog:10
> +	   elements that can be in the list of active formating elements.  I'm
not

Typo: formating instead of formatting

> Source/WebCore/ChangeLog:11
> +	   entirely sure that enforcing this condition is worth the complexity,


Besides complexity, there is the performance cost. Did you measure that? I hope
it’s negligible with the optimized code path you added here.

> Source/WebCore/html/parser/HTMLFormattingElementList.cpp:43
> +inline size_t attributeCount(Element* element)

Since this is in a cpp file it should have static so it gets internal linkage.

> Source/WebCore/html/parser/HTMLFormattingElementList.cpp:134
> +void HTMLFormattingElementList::tryToEnsureNoahsArkConditionQuickly(Element*
newElement, Vector<Element*>& remainingCandiates)

Typo: remainingCandiates rather than remainingCandidates

> Source/WebCore/html/parser/HTMLFormattingElementList.cpp:138
> +    // Initially, use a vector with inline capacity to avoid a malloc in the


The use of the word “initially” here is confusing. I know it refers to this
case vs. the non-quickly function, but it is not obvious.

> Source/WebCore/html/parser/HTMLFormattingElementList.cpp:142
> +    size_t newElementAttributeCount = attributeCount(newElement);

Is it possibly worth checking m_entries.size() for zero first, so we don’t have
to fetch the attribute count at all in that case?

> Source/WebCore/html/parser/HTMLFormattingElementList.cpp:145
> +    for (size_t i = 0; i < m_entries.size(); ++i) {
> +	   Entry& entry = m_entries[m_entries.size() - i - 1];

I would write the loop like this:

    for (size_t i = m_entries.size(); i; ) {
	--i;
	Entry& entry = m_entries[i];

It’s my new favorite way to write backwards loops. I think it’s easier to read
this way.

> Source/WebCore/html/parser/HTMLFormattingElementList.cpp:152
> +	   if (!newElement->hasTagName(candidate->tagQName()))
> +	       continue;

I believe hasTagName here is incorrect and the correct check is even faster and
stricter, element->tagQName() != candidate->tagQName(). Is that right?

> Source/WebCore/html/parser/HTMLFormattingElementList.cpp:192
> +	       // FIXME: Technically we shouldn't read this information back
from
> +	       // the DOM. Instead, the parser should keep a copy of the
information.

Could we be more specific about why reading back from the DOM is an issue?
Maybe the reason is that a script that changes the DOM should have no effect on
the Noah's Ark algorithm. Or maybe there’s some other reason that you know of
and I do not.

> Source/WebCore/html/parser/HTMLFormattingElementList.cpp:206
> +    // formatting lement list gets permuted.

Typo: lement list

> Source/WebCore/html/parser/HTMLFormattingElementList.h:130
> +    //
http://www.whatwg.org/specs/web-apps/current-work/multipage/parsing.html#list-o
f-active-formatting-elements

I think a human-readable description of what this is for could be valuable too.
My comment would say something like this:

    // The "Noah's Ark" algorithm, which removes redundant mis-nested elements.


More information about the webkit-reviews mailing list