[webkit-reviews] review granted: [Bug 74826] WebKit should support HTML entities that expand to more than one character : [Attachment 119787] Patch

bugzilla-daemon at webkit.org bugzilla-daemon at webkit.org
Sun Dec 18 17:49:12 PST 2011


Darin Adler <darin at apple.com> has granted Adam Barth <abarth at webkit.org>'s
request for review:
Bug 74826: WebKit should support HTML entities that expand to more than one
character
https://bugs.webkit.org/show_bug.cgi?id=74826

Attachment 119787: Patch
https://bugs.webkit.org/attachment.cgi?id=119787&action=review

------- Additional Comments from Darin Adler <darin at apple.com>
View in context: https://bugs.webkit.org/attachment.cgi?id=119787&action=review


> Source/WebCore/html/parser/HTMLEntityParser.cpp:160
> +    if (U16_LENGTH(entityValue) != 1 ||
search.mostRecentMatch()->rightValue) {

It’s fine to have one of these checks here at runtime. But the other should be
an assertion. There’s no need to check both.

Of course, you plan to delete this whole function eventually, I assume.

> Source/WebCore/html/parser/HTMLEntityParser.h:37
> +// FIXME: Move the XML parser to an entity decoding function that actually
works!

The “that actually works” here is unnecessarily mysterious. You could say
something specific like “that works for non-BMP characters”.

> Source/WebCore/html/parser/HTMLEntityTable.h:39
> +    UChar32 leftValue;
> +    UChar32 rightValue;

Using left and right here is curious. I would have called these first and
second or something that implies ordering rather than direction.

> Source/WebCore/xml/parser/XMLTreeBuilder.cpp:382
> +    UChar32 entityValue = search.mostRecentMatch()->leftValue;
> +    // FIXME: We need to account for rightValue if any XML entities are
longer
> +    // than one unicode character.
> +    ASSERT(!search.mostRecentMatch()->rightValue);

What prevents this assertion from firing? Is there a test case demonstrating
that?

> Source/WebCore/xml/parser/XMLTreeBuilder.cpp:385
>      if (entityValue <= 0xFFFF)
>	  appendToText(reinterpret_cast<UChar*>(&entityValue), 1);
>      else {

You can see given the code above that this else is dead code. This code is in a
strange state.

And the reinterpret_cast to UChar* makes the code little-endian-specific. That
is not good!


More information about the webkit-reviews mailing list