[webkit-reviews] review granted: [Bug 44790] Move UTF16 LEAD/TRAIL logic into the HTMLEntityParser : [Attachment 65756] Patch

bugzilla-daemon at webkit.org bugzilla-daemon at webkit.org
Sat Aug 28 16:10:53 PDT 2010


Darin Adler <darin at apple.com> has granted Adam Barth <abarth at webkit.org>'s
request for review:
Bug 44790: Move UTF16 LEAD/TRAIL logic into the HTMLEntityParser
https://bugs.webkit.org/show_bug.cgi?id=44790

Attachment 65756: Patch
https://bugs.webkit.org/attachment.cgi?id=65756&action=review

------- Additional Comments from Darin Adler <darin at apple.com>
> +	   * html/HTMLEntityParser.cpp:
> +	   (WebCore::):
> +	   (WebCore::consumeHTMLEntity):

This list of function names is a bit strange. As you know, I am a big fan of
per-function change log comments, but just removing the bogus function name
might be enough. For some reason the new function's name is not here either.

> +inline bool convertToCharacters(unsigned value, Vector<UChar>&
decodedEntity)

Functions defined and used within the same file should be marked static so they
get internal linkage, unless this is inside an unnamed namespace. We have been
using static rather than unnamed namespaces because tool support for unnamed
namespaces was lacking, but perhaps you decided on the other path for this
code?

The argument name "value" here is strange. We have a type UChar32 that we often
use instead of unsigned for such things, by the way.

The name convertToCharacters is a bit unclear since this converts a Unicode
character into a sequence of UTF-16 code units. It actually converts *from* a
character. I think you should use a different name.

> +    if (value < 0xFFFF) {

Why < here instead of <=? The old code used "< 0xFFFF" in one place and ">
0xFFFF" in another, which is inconsistent.

ICU has the macro U_IS_BMP for this operation, which I think is slightly more
explicit than doing the math directly.

> +    Vector<UChar> decodedEntity;
> +    bool success = consumeHTMLEntity(source, decodedEntity,
notEnoughCharacters);

I don't know how hot this code path is, but if it's hot enough, we might want
to use Vector<UChar, 16> for this instead of Vector<UChar> to avoid a memory
allocation each time.

r=me


More information about the webkit-reviews mailing list