[webkit-reviews] review granted: [Bug 215970] Remove NFC normalization when submitting forms and encoding URL queries and fix EUC-JP encoding : [Attachment 407530] Patch

bugzilla-daemon at webkit.org bugzilla-daemon at webkit.org
Sat Aug 29 12:15:44 PDT 2020


Darin Adler <darin at apple.com> has granted Alex Christensen
<achristensen at apple.com>'s request for review:
Bug 215970: Remove NFC normalization when submitting forms and encoding URL
queries and fix EUC-JP encoding
https://bugs.webkit.org/show_bug.cgi?id=215970

Attachment 407530: Patch

https://bugs.webkit.org/attachment.cgi?id=407530&action=review




--- Comment #3 from Darin Adler <darin at apple.com> ---
Comment on attachment 407530
  --> https://bugs.webkit.org/attachment.cgi?id=407530
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=407530&action=review

Looks like there are more tests to rebase.

> Source/WebCore/platform/text/ICUExtras.h:32
> +extern const UChar jis0208[15448];

It doesn’t make sense to me to call this "ICU" extras. This is our own
encoder/decoder for a particular encoding, and not part of ICU. I suggest a
different name for this file.

> Source/WebCore/platform/text/TextCodecICU.cpp:436
> +	   auto pointerRange = std::equal_range(reinterpret_cast<const
std::pair<UChar, UChar>*>(jis0208), reinterpret_cast<const std::pair<UChar,
UChar>*>(jis0208 + WTF_ARRAY_LENGTH(jis0208)), std::pair<UChar, UChar>(c, 0),
[](const auto& a, const auto& b) {

Why are we casting an array of UChar into an array of pairs? Why not actually
compile an array of pairs? That should work just as well, just with more braces
in the source file, and without the reinterpret_cast.

> Source/WebCore/platform/text/TextCodecICU.cpp:454
> +    CString decimal = makeString(static_cast<unsigned>(c)).utf8();
> +    for (size_t i = 0; i < decimal.length(); i++)
> +	   result.uncheckedAppend(decimal.data()[i]);

We have functions that will do this more efficiently, without allocating
memory:

    uint8_t buffer[5];
    writeIntegerToBuffer(character, buffer);
    result.append(buffer, lengthOfIntegerAsString(character));

Or some variation on this.

> Source/WebCore/platform/text/TextCodecICU.cpp:504
> +    if (!strcmp(m_canonicalConverterName, "euc-jp-2007"))
> +	   return eucJPEncode(string, unencodableHandler(handling));

The way our codec classes are designed, you would normally create and register
a separate class derived TextCodec rather than adding a non-ICU implementation
to the ICU one. Did you consider that? Why did you reject it?

> Source/WebCore/platform/text/TextEncoding.cpp:72
> +Vector<uint8_t> TextEncoding::encode(StringView string, UnencodableHandling
handling, NFCNormalize nfcNormalize) const

I suggest naming the local variable just "normalize".

> Source/WebCore/platform/text/TextEncoding.cpp:82
> +	   auto normalizedString = normalizedNFC(string);
> +	   return newTextCodec(*this)->encode(normalizedString.view, handling);

I suggest making this just a one-liner.


More information about the webkit-reviews mailing list