[webkit-reviews] review denied: [Bug 51035] UCS2 encoding aliases should be defaulted to Big Endian : [Attachment 76540] the patch

bugzilla-daemon at webkit.org bugzilla-daemon at webkit.org
Tue Dec 14 11:19:13 PST 2010


Alexey Proskuryakov <ap at webkit.org> has denied Yong Li
<yong.li.webkit at gmail.com>'s request for review:
Bug 51035: UCS2 encoding aliases should be defaulted to Big Endian
https://bugs.webkit.org/show_bug.cgi?id=51035

Attachment 76540: the patch
https://bugs.webkit.org/attachment.cgi?id=76540&action=review

------- Additional Comments from Alexey Proskuryakov <ap at webkit.org>
> 4) IE and FireFox treat "ISO-10646-UCS-2" as Big Endian.

Did you test other encoding names modified here? Do all the added sub-tests
pass in IE and Firefox?

It's surprising that we fail on every page that's labeled as "UTF-16" and
doesn't have a BOM. In fact, I recall that the current behavior is intentional,
and was chosen for compatibility with other browsers - but I don't remember
details of testing.

+// UCS2 encoding aliases.
+testDecode('ISO-10646-UCS-2', '%00a', 'U+0061');
+testDecode('UCS-2', '%00a', 'U+0061');
+testDecode('UTF-16', '%00a', 'U+0061');
+testDecode('Unicode', '%00a', 'U+0061');
+testDecode('csUnicode', '%00a', 'U+0061');
+testDecode('unicodeFFFE', '%00a', 'U+0061');
+testDecode('unicodeFEFF', 'a%00', 'U+0061');

This patch has confused terminology. Why did you label UTF-16 as a "UCS2
encoding alias"? It's clearly not.

UCS-2 is a historic name for 16-bit encoding defined by Unicode 1.0, and
obsoleted by Unicode 2.0. I suspect that most or all clients treat it as UTF-16
in practice. ICU certainly does.

+testDecode('ISO-10646-UCS-2', '%00a', 'U+0061');

Please add tests for something besides zero bytes. While at it, please consider
adding tests for whether surrogate pairs are supported for each of these
encoding names.

Extending test coverage would is very welcome.


More information about the webkit-reviews mailing list