[Webkit-unassigned] [Bug 159891] [encoding] Support for GB18030

bugzilla-daemon at webkit.org bugzilla-daemon at webkit.org
Fri Oct 28 04:11:05 PDT 2016


https://bugs.webkit.org/show_bug.cgi?id=159891

--- Comment #19 from r12a <ishida at w3.org> ---
Myles, Alexey, Martin, this isn't about input at all. It's about the browser's encoder and decoder algorithms, which it uses when it needs to convert between character encodings. There happen to be two handy ways to expose the behaviour of the encoder (in this case, going from Unicode to GB18030) so that it can be tested: writing characters to form output, or to an href value, where the encoding is expected to be GB18030. That's what these tests do (programmatically).
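
To make the href case concrete, here is a minimal sketch of what "Unicode to GB18030" looks like once the encoded bytes land in a URL. It uses Python's built-in gb18030 codec as a stand-in for a browser encoder (an assumption: Python's codec is not an implementation of the Encoding spec). The helper name and the sample character, U+FF21, are hypothetical and chosen only for illustration; U+FF21 is not the character from the tests.

```python
from urllib.parse import quote


def href_query_bytes(text: str) -> str:
    """Percent-encode text as its GB18030 bytes, as it might appear in a URL.

    Hypothetical helper: Python's codec stands in for the browser's encoder.
    """
    return quote(text, safe="", encoding="gb18030")


# U+FF21 FULLWIDTH LATIN CAPITAL LETTER A is only an illustrative input.
print(href_query_bytes("\uFF21"))
```

A test of the kind described would compare the percent-encoded bytes the browser produces against the bytes the Encoding spec's index says it should produce.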

The example I gave above uses an actual character from the tests that doesn't pass through the Safari encoder as the Encoding spec expects (i.e. without change).

Note, btw, that NFC transformations would never change the character in that example, since the character used is a compatibility equivalent of the Unicode character it is converted to. Such characters are not affected by NFC.
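
The point about NFC can be checked with a short sketch using Python's unicodedata module. The characters below are illustrative assumptions, not the character from the tests: U+FF21 has only a compatibility mapping, while U+2126 has a canonical one and is included for contrast. The helper name is hypothetical.

```python
import unicodedata


def changed_by(form: str, ch: str) -> bool:
    """Return True if normalizing ch to the given form alters it."""
    return unicodedata.normalize(form, ch) != ch


compat = "\uFF21"     # FULLWIDTH LATIN CAPITAL LETTER A, compatibility mapping to U+0041
canonical = "\u2126"  # OHM SIGN, canonical mapping to U+03A9

print(changed_by("NFC", compat))     # False: NFC ignores compatibility mappings
print(changed_by("NFKC", compat))    # True: only the K forms apply them
print(changed_by("NFC", canonical))  # True: canonical equivalents are replaced
```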

So in summary, the test is only checking the behaviour of the browser's encoder/decoder when converting between one character encoding and another, and in the case shown, where equivalents exist in both Unicode and GB 18030, the i18n WG and the WHATWG believe that normalization is not relevant.

Note, btw, that when *decoding* text, i.e. from GB 18030 to Unicode, Safari performs all the conversions as expected by the Encoding spec (including the character in the example). In other words, there is a discrepancy between the way the encoder and decoder work.
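
A discrepancy of this kind shows up as a failed round trip: decode some bytes to Unicode, encode the result again, and compare with the original bytes. The sketch below assumes Python's gb18030 codec as a stand-in (it is not Safari's or the Encoding spec's implementation), and the function name and sample byte sequences are hypothetical choices for illustration.

```python
def encoder_matches_decoder(raw: bytes, encoding: str = "gb18030") -> bool:
    """Decode raw bytes to Unicode, re-encode, and check the bytes are unchanged."""
    text = raw.decode(encoding)
    return text.encode(encoding) == raw


# Illustrative two-byte sequences only; not the sequence from the tests.
for raw in (b"\xa3\xc1", b"\xd6\xd0"):
    print(raw.hex(), encoder_matches_decoder(raw))
```

If the encoder substituted a different character on the way out, the comparison would return False even though the decoding step on its own was correct.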

Does that help make things clearer?
