[webkit-reviews] review granted: [Bug 21635] Use UTF-8 for Unicode encodings other than UTF-8 when submitting a form or encoding the query part of a URL : [Attachment 24910] upated patch

bugzilla-daemon at webkit.org bugzilla-daemon at webkit.org
Mon Nov 24 13:36:55 PST 2008


Darin Adler <darin at apple.com> has granted Jungshik Shin <jshin at chromium.org>'s
request for review:
Bug 21635: Use UTF-8 for Unicode encodings other than UTF-8 when submitting a
form or encoding the query part of a URL
https://bugs.webkit.org/show_bug.cgi?id=21635

Attachment 24910: upated patch
https://bugs.webkit.org/attachment.cgi?id=24910&action=review

------- Additional Comments from Darin Adler <darin at apple.com>
> +    // For UTF-{7,16,32}, we want to use UTF-8 for the query part as 
> +    // we do when submitting a form. A form with GET method
> +    // has its contents added to a url as query params and it makes sense
> +    // to be consistent whether we're dealing with a literal url with
> +    // the query part referred to in an HTML document in UTF-{7,16,32} 
> +    // or a url formed for form submission. So, in both cases, 
> +    // we use encodingForFormSubmission, which returns UTF-8 for 
> +    // UTF-{7,16,32} while leaving other encodings alone. 

In the comment it should be URL rather than url. This comment is quite good,
but probably a bit too wordy. I'd say:

    // For UTF-{7,16,32}, we want to use UTF-8 for the query part as 
    // we do when submitting a form. A form with GET method
    // has its contents added to a URL as query params and it makes sense
    // to be consistent.

And leave it at that.

> +// HTML5 specifies that UTF-8 be used in form submission when a form is 
> +// is a part of a document in UTF-16 probably because UTF-16 is not a 
> +// byte-based encoding and can contain 0x00.  By extension, the same
> +// should be done for UTF-32. In case of UTF-7, it is a byte-based encoding,

> +// but it's fraught with problems and we'd rather steer clear of it. 

Great comment! We normally use one space after a period, not two.

> +// FIXME : Do we also have to return UTF-8 for BOCU, SCSU, and CESU-8?
> +// Those algorithmic encodings are supported in ICU, but I have never
> +// seen them being used for a web page. Instead of dealing with them 
> +// here for form submission, it's probably better not to decode them at
> +// all.

I don't think we need this comment. Lets just omit it. If you do want to keep
the comment, please remove the space after "FIXME".

If you did want this comment, I'd suggest putting it in the ICU source file.
And yes, lets turn these off in TextCodecICU.

> +	   * http/tests/misc/submit-get-in-utf16be-expected.txt: Added.
> +	   * http/tests/misc/submit-get-in-utf16be.html: Added.
> +	   * http/tests/misc/submit-get-in-utf16le-expected.txt: Added.
> +	   * http/tests/misc/submit-get-in-utf16le.html: Added.
> +	   * http/tests/misc/submit-get-in-utf32be-expected.txt: Added.
> +	   * http/tests/misc/submit-get-in-utf32be.html: Added.
> +	   * http/tests/misc/submit-get-in-utf32le-expected.txt: Added.
> +	   * http/tests/misc/submit-get-in-utf32le.html: Added.
> +	   * http/tests/misc/submit-get-in-utf7-expected.txt: Added.
> +	   * http/tests/misc/submit-get-in-utf7.html: Added.
> +	   * http/tests/misc/submit-post-in-utf16be-expected.txt: Added.
> +	   * http/tests/misc/submit-post-in-utf16be.html: Added.
> +	   * http/tests/misc/submit-post-in-utf16le-expected.txt: Added.
> +	   * http/tests/misc/submit-post-in-utf16le.html: Added.
> +	   * http/tests/misc/submit-post-in-utf32be-expected.txt: Added.
> +	   * http/tests/misc/submit-post-in-utf32be.html: Added.
> +	   * http/tests/misc/submit-post-in-utf32le-expected.txt: Added.
> +	   * http/tests/misc/submit-post-in-utf32le.html: Added.
> +	   * http/tests/misc/submit-post-in-utf7-expected.txt: Added.
> +	   * http/tests/misc/submit-post-in-utf7.html: Added.
> +	   * http/tests/misc/url-in-utf16be-expected.txt: Added.
> +	   * http/tests/misc/url-in-utf16be.html: Added.
> +	   * http/tests/misc/url-in-utf16le-expected.txt: Added.
> +	   * http/tests/misc/url-in-utf16le.html: Added.
> +	   * http/tests/misc/url-in-utf32be-expected.txt: Added.
> +	   * http/tests/misc/url-in-utf32be.html: Added.
> +	   * http/tests/misc/url-in-utf32le-expected.txt: Added.
> +	   * http/tests/misc/url-in-utf32le.html: Added.
> +	   * http/tests/misc/url-in-utf7-expected.txt: Added.
> +	   * http/tests/misc/url-in-utf7.html: Added.

These tests are great, but it's too bad we didn't find a way to do this testing
with a single test driver file instead of having them all be separate tests.
For example take a look at the techniques used in char-encoding.html used put
lots of tests all into a single driver.

r=me as-is; would be nice to make those slight comment improvements


More information about the webkit-reviews mailing list