[Webkit-unassigned] [Bug 28970] content-type parameters not taken into account when building form-data

Mon Sep 21 13:02:55 PDT 2009

https://bugs.webkit.org/show_bug.cgi?id=28970

--- Comment #6 from Patrick Mueller <pmuellr at yahoo.com>  2009-09-21 13:02:55 PDT ---
I wrapped the decodeURIComponent() call with a try/catch, and just left the
code as is, so that if the query string values can't be decoded, you'll only
ever see the encoded values.  This allows the resource data to be displayed,
instead of getting the "white div of death".
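
For illustration, the try/catch wrapping described above could look roughly
like the following.  This is only a minimal sketch, not the actual patch; the
helper name safeDecode is hypothetical and does not come from the inspector
source.

```javascript
// Hypothetical helper (name is illustrative, not from the WebKit source).
// Returns the decoded value when possible, otherwise the value as received.
function safeDecode(value) {
    try {
        return decodeURIComponent(value);
    } catch (e) {
        // decodeURIComponent() throws a URIError when the percent-escapes
        // do not form valid UTF-8, so fall back to the encoded form.
        return value;
    }
}
```

With this, a value whose escapes are valid UTF-8 is decoded, and anything else
is shown exactly as it was sent.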

Here's some other info.

Pasting the following text into the form fields: "フリガナ" yields the following
values in the form data section: 

   %83t%83%8A%83K%83i 

From the console, running encodeURIComponent("フリガナ") yields the following:

   %E3%83%95%E3%83%AA%E3%82%AC%E3%83%8A

Just based on the length of the strings, I'm guessing the first is the
Shift_JIS encoding of the original string.  There's nothing in the request
headers indicating a character encoding being requested, but in the original
page (the one we entered the form data on), the Content-Type header returned
by the server was:

   text/html; charset=Shift_JIS
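
The length comparison can be checked by counting the bytes each string
represents.  The sketch below is only an illustration; percentEncodedBytes is
a hypothetical helper, and it assumes every unescaped character is plain ASCII.

```javascript
// Hypothetical helper: expand a percent-encoded string into byte values.
// Assumes unescaped characters are ASCII, so each one stands for one byte.
function percentEncodedBytes(value) {
    const bytes = [];
    for (let i = 0; i < value.length; i++) {
        if (/^%[0-9A-Fa-f]{2}$/.test(value.slice(i, i + 3))) {
            // "%XX" escape: one byte given as two hex digits.
            bytes.push(parseInt(value.slice(i + 1, i + 3), 16));
            i += 2;
        } else {
            // Literal character such as "t" or "K".
            bytes.push(value.charCodeAt(i));
        }
    }
    return bytes;
}

const received = percentEncodedBytes("%83t%83%8A%83K%83i");
const utf8 = percentEncodedBytes("%E3%83%95%E3%83%AA%E3%82%AC%E3%83%8A");
console.log(received.length, utf8.length); // 8 12
```

Eight bytes for four characters is two bytes per character, which fits a
double-byte encoding such as Shift_JIS; the UTF-8 form uses three bytes per
character.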

I think the thing we'll need to do, to handle this completely, is to have
charset translations available, so that web inspector code can perform the
translation of the string actually received (first string %83t... above) into
Unicode for display to the user.  Rather than having to guess at what the
character encoding is, we will also need to have this information available.
This is the effective encoding used on the original page that submitted the
request.
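
As a rough sketch of what such a translation step might look like: the code
below assumes a runtime that provides TextDecoder from the Encoding API (an
assumption, nothing the inspector currently relies on), and that the page's
charset has somehow been passed in.  The function name decodeWithCharset is
hypothetical.

```javascript
// Hypothetical sketch: decode a percent-encoded value using a known charset.
// Assumes a global TextDecoder is available and that unescaped characters
// in the value are plain ASCII.
function decodeWithCharset(value, charset) {
    const bytes = [];
    for (let i = 0; i < value.length; i++) {
        if (/^%[0-9A-Fa-f]{2}$/.test(value.slice(i, i + 3))) {
            bytes.push(parseInt(value.slice(i + 1, i + 3), 16));
            i += 2;
        } else {
            bytes.push(value.charCodeAt(i));
        }
    }
    try {
        // fatal: true makes decode() throw on malformed input instead of
        // silently substituting replacement characters.
        const decoder = new TextDecoder(charset, { fatal: true });
        return decoder.decode(new Uint8Array(bytes));
    } catch (e) {
        // Unknown charset label or undecodable bytes: show the encoded form.
        return value;
    }
}
```

Called as decodeWithCharset("%83t%83%8A%83K%83i", "shift_jis"), this would be
expected to give back the original text where the runtime supports that
charset, and the encoded form otherwise.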

It appears the encoding may be available for the request via
m_responseContentDispositionEncodingFallbackArray field of ResourceRequestBase
(in platform/network), but the fact that there are up to three values in there
makes me a little worried.  We may still have to guess.

At this point, it's not clear whether it's useful to even proceed beyond adding
the try/catch to prevent the "white div of death".  Are folks generally moving
to UTF-8 based charsets?  Are things like Shift_JIS encoding important to support
for debugging purposes like this?  I can't see how a developer could ever do
anything useful with these strings, programmatically, as there is no charset
conversion facility generally available in JS.  

I'm opting for performing no heroics - try converting using the existing
decodeURIComponent() and if that fails only show the encoded form.  Should we
inform the user of the inability to convert?  Where - console message?
