[Webkit-unassigned] [Bug 81270] FileApi does not handle files with NFD encoded umlaut in file name

bugzilla-daemon at webkit.org bugzilla-daemon at webkit.org
Tue Mar 20 15:22:52 PDT 2012


https://bugs.webkit.org/show_bug.cgi?id=81270


Alexey Proskuryakov <ap at webkit.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
 Attachment #132844|review?                     |review-
               Flag|                            |




--- Comment #7 from Alexey Proskuryakov <ap at webkit.org>  2012-03-20 15:22:52 PST ---
(From update of attachment 132844)
View in context: https://bugs.webkit.org/attachment.cgi?id=132844&action=review

My understanding is that to match the Web, file system implementation will need to decode percent escapes and do its own normalization. Or just work with NFC internally, which just makes sense for any Web-related system.

> Source/WebCore/ChangeLog:3
> +        [Chromium] FileApi does not handle files with NFD encoded umlaut in file name

Did you mean the other way around? TextEncoding::encode produces NFC, which is the standard encoding on the Web.

>> Source/WebCore/ChangeLog:10
>> +        -switch all calls to encode to normalizeAndEncode except the ones in platform/KURLGoogle.cpp
> 
> Does that introduce another difference between KURL and GURL?  Can you write a test in LayoutTest/fast/url that shows the difference for this patch?

The goal of existing encode() behavior is to make WebKit on any platform generate the same kind of data as Windows browsers do. So when faced with input that's not in NFC form (like a file name on OS X), it converts the data to NFC.

I think that this just makes sense. Sending randomly normalized data across the wire (in URLs or otherwise) does not. This patch seems to affect all URLs, not just file system ones.

> Source/WebCore/platform/text/TextEncoding.h:72
> +        CString normalizeAndEncode(const UChar*, size_t length, UnencodableHandling) const;

This new method does not make a lot of sense conceptually.

1. There are multiple normalization schemes. Normalize to what?
2. You only need to normalize when using one of Unicode encodings (such as UTF-*). So, exposing this as a public method on TextEncoding is confusing.
3. As mentioned above, are there any cases where one wants non-NFC data?

> LayoutTests/fast/filesystem/file-from-file-entry-nfd-name.html:7
> +<script src="resources/file-from-file-entry-nfd-name.js"></script>

Please don't split tests into .html and .js. The only exception is fast/js directory, and then the .js part should be pure JavaScript, no DOM.

-- 
Configure bugmail: https://bugs.webkit.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.



More information about the webkit-unassigned mailing list