<html>
<head>
<base href="https://bugs.webkit.org/" />
</head>
<body>
<p>
<div>
<b><a class="bz_bug_link
bz_status_NEW "
title="NEW - URL paths should not be normalized when encoded"
href="https://bugs.webkit.org/show_bug.cgi?id=144320#c12">Comment # 12</a>
on <a class="bz_bug_link
bz_status_NEW "
title="NEW - URL paths should not be normalized when encoded"
href="https://bugs.webkit.org/show_bug.cgi?id=144320">bug 144320</a>
from <span class="vcard"><a class="email" href="mailto:cgarcia@igalia.com" title="Carlos Garcia Campos <cgarcia@igalia.com>"> <span class="fn">Carlos Garcia Campos</span></a>
</span></b>
<pre>(In reply to <a href="show_bug.cgi?id=144320#c10">comment #10</a>)
<span class="quote">> > The file itself in the file system can be normalized or not, in the particular case of a filename containing a 'ñ', it can be encoded as U+006E U+0303, or U+00F1, but they end up being different files, because the bytes are different, even if the visual representation is the same.
>
> Yes, I understand that this is what you are saying. This is a bug in
> server's file system - everything that supports Unicode must treat different
> normalization forms as equivalent, so a filesystem may not have two files
> whose names only differ in normalization form.</span >
I don't think Linux, or the most common file systems used on Linux, know anything about encodings: filenames are just bytes (the only excluded bytes are 0 and '/', I think), so two files whose names differ in bytes are simply different files. It seems HFS does care about encodings and normalization, which I didn't know. So maybe this change should be made specific to Linux (or other Unix systems, except Mac).
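
For illustration only (this is not code from the patch), a minimal Python sketch of the 'ñ' case above. It assumes names are stored as UTF-8; the helper names utf8_bytes and same_after_nfc are made up for this example.

```python
import unicodedata

# Two spellings of the same visible character.
decomposed = "n\u0303"   # U+006E U+0303
precomposed = "\u00f1"   # U+00F1

def utf8_bytes(name):
    """Return the UTF-8 bytes a byte-oriented file system would store."""
    return name.encode("utf-8")

def same_after_nfc(a, b):
    """Compare two names after normalizing both to NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)

print(utf8_bytes(decomposed))    # b'n\xcc\x83'
print(utf8_bytes(precomposed))   # b'\xc3\xb1'
print(utf8_bytes(decomposed) == utf8_bytes(precomposed))  # False
print(same_after_nfc(decomposed, precomposed))            # True
```

A file system that compares raw bytes sees two names here; one that normalizes before comparing sees one.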
<span class="quote">> > The very same files worked in chrome and firefox.
>
> Yes, they work for you in a test case, but they won't work in other
> scenarios, most notably those that involve user input on a Mac. This is as I
> said, the behavior in WebKit is intentionally different to have a more
> common Unicode form on the wire. Windows browsers have the luxury of letting
> the bytes through unchanged because their OS and Internet both use the same
> form, but for Safari, it is not as straightforward.</span >
Well, I isolated the problem in a test case, but the issue was happening in real cases. It's not that the server fails to normalize the filenames; the server just uses whatever is in the filesystem.
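
A rough sketch of what I mean, in Python rather than WebKit code. The path "/files/espan\u0303a.txt" and the helpers encode_path and encode_path_nfc are hypothetical, and the server is assumed to percent-decode the request path and look up those exact bytes.

```python
import unicodedata
from urllib.parse import quote, unquote

def encode_path(path):
    """Percent-encode a path as UTF-8 without changing its normalization form."""
    return quote(path, safe="/")

def encode_path_nfc(path):
    """Percent-encode a path after normalizing it to NFC first."""
    return quote(unicodedata.normalize("NFC", path), safe="/")

# Hypothetical file name stored on disk in decomposed form (n + U+0303).
on_disk = "/files/espan\u0303a.txt"

as_is = encode_path(on_disk)
normalized = encode_path_nfc(on_disk)

print(as_is)       # /files/espan%CC%83a.txt
print(normalized)  # /files/espa%C3%B1a.txt

# Only the unnormalized request decodes back to the name on disk.
print(unquote(as_is) == on_disk)       # True
print(unquote(normalized) == on_disk)  # False
```

On a byte-oriented file system the normalized request would name a file that does not exist.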
<span class="quote">> > Form data decoding hasn't changed, except for filenames, so what the user types in a search form is still normalized.
>
> The changes in encodeRelativeString() are quite confusing, I'm not sure if
> that's correct. There is some "otherDecoded" string that is
> counter-intuitively a result of calling encode(), and that's separate from
> where the path is handled.
>
> Another change in this patch is that filenames in form data are not encoded.
> This means that a file uploaded from Mac will retain the custom HFS
> normalization form that is not used anywhere else - how is that the right
> thing to do?</span >
I assumed all file systems handled filenames as just bytes; I didn't know HFS worked differently. We need to make this platform-dependent.</pre>
</div>
</p>
</body>
</html>