[Webkit-unassigned] [Bug 22247] New: Find in a page does not normalize

bugzilla-daemon at webkit.org bugzilla-daemon at webkit.org
Thu Nov 13 16:21:11 PST 2008


https://bugs.webkit.org/show_bug.cgi?id=22247

           Summary: Find in a page does not normalize
           Product: WebKit
           Version: 528+ (Nightly build)
          Platform: All
        OS/Version: All
            Status: NEW
          Keywords: InChromiumBugs
          Severity: Normal
          Priority: P2
         Component: New Bugs
        AssignedTo: webkit-unassigned at lists.webkit.org
        ReportedBy: jshin at chromium.org


1. Go to http://fr.wikipedia.org
2. In the 'Find in a page' box, type U+0065 U+0301 ( é ) [1]

Expected: Many matches are found
Actual: No match is found

All the e's with an acute accent on the page are in composed form (U+00E9) and
do not match the decomposed representation.

A short-term fix: Convert the input ('needle') to NFC. This will take care of
the majority of cases because most web pages tend to use composed forms when
available.
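
To illustrate the short-term fix, here is a minimal sketch in Python rather
than WebKit's own C++. It uses the standard unicodedata module, and the
function name find_nfc_needle is hypothetical, not anything in WebKit.

```python
import unicodedata

def find_nfc_needle(haystack: str, needle: str) -> int:
    """Return the index of the first match, or -1.

    Only the needle is converted to NFC; the haystack is searched as-is.
    """
    return haystack.find(unicodedata.normalize("NFC", needle))

page = "caf\u00e9"   # composed e-acute, as on the page in the report
typed = "e\u0301"    # decomposed form, as typed into the find box

print(page.find(typed))              # plain search: -1 (no match)
print(find_nfc_needle(page, typed))  # needle converted to NFC: 3
```

This only helps when the page text is itself in composed form.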

In the long run: NFC might not be the best choice. The 'hay' (the page text
being searched) may have to be normalized as well.
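
A rough sketch of the longer-term idea, again in Python with the standard
unicodedata module: bring both needle and hay to the same normalization form
before comparing. The function name count_matches_normalized and the choice of
NFC as the default form are assumptions for illustration only.

```python
import unicodedata

def count_matches_normalized(haystack: str, needle: str, form: str = "NFC") -> int:
    """Count non-overlapping matches after normalizing both strings to `form`."""
    h = unicodedata.normalize(form, haystack)
    n = unicodedata.normalize(form, needle)
    if not n:
        return 0
    return h.count(n)

# Hay that mixes both representations: decomposed e-acute, 't', composed e-acute.
mixed = "e\u0301t\u00e9"

# Normalizing only the needle misses the decomposed occurrence in the hay.
print(mixed.count(unicodedata.normalize("NFC", "e\u0301")))  # 1
# Normalizing both sides finds both occurrences.
print(count_matches_normalized(mixed, "e\u0301"))            # 2
```

Note that match positions in the normalized hay do not correspond directly to
offsets in the original text, so a real implementation would need some way to
map them back in order to highlight matches.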

At least on Windows, some African-language keyboards produce decomposed forms
even for accented letters that have a composed representation.

This may also be an issue for Japanese voicing marks. I vaguely remember some
hard-coded normalization for them in WebKit, but I haven't checked whether that
is used in 'Find in a page'. If they're not taken care of, it's a rather
serious issue.
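
As an illustration of the voicing-mark case (not of WebKit's hard-coded
handling, which is unchecked here), the Python sketch below shows the same
composed/decomposed mismatch for hiragana KA followed by the combining voiced
sound mark U+3099. The helper name matches_after_nfc is hypothetical.

```python
import unicodedata

KA = "\u304b"           # HIRAGANA LETTER KA
VOICED_MARK = "\u3099"  # COMBINING KATAKANA-HIRAGANA VOICED SOUND MARK
GA = "\u304c"           # HIRAGANA LETTER GA (precomposed)

def matches_after_nfc(haystack: str, needle: str) -> bool:
    """Report whether needle occurs in haystack once both are in NFC."""
    h = unicodedata.normalize("NFC", haystack)
    n = unicodedata.normalize("NFC", needle)
    return n in h

decomposed = KA + VOICED_MARK

print(decomposed in GA)                   # raw comparison: False
print(matches_after_nfc(GA, decomposed))  # after NFC on both sides: True
```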

Reported against Chrome: http://crbug.com/1100

[1] Go to http://rishida.net/scripts/uniview/conversion.php and type
'U+0065U+0301' in the second box on the left and copy'n'paste the result in the
top-left box.

