[Webkit-unassigned] [Bug 22166] New: HTML entities for surrogate pair codepoints cause rendering issues

bugzilla-daemon at webkit.org bugzilla-daemon at webkit.org
Mon Nov 10 16:33:17 PST 2008


https://bugs.webkit.org/show_bug.cgi?id=22166

           Summary: HTML entities for surrogate pair codepoints cause
                    rendering issues
           Product: WebKit
           Version: 528+ (Nightly build)
          Platform: Macintosh Intel
        OS/Version: Mac OS X 10.5
            Status: UNCONFIRMED
          Severity: Normal
          Priority: P2
         Component: WebKit Misc.
        AssignedTo: webkit-unassigned at lists.webkit.org
        ReportedBy: kevin at sb.org


When WebKit encounters an HTML entity that represents a codepoint in the
UTF-16 surrogate range (U+D800 - U+DFFF), it interprets that entity as a
single UTF-16 code unit. This means a pair of these entities is treated the
same way as a single entity for a codepoint above U+FFFF (e.g. the entity for
𝍧 is interpreted the same as the two entities for its high and low
surrogates). This in and of itself is kinda strange, but not necessarily
incorrect. What is incorrect is WebKit's behavior when only a single half of
the surrogate pair is present (such as an entity for a lone surrogate with no
partner). In this scenario, WebKit stops rendering text on that line, starting
at that codepoint and continuing until the next linebreak.
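
For reference, here is a rough sketch of the standard UTF-16 arithmetic that
relates a codepoint above U+FFFF to its surrogate pair. It is only an
illustration and not WebKit code: the function names are made up, and U+1D11E
is an arbitrary example codepoint chosen for the usage lines.

```python
def to_surrogates(cp):
    """Split a codepoint above U+FFFF into (high, low) UTF-16 surrogates."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("codepoint is not in a supplementary plane")
    offset = cp - 0x10000                 # 20-bit value to split in half
    high = 0xD800 + (offset >> 10)        # top 10 bits    -> U+D800..U+DBFF
    low = 0xDC00 + (offset & 0x3FF)       # bottom 10 bits -> U+DC00..U+DFFF
    return high, low


def from_surrogates(high, low):
    """Combine a (high, low) surrogate pair back into one codepoint."""
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a valid high/low surrogate pair")
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


def is_lone_surrogate_sequence(units):
    """Return True if any surrogate code unit in `units` lacks its partner."""
    i = 0
    while i < len(units):
        unit = units[i]
        if 0xD800 <= unit <= 0xDBFF:
            # A high surrogate must be followed directly by a low surrogate.
            if i + 1 < len(units) and 0xDC00 <= units[i + 1] <= 0xDFFF:
                i += 2
                continue
            return True
        if 0xDC00 <= unit <= 0xDFFF:
            # A low surrogate that was not consumed by a preceding high one.
            return True
        i += 1
    return False


high, low = to_surrogates(0x1D11E)
print(hex(high), hex(low))
print(hex(from_surrogates(high, low)))
```

The lone-surrogate check corresponds to the failing case described above: a
high or low half that appears without its partner.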

I don't know if there's any official spec on how such entities should be
treated, but my own preference would be to treat such an entity the same as an
unknown named entity and strip it from the rendered text entirely.
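
To make that preference concrete, here is a minimal sketch of the stripping
behavior, written as a standalone text filter rather than as a parser change.
The function name and regular expression are hypothetical, and the sketch
assumes the simplest policy: every numeric entity in the surrogate range is
dropped, including ones that happen to form a valid pair.

```python
import re

# Matches numeric entities such as &#xD834; (hex) or &#55348; (decimal).
_NUMERIC_ENTITY = re.compile(r"&#(?:[xX]([0-9A-Fa-f]+)|([0-9]+));")


def strip_surrogate_entities(markup):
    """Remove numeric entities whose value lies in U+D800..U+DFFF."""
    def replace(match):
        hex_digits, dec_digits = match.groups()
        cp = int(hex_digits, 16) if hex_digits else int(dec_digits)
        if 0xD800 <= cp <= 0xDFFF:
            return ""                 # drop the surrogate entity entirely
        return match.group(0)         # leave every other entity untouched
    return _NUMERIC_ENTITY.sub(replace, markup)


print(strip_surrogate_entities("before &#xD834; after"))
```

A variant that keeps valid pairs and strips only unpaired halves would need
to look at adjacent entities together, which this sketch does not attempt.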


-- 
Configure bugmail: https://bugs.webkit.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.


