[Webkit-unassigned] [Bug 44039] c1 control codes shouldn't be interpreted as microsoft characters

bugzilla-daemon at webkit.org bugzilla-daemon at webkit.org
Mon Aug 16 05:02:45 PDT 2010


https://bugs.webkit.org/show_bug.cgi?id=44039


Alexey Proskuryakov <ap at webkit.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|                            |INVALID
          Component|Frames                      |HTML DOM
                 CC|                            |ap at webkit.org




--- Comment #1 from Alexey Proskuryakov <ap at webkit.org>  2010-08-16 05:02:45 PST ---
The test case attached to the Debian bug contains &#x80; in a text/html file (which also has ignored XHTML-style incantations inside). Handling of these is defined in HTML5 section 10.2.4.70 Tokenizing character references:

---------------------
If that number is one of the numbers in the first column of the following table, then this is a parse error. Find the row with that number in the first column, and return a character token for the Unicode character given in the second column of that row.

Number    Unicode character
0x00    U+FFFD    REPLACEMENT CHARACTER
0x0D    U+000D    CARRIAGE RETURN (CR)
0x80    U+20AC    EURO SIGN (€)
<...>
---------------------
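
The lookup quoted above can be sketched in Python roughly as follows. This is only an illustration, not WebKit's code: the names `REPLACEMENTS` and `resolve_numeric_reference` are hypothetical, only the three rows quoted here are included, and the spec's other checks (surrogates, out-of-range numbers and so on) are left out.

```python
# Hypothetical sketch of the table lookup; NOT WebKit's implementation.
# Only the three rows quoted above are listed; the spec's table has more.
REPLACEMENTS = {
    0x00: "\uFFFD",  # REPLACEMENT CHARACTER
    0x0D: "\u000D",  # CARRIAGE RETURN (CR)
    0x80: "\u20AC",  # EURO SIGN
}


def resolve_numeric_reference(number):
    """Return (character, is_parse_error) for a numeric character reference.

    Assumption: `number` is a valid, non-surrogate code point; the other
    error cases described by the spec are not handled in this sketch.
    """
    if number in REPLACEMENTS:
        # A table hit is a parse error, but a character is still returned.
        return REPLACEMENTS[number], True
    return chr(number), False


char, is_error = resolve_numeric_reference(0x80)
print(f"U+{ord(char):04X}", is_error)
```

So &#x80; is flagged as a parse error and still yields the euro sign, rather than the C1 control character.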

Obviously, we match the HTML5 spec here.

If I re-save this test case as an XHTML file, the character reference is interpreted as U+0080.
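
The same HTML/XML split can be reproduced outside a browser. The snippet below is a sketch using Python's standard library rather than WebKit: `html.unescape` is documented as following the HTML5 rules for character references, and `xml.etree.ElementTree` stands in for an XML parser. The `<p>` wrapper element is just an assumption for the example.

```python
import html
import xml.etree.ElementTree as ET

# HTML rules: the numeric reference 0x80 goes through the replacement table.
html_result = html.unescape("&#x80;")

# XML rules: the reference is taken literally as code point 0x80.
xml_result = ET.fromstring("<p>&#x80;</p>").text

print(f"HTML: U+{ord(html_result):04X}")
print(f"XML:  U+{ord(xml_result):04X}")
```

Here the HTML path should give U+20AC and the XML path U+0080, which mirrors the text/html versus XHTML behaviour described above.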

WebKit correctly implements the relevant specifications, and matches other browsers (I only tested Firefox 3.6.8 this time, but my recollection is that IE does the same). The FAQ is obsolete.


