[Webkit-unassigned] [Bug 32937] LayoutTests/fast/encoding/invalid-UTF-8.html

bugzilla-daemon at webkit.org bugzilla-daemon at webkit.org
Mon Feb 15 10:17:06 PST 2010


https://bugs.webkit.org/show_bug.cgi?id=32937


Alexey Proskuryakov <ap at webkit.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #48718|review?                     |review-
               Flag|                            |




--- Comment #5 from Alexey Proskuryakov <ap at webkit.org>  2010-02-15 10:17:05 PST ---
(From update of attachment 48718)
It turns out that per the Unicode spec, both results are allowed, although the
current results match the recommended ones. See the "Constraints on Conversion
Processes" paragraph in
<http://www.unicode.org/versions/Unicode5.2.0/ch03.pdf>.

> Mac, Firefox and Chromium (and presumably other decoders) represent the unicode
> string in this test as:
> |d1 82 | b5 | d1 | d1 82 | 20   | f0 90 80 | f0 80 f0 90 | 90 |
> |  T   | ?  | ?  |  T    | [\s] |     ?    |       ?     | ?  |

I don't think this is quite accurate - it seems that the invalid subsequences
are b5, d1, f0 90 80, f0 80, and f0 90 90. At least, that's what they should be
per the spec recommendation.
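
To make the two groupings concrete, here is a minimal Python sketch. It is
only an illustration, not how any of the decoders mentioned here is
implemented. The byte string is reconstructed from the quoted table, and the
names TEST_BYTES, expected_length and decode, as well as the two policies
(one U+FFFD per invalid subsequence versus one per invalid byte), are
assumptions chosen to reproduce the two outputs discussed in this bug.

```python
# Bytes reconstructed from the quoted table (an assumption, not read from the test file).
TEST_BYTES = bytes.fromhex("d182b5d1d18220f09080f080f09090")

REPLACEMENT = "\ufffd"


def expected_length(lead):
    """Length implied by a lead byte, or 0 if the byte cannot start a sequence."""
    if lead < 0x80:
        return 1
    if 0xC2 <= lead <= 0xDF:
        return 2
    if 0xE0 <= lead <= 0xEF:
        return 3
    if 0xF0 <= lead <= 0xF4:
        return 4
    return 0


def decode(data, per_byte):
    """Decode UTF-8, replacing each invalid subsequence with U+FFFD.

    A subsequence is a lead byte plus the continuation bytes (80..BF) that
    follow it, up to the length the lead byte calls for. With per_byte=True,
    every byte of an invalid subsequence gets its own U+FFFD instead.
    """
    out = []
    i = 0
    while i < len(data):
        need = expected_length(data[i])
        j = i + 1
        while j < len(data) and j < i + need and 0x80 <= data[j] <= 0xBF:
            j += 1
        chunk = data[i:j]
        try:
            out.append(chunk.decode("utf-8"))
        except UnicodeDecodeError:
            out.append(REPLACEMENT * (len(chunk) if per_byte else 1))
        i = j
    return "".join(out)


print(decode(TEST_BYTES, per_byte=False))  # two U+FFFD, then three after the space
print(decode(TEST_BYTES, per_byte=True))   # two U+FFFD, then eight after the space
```

With per_byte=False the invalid subsequences come out as b5, d1, f0 90 80,
f0 80 and f0 90 90; with per_byte=True the same input yields eight
replacement characters after the space.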

> test itself to a dumpAsText test since font metrics are not required to determine
> if it has passed or not.

This test was originally added for bug 8972, "REGRESSION: invalid UTF-8
sequences are not displayed". One needs pixel results to tell whether
characters are displayed!

We actually want to test two things here. First, that U+FFFD replacement
characters are actually displayed. Second, that compliance with the handling
recommended by the Unicode spec doesn't regress on platforms that are compliant
now.

Only the former subtest needs pixel results. One way to make it text-only would
be to have a separate container with the text "тт ", and compare its rendered
width with that of the actual output, which should be wider if the replacement
characters are displayed.

-<p>The output should be: "т??т ???" (with black diamonds in place of question marks).</p>
+<p>The output on Mac should be: "т??т ???" (with black diamonds in place of question marks).<br>
+The output on Qt should be: "т??т ????????" (with black diamonds in place of question marks).<br>
+(See https://bugs.webkit.org/show_bug.cgi?id=32937 for the reason Qt is different.)</p>

There is no need to single out Qt here. It would be better to just make the
test results more self-explanatory, separating the "must" and "ideally should"
subtests.

I think it's fine to have Qt-specific results for the test, and it would be
nice to make the test text-only. But some clarification would be useful to
avoid future confusion, so please consider my comments above. Marking r- to get
this out of the review queue.
