[Webkit-unassigned] [Bug 178207] New: UTF-8 decoding produces incorrect results when an erroneous byte sequence is split into multiple chunks

bugzilla-daemon at webkit.org
Thu Oct 12 05:10:32 PDT 2017


https://bugs.webkit.org/show_bug.cgi?id=178207

            Bug ID: 178207
           Summary: UTF-8 decoding produces incorrect results when an
                    erroneous byte sequence is split into multiple chunks
           Product: WebKit
           Version: WebKit Nightly Build
          Hardware: Unspecified
                OS: Unspecified
            Status: NEW
          Severity: Normal
          Priority: P2
         Component: Web Template Framework
          Assignee: webkit-unassigned at lists.webkit.org
          Reporter: marja at chromium.org

See https://bugs.chromium.org/p/chromium/issues/detail?id=773320 for the same bug in Blink.

In that bug, we saw a script where an invalid byte sequence occurs and is split between two chunks (of size 4096):

0b11100000 (0xe0): lead byte of a three-byte sequence
0b10100101 (0xa5): continuation byte
0b00111111 (0x3f): ASCII byte ('?')

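The bug is that TextCodecUTF8::HandlePartialSequence calls TextCodecUTF8::HandleError, which assumes that each error consumes exactly one byte from the byte stream and produces one invalid character. That ignores the fact that an error may need to consume multiple bytes, i.e., the maximal subpart of the ill-formed sequence. In the sequence above, 0xe0 0xa5 is the maximal subpart and should be consumed as a single error, followed by the ASCII byte 0x3f.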
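
To illustrate the expected behaviour, here is a minimal Python sketch of a chunked UTF-8 decoder that treats the maximal subpart as a single error. It is not the WebKit or Blink code; the class StreamingUtf8Decoder and its methods are hypothetical names made up for this example, and it assumes that each error is reported as one U+FFFD.

```python
REPLACEMENT = "\ufffd"  # assumption: each error is reported as one U+FFFD


class StreamingUtf8Decoder:
    """Hypothetical chunked decoder; state survives between decode() calls."""

    def __init__(self):
        self._reset()

    def _reset(self):
        self._code_point = 0
        self._needed = 0      # continuation bytes still expected
        self._lower = 0x80    # allowed range for the next continuation byte
        self._upper = 0xBF

    def decode(self, chunk: bytes) -> str:
        out = []
        i = 0
        while i < len(chunk):
            byte = chunk[i]
            if self._needed == 0:
                i += 1
                if byte <= 0x7F:
                    out.append(chr(byte))
                elif 0xC2 <= byte <= 0xDF:
                    self._needed, self._code_point = 1, byte & 0x1F
                elif 0xE0 <= byte <= 0xEF:
                    if byte == 0xE0:
                        self._lower = 0xA0  # reject overlong forms
                    if byte == 0xED:
                        self._upper = 0x9F  # reject surrogates
                    self._needed, self._code_point = 2, byte & 0x0F
                elif 0xF0 <= byte <= 0xF4:
                    if byte == 0xF0:
                        self._lower = 0x90  # reject overlong forms
                    if byte == 0xF4:
                        self._upper = 0x8F  # reject values above U+10FFFF
                    self._needed, self._code_point = 3, byte & 0x07
                else:
                    out.append(REPLACEMENT)  # byte can never start a sequence
                continue
            if not (self._lower <= byte <= self._upper):
                # The bytes seen so far form the maximal subpart: emit ONE
                # error for all of them and reprocess the current byte.
                self._reset()
                out.append(REPLACEMENT)
                continue
            i += 1
            self._lower, self._upper = 0x80, 0xBF
            self._code_point = (self._code_point << 6) | (byte & 0x3F)
            self._needed -= 1
            if self._needed == 0:
                out.append(chr(self._code_point))
                self._code_point = 0
        return "".join(out)

    def flush(self) -> str:
        """Call at end of input; a dangling partial sequence is one error."""
        if self._needed:
            self._reset()
            return REPLACEMENT
        return ""


# The sequence from this report, split at a chunk boundary.
decoder = StreamingUtf8Decoder()
result = decoder.decode(b"\xe0\xa5") + decoder.decode(b"\x3f") + decoder.flush()
```

With this approach the result is the same wherever the chunk boundary falls: one error for 0xe0 0xa5, then '?'. A decoder that reports one error per byte would instead produce two errors before the '?'.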
