[Webkit-unassigned] [Bug 10697] New: REGRESSION (r16175): Errors in incremental decoding of UTF-8

bugzilla-daemon at opendarwin.org bugzilla-daemon at opendarwin.org
Sat Sep 2 16:27:08 PDT 2006


http://bugzilla.opendarwin.org/show_bug.cgi?id=10697

           Summary: REGRESSION (r16175): Errors in incremental decoding of
                    UTF-8
           Product: WebKit
           Version: 420+ (nightly)
          Platform: Macintosh
        OS/Version: Mac OS X 10.4
            Status: NEW
          Keywords: Regression
          Severity: normal
          Priority: P1
         Component: Page Loading
        AssignedTo: webkit-unassigned at opendarwin.org
        ReportedBy: opendarwin.org at mitzpettel.com
                CC: ap at nypop.com


Reload the attachment a few times. Intermittently, some of the Alephs are
replaced by the Unicode replacement character (U+FFFD, which looks like a
question mark inside a black rhombus). Decreasing the network interface's MTU
(say, to 150) makes the problem easier to reproduce.

This is a regression from r16175 (fix for bug 10155 et al.). This happens
because the UTF-8 decoder is destroyed and a new one is created mid-character.
In fact, the decoder is replaced every time Decoder::decode() is called! I
think the following is wrong:

    // If we still haven't found an encoding, assume latin1
    // (this can happen if an empty name is passed from outside).
    if (m_encodingName.isEmpty() || !m_encoding.isValid()) {
        m_encodingName = "iso8859-1";
        m_encoding = TextEncoding(Latin1Encoding);
    }
    m_decoder.set(StreamingTextDecoder::create(m_encoding));

The last line should go inside the braces too.
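
To illustrate why replacing the decoder mid-character produces U+FFFD, here is
a minimal, self-contained sketch. It does not use WebKit code: TinyUtf8Decoder
and both helper functions are hypothetical names invented for this example, and
the decoder only handles 1- and 2-byte sequences (enough for Hebrew letters
such as Aleph, U+05D0, encoded as the bytes D7 90). It is assumed here that a
small MTU can split those two bytes across separate decode() calls.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified streaming decoder (NOT WebKit's
// StreamingTextDecoder). Supports only 1- and 2-byte UTF-8 sequences.
class TinyUtf8Decoder {
public:
    // Decodes one chunk and appends code points to out. A lead byte whose
    // continuation byte has not arrived yet is kept in m_pendingLead.
    void decode(const std::string& chunk, std::vector<char32_t>& out) {
        for (unsigned char byte : chunk) {
            if (m_pendingLead) {
                unsigned char lead = m_pendingLead;
                m_pendingLead = 0;
                if ((byte & 0xC0) == 0x80) {
                    // Valid continuation byte: finish the 2-byte sequence.
                    out.push_back(static_cast<char32_t>(((lead & 0x1F) << 6) | (byte & 0x3F)));
                    continue;
                }
                // Lead byte was not followed by a continuation byte.
                out.push_back(0xFFFD);
            }
            if (byte < 0x80)
                out.push_back(byte);        // ASCII
            else if ((byte & 0xE0) == 0xC0)
                m_pendingLead = byte;       // wait for the next byte
            else
                out.push_back(0xFFFD);      // stray continuation or unsupported
        }
    }

private:
    unsigned char m_pendingLead = 0;        // state carried across chunks
};

// One decoder for the whole stream: state survives chunk boundaries.
std::vector<char32_t> decodeWithOneDecoder(const std::vector<std::string>& chunks) {
    TinyUtf8Decoder decoder;
    std::vector<char32_t> out;
    for (const std::string& chunk : chunks)
        decoder.decode(chunk, out);
    return out;
}

// A new decoder for every chunk, mimicking the behavior described in this
// report: any pending lead byte is lost when the old decoder is destroyed.
std::vector<char32_t> decodeWithFreshDecoderPerChunk(const std::vector<std::string>& chunks) {
    std::vector<char32_t> out;
    for (const std::string& chunk : chunks) {
        TinyUtf8Decoder decoder;
        decoder.decode(chunk, out);
    }
    return out;
}
```

Feeding the chunks "\xD7" and "\x90" through decodeWithOneDecoder() yields the
single code point U+05D0, while decodeWithFreshDecoderPerChunk() yields U+FFFD,
because the second decoder sees a continuation byte with no lead byte.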


-- 
Configure bugmail: http://bugzilla.opendarwin.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.


