[webkit-reviews] review denied: [Bug 3274] document() not supported : [Attachment 2555] Work in progress

Wed Jun 22 20:09:12 PDT 2005

Darin Adler <darin at apple.com> has denied Anders Carlsson <andersca at mac.com>'s
request for review:
Bug 3274: document() not supported
http://bugzilla.opendarwin.org/show_bug.cgi?id=3274

Attachment 2555: Work in progress
http://bugzilla.opendarwin.org/attachment.cgi?id=2555&action=edit

------- Additional Comments from Darin Adler <darin at apple.com>
Well, setting the review flag is indeed an effective way to get my attention!

Regarding your comments (2) and (3), sounds like you're on the right track. To
figure out what we should do in error cases, we first need to investigate what
other browsers do in such cases.

As far as issue (1) is concerned, the fact that other caller use xmlReadMemory
is due to the fact that we already have the data in memory, stored in a
QString. In the XMLTokenizer case we have already used our own decoding logic
for figuring out what character set to decode.

Whether libxml2's logic to determine what character set something is in is
acceptable or not, I don't really know. Since it doesn't get passed the
headers, I don't know how it could possibly handle that exactly right.

I think it's OK to start like this, and make a change only if we find a bug. If
we want to decode here in a way that matches the XMLTokenizer code path, it's
pretty simple:

Create a khtml::Decoder. Look at the headers and call setEncoding with the
encoding from the header, if any, passing EncodingFromHTTPHeader. See the logic
in KHTMLPart::write for other rules we might want to imitate (inheriting
encoding, user-specified encoding).

Call decode(), passing all the data. Call flush(), and append that QString to
the one you got from decode().

Then call xmlReadMemory the same way XMLTokenizer does.

I'm sure we could do a little refactoring to share code with KHTMLPart::write
too.

But it's probably not worth doing any of that until we make a test case and see
how the other browsers handle character sets in this case.