[webkit-reviews] review denied: [Bug 4820] hexadecimal HTML entities split across TCP packets are not parsed correctly : [Attachment 3755] proposed patch

bugzilla-request-daemon at opendarwin.org
Sun Sep 4 12:27:59 PDT 2005


Darin Adler <darin at apple.com> has denied Alexey Proskuryakov <ap at nypop.com>'s
request for review:
Bug 4820: hexadecimal HTML entities split across TCP packets are not parsed
correctly
http://bugzilla.opendarwin.org/show_bug.cgi?id=4820

Attachment 3755: proposed patch
http://bugzilla.opendarwin.org/attachment.cgi?id=3755&action=edit

------- Additional Comments from Darin Adler <darin at apple.com>
I believe this patch reintroduces a bug we fixed a while back. It was in trying
to fix that one that we introduced this bug. This was back when we didn't
always make tests for the bugs we fixed (bad!), so there's no layout test.

The original fix was back on 2003-11-17; you can see it in the ChangeLog.

3485925: Safari does not correctly parse eight-digit hex character entities

Also, please don't include the TOKEN_DEBUG changes along with the bug fix. Let's
handle those separately.

Here's some text from the original bug report:

--------

Safari does not correctly parse eight-digit hex character entities. I noticed
this at <http://www.alanwood.net/unicode/deseret.html>.

"&#x0010400;" in HTML works fine (gives me the glyph for U+10400, "DESERET
CAPITAL LETTER LONG I").  But if I use "&#x00010400;", the page renders
incorrectly; I get a Last Resort glyph followed by "0;"

It looks like the numeric entity parser only looks seven digits into the hex
string, so displays U+1040 followed by "0;", instead of realizing that it's all
one entity and displaying U+10400.

--------

So we should make a new fix that works properly for both cases (entities split
across packets and eight-digit hex entities), and probably make a layout test
for the older bug fix.
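
To make the two requirements concrete, here is a rough sketch of a decoder that
keeps its state between chunks and puts no limit on the number of hex digits.
This is not the tokenizer code and not a proposed patch; the class name
HexEntityDecoder and its feed/finish methods are made up for illustration. It
only handles "&#x...;" entities, treats every other byte as a one-byte
character, and does not special-case surrogates or zero.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical incremental decoder for "&#x...;" entities (illustration only).
class HexEntityDecoder {
public:
    // Feed one chunk of input, e.g. the text that arrived in one packet.
    // A partially seen entity is remembered until the next call.
    void feed(const std::string& chunk, std::vector<uint32_t>& out) {
        for (unsigned char c : chunk)
            step(c, out);
    }

    // Call at end of input to resolve whatever is still pending.
    void finish(std::vector<uint32_t>& out) {
        if (state == Hex && sawDigit)
            emitValue(out);
        else
            flushPending(out);
        state = Text;
    }

private:
    enum State { Text, Ampersand, Hash, Hex };

    static int hexDigit(unsigned char c) {
        if (c >= '0' && c <= '9') return c - '0';
        if (c >= 'a' && c <= 'f') return c - 'a' + 10;
        if (c >= 'A' && c <= 'F') return c - 'A' + 10;
        return -1;
    }

    // The pending characters turned out not to be an entity: emit them as text.
    void flushPending(std::vector<uint32_t>& out) {
        for (unsigned char p : pending)
            out.push_back(p);
        pending.clear();
    }

    void emitValue(std::vector<uint32_t>& out) {
        // Values beyond U+10FFFF become U+FFFD (an assumption of this sketch).
        out.push_back(value > 0x10FFFF ? 0xFFFD : value);
        pending.clear();
    }

    void step(unsigned char c, std::vector<uint32_t>& out) {
        switch (state) {
        case Text:
            if (c == '&') { state = Ampersand; pending = "&"; }
            else out.push_back(c);
            return;
        case Ampersand:
            if (c == '#') { state = Hash; pending += '#'; return; }
            break;
        case Hash:
            if (c == 'x' || c == 'X') {
                state = Hex;
                pending += static_cast<char>(c);
                value = 0;
                sawDigit = false;
                return;
            }
            break;
        case Hex: {
            int d = hexDigit(c);
            if (d >= 0) {
                // No digit-count limit; saturate instead of overflowing.
                value = value * 16 + d;
                if (value > 0x10FFFF) value = 0x110000;
                sawDigit = true;
                pending += static_cast<char>(c);
                return;
            }
            if (sawDigit) {
                emitValue(out);
                state = Text;
                if (c != ';') step(c, out); // entity ended without a semicolon
                return;
            }
            break;
        }
        }
        // Not an entity after all: emit what we held back, then reprocess c.
        flushPending(out);
        state = Text;
        step(c, out);
    }

    State state = Text;
    std::string pending;   // raw characters of the entity seen so far
    uint32_t value = 0;    // accumulated code point
    bool sawDigit = false; // at least one hex digit was seen
};
```

With this shape, feeding "&#x0001" and then "0400;" yields the same single
U+10400 as feeding "&#x00010400;" in one piece, which is the pair of cases a
layout test would need to cover.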


