[Webkit-unassigned] [Bug 26694] should we scan beyond 1kB for meta charset?

bugzilla-daemon at webkit.org bugzilla-daemon at webkit.org
Wed Feb 8 01:16:15 PST 2012


https://bugs.webkit.org/show_bug.cgi?id=26694





--- Comment #9 from yosin at chromium.org  2012-02-08 01:16:15 PST ---
No. This value is a simple statistic: the byte position of the end of the meta element that contains the charset declaration, i.e. <meta charset=".."> or <meta http-equiv="Content-Type" content="...">.
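
To make the statistic concrete, here is a minimal sketch in Python of how such a byte position could be measured. The regular expression and the names `meta_charset_end` and `found_within` are illustrative assumptions; this is neither WebKit's prescan nor the tool used to gather the numbers.

```python
import re

# Hypothetical, simplified pattern: a <meta ...> tag whose attributes mention
# "charset". A real HTML prescan handles comments, quoting, and attribute
# parsing far more carefully.
META_CHARSET_RE = re.compile(rb"<meta\b[^>]*\bcharset\b[^>]*>", re.IGNORECASE)

def meta_charset_end(html: bytes):
    """Return the byte offset just past the first meta charset tag, or None."""
    match = META_CHARSET_RE.search(html)
    return match.end() if match else None

def found_within(html: bytes, limit: int = 1024) -> bool:
    """True if the declaration ends within the first `limit` bytes."""
    end = meta_charset_end(html)
    return end is not None and end <= limit
```

A page whose declaration ends after byte 1024 would be missed by a scanner that stops at 1kB, which is the case this bug asks about.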

WebKit does well. (^_^)

WebKit has other sources for the charset:
1. HTTP response header
2. Default encoding from browser
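
How these sources might be combined with the meta declaration can be sketched as a simple fallback chain. The order used here (HTTP header first, then the meta declaration, then the browser default) follows the general shape of the HTML specification's encoding sniffing and is an assumption for illustration, not a description of WebKit's code. `choose_encoding`, its parameters, and the "windows-1252" default are hypothetical.

```python
from typing import Optional

def choose_encoding(http_charset: Optional[str],
                    meta_charset: Optional[str],
                    browser_default: str = "windows-1252") -> str:
    """Pick the first available charset label, in an assumed priority order."""
    for candidate in (http_charset, meta_charset):
        # Skip sources that are missing or contain only whitespace.
        if candidate and candidate.strip():
            return candidate.strip().lower()
    # Neither the header nor the document declared a charset.
    return browser_default
```

For example, `choose_encoding(None, "Shift_JIS")` falls through the missing header and uses the meta declaration.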

Note: 56.62% of URLs have the "right" charset in the "Content-Type" HTTP response header.

BTW, "charset" declarations are sometimes wrong. 

In HTTP:
1.2% of charsets are invalid charset names, e.g. "utf8", "foobar", etc.
12.45% of charsets specify a different charset from the actual one, e.g. "shift_jis" for "utf-8".

In HTML, 1.25% of charsets are invalid charset names and 17.83% specify a different charset from the actual one.
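
The two failure categories above can be illustrated with a small classifier. This is only a sketch: the set `KNOWN_NAMES` is a tiny hypothetical stand-in for a real charset registry, and treating a strict decode failure as "specifies another charset" is a rough heuristic, not the method used to produce the percentages.

```python
import codecs

# Hypothetical registry for illustration; a real one would be far larger.
KNOWN_NAMES = {"utf-8", "shift_jis", "euc-jp", "iso-2022-jp", "windows-1252"}

def classify_charset(declared: str, body: bytes) -> str:
    """Return "invalid", "mismatch", or "ok" for a declared charset label."""
    name = declared.strip().lower()
    if name not in KNOWN_NAMES:
        return "invalid"  # e.g. "utf8" or "foobar"
    try:
        # Strict decoding raises if the bytes are not valid in this charset.
        codecs.decode(body, name, errors="strict")
    except (UnicodeDecodeError, LookupError):
        return "mismatch"
    return "ok"
```

Note that a successful decode does not prove the label is right, since many byte sequences are valid in more than one charset, so this heuristic undercounts mismatches.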

We're working on fixing the invalid charset and missing charset cases in https://bugs.webkit.org/show_bug.cgi?id=75594



