[Webkit-unassigned] [Bug 26694] should we scan beyond 1kB for meta charset?

bugzilla-daemon at webkit.org bugzilla-daemon at webkit.org
Wed Feb 8 10:13:55 PST 2012


https://bugs.webkit.org/show_bug.cgi?id=26694





--- Comment #10 from Jungshik Shin <jshin at chromium.org>  2012-02-08 10:13:55 PST ---
I think it'd be easier to interpret the statistics if you run the tally ONLY on documents that do NOT have a 'charset' declaration in the Content-Type HTTP response header field.

Because the charset value in the HTTP response header has a higher priority than a meta charset declaration, the position of the 'meta charset' declaration does not matter at all if a charset is present in the HTTP response header.
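
To illustrate the priority order described above, here is a minimal sketch. The function names are hypothetical, the Content-Type parsing is deliberately simplified, and this is not WebKit's actual code.

```python
import re

# Hypothetical helpers for illustration only; not WebKit's implementation.
# Simplified parse of the charset parameter in a Content-Type value.
_CHARSET_PARAM = re.compile(r'charset\s*=\s*"?([^\s;"]+)', re.IGNORECASE)


def header_charset(content_type):
    """Return the charset from a Content-Type header value, or None."""
    if not content_type:
        return None
    match = _CHARSET_PARAM.search(content_type)
    return match.group(1).lower() if match else None


def effective_charset(content_type, meta_charset, default_charset):
    """Pick the charset: HTTP header first, then meta, then the user default.

    meta_charset is whatever a prescan found (or None), so its position in
    the document only matters when the header carries no charset.
    """
    return header_charset(content_type) or meta_charset or default_charset
```

With these assumptions, a document served as "text/html; charset=UTF-8" resolves to utf-8 no matter where (or whether) its meta charset appears.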

Alternatively, you can treat any document with a charset in the HTTP header as if its meta charset were declared at position 0 when gathering the statistics. That way, the benefit of going beyond 1024 bytes would be clearly shown in terms of relative frequency among all web documents. For that reason, I prefer this second approach.
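
A rough sketch of the second approach, under stated assumptions: the Doc record and its field names are hypothetical, and a declaration counts as found when its starting byte offset is below the scan limit (a simplification).

```python
from typing import NamedTuple, Optional


class Doc(NamedTuple):
    """Hypothetical per-document record for the tally."""
    has_header_charset: bool     # charset present in the Content-Type header
    meta_offset: Optional[int]   # byte offset of meta charset, None if absent


def effective_offset(doc):
    """Treat a header charset as a meta charset declared at position 0."""
    if doc.has_header_charset:
        return 0
    return doc.meta_offset


def coverage(docs, scan_limit):
    """Fraction of ALL documents whose charset is found within scan_limit."""
    if not docs:
        return 0.0
    found = 0
    for doc in docs:
        offset = effective_offset(doc)
        if offset is not None and offset < scan_limit:
            found += 1
    return found / len(docs)
```

Comparing coverage(docs, 1024) with coverage(docs, 2048) over the same sample then gives the gain from scanning further as a share of all documents, not only of those that rely on meta charset.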

BTW, I don't think it is relevant to this bug that the 'default charset' (the one used when no other information is available) sometimes matches the actual document encoding. The 'default charset' value is user-dependent, so what works for one user does not work for another user with a different default.



