Harmonizing content sniffing
My colleagues and I have put together a web site that makes it easy to compare the MIME signatures used by Internet Explorer 7, Firefox 3, Safari 3.1, Google Chrome, and the HTML 5 specification:

http://webblaze.cs.berkeley.edu/2009/content-sniffing/

I'm hoping we can use this information to converge the content sniffing algorithms used by different browsers and then update the HTML 5 spec to reflect the consensus. I know some changes to the content sniffing algorithm will be contentious (such as whether to sniff HTML from text/plain), but I'm hoping other changes will be easier. For example, Firefox, IE, and Safari all use different signatures for JPEG.

When I raised this issue before on this list, I got the impression that the WebKit project was generally receptive to changing its content sniffing algorithm. To get the ball rolling, I'd suggest making the following changes:

1) Remove the fourth byte of the JPEG signature to match Firefox 3, Google Chrome, and the HTML 5 spec. (Internet Explorer uses a two-byte signature.)
2) Add a signature for GIF. Currently, Internet Explorer 7, Firefox 3, Google Chrome, and the HTML 5 spec all have signatures for GIF, but Safari 3.1 does not.
3) Add a signature for PNG. Currently, Internet Explorer 7, Firefox 3, Google Chrome, and the HTML 5 spec all have signatures for PNG, but Safari 3.1 does not.

I think the next logical step is for me to file a bug about these issues, but I wanted to send an email to the list to provide some context for that bug report.

Thanks,
Adam
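For readers following along, the three proposed changes can be sketched as a small signature table. This is only an illustration, not the code of any browser: the names `SIGNATURES` and `sniff_image` are made up for this example, and the byte values are the well-known magic numbers of each format, which may differ in detail from the per-browser signatures listed on the comparison site.

```python
# Illustrative signature table: (leading bytes, sniffed MIME type).
# Byte values are the standard magic numbers for each format; the
# exact signatures used by each browser may differ.
SIGNATURES = [
    (b"\xff\xd8\xff", "image/jpeg"),       # three-byte JPEG prefix (change 1)
    (b"GIF87a", "image/gif"),              # GIF, both versions (change 2)
    (b"GIF89a", "image/gif"),
    (b"\x89PNG\r\n\x1a\n", "image/png"),   # eight-byte PNG header (change 3)
]


def sniff_image(data: bytes):
    """Return the MIME type whose signature prefixes `data`, or None."""
    for prefix, mime in SIGNATURES:
        if data.startswith(prefix):
            return mime
    return None
```

With a three-byte JPEG prefix, a file whose fourth byte differs from the one a four-byte signature expects is still recognized as JPEG.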
On Nov 14, 2008, at 2:06 PM, Adam Barth wrote:
When I raised this issue before on this list, I got the impression that the WebKit project was generally receptive to changing its content sniffing algorithm. To get the ball rolling, I'd suggest making the following changes:
As you know, today’s WebKit project does not have a content sniffing algorithm. Apple software that it uses for networking does have a sniffing algorithm. I believe earlier you proposed moving the sniffing into the WebKit project. If the list of changes here is proposed for the Apple software beneath WebKit, then we need to use a different forum. The folks responsible for that software don't participate on this mailing list. I think the proposals themselves sound good. One way to start the ball rolling is to file a bug at bugreport.apple.com with these suggested changes and cite the WebKit context in that bug report. If you do that, please give me the bug number so I can push the changes here within Apple. -- Darin
On Fri, Nov 14, 2008 at 2:21 PM, Darin Adler <darin@apple.com> wrote:
I believe earlier you proposed moving the sniffing into the WebKit project.
Yes.
One way to start the ball rolling is to file a bug at bugreport.apple.com with these suggested changes and cite the WebKit context in that bug report. If you do that, please give me the bug number so I can push the changes here within Apple.
I can do that if you like, or I can implement a content sniffing algorithm in WebKit itself. I think moving the algorithm into WebKit proper has a couple of benefits:

1) All the ports can use exactly the same algorithm, increasing compatibility between different ports.
2) Having the algorithm be open-source helps folks who are trying to design filters for their web sites.

One disadvantage of moving the algorithm is that we might make some unintended changes. The issue that blocked this previously was the lack of documentation about CFNetwork's current content sniffer. Hopefully we've resolved this by extracting the signatures from CFNetwork accurately.

Adam
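To illustrate the second benefit, here is a rough sketch of the kind of site-side filter that a published signature list would make easier to write. It is an assumption-laden example: `ALLOWED` and `accept_upload` are hypothetical names, and the byte values are the standard magic numbers for each format rather than signatures extracted from CFNetwork or any other browser component.

```python
# Hypothetical allow-list: declared MIME type -> acceptable leading bytes.
# The values are standard format magic numbers, not any browser's table.
ALLOWED = {
    "image/jpeg": (b"\xff\xd8\xff",),
    "image/gif": (b"GIF87a", b"GIF89a"),
    "image/png": (b"\x89PNG\r\n\x1a\n",),
}


def accept_upload(declared_mime: str, data: bytes) -> bool:
    """Accept an upload only if its bytes begin with a signature
    registered for the MIME type the uploader declared."""
    prefixes = ALLOWED.get(declared_mime)
    if prefixes is None:
        return False  # unknown types are rejected in this sketch
    return data.startswith(prefixes)
```

A real filter would also need to consider what each browser might sniff the content as, which is exactly where knowing the browsers' actual signatures helps.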
On Nov 14, 2008, at 2:32 PM, Adam Barth wrote:
One disadvantage of moving the algorithm is that we might make some unintended changes.
Another disadvantage is that any other Mac OS X libraries or applications that rely on the sniffing done by CFNetwork will no longer get the same results as WebKit clients. Also, I don't think we've established yet if we can turn off the sniffing in CFNetwork and keep the other desirable CFNetwork features working. There are a number of closely related features, such as the one that determines the suggested filename for a download. I think the concept of moving the sniffing into WebKit is great if we can execute it successfully. I'm worried that it may be difficult to do that in the Mac OS X version and the Windows version used by Safari. -- Darin
Ok. I'll file a bug report at http://bugreport.apple.com/ with several things that should be easy to change. In parallel, we can investigate moving the sniffing algorithm to WebKit. Thanks, Adam
Has there been any progress with this effort? Specifically, I'm wondering if there is a bug on bugs.webkit.org that we are using to track this task. Cheers, Adam On Friday 14 November 2008 5:53:56 pm Adam Barth wrote:
Ok. I'll file a bug report at http://bugreport.apple.com/ with several things that should be easy to change. In parallel, we can investigate moving the sniffing algorithm to WebKit.
Thanks, Adam
On Fri, 14 Nov 2008 14:40:44 -0800, Darin Adler <darin@apple.com> wrote:
On Nov 14, 2008, at 2:32 PM, Adam Barth wrote:
One disadvantage of moving the algorithm is that we might make some unintended changes.
Another disadvantage is that any other Mac OS X libraries or applications that rely on the sniffing done by CFNetwork will no longer get the same results as WebKit clients.
Also, I don't think we've established yet if we can turn off the sniffing in CFNetwork and keep the other desirable CFNetwork features working. There are a number of closely related features, such as the one that determines the suggested filename for a download.
I think the concept of moving the sniffing into WebKit is great if we can execute it successfully. I'm worried that it may be difficult to do that in the Mac OS X version and the Windows version used by Safari.
Hey, I love the idea of unifying content sniffing across rendering engines. So why not aim for establishing something like an agreement which, for a start, would be implemented in both CFNetwork and WebKit, with a #define determining whether WebKit's or CFNetwork's sniffing is used, for the sake of running tests? In the end, the Apple ports could continue using the platform's implementation by default, to keep guaranteed consistency with other network clients that are not using WebKit, even if CFNetwork may not be able to use the exact same logic for whatever reason. That way everyone could benefit from the improvements. Just my 2 pfennig, Christian
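The switchable arrangement suggested above might look roughly like the following sketch. Everything here is hypothetical: the flag `USE_ENGINE_SNIFFER` stands in for the proposed #define, `platform_sniff` only models the Safari 3.1 behavior as described earlier in the thread (a four-byte JPEG signature, no GIF or PNG signatures), and the fourth JPEG byte `0xE0` is an assumption since the thread does not state its value.

```python
# Stand-in for the proposed #define: choose which sniffer is active.
USE_ENGINE_SNIFFER = True

# Models the platform sniffer as described in the thread. The fourth
# byte (0xE0) is an assumed value used for illustration only.
PLATFORM_SIGNATURES = [
    (b"\xff\xd8\xff\xe0", "image/jpeg"),
]

# Models an engine-side sniffer with the proposed changes applied.
ENGINE_SIGNATURES = [
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
    (b"\x89PNG\r\n\x1a\n", "image/png"),
]


def _match(signatures, data):
    """Return the first MIME type whose prefix matches, else None."""
    for prefix, mime in signatures:
        if data.startswith(prefix):
            return mime
    return None


def platform_sniff(data: bytes):
    return _match(PLATFORM_SIGNATURES, data)


def engine_sniff(data: bytes):
    return _match(ENGINE_SIGNATURES, data)


def sniff(data: bytes):
    """Dispatch to whichever implementation the flag selects."""
    return engine_sniff(data) if USE_ENGINE_SNIFFER else platform_sniff(data)


def disagreements(samples):
    """Return the samples on which the two sniffers give different answers."""
    return [s for s in samples if platform_sniff(s) != engine_sniff(s)]
```

Running both implementations over the same test corpus and listing the disagreements is one way a port could check for unintended changes before switching its default.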
On Fri, Nov 14, 2008 at 2:21 PM, Darin Adler <darin@apple.com> wrote:
One way to start the ball rolling is to file a bug at bugreport.apple.com with these suggested changes and cite the WebKit context in that bug report. If you do that, please give me the bug number so I can push the changes here within Apple.
Done. The bug number is 6373681. The changes in that report should be clear wins because they improve compatibility, security, and standards compliance. I think there are some more changes that are worth making, but this is a good place to start. Adam
participants (4)
- Adam Barth
- Adam Treat
- Christian Dywan
- Darin Adler