[webkit-dev] HTML5 tokenizer landing soon

Adam Barth abarth at webkit.org
Mon Jun 14 10:22:31 PDT 2010

On Mon, Jun 14, 2010 at 8:42 AM, Maciej Stachowiak <mjs at apple.com> wrote:
> On Jun 14, 2010, at 8:18 AM, Simon Fraser wrote:
>> On Jun 13, 2010, at 10:21 PM, Adam Barth wrote:
>>> As mentioned recently on webkit-dev, Eric, Tonyg, and I have been
>>> working on implementing the HTML5 parsing algorithm in WebKit:
>>> http://www.mail-archive.com/webkit-dev@lists.webkit.org/msg11472.html
>>> We're now ready to turn the new tokenization algorithm on by default
>>> (probably early this week).  The new code passes all the existing
>>> LayoutTests, with the exception of roughly 40 tests that "expect"
>>> behavior that violates the HTML5 specification [1].
>> Does the HTML5 tokenizer kick in for all HTML documents, or is there
>> some switching based on DOCTYPE?
> There definitely shouldn't be switching based on doctype. The HTML5
> parsing algorithm is specifically designed to handle existing Web
> content.

That's correct: there is no switching based on doctype.  The switch is
based on a WebCore::Setting.  If you think about it, switching based
on doctype would be hard anyway, because you need to start parsing the
document in order to find the doctype...
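
To make the ordering concrete, here is a minimal sketch, not the actual
WebKit code.  The names Settings, useHTML5Tokenizer, TokenizerKind, and
chooseTokenizer are all hypothetical and used only for illustration;
the real setting name is not given in this thread.  The point is that
the choice takes no document bytes as input, so it can be made before
parsing starts.

```cpp
// Hypothetical stand-in for a settings object; the field name is an
// assumption for illustration, not the real WebCore setting.
struct Settings {
    bool useHTML5Tokenizer = true;
};

// The two tokenizers a document could be routed to in this sketch.
enum class TokenizerKind { Legacy, HTML5 };

// The decision depends only on the setting.  No part of the document
// (and therefore no doctype) is needed to make it.
TokenizerKind chooseTokenizer(const Settings& settings)
{
    return settings.useHTML5Tokenizer ? TokenizerKind::HTML5
                                      : TokenizerKind::Legacy;
}
```

A doctype-based switch would instead need some of the input to be
tokenized before it could decide which tokenizer to use.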

