[webkit-dev] HTML5 tokenizer landing soon

Adam Barth abarth at webkit.org
Mon Jun 14 10:21:00 PDT 2010


There's a patch out for review that starts moving the PreloadScanner in that direction:

https://bugs.webkit.org/show_bug.cgi?id=40557

We want to keep the old preload scanner around for a bit in case we
need to switch back to the old tokenizer.  In the new world, the
preload scanner is very simple because the tokenization algorithm is
separate from the rest of what the old HTMLTokenizer class did (which
was a lot).
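
To illustrate why the separation helps, here is a minimal sketch, not WebKit code. It assumes a toy regex-based `tokenize` function that stands in for a real HTML5 tokenizer, and a hypothetical `scan_for_preloads` helper. The names and the set of tags checked are assumptions made for the example. The point is only that once tokens arrive from a separate component, the scanner reduces to a loop over start tags:

```python
import re
from typing import Dict, Iterable, Iterator, List, NamedTuple


class StartTag(NamedTuple):
    name: str
    attrs: Dict[str, str]


# Toy patterns: a stand-in for a real tokenizer, not the HTML5 algorithm.
_TAG = re.compile(r"<([a-zA-Z][a-zA-Z0-9]*)((?:\s+[^<>]*?)?)/?>")
_ATTR = re.compile(
    r"""([a-zA-Z-]+)\s*=\s*(?:"([^"]*)"|'([^']*)'|([^\s"'>]+))"""
)


def tokenize(html: str) -> Iterator[StartTag]:
    """Yield start tags only; end tags and text are ignored in this sketch."""
    for match in _TAG.finditer(html):
        attrs = {}
        for attr in _ATTR.finditer(match.group(2)):
            value = attr.group(2) or attr.group(3) or attr.group(4) or ""
            attrs[attr.group(1).lower()] = value
        yield StartTag(match.group(1).lower(), attrs)


def scan_for_preloads(tokens: Iterable[StartTag]) -> List[str]:
    """Collect URLs of resources that could be fetched ahead of parsing."""
    urls = []
    for token in tokens:
        if token.name in ("img", "script") and "src" in token.attrs:
            urls.append(token.attrs["src"])
        elif (
            token.name == "link"
            and token.attrs.get("rel", "").lower() == "stylesheet"
            and "href" in token.attrs
        ):
            urls.append(token.attrs["href"])
    return urls


if __name__ == "__main__":
    page = '<img src="a.png"><script src=app.js></script>'
    print(scan_for_preloads(tokenize(page)))
```

The scanner never touches raw characters, so swapping in a different tokenizer would not change it.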

Adam


On Mon, Jun 14, 2010 at 5:10 AM, Antti Koivisto <koivisto at iki.fi> wrote:
> Cool. Are you going to switch the PreloadScanner to the new tokenizer too?
>
>
>  antti
>
> On Mon, Jun 14, 2010 at 8:21 AM, Adam Barth <abarth at webkit.org> wrote:
>> People of WebKit,
>>
>> As mentioned recently on webkit-dev, Eric, Tonyg, and I have been
>> working on implementing the HTML5 parsing algorithm in WebKit:
>>
>> http://www.mail-archive.com/webkit-dev@lists.webkit.org/msg11472.html
>>
>> We're now ready to turn the new tokenization algorithm on by default
>> (probably early this week).  The new code passes all the existing
>> LayoutTests, with the exception of roughly 40 tests that "expect"
>> behavior that violates the HTML5 specification [1].
>>
>> There are some differences between the old parser and the HTML5
>> parser.  We've written up a brief document outlining those
>> differences:
>>
>> https://docs.google.com/document/edit?id=1as5xYjyMSCph4960iz0-Kb7hZKf_L6f2vts57NMcVBI&hl=en
>>
>> If these differences cause real compatibility issues on the web, we
>> should contribute this information to the working group so we can
>> improve the specification.  If these differences cause compatibility
>> issues for WebKit-specific HTML (e.g., for Dashboard widgets), we
>> might need to add a flag to support some subset of these parsing
>> quirks for non-web uses of WebKit.
>>
>> Please be on the lookout for parsing-related regressions and CC Eric,
>> Tonyg, and me on the bugs.  There's still a lot of work to do
>> (including implementing the tree construction algorithm), but turning
>> the tokenization code on by default is an important milestone for the
>> project.
>>
>> Happy parsing,
>> Adam
>>
>> [1] See https://spreadsheets.google.com/ccc?key=0AppchfQ5mBrEdDFJUW5DOGNsdmtvZkN0ZmIzMjdaT0E&hl=en
>> for details.
>> _______________________________________________
>> webkit-dev mailing list
>> webkit-dev at lists.webkit.org
>> http://lists.webkit.org/mailman/listinfo.cgi/webkit-dev
>>
>