[webkit-dev] HTML5 tokenizer landing soon
Oliver Hunt
oliver at apple.com
Mon Jun 14 12:47:03 PDT 2010
We have historically not taken patches that "will certainly be faster" without evidence that they will be faster -- Adam already said this will regress performance, which makes me sad :-(
--Oliver
On Jun 14, 2010, at 12:11 PM, Eric Seidel wrote:
> The new parser will certainly be faster than the old, mostly because
> it's now hackable. The old parser was un-touchable for fear of
> breaking the world. This one is tested, perf-tested, documented and
> much better designed. May the optimizing begin!
>
> -eric
>
> On Mon, Jun 14, 2010 at 12:07 PM, Adam Barth <abarth at webkit.org> wrote:
>> On Mon, Jun 14, 2010 at 11:05 AM, Oliver Hunt <oliver at apple.com> wrote:
>>> Have you done perf testing?
>>
>> Yes. We've been working with our parsing benchmark:
>>
>> http://trac.webkit.org/browser/trunk/WebCore/benchmarks/parser/html-parser.html
>>
>>> What's the change?
>>
>> Last time we measured, the new parser was ~1% slower than the old
>> parser. I believe parsing accounts for <5% of PLT, so that
>> corresponds to a <0.05% slowdown on PLT, which is, AFAIK,
>> unmeasurable. We'll double check perf before we switch over.
>>
>> We think the new parser will end up being faster than the old parser.
>> We've done just enough performance optimization to remove perf as a
>> blocking issue for switching over. There's a bunch more we can do.
>> For example, we're currently wasting a bunch of time converting
>> new-style tokens into old-style tokens to feed them to the old tree
>> constructor. Once we start working on phase 2 (the HTML5 tree
>> constructor), we won't need to waste time there.
>>
>> Adam
>>
>>
>>> On Jun 13, 2010, at 10:21 PM, Adam Barth wrote:
>>>
>>>> People of WebKit,
>>>>
>>>> As mentioned recently on webkit-dev, Eric, Tonyg, and I have been
>>>> working on implementing the HTML5 parsing algorithm in WebKit:
>>>>
>>>> http://www.mail-archive.com/webkit-dev@lists.webkit.org/msg11472.html
>>>>
>>>> We're now ready to turn the new tokenization algorithm on by default
>>>> (probably early this week). The new code passes all the existing
>>>> LayoutTests, with the exception of roughly 40 tests that "expect"
>>>> behavior that violates the HTML5 specification [1].
>>>>
>>>> There are some differences between the old parser and the HTML5
>>>> parser. We've written up a brief document outlining those
>>>> differences:
>>>>
>>>> https://docs.google.com/document/edit?id=1as5xYjyMSCph4960iz0-Kb7hZKf_L6f2vts57NMcVBI&hl=en
>>>>
>>>> If these differences cause real compatibility issues on the web, we
>>>> should contribute this information to the working group so we can
>>>> improve the specification. If these differences cause compatibility
>>>> issues for WebKit-specific HTML (e.g., for Dashboard widgets), we
>>>> might need to add a flag to support some subset of these parsing
>>>> quirks for non-web uses of WebKit.
>>>>
>>>> Please be on the lookout for parsing-related regressions and CC Eric,
>>>> Tonyg, and me on the bugs. There's still a lot of work to do
>>>> (including implementing the tree construction algorithm), but turning
>>>> the tokenization code on by default is an important milestone for the
>>>> project.
>>>>
>>>> Happy parsing,
>>>> Adam
>>>>
>>>> [1] See https://spreadsheets.google.com/ccc?key=0AppchfQ5mBrEdDFJUW5DOGNsdmtvZkN0ZmIzMjdaT0E&hl=en
>>>> for details.
>>>> _______________________________________________
>>>> webkit-dev mailing list
>>>> webkit-dev at lists.webkit.org
>>>> http://lists.webkit.org/mailman/listinfo.cgi/webkit-dev
>>>
>>>
>>
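The slowdown estimate in Adam's reply (a parser that is ~1% slower, with parsing at under 5% of PLT) can be checked with a few lines of arithmetic. This is only a sketch: the function name and the assumption that all non-parsing time stays unchanged are illustrative and do not come from the thread, and 5% is used as the upper bound on the parsing share.

```python
def plt_slowdown(parser_slowdown: float, parser_share_of_plt: float) -> float:
    """Fractional page-load-time (PLT) slowdown.

    Hypothetical helper: assumes only the parsing portion of PLT gets
    slower and every other portion of the page load is unchanged.
    """
    return parser_slowdown * parser_share_of_plt


# Figures from the thread: parser ~1% slower, parsing < 5% of PLT.
# Using 5% as the upper bound gives an upper bound on the PLT slowdown.
estimate = plt_slowdown(parser_slowdown=0.01, parser_share_of_plt=0.05)
print(f"PLT slowdown upper bound: {estimate:.4%}")
```

With those inputs the bound is 0.01 * 0.05 = 0.0005, i.e. 0.05% of PLT, which matches the figure quoted in the reply.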