[webkit-dev] HTML5 tokenizer landing soon

David Hyatt hyatt at apple.com
Mon Jun 14 14:09:04 PDT 2010

On Jun 14, 2010, at 4:05 PM, Adam Barth wrote:

> On Mon, Jun 14, 2010 at 1:53 PM, David Hyatt <hyatt at apple.com> wrote:
>> On Jun 14, 2010, at 3:48 PM, Adam Barth wrote:
>>> We ended up using the same algorithm as the old tokenizer to manage
>>> insertion points; however, we moved all the work into a separate
>>> InputStream data structure:
>>> http://trac.webkit.org/browser/trunk/WebCore/html/HTML5DocumentParser.h#L75
>>> The old code was actually pretty clever once I figured out what it was
>>> doing.  We're considering moving InputStream into its own file instead
>>> of keeping it as an inner class of the document parser.
>> If you're talking about the segmented string stuff, I added that to the existing tokenizer. :)
> Yeah, well, I already knew you were a clever guy.  :)

In all seriousness, though, those are the kinds of optimizations we need to make sure move over.  I remember other optimizations we did around quick comparisons, some AtomicString stuff, and the SegmentedString stuff.  As long as those moved over, I would think you should see comparable performance.  We just need to make sure not to lose that work in the transition (and it sounds like we haven't, so good).
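
For readers who have not looked at the segmented-string idea mentioned above, here is a rough sketch of why it helps with insertion points. It is only an illustration under stated assumptions: the class name SegmentedInput, its methods, and the drain helper are hypothetical and are not WebKit's actual SegmentedString or InputStream API. The point it tries to show is that text written by a script can be placed in front of the unconsumed input as a new segment, without rebuilding one large buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>
#include <utility>

// Hypothetical sketch of a segmented input stream. This is NOT WebKit's
// SegmentedString or InputStream; all names here are made up for illustration.
class SegmentedInput {
public:
    // Data arriving from the network goes at the end of the stream.
    void append(std::string text) {
        if (!text.empty())
            m_segments.push_back({std::move(text), 0});
    }

    // Script-written text goes at the insertion point, i.e. in front of
    // everything that has not been consumed yet. The partially consumed
    // segment keeps its own offset, so existing text is not touched.
    void prepend(std::string text) {
        if (!text.empty())
            m_segments.push_front({std::move(text), 0});
    }

    bool atEnd() const { return m_segments.empty(); }

    // The next character the tokenizer would see.
    char current() const {
        assert(!atEnd());
        const Segment& segment = m_segments.front();
        return segment.text[segment.offset];
    }

    // Consume one character, dropping the front segment once it is used up.
    void advance() {
        assert(!atEnd());
        Segment& segment = m_segments.front();
        if (++segment.offset == segment.text.size())
            m_segments.pop_front();
    }

private:
    // Invariant: every stored segment has at least one unconsumed character.
    struct Segment {
        std::string text;
        std::size_t offset;
    };
    std::deque<Segment> m_segments;
};

// Hypothetical helper: read everything that is left, in stream order.
std::string drain(SegmentedInput& input) {
    std::string result;
    while (!input.atEnd()) {
        result += input.current();
        input.advance();
    }
    return result;
}
```

With this sketch, appending "<p>ab", consuming one character, then prepending "XY" makes the stream continue with "XY" before the rest of the first segment. A real tokenizer would need much more (line and column tracking, lookahead, character decoding), so treat this as a picture of the data structure rather than of the parser.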

