[webkit-dev] Enabling the HTML5 tree builder soon

Adam Barth abarth at webkit.org
Mon Jul 26 05:57:14 PDT 2010

Would someone from Apple be willing to run the patch below through the
PLT?  We're doing well on our parsing benchmark (4% speedup), but the
PLT might have a different mix of HTML.


diff --git a/WebCore/html/HTMLTreeBuilder.cpp b/WebCore/html/HTMLTreeBuilder.cpp
index 7a9c295..5b89c37 100644
--- a/WebCore/html/HTMLTreeBuilder.cpp
+++ b/WebCore/html/HTMLTreeBuilder.cpp
@@ -327,7 +327,7 @@ HTMLTreeBuilder::HTMLTreeBuilder(HTMLTokenizer* tokenizer, HTMLDocument* documen
     , m_originalInsertionMode(InitialMode)
     , m_secondaryInsertionMode(InitialMode)
     , m_tokenizer(tokenizer)
-    , m_legacyTreeBuilder(shouldUseLegacyTreeBuilder(document) ? new LegacyHTMLTreeBuilder(document, reportErrors) : 0)
+    , m_legacyTreeBuilder(0)
     , m_lastScriptElementStartLine(uninitializedLineNumberValue)
     , m_scriptToProcessStartLine(uninitializedLineNumberValue)
     , m_fragmentScriptingPermission(FragmentScriptingAllowed)

On Thu, Jul 22, 2010 at 3:30 AM, Adam Barth <abarth at webkit.org> wrote:
> We're getting close to enabling the HTML5 tree builder on trunk.  Once
> we do that, we'll have the core of the HTML5 parsing algorithm turned
> on, including SVG-in-HTML.  There are still a bunch of details left to
> finish (such as fragment parsing, MathML entities, and better error
> reporting), but this marks a significant milestone for this work.
>
> The tree builder is markedly more complicated than the tokenizer, and
> I'm sure we're going to have some bad regressions.  I'd like to ask
> your patience and your help to spot and triage these regressions.
>
> We've gotten about as much mileage as we can out of the HTML5lib test
> suite and the LayoutTests.  The next step is to see how the
> algorithm works in the real world.
>
> There are about 84 tests that will require new expectations, mostly
> due to invisible differences in render tree dumps (e.g., one more or
> fewer 0x0 render text).  In about half the cases, we've manually
> verified that our new results agree with the Firefox nightly builds,
> which is great from a compliance and interoperability point of view.
> The other half involve things like the exact text for <isindex>,
> which we've chosen to match the spec exactly, or the <keygen> element,
> which needs some shadow DOM love to hide its implementation details
> from web content.
>
> As for performance, last time we ran our parser benchmark, the new
> tree builder was 1% faster than the old tree builder.  There's still a
> bunch of low-hanging performance work we can do, such as atomizing
> strings and inlining functions.  If you're interested in performance,
> let me or Eric know and we can point you in the right direction.
>
> I don't have an exact timeline for when we're going to throw the
> switch, but sometime in the next few days.  If you'd like us to hold
> off for any reason, please let Eric or me know.
>
> Adam
>
> P.S., you can follow along by CCing yourself on the master bug,
> <https://bugs.webkit.org/show_bug.cgi?id=41123>, or by looking at our
> LayoutTest failure triage spreadsheet,
> <https://spreadsheets.google.com/ccc?key=0AlC4tS7Ao1fIdEo0SFdLaVpiclBHMVNQcHlTenV5TEE&hl=en>.

