[webkit-dev] DOM tree traversal on detached nodes

Ojan Vafai ojan at chromium.org
Mon Jun 11 23:35:21 PDT 2012


On Mon, Jun 11, 2012 at 9:32 PM, Filip Pizlo <fpizlo at apple.com> wrote:

> I think that a lot of the performance penalty can be alleviated by:
>
> 1) Moving rarely executed paths into non-inlineable methods.
>
> 2) Marking the various refing methods ALWAYS_INLINE, after you've moved as
> much as possible out of line.
>
> 3) Using LIKELY and UNLIKELY where appropriate.
>

We should only add these when we can point to a measurable benefit IMO
(maybe that's what you meant by "appropriate"?). My experience is that
marking codepaths LIKELY/UNLIKELY very rarely has a measurable performance
improvement with modern compilers + x86 processors. I can't speak for arm
since I have less direct experience optimizing for arm.

The reason I suggest these things is that you're adding a lot of code in a
> bunch of methods, but a lot of that logic looks like it will execute
> infrequently. That reduces inlining opportunities. And in places where the
> methods are still inlined, you're adding code bloat that reduces icache
> locality and may blow out branch prediction caches.
>
> I'm also not sure that your benchmark is conclusive. What does the perf
> test have in common with things we know we care about, like say PLT?
>
> -Filip
>
> On Jun 11, 2012, at 8:45 PM, Kentaro Hara <haraken at chromium.org> wrote:
>
> > Thanks ggaren!
> >
> > I filed a bug here: https://bugs.webkit.org/show_bug.cgi?id=88834
> >
> >
> >> I believe you could achieve (a) (i.e., preserve all reachable nodes
> without help from the JavaScript garbage collector) with these semantics in
> the C++ DOM:
> >>
> >> ===== Design =====
> >>
> >> ref():
> >>
> >> ++this->refCount
> >> ++root()->refCount
> >>
> >> deref():
> >>
> >> --this->refCount
> >> --root()->refCount
> >
> > Actually I've tried the approach yesterday but confirmed 25.9%
> > performance regression, even if I have TreeShared hold a pointer to
> > the root. Please look at the WIP patch and the performance test in
> > https://bugs.webkit.org/show_bug.cgi?id=88834.
> >
> > What I've learned is that we must not insert any line to ref() and
> > deref(), which are performance sensitive:)
> >
> > I am now trying another approach that hacks Node destruction. Let's
> > discuss the implementation details in the bug.
> >
> > Thanks!
> >
> >
> > On Tue, Jun 12, 2012 at 12:18 PM, Geoffrey Garen <ggaren at apple.com>
> wrote:
> >>
> >> IMHO, (a) would be the best semantics.
> >>
> >>
> >> I agree, and I think that the specification actually does require this.
> >>
> >> The real issue here is how to fix this bug efficiently. I believe that
> Geoff Garen has been contemplating this and also has a proposal for how to
> do it.
> >>
> >>
> >> I believe you could achieve (a) (i.e., preserve all reachable nodes
> without help from the JavaScript garbage collector) with these semantics in
> the C++ DOM:
> >>
> >> ===== Design =====
> >>
> >> ref():
> >>
> >> ++this->refCount
> >> ++root()->refCount
> >>
> >> deref():
> >>
> >> --this->refCount
> >> --root()->refCount
> >>
> >> if !root()->refCount:
> >> assert(!this->refCount)
> >> for each descendent of root():
> >> delete descendent
> >> delete root()
> >>
> >> Remove 'this' from tree:
> >>
> >> assert(this->refCount) // clients must put a node into a RefPtr to
> remove it from a tree
> >>
> >> root()->refCount -= this->refCount
> >>
> >> for each descendent of this:
> >> this->refCount += descendent->refCount
> >>
> >> Insert 'this' into tree:
> >>
> >> root()->refCount += this->refCount
> >>
> >> for each descendent of this:
> >> this->refCount -= descendent->refCount
> >>
> >> What is "root()"?:
> >>
> >> If this is in a document:
> >> root() == document()
> >>
> >> else:
> >> root() == the highest link in the parentNode chain
> >>
> >>
> >> ===== FAQ =====
> >>
> >> Cycles?
> >>
> >> No cycles are possible because the DOM API does not allow cycles in a
> DOM tree.
> >>
> >> Is O(n) subtree insertion and removal OK?
> >>
> >> I believe these operations are already O(n).
> >>
> >> Is average case O(log n) access to disconnected root() OK?
> >>
> >> If n is small and/or ref/deref of disconnected subnodes are rare, then
> yes.
> >>
> >> Otherwise, we can optimize this case by putting a direct pointer to
> root() in rare data.
> >>
> >> Geoff
> >
> >
> >
> > --
> > Kentaro Hara, Tokyo, Japan (http://haraken.info)
> > _______________________________________________
> > webkit-dev mailing list
> > webkit-dev at lists.webkit.org
> > http://lists.webkit.org/mailman/listinfo.cgi/webkit-dev
> _______________________________________________
> webkit-dev mailing list
> webkit-dev at lists.webkit.org
> http://lists.webkit.org/mailman/listinfo.cgi/webkit-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.webkit.org/pipermail/webkit-dev/attachments/20120611/d4dc2644/attachment.html>


More information about the webkit-dev mailing list