[Webkit-unassigned] [Bug 98694] [Performance] Speed-up DOM tree traversal on ARM platforms

bugzilla-daemon at webkit.org bugzilla-daemon at webkit.org
Tue Oct 16 16:38:20 PDT 2012


https://bugs.webkit.org/show_bug.cgi?id=98694





--- Comment #19 from Kulanthaivel Palanichamy <kulanthaivel at codeaurora.org>  2012-10-16 16:39:11 PST ---
(In reply to comment #6)
> (From update of attachment 167661 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=167661&action=review
> 
> Thanks for submitting this. It’s cool that this can make DOM traversal much faster. I’m sure we can figure out a form of this that can go into the tree.
> 

Thanks, this is very encouraging.

> There are lots of design decisions that are not explained in this code. Need more comments explaining why, not just mystery code.
> 
> Clearly this prefetching code could make some operations slower. It’s good that it makes benchmarks faster, but we should also construct the pathological cases that become slower so we can see how much slower.
> 
This code definitely makes some of the tests slower in our WebKit performance suite itself. It may not be possible to reach a point where there is
no regression at all, but I'm working on addressing them one by one to reduce the regression to a level where it is acceptable.

> > Source/WTF/wtf/Compiler.h:222
> > +#define PREFETCH(x)
> 
> This isn’t valuable. Seems highly unlikely we will want to compile code that uses the PREFETCH macro without also having an ENABLE conditional around it. I think we can just leave this out.
> 
Renamed it to BUILTIN_PREFETCH to match the compiler __builtin_XXX naming format.

> > Source/WTF/wtf/Platform.h:1033
> > +/* Prefetch to minimize cache-miss latency by moving data into a cache before it is accessed */
> > +#if !defined(BUILTIN_PREFETCH) && CPU(ARM) && COMPILER_SUPPORTS(BUILTIN_PREFETCH)
> > +#define ENABLE_PREFETCH 1
> > +#endif
> 
> I don’t understand the !defined(BUILTIN_PREFETCH) in this expression.
It was a mistake. It suppose to be !define(ENABLE_PREFETCH), now it is renamed as ENABLE_PLD_DOM_OPTIMIZATION.

> 
> I don’t like the use of CPU(ARM) here, without a comment mentioning why it’s appropriate.
> 
Initially I tested this patch only on ARM, it looks like it benefits even x86.

> I don’t think ENABLE(PREFETCH) is specific enough. There are so many things that prefetch could mean. We should chose a name that is specific enough to make it clear that this is memory prefetching.
> 
> > Source/WebCore/dom/ContainerNode.h:279
> > +    prefetch();
> 
> This function name is unclear. It doesn’t say what is being fetched. It’s not appropriate to have a function named just “prefetch” on a node. Need a human-comprehensible name.
> 
Renamed it to preloadNextNodeIntoCPUCache(). Though it doesn't preload the next node into CPU cache, it preloads a node that is few nodes ahead in the traversal path. 

> At each call site we need to say why it’s right to call prefetch there. Otherwise it just looks like you made magic decisions about where to put these calls, and no one in the future can figure out whether it’s safe to remove them and where they need to be added then.
> 
> > Source/WebCore/dom/ContainerNode.h:287
> > +    prefetch();
> 
> Not good to put this above the comment below, which should be at the top of the function.
> 
Corrected.

> > Source/WebCore/dom/Node.h:103
> > +const int prefetchStride = 6;
> 
> The mystery magic constant needs a comment explaining where it came from. If you determined the value by testing, it needs to explain that. If you determined it by some theoretical means it needs to explain that.
> 
Done.

> > Source/WebCore/dom/Node.h:268
> > +    void updatePrefetchTarget();
> > +    void setPrefetchTarget(Node *prefetch) { m_prefetchTarget = prefetch; }
> 
> These new functions shouldn’t be crammed in with the setPrevious/NextSibling call. They should go in their own paragraph with comments that help explain the mechanism.
>
Done.

> > Source/WebCore/dom/Node.h:269
> > +    void setNextSibling(Node* next) { m_next = next; updatePrefetchTarget(); }
> 
> This is an inappropriate level to add the prefetch updating. It should be done at call sites, not in this setter, unless that’s completely impractical. This low level setter needs to remain a low level setter.
>
Done. Now updateCPUCachePreloadTarget() is getting called from proper call sites.

> > Source/WebCore/dom/Node.h:815
> > +#if ENABLE(PREFETCH)
> > +    Node* m_prefetchTarget;
> > +#endif
> 
> Is it important to have this data member here? Could it be put at the end of the data members instead where it would not be in the middle of others, making them confusing.
>
Moved it to the end.

> I don’t think m_prefetchTarget is the right name for this. Instead I would like it to be something that expresses more clearly what it’s supposed to represent, rather than how it’s used.
>
Renamed it to m_cpuCachePreloadTarget.

> > Source/WebCore/dom/Node.h:909
> > +    if (m_next) {
> > +        Node* from = this;
> > +        Node* n = from->traversePreviousNodePostOrder();
> > +        for (int skew = prefetchStride; skew && n; skew--) {
> > +            from = n;
> > +            n = n->traversePreviousNodePostOrder();
> > +        }
> > +        from->setPrefetchTarget(m_next);
> > +    }
> 
> This function is completely mysterious to me. There’s no why comment explaining why going backwards is what we want to do. No explanation of what “from” means as a variable name, or what “skew” means.
>
I've added some minimal comments in this function, but explaining the complete logic in comments seems to be too much in Node.h, so I've added the bugzilla link in comments
for further reference.

> I gather that m_prefetchTarget is a sort of inaccurate “skip” pointer that contains an often incorrect, often correct pointer to the sixth sibling from the current sibling.
> 
> I can make guesses about why it does what it does, but I should not have to guess. You should be explaining this in comments.
> 
> > Source/WebCore/dom/Node.h:916
> > +    if (LIKELY(m_prefetchTarget != 0))
> 
> Need a comment here explaining it’s OK to have a bad pointer as the prefetch target; otherwise this code looks like a bug because it uses a dangling pointer.
> 
> Do we really want this branch? If it’s OK to prefetch with any pointer, then why not initialize m_prefetchTarget to this rather than having 0 as a magic value?
> 
Removing NULL check regresses many of the performance tests.
Technically passing NULL to PLD should result in NOP, but it looks like it carries some penalty on ARM & x86.

> > Source/WebCore/dom/Node.h:917
> > +        PREFETCH(((char *) m_prefetchTarget));
> 
> The cast to (char *) should be something that’s part of the PREFETCH macro rather than done at each call site, since it’s part of the idiosyncrasy of a particular compiler’s feature and might be different in other compilers.
Done.

-- 
Configure bugmail: https://bugs.webkit.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.


More information about the webkit-unassigned mailing list