[Webkit-unassigned] [Bug 17510] Acid3 test 26 takes >33ms
bugzilla-daemon at webkit.org
Sun May 25 14:51:48 PDT 2008
------- Comment #33 from mjs at apple.com 2008-05-25 14:51 PDT -------
(In reply to comment #32)
> (In reply to comment #26)
> > 5) 0.7% in Node::isReadOnlyNode().
> I have a patch that cuts this down to about 0.1% attached to the related bug.
> (I noticed unnecessary multiple calls to nodeType() in
> Other optimization thoughts (not the kind of great ideas Maciej posted, but
> still perhaps useful):
Comments on a few of these.
> B) DOM getOwnPropertySlot functions are taking a lot of time, more than 5%. The
> big costs are the PIC branch used to get access to the global table and the
> memory accesses in that table. And the size for the hash table for JSDOMNode,
> for example, is huge -- 4096 slots for the 19 values just to avoid the need to
> rehash. Maybe there's a better data structure we can use?
I think just inlining some of them (especially the Node and EventTargetNode
ones, which are never called directly) would help.
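On the data-structure question in B), one candidate (not measured, just a sketch) is a small sorted static table searched with a binary search, which needs no empty slots at all. Everything below is hypothetical: PropertyEntry, nodeProperties and findProperty are illustration names, the token values are made up, and only a handful of DOM Node property names are listed rather than the full 19.

```cpp
#include <algorithm>
#include <cassert>
#include <cstring>
#include <iterator>

// Hypothetical entry type; a real property table stores more per entry.
struct PropertyEntry {
    const char* name;
    int token; // made-up identifier for the property
};

// Must stay sorted by name in strcmp order for the binary search to work.
static const PropertyEntry nodeProperties[] = {
    { "firstChild", 0 },
    { "nextSibling", 1 },
    { "nodeName", 2 },
    { "nodeType", 3 },
    { "parentNode", 4 },
};

// Returns the matching entry, or nullptr if the name is not in the table.
inline const PropertyEntry* findProperty(const char* name)
{
    assert(name);
    const PropertyEntry* first = std::begin(nodeProperties);
    const PropertyEntry* last = std::end(nodeProperties);
    const PropertyEntry* it = std::lower_bound(first, last, name,
        [](const PropertyEntry& entry, const char* key) {
            return std::strcmp(entry.name, key) < 0;
        });
    if (it != last && std::strcmp(it->name, name) == 0)
        return it;
    return nullptr;
}
```

For 19 entries this is at most five string comparisons per lookup in one contiguous block of memory. Whether that beats a sparse hash table in practice would have to be profiled.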
> C) Overhead for render objects we never end up rendering is still a big part of
> the profile, even with Maciej's tear-down optimization. For example,
> CSSStyleSelector::styleForElement is 4.7% of the time and the total time for
> createRendererIfNeeded is 12.5%. And this is all for an element that's going to
> be removed before the next layout. If we can find a way to defer creating the
> renderer and never end up doing it at all then we will get a *big* speedup.
> Maybe we can change things so that renderers can be created at layout time.
I have thought about this but I am not sure it is good for the normal case.
> D) Tons of time is spent inside toJS, making wrappers for the text node and
> HTMLAnchorElement objects that are created and looking for the existing wrapper
> for the result of parentNode, nextSibling, and others. We could cut down a lot
> of hash table overhead if we were willing to put a pointer in every
> WebCore::Node to the corresponding JSNode.
I think two things would speed this up a fair bit without the extra pointer:
D.i) Make a version of toJS that takes a node which is known to be newly
created, and in that case skip the hash lookup for the existing wrapper, since
there can't be one.
D.ii) Add type-specific overloads for Text* and perhaps Element* or even
HTMLElement* to avoid dispatching on the node type in the newly created case.
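A rough sketch of D.i and D.ii together, under loud assumptions: Node, Text, JSNode, JSText, WrapperCache and toJSNewlyCreated below are simplified stand-ins invented for illustration, not the real WebCore or JavaScriptCore classes, and the cache is a plain std::unordered_map.

```cpp
#include <cassert>
#include <memory>
#include <unordered_map>
#include <utility>

// Hypothetical stand-ins for the DOM and wrapper classes.
struct Node { virtual ~Node() = default; };
struct Text : Node {};

struct JSNode {
    explicit JSNode(Node* node) : impl(node) {}
    virtual ~JSNode() = default;
    Node* impl;
};
struct JSText : JSNode {
    explicit JSText(Text* text) : JSNode(text) {}
};

struct WrapperCache {
    std::unordered_map<Node*, std::unique_ptr<JSNode>> wrappers;
    int lookups = 0; // counts lookups made by the general path

    // General path: has to look for an existing wrapper first.
    JSNode* toJS(Node* node) {
        ++lookups;
        auto it = wrappers.find(node);
        if (it != wrappers.end())
            return it->second.get();
        // Real code would dispatch on the node type here to pick the wrapper class.
        return store(node, std::unique_ptr<JSNode>(new JSNode(node)));
    }

    // D.i: the caller guarantees the node was just created, so no wrapper can exist.
    // D.ii: the Text* overload knows the wrapper class statically, so no dispatch.
    JSText* toJSNewlyCreated(Text* text) {
        assert(wrappers.find(text) == wrappers.end()); // debug-only sanity check
        return store(text, std::unique_ptr<JSText>(new JSText(text)));
    }

private:
    template <typename Wrapper>
    Wrapper* store(Node* node, std::unique_ptr<Wrapper> wrapper) {
        Wrapper* raw = wrapper.get();
        wrappers.emplace(node, std::move(wrapper));
        return raw;
    }
};
```

The wrapper still has to be inserted into the cache so that later parentNode or nextSibling calls can find it. What the new-node path saves is the up-front search and the type dispatch.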
> E) We could avoid making a lot of JS string wrappers if we could create code
> paths that would keep the UString in the engine register until the value needs
> to be used as a generic value. The expression "iteration " + i + " failed"
> results in creation of many string wrappers and it's never used for anything at
> all. Similarly, the expression "iteration " + i + " at " + d results in the
> creation of many string wrappers and ultimately it's passed to createTextNode,
> which takes a string parameter rather than an arbitrary object value. I suspect
> that we could speed up by more than 5% by reducing the number of string
> wrappers if we could get the strings to the "+" operator and to the function
> call without converting to a wrapper, but that's probably a tall order.
I think we can make a significant improvement without changing the register
representation, and adding only two opcodes. But I am not sure how general the
optimization is. Here's the idea:
a) The condition for this to apply is an AddNode chain where the leftmost
operand is a constant string (this means every addition is a string add). You
have to be careful though because "xx" + (i + z) can't assume all string adds.
b) In this case, emit code to load the constant string operands and convert
non-string operands to string (with a new tostring opcode) into sequential
registers.
c) Emit a new append_strings opcode which assumes all operands are primitive
strings, and concatenates them in one go into a fresh UString. This would both
reduce allocation work in UString (can do multiple appends all at once) and
reduce the number of temporaries created (in "iteration " + i + " at " + d, i
and d are still converted to string resulting in temporaries, but the GC values
for "iteration " + i and "iteration " + i + " at " need never be created,
cutting the number of unobserved string temporaries in half).
I am not sure whether this optimization is general enough to be worth adding
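To make steps a) and c) concrete, here is a minimal sketch. It assumes a toy expression tree and uses std::string in place of UString; Expr, collectStringAddOperands and appendStrings are hypothetical names for illustration, not existing JavaScriptCore code. Step b), the tostring conversion, is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy expression node (hypothetical).
struct Expr {
    enum Kind { StringConst, Other, Add };
    Kind kind;
    std::string text; // constant text, or a label for a non-constant operand
    const Expr* lhs;  // only used when kind == Add
    const Expr* rhs;  // only used when kind == Add
};

// Step a): walk the left spine of an Add chain, collecting operands left to
// right. Returns true only if the leftmost operand is a string constant.
// On a false return, the contents of `out` should be ignored.
inline bool collectStringAddOperands(const Expr& e, std::vector<const Expr*>& out)
{
    if (e.kind != Expr::Add) {
        out.push_back(&e);
        return e.kind == Expr::StringConst;
    }
    assert(e.lhs && e.rhs);
    if (!collectStringAddOperands(*e.lhs, out))
        return false;
    // Right operands stay opaque: in "xx" + (i + z), the inner (i + z) is one
    // operand and may still be a numeric add.
    out.push_back(e.rhs);
    return true;
}

// Step c): all operands are already strings, so size the buffer once and
// append everything into it.
inline std::string appendStrings(const std::vector<std::string>& operands)
{
    std::size_t total = 0;
    for (const std::string& s : operands)
        total += s.size();

    std::string result;
    result.reserve(total);
    for (const std::string& s : operands)
        result += s;
    return result;
}
```

For "iteration " + i + " at " + d the collector yields four operands, and appendStrings produces the final value without materializing the two intermediate strings.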
> L) NumberImp::toObject is taking 0.7% of the time. That's all being done to
> call toString() on the value of the date. Is there a way to avoid the toObject
> in that op_get_by_id opcode when it's a built-in function on Number? We already
> have to branch to convert to an object, but we could use that branch to instead
> types that avoids creating an object. Seems like it could be a 1% speedup.
It's possible to optimize property access on primitives, but a bit tricky. If
the property turns out to be a manually added prototype method that uses the
"this" value and possibly even stores it, then the toObject conversion has to
actually happen on entry to that function. Geoff and I have discussed ideas for how
to do this. Note that this would in general help strings more than numbers. It
would help some of the string tests on SunSpider quite a bit. I can give more
detail if needed.
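As a very rough model of the idea, assuming made-up types (NumberObject, Method and NumberPrototype are illustration names, not JavaScriptCore classes): built-in methods take the primitive directly, and the wrapper object is only allocated on entry to a function that might observe "this".

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>

// Hypothetical model of the object that toObject would allocate.
struct NumberObject { double value; };

struct Method {
    bool isBuiltin;                                      // built-ins can use the primitive
    std::function<std::string(double)> onPrimitive;      // used when isBuiltin
    std::function<std::string(NumberObject&)> onObject;  // used otherwise
};

struct NumberPrototype {
    std::map<std::string, Method> methods;
    int wrappersCreated = 0; // how many times the conversion really happened

    // Throws std::out_of_range if the method name is unknown.
    std::string callOnPrimitive(double value, const std::string& name) {
        const Method& m = methods.at(name);
        if (m.isBuiltin) {
            assert(m.onPrimitive);
            return m.onPrimitive(value); // fast path: no wrapper object
        }
        // A manually added method may use or store "this", so the conversion
        // has to happen, but only here, on entry to that function.
        assert(m.onObject);
        NumberObject boxed{value};
        ++wrappersCreated;
        return m.onObject(boxed);
    }
};
```

The hard part this sketch skips is knowing at the call site which case applies, since the prototype can be modified at any time.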