[webkit-dev] Iterating SunSpider

Tue Jul 7 16:21:56 PDT 2009

> What you seem to think is better would be to repeatedly update  
> sunspider every time that something gets faster, ignoring entirely  
> that the value in sunspider is precisely that it has not changed.
>
> Not quite what I'm saying :-)
>
> I'd like benchmarks to:
>     a) have meaning even as browsers change over time
>     b) evolve.  As new areas of JS (or whatever) become important,  
> the benchmark should have facilities to include them.
>
> Fair?  Good? Bad?

It's not unreasonable, but it can't be done on a whim, and changes  
cannot be made trivially.  Both re-weighting sunspider and adding new  
tests as things are made faster are incredibly hard to do soundly,  
because it becomes easy to end up obscuring meaningful data.

In the context of regex, for example, say sunspider had been reweighted  
for the current generation of JS engines before anyone had looked at  
regex.  Regex would not have stood out as being substantially slower,  
and would likely not have been investigated, leaving everyone with  
regex an order of magnitude slower than in current engines.  
That's why sunspider has not been updated: after what, a year and a  
half (?), it can still show areas where performance can be improved, and  
while it does that it's still useful.
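
To make that concrete, here is a rough sketch with made-up numbers.  The
category names and timings are hypothetical (not real sunspider results),
and `reweight` is only one assumed way a benchmark might be rebalanced:
scale every category so they all contribute equally on the current engine.

```python
# Hypothetical per-category timings in milliseconds (not real sunspider data).
timings_ms = {"3d": 40.0, "string": 55.0, "math": 35.0, "regexp": 400.0}


def share_of_total(timings):
    """Return each category's fraction of the total run time."""
    total = sum(timings.values())
    return {name: t / total for name, t in timings.items()}


def reweight(timings):
    """Assumed rebalancing: scale every category to the mean time on this engine."""
    mean = sum(timings.values()) / len(timings)
    weights = {name: mean / t for name, t in timings.items()}
    return {name: t * weights[name] for name, t in timings.items()}


def slowest(timings):
    """Name of the category that takes the largest share of the total."""
    return max(timings, key=timings.get)


before = share_of_total(timings_ms)
after = share_of_total(reweight(timings_ms))

# Unweighted, the slow category dominates the total and is easy to spot.
print(slowest(timings_ms), round(before["regexp"], 2))
# After rebalancing, every category has the same share, so nothing stands out.
print({name: round(share, 2) for name, share in after.items()})
```

With the unweighted timings the hypothetical regexp category accounts for
most of the total; after rebalancing on the same engine all four shares are
equal, which is exactly the signal that would have been lost.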

So determining when it is sensible to update sunspider is difficult.  
You may be right, and find that rebalancing shows new areas where  
performance can be improved, but if you're wrong you run the risk of  
changing the benchmark from something that is actually a useful  
development tool into something that is only useful for producing a  
number at the end.

> If we see one section of the test taking dramatically longer than  
> another then we can assume that we have not been paying enough  
> attention to performance in that area; this is how we originally  
> noticed just how slow the regex engine was.  If we had been  
> continually rebalancing the test over and over again we would not  
> have noticed this or other areas where performance could be (and  
> has) improved.  It would also break sunspider as a means for  
> tracking and/or preventing performance regressions.
>
> Of course, using old versions of the benchmark for regression  
> testing is not prohibited by iterating a benchmark.

But what happens when the benchmarks disagree about what counts as an  
improvement?  You can't improve performance with one benchmark while  
testing for regressions with another.

--Oliver