[webkit-dev] Iterating SunSpider

Maciej Stachowiak mjs at apple.com
Sat Jul 4 15:30:06 PDT 2009

On Jul 4, 2009, at 1:06 PM, Peter Kasting wrote:

> On Sat, Jul 4, 2009 at 11:47 AM, Mike Belshe <mike at belshe.com> wrote:
> #3: The SunSpider harness has a variance problem due to CPU
> power-saving modes.
> This one worries me because it decreases the
> consistency/reproducibility of test scores and makes it harder to compare
> engines or to track one engine's scores over time.  For example,  
> doing a bunch of CPU work just before running the benchmark can  
> affect whether and when the CPU throttles down during the benchmark  
> run.
> Possible solution:
> The Dromaeo test suite already incorporates the individual SunSpider
> tests under a new benchmark harness that fixes all 3 of the above
> issues. Thus, one approach would be to retire SunSpider 0.9 in
> favor of Dromaeo. http://dromaeo.com/?sunspider  Dromaeo has also
> done a lot of good work to ensure statistical significance of the
> results. Once we have a better benchmarking framework, it would be
> great to build a new microbenchmark mix that more realistically
> exercises today's JavaScript.
> One complaint I have heard about the Dromaeo tests (not the harness)  
> is that the actual JS that gets run differs from browser to browser  
> (e.g. because it is a direct copy of a source library that does UA  
> sniffing).  If this is true it means that this suite as-is isn't  
> useful to compare engines to each other.
> However, the Dromaeo _harness_ is probably a win as-is.
> Of course, changing anything about SunSpider raises the question of
> tracking historical performance.  Perhaps the harness could support
> versioning, or perhaps people are simply willing to say "SunSpider
> 1.0 scores cannot be compared to SunSpider 0.9 scores".  I believe
> this is the approach the V8 benchmark takes.

I think versioning the test content is right, and we should do that
over time. I also think a change to avoid triggering power-saving
mode on Windows would be a reasonable thing to do to the harness
without a version change. I don't think Dromaeo is a good choice of
harness - I don't think their results are stable enough, and I am not
confident in the statistical soundness of their methodology.
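
To make the two concerns above concrete (CPU power-saving variance and
statistical soundness of the reported numbers), here is a minimal sketch
in Python. It is not SunSpider or Dromaeo code. The names spin,
time_runs and summarize, the 50 ms warm-up, and the use of a
normal-approximation 95% interval are all assumptions made for
illustration; a real harness would run in the browser and might choose
differently.

```python
import math
import statistics
import time


def spin(seconds):
    """Busy-loop for roughly `seconds` so the CPU is less likely to be throttled down."""
    deadline = time.perf_counter() + seconds
    counter = 0
    while time.perf_counter() < deadline:
        counter += 1
    return counter


def time_runs(workload, runs=10, warmup_seconds=0.05):
    """Return per-run wall-clock times in milliseconds, warming up before each run."""
    samples = []
    for _ in range(runs):
        # Hypothetical warm-up step; the duration is an assumed value.
        spin(warmup_seconds)
        start = time.perf_counter()
        workload()
        samples.append((time.perf_counter() - start) * 1000.0)
    return samples


def summarize(samples):
    """Return (mean, half-width of an approximate 95% interval) for the samples."""
    mean = statistics.mean(samples)
    if len(samples) < 2:
        return mean, float("nan")
    std_err = statistics.stdev(samples) / math.sqrt(len(samples))
    # 1.96 is the normal-approximation multiplier; a t-value would give
    # a wider interval for a small number of runs.
    return mean, 1.96 * std_err
```

Reporting the interval alongside the mean makes it easier to tell
whether two scores really differ or merely fall within run-to-run noise.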

