[webkit-dev] Iterating SunSpider

Sat Jul 4 13:06:34 PDT 2009

On Sat, Jul 4, 2009 at 11:47 AM, Mike Belshe <mike at belshe.com> wrote:

> #3: The SunSpider harness has a variance problem due to CPU power savings
> modes.
>

This one worries me because it decreases the consistency/reproducibility of
test scores and makes it harder to compare engines or to track one engine's
scores over time.  For example, doing a bunch of CPU work just before
running the benchmark can affect whether and when the CPU throttles down
during the benchmark run.
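
To make the concern concrete, here is a rough sketch of how a harness might reduce and expose this kind of variance: discard a few warm-up runs, then report a mean with a confidence interval instead of a single number. This is not how the SunSpider or Dromaeo harness actually works; `measure`, its parameters, and the normal-approximation interval are all assumptions made up for illustration.

```python
import math
import statistics
import time


def measure(workload, warmup_runs=3, timed_runs=10, clock=time.perf_counter):
    """Time `workload` after warm-up; return (mean_ms, ci95_half_width_ms).

    Hypothetical helper, not part of any real benchmark harness.
    """
    if timed_runs < 2:
        raise ValueError("need at least 2 timed runs to estimate variance")

    # Warm-up runs are discarded. The hope (an assumption) is that they bring
    # the CPU to a steady clock speed before any timing starts.
    for _ in range(warmup_runs):
        workload()

    samples_ms = []
    for _ in range(timed_runs):
        start = clock()
        workload()
        samples_ms.append((clock() - start) * 1000.0)

    mean_ms = statistics.mean(samples_ms)
    # Normal approximation: 1.96 standard errors on either side of the mean.
    stderr = statistics.stdev(samples_ms) / math.sqrt(len(samples_ms))
    return mean_ms, 1.96 * stderr
```

A wide interval relative to the mean would be a hint that something like throttling disturbed the run and that the score should not be trusted for comparisons.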

> Possible solution:
> The dromaeo test suite already incorporates the SunSpider individual tests
> under a new benchmark harness which fixes all 3 of the above issues.   Thus,
> one approach would be to retire SunSpider 0.9 in favor of Dromaeo.
> http://dromaeo.com/?sunspider  Dromaeo has also done a lot of good work to
> ensure statistical significance of the results.  Once we have a better
> benchmarking framework, it would be great to build a new microbenchmark mix
> which more realistically exercises today's JavaScript.
>

One complaint I have heard about the Dromaeo tests (not the harness) is that
the actual JS that gets run differs from browser to browser (e.g. because it
is a direct copy of a source library that does UA sniffing).  If this is
true, it means that the suite as-is isn't useful for comparing engines to
each other.

However, the Dromaeo _harness_ is probably a win as-is.

Of course, changing anything about SunSpider raises the question of
tracking historical performance.  Perhaps the harness could support
versioning, or perhaps people are simply willing to say "SunSpider
1.0 scores cannot be compared to SunSpider 0.9 scores".  I believe the
latter is the approach the V8 benchmark takes.
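
As a sketch of what the versioning option could look like, each score could carry the suite version it was produced with, and the comparison code could refuse to mix versions. The `Score` type and `speedup` function below are hypothetical names for illustration only, not anything the existing harness provides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Score:
    """A benchmark result tagged with the suite version that produced it."""
    suite_version: str  # e.g. "0.9" or "1.0" (illustrative values)
    total_ms: float


def speedup(baseline: Score, candidate: Score) -> float:
    """Return baseline time divided by candidate time (>1 means faster).

    Raises ValueError when the two scores come from different suite versions,
    since such scores are not comparable.
    """
    if baseline.suite_version != candidate.suite_version:
        raise ValueError(
            f"cannot compare suite version {baseline.suite_version} "
            f"with {candidate.suite_version}"
        )
    return baseline.total_ms / candidate.total_ms
```

With this shape, historical 0.9 results stay usable among themselves, and a cross-version comparison fails loudly rather than silently producing a misleading ratio.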

PK