[webkit-dev] Iterating SunSpider

Mike Belshe mike at belshe.com
Tue Jul 7 19:02:38 PDT 2009


On Tue, Jul 7, 2009 at 5:08 PM, Maciej Stachowiak <mjs at apple.com> wrote:

>
> On Jul 7, 2009, at 4:19 PM, Peter Kasting wrote:
>
>> For example, the framework could compute both sums _and_ geomeans, if
>> people thought both were valuable.
>>
>
> That's a plausible thing to do, but I think there's a downside: if you make
> a change that moves the two scores in opposite directions, the benchmark
> doesn't help you decide if it's good or not. Avoiding paralysis in the face
> of tradeoffs is part of the reason we look primarily at the total score, not
> the individual subtest scores. The whole point of a meta-benchmark like this
> is to force ourselves to simplemindedly look at only one number.
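
A rough illustration of the tradeoff described above, as a minimal sketch. The subtest times are made up for the example and are not SunSpider results; the helper names total and geomean are hypothetical, not part of any harness.

```python
import math

def total(times):
    """Sum of subtest times in ms (lower is better)."""
    return sum(times)

def geomean(times):
    """Geometric mean of subtest times in ms (lower is better)."""
    return math.exp(sum(math.log(t) for t in times) / len(times))

# Hypothetical subtest times in ms; NOT real benchmark data.
before = [10.0, 100.0]
after = [5.0, 120.0]  # short test twice as fast, long test 20% slower

# The sum gets worse while the geomean gets better, so the two
# scores disagree about whether the change is an improvement.
print("sum:     %.1f -> %.1f" % (total(before), total(after)))
print("geomean: %.1f -> %.1f" % (geomean(before), geomean(after)))
```

With these assumed numbers the total rises while the geometric mean falls, which is the "opposite directions" case where reporting both scores would not settle the question.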
>
>> We could agree on a way of benchmarking a representative sample of current
>> sites to get an idea of how widespread certain operations currently are.  We
>> could talk with the maintainers of jQuery, Dojo, etc. to see what sorts of
>> operations they think would be helpful to future apps to make faster.  We
>> could instrument browsers to have some sort of (opt-in) sampling of
>> real-world workloads, etc.  Surely together we can come up with ways to
>> make SunSpider even better, while keeping its current strengths in mind.
>>
>
> I think these are all good ideas. I think there's one way in which sampling
> the Web is not quite right. To some extent, what matters is not average
> density of an operation but peak density. An operation that's used a *lot*
> by a few sites and hardly used by most sites, may deserve a weighting above
> its average proportion of Web use. I would like to hear input on what is
> inadequately covered. I tend to think there should be more coverage of the
> following:
>
> - property access, involving at least some polymorphic access patterns
> - method calls
> - object-oriented programming patterns
> - GC load
> - programming in a style that makes significant use of closures
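
To make the average-versus-peak density point above concrete, here is a small sketch. The operation names and per-site fractions are invented for illustration (they are not measurements), and average_density and peak_density are hypothetical helpers.

```python
from statistics import mean

# Hypothetical fraction of script time that each of five sampled sites
# spends in a given operation. These values are assumptions, not data.
samples = {
    "regexp":        [0.01, 0.02, 0.01, 0.60, 0.01],
    "string_concat": [0.12, 0.14, 0.13, 0.12, 0.14],
}

def average_density(fractions):
    """Mean share of time across all sampled sites."""
    return mean(fractions)

def peak_density(fractions):
    """Largest share of time seen on any single site."""
    return max(fractions)

for op, fractions in samples.items():
    print("%-14s avg=%.2f peak=%.2f"
          % (op, average_density(fractions), peak_density(fractions)))
```

In this made-up sample both operations have about the same average density, but one of them dominates a single site. Weighting by the average alone would treat them identically, while a peak-aware weighting would not.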


This sounds like good stuff to me.  A few more thoughts:
   - We also see sites delivering huge chunks of JS code that are only
sparsely used.  Perhaps a parsing/loading test would be interesting.
   - Object cloning.  We should verify this is a useful test, but I believe
template engines often use a pattern, combined with JSON data, to clone JS
objects.  This may be more of a DOM-level test, but a JS equivalent should
be doable.
   - JSON performance
   - Tests of prototype chain usage (basically the counter-programming-style
to closures)


If I were to characterize the SunSpider and V8 benchmark tests, the SunSpider
tests are generally short, focused microbenchmarks.  The V8 tests are
generally larger tests composed of real code.  Both types of test offer
unique advantages.  The microbenchmarks provide a way to create lots of
small tests, each of which covers a certain pattern.  The larger tests are
less focused, but require more features to work well together in the engine
to get higher scores.  TraceMonkey is fairly new, and with its tracing
approach, it is not surprising that its initial traces can optimize the
microbenchmarks but not fully trace larger code like what is found in the
V8 benchmark.  In my opinion, both sets of tests are useful.

Mike

>
> I think the V8 benchmark does a much better job of covering the first four
> of these things. I also think it overweights them, to the exclusion of most
> other considerations(*). As I mentioned before, I'd like to include some of
> V8's tests in a future SunSpider 2.0 content set.
>
> It would be good to know what other things should be tested that are not
> sufficiently covered.
>
> Regards,
> Maciej
>
> * - For example, Mozilla's TraceMonkey effort showed relatively little
> improvement on the V8 benchmark, even though it showed significant
> improvement on SunSpider and other benchmarks. I think TraceMonkey speedups
> are real and significant, so this would tend to undermine my confidence in
> the V8 benchmark's coverage. Note: I don't mean to start a side thread about
> whether the V8 benchmark is good or not, I just wanted to justify my remarks
> above.