<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"

"http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">

<html xmlns="http://www.w3.org/1999/xhtml">

<head><meta http-equiv="content-type" content="text/html; charset=utf-8" />

<title>[185014] trunk/Tools</title>

</head>

<body>

<style type="text/css"><!--

#msg dl.meta { border: 1px #006 solid; background: #369; padding: 6px; color: #fff; }

#msg dl.meta dt { float: left; width: 6em; font-weight: bold; }

#msg dt:after { content:':';}

#msg dl, #msg dt, #msg ul, #msg li, #header, #footer, #logmsg { font-family: verdana,arial,helvetica,sans-serif; font-size: 10pt;  }

#msg dl a { font-weight: bold}

#msg dl a:link    { color:#fc3; }

#msg dl a:active  { color:#ff0; }

#msg dl a:visited { color:#cc6; }

h3 { font-family: verdana,arial,helvetica,sans-serif; font-size: 10pt; font-weight: bold; }

#msg pre { overflow: auto; background: #ffc; border: 1px #fa0 solid; padding: 6px; }

#logmsg { background: #ffc; border: 1px #fa0 solid; padding: 1em 1em 0 1em; }

#logmsg p, #logmsg pre, #logmsg blockquote { margin: 0 0 1em 0; }

#logmsg p, #logmsg li, #logmsg dt, #logmsg dd { line-height: 14pt; }

#logmsg h1, #logmsg h2, #logmsg h3, #logmsg h4, #logmsg h5, #logmsg h6 { margin: .5em 0; }

#logmsg h1:first-child, #logmsg h2:first-child, #logmsg h3:first-child, #logmsg h4:first-child, #logmsg h5:first-child, #logmsg h6:first-child { margin-top: 0; }

#logmsg ul, #logmsg ol { padding: 0; list-style-position: inside; margin: 0 0 0 1em; }

#logmsg ul { text-indent: -1em; padding-left: 1em; }#logmsg ol { text-indent: -1.5em; padding-left: 1.5em; }

#logmsg > ul, #logmsg > ol { margin: 0 0 1em 0; }

#logmsg pre { background: #eee; padding: 1em; }

#logmsg blockquote { border: 1px solid #fa0; border-left-width: 10px; padding: 1em 1em 0 1em; background: white;}

#logmsg dl { margin: 0; }

#logmsg dt { font-weight: bold; }

#logmsg dd { margin: 0; padding: 0 0 0.5em 0; }

#logmsg dd:before { content:'\00bb';}

#logmsg table { border-spacing: 0px; border-collapse: collapse; border-top: 4px solid #fa0; border-bottom: 1px solid #fa0; background: #fff; }

#logmsg table th { text-align: left; font-weight: normal; padding: 0.2em 0.5em; border-top: 1px dotted #fa0; }

#logmsg table td { text-align: right; border-top: 1px dotted #fa0; padding: 0.2em 0.5em; }

#logmsg table thead th { text-align: center; border-bottom: 1px solid #fa0; }

#logmsg table th.Corner { text-align: left; }

#logmsg hr { border: none 0; border-top: 2px dashed #fa0; height: 1px; }

#header, #footer { color: #fff; background: #636; border: 1px #300 solid; padding: 6px; }

#patch { width: 100%; }

#patch h4 {font-family: verdana,arial,helvetica,sans-serif;font-size:10pt;padding:8px;background:#369;color:#fff;margin:0;}

#patch .propset h4, #patch .binary h4 {margin:0;}

#patch pre {padding:0;line-height:1.2em;margin:0;}

#patch .diff {width:100%;background:#eee;padding: 0 0 10px 0;overflow:auto;}

#patch .propset .diff, #patch .binary .diff  {padding:10px 0;}

#patch span {display:block;padding:0 10px;}

#patch .modfile, #patch .addfile, #patch .delfile, #patch .propset, #patch .binary, #patch .copfile {border:1px solid #ccc;margin:10px 0;}

#patch ins {background:#dfd;text-decoration:none;display:block;padding:0 10px;}

#patch del {background:#fdd;text-decoration:none;display:block;padding:0 10px;}

#patch .lines, .info {color:#888;background:#fff;}

--></style>

<div id="msg">

<dl class="meta">

<dt>Revision</dt> <dd><a href="http://trac.webkit.org/projects/webkit/changeset/185014">185014</a></dd>

<dt>Author</dt> <dd>rniwa@webkit.org</dd>

<dt>Date</dt> <dd>2015-05-29 16:03:45 -0700 (Fri, 29 May 2015)</dd>

</dl>

<h3>Log Message</h3>

<pre>run-benchmark should print out the results

https://bugs.webkit.org/show_bug.cgi?id=145398

Reviewed by Antti Koivisto.

Added BenchmarkResults to compute and format the aggregated values. It also performs syntax and semantic checks

on the output to catch errors early.

* Scripts/webkitpy/benchmark_runner/benchmark_results.py: Added.

(BenchmarkResults): Added.

(BenchmarkResults.__init__): Added.

(BenchmarkResults.format): Added.

(BenchmarkResults._format_tests): Added. Used by BenchmarkResults.format.

(BenchmarkResults._format_values): Formats a list of values measured for a given metric on a given test.

Uses the sample standard deviation to compute the significant figures for the value.

(BenchmarkResults._unit_from_metric): Added.

(BenchmarkResults._aggregate_results): Added.

(BenchmarkResults._aggregate_results_for_test): Added.

(BenchmarkResults._flatten_list): Added.

(BenchmarkResults._subtest_values_by_config_iteration): Added. Organizes values measured for subtests

by iteration number so that the i-th array contains the values of all subtests at the i-th iteration.

(BenchmarkResults._aggregate_values): Added.

(BenchmarkResults._lint_results): Added.

(BenchmarkResults._lint_subtest_results): Added.

(BenchmarkResults._lint_aggregator_list): Added.

(BenchmarkResults._lint_configuration): Added.

(BenchmarkResults._lint_values): Added.

(BenchmarkResults._is_numeric): Added.

* Scripts/webkitpy/benchmark_runner/benchmark_results_unittest.py: Added.

(BenchmarkResultsTest):

(BenchmarkResultsTest.test_init):

(BenchmarkResultsTest.test_format):

(test_format_values_with_large_error):

(test_format_values_with_small_error):

(test_format_values_with_time):

(test_format_values_with_no_error):

(test_format_values_with_small_difference):

(test_aggregate_results):

(test_aggregate_results_with_gropus):

(test_aggregate_nested_results):

(test_lint_results):

* Scripts/webkitpy/benchmark_runner/benchmark_runner.py:

(BenchmarkRunner.execute): Added a call to show_results.

(BenchmarkRunner.wrap): Only dump the merged JSON when debugging.

(BenchmarkRunner.show_results): Added.</pre>
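<p>The log message notes that <code>_format_values</code> uses the sample standard deviation to decide how many significant figures to print. The following is a minimal sketch of that idea, not the patch itself: it assumes Python 3, and the helper names <code>sample_stdev</code> and <code>significant_figures</code> are hypothetical. It computes the deviation from the sum and the sum of squares, then keeps one digit beyond the order of magnitude of the relative error.</p>

```python
import math


def sample_stdev(values):
    """Sample standard deviation from the sum and the sum of squares.

    Hypothetical helper for illustration; not part of the patch.
    """
    values = [float(v) for v in values]
    n = len(values)
    if n > 1:
        total = sum(values)
        square_sum = sum(v * v for v in values)
        # s^2 = (sum(x^2) - sum(x)^2 / n) / (n - 1)
        variance = (square_sum - total * total / n) / (n - 1)
        # Rounding can push the variance slightly below zero; clamp it.
        return math.sqrt(max(0.0, variance))
    return 0.0


def significant_figures(values):
    """Digits worth printing: one past the magnitude of stdev / mean."""
    mean = sum(values) / float(len(values))
    stdev = sample_stdev(values)
    if stdev == 0:
        # No measurable error: fall back to three figures, as the patch does.
        return 3
    return int(1 - math.floor(math.log10(stdev / mean)))


print(sample_stdev([1, 2, 3]), significant_figures([1, 2, 3]))
```

<p>With the samples <code>[1, 2, 3]</code> the relative error is 50%, so two significant figures are kept, which is consistent with the <code>2.0ms stdev=50.0%</code> expectation in the unit tests.</p>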

<h3>Modified Paths</h3>

<ul>

<li><a href="#trunkToolsChangeLog">trunk/Tools/ChangeLog</a></li>

<li><a href="#trunkToolsScriptswebkitpybenchmark_runnerbenchmark_runnerpy">trunk/Tools/Scripts/webkitpy/benchmark_runner/benchmark_runner.py</a></li>

</ul>

<h3>Added Paths</h3>

<ul>

<li><a href="#trunkToolsScriptswebkitpybenchmark_runnerbenchmark_resultspy">trunk/Tools/Scripts/webkitpy/benchmark_runner/benchmark_results.py</a></li>

<li><a href="#trunkToolsScriptswebkitpybenchmark_runnerbenchmark_results_unittestpy">trunk/Tools/Scripts/webkitpy/benchmark_runner/benchmark_results_unittest.py</a></li>

</ul>

</div>

<div id="patch">

<h3>Diff</h3>

<a id="trunkToolsChangeLog"></a>

<div class="modfile"><h4>Modified: trunk/Tools/ChangeLog (185013 => 185014)</h4>

<pre class="diff"><span>

<span class="info">--- trunk/Tools/ChangeLog        2015-05-29 23:02:56 UTC (rev 185013)

+++ trunk/Tools/ChangeLog        2015-05-29 23:03:45 UTC (rev 185014)

</span><span class="lines">@@ -1,3 +1,65 @@

</span><ins>+2015-05-29  Ryosuke Niwa  &lt;rniwa@webkit.org&gt;

+

+        run-benchmark should print out the results

+        https://bugs.webkit.org/show_bug.cgi?id=145398

+

+        Reviewed by Antti Koivisto.

+

+        Added BenchmarkResults to compute and format the aggregated values. It also performs syntax and semantic checks

+        on the output to catch errors early.

+

+        * Scripts/webkitpy/benchmark_runner/benchmark_results.py: Added.

+        (BenchmarkResults): Added.

+        (BenchmarkResults.__init__): Added.

+        (BenchmarkResults.format): Added.

+        (BenchmarkResults._format_tests): Added. Used by BenchmarkResults.format.

+        (BenchmarkResults._format_values): Formats a list of values measured for a given metric on a given test.

+        Uses the sample standard deviation to compute the significant figures for the value.

+        (BenchmarkResults._unit_from_metric): Added.

+        (BenchmarkResults._aggregate_results): Added.

+        (BenchmarkResults._aggregate_results_for_test): Added.

+        (BenchmarkResults._flatten_list): Added.

+        (BenchmarkResults._subtest_values_by_config_iteration): Added. Organizes values measured for subtests

+        by iteration number so that the i-th array contains the values of all subtests at the i-th iteration.

+        (BenchmarkResults._aggregate_values): Added.

+        (BenchmarkResults._lint_results): Added.

+        (BenchmarkResults._lint_subtest_results): Added.

+        (BenchmarkResults._lint_aggregator_list): Added.

+        (BenchmarkResults._lint_configuration): Added.

+        (BenchmarkResults._lint_values): Added.

+        (BenchmarkResults._is_numeric): Added.

+        * Scripts/webkitpy/benchmark_runner/benchmark_results_unittest.py: Added.

+        (BenchmarkResultsTest):

+        (BenchmarkResultsTest.test_init):

+        (BenchmarkResultsTest.test_format):

+        (test_format_values_with_large_error):

+        (test_format_values_with_small_error):

+        (test_format_values_with_time):

+        (test_format_values_with_no_error):

+        (test_format_values_with_small_difference):

+        (test_aggregate_results):

+        (test_aggregate_results_with_gropus):

+        (test_aggregate_nested_results):

+        (test_lint_results):

+        * Scripts/webkitpy/benchmark_runner/benchmark_runner.py:

+        (BenchmarkRunner.execute): Added a call to show_results

+        (BenchmarkRunner.wrap): Only dump the merged JSON when debugging.

+        (BenchmarkRunner.show_results): Added.

+

+2015-05-15  Ryosuke Niwa  &lt;rniwa@webkit.org&gt;

+

+        run_benchmark should have an option to specify the number of runs

+        https://bugs.webkit.org/show_bug.cgi?id=145091

+

+        Reviewed by Antti Koivisto.

+

+        Added --count option.

+

+        * Scripts/run-benchmark:

+        (main):

+        * Scripts/webkitpy/benchmark_runner/benchmark_runner.py:

+        (BenchmarkRunner.__init__):

+

</ins><span class="cx"> 2015-05-28  Alexey Proskuryakov  &lt;ap@apple.com&gt;

</span><span class="cx"> 

</span><span class="cx">         Update results of WebKit1.StringTruncator after r184965. I missed one letter.

</span></span></pre></div>

<a id="trunkToolsScriptswebkitpybenchmark_runnerbenchmark_resultspy"></a>

<div class="addfile"><h4>Added: trunk/Tools/Scripts/webkitpy/benchmark_runner/benchmark_results.py (0 => 185014)</h4>

<pre class="diff"><span>

<span class="info">--- trunk/Tools/Scripts/webkitpy/benchmark_runner/benchmark_results.py                                (rev 0)

+++ trunk/Tools/Scripts/webkitpy/benchmark_runner/benchmark_results.py        2015-05-29 23:03:45 UTC (rev 185014)

</span><span class="lines">@@ -0,0 +1,245 @@

</span><ins>+# Copyright (C) 2015 Apple Inc. All rights reserved.

+#

+# Redistribution and use in source and binary forms, with or without

+# modification, are permitted provided that the following conditions

+# are met:

+# 1.  Redistributions of source code must retain the above copyright

+#     notice, this list of conditions and the following disclaimer.

+# 2.  Redistributions in binary form must reproduce the above copyright

+#     notice, this list of conditions and the following disclaimer in the

+#     documentation and/or other materials provided with the distribution.

+#

+# THIS SOFTWARE IS PROVIDED BY APPLE INC. AND ITS CONTRIBUTORS ``AS IS'' AND ANY

+# EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED

+# WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE

+# DISCLAIMED. IN NO EVENT SHALL APPLE INC. OR ITS CONTRIBUTORS BE LIABLE FOR ANY

+# DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES

+# (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;

+# LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON

+# ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT

+# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS

+# SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

+

+import json

+import math

+import re

+

+

+class BenchmarkResults(object):

+

+    aggregators = {

+        'Total': (lambda values: sum(values)),

+        'Arithmetic': (lambda values: sum(values) / len(values)),

+        'Geometric': (lambda values: math.exp(sum(map(math.log, values)) / len(values))),

+    }

+    metric_to_unit = {

+        'FrameRate': 'fps',

+        'Runs': '/s',

+        'Time': 'ms',

+        'Duration': 'ms',

+        'Malloc': 'B',

+        'Heap': 'B',

+        'Allocations': 'B',

+        'Score': 'pt',

+    }

+    SI_prefixes = ['n', 'u', 'm', '', 'K', 'M', 'G', 'T', 'P', 'E']

+

+    def __init__(self, results):

+        self._lint_results(results)

+        self._results = self._aggregate_results(results)

+

+    def format(self):

+        return self._format_tests(self._results)

+

+    @classmethod

+    def _format_tests(self, tests, indent=''):

+        output = ''

+        config_name = 'current'

+        for test_name in sorted(tests.keys()):

+            is_first = True

+            test = tests[test_name]

+            metrics = test.get('metrics', {})

+            for metric_name in sorted(metrics.keys()):

+                metric = metrics[metric_name]

+                for aggregator_name in sorted(metric.keys()):

+                    output += indent

+                    if is_first:

+                        output += test_name

+                        is_first = False

+                    else:

+                        output += ' ' * len(test_name)

+                    output += ':' + metric_name + ':'

+                    if aggregator_name:

+                        output += aggregator_name + ':'

+                    output += ' ' + self._format_values(metric_name, metric[aggregator_name][config_name]) + '\n'

+            if 'tests' in test:

+                output += self._format_tests(test['tests'], indent=(indent + ' ' * len(test_name)))

+        return output

+

+    @classmethod

+    def _format_values(cls, metric_name, values):

+        values = map(float, values)

+        total = sum(values)

+        mean = total / len(values)

+        square_sum = sum(map(lambda x: x * x, values))

+        sample_count = len(values)

+

+        # With sum and sum of squares, we can compute the sample standard deviation in O(1).

+        # See https://rniwa.com/2012-11-10/sample-standard-deviation-in-terms-of-sum-and-square-sum-of-samples/

+        if sample_count &lt;= 1:

+            sample_stdev = 0

+        else:

+            sample_stdev = math.sqrt(square_sum / (sample_count - 1) - total * total / (sample_count - 1) / sample_count)

+

+        unit = cls._unit_from_metric(metric_name)

+

+        if unit == 'ms':

+            unit = 's'

+            mean = float(mean) / 1000

+            sample_stdev /= 1000

+

+        base = 1024 if unit == 'B' else 1000

+        value_sig_fig = 1 - math.floor(math.log10(sample_stdev / mean)) if sample_stdev else 3

+        SI_magnitude = math.floor(math.log(mean, base))

+

+        scaled_mean = mean * math.pow(base, -SI_magnitude)

+        SI_prefix = cls.SI_prefixes[int(SI_magnitude) + 3]

+

+        non_floating_digits = 1 + math.floor(math.log10(scaled_mean))

+        floating_points_count = max(0, value_sig_fig - non_floating_digits)

+        return ('{mean:.' + str(int(floating_points_count)) + 'f}{prefix}{unit} stdev={delta:.1%}').format(

+            mean=scaled_mean, delta=sample_stdev / mean, prefix=SI_prefix, unit=unit)

+

+    @classmethod

+    def _unit_from_metric(cls, metric_name):

+        # FIXME: Detect unknown metric names

+        suffix = re.match(r'.*?([A-Z][a-z]+|FrameRate)$', metric_name)

+        return cls.metric_to_unit[suffix.group(1)]

+

+    @classmethod

+    def _aggregate_results(cls, tests):

+        results = {}

+        for test_name, test in tests.iteritems():

+            results[test_name] = cls._aggregate_results_for_test(test)

+        return results

+

+    @classmethod

+    def _aggregate_results_for_test(cls, test):

+        subtest_results = cls._aggregate_results(test['tests']) if 'tests' in test else {}

+        results = {}

+        for metric_name, metric in test['metrics'].iteritems():

+            if not isinstance(metric, list):

+                results[metric_name] = {None: {}}

+                for config_name, values in metric.iteritems():

+                    results[metric_name][None][config_name] = cls._flatten_list(values)

+                continue

+

+            aggregator_list = metric

+            results[metric_name] = {}

+            for aggregator in aggregator_list:

+                values_by_config_iteration = cls._subtest_values_by_config_iteration(subtest_results, metric_name, aggregator)

+                for config_name, values_by_iteration in values_by_config_iteration.iteritems():

+                    results[metric_name].setdefault(aggregator, {})

+                    results[metric_name][aggregator][config_name] = [cls._aggregate_values(aggregator, values) for values in values_by_iteration]

+

+        return {'metrics': results, 'tests': subtest_results}

+

+    @classmethod

+    def _flatten_list(cls, nested_list):

+        flattened_list = []

+        for item in nested_list:

+            if isinstance(item, list):

+                flattened_list += cls._flatten_list(item)

+            else:

+                flattened_list.append(item)

+        return flattened_list

+

+    @classmethod

+    def _subtest_values_by_config_iteration(cls, subtest_results, metric_name, aggregator):

+        values_by_config_iteration = {}

+        for subtest_name, subtest in subtest_results.iteritems():

+            results_for_metric = subtest['metrics'].get(metric_name, {})

+            results_for_aggregator = results_for_metric.get(aggregator, results_for_metric.get(None, {}))

+            for config_name, values in results_for_aggregator.iteritems():

+                values_by_config_iteration.setdefault(config_name, [[] for _ in values])

+                for iteration, value in enumerate(values):

+                    values_by_config_iteration[config_name][iteration].append(value)

+        return values_by_config_iteration

+

+    @classmethod

+    def _aggregate_values(cls, aggregator, values):

+        return cls.aggregators[aggregator](values)

+

+    @classmethod

+    def _lint_results(cls, tests):

+        cls._lint_subtest_results(tests, None)

+        return True

+

+    @classmethod

+    def _lint_subtest_results(cls, subtests, parent_needing_aggregation):

+        iteration_groups_by_config = {}

+        for test_name, test in subtests.iteritems():

+            if 'metrics' not in test:

+                raise TypeError('&quot;%s&quot; does not contain metrics' % test_name)

+

+            metrics = test['metrics']

+            if not isinstance(metrics, dict):

+                raise TypeError('The metrics in &quot;%s&quot; is not a dictionary' % test_name)

+

+            needs_aggregation = False

+            for metric_name, metric in metrics.iteritems():

+                if isinstance(metric, list):

+                    cls._lint_aggregator_list(test_name, metric_name, metric)

+                    needs_aggregation = True

+                elif isinstance(metric, dict):

+                    cls._lint_configuration(test_name, metric_name, metric, parent_needing_aggregation, iteration_groups_by_config)

+                else:

+                    raise TypeError('&quot;%s&quot; metric of &quot;%s&quot; was not an aggregator list or a dictionary of configurations: %s' % (metric_name, test_name, str(metric)))

+

+            if 'tests' in test:

+                cls._lint_subtest_results(test['tests'], test_name if needs_aggregation else None)

+            elif needs_aggregation:

+                raise TypeError('&quot;%s&quot; requires aggregation but &quot;SomeTest&quot; has no subtests' % (test_name))

+        return iteration_groups_by_config

+

+    @classmethod

+    def _lint_aggregator_list(cls, test_name, metric_name, aggregator_list):

+        if len(aggregator_list) != len(set(aggregator_list)):

+            raise TypeError('&quot;%s&quot; metric of &quot;%s&quot; had invalid aggregator list: %s' % (metric_name, test_name, json.dumps(aggregator_list)))

+        if not aggregator_list:

+            raise TypeError('The aggregator list is empty in &quot;%s&quot; metric of &quot;%s&quot;' % (metric_name, test_name))

+        for aggregator_name in aggregator_list:

+            if cls._is_numeric(aggregator_name):

+                raise TypeError('&quot;%s&quot; metric of &quot;%s&quot; is not wrapped by a configuration; e.g. &quot;current&quot;' % (metric_name, test_name))

+            if aggregator_name not in cls.aggregators:

+                raise TypeError('&quot;%s&quot; metric of &quot;%s&quot; uses unknown aggregator: %s' % (metric_name, test_name, aggregator_name))

+

+    @classmethod

+    def _lint_configuration(cls, test_name, metric_name, configurations, parent_needing_aggregation, iteration_groups_by_config):

+        # FIXME: Check that config_name is always &quot;current&quot;.

+        for config_name, values in configurations.iteritems():

+            nested_list_count = [isinstance(value, list) for value in values].count(True)

+            if nested_list_count not in [0, len(values)]:

+                raise TypeError('&quot;%s&quot; metric of &quot;%s&quot; had malformed values: %s' % (metric_name, test_name, json.dumps(values)))

+

+            if nested_list_count:

+                value_shape = []

+                for value_group in values:

+                    value_shape.append(len(value_group))

+                    cls._lint_values(test_name, metric_name, value_group)

+            else:

+                value_shape = len(values)

+                cls._lint_values(test_name, metric_name, values)

+

+            iteration_groups_by_config.setdefault(metric_name, {}).setdefault(config_name, value_shape)

+            if parent_needing_aggregation and value_shape != iteration_groups_by_config[metric_name][config_name]:

+                raise TypeError('&quot;%s&quot; metric of &quot;%s&quot; had a mismatching subtest values' % (metric_name, parent_needing_aggregation))

+

+    @classmethod

+    def _lint_values(cls, test_name, metric_name, values):

+        if any([not cls._is_numeric(value) for value in values]):

+            raise TypeError('&quot;%s&quot; metric of &quot;%s&quot; contains non-numeric value: %s' % (metric_name, test_name, json.dumps(values)))

+

+    @classmethod

+    def _is_numeric(cls, value):

+        return isinstance(value, int) or isinstance(value, float)

</ins></span></pre></div>
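<p>To illustrate what <code>_subtest_values_by_config_iteration</code> and the <code>Total</code> aggregator do together, here is a simplified sketch. It assumes Python 3, a single configuration, and subtests that all report the same number of iterations (the lint step in the patch is what enforces matching shapes). The function names <code>values_by_iteration</code> and <code>aggregate_total</code> are hypothetical and do not appear in the patch.</p>

```python
def values_by_iteration(subtest_values):
    """Regroup {subtest: [v0, v1, ...]} so the i-th list holds iteration i.

    Hypothetical helper; assumes every subtest has the same number of values.
    """
    grouped = []
    for name in sorted(subtest_values):
        values = subtest_values[name]
        if not grouped:
            # One bucket per iteration, sized from the first subtest seen.
            grouped = [[] for _ in values]
        for iteration, value in enumerate(values):
            grouped[iteration].append(value)
    return grouped


def aggregate_total(subtest_values):
    """Apply the 'Total' aggregator to each iteration's bucket."""
    return [sum(values) for values in values_by_iteration(subtest_values)]


subtests = {'SubTest1': [1, 2, 3], 'SubTest2': [4, 5, 6]}
print(values_by_iteration(subtests))
print(aggregate_total(subtests))
```

<p>For the two subtests above, the per-iteration totals are <code>[5, 7, 9]</code>, the same values that <code>test_aggregate_results</code> expects for the parent test's <code>Time</code> metric.</p>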

<a id="trunkToolsScriptswebkitpybenchmark_runnerbenchmark_results_unittestpy"></a>

<div class="addfile"><h4>Added: trunk/Tools/Scripts/webkitpy/benchmark_runner/benchmark_results_unittest.py (0 => 185014)</h4>

<pre class="diff"><span>

<span class="info">--- trunk/Tools/Scripts/webkitpy/benchmark_runner/benchmark_results_unittest.py                                (rev 0)

+++ trunk/Tools/Scripts/webkitpy/benchmark_runner/benchmark_results_unittest.py        2015-05-29 23:03:45 UTC (rev 185014)

</span><span class="lines">@@ -0,0 +1,255 @@

</span><ins>+# Copyright (C) 2015 Apple Inc. All rights reserved.

+#

+# Redistribution and use in source and binary forms, with or without

+# modification, are permitted provided that the following conditions

+# are met:

+# 1.  Redistributions of source code must retain the above copyright

+#     notice, this list of conditions and the following disclaimer.

+# 2.  Redistributions in binary form must reproduce the above copyright

+#     notice, this list of conditions and the following disclaimer in the

+#     documentation and/or other materials provided with the distribution.

+#

+# THIS SOFTWARE IS PROVIDED BY APPLE INC. AND ITS CONTRIBUTORS ``AS IS'' AND ANY

+# EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED

+# WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE

+# DISCLAIMED. IN NO EVENT SHALL APPLE INC. OR ITS CONTRIBUTORS BE LIABLE FOR ANY

+# DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES

+# (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;

+# LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON

+# ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT

+# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS

+# SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

+

+import unittest

+

+from benchmark_results import BenchmarkResults

+

+

+class BenchmarkResultsTest(unittest.TestCase):

+    def test_init(self):

+        results = BenchmarkResults({'SomeTest': {'metrics': {'Time': {'current': [1, 2, 3]}}}})

+        self.assertEqual(results._results, {'SomeTest': {'metrics': {'Time': {None: {'current': [1, 2, 3]}}}, 'tests': {}}})

+

+        with self.assertRaisesRegexp(TypeError, r'&quot;Time&quot; metric of &quot;SomeTest&quot; contains non-numeric value: \[1, 2, &quot;a&quot;\]'):

+            BenchmarkResults({'SomeTest': {'metrics': {'Time': {'current': [1, 2, 'a']}}}})

+

+    def test_format(self):

+        result = BenchmarkResults({'SomeTest': {'metrics': {'Time': {'current': [1, 2, 3]}}}})

+        self.assertEqual(result.format(), 'SomeTest:Time: 2.0ms stdev=50.0%\n')

+

+        result = BenchmarkResults({'SomeTest': {'metrics': {'Time': {'current': [1, 2, 3]}, 'Score': {'current': [2, 3, 4]}}}})

+        self.assertEqual(result.format(), '''

+SomeTest:Score: 3.0pt stdev=33.3%

+        :Time: 2.0ms stdev=50.0%

+'''[1:])

+

+        result = BenchmarkResults({'SomeTest': {

+            'metrics': {'Time': ['Total', 'Arithmetic']},

+            'tests': {

+                'SubTest1': {'metrics': {'Time': {'current': [1, 2, 3]}}},

+                'SubTest2': {'metrics': {'Time': {'current': [4, 5, 6]}}}}}})

+        self.assertEqual(result.format(), '''

+SomeTest:Time:Arithmetic: 3.0ms stdev=33.3%

+        :Time:Total: 7.0ms stdev=28.6%

+        SubTest1:Time: 2.0ms stdev=50.0%

+        SubTest2:Time: 5.0ms stdev=20.0%

+'''[1:])

+

+    def test_format_values_with_large_error(self):

+        self.assertEqual(BenchmarkResults._format_values('Runs', [1, 2, 3]), '2.0/s stdev=50.0%')

+        self.assertEqual(BenchmarkResults._format_values('Runs', [10, 20, 30]), '20/s stdev=50.0%')

+        self.assertEqual(BenchmarkResults._format_values('Runs', [100, 200, 300]), '200/s stdev=50.0%')

+        self.assertEqual(BenchmarkResults._format_values('Runs', [1000, 2000, 3000]), '2.0K/s stdev=50.0%')

+        self.assertEqual(BenchmarkResults._format_values('Runs', [10000, 20000, 30000]), '20K/s stdev=50.0%')

+        self.assertEqual(BenchmarkResults._format_values('Runs', [100000, 200000, 300000]), '200K/s stdev=50.0%')

+        self.assertEqual(BenchmarkResults._format_values('Runs', [1000000, 2000000, 3000000]), '2.0M/s stdev=50.0%')

+        self.assertEqual(BenchmarkResults._format_values('Runs', [0.1, 0.2, 0.3]), '200m/s stdev=50.0%')

+        self.assertEqual(BenchmarkResults._format_values('Runs', [0.01, 0.02, 0.03]), '20m/s stdev=50.0%')

+        self.assertEqual(BenchmarkResults._format_values('Runs', [0.001, 0.002, 0.003]), '2.0m/s stdev=50.0%')

+        self.assertEqual(BenchmarkResults._format_values('Runs', [0.0001, 0.0002, 0.0003]), '200u/s stdev=50.0%')

+        self.assertEqual(BenchmarkResults._format_values('Runs', [0.00001, 0.00002, 0.00003]), '20u/s stdev=50.0%')

+        self.assertEqual(BenchmarkResults._format_values('Runs', [0.000001, 0.000002, 0.000003]), '2.0u/s stdev=50.0%')

+

+    def test_format_values_with_small_error(self):

+        self.assertEqual(BenchmarkResults._format_values('Runs', [1.1, 1.2, 1.3]), '1.20/s stdev=8.3%')

+        self.assertEqual(BenchmarkResults._format_values('Runs', [11, 12, 13]), '12.0/s stdev=8.3%')

+        self.assertEqual(BenchmarkResults._format_values('Runs', [110, 120, 130]), '120/s stdev=8.3%')

+        self.assertEqual(BenchmarkResults._format_values('Runs', [1100, 1200, 1300]), '1.20K/s stdev=8.3%')

+        self.assertEqual(BenchmarkResults._format_values('Runs', [11000, 12000, 13000]), '12.0K/s stdev=8.3%')

+        self.assertEqual(BenchmarkResults._format_values('Runs', [110000, 120000, 130000]), '120K/s stdev=8.3%')

+        self.assertEqual(BenchmarkResults._format_values('Runs', [1100000, 1200000, 1300000]), '1.20M/s stdev=8.3%')

+        self.assertEqual(BenchmarkResults._format_values('Runs', [0.11, 0.12, 0.13]), '120m/s stdev=8.3%')

+        self.assertEqual(BenchmarkResults._format_values('Runs', [0.011, 0.012, 0.013]), '12.0m/s stdev=8.3%')

+        self.assertEqual(BenchmarkResults._format_values('Runs', [0.0011, 0.0012, 0.0013]), '1.20m/s stdev=8.3%')

+        self.assertEqual(BenchmarkResults._format_values('Runs', [0.00011, 0.00012, 0.00013]), '120u/s stdev=8.3%')

+        self.assertEqual(BenchmarkResults._format_values('Runs', [0.000011, 0.000012, 0.000013]), '12.0u/s stdev=8.3%')

+        self.assertEqual(BenchmarkResults._format_values('Runs', [0.0000011, 0.0000012, 0.0000013]), '1.20u/s stdev=8.3%')

+

+    def test_format_values_with_time(self):

+        self.assertEqual(BenchmarkResults._format_values('Time', [1, 2, 3]), '2.0ms stdev=50.0%')

+        self.assertEqual(BenchmarkResults._format_values('Time', [10, 20, 30]), '20ms stdev=50.0%')

+        self.assertEqual(BenchmarkResults._format_values('Time', [100, 200, 300]), '200ms stdev=50.0%')

+        self.assertEqual(BenchmarkResults._format_values('Time', [1000, 2000, 3000]), '2.0s stdev=50.0%')

+        self.assertEqual(BenchmarkResults._format_values('Time', [10000, 20000, 30000]), '20s stdev=50.0%')

+        self.assertEqual(BenchmarkResults._format_values('Time', [100000, 200000, 300000]), '200s stdev=50.0%')

+        self.assertEqual(BenchmarkResults._format_values('Time', [0.11, 0.12, 0.13]), '120us stdev=8.3%')

+        self.assertEqual(BenchmarkResults._format_values('Time', [0.011, 0.012, 0.013]), '12.0us stdev=8.3%')

+        self.assertEqual(BenchmarkResults._format_values('Time', [0.0011, 0.0012, 0.0013]), '1.20us stdev=8.3%')

+        self.assertEqual(BenchmarkResults._format_values('Time', [0.00011, 0.00012, 0.00013]), '120ns stdev=8.3%')

+

+    def test_format_values_with_no_error(self):

+        self.assertEqual(BenchmarkResults._format_values('Time', [1, 1, 1]), '1.00ms stdev=0.0%')

+

+    def test_format_values_with_small_difference(self):

+        self.assertEqual(BenchmarkResults._format_values('Time', [5, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4]),

+            '4.05ms stdev=5.5%')

+

+    def test_aggregate_results(self):

+        self.maxDiff = None

+        self.assertEqual(BenchmarkResults._aggregate_results(

+            {'SomeTest': {'metrics': {'Time': {'current': [1, 2, 3]}}}}),

+            {'SomeTest': {'metrics': {'Time': {None: {'current': [1, 2, 3]}}}, 'tests': {}}})

+

+        self.assertEqual(BenchmarkResults._aggregate_results(

+            {'SomeTest': {

+                'metrics': {'Time': ['Total']},

+                'tests': {

+                    'SubTest1': {'metrics': {'Time': {'current': [1, 2, 3]}}},

+                    'SubTest2': {'metrics': {'Time': {'current': [4, 5, 6]}}}}}}),

+            {'SomeTest': {

+                'metrics': {'Time': {'Total': {'current': [5, 7, 9]}}},

+                'tests': {

+                    'SubTest1': {'metrics': {'Time': {None: {'current': [1, 2, 3]}}}, 'tests': {}},

+                    'SubTest2': {'metrics': {'Time': {None: {'current': [4, 5, 6]}}}, 'tests': {}}}}})

+

+        self.assertEqual(BenchmarkResults._aggregate_results(

+            {'SomeTest': {

+                'metrics': {'Time': ['Total'], 'Runs': ['Total']},

+                'tests': {

+                    'SubTest1': {'metrics': {'Time': {'current': [1, 2, 3]}}},

+                    'SubTest2': {'metrics': {'Time': {'current': [4, 5, 6]}}},

+                    'SubTest3': {'metrics': {'Runs': {'current': [7, 8, 9]}}}}}}),

+            {'SomeTest': {

+                'metrics': {

+                    'Time': {'Total': {'current': [5, 7, 9]}},

+                    'Runs': {'Total': {'current': [7, 8, 9]}}},

+                'tests': {

+                    'SubTest1': {'metrics': {'Time': {None: {'current': [1, 2, 3]}}}, 'tests': {}},

+                    'SubTest2': {'metrics': {'Time': {None: {'current': [4, 5, 6]}}}, 'tests': {}},

+                    'SubTest3': {'metrics': {'Runs': {None: {'current': [7, 8, 9]}}}, 'tests': {}}}}})

+

+    def test_aggregate_results_with_groups(self):

+        self.maxDiff = None

+        self.assertEqual(BenchmarkResults._aggregate_results(

+            {'SomeTest': {

+                'metrics': {'Time': ['Total']},

+                'tests': {

+                    'SubTest1': {'metrics': {'Time': {'current': [[1, 2], [3, 4]]}}},

+                    'SubTest2': {'metrics': {'Time': {'current': [[5, 6], [7, 8]]}}}}}}),

+            {'SomeTest': {

+                'metrics': {'Time': {'Total': {'current': [6, 8, 10, 12]}}},

+                'tests': {

+                    'SubTest1': {'metrics': {'Time': {None: {'current': [1, 2, 3, 4]}}}, 'tests': {}},

+                    'SubTest2': {'metrics': {'Time': {None: {'current': [5, 6, 7, 8]}}}, 'tests': {}}}}})

+

+    def test_aggregate_nested_results(self):

+        self.maxDiff = None

+        self.assertEqual(BenchmarkResults._aggregate_results(

+            {'SomeTest': {

+                'metrics': {'Time': ['Total']},

+                'tests': {

+                    'SubTest1': {

+                        'metrics': {'Time': ['Total']},

+                        'tests': {

+                            'GrandChild1': {'metrics': {'Time': {'current': [1, 2]}}},

+                            'GrandChild2': {'metrics': {'Time': {'current': [3, 4]}}}}},

+                    'SubTest2': {'metrics': {'Time': {'current': [5, 6]}}}}}}),

+            {'SomeTest': {

+                'metrics': {'Time': {'Total': {'current': [9, 12]}}},

+                'tests': {

+                    'SubTest1': {

+                        'metrics': {'Time': {'Total': {'current': [4, 6]}}},

+                        'tests': {

+                            'GrandChild1': {'metrics': {'Time': {None: {'current': [1, 2]}}}, 'tests': {}},

+                            'GrandChild2': {'metrics': {'Time': {None: {'current': [3, 4]}}}, 'tests': {}}}},

+                    'SubTest2': {'metrics': {'Time': {None: {'current': [5, 6]}}}, 'tests': {}}}}})

+

+        self.assertEqual(BenchmarkResults._aggregate_results(

+            {'SomeTest': {

+                'metrics': {'Time': ['Total']},

+                'tests': {

+                    'SubTest1': {

+                        'metrics': {'Time': ['Total', 'Arithmetic']},

+                        'tests': {

+                            'GrandChild1': {'metrics': {'Time': {'current': [1, 2]}}},

+                            'GrandChild2': {'metrics': {'Time': {'current': [3, 4]}}}}},

+                    'SubTest2': {'metrics': {'Time': {'current': [5, 6]}}}}}}),

+            {'SomeTest': {

+                'metrics': {'Time': {'Total': {'current': [9, 12]}}},

+                'tests': {

+                    'SubTest1': {

+                        'metrics': {'Time': {'Total': {'current': [4, 6]}, 'Arithmetic': {'current': [2, 3]}}},

+                        'tests': {

+                            'GrandChild1': {'metrics': {'Time': {None: {'current': [1, 2]}}}, 'tests': {}},

+                            'GrandChild2': {'metrics': {'Time': {None: {'current': [3, 4]}}}, 'tests': {}}}},

+                    'SubTest2': {'metrics': {'Time': {None: {'current': [5, 6]}}}, 'tests': {}}}}})

+

+    def test_lint_results(self):

+        with self.assertRaisesRegexp(TypeError, r'&quot;SomeTest&quot; does not contain metrics'):

+            BenchmarkResults._lint_results({'SomeTest': {}})

+

+        with self.assertRaisesRegexp(TypeError, r'The metrics in &quot;SomeTest&quot; is not a dictionary'):

+            BenchmarkResults._lint_results({'SomeTest': {'metrics': []}})

+

+        with self.assertRaisesRegexp(TypeError, r'The aggregator list is empty in &quot;Time&quot; metric of &quot;SomeTest&quot;'):

+            BenchmarkResults._lint_results({'SomeTest': {'metrics': {'Time': []}}})

+

+        with self.assertRaisesRegexp(TypeError, r'&quot;Time&quot; metric of &quot;SomeTest&quot; is not wrapped by a configuration; e.g. &quot;current&quot;'):

+            BenchmarkResults._lint_results({'SomeTest': {'metrics': {'Time': [1, 2]}}})

+

+        self.assertTrue(BenchmarkResults._lint_results({'SomeTest': {'metrics': {'Time': {'current': [1, 2]}}}}))

+

+        with self.assertRaisesRegexp(TypeError, r'&quot;Time&quot; metric of &quot;SomeTest&quot; was not an aggregator list or a dictionary of configurations: 1'):

+            BenchmarkResults._lint_results({'SomeTest': {'metrics': {'Time': 1}}})

+

+        with self.assertRaisesRegexp(TypeError, r'&quot;Time&quot; metric of &quot;SomeTest&quot; contains non-numeric value: \[&quot;Total&quot;\]'):

+            BenchmarkResults._lint_results({'SomeTest': {'metrics': {'Time': {'current': ['Total']}}}})

+

+        with self.assertRaisesRegexp(TypeError, r'&quot;Time&quot; metric of &quot;SomeTest&quot; contains non-numeric value: \[&quot;Total&quot;, &quot;Geometric&quot;\]'):

+            BenchmarkResults._lint_results({'SomeTest': {'metrics': {'Time': {'current': [['Total', 'Geometric']]}}}})

+

+        with self.assertRaisesRegexp(TypeError, r'&quot;SomeTest&quot; requires aggregation but &quot;SomeTest&quot; has no subtests'):

+            BenchmarkResults._lint_results({'SomeTest': {'metrics': {'Time': ['Total']}}})

+

+        with self.assertRaisesRegexp(TypeError, r'&quot;Time&quot; metric of &quot;SomeTest&quot; had invalid aggregator list: \[&quot;Total&quot;, &quot;Total&quot;\]'):

+            BenchmarkResults._lint_results({'SomeTest': {'metrics': {'Time': ['Total', 'Total']}, 'tests': {

+                'SubTest1': {'metrics': {'Time': {'current': []}}}}}})

+

+        with self.assertRaisesRegexp(TypeError, r'&quot;Time&quot; metric of &quot;SomeTest&quot; uses unknown aggregator: KittenMean'):

+            BenchmarkResults._lint_results({'SomeTest': {'metrics': {'Time': ['KittenMean']}, 'tests': {

+                'SubTest1': {'metrics': {'Time': {'current': []}}}}}})

+

+        with self.assertRaisesRegexp(TypeError, r'&quot;Time&quot; metric of &quot;SomeTest&quot; had a mismatching subtest values'):

+            BenchmarkResults._lint_results({'SomeTest': {'metrics': {'Time': ['Total']}, 'tests': {

+                'SubTest1': {'metrics': {'Time': {'current': [1, 2, 3]}}},

+                'SubTest2': {'metrics': {'Time': {'current': [4, 5, 6, 7]}}}}}})

+

+        with self.assertRaisesRegexp(TypeError, r'&quot;Time&quot; metric of &quot;SomeTest&quot; had a mismatching subtest values'):

+            BenchmarkResults._lint_results({'SomeTest': {'metrics': {'Time': ['Total']}, 'tests': {

+                'SubTest1': {'metrics': {'Time': {'current': [[1, 2], [3]]}}},

+                'SubTest2': {'metrics': {'Time': {'current': [[4, 5], [6, 7]]}}}}}})

+

+        with self.assertRaisesRegexp(TypeError, r'&quot;Time&quot; metric of &quot;SomeTest&quot; had malformed values: \[1, \[2\], 3\]'):

+            BenchmarkResults._lint_results({'SomeTest': {'metrics': {'Time': {'current': [1, [2], 3]}}}})

+

+        self.assertTrue(BenchmarkResults._lint_results({'SomeTest': {'metrics': {'Time': ['Total']}, 'tests': {

+            'SubTest1': {'metrics': {'Time': {'current': [1, 2, 3]}}},

+            'SubTest2': {'metrics': {'Time': {'current': [4, 5, 6], 'baseline': [7]}}}}}}))

+

+        self.assertTrue(BenchmarkResults._lint_results({'SomeTest': {'metrics': {'Time': ['Total']}, 'tests': {

+            'SubTest1': {'metrics': {'Time': {'current': [1, 2, 3]}}},

+            'SubTest2': {'metrics': {'Runs': {'current': [4, 5, 6, 7]}}}}}}))

+

+        self.assertTrue(BenchmarkResults._lint_results({'SomeTest': {'metrics': {'Time': ['Total']}, 'tests': {

+            'SubTest1': {'metrics': {'Time': {'current': [[1, 2], [3, 4]]}}},

+            'SubTest2': {'metrics': {'Time': {'current': [[5, 6], [7, 8]]}}}}}}))

</ins></span></pre></div>

<a id="trunkToolsScriptswebkitpybenchmark_runnerbenchmark_runnerpy"></a>

<div class="modfile"><h4>Modified: trunk/Tools/Scripts/webkitpy/benchmark_runner/benchmark_runner.py (185013 => 185014)</h4>

<pre class="diff"><span>

<span class="info">--- trunk/Tools/Scripts/webkitpy/benchmark_runner/benchmark_runner.py        2015-05-29 23:02:56 UTC (rev 185013)

+++ trunk/Tools/Scripts/webkitpy/benchmark_runner/benchmark_runner.py        2015-05-29 23:03:45 UTC (rev 185014)

</span><span class="lines">@@ -12,6 +12,7 @@

</span><span class="cx"> import urlparse

</span><span class="cx"> 

</span><span class="cx"> from benchmark_builder.benchmark_builder_factory import BenchmarkBuilderFactory

</span><ins>+from benchmark_results import BenchmarkResults

</ins><span class="cx"> from browser_driver.browser_driver_factory import BrowserDriverFactory

</span><span class="cx"> from http_server_driver.http_server_driver_factory import HTTPServerDriverFactory

</span><span class="cx"> from utils import loadModule, getPathFromProjectRoot

</span><span class="lines">@@ -91,6 +92,7 @@

</span><span class="cx">                 _log.info('End of %d iteration of current benchmark' % (x + 1))

</span><span class="cx">         results = self.wrap(results)

</span><span class="cx">         self.dump(results, self.outputFile if self.outputFile else self.plan['output_file'])

</span><ins>+        self.show_results(results)

</ins><span class="cx">         benchmarkBuilder.clean()

</span><span class="cx">         return 0

</span><span class="cx"> 

</span><span class="lines">@@ -106,13 +108,13 @@

</span><span class="cx"> 

</span><span class="cx">     @classmethod

</span><span class="cx">     def wrap(cls, dicts):

</span><del>-        _log.info('Merging following results:\n%s', json.dumps(dicts))

</del><ins>+        _log.debug('Merging following results:\n%s', json.dumps(dicts))

</ins><span class="cx">         if not dicts:

</span><span class="cx">             return None

</span><span class="cx">         ret = {}

</span><span class="cx">         for dic in dicts:

</span><span class="cx">             ret = cls.merge(ret, dic)

</span><del>-        _log.info('Results after merging:\n%s', json.dumps(ret))

</del><ins>+        _log.debug('Results after merging:\n%s', json.dumps(ret))

</ins><span class="cx">         return ret

</span><span class="cx"> 

</span><span class="cx">     @classmethod

</span><span class="lines">@@ -135,3 +137,8 @@

</span><span class="cx">             return result

</span><span class="cx">         # for other types

</span><span class="cx">         return a + b

</span><ins>+

+    @classmethod

+    def show_results(cls, results):

+        results = BenchmarkResults(results)

+        print results.format()

</ins></span></pre>

</div>

</div>

</body>

</html>