[webkit-dev] Fwd: Some text about the B3 compiler

Tue Feb 9 12:08:29 PST 2016

---------- Forwarded message ----------
From: Ryosuke Niwa <rniwa at webkit.org>
Date: Wed, Feb 3, 2016 at 4:18 PM
Subject: Re: [webkit-dev] Some text about the B3 compiler
To: Carlos Alberto Lopez Perez <clopez at igalia.com>

On Tue, Feb 2, 2016 at 4:56 PM, Carlos Alberto Lopez Perez
<clopez at igalia.com> wrote:
> On 02/02/16 19:58, Ryosuke Niwa wrote:
>> On Tue, Feb 2, 2016 at 10:42 AM, Carlos Alberto Lopez Perez
>>> But this script seems focused on comparing the performance between
>>> different browsers (Safari vs Chrome vs Firefox) rather than on testing
>>> and comparing the performance between different revisions of WebKit.
>>
>> Not at all.  It simply supports running benchmarks in other browsers.
>>
>>> Do you think it makes any difference (from the point of view of
>>> detecting failures, not from the performance PoV) to run these tests in a
>>> full-fledged browser like Safari rather than in WebKitTestRunner ?
>>
>> Yes. There are many browser features that can significantly impact the
>> real world performance.
>>
>
> I'm specifically not asking about performance, but about correctness.

WebKitTestRunner and Safari load pages differently, so I wouldn't be
surprised if there were differences.

One of the most important differences is that WebKitTestRunner is
hidden outside the viewport by default, so any graphics benchmark run
this way may not accurately measure performance, since the GPU may not
be pushing anything to the screen and may avoid doing some work.

However, the primary use of benchmarks is to keep track of performance,
and for that purpose, WebKitTestRunner is not adequate.

> This discussion was started because Filip said that running JS tests on
> a browser catches many failures that are not caught when running the
> tests from the terminal.
>
> So, I'm wondering if running the JS tests on WTR or Safari makes any
> difference when catching failures.

For that purpose, WebKitTestRunner is probably good enough.

You can probably add new driver code to benchmark_runner to support it
easily, since the framework is designed to be generic enough to support
even non-WebKit-based browsers.
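
As a rough illustration of what such a driver could look like, here is
a minimal sketch.  The BrowserDriver base class, its method names, and
the binary path are hypothetical stand-ins, not the actual
benchmark_runner interface, and passing the URL as the only
command-line argument is an assumption:

```python
import abc
import subprocess


class BrowserDriver(abc.ABC):
    """Hypothetical driver interface; the real benchmark_runner API may differ."""

    @abc.abstractmethod
    def launch_command(self, url):
        """Return the argv list that opens `url` in the browser."""

    def launch_url(self, url, runner=subprocess.Popen):
        # `runner` is injectable so the driver can be exercised
        # without starting a real browser process.
        return runner(self.launch_command(url))


class WebKitTestRunnerDriver(BrowserDriver):
    """Hypothetical driver that points WebKitTestRunner at a benchmark URL."""

    def __init__(self, binary_path):
        # Path to the locally built binary (assumed to be supplied by the caller).
        self.binary_path = binary_path

    def launch_command(self, url):
        # Assumption: the binary accepts the URL as its only argument.
        return [self.binary_path, url]
```

A driver for MiniBrowser or any other browser would only need to
override launch_command in the same way.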

>>> We already have a performance test bot running tests inside WTR.
>>> And I see that the current set of tests executed on this bot already
>>> includes Speedometer, and that JetStream and Sunspider are skipped on
>>> PerformanceTests/Skipped.
>>>
>>> So I see some options going forward:
>>>
>>>  - Fix the JetStream and Sunspider tests so they can be run as part of
>>> the current run-perf-tests script that the performance bots execute.
>>
>> We should use run-benchmark instead since run-benchmark spits out the
>> JSON file that's compatible with run-perf-tests.
>>
>
> I'm a bit lost here. Are you planning to deprecate run-perf-tests in
> favor of run-benchmark? What is wrong with run-perf-tests?

No.  run-perf-tests will be maintained to keep running our micro perf
tests, as they have proven to be useful for tracking some aspects of
performance.  run-benchmark is a new tool that allows us to easily run
third-party benchmarks, which don't require the use of the runner.js
script or WebKitTestRunner features.

>>>  - Implement support on the script run-benchmark to run the tests inside
>>> WTR, and create a new step running this script that will be executed on
>>> the test bots.
>>
>> I don't see a point in doing this.   Why is it desirable to run these
>> benchmarks inside WebKitTestRunner?
>>
>
> Fewer dependencies: WTR (or the MiniBrowser) is something that is
> currently built by the bots on each build.
> If we want to use Epiphany (for example) for the performance tests, it
> is another thing we have to take care of building before each run. Not
> a big deal, but I wonder if it is really worth it.

I see.  You can easily write a new browser driver for MiniBrowser or
WebKitTestRunner.

>>>  - Deploy a new bot that runs run-perf-tests on a full-fledged browser
>>> like Safari or Epiphany.
>>
>> We should just do this.
>>
>>> I wonder which option you think is best, or whether any of them is
>>> not viable.
>>>
>>> From my PoV, the option #1 has the advantage of reusing the current
>>> infrastructure that collects and draws performance data at
>>> https://perf.webkit.org
>>
>> We have an internal instance of the same dashboard to which we're
>> reporting results of run-benchmark script.
>>
>
> What about making this public? We will happily contribute with a
> GTK+/Linux buildbot for it.

We can't since it contains proprietary data.

However, the perf dashboard we use internally is identical to the one
deployed at perf.webkit.org, so we should be able to submit results to
perf.webkit.org from run-benchmark easily.  (We probably just need to
copy & paste some code from run-perf-tests.)

- R. Niwa