[webkit-dev] Gated trunk, experiences from OpenStack

Tue Feb 5 17:55:48 PST 2013

On Tue, Feb 5, 2013 at 3:34 PM, Tim Ansell <mithro at mithis.com> wrote:
> On 6 February 2013 07:17, Dirk Pranke <dpranke at chromium.org> wrote:
>>
>> On Tue, Feb 5, 2013 at 9:46 AM, Martin Robinson <mrobinson at webkit.org>
>> wrote:
>> > On Tue, Feb 5, 2013 at 9:28 AM, Adam Barth <abarth at webkit.org> wrote:
>> >> Do you know how they got rid of flakiness in their tests?  We've spent
>> >> a bunch of effort fixing flaky tests (and in marking the remaining
>> >> flaky tests as flaky), but there's still a long tail of flakiness.  I
>> >> wonder if that sort of thing might be different for OpenStack if they
>> >> have a different approach to testing than we do.
>
>
> From what I can see they have a pretty similar goal to us. I personally
> don't know where our test flakiness comes from, so can't really comment on
> how we could fix it.
>
>>
>> >
>> > Another useful thing is to know the number of tests in OpenStack.
>> > WebKit has more tests than any other project I've worked on.
>> >
>>
>> There are two other related aspects that make our tests flaky:
>>
>> 1) They're very high level integration tests (mostly), which, as they
>> cover large swaths of code in each test, are much more susceptible to
>> flakiness than method-level unit tests.
>
>
> While OpenStack doesn't have anywhere near the number of integration tests
> WebKit does, it does have large integration tests. In fact, one of their
> tests brings up a whole cloud stack and checks that you can operate the
> cluster.
>
>>
>> 2) They weren't generally written to be run in parallel, and thus we
>> often have to be concerned with system-level resource contention.
>
>
> Neither were OpenStack's originally. They made heavy use of a tool called
> testr ( http://pypi.python.org/pypi/testrepository ) which has a mode to
> automatically find when two tests are interfering with each other. testr
> also has a bunch of other useful features, like re-running only the tests
> that are currently failing, keeping a database of test runs, and collecting
> stats.
>

Ah, the testr isolation bisection does look interesting. I have done a
little work along those lines but haven't gotten very far.
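
For anyone unfamiliar with the idea, here is a rough sketch of what
isolation bisection could look like. It is an illustration only and not
testr's actual implementation: find_interfering_tests and the
target_fails_after callback are hypothetical names, and the sketch assumes
the failure reproduces deterministically when the same tests run first.

```python
from typing import Callable, List, Sequence


def find_interfering_tests(
    earlier_tests: Sequence[str],
    target_fails_after: Callable[[List[str]], bool],
) -> List[str]:
    """Narrow down which earlier tests make a target test fail.

    target_fails_after(tests) is a hypothetical callback: it runs `tests`
    in order, then the target test, and returns True if the target failed.
    """
    candidates = list(earlier_tests)
    if not target_fails_after(candidates):
        return []  # failure does not reproduce; nothing to bisect

    while len(candidates) > 1:
        mid = len(candidates) // 2
        first, second = candidates[:mid], candidates[mid:]
        if target_fails_after(first):
            candidates = first
        elif target_fails_after(second):
            candidates = second
        else:
            # The failure needs tests from both halves, so plain bisection
            # cannot narrow it further.
            break
    return candidates


# Stand-in runner (an assumption for the demo): the target fails whenever
# one particular earlier test has run before it.
def fake_runner(tests: List[str]) -> bool:
    return "test_leaves_temp_file" in tests


suite = ["test_a", "test_b", "test_leaves_temp_file", "test_c", "test_d"]
print(find_interfering_tests(suite, fake_runner))
```

Each round halves the candidate list, so a single culprit among N earlier
tests takes on the order of log2(N) rounds, each with at most two runs. When
the failure only shows up with tests from both halves present, the sketch
stops and returns the smallest set it reached.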

-- Dirk