No subject

Mon Jan 28 08:41:14 PST 2013

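On 6 February 2013 07:17, Dirk Pranke <dpranke at chromium.org> wrote:

> On Tue, Feb 5, 2013 at 9:46 AM, Martin Robinson <mrobinson at webkit.org> wrote:
> > On Tue, Feb 5, 2013 at 9:28 AM, Adam Barth <abarth at webkit.org> wrote:
> >> Do you know how they got rid of flakiness in their tests?  We've spent
> >> a bunch of effort fixing flaky tests (and in marking the remaining
> >> flaky tests as flaky), but there's still a long tail of flakiness.  I
> >> wonder if that sort of thing might be different for OpenStack if they
> >> have a different approach to testing than we do.

From what I can see they have a pretty similar goal to us. I personally
don't know where our test flakiness comes from, so I can't really comment on
how we could fix it.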

> >
> > Another useful thing is to know the number of tests in OpenStack.
> > WebKit has more tests than any other project I've worked on.
> >
>
> There are two other related aspects that make our tests flaky:
>
> 1) They're very high level integration tests (mostly), which, as they
> cover large swaths of code in each test, are much more susceptible to
> flakiness than method-level unit tests.
>

While OpenStack doesn't have anywhere near the number of integration tests
WebKit does, it does have large integration tests. In fact, one of their
tests brings up a whole cloud stack and checks that you can operate the
cluster.

> 2) They weren't generally written to be run in parallel, and thus we
> often have to be concerned with system-level resource contention.
>

Neither were OpenStack's originally. They made heavy use of a tool called
*testr* ( http://pypi.python.org/pypi/testrepository ), which has a mode to
automatically find when two tests are interfering with each other. testr
also has a bunch of other useful features, like re-running only the tests
that are currently failing, and keeping a database of test runs that allows
stat collection.
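
To make the "find interfering tests" idea concrete, here is a rough Python
sketch of one way such a search could work: bisect the tests that ran before
the failing one. This is not testr's actual implementation. The function name
and the `passes` hook are hypothetical, and the sketch assumes a single,
deterministic culprit.

```python
from typing import Callable, List, Optional, Sequence


def find_interfering_test(
    earlier: Sequence[str],
    victim: str,
    passes: Callable[[List[str]], bool],
) -> Optional[str]:
    """Return one test from `earlier` that makes `victim` fail, or None.

    `passes(order)` is a hypothetical hook: it should run the tests in
    `order` in a fresh process and report whether the last one (the
    victim) passed. Assumes exactly one culprit and deterministic runs.
    """
    if not passes([victim]):
        return None  # victim fails on its own, so this is not interference
    if passes(list(earlier) + [victim]):
        return None  # nothing in `earlier` breaks the victim

    candidates = list(earlier)
    while len(candidates) > 1:
        half = candidates[: len(candidates) // 2]
        if not passes(half + [victim]):
            candidates = half  # culprit is in the first half
        else:
            candidates = candidates[len(half):]  # culprit is in the rest
    return candidates[0]


if __name__ == "__main__":
    # Simulated runner: the victim fails whenever test_b ran before it.
    def fake_passes(order: List[str]) -> bool:
        return "test_b" not in order[:-1]

    print(find_interfering_test(
        ["test_a", "test_b", "test_c", "test_d"], "test_victim", fake_passes))
```

Each call to `passes` is a full test run, so bisection keeps the number of
runs roughly logarithmic in the number of earlier tests.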

We too could use testr if our tests output the subunit format. The subunit
format was originally developed for Python and has excellent Python support,
so I think it should be pretty trivial to add.
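
As a rough illustration of how small the change might be, here is a sketch
of a `unittest.TestResult` subclass that writes a simplified form of the
subunit v1 text protocol (`test:`, `success:`, `failure:` and `error:`
lines). It uses only the standard library. The class and function names are
made up for this example, and a real integration would more likely use the
python-subunit package than hand-rolled output.

```python
import io
import unittest


class SubunitV1Result(unittest.TestResult):
    """Writes simplified subunit v1 style lines to a text stream.

    Sketch only: it does not escape traceback lines that consist of a
    lone "]", which the real protocol has to handle.
    """

    def __init__(self, stream):
        super().__init__()
        self._stream = stream

    def startTest(self, test):
        super().startTest(test)
        self._stream.write("test: %s\n" % test.id())

    def addSuccess(self, test):
        super().addSuccess(test)
        self._stream.write("success: %s\n" % test.id())

    def addFailure(self, test, err):
        super().addFailure(test, err)
        # The base class stores (test, formatted traceback) in self.failures.
        self._write_details("failure", test, self.failures[-1][1])

    def addError(self, test, err):
        super().addError(test, err)
        self._write_details("error", test, self.errors[-1][1])

    def _write_details(self, status, test, text):
        if not text.endswith("\n"):
            text += "\n"
        # Details are bracketed: "status: id [", the text, then "]".
        self._stream.write("%s: %s [\n%s]\n" % (status, test.id(), text))


def run_demo():
    """Run two throwaway tests and return the emitted stream as a string."""

    class Demo(unittest.TestCase):
        def test_ok(self):
            self.assertEqual(1 + 1, 2)

        def test_bad(self):
            self.assertEqual(1 + 1, 3)

    out = io.StringIO()
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(Demo)
    suite.run(SubunitV1Result(out))
    return out.getvalue()


if __name__ == "__main__":
    print(run_demo(), end="")
```

A stream like this is what testr would read from the command it launches to
run the tests.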

Tim 'mithro' Ansell
