[webkit-dev] Skipping Flakey Tests

Maciej Stachowiak mjs at apple.com
Mon Sep 28 17:00:05 PDT 2009

On Sep 28, 2009, at 4:40 PM, Eric Seidel wrote:

> On Fri, Sep 25, 2009 at 2:42 PM, Maciej Stachowiak <mjs at apple.com>  
> wrote:
>> I like Dave Levin's idea that the first action should be to  
>> instrument the
>> tests so we can find out why they intermittently fail. Especially  
>> if the
>> failure is reproducible on the bots but not on developer systems.
> I like this idea too.  I don't like the reality that flakey tests are
> a burden on all developers caused by one.

Then in my opinion that's what we should do first, when a test is  
failing sporadically and the cause is unknown.

>> Using the
>> skip list should be a last resort, because that hides the failure  
>> instead of
>> helping us diangose the cause.
> I (respectfully) disagree.  I think we shouldn't be so afraid to skip
> tests.  We don't allow people to check in compiles which fail.  We
> don't allow people to check in tests which fail on other platforms
> (without skipping them) or on every other run.  Why should we allow
> people to check in tests which fail every 10 runs?  Or worse, why
> should we leave a known flakey test checked in/un-attended which fails
> every 10 runs?

If a brand new test fails every 10 runs, then we should revert the  
patch that landed it, just as if it had caused a test to always fail.  
However, I get the impression that many "flaky test" issues appear  
after the fact - a test that has been running fine for a long time  
starts failing sporadically. In that kind of case, it seems likely  
that a subsequent code change and not the original test is at fault.  
The challenge is that it may be difficult to identify the code change  
that made the test start failing sporadically. But for example if a  
test newly started failing 100% of the time, we would not consider it  
an appropriate fix to disable that test. Or at least I wouldn't.

> If we can't easily roll-out the failing tests (or the commit which
> cause them to start failing), we should skip them to keep the bots (a
> shared resource) green, so as not to block other work on the project.
> No?

The reason for the "keep the bots green" rule is to prevent  
regressions. If we maintain the rule by disabling tests, we are  
sacrificing the actual purpose of the rule for the sake of pro forma  
adherence. Thats why I'd like it to be a last resort, unless the test  
is new and its clear it was flaky from the get-go. The first resort  
should be to get the person who made the test

> I very much like WebKit's "everyone is responsible for the whole
> project" culture, but I disagree that the burden of diagnosis should
> be on the person trying to make a completely unrelated checkin (as is
> the case when we leave flakey tests enabled in the tree).

This normally hasn't been a huge problem, other than for the commit  
queue script.

> -eric
> p.s. I now have two "skipping flakey tests" changes up for review:
> https://bugs.webkit.org/show_bug.cgi?id=29322

If Brady and Alexey are ok with disabling this test, then I'm fine  
with it. I like Alexey's suggestion to have a separate "flaky tests"  
list that the commit queue script can ignore, without preventing them  
from being run at all.

> https://bugs.webkit.org/show_bug.cgi?id=29344

This one seems like the test was always buggy, so OK to turn off. I do  
think the test can be fixed however, despite comments to the contrary  
in the bug.


More information about the webkit-dev mailing list