[webkit-dev] A proposal for handling "failing" layout tests and TestExpectations

Thu Aug 16 14:13:52 PDT 2012

On Wed, Aug 15, 2012 at 6:02 PM, Filip Pizlo <fpizlo at apple.com> wrote:
>
> 2) Possibility of the sheriff getting it wrong.
>
> (2) concerns me most.  We're talking about using filenames to serve as a
> kind of unchecked comment.  We already know that comments are usually bad
> because there is no protection against them going stale.
>

Sheriffs can already get things wrong (and rebaseline when they
shouldn't). I believe that adding passing/failing to expected will
make things better in this regard, not worse.

Another idea/observation is that if we have multiple types of
expectation files, it might be easier to set up watchlists, e.g., "let
me know whenever a file gets checked into fast/forms with an -expected
or -failing result". It seems like this might be useful, but I'm not
sure.
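
A rough sketch of the kind of matching such a watchlist would need, in Python. The pattern and the helper name `watched_paths` are hypothetical, made up for illustration; this is not an existing watchlist feature, and the "-failing" suffix is only the naming floated in this thread.

```python
import re

# Hypothetical watch rule: result files anywhere under fast/forms whose
# name ends in -expected or -failing, followed by a file extension.
WATCH_PATTERN = re.compile(
    r"(?:^|/)fast/forms/.*-(?:expected|failing)\.[^/.]+$"
)


def watched_paths(changed_paths):
    """Return the changed paths that match the hypothetical watch rule."""
    return [p for p in changed_paths if WATCH_PATTERN.search(p)]
```

Given the paths touched by a commit, this would return only the result files a fast/forms watcher cares about, and ignore the tests themselves and results in other directories.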

> In particular, to further clarify my position, if someone were to argue that
> Dirk's proposal would be a wholesale replacement for TestExpectations, then
> I would be more likely to be on board, since I very much like the idea of
> reducing the number of ways of doing things.  Maybe that's a good way to
> reach compromise.
>
> Dirk, what value do you see in TestExpectations were your change to be
> landed?  Do scenarios still exist where there would be a test for which (a)
> there is no -fail.* file, (b) the test is not skipped, and (c) it's marked
> with some annotation in TestExpectations?  I'm most interested in the
> question of whether such scenarios exist, since in my experience, whenever a test is
> not rebased, is not skipped, and is marked as failing in TestExpectations,
> it ends up just causing gardening overhead later.

This is a good question, because it is definitely my intent that this
change replace some existing practices, not add to them.

Currently, the Chromium port uses TestExpectations entries for four
different kinds of things: tests we don't ever plan to fix (WONTFIX),
tests that we skip because not doing so causes other tests to break,
tests that fail (reliably), and tests that are flaky.

Skipped files do not let you distinguish (programmatically) between
the first two categories, and so my plan is to replace Skipped files
with TestExpectations (using the new syntax discussed a month or so
ago) soon (within the next week or two at the latest).

I would like to replace using TestExpectations for failing tests (at
least for tests that are expected to keep failing indefinitely because
no one is actively working on a fix) with this new mechanism.

That leaves flaky tests. One can debate what the right thing to do with
flaky tests is here; I'm inclined to argue that flakiness is at least
as bad as failing, and we should probably be skipping them, but the
Chromium port has not yet actively tried this approach (I think other
ports probably have experience here, though).
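
To summarize the split described above, here is a minimal sketch in Python. The enum, the table, and the string labels are all hypothetical names for illustration, not anything in the tools; the "-failing" name is just the one used earlier in this thread, and the flaky entry reflects that the question is still open.

```python
from enum import Enum


class Kind(Enum):
    """The four things Chromium uses TestExpectations entries for today."""
    WONTFIX = "never plan to fix"
    SKIP_BREAKS_OTHERS = "skipped because running it breaks other tests"
    FAILING = "fails reliably"
    FLAKY = "flaky"


# Where each kind would live under the proposal (illustrative labels only).
PROPOSED_HOME = {
    Kind.WONTFIX: "TestExpectations",             # replaces Skipped files
    Kind.SKIP_BREAKS_OTHERS: "TestExpectations",  # replaces Skipped files
    Kind.FAILING: "-failing result file",         # the new mechanism
    Kind.FLAKY: "undecided (possibly skipped)",   # still up for debate
}


def proposed_home(kind):
    """Look up where a given kind of test would be recorded."""
    return PROPOSED_HOME[kind]
```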

Does that help answer your question / sway you at all?

-- Dirk