[webkit-dev] A proposal for handling "failing" layout tests and TestExpectations

Wed Aug 15 18:24:35 PDT 2012

On Aug 15, 2012, at 6:18 PM, Peter Kasting <pkasting at chromium.org> wrote:

> On Wed, Aug 15, 2012 at 6:02 PM, Filip Pizlo <fpizlo at apple.com> wrote:
> 2) Possibility of the sheriff getting it wrong.
> 
> (2) concerns me most.  We're talking about using filenames to serve as a kind of unchecked comment.  We already know that comments are usually bad because there is no protection against them going stale.
> 
> I don't see how this is parallel to stale comments.  Tests get continually compared to the given output and we see immediately when something changes.
> 
> It is certainly possible for whoever assigns the filenames to get things wrong.  There are basically two mitigations of this.  One is allowing the existing "expected.xxx" file extensions to remain, and encouraging people to leave them as-is when they're not sure whether the existing result is correct.  The other is for sheriffs to use this signal as just that -- a signal -- just as today we use the "expected.xxx" files as a signal of what the correct output might be.  The difference is that this can generally be considered a stronger signal.  Historically, there's been no real attempt to guarantee that an "expected" result is anything other than the test's current behavior.
> 
> I'm sure some would love to get rid of Skipped files just as much as I would love to get rid of TestExpectations files.  Both are valid things to love, and imply that there must surely exist a middle ground: a way of doing things that is strictly better than the sum of the two.
> 
> That's exactly what we're trying to do.
> 
> The value of this change is that hopefully it would dramatically reduce the amount of content in these, especially in TestExpectations files.  If you want to kill these so much, then this is a change you should at least let us test!

You still have to convince me that what you're trying to test is not simply the approach of rebaselining failing tests, and you also have to convince me that this is really a path to getting rid of TestExpectations entirely.

> 
> In particular, to further clarify my position, if someone were to argue that Dirk's proposal would be a wholesale replacement for TestExpectations, then I would be more likely to be on board, since I very much like the idea of reducing the number of ways of doing things.  Maybe that's a good way to reach compromise.
> 
> It's hard to know if we could completely eliminate them without testing this, but yes, one goal here is to greatly reduce the need for TestExpectations lines.  A related goal is to make the patterns and mechanisms used by all ports more similar.  As someone who has noted his frustration with both "different ways of doing things" and "philosophical directions chosen by one port", you would hopefully be well-served by this direction.

I'm opposed to adding more ways of doing things.  Right now you seem to be arguing that we should add more ways of handling failing tests because it *might* allow us to reduce the number of ways of handling failing tests.  But even if this is a matter of replacing TestExpectations with something else, then we still have, by my count, 5 different ways of dealing with failing tests.  In other words, we're just spinning our wheels in sand and padding our commit counts.  That's not worthwhile, particularly since this whole process will, at best temporarily, put us into a world where there are 6 different ways of dealing with failing tests.

If you can convince me that this is part of a credible path to reducing the number of ways of doing things from 5 to 4, rather than from 5 to 5, then that's a different story.

-F