[webkit-dev] A simpler proposal for handling failing tests WAS: A proposal for handling "failing" layout tests and TestExpectations

Sat Aug 18 17:55:13 PDT 2012

On Aug 18, 2012, at 5:11 PM, Filip Pizlo <fpizlo at apple.com> wrote:

> Maybe at this point we can agree to let Dirk land some variant of this with whatever half-way sensible name (any of the options on the table are decent) and see how it works?
> 
> It seems that the only thing anyone is disagreeing over is naming and which files to keep around, which is a much smaller set of differences than status-quo versus any variant of this proposal. 

I agree that we should adopt some variant over the status quo. As you rightly noted, there are too many different ways to handle tests that deviate from the original expectation, and we have the opportunity to obsolete most of those ways with an approach that combines advantages of multiple current approaches.

However, I fear that whatever names we pick for the first round will then be unchangeable due to status quo bias (which we see a lot of in test infrastructure discussions, including this one). And anyone arguing against change at that point will have a valid argument that a huge global rename of tests is a bad idea. So I think it's worth expending a little effort now to find names that are good.

Would you object to -expected-failure/-unexpected-pass as a naming scheme, along with the approach of keeping both around when they are used?

Regards,
Maciej

> 
> -Filip
> 
> On Aug 18, 2012, at 2:01 PM, Maciej Stachowiak <mjs at apple.com> wrote:
> 
>> 
>> On Aug 18, 2012, at 1:08 AM, Filip Pizlo <fpizlo at apple.com> wrote:
>> 
>>> I like your idea of having both the result-we-currently-expect and the result-we-think-may-be-more-correct checked in.  I still prefer Dirk's naming scheme though.
>> 
>> I think if we had both checked in, the result-we-think-may-be-more-correct should be named something other than -expected, since it is not, in fact, expected. That was the basis of my naming scheme. 
>> 
>> I think I would be happy with any scheme that had both checked in, and matched the criteria that you never have a file named -expected that is unexpected. For example, there could be schemes with no file named expected. If you let it be verbose, you could have:
>> 
>> Single result:
>>   foo-expected.txt
>> 
>> Possibly-worse current result, possibly-better older result:
>>   foo-expected-failure.txt
>>   foo-unexpected-pass.txt
>> 
>>> 
>>> I get the notion that "expected" always means literally what it seems to mean from the standpoint of whether the tooling is silent for the test (actual == expected) or has something to say.
>>> 
>>> But I think that if the tooling is behaving right, your concern that "a test would fail if it did *not* match the "failing" result" would be addressed: the tooling could be silent for actual == failing (if a failing file exists) but notify you of an "unexpected pass" if actual == expected.
>> 
>> But if you match neither, you get a failure for not matching the "failing" result. That still strikes me as a little goofy. Not failing is failing, and getting the expected result is unexpected. I think my extra-verbose naming scheme above would better match what you suggest the tool UI would do. Maybe there is a more concise way to get the same point across.
>> 
>> Regards,
>> Maciej
>>
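
For illustration only, the tool behavior discussed in the quoted exchange above could be sketched roughly as follows. This is a hypothetical sketch, not code from run-webkit-tests or any landed patch: the function name classify, the dictionary keys, and the returned status strings are all assumptions made up for the example. The keys stand for the baseline suffixes proposed above (-expected, -expected-failure, -unexpected-pass).

```python
from typing import Dict


def classify(actual: str, baselines: Dict[str, str]) -> str:
    """Classify a test's actual output against its checked-in baselines.

    `baselines` maps a hypothetical suffix key ("expected",
    "expected-failure", "unexpected-pass") to that file's contents.
    The returned status strings are illustrative, not real tool output.
    """
    if "expected-failure" in baselines:
        # Two-baseline case: the current, possibly-worse result is what
        # we expect, so the tool would stay silent when it matches.
        if actual == baselines["expected-failure"]:
            return "expected failure"
        # Matching the possibly-better older result is worth reporting.
        if actual == baselines.get("unexpected-pass"):
            return "unexpected pass"
        # Matching neither baseline is a genuinely new result.
        return "unexpected result"

    # Single-baseline case: plain comparison against -expected.
    if actual == baselines.get("expected"):
        return "pass"
    return "fail"


if __name__ == "__main__":
    two = {"expected-failure": "old broken output",
           "unexpected-pass": "output we think is more correct"}
    print(classify("old broken output", two))
    print(classify("output we think is more correct", two))
    print(classify("something else entirely", two))
```

Under this naming, no file called -expected ever holds a result that is not actually expected, which is the criterion stated in the quoted text.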