[webkit-dev] Best practices for landing new/changed layout test expectations?

Mon Feb 25 13:29:34 PST 2013

On Mon, Feb 25, 2013 at 1:22 PM, Ryosuke Niwa <rniwa at webkit.org> wrote:
> On Mon, Feb 25, 2013 at 12:52 PM, Dirk Pranke <dpranke at google.com> wrote:
>>
>> On Mon, Feb 25, 2013 at 12:38 PM, Eric Seidel <eseidel at google.com> wrote:
>> > I've noticed as of late several different approaches being used when
>> > adding/changing LayoutTests which need rebaselining on other platforms.
>> >
>> > Obviously we cannot expect developers to test/rebaseline on all
>> > platforms
>> > before landing given our current infrastructure.
>> >
>> > But what should we expect them to do?
>> >
>> > Currently some folks add failing expectations to other ports
>> > TestExpectations.  Some add [skip].  Some even add them to the (new)
>> > global
>> > TestExpectations file.
>> >
>> > What's the proper course of action?
>> >
>> > I checked:
>> > http://trac.webkit.org/wiki/Keeping%20the%20Tree%20Green
>> >
>> > http://trac.webkit.org/wiki/Creating%20and%20Submitting%20Layout%20Tests%20and%20Patches
>> > http://trac.webkit.org/wiki/TestExpectations
>> > and didn't see the "I expect this to fail/need rebaseline on other
>> > ports"
>> > case discussed.
>> >
>>
>> Memory says that the last time this was raised [1], the consensus was
>> to land the change, turn the tree red, notify as many gardeners / port
>> maintainers as possible, and rebaseline as quickly as possible. I.e.,
>> don't add entries to TestExpectations. Of course, such a policy
>> wouldn't play nicely w/ the EWS bots, and I think this discussion
>> predated everyone switching over to TestExpectations (and certainly
>> predated the generic expectations file).
>>
>> I don't find the above entirely satisfactory, but I also don't have
>> any great alternatives to endorse given the existing tooling.
>>
>> Also, if the test fails generically (all ports), it probably shouldn't
>> be landed. I'm not sure why you couldn't at least update the port
>> you're developing on; if you don't, how do you know your fix is
>> working at all? (Of course, there could be a case where a
>> high-priority fix might break some low-priority tests).
>
>
> Adding a new feature behind a build flag disabled everywhere by default.
>
>> > I remember some discussion of a [rebaseline] keyword in
>> > TestExpectations,
>> > but I'm not sure that ever made it in?
>>
>> The [ NeedsRebaseline ] enhancement is, as of yet, unimplemented and
>> unclaimed [2]. It shouldn't be too hard for someone to try it if
>> they're looking for a reason to explore the NRWT code :)
>
>
> I object to adding such a thing. People add and forget about these entries
> way too often:
>

The rationale for adding the keyword was precisely this ... it would
make it easier to programmatically identify tests needing updated
baselines and alert people :).

Do you have an alternative you'd suggest for keeping people from
forgetting (particularly in the case where we have an expected failure
waiting on a bug fix)?

-- Dirk