[webkit-dev] Best practices for landing new/changed layout test expectations?

Mon Feb 25 16:27:09 PST 2013

On Mon, Feb 25, 2013 at 3:36 PM, Glenn Adams <glenn at skynav.com> wrote:

>
> On Mon, Feb 25, 2013 at 3:53 PM, Ryosuke Niwa <rniwa at webkit.org> wrote:
>
>> On Mon, Feb 25, 2013 at 2:48 PM, Dirk Pranke <dpranke at google.com> wrote:
>>
>>> On Mon, Feb 25, 2013 at 2:05 PM, Ryosuke Niwa <rniwa at webkit.org> wrote:
>>> > On Mon, Feb 25, 2013 at 1:39 PM, Dirk Pranke <dpranke at google.com>
>>> wrote:
>>> >> Are you suggesting we should land a "failling" baseline in the
>>> meantime?
>>> >
>>> >
>>> > No. I'm suggesting patch authors perform their due diligence and
>>> either ask
>>> > port maintainers to rebaseline or rebaseline tests themselves.
>>> >
>>>
>>> I think either you misunderstood my question, or I am misunderstanding
>>> your answer. I'm not asking "who", I'm asking "what" ...
>>>
>>> If we know some tests are failing, and when we fix a bug the tests
>>> will start passing again (but that patch might not land for quite some
>>> time), what should we (anyone) do in the meantime? Leave the tree red,
>>> land "incorrect" -expected baselines so that we can catch changes in
>>> behavior, or add lines to TestExpectations? Many of the lines you
>>> cited fell into the last category.
>>>
>>
>> Either one of those two solutions would work (although I strictly advice
>> we do the latter) when there are failing tests and we need a fix in order
>> for those tests to pass but I'm not interested in discussing that matter at
>> the moment.
>>
>> I'm specifically opposed to adding new entries to TestExpectations for
>> the sole purpose of rebaselining them later.
>>
>
> One of the modes that the new generic TE file permits is to specify a
> generic Skip and overriding this with per-port Pass. This provides (IMO) a
> more manageable way to land a patch that you expect will require per-port
> mods to fully get the patch working on the primary ports. It also allows
> one to land a patch containing tests where a subsequent patch will provide
> the functionality to be tested.
>
> I recently used both of these modes to sub-divide a large patch and to
> incrementally verify (landing individual per-port fixes as needed)
> individual ports.
>
> Personally, I prefer this over the "just turn bots red" mode. I suspect
> the turn bots red approach results in greater churn, due to rollouts and
> re-landing, than the finer grain approach I outlined above. I agree it
> requires the committer to follow through on other ports and eventually
> remove the generic Skip and Pass rules (once it works everywhere), but we
> have to assume here (IMO) that committers will do the right thing and take
> responsibility for the process. [And if not, then remind them.]
>
> So I for one, would vote against the "turn the bots red" approach.
> Overall, I think that approach produces a higher interrupt load on our
> limited committer and reviewer resources. I think it would be useful to
> give other procedures an opportunity to be tried out so we can have more
> data for choosing between alternative approaches.
>

This has been tried as many people including yourself are using this
approach in landing and rebaselining patches. In particular, a lot of new
contributors from Chromium port use this strategy.

When I used to work for Google, I periodically went through all entries in
TestExpectations files that needed rebaselines and manually rebaselined
them myself because people just leave entries like I've cited in another
email all the time.

Quite frankly, I don't want it to be my (or anyone but patch author's) job
to take care of all these stale entries people add.

- R. Niwa
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.webkit.org/pipermail/webkit-dev/attachments/20130225/7d2c3469/attachment.html>