[webkit-dev] handling failing tests (test_expectations, Skipped files, etc.)

Tue Apr 10 11:58:51 PDT 2012

On Tue, Apr 10, 2012 at 11:42 AM, Stephen Chenney <schenney at chromium.org> wrote:

> On Tue, Apr 10, 2012 at 1:00 PM, Ryosuke Niwa <rniwa at webkit.org> wrote:
>
>> On Tue, Apr 10, 2012 at 6:10 AM, Stephen Chenney <schenney at chromium.org> wrote:
>>
>>> There is a significant practical problem to "turn the tree red and work
>>> with someone to rebaseline the tests". It takes multiple hours for some
>>> bots to build and test a given patch. That means, at any moment, you will
>>> have maybe tens and in some cases hundreds of failing tests associated with
>>> some changelist that you need to track on the bots. You might have more
>>> failing tests associated with a different changelist, and so on.
>>
>>
>> But you have to do this for non-Chromium ports anyway because they don't
>> use test_expectations.txt and skipping the tests won't help you generate
>> new baselines. In my opinion, we should not further diverge from the way
>> things are done in other ports.
>>
>
> How long on average does it take a builder to get through a change on
> another port? Right now the Chromium Mac 10.5 and 10.6 dbg builds are
> processing a patch from about 3 hours ago. About 20 patches have gone in
> since then. For the Mac 10.5 tree to ever be green would require there
> being no changes at all requiring new baselines for a 3 hour window.
>
> Just because other teams do it some way does not mean that Chromium, with
> its greater number of bots and platforms, should do it the same way.
>

Yes, it does mean that we should do it the same way. What if non-Chromium
ports started imposing arbitrary processes like this on the rest of us?
It would be total chaos, and nobody would understand the right thing to do
for all ports.

> We are discussing a process here, not code, and in my mind the goal is to
> have the tree be as green as possible with all failures tracked with a
> "minimal" expectations file and as little engineer time as possible.
>

That's not our project goal. We have continuous builds and regression tests
to prevent regressions and improve stability, not to keep the bots green.
Please review http://www.webkit.org/projects/goals.html

> Just look at how often the non-Chromium Mac and Win builds are red. In
> particular, changes submitted via the commit queue take an indeterminate
> amount of time to go in, anything from an hour to several hours. Patch
> authors do not necessarily even have control over when the CQ+ is given.
>

That's why I don't use the commit queue when I know my patch requires
platform-dependent rebaselines.

> Even when manually committing, if it takes 3 hours to create baselines
> then no patches go in in the afternoon. What if the bots are down or
> misbehaving?
>

We need to promptly fix those bots.

> I would also point out the waste of resources when every contributor needs
> to track every failure around commit time in order to know when their own
> changes cause failures, and then track the bots to know when they are free
> to go home.
>

But that's clearly stated in the contribution guidelines.

>>> Why not simply attach an owner and a resolution date to each expectation?
>>> The real problem right now is accountability and a way to remind people
>>> that they have left expectations hanging.
>>>
>>
>> That's what WebKit bugs are for. Ossy frequently files a bug and cc's
>> the patch author when a new test is added or a test starts failing and he
>> doesn't know whether the new result is correct or not. He also either skips
>> the test or rebaselines it as needed. He also reverts patches when the
>> patch clearly introduced serious regressions (e.g. crashes on hundreds of
>> tests).
>>
>
> Yes, Ossy does an excellent job of gardening. Unfortunately, on Chrome we
> have tens if not hundreds of gardeners and, as this thread has revealed, no
> clear agreement on the best way to garden.
>

That IS the problem. We have too many inexperienced gardeners who don't
understand the WebKit culture or the WebKit process.

> I strongly believe that keeping the tree green is more important than
> having a clean expectations file.
>

I disagree. You're effectively just disabling the test temporarily.

> Finally, there is no pain-free way to do this. The question is how to
> distribute the pain. Right now each gardener is using a process that
> distributes pain in their preferred way. From a community standpoint it
> would be nice if the Chromium team could come up with something consistent.
>

The process the Chromium port uses should be consistent with non-Chromium ports.

- Ryosuke