[webkit-dev] handling failing tests (test_expectations, Skipped files, etc.)
ojan at chromium.org
Tue Apr 10 12:29:54 PDT 2012
I don't think we can come up with a hard and fast rule given current
tooling. In a theoretical future world in which it's easy to get expected
results off the EWS bots (or some other infrastructure), it would be
reasonable to expect people to incorporate the correct expected results for
any EWS-having ports before committing the patch. I expect we'd all agree
that would be better than turning the bots red or adding to
In the current world, it's a judgement call. If I expect a patch to need a
lot of platform-specific baselines, I'll make sure to commit it at a time
when I have hours to spare to clean up any failures or, if I can't stick
around for the bots to cycle, I'll add it to test_expectations.txt
Both approaches have nasty tradeoffs. It is probably worth writing up a
wiki page outlining these two options and explaining why you might do one
or the other for people new to the project, but I don't see benefit in
trying to pick a hard rule that everyone must follow.
On Tue, Apr 10, 2012 at 11:58 AM, Ryosuke Niwa <rniwa at webkit.org> wrote:
> On Tue, Apr 10, 2012 at 11:42 AM, Stephen Chenney <schenney at chromium.org> wrote:
>> On Tue, Apr 10, 2012 at 1:00 PM, Ryosuke Niwa <rniwa at webkit.org> wrote:
>>> On Tue, Apr 10, 2012 at 6:10 AM, Stephen Chenney <schenney at chromium.org> wrote:
>>>> There is a significant practical problem to "turn the tree red and work
>>>> with someone to rebaseline the tests". It takes multiple hours for some
>>>> bots to build and test a given patch. That means, at any moment, you will
>>>> have maybe tens and in some cases hundreds of failing tests associated with
>>>> some changelist that you need to track on the bots. You might have more
>>>> failing tests associated with a different changelist, and so on.
>>> But you have to do this for non-Chromium ports anyway because they don't
>>> use test_expectations.txt and skipping the tests won't help you generate
>>> new baselines. In my opinion, we should not further diverge from the way
>>> things are done in other ports.
>> How long on average does it take a builder to get through a change on
>> another port? Right now the Chromium Mac 10.5 and 10.6 dbg builds are
>> processing a patch from about 3 hours ago. About 20 patches have gone in
>> since then. For the Mac 10.5 tree to ever be green would require there
>> being no changes at all requiring new baselines for a 3 hour window.
>> Just because other teams do it some way does not mean that Chromium, with
>> its greater number of bots and platforms, should do it the same way.
> Yes, it does mean that we should do it the same way. What if non-Chromium
> ports started imposing arbitrary processes like this on the rest of us?
> It would be total chaos, and nobody would understand the right thing to do
> for all ports.
>> We are discussing a process here, not code, and in my mind the goal is to
>> have the tree be as green as possible with all failures tracked with a
>> "minimal" expectations file and as little engineer time as possible.
> That's not our project goal. We have continuous builds and regression
> tests to prevent regressions and improve stability, not to keep bots
> green. Please review http://www.webkit.org/projects/goals.html
>> Just look at how often the non-chromium mac and win builds are red. In
>> particular, changes submitted via the commit queue take an indeterminate
>> amount of time to go in, anything from an hour to several hours. Patch
>> authors do not necessarily even have control over when the CQ+ is given.
> That's why I don't use commit queue when I know my patch requires
> platform-dependent rebaselines.
>> Even when manually committing, if it takes 3 hours to create baselines
>> then no patches go in in the afternoon. What if the bots are down or
> We need to promptly fix those bots.
>> I would also point out the waste of resources when every contributor needs
>> to track every failure around commit time in order to know when their own
>> changes cause failures, and then track the bots to know when they are free
>> to go home.
> But that's clearly stated in the contribution guidelines.
>>>> Why not simply attach an owner and a resolution date to each
>>>> expectation? The real problem right now is accountability and a way to
>>>> remind people that they have left expectations hanging.
>>> That's what WebKit bugs are for. Ossy frequently files a bug and cc'es
>>> the patch author when a new test is added or a test starts failing and he
>>> doesn't know whether the new result is correct or not. He also either skips the
>>> test or rebaselines the test as needed. He also reverts patches when the
>>> patch clearly introduced serious regressions (e.g. crashes on hundreds of
>> Yes, Ossy does an excellent job of gardening. Unfortunately, on Chrome we
>> have tens if not hundreds of gardeners and, as this thread has revealed, no
>> clear agreement on the best way to garden.
> That IS the problem. We have too many inexperienced gardeners who don't
> understand the WebKit culture or the WebKit process.
>> I strongly believe that keeping the tree green is more important than
>> having a clean expectations file.
> I disagree. You're effectively just disabling the test temporarily.
>> Finally, there is no pain-free way to do this. The question is how to
>> distribute the pain. Right now each gardener is using a process that
>> distributes pain in their preferred way. From a community standpoint it
>> would be nice if the Chromium team could come up with something consistent.
> The process the Chromium port uses should be consistent with non-Chromium
> - Ryosuke