[webkit-dev] handling failing tests (test_expectations, Skipped files, etc.)

Mon Apr 9 16:09:23 PDT 2012

On Mon, Apr 9, 2012 at 3:50 PM, Julien Chaffraix <jchaffraix at webkit.org> wrote:
>> In my ideal world, you would be able to get updated baselines *prior*
>> to trying to land a patch. This is of course not really possible today
>> for any test that fails on multiple ports with different results, as
>> it's practically impossible to run more than a couple of ports by
>> hand, and we don't have a bot infrastructure to help you here.
>
> That would be the best outcome indeed, however that would more or less
> require all the EWS to be able to run the tests (including pixel tests
> for the platforms that enable them).
>

Yes. Probably even harder: we'd also have to build tooling to extract
the results from those bots and merge them into your patch.

>> 3) the current tool-of-the-day for managing rebaselining,
>> garden-o-matic, is much better suited to handling the "unexpected"
>> failures on the bots rather than the "expected" failures you've
>> introduced.
>
> You could see it the other way. How could we make garden-o-matic
> handle newly added suppressions better? Maybe some new sub-tool listing
> the newly added suppressions would help? Ignore the suppressions added
> XX hours ago?
>

Yes, that is similar.
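
To make the "ignore the suppressions added XX hours ago" idea a bit
more concrete, here is a rough Python sketch of one reading of it:
hide anything added recently and list only the suppressions that have
been sitting in the tree longer than a threshold. Everything in it is
hypothetical: the function name, the (test, added-at) pairs, and the
example paths are invented for illustration, and it does not parse the
real test_expectations format or reflect how garden-o-matic works.

```python
from datetime import datetime, timedelta


def stale_suppressions(suppressions, now, max_age_hours):
    """Return the (test, added_at) pairs older than max_age_hours, oldest first.

    `suppressions` is a hypothetical list of (test_name, added_at) pairs;
    how the added-at time would be recovered (e.g. from version control
    history) is left out of this sketch.
    """
    cutoff = now - timedelta(hours=max_age_hours)
    stale = [(test, added) for test, added in suppressions if added < cutoff]
    return sorted(stale, key=lambda item: item[1])


# Made-up example data, not real tests or real timestamps.
now = datetime(2012, 4, 9, 16, 0)
suppressions = [
    ("fast/example/a.html", datetime(2012, 4, 9, 10, 0)),
    ("fast/example/b.html", datetime(2012, 4, 6, 9, 0)),
]

for test, added in stale_suppressions(suppressions, now, max_age_hours=24):
    print(test, added.isoformat())
```

The threshold is a parameter because nobody in this thread has settled
on a value for "XX".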

> Your issue seems to be suppressions lingering in the tree, not
> suppressing in itself. Our current tool (garden-o-matic)
> handles failures a lot better than it handles suppressions.
>

Yes.

>> If there's consensus in the meantime that it is better on balance to
>> check in suppressions, perhaps we can figure out a better way to do
>> that. Maybe (shudder) a second test_expectations file? Or maybe it
>> would be better to actually check in suppressions marked as REBASELINE
>> (or something like that)?
>
> That sounds quirky as it involves maintaining 2 sets of files.
>
> From my perspective, saying that we should discard the EWS result and
> allow changes to get in WebKit trunk, knowing they will turn the bots
> red, is a bad proposal regardless of how you justify it. In the small
> delta where the bots are red, you can bet people will miss something
> else that breaks.
>

As Ryosuke points out, practically we're already in that situation -
from what I can tell, the tree is red virtually all of the time, at
least during US/Pacific working hours. It's not clear to me if the EWS
has made this better or worse, but perhaps others have noticed a
difference. That said, I doubt I like red trees any more than you do
:)

I had not considered the EWS implications when I wrote the initial
note, so I haven't decided if (or how) to revise my opinions yet.

-- Dirk