On Mon, Apr 9, 2012 at 3:50 PM, Julien Chaffraix <jchaffraix@webkit.org> wrote:
In my ideal world, you would be able to get updated baselines *prior* to trying to land a patch. This is of course not really possible today for any test that fails on multiple ports with different results, as it's practically impossible to run more than a couple of ports by hand, and we don't have a bot infrastructure to help you here.
That would indeed be the best outcome; however, it would more or less require all the EWS to be able to run the tests (including pixel tests for the platforms that enable them).
Yes. Probably even harder: we'd have to build tooling to extract the results from this bot and merge them into your patch.
3) the current tool-of-the-day for managing rebaselining, garden-o-matic, is much better suited to handling the "unexpected" failures on the bots rather than the "expected" failures you've introduced.
You could see it the other way. How could we make garden-o-matic handle newly added suppressions better? Maybe some new sub-tool listing the newly added suppressions would help? Ignore the suppressions added XX hours ago?
Yes, that is similar.
Your issue seems to be suppressions sticking around in the tree, not strictly suppression in itself. Our current tool (garden-o-matic) handles failures a lot better than it handles suppressions.
Yes.
If there's consensus in the meantime that it is better on balance to check in suppressions, perhaps we can figure out a better way to do that. Maybe (shudder) a second test_expectations file? Or maybe it would be better to actually check in suppressions marked as REBASELINE (or something like that)?
That sounds quirky, as it involves maintaining two sets of files.
From my perspective, saying that we should discard the EWS result and allow changes to get into WebKit trunk, knowing they will turn the bots red, is a bad proposal regardless of how you justify it. In the short window during which the bots are red, you can bet people will miss something else that breaks.
As Ryosuke points out, practically we're already in that situation - from what I can tell, the tree is red virtually all of the time, at least during US/Pacific working hours. It's not clear to me if the EWS has made this better or worse, but perhaps others have noticed a difference. That said, I doubt I like red trees any more than you do :) I had not considered the EWS implications when I wrote the initial note, so I haven't decided if (or how) to revise my opinions yet.
-- Dirk