[webkit-dev] Please don't leave entries for rebaseline in TestExpectation files

Thu Mar 21 11:39:17 PDT 2013

On Thu, Mar 21, 2013 at 11:16 AM, Ryosuke Niwa <rniwa at webkit.org> wrote:

> On Thu, Mar 21, 2013 at 10:54 AM, Žan Doberšek <zandobersek at gmail.com> wrote:
>
>> On Thu, Mar 21, 2013 at 5:18 PM, Robert Hogan <lists at roberthogan.net> wrote:
>>
>>> On Thursday, 21 March 2013, Ryosuke Niwa wrote:
>>>
>>>> On Thu, Mar 21, 2013 at 1:31 AM, Robert Hogan <lists at roberthogan.net> wrote:
>>>>
>>>>> On Thursday, 21 March 2013, Ryosuke Niwa wrote:
>>>>>
>>>>>>> I used to pull results from the bots where possible but creating
>>>>>>> inconsistency between png/text results is not good.
>>>>>>>
>>>>>>
>>>>>> It is unfortunate but it's much better than losing the complete test
>>>>>> coverage.
>>>>>>
>>>>>
>>>>> If that's the case then I'm happy to land whatever garden-o-matic
>>>>> pulls in or I can sweep from the bots, even if it means that png results
>>>>> for Mac, Qt, et al. go bad as a result.
>>>>>
>>>>> I guess we will always have ports whose bots do not run pixel tests so
>>>>> if those ports are happy to live with the downsides of doing that then
>>>>> there really is no obstacle to authors owning the job of updating the
>>>>> baselines for all ports when they land a change.
>>>>>
>>>>> IMHO ports that don't run pixel tests would be better off deleting any
>>>>> png results they have in the tree. Is there a reason Mac hasn't done that?
>>>>> Don't you get lots of failures when you run pixel tests locally?
>>>>>
>>>>
>>>> Yes, but I'd argue that it's better than losing the test coverage.
>>>>
>>>> By the way, we can easily address this problem by always generating
>>>> pixel results for unexpectedly failing tests. Namely, we can force --pixel
>>>> when we're retrying tests.
>>>>
>>>>
>>> Perhaps NRWT could produce txt and png results for all tests marked with
>>> REBASELINE or similar in TestExpectations. That would avoid the need to
>>> turn the bots red on each platform for at least one build cycle.
>>>
>>
>> I like this specific proposal. There's already a similar expectation
>> planned, 'NeedsRebaseline'.
>> https://bugs.webkit.org/show_bug.cgi?id=100415
>>
>
> How do we know that the new results are correct before running tests on
> each platform/port? There are cases where we regress tests on some ports
> while needing to rebaseline on others, but all of that is unknown until we
> actually run tests on the bots.
>

If we're adding a token of this sort, it should be named something like
NeedsTriaging.  Saying that a test just needs a rebaseline is a pretense of
knowledge.

- R. Niwa