[webkit-dev] IMAGE+TEXT WAS: TestExpectations syntax changes, last call (for a while, at least) ...

Thu Jun 14 22:51:39 PDT 2012

On Thu, Jun 14, 2012 at 10:37 PM, Ojan Vafai <ojan at chromium.org> wrote:
>
>
> On Thu, Jun 14, 2012 at 9:20 PM, Maciej Stachowiak <mjs at apple.com> wrote:
>>
>>
>> On Jun 14, 2012, at 9:06 PM, Adam Barth <abarth at webkit.org> wrote:
>>
>> > On Thu, Jun 14, 2012 at 9:02 PM, Ojan Vafai <ojan at chromium.org> wrote:
>> >>
>> >> Seems like it will be a common error to mark a reftest failure as
>> >> ImageOnlyFail and then be confused why it's not working, no?
>> >
>> > Maybe that can be solved with another name, like PixelOnlyFailure.
>
>
> I'm OK with trying it this way. We can always have another
> <strike>bikeshed</strike> fruitful discussion if it turns out to be a
> frequent cause of confusion in practice.
>
> Not sure one name is any more clear than the other. PixelOnlyFailure seems
> fine to me since others have expressed a preference in the past for Pixel
> over Image.
>

I can certainly see the advantages of the suggested scheme, but I
would want to sleep on this one, and possibly do some more data mining
to see if I can figure out exactly what percentage of Chromium's tests
are pixel tests marked as TEXT-only failures, or reftests marked as either
just IMAGE or just TEXT. (I'm sure Maciej is right and the percentage
is low, but I'd like to know how low.) Further, once Chromium moves to
a world where failing baselines are checked in, the percentage is
probably somewhere between zero and inconsequential :).

I don't think PixelOnlyFailure is any better than ImageOnlyFailure.

>>
>> That sounds good. We could also make it an error to apply PixelOnlyFailure
>> (or what have you) to a text-only test, a reftest, or an audio test. Error
>> in the sense that it would be reported as a failure, with an informative
>> diagnostic saying it does not apply.
>
>
> We already have a mechanism for "errors" like this. They are reported when
> you run the tests or when you run with --lint-test-files. At least on the
> chromium bots, this runs as a separate step that turns red when you
> cause a lint failure. That way errors get noticed and addressed quickly (the
> lint step takes ~2 seconds to run).

We will also catch these errors on upload through the style checker
(which is essentially just running --lint-test-files).
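
To make the proposed check concrete, here is a minimal sketch of a lint
rule that rejects a pixel-only expectation on a test that has no pixel
result. Every name in it (TestKind, PIXEL_ONLY_EXPECTATIONS,
lint_expectation) is hypothetical and chosen for illustration; this is
not the actual --lint-test-files code, and the expectation keywords are
just the candidates floated in this thread.

```python
from enum import Enum


class TestKind(Enum):
    # Hypothetical classification of a test; values are used in diagnostics.
    PIXEL = "pixel test"
    TEXT_ONLY = "text-only test"
    REFTEST = "reftest"
    AUDIO = "audio test"


# Assumed keyword spellings, taken from the names discussed in this thread.
PIXEL_ONLY_EXPECTATIONS = {"PixelOnlyFailure", "ImageOnlyFailure"}


def lint_expectation(test_name, kind, expectation):
    """Return a list of diagnostics; an empty list means the line is fine."""
    if expectation in PIXEL_ONLY_EXPECTATIONS and kind is not TestKind.PIXEL:
        return ["%s: %s does not apply to a %s"
                % (test_name, expectation, kind.value)]
    return []


if __name__ == "__main__":
    # A reftest marked with a pixel-only expectation produces a diagnostic.
    for message in lint_expectation("fast/example.html", TestKind.REFTEST,
                                    "PixelOnlyFailure"):
        print(message)
```

Returning a list of messages rather than raising lets a lint step collect
every bad line in one pass and report them together.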

As a side note, I need to add the lint step to the b.w.o bots ...

-- Dirk