[webkit-gtk] Rationale for disabling pixel tests on bots

Thu Aug 3 09:38:48 PDT 2017

Thanks, this is very clear!

2017-08-03 11:53 GMT+02:00 Carlos Alberto Lopez Perez <clopez at igalia.com>:

> On 03/08/17 11:11, Romain Bellessort wrote:
> > Hi,
> >
> > For several years (apparently since 2012), pixel tests have been disabled
> > when running tests on bots (see e.g. "Pixel tests disabled" in [1]). There
> > is an option to run them locally (-p), but I was wondering what the
> > rationale for disabling them was.
> >
> > Based on what I found, the reason seems to be that running pixel tests on
> > bots has a high processing cost. In addition, in most cases this cost is
> > unnecessary, since the features in question can be tested through reftests
> > (so disabling pixel tests on bots is not a big issue, as they can generally
> > be avoided).
> >
> > Would you say this is correct, or are there other reasons?
> >
>
> Pixel tests are run on the bots, but only for the tests that fail the
> text diff. The bot does a first run without pixel tests (checking only
> text diffs). Then it does a second run over only the tests that failed
> the first time (this time enabling pixel tests).
>
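The two-run behaviour described above can be sketched roughly as follows. This is only an illustration of the idea, not the actual bot code: `two_pass_run`, `fake_run_test` and the test names are all hypothetical, and the runner is assumed to be a callable that takes a test name and a "pixel tests enabled" flag and returns whether the test passed.

```python
from typing import Callable, Iterable, List, Tuple

# Hypothetical runner signature: (test name, pixel tests enabled) -> passed?
RunTest = Callable[[str, bool], bool]


def two_pass_run(tests: Iterable[str], run_test: RunTest) -> List[str]:
    """Return the tests that still fail after the pixel-enabled retry."""
    # First run: text diffs only, pixel tests disabled.
    first_failures = [t for t in tests if not run_test(t, False)]
    # Second run: only the first-run failures, with pixel tests enabled.
    return [t for t in first_failures if not run_test(t, True)]


# Stand-in runner for illustration; it records each call it receives.
calls: List[Tuple[str, bool]] = []


def fake_run_test(name: str, pixel_tests: bool) -> bool:
    calls.append((name, pixel_tests))
    return name != "b.html"  # assume b.html fails in both runs


remaining = two_pass_run(["a.html", "b.html", "c.html"], fake_run_test)
```

With this stand-in runner, only `b.html` is run a second time, so the pixel comparison cost is paid only for tests that already failed the text diff.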
> As for why we don't always run pixel tests:
>
> I'm unsure whether the processing cost is a concern. It would be useful to
> know how much time it takes to run the whole test suite with and
> without pixel tests enabled. If enabling them adds less than 25% to the
> run time, I don't think this should be a concern.
>
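To make the 25% criterion concrete, here is a small sketch of the comparison. The function names and the timings are made up for illustration; no such measurements are given in this thread.

```python
def relative_overhead(seconds_without: float, seconds_with: float) -> float:
    """Extra run time from enabling pixel tests, as a fraction of the baseline."""
    if seconds_without <= 0:
        raise ValueError("baseline run time must be positive")
    return (seconds_with - seconds_without) / seconds_without


def is_acceptable(seconds_without: float, seconds_with: float,
                  threshold: float = 0.25) -> bool:
    """True if pixel tests add less than `threshold` (25% by default)."""
    return relative_overhead(seconds_without, seconds_with) < threshold


# Hypothetical timings: 3600 s without pixel tests, 4320 s with them.
overhead = relative_overhead(3600.0, 4320.0)
acceptable = is_acceptable(3600.0, 4320.0)
```

With these invented numbers the overhead is 20%, which would fall under the threshold; a run taking 4500 s or more would not.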
> My understanding is that currently there are 3 main reasons for not
> doing this:
>
>  1) Other ports (Mac) are also not running pixel tests, and we currently
> don't see a need to do anything different here. If we end up enabling pixel
> tests globally, I think this should be done for all ports (ideally).
>
>  2) Increased burden to keep the bots green: we already have a hard time
> keeping our bots green without running pixel tests by default. If we
> enable this, the burden will be much higher than it is now.
>
>  3) Difficulty getting consistent results across distributions: we have
> developers using all kinds of GNU/Linux distributions, and the test
> results often depend on the specific version of some libraries. For
> example, different versions of Cairo or GTK+ can cause 1-pixel
> differences (or some box to render with a slightly different color) in
> the output, which may make a test fail when it actually should have
> passed. We try to avoid this as much as possible by building, in our
> internal JHBuild, a set of libraries that we have identified as able to
> cause this kind of issue. But there are still different failures
> depending on whether you use Fedora or Debian (for example). So we have
> not yet mastered the art of bundling all the libraries that can cause
> different test results.
>
> My 2 cents.