[webkit-dev] Running pixel tests on build.webkit.org

Jeremy Orlow jorlow at chromium.org
Fri Jan 8 09:52:15 PST 2010

Plan 3 seems like the best (and simplest) one until the infrastructure for
the others (and/or a champion for fixing currently failing tests) is in place.

What would it take to go with plan 3?  I guess someone needs to rebaseline
everything that's currently failing, check them in, and then someone (like
bdash?) needs to flip a switch on the bots...?  Did I miss anything?

Are there instructions on how to do the rebaselining anywhere?  I've only
ever created pixel baselines for Chromium before (where we have a pretty
neat tool that pretty much does it for you).


On Fri, Jan 8, 2010 at 9:23 AM, Pam Greene <pam at chromium.org> wrote:

> And one very quick, short-term solution:
> 3. Generate new pixel results to match the current behavior, and check them
> in as hypothetically correct.
> And of course if someone notices an existing problem and fixes it, they
> check in corrected images then. It doesn't help find current problems, but
> those are being missed now anyway. It does let the tests be run again
> approximately immediately, even faster than waiting for test expectations
> functionality, so we can catch regressions moving forward.
> - Pam
> On Thu, Jan 7, 2010 at 5:01 PM, Ojan Vafai <ojan at chromium.org> wrote:
>> On Thu, Jan 7, 2010 at 10:22 AM, Darin Adler <darin at apple.com> wrote:
>>> On Jan 7, 2010, at 10:19 AM, Dimitri Glazkov wrote:
>>> > Are we planning to run pixel tests on the build bots?
>>> If we can get them green, we should. It’s a lot of work. We need a
>>> volunteer to do that work. We’ve tried before.
>> Two possible long-term solutions come to mind:
>> 1. Turn the bots orange on pixel failures. They still need fixing, but are
>> not as severe as text diff failures. I'm not a huge fan of this, but it's an
>> option.
>> 2. Add in a concept of expected failures and only turn the bots red for
>> *unexpected* failures. More details on this below.
>> In chromium-land, there's an expectations file that lists expected
>> failures and allows for distinguishing different types of failures (e.g.
>> IMAGE vs. TEXT). It's like Skipped lists, but doesn't necessarily skip the
>> test. Fixing the expected failures still needs doing of course, but can be
>> done asynchronously. The primary advantage of this approach is that we can
>> turn on pixel tests, keep the bots green and avoid further regressions.
>> Would something like that make sense for WebKit as a whole? To be clear,
>> we would be nearly as loath to add tests to this file as we are about
>> adding them to the Skipped lists. This just provides a way forward.
>> While it's true that the bots used to be red more frequently with pixel
>> tests turned on, for the most part, there weren't significant pixel
>> regressions. Now, if you run the pixel tests on a clean build, there are a
>> number of failures and a very large number of hash-mismatches that are
>> within the failure tolerance level.
>> -Ojan
>> For reference, the format of the expectations file is something like this:
>> // Fails the image diff but not the text diff.
>> fast/forms/foo.html = IMAGE
>> // Fails just the text diff.
>> fast/forms/bar.html = TEXT
>> // Fails both the image and text diffs.
>> fast/forms/baz.html = IMAGE+TEXT
>> // Skips this test (e.g. because it hangs run-webkit-tests or causes other
>> // tests to fail).
>> SKIP : fast/forms/foo1.html = IMAGE
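
As a rough illustration of the expectations format quoted above, here is a minimal Python sketch of how a test harness might read such a file and decide whether a failure is unexpected. It is not the Chromium implementation. The function names (parse_expectations, is_unexpected), the handling of "//" comments, and the rule that a failure is unexpected only when its type is not listed for that test are all assumptions made for this example.

```python
# Hypothetical sketch: parse an expectations file in the format quoted above.
# Names and matching rules here are assumptions, not Chromium's actual code.

FAILURE_TYPES = {"IMAGE", "TEXT"}

SAMPLE = """\
// Fails the image diff but not the text diff.
fast/forms/foo.html = IMAGE
// Fails just the text diff.
fast/forms/bar.html = TEXT
// Fails both the image and text diffs.
fast/forms/baz.html = IMAGE+TEXT
// Skips this test.
SKIP : fast/forms/foo1.html = IMAGE
"""


def parse_expectations(text):
    """Return {test_path: (frozenset_of_expected_failures, skip_flag)}."""
    expectations = {}
    for raw in text.splitlines():
        # Drop "//" comments and surrounding whitespace.
        line = raw.split("//", 1)[0].strip()
        if not line:
            continue
        if "=" not in line:
            raise ValueError("malformed expectation line: %r" % raw)
        lhs, rhs = line.split("=", 1)
        skip = False
        if ":" in lhs:
            # Optional modifiers (only SKIP is handled here) precede the path.
            modifiers, lhs = lhs.split(":", 1)
            skip = "SKIP" in modifiers.split()
        failures = frozenset(part.strip() for part in rhs.split("+"))
        unknown = failures - FAILURE_TYPES
        if unknown:
            raise ValueError("unknown failure type(s): %s" % sorted(unknown))
        expectations[lhs.strip()] = (failures, skip)
    return expectations


def is_unexpected(expectations, test, actual_failures):
    """True if the test failed in a way that is not listed as expected."""
    expected, _skip = expectations.get(test, (frozenset(), False))
    return not set(actual_failures) <= expected


expectations = parse_expectations(SAMPLE)
```

Under this sketch, a bot would go red only when is_unexpected() returns True for some test, so fast/forms/foo.html failing its image diff stays green while the same test failing its text diff does not.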
>> _______________________________________________
>> webkit-dev mailing list
>> webkit-dev at lists.webkit.org
>> http://lists.webkit.org/mailman/listinfo.cgi/webkit-dev