[webkit-dev] Running pixel tests on build.webkit.org

Thu Jan 7 17:08:42 PST 2010

I'm totally in favor of adding test_expectations.txt-like
functionality to WebKit (and we'll get it for free when Dirk finishes
upstreaming run_webkit_tests.py).

But the troubles with the pixel tests in the past were more to do with
text metrics changing between OS releases, and individual font
differences between machines.  I suspect that those issues are very
solvable.

I think we mostly need someone willing to set up the pixel test bots.

-eric

On Thu, Jan 7, 2010 at 5:01 PM, Ojan Vafai <ojan at chromium.org> wrote:
> On Thu, Jan 7, 2010 at 10:22 AM, Darin Adler <darin at apple.com> wrote:
>>
>> On Jan 7, 2010, at 10:19 AM, Dimitri Glazkov wrote:
>> > Are we planning to run pixel tests on the build bots?
>>
>> If we can get them green, we should. It’s a lot of work. We need a
>> volunteer to do that work. We’ve tried before.
>
> Two possible long-term solutions come to mind:
> 1. Turn the bots orange on pixel failures. They still need fixing, but are
> not as severe as text diff failures. I'm not a huge fan of this, but it's an
> option.
> 2. Add in a concept of expected failures and only turn the bots red for
> *unexpected* failures. More details on this below.
> In chromium-land, there's an expectations file that lists expected failures
> and allows for distinguishing different types of failures (e.g. IMAGE vs.
> TEXT). It's like Skipped lists, but doesn't necessarily skip the test.
> Fixing the expected failures still needs doing of course, but can be done
> asynchronously. The primary advantage of this approach is that we can turn
> on pixel tests, keep the bots green and avoid further regressions.
> Would something like that make sense for WebKit as a whole? To be clear, we
> would be nearly as loath to add tests to this file as we are to add
> them to the Skipped lists. This just provides a way forward.
> While it's true that the bots used to be red more frequently with pixel
> tests turned on, for the most part, there weren't significant pixel
> regressions. Now, if you run the pixel tests on a clean build, there are a
> number of failures and a very large number of hash-mismatches that are
> within the failure tolerance level.
> -Ojan
> For reference, the format of the expectations file is something like this:
> // Fails the image diff but not the text diff.
> fast/forms/foo.html = IMAGE
> // Fails just the text diff.
> fast/forms/bar.html = TEXT
> // Fails both the image and text diffs.
> fast/forms/baz.html = IMAGE+TEXT
> // Skips this test (e.g. because it hangs run-webkit-tests or causes other
> // tests to fail).
> SKIP : fast/forms/foo1.html = IMAGE
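
To make the format above concrete, here is a rough sketch of how such a
file could be read and used to tell expected failures from unexpected
ones. This is not the actual run_webkit_tests.py code. The names
parse_expectations, classify and SAMPLE are hypothetical, and the sketch
assumes that "//" starts a comment, that "+" joins failure types, and
that an optional "SKIP :" prefix marks a test to skip, as in the example
above.

```python
# Hypothetical sketch of an expectations-file reader; NOT the real
# run_webkit_tests.py parser. Format details are assumed from the example.
FAILURE_TYPES = {"IMAGE", "TEXT"}

SAMPLE = """\
// Fails the image diff but not the text diff.
fast/forms/foo.html = IMAGE
// Fails both the image and text diffs.
fast/forms/baz.html = IMAGE+TEXT
// Skips this test.
SKIP : fast/forms/foo1.html = IMAGE
"""


def parse_expectations(text):
    """Return {test_path: (skip, frozenset of expected failure types)}."""
    expectations = {}
    for raw in text.splitlines():
        line = raw.split("//", 1)[0].strip()  # drop comments
        if not line:
            continue
        left, sep, right = line.partition("=")
        if not sep:
            raise ValueError("missing '=' in line: %r" % raw)
        # Optional modifiers (assumed: only SKIP) come before a colon.
        modifiers, _, path = left.rpartition(":")
        failures = frozenset(part.strip() for part in right.split("+"))
        unknown = failures - FAILURE_TYPES
        if unknown:
            raise ValueError("unknown failure type(s): %s" % sorted(unknown))
        expectations[path.strip()] = ("SKIP" in modifiers.split(), failures)
    return expectations


def classify(expectations, path, actual_failures):
    """Return 'skipped', 'expected' or 'unexpected' for one test result."""
    skip, expected = expectations.get(path, (False, frozenset()))
    if skip:
        return "skipped"
    # Assumption: any mismatch, including an unexpected pass, is flagged.
    return "expected" if frozenset(actual_failures) == expected else "unexpected"
```

With something like this, a bot would only turn red when classify()
returns "unexpected" for some test, so a test listed as IMAGE that fails
only its image diff would leave the bot green.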