[webkit-dev] Running pixel tests on build.webkit.org

Mon Jan 11 09:17:59 PST 2010

Wow, much easier than I expected.  :-)

OK, then what about buy-in on this approach?

I'll even file bugs on everything I rebaseline so we can track getting
things back to a correct state and/or verifying that the new baselines are
correct.

J

On Mon, Jan 11, 2010 at 9:13 AM, Dimitri Glazkov <dglazkov at chromium.org> wrote:

> It's basically just run-webkit-tests --reset-results --pixel-tests. No
> magic :)
>
> See run-webkit-tests --help for more info.
>
> BTW, Victor is working to port the rebaselining tool to
> build.webkit.org. You may want to check with him -- maybe he's close
> to finishing the patch.
>
> :DG<
>
> On Mon, Jan 11, 2010 at 9:06 AM, Jeremy Orlow <jorlow at chromium.org> wrote:
> > On Fri, Jan 8, 2010 at 9:52 AM, Jeremy Orlow <jorlow at chromium.org> wrote:
> >>
> >> Plan 3 seems like the best (and simplest) one until the infrastructure for
> >> the others (and/or a champion for fixing currently failing tests) is
> >> available.
> >> What would it take to go with plan 3?  I guess someone needs to rebaseline
> >> everything that's currently failing, check them in, and then someone (like
> >> bdash?) needs to flip a switch on the bots...?  Did I miss anything?
> >> Are there instructions on how to do the rebaselining anywhere?  I've only
> >> ever created pixel baselines for Chromium before (where we have a pretty
> >> neat tool that pretty much does it for you).
> >
> > Does anyone know?
> > I'm happy to do the rebaselining if someone can tell me how and we agree to
> > turn pixel tests on on the bots.
> >
> >>
> >> On Fri, Jan 8, 2010 at 9:23 AM, Pam Greene <pam at chromium.org> wrote:
> >>>
> >>> And one very quick, short-term solution:
> >>> 3. Generate new pixel results to match the current behavior, and check
> >>> them in as hypothetically correct.
> >>> And of course if someone notices an existing problem and fixes it, they
> >>> check in corrected images then. It doesn't help find current problems, but
> >>> those are being missed now anyway. It does let the tests be run again
> >>> approximately immediately, even faster than waiting for test expectations
> >>> functionality, so we can catch regressions moving forward.
> >>> - Pam
> >>>
> >>> On Thu, Jan 7, 2010 at 5:01 PM, Ojan Vafai <ojan at chromium.org> wrote:
> >>>>
> >>>> On Thu, Jan 7, 2010 at 10:22 AM, Darin Adler <darin at apple.com> wrote:
> >>>>>
> >>>>> On Jan 7, 2010, at 10:19 AM, Dimitri Glazkov wrote:
> >>>>> > Are we planning to run pixel tests on the build bots?
> >>>>>
> >>>>> If we can get them green, we should. It’s a lot of work. We need a
> >>>>> volunteer to do that work. We’ve tried before.
> >>>>
> >>>> Two possible long-term solutions come to mind:
> >>>> 1. Turn the bots orange on pixel failures. They still need fixing, but
> >>>> are not as severe as text diff failures. I'm not a huge fan of this, but
> >>>> it's an option.
> >>>> 2. Add in a concept of expected failures and only turn the bots red for
> >>>> *unexpected* failures. More details on this below.
> >>>> In chromium-land, there's an expectations file that lists expected
> >>>> failures and allows for distinguishing different types of failures (e.g.
> >>>> IMAGE vs. TEXT). It's like Skipped lists, but doesn't necessarily skip the
> >>>> test. Fixing the expected failures still needs doing of course, but can be
> >>>> done asynchronously. The primary advantage of this approach is that we can
> >>>> turn on pixel tests, keep the bots green and avoid further regressions.
> >>>> Would something like that make sense for WebKit as a whole? To be clear,
> >>>> we would be nearly as loath to add tests to this file as we are to add
> >>>> them to the Skipped lists. This just provides a way forward.
> >>>> While it's true that the bots used to be red more frequently with pixel
> >>>> tests turned on, for the most part, there weren't significant pixel
> >>>> regressions. Now, if you run the pixel tests on a clean build, there are a
> >>>> number of failures and a very large number of hash-mismatches that are
> >>>> within the failure tolerance level.
> >>>> -Ojan
> >>>> For reference, the format of the expectations file is something like
> >>>> this:
> >>>> // Fails the image diff but not the text diff.
> >>>> fast/forms/foo.html = IMAGE
> >>>> // Fails just the text diff.
> >>>> fast/forms/bar.html = TEXT
> >>>> // Fails both the image and text diffs.
> >>>> fast/forms/baz.html = IMAGE+TEXT
> >>>> // Skips this test (e.g. because it hangs run-webkit-tests or causes
> >>>> // other tests to fail).
> >>>> SKIP : fast/forms/foo1.html = IMAGE
> >>>> _______________________________________________
> >>>> webkit-dev mailing list
> >>>> webkit-dev at lists.webkit.org
> >>>> http://lists.webkit.org/mailman/listinfo.cgi/webkit-dev
>
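
To make the expectations-file idea quoted above (Ojan's option 2) a bit more concrete, here is a rough Python sketch of how such a file might be parsed and used to decide whether a bot should go red. It is only an illustration built from the sample format in the thread: the names Expectation, parse_expectations and is_unexpected are made up for this example and are not taken from Chromium's or WebKit's tools, and the rule that only failure types outside the listed set count as unexpected is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Expectation:
    """One line of the expectations file (hypothetical structure)."""
    test: str
    failures: frozenset  # subset of {"IMAGE", "TEXT"}
    skip: bool = False


def parse_expectations(text):
    """Parse lines such as 'SKIP : path = IMAGE+TEXT' into a dict keyed by test path."""
    expectations = {}
    for raw in text.splitlines():
        line = raw.split("//", 1)[0].strip()  # drop '//' comments
        if not line:
            continue
        left, eq, right = line.partition("=")
        if not eq:
            raise ValueError("missing '=' in expectation line: %r" % raw)
        # Optional modifiers (only SKIP appears in the thread) come before ':'.
        modifiers, _, path = left.rpartition(":")
        failures = frozenset(part.strip() for part in right.split("+"))
        expectations[path.strip()] = Expectation(
            test=path.strip(),
            failures=failures,
            skip="SKIP" in modifiers.split(),
        )
    return expectations


def is_unexpected(expectations, test, actual_failures):
    """Return True if the test failed in a way the file does not list.

    Assumption: a test with no entry is expected to pass, and a listed
    failure that does not occur is not treated as a problem.
    """
    entry = expectations.get(test)
    allowed = entry.failures if entry else frozenset()
    return not set(actual_failures) <= allowed


SAMPLE = """\
// Fails the image diff but not the text diff.
fast/forms/foo.html = IMAGE
// Fails just the text diff.
fast/forms/bar.html = TEXT
// Fails both the image and text diffs.
fast/forms/baz.html = IMAGE+TEXT
// Skips this test.
SKIP : fast/forms/foo1.html = IMAGE
"""

if __name__ == "__main__":
    exp = parse_expectations(SAMPLE)
    # An IMAGE failure on foo.html is listed, so the bot would stay green.
    print(is_unexpected(exp, "fast/forms/foo.html", {"IMAGE"}))
    # A TEXT failure on foo.html is not listed, so the bot would go red.
    print(is_unexpected(exp, "fast/forms/foo.html", {"TEXT"}))
```

With something along these lines, a bot would go red only when is_unexpected returns True for some test, which is the "keep the bots green and avoid further regressions" behavior described in the thread.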