The thing I find most difficult about not having pixel bots is that, if I make a change that changes pixel results, I need to actually build that change on every platform to get the new pixel results.
Could we put up pixel bots on a separate waterfall? It's a waterfall we don't expect to keep green all the time. This has a few advantages over the current state of the world:
1. When making cross-platform changes, it's easy to grab pixel results off the bots.
2. When making changes that affect pixel tests, it's easier to see which pixel failures are regressions caused by my patch.
I think these two would greatly help in stemming the tide of pixel test regressions. Does that seem possible/reasonable?
Ojan
On Mon, Jan 11, 2010 at 9:17 AM, Jeremy Orlow <jorlow@chromium.org> wrote:
Wow, much easier than I expected. :-)
OK, then what about buy in on this approach?
I'll even file bugs on everything I rebaseline so we can track getting things back to a correct state and/or verifying that the new baselines are correct.
J
On Mon, Jan 11, 2010 at 9:13 AM, Dimitri Glazkov <dglazkov@chromium.org> wrote:
It's basically just run-webkit-tests --reset-results --pixel-tests. No magic :)
See run-webkit-tests --help for more info.
BTW, Victor is working to port the rebaselining tool to build.webkit.org. You may want to check with him -- maybe he's close to finishing the patch.
:DG<
On Mon, Jan 11, 2010 at 9:06 AM, Jeremy Orlow <jorlow@chromium.org> wrote:
On Fri, Jan 8, 2010 at 9:52 AM, Jeremy Orlow <jorlow@chromium.org> wrote:
Plan 3 seems like the best (and simplest) one until the infrastructure for the others (and/or a champion for fixing currently failing tests) is available.
What would it take to go with plan 3? I guess someone needs to rebaseline everything that's currently failing, check them in, and then someone (like bdash?) needs to flip a switch on the bots...? Did I miss anything?
Are there instructions on how to do the rebaselining anywhere? I've only ever created pixel baselines for Chromium before (where we have a pretty neat tool that pretty much does it for you).
Does anyone know? I'm happy to do the rebaselining if someone can tell me how and we agree to turn pixel tests on on the bots.
On Fri, Jan 8, 2010 at 9:23 AM, Pam Greene <pam@chromium.org> wrote:
And one very quick, short-term solution:
3. Generate new pixel results to match the current behavior, and check them in as hypothetically correct. And of course if someone notices an existing problem and fixes it, they check in corrected images then.
It doesn't help find current problems, but those are being missed now anyway. It does let the tests be run again approximately immediately, even faster than waiting for test expectations functionality, so we can catch regressions moving forward.
- Pam
On Thu, Jan 7, 2010 at 5:01 PM, Ojan Vafai <ojan@chromium.org> wrote:
On Thu, Jan 7, 2010 at 10:22 AM, Darin Adler <darin@apple.com> wrote:
> On Jan 7, 2010, at 10:19 AM, Dimitri Glazkov wrote:
>
> > Are we planning to run pixel tests on the build bots?
>
> If we can get them green, we should. It’s a lot of work. We need a volunteer to do that work. We’ve tried before.
Two possible long-term solutions come to mind:
1. Turn the bots orange on pixel failures. They still need fixing, but are not as severe as text diff failures. I'm not a huge fan of this, but it's an option.
2. Add in a concept of expected failures and only turn the bots red for *unexpected* failures. More details on this below.
In chromium-land, there's an expectations file that lists expected failures and allows for distinguishing different types of failures (e.g. IMAGE vs. TEXT). It's like Skipped lists, but doesn't necessarily skip the test. Fixing the expected failures still needs doing of course, but can be done asynchronously. The primary advantage of this approach is that we can turn on pixel tests, keep the bots green and avoid further regressions.
Would something like that make sense for WebKit as a whole? To be clear, we would be nearly as loath to add tests to this file as we are to add them to the Skipped lists. This just provides a way forward.
While it's true that the bots used to be red more frequently with pixel tests turned on, for the most part, there weren't significant pixel regressions. Now, if you run the pixel tests on a clean build, there are a number of failures and a very large number of hash-mismatches that are within the failure tolerance level.
-Ojan
For reference, the format of the expectations file is something like this:
// Fails the image diff but not the text diff.
fast/forms/foo.html = IMAGE
// Fails just the text diff.
fast/forms/bar.html = TEXT
// Fails both the image and text diffs.
fast/forms/baz.html = IMAGE+TEXT
// Skips this test (e.g. because it hangs run-webkit-tests or causes other tests to fail).
SKIP : fast/forms/foo1.html = IMAGE
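To make option 2 concrete, here is a rough Python sketch of how a runner might read a file in that format and decide which failures are unexpected. It is only an illustration, not how Chromium's tooling actually works: the names Expectation, parse_expectations and unexpected_failures are made up for this example, and the grammar is inferred solely from the four sample lines above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Expectation:
    """One line of the expectations file (hypothetical type for this sketch)."""
    test: str
    failures: frozenset  # subset of {"IMAGE", "TEXT"}
    skip: bool = False


def parse_expectations(text):
    """Parse lines such as 'SKIP : fast/forms/foo1.html = IMAGE'.

    Assumed grammar, inferred from the sample only:
    [MODIFIERS :] test-path = KIND[+KIND]
    """
    expectations = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("//"):
            continue  # blank line or comment
        skip = False
        if ":" in line:
            modifiers, line = line.split(":", 1)
            skip = "SKIP" in modifiers.split()
        test, sep, kinds = line.partition("=")
        if not sep:
            raise ValueError("missing '=' in expectation line: %r" % raw)
        test = test.strip()
        failures = frozenset(k.strip() for k in kinds.split("+"))
        expectations[test] = Expectation(test, failures, skip)
    return expectations


def unexpected_failures(expectations, observed):
    """Return {test: failure kinds} for failures not listed as expected.

    'observed' maps a test path to the set of failure kinds seen in a run.
    Under option 2, a bot would turn red only if this result is non-empty.
    """
    unexpected = {}
    for test, kinds in observed.items():
        expected = expectations.get(test)
        allowed = expected.failures if expected else frozenset()
        extra = set(kinds) - allowed
        if extra:
            unexpected[test] = extra
    return unexpected
```

With the sample file above, an IMAGE failure in fast/forms/foo.html would keep the bot green, while a TEXT failure in the same test, or any failure in an unlisted test, would be reported as unexpected.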
_______________________________________________ webkit-dev mailing list webkit-dev@lists.webkit.org http://lists.webkit.org/mailman/listinfo.cgi/webkit-dev