[webkit-dev] insanity of updating 4000+ baseline images due to font rendering change?

Ryosuke Niwa rniwa at webkit.org
Thu Oct 20 01:04:33 PDT 2011

On Wed, Oct 19, 2011 at 2:04 PM, Elliot Poger <epoger at google.com> wrote:
> Here are the various approaches I can think of... what's the
> Hive-Mind-Approved approach?
>    - Commit 4500 new baseline images for SnowLeopard
>       - pro: known to work, will catch any regressions that come later
>       - con: takes a long time to commit, chews up disk space and
>       bandwidth for all developers, future minor changes may require yet another
>       set of new baselines
>    - Leave all SnowLeopard tests marked as "PASS FAIL" (or maybe mark them
>    "SKIP") in test_expectations
>       - pro: known to work, quick and easy, doesn't clog repo space and
>       developer update bandwidth, future minor changes won't break any bots
>       - con: will not catch any regressions that come later on SnowLeopard
>    - Remove descriptive text from all these tests, so that text rendering
>    is only evaluated in tests specifically for that purpose
>       - pro: prevents this problem for future OS versions, should allow
>       for lots more baseline images to be shared across platforms
>       - con: a lot of work to replace all existing baseline images, must
>       coordinate across community of Chromium/WebKit developers, tests will be
>       more difficult to interpret without text
>    - Figure out how our test pages can be rendered with a completely
>    cross-platform pixel-equivalent font
>       - pro: similar to above but tests keep their descriptive text
>       - con: similar to above but more technically challenging
>    - Augment our pixel-diff tools to allow for comparison masks (only pay
>    attention to pixel diffs within this rectangle)
>       - pro: existing baseline images can stay in place, and perhaps be
>       shared with new OS versions and platforms
>       - con: requires modification of pixel-diff tools, need to add
>       comparison mask to each test definition
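
To make the comparison-mask option concrete, here is a rough sketch of what "only pay attention to pixel diffs within this rectangle" could look like. This is not how our existing pixel-diff tools work; the function name, the rectangle layout (left, top, right, bottom) and the image representation (rows of RGB tuples) are all assumptions made up for illustration.

```python
def differs_within_mask(baseline, actual, mask_rect):
    """Return True if any pixel inside mask_rect differs between the images.

    Hypothetical helper: images are lists of rows of (r, g, b) tuples, and
    mask_rect is (left, top, right, bottom) with right/bottom exclusive.
    Pixels outside the rectangle are ignored entirely.
    """
    left, top, right, bottom = mask_rect
    for y in range(top, bottom):
        for x in range(left, right):
            if baseline[y][x] != actual[y][x]:
                return True
    return False
```

With a mask like this, descriptive text rendered outside the rectangle could change without failing the test, at the cost of maintaining one rectangle per test.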
I'd add another option: increase the tolerance level so that we ignore
all these tiny gradient/font rendering differences. I don't think the
added maintenance cost is worth the benefit of being able to catch all

But I'd argue that we should keep baselines for Snow Leopard with
tolerance=0 and increase the tolerance level of Leopard since Snow Leopard
is a newer platform and will probably be supported for a longer period of
time than Leopard.
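
As a rough illustration of per-platform tolerance, something along these lines might work. This is only a sketch: the function names, the platform keys, and the 0.5 value for Leopard are placeholders I made up, not settings from our tools, and "tolerance" is assumed here to mean the percentage of pixels allowed to differ.

```python
# Hypothetical per-platform tolerance table, in percent of differing pixels.
# Snow Leopard stays strict (0); the Leopard number is only a placeholder.
TOLERANCE_PERCENT = {"snowleopard": 0.0, "leopard": 0.5}


def diff_percent(baseline, actual):
    """Return the percentage of pixels that differ between two images.

    Images are assumed to be lists of rows of (r, g, b) tuples with
    identical dimensions.
    """
    if len(baseline) != len(actual) or any(
        len(b) != len(a) for b, a in zip(baseline, actual)
    ):
        raise ValueError("image dimensions differ")
    total = sum(len(row) for row in baseline)
    if total == 0:
        return 0.0
    differing = sum(
        1
        for row_b, row_a in zip(baseline, actual)
        for pb, pa in zip(row_b, row_a)
        if pb != pa
    )
    return 100.0 * differing / total


def passes(platform, baseline, actual):
    """Pass when the differing-pixel percentage is within the platform's tolerance."""
    return diff_percent(baseline, actual) <= TOLERANCE_PERCENT[platform]
```

Under these assumptions, a result with one stray pixel out of a thousand would still fail on Snow Leopard but pass on Leopard.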

- Ryosuke