[webkit-dev] pixel tests and --tolerance (was Re: Pixel test experiment)

Tue Oct 19 17:57:13 PDT 2010

FWIW, I needed NRWT to support --tolerance for something else today
(mainly because when using it with the Mac port, it defaults to 0.1
tolerance, with no way to override it), so I added NRWT support for
it: http://webkit.org/b/47959.

Mihai
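
For readers new to the thread, here is a minimal sketch of what a
tolerance threshold can mean for a pixel test. It assumes the tolerance
is the maximum percentage of pixels allowed to differ between the
expected and actual images; that definition, and the names
percent_different and passes, are illustrative assumptions, not the
actual ImageDiff or NRWT implementation.

```python
from typing import Sequence, Tuple

# Hypothetical pixel representation: RGBA, 0-255 per channel.
Pixel = Tuple[int, int, int, int]


def percent_different(expected: Sequence[Pixel], actual: Sequence[Pixel]) -> float:
    """Return the percentage of pixels that are not exactly equal.

    Assumption: both images are flattened to equal-length pixel sequences.
    """
    if len(expected) != len(actual):
        raise ValueError("images must have the same number of pixels")
    if not expected:
        return 0.0
    differing = sum(1 for e, a in zip(expected, actual) if e != a)
    return 100.0 * differing / len(expected)


def passes(expected: Sequence[Pixel], actual: Sequence[Pixel],
           tolerance: float = 0.0) -> bool:
    """A test passes when the difference does not exceed the tolerance.

    tolerance=0.0 means exact matching; 0.1 allows up to 0.1% of the
    pixels to differ (under the assumed definition above).
    """
    return percent_different(expected, actual) <= tolerance


# Usage: a 1000-pixel image with a single changed pixel.
white = (255, 255, 255, 255)
expected_image = [white] * 1000
actual_image = [white] * 999 + [(254, 255, 255, 255)]
exact_result = passes(expected_image, actual_image, tolerance=0.0)
fuzzy_result = passes(expected_image, actual_image, tolerance=0.1)
```

Under this assumed definition, the single changed pixel fails exact
matching but is accepted at tolerance 0.1, which is the kind of small
regression the thread below worries about missing.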

On Thu, Oct 14, 2010 at 2:44 PM, Dirk Pranke <dpranke at chromium.org> wrote:
> On Thu, Oct 14, 2010 at 9:06 AM, Ojan Vafai <ojan at chromium.org> wrote:
>> Dirk, implementing --tolerance in NRWT isn't that hard, is it? Getting rid of
>> --tolerance will be a lot of work of making sure all the pixel results that
>> currently pass also pass with --tolerance=0. While I would support someone
>> doing that work, I don't think we should block moving to NRWT on it.
>
> Assuming we implement it only for the ports that currently use
> tolerance on old-run-webkit-tests, no, I wouldn't expect it to be
> hard. Dunno how much work it would be to implement tolerance on the
> chromium image_diff implementations (side note: it would be nice if
> these binaries weren't port-specific, but that's another topic).
>
> As to how many files we'd have to rebaseline for the base ports, I
> don't know how many there are compared to how many fail pixel tests,
> period. I'll run a couple tests and find out.
>
> -- Dirk
>
>> Ojan
>> On Fri, Oct 8, 2010 at 1:03 PM, Simon Fraser <simon.fraser at apple.com> wrote:
>>>
>>> I think the best solution to this pixel matching problem is ref tests.
>>>
>>> How practical would it be to use ref tests for SVG?
>>>
>>> Simon
>>>
>>> On Oct 8, 2010, at 12:43 PM, Dirk Pranke wrote:
>>>
>>> > Jeremy is correct; the Chromium port has seen real regressions that
>>> > virtually no concept of a fuzzy match that I can imagine would've
>>> > caught.
>>> > new-run-webkit-tests doesn't currently support the tolerance concept
>>> > at all, and I am inclined to argue that it shouldn't.
>>> >
>>> > However, I frequently am wrong about things, so it's quite possible
>>> > that there are good arguments for supporting it that I'm not aware of.
>>> > I'm not particularly interested in working on a tool that doesn't do
>>> > what the group wants it to do, and I would like all of the other
>>> > WebKit ports to be running pixel tests by default (and
>>> > new-run-webkit-tests ;) ) since I think it catches bugs.
>>> >
>>> > As far as I know, the general sentiment on the list has been that we
>>> > should be running pixel tests by default, and the reason that we
>>> > aren't is largely due to the work involved in getting them back up to
>>> > date and keeping them up to date. I'm sure that fuzzy matching reduces
>>> > the work load, especially for the sort of mismatches caused by
>>> > differences in the text antialiasing.
>>> >
>>> > In addition, I have heard concerns that we'd like to keep fuzzy
>>> > matching because people might potentially get different results on
>>> > machines with different hardware configurations, but I don't know that
>>> > we have any confirmed cases of that (except for arguably the case of
>>> > different code paths for gpu-accelerated rendering vs. unaccelerated
>>> > rendering).
>>> >
>>> > If we made it easier to maintain the baselines (improved tooling like
>>> > Chromium's rebaselining tool, added reftest support, etc.), are there
>>> > still compelling reasons for supporting --tolerance-based testing as
>>> > opposed to exact matching?
>>> >
>>> > -- Dirk
>>> >
>>> > On Fri, Oct 8, 2010 at 11:14 AM, Jeremy Orlow <jorlow at chromium.org>
>>> > wrote:
>>> >> I'm not an expert on pixel tests, but my understanding is that in
>>> >> Chromium (where we've always run with tolerance 0) we've seen real
>>> >> regressions that would have slipped by with something like tolerance
>>> >> 0.1. When you have 0 tolerance, it is more maintenance work, but if we
>>> >> can avoid regressions, it seems worth it.
>>> >> J
>>> >>
>>> >> On Fri, Oct 8, 2010 at 10:58 AM, Nikolas Zimmermann
>>> >> <zimmermann at physik.rwth-aachen.de> wrote:
>>> >>>
>>> >>> On 08.10.2010 at 19:53, Maciej Stachowiak wrote:
>>> >>>
>>> >>>>
>>> >>>> On Oct 8, 2010, at 12:46 AM, Nikolas Zimmermann wrote:
>>> >>>>
>>> >>>>>
>>> >>>>> On 08.10.2010 at 00:44, Maciej Stachowiak wrote:
>>> >>>>>
>>> >>>>>>
>>> >>>>>> On Oct 7, 2010, at 6:34 AM, Nikolas Zimmermann wrote:
>>> >>>>>>
>>> >>>>>>> Good evening webkit folks,
>>> >>>>>>>
>>> >>>>>>> I've finished landing svg/ pixel test baselines, which pass with
>>> >>>>>>> --tolerance 0 on my 10.5 & 10.6 machines. As the pixel testing is
>>> >>>>>>> very important for the SVG tests, I'd like to run them on the bots,
>>> >>>>>>> experimentally, so we can catch regressions easily.
>>> >>>>>>>
>>> >>>>>>> Maybe someone with direct access to the leopard & snow leopard bots
>>> >>>>>>> could just run "run-webkit-tests --tolerance 0 -p svg" and mail me
>>> >>>>>>> the results? If it passes, we could maybe run the pixel tests for
>>> >>>>>>> the svg/ subdirectory on these bots?
>>> >>>>>>
>>> >>>>>> Running pixel tests would be great, but can we really expect the
>>> >>>>>> results to be stable cross-platform with tolerance 0? Perhaps we
>>> >>>>>> should start with a higher tolerance level.
>>> >>>>>
>>> >>>>> Sure, we could do that. But I'd really like to get a feeling for
>>> >>>>> what's problematic first. If we see 95% of the SVG tests pass with
>>> >>>>> --tolerance 0, and only a few need higher tolerances (64-bit vs.
>>> >>>>> 32-bit antialiasing differences, etc.), I could come up with a
>>> >>>>> per-file pixel test tolerance extension to DRT, if it's needed.
>>> >>>>>
>>> >>>>> How about starting with just one build slave (say, Mac Leopard) that
>>> >>>>> runs the pixel tests for SVG with --tolerance 0 for a while? I'd be
>>> >>>>> happy to identify the problems and see if we can make it work,
>>> >>>>> somehow :-)
>>> >>>>
>>> >>>> The problem I worry about is that on future Mac OS X releases,
>>> >>>> rendering of shapes may change in some tiny way that is not visible
>>> >>>> but enough to cause failures at tolerance 0. In the past, such false
>>> >>>> positives arose from time to time, which is one reason we added pixel
>>> >>>> test tolerance in the first place. I don't think running pixel tests
>>> >>>> on just one build slave will help us understand that risk.
>>> >>>
>>> >>> I think we'd just update the baseline to the newer OS X release, then,
>>> >>> like it has been done for the tiger -> leopard and leopard -> snow
>>> >>> leopard switches? platform/mac/ should always contain the newest
>>> >>> release baseline; when there are differences on leopard, the results
>>> >>> go into platform/mac-leopard/.
>>> >>>
>>> >>>> Why not start with some low but non-zero tolerance (0.1?) and see if
>>> >>>> we can at least make that work consistently, before we try the bolder
>>> >>>> step of tolerance 0?
>>> >>>> Also, and as a side note, we probably need to add more build slaves
>>> >>>> to run pixel tests at all, since just running the test suite without
>>> >>>> pixel tests is already slow enough that the testers are often
>>> >>>> significantly behind the builders.
>>> >>>
>>> >>> Well, I thought about just running the pixel tests for the svg/
>>> >>> subdirectory as a separate step, hence my request for tolerance 0, as
>>> >>> the baseline already passes without problems at least on my & Dirk's
>>> >>> machines. I wouldn't want to argue for running 20,000+ pixel tests
>>> >>> with tolerance 0 as a first step :-) But the 1000 SVG tests might be
>>> >>> fine with tolerance 0?
>>> >>>
>>> >>> Even tolerance 0.1 as default for SVG would be fine with me, as long
>>> >>> as we can get the bots to run the SVG pixel tests :-)
>>> >>>
>>> >>> Cheers,
>>> >>> Niko
>>> >>>
>>> >>> _______________________________________________
>>> >>> webkit-dev mailing list
>>> >>> webkit-dev at lists.webkit.org
>>> >>> http://lists.webkit.org/mailman/listinfo.cgi/webkit-dev
>>> >>
>>> >>