[webkit-dev] Importing W3C tests to webkit

Wed May 23 14:16:34 PDT 2012

On Wed, May 23, 2012 at 1:41 PM, Ryosuke Niwa <rniwa at webkit.org> wrote:
>> > The only sane argument I've heard so far to gate pixel tests is that the
>> > correctness of such tests need to be manually inspected, which requires a
>> > lot of manual labor and is very error prone.
>>
>> I'm assuming the above includes the ongoing maintenance cost of
>> keeping pixel tests up to date, as well as the cost at the initial
>> checkin.
>
> I'm not concerned of those. Once the correct expected result is checked in,
> it's pretty easy to rebaseline tests per rendering engine changes assuming
> people who are rebaselining tests know what they're doing.

You should be concerned; keeping pixel tests up-to-date is clearly a
non-zero cost that only the Chromium port thus far has been willing to
bear, and I suspect that, over time, the cost of updating baselines is
substantially higher than the cost of the initial review (since it's a
recurring cost).

For empirical evidence of this, we only have to ask Emil and Levi what
the cost of updating all of the pixel tests was for the subpixel layout
test change, or ask the Skia guys how many bug fixes they're reluctant
to make because of the cost of reviewing literally thousands of images
that change inconsequentially.

>> There is also the fact that the more tests we have, the more tests we
>> have to run, and increasing cycle time by itself is a cost to developer
>> productivity.
>
> Sure, but I don't think that's a valid argument for not adding tests
> especially since there is no way for us to mechanically test whether two
> tests test the same set of features or not (this is an intractable problem
> even in its limited form and an undecidable one in its most general form).

At some point, adding more tests yields a declining marginal rate of
return in finding more bugs; this is a truism of software
development, and is why *all* software testing efforts stop at some
point (ignoring formal proofs of completeness in model checkers).
Either you don't think this is true, or you think this is true and
we're just not at that point yet.

If you think the latter then we agree, but I don't understand why you
are arguing as if you believe the former.

To repeat myself, I have never said that we shouldn't ever add more
tests, just that we should have a rational process for doing so that
includes looking at what overlap we have with the existing tests and
making sure that adding more tests delivers value. Since you agree at
least that we shouldn't be adding duplicate tests, you clearly agree
with this to some degree, so I'm not sure if you and I have any real
disagreements or if we're just talking past each other.

> Also, using ref test or pixel test, etc... doesn't change the cycle time
> significantly so I don't understand what your argument is. Or are you
> suggesting that non-ref tests are somehow more redundant than ref tests?
> (please give us why).

I am saying that I believe that adding pixel tests incurs more cost on
the project than adding ref tests, and since all testing is about cost
vs. benefit, you need to be more careful when adding pixel tests.
Since we actively discourage people from writing pixel tests in favor
of text-only or ref tests, I hardly think this is a controversial
stance.

-- Dirk