[webkit-dev] can we stop using Skipped files?

Mon Jun 11 23:13:34 PDT 2012

On Mon, Jun 11, 2012 at 5:46 PM, Maciej Stachowiak <mjs at apple.com> wrote:
>
> On Jun 10, 2012, at 9:26 AM, Ojan Vafai <ojan at chromium.org> wrote:
>
> On Sun, Jun 10, 2012 at 4:54 AM, Balazs Kelemen <kbalazs at webkit.org> wrote:
>>>
>>> So the unit tests are superfluous.  In particular, if I had to pick
>>> between only having unit tests or only having regression tests, I might pick
>>> unit tests.  But if I already have regression tests then I'm unlikely to
>>> want to incur technical debt to build unit tests, particularly since unit
>>> tests require changing the infrastructure to make the code more testable,
>>> which then leads to the problems listed above.
>>
>>
>> There are many code paths that are used rarely. In practice, we were having
>> regressions frequently when people modified the code. Since the codebase has
>> been unittested, the rate of regressions has gone down considerably. The
>> time you spend dealing with tests is considerably less than the time you
>> spend rolling patches in and out as you encounter different edge cases that
>> different configurations/flags hit.
>>
>>
>>
>> A quick note on unittests. I think it's easy to define a hard limit for
>> unittests, which is that: if I want to add a feature, or some customizing
>> option for a particular port, it should be less effort to write the unittest
>> than to write the actual code. I heard from my colleagues a few times that
>> it's not always the case with nrwt. I can imagine that it's not trivial to
>> set up the unittest system for a module that has not been unittested so far
>> but I think it should rather be the job of those who are actively working on
>> the test harness, not of those who just need some work to be done for their
>> port.
>
>
> While this is a nice ideal to strive for, I don't think this ever plays out
> for testing on any project, e.g. it is very frequently harder to write tests
> for my WebCore changes than to make the change itself. Certainly anything we
> can do to make testing easier is better, but I don't see NRWT as more
> difficult to test than any other code in the WebKit project.
>
> WebKit has a policy of every change requiring tests. I don't see why tooling
> should be any different. It's unfortunate that NRWT started with 0 tests, so
> there are still (very few now!) parts that aren't tested. It's hard to test
> those parts if that's what you're modifying. However, it's *especially* the
> port-specific code that needs testing. Those are exactly the
> codepaths that break from lack of testing.
>
>
> Do we have some data that shows NRWT suffering fewer regressions (per unit
> time or per N changes) than ORWT?
>

I am not aware of any. Given that the two tools were developed over
different periods of time by different people with different
requirements, I'm not even sure how meaningful a comparison this would
be.

What we do have data for is the rate of regressions in NRWT over time.
I haven't dug up that data, but I can say with great certainty that we
broke things more often before we had tests. Usually, I would break
things multiple times before I got one change to land. It could be
that I'm just a crappy coder, but given that I'm the one doing most of
the work, you should probably be glad that there are tests now.

> I am strongly in favor of automated tests in general, but I'm skeptical of
> it here for two reasons:
>
> 1) I have found the hackability of anything involving webkitpy and its unit
> tests to be poor. It takes a long time to make a simple change, and the need
> to add tests or modify tests is certainly part of it.
>

I think it is almost a tautology that writing tests for a change will
take more time than not writing tests. The question is what is the
long-term cost of not having tests?

Also, it could be that the tests for webkitpy are badly written;
that's an argument that webkitpy is badly written at least as much as
an argument that you shouldn't write tests. So far I don't think
anyone in this thread has made the claim that NRWT is well written
code or is even better-written code than ORWT.

> 2) For code that ships to end-users or third parties, I am a strong advocate
> of comprehensive testing. I think testing is worthwhile even if it were
> hypothetically the case that faith-based programming was less total work.
> That is so because we are trading off the time of a couple of hundred WebKit
> engineers for quality of software experienced by hundreds of millions of
> users. So it's worth it to incur significant test infrastructure costs to
> benefit a much greater number of users.
>
> But for the case of internal tools, I think the tradeoff is fundamentally
> different. The costs of maintaining test infrastructure and the costs of
> dealing with regressions are borne by more or less the same set of people.

This is simply not true. The number of people that have committed a
change to NRWT is < 25% of the total number of committers to the project.
The vast majority (90%+) of the changes come from fewer than 10
people.

I'm not sure how this thread got derailed into a debate about whether
or not writing tests for our changes was a good idea, but it is not a
terribly interesting debate to me. I will try to close off this thread
in a separate note.

-- Dirk