[webkit-dev] can we stop using Skipped files?

Fri Jun 8 13:22:40 PDT 2012

On Fri, Jun 8, 2012 at 12:50 PM, Filip Pizlo <fpizlo at apple.com> wrote:
> On Jun 8, 2012, at 12:31 PM, Dirk Pranke wrote:
>
>> On Fri, Jun 8, 2012 at 10:56 AM, Filip Pizlo <fpizlo at apple.com> wrote:
>>>
>>> It's a lot harder to dive into, a lot more cumbersome to improve, and not
>>> any easier to maintain.
>>>
>>
>> I definitely agree that NRWT is more complicated than it seems like it
>> should be; it got contorted as we added all the features we needed to
>> add, and I have been in a "simplify" mode over the past few months. I
>> would welcome any feedback where you think things are overly complex.
>
> This is a difficult question - it's unfortunately easier to observe that something is complex, than it is to pinpoint why it is complex.  But I will try.
>

Thanks! Believe you me, I've spent a lot of time pondering this with
this codebase myself.

> 1) Code locality.  I can open Tools/Scripts/old-run-webkit-tests and pretty rapidly discover (a) how options are parsed, (b) how platform differences are handled, (c) how tests are found, and (d) how tests are run.  I can hack all of this code because it's all in one place.  I don't have to be a domain expert to do it.  Hell, I don't even have to be good a Perl to find my way around.
>

Yup, that's a downside to having things in multiple files, all right.
More on this below ...

> 2) Code size:
>
> [pizlo at wartooth OpenSource] wc Tools/Scripts/old-run-webkit-tests
>    2796   10316   98733 Tools/Scripts/old-run-webkit-tests
>
> [pizlo at wartooth OpenSource] wc Tools/Scripts/new-run-webkit-tests Tools/Scripts/webkitpy/layout_tests/*.py Tools/Scripts/webkitpy/layout_tests/*/*.py
> .... bunch of stuff
>   23197 91897 1049914 total
>
> That's a *HUGE* difference.  Consider that NRWT just adds only one thing that most people care about: parallelism.  Is an 8x increase in code size justified?
>
> I know that LoC metrics are evil in most cases.  But this is not most cases.  This is an order-of-magnitude difference.  That's 8x more code I have to look at to find what I want.  That's 8x more code that I potentially have to edit to add a feature.  That's 8x more code that could have a bug.  And so on.  Badness!
>

About half of that code is tests. The actual code to do the
parallelism is about ~500 lines; I have patches posted somewhere that
I haven't gotten around to landing that reduce that to about ~250
lines (or less, I don't recall exactly)

Much of the additional code actually is for other things. There's a
lot more options for logging and gathering statistics -- most of it
probably unneeded -- and there's a *lot* of complexity added for
handling all of the different test expectations options. There's
probably other stuff that can be deleted or simplified. It would be
interesting to break that down, and get the code more structured into
the stuff you need to be aware of and the stuff you can ignore.

>
> But the act of running layout tests is in itself a regression test of the layout test harness just as much as it is a regression test of WebKit.
>
> So the unit tests are superfluous.

Running the layout tests is a regression test, but both ORWT and NRWT
have dozens of options. Do you know which ones are used, and in which
combinations? Do you actually run all of them? If not, are you willing
to let the bots and other devs be your testers?

There's actually a single file of integration tests that covers NRWT
pretty well (it basically documented the above). However, others have
found that they want unit tests for the different modules to make
things more individually testable (and in some cases, you need unit
tests for things that aren't easily tested via integration tests), so
that's how we get so many tests.

> I really like approach of shared-nothing message-passing, but was under the (possibly mistaken) impression that NRWT has some threading in it.

It did use threads heavily at one point, and the threading didn't work
right, which led me to rewrite it into the current architecture. We do
actually still use threads in one case, to support the --singly flag
(we used to need threads for this with chromium), but we don't even
need that any more.

-- Dirk