[webkit-dev] An update on new-run-webkit-tests

Wed Apr 6 22:33:17 PDT 2011

On Wed, Apr 6, 2011 at 9:01 PM, Maciej Stachowiak <mjs at apple.com> wrote:
>
> On Apr 6, 2011, at 7:39 PM, Dirk Pranke wrote:
>
>>
>> There are also a number of bugs currently listed as blocking that I
>> don't think really qualify. Unless told otherwise, I'm planning to
>> remove the blocking flag from the following on Monday 4/11 (if they
>> haven't been fixed first):
>>
>> 57640 [GTK] overlapping drag&drop tests fail on NRWT
>> 55909 new-run-webkit-tests --run-singly option is busted
>> 55163 new-run-webkit-tests: enable multiple processes by default on Chromium Win
>> 47240 new-run-webkit-tests: getting an "error 2" back from ImageDiff
>> 37426 new-run-webkit-tests should use the ServerProcess abstraction in
>> chromium.py
>> 37007 fast/tokenizer/doctype-search-reset.html fails when run out of
>> order (new-run-webkit-tests)
>> 35359 fast/repaint/renderer-destruction-by-invalidateSelection-crash.html
>> fails intermittently
>> 35266 new-run-webkit-tests --platform=mac-leopard timeout limit should
>> match run-webkit-tests
>> 35049 http/tests/security/cross-frame-access-put.html fails
>> intermittently under new-run-webkit-tests
>> 35006 fast/dom/global-constructors.html is failing based on previous tests
>>
>> Also, just because I don't think they should block a cutover, I do
>> still think they should be fixed, so don't worry about that :)
>
> I think the ones that represent tests newly failing or becoming flaky should be fixed before cutting over. We wouldn't want to lose test coverage when we do the switch, right?
>

Hi Maciej,

I'm not sure I understand you, but if I do, this is what I was
getting at in the paragraph above: we should expect some tests to be
flaky or failing under NRWT simply because NRWT isn't exactly
identical to ORWT. NRWT may be exposing bugs in the code that ORWT
didn't trigger (e.g., because tests ran in a slightly different
order, or because of the concurrency issues).

It may be that you're thinking that either we run the test and it
fails, or we put the test in the Skipped file, because that was our
only choice with ORWT. In the new system, we can mark the test as
expected to fail in a particular way, but continue to run it (in order
to ensure that the test doesn't get worse and to maintain coverage).
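
To make the difference from a Skipped file concrete, here is a minimal
Python sketch of the idea. It is only an illustration: the names
(EXPECTATIONS, classify) and the outcome labels are hypothetical, not
NRWT's actual code or expectations file format, and which outcomes are
listed for each test is made up for the example.

```python
# Hypothetical sketch; not NRWT's real data structures or file format.
PASS, TEXT, CRASH, TIMEOUT = "PASS", "TEXT", "CRASH", "TIMEOUT"

# Each listed test is still run; the set says which outcomes are tolerated.
# Tests that are not listed are expected to pass.
EXPECTATIONS = {
    # Assumed to fail consistently with a text mismatch.
    "fast/dom/global-constructors.html": {TEXT},
    # Assumed to be flaky: either outcome is tolerated.
    "http/tests/security/cross-frame-access-put.html": {PASS, TEXT},
}


def classify(test, actual):
    """Compare an actual outcome against the expected outcomes for a test."""
    expected = EXPECTATIONS.get(test, {PASS})
    if actual in expected:
        return "expected"
    if actual == PASS:
        # The test got better; the expectation can be removed.
        return "unexpected pass"
    # The test got worse (e.g. a known text failure turned into a crash).
    return "regression"
```

With a Skipped file the test never runs, so a known text failure that
turns into a crash goes unnoticed; in this sketch it would be reported
as a regression.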

Certainly running both systems in parallel for a while and shaking out
bugs that the NRWT bots reveal prior to cutting over is a good idea,
but I don't know that it's realistic to target all tests passing 100%
of the time prior to cutover. Then again, it may be that I'm more used
to Chromium bots where we have a large number of tests that aren't
expected to pass for one reason or another, and the Apple Mac port
will be more stable and easier to converge on.

Does that address your concerns?

And, just to be clear, I am not presuming to decide when anyone can or
should cut over (besides Chromium, of course). It's up to the
respective bot owners to decide to reconfigure their bots and switch
over if and when they're ready to do so. I'm just trying to make it
look appealing :)

-- Dirk