[webkit-dev] Upstreaming from LayoutTests to web-platform-tests, coordinating Blink+WebKit

Tue Nov 28 07:24:03 PST 2017

On Tue, Nov 28, 2017 at 3:37 PM Ryosuke Niwa <rniwa at webkit.org> wrote:

> On Mon, Nov 27, 2017 at 5:58 PM, Philip Jägenstedt <foolip at chromium.org>
> wrote:
>
>> (From Nov 24, I used the wrong email, resending for the archives.)
>>
>> On Sat, Nov 18, 2017 at 8:02 AM, youenn fablet <youennf at gmail.com> wrote:
>> > Thanks for taking the time to share this additional information.
>> > I think this is helpful to make progress.
>> > Please see inline for some more comments related to the points you are
>> > bringing to the table.
>> >
>> > Stepping back from the WPT-server-specific issue, let's be optimistic and
>> > assume that we are able to run WPT tests with good enough performance.
>> > Potential downsides of migration that I have heard so far:
>> > - Prioritization of tests to investigate might be harder.
>> > - Sharing tests with other teams might be harder if subresource links
>> >   are not relative.
>> > These two seem like solvable issues to me.
>> >
>> > On Fri, Nov 17, 2017 at 10:09, Alexey Proskuryakov <ap at apple.com> wrote:
>> >>
>> >>
>> >> On Nov 17, 2017, at 9:18, youenn fablet <youennf at gmail.com> wrote:
>> >>
>> >>
>> >> Chris recently noticed that some heavily used files (testharness*) were
>> >> cacheable through Apache but not WPT.
>> >> This is now fixed and should improve WPT performance.
>> >>
>> >>
>> >> This is part of <https://bugs.webkit.org/show_bug.cgi?id=178277>;
>> >> another part is that the server is 10x slower than Apache.
>> >
>> >
>> > Other measurements showed it to be 3x slower, which still makes it
>> > worthwhile to explore.
>> > We need to be cautious here though, since optimization is all about
>> > chasing where time is actually spent.
>> >
>> > If we can prove to ourselves that this is an important issue, we should
>> > discuss with the WPT community how to fix it.
>> > In addition to better caching, other optimizations like
>> > https://github.com/w3c/wptserve/pull/86 may bring some additional
>> > benefit.
>> >
>> > If we do not find a good solution at the WPT server level, we still have
>> > the option of running some tests from file URLs.
>> > Ryosuke mentioned the possibility of classifying tests at import time by
>> > comparing their results when served through the WPT server and when
>> > loaded as file URLs.
>>
>> Thanks for filing https://github.com/w3c/web-platform-tests/issues/8391,
>> Youenn!
>>
>> I've put a placeholder around this in our planning for next quarter.
>> At a minimum, I'd like to investigate this and see how much slower the
>> tests are in Chromium's infra than they would be without wptserve, and
>> where that extra time is spent. But the reasons may not be the same in
>> WebKit, so please don't count on any improvement :)
>>
>> >> I just tested on my MacBook Pro, and WPT tests took 23% of the time while
>> >> being only 9% of the total count. Bearing in mind that WebKit's own tests
>> >> have higher value due to the way we choose what to test (see below),
>> >> that's not a great story in my opinion.
>> >
>> >
>> > These numbers are difficult to interpret.
>> > The WPT authoring style is multiple tests in a single file, which is bad
>> > for stability but potentially good for performance.
>> > WebKit usually prefers one file/one test.
>> >
>> > If we want to talk to the WPT community about this, we need to provide
>> > some more tangible numbers.
>> > We could try to run a large subset of WPT tests that behave the same
>> > when served through Apache and through WPT.
>> > I just did that on a small subset of IDB tests. This seems to show
>> > something like a 25% slowdown.
>> >
>> >> One other thing that we discussed before was the operational
>> >> complexity of running WPT tests. We frequently need to share tests with
>> >> people who don't work on WebKit directly, but have the need to edit and
>> >> run our tests. Inability to drag and drop a local copy into a Safari
>> >> window is a deterrent to addressing problems caught by the tests. I think
>> >> that the response we got was that tests will continue to require a
>> >> server to run.
>> >
>> >
>> > Tests are available at https://w3c-test.org, which makes it easy to
>> > share through any tool supporting hyperlinks.
>> > A webarchive can also be made so that it is easy to share and probably
>> > edit such tests.
>> > Tools like jsfiddle are also a great way to create/share/edit tests.
>> > I received several bug reports on bugzilla using it and this proved to
>> > be efficient.
>> >
>> >>
>> >> Let me explain why I think that WebKit tests are often more valuable as
>> >> regression tests than WPT tests are. We add tests as we fix bugs, so we
>> >> know that the tests are generally for problems that have a high impact
>> >> on users and developers - that's because someone actually discovered the
>> >> problem, and someone prioritized it highly enough to fix. We also know
>> >> that our tests cover code that is error prone, which is why we had bugs
>> >> in the first place. Of course, anything can be broken, but certain
>> >> things are less likely to. Compliance tests written for specs are also
>> >> valuable, but at some point we need to prioritize which tests to
>> >> investigate and even to run.
>> >
>> >
>> > I don't really see why we should prioritize the tests to run when all of
>> > them provide clear value to some WebKit members.
>> > I agree that we need to prioritize tests we investigate. There can be a
>> > solution inside WPT, like adding WebKit specific metadata so that:
>> > - WPT contributors would communicate with WebKit members whenever
>> >   changing such tests
>> > - WebKit contributors would prioritize failing tests that carry the
>> >   WebKit-specific metadata
>> >
>> > That said, if these tests are so beneficial to WebKit, they are
>> > potentially very useful to other teams as well.
>> > And vice versa, we might find really good WPT tests that show useful
>> > crashes and failures in WebKit.
>> > I am experiencing that nowadays with WPT service worker tests.
>>
>> FWIW, I think that Alexey is very likely right about the average value
>> of LayoutTests vs. web-platform-tests, certainly as regression tests
>> for WebKit. But, with things like wpt.fyi and more 2-way sync, I think
>> we'll see the overall quality of wpt increase.
>>
>> Still, now and for the foreseeable future, not everyone will import
>> and run all of wpt. If we put "upstream large parts of LayoutTests" to
>> the side for the moment, I think at the minimum we still need to be
>> concerned about the cases where Chromium and WebKit are already
>> importing a directory, and someone wants to upstream things from one
>> of the projects into web-platform-tests. This is such a case:
>> https://bugs.chromium.org/p/chromium/issues/detail?id=761790
>>
>> Assuming the upstreaming was carefully reviewed and the WebKit
>> reviewers trust that review, would it be welcome to delete the same
>> tests in WebKit?
>>
>
> "Upstreaming was carefully reviewed" is a big if. I don't necessarily
> trust the Blink code review process for the needs of WebKit changes. We
> certainly shouldn't be rubber-stamping changes to remove tests in WebKit
> just because it had been code reviewed in Blink or WPT.
>

Doing the upstreaming from WebKit first would get around this problem, and
is something I could see us doing. If a Chromium developer upstreamed tests
from WebKit and got review from a WebKit reviewer, I think we would
rubber-stamp the corresponding removal in Chromium.

But that would be a futile exercise if the WebKit reviewers don't want to
delete the tests from WebKit. The only remaining options would then be not
sharing the tests with EdgeHTML and Gecko, or creating more duplication for
WebKit.