[webkit-dev] Filtering results on wpt.fyi, Safari-specific failures

Tue Feb 26 09:51:41 PST 2019

A lot of the test results I'm seeing there are the "harness status",
which has been a common cause of confusion:
https://github.com/web-platform-tests/wpt.fyi/issues/62. Don't know
quite what the right solution is here, but it's definitely still
confusing.
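
To make the filter in the Safari-specific view (quoted further down)
easier to read, here is a minimal sketch that percent-decodes it with
Python's standard library. The helper name decode_view is made up for
illustration and is not part of wpt.fyi. My understanding, which may be
incomplete, is that the "ok" terms match the test-level harness status
and the "pass" terms match subtest results, which is why both appear in
the query.

```python
from urllib.parse import urlsplit, parse_qs

# URL of the "Safari-specific failures" view quoted later in this thread.
URL = (
    "https://wpt.fyi/results/?label=master&label=experimental"
    "&product=chrome%5Btaskcluster%5D&product=firefox%5Btaskcluster%5D"
    "&product=safari%5Bazure%5D&aligned"
    "&q=%28chrome%3Apass%7Cchrome%3Aok%29"
    "+%28firefox%3Apass%7Cfirefox%3Aok%29"
    "+%28safari%3A%21pass%26safari%3A%21ok%29"
)


def decode_view(url):
    """Return the percent-decoded query parameters of a results URL.

    Hypothetical helper for illustration only.
    """
    # keep_blank_values=True preserves the value-less "aligned" flag.
    return parse_qs(urlsplit(url).query, keep_blank_values=True)


params = decode_view(URL)
print(params["product"])  # the three browser runs being compared
print(params["q"][0])     # the structured search query
```

The decoded query reads: (chrome:pass|chrome:ok)
(firefox:pass|firefox:ok) (safari:!pass&safari:!ok).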

/Sam

On Mon, Feb 25, 2019 at 9:57 PM Philip Jägenstedt <foolip at chromium.org> wrote:
>
> I think I know what's going on there. When drilling down into tests and subtests, only those matching the filter are shown. After clearing the filter, things look a bit different in the directories you mentioned:
> https://wpt.fyi/results/ambient-light?label=master&label=experimental&product=chrome%5Btaskcluster%5D&product=firefox%5Btaskcluster%5D&product=safari%5Bazure%5D&aligned
> https://wpt.fyi/results/bluetooth?label=master&label=experimental&product=chrome%5Btaskcluster%5D&product=firefox%5Btaskcluster%5D&product=safari%5Bazure%5D&aligned
>
> In particular for idlharness.js tests, some subtests will pass because they're preconditions for the real tests. There will also be tests that check that something doesn't work; these pass even when the feature is entirely unsupported, as long as "not working" produces the same outcome, e.g. throwing an exception. Sometimes tests can be tweaked to fail if the feature is unsupported.
>
> Drilling down into a directory somewhat at random and clearing filters, it does look like this is legit:
> https://wpt.fyi/results/fetch/api/cors?label=master&label=experimental&product=chrome%5Btaskcluster%5D&product=firefox%5Btaskcluster%5D&product=safari%5Bazure%5D&aligned
>
> On Mon, Feb 25, 2019 at 8:31 PM Maciej Stachowiak <mjs at apple.com> wrote:
>>
>>
>> Neat.
>>
>> I see some obvious areas for focus, where Safari fails lots of tests that the other browsers don’t.
>>
>> For context, I tried looking at this view, which shows all tests that Chrome and Firefox pass, with Safari results shown regardless of result:
>> https://wpt.fyi/results/?label=master&label=experimental&product=chrome%5Btaskcluster%5D&product=firefox%5Btaskcluster%5D&product=safari%5Bazure%5D&aligned&q=%28chrome%3Apass%7Cchrome%3Aok%29+%28firefox%3Apass%7Cfirefox%3Aok%29
>>
>> I noticed some puzzling results there: Safari passes all the ambient-light and bluetooth tests that Chrome and Firefox do, despite not supporting these standards at all. (For that matter I’m not sure Firefox supports these specs either.) Not sure if harness problem, or dubious tests that don’t actually test the standard.
>>
>> Regards,
>> Maciej
>>
>> On Feb 25, 2019, at 5:48 AM, Philip Jägenstedt <foolip at chromium.org> wrote:
>>
>> I'd like to point out right away that diagnosing reftest failures is
>> currently cumbersome because we don't store the screenshots. This is
>> also a work in progress:
>> https://docs.google.com/document/d/1IhZa4mrjK1msUMhtamKwKJ_HhXD-nqh_4-BcPWM6soQ/edit?usp=sharing
>>
>> Until that has launched, I would recommend ignoring reftest failures
>> if the cause of failure isn't obvious.
>>
>> On Mon, Feb 25, 2019 at 2:30 PM Philip Jägenstedt <foolip at chromium.org> wrote:
>>
>>
>> Hi all,
>>
>> Following the improved Safari results last year [1] and the discussion
>> that generated, I'm happy to announce that the filtering requested is
>> now available in the search box. The full syntax is documented [2] but
>> there's also a new insights view [3] with some useful searches.
>>
>> Especially interesting for this list could be this view, of Chrome
>> Dev, Firefox Nightly and Safari Technology Preview, filtered to the
>> Safari-specific failures:
>> https://wpt.fyi/results/?label=master&label=experimental&product=chrome%5Btaskcluster%5D&product=firefox%5Btaskcluster%5D&product=safari%5Bazure%5D&aligned&q=%28chrome%3Apass%7Cchrome%3Aok%29+%28firefox%3Apass%7Cfirefox%3Aok%29+%28safari%3A%21pass%26safari%3A%21ok%29
>>
>> Both Google and Mozilla have efforts [4][5] to reduce the number of
>> Chrome/Firefox-specific failures, as this seems like a category of
>> problems which is especially valuable, where changing just one browser
>> can remove a pain point for web developers.
>>
>> No doubt some failures are spurious, but hopefully there is value to
>> be found by looking into where the largest numbers of failures appear
>> to be. If something seems to be wrong with the search/filtering,
>> please file an issue for us! [6]
>>
>> Credit to Mark Dittmer and Luke Bjerring who owned this project.
>>
>> P.S. We are also working on triage metadata for wpt.fyi, to make it
>> possible to burn down a list of failures like this and not later have
>> to re-triage to find the new failures. [7]
>>
>> [1] https://lists.webkit.org/pipermail/webkit-dev/2018-October/030209.html
>> [2] https://github.com/web-platform-tests/wpt.fyi/blob/master/api/query/README.md
>> [3] https://staging.wpt.fyi/insights
>> [4] https://bugs.chromium.org/p/chromium/issues/detail?id=896242
>> [5] https://bugzilla.mozilla.org/show_bug.cgi?id=1498357
>> [6] https://github.com/web-platform-tests/wpt.fyi/issues/new?title=Structured+Queries+issue&projects=web-platform-tests/wpt.fyi/8&labels=bug&template=search.md
>> [7] https://docs.google.com/document/d/1oWYVkc2ztANCGUxwNVTQHlWV32zq6Ifq9jkkbYNbSAg/edit?usp=sharing
>>
>> _______________________________________________
>> webkit-dev mailing list
>> webkit-dev at lists.webkit.org
>> https://lists.webkit.org/mailman/listinfo/webkit-dev
>>
>>