On Fri, Mar 29, 2019 at 6:16 PM Robert Ma <robertma@chromium.org> wrote:
On Mon, Feb 25, 2019 at 8:49 AM Philip Jägenstedt <foolip@chromium.org> wrote:
I'd like to point out right away that diagnosing reftest failures is
currently cumbersome because we don't store the screenshots. This is
also a work in progress:
https://docs.google.com/document/d/1IhZa4mrjK1msUMhtamKwKJ_HhXD-nqh_4-BcPWM6soQ/edit?usp=sharing

Until that has launched, I would recommend ignoring reftest failures
if the cause of failure isn't obvious.

Great news! Reftest screenshots are now available on wpt.fyi. No more guesswork for why a reftest fails!

For example, take one of the Safari-only reftest failures you can find using the search link posted earlier. You can now click the "compare" button (you might need to force-reload the page to see it) to view the screenshots. That example looks like a genuine failure, while some others are probably caused by differences in font anti-aliasing or kerning (those tests should most likely use the Ahem font instead).

We are also working on another feature to triage the failures (e.g. to mark a test as a genuine failure and link it to bug trackers, or as flaky/broken). Stay tuned!

The screenshots can also come in handy when comparing Safari stable to Technology Preview:
https://wpt.fyi/results/?diff&filter=ADC&q=seq%28status%3Apass+status%3Afail%29&run_id=5130810281689088&run_id=5197532699295744
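
For anyone who wants to see what that link actually asks for, here is a rough sketch that decodes its query string with Python's standard urllib.parse. The helper name decode_query is just made up for illustration, and the code only shows what is visible in the URL; it assumes nothing about how wpt.fyi interprets the parameters on its side.

```python
from urllib.parse import urlsplit, parse_qs

# The comparison link from this message, split across lines for readability.
DIFF_URL = (
    "https://wpt.fyi/results/?diff&filter=ADC"
    "&q=seq%28status%3Apass+status%3Afail%29"
    "&run_id=5130810281689088&run_id=5197532699295744"
)


def decode_query(url):
    """Return the URL's query parameters as a dict mapping names to value lists.

    Hypothetical helper for illustration; not part of wpt.fyi.
    """
    # keep_blank_values=True preserves the bare "diff" flag, which has no value.
    return parse_qs(urlsplit(url).query, keep_blank_values=True)


params = decode_query(DIFF_URL)
for name, values in params.items():
    print(name, values)
```

The q parameter decodes to seq(status:pass status:fail), and run_id appears twice, once for each of the two runs being compared.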

/css/css-contain/contain-layout-baseline-003.html is one reftest that appears to have regressed in Technology Preview, and one can see the failure here:
https://wpt.fyi/analyzer?screenshot=sha1%3A66e5479ec5db9b860338e89803b563f7e99510f6&screenshot=sha1%3A385fc160998db876af7fce0e6a9fbf8ad06b4a45