[webkit-dev] Test Expectations: A Pipe Dream

Sat Feb 26 01:46:09 PST 2011

On 26.02.2011 at 00:14, Dimitri Glazkov wrote:

> (re-sending to webkit-dev, since this idea was just mentioned on #webkit)
> 
> If wishes were fishes, checking layout tests expectations into WebKit
> would work like so:
> 
> * Authors of layout tests never need to submit test expectations with
> their patches
I can imagine cases where I explicitly want to disable LaTER, so maybe an lr? flag could help (lr == LaTER).

> * All new patches are flagged by a Layout Test Expectations Resolver
> (LaTER from here on).
> 
> * LaTER is a set of EWS bots (one for each platform). It takes the new
> patch, applies/builds/runs tests/generates expectations.
Great.

> * Newly generated expectations are compared against existing
> expectations. A minimal set of changes for each platform is created.
> 
> * This set of generated expectations is merged with the changes in the
> original patch, and attached to bug as a new patch, obsoleting the
> original patch. LaTER knows not to chew on this patch again.
That could be dangerous. Think of patches that slightly change the text rendering, or even the textual DRT results.
A mass rebaseline could easily push the final patch to 50+ MB. Do we want to manage all of that on Bugzilla?
Do we have enough resources for that?
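
To make the comparison step and the size concern concrete, here is a rough Python sketch of how a per-platform
minimal change set could be computed and measured. It is only an illustration: the function names and the
dict-of-bytes data layout are my assumptions, not anything that exists in the WebKit tools.

```python
def minimal_changes(existing, generated):
    """Return only the expectations that are new or differ on one platform.

    Both arguments are hypothetical dicts mapping a test path to the
    expected-result content (bytes).
    """
    changes = {}
    for test, content in generated.items():
        # A missing entry (None) never equals real content, so new
        # expectations are picked up as well as modified ones.
        if existing.get(test) != content:
            changes[test] = content
    return changes


def payload_size(changes):
    """Total size in bytes of the changed expectations."""
    return sum(len(content) for content in changes.values())
```

With something like this, the bot could report the payload size per platform before deciding what to attach.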

If we're all worried about the size of the patches LaTER could produce, it would be very handy if one
could choose to a) merge the LaTER results with the patch, or b) generate a link where the results can be downloaded.
If I produce a mass rebaseline patch, I can use LaTER to generate all desired baselines, download the package (several dozens of MB...),
apply it to my tree, and land it in chunks manually (no way a 20+ MB commit would work during regular working hours).
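
A minimal sketch of that choice and of the chunked landing, in Python. The 1 MB threshold, the function names,
and the greedy grouping are all assumptions on my side, just to show the idea:

```python
MERGE_LIMIT_BYTES = 1024 * 1024  # hypothetical threshold, not a real Bugzilla limit


def choose_delivery(size_bytes, limit=MERGE_LIMIT_BYTES):
    """Pick option a) 'merge' for small results, option b) 'link' for large ones."""
    return "merge" if size_bytes <= limit else "link"


def split_into_chunks(file_sizes, max_chunk_bytes):
    """Greedily group files (path -> size in bytes) into chunks for landing.

    A single file larger than max_chunk_bytes ends up in a chunk of its own.
    """
    chunks, current, current_size = [], [], 0
    for path in sorted(file_sizes):
        size = file_sizes[path]
        # Start a new chunk when adding this file would exceed the limit.
        if current and current_size + size > max_chunk_bytes:
            chunks.append(current)
            current, current_size = [], 0
        current.append(path)
        current_size += size
    if current:
        chunks.append(current)
    return chunks
```

Sorting by path keeps baselines from the same directory together, which should make each chunk easier to review.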

> * The rest of the process (review/cq/land) is the same as it was before.
> 
> Thoughts? Comments? More crazy ideas?

I think this is a wonderful idea. Thinking of SVG pixel tests, it would be very handy to see the impact of a patch on all other
LaTER-supported platforms _before_ landing the patch to trunk. That's the key change compared to the current workflow of landing and then watching the bots.

More crazy ideas? Sure:
A LaTER summary page for each bug report (a link similar to "review patch"), showing all layout test changes in an overview, per platform.
One could easily browse the results, compare them visually, and possibly click a link to file a bug for a specific test case, say one that fails on GTK.
If we had that, I'd be amazed :-)

I'm hoping for lots of feedback. Recalling how much EWS boosted our productivity, LaTER may have a similar impact.

Cheers,
Niko