[webkit-dev] Reducing layout test flakiness
jparent+webkit at gmail.com
Fri Nov 20 14:49:03 PST 2009
In Chromium-land, we have been working to eliminate flaky layout tests. One
thing that has helped us gain insight into flaky tests is the flakiness
dashboard.
I'd love to see this available for webkit.org as well. I'm happy to do the
work to set up the dashboard itself, but one thing we'd need is for
run-webkit-tests to output an extra json file, and for the build bots to
store this file.
Is a flakiness dashboard something you'd be interested in? Is storing the
extra data file ok (the largest chromium.org one is < 1MB, and the
webkit.org ones should be much smaller, since the size is based on the number
of failing tests)? Is anyone who knows Perl interested in adding support to
run-webkit-tests to output the JSON (we can provide details on the format)?
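To make "flaky" concrete, here is a minimal sketch of how per-test results
could be classified. The data shape (test name mapped to a list of "P"/"F"
results), the function names, and the two-flip threshold are all assumptions
for illustration. This is not the dashboard's actual JSON format or algorithm.

```javascript
// Hypothetical input: test name -> results of recent runs in order.
// "P" = pass, "F" = fail. This encoding is an assumption for illustration.

// Count how often the outcome changes between consecutive runs.
function countFlips(results) {
  let flips = 0;
  for (let i = 1; i < results.length; i++) {
    if (results[i] !== results[i - 1]) {
      flips++;
    }
  }
  return flips;
}

// Heuristic: a single flip looks like a regression or a fix;
// two or more flips suggest the test is flaky.
function findFlakyTests(history) {
  return Object.keys(history)
    .filter((name) => countFlips(history[name]) >= 2)
    .sort();
}
```

A test that goes from passing to consistently failing flips only once, so it
is treated as a real regression rather than as flakiness.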
Q: Why do we care about flaky layout tests?
A: They cost us time. Specifically, they cause the commit-queue to reject
good patches, lead engineers to spend extra time determining whether they
caused the tree to go red, mask real regressions because the tree was
already red, put an extra burden on ports trying to determine if their port
is bad or the test is just bad, etc.
Q: How can I help fix flaky tests/not introduce more flaky tests?
A: One big, easily fixable source of flakiness is setTimeout. Whenever
possible, use specific events rather than relying on setTimeout. Don't
use setTimeout to wait for resources to load. Use onload events instead
(iframe, image, and body all have onload events). If no suitable event
exists and you need time to elapse before checking something, give the
timeout a little extra wiggle room.
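As a rough sketch of the event-based approach (not taken from any particular
changeset; the helper name runWhenLoaded and the element id in the usage
comment are made up for illustration), the check is attached to the load
event instead of a timer:

```javascript
// Hypothetical helper, for illustration only: run `check` once, when
// `target` fires its load event, instead of guessing a delay with
// setTimeout(check, 100).
function runWhenLoaded(target, check) {
  function onLoad() {
    // Detach first so a later load event does not run the check again.
    target.removeEventListener("load", onLoad);
    check();
  }
  target.addEventListener("load", onLoad);
}

// In a test page this might be used as (element id is an assumption):
//   runWhenLoaded(document.getElementById("frame"), checkResult);
```

The check then runs as soon as the resource has loaded, however long that
takes on a slow bot, rather than after a fixed delay that may be too short.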
Q: Do you have examples of these sorts of easy fixes?
A: Timeout increase: http://trac.webkit.org/changeset/51150. Remove extra
watchdog setTimeout: http://trac.webkit.org/changeset/51120. Remove
unnecessary setTimeout: http://trac.webkit.org/changeset/51088. Use onload
to detect iframe loading: http://trac.webkit.org/changeset/49592
Q: I am still reading. Can I see a pretty picture?
A: Sure. Here are the dashboard results, clearly showing 2 tests that got
fixed. You can see the flakiness before (to the right) and the new
non-flakiness after (to the left).