[webkit-dev] The green tree era

Adam Barth abarth at webkit.org
Sat Apr 3 10:36:57 PDT 2010


Thanks to an enormous effort by a large number of people (spearheaded
by Eric), we have returned the Tiger and Snow Leopard bots to
green, letting us add them to the list of core builders.  I'd like to
ask your help to keep them that way.

In the past, red bots went unnoticed and failures (including real
regressions) accumulated in the tree.  The longer a failure persists
in the tree, the harder it is to track down the source of the failure
and the higher the chance that the failure is hiding another
regression.  Keeping the tree green will require a cultural shift in
the project, but I think the near-term costs of changing the culture
are outweighed by the long-term gains in productivity.

== Sheriffbot ==

We now have sheriffbot to help us keep the tree green.  Here's how
sheriffbot works today.  If a core builder fails twice in a row,
sheriffbot computes the blame list for the failure, notifies the
responsible parties (committer, author, and reviewer) in IRC, and
comments on the relevant bugs.
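
As a rough illustration of the rule above (a sketch only, not
sheriffbot's actual code), the check might look like the following.
The Build type and the blame_list function are hypothetical names, and
the sketch assumes the blame list is the set of revisions first picked
up by the build that started the current run of failures.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Build:
    """One completed build on a core builder (hypothetical structure)."""
    passed: bool
    revisions: List[int]  # revisions first included in this build


def blame_list(builds: List[Build]) -> Optional[List[int]]:
    """Return suspect revisions if the two latest builds failed, else None.

    `builds` is ordered oldest to newest.
    """
    # Only complain after two failures in a row.
    if len(builds) < 2 or builds[-1].passed or builds[-2].passed:
        return None
    # Walk back to the first build of the current run of failures.
    first_red = len(builds) - 1
    while first_red > 0 and not builds[first_red - 1].passed:
        first_red -= 1
    # Assumption: every revision in that build is a suspect.
    return sorted(builds[first_red].revisions)
```

Under these assumptions, a single red build returns None, and every
revision that landed in the first red cycle is a suspect, which is why
an innocent patch can end up on the list.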

Sheriffbot can't fix the tree himself: only you can prevent forest
fires.  If sheriffbot complains about your patch, please take a moment
to investigate whether your patch (or the patch you reviewed) actually
caused a failure (ideally coordinating with other folks on IRC).  It's
possible that your change broke the tree, but it's also possible that
your change was blamed because it was caught in the same cycle as the
change that really caused the failure (or it's possible sheriffbot was
tricked by two flaky tests in a row).

Q: Why wait for two failures in a row?
A: We wait for two failures in a row to avoid spamming in the case of
a flaky test.  That system isn't perfect, and we've seen cases where
sheriffbot is fooled by flaky tests.  We're going to continue to
iterate on the failure detection algorithm to reduce false alarms.

Q: How does sheriffbot know my IRC nick?
A: We've added the IRC nicks for most members of the project to
http://trac.webkit.org/browser/trunk/WebKitTools/Scripts/webkitpy/common/config/committers.py.
 Please take a minute to verify that your nick is correct or add your
nick if it's missing.  Adding your nick to committers.py doesn't
require review.

== Responding to failures ==

Generally, there are two approaches to responding to failures: (1)
attempt to fix the failure live on the tree, or (2) roll out the
offending change.  Historically, the WebKit project has favored fixing
live because, I think, rolling changes in and out of the tree was
cumbersome.  The tradeoff between these approaches is that fixing live
imposes the cost of a broken tree on every member of the project
whereas a rollout imposes a cost on the committer of the patch.  As the
number of people involved in the project grows, the cost of a broken
tree scales while the cost of a rollout remains fixed.  At some point,
attempting to fix the tree live is more costly to the project than
rolling out the patch.
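
To make the scaling argument concrete, here is a minimal sketch of the
arithmetic.  The function names and the cost model (a fixed cost per
contributor while the tree is red, and a single fixed cost for a
rollout) are simplifying assumptions for illustration, not measured
data.

```python
import math


def broken_tree_cost(contributors: int, cost_per_contributor: float) -> float:
    """Total cost of a red tree: every contributor pays the same cost."""
    return contributors * cost_per_contributor


def break_even_contributors(cost_per_contributor: float,
                            rollout_cost: float) -> int:
    """Smallest project size at which a red tree costs more than a rollout."""
    if cost_per_contributor <= 0:
        raise ValueError("cost_per_contributor must be positive")
    # The rollout cost is fixed, so solve n * cost_per_contributor > rollout_cost.
    return math.floor(rollout_cost / cost_per_contributor) + 1
```

For example, with a hypothetical cost of 0.25 hours per contributor
and 2 hours for a rollout, break_even_contributors(0.25, 2.0) gives 9:
from nine contributors on, the rollout is the cheaper option.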

If you cause a failure, I'd like to encourage you to roll out your
patch instead of trying to fix the failure live.  There's no shame in
rolling out your patch, and you can always land it again once you've
tracked down the failure.  Often re-landing your change doesn't
require an additional review.  Of course, rolling out a patch isn't
appropriate for all situations.  The next time you find yourself
trying to fix a failure live on the tree, ask yourself whether you're
selfishly imposing costs on other members of the project.

== Using sheriffbot to roll out a patch ==

Sheriffbot can help you roll out a patch.  Here's how it works.
Suppose revision 57047 broke Tiger.  You can send sheriffbot the
following command in #webkit:

sheriffbot: rollout 57047 This patch broke xyz test on Tiger

Sheriffbot will file a bug about the failure, cc the appropriate
people, mark the bug as blocking the original bug, attach a rollout
patch, and give you a link to the bug in IRC.  All you need to do is
go to the bug and mark the rollout patch as commit-queue+.  The
commit-queue will then land your rollout and reopen the original bug.
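
For readers curious about the shape of the command, here is a minimal
sketch of how a message like the one above could be split into a
revision number and a reason.  This is an illustration only, not
sheriffbot's implementation; RolloutRequest and parse_rollout are
hypothetical names.

```python
import re
from typing import NamedTuple, Optional


class RolloutRequest(NamedTuple):
    """Parsed form of a rollout command (hypothetical structure)."""
    revision: int
    reason: str


# Matches "sheriffbot: rollout <revision> <reason>".
_ROLLOUT = re.compile(r"^sheriffbot:\s+rollout\s+(\d+)\s+(.+)$")


def parse_rollout(message: str) -> Optional[RolloutRequest]:
    """Return the parsed request, or None if the message is not a rollout."""
    match = _ROLLOUT.match(message.strip())
    if match is None:
        return None
    return RolloutRequest(int(match.group(1)), match.group(2).strip())
```

In this sketch the reason is required, so a bare "sheriffbot: rollout
57047" is rejected; whether the real bot insists on a reason is an
assumption here.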

This functionality is still very new, and we haven't worked out all
the kinks yet, but if you have trouble, please let Eric or me know.

Q: Why do I have to mark the patch as commit-queue+?  Haven't I already
told sheriffbot to roll out the patch?
A: We don't trust commands received on IRC.  We need you to authorize
the rollout using your bugs.webkit.org credentials.

Q: Won't spammers create infinite bugs by poking the sheriffbot?
A: We might need to restrict use of the rollout command to committers.
 We'd still use bugs.webkit.org to authorize landing the rollout, but
that should reduce the spam problem, if there is one.

== The Commit-Queue Superhighway ==

The sheriffbot is the third and final component in a system we've been
referring to informally as the commit-queue superhighway.  Once we get
the kinks worked out in sheriffbot and add the final pieces of
integration, I'll explain how you can use the commit-queue
superhighway to make yourself (and the project as a whole) more
productive.

I've put a version of this mail on the wiki at
<https://trac.webkit.org/wiki/Keeping%20the%20Tree%20Green>.  Please
join me in making the green tree era the most productive era of WebKit
development yet.

Adam

