<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
"http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head><meta http-equiv="content-type" content="text/html; charset=utf-8" />
<title>[210982] trunk/Websites/perf.webkit.org</title>
</head>
<body>

<style type="text/css"><!--
#msg dl.meta { border: 1px #006 solid; background: #369; padding: 6px; color: #fff; }
#msg dl.meta dt { float: left; width: 6em; font-weight: bold; }
#msg dt:after { content:':';}
#msg dl, #msg dt, #msg ul, #msg li, #header, #footer, #logmsg { font-family: verdana,arial,helvetica,sans-serif; font-size: 10pt;  }
#msg dl a { font-weight: bold}
#msg dl a:link    { color:#fc3; }
#msg dl a:active  { color:#ff0; }
#msg dl a:visited { color:#cc6; }
h3 { font-family: verdana,arial,helvetica,sans-serif; font-size: 10pt; font-weight: bold; }
#msg pre { overflow: auto; background: #ffc; border: 1px #fa0 solid; padding: 6px; }
#logmsg { background: #ffc; border: 1px #fa0 solid; padding: 1em 1em 0 1em; }
#logmsg p, #logmsg pre, #logmsg blockquote { margin: 0 0 1em 0; }
#logmsg p, #logmsg li, #logmsg dt, #logmsg dd { line-height: 14pt; }
#logmsg h1, #logmsg h2, #logmsg h3, #logmsg h4, #logmsg h5, #logmsg h6 { margin: .5em 0; }
#logmsg h1:first-child, #logmsg h2:first-child, #logmsg h3:first-child, #logmsg h4:first-child, #logmsg h5:first-child, #logmsg h6:first-child { margin-top: 0; }
#logmsg ul, #logmsg ol { padding: 0; list-style-position: inside; margin: 0 0 0 1em; }
#logmsg ul { text-indent: -1em; padding-left: 1em; }#logmsg ol { text-indent: -1.5em; padding-left: 1.5em; }
#logmsg > ul, #logmsg > ol { margin: 0 0 1em 0; }
#logmsg pre { background: #eee; padding: 1em; }
#logmsg blockquote { border: 1px solid #fa0; border-left-width: 10px; padding: 1em 1em 0 1em; background: white;}
#logmsg dl { margin: 0; }
#logmsg dt { font-weight: bold; }
#logmsg dd { margin: 0; padding: 0 0 0.5em 0; }
#logmsg dd:before { content:'\00bb';}
#logmsg table { border-spacing: 0px; border-collapse: collapse; border-top: 4px solid #fa0; border-bottom: 1px solid #fa0; background: #fff; }
#logmsg table th { text-align: left; font-weight: normal; padding: 0.2em 0.5em; border-top: 1px dotted #fa0; }
#logmsg table td { text-align: right; border-top: 1px dotted #fa0; padding: 0.2em 0.5em; }
#logmsg table thead th { text-align: center; border-bottom: 1px solid #fa0; }
#logmsg table th.Corner { text-align: left; }
#logmsg hr { border: none 0; border-top: 2px dashed #fa0; height: 1px; }
#header, #footer { color: #fff; background: #636; border: 1px #300 solid; padding: 6px; }
#patch { width: 100%; }
#patch h4 {font-family: verdana,arial,helvetica,sans-serif;font-size:10pt;padding:8px;background:#369;color:#fff;margin:0;}
#patch .propset h4, #patch .binary h4 {margin:0;}
#patch pre {padding:0;line-height:1.2em;margin:0;}
#patch .diff {width:100%;background:#eee;padding: 0 0 10px 0;overflow:auto;}
#patch .propset .diff, #patch .binary .diff  {padding:10px 0;}
#patch span {display:block;padding:0 10px;}
#patch .modfile, #patch .addfile, #patch .delfile, #patch .propset, #patch .binary, #patch .copfile {border:1px solid #ccc;margin:10px 0;}
#patch ins {background:#dfd;text-decoration:none;display:block;padding:0 10px;}
#patch del {background:#fdd;text-decoration:none;display:block;padding:0 10px;}
#patch .lines, .info {color:#888;background:#fff;}
--></style>
<div id="msg">
<dl class="meta">
<dt>Revision</dt> <dd><a href="http://trac.webkit.org/projects/webkit/changeset/210982">210982</a></dd>
<dt>Author</dt> <dd>rniwa@webkit.org</dd>
<dt>Date</dt> <dd>2017-01-20 14:04:23 -0800 (Fri, 20 Jan 2017)</dd>
</dl>

<h3>Log Message</h3>
<pre>Make sync-commits.py robust against missing Subversion authors and missing parent Git commits
https://bugs.webkit.org/show_bug.cgi?id=167231

Reviewed by Antti Koivisto.

Fixed a bug where a Subversion commit that is missing an author name (an anonymous commit) results in an
out-of-bounds exception, and a bug where syncing a Git repository starts failing once a merge commit has pulled
in a commit dated earlier than the last reported commit.

For the latter fix, added --max-ancestor-fetch-count to specify the maximum number of commits to look back.

* tools/sync-commits.py:
(main): Added --max-ancestor-fetch-count.
(Repository.fetch_commits_and_submit): If submit_commits fails with FailedToFindParentCommit, fetch the missing
parent commit's information and resubmit until all missing parents have been resolved.
(Repository.fetch_next_commit): Renamed from fetch_commit.
(SVNRepository.fetch_next_commit): Renamed from fetch_commit. Don't try to get the author name if it's missing
due to an anonymous commit. In that case it's important to omit the &quot;author&quot; field from the JSON submitted to
a dashboard, since the dashboard rejects a commit whose &quot;author&quot; field is not an array (e.g. null).
(GitRepository.fetch_next_commit): Renamed from fetch_commit.
(GitRepository.fetch_commit): Added. Fetches the commit information for a given git hash. Used to retrieve
missing parent commits.
(GitRepository._revision_from_tokens): Extracted from fetch_commit.

* tools/util.py:
(submit_commits): Optionally takes status_to_accept so that a caller can accept FailedToFindParentCommit without
throwing, and now returns the response JSON.</pre>
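<p>The resubmission loop described for Repository.fetch_commits_and_submit can be sketched as follows. This is a minimal Python 3 illustration and not the code in this revision: submit_with_ancestors, fake_submit, HISTORY and known_to_server are hypothetical names, and the dashboard is replaced by an in-memory stand-in that reports FailedToFindParentCommit in the same response shape the patch reads (result['commit']['parent']).</p>
<pre>
```python
def submit_with_ancestors(pending, submit, fetch_commit, max_ancestor_fetch_count=100):
    """Submit pending commits, prepending missing parents until the server accepts."""
    result = {}
    for _ in range(max_ancestor_fetch_count):
        result = submit(pending)
        if result.get('status') == 'OK':
            return pending
        if result.get('status') != 'FailedToFindParentCommit':
            break
        parent_hash = result['commit']['parent']
        parent = fetch_commit(parent_hash)
        if parent is None:
            raise Exception('Could not find the parent %s' % parent_hash)
        # Parents go first so the server sees them before their children.
        pending = [parent] + pending
    raise Exception(result)


# Hypothetical in-memory stand-ins for the repository and the dashboard.
HISTORY = {
    'a': {'revision': 'a', 'parent': None},
    'b': {'revision': 'b', 'parent': 'a'},
    'c': {'revision': 'c', 'parent': 'b'},
}
known_to_server = {'a'}


def fake_submit(commits):
    batch = {commit['revision'] for commit in commits}
    for commit in commits:
        parent = commit['parent']
        if parent is not None and parent not in known_to_server and parent not in batch:
            return {'status': 'FailedToFindParentCommit', 'commit': commit}
    known_to_server.update(batch)
    return {'status': 'OK'}


submitted = submit_with_ancestors([HISTORY['c']], fake_submit, HISTORY.get)
print([commit['revision'] for commit in submitted])
```
</pre>
<p>Bounding the loop with max_ancestor_fetch_count keeps a long chain of unknown ancestors from turning one sync iteration into an unbounded walk through history.</p>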

<h3>Modified Paths</h3>
<ul>
<li><a href="#trunkWebsitesperfwebkitorgChangeLog">trunk/Websites/perf.webkit.org/ChangeLog</a></li>
<li><a href="#trunkWebsitesperfwebkitorgtoolssynccommitspy">trunk/Websites/perf.webkit.org/tools/sync-commits.py</a></li>
<li><a href="#trunkWebsitesperfwebkitorgtoolsutilpy">trunk/Websites/perf.webkit.org/tools/util.py</a></li>
</ul>

</div>
<div id="patch">
<h3>Diff</h3>
<a id="trunkWebsitesperfwebkitorgChangeLog"></a>
<div class="modfile"><h4>Modified: trunk/Websites/perf.webkit.org/ChangeLog (210981 => 210982)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/Websites/perf.webkit.org/ChangeLog        2017-01-20 22:02:09 UTC (rev 210981)
+++ trunk/Websites/perf.webkit.org/ChangeLog        2017-01-20 22:04:23 UTC (rev 210982)
</span><span class="lines">@@ -1,5 +1,35 @@
</span><span class="cx"> 2017-01-20  Ryosuke Niwa  &lt;rniwa@webkit.org&gt;
</span><span class="cx"> 
</span><ins>+        Make sync-commits.py robust against missing Subversion authors and missing parent Git commits
+        https://bugs.webkit.org/show_bug.cgi?id=167231
+
+        Reviewed by Antti Koivisto.
+
+        Fixed a bug that a subversion commit that's missing author name (anonymous commit) results in an out of bound
+        exception, and a bug that syncing a git repository starts failing once there was a merge commit which pulled
+        in a commit data earlier than that of the last reported commit.
+
+        For the latter fix, added --max-ancestor-fetch-count to specify the number of maximum commits to look back.
+
+        * tools/sync-commits.py:
+        (main): Added --max-ancestor-fetch-count.
+        (Repository.fetch_commits_and_submit): If submit_commits fails with FailedToFindParentCommit, fetch the parent
+        commit's information until we've resolved them all.
+        (Repository.fetch_next_commit): Renamed from fetch_commit.
+        (SVNRepository.fetch_next_commit): Renamed from fetch_commit. Don't try to get the author name if it's missing
+        due to an anonymous commit. It's important to never include the &quot;author&quot; field in the JSON submitted to
+        a dashboard since it rejects when &quot;author&quot; field is not an array (e.g. null). 
+        (GitRepository.fetch_next_commit): Renamed from fetch_commit.
+        (GitRepository.fetch_commit): Added. Fetches the commit information for a given git hash. Used to retrieve
+        missing parent commits.
+        (GitRepository._revision_from_tokens): Extracted from fetch_commit.
+
+        * tools/util.py:
+        (submit_commits): Optionally takes status_to_accept to avoid throwing in the case of FailedToFindParentCommit
+        and returns the response JSON.
+
+2017-01-20  Ryosuke Niwa  &lt;rniwa@webkit.org&gt;
+
</ins><span class="cx">         REGRESSION(r198234): /api/commits/%revision% always fails
</span><span class="cx">         https://bugs.webkit.org/show_bug.cgi?id=167235
</span><span class="cx"> 
</span></span></pre></div>
<a id="trunkWebsitesperfwebkitorgtoolssynccommitspy"></a>
<div class="modfile"><h4>Modified: trunk/Websites/perf.webkit.org/tools/sync-commits.py (210981 => 210982)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/Websites/perf.webkit.org/tools/sync-commits.py        2017-01-20 22:02:09 UTC (rev 210981)
+++ trunk/Websites/perf.webkit.org/tools/sync-commits.py        2017-01-20 22:04:23 UTC (rev 210982)
</span><span class="lines">@@ -23,6 +23,7 @@
</span><span class="cx">     parser.add_argument('--server-config-json', required=True, help='The path to a JSON file that specifies the perf dashboard')
</span><span class="cx">     parser.add_argument('--seconds-to-sleep', type=float, default=900, help='The seconds to sleep between iterations')
</span><span class="cx">     parser.add_argument('--max-fetch-count', type=int, default=10, help='The number of commits to fetch at once')
</span><ins>+    parser.add_argument('--max-ancestor-fetch-count', type=int, default=100, help='The number of commits to fetch at once if some commits are missing parents')
</ins><span class="cx">     args = parser.parse_args()
</span><span class="cx"> 
</span><span class="cx">     with open(args.repository_config_json) as repository_config_json:
</span><span class="lines">@@ -32,7 +33,7 @@
</span><span class="cx">         server_config = load_server_config(args.server_config_json)
</span><span class="cx">         for repository in repositories:
</span><span class="cx">             try:
</span><del>-                repository.fetch_commits_and_submit(server_config, args.max_fetch_count)
</del><ins>+                repository.fetch_commits_and_submit(server_config, args.max_fetch_count, args.max_ancestor_fetch_count)
</ins><span class="cx">             except Exception as error:
</span><span class="cx">                 print &quot;Failed to fetch and sync:&quot;, error
</span><span class="cx"> 
</span><span class="lines">@@ -56,7 +57,7 @@
</span><span class="cx">         self._name = name
</span><span class="cx">         self._last_fetched = None
</span><span class="cx"> 
</span><del>-    def fetch_commits_and_submit(self, server_config, max_fetch_count):
</del><ins>+    def fetch_commits_and_submit(self, server_config, max_fetch_count, max_ancestor_fetch_count):
</ins><span class="cx">         if not self._last_fetched:
</span><span class="cx">             print &quot;Determining the stating revision for %s&quot; % self._name
</span><span class="cx">             self._last_fetched = self.determine_last_reported_revision(server_config)
</span><span class="lines">@@ -63,7 +64,7 @@
</span><span class="cx"> 
</span><span class="cx">         pending_commits = []
</span><span class="cx">         for unused in range(max_fetch_count):
</span><del>-            commit = self.fetch_commit(server_config, self._last_fetched)
</del><ins>+            commit = self.fetch_next_commit(server_config, self._last_fetched)
</ins><span class="cx">             if not commit:
</span><span class="cx">                 break
</span><span class="cx">             pending_commits += [commit]
</span><span class="lines">@@ -73,17 +74,33 @@
</span><span class="cx">             print &quot;No new revision found for %s (last fetched: %s)&quot; % (self._name, self.format_revision(self._last_fetched))
</span><span class="cx">             return
</span><span class="cx"> 
</span><del>-        revision_list = ', '.join([self.format_revision(commit['revision']) for commit in pending_commits])
</del><ins>+        for unused in range(max_ancestor_fetch_count):
+            revision_list = ', '.join([self.format_revision(commit['revision']) for commit in pending_commits])
+            print &quot;Submitting revisions %s for %s to %s&quot; % (revision_list, self._name, server_config['server']['url'])
</ins><span class="cx"> 
</span><del>-        print &quot;Submitting revisions %s for %s to %s&quot; % (revision_list, self._name, server_config['server']['url'])
</del><ins>+            result = submit_commits(pending_commits, server_config['server']['url'],
+                server_config['slave']['name'], server_config['slave']['password'], ['OK', 'FailedToFindParentCommit'])
</ins><span class="cx"> 
</span><del>-        submit_commits(pending_commits, server_config['server']['url'],
-            server_config['slave']['name'], server_config['slave']['password'])
</del><ins>+            if result.get('status') == 'OK':
+                break
</ins><span class="cx"> 
</span><ins>+            if result.get('status') == 'FailedToFindParentCommit':
+                parent_commit = self.fetch_commit(server_config, result['commit']['parent'])
+                if not parent_commit:
+                    raise Exception('Could not find the parent %s of %s' % (result['commit']['parent'], result['commit']['revision']))
+                pending_commits = [parent_commit] + pending_commits
+
+        if result.get('status') != 'OK':
+            raise Exception(result)
+
</ins><span class="cx">         print &quot;Successfully submitted.&quot;
</span><span class="cx">         print
</span><span class="cx"> 
</span><span class="cx">     @abstractmethod
</span><ins>+    def fetch_next_commit(self, server_config, last_fetched):
+        pass
+
+    @abstractmethod
</ins><span class="cx">     def fetch_commit(self, server_config, last_fetched):
</span><span class="cx">         pass
</span><span class="cx"> 
</span><span class="lines">@@ -115,7 +132,7 @@
</span><span class="cx">         self._use_server_auth = use_server_auth
</span><span class="cx">         self._account_name_script_path = account_name_script_path
</span><span class="cx"> 
</span><del>-    def fetch_commit(self, server_config, last_fetched):
</del><ins>+    def fetch_next_commit(self, server_config, last_fetched):
</ins><span class="cx">         if not last_fetched:
</span><span class="cx">             # FIXME: This is a problematic if dashboard can get results for revisions older than oldest_revision
</span><span class="cx">             # in the future because we never refetch older revisions.
</span><span class="lines">@@ -139,19 +156,24 @@
</span><span class="cx"> 
</span><span class="cx">         xml = parseXmlString(output)
</span><span class="cx">         time = text_content(xml.getElementsByTagName(&quot;date&quot;)[0])
</span><del>-        author_account = text_content(xml.getElementsByTagName(&quot;author&quot;)[0])
</del><ins>+        author_elements = xml.getElementsByTagName(&quot;author&quot;)
+        author_account = text_content(author_elements[0]) if author_elements.length else None
</ins><span class="cx">         message = text_content(xml.getElementsByTagName(&quot;msg&quot;)[0])
</span><span class="cx"> 
</span><del>-        name = self._resolve_author_name(author_account) if self._account_name_script_path else None
</del><ins>+        name = self._resolve_author_name(author_account) if author_account and self._account_name_script_path else None
</ins><span class="cx"> 
</span><del>-        return {
</del><ins>+        result = {
</ins><span class="cx">             'repository': self._name,
</span><span class="cx">             'revision': revision_to_fetch,
</span><span class="cx">             'time': time,
</span><del>-            'author': {'account': author_account, 'name': name},
</del><span class="cx">             'message': message,
</span><span class="cx">         }
</span><span class="cx"> 
</span><ins>+        if author_account:
+            result['author'] = {'account': author_account, 'name': name}
+
+        return result
+
</ins><span class="cx">     def _resolve_author_name(self, account):
</span><span class="cx">         try:
</span><span class="cx">             output = subprocess.check_output(self._account_name_script_path + [account])
</span><span class="lines">@@ -177,7 +199,7 @@
</span><span class="cx">         self._git_url = git_url
</span><span class="cx">         self._tokenized_hashes = []
</span><span class="cx"> 
</span><del>-    def fetch_commit(self, server_config, last_fetched):
</del><ins>+    def fetch_next_commit(self, server_config, last_fetched):
</ins><span class="cx">         if not last_fetched:
</span><span class="cx">             self._fetch_all_hashes()
</span><span class="cx">             tokens = self._tokenized_hashes[0]
</span><span class="lines">@@ -188,7 +210,16 @@
</span><span class="cx">                 tokens = self._find_next_hash(last_fetched)
</span><span class="cx">                 if not tokens:
</span><span class="cx">                     return None
</span><ins>+        return self._revision_from_tokens(tokens)
</ins><span class="cx"> 
</span><ins>+    def fetch_commit(self, server_config, hash_to_find):
+        assert(self._tokenized_hashes)
+        for i, tokens in enumerate(self._tokenized_hashes):
+            if tokens and tokens[0] == hash_to_find:
+                return self._revision_from_tokens(tokens)
+        return None
+
+    def _revision_from_tokens(self, tokens):
</ins><span class="cx">         current_hash = tokens[0]
</span><span class="cx">         commit_time = int(tokens[1])
</span><span class="cx">         author_email = tokens[2]
</span></span></pre></div>
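<p>The author handling in SVNRepository.fetch_next_commit above can be tried in isolation with the following Python 3 sketch. It assumes that an anonymous commit produces a log entry with no author element at all, which is what the length check in the patch guards against. build_log_entry, commit_from_log_entry and this text_content are hypothetical helpers written for the illustration, and the date and message values are placeholders.</p>
<pre>
```python
from xml.dom.minidom import Document


def build_log_entry(author=None):
    """Hypothetical helper: build a stand-in for one svn log --xml entry."""
    doc = Document()
    entry = doc.createElement('logentry')
    doc.appendChild(entry)
    fields = [('author', author), ('date', '2017-01-20T22:04:23.000000Z'), ('msg', 'Example message')]
    for tag, text in fields:
        if text is None:
            continue  # anonymous commit: no author element at all
        element = doc.createElement(tag)
        element.appendChild(doc.createTextNode(text))
        entry.appendChild(element)
    return doc


def text_content(element):
    """Concatenate the text nodes directly under an element."""
    return ''.join(node.data for node in element.childNodes if node.nodeType == node.TEXT_NODE)


def commit_from_log_entry(doc, repository, revision):
    author_elements = doc.getElementsByTagName('author')
    author_account = text_content(author_elements[0]) if author_elements.length else None
    result = {
        'repository': repository,
        'revision': revision,
        'time': text_content(doc.getElementsByTagName('date')[0]),
        'message': text_content(doc.getElementsByTagName('msg')[0]),
    }
    if author_account:
        # Only attach the field when there is an author to report.
        result['author'] = {'account': author_account, 'name': None}
    return result


anonymous = commit_from_log_entry(build_log_entry(), 'WebKit', 210982)
named = commit_from_log_entry(build_log_entry('rniwa@webkit.org'), 'WebKit', 210982)
print(sorted(anonymous))
print(named['author'])
```
</pre>
<p>Leaving the key out, rather than setting it to None, matters because None would be serialized as a JSON null, which the log message says the dashboard rejects.</p>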
<a id="trunkWebsitesperfwebkitorgtoolsutilpy"></a>
<div class="modfile"><h4>Modified: trunk/Websites/perf.webkit.org/tools/util.py (210981 => 210982)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/Websites/perf.webkit.org/tools/util.py        2017-01-20 22:02:09 UTC (rev 210981)
+++ trunk/Websites/perf.webkit.org/tools/util.py        2017-01-20 22:04:23 UTC (rev 210982)
</span><span class="lines">@@ -3,7 +3,7 @@
</span><span class="cx"> import urllib2
</span><span class="cx"> 
</span><span class="cx"> 
</span><del>-def submit_commits(commits, dashboard_url, slave_name, slave_password):
</del><ins>+def submit_commits(commits, dashboard_url, slave_name, slave_password, status_to_accept=['OK']):
</ins><span class="cx">     try:
</span><span class="cx">         payload = json.dumps({
</span><span class="cx">             'slaveName': slave_name,
</span><span class="lines">@@ -20,8 +20,9 @@
</span><span class="cx">         except Exception, error:
</span><span class="cx">             raise Exception(error, output)
</span><span class="cx"> 
</span><del>-        if result.get('status') != 'OK':
</del><ins>+        if result.get('status') not in status_to_accept:
</ins><span class="cx">             raise Exception(result)
</span><ins>+        return result
</ins><span class="cx">     except Exception as error:
</span><span class="cx">         sys.exit('Failed to submit commits: %s' % str(error))
</span><span class="cx"> 
</span></span></pre>
</div>
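<p>The status check that submit_commits gains above can be exercised without a network round trip. The Python 3 sketch below is an illustration only: parse_dashboard_response is a hypothetical helper, not a function in tools/util.py, and the response body is hand-built to match the fields the caller reads. It uses a tuple as the default for status_to_accept. The list default in the patch behaves the same way as long as it is never mutated.</p>
<pre>
```python
import json


def parse_dashboard_response(output, status_to_accept=('OK',)):
    """Hypothetical helper: decode a response body and enforce the accepted statuses."""
    result = json.loads(output)
    if result.get('status') not in status_to_accept:
        raise Exception(result)
    return result


accepted = ('OK', 'FailedToFindParentCommit')
response = json.dumps({'status': 'FailedToFindParentCommit', 'commit': {'revision': 'c', 'parent': 'b'}})

# With the wider list, the caller gets the response back and can inspect it.
result = parse_dashboard_response(response, accepted)
print(result['status'])

# With the default, the same response is still treated as a failure.
try:
    parse_dashboard_response(response)
    rejected = False
except Exception:
    rejected = True
print(rejected)
```
</pre>
<p>Keeping 'OK' as the only default means existing callers of submit_commits see no change in behavior, and only the commit syncing loop opts in to handling the missing parent case itself.</p>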
</div>

</body>
</html>