[webkit-reviews] review denied: [Bug 169462] Add a script to automate W3c web-platform-tests pull request creations from WebKit commits : [Attachment 309359] Adding URL to Chromium license

Mon May 15 20:56:16 PDT 2017

Ryosuke Niwa <rniwa at webkit.org> has denied youenn fablet <youennf at gmail.com>'s
request for review:
Bug 169462: Add a script to automate W3c web-platform-tests pull request
creations from WebKit commits
https://bugs.webkit.org/show_bug.cgi?id=169462

Attachment 309359: Adding URL to Chromium license

https://bugs.webkit.org/attachment.cgi?id=309359&action=review

--- Comment #15 from Ryosuke Niwa <rniwa at webkit.org> ---
Comment on attachment 309359
  --> https://bugs.webkit.org/attachment.cgi?id=309359
Adding URL to Chromium license

View in context: https://bugs.webkit.org/attachment.cgi?id=309359&action=review

It's exciting to get this script working.

By the way, the question of where to put tests to be exported hasn't been
settled.
e.g. Maciej in
https://lists.webkit.org/pipermail/webkit-dev/2017-May/029022.html says:
" think it would be cleaner to have a separate directory of tests intended for
[export], separate from imported tests. It could be right next to
imported/w3c/web-platform-tests. I think mixing tests that are imported from
upstream and tests intended for eventual upstreaming is more confusing than
helpful."

> Tools/Scripts/webkitpy/common/checkout/scm/git.py:587
> +    def reset_hard(self, revision):

Instead of just using the Git lingo, we should clarify what it does.

> Tools/Scripts/webkitpy/common/checkout/scm/git.py:588
> +	   return self._run_git(['reset', '--hard', revision])

It's strange that we sometimes call this function with revision being a Git
hashish but other times it's a branch name.

> Tools/Scripts/webkitpy/common/checkout/scm/git.py:590
> +    def am(self, options):

Please give a more descriptive name like "apply_mail_patch" or
"apply_git_patch".

> Tools/Scripts/webkitpy/common/checkout/scm/git.py:593
> +	   args = ['am']
> +	   args.extend(options)
> +	   return self._run_git(args)

More idiomatic way to write this in Python would be self._run_git(['am'] +
options)

> Tools/Scripts/webkitpy/common/checkout/scm/git.py:595
> +    def commit(self, options):

I'd call this local_commit or commit_local since scm interface is shared with
SVN.

> Tools/Scripts/webkitpy/common/checkout/scm/git.py:602
> +    def format_patch(self, options):
> +	   args = ['format-patch']
> +	   args.extend(options)

We should just use scm().create_patch(git_commit=commit_id) instead.
r- because of this feature duplication.

> Tools/Scripts/webkitpy/w3c/test_exporter.py:51
> +	   self.bugzilla = bugzillaClass()

These instance variables should be prefixed with _ since they're more or less
private to this class.
If they're intended to be publicly accessible variable, then we have too many
of those in this class.

> Tools/Scripts/webkitpy/w3c/test_exporter.py:52
> +	   self.webKitGit = gitClass('.', None, executive=self._host.executive,
filesystem=self._filesystem)

In general, I don't really like tools relying on the current working directory
like this.
It's fine to use the current working directory in the case no option was
specified,
but it's better if the directory name of the WPT checkout was explicitly
specifiable in an option
so that it's easier to call this function from another script for example.

Also, why are we assuming that WebKit checkout is a Git checkout?
Why can't we just let the host object automatically detect the SCM type?

> Tools/Scripts/webkitpy/w3c/test_exporter.py:60
> +	   self.repository_directory = options.repository_directory

It's not obvious at all whether this repository refers to WebKit checkout to
WPT checkout.
We should probably prefix each and every instance variable in this class by
_webkit_ or _wpt_.

> Tools/Scripts/webkitpy/w3c/test_exporter.py:66
> +	   self.username = options.username
> +	   if not self.username:
> +	       self.username = os.environ.get('GITHUB_WPT_USER')
> +	       if not self.username:
> +		   raise ValueError("Missing GitHub username.")

We should also get it from git config user.name and keychain. See
webkitpy/common/net/credentials.py
r- because of this.

> Tools/Scripts/webkitpy/w3c/test_exporter.py:78
> +	   self.branch_name = "webkit-" + str(self.bug_id)
> +	   self.bugzilla_url = "https://bugs.webkit.org/show_bug.cgi?id=" +
str(self.bug_id)
> +	   self.commit_title = 'WebKit export of ' + self.bugzilla_url

We should use % or .format.

I also think "webkit-169462" is not exotic enough not to conflict with
user-created branch name.
It should be something more unique like "wpt-export-for-webkit-169462"

In addition, it's ambiguous as to what commit_title means in this context. It's
the commit message / PR description in WPT,
so I think a better name would be self._wpt_commit_message or
self._wpt_pr_title.

Finally, "WebKit export of 169462" is a strange title for a PR request.
I would have phrased it like "Test updates for WebKit bug 169462".

> Tools/Scripts/webkitpy/w3c/test_exporter.py:82
> +	   if not self.repository_remote:
> +	       self.repository_remote = self.username

What's the point of this code? Why do we need to push to a specific user's
repo?
I've never had to specify repository name like this since I usually only have
one remote in my GitHub fork.

> Tools/Scripts/webkitpy/w3c/test_exporter.py:84
> +	   self.repository_origin_url = "https://" + self.username +
"@github.com/" + self.username + "/web-platform-tests.git"

We should just run git config --get remote.origin.url instead.

> Tools/Scripts/webkitpy/w3c/test_exporter.py:85
> +	   self.repository_public_url = "https://github.com/" + self.username +
"/web-platform-tests.git" if not options.repository_public_url else
options.repository_public_url

This instance variable is never used other than the line below so it should be
a local variable.

> Tools/Scripts/webkitpy/w3c/test_exporter.py:86
> +	   self.repository_public_tree_url = self.repository_public_url[:-4] +
"/tree/"

Are we truncating ".git" by [:-4]? That is not obvious at all.
Also, it looks like this is only used in logging.
If we used "git config --get" above (please add a new method on Git class),
then we should be able to parse that result and construct this URL instead.

> Tools/Scripts/webkitpy/w3c/test_exporter.py:95
> +    def _init_repository(self, url, repository_directory, gitClass):

I think we should call this _ensure_wpt_checkout or something since we're not
init'ing a new git repository.

> Tools/Scripts/webkitpy/w3c/test_exporter.py:109
> +	   patchOptions = ["--no-update", "--no-clean", "--local-commit"]

Nit: should be patch_options.

> Tools/Scripts/webkitpy/w3c/test_exporter.py:121
> +	   # webKitPatch =
WebKitPatch(self._filesystem.join(self._filesystem.abspath(__file__), "..",
".."))
> +	   #(options, args) = webKitPatch.parse_args(patchOptions)
> +	   #return webKitPatch.check_arguments_and_execute(options, args)

Please remove this commented out code. r- because of this.

> Tools/Scripts/webkitpy/w3c/test_exporter.py:128
> +	   self.git.reset_hard('origin/master')
> +	   if not self.options.git_commit:
> +	       self.webKitGit.reset_hard('HEAD~1')

It's confusing that we're calling reset_hard with a branch name or hashish.

> Tools/Scripts/webkitpy/w3c/test_exporter.py:138
> +	       _log.info("Retrying to create the branch")
> +	       self.git.delete_branch(self.branch_name)
> +	       self.git.checkout_new_branch(self.branch_name)

I don't think we should automatically delete a branch created by an user like
this.
We should ask the user to confirm it's okay to delete the branch at minimum.
We can also add a flag to automatically delete the branch if there is a
conflicting one
but doing so by default without giving any warning to the user is dangerous.
r- because of this.

> Tools/Scripts/webkitpy/w3c/test_exporter.py:141
> +	   try:
> +	       self.git.am([patch, '--exclude', '*-expected.txt'])
> +	   except Exception as e:

It's not a good idea to capture any exception like this.
Also, capturing the exception should probably happen in git.apply_patch (or
whatever its name will be) instead.
That way, that function can take a patch name and files to exclude as arguments
instead o an arbitrary list of options.

> Tools/Scripts/webkitpy/w3c/test_exporter.py:148
> +    def push_to_public_repository(self):

This is a very vague name. It should be something like push_to_remote_wpt_fork.

> Tools/Scripts/webkitpy/w3c/test_exporter.py:157
> +	   response = self.gitHub.create_pr(self.repository_remote + ':' +
self.branch_name, self.commit_title, description)

It's not obvious at all that the first argument is a branch name.
I think we should use the name arguments as in:
create_pr(remote_branch_name=self.repository_remote + ':' + self.branch_name).

> Tools/Scripts/webkitpy/w3c/test_exporter.py:170
> +	   patch_data = self.webKitGit.format_patch([git_commit,
'LayoutTests/imported/w3c/web-platform-tests', '--stdout'])

Again, we should be using scm.create_patch(git_commit=commit_id) instead.

> Tools/Scripts/webkitpy/w3c/test_exporter.py:175
> +	   patch_data =
patch_data.replace('LayoutTests/imported/w3c/web-platform-tests/', '')

It seems a little risky to do a global replaccement like this but I suppose
text like this would never appear in the tests themselves.

> Tools/Scripts/webkitpy/w3c/test_exporter.py:184
> +	       _log.info("Will push branch to " + self.git.remote(["get-url",
self.repository_remote]).strip())
> +	       return
> +	   _log.info("Will push branch to " + self.repository_origin_url)

Again, it's not obvious what repository_remote and repository_origin_url refer
to.

> Tools/Scripts/webkitpy/w3c/test_exporter.py:222
> +    -    USERNAME may be set using the environment variable GITHUB_WPT_USER
or as a command line option.
> +    -    Github credential may be set using the environment variable
GITHUB_WPT_AUTH or as a command line option. (Please provide a valid GitHub
'Personal access token' with 'repo' as scope)

We should support pulling username & password from the keychain.
Having to do this extra step is really annoying.
Even asking the user the password manually is better than requiring the user to
get this extra token thing.
Also, if you have SSH key setup with GitHub, it should just work without having
to do any authentication.