[webkit-reviews] review denied: [Bug 183888] Add a bisect button to automatically schedule bisecting A/B tasks. : [Attachment 337249] Patch

Thu Apr 5 23:46:23 PDT 2018

Ryosuke Niwa <rniwa at webkit.org> has denied dewei_zhu at apple.com's request for
review:
Bug 183888: Add a bisect button to automatically schedule bisecting A/B tasks.
https://bugs.webkit.org/show_bug.cgi?id=183888

Attachment 337249: Patch

https://bugs.webkit.org/attachment.cgi?id=337249&action=review

--- Comment #13 from Ryosuke Niwa <rniwa at webkit.org> ---
Comment on attachment 337249
  --> https://bugs.webkit.org/attachment.cgi?id=337249
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=337249&action=review

This patch is getting really big. Why don't we split off patches to:
- Add diff'ing function to CommitSet (we can start using it in analysis task
pages now!)
- Refactor the logic to create a new test group name for an analysis task (and
use it in analysis task page!)

> Websites/perf.webkit.org/public/v3/analysis-strategies.js:89
> +	   for (const repository of repositoriesSortedByCommitCountInRange) {

As we discussed in person, we should probably sort all commits based
on commit time across repositories and fallback to time-less commits' ordering.
It's unfortunate that we don't necessarily have a guarantee that all commits
in a given repository has commit_time set or not set.

> Websites/perf.webkit.org/public/v3/analysis-strategies.js:121
> +	   return [
> +	       {name:
CommitSet.describeDifferencesBetweenCommitSets(firstCommitSet,
middleCommitSet), commitSetList: [firstCommitSet, middleCommitSet]},
> +	       {name:
CommitSet.describeDifferencesBetweenCommitSets(middleCommitSet, lastCommitSet),
commitSetList: [middleCommitSet, lastCommitSet]}
> +	   ];

This function should really just return middleCommitSet
and let the caller make calls to create test groups.

> Websites/perf.webkit.org/public/v3/analysis-strategies.js:124
> +    static async _sortingToolsForCommitSets(rangeSpecifiedByCommitSets,
allCommitSets)

What does sortingTools means in this context?
I really don't like these free-standing utility functions that does a lot of
magical things inside.
It's a lot better if
_findAndSortCommitSetsWithSameRepositorySetAsCommitSetInTargetTestGroup fetched
commits logs, etc...

r- because this function needs to be broken up & rewritten.

> Websites/perf.webkit.org/public/v3/analysis-strategies.js:126
> +	   const [oneCommitSet, anotherCommitSet] = rangeSpecifiedByCommitSets;

If rangeSpecifiedByCommitSets is actually a range, these commit sets should
surely be called startSet and endSet.

> Websites/perf.webkit.org/public/v3/analysis-strategies.js:127
> +	   if (oneCommitSet.latestCommitTime() &&
anotherCommitSet.latestCommitTime())

It's super unclear what the trune-ness of latestCommitTime() means.

> Websites/perf.webkit.org/public/v3/analysis-strategies.js:130
> +		   hasComparingKeys: (commitSet) => !!
commitSet.latestCommitTime()

There should be no space between !! and commitSet.latestCommitTime.

> Websites/perf.webkit.org/public/v3/analysis-strategies.js:143
> +	       sortingRepositories.push(repository);

So we'd put a repository to sortingRepositories only if it had ordering? That
seems backwards.
If anything, we want them if latestCommitTime wasn't set & order is present.

> Websites/perf.webkit.org/public/v3/analysis-strategies.js:185
> +    static async
_findAndSortCommitSetsWithSameRepositorySetAsCommitSetInTargetTestGroup(commitS
etsInTestGroup, commitSetsToBeFiltered)

I think it's simpler to make this function take a single commit set, and a list
of commits to filter from
since we don't currently support filtering based on combined set of
repositories.
We can restructure the code as needs come up in the future.

In general, I find that the over generalization in code is usually a bad idea
because we can't predict
what the future code & feature would look like until we implement them.

> Websites/perf.webkit.org/public/v3/analysis-strategies.js:190
> +	   if (commitSetsInTestGroup.length != 2) {
> +	       AnalysisStrategies.logger.error('Can only bisect a test group
with 2 commit sets');
> +	       return [];
> +	   }

Just assert this.

> Websites/perf.webkit.org/public/v3/analysis-strategies.js:192
> +	   const bothSimpleCommitSets = commitSetsInTestGroup.every((commitSet)
=> commitSet.withoutRootPatchOrOwnedCommit());

What's the point of having a local variable here?

> Websites/perf.webkit.org/public/v3/analysis-strategies.js:193
> +	   if (!bothSimpleCommitSets) {

We should simply return without emitting an error message since we don't
legitimately support this.

> Websites/perf.webkit.org/public/v3/analysis-strategies.js:199
> +	   const sameRepositories = (commitSet) =>
commitSet.repositories().length == allowedRepositories.size &&

This should really be a helper function on CommitSet.
Its name should be a verb, not noun / adjective. e.g.
specifiesSameRepositories(commitSet).
In fact, it's redundant to create a set of repositories out of repositories()
here
since CommitSet already has a bunch of sets that maps repository to a commit.
You can just check those sets directly instead.

> Websites/perf.webkit.org/public/v3/analysis-strategies.js:204
> +	       AnalysisStrategies.logger.error('CommitSets in test group have
different repositories');
> +	       return [];

Again, we should simply exit early without an error message.
In my view, these error messages need to be generated by a higher level.
e.g. there is no use in showing these error messages to an end user in the
front end for example.
So there is no reason to generate them.
Whatever backend code we write should be the one generating error messages.

> Websites/perf.webkit.org/public/v3/analysis-strategies.js:208
> +	   const sortingTools = await
RangeSplitStrategies._sortingToolsForCommitSets(commitSetsInTestGroup,
commitSetsWithSameRepositories);

It's super confusing that rangeSpecifiedByCommitSets is sometimes called
commitSetsInTestGroup.
Also, this isn't really a "range". It's simply a tuple of commit sets.
Why don't we call it commitSetsToSplit?

> Websites/perf.webkit.org/public/v3/models/analysis-task.js:170
> +    async analysisConfigForTestGroup(testGroup)

What the heck is a config? Please don't make up a word like that.
Every new term we introduce is a new complexity a new maintainer of this
codebase has to decipher.

> Websites/perf.webkit.org/public/v3/models/analysis-task.js:177
> +	   const allTestGroupsInTask = await TestGroup.fetchForTask(this.id());
> +	   const existingTestGroupNames = new Set;
> +	   for (const group of allTestGroupsInTask)
> +	       existingTestGroupNames.add(group.name());

This should really be a method on TestGroup or AnalysisTask.
AnalysisTaskPage's _hasDuplicateTestGroupName should be replaced by that.

> Websites/perf.webkit.org/public/v3/models/analysis-task.js:304
> +    static createNameWithoutCollision(name, hasDuplicateTestGroupName)

As we discussed, we should have this in CommitSet instead.

> Websites/perf.webkit.org/public/v3/models/analysis-task.js:309
> +	   var nameWithNumberMatch = name.match(/(.+?)\s*\(\s*(\d+)\s*\)\s*$/);
> +	   var number = 1;

Please use const & let.

> Websites/perf.webkit.org/public/v3/models/analysis-task.js:315
> +	   var newName;

Please use let.

> Websites/perf.webkit.org/public/v3/models/commit-log.js:45
> +    order() { return +this._rawData['order']; }

It doesn't make any sense to force it to be a number if it's null / undefined.

> Websites/perf.webkit.org/public/v3/models/commit-set.js:91
> +    withoutRootPatchOrOwnedCommit()

As we discussed in person, this needs to be a verb; e.g.
containsRootPatchOrOwnedCommit().

> Websites/perf.webkit.org/public/v3/models/commit-set.js:164
> +    static describeDifferencesBetweenCommitSets(firstCommitSet,
secondCommitSet)

Why not simply diff?

> Websites/perf.webkit.org/public/v3/models/commit-set.js:170
> +	   const missingPatch = {filename: () => 'missing'};

I think "none" is a more appropriate term to be used here
since it's okay for one side to be missing a patch.
In fact, we'd expect that in most A/B testing.

> Websites/perf.webkit.org/public/v3/models/commit-set.js:174
> +		   const newName =
AnalysisTask.createNameWithoutCollision(name, (name) =>
existingNameSet.has(name));

As we discussed in person, CommitSet shouldn't rely on AnalysisTask like this.

> Websites/perf.webkit.org/public/v3/models/commit-set.js:180
> +	   for (const repository of Array.from(allRepositories)) {

We should probably use Repository.sortByNamePreferringOnesWithURL here.

> Websites/perf.webkit.org/public/v3/models/commit-set.js:194
> +	       const nameForFirstPatch = nameGenerator(firstPatch.filename());
> +	       const nameForSecondPath = nameGenerator(secondPatch.filename());

We should probably limit the length of a filename and abbreviate them as
needed.

> Websites/perf.webkit.org/public/v3/models/commit-set.js:211
> +	       nameParts.push(`Roots: ${leftRootFileDescription.length ?
leftRootFileDescription : 'missing'} - ${rightRootFileDescription.length ?
rightRootFileDescription : 'missing'}`);

I think "none" is more appropriate term to be used here
since it's not like we expect roots to be always present in both sides.

> Websites/perf.webkit.org/public/v3/pages/analysis-task-page.js:347
> +	      
currentGroup.task().analysisConfigForTestGroup(currentGroup).then((analysisConf
igs) =>
> +		   this.content('bisect-form').style.display =
analysisConfigs.length ? null : 'none');

Ugh... this would mean that we'd re-comptue this every time currentGroup
changes.
That seems highly undesirable given it can end up fetching the data from the
server,
and run a whole bunch of computations.
We should compute this result for each test group once and cache the result.
Also, we should never directly update the style attribute asynchronously like
this.

This has a bug that if the current test group is changed multiple times,
whichever finished the last would determine whether the button is visible or
not.
r- for this serious bug.

> Websites/perf.webkit.org/public/v3/pages/analysis-task-page.js:810
> +	   const analysisConfigs = await
this._task.analysisConfigForTestGroup(testGroup);

We shouldn't be re-computing this.
Just store the result somewhere the first time we computed it.