[Webkit-unassigned] [Bug 39184] Add support for parsing a SVN property

bugzilla-daemon at webkit.org bugzilla-daemon at webkit.org
Sun May 16 17:11:48 PDT 2010


https://bugs.webkit.org/show_bug.cgi?id=39184





--- Comment #3 from Daniel Bates <dbates at webkit.org>  2010-05-16 17:11:47 PST ---
(In reply to comment #2)
> (From update of attachment 56197 [details])
> Thanks for this patch, Dan!  Overall this looks pretty good, but several minor comments below.
> 
> > Index: WebKitTools/ChangeLog
> > +        Adds function VCSUtils::parseSvnProperty to parse an SVN property with
> > +        either a single-line or multi-line value.
> 
> I would add the word "change" at the end because the property can have two
> "value" sections if there is both a "-" and "+".  So, "...with either a
> single-line or multi-line value change."

Will add.

> 
> > +
> > +        * Scripts/VCSUtils.pm:
> > +          - Added function parseSvnProperty. We will use this function
> > +          towards resolving Bug #38885 <https://bugs.webkit.org/show_bug.cgi?id=38885>.
> 
> alignment
> 
> > +          - Removed FIXME comment above function parseSvnPropertyValue, since
> > +            it is being used by parseSvnProperty.
> > +          - (*) Modified function parseSvnPropertyValue to break out of "while (<$fileHandle>)"
> > +            loop when it encounters the start of the next property so that it can be
> > +            processed by its caller, parseSvnPropertyValue.
> 
> Instead of saying "(*)", I would say include an additional sentence saying,
> "We reference this bullet below by (*)."  I was confused by what the "(*)"
> meant until I read to the end of the entry.

Will change. 

> > +        * Scripts/webkitperl/VCSUtils_unittest/parseSvnProperty.pl: Added.
> > +          - Added unit tests.
> > +        * Scripts/webkitperl/VCSUtils_unittest/parseSvnPropertyValue.pl:
> > +          - Changed the name of the unit test "simple multi-line '-' change" to
> > +            "single-line '-' change followed by empty line" since the former was an
> > +            incorrect description of this test.
> > +          - Added unit test "single-line '-' change followed by the next property", and
> > +            "multi-line '-' change followed by the next property" to test (*).
> 
> I would insert the word "above" at the end as in "to test (*) above".

Will change.

> 
> > Index: WebKitTools/Scripts/VCSUtils.pm
> > +my $svnPropertyStartRegEx = qr#^(Modified|Name|Added|Deleted): ([^\r\n]+)#; # $1 is the name of the property.
> 
> Should the comment say $2?

Good catch. Will change.

> 
> > +# Parse the next SVN property from the given file handle, and advance the handle so the last
> > +# line read is the first line after the property.
> > +#
> > +# This subroutine dies if the first line is an invalid SVN property
> > +# (i.e. a line that does not match $svnPropertyStartRegEx) or
> > +# the value of the property is missing.
> 
> I would rewrite this as "...if the first line is not a valid start of an
> SVN property (e.g. if the value of the property is missing)."

I am unclear if you are implying we should lump together, for documentation purposes, the cases where we die when the first line is invalid and also when the property value is missing?

Without loss of generality, this function dies if the the first line is not of the form "Added: svn:executable". Following from this, suppose the input to this function is:

Added: svn:executable

Index: Makefile.shared
==========================================================
[...]

This is a malformed patch since it is missing a property value. So, I chose to die if there is no property value. Hence, we also die if the value for the property is missing. Maybe we should assume this is just improbable and/or simply return a property hash whose value and propertyChangeDelta keys are undefined? 

Note, I inadvertently left out that we also die if there is disagreement between the $propertyNamePrefix (e.g. "Added" , "Deleted")  and the value prefix (e.g. '+', '-'), such as:


Added: svn:executable
   - *

> 
> 
> > +# Returns ($propertyHashRef, $lastReadLine):
> > +#   $propertyHashRef: a hash reference representing a SVN property.
> > +#     name: the name of the property.
> > +#     value: the last-most value of the property. For instance, suppose the property is "Modified".
> > +#            Then it has both a '-' and '+' property value in that order. Therefore, the value
> > +#            of this key is the value of the '+' property by ordering (since it is the last-most value).
> 
> I would be consistent with your uses above and use two spaces after each period.
> Also, instead of "last-most", I would just say, "the last property value."

Will change both.

> 
> 
> > +#     propertyChangeDelta: the value 1 or -1 if the property was added or
> > +#                         removed, respectively.
> 
> alignment (from before)

Will fix alignment of second line of comment.

> 
> > +#   $lastReadLine: the line last read from $fileHandle.
> > +#
> > +# FIXME: This method is unused as of (05/16/2010). We will call this function
> > +#        as part of parsing a SVN footer. See <https://bugs.webkit.org/show_bug.cgi?id=38885>.
> 
> Two spaces after each period.  Also do this below (I will stop mentioning it.)

Will change.

> 
> > +sub parseSvnProperty($$)
> > +{
> > +    my ($fileHandle, $line) = @_;
> > +
> > +    $_ = $line;
> > +
> > +    my $propertyName;
> > +    my $propertyNamePrefix;
> 
> Maybe $propertyChangeType instead of prefix?  propertyNamePrefix makes it
> seem like it's the prefix of the name -- e.g. the "svn" in "svn:executable".
> 

I agree. Will change.

> 
> 
> > +    if (/$svnPropertyStartRegEx/) {
> > +        $propertyNamePrefix = $1;
> > +        $propertyName = $2;
> > +    } else {
> > +        die("Failed to find SVN property: \"$_\".");
> > +    }
> > +
> > +    $_ = <$fileHandle>; # Not defined if end-of-file reached.
> > +
> > +    my $propertyValue;
> > +    my $propertyValuePrefix;
> 
> Again, for similar reasons, maybe propertyValueType?
> 
> > +    while (defined($_)) {
> > +        while (defined($_) && /$svnPropertyValueStartRegEx/) {
> > +            # Note, a '-' property may be followed by '+' property in the case of a "Modified"
> > +            # or "Name" property. We only care about the actual change (i.e. the '+' property)
> 
> Maybe "ending value" instead of "actual change" would be clearer?

Will change.

> 
> > +            # in such circumstances. So, we take the property value for the property to be its
> > +            # last parsed property value.
> > +            # FIXME: We may want to consider strictly enforcing a '-', '+' property ordering or
> > +            # add error checking to prevent '+', '+', ..., '+' and other invalid combinations.
> > +            $propertyValuePrefix = $1;
> > +            ($propertyValue, $_) = parseSvnPropertyValue($fileHandle, $_);
> 
> Does this while loop handle the case of empty lines between property values?
> If we don't need to support that for some reason, it might be worth
> mentioning commenting why.

The "svn diff" command does not insert newlines between property values or successive properties. So, these new lines would be part of the property value. This patch does not support property values that end with trailing newlines as it does not have support to disambiguate between these new lines and the new line that precedes the contents of a binary patch. I would suggest adding such support in a future patch. I'll add a FIXME to clarify this.

> 
> > +        }
> > +
> > +        # We skip over everything that isn't an empty line or $svnPropertyStartRegEx. There is an
> > +        # empty line before the contents of a binary patch.
> 
> I'm confused.  Once you have the last property value, why do you need to
> keep looping through the while loop?  It seems like you can just exit
> since you have the last value.  This also makes it seem like you only
> need one while loop and not two nested loops.

We don't need the outer while loop, will remove.

> 
> > +        last if (!defined($_) || /^[\r\n]*\z/ || /$svnPropertyStartRegEx/);
> > +        $_ = <$fileHandle>; # Not defined if end-of-file reached.
> > +    }
> > +
> > +    if (!$propertyValue) {
> > +        die("Failed to find the property value for the SVN property \"$propertyName\": \"$_\".");
> > +    }
> > +
> > +    my $propertyChangeDelta;
> > +    if ($propertyValuePrefix eq '+') {
> > +        $propertyChangeDelta = 1;
> > +    } elsif ($propertyValuePrefix eq '-') {
> > +        $propertyChangeDelta = -1;
> > +    } else {
> > +        # Not reached.
> > +        die;
> > +    }
> > +
> > +    my $expectedChangeDelta;
> 
> I would add a comment here that you are doing additional validation, 
> if that is what you're doing.

Will add comment.

> 
> > +    if ($propertyNamePrefix eq "Name") {
> > +        $expectedChangeDelta = 0;
> > +    } elsif ($propertyNamePrefix eq "Modified") {
> > +        $expectedChangeDelta = 0;
> > +    } elsif ($propertyNamePrefix eq "Added") {
> > +        $expectedChangeDelta = 1;
> > +    } elsif ($propertyNamePrefix eq "Deleted") {
> > +        $expectedChangeDelta = -1;
> > +    } else {
> > +        # Not reached.
> > +        die;
> > +    }
> > +
> > +    if ($propertyChangeDelta == -1 * $expectedChangeDelta) {
> 
> Why aren't you doing "if ($propertyChangeDelta != $expectedChangeDelta)"?

Take $propertyChangeDelta := 1, and $expectedChangeDelta := 0. Then we would die with a mismatch error.

Notice, $expectedChangeDelta is zero when the property type is "Name" or "Modified" since we cannot infer whether this is an addition or a removal of a property. Instead, we must inspect the +/- line(s) that follow this line in the diff.

Alternatively, we could write this as "if ($expectedChangeDelta && $propertyChangeDelta != $expectedChangeDelta)".

> 
> > +        die("Property change prefix \"$propertyValuePrefix\" does not match " .
> > +            "the property name prefix \"$propertyNamePrefix\".");
> 
> I don't think "match" is the right word since they're not really supposed
> to be the same.  Maybe "The final property value type found \"$propertyValueType\"
> does not correspond to the property change type found \"$propertyChangeType\"."?
> 
> How hard would it be to include the correct property value type in the message?

As aforementioned above, it is not always possible to infer what is the correct property value type.

> 
> > -#   $line: the line last read from $fileHandle
> > +#   $line: the line last read from $fileHandle.
> 
> Thanks.
> 
> > -        if (/^$/ || /$svnPropertyValueStartRegEx/) {
> > +        if (/^$/ || /$svnPropertyValueStartRegEx/ || /$svnPropertyStartRegEx/) {
> >              # Note, we may encounter an empty line before the contents of a binary patch.
> >              # Also, we check for $svnPropertyValueStartRegEx because a '-' property may be
> >              # followed by a '+' property in the case of a "Modified" or "Name" property.
> > +            # We check for $svnPropertyStartRegEx because it indicates the start of the
> > +            # next property to parse.
> 
> Does this mean that there is not always an empty line separating one
> property change from another?

Right. As aforementioned above, the "svn diff" command does not insert newlines between property values. 

> 
> > Index: WebKitTools/Scripts/webkitperl/VCSUtils_unittest/parseSvnProperty.pl
> 
> Thanks for your unit tests.  It seems like you should be putting lines
> after the property changes in each of your test cases to test the
> expectedNextLine and last read line in each case.
> 
> > +{
> > +    # New test
> > +    diffName => "simple: add svn:executable",
> > +    inputText => <<'END',
> > +Added: svn:executable
> > +   + *
> 
> For example, include a blank line and a line marking the beginning of the
> next diff here.  In most use cases, you will have stuff after the property
> value.  This will also let you test your handling of empty lines, etc.
> in relation to the while loops above.

Will add a case for the simple scenario. This is already covered for multi-lines in tests:
""multi-line '+' change and start of binary patch", and "multi-line '-' change followed by multi-line '+' change followed by start of binary patch".

> 
> > +{
> > +    # New test
> > +    diffName => "multi-line '+' change, followed by svn:executable",
> > +    inputText => <<'END',
> > +Name: documentation
> > +   + A
> > +long sentence that spans
> > +multiple lines.
> > +Name: svn:executable
> > +   + *
> > +END
> 
> I thought there was also an empty line between successive properties,
> or did you recently find that that is not the case?

As aforementioned, the "svn diff" command does not insert newlines between successive properties. The property value may contain trailing newlines. This patch does not support properties whose values end in trailing new lines (mentioned above).

-- 
Configure bugmail: https://bugs.webkit.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.



More information about the webkit-unassigned mailing list