[webkit-reviews] review denied: [Bug 62882] WebVTT parser : [Attachment 102319] updating to ToT so patch will apply

Mon Aug 1 15:11:57 PDT 2011

Eric Carlson <eric.carlson at apple.com> has denied Anna Cavender
<annacc at chromium.org>'s request for review:
Bug 62882: WebVTT parser
https://bugs.webkit.org/show_bug.cgi?id=62882

Attachment 102319: updating to ToT so patch will apply
https://bugs.webkit.org/attachment.cgi?id=102319&action=review

------- Additional Comments from Eric Carlson <eric.carlson at apple.com>
View in context: https://bugs.webkit.org/attachment.cgi?id=102319&action=review

First, I apologize that it took me to long to get this reviewed :-(

Second, while this is much closer than the last time around, there are enough
minor issues that I am going to mark it r- for now.

> Source/WebCore/html/TextTrackCue.cpp:67
> +bool TextTrackCue::charIsANumber(char c)
> +{
> +    return '0' <= c && c <= '9';
> +}

Is there any reason to not use WTF::isASCIIDigit instead of defining this here?
If you think it is worth having a new function to emphasize that is follows the
webVTT rules, maybe "isWebVTTDigit" and make it an inline?

> Source/WebCore/html/TextTrackCue.cpp:69
> +String TextTrackCue::collectNumberCharacters(const String& input, unsigned*
position)

Nit: While you are collecting number characters, you return a string and
advance the position. Maybe "collectWebVTTDigits", or "parseWebVTTDigits"?

Actually, you could put this and the other static parsing functions in this
class into WebVTTParser so that "WebVTT" in the name isn't necessary and they
are all in the same place.

> Source/WebCore/html/TextTrackCue.cpp:80
> +bool TextTrackCue::charIsASpaceCharacter(char c)
> +{
> +    // WebVTT space characters are U+0020 SPACE, U+0009 CHARACTER TABULATION
(tab), U+000A LINE FEED (LF), U+000C FORM FEED (FF), and U+000D CARRIAGE RETURN
(CR).
> +    return c == ' ' || c == '\t' || c == '\n' || c == '\f' || c == '\r';

Nit: I don't think "char" prefix is necessary, and you definitely don't need
both the prefix and "Character" suffix. Maybe "isWebVTTSpace"? 

Also, this could be inline.

> Source/WebCore/html/TextTrackCue.cpp:83
> +String TextTrackCue::collectString(const String& input, unsigned* position)

Nit: "collectString" is accurate but not especially descriptive. You collect up
to the next space and advance the position, so maybe "collectWebVTTWord", or
"parseWebVTTWord"?

> Source/WebCore/html/TextTrackCue.cpp:184
> +	   // 5-7 - If the character at position is not ':', set setting to
empty string.
> +	   if (input[position++] != ':')
> +	       setting = ' ';

Does the spec consider a space to be the same as an empty string?

> Source/WebCore/html/TextTrackCue.cpp:191
> +	   switch (setting) {
> +	   case 'D':
> +	       {

Here and elsewhere, I think it would be helpful to quote (or summarize) at
least some of the parsing text from the spec. In addition to making it easier
for someone that comes along later to understand what the logic is supposed to
to, I think it will make somewhat easier to spot parts that need to be changed
later because of spec changes.

> Source/WebCore/html/TextTrackCue.cpp:196
> +	       if (writingDirection == "vertical")
> +		   m_writingDirection = VerticalGrowingLeft;
> +	       if (writingDirection == "vertical-lr")
> +		   m_writingDirection = VerticalGrowingRight;

The string can't have both values, so an "else" here will make this slightly
more efficient.

> Source/WebCore/html/TextTrackCue.cpp:212
> +	       long number = linePosition.toInt64(&validNumber);

Do you need this implicit cast from int64_t to long?

> Source/WebCore/html/TextTrackCue.cpp:235
> +	       long number = textPosition.toInt64(&validNumber);

Implicit cast?

> Source/WebCore/html/TextTrackCue.cpp:255
> +	       long number = cueSize.toInt64(&validNumber);

Implicit cast?

> Source/WebCore/html/TextTrackCue.cpp:271
> +	       if (cueAlignment == "start")
> +		   m_cueAlignment = Start;
> +	       if (cueAlignment == "middle")
> +		   m_cueAlignment = Middle;
> +	       if (cueAlignment == "end")
> +		   m_cueAlignment = End;

"else if" will make this code slightly more efficient.

> Source/WebCore/html/TextTrackCue.cpp:279
> +	   continue;
> +Otherwise:

Nit: a blank line after the "continue" will make this more readable IMO.

> Source/WebCore/platform/track/WebVTTParser.cpp:41
> +bool WebVTTParser::requireFileIdentifier(const char* data, int length)

I was confused about what this function did when I saw the name above,
"require" as the first part of the name made me think it must enforce a
requirement of some sort. Because the function checks to see if the byte stream
has a required identifier, maybe "hasRequiredFileIdentifier"?

> Source/WebCore/platform/track/WebVTTParser.cpp:72
> +void WebVTTParser::parseBytes(const char* data, int length)

Here and elsewhere - why do you pass length around as a signed int but use
unsigned inside of the functions?

> Source/WebCore/platform/track/WebVTTParser.cpp:114
> +	       m_state = collectCueText(line);
> +	       if ((int)position >= length - 1 && m_state == CueText)
> +		   processCueText();

If "length" has to be signed (see above), you should use a static_cast rather
than a C-style cast here.

> Source/WebCore/platform/track/WebVTTParser.cpp:122
> +    } while ((int)position < length);

C-style cast.

> Source/WebCore/platform/track/WebVTTParser.cpp:142
> +    String start = TextTrackCue::collectString(line, &position);
> +    m_currentStartTime = collectTimeStamp(start);

It looks like you always put a word into a local string that is only used to
pass to collectTimeStamp. Is there any reason to not have collectTimeStamp take
line and position like your other parsing functions?

> Source/WebCore/platform/track/WebVTTParser.cpp:175
> +WebVTTParser::ParseState WebVTTParser::collectCueText(const String& line)
> +{
> +    if (line.isEmpty()) {
> +	   processCueText();
> +	   return Id;
> +    }
> +    if (!m_currentContent.isEmpty())
> +	   m_currentContent.append("\n");
> +    m_currentContent.append(line);
> +    return CueText;
> +}

I think the current factoring is confusing. Some of the logic for what makes a
good cue is here and some is back in parseBytes. You sometimes call
processCueText here, and sometimes call it in parseBytes. It might be clearer
if you combined processCueText and collectCueText so all of the logic is in one
place. Could this work?

> Source/WebCore/platform/track/WebVTTParser.cpp:279
> +void WebVTTParser::skipLineTerminator(const char* data, int length,
unsigned* position)
> +{
> +    if ((int)*position >= length)
> +	   return;
> +    if (data[*position] == '\r')
> +	   (*position)++;
> +    if ((int)*position >= length)
> +	   return;
> +    if (data[*position] == '\n')
> +	   (*position)++;
> +}

Is there a reason that "length" needs to be signed? If so, the casts should no
be C-style.

> Source/WebCore/platform/track/WebVTTParser.cpp:289
> +String WebVTTParser::collectNextLine(const char* data, int length, unsigned*
position)
> +{
> +    unsigned oldPosition = *position;
> +    while ((int)*position < length && data[*position] != '\r' &&
data[*position] != '\n')
> +	   (*position)++;
> +    String line = String::fromUTF8(data + oldPosition, *position -
oldPosition);
> +    skipLineTerminator(data, length, position);
> +    return line;
> +}

Ditto.

> Source/WebCore/platform/track/WebVTTParser.h:43
> +static const int secondsInHour = 3600;
> +static const int secondsInMinute = 60;

Minor nit: "Per" might be better than "In", eg. "secondsPerHour" and
"secondsPerMinute"