[webkit-reviews] review granted: [Bug 41845] refactor identifier parsing in lexer : [Attachment 63583] identifier parsing patch (v2)

bugzilla-daemon at webkit.org bugzilla-daemon at webkit.org
Thu Aug 5 08:21:47 PDT 2010


Darin Adler <darin at apple.com> has granted Zoltan Herczeg
<zherczeg at webkit.org>'s request for review:
Bug 41845: refactor identifier parsing in lexer
https://bugs.webkit.org/show_bug.cgi?id=41845

Attachment 63583:  identifier parsing patch (v2)
https://bugs.webkit.org/attachment.cgi?id=63583&action=review

------- Additional Comments from Darin Adler <darin at apple.com>
Thanks. This looks good.

> +ALWAYS_INLINE JSTokenType Lexer::parseIdent(JSTokenData* lvalp, LexType
lexType)

Since this is identifier parsing, how about naming it parseIdentifier? I know
the file sometimes does abbreviate to "ident", but I find the full word easier
to read as I almost always do in code in general.

> +    while (true) {
> +	   if (LIKELY(isIdentPart(m_current)))
> +	       shift();
> +	   else if (LIKELY(m_current != '\\'))
> +	       break;
> +	   else {
> +	       // \uXXXX unicode characters
> +	       bufferRequired = true;
> +	       if (identifierStart != currentCharacter())
> +		   m_buffer16.append(identifierStart, currentCharacter() -
identifierStart);
> +	       shift();
> +	       if (UNLIKELY(m_current != 'u'))
> +		   return ERRORTOK;
> +	       shift();
> +	       int character = getUnicodeCharacter();
> +	       if (UNLIKELY(character == -1))
> +		   return ERRORTOK;
> +	       if (UNLIKELY(m_buffer16.size() ? !isIdentPart(character) :
!isIdentStart(character)))
> +		   return ERRORTOK;
> +	       record16(character);
> +	       identifierStart = currentCharacter();
> +	   }
> +    }

I prefer the "early return" style, so if it gives you the same efficiency I'd
like this to be written as follows:

    if (LIKELY(isIdentPart(m_current))) {
	shift();
	continue;
    }
    if (LIKELY(m_current != '\\'))
	break;
    ...

With the rest of the loop body not indented.

> +	   // keywords are not checked if there was an \uXXXX in the identifier


This is a confusing comment. I think what we mean to say is that keywords must
not be recognized in such cases. Saying keywords "are not checked" leaves it
ambiguous whether we are describing a bug or a feature.

Also WebKit preferred style is to capitalize comments and end with a period.
Especially a comment like this one that is a full sentence.

> +	   // Continue to the next case

Would be better to have a period and use the terminology "Fall through" as we
do in other similar comments.

> +	   ALWAYS_INLINE JSTokenType parseIdent(JSTokenData* lvalp, LexType
lexType);

No need for the argument names lvavlp and lexType here. The types of the
arguments speak for themselves without argument names and in such cases we omit
the argument names in WebKit code.


More information about the webkit-reviews mailing list