[webkit-reviews] review denied: [Bug 27032] document.title does not replace or remove space characters : [Attachment 44480] Third attempt

bugzilla-daemon at webkit.org bugzilla-daemon at webkit.org
Fri Dec 11 14:13:01 PST 2009


Darin Adler <darin at apple.com> has denied Christian Sejersen
<christian.webkit at gmail.com>'s request for review:
Bug 27032: document.title does not replace or remove space characters
https://bugs.webkit.org/show_bug.cgi?id=27032

Attachment 44480: Third attempt
https://bugs.webkit.org/attachment.cgi?id=44480&action=review

------- Additional Comments from Darin Adler <darin at apple.com>
Thanks for tackling this. I would love to see this fixed.

> +static inline String canonicalizedTitle(const String& title)
> +{
> +    if (title.isEmpty())
> +	   return title;
> +	   
> +    return title.stripWhiteSpace().simplifyWhiteSpace();
> +}

Why the early exit for empty string? I think stripWhiteSpace and
simplifyWhiteSpace already are fast and correct for the empty string.

Maybe it's my fault? Did I suggest it?

>  void Document::updateTitle()
>  {
> +    m_canonicalizedTitle = canonicalizedTitle(m_title);
>      if (Frame* f = frame())
> -	   f->loader()->setTitle(m_title);
> +	   f->loader()->setTitle(m_canonicalizedTitle);
>  }

It seems to me this should only call setTitle if the new m_canonicalizedTitle
is different from the old. I know that DocumentLoader::setTitle has a similar
optimization, but I'd like to see it at this level too.

> -    String title() const { return m_title; }
> +    String title() const { return m_canonicalizedTitle; }

I'm sorry, I know you quibbled with my comment last time, but I have to
reiterate. It is not acceptable to have a class with a title function and an
m_title data member, but have those be different things. You need to rename
either title or m_title.

> > You eliminated conversion of of control characters to spaces. Why are you
sure
> > that's OK? I'd like to see a separate patch for that behavior change which
does
> > not combine that change with all the refactoring.
> 
> Actually, isASCIISpace has the same check for control characters, so there
> isn't any behavior change.

isASCIISpace does not return true for control characters. The check in
isASCIISpace is a performance optimization. The isASCIISpace function returns
false for characters like U+0000 and U+0001.

>  static inline bool isSpaceOrNewline(UChar c)
>  {
> +    // According to HTML 5 space characters, are defined as U+0020 SPACE, 
> +    // U+0009 CHARACTER TABULATION (tab), U+000A LINE FEED (LF), 
> +    // U+000C FORM FEED (FF) and U+000D CARRIAGE RETURN (CR).
> +    return c <= 0x7F ? isASCIISpace(c) : false;
>  }

While it's nice to have that comment there, it's not what the code implements.
isASCIISpace returns true for U+000B, and thus so does this function. I think
it's not good to put the comment in there, implying that's what the function
does.

The check for <= 0x7F is unnecessary. The isASCIISpace function already returns
false for non-ASCII characters, so you don't need to redo that work.

I think the function should be named isHTMLSpace, and should be correctly
implemented to return false for U+000B.

Also, since you are fixing this function, we need to write tests to show the
bugs we're fixing by correcting the definition. We need to look at call sites
and see which ones we're changing. There's a chance that for at least some of
the call sites, the HTML space character definition is not what's wanted, and
we need test cases to figure this out.


More information about the webkit-reviews mailing list