[Webkit-unassigned] [Bug 29972] Implement URL decomposition IDL attributes for HTMLAnchorElement

bugzilla-daemon at webkit.org bugzilla-daemon at webkit.org
Mon Oct 26 00:04:04 PDT 2009


https://bugs.webkit.org/show_bug.cgi?id=29972


Darin Adler <darin at apple.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #41668|review?                     |review-
               Flag|                            |




--- Comment #10 from Darin Adler <darin at apple.com>  2009-10-26 00:04:04 PDT ---
(From update of attachment 41668)
Looks good!

I'm not sure you have the expected results right in the cases where what you
pass in for the URL is a Windows-style file pathname.

>      bool protocolIs(const char*) const;
>      bool protocolInHTTPFamily() const;
>      bool isLocalFile() const;
> +    bool hasPort() const;

I believe I mentioned in an earlier that I think hasPort() does not belong in
the paragraph with all the protocol boolean predicates. Instead I think it
should go just after port(), just as hasFragmentIdentifier() goes just after
fragmentIdentifier().

> +bool KURL::isProtocolValid(const String& protocol)
> +{
> +    if (protocol.isEmpty())
> +        return false;
> +    if (!isSchemeFirstChar(protocol[0]))
> +        return false;
> +    unsigned protocolLength = protocol.length();
> +    for (unsigned i = 1; i < protocolLength; i++) {
> +        if (!isSchemeChar(protocol[i]))
> +            return false;
> +    }
> +    return true;
> +}

I think this should be a non-member function like the protocolIs and
protocolIsJavaScript functions rather than a KURL member function.

There is no need for the isEmpty() check, since the isSchemeFirstChar check
will take care of that automatically, since indexing off the end of a String
gives you a 0 character.

> +    // Don't set the host on a local file e.g. c:/path
> +    if (m_schemeEnd == 1 && m_userStart == m_schemeEnd + 1)
> +        return;

This seems to be Windows-specific. We don't have any other code special casing
Windows-style paths unconditionally on all platforms, and I'm quite surprised
to see this here! Maybe we can add this in another patch. Is this really the
only case in the entire KURL class where we need to detect that something is a
Windows-style path? And is this really the right condition for detecting such a
path? While I accept that we may need special handling for such paths, if so it
should be at parse time, not in the set functions, where we decide that
something is a path, not a URL with an unusual one-letter URL scheme.

I think we'll need to land a first version of this patch without these special
cases and then decide how to handle the Windows-style path case separately. You
can land all the test cases showing what we're doing wrong with FAIL results
and then we can fix them in a second patch that addresses only that issue.

> +void KURL::removePort()
> +{
> +    if (!m_isValid)
> +        return;
> +    if (m_hostEnd == m_portEnd)
> +        return;
> +    parse(m_string.left(m_hostEnd) + m_string.substring(m_portEnd));
> +}

I don't think we need the m_isValid check here. If a KURL is not valid, then
m_hostEnd and m_portEnd are both guaranteed to be zero.

> +    // Don't set the port on a local file e.g. c:/path
> +    if (m_schemeEnd == 1 && m_userStart == m_schemeEnd + 1)
> +        return;

If we're going to have this check in more than one place, I think we need a
named helper function named something like isWindowsStyleFilePath() -- and I
have no idea why we would ever have such a thing without a "file:" prefix on
it. I really think this is wrong and we need to consider handling it some other
way. We should not treat the Windows drive letter as a "scheme" at parsing
time.

> +    // Only do this for http and https, unless the URL is not hierarchical.
> +    if (m_protocolInHTTPFamily && hierarchical && pathEnd - pathStart == 0)

It's strange to write "pathEnd == pathStart" as "pathEnd - pathStart == 0".
Maybe you can fix this while touching the code.

WebKit uses sentence-style comments rather than comments that end in a period.

> +// This function does not allow leading spaces before the port number.
> +static unsigned getPortFromString(const String& value, unsigned& portEnd)
> +{
> +    unsigned port = 0;
> +    portEnd = 0;
> +    unsigned valueLength = value.length();
> +    while (portEnd < valueLength && isASCIIDigit(value[portEnd]))
> +        portEnd++;
> +    if (portEnd > 0) {
> +        if (portEnd == valueLength)
> +            port = value.toUInt();
> +        else
> +            port = value.substring(0, portEnd).toUInt();
> +    }
> +    return port;
> +}

A function that returns a value such as a port number should not be named get
in WebKit code. We use nouns for such getter functions, and "get" only when
it's an out function.

Since an attempt to get a character past the end of a string returns the 0
character, there is no need to do the valueLength condition on the first if
statement.

WebKit style would write this "if (portEnd)" rather "if (portEnd > 0)".

Since the substring function already special cases a substring of the entire
string, there's no need for the "if (portEnd == valueLength)" check -- we can
just always use substring() and it will take care of optimizing the normal
case. Actually it would be clearer to use left().

Since toUInt() already returns 0 when passed the empty or null string, and the
empty string is already optimized, there's no need to special case the portEnd
== 0 case.

So this function becomes:

    static unsigned parsePortAtStringStart(const String& value, unsigned&
portEnd)
    {
        portEnd = 0
        while (isASCIIDigit(value[portEnd]))
            ++portEnd;
        return value.substring(0, portEnd).toUInt();
    }

And given that this actually is being passed a substring, I think it would be
better to make it take another argument and parse a port starting at the passed
in location so the caller doesn't have to call substring.

> +            else if (separator + 1 + portEnd == value.length())
> +                url.setHostAndPort(value);
> +            else
> +                url.setHostAndPort(value.substring(0, separator + 1 + portEnd));

Again, there is no need to special case the value.length() case this since
value.substring already does that.

> +void HTMLAnchorElement::setHostname(const String& value)
> +{
> +    // Before setting new value:
> +    // Remove all leading U+002F SOLIDUS ("/") characters.
> +    unsigned i = 0;
> +    unsigned hostLength = value.length();
> +    while (i < hostLength && value[i] == '/')
> +        i++;

There is no need to check against hostLength in the loop condition here because
String already takes care of that.

> +    String subvalue = value.substring(i);
> +    KURL url = href();
> +    if (!url.canSetHostOrPort())
> +        return;

There is no need to fetch subvalue into a local variable here. It's only used
once, and it's used after the if statement, so I suggest just doing the
substring call inside the setHost function call.

> +    if (SecurityOrigin::isDefaultPortForProtocol(port, url.protocol() ))

There's an extra space on this line that should go away.

> +    // Following Firefox 3.5.2 which removes anything after the first ":"
> +    String subvalue = (separator > 0 ? value.substring(0, separator) : value);

You don't need the ?: here because signed -1 becomes unsigned UINT_MAX, and
then that means the entire string. So you can just say:

    String protocol = value.substring(0, separator);

I also recommend the name protocol or newProtocol rather than subvalue.

> +    String escapedValue = (value[0] == '?') ? value.substring(1) : value;

The name escapedValue here doesn't make sense, because you do the escaping
*after* this. This is just value with the leading "?" stripped off it.

I didn't study all the tests carefully his time around, but they look good. It
would be good to compile a list of where you decided to match Firefox and where
you decided to match IE when the two differed, so people could consider that
rather than reading all the test cases.

review- because at least some of my comments above should drive you to make
some changes

This is getting really close!

-- 
Configure bugmail: https://bugs.webkit.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.



More information about the webkit-unassigned mailing list