[webkit-reviews] review denied: [Bug 174816] [GTK][WPE] Need a function to convert internal URI to display ("pretty") URI : [Attachment 342542] Patch

bugzilla-daemon at webkit.org bugzilla-daemon at webkit.org
Tue Jun 12 10:07:54 PDT 2018

Michael Catanzaro <mcatanzaro at igalia.com> has denied Ms2ger
<Ms2ger at igalia.com>'s request for review:
Bug 174816: [GTK][WPE] Need a function to convert internal URI to display
("pretty") URI

Attachment 342542: Patch


--- Comment #67 from Michael Catanzaro <mcatanzaro at igalia.com> ---
Comment on attachment 342542
  --> https://bugs.webkit.org/attachment.cgi?id=342542

View in context: https://bugs.webkit.org/attachment.cgi?id=342542&action=review

Thanks a bunch for resurrecting this work.

The main issue is that Alex requested that we not add this to URLParser:

(In reply to Alex Christensen from comment #49)
> I'm not against moving this from a .mm to a sharable location, but the
> URLParser is an implementation of https://url.spec.whatwg.org so let's make
> a new class for this.

So some new home will be needed, e.g. URLHelpers.cpp or URLDecoder.cpp,
something like that. r- for this

> Source/WebCore/ChangeLog:8
> +	   This code is based almost entirely on code by Gabriel Ivascu.

We often credit multiple authors on the authorship line:

2018-06-12 Ms2ger <Ms2ger at igalia.com> and Gabriel Ivascu
<gabrielivascu at gnome.org>

> Source/WebCore/platform/URLParser.cpp:2886
> +
> +

Let's drop down to one blank line between functions.

> Source/WebCore/platform/URLParser.cpp:2910
> +
> +
> +


> Source/WebCore/platform/URLParser.h:56
> +    static String ICUConvertHostName(const String& hostName, bool encode,
const uint32_t (&IDNScriptWhiteList)[(USCRIPT_CODE_LIMIT + 31) / 32], bool*

This function name is rather ambiguous... how about DecodeHostnameForDisplay?

> Source/WebKit/UIProcess/API/glib/WebKitURIUtilities.cpp:43
> + * against IDN homograph attacks so in some cases the host part of the

comma after "attacks"

> Source/WebKit/UIProcess/API/glib/WebKitURIUtilities.cpp:48
> +gchar* webkit_uri_for_display(const gchar* uri)

We might want to bikeshed about the name. It's fine with me, but please ask
Carlos Garcia if he has a preference here.

> Source/WebKit/UIProcess/API/glib/WebKitURIUtilities.cpp:52
> +    auto coreURI = WebCore::URL(WebCore::URL(), String::fromUTF8(uri));

I think we normally use bracket initialization for URLs:

auto coreURI = WebCore::URL { { }, String::fromUTF8(uri) };

> Source/WebKit/UIProcess/API/glib/WebKitURIUtilities.cpp:54
> +    if (!coreURI.isValid())
> +	   return g_strdup(uri);

Hm, we should document that if the input is not a valid URI, it returns the
original input. I wonder if it would be better to return nullptr when the input
is not a valid URI, though. Probably? I'd like an opinion from Carlos Garcia on
this. The behavior of strdup()ing the original input is coming from
ephy_uri_decode(), which is designed to always return a value for convenience,
but it might be better for public API to indicate the error. In that case, the
return value would need to be annotated as (nullable) and everywhere we
currently return g_strdup(uri) we would need to instead return nullptr.

> Source/WebKit/UIProcess/API/glib/WebKitURIUtilities.cpp:60
> +    // Remove password and percent-decode host name.
> +    coreURI.setPass(emptyString());
> +    auto soupURI = coreURI.createSoupURI();
> +    if (!soupURI.get()->host)
> +	   return g_strdup(uri);

If the URI contains a password but not a hostname, which is valid, then we fail
to remove the password. Oops. That's true for the subsequent early return on
line 69, as well. Returning null would avoid this.

> Source/WebKit/UIProcess/API/glib/WebKitURIUtilities.cpp:65
> +    uint32_t IDNScriptWhiteList[(USCRIPT_CODE_LIMIT + 31) / 32] = {};

The third parameter of ICUConvertHostName should be declared to use a typedef
(or using statement), then this line could become something much saner, like:

ICUConvertHostnameWhitelist whitelist = {};

> Source/WebKit/UIProcess/API/glib/WebKitURIUtilities.cpp:73
> +    if (convertedHostName.isNull()) {
> +	   convertedHostName = percentDecodedHost;
> +    }

No brackets around a one-line condition

> Source/WebKit/UIProcess/API/glib/WebKitURIUtilities.cpp:80
> +    // Now, decode any percent-encoded characters in the URI. If there are
> +    // characters or escaped slashes, this returns nullptr, so just display
> +    // encoded URI in that case.

I'm confused, why are percent-decoding the hostname twice? Epiphany feeds the
percent-encoded URI into ICU to decode the punycode, then percent-decodes. I
think it did not work properly when doing it the other way around. But here you
are doing it twice, once before decoding punycode and once after. What is the
correct approach?

This would also be simpler if we return nullptr on error.

More information about the webkit-reviews mailing list