<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"

"http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">

<html xmlns="http://www.w3.org/1999/xhtml">

<head><meta http-equiv="content-type" content="text/html; charset=utf-8" />

<title>[205493] trunk</title>

</head>

<body>

<style type="text/css"><!--

#msg dl.meta { border: 1px #006 solid; background: #369; padding: 6px; color: #fff; }

#msg dl.meta dt { float: left; width: 6em; font-weight: bold; }

#msg dt:after { content:':';}

#msg dl, #msg dt, #msg ul, #msg li, #header, #footer, #logmsg { font-family: verdana,arial,helvetica,sans-serif; font-size: 10pt;  }

#msg dl a { font-weight: bold}

#msg dl a:link    { color:#fc3; }

#msg dl a:active  { color:#ff0; }

#msg dl a:visited { color:#cc6; }

h3 { font-family: verdana,arial,helvetica,sans-serif; font-size: 10pt; font-weight: bold; }

#msg pre { overflow: auto; background: #ffc; border: 1px #fa0 solid; padding: 6px; }

#logmsg { background: #ffc; border: 1px #fa0 solid; padding: 1em 1em 0 1em; }

#logmsg p, #logmsg pre, #logmsg blockquote { margin: 0 0 1em 0; }

#logmsg p, #logmsg li, #logmsg dt, #logmsg dd { line-height: 14pt; }

#logmsg h1, #logmsg h2, #logmsg h3, #logmsg h4, #logmsg h5, #logmsg h6 { margin: .5em 0; }

#logmsg h1:first-child, #logmsg h2:first-child, #logmsg h3:first-child, #logmsg h4:first-child, #logmsg h5:first-child, #logmsg h6:first-child { margin-top: 0; }

#logmsg ul, #logmsg ol { padding: 0; list-style-position: inside; margin: 0 0 0 1em; }

#logmsg ul { text-indent: -1em; padding-left: 1em; }#logmsg ol { text-indent: -1.5em; padding-left: 1.5em; }

#logmsg > ul, #logmsg > ol { margin: 0 0 1em 0; }

#logmsg pre { background: #eee; padding: 1em; }

#logmsg blockquote { border: 1px solid #fa0; border-left-width: 10px; padding: 1em 1em 0 1em; background: white;}

#logmsg dl { margin: 0; }

#logmsg dt { font-weight: bold; }

#logmsg dd { margin: 0; padding: 0 0 0.5em 0; }

#logmsg dd:before { content:'\00bb';}

#logmsg table { border-spacing: 0px; border-collapse: collapse; border-top: 4px solid #fa0; border-bottom: 1px solid #fa0; background: #fff; }

#logmsg table th { text-align: left; font-weight: normal; padding: 0.2em 0.5em; border-top: 1px dotted #fa0; }

#logmsg table td { text-align: right; border-top: 1px dotted #fa0; padding: 0.2em 0.5em; }

#logmsg table thead th { text-align: center; border-bottom: 1px solid #fa0; }

#logmsg table th.Corner { text-align: left; }

#logmsg hr { border: none 0; border-top: 2px dashed #fa0; height: 1px; }

#header, #footer { color: #fff; background: #636; border: 1px #300 solid; padding: 6px; }

#patch { width: 100%; }

#patch h4 {font-family: verdana,arial,helvetica,sans-serif;font-size:10pt;padding:8px;background:#369;color:#fff;margin:0;}

#patch .propset h4, #patch .binary h4 {margin:0;}

#patch pre {padding:0;line-height:1.2em;margin:0;}

#patch .diff {width:100%;background:#eee;padding: 0 0 10px 0;overflow:auto;}

#patch .propset .diff, #patch .binary .diff  {padding:10px 0;}

#patch span {display:block;padding:0 10px;}

#patch .modfile, #patch .addfile, #patch .delfile, #patch .propset, #patch .binary, #patch .copfile {border:1px solid #ccc;margin:10px 0;}

#patch ins {background:#dfd;text-decoration:none;display:block;padding:0 10px;}

#patch del {background:#fdd;text-decoration:none;display:block;padding:0 10px;}

#patch .lines, .info {color:#888;background:#fff;}

--></style>

<div id="msg">

<dl class="meta">

<dt>Revision</dt> <dd><a href="http://trac.webkit.org/projects/webkit/changeset/205493">205493</a></dd>

<dt>Author</dt> <dd>achristensen@apple.com</dd>

<dt>Date</dt> <dd>2016-09-06 11:16:07 -0700 (Tue, 06 Sep 2016)</dd>

</dl>

<h3>Log Message</h3>

<pre>Implement relative file urls and begin implementing character encoding in URLParser
https://bugs.webkit.org/show_bug.cgi?id=161618

Reviewed by Tim Horton.

Source/WebCore:

Covered by new API tests.
Also, this is a significant step towards passing the URL web platform tests when using the URLParser,
which is still off by default.

* platform/URLParser.cpp:
(WebCore::isInSimpleEncodeSet):
(WebCore::isInDefaultEncodeSet):
(WebCore::isInUserInfoEncodeSet):
(WebCore::isInvalidDomainCharacter):
(WebCore::shouldCopyFileURL):
(WebCore::percentEncode):
(WebCore::utf8PercentEncode):
(WebCore::encodeQuery):
(WebCore::isDefaultPort):
(WebCore::isPercentEncodedDot):
(WebCore::URLParser::parse):
(WebCore::percentDecode):
(WebCore::domainToASCII):
(WebCore::hasInvalidDomainCharacter):
(WebCore::URLParser::parsePort):
(WebCore::URLParser::parseHost):
(WebCore::isTabOrNewline): Deleted.
* platform/URLParser.h:

Tools:

* TestWebKitAPI/Tests/WebCore/URLParser.cpp:
(TestWebKitAPI::TEST_F):</pre>
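<p>The character-encoding work named in the log message can be illustrated with a minimal standalone sketch. This is not the WebKit code: it assumes std::string and char32_t in place of WTF's StringBuilder and UChar32, hand-rolls the UTF-8 step rather than using ICU, and the names percentEncodeByte and encodePath are hypothetical. The encode set is meant to mirror isInDefaultEncodeSet from this changeset, and surrogate code points are not handled.</p>
<pre><![CDATA[
```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Stand-in for the default encode set: C0 controls, anything above 0x7E,
// and space " # < > ? ` { }.
static bool isInDefaultEncodeSet(char32_t c)
{
    if (c <= 0x1F || c > 0x7E)
        return true;
    switch (c) {
    case ' ': case '"': case '#': case '<': case '>':
    case '?': case '`': case '{': case '}':
        return true;
    default:
        return false;
    }
}

// Appends "%XX" for one byte, using uppercase hex digits.
static void percentEncodeByte(uint8_t byte, std::string& out)
{
    static const char hex[] = "0123456789ABCDEF";
    out += '%';
    out += hex[byte >> 4];
    out += hex[byte & 0x0F];
}

// If the code point is in the set, encode it as UTF-8 (1 to 4 bytes) and
// percent-encode every byte; otherwise append it unchanged.
static void utf8PercentEncode(char32_t c, std::string& out)
{
    assert(c <= 0x10FFFF);
    if (!isInDefaultEncodeSet(c)) {
        out += static_cast<char>(c); // Outside the set implies printable ASCII.
        return;
    }
    uint8_t bytes[4];
    int count = 0;
    if (c < 0x80)
        bytes[count++] = static_cast<uint8_t>(c);
    else if (c < 0x800) {
        bytes[count++] = static_cast<uint8_t>(0xC0 | (c >> 6));
        bytes[count++] = static_cast<uint8_t>(0x80 | (c & 0x3F));
    } else if (c < 0x10000) {
        bytes[count++] = static_cast<uint8_t>(0xE0 | (c >> 12));
        bytes[count++] = static_cast<uint8_t>(0x80 | ((c >> 6) & 0x3F));
        bytes[count++] = static_cast<uint8_t>(0x80 | (c & 0x3F));
    } else {
        bytes[count++] = static_cast<uint8_t>(0xF0 | (c >> 18));
        bytes[count++] = static_cast<uint8_t>(0x80 | ((c >> 12) & 0x3F));
        bytes[count++] = static_cast<uint8_t>(0x80 | ((c >> 6) & 0x3F));
        bytes[count++] = static_cast<uint8_t>(0x80 | (c & 0x3F));
    }
    for (int i = 0; i < count; ++i)
        percentEncodeByte(bytes[i], out);
}

// Hypothetical helper: applies the encoder to every code point of a path.
std::string encodePath(const std::u32string& path)
{
    std::string out;
    for (char32_t c : path)
        utf8PercentEncode(c, out);
    return out;
}
```
]]></pre>
<p>Under these assumptions, encodePath(U"/A b") yields "/A%20b", which matches the expectation of the new API test for http://host/A b, and an existing "%20" passes through unchanged because '%' is not in the set.</p>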

<h3>Modified Paths</h3>

<ul>

<li><a href="#trunkSourceWebCoreChangeLog">trunk/Source/WebCore/ChangeLog</a></li>

<li><a href="#trunkSourceWebCoreplatformURLParsercpp">trunk/Source/WebCore/platform/URLParser.cpp</a></li>

<li><a href="#trunkSourceWebCoreplatformURLParserh">trunk/Source/WebCore/platform/URLParser.h</a></li>

<li><a href="#trunkToolsChangeLog">trunk/Tools/ChangeLog</a></li>

<li><a href="#trunkToolsTestWebKitAPITestsWebCoreURLParsercpp">trunk/Tools/TestWebKitAPI/Tests/WebCore/URLParser.cpp</a></li>

</ul>

</div>

<div id="patch">

<h3>Diff</h3>

<a id="trunkSourceWebCoreChangeLog"></a>

<div class="modfile"><h4>Modified: trunk/Source/WebCore/ChangeLog (205492 => 205493)</h4>

<pre class="diff"><span>

<span class="info">--- trunk/Source/WebCore/ChangeLog        2016-09-06 18:02:28 UTC (rev 205492)

+++ trunk/Source/WebCore/ChangeLog        2016-09-06 18:16:07 UTC (rev 205493)

</span><span class="lines">@@ -1,3 +1,34 @@

</span><ins>+2016-09-05  Alex Christensen  &lt;achristensen@webkit.org&gt;

+

+        Implement relative file urls and begin implementing character encoding in URLParser

+        https://bugs.webkit.org/show_bug.cgi?id=161618

+

+        Reviewed by Tim Horton.

+

+        Covered by new API tests.

+        Also, this is a significant step towards passing the URL web platform tests when using the URLParser,

+        which is still off by default.

+

+        * platform/URLParser.cpp:

+        (WebCore::isInSimpleEncodeSet):

+        (WebCore::isInDefaultEncodeSet):

+        (WebCore::isInUserInfoEncodeSet):

+        (WebCore::isInvalidDomainCharacter):

+        (WebCore::shouldCopyFileURL):

+        (WebCore::percentEncode):

+        (WebCore::utf8PercentEncode):

+        (WebCore::encodeQuery):

+        (WebCore::isDefaultPort):

+        (WebCore::isPercentEncodedDot):

+        (WebCore::URLParser::parse):

+        (WebCore::percentDecode):

+        (WebCore::domainToASCII):

+        (WebCore::hasInvalidDomainCharacter):

+        (WebCore::URLParser::parsePort):

+        (WebCore::URLParser::parseHost):

+        (WebCore::isTabOrNewline): Deleted.

+        * platform/URLParser.h:

+

</ins><span class="cx"> 2016-09-06  Daniel Bates  &lt;dabates@apple.com&gt;

</span><span class="cx"> 

</span><span class="cx">         Fix the Apple-internal build following &lt;https://trac.webkit.org/changeset/205488&gt;

</span></span></pre></div>

<a id="trunkSourceWebCoreplatformURLParsercpp"></a>

<div class="modfile"><h4>Modified: trunk/Source/WebCore/platform/URLParser.cpp (205492 => 205493)</h4>

<pre class="diff"><span>

<span class="info">--- trunk/Source/WebCore/platform/URLParser.cpp        2016-09-06 18:02:28 UTC (rev 205492)

+++ trunk/Source/WebCore/platform/URLParser.cpp        2016-09-06 18:16:07 UTC (rev 205493)

</span><span class="lines">@@ -28,7 +28,10 @@

</span><span class="cx"> 

</span><span class="cx"> #include &quot;Logging.h&quot;

</span><span class="cx"> #include &lt;array&gt;

</span><ins>+#include &lt;wtf/HashMap.h&gt;

+#include &lt;wtf/NeverDestroyed.h&gt;

</ins><span class="cx"> #include &lt;wtf/text/StringBuilder.h&gt;

</span><ins>+#include &lt;wtf/text/StringHash.h&gt;

</ins><span class="cx"> 

</span><span class="cx"> namespace WebCore {

</span><span class="cx"> 

</span><span class="lines">@@ -35,6 +38,10 @@

</span><span class="cx"> template&lt;typename CharacterType&gt; static bool isC0Control(CharacterType character) { return character &lt;= 0x0001F; }

</span><span class="cx"> template&lt;typename CharacterType&gt; static bool isC0ControlOrSpace(CharacterType character) { return isC0Control(character) || character == 0x0020; }

</span><span class="cx"> template&lt;typename CharacterType&gt; static bool isTabOrNewline(CharacterType character) { return character == 0x0009 || character == 0x000A || character == 0x000D; }

</span><ins>+template&lt;typename CharacterType&gt; static bool isInSimpleEncodeSet(CharacterType character) { return isC0Control(character) || character &gt; 0x007E; }

+template&lt;typename CharacterType&gt; static bool isInDefaultEncodeSet(CharacterType character) { return isInSimpleEncodeSet(character) || character == 0x0020 || character == '&quot;' || character == '#' || character == '&lt;' || character == '&gt;' || character == '?' || character == '`' || character == '{' || character == '}'; }

+template&lt;typename CharacterType&gt; static bool isInUserInfoEncodeSet(CharacterType character) { return isInDefaultEncodeSet(character) || character == '/' || character == ':' || character == ';' || character == '=' || character == '@' || character == '[' || character == '\\' || character == ']' || character == '^' || character == '|'; }

+template&lt;typename CharacterType&gt; static bool isInvalidDomainCharacter(CharacterType character) { return character == 0x0000 || character == 0x0009 || character == 0x000A || character == 0x000D || character == 0x0020 || character == '#' || character == '%' || character == '/' || character == ':' || character == '?' || character == '@' || character == '[' || character == '\\' || character == ']'; }

</ins><span class="cx">     

</span><span class="cx"> static bool isWindowsDriveLetter(StringView::CodePoints::Iterator iterator, const StringView::CodePoints::Iterator&amp; end)

</span><span class="cx"> {

</span><span class="lines">@@ -63,9 +70,74 @@

</span><span class="cx">     if (iterator == end)

</span><span class="cx">         return true;

</span><span class="cx">     ++iterator;

</span><del>-    return *iterator != '/' &amp;&amp; *iterator != '\\' &amp;&amp; *iterator != '?' &amp;&amp; *iterator == '#';

</del><ins>+    if (iterator == end)

+        return true;

+    return *iterator != '/' &amp;&amp; *iterator != '\\' &amp;&amp; *iterator != '?' &amp;&amp; *iterator != '#';

</ins><span class="cx"> }

</span><span class="cx"> 

</span><ins>+static void percentEncode(uint8_t byte, StringBuilder&amp; builder)

+{

+    builder.append('%');

+    builder.append(upperNibbleToASCIIHexDigit(byte));

+    builder.append(lowerNibbleToASCIIHexDigit(byte));

+}

+

+static void utf8PercentEncode(UChar32 codePoint, StringBuilder&amp; builder, bool(*isInCodeSet)(UChar32))

+{

+    if (isInCodeSet(codePoint)) {

+        uint8_t buffer[U8_MAX_LENGTH];

+        int32_t offset = 0;

+        UBool error = false;

+        U8_APPEND(buffer, offset, U8_MAX_LENGTH, codePoint, error);

+        // FIXME: Check error.

+        for (int32_t i = 0; i &lt; offset; ++i)

+            percentEncode(buffer[i], builder);

+    } else

+        builder.append(codePoint);

+}

+

+static bool shouldPercentEncodeQueryByte(uint8_t byte)

+{

+    if (byte &lt; 0x21)

+        return true;

+    if (byte &gt; 0x7E)

+        return true;

+    if (byte == 0x22)

+        return true;

+    if (byte == 0x23)

+        return true;

+    if (byte == 0x3C)

+        return true;

+    return byte == 0x3E;

+}

+

+static void encodeQuery(const StringBuilder&amp; source, StringBuilder&amp; destination, const TextEncoding&amp; encoding)

+{

+    // FIXME: It is unclear in the spec what to do when encoding fails. The behavior should be specified and tested.

+    CString encoded = encoding.encode(StringView(source.toStringPreserveCapacity()), URLEncodedEntitiesForUnencodables);

+    const char* data = encoded.data();

+    size_t length = encoded.length();

+    for (size_t i = 0; i &lt; length; ++i) {

+        uint8_t byte = data[i];

+        if (shouldPercentEncodeQueryByte(byte))

+            percentEncode(byte, destination);

+        else

+            destination.append(byte);

+    }

+}

+

+static bool isDefaultPort(const String&amp; scheme, uint16_t port)

+{

+    static NeverDestroyed&lt;HashMap&lt;String, uint16_t&gt;&gt; defaultPorts(HashMap&lt;String, uint16_t&gt;({

+        {&quot;ftp&quot;, 21},

+        {&quot;gopher&quot;, 70},

+        {&quot;http&quot;, 80},

+        {&quot;https&quot;, 443},

+        {&quot;ws&quot;, 80},

+        {&quot;wss&quot;, 443}}));

+    return defaultPorts.get().get(scheme) == port;

+}

+

</ins><span class="cx"> static bool isSpecialScheme(const String&amp; scheme)

</span><span class="cx"> {

</span><span class="cx">     return scheme == &quot;ftp&quot;

</span><span class="lines">@@ -159,6 +231,23 @@

</span><span class="cx"> 

</span><span class="cx"> static const char* dotASCIICode = &quot;2e&quot;;

</span><span class="cx"> 

</span><ins>+static bool isPercentEncodedDot(StringView::CodePoints::Iterator c, const StringView::CodePoints::Iterator&amp; end)

+{

+    if (c == end)

+        return false;

+    if (*c != '%')

+        return false;

+    ++c;

+    if (c == end)

+        return false;

+    if (*c != dotASCIICode[0])

+        return false;

+    ++c;

+    if (c == end)

+        return false;

+    return toASCIILower(*c) == dotASCIICode[1];

+}

+

</ins><span class="cx"> static bool isSingleDotPathSegment(StringView::CodePoints::Iterator c, const StringView::CodePoints::Iterator&amp; end)

</span><span class="cx"> {

</span><span class="cx">     if (c == end)

</span><span class="lines">@@ -261,12 +350,15 @@

</span><span class="cx">     m_buffer.resize(m_url.m_pathAfterLastSlash);

</span><span class="cx"> }

</span><span class="cx"> 

</span><del>-URL URLParser::parse(const String&amp; input, const URL&amp; base, const TextEncoding&amp;)

</del><ins>+URL URLParser::parse(const String&amp; input, const URL&amp; base, const TextEncoding&amp; encoding)

</ins><span class="cx"> {

</span><span class="cx">     LOG(URLParser, &quot;Parsing URL &lt;%s&gt; base &lt;%s&gt;&quot;, input.utf8().data(), base.string().utf8().data());

</span><span class="cx">     m_url = { };

</span><span class="cx">     m_buffer.clear();

</span><span class="cx">     m_buffer.reserveCapacity(input.length());

</span><ins>+    

+    // FIXME: We shouldn't need to allocate another buffer for this.

+    StringBuilder queryBuffer;

</ins><span class="cx"> 

</span><span class="cx">     auto codePoints = StringView(input).codePoints();

</span><span class="cx">     auto c = codePoints.begin();

</span><span class="lines">@@ -347,6 +439,8 @@

</span><span class="cx">                 break;

</span><span class="cx">             }

</span><span class="cx">             ++c;

</span><ins>+            while (c != end &amp;&amp; isTabOrNewline(*c))

+                ++c;

</ins><span class="cx">             if (c == end) {

</span><span class="cx">                 m_buffer.clear();

</span><span class="cx">                 state = State::NoScheme;

</span><span class="lines">@@ -380,6 +474,7 @@

</span><span class="cx">                     return { };

</span><span class="cx">             } else if (base.protocol() == &quot;file&quot;) {

</span><span class="cx">                 copyURLPartsUntil(base, URLPart::SchemeEnd);

</span><ins>+                m_buffer.append(':');

</ins><span class="cx">                 state = State::File;

</span><span class="cx">             } else

</span><span class="cx">                 state = State::Relative;

</span><span class="lines">@@ -479,12 +574,20 @@

</span><span class="cx">             if (*c == '@') {

</span><span class="cx">                 parseAuthority(authorityOrHostBegin, c);

</span><span class="cx">                 ++c;

</span><ins>+                while (c != end &amp;&amp; isTabOrNewline(*c))

+                    ++c;

</ins><span class="cx">                 authorityOrHostBegin = c;

</span><span class="cx">                 state = State::Host;

</span><ins>+                break;

</ins><span class="cx">             } else if (*c == '/' || *c == '?' || *c == '#') {

</span><span class="cx">                 m_url.m_userEnd = m_buffer.length();

</span><span class="cx">                 m_url.m_passwordEnd = m_url.m_userEnd;

</span><del>-                parseHost(authorityOrHostBegin, c);

</del><ins>+                if (!parseHost(authorityOrHostBegin, c))

+                    return { };

+                if (*c != '/') {

+                    m_buffer.append('/');

+                    m_url.m_pathAfterLastSlash = m_buffer.length();

+                }

</ins><span class="cx">                 state = State::Path;

</span><span class="cx">                 break;

</span><span class="cx">             }

</span><span class="lines">@@ -493,7 +596,8 @@

</span><span class="cx">         case State::Host:

</span><span class="cx">             LOG_STATE(&quot;Host&quot;);

</span><span class="cx">             if (*c == '/' || *c == '?' || *c == '#') {

</span><del>-                parseHost(authorityOrHostBegin, c);

</del><ins>+                if (!parseHost(authorityOrHostBegin, c))

+                    return { };

</ins><span class="cx">                 state = State::Path;

</span><span class="cx">                 break;

</span><span class="cx">             }

</span><span class="lines">@@ -509,7 +613,7 @@

</span><span class="cx">                 ++c;

</span><span class="cx">                 break;

</span><span class="cx">             case '?':

</span><del>-                if (!base.isNull() &amp;&amp; base.protocol() == &quot;file&quot;)

</del><ins>+                if (!base.isNull() &amp;&amp; base.protocolIs(&quot;file&quot;))

</ins><span class="cx">                     copyURLPartsUntil(base, URLPart::PathEnd);

</span><span class="cx">                 m_buffer.append(&quot;///?&quot;);

</span><span class="cx">                 m_url.m_userStart = m_buffer.length() - 2;

</span><span class="lines">@@ -523,7 +627,7 @@

</span><span class="cx">                 ++c;

</span><span class="cx">                 break;

</span><span class="cx">             case '#':

</span><del>-                if (!base.isNull() &amp;&amp; base.protocol() == &quot;file&quot;)

</del><ins>+                if (!base.isNull() &amp;&amp; base.protocolIs(&quot;file&quot;))

</ins><span class="cx">                     copyURLPartsUntil(base, URLPart::QueryEnd);

</span><span class="cx">                 m_buffer.append(&quot;///#&quot;);

</span><span class="cx">                 m_url.m_userStart = m_buffer.length() - 2;

</span><span class="lines">@@ -538,10 +642,9 @@

</span><span class="cx">                 ++c;

</span><span class="cx">                 break;

</span><span class="cx">             default:

</span><del>-                if (shouldCopyFileURL(c, end)) {

-                    copyURLPartsUntil(base, URLPart::PathEnd);

-                    popPath();

-                } else {

</del><ins>+                if (!base.isNull() &amp;&amp; base.protocolIs(&quot;file&quot;) &amp;&amp; shouldCopyFileURL(c, end))

+                    copyURLPartsUntil(base, URLPart::PathAfterLastSlash);

+                else {

</ins><span class="cx">                     m_buffer.append(&quot;///&quot;);

</span><span class="cx">                     m_url.m_userStart = m_buffer.length() - 1;

</span><span class="cx">                     m_url.m_userEnd = m_url.m_userStart;

</span><span class="lines">@@ -667,8 +770,17 @@

</span><span class="cx">                 state = State::Fragment;

</span><span class="cx">                 break;

</span><span class="cx">             }

</span><del>-            // FIXME: Percent encode c

-            m_buffer.append(*c);

</del><ins>+            if (isPercentEncodedDot(c, end)) {

+                m_buffer.append('.');

+                ASSERT(*c == '%');

+                ++c;

+                ASSERT(*c == dotASCIICode[0]);

+                ++c;

+                ASSERT(toASCIILower(*c) == dotASCIICode[1]);

+                ++c;

+                break;

+            }

+            utf8PercentEncode(*c, m_buffer, isInDefaultEncodeSet);

</ins><span class="cx">             ++c;

</span><span class="cx">             break;

</span><span class="cx">         case State::CannotBeABaseURLPath:

</span><span class="lines">@@ -688,11 +800,12 @@

</span><span class="cx">         case State::Query:

</span><span class="cx">             LOG_STATE(&quot;Query&quot;);

</span><span class="cx">             if (*c == '#') {

</span><ins>+                encodeQuery(queryBuffer, m_buffer, encoding);

</ins><span class="cx">                 m_url.m_queryEnd = m_buffer.length();

</span><span class="cx">                 state = State::Fragment;

</span><span class="cx">                 break;

</span><span class="cx">             }

</span><del>-            m_buffer.append(*c);

</del><ins>+            queryBuffer.append(*c);

</ins><span class="cx">             ++c;

</span><span class="cx">             break;

</span><span class="cx">         case State::Fragment:

</span><span class="lines">@@ -743,7 +856,8 @@

</span><span class="cx">     case State::Host:

</span><span class="cx">         if (state == State::Host)

</span><span class="cx">             LOG_FINAL_STATE(&quot;Host&quot;);

</span><del>-        parseHost(authorityOrHostBegin, end);

</del><ins>+        if (!parseHost(authorityOrHostBegin, end))

+            return { };

</ins><span class="cx">         m_buffer.append('/');

</span><span class="cx">         m_url.m_pathEnd = m_url.m_portEnd + 1;

</span><span class="cx">         m_url.m_pathAfterLastSlash = m_url.m_pathEnd;

</span><span class="lines">@@ -832,6 +946,7 @@

</span><span class="cx">         break;

</span><span class="cx">     case State::Query:

</span><span class="cx">         LOG_FINAL_STATE(&quot;Query&quot;);

</span><ins>+        encodeQuery(queryBuffer, m_buffer, encoding);

</ins><span class="cx">         m_url.m_queryEnd = m_buffer.length();

</span><span class="cx">         m_url.m_fragmentEnd = m_url.m_queryEnd;

</span><span class="cx">         break;

</span><span class="lines">@@ -1131,6 +1246,73 @@

</span><span class="cx">     return address;

</span><span class="cx"> }

</span><span class="cx"> 

</span><ins>+static String percentDecode(const String&amp; input)

+{

+    StringBuilder output;

+    RELEASE_ASSERT(input.is8Bit());

+    const LChar* inputBytes = input.characters8();

+    size_t length = input.length();

+    

+    for (size_t i = 0; i &lt; length; ++i) {

+        uint8_t byte = inputBytes[i];

+        if (byte != '%')

+            output.append(byte);

+        else if (i &lt; length - 2) {

+            if (isASCIIHexDigit(inputBytes[i + 1]) &amp;&amp; isASCIIHexDigit(inputBytes[i + 2])) {

+                output.append(toASCIIHexValue(inputBytes[i + 1], inputBytes[i + 2]));

+                i += 2;

+            } else

+                output.append(byte);

+        } else

+            output.append(byte);

+    }

+    return output.toStringPreserveCapacity();

+}

+

+static Optional&lt;String&gt; domainToASCII(const String&amp; domain)

+{

+    // FIXME: Implement correctly

+    CString utf8 = domain.utf8();

+    return String(utf8.data(), utf8.length());

+}

+

+static bool hasInvalidDomainCharacter(const String&amp; asciiDomain)

+{

+    RELEASE_ASSERT(asciiDomain.is8Bit());

+    const LChar* characters = asciiDomain.characters8();

+    for (size_t i = 0; i &lt; asciiDomain.length(); ++i) {

+        if (isInvalidDomainCharacter(characters[i]))

+            return true;

+    }

+    return false;

+}

+

+bool URLParser::parsePort(StringView::CodePoints::Iterator&amp; iterator, const StringView::CodePoints::Iterator&amp; end)

+{

+    uint32_t port = 0;

+    ASSERT(iterator != end);

+    for (; iterator != end; ++iterator) {

+        if (isTabOrNewline(*iterator))

+            continue;

+        if (isASCIIDigit(*iterator)) {

+            port = port * 10 + *iterator - '0';

+            if (port &gt; std::numeric_limits&lt;uint16_t&gt;::max())

+                return false;

+        } else

+            return false;

+    }

+    

+    // FIXME: This shouldn't need a String allocation.

+    String scheme = m_buffer.toStringPreserveCapacity().substring(0, m_url.m_schemeEnd);

+    if (isDefaultPort(scheme, port)) {

+        ASSERT(m_buffer[m_buffer.length() - 1] == ':');

+        m_buffer.resize(m_buffer.length() - 1);

+    } else

+        m_buffer.appendNumber(port);

+

+    return true;

+}

+

</ins><span class="cx"> bool URLParser::parseHost(StringView::CodePoints::Iterator&amp; iterator, const StringView::CodePoints::Iterator&amp; end)

</span><span class="cx"> {

</span><span class="cx">     if (iterator == end)

</span><span class="lines">@@ -1148,7 +1330,30 @@

</span><span class="cx">             return true;

</span><span class="cx">         }

</span><span class="cx">     }

</span><del>-    if (auto address = parseIPv4Host(iterator, end)) {

</del><ins>+

+    // FIXME: We probably don't need to make so many buffers and String copies.

+    StringBuilder utf8Encoded;

+    for (; iterator != end; ++iterator) {

+        if (isTabOrNewline(*iterator))

+            continue;

+        if (*iterator == ':')

+            break;

+        uint8_t buffer[U8_MAX_LENGTH];

+        int32_t offset = 0;

+        UBool error = false;

+        U8_APPEND(buffer, offset, U8_MAX_LENGTH, *iterator, error);

+        // FIXME: Check error.

+        utf8Encoded.append(buffer, offset);

+    }

+    String percentDecoded = percentDecode(utf8Encoded.toStringPreserveCapacity());

+    RELEASE_ASSERT(percentDecoded.is8Bit());

+    String domain = String::fromUTF8(percentDecoded.characters8(), percentDecoded.length());

+    auto asciiDomain = domainToASCII(domain);

+    if (!asciiDomain || hasInvalidDomainCharacter(asciiDomain.value()))

+        return false;

+    

+    auto asciiDomainCodePoints = StringView(asciiDomain.value()).codePoints();

+    if (auto address = parseIPv4Host(asciiDomainCodePoints.begin(), asciiDomainCodePoints.end())) {

</ins><span class="cx">         serializeIPv4(address.value(), m_buffer);

</span><span class="cx">         m_url.m_hostEnd = m_buffer.length();

</span><span class="cx">         // FIXME: Handle the port correctly.

</span><span class="lines">@@ -1155,20 +1360,21 @@

</span><span class="cx">         m_url.m_portEnd = m_buffer.length();

</span><span class="cx">         return true;

</span><span class="cx">     }

</span><del>-    for (; iterator != end; ++iterator) {

-        if (*iterator == ':') {

</del><ins>+    

+    m_buffer.append(asciiDomain.value());

+    m_url.m_hostEnd = m_buffer.length();

+    if (iterator != end) {

+        ASSERT(*iterator == ':');

+        ++iterator;

+        while (iterator != end &amp;&amp; isTabOrNewline(*iterator))

</ins><span class="cx">             ++iterator;

</span><del>-            m_url.m_hostEnd = m_buffer.length();

</del><ins>+        if (iterator != end) {

</ins><span class="cx">             m_buffer.append(':');

</span><del>-            for (; iterator != end; ++iterator)

-                m_buffer.append(*iterator);

-            m_url.m_portEnd = m_buffer.length();

-            return true;

</del><ins>+            if (!parsePort(iterator, end))

+                return false;

</ins><span class="cx">         }

</span><del>-        m_buffer.append(*iterator);

</del><span class="cx">     }

</span><del>-    m_url.m_hostEnd = m_buffer.length();

-    m_url.m_portEnd = m_url.m_hostEnd;

</del><ins>+    m_url.m_portEnd = m_buffer.length();

</ins><span class="cx">     return true;

</span><span class="cx"> }

</span><span class="cx"> 

</span></span></pre></div>
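<p>The host path in the diff above percent-decodes the UTF-8 bytes before checking for invalid domain characters. A minimal standalone sketch of that decoding step follows. It assumes std::string in place of WTF's String, and isHexDigit and hexValue are hypothetical helpers standing in for isASCIIHexDigit and toASCIIHexValue. The bounds check is written as i + 2 &lt; size so that it cannot wrap around for very short inputs.</p>
<pre><![CDATA[
```cpp
#include <cassert>
#include <cstddef>
#include <string>

static bool isHexDigit(unsigned char c)
{
    return (c >= '0' && c <= '9') || (c >= 'a' && c <= 'f') || (c >= 'A' && c <= 'F');
}

static unsigned hexValue(unsigned char c)
{
    assert(isHexDigit(c));
    if (c >= '0' && c <= '9')
        return c - '0';
    if (c >= 'a' && c <= 'f')
        return c - 'a' + 10;
    return c - 'A' + 10;
}

// Replaces each "%XX" (two hex digits) with the byte it names.
// A '%' that is not followed by two hex digits is copied through unchanged.
std::string percentDecode(const std::string& input)
{
    std::string output;
    output.reserve(input.size());
    for (size_t i = 0; i < input.size(); ++i) {
        unsigned char byte = static_cast<unsigned char>(input[i]);
        if (byte == '%' && i + 2 < input.size()) {
            unsigned char high = static_cast<unsigned char>(input[i + 1]);
            unsigned char low = static_cast<unsigned char>(input[i + 2]);
            if (isHexDigit(high) && isHexDigit(low)) {
                output += static_cast<char>(hexValue(high) * 16 + hexValue(low));
                i += 2; // Skip the two digits just consumed.
                continue;
            }
        }
        output += static_cast<char>(byte);
    }
    return output;
}
```
]]></pre>
<p>With this sketch, percentDecode("ex%61mple.org") gives "example.org", while malformed sequences such as "%zz" are left as they are.</p>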

<a id="trunkSourceWebCoreplatformURLParserh"></a>

<div class="modfile"><h4>Modified: trunk/Source/WebCore/platform/URLParser.h (205492 => 205493)</h4>

<pre class="diff"><span>

<span class="info">--- trunk/Source/WebCore/platform/URLParser.h        2016-09-06 18:02:28 UTC (rev 205492)

+++ trunk/Source/WebCore/platform/URLParser.h        2016-09-06 18:16:07 UTC (rev 205493)

</span><span class="lines">@@ -44,6 +44,7 @@

</span><span class="cx">     StringBuilder m_buffer;

</span><span class="cx">     void parseAuthority(StringView::CodePoints::Iterator&amp;, const StringView::CodePoints::Iterator&amp; end);

</span><span class="cx">     bool parseHost(StringView::CodePoints::Iterator&amp;, const StringView::CodePoints::Iterator&amp; end);

</span><ins>+    bool parsePort(StringView::CodePoints::Iterator&amp;, const StringView::CodePoints::Iterator&amp; end);

</ins><span class="cx"> 

</span><span class="cx">     enum class URLPart;

</span><span class="cx">     void copyURLPartsUntil(const URL&amp; base, URLPart);

</span></span></pre></div>
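<p>The parsePort member declared above rejects non-digit input and values above 65535, and omits the port when it is the default for the scheme. The following is a rough standalone sketch of that behavior, not the WebKit implementation: serializePort is a hypothetical free function, std::map stands in for WTF's HashMap, and in this sketch a scheme that is missing from the table simply has no default port.</p>
<pre><![CDATA[
```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Default ports copied from the table in this changeset.
static bool isDefaultPort(const std::string& scheme, uint16_t port)
{
    static const std::map<std::string, uint16_t> defaultPorts {
        {"ftp", 21}, {"gopher", 70}, {"http", 80},
        {"https", 443}, {"ws", 80}, {"wss", 443}
    };
    auto it = defaultPorts.find(scheme);
    return it != defaultPorts.end() && it->second == port;
}

// Parses the text after the ':' that follows a host.
// On success, 'out' holds what to append after the host: an empty string
// for a default port, ":N" otherwise. Returns false for invalid input.
bool serializePort(const std::string& scheme, const std::string& digits, std::string& out)
{
    assert(!digits.empty());
    uint32_t port = 0;
    for (char c : digits) {
        if (c == '\t' || c == '\n' || c == '\r')
            continue; // Tabs and newlines are skipped, as in the parser.
        if (c < '0' || c > '9')
            return false;
        port = port * 10 + static_cast<uint32_t>(c - '0');
        if (port > 65535)
            return false; // Does not fit in a 16-bit port.
    }
    if (isDefaultPort(scheme, static_cast<uint16_t>(port)))
        out.clear();
    else
        out = ":" + std::to_string(port);
    return true;
}
```
]]></pre>
<p>For example, under these assumptions serializePort("http", "80", out) succeeds and leaves out empty, while serializePort("http", "8080", out) sets out to ":8080".</p>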

<a id="trunkToolsChangeLog"></a>

<div class="modfile"><h4>Modified: trunk/Tools/ChangeLog (205492 => 205493)</h4>

<pre class="diff"><span>

<span class="info">--- trunk/Tools/ChangeLog        2016-09-06 18:02:28 UTC (rev 205492)

+++ trunk/Tools/ChangeLog        2016-09-06 18:16:07 UTC (rev 205493)

</span><span class="lines">@@ -1,3 +1,13 @@

</span><ins>+2016-09-05  Alex Christensen  &lt;achristensen@webkit.org&gt;

+

+        Implement relative file urls and begin implementing character encoding in URLParser

+        https://bugs.webkit.org/show_bug.cgi?id=161618

+

+        Reviewed by Tim Horton.

+

+        * TestWebKitAPI/Tests/WebCore/URLParser.cpp:

+        (TestWebKitAPI::TEST_F):

+

</ins><span class="cx"> 2016-09-06  Commit Queue  &lt;commit-queue@webkit.org&gt;

</span><span class="cx"> 

</span><span class="cx">         Unreviewed, rolling out r205480.

</span></span></pre></div>

<a id="trunkToolsTestWebKitAPITestsWebCoreURLParsercpp"></a>

<div class="modfile"><h4>Modified: trunk/Tools/TestWebKitAPI/Tests/WebCore/URLParser.cpp (205492 => 205493)</h4>

<pre class="diff"><span>

<span class="info">--- trunk/Tools/TestWebKitAPI/Tests/WebCore/URLParser.cpp        2016-09-06 18:02:28 UTC (rev 205492)

+++ trunk/Tools/TestWebKitAPI/Tests/WebCore/URLParser.cpp        2016-09-06 18:16:07 UTC (rev 205493)

</span><span class="lines">@@ -165,6 +165,9 @@

</span><span class="cx">     checkURL(&quot;file:///#fragment&quot;, {&quot;file&quot;, &quot;&quot;, &quot;&quot;, &quot;&quot;, 0, &quot;/&quot;, &quot;&quot;, &quot;fragment&quot;, &quot;file:///#fragment&quot;});

</span><span class="cx">     checkURL(&quot;file:////?query&quot;, {&quot;file&quot;, &quot;&quot;, &quot;&quot;, &quot;&quot;, 0, &quot;//&quot;, &quot;query&quot;, &quot;&quot;, &quot;file:////?query&quot;});

</span><span class="cx">     checkURL(&quot;file:////#fragment&quot;, {&quot;file&quot;, &quot;&quot;, &quot;&quot;, &quot;&quot;, 0, &quot;//&quot;, &quot;&quot;, &quot;fragment&quot;, &quot;file:////#fragment&quot;});

</span><ins>+    checkURL(&quot;http://host/A b&quot;, {&quot;http&quot;, &quot;&quot;, &quot;&quot;, &quot;host&quot;, 0, &quot;/A%20b&quot;, &quot;&quot;, &quot;&quot;, &quot;http://host/A%20b&quot;});

+    checkURL(&quot;http://host/a%20B&quot;, {&quot;http&quot;, &quot;&quot;, &quot;&quot;, &quot;host&quot;, 0, &quot;/a%20B&quot;, &quot;&quot;, &quot;&quot;, &quot;http://host/a%20B&quot;});

+    checkURL(&quot;http://host?q=@ &lt;&gt;!#fragment&quot;, {&quot;http&quot;, &quot;&quot;, &quot;&quot;, &quot;host&quot;, 0, &quot;/&quot;, &quot;q=@%20%3C%3E!&quot;, &quot;fragment&quot;, &quot;http://host/?q=@%20%3C%3E!#fragment&quot;});

</ins><span class="cx"> }

</span><span class="cx"> 

</span><span class="cx"> static void checkRelativeURL(const String&amp; urlString, const String&amp; baseURLString, const ExpectedParts&amp; parts)

</span><span class="lines">@@ -204,6 +207,8 @@

</span><span class="cx">     checkRelativeURL(&quot;http://whatwg.org/index.html&quot;, &quot;http://webkit.org/path1/path2/&quot;, {&quot;http&quot;, &quot;&quot;, &quot;&quot;, &quot;whatwg.org&quot;, 0, &quot;/index.html&quot;, &quot;&quot;, &quot;&quot;, &quot;http://whatwg.org/index.html&quot;});

</span><span class="cx">     checkRelativeURL(&quot;index.html&quot;, &quot;http://webkit.org/path1/path2/page.html?query#fragment&quot;, {&quot;http&quot;, &quot;&quot;, &quot;&quot;, &quot;webkit.org&quot;, 0, &quot;/path1/path2/index.html&quot;, &quot;&quot;, &quot;&quot;, &quot;http://webkit.org/path1/path2/index.html&quot;});

</span><span class="cx">     checkRelativeURL(&quot;//whatwg.org/index.html&quot;, &quot;https://www.webkit.org/path&quot;, {&quot;https&quot;, &quot;&quot;, &quot;&quot;, &quot;whatwg.org&quot;, 0, &quot;/index.html&quot;, &quot;&quot;, &quot;&quot;, &quot;https://whatwg.org/index.html&quot;});

</span><ins>+    checkRelativeURL(&quot;http://example\t.\norg&quot;, &quot;http://example.org/foo/bar&quot;, {&quot;http&quot;, &quot;&quot;, &quot;&quot;, &quot;example.org&quot;, 0, &quot;/&quot;, &quot;&quot;, &quot;&quot;, &quot;http://example.org/&quot;});

+    checkRelativeURL(&quot;test&quot;, &quot;file:///path1/path2&quot;, {&quot;file&quot;, &quot;&quot;, &quot;&quot;, &quot;&quot;, 0, &quot;/path1/test&quot;, &quot;&quot;, &quot;&quot;, &quot;file:///path1/test&quot;});

</ins><span class="cx"> }

</span><span class="cx"> 

</span><span class="cx"> static void checkURLDifferences(const String&amp; urlString, const ExpectedParts&amp; partsNew, const ExpectedParts&amp; partsOld)

</span><span class="lines">@@ -338,8 +343,74 @@

</span><span class="cx">     checkURLDifferences(&quot;file:path&quot;,

</span><span class="cx">         {&quot;file&quot;, &quot;&quot;, &quot;&quot;, &quot;&quot;, 0, &quot;/path&quot;, &quot;&quot;, &quot;&quot;, &quot;file:///path&quot;},

</span><span class="cx">         {&quot;file&quot;, &quot;&quot;, &quot;&quot;, &quot;&quot;, 0, &quot;path&quot;, &quot;&quot;, &quot;&quot;, &quot;file://path&quot;});

</span><ins>+    

+    // FIXME: Fix and test incomplete percent encoded characters in the middle and end of the input string.

+    // FIXME: Fix and test percent encoded upper case characters in the host.

+    checkURLDifferences(&quot;http://host%73&quot;,

+        {&quot;http&quot;, &quot;&quot;, &quot;&quot;, &quot;hosts&quot;, 0, &quot;/&quot;, &quot;&quot;, &quot;&quot;, &quot;http://hosts/&quot;},

+        {&quot;http&quot;, &quot;&quot;, &quot;&quot;, &quot;host%73&quot;, 0, &quot;/&quot;, &quot;&quot;, &quot;&quot;, &quot;http://host%73/&quot;});

+    

+    // URLParser matches Chrome and the spec, but not URL::parse or Firefox.

+    checkURLDifferences(&quot;http://host/path%2e.%2E&quot;,

+        {&quot;http&quot;, &quot;&quot;, &quot;&quot;, &quot;host&quot;, 0, &quot;/path...&quot;, &quot;&quot;, &quot;&quot;, &quot;http://host/path...&quot;},

+        {&quot;http&quot;, &quot;&quot;, &quot;&quot;, &quot;host&quot;, 0, &quot;/path%2e.%2E&quot;, &quot;&quot;, &quot;&quot;, &quot;http://host/path%2e.%2E&quot;});

</ins><span class="cx"> }

</span><span class="cx"> 

</span><ins>+TEST_F(URLParserTest, DefaultPort)

+{

+    checkURL(&quot;ftp://host:21/&quot;, {&quot;ftp&quot;, &quot;&quot;, &quot;&quot;, &quot;host&quot;, 0, &quot;/&quot;, &quot;&quot;, &quot;&quot;, &quot;ftp://host/&quot;});

+    checkURL(&quot;ftp://host:22/&quot;, {&quot;ftp&quot;, &quot;&quot;, &quot;&quot;, &quot;host&quot;, 22, &quot;/&quot;, &quot;&quot;, &quot;&quot;, &quot;ftp://host:22/&quot;});

+    checkURLDifferences(&quot;ftp://host:21&quot;,

+        {&quot;ftp&quot;, &quot;&quot;, &quot;&quot;, &quot;host&quot;, 0, &quot;/&quot;, &quot;&quot;, &quot;&quot;, &quot;ftp://host/&quot;},

+        {&quot;ftp&quot;, &quot;&quot;, &quot;&quot;, &quot;host&quot;, 0, &quot;&quot;, &quot;&quot;, &quot;&quot;, &quot;ftp://host&quot;});

+    checkURLDifferences(&quot;ftp://host:22&quot;,

+        {&quot;ftp&quot;, &quot;&quot;, &quot;&quot;, &quot;host&quot;, 22, &quot;/&quot;, &quot;&quot;, &quot;&quot;, &quot;ftp://host:22/&quot;},

+        {&quot;ftp&quot;, &quot;&quot;, &quot;&quot;, &quot;host&quot;, 22, &quot;&quot;, &quot;&quot;, &quot;&quot;, &quot;ftp://host:22&quot;});

+    

+    checkURL(&quot;gopher://host:70/&quot;, {&quot;gopher&quot;, &quot;&quot;, &quot;&quot;, &quot;host&quot;, 0, &quot;/&quot;, &quot;&quot;, &quot;&quot;, &quot;gopher://host/&quot;});

+    checkURL(&quot;gopher://host:71/&quot;, {&quot;gopher&quot;, &quot;&quot;, &quot;&quot;, &quot;host&quot;, 71, &quot;/&quot;, &quot;&quot;, &quot;&quot;, &quot;gopher://host:71/&quot;});

+    // Spec, Chrome, Firefox, and URLParser have &quot;/&quot;, URL::parse does not.

+    // Spec, Chrome, URLParser, URL::parse recognize gopher default port, Firefox does not.

+    checkURLDifferences(&quot;gopher://host:70&quot;,

+        {&quot;gopher&quot;, &quot;&quot;, &quot;&quot;, &quot;host&quot;, 0, &quot;/&quot;, &quot;&quot;, &quot;&quot;, &quot;gopher://host/&quot;},

+        {&quot;gopher&quot;, &quot;&quot;, &quot;&quot;, &quot;host&quot;, 0, &quot;&quot;, &quot;&quot;, &quot;&quot;, &quot;gopher://host&quot;});

+    checkURLDifferences(&quot;gopher://host:71&quot;,

+        {&quot;gopher&quot;, &quot;&quot;, &quot;&quot;, &quot;host&quot;, 71, &quot;/&quot;, &quot;&quot;, &quot;&quot;, &quot;gopher://host:71/&quot;},

+        {&quot;gopher&quot;, &quot;&quot;, &quot;&quot;, &quot;host&quot;, 71, &quot;&quot;, &quot;&quot;, &quot;&quot;, &quot;gopher://host:71&quot;});

+    

+    checkURL(&quot;http://host:80&quot;, {&quot;http&quot;, &quot;&quot;, &quot;&quot;, &quot;host&quot;, 0, &quot;/&quot;, &quot;&quot;, &quot;&quot;, &quot;http://host/&quot;});

+    checkURL(&quot;http://host:80/&quot;, {&quot;http&quot;, &quot;&quot;, &quot;&quot;, &quot;host&quot;, 0, &quot;/&quot;, &quot;&quot;, &quot;&quot;, &quot;http://host/&quot;});

+    checkURL(&quot;http://host:81&quot;, {&quot;http&quot;, &quot;&quot;, &quot;&quot;, &quot;host&quot;, 81, &quot;/&quot;, &quot;&quot;, &quot;&quot;, &quot;http://host:81/&quot;});

+    checkURL(&quot;http://host:81/&quot;, {&quot;http&quot;, &quot;&quot;, &quot;&quot;, &quot;host&quot;, 81, &quot;/&quot;, &quot;&quot;, &quot;&quot;, &quot;http://host:81/&quot;});

+    

+    checkURL(&quot;https://host:443&quot;, {&quot;https&quot;, &quot;&quot;, &quot;&quot;, &quot;host&quot;, 0, &quot;/&quot;, &quot;&quot;, &quot;&quot;, &quot;https://host/&quot;});

+    checkURL(&quot;https://host:443/&quot;, {&quot;https&quot;, &quot;&quot;, &quot;&quot;, &quot;host&quot;, 0, &quot;/&quot;, &quot;&quot;, &quot;&quot;, &quot;https://host/&quot;});

+    checkURL(&quot;https://host:444&quot;, {&quot;https&quot;, &quot;&quot;, &quot;&quot;, &quot;host&quot;, 444, &quot;/&quot;, &quot;&quot;, &quot;&quot;, &quot;https://host:444/&quot;});

+    checkURL(&quot;https://host:444/&quot;, {&quot;https&quot;, &quot;&quot;, &quot;&quot;, &quot;host&quot;, 444, &quot;/&quot;, &quot;&quot;, &quot;&quot;, &quot;https://host:444/&quot;});

+    

+    checkURL(&quot;ws://host:80/&quot;, {&quot;ws&quot;, &quot;&quot;, &quot;&quot;, &quot;host&quot;, 0, &quot;/&quot;, &quot;&quot;, &quot;&quot;, &quot;ws://host/&quot;});

+    checkURL(&quot;ws://host:81/&quot;, {&quot;ws&quot;, &quot;&quot;, &quot;&quot;, &quot;host&quot;, 81, &quot;/&quot;, &quot;&quot;, &quot;&quot;, &quot;ws://host:81/&quot;});

+    // URLParser matches Chrome and Firefox, but not URL::parse

+    checkURLDifferences(&quot;ws://host:80&quot;,

+        {&quot;ws&quot;, &quot;&quot;, &quot;&quot;, &quot;host&quot;, 0, &quot;/&quot;, &quot;&quot;, &quot;&quot;, &quot;ws://host/&quot;},

+        {&quot;ws&quot;, &quot;&quot;, &quot;&quot;, &quot;host&quot;, 0, &quot;&quot;, &quot;&quot;, &quot;&quot;, &quot;ws://host&quot;});

+    checkURLDifferences(&quot;ws://host:81&quot;,

+        {&quot;ws&quot;, &quot;&quot;, &quot;&quot;, &quot;host&quot;, 81, &quot;/&quot;, &quot;&quot;, &quot;&quot;, &quot;ws://host:81/&quot;},

+        {&quot;ws&quot;, &quot;&quot;, &quot;&quot;, &quot;host&quot;, 81, &quot;&quot;, &quot;&quot;, &quot;&quot;, &quot;ws://host:81&quot;});

+    

+    checkURL(&quot;wss://host:443/&quot;, {&quot;wss&quot;, &quot;&quot;, &quot;&quot;, &quot;host&quot;, 0, &quot;/&quot;, &quot;&quot;, &quot;&quot;, &quot;wss://host/&quot;});

+    checkURL(&quot;wss://host:444/&quot;, {&quot;wss&quot;, &quot;&quot;, &quot;&quot;, &quot;host&quot;, 444, &quot;/&quot;, &quot;&quot;, &quot;&quot;, &quot;wss://host:444/&quot;});

+    // URLParser matches Chrome and Firefox, but not URL::parse

+    checkURLDifferences(&quot;wss://host:443&quot;,

+        {&quot;wss&quot;, &quot;&quot;, &quot;&quot;, &quot;host&quot;, 0, &quot;/&quot;, &quot;&quot;, &quot;&quot;, &quot;wss://host/&quot;},

+        {&quot;wss&quot;, &quot;&quot;, &quot;&quot;, &quot;host&quot;, 0, &quot;&quot;, &quot;&quot;, &quot;&quot;, &quot;wss://host&quot;});

+    checkURLDifferences(&quot;wss://host:444&quot;,

+        {&quot;wss&quot;, &quot;&quot;, &quot;&quot;, &quot;host&quot;, 444, &quot;/&quot;, &quot;&quot;, &quot;&quot;, &quot;wss://host:444/&quot;},

+        {&quot;wss&quot;, &quot;&quot;, &quot;&quot;, &quot;host&quot;, 444, &quot;&quot;, &quot;&quot;, &quot;&quot;, &quot;wss://host:444&quot;});

+    

+    // FIXME: Fix and check unknown schemes with ports, as well as ftps.

+}

+    

</ins><span class="cx"> static void shouldFail(const String&amp; urlString)

</span><span class="cx"> {

</span><span class="cx">     URLParser parser;

</span></span></pre>

</div>

</div>

</body>

</html>