<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
"http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head><meta http-equiv="content-type" content="text/html; charset=utf-8" />
<title>[205493] trunk</title>
</head>
<body>

<style type="text/css"><!--
#msg dl.meta { border: 1px #006 solid; background: #369; padding: 6px; color: #fff; }
#msg dl.meta dt { float: left; width: 6em; font-weight: bold; }
#msg dt:after { content:':';}
#msg dl, #msg dt, #msg ul, #msg li, #header, #footer, #logmsg { font-family: verdana,arial,helvetica,sans-serif; font-size: 10pt;  }
#msg dl a { font-weight: bold}
#msg dl a:link    { color:#fc3; }
#msg dl a:active  { color:#ff0; }
#msg dl a:visited { color:#cc6; }
h3 { font-family: verdana,arial,helvetica,sans-serif; font-size: 10pt; font-weight: bold; }
#msg pre { overflow: auto; background: #ffc; border: 1px #fa0 solid; padding: 6px; }
#logmsg { background: #ffc; border: 1px #fa0 solid; padding: 1em 1em 0 1em; }
#logmsg p, #logmsg pre, #logmsg blockquote { margin: 0 0 1em 0; }
#logmsg p, #logmsg li, #logmsg dt, #logmsg dd { line-height: 14pt; }
#logmsg h1, #logmsg h2, #logmsg h3, #logmsg h4, #logmsg h5, #logmsg h6 { margin: .5em 0; }
#logmsg h1:first-child, #logmsg h2:first-child, #logmsg h3:first-child, #logmsg h4:first-child, #logmsg h5:first-child, #logmsg h6:first-child { margin-top: 0; }
#logmsg ul, #logmsg ol { padding: 0; list-style-position: inside; margin: 0 0 0 1em; }
#logmsg ul { text-indent: -1em; padding-left: 1em; }#logmsg ol { text-indent: -1.5em; padding-left: 1.5em; }
#logmsg > ul, #logmsg > ol { margin: 0 0 1em 0; }
#logmsg pre { background: #eee; padding: 1em; }
#logmsg blockquote { border: 1px solid #fa0; border-left-width: 10px; padding: 1em 1em 0 1em; background: white;}
#logmsg dl { margin: 0; }
#logmsg dt { font-weight: bold; }
#logmsg dd { margin: 0; padding: 0 0 0.5em 0; }
#logmsg dd:before { content:'\00bb';}
#logmsg table { border-spacing: 0px; border-collapse: collapse; border-top: 4px solid #fa0; border-bottom: 1px solid #fa0; background: #fff; }
#logmsg table th { text-align: left; font-weight: normal; padding: 0.2em 0.5em; border-top: 1px dotted #fa0; }
#logmsg table td { text-align: right; border-top: 1px dotted #fa0; padding: 0.2em 0.5em; }
#logmsg table thead th { text-align: center; border-bottom: 1px solid #fa0; }
#logmsg table th.Corner { text-align: left; }
#logmsg hr { border: none 0; border-top: 2px dashed #fa0; height: 1px; }
#header, #footer { color: #fff; background: #636; border: 1px #300 solid; padding: 6px; }
#patch { width: 100%; }
#patch h4 {font-family: verdana,arial,helvetica,sans-serif;font-size:10pt;padding:8px;background:#369;color:#fff;margin:0;}
#patch .propset h4, #patch .binary h4 {margin:0;}
#patch pre {padding:0;line-height:1.2em;margin:0;}
#patch .diff {width:100%;background:#eee;padding: 0 0 10px 0;overflow:auto;}
#patch .propset .diff, #patch .binary .diff  {padding:10px 0;}
#patch span {display:block;padding:0 10px;}
#patch .modfile, #patch .addfile, #patch .delfile, #patch .propset, #patch .binary, #patch .copfile {border:1px solid #ccc;margin:10px 0;}
#patch ins {background:#dfd;text-decoration:none;display:block;padding:0 10px;}
#patch del {background:#fdd;text-decoration:none;display:block;padding:0 10px;}
#patch .lines, .info {color:#888;background:#fff;}
--></style>
<div id="msg">
<dl class="meta">
<dt>Revision</dt> <dd><a href="http://trac.webkit.org/projects/webkit/changeset/205493">205493</a></dd>
<dt>Author</dt> <dd>achristensen@apple.com</dd>
<dt>Date</dt> <dd>2016-09-06 11:16:07 -0700 (Tue, 06 Sep 2016)</dd>
</dl>

<h3>Log Message</h3>
<pre>Implement relative file urls and begin implementing character encoding in URLParser
https://bugs.webkit.org/show_bug.cgi?id=161618

Reviewed by Tim Horton.

Source/WebCore:

Covered by new API tests.
Also, this is a significant step towards passing the URL web platform tests when using the URLParser,
which is still off by default.

* platform/URLParser.cpp:
(WebCore::isInSimpleEncodeSet):
(WebCore::isInDefaultEncodeSet):
(WebCore::isInUserInfoEncodeSet):
(WebCore::isInvalidDomainCharacter):
(WebCore::shouldCopyFileURL):
(WebCore::percentEncode):
(WebCore::utf8PercentEncode):
(WebCore::encodeQuery):
(WebCore::isDefaultPort):
(WebCore::isPercentEncodedDot):
(WebCore::URLParser::parse):
(WebCore::percentDecode):
(WebCore::domainToASCII):
(WebCore::hasInvalidDomainCharacter):
(WebCore::URLParser::parsePort):
(WebCore::URLParser::parseHost):
(WebCore::isTabOrNewline): Deleted.
* platform/URLParser.h:

Tools:

* TestWebKitAPI/Tests/WebCore/URLParser.cpp:
(TestWebKitAPI::TEST_F):</pre>
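<p>The path encoding named in the log message (isInSimpleEncodeSet, isInDefaultEncodeSet, percentEncode, utf8PercentEncode) can be sketched outside WebKit as follows. This is a minimal illustration, not the committed code. It assumes std::string and std::u32string in place of StringBuilder and StringView, and a hand-written UTF-8 encoder in place of ICU's U8_APPEND. The names encodePath and runEncodeExamples are hypothetical.</p>
<pre><![CDATA[
```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Stand-ins for the patch's encode-set predicates, written against plain
// char32_t instead of WebKit's templated CharacterType (an assumption).
static bool isC0Control(char32_t c) { return c <= 0x1F; }
static bool isInSimpleEncodeSet(char32_t c) { return isC0Control(c) || c > 0x7E; }
static bool isInDefaultEncodeSet(char32_t c)
{
    return isInSimpleEncodeSet(c) || c == 0x20 || c == '"' || c == '#' || c == '<'
        || c == '>' || c == '?' || c == '`' || c == '{' || c == '}';
}

// Appends one byte as "%XX" using uppercase hexadecimal digits.
static void percentEncode(uint8_t byte, std::string& out)
{
    static const char digits[] = "0123456789ABCDEF";
    out += '%';
    out += digits[byte >> 4];
    out += digits[byte & 0xF];
}

// Percent-encodes every UTF-8 byte of a code point that is in the default
// encode set; other code points are ASCII by construction and copied as-is.
// Assumes codePoint is a valid Unicode scalar value (no error checking).
static void utf8PercentEncode(char32_t codePoint, std::string& out)
{
    if (!isInDefaultEncodeSet(codePoint)) {
        out += static_cast<char>(codePoint);
        return;
    }
    if (codePoint < 0x80)
        percentEncode(static_cast<uint8_t>(codePoint), out);
    else if (codePoint < 0x800) {
        percentEncode(static_cast<uint8_t>(0xC0 | (codePoint >> 6)), out);
        percentEncode(static_cast<uint8_t>(0x80 | (codePoint & 0x3F)), out);
    } else if (codePoint < 0x10000) {
        percentEncode(static_cast<uint8_t>(0xE0 | (codePoint >> 12)), out);
        percentEncode(static_cast<uint8_t>(0x80 | ((codePoint >> 6) & 0x3F)), out);
        percentEncode(static_cast<uint8_t>(0x80 | (codePoint & 0x3F)), out);
    } else {
        percentEncode(static_cast<uint8_t>(0xF0 | (codePoint >> 18)), out);
        percentEncode(static_cast<uint8_t>(0x80 | ((codePoint >> 12) & 0x3F)), out);
        percentEncode(static_cast<uint8_t>(0x80 | ((codePoint >> 6) & 0x3F)), out);
        percentEncode(static_cast<uint8_t>(0x80 | (codePoint & 0x3F)), out);
    }
}

// Hypothetical helper: encodes a whole path, one code point at a time.
std::string encodePath(const std::u32string& path)
{
    std::string out;
    for (char32_t c : path)
        utf8PercentEncode(c, out);
    return out;
}

// Hypothetical usage checks; the first two mirror the new API test inputs.
void runEncodeExamples()
{
    assert(encodePath(U"/A b") == "/A%20b");
    assert(encodePath(U"/a%20B") == "/a%20B");
    assert(encodePath(U"/\u00E9") == "/%C3%A9");
}
```
]]></pre>
<p>In this sketch a space becomes %20 while an existing %20 is left alone, because '%' is not in the default encode set. That matches the expectations for http://host/A b and http://host/a%20B in the new API tests.</p>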

<h3>Modified Paths</h3>
<ul>
<li><a href="#trunkSourceWebCoreChangeLog">trunk/Source/WebCore/ChangeLog</a></li>
<li><a href="#trunkSourceWebCoreplatformURLParsercpp">trunk/Source/WebCore/platform/URLParser.cpp</a></li>
<li><a href="#trunkSourceWebCoreplatformURLParserh">trunk/Source/WebCore/platform/URLParser.h</a></li>
<li><a href="#trunkToolsChangeLog">trunk/Tools/ChangeLog</a></li>
<li><a href="#trunkToolsTestWebKitAPITestsWebCoreURLParsercpp">trunk/Tools/TestWebKitAPI/Tests/WebCore/URLParser.cpp</a></li>
</ul>
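<p>The host percent-decoding step added in URLParser.cpp (percentDecode) can be illustrated with a standalone sketch. It assumes std::string in place of WTF::String, and the helper names hexValue and runDecodeExamples are hypothetical. The patch guards the two-digit lookahead with i &lt; length - 2 on a size_t, and that subtraction wraps around for inputs shorter than two bytes. The sketch writes the same guard as i + 2 &lt; length, which does not wrap. This is one possible formulation, not the committed code.</p>
<pre><![CDATA[
```cpp
#include <cassert>
#include <cstddef>
#include <string>

static bool isASCIIHexDigit(char c)
{
    return (c >= '0' && c <= '9') || (c >= 'a' && c <= 'f') || (c >= 'A' && c <= 'F');
}

// Hypothetical helper; the caller must pass an ASCII hex digit.
static int hexValue(char c)
{
    if (c >= '0' && c <= '9')
        return c - '0';
    if (c >= 'a' && c <= 'f')
        return c - 'a' + 10;
    return c - 'A' + 10;
}

// Replaces each "%XY" (X and Y hex digits) with the byte it denotes.
// A '%' that is not followed by two hex digits is copied unchanged.
std::string percentDecode(const std::string& input)
{
    std::string output;
    output.reserve(input.size());
    const size_t length = input.size();
    for (size_t i = 0; i < length; ++i) {
        // i + 2 < length cannot wrap, unlike i < length - 2 on short inputs.
        if (input[i] == '%' && i + 2 < length
            && isASCIIHexDigit(input[i + 1]) && isASCIIHexDigit(input[i + 2])) {
            output += static_cast<char>(hexValue(input[i + 1]) * 16 + hexValue(input[i + 2]));
            i += 2;
        } else
            output += input[i];
    }
    return output;
}

// Hypothetical usage checks; the first mirrors the host%73 API test.
void runDecodeExamples()
{
    assert(percentDecode("host%73") == "hosts");
    assert(percentDecode("%") == "%");
    assert(percentDecode("%4") == "%4");
}
```
]]></pre>
<p>With this guard, host%73 decodes to hosts as in the new checkURLDifferences expectation, and truncated sequences such as a lone % pass through untouched without reading past the end of the input.</p>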

</div>
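<p>The port handling introduced by parsePort and isDefaultPort can be sketched in isolation. The sketch assumes std::string and std::map in place of WTF::String and HashMap, takes the scheme and port text as separate arguments rather than reading them from the parser's buffer, and uses the hypothetical names serializePort and runPortExamples. It also looks the scheme up explicitly so that an unknown scheme never counts as having a default port, which is a choice of this sketch rather than something taken from the patch.</p>
<pre><![CDATA[
```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Same scheme-to-port table as the patch's isDefaultPort.
static bool isDefaultPort(const std::string& scheme, uint16_t port)
{
    static const std::map<std::string, uint16_t> defaultPorts = {
        {"ftp", 21}, {"gopher", 70}, {"http", 80},
        {"https", 443}, {"ws", 80}, {"wss", 443}};
    auto it = defaultPorts.find(scheme);
    return it != defaultPorts.end() && it->second == port;
}

// Hypothetical helper: parses the text after the host's ':' and writes the
// serialized suffix to out ("" for a default port, ":N" otherwise).
// Returns false for a non-digit character or a value above 65535.
// Assumes portText is non-empty, as the patch asserts for its iterator.
bool serializePort(const std::string& scheme, const std::string& portText, std::string& out)
{
    uint32_t port = 0;
    for (char c : portText) {
        if (c == '\t' || c == '\n' || c == '\r')
            continue; // tabs and newlines are skipped, as in the patch
        if (c < '0' || c > '9')
            return false;
        port = port * 10 + static_cast<uint32_t>(c - '0');
        if (port > 0xFFFF)
            return false;
    }
    if (isDefaultPort(scheme, static_cast<uint16_t>(port)))
        out.clear();
    else
        out = ":" + std::to_string(port);
    return true;
}

// Hypothetical usage checks.
void runPortExamples()
{
    std::string suffix;
    assert(serializePort("http", "80", suffix) && suffix.empty());
    assert(serializePort("http", "8080", suffix) && suffix == ":8080");
    assert(!serializePort("http", "65536", suffix));
}
```
]]></pre>
<p>Under these assumptions http with port 80 serializes without a port, http with port 8080 keeps ":8080", and out-of-range or non-numeric ports make the parse fail, which corresponds to parseHost returning false in the patch.</p>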
<div id="patch">
<h3>Diff</h3>
<a id="trunkSourceWebCoreChangeLog"></a>
<div class="modfile"><h4>Modified: trunk/Source/WebCore/ChangeLog (205492 => 205493)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/Source/WebCore/ChangeLog        2016-09-06 18:02:28 UTC (rev 205492)
+++ trunk/Source/WebCore/ChangeLog        2016-09-06 18:16:07 UTC (rev 205493)
</span><span class="lines">@@ -1,3 +1,34 @@
</span><ins>+2016-09-05  Alex Christensen  &lt;achristensen@webkit.org&gt;
+
+        Implement relative file urls and begin implementing character encoding in URLParser
+        https://bugs.webkit.org/show_bug.cgi?id=161618
+
+        Reviewed by Tim Horton.
+
+        Covered by new API tests.
+        Also, this is a significant step towards passing the URL web platform tests when using the URLParser,
+        which is still off by default.
+
+        * platform/URLParser.cpp:
+        (WebCore::isInSimpleEncodeSet):
+        (WebCore::isInDefaultEncodeSet):
+        (WebCore::isInUserInfoEncodeSet):
+        (WebCore::isInvalidDomainCharacter):
+        (WebCore::shouldCopyFileURL):
+        (WebCore::percentEncode):
+        (WebCore::utf8PercentEncode):
+        (WebCore::encodeQuery):
+        (WebCore::isDefaultPort):
+        (WebCore::isPercentEncodedDot):
+        (WebCore::URLParser::parse):
+        (WebCore::percentDecode):
+        (WebCore::domainToASCII):
+        (WebCore::hasInvalidDomainCharacter):
+        (WebCore::URLParser::parsePort):
+        (WebCore::URLParser::parseHost):
+        (WebCore::isTabOrNewline): Deleted.
+        * platform/URLParser.h:
+
</ins><span class="cx"> 2016-09-06  Daniel Bates  &lt;dabates@apple.com&gt;
</span><span class="cx"> 
</span><span class="cx">         Fix the Apple-internal build following &lt;https://trac.webkit.org/changeset/205488&gt;
</span></span></pre></div>
<a id="trunkSourceWebCoreplatformURLParsercpp"></a>
<div class="modfile"><h4>Modified: trunk/Source/WebCore/platform/URLParser.cpp (205492 => 205493)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/Source/WebCore/platform/URLParser.cpp        2016-09-06 18:02:28 UTC (rev 205492)
+++ trunk/Source/WebCore/platform/URLParser.cpp        2016-09-06 18:16:07 UTC (rev 205493)
</span><span class="lines">@@ -28,7 +28,10 @@
</span><span class="cx"> 
</span><span class="cx"> #include &quot;Logging.h&quot;
</span><span class="cx"> #include &lt;array&gt;
</span><ins>+#include &lt;wtf/HashMap.h&gt;
+#include &lt;wtf/NeverDestroyed.h&gt;
</ins><span class="cx"> #include &lt;wtf/text/StringBuilder.h&gt;
</span><ins>+#include &lt;wtf/text/StringHash.h&gt;
</ins><span class="cx"> 
</span><span class="cx"> namespace WebCore {
</span><span class="cx"> 
</span><span class="lines">@@ -35,6 +38,10 @@
</span><span class="cx"> template&lt;typename CharacterType&gt; static bool isC0Control(CharacterType character) { return character &lt;= 0x0001F; }
</span><span class="cx"> template&lt;typename CharacterType&gt; static bool isC0ControlOrSpace(CharacterType character) { return isC0Control(character) || character == 0x0020; }
</span><span class="cx"> template&lt;typename CharacterType&gt; static bool isTabOrNewline(CharacterType character) { return character == 0x0009 || character == 0x000A || character == 0x000D; }
</span><ins>+template&lt;typename CharacterType&gt; static bool isInSimpleEncodeSet(CharacterType character) { return isC0Control(character) || character &gt; 0x007E; }
+template&lt;typename CharacterType&gt; static bool isInDefaultEncodeSet(CharacterType character) { return isInSimpleEncodeSet(character) || character == 0x0020 || character == '&quot;' || character == '#' || character == '&lt;' || character == '&gt;' || character == '?' || character == '`' || character == '{' || character == '}'; }
+template&lt;typename CharacterType&gt; static bool isInUserInfoEncodeSet(CharacterType character) { return isInDefaultEncodeSet(character) || character == '/' || character == ':' || character == ';' || character == '=' || character == '@' || character == '[' || character == '\\' || character == ']' || character == '^' || character == '|'; }
+template&lt;typename CharacterType&gt; static bool isInvalidDomainCharacter(CharacterType character) { return character == 0x0000 || character == 0x0009 || character == 0x000A || character == 0x000D || character == 0x0020 || character == '#' || character == '%' || character == '/' || character == ':' || character == '?' || character == '@' || character == '[' || character == '\\' || character == ']'; }
</ins><span class="cx">     
</span><span class="cx"> static bool isWindowsDriveLetter(StringView::CodePoints::Iterator iterator, const StringView::CodePoints::Iterator&amp; end)
</span><span class="cx"> {
</span><span class="lines">@@ -63,9 +70,74 @@
</span><span class="cx">     if (iterator == end)
</span><span class="cx">         return true;
</span><span class="cx">     ++iterator;
</span><del>-    return *iterator != '/' &amp;&amp; *iterator != '\\' &amp;&amp; *iterator != '?' &amp;&amp; *iterator == '#';
</del><ins>+    if (iterator == end)
+        return true;
+    return *iterator != '/' &amp;&amp; *iterator != '\\' &amp;&amp; *iterator != '?' &amp;&amp; *iterator != '#';
</ins><span class="cx"> }
</span><span class="cx"> 
</span><ins>+static void percentEncode(uint8_t byte, StringBuilder&amp; builder)
+{
+    builder.append('%');
+    builder.append(upperNibbleToASCIIHexDigit(byte));
+    builder.append(lowerNibbleToASCIIHexDigit(byte));
+}
+
+static void utf8PercentEncode(UChar32 codePoint, StringBuilder&amp; builder, bool(*isInCodeSet)(UChar32))
+{
+    if (isInCodeSet(codePoint)) {
+        uint8_t buffer[U8_MAX_LENGTH];
+        int32_t offset = 0;
+        UBool error = false;
+        U8_APPEND(buffer, offset, U8_MAX_LENGTH, codePoint, error);
+        // FIXME: Check error.
+        for (int32_t i = 0; i &lt; offset; ++i)
+            percentEncode(buffer[i], builder);
+    } else
+        builder.append(codePoint);
+}
+
+static bool shouldPercentEncodeQueryByte(uint8_t byte)
+{
+    if (byte &lt; 0x21)
+        return true;
+    if (byte &gt; 0x7E)
+        return true;
+    if (byte == 0x22)
+        return true;
+    if (byte == 0x23)
+        return true;
+    if (byte == 0x3C)
+        return true;
+    return byte == 0x3E;
+}
+
+static void encodeQuery(const StringBuilder&amp; source, StringBuilder&amp; destination, const TextEncoding&amp; encoding)
+{
+    // FIXME: It is unclear in the spec what to do when encoding fails. The behavior should be specified and tested.
+    CString encoded = encoding.encode(StringView(source.toStringPreserveCapacity()), URLEncodedEntitiesForUnencodables);
+    const char* data = encoded.data();
+    size_t length = encoded.length();
+    for (size_t i = 0; i &lt; length; ++i) {
+        uint8_t byte = data[i];
+        if (shouldPercentEncodeQueryByte(byte))
+            percentEncode(byte, destination);
+        else
+            destination.append(byte);
+    }
+}
+
+static bool isDefaultPort(const String&amp; scheme, uint16_t port)
+{
+    static NeverDestroyed&lt;HashMap&lt;String, uint16_t&gt;&gt; defaultPorts(HashMap&lt;String, uint16_t&gt;({
+        {&quot;ftp&quot;, 21},
+        {&quot;gopher&quot;, 70},
+        {&quot;http&quot;, 80},
+        {&quot;https&quot;, 443},
+        {&quot;ws&quot;, 80},
+        {&quot;wss&quot;, 443}}));
+    return defaultPorts.get().get(scheme) == port;
+}
+
</ins><span class="cx"> static bool isSpecialScheme(const String&amp; scheme)
</span><span class="cx"> {
</span><span class="cx">     return scheme == &quot;ftp&quot;
</span><span class="lines">@@ -159,6 +231,23 @@
</span><span class="cx"> 
</span><span class="cx"> static const char* dotASCIICode = &quot;2e&quot;;
</span><span class="cx"> 
</span><ins>+static bool isPercentEncodedDot(StringView::CodePoints::Iterator c, const StringView::CodePoints::Iterator&amp; end)
+{
+    if (c == end)
+        return false;
+    if (*c != '%')
+        return false;
+    ++c;
+    if (c == end)
+        return false;
+    if (*c != dotASCIICode[0])
+        return false;
+    ++c;
+    if (c == end)
+        return false;
+    return toASCIILower(*c) == dotASCIICode[1];
+}
+
</ins><span class="cx"> static bool isSingleDotPathSegment(StringView::CodePoints::Iterator c, const StringView::CodePoints::Iterator&amp; end)
</span><span class="cx"> {
</span><span class="cx">     if (c == end)
</span><span class="lines">@@ -261,12 +350,15 @@
</span><span class="cx">     m_buffer.resize(m_url.m_pathAfterLastSlash);
</span><span class="cx"> }
</span><span class="cx"> 
</span><del>-URL URLParser::parse(const String&amp; input, const URL&amp; base, const TextEncoding&amp;)
</del><ins>+URL URLParser::parse(const String&amp; input, const URL&amp; base, const TextEncoding&amp; encoding)
</ins><span class="cx"> {
</span><span class="cx">     LOG(URLParser, &quot;Parsing URL &lt;%s&gt; base &lt;%s&gt;&quot;, input.utf8().data(), base.string().utf8().data());
</span><span class="cx">     m_url = { };
</span><span class="cx">     m_buffer.clear();
</span><span class="cx">     m_buffer.reserveCapacity(input.length());
</span><ins>+    
+    // FIXME: We shouldn't need to allocate another buffer for this.
+    StringBuilder queryBuffer;
</ins><span class="cx"> 
</span><span class="cx">     auto codePoints = StringView(input).codePoints();
</span><span class="cx">     auto c = codePoints.begin();
</span><span class="lines">@@ -347,6 +439,8 @@
</span><span class="cx">                 break;
</span><span class="cx">             }
</span><span class="cx">             ++c;
</span><ins>+            while (c != end &amp;&amp; isTabOrNewline(*c))
+                ++c;
</ins><span class="cx">             if (c == end) {
</span><span class="cx">                 m_buffer.clear();
</span><span class="cx">                 state = State::NoScheme;
</span><span class="lines">@@ -380,6 +474,7 @@
</span><span class="cx">                     return { };
</span><span class="cx">             } else if (base.protocol() == &quot;file&quot;) {
</span><span class="cx">                 copyURLPartsUntil(base, URLPart::SchemeEnd);
</span><ins>+                m_buffer.append(':');
</ins><span class="cx">                 state = State::File;
</span><span class="cx">             } else
</span><span class="cx">                 state = State::Relative;
</span><span class="lines">@@ -479,12 +574,20 @@
</span><span class="cx">             if (*c == '@') {
</span><span class="cx">                 parseAuthority(authorityOrHostBegin, c);
</span><span class="cx">                 ++c;
</span><ins>+                while (c != end &amp;&amp; isTabOrNewline(*c))
+                    ++c;
</ins><span class="cx">                 authorityOrHostBegin = c;
</span><span class="cx">                 state = State::Host;
</span><ins>+                break;
</ins><span class="cx">             } else if (*c == '/' || *c == '?' || *c == '#') {
</span><span class="cx">                 m_url.m_userEnd = m_buffer.length();
</span><span class="cx">                 m_url.m_passwordEnd = m_url.m_userEnd;
</span><del>-                parseHost(authorityOrHostBegin, c);
</del><ins>+                if (!parseHost(authorityOrHostBegin, c))
+                    return { };
+                if (*c != '/') {
+                    m_buffer.append('/');
+                    m_url.m_pathAfterLastSlash = m_buffer.length();
+                }
</ins><span class="cx">                 state = State::Path;
</span><span class="cx">                 break;
</span><span class="cx">             }
</span><span class="lines">@@ -493,7 +596,8 @@
</span><span class="cx">         case State::Host:
</span><span class="cx">             LOG_STATE(&quot;Host&quot;);
</span><span class="cx">             if (*c == '/' || *c == '?' || *c == '#') {
</span><del>-                parseHost(authorityOrHostBegin, c);
</del><ins>+                if (!parseHost(authorityOrHostBegin, c))
+                    return { };
</ins><span class="cx">                 state = State::Path;
</span><span class="cx">                 break;
</span><span class="cx">             }
</span><span class="lines">@@ -509,7 +613,7 @@
</span><span class="cx">                 ++c;
</span><span class="cx">                 break;
</span><span class="cx">             case '?':
</span><del>-                if (!base.isNull() &amp;&amp; base.protocol() == &quot;file&quot;)
</del><ins>+                if (!base.isNull() &amp;&amp; base.protocolIs(&quot;file&quot;))
</ins><span class="cx">                     copyURLPartsUntil(base, URLPart::PathEnd);
</span><span class="cx">                 m_buffer.append(&quot;///?&quot;);
</span><span class="cx">                 m_url.m_userStart = m_buffer.length() - 2;
</span><span class="lines">@@ -523,7 +627,7 @@
</span><span class="cx">                 ++c;
</span><span class="cx">                 break;
</span><span class="cx">             case '#':
</span><del>-                if (!base.isNull() &amp;&amp; base.protocol() == &quot;file&quot;)
</del><ins>+                if (!base.isNull() &amp;&amp; base.protocolIs(&quot;file&quot;))
</ins><span class="cx">                     copyURLPartsUntil(base, URLPart::QueryEnd);
</span><span class="cx">                 m_buffer.append(&quot;///#&quot;);
</span><span class="cx">                 m_url.m_userStart = m_buffer.length() - 2;
</span><span class="lines">@@ -538,10 +642,9 @@
</span><span class="cx">                 ++c;
</span><span class="cx">                 break;
</span><span class="cx">             default:
</span><del>-                if (shouldCopyFileURL(c, end)) {
-                    copyURLPartsUntil(base, URLPart::PathEnd);
-                    popPath();
-                } else {
</del><ins>+                if (!base.isNull() &amp;&amp; base.protocolIs(&quot;file&quot;) &amp;&amp; shouldCopyFileURL(c, end))
+                    copyURLPartsUntil(base, URLPart::PathAfterLastSlash);
+                else {
</ins><span class="cx">                     m_buffer.append(&quot;///&quot;);
</span><span class="cx">                     m_url.m_userStart = m_buffer.length() - 1;
</span><span class="cx">                     m_url.m_userEnd = m_url.m_userStart;
</span><span class="lines">@@ -667,8 +770,17 @@
</span><span class="cx">                 state = State::Fragment;
</span><span class="cx">                 break;
</span><span class="cx">             }
</span><del>-            // FIXME: Percent encode c
-            m_buffer.append(*c);
</del><ins>+            if (isPercentEncodedDot(c, end)) {
+                m_buffer.append('.');
+                ASSERT(*c == '%');
+                ++c;
+                ASSERT(*c == dotASCIICode[0]);
+                ++c;
+                ASSERT(toASCIILower(*c) == dotASCIICode[1]);
+                ++c;
+                break;
+            }
+            utf8PercentEncode(*c, m_buffer, isInDefaultEncodeSet);
</ins><span class="cx">             ++c;
</span><span class="cx">             break;
</span><span class="cx">         case State::CannotBeABaseURLPath:
</span><span class="lines">@@ -688,11 +800,12 @@
</span><span class="cx">         case State::Query:
</span><span class="cx">             LOG_STATE(&quot;Query&quot;);
</span><span class="cx">             if (*c == '#') {
</span><ins>+                encodeQuery(queryBuffer, m_buffer, encoding);
</ins><span class="cx">                 m_url.m_queryEnd = m_buffer.length();
</span><span class="cx">                 state = State::Fragment;
</span><span class="cx">                 break;
</span><span class="cx">             }
</span><del>-            m_buffer.append(*c);
</del><ins>+            queryBuffer.append(*c);
</ins><span class="cx">             ++c;
</span><span class="cx">             break;
</span><span class="cx">         case State::Fragment:
</span><span class="lines">@@ -743,7 +856,8 @@
</span><span class="cx">     case State::Host:
</span><span class="cx">         if (state == State::Host)
</span><span class="cx">             LOG_FINAL_STATE(&quot;Host&quot;);
</span><del>-        parseHost(authorityOrHostBegin, end);
</del><ins>+        if (!parseHost(authorityOrHostBegin, end))
+            return { };
</ins><span class="cx">         m_buffer.append('/');
</span><span class="cx">         m_url.m_pathEnd = m_url.m_portEnd + 1;
</span><span class="cx">         m_url.m_pathAfterLastSlash = m_url.m_pathEnd;
</span><span class="lines">@@ -832,6 +946,7 @@
</span><span class="cx">         break;
</span><span class="cx">     case State::Query:
</span><span class="cx">         LOG_FINAL_STATE(&quot;Query&quot;);
</span><ins>+        encodeQuery(queryBuffer, m_buffer, encoding);
</ins><span class="cx">         m_url.m_queryEnd = m_buffer.length();
</span><span class="cx">         m_url.m_fragmentEnd = m_url.m_queryEnd;
</span><span class="cx">         break;
</span><span class="lines">@@ -1131,6 +1246,73 @@
</span><span class="cx">     return address;
</span><span class="cx"> }
</span><span class="cx"> 
</span><ins>+static String percentDecode(const String&amp; input)
+{
+    StringBuilder output;
+    RELEASE_ASSERT(input.is8Bit());
+    const LChar* inputBytes = input.characters8();
+    size_t length = input.length();
+    
+    for (size_t i = 0; i &lt; length; ++i) {
+        uint8_t byte = inputBytes[i];
+        if (byte != '%')
+            output.append(byte);
+        else if (i &lt; length - 2) {
+            if (isASCIIHexDigit(inputBytes[i + 1]) &amp;&amp; isASCIIHexDigit(inputBytes[i + 2])) {
+                output.append(toASCIIHexValue(inputBytes[i + 1], inputBytes[i + 2]));
+                i += 2;
+            } else
+                output.append(byte);
+        } else
+            output.append(byte);
+    }
+    return output.toStringPreserveCapacity();
+}
+
+static Optional&lt;String&gt; domainToASCII(const String&amp; domain)
+{
+    // FIXME: Implement correctly
+    CString utf8 = domain.utf8();
+    return String(utf8.data(), utf8.length());
+}
+
+static bool hasInvalidDomainCharacter(const String&amp; asciiDomain)
+{
+    RELEASE_ASSERT(asciiDomain.is8Bit());
+    const LChar* characters = asciiDomain.characters8();
+    for (size_t i = 0; i &lt; asciiDomain.length(); ++i) {
+        if (isInvalidDomainCharacter(characters[i]))
+            return true;
+    }
+    return false;
+}
+
+bool URLParser::parsePort(StringView::CodePoints::Iterator&amp; iterator, const StringView::CodePoints::Iterator&amp; end)
+{
+    uint32_t port = 0;
+    ASSERT(iterator != end);
+    for (; iterator != end; ++iterator) {
+        if (isTabOrNewline(*iterator))
+            continue;
+        if (isASCIIDigit(*iterator)) {
+            port = port * 10 + *iterator - '0';
+            if (port &gt; std::numeric_limits&lt;uint16_t&gt;::max())
+                return false;
+        } else
+            return false;
+    }
+    
+    // FIXME: This shouldn't need a String allocation.
+    String scheme = m_buffer.toStringPreserveCapacity().substring(0, m_url.m_schemeEnd);
+    if (isDefaultPort(scheme, port)) {
+        ASSERT(m_buffer[m_buffer.length() - 1] == ':');
+        m_buffer.resize(m_buffer.length() - 1);
+    } else
+        m_buffer.appendNumber(port);
+
+    return true;
+}
+
</ins><span class="cx"> bool URLParser::parseHost(StringView::CodePoints::Iterator&amp; iterator, const StringView::CodePoints::Iterator&amp; end)
</span><span class="cx"> {
</span><span class="cx">     if (iterator == end)
</span><span class="lines">@@ -1148,7 +1330,30 @@
</span><span class="cx">             return true;
</span><span class="cx">         }
</span><span class="cx">     }
</span><del>-    if (auto address = parseIPv4Host(iterator, end)) {
</del><ins>+
+    // FIXME: We probably don't need to make so many buffers and String copies.
+    StringBuilder utf8Encoded;
+    for (; iterator != end; ++iterator) {
+        if (isTabOrNewline(*iterator))
+            continue;
+        if (*iterator == ':')
+            break;
+        uint8_t buffer[U8_MAX_LENGTH];
+        int32_t offset = 0;
+        UBool error = false;
+        U8_APPEND(buffer, offset, U8_MAX_LENGTH, *iterator, error);
+        // FIXME: Check error.
+        utf8Encoded.append(buffer, offset);
+    }
+    String percentDecoded = percentDecode(utf8Encoded.toStringPreserveCapacity());
+    RELEASE_ASSERT(percentDecoded.is8Bit());
+    String domain = String::fromUTF8(percentDecoded.characters8(), percentDecoded.length());
+    auto asciiDomain = domainToASCII(domain);
+    if (!asciiDomain || hasInvalidDomainCharacter(asciiDomain.value()))
+        return false;
+    
+    auto asciiDomainCodePoints = StringView(asciiDomain.value()).codePoints();
+    if (auto address = parseIPv4Host(asciiDomainCodePoints.begin(), asciiDomainCodePoints.end())) {
</ins><span class="cx">         serializeIPv4(address.value(), m_buffer);
</span><span class="cx">         m_url.m_hostEnd = m_buffer.length();
</span><span class="cx">         // FIXME: Handle the port correctly.
</span><span class="lines">@@ -1155,20 +1360,21 @@
</span><span class="cx">         m_url.m_portEnd = m_buffer.length();
</span><span class="cx">         return true;
</span><span class="cx">     }
</span><del>-    for (; iterator != end; ++iterator) {
-        if (*iterator == ':') {
</del><ins>+    
+    m_buffer.append(asciiDomain.value());
+    m_url.m_hostEnd = m_buffer.length();
+    if (iterator != end) {
+        ASSERT(*iterator == ':');
+        ++iterator;
+        while (iterator != end &amp;&amp; isTabOrNewline(*iterator))
</ins><span class="cx">             ++iterator;
</span><del>-            m_url.m_hostEnd = m_buffer.length();
</del><ins>+        if (iterator != end) {
</ins><span class="cx">             m_buffer.append(':');
</span><del>-            for (; iterator != end; ++iterator)
-                m_buffer.append(*iterator);
-            m_url.m_portEnd = m_buffer.length();
-            return true;
</del><ins>+            if (!parsePort(iterator, end))
+                return false;
</ins><span class="cx">         }
</span><del>-        m_buffer.append(*iterator);
</del><span class="cx">     }
</span><del>-    m_url.m_hostEnd = m_buffer.length();
-    m_url.m_portEnd = m_url.m_hostEnd;
</del><ins>+    m_url.m_portEnd = m_buffer.length();
</ins><span class="cx">     return true;
</span><span class="cx"> }
</span><span class="cx"> 
</span></span></pre></div>
<a id="trunkSourceWebCoreplatformURLParserh"></a>
<div class="modfile"><h4>Modified: trunk/Source/WebCore/platform/URLParser.h (205492 => 205493)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/Source/WebCore/platform/URLParser.h        2016-09-06 18:02:28 UTC (rev 205492)
+++ trunk/Source/WebCore/platform/URLParser.h        2016-09-06 18:16:07 UTC (rev 205493)
</span><span class="lines">@@ -44,6 +44,7 @@
</span><span class="cx">     StringBuilder m_buffer;
</span><span class="cx">     void parseAuthority(StringView::CodePoints::Iterator&amp;, const StringView::CodePoints::Iterator&amp; end);
</span><span class="cx">     bool parseHost(StringView::CodePoints::Iterator&amp;, const StringView::CodePoints::Iterator&amp; end);
</span><ins>+    bool parsePort(StringView::CodePoints::Iterator&amp;, const StringView::CodePoints::Iterator&amp; end);
</ins><span class="cx"> 
</span><span class="cx">     enum class URLPart;
</span><span class="cx">     void copyURLPartsUntil(const URL&amp; base, URLPart);
</span></span></pre></div>
<a id="trunkToolsChangeLog"></a>
<div class="modfile"><h4>Modified: trunk/Tools/ChangeLog (205492 => 205493)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/Tools/ChangeLog        2016-09-06 18:02:28 UTC (rev 205492)
+++ trunk/Tools/ChangeLog        2016-09-06 18:16:07 UTC (rev 205493)
</span><span class="lines">@@ -1,3 +1,13 @@
</span><ins>+2016-09-05  Alex Christensen  &lt;achristensen@webkit.org&gt;
+
+        Implement relative file urls and begin implementing character encoding in URLParser
+        https://bugs.webkit.org/show_bug.cgi?id=161618
+
+        Reviewed by Tim Horton.
+
+        * TestWebKitAPI/Tests/WebCore/URLParser.cpp:
+        (TestWebKitAPI::TEST_F):
+
</ins><span class="cx"> 2016-09-06  Commit Queue  &lt;commit-queue@webkit.org&gt;
</span><span class="cx"> 
</span><span class="cx">         Unreviewed, rolling out r205480.
</span></span></pre></div>
<a id="trunkToolsTestWebKitAPITestsWebCoreURLParsercpp"></a>
<div class="modfile"><h4>Modified: trunk/Tools/TestWebKitAPI/Tests/WebCore/URLParser.cpp (205492 => 205493)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/Tools/TestWebKitAPI/Tests/WebCore/URLParser.cpp        2016-09-06 18:02:28 UTC (rev 205492)
+++ trunk/Tools/TestWebKitAPI/Tests/WebCore/URLParser.cpp        2016-09-06 18:16:07 UTC (rev 205493)
</span><span class="lines">@@ -165,6 +165,9 @@
</span><span class="cx">     checkURL(&quot;file:///#fragment&quot;, {&quot;file&quot;, &quot;&quot;, &quot;&quot;, &quot;&quot;, 0, &quot;/&quot;, &quot;&quot;, &quot;fragment&quot;, &quot;file:///#fragment&quot;});
</span><span class="cx">     checkURL(&quot;file:////?query&quot;, {&quot;file&quot;, &quot;&quot;, &quot;&quot;, &quot;&quot;, 0, &quot;//&quot;, &quot;query&quot;, &quot;&quot;, &quot;file:////?query&quot;});
</span><span class="cx">     checkURL(&quot;file:////#fragment&quot;, {&quot;file&quot;, &quot;&quot;, &quot;&quot;, &quot;&quot;, 0, &quot;//&quot;, &quot;&quot;, &quot;fragment&quot;, &quot;file:////#fragment&quot;});
</span><ins>+    checkURL(&quot;http://host/A b&quot;, {&quot;http&quot;, &quot;&quot;, &quot;&quot;, &quot;host&quot;, 0, &quot;/A%20b&quot;, &quot;&quot;, &quot;&quot;, &quot;http://host/A%20b&quot;});
+    checkURL(&quot;http://host/a%20B&quot;, {&quot;http&quot;, &quot;&quot;, &quot;&quot;, &quot;host&quot;, 0, &quot;/a%20B&quot;, &quot;&quot;, &quot;&quot;, &quot;http://host/a%20B&quot;});
+    checkURL(&quot;http://host?q=@ &lt;&gt;!#fragment&quot;, {&quot;http&quot;, &quot;&quot;, &quot;&quot;, &quot;host&quot;, 0, &quot;/&quot;, &quot;q=@%20%3C%3E!&quot;, &quot;fragment&quot;, &quot;http://host/?q=@%20%3C%3E!#fragment&quot;});
</ins><span class="cx"> }
</span><span class="cx"> 
</span><span class="cx"> static void checkRelativeURL(const String&amp; urlString, const String&amp; baseURLString, const ExpectedParts&amp; parts)
</span><span class="lines">@@ -204,6 +207,8 @@
</span><span class="cx">     checkRelativeURL(&quot;http://whatwg.org/index.html&quot;, &quot;http://webkit.org/path1/path2/&quot;, {&quot;http&quot;, &quot;&quot;, &quot;&quot;, &quot;whatwg.org&quot;, 0, &quot;/index.html&quot;, &quot;&quot;, &quot;&quot;, &quot;http://whatwg.org/index.html&quot;});
</span><span class="cx">     checkRelativeURL(&quot;index.html&quot;, &quot;http://webkit.org/path1/path2/page.html?query#fragment&quot;, {&quot;http&quot;, &quot;&quot;, &quot;&quot;, &quot;webkit.org&quot;, 0, &quot;/path1/path2/index.html&quot;, &quot;&quot;, &quot;&quot;, &quot;http://webkit.org/path1/path2/index.html&quot;});
</span><span class="cx">     checkRelativeURL(&quot;//whatwg.org/index.html&quot;, &quot;https://www.webkit.org/path&quot;, {&quot;https&quot;, &quot;&quot;, &quot;&quot;, &quot;whatwg.org&quot;, 0, &quot;/index.html&quot;, &quot;&quot;, &quot;&quot;, &quot;https://whatwg.org/index.html&quot;});
</span><ins>+    checkRelativeURL(&quot;http://example\t.\norg&quot;, &quot;http://example.org/foo/bar&quot;, {&quot;http&quot;, &quot;&quot;, &quot;&quot;, &quot;example.org&quot;, 0, &quot;/&quot;, &quot;&quot;, &quot;&quot;, &quot;http://example.org/&quot;});
+    checkRelativeURL(&quot;test&quot;, &quot;file:///path1/path2&quot;, {&quot;file&quot;, &quot;&quot;, &quot;&quot;, &quot;&quot;, 0, &quot;/path1/test&quot;, &quot;&quot;, &quot;&quot;, &quot;file:///path1/test&quot;});
</ins><span class="cx"> }
</span><span class="cx"> 
</span><span class="cx"> static void checkURLDifferences(const String&amp; urlString, const ExpectedParts&amp; partsNew, const ExpectedParts&amp; partsOld)
</span><span class="lines">@@ -338,8 +343,74 @@
</span><span class="cx">     checkURLDifferences(&quot;file:path&quot;,
</span><span class="cx">         {&quot;file&quot;, &quot;&quot;, &quot;&quot;, &quot;&quot;, 0, &quot;/path&quot;, &quot;&quot;, &quot;&quot;, &quot;file:///path&quot;},
</span><span class="cx">         {&quot;file&quot;, &quot;&quot;, &quot;&quot;, &quot;&quot;, 0, &quot;path&quot;, &quot;&quot;, &quot;&quot;, &quot;file://path&quot;});
</span><ins>+    
+    // FIXME: Fix and test incomplete percent-encoded characters in the middle and at the end of the input string.
+    // FIXME: Fix and test percent-encoded uppercase characters in the host.
+    checkURLDifferences(&quot;http://host%73&quot;,
+        {&quot;http&quot;, &quot;&quot;, &quot;&quot;, &quot;hosts&quot;, 0, &quot;/&quot;, &quot;&quot;, &quot;&quot;, &quot;http://hosts/&quot;},
+        {&quot;http&quot;, &quot;&quot;, &quot;&quot;, &quot;host%73&quot;, 0, &quot;/&quot;, &quot;&quot;, &quot;&quot;, &quot;http://host%73/&quot;});
+    
+    // URLParser matches Chrome and the spec, but not URL::parse or Firefox.
+    checkURLDifferences(&quot;http://host/path%2e.%2E&quot;,
+        {&quot;http&quot;, &quot;&quot;, &quot;&quot;, &quot;host&quot;, 0, &quot;/path...&quot;, &quot;&quot;, &quot;&quot;, &quot;http://host/path...&quot;},
+        {&quot;http&quot;, &quot;&quot;, &quot;&quot;, &quot;host&quot;, 0, &quot;/path%2e.%2E&quot;, &quot;&quot;, &quot;&quot;, &quot;http://host/path%2e.%2E&quot;});
</ins><span class="cx"> }
</span><span class="cx"> 
</span><ins>+TEST_F(URLParserTest, DefaultPort)
+{
+    checkURL(&quot;ftp://host:21/&quot;, {&quot;ftp&quot;, &quot;&quot;, &quot;&quot;, &quot;host&quot;, 0, &quot;/&quot;, &quot;&quot;, &quot;&quot;, &quot;ftp://host/&quot;});
+    checkURL(&quot;ftp://host:22/&quot;, {&quot;ftp&quot;, &quot;&quot;, &quot;&quot;, &quot;host&quot;, 22, &quot;/&quot;, &quot;&quot;, &quot;&quot;, &quot;ftp://host:22/&quot;});
+    checkURLDifferences(&quot;ftp://host:21&quot;,
+        {&quot;ftp&quot;, &quot;&quot;, &quot;&quot;, &quot;host&quot;, 0, &quot;/&quot;, &quot;&quot;, &quot;&quot;, &quot;ftp://host/&quot;},
+        {&quot;ftp&quot;, &quot;&quot;, &quot;&quot;, &quot;host&quot;, 0, &quot;&quot;, &quot;&quot;, &quot;&quot;, &quot;ftp://host&quot;});
+    checkURLDifferences(&quot;ftp://host:22&quot;,
+        {&quot;ftp&quot;, &quot;&quot;, &quot;&quot;, &quot;host&quot;, 22, &quot;/&quot;, &quot;&quot;, &quot;&quot;, &quot;ftp://host:22/&quot;},
+        {&quot;ftp&quot;, &quot;&quot;, &quot;&quot;, &quot;host&quot;, 22, &quot;&quot;, &quot;&quot;, &quot;&quot;, &quot;ftp://host:22&quot;});
+    
+    checkURL(&quot;gopher://host:70/&quot;, {&quot;gopher&quot;, &quot;&quot;, &quot;&quot;, &quot;host&quot;, 0, &quot;/&quot;, &quot;&quot;, &quot;&quot;, &quot;gopher://host/&quot;});
+    checkURL(&quot;gopher://host:71/&quot;, {&quot;gopher&quot;, &quot;&quot;, &quot;&quot;, &quot;host&quot;, 71, &quot;/&quot;, &quot;&quot;, &quot;&quot;, &quot;gopher://host:71/&quot;});
+    // Spec, Chrome, Firefox, and URLParser have &quot;/&quot;; URL::parse does not.
+    // Spec, Chrome, URLParser, and URL::parse recognize the gopher default port; Firefox does not.
+    checkURLDifferences(&quot;gopher://host:70&quot;,
+        {&quot;gopher&quot;, &quot;&quot;, &quot;&quot;, &quot;host&quot;, 0, &quot;/&quot;, &quot;&quot;, &quot;&quot;, &quot;gopher://host/&quot;},
+        {&quot;gopher&quot;, &quot;&quot;, &quot;&quot;, &quot;host&quot;, 0, &quot;&quot;, &quot;&quot;, &quot;&quot;, &quot;gopher://host&quot;});
+    checkURLDifferences(&quot;gopher://host:71&quot;,
+        {&quot;gopher&quot;, &quot;&quot;, &quot;&quot;, &quot;host&quot;, 71, &quot;/&quot;, &quot;&quot;, &quot;&quot;, &quot;gopher://host:71/&quot;},
+        {&quot;gopher&quot;, &quot;&quot;, &quot;&quot;, &quot;host&quot;, 71, &quot;&quot;, &quot;&quot;, &quot;&quot;, &quot;gopher://host:71&quot;});
+    
+    checkURL(&quot;http://host:80&quot;, {&quot;http&quot;, &quot;&quot;, &quot;&quot;, &quot;host&quot;, 0, &quot;/&quot;, &quot;&quot;, &quot;&quot;, &quot;http://host/&quot;});
+    checkURL(&quot;http://host:80/&quot;, {&quot;http&quot;, &quot;&quot;, &quot;&quot;, &quot;host&quot;, 0, &quot;/&quot;, &quot;&quot;, &quot;&quot;, &quot;http://host/&quot;});
+    checkURL(&quot;http://host:81&quot;, {&quot;http&quot;, &quot;&quot;, &quot;&quot;, &quot;host&quot;, 81, &quot;/&quot;, &quot;&quot;, &quot;&quot;, &quot;http://host:81/&quot;});
+    checkURL(&quot;http://host:81/&quot;, {&quot;http&quot;, &quot;&quot;, &quot;&quot;, &quot;host&quot;, 81, &quot;/&quot;, &quot;&quot;, &quot;&quot;, &quot;http://host:81/&quot;});
+    
+    checkURL(&quot;https://host:443&quot;, {&quot;https&quot;, &quot;&quot;, &quot;&quot;, &quot;host&quot;, 0, &quot;/&quot;, &quot;&quot;, &quot;&quot;, &quot;https://host/&quot;});
+    checkURL(&quot;https://host:443/&quot;, {&quot;https&quot;, &quot;&quot;, &quot;&quot;, &quot;host&quot;, 0, &quot;/&quot;, &quot;&quot;, &quot;&quot;, &quot;https://host/&quot;});
+    checkURL(&quot;https://host:444&quot;, {&quot;https&quot;, &quot;&quot;, &quot;&quot;, &quot;host&quot;, 444, &quot;/&quot;, &quot;&quot;, &quot;&quot;, &quot;https://host:444/&quot;});
+    checkURL(&quot;https://host:444/&quot;, {&quot;https&quot;, &quot;&quot;, &quot;&quot;, &quot;host&quot;, 444, &quot;/&quot;, &quot;&quot;, &quot;&quot;, &quot;https://host:444/&quot;});
+    
+    checkURL(&quot;ws://host:80/&quot;, {&quot;ws&quot;, &quot;&quot;, &quot;&quot;, &quot;host&quot;, 0, &quot;/&quot;, &quot;&quot;, &quot;&quot;, &quot;ws://host/&quot;});
+    checkURL(&quot;ws://host:81/&quot;, {&quot;ws&quot;, &quot;&quot;, &quot;&quot;, &quot;host&quot;, 81, &quot;/&quot;, &quot;&quot;, &quot;&quot;, &quot;ws://host:81/&quot;});
+    // URLParser matches Chrome and Firefox, but not URL::parse.
+    checkURLDifferences(&quot;ws://host:80&quot;,
+        {&quot;ws&quot;, &quot;&quot;, &quot;&quot;, &quot;host&quot;, 0, &quot;/&quot;, &quot;&quot;, &quot;&quot;, &quot;ws://host/&quot;},
+        {&quot;ws&quot;, &quot;&quot;, &quot;&quot;, &quot;host&quot;, 0, &quot;&quot;, &quot;&quot;, &quot;&quot;, &quot;ws://host&quot;});
+    checkURLDifferences(&quot;ws://host:81&quot;,
+        {&quot;ws&quot;, &quot;&quot;, &quot;&quot;, &quot;host&quot;, 81, &quot;/&quot;, &quot;&quot;, &quot;&quot;, &quot;ws://host:81/&quot;},
+        {&quot;ws&quot;, &quot;&quot;, &quot;&quot;, &quot;host&quot;, 81, &quot;&quot;, &quot;&quot;, &quot;&quot;, &quot;ws://host:81&quot;});
+    
+    checkURL(&quot;wss://host:443/&quot;, {&quot;wss&quot;, &quot;&quot;, &quot;&quot;, &quot;host&quot;, 0, &quot;/&quot;, &quot;&quot;, &quot;&quot;, &quot;wss://host/&quot;});
+    checkURL(&quot;wss://host:444/&quot;, {&quot;wss&quot;, &quot;&quot;, &quot;&quot;, &quot;host&quot;, 444, &quot;/&quot;, &quot;&quot;, &quot;&quot;, &quot;wss://host:444/&quot;});
+    // URLParser matches Chrome and Firefox, but not URL::parse.
+    checkURLDifferences(&quot;wss://host:443&quot;,
+        {&quot;wss&quot;, &quot;&quot;, &quot;&quot;, &quot;host&quot;, 0, &quot;/&quot;, &quot;&quot;, &quot;&quot;, &quot;wss://host/&quot;},
+        {&quot;wss&quot;, &quot;&quot;, &quot;&quot;, &quot;host&quot;, 0, &quot;&quot;, &quot;&quot;, &quot;&quot;, &quot;wss://host&quot;});
+    checkURLDifferences(&quot;wss://host:444&quot;,
+        {&quot;wss&quot;, &quot;&quot;, &quot;&quot;, &quot;host&quot;, 444, &quot;/&quot;, &quot;&quot;, &quot;&quot;, &quot;wss://host:444/&quot;},
+        {&quot;wss&quot;, &quot;&quot;, &quot;&quot;, &quot;host&quot;, 444, &quot;&quot;, &quot;&quot;, &quot;&quot;, &quot;wss://host:444&quot;});
+    
+    // FIXME: Fix and check unknown schemes with ports, as well as ftps.
+}
+    
</ins><span class="cx"> static void shouldFail(const String&amp; urlString)
</span><span class="cx"> {
</span><span class="cx">     URLParser parser;
</span></span></pre>
</div>
</div>

</body>
</html>