<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
"http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head><meta http-equiv="content-type" content="text/html; charset=utf-8" />
<title>[206198] trunk</title>
</head>
<body>

<style type="text/css"><!--
#msg dl.meta { border: 1px #006 solid; background: #369; padding: 6px; color: #fff; }
#msg dl.meta dt { float: left; width: 6em; font-weight: bold; }
#msg dt:after { content:':';}
#msg dl, #msg dt, #msg ul, #msg li, #header, #footer, #logmsg { font-family: verdana,arial,helvetica,sans-serif; font-size: 10pt;  }
#msg dl a { font-weight: bold}
#msg dl a:link    { color:#fc3; }
#msg dl a:active  { color:#ff0; }
#msg dl a:visited { color:#cc6; }
h3 { font-family: verdana,arial,helvetica,sans-serif; font-size: 10pt; font-weight: bold; }
#msg pre { overflow: auto; background: #ffc; border: 1px #fa0 solid; padding: 6px; }
#logmsg { background: #ffc; border: 1px #fa0 solid; padding: 1em 1em 0 1em; }
#logmsg p, #logmsg pre, #logmsg blockquote { margin: 0 0 1em 0; }
#logmsg p, #logmsg li, #logmsg dt, #logmsg dd { line-height: 14pt; }
#logmsg h1, #logmsg h2, #logmsg h3, #logmsg h4, #logmsg h5, #logmsg h6 { margin: .5em 0; }
#logmsg h1:first-child, #logmsg h2:first-child, #logmsg h3:first-child, #logmsg h4:first-child, #logmsg h5:first-child, #logmsg h6:first-child { margin-top: 0; }
#logmsg ul, #logmsg ol { padding: 0; list-style-position: inside; margin: 0 0 0 1em; }
#logmsg ul { text-indent: -1em; padding-left: 1em; }
#logmsg ol { text-indent: -1.5em; padding-left: 1.5em; }
#logmsg > ul, #logmsg > ol { margin: 0 0 1em 0; }
#logmsg pre { background: #eee; padding: 1em; }
#logmsg blockquote { border: 1px solid #fa0; border-left-width: 10px; padding: 1em 1em 0 1em; background: white;}
#logmsg dl { margin: 0; }
#logmsg dt { font-weight: bold; }
#logmsg dd { margin: 0; padding: 0 0 0.5em 0; }
#logmsg dd:before { content:'\00bb';}
#logmsg table { border-spacing: 0px; border-collapse: collapse; border-top: 4px solid #fa0; border-bottom: 1px solid #fa0; background: #fff; }
#logmsg table th { text-align: left; font-weight: normal; padding: 0.2em 0.5em; border-top: 1px dotted #fa0; }
#logmsg table td { text-align: right; border-top: 1px dotted #fa0; padding: 0.2em 0.5em; }
#logmsg table thead th { text-align: center; border-bottom: 1px solid #fa0; }
#logmsg table th.Corner { text-align: left; }
#logmsg hr { border: none 0; border-top: 2px dashed #fa0; height: 1px; }
#header, #footer { color: #fff; background: #636; border: 1px #300 solid; padding: 6px; }
#patch { width: 100%; }
#patch h4 {font-family: verdana,arial,helvetica,sans-serif;font-size:10pt;padding:8px;background:#369;color:#fff;margin:0;}
#patch .propset h4, #patch .binary h4 {margin:0;}
#patch pre {padding:0;line-height:1.2em;margin:0;}
#patch .diff {width:100%;background:#eee;padding: 0 0 10px 0;overflow:auto;}
#patch .propset .diff, #patch .binary .diff  {padding:10px 0;}
#patch span {display:block;padding:0 10px;}
#patch .modfile, #patch .addfile, #patch .delfile, #patch .propset, #patch .binary, #patch .copfile {border:1px solid #ccc;margin:10px 0;}
#patch ins {background:#dfd;text-decoration:none;display:block;padding:0 10px;}
#patch del {background:#fdd;text-decoration:none;display:block;padding:0 10px;}
#patch .lines, .info {color:#888;background:#fff;}
--></style>
<div id="msg">
<dl class="meta">
<dt>Revision</dt> <dd><a href="http://trac.webkit.org/projects/webkit/changeset/206198">206198</a></dd>
<dt>Author</dt> <dd>achristensen@apple.com</dd>
<dt>Date</dt> <dd>2016-09-20 23:34:13 -0700 (Tue, 20 Sep 2016)</dd>
</dl>

<h3>Log Message</h3>
<pre>Optimize URLParser
https://bugs.webkit.org/show_bug.cgi?id=162105

Reviewed by Geoffrey Garen.

Source/WebCore:

Covered by new API tests.
This is about a 5% speedup on my URLParser benchmark.

* platform/URLParser.cpp:
(WebCore::percentEncodeByte):
(WebCore::utf8PercentEncode):
(WebCore::utf8QueryEncode):
(WebCore::encodeQuery):
(WebCore::URLParser::parse):
(WebCore::serializeURLEncodedForm):
(WebCore::percentEncode): Deleted.
(WebCore::utf8PercentEncodeQuery): Deleted.

Tools:

* TestWebKitAPI/Tests/WebCore/URLParser.cpp:
(TestWebKitAPI::TEST_F):</pre>

<h3>Modified Paths</h3>
<ul>
<li><a href="#trunkSourceWebCoreChangeLog">trunk/Source/WebCore/ChangeLog</a></li>
<li><a href="#trunkSourceWebCoreplatformURLParsercpp">trunk/Source/WebCore/platform/URLParser.cpp</a></li>
<li><a href="#trunkToolsChangeLog">trunk/Tools/ChangeLog</a></li>
<li><a href="#trunkToolsTestWebKitAPITestsWebCoreURLParsercpp">trunk/Tools/TestWebKitAPI/Tests/WebCore/URLParser.cpp</a></li>
</ul>

</div>
<div id="patch">
<h3>Diff</h3>
<a id="trunkSourceWebCoreChangeLog"></a>
<div class="modfile"><h4>Modified: trunk/Source/WebCore/ChangeLog (206197 => 206198)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/Source/WebCore/ChangeLog        2016-09-21 06:17:13 UTC (rev 206197)
+++ trunk/Source/WebCore/ChangeLog        2016-09-21 06:34:13 UTC (rev 206198)
</span><span class="lines">@@ -1,3 +1,23 @@
</span><ins>+2016-09-20  Alex Christensen  &lt;achristensen@webkit.org&gt;
+
+        Optimize URLParser
+        https://bugs.webkit.org/show_bug.cgi?id=162105
+
+        Reviewed by Geoffrey Garen.
+
+        Covered by new API tests.
+        This is about a 5% speedup on my URLParser benchmark.
+
+        * platform/URLParser.cpp:
+        (WebCore::percentEncodeByte):
+        (WebCore::utf8PercentEncode):
+        (WebCore::utf8QueryEncode):
+        (WebCore::encodeQuery):
+        (WebCore::URLParser::parse):
+        (WebCore::serializeURLEncodedForm):
+        (WebCore::percentEncode): Deleted.
+        (WebCore::utf8PercentEncodeQuery): Deleted.
+
</ins><span class="cx"> 2016-09-20  Carlos Garcia Campos  &lt;cgarcia@igalia.com&gt;
</span><span class="cx"> 
</span><span class="cx">         [GTK] Clean up DataObjectGtk handling
</span></span></pre></div>
<a id="trunkSourceWebCoreplatformURLParsercpp"></a>
<div class="modfile"><h4>Modified: trunk/Source/WebCore/platform/URLParser.cpp (206197 => 206198)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/Source/WebCore/platform/URLParser.cpp        2016-09-21 06:17:13 UTC (rev 206197)
+++ trunk/Source/WebCore/platform/URLParser.cpp        2016-09-21 06:34:13 UTC (rev 206198)
</span><span class="lines">@@ -457,7 +457,7 @@
</span><span class="cx">     return !isSlashQuestionOrHash(*iterator);
</span><span class="cx"> }
</span><span class="cx"> 
</span><del>-inline static void percentEncode(uint8_t byte, Vector&lt;LChar&gt;&amp; buffer)
</del><ins>+inline static void percentEncodeByte(uint8_t byte, Vector&lt;LChar&gt;&amp; buffer)
</ins><span class="cx"> {
</span><span class="cx">     buffer.append('%');
</span><span class="cx">     buffer.append(upperNibbleToASCIIHexDigit(byte));
</span><span class="lines">@@ -464,6 +464,9 @@
</span><span class="cx">     buffer.append(lowerNibbleToASCIIHexDigit(byte));
</span><span class="cx"> }
</span><span class="cx"> 
</span><ins>+const char* replacementCharacterUTF8PercentEncoded = &quot;%EF%BF%BD&quot;;
+const size_t replacementCharacterUTF8PercentEncodedLength = 9;
+
</ins><span class="cx"> template&lt;bool serialized&gt;
</span><span class="cx"> inline static void utf8PercentEncode(UChar32 codePoint, Vector&lt;LChar&gt;&amp; destination, bool(*isInCodeSet)(UChar32))
</span><span class="cx"> {
</span><span class="lines">@@ -472,23 +475,30 @@
</span><span class="cx">         ASSERT_WITH_SECURITY_IMPLICATION(!isInCodeSet(codePoint));
</span><span class="cx">         destination.append(codePoint);
</span><span class="cx">     } else {
</span><del>-        if (isInCodeSet(codePoint)) {
-            uint8_t buffer[U8_MAX_LENGTH];
-            int32_t offset = 0;
-            UBool error = false;
-            U8_APPEND(buffer, offset, U8_MAX_LENGTH, codePoint, error);
-            // FIXME: Check error.
-            for (int32_t i = 0; i &lt; offset; ++i)
-                percentEncode(buffer[i], destination);
-        } else {
-            ASSERT_WITH_MESSAGE(isASCII(codePoint), &quot;isInCodeSet should always return true for non-ASCII characters&quot;);
-            destination.append(codePoint);
</del><ins>+        if (isASCII(codePoint)) {
+            if (isInCodeSet(codePoint))
+                percentEncodeByte(codePoint, destination);
+            else
+                destination.append(codePoint);
+            return;
</ins><span class="cx">         }
</span><ins>+        ASSERT_WITH_MESSAGE(isInCodeSet(codePoint), &quot;isInCodeSet should always return true for non-ASCII characters&quot;);
+        
+        if (!U_IS_UNICODE_CHAR(codePoint)) {
+            destination.append(replacementCharacterUTF8PercentEncoded, replacementCharacterUTF8PercentEncodedLength);
+            return;
+        }
+        
+        uint8_t buffer[U8_MAX_LENGTH];
+        int32_t offset = 0;
+        U8_APPEND_UNSAFE(buffer, offset, codePoint);
+        for (int32_t i = 0; i &lt; offset; ++i)
+            percentEncodeByte(buffer[i], destination);
</ins><span class="cx">     }
</span><span class="cx"> }
</span><span class="cx"> 
</span><span class="cx"> template&lt;bool serialized&gt;
</span><del>-inline static void utf8PercentEncodeQuery(UChar32 codePoint, Vector&lt;LChar&gt;&amp; destination)
</del><ins>+inline static void utf8QueryEncode(UChar32 codePoint, Vector&lt;LChar&gt;&amp; destination)
</ins><span class="cx"> {
</span><span class="cx">     if (serialized) {
</span><span class="cx">         ASSERT_WITH_SECURITY_IMPLICATION(isASCII(codePoint));
</span><span class="lines">@@ -495,16 +505,26 @@
</span><span class="cx">         ASSERT_WITH_SECURITY_IMPLICATION(!shouldPercentEncodeQueryByte(codePoint));
</span><span class="cx">         destination.append(codePoint);
</span><span class="cx">     } else {
</span><ins>+        if (isASCII(codePoint)) {
+            if (shouldPercentEncodeQueryByte(codePoint))
+                percentEncodeByte(codePoint, destination);
+            else
+                destination.append(codePoint);
+            return;
+        }
+        
+        if (!U_IS_UNICODE_CHAR(codePoint)) {
+            destination.append(replacementCharacterUTF8PercentEncoded, replacementCharacterUTF8PercentEncodedLength);
+            return;
+        }
+
</ins><span class="cx">         uint8_t buffer[U8_MAX_LENGTH];
</span><span class="cx">         int32_t offset = 0;
</span><del>-        UBool error = false;
-        U8_APPEND(buffer, offset, U8_MAX_LENGTH, codePoint, error);
-        ASSERT_WITH_SECURITY_IMPLICATION(offset &lt;= static_cast&lt;int32_t&gt;(sizeof(buffer)));
-        // FIXME: Check error.
</del><ins>+        U8_APPEND_UNSAFE(buffer, offset, codePoint);
</ins><span class="cx">         for (int32_t i = 0; i &lt; offset; ++i) {
</span><span class="cx">             auto byte = buffer[i];
</span><span class="cx">             if (shouldPercentEncodeQueryByte(byte))
</span><del>-                percentEncode(byte, destination);
</del><ins>+                percentEncodeByte(byte, destination);
</ins><span class="cx">             else
</span><span class="cx">                 destination.append(byte);
</span><span class="cx">         }
</span><span class="lines">@@ -520,7 +540,7 @@
</span><span class="cx">     for (size_t i = 0; i &lt; length; ++i) {
</span><span class="cx">         uint8_t byte = data[i];
</span><span class="cx">         if (shouldPercentEncodeQueryByte(byte))
</span><del>-            percentEncode(byte, destination);
</del><ins>+            percentEncodeByte(byte, destination);
</ins><span class="cx">         else
</span><span class="cx">             destination.append(byte);
</span><span class="cx">     }
</span><span class="lines">@@ -1413,7 +1433,7 @@
</span><span class="cx">                 break;
</span><span class="cx">             }
</span><span class="cx">             if (isUTF8Encoding)
</span><del>-                utf8PercentEncodeQuery&lt;serialized&gt;(*c, m_asciiBuffer);
</del><ins>+                utf8QueryEncode&lt;serialized&gt;(*c, m_asciiBuffer);
</ins><span class="cx">             else
</span><span class="cx">                 appendCodePoint(queryBuffer, *c);
</span><span class="cx">             ++c;
</span><span class="lines">@@ -2198,7 +2218,7 @@
</span><span class="cx">             || (byte &gt;= 0x61 &amp;&amp; byte &lt;= 0x7A))
</span><span class="cx">             output.append(byte);
</span><span class="cx">         else
</span><del>-            percentEncode(byte, output);
</del><ins>+            percentEncodeByte(byte, output);
</ins><span class="cx">     }
</span><span class="cx"> }
</span><span class="cx">     
</span></span></pre></div>
<a id="trunkToolsChangeLog"></a>
<div class="modfile"><h4>Modified: trunk/Tools/ChangeLog (206197 => 206198)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/Tools/ChangeLog        2016-09-21 06:17:13 UTC (rev 206197)
+++ trunk/Tools/ChangeLog        2016-09-21 06:34:13 UTC (rev 206198)
</span><span class="lines">@@ -1,3 +1,13 @@
</span><ins>+2016-09-20  Alex Christensen  &lt;achristensen@webkit.org&gt;
+
+        Optimize URLParser
+        https://bugs.webkit.org/show_bug.cgi?id=162105
+
+        Reviewed by Geoffrey Garen.
+
+        * TestWebKitAPI/Tests/WebCore/URLParser.cpp:
+        (TestWebKitAPI::TEST_F):
+
</ins><span class="cx"> 2016-09-20  Aakash Jain  &lt;aakash_jain@apple.com&gt;
</span><span class="cx"> 
</span><span class="cx">         enable remote_api (for debugging) in flakiness dashboard app
</span></span></pre></div>
<a id="trunkToolsTestWebKitAPITestsWebCoreURLParsercpp"></a>
<div class="modfile"><h4>Modified: trunk/Tools/TestWebKitAPI/Tests/WebCore/URLParser.cpp (206197 => 206198)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/Tools/TestWebKitAPI/Tests/WebCore/URLParser.cpp        2016-09-21 06:17:13 UTC (rev 206197)
+++ trunk/Tools/TestWebKitAPI/Tests/WebCore/URLParser.cpp        2016-09-21 06:34:13 UTC (rev 206198)
</span><span class="lines">@@ -215,7 +215,6 @@
</span><span class="cx">     checkURL(&quot;http://123.256/&quot;, {&quot;http&quot;, &quot;&quot;, &quot;&quot;, &quot;123.256&quot;, 0, &quot;/&quot;, &quot;&quot;, &quot;&quot;, &quot;http://123.256/&quot;});
</span><span class="cx">     checkURL(&quot;notspecial:/a&quot;, {&quot;notspecial&quot;, &quot;&quot;, &quot;&quot;, &quot;&quot;, 0, &quot;/a&quot;, &quot;&quot;, &quot;&quot;, &quot;notspecial:/a&quot;});
</span><span class="cx">     checkURL(&quot;notspecial:&quot;, {&quot;notspecial&quot;, &quot;&quot;, &quot;&quot;, &quot;&quot;, 0, &quot;&quot;, &quot;&quot;, &quot;&quot;, &quot;notspecial:&quot;});
</span><del>-    // FIXME: Fix and add a test with an invalid surrogate pair at the end with a space as the second code unit.
</del><span class="cx"> 
</span><span class="cx">     // This disagrees with the web platform test for http://:@www.example.com but agrees with Chrome and URL::parse,
</span><span class="cx">     // and Firefox fails the web platform test differently. Maybe the web platform test ought to be changed.
</span><span class="lines">@@ -656,7 +655,9 @@
</span><span class="cx">     checkURLDifferences(&quot;http://%48OsT&quot;,
</span><span class="cx">         {&quot;http&quot;, &quot;&quot;, &quot;&quot;, &quot;host&quot;, 0, &quot;/&quot;, &quot;&quot;, &quot;&quot;, &quot;http://host/&quot;},
</span><span class="cx">         {&quot;http&quot;, &quot;&quot;, &quot;&quot;, &quot;%48ost&quot;, 0, &quot;/&quot;, &quot;&quot;, &quot;&quot;, &quot;http://%48ost/&quot;});
</span><del>-
</del><ins>+    checkURLDifferences(&quot;http://host/`&quot;,
+        {&quot;http&quot;, &quot;&quot;, &quot;&quot;, &quot;host&quot;, 0, &quot;/%60&quot;, &quot;&quot;, &quot;&quot;, &quot;http://host/%60&quot;},
+        {&quot;http&quot;, &quot;&quot;, &quot;&quot;, &quot;host&quot;, 0, &quot;/`&quot;, &quot;&quot;, &quot;&quot;, &quot;http://host/`&quot;});
</ins><span class="cx"> }
</span><span class="cx">     
</span><span class="cx"> static void shouldFail(const String&amp; urlString)
</span><span class="lines">@@ -719,6 +720,29 @@
</span><span class="cx">         {&quot;ws&quot;, &quot;&quot;, &quot;&quot;, &quot;&quot;, 0, &quot;&quot;, &quot;&quot;, &quot;&quot;, &quot;ws:&quot;},
</span><span class="cx">         {&quot;ws&quot;, &quot;&quot;, &quot;&quot;, &quot;&quot;, 0, &quot;s:&quot;, &quot;&quot;, &quot;&quot;, &quot;ws:s:&quot;});
</span><span class="cx">     checkRelativeURL(&quot;notspecial:&quot;, &quot;http://example.org/foo/bar&quot;, {&quot;notspecial&quot;, &quot;&quot;, &quot;&quot;, &quot;&quot;, 0, &quot;&quot;, &quot;&quot;, &quot;&quot;, &quot;notspecial:&quot;});
</span><ins>+    
+    const wchar_t surrogateBegin = 0xD800;
+    const wchar_t validSurrogateEnd = 0xDD55;
+    const wchar_t invalidSurrogateEnd = 'A';
+    checkURL(wideString&lt;12&gt;({'h', 't', 't', 'p', ':', '/', '/', 'w', '/', surrogateBegin, validSurrogateEnd, '\0'}),
+        {&quot;http&quot;, &quot;&quot;, &quot;&quot;, &quot;w&quot;, 0, &quot;/%F0%90%85%95&quot;, &quot;&quot;, &quot;&quot;, &quot;http://w/%F0%90%85%95&quot;});
+    
+    // URLParser matches Chrome and Firefox but not URL::parse.
+    checkURLDifferences(wideString&lt;12&gt;({'h', 't', 't', 'p', ':', '/', '/', 'w', '/', surrogateBegin, invalidSurrogateEnd}),
+        {&quot;http&quot;, &quot;&quot;, &quot;&quot;, &quot;w&quot;, 0, &quot;/%EF%BF%BDA&quot;, &quot;&quot;, &quot;&quot;, &quot;http://w/%EF%BF%BDA&quot;},
+        {&quot;http&quot;, &quot;&quot;, &quot;&quot;, &quot;w&quot;, 0, &quot;/%ED%A0%80A&quot;, &quot;&quot;, &quot;&quot;, &quot;http://w/%ED%A0%80A&quot;});
+    checkURLDifferences(wideString&lt;13&gt;({'h', 't', 't', 'p', ':', '/', '/', 'w', '/', '?', surrogateBegin, invalidSurrogateEnd, '\0'}),
+        {&quot;http&quot;, &quot;&quot;, &quot;&quot;, &quot;w&quot;, 0, &quot;/&quot;, &quot;%EF%BF%BDA&quot;, &quot;&quot;, &quot;http://w/?%EF%BF%BDA&quot;},
+        {&quot;http&quot;, &quot;&quot;, &quot;&quot;, &quot;w&quot;, 0, &quot;/&quot;, &quot;%ED%A0%80A&quot;, &quot;&quot;, &quot;http://w/?%ED%A0%80A&quot;});
+    checkURLDifferences(wideString&lt;11&gt;({'h', 't', 't', 'p', ':', '/', '/', 'w', '/', surrogateBegin, '\0'}),
+        {&quot;http&quot;, &quot;&quot;, &quot;&quot;, &quot;w&quot;, 0, &quot;/%EF%BF%BD&quot;, &quot;&quot;, &quot;&quot;, &quot;http://w/%EF%BF%BD&quot;},
+        {&quot;http&quot;, &quot;&quot;, &quot;&quot;, &quot;w&quot;, 0, &quot;/%ED%A0%80&quot;, &quot;&quot;, &quot;&quot;, &quot;http://w/%ED%A0%80&quot;});
+    checkURLDifferences(wideString&lt;12&gt;({'h', 't', 't', 'p', ':', '/', '/', 'w', '/', '?', surrogateBegin, '\0'}),
+        {&quot;http&quot;, &quot;&quot;, &quot;&quot;, &quot;w&quot;, 0, &quot;/&quot;, &quot;%EF%BF%BD&quot;, &quot;&quot;, &quot;http://w/?%EF%BF%BD&quot;},
+        {&quot;http&quot;, &quot;&quot;, &quot;&quot;, &quot;w&quot;, 0, &quot;/&quot;, &quot;%ED%A0%80&quot;, &quot;&quot;, &quot;http://w/?%ED%A0%80&quot;});
+    checkURLDifferences(wideString&lt;13&gt;({'h', 't', 't', 'p', ':', '/', '/', 'w', '/', '?', surrogateBegin, ' ', '\0'}),
+        {&quot;http&quot;, &quot;&quot;, &quot;&quot;, &quot;w&quot;, 0, &quot;/&quot;, &quot;%EF%BF%BD&quot;, &quot;&quot;, &quot;http://w/?%EF%BF%BD&quot;},
+        {&quot;http&quot;, &quot;&quot;, &quot;&quot;, &quot;w&quot;, 0, &quot;/&quot;, &quot;%ED%A0%80&quot;, &quot;&quot;, &quot;http://w/?%ED%A0%80&quot;});
</ins><span class="cx"> }
</span><span class="cx"> 
</span><span class="cx"> static void checkURL(const String&amp; urlString, const TextEncoding&amp; encoding, const ExpectedParts&amp; parts)
</span></span></pre>
</div>
</div>

</body>
</html>