<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"

"http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">

<html xmlns="http://www.w3.org/1999/xhtml">

<head><meta http-equiv="content-type" content="text/html; charset=utf-8" />

<title>[195074] trunk/Source/WebCore</title>

</head>

<body>

<style type="text/css"><!--

#msg dl.meta { border: 1px #006 solid; background: #369; padding: 6px; color: #fff; }

#msg dl.meta dt { float: left; width: 6em; font-weight: bold; }

#msg dt:after { content:':';}

#msg dl, #msg dt, #msg ul, #msg li, #header, #footer, #logmsg { font-family: verdana,arial,helvetica,sans-serif; font-size: 10pt;  }

#msg dl a { font-weight: bold}

#msg dl a:link    { color:#fc3; }

#msg dl a:active  { color:#ff0; }

#msg dl a:visited { color:#cc6; }

h3 { font-family: verdana,arial,helvetica,sans-serif; font-size: 10pt; font-weight: bold; }

#msg pre { overflow: auto; background: #ffc; border: 1px #fa0 solid; padding: 6px; }

#logmsg { background: #ffc; border: 1px #fa0 solid; padding: 1em 1em 0 1em; }

#logmsg p, #logmsg pre, #logmsg blockquote { margin: 0 0 1em 0; }

#logmsg p, #logmsg li, #logmsg dt, #logmsg dd { line-height: 14pt; }

#logmsg h1, #logmsg h2, #logmsg h3, #logmsg h4, #logmsg h5, #logmsg h6 { margin: .5em 0; }

#logmsg h1:first-child, #logmsg h2:first-child, #logmsg h3:first-child, #logmsg h4:first-child, #logmsg h5:first-child, #logmsg h6:first-child { margin-top: 0; }

#logmsg ul, #logmsg ol { padding: 0; list-style-position: inside; margin: 0 0 0 1em; }

#logmsg ul { text-indent: -1em; padding-left: 1em; }#logmsg ol { text-indent: -1.5em; padding-left: 1.5em; }

#logmsg > ul, #logmsg > ol { margin: 0 0 1em 0; }

#logmsg pre { background: #eee; padding: 1em; }

#logmsg blockquote { border: 1px solid #fa0; border-left-width: 10px; padding: 1em 1em 0 1em; background: white;}

#logmsg dl { margin: 0; }

#logmsg dt { font-weight: bold; }

#logmsg dd { margin: 0; padding: 0 0 0.5em 0; }

#logmsg dd:before { content:'\00bb';}

#logmsg table { border-spacing: 0px; border-collapse: collapse; border-top: 4px solid #fa0; border-bottom: 1px solid #fa0; background: #fff; }

#logmsg table th { text-align: left; font-weight: normal; padding: 0.2em 0.5em; border-top: 1px dotted #fa0; }

#logmsg table td { text-align: right; border-top: 1px dotted #fa0; padding: 0.2em 0.5em; }

#logmsg table thead th { text-align: center; border-bottom: 1px solid #fa0; }

#logmsg table th.Corner { text-align: left; }

#logmsg hr { border: none 0; border-top: 2px dashed #fa0; height: 1px; }

#header, #footer { color: #fff; background: #636; border: 1px #300 solid; padding: 6px; }

#patch { width: 100%; }

#patch h4 {font-family: verdana,arial,helvetica,sans-serif;font-size:10pt;padding:8px;background:#369;color:#fff;margin:0;}

#patch .propset h4, #patch .binary h4 {margin:0;}

#patch pre {padding:0;line-height:1.2em;margin:0;}

#patch .diff {width:100%;background:#eee;padding: 0 0 10px 0;overflow:auto;}

#patch .propset .diff, #patch .binary .diff  {padding:10px 0;}

#patch span {display:block;padding:0 10px;}

#patch .modfile, #patch .addfile, #patch .delfile, #patch .propset, #patch .binary, #patch .copfile {border:1px solid #ccc;margin:10px 0;}

#patch ins {background:#dfd;text-decoration:none;display:block;padding:0 10px;}

#patch del {background:#fdd;text-decoration:none;display:block;padding:0 10px;}

#patch .lines, .info {color:#888;background:#fff;}

--></style>

<div id="msg">

<dl class="meta">

<dt>Revision</dt> <dd><a href="http://trac.webkit.org/projects/webkit/changeset/195074">195074</a></dd>

<dt>Author</dt> <dd>dbates@webkit.org</dd>

<dt>Date</dt> <dd>2016-01-14 13:40:13 -0800 (Thu, 14 Jan 2016)</dd>

</dl>

<h3>Log Message</h3>

<pre>[XSS Auditor] Extract attribute truncation logic and formalize string canonicalization
https://bugs.webkit.org/show_bug.cgi?id=152874

Reviewed by Brent Fulgham.

Derived from Blink patch (by Tom Sepez &lt;tsepez@chromium.org&gt;):
&lt;https://src.chromium.org/viewvc/blink?revision=176339&amp;view=revision&gt;

Extract the src-like and script-like attribute truncation logic into independent functions
towards making it more straightforward to re-purpose this logic. Additionally, formalize the
concept of string canonicalization as a member function that consolidates the process of
decoding URL escape sequences, truncating the decoded string (if applicable), and removing
characters that are considered noise.

* html/parser/XSSAuditor.cpp:
(WebCore::truncateForSrcLikeAttribute): Extracted from XSSAuditor::decodedSnippetForAttribute().
(WebCore::truncateForScriptLikeAttribute): Ditto.
(WebCore::XSSAuditor::init): Write in terms of XSSAuditor::canonicalize().
(WebCore::XSSAuditor::filterCharacterToken): Updated to make use of formalized canonicalization methods.
(WebCore::XSSAuditor::filterScriptToken): Ditto.
(WebCore::XSSAuditor::filterObjectToken): Ditto.
(WebCore::XSSAuditor::filterParamToken): Ditto.
(WebCore::XSSAuditor::filterEmbedToken): Ditto.
(WebCore::XSSAuditor::filterAppletToken): Ditto.
(WebCore::XSSAuditor::filterFrameToken): Ditto.
(WebCore::XSSAuditor::filterInputToken): Ditto.
(WebCore::XSSAuditor::filterButtonToken): Ditto.
(WebCore::XSSAuditor::eraseDangerousAttributesIfInjected): Ditto.
(WebCore::XSSAuditor::eraseAttributeIfInjected): Updated code to use early return style and avoid an unnecessary string
comparison when we know that a src attribute was injected.
(WebCore::XSSAuditor::canonicalizedSnippetForTagName): Renamed; formerly known as XSSAuditor::decodedSnippetForName(). Updated
to make use of XSSAuditor::canonicalize().
(WebCore::XSSAuditor::snippetFromAttribute): Renamed; formerly known as XSSAuditor::decodedSnippetForAttribute(). Moved
truncation logic from here to WebCore::truncateFor{Script, Src}LikeAttribute.
(WebCore::XSSAuditor::canonicalize): Added.
(WebCore::XSSAuditor::canonicalizedSnippetForJavaScript): Added.
(WebCore::canonicalize): Deleted.
(WebCore::XSSAuditor::decodedSnippetForName): Deleted.
(WebCore::XSSAuditor::decodedSnippetForAttribute): Deleted.
(WebCore::XSSAuditor::decodedSnippetForJavaScript): Deleted.

* html/parser/XSSAuditor.h: Define enum class for the various attribute truncation styles.</pre>
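<p>The three canonicalization steps named in the log message (decode, truncate if applicable, strip noise) can be sketched outside of WebKit roughly as follows. This is a minimal illustration, not the committed code: it uses <code>std::string</code> in place of WTF's <code>String</code>; <code>decodePercentEscapes</code>, <code>fullyDecode</code> and the cap of 100 are assumptions made for the example; the noise predicate covers only the part of <code>isNonCanonicalCharacter</code> visible in the diff context; and the script-like truncation step is left out.</p>

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <string>

// Same enumerators as the TruncationStyle enum class used in the diff.
enum class TruncationStyle { None, SrcLikeAttribute, ScriptLikeAttribute };

// Assumption: stands in for kMaximumFragmentLengthTarget, whose value is not shown in the diff.
constexpr std::size_t maximumFragmentLength = 100;

// Hypothetical helper: decodes one pass of %XX escapes (no text-encoding handling).
std::string decodePercentEscapes(const std::string& input)
{
    std::string output;
    for (std::size_t i = 0; i < input.size(); ++i) {
        if (input[i] == '%' && i + 2 < input.size()
            && std::isxdigit(static_cast<unsigned char>(input[i + 1]))
            && std::isxdigit(static_cast<unsigned char>(input[i + 2]))) {
            output += static_cast<char>(std::stoi(input.substr(i + 1, 2), nullptr, 16));
            i += 2;
        } else
            output += input[i];
    }
    return output;
}

// Hypothetical simplification of fullyDecodeString(): repeat until nothing changes,
// so that doubly-escaped input such as %253C also ends up as '<'.
std::string fullyDecode(std::string snippet)
{
    for (;;) {
        std::string decoded = decodePercentEscapes(snippet);
        if (decoded == snippet)
            return snippet;
        snippet = decoded;
    }
}

// Only the conditions visible in the diff context; the real predicate may check more.
bool isNonCanonicalCharacter(char c)
{
    return c == '\\' || c == '0' || c == '\0' || c == '/' || static_cast<unsigned char>(c) >= 127;
}

// Port of the src-like rule: stop at the first ? or #, the third slash,
// or the first slash or < once a comma has been seen.
void truncateForSrcLikeAttribute(std::string& snippet)
{
    int slashCount = 0;
    bool commaSeen = false;
    for (std::size_t i = 0; i < snippet.size(); ++i) {
        char c = snippet[i];
        if (c == '?' || c == '#'
            || ((c == '/' || c == '\\') && (commaSeen || ++slashCount > 2))
            || (c == '<' && commaSeen)) {
            snippet.resize(i);
            return;
        }
        if (c == ',')
            commaSeen = true;
    }
}

std::string canonicalize(const std::string& snippet, TruncationStyle style)
{
    std::string decoded = fullyDecode(snippet);
    if (style != TruncationStyle::None) {
        if (decoded.size() > maximumFragmentLength)
            decoded.resize(maximumFragmentLength);
        if (style == TruncationStyle::SrcLikeAttribute)
            truncateForSrcLikeAttribute(decoded);
        // ScriptLikeAttribute truncation is omitted from this sketch.
    }
    // Noise is removed last: slashes are noise, but truncation still needs to see them.
    decoded.erase(std::remove_if(decoded.begin(), decoded.end(), isNonCanonicalCharacter), decoded.end());
    return decoded;
}
```

<p>Under these assumptions, <code>canonicalize("http://a.example/x.js?page=1", TruncationStyle::SrcLikeAttribute)</code> cuts the string at the third slash and then drops the remaining slashes. The ordering matters for the reason given in the comment this changeset removes from <code>decodedSnippetForAttribute()</code>: embedded slashes have to survive until truncation has run.</p>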

<h3>Modified Paths</h3>

<ul>

<li><a href="#trunkSourceWebCoreChangeLog">trunk/Source/WebCore/ChangeLog</a></li>

<li><a href="#trunkSourceWebCorehtmlparserXSSAuditorcpp">trunk/Source/WebCore/html/parser/XSSAuditor.cpp</a></li>

<li><a href="#trunkSourceWebCorehtmlparserXSSAuditorh">trunk/Source/WebCore/html/parser/XSSAuditor.h</a></li>

</ul>

</div>

<div id="patch">

<h3>Diff</h3>

<a id="trunkSourceWebCoreChangeLog"></a>

<div class="modfile"><h4>Modified: trunk/Source/WebCore/ChangeLog (195073 => 195074)</h4>

<pre class="diff"><span>

<span class="info">--- trunk/Source/WebCore/ChangeLog        2016-01-14 21:37:49 UTC (rev 195073)

+++ trunk/Source/WebCore/ChangeLog        2016-01-14 21:40:13 UTC (rev 195074)

</span><span class="lines">@@ -1,5 +1,49 @@

</span><span class="cx"> 2016-01-14  Daniel Bates  &lt;dabates@apple.com&gt;

</span><span class="cx"> 

</span><ins>+        [XSS Auditor] Extract attribute truncation logic and formalize string canonicalization

+        https://bugs.webkit.org/show_bug.cgi?id=152874

+

+        Reviewed by Brent Fulgham.

+

+        Derived from Blink patch (by Tom Sepez &lt;tsepez@chromium.org&gt;):

+        &lt;https://src.chromium.org/viewvc/blink?revision=176339&amp;view=revision&gt;

+

+        Extract the src-like and script-like attribute truncation logic into independent functions

+        towards making it more straightforward to re-purpose this logic. Additionally, formalize the

+        concept of string canonicalization as a member function that consolidates the process of

+        decoding URL escape sequences, truncating the decoded string (if applicable), and removing

+        characters that are considered noise.

+

+        * html/parser/XSSAuditor.cpp:

+        (WebCore::truncateForSrcLikeAttribute): Extracted from XSSAuditor::decodedSnippetForAttribute().

+        (WebCore::truncateForScriptLikeAttribute): Ditto.

+        (WebCore::XSSAuditor::init): Write in terms of XSSAuditor::canonicalize().

+        (WebCore::XSSAuditor::filterCharacterToken): Updated to make use of formalized canonicalization methods.

+        (WebCore::XSSAuditor::filterScriptToken): Ditto.

+        (WebCore::XSSAuditor::filterObjectToken): Ditto.

+        (WebCore::XSSAuditor::filterParamToken): Ditto.

+        (WebCore::XSSAuditor::filterEmbedToken): Ditto.

+        (WebCore::XSSAuditor::filterAppletToken): Ditto.

+        (WebCore::XSSAuditor::filterFrameToken): Ditto.

+        (WebCore::XSSAuditor::filterInputToken): Ditto.

+        (WebCore::XSSAuditor::filterButtonToken): Ditto.

+        (WebCore::XSSAuditor::eraseDangerousAttributesIfInjected): Ditto.

+        (WebCore::XSSAuditor::eraseAttributeIfInjected): Updated code to use early return style and avoid an unnecessary string

+        comparison when we know that a src attribute was injected.

+        (WebCore::XSSAuditor::canonicalizedSnippetForTagName): Renamed; formerly known as XSSAuditor::decodedSnippetForName(). Updated

+        to make use of XSSAuditor::canonicalize().

+        (WebCore::XSSAuditor::snippetFromAttribute): Renamed; formerly known as XSSAuditor::decodedSnippetForAttribute(). Moved

+        truncation logic from here to WebCore::truncateFor{Script, Src}LikeAttribute.

+        (WebCore::XSSAuditor::canonicalize): Added.

+        (WebCore::XSSAuditor::canonicalizedSnippetForJavaScript): Added.

+        (WebCore::canonicalize): Deleted.

+        (WebCore::XSSAuditor::decodedSnippetForName): Deleted.

+        (WebCore::XSSAuditor::decodedSnippetForAttribute): Deleted.

+        (WebCore::XSSAuditor::decodedSnippetForJavaScript): Deleted.

+        * html/parser/XSSAuditor.h: Define enum class for the various attribute truncation styles.

+

+2016-01-14  Daniel Bates  &lt;dabates@apple.com&gt;

+

</ins><span class="cx">         [XSS Auditor] Partial bypass when web server collapses path components

</span><span class="cx">         https://bugs.webkit.org/show_bug.cgi?id=152872

</span><span class="cx"> 

</span></span></pre></div>
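<p>The script-like truncation rule that this changeset extracts into <code>truncateForScriptLikeAttribute()</code> (see the XSSAuditor.cpp diff that follows) can be tried in isolation with a sketch like the one below. It uses <code>std::string</code> rather than WTF's <code>String</code>. The set of terminating characters is an assumption read off the code comment (ampersand, slash, less-than sign, and quotes); the real <code>isTerminatingCharacter()</code> is not part of this diff and may differ.</p>

```cpp
#include <cstddef>
#include <string>

// Assumption: quote characters recognized after the equals sign.
bool isHTMLQuote(char c)
{
    return c == '"' || c == '\'';
}

// Truncates an attribute snippet such as |name="value| before the first
// character that may mark the start of page-supplied (non-injected) content.
void truncateForScriptLikeAttribute(std::string& snippet)
{
    // Find the first equals sign.
    std::size_t position = snippet.find('=');
    if (position == std::string::npos)
        return;

    // Skip HTML whitespace that follows it.
    position = snippet.find_first_not_of(" \t\n\f\r", position + 1);
    if (position == std::string::npos)
        return;

    // A quote immediately after the equals sign opens the value; do not stop on it.
    if (isHTMLQuote(snippet[position]))
        ++position;

    // Assumed terminating set: ampersand, slash, less-than sign, or a quote.
    position = snippet.find_first_of("&/<\"'", position);
    if (position != std::string::npos)
        snippet.resize(position);
}
```

<p>For example, the snippet <code>onclick="alert(1)//</code> is cut back to <code>onclick="alert(1)</code>, so a trailing comment that swallows page content does not prevent the match against the request.</p>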

<a id="trunkSourceWebCorehtmlparserXSSAuditorcpp"></a>

<div class="modfile"><h4>Modified: trunk/Source/WebCore/html/parser/XSSAuditor.cpp (195073 => 195074)</h4>

<pre class="diff"><span>

<span class="info">--- trunk/Source/WebCore/html/parser/XSSAuditor.cpp        2016-01-14 21:37:49 UTC (rev 195073)

+++ trunk/Source/WebCore/html/parser/XSSAuditor.cpp        2016-01-14 21:40:13 UTC (rev 195074)

</span><span class="lines">@@ -63,11 +63,6 @@

</span><span class="cx">     return (c == '\\' || c == '0' || c == '\0' || c == '/' || c &gt;= 127);

</span><span class="cx"> }

</span><span class="cx"> 

</span><del>-static String canonicalize(const String&amp; string)

-{

-    return string.removeCharacters(&amp;isNonCanonicalCharacter);

-}

-

</del><span class="cx"> static bool isRequiredForInjection(UChar c)

</span><span class="cx"> {

</span><span class="cx">     return (c == '\'' || c == '&quot;' || c == '&lt;' || c == '&gt;');

</span><span class="lines">@@ -180,6 +175,57 @@

</span><span class="cx">     return workingString;

</span><span class="cx"> }

</span><span class="cx"> 

</span><ins>+static void truncateForSrcLikeAttribute(String&amp; decodedSnippet)

+{

+    // In HTTP URLs, characters following the first ?, #, or third slash may come from

+    // the page itself and can be merely ignored by an attacker's server when a remote

+    // script or script-like resource is requested. In DATA URLS, the payload starts at

+    // the first comma, and the the first /*, //, or &lt;!-- may introduce a comment. Characters

+    // following this may come from the page itself and may be ignored when the script is

+    // executed. For simplicity, we don't differentiate based on URL scheme, and stop at

+    // the first # or ?, the third slash, or the first slash or &lt; once a comma is seen.

+    int slashCount = 0;

+    bool commaSeen = false;

+    for (size_t currentLength = 0; currentLength &lt; decodedSnippet.length(); ++currentLength) {

+        UChar currentChar = decodedSnippet[currentLength];

+        if (currentChar == '?'

+            || currentChar == '#'

+            || ((currentChar == '/' || currentChar == '\\') &amp;&amp; (commaSeen || ++slashCount &gt; 2))

+            || (currentChar == '&lt;' &amp;&amp; commaSeen)) {

+            decodedSnippet.truncate(currentLength);

+            return;

+        }

+        if (currentChar == ',')

+            commaSeen = true;

+    }

+}

+

+static void truncateForScriptLikeAttribute(String&amp; decodedSnippet)

+{

+    // Beware of trailing characters which came from the page itself, not the

+    // injected vector. Excluding the terminating character covers common cases

+    // where the page immediately ends the attribute, but doesn't cover more

+    // complex cases where there is other page data following the injection.

+    // Generally, these won't parse as JavaScript, so the injected vector

+    // typically excludes them from consideration via a single-line comment or

+    // by enclosing them in a string literal terminated later by the page's own

+    // closing punctuation. Since the snippet has not been parsed, the vector

+    // may also try to introduce these via entities. As a result, we'd like to

+    // stop before the first &quot;//&quot;, the first &lt;!--, the first entity, or the first

+    // quote not immediately following the first equals sign (taking whitespace

+    // into consideration). To keep things simpler, we don't try to distinguish

+    // between entity-introducing ampersands vs. other uses, nor do we bother to

+    // check for a second slash for a comment, nor do we bother to check for

+    // !-- following a less-than sign. We stop instead on any ampersand

+    // slash, or less-than sign.

+    size_t position = 0;

+    if ((position = decodedSnippet.find('=')) != notFound

+        &amp;&amp; (position = decodedSnippet.find(isNotHTMLSpace, position + 1)) != notFound

+        &amp;&amp; (position = decodedSnippet.find(isTerminatingCharacter, isHTMLQuote(decodedSnippet[position]) ? position + 1 : position)) != notFound) {

+        decodedSnippet.truncate(position);

+    }

+}

+

</ins><span class="cx"> static ContentSecurityPolicy::ReflectedXSSDisposition combineXSSProtectionHeaderAndCSP(ContentSecurityPolicy::ReflectedXSSDisposition xssProtection, ContentSecurityPolicy::ReflectedXSSDisposition reflectedXSS)

</span><span class="cx"> {

</span><span class="cx">     ContentSecurityPolicy::ReflectedXSSDisposition result = std::max(xssProtection, reflectedXSS);

</span><span class="lines">@@ -269,7 +315,7 @@

</span><span class="cx">     if (document-&gt;decoder())

</span><span class="cx">         m_encoding = document-&gt;decoder()-&gt;encoding();

</span><span class="cx"> 

</span><del>-    m_decodedURL = canonicalize(fullyDecodeString(m_documentURL.string(), m_encoding));

</del><ins>+    m_decodedURL = canonicalize(m_documentURL.string(), TruncationStyle::None);

</ins><span class="cx">     if (m_decodedURL.find(isRequiredForInjection) == notFound)

</span><span class="cx">         m_decodedURL = String();

</span><span class="cx"> 

</span><span class="lines">@@ -307,7 +353,7 @@

</span><span class="cx">         if (httpBody &amp;&amp; !httpBody-&gt;isEmpty()) {

</span><span class="cx">             httpBodyAsString = httpBody-&gt;flattenToString();

</span><span class="cx">             if (!httpBodyAsString.isEmpty()) {

</span><del>-                m_decodedHTTPBody = canonicalize(fullyDecodeString(httpBodyAsString, m_encoding));

</del><ins>+                m_decodedHTTPBody = canonicalize(httpBodyAsString, TruncationStyle::None);

</ins><span class="cx">                 if (m_decodedHTTPBody.find(isRequiredForInjection) == notFound)

</span><span class="cx">                     m_decodedHTTPBody = String();

</span><span class="cx">                 if (m_decodedHTTPBody.length() &gt;= minimumLengthForSuffixTree)

</span><span class="lines">@@ -389,7 +435,7 @@

</span><span class="cx"> bool XSSAuditor::filterCharacterToken(const FilterTokenRequest&amp; request)

</span><span class="cx"> {

</span><span class="cx">     ASSERT(m_scriptTagNestingLevel);

</span><del>-    if (m_wasScriptTagFoundInRequest &amp;&amp; isContainedInRequest(decodedSnippetForJavaScript(request))) {

</del><ins>+    if (m_wasScriptTagFoundInRequest &amp;&amp; isContainedInRequest(canonicalizedSnippetForJavaScript(request))) {

</ins><span class="cx">         request.token.clear();

</span><span class="cx">         LChar space = ' ';

</span><span class="cx">         request.token.appendToCharacter(space); // Technically, character tokens can't be empty.

</span><span class="lines">@@ -403,12 +449,12 @@

</span><span class="cx">     ASSERT(request.token.type() == HTMLToken::StartTag);

</span><span class="cx">     ASSERT(hasName(request.token, scriptTag));

</span><span class="cx"> 

</span><del>-    m_wasScriptTagFoundInRequest = isContainedInRequest(decodedSnippetForName(request));

</del><ins>+    m_wasScriptTagFoundInRequest = isContainedInRequest(canonicalizedSnippetForTagName(request));

</ins><span class="cx"> 

</span><span class="cx">     bool didBlockScript = false;

</span><span class="cx">     if (m_wasScriptTagFoundInRequest) {

</span><del>-        didBlockScript |= eraseAttributeIfInjected(request, srcAttr, blankURL().string(), SrcLikeAttribute);

-        didBlockScript |= eraseAttributeIfInjected(request, XLinkNames::hrefAttr, blankURL().string(), SrcLikeAttribute);

</del><ins>+        didBlockScript |= eraseAttributeIfInjected(request, srcAttr, blankURL().string(), TruncationStyle::SrcLikeAttribute);

+        didBlockScript |= eraseAttributeIfInjected(request, XLinkNames::hrefAttr, blankURL().string(), TruncationStyle::SrcLikeAttribute);

</ins><span class="cx">     }

</span><span class="cx"> 

</span><span class="cx">     return didBlockScript;

</span><span class="lines">@@ -420,8 +466,8 @@

</span><span class="cx">     ASSERT(hasName(request.token, objectTag));

</span><span class="cx"> 

</span><span class="cx">     bool didBlockScript = false;

</span><del>-    if (isContainedInRequest(decodedSnippetForName(request))) {

-        didBlockScript |= eraseAttributeIfInjected(request, dataAttr, blankURL().string(), SrcLikeAttribute);

</del><ins>+    if (isContainedInRequest(canonicalizedSnippetForTagName(request))) {

+        didBlockScript |= eraseAttributeIfInjected(request, dataAttr, blankURL().string(), TruncationStyle::SrcLikeAttribute);

</ins><span class="cx">         didBlockScript |= eraseAttributeIfInjected(request, typeAttr);

</span><span class="cx">         didBlockScript |= eraseAttributeIfInjected(request, classidAttr);

</span><span class="cx">     }

</span><span class="lines">@@ -441,7 +487,7 @@

</span><span class="cx">     if (!HTMLParamElement::isURLParameter(String(nameAttribute.value)))

</span><span class="cx">         return false;

</span><span class="cx"> 

</span><del>-    return eraseAttributeIfInjected(request, valueAttr, blankURL().string(), SrcLikeAttribute);

</del><ins>+    return eraseAttributeIfInjected(request, valueAttr, blankURL().string(), TruncationStyle::SrcLikeAttribute);

</ins><span class="cx"> }

</span><span class="cx"> 

</span><span class="cx"> bool XSSAuditor::filterEmbedToken(const FilterTokenRequest&amp; request)

</span><span class="lines">@@ -450,9 +496,9 @@

</span><span class="cx">     ASSERT(hasName(request.token, embedTag));

</span><span class="cx"> 

</span><span class="cx">     bool didBlockScript = false;

</span><del>-    if (isContainedInRequest(decodedSnippetForName(request))) {

-        didBlockScript |= eraseAttributeIfInjected(request, codeAttr, String(), SrcLikeAttribute);

-        didBlockScript |= eraseAttributeIfInjected(request, srcAttr, blankURL().string(), SrcLikeAttribute);

</del><ins>+    if (isContainedInRequest(canonicalizedSnippetForTagName(request))) {

+        didBlockScript |= eraseAttributeIfInjected(request, codeAttr, String(), TruncationStyle::SrcLikeAttribute);

+        didBlockScript |= eraseAttributeIfInjected(request, srcAttr, blankURL().string(), TruncationStyle::SrcLikeAttribute);

</ins><span class="cx">         didBlockScript |= eraseAttributeIfInjected(request, typeAttr);

</span><span class="cx">     }

</span><span class="cx">     return didBlockScript;

</span><span class="lines">@@ -464,8 +510,8 @@

</span><span class="cx">     ASSERT(hasName(request.token, appletTag));

</span><span class="cx"> 

</span><span class="cx">     bool didBlockScript = false;

</span><del>-    if (isContainedInRequest(decodedSnippetForName(request))) {

-        didBlockScript |= eraseAttributeIfInjected(request, codeAttr, String(), SrcLikeAttribute);

</del><ins>+    if (isContainedInRequest(canonicalizedSnippetForTagName(request))) {

+        didBlockScript |= eraseAttributeIfInjected(request, codeAttr, String(), TruncationStyle::SrcLikeAttribute);

</ins><span class="cx">         didBlockScript |= eraseAttributeIfInjected(request, objectAttr);

</span><span class="cx">     }

</span><span class="cx">     return didBlockScript;

</span><span class="lines">@@ -476,9 +522,9 @@

</span><span class="cx">     ASSERT(request.token.type() == HTMLToken::StartTag);

</span><span class="cx">     ASSERT(hasName(request.token, iframeTag) || hasName(request.token, frameTag));

</span><span class="cx"> 

</span><del>-    bool didBlockScript = eraseAttributeIfInjected(request, srcdocAttr, String(), ScriptLikeAttribute);

-    if (isContainedInRequest(decodedSnippetForName(request)))

-        didBlockScript |= eraseAttributeIfInjected(request, srcAttr, String(), SrcLikeAttribute);

</del><ins>+    bool didBlockScript = eraseAttributeIfInjected(request, srcdocAttr, String(), TruncationStyle::ScriptLikeAttribute);

+    if (isContainedInRequest(canonicalizedSnippetForTagName(request)))

+        didBlockScript |= eraseAttributeIfInjected(request, srcAttr, String(), TruncationStyle::SrcLikeAttribute);

</ins><span class="cx"> 

</span><span class="cx">     return didBlockScript;

</span><span class="cx"> }

</span><span class="lines">@@ -512,7 +558,7 @@

</span><span class="cx">     ASSERT(request.token.type() == HTMLToken::StartTag);

</span><span class="cx">     ASSERT(hasName(request.token, inputTag));

</span><span class="cx"> 

</span><del>-    return eraseAttributeIfInjected(request, formactionAttr, blankURL().string(), SrcLikeAttribute);

</del><ins>+    return eraseAttributeIfInjected(request, formactionAttr, blankURL().string(), TruncationStyle::SrcLikeAttribute);

</ins><span class="cx"> }

</span><span class="cx"> 

</span><span class="cx"> bool XSSAuditor::filterButtonToken(const FilterTokenRequest&amp; request)

</span><span class="lines">@@ -520,7 +566,7 @@

</span><span class="cx">     ASSERT(request.token.type() == HTMLToken::StartTag);

</span><span class="cx">     ASSERT(hasName(request.token, buttonTag));

</span><span class="cx"> 

</span><del>-    return eraseAttributeIfInjected(request, formactionAttr, blankURL().string(), SrcLikeAttribute);

</del><ins>+    return eraseAttributeIfInjected(request, formactionAttr, blankURL().string(), TruncationStyle::SrcLikeAttribute);

</ins><span class="cx"> }

</span><span class="cx"> 

</span><span class="cx"> bool XSSAuditor::eraseDangerousAttributesIfInjected(const FilterTokenRequest&amp; request)

</span><span class="lines">@@ -536,7 +582,7 @@

</span><span class="cx">         bool valueContainsJavaScriptURL = (!isInlineEventHandler &amp;&amp; protocolIsJavaScript(strippedValue)) || (isSemicolonSeparatedAttribute(attribute) &amp;&amp; semicolonSeparatedValueContainsJavaScriptURL(strippedValue));

</span><span class="cx">         if (!isInlineEventHandler &amp;&amp; !valueContainsJavaScriptURL)

</span><span class="cx">             continue;

</span><del>-        if (!isContainedInRequest(decodedSnippetForAttribute(request, attribute, ScriptLikeAttribute)))

</del><ins>+        if (!isContainedInRequest(canonicalize(snippetFromAttribute(request, attribute), TruncationStyle::ScriptLikeAttribute)))

</ins><span class="cx">             continue;

</span><span class="cx">         request.token.eraseValueOfAttribute(i);

</span><span class="cx">         if (valueContainsJavaScriptURL)

</span><span class="lines">@@ -546,94 +592,59 @@

</span><span class="cx">     return didBlockScript;

</span><span class="cx"> }

</span><span class="cx"> 

</span><del>-bool XSSAuditor::eraseAttributeIfInjected(const FilterTokenRequest&amp; request, const QualifiedName&amp; attributeName, const String&amp; replacementValue, AttributeKind treatment)

</del><ins>+bool XSSAuditor::eraseAttributeIfInjected(const FilterTokenRequest&amp; request, const QualifiedName&amp; attributeName, const String&amp; replacementValue, TruncationStyle truncationStyle)

</ins><span class="cx"> {

</span><span class="cx">     size_t indexOfAttribute = 0;

</span><del>-    if (findAttributeWithName(request.token, attributeName, indexOfAttribute)) {

-        const HTMLToken::Attribute&amp; attribute = request.token.attributes().at(indexOfAttribute);

-        if (isContainedInRequest(decodedSnippetForAttribute(request, attribute, treatment))) {

-            if (threadSafeMatch(attributeName, srcAttr) &amp;&amp; isLikelySafeResource(String(attribute.value)))

-                return false;

-            if (threadSafeMatch(attributeName, http_equivAttr) &amp;&amp; !isDangerousHTTPEquiv(String(attribute.value)))

-                return false;

-            request.token.eraseValueOfAttribute(indexOfAttribute);

-            if (!replacementValue.isEmpty())

-                request.token.appendToAttributeValue(indexOfAttribute, replacementValue);

-            return true;

-        }

</del><ins>+    if (!findAttributeWithName(request.token, attributeName, indexOfAttribute))

+        return false;

+

+    const HTMLToken::Attribute&amp; attribute = request.token.attributes().at(indexOfAttribute);

+    if (!isContainedInRequest(canonicalize(snippetFromAttribute(request, attribute), truncationStyle)))

+        return false;

+

+    if (threadSafeMatch(attributeName, srcAttr)) {

+        if (isLikelySafeResource(String(attribute.value)))

+            return false;

+    } else if (threadSafeMatch(attributeName, http_equivAttr)) {

+        if (!isDangerousHTTPEquiv(String(attribute.value)))

+            return false;

</ins><span class="cx">     }

</span><del>-    return false;

</del><ins>+

+    request.token.eraseValueOfAttribute(indexOfAttribute);

+    if (!replacementValue.isEmpty())

+        request.token.appendToAttributeValue(indexOfAttribute, replacementValue);

+    return true;

</ins><span class="cx"> }

</span><span class="cx"> 

</span><del>-String XSSAuditor::decodedSnippetForName(const FilterTokenRequest&amp; request)

</del><ins>+String XSSAuditor::canonicalizedSnippetForTagName(const FilterTokenRequest&amp; request)

</ins><span class="cx"> {

</span><span class="cx">     // Grab a fixed number of characters equal to the length of the token's name plus one (to account for the &quot;&lt;&quot;).

</span><del>-    return canonicalize(fullyDecodeString(request.sourceTracker.source(request.token), m_encoding).substring(0, request.token.name().size() + 1));

</del><ins>+    return canonicalize(request.sourceTracker.source(request.token).substring(0, request.token.name().size() + 1), TruncationStyle::None);

</ins><span class="cx"> }

</span><span class="cx"> 

</span><del>-String XSSAuditor::decodedSnippetForAttribute(const FilterTokenRequest&amp; request, const HTMLToken::Attribute&amp; attribute, AttributeKind treatment)

</del><ins>+String XSSAuditor::snippetFromAttribute(const FilterTokenRequest&amp; request, const HTMLToken::Attribute&amp; attribute)

</ins><span class="cx"> {

</span><span class="cx">     // The range doesn't include the character which terminates the value. So,

</span><span class="cx">     // for an input of |name=&quot;value&quot;|, the snippet is |name=&quot;value|. For an

</span><span class="cx">     // unquoted input of |name=value |, the snippet is |name=value|.

</span><span class="cx">     // FIXME: We should grab one character before the name also.

</span><del>-    unsigned start = attribute.startOffset;

-    unsigned end = attribute.endOffset;

</del><ins>+    return request.sourceTracker.source(request.token, attribute.startOffset, attribute.endOffset);

+}

</ins><span class="cx"> 

</span><del>-    // We defer canonicalizing the decoded string here to preserve embedded slashes (if any) that

-    // may lead us to truncate the string.

-    String decodedSnippet = fullyDecodeString(request.sourceTracker.source(request.token, start, end), m_encoding);

-    decodedSnippet.truncate(kMaximumFragmentLengthTarget);

-    if (treatment == SrcLikeAttribute) {

-        int slashCount = 0;

-        bool commaSeen = false;

-        // In HTTP URLs, characters following the first ?, #, or third slash may come from 

-        // the page itself and can be merely ignored by an attacker's server when a remote

-        // script or script-like resource is requested. In DATA URLS, the payload starts at

-        // the first comma, and the the first /*, //, or &lt;!-- may introduce a comment. Characters

-        // following this may come from the page itself and may be ignored when the script is

-        // executed. For simplicity, we don't differentiate based on URL scheme, and stop at

-        // the first # or ?, the third slash, or the first slash or &lt; once a comma is seen.

-        for (size_t currentLength = 0; currentLength &lt; decodedSnippet.length(); ++currentLength) {

-            UChar currentChar = decodedSnippet[currentLength];

-            if (currentChar == '?'

-                || currentChar == '#'

-                || ((currentChar == '/' || currentChar == '\\') &amp;&amp; (commaSeen || ++slashCount &gt; 2))

-                || (currentChar == '&lt;' &amp;&amp; commaSeen)) {

-                decodedSnippet.truncate(currentLength);

-                break;

-            }

-            if (currentChar == ',')

-                commaSeen = true;

-        }

-    } else if (treatment == ScriptLikeAttribute) {

-        // Beware of trailing characters which came from the page itself, not the 

-        // injected vector. Excluding the terminating character covers common cases

-        // where the page immediately ends the attribute, but doesn't cover more

-        // complex cases where there is other page data following the injection. 

-        // Generally, these won't parse as javascript, so the injected vector

-        // typically excludes them from consideration via a single-line comment or

-        // by enclosing them in a string literal terminated later by the page's own

-        // closing punctuation. Since the snippet has not been parsed, the vector

-        // may also try to introduce these via entities. As a result, we'd like to

-        // stop before the first &quot;//&quot;, the first &lt;!--, the first entity, or the first

-        // quote not immediately following the first equals sign (taking whitespace

-        // into consideration). To keep things simpler, we don't try to distinguish

-        // between entity-introducing amperands vs. other uses, nor do we bother to

-        // check for a second slash for a comment, nor do we bother to check for

-        // !-- following a less-than sign. We stop instead on any ampersand

-        // slash, or less-than sign.

-        size_t position = 0;

-        if ((position = decodedSnippet.find('=')) != notFound

-            &amp;&amp; (position = decodedSnippet.find(isNotHTMLSpace, position + 1)) != notFound

-            &amp;&amp; (position = decodedSnippet.find(isTerminatingCharacter, isHTMLQuote(decodedSnippet[position]) ? position + 1 : position)) != notFound) {

-            decodedSnippet.truncate(position);

-        }

</del><ins>+String XSSAuditor::canonicalize(const String&amp; snippet, TruncationStyle truncationStyle)

+{

+    String decodedSnippet = fullyDecodeString(snippet, m_encoding);

+    if (truncationStyle != TruncationStyle::None) {

+        decodedSnippet.truncate(kMaximumFragmentLengthTarget);

+        if (truncationStyle == TruncationStyle::SrcLikeAttribute)

+            truncateForSrcLikeAttribute(decodedSnippet);

+        else if (truncationStyle == TruncationStyle::ScriptLikeAttribute)

+            truncateForScriptLikeAttribute(decodedSnippet);

</ins><span class="cx">     }

</span><del>-    return canonicalize(decodedSnippet);

</del><ins>+    return decodedSnippet.removeCharacters(&amp;isNonCanonicalCharacter);

</ins><span class="cx"> }

</span><span class="cx"> 

</span><del>-String XSSAuditor::decodedSnippetForJavaScript(const FilterTokenRequest&amp; request)

</del><ins>+String XSSAuditor::canonicalizedSnippetForJavaScript(const FilterTokenRequest&amp; request)

</ins><span class="cx"> {

</span><span class="cx">     String string = request.sourceTracker.source(request.token);

</span><span class="cx">     size_t startPosition = 0;

</span><span class="lines">@@ -687,7 +698,6 @@

</span><span class="cx">                 foundPosition = lastNonSpacePosition;

</span><span class="cx">                 break;

</span><span class="cx">             }

</span><del>-

</del><span class="cx">             if (foundPosition &gt; startPosition + kMaximumFragmentLengthTarget) {

</span><span class="cx">                 // After hitting the length target, we can only stop at a point where we know we are

</span><span class="cx">                 // not in the middle of a %-escape sequence. For the sake of simplicity, approximate

</span><span class="lines">@@ -701,7 +711,7 @@

</span><span class="cx">                 lastNonSpacePosition = foundPosition;

</span><span class="cx">         }

</span><span class="cx"> 

</span><del>-        result = canonicalize(fullyDecodeString(string.substring(startPosition, foundPosition - startPosition), m_encoding));

</del><ins>+        result = canonicalize(string.substring(startPosition, foundPosition - startPosition), TruncationStyle::None);

</ins><span class="cx">         startPosition = foundPosition + 1;

</span><span class="cx">     }

</span><span class="cx">     return result;

</span></span></pre></div>

<a id="trunkSourceWebCorehtmlparserXSSAuditorh"></a>

<div class="modfile"><h4>Modified: trunk/Source/WebCore/html/parser/XSSAuditor.h (195073 => 195074)</h4>

<pre class="diff"><span>

<span class="info">--- trunk/Source/WebCore/html/parser/XSSAuditor.h        2016-01-14 21:37:49 UTC (rev 195073)

+++ trunk/Source/WebCore/html/parser/XSSAuditor.h        2016-01-14 21:40:13 UTC (rev 195074)

</span><span class="lines">@@ -70,7 +70,8 @@

</span><span class="cx">         Initialized

</span><span class="cx">     };

</span><span class="cx"> 

</span><del>-    enum AttributeKind {

</del><ins>+    enum class TruncationStyle {

+        None,

</ins><span class="cx">         NormalAttribute,

</span><span class="cx">         SrcLikeAttribute,

</span><span class="cx">         ScriptLikeAttribute

</span><span class="lines">@@ -92,12 +93,12 @@

</span><span class="cx">     bool filterButtonToken(const FilterTokenRequest&amp;);

</span><span class="cx"> 

</span><span class="cx">     bool eraseDangerousAttributesIfInjected(const FilterTokenRequest&amp;);

</span><del>-    bool eraseAttributeIfInjected(const FilterTokenRequest&amp;, const QualifiedName&amp;, const String&amp; replacementValue = String(), AttributeKind treatment = NormalAttribute);

</del><ins>+    bool eraseAttributeIfInjected(const FilterTokenRequest&amp;, const QualifiedName&amp;, const String&amp; replacementValue = String(), TruncationStyle = TruncationStyle::NormalAttribute);

</ins><span class="cx"> 

</span><del>-    String decodedSnippetForToken(const HTMLToken&amp;);

-    String decodedSnippetForName(const FilterTokenRequest&amp;);

-    String decodedSnippetForAttribute(const FilterTokenRequest&amp;, const HTMLToken::Attribute&amp;, AttributeKind treatment = NormalAttribute);

-    String decodedSnippetForJavaScript(const FilterTokenRequest&amp;);

</del><ins>+    String canonicalizedSnippetForTagName(const FilterTokenRequest&amp;);

+    String canonicalizedSnippetForJavaScript(const FilterTokenRequest&amp;);

+    String snippetFromAttribute(const FilterTokenRequest&amp;, const HTMLToken::Attribute&amp;);

+    String canonicalize(const String&amp;, TruncationStyle);

</ins><span class="cx"> 

</span><span class="cx">     bool isContainedInRequest(const String&amp;);

</span><span class="cx">     bool isLikelySafeResource(const String&amp; url);

</span></span></pre>

</div>

</div>

</body>

</html>