<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
"http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head><meta http-equiv="content-type" content="text/html; charset=utf-8" />
<title>[195074] trunk/Source/WebCore</title>
</head>
<body>
<style type="text/css"><!--
#msg dl.meta { border: 1px #006 solid; background: #369; padding: 6px; color: #fff; }
#msg dl.meta dt { float: left; width: 6em; font-weight: bold; }
#msg dt:after { content:':';}
#msg dl, #msg dt, #msg ul, #msg li, #header, #footer, #logmsg { font-family: verdana,arial,helvetica,sans-serif; font-size: 10pt; }
#msg dl a { font-weight: bold}
#msg dl a:link { color:#fc3; }
#msg dl a:active { color:#ff0; }
#msg dl a:visited { color:#cc6; }
h3 { font-family: verdana,arial,helvetica,sans-serif; font-size: 10pt; font-weight: bold; }
#msg pre { overflow: auto; background: #ffc; border: 1px #fa0 solid; padding: 6px; }
#logmsg { background: #ffc; border: 1px #fa0 solid; padding: 1em 1em 0 1em; }
#logmsg p, #logmsg pre, #logmsg blockquote { margin: 0 0 1em 0; }
#logmsg p, #logmsg li, #logmsg dt, #logmsg dd { line-height: 14pt; }
#logmsg h1, #logmsg h2, #logmsg h3, #logmsg h4, #logmsg h5, #logmsg h6 { margin: .5em 0; }
#logmsg h1:first-child, #logmsg h2:first-child, #logmsg h3:first-child, #logmsg h4:first-child, #logmsg h5:first-child, #logmsg h6:first-child { margin-top: 0; }
#logmsg ul, #logmsg ol { padding: 0; list-style-position: inside; margin: 0 0 0 1em; }
#logmsg ul { text-indent: -1em; padding-left: 1em; }#logmsg ol { text-indent: -1.5em; padding-left: 1.5em; }
#logmsg > ul, #logmsg > ol { margin: 0 0 1em 0; }
#logmsg pre { background: #eee; padding: 1em; }
#logmsg blockquote { border: 1px solid #fa0; border-left-width: 10px; padding: 1em 1em 0 1em; background: white;}
#logmsg dl { margin: 0; }
#logmsg dt { font-weight: bold; }
#logmsg dd { margin: 0; padding: 0 0 0.5em 0; }
#logmsg dd:before { content:'\00bb';}
#logmsg table { border-spacing: 0px; border-collapse: collapse; border-top: 4px solid #fa0; border-bottom: 1px solid #fa0; background: #fff; }
#logmsg table th { text-align: left; font-weight: normal; padding: 0.2em 0.5em; border-top: 1px dotted #fa0; }
#logmsg table td { text-align: right; border-top: 1px dotted #fa0; padding: 0.2em 0.5em; }
#logmsg table thead th { text-align: center; border-bottom: 1px solid #fa0; }
#logmsg table th.Corner { text-align: left; }
#logmsg hr { border: none 0; border-top: 2px dashed #fa0; height: 1px; }
#header, #footer { color: #fff; background: #636; border: 1px #300 solid; padding: 6px; }
#patch { width: 100%; }
#patch h4 {font-family: verdana,arial,helvetica,sans-serif;font-size:10pt;padding:8px;background:#369;color:#fff;margin:0;}
#patch .propset h4, #patch .binary h4 {margin:0;}
#patch pre {padding:0;line-height:1.2em;margin:0;}
#patch .diff {width:100%;background:#eee;padding: 0 0 10px 0;overflow:auto;}
#patch .propset .diff, #patch .binary .diff {padding:10px 0;}
#patch span {display:block;padding:0 10px;}
#patch .modfile, #patch .addfile, #patch .delfile, #patch .propset, #patch .binary, #patch .copfile {border:1px solid #ccc;margin:10px 0;}
#patch ins {background:#dfd;text-decoration:none;display:block;padding:0 10px;}
#patch del {background:#fdd;text-decoration:none;display:block;padding:0 10px;}
#patch .lines, .info {color:#888;background:#fff;}
--></style>
<div id="msg">
<dl class="meta">
<dt>Revision</dt> <dd><a href="http://trac.webkit.org/projects/webkit/changeset/195074">195074</a></dd>
<dt>Author</dt> <dd>dbates@webkit.org</dd>
<dt>Date</dt> <dd>2016-01-14 13:40:13 -0800 (Thu, 14 Jan 2016)</dd>
</dl>
<h3>Log Message</h3>
<pre>[XSS Auditor] Extract attribute truncation logic and formalize string canonicalization
https://bugs.webkit.org/show_bug.cgi?id=152874
Reviewed by Brent Fulgham.
Derived from Blink patch (by Tom Sepez &lt;tsepez@chromium.org&gt;):
&lt;https://src.chromium.org/viewvc/blink?revision=176339&amp;view=revision&gt;
Extract the src-like and script-like attribute truncation logic into independent functions
towards making it more straightforward to re-purpose this logic. Additionally, formalize the
concept of string canonicalization as a member function that consolidates the process of
decoding URL escape sequences, truncating the decoded string (if applicable), and removing
characters that are considered noise.
* html/parser/XSSAuditor.cpp:
(WebCore::truncateForSrcLikeAttribute): Extracted from XSSAuditor::decodedSnippetForAttribute().
(WebCore::truncateForScriptLikeAttribute): Ditto.
(WebCore::XSSAuditor::init): Write in terms of XSSAuditor::canonicalize().
(WebCore::XSSAuditor::filterCharacterToken): Updated to make use of formalized canonicalization methods.
(WebCore::XSSAuditor::filterScriptToken): Ditto.
(WebCore::XSSAuditor::filterObjectToken): Ditto.
(WebCore::XSSAuditor::filterParamToken): Ditto.
(WebCore::XSSAuditor::filterEmbedToken): Ditto.
(WebCore::XSSAuditor::filterAppletToken): Ditto.
(WebCore::XSSAuditor::filterFrameToken): Ditto.
(WebCore::XSSAuditor::filterInputToken): Ditto.
(WebCore::XSSAuditor::filterButtonToken): Ditto.
(WebCore::XSSAuditor::eraseDangerousAttributesIfInjected): Ditto.
(WebCore::XSSAuditor::eraseAttributeIfInjected): Updated code to use early return style and avoid an unnecessary string
comparison when we know that a src attribute was injected.
(WebCore::XSSAuditor::canonicalizedSnippetForTagName): Renamed; formerly known as XSSAuditor::decodedSnippetForName(). Updated
to make use of XSSAuditor::canonicalize().
(WebCore::XSSAuditor::snippetFromAttribute): Renamed; formerly known as XSSAuditor::decodedSnippetForAttribute(). Moved
truncation logic from here to WebCore::truncateFor{Script, Src}LikeAttribute.
(WebCore::XSSAuditor::canonicalize): Added.
(WebCore::XSSAuditor::canonicalizedSnippetForJavaScript): Added.
(WebCore::canonicalize): Deleted.
(WebCore::XSSAuditor::decodedSnippetForName): Deleted.
(WebCore::XSSAuditor::decodedSnippetForAttribute): Deleted.
(WebCore::XSSAuditor::decodedSnippetForJavaScript): Deleted.
* html/parser/XSSAuditor.h: Define enum class for the various attribute truncation styles.</pre>
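<p>The three canonicalization steps named in the log message (decode URL escape sequences, truncate if applicable, remove noise characters) can be sketched outside WebKit as follows. This is a minimal illustration, not the WebCore implementation: it uses <code>std::string</code> in place of <code>WTF::String</code>, the helpers <code>decodeOnce</code>, <code>fullyDecode</code> and <code>isNoise</code> are hypothetical stand-ins, the length cap of 100 is an assumed value, and the noise test mirrors only the part of <code>isNonCanonicalCharacter</code> that is visible in the diff context.</p>

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Hypothetical stand-in for XSSAuditor::TruncationStyle (attribute-specific styles omitted).
enum class TruncationStyle { None, NormalAttribute };

// Assumed cap; the real kMaximumFragmentLengthTarget value is not shown in this changeset.
constexpr std::size_t maximumFragmentLength = 100;

static int hexValue(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Decodes one level of %XX escapes; malformed escapes are copied through unchanged.
static std::string decodeOnce(const std::string& input)
{
    std::string output;
    for (std::size_t i = 0; i < input.size(); ++i) {
        if (input[i] == '%' && i + 2 < input.size()) {
            int high = hexValue(input[i + 1]);
            int low = hexValue(input[i + 2]);
            if (high >= 0 && low >= 0) {
                output.push_back(static_cast<char>(high * 16 + low));
                i += 2;
                continue;
            }
        }
        output.push_back(input[i]);
    }
    return output;
}

// Repeats decoding until the string stops changing, so nested escapes such as %253C collapse.
static std::string fullyDecode(std::string value)
{
    for (;;) {
        std::string decoded = decodeOnce(value);
        if (decoded == value)
            return value;
        value = decoded;
    }
}

// Mirrors the visible tail of isNonCanonicalCharacter; the real function may test more.
static bool isNoise(unsigned char c)
{
    return c == '\\' || c == '0' || c == '\0' || c == '/' || c >= 127;
}

// Decode, optionally cap the length, then strip noise characters.
std::string canonicalize(const std::string& snippet, TruncationStyle style)
{
    std::string decoded = fullyDecode(snippet);
    if (style != TruncationStyle::None && decoded.size() > maximumFragmentLength)
        decoded.resize(maximumFragmentLength);
    decoded.erase(std::remove_if(decoded.begin(), decoded.end(),
        [](char c) { return isNoise(static_cast<unsigned char>(c)); }), decoded.end());
    return decoded;
}
```

<p>Under these assumptions, <code>canonicalize("%253Cscript%253E", TruncationStyle::None)</code> yields the decoded tag text, and slashes, backslashes and the digit zero are dropped from any input. Truncating before the noise characters are removed matches the ordering in the patch, where the attribute-specific truncation needs to see embedded slashes.</p>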
<h3>Modified Paths</h3>
<ul>
<li><a href="#trunkSourceWebCoreChangeLog">trunk/Source/WebCore/ChangeLog</a></li>
<li><a href="#trunkSourceWebCorehtmlparserXSSAuditorcpp">trunk/Source/WebCore/html/parser/XSSAuditor.cpp</a></li>
<li><a href="#trunkSourceWebCorehtmlparserXSSAuditorh">trunk/Source/WebCore/html/parser/XSSAuditor.h</a></li>
</ul>
</div>
<div id="patch">
<h3>Diff</h3>
<a id="trunkSourceWebCoreChangeLog"></a>
<div class="modfile"><h4>Modified: trunk/Source/WebCore/ChangeLog (195073 => 195074)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/Source/WebCore/ChangeLog        2016-01-14 21:37:49 UTC (rev 195073)
+++ trunk/Source/WebCore/ChangeLog        2016-01-14 21:40:13 UTC (rev 195074)
</span><span class="lines">@@ -1,5 +1,49 @@
</span><span class="cx"> 2016-01-14 Daniel Bates &lt;dabates@apple.com&gt;
</span><span class="cx">
</span><ins>+ [XSS Auditor] Extract attribute truncation logic and formalize string canonicalization
+ https://bugs.webkit.org/show_bug.cgi?id=152874
+
+ Reviewed by Brent Fulgham.
+
+ Derived from Blink patch (by Tom Sepez &lt;tsepez@chromium.org&gt;):
+ &lt;https://src.chromium.org/viewvc/blink?revision=176339&amp;view=revision&gt;
+
+ Extract the src-like and script-like attribute truncation logic into independent functions
+ towards making it more straightforward to re-purpose this logic. Additionally, formalize the
+ concept of string canonicalization as a member function that consolidates the process of
+ decoding URL escape sequences, truncating the decoded string (if applicable), and removing
+ characters that are considered noise.
+
+ * html/parser/XSSAuditor.cpp:
+ (WebCore::truncateForSrcLikeAttribute): Extracted from XSSAuditor::decodedSnippetForAttribute().
+ (WebCore::truncateForScriptLikeAttribute): Ditto.
+ (WebCore::XSSAuditor::init): Write in terms of XSSAuditor::canonicalize().
+ (WebCore::XSSAuditor::filterCharacterToken): Updated to make use of formalized canonicalization methods.
+ (WebCore::XSSAuditor::filterScriptToken): Ditto.
+ (WebCore::XSSAuditor::filterObjectToken): Ditto.
+ (WebCore::XSSAuditor::filterParamToken): Ditto.
+ (WebCore::XSSAuditor::filterEmbedToken): Ditto.
+ (WebCore::XSSAuditor::filterAppletToken): Ditto.
+ (WebCore::XSSAuditor::filterFrameToken): Ditto.
+ (WebCore::XSSAuditor::filterInputToken): Ditto.
+ (WebCore::XSSAuditor::filterButtonToken): Ditto.
+ (WebCore::XSSAuditor::eraseDangerousAttributesIfInjected): Ditto.
+ (WebCore::XSSAuditor::eraseAttributeIfInjected): Updated code to use early return style and avoid an unnecessary string
+ comparison when we know that a src attribute was injected.
+ (WebCore::XSSAuditor::canonicalizedSnippetForTagName): Renamed; formerly known as XSSAuditor::decodedSnippetForName(). Updated
+ to make use of XSSAuditor::canonicalize().
+ (WebCore::XSSAuditor::snippetFromAttribute): Renamed; formerly known as XSSAuditor::decodedSnippetForAttribute(). Moved
+ truncation logic from here to WebCore::truncateFor{Script, Src}LikeAttribute.
+ (WebCore::XSSAuditor::canonicalize): Added.
+ (WebCore::XSSAuditor::canonicalizedSnippetForJavaScript): Added.
+ (WebCore::canonicalize): Deleted.
+ (WebCore::XSSAuditor::decodedSnippetForName): Deleted.
+ (WebCore::XSSAuditor::decodedSnippetForAttribute): Deleted.
+ (WebCore::XSSAuditor::decodedSnippetForJavaScript): Deleted.
+ * html/parser/XSSAuditor.h: Define enum class for the various attribute truncation styles.
+
+2016-01-14 Daniel Bates &lt;dabates@apple.com&gt;
+
</ins><span class="cx"> [XSS Auditor] Partial bypass when web server collapses path components
</span><span class="cx"> https://bugs.webkit.org/show_bug.cgi?id=152872
</span><span class="cx">
</span></span></pre></div>
<a id="trunkSourceWebCorehtmlparserXSSAuditorcpp"></a>
<div class="modfile"><h4>Modified: trunk/Source/WebCore/html/parser/XSSAuditor.cpp (195073 => 195074)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/Source/WebCore/html/parser/XSSAuditor.cpp        2016-01-14 21:37:49 UTC (rev 195073)
+++ trunk/Source/WebCore/html/parser/XSSAuditor.cpp        2016-01-14 21:40:13 UTC (rev 195074)
</span><span class="lines">@@ -63,11 +63,6 @@
</span><span class="cx"> return (c == '\\' || c == '0' || c == '\0' || c == '/' || c >= 127);
</span><span class="cx"> }
</span><span class="cx">
</span><del>-static String canonicalize(const String& string)
-{
- return string.removeCharacters(&isNonCanonicalCharacter);
-}
-
</del><span class="cx"> static bool isRequiredForInjection(UChar c)
</span><span class="cx"> {
</span><span class="cx"> return (c == '\'' || c == '"' || c == '<' || c == '>');
</span><span class="lines">@@ -180,6 +175,57 @@
</span><span class="cx"> return workingString;
</span><span class="cx"> }
</span><span class="cx">
</span><ins>+static void truncateForSrcLikeAttribute(String& decodedSnippet)
+{
+ // In HTTP URLs, characters following the first ?, #, or third slash may come from
+ // the page itself and can be merely ignored by an attacker's server when a remote
+ // script or script-like resource is requested. In DATA URLS, the payload starts at
+ // the first comma, and the the first /*, //, or &lt;!-- may introduce a comment. Characters
+ // following this may come from the page itself and may be ignored when the script is
+ // executed. For simplicity, we don't differentiate based on URL scheme, and stop at
+ // the first # or ?, the third slash, or the first slash or < once a comma is seen.
+ int slashCount = 0;
+ bool commaSeen = false;
+ for (size_t currentLength = 0; currentLength < decodedSnippet.length(); ++currentLength) {
+ UChar currentChar = decodedSnippet[currentLength];
+ if (currentChar == '?'
+ || currentChar == '#'
+ || ((currentChar == '/' || currentChar == '\\') && (commaSeen || ++slashCount > 2))
+ || (currentChar == '<' && commaSeen)) {
+ decodedSnippet.truncate(currentLength);
+ return;
+ }
+ if (currentChar == ',')
+ commaSeen = true;
+ }
+}
+
+static void truncateForScriptLikeAttribute(String& decodedSnippet)
+{
+ // Beware of trailing characters which came from the page itself, not the
+ // injected vector. Excluding the terminating character covers common cases
+ // where the page immediately ends the attribute, but doesn't cover more
+ // complex cases where there is other page data following the injection.
+ // Generally, these won't parse as JavaScript, so the injected vector
+ // typically excludes them from consideration via a single-line comment or
+ // by enclosing them in a string literal terminated later by the page's own
+ // closing punctuation. Since the snippet has not been parsed, the vector
+ // may also try to introduce these via entities. As a result, we'd like to
+ // stop before the first "//", the first &lt;!--, the first entity, or the first
+ // quote not immediately following the first equals sign (taking whitespace
+ // into consideration). To keep things simpler, we don't try to distinguish
+ // between entity-introducing ampersands vs. other uses, nor do we bother to
+ // check for a second slash for a comment, nor do we bother to check for
+ // !-- following a less-than sign. We stop instead on any ampersand
+ // slash, or less-than sign.
+ size_t position = 0;
+ if ((position = decodedSnippet.find('=')) != notFound
+ && (position = decodedSnippet.find(isNotHTMLSpace, position + 1)) != notFound
+ && (position = decodedSnippet.find(isTerminatingCharacter, isHTMLQuote(decodedSnippet[position]) ? position + 1 : position)) != notFound) {
+ decodedSnippet.truncate(position);
+ }
+}
+
</ins><span class="cx"> static ContentSecurityPolicy::ReflectedXSSDisposition combineXSSProtectionHeaderAndCSP(ContentSecurityPolicy::ReflectedXSSDisposition xssProtection, ContentSecurityPolicy::ReflectedXSSDisposition reflectedXSS)
</span><span class="cx"> {
</span><span class="cx"> ContentSecurityPolicy::ReflectedXSSDisposition result = std::max(xssProtection, reflectedXSS);
</span><span class="lines">@@ -269,7 +315,7 @@
</span><span class="cx"> if (document->decoder())
</span><span class="cx"> m_encoding = document->decoder()->encoding();
</span><span class="cx">
</span><del>- m_decodedURL = canonicalize(fullyDecodeString(m_documentURL.string(), m_encoding));
</del><ins>+ m_decodedURL = canonicalize(m_documentURL.string(), TruncationStyle::None);
</ins><span class="cx"> if (m_decodedURL.find(isRequiredForInjection) == notFound)
</span><span class="cx"> m_decodedURL = String();
</span><span class="cx">
</span><span class="lines">@@ -307,7 +353,7 @@
</span><span class="cx"> if (httpBody && !httpBody->isEmpty()) {
</span><span class="cx"> httpBodyAsString = httpBody->flattenToString();
</span><span class="cx"> if (!httpBodyAsString.isEmpty()) {
</span><del>- m_decodedHTTPBody = canonicalize(fullyDecodeString(httpBodyAsString, m_encoding));
</del><ins>+ m_decodedHTTPBody = canonicalize(httpBodyAsString, TruncationStyle::None);
</ins><span class="cx"> if (m_decodedHTTPBody.find(isRequiredForInjection) == notFound)
</span><span class="cx"> m_decodedHTTPBody = String();
</span><span class="cx"> if (m_decodedHTTPBody.length() >= minimumLengthForSuffixTree)
</span><span class="lines">@@ -389,7 +435,7 @@
</span><span class="cx"> bool XSSAuditor::filterCharacterToken(const FilterTokenRequest& request)
</span><span class="cx"> {
</span><span class="cx"> ASSERT(m_scriptTagNestingLevel);
</span><del>- if (m_wasScriptTagFoundInRequest && isContainedInRequest(decodedSnippetForJavaScript(request))) {
</del><ins>+ if (m_wasScriptTagFoundInRequest && isContainedInRequest(canonicalizedSnippetForJavaScript(request))) {
</ins><span class="cx"> request.token.clear();
</span><span class="cx"> LChar space = ' ';
</span><span class="cx"> request.token.appendToCharacter(space); // Technically, character tokens can't be empty.
</span><span class="lines">@@ -403,12 +449,12 @@
</span><span class="cx"> ASSERT(request.token.type() == HTMLToken::StartTag);
</span><span class="cx"> ASSERT(hasName(request.token, scriptTag));
</span><span class="cx">
</span><del>- m_wasScriptTagFoundInRequest = isContainedInRequest(decodedSnippetForName(request));
</del><ins>+ m_wasScriptTagFoundInRequest = isContainedInRequest(canonicalizedSnippetForTagName(request));
</ins><span class="cx">
</span><span class="cx"> bool didBlockScript = false;
</span><span class="cx"> if (m_wasScriptTagFoundInRequest) {
</span><del>- didBlockScript |= eraseAttributeIfInjected(request, srcAttr, blankURL().string(), SrcLikeAttribute);
- didBlockScript |= eraseAttributeIfInjected(request, XLinkNames::hrefAttr, blankURL().string(), SrcLikeAttribute);
</del><ins>+ didBlockScript |= eraseAttributeIfInjected(request, srcAttr, blankURL().string(), TruncationStyle::SrcLikeAttribute);
+ didBlockScript |= eraseAttributeIfInjected(request, XLinkNames::hrefAttr, blankURL().string(), TruncationStyle::SrcLikeAttribute);
</ins><span class="cx"> }
</span><span class="cx">
</span><span class="cx"> return didBlockScript;
</span><span class="lines">@@ -420,8 +466,8 @@
</span><span class="cx"> ASSERT(hasName(request.token, objectTag));
</span><span class="cx">
</span><span class="cx"> bool didBlockScript = false;
</span><del>- if (isContainedInRequest(decodedSnippetForName(request))) {
- didBlockScript |= eraseAttributeIfInjected(request, dataAttr, blankURL().string(), SrcLikeAttribute);
</del><ins>+ if (isContainedInRequest(canonicalizedSnippetForTagName(request))) {
+ didBlockScript |= eraseAttributeIfInjected(request, dataAttr, blankURL().string(), TruncationStyle::SrcLikeAttribute);
</ins><span class="cx"> didBlockScript |= eraseAttributeIfInjected(request, typeAttr);
</span><span class="cx"> didBlockScript |= eraseAttributeIfInjected(request, classidAttr);
</span><span class="cx"> }
</span><span class="lines">@@ -441,7 +487,7 @@
</span><span class="cx"> if (!HTMLParamElement::isURLParameter(String(nameAttribute.value)))
</span><span class="cx"> return false;
</span><span class="cx">
</span><del>- return eraseAttributeIfInjected(request, valueAttr, blankURL().string(), SrcLikeAttribute);
</del><ins>+ return eraseAttributeIfInjected(request, valueAttr, blankURL().string(), TruncationStyle::SrcLikeAttribute);
</ins><span class="cx"> }
</span><span class="cx">
</span><span class="cx"> bool XSSAuditor::filterEmbedToken(const FilterTokenRequest& request)
</span><span class="lines">@@ -450,9 +496,9 @@
</span><span class="cx"> ASSERT(hasName(request.token, embedTag));
</span><span class="cx">
</span><span class="cx"> bool didBlockScript = false;
</span><del>- if (isContainedInRequest(decodedSnippetForName(request))) {
- didBlockScript |= eraseAttributeIfInjected(request, codeAttr, String(), SrcLikeAttribute);
- didBlockScript |= eraseAttributeIfInjected(request, srcAttr, blankURL().string(), SrcLikeAttribute);
</del><ins>+ if (isContainedInRequest(canonicalizedSnippetForTagName(request))) {
+ didBlockScript |= eraseAttributeIfInjected(request, codeAttr, String(), TruncationStyle::SrcLikeAttribute);
+ didBlockScript |= eraseAttributeIfInjected(request, srcAttr, blankURL().string(), TruncationStyle::SrcLikeAttribute);
</ins><span class="cx"> didBlockScript |= eraseAttributeIfInjected(request, typeAttr);
</span><span class="cx"> }
</span><span class="cx"> return didBlockScript;
</span><span class="lines">@@ -464,8 +510,8 @@
</span><span class="cx"> ASSERT(hasName(request.token, appletTag));
</span><span class="cx">
</span><span class="cx"> bool didBlockScript = false;
</span><del>- if (isContainedInRequest(decodedSnippetForName(request))) {
- didBlockScript |= eraseAttributeIfInjected(request, codeAttr, String(), SrcLikeAttribute);
</del><ins>+ if (isContainedInRequest(canonicalizedSnippetForTagName(request))) {
+ didBlockScript |= eraseAttributeIfInjected(request, codeAttr, String(), TruncationStyle::SrcLikeAttribute);
</ins><span class="cx"> didBlockScript |= eraseAttributeIfInjected(request, objectAttr);
</span><span class="cx"> }
</span><span class="cx"> return didBlockScript;
</span><span class="lines">@@ -476,9 +522,9 @@
</span><span class="cx"> ASSERT(request.token.type() == HTMLToken::StartTag);
</span><span class="cx"> ASSERT(hasName(request.token, iframeTag) || hasName(request.token, frameTag));
</span><span class="cx">
</span><del>- bool didBlockScript = eraseAttributeIfInjected(request, srcdocAttr, String(), ScriptLikeAttribute);
- if (isContainedInRequest(decodedSnippetForName(request)))
- didBlockScript |= eraseAttributeIfInjected(request, srcAttr, String(), SrcLikeAttribute);
</del><ins>+ bool didBlockScript = eraseAttributeIfInjected(request, srcdocAttr, String(), TruncationStyle::ScriptLikeAttribute);
+ if (isContainedInRequest(canonicalizedSnippetForTagName(request)))
+ didBlockScript |= eraseAttributeIfInjected(request, srcAttr, String(), TruncationStyle::SrcLikeAttribute);
</ins><span class="cx">
</span><span class="cx"> return didBlockScript;
</span><span class="cx"> }
</span><span class="lines">@@ -512,7 +558,7 @@
</span><span class="cx"> ASSERT(request.token.type() == HTMLToken::StartTag);
</span><span class="cx"> ASSERT(hasName(request.token, inputTag));
</span><span class="cx">
</span><del>- return eraseAttributeIfInjected(request, formactionAttr, blankURL().string(), SrcLikeAttribute);
</del><ins>+ return eraseAttributeIfInjected(request, formactionAttr, blankURL().string(), TruncationStyle::SrcLikeAttribute);
</ins><span class="cx"> }
</span><span class="cx">
</span><span class="cx"> bool XSSAuditor::filterButtonToken(const FilterTokenRequest& request)
</span><span class="lines">@@ -520,7 +566,7 @@
</span><span class="cx"> ASSERT(request.token.type() == HTMLToken::StartTag);
</span><span class="cx"> ASSERT(hasName(request.token, buttonTag));
</span><span class="cx">
</span><del>- return eraseAttributeIfInjected(request, formactionAttr, blankURL().string(), SrcLikeAttribute);
</del><ins>+ return eraseAttributeIfInjected(request, formactionAttr, blankURL().string(), TruncationStyle::SrcLikeAttribute);
</ins><span class="cx"> }
</span><span class="cx">
</span><span class="cx"> bool XSSAuditor::eraseDangerousAttributesIfInjected(const FilterTokenRequest& request)
</span><span class="lines">@@ -536,7 +582,7 @@
</span><span class="cx"> bool valueContainsJavaScriptURL = (!isInlineEventHandler && protocolIsJavaScript(strippedValue)) || (isSemicolonSeparatedAttribute(attribute) && semicolonSeparatedValueContainsJavaScriptURL(strippedValue));
</span><span class="cx"> if (!isInlineEventHandler && !valueContainsJavaScriptURL)
</span><span class="cx"> continue;
</span><del>- if (!isContainedInRequest(decodedSnippetForAttribute(request, attribute, ScriptLikeAttribute)))
</del><ins>+ if (!isContainedInRequest(canonicalize(snippetFromAttribute(request, attribute), TruncationStyle::ScriptLikeAttribute)))
</ins><span class="cx"> continue;
</span><span class="cx"> request.token.eraseValueOfAttribute(i);
</span><span class="cx"> if (valueContainsJavaScriptURL)
</span><span class="lines">@@ -546,94 +592,59 @@
</span><span class="cx"> return didBlockScript;
</span><span class="cx"> }
</span><span class="cx">
</span><del>-bool XSSAuditor::eraseAttributeIfInjected(const FilterTokenRequest& request, const QualifiedName& attributeName, const String& replacementValue, AttributeKind treatment)
</del><ins>+bool XSSAuditor::eraseAttributeIfInjected(const FilterTokenRequest& request, const QualifiedName& attributeName, const String& replacementValue, TruncationStyle truncationStyle)
</ins><span class="cx"> {
</span><span class="cx"> size_t indexOfAttribute = 0;
</span><del>- if (findAttributeWithName(request.token, attributeName, indexOfAttribute)) {
- const HTMLToken::Attribute& attribute = request.token.attributes().at(indexOfAttribute);
- if (isContainedInRequest(decodedSnippetForAttribute(request, attribute, treatment))) {
- if (threadSafeMatch(attributeName, srcAttr) && isLikelySafeResource(String(attribute.value)))
- return false;
- if (threadSafeMatch(attributeName, http_equivAttr) && !isDangerousHTTPEquiv(String(attribute.value)))
- return false;
- request.token.eraseValueOfAttribute(indexOfAttribute);
- if (!replacementValue.isEmpty())
- request.token.appendToAttributeValue(indexOfAttribute, replacementValue);
- return true;
- }
</del><ins>+ if (!findAttributeWithName(request.token, attributeName, indexOfAttribute))
+ return false;
+
+ const HTMLToken::Attribute& attribute = request.token.attributes().at(indexOfAttribute);
+ if (!isContainedInRequest(canonicalize(snippetFromAttribute(request, attribute), truncationStyle)))
+ return false;
+
+ if (threadSafeMatch(attributeName, srcAttr)) {
+ if (isLikelySafeResource(String(attribute.value)))
+ return false;
+ } else if (threadSafeMatch(attributeName, http_equivAttr)) {
+ if (!isDangerousHTTPEquiv(String(attribute.value)))
+ return false;
</ins><span class="cx"> }
</span><del>- return false;
</del><ins>+
+ request.token.eraseValueOfAttribute(indexOfAttribute);
+ if (!replacementValue.isEmpty())
+ request.token.appendToAttributeValue(indexOfAttribute, replacementValue);
+ return true;
</ins><span class="cx"> }
</span><span class="cx">
</span><del>-String XSSAuditor::decodedSnippetForName(const FilterTokenRequest& request)
</del><ins>+String XSSAuditor::canonicalizedSnippetForTagName(const FilterTokenRequest& request)
</ins><span class="cx"> {
</span><span class="cx"> // Grab a fixed number of characters equal to the length of the token's name plus one (to account for the "<").
</span><del>- return canonicalize(fullyDecodeString(request.sourceTracker.source(request.token), m_encoding).substring(0, request.token.name().size() + 1));
</del><ins>+ return canonicalize(request.sourceTracker.source(request.token).substring(0, request.token.name().size() + 1), TruncationStyle::None);
</ins><span class="cx"> }
</span><span class="cx">
</span><del>-String XSSAuditor::decodedSnippetForAttribute(const FilterTokenRequest& request, const HTMLToken::Attribute& attribute, AttributeKind treatment)
</del><ins>+String XSSAuditor::snippetFromAttribute(const FilterTokenRequest& request, const HTMLToken::Attribute& attribute)
</ins><span class="cx"> {
</span><span class="cx"> // The range doesn't include the character which terminates the value. So,
</span><span class="cx"> // for an input of |name="value"|, the snippet is |name="value|. For an
</span><span class="cx"> // unquoted input of |name=value |, the snippet is |name=value|.
</span><span class="cx"> // FIXME: We should grab one character before the name also.
</span><del>- unsigned start = attribute.startOffset;
- unsigned end = attribute.endOffset;
</del><ins>+ return request.sourceTracker.source(request.token, attribute.startOffset, attribute.endOffset);
+}
</ins><span class="cx">
</span><del>- // We defer canonicalizing the decoded string here to preserve embedded slashes (if any) that
- // may lead us to truncate the string.
- String decodedSnippet = fullyDecodeString(request.sourceTracker.source(request.token, start, end), m_encoding);
- decodedSnippet.truncate(kMaximumFragmentLengthTarget);
- if (treatment == SrcLikeAttribute) {
- int slashCount = 0;
- bool commaSeen = false;
- // In HTTP URLs, characters following the first ?, #, or third slash may come from
- // the page itself and can be merely ignored by an attacker's server when a remote
- // script or script-like resource is requested. In DATA URLS, the payload starts at
- // the first comma, and the the first /*, //, or &lt;!-- may introduce a comment. Characters
- // following this may come from the page itself and may be ignored when the script is
- // executed. For simplicity, we don't differentiate based on URL scheme, and stop at
- // the first # or ?, the third slash, or the first slash or < once a comma is seen.
- for (size_t currentLength = 0; currentLength < decodedSnippet.length(); ++currentLength) {
- UChar currentChar = decodedSnippet[currentLength];
- if (currentChar == '?'
- || currentChar == '#'
- || ((currentChar == '/' || currentChar == '\\') && (commaSeen || ++slashCount > 2))
- || (currentChar == '<' && commaSeen)) {
- decodedSnippet.truncate(currentLength);
- break;
- }
- if (currentChar == ',')
- commaSeen = true;
- }
- } else if (treatment == ScriptLikeAttribute) {
- // Beware of trailing characters which came from the page itself, not the
- // injected vector. Excluding the terminating character covers common cases
- // where the page immediately ends the attribute, but doesn't cover more
- // complex cases where there is other page data following the injection.
- // Generally, these won't parse as javascript, so the injected vector
- // typically excludes them from consideration via a single-line comment or
- // by enclosing them in a string literal terminated later by the page's own
- // closing punctuation. Since the snippet has not been parsed, the vector
- // may also try to introduce these via entities. As a result, we'd like to
- // stop before the first "//", the first &lt;!--, the first entity, or the first
- // quote not immediately following the first equals sign (taking whitespace
- // into consideration). To keep things simpler, we don't try to distinguish
- // between entity-introducing amperands vs. other uses, nor do we bother to
- // check for a second slash for a comment, nor do we bother to check for
- // !-- following a less-than sign. We stop instead on any ampersand
- // slash, or less-than sign.
- size_t position = 0;
- if ((position = decodedSnippet.find('=')) != notFound
- && (position = decodedSnippet.find(isNotHTMLSpace, position + 1)) != notFound
- && (position = decodedSnippet.find(isTerminatingCharacter, isHTMLQuote(decodedSnippet[position]) ? position + 1 : position)) != notFound) {
- decodedSnippet.truncate(position);
- }
</del><ins>+String XSSAuditor::canonicalize(const String& snippet, TruncationStyle truncationStyle)
+{
+ String decodedSnippet = fullyDecodeString(snippet, m_encoding);
+ if (truncationStyle != TruncationStyle::None) {
+ decodedSnippet.truncate(kMaximumFragmentLengthTarget);
+ if (truncationStyle == TruncationStyle::SrcLikeAttribute)
+ truncateForSrcLikeAttribute(decodedSnippet);
+ else if (truncationStyle == TruncationStyle::ScriptLikeAttribute)
+ truncateForScriptLikeAttribute(decodedSnippet);
</ins><span class="cx"> }
</span><del>- return canonicalize(decodedSnippet);
</del><ins>+ return decodedSnippet.removeCharacters(&isNonCanonicalCharacter);
</ins><span class="cx"> }
</span><span class="cx">
</span><del>-String XSSAuditor::decodedSnippetForJavaScript(const FilterTokenRequest& request)
</del><ins>+String XSSAuditor::canonicalizedSnippetForJavaScript(const FilterTokenRequest& request)
</ins><span class="cx"> {
</span><span class="cx"> String string = request.sourceTracker.source(request.token);
</span><span class="cx"> size_t startPosition = 0;
</span><span class="lines">@@ -687,7 +698,6 @@
</span><span class="cx"> foundPosition = lastNonSpacePosition;
</span><span class="cx"> break;
</span><span class="cx"> }
</span><del>-
</del><span class="cx"> if (foundPosition > startPosition + kMaximumFragmentLengthTarget) {
</span><span class="cx"> // After hitting the length target, we can only stop at a point where we know we are
</span><span class="cx"> // not in the middle of a %-escape sequence. For the sake of simplicity, approximate
</span><span class="lines">@@ -701,7 +711,7 @@
</span><span class="cx"> lastNonSpacePosition = foundPosition;
</span><span class="cx"> }
</span><span class="cx">
</span><del>- result = canonicalize(fullyDecodeString(string.substring(startPosition, foundPosition - startPosition), m_encoding));
</del><ins>+ result = canonicalize(string.substring(startPosition, foundPosition - startPosition), TruncationStyle::None);
</ins><span class="cx"> startPosition = foundPosition + 1;
</span><span class="cx"> }
</span><span class="cx"> return result;
</span></span></pre></div>
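<p>To see what the two extracted truncation functions do to a snippet, the following standalone sketch ports them to <code>std::string</code>. The src-like loop follows the patch line for line. For the script-like version, <code>isHTMLSpace</code>, <code>isHTMLQuote</code> and <code>isTerminating</code> are hypothetical helpers: their WebKit definitions are not part of this diff, so the terminating set here (ampersand, slash, less-than sign, quotes) is an assumption taken from the code comment.</p>

```cpp
#include <cstddef>
#include <string>

// Stops at the first '?' or '#', the third slash, or the first slash or '<' after a comma.
void truncateForSrcLikeAttribute(std::string& snippet)
{
    int slashCount = 0;
    bool commaSeen = false;
    for (std::size_t i = 0; i < snippet.size(); ++i) {
        char c = snippet[i];
        if (c == '?' || c == '#'
            || ((c == '/' || c == '\\') && (commaSeen || ++slashCount > 2))
            || (c == '<' && commaSeen)) {
            snippet.resize(i);
            return;
        }
        if (c == ',')
            commaSeen = true;
    }
}

// Assumed helper definitions; the WebKit versions are not shown in this changeset.
static bool isHTMLSpace(char c) { return c == ' ' || c == '\t' || c == '\n' || c == '\f' || c == '\r'; }
static bool isHTMLQuote(char c) { return c == '"' || c == '\''; }
static bool isTerminating(char c) { return c == '&' || c == '/' || c == '<' || isHTMLQuote(c); }

// Stops at the first terminating character after the attribute's '=', skipping
// whitespace and one opening quote. Leaves the snippet alone if none is found.
void truncateForScriptLikeAttribute(std::string& snippet)
{
    std::size_t position = snippet.find('=');
    if (position == std::string::npos)
        return;
    ++position;
    while (position < snippet.size() && isHTMLSpace(snippet[position]))
        ++position;
    if (position == snippet.size())
        return;
    if (isHTMLQuote(snippet[position]))
        ++position; // The opening quote itself must not end the snippet.
    while (position < snippet.size() && !isTerminating(snippet[position]))
        ++position;
    if (position < snippet.size())
        snippet.resize(position);
}
```

<p>With these definitions, a snippet such as <code>src="http://evil.example/a.js?x=1</code> is cut at the third slash, and <code>onclick="alert(1)//rest</code> is cut at the first slash after the opening quote. In both cases the text that survives is the part an attacker must control, which is what gets compared against the request.</p>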
<a id="trunkSourceWebCorehtmlparserXSSAuditorh"></a>
<div class="modfile"><h4>Modified: trunk/Source/WebCore/html/parser/XSSAuditor.h (195073 => 195074)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/Source/WebCore/html/parser/XSSAuditor.h        2016-01-14 21:37:49 UTC (rev 195073)
+++ trunk/Source/WebCore/html/parser/XSSAuditor.h        2016-01-14 21:40:13 UTC (rev 195074)
</span><span class="lines">@@ -70,7 +70,8 @@
</span><span class="cx"> Initialized
</span><span class="cx"> };
</span><span class="cx">
</span><del>- enum AttributeKind {
</del><ins>+ enum class TruncationStyle {
+ None,
</ins><span class="cx"> NormalAttribute,
</span><span class="cx"> SrcLikeAttribute,
</span><span class="cx"> ScriptLikeAttribute
</span><span class="lines">@@ -92,12 +93,12 @@
</span><span class="cx"> bool filterButtonToken(const FilterTokenRequest&);
</span><span class="cx">
</span><span class="cx"> bool eraseDangerousAttributesIfInjected(const FilterTokenRequest&);
</span><del>- bool eraseAttributeIfInjected(const FilterTokenRequest&, const QualifiedName&, const String& replacementValue = String(), AttributeKind treatment = NormalAttribute);
</del><ins>+ bool eraseAttributeIfInjected(const FilterTokenRequest&, const QualifiedName&, const String& replacementValue = String(), TruncationStyle = TruncationStyle::NormalAttribute);
</ins><span class="cx">
</span><del>- String decodedSnippetForToken(const HTMLToken&);
- String decodedSnippetForName(const FilterTokenRequest&);
- String decodedSnippetForAttribute(const FilterTokenRequest&, const HTMLToken::Attribute&, AttributeKind treatment = NormalAttribute);
- String decodedSnippetForJavaScript(const FilterTokenRequest&);
</del><ins>+ String canonicalizedSnippetForTagName(const FilterTokenRequest&);
+ String canonicalizedSnippetForJavaScript(const FilterTokenRequest&);
+ String snippetFromAttribute(const FilterTokenRequest&, const HTMLToken::Attribute&);
+ String canonicalize(const String&, TruncationStyle);
</ins><span class="cx">
</span><span class="cx"> bool isContainedInRequest(const String&);
</span><span class="cx"> bool isLikelySafeResource(const String& url);
</span></span></pre>
</div>
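<p>The switch from a plain <code>enum</code> to <code>enum class</code> explains why every call site in the .cpp diff now spells out <code>TruncationStyle::</code>. A small sketch of the consequence, using the enumerator names from the header; the <code>describe</code> function and its strings are hypothetical and only summarize how <code>XSSAuditor::canonicalize()</code> branches on the style in this patch:</p>

```cpp
#include <string>
#include <type_traits>

enum class TruncationStyle { None, NormalAttribute, SrcLikeAttribute, ScriptLikeAttribute };

// Scoped enumerators do not convert to int implicitly, unlike the old AttributeKind.
static_assert(!std::is_convertible<TruncationStyle, int>::value,
    "TruncationStyle must not convert to int implicitly");

// Hypothetical helper: names the truncation each style selects.
// The default argument has to be qualified, as in the updated header.
std::string describe(TruncationStyle style = TruncationStyle::NormalAttribute)
{
    switch (style) {
    case TruncationStyle::None: return "no truncation";
    case TruncationStyle::NormalAttribute: return "length cap only";
    case TruncationStyle::SrcLikeAttribute: return "length cap, then URL-style cut";
    case TruncationStyle::ScriptLikeAttribute: return "length cap, then script-style cut";
    }
    return "unknown";
}
```

<p>Writing an unqualified <code>SrcLikeAttribute</code> no longer compiles with the scoped enumeration, which is why the rename touches each <code>eraseAttributeIfInjected()</code> caller.</p>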
</div>
</body>
</html>