<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
"http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head><meta http-equiv="content-type" content="text/html; charset=utf-8" />
<title>[181663] trunk</title>
</head>
<body>

<style type="text/css"><!--
#msg dl.meta { border: 1px #006 solid; background: #369; padding: 6px; color: #fff; }
#msg dl.meta dt { float: left; width: 6em; font-weight: bold; }
#msg dt:after { content:':';}
#msg dl, #msg dt, #msg ul, #msg li, #header, #footer, #logmsg { font-family: verdana,arial,helvetica,sans-serif; font-size: 10pt;  }
#msg dl a { font-weight: bold}
#msg dl a:link    { color:#fc3; }
#msg dl a:active  { color:#ff0; }
#msg dl a:visited { color:#cc6; }
h3 { font-family: verdana,arial,helvetica,sans-serif; font-size: 10pt; font-weight: bold; }
#msg pre { overflow: auto; background: #ffc; border: 1px #fa0 solid; padding: 6px; }
#logmsg { background: #ffc; border: 1px #fa0 solid; padding: 1em 1em 0 1em; }
#logmsg p, #logmsg pre, #logmsg blockquote { margin: 0 0 1em 0; }
#logmsg p, #logmsg li, #logmsg dt, #logmsg dd { line-height: 14pt; }
#logmsg h1, #logmsg h2, #logmsg h3, #logmsg h4, #logmsg h5, #logmsg h6 { margin: .5em 0; }
#logmsg h1:first-child, #logmsg h2:first-child, #logmsg h3:first-child, #logmsg h4:first-child, #logmsg h5:first-child, #logmsg h6:first-child { margin-top: 0; }
#logmsg ul, #logmsg ol { padding: 0; list-style-position: inside; margin: 0 0 0 1em; }
#logmsg ul { text-indent: -1em; padding-left: 1em; }#logmsg ol { text-indent: -1.5em; padding-left: 1.5em; }
#logmsg > ul, #logmsg > ol { margin: 0 0 1em 0; }
#logmsg pre { background: #eee; padding: 1em; }
#logmsg blockquote { border: 1px solid #fa0; border-left-width: 10px; padding: 1em 1em 0 1em; background: white;}
#logmsg dl { margin: 0; }
#logmsg dt { font-weight: bold; }
#logmsg dd { margin: 0; padding: 0 0 0.5em 0; }
#logmsg dd:before { content:'\00bb';}
#logmsg table { border-spacing: 0px; border-collapse: collapse; border-top: 4px solid #fa0; border-bottom: 1px solid #fa0; background: #fff; }
#logmsg table th { text-align: left; font-weight: normal; padding: 0.2em 0.5em; border-top: 1px dotted #fa0; }
#logmsg table td { text-align: right; border-top: 1px dotted #fa0; padding: 0.2em 0.5em; }
#logmsg table thead th { text-align: center; border-bottom: 1px solid #fa0; }
#logmsg table th.Corner { text-align: left; }
#logmsg hr { border: none 0; border-top: 2px dashed #fa0; height: 1px; }
#header, #footer { color: #fff; background: #636; border: 1px #300 solid; padding: 6px; }
#patch { width: 100%; }
#patch h4 {font-family: verdana,arial,helvetica,sans-serif;font-size:10pt;padding:8px;background:#369;color:#fff;margin:0;}
#patch .propset h4, #patch .binary h4 {margin:0;}
#patch pre {padding:0;line-height:1.2em;margin:0;}
#patch .diff {width:100%;background:#eee;padding: 0 0 10px 0;overflow:auto;}
#patch .propset .diff, #patch .binary .diff  {padding:10px 0;}
#patch span {display:block;padding:0 10px;}
#patch .modfile, #patch .addfile, #patch .delfile, #patch .propset, #patch .binary, #patch .copfile {border:1px solid #ccc;margin:10px 0;}
#patch ins {background:#dfd;text-decoration:none;display:block;padding:0 10px;}
#patch del {background:#fdd;text-decoration:none;display:block;padding:0 10px;}
#patch .lines, .info {color:#888;background:#fff;}
--></style>
<div id="msg">
<dl class="meta">
<dt>Revision</dt> <dd><a href="http://trac.webkit.org/projects/webkit/changeset/181663">181663</a></dd>
<dt>Author</dt> <dd>benjamin@webkit.org</dd>
<dt>Date</dt> <dd>2015-03-17 13:47:42 -0700 (Tue, 17 Mar 2015)</dd>
</dl>

<h3>Log Message</h3>
<pre>Compile character ranges targeting the same state as range check in the bytecode
https://bugs.webkit.org/show_bug.cgi?id=142759

Patch by Benjamin Poulain &lt;bpoulain@apple.com&gt; on 2015-03-17
Reviewed by Alex Christensen.

Source/WebCore:

Previously, character ranges would be compiled as many individual character checks.
For example, a transition on &quot;[a-z]&quot; would do 26 character checks + jump, which leads
to enormous matchines.

With this patch, we find the ranges at lowering time and generate a single instruction
for them: &quot;CheckValueRange&quot;. This helps making the machine denser when the input
use character sets.

The second part of this patch goes further in the case where the transitions out of
a state cover the entire alphabet. In that case, we create a fallback transition
on the fly and remove all the ranges made useless.
That case is common when ranges are used with inverse character set (e.g. [^a]+a).

* contentextensions/DFABytecode.h:
(WebCore::ContentExtensions::instructionSizeWithArguments):
* contentextensions/DFABytecodeCompiler.cpp:
(WebCore::ContentExtensions::DFABytecodeCompiler::emitCheckValueRange):
(WebCore::ContentExtensions::DFABytecodeCompiler::compileNode):
(WebCore::ContentExtensions::DFABytecodeCompiler::compileNodeTransitions):
(WebCore::ContentExtensions::DFABytecodeCompiler::compileCheckForRange):
* contentextensions/DFABytecodeCompiler.h:
Extend the compiler to detect ranges and lower them as CheckValueRange.

* contentextensions/DFABytecodeInterpreter.cpp:
(WebCore::ContentExtensions::DFABytecodeInterpreter::interpret):
Range checks in the interpreter.

* contentextensions/NFA.cpp:
(WebCore::ContentExtensions::NFA::setFinal):
This assertion does not make sense with the current codebase. Actions are &quot;compressed&quot;,
it is possible to have two patterns with the same action.

* contentextensions/NFAToDFA.cpp:
(WebCore::ContentExtensions::simplifyTransitions):
A very simple DFA optimization function: it only reduce the strength of ranges.

(WebCore::ContentExtensions::NFAToDFA::convert):

Tools:

* TestWebKitAPI/Tests/WebCore/ContentExtensions.cpp:
(TestWebKitAPI::TEST_F):</pre>

<h3>Modified Paths</h3>
<ul>
<li><a href="#trunkSourceWebCoreChangeLog">trunk/Source/WebCore/ChangeLog</a></li>
<li><a href="#trunkSourceWebCorecontentextensionsDFABytecodeh">trunk/Source/WebCore/contentextensions/DFABytecode.h</a></li>
<li><a href="#trunkSourceWebCorecontentextensionsDFABytecodeCompilercpp">trunk/Source/WebCore/contentextensions/DFABytecodeCompiler.cpp</a></li>
<li><a href="#trunkSourceWebCorecontentextensionsDFABytecodeCompilerh">trunk/Source/WebCore/contentextensions/DFABytecodeCompiler.h</a></li>
<li><a href="#trunkSourceWebCorecontentextensionsDFABytecodeInterpretercpp">trunk/Source/WebCore/contentextensions/DFABytecodeInterpreter.cpp</a></li>
<li><a href="#trunkSourceWebCorecontentextensionsNFAcpp">trunk/Source/WebCore/contentextensions/NFA.cpp</a></li>
<li><a href="#trunkSourceWebCorecontentextensionsNFAToDFAcpp">trunk/Source/WebCore/contentextensions/NFAToDFA.cpp</a></li>
<li><a href="#trunkToolsChangeLog">trunk/Tools/ChangeLog</a></li>
<li><a href="#trunkToolsTestWebKitAPITestsWebCoreContentExtensionscpp">trunk/Tools/TestWebKitAPI/Tests/WebCore/ContentExtensions.cpp</a></li>
</ul>

</div>
<div id="patch">
<h3>Diff</h3>
<a id="trunkSourceWebCoreChangeLog"></a>
<div class="modfile"><h4>Modified: trunk/Source/WebCore/ChangeLog (181662 => 181663)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/Source/WebCore/ChangeLog        2015-03-17 20:28:33 UTC (rev 181662)
+++ trunk/Source/WebCore/ChangeLog        2015-03-17 20:47:42 UTC (rev 181663)
</span><span class="lines">@@ -1,3 +1,48 @@
</span><ins>+2015-03-17  Benjamin Poulain  &lt;bpoulain@apple.com&gt;
+
+        Compile character ranges targeting the same state as range check in the bytecode
+        https://bugs.webkit.org/show_bug.cgi?id=142759
+
+        Reviewed by Alex Christensen.
+
+        Previously, character ranges would be compiled as many individual character checks.
+        For example, a transition on &quot;[a-z]&quot; would do 26 character checks + jump, which leads
+        to enormous matchines.
+
+        With this patch, we find the ranges at lowering time and generate a single instruction
+        for them: &quot;CheckValueRange&quot;. This helps making the machine denser when the input
+        use character sets.
+
+        The second part of this patch goes further in the case where the transitions out of
+        a state cover the entire alphabet. In that case, we create a fallback transition
+        on the fly and remove all the ranges made useless.
+        That case is common when ranges are used with inverse character set (e.g. [^a]+a).
+
+        * contentextensions/DFABytecode.h:
+        (WebCore::ContentExtensions::instructionSizeWithArguments):
+        * contentextensions/DFABytecodeCompiler.cpp:
+        (WebCore::ContentExtensions::DFABytecodeCompiler::emitCheckValueRange):
+        (WebCore::ContentExtensions::DFABytecodeCompiler::compileNode):
+        (WebCore::ContentExtensions::DFABytecodeCompiler::compileNodeTransitions):
+        (WebCore::ContentExtensions::DFABytecodeCompiler::compileCheckForRange):
+        * contentextensions/DFABytecodeCompiler.h:
+        Extend the compiler to detect ranges and lower them as CheckValueRange.
+
+        * contentextensions/DFABytecodeInterpreter.cpp:
+        (WebCore::ContentExtensions::DFABytecodeInterpreter::interpret):
+        Range checks in the interpreter.
+
+        * contentextensions/NFA.cpp:
+        (WebCore::ContentExtensions::NFA::setFinal):
+        This assertion does not make sense with the current codebase. Actions are &quot;compressed&quot;,
+        it is possible to have two patterns with the same action.
+
+        * contentextensions/NFAToDFA.cpp:
+        (WebCore::ContentExtensions::simplifyTransitions):
+        A very simple DFA optimization function: it only reduce the strength of ranges.
+
+        (WebCore::ContentExtensions::NFAToDFA::convert):
+
</ins><span class="cx"> 2015-03-17  Jer Noble  &lt;jer.noble@apple.com&gt;
</span><span class="cx"> 
</span><span class="cx">         REGRESSION (r181423): Crash @ generatedcontent.org at com.apple.WebCore: WebCore::MediaPlayer::maximumDurationToCacheMediaTime const + 4
</span></span></pre></div>
<a id="trunkSourceWebCorecontentextensionsDFABytecodeh"></a>
<div class="modfile"><h4>Modified: trunk/Source/WebCore/contentextensions/DFABytecode.h (181662 => 181663)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/Source/WebCore/contentextensions/DFABytecode.h        2015-03-17 20:28:33 UTC (rev 181662)
+++ trunk/Source/WebCore/contentextensions/DFABytecode.h        2015-03-17 20:47:42 UTC (rev 181663)
</span><span class="lines">@@ -41,6 +41,12 @@
</span><span class="cx">     // The index to jump to if the values are equal (4 bytes).
</span><span class="cx">     CheckValue,
</span><span class="cx"> 
</span><ins>+    // Jump to an offset if the input value is within a certain range.
+    // The lower value (1 byte).
+    // The higher value (1 byte).
+    // The index to jump to if the value is in the range (4 bytes).
+    CheckValueRange,
+
</ins><span class="cx">     // AppendAction has one argument:
</span><span class="cx">     // The action to append (4 bytes).
</span><span class="cx">     AppendAction,
</span><span class="lines">@@ -63,6 +69,8 @@
</span><span class="cx">     switch (instruction) {
</span><span class="cx">     case DFABytecodeInstruction::CheckValue:
</span><span class="cx">         return sizeof(DFABytecodeInstruction) + sizeof(uint8_t) + sizeof(unsigned);
</span><ins>+    case DFABytecodeInstruction::CheckValueRange:
+        return sizeof(DFABytecodeInstruction) + sizeof(uint8_t) + sizeof(uint8_t) + sizeof(unsigned);
</ins><span class="cx">     case DFABytecodeInstruction::AppendAction:
</span><span class="cx">         return sizeof(DFABytecodeInstruction) + sizeof(unsigned);
</span><span class="cx">     case DFABytecodeInstruction::TestFlagsAndAppendAction:
</span></span></pre></div>
<a id="trunkSourceWebCorecontentextensionsDFABytecodeCompilercpp"></a>
<div class="modfile"><h4>Modified: trunk/Source/WebCore/contentextensions/DFABytecodeCompiler.cpp (181662 => 181663)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/Source/WebCore/contentextensions/DFABytecodeCompiler.cpp        2015-03-17 20:28:33 UTC (rev 181662)
+++ trunk/Source/WebCore/contentextensions/DFABytecodeCompiler.cpp        2015-03-17 20:47:42 UTC (rev 181663)
</span><span class="lines">@@ -30,6 +30,7 @@
</span><span class="cx"> 
</span><span class="cx"> #include &quot;ContentExtensionRule.h&quot;
</span><span class="cx"> #include &quot;DFA.h&quot;
</span><ins>+#include &quot;DFANode.h&quot;
</ins><span class="cx"> 
</span><span class="cx"> namespace WebCore {
</span><span class="cx">     
</span><span class="lines">@@ -76,6 +77,18 @@
</span><span class="cx">     append&lt;unsigned&gt;(m_bytecode, 0); // This value will be set when linking.
</span><span class="cx"> }
</span><span class="cx"> 
</span><ins>+void DFABytecodeCompiler::emitCheckValueRange(uint8_t lowValue, uint8_t highValue, unsigned destinationNodeIndex)
+{
+    ASSERT_WITH_MESSAGE(lowValue != highValue, &quot;A single value check should be emitted for single values.&quot;);
+    ASSERT_WITH_MESSAGE(lowValue &lt; highValue, &quot;The instruction semantic impose lowValue is smaller than highValue.&quot;);
+
+    append&lt;DFABytecodeInstruction&gt;(m_bytecode, DFABytecodeInstruction::CheckValueRange);
+    append&lt;uint8_t&gt;(m_bytecode, lowValue);
+    append&lt;uint8_t&gt;(m_bytecode, highValue);
+    m_linkRecords.append(std::make_pair(m_bytecode.size(), destinationNodeIndex));
+    append&lt;unsigned&gt;(m_bytecode, 0);
+}
+
</ins><span class="cx"> void DFABytecodeCompiler::emitTerminate()
</span><span class="cx"> {
</span><span class="cx">     append&lt;DFABytecodeInstruction&gt;(m_bytecode, DFABytecodeInstruction::Terminate);
</span><span class="lines">@@ -95,16 +108,63 @@
</span><span class="cx">         else
</span><span class="cx">             emitAppendAction(static_cast&lt;unsigned&gt;(action));
</span><span class="cx">     }
</span><del>-    
-    for (const auto&amp; transition : node.transitions)
-        emitCheckValue(transition.key, transition.value);
-    
</del><ins>+    compileNodeTransitions(node);
+}
+
+void DFABytecodeCompiler::compileNodeTransitions(const DFANode&amp; node)
+{
+    bool hasRangeMin = false;
+    uint16_t rangeMin;
+    unsigned rangeDestination = 0;
+
+    for (unsigned char i = 0; i &lt; 128; ++i) {
+        auto transitionIterator = node.transitions.find(i);
+        if (transitionIterator == node.transitions.end()) {
+            if (hasRangeMin) {
+                ASSERT_WITH_MESSAGE(!(node.hasFallbackTransition &amp;&amp; node.fallbackTransition == rangeDestination), &quot;Individual transitions to the fallback transitions should have been eliminated by the optimizer.&quot;);
+
+                unsigned char lastHighValue = i - 1;
+                compileCheckForRange(rangeMin, lastHighValue, rangeDestination);
+                hasRangeMin = false;
+            }
+            continue;
+        }
+
+        if (!hasRangeMin) {
+            hasRangeMin = true;
+            rangeMin = transitionIterator-&gt;key;
+            rangeDestination = transitionIterator-&gt;value;
+        } else {
+            if (transitionIterator-&gt;value == rangeDestination)
+                continue;
+
+            unsigned char lastHighValue = i - 1;
+            compileCheckForRange(rangeMin, lastHighValue, rangeDestination);
+            rangeMin = i;
+            rangeDestination = transitionIterator-&gt;value;
+        }
+    }
+    if (hasRangeMin)
+        compileCheckForRange(rangeMin, 127, rangeDestination);
+
</ins><span class="cx">     if (node.hasFallbackTransition)
</span><span class="cx">         emitJump(node.fallbackTransition);
</span><span class="cx">     else
</span><span class="cx">         emitTerminate();
</span><span class="cx"> }
</span><del>-    
</del><ins>+
+void DFABytecodeCompiler::compileCheckForRange(uint16_t lowValue, uint16_t highValue, unsigned destinationNodeIndex)
+{
+    ASSERT_WITH_MESSAGE(lowValue &lt; 128, &quot;The DFA engine only supports the ASCII alphabet.&quot;);
+    ASSERT_WITH_MESSAGE(highValue &lt; 128, &quot;The DFA engine only supports the ASCII alphabet.&quot;);
+    ASSERT(lowValue &lt;= highValue);
+
+    if (lowValue == highValue)
+        emitCheckValue(lowValue, destinationNodeIndex);
+    else
+        emitCheckValueRange(lowValue, highValue, destinationNodeIndex);
+}
+
</ins><span class="cx"> void DFABytecodeCompiler::compile()
</span><span class="cx"> {
</span><span class="cx">     ASSERT(!m_bytecode.size());
</span></span></pre></div>
<a id="trunkSourceWebCorecontentextensionsDFABytecodeCompilerh"></a>
<div class="modfile"><h4>Modified: trunk/Source/WebCore/contentextensions/DFABytecodeCompiler.h (181662 => 181663)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/Source/WebCore/contentextensions/DFABytecodeCompiler.h        2015-03-17 20:28:33 UTC (rev 181662)
+++ trunk/Source/WebCore/contentextensions/DFABytecodeCompiler.h        2015-03-17 20:47:42 UTC (rev 181663)
</span><span class="lines">@@ -29,7 +29,6 @@
</span><span class="cx"> #if ENABLE(CONTENT_EXTENSIONS)
</span><span class="cx"> 
</span><span class="cx"> #include &quot;DFABytecode.h&quot;
</span><del>-#include &quot;DFANode.h&quot;
</del><span class="cx"> #include &lt;wtf/Vector.h&gt;
</span><span class="cx"> 
</span><span class="cx"> namespace WebCore {
</span><span class="lines">@@ -37,6 +36,7 @@
</span><span class="cx"> namespace ContentExtensions {
</span><span class="cx"> 
</span><span class="cx"> class DFA;
</span><ins>+class DFANode;
</ins><span class="cx"> 
</span><span class="cx"> class DFABytecodeCompiler {
</span><span class="cx"> public:
</span><span class="lines">@@ -50,11 +50,14 @@
</span><span class="cx"> 
</span><span class="cx"> private:
</span><span class="cx">     void compileNode(unsigned);
</span><ins>+    void compileNodeTransitions(const DFANode&amp;);
+    void compileCheckForRange(uint16_t lowValue, uint16_t highValue, unsigned destinationNodeIndex);
</ins><span class="cx"> 
</span><span class="cx">     void emitAppendAction(unsigned);
</span><span class="cx">     void emitTestFlagsAndAppendAction(uint16_t flags, unsigned);
</span><span class="cx">     void emitJump(unsigned destinationNodeIndex);
</span><span class="cx">     void emitCheckValue(uint8_t value, unsigned destinationNodeIndex);
</span><ins>+    void emitCheckValueRange(uint8_t lowValue, uint8_t highValue, unsigned destinationNodeIndex);
</ins><span class="cx">     void emitTerminate();
</span><span class="cx"> 
</span><span class="cx">     Vector&lt;DFABytecode&gt;&amp; m_bytecode;
</span></span></pre></div>
<a id="trunkSourceWebCorecontentextensionsDFABytecodeInterpretercpp"></a>
<div class="modfile"><h4>Modified: trunk/Source/WebCore/contentextensions/DFABytecodeInterpreter.cpp (181662 => 181663)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/Source/WebCore/contentextensions/DFABytecodeInterpreter.cpp        2015-03-17 20:28:33 UTC (rev 181662)
+++ trunk/Source/WebCore/contentextensions/DFABytecodeInterpreter.cpp        2015-03-17 20:47:42 UTC (rev 181663)
</span><span class="lines">@@ -73,6 +73,22 @@
</span><span class="cx">                 programCounter += instructionSizeWithArguments(DFABytecodeInstruction::CheckValue);
</span><span class="cx">             break;
</span><span class="cx"> 
</span><ins>+        case DFABytecodeInstruction::CheckValueRange: {
+            if (urlIndexIsAfterEndOfString)
+                return actions;
+
+            char character = url[urlIndex];
+            if (character &gt;= getBits&lt;uint8_t&gt;(m_bytecode, m_bytecodeLength, programCounter + sizeof(DFABytecode))
+                &amp;&amp; character &lt;= getBits&lt;uint8_t&gt;(m_bytecode, m_bytecodeLength, programCounter + sizeof(DFABytecode) + sizeof(uint8_t))) {
+                programCounter = getBits&lt;unsigned&gt;(m_bytecode, m_bytecodeLength, programCounter + sizeof(DFABytecode) + sizeof(uint8_t) + sizeof(uint8_t));
+                if (!character)
+                    urlIndexIsAfterEndOfString = true;
+                urlIndex++; // This represents an edge in the DFA.
+            } else
+                programCounter += instructionSizeWithArguments(DFABytecodeInstruction::CheckValueRange);
+            break;
+        }
+
</ins><span class="cx">         case DFABytecodeInstruction::Jump:
</span><span class="cx">             if (!url[urlIndex] || urlIndexIsAfterEndOfString)
</span><span class="cx">                 return actions;
</span></span></pre></div>
<a id="trunkSourceWebCorecontentextensionsNFAcpp"></a>
<div class="modfile"><h4>Modified: trunk/Source/WebCore/contentextensions/NFA.cpp (181662 => 181663)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/Source/WebCore/contentextensions/NFA.cpp        2015-03-17 20:28:33 UTC (rev 181662)
+++ trunk/Source/WebCore/contentextensions/NFA.cpp        2015-03-17 20:47:42 UTC (rev 181663)
</span><span class="lines">@@ -83,8 +83,8 @@
</span><span class="cx"> 
</span><span class="cx"> void NFA::setFinal(unsigned node, uint64_t ruleId)
</span><span class="cx"> {
</span><del>-    ASSERT(!m_nodes[node].finalRuleIds.contains(ruleId));
-    m_nodes[node].finalRuleIds.append(ruleId);
</del><ins>+    if (!m_nodes[node].finalRuleIds.contains(ruleId))
+        m_nodes[node].finalRuleIds.append(ruleId);
</ins><span class="cx"> }
</span><span class="cx"> 
</span><span class="cx"> unsigned NFA::graphSize() const
</span></span></pre></div>
<a id="trunkSourceWebCorecontentextensionsNFAToDFAcpp"></a>
<div class="modfile"><h4>Modified: trunk/Source/WebCore/contentextensions/NFAToDFA.cpp (181662 => 181663)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/Source/WebCore/contentextensions/NFAToDFA.cpp        2015-03-17 20:28:33 UTC (rev 181662)
+++ trunk/Source/WebCore/contentextensions/NFAToDFA.cpp        2015-03-17 20:47:42 UTC (rev 181663)
</span><span class="lines">@@ -338,6 +338,49 @@
</span><span class="cx">     return uniqueNodeIdAddResult.iterator-&gt;impl()-&gt;m_dfaNodeId;
</span><span class="cx"> }
</span><span class="cx"> 
</span><ins>+static void simplifyTransitions(Vector&lt;DFANode&gt;&amp; dfaGraph)
+{
+    for (DFANode&amp; dfaNode : dfaGraph) {
+        if (!dfaNode.hasFallbackTransition
+            &amp;&amp; ((dfaNode.transitions.size() == 126 &amp;&amp; !dfaNode.transitions.contains(0))
+                || (dfaNode.transitions.size() == 127 &amp;&amp; dfaNode.transitions.contains(0)))) {
+            unsigned bestTarget = std::numeric_limits&lt;unsigned&gt;::max();
+            unsigned bestTargetScore = 0;
+            HashMap&lt;unsigned, unsigned, DefaultHash&lt;unsigned&gt;::Hash, WTF::UnsignedWithZeroKeyHashTraits&lt;unsigned&gt;&gt; targetHistogram;
+            for (const auto transition : dfaNode.transitions) {
+                if (!transition.key)
+                    continue;
+
+                unsigned transitionTarget = transition.value;
+                auto addResult = targetHistogram.add(transitionTarget, 1);
+                if (!addResult.isNewEntry)
+                    addResult.iterator-&gt;value++;
+
+                if (addResult.iterator-&gt;value &gt; bestTargetScore) {
+                    bestTargetScore = addResult.iterator-&gt;value;
+                    bestTarget = transitionTarget;
+                }
+            }
+            ASSERT_WITH_MESSAGE(bestTargetScore, &quot;There should be at least one valid target since having transitions is a precondition to enter this path.&quot;);
+
+            dfaNode.hasFallbackTransition = true;
+            dfaNode.fallbackTransition = bestTarget;
+        }
+
+        if (dfaNode.hasFallbackTransition) {
+            Vector&lt;uint16_t, 128&gt; keys;
+            DFANodeTransitions&amp; transitions = dfaNode.transitions;
+            copyKeysToVector(transitions, keys);
+
+            for (uint16_t key : keys) {
+                auto transitionIterator = transitions.find(key);
+                if (transitionIterator-&gt;value == dfaNode.fallbackTransition)
+                    transitions.remove(transitionIterator);
+            }
+        }
+    }
+}
+
</ins><span class="cx"> DFA NFAToDFA::convert(NFA&amp; nfa)
</span><span class="cx"> {
</span><span class="cx">     Vector&lt;NFANode&gt;&amp; nfaGraph = nfa.m_nodes;
</span><span class="lines">@@ -387,6 +430,7 @@
</span><span class="cx">         }
</span><span class="cx">     } while (!unprocessedNodes.isEmpty());
</span><span class="cx"> 
</span><ins>+    simplifyTransitions(dfaGraph);
</ins><span class="cx">     return DFA(WTF::move(dfaGraph), 0);
</span><span class="cx"> }
</span><span class="cx"> 
</span></span></pre></div>
<a id="trunkToolsChangeLog"></a>
<div class="modfile"><h4>Modified: trunk/Tools/ChangeLog (181662 => 181663)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/Tools/ChangeLog        2015-03-17 20:28:33 UTC (rev 181662)
+++ trunk/Tools/ChangeLog        2015-03-17 20:47:42 UTC (rev 181663)
</span><span class="lines">@@ -1,3 +1,13 @@
</span><ins>+2015-03-17  Benjamin Poulain  &lt;bpoulain@apple.com&gt;
+
+        Compile character ranges targeting the same state as range check in the bytecode
+        https://bugs.webkit.org/show_bug.cgi?id=142759
+
+        Reviewed by Alex Christensen.
+
+        * TestWebKitAPI/Tests/WebCore/ContentExtensions.cpp:
+        (TestWebKitAPI::TEST_F):
+
</ins><span class="cx"> 2015-03-17  Youenn Fablet  &lt;youenn.fablet@crf.canon.fr&gt;
</span><span class="cx"> 
</span><span class="cx">         W3C test parser and converter should use test importer host
</span></span></pre></div>
<a id="trunkToolsTestWebKitAPITestsWebCoreContentExtensionscpp"></a>
<div class="modfile"><h4>Modified: trunk/Tools/TestWebKitAPI/Tests/WebCore/ContentExtensions.cpp (181662 => 181663)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/Tools/TestWebKitAPI/Tests/WebCore/ContentExtensions.cpp        2015-03-17 20:28:33 UTC (rev 181662)
+++ trunk/Tools/TestWebKitAPI/Tests/WebCore/ContentExtensions.cpp        2015-03-17 20:47:42 UTC (rev 181663)
</span><span class="lines">@@ -122,6 +122,56 @@
</span><span class="cx">     testRequest(backend, mainDocumentRequest(&quot;http://webkit.org/&quot;), { ContentExtensions::ActionType::BlockLoad });
</span><span class="cx"> }
</span><span class="cx"> 
</span><ins>+TEST_F(ContentExtensionTest, RangeBasic)
+{
+    const char* rangeBasicFilter = &quot;[{\&quot;action\&quot;:{\&quot;type\&quot;:\&quot;block\&quot;},\&quot;trigger\&quot;:{\&quot;url-filter\&quot;:\&quot;.*w[0-9]c\&quot;, \&quot;url-filter-is-case-sensitive\&quot;:true}},{\&quot;action\&quot;:{\&quot;type\&quot;:\&quot;block-cookies\&quot;},\&quot;trigger\&quot;:{\&quot;url-filter\&quot;:\&quot;.*[A-H][a-z]cko\&quot;, \&quot;url-filter-is-case-sensitive\&quot;:true}}]&quot;;
+    auto extensionData = ContentExtensions::compileRuleList(rangeBasicFilter);
+    auto extension = InMemoryCompiledContentExtension::create(WTF::move(extensionData));
+
+    ContentExtensions::ContentExtensionsBackend backend;
+    backend.addContentExtension(&quot;PatternNestedGroupsFilter&quot;, extension);
+
+    testRequest(backend, mainDocumentRequest(&quot;http://w3c.org&quot;), { ContentExtensions::ActionType::BlockLoad });
+    testRequest(backend, mainDocumentRequest(&quot;w2c://whatwg.org/&quot;), { ContentExtensions::ActionType::BlockLoad });
+    testRequest(backend, mainDocumentRequest(&quot;http://webkit.org/w0c&quot;), { ContentExtensions::ActionType::BlockLoad });
+    testRequest(backend, mainDocumentRequest(&quot;http://webkit.org/wac&quot;), { });
+    testRequest(backend, mainDocumentRequest(&quot;http://webkit.org/wAc&quot;), { });
+
+    // Note: URL parsing and canonicalization lowercase the scheme and hostname.
+    testRequest(backend, mainDocumentRequest(&quot;Aacko://webkit.org&quot;), { });
+    testRequest(backend, mainDocumentRequest(&quot;aacko://webkit.org&quot;), { });
+    testRequest(backend, mainDocumentRequest(&quot;http://gCcko.org/&quot;), { });
+    testRequest(backend, mainDocumentRequest(&quot;http://gccko.org/&quot;), { });
+
+    testRequest(backend, mainDocumentRequest(&quot;http://webkit.org/Gecko&quot;), { ContentExtensions::ActionType::BlockCookies });
+    testRequest(backend, mainDocumentRequest(&quot;http://webkit.org/gecko&quot;), { });
+    testRequest(backend, mainDocumentRequest(&quot;http://webkit.org/GEcko&quot;), { });
+}
+
+TEST_F(ContentExtensionTest, RangeExclusionGeneratingUniversalTransition)
+{
+    // Transition of the type ([^X]X) effictively transition on every input.
+    const char* rangeExclusionGeneratingUniversalTransitionFilter = &quot;[{\&quot;action\&quot;:{\&quot;type\&quot;:\&quot;block\&quot;},\&quot;trigger\&quot;:{\&quot;url-filter\&quot;:\&quot;.*[^a]+afoobar\&quot;}}]&quot;;
+    auto extensionData = ContentExtensions::compileRuleList(rangeExclusionGeneratingUniversalTransitionFilter);
+    auto extension = InMemoryCompiledContentExtension::create(WTF::move(extensionData));
+
+    ContentExtensions::ContentExtensionsBackend backend;
+    backend.addContentExtension(&quot;PatternNestedGroupsFilter&quot;, extension);
+
+    testRequest(backend, mainDocumentRequest(&quot;http://w3c.org&quot;), { });
+
+    testRequest(backend, mainDocumentRequest(&quot;http://w3c.org/foobafoobar&quot;), { ContentExtensions::ActionType::BlockLoad });
+    testRequest(backend, mainDocumentRequest(&quot;http://w3c.org/foobarfoobar&quot;), { });
+    testRequest(backend, mainDocumentRequest(&quot;http://w3c.org/FOOBAFOOBAR&quot;), { ContentExtensions::ActionType::BlockLoad });
+    testRequest(backend, mainDocumentRequest(&quot;http://w3c.org/FOOBARFOOBAR&quot;), { });
+
+    // The character before the &quot;a&quot; prefix cannot be another &quot;a&quot;.
+    testRequest(backend, mainDocumentRequest(&quot;http://w3c.org/aafoobar&quot;), { });
+    testRequest(backend, mainDocumentRequest(&quot;http://w3c.org/Aafoobar&quot;), { });
+    testRequest(backend, mainDocumentRequest(&quot;http://w3c.org/aAfoobar&quot;), { });
+    testRequest(backend, mainDocumentRequest(&quot;http://w3c.org/AAfoobar&quot;), { });
+}
+
</ins><span class="cx"> const char* patternsStartingWithGroupFilter = &quot;[{\&quot;action\&quot;:{\&quot;type\&quot;:\&quot;block\&quot;},\&quot;trigger\&quot;:{\&quot;url-filter\&quot;:\&quot;(http://whatwg\\\\.org/)?webkit\134\134.org\&quot;}}]&quot;;
</span><span class="cx"> 
</span><span class="cx"> TEST_F(ContentExtensionTest, PatternStartingWithGroup)
</span><span class="lines">@@ -219,6 +269,26 @@
</span><span class="cx">     testRequest(backend, mainDocumentRequest(&quot;http://webkit.org/foobarfoo&quot;), { });
</span><span class="cx">     testRequest(backend, mainDocumentRequest(&quot;http://webkit.org/foobarf&quot;), { });
</span><span class="cx"> }
</span><ins>+
+TEST_F(ContentExtensionTest, EndOfLineAssertionWithInvertedCharacterSet)
+{
+    const char* endOfLineAssertionWithInvertedCharacterSetFilter = &quot;[{\&quot;action\&quot;:{\&quot;type\&quot;:\&quot;block\&quot;},\&quot;trigger\&quot;:{\&quot;url-filter\&quot;:\&quot;.*[^y]$\&quot;}}]&quot;;
+    auto extensionData = ContentExtensions::compileRuleList(endOfLineAssertionWithInvertedCharacterSetFilter);
+    auto extension = InMemoryCompiledContentExtension::create(WTF::move(extensionData));
+
+    ContentExtensions::ContentExtensionsBackend backend;
+    backend.addContentExtension(&quot;EndOfLineAssertion&quot;, extension);
+
+    testRequest(backend, mainDocumentRequest(&quot;http://webkit.org/&quot;), { ContentExtensions::ActionType::BlockLoad });
+    testRequest(backend, mainDocumentRequest(&quot;http://webkit.org/a&quot;), { ContentExtensions::ActionType::BlockLoad });
+    testRequest(backend, mainDocumentRequest(&quot;http://webkit.org/foobar&quot;), { ContentExtensions::ActionType::BlockLoad });
+    testRequest(backend, mainDocumentRequest(&quot;http://webkit.org/Ya&quot;), { ContentExtensions::ActionType::BlockLoad });
+    testRequest(backend, mainDocumentRequest(&quot;http://webkit.org/yFoobar&quot;), { ContentExtensions::ActionType::BlockLoad });
+    testRequest(backend, mainDocumentRequest(&quot;http://webkit.org/y&quot;), { });
+    testRequest(backend, mainDocumentRequest(&quot;http://webkit.org/Y&quot;), { });
+    testRequest(backend, mainDocumentRequest(&quot;http://webkit.org/foobary&quot;), { });
+    testRequest(backend, mainDocumentRequest(&quot;http://webkit.org/foobarY&quot;), { });
+}
</ins><span class="cx">     
</span><span class="cx"> const char* loadTypeFilter = &quot;[{\&quot;action\&quot;:{\&quot;type\&quot;:\&quot;block\&quot;},\&quot;trigger\&quot;:{\&quot;url-filter\&quot;:\&quot;.*webkit.org\&quot;,\&quot;load-type\&quot;:[\&quot;third-party\&quot;]}},&quot;
</span><span class="cx">     &quot;{\&quot;action\&quot;:{\&quot;type\&quot;:\&quot;block\&quot;},\&quot;trigger\&quot;:{\&quot;url-filter\&quot;:\&quot;.*whatwg.org\&quot;,\&quot;load-type\&quot;:[\&quot;first-party\&quot;]}},&quot;
</span></span></pre>
</div>
</div>

</body>
</html>