<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
"http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head><meta http-equiv="content-type" content="text/html; charset=utf-8" />
<title>[215057] trunk/Source/JavaScriptCore</title>
</head>
<body>

<style type="text/css"><!--
#msg dl.meta { border: 1px #006 solid; background: #369; padding: 6px; color: #fff; }
#msg dl.meta dt { float: left; width: 6em; font-weight: bold; }
#msg dt:after { content:':';}
#msg dl, #msg dt, #msg ul, #msg li, #header, #footer, #logmsg { font-family: verdana,arial,helvetica,sans-serif; font-size: 10pt;  }
#msg dl a { font-weight: bold}
#msg dl a:link    { color:#fc3; }
#msg dl a:active  { color:#ff0; }
#msg dl a:visited { color:#cc6; }
h3 { font-family: verdana,arial,helvetica,sans-serif; font-size: 10pt; font-weight: bold; }
#msg pre { overflow: auto; background: #ffc; border: 1px #fa0 solid; padding: 6px; }
#logmsg { background: #ffc; border: 1px #fa0 solid; padding: 1em 1em 0 1em; }
#logmsg p, #logmsg pre, #logmsg blockquote { margin: 0 0 1em 0; }
#logmsg p, #logmsg li, #logmsg dt, #logmsg dd { line-height: 14pt; }
#logmsg h1, #logmsg h2, #logmsg h3, #logmsg h4, #logmsg h5, #logmsg h6 { margin: .5em 0; }
#logmsg h1:first-child, #logmsg h2:first-child, #logmsg h3:first-child, #logmsg h4:first-child, #logmsg h5:first-child, #logmsg h6:first-child { margin-top: 0; }
#logmsg ul, #logmsg ol { padding: 0; list-style-position: inside; margin: 0 0 0 1em; }
#logmsg ul { text-indent: -1em; padding-left: 1em; }#logmsg ol { text-indent: -1.5em; padding-left: 1.5em; }
#logmsg > ul, #logmsg > ol { margin: 0 0 1em 0; }
#logmsg pre { background: #eee; padding: 1em; }
#logmsg blockquote { border: 1px solid #fa0; border-left-width: 10px; padding: 1em 1em 0 1em; background: white;}
#logmsg dl { margin: 0; }
#logmsg dt { font-weight: bold; }
#logmsg dd { margin: 0; padding: 0 0 0.5em 0; }
#logmsg dd:before { content:'\00bb';}
#logmsg table { border-spacing: 0px; border-collapse: collapse; border-top: 4px solid #fa0; border-bottom: 1px solid #fa0; background: #fff; }
#logmsg table th { text-align: left; font-weight: normal; padding: 0.2em 0.5em; border-top: 1px dotted #fa0; }
#logmsg table td { text-align: right; border-top: 1px dotted #fa0; padding: 0.2em 0.5em; }
#logmsg table thead th { text-align: center; border-bottom: 1px solid #fa0; }
#logmsg table th.Corner { text-align: left; }
#logmsg hr { border: none 0; border-top: 2px dashed #fa0; height: 1px; }
#header, #footer { color: #fff; background: #636; border: 1px #300 solid; padding: 6px; }
#patch { width: 100%; }
#patch h4 {font-family: verdana,arial,helvetica,sans-serif;font-size:10pt;padding:8px;background:#369;color:#fff;margin:0;}
#patch .propset h4, #patch .binary h4 {margin:0;}
#patch pre {padding:0;line-height:1.2em;margin:0;}
#patch .diff {width:100%;background:#eee;padding: 0 0 10px 0;overflow:auto;}
#patch .propset .diff, #patch .binary .diff  {padding:10px 0;}
#patch span {display:block;padding:0 10px;}
#patch .modfile, #patch .addfile, #patch .delfile, #patch .propset, #patch .binary, #patch .copfile {border:1px solid #ccc;margin:10px 0;}
#patch ins {background:#dfd;text-decoration:none;display:block;padding:0 10px;}
#patch del {background:#fdd;text-decoration:none;display:block;padding:0 10px;}
#patch .lines, .info {color:#888;background:#fff;}
--></style>
<div id="msg">
<dl class="meta">
<dt>Revision</dt> <dd><a href="http://trac.webkit.org/projects/webkit/changeset/215057">215057</a></dd>
<dt>Author</dt> <dd>fpizlo@apple.com</dd>
<dt>Date</dt> <dd>2017-04-06 13:58:34 -0700 (Thu, 06 Apr 2017)</dd>
</dl>

<h3>Log Message</h3>
<pre>B3 -O1 should generate better code than -O0
https://bugs.webkit.org/show_bug.cgi?id=170563

Reviewed by Michael Saboff.
        
Prior to this change, code generated by -O1 ran slower than code generated by -O0. This turned
out to be because of reduceStrength optimizations that increase live ranges and create register
pressure, which then creates problems for linear scan.
        
It seemed obvious that canonicalizations that help isel, constant folding, and one-for-one
strength reductions should stay. It also seemed obvious that SSA and CFG simplification are fast
and harmless. So, I focused on removing:
        
- CSE, which increases live ranges. This is a risky optimization when we know that we've chosen
  to use a bad register allocator.
        
- Sophisticated strength reductions that create more code, like the insane division optimization.
        
- Anything that inserts basic blocks.
        
CSE appeared to be the cause of half of the throughput regression of -O1 but none of the compile
time. This change also reduces the running time of reduceStrength by making it not a fixpoint at
optLevel&lt;2.
        
This makes wasm -O1 compile 17% faster. This makes wasm -O1 run 19% faster. This makes -O1 code
run 3% faster than -O0, and compile about 4% slower than -O0. We may yet end up choosing to use
-O0, but at least now -O1 isn't totally useless.

* b3/B3ReduceStrength.cpp:</pre>
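<p>The loop change described above (run reduceStrength to a fixpoint only when optLevel &gt;= 2, otherwise run a single round) can be illustrated with a minimal sketch. This is not the B3 code: <code>ToyProc</code>, <code>runOneRound</code>, and <code>runToyReduceStrength</code> are hypothetical names, and the "reduction" is a stand-in that halves even numbers. Only the shape of the <code>do ... while</code> condition is modeled on the patch.</p>
<pre><![CDATA[
```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a B3 Procedure: a list of numbers plus an opt level.
struct ToyProc {
    std::vector<int> values;
    int optLevel;
};

// One round of a toy "strength reduction": halve every even, nonzero value once.
// Returns true if anything changed, playing the role of m_changed in the patch.
bool runOneRound(ToyProc& proc)
{
    bool changed = false;
    for (int& value : proc.values) {
        if (value != 0 && value % 2 == 0) {
            value /= 2;
            changed = true;
        }
    }
    return changed;
}

// At optLevel >= 2, repeat rounds until nothing changes (a fixpoint).
// Below that, run exactly one round. Returns the number of rounds executed.
int runToyReduceStrength(ToyProc& proc)
{
    int rounds = 0;
    bool changed;
    do {
        changed = runOneRound(proc);
        ++rounds;
    } while (changed && proc.optLevel >= 2);
    return rounds;
}
```
]]></pre>
<p>In this toy, the same input takes one round at optLevel 1 and leaves work undone, while at optLevel 2 it keeps going until a round makes no change. That is the compile-time versus code-quality trade the log message describes, under the assumptions stated above.</p>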

<h3>Modified Paths</h3>
<ul>
<li><a href="#trunkSourceJavaScriptCoreChangeLog">trunk/Source/JavaScriptCore/ChangeLog</a></li>
<li><a href="#trunkSourceJavaScriptCoreb3B3ReduceStrengthcpp">trunk/Source/JavaScriptCore/b3/B3ReduceStrength.cpp</a></li>
</ul>

</div>
<div id="patch">
<h3>Diff</h3>
<a id="trunkSourceJavaScriptCoreChangeLog"></a>
<div class="modfile"><h4>Modified: trunk/Source/JavaScriptCore/ChangeLog (215056 => 215057)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/Source/JavaScriptCore/ChangeLog        2017-04-06 20:38:38 UTC (rev 215056)
+++ trunk/Source/JavaScriptCore/ChangeLog        2017-04-06 20:58:34 UTC (rev 215057)
</span><span class="lines">@@ -1,3 +1,35 @@
</span><ins>+2017-04-06  Filip Pizlo  &lt;fpizlo@apple.com&gt;
+
+        B3 -O1 should generate better code than -O0
+        https://bugs.webkit.org/show_bug.cgi?id=170563
+
+        Reviewed by Michael Saboff.
+        
+        Prior to this change, code generated by -O1 ran slower than code generated by -O0. This turned
+        out to be because of reduceStrength optimizations that increase live ranges and create register
+        pressure, which then creates problems for linear scan.
+        
+        It seemed obvious that canonicalizations that help isel, constant folding, and one-for-one
+        strength reductions should stay. It also seemed obvious that SSA and CFG simplification are fast
+        and harmless. So, I focused on removing:
+        
+        - CSE, which increases live ranges. This is a risky optimization when we know that we've chosen
+          to use a bad register allocator.
+        
+        - Sophisticated strength reductions that create more code, like the insane division optimization.
+        
+        - Anything that inserts basic blocks.
+        
+        CSE appeared to be the cause of half of the throughput regression of -O1 but none of the compile
+        time. This change also reduces the running time of reduceStrength by making it not a fixpoint at
+        optLevel&lt;2.
+        
+        This makes wasm -O1 compile 17% faster. This makes wasm -O1 run 19% faster. This makes -O1 code
+        run 3% faster than -O0, and compile about 4% slower than -O0. We may yet end up choosing to use
+        -O0, but at least now -O1 isn't totally useless.
+
+        * b3/B3ReduceStrength.cpp:
+
</ins><span class="cx"> 2017-04-06  Jon Davis  &lt;jond@apple.com&gt;
</span><span class="cx"> 
</span><span class="cx">         Updates feature status for recently shipped features
</span></span></pre></div>
<a id="trunkSourceJavaScriptCoreb3B3ReduceStrengthcpp"></a>
<div class="modfile"><h4>Modified: trunk/Source/JavaScriptCore/b3/B3ReduceStrength.cpp (215056 => 215057)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/Source/JavaScriptCore/b3/B3ReduceStrength.cpp        2017-04-06 20:38:38 UTC (rev 215056)
+++ trunk/Source/JavaScriptCore/b3/B3ReduceStrength.cpp        2017-04-06 20:58:34 UTC (rev 215057)
</span><span class="lines">@@ -442,9 +442,11 @@
</span><span class="cx">             
</span><span class="cx">             simplifySSA();
</span><span class="cx">             
</span><del>-            m_proc.resetValueOwners();
-            m_dominators = &amp;m_proc.dominators(); // Recompute if necessary.
-            m_pureCSE.clear();
</del><ins>+            if (m_proc.optLevel() &gt;= 2) {
+                m_proc.resetValueOwners();
+                m_dominators = &amp;m_proc.dominators(); // Recompute if necessary.
+                m_pureCSE.clear();
+            }
</ins><span class="cx"> 
</span><span class="cx">             for (BasicBlock* block : m_proc.blocksInPreOrder()) {
</span><span class="cx">                 m_block = block;
</span><span class="lines">@@ -457,23 +459,25 @@
</span><span class="cx">                     }
</span><span class="cx">                     m_value = m_block-&gt;at(m_index);
</span><span class="cx">                     m_value-&gt;performSubstitution();
</span><del>-                    
</del><span class="cx">                     reduceValueStrength();
</span><del>-                    replaceIfRedundant();
</del><ins>+                    if (m_proc.optLevel() &gt;= 2)
+                        replaceIfRedundant();
</ins><span class="cx">                 }
</span><span class="cx">                 m_insertionSet.execute(m_block);
</span><span class="cx">             }
</span><span class="cx"> 
</span><span class="cx">             m_changedCFG |= m_blockInsertionSet.execute();
</span><del>-            if (m_changedCFG) {
-                m_proc.resetReachability();
-                m_proc.invalidateCFG();
-                m_dominators = nullptr; // Dominators are not valid anymore, and we don't need them yet.
-                m_changed = true;
-            }
</del><ins>+            handleChangedCFGIfNecessary();
</ins><span class="cx">             
</span><span class="cx">             result |= m_changed;
</span><del>-        } while (m_changed);
</del><ins>+        } while (m_changed &amp;&amp; m_proc.optLevel() &gt;= 2);
+        
+        if (m_proc.optLevel() &lt; 2) {
+            m_changedCFG = false;
+            simplifyCFG();
+            handleChangedCFGIfNecessary();
+        }
+        
</ins><span class="cx">         return result;
</span><span class="cx">     }
</span><span class="cx">     
</span><span class="lines">@@ -726,6 +730,9 @@
</span><span class="cx"> 
</span><span class="cx">                     if (m_value-&gt;type() != Int32)
</span><span class="cx">                         break;
</span><ins>+                    
+                    if (m_proc.optLevel() &lt; 2)
+                        break;
</ins><span class="cx"> 
</span><span class="cx">                     int32_t divisor = m_value-&gt;child(1)-&gt;asInt32();
</span><span class="cx">                     DivisionMagic&lt;int32_t&gt; magic = computeDivisionMagic(divisor);
</span><span class="lines">@@ -821,6 +828,9 @@
</span><span class="cx">                     break;
</span><span class="cx"> 
</span><span class="cx">                 default:
</span><ins>+                    if (m_proc.optLevel() &lt; 2)
+                        break;
+                    
</ins><span class="cx">                     // Turn this: Mod(N, D)
</span><span class="cx">                     // Into this: Sub(N, Mul(Div(N, D), D))
</span><span class="cx">                     //
</span><span class="lines">@@ -1841,6 +1851,9 @@
</span><span class="cx">                 checkValue-&gt;child(0) = checkValue-&gt;child(0)-&gt;child(0);
</span><span class="cx">                 m_changed = true;
</span><span class="cx">             }
</span><ins>+            
+            if (m_proc.optLevel() &lt; 2)
+                break;
</ins><span class="cx"> 
</span><span class="cx">             // If we are checking some bounded-size SSA expression that leads to a Select that
</span><span class="cx">             // has a constant as one of its results, then turn the Select into a Branch and split
</span><span class="lines">@@ -1947,17 +1960,19 @@
</span><span class="cx">                 break;
</span><span class="cx">             }
</span><span class="cx"> 
</span><del>-            // If a check for the same property dominates us, we can kill the branch. This sort
-            // of makes sense here because it's cheap, but hacks like this show that we're going
-            // to need SCCP.
-            Value* check = m_pureCSE.findMatch(
-                ValueKey(Check, Void, m_value-&gt;child(0)), m_block, *m_dominators);
-            if (check) {
-                // The Check would have side-exited if child(0) was non-zero. So, it must be
-                // zero here.
-                m_block-&gt;taken().block()-&gt;removePredecessor(m_block);
-                m_value-&gt;replaceWithJump(m_block, m_block-&gt;notTaken());
-                m_changedCFG = true;
</del><ins>+            if (m_proc.optLevel() &gt;= 2) {
+                // If a check for the same property dominates us, we can kill the branch. This sort
+                // of makes sense here because it's cheap, but hacks like this show that we're going
+                // to need SCCP.
+                Value* check = m_pureCSE.findMatch(
+                    ValueKey(Check, Void, m_value-&gt;child(0)), m_block, *m_dominators);
+                if (check) {
+                    // The Check would have side-exited if child(0) was non-zero. So, it must be
+                    // zero here.
+                    m_block-&gt;taken().block()-&gt;removePredecessor(m_block);
+                    m_value-&gt;replaceWithJump(m_block, m_block-&gt;notTaken());
+                    m_changedCFG = true;
+                }
</ins><span class="cx">             }
</span><span class="cx">             break;
</span><span class="cx">         }
</span><span class="lines">@@ -2378,6 +2393,16 @@
</span><span class="cx">             dataLog(m_proc);
</span><span class="cx">         }
</span><span class="cx">     }
</span><ins>+    
+    void handleChangedCFGIfNecessary()
+    {
+        if (m_changedCFG) {
+            m_proc.resetReachability();
+            m_proc.invalidateCFG();
+            m_dominators = nullptr; // Dominators are not valid anymore, and we don't need them yet.
+            m_changed = true;
+        }
+    }
</ins><span class="cx"> 
</span><span class="cx">     void checkPredecessorValidity()
</span><span class="cx">     {
</span></span></pre>
</div>
</div>

</body>
</html>