<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
"http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head><meta http-equiv="content-type" content="text/html; charset=utf-8" />
<title>[224592] trunk</title>
</head>
<body>

<style type="text/css"><!--
#msg dl.meta { border: 1px #006 solid; background: #369; padding: 6px; color: #fff; }
#msg dl.meta dt { float: left; width: 6em; font-weight: bold; }
#msg dt:after { content:':';}
#msg dl, #msg dt, #msg ul, #msg li, #header, #footer, #logmsg { font-family: verdana,arial,helvetica,sans-serif; font-size: 10pt;  }
#msg dl a { font-weight: bold}
#msg dl a:link    { color:#fc3; }
#msg dl a:active  { color:#ff0; }
#msg dl a:visited { color:#cc6; }
h3 { font-family: verdana,arial,helvetica,sans-serif; font-size: 10pt; font-weight: bold; }
#msg pre { overflow: auto; background: #ffc; border: 1px #fa0 solid; padding: 6px; }
#logmsg { background: #ffc; border: 1px #fa0 solid; padding: 1em 1em 0 1em; }
#logmsg p, #logmsg pre, #logmsg blockquote { margin: 0 0 1em 0; }
#logmsg p, #logmsg li, #logmsg dt, #logmsg dd { line-height: 14pt; }
#logmsg h1, #logmsg h2, #logmsg h3, #logmsg h4, #logmsg h5, #logmsg h6 { margin: .5em 0; }
#logmsg h1:first-child, #logmsg h2:first-child, #logmsg h3:first-child, #logmsg h4:first-child, #logmsg h5:first-child, #logmsg h6:first-child { margin-top: 0; }
#logmsg ul, #logmsg ol { padding: 0; list-style-position: inside; margin: 0 0 0 1em; }
#logmsg ul { text-indent: -1em; padding-left: 1em; }#logmsg ol { text-indent: -1.5em; padding-left: 1.5em; }
#logmsg > ul, #logmsg > ol { margin: 0 0 1em 0; }
#logmsg pre { background: #eee; padding: 1em; }
#logmsg blockquote { border: 1px solid #fa0; border-left-width: 10px; padding: 1em 1em 0 1em; background: white;}
#logmsg dl { margin: 0; }
#logmsg dt { font-weight: bold; }
#logmsg dd { margin: 0; padding: 0 0 0.5em 0; }
#logmsg dd:before { content:'\00bb';}
#logmsg table { border-spacing: 0px; border-collapse: collapse; border-top: 4px solid #fa0; border-bottom: 1px solid #fa0; background: #fff; }
#logmsg table th { text-align: left; font-weight: normal; padding: 0.2em 0.5em; border-top: 1px dotted #fa0; }
#logmsg table td { text-align: right; border-top: 1px dotted #fa0; padding: 0.2em 0.5em; }
#logmsg table thead th { text-align: center; border-bottom: 1px solid #fa0; }
#logmsg table th.Corner { text-align: left; }
#logmsg hr { border: none 0; border-top: 2px dashed #fa0; height: 1px; }
#header, #footer { color: #fff; background: #636; border: 1px #300 solid; padding: 6px; }
#patch { width: 100%; }
#patch h4 {font-family: verdana,arial,helvetica,sans-serif;font-size:10pt;padding:8px;background:#369;color:#fff;margin:0;}
#patch .propset h4, #patch .binary h4 {margin:0;}
#patch pre {padding:0;line-height:1.2em;margin:0;}
#patch .diff {width:100%;background:#eee;padding: 0 0 10px 0;overflow:auto;}
#patch .propset .diff, #patch .binary .diff  {padding:10px 0;}
#patch span {display:block;padding:0 10px;}
#patch .modfile, #patch .addfile, #patch .delfile, #patch .propset, #patch .binary, #patch .copfile {border:1px solid #ccc;margin:10px 0;}
#patch ins {background:#dfd;text-decoration:none;display:block;padding:0 10px;}
#patch del {background:#fdd;text-decoration:none;display:block;padding:0 10px;}
#patch .lines, .info {color:#888;background:#fff;}
--></style>
<div id="msg">
<dl class="meta">
<dt>Revision</dt> <dd><a href="http://trac.webkit.org/projects/webkit/changeset/224592">224592</a></dd>
<dt>Author</dt> <dd>rmorisset@apple.com</dd>
<dt>Date</dt> <dd>2017-11-08 12:42:15 -0800 (Wed, 08 Nov 2017)</dd>
</dl>

<h3>Log Message</h3>
<pre>Turn recursive tail calls into loops
https://bugs.webkit.org/show_bug.cgi?id=176601

Reviewed by Saam Barati.

Relanding after https://bugs.webkit.org/show_bug.cgi?id=178834.

JSTests:

Add some simple test that computes factorial in several ways, and other trivial computations.
They all tests the case where foo calls bar (in an inlineable way) that then does a tail call.
Depending on the nature of both calls, it is possible or not to turn the tail call into a loop.
I have no clear way of checking that the call was indeed transformed, but I can check that the code computes the right result
(which it doesn't if that tail call is transformed into a loop in the unsound cases).

* stress/inline-call-to-recursive-tail-call.js: Added.
(factorial.aux):
(factorial):
(factorial2.aux2):
(factorial2.id):
(factorial2):
(factorial3.aux3):
(factorial3):
(aux4):
(factorial4):
(foo):
(auxBar):
(bar):
(test):

Source/JavaScriptCore:

We want to turn recursive tail calls into loops early in the pipeline, so that the loops can then be optimized.
One difficulty is that we need to split the entry block of the function we are jumping to in order to have somewhere to jump to.
Worse: it is not necessarily the first block of the codeBlock, because of inlining! So we must do the splitting in the DFGByteCodeParser, at the same time as inlining.
We do this part through modifying the computation of the jump targets.
Importantly, we only do this splitting for functions that have tail calls.
It is the only case where the optimisation is sound, and doing the splitting unconditionnaly destroys performance on Octane/raytrace.

We must then do the actual transformation also in DFGByteCodeParser, to avoid code motion moving code out of the body of what will become a loop.
The transformation is entirely contained in handleRecursiveTailCall, which is hooked to the inlining machinery.

* bytecode/CodeBlock.h:
(JSC::CodeBlock::hasTailCalls const):
* bytecode/PreciseJumpTargets.cpp:
(JSC::getJumpTargetsForBytecodeOffset):
(JSC::computePreciseJumpTargetsInternal):
* bytecode/UnlinkedCodeBlock.cpp:
(JSC::UnlinkedCodeBlock::UnlinkedCodeBlock):
* bytecode/UnlinkedCodeBlock.h:
(JSC::UnlinkedCodeBlock::hasTailCalls const):
(JSC::UnlinkedCodeBlock::setHasTailCalls):
* bytecompiler/BytecodeGenerator.cpp:
(JSC::BytecodeGenerator::emitEnter):
(JSC::BytecodeGenerator::emitCallInTailPosition):
* dfg/DFGByteCodeParser.cpp:
(JSC::DFG::ByteCodeParser::allocateTargetableBlock):
(JSC::DFG::ByteCodeParser::makeBlockTargetable):
(JSC::DFG::ByteCodeParser::handleCall):
(JSC::DFG::ByteCodeParser::handleRecursiveTailCall):
(JSC::DFG::ByteCodeParser::parseBlock):
(JSC::DFG::ByteCodeParser::parse):</pre>

<h3>Modified Paths</h3>
<ul>
<li><a href="#trunkJSTestsChangeLog">trunk/JSTests/ChangeLog</a></li>
<li><a href="#trunkSourceJavaScriptCoreChangeLog">trunk/Source/JavaScriptCore/ChangeLog</a></li>
<li><a href="#trunkSourceJavaScriptCorebytecodeCodeBlockh">trunk/Source/JavaScriptCore/bytecode/CodeBlock.h</a></li>
<li><a href="#trunkSourceJavaScriptCorebytecodePreciseJumpTargetscpp">trunk/Source/JavaScriptCore/bytecode/PreciseJumpTargets.cpp</a></li>
<li><a href="#trunkSourceJavaScriptCorebytecodeUnlinkedCodeBlockcpp">trunk/Source/JavaScriptCore/bytecode/UnlinkedCodeBlock.cpp</a></li>
<li><a href="#trunkSourceJavaScriptCorebytecodeUnlinkedCodeBlockh">trunk/Source/JavaScriptCore/bytecode/UnlinkedCodeBlock.h</a></li>
<li><a href="#trunkSourceJavaScriptCorebytecompilerBytecodeGeneratorcpp">trunk/Source/JavaScriptCore/bytecompiler/BytecodeGenerator.cpp</a></li>
<li><a href="#trunkSourceJavaScriptCoredfgDFGByteCodeParsercpp">trunk/Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp</a></li>
<li><a href="#trunkSourceJavaScriptCoreruntimeExceptionHelperscpp">trunk/Source/JavaScriptCore/runtime/ExceptionHelpers.cpp</a></li>
</ul>

<h3>Added Paths</h3>
<ul>
<li><a href="#trunkJSTestsstressinlinecalltorecursivetailcalljs">trunk/JSTests/stress/inline-call-to-recursive-tail-call.js</a></li>
</ul>

</div>
<div id="patch">
<h3>Diff</h3>
<a id="trunkJSTestsChangeLog"></a>
<div class="modfile"><h4>Modified: trunk/JSTests/ChangeLog (224591 => 224592)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/JSTests/ChangeLog  2017-11-08 20:02:58 UTC (rev 224591)
+++ trunk/JSTests/ChangeLog     2017-11-08 20:42:15 UTC (rev 224592)
</span><span class="lines">@@ -1,3 +1,33 @@
</span><ins>+2017-11-08  Robin Morisset  <rmorisset@apple.com>
+
+        Turn recursive tail calls into loops
+        https://bugs.webkit.org/show_bug.cgi?id=176601
+
+        Reviewed by Saam Barati.
+
+        Relanding after https://bugs.webkit.org/show_bug.cgi?id=178834.
+
+        Add some simple test that computes factorial in several ways, and other trivial computations.
+        They all tests the case where foo calls bar (in an inlineable way) that then does a tail call.
+        Depending on the nature of both calls, it is possible or not to turn the tail call into a loop.
+        I have no clear way of checking that the call was indeed transformed, but I can check that the code computes the right result
+        (which it doesn't if that tail call is transformed into a loop in the unsound cases).
+
+        * stress/inline-call-to-recursive-tail-call.js: Added.
+        (factorial.aux):
+        (factorial):
+        (factorial2.aux2):
+        (factorial2.id):
+        (factorial2):
+        (factorial3.aux3):
+        (factorial3):
+        (aux4):
+        (factorial4):
+        (foo):
+        (auxBar):
+        (bar):
+        (test):
+
</ins><span class="cx"> 2017-11-07  Mark Lam  <mark.lam@apple.com>
</span><span class="cx"> 
</span><span class="cx">         AccessCase::generateImpl() should exclude the result register when restoring registers after a call.
</span></span></pre></div>
<a id="trunkJSTestsstressinlinecalltorecursivetailcalljs"></a>
<div class="addfile"><h4>Added: trunk/JSTests/stress/inline-call-to-recursive-tail-call.js (0 => 224592)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/JSTests/stress/inline-call-to-recursive-tail-call.js                               (rev 0)
+++ trunk/JSTests/stress/inline-call-to-recursive-tail-call.js  2017-11-08 20:42:15 UTC (rev 224592)
</span><span class="lines">@@ -0,0 +1,94 @@
</span><ins>+"use strict";
+
+// factorial calls aux that tail-calls back into factorial.
+// It is not OK to turn that tail call into a jump back to the top of the function, because the call to aux is not a tail call.
+function factorial(n) {
+    function aux(n) {
+        if (n == 1)
+            return 1;
+        return factorial(n - 1);
+    }
+
+    return n * aux(n);
+}
+
+// Same, but this time with an irrelevant tail call in factorial.
+// This exercises a different code path because the recursive tail call optimization aborts early if the op_enter block is not split, and it is split by the detection of a tail call.
+function factorial2(n) {
+    function aux2(n) {
+        if (n == 1)
+            return 1;
+        return factorial2(n - 1);
+    }
+    function id(n) {
+        return n;
+    }
+
+    return id(n * aux2(n));
+}
+
+// This time, aux is tail-calling itself: the optimization is valid but must jump to the start of aux3, not of factorial3.
+function factorial3(n) {
+    function aux3(n, acc) {
+        if (n == 1)
+            return acc;
+        return aux3 (n-1, n*acc);
+    }
+
+    return n * aux3(n-1, 1);
+}
+
+// In this case, it is valid to jump all the way to the top of factorial4, because all calls are tail calls.
+function factorial4(n, acc) {
+    function aux4(n, acc) {
+        if (n == 1)
+            return acc;
+        return factorial4(n-1, n*acc);
+    }
+
+    if (acc)
+        return aux4(n, acc);
+    return aux4(n, 1);
+}
+
+// This function is used to test that recursing with a different number of arguments doesn't mess up with the stack.
+// The first two tail calls should be optimized, but not the last one (no place on the stack for the third argument).
+function foo(a, b) {
+    if (a == 0)
+        return 42;
+    if (a == 1)
+        return foo(a - 1);
+    if (a == 2)
+        return foo(b - 1, a);
+    return foo (b - 1, a, 43);
+}
+
+// Same deal as foo, just with an inlining thrown into the mix.
+// Even the first tail call should not be optimized in this case, because some code in the compiler may constant-fold the number of arguments in that case.
+function bar(x, y) {
+    function auxBar(a, b) {
+        if (a == 0)
+            return 42;
+        if (a == 1)
+            return auxBar(a - 1);
+        if (a == 2)
+            return auxBar(b - 1, a);
+        return auxBar(b - 1, a, 43);
+    }
+
+    return auxBar(x, y);
+}
+
+function test(result, expected, name) {
+    if (result != expected)
+        throw "Wrong result for " + name + ": " + result + " instead of " + expected;
+}
+
+for (var i = 0; i < 10000; ++i) {
+    test(factorial(20), 2432902008176640000, "factorial");
+    test(factorial2(20), 2432902008176640000, "factorial2");
+    test(factorial3(20), 2432902008176640000, "factorial3");
+    test(factorial4(20), 2432902008176640000, "factorial4");
+    test(foo(10, 10), 42, "foo");
+    test(bar(10, 10), 42, "bar");
+}
</ins></span></pre></div>
<a id="trunkSourceJavaScriptCoreChangeLog"></a>
<div class="modfile"><h4>Modified: trunk/Source/JavaScriptCore/ChangeLog (224591 => 224592)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/Source/JavaScriptCore/ChangeLog    2017-11-08 20:02:58 UTC (rev 224591)
+++ trunk/Source/JavaScriptCore/ChangeLog       2017-11-08 20:42:15 UTC (rev 224592)
</span><span class="lines">@@ -1,3 +1,43 @@
</span><ins>+2017-11-08  Robin Morisset  <rmorisset@apple.com>
+
+        Turn recursive tail calls into loops
+        https://bugs.webkit.org/show_bug.cgi?id=176601
+
+        Reviewed by Saam Barati.
+
+        Relanding after https://bugs.webkit.org/show_bug.cgi?id=178834.
+
+        We want to turn recursive tail calls into loops early in the pipeline, so that the loops can then be optimized.
+        One difficulty is that we need to split the entry block of the function we are jumping to in order to have somewhere to jump to.
+        Worse: it is not necessarily the first block of the codeBlock, because of inlining! So we must do the splitting in the DFGByteCodeParser, at the same time as inlining.
+        We do this part through modifying the computation of the jump targets.
+        Importantly, we only do this splitting for functions that have tail calls.
+        It is the only case where the optimisation is sound, and doing the splitting unconditionnaly destroys performance on Octane/raytrace.
+
+        We must then do the actual transformation also in DFGByteCodeParser, to avoid code motion moving code out of the body of what will become a loop.
+        The transformation is entirely contained in handleRecursiveTailCall, which is hooked to the inlining machinery.
+
+        * bytecode/CodeBlock.h:
+        (JSC::CodeBlock::hasTailCalls const):
+        * bytecode/PreciseJumpTargets.cpp:
+        (JSC::getJumpTargetsForBytecodeOffset):
+        (JSC::computePreciseJumpTargetsInternal):
+        * bytecode/UnlinkedCodeBlock.cpp:
+        (JSC::UnlinkedCodeBlock::UnlinkedCodeBlock):
+        * bytecode/UnlinkedCodeBlock.h:
+        (JSC::UnlinkedCodeBlock::hasTailCalls const):
+        (JSC::UnlinkedCodeBlock::setHasTailCalls):
+        * bytecompiler/BytecodeGenerator.cpp:
+        (JSC::BytecodeGenerator::emitEnter):
+        (JSC::BytecodeGenerator::emitCallInTailPosition):
+        * dfg/DFGByteCodeParser.cpp:
+        (JSC::DFG::ByteCodeParser::allocateTargetableBlock):
+        (JSC::DFG::ByteCodeParser::makeBlockTargetable):
+        (JSC::DFG::ByteCodeParser::handleCall):
+        (JSC::DFG::ByteCodeParser::handleRecursiveTailCall):
+        (JSC::DFG::ByteCodeParser::parseBlock):
+        (JSC::DFG::ByteCodeParser::parse):
+
</ins><span class="cx"> 2017-11-08  Joseph Pecoraro  <pecoraro@apple.com>
</span><span class="cx"> 
</span><span class="cx">         Web Inspector: Remove unused Page.ScriptIdentifier protocol type
</span></span></pre></div>
<a id="trunkSourceJavaScriptCorebytecodeCodeBlockh"></a>
<div class="modfile"><h4>Modified: trunk/Source/JavaScriptCore/bytecode/CodeBlock.h (224591 => 224592)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/Source/JavaScriptCore/bytecode/CodeBlock.h 2017-11-08 20:02:58 UTC (rev 224591)
+++ trunk/Source/JavaScriptCore/bytecode/CodeBlock.h    2017-11-08 20:42:15 UTC (rev 224592)
</span><span class="lines">@@ -736,7 +736,7 @@
</span><span class="cx"> 
</span><span class="cx">     // Call this to cause an optimization trigger to fire soon, but
</span><span class="cx">     // not necessarily the next one. This makes sense if optimization
</span><del>-    // succeeds. Successfuly optimization means that all calls are
</del><ins>+    // succeeds. Successful optimization means that all calls are
</ins><span class="cx">     // relinked to the optimized code, so this only affects call
</span><span class="cx">     // frames that are still executing this CodeBlock. The value here
</span><span class="cx">     // is tuned to strike a balance between the cost of OSR entry
</span><span class="lines">@@ -919,6 +919,8 @@
</span><span class="cx">     std::optional<CodeOrigin> findPC(void* pc);
</span><span class="cx"> #endif
</span><span class="cx"> 
</span><ins>+    bool hasTailCalls() const { return m_unlinkedCode->hasTailCalls(); }
+
</ins><span class="cx"> protected:
</span><span class="cx">     void finalizeLLIntInlineCaches();
</span><span class="cx">     void finalizeBaselineJITInlineCaches();
</span></span></pre></div>
<a id="trunkSourceJavaScriptCorebytecodePreciseJumpTargetscpp"></a>
<div class="modfile"><h4>Modified: trunk/Source/JavaScriptCore/bytecode/PreciseJumpTargets.cpp (224591 => 224592)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/Source/JavaScriptCore/bytecode/PreciseJumpTargets.cpp      2017-11-08 20:02:58 UTC (rev 224591)
+++ trunk/Source/JavaScriptCore/bytecode/PreciseJumpTargets.cpp 2017-11-08 20:42:15 UTC (rev 224592)
</span><span class="lines">@@ -42,6 +42,11 @@
</span><span class="cx">     // op_loop_hint does not have jump target stored in bytecode instructions.
</span><span class="cx">     if (opcodeID == op_loop_hint)
</span><span class="cx">         out.append(bytecodeOffset);
</span><ins>+    else if (opcodeID == op_enter && codeBlock->hasTailCalls()) {
+        // We need to insert a jump after op_enter, so recursive tail calls have somewhere to jump to.
+        // But we only want to pay that price for functions that have at least one tail call.
+        out.append(bytecodeOffset + opcodeLengths[op_enter]);
+    }
</ins><span class="cx"> }
</span><span class="cx"> 
</span><span class="cx"> enum class ComputePreciseJumpTargetsMode {
</span><span class="lines">@@ -53,9 +58,8 @@
</span><span class="cx"> void computePreciseJumpTargetsInternal(Block* codeBlock, Instruction* instructionsBegin, unsigned instructionCount, Vector<unsigned, vectorSize>& out)
</span><span class="cx"> {
</span><span class="cx">     ASSERT(out.isEmpty());
</span><del>-    
-    // We will derive a superset of the jump targets that the code block thinks it has.
-    // So, if the code block claims there are none, then we are done.
</del><ins>+
+    // The code block has a superset of the jump targets. So if it claims to have none, we are done.
</ins><span class="cx">     if (Mode == ComputePreciseJumpTargetsMode::FollowCodeBlockClaim && !codeBlock->numberOfJumpTargets())
</span><span class="cx">         return;
</span><span class="cx">     
</span></span></pre></div>
<a id="trunkSourceJavaScriptCorebytecodeUnlinkedCodeBlockcpp"></a>
<div class="modfile"><h4>Modified: trunk/Source/JavaScriptCore/bytecode/UnlinkedCodeBlock.cpp (224591 => 224592)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/Source/JavaScriptCore/bytecode/UnlinkedCodeBlock.cpp       2017-11-08 20:02:58 UTC (rev 224591)
+++ trunk/Source/JavaScriptCore/bytecode/UnlinkedCodeBlock.cpp  2017-11-08 20:42:15 UTC (rev 224592)
</span><span class="lines">@@ -71,6 +71,7 @@
</span><span class="cx">     , m_constructorKind(static_cast<unsigned>(info.constructorKind()))
</span><span class="cx">     , m_derivedContextType(static_cast<unsigned>(info.derivedContextType()))
</span><span class="cx">     , m_evalContextType(static_cast<unsigned>(info.evalContextType()))
</span><ins>+    , m_hasTailCalls(false)
</ins><span class="cx">     , m_lineCount(0)
</span><span class="cx">     , m_endColumn(UINT_MAX)
</span><span class="cx">     , m_didOptimize(MixedTriState)
</span></span></pre></div>
<a id="trunkSourceJavaScriptCorebytecodeUnlinkedCodeBlockh"></a>
<div class="modfile"><h4>Modified: trunk/Source/JavaScriptCore/bytecode/UnlinkedCodeBlock.h (224591 => 224592)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/Source/JavaScriptCore/bytecode/UnlinkedCodeBlock.h 2017-11-08 20:02:58 UTC (rev 224591)
+++ trunk/Source/JavaScriptCore/bytecode/UnlinkedCodeBlock.h    2017-11-08 20:42:15 UTC (rev 224592)
</span><span class="lines">@@ -131,6 +131,8 @@
</span><span class="cx">     EvalContextType evalContextType() const { return static_cast<EvalContextType>(m_evalContextType); }
</span><span class="cx">     bool isArrowFunctionContext() const { return m_isArrowFunctionContext; }
</span><span class="cx">     bool isClassContext() const { return m_isClassContext; }
</span><ins>+    bool hasTailCalls() const { return m_hasTailCalls; }
+    void setHasTailCalls() { m_hasTailCalls = true; }
</ins><span class="cx"> 
</span><span class="cx">     void addExpressionInfo(unsigned instructionOffset, int divot,
</span><span class="cx">         int startOffset, int endOffset, unsigned line, unsigned column);
</span><span class="lines">@@ -453,6 +455,7 @@
</span><span class="cx">     unsigned m_constructorKind : 2;
</span><span class="cx">     unsigned m_derivedContextType : 2;
</span><span class="cx">     unsigned m_evalContextType : 2;
</span><ins>+    unsigned m_hasTailCalls : 1;
</ins><span class="cx">     unsigned m_lineCount;
</span><span class="cx">     unsigned m_endColumn;
</span><span class="cx"> 
</span></span></pre></div>
<a id="trunkSourceJavaScriptCorebytecompilerBytecodeGeneratorcpp"></a>
<div class="modfile"><h4>Modified: trunk/Source/JavaScriptCore/bytecompiler/BytecodeGenerator.cpp (224591 => 224592)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/Source/JavaScriptCore/bytecompiler/BytecodeGenerator.cpp   2017-11-08 20:02:58 UTC (rev 224591)
+++ trunk/Source/JavaScriptCore/bytecompiler/BytecodeGenerator.cpp      2017-11-08 20:42:15 UTC (rev 224592)
</span><span class="lines">@@ -1313,6 +1313,12 @@
</span><span class="cx"> void BytecodeGenerator::emitEnter()
</span><span class="cx"> {
</span><span class="cx">     emitOpcode(op_enter);
</span><ins>+
+    // We must add the end of op_enter as a potential jump target, because the bytecode parser may decide to split its basic block
+    // to have somewhere to jump to if there is a recursive tail-call that points to this function.
+    m_codeBlock->addJumpTarget(instructions().size());
+    // This disables peephole optimizations when an instruction is a jump target
+    m_lastOpcodeID = op_end;
</ins><span class="cx"> }
</span><span class="cx"> 
</span><span class="cx"> void BytecodeGenerator::emitLoopHint()
</span><span class="lines">@@ -3357,7 +3363,11 @@
</span><span class="cx"> 
</span><span class="cx"> RegisterID* BytecodeGenerator::emitCallInTailPosition(RegisterID* dst, RegisterID* func, ExpectedFunction expectedFunction, CallArguments& callArguments, const JSTextPosition& divot, const JSTextPosition& divotStart, const JSTextPosition& divotEnd, DebuggableCall debuggableCall)
</span><span class="cx"> {
</span><del>-    return emitCall(m_inTailPosition ? op_tail_call : op_call, dst, func, expectedFunction, callArguments, divot, divotStart, divotEnd, debuggableCall);
</del><ins>+    if (m_inTailPosition) {
+        m_codeBlock->setHasTailCalls();
+        return emitCall(op_tail_call, dst, func, expectedFunction, callArguments, divot, divotStart, divotEnd, debuggableCall);
+    }
+    return emitCall(op_call, dst, func, expectedFunction, callArguments, divot, divotStart, divotEnd, debuggableCall);
</ins><span class="cx"> }
</span><span class="cx"> 
</span><span class="cx"> RegisterID* BytecodeGenerator::emitCallEval(RegisterID* dst, RegisterID* func, CallArguments& callArguments, const JSTextPosition& divot, const JSTextPosition& divotStart, const JSTextPosition& divotEnd, DebuggableCall debuggableCall)
</span></span></pre></div>
<a id="trunkSourceJavaScriptCoredfgDFGByteCodeParsercpp"></a>
<div class="modfile"><h4>Modified: trunk/Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp (224591 => 224592)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp    2017-11-08 20:02:58 UTC (rev 224591)
+++ trunk/Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp       2017-11-08 20:42:15 UTC (rev 224592)
</span><span class="lines">@@ -224,6 +224,7 @@
</span><span class="cx">     void emitFunctionChecks(CallVariant, Node* callTarget, VirtualRegister thisArgumnt);
</span><span class="cx">     void emitArgumentPhantoms(int registerOffset, int argumentCountIncludingThis);
</span><span class="cx">     Node* getArgumentCount();
</span><ins>+    bool handleRecursiveTailCall(Node* callTargetNode, const CallLinkStatus&, int registerOffset, VirtualRegister thisArgument, int argumentCountIncludingThis);
</ins><span class="cx">     unsigned inliningCost(CallVariant, int argumentCountIncludingThis, InlineCallFrame::Kind); // Return UINT_MAX if it's not an inlining candidate. By convention, intrinsics have a cost of 1.
</span><span class="cx">     // Handle inlining. Return true if it succeeded, false if we need to plant a call.
</span><span class="cx">     bool handleInlining(Node* callTargetNode, int resultOperand, const CallLinkStatus&, int registerOffset, VirtualRegister thisArgument, VirtualRegister argumentsArgument, unsigned argumentsOffset, int argumentCountIncludingThis, unsigned nextOffset, NodeType callOp, InlineCallFrame::Kind, SpeculatedType prediction);
</span><span class="lines">@@ -231,7 +232,6 @@
</span><span class="cx">     bool attemptToInlineCall(Node* callTargetNode, int resultOperand, CallVariant, int registerOffset, int argumentCountIncludingThis, unsigned nextOffset, InlineCallFrame::Kind, SpeculatedType prediction, unsigned& inliningBalance, BasicBlock* continuationBlock, const ChecksFunctor& insertChecks);
</span><span class="cx">     template<typename ChecksFunctor>
</span><span class="cx">     void inlineCall(Node* callTargetNode, int resultOperand, CallVariant, int registerOffset, int argumentCountIncludingThis, unsigned nextOffset, InlineCallFrame::Kind, BasicBlock* continuationBlock, const ChecksFunctor& insertChecks);
</span><del>-    void cancelLinkingForBlock(InlineStackEntry*, BasicBlock*); // Only works when the given block is the last one to have been added for that inline stack entry.
</del><span class="cx">     // Handle intrinsic functions. Return true if it succeeded, false if we need to plant a call.
</span><span class="cx">     template<typename ChecksFunctor>
</span><span class="cx">     bool handleIntrinsicCall(Node* callee, int resultOperand, Intrinsic, int registerOffset, int argumentCountIncludingThis, SpeculatedType prediction, const ChecksFunctor& insertChecks);
</span><span class="lines">@@ -640,6 +640,8 @@
</span><span class="cx">             flushDirect(inlineStackEntry->remapOperand(virtualRegisterForArgument(argument)));
</span><span class="cx">         if (!inlineStackEntry->m_inlineCallFrame && m_graph.needsFlushedThis())
</span><span class="cx">             flushDirect(virtualRegisterForArgument(0));
</span><ins>+        else
+            phantomLocalDirect(virtualRegisterForArgument(0));
</ins><span class="cx"> 
</span><span class="cx">         if (m_graph.needsScopeRegister())
</span><span class="cx">             flushDirect(m_codeBlock->scopeRegister());
</span><span class="lines">@@ -1064,7 +1066,7 @@
</span><span class="cx">     }
</span><span class="cx"> 
</span><span class="cx">     bool needsDynamicLookup(ResolveType, OpcodeID);
</span><del>-    
</del><ins>+
</ins><span class="cx">     VM* m_vm;
</span><span class="cx">     CodeBlock* m_codeBlock;
</span><span class="cx">     CodeBlock* m_profiledBlock;
</span><span class="lines">@@ -1127,9 +1129,6 @@
</span><span class="cx">         // cannot have two blocks that have the same bytecodeBegin.
</span><span class="cx">         Vector<BasicBlock*> m_blockLinkingTargets;
</span><span class="cx"> 
</span><del>-        // This is set by op_enter in parseBlock(), and indicates the first block of the function.
-        BasicBlock* m_entryBlock;
-
</del><span class="cx">         // Optional: a continuation block for returns to jump to. It is set by early returns if it does not exist.
</span><span class="cx">         BasicBlock* m_continuationBlock;
</span><span class="cx"> 
</span><span class="lines">@@ -1217,6 +1216,9 @@
</span><span class="cx">     ASSERT(bytecodeIndex != UINT_MAX);
</span><span class="cx">     Ref<BasicBlock> block = adoptRef(*new BasicBlock(bytecodeIndex, m_numArguments, m_numLocals, 1));
</span><span class="cx">     BasicBlock* blockPtr = block.ptr();
</span><ins>+    // m_blockLinkingTargets must always be sorted in increasing order of bytecodeBegin
+    if (m_inlineStackTop->m_blockLinkingTargets.size())
+        ASSERT(m_inlineStackTop->m_blockLinkingTargets.last()->bytecodeBegin < bytecodeIndex);
</ins><span class="cx">     m_inlineStackTop->m_blockLinkingTargets.append(blockPtr);
</span><span class="cx">     m_graph.appendBlock(WTFMove(block));
</span><span class="cx">     return blockPtr;
</span><span class="lines">@@ -1234,6 +1236,9 @@
</span><span class="cx"> {
</span><span class="cx">     ASSERT(block->bytecodeBegin == UINT_MAX);
</span><span class="cx">     block->bytecodeBegin = bytecodeIndex;
</span><ins>+    // m_blockLinkingTargets must always be sorted in increasing order of bytecodeBegin
+    if (m_inlineStackTop->m_blockLinkingTargets.size())
+        ASSERT(m_inlineStackTop->m_blockLinkingTargets.last()->bytecodeBegin < bytecodeIndex);
</ins><span class="cx">     m_inlineStackTop->m_blockLinkingTargets.append(block);
</span><span class="cx"> }
</span><span class="cx"> 
</span><span class="lines">@@ -1295,6 +1300,9 @@
</span><span class="cx">     if (callLinkStatus.canOptimize()) {
</span><span class="cx">         VirtualRegister thisArgument = virtualRegisterForArgument(0, registerOffset);
</span><span class="cx"> 
</span><ins>+        if (op == TailCall && handleRecursiveTailCall(callTarget, callLinkStatus, registerOffset, thisArgument, argumentCountIncludingThis))
+            return Terminal;
+
</ins><span class="cx">         // Inlining is quite complex, and managed by a pipeline of functions:
</span><span class="cx">         // handle(Varargs)Call -> handleInlining -> attemptToInlineCall -> inlineCall
</span><span class="cx">         // - handleCall and handleVarargsCall deal with the case where no inlining happens, and do some sanity checks on their arguments
</span><span class="lines">@@ -1412,6 +1420,75 @@
</span><span class="cx">         addToGraph(Phantom, get(virtualRegisterForArgument(i, registerOffset)));
</span><span class="cx"> }
</span><span class="cx"> 
</span><ins>+bool ByteCodeParser::handleRecursiveTailCall(Node* callTargetNode, const CallLinkStatus& callLinkStatus, int registerOffset, VirtualRegister thisArgument, int argumentCountIncludingThis)
+{
+    // FIXME: We currently only do this optimisation in the simple, non-polymorphic case.
+    // https://bugs.webkit.org/show_bug.cgi?id=178390
+    if (callLinkStatus.couldTakeSlowPath() || callLinkStatus.size() != 1)
+        return false;
+
+    auto targetExecutable = callLinkStatus[0].executable();
+    InlineStackEntry* stackEntry = m_inlineStackTop;
+    do {
+        if (targetExecutable != stackEntry->executable())
+            continue;
+        VERBOSE_LOG("   We found a recursive tail call, trying to optimize it into a jump.\n");
+
+        if (auto* callFrame = stackEntry->m_inlineCallFrame) {
+            // Some code may statically use the argument count from the InlineCallFrame, so it would be invalid to loop back if it does not match.
+            // We "continue" instead of returning false in case another stack entry further on the stack has the right number of arguments.
+            if (argumentCountIncludingThis != static_cast<int>(callFrame->argumentCountIncludingThis))
+                continue;
+        } else {
+            // We are in the machine code entry (i.e. the original caller).
+            // If we have more arguments than the number of parameters to the function, it is not clear where we could put them on the stack.
+            if (argumentCountIncludingThis > m_codeBlock->numParameters())
+                return false;
+        }
+
+        // We must add some check that the profiling information was correct and the target of this call is what we thought
+        emitFunctionChecks(callLinkStatus[0], callTargetNode, thisArgument);
+
+        // We must set the arguments to the right values
+        int argIndex = 0;
+        for (; argIndex < argumentCountIncludingThis; ++argIndex) {
+            Node* value = get(virtualRegisterForArgument(argIndex, registerOffset));
+            setDirect(stackEntry->remapOperand(virtualRegisterForArgument(argIndex)), value, NormalSet);
+        }
+        Node* undefined = addToGraph(JSConstant, OpInfo(m_constantUndefined));
+        for (; argIndex < stackEntry->m_codeBlock->numParameters(); ++argIndex)
+            setDirect(stackEntry->remapOperand(virtualRegisterForArgument(argIndex)), undefined, NormalSet);
+
+        // We must repeat the work of op_enter here as we will jump right after it.
+        // We jump right after it and not before it, because of some invariant saying that a CFG root cannot have predecessors in the IR.
+        for (int i = 0; i < stackEntry->m_codeBlock->m_numVars; ++i)
+            setDirect(stackEntry->remapOperand(virtualRegisterForLocal(i)), undefined, NormalSet);
+
+        // We want to emit the SetLocals with an exit origin that points to the place we are jumping to.
+        unsigned oldIndex = m_currentIndex;
+        auto oldStackTop = m_inlineStackTop;
+        m_inlineStackTop = stackEntry;
+        m_currentIndex = OPCODE_LENGTH(op_enter);
+        m_exitOK = true;
+        processSetLocalQueue();
+        m_currentIndex = oldIndex;
+        m_inlineStackTop = oldStackTop;
+        m_exitOK = false;
+
+        // We flush everything, as if we were in the backedge of a loop (see treatment of op_jmp in parseBlock).
+        flushForTerminal();
+
+        BasicBlock** entryBlockPtr = tryBinarySearch<BasicBlock*, unsigned>(stackEntry->m_blockLinkingTargets, stackEntry->m_blockLinkingTargets.size(), OPCODE_LENGTH(op_enter), getBytecodeBeginForBlock);
+        RELEASE_ASSERT(entryBlockPtr);
+        addJumpTo(*entryBlockPtr);
+        return true;
+        // It would be unsound to jump over a non-tail call: the "tail" call is not really a tail call in that case.
+    } while (stackEntry->m_inlineCallFrame && stackEntry->m_inlineCallFrame->kind == InlineCallFrame::TailCall && (stackEntry = stackEntry->m_caller));
+
+    // The tail call was not recursive
+    return false;
+}
+
</ins><span class="cx"> unsigned ByteCodeParser::inliningCost(CallVariant callee, int argumentCountIncludingThis, InlineCallFrame::Kind kind)
</span><span class="cx"> {
</span><span class="cx">     CallMode callMode = InlineCallFrame::callModeFor(kind);
</span><span class="lines">@@ -4232,8 +4309,6 @@
</span><span class="cx">             for (int i = 0; i < m_inlineStackTop->m_codeBlock->m_numVars; ++i)
</span><span class="cx">                 set(virtualRegisterForLocal(i), undefined, ImmediateNakedSet);
</span><span class="cx"> 
</span><del>-            m_inlineStackTop->m_entryBlock = m_currentBlock;
-
</del><span class="cx">             NEXT_OPCODE(op_enter);
</span><span class="cx">         }
</span><span class="cx">             
</span><span class="lines">@@ -5420,7 +5495,8 @@
</span><span class="cx">             if (terminality == NonTerminal)
</span><span class="cx">                 NEXT_OPCODE(op_tail_call);
</span><span class="cx">             else
</span><del>-                LAST_OPCODE(op_tail_call);
</del><ins>+                LAST_OPCODE_LINKED(op_tail_call);
+            // We use LAST_OPCODE_LINKED instead of LAST_OPCODE because if the tail call was optimized, it may now be a jump to a bytecode index in a different InlineStackEntry.
</ins><span class="cx">         }
</span><span class="cx"> 
</span><span class="cx">         case op_construct:
</span><span class="lines">@@ -6456,8 +6532,6 @@
</span><span class="cx">     linkBlocks(inlineStackEntry.m_unlinkedBlocks, inlineStackEntry.m_blockLinkingTargets);
</span><span class="cx"> 
</span><span class="cx">     m_graph.determineReachability();
</span><del>-    if (Options::verboseDFGBytecodeParsing())
-        m_graph.dump();
</del><span class="cx">     m_graph.killUnreachableBlocks();
</span><span class="cx"> 
</span><span class="cx">     for (BlockIndex blockIndex = m_graph.numBlocks(); blockIndex--;) {
</span></span></pre></div>
<a id="trunkSourceJavaScriptCoreruntimeExceptionHelperscpp"></a>
<div class="modfile"><h4>Modified: trunk/Source/JavaScriptCore/runtime/ExceptionHelpers.cpp (224591 => 224592)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/Source/JavaScriptCore/runtime/ExceptionHelpers.cpp 2017-11-08 20:02:58 UTC (rev 224591)
+++ trunk/Source/JavaScriptCore/runtime/ExceptionHelpers.cpp    2017-11-08 20:42:15 UTC (rev 224592)
</span><span class="lines">@@ -272,6 +272,7 @@
</span><span class="cx">     scope.assertNoException();
</span><span class="cx">     JSObject* exception = createTypeError(exec, errorMessage, appender, runtimeTypeForValue(value));
</span><span class="cx">     ASSERT(exception->isErrorInstance());
</span><ins>+
</ins><span class="cx">     return exception;
</span><span class="cx"> }
</span><span class="cx"> 
</span></span></pre>
</div>
</div>

</body>
</html>