<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
"http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head><meta http-equiv="content-type" content="text/html; charset=utf-8" />
<title>[194334] trunk/Source/JavaScriptCore</title>
</head>
<body>

<style type="text/css"><!--
#msg dl.meta { border: 1px #006 solid; background: #369; padding: 6px; color: #fff; }
#msg dl.meta dt { float: left; width: 6em; font-weight: bold; }
#msg dt:after { content:':';}
#msg dl, #msg dt, #msg ul, #msg li, #header, #footer, #logmsg { font-family: verdana,arial,helvetica,sans-serif; font-size: 10pt;  }
#msg dl a { font-weight: bold}
#msg dl a:link    { color:#fc3; }
#msg dl a:active  { color:#ff0; }
#msg dl a:visited { color:#cc6; }
h3 { font-family: verdana,arial,helvetica,sans-serif; font-size: 10pt; font-weight: bold; }
#msg pre { overflow: auto; background: #ffc; border: 1px #fa0 solid; padding: 6px; }
#logmsg { background: #ffc; border: 1px #fa0 solid; padding: 1em 1em 0 1em; }
#logmsg p, #logmsg pre, #logmsg blockquote { margin: 0 0 1em 0; }
#logmsg p, #logmsg li, #logmsg dt, #logmsg dd { line-height: 14pt; }
#logmsg h1, #logmsg h2, #logmsg h3, #logmsg h4, #logmsg h5, #logmsg h6 { margin: .5em 0; }
#logmsg h1:first-child, #logmsg h2:first-child, #logmsg h3:first-child, #logmsg h4:first-child, #logmsg h5:first-child, #logmsg h6:first-child { margin-top: 0; }
#logmsg ul, #logmsg ol { padding: 0; list-style-position: inside; margin: 0 0 0 1em; }
#logmsg ul { text-indent: -1em; padding-left: 1em; }#logmsg ol { text-indent: -1.5em; padding-left: 1.5em; }
#logmsg > ul, #logmsg > ol { margin: 0 0 1em 0; }
#logmsg pre { background: #eee; padding: 1em; }
#logmsg blockquote { border: 1px solid #fa0; border-left-width: 10px; padding: 1em 1em 0 1em; background: white;}
#logmsg dl { margin: 0; }
#logmsg dt { font-weight: bold; }
#logmsg dd { margin: 0; padding: 0 0 0.5em 0; }
#logmsg dd:before { content:'\00bb';}
#logmsg table { border-spacing: 0px; border-collapse: collapse; border-top: 4px solid #fa0; border-bottom: 1px solid #fa0; background: #fff; }
#logmsg table th { text-align: left; font-weight: normal; padding: 0.2em 0.5em; border-top: 1px dotted #fa0; }
#logmsg table td { text-align: right; border-top: 1px dotted #fa0; padding: 0.2em 0.5em; }
#logmsg table thead th { text-align: center; border-bottom: 1px solid #fa0; }
#logmsg table th.Corner { text-align: left; }
#logmsg hr { border: none 0; border-top: 2px dashed #fa0; height: 1px; }
#header, #footer { color: #fff; background: #636; border: 1px #300 solid; padding: 6px; }
#patch { width: 100%; }
#patch h4 {font-family: verdana,arial,helvetica,sans-serif;font-size:10pt;padding:8px;background:#369;color:#fff;margin:0;}
#patch .propset h4, #patch .binary h4 {margin:0;}
#patch pre {padding:0;line-height:1.2em;margin:0;}
#patch .diff {width:100%;background:#eee;padding: 0 0 10px 0;overflow:auto;}
#patch .propset .diff, #patch .binary .diff  {padding:10px 0;}
#patch span {display:block;padding:0 10px;}
#patch .modfile, #patch .addfile, #patch .delfile, #patch .propset, #patch .binary, #patch .copfile {border:1px solid #ccc;margin:10px 0;}
#patch ins {background:#dfd;text-decoration:none;display:block;padding:0 10px;}
#patch del {background:#fdd;text-decoration:none;display:block;padding:0 10px;}
#patch .lines, .info {color:#888;background:#fff;}
--></style>
<div id="msg">
<dl class="meta">
<dt>Revision</dt> <dd><a href="http://trac.webkit.org/projects/webkit/changeset/194334">194334</a></dd>
<dt>Author</dt> <dd>fpizlo@apple.com</dd>
<dt>Date</dt> <dd>2015-12-21 10:56:54 -0800 (Mon, 21 Dec 2015)</dd>
</dl>

<h3>Log Message</h3>
<pre>FTL B3 should do vararg calls
https://bugs.webkit.org/show_bug.cgi?id=152468

Reviewed by Benjamin Poulain.

This adds FTL-&gt;B3 lowering of all kinds of varargs calls - forwarding or not, tail or not,
and construct or not. Like all other such lowerings, all of the code is in one place in
FTLLower.

I removed code for varargs and exception spill slots from the B3 path, since it won't need
it. The plan is to rely on B3 doing the spilling for us by using some combination of early
clobber and late use.

This adds ValueRep::emitRestore(), a helpful method for emitting code to restore any ValueRep
into any 64-bit Reg (FPR or GPR).

I wrote new tests for vararg calls, because I wasn't sure which of the existing ones we can
run. These are short-running tests, so I'm not worried about bloating our test suite.

* b3/B3ValueRep.cpp:
(JSC::B3::ValueRep::dump):
(JSC::B3::ValueRep::emitRestore):
* b3/B3ValueRep.h:
* ftl/FTLLowerDFGToLLVM.cpp:
(JSC::FTL::DFG::LowerDFGToLLVM::lower):
(JSC::FTL::DFG::LowerDFGToLLVM::compileCallOrConstructVarargs):
(JSC::FTL::DFG::LowerDFGToLLVM::compileInvalidationPoint):
* ftl/FTLState.h:
* tests/stress/varargs-no-forward.js: Added.
* tests/stress/varargs-simple.js: Added.
* tests/stress/varargs-two-level.js: Added.</pre>

<h3>Modified Paths</h3>
<ul>
<li><a href="#trunkSourceJavaScriptCoreChangeLog">trunk/Source/JavaScriptCore/ChangeLog</a></li>
<li><a href="#trunkSourceJavaScriptCoreb3B3ValueRepcpp">trunk/Source/JavaScriptCore/b3/B3ValueRep.cpp</a></li>
<li><a href="#trunkSourceJavaScriptCoreb3B3ValueReph">trunk/Source/JavaScriptCore/b3/B3ValueRep.h</a></li>
<li><a href="#trunkSourceJavaScriptCoreftlFTLLowerDFGToLLVMcpp">trunk/Source/JavaScriptCore/ftl/FTLLowerDFGToLLVM.cpp</a></li>
<li><a href="#trunkSourceJavaScriptCoreftlFTLStateh">trunk/Source/JavaScriptCore/ftl/FTLState.h</a></li>
</ul>

<h3>Added Paths</h3>
<ul>
<li><a href="#trunkSourceJavaScriptCoretestsstressvarargsnoforwardjs">trunk/Source/JavaScriptCore/tests/stress/varargs-no-forward.js</a></li>
<li><a href="#trunkSourceJavaScriptCoretestsstressvarargssimplejs">trunk/Source/JavaScriptCore/tests/stress/varargs-simple.js</a></li>
<li><a href="#trunkSourceJavaScriptCoretestsstressvarargstwoleveljs">trunk/Source/JavaScriptCore/tests/stress/varargs-two-level.js</a></li>
</ul>

</div>
<div id="patch">
<h3>Diff</h3>
<a id="trunkSourceJavaScriptCoreChangeLog"></a>
<div class="modfile"><h4>Modified: trunk/Source/JavaScriptCore/ChangeLog (194333 => 194334)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/Source/JavaScriptCore/ChangeLog        2015-12-21 18:40:10 UTC (rev 194333)
+++ trunk/Source/JavaScriptCore/ChangeLog        2015-12-21 18:56:54 UTC (rev 194334)
</span><span class="lines">@@ -1,3 +1,37 @@
</span><ins>+2015-12-21  Filip Pizlo  &lt;fpizlo@apple.com&gt;
+
+        FTL B3 should do vararg calls
+        https://bugs.webkit.org/show_bug.cgi?id=152468
+
+        Reviewed by Benjamin Poulain.
+
+        This adds FTL-&gt;B3 lowering of all kinds of varargs calls - forwarding or not, tail or not,
+        and construct or not. Like all other such lowerings, all of the code is in one place in
+        FTLLower.
+
+        I removed code for varargs and exception spill slots from the B3 path, since it won't need
+        it. The plan is to rely on B3 doing the spilling for us by using some combination of early
+        clobber and late use.
+
+        This adds ValueRep::emitRestore(), a helpful method for emitting code to restore any ValueRep
+        into any 64-bit Reg (FPR or GPR).
+
+        I wrote new tests for vararg calls, because I wasn't sure which of the existing ones we can
+        run. These are short-running tests, so I'm not worried about bloating our test suite.
+
+        * b3/B3ValueRep.cpp:
+        (JSC::B3::ValueRep::dump):
+        (JSC::B3::ValueRep::emitRestore):
+        * b3/B3ValueRep.h:
+        * ftl/FTLLowerDFGToLLVM.cpp:
+        (JSC::FTL::DFG::LowerDFGToLLVM::lower):
+        (JSC::FTL::DFG::LowerDFGToLLVM::compileCallOrConstructVarargs):
+        (JSC::FTL::DFG::LowerDFGToLLVM::compileInvalidationPoint):
+        * ftl/FTLState.h:
+        * tests/stress/varargs-no-forward.js: Added.
+        * tests/stress/varargs-simple.js: Added.
+        * tests/stress/varargs-two-level.js: Added.
+
</ins><span class="cx"> 2015-12-18  Mark Lam  &lt;mark.lam@apple.com&gt;
</span><span class="cx"> 
</span><span class="cx">         Add unary operator tests to compare JIT and LLINT results.
</span></span></pre></div>
<a id="trunkSourceJavaScriptCoreb3B3ValueRepcpp"></a>
<div class="modfile"><h4>Modified: trunk/Source/JavaScriptCore/b3/B3ValueRep.cpp (194333 => 194334)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/Source/JavaScriptCore/b3/B3ValueRep.cpp        2015-12-21 18:40:10 UTC (rev 194333)
+++ trunk/Source/JavaScriptCore/b3/B3ValueRep.cpp        2015-12-21 18:56:54 UTC (rev 194334)
</span><span class="lines">@@ -28,6 +28,8 @@
</span><span class="cx"> 
</span><span class="cx"> #if ENABLE(B3_JIT)
</span><span class="cx"> 
</span><ins>+#include &quot;AssemblyHelpers.h&quot;
+
</ins><span class="cx"> namespace JSC { namespace B3 {
</span><span class="cx"> 
</span><span class="cx"> void ValueRep::dump(PrintStream&amp; out) const
</span><span class="lines">@@ -55,6 +57,49 @@
</span><span class="cx">     RELEASE_ASSERT_NOT_REACHED();
</span><span class="cx"> }
</span><span class="cx"> 
</span><ins>+void ValueRep::emitRestore(AssemblyHelpers&amp; jit, Reg reg)
+{
+    if (reg.isGPR()) {
+        switch (kind()) {
+        case Register:
+            if (isGPR())
+                jit.move(gpr(), reg.gpr());
+            else
+                jit.moveDoubleTo64(fpr(), reg.gpr());
+            break;
+        case Stack:
+            jit.load64(AssemblyHelpers::Address(GPRInfo::callFrameRegister, offsetFromFP()), reg.gpr());
+            break;
+        case Constant:
+            jit.move(AssemblyHelpers::TrustedImm64(value()), reg.gpr());
+            break;
+        default:
+            RELEASE_ASSERT_NOT_REACHED();
+            break;
+        }
+        return;
+    }
+    
+    switch (kind()) {
+    case Register:
+        if (isGPR())
+            jit.move64ToDouble(gpr(), reg.fpr());
+        else
+            jit.moveDouble(fpr(), reg.fpr());
+        break;
+    case Stack:
+        jit.loadDouble(AssemblyHelpers::Address(GPRInfo::callFrameRegister, offsetFromFP()), reg.fpr());
+        break;
+    case Constant:
+        jit.move(AssemblyHelpers::TrustedImm64(value()), jit.scratchRegister());
+        jit.move64ToDouble(jit.scratchRegister(), reg.fpr());
+        break;
+    default:
+        RELEASE_ASSERT_NOT_REACHED();
+        break;
+    }
+}
+
</ins><span class="cx"> } } // namespace JSC::B3
</span><span class="cx"> 
</span><span class="cx"> namespace WTF {
</span></span></pre></div>
<a id="trunkSourceJavaScriptCoreb3B3ValueReph"></a>
<div class="modfile"><h4>Modified: trunk/Source/JavaScriptCore/b3/B3ValueRep.h (194333 => 194334)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/Source/JavaScriptCore/b3/B3ValueRep.h        2015-12-21 18:40:10 UTC (rev 194333)
+++ trunk/Source/JavaScriptCore/b3/B3ValueRep.h        2015-12-21 18:56:54 UTC (rev 194334)
</span><span class="lines">@@ -34,8 +34,12 @@
</span><span class="cx"> #include &quot;Reg.h&quot;
</span><span class="cx"> #include &lt;wtf/PrintStream.h&gt;
</span><span class="cx"> 
</span><del>-namespace JSC { namespace B3 {
</del><ins>+namespace JSC {
</ins><span class="cx"> 
</span><ins>+class AssemblyHelpers;
+
+namespace B3 {
+
</ins><span class="cx"> // We use this class to describe value representations at stackmaps. It's used both to force a
</span><span class="cx"> // representation and to get the representation. When the B3 client forces a representation, we say
</span><span class="cx"> // that it's an input. When B3 tells the client what representation it picked, we say that it's an
</span><span class="lines">@@ -216,6 +220,10 @@
</span><span class="cx"> 
</span><span class="cx">     JS_EXPORT_PRIVATE void dump(PrintStream&amp;) const;
</span><span class="cx"> 
</span><ins>+    // This has a simple contract: it emits code to restore the value into the given register. This
+    // will work even if it requires moving bits between a GPR and an FPR.
+    void emitRestore(AssemblyHelpers&amp;, Reg);
+
</ins><span class="cx"> private:
</span><span class="cx">     Kind m_kind;
</span><span class="cx">     union U {
</span></span></pre></div>
<a id="trunkSourceJavaScriptCoreftlFTLLowerDFGToLLVMcpp"></a>
<div class="modfile"><h4>Modified: trunk/Source/JavaScriptCore/ftl/FTLLowerDFGToLLVM.cpp (194333 => 194334)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/Source/JavaScriptCore/ftl/FTLLowerDFGToLLVM.cpp        2015-12-21 18:40:10 UTC (rev 194333)
+++ trunk/Source/JavaScriptCore/ftl/FTLLowerDFGToLLVM.cpp        2015-12-21 18:56:54 UTC (rev 194334)
</span><span class="lines">@@ -58,6 +58,7 @@
</span><span class="cx"> #include &quot;ScopedArguments.h&quot;
</span><span class="cx"> #include &quot;ScopedArgumentsTable.h&quot;
</span><span class="cx"> #include &quot;ScratchRegisterAllocator.h&quot;
</span><ins>+#include &quot;SetupVarargsFrame.h&quot;
</ins><span class="cx"> #include &quot;VirtualRegister.h&quot;
</span><span class="cx"> #include &quot;Watchdog.h&quot;
</span><span class="cx"> #include &lt;atomic&gt;
</span><span class="lines">@@ -205,8 +206,8 @@
</span><span class="cx"> #endif // FTL_USE_B3
</span><span class="cx"> 
</span><span class="cx">         auto preOrder = m_graph.blocksInPreOrder();
</span><del>-        
-        // If we have any CallVarargs then we need to have a spill slot for it.
</del><ins>+
+#if !FTL_USES_B3
</ins><span class="cx">         bool hasVarargs = false;
</span><span class="cx">         size_t maxNumberOfCatchSpills = 0;
</span><span class="cx">         for (DFG::BasicBlock* block : preOrder) {
</span><span class="lines">@@ -270,10 +271,8 @@
</span><span class="cx">             }
</span><span class="cx">         }
</span><span class="cx"> 
</span><del>-#if FTL_USES_B3
-        UNUSED_PARAM(hasVarargs);
-        // FIXME
-#else
</del><ins>+        // B3 doesn't need the varargs spill slot because we just use call arg area size as a way to
+        // request spill slots.
</ins><span class="cx">         if (hasVarargs) {
</span><span class="cx">             LValue varargsSpillSlots = m_out.alloca(
</span><span class="cx">                 arrayType(m_out.int64, JSCallVarargs::numSpillSlotsNeeded()));
</span><span class="lines">@@ -284,6 +283,7 @@
</span><span class="cx">                 m_out.int32Zero, varargsSpillSlots);
</span><span class="cx">         }
</span><span class="cx"> 
</span><ins>+        // B3 doesn't need the exception spill slot because we just use the 
</ins><span class="cx">         if (m_graph.m_hasExceptionHandlers &amp;&amp; maxNumberOfCatchSpills) {
</span><span class="cx">             RegisterSet volatileRegisters = RegisterSet::volatileRegistersForJSCall();
</span><span class="cx">             maxNumberOfCatchSpills = std::min(volatileRegisters.numberOfSetRegisters(), maxNumberOfCatchSpills);
</span><span class="lines">@@ -296,7 +296,7 @@
</span><span class="cx">                 m_out.constInt64(m_ftlState.exceptionHandlingSpillSlotStackmapID),
</span><span class="cx">                 m_out.int32Zero, exceptionHandlingVolatileRegistersSpillSlots);
</span><span class="cx">         }
</span><del>-#endif
</del><ins>+#endif // !FTL_USES_B3
</ins><span class="cx">         
</span><span class="cx">         // We should not create any alloca's after this point, since they will cease to
</span><span class="cx">         // be mem2reg candidates.
</span><span class="lines">@@ -5076,32 +5076,263 @@
</span><span class="cx">     
</span><span class="cx">     void compileCallOrConstructVarargs()
</span><span class="cx">     {
</span><del>-#if FTL_USES_B3
-        if (verboseCompilationEnabled() || !verboseCompilationEnabled())
-            CRASH();
-#else
</del><ins>+        Node* node = m_node;
</ins><span class="cx">         LValue jsCallee = lowJSValue(m_node-&gt;child1());
</span><span class="cx">         LValue thisArg = lowJSValue(m_node-&gt;child3());
</span><span class="cx">         
</span><span class="cx">         LValue jsArguments = nullptr;
</span><ins>+        bool forwarding = false;
</ins><span class="cx">         
</span><del>-        switch (m_node-&gt;op()) {
</del><ins>+        switch (node-&gt;op()) {
</ins><span class="cx">         case CallVarargs:
</span><span class="cx">         case TailCallVarargs:
</span><span class="cx">         case TailCallVarargsInlinedCaller:
</span><span class="cx">         case ConstructVarargs:
</span><del>-            jsArguments = lowJSValue(m_node-&gt;child2());
</del><ins>+            jsArguments = lowJSValue(node-&gt;child2());
</ins><span class="cx">             break;
</span><span class="cx">         case CallForwardVarargs:
</span><span class="cx">         case TailCallForwardVarargs:
</span><span class="cx">         case TailCallForwardVarargsInlinedCaller:
</span><span class="cx">         case ConstructForwardVarargs:
</span><ins>+            forwarding = true;
</ins><span class="cx">             break;
</span><span class="cx">         default:
</span><del>-            DFG_CRASH(m_graph, m_node, &quot;bad node type&quot;);
</del><ins>+            DFG_CRASH(m_graph, node, &quot;bad node type&quot;);
</ins><span class="cx">             break;
</span><span class="cx">         }
</span><span class="cx">         
</span><ins>+#if FTL_USES_B3
+        // FIXME: Need a story for exceptions.
+        // https://bugs.webkit.org/show_bug.cgi?id=151686
+
+        PatchpointValue* patchpoint = m_out.patchpoint(Int64);
+
+        // Append the forms of the arguments that we will use before any clobbering happens.
+        patchpoint-&gt;append(jsCallee, ValueRep::reg(GPRInfo::regT0));
+        if (jsArguments)
+            patchpoint-&gt;append(jsArguments, ValueRep::SomeRegister);
+        patchpoint-&gt;append(thisArg, ValueRep::SomeRegister);
+
+        if (!forwarding) {
+            // Now append them again for after clobbering. Note that the compiler may ask us to use a
+            // different register for the late, post-clobbering version of the value. This gives
+            // the compiler a chance to spill these values without having to burn any callee-saves.
+            patchpoint-&gt;append(jsCallee, ValueRep::LateColdAny);
+            patchpoint-&gt;append(jsArguments, ValueRep::LateColdAny);
+            patchpoint-&gt;append(thisArg, ValueRep::LateColdAny);
+        }
+
+        patchpoint-&gt;clobber(RegisterSet::macroScratchRegisters());
+        patchpoint-&gt;clobberLate(RegisterSet::volatileRegistersForJSCall());
+        patchpoint-&gt;resultConstraint = ValueRep::reg(GPRInfo::returnValueGPR);
+
+        // This is the minimum amount of call arg area stack space that all JS-&gt;JS calls always have.
+        unsigned minimumJSCallAreaSize =
+            sizeof(CallerFrameAndPC) +
+            WTF::roundUpToMultipleOf(stackAlignmentBytes(), 5 * sizeof(EncodedJSValue));
+
+        m_proc.requestCallArgAreaSize(minimumJSCallAreaSize);
+        
+        CodeOrigin codeOrigin = codeOriginDescriptionOfCallSite();
+        State* state = &amp;m_ftlState;
+        patchpoint-&gt;setGenerator(
+            [=] (CCallHelpers&amp; jit, const StackmapGenerationParams&amp; params) {
+                AllowMacroScratchRegisterUsage allowScratch(jit);
+                CallSiteIndex callSiteIndex =
+                    state-&gt;jitCode-&gt;common.addUniqueCallSiteIndex(codeOrigin);
+
+                // FIXME: We would ask the OSR exit descriptor to prepare and then we would modify
+                // the OSRExit data structure inside the OSRExitHandle to link it up to this call.
+                // Also, the exception checks JumpList should be linked to somewhere.
+                // https://bugs.webkit.org/show_bug.cgi?id=151686
+                CCallHelpers::JumpList exceptions;
+
+                jit.store32(
+                    CCallHelpers::TrustedImm32(callSiteIndex.bits()),
+                    CCallHelpers::tagFor(VirtualRegister(JSStack::ArgumentCount)));
+
+                CallLinkInfo* callLinkInfo = jit.codeBlock()-&gt;addCallLinkInfo();
+                CallVarargsData* data = node-&gt;callVarargsData();
+
+                unsigned argIndex = 1;
+                GPRReg calleeGPR = params[argIndex++].gpr();
+                ASSERT(calleeGPR == GPRInfo::regT0);
+                GPRReg argumentsGPR = jsArguments ? params[argIndex++].gpr() : InvalidGPRReg;
+                GPRReg thisGPR = params[argIndex++].gpr();
+
+                B3::ValueRep calleeLateRep;
+                B3::ValueRep argumentsLateRep;
+                B3::ValueRep thisLateRep;
+                if (!forwarding) {
+                    // If we're not forwarding then we'll need callee, arguments, and this after we
+                    // have potentially clobbered calleeGPR, argumentsGPR, and thisGPR. Our technique
+                    // for this is to supply all of those operands as late uses in addition to
+                    // specifying them as early uses. It's possible that the late use uses a spill
+                    // while the early use uses a register, and it's possible for the late and early
+                    // uses to use different registers. We do know that the late uses interfere with
+                    // all volatile registers and so won't use those, but the early uses may use
+                    // volatile registers and in the case of calleeGPR, it's pinned to regT0 so it
+                    // definitely will.
+                    //
+                    // Note that we have to be super careful with these. It's possible that these
+                    // use a shuffling of the registers used for calleeGPR, argumentsGPR, and
+                    // thisGPR. If that happens and we do for example:
+                    //
+                    //     calleeLateRep.emitRestore(jit, calleeGPR);
+                    //     argumentsLateRep.emitRestore(jit, argumentsGPR);
+                    //
+                    // Then we might end up with garbage if calleeLateRep.gpr() == argumentsGPR and
+                    // argumentsLateRep.gpr() == calleeGPR.
+                    //
+                    // We do a variety of things to prevent this from happening. For example, we use
+                    // argumentsLateRep before needing the other two and after we've already stopped
+                    // using the *GPRs. Also, we pin calleeGPR to regT0, and rely on the fact that
+                    // the *LateReps cannot use volatile registers (so they cannot be regT0, so
+                    // calleeGPR != argumentsLateRep.gpr() and calleeGPR != thisLateRep.gpr()).
+                    //
+                    // An alternative would have been to just use early uses and early-clobber all
+                    // volatile registers. But that would force callee, arguments, and this into
+                    // callee-save registers even if we have to spill them. We don't want spilling to
+                    // use up three callee-saves.
+                    //
+                    // TL;DR: The way we use LateReps here is dangerous and barely works but achieves
+                    // some desirable performance properties, so don't mistake the cleverness for
+                    // elegance.
+                    calleeLateRep = params[argIndex++];
+                    argumentsLateRep = params[argIndex++];
+                    thisLateRep = params[argIndex++];
+                }
+
+                // Get some scratch registers.
+                RegisterSet usedRegisters;
+                usedRegisters.merge(RegisterSet::stackRegisters());
+                usedRegisters.merge(RegisterSet::reservedHardwareRegisters());
+                usedRegisters.merge(RegisterSet::calleeSaveRegisters());
+                usedRegisters.set(calleeGPR);
+                if (argumentsGPR != InvalidGPRReg)
+                    usedRegisters.set(argumentsGPR);
+                usedRegisters.set(thisGPR);
+                if (calleeLateRep.isReg())
+                    usedRegisters.set(calleeLateRep.reg());
+                if (argumentsLateRep.isReg())
+                    usedRegisters.set(argumentsLateRep.reg());
+                if (thisLateRep.isReg())
+                    usedRegisters.set(thisLateRep.reg());
+                ScratchRegisterAllocator allocator(usedRegisters);
+                GPRReg scratchGPR1 = allocator.allocateScratchGPR();
+                GPRReg scratchGPR2 = allocator.allocateScratchGPR();
+                GPRReg scratchGPR3 = forwarding ? allocator.allocateScratchGPR() : InvalidGPRReg;
+                RELEASE_ASSERT(!allocator.numberOfReusedRegisters());
+
+                auto callWithExceptionCheck = [&amp;] (void* callee) {
+                    jit.move(CCallHelpers::TrustedImmPtr(callee), GPRInfo::nonPreservedNonArgumentGPR);
+                    jit.call(GPRInfo::nonPreservedNonArgumentGPR);
+                    exceptions.append(jit.emitExceptionCheck(AssemblyHelpers::NormalExceptionCheck, AssemblyHelpers::FarJumpWidth));
+                };
+
+                auto adjustStack = [&amp;] (GPRReg amount) {
+                    jit.addPtr(CCallHelpers::TrustedImm32(sizeof(CallerFrameAndPC)), amount, CCallHelpers::stackPointerRegister);
+                };
+
+                unsigned originalStackHeight = params.proc().frameSize();
+
+                if (forwarding) {
+                    jit.move(CCallHelpers::TrustedImm32(originalStackHeight / sizeof(EncodedJSValue)), scratchGPR2);
+                    
+                    CCallHelpers::JumpList slowCase;
+                    emitSetupVarargsFrameFastCase(jit, scratchGPR2, scratchGPR1, scratchGPR2, scratchGPR3, node-&gt;child2()-&gt;origin.semantic.inlineCallFrame, data-&gt;firstVarArgOffset, slowCase);
+
+                    CCallHelpers::Jump done = jit.jump();
+                    slowCase.link(&amp;jit);
+                    jit.setupArgumentsExecState();
+                    callWithExceptionCheck(bitwise_cast&lt;void*&gt;(operationThrowStackOverflowForVarargs));
+                    jit.abortWithReason(DFGVarargsThrowingPathDidNotThrow);
+                    
+                    done.link(&amp;jit);
+
+                    adjustStack(scratchGPR2);
+                } else {
+                    jit.move(CCallHelpers::TrustedImm32(originalStackHeight / sizeof(EncodedJSValue)), scratchGPR1);
+                    jit.setupArgumentsWithExecState(argumentsGPR, scratchGPR1, CCallHelpers::TrustedImm32(data-&gt;firstVarArgOffset));
+                    callWithExceptionCheck(bitwise_cast&lt;void*&gt;(operationSizeFrameForVarargs));
+
+                    jit.move(GPRInfo::returnValueGPR, scratchGPR1);
+                    jit.move(CCallHelpers::TrustedImm32(originalStackHeight / sizeof(EncodedJSValue)), scratchGPR2);
+                    argumentsLateRep.emitRestore(jit, argumentsGPR);
+                    emitSetVarargsFrame(jit, scratchGPR1, false, scratchGPR2, scratchGPR2);
+                    jit.addPtr(CCallHelpers::TrustedImm32(-minimumJSCallAreaSize), scratchGPR2, CCallHelpers::stackPointerRegister);
+                    jit.setupArgumentsWithExecState(scratchGPR2, argumentsGPR, CCallHelpers::TrustedImm32(data-&gt;firstVarArgOffset), scratchGPR1);
+                    callWithExceptionCheck(bitwise_cast&lt;void*&gt;(operationSetupVarargsFrame));
+                    
+                    adjustStack(GPRInfo::returnValueGPR);
+
+                    calleeLateRep.emitRestore(jit, GPRInfo::regT0);
+
+                    // This may not emit code if thisGPR got a callee-save. Also, we're guaranteed
+                    // that thisGPR != GPRInfo::regT0 because regT0 interferes with it.
+                    thisLateRep.emitRestore(jit, thisGPR);
+                }
+                
+                jit.store64(GPRInfo::regT0, CCallHelpers::calleeFrameSlot(JSStack::Callee));
+                jit.store64(thisGPR, CCallHelpers::calleeArgumentSlot(0));
+                
+                CallLinkInfo::CallType callType;
+                if (node-&gt;op() == ConstructVarargs || node-&gt;op() == ConstructForwardVarargs)
+                    callType = CallLinkInfo::ConstructVarargs;
+                else if (node-&gt;op() == TailCallVarargs || node-&gt;op() == TailCallForwardVarargs)
+                    callType = CallLinkInfo::TailCallVarargs;
+                else
+                    callType = CallLinkInfo::CallVarargs;
+                
+                bool isTailCall = CallLinkInfo::callModeFor(callType) == CallMode::Tail;
+                
+                CCallHelpers::DataLabelPtr targetToCheck;
+                CCallHelpers::Jump slowPath = jit.branchPtrWithPatch(
+                    CCallHelpers::NotEqual, GPRInfo::regT0, targetToCheck,
+                    CCallHelpers::TrustedImmPtr(0));
+                
+                CCallHelpers::Call fastCall;
+                CCallHelpers::Jump done;
+                
+                if (isTailCall) {
+                    jit.prepareForTailCallSlow();
+                    fastCall = jit.nearTailCall();
+                } else {
+                    fastCall = jit.nearCall();
+                    done = jit.jump();
+                }
+                
+                slowPath.link(&amp;jit);
+                
+                jit.move(CCallHelpers::TrustedImmPtr(callLinkInfo), GPRInfo::regT2);
+                CCallHelpers::Call slowCall = jit.nearCall();
+                
+                if (isTailCall)
+                    jit.abortWithReason(JITDidReturnFromTailCall);
+                else
+                    done.link(&amp;jit);
+                
+                callLinkInfo-&gt;setUpCall(callType, node-&gt;origin.semantic, GPRInfo::regT0);
+                
+                jit.addPtr(
+                    CCallHelpers::TrustedImm32(-originalStackHeight),
+                    GPRInfo::callFrameRegister, CCallHelpers::stackPointerRegister);
+                
+                jit.addLinkTask(
+                    [=] (LinkBuffer&amp; linkBuffer) {
+                        MacroAssemblerCodePtr linkCall =
+                            linkBuffer.vm().getCTIStub(linkCallThunkGenerator).code();
+                        linkBuffer.link(slowCall, FunctionPtr(linkCall.executableAddress()));
+                        
+                        callLinkInfo-&gt;setCallLocations(
+                            linkBuffer.locationOfNearCall(slowCall),
+                            linkBuffer.locationOf(targetToCheck),
+                            linkBuffer.locationOfNearCall(fastCall));
+                    });
+            });
+
+        setJSValue(patchpoint);
+#else
</ins><span class="cx">         unsigned stackmapID = m_stackmapIDs++;
</span><span class="cx">         
</span><span class="cx">         StackmapArgumentList arguments;
</span><span class="lines">@@ -5115,15 +5346,15 @@
</span><span class="cx"> 
</span><span class="cx">         arguments.insert(0, m_out.constInt32(2 + !!jsArguments));
</span><span class="cx">         arguments.insert(0, constNull(m_out.ref8));
</span><del>-        arguments.insert(0, m_out.constInt32(sizeOfICFor(m_node)));
</del><ins>+        arguments.insert(0, m_out.constInt32(sizeOfICFor(node)));
</ins><span class="cx">         arguments.insert(0, m_out.constInt64(stackmapID));
</span><span class="cx">         
</span><span class="cx">         LValue call = m_out.call(m_out.int64, m_out.patchpointInt64Intrinsic(), arguments);
</span><span class="cx">         setInstructionCallingConvention(call, LLVMCCallConv);
</span><span class="cx">         
</span><del>-        m_ftlState.jsCallVarargses.append(JSCallVarargs(stackmapID, m_node, codeOriginDescriptionOfCallSite()));
</del><ins>+        m_ftlState.jsCallVarargses.append(JSCallVarargs(stackmapID, node, codeOriginDescriptionOfCallSite()));
</ins><span class="cx"> 
</span><del>-        switch (m_node-&gt;op()) {
</del><ins>+        switch (node-&gt;op()) {
</ins><span class="cx">         case TailCallVarargs:
</span><span class="cx">         case TailCallForwardVarargs:
</span><span class="cx">             m_out.unreachable();
</span><span class="lines">@@ -5495,7 +5726,7 @@
</span><span class="cx">         DFG_ASSERT(m_graph, m_node, m_origin.exitOK);
</span><span class="cx">         
</span><span class="cx"> #if FTL_USES_B3
</span><del>-        B3::PatchpointValue* patchpoint = m_out.patchpoint(Void);
</del><ins>+        PatchpointValue* patchpoint = m_out.patchpoint(Void);
</ins><span class="cx">         OSRExitDescriptor* descriptor = appendOSRExitDescriptor(noValue(), nullptr);
</span><span class="cx">         NodeOrigin origin = m_origin;
</span><span class="cx">         patchpoint-&gt;appendColdAnys(buildExitArguments(descriptor, origin.forExit, noValue()));
</span></span></pre></div>
<a id="trunkSourceJavaScriptCoreftlFTLStateh"></a>
<div class="modfile"><h4>Modified: trunk/Source/JavaScriptCore/ftl/FTLState.h (194333 => 194334)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/Source/JavaScriptCore/ftl/FTLState.h        2015-12-21 18:40:10 UTC (rev 194333)
+++ trunk/Source/JavaScriptCore/ftl/FTLState.h        2015-12-21 18:56:54 UTC (rev 194334)
</span><span class="lines">@@ -85,8 +85,6 @@
</span><span class="cx">     B3::PatchpointValue* handleStackOverflowExceptionValue { nullptr };
</span><span class="cx">     B3::PatchpointValue* handleExceptionValue { nullptr };
</span><span class="cx">     B3::StackSlotValue* capturedValue { nullptr };
</span><del>-    B3::StackSlotValue* varargsSpillSlotsValue { nullptr };
-    B3::StackSlotValue* exceptionHandlingSpillSlotValue { nullptr };
</del><span class="cx"> #else // FTL_USES_B3
</span><span class="cx">     unsigned handleStackOverflowExceptionStackmapID { UINT_MAX };
</span><span class="cx">     unsigned handleExceptionStackmapID { UINT_MAX };
</span></span></pre></div>
<a id="trunkSourceJavaScriptCoretestsstressvarargsnoforwardjs"></a>
<div class="addfile"><h4>Added: trunk/Source/JavaScriptCore/tests/stress/varargs-no-forward.js (0 => 194334)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/Source/JavaScriptCore/tests/stress/varargs-no-forward.js                                (rev 0)
+++ trunk/Source/JavaScriptCore/tests/stress/varargs-no-forward.js        2015-12-21 18:56:54 UTC (rev 194334)
</span><span class="lines">@@ -0,0 +1,18 @@
</span><ins>+function foo(a, b, c) {
+    return a + b * 2 + c * 3;
+}
+
+noInline(foo);
+
+function baz(args) {
+    return foo.apply(this, args);
+}
+
+noInline(baz);
+
+for (var i = 0; i &lt; 10000; ++i) {
+    var result = baz([5, 6, 7]);
+    if (result != 5 + 6 * 2 + 7 * 3)
+        throw &quot;Error: bad result: &quot; + result;
+}
+
</ins></span></pre></div>
<a id="trunkSourceJavaScriptCoretestsstressvarargssimplejs"></a>
<div class="addfile"><h4>Added: trunk/Source/JavaScriptCore/tests/stress/varargs-simple.js (0 => 194334)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/Source/JavaScriptCore/tests/stress/varargs-simple.js                                (rev 0)
+++ trunk/Source/JavaScriptCore/tests/stress/varargs-simple.js        2015-12-21 18:56:54 UTC (rev 194334)
</span><span class="lines">@@ -0,0 +1,18 @@
</span><ins>+function foo(a, b, c) {
+    return a + b * 2 + c * 3;
+}
+
+noInline(foo);
+
+function baz() {
+    return foo.apply(this, arguments);
+}
+
+noInline(baz);
+
+for (var i = 0; i &lt; 10000; ++i) {
+    var result = baz(5, 6, 7);
+    if (result != 5 + 6 * 2 + 7 * 3)
+        throw &quot;Error: bad result: &quot; + result;
+}
+
</ins></span></pre></div>
<a id="trunkSourceJavaScriptCoretestsstressvarargstwoleveljs"></a>
<div class="addfile"><h4>Added: trunk/Source/JavaScriptCore/tests/stress/varargs-two-level.js (0 => 194334)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/Source/JavaScriptCore/tests/stress/varargs-two-level.js                                (rev 0)
+++ trunk/Source/JavaScriptCore/tests/stress/varargs-two-level.js        2015-12-21 18:56:54 UTC (rev 194334)
</span><span class="lines">@@ -0,0 +1,22 @@
</span><ins>+function foo(a, b, c) {
+    return a + b * 2 + c * 3;
+}
+
+noInline(foo);
+
+function bar() {
+    return foo.apply(this, arguments);
+}
+
+function baz() {
+    return bar.apply(this, arguments);
+}
+
+noInline(baz);
+
+for (var i = 0; i &lt; 10000; ++i) {
+    var result = baz(5, 6, 7);
+    if (result != 5 + 6 * 2 + 7 * 3)
+        throw &quot;Error: bad result: &quot; + result;
+}
+
</ins></span></pre>
</div>
</div>

</body>
</html>