<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
"http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head><meta http-equiv="content-type" content="text/html; charset=utf-8" />
<title>[279256] trunk/Source/JavaScriptCore</title>
</head>
<body>

<style type="text/css"><!--
#msg dl.meta { border: 1px #006 solid; background: #369; padding: 6px; color: #fff; }
#msg dl.meta dt { float: left; width: 6em; font-weight: bold; }
#msg dt:after { content:':';}
#msg dl, #msg dt, #msg ul, #msg li, #header, #footer, #logmsg { font-family: verdana,arial,helvetica,sans-serif; font-size: 10pt;  }
#msg dl a { font-weight: bold}
#msg dl a:link    { color:#fc3; }
#msg dl a:active  { color:#ff0; }
#msg dl a:visited { color:#cc6; }
h3 { font-family: verdana,arial,helvetica,sans-serif; font-size: 10pt; font-weight: bold; }
#msg pre { overflow: auto; background: #ffc; border: 1px #fa0 solid; padding: 6px; }
#logmsg { background: #ffc; border: 1px #fa0 solid; padding: 1em 1em 0 1em; }
#logmsg p, #logmsg pre, #logmsg blockquote { margin: 0 0 1em 0; }
#logmsg p, #logmsg li, #logmsg dt, #logmsg dd { line-height: 14pt; }
#logmsg h1, #logmsg h2, #logmsg h3, #logmsg h4, #logmsg h5, #logmsg h6 { margin: .5em 0; }
#logmsg h1:first-child, #logmsg h2:first-child, #logmsg h3:first-child, #logmsg h4:first-child, #logmsg h5:first-child, #logmsg h6:first-child { margin-top: 0; }
#logmsg ul, #logmsg ol { padding: 0; list-style-position: inside; margin: 0 0 0 1em; }
#logmsg ul { text-indent: -1em; padding-left: 1em; }#logmsg ol { text-indent: -1.5em; padding-left: 1.5em; }
#logmsg > ul, #logmsg > ol { margin: 0 0 1em 0; }
#logmsg pre { background: #eee; padding: 1em; }
#logmsg blockquote { border: 1px solid #fa0; border-left-width: 10px; padding: 1em 1em 0 1em; background: white;}
#logmsg dl { margin: 0; }
#logmsg dt { font-weight: bold; }
#logmsg dd { margin: 0; padding: 0 0 0.5em 0; }
#logmsg dd:before { content:'\00bb';}
#logmsg table { border-spacing: 0px; border-collapse: collapse; border-top: 4px solid #fa0; border-bottom: 1px solid #fa0; background: #fff; }
#logmsg table th { text-align: left; font-weight: normal; padding: 0.2em 0.5em; border-top: 1px dotted #fa0; }
#logmsg table td { text-align: right; border-top: 1px dotted #fa0; padding: 0.2em 0.5em; }
#logmsg table thead th { text-align: center; border-bottom: 1px solid #fa0; }
#logmsg table th.Corner { text-align: left; }
#logmsg hr { border: none 0; border-top: 2px dashed #fa0; height: 1px; }
#header, #footer { color: #fff; background: #636; border: 1px #300 solid; padding: 6px; }
#patch { width: 100%; }
#patch h4 {font-family: verdana,arial,helvetica,sans-serif;font-size:10pt;padding:8px;background:#369;color:#fff;margin:0;}
#patch .propset h4, #patch .binary h4 {margin:0;}
#patch pre {padding:0;line-height:1.2em;margin:0;}
#patch .diff {width:100%;background:#eee;padding: 0 0 10px 0;overflow:auto;}
#patch .propset .diff, #patch .binary .diff  {padding:10px 0;}
#patch span {display:block;padding:0 10px;}
#patch .modfile, #patch .addfile, #patch .delfile, #patch .propset, #patch .binary, #patch .copfile {border:1px solid #ccc;margin:10px 0;}
#patch ins {background:#dfd;text-decoration:none;display:block;padding:0 10px;}
#patch del {background:#fdd;text-decoration:none;display:block;padding:0 10px;}
#patch .lines, .info {color:#888;background:#fff;}
--></style>
<div id="msg">
<dl class="meta">
<dt>Revision</dt> <dd><a href="http://trac.webkit.org/projects/webkit/changeset/279256">279256</a></dd>
<dt>Author</dt> <dd>mark.lam@apple.com</dd>
<dt>Date</dt> <dd>2021-06-24 17:06:56 -0700 (Thu, 24 Jun 2021)</dd>
</dl>

<h3>Log Message</h3>
<pre>Use ldp and stp more for saving / restoring registers on ARM64.
https://bugs.webkit.org/show_bug.cgi?id=227039
rdar://79354736

Reviewed by Saam Barati.

This patch introduces a spooler abstraction in AssemblyHelpers.  The spooler
batches up load / store operations and emits them as pair instructions
where appropriate.

There are 4 spooler classes:
a. Spooler
   - template base class for LoadRegSpooler and StoreRegSpooler.
   - encapsulates the batching strategy for load / store pairs.

b. LoadRegSpooler - specializes Spooler to handle load pairs.
c. StoreRegSpooler - specializes Spooler to handle store pairs.

d. CopySpooler
   - handles matching loads with stores.
   - tries to emit loads as load pairs if possible.
   - tries to emit stores as store pairs if possible.
   - ensures that pre-requisite loads are emitted before stores are emitted.
   - other than loads, also supports constants and registers as sources of values
     to be stored.  This is useful in OSR exit ramps where we may materialize a
     stack value to store from constants or registers in addition to values we
     load from the old stack frame or from a scratch buffer.

In this patch, we also do the following:

1. Use spoolers in many places so that we can emit load / store pairs instead of
   single load / stores.  This helps shrink JIT code size, and also potentially
   improves performance.

2. In DFG::OSRExit::compileExit(), we used to recover constants into a scratch
   buffer, and then later, load from that scratch buffer to store into the
   new stack frame(s).

   This patch changes it so that we defer constant recovery until the final
   loop where we store the recovered value directly into the new stack frame(s).
   This saves us the work (and JIT code space) for storing into a scratch buffer
   and then reloading from the scratch buffer.

   There is one exception: tmp values used by active checkpoints.  We need to call
   operationMaterializeOSRExitSideState() to materialize the active checkpoint
   side state before the final loop where we now recover constants.  Hence, we
   need these tmp values recovered beforehand.

   So, we check upfront if we have active checkpoint side state to materialize.
   If so, we'll eagerly recover the constants for initializing those tmps.

   We also use the CopySpooler in the final loop to emit load / store pairs for
   filling in the new stack frame(s).

   One more thing: it turns out that the vast majority of constants to be recovered
   are simply the undefined value.  So, as an optimization, the final loop keeps
   the undefined value in a register, and has the spooler store directly from
   that register when appropriate.  This saves on JIT code to repeatedly materialize
   the undefined JSValue constant.

3. In reifyInlinedCallFrames(), replace the use of GPRInfo::nonArgGPR0 with
   GPRInfo::regT4.  nonArgGPRs sometimes map to certain regTXs on certain ports.
   Replacing with regT4 makes it easier to ensure that we're not trashing the
   register when we use more temp registers.

   reifyInlinedCallFrames() will be using emitSaveOrCopyLLIntBaselineCalleeSavesFor()
   later where we need more temp registers.

4. Move the following functions to AssemblyHelpers.cpp.  They don't need to be
   inline functions.  Speedometer2 and JetStream2 show that making these
   non-inline does not hurt performance:

        AssemblyHelpers::emitSave(const RegisterAtOffsetList&);
        AssemblyHelpers::emitRestore(const RegisterAtOffsetList&);
        AssemblyHelpers::emitSaveCalleeSavesFor(const RegisterAtOffsetList*);
        AssemblyHelpers::emitSaveOrCopyCalleeSavesFor(...);
        AssemblyHelpers::emitRestoreCalleeSavesFor(const RegisterAtOffsetList*);
        AssemblyHelpers::copyLLIntBaselineCalleeSavesFromFrameOrRegisterToEntryFrameCalleeSavesBuffer(...);

   Also renamed emitSaveOrCopyCalleeSavesFor() to emitSaveOrCopyLLIntBaselineCalleeSavesFor()
   because it is only used with baseline codeBlocks.

Results:
Cumulative LinkBuffer profile sizes shrank by ~2.4 MB in aggregate:

                    base                           new
                    ====                           ===
       BaselineJIT: 83827048 (79.943703 MB)     => 83718736 (79.840408 MB)
               DFG: 56594836 (53.973042 MB)     => 56603508 (53.981312 MB)
       InlineCache: 33923900 (32.352352 MB)     => 33183156 (31.645924 MB)
               FTL: 6770956 (6.457287 MB)       => 6568964 (6.264652 MB)
        DFGOSRExit: 5212096 (4.970642 MB)       => 3728088 (3.555382 MB)
            CSSJIT: 748428 (730.886719 KB)      => 748428 (730.886719 KB)
        FTLOSRExit: 692276 (676.050781 KB)      => 656884 (641.488281 KB)
           YarrJIT: 445280 (434.843750 KB)      => 512988 (500.964844 KB)
          FTLThunk: 22908 (22.371094 KB)        => 22556 (22.027344 KB)
BoundFunctionThunk: 8400 (8.203125 KB)          => 10088 (9.851562 KB)
     ExtraCTIThunk: 6952 (6.789062 KB)          => 6824 (6.664062 KB)
  SpecializedThunk: 4508 (4.402344 KB)          => 4508 (4.402344 KB)
             Thunk: 3912 (3.820312 KB)          => 3784 (3.695312 KB)
        LLIntThunk: 2908 (2.839844 KB)          => 2908 (2.839844 KB)
      VirtualThunk: 1248 (1.218750 KB)          => 1248 (1.218750 KB)
          DFGThunk: 1084 (1.058594 KB)          => 444
       DFGOSREntry: 216                         => 184
        JumpIsland: 0
         WasmThunk: 0
              Wasm: 0
     Uncategorized: 0
             Total: 188266956 (179.545361 MB)   => 185773296 (177.167221 MB)

Speedometer2 and JetStream2 results show that performance is neutral for this
patch (as measured on an M1 Mac):

Speedometer2:
----------------------------------------------------------------------------------------------------------------------------------
|               subtest                |     ms      |     ms      |  b / a   | pValue (significance using False Discovery Rate) |
----------------------------------------------------------------------------------------------------------------------------------
| Elm-TodoMVC                          |129.037500   |127.212500   |0.985857  | 0.012706                                         |
| VueJS-TodoMVC                        |28.312500    |27.525000    |0.972185  | 0.240315                                         |
| EmberJS-TodoMVC                      |132.550000   |132.025000   |0.996039  | 0.538034                                         |
| Flight-TodoMVC                       |80.762500    |80.875000    |1.001393  | 0.914749                                         |
| BackboneJS-TodoMVC                   |51.637500    |51.175000    |0.991043  | 0.285427                                         |
| Preact-TodoMVC                       |21.025000    |22.075000    |1.049941  | 0.206140                                         |
| AngularJS-TodoMVC                    |142.900000   |142.887500   |0.999913  | 0.990681                                         |
| Inferno-TodoMVC                      |69.300000    |69.775000    |1.006854  | 0.505201                                         |
| Vanilla-ES2015-TodoMVC               |71.500000    |71.225000    |0.996154  | 0.608650                                         |
| Angular2-TypeScript-TodoMVC          |43.287500    |43.275000    |0.999711  | 0.987926                                         |
| VanillaJS-TodoMVC                    |57.212500    |57.812500    |1.010487  | 0.333357                                         |
| jQuery-TodoMVC                       |276.150000   |276.775000   |1.002263  | 0.614404                                         |
| EmberJS-Debug-TodoMVC                |353.612500   |352.762500   |0.997596  | 0.518836                                         |
| React-TodoMVC                        |93.637500    |92.637500    |0.989321  | 0.036277                                         |
| React-Redux-TodoMVC                  |158.237500   |156.587500   |0.989573  | 0.042154                                         |
| Vanilla-ES2015-Babel-Webpack-TodoMVC |68.050000    |68.087500    |1.000551  | 0.897149                                         |
----------------------------------------------------------------------------------------------------------------------------------

a mean = 236.26950
b mean = 236.57964
pValue = 0.7830785938
(Bigger means are better.)
1.001 times better
Results ARE NOT significant

JetStream2:
-------------------------------------------------------------------------------------------------------------------------
|          subtest          |     pts      |     pts      |  b / a   | pValue (significance using False Discovery Rate) |
-------------------------------------------------------------------------------------------------------------------------
| gaussian-blur             |542.570057    |542.671885    |1.000188  | 0.982573                                         |
| HashSet-wasm              |57.710498     |64.406371     |1.116025  | 0.401424                                         |
| gcc-loops-wasm            |44.516009     |44.453535     |0.998597  | 0.973651                                         |
| json-parse-inspector      |241.275085    |240.720491    |0.997701  | 0.704732                                         |
| prepack-wtb               |62.640114     |63.754878     |1.017796  | 0.205840                                         |
| date-format-xparb-SP      |416.976817    |448.921409    |1.076610  | 0.052977                                         |
| WSL                       |1.555257      |1.570233      |1.009629  | 0.427924                                         |
| OfflineAssembler          |177.052352    |179.746511    |1.015217  | 0.112114                                         |
| cdjs                      |192.517586    |194.598906    |1.010811  | 0.025807                                         |
| UniPoker                  |514.023694    |526.111500    |1.023516  | 0.269892                                         |
| json-stringify-inspector  |227.584725    |223.619390    |0.982576  | 0.102714                                         |
| crypto-sha1-SP            |980.728788    |984.192104    |1.003531  | 0.838618                                         |
| Basic                     |685.148483    |711.590247    |1.038593  | 0.142952                                         |
| chai-wtb                  |106.256376    |106.590318    |1.003143  | 0.865894                                         |
| crypto-aes-SP             |722.308829    |728.702310    |1.008851  | 0.486766                                         |
| Babylon                   |655.857561    |654.204901    |0.997480  | 0.931520                                         |
| string-unpack-code-SP     |407.837271    |405.710752    |0.994786  | 0.729122                                         |
| stanford-crypto-aes       |456.906021    |449.993856    |0.984872  | 0.272994                                         |
| raytrace                  |883.911335    |902.887238    |1.021468  | 0.189785                                         |
| multi-inspector-code-load |409.997347    |405.643639    |0.989381  | 0.644447                                         |
| hash-map                  |593.590160    |601.576332    |1.013454  | 0.249414                                         |
| stanford-crypto-pbkdf2    |722.178638    |728.283532    |1.008453  | 0.661195                                         |
| coffeescript-wtb          |42.393544     |41.869545     |0.987640  | 0.197441                                         |
| Box2D                     |452.034685    |454.104868    |1.004580  | 0.535342                                         |
| richards-wasm             |140.873688    |148.394050    |1.053384  | 0.303651                                         |
| lebab-wtb                 |61.671318     |62.119403     |1.007266  | 0.620998                                         |
| tsf-wasm                  |108.592794    |119.498398    |1.100427  | 0.504710                                         |
| base64-SP                 |629.744643    |603.425565    |0.958207  | 0.049997                                         |
| navier-stokes             |740.588523    |739.951662    |0.999140  | 0.871445                                         |
| jshint-wtb                |51.938359     |52.651104     |1.013723  | 0.217137                                         |
| regex-dna-SP              |459.251148    |463.492489    |1.009235  | 0.371891                                         |
| async-fs                  |235.853820    |236.031189    |1.000752  | 0.938459                                         |
| first-inspector-code-load |275.298325    |274.172125    |0.995909  | 0.623403                                         |
| segmentation              |44.002842     |43.445960     |0.987344  | 0.207134                                         |
| typescript                |26.360161     |26.458820     |1.003743  | 0.609942                                         |
| octane-code-load          |1126.749036   |1087.132024   |0.964840  | 0.524171                                         |
| float-mm.c                |16.691935     |16.721354     |1.001762  | 0.194425                                         |
| quicksort-wasm            |461.630091    |450.161127    |0.975156  | 0.371394                                         |
| Air                       |392.442375    |412.201810    |1.050350  | 0.046887                                         |
| splay                     |510.111886    |475.131657    |0.931426  | 0.024732                                         |
| ai-astar                  |607.966974    |626.573181    |1.030604  | 0.468711                                         |
| acorn-wtb                 |67.510766     |68.143956     |1.009379  | 0.481663                                         |
| gbemu                     |144.133842    |145.620304    |1.010313  | 0.802154                                         |
| richards                  |963.475078    |946.658879    |0.982546  | 0.231189                                         |
| 3d-cube-SP                |549.426784    |550.479154    |1.001915  | 0.831307                                         |
| espree-wtb                |68.707483     |73.762202     |1.073569  | 0.033603                                         |
| bomb-workers              |96.882596     |96.116121     |0.992089  | 0.687952                                         |
| tagcloud-SP               |309.888767    |303.538511    |0.979508  | 0.187768                                         |
| mandreel                  |133.667031    |135.009929    |1.010047  | 0.075232                                         |
| 3d-raytrace-SP            |491.967649    |492.528992    |1.001141  | 0.957842                                         |
| delta-blue                |1066.718312   |1080.230772   |1.012667  | 0.549382                                         |
| ML                        |139.617293    |140.088630    |1.003376  | 0.661651                                         |
| regexp                    |351.773956    |351.075935    |0.998016  | 0.769250                                         |
| crypto                    |1510.474663   |1519.218842   |1.005789  | 0.638420                                         |
| crypto-md5-SP             |795.447899    |774.082493    |0.973140  | 0.079728                                         |
| earley-boyer              |812.574545    |870.678372    |1.071506  | 0.044081                                         |
| octane-zlib               |25.162470     |25.660261     |1.019783  | 0.554591                                         |
| date-format-tofte-SP      |395.296135    |398.008992    |1.006863  | 0.650475                                         |
| n-body-SP                 |1165.386611   |1150.525110   |0.987248  | 0.227908                                         |
| pdfjs                     |189.060252    |191.015628    |1.010343  | 0.633777                                         |
| FlightPlanner             |908.426192    |903.636642    |0.994728  | 0.838821                                         |
| uglify-js-wtb             |34.029399     |34.164342     |1.003965  | 0.655652                                         |
| babylon-wtb               |81.329869     |80.855680     |0.994170  | 0.854393                                         |
| stanford-crypto-sha256    |826.850533    |838.494164    |1.014082  | 0.579636                                         |
-------------------------------------------------------------------------------------------------------------------------

a mean = 237.91084
b mean = 239.92670
pValue = 0.0657710897
(Bigger means are better.)
1.008 times better
Results ARE NOT significant

* CMakeLists.txt:
* JavaScriptCore.xcodeproj/project.pbxproj:
* assembler/MacroAssembler.h:
(JSC::MacroAssembler::pushToSaveByteOffset):
* assembler/MacroAssemblerARM64.h:
(JSC::MacroAssemblerARM64::pushToSaveByteOffset):
* dfg/DFGOSRExit.cpp:
(JSC::DFG::OSRExit::compileExit):
* dfg/DFGOSRExitCompilerCommon.cpp:
(JSC::DFG::reifyInlinedCallFrames):
* dfg/DFGThunks.cpp:
(JSC::DFG::osrExitGenerationThunkGenerator):
* ftl/FTLSaveRestore.cpp:
(JSC::FTL::saveAllRegisters):
(JSC::FTL::restoreAllRegisters):
* ftl/FTLSaveRestore.h:
* ftl/FTLThunks.cpp:
(JSC::FTL::genericGenerationThunkGenerator):
(JSC::FTL::slowPathCallThunkGenerator):
* jit/AssemblyHelpers.cpp:
(JSC::AssemblyHelpers::restoreCalleeSavesFromEntryFrameCalleeSavesBuffer):
(JSC::AssemblyHelpers::copyCalleeSavesToEntryFrameCalleeSavesBufferImpl):
(JSC::AssemblyHelpers::emitSave):
(JSC::AssemblyHelpers::emitRestore):
(JSC::AssemblyHelpers::emitSaveCalleeSavesFor):
(JSC::AssemblyHelpers::emitRestoreCalleeSavesFor):
(JSC::AssemblyHelpers::copyLLIntBaselineCalleeSavesFromFrameOrRegisterToEntryFrameCalleeSavesBuffer):
(JSC::AssemblyHelpers::emitSaveOrCopyLLIntBaselineCalleeSavesFor):
* jit/AssemblyHelpers.h:
(JSC::AssemblyHelpers::copyLLIntBaselineCalleeSavesFromFrameOrRegisterToEntryFrameCalleeSavesBuffer):
(JSC::AssemblyHelpers::emitSave): Deleted.
(JSC::AssemblyHelpers::emitRestore): Deleted.
(JSC::AssemblyHelpers::emitSaveOrCopyCalleeSavesFor): Deleted.
* jit/AssemblyHelpersSpoolers.h: Added.
(JSC::AssemblyHelpers::Spooler::Spooler):
(JSC::AssemblyHelpers::Spooler::handleGPR):
(JSC::AssemblyHelpers::Spooler::finalizeGPR):
(JSC::AssemblyHelpers::Spooler::handleFPR):
(JSC::AssemblyHelpers::Spooler::finalizeFPR):
(JSC::AssemblyHelpers::Spooler::op):
(JSC::AssemblyHelpers::LoadRegSpooler::LoadRegSpooler):
(JSC::AssemblyHelpers::LoadRegSpooler::loadGPR):
(JSC::AssemblyHelpers::LoadRegSpooler::finalizeGPR):
(JSC::AssemblyHelpers::LoadRegSpooler::loadFPR):
(JSC::AssemblyHelpers::LoadRegSpooler::finalizeFPR):
(JSC::AssemblyHelpers::LoadRegSpooler::handlePair):
(JSC::AssemblyHelpers::LoadRegSpooler::handleSingle):
(JSC::AssemblyHelpers::StoreRegSpooler::StoreRegSpooler):
(JSC::AssemblyHelpers::StoreRegSpooler::storeGPR):
(JSC::AssemblyHelpers::StoreRegSpooler::finalizeGPR):
(JSC::AssemblyHelpers::StoreRegSpooler::storeFPR):
(JSC::AssemblyHelpers::StoreRegSpooler::finalizeFPR):
(JSC::AssemblyHelpers::StoreRegSpooler::handlePair):
(JSC::AssemblyHelpers::StoreRegSpooler::handleSingle):
(JSC::RegDispatch&lt;GPRReg&gt;::get):
(JSC::RegDispatch&lt;GPRReg&gt;::temp1):
(JSC::RegDispatch&lt;GPRReg&gt;::temp2):
(JSC::RegDispatch&lt;GPRReg&gt;::regToStore):
(JSC::RegDispatch&lt;GPRReg&gt;::invalid):
(JSC::RegDispatch&lt;GPRReg&gt;::regSize):
(JSC::RegDispatch&lt;GPRReg&gt;::isValidLoadPairImm):
(JSC::RegDispatch&lt;GPRReg&gt;::isValidStorePairImm):
(JSC::RegDispatch&lt;FPRReg&gt;::get):
(JSC::RegDispatch&lt;FPRReg&gt;::temp1):
(JSC::RegDispatch&lt;FPRReg&gt;::temp2):
(JSC::RegDispatch&lt;FPRReg&gt;::regToStore):
(JSC::RegDispatch&lt;FPRReg&gt;::invalid):
(JSC::RegDispatch&lt;FPRReg&gt;::regSize):
(JSC::RegDispatch&lt;FPRReg&gt;::isValidLoadPairImm):
(JSC::RegDispatch&lt;FPRReg&gt;::isValidStorePairImm):
(JSC::AssemblyHelpers::CopySpooler::Source::getReg):
(JSC::AssemblyHelpers::CopySpooler::CopySpooler):
(JSC::AssemblyHelpers::CopySpooler::temp1 const):
(JSC::AssemblyHelpers::CopySpooler::temp2 const):
(JSC::AssemblyHelpers::CopySpooler::regToStore):
(JSC::AssemblyHelpers::CopySpooler::invalid):
(JSC::AssemblyHelpers::CopySpooler::regSize):
(JSC::AssemblyHelpers::CopySpooler::isValidLoadPairImm):
(JSC::AssemblyHelpers::CopySpooler::isValidStorePairImm):
(JSC::AssemblyHelpers::CopySpooler::load):
(JSC::AssemblyHelpers::CopySpooler::move):
(JSC::AssemblyHelpers::CopySpooler::copy):
(JSC::AssemblyHelpers::CopySpooler::store):
(JSC::AssemblyHelpers::CopySpooler::flush):
(JSC::AssemblyHelpers::CopySpooler::loadGPR):
(JSC::AssemblyHelpers::CopySpooler::copyGPR):
(JSC::AssemblyHelpers::CopySpooler::moveConstant):
(JSC::AssemblyHelpers::CopySpooler::storeGPR):
(JSC::AssemblyHelpers::CopySpooler::finalizeGPR):
(JSC::AssemblyHelpers::CopySpooler::loadFPR):
(JSC::AssemblyHelpers::CopySpooler::copyFPR):
(JSC::AssemblyHelpers::CopySpooler::storeFPR):
(JSC::AssemblyHelpers::CopySpooler::finalizeFPR):
(JSC::AssemblyHelpers::CopySpooler::loadPair):
(JSC::AssemblyHelpers::CopySpooler::storePair):
* jit/ScratchRegisterAllocator.cpp:
(JSC::ScratchRegisterAllocator::preserveReusedRegistersByPushing):
(JSC::ScratchRegisterAllocator::restoreReusedRegistersByPopping):
(JSC::ScratchRegisterAllocator::preserveRegistersToStackForCall):
(JSC::ScratchRegisterAllocator::restoreRegistersFromStackForCall):
* jit/ScratchRegisterAllocator.h:
* wasm/WasmAirIRGenerator.cpp:
(JSC::Wasm::AirIRGenerator::addReturn):
* wasm/WasmB3IRGenerator.cpp:
(JSC::Wasm::B3IRGenerator::addReturn):</pre>
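
<p>The batching strategy described in the log message can be illustrated with a minimal sketch. This is not the code added in AssemblyHelpersSpoolers.h: the class name SketchStoreSpooler, the string-based registers, and the text output are all hypothetical stand-ins chosen for illustration, and the sketch only pairs stores at ascending adjacent offsets. It assumes C++17 (for std::optional) and the 64-bit stp immediate range of -512 to 504 bytes.</p>

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified model of a store spooler; NOT the WebKit class.
// Registers are plain strings and "emitting" appends ARM64-style text.
class SketchStoreSpooler {
public:
    explicit SketchStoreSpooler(std::string baseReg)
        : m_base(std::move(baseReg))
    {
    }

    // Queue a 64-bit store of `reg` to [base, #offset].
    void storeGPR(const std::string& reg, int32_t offset)
    {
        assert(!(offset % 8)); // This sketch assumes 8-byte-aligned slots.
        if (!m_pending) {
            m_pending = Pending { reg, offset };
            return;
        }
        // Pair only if the new slot directly follows the buffered one and
        // the first offset fits the stp immediate encoding.
        if (offset == m_pending->offset + 8 && isValidPairOffset(m_pending->offset)) {
            m_out.push_back("stp " + m_pending->reg + ", " + reg + ", " + address(m_pending->offset));
            m_pending.reset();
            return;
        }
        // Not pairable: emit the buffered store alone and buffer the new one.
        emitSingle(*m_pending);
        m_pending = Pending { reg, offset };
    }

    // Flush a leftover buffered store as a single str.
    void finalizeGPR()
    {
        if (m_pending) {
            emitSingle(*m_pending);
            m_pending.reset();
        }
    }

    const std::vector<std::string>& instructions() const { return m_out; }

private:
    struct Pending {
        std::string reg;
        int32_t offset;
    };

    // 64-bit ldp / stp take a signed 7-bit immediate scaled by 8.
    static bool isValidPairOffset(int32_t offset)
    {
        return offset >= -512 && offset <= 504;
    }

    std::string address(int32_t offset) const
    {
        return "[" + m_base + ", #" + std::to_string(offset) + "]";
    }

    void emitSingle(const Pending& op)
    {
        m_out.push_back("str " + op.reg + ", " + address(op.offset));
    }

    std::string m_base;
    std::optional<Pending> m_pending;
    std::vector<std::string> m_out;
};

// Three adjacent slots: the first two pair up, the third is flushed alone.
std::vector<std::string> spoolExample()
{
    SketchStoreSpooler spooler("sp");
    spooler.storeGPR("x19", 0);
    spooler.storeGPR("x20", 8);
    spooler.storeGPR("x21", 16);
    spooler.finalizeGPR();
    return spooler.instructions();
}
```

<p>In this sketch, spoolExample() yields two instructions instead of three: one stp covering x19 and x20, followed by a single str for x21. Buffering one pending operation is enough to decide between a pair and a single store, which is the kind of saving the size table in the log message reflects.</p>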

<h3>Modified Paths</h3>
<ul>
<li><a href="#trunkSourceJavaScriptCoreCMakeListstxt">trunk/Source/JavaScriptCore/CMakeLists.txt</a></li>
<li><a href="#trunkSourceJavaScriptCoreChangeLog">trunk/Source/JavaScriptCore/ChangeLog</a></li>
<li><a href="#trunkSourceJavaScriptCoreJavaScriptCorexcodeprojprojectpbxproj">trunk/Source/JavaScriptCore/JavaScriptCore.xcodeproj/project.pbxproj</a></li>
<li><a href="#trunkSourceJavaScriptCoreassemblerMacroAssemblerh">trunk/Source/JavaScriptCore/assembler/MacroAssembler.h</a></li>
<li><a href="#trunkSourceJavaScriptCoreassemblerMacroAssemblerARM64h">trunk/Source/JavaScriptCore/assembler/MacroAssemblerARM64.h</a></li>
<li><a href="#trunkSourceJavaScriptCoredfgDFGOSRExitcpp">trunk/Source/JavaScriptCore/dfg/DFGOSRExit.cpp</a></li>
<li><a href="#trunkSourceJavaScriptCoredfgDFGOSRExitCompilerCommoncpp">trunk/Source/JavaScriptCore/dfg/DFGOSRExitCompilerCommon.cpp</a></li>
<li><a href="#trunkSourceJavaScriptCoredfgDFGThunkscpp">trunk/Source/JavaScriptCore/dfg/DFGThunks.cpp</a></li>
<li><a href="#trunkSourceJavaScriptCoreftlFTLSaveRestorecpp">trunk/Source/JavaScriptCore/ftl/FTLSaveRestore.cpp</a></li>
<li><a href="#trunkSourceJavaScriptCoreftlFTLSaveRestoreh">trunk/Source/JavaScriptCore/ftl/FTLSaveRestore.h</a></li>
<li><a href="#trunkSourceJavaScriptCoreftlFTLThunkscpp">trunk/Source/JavaScriptCore/ftl/FTLThunks.cpp</a></li>
<li><a href="#trunkSourceJavaScriptCorejitAssemblyHelperscpp">trunk/Source/JavaScriptCore/jit/AssemblyHelpers.cpp</a></li>
<li><a href="#trunkSourceJavaScriptCorejitAssemblyHelpersh">trunk/Source/JavaScriptCore/jit/AssemblyHelpers.h</a></li>
<li><a href="#trunkSourceJavaScriptCorejitScratchRegisterAllocatorcpp">trunk/Source/JavaScriptCore/jit/ScratchRegisterAllocator.cpp</a></li>
<li><a href="#trunkSourceJavaScriptCorejitScratchRegisterAllocatorh">trunk/Source/JavaScriptCore/jit/ScratchRegisterAllocator.h</a></li>
<li><a href="#trunkSourceJavaScriptCorewasmWasmAirIRGeneratorcpp">trunk/Source/JavaScriptCore/wasm/WasmAirIRGenerator.cpp</a></li>
<li><a href="#trunkSourceJavaScriptCorewasmWasmB3IRGeneratorcpp">trunk/Source/JavaScriptCore/wasm/WasmB3IRGenerator.cpp</a></li>
</ul>

<h3>Added Paths</h3>
<ul>
<li><a href="#trunkSourceJavaScriptCorejitAssemblyHelpersSpoolersh">trunk/Source/JavaScriptCore/jit/AssemblyHelpersSpoolers.h</a></li>
</ul>

</div>
<div id="patch">
<h3>Diff</h3>
<a id="trunkSourceJavaScriptCoreCMakeListstxt"></a>
<div class="modfile"><h4>Modified: trunk/Source/JavaScriptCore/CMakeLists.txt (279255 => 279256)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/Source/JavaScriptCore/CMakeLists.txt       2021-06-25 00:06:06 UTC (rev 279255)
+++ trunk/Source/JavaScriptCore/CMakeLists.txt  2021-06-25 00:06:56 UTC (rev 279256)
</span><span class="lines">@@ -757,6 +757,7 @@
</span><span class="cx">     interpreter/VMEntryRecord.h
</span><span class="cx"> 
</span><span class="cx">     jit/AssemblyHelpers.h
</span><ins>+    jit/AssemblyHelpersSpoolers.h
</ins><span class="cx">     jit/CCallHelpers.h
</span><span class="cx">     jit/ExecutableAllocator.h
</span><span class="cx">     jit/FPRInfo.h
</span></span></pre></div>
<a id="trunkSourceJavaScriptCoreChangeLog"></a>
<div class="modfile"><h4>Modified: trunk/Source/JavaScriptCore/ChangeLog (279255 => 279256)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/Source/JavaScriptCore/ChangeLog    2021-06-25 00:06:06 UTC (rev 279255)
+++ trunk/Source/JavaScriptCore/ChangeLog       2021-06-25 00:06:56 UTC (rev 279256)
</span><span class="lines">@@ -1,3 +1,330 @@
</span><ins>+2021-06-24  Mark Lam  &lt;mark.lam@apple.com&gt;
+
+        Use ldp and stp more for saving / restoring registers on ARM64.
+        https://bugs.webkit.org/show_bug.cgi?id=227039
+        rdar://79354736
+
+        Reviewed by Saam Barati.
+
+        This patch introduces a spooler abstraction in AssemblyHelpers.  The spooler
+        batches up load / store operations and emits them as pair instructions
+        where appropriate.
+
+        There are 4 spooler classes:
+        a. Spooler
+           - template base class for LoadRegSpooler and StoreRegSpooler.
+           - encapsulates the batching strategy for load / store pairs.
+
+        b. LoadRegSpooler - specializes Spooler to handle load pairs.
+        c. StoreRegSpooler - specializes Spooler to handle store pairs.
+
+        d. CopySpooler
+           - handles matching loads with stores.
+           - tries to emit loads as load pairs if possible.
+           - tries to emit stores as store pairs if possible.
+           - ensures that pre-requisite loads are emitted before stores are emitted.
+           - other than loads, also supports constants and registers as sources of values
+             to be stored.  This is useful in OSR exit ramps where we may materialize a
+             stack value to store from constants or registers in addition to values we
+             load from the old stack frame or from a scratch buffer.
+
+        In this patch, we also do the following:
+
+        1. Use spoolers in many places so that we can emit load / store pairs instead of
+           single load / stores.  This helps shrink JIT code size, and also potentially
+           improves performance.
+
+        2. In DFG::OSRExit::compileExit(), we used to recover constants into a scratch
+           buffer, and then later, load from that scratch buffer to store into the
+           new stack frame(s).
+
+           This patch changes it so that we defer constant recovery until the final
+           loop where we store the recovered value directly into the new stack frame(s).
+           This saves us the work (and JIT code space) for storing into a scratch buffer
+           and then reloading from the scratch buffer.
+
+           There is one exception: tmp values used by active checkpoints.  We need to call
+           operationMaterializeOSRExitSideState() to materialize the active checkpoint
+           side state before the final loop where we now recover constants.  Hence, we
+           need these tmp values recovered beforehand.
+
+           So, we check upfront if we have active checkpoint side state to materialize.
+           If so, we'll eagerly recover the constants for initializing those tmps.
+
+           We also use the CopySpooler in the final loop to emit load / store pairs for
+           filling in the new stack frame(s).
+
+           One more thing: it turns out that the vast majority of constants to be recovered
+           are simply the undefined value.  So, as an optimization, the final loop keeps
+           the undefined value in a register, and has the spooler store directly from
+           that register when appropriate.  This saves on JIT code to repeatedly materialize
+           the undefined JSValue constant.
+
+        3. In reifyInlinedCallFrames(), replace the use of GPRInfo::nonArgGPR0 with
+           GPRInfo::regT4.  nonArgGPRs are sometimes mapped to certain regTXs on certain ports.
+           Replacing with regT4 makes it easier to ensure that we're not trashing the
+           register when we use more temp registers.
+
+           reifyInlinedCallFrames() will be using emitSaveOrCopyLLIntBaselineCalleeSavesFor()
+           later where we need more temp registers.
+
+        4. Move the following functions to AssemblyHelpers.cpp.  They don't need to be
+           inline functions.  Speedometer2 and JetStream2 show that making these
+           non-inline does not hurt performance:
+
+                AssemblyHelpers::emitSave(const RegisterAtOffsetList&);
+                AssemblyHelpers::emitRestore(const RegisterAtOffsetList&);
+                AssemblyHelpers::emitSaveCalleeSavesFor(const RegisterAtOffsetList*);
+                AssemblyHelpers::emitSaveOrCopyCalleeSavesFor(...);
+                AssemblyHelpers::emitRestoreCalleeSavesFor(const RegisterAtOffsetList*);
+                AssemblyHelpers::copyLLIntBaselineCalleeSavesFromFrameOrRegisterToEntryFrameCalleeSavesBuffer(...);
+
+           Also renamed emitSaveOrCopyCalleeSavesFor() to emitSaveOrCopyLLIntBaselineCalleeSavesFor()
+           because it is only used with baseline codeBlocks.
+
+        Results:
+        Cumulative LinkBuffer profile sizes shrank by ~2.4 MB in aggregate:
+
+                            base                           new
+                            ====                           ===
+               BaselineJIT: 83827048 (79.943703 MB)     => 83718736 (79.840408 MB)
+                       DFG: 56594836 (53.973042 MB)     => 56603508 (53.981312 MB)
+               InlineCache: 33923900 (32.352352 MB)     => 33183156 (31.645924 MB)
+                       FTL: 6770956 (6.457287 MB)       => 6568964 (6.264652 MB)
+                DFGOSRExit: 5212096 (4.970642 MB)       => 3728088 (3.555382 MB)
+                    CSSJIT: 748428 (730.886719 KB)      => 748428 (730.886719 KB)
+                FTLOSRExit: 692276 (676.050781 KB)      => 656884 (641.488281 KB)
+                   YarrJIT: 445280 (434.843750 KB)      => 512988 (500.964844 KB)
+                  FTLThunk: 22908 (22.371094 KB)        => 22556 (22.027344 KB)
+        BoundFunctionThunk: 8400 (8.203125 KB)          => 10088 (9.851562 KB)
+             ExtraCTIThunk: 6952 (6.789062 KB)          => 6824 (6.664062 KB)
+          SpecializedThunk: 4508 (4.402344 KB)          => 4508 (4.402344 KB)
+                     Thunk: 3912 (3.820312 KB)          => 3784 (3.695312 KB)
+                LLIntThunk: 2908 (2.839844 KB)          => 2908 (2.839844 KB)
+              VirtualThunk: 1248 (1.218750 KB)          => 1248 (1.218750 KB)
+                  DFGThunk: 1084 (1.058594 KB)          => 444
+               DFGOSREntry: 216                         => 184
+                JumpIsland: 0
+                 WasmThunk: 0
+                      Wasm: 0
+             Uncategorized: 0
+                     Total: 188266956 (179.545361 MB)   => 185773296 (177.167221 MB)
+
+        Speedometer2 and JetStream2 results show that performance is neutral for this
+        patch (as measured on an M1 Mac):
+
+        Speedometer2:
+        ----------------------------------------------------------------------------------------------------------------------------------
+        |               subtest                |     ms      |     ms      |  b / a   | pValue (significance using False Discovery Rate) |
+        ----------------------------------------------------------------------------------------------------------------------------------
+        | Elm-TodoMVC                          |129.037500   |127.212500   |0.985857  | 0.012706                                         |
+        | VueJS-TodoMVC                        |28.312500    |27.525000    |0.972185  | 0.240315                                         |
+        | EmberJS-TodoMVC                      |132.550000   |132.025000   |0.996039  | 0.538034                                         |
+        | Flight-TodoMVC                       |80.762500    |80.875000    |1.001393  | 0.914749                                         |
+        | BackboneJS-TodoMVC                   |51.637500    |51.175000    |0.991043  | 0.285427                                         |
+        | Preact-TodoMVC                       |21.025000    |22.075000    |1.049941  | 0.206140                                         |
+        | AngularJS-TodoMVC                    |142.900000   |142.887500   |0.999913  | 0.990681                                         |
+        | Inferno-TodoMVC                      |69.300000    |69.775000    |1.006854  | 0.505201                                         |
+        | Vanilla-ES2015-TodoMVC               |71.500000    |71.225000    |0.996154  | 0.608650                                         |
+        | Angular2-TypeScript-TodoMVC          |43.287500    |43.275000    |0.999711  | 0.987926                                         |
+        | VanillaJS-TodoMVC                    |57.212500    |57.812500    |1.010487  | 0.333357                                         |
+        | jQuery-TodoMVC                       |276.150000   |276.775000   |1.002263  | 0.614404                                         |
+        | EmberJS-Debug-TodoMVC                |353.612500   |352.762500   |0.997596  | 0.518836                                         |
+        | React-TodoMVC                        |93.637500    |92.637500    |0.989321  | 0.036277                                         |
+        | React-Redux-TodoMVC                  |158.237500   |156.587500   |0.989573  | 0.042154                                         |
+        | Vanilla-ES2015-Babel-Webpack-TodoMVC |68.050000    |68.087500    |1.000551  | 0.897149                                         |
+        ----------------------------------------------------------------------------------------------------------------------------------
+
+        a mean = 236.26950
+        b mean = 236.57964
+        pValue = 0.7830785938
+        (Bigger means are better.)
+        1.001 times better
+        Results ARE NOT significant
+
+        JetStream2:
+        -------------------------------------------------------------------------------------------------------------------------
+        |          subtest          |     pts      |     pts      |  b / a   | pValue (significance using False Discovery Rate) |
+        -------------------------------------------------------------------------------------------------------------------------
+        | gaussian-blur             |542.570057    |542.671885    |1.000188  | 0.982573                                         |
+        | HashSet-wasm              |57.710498     |64.406371     |1.116025  | 0.401424                                         |
+        | gcc-loops-wasm            |44.516009     |44.453535     |0.998597  | 0.973651                                         |
+        | json-parse-inspector      |241.275085    |240.720491    |0.997701  | 0.704732                                         |
+        | prepack-wtb               |62.640114     |63.754878     |1.017796  | 0.205840                                         |
+        | date-format-xparb-SP      |416.976817    |448.921409    |1.076610  | 0.052977                                         |
+        | WSL                       |1.555257      |1.570233      |1.009629  | 0.427924                                         |
+        | OfflineAssembler          |177.052352    |179.746511    |1.015217  | 0.112114                                         |
+        | cdjs                      |192.517586    |194.598906    |1.010811  | 0.025807                                         |
+        | UniPoker                  |514.023694    |526.111500    |1.023516  | 0.269892                                         |
+        | json-stringify-inspector  |227.584725    |223.619390    |0.982576  | 0.102714                                         |
+        | crypto-sha1-SP            |980.728788    |984.192104    |1.003531  | 0.838618                                         |
+        | Basic                     |685.148483    |711.590247    |1.038593  | 0.142952                                         |
+        | chai-wtb                  |106.256376    |106.590318    |1.003143  | 0.865894                                         |
+        | crypto-aes-SP             |722.308829    |728.702310    |1.008851  | 0.486766                                         |
+        | Babylon                   |655.857561    |654.204901    |0.997480  | 0.931520                                         |
+        | string-unpack-code-SP     |407.837271    |405.710752    |0.994786  | 0.729122                                         |
+        | stanford-crypto-aes       |456.906021    |449.993856    |0.984872  | 0.272994                                         |
+        | raytrace                  |883.911335    |902.887238    |1.021468  | 0.189785                                         |
+        | multi-inspector-code-load |409.997347    |405.643639    |0.989381  | 0.644447                                         |
+        | hash-map                  |593.590160    |601.576332    |1.013454  | 0.249414                                         |
+        | stanford-crypto-pbkdf2    |722.178638    |728.283532    |1.008453  | 0.661195                                         |
+        | coffeescript-wtb          |42.393544     |41.869545     |0.987640  | 0.197441                                         |
+        | Box2D                     |452.034685    |454.104868    |1.004580  | 0.535342                                         |
+        | richards-wasm             |140.873688    |148.394050    |1.053384  | 0.303651                                         |
+        | lebab-wtb                 |61.671318     |62.119403     |1.007266  | 0.620998                                         |
+        | tsf-wasm                  |108.592794    |119.498398    |1.100427  | 0.504710                                         |
+        | base64-SP                 |629.744643    |603.425565    |0.958207  | 0.049997                                         |
+        | navier-stokes             |740.588523    |739.951662    |0.999140  | 0.871445                                         |
+        | jshint-wtb                |51.938359     |52.651104     |1.013723  | 0.217137                                         |
+        | regex-dna-SP              |459.251148    |463.492489    |1.009235  | 0.371891                                         |
+        | async-fs                  |235.853820    |236.031189    |1.000752  | 0.938459                                         |
+        | first-inspector-code-load |275.298325    |274.172125    |0.995909  | 0.623403                                         |
+        | segmentation              |44.002842     |43.445960     |0.987344  | 0.207134                                         |
+        | typescript                |26.360161     |26.458820     |1.003743  | 0.609942                                         |
+        | octane-code-load          |1126.749036   |1087.132024   |0.964840  | 0.524171                                         |
+        | float-mm.c                |16.691935     |16.721354     |1.001762  | 0.194425                                         |
+        | quicksort-wasm            |461.630091    |450.161127    |0.975156  | 0.371394                                         |
+        | Air                       |392.442375    |412.201810    |1.050350  | 0.046887                                         |
+        | splay                     |510.111886    |475.131657    |0.931426  | 0.024732                                         |
+        | ai-astar                  |607.966974    |626.573181    |1.030604  | 0.468711                                         |
+        | acorn-wtb                 |67.510766     |68.143956     |1.009379  | 0.481663                                         |
+        | gbemu                     |144.133842    |145.620304    |1.010313  | 0.802154                                         |
+        | richards                  |963.475078    |946.658879    |0.982546  | 0.231189                                         |
+        | 3d-cube-SP                |549.426784    |550.479154    |1.001915  | 0.831307                                         |
+        | espree-wtb                |68.707483     |73.762202     |1.073569  | 0.033603                                         |
+        | bomb-workers              |96.882596     |96.116121     |0.992089  | 0.687952                                         |
+        | tagcloud-SP               |309.888767    |303.538511    |0.979508  | 0.187768                                         |
+        | mandreel                  |133.667031    |135.009929    |1.010047  | 0.075232                                         |
+        | 3d-raytrace-SP            |491.967649    |492.528992    |1.001141  | 0.957842                                         |
+        | delta-blue                |1066.718312   |1080.230772   |1.012667  | 0.549382                                         |
+        | ML                        |139.617293    |140.088630    |1.003376  | 0.661651                                         |
+        | regexp                    |351.773956    |351.075935    |0.998016  | 0.769250                                         |
+        | crypto                    |1510.474663   |1519.218842   |1.005789  | 0.638420                                         |
+        | crypto-md5-SP             |795.447899    |774.082493    |0.973140  | 0.079728                                         |
+        | earley-boyer              |812.574545    |870.678372    |1.071506  | 0.044081                                         |
+        | octane-zlib               |25.162470     |25.660261     |1.019783  | 0.554591                                         |
+        | date-format-tofte-SP      |395.296135    |398.008992    |1.006863  | 0.650475                                         |
+        | n-body-SP                 |1165.386611   |1150.525110   |0.987248  | 0.227908                                         |
+        | pdfjs                     |189.060252    |191.015628    |1.010343  | 0.633777                                         |
+        | FlightPlanner             |908.426192    |903.636642    |0.994728  | 0.838821                                         |
+        | uglify-js-wtb             |34.029399     |34.164342     |1.003965  | 0.655652                                         |
+        | babylon-wtb               |81.329869     |80.855680     |0.994170  | 0.854393                                         |
+        | stanford-crypto-sha256    |826.850533    |838.494164    |1.014082  | 0.579636                                         |
+        -------------------------------------------------------------------------------------------------------------------------
+
+        a mean = 237.91084
+        b mean = 239.92670
+        pValue = 0.0657710897
+        (Bigger means are better.)
+        1.008 times better
+        Results ARE NOT significant
+
+        * CMakeLists.txt:
+        * JavaScriptCore.xcodeproj/project.pbxproj:
+        * assembler/MacroAssembler.h:
+        (JSC::MacroAssembler::pushToSaveByteOffset):
+        * assembler/MacroAssemblerARM64.h:
+        (JSC::MacroAssemblerARM64::pushToSaveByteOffset):
+        * dfg/DFGOSRExit.cpp:
+        (JSC::DFG::OSRExit::compileExit):
+        * dfg/DFGOSRExitCompilerCommon.cpp:
+        (JSC::DFG::reifyInlinedCallFrames):
+        * dfg/DFGThunks.cpp:
+        (JSC::DFG::osrExitGenerationThunkGenerator):
+        * ftl/FTLSaveRestore.cpp:
+        (JSC::FTL::saveAllRegisters):
+        (JSC::FTL::restoreAllRegisters):
+        * ftl/FTLSaveRestore.h:
+        * ftl/FTLThunks.cpp:
+        (JSC::FTL::genericGenerationThunkGenerator):
+        (JSC::FTL::slowPathCallThunkGenerator):
+        * jit/AssemblyHelpers.cpp:
+        (JSC::AssemblyHelpers::restoreCalleeSavesFromEntryFrameCalleeSavesBuffer):
+        (JSC::AssemblyHelpers::copyCalleeSavesToEntryFrameCalleeSavesBufferImpl):
+        (JSC::AssemblyHelpers::emitSave):
+        (JSC::AssemblyHelpers::emitRestore):
+        (JSC::AssemblyHelpers::emitSaveCalleeSavesFor):
+        (JSC::AssemblyHelpers::emitRestoreCalleeSavesFor):
+        (JSC::AssemblyHelpers::copyLLIntBaselineCalleeSavesFromFrameOrRegisterToEntryFrameCalleeSavesBuffer):
+        (JSC::AssemblyHelpers::emitSaveOrCopyLLIntBaselineCalleeSavesFor):
+        * jit/AssemblyHelpers.h:
+        (JSC::AssemblyHelpers::copyLLIntBaselineCalleeSavesFromFrameOrRegisterToEntryFrameCalleeSavesBuffer):
+        (JSC::AssemblyHelpers::emitSave): Deleted.
+        (JSC::AssemblyHelpers::emitRestore): Deleted.
+        (JSC::AssemblyHelpers::emitSaveOrCopyCalleeSavesFor): Deleted.
+        * jit/AssemblyHelpersSpoolers.h: Added.
+        (JSC::AssemblyHelpers::Spooler::Spooler):
+        (JSC::AssemblyHelpers::Spooler::handleGPR):
+        (JSC::AssemblyHelpers::Spooler::finalizeGPR):
+        (JSC::AssemblyHelpers::Spooler::handleFPR):
+        (JSC::AssemblyHelpers::Spooler::finalizeFPR):
+        (JSC::AssemblyHelpers::Spooler::op):
+        (JSC::AssemblyHelpers::LoadRegSpooler::LoadRegSpooler):
+        (JSC::AssemblyHelpers::LoadRegSpooler::loadGPR):
+        (JSC::AssemblyHelpers::LoadRegSpooler::finalizeGPR):
+        (JSC::AssemblyHelpers::LoadRegSpooler::loadFPR):
+        (JSC::AssemblyHelpers::LoadRegSpooler::finalizeFPR):
+        (JSC::AssemblyHelpers::LoadRegSpooler::handlePair):
+        (JSC::AssemblyHelpers::LoadRegSpooler::handleSingle):
+        (JSC::AssemblyHelpers::StoreRegSpooler::StoreRegSpooler):
+        (JSC::AssemblyHelpers::StoreRegSpooler::storeGPR):
+        (JSC::AssemblyHelpers::StoreRegSpooler::finalizeGPR):
+        (JSC::AssemblyHelpers::StoreRegSpooler::storeFPR):
+        (JSC::AssemblyHelpers::StoreRegSpooler::finalizeFPR):
+        (JSC::AssemblyHelpers::StoreRegSpooler::handlePair):
+        (JSC::AssemblyHelpers::StoreRegSpooler::handleSingle):
+        (JSC::RegDispatch<GPRReg>::get):
+        (JSC::RegDispatch<GPRReg>::temp1):
+        (JSC::RegDispatch<GPRReg>::temp2):
+        (JSC::RegDispatch<GPRReg>::regToStore):
+        (JSC::RegDispatch<GPRReg>::invalid):
+        (JSC::RegDispatch<GPRReg>::regSize):
+        (JSC::RegDispatch<GPRReg>::isValidLoadPairImm):
+        (JSC::RegDispatch<GPRReg>::isValidStorePairImm):
+        (JSC::RegDispatch<FPRReg>::get):
+        (JSC::RegDispatch<FPRReg>::temp1):
+        (JSC::RegDispatch<FPRReg>::temp2):
+        (JSC::RegDispatch<FPRReg>::regToStore):
+        (JSC::RegDispatch<FPRReg>::invalid):
+        (JSC::RegDispatch<FPRReg>::regSize):
+        (JSC::RegDispatch<FPRReg>::isValidLoadPairImm):
+        (JSC::RegDispatch<FPRReg>::isValidStorePairImm):
+        (JSC::AssemblyHelpers::CopySpooler::Source::getReg):
+        (JSC::AssemblyHelpers::CopySpooler::CopySpooler):
+        (JSC::AssemblyHelpers::CopySpooler::temp1 const):
+        (JSC::AssemblyHelpers::CopySpooler::temp2 const):
+        (JSC::AssemblyHelpers::CopySpooler::regToStore):
+        (JSC::AssemblyHelpers::CopySpooler::invalid):
+        (JSC::AssemblyHelpers::CopySpooler::regSize):
+        (JSC::AssemblyHelpers::CopySpooler::isValidLoadPairImm):
+        (JSC::AssemblyHelpers::CopySpooler::isValidStorePairImm):
+        (JSC::AssemblyHelpers::CopySpooler::load):
+        (JSC::AssemblyHelpers::CopySpooler::move):
+        (JSC::AssemblyHelpers::CopySpooler::copy):
+        (JSC::AssemblyHelpers::CopySpooler::store):
+        (JSC::AssemblyHelpers::CopySpooler::flush):
+        (JSC::AssemblyHelpers::CopySpooler::loadGPR):
+        (JSC::AssemblyHelpers::CopySpooler::copyGPR):
+        (JSC::AssemblyHelpers::CopySpooler::moveConstant):
+        (JSC::AssemblyHelpers::CopySpooler::storeGPR):
+        (JSC::AssemblyHelpers::CopySpooler::finalizeGPR):
+        (JSC::AssemblyHelpers::CopySpooler::loadFPR):
+        (JSC::AssemblyHelpers::CopySpooler::copyFPR):
+        (JSC::AssemblyHelpers::CopySpooler::storeFPR):
+        (JSC::AssemblyHelpers::CopySpooler::finalizeFPR):
+        (JSC::AssemblyHelpers::CopySpooler::loadPair):
+        (JSC::AssemblyHelpers::CopySpooler::storePair):
+        * jit/ScratchRegisterAllocator.cpp:
+        (JSC::ScratchRegisterAllocator::preserveReusedRegistersByPushing):
+        (JSC::ScratchRegisterAllocator::restoreReusedRegistersByPopping):
+        (JSC::ScratchRegisterAllocator::preserveRegistersToStackForCall):
+        (JSC::ScratchRegisterAllocator::restoreRegistersFromStackForCall):
+        * jit/ScratchRegisterAllocator.h:
+        * wasm/WasmAirIRGenerator.cpp:
+        (JSC::Wasm::AirIRGenerator::addReturn):
+        * wasm/WasmB3IRGenerator.cpp:
+        (JSC::Wasm::B3IRGenerator::addReturn):
+
</ins><span class="cx"> 2021-06-24  Yusuke Suzuki  <ysuzuki@apple.com>
</span><span class="cx"> 
</span><span class="cx">         Unreviewed, build fix for ARM64
</span></span></pre></div>
<a id="trunkSourceJavaScriptCoreJavaScriptCorexcodeprojprojectpbxproj"></a>
<div class="modfile"><h4>Modified: trunk/Source/JavaScriptCore/JavaScriptCore.xcodeproj/project.pbxproj (279255 => 279256)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/Source/JavaScriptCore/JavaScriptCore.xcodeproj/project.pbxproj     2021-06-25 00:06:06 UTC (rev 279255)
+++ trunk/Source/JavaScriptCore/JavaScriptCore.xcodeproj/project.pbxproj        2021-06-25 00:06:56 UTC (rev 279256)
</span><span class="lines">@@ -1149,6 +1149,7 @@
</span><span class="cx">          65B8392E1BACAD360044E824 /* CachedRecovery.h in Headers */ = {isa = PBXBuildFile; fileRef = 65B8392C1BACA92A0044E824 /* CachedRecovery.h */; };
</span><span class="cx">          6A38CFAA1E32B5AB0060206F /* AsyncStackTrace.h in Headers */ = {isa = PBXBuildFile; fileRef = 6A38CFA81E32B58B0060206F /* AsyncStackTrace.h */; };
</span><span class="cx">          6AD2CB4D19B9140100065719 /* DebuggerEvalEnabler.h in Headers */ = {isa = PBXBuildFile; fileRef = 6AD2CB4C19B9140100065719 /* DebuggerEvalEnabler.h */; settings = {ATTRIBUTES = (Private, ); }; };
</span><ins>+               6B767E7B26791F270017F8D1 /* AssemblyHelpersSpoolers.h in Headers */ = {isa = PBXBuildFile; fileRef = 6B767E7A26791F270017F8D1 /* AssemblyHelpersSpoolers.h */; settings = {ATTRIBUTES = (Private, ); }; };
</ins><span class="cx">           6BCCEC0425D1FA27000F391D /* VerifierSlotVisitorInlines.h in Headers */ = {isa = PBXBuildFile; fileRef = 6BCCEC0325D1FA27000F391D /* VerifierSlotVisitorInlines.h */; };
</span><span class="cx">          70113D4C1A8DB093003848C4 /* IteratorOperations.h in Headers */ = {isa = PBXBuildFile; fileRef = 70113D4A1A8DB093003848C4 /* IteratorOperations.h */; settings = {ATTRIBUTES = (Private, ); }; };
</span><span class="cx">          7013CA8C1B491A9400CAE613 /* JSMicrotask.h in Headers */ = {isa = PBXBuildFile; fileRef = 7013CA8A1B491A9400CAE613 /* JSMicrotask.h */; settings = {ATTRIBUTES = (Private, ); }; };
</span><span class="lines">@@ -3999,6 +4000,7 @@
</span><span class="cx">          6A38CFA81E32B58B0060206F /* AsyncStackTrace.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; path = AsyncStackTrace.h; sourceTree = "<group>"; };
</span><span class="cx">          6AD2CB4C19B9140100065719 /* DebuggerEvalEnabler.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; path = DebuggerEvalEnabler.h; sourceTree = "<group>"; };
</span><span class="cx">          6B731CC02647A8370014646F /* SlowPathCall.cpp */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.cpp.cpp; path = SlowPathCall.cpp; sourceTree = "<group>"; };
</span><ins>+               6B767E7A26791F270017F8D1 /* AssemblyHelpersSpoolers.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; path = AssemblyHelpersSpoolers.h; sourceTree = "<group>"; };
</ins><span class="cx">           6BA93C9590484C5BAD9316EA /* JSScriptFetcher.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; path = JSScriptFetcher.h; sourceTree = "<group>"; };
</span><span class="cx">          6BCCEC0325D1FA27000F391D /* VerifierSlotVisitorInlines.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; path = VerifierSlotVisitorInlines.h; sourceTree = "<group>"; };
</span><span class="cx">          70113D491A8DB093003848C4 /* IteratorOperations.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; path = IteratorOperations.cpp; sourceTree = "<group>"; };
</span><span class="lines">@@ -6200,6 +6202,7 @@
</span><span class="cx">                          0FB57068267A642E0080FA8B /* HashableRegisterSet.h */,
</span><span class="cx">                          0F24E53B17EA9F5900ABB217 /* AssemblyHelpers.cpp */,
</span><span class="cx">                          0F24E53C17EA9F5900ABB217 /* AssemblyHelpers.h */,
</span><ins>+                               6B767E7A26791F270017F8D1 /* AssemblyHelpersSpoolers.h */,
</ins><span class="cx">                           723998F6265DBCDB0057867F /* BaselineJITPlan.cpp */,
</span><span class="cx">                          723998F5265DBCDB0057867F /* BaselineJITPlan.h */,
</span><span class="cx">                          0F64B26F1A784BAF006E4E66 /* BinarySwitch.cpp */,
</span><span class="lines">@@ -9154,6 +9157,7 @@
</span><span class="cx">                          FE912B5125311AD100FABDDF /* AbstractSlotVisitorInlines.h in Headers */,
</span><span class="cx">                          534E034E1E4D4B1600213F64 /* AccessCase.h in Headers */,
</span><span class="cx">                          E3BFD0BC1DAF808E0065DEA2 /* AccessCaseSnippetParams.h in Headers */,
</span><ins>+                               6B767E7B26791F270017F8D1 /* AssemblyHelpersSpoolers.h in Headers */,
</ins><span class="cx">                           5370B4F61BF26205005C40FC /* AdaptiveInferredPropertyValueWatchpointBase.h in Headers */,
</span><span class="cx">                          9168BD872447BA4E0080FFB4 /* AggregateError.h in Headers */,
</span><span class="cx">                          918E15C32447B22700447A56 /* AggregateErrorConstructor.h in Headers */,
</span></span></pre></div>
<a id="trunkSourceJavaScriptCoreassemblerMacroAssemblerh"></a>
<div class="modfile"><h4>Modified: trunk/Source/JavaScriptCore/assembler/MacroAssembler.h (279255 => 279256)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/Source/JavaScriptCore/assembler/MacroAssembler.h   2021-06-25 00:06:06 UTC (rev 279255)
+++ trunk/Source/JavaScriptCore/assembler/MacroAssembler.h      2021-06-25 00:06:56 UTC (rev 279256)
</span><span class="lines">@@ -337,7 +337,7 @@
</span><span class="cx">         addPtr(TrustedImm32(sizeof(double)), stackPointerRegister);
</span><span class="cx">     }
</span><span class="cx">     
</span><del>-    static ptrdiff_t pushToSaveByteOffset() { return sizeof(void*); }
</del><ins>+    static constexpr ptrdiff_t pushToSaveByteOffset() { return sizeof(void*); }
</ins><span class="cx"> #endif // !CPU(ARM64)
</span><span class="cx"> 
</span><span class="cx"> #if CPU(X86_64) || CPU(ARM64)
</span></span></pre></div>
<a id="trunkSourceJavaScriptCoreassemblerMacroAssemblerARM64h"></a>
<div class="modfile"><h4>Modified: trunk/Source/JavaScriptCore/assembler/MacroAssemblerARM64.h (279255 => 279256)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/Source/JavaScriptCore/assembler/MacroAssemblerARM64.h      2021-06-25 00:06:06 UTC (rev 279255)
+++ trunk/Source/JavaScriptCore/assembler/MacroAssemblerARM64.h 2021-06-25 00:06:56 UTC (rev 279256)
</span><span class="lines">@@ -2534,7 +2534,7 @@
</span><span class="cx">         storeDouble(src, stackPointerRegister);
</span><span class="cx">     }
</span><span class="cx"> 
</span><del>-    static ptrdiff_t pushToSaveByteOffset() { return 16; }
</del><ins>+    static constexpr ptrdiff_t pushToSaveByteOffset() { return 16; }
</ins><span class="cx"> 
</span><span class="cx">     // Register move operations:
</span><span class="cx"> 
</span></span></pre></div>
<a id="trunkSourceJavaScriptCoredfgDFGOSRExitcpp"></a>
<div class="modfile"><h4>Modified: trunk/Source/JavaScriptCore/dfg/DFGOSRExit.cpp (279255 => 279256)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/Source/JavaScriptCore/dfg/DFGOSRExit.cpp   2021-06-25 00:06:06 UTC (rev 279255)
+++ trunk/Source/JavaScriptCore/dfg/DFGOSRExit.cpp      2021-06-25 00:06:56 UTC (rev 279256)
</span><span class="lines">@@ -28,7 +28,7 @@
</span><span class="cx"> 
</span><span class="cx"> #if ENABLE(DFG_JIT)
</span><span class="cx"> 
</span><del>-#include "AssemblyHelpers.h"
</del><ins>+#include "AssemblyHelpersSpoolers.h"
</ins><span class="cx"> #include "BytecodeStructs.h"
</span><span class="cx"> #include "CheckpointOSRExitSideState.h"
</span><span class="cx"> #include "DFGGraph.h"
</span><span class="lines">@@ -562,6 +562,11 @@
</span><span class="cx">     // do this even for state that's already in the right place on the stack.
</span><span class="cx">     // It makes things simpler later.
</span><span class="cx"> 
</span><ins>+    bool inlineStackContainsActiveCheckpoint = exit.m_codeOrigin.inlineStackContainsActiveCheckpoint();
+    size_t firstTmpToRestoreEarly = operands.size() - operands.numberOfTmps();
+    if (!inlineStackContainsActiveCheckpoint)
+        firstTmpToRestoreEarly = operands.size(); // Don't eagerly restore.
+
</ins><span class="cx">     // The tag registers are needed to materialize recoveries below.
</span><span class="cx">     jit.emitMaterializeTagCheckRegisters();
</span><span class="cx"> 
</span><span class="lines">@@ -568,7 +573,8 @@
</span><span class="cx">     for (size_t index = 0; index < operands.size(); ++index) {
</span><span class="cx">         const ValueRecovery& recovery = operands[index];
</span><span class="cx"> 
</span><del>-        switch (recovery.technique()) {
</del><ins>+        auto currentTechnique = recovery.technique();
+        switch (currentTechnique) {
</ins><span class="cx">         case DisplacedInJSStack:
</span><span class="cx"> #if USE(JSVALUE64)
</span><span class="cx">         case CellDisplacedInJSStack:
</span><span class="lines">@@ -594,9 +600,12 @@
</span><span class="cx"> 
</span><span class="cx">         case Constant: {
</span><span class="cx"> #if USE(JSVALUE64)
</span><del>-            jit.move(AssemblyHelpers::TrustedImm64(JSValue::encode(recovery.constant())), GPRInfo::regT0);
-            jit.store64(GPRInfo::regT0, scratch + index);
-#else
</del><ins>+            if (index >= firstTmpToRestoreEarly) {
+                ASSERT(operands.operandForIndex(index).isTmp());
+                jit.move(AssemblyHelpers::TrustedImm64(JSValue::encode(recovery.constant())), GPRInfo::regT0);
+                jit.store64(GPRInfo::regT0, scratch + index);
+            }
+#else // not USE(JSVALUE64)
</ins><span class="cx">             jit.store32(
</span><span class="cx">                 AssemblyHelpers::TrustedImm32(recovery.constant().tag()),
</span><span class="cx">                 &bitwise_cast<EncodedValueDescriptor*>(scratch + index)->asBits.tag);
</span><span class="lines">@@ -766,7 +775,7 @@
</span><span class="cx">     if (exit.isExceptionHandler())
</span><span class="cx">         jit.copyCalleeSavesToEntryFrameCalleeSavesBuffer(vm.topEntryFrame);
</span><span class="cx"> 
</span><del>-    if (exit.m_codeOrigin.inlineStackContainsActiveCheckpoint()) {
</del><ins>+    if (inlineStackContainsActiveCheckpoint) {
</ins><span class="cx">         EncodedJSValue* tmpScratch = scratch + operands.tmpIndex(0);
</span><span class="cx">         jit.setupArguments<decltype(operationMaterializeOSRExitSideState)>(&vm, &exit, tmpScratch);
</span><span class="cx">         jit.prepareCallOperation(vm);
</span><span class="lines">@@ -776,6 +785,16 @@
</span><span class="cx"> 
</span><span class="cx">     // Do all data format conversions and store the results into the stack.
</span><span class="cx"> 
</span><ins>+#if USE(JSVALUE64)
+    constexpr GPRReg srcBufferGPR = GPRInfo::regT2;
+    constexpr GPRReg destBufferGPR = GPRInfo::regT3;
+    constexpr GPRReg undefinedGPR = GPRInfo::regT4;
+    bool undefinedGPRIsInitialized = false;
+
+    jit.move(CCallHelpers::TrustedImmPtr(scratch), srcBufferGPR);
+    jit.move(CCallHelpers::framePointerRegister, destBufferGPR);
+    CCallHelpers::CopySpooler spooler(CCallHelpers::CopySpooler::BufferRegs::AllowModification, jit, srcBufferGPR, destBufferGPR, GPRInfo::regT0, GPRInfo::regT1);
+#endif
</ins><span class="cx">     for (size_t index = 0; index < operands.size(); ++index) {
</span><span class="cx">         const ValueRecovery& recovery = operands[index];
</span><span class="cx">         Operand operand = operands.operandForIndex(index);
</span><span class="lines">@@ -786,6 +805,23 @@
</span><span class="cx">             continue;
</span><span class="cx"> 
</span><span class="cx">         switch (recovery.technique()) {
</span><ins>+        case Constant: {
+#if USE(JSVALUE64)
+            EncodedJSValue currentConstant = JSValue::encode(recovery.constant());
+            if (currentConstant == encodedJSUndefined()) {
+                if (!undefinedGPRIsInitialized) {
+                    jit.move(CCallHelpers::TrustedImm64(encodedJSUndefined()), undefinedGPR);
+                    undefinedGPRIsInitialized = true;
+                }
+                spooler.copyGPR(undefinedGPR);
+            } else
+                spooler.moveConstant(currentConstant);
+            spooler.storeGPR(operand.virtualRegister().offset() * sizeof(CPURegister));
+            break;
+#else
+            FALLTHROUGH;
+#endif
+        }
</ins><span class="cx">         case DisplacedInJSStack:
</span><span class="cx">         case BooleanDisplacedInJSStack:
</span><span class="cx">         case Int32DisplacedInJSStack:
</span><span class="lines">@@ -795,7 +831,6 @@
</span><span class="cx">         case UnboxedInt32InGPR:
</span><span class="cx">         case UnboxedCellInGPR:
</span><span class="cx">         case UnboxedDoubleInFPR:
</span><del>-        case Constant:
</del><span class="cx">         case InFPR:
</span><span class="cx"> #if USE(JSVALUE64)
</span><span class="cx">         case InGPR:
</span><span class="lines">@@ -803,8 +838,8 @@
</span><span class="cx">         case Int52DisplacedInJSStack:
</span><span class="cx">         case UnboxedStrictInt52InGPR:
</span><span class="cx">         case StrictInt52DisplacedInJSStack:
</span><del>-            jit.load64(scratch + index, GPRInfo::regT0);
-            jit.store64(GPRInfo::regT0, AssemblyHelpers::addressFor(operand));
</del><ins>+            spooler.loadGPR(index * sizeof(CPURegister));
+            spooler.storeGPR(operand.virtualRegister().offset() * sizeof(CPURegister));
</ins><span class="cx">             break;
</span><span class="cx"> #else // not USE(JSVALUE64)
</span><span class="cx">         case InPair:
</span><span class="lines">@@ -833,6 +868,9 @@
</span><span class="cx">             break;
</span><span class="cx">         }
</span><span class="cx">     }
</span><ins>+#if USE(JSVALUE64)
+    spooler.finalizeGPR();
+#endif
</ins><span class="cx"> 
</span><span class="cx">     // Now that things on the stack are recovered, do the arguments recovery. We assume that arguments
</span><span class="cx">     // recoveries don't recursively refer to each other. But, we don't try to assume that they only
</span></span></pre></div>
<a id="trunkSourceJavaScriptCoredfgDFGOSRExitCompilerCommoncpp"></a>
<div class="modfile"><h4>Modified: trunk/Source/JavaScriptCore/dfg/DFGOSRExitCompilerCommon.cpp (279255 => 279256)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/Source/JavaScriptCore/dfg/DFGOSRExitCompilerCommon.cpp     2021-06-25 00:06:06 UTC (rev 279255)
+++ trunk/Source/JavaScriptCore/dfg/DFGOSRExitCompilerCommon.cpp        2021-06-25 00:06:56 UTC (rev 279256)
</span><span class="lines">@@ -284,7 +284,7 @@
</span><span class="cx">             jit.addPtr(AssemblyHelpers::TrustedImm32(sizeof(CallerFrameAndPC)), GPRInfo::callFrameRegister, GPRInfo::regT2);
</span><span class="cx">             jit.untagPtr(GPRInfo::regT2, GPRInfo::regT3);
</span><span class="cx">             jit.addPtr(AssemblyHelpers::TrustedImm32(inlineCallFrame->returnPCOffset() + sizeof(void*)), GPRInfo::callFrameRegister, GPRInfo::regT2);
</span><del>-            jit.validateUntaggedPtr(GPRInfo::regT3, GPRInfo::nonArgGPR0);
</del><ins>+            jit.validateUntaggedPtr(GPRInfo::regT3, GPRInfo::regT4);
</ins><span class="cx">             jit.tagPtr(GPRInfo::regT2, GPRInfo::regT3);
</span><span class="cx"> #endif
</span><span class="cx">             jit.storePtr(GPRInfo::regT3, AssemblyHelpers::addressForByteOffset(inlineCallFrame->returnPCOffset()));
</span><span class="lines">@@ -305,9 +305,9 @@
</span><span class="cx"> 
</span><span class="cx"> #if CPU(ARM64E)
</span><span class="cx">             jit.addPtr(AssemblyHelpers::TrustedImm32(inlineCallFrame->returnPCOffset() + sizeof(void*)), GPRInfo::callFrameRegister, GPRInfo::regT2);
</span><del>-            jit.move(AssemblyHelpers::TrustedImmPtr(jumpTarget.untaggedExecutableAddress()), GPRInfo::nonArgGPR0);
-            jit.tagPtr(GPRInfo::regT2, GPRInfo::nonArgGPR0);
-            jit.storePtr(GPRInfo::nonArgGPR0, AssemblyHelpers::addressForByteOffset(inlineCallFrame->returnPCOffset()));
</del><ins>+            jit.move(AssemblyHelpers::TrustedImmPtr(jumpTarget.untaggedExecutableAddress()), GPRInfo::regT4);
+            jit.tagPtr(GPRInfo::regT2, GPRInfo::regT4);
+            jit.storePtr(GPRInfo::regT4, AssemblyHelpers::addressForByteOffset(inlineCallFrame->returnPCOffset()));
</ins><span class="cx"> #else
</span><span class="cx">             jit.storePtr(AssemblyHelpers::TrustedImmPtr(jumpTarget.untaggedExecutableAddress()), AssemblyHelpers::addressForByteOffset(inlineCallFrame->returnPCOffset()));
</span><span class="cx"> #endif
</span><span class="lines">@@ -318,11 +318,11 @@
</span><span class="cx">         // Restore the inline call frame's callee save registers.
</span><span class="cx">         // If this inlined frame is a tail call that will return back to the original caller, we need to
</span><span class="cx">         // copy the prior contents of the tag registers already saved for the outer frame to this frame.
</span><del>-        jit.emitSaveOrCopyCalleeSavesFor(
</del><ins>+        jit.emitSaveOrCopyLLIntBaselineCalleeSavesFor(
</ins><span class="cx">             baselineCodeBlock,
</span><span class="cx">             static_cast<VirtualRegister>(inlineCallFrame->stackOffset),
</span><span class="cx">             trueCaller ? AssemblyHelpers::UseExistingTagRegisterContents : AssemblyHelpers::CopyBaselineCalleeSavedRegistersFromBaseFrame,
</span><del>-            GPRInfo::regT2);
</del><ins>+            GPRInfo::regT2, GPRInfo::regT1, GPRInfo::regT4);
</ins><span class="cx"> 
</span><span class="cx">         if (callerIsLLInt) {
</span><span class="cx">             CodeBlock* baselineCodeBlockForCaller = jit.baselineCodeBlockFor(*trueCaller);
</span></span></pre></div>
<a id="trunkSourceJavaScriptCoredfgDFGThunkscpp"></a>
<div class="modfile"><h4>Modified: trunk/Source/JavaScriptCore/dfg/DFGThunks.cpp (279255 => 279256)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/Source/JavaScriptCore/dfg/DFGThunks.cpp    2021-06-25 00:06:06 UTC (rev 279255)
+++ trunk/Source/JavaScriptCore/dfg/DFGThunks.cpp       2021-06-25 00:06:56 UTC (rev 279256)
</span><span class="lines">@@ -28,6 +28,7 @@
</span><span class="cx"> 
</span><span class="cx"> #if ENABLE(DFG_JIT)
</span><span class="cx"> 
</span><ins>+#include "AssemblyHelpersSpoolers.h"
</ins><span class="cx"> #include "CCallHelpers.h"
</span><span class="cx"> #include "DFGJITCode.h"
</span><span class="cx"> #include "DFGOSRExit.h"
</span><span class="lines">@@ -49,19 +50,45 @@
</span><span class="cx">     size_t scratchSize = sizeof(EncodedJSValue) * (GPRInfo::numberOfRegisters + FPRInfo::numberOfRegisters);
</span><span class="cx">     ScratchBuffer* scratchBuffer = vm.scratchBufferForSize(scratchSize);
</span><span class="cx">     EncodedJSValue* buffer = static_cast<EncodedJSValue*>(scratchBuffer->dataBuffer());
</span><del>-    
-    for (unsigned i = 0; i < GPRInfo::numberOfRegisters; ++i) {
</del><ins>+
+#if CPU(ARM64)
+    constexpr GPRReg bufferGPR = CCallHelpers::memoryTempRegister;
+    constexpr unsigned firstGPR = 0;
+#elif CPU(X86_64)
+    GPRReg bufferGPR = jit.scratchRegister();
+    constexpr unsigned firstGPR = 0;
+#else
+    GPRReg bufferGPR = GPRInfo::toRegister(0);
+    constexpr unsigned firstGPR = 1;
+#endif
+
+    if constexpr (firstGPR) {
+        // We're using the firstGPR as the bufferGPR, and need to save it manually.
+        RELEASE_ASSERT(GPRInfo::numberOfRegisters >= 1);
+        RELEASE_ASSERT(bufferGPR == GPRInfo::toRegister(0));
</ins><span class="cx"> #if USE(JSVALUE64)
</span><del>-        jit.store64(GPRInfo::toRegister(i), buffer + i);
</del><ins>+        jit.store64(bufferGPR, buffer);
</ins><span class="cx"> #else
</span><del>-        jit.store32(GPRInfo::toRegister(i), buffer + i);
</del><ins>+        jit.store32(bufferGPR, buffer);
</ins><span class="cx"> #endif
</span><span class="cx">     }
</span><ins>+
+    jit.move(CCallHelpers::TrustedImmPtr(buffer), bufferGPR);
+
+    CCallHelpers::StoreRegSpooler storeSpooler(jit, bufferGPR);
+
+    for (unsigned i = firstGPR; i < GPRInfo::numberOfRegisters; ++i) {
+        ptrdiff_t offset = i * sizeof(CPURegister);
+        storeSpooler.storeGPR({ GPRInfo::toRegister(i), offset });
+    }
+    storeSpooler.finalizeGPR();
+
</ins><span class="cx">     for (unsigned i = 0; i < FPRInfo::numberOfRegisters; ++i) {
</span><del>-        jit.move(MacroAssembler::TrustedImmPtr(buffer + GPRInfo::numberOfRegisters + i), GPRInfo::regT0);
-        jit.storeDouble(FPRInfo::toRegister(i), MacroAssembler::Address(GPRInfo::regT0));
</del><ins>+        ptrdiff_t offset = (GPRInfo::numberOfRegisters + i) * sizeof(double);
+        storeSpooler.storeFPR({ FPRInfo::toRegister(i), offset });
</ins><span class="cx">     }
</span><del>-    
</del><ins>+    storeSpooler.finalizeFPR();
+
</ins><span class="cx">     // Set up one argument.
</span><span class="cx">     jit.move(GPRInfo::callFrameRegister, GPRInfo::argumentGPR0);
</span><span class="cx">     jit.prepareCallOperation(vm);
</span><span class="lines">@@ -68,15 +95,28 @@
</span><span class="cx"> 
</span><span class="cx">     MacroAssembler::Call functionCall = jit.call(OperationPtrTag);
</span><span class="cx"> 
</span><ins>+    jit.move(CCallHelpers::TrustedImmPtr(buffer), bufferGPR);
+    CCallHelpers::LoadRegSpooler loadSpooler(jit, bufferGPR);
+
+    for (unsigned i = firstGPR; i < GPRInfo::numberOfRegisters; ++i) {
+        ptrdiff_t offset = i * sizeof(CPURegister);
+        loadSpooler.loadGPR({ GPRInfo::toRegister(i), offset });
+    }
+    loadSpooler.finalizeGPR();
+
</ins><span class="cx">     for (unsigned i = 0; i < FPRInfo::numberOfRegisters; ++i) {
</span><del>-        jit.move(MacroAssembler::TrustedImmPtr(buffer + GPRInfo::numberOfRegisters + i), GPRInfo::regT0);
-        jit.loadDouble(MacroAssembler::Address(GPRInfo::regT0), FPRInfo::toRegister(i));
</del><ins>+        ptrdiff_t offset = (GPRInfo::numberOfRegisters + i) * sizeof(double);
+        loadSpooler.loadFPR({ FPRInfo::toRegister(i), offset });
</ins><span class="cx">     }
</span><del>-    for (unsigned i = 0; i < GPRInfo::numberOfRegisters; ++i) {
</del><ins>+    loadSpooler.finalizeFPR();
+
+    if constexpr (firstGPR) {
+        // We're using the firstGPR as the bufferGPR, and need to restore it manually.
+        ASSERT(bufferGPR == GPRInfo::toRegister(0));
</ins><span class="cx"> #if USE(JSVALUE64)
</span><del>-        jit.load64(buffer + i, GPRInfo::toRegister(i));
</del><ins>+        jit.load64(buffer, bufferGPR);
</ins><span class="cx"> #else
</span><del>-        jit.load32(buffer + i, GPRInfo::toRegister(i));
</del><ins>+        jit.load32(buffer, bufferGPR);
</ins><span class="cx"> #endif
</span><span class="cx">     }
</span><span class="cx"> 
</span></span></pre></div>
<a id="trunkSourceJavaScriptCoreftlFTLSaveRestorecpp"></a>
<div class="modfile"><h4>Modified: trunk/Source/JavaScriptCore/ftl/FTLSaveRestore.cpp (279255 => 279256)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/Source/JavaScriptCore/ftl/FTLSaveRestore.cpp       2021-06-25 00:06:06 UTC (rev 279255)
+++ trunk/Source/JavaScriptCore/ftl/FTLSaveRestore.cpp  2021-06-25 00:06:56 UTC (rev 279256)
</span><span class="lines">@@ -1,5 +1,5 @@
</span><span class="cx"> /*
</span><del>- * Copyright (C) 2013, 2014 Apple Inc. All rights reserved.
</del><ins>+ * Copyright (C) 2013-2021 Apple Inc. All rights reserved.
</ins><span class="cx">  *
</span><span class="cx">  * Redistribution and use in source and binary forms, with or without
</span><span class="cx">  * modification, are permitted provided that the following conditions
</span><span class="lines">@@ -28,9 +28,10 @@
</span><span class="cx"> 
</span><span class="cx"> #if ENABLE(FTL_JIT)
</span><span class="cx"> 
</span><ins>+#include "AssemblyHelpersSpoolers.h"
</ins><span class="cx"> #include "FPRInfo.h"
</span><span class="cx"> #include "GPRInfo.h"
</span><del>-#include "MacroAssembler.h"
</del><ins>+#include "Reg.h"
</ins><span class="cx"> #include "RegisterSet.h"
</span><span class="cx"> 
</span><span class="cx"> namespace JSC { namespace FTL {
</span><span class="lines">@@ -77,70 +78,104 @@
</span><span class="cx">     {
</span><span class="cx">         special = RegisterSet::stackRegisters();
</span><span class="cx">         special.merge(RegisterSet::reservedHardwareRegisters());
</span><del>-        
</del><ins>+
</ins><span class="cx">         first = MacroAssembler::firstRegister();
</span><span class="cx">         while (special.get(first))
</span><span class="cx">             first = MacroAssembler::nextRegister(first);
</span><del>-        second = MacroAssembler::nextRegister(first);
-        while (special.get(second))
-            second = MacroAssembler::nextRegister(second);
</del><span class="cx">     }
</span><del>-    
</del><ins>+
+    GPRReg nextRegister(GPRReg current)
+    {
+        auto next = MacroAssembler::nextRegister(current);
+        while (special.get(next))
+            next = MacroAssembler::nextRegister(next);
+        return next;
+    }
+
+    FPRReg nextFPRegister(FPRReg current)
+    {
+        auto next = MacroAssembler::nextFPRegister(current);
+        while (special.get(next))
+            next = MacroAssembler::nextFPRegister(next);
+        return next;
+    }
+
</ins><span class="cx">     RegisterSet special;
</span><span class="cx">     GPRReg first;
</span><del>-    GPRReg second;
</del><span class="cx"> };
</span><span class="cx"> 
</span><span class="cx"> } // anonymous namespace
</span><span class="cx"> 
</span><del>-void saveAllRegisters(MacroAssembler& jit, char* scratchMemory)
</del><ins>+void saveAllRegisters(AssemblyHelpers& jit, char* scratchMemory)
</ins><span class="cx"> {
</span><span class="cx">     Regs regs;
</span><span class="cx">     
</span><span class="cx">     // Get the first register out of the way, so that we can use it as a pointer.
</span><del>-    jit.poke64(regs.first, 0);
-    jit.move(MacroAssembler::TrustedImmPtr(scratchMemory), regs.first);
-    
</del><ins>+    GPRReg baseGPR = regs.first;
+#if CPU(ARM64)
+    GPRReg nextGPR = regs.nextRegister(baseGPR);
+    GPRReg firstToSaveGPR = regs.nextRegister(nextGPR);
+    ASSERT(baseGPR == ARM64Registers::x0);
+    ASSERT(nextGPR == ARM64Registers::x1);
+#else
+    GPRReg firstToSaveGPR = regs.nextRegister(baseGPR);
+#endif
+    jit.poke64(baseGPR, 0);
+    jit.move(MacroAssembler::TrustedImmPtr(scratchMemory), baseGPR);
+
+    AssemblyHelpers::StoreRegSpooler spooler(jit, baseGPR);
+
</ins><span class="cx">     // Get all of the other GPRs out of the way.
</span><del>-    for (MacroAssembler::RegisterID reg = regs.second; reg <= MacroAssembler::lastRegister(); reg = MacroAssembler::nextRegister(reg)) {
-        if (regs.special.get(reg))
-            continue;
-        jit.store64(reg, MacroAssembler::Address(regs.first, offsetOfGPR(reg)));
-    }
</del><ins>+    for (GPRReg reg = firstToSaveGPR; reg <= MacroAssembler::lastRegister(); reg = regs.nextRegister(reg))
+        spooler.storeGPR({ reg, static_cast<ptrdiff_t>(offsetOfGPR(reg)) });
+    spooler.finalizeGPR();
</ins><span class="cx">     
</span><span class="cx">     // Restore the first register into the second one and save it.
</span><del>-    jit.peek64(regs.second, 0);
-    jit.store64(regs.second, MacroAssembler::Address(regs.first, offsetOfGPR(regs.first)));
</del><ins>+    jit.peek64(firstToSaveGPR, 0);
+#if CPU(ARM64)
+    jit.storePair64(firstToSaveGPR, nextGPR, baseGPR, AssemblyHelpers::TrustedImm32(offsetOfGPR(baseGPR)));
+#else
+    jit.store64(firstToSaveGPR, MacroAssembler::Address(baseGPR, offsetOfGPR(baseGPR)));
+#endif
</ins><span class="cx">     
</span><span class="cx">     // Finally save all FPR's.
</span><del>-    for (MacroAssembler::FPRegisterID reg = MacroAssembler::firstFPRegister(); reg <= MacroAssembler::lastFPRegister(); reg = MacroAssembler::nextFPRegister(reg)) {
-        if (regs.special.get(reg))
-            continue;
-        jit.storeDouble(reg, MacroAssembler::Address(regs.first, offsetOfFPR(reg)));
-    }
</del><ins>+    for (MacroAssembler::FPRegisterID reg = MacroAssembler::firstFPRegister(); reg <= MacroAssembler::lastFPRegister(); reg = regs.nextFPRegister(reg))
+        spooler.storeFPR({ reg, static_cast<ptrdiff_t>(offsetOfFPR(reg)) });
+    spooler.finalizeFPR();
</ins><span class="cx"> }
</span><span class="cx"> 
</span><del>-void restoreAllRegisters(MacroAssembler& jit, char* scratchMemory)
</del><ins>+void restoreAllRegisters(AssemblyHelpers& jit, char* scratchMemory)
</ins><span class="cx"> {
</span><span class="cx">     Regs regs;
</span><span class="cx">     
</span><span class="cx">     // Give ourselves a pointer to the scratch memory.
</span><del>-    jit.move(MacroAssembler::TrustedImmPtr(scratchMemory), regs.first);
</del><ins>+    GPRReg baseGPR = regs.first;
+    jit.move(MacroAssembler::TrustedImmPtr(scratchMemory), baseGPR);
</ins><span class="cx">     
</span><ins>+    AssemblyHelpers::LoadRegSpooler spooler(jit, baseGPR);
+
</ins><span class="cx">     // Restore all FPR's.
</span><del>-    for (MacroAssembler::FPRegisterID reg = MacroAssembler::firstFPRegister(); reg <= MacroAssembler::lastFPRegister(); reg = MacroAssembler::nextFPRegister(reg)) {
-        if (regs.special.get(reg))
-            continue;
-        jit.loadDouble(MacroAssembler::Address(regs.first, offsetOfFPR(reg)), reg);
-    }
</del><ins>+    for (MacroAssembler::FPRegisterID reg = MacroAssembler::firstFPRegister(); reg <= MacroAssembler::lastFPRegister(); reg = regs.nextFPRegister(reg))
+        spooler.loadFPR({ reg, static_cast<ptrdiff_t>(offsetOfFPR(reg)) });
+    spooler.finalizeFPR();
</ins><span class="cx">     
</span><del>-    for (MacroAssembler::RegisterID reg = regs.second; reg <= MacroAssembler::lastRegister(); reg = MacroAssembler::nextRegister(reg)) {
-        if (regs.special.get(reg))
-            continue;
-        jit.load64(MacroAssembler::Address(regs.first, offsetOfGPR(reg)), reg);
-    }
-    
-    jit.load64(MacroAssembler::Address(regs.first, offsetOfGPR(regs.first)), regs.first);
</del><ins>+#if CPU(ARM64)
+    GPRReg nextGPR = regs.nextRegister(baseGPR);
+    GPRReg firstToRestoreGPR = regs.nextRegister(nextGPR);
+    ASSERT(baseGPR == ARM64Registers::x0);
+    ASSERT(nextGPR == ARM64Registers::x1);
+#else
+    GPRReg firstToRestoreGPR = regs.nextRegister(baseGPR);
+#endif
+    for (MacroAssembler::RegisterID reg = firstToRestoreGPR; reg <= MacroAssembler::lastRegister(); reg = regs.nextRegister(reg))
+        spooler.loadGPR({ reg, static_cast<ptrdiff_t>(offsetOfGPR(reg)) });
+    spooler.finalizeGPR();
+
+#if CPU(ARM64)
+    jit.loadPair64(baseGPR, AssemblyHelpers::TrustedImm32(offsetOfGPR(baseGPR)), baseGPR, nextGPR);
+#else
+    jit.load64(MacroAssembler::Address(baseGPR, offsetOfGPR(baseGPR)), baseGPR);
+#endif
</ins><span class="cx"> }
</span><span class="cx"> 
</span><span class="cx"> } } // namespace JSC::FTL
</span></span></pre></div>
<a id="trunkSourceJavaScriptCoreftlFTLSaveRestoreh"></a>
<div class="modfile"><h4>Modified: trunk/Source/JavaScriptCore/ftl/FTLSaveRestore.h (279255 => 279256)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/Source/JavaScriptCore/ftl/FTLSaveRestore.h 2021-06-25 00:06:06 UTC (rev 279255)
+++ trunk/Source/JavaScriptCore/ftl/FTLSaveRestore.h    2021-06-25 00:06:56 UTC (rev 279256)
</span><span class="lines">@@ -1,5 +1,5 @@
</span><span class="cx"> /*
</span><del>- * Copyright (C) 2013 Apple Inc. All rights reserved.
</del><ins>+ * Copyright (C) 2013-2021 Apple Inc. All rights reserved.
</ins><span class="cx">  *
</span><span class="cx">  * Redistribution and use in source and binary forms, with or without
</span><span class="cx">  * modification, are permitted provided that the following conditions
</span><span class="lines">@@ -33,7 +33,7 @@
</span><span class="cx"> 
</span><span class="cx"> namespace JSC {
</span><span class="cx"> 
</span><del>-class MacroAssembler;
</del><ins>+class AssemblyHelpers;
</ins><span class="cx"> 
</span><span class="cx"> namespace FTL {
</span><span class="cx"> 
</span><span class="lines">@@ -46,9 +46,9 @@
</span><span class="cx"> // Assumes that top-of-stack can be used as a pointer-sized scratchpad. Saves all of
</span><span class="cx"> // the registers into the scratch buffer such that RegisterID * sizeof(int64_t) is the
</span><span class="cx"> // offset of every register.
</span><del>-void saveAllRegisters(MacroAssembler& jit, char* scratchMemory);
</del><ins>+void saveAllRegisters(AssemblyHelpers& jit, char* scratchMemory);
</ins><span class="cx"> 
</span><del>-void restoreAllRegisters(MacroAssembler& jit, char* scratchMemory);
</del><ins>+void restoreAllRegisters(AssemblyHelpers& jit, char* scratchMemory);
</ins><span class="cx"> 
</span><span class="cx"> } } // namespace JSC::FTL
</span><span class="cx"> 
</span></span></pre></div>
<a id="trunkSourceJavaScriptCoreftlFTLThunkscpp"></a>
<div class="modfile"><h4>Modified: trunk/Source/JavaScriptCore/ftl/FTLThunks.cpp (279255 => 279256)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/Source/JavaScriptCore/ftl/FTLThunks.cpp    2021-06-25 00:06:06 UTC (rev 279255)
+++ trunk/Source/JavaScriptCore/ftl/FTLThunks.cpp       2021-06-25 00:06:56 UTC (rev 279256)
</span><span class="lines">@@ -28,7 +28,7 @@
</span><span class="cx"> 
</span><span class="cx"> #if ENABLE(FTL_JIT)
</span><span class="cx"> 
</span><del>-#include "AssemblyHelpers.h"
</del><ins>+#include "AssemblyHelpersSpoolers.h"
</ins><span class="cx"> #include "DFGOSRExitCompilerCommon.h"
</span><span class="cx"> #include "FTLOSRExitCompiler.h"
</span><span class="cx"> #include "FTLOperations.h"
</span><span class="lines">@@ -56,31 +56,34 @@
</span><span class="cx">     }
</span><span class="cx">     
</span><span class="cx">     // Note that the "return address" will be the ID that we pass to the generation function.
</span><del>-    
-    ptrdiff_t stackMisalignment = MacroAssembler::pushToSaveByteOffset();
-    
</del><ins>+
+    constexpr GPRReg stackPointerRegister = MacroAssembler::stackPointerRegister;
+    constexpr GPRReg framePointerRegister = MacroAssembler::framePointerRegister;
+    constexpr ptrdiff_t pushToSaveByteOffset = MacroAssembler::pushToSaveByteOffset();
+    ptrdiff_t stackMisalignment = pushToSaveByteOffset;
+
</ins><span class="cx">     // Pretend that we're a C call frame.
</span><del>-    jit.pushToSave(MacroAssembler::framePointerRegister);
-    jit.move(MacroAssembler::stackPointerRegister, MacroAssembler::framePointerRegister);
-    stackMisalignment += MacroAssembler::pushToSaveByteOffset();
-    
</del><ins>+    jit.pushToSave(framePointerRegister);
+    jit.move(stackPointerRegister, framePointerRegister);
+    stackMisalignment += pushToSaveByteOffset;
+
</ins><span class="cx">     // Now create ourselves enough stack space to give saveAllRegisters() a scratch slot.
</span><span class="cx">     unsigned numberOfRequiredPops = 0;
</span><span class="cx">     do {
</span><del>-        jit.pushToSave(GPRInfo::regT0);
-        stackMisalignment += MacroAssembler::pushToSaveByteOffset();
</del><ins>+        stackMisalignment += pushToSaveByteOffset;
</ins><span class="cx">         numberOfRequiredPops++;
</span><span class="cx">     } while (stackMisalignment % stackAlignmentBytes());
</span><del>-    
</del><ins>+    jit.subPtr(MacroAssembler::TrustedImm32(numberOfRequiredPops * pushToSaveByteOffset), stackPointerRegister);
+
</ins><span class="cx">     ScratchBuffer* scratchBuffer = vm.scratchBufferForSize(requiredScratchMemorySizeInBytes());
</span><span class="cx">     char* buffer = static_cast<char*>(scratchBuffer->dataBuffer());
</span><span class="cx">     
</span><span class="cx">     saveAllRegisters(jit, buffer);
</span><span class="cx"> 
</span><del>-    jit.loadPtr(GPRInfo::callFrameRegister, GPRInfo::argumentGPR0);
</del><ins>+    jit.loadPtr(framePointerRegister, GPRInfo::argumentGPR0);
</ins><span class="cx">     jit.peek(
</span><span class="cx">         GPRInfo::argumentGPR1,
</span><del>-        (stackMisalignment - MacroAssembler::pushToSaveByteOffset()) / sizeof(void*));
</del><ins>+        (stackMisalignment - pushToSaveByteOffset) / sizeof(void*));
</ins><span class="cx">     jit.prepareCallOperation(vm);
</span><span class="cx">     MacroAssembler::Call functionCall = jit.call(OperationPtrTag);
</span><span class="cx"> 
</span><span class="lines">@@ -92,15 +95,14 @@
</span><span class="cx">     jit.move(GPRInfo::returnValueGPR, GPRInfo::regT0);
</span><span class="cx"> 
</span><span class="cx">     // Prepare for tail call.
</span><del>-    while (numberOfRequiredPops--)
-        jit.popToRestore(GPRInfo::regT1);
-    jit.popToRestore(MacroAssembler::framePointerRegister);
</del><span class="cx"> 
</span><del>-    // When we came in here, there was an additional thing pushed to the stack. Some clients want it
-    // popped before proceeding.
-    while (extraPopsToRestore--)
-        jit.popToRestore(GPRInfo::regT1);
</del><ins>+    jit.loadPtr(MacroAssembler::Address(stackPointerRegister, numberOfRequiredPops * pushToSaveByteOffset), framePointerRegister);
</ins><span class="cx"> 
</span><ins>+    // When we came in here, there was an additional thing pushed to the stack (extraPopsToRestore).
+    // Some clients want it popped before proceeding. Also add 1 for the pushToSave of the framePointerRegister.
+    numberOfRequiredPops += 1 + extraPopsToRestore;
+    jit.addPtr(MacroAssembler::TrustedImm32(numberOfRequiredPops * pushToSaveByteOffset), stackPointerRegister);
+
</ins><span class="cx">     // Put the return address wherever the return instruction wants it. On all platforms, this
</span><span class="cx">     // ensures that the return address is out of the way of register restoration.
</span><span class="cx">     jit.restoreReturnAddressBeforeReturn(GPRInfo::regT0);
</span><span class="lines">@@ -177,21 +179,25 @@
</span><span class="cx"> #if CPU(X86_64)
</span><span class="cx">     currentOffset += sizeof(void*);
</span><span class="cx"> #endif
</span><del>-    
</del><ins>+
+    AssemblyHelpers::StoreRegSpooler storeSpooler(jit, MacroAssembler::stackPointerRegister);
+
</ins><span class="cx">     for (MacroAssembler::RegisterID reg = MacroAssembler::firstRegister(); reg <= MacroAssembler::lastRegister(); reg = static_cast<MacroAssembler::RegisterID>(reg + 1)) {
</span><span class="cx">         if (!key.usedRegisters().get(reg))
</span><span class="cx">             continue;
</span><del>-        jit.storePtr(reg, AssemblyHelpers::Address(MacroAssembler::stackPointerRegister, currentOffset));
</del><ins>+        storeSpooler.storeGPR({ reg, static_cast<ptrdiff_t>(currentOffset) });
</ins><span class="cx">         currentOffset += sizeof(void*);
</span><span class="cx">     }
</span><del>-    
</del><ins>+    storeSpooler.finalizeGPR();
+
</ins><span class="cx">     for (MacroAssembler::FPRegisterID reg = MacroAssembler::firstFPRegister(); reg <= MacroAssembler::lastFPRegister(); reg = static_cast<MacroAssembler::FPRegisterID>(reg + 1)) {
</span><span class="cx">         if (!key.usedRegisters().get(reg))
</span><span class="cx">             continue;
</span><del>-        jit.storeDouble(reg, AssemblyHelpers::Address(MacroAssembler::stackPointerRegister, currentOffset));
</del><ins>+        storeSpooler.storeFPR({ reg, static_cast<ptrdiff_t>(currentOffset) });
</ins><span class="cx">         currentOffset += sizeof(double);
</span><span class="cx">     }
</span><del>-    
</del><ins>+    storeSpooler.finalizeFPR();
+
</ins><span class="cx">     jit.preserveReturnAddressAfterCall(GPRInfo::nonArgGPR1);
</span><span class="cx">     jit.storePtr(GPRInfo::nonArgGPR1, AssemblyHelpers::Address(MacroAssembler::stackPointerRegister, key.offset()));
</span><span class="cx">     jit.prepareCallOperation(vm);
</span><span class="lines">@@ -210,24 +216,28 @@
</span><span class="cx">     jit.loadPtr(AssemblyHelpers::Address(MacroAssembler::stackPointerRegister, key.offset()), GPRInfo::nonPreservedNonReturnGPR);
</span><span class="cx">     jit.restoreReturnAddressBeforeReturn(GPRInfo::nonPreservedNonReturnGPR);
</span><span class="cx">     
</span><ins>+    AssemblyHelpers::LoadRegSpooler loadSpooler(jit, MacroAssembler::stackPointerRegister);
+
</ins><span class="cx">     for (MacroAssembler::FPRegisterID reg = MacroAssembler::lastFPRegister(); ; reg = static_cast<MacroAssembler::FPRegisterID>(reg - 1)) {
</span><span class="cx">         if (key.usedRegisters().get(reg)) {
</span><span class="cx">             currentOffset -= sizeof(double);
</span><del>-            jit.loadDouble(AssemblyHelpers::Address(MacroAssembler::stackPointerRegister, currentOffset), reg);
</del><ins>+            loadSpooler.loadFPR({ reg, static_cast<ptrdiff_t>(currentOffset) });
</ins><span class="cx">         }
</span><span class="cx">         if (reg == MacroAssembler::firstFPRegister())
</span><span class="cx">             break;
</span><span class="cx">     }
</span><del>-    
</del><ins>+    loadSpooler.finalizeFPR();
+
</ins><span class="cx">     for (MacroAssembler::RegisterID reg = MacroAssembler::lastRegister(); ; reg = static_cast<MacroAssembler::RegisterID>(reg - 1)) {
</span><span class="cx">         if (key.usedRegisters().get(reg)) {
</span><span class="cx">             currentOffset -= sizeof(void*);
</span><del>-            jit.loadPtr(AssemblyHelpers::Address(MacroAssembler::stackPointerRegister, currentOffset), reg);
</del><ins>+            loadSpooler.loadGPR({ reg, static_cast<ptrdiff_t>(currentOffset) });
</ins><span class="cx">         }
</span><span class="cx">         if (reg == MacroAssembler::firstRegister())
</span><span class="cx">             break;
</span><span class="cx">     }
</span><del>-    
</del><ins>+    loadSpooler.finalizeGPR();
+
</ins><span class="cx">     jit.ret();
</span><span class="cx"> 
</span><span class="cx">     LinkBuffer patchBuffer(jit, GLOBAL_THUNK_ID, LinkBuffer::Profile::FTLThunk);
</span></span></pre></div>
<a id="trunkSourceJavaScriptCorejitAssemblyHelperscpp"></a>
<div class="modfile"><h4>Modified: trunk/Source/JavaScriptCore/jit/AssemblyHelpers.cpp (279255 => 279256)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/Source/JavaScriptCore/jit/AssemblyHelpers.cpp      2021-06-25 00:06:06 UTC (rev 279255)
+++ trunk/Source/JavaScriptCore/jit/AssemblyHelpers.cpp 2021-06-25 00:06:56 UTC (rev 279256)
</span><span class="lines">@@ -29,6 +29,7 @@
</span><span class="cx"> #if ENABLE(JIT)
</span><span class="cx"> 
</span><span class="cx"> #include "AccessCase.h"
</span><ins>+#include "AssemblyHelpersSpoolers.h"
</ins><span class="cx"> #include "JITOperations.h"
</span><span class="cx"> #include "JSArrayBufferView.h"
</span><span class="cx"> #include "JSCJSValueInlines.h"
</span><span class="lines">@@ -606,32 +607,74 @@
</span><span class="cx"> 
</span><span class="cx">     GPRReg scratch = InvalidGPRReg;
</span><span class="cx">     unsigned scratchGPREntryIndex = 0;
</span><ins>+#if CPU(ARM64)
+    // We don't need a second scratch GPR, but we'll also defer restoring this
+    // GPR (in the next slot after the scratch) so that we can restore them together
+    // later using a loadPair64.
+    GPRReg unusedNextSlotGPR = InvalidGPRReg;
+#endif
</ins><span class="cx"> 
</span><del>-    // Use the first GPR entry's register as our scratch.
</del><ins>+    // Use the first GPR entry's register as our baseGPR.
</ins><span class="cx">     for (unsigned i = 0; i < registerCount; i++) {
</span><span class="cx">         RegisterAtOffset entry = allCalleeSaves->at(i);
</span><del>-        if (dontRestoreRegisters.get(entry.reg()))
</del><ins>+        if (dontRestoreRegisters.contains(entry.reg()))
</ins><span class="cx">             continue;
</span><span class="cx">         if (entry.reg().isGPR()) {
</span><ins>+#if CPU(ARM64)
+            if (i + 1 < registerCount) {
+                RegisterAtOffset entry2 = allCalleeSaves->at(i + 1);
+                if (!dontRestoreRegisters.contains(entry2.reg())
+                    && entry2.reg().isGPR()
+                    && entry2.offset() == entry.offset() + static_cast<ptrdiff_t>(sizeof(CPURegister))) {
+                    scratchGPREntryIndex = i;
+                    scratch = entry.reg().gpr();
+                    unusedNextSlotGPR = entry2.reg().gpr();
+                    break;
+                }
+            }
+#else
</ins><span class="cx">             scratchGPREntryIndex = i;
</span><span class="cx">             scratch = entry.reg().gpr();
</span><span class="cx">             break;
</span><ins>+#endif
</ins><span class="cx">         }
</span><span class="cx">     }
</span><span class="cx">     ASSERT(scratch != InvalidGPRReg);
</span><ins>+    
+    RegisterSet skipList;
+    skipList.set(dontRestoreRegisters);
</ins><span class="cx"> 
</span><ins>+    // Skip the scratch register(s). We'll restore them later.
+    skipList.add(scratch);
+#if CPU(ARM64)
+    RELEASE_ASSERT(unusedNextSlotGPR != InvalidGPRReg);
+    skipList.add(unusedNextSlotGPR);
+#endif
+
</ins><span class="cx">     loadPtr(&topEntryFrame, scratch);
</span><span class="cx">     addPtr(TrustedImm32(EntryFrame::calleeSaveRegistersBufferOffset()), scratch);
</span><span class="cx"> 
</span><ins>+    LoadRegSpooler spooler(*this, scratch);
+
</ins><span class="cx">     // Restore all callee saves except for the scratch.
</span><del>-    for (unsigned i = 0; i < registerCount; i++) {
</del><ins>+    unsigned i = 0;
+    for (; i < registerCount; i++) {
</ins><span class="cx">         RegisterAtOffset entry = allCalleeSaves->at(i);
</span><del>-        if (dontRestoreRegisters.get(entry.reg()))
</del><ins>+        if (skipList.contains(entry.reg()))
</ins><span class="cx">             continue;
</span><del>-        if (i == scratchGPREntryIndex)
</del><ins>+        if (!entry.reg().isGPR())
+            break;
+        spooler.loadGPR(entry);
+    }
+    spooler.finalizeGPR();
+    for (; i < registerCount; i++) {
+        RegisterAtOffset entry = allCalleeSaves->at(i);
+        if (skipList.contains(entry.reg()))
</ins><span class="cx">             continue;
</span><del>-        loadReg(Address(scratch, entry.offset()), entry.reg());
</del><ins>+        ASSERT(!entry.reg().isGPR());
+        spooler.loadFPR(entry);
</ins><span class="cx">     }
</span><ins>+    spooler.finalizeFPR();
</ins><span class="cx"> 
</span><span class="cx">     // Restore the callee save value of the scratch.
</span><span class="cx">     RegisterAtOffset entry = allCalleeSaves->at(scratchGPREntryIndex);
</span><span class="lines">@@ -638,10 +681,19 @@
</span><span class="cx">     ASSERT(!dontRestoreRegisters.get(entry.reg()));
</span><span class="cx">     ASSERT(entry.reg().isGPR());
</span><span class="cx">     ASSERT(scratch == entry.reg().gpr());
</span><del>-    loadReg(Address(scratch, entry.offset()), scratch);
</del><ins>+#if CPU(ARM64)
+    RegisterAtOffset entry2 = allCalleeSaves->at(scratchGPREntryIndex + 1);
+    ASSERT_UNUSED(entry2, !dontRestoreRegisters.get(entry2.reg()));
+    ASSERT(entry2.reg().isGPR());
+    ASSERT(unusedNextSlotGPR == entry2.reg().gpr());
+    loadPair64(scratch, TrustedImm32(entry.offset()), scratch, unusedNextSlotGPR);
</ins><span class="cx"> #else
</span><ins>+    loadPtr(Address(scratch, entry.offset()), scratch);
+#endif
+
+#else
</ins><span class="cx">     UNUSED_PARAM(topEntryFrame);
</span><del>-#endif
</del><ins>+#endif // NUMBER_OF_CALLEE_SAVES_REGISTERS > 0
</ins><span class="cx"> }
</span><span class="cx"> 
</span><span class="cx"> void AssemblyHelpers::emitVirtualCall(VM& vm, JSGlobalObject* globalObject, CallLinkInfo* info)
</span><span class="lines">@@ -999,13 +1051,27 @@
</span><span class="cx">     RegisterAtOffsetList* allCalleeSaves = RegisterSet::vmCalleeSaveRegisterOffsets();
</span><span class="cx">     RegisterSet dontCopyRegisters = RegisterSet::stackRegisters();
</span><span class="cx">     unsigned registerCount = allCalleeSaves->size();
</span><del>-    
-    for (unsigned i = 0; i < registerCount; i++) {
</del><ins>+
+    StoreRegSpooler spooler(*this, calleeSavesBuffer);
+
+    unsigned i = 0;
+    for (; i < registerCount; i++) {
</ins><span class="cx">         RegisterAtOffset entry = allCalleeSaves->at(i);
</span><del>-        if (dontCopyRegisters.get(entry.reg()))
</del><ins>+        if (dontCopyRegisters.contains(entry.reg()))
</ins><span class="cx">             continue;
</span><del>-        storeReg(entry.reg(), Address(calleeSavesBuffer, entry.offset()));
</del><ins>+        if (!entry.reg().isGPR())
+            break;
+        spooler.storeGPR(entry);
</ins><span class="cx">     }
</span><ins>+    spooler.finalizeGPR();
+    for (; i < registerCount; i++) {
+        RegisterAtOffset entry = allCalleeSaves->at(i);
+        if (dontCopyRegisters.contains(entry.reg()))
+            continue;
+        spooler.storeFPR(entry);
+    }
+    spooler.finalizeFPR();
+
</ins><span class="cx"> #else
</span><span class="cx">     UNUSED_PARAM(calleeSavesBuffer);
</span><span class="cx"> #endif
</span><span class="lines">@@ -1103,6 +1169,188 @@
</span><span class="cx">     UNUSED_PARAM(scratch);
</span><span class="cx"> }
</span><span class="cx"> 
</span><ins>+void AssemblyHelpers::emitSave(const RegisterAtOffsetList& list)
+{
+    StoreRegSpooler spooler(*this, framePointerRegister);
+
+    size_t listSize = list.size();
+    size_t i = 0;
+    for (; i < listSize; i++) {
+        auto entry = list.at(i);
+        if (!entry.reg().isGPR())
+            break;
+        spooler.storeGPR(entry);
+    }
+    spooler.finalizeGPR();
+
+    for (; i < listSize; i++)
+        spooler.storeFPR(list.at(i));
+    spooler.finalizeFPR();
+}
+
+void AssemblyHelpers::emitRestore(const RegisterAtOffsetList& list)
+{
+    LoadRegSpooler spooler(*this, framePointerRegister);
+
+    size_t listSize = list.size();
+    size_t i = 0;
+    for (; i < listSize; i++) {
+        auto entry = list.at(i);
+        if (!entry.reg().isGPR())
+            break;
+        spooler.loadGPR(entry);
+    }
+    spooler.finalizeGPR();
+
+    for (; i < listSize; i++)
+        spooler.loadFPR(list.at(i));
+    spooler.finalizeFPR();
+}
+
+void AssemblyHelpers::emitSaveCalleeSavesFor(const RegisterAtOffsetList* calleeSaves)
+{
+    RegisterSet dontSaveRegisters = RegisterSet(RegisterSet::stackRegisters());
+    unsigned registerCount = calleeSaves->size();
+
+    StoreRegSpooler spooler(*this, framePointerRegister);
+
+    unsigned i = 0;
+    for (; i < registerCount; i++) {
+        RegisterAtOffset entry = calleeSaves->at(i);
+        if (entry.reg().isFPR())
+            break;
+        if (dontSaveRegisters.contains(entry.reg()))
+            continue;
+        spooler.storeGPR(entry);
+    }
+    spooler.finalizeGPR();
+    for (; i < registerCount; i++) {
+        RegisterAtOffset entry = calleeSaves->at(i);
+        if (dontSaveRegisters.contains(entry.reg()))
+            continue;
+        spooler.storeFPR(entry);
+    }
+    spooler.finalizeFPR();
+}
+
+void AssemblyHelpers::emitRestoreCalleeSavesFor(const RegisterAtOffsetList* calleeSaves)
+{
+    RegisterSet dontRestoreRegisters = RegisterSet(RegisterSet::stackRegisters());
+    unsigned registerCount = calleeSaves->size();
+    
+    LoadRegSpooler spooler(*this, framePointerRegister);
+
+    unsigned i = 0;
+    for (; i < registerCount; i++) {
+        RegisterAtOffset entry = calleeSaves->at(i);
+        if (entry.reg().isFPR())
+            break;
+        if (dontRestoreRegisters.get(entry.reg()))
+            continue;
+        spooler.loadGPR(entry);
+    }
+    spooler.finalizeGPR();
+    for (; i < registerCount; i++) {
+        RegisterAtOffset entry = calleeSaves->at(i);
+        if (dontRestoreRegisters.get(entry.reg()))
+            continue;
+        spooler.loadFPR(entry);
+    }
+    spooler.finalizeFPR();
+}
+
+void AssemblyHelpers::copyLLIntBaselineCalleeSavesFromFrameOrRegisterToEntryFrameCalleeSavesBuffer(EntryFrame*& topEntryFrame, const TempRegisterSet& usedRegisters)
+{
+#if NUMBER_OF_CALLEE_SAVES_REGISTERS > 0
+    // Copy saved calleeSaves on stack or unsaved calleeSaves in register to vm calleeSave buffer
+    GPRReg destBufferGPR = usedRegisters.getFreeGPR(0);
+    GPRReg temp1 = usedRegisters.getFreeGPR(1);
+    FPRReg fpTemp1 = usedRegisters.getFreeFPR(0);
+    GPRReg temp2 = isARM64() ? usedRegisters.getFreeGPR(2) : InvalidGPRReg;
+    FPRReg fpTemp2 = isARM64() ? usedRegisters.getFreeFPR(1) : InvalidFPRReg;
+
+    loadPtr(&topEntryFrame, destBufferGPR);
+    addPtr(TrustedImm32(EntryFrame::calleeSaveRegistersBufferOffset()), destBufferGPR);
+
+    CopySpooler spooler(*this, framePointerRegister, destBufferGPR, temp1, temp2, fpTemp1, fpTemp2);
+
+    RegisterAtOffsetList* allCalleeSaves = RegisterSet::vmCalleeSaveRegisterOffsets();
+    const RegisterAtOffsetList* currentCalleeSaves = &RegisterAtOffsetList::llintBaselineCalleeSaveRegisters();
+    RegisterSet dontCopyRegisters = RegisterSet::stackRegisters();
+    unsigned registerCount = allCalleeSaves->size();
+
+    unsigned i = 0;
+    for (; i < registerCount; i++) {
+        RegisterAtOffset entry = allCalleeSaves->at(i);
+        if (dontCopyRegisters.contains(entry.reg()))
+            continue;
+        RegisterAtOffset* currentFrameEntry = currentCalleeSaves->find(entry.reg());
+
+        if (!entry.reg().isGPR())
+            break;
+        if (currentFrameEntry)
+            spooler.loadGPR(currentFrameEntry->offset());
+        else
+            spooler.copyGPR(entry.reg().gpr());
+        spooler.storeGPR(entry.offset());
+    }
+    spooler.finalizeGPR();
+
+    for (; i < registerCount; i++) {
+        RegisterAtOffset entry = allCalleeSaves->at(i);
+        if (dontCopyRegisters.get(entry.reg()))
+            continue;
+        RegisterAtOffset* currentFrameEntry = currentCalleeSaves->find(entry.reg());
+
+        RELEASE_ASSERT(entry.reg().isFPR());
+        if (currentFrameEntry)
+            spooler.loadFPR(currentFrameEntry->offset());
+        else
+            spooler.copyFPR(entry.reg().fpr());
+        spooler.storeFPR(entry.offset());
+    }
+    spooler.finalizeFPR();
+
+#else
+    UNUSED_PARAM(topEntryFrame);
+    UNUSED_PARAM(usedRegisters);
+#endif
+}
+
+void AssemblyHelpers::emitSaveOrCopyLLIntBaselineCalleeSavesFor(CodeBlock* codeBlock, VirtualRegister offsetVirtualRegister, RestoreTagRegisterMode tagRegisterMode, GPRReg temp1, GPRReg temp2, GPRReg temp3)
+{
+    ASSERT_UNUSED(codeBlock, codeBlock);
+    ASSERT(JITCode::isBaselineCode(codeBlock->jitType()));
+    ASSERT(codeBlock->calleeSaveRegisters() == &RegisterAtOffsetList::llintBaselineCalleeSaveRegisters());
+
+    const RegisterAtOffsetList* calleeSaves = &RegisterAtOffsetList::llintBaselineCalleeSaveRegisters();
+    RegisterSet dontSaveRegisters = RegisterSet(RegisterSet::stackRegisters());
+    unsigned registerCount = calleeSaves->size();
+
+    GPRReg dstBufferGPR = temp1;
+    addPtr(TrustedImm32(offsetVirtualRegister.offsetInBytes()), framePointerRegister, dstBufferGPR);
+
+    CopySpooler spooler(*this, framePointerRegister, dstBufferGPR, temp2, temp3);
+
+    for (unsigned i = 0; i < registerCount; i++) {
+        RegisterAtOffset entry = calleeSaves->at(i);
+        if (dontSaveRegisters.get(entry.reg()))
+            continue;
+        RELEASE_ASSERT(entry.reg().isGPR());
+
+#if USE(JSVALUE32_64)
+        UNUSED_PARAM(tagRegisterMode);
+#else
+        if (tagRegisterMode == CopyBaselineCalleeSavedRegistersFromBaseFrame)
+            spooler.loadGPR(entry.offset());
+        else
+#endif
+            spooler.copyGPR(entry.reg().gpr());
+        spooler.storeGPR(entry.offset());
+    }
+    spooler.finalizeGPR();
+}
+
</ins><span class="cx"> } // namespace JSC
</span><span class="cx"> 
</span><span class="cx"> #endif // ENABLE(JIT)
</span></span></pre></div>
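<p>The functions in the diff above all feed register entries to a spooler one at a time and then call a finalize method. The following is a minimal, standalone sketch of that buffering idea, not the JSC implementation: it records instruction strings instead of emitting machine code, and <code>Entry</code>, <code>PairingLoadSpooler</code>, <code>spoolLoads</code>, and the <code>ldp</code>/<code>ldr</code> text are hypothetical names chosen for illustration. It assumes a target that has a load-pair instruction, and it ignores the immediate-range checks a real assembler would need.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for a (register, offset) entry; not JSC's RegisterAtOffset.
struct Entry {
    int reg;
    std::ptrdiff_t offset;
};

// Buffers one pending load and fuses it with the next one when the two
// slots are adjacent in memory, in either ascending or descending order.
class PairingLoadSpooler {
public:
    explicit PairingLoadSpooler(std::ptrdiff_t regSize)
        : m_regSize(regSize)
    {
        assert(regSize > 0);
    }

    void load(const Entry& entry)
    {
        if (!m_buffered) {
            m_buffered = entry; // Nothing pending: hold this entry back.
            return;
        }
        if (entry.offset == m_buffered->offset + m_regSize) {
            // The new entry sits right after the buffered one.
            emitPair(m_buffered->offset, m_buffered->reg, entry.reg);
            m_buffered.reset();
            return;
        }
        if (m_buffered->offset == entry.offset + m_regSize) {
            // The new entry sits right before the buffered one.
            emitPair(entry.offset, entry.reg, m_buffered->reg);
            m_buffered.reset();
            return;
        }
        // Not adjacent: flush the pending entry as a single load,
        // then buffer the new one for a possible later pairing.
        finalize();
        m_buffered = entry;
    }

    // Must be called after the last load so a pending entry is not lost.
    void finalize()
    {
        if (m_buffered) {
            emitSingle(m_buffered->offset, m_buffered->reg);
            m_buffered.reset();
        }
    }

    const std::vector<std::string>& emitted() const { return m_emitted; }

private:
    void emitPair(std::ptrdiff_t offset, int lowReg, int highReg)
    {
        m_emitted.push_back("ldp r" + std::to_string(lowReg) + ", r" + std::to_string(highReg)
            + ", [base, #" + std::to_string(offset) + "]");
    }

    void emitSingle(std::ptrdiff_t offset, int reg)
    {
        m_emitted.push_back("ldr r" + std::to_string(reg) + ", [base, #" + std::to_string(offset) + "]");
    }

    std::ptrdiff_t m_regSize;
    std::optional<Entry> m_buffered;
    std::vector<std::string> m_emitted;
};

// Convenience wrapper: spool every entry, finalize, and return the instruction list.
inline std::vector<std::string> spoolLoads(const std::vector<Entry>& entries, std::ptrdiff_t regSize)
{
    PairingLoadSpooler spooler(regSize);
    for (const Entry& entry : entries)
        spooler.load(entry);
    spooler.finalize();
    return spooler.emitted();
}
```

<p>With 8-byte slots, entries at offsets 0, 8, and 24 would produce one pair load followed by one single load in this sketch. Forgetting the finalize call would silently drop the last buffered entry, which is presumably why every loop in the diff is followed by <code>finalizeGPR()</code> or <code>finalizeFPR()</code>.</p>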
<a id="trunkSourceJavaScriptCorejitAssemblyHelpersh"></a>
<div class="modfile"><h4>Modified: trunk/Source/JavaScriptCore/jit/AssemblyHelpers.h (279255 => 279256)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/Source/JavaScriptCore/jit/AssemblyHelpers.h        2021-06-25 00:06:06 UTC (rev 279255)
+++ trunk/Source/JavaScriptCore/jit/AssemblyHelpers.h   2021-06-25 00:06:56 UTC (rev 279256)
</span><span class="lines">@@ -295,85 +295,33 @@
</span><span class="cx">         store32(TrustedImm32(value.payload()), address.withOffset(PayloadOffset));
</span><span class="cx"> #endif
</span><span class="cx">     }
</span><del>-    
</del><ins>+
+    template<typename Op> class Spooler;
+    class LoadRegSpooler;
+    class StoreRegSpooler;
+    class CopySpooler;
+
</ins><span class="cx">     Address addressFor(const RegisterAtOffset& entry)
</span><span class="cx">     {
</span><span class="cx">         return Address(GPRInfo::callFrameRegister, entry.offset());
</span><span class="cx">     }
</span><del>-    
-    void emitSave(const RegisterAtOffsetList& list)
-    {
-        for (const RegisterAtOffset& entry : list) {
-            if (entry.reg().isGPR())
-                storePtr(entry.reg().gpr(), addressFor(entry));
-            else
-                storeDouble(entry.reg().fpr(), addressFor(entry));
-        }
-    }
-    
-    void emitRestore(const RegisterAtOffsetList& list)
-    {
-        for (const RegisterAtOffset& entry : list) {
-            if (entry.reg().isGPR())
-                loadPtr(addressFor(entry), entry.reg().gpr());
-            else
-                loadDouble(addressFor(entry), entry.reg().fpr());
-        }
-    }
</del><span class="cx"> 
</span><ins>+    void emitSave(const RegisterAtOffsetList&);
+    void emitRestore(const RegisterAtOffsetList&);
+
</ins><span class="cx">     void emitSaveCalleeSavesFor(CodeBlock* codeBlock)
</span><span class="cx">     {
</span><span class="cx">         ASSERT(codeBlock);
</span><span class="cx"> 
</span><span class="cx">         const RegisterAtOffsetList* calleeSaves = codeBlock->calleeSaveRegisters();
</span><del>-        RegisterSet dontSaveRegisters = RegisterSet(RegisterSet::stackRegisters());
-        unsigned registerCount = calleeSaves->size();
</del><ins>+        emitSaveCalleeSavesFor(calleeSaves);
+    }
</ins><span class="cx"> 
</span><del>-        for (unsigned i = 0; i < registerCount; i++) {
-            RegisterAtOffset entry = calleeSaves->at(i);
-            if (dontSaveRegisters.get(entry.reg()))
-                continue;
-            storeReg(entry.reg(), Address(framePointerRegister, entry.offset()));
-        }
-    }
</del><ins>+    void emitSaveCalleeSavesFor(const RegisterAtOffsetList* calleeSaves);
</ins><span class="cx">     
</span><span class="cx">     enum RestoreTagRegisterMode { UseExistingTagRegisterContents, CopyBaselineCalleeSavedRegistersFromBaseFrame };
</span><span class="cx"> 
</span><del>-    void emitSaveOrCopyCalleeSavesFor(CodeBlock* codeBlock, VirtualRegister offsetVirtualRegister, RestoreTagRegisterMode tagRegisterMode, GPRReg temp)
-    {
-        ASSERT(codeBlock);
-        ASSERT(JITCode::isBaselineCode(codeBlock->jitType()));
-        
-        const RegisterAtOffsetList* calleeSaves = codeBlock->calleeSaveRegisters();
-        RegisterSet dontSaveRegisters = RegisterSet(RegisterSet::stackRegisters());
-        unsigned registerCount = calleeSaves->size();
-
-#if USE(JSVALUE64)
-        RegisterSet baselineCalleeSaves = RegisterSet::llintBaselineCalleeSaveRegisters();
-#endif
-        
-        for (unsigned i = 0; i < registerCount; i++) {
-            RegisterAtOffset entry = calleeSaves->at(i);
-            if (dontSaveRegisters.get(entry.reg()))
-                continue;
-            RELEASE_ASSERT(entry.reg().isGPR());
-
-            GPRReg registerToWrite;
-
-#if USE(JSVALUE32_64)
-            UNUSED_PARAM(tagRegisterMode);
-            UNUSED_PARAM(temp);
-#else
-            if (tagRegisterMode == CopyBaselineCalleeSavedRegistersFromBaseFrame && baselineCalleeSaves.get(entry.reg())) {
-                registerToWrite = temp;
-                loadPtr(AssemblyHelpers::Address(GPRInfo::callFrameRegister, entry.offset()), registerToWrite);
-            } else
-#endif
-                registerToWrite = entry.reg().gpr();
-
-            storePtr(registerToWrite, Address(framePointerRegister, offsetVirtualRegister.offsetInBytes() + entry.offset()));
-        }
-    }
</del><ins>+    void emitSaveOrCopyLLIntBaselineCalleeSavesFor(CodeBlock*, VirtualRegister offsetVirtualRegister, RestoreTagRegisterMode, GPRReg temp1, GPRReg temp2, GPRReg temp3);
</ins><span class="cx">     
</span><span class="cx">     void emitRestoreCalleeSavesFor(CodeBlock* codeBlock)
</span><span class="cx">     {
</span><span class="lines">@@ -383,18 +331,7 @@
</span><span class="cx">         emitRestoreCalleeSavesFor(calleeSaves);
</span><span class="cx">     }
</span><span class="cx"> 
</span><del>-    void emitRestoreCalleeSavesFor(const RegisterAtOffsetList* calleeSaves)
-    {
-        RegisterSet dontRestoreRegisters = RegisterSet(RegisterSet::stackRegisters());
-        unsigned registerCount = calleeSaves->size();
-        
-        for (unsigned i = 0; i < registerCount; i++) {
-            RegisterAtOffset entry = calleeSaves->at(i);
-            if (dontRestoreRegisters.get(entry.reg()))
-                continue;
-            loadReg(Address(framePointerRegister, entry.offset()), entry.reg());
-        }
-    }
</del><ins>+    void emitRestoreCalleeSavesFor(const RegisterAtOffsetList* calleeSaves);
</ins><span class="cx"> 
</span><span class="cx">     void emitSaveCalleeSaves()
</span><span class="cx">     {
</span><span class="lines">@@ -465,59 +402,8 @@
</span><span class="cx"> 
</span><span class="cx">     void restoreCalleeSavesFromEntryFrameCalleeSavesBuffer(EntryFrame*&);
</span><span class="cx"> 
</span><del>-    void copyLLIntBaselineCalleeSavesFromFrameOrRegisterToEntryFrameCalleeSavesBuffer(EntryFrame*& topEntryFrame, const TempRegisterSet& usedRegisters = { RegisterSet::stubUnavailableRegisters() })
-    {
-#if NUMBER_OF_CALLEE_SAVES_REGISTERS > 0
-        GPRReg temp1 = usedRegisters.getFreeGPR(0);
-        GPRReg temp2 = usedRegisters.getFreeGPR(1);
-        FPRReg fpTemp = usedRegisters.getFreeFPR();
-        ASSERT(temp2 != InvalidGPRReg);
</del><ins>+    void copyLLIntBaselineCalleeSavesFromFrameOrRegisterToEntryFrameCalleeSavesBuffer(EntryFrame*&, const TempRegisterSet& usedRegisters = { RegisterSet::stubUnavailableRegisters() });
</ins><span class="cx"> 
</span><del>-        // Copy saved calleeSaves on stack or unsaved calleeSaves in register to vm calleeSave buffer
-        loadPtr(&topEntryFrame, temp1);
-        addPtr(TrustedImm32(EntryFrame::calleeSaveRegistersBufferOffset()), temp1);
-
-        RegisterAtOffsetList* allCalleeSaves = RegisterSet::vmCalleeSaveRegisterOffsets();
-        const RegisterAtOffsetList* currentCalleeSaves = &RegisterAtOffsetList::llintBaselineCalleeSaveRegisters();
-        RegisterSet dontCopyRegisters = RegisterSet::stackRegisters();
-        unsigned registerCount = allCalleeSaves->size();
-
-        for (unsigned i = 0; i < registerCount; i++) {
-            RegisterAtOffset entry = allCalleeSaves->at(i);
-            if (dontCopyRegisters.get(entry.reg()))
-                continue;
-            RegisterAtOffset* currentFrameEntry = currentCalleeSaves->find(entry.reg());
-
-            if (entry.reg().isGPR()) {
-                GPRReg regToStore;
-                if (currentFrameEntry) {
-                    // Load calleeSave from stack into temp register
-                    regToStore = temp2;
-                    loadPtr(Address(framePointerRegister, currentFrameEntry->offset()), regToStore);
-                } else
-                    // Just store callee save directly
-                    regToStore = entry.reg().gpr();
-
-                storePtr(regToStore, Address(temp1, entry.offset()));
-            } else {
-                FPRReg fpRegToStore;
-                if (currentFrameEntry) {
-                    // Load calleeSave from stack into temp register
-                    fpRegToStore = fpTemp;
-                    loadDouble(Address(framePointerRegister, currentFrameEntry->offset()), fpRegToStore);
-                } else
-                    // Just store callee save directly
-                    fpRegToStore = entry.reg().fpr();
-
-                storeDouble(fpRegToStore, Address(temp1, entry.offset()));
-            }
-        }
-#else
-        UNUSED_PARAM(topEntryFrame);
-        UNUSED_PARAM(usedRegisters);
-#endif
-    }
-
</del><span class="cx">     void emitMaterializeTagCheckRegisters()
</span><span class="cx">     {
</span><span class="cx"> #if USE(JSVALUE64)
</span></span></pre></div>
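<p>The header above forward-declares <code>template&lt;typename Op&gt; class Spooler</code> alongside the concrete spoolers. One common way to use such a template is the curiously recurring template pattern, where the base class calls into the derived class without virtual dispatch. The sketch below illustrates that pattern under stated assumptions: <code>SpoolerSketch</code>, <code>LoadSketch</code>, and <code>StoreSketch</code> are hypothetical names, the "instructions" are plain strings, and the downcast uses <code>static_cast</code>, which may differ from what the changeset does.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Base class parameterized on the concrete operation type (CRTP).
template<typename Op>
class SpoolerSketch {
public:
    // Shared driver logic lives here; the per-operation work is
    // delegated to Op::executeSingle, resolved at compile time.
    void execute(std::ptrdiff_t offset, int reg)
    {
        assert(reg >= 0);
        op().executeSingle(offset, reg);
    }

    const std::vector<std::string>& log() const { return m_log; }

protected:
    std::vector<std::string> m_log;

private:
    Op& op() { return *static_cast<Op*>(this); }
};

class LoadSketch : public SpoolerSketch<LoadSketch> {
public:
    void loadGPR(std::ptrdiff_t offset, int reg) { execute(offset, reg); }

private:
    // The base class needs access to the private hook below.
    friend class SpoolerSketch<LoadSketch>;

    void executeSingle(std::ptrdiff_t offset, int reg)
    {
        m_log.push_back("load r" + std::to_string(reg) + " <- [base+" + std::to_string(offset) + "]");
    }
};

class StoreSketch : public SpoolerSketch<StoreSketch> {
public:
    void storeGPR(std::ptrdiff_t offset, int reg) { execute(offset, reg); }

private:
    friend class SpoolerSketch<StoreSketch>;

    void executeSingle(std::ptrdiff_t offset, int reg)
    {
        m_log.push_back("store r" + std::to_string(reg) + " -> [base+" + std::to_string(offset) + "]");
    }
};
```

<p>The design choice this illustrates is that load and store spoolers can share one driver while each supplies only its own emit hooks, with no virtual calls on the code-generation path.</p>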
<a id="trunkSourceJavaScriptCorejitAssemblyHelpersSpoolersh"></a>
<div class="addfile"><h4>Added: trunk/Source/JavaScriptCore/jit/AssemblyHelpersSpoolers.h (0 => 279256)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/Source/JavaScriptCore/jit/AssemblyHelpersSpoolers.h                                (rev 0)
+++ trunk/Source/JavaScriptCore/jit/AssemblyHelpersSpoolers.h   2021-06-25 00:06:56 UTC (rev 279256)
</span><span class="lines">@@ -0,0 +1,575 @@
</span><ins>+/*
+ * Copyright (C) 2021 Apple Inc. All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in the
+ *    documentation and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY APPLE INC. ``AS IS'' AND ANY
+ * EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
+ * PURPOSE ARE DISCLAIMED.  IN NO EVENT SHALL APPLE INC. OR
+ * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
+ * EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
+ * PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
+ * PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY
+ * OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#pragma once
+
+#if ENABLE(JIT)
+
+#include "AssemblyHelpers.h"
+
+namespace JSC {
+
+template<typename RegType>
+struct RegDispatch {
+    static bool hasSameType(Reg);
+    static RegType get(Reg);
+    template<typename Spooler> static RegType temp1(const Spooler*);
+    template<typename Spooler> static RegType temp2(const Spooler*);
+    template<typename Spooler> static RegType& regToStore(Spooler*);
+    static constexpr RegType invalid();
+    static constexpr size_t regSize();
+    static bool isValidLoadPairImm(int);
+    static bool isValidStorePairImm(int);
+};
+
+template<>
+struct RegDispatch<GPRReg> {
+    static bool hasSameType(Reg reg) { return reg.isGPR(); }
+    static GPRReg get(Reg reg) { return reg.gpr(); }
+    template<typename Spooler> static GPRReg temp1(const Spooler* spooler) { return spooler->m_temp1GPR; }
+    template<typename Spooler> static GPRReg temp2(const Spooler* spooler) { return spooler->m_temp2GPR; }
+    template<typename Spooler> static GPRReg& regToStore(Spooler* spooler) { return spooler->m_gprToStore; }
+    static constexpr GPRReg invalid() { return InvalidGPRReg; }
+    static constexpr size_t regSize() { return sizeof(CPURegister); }
+#if CPU(ARM64)
+    static bool isValidLoadPairImm(int offset) { return ARM64Assembler::isValidLDPImm<64>(offset); }
+    static bool isValidStorePairImm(int offset) { return ARM64Assembler::isValidSTPImm<64>(offset); }
+#else
+    static bool isValidLoadPairImm(int) { return false; }
+    static bool isValidStorePairImm(int) { return false; }
+#endif
+};
+
+template<>
+struct RegDispatch<FPRReg> {
+    static bool hasSameType(Reg reg) { return reg.isFPR(); }
+    static FPRReg get(Reg reg) { return reg.fpr(); }
+    template<typename Spooler> static FPRReg temp1(const Spooler* spooler) { return spooler->m_temp1FPR; }
+    template<typename Spooler> static FPRReg temp2(const Spooler* spooler) { return spooler->m_temp2FPR; }
+    template<typename Spooler> static FPRReg& regToStore(Spooler* spooler) { return spooler->m_fprToStore; }
+    static constexpr FPRReg invalid() { return InvalidFPRReg; }
+    static constexpr size_t regSize() { return sizeof(double); }
+#if CPU(ARM64)
+    static bool isValidLoadPairImm(int offset) { return ARM64Assembler::isValidLDPFPImm<64>(offset); }
+    static bool isValidStorePairImm(int offset) { return ARM64Assembler::isValidSTPFPImm<64>(offset); }
+#else
+    static bool isValidLoadPairImm(int) { return false; }
+    static bool isValidStorePairImm(int) { return false; }
+#endif
+};
+
+template<typename Op>
+class AssemblyHelpers::Spooler {
+public:
+    using JIT = AssemblyHelpers;
+
+    Spooler(JIT& jit, GPRReg baseGPR)
+        : m_jit(jit)
+        , m_baseGPR(baseGPR)
+    { }
+
+    template<typename RegType>
+    void execute(const RegisterAtOffset& entry)
+    {
+        RELEASE_ASSERT(RegDispatch<RegType>::hasSameType(entry.reg()));
+        if constexpr (!hasPairOp)
+            return op().executeSingle(entry.offset(), RegDispatch<RegType>::get(entry.reg()));
+
+        if (!m_bufferedEntry.reg().isSet()) {
+            m_bufferedEntry = entry;
+            return;
+        }
+
+        constexpr ptrdiff_t regSize = RegDispatch<RegType>::regSize();
+        RegType bufferedEntryReg = RegDispatch<RegType>::get(m_bufferedEntry.reg());
+        RegType entryReg = RegDispatch<RegType>::get(entry.reg());
+
+        if (entry.offset() == m_bufferedEntry.offset() + regSize) {
+            op().executePair(m_bufferedEntry.offset(), bufferedEntryReg, entryReg);
+            m_bufferedEntry = { };
+            return;
+        }
+        if (m_bufferedEntry.offset() == entry.offset() + regSize) {
+            op().executePair(entry.offset(), entryReg, bufferedEntryReg);
+            m_bufferedEntry = { };
+            return;
+        }
+
+        // We don't have a pair of operations that we can execute as a pair.
+        // Execute the previous one as a single (finalize will do that), and then
+        // buffer the current entry to potentially be paired with the next entry.
+        finalize<RegType>();
+        execute<RegType>(entry);
+    }
+
+    template<typename RegType>
+    void finalize()
+    {
+        if constexpr (hasPairOp) {
+            if (m_bufferedEntry.reg().isSet()) {
+                op().executeSingle(m_bufferedEntry.offset(), RegDispatch<RegType>::get(m_bufferedEntry.reg()));
+                m_bufferedEntry = { };
+            }
+        }
+    }
+
+private:
+#if CPU(ARM64)
+    static constexpr bool hasPairOp = true;
+#else
+    static constexpr bool hasPairOp = false;
+#endif
+
+    Op& op() { return *reinterpret_cast<Op*>(this); }
+
+protected:
+    JIT& m_jit;
+    GPRReg m_baseGPR;
+    RegisterAtOffset m_bufferedEntry;
+};
+
+class AssemblyHelpers::LoadRegSpooler : public AssemblyHelpers::Spooler<LoadRegSpooler> {
+    using Base = Spooler<LoadRegSpooler>;
+    using JIT = Base::JIT;
+public:
+    LoadRegSpooler(JIT& jit, GPRReg baseGPR)
+        : Base(jit, baseGPR)
+    { }
+
+    ALWAYS_INLINE void loadGPR(const RegisterAtOffset& entry) { execute<GPRReg>(entry); }
+    ALWAYS_INLINE void finalizeGPR() { finalize<GPRReg>(); }
+    ALWAYS_INLINE void loadFPR(const RegisterAtOffset& entry) { execute<FPRReg>(entry); }
+    ALWAYS_INLINE void finalizeFPR() { finalize<FPRReg>(); }
+
+private:
+#if CPU(ARM64)
+    template<typename RegType>
+    ALWAYS_INLINE void executePair(ptrdiff_t offset, RegType reg1, RegType reg2)
+    {
+        m_jit.loadPair64(m_baseGPR, TrustedImm32(offset), reg1, reg2);
+    }
+#else
+    template<typename RegType>
+    ALWAYS_INLINE void executePair(ptrdiff_t, RegType, RegType) { }
+#endif
+
+    ALWAYS_INLINE void executeSingle(ptrdiff_t offset, GPRReg reg)
+    {
+#if USE(JSVALUE64)
+        m_jit.load64(Address(m_baseGPR, offset), reg);
+#else
+        m_jit.load32(Address(m_baseGPR, offset), reg);
+#endif
+    }
+
+    ALWAYS_INLINE void executeSingle(ptrdiff_t offset, FPRReg reg)
+    {
+        m_jit.loadDouble(Address(m_baseGPR, offset), reg);
+    }
+
+    friend class AssemblyHelpers::Spooler<LoadRegSpooler>;
+};
+
+class AssemblyHelpers::StoreRegSpooler : public AssemblyHelpers::Spooler<StoreRegSpooler> {
+    using Base = Spooler<StoreRegSpooler>;
+    using JIT = typename Base::JIT;
+public:
+    StoreRegSpooler(JIT& jit, GPRReg baseGPR)
+        : Base(jit, baseGPR)
+    { }
+
+    ALWAYS_INLINE void storeGPR(const RegisterAtOffset& entry) { execute<GPRReg>(entry); }
+    ALWAYS_INLINE void finalizeGPR() { finalize<GPRReg>(); }
+    ALWAYS_INLINE void storeFPR(const RegisterAtOffset& entry) { execute<FPRReg>(entry); }
+    ALWAYS_INLINE void finalizeFPR() { finalize<FPRReg>(); }
+
+private:
+#if CPU(ARM64)
+    template<typename RegType>
+    ALWAYS_INLINE void executePair(ptrdiff_t offset, RegType reg1, RegType reg2)
+    {
+        m_jit.storePair64(reg1, reg2, m_baseGPR, TrustedImm32(offset));
+    }
+#else
+    template<typename RegType>
+    ALWAYS_INLINE void executePair(ptrdiff_t, RegType, RegType) { }
+#endif
+
+    ALWAYS_INLINE void executeSingle(ptrdiff_t offset, GPRReg reg)
+    {
+#if USE(JSVALUE64)
+        m_jit.store64(reg, Address(m_baseGPR, offset));
+#else
+        m_jit.store32(reg, Address(m_baseGPR, offset));
+#endif
+    }
+
+    ALWAYS_INLINE void executeSingle(ptrdiff_t offset, FPRReg reg)
+    {
+        m_jit.storeDouble(reg, Address(m_baseGPR, offset));
+    }
+
+    friend class AssemblyHelpers::Spooler<StoreRegSpooler>;
+};
+
+class AssemblyHelpers::CopySpooler {
+public:
+    using JIT = AssemblyHelpers;
+    using Address = JIT::Address;
+    using TrustedImm32 = JIT::TrustedImm32;
+
+    struct Source {
+        enum class Type { BufferOffset, Reg, EncodedJSValue } type;
+        int offset;
+        Reg reg;
+        EncodedJSValue value;
+
+        template<typename RegType> RegType getReg() { return RegDispatch<RegType>::get(reg); };
+    };
+
+    enum class BufferRegs {
+        NeedPreservation,
+        AllowModification
+    };
+
+    CopySpooler(BufferRegs attribute, JIT& jit, GPRReg srcBuffer, GPRReg destBuffer, GPRReg temp1, GPRReg temp2, FPRReg fpTemp1 = InvalidFPRReg, FPRReg fpTemp2 = InvalidFPRReg)
+        : m_jit(jit)
+        , m_srcBufferGPR(srcBuffer)
+        , m_dstBufferGPR(destBuffer)
+        , m_temp1GPR(temp1)
+        , m_temp2GPR(temp2)
+        , m_temp1FPR(fpTemp1)
+        , m_temp2FPR(fpTemp2)
+        , m_bufferRegsAttr(attribute)
+    {
+        if constexpr (hasPairOp && !isARM64())
+            RELEASE_ASSERT_NOT_REACHED(); // unsupported architecture.
+    }
+
+    CopySpooler(JIT& jit, GPRReg srcBuffer, GPRReg destBuffer, GPRReg temp1, GPRReg temp2, FPRReg fpTemp1 = InvalidFPRReg, FPRReg fpTemp2 = InvalidFPRReg)
+        : CopySpooler(BufferRegs::NeedPreservation, jit, srcBuffer, destBuffer, temp1, temp2, fpTemp1, fpTemp2)
+    { }
+
+private:
+    template<typename RegType> RegType temp1() const { return RegDispatch<RegType>::temp1(this); }
+    template<typename RegType> RegType temp2() const { return RegDispatch<RegType>::temp2(this); }
+    template<typename RegType> RegType& regToStore() { return RegDispatch<RegType>::regToStore(this); }
+
+    template<typename RegType> static constexpr RegType invalid() { return RegDispatch<RegType>::invalid(); }
+    template<typename RegType> static constexpr int regSize() { return RegDispatch<RegType>::regSize(); }
+
+    template<typename RegType> static bool isValidLoadPairImm(int offset) { return RegDispatch<RegType>::isValidLoadPairImm(offset); }
+    template<typename RegType> static bool isValidStorePairImm(int offset) { return RegDispatch<RegType>::isValidStorePairImm(offset); }
+
+    template<typename RegType>
+    void load(int offset)
+    {
+        if constexpr (!hasPairOp) {
+            auto& regToStore = this->regToStore<RegType>();
+            regToStore = temp1<RegType>();
+            load(offset, regToStore);
+            return;
+        }
+
+        auto& source = m_sources[m_currentSource++];
+        source.type = Source::Type::BufferOffset;
+        source.offset = offset;
+    }
+
+    void move(EncodedJSValue value)
+    {
+        if constexpr (!hasPairOp) {
+            auto& regToStore = this->regToStore<GPRReg>();
+            regToStore = temp1<GPRReg>();
+            move(value, regToStore);
+            return;
+        }
+
+        auto& source = m_sources[m_currentSource++];
+        source.type = Source::Type::EncodedJSValue;
+        source.value = value;
+    }
+
+    template<typename RegType>
+    void copy(RegType reg)
+    {
+        if constexpr (!hasPairOp) {
+            auto& regToStore = this->regToStore<RegType>();
+            regToStore = reg;
+            return;
+        }
+
+        auto& source = m_sources[m_currentSource++];
+        source.type = Source::Type::Reg;
+        source.reg = reg;
+    }
+
+    template<typename RegType>
+    void store(int storeOffset)
+    {
+        if constexpr (!hasPairOp) {
+            auto regToStore = this->regToStore<RegType>();
+            store(regToStore, storeOffset);
+            return;
+        }
+
+        constexpr bool regTypeIsGPR = std::is_same<RegType, GPRReg>::value;
+
+        if (m_currentSource < 2) {
+            m_deferredStoreOffset = storeOffset;
+            return;
+        }
+
+        RegType regToStore1 = invalid<RegType>();
+        RegType regToStore2 = invalid<RegType>();
+        auto& source1 = m_sources[0];
+        auto& source2 = m_sources[1];
+        auto srcOffset1 = m_sources[0].offset - m_srcOffsetAdjustment;
+        auto srcOffset2 = m_sources[1].offset - m_srcOffsetAdjustment;
+        constexpr int registerSize = regSize<RegType>();
+
+        if (source1.type == Source::Type::BufferOffset && source2.type == Source::Type::BufferOffset) {
+            regToStore1 = temp1<RegType>();
+            regToStore2 = temp2<RegType>();
+
+            int offsetDelta = abs(srcOffset1 - srcOffset2);
+            int minOffset = std::min(srcOffset1, srcOffset2);
+            bool isValidOffset = isValidLoadPairImm<RegType>(minOffset);
+
+            if (offsetDelta != registerSize || (!isValidOffset && m_bufferRegsAttr != BufferRegs::AllowModification)) {
+                load(srcOffset1, regToStore1);
+                load(srcOffset2, regToStore2);
+            } else {
+                if (!isValidOffset) {
+                    ASSERT(m_bufferRegsAttr == BufferRegs::AllowModification);
+                    m_srcOffsetAdjustment += minOffset;
+                    m_jit.addPtr(TrustedImm32(minOffset), m_srcBufferGPR);
+
+                    srcOffset1 -= minOffset;
+                    srcOffset2 -= minOffset;
+                    ASSERT(isValidLoadPairImm<RegType>(std::min(srcOffset1, srcOffset2)));
+                }
+                if (srcOffset1 < srcOffset2)
+                    loadPair(srcOffset1, regToStore1, regToStore2);
+                else
+                    loadPair(srcOffset2, regToStore2, regToStore1);
+            }
+        } else if (source1.type == Source::Type::BufferOffset) {
+            regToStore1 = temp1<RegType>();
+            load(srcOffset1, regToStore1);
+            if (source2.type == Source::Type::EncodedJSValue) {
+                if constexpr (regTypeIsGPR) {
+                    regToStore2 = temp2<RegType>();
+                    move(source2.value, regToStore2);
+                } else
+                    RELEASE_ASSERT_NOT_REACHED();
+            } else
+                regToStore2 = source2.getReg<RegType>();
+
+        } else if (source2.type == Source::Type::BufferOffset) {
+            if (source1.type == Source::Type::EncodedJSValue) {
+                if constexpr (regTypeIsGPR) {
+                    regToStore1 = temp1<RegType>();
+                    move(source1.value, regToStore1);
+                } else
+                    RELEASE_ASSERT_NOT_REACHED();
+            } else
+                regToStore1 = source1.getReg<RegType>();
+            regToStore2 = temp2<RegType>();
+            load(srcOffset2, regToStore2);
+
+        } else {
+            if (source1.type == Source::Type::EncodedJSValue) {
+                if constexpr (regTypeIsGPR) {
+                    regToStore1 = temp1<RegType>();
+                    move(source1.value, regToStore1);
+                } else
+                    RELEASE_ASSERT_NOT_REACHED();
+            } else
+                regToStore1 = source1.getReg<RegType>();
+
+            if (source2.type == Source::Type::EncodedJSValue) {
+                if constexpr (regTypeIsGPR) {
+                    regToStore2 = temp2<RegType>();
+                    move(source2.value, regToStore2);
+                } else
+                    RELEASE_ASSERT_NOT_REACHED();
+            } else
+                regToStore2 = source2.getReg<RegType>();
+        }
+
+        int dstOffset1 = m_deferredStoreOffset - m_dstOffsetAdjustment;
+        int dstOffset2 = storeOffset - m_dstOffsetAdjustment;
+
+        int offsetDelta = abs(dstOffset1 - dstOffset2);
+        int minOffset = std::min(dstOffset1, dstOffset2);
+        bool isValidOffset = isValidStorePairImm<RegType>(minOffset);
+
+        if (offsetDelta != registerSize || (!isValidOffset && m_bufferRegsAttr != BufferRegs::AllowModification)) {
+            store(regToStore1, dstOffset1);
+            store(regToStore2, dstOffset2);
+        } else {
+            if (!isValidOffset) {
+                ASSERT(m_bufferRegsAttr == BufferRegs::AllowModification);
+                m_dstOffsetAdjustment += minOffset;
+                m_jit.addPtr(TrustedImm32(minOffset), m_dstBufferGPR);
+
+                dstOffset1 -= minOffset;
+                dstOffset2 -= minOffset;
+                ASSERT(isValidStorePairImm<RegType>(std::min(dstOffset1, dstOffset2)));
+            }
+            if (dstOffset1 < dstOffset2)
+                storePair(regToStore1, regToStore2, dstOffset1);
+            else
+                storePair(regToStore2, regToStore1, dstOffset2);
+        }
+
+        m_currentSource = 0;
+    }
+
+    template<typename RegType>
+    void finalize()
+    {
+        if constexpr (!hasPairOp)
+            return;
+
+        if (!m_currentSource)
+            return; // Nothing to finalize.
+
+        ASSERT(m_currentSource == 1);
+
+        RegType regToStore = invalid<RegType>();
+        auto& source = m_sources[0];
+        auto& srcOffset = source.offset;
+        constexpr bool regTypeIsGPR = std::is_same<RegType, GPRReg>::value;
+
+        if (source.type == Source::Type::BufferOffset) {
+            regToStore = temp1<RegType>();
+            load(srcOffset - m_srcOffsetAdjustment, regToStore);
+        } else if (source.type == Source::Type::Reg)
+            regToStore = source.getReg<RegType>();
+        else if constexpr (regTypeIsGPR) {
+            regToStore = temp1<RegType>();
+            move(source.value, regToStore);
+        } else
+            RELEASE_ASSERT_NOT_REACHED();
+
+        store(regToStore, m_deferredStoreOffset - m_dstOffsetAdjustment);
+        m_currentSource = 0;
+    }
+
+public:
+    ALWAYS_INLINE void loadGPR(int srcOffset) { load<GPRReg>(srcOffset); }
+    ALWAYS_INLINE void copyGPR(GPRReg gpr) { copy<GPRReg>(gpr); }
+    ALWAYS_INLINE void moveConstant(EncodedJSValue value) { move(value); }
+    ALWAYS_INLINE void storeGPR(int dstOffset) { store<GPRReg>(dstOffset); }
+    ALWAYS_INLINE void finalizeGPR() { finalize<GPRReg>(); }
+
+    ALWAYS_INLINE void loadFPR(int srcOffset) { load<FPRReg>(srcOffset); }
+    ALWAYS_INLINE void copyFPR(FPRReg fpr) { copy<FPRReg>(fpr); }
+    ALWAYS_INLINE void storeFPR(int dstOffset) { store<FPRReg>(dstOffset); }
+    ALWAYS_INLINE void finalizeFPR() { finalize<FPRReg>(); }
+
+protected:
+#if USE(JSVALUE64)
+    ALWAYS_INLINE void move(EncodedJSValue value, GPRReg dest)
+    {
+        m_jit.move(TrustedImm64(value), dest);
+    }
+#else
+    NO_RETURN_DUE_TO_CRASH void move(EncodedJSValue, GPRReg) { RELEASE_ASSERT_NOT_REACHED(); }
+#endif
+
+    ALWAYS_INLINE void load(int offset, GPRReg dest)
+    {
+        m_jit.loadPtr(Address(m_srcBufferGPR, offset), dest);
+    }
+
+    ALWAYS_INLINE void store(GPRReg src, int offset)
+    {
+        m_jit.storePtr(src, Address(m_dstBufferGPR, offset));
+    }
+
+    ALWAYS_INLINE void load(int offset, FPRReg dest)
+    {
+        m_jit.loadDouble(Address(m_srcBufferGPR, offset), dest);
+    }
+
+    ALWAYS_INLINE void store(FPRReg src, int offset)
+    {
+        m_jit.storeDouble(src, Address(m_dstBufferGPR, offset));
+    }
+
+#if CPU(ARM64)
+    template<typename RegType>
+    ALWAYS_INLINE void loadPair(int offset, RegType dest1, RegType dest2)
+    {
+        m_jit.loadPair64(m_srcBufferGPR, TrustedImm32(offset), dest1, dest2);
+    }
+
+    template<typename RegType>
+    ALWAYS_INLINE void storePair(RegType src1, RegType src2, int offset)
+    {
+        m_jit.storePair64(src1, src2, m_dstBufferGPR, TrustedImm32(offset));
+    }
+
+    static constexpr bool hasPairOp = true;
+#else
+    template<typename RegType> ALWAYS_INLINE void loadPair(int, RegType, RegType) { }
+    template<typename RegType> ALWAYS_INLINE void storePair(RegType, RegType, int) { }
+
+    static constexpr bool hasPairOp = false;
+#endif
+
+    JIT& m_jit;
+
+    GPRReg m_srcBufferGPR;
+    GPRReg m_dstBufferGPR;
+    GPRReg m_temp1GPR;
+    GPRReg m_temp2GPR;
+    FPRReg m_temp1FPR;
+    FPRReg m_temp2FPR;
+
+private:
+    static constexpr int gprSize = static_cast<int>(sizeof(CPURegister));
+    static constexpr int fprSize = static_cast<int>(sizeof(double));
+
+    // These record which register the next store() should write.
+    GPRReg m_gprToStore { InvalidGPRReg }; // Only used when !hasPairOp.
+    FPRReg m_fprToStore { InvalidFPRReg }; // Only used when !hasPairOp.
+
+    BufferRegs m_bufferRegsAttr;
+    Source m_sources[2];
+    unsigned m_currentSource { 0 };
+    int m_srcOffsetAdjustment { 0 };
+    int m_dstOffsetAdjustment { 0 };
+    int m_deferredStoreOffset;
+
+    template<typename RegType> friend struct RegDispatch;
+};
+
+} // namespace JSC
+
+#endif // ENABLE(JIT)
</ins></span></pre></div>
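<p>The destination half of <code>CopySpooler::store()</code> above defers the first store, then pairs it with the second only when the two offsets are exactly one register apart. The following is a minimal sketch of that idea, not the WebKit implementation: <code>PairingSketch</code>, its integer register ids, and the recorded <code>str</code>/<code>stp</code> strings are hypothetical stand-ins for the JIT, and the <code>isValidStorePairImm</code> check and buffer-pointer adjustment are left out.</p>

```cpp
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical stand-in for the JIT: it only records what would be emitted.
struct PairingSketch {
    static constexpr int registerSize = 8; // assumes 64-bit registers

    std::vector<std::string> emitted;
    bool hasPending = false;
    int pendingReg = 0;
    int pendingOffset = 0;

    // Queue a store of `reg` to `offset`; emit once two stores are queued.
    void store(int reg, int offset)
    {
        if (!hasPending) {
            hasPending = true;
            pendingReg = reg;
            pendingOffset = offset;
            return;
        }
        hasPending = false;
        if (std::abs(pendingOffset - offset) != registerSize) {
            // Not adjacent: fall back to two single stores.
            emitSingle(pendingReg, pendingOffset);
            emitSingle(reg, offset);
            return;
        }
        // Adjacent slots: the register at the lower offset goes first.
        if (pendingOffset < offset)
            emitPair(pendingReg, reg, pendingOffset);
        else
            emitPair(reg, pendingReg, offset);
    }

    // Flush a store that never found a partner.
    void finalize()
    {
        if (!hasPending)
            return;
        hasPending = false;
        emitSingle(pendingReg, pendingOffset);
    }

private:
    void emitSingle(int reg, int offset)
    {
        emitted.push_back("str r" + std::to_string(reg) + ", [base, #" + std::to_string(offset) + "]");
    }

    void emitPair(int reg1, int reg2, int offset)
    {
        emitted.push_back("stp r" + std::to_string(reg1) + ", r" + std::to_string(reg2) + ", [base, #" + std::to_string(offset) + "]");
    }
};
```

<p>In this sketch, as in the diff, a caller has to invoke <code>finalize()</code> after the last store, otherwise an unpaired trailing store is never emitted.</p>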
<a id="trunkSourceJavaScriptCorejitScratchRegisterAllocatorcpp"></a>
<div class="modfile"><h4>Modified: trunk/Source/JavaScriptCore/jit/ScratchRegisterAllocator.cpp (279255 => 279256)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/Source/JavaScriptCore/jit/ScratchRegisterAllocator.cpp     2021-06-25 00:06:06 UTC (rev 279255)
+++ trunk/Source/JavaScriptCore/jit/ScratchRegisterAllocator.cpp        2021-06-25 00:06:56 UTC (rev 279256)
</span><span class="lines">@@ -1,5 +1,5 @@
</span><span class="cx"> /*
</span><del>- * Copyright (C) 2014-2017 Apple Inc. All rights reserved.
</del><ins>+ * Copyright (C) 2014-2021 Apple Inc. All rights reserved.
</ins><span class="cx">  *
</span><span class="cx">  * Redistribution and use in source and binary forms, with or without
</span><span class="cx">  * modification, are permitted provided that the following conditions
</span><span class="lines">@@ -28,6 +28,7 @@
</span><span class="cx"> 
</span><span class="cx"> #if ENABLE(JIT)
</span><span class="cx"> 
</span><ins>+#include "AssemblyHelpersSpoolers.h"
</ins><span class="cx"> #include "MaxFrameExtentForSlowPathCall.h"
</span><span class="cx"> #include "VM.h"
</span><span class="cx"> 
</span><span class="lines">@@ -101,7 +102,7 @@
</span><span class="cx"> GPRReg ScratchRegisterAllocator::allocateScratchGPR() { return allocateScratch<GPRInfo>(); }
</span><span class="cx"> FPRReg ScratchRegisterAllocator::allocateScratchFPR() { return allocateScratch<FPRInfo>(); }
</span><span class="cx"> 
</span><del>-ScratchRegisterAllocator::PreservedState ScratchRegisterAllocator::preserveReusedRegistersByPushing(MacroAssembler& jit, ExtraStackSpace extraStackSpace)
</del><ins>+ScratchRegisterAllocator::PreservedState ScratchRegisterAllocator::preserveReusedRegistersByPushing(AssemblyHelpers& jit, ExtraStackSpace extraStackSpace)
</ins><span class="cx"> {
</span><span class="cx">     if (!didReuseRegisters())
</span><span class="cx">         return PreservedState(0, extraStackSpace);
</span><span class="lines">@@ -124,7 +125,7 @@
</span><span class="cx">     return PreservedState(stackAdjustmentSize, extraStackSpace);
</span><span class="cx"> }
</span><span class="cx"> 
</span><del>-void ScratchRegisterAllocator::restoreReusedRegistersByPopping(MacroAssembler& jit, const ScratchRegisterAllocator::PreservedState& preservedState)
</del><ins>+void ScratchRegisterAllocator::restoreReusedRegistersByPopping(AssemblyHelpers& jit, const ScratchRegisterAllocator::PreservedState& preservedState)
</ins><span class="cx"> {
</span><span class="cx">     RELEASE_ASSERT(preservedState);
</span><span class="cx">     if (!didReuseRegisters())
</span><span class="lines">@@ -161,7 +162,7 @@
</span><span class="cx">     return usedRegistersForCall().numberOfSetRegisters() * sizeof(JSValue);
</span><span class="cx"> }
</span><span class="cx"> 
</span><del>-unsigned ScratchRegisterAllocator::preserveRegistersToStackForCall(MacroAssembler& jit, const RegisterSet& usedRegisters, unsigned extraBytesAtTopOfStack)
</del><ins>+unsigned ScratchRegisterAllocator::preserveRegistersToStackForCall(AssemblyHelpers& jit, const RegisterSet& usedRegisters, unsigned extraBytesAtTopOfStack)
</ins><span class="cx"> {
</span><span class="cx">     RELEASE_ASSERT(extraBytesAtTopOfStack % sizeof(void*) == 0);
</span><span class="cx">     if (!usedRegisters.numberOfSetRegisters())
</span><span class="lines">@@ -174,19 +175,24 @@
</span><span class="cx">         MacroAssembler::TrustedImm32(stackOffset),
</span><span class="cx">         MacroAssembler::stackPointerRegister);
</span><span class="cx"> 
</span><ins>+    AssemblyHelpers::StoreRegSpooler spooler(jit, MacroAssembler::stackPointerRegister);
+
</ins><span class="cx">     unsigned count = 0;
</span><span class="cx">     for (GPRReg reg = MacroAssembler::firstRegister(); reg <= MacroAssembler::lastRegister(); reg = MacroAssembler::nextRegister(reg)) {
</span><span class="cx">         if (usedRegisters.get(reg)) {
</span><del>-            jit.storePtr(reg, MacroAssembler::Address(MacroAssembler::stackPointerRegister, extraBytesAtTopOfStack + (count * sizeof(EncodedJSValue))));
</del><ins>+            spooler.storeGPR({ reg, static_cast<ptrdiff_t>(extraBytesAtTopOfStack + (count * sizeof(EncodedJSValue))) });
</ins><span class="cx">             count++;
</span><span class="cx">         }
</span><span class="cx">     }
</span><ins>+    spooler.finalizeGPR();
+
</ins><span class="cx">     for (FPRReg reg = MacroAssembler::firstFPRegister(); reg <= MacroAssembler::lastFPRegister(); reg = MacroAssembler::nextFPRegister(reg)) {
</span><span class="cx">         if (usedRegisters.get(reg)) {
</span><del>-            jit.storeDouble(reg, MacroAssembler::Address(MacroAssembler::stackPointerRegister, extraBytesAtTopOfStack + (count * sizeof(EncodedJSValue))));
</del><ins>+            spooler.storeFPR({ reg, static_cast<ptrdiff_t>(extraBytesAtTopOfStack + (count * sizeof(EncodedJSValue))) });
</ins><span class="cx">             count++;
</span><span class="cx">         }
</span><span class="cx">     }
</span><ins>+    spooler.finalizeFPR();
</ins><span class="cx"> 
</span><span class="cx">     RELEASE_ASSERT(count == usedRegisters.numberOfSetRegisters());
</span><span class="cx"> 
</span><span class="lines">@@ -193,7 +199,7 @@
</span><span class="cx">     return stackOffset;
</span><span class="cx"> }
</span><span class="cx"> 
</span><del>-void ScratchRegisterAllocator::restoreRegistersFromStackForCall(MacroAssembler& jit, const RegisterSet& usedRegisters, const RegisterSet& ignore, unsigned numberOfStackBytesUsedForRegisterPreservation, unsigned extraBytesAtTopOfStack)
</del><ins>+void ScratchRegisterAllocator::restoreRegistersFromStackForCall(AssemblyHelpers& jit, const RegisterSet& usedRegisters, const RegisterSet& ignore, unsigned numberOfStackBytesUsedForRegisterPreservation, unsigned extraBytesAtTopOfStack)
</ins><span class="cx"> {
</span><span class="cx">     RELEASE_ASSERT(extraBytesAtTopOfStack % sizeof(void*) == 0);
</span><span class="cx">     if (!usedRegisters.numberOfSetRegisters()) {
</span><span class="lines">@@ -201,21 +207,26 @@
</span><span class="cx">         return;
</span><span class="cx">     }
</span><span class="cx"> 
</span><ins>+    AssemblyHelpers::LoadRegSpooler spooler(jit, MacroAssembler::stackPointerRegister);
+
</ins><span class="cx">     unsigned count = 0;
</span><span class="cx">     for (GPRReg reg = MacroAssembler::firstRegister(); reg <= MacroAssembler::lastRegister(); reg = MacroAssembler::nextRegister(reg)) {
</span><span class="cx">         if (usedRegisters.get(reg)) {
</span><span class="cx">             if (!ignore.get(reg))
</span><del>-                jit.loadPtr(MacroAssembler::Address(MacroAssembler::stackPointerRegister, extraBytesAtTopOfStack + (sizeof(EncodedJSValue) * count)), reg);
</del><ins>+                spooler.loadGPR({ reg, static_cast<ptrdiff_t>(extraBytesAtTopOfStack + (sizeof(EncodedJSValue) * count)) });
</ins><span class="cx">             count++;
</span><span class="cx">         }
</span><span class="cx">     }
</span><ins>+    spooler.finalizeGPR();
+
</ins><span class="cx">     for (FPRReg reg = MacroAssembler::firstFPRegister(); reg <= MacroAssembler::lastFPRegister(); reg = MacroAssembler::nextFPRegister(reg)) {
</span><span class="cx">         if (usedRegisters.get(reg)) {
</span><span class="cx">             if (!ignore.get(reg))
</span><del>-                jit.loadDouble(MacroAssembler::Address(MacroAssembler::stackPointerRegister, extraBytesAtTopOfStack + (sizeof(EncodedJSValue) * count)), reg);
</del><ins>+                spooler.loadFPR({ reg, static_cast<ptrdiff_t>(extraBytesAtTopOfStack + (sizeof(EncodedJSValue) * count)) });
</ins><span class="cx">             count++;
</span><span class="cx">         }
</span><span class="cx">     }
</span><ins>+    spooler.finalizeFPR();
</ins><span class="cx"> 
</span><span class="cx">     unsigned stackOffset = (usedRegisters.numberOfSetRegisters()) * sizeof(EncodedJSValue);
</span><span class="cx">     stackOffset += extraBytesAtTopOfStack;
</span></span></pre></div>
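<p>In the hunks above, both the preserve and restore paths compute each slot as <code>extraBytesAtTopOfStack + count * sizeof(EncodedJSValue)</code>, and the restore path increments <code>count</code> even for registers in <code>ignore</code>. The sketch below illustrates why that keeps the two layouts in step. It is an illustration only: <code>SavedReg</code>, <code>Slot</code>, <code>slotsToStore</code> and <code>slotsToLoad</code> are hypothetical names, a plain vector replaces <code>RegisterSet</code>, and the slot size assumes an 8-byte <code>EncodedJSValue</code>.</p>

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins; the real code iterates a RegisterSet.
struct SavedReg {
    int id;
    bool ignoreOnRestore;
};

struct Slot {
    int id;
    std::ptrdiff_t offset;
};

// Assumption: sizeof(EncodedJSValue) is 8 bytes.
constexpr std::size_t slotSize = sizeof(std::int64_t);

// Offsets written by the preserve path: one slot per used register.
std::vector<Slot> slotsToStore(const std::vector<SavedReg>& used, unsigned extraBytesAtTopOfStack)
{
    std::vector<Slot> slots;
    unsigned count = 0;
    for (const SavedReg& reg : used) {
        slots.push_back({ reg.id, static_cast<std::ptrdiff_t>(extraBytesAtTopOfStack + count * slotSize) });
        count++;
    }
    return slots;
}

// Offsets read by the restore path: ignored registers are not reloaded,
// but they still consume a slot so later registers keep their offsets.
std::vector<Slot> slotsToLoad(const std::vector<SavedReg>& used, unsigned extraBytesAtTopOfStack)
{
    std::vector<Slot> slots;
    unsigned count = 0;
    for (const SavedReg& reg : used) {
        if (!reg.ignoreOnRestore)
            slots.push_back({ reg.id, static_cast<std::ptrdiff_t>(extraBytesAtTopOfStack + count * slotSize) });
        count++;
    }
    return slots;
}
```

<p>Because an ignored register leaves a gap, the registers on either side of it are no longer adjacent in the load sequence, so a pairing spooler would be expected to fall back to single loads there.</p>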
<a id="trunkSourceJavaScriptCorejitScratchRegisterAllocatorh"></a>
<div class="modfile"><h4>Modified: trunk/Source/JavaScriptCore/jit/ScratchRegisterAllocator.h (279255 => 279256)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/Source/JavaScriptCore/jit/ScratchRegisterAllocator.h       2021-06-25 00:06:06 UTC (rev 279255)
+++ trunk/Source/JavaScriptCore/jit/ScratchRegisterAllocator.h  2021-06-25 00:06:56 UTC (rev 279256)
</span><span class="lines">@@ -1,5 +1,5 @@
</span><span class="cx"> /*
</span><del>- * Copyright (C) 2012, 2014 Apple Inc. All rights reserved.
</del><ins>+ * Copyright (C) 2012-2021 Apple Inc. All rights reserved.
</ins><span class="cx">  *
</span><span class="cx">  * Redistribution and use in source and binary forms, with or without
</span><span class="cx">  * modification, are permitted provided that the following conditions
</span><span class="lines">@@ -27,12 +27,12 @@
</span><span class="cx"> 
</span><span class="cx"> #if ENABLE(JIT)
</span><span class="cx"> 
</span><del>-#include "MacroAssembler.h"
</del><span class="cx"> #include "RegisterSet.h"
</span><span class="cx"> #include "TempRegisterSet.h"
</span><span class="cx"> 
</span><span class="cx"> namespace JSC {
</span><span class="cx"> 
</span><ins>+class AssemblyHelpers;
</ins><span class="cx"> struct ScratchBuffer;
</span><span class="cx"> 
</span><span class="cx"> // This class provides a low-level register allocator for use in stubs.
</span><span class="lines">@@ -84,16 +84,16 @@
</span><span class="cx">         ExtraStackSpace extraStackSpaceRequirement;
</span><span class="cx">     };
</span><span class="cx"> 
</span><del>-    PreservedState preserveReusedRegistersByPushing(MacroAssembler& jit, ExtraStackSpace);
-    void restoreReusedRegistersByPopping(MacroAssembler& jit, const PreservedState&);
</del><ins>+    PreservedState preserveReusedRegistersByPushing(AssemblyHelpers& jit, ExtraStackSpace);
+    void restoreReusedRegistersByPopping(AssemblyHelpers& jit, const PreservedState&);
</ins><span class="cx">     
</span><span class="cx">     RegisterSet usedRegistersForCall() const;
</span><span class="cx">     
</span><span class="cx">     unsigned desiredScratchBufferSizeForCall() const;
</span><del>-    
-    static unsigned preserveRegistersToStackForCall(MacroAssembler& jit, const RegisterSet& usedRegisters, unsigned extraPaddingInBytes);
-    static void restoreRegistersFromStackForCall(MacroAssembler& jit, const RegisterSet& usedRegisters, const RegisterSet& ignore, unsigned numberOfStackBytesUsedForRegisterPreservation, unsigned extraPaddingInBytes);
</del><span class="cx"> 
</span><ins>+    static unsigned preserveRegistersToStackForCall(AssemblyHelpers& jit, const RegisterSet& usedRegisters, unsigned extraPaddingInBytes);
+    static void restoreRegistersFromStackForCall(AssemblyHelpers& jit, const RegisterSet& usedRegisters, const RegisterSet& ignore, unsigned numberOfStackBytesUsedForRegisterPreservation, unsigned extraPaddingInBytes);
+
</ins><span class="cx"> private:
</span><span class="cx">     RegisterSet m_usedRegisters;
</span><span class="cx">     TempRegisterSet m_lockedRegisters;
</span></span></pre></div>
<a id="trunkSourceJavaScriptCorewasmWasmAirIRGeneratorcpp"></a>
<div class="modfile"><h4>Modified: trunk/Source/JavaScriptCore/wasm/WasmAirIRGenerator.cpp (279255 => 279256)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/Source/JavaScriptCore/wasm/WasmAirIRGenerator.cpp  2021-06-25 00:06:06 UTC (rev 279255)
+++ trunk/Source/JavaScriptCore/wasm/WasmAirIRGenerator.cpp     2021-06-25 00:06:56 UTC (rev 279256)
</span><span class="lines">@@ -1,5 +1,5 @@
</span><span class="cx"> /*
</span><del>- * Copyright (C) 2019-2020 Apple Inc. All rights reserved.
</del><ins>+ * Copyright (C) 2019-2021 Apple Inc. All rights reserved.
</ins><span class="cx">  *
</span><span class="cx">  * Redistribution and use in source and binary forms, with or without
</span><span class="cx">  * modification, are permitted provided that the following conditions
</span><span class="lines">@@ -2981,10 +2981,7 @@
</span><span class="cx">     B3::PatchpointValue* patch = addPatchpoint(B3::Void);
</span><span class="cx">     patch->setGenerator([] (CCallHelpers& jit, const B3::StackmapGenerationParams& params) {
</span><span class="cx">         auto calleeSaves = params.code().calleeSaveRegisterAtOffsetList();
</span><del>-
-        for (RegisterAtOffset calleeSave : calleeSaves)
-            jit.load64ToReg(CCallHelpers::Address(GPRInfo::callFrameRegister, calleeSave.offset()), calleeSave.reg());
-
</del><ins>+        jit.emitRestore(calleeSaves);
</ins><span class="cx">         jit.emitFunctionEpilogue();
</span><span class="cx">         jit.ret();
</span><span class="cx">     });
</span></span></pre></div>
<a id="trunkSourceJavaScriptCorewasmWasmB3IRGeneratorcpp"></a>
<div class="modfile"><h4>Modified: trunk/Source/JavaScriptCore/wasm/WasmB3IRGenerator.cpp (279255 => 279256)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/Source/JavaScriptCore/wasm/WasmB3IRGenerator.cpp   2021-06-25 00:06:06 UTC (rev 279255)
+++ trunk/Source/JavaScriptCore/wasm/WasmB3IRGenerator.cpp      2021-06-25 00:06:56 UTC (rev 279256)
</span><span class="lines">@@ -1,5 +1,5 @@
</span><span class="cx"> /*
</span><del>- * Copyright (C) 2016-2020 Apple Inc. All rights reserved.
</del><ins>+ * Copyright (C) 2016-2021 Apple Inc. All rights reserved.
</ins><span class="cx">  *
</span><span class="cx">  * Redistribution and use in source and binary forms, with or without
</span><span class="cx">  * modification, are permitted provided that the following conditions
</span><span class="lines">@@ -2284,10 +2284,7 @@
</span><span class="cx">     PatchpointValue* patch = m_proc.add<PatchpointValue>(B3::Void, origin());
</span><span class="cx">     patch->setGenerator([] (CCallHelpers& jit, const B3::StackmapGenerationParams& params) {
</span><span class="cx">         auto calleeSaves = params.code().calleeSaveRegisterAtOffsetList();
</span><del>-
-        for (RegisterAtOffset calleeSave : calleeSaves)
-            jit.load64ToReg(CCallHelpers::Address(GPRInfo::callFrameRegister, calleeSave.offset()), calleeSave.reg());
-
</del><ins>+        jit.emitRestore(calleeSaves);
</ins><span class="cx">         jit.emitFunctionEpilogue();
</span><span class="cx">         jit.ret();
</span><span class="cx">     });
</span></span></pre>
</div>
</div>

</body>
</html>