<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
"http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head><meta http-equiv="content-type" content="text/html; charset=utf-8" />
<title>[198708] trunk/Source/JavaScriptCore</title>
</head>
<body>

<style type="text/css"><!--
#msg dl.meta { border: 1px #006 solid; background: #369; padding: 6px; color: #fff; }
#msg dl.meta dt { float: left; width: 6em; font-weight: bold; }
#msg dt:after { content:':';}
#msg dl, #msg dt, #msg ul, #msg li, #header, #footer, #logmsg { font-family: verdana,arial,helvetica,sans-serif; font-size: 10pt;  }
#msg dl a { font-weight: bold}
#msg dl a:link    { color:#fc3; }
#msg dl a:active  { color:#ff0; }
#msg dl a:visited { color:#cc6; }
h3 { font-family: verdana,arial,helvetica,sans-serif; font-size: 10pt; font-weight: bold; }
#msg pre { overflow: auto; background: #ffc; border: 1px #fa0 solid; padding: 6px; }
#logmsg { background: #ffc; border: 1px #fa0 solid; padding: 1em 1em 0 1em; }
#logmsg p, #logmsg pre, #logmsg blockquote { margin: 0 0 1em 0; }
#logmsg p, #logmsg li, #logmsg dt, #logmsg dd { line-height: 14pt; }
#logmsg h1, #logmsg h2, #logmsg h3, #logmsg h4, #logmsg h5, #logmsg h6 { margin: .5em 0; }
#logmsg h1:first-child, #logmsg h2:first-child, #logmsg h3:first-child, #logmsg h4:first-child, #logmsg h5:first-child, #logmsg h6:first-child { margin-top: 0; }
#logmsg ul, #logmsg ol { padding: 0; list-style-position: inside; margin: 0 0 0 1em; }
#logmsg ul { text-indent: -1em; padding-left: 1em; }#logmsg ol { text-indent: -1.5em; padding-left: 1.5em; }
#logmsg > ul, #logmsg > ol { margin: 0 0 1em 0; }
#logmsg pre { background: #eee; padding: 1em; }
#logmsg blockquote { border: 1px solid #fa0; border-left-width: 10px; padding: 1em 1em 0 1em; background: white;}
#logmsg dl { margin: 0; }
#logmsg dt { font-weight: bold; }
#logmsg dd { margin: 0; padding: 0 0 0.5em 0; }
#logmsg dd:before { content:'\00bb';}
#logmsg table { border-spacing: 0px; border-collapse: collapse; border-top: 4px solid #fa0; border-bottom: 1px solid #fa0; background: #fff; }
#logmsg table th { text-align: left; font-weight: normal; padding: 0.2em 0.5em; border-top: 1px dotted #fa0; }
#logmsg table td { text-align: right; border-top: 1px dotted #fa0; padding: 0.2em 0.5em; }
#logmsg table thead th { text-align: center; border-bottom: 1px solid #fa0; }
#logmsg table th.Corner { text-align: left; }
#logmsg hr { border: none 0; border-top: 2px dashed #fa0; height: 1px; }
#header, #footer { color: #fff; background: #636; border: 1px #300 solid; padding: 6px; }
#patch { width: 100%; }
#patch h4 {font-family: verdana,arial,helvetica,sans-serif;font-size:10pt;padding:8px;background:#369;color:#fff;margin:0;}
#patch .propset h4, #patch .binary h4 {margin:0;}
#patch pre {padding:0;line-height:1.2em;margin:0;}
#patch .diff {width:100%;background:#eee;padding: 0 0 10px 0;overflow:auto;}
#patch .propset .diff, #patch .binary .diff  {padding:10px 0;}
#patch span {display:block;padding:0 10px;}
#patch .modfile, #patch .addfile, #patch .delfile, #patch .propset, #patch .binary, #patch .copfile {border:1px solid #ccc;margin:10px 0;}
#patch ins {background:#dfd;text-decoration:none;display:block;padding:0 10px;}
#patch del {background:#fdd;text-decoration:none;display:block;padding:0 10px;}
#patch .lines, .info {color:#888;background:#fff;}
--></style>
<div id="msg">
<dl class="meta">
<dt>Revision</dt> <dd><a href="http://trac.webkit.org/projects/webkit/changeset/198708">198708</a></dd>
<dt>Author</dt> <dd>commit-queue@webkit.org</dd>
<dt>Date</dt> <dd>2016-03-25 20:47:23 -0700 (Fri, 25 Mar 2016)</dd>
</dl>

<h3>Log Message</h3>
<pre>[JSC] Put the x86 Assembler on a binary diet
https://bugs.webkit.org/show_bug.cgi?id=155683

Patch by Benjamin Poulain &lt;bpoulain@apple.com&gt; on 2016-03-25
Reviewed by Darin Adler.

The MacroAssemblers are heavily inlined. This is unfortunately
important for baseline JIT where many branches can be eliminated
at compile time.

This inlining causes a lot of binary bloat. The phases
that lower to assembly are very large.

This patch improves the situation a bit for x86 through
many small improvements:

-Every instruction starts with ensureSpace(). The slow
 path reallocates the buffer.
 In that slow path, only fastRealloc() was a function
 call. The code around it does not need to be fast, so I moved
 the whole grow() function out of line for those cases.

-When testing multiple registers for REX requirements,
 we had something like this:
     byteRegRequiresRex(reg) || byteRegRequiresRex(rm)
     regRequiresRex(index) || regRequiresRex(base)
 Those were producing multiple test-and-branch sequences. Those
 branches are effectively random, so we don't have to care about
 individual branches being predictable.

 The new code effectively does:
     byteRegRequiresRex(reg | rm)
     regRequiresRex(index | base)

-Change &quot;ModRmMode&quot; to have the values we can OR directly
 into the generated ModRM byte.
 This is important because some ModRM code is so large
 that it goes out of line.

-Finally, a big change on how we write to the AssemblerBuffer.

 Previously, instructions were written byte by byte into
 the assembler buffer of the MacroAssembler.

 The problem with that is the compiler cannot prove that
 the buffer pointer and the AssemblerBuffer are not pointing
 to the same memory.

 Because of that, before any write, all the local registers
 were written back to the AssemblerBuffer memory, then everything
 was read back after the write to compute the next write.

 I attempted to use the &quot;restrict&quot; keyword and wrapper types
 to help Clang with that but nothing worked.

 The current solution is to keep a local copy of the index
 and the buffer pointer in the scope of each instruction.
 That is done by AssemblerBuffer::LocalWriter.

 Since LocalWriter only exists locally, it stays in
 registers and we don't have all the memory churn between
 byte writes. This also allows Clang to combine
 obvious cases since there are no longer observable side
 effects between bytes.

This patch reduces the binary size by 66k. It is also a small
speed-up on SunSpider.

* assembler/AssemblerBuffer.h:
(JSC::AssemblerBuffer::ensureSpace):
(JSC::AssemblerBuffer::LocalWriter::LocalWriter):
(JSC::AssemblerBuffer::LocalWriter::~LocalWriter):
(JSC::AssemblerBuffer::LocalWriter::putByteUnchecked):
(JSC::AssemblerBuffer::LocalWriter::putShortUnchecked):
(JSC::AssemblerBuffer::LocalWriter::putIntUnchecked):
(JSC::AssemblerBuffer::LocalWriter::putInt64Unchecked):
(JSC::AssemblerBuffer::LocalWriter::putIntegralUnchecked):
(JSC::AssemblerBuffer::putIntegral):
(JSC::AssemblerBuffer::outOfLineGrow):
* assembler/MacroAssemblerX86Common.h:
* assembler/X86Assembler.h:
(JSC::X86Assembler::X86InstructionFormatter::byteRegRequiresRex):
(JSC::X86Assembler::X86InstructionFormatter::regRequiresRex):
(JSC::X86Assembler::X86InstructionFormatter::LocalBufferWriter::LocalBufferWriter):
(JSC::X86Assembler::X86InstructionFormatter::LocalBufferWriter::emitRex):
(JSC::X86Assembler::X86InstructionFormatter::LocalBufferWriter::emitRexW):
(JSC::X86Assembler::X86InstructionFormatter::LocalBufferWriter::emitRexIf):
(JSC::X86Assembler::X86InstructionFormatter::LocalBufferWriter::emitRexIfNeeded):
(JSC::X86Assembler::X86InstructionFormatter::LocalBufferWriter::putModRm):
(JSC::X86Assembler::X86InstructionFormatter::LocalBufferWriter::putModRmSib):
(JSC::X86Assembler::X86InstructionFormatter::LocalBufferWriter::registerModRM):
(JSC::X86Assembler::X86InstructionFormatter::LocalBufferWriter::memoryModRM):
(JSC::X86Assembler::X86InstructionFormatter::oneByteOp): Deleted.
(JSC::X86Assembler::X86InstructionFormatter::oneByteOp_disp32): Deleted.
(JSC::X86Assembler::X86InstructionFormatter::oneByteOp_disp8): Deleted.
(JSC::X86Assembler::X86InstructionFormatter::twoByteOp): Deleted.
(JSC::X86Assembler::X86InstructionFormatter::threeByteOp): Deleted.
(JSC::X86Assembler::X86InstructionFormatter::oneByteOp64): Deleted.
(JSC::X86Assembler::X86InstructionFormatter::oneByteOp64_disp32): Deleted.
(JSC::X86Assembler::X86InstructionFormatter::oneByteOp64_disp8): Deleted.
(JSC::X86Assembler::X86InstructionFormatter::twoByteOp64): Deleted.
(JSC::X86Assembler::X86InstructionFormatter::oneByteOp8): Deleted.
(JSC::X86Assembler::X86InstructionFormatter::twoByteOp8): Deleted.
(JSC::X86Assembler::X86InstructionFormatter::emitRex): Deleted.
(JSC::X86Assembler::X86InstructionFormatter::emitRexW): Deleted.
(JSC::X86Assembler::X86InstructionFormatter::emitRexIf): Deleted.
(JSC::X86Assembler::X86InstructionFormatter::emitRexIfNeeded): Deleted.
(JSC::X86Assembler::X86InstructionFormatter::putModRm): Deleted.
(JSC::X86Assembler::X86InstructionFormatter::putModRmSib): Deleted.
(JSC::X86Assembler::X86InstructionFormatter::registerModRM): Deleted.
(JSC::X86Assembler::X86InstructionFormatter::memoryModRM): Deleted.</pre>
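<p>The merged REX test in the log message works because register numbers are small non-negative integers and each threshold is a power of two: if any operand has the threshold bit or a higher bit set, so does the OR of the operands. The following is a minimal standalone sketch of that equivalence, not the WebKit code itself. The function names and the 0 to 15 register range are assumptions made for illustration; the thresholds 4 and 8 match the static_assert values in the patch.</p>

```cpp
#include <cassert>

// Thresholds taken from the patch's static_asserts (esp == 4, r8 == 8).
constexpr int firstByteRegNeedingRex = 4;
constexpr int firstRegNeedingRex = 8;

// Old shape: one comparison (and potentially one branch) per operand.
constexpr bool requiresRexSeparate(int a, int b, int threshold)
{
    return a >= threshold || b >= threshold;
}

// New shape: OR the operands first, then compare once.
// Only valid for non-negative operands and a power-of-two threshold.
constexpr bool requiresRexMerged(int a, int b, int threshold)
{
    return (a | b) >= threshold;
}

// Spot checks evaluated at compile time.
static_assert(requiresRexMerged(3, 9, firstRegNeedingRex), "r9 needs REX");
static_assert(!requiresRexMerged(3, 7, firstRegNeedingRex), "no REX below r8");

// Exhaustively compares both shapes over register numbers 0..15
// (hypothetical helper, used only to check the equivalence).
inline bool formsAgree(int threshold)
{
    for (int a = 0; a < 16; ++a) {
        for (int b = 0; b < 16; ++b) {
            if (requiresRexSeparate(a, b, threshold) != requiresRexMerged(a, b, threshold))
                return false;
        }
    }
    return true;
}
```

<p>The equivalence would not hold for a threshold such as 6, which is why the patch guards each check with a static_assert on the register numbering.</p>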

<h3>Modified Paths</h3>
<ul>
<li><a href="#trunkSourceJavaScriptCoreChangeLog">trunk/Source/JavaScriptCore/ChangeLog</a></li>
<li><a href="#trunkSourceJavaScriptCoreassemblerAssemblerBufferh">trunk/Source/JavaScriptCore/assembler/AssemblerBuffer.h</a></li>
<li><a href="#trunkSourceJavaScriptCoreassemblerMacroAssemblerX86Commonh">trunk/Source/JavaScriptCore/assembler/MacroAssemblerX86Common.h</a></li>
<li><a href="#trunkSourceJavaScriptCoreassemblerX86Assemblerh">trunk/Source/JavaScriptCore/assembler/X86Assembler.h</a></li>
</ul>

</div>
<div id="patch">
<h3>Diff</h3>
<a id="trunkSourceJavaScriptCoreChangeLog"></a>
<div class="modfile"><h4>Modified: trunk/Source/JavaScriptCore/ChangeLog (198707 => 198708)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/Source/JavaScriptCore/ChangeLog        2016-03-26 03:45:01 UTC (rev 198707)
+++ trunk/Source/JavaScriptCore/ChangeLog        2016-03-26 03:47:23 UTC (rev 198708)
</span><span class="lines">@@ -1,3 +1,116 @@
</span><ins>+2016-03-25  Benjamin Poulain  &lt;bpoulain@apple.com&gt;
+
+        [JSC] Put the x86 Assembler on a binary diet
+        https://bugs.webkit.org/show_bug.cgi?id=155683
+
+        Reviewed by Darin Adler.
+
+        The MacroAssemblers are heavily inlined. This is unfortunately
+        important for baseline JIT where many branches can be eliminated
+        at compile time.
+
+        This inlining causes a lot of binary bloat. The phases
+        lowering to ASM are massively large.
+
+        This patch improves the situation a bit for x86 through
+        many small improvements:
+
+        -Every instruction starts with ensureSpace(). The slow
+         path realloc the buffer.
+         From that slow path, only fastRealloc() was a function
+         call. What is around does not need to be fast, I moved
+         the whole grow() function out of line for those cases.
+
+        -When testing multiple registers for REX requirements,
+         we had something like this:
+             byteRegRequiresRex(reg) || byteRegRequiresRex(rm)
+             regRequiresRex(index) || regRequiresRex(base)
+         Those were producing multiple test-and-branch. Those branches
+         are effectively random so we don't have to care about individual
+         branches being predictable.
+
+         The new code effectively does:
+             byteRegRequiresRex(reg | rm)
+             regRequiresRex(index | base)
+
+        -Change &quot;ModRmMode&quot; to have the value we can OR directly
+         to the generated ModRm.
+         This is important because some ModRM code is so large
+         that is goes out of line;
+
+        -Finally, a big change on how we write to the AssemblerBuffer.
+
+         Previously, instructions were written byte by byte into
+         the assembler buffer of the MacroAssembler.
+
+         The problem with that is the compiler cannot prove that
+         the buffer pointer and the AssemblerBuffer are not pointing
+         to the same memory.
+
+         Because of that, before any write, all the local register
+         were pushed back to the AssemblerBuffer memory, then everything
+         was read back after the write to compute the next write.
+
+         I attempted to use the &quot;restrict&quot; keyword and wrapper types
+         to help Clang with that but nothing worked.
+
+         The current solution is to keep a local copy of the index
+         and the buffer pointer in the scope of each instruction.
+         That is done by AssemblerBuffer::LocalWriter.
+
+         Since LocalWriter only exists locally, it stays in
+         register and we don't have all the memory churn between
+         each byte writing. This also allows clang to combine
+         obvious cases since there are no longer observable side
+         effects between bytes.
+
+        This patch reduces the binary size by 66k. It is a small
+        speed-up on Sunspider.
+
+        * assembler/AssemblerBuffer.h:
+        (JSC::AssemblerBuffer::ensureSpace):
+        (JSC::AssemblerBuffer::LocalWriter::LocalWriter):
+        (JSC::AssemblerBuffer::LocalWriter::~LocalWriter):
+        (JSC::AssemblerBuffer::LocalWriter::putByteUnchecked):
+        (JSC::AssemblerBuffer::LocalWriter::putShortUnchecked):
+        (JSC::AssemblerBuffer::LocalWriter::putIntUnchecked):
+        (JSC::AssemblerBuffer::LocalWriter::putInt64Unchecked):
+        (JSC::AssemblerBuffer::LocalWriter::putIntegralUnchecked):
+        (JSC::AssemblerBuffer::putIntegral):
+        (JSC::AssemblerBuffer::outOfLineGrow):
+        * assembler/MacroAssemblerX86Common.h:
+        * assembler/X86Assembler.h:
+        (JSC::X86Assembler::X86InstructionFormatter::byteRegRequiresRex):
+        (JSC::X86Assembler::X86InstructionFormatter::regRequiresRex):
+        (JSC::X86Assembler::X86InstructionFormatter::LocalBufferWriter::LocalBufferWriter):
+        (JSC::X86Assembler::X86InstructionFormatter::LocalBufferWriter::emitRex):
+        (JSC::X86Assembler::X86InstructionFormatter::LocalBufferWriter::emitRexW):
+        (JSC::X86Assembler::X86InstructionFormatter::LocalBufferWriter::emitRexIf):
+        (JSC::X86Assembler::X86InstructionFormatter::LocalBufferWriter::emitRexIfNeeded):
+        (JSC::X86Assembler::X86InstructionFormatter::LocalBufferWriter::putModRm):
+        (JSC::X86Assembler::X86InstructionFormatter::LocalBufferWriter::putModRmSib):
+        (JSC::X86Assembler::X86InstructionFormatter::LocalBufferWriter::registerModRM):
+        (JSC::X86Assembler::X86InstructionFormatter::LocalBufferWriter::memoryModRM):
+        (JSC::X86Assembler::X86InstructionFormatter::oneByteOp): Deleted.
+        (JSC::X86Assembler::X86InstructionFormatter::oneByteOp_disp32): Deleted.
+        (JSC::X86Assembler::X86InstructionFormatter::oneByteOp_disp8): Deleted.
+        (JSC::X86Assembler::X86InstructionFormatter::twoByteOp): Deleted.
+        (JSC::X86Assembler::X86InstructionFormatter::threeByteOp): Deleted.
+        (JSC::X86Assembler::X86InstructionFormatter::oneByteOp64): Deleted.
+        (JSC::X86Assembler::X86InstructionFormatter::oneByteOp64_disp32): Deleted.
+        (JSC::X86Assembler::X86InstructionFormatter::oneByteOp64_disp8): Deleted.
+        (JSC::X86Assembler::X86InstructionFormatter::twoByteOp64): Deleted.
+        (JSC::X86Assembler::X86InstructionFormatter::oneByteOp8): Deleted.
+        (JSC::X86Assembler::X86InstructionFormatter::twoByteOp8): Deleted.
+        (JSC::X86Assembler::X86InstructionFormatter::emitRex): Deleted.
+        (JSC::X86Assembler::X86InstructionFormatter::emitRexW): Deleted.
+        (JSC::X86Assembler::X86InstructionFormatter::emitRexIf): Deleted.
+        (JSC::X86Assembler::X86InstructionFormatter::emitRexIfNeeded): Deleted.
+        (JSC::X86Assembler::X86InstructionFormatter::putModRm): Deleted.
+        (JSC::X86Assembler::X86InstructionFormatter::putModRmSib): Deleted.
+        (JSC::X86Assembler::X86InstructionFormatter::registerModRM): Deleted.
+        (JSC::X86Assembler::X86InstructionFormatter::memoryModRM): Deleted.
+
</ins><span class="cx"> 2016-03-25  Saam barati  &lt;sbarati@apple.com&gt;
</span><span class="cx"> 
</span><span class="cx">         RegExp.prototype.test should be an intrinsic again
</span></span></pre></div>
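<p>The ModRmMode change in the ChangeLog entry above can be illustrated with a small standalone sketch. The new enum values are copied from the patch; the "old" enum and the shift in modRmByteOld() are an assumption about how the previous code combined the mode with the other fields, and both function names are hypothetical.</p>

```cpp
#include <cassert>
#include <cstdint>

// Assumed old shape: sequential values that must be shifted at every use.
enum OldModRmMode { OldMemoryNoDisp, OldMemoryDisp8, OldMemoryDisp32, OldRegister };

// New shape, as in the patch: values already sit in bits 7..6 of the ModRM byte.
enum ModRmMode {
    ModRmMemoryNoDisp = 0,
    ModRmMemoryDisp8 = 1 << 6,
    ModRmMemoryDisp32 = 2 << 6,
    ModRmRegister = 3 << 6,
};

// Builds a ModRM byte from a sequential mode value (shift needed).
inline std::uint8_t modRmByteOld(OldModRmMode mode, int reg, int rm)
{
    return static_cast<std::uint8_t>((mode << 6) | ((reg & 7) << 3) | (rm & 7));
}

// Builds a ModRM byte from a pre-shifted mode value (plain OR).
inline std::uint8_t modRmByteNew(ModRmMode mode, int reg, int rm)
{
    return static_cast<std::uint8_t>(mode | ((reg & 7) << 3) | (rm & 7));
}
```

<p>Both helpers produce the same byte. The second one saves a shift whenever the mode is not a compile-time constant, which matters in the ModRM code that is too large to be inlined.</p>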
<a id="trunkSourceJavaScriptCoreassemblerAssemblerBufferh"></a>
<div class="modfile"><h4>Modified: trunk/Source/JavaScriptCore/assembler/AssemblerBuffer.h (198707 => 198708)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/Source/JavaScriptCore/assembler/AssemblerBuffer.h        2016-03-26 03:45:01 UTC (rev 198707)
+++ trunk/Source/JavaScriptCore/assembler/AssemblerBuffer.h        2016-03-26 03:47:23 UTC (rev 198708)
</span><span class="lines">@@ -121,15 +121,15 @@
</span><span class="cx">         {
</span><span class="cx">         }
</span><span class="cx"> 
</span><del>-        bool isAvailable(int space)
</del><ins>+        bool isAvailable(unsigned space)
</ins><span class="cx">         {
</span><span class="cx">             return m_index + space &lt;= m_storage.capacity();
</span><span class="cx">         }
</span><span class="cx"> 
</span><del>-        void ensureSpace(int space)
</del><ins>+        void ensureSpace(unsigned space)
</ins><span class="cx">         {
</span><span class="cx">             if (!isAvailable(space))
</span><del>-                grow();
</del><ins>+                outOfLineGrow();
</ins><span class="cx">         }
</span><span class="cx"> 
</span><span class="cx">         bool isAligned(int alignment) const
</span><span class="lines">@@ -165,13 +165,63 @@
</span><span class="cx"> 
</span><span class="cx">         AssemblerData releaseAssemblerData() { return WTFMove(m_storage); }
</span><span class="cx"> 
</span><ins>+        // LocalWriter is a trick to keep the storage buffer and the index
+        // in memory while issuing multiple Stores.
+        // It is created in a block scope and its attribute can stay live
+        // between writes.
+        //
+        // LocalWriter *CANNOT* be mixed with other types of access to AssemblerBuffer.
+        // AssemblerBuffer cannot be used until its LocalWriter goes out of scope.
+        class LocalWriter {
+        public:
+            LocalWriter(AssemblerBuffer&amp; buffer, unsigned requiredSpace)
+                : m_buffer(buffer)
+            {
+                buffer.ensureSpace(requiredSpace);
+                m_storageBuffer = buffer.m_storage.buffer();
+                m_index = buffer.m_index;
+#if !defined(NDEBUG)
+                m_initialIndex = m_index;
+                m_requiredSpace = requiredSpace;
+#endif
+            }
+
+            ~LocalWriter()
+            {
+                ASSERT(m_index - m_initialIndex &lt;= m_requiredSpace);
+                ASSERT(m_buffer.m_index == m_initialIndex);
+                ASSERT(m_storageBuffer == m_buffer.m_storage.buffer());
+                m_buffer.m_index = m_index;
+            }
+
+            void putByteUnchecked(int8_t value) { putIntegralUnchecked(value); }
+            void putShortUnchecked(int16_t value) { putIntegralUnchecked(value); }
+            void putIntUnchecked(int32_t value) { putIntegralUnchecked(value); }
+            void putInt64Unchecked(int64_t value) { putIntegralUnchecked(value); }
+        private:
+            template&lt;typename IntegralType&gt;
+            void putIntegralUnchecked(IntegralType value)
+            {
+                ASSERT(m_index + sizeof(IntegralType) &lt;= m_buffer.m_storage.capacity());
+                *reinterpret_cast_ptr&lt;IntegralType*&gt;(m_storageBuffer + m_index) = value;
+                m_index += sizeof(IntegralType);
+            }
+            AssemblerBuffer&amp; m_buffer;
+            char* m_storageBuffer;
+            unsigned m_index;
+#if !defined(NDEBUG)
+            unsigned m_initialIndex;
+            unsigned m_requiredSpace;
+#endif
+        };
+
</ins><span class="cx">     protected:
</span><span class="cx">         template&lt;typename IntegralType&gt;
</span><span class="cx">         void putIntegral(IntegralType value)
</span><span class="cx">         {
</span><span class="cx">             unsigned nextIndex = m_index + sizeof(IntegralType);
</span><span class="cx">             if (UNLIKELY(nextIndex &gt; m_storage.capacity()))
</span><del>-                grow();
</del><ins>+                outOfLineGrow();
</ins><span class="cx">             ASSERT(isAvailable(sizeof(IntegralType)));
</span><span class="cx">             *reinterpret_cast_ptr&lt;IntegralType*&gt;(m_storage.buffer() + m_index) = value;
</span><span class="cx">             m_index = nextIndex;
</span><span class="lines">@@ -200,6 +250,13 @@
</span><span class="cx">         }
</span><span class="cx"> 
</span><span class="cx">     private:
</span><ins>+        NEVER_INLINE void outOfLineGrow()
+        {
+            m_storage.grow();
+        }
+
+        friend LocalWriter;
+
</ins><span class="cx">         AssemblerData m_storage;
</span><span class="cx">         unsigned m_index;
</span><span class="cx">     };
</span></span></pre></div>
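<p>The LocalWriter pattern in the diff above can be reduced to a small standalone sketch. This is a simplified illustration under stated assumptions, not the WebKit implementation: SketchBuffer, its growth policy, byteAt() and emitOpWithImm32() are hypothetical names, and std::vector stands in for AssemblerData. The idea is that the writer copies the storage pointer and the index into its own members, performs a single capacity check, and publishes the index back to the buffer only when it goes out of scope.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Simplified stand-in for AssemblerBuffer: growable storage plus a write index.
class SketchBuffer {
public:
    class LocalWriter;

    std::size_t size() const { return m_index; }
    std::uint8_t byteAt(std::size_t i) const { return m_storage[i]; }

private:
    // Grows the storage when fewer than `space` bytes remain (assumed policy).
    void ensureSpace(std::size_t space)
    {
        if (m_index + space > m_storage.size())
            m_storage.resize((m_index + space) * 2);
    }

    std::vector<std::uint8_t> m_storage;
    std::size_t m_index { 0 };
};

// Holds local copies of the storage pointer and index for one instruction.
// The buffer must not be accessed directly while a writer is alive.
class SketchBuffer::LocalWriter {
public:
    LocalWriter(SketchBuffer& buffer, std::size_t requiredSpace)
        : m_buffer(buffer)
    {
        buffer.ensureSpace(requiredSpace); // Single capacity check up front.
        m_data = buffer.m_storage.data();
        m_index = buffer.m_index;
    }

    // Publishes the final index back to the buffer.
    ~LocalWriter() { m_buffer.m_index = m_index; }

    LocalWriter(const LocalWriter&) = delete;
    LocalWriter& operator=(const LocalWriter&) = delete;

    void putByteUnchecked(std::uint8_t value) { m_data[m_index++] = value; }

    void putIntUnchecked(std::int32_t value)
    {
        std::memcpy(m_data + m_index, &value, sizeof(value));
        m_index += sizeof(value);
    }

private:
    SketchBuffer& m_buffer;
    std::uint8_t* m_data;
    std::size_t m_index;
};

// Hypothetical emitter: one opcode byte followed by a 32-bit immediate.
inline void emitOpWithImm32(SketchBuffer& buffer, std::uint8_t opcode, std::int32_t imm)
{
    SketchBuffer::LocalWriter writer(buffer, 16);
    writer.putByteUnchecked(opcode);
    writer.putIntUnchecked(imm);
}
```

<p>Because the writer's members are never reachable through the buffer, the compiler can keep them in registers across the individual stores. Whether it also merges adjacent stores depends on the compiler and optimization level.</p>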
<a id="trunkSourceJavaScriptCoreassemblerMacroAssemblerX86Commonh"></a>
<div class="modfile"><h4>Modified: trunk/Source/JavaScriptCore/assembler/MacroAssemblerX86Common.h (198707 => 198708)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/Source/JavaScriptCore/assembler/MacroAssemblerX86Common.h        2016-03-26 03:45:01 UTC (rev 198707)
+++ trunk/Source/JavaScriptCore/assembler/MacroAssemblerX86Common.h        2016-03-26 03:47:23 UTC (rev 198708)
</span><span class="lines">@@ -230,7 +230,7 @@
</span><span class="cx">         }
</span><span class="cx">         m_assembler.leal_mr(index.offset, index.base, index.index, index.scale, dest);
</span><span class="cx">     }
</span><del>-    
</del><ins>+
</ins><span class="cx">     void and32(RegisterID src, RegisterID dest)
</span><span class="cx">     {
</span><span class="cx">         m_assembler.andl_rr(src, dest);
</span></span></pre></div>
<a id="trunkSourceJavaScriptCoreassemblerX86Assemblerh"></a>
<div class="modfile"><h4>Modified: trunk/Source/JavaScriptCore/assembler/X86Assembler.h (198707 => 198708)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/Source/JavaScriptCore/assembler/X86Assembler.h        2016-03-26 03:45:01 UTC (rev 198707)
+++ trunk/Source/JavaScriptCore/assembler/X86Assembler.h        2016-03-26 03:47:23 UTC (rev 198708)
</span><span class="lines">@@ -2852,16 +2852,14 @@
</span><span class="cx">     }
</span><span class="cx"> 
</span><span class="cx">     class X86InstructionFormatter {
</span><del>-
</del><span class="cx">         static const int maxInstructionSize = 16;
</span><span class="cx"> 
</span><span class="cx">     public:
</span><del>-
</del><span class="cx">         enum ModRmMode {
</span><del>-            ModRmMemoryNoDisp,
-            ModRmMemoryDisp8,
-            ModRmMemoryDisp32,
-            ModRmRegister,
</del><ins>+            ModRmMemoryNoDisp = 0,
+            ModRmMemoryDisp8 = 1 &lt;&lt; 6,
+            ModRmMemoryDisp32 = 2 &lt;&lt; 6,
+            ModRmRegister = 3 &lt;&lt; 6,
</ins><span class="cx">         };
</span><span class="cx"> 
</span><span class="cx">         // Legacy prefix bytes:
</span><span class="lines">@@ -2873,6 +2871,205 @@
</span><span class="cx">             m_buffer.putByte(pre);
</span><span class="cx">         }
</span><span class="cx"> 
</span><ins>+#if CPU(X86_64)
+        // Byte operand register spl &amp; above require a REX prefix (to prevent the 'H' registers be accessed).
+        static bool byteRegRequiresRex(int reg)
+        {
+            static_assert(X86Registers::esp == 4, &quot;Necessary condition for OR-masking&quot;);
+            return (reg &gt;= X86Registers::esp);
+        }
+        static bool byteRegRequiresRex(int a, int b)
+        {
+            return byteRegRequiresRex(a | b);
+        }
+
+        // Registers r8 &amp; above require a REX prefixe.
+        static bool regRequiresRex(int reg)
+        {
+            static_assert(X86Registers::r8 == 8, &quot;Necessary condition for OR-masking&quot;);
+            return (reg &gt;= X86Registers::r8);
+        }
+        static bool regRequiresRex(int a, int b)
+        {
+            return regRequiresRex(a | b);
+        }
+        static bool regRequiresRex(int a, int b, int c)
+        {
+            return regRequiresRex(a | b | c);
+        }
+#else
+        static bool byteRegRequiresRex(int) { return false; }
+        static bool byteRegRequiresRex(int, int) { return false; }
+        static bool regRequiresRex(int) { return false; }
+        static bool regRequiresRex(int, int) { return false; }
+        static bool regRequiresRex(int, int, int) { return false; }
+#endif
+
+        class SingleInstructionBufferWriter : public AssemblerBuffer::LocalWriter {
+        public:
+            SingleInstructionBufferWriter(AssemblerBuffer&amp; buffer)
+                : AssemblerBuffer::LocalWriter(buffer, maxInstructionSize)
+            {
+            }
+
+            // Internals; ModRm and REX formatters.
+
+            static constexpr RegisterID noBase = X86Registers::ebp;
+            static constexpr RegisterID hasSib = X86Registers::esp;
+            static constexpr RegisterID noIndex = X86Registers::esp;
+
+#if CPU(X86_64)
+            static constexpr RegisterID noBase2 = X86Registers::r13;
+            static constexpr RegisterID hasSib2 = X86Registers::r12;
+
+            // Format a REX prefix byte.
+            ALWAYS_INLINE void emitRex(bool w, int r, int x, int b)
+            {
+                ASSERT(r &gt;= 0);
+                ASSERT(x &gt;= 0);
+                ASSERT(b &gt;= 0);
+                putByteUnchecked(PRE_REX | ((int)w &lt;&lt; 3) | ((r&gt;&gt;3)&lt;&lt;2) | ((x&gt;&gt;3)&lt;&lt;1) | (b&gt;&gt;3));
+            }
+
+            // Used to plant a REX byte with REX.w set (for 64-bit operations).
+            ALWAYS_INLINE void emitRexW(int r, int x, int b)
+            {
+                emitRex(true, r, x, b);
+            }
+
+            // Used for operations with byte operands - use byteRegRequiresRex() to check register operands,
+            // regRequiresRex() to check other registers (i.e. address base &amp; index).
+            ALWAYS_INLINE void emitRexIf(bool condition, int r, int x, int b)
+            {
+                if (condition)
+                    emitRex(false, r, x, b);
+            }
+
+            // Used for word sized operations, will plant a REX prefix if necessary (if any register is r8 or above).
+            ALWAYS_INLINE void emitRexIfNeeded(int r, int x, int b)
+            {
+                emitRexIf(regRequiresRex(r, x, b), r, x, b);
+            }
+#else
+            // No REX prefix bytes on 32-bit x86.
+            ALWAYS_INLINE void emitRexIf(bool, int, int, int) { }
+            ALWAYS_INLINE void emitRexIfNeeded(int, int, int) { }
+#endif
+
+            ALWAYS_INLINE void putModRm(ModRmMode mode, int reg, RegisterID rm)
+            {
+                putByteUnchecked(mode | ((reg &amp; 7) &lt;&lt; 3) | (rm &amp; 7));
+            }
+
+            ALWAYS_INLINE void putModRmSib(ModRmMode mode, int reg, RegisterID base, RegisterID index, int scale)
+            {
+                ASSERT(mode != ModRmRegister);
+
+                putModRm(mode, reg, hasSib);
+                putByteUnchecked((scale &lt;&lt; 6) | ((index &amp; 7) &lt;&lt; 3) | (base &amp; 7));
+            }
+
+            ALWAYS_INLINE void registerModRM(int reg, RegisterID rm)
+            {
+                putModRm(ModRmRegister, reg, rm);
+            }
+
+            ALWAYS_INLINE void memoryModRM(int reg, RegisterID base, int offset)
+            {
+                // A base of esp or r12 would be interpreted as a sib, so force a sib with no index &amp; put the base in there.
+#if CPU(X86_64)
+                if ((base == hasSib) || (base == hasSib2)) {
+#else
+                if (base == hasSib) {
+#endif
+                    if (!offset) // No need to check if the base is noBase, since we know it is hasSib!
+                        putModRmSib(ModRmMemoryNoDisp, reg, base, noIndex, 0);
+                    else if (CAN_SIGN_EXTEND_8_32(offset)) {
+                        putModRmSib(ModRmMemoryDisp8, reg, base, noIndex, 0);
+                        putByteUnchecked(offset);
+                    } else {
+                        putModRmSib(ModRmMemoryDisp32, reg, base, noIndex, 0);
+                        putIntUnchecked(offset);
+                    }
+                } else {
+#if CPU(X86_64)
+                    if (!offset &amp;&amp; (base != noBase) &amp;&amp; (base != noBase2))
+#else
+                    if (!offset &amp;&amp; (base != noBase))
+#endif
+                        putModRm(ModRmMemoryNoDisp, reg, base);
+                    else if (CAN_SIGN_EXTEND_8_32(offset)) {
+                        putModRm(ModRmMemoryDisp8, reg, base);
+                        putByteUnchecked(offset);
+                    } else {
+                        putModRm(ModRmMemoryDisp32, reg, base);
+                        putIntUnchecked(offset);
+                    }
+                }
+            }
+
+            ALWAYS_INLINE void memoryModRM_disp8(int reg, RegisterID base, int offset)
+            {
+                // A base of esp or r12 would be interpreted as a sib, so force a sib with no index &amp; put the base in there.
+                ASSERT(CAN_SIGN_EXTEND_8_32(offset));
+#if CPU(X86_64)
+                if ((base == hasSib) || (base == hasSib2)) {
+#else
+                if (base == hasSib) {
+#endif
+                    putModRmSib(ModRmMemoryDisp8, reg, base, noIndex, 0);
+                    putByteUnchecked(offset);
+                } else {
+                    putModRm(ModRmMemoryDisp8, reg, base);
+                    putByteUnchecked(offset);
+                }
+            }
+
+            ALWAYS_INLINE void memoryModRM_disp32(int reg, RegisterID base, int offset)
+            {
+                // A base of esp or r12 would be interpreted as a sib, so force a sib with no index &amp; put the base in there.
+#if CPU(X86_64)
+                if ((base == hasSib) || (base == hasSib2)) {
+#else
+                if (base == hasSib) {
+#endif
+                    putModRmSib(ModRmMemoryDisp32, reg, base, noIndex, 0);
+                    putIntUnchecked(offset);
+                } else {
+                    putModRm(ModRmMemoryDisp32, reg, base);
+                    putIntUnchecked(offset);
+                }
+            }
+        
+            ALWAYS_INLINE void memoryModRM(int reg, RegisterID base, RegisterID index, int scale, int offset)
+            {
+                ASSERT(index != noIndex);
+
+#if CPU(X86_64)
+                if (!offset &amp;&amp; (base != noBase) &amp;&amp; (base != noBase2))
+#else
+                if (!offset &amp;&amp; (base != noBase))
+#endif
+                    putModRmSib(ModRmMemoryNoDisp, reg, base, index, scale);
+                else if (CAN_SIGN_EXTEND_8_32(offset)) {
+                    putModRmSib(ModRmMemoryDisp8, reg, base, index, scale);
+                    putByteUnchecked(offset);
+                } else {
+                    putModRmSib(ModRmMemoryDisp32, reg, base, index, scale);
+                    putIntUnchecked(offset);
+                }
+            }
+
+#if !CPU(X86_64)
+            ALWAYS_INLINE void memoryModRM(int reg, const void* address)
+            {
+                // noBase + ModRmMemoryNoDisp means noBase + ModRmMemoryDisp32!
+                putModRm(ModRmMemoryNoDisp, reg, noBase);
+                putIntUnchecked(reinterpret_cast&lt;int32_t&gt;(address));
+            }
+#endif
+        };
+
</ins><span class="cx">         // Word-sized operands / no operand instruction formatters.
</span><span class="cx">         //
</span><span class="cx">         // In addition to the opcode, the following operand permutations are supported:
</span><span class="lines">@@ -2889,136 +3086,136 @@
</span><span class="cx"> 
</span><span class="cx">         void oneByteOp(OneByteOpcodeID opcode)
</span><span class="cx">         {
</span><del>-            m_buffer.ensureSpace(maxInstructionSize);
-            m_buffer.putByteUnchecked(opcode);
</del><ins>+            SingleInstructionBufferWriter writer(m_buffer);
+            writer.putByteUnchecked(opcode);
</ins><span class="cx">         }
</span><span class="cx"> 
</span><span class="cx">         void oneByteOp(OneByteOpcodeID opcode, RegisterID reg)
</span><span class="cx">         {
</span><del>-            m_buffer.ensureSpace(maxInstructionSize);
-            emitRexIfNeeded(0, 0, reg);
-            m_buffer.putByteUnchecked(opcode + (reg &amp; 7));
</del><ins>+            SingleInstructionBufferWriter writer(m_buffer);
+            writer.emitRexIfNeeded(0, 0, reg);
+            writer.putByteUnchecked(opcode + (reg &amp; 7));
</ins><span class="cx">         }
</span><span class="cx"> 
</span><span class="cx">         void oneByteOp(OneByteOpcodeID opcode, int reg, RegisterID rm)
</span><span class="cx">         {
</span><del>-            m_buffer.ensureSpace(maxInstructionSize);
-            emitRexIfNeeded(reg, 0, rm);
-            m_buffer.putByteUnchecked(opcode);
-            registerModRM(reg, rm);
</del><ins>+            SingleInstructionBufferWriter writer(m_buffer);
+            writer.emitRexIfNeeded(reg, 0, rm);
+            writer.putByteUnchecked(opcode);
+            writer.registerModRM(reg, rm);
</ins><span class="cx">         }
</span><span class="cx"> 
</span><span class="cx">         void oneByteOp(OneByteOpcodeID opcode, int reg, RegisterID base, int offset)
</span><span class="cx">         {
</span><del>-            m_buffer.ensureSpace(maxInstructionSize);
-            emitRexIfNeeded(reg, 0, base);
-            m_buffer.putByteUnchecked(opcode);
-            memoryModRM(reg, base, offset);
</del><ins>+            SingleInstructionBufferWriter writer(m_buffer);
+            writer.emitRexIfNeeded(reg, 0, base);
+            writer.putByteUnchecked(opcode);
+            writer.memoryModRM(reg, base, offset);
</ins><span class="cx">         }
</span><span class="cx"> 
</span><span class="cx">         void oneByteOp_disp32(OneByteOpcodeID opcode, int reg, RegisterID base, int offset)
</span><span class="cx">         {
</span><del>-            m_buffer.ensureSpace(maxInstructionSize);
-            emitRexIfNeeded(reg, 0, base);
-            m_buffer.putByteUnchecked(opcode);
-            memoryModRM_disp32(reg, base, offset);
</del><ins>+            SingleInstructionBufferWriter writer(m_buffer);
+            writer.emitRexIfNeeded(reg, 0, base);
+            writer.putByteUnchecked(opcode);
+            writer.memoryModRM_disp32(reg, base, offset);
</ins><span class="cx">         }
</span><span class="cx">         
</span><span class="cx">         void oneByteOp_disp8(OneByteOpcodeID opcode, int reg, RegisterID base, int offset)
</span><span class="cx">         {
</span><del>-            m_buffer.ensureSpace(maxInstructionSize);
-            emitRexIfNeeded(reg, 0, base);
-            m_buffer.putByteUnchecked(opcode);
-            memoryModRM_disp8(reg, base, offset);
</del><ins>+            SingleInstructionBufferWriter writer(m_buffer);
+            writer.emitRexIfNeeded(reg, 0, base);
+            writer.putByteUnchecked(opcode);
+            writer.memoryModRM_disp8(reg, base, offset);
</ins><span class="cx">         }
</span><span class="cx"> 
</span><span class="cx">         void oneByteOp(OneByteOpcodeID opcode, int reg, RegisterID base, RegisterID index, int scale, int offset)
</span><span class="cx">         {
</span><del>-            m_buffer.ensureSpace(maxInstructionSize);
-            emitRexIfNeeded(reg, index, base);
-            m_buffer.putByteUnchecked(opcode);
-            memoryModRM(reg, base, index, scale, offset);
</del><ins>+            SingleInstructionBufferWriter writer(m_buffer);
+            writer.emitRexIfNeeded(reg, index, base);
+            writer.putByteUnchecked(opcode);
+            writer.memoryModRM(reg, base, index, scale, offset);
</ins><span class="cx">         }
</span><span class="cx"> 
</span><span class="cx"> #if !CPU(X86_64)
</span><span class="cx">         void oneByteOp(OneByteOpcodeID opcode, int reg, const void* address)
</span><span class="cx">         {
</span><del>-            m_buffer.ensureSpace(maxInstructionSize);
-            m_buffer.putByteUnchecked(opcode);
-            memoryModRM(reg, address);
</del><ins>+            SingleInstructionBufferWriter writer(m_buffer);
+            writer.putByteUnchecked(opcode);
+            writer.memoryModRM(reg, address);
</ins><span class="cx">         }
</span><span class="cx"> #endif
</span><span class="cx"> 
</span><span class="cx">         void twoByteOp(TwoByteOpcodeID opcode)
</span><span class="cx">         {
</span><del>-            m_buffer.ensureSpace(maxInstructionSize);
-            m_buffer.putByteUnchecked(OP_2BYTE_ESCAPE);
-            m_buffer.putByteUnchecked(opcode);
</del><ins>+            SingleInstructionBufferWriter writer(m_buffer);
+            writer.putByteUnchecked(OP_2BYTE_ESCAPE);
+            writer.putByteUnchecked(opcode);
</ins><span class="cx">         }
</span><span class="cx"> 
</span><span class="cx">         void twoByteOp(TwoByteOpcodeID opcode, int reg, RegisterID rm)
</span><span class="cx">         {
</span><del>-            m_buffer.ensureSpace(maxInstructionSize);
-            emitRexIfNeeded(reg, 0, rm);
-            m_buffer.putByteUnchecked(OP_2BYTE_ESCAPE);
-            m_buffer.putByteUnchecked(opcode);
-            registerModRM(reg, rm);
</del><ins>+            SingleInstructionBufferWriter writer(m_buffer);
+            writer.emitRexIfNeeded(reg, 0, rm);
+            writer.putByteUnchecked(OP_2BYTE_ESCAPE);
+            writer.putByteUnchecked(opcode);
+            writer.registerModRM(reg, rm);
</ins><span class="cx">         }
</span><span class="cx"> 
</span><span class="cx">         void twoByteOp(TwoByteOpcodeID opcode, int reg, RegisterID base, int offset)
</span><span class="cx">         {
</span><del>-            m_buffer.ensureSpace(maxInstructionSize);
-            emitRexIfNeeded(reg, 0, base);
-            m_buffer.putByteUnchecked(OP_2BYTE_ESCAPE);
-            m_buffer.putByteUnchecked(opcode);
-            memoryModRM(reg, base, offset);
</del><ins>+            SingleInstructionBufferWriter writer(m_buffer);
+            writer.emitRexIfNeeded(reg, 0, base);
+            writer.putByteUnchecked(OP_2BYTE_ESCAPE);
+            writer.putByteUnchecked(opcode);
+            writer.memoryModRM(reg, base, offset);
</ins><span class="cx">         }
</span><span class="cx"> 
</span><span class="cx">         void twoByteOp(TwoByteOpcodeID opcode, int reg, RegisterID base, RegisterID index, int scale, int offset)
</span><span class="cx">         {
</span><del>-            m_buffer.ensureSpace(maxInstructionSize);
-            emitRexIfNeeded(reg, index, base);
-            m_buffer.putByteUnchecked(OP_2BYTE_ESCAPE);
-            m_buffer.putByteUnchecked(opcode);
-            memoryModRM(reg, base, index, scale, offset);
</del><ins>+            SingleInstructionBufferWriter writer(m_buffer);
+            writer.emitRexIfNeeded(reg, index, base);
+            writer.putByteUnchecked(OP_2BYTE_ESCAPE);
+            writer.putByteUnchecked(opcode);
+            writer.memoryModRM(reg, base, index, scale, offset);
</ins><span class="cx">         }
</span><span class="cx"> 
</span><span class="cx"> #if !CPU(X86_64)
</span><span class="cx">         void twoByteOp(TwoByteOpcodeID opcode, int reg, const void* address)
</span><span class="cx">         {
</span><del>-            m_buffer.ensureSpace(maxInstructionSize);
-            m_buffer.putByteUnchecked(OP_2BYTE_ESCAPE);
-            m_buffer.putByteUnchecked(opcode);
-            memoryModRM(reg, address);
</del><ins>+            SingleInstructionBufferWriter writer(m_buffer);
+            writer.putByteUnchecked(OP_2BYTE_ESCAPE);
+            writer.putByteUnchecked(opcode);
+            writer.memoryModRM(reg, address);
</ins><span class="cx">         }
</span><span class="cx"> #endif
</span><span class="cx"> 
</span><span class="cx">         void threeByteOp(TwoByteOpcodeID twoBytePrefix, ThreeByteOpcodeID opcode)
</span><span class="cx">         {
</span><del>-            m_buffer.ensureSpace(maxInstructionSize);
-            m_buffer.putByteUnchecked(OP_2BYTE_ESCAPE);
-            m_buffer.putByteUnchecked(twoBytePrefix);
-            m_buffer.putByteUnchecked(opcode);
</del><ins>+            SingleInstructionBufferWriter writer(m_buffer);
+            writer.putByteUnchecked(OP_2BYTE_ESCAPE);
+            writer.putByteUnchecked(twoBytePrefix);
+            writer.putByteUnchecked(opcode);
</ins><span class="cx">         }
</span><span class="cx"> 
</span><span class="cx">         void threeByteOp(TwoByteOpcodeID twoBytePrefix, ThreeByteOpcodeID opcode, int reg, RegisterID rm)
</span><span class="cx">         {
</span><del>-            m_buffer.ensureSpace(maxInstructionSize);
-            emitRexIfNeeded(reg, 0, rm);
-            m_buffer.putByteUnchecked(OP_2BYTE_ESCAPE);
-            m_buffer.putByteUnchecked(twoBytePrefix);
-            m_buffer.putByteUnchecked(opcode);
-            registerModRM(reg, rm);
</del><ins>+            SingleInstructionBufferWriter writer(m_buffer);
+            writer.emitRexIfNeeded(reg, 0, rm);
+            writer.putByteUnchecked(OP_2BYTE_ESCAPE);
+            writer.putByteUnchecked(twoBytePrefix);
+            writer.putByteUnchecked(opcode);
+            writer.registerModRM(reg, rm);
</ins><span class="cx">         }
</span><span class="cx"> 
</span><span class="cx">         void threeByteOp(TwoByteOpcodeID twoBytePrefix, ThreeByteOpcodeID opcode, int reg, RegisterID base, int displacement)
</span><span class="cx">         {
</span><del>-            m_buffer.ensureSpace(maxInstructionSize);
-            emitRexIfNeeded(reg, 0, base);
-            m_buffer.putByteUnchecked(OP_2BYTE_ESCAPE);
-            m_buffer.putByteUnchecked(twoBytePrefix);
-            m_buffer.putByteUnchecked(opcode);
-            memoryModRM(reg, base, displacement);
</del><ins>+            SingleInstructionBufferWriter writer(m_buffer);
+            writer.emitRexIfNeeded(reg, 0, base);
+            writer.putByteUnchecked(OP_2BYTE_ESCAPE);
+            writer.putByteUnchecked(twoBytePrefix);
+            writer.putByteUnchecked(opcode);
+            writer.memoryModRM(reg, base, displacement);
</ins><span class="cx">         }
</span><span class="cx"> 
</span><span class="cx"> #if CPU(X86_64)
</span><span class="lines">@@ -3030,83 +3227,83 @@
</span><span class="cx"> 
</span><span class="cx">         void oneByteOp64(OneByteOpcodeID opcode)
</span><span class="cx">         {
</span><del>-            m_buffer.ensureSpace(maxInstructionSize);
-            emitRexW(0, 0, 0);
-            m_buffer.putByteUnchecked(opcode);
</del><ins>+            SingleInstructionBufferWriter writer(m_buffer);
+            writer.emitRexW(0, 0, 0);
+            writer.putByteUnchecked(opcode);
</ins><span class="cx">         }
</span><span class="cx"> 
</span><span class="cx">         void oneByteOp64(OneByteOpcodeID opcode, RegisterID reg)
</span><span class="cx">         {
</span><del>-            m_buffer.ensureSpace(maxInstructionSize);
-            emitRexW(0, 0, reg);
-            m_buffer.putByteUnchecked(opcode + (reg &amp; 7));
</del><ins>+            SingleInstructionBufferWriter writer(m_buffer);
+            writer.emitRexW(0, 0, reg);
+            writer.putByteUnchecked(opcode + (reg &amp; 7));
</ins><span class="cx">         }
</span><span class="cx"> 
</span><span class="cx">         void oneByteOp64(OneByteOpcodeID opcode, int reg, RegisterID rm)
</span><span class="cx">         {
</span><del>-            m_buffer.ensureSpace(maxInstructionSize);
-            emitRexW(reg, 0, rm);
-            m_buffer.putByteUnchecked(opcode);
-            registerModRM(reg, rm);
</del><ins>+            SingleInstructionBufferWriter writer(m_buffer);
+            writer.emitRexW(reg, 0, rm);
+            writer.putByteUnchecked(opcode);
+            writer.registerModRM(reg, rm);
</ins><span class="cx">         }
</span><span class="cx"> 
</span><span class="cx">         void oneByteOp64(OneByteOpcodeID opcode, int reg, RegisterID base, int offset)
</span><span class="cx">         {
</span><del>-            m_buffer.ensureSpace(maxInstructionSize);
-            emitRexW(reg, 0, base);
-            m_buffer.putByteUnchecked(opcode);
-            memoryModRM(reg, base, offset);
</del><ins>+            SingleInstructionBufferWriter writer(m_buffer);
+            writer.emitRexW(reg, 0, base);
+            writer.putByteUnchecked(opcode);
+            writer.memoryModRM(reg, base, offset);
</ins><span class="cx">         }
</span><span class="cx"> 
</span><span class="cx">         void oneByteOp64_disp32(OneByteOpcodeID opcode, int reg, RegisterID base, int offset)
</span><span class="cx">         {
</span><del>-            m_buffer.ensureSpace(maxInstructionSize);
-            emitRexW(reg, 0, base);
-            m_buffer.putByteUnchecked(opcode);
-            memoryModRM_disp32(reg, base, offset);
</del><ins>+            SingleInstructionBufferWriter writer(m_buffer);
+            writer.emitRexW(reg, 0, base);
+            writer.putByteUnchecked(opcode);
+            writer.memoryModRM_disp32(reg, base, offset);
</ins><span class="cx">         }
</span><span class="cx">         
</span><span class="cx">         void oneByteOp64_disp8(OneByteOpcodeID opcode, int reg, RegisterID base, int offset)
</span><span class="cx">         {
</span><del>-            m_buffer.ensureSpace(maxInstructionSize);
-            emitRexW(reg, 0, base);
-            m_buffer.putByteUnchecked(opcode);
-            memoryModRM_disp8(reg, base, offset);
</del><ins>+            SingleInstructionBufferWriter writer(m_buffer);
+            writer.emitRexW(reg, 0, base);
+            writer.putByteUnchecked(opcode);
+            writer.memoryModRM_disp8(reg, base, offset);
</ins><span class="cx">         }
</span><span class="cx"> 
</span><span class="cx">         void oneByteOp64(OneByteOpcodeID opcode, int reg, RegisterID base, RegisterID index, int scale, int offset)
</span><span class="cx">         {
</span><del>-            m_buffer.ensureSpace(maxInstructionSize);
-            emitRexW(reg, index, base);
-            m_buffer.putByteUnchecked(opcode);
-            memoryModRM(reg, base, index, scale, offset);
</del><ins>+            SingleInstructionBufferWriter writer(m_buffer);
+            writer.emitRexW(reg, index, base);
+            writer.putByteUnchecked(opcode);
+            writer.memoryModRM(reg, base, index, scale, offset);
</ins><span class="cx">         }
</span><span class="cx"> 
</span><span class="cx">         void twoByteOp64(TwoByteOpcodeID opcode, int reg, RegisterID rm)
</span><span class="cx">         {
</span><del>-            m_buffer.ensureSpace(maxInstructionSize);
-            emitRexW(reg, 0, rm);
-            m_buffer.putByteUnchecked(OP_2BYTE_ESCAPE);
-            m_buffer.putByteUnchecked(opcode);
-            registerModRM(reg, rm);
</del><ins>+            SingleInstructionBufferWriter writer(m_buffer);
+            writer.emitRexW(reg, 0, rm);
+            writer.putByteUnchecked(OP_2BYTE_ESCAPE);
+            writer.putByteUnchecked(opcode);
+            writer.registerModRM(reg, rm);
</ins><span class="cx">         }
</span><span class="cx"> 
</span><span class="cx">         void twoByteOp64(TwoByteOpcodeID opcode, int reg, RegisterID base, int offset)
</span><span class="cx">         {
</span><del>-            m_buffer.ensureSpace(maxInstructionSize);
-            emitRexW(reg, 0, base);
-            m_buffer.putByteUnchecked(OP_2BYTE_ESCAPE);
-            m_buffer.putByteUnchecked(opcode);
-            memoryModRM(reg, base, offset);
</del><ins>+            SingleInstructionBufferWriter writer(m_buffer);
+            writer.emitRexW(reg, 0, base);
+            writer.putByteUnchecked(OP_2BYTE_ESCAPE);
+            writer.putByteUnchecked(opcode);
+            writer.memoryModRM(reg, base, offset);
</ins><span class="cx">         }
</span><span class="cx"> 
</span><span class="cx">         void twoByteOp64(TwoByteOpcodeID opcode, int reg, RegisterID base, RegisterID index, int scale, int offset)
</span><span class="cx">         {
</span><del>-            m_buffer.ensureSpace(maxInstructionSize);
-            emitRexW(reg, index, base);
-            m_buffer.putByteUnchecked(OP_2BYTE_ESCAPE);
-            m_buffer.putByteUnchecked(opcode);
-            memoryModRM(reg, base, index, scale, offset);
</del><ins>+            SingleInstructionBufferWriter writer(m_buffer);
+            writer.emitRexW(reg, index, base);
+            writer.putByteUnchecked(OP_2BYTE_ESCAPE);
+            writer.putByteUnchecked(opcode);
+            writer.memoryModRM(reg, base, index, scale, offset);
</ins><span class="cx">         }
</span><span class="cx"> #endif
</span><span class="cx"> 
</span><span class="lines">@@ -3137,52 +3334,52 @@
</span><span class="cx"> 
</span><span class="cx">         void oneByteOp8(OneByteOpcodeID opcode, GroupOpcodeID groupOp, RegisterID rm)
</span><span class="cx">         {
</span><del>-            m_buffer.ensureSpace(maxInstructionSize);
-            emitRexIf(byteRegRequiresRex(rm), 0, 0, rm);
-            m_buffer.putByteUnchecked(opcode);
-            registerModRM(groupOp, rm);
</del><ins>+            SingleInstructionBufferWriter writer(m_buffer);
+            writer.emitRexIf(byteRegRequiresRex(rm), 0, 0, rm);
+            writer.putByteUnchecked(opcode);
+            writer.registerModRM(groupOp, rm);
</ins><span class="cx">         }
</span><span class="cx"> 
</span><span class="cx">         void oneByteOp8(OneByteOpcodeID opcode, int reg, RegisterID rm)
</span><span class="cx">         {
</span><del>-            m_buffer.ensureSpace(maxInstructionSize);
-            emitRexIf(byteRegRequiresRex(reg) || byteRegRequiresRex(rm), reg, 0, rm);
-            m_buffer.putByteUnchecked(opcode);
-            registerModRM(reg, rm);
</del><ins>+            SingleInstructionBufferWriter writer(m_buffer);
+            writer.emitRexIf(byteRegRequiresRex(reg, rm), reg, 0, rm);
+            writer.putByteUnchecked(opcode);
+            writer.registerModRM(reg, rm);
</ins><span class="cx">         }
</span><span class="cx"> 
</span><span class="cx">         void oneByteOp8(OneByteOpcodeID opcode, int reg, RegisterID base, int offset)
</span><span class="cx">         {
</span><del>-            m_buffer.ensureSpace(maxInstructionSize);
-            emitRexIf(byteRegRequiresRex(reg) || byteRegRequiresRex(base), reg, 0, base);
-            m_buffer.putByteUnchecked(opcode);
-            memoryModRM(reg, base, offset);
</del><ins>+            SingleInstructionBufferWriter writer(m_buffer);
+            writer.emitRexIf(byteRegRequiresRex(reg, base), reg, 0, base);
+            writer.putByteUnchecked(opcode);
+            writer.memoryModRM(reg, base, offset);
</ins><span class="cx">         }
</span><span class="cx"> 
</span><span class="cx">         void oneByteOp8(OneByteOpcodeID opcode, int reg, RegisterID base, RegisterID index, int scale, int offset)
</span><span class="cx">         {
</span><del>-            m_buffer.ensureSpace(maxInstructionSize);
-            emitRexIf(byteRegRequiresRex(reg) || regRequiresRex(index) || regRequiresRex(base), reg, index, base);
-            m_buffer.putByteUnchecked(opcode);
-            memoryModRM(reg, base, index, scale, offset);
</del><ins>+            SingleInstructionBufferWriter writer(m_buffer);
+            writer.emitRexIf(byteRegRequiresRex(reg) || regRequiresRex(index, base), reg, index, base);
+            writer.putByteUnchecked(opcode);
+            writer.memoryModRM(reg, base, index, scale, offset);
</ins><span class="cx">         }
</span><span class="cx"> 
</span><span class="cx">         void twoByteOp8(TwoByteOpcodeID opcode, RegisterID reg, RegisterID rm)
</span><span class="cx">         {
</span><del>-            m_buffer.ensureSpace(maxInstructionSize);
-            emitRexIf(byteRegRequiresRex(reg)|byteRegRequiresRex(rm), reg, 0, rm);
-            m_buffer.putByteUnchecked(OP_2BYTE_ESCAPE);
-            m_buffer.putByteUnchecked(opcode);
-            registerModRM(reg, rm);
</del><ins>+            SingleInstructionBufferWriter writer(m_buffer);
+            writer.emitRexIf(byteRegRequiresRex(reg, rm), reg, 0, rm);
+            writer.putByteUnchecked(OP_2BYTE_ESCAPE);
+            writer.putByteUnchecked(opcode);
+            writer.registerModRM(reg, rm);
</ins><span class="cx">         }
</span><span class="cx"> 
</span><span class="cx">         void twoByteOp8(TwoByteOpcodeID opcode, GroupOpcodeID groupOp, RegisterID rm)
</span><span class="cx">         {
</span><del>-            m_buffer.ensureSpace(maxInstructionSize);
-            emitRexIf(byteRegRequiresRex(rm), 0, 0, rm);
-            m_buffer.putByteUnchecked(OP_2BYTE_ESCAPE);
-            m_buffer.putByteUnchecked(opcode);
-            registerModRM(groupOp, rm);
</del><ins>+            SingleInstructionBufferWriter writer(m_buffer);
+            writer.emitRexIf(byteRegRequiresRex(rm), 0, 0, rm);
+            writer.putByteUnchecked(OP_2BYTE_ESCAPE);
+            writer.putByteUnchecked(opcode);
+            writer.registerModRM(groupOp, rm);
</ins><span class="cx">         }
</span><span class="cx"> 
</span><span class="cx">         // Immediates:
</span><span class="lines">@@ -3225,177 +3422,6 @@
</span><span class="cx"> 
</span><span class="cx">         unsigned debugOffset() { return m_buffer.debugOffset(); }
</span><span class="cx"> 
</span><del>-    private:
-
-        // Internals; ModRm and REX formatters.
-
-        static const RegisterID noBase = X86Registers::ebp;
-        static const RegisterID hasSib = X86Registers::esp;
-        static const RegisterID noIndex = X86Registers::esp;
-#if CPU(X86_64)
-        static const RegisterID noBase2 = X86Registers::r13;
-        static const RegisterID hasSib2 = X86Registers::r12;
-
-        // Registers r8 &amp; above require a REX prefixe.
-        inline bool regRequiresRex(int reg)
-        {
-            return (reg &gt;= X86Registers::r8);
-        }
-
-        // Byte operand register spl &amp; above require a REX prefix (to prevent the 'H' registers be accessed).
-        inline bool byteRegRequiresRex(int reg)
-        {
-            return (reg &gt;= X86Registers::esp);
-        }
-
-        // Format a REX prefix byte.
-        inline void emitRex(bool w, int r, int x, int b)
-        {
-            ASSERT(r &gt;= 0);
-            ASSERT(x &gt;= 0);
-            ASSERT(b &gt;= 0);
-            m_buffer.putByteUnchecked(PRE_REX | ((int)w &lt;&lt; 3) | ((r&gt;&gt;3)&lt;&lt;2) | ((x&gt;&gt;3)&lt;&lt;1) | (b&gt;&gt;3));
-        }
-
-        // Used to plant a REX byte with REX.w set (for 64-bit operations).
-        inline void emitRexW(int r, int x, int b)
-        {
-            emitRex(true, r, x, b);
-        }
-
-        // Used for operations with byte operands - use byteRegRequiresRex() to check register operands,
-        // regRequiresRex() to check other registers (i.e. address base &amp; index).
-        inline void emitRexIf(bool condition, int r, int x, int b)
-        {
-            if (condition) emitRex(false, r, x, b);
-        }
-
-        // Used for word sized operations, will plant a REX prefix if necessary (if any register is r8 or above).
-        inline void emitRexIfNeeded(int r, int x, int b)
-        {
-            emitRexIf(regRequiresRex(r) || regRequiresRex(x) || regRequiresRex(b), r, x, b);
-        }
-#else
-        // No REX prefix bytes on 32-bit x86.
-        inline bool regRequiresRex(int) { return false; }
-        inline bool byteRegRequiresRex(int) { return false; }
-        inline void emitRexIf(bool, int, int, int) {}
-        inline void emitRexIfNeeded(int, int, int) {}
-#endif
-
-        void putModRm(ModRmMode mode, int reg, RegisterID rm)
-        {
-            m_buffer.putByteUnchecked((mode &lt;&lt; 6) | ((reg &amp; 7) &lt;&lt; 3) | (rm &amp; 7));
-        }
-
-        void putModRmSib(ModRmMode mode, int reg, RegisterID base, RegisterID index, int scale)
-        {
-            ASSERT(mode != ModRmRegister);
-
-            putModRm(mode, reg, hasSib);
-            m_buffer.putByteUnchecked((scale &lt;&lt; 6) | ((index &amp; 7) &lt;&lt; 3) | (base &amp; 7));
-        }
-
-        void registerModRM(int reg, RegisterID rm)
-        {
-            putModRm(ModRmRegister, reg, rm);
-        }
-
-        void memoryModRM(int reg, RegisterID base, int offset)
-        {
-            // A base of esp or r12 would be interpreted as a sib, so force a sib with no index &amp; put the base in there.
-#if CPU(X86_64)
-            if ((base == hasSib) || (base == hasSib2)) {
-#else
-            if (base == hasSib) {
-#endif
-                if (!offset) // No need to check if the base is noBase, since we know it is hasSib!
-                    putModRmSib(ModRmMemoryNoDisp, reg, base, noIndex, 0);
-                else if (CAN_SIGN_EXTEND_8_32(offset)) {
-                    putModRmSib(ModRmMemoryDisp8, reg, base, noIndex, 0);
-                    m_buffer.putByteUnchecked(offset);
-                } else {
-                    putModRmSib(ModRmMemoryDisp32, reg, base, noIndex, 0);
-                    m_buffer.putIntUnchecked(offset);
-                }
-            } else {
-#if CPU(X86_64)
-                if (!offset &amp;&amp; (base != noBase) &amp;&amp; (base != noBase2))
-#else
-                if (!offset &amp;&amp; (base != noBase))
-#endif
-                    putModRm(ModRmMemoryNoDisp, reg, base);
-                else if (CAN_SIGN_EXTEND_8_32(offset)) {
-                    putModRm(ModRmMemoryDisp8, reg, base);
-                    m_buffer.putByteUnchecked(offset);
-                } else {
-                    putModRm(ModRmMemoryDisp32, reg, base);
-                    m_buffer.putIntUnchecked(offset);
-                }
-            }
-        }
-
-        void memoryModRM_disp8(int reg, RegisterID base, int offset)
-        {
-            // A base of esp or r12 would be interpreted as a sib, so force a sib with no index &amp; put the base in there.
-            ASSERT(CAN_SIGN_EXTEND_8_32(offset));
-#if CPU(X86_64)
-            if ((base == hasSib) || (base == hasSib2)) {
-#else
-            if (base == hasSib) {
-#endif
-                putModRmSib(ModRmMemoryDisp8, reg, base, noIndex, 0);
-                m_buffer.putByteUnchecked(offset);
-            } else {
-                putModRm(ModRmMemoryDisp8, reg, base);
-                m_buffer.putByteUnchecked(offset);
-            }
-        }
-
-        void memoryModRM_disp32(int reg, RegisterID base, int offset)
-        {
-            // A base of esp or r12 would be interpreted as a sib, so force a sib with no index &amp; put the base in there.
-#if CPU(X86_64)
-            if ((base == hasSib) || (base == hasSib2)) {
-#else
-            if (base == hasSib) {
-#endif
-                putModRmSib(ModRmMemoryDisp32, reg, base, noIndex, 0);
-                m_buffer.putIntUnchecked(offset);
-            } else {
-                putModRm(ModRmMemoryDisp32, reg, base);
-                m_buffer.putIntUnchecked(offset);
-            }
-        }
-    
-        void memoryModRM(int reg, RegisterID base, RegisterID index, int scale, int offset)
-        {
-            ASSERT(index != noIndex);
-
-#if CPU(X86_64)
-            if (!offset &amp;&amp; (base != noBase) &amp;&amp; (base != noBase2))
-#else
-            if (!offset &amp;&amp; (base != noBase))
-#endif
-                putModRmSib(ModRmMemoryNoDisp, reg, base, index, scale);
-            else if (CAN_SIGN_EXTEND_8_32(offset)) {
-                putModRmSib(ModRmMemoryDisp8, reg, base, index, scale);
-                m_buffer.putByteUnchecked(offset);
-            } else {
-                putModRmSib(ModRmMemoryDisp32, reg, base, index, scale);
-                m_buffer.putIntUnchecked(offset);
-            }
-        }
-
-#if !CPU(X86_64)
-        void memoryModRM(int reg, const void* address)
-        {
-            // noBase + ModRmMemoryNoDisp means noBase + ModRmMemoryDisp32!
-            putModRm(ModRmMemoryNoDisp, reg, noBase);
-            m_buffer.putIntUnchecked(reinterpret_cast&lt;int32_t&gt;(address));
-        }
-#endif
-
</del><span class="cx">     public:
</span><span class="cx">         AssemblerBuffer m_buffer;
</span><span class="cx">     } m_formatter;
</span></span></pre>
</div>
</div>
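<p>The formatter changes above all follow one pattern: construct a scoped writer on the buffer, then emit every byte of the instruction through it. The following is a minimal sketch of that pattern, not the code from this commit. It assumes the writer's constructor performs the single space reservation that <code>m_buffer.ensureSpace(maxInstructionSize)</code> used to perform. What <code>SingleInstructionBufferWriter</code> actually does internally is not visible in this part of the diff. <code>SketchBuffer</code>, <code>SketchInstructionWriter</code>, <code>emitOneByteOp</code> and the value 16 for the maximum instruction size are hypothetical names and values chosen for illustration only.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for AssemblerBuffer: one checked reservation step,
// followed by unchecked appends into the reserved space.
class SketchBuffer {
public:
    // Assumed upper bound on the encoded size of one instruction.
    static constexpr std::size_t maxInstructionSize = 16;

    void ensureSpace(std::size_t space)
    {
        if (m_bytes.size() < m_index + space)
            m_bytes.resize(m_index + space);
        ++m_reservations;
    }

    void putByteUnchecked(int value)
    {
        assert(m_index < m_bytes.size());
        m_bytes[m_index++] = static_cast<std::uint8_t>(value);
    }

    std::size_t size() const { return m_index; }
    std::uint8_t at(std::size_t i) const { return m_bytes[i]; }
    unsigned reservations() const { return m_reservations; }

private:
    std::vector<std::uint8_t> m_bytes;
    std::size_t m_index = 0;
    unsigned m_reservations = 0;
};

// Hypothetical scoped writer: constructing it reserves room for one whole
// instruction, so every later put can skip the capacity check.
class SketchInstructionWriter {
public:
    explicit SketchInstructionWriter(SketchBuffer& buffer)
        : m_buffer(buffer)
    {
        m_buffer.ensureSpace(SketchBuffer::maxInstructionSize);
    }

    void putByteUnchecked(int value) { m_buffer.putByteUnchecked(value); }

    // Register-direct ModRM byte, using the same bit layout as putModRm in
    // the diff: (mode << 6) | ((reg & 7) << 3) | (rm & 7), with mode 3.
    void registerModRM(int reg, int rm)
    {
        putByteUnchecked((3 << 6) | ((reg & 7) << 3) | (rm & 7));
    }

private:
    SketchBuffer& m_buffer;
};

// Same shape as oneByteOp(opcode, reg, rm) in the diff, without REX handling.
inline void emitOneByteOp(SketchBuffer& buffer, int opcode, int reg, int rm)
{
    SketchInstructionWriter writer(buffer);
    writer.putByteUnchecked(opcode);
    writer.registerModRM(reg, rm);
}
```

<p>In this sketch, moving the ModRM helpers onto the writer means they cannot be called without a reservation having been made first, which is one plausible reason for relocating them out of the formatter's private section.</p>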

</body>
</html>