[Webkit-unassigned] [Bug 139851] New: JSC::VM::m_lastStackTop initialized incorrectly when on non-main fiber; runaway memory corruption

Sat Dec 20 02:52:53 PST 2014

https://bugs.webkit.org/show_bug.cgi?id=139851

            Bug ID: 139851
           Summary: JSC::VM::m_lastStackTop initialized incorrectly when
                    on non-main fiber; runaway memory corruption
    Classification: Unclassified
           Product: WebKit
           Version: 528+ (Nightly build)
          Hardware: PC
                OS: Windows 8
            Status: NEW
          Severity: Critical
          Priority: P2
         Component: JavaScriptCore
          Assignee: webkit-unassigned at lists.webkit.org
          Reporter: jnlehner at yahoo.com

On Microsoft Windows, if the main thread of an application has fibers, they all share the same thread-local storage.

Now in JSC::Heap::markRoots(), we call sanitizeStackForVM(), which via offlineasm becomes:

sanitizeStackForVMImpl PROC PUBLIC
    mov rdx, qword ptr [51744 + rcx]                         
    cmp rsp, rdx                                             
    jbe _offlineasm_zeroFillDone
    xor rax, rax                                             

  _offlineasm_zeroFillLoop:
    mov qword ptr [0 + rdx], rax                             
    add rdx, 8                                               
    cmp rsp, rdx                                             
    ja _offlineasm_zeroFillLoop

  _offlineasm_zeroFillDone:
    mov rdx, rsp                                             
    mov qword ptr [51744 + rcx], rdx                         
    ret                                                      
sanitizeStackForVMImpl ENDP

qword ptr [51744 + rcx] is the value of JSC::VM::m_lastStackTop.

If we look at all the reads and writes to JSC::VM::m_lastStackTop, besides the read and write above, we find:

WRITES:

JSC::JSLock::didAcquireLock():
JSC::JSLock::grabAllLocks():
    m_vm->setLastStackTop(wtfThreadData().savedLastStackTop());

JSC::VM::VM():
    setLastStackTop(wtfThreadData().stack().origin());

READS:

JSC::JSLock::dropAllLocks():
    wtfThreadData().setSavedLastStackTop(m_vm->lastStackTop());

So the value of JSC::VM::m_lastStackTop depends on what's in the thread-local storage provided by wtfThreadData(), and the only other write to thread-local storage not covered above is:

JSC::initializeThreading():
    std::call_once(initializeThreadingOnceFlag, []{
        // ...
        wtfThreadData().setSavedLastStackTop(wtfThreadData().stack().origin());
    });

There are 2 problems here.  First, JSC::initializeThreading() is only called once for the whole process, so the saved last stack top in thread-local storage will only be for one thread in the process.

But the bigger problem is that JSC::JSLock::didAcquireLock() and JSC::JSLock::grabAllLocks() set JSC::VM::m_lastStackTop based on this value from thread-local storage, even if it is the wrong value when JSC::JSLock::didAcquireLock() or JSC::JSLock::grabAllLocks() are called from another fiber!

Interestingly, note that JSC::VM::VM() sets JSC::VM::m_lastStackTop based on a fresh call to wtfThreadData().stack().origin(), which already has a fix for fiber environments:

 const StackBounds& stack()
    {
        // We need to always get a fresh StackBounds from the OS due to how fibers work.
        // See https://bugs.webkit.org/show_bug.cgi?id=102411
#if OS(WINDOWS)
        m_stackBounds = StackBounds::currentThreadStackBounds();
#endif
        return m_stackBounds;
    }

but this is insufficient to avoid disaster, because JSC::JSLock::didAcquireLock() and JSC::JSLock::grabAllLocks() overwrite the at-construction-time-correct JSC::VM::m_lastStackTop.

Now to explain the result of this problem.  The stack on an Intel machine grows toward lower memory addresses, and Windows threads by default have 1MB of virtual memory reserved for them.  WTF::StackBounds::origin() points to the next byte beyond the end of this allocation:

000000CC`00000000:
guard pages

top of stack

bottom of stack (e.g. main())
000000CC`00100000: origin

sanitizeStackForVMImpl() cares about only two locations in memory: JSC::VM::m_lastStackTop, which is rdx, and rsp. Now it appears that normally, the first time sanitization runs, rdx is supposed to be at the very bottom of the stack, and rsp will be somewhere in the middle, so 

cmp rsp, rdx                                             
jbe _offlineasm_zeroFillDone

executes and we simply update JSC::VM::m_lastStackTop with rsp.  The next time sanitization runs, rsp may have moved closer to the bottom of the stack, so rdx < rsp and we do some zero-filling.

Now suppose we're on a fiber created from the main thread.  JSC::VM::m_lastStackTop will be the origin of the main thread, not the fiber.  If the fiber's stack has been reserved in the virtual address space at a lower address than the main thread, and we're running in the fiber, rsp < rdx, so rdx is then updated to the correct rsp and no zero-filling occurs.

HOWEVER, if the fiber's stack has been reserved in the virtual address space at a higher address than the main thread, now we have rdx < rsp, which to sanitizeStackForVMImpl() looks like everything between must be zero-filled!  I've seen as much as 500MB of difference on Windows 8.1, presumably due to the new high entropy ASLR.

This will show up as a crash report with a stack like:

JavaScriptCore!sanitizeStackForVMImpl(void)+0xf
JavaScriptCore!JSC::Heap::markRoots(double
JavaScriptCore!JSC::Heap::collect(JSC::HeapOperation
JavaScriptCore!JSC::Heap::collectAllGarbage(void)+0x1c
JavaScriptCore!JSC::MarkedAllocator::allocateSlowCase(unsigned
JavaScriptCore!JSC::RegExpPrototype::create(class
JavaScriptCore!JSC::JSGlobalObject::reset(class
JavaScriptCore!JSC::JSGlobalObject::finishCreation(class
JavaScriptCore!JSC::JSGlobalObject::create(class
JavaScriptCore!JSGlobalContextCreateInGroup(struct
...
...
...
kernel32!CreateFiberEx+0x265

if you're lucky, but due to the runaway zeroing of memory, other crashes are possible.

-- 
You are receiving this mail because:
You are the assignee for the bug.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.webkit.org/pipermail/webkit-unassigned/attachments/20141220/e9384234/attachment-0002.html>