[Webkit-unassigned] [Bug 200983] New: [Android] 64-bit JSC r245459 crashes in JSC::AccessCase::propagateTransitions(JSC::SlotVisitor&)

bugzilla-daemon at webkit.org bugzilla-daemon at webkit.org
Wed Aug 21 09:02:40 PDT 2019


https://bugs.webkit.org/show_bug.cgi?id=200983

            Bug ID: 200983
           Summary: [Android] 64-bit JSC r245459 crashes in
                    JSC::AccessCase::propagateTransitions(JSC::SlotVisitor
                    &)
           Product: WebKit
           Version: WebKit Nightly Build
          Hardware: Other
                OS: Linux
            Status: NEW
          Severity: Blocker
          Priority: P2
         Component: JavaScriptCore
          Assignee: webkit-unassigned at lists.webkit.org
          Reporter: prti at amazon.com

Hi folks,
As part of Google's "Support 64-bit architectures" requirement (https://developer.android.com/distribute/best-practices/develop/64-bit), Amazon Alexa (https://play.google.com/store/apps/details?id=com.amazon.dee.app&hl=en_US) launched a 64-bit version of its Android app on July 30, 2019, based on an O3-optimized JSC (r245459).
While stability is vastly improved compared to our earlier attempts, residual problems remain: segmentation faults in the 64-bit JSC alone account for a crash rate of 0.35% of cold starts.
The stack trace follows. We do not have reproduction steps, as we have been unable to reproduce this crash in our in-house testing.

SIGSEGV
MainActivity
Segmentation violation (invalid memory reference)
Aug 14th, 2019, 11:32:55 UTC

STACKTRACE

SIGSEGV: Segmentation violation (invalid memory reference)
        JSC::AccessCase::propagateTransitions(JSC::SlotVisitor&) const at sfp-exceptions.c:?
        JSC::PolymorphicAccess::propagateTransitions(JSC::SlotVisitor&) const at sfp-exceptions.c:?
        JSC::CodeBlock::propagateTransitions(JSC::ConcurrentJSLocker const&, JSC::SlotVisitor&) at sfp-exceptions.c:?
        JSC::ExecutableToCodeBlockEdge::runConstraint(JSC::ConcurrentJSLocker const&, JSC::VM&, JSC::SlotVisitor&) at sfp-exceptions.c:?
        JSC::ExecutableToCodeBlockEdge::visitChildren(JSC::JSCell*, JSC::SlotVisitor&) at sfp-exceptions.c:?
        JSC::SlotVisitor::drain(WTF::MonotonicTime)::$_3::operator()(JSC::MarkStackArray&) const at sfp-exceptions.c:?
        JSC::SlotVisitor::drain(WTF::MonotonicTime) at sfp-exceptions.c:?
        JSC::SlotVisitor::drainFromShared(JSC::SlotVisitor::SharedDrainMode, WTF::MonotonicTime) at sfp-exceptions.c:?
        WTF::SharedTaskFunctor<void (), JSC::Heap::runBeginPhase(JSC::GCConductor)::$_17>::run() at sfp-exceptions.c:?
        WTF::ParallelHelperClient::runTask(WTF::RefPtr<WTF::SharedTask<void ()>, WTF::DumbPtrTraits<WTF::SharedTask<void ()> > > const&) at sfp-exceptions.c:?
        WTF::ParallelHelperPool::Thread::work() at sfp-exceptions.c:?
        WTF::Function<void ()>::CallableWrapper<WTF::AutomaticThread::start(WTF::AbstractLocker const&)::$_0>::call() at sfp-exceptions.c:?
        WTF::Thread::entryPoint(WTF::Thread::NewThreadContext*) at sfp-exceptions.c:?
        WTF::wtfThreadEntryPoint(void*) at sfp-exceptions.c:?
        at 0x7f89700484(/system/lib64/libc.so:427140)
        at 0x7f896b5db4(/system/lib64/libc.so:122292)
        at 0x0(Unknown)

The React Native community reports the same issue in https://github.com/facebook/react-native/issues/25494. The stack trace there is identical to the one we observe; see https://github.com/facebook/react-native/issues/25494#issuecomment-514976930.

The RN community has discussed two workarounds:
1) Use JSC with JIT disabled: We tried this option, but the performance impact is too great for us to deploy it in the field.
2) Use RN 0.60.x with Hermes: The current RN 0.60.x release has its own stability issues, and until those are resolved we cannot move to RN 0.60.

We contacted ARM because this crash is specific to arm64 and occurs across multiple devices. They responded as follows:
“We had a few engineers looking at this and we do not see an obvious pattern. There is a mixture of CPUs ranging from A53 only to “big/little” multi-core systems using A53, A55, A73, A75, A76, and M4. Our open source software engineers primarily work on the V8 JS engine so, we looked for similar issues fixed there and it does not look like something we have fixed previously. 

Given the problem is across so many disparate devices, our engineers suspect missing barriers, cache maintenance, or both in the generated code. These are the first areas, along with CPU detection (since it only seems to occur on Arm), where we would examine if we had better knowledge of JSC. I hope this helps you frame the issue for the bug report.”
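
To make the "missing barriers" hypothesis above concrete, here is a minimal C++ sketch of the general pattern ARM is describing. It is not JSC code and does not claim to show where the bug is; the names Payload, g_payload, g_ready, publish, consume, and run_demo are hypothetical and exist only for illustration. One thread fills in a record and then sets a flag; another thread waits for the flag and reads the record. On a weakly ordered CPU such as arm64, the reader is only guaranteed to see the completed record if the flag is written with release ordering and read with acquire ordering.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical record that one thread fills in and another thread reads.
struct Payload {
    int a = 0;
    int b = 0;
};

static Payload g_payload;                 // plain (non-atomic) data
static std::atomic<bool> g_ready{false};  // publication flag

// Writer: fill in the record, then publish it.
// The release store keeps the two plain writes from being
// reordered after the flag becomes visible to other threads.
void publish(int a, int b) {
    g_payload.a = a;
    g_payload.b = b;
    g_ready.store(true, std::memory_order_release);
}

// Reader: wait for the flag, then read the record.
// The acquire load pairs with the release store above, so once the
// flag reads true the writes to g_payload are guaranteed visible.
int consume() {
    while (!g_ready.load(std::memory_order_acquire)) {
        std::this_thread::yield();
    }
    return g_payload.a + g_payload.b;
}

// Runs one writer and one reader and returns what the reader saw.
int run_demo() {
    g_ready.store(false, std::memory_order_relaxed);  // reset before starting threads
    int sum = 0;
    std::thread reader([&sum] { sum = consume(); });
    std::thread writer([] { publish(40, 2); });
    writer.join();
    reader.join();
    return sum;  // 42 with the orderings above
}
```

If both orderings were weakened to std::memory_order_relaxed, the C++ memory model would no longer guarantee that the reader sees the completed record. Such a defect can stay hidden on x86, whose hardware ordering is stronger, and surface only on Arm, which would be consistent with the device pattern described in the response.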


Thanks for your support,
Pratik Patel
