[Webkit-unassigned] [Bug 200863] Crash in JSC::SlotVisitor::visitChildren

bugzilla-daemon at webkit.org bugzilla-daemon at webkit.org
Tue Nov 15 01:03:57 PST 2022


https://bugs.webkit.org/show_bug.cgi?id=200863

--- Comment #13 from Mark Lam <mark.lam at apple.com> ---
Tips for debugging a GC related crash (like this one):

1. Does it reproduce with JSC_useGenerationalGC=0?
2. Does it reproduce with JSC_useConcurrentGC=0?

   These test if you have some sort of missing write barrier issue.

3. Does running with JSC_useZombieMode=1 make it reproduce more easily?

   Rules out incremental sweeping as a factor.
   Plus, helps make GC issues manifest sooner, though it may perturb the timing of the run and hide the issue.

4. Does it reproduce with a Debug build?

   Helps make things easier to debug.
   Plus, it enables a lot more assertions to check invariants.

5. Does running with JSC_verifyGC=1 report any errors?

   Helps catch potential concurrent GC and generational GC issues, and may point to where the issue is.
   Note: though rare, may report a false positive.
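
The environment-variable checks in (1), (2), (3) and (5) can be run as a small matrix. Below is a minimal sketch in Python, not a definitive harness. The option names are the ones listed above. The helper names, the `./jsc` binary path and the `repro.js` script are hypothetical placeholders. Treating a negative return code as a crash assumes a POSIX system, where it means the process was killed by a signal. Tip (4) is a build choice rather than an option, so it is not covered here.

```python
import os
import subprocess

# Option settings taken from tips (1), (2), (3) and (5) above.
CONFIGS = {
    "baseline": {},
    "no-generational-gc": {"JSC_useGenerationalGC": "0"},
    "no-concurrent-gc": {"JSC_useConcurrentGC": "0"},
    "zombie-mode": {"JSC_useZombieMode": "1"},
    "verify-gc": {"JSC_verifyGC": "1"},
}

def build_env(overrides, base=None):
    """Return a copy of the base environment with the given options applied."""
    env = dict(os.environ if base is None else base)
    env.update(overrides)
    return env

def count_crashes(cmd, overrides, runs=20):
    """Run cmd `runs` times and count the runs that were killed by a signal."""
    crashes = 0
    for _ in range(runs):
        result = subprocess.run(
            cmd,
            env=build_env(overrides),
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        if result.returncode < 0:  # POSIX: negative means terminated by a signal
            crashes += 1
    return crashes

def run_matrix(cmd, runs=20):
    """Map each configuration name to its crash count."""
    return {name: count_crashes(cmd, overrides, runs)
            for name, overrides in CONFIGS.items()}

# Example use (hypothetical paths, substitute your own binary and script):
#   for name, crashes in run_matrix(["./jsc", "repro.js"]).items():
#       print(f"{name}: {crashes} crashes")
```

Since GC crashes tend to be timing dependent, comparing crash counts over many runs is usually more telling than a single pass or fail per configuration.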

Some thoughts on your specific issue:

6. This appears to reproduce only on your "custom AArch64 platform".

   Is this "custom AArch64 platform" stable?
   Have you ruled out silicon or OS kernel bugs?

   From my past experience in the real world (not theoretical), I've known new CPUs (from a vendor whom I shall not name but is not Apple) to have either silicon or OS kernel configuration bugs that result in concurrency issues where the hardware itself does not enforce proper memory coherency despite the presence of the needed memory barriers.  Has this been ruled out yet?

7. If you're running on custom silicon, are you also adding custom code to WebKit e.g. new types of Objects that are JSCells, or new functions that allocate and manipulate JSCells?

    If so, are you sure you have issued write barriers in all the needed places?

    One way to test this is to see if your issue still reproduces with the concurrent and generational GC disabled (see (1) and (2) above).

    If the reproduction stops, the next thing is to turn those back on, and start sprinkling write barriers liberally in your code to see if it makes the issue go away.

    If it does, gradually remove this sprinkling of write barriers, and see which one re-introduces the crash.  If you've isolated it, then audit the code around there to figure out why that write barrier is needed, or not.
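
The sprinkle-then-remove procedure in (7) amounts to a one-at-a-time reduction. Here is a minimal sketch of that loop in Python, under stated assumptions: `find_needed_barriers`, the site identifiers and the `crashes_with` callback are all hypothetical, and the callback stands in for "rebuild with only these extra barriers and see whether the crash reproduces", which is manual work in practice.

```python
def find_needed_barriers(sites, crashes_with):
    """Drop extra write barriers one at a time while the crash stays away.

    sites: identifiers for the places where an extra barrier was added.
    crashes_with: callable taking the set of sites that still have their
        extra barrier; returns True if the crash reproduces in that build.
    Returns the sites whose barrier could not be removed without the crash
    coming back.
    """
    enabled = set(sites)
    if crashes_with(enabled):
        raise ValueError("crash reproduces even with every extra barrier in place")
    needed = []
    for site in sites:
        trial = enabled - {site}
        if crashes_with(trial):
            needed.append(site)  # removing this barrier re-introduces the crash
        else:
            enabled = trial      # barrier was not needed; leave it out
    return needed
```

Because the crash may be intermittent, `crashes_with` would likely need to repeat the run several times before concluding that a given build does not crash.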

There are also advanced techniques for debugging GC issues using JSC_verifyHeap=1, but they require writing a lot of custom code carefully: you need to know what you are doing with GC related code, and understand the art of bisecting bugs in time (vs in space).  It's not a turnkey solution for debugging such issues, but if you're the type who can dive in and reason deeply about how the system works, you can use this to help isolate the issue ... assuming it is a software issue.
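
One possible reading of "bisecting in time" is a binary search over points in a run for the earliest one at which a heap check fails. The Python sketch below illustrates only that search. It assumes the run is deterministic and that the heap stays invalid once it has gone bad. `first_bad_checkpoint` and `heap_is_valid_at` are hypothetical names; the callback stands in for whatever custom verification code you write, and none of this is a JSC API.

```python
def first_bad_checkpoint(num_checkpoints, heap_is_valid_at):
    """Return the index of the earliest checkpoint where the heap check fails.

    heap_is_valid_at: callable taking a checkpoint index (0-based) and
        returning True if the heap verified cleanly at that point.
    Returns None if the heap is still valid at the last checkpoint.
    """
    if num_checkpoints <= 0:
        return None
    lo, hi = 0, num_checkpoints - 1
    if heap_is_valid_at(hi):
        return None  # nothing went wrong within the observed window
    while lo < hi:
        mid = (lo + hi) // 2
        if heap_is_valid_at(mid):
            lo = mid + 1  # corruption happens after mid
        else:
            hi = mid      # corruption happened at or before mid
    return lo
```

The code that ran between the last good checkpoint and the returned one is then the place to audit.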
