[webkit-dev] Questions about JSC CLoop crashes on ppc64
Mark Lam
mark.lam at apple.com
Tue Apr 1 09:17:56 PDT 2014
CC’ing webkit-dev because other folks may be interested in this information and/or may be able to help.
On Apr 1, 2014, at 6:57 AM, Tomas Popela <tpopela at redhat.com> wrote:
> Hi Mark,
> if you don't mind, I have some questions for you regarding https://bugs.webkit.org/show_bug.cgi?id=128743 (CLoop crashes on PPC64, S390X), as I'm lost in it. While looking at it I came across https://bugs.webkit.org/show_bug.cgi?id=97586#c10, where you suggest modifying llint/LowLevelInterpreter.cpp to add some debug prints. I tried it and saw that t0.i32 != t0.i. What could that mean?
>
> &t0.i32 = 0x3fffd709693c
> &t0.i = 0x3fffd7096938
> Setting t0.i = 0xfedcba9876543211;
> raw t0 = 0xfedcba9876543211
> Setting t0.i32 = 0x00000000;
> raw t0 = 0xfedcba9800000000
>
> Also another thing. I managed to compile trunk on that PPC64 machine. When I open jsc and just type "1" there it crashes as well. I have modified the *.asm files to print names of macros and labels to see the flow with:
>
> $ sed 's@^\(\..*\):$@&\n cloopDo // printf ("\1\\n");@' LowLevelInterpreter.asm
> $ sed 's@^macro \(.*\).*$@&\n cloopDo // printf ("\1\\n");@' LowLevelInterpreter.asm
>
FYI, there’s already some built-in tracing functionality for the C loop LLINT. In LowLevelInterpreter.cpp, search for TRACE_OPCODE. You can define ENABLE_TRACE_OPCODE as 1 to enable that code. Similar to the tracing that you’ve added, it will tell you which opcodes are being executed, but only at opcode granularity.
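Regarding the &t0.i32 and &t0.i output quoted above: a 4-byte gap between the two addresses is what a big-endian layout would produce, since the low-order 32 bits of a 64-bit slot sit at byte offset 4 there. The gdb dumps later in this message also show i32padding placed before i32 on ppc64 and after it on x86_64. What follows is only a minimal model of that layout, not the actual CLoopRegister definition; the function names are made up for illustration.

```cpp
#include <cstdint>

// Hypothetical model of a 64-bit register slot with a 32-bit view (i32)
// that aliases the LOW-order half of the 64-bit value (i).

// Byte offset of the low-order 32 bits inside the 64-bit slot.
constexpr unsigned i32ByteOffset(bool bigEndian)
{
    return bigEndian ? 4u : 0u;
}

// Effect of "t0.i32 = value" on the raw 64-bit contents: only the
// low-order 32 bits change, regardless of where they live in memory.
constexpr uint64_t withI32(uint64_t raw, int32_t value)
{
    return (raw & 0xffffffff00000000ull) | static_cast<uint32_t>(value);
}

// Values taken from the ppc64 output quoted in this message.
static_assert(0x3fffd709693cull - 0x3fffd7096938ull == i32ByteOffset(true),
              "address gap matches a big-endian layout");
static_assert(withI32(0xfedcba9876543211ull, 0) == 0xfedcba9800000000ull,
              "clearing i32 clears only the low-order half");
```

Under this model the quoted output is the expected behaviour, so that test by itself does not point at a problem in the register layout.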
> and the same for LowLevelInterpreter64.asm as well. The flow is different between ppc64 and x86_64:
>
> [tpopela at ibm-power7r2-01 bin]$ ./jsc
> >>> 1
> &t0.i32 = 0x3ffff1982c2c
> &t0.i = 0x3ffff1982c28
> Setting t0.i = 0xfedcba9876543211;
> raw t0 = 0xfedcba9876543211
> Setting t0.i32 = 0x00000000;
> raw t0 = 0xfedcba9800000000
> doCallToJavaScript(makeCall)
> callToJavaScriptPrologue()
> checkStackPointerAlignment(tempReg, location)
> .stackHeightOK
> .copyHeaderLoop
> .copyHeaderLoop
> .copyHeaderLoop
> .copyHeaderLoop
> .copyHeaderLoop
> .copyArgs
> .copyArgsLoop
> Segmentation fault
>
> When I do this on my x86_64 machine the flow is different.
>
> tpopela ~/dev/WebKit/WebKitBuild/Debug/bin > 14:24 [master]
> $ ./jsc
> >>> 1
> doCallToJavaScript(makeCall)
> callToJavaScriptPrologue()
> checkStackPointerAlignment(tempReg, location)
> .stackHeightOK
> .copyHeaderLoop
> .copyHeaderLoop
> .copyHeaderLoop
> .copyHeaderLoop
> .copyHeaderLoop
> .fillExtraArgsLoop <-------------------------
> .copyArgs
> .copyArgsLoop
> .copyArgsDone
> checkStackPointerAlignment(tempReg, location)
> makeJavaScriptCall(entry, temp)
> prologue(codeBlockGetter, codeBlockSetter, osrSlowPath, traceSlowPath)
> preserveCallerPCAndCFR()
> notFunctionCodeBlockGetter(targetRegister)
> notFunctionCodeBlockSetter(sourceRegister)
> moveStackPointerForCodeBlock(codeBlock, scratch)
> dispatch(advance)
> jumpToInstruction()
> traceExecution()
> checkStackPointerAlignment(tempReg, location)
> .opEnterDone
> callSlowPath(slowPath)
> prepareStateForCCall()
> cCall2(function, arg1, arg2)
> checkStackPointerAlignment(tempReg, location)
> restoreStateAfterCCall()
> dispatch(advance)
> jumpToInstruction()
> traceExecution()
> loadConstantOrVariable(index, value)
> .constant
> .done
> dispatch(advance)
> jumpToInstruction()
> traceExecution()
> loadConstantOrVariable(index, value)
> .constant
> .done
> dispatch(advance)
> jumpToInstruction()
> traceExecution()
> checkSwitchToJITForEpilogue()
> checkSwitchToJIT(increment, action)
> assertNotConstant(index)
> assert(assertion)
> doReturn()
> restoreCallerPCAndCFR()
> checkStackPointerAlignment(tempReg, location)
> .calleeFramePopped
> checkStackPointerAlignment(tempReg, location)
> callToJavaScriptEpilogue()
> 1
Looking at .fillExtraArgsLoop in LowLevelInterpreter64.asm, I see that it is part of the loop that starts at .copyHeaderLoop.
I suggest you add some “cloopDo // printf("…");” tracing to the instructions there and inspect the CLoopRegisters to see what values they contain and why the difference exists.
You can also add “cloopDo // myTestFunction();” and set a gdb breakpoint in myTestFunction(). Then step through the code to see how it works and compare x86_64 with ppc64.
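A minimal sketch of such a hook, assuming it is declared somewhere visible to the generated C loop code (where exactly is an assumption). Only the name comes from the suggestion above; the label parameter and the hit counter are made up for illustration, and the noinline attribute is GCC/Clang-specific, there only so the breakpoint has a real function to land on.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical hit counter, handy for telling loop iterations apart.
static unsigned long myTestFunctionHits = 0;

// Hypothetical debugging hook. It could be called from the .asm file as
//   cloopDo // myTestFunction(".copyArgsLoop");
extern "C" __attribute__((noinline)) void myTestFunction(const char* where)
{
    assert(where); // a label is required so the log stays readable
    ++myTestFunctionHits;
    std::fprintf(stderr, "myTestFunction #%lu at %s\n", myTestFunctionHits, where);
}
```

In gdb, "break myTestFunction" followed by "finish" returns to the generated code right after the hook, where the CLoop registers can be printed.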
> When it crashes on ppc64 it is crashing here:
>
> 106 pcBase.i64 = *CAST<int64_t*>(t3.i8p + (pc.i << 3)); // /home/tpopela/WebKit/Source/JavaScriptCore/llint/LowLevelInterpreter64.asm:225
> (gdb) p t3
> $1 = {{i = 0, u = 0, {i32padding = 0, i32 = 0}, {u32padding = 0, u32 = 0}, {i8padding = "\000\000\000\000\000\000", i8 = 0 '\000'}, {u8padding = "\000\000\000\000\000\000", u8 = 0 '\000'}, ip = 0x0, i8p = 0x0, vp = 0x0, callFrame = 0x0, execState = 0x0, instruction = 0x0, vm = 0x0, cell = 0x0, protoCallFrame = 0x0, nativeFunc = 0x0, i64 = 0, u64 = 0, encodedJSValue = 0, castToDouble = 0, opcode = 0x0}}
>
> As you can see, the t3 register is completely empty, so it crashes. The problem is that it went on to copy the args because this check did not branch:
>
> if (pc.i32 == 0) // /home/tpopela/WebKit/Source/JavaScriptCore/llint/LowLevelInterpreter64.asm:223
> goto _offlineasm_doCallToJavaScript__copyArgsDone;
>
> ppc64:
> (gdb) p pc
> $1 = {{i = 4294967295, u = 4294967295, {i32padding = 0, i32 = -1}, {u32padding = 0, u32 = 4294967295}, {i8padding = "\000\000\000\000\377\377\377",
> i8 = -1 '\377'}, {u8padding = "\000\000\000\000\377\377\377", u8 = 255 '\377'}, ip = 0xffffffff, i8p = 0xffffffff <Address 0xffffffff out of bounds>,
> vp = 0xffffffff, callFrame = 0xffffffff, execState = 0xffffffff, instruction = 0xffffffff, vm = 0xffffffff, cell = 0xffffffff,
> protoCallFrame = 0xffffffff, nativeFunc = 0xffffffff, i64 = 4294967295, u64 = 4294967295, encodedJSValue = 4294967295,
> castToDouble = 2.1219957904712067e-314, opcode = 0xffffffff}}
>
> x86_64:
> (gdb) p pc
> $1 = {{i = 0, u = 0, {i32 = 0, i32padding = 0}, {u32 = 0, u32padding = 0}, {i8 = 0 '\000',
> i8padding = "\000\000\000\000\000\000"}, {u8 = 0 '\000', u8padding = "\000\000\000\000\000\000"},
> ip = 0x0, i8p = 0x0, vp = 0x0, callFrame = 0x0, execState = 0x0, instruction = 0x0, vm = 0x0,
> cell = 0x0, protoCallFrame = 0x0, nativeFunc = 0x0, i64 = 0, u64 = 0, encodedJSValue = 0,
> castToDouble = 0, opcode = 0x0}}
>
> Also as you can see on x86_64 it is entering the .fillExtraArgsLoop label that depends on this if statement
>
> if (pc.i32 == pcBase.i32) // /home/tpopela/WebKit/Source/JavaScriptCore/llint/LowLevelInterpreter64.asm:209
> goto _offlineasm_doCallToJavaScript__copyArgs;
>
> At the time this statement is processed, the values are:
>
> ppc64:
> (gdb) p pc
> $1 = {{i = 4294967295, u = 4294967295, {i32padding = 0, i32 = -1}, {u32padding = 0, u32 = 4294967295}, {i8padding = "\000\000\000\000\377\377\377",
> i8 = -1 '\377'}, {u8padding = "\000\000\000\000\377\377\377", u8 = 255 '\377'}, ip = 0xffffffff, i8p = 0xffffffff <Address 0xffffffff out of bounds>,
> vp = 0xffffffff, callFrame = 0xffffffff, execState = 0xffffffff, instruction = 0xffffffff, vm = 0xffffffff, cell = 0xffffffff,
> protoCallFrame = 0xffffffff, nativeFunc = 0xffffffff, i64 = 4294967295, u64 = 4294967295, encodedJSValue = 4294967295,
> castToDouble = 2.1219957904712067e-314, opcode = 0xffffffff}}
> (gdb) p pcBase
> $2 = {{i = 4294967295, u = 4294967295, {i32padding = 0, i32 = -1}, {u32padding = 0, u32 = 4294967295}, {i8padding = "\000\000\000\000\377\377\377",
> i8 = -1 '\377'}, {u8padding = "\000\000\000\000\377\377\377", u8 = 255 '\377'}, ip = 0xffffffff, i8p = 0xffffffff <Address 0xffffffff out of bounds>,
> vp = 0xffffffff, callFrame = 0xffffffff, execState = 0xffffffff, instruction = 0xffffffff, vm = 0xffffffff, cell = 0xffffffff,
> protoCallFrame = 0xffffffff, nativeFunc = 0xffffffff, i64 = 4294967295, u64 = 4294967295, encodedJSValue = 4294967295,
> castToDouble = 2.1219957904712067e-314, opcode = 0xffffffff}}
I’m not sure what the values should be, but your pc and pcBase look suspicious to me. I recommend you use a debugger (e.g. gdb) and step through CLoop::execute(). You can ignore the first call to CLoop::execute(); that is just the initialization pass that sets up the opcode map. On the second entry to CLoop::execute(), step through the code and see how pc and pcBase are computed.
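For what it's worth, the ppc64 values quoted above are what the generated code would produce if the 32-bit load returned 0: the load zero-extends into the register, and the 32-bit subtract then wraps the low half to 0xffffffff while the upper half stays zero. The sketch below only models that arithmetic, with made-up function names; it says nothing about why the load returned 0.

```cpp
#include <cstdint>

// Models "pc.u = <32-bit load>; pc.i32 = pc.i32 - 1;" on a 64-bit register:
// the load zero-extends, and the subtract touches only the low 32 bits.
constexpr uint64_t loadThenSubtractOne(uint32_t loaded)
{
    return static_cast<uint64_t>(static_cast<uint32_t>(loaded - 1u));
}

// Models the address used by "*CAST<int64_t*>(t3.i8p + (pc.i << 3))".
constexpr uint64_t copyArgsAddress(uint64_t t3, uint64_t pc)
{
    return t3 + (pc << 3);
}

static_assert(loadThenSubtractOne(0) == 4294967295ull, "ppc64: pc.i as shown by gdb");
static_assert(loadThenSubtractOne(1) == 0, "x86_64: pc.i as shown by gdb");
static_assert(copyArgsAddress(0, 4294967295ull) == 0x7fffffff8ull,
              "address computed when t3 is zero");
```

With pc.i32 equal to -1, the "pc.i32 == 0" check before the copy loop does not branch, and with t3 at zero the load is attempted at 0x7fffffff8, which would be consistent with the segmentation fault.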
> x86_64:
> (gdb) p pc
> $1 = {{i = 0, u = 0, {i32 = 0, i32padding = 0}, {u32 = 0, u32padding = 0}, {i8 = 0 '\000', i8padding = "\000\000\000\000\000\000"}, {u8 = 0 '\000',
> u8padding = "\000\000\000\000\000\000"}, ip = 0x0, i8p = 0x0, vp = 0x0, callFrame = 0x0, execState = 0x0, instruction = 0x0, vm = 0x0, cell = 0x0,
> protoCallFrame = 0x0, nativeFunc = 0x0, i64 = 0, u64 = 0, encodedJSValue = 0, castToDouble = 0, opcode = 0x0}}
> (gdb) p pcBase
> $2 = {{i = 1, u = 1, {i32 = 1, i32padding = 0}, {u32 = 1, u32padding = 0}, {i8 = 1 '\001', i8padding = "\000\000\000\000\000\000"}, {u8 = 1 '\001',
> u8padding = "\000\000\000\000\000\000"}, ip = 0x1, i8p = 0x1 <Address 0x1 out of bounds>, vp = 0x1, callFrame = 0x1, execState = 0x1,
> instruction = 0x1, vm = 0x1, cell = 0x1, protoCallFrame = 0x1, nativeFunc = 0x1, i64 = 1, u64 = 1, encodedJSValue = 1,
> castToDouble = 4.9406564584124654e-324, opcode = 0x1}}
>
> As you can see, the values differ on x86_64, so it will not jump directly to .copyArgs as it does on ppc64.
>
> Also, I thought that the values returned by
>
> loadi ProtoCallFrame::argCountAndCodeOriginValue[protoCallFrame], temp2
> loadi ProtoCallFrame::paddedArgCount[protoCallFrame], temp3
>
> should be the same on both architectures as these are about the current command (in this case "1").
>
> loadi ProtoCallFrame::argCountAndCodeOriginValue[protoCallFrame], temp2
> on ppc64:
> 78 pc.u = *CAST<uint32_t*>(t2.i8p + 24); // /home/tpopela/WebKit/Source/JavaScriptCore/llint/LowLevelInterpreter64.asm:204
> (gdb) n
> 79 pc.i32 = pc.i32 - int32_t(0x1); // /home/tpopela/WebKit/Source/JavaScriptCore/llint/LowLevelInterpreter64.asm:205
> (gdb) p pc.u
> $3 = 0
>
> on x86_64:
> 78 pc.u = *CAST<uint32_t*>(t2.i8p + 24); // /home/tpopela/dev/WebKit/Source/JavaScriptCore/llint/LowLevelInterpreter64.asm:204
> (gdb) n
> 79 pc.i32 = pc.i32 - int32_t(0x1); // /home/tpopela/dev/WebKit/Source/JavaScriptCore/llint/LowLevelInterpreter64.asm:205
> (gdb) p pc.u
> $1 = 1
>
> loadi ProtoCallFrame::paddedArgCount[protoCallFrame], temp3
> on ppc64:
> 81 pcBase.u = *CAST<uint32_t*>(t2.i8p + 40); // /home/tpopela/WebKit/Source/JavaScriptCore/llint/LowLevelInterpreter64.asm:206
> (gdb) n
> 82 pcBase.i32 = pcBase.i32 - int32_t(0x1); // /home/tpopela/WebKit/Source/JavaScriptCore/llint/LowLevelInterpreter64.asm:207
> (gdb) p pcBase.u
> $4 = 0
>
> on x86_64:
> 81 pcBase.u = *CAST<uint32_t*>(t2.i8p + 40); // /home/tpopela/dev/WebKit/Source/JavaScriptCore/llint/LowLevelInterpreter64.asm:206
> (gdb) n
> 82 pcBase.i32 = pcBase.i32 - int32_t(0x1); // /home/tpopela/dev/WebKit/Source/JavaScriptCore/llint/LowLevelInterpreter64.asm:207
> (gdb) p pcBase.u
> $2 = 2
>
> So I'm not sure whether that means something.
>
> I uploaded LowLevelInterpreter64.asm, LowLevelInterpreter.asm and LLIntAssembly.h for x86_64 and ppc64 on http://tpopela.fedorapeople.org/
>
> Can you tell me if some of my concerns that I mentioned above are valid?
I’m not sure. You should look in Interpreter.cpp for the initialization of ProtoCallFrame, and work backward from there. I can’t tell off the top of my head whether these should be the same or not. If there are platform differences, they should be captured by some abstraction, e.g. a symbolic constant or a function that fetches the value.
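As an illustration of the kind of abstraction meant here, and of one way a 32-bit load could return 0 on a big-endian machine where it returns 1 or 2 on x86_64: if a count is stored in a 64-bit slot and read back as 32 bits at offset 0, a big-endian host sees the high-order half. This is a guess only; the thread does not establish the cause or show the ProtoCallFrame layout. The names below are hypothetical and the model is not taken from the WebKit sources.

```cpp
#include <cstdint>

// Hypothetical symbolic constant: byte offset of the low-order 32 bits
// (the "payload") within a 64-bit slot.
constexpr unsigned payloadOffset(bool bigEndian)
{
    return bigEndian ? 4u : 0u;
}

// Models a 32-bit read at byteOffset (0 or 4) from a 64-bit slot.
constexpr uint32_t read32(uint64_t slot, unsigned byteOffset, bool bigEndian)
{
    return ((byteOffset == 0) == bigEndian)
        ? static_cast<uint32_t>(slot >> 32)  // high-order half
        : static_cast<uint32_t>(slot);       // low-order half
}

// A count of 1 stored in a 64-bit slot, read at offset 0:
static_assert(read32(1, 0, false) == 1, "little-endian sees the count");
static_assert(read32(1, 0, true) == 0, "big-endian sees the empty high half");
// Reading at the endian-aware offset gives the same answer on both.
static_assert(read32(1, payloadOffset(true), true) == 1, "big-endian, payload offset");
```

If the initialization in Interpreter.cpp turns out to store these fields as plain 32-bit values, this explanation does not apply and the cause lies elsewhere.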
Regards,
Mark
> Thank you
>
> Tom