[Webkit-unassigned] [Bug 159827] run-webkit-tests should trigger a spindump when WebContent process is unresponsive

Thu Jul 21 12:28:29 PDT 2016

https://bugs.webkit.org/show_bug.cgi?id=159827

--- Comment #20 from Jonathan Bedard <jbedard at apple.com> ---
Comment on attachment 284131
  --> https://bugs.webkit.org/attachment.cgi?id=284131
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=284131&action=review

>> Tools/ChangeLog:7
>> +
> 
> Please explain the motivation for this change. From what I can tell the spindump(8) report is the concatenation of sample(1)ing all running processing plus some additional data. I take it that our interest in a spindump report is to capture the state of other WebKit processes (e.g. the Network process) when a WebContent process hangs to help diagnose a hard-to-reproduce test failure.

The motivation for this change was a flakey timeout which occurred before a test had actually started.  In that use case, no crash report was generated

>> Tools/Scripts/webkitpy/port/driver.py:440
>> +        return True
> 
> Is it necessary to have separate logic for spin dumps? I mean, this patch unconditionally replaces the machinery for sampling a process using sample(1) with machinery to call spindump(8). I do not see the need to only do one or the other, especially given that spindump(8) can only be run by root. Could we make use of the existing "PROCESS UNRESPONSIVE" message and incorporate the spin dump logic into Port.sample_process()? Could we incorporate the spin dump logic into Port.sample_process() such that we first call spindump(8) and if that command returns a non-zero exit status (say, the current user does not have password-less sudo access) then we call sample(1) on the WebContent process as we do today?

Maybe not.  There is another bug (r27472618) which may be related to this, and may have contributed to the difficulty tracking down the bug which created a need for this change.

>> Tools/Scripts/webkitpy/port/ios.py:432
>> +            os.makedirs(self.results_directory() + '/' + path_location)
> 
> Can we make use of the existing machinery (*) that copies sample(1) reports to the appropriate place in the results directory? If it makes sense to incorporate the spin dump logic into Port.sample_process() then we likely can pass Port.sample_file_path(..., ...) as the path for the -file argument to spindump(8) and make use of the existing machinery the collect samples and moves then to be near the test results for a failed test.
> 
> (*) After running a test Manager._look_for_new_crash_logs() is called to collect crash reports and sample reports. Manager._look_for_new_crash_logs() calls Port.look_for_new_samples() for the list of sample reports that were written and uses TestResultWriter.copy_sample_file() to copy each sample report to the appropriate place in the results directory.

My concern with integrating the spin dump logic into Port.sample_process() is that we would lose the hook back to the TestInvocation through stdin.  I kept spin dump separate because unlike sample_process(), spin dump is actually preventing the TestInvocation process from continuing.

It's also possible that I have misunderstood the point of sample_process() and spin dump does belong there.  I think it is worth noting, however, that the cause of this change was a flakey test which specifically wasn't able to be debugged through sample_process().

>> Tools/WebKitTestRunner/TestInvocation.cpp:215
>> +        return;
> 
> Can you elaborate on this?

The point of this snippet of code (per Alexey) is to continue on without waiting for a stdin trigger from the driving process.  The m_usingServerMode flag is used in the testController to know when to receive testing arguments from standard in (at least, this is my understanding of it).  This should mean that for most cases where the tests are being manually driven, the spindump trigger will do nothing.

>> Tools/WebKitTestRunner/TestInvocation.cpp:216
>> +    fputs("If running manually, hit enter...\n", stderr);
> 
> Is this a workflow that most people will make use of? Running WebKitTestRunner in server mode and wanting to invoke spindump(8) by hand?

No, see above.  WebKitTestRunner should be driven by a script if being run in server mode, but, if for some reason WebKitTestRunnner is being run in server mode manually, we would want to alert the user that the application is waiting for input.

>> Tools/WebKitTestRunner/TestInvocation.cpp:219
>> +        fputs("Spindump return failure\n", stderr);
> 
> I take it that this output is helpful.

The intention of this 'if' statement and it's pickiness is to provide some logging information indicating something has gone wrong with spindump.  Since this code path will be run rarely, I thought that this statement (and the pickiness of the stdin read) would be helpful for any future debugging

-- 
You are receiving this mail because:
You are the assignee for the bug.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.webkit.org/pipermail/webkit-unassigned/attachments/20160721/832eb856/attachment-0001.html>