[Webkit-unassigned] [Bug 107451] [soup] Use SoupSession instead of SoupSessionAsync for doing sync and async IO from the same session

bugzilla-daemon at webkit.org bugzilla-daemon at webkit.org
Fri Nov 30 14:18:18 PST 2018


https://bugs.webkit.org/show_bug.cgi?id=107451

Michael Catanzaro <mcatanzaro at igalia.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |mcatanzaro at igalia.com

--- Comment #3 from Michael Catanzaro <mcatanzaro at igalia.com> ---
With glib-networking 2.57.1, WebKit is no longer able to load TLS error pages. The problem is a network process deadlock caused by a change in how glib-networking performs certificate verification. Previously it verified certificates *after* the TLS handshake had completed, which was weirdly late, but previously not problematic. But now that TLS 1.3 exists, application data can be sent before certificate verification occurs, which is no good. So I moved verification to occur during the handshake. I needed to do this regardless because I need to add a new callback in GnuTLS for another feature. This introduced a deadlock in WebKit:

 * glib-networking detects an unacceptable certificate, emits accept-certificate signal
 * NetworkDataTaskSoup::tlsConnectionAcceptCertificate calls NetworkDataTaskSoup::invalidateAndCancel calls NetworkDataTaskSoup::clearRequest
 * NetworkDataTaskSoup::clearRequest calls soup_session_cancel_message

The problem is that, in the deprecated SoupSessionAsync used by WebKit, cancellation is always *synchronous* despite the name of the class. So soup_session_cancel_message winds up doing its thing to close everything out, and that eventually ends up in a synchronous call to g_tls_connection_gnutls_close. The close operation can't proceed until the TLS handshake is finished, and the handshake is blocked waiting for WebKit to return from its accept-certificate handler. So the close operation winds up polling forever waiting for the handshake to finish (the handshake normally runs on a secondary thread, just except for accept-certificate). Deadlock. I've attached a backtrace showing the callstack at the time of deadlock.

We can try to fix this in glib-networking or in WebKit. I've actually stalled on this for several weeks just because I'm not sure which approach is best. WebKit's use of soup_session_cancel_message seems *really* innocuous and even for someone familiar with the libsoup API, it's really unexpected that using it in the accept-certificate handler could cause a deadlock, so that favored fixing it in glib-networking. That could be done in several ways, e.g. queuing the close operation to occur at a later time and returning a fake success exit status, or returning an error code instead of polling until the handshake completes.

Ultimately, I decided glib-networking is already much too complicated, and solving this in WebKit is so much easier. All we need to do is switch from SoupSessionAsync to the modern SoupSession API. Ironically, with modern SoupSession, cancellation is asynchronous, unlike SoupSessionAsync, so the problem goes away. Solving this in WebKit has the major disadvantage that other applications could be broken by the behavior change, but I've used Debian codesearch to verify that WebKit is the only software likely to be affected.

The only changes required in WebKit are adjustments for the new default property values. Most of the properties we used to set explicitly are now defaults and should be removed. Additionally, SoupSession has default timeouts, which we want to override to allow NetworkDataTaskSoup to implement its own timeouts.

Some final notes:

 * We already require libsoup 2.42, the first version to introduce the modern SoupSession, so there is no dependency bump required. Youenn noted that this version is kinda broken, but it will build.
 * This deadlock bug is super annoying, so if we wind up accepting this solution, it would be nice to do a 2.23.2 release.
 * There is a moderately-high risk of unexpected regression. Do not backport!

-- 
You are receiving this mail because:
You are the assignee for the bug.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.webkit.org/pipermail/webkit-unassigned/attachments/20181130/d9b5f396/attachment.html>


More information about the webkit-unassigned mailing list