[webkit-dev] Limiting slow unload handlers (Re: Back/forward cache for pages with unload handlers)

Wed Sep 16 22:47:51 PDT 2009

On Wed, Sep 16, 2009 at 10:33 PM, Darin Fisher <darin at chromium.org> wrote:

>
>
> On Wed, Sep 16, 2009 at 9:59 PM, Maciej Stachowiak <mjs at apple.com> wrote:
>
>>
>> On Sep 16, 2009, at 4:49 PM, Darin Fisher wrote:
>>
>>
>>
>> On Wed, Sep 16, 2009 at 2:21 PM, Maciej Stachowiak <mjs at apple.com> wrote:
>>
>>>
>>> On Sep 16, 2009, at 1:58 PM, John Abd-El-Malek wrote:
>>>
>>>
>>> Either way though, I don't think it'll work in this case.  I've seen
>>> pages have 8 beforeunload/unload handlers each sleeping for 200ms, just so
>>> that they don't have 1 handler that'll trip the slow script detection.  If
>>> we decrease the timeout for unload handlers, they would just increase the
>>> number of registered handlers proportionally.
>>>
>>>
>>> I think that setting an upper bound on the amount of time that can be
>>> spent in all unload handlers is a better solution than hacking the behavior
>>> of the Date API, because (a) it's less likely to have unexpected side
>>> effects; and (b) there's no way for content authors to work around it, so we
>>> are less likely to end up in an "arms race" situation. There were worries
>>> expressed that swapping or context switching might trigger false positives,
>>> but I expect this is unlikely in practice, based on our experience with the
>>> slow script dialog.
>>>
>>
>>
>> I too would like to avoid an arms race, but...
>>
>> I disagree.  You'll get false positives at an unacceptable rate,
>> especially if you try to tamp down the interval to a small fraction of a
>> second.  We saw these problems in spades with Chrome's hang monitor
>> (detecting unresponsive subprocesses), and we had to push the interval to
>> something larger than we would have liked.
>>
>>
>> Interesting - I don't recall ever seeing false positives with Safari's
>> "slow script" detection. Maybe due to our particular timeout design (see
>> below).
>>
>>
>> Counting work instead of time is much more robust.  The getTime call
>> count is a measure of work, albeit an approximate one.
>>
>>
>> The way JavaScriptCore execution time limit works is that the clock
>> doesn't start ticking until JS execution begins. So it's unlikely that a
>> full timeout cycle will occur while the process is swapped out or paused,
>> since the clock won't start running until the process is actually executing
>> JS. And the actual timeout check is only done occasionally (every N loop
>> back edges or function calls, for some value of N). So even if there's a
>> context switch in the middle of JS execution, it's unlikely that JS
>> processing will be terminated immediately upon return. So maybe a different
>> solution is appropriate for JavaScriptCore than V8.
>>
>>
> Consider what happens if garbage collection runs during JS execution.  That
> could cause portions of the VM to be swapped into RAM, which could cause
> significant wall clock delay.  Do you discount time spent in GC?
>
>
>
>>
>> Also, it is very important to note that content authors are not entirely
>> in control here.  A content author may have some ads on their page, and it
>> may be the ad that is delivering the bad unload handler.  If we applied a
>> limit to all unload handlers, then we'd be punishing both the content author
>> as well as the ad provider.  That doesn't seem fair to the content author,
>> who might have a legit unload handler.
>>
>>
>> As long as the author installs their unload handlers before the ad does,
>> they won't have a problem.
>>
>
> Good point.
>
>
>
>>
>> To help us decide whether (and how) to tackle this for non-V8 ports of
>> WebKit, can the Chrome team share the data they have on the following:
>>
>> (1) Frequency of pages doing a busy loop in an unload handler. I've heard
>> it's common but no specific data.
>> (2) A few examples of URLs to pages that do this, so we can study what
>> they are doing and why.
>> (3) Frequency of a date-based loop being used to implement the busy loop.
>> (4) Average additional delay imposed by unload busy loops.
>> (5) Proportion of sites that use busy looping in unload solely for link
>> tracking and not for any other purpose.
>>
>>
> You can find links to example sites in the Chromium bug report:
> http://code.google.com/p/chromium/issues/detail?id=7823
>
> The bug contains some distilled data.
>
> By the way, the issue is not with trouble sites but with trouble ad
> networks and/or producers.  I believe the web sites are just victims here.
>
>
>
>> The reason I'm interested in (1)-(4) is to determine if doing nothing is
>> really worse than doing something hackish, as suggested by Adam.
>>
>> The reason I'm interested in (5) is to determine if <a ping> is an
>> adequate replacement. I think if we break existing techniques, we need to
>> give authors a replacement. unload fires when the user leaves the page in
>> any way whatsoever, including closing the window or typing in the location
>> field. So sites could use I/O in unload plus a busy loop to track the amount
>> of time the user spent on the page, or to save state. If sites are doing
>> that, then <a ping> won't be an adequate replacement, so we'll have to do
>> something like Adam's suggestion to guarantee completion of I/O that is
>> initiated in the unload handler. The reason I think it's possible sites care
>> about more than just link tracking is that if that's all they care about,
>> they could just use redirect links, and get a better user experience today
>> than busy looping in unload. If sites are not using redirects for link
>> tracking today, why would they use <a ping> in the future?
>>
>>
> I don't think they are using it for critical data, because they have a
> timeout.  If they were trying to persist critical data, then they would
> just use a synchronous XHR.  In this case, they are trying to increase
> the probability of successfully sending a ping by giving themselves a few
> hundred ms.
>
> -Darin
>

By the way, to be clear, these ads aren't on the critical path for link
clicks.  A navigation occurs, and the ad just observes unload.  During
unload it presumably tries to send home some data (ad impression time,
perhaps?).  I'm not sure how a redirect could be used to report such
information.

<a ping> is a useful tool nonetheless because you could dynamically create
one, and dispatch a click event to it.
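
To make the "dynamically create one, and dispatch a click" idea concrete, here
is a minimal sketch in TypeScript. Everything named here is an assumption for
illustration only: the helper reportViaAnchorPing, the AnchorLike and
DocumentLike types (stand-ins for the DOM's anchor element and document, so the
sketch does not depend on DOM typings), and the reportUrl and href parameters.
Whether a given browser actually sends the ping for a synthetic click, or at
all while a page is unloading, is not something this sketch establishes.

```typescript
// Minimal structural stand-ins for the DOM types (assumed for illustration).
interface AnchorLike {
  href: string;
  setAttribute(name: string, value: string): void;
  click(): void;
}

interface DocumentLike {
  createElement(tag: "a"): AnchorLike;
}

// Hypothetical helper: build an <a ping> element on the fly and click it.
// `reportUrl` is where the ping would be sent; `href` is the link target.
function reportViaAnchorPing(
  doc: DocumentLike,
  reportUrl: string,
  href: string
): AnchorLike {
  const anchor = doc.createElement("a");
  anchor.href = href;                       // target of the synthetic click
  anchor.setAttribute("ping", reportUrl);   // URL to notify when the link is followed
  anchor.click();                           // dispatch the click
  return anchor;
}
```

In a page, one would presumably pass the real document object and call the
helper from whatever code wants to report data; no busy loop is involved, since
delivery of the ping is left to the browser.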

-Darin