[webkit-dev] Limiting slow unload handlers (Re: Back/forward cache for pages with unload handlers)

Wed Sep 16 21:59:04 PDT 2009

On Sep 16, 2009, at 4:49 PM, Darin Fisher wrote:

>
>
> On Wed, Sep 16, 2009 at 2:21 PM, Maciej Stachowiak <mjs at apple.com>  
> wrote:
>
> On Sep 16, 2009, at 1:58 PM, John Abd-El-Malek wrote:
>
>>
>> Either way though, I don't think it'll work in this case.  I've  
>> seen pages have 8 beforeunload/unload handlers each sleeping for  
>> 200ms, just so that they don't have 1 handler that'll trip the slow  
>> script detection.  If we decrease the timeout for unload handlers,  
>> they would just increase the number of registered handlers  
>> proportionally.
>
> I think that setting an upper bound on the amount of time that can  
> be spent in all unload handlers is a better solution than hacking  
> the behavior of the Date API. Because (a) It's less likely to have  
> unexpected side effects; and (b) there's no way for content authors  
> to work around it, so we are less likely to end up in an "arms race"  
> situation. There were worries expressed that swapping or context  
> switching might trigger false positives, but I expect this is  
> unlikely in practice, based on our experience with the slow script  
> dialog.
>
>
> I too would like to avoid an arms race, but...
>
> I disagree.  You'll get false positives at an unacceptable rate,  
> especially if you try to tamp down the interval to a small fraction  
> of a second.  We saw these problems in spades with Chrome's hang  
> monitor (detecting unresponsive subprocesses), and we had to push  
> the interval to something larger than we would have liked.

Interesting - I don't recall ever seeing false positives with  
Safari's "slow script" detection. Maybe due to our particular timeout  
design (see below).

>
> Counting work instead of time is much more robust.  The getTime call  
> count is a measure of work, albeit approximate.

The way the JavaScriptCore execution time limit works is that the clock  
doesn't start ticking until JS execution begins. So it's unlikely that  
a full timeout cycle will occur while the process is swapped out or  
paused, since the clock won't start running until the process is  
actually executing JS. And the actual timeout check is only done  
occasionally (every N loop back edges or function calls, for some  
value of N). So even if there's a context switch in the middle of JS  
execution, it's unlikely that JS processing will be terminated  
immediately upon return. So maybe a different solution is appropriate  
for JavaScriptCore than V8.
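
As a rough illustration of the timeout design described above, here is a minimal Python sketch. It is not JavaScriptCore's actual code: the class name, the method names, and the injectable clock are all hypothetical, and the sketch only assumes what the paragraph states (the clock starts when script execution begins, and the limit is checked only once every N back edges or calls).

```python
import time


class ExecutionTimeLimit:
    """Hypothetical watchdog modelled on the description above."""

    def __init__(self, limit_seconds, check_interval, clock=time.monotonic):
        self.limit_seconds = limit_seconds
        self.check_interval = check_interval  # the "N" from the text
        self.clock = clock                    # injectable so tests can fake time
        self.start = None
        self.ticks = 0

    def begin_execution(self):
        # The clock starts only when script execution begins, so time the
        # process spent swapped out or paused beforehand is never counted.
        self.start = self.clock()
        self.ticks = 0

    def tick(self):
        """Call on every loop back edge or function call.

        Returns True when the script should be terminated.
        """
        self.ticks += 1
        if self.ticks % self.check_interval != 0:
            return False  # cheap path: the clock is not read at all
        return self.clock() - self.start > self.limit_seconds
```

Because the clock is read only on every N-th tick, a script that resumes after a context switch keeps running until the next check rather than being stopped immediately.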

>
> Also, it is very important to note that content authors are not  
> entirely in control here.  A content author may have some ads on  
> their page, and it may be the ad that is delivering the bad unload  
> handler.  If we applied a limit to all unload handlers, then we'd be  
> punishing both the content author as well as the ad provider.  That  
> doesn't seem fair to the content author, who might have a legit  
> unload handler.

As long as the author installs their unload handlers before the ad  
does, they won't have a problem.
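
To make the ordering argument concrete, here is a small Python simulation of one shared time budget across all unload handlers. Everything in it is an assumption for illustration: the function and handler names are made up, the 0.5 second budget is arbitrary, handlers are assumed to fire in registration order, and the budget is only checked between handlers (a real limit might also interrupt a running handler).

```python
class FakeClock:
    """Simulated clock so the example is deterministic."""

    def __init__(self):
        self.now = 0.0

    def __call__(self):
        return self.now

    def sleep(self, seconds):
        # Stands in for a handler busy-looping for `seconds`.
        self.now += seconds


def run_unload_handlers(handlers, total_budget, clock):
    """Run (name, handler) pairs in registration order.

    Stops starting new handlers once the shared budget is spent.
    Returns the names of the handlers that ran.
    """
    start = clock()
    ran = []
    for name, handler in handlers:
        if clock() - start >= total_budget:
            break  # budget exhausted: remaining handlers are skipped
        handler()
        ran.append(name)
    return ran


clock = FakeClock()
# The author's quick handler is registered first, followed by eight
# 200 ms delay handlers of the kind described earlier in the thread.
handlers = [("author_save", lambda: clock.sleep(0.05))]
handlers += [(f"ad_delay_{i}", lambda: clock.sleep(0.2)) for i in range(8)]

ran = run_unload_handlers(handlers, total_budget=0.5, clock=clock)
print(ran)
```

In this simulation the author's handler always runs because it is first in line, while most of the delay handlers are cut off. Registering more handlers does not buy more time, since the budget is shared.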

To help us decide whether (and how) to tackle this for non-V8 ports of  
WebKit, can the Chrome team share the data they have on the following:

(1) Frequency of pages doing a busy loop in an unload handler. I've  
heard it's common, but I haven't seen specific data.
(2) A few examples of URLs to pages that do this, so we can study what  
they are doing and why.
(3) Frequency of a date-based loop being used to implement the busy  
loop.
(4) Average additional delay imposed by unload busy loops.
(5) Proportion of sites that use busy looping in unload solely for  
link tracking and not for any other purpose.

The reason I'm interested in (1)-(4) is to determine if doing nothing  
is really worse than doing something hackish, as suggested by Adam.

The reason I'm interested in (5) is to determine if <a ping> is an  
adequate replacement. I think if we break existing techniques, we need  
to give authors a replacement. unload fires when the user leaves the  
page in any way whatsoever, including closing the window or typing in  
the location field. So sites could use I/O in unload plus a busy loop  
to track the amount of time the user spent on the page, or to save  
state. If sites are doing that, then <a ping> won't be an adequate  
replacement, so we'll have to do something like Adam's suggestion to  
guarantee completion of I/O that is initiated in the unload handler.  
The reason I think it's possible sites care about more than just link  
tracking is that if that's all they care about, they could just use  
redirect links, and get a better user experience today than busy  
looping in unload. If sites are not using redirects for link tracking  
today, why would they use <a ping> in the future?

Regards,
Maciej
