[webkit-dev] Limiting slow unload handlers (Re: Back/forward cache for pages with unload handlers)

Wed Sep 16 23:03:38 PDT 2009

On Wed, Sep 16, 2009 at 10:57 PM, Maciej Stachowiak <mjs at apple.com> wrote:

>
> On Sep 16, 2009, at 10:33 PM, Darin Fisher wrote:
>
>
>
> On Wed, Sep 16, 2009 at 9:59 PM, Maciej Stachowiak <mjs at apple.com> wrote:
>
>>
>> On Sep 16, 2009, at 4:49 PM, Darin Fisher wrote:
>>
>>
>>
>> Counting work instead of time is much more robust.  The getTime call
>> count is a measure of work, albeit approximate.
>>
>>
>> The way JavaScriptCore execution time limit works is that the clock
>> doesn't start ticking until JS execution begins. So it's unlikely that a
>> full timeout cycle will occur while the process is swapped out or paused,
>> since the clock won't start running until the process is actually executing
>> JS. And the actual timeout check is only done occasionally (every N loop
>> back edges or function calls, for some value of N). So even if there's a
>> context switch in the middle of JS execution, it's unlikely that JS
>> processing will be terminated immediately upon return. So maybe a different
>> solution is appropriate for JavaScriptCore than V8.
>>
>>
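For readers following along, the time-limit scheme described above could be
sketched roughly as follows. This is only an illustration under assumptions:
the class name ExecutionTimeLimit, its methods and its parameters are
hypothetical and are not JavaScriptCore's actual code. The clock starts only
when script execution begins, and it is read only on every Nth tick.

```typescript
// Hypothetical sketch; names are illustrative, not JavaScriptCore's real API.
class ExecutionTimeLimit {
  private startMs: number | null = null; // unset until script execution begins
  private ticks = 0;
  private readonly limitMs: number;
  private readonly checkEvery: number; // the "N" in "every N back edges or calls"
  private readonly now: () => number;

  constructor(limitMs: number, checkEvery: number, now: () => number = () => Date.now()) {
    this.limitMs = limitMs;
    this.checkEvery = checkEvery;
    this.now = now;
  }

  // Called when JS execution actually begins; only now does the clock start.
  begin(): void {
    this.startMs = this.now();
    this.ticks = 0;
  }

  // Called on each loop back edge or function call.
  // Returns true when the script should be terminated.
  tick(): boolean {
    if (this.startMs === null) return false; // not executing yet
    this.ticks += 1;
    if (this.ticks % this.checkEvery !== 0) return false; // cheap path, no clock read
    return this.now() - this.startMs > this.limitMs;
  }
}
```

In this sketch, wall-clock time that passes before begin() is never charged
to the script, but any time that passes between begin() and the next check
is counted in full.
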
> Consider what happens if during JS execution garbage collection runs.  That
> could cause portions of the VM to be swapped into RAM, which could cause
> significant wall clock delay.  Do you discount time spent in GC?
>
>
> We don't exclude time spent in GC - slow is slow. But in practice we
> haven't seen the scenario you describe come up under similar circumstances.
>
>
This may be a bigger factor for Chrome since it is not uncommon for the
renderer associated with a background tab to be swapped out.  If you just
click the close button on a background tab, it is not uncommon for it to
require some expensive paging :-(
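
To make the work-counting idea concrete, here is a rough sketch of a
date-based busy loop and a limit expressed as a number of clock reads rather
than elapsed time. It rests on assumptions: ClockCallBudget, busyWait and the
budget value are hypothetical names chosen for illustration, and this is not
V8's or Chromium's actual implementation.

```typescript
// Hypothetical sketch; not V8's or Chromium's actual implementation.
class ClockCallBudget {
  private calls = 0;
  private readonly maxCalls: number;
  private readonly realNow: () => number;

  constructor(maxCalls: number, realNow: () => number = () => Date.now()) {
    this.maxCalls = maxCalls;
    this.realNow = realNow;
  }

  // Stand-in for Date.prototype.getTime while an unload handler runs.
  getTime(): number {
    this.calls += 1;
    if (this.calls > this.maxCalls) {
      throw new Error("unload handler exceeded its clock-call budget");
    }
    return this.realNow();
  }

  get callCount(): number {
    return this.calls;
  }
}

// The kind of date-based busy loop discussed in this thread:
// spin until a deadline has passed.
function busyWait(clock: { getTime(): number }, delayMs: number): void {
  const deadline = clock.getTime() + delayMs;
  while (clock.getTime() < deadline) {
    // spin
  }
}
```

Because the budget is counted in calls, the cutoff in this sketch is the
same whether or not the process was paged out for seconds in the middle of
the loop.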

>
>> To help us decide whether (and how) to tackle this for non-V8 ports of
>> WebKit, can the Chrome team share the data they have on the following:
>>
>> (1) Frequency of pages doing a busy loop in an unload handler. I've heard
>> it's common but no specific data.
>> (2) A few examples of URLs to pages that do this, so we can study what
>> they are doing and why.
>> (3) Frequency of a date-based loop being used to implement the busy loop.
>> (4) Average additional delay imposed by unload busy loops.
>> (5) Proportion of sites that use busy looping in unload solely for link
>> tracking and not for any other purpose.
>>
>>
> You can find links to example sites in the Chromium bug report:
> http://code.google.com/p/chromium/issues/detail?id=7823
>
> The bug contains some distilled data.
>
>
> I found a couple of URLs (which addresses #2) but I couldn't easily find
> the other data I asked about. Will I find it if I carefully read all 80
> comments on that bug, or should I assume it's not available?
>
>
I'm not sure if everything you are looking for is spelled out in the bug.
Comment #46 has details for one such advertiser.  John may have some more
distilled data.

>
> By the way, the issue is not with trouble sites but with trouble ad
> networks and/or producers.  I believe the web sites are just victims here.
>
>
>
>> The reason I'm interested in (1)-(4) is to determine if doing nothing is
>> really worse than doing something hackish, as suggested by Adam.
>>
>> The reason I'm interested in (5) is to determine if <a ping> is an
>> adequate replacement. I think if we break existing techniques, we need to
>> give authors a replacement. unload fires when the user leaves the page in
>> any way whatsoever, including closing the window or typing in the location
>> field. So sites could use I/O in unload plus a busy loop to track the amount
>> of time the user spent on the page, or to save state. If sites are doing
>> that, then <a ping> won't be an adequate replacement, so we'll have to do
>> something like Adam's suggestion to guarantee completion of I/O that is
>> initiated in the unload handler. The reason I think it's possible sites care
>> about more than just link tracking is that if that's all they care about,
>> they could just use redirect links, and get a better user experience today
>> than busy looping in unload. If sites are not using redirects for link
>> tracking today, why would they use <a ping> in the future?
>>
>>
> I don't think they are using it for critical data, because they have a
> timeout.  If they were trying to persist critical data then they would
> just use a synchronous XHR.  In this case, they are trying to increase
> the probability of successfully sending a ping by giving themselves a few
> hundred ms.
>
>
> I'm not saying it's necessarily critical data, just that I suspect they may
> want to detect when the user leaves the page for a reason other than a link,
> and therefore may not be satisfied with <a ping>. If they only care about
> link tracking, why don't they just convert links to redirects?
>
>
See my follow-up email to the one you are replying to.

-Darin