[webkit-dev] Limiting slow unload handlers (Re: Back/forward cache for pages with unload handlers)

Thu Sep 17 14:08:43 PDT 2009

On Wed, Sep 16, 2009 at 11:03 PM, Darin Fisher <darin at chromium.org> wrote:

> On Wed, Sep 16, 2009 at 10:57 PM, Maciej Stachowiak <mjs at apple.com> wrote:
>
>>
>> On Sep 16, 2009, at 10:33 PM, Darin Fisher wrote:
>>
>>
>>
>> On Wed, Sep 16, 2009 at 9:59 PM, Maciej Stachowiak <mjs at apple.com> wrote:
>>
>>>
>>> On Sep 16, 2009, at 4:49 PM, Darin Fisher wrote:
>>>
>>>
>>>
>>> Counting work instead of time is much more robust.  The getTime call
>>> count is a measure of work, albeit approximate.
>>>
>>>
>>> The way JavaScriptCore execution time limit works is that the clock
>>> doesn't start ticking until JS execution begins. So it's unlikely that a
>>> full timeout cycle will occur while the process is swapped out or paused,
>>> since the clock won't start running until the process is actually executing
>>> JS. And the actual timeout check is only done occasionally (every N loop
>>> back edges or function calls, for some value of N). So even if there's a
>>> context switch in the middle of JS execution, it's unlikely that JS
>>> processing will be terminated immediately upon return. So maybe a different
>>> solution is appropriate for JavaScriptCore than V8.
>>>
>>>
>> Consider what happens if during JS execution garbage collection runs.
>>  That could cause portions of the VM to be swapped into RAM, which could
>> cause significant wall clock delay.  Do you discount time spent in GC?
>>
>>
>> We don't exclude time spent in GC - slow is slow. But in practice we
>> haven't seen the scenario you describe come up under similar circumstances.
>>
>>
> This may be a bigger factor for Chrome since it is not uncommon for the
> renderer associated with a background tab to be swapped out.  If you just
> click the close button on a background tab, it is not uncommon for it to
> require some expensive paging :-(
>
>
>
>>
>> To help us decide whether (and how) to tackle this for non-V8 ports of
>>> WebKit, can the Chrome team share the data they have on the following:
>>>
>>> (1) Frequency of pages doing a busy loop in an unload handler. I've heard
>>> it's common but no specific data.
>>> (2) A few examples of URLs to pages that do this, so we can study what
>>> they are doing and why.
>>> (3) Frequency of a date-based loop being used to implement the busy loop.
>>> (4) Average additional delay imposed by unload busy loops.
>>> (5) Proportion of sites that use busy looping in unload solely for link
>>> tracking and not for any other purpose.
>>>
>>>
>> You can find links to example sites in the Chromium bug report:
>> http://code.google.com/p/chromium/issues/detail?id=7823
>>
>> The bug contains some distilled data.
>>
>>
>> I found a couple of URLs (which addresses #2) but I couldn't easily find
>> the other data I asked about. Will I find it if I carefully read all 80
>> comments on that bug, or should I assume it's not available?
>>
>>
> I'm not sure if everything you are looking for is spelled out in the bug.
>  Comment #46 has details for one such advertiser.  John may have some more
> distilled data.
>

I used Firebug/Web Inspector to look into all the unload handlers on the
slow sites that are mentioned in the bug.  I posted some of the JS sleep
code, but not for all sites.  The ads served might have changed in the
meantime, which affects whether this code is still served.
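
For anyone who has not seen the pattern, the date-based sleep code in these
handlers has roughly the shape below.  This is a minimal sketch, not code
copied from any site in the bug.  The names busyWait and Clock are
hypothetical, and the clock is a parameter only so the loop can be exercised
without really waiting.  The return value is the number of clock polls, i.e.
the kind of approximate work count discussed earlier in the thread.

```typescript
// Hypothetical clock type: returns the current time in milliseconds.
type Clock = () => number;

// Spin until delayMs has elapsed according to the clock.
// Returns how many times the clock was polled inside the loop.
function busyWait(
  delayMs: number,
  now: Clock = () => new Date().getTime()
): number {
  const start = now();
  let polls = 0;
  while (true) {
    polls++;
    // Each iteration does nothing except ask for the time again.
    if (now() - start >= delayMs) {
      break;
    }
  }
  return polls;
}

// Assumed usage inside a page (not run here): start a tracking request,
// then block navigation for a few hundred ms so it has a chance to go out.
//   window.onunload = () => { /* fire ping */ busyWait(300); };
```

With the real clock the poll count depends on how fast the machine is, which
is why it is only an approximate measure of work.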

>
>
>>
>> By the way, the issue is not with trouble sites but with trouble ad
>> networks and/or producers.  I believe the web sites are just victims here.
>>
>>
>>
>>> The reason I'm interested in (1)-(4) is to determine if doing nothing is
>>> really worse than doing something hackish, as suggested by Adam.
>>>
>>> The reason I'm interested in (5) is to determine if <a ping> is an
>>> adequate replacement. I think if we break existing techniques, we need to
>>> give authors a replacement. unload fires when the user leaves the page in
>>> any way whatsoever, including closing the window or typing in the location
>>> field. So sites could use I/O in unload plus a busy loop to track the amount
>>> of time the user spent on the page, or to save state. If sites are doing
>>> that, then <a ping> won't be an adequate replacement, so we'll have to do
>>> something like Adam's suggestion to guarantee completion of I/O that is
>>> initiated in the unload handler. The reason I think it's possible sites care
>>> about more than just link tracking is that if that's all they care about,
>>> they could just use redirect links, and get a better user experience today
>>> than busy looping in unload. If sites are not using redirects for link
>>> tracking today, why would they use <a ping> in the future?
>>>
>>>
>> The reason why I don't think they are using it for critical data is
>> because they have a timeout.  If they were trying to persist critical data
>> then they would just use a synchronous XHR.  In this case, they are trying
>> to increase the probability of successfully sending a ping by giving
>> themselves a few hundred ms.
>>
>>
>> I'm not saying it's necessarily critical data, just that I suspect they
>> may want to detect when the user leaves the page for a reason other than a
>> link, and therefore may not be satisfied with <a ping>. If they only care
>> about link tracking, why don't they just convert links to redirects?
>>
>>
> See my follow-up email to the one you are replying to.
>
> -Darin
>