[webkit-dev] Limiting slow unload handlers (Re: Back/forward cache for pages with unload handlers)
John Abd-El-Malek
jam at google.com
Thu Sep 17 14:11:54 PDT 2009
On Thu, Sep 17, 2009 at 2:09 PM, Darin Fisher <darin at chromium.org> wrote:
>
>
> On Thu, Sep 17, 2009 at 12:52 PM, Maciej Stachowiak <mjs at apple.com> wrote:
>
>>
>> On Sep 17, 2009, at 12:14 PM, Darin Fisher wrote:
>>
>>
>>
>> On Thu, Sep 17, 2009 at 2:28 AM, Maciej Stachowiak <mjs at apple.com> wrote:
>>
>>>
>>> On Sep 17, 2009, at 12:35 AM, Darin Fisher wrote:
>>>
>>>
>>>>
>>>> For <a ping> to be used as I suggested, you would need to set the href
>>>> to a javascript URL such as javascript:void(), so that it would not
>>>> interfere with an existing navigation.
>>>>
>>>
>>> Navigating to a javascript: URL will still cancel an existing
>>> pending navigation, both per spec and in at least some browser engines
>>> (including WebKit). For spec reference see step 5 here: <
>>> http://dev.w3.org/html5/spec/Overview.html#navigate>. Note lack of
>>> exception for "javascript:" URLs.
>>
>>
>> nit: WebKit and most browsers that I tested _only_ do so if the
>> javascript: URL evaluates to a string. Take a look at
>> FrameLoader::executeIfJavaScriptURL(). I can also share my test case with
>> you if you are still doubting! ;-)
>>
>>
>> My test case did this from an onclick handler, and the navigation to
>> yahoo.com did not happen:
>>
>> location.href="http://yahoo.com/";
>> location.href="javascript:void()";
>>
>>
> Yes, in that case the scheduled location change from the first href
> assignment is cancelled by the second.
>
> My test case was to dispatch a click event to an anchor tag during the
> unload handler. Set the anchor's href to javascript:void() and notice that
> it does not interrupt the existing navigation.
>
>
>
>
>> It may be that other ways of navigating to a javascript: URL don't stop a
>> pending load, but that would be a bug.
>>
>
> I'm not sure. It seems like each browser goes out of its way to only
> cancel the existing navigation when the javascript: URL results in a string.
> Coincidence, really??
>
>
>
>>
>> Based on your comments below, I think the expedient thing to do is to let
>> Image loads (only) complete their I/O when initiated from unload or
>> pagehide.
>>
>
> Why exclude beforeunload? Some of the sites we found use the busy loop
> hack in beforeunload.
>
These sites presumably hook beforeunload as well so they can split the
sleep calls across as many handlers as possible and stay under hung-script
detectors. If they rewrite their pages to use one clean method, they should
only need to do it in one place.
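
A rough sketch of what "one clean method" might look like, assuming Image
loads started from unload are allowed to complete as proposed above. The
names sendExitPing and installExitPing, the injected ImageCtor parameter and
the example URL are made up for illustration; only Image and
addEventListener are real APIs, and this only helps once the browser stops
cancelling such loads.

```javascript
// Hypothetical helper: fire a single image request as an exit ping.
// ImageCtor is injected so the sketch can run outside a browser; in a
// page it would simply be window.Image.
function sendExitPing(url, ImageCtor) {
  var img = new ImageCtor();
  // Cache-busting query parameter so the request is not served from cache.
  var sep = url.indexOf("?") === -1 ? "?" : "&";
  img.src = url + sep + "t=" + new Date().getTime();
  return img;
}

// Register the ping in exactly one handler instead of spreading busy
// loops across beforeunload, unload and pagehide.
function installExitPing(target, url, ImageCtor) {
  var sent = false;
  target.addEventListener("unload", function () {
    if (sent) return; // send at most once per page
    sent = true;
    sendExitPing(url, ImageCtor);
  });
}
```

In a page this would be called once, e.g.
installExitPing(window, "http://example.com/exit", Image), with no sleep
loop anywhere.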
>
> -Darin
>
>
>
>>
>> - Maciej
>>
>>
>>
>>
>>>
>>>
>>>> I think it would also be necessary for the ping to be sent during the
>>>> default event handler instead of when the href is normally processed.
>>>>
>>>
>>> I'm not sure what you're suggesting, but link clicks are currently
>>> processed in the default event handler for HTMLAnchorElement.
>>
>>
>>
>> Sorry, I was thinking of the Mozilla implementation :-/
>>
>>
>>
>>>
>>>
>>>> As for guaranteeing completion of I/O started during unload, I am fairly
>>>> concerned about the code complexity that would result from allowing some
>>>> resource loads to go uncanceled at page navigation time. Pings are
>>>> different since they are fire and forget, but other resource loads, which
>>>> normally get read and processed by the frame / loader system are a bit
>>>> problematic to keep active. We'd want to put them in a special mode where
>>>> they don't get canceled but also don't get processed when their responses
>>>> arrive. There may be a clean way to do this, but I am concerned about the
>>>> potential maintenance cost due to the extra mode for resource loading.
>>>>
>>>
>>>
>>> Yes, it would be somewhat tricky, but the proposed <a ping> solution
>>> simply doesn't work. Even if there was a way to make it work, it would be
>>> pretty hard to use correctly, and not an intuitive choice.
>>>
>>
>> I agree that it is tricky, but that doesn't sound like much of a barrier
>> to the folks who would be using it. I think people are desperate for
>> anything that works, and given that nothing really works today, they end up
>> with suboptimal hacks (e.g., busy looping).
>>
>>
>>
>>>
>>> Here's what would need to happen to let loads from unload run to
>>> completion:
>>>
>>> 1) Implement a way to track all entities that start loads during unload
>>> (all owners of ResourceHandles, say).
>>> 2) Add a way (perhaps via an abstract base class) for all such entities
>>> to release their ResourceHandle to another owner and then cancel themselves.
>>> 3) Add code after unload finishes to create an object that takes
>>> ownership of all these ResourceHandles, and stays alive until they all
>>> complete their loads (dropping results on the floor).
>>>
>>> I don't think this would be much harder than it would be to implement <a
>>> ping>. The only hard part really is #1, particularly making sure we've
>>> caught all possible kinds of resource loads.
>>>
>>
>> This does sound workable. It might even be best to only start out
>> supporting Image requests. That should make it easier to be confident in
>> the solution. I worry about supporting XHR since CORS makes for some
>> complicated back-n-forth: not just a single ResourceHandle instance.
>>
>>
>>
>>>
>>> Yet another possibility: we could introduce a Ping object or sendPing()
>>> method that performs a ping without the need to follow a link, and
>>> guarantees the I/O won't be cancelled due to navigation. That would localize
>>> the need for added complexity. The downside is we'd have to either get it
>>> into standards or do something WebKit-specific.
>>>
>>> A third possibility: limit unload persistence to Image and
>>> XMLHttpRequest, to limit the complexity. That's assuming sites are not using
>>> scripts, stylesheets or frames to do the exit ping - I don't know if that's
>>> a good assumption.
>>
>>
>> I think it is a good assumption. Image is used today because of an IE
>> *quirk* that causes it to not be cancelled upon page navigation. Other ways
>> of fetching subresources do not appear to be subject to that behavior.
>> However, I admit that I haven't personally studied IE enough to be certain.
>>
>> At any rate, I think I'm persuaded by your arguments. Adam made a similar
>> argument the other day too. It seems reasonable to make Image behave this
>> way when created in certain contexts. We should probably include
>> beforeunload and pagehide handlers since we have evidence that people do the
>> same thing in beforeunload that they do in unload. I include pagehide given
>> the recent change to WebKit's page cache insertion policy. We might even
>> want to allow it in anchor tag click handlers for completeness.
>>
>> We should probably still consider an <a ping> implementation, but I agree
>> that it doesn't have to be the only tool that we provide.
>>
>>
>> I think <a ping> is a good way to do link tracking. It just doesn't
>>
>>
>
> _______________________________________________
> webkit-dev mailing list
> webkit-dev at lists.webkit.org
> http://lists.webkit.org/mailman/listinfo.cgi/webkit-dev
>
>
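
For reference, the three steps quoted above (track the owners, let them
release their ResourceHandle, hand everything to an object that outlives the
page) can be modelled in a few lines. This is a toy model in JavaScript, not
WebKit code: UnloadLoadTracker, OrphanedLoadKeeper, releaseHandle, onData and
onFinish are all hypothetical names, and "handle" merely stands in for a
ResourceHandle.

```javascript
// Step 3: an object that adopts handles and stays alive until they all
// complete, dropping any results on the floor.
function OrphanedLoadKeeper() {
  this.pending = [];
}
OrphanedLoadKeeper.prototype.adopt = function (handle) {
  var self = this;
  this.pending.push(handle);
  // Replace the client callbacks: data is ignored, completion is noted.
  handle.onData = function () {};
  handle.onFinish = function () {
    var i = self.pending.indexOf(handle);
    if (i !== -1) self.pending.splice(i, 1);
  };
};
OrphanedLoadKeeper.prototype.isDone = function () {
  return this.pending.length === 0;
};

// Step 1: track every owner that starts a load while unload is running.
function UnloadLoadTracker() {
  this.inUnload = false;
  this.owners = [];
}
UnloadLoadTracker.prototype.didStartLoad = function (owner) {
  if (this.inUnload) this.owners.push(owner);
};
// Step 2 happens inside owner.releaseHandle(): the owner gives up its
// handle and cancels itself. The keeper then takes ownership (step 3).
UnloadLoadTracker.prototype.finishUnload = function () {
  var keeper = new OrphanedLoadKeeper();
  this.owners.forEach(function (owner) {
    keeper.adopt(owner.releaseHandle());
  });
  this.owners = [];
  this.inUnload = false;
  return keeper;
};
```

The hard part noted in the quoted text, making sure every kind of load calls
something like didStartLoad, is exactly what this model glosses over.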