[webkit-dev] Limiting slow unload handlers (Re: Back/forward cache for pages with unload handlers)

Darin Fisher darin at chromium.org
Thu Sep 17 14:09:29 PDT 2009


On Thu, Sep 17, 2009 at 12:52 PM, Maciej Stachowiak <mjs at apple.com> wrote:

>
> On Sep 17, 2009, at 12:14 PM, Darin Fisher wrote:
>
>
>
> On Thu, Sep 17, 2009 at 2:28 AM, Maciej Stachowiak <mjs at apple.com> wrote:
>
>>
>> On Sep 17, 2009, at 12:35 AM, Darin Fisher wrote:
>>
>>
>>>
>>> For <a ping> to be used as I suggested, you would need to set the href to
>>> a javascript URL such as javascript:void(), so that it would not interfere
>>> with an existing navigation.
>>>
>>
>> Navigating to a javascript: URL will still cancel an existing pending
>> navigation, both per spec and in at least some browser engines (including
>> WebKit). For spec reference see step 5 here: <
>> http://dev.w3.org/html5/spec/Overview.html#navigate>. Note lack of
>> exception for "javascript:" URLs.
>
>
> nit: WebKit and most browsers that I tested _only_ do so if the javascript:
> URL evaluates to a string. Take a look at
> FrameLoader::executeIfJavaScriptURL().  I can also share my test case with
> you if you are still doubting! ;-)
>
>
> My test case did this from an onclick handler, and the navigation to
> yahoo.com did not happen:
>
>      location.href="http://yahoo.com/";
>      location.href="javascript:void()";
>
>
Yes, in that case the scheduled location change from the first href
assignment is cancelled by the second.

My test case was to dispatch a click event to an anchor tag during the
unload handler.  Set the anchor's href to javascript:void() and notice that
it does not interrupt the existing navigation.
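
A minimal sketch of the test case described above, for anyone who wants to
reproduce it. The helper name installUnloadClickTest and its win/doc
parameters are hypothetical (pass window and document in a browser), and the
href is written as javascript:void(0) so that the URL is valid script that
evaluates to undefined rather than a string.

```javascript
// Hypothetical helper, not taken from the original test case.
// Adds an anchor whose href is a javascript: URL that does not evaluate
// to a string, then dispatches a click to it from the unload handler.
function installUnloadClickTest(win, doc) {
  const anchor = doc.createElement("a");
  anchor.href = "javascript:void(0)";
  doc.body.appendChild(anchor);

  win.addEventListener("unload", function () {
    // Synthesize a bubbling, cancelable click and send it to the anchor.
    const evt = doc.createEvent("MouseEvents");
    evt.initEvent("click", true, true);
    anchor.dispatchEvent(evt);
  });

  return anchor;
}
```

To try it, call installUnloadClickTest(window, document) on a test page, then
navigate away and check whether the pending navigation still completes.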




> It may be that other ways of navigating to a javascript: URL don't stop a
> pending load, but that would be a bug.
>

I'm not sure.  It seems like each browser goes out of its way to only cancel
the existing navigation when the javascript: URL results in a string.
Coincidence, really??



>
> Based on your comments below, I think the expedient thing to do is to let
> Image loads (only) complete their I/O when initiated from unload or
> pagehide.
>

Why exclude beforeunload?  Some of the sites we found use the busy loop hack
in beforeunload.

-Darin



>
>  - Maciej
>
>
>
>
>>
>>
>>> I think it would also be necessary for the ping to be sent during the
>>> default event handler instead of when the href is normally processed.
>>>
>>
>> I'm not sure what you're suggesting, but link clicks are currently
>> processed in the default event handler for HTMLAnchorElement.
>
>
>
> Sorry, I was thinking of the Mozilla implementation :-/
>
>
>
>>
>>
>>> As for guaranteeing completion of I/O started during unload, I am fairly
>>> concerned about the code complexity that would result from allowing some
>>> resource loads to go uncanceled at page navigation time.  Pings are
>>> different since they are fire and forget, but other resource loads, which
>>> normally get read and processed by the frame / loader system are a bit
>>> problematic to keep active.  We'd want to put them in a special mode where
>>> they don't get canceled but also don't get processed when their responses
>>> arrive.  There may be a clean way to do this, but I am concerned about the
>>> potential maintenance cost due to the extra mode for resource loading.
>>>
>>
>>
>> Yes, it would be somewhat tricky, but the proposed <a ping> solution
>> simply doesn't work. Even if there was a way to make it work, it would be
>> pretty hard to use correctly, and not an intuitive choice.
>>
>
> I agree that it is tricky, but that doesn't sound like much of a barrier to
> the folks who would be using it.  I think people are desperate for anything
> that works, and given that nothing really works today, they end up with
> suboptimal hacks (e.g., busy looping).
>
>
>
>>
>> Here's what would need to happen to let loads from unload run to
>> completion:
>>
>> 1) Implement a way to track all entities that start loads during unload
>> (all owners of ResourceHandles, say).
>> 2) Add a way (perhaps via an abstract base class) for all such entities to
>> release their ResourceHandle to another owner and then cancel themselves.
>> 3) Add code after unload finishes to create an object that takes ownership
>> of all these ResourceHandles, and stays alive until they all complete their
>> loads (dropping results on the floor).
>>
>> I don't think this would be much harder than it would be to implement <a
>> ping>. The only hard part really is #1, particularly making sure we've
>> caught all possible kinds of resource loads.
>>
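
The three steps listed above can be modelled in a few lines. This is only a
toy sketch in JavaScript, not WebKit code: FakeHandle stands in for a
ResourceHandle, and ImageLoader, UnloadTracker and Keeper are hypothetical
names invented for illustration.

```javascript
// Hypothetical stand-in for a ResourceHandle: an in-flight load with one client.
class FakeHandle {
  constructor(url) {
    this.url = url;
    this.client = null; // whoever is notified when the load completes
    this.done = false;
  }
  // Simulate the network finishing; notify the current client, if any.
  finish(data) {
    this.done = true;
    if (this.client) this.client.didFinish(this, data);
  }
}

// Step 2: an owner that can release its handle and then cancel itself.
class ImageLoader {
  constructor(url) {
    this.handle = new FakeHandle(url);
    this.handle.client = this;
    this.result = null;
    this.cancelled = false;
  }
  didFinish(handle, data) { this.result = data; }
  releaseHandle() {
    const h = this.handle;
    h.client = null;
    this.handle = null;
    this.cancelled = true;
    return h;
  }
}

// Step 3: keeps the handles alive until all complete, dropping the results.
class Keeper {
  constructor(handles) {
    this.pending = new Set(handles);
    for (const h of handles) h.client = this;
  }
  didFinish(handle, _data) { this.pending.delete(handle); }
  get isIdle() { return this.pending.size === 0; }
}

// Step 1: record every owner that starts a load while unload is dispatched.
class UnloadTracker {
  constructor() { this.inUnload = false; this.owners = []; }
  startLoad(url) {
    const owner = new ImageLoader(url);
    if (this.inUnload) this.owners.push(owner);
    return owner;
  }
  // Called after unload finishes: hand all tracked handles to a Keeper.
  finishUnload() {
    this.inUnload = false;
    const keeper = new Keeper(this.owners.map((o) => o.releaseHandle()));
    this.owners = [];
    return keeper;
  }
}
```

In this model the original owner never sees the response: once the Keeper has
taken over, completing the load only removes the handle from its pending set.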
>
> This does sound workable.  It might even be best to only start out
> supporting Image requests.  That should make it easier to be confident in
> the solution.  I worry about supporting XHR since CORS makes for some
> complicated back-n-forth: not just a single ResourceHandle instance.
>
>
>
>>
>> Yet another possibility: we could introduce a Ping object or sendPing()
>> method that performs a ping without the need to follow a link, and
>> guarantees the I/O won't be cancelled due to navigation. That would localize
>> the need for added complexity. The downside is we'd have to either get it
>> into standards or do something WebKit-specific.
>>
>> A third possibility: limit unload persistence to Image and XMLHttpRequest,
>> to limit the complexity. That's assuming sites are not using scripts,
>> stylesheets or frames to do the exit ping - I don't know if that's a good
>> assumption.
>
>
> I think it is a good assumption.  Image is used today because of an IE
> *quirk* that causes it to not be cancelled upon page navigation.  Other ways
> of fetching subresources do not appear to be subject to that behavior.
>  However, I admit that I haven't personally studied IE enough to be certain.
>
> At any rate, I think I'm persuaded by your arguments.  Adam made a similar
> argument the other day too.  It seems reasonable to make Image behave this
> way when created in certain contexts.  We should probably include
> beforeunload and pagehide handlers since we have evidence that people do the
> same thing in beforeunload that they do in unload.  I include pagehide given
> the recent change to WebKit's page cache insertion policy.  We might even
> want to allow it in anchor tag click handlers for completeness.
>
> We should probably still consider an <a ping> implementation, but I agree
> that it doesn't have to be the only tool that we provide.
>
>
> I think <a ping> is a good way to do link tracking. It just doesn't
>
>