[webkit-dev] beforeload & link (esp rel prefetch)

Thu Jan 20 17:52:35 PST 2011

Folks,

I want to thank everyone who contributed to this thread on beforeload,
the link element, and the HTTP link header.  The consensus was that it
was a mistake to add rel=prefetch without beforeload handlers, and
that it's a mistake to permit link headers to launch resource requests
before an extension or a page has had an opportunity to attach a
beforeload handler.  I think we can implement this feature and address
those concerns.  I also had a chance last week to hear from a
developer who'd tried to make an SSL Everywhere extension for
Safari[1].  His experience was that more beforeload events would have
been helpful and he created Bug 52577.  Alexey also drew our
attention to the Safari extension Incognito (distinct from the Chrome
Incognito feature), which uses event capture on the beforeload event
to prevent cross-site loads to user-tracking sites (Google, Facebook,
ВКонтакте).  However, due to bug 52581, Safari actually launches the
requests that capturing handlers try to squash, which sort of
undermines everything Incognito is trying to do[2].
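
For anyone who hasn't seen the pattern, here is a minimal sketch of a
capture-phase beforeload blocker of the kind described above.  It is
not Incognito's actual code: the helper names (hostOf, isBlocked,
installBlocker), the two small interfaces and the blocked-domain list
are assumptions made up for illustration.  The parts taken from WebKit
are that a beforeload event exposes the URL about to be loaded and
that preventDefault() is meant to cancel that load.

```typescript
// Minimal shapes so the sketch does not depend on DOM typings
// (hypothetical, not WebKit's IDL).
interface BeforeLoadLike {
  url: string;
  preventDefault(): void;
}
interface CaptureTarget {
  addEventListener(
    type: string,
    listener: (event: BeforeLoadLike) => void,
    useCapture: boolean
  ): void;
}

// Extract the host from an absolute URL; returns "" for relative URLs.
// Simplification: userinfo ("user:pass@host") is not handled.
function hostOf(url: string): string {
  const match = /^[a-z][a-z0-9+.-]*:\/\/([^\/:?#]+)/i.exec(url);
  const host = match ? match[1] : undefined;
  return host ? host.toLowerCase() : "";
}

// True when host equals a blocked domain or is a subdomain of one.
function isBlocked(host: string, blockedDomains: string[]): boolean {
  return blockedDomains.some((domain) => {
    const suffix = "." + domain;
    return host === domain || host.slice(-suffix.length) === suffix;
  });
}

// Register a capturing beforeload listener that cancels cross-site
// loads to any domain on the (assumed) block list.
function installBlocker(
  target: CaptureTarget,
  pageHost: string,
  blockedDomains: string[]
): void {
  target.addEventListener(
    "beforeload",
    (event: BeforeLoadLike): void => {
      const host = hostOf(event.url);
      if (host !== "" && host !== pageHost && isBlocked(host, blockedDomains)) {
        event.preventDefault();
      }
    },
    true // capture phase: runs before any target or bubble listeners
  );
}
```

In a real extension the target would be the document and pageHost
would come from the page's location.  The point of registering in the
capture phase is that the handler sees every load in the document
without needing a reference to the element that triggered it.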

There were also compelling arguments made in favour of
continuing experimentation with the Link header and prefetching.  The
Link header has the potential to make the web faster; it enables
automated optimizers which can use link headers to provide server
hints, without having to change the HTML of a document being served.
I can imagine other uses too: a caching proxy can follow link
prefetch/subresource instructions, and sites with complex serving
infrastructure can have their front end servers insert link headers
while dynamic pages are being constructed in their backends.  Without
continued experimentation, it's hard to know what difference this
feature can make.

I want to find a way forward, then, so we can learn what the
performance benefit can be.  I started this thread by asking five
questions; I'll repeat them here with what I now think the right
answers are:

1. Should HTML Link rel=prefetch have beforeload events?
    Yes.
2. How about rel=icon and rel=dns-prefetch ?
    Yes for icon, not clear for dns-prefetch.
3. If the answer to (1) is yes, then should HTTP Link have events?  Really?
    Yes, at least for capture (which is what extensions use);
    otherwise there's a way around blocking extensions.
4. Should HTML Link permit rel=subresource?
    Sure.
5. If the answer to (4) is yes, should HTML Link rel=subresource have
beforeload events?
    Yes, same reasons as all the other yeses.

I have also thought about how we can go forward; I'd like folks'
comments on this:

Step 1: Land bug 51941, a refactoring of the HTMLLinkPrefetch element
which pulls out loading for rel types prefetch, icon and dns-prefetch.
Step 2: Add beforeload to at least prefetch & icon rel types, and hey,
why not dns-prefetch too!  Do this to fix bug 52577.
Step 3: Add rel type subresource (same as rel type prefetch, only
higher priority for in-page resources) (need to create a bug for
this).
Step 4: Add Link header, providing rel types subresource, prefetch &
dns-prefetch only (currently bug 51940).
Step 5: Add beforeload events to the Link header (as a followup after
bug 51940).
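
To make Step 4 concrete, here is a rough sketch of reducing a Link
header value to the rel types listed in that step.  It is a simplified
illustration, not WebKit code and not a complete RFC 5988 parser: it
assumes no commas or semicolons appear inside URLs or quoted parameter
values.  The names LinkHint, ALLOWED_RELS and parseLinkHeader are
hypothetical.

```typescript
interface LinkHint {
  url: string;
  rel: string;
}

// The rel types Step 4 proposes to honour in the Link header.
const ALLOWED_RELS: string[] = ["subresource", "prefetch", "dns-prefetch"];

// Parse a Link header value such as
//   </style.css>; rel=subresource, </next.html>; rel="prefetch"
// and keep only the entries whose rel type is allowed.
function parseLinkHeader(value: string): LinkHint[] {
  const hints: LinkHint[] = [];
  value.split(",").forEach((entry) => {
    const parts = entry.split(";");
    const urlMatch = /^\s*<([^>]*)>\s*$/.exec(parts[0] || "");
    if (!urlMatch) return; // not a well-formed <url> reference
    const url = urlMatch[1] || "";
    parts.slice(1).forEach((param) => {
      const relMatch = /^\s*rel\s*=\s*"?([^"]*)"?\s*$/i.exec(param);
      if (!relMatch) return; // some other parameter, e.g. type or title
      // A rel value may hold several space-separated relation types.
      (relMatch[1] || "").split(/\s+/).forEach((rel) => {
        const lower = rel.toLowerCase();
        if (ALLOWED_RELS.indexOf(lower) !== -1) {
          hints.push({ url: url, rel: lower });
        }
      });
    });
  });
  return hints;
}
```

Anything with an unrecognised rel type (rel=alternate, rel=stylesheet
and so on) is simply dropped, which keeps the header limited to
cache-warming hints.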

Maciej asked a bit about Step 5: my thinking right now is that we only
care about capture (of the three event phases: capture, target &
bubbling).  The two best use cases I've found for beforeload in
privacy-enhancing situations were SSL Everywhere & the Safari
Incognito extension.  Both of them use the event capture interface and
never rely on a target for beforeload events.  So if we defer launching
Link header loads until epsilon after document start (only milliseconds
away), then there will be a document on which capturing handlers can be
registered.  The events themselves need not have a target, since every
use case we've considered uses the capture interface only, and of
course a target doesn't make sense if the header is not to be in the
DOM.
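
The deferral idea can be sketched roughly as follows.  This only
models the proposal; it is not how WebKit's loader is structured, and
every name in it (HeaderLoad, CaptureListener, DeferredLinkLoader,
documentStarted) is hypothetical.  A listener returning false stands
in for a capturing beforeload handler calling preventDefault().

```typescript
interface HeaderLoad {
  url: string;
  rel: string;
}

// Returns false to cancel the load (stand-in for preventDefault()).
type CaptureListener = (load: HeaderLoad) => boolean;

class DeferredLinkLoader {
  private pending: HeaderLoad[] = [];
  private listeners: CaptureListener[] = [];

  // Called while response headers are parsed: nothing is fetched yet.
  queue(load: HeaderLoad): void {
    this.pending.push(load);
  }

  // Extensions or pages register here before deferred loads launch.
  addCaptureListener(listener: CaptureListener): void {
    this.listeners.push(listener);
  }

  // Called "epsilon after document start".  Each queued load is offered
  // to the capture listeners; there is no target element involved.
  // Returns the URLs that would actually be requested.
  documentStarted(): string[] {
    const launched: string[] = [];
    this.pending.forEach((load) => {
      // every() stops at the first listener that cancels the load.
      const allowed = this.listeners.every((listener) => listener(load));
      if (allowed) {
        launched.push(load.url);
      }
    });
    this.pending = [];
    return launched;
  }
}
```

The ordering is the whole point: a load queued from the header before
any listener exists is still seen by a listener that registers before
documentStarted() runs, so a blocking extension gets its chance to
cancel it.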

I hope that this addresses the concerns raised: we're respecting user
privacy, and separating our layers as much as possible; the only rel
types exposed in the Link header will be prefetch and subresource,
which are in a sense network layer since they speak only to cache
warming.  Yet, they will still be capturable by extensions.  This does
mean that we probably won't initially support the Link header on
non-HTML documents, which was another question Alexey raised in the
thread.

- Gavin

[1] See http://www.nearinfinity.com/blogs/jeff_kunkle/lessons_learned_building_an_ht.html
for a good discussion from Jeff Kunkle about using beforeload.

[2] I just installed the Incognito extension in a local Safari, and
sniffed its traffic.  I can confirm that requests to blocked domains
go straight through, with cookies and everything.  The only place you
won't see them is in the document, the DOM & the Safari resource
loading view.  So Incognito doesn't really stop tracking right now!

On 14 January 2011 20:23, Gavin Peters (蓋文彼德斯) <gavinp at chromium.org> wrote:
>
> Thanks for your message, Maciej!
>
> On 14 January 2011 13:53, Maciej Stachowiak <mjs at apple.com> wrote:
>>
>> I agree that beforeload support could be more pervasive than it is today.  The exclusion of prefetch, icon and dns-prefetch from beforeload events bears revisiting.  But are these insurmountable?  Currently the bug up for review, bug 51941 doesn't remove beforeload handling from anything that had it.  The semantics of beforeload handlers for link headers wrt extensions bear some thought, but I suspect these can be solved: can we create another bug for adding this suppo
>>
>> It's not obvious how it could work, since a load triggered by a Link header has no associated element, and in fact starts before any elements exist. So there is nothing that can act as the event target. If you think it can be done, how about a proposal?
>
> Certainly immediately, the first thing that comes to mind is that we continue, as WebKit has for as long as it's had the rel types dns-prefetch, prefetch & icon, to not issue beforeload events on these elements.  That's the behaviour now, and it's been acceptable to date.  Are you convinced it was a mistake to omit these rel types from beforeload now?  That question seems independent of the Link header that's also been discussed in this thread, but if I'm wrong I'm totally open to hearing why it's a blocker now.
> If we decide to issue them, particularly on the header, I concur that it's tricky.  I don't have a great proposal now for handling beforeload in link headers, and I'm not sure the ideas I do have are developed enough to really share, I suspect they're all both obvious and naive given the problem.  But, counter balancing this problem, I'd like to continue to experiment with this feature and learn what benefits it has to offer.  What's the way forward on that?
> - Gavin
>>
>>
>> On 13 January 2011 12:48, Alexey Proskuryakov <ap at webkit.org> wrote:
>>>
>>> On 13.01.2011, at 09:14, Julian Reschke wrote:
>>>
>>> >> I'm wondering what the use cases are. To me, adding a way for metadata to change document behavior sounds like a misfeature - it adds significant complexity to the system, while taking away existing functionality. As a specific example discussed in this thread, we'd break some browser extensions like Incognito if resource loading could bypass onbeforeload. As another example, we'd break the <base> element.
>>> >
>>> > Well, there are document types where you *can't* inline the metadata.
>>>
>>>
>>> Indeed, and I don't have anything against metadata as long as it doesn't directly modify actual data. For example, Last-Modified and Cache-Control are quite obvious examples of what can be in HTTP headers. Despite the practical/historical difficulties that I mentioned with Content-Type, it's arguably a valid example of metadata, too.
>>>
>>> Subresource references on the other hand are a part of a document, not of its metadata. Am I just missing a reason why one would want to prefetch subresources for a JPEG image?
>>>
>>> > We should distinguish between the act of declaring the link, and the moment where a potential fetch actually happens (it doesn't always happen, after all).
>>> >
>>> > I agree that stuffing things just to get a fetch to happen "earlier" may be a premature optimization.
>>>
>>>
>>> Optimizing prefetch to start before actual document data arrives is highly controversial, but I believe that it's the primary reason why we're considering the Link header implementation.
>>>
>>> - WBR, Alexey Proskuryakov
>>>
>>> _______________________________________________
>>> webkit-dev mailing list
>>> webkit-dev at lists.webkit.org
>>> http://lists.webkit.org/mailman/listinfo.cgi/webkit-dev