[Webkit-unassigned] [Bug 203058] New: HEAD requests are not cached

bugzilla-daemon at webkit.org
Wed Oct 16 15:31:34 PDT 2019


https://bugs.webkit.org/show_bug.cgi?id=203058

            Bug ID: 203058
           Summary: HEAD requests are not cached
           Product: WebKit
           Version: WebKit Nightly Build
          Hardware: All
                OS: All
            Status: NEW
          Severity: Normal
          Priority: P2
         Component: Page Loading
          Assignee: webkit-unassigned at lists.webkit.org
          Reporter: nham at apple.com
                CC: beidson at apple.com

Created attachment 381123

  --> https://bugs.webkit.org/attachment.cgi?id=381123&action=review

test case for GET followed by HEAD

The warm Amazon page load on PLT5 shows a number of static resources that aren't cached. This is due to a GET followed by a HEAD for the same resource:

1. The main page has an <img src="foo.jpg"> which causes a GET for foo.jpg.
2. Some time later, the page issues a HEAD request for foo.jpg via an XHR (it extracts Content-Length from the response headers to power some other logic).

I've attached a reduced test case that makes this request pattern. (Note that Amazon's home page even as of today shows this request pattern, so it's not entirely theoretical.)

This request pattern causes misses in both our memory and disk caches.

# Memory Cache

In the memory cache, the GET has a type of ImageResource, while the HEAD has a type of RawResource. The mismatched types cause the memory cache to always reload the resource, as shown in `CachedResourceLoader::determineRevalidationPolicy`:

```
// If the same URL has been loaded as a different type, we need to reload.
if (existingResource->type() != type) {
    LOG(ResourceLoading, "CachedResourceLoader::determineRevalidationPolicy reloading due to type mismatch.");
    logMemoryCacheResourceRequest(frame(), DiagnosticLoggingKeys::inMemoryCacheKey(), DiagnosticLoggingKeys::unusedReasonTypeMismatchKey());
    return Reload;
}
```

# Disk Cache

In the disk cache, the first GET is cached to disk as expected. However, `makeStoreDecision` decides to not store HEAD responses:

```
if (originalRequest.httpMethod() != "GET")
    return StoreDecision::NoDueToHTTPMethod;
```

In response, `Cache::store` deletes the existing GET response from the disk cache:

```
StoreDecision storeDecision = makeStoreDecision(request, response, responseData ? responseData->size() : 0);
if (storeDecision != StoreDecision::Yes) {
    LOG(NetworkCache, "(NetworkProcess) didn't store, storeDecision=%d", static_cast<int>(storeDecision));
    auto key = makeCacheKey(request);

    auto isSuccessfulRevalidation = response.httpStatusCode() == 304;
    if (!isSuccessfulRevalidation) {
        // Make sure we don't keep a stale entry in the cache.
        remove(key);
    }

    return nullptr;
}
```

As a result, once the page load completes, nothing at all is cached for foo.jpg. On reload, both the GET and the HEAD requests for foo.jpg go to the network again.
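
The sequence above can be reproduced with a small standalone model. This is only a sketch, not WebKit code: `ToyDiskCache` and its members are hypothetical names, the cache key is assumed to be the URL alone, and only the store/remove control flow from the snippets above is mirrored.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for the real StoreDecision enum (illustration only).
enum class StoreDecision { Yes, NoDueToHTTPMethod };

// Toy model of the disk cache. Assumption: entries are keyed by URL only.
struct ToyDiskCache {
    std::map<std::string, std::string> entries; // URL -> cached body

    static StoreDecision makeStoreDecision(const std::string& method)
    {
        // Same rule as the quoted check: anything other than GET is not stored.
        return method == "GET" ? StoreDecision::Yes : StoreDecision::NoDueToHTTPMethod;
    }

    // Mirrors the quoted control flow: a response that is not stored and is
    // not a 304 evicts whatever is already cached under the same key.
    void store(const std::string& method, const std::string& url, int statusCode, const std::string& body)
    {
        if (makeStoreDecision(method) != StoreDecision::Yes) {
            bool isSuccessfulRevalidation = statusCode == 304;
            if (!isSuccessfulRevalidation)
                entries.erase(url);
            return;
        }
        entries[url] = body;
    }

    bool contains(const std::string& url) const { return entries.count(url) > 0; }
};
```

In this model, calling `store("GET", "foo.jpg", 200, ...)` followed by `store("HEAD", "foo.jpg", 200, "")` leaves `contains("foo.jpg")` false, which matches the empty cache observed after the page load.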

# Other Browsers

Chrome is able to downgrade a cached GET response to a HEAD response. When loading the test case in Chrome:

1. On a cold load, foo.jpg is loaded from the network for the GET, and then the disk cache is used to satisfy the HEAD request.
2. On a warm load, both the GET and HEAD requests are served out of disk cache.
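
One way such a downgrade could work is sketched below. This is neither Chrome's nor WebKit's implementation: `CachedResponse`, `downgradeToHead`, and `ToyCache` are hypothetical names, entries are assumed to be keyed by URL only, and freshness and validation checks are omitted. It requires C++17 for `std::optional`.

```cpp
#include <map>
#include <optional>
#include <string>

// Hypothetical cached entry: status, headers, and body of a GET response.
struct CachedResponse {
    int statusCode;
    std::map<std::string, std::string> headers;
    std::string body;
};

// Derive a HEAD response from a cached GET response: keep the status and
// headers (including Content-Length) and drop the body.
CachedResponse downgradeToHead(const CachedResponse& cachedGet)
{
    CachedResponse head = cachedGet;
    head.body.clear();
    return head;
}

// Toy cache that answers both GET and HEAD lookups from stored GET entries.
struct ToyCache {
    std::map<std::string, CachedResponse> entries; // URL -> cached GET response

    std::optional<CachedResponse> retrieve(const std::string& method, const std::string& url) const
    {
        auto it = entries.find(url);
        if (it == entries.end())
            return std::nullopt; // cache miss: the caller goes to the network
        if (method == "HEAD")
            return downgradeToHead(it->second);
        if (method == "GET")
            return it->second;
        return std::nullopt; // other methods are never served from this cache
    }
};
```

With a lookup like this, the XHR in the test case could read Content-Length from the cached headers without a network round trip, and the HEAD would never reach the store path that evicts the GET entry.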
