[webkit-gtk] webkit2gtk resources

Jose jmalv04 at gmail.com
Wed Aug 27 10:02:59 PDT 2014


In a previous discussion
https://lists.webkit.org/pipermail/webkit-gtk/2012-February/000960.html

webkit_web_view_save_to_dir was proposed.

What ended up being implemented was webkit_web_view_save and
webkit_web_view_save_to_file, which dump the web view in
MHTML format.

I thought it would be useful for my understanding to dump each
resource to a file, so I basically
followed the approach used in the test files:

- connect to resource-load-started
- connect to the loaded signal and add each resource to a GList
- wait some time after the page has loaded
- take a snapshot
- save the resource and subresources to a directory
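For the last step each resource needs a file name derived from its
URI. A minimal sketch of one way to do that, in plain C with no WebKit
calls. The helper name resource_file_name and the "<dir>/<index>_<uri>"
naming scheme are my own assumptions for illustration, not part of
WebKitGTK:

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not a WebKitGTK function): writes
 * "<dir>/<index>_<sanitized uri>" into out. Every character other than
 * a letter, digit, '.', '-' or '_' is replaced by '_', so slashes,
 * schemes and query strings cannot point outside dir.
 * Returns 0 on success, -1 if out is too small. */
int resource_file_name(const char *dir, unsigned index, const char *uri,
                       char *out, size_t out_size)
{
    assert(dir != NULL && uri != NULL && out != NULL);

    /* The numeric prefix keeps names unique even if two URIs sanitize
     * to the same string. */
    int n = snprintf(out, out_size, "%s/%03u_", dir, index);
    if (n < 0 || (size_t)n >= out_size)
        return -1;

    size_t pos = (size_t)n;
    if (pos + strlen(uri) >= out_size)
        return -1;

    for (const char *p = uri; *p != '\0'; p++) {
        unsigned char c = (unsigned char)*p;
        out[pos++] = (isalnum(c) || c == '.' || c == '-' || c == '_')
                         ? (char)c : '_';
    }
    out[pos] = '\0';
    return 0;
}
```

With dir "out" and index 1, the URI /test/index2.html becomes
out/001__test_index2.html. A data: URI is handled the same way, since
':' and ',' are replaced as well.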

This works well for most web pages but I have 3 cases I don't know how
to handle (using 2.5.3):

1) The page has frames

E.g. when I load a URL with this content

<html>
<frameset cols="10%,10%,80%">
        <frame src="data:text/html,ok" />
        <frame src="/doesnotexist.html" />
        <frame src="/test/index2.html" />
</frameset>
</html>

I get the following (status code, type, url, get_content_len, actual dataSize)

200 html /test/frames_broken.html 202 202
200 html /test/index2.html 41 0
404 html /doesnotexist.html 168 0
0 html data:text/html,ok 2 0

The main_resource and the data URI are ok, but I am unable to get the
content for the other two frames (and I think I should at least get
the 41 bytes of content from /test/index2.html, as it has status 200).
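To make the check explicit, here is a small sketch of how the rows
above could be flagged automatically. The struct, its field names and
both functions are hypothetical, and treating data: URIs as "ok"
(because their content is in the URI itself) is my assumption:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* One row of the listing above; names are hypothetical. */
struct resource_row {
    int status;             /* HTTP status code, 0 for data: URIs */
    const char *uri;
    size_t content_length;  /* length reported by the response */
    size_t data_size;       /* bytes actually returned for the resource */
};

/* True when the response announced a body but no data came back.
 * data: URIs are skipped, assuming their content can be read from
 * the URI itself. */
bool resource_body_missing(const struct resource_row *row)
{
    assert(row != NULL && row->uri != NULL);
    if (strncmp(row->uri, "data:", 5) == 0)
        return false;
    return row->content_length > 0 && row->data_size == 0;
}

/* Counts the rows whose body is missing. */
size_t count_missing_bodies(const struct resource_row *rows, size_t n)
{
    size_t missing = 0;
    for (size_t i = 0; i < n; i++) {
        if (resource_body_missing(&rows[i]))
            missing++;
    }
    return missing;
}
```

Run on the four rows above, this flags /test/index2.html and
/doesnotexist.html, the two frames in question.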


2) The page loads a Flash plugin

I would need to save the file that is being passed to the plugin (and
maybe a reference to which plugin, which may be obvious from the
extension). The plugin content file does not show up as a resource.


3) Debugging a large page

I've taken snapshots of several large pages and they render properly.
When I try a full page rendering of http://www.yahoo.es, some images
are missing (this happens for both short and long waits after load,
but not when the rendering is limited in height).

Anecdotally, I verified that these final images load slowly in
Epiphany 3.10 (but not so much in the 2.5.3 MiniBrowser) and
they load much faster in Firefox.

And when I look at the resources, the images not in the snapshot are
missing but I don't get any non-200 status code or failed callback.


Any ideas on how to approach these three issues?

thanks

