[webkit-gtk] [WK2] Implement new API to save a web page into a self-contained document

Mario Sanchez Prada msanchez at igalia.com
Mon Jun 25 07:50:48 PDT 2012

Hi all,

As per Carlos's suggestion, I started working last week on
implementing a new API that allows saving the web page associated with
the current WebKitWebView in an easy way. This should help remove some
code from browsers like Epiphany, which do their own thing (e.g. Ephy
saves an .html file and then all the resources in a separate directory).

The idea would be to allow saving both the current web page and its
associated external resources into a self-contained file we could use
later to read the contents (maybe offline), or send to someone else, for
example.

So, we thought a good initial approach could be to start with the MHTML
format [1] (already supported in WebKit for some time [2][3]),
although it would probably be good not to tie ourselves to it and to
leave the door open to other possibilities that might be interesting in
the future.
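
For readers unfamiliar with the format: an MHTML file is a MIME
multipart/related message whose first part is the HTML document and
whose remaining parts are its resources, each labelled with a
Content-Location header. Below is a rough sketch in plain C of how such
a container could be assembled. It is not WebKit's serializer; the names
(MhtmlPart, mhtml_write) and the fixed boundary string are made up for
illustration, and bodies are assumed to be plain 7bit text.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical description of one part of the archive. */
typedef struct {
    const char *content_type;     /* e.g. "text/html" */
    const char *content_location; /* original URL of the resource */
    const char *body;             /* text payload, assumed 7bit-safe here */
} MhtmlPart;

/* Illustrative boundary; a real writer must pick one that does not
 * occur in any body. */
#define MHTML_BOUNDARY "----=_sketch_boundary"

/* Writes a minimal multipart/related message into buf. parts[0] is the
 * root HTML document. Returns the number of bytes written (excluding
 * the trailing NUL), or -1 if buf is too small. */
int mhtml_write(char *buf, size_t size, const MhtmlPart *parts, size_t n_parts)
{
    size_t used;
    int n;

    assert(buf != NULL && parts != NULL && n_parts > 0);

    /* Top-level headers announce the container type and the boundary. */
    n = snprintf(buf, size,
                 "MIME-Version: 1.0\r\n"
                 "Content-Type: multipart/related; type=\"%s\"; boundary=\"%s\"\r\n"
                 "\r\n",
                 parts[0].content_type, MHTML_BOUNDARY);
    if (n < 0 || (size_t)n >= size)
        return -1;
    used = (size_t)n;

    /* One MIME part per resource, each introduced by "--boundary". */
    for (size_t i = 0; i < n_parts; i++) {
        n = snprintf(buf + used, size - used,
                     "--%s\r\n"
                     "Content-Type: %s\r\n"
                     "Content-Transfer-Encoding: 7bit\r\n"
                     "Content-Location: %s\r\n"
                     "\r\n"
                     "%s\r\n",
                     MHTML_BOUNDARY, parts[i].content_type,
                     parts[i].content_location, parts[i].body);
        if (n < 0 || (size_t)n >= size - used)
            return -1;
        used += (size_t)n;
    }

    /* The closing delimiter carries a trailing "--". */
    n = snprintf(buf + used, size - used, "--%s--\r\n", MHTML_BOUNDARY);
    if (n < 0 || (size_t)n >= size - used)
        return -1;
    used += (size_t)n;
    return (int)used;
}
```

Binary resources such as images would need a base64 transfer encoding
instead of 7bit, which this sketch leaves out.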

With that in mind, we thought of adding an extra parameter of type
'WebKitWebViewSaveMode' to the new functions. We would start by defining
only an MHTML mode, but in the future we could support others, like using
the base HTML document "as is" and embedding external resources as 'data
URIs' [4] (let's call it COMPATIBLE).
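
To make the COMPATIBLE idea a bit more concrete: a data URI carries the
resource inline as "data:<mime type>;base64,<payload>". Below is a small
sketch of that encoding step in plain C. It only illustrates the format;
nothing like make_data_uri() is part of the proposal, and the helper
name and buffer handling are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Standard base64 alphabet. */
static const char B64[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/* Hypothetical helper: writes "data:<mime>;base64,<payload>" into out.
 * Returns the URI length, or -1 if out is too small. */
int make_data_uri(const char *mime, const unsigned char *data, size_t len,
                  char *out, size_t out_size)
{
    assert(mime != NULL && out != NULL && (data != NULL || len == 0));

    /* Every 3 input bytes become 4 output characters, padded with '='. */
    size_t encoded_len = 4 * ((len + 2) / 3);
    int prefix = snprintf(out, out_size, "data:%s;base64,", mime);
    if (prefix < 0 || (size_t)prefix + encoded_len + 1 > out_size)
        return -1;

    char *p = out + prefix;
    for (size_t i = 0; i < len; i += 3) {
        size_t left = len - i;
        /* Pack up to three bytes into one 24-bit group. */
        unsigned long chunk = (unsigned long)data[i] << 16;
        if (left > 1) chunk |= (unsigned long)data[i + 1] << 8;
        if (left > 2) chunk |= data[i + 2];

        /* Emit four 6-bit symbols, padding a short final group. */
        *p++ = B64[(chunk >> 18) & 0x3F];
        *p++ = B64[(chunk >> 12) & 0x3F];
        *p++ = left > 1 ? B64[(chunk >> 6) & 0x3F] : '=';
        *p++ = left > 2 ? B64[chunk & 0x3F] : '=';
    }
    *p = '\0';
    return (int)(p - out);
}
```

A COMPATIBLE-style serializer would then replace each external URL in
the markup (for instance an img src) with the string produced this way.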

Last, it would be interesting to have both a way to easily save the page
to disk (a save_to_file() function) and a way to receive the serialized
page itself, so the end application can decide what to do with it.
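
One way to read this is that the file variant could be a thin
convenience layer over the generic one. Here is a synchronous, GLib-free
toy model of that layering in plain C. All names (SketchPage,
sketch_page_save, sketch_page_save_to_file) are stand-ins invented for
this sketch, the "serialization" simply copies the markup, and the real
functions are meant to be asynchronous.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Stand-ins for illustration only; none of these are WebKit types. */
typedef enum { SKETCH_SAVE_MODE_MHTML } SketchSaveMode;
typedef struct { const char *html; } SketchPage;

/* Generic variant: hands the serialized page back to the caller, who
 * decides what to do with it and frees it afterwards. A real
 * implementation would build the archive instead of copying markup. */
char *sketch_page_save(const SketchPage *page, SketchSaveMode mode, size_t *length)
{
    assert(page != NULL && page->html != NULL && length != NULL);
    if (mode != SKETCH_SAVE_MODE_MHTML)
        return NULL;

    *length = strlen(page->html);
    char *data = malloc(*length + 1);
    if (data != NULL)
        memcpy(data, page->html, *length + 1);
    return data;
}

/* Convenience variant layered on the generic one: serialize, then
 * write the bytes to filepath. Returns 0 on success, -1 on failure. */
int sketch_page_save_to_file(const SketchPage *page, const char *filepath,
                             SketchSaveMode mode)
{
    size_t length = 0;
    char *data = sketch_page_save(page, mode, &length);
    if (data == NULL)
        return -1;

    FILE *file = fopen(filepath, "wb");
    int ok = file != NULL && fwrite(data, 1, length, file) == length;
    if (file != NULL && fclose(file) != 0)
        ok = 0;
    free(data);
    return ok ? 0 : -1;
}
```

An application that wants to upload or mail the page would call the
generic variant; one that only wants a file on disk would use the
wrapper.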

So, after talking to Carlos, we think a proposal like this one could
make sense:

 typedef enum {
     WEBKIT_SAVE_MODE_MHTML // The only supported mode atm
 } WebKitWebViewSaveMode;

 webkit_web_view_save (WebKitWebView *web_view,
                       WebKitWebViewSaveMode save_mode,
                       GCancellable *cancellable,
                       GAsyncReadyCallback callback,
                       gpointer user_data);

 webkit_web_view_save_finish (WebKitWebView *web_view,
                              GAsyncResult *result,
                              GError **error);

 webkit_web_view_save_to_file (WebKitWebView *web_view,
                               const gchar *filepath,
                               WebKitWebViewSaveMode save_mode,
                               GCancellable *cancellable,
                               GAsyncReadyCallback callback,
                               gpointer user_data);

 webkit_web_view_save_to_file_finish (WebKitWebView *web_view,
                                      GAsyncResult *result,
                                      GError **error);

I have already started working on this; progress is being tracked in
bugs 89872 [5] and 89873 [6].

Feel free to send any feedback you have. This is completely open
at this stage, so any change is possible.

I should probably have sent this mail earlier. Sorry about that.


[1] http://en.wikipedia.org/wiki/MHTML
[2] https://bugs.webkit.org/show_bug.cgi?id=7168
[3] https://bugs.webkit.org/show_bug.cgi?id=7169
[4] http://en.wikipedia.org/wiki/Data_URI_scheme
[5] https://bugs.webkit.org/show_bug.cgi?id=89872
[6] https://bugs.webkit.org/show_bug.cgi?id=89873
