[webkit-gtk] Modifying HTML before loading it in a WebView

Adrián Pérez de Castro aperez at igalia.com
Wed Aug 3 06:45:58 PDT 2016


Hi Michael,

Quoting Michael Gratton (2016-08-02 04:58:59)

> I'm (slowly) working on porting Geary to WebKit2.

That's great! Any effort put into porting to WebKit2 deserves praise.
Hopefully distributions can stop shipping old, insecure, tired WebKit1 at
some point thanks to people like you :-)

> One of the things that needs to be taken care of is how Geary 
> manipulates an HTML email's markup before loading it into a WebView. 
> This is done for a few reasons: Applying app-specific and user-specific 
> CSS, to implement collapsible quote sections, for handling loading of 
> inline and attached images, and to ensure that bulk and junk messages 
> containing bugged remote images, etc. aren't automatically triggered.

You may want to consider using WebKitUserContentManager [1]. In short, it
allows you to inject CSS and JavaScript snippets into content loaded in a
WebKitWebView. The injected JavaScript code runs in the WebProcess, in the
same context as the loaded web content, and it can manipulate that content
in any way it wants, using the DOM and the rest of the web APIs.

Additionally, you can register a “message handler”, which allows you to
send messages from JavaScript with:

  window.webkit.messageHandlers.<handler-name>.postMessage(value)

When that function is called from JavaScript, “value” is serialized and
sent to the UIProcess (your application), and the WebKitUserContentManager
emits the “script-message-received::<handler-name>” signal.
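
As a minimal sketch of the JavaScript side, an injected script could wrap
that call so it fails gracefully when the handler is missing. The function
name postToApp, the "geary" handler name and the message payload are made
up for illustration; the handler name has to match whatever your
application registers with the WebKitUserContentManager.

```javascript
// Sketch only: postToApp and the "geary" handler name are hypothetical.
// `root` defaults to the global object, so the function can also be
// exercised outside a WebKitWebView (where no handlers exist).
function postToApp(handlerName, value, root) {
  const scope = root || globalThis;
  const handlers = scope.webkit && scope.webkit.messageHandlers;
  const handler = handlers && handlers[handlerName];
  if (!handler) {
    // Handler not registered, or not running inside WebKit.
    return false;
  }
  // The value is serialized and delivered to the UIProcess.
  handler.postMessage(value);
  return true;
}

// Example call from an injected script (payload shape is an assumption):
// postToApp("geary", { kind: "ready" });
```

The return value only tells the script whether a handler with that name
was found, not whether the application did anything with the message.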

If you need to send messages from the UIProcess to the WebProcess, you can
use webkit_web_view_run_javascript(), e.g. to call JavaScript functions which
have been defined in your injected scripts.

> I can parse, manipulate and re-serialise the HTML in the app 
> before shipping it off to WebKit, but it would be nice to be able to take 
> advantage of WK's excellent parser instead. Is loading the HTML in a 
> WebView and manipulating it from there using a WebExtension the only 
> way to achieve this? I couldn't see anything like the DOM Load/Save 
> APIs being documented as implemented. Is it possible to get a 
> "background" WebKitWebPage for doing this without having the web view 
> visibly change as the DOM is modified, then hand-ball the modified DOM 
> over to a visible page for display?

As outlined above, you can inject JavaScript code, and do your DOM
modifications using web APIs. For example, Epiphany's overview page (the
one with screenshots of pages used often) was rewritten to work in that
way, and the resulting code is smaller and easier to maintain than using
the DOM bindings from C.

If you absolutely need (or want) to use the DOM bindings in C, you will
need to implement that as a WebExtension. In that case you probably want to
do the DOM modifications in the WebExtension, without serializing the DOM
tree over to the UIProcess and then sending it back. Instead, it seems like
a better idea to do something like:

  1. The WebExtension sends a list of blocked items to the UIProcess, each
     with an identifier and a description (and whatever else the UI needs
     to show).
  2. The UIProcess sends a message to the WebExtension saying “unblock
     the items with identifiers $FOO, $BAR, and $BAZ”.
  3. The WebExtension manipulates the DOM to unblock the item(s).

As a rule of thumb, it is better to avoid passing large amounts of data between
the UIProcess (your app) and the WebProcess/WebExtension :-)
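
To make the three steps above a bit more concrete, here is a rough sketch
of the bookkeeping, written in JavaScript for brevity (as it might look in
an injected script) rather than with the C DOM bindings in a WebExtension.
The BlockedItems class, its method names and the "item-N" identifier
format are all invented for this example; only small summaries and
identifiers would cross the process boundary, never the DOM itself.

```javascript
// Sketch only: every name here is hypothetical, none of it is WebKitGTK API.
class BlockedItems {
  constructor() {
    this.items = new Map(); // identifier -> { description, unblock }
    this.nextId = 1;
  }

  // Record a blocked item. `unblock` is a callback that restores it,
  // e.g. by copying a saved URL back into an <img> element's src.
  add(description, unblock) {
    const id = "item-" + this.nextId++;
    this.items.set(id, { description, unblock });
    return id;
  }

  // Step 1: a small summary that can be sent to the UIProcess.
  summary() {
    return Array.from(this.items, ([id, item]) => ({
      id,
      description: item.description,
    }));
  }

  // Step 3: unblock the identifiers named by the UIProcess (step 2).
  // Unknown identifiers are ignored; returns the ones actually unblocked.
  unblock(ids) {
    const done = [];
    for (const id of ids) {
      const item = this.items.get(id);
      if (item) {
        item.unblock();
        this.items.delete(id);
        done.push(id);
      }
    }
    return done;
  }
}
```

With something like this in place, the message in step 2 only needs to
carry a short list of identifiers, which keeps the traffic between the
two processes small.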

> For intercepting resource loads, connecting to 
> WebKitWebPage::send-request and/or registering a handler with 
> webkit_web_context_register_uri_scheme() looks like the way to go - I 
> assume that one or the other will allow intercepting and cancelling 
> loads of all remote resources at load-time, including images, CSS and 
> JS, but how secure are they? Will a DNS lookup be performed for 
> resources with remote URIs before asking for load permission?

I am not very familiar with the pipeline for resource loads, but your plan
seems correct from my understanding of how this works (and the documentation).

Even though I am not a Geary user, I am looking forward to seeing it ported
and hope that the info above will get you on a good track :-)

Cheers,

--
 ☛ Adrián

[1] https://webkitgtk.org/reference/webkit2gtk/stable/WebKitUserContentManager.html