[webkit-help] Feedback about Content Blocking Extensions from Adblock Plus

Sun Jun 14 20:42:21 PDT 2015

Hi Sebastian,

Thanks for starting a thread for this. Let's see what we can do...

Did you already file radars for the issues? If you did, can you give the 
radar numbers? I'll link them with the meta radars tracking the feature 
requests we are getting for content blockers. If you did not file 
radars, I'll do that.

On 6/14/15 7:07 AM, Sebastian Noack wrote:
> I'm from Adblock Plus, and just read the article on the WebKit website 
> [1] about the new content blocking mechanism, introduced with Safari 
> 9. Thanks for providing some details. But I identified following 
> shortcomings that would effectively make the new mechanism 
> insufficient for us, or anybody supporting our filters [2], which are 
> used by popular filter lists, including EasyList:
From the list, it seems to me that we should discuss concrete cases 
instead of concrete solutions.

The content blockers in WebKit are vastly different from what extensions 
do today. As such, a solution that works well for classic extensions may 
not be the best way to solve the same problem in content blockers. If 
you tell us about the actual problems (for example, a website where 
you can't filter a resource), it would be easier for us to 
identify what we can do.
> 1. Most importantly, our exception rules are recursive. For example 
> ||example.com$document prevents not only documents loaded from 
> example.com being blocked. 
> But also resources loaded as part of that document or in any of its 
> subframes or their subframes wouldn't be blocked either. However, this 
> logic doesn't seem to be possible with the ignore-previous-rules 
> action. A recursive flag would come in handy here.
That seems feasible. I have a couple of ideas on how to best achieve this.

Including the subframes is a bit worrying to me. A subframe of a trusted 
source is typically not to be trusted. Do you have examples where that 
is useful?
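
To make the gap concrete, here is a minimal sketch of such an exception 
in the current format, using only the trigger and action fields from the 
article. The actionsFor() helper is hypothetical: it is a simplified 
model of "a later rule cancels earlier matches", not WebKit's 
implementation. Because the trigger is tested against the URL of each 
request, a resource from another host embedded in an example.com 
document is still blocked.

```javascript
// Rule list in the content blocker JSON format described in the article.
const rules = [
  // Block anything whose URL contains "/ads/".
  { trigger: { "url-filter": "/ads/" }, action: { type: "block" } },
  // Exception: discard earlier matches for URLs served by example.com.
  { trigger: { "url-filter": "^https?://example\\.com/" },
    action: { type: "ignore-previous-rules" } },
];

// Hypothetical model of rule evaluation (an assumption, not WebKit code):
// rules are applied in order against the request URL only, and
// ignore-previous-rules clears the actions collected so far.
function actionsFor(url, ruleList) {
  let actions = [];
  for (const rule of ruleList) {
    if (!new RegExp(rule.trigger["url-filter"]).test(url)) continue;
    if (rule.action.type === "ignore-previous-rules") actions = [];
    else actions.push(rule.action.type);
  }
  return actions;
}
```

Under this model the exception covers http://example.com/ads/banner.png 
itself, but not http://ads.example.net/ads/frame.html loaded inside an 
example.com page, which is the recursive case described in point 1.
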
> 2. There doesn't seem to be a way to distinguish between document and 
> subdocument requests. While Adblock Plus blocks frames, it never 
> blocks the top level document, so that users can still access the 
> resource that is blocked, when entering its URL in the address bar.
This sounds like a good idea for your use case.

Any suggestion on the format? What would be the best way to specify this 
in your opinion?
> 3. A dedicated resource-type for XMLHttpRequests, objects (requests 
> loading a Flash element) and object subrequests (subsequent requests 
> issued by a Flash object) would certainly be useful as well. EasyList 
> has quite some filters specifically checking for those.
Targeting XHR specifically seems very easy to counter to me. Couldn't 
one just use the Fetch API or Sockets to work around the rule?
Do you have an example where the distinction matters?

Regarding the object subrequest, that seems like a valuable thing to do.
> 4. Adblock Plus uses filters subscriptions (periodically downloaded 
> filter lists, like EasyList) as well as filters added by the user, to 
> decide what to block. So we'd need a way to dynamically configure 
> block lists. I saw the pre-release announcements mentioning the 
> new setContentBlocker API for this purpose. I couldn't find any 
> details on that, but I assume that you can simply pass in a block list 
> as JavaScript object? But note that we'd need a way to invalidate 
> previously set blocking rules when filters in Adblock Plus changed. 
> However, a way to add new rules without flushing the previously set 
> block list would be extremely useful in some cases as well. So 
> ideally, this API should let you modify the block list in place.
You can pass the rules as a JavaScript object, or as a serialized JSON 
string.

When you set a new content blocker, it replaces the old one. Strictly 
speaking, the old one remains active until the new one is compiled and 
then it is replaced.
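
In practice this means the extension keeps the authoritative rule list 
and resubmits all of it on every change. A minimal sketch, assuming a 
stand-in for the API: the setContentBlocker() below is a local stub 
that models only the replace-on-set behaviour, and the names 
subscriptionRules, userRules and addUserRule are hypothetical.

```javascript
// Local stub standing in for the extension API (an assumption). It models
// only the behaviour described above: each call replaces the previous list.
let activeBlocker = "[]";
function setContentBlocker(rules) {
  // Accept either a JavaScript object or a serialized JSON string.
  activeBlocker = typeof rules === "string" ? rules : JSON.stringify(rules);
}

// The extension owns the full list: subscription rules plus user rules.
const subscriptionRules = [
  { trigger: { "url-filter": "/ads/" }, action: { type: "block" } },
];
const userRules = [];

// "Adding" one rule still means submitting the complete list again.
function addUserRule(rule) {
  userRules.push(rule);
  setContentBlocker([...subscriptionRules, ...userRules]);
}
```

Calling addUserRule() with one new rule hands the engine both the 
subscription rule and the new rule, so nothing set earlier is lost.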

There is a technical reason why you cannot modify/add/delete individual 
rules. In the engine, the rules are combined into giant state machines. 
The concept of a rule does not exist past the compiler; after that, all we 
have is a very simple bytecode 
(http://trac.webkit.org/browser/trunk/Source/WebCore/contentextensions/DFABytecode.h) 
that executes several thousand triggers at once.
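
As a rough illustration only, here is a toy version of the idea. It 
merges literal URL prefixes into a single trie, whereas the real 
compiler handles regular expressions and emits DFA bytecode; the 
compile() and match() functions and the "prefix" field are made up for 
this sketch. It shows the two properties mentioned above: the compiled 
output holds states and actions rather than rules, and one pass over 
the URL tests every trigger.

```javascript
// Toy model only: literal prefixes merged into one trie.
function compile(rules) {
  const root = { next: new Map(), actions: [] };
  for (const { prefix, action } of rules) {
    let state = root;
    for (const ch of prefix) {
      if (!state.next.has(ch)) {
        state.next.set(ch, { next: new Map(), actions: [] });
      }
      state = state.next.get(ch); // shared prefixes share states
    }
    state.actions.push(action); // only the action survives, not the rule
  }
  return root;
}

// A single walk over the URL evaluates all compiled prefixes at once.
function match(machine, url) {
  const actions = [...machine.actions];
  let state = machine;
  for (const ch of url) {
    state = state.next.get(ch);
    if (!state) break; // no compiled prefix continues with this character
    actions.push(...state.actions);
  }
  return actions;
}
```

With the prefixes "http://ads." and "http://ads.example.com/" compiled 
together, both share the states for "http://ads.", and nothing in the 
machine records which rule contributed which state.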

Note that compiling is not cheap. We are paying compile time when 
loading rules in exchange for faster runtime and lower memory footprint. 
How often do you need to update the rules?

Cheers,
Benjamin