[webkit-help] Feedback about Content Blocking Extensions from Adblock Plus
Sebastian Noack
sebastian at adblockplus.org
Thu Jun 18 09:19:34 PDT 2015
I wrote a script to convert our filters into WebKit's block list format:
https://github.com/snoack/abp2blocklist/blob/master/abp2blocklist.js
Here is the output for EasyList (our default filter list):
https://raw.githubusercontent.com/snoack/abp2blocklist/master/easylist.json
Note that it uses some extensions to the format, which I suggested in my
previous emails:
1. Some rules specify "ignore-previous-rules-in-document" as action type.
If any of these rules matches for a document (toplevel or subframe), no
resource in that document or any of its subframes or their subframes should
be blocked or hidden according to the rules given above. Nor should any
"css-display-none" rule above hide any elements there.
2. Some rules set trigger.collapse to true. Elements such as images and
subframes that don't load because they were blocked by a rule using that
option should not show a placeholder on the page, but should be hidden as
if by "display: none".
3. The resource type "subdocument" should match documents loaded in a
subframe. The resource type "document" isn't used, but I suggest making it
match only toplevel documents.
4. The resource type "object" should match <object> and <embed> elements
(loading a Flash video for example).
5. The resource type "object-subrequest" should match requests initiated by
a third-party plugin like Flash. Most of the requests we block this way are
in-video ads.
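
To make the five extensions above concrete, here is a minimal sketch of what
rules using them might look like. The "url-filter" patterns, the host names,
and the ".ad-banner" selector are made up for illustration; only the action
type, the collapse option, and the resource types come from the list above.

```javascript
// Hypothetical rules illustrating the proposed extensions. All URL patterns
// and the CSS selector are invented for this example.
const rules = [
  // Extensions 2 and 3: block a subframe and collapse the element instead
  // of leaving a placeholder behind.
  {
    trigger: {
      "url-filter": "^https?://ads\\.example\\.com/frame",
      "resource-type": ["subdocument"],
      collapse: true
    },
    action: { type: "block" }
  },
  // Extensions 4 and 5: block plugin content (<object>/<embed>) and
  // requests initiated by a plugin.
  {
    trigger: {
      "url-filter": "^https?://ads\\.example\\.com/",
      "resource-type": ["object", "object-subrequest"]
    },
    action: { type: "block" }
  },
  // An ordinary element hiding rule.
  {
    trigger: { "url-filter": ".*" },
    action: { type: "css-display-none", selector: ".ad-banner" }
  },
  // Extension 1: if this matches a document (toplevel or subframe), none of
  // the rules above should block or hide anything in it or its subframes.
  {
    trigger: { "url-filter": "^https?://trusted\\.example\\.org/" },
    action: { type: "ignore-previous-rules-in-document" }
  }
];

console.log(JSON.stringify(rules, null, 2));
```

Note that the exemption rule comes last, since it is meant to override the
rules listed before it.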
Some other assumptions:
Our filters don't consider subdomains as third-party. Also, we don't
consider the protocol for third-party checks. So if trigger.load-type is
set to "third-party" it is not supposed to match resources like
https://foo.example.co.uk/img.png on http://bar.example.co.uk. We compare
the "basedomain", which is the public suffix (https://publicsuffix.org/)
plus one more part ("example.co.uk" in this example) to determine whether
something is third-party or first-party. Without this logic, websites on
domains that serve ads on other websites (e.g. websites from ad providers)
break when you visit them directly.
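
The third-party check described above could be sketched roughly as follows.
This is only an illustration: PUBLIC_SUFFIXES is a tiny hard-coded stand-in
for the real public suffix list, and the function names getBaseDomain and
isThirdParty are chosen for this example rather than taken from any existing
code.

```javascript
// Tiny stand-in for the public suffix list (assumption: a real
// implementation would use the full list from https://publicsuffix.org/).
const PUBLIC_SUFFIXES = new Set(["com", "org", "uk", "co.uk"]);

// Returns the longest known public suffix plus one more label,
// e.g. "foo.example.co.uk" -> "example.co.uk".
function getBaseDomain(hostname) {
  const labels = hostname.toLowerCase().split(".");
  // Walk from the left so that the longest matching suffix wins.
  for (let i = 0; i < labels.length - 1; i++) {
    const suffix = labels.slice(i + 1).join(".");
    if (PUBLIC_SUFFIXES.has(suffix)) {
      return labels.slice(i).join(".");
    }
  }
  // No known suffix: fall back to the whole hostname.
  return hostname.toLowerCase();
}

// A request is third-party only if the base domains differ.
// The protocol is deliberately ignored.
function isThirdParty(requestUrl, documentUrl) {
  const requestHost = new URL(requestUrl).hostname;
  const documentHost = new URL(documentUrl).hostname;
  return getBaseDomain(requestHost) !== getBaseDomain(documentHost);
}

console.log(isThirdParty("https://foo.example.co.uk/img.png",
                         "http://bar.example.co.uk")); // false
console.log(isThirdParty("https://ads.example.com/ad.js",
                         "http://bar.example.co.uk")); // true
```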
Similarly for "if-domain" and "unless-domain": our filters assume "example.com"
to match "www.example.com" or "this.is.still.example.com" as well. The logic
implemented here is simpler, though, and corresponds to
/^(.*\.)*example\.com$/. But this is even more critical, as many websites use
different subdomains.
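
For illustration, the subdomain matching expected for "if-domain" and
"unless-domain" could be sketched like this. The function name matchesDomain
is hypothetical, and the suffix comparison is just one way to get the same
result as the regular expression above without building a pattern per domain.

```javascript
// Returns true if hostname is the given domain or any subdomain of it.
// Comparing against "." + domain avoids matching "notexample.com".
function matchesDomain(hostname, domain) {
  const host = hostname.toLowerCase();
  const base = domain.toLowerCase();
  return host === base || host.endsWith("." + base);
}

console.log(matchesDomain("www.example.com", "example.com"));           // true
console.log(matchesDomain("this.is.still.example.com", "example.com")); // true
console.log(matchesDomain("notexample.com", "example.com"));            // false
```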
Sebastian