[webkit-help] Content blocker, performance issue
Benjamin Poulain
benjamin at webkit.org
Mon Jul 6 18:02:09 PDT 2015
Hi Tuan,
On 7/6/15 3:07 PM, Tuan Bach Quoc wrote:
> I was reading the documentation around the content blocker extension.
> There is one sentence that triggered my curiosity:
>
> If the rule compiler detects that a set of rules would negatively impact
> user experience, it refuses to load them and returns an error.
>
> Will the compiler check the performance of the rules only once, or every
> time a page is loaded?
>
> Basically my concern is that the performance of the set of rules is
> also impacted by the page the user is currently loading. The performance
> of a set of rules could be different depending on which website it is
> applied to, right?
>
> Could we have more details on how the compiler works and how it
> evaluates whether the rules would "negatively impact user experience"?
>
> Many thanks in advance for reading,
Basically, all the rules are combined into a few giant state machines.
First, the triggers are grouped into sets that work well together. Then
each group is transformed into equivalent non-deterministic automata
(NFAs). The NFAs are then transformed into deterministic machines (DFAs)
and minimized.
When the resulting machines are small, we combine them to reduce the
number of machines and reach a good machine-count-to-size ratio.
Each machine is then lowered to bytecode for execution.
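To make that pipeline concrete, here is a minimal sketch (Python, not
WebKit's actual code) that combines two toy literal prefix triggers into
one NFA, determinizes it with the subset construction, and runs a single
deterministic pass over a URL. The real compiler works on full regular
expressions and also does the minimization and bytecode lowering, none of
which is shown here.

    from itertools import count

    def build_nfa(patterns):
        # Combine toy literal prefix triggers into one NFA sharing a start state.
        transitions = {}        # (state, char) -> set of next states
        accepts = {}            # accepting NFA state -> rule index
        fresh = count(1)
        start = 0
        for rule_id, pattern in enumerate(patterns):
            state = start
            for ch in pattern:
                nxt = next(fresh)
                transitions.setdefault((state, ch), set()).add(nxt)
                state = nxt
            accepts[state] = rule_id
        return start, transitions, accepts

    def determinize(start, transitions, accepts):
        # Subset construction: each DFA state is a frozenset of NFA states.
        start_set = frozenset([start])
        dfa = {}                # (dfa_state, char) -> dfa_state
        dfa_accepts = {}        # dfa_state -> rule indices that fire there
        todo, seen = [start_set], {start_set}
        while todo:
            current = todo.pop()
            dfa_accepts[current] = {accepts[s] for s in current if s in accepts}
            moves = {}
            for (src, ch), targets in transitions.items():
                if src in current:
                    moves.setdefault(ch, set()).update(targets)
            for ch, targets in moves.items():
                nxt = frozenset(targets)
                dfa[(current, ch)] = nxt
                if nxt not in seen:
                    seen.add(nxt)
                    todo.append(nxt)
        return start_set, dfa, dfa_accepts

    def run(url, start, dfa, dfa_accepts):
        # One deterministic pass over the URL; collect every rule that matches.
        fired, state = set(dfa_accepts[start]), start
        for ch in url:
            state = dfa.get((state, ch))
            if state is None:
                break
            fired |= dfa_accepts[state]
        return fired

    start, trans, acc = build_nfa(["https://ads.", "https://track."])
    s0, dfa, dfa_acc = determinize(start, trans, acc)
    print(run("https://ads.example/banner.js", s0, dfa, dfa_acc))   # {0}

The point of the construction is that no matter how many rules went in,
matching a URL against one machine is a single left-to-right pass with
one table lookup per character.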
A big rule list that has similar expressions ends up as just 10-1000
state machines. Evaluating them is usually very efficient (the worst
case being "128 * URL length * number of machines" instructions, which
is a small number).
Since we are dealing with deterministic state machines, we have an idea
of how fast they are ahead of time.
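To put that bound in perspective, here is a rough back-of-the-envelope
computation; the 128 figure is the per-character worst case quoted above,
while the URL length and machine count are just illustrative values:

    # All numbers illustrative: 128 instructions per URL character,
    # a 64-character URL, and 200 compiled machines.
    instructions = 128 * 64 * 200
    print(instructions)   # 1638400 -- a small number, as noted above

Even this pessimistic figure is only on the order of a million simple
instructions per evaluated URL.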
We could find the exact worst case by going through the graphs, but we
don't do that at the moment. Simple limits on the size and number of
machines seem to serve us well so far.
---
To answer your perf-per-page question: we do not track the performance
of extensions per page. In many cases, the time spent in Content
Blockers is so small that it is not measurable anyway.
What we do is place various limits in the compiler to make sure that the
machines stay efficient. We check those limits as we compile; if we go
over them, we give up on the compilation step and return an error.
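As an illustration only (the actual limits, their names, and their values
are WebKit internals and not spelled out here), a guard of that kind could
look something like:

    # Hypothetical compile-time guard sketching the "give up" behaviour
    # described above; the constants are made-up, not WebKit's real limits.
    MAX_MACHINES = 1000
    MAX_STATES_PER_MACHINE = 10000

    class RuleListTooComplexError(Exception):
        pass

    def check_limits(machines):
        # 'machines' is a list of compiled DFAs, each given by its state set.
        if len(machines) > MAX_MACHINES:
            raise RuleListTooComplexError("too many state machines")
        for states in machines:
            if len(states) > MAX_STATES_PER_MACHINE:
                raise RuleListTooComplexError("machine has too many states")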
---
The limits are still evolving. The current limits are not very
restrictive because we want developers to experiment and tell us what
works well and what doesn't.
There are many things we can do in the compiler when we hit a bad case,
but we need to know about it.
Many developers have been sending us their blocking lists and we are
making dramatic improvements in response to what we discover.
I invite you to email me your biggest blocking lists. This is the kind
of data that will help us improve those limits.
Benjamin