[webkit-dev] [Block Pointer] Deterministic Region Based Memory Manager

Phil Bouchard philippeb8 at gmail.com
Sat Mar 5 21:59:52 PST 2016


Thanks for the references, I will take a look.

But regarding performance: a GC may be faster for a period of time, but 
when the collector kicks in we see CPU usage spike for a bit, followed 
by a slowdown in some JavaScript animations, especially on an embedded 
box with only 1 or 2 processors.

With block_ptr<> the animation would go smoothly at the same constant 
speed.

But if you say that would involve a 2x slowdown on the UI thread 
regardless then I am surprised.

Anyway, I am not sure I can create a patch within a short period of 
time, but if I end up with an interesting JavaScript benchmark I will 
post it to this mailing list.


Regards,
-Phil

On 03/06/2016 12:30 AM, Filip Pizlo wrote:
> Phil,
>
> I would expect our GC to be much faster than shared_ptr.  This shouldn’t
> really be surprising; it’s the expected behavior according to the GC
> literature.  High-level languages avoid the kind of eager reference
> counting that shared_ptr does because it’s too expensive.  I would
> expect a 2x-5x slow-down if we switched to anything that did reference
> counting.
>
> You should take a look at our GC, and maybe read some of the major
> papers about GC.  It’s awesome stuff.  Here are a few papers that I
> consider good reading:
>
> Some great ideas about high-throughput GC:
> http://www.cs.utexas.edu/users/mckinley/papers/mmtk-icse-2004.pdf
> Some great ideas about low-latency GC:
> http://www.filpizlo.com/papers/pizlo-pldi2010-schism.pdf
> Some great ideas about GC roots:
> http://www.hpl.hp.com/techreports/Compaq-DEC/WRL-88-2.pdf
> A good exploration of the limits of reference counting performance:
> http://research.microsoft.com/pubs/70475/tr-2007-104.pdf
>
> Anyway, you can’t ask us to change our code to use your memory manager.
> You can, however, try to get your memory manager to work in WebKit,
> and post a patch if you get it working.  If that patch is an improvement
> - in the sense that both you and the reviewers can apply the patch and
> confirm that it is in fact a progression and doesn’t break anything -
> then this would be the kind of thing we would accept.
>
> Having looked at your code a bit, I think that you’ll encounter the
> following problems:
> - Your code uses std::mutex for synchronization.  std::mutex is quite
> slow.  You should look at WTF::Lock, it’s much better (as in, orders of
> magnitude better).
> - Your code implements lifecycle management that is limited to reference
> counting.  This is not adequate to support JS, DOM, and JIT semantics,
> which are based on solving arbitrary data flow equations over the
> reachability set.
> - It’s not clear that your allocator results in fast path code that is
> competitive against either of the JSC GC’s allocators.  Both of those
> require ~5 instructions in the common case.  That instruction count
> includes the oversize object safety checks.
> - It’s not clear that your allocator is compatible with JITing and
> standard JavaScript performance optimizations, which assume that values
> can be passed around as bits without calling into the runtime.  A
> reference counter needs to do some kinds of memory operations on
> variable assignments.  This is likely to be about a 2x-5x slow-down.  I
> would expect a 2x slow-down if you did non-thread-safe reference
> counting, and 5x if you made it thread-safe.
>
> -Filip
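
To make the last point above concrete, here is a minimal C++ sketch of why
a reference-counted handle cannot be "passed around as bits": every
assignment has to touch a counter in memory. All names here (Node,
CountedRef, countAfterAssignments) are hypothetical and exist only for
illustration; this is not block_ptr<>'s or WebKit's code. The Counter
parameter selects a plain std::size_t (non-thread-safe counting) or a
std::atomic<std::size_t> (thread-safe counting), which are the two cases
distinguished in the quoted message. The sketch shows the extra work only
and makes no claim about the actual slowdown factor.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical heap object carrying its own reference count.
template <typename Counter>
struct Node {
    Counter refs{0};
    int value = 0;
};

// Hypothetical intrusive reference-counted handle.
template <typename Counter>
class CountedRef {
public:
    CountedRef() = default;
    explicit CountedRef(Node<Counter>* n) : node_(n) { retain(); }
    CountedRef(const CountedRef& other) : node_(other.node_) { retain(); }

    // Assignment is not a plain pointer copy: it increments the count of
    // the new target and decrements the count of the old one.
    CountedRef& operator=(const CountedRef& other) {
        CountedRef tmp(other);        // increment on the new target
        std::swap(node_, tmp.node_);  // tmp now holds the old target
        return *this;                 // tmp's destructor decrements it
    }

    ~CountedRef() { release(); }

    Node<Counter>* get() const { return node_; }

private:
    void retain() {
        if (node_) ++node_->refs;     // atomic read-modify-write if Counter is atomic
    }
    void release() {
        if (node_ && --node_->refs == 0) delete node_;
    }

    Node<Counter>* node_ = nullptr;
};

// Performs two handle assignments and reports the resulting count.
template <typename Counter>
std::size_t countAfterAssignments() {
    auto* raw = new Node<Counter>();
    CountedRef<Counter> owner(raw);   // count is 1
    CountedRef<Counter> a, b;
    a = owner;                        // count is 2
    b = a;                            // count is 3
    return raw->refs;                 // read before the handles are destroyed
}
```

Calling countAfterAssignments<std::size_t>() exercises the non-thread-safe
variant and countAfterAssignments<std::atomic<std::size_t>>() the
thread-safe one. By contrast, copying a raw Node pointer under a tracing
collector would write nothing to the object at all.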



