[webkit-dev] [Block Pointer] Deterministic Region Based Memory Manager
Phil Bouchard
philippeb8 at gmail.com
Sat Mar 5 21:59:52 PST 2016
Thanks for the references, I will take a look.
As for performance, a GC may be faster for a period of time, but
when the collector kicks in we notice CPU usage spiking for a bit,
followed by a slowdown in some JavaScript animations,
especially on an embedded box with only 1 or 2 processors.
With block_ptr<> the animation would run smoothly at a constant
speed.
But if you say that it would involve a 2x slowdown on the UI thread
regardless, then I am surprised.
Anyway, I am not sure I can create a patch within a short period of
time, but if I end up with an interesting JavaScript benchmark I
will post it to this mailing list.
Regards,
-Phil
On 03/06/2016 12:30 AM, Filip Pizlo wrote:
> Phil,
>
> I would expect our GC to be much faster than shared_ptr. This shouldn’t
> really be surprising; it’s the expected behavior according to the GC
> literature. High-level languages avoid the kind of eager reference
> counting that shared_ptr does because it’s too expensive. I would
> expect a 2x-5x slow-down if we switched to anything that did reference
> counting.
>
> You should take a look at our GC, and maybe read some of the major
> papers about GC. It’s awesome stuff. Here are a few papers that I
> consider good reading:
>
> Some great ideas about high-throughput GC:
> http://www.cs.utexas.edu/users/mckinley/papers/mmtk-icse-2004.pdf
> Some great ideas about low-latency GC:
> http://www.filpizlo.com/papers/pizlo-pldi2010-schism.pdf
> Some great ideas about GC roots:
> http://www.hpl.hp.com/techreports/Compaq-DEC/WRL-88-2.pdf
> A good exploration of the limits of reference counting performance:
> http://research.microsoft.com/pubs/70475/tr-2007-104.pdf
>
> Anyway, you can’t ask us to change our code to use your memory manager.
> You can, however, try to get your memory manager to work in WebKit,
> and post a patch if you get it working. If that patch is an improvement
> - in the sense that both you and the reviewers can apply the patch and
> confirm that it is in fact a progression and doesn’t break anything -
> then this would be the kind of thing we would accept.
>
> Having looked at your code a bit, I think that you’ll encounter the
> following problems:
> - Your code uses std::mutex for synchronization. std::mutex is quite
> slow. You should look at WTF::Lock, it’s much better (as in, orders of
> magnitude better).
> - Your code implements lifecycle management that is limited to reference
> counting. This is not adequate to support JS, DOM, and JIT semantics,
> which are based on solving arbitrary data flow equations over the
> reachability set.
> - It’s not clear that your allocator results in fast path code that is
> competitive against either of the JSC GC’s allocators. Both of those
> require ~5 instructions in the common case. That instruction count
> includes the oversize object safety checks.
> - It’s not clear that your allocator is compatible with JITing and
> standard JavaScript performance optimizations, which assume that values
> can be passed around as bits without calling into the runtime. A
> reference counter needs to do some kinds of memory operations on
> variable assignments. This is likely to be about a 2x-5x slow-down. I
> would expect a 2x slow-down if you did non-thread-safe reference
> counting, and 5x if you made it thread-safe.
>
> -Filip
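
To make the quoted point about reference counting on variable assignment concrete, here is a minimal C++ sketch. It is not block_ptr<>, shared_ptr, or any WebKit code: Node, Ref, and refs_after_assignment are hypothetical names invented for illustration. The Counter parameter switches between a plain counter (non-thread-safe) and std::atomic (thread-safe). The sketch only shows where the extra memory operations come from and does not measure the 2x-5x figures mentioned above.

```cpp
#include <atomic>
#include <cstddef>

// Hypothetical object header. Counter is std::size_t (non-thread-safe)
// or std::atomic<std::size_t> (thread-safe).
template <typename Counter>
struct Node {
    Counter refs{1};  // starts with one reference, owned by the creator
    int value{0};
};

// Hypothetical intrusive counted handle (illustration only).
template <typename Counter>
class Ref {
public:
    // Adopts the initial reference of a freshly allocated node.
    explicit Ref(Node<Counter>* n) : node_(n) {}

    Ref(const Ref& other) : node_(other.node_) {
        if (node_) ++node_->refs;  // one read-modify-write on copy
    }

    // Every assignment touches memory twice: an increment on the new
    // target and a decrement (plus a zero check) on the old one.
    Ref& operator=(const Ref& other) {
        if (other.node_) ++other.node_->refs;  // retain first: self-assignment safe
        release();
        node_ = other.node_;
        return *this;
    }

    ~Ref() { release(); }

    std::size_t count() const { return node_->refs; }

private:
    void release() {
        if (node_ && --node_->refs == 0) delete node_;
    }

    Node<Counter>* node_;
};

// Assigns one handle to another and reports the resulting count.
template <typename Counter>
std::size_t refs_after_assignment() {
    Ref<Counter> a(new Node<Counter>);
    Ref<Counter> b(new Node<Counter>);
    b = a;             // frees b's old node, bumps a's node to 2
    return a.count();  // both handles now share one node
}
```

By contrast, copying a raw pointer (for example `Node<std::size_t>* p = q;`) is a plain register or word copy with no write to the object, which is what "passed around as bits" refers to. With Counter set to std::atomic<std::size_t>, each increment and decrement becomes an atomic read-modify-write, which is typically more expensive than the plain version.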