[Webkit-unassigned] [Bug 161581] New: Speed up GC by using indexed collections to track MarkedBlock::Handles and WeakSets and using bitvectors to track their states
bugzilla-daemon at webkit.org
bugzilla-daemon at webkit.org
Sun Sep 4 18:59:12 PDT 2016
https://bugs.webkit.org/show_bug.cgi?id=161581
Bug ID: 161581
Summary: Speed up GC by using indexed collections to track
MarkedBlock::Handles and WeakSets and using bitvectors
to track their states
Classification: Unclassified
Product: WebKit
Version: WebKit Nightly Build
Hardware: All
OS: All
Status: NEW
Severity: Normal
Priority: P2
Component: JavaScriptCore
Assignee: webkit-unassigned at lists.webkit.org
Reporter: fpizlo at apple.com
For example, each MarkedAllocator could have a Vector<std::unique_ptr<MarkedBlock::Handle>> m_blocks. Then it could have the following sets represented by bitvectors indexed by the handle's index in m_blocks:
- m_canAllocate, which contains all non-empty non-retired blocks.
- m_snapshot, which contains all blocks in the block snapshot.
- m_empty, which contains all empty blocks.
- m_eden, which contains all blocks that have new objects.
- m_allocated, which contains all blocks that have been completely filled up.
If we do this, then we can get rid of block state, the m_blocksWithNewObjects vector, and the block snapshot vector. The following operations become cheap operations over bitvectors:
- All of the flips:
- Full GC flip: m_empty = 0, m_canAllocate = 0, m_allocated = 0.
- Eden GC flip: m_canAllocate |= m_eden, m_allocated = 0.
- Full GC snapshot: m_snapshot = 1.
- Eden GC snapshot: m_snapshot |= m_eden.
- All of the iterations:
- Allocate: scan m_canAllocate, then scan m_empty, then scan m_empty of other size classes (so we can rebalance the heap cheaply!).
- Incremental sweep: scan (m_snapshot & ~m_allocated).
We could allocate WeakSets outside MarkedBlocks, and have them also be tracked in their own Vector<std::unique_ptr<WeakSet>>, owned by MarkedSpace, and also track various sets of them using bitvectors.
This ought to reduce any per-block overheads by so much that they should appear invisible. We did lose some perf by going from 64KB to 16KB even with many optimizations, so this may be enough to bring us back to where we were before. For example, consider what happens when we are using 200MB. That means having to sometimes consider about 13000 blocks, which the current code does by looping over that many things, and those things are unlikely to reside on the same cache line. But with bitvectors, the typical flipping operation would only have to touch 13000 / 8 = 1625 bytes of contiguous bits. Using 64-bit operations, that means only having to loop over about 200 words. That means that the previous flipping operations that would take about 13000 iterations just to touch each block will now only require 200 iterations - as in we actually have a chance of speeding up those operations by 65x.
--
You are receiving this mail because:
You are the assignee for the bug.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.webkit.org/pipermail/webkit-unassigned/attachments/20160905/9db0b6fc/attachment.html>
More information about the webkit-unassigned
mailing list