[webkit-dev] PageGroup and visited link coloring

Brett Wilson brettw at chromium.org
Mon Nov 10 11:12:17 PST 2008


I was recently looking at the PageGroup and visited link coloring.
Chromium has some interesting requirements. Our design goal is to
store hundreds of thousands to a million URLs in the database with no
problems (basically all your history forever). We have multiple
processes so we can't just have a local list of visited pages in each
renderer process.

Our solution to the first problem is to have 64-bit hashes (with 1
million visited links, you would get too many collisions using 32-bit
hashes like WebKit currently uses). Our solution to the second problem
is to have a dedicated multiprocess hash table. This dedicated system
manages its own hashing because we also salt the hashes, and the salt
must stay in sync across all processes.
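
To put rough numbers on the 32-bit versus 64-bit concern, here is a
minimal sketch. It assumes hashes are uniformly distributed. The
function names are made up for illustration and are not part of WebKit
or Chromium.

```cpp
#include <cmath>

// Approximate probability that a URL that was never visited hashes to
// the same value as one of the `visited` stored hashes:
// visited / 2^hashBits. Assumes uniformly distributed hashes.
double falsePositiveRate(double visited, int hashBits)
{
    return visited / std::ldexp(1.0, hashBits);
}

// Approximate expected number of colliding pairs among the stored
// hashes themselves (birthday bound): n * (n - 1) / 2 / 2^hashBits.
double expectedCollidingPairs(double visited, int hashBits)
{
    return visited * (visited - 1.0) / 2.0 / std::ldexp(1.0, hashBits);
}
```

With 1 million visited links, this estimate gives on the order of a
hundred colliding pairs for 32-bit hashes, and roughly a 0.02% chance
that any given unvisited link is colored as visited. For 64-bit hashes
both numbers are negligible.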

WebKit recently changed how visited link coloring works. It used to
call a global function, historyContains(), which was easy to integrate
into our system. The new system passes 32-bit hashes around and
maintains a global list of visited pages in the PageGroup. Neither of
these will work with our system.

My current idea is to create a new file, LinkHash, which has a typedef
for the hash type (rather than using unsigned everywhere) so we can
define it to be 64 bits in PLATFORM(CHROMIUM) while it remains 32 bits
for other platforms (or they can change it if they like). The file
also defines a visitedLinkHash function, which is moved from Document.
I have a patch for this, and it's very clean. I think it improves
things even without our porting constraint since almost 200 lines got
moved out of Document. This is described in
https://bugs.webkit.org/show_bug.cgi?id=22131
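
As a rough illustration of the typedef idea (not the contents of the
patch), a sketch might look like the following. LINK_HASH_64BIT is a
hypothetical switch standing in for PLATFORM(CHROMIUM), the function
takes a plain std::string rather than WebKit's URL types, and FNV-1a is
only a stand-in for whatever hash the real code uses.

```cpp
#include <cstdint>
#include <string>

// Hypothetical build switch standing in for PLATFORM(CHROMIUM).
#if defined(LINK_HASH_64BIT)
typedef uint64_t LinkHash;
const LinkHash kFnvOffsetBasis = 14695981039346656037ULL;
const LinkHash kFnvPrime = 1099511628211ULL;
#else
typedef uint32_t LinkHash;
const LinkHash kFnvOffsetBasis = 2166136261U;
const LinkHash kFnvPrime = 16777619U;
#endif

// Stand-in hash (FNV-1a) over the bytes of an already-canonicalized
// URL. Callers only see LinkHash, so the width can change per port
// without touching them.
inline LinkHash visitedLinkHash(const std::string& url)
{
    LinkHash hash = kFnvOffsetBasis;
    for (unsigned char c : url) {
        hash ^= c;
        hash *= kFnvPrime;
    }
    return hash;
}
```

The point of the typedef is that code passing hashes around declares
them as LinkHash instead of unsigned, so widening the type is a
one-line change per platform.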

The more complicated part is in PageGroup, which seems to basically be
the visited link database. I'm thinking of just providing a new
PageGroupChromium.cpp which contains a different implementation that
proxies these calls to our glue layer to be sent to our multiprocess
database.
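
One possible shape for this, sketched under assumptions: the visited
link queries go through a small interface, with one implementation
that keeps a local set and another that forwards every call to an
embedder-supplied callback (standing in for our glue layer). All class
and method names here are hypothetical and are not existing WebKit
APIs.

```cpp
#include <cstdint>
#include <functional>
#include <unordered_set>
#include <utility>

typedef uint64_t LinkHash; // assumed 64-bit, as in the Chromium case

// Hypothetical interface for whatever answers "is this link visited?".
class VisitedLinkStore {
public:
    virtual ~VisitedLinkStore() {}
    virtual bool isLinkVisited(LinkHash) = 0;
    virtual void addVisitedLink(LinkHash) = 0;
};

// In-process implementation: a plain set of hashes.
class LocalVisitedLinkStore : public VisitedLinkStore {
public:
    bool isLinkVisited(LinkHash hash) override { return m_hashes.count(hash) > 0; }
    void addVisitedLink(LinkHash hash) override { m_hashes.insert(hash); }

private:
    std::unordered_set<LinkHash> m_hashes;
};

// Proxy implementation: stores nothing itself and forwards each call
// to callbacks provided by the embedder.
class ProxyVisitedLinkStore : public VisitedLinkStore {
public:
    ProxyVisitedLinkStore(std::function<bool(LinkHash)> query,
                          std::function<void(LinkHash)> add)
        : m_query(std::move(query))
        , m_add(std::move(add))
    {
    }

    bool isLinkVisited(LinkHash hash) override { return m_query(hash); }
    void addVisitedLink(LinkHash hash) override { m_add(hash); }

private:
    std::function<bool(LinkHash)> m_query;
    std::function<void(LinkHash)> m_add;
};
```

An interface like this would avoid a port-specific PageGroup source
file, at the cost of a virtual call per lookup. Whether that trade-off
is acceptable is part of the question.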

However, I'm not sure what exactly the intent of PageGroup is. It's
clearly not intended that this be port-specific. Is there a cleaner
way to integrate our link database with the rest of WebKit?

Thanks,
Brett

