[Webkit-unassigned] [Bug 227150] New: [webkitcorepy] Add NestedFuzzyDict
bugzilla-daemon at webkit.org
bugzilla-daemon at webkit.org
Thu Jun 17 16:03:10 PDT 2021
https://bugs.webkit.org/show_bug.cgi?id=227150
Bug ID: 227150
Summary: [webkitcorepy] Add NestedFuzzyDict
Product: WebKit
Version: WebKit Nightly Build
Hardware: Unspecified
OS: Unspecified
Status: NEW
Severity: Normal
Priority: P2
Component: Tools / Tests
Assignee: webkit-unassigned at lists.webkit.org
Reporter: jbedard at apple.com
Hashes have some unique properties when it comes to dictionaries. A direct look-up is obviously fastest, but, and this is particularly relevant for git hashes, you don't actually need the entirety of the string to get a low collision rate. With as few as 4 characters, most hashes in the WebKit project won't collide. But the birthday paradox ensures that at least a few hashes will collide, even with as many as 8 or 9 characters. To address this problem, we should have a nested dictionary structure that allows partial look-up keys (as long as those keys meet some minimum length) while allowing a fuzzy look-up of the the remaining 36 characters of a SHA-1 hash.
This technique is generally applicable to strings, so webkitcorepy should own this object.
--
You are receiving this mail because:
You are the assignee for the bug.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.webkit.org/pipermail/webkit-unassigned/attachments/20210617/87c8a5bb/attachment.htm>
More information about the webkit-unassigned
mailing list