[Webkit-unassigned] [Bug 227150] New: [webkitcorepy] Add NestedFuzzyDict

bugzilla-daemon at webkit.org bugzilla-daemon at webkit.org
Thu Jun 17 16:03:10 PDT 2021


https://bugs.webkit.org/show_bug.cgi?id=227150

            Bug ID: 227150
           Summary: [webkitcorepy] Add NestedFuzzyDict
           Product: WebKit
           Version: WebKit Nightly Build
          Hardware: Unspecified
                OS: Unspecified
            Status: NEW
          Severity: Normal
          Priority: P2
         Component: Tools / Tests
          Assignee: webkit-unassigned at lists.webkit.org
          Reporter: jbedard at apple.com

Hashes have some unique properties when it comes to dictionaries. A direct look-up is obviously fastest, but, and this is particularly relevant for git hashes, you don't actually need the entirety of the string to get a low collision rate. With as few as 4 characters, most hashes in the WebKit project won't collide. But the birthday paradox ensures that at least a few hashes will collide, even with as many as 8 or 9 characters. To address this problem, we should have a nested dictionary structure that allows partial look-up keys (as long as those keys meet some minimum length) while allowing a fuzzy look-up of the the remaining 36 characters of a SHA-1 hash.

This technique is generally applicable to strings, so webkitcorepy should own this object.

-- 
You are receiving this mail because:
You are the assignee for the bug.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.webkit.org/pipermail/webkit-unassigned/attachments/20210617/87c8a5bb/attachment.htm>


More information about the webkit-unassigned mailing list