[webkit-dev] HTML5 & MathML3 entities

Adam Barth abarth at webkit.org
Sun Jul 11 16:33:57 PDT 2010


On Sat, Jul 10, 2010 at 6:28 PM, Maciej Stachowiak <mjs at apple.com> wrote:
> On Jul 10, 2010, at 11:10 AM, Sausset François wrote:
>> I just saw that when looking at the code by myself.
>> What do you exactly mean by a prefix tree?
>
> The data structure commonly called a "Trie" is a prefix tree:
> http://en.wikipedia.org/wiki/Trie
>
> This data structure not only lets you tell if a particular key is present, but it also lets you check if a string you have could possibly be the prefix of any valid key.
>
> I think it is challenging, though, to make a trie structure that can be a compile-time constant, and building one dynamically will cost runtime memory per-process (whereas constant data would be in the data segment and shared).
>
> Another possibility is to make an array of all the entity names in sorted order. Then lookup can use a binary search, and on a failed lookup, looking to either side of the last key checked should determine whether it is a valid prefix.
>
> I expect binary search would be slower than Trie lookup, though I don't know by how much.

Binary search will certainly be easier to implement.  Let's start with
that and experiment with prefix trees as a possible performance
optimization.  I'll give it a try now.

Adam


More information about the webkit-dev mailing list