[webkit-reviews] review granted: [Bug 224727] Use binary-search in LocaleToScriptMapping : [Attachment 426363] Patch

bugzilla-daemon at webkit.org bugzilla-daemon at webkit.org
Sat Apr 17 20:46:44 PDT 2021


Darin Adler <darin at apple.com> has granted Yusuke Suzuki <ysuzuki at apple.com>'s
request for review:
Bug 224727: Use binary-search in LocaleToScriptMapping
https://bugs.webkit.org/show_bug.cgi?id=224727

Attachment 426363: Patch

https://bugs.webkit.org/attachment.cgi?id=426363&action=review




--- Comment #2 from Darin Adler <darin at apple.com> ---
Comment on attachment 426363
  --> https://bugs.webkit.org/attachment.cgi?id=426363
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=426363&action=review

Great idea!

> Source/WebCore/ChangeLog:10
> +	   This patch removes HashMaps in LocaleToScriptMapping, and
binary-search onto the constant data arrays.
> +	   These maps are not frequently used. Keys of the maps can be encoded
into uint32_t or uint64_t so that
> +	   comparison becomes super cheap and we can initialize this array at
compile-time.

I’ve long wanted to make a re-usable binary search constant data array class.
Glad you are doing this here; there seem to be many other places using hash
tables that could do this sort of thing instead.

> Source/WebCore/platform/text/LocaleToScriptMapping.cpp:41
> +using ScriptName = uint32_t;

I’d like to see this use a bit more template machinery so we can do this for 4
and 8 without writing two separate sets of functions.

> Source/WebCore/platform/text/LocaleToScriptMapping.cpp:50
> +	   result |= (static_cast<ScriptName>(code) << ((4 - index - 1) *
CHAR_BIT));

I think we can use 8 instead of CHAR_BIT; I don’t think CHAR_BIT helps
readability or portability here.

> Source/WebCore/platform/text/LocaleToScriptMapping.cpp:55
> +static Optional<ScriptName> tryParseScriptName(StringView string)

Not 100% sure this needs "try" in its name. It is quite common throughout all
kinds of WebKit code for functions named parse to have failure modes and use
Optional to express that failure.

> Source/WebCore/platform/text/LocaleToScriptMapping.cpp:64
> +	   result |= (static_cast<ScriptName>(toASCIILower(code)) << ((4 -
index - 1) * CHAR_BIT));

I think we can use 8 instead of CHAR_BIT; I don’t think CHAR_BIT helps
readability or portability here.

Ideally we’d find a way to share code so we know the above is the same.

> Source/WebCore/platform/text/LocaleToScriptMapping.cpp:199
> +    auto* element = tryBinarySearch<ScriptNameCode>(scriptNameCodeList,
WTF_ARRAY_LENGTH(scriptNameCodeList), name.value(),

std::size should work and seems nicer to use than WTF_ARRAY_LENGTH

> Source/WebCore/platform/text/LocaleToScriptMapping.cpp:455
> +	   auto* element = tryBinarySearch<LocaleScript>(localeScriptList,
WTF_ARRAY_LENGTH(localeScriptList), localeName.value(),

std::size should work and seems nicer to use than WTF_ARRAY_LENGTH

Also would be nice to make another template like tryBinarySearch, maybe even an
overload of it, that handles the sizing automatically. Seems like it should be
able to deduce both the type and the size without an explicit template or
function argument for either.


More information about the webkit-reviews mailing list