[webkit-reviews] review granted: [Bug 167991] JSC: Intl API should ignore encoding when parsing BCP 47 language tag from ISO 15897 locale string (passed via LANG) : [Attachment 345787] Patch

bugzilla-daemon at webkit.org bugzilla-daemon at webkit.org
Wed Jul 25 16:35:51 PDT 2018


Michael Catanzaro <mcatanzaro at igalia.com> has granted Andy VanWagoner
<andy at vanwagoner.family>'s request for review:
Bug 167991: JSC: Intl API should ignore encoding when parsing BCP 47 language
tag from ISO 15897 locale string (passed via LANG)
https://bugs.webkit.org/show_bug.cgi?id=167991

Attachment 345787: Patch

https://bugs.webkit.org/attachment.cgi?id=345787&action=review




--- Comment #13 from Michael Catanzaro <mcatanzaro at igalia.com> ---
Comment on attachment 345787
  --> https://bugs.webkit.org/attachment.cgi?id=345787
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=345787&action=review

> Source/JavaScriptCore/runtime/IntlObject.cpp:138
> +    auto length = uloc_toLanguageTag(localeID, buffer.data(), buffer.size(),
false, &status);
> +    if (status == U_BUFFER_OVERFLOW_ERROR) {

Gotta call out the ICU developers for this terrible API design, making you call
the function twice rather than just returning the proper amount of allocated
memory....

> Source/JavaScriptCore/runtime/IntlObject.cpp:144
> +	   return String(buffer.data(), length);

I believe the langtag buffer is filled with ASCII, not UTF-16, guessing from
the API reference [1] and [2], which says: "In some APIs, ICU uses char *
strings. This is either for file system paths or for strings that contain
invariant characters only (such as locale identifiers). These strings are in
the platform-specific encoding of either ASCII or EBCDIC." It's only
appropriate to pass the data directly to the String constructor if it's already
UTF-16. So assuming we don't care about EBCDIC, and considering that all ASCII
is valid UTF-*, we should use String::fromUTF8() to create the return value,
rather than directly using the String constructor.

[1]
https://ssl.icu-project.org/apiref/icu4c/uloc_8h.html#a1d50c91925ca3853fce6f28c
f7390c3c
[2] http://userguide.icu-project.org/strings

> Source/JavaScriptCore/runtime/IntlObject.cpp:621
> +    return "en"_s;

Huh, I didn't know we had this user-defined literal. How about that.

> LayoutTests/js/intl-default-locale.html:15
> +shouldBe("new Intl.DateTimeFormat().resolvedOptions().locale", "'tlh'");
> +shouldBe("new Intl.NumberFormat().resolvedOptions().locale", "'tlh'");
> +shouldBe("new Intl.Collator().resolvedOptions().locale", "'tlh'");

Uh, perhaps I don't understand something here, but why is 'tlh' a good default
locale?


More information about the webkit-reviews mailing list