[webkit-reviews] review granted: [Bug 167991] JSC: Intl API should ignore encoding when parsing BCP 47 language tag from ISO 15897 locale string (passed via LANG) : [Attachment 345787] Patch
bugzilla-daemon at webkit.org
bugzilla-daemon at webkit.org
Wed Jul 25 16:35:51 PDT 2018
Michael Catanzaro <mcatanzaro at igalia.com> has granted Andy VanWagoner
<andy at vanwagoner.family>'s request for review:
Bug 167991: JSC: Intl API should ignore encoding when parsing BCP 47 language
tag from ISO 15897 locale string (passed via LANG)
https://bugs.webkit.org/show_bug.cgi?id=167991
Attachment 345787: Patch
https://bugs.webkit.org/attachment.cgi?id=345787&action=review
--- Comment #13 from Michael Catanzaro <mcatanzaro at igalia.com> ---
Comment on attachment 345787
--> https://bugs.webkit.org/attachment.cgi?id=345787
Patch
View in context: https://bugs.webkit.org/attachment.cgi?id=345787&action=review
> Source/JavaScriptCore/runtime/IntlObject.cpp:138
> + auto length = uloc_toLanguageTag(localeID, buffer.data(), buffer.size(),
false, &status);
> + if (status == U_BUFFER_OVERFLOW_ERROR) {
Gotta call out the ICU developers for this terrible API design, making you call
the function twice rather than just returning the proper amount of allocated
memory....
> Source/JavaScriptCore/runtime/IntlObject.cpp:144
> + return String(buffer.data(), length);
I believe the langtag buffer is filled with ASCII, not UTF-16, guessing from
the API reference [1] and [2], which says: "In some APIs, ICU uses char *
strings. This is either for file system paths or for strings that contain
invariant characters only (such as locale identifiers). These strings are in
the platform-specific encoding of either ASCII or EBCDIC." It's only
appropriate to pass the data directly to the String constructor if it's already
UTF-16. So assuming we don't care about EBCDIC, and considering that all ASCII
is valid UTF-*, we should use String::fromUTF8() to create the return value,
rather than directly using the String constructor.
[1]
https://ssl.icu-project.org/apiref/icu4c/uloc_8h.html#a1d50c91925ca3853fce6f28c
f7390c3c
[2] http://userguide.icu-project.org/strings
> Source/JavaScriptCore/runtime/IntlObject.cpp:621
> + return "en"_s;
Huh, I didn't know we had this user-defined literal. How about that.
> LayoutTests/js/intl-default-locale.html:15
> +shouldBe("new Intl.DateTimeFormat().resolvedOptions().locale", "'tlh'");
> +shouldBe("new Intl.NumberFormat().resolvedOptions().locale", "'tlh'");
> +shouldBe("new Intl.Collator().resolvedOptions().locale", "'tlh'");
Uh, perhaps I don't understand something here, but why is 'tlh' a good default
locale?
More information about the webkit-reviews
mailing list