<html>
<head>
<base href="https://bugs.webkit.org/" />
</head>
<body><span class="vcard"><a class="email" href="mailto:darin@apple.com" title="Darin Adler <darin@apple.com>"> <span class="fn">Darin Adler</span></a>
</span> changed
<a class="bz_bug_link
bz_status_NEW "
title="NEW - [INTL] Implement Collator Compare Functions"
href="https://bugs.webkit.org/show_bug.cgi?id=147604">bug 147604</a>
<br>
<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>What</th>
<th>Removed</th>
<th>Added</th>
</tr>
<tr>
<td style="text-align:right;">Attachment #267366 Flags</td>
<td>review?
</td>
<td>review-
</td>
</tr></table>
<p>
<div>
<b><a class="bz_bug_link
bz_status_NEW "
title="NEW - [INTL] Implement Collator Compare Functions"
href="https://bugs.webkit.org/show_bug.cgi?id=147604#c19">Comment # 19</a>
on <a class="bz_bug_link
bz_status_NEW "
title="NEW - [INTL] Implement Collator Compare Functions"
href="https://bugs.webkit.org/show_bug.cgi?id=147604">bug 147604</a>
from <span class="vcard"><a class="email" href="mailto:darin@apple.com" title="Darin Adler <darin@apple.com>"> <span class="fn">Darin Adler</span></a>
</span></b>
<pre>Comment on <span class=""><a href="attachment.cgi?id=267366&action=diff" name="attach_267366" title="Patch">attachment 267366</a> <a href="attachment.cgi?id=267366&action=edit" title="Patch">[details]</a></span>
Patch
View in context: <a href="https://bugs.webkit.org/attachment.cgi?id=267366&action=review">https://bugs.webkit.org/attachment.cgi?id=267366&action=review</a>
Looks like great work. Many good improvements, not just the new feature.
review- because the string object lifetime mistake in IntlCollatorFuncCompare is a serious one.
<span class="quote">> Source/JavaScriptCore/runtime/IntlCollator.cpp:104
> + keyLocaleData.append(String());</span >
Alternate syntax for null in this context is { } rather than String(). I slightly prefer the former.
<span class="quote">> Source/JavaScriptCore/runtime/IntlCollator.cpp:125
> + keyLocaleData.append(String(collation));</span >
This treats the keyword value strings as ASCII or Latin-1, not UTF-8. Is that correct? Also, an explicit conversion to String() is not needed; please just write append(collation).
To create the strings as UTF-8, we would use String::fromUTF8(collation) but we also have to consider that the string might be invalid UTF-8 which would result in a null string.
<span class="quote">> Source/JavaScriptCore/runtime/IntlCollator.cpp:150
> + keyLocaleData.append(String());</span >
Alternate syntax for null in this context is { } rather than String(). I slightly prefer the former.
<span class="quote">> Source/JavaScriptCore/runtime/IntlCollator.cpp:170
> + Vector<String> requestedLocales = canonicalizeLocaleList(state, locales);</span >
I like to use auto for the return value type in a case like this.
<span class="quote">> Source/JavaScriptCore/runtime/IntlCollator.cpp:220
> + opt.set(ASCIILiteral("localeMatcher"), matcher);</span >
More efficient to use the add function here rather than the set function.
<span class="quote">> Source/JavaScriptCore/runtime/IntlCollator.cpp:241
> + opt.set(ASCIILiteral("kn"), numericString);</span >
Ditto.
<span class="quote">> Source/JavaScriptCore/runtime/IntlCollator.cpp:247
> + opt.set(ASCIILiteral("kf"), caseFirst);</span >
Ditto.
<span class="quote">> Source/JavaScriptCore/runtime/IntlCollator.cpp:252
> + const HashSet<String>& availableLocales = state.callee()->globalObject()->intlCollatorAvailableLocales();</span >
I like auto& here better for the return type.
<span class="quote">> Source/JavaScriptCore/runtime/IntlCollator.cpp:253
> + HashMap<String, String> result = resolveLocale(availableLocales, requestedLocales, opt, relevantExtensionKeys, sizeof(relevantExtensionKeys) / sizeof(relevantExtensionKeys[0]), localeData);</span >
I like auto here better for the return type.
WTF_ARRAY_LENGTH is preferred over that sizeof expression. Or perhaps we can use std::array and use relevantExtensionKeys.size().
<span class="quote">> Source/JavaScriptCore/runtime/IntlCollator.cpp:329
> + ASSERT(!state.hadException());</span >
Why can we assert this? What guarantees it is true. If we can assert here, why can’t we assert along the way doing each step inside initializeCollator itself?
<span class="quote">> Source/JavaScriptCore/runtime/IntlCollator.cpp:332
> + if (!m_collator) {</span >
I suggest putting the code to allocate a new collator into a separate function. So the code here would be:
if (!m_collator) {
createCollator();
if (!m_collator)
return state.vm().throwException(&state, createError(&state, ASCIILiteral("error message here")));
}
The code to call initializeCollator could also possibly go into the createCollator function for the same reason.
I think this would make this function easier to read and possible even more efficient.
<span class="quote">> Source/JavaScriptCore/runtime/IntlCollator.cpp:379
> + UColAttributeValue strength;
> + UColAttributeValue caseLevel = UCOL_OFF;
> + switch (m_sensitivity) {
> + case Sensitivity::Base:
> + strength = UCOL_PRIMARY;
> + break;
> + case Sensitivity::Accent:
> + strength = UCOL_SECONDARY;
> + break;
> + case Sensitivity::Case:
> + strength = UCOL_PRIMARY;
> + caseLevel = UCOL_ON;
> + break;
> + case Sensitivity::Variant:
> + strength = UCOL_TERTIARY;
> + break;
> + default:
> + ASSERT_NOT_REACHED();
> + }
> + ucol_setAttribute(collator, UCOL_STRENGTH, strength, &status);
> + if (U_FAILURE(status))
> + goto fail;
> + ucol_setAttribute(collator, UCOL_CASE_LEVEL, caseLevel, &status);
> + if (U_FAILURE(status))
> + goto fail;
> +
> + ucol_setAttribute(collator, UCOL_NUMERIC_COLLATION, m_numeric ? UCOL_ON : UCOL_OFF, &status);
> + if (U_FAILURE(status))
> + goto fail;
> +
> + // FIXME: Setting UCOL_ALTERNATE_HANDLING to UCOL_SHIFTED causes punctuation and whitespace to be
> + // ignored. There is currently no way to ignore only punctuation.
> + ucol_setAttribute(collator, UCOL_ALTERNATE_HANDLING, m_ignorePunctuation ? UCOL_SHIFTED : UCOL_DEFAULT, &status);
> + if (U_FAILURE(status))
> + goto fail;
> +
> + // "The method is required to return 0 when comparing Strings that are considered canonically
> + // equivalent by the Unicode standard."
> + ucol_setAttribute(collator, UCOL_NORMALIZATION_MODE, UCOL_ON, &status);
> + if (U_FAILURE(status))
> + goto fail;
> +</span >
Is there any other place we can share this code with? It’s pretty long winded and I seem to have seen code like this before.
<span class="quote">> Source/JavaScriptCore/runtime/IntlCollator.cpp:384
> + return state.vm().throwException(&state, createError(&state, ASCIILiteral("ucol_setAttribute failed.")));</span >
This does not seem like an acceptable string for an error seen by the web. When would this exception occur? Why is it useful to give the function name ucol_setAttribute to the web developer?
<span class="quote">> Source/JavaScriptCore/runtime/IntlCollator.h:50
> + JSValue compareStrings(ExecState&, StringView x, StringView y);</span >
I don’t think the names “x” and “y” here add anything. Please omit them.
<span class="quote">> Source/JavaScriptCore/runtime/IntlCollator.h:67
> + enum class Usage {
> + Sort,
> + Search
> + };</span >
I would find this more readable as a one-liner.
<span class="quote">> Source/JavaScriptCore/runtime/IntlCollator.h:74
> + enum class Sensitivity {
> + Base,
> + Accent,
> + Case,
> + Variant
> + };</span >
I would find this more readable as a one-liner.
<span class="quote">> Source/JavaScriptCore/runtime/IntlCollatorPrototype.cpp:98
> + StringView x = state->argument(0).toString(state)->view(state);</span >
This won’t work. We are creating a StringView pointing to the result of toString, but then the garbage collector is allowed to destroy that object. Instead the local variable needs to be of type JSString* and the call to view needs to be separate. I suggest calling view in the line that calls compareStrings.
<span class="quote">> Source/JavaScriptCore/runtime/IntlCollatorPrototype.cpp:104
> + StringView y = state->argument(1).toString(state)->view(state);</span >
Ditto.
<span class="quote">> Source/JavaScriptCore/runtime/IntlObject.cpp:118
> + String locale(uloc_getDefault());</span >
Is the result of uloc_getDefault guaranteed to be ASCII or Latin-1? Or could it be UTF-8? If it’s UTF-8 then we need to write this code differently.
<span class="quote">> Source/JavaScriptCore/runtime/IntlObject.cpp:202
> state.vm().throwException(&state, createRangeError(&state, String(notFound)));</span >
This explicit conversion to String should not be needed. Please remove it.
<span class="quote">> Source/JavaScriptCore/runtime/IntlObject.cpp:211
> - return fallback;
> + return String(fallback);</span >
This explicit conversion to String should not be needed. Please don’t make this change.
<span class="quote">> Source/JavaScriptCore/runtime/IntlObject.cpp:759
> - supportedExtensionAddition = "-" + key + '-' + value;
> + supportedExtensionAddition = String("-") + key + '-' + value;</span >
More efficient idiom for this is:
supportedExtensionAddition = makeString('-', key, '-', value);
Avoids extra work to allocate and destroy a single character string for "-".</pre>
</div>
</p>
<hr>
<span>You are receiving this mail because:</span>
<ul>
<li>You are the assignee for the bug.</li>
</ul>
</body>
</html>