[Webkit-unassigned] [Bug 147604] [INTL] Implement Collator Compare Functions

bugzilla-daemon at webkit.org bugzilla-daemon at webkit.org
Sun Nov 8 15:28:36 PST 2015


https://bugs.webkit.org/show_bug.cgi?id=147604

--- Comment #15 from Sukolsak Sakshuwong <sukolsak at gmail.com> ---
(In reply to comment #9)
> Comment on attachment 263825 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=263825&action=review
> 
> Looks OK. I see a lot of room for refinement and improvement.
> 
> > Source/JavaScriptCore/runtime/IntlCollator.h:61
> > +    bool initializedCollator() const { return m_initializedCollator; }
> 
> The naming here is unfortunate. This sounds like a function that returns a
> collator that has been initialized.
> 
> > Source/JavaScriptCore/runtime/IntlCollator.h:62
> > +    void setInitializedCollator(bool initializedCollator) { m_initializedCollator = initializedCollator; }
> 
> The naming here is unfortunate. This sounds like a function that takes a
> collator as an argument, not a function that takes a boolean.
> 
> Using public functions for this is taking the standards language too
> literally. We should keep this boolean private and write a member function
> that initializes if needed and returns if not.
> 
> > Source/JavaScriptCore/runtime/IntlCollatorConstructor.h:63
> > +IntlCollator* initializeCollator(ExecState*, IntlCollator*, JSValue locales, JSValue optionsValue);
> 
> The interface to this function is strange. Why does the function both take a
> collator as an argument and return one. If the function needs to indicate
> failure, I suggest it return a boolean or error code. The ExecState argument
> should be ExecState&, not ExecState*, and the collator argument should be
> IntlCollator&, not IntlCollator*.

I made initializeCollator, resolvedOptions, and compareStrings member functions of IntlCollator. We now don't need to expose the internal slots of IntlCollator objects such as usage, locale, collation, sensitivity, etc.

I also made initializeCollator returns if the collator has already been initialized.

> > Source/JavaScriptCore/runtime/IntlCollatorPrototype.cpp:90
> > +static int compareStrings(IntlCollator* collator, const String& x, const String& y)
> 
> This should take an IntlCollator&, not an IntlCollator*.
> 
> Also should probably take two StringView arguments, not two const String&
> arguments.

Fixed.

> > Source/JavaScriptCore/runtime/IntlCollatorPrototype.cpp:94
> > +    UCollator* icuCollator = ucol_open(collator->locale().utf8().data(), &status);
> 
> For acceptable performance, I think it’s important that we not use ucol_open
> and ucol_close every time. I tried that in code in WebCore and found that
> everything was much too slow. I think we probably need to put the UCollator
> in the IntlCollator, not create a new one every time.
> 
> Doing all that work to initialize the collator attributes every time we
> compare a pair of strings is going to result in very poor performance.
> 
> We also need to come up with a fast path so we aren’t converting simple
> strings with all ASCII characters to 16 bit just to call ICU with them.
> Again, we’ve run into this in the past. We need performance tests for this.

Cached UCollator in IntlCollator and called ucol_close in IntlCollator::~IntlCollator().

> > Source/JavaScriptCore/runtime/IntlCollatorPrototype.cpp:96
> > +        return codePointCompare(x, y);
> 
> It is really OK to silently do the wrong thing if ucol_open fails?

Used RELEASE_ASSERT(U_SUCCESS(status)).

> > Source/JavaScriptCore/runtime/IntlCollatorPrototype.cpp:100
> > +    const String& sensitivity = collator->sensitivity();
> 
> Is this attribute string itself supposed to be case folding or case
> sensitive? This is not a good pattern, doing string comparisons to convert
> the collator’s sensitivity setting into arguments for the ICU collator every
> time we compare a string.

Added enum classes for [[sensitivity]] and [[usage]] and added functions for converting them to strings.

> > Source/JavaScriptCore/runtime/IntlCollatorPrototype.cpp:134
> > +    String x16Bit = x.is8Bit() ? String::make16BitFrom8BitSource(x.characters8(), x.length()) : x;
> > +    String y16Bit = y.is8Bit() ? String::make16BitFrom8BitSource(y.characters8(), y.length()) : y;
> > +    UCollationResult result = ucol_strcoll(icuCollator, x16Bit.characters16(), x16Bit.length(), y16Bit.characters16(), y16Bit.length());
> 
> This is the wrong idiom for making a temporary 16-bit copy of a string. The
> correct idiom is to use upconvertedCharacters().

Fixed.

> > Source/JavaScriptCore/runtime/IntlCollatorPrototype.cpp:147
> > +    switch (result) {
> > +    case UCOL_EQUAL:
> > +        return 0;
> > +    case UCOL_GREATER:
> > +        return 1;
> > +    case UCOL_LESS:
> > +        return -1;
> > +    }
> > +    ASSERT_NOT_REACHED();
> > +    return 0;
> 
> It’s OK to try to use some abstraction here and not assume the values, but
> it’s really not useful. This named constants are just ICU names for these
> same values. There is no benefit to the switch. This can and should just be:
> 
>     return result;

Fixed.

> > Source/JavaScriptCore/runtime/IntlCollatorPrototype.cpp:157
> > +    RELEASE_ASSERT(collator);
> 
> I don’t see why we would have this assertion in release code.

Changed to ASSERT(collator).

> > Source/JavaScriptCore/runtime/IntlObject.cpp:118
> > +    return String(uloc_getDefault()).replace('_', '-');
> 
> We should stop copying and pasting the replace we are doing here.

Added convertICULocaleToBCP47LanguageTag().

-- 
You are receiving this mail because:
You are the assignee for the bug.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.webkit.org/pipermail/webkit-unassigned/attachments/20151108/56fbad51/attachment.html>


More information about the webkit-unassigned mailing list