[Webkit-unassigned] [Bug 30437] Japanese Text Search Problem

bugzilla-daemon at webkit.org bugzilla-daemon at webkit.org
Mon Oct 19 15:26:19 PDT 2009


https://bugs.webkit.org/show_bug.cgi?id=30437


Darin Adler <darin at apple.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #41396|review?                     |review+
               Flag|                            |




--- Comment #7 from Darin Adler <darin at apple.com>  2009-10-19 15:26:19 PDT ---
(From update of attachment 41396)
> +        * editing/TextIterator.cpp:
> +        (WebCore::):
> +        (WebCore::createSearcher):

The line saying just "WebCore::" should be deleted.

> +// Tailored collation rules for Japanese text search.
> +// The default Unicode Collation Algorithm is unnatural in Japanese.
> +// These rules intend to treat the following characters as different characters.
> +//
> +// - Small kana letters and normal kana letters
> +// - Voiceless letters, voiced letters and semi-voiced letters
> +//

This comment should document where this array came from. Is this original work
or did you copy this here from some other project?

> +static const UChar JAPANESE_KANA_COLLATION_RULES[] = {

This array should not have a name that's in all capitals. Those names are
reserved for macros.

>      UErrorCode status = U_ZERO_ERROR;
>      UStringSearch* searcher = usearch_open(&newlineCharacter, 1, &newlineCharacter, 1, currentSearchLocaleID(), 0, &status);
>      ASSERT(status == U_ZERO_ERROR || status == U_USING_FALLBACK_WARNING || status == U_USING_DEFAULT_WARNING);
> +    
> +    static UCollator* collator = 0;
> +    if (!collator) {
> +        // Set tailored collation rules to fix Japanese text search.
> +        // See the comments before JAPANESE_KANA_COLLATION_RULES for details.
> +        status = U_ZERO_ERROR;
> +        collator = ucol_openRules(JAPANESE_KANA_COLLATION_RULES, -1, UCOL_DEFAULT,
> +                                  UCOL_DEFAULT_STRENGTH, 0, &status);
> +        ASSERT(status == U_ZERO_ERROR);
> +        status = U_ZERO_ERROR;
> +        usearch_setCollator(searcher, collator, &status);
> +        ASSERT(status == U_ZERO_ERROR);
> +        usearch_reset(searcher);
> +    }

This is OK, but not quite right.

The usearch_setCollator and usearch_reset calls should be outside the if
statement, since they are part of the creation of searcher, not the creation of
the collator. However, since the createSearcher function is only called once,
there's no problem in practice.

The entire function should ideally be refactored to match the way the
searcher() and createSearcher() function work. There would be a collator() and
createCollator() function, and createSearcher() would call collator().

Otherwise, the patch looks good. I'm going to say r=me because I think it's OK
to land this patch as is, but I think it would be even better if you took my
suggestions above.

-- 
Configure bugmail: https://bugs.webkit.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.



More information about the webkit-unassigned mailing list