[webkit-reviews] review denied: [Bug 15914] [GTK] Implement Unicode functionality using GLib : [Attachment 26711] 1/4 - Moving WTF Unicode backend to GLib

Thu Jan 15 10:26:01 PST 2009

Darin Adler <darin at apple.com> has denied Dominik Röttsches
<dominik.roettsches at access-company.com>'s request for review:
Bug 15914: [GTK] Implement Unicode functionality using GLib
https://bugs.webkit.org/show_bug.cgi?id=15914

Attachment 26711: 1/4 - Moving WTF Unicode backend to GLib
https://bugs.webkit.org/attachment.cgi?id=26711&action=review

------- Additional Comments from Darin Adler <darin at apple.com>
> Index: JavaScriptCore/wtf/unicode/glib/CasefoldTableFromGLib.h
> ===================================================================
> --- JavaScriptCore/wtf/unicode/glib/CasefoldTableFromGLib.h	(revision 0)
> +++ JavaScriptCore/wtf/unicode/glib/CasefoldTableFromGLib.h	(revision 0)
> @@ -0,0 +1,200 @@
> +/*
> + *  Copyright (C) 2006 George Staikos <staikos at kde.org>
> + *  Copyright (C) 2006 Alexey Proskuryakov <ap at nypop.com>
> + *  Copyright (C) 2007 Apple Computer, Inc. All rights reserved.
> + *  Copyright (C) 2008 JÃ¼rg Billeter <j at bitron.ch>
> + *  Copyright (C) 2008 Dominik RÃ¶ttsches
<dominik.roettsches at access-company.com>

Is this file really needed?

I don't think the copyrights are right on this file.

> +} casefold_table[] = {

This is not the normal naming for identifiers in WebKit. This would typically
be inside the WTF::Unicode namespace and be named something like caseFoldTable
or CaseFoldTable.

> +#include "CasefoldTableFromGLib.h"

Do we really need to compile in a copy of the table into each source file? That
seems like a very bad idea. Since the case folding table is declared static, it
will have internal linkage so each source file that uses this header will get a
separate copy of the table!

> +	   inline UChar32 foldCase(UChar32 ch)

These functions are pretty long. I don't think it makes sense to inline them.

> +	   inline int umemcasecmp(const UChar* a, const UChar* b, int len)
> +	   {
> +	       GOwnPtr<char> utf8a;
> +	       GOwnPtr<char> utf8b;
> +
> +	       utf8a.set(g_utf16_to_utf8(a, len, 0, 0, 0));
> +	       utf8b.set(g_utf16_to_utf8(b, len, 0, 0, 0));
> +
> +	       GOwnPtr<char> foldedA;
> +	       GOwnPtr<char> foldedB;
> +
> +	       foldedA.set(g_utf8_casefold(utf8a.get(), -1));
> +	       foldedB.set(g_utf8_casefold(utf8b.get(), -1));
> +
> +	       int result = g_utf8_collate(foldedA.get(), foldedB.get());
> +
> +	       return result;
> +	   }

Converting to UTF-8 just to do the case folding is going to be very
inefficient. This is going to result in a quite-slow GTK port if it's used
anywhere.

In general we had to work hard to eliminate the conversion from UTF-16 and
UTF-8 and back that used to be present in WebKit and JavaScriptCore in its core
algorithms, and move that conversion to the external API. I think you need to
do some performance tests if you're going to introduce memory allocation and
UTF-8 conversion into these algorithms, unless it's OK to have the performance
of the GTK port suffer.

> +#ifndef UnicodeGLibTypes_h
> +#define UnicodeGLibTypes_h
> +
> +typedef uint16_t UChar;
> +typedef int32_t UChar32;
> +
> +#endif

Seems a little excessive to have a separate header file for these, but it might
be OK.

I'm going to say review- because of some of the issues with the case folding
table and the performance of the UTF-8 conversion.