[webkit-dev] compact ICU unicode

Glenn Adams glenn at skynav.com
Thu Jun 13 18:31:34 PDT 2013

On Sat, Jun 8, 2013 at 3:15 AM, Salisbury, Mark <mark.salisbury at hp.com> wrote:

> Hello,
> What would people think about including specific ICU data tables in WTF in
> order to provide a lightweight (but functional) unicode implementation?

FWIW, I'd suggest you port ICU to your platform or, if the size is too
large, port only the portion of it that WK uses, and then use that portion.
However, I think the ICU library or even a subset should NOT be added to

> On embedded systems the size of ICU is prohibitive.  Determining the right
> way to package it to make it small enough isn't simple either.
> A patch was reviewed once that attempted to add ICU data tables directly
> in WTF and there were two concerns:
> 1) Checking in generated files (
> https://bugs.webkit.org/show_bug.cgi?id=27305#c8)
> 2) Questions concerning if the ICU license is compatible with WebCore (
> https://bugs.webkit.org/show_bug.cgi?id=27305#c9)
> I believe the patch could be done differently so as not to check in generated
> files.  Regarding the second concern, ICU has a very permissive license (
> http://www.icu-project.org/repos/icu/icu/trunk/license.html).  There are
> three requirements, basically that the copyright and permission notice has
> to appear with copies of the software.  I believe that is already a
> requirement for distributions of webkit that use ICU.  Except for WCHAR
> Unicode, I believe all webkit builds now use ICU Unicode.
> This Unicode path could replace WCHAR_UNICODE or be introduced as a third
> option, call it what you like - BASIC_ICU_UNICODE, ICU_LITE_UNICODE,
> COMPACT_ICU_UNICODE, etc.  I think it might be valuable for other ports
> that are size conscious - the up-and-coming NIX port comes to mind.
> Thanks,
> Mark
>
> Background:
> After rebasing my WinCE port of webkit, I ran into an ASSERT in
> WebCore/platform/text/wchar/TextBreakIteratorWchar.cpp,
> acquireLineBreakIterator().  I thought I'd be able to easily fix this,
> since I had already modified how LineBreakIterator works to take prior
> context into account (on my own branch) and find line breaks in a stream of
> non-ASCII characters.
> However, the WCHAR Unicode implementation is very bare bones and does not
> even support returning the Unicode character category (
> http://trac.webkit.org/browser/trunk/Source/WTF/wtf/unicode/wchar/UnicodeWchar.cpp#L35).
> WCHAR Unicode was originally called WinCE Unicode, then it was properly
> renamed as it had nothing to do with WinCE.
> WinCE Unicode originally came in here:
> https://bugs.webkit.org/show_bug.cgi?id=27305.  The reason it was
> introduced was to save space (filesystem and RAM).  ICU, if not packaged
> very carefully (http://userguide.icu-project.org/packaging), is actually
> larger than webkit itself.  On embedded systems, this is a big deal.  The
> original plan with the bug above was to include specific ICU data tables in
> webkit.
> I've been compiling WTF with Unicode tables embedded for some time now.  I
> don't believe I've seen many layout test regressions due to using a
> simplified ICU implementation.