[webkit-dev] Webkit compatibility in India - Transcoding Indic fonts

Jungshik Shin (신정식, 申政湜) jshin at chromium.org
Wed Nov 19 10:42:04 PST 2008

2008/11/6 Prunthaban Kanthakumar <prunthaban at google.com>

> Hi All,
> This is a continuation of the mail thread
> https://lists.webkit.org/pipermail/webkit-dev/2008-October/005495.html
> I would like to discuss some ways to implement mjs's
> ideas.
> As mjs says in the above mail,
> *In case you look into implementing this, what I'd suggest is an extra CSS
> property that can be set based on the font property at style resolution time.
> (since I think the computed font list will strip EOT fonts, so it might be
> too late to look at it once you are on the rendering side). Something like
> -webkit-indic-text-decode. *
> When the code reaches RenderText::styleDidChange method, the font
> information will still remain in the RenderStyle object associated with the
> RenderText (because this happens at the time of parsing the html file, well
> before font resolution happens).  Now in this method, there is a check to see
> if there are text-transformations as part of the style and if there is one,
> then the method setText is called, forcing it to modify the 'internal text'
> if needed.
> Now we can do the following:
> 1. Add an additional condition in the styleDidChange method to check whether
> the font-family is supported by our transcoder (at present a fast look-up
> table should do, because we plan to support only a limited set of fonts).
> This condition will be #ifdefed on ENABLE(TRANSCODER_SUPPORT).

Shouldn't this be triggered by (font-family, site) rather than just
font-family?

> 2. Now in the setTextInternal method, based on the font-family, we get the
> corresponding transcoder (probably from a map) and perform the transcoding.
> Later, when font resolution happens, the particular font is EOT, so it
> will be ignored; based on the code points of the glyphs, a default font will
> be chosen by WebKit, and hence the correct characters will appear on the
> screen.
> Also after setTextInternal method there is a layout & width recalculation
> done which is important for us because we modify the characters. So
> RenderText::setTextInternal method seems to be the ideal place to plug-in
> the transcoder.
> On a related note, I would like to mention here that, we cannot go with the
> approach of 'one look-up table' per font-face and a single transcoder to do
> the look-up for all fonts. The problem is that many Indic languages use
> multiple code-points to represent one character and different fonts use
> different standards! For example there are situations where one glyph in EOT
> needs to be transcoded to 5+ Unicode code points. A reverse situation is
> also possible. Due to these issues, we cannot go with a simple look-up table
> for all fonts. This forces us to write some specialized code to handle each
> font (there might also be some fonts where a one-to-one look-up table will
> be enough).
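
To make the "transcoder from a map" idea in point 2 concrete, here is a
minimal sketch of what such a registry might look like, keyed on
(site, font-family) instead of font-family alone. Everything in it is an
assumption for illustration: TranscoderRegistry, Transcoder and
transcodeIfSupported are hypothetical names, not existing WebKit code, and
plain std::string / std::u32string stand in for WebKit's own string types.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>

// A transcoder rewrites text authored for a legacy font encoding into
// standard Unicode text. (Hypothetical type, for illustration only.)
using Transcoder = std::function<std::u32string(const std::u32string&)>;

// Lookup key: (site host, font-family).
using TranscoderKey = std::pair<std::string, std::string>;

class TranscoderRegistry {
public:
    // Registers a transcoder for one (site, font-family) pair.
    void add(const std::string& host, const std::string& fontFamily, Transcoder transcoder)
    {
        assert(transcoder); // An empty std::function would throw when called.
        m_transcoders[TranscoderKey(host, fontFamily)] = std::move(transcoder);
    }

    // Returns the registered transcoder, or nullptr if the pair is unsupported.
    const Transcoder* find(const std::string& host, const std::string& fontFamily) const
    {
        auto it = m_transcoders.find(TranscoderKey(host, fontFamily));
        return it == m_transcoders.end() ? nullptr : &it->second;
    }

    // Transcodes the text if the pair is supported; otherwise returns it unchanged.
    std::u32string transcodeIfSupported(const std::string& host,
                                        const std::string& fontFamily,
                                        const std::u32string& text) const
    {
        const Transcoder* transcoder = find(host, fontFamily);
        return transcoder ? (*transcoder)(text) : text;
    }

private:
    std::map<TranscoderKey, Transcoder> m_transcoders;
};
```

With something like this, the check added to styleDidChange would be a call
to find(), and setTextInternal would call transcodeIfSupported(). A font
that needs specialized code and a font that only needs a one-to-one table
can both sit behind the same Transcoder signature.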

In October, I listed two alternatives for this transformation. One is adding
ICU converters for Indic font encodings (it can deal with m-to-n mappings)
and the other is implementing your own. The first was ruled out because it's
not easy to add new converters on Mac OS X where ICU is a part of the OS.
There's another approach you can take: build ICU transliterator rules,
which seems to be the cleanest way to do this. You don't need to port or
implement conversion code (from another project, e.g. Padma); you just
need to 'port' the conversion tables to ICU transliterator rules.

This transcoding will be invoked on the content of a text node that is
already in Unicode, just as 'text-transform: capitalize' or 'text-transform:
lowercase' is. An ICU transform converts a chunk of Unicode text into
another chunk of Unicode text
( http://www.icu-project.org/userguide/Transform.html ), so it appears to be
an almost perfect fit.
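
To illustrate what a table of rewrite rules buys us for m-to-n mappings,
here is a small standalone sketch. It is not the ICU API and does not use
ICU rule syntax; it is a hand-rolled stand-in that applies "source run ->
target run" rules with longest-match-first precedence. Rule and
RuleTranscoder are hypothetical names, and any mappings fed to it would
have to come from the real conversion tables for a given font.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// One rewrite rule: a run of legacy code points maps to a run of
// Unicode code points. Either side may be longer than one code point.
struct Rule {
    std::u32string source;
    std::u32string target;
};

class RuleTranscoder {
public:
    explicit RuleTranscoder(std::vector<Rule> rules)
        : m_rules(std::move(rules))
    {
        for (const Rule& rule : m_rules)
            assert(!rule.source.empty()); // An empty source would never advance.
        // Longest source first, so multi-code-point rules win over their prefixes.
        std::stable_sort(m_rules.begin(), m_rules.end(),
            [](const Rule& a, const Rule& b) { return a.source.size() > b.source.size(); });
    }

    std::u32string transcode(const std::u32string& input) const
    {
        std::u32string output;
        std::size_t pos = 0;
        while (pos < input.size()) {
            const Rule* match = nullptr;
            for (const Rule& rule : m_rules) {
                if (input.compare(pos, rule.source.size(), rule.source) == 0) {
                    match = &rule;
                    break;
                }
            }
            if (match) {
                output += match->target;
                pos += match->source.size();
            } else {
                output += input[pos]; // No rule applies: copy the code point through.
                ++pos;
            }
        }
        return output;
    }

private:
    std::vector<Rule> m_rules;
};
```

A one-glyph-to-many-code-points mapping and a many-to-one mapping are both
just entries in the same rule list here, which is the property that makes a
rule-based transform attractive compared with per-font conversion code.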


P.S. BTW, I filed https://bugs.webkit.org/show_bug.cgi?id=22339 for this.
If you haven't filed one, why don't you use 22339 for uploading a prototype
patch for one (site, font) pair as Brett suggested?