[Webkit-unassigned] [Bug 22339] Support Indian web sites with EOT by on-the-fly transcoding to Unicode

Wed Jan 21 03:57:40 PST 2009

https://bugs.webkit.org/show_bug.cgi?id=22339

prunthaban at google.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #26367|0                           |1
        is obsolete|                            |
  Attachment #26892|                            |review?
               Flag|                            |

------- Comment #9 from prunthaban at google.com  2009-01-21 03:57 PDT -------
Created an attachment (id=26892)
 --> (https://bugs.webkit.org/attachment.cgi?id=26892&action=view)
Patch to support on-the-fly transcoding for one font (new version)

Here is a summary of changes made in this new patch:

* The naming of classes has been changed. Earlier I had Transformer and
Transcoder. One of the classes were removed (the functionality has been moved
to a common class because they are very similar). Now the class is called
FontTranscoder. Other classes has been renamed as well as per Darin's review
comments.
* The global setting is renamed to TEXT_TRANSCODER. I think this should be
fine.
* The efficiency of the transcoding process has been improved in this new
patch. I have made the converters to act on a range of UChars in one shot. Also
all String building methods now use Vector<UChar> to build the string and
finally use String::adopt().
* The classes has been moved under platform/text. But I am still keeping them
under a special directory because we need to add a dozen more Encoders and
Converters. So I felt it will be good to keep them separate. If it is ok to
have all of them directly under platform/text then we can move it there. Let me
know your thoughts on this.
* All the review comments by Darin has been taken care of except using the
TextCodec machinery of WebKit. As I mentioned in my previous mail, this
implementation is based on Padma. The Padma architecture allows us to add new
encoders (for each font. Based on our requirement we might add a dozen fonts
initially) and converters (one for each language)  easily. They just plug-in to
the existing code base. Instead, if we use one TextCodec for each font, then
these codecs should either duplicate the logic or rely on external (Padma
based) classes to do the transcoding. In the latter case the TextCodec will
simply add one new layer to the transcoding process. Also the current
implementation does not exactly fit into a code-decode process. The current
implementation converts text (in a particular indic font) into an intermediate
form and this intermediate form (there is no decoding logic from intermediate
form to the initial font) will get converted into the target language.
Considering the difference in architecture, I am still keeping the transcoding
machinery separate.
* The TSCII Encoder I have added is just one encoder. There will be atleast a
dozen more (once we figure out what are the major sites we want to support).
* Site based restriction for transcoding has been removed. As mjs suggested, if
we want to bring it back we can do so., Let me know whether we need this.

-- 
Configure bugmail: https://bugs.webkit.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.