[Webkit-unassigned] [Bug 13136] Spurious glyphs in Google Israel and Gmail (all languages)

bugzilla-daemon at webkit.org bugzilla-daemon at webkit.org
Wed Mar 21 13:12:18 PDT 2007


------- Comment #5 from jungshik.shin at gmail.com  2007-03-21 13:12 PDT -------
(In reply to comment #4)
> (In reply to comment #0)
> > I'd be willing to give it a shot. If anyone can give a pointer or two, that
> > would be great.
> GlyphPageTreeNode::initializePage() replaces characters that should not render
> with zero-width spaces for the 'simple' code path. For the 'complex' code path,
> this feature isn't implemented yet (see comment in
> ATSULayoutParameters::initialize(), which is probably where you'd add it).

Thanks for the pointer. I was looking at WidthIterator::advance, but that place
seems better.  

> I guess the question here is whether you want to make it impossible to display
> those glyphs. Other browsers doing it is a strong argument for WebKit to do it
> too (despite the inconsistency with the rest of Mac OS X).

IMHO, there are two groups of (invisible or near-invisible) characters in
'default ignorables' 

  a. characters that should always be rendered invisible (*) and don't interact
with adjacent characters (e.g. 'U+2062 INVISIBLE TIMES'). They should be killed
higher up and should not be passed to 'string drawing/measuring' routines, to
avoid what we're observing with LRM/RLM here (some fonts have visible glyphs
for them causing problems like this). Gecko has the list at :


LRM/RLM are not there because Gecko special-cases them in its BiDi code at 


  b. characters that interact with adjacent characters to change the rendering
result (e.g. ZWNJ, ZWJ). If the rendering routine (API) and fonts can take care
of them, that's good. If there isn't a font that covers them, they should be
turned into nothingness (rather than a 'last resort' glyph) at the drawing
stage. Gecko has the list at : 


For characters in group (a), GlyphPageTreeNode::initializePage() seems to be a
good place to kill them. 
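
A minimal sketch of the group (a) handling, mimicking on a plain character
buffer the zero-width-space replacement that comment #4 describes for the
'simple' code path. The names classifyIgnorable and neutralizeAlwaysInvisible
are hypothetical, and the code point list is only illustrative (it is neither
Gecko's list nor WebKit's):

```cpp
#include <string>

// Hypothetical grouping of a few 'default ignorable' code points.
enum class IgnorableGroup { None, AlwaysInvisible, ContextSensitive };

IgnorableGroup classifyIgnorable(char32_t c)
{
    switch (c) {
    case 0x200C: // ZERO WIDTH NON-JOINER
    case 0x200D: // ZERO WIDTH JOINER
        return IgnorableGroup::ContextSensitive; // group (b)
    case 0x200E: // LEFT-TO-RIGHT MARK
    case 0x200F: // RIGHT-TO-LEFT MARK
    case 0x2061: // FUNCTION APPLICATION
    case 0x2062: // INVISIBLE TIMES
    case 0x2063: // INVISIBLE SEPARATOR
        return IgnorableGroup::AlwaysInvisible; // group (a)
    default:
        return IgnorableGroup::None;
    }
}

const char32_t zeroWidthSpace = 0x200B;

// Replace every group (a) character with ZERO WIDTH SPACE so that no font
// is ever asked for a glyph for it. Group (b) characters are left alone.
void neutralizeAlwaysInvisible(std::u32string& text)
{
    for (char32_t& c : text) {
        if (classifyIgnorable(c) == IgnorableGroup::AlwaysInvisible)
            c = zeroWidthSpace;
    }
}
```

Replacing rather than deleting keeps the buffer length and character offsets
unchanged, which is presumably why a zero-width space is used instead of
removal.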

For characters in group (b), we may want to 'kill them' right before the
invocation of 'last resort' glyph in platform-specific files. 
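
For group (b), the idea could be sketched roughly as follows, using a toy font
model. ToyFont, glyphForCharacter and both glyph constants are hypothetical
stand-ins, not WebKit or ATSUI types; the point is only the order of the
checks: fonts first, then suppression, then 'last resort'.

```cpp
#include <cstdint>
#include <map>
#include <vector>

using GlyphId = std::uint16_t;
// Hypothetical stand-in for a font's character-to-glyph table.
using ToyFont = std::map<char32_t, GlyphId>;

const GlyphId zeroWidthGlyph = 0;       // assumed to draw nothing
const GlyphId lastResortGlyph = 0xFFFF; // assumed 'last resort' marker

bool isContextSensitiveIgnorable(char32_t c)
{
    return c == 0x200C || c == 0x200D; // ZWNJ, ZWJ (illustrative subset)
}

GlyphId glyphForCharacter(char32_t c, const std::vector<ToyFont>& fallbackChain)
{
    // Give every font in the chain a chance to handle the character.
    for (const ToyFont& font : fallbackChain) {
        auto it = font.find(c);
        if (it != font.end())
            return it->second;
    }
    // No font covers it: turn group (b) characters into nothingness
    // instead of falling through to the 'last resort' glyph.
    if (isContextSensitiveIgnorable(c))
        return zeroWidthGlyph;
    return lastResortGlyph;
}
```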

> Zero-width joiner and zero-width non-joiner should be handled the same as LRM
> and RLM. 

I'm afraid it's not that simple. ZWJ and ZWNJ need to be treated in a
context-sensitive manner because they interact with adjacent characters to
change the rendering result (conjunct formation in Indic scripts and joining
behavior in Arabic). In the code path for simple scripts, they might well be
safely turned into nothingness, though. 
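
As a rough illustration of that distinction (CodePath and filterJoinControls
are hypothetical names, and how a run gets assigned to a code path is assumed
to be decided elsewhere):

```cpp
#include <string>

enum class CodePath { Simple, Complex };

bool isJoinControl(char32_t c)
{
    return c == 0x200C || c == 0x200D; // ZWNJ, ZWJ
}

// Drop ZWNJ/ZWJ only on the simple path. On the complex path they are
// passed through untouched so the shaper can use them for Arabic joining
// and Indic conjunct formation.
std::u32string filterJoinControls(const std::u32string& run, CodePath path)
{
    if (path == CodePath::Complex)
        return run;
    std::u32string out;
    out.reserve(run.size());
    for (char32_t c : run) {
        if (!isJoinControl(c))
            out.push_back(c);
    }
    return out;
}
```

Dropping characters shifts offsets within the run, so real code would more
likely substitute a zero-width glyph instead; the sketch only shows that the
decision depends on the code path.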

> What other characters are "default ignorable"?

Unicode database has the list (I'll give the URL later), but as outlined above,
not all of them can be treated the same way. 

BTW, there might be cases where we want to render 'the invisible' with visible
glyphs ('code view' of an HTML editor, mathematical equation editor, etc.).
However, CSS does not provide any way to do that at the moment, so I guess we
don't have to worry about it, at least for now. 
