[Webkit-unassigned] [Bug 13136] Spurious glyphs in Google Israel and Gmail (all languages)

bugzilla-daemon at webkit.org bugzilla-daemon at webkit.org
Wed Mar 21 22:19:10 PDT 2007


http://bugs.webkit.org/show_bug.cgi?id=13136





------- Comment #8 from jungshik.shin at gmail.com  2007-03-21 22:19 PDT -------
(In reply to comment #7)
> (In reply to comment #5)
> > (In reply to comment #4)
> > > (In reply to comment #0)
> > > 
> >
> > IMHO, there are two groups of (invisible or near-invisible) characters in
> > 'default ignorables' 
> > 
> >   a. characters that should always be rendered invisible (*) and don't interact
> > with adjacent characters (e.g. 'U+2062 INVISIBLE TIMES'). they should be killed
> > higher up and should not be passed to 'string drawing/measuring' routines to
> > avoid what we're observing with LRM/RLM here.  (some fonts have visible glyphs
> > for them causing problems like this).

> I don't agree with removing these characters before passing them into our Font
> drawing/rendering routines, since the only way to do so would be to actually
> break up the text into smaller runs.  

Thank you for the detailed explanation.

I'm afraid my use of the word 'kill' was confusing. What's suggested by mitz in
comment #4 is what I want to do for a subset of invisible characters in the
fast path. That way, those characters would still be available for copying to
the clipboard, and we wouldn't have to keep checking whether they're in the
set, as you wrote. So, I guess we agree with each other.
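
Roughly, that idea could look like the sketch below. It is only an
illustration under stated assumptions, not WebKit's actual code: the string is
left untouched (so the characters survive a copy to the clipboard), and the
measuring loop gives a zero advance to a small hard-coded set of invisible
characters. The names isZeroWidthInvisible, measureRun and the advanceFor
callback are hypothetical, and the three code points are just an example
subset of group (a).

```cpp
#include <functional>
#include <string>

// Hypothetical subset of "group (a)": always invisible, no interaction
// with adjacent characters. The exact membership is an assumption.
inline bool isZeroWidthInvisible(char32_t c)
{
    switch (c) {
    case 0x200E: // LEFT-TO-RIGHT MARK (LRM)
    case 0x200F: // RIGHT-TO-LEFT MARK (RLM)
    case 0x2062: // INVISIBLE TIMES
        return true;
    default:
        return false;
    }
}

// Measures a run without modifying it. advanceFor stands in for whatever
// the font code would report as the advance of a character's glyph.
inline float measureRun(const std::u32string& run,
                        const std::function<float(char32_t)>& advanceFor)
{
    float width = 0;
    for (char32_t c : run) {
        if (isZeroWidthInvisible(c))
            continue; // zero advance, even if the font has a visible glyph
        width += advanceFor(c);
    }
    return width;
}
```

Because the check happens where the advance is looked up, the run does not
have to be split into smaller pieces and the original text stays intact.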

As for white space characters with various widths (EN QUAD, EM QUAD, EN SPACE,
THREE-PER-EM SPACE, etc.: U+2000 - U+200A), what would be the best way to deal
with them?


> In general we pass strings unaltered into
> drawing/rendering routines for performance reasons and then rely on the fixups
> to happen in the Font routines themselves.

I guess that means nothing is done about BiDi control characters (LRM, RLM,
LRE, RLE, PDF; yes, they'd better not be used in HTML, but some documents use
them)? In other words, do they have any impact on the directionality of text
layout in WebKit? Or does WebKit pass them down so that ATSUI (the complex
path) can take care of them on Mac OS X?


(In reply to comment #6)

Thank you for the feedback. 

> I'm not sure why you want to hide them only if there isn't a font that covers
> them. 


> If you do that, and the user has installed one of the Office X fonts, and
> the page specifies that font (Arial is a classic example), those characters
> will render. Is that how it works in WinIE and Firefox?

My assumption was that if the page author specified that those characters be
rendered with a specific font, those characters in group (b) (e.g. Hebrew vowel
signs) would be rendered well by the font.

Even if that's not the case, a font claiming to have glyphs for them is likely
to have "acceptable" (possibly far from optimal, as in Arial) glyphs. Because
'truly invisible' characters in group (a) are turned into 'zero-width' glyphs
higher up in the code path, using these 'acceptable' (fallback) [1] glyphs for
group (b) (perhaps we need to give more thought to how to divide the two
groups) would make more sense than just using the 'last resort' glyph (a box
with a symbol representing a script, a hollow box, or a question mark).
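
One possible reading of the policy discussed above, written out as a small
decision function. This is a sketch under assumptions, not an implementation
from WebKit: CharGroup, GlyphChoice and chooseGlyph are hypothetical names,
how characters are sorted into groups is deliberately left out (that division
is still open), and hiding an uncovered group (b) character is inferred from
the remark quoted from comment #6.

```cpp
// Which bucket a character falls into. How to classify a given code
// point is not decided in this thread, so it is taken as an input here.
enum class CharGroup { GroupA, GroupB, Ordinary };

// What the text code would end up drawing for the character.
enum class GlyphChoice { ZeroWidth, FontGlyph, LastResort };

inline GlyphChoice chooseGlyph(CharGroup group, bool fontHasGlyph)
{
    switch (group) {
    case CharGroup::GroupA:
        // Truly invisible: never drawn, whatever the font claims.
        return GlyphChoice::ZeroWidth;
    case CharGroup::GroupB:
        // Use the font's "acceptable" glyph if it has one; otherwise
        // hide the character rather than show a last-resort box.
        return fontHasGlyph ? GlyphChoice::FontGlyph : GlyphChoice::ZeroWidth;
    case CharGroup::Ordinary:
        break;
    }
    // Everything else follows the usual rule.
    return fontHasGlyph ? GlyphChoice::FontGlyph : GlyphChoice::LastResort;
}
```

With this shape, a font such as Arial that has a glyph for LRM still yields
ZeroWidth for it, while a group (b) character it covers gets the font's own
glyph.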


Perhaps I'll try to solve the group (a) case first. Or I may begin with an even
simpler case (the original bug with LRM/RLM).

[1] Examples of 'acceptable fallback' glyphs would include stand-alone glyphs
for 'combining mark XXX'. 






