[Webkit-unassigned] [Bug 177040] [FreeType] Support emoji modifiers

bugzilla-daemon at webkit.org bugzilla-daemon at webkit.org
Wed Dec 19 13:35:42 PST 2018


https://bugs.webkit.org/show_bug.cgi?id=177040

--- Comment #6 from Myles C. Maxfield <mmaxfield at apple.com> ---
(In reply to Carlos Garcia Campos from comment #5)
> Ok, so the problem only happens with emojis having the zero width joiner
> (U+200D) in the sequence. The sequence is broken because
> createAndFillGlyphPage() in Font.cpp overwrites zero width joiner with zero
> width space (U+200B). Myles, why are we doing that overwrite? Should that
> depend on the font? Not doing the overwrite fixes all the combinations for
> me in https://unicode.org/emoji/charts/full-emoji-list.html. If we can't
> remove the overwrite we can handle that as a special case in harfbuzz
> set_nominal_glyph function.

This code is trying to make control characters invisible. I haven't done the research, but I expect the replacement code in createAndFillGlyphPage() is older than the WidthIterator::applyFontTransforms() code. (Both the simple text path and the complex text path use createAndFillGlyphPage().)

You've hit the reason I filed https://bugs.webkit.org/show_bug.cgi?id=187166. Because shaping operates on the character stream of the input, if we muck around with the character string, we break shaping. This has bitten us multiple times; see https://bugs.webkit.org/show_bug.cgi?id=185976.

The right way to do this is to run shaping on the unperturbed character stream, and after shaping is finished, map each glyph in the result back to the character in the stream it represents, and if it represents a control character, remove that glyph from the shaped result.
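
A minimal sketch of that post-shaping filter, written as standalone C++ rather than actual WebKit code. ShapedGlyph, isInvisibleControl() and removeControlGlyphs() are hypothetical names, and the set of code points treated as invisible is an assumption chosen for illustration. It also assumes the shaper reports, for each output glyph, the index of the first character that glyph represents.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for one entry of a shaper's output.
struct ShapedGlyph {
    uint32_t glyphID; // glyph index in the font
    size_t cluster;   // index of the first character this glyph represents
};

// Assumption: an illustrative set of code points to hide
// (C0/C1 controls, U+200B..U+200F, U+FEFF). Not WebKit's actual list.
static bool isInvisibleControl(char32_t c)
{
    return c < 0x20 || (c >= 0x7F && c <= 0x9F)
        || (c >= 0x200B && c <= 0x200F) || c == 0xFEFF;
}

// Shaping has already run on the unmodified text. Drop only the glyphs
// whose source character is a control character.
std::vector<ShapedGlyph> removeControlGlyphs(const std::u32string& text, const std::vector<ShapedGlyph>& shaped)
{
    std::vector<ShapedGlyph> result;
    result.reserve(shaped.size());
    for (const auto& glyph : shaped) {
        assert(glyph.cluster < text.size());
        if (isInvisibleControl(text[glyph.cluster]))
            continue;
        result.push_back(glyph);
    }
    return result;
}
```

With the text U+1F468 U+200D U+1F469, a font that ligates the sequence yields one glyph whose cluster points at the first emoji, so nothing is removed. A font without the ligature yields three glyphs, and only the one mapped to U+200D is dropped. HarfBuzz exposes a comparable per-glyph cluster value in hb_glyph_info_t.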

Unfortunately, "map each glyph in the result back to the character in the stream it represents" is impossible in the Cocoa fast text codepath. That codepath uses CTFontTransformGlyphs() to do the shaping, but that function doesn't output a mapping back to the original string indices. <rdar://problem/44466695> is tracking this.

So, if I understand the HarfBuzz API correctly, the HarfBuzz ports should be able to fix this immediately, but the existing code needs to continue to exist for the Cocoa ports until we get a solution from the platform that fits.

Bonus: For many years the browsers had interop on making control characters invisible. However, the spec changed[1] and now says "Control characters (Unicode category Cc) other than tab (U+0009), line feed (U+000A), and form feed (U+000C), must be rendered as a visible glyph." However, I'm not sure we can actually make that change, because I don't think we've characterized how many sites would break (where "break" means "control character garbage shows up and makes content unreadable where it used to be invisible and readable"). The decision about rendering control characters visibly should be independent of the method of fixing createAndFillGlyphPage().

[1] https://drafts.csswg.org/css-text-3/#white-space-processing
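
For reference, the rule in the sentence quoted from [1] can be written as a small predicate. This is a sketch only: the function names are hypothetical, and the list of exceptions is copied from the quote above rather than re-checked against the current draft.

```cpp
// Unicode general category Cc covers exactly U+0000..U+001F and U+007F..U+009F.
constexpr bool isControlCc(char32_t c)
{
    return c <= 0x1F || (c >= 0x7F && c <= 0x9F);
}

// Exceptions as listed in the quoted sentence: tab, line feed, form feed.
constexpr bool isExemptControl(char32_t c)
{
    return c == 0x0009 || c == 0x000A || c == 0x000C;
}

// True if the quoted rule asks for a visible glyph for this code point.
constexpr bool mustRenderVisibly(char32_t c)
{
    return isControlCc(c) && !isExemptControl(c);
}
```

Zero width joiner (U+200D) and zero width space (U+200B) are in category Cf, not Cc, so this predicate returns false for both. The quoted rule therefore does not by itself say anything about the characters involved in the emoji sequences.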
