[webkit-dev] drawing LF/CR/etc.

Darin Adler darin at apple.com
Tue Jun 13 11:31:05 PDT 2006


On Jun 13, 2006, at 8:59 AM, Mike Reed wrote:

> ...
> <input type="checkbox" name="box" checked="checked" />Test
> <input ...>
> ...
>
> When I draw this page, I see a box at the end of "Test". "Test" is
> coming into Font::drawText() as a 5 character string, with a CR (or
> LF, don't remember which) at the end. In my font, that draws as a box.
>
> Is it correct that the parser didn't strip that, or convert it into a
> space?

Yes. The parser must not convert it to a space; the DOM must contain
the original character.

> If so, is my port expected to strip these sorts of characters
> each time I measure or draw (hurting performance)?

Yes.

Having the text rendering machinery handle these characters specially  
makes things faster on platforms where we can do that efficiently  
(which now includes both Macintosh and Windows on TOT, since there's
shared high-speed text rendering code).

Allan outlined a way we could change bidi.cpp to implement this rule  
at a higher level. If we can do that without hurting performance on  
Macintosh and Windows, we could take the code out.

Hyatt's the one who's been working on this recently.

> If I had complete control over all my fonts, I could whack their
> cmap tables to ensure that all control characters mapped to
> zero-width spaces, but I don't have that luxury.

There may be other ways to do that quickly in the text rendering
layer. For example, it's probably quite quick to scan a string and
check whether any characters are in the control-character range. In
the case where they are, you have to allocate a buffer and copy the
string, but I think that's relatively rare. I'd also be comfortable
taking a patch that changes it so that the bidi.cpp level takes care
of this and the code in the platform directory doesn't have to handle
it any more.
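
A minimal sketch of that scan-then-copy idea, in case it helps a port get
started. This is not code from the WebKit tree: the names
needsSpecialHandling and cleanedCopyIfNeeded are hypothetical, and the
ranges assume the rule means the C0 controls (U+0000 - U+001F), U+007F
through the C1 controls (U+009F), and no-break space (U+00A0). The common
case returns without allocating anything.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical helper: true for code units that need special treatment
// (C0 controls, DEL, C1 controls, and no-break space). Assumed ranges.
constexpr bool needsSpecialHandling(char16_t c)
{
    return c < 0x0020 || (c >= 0x007F && c <= 0x00A0);
}

// Hypothetical helper: returns std::nullopt when the text can be drawn
// as-is (expected to be the common case), otherwise a cleaned copy.
std::optional<std::u16string> cleanedCopyIfNeeded(std::u16string_view text)
{
    // Fast path: scan for the first character that needs handling.
    std::size_t i = 0;
    while (i < text.size() && !needsSpecialHandling(text[i]))
        ++i;
    if (i == text.size())
        return std::nullopt;

    // Slow path: copy the clean prefix, then filter the rest.
    std::u16string result;
    result.reserve(text.size());
    result.append(text.data(), i);
    for (; i < text.size(); ++i) {
        char16_t c = text[i];
        if (c == u'\n' || c == u'\t' || c == 0x00A0)
            result.push_back(u' ');   // these render as a space
        else if (!needsSpecialHandling(c))
            result.push_back(c);      // ordinary character, keep it
        // any other control character is dropped so nothing is drawn
    }
    return result;
}
```

One caveat with this sketch: dropping characters changes the string length,
so any offsets computed against the original text no longer line up. A port
that needs stable offsets might substitute a zero-width character instead of
dropping.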

Since this is a highly performance-sensitive part of the code, and
the way we do this now is very fast, we have to make sure we do  
performance measurements if we change how this works.

> If I am required to handle these control characters, is there a list
> of exactly which the parser will pass through?

Here's the rule, taken from the code in GlyphMap.cpp (now cross- 
platform on TOT, formerly Macintosh-specific code) that implements  
the rule for the fast code path:

     Control characters (U+0000 - U+001F, U+007F - U+009F) must not
render at all.
     \n (U+000A), \t (U+0009), and non-breaking space (U+00A0) must
render as a space.
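
For a port that wants to apply the rule at glyph-lookup time rather than by
rewriting strings, something like the following table might work. It is only
a sketch under assumptions, not the GlyphMap.cpp code: the function names
are hypothetical, the ranges are read as U+0000 - U+001F and U+007F - U+009F,
and "must not render" is modeled by substituting zero-width space (U+200B),
which assumes the font or the port can supply an empty glyph for it.

```cpp
#include <array>
#include <cstddef>

constexpr char16_t zeroWidthSpace = 0x200B;
constexpr char16_t noBreakSpace = 0x00A0;

// Hypothetical: builds a substitution table for code units U+0000..U+00FF.
// Each entry names the character whose glyph should be drawn instead.
std::array<char16_t, 256> buildPageZeroSubstitutions()
{
    std::array<char16_t, 256> table{};
    for (std::size_t i = 0; i < table.size(); ++i)
        table[i] = static_cast<char16_t>(i);   // default: draw as itself

    // Control characters draw nothing.
    for (std::size_t i = 0x00; i < 0x20; ++i)
        table[i] = zeroWidthSpace;
    for (std::size_t i = 0x7F; i < 0xA0; ++i)
        table[i] = zeroWidthSpace;

    // These three override the above and draw as an ordinary space.
    table[u'\n'] = u' ';
    table[u'\t'] = u' ';
    table[noBreakSpace] = u' ';
    return table;
}

// Hypothetical: looks up the character to draw for c; characters outside
// the table are passed through unchanged.
char16_t substituteFor(const std::array<char16_t, 256>& table, char16_t c)
{
    return static_cast<std::size_t>(c) < table.size() ? table[c] : c;
}
```

Since the table is built once and the lookup is a single array index, the
per-character cost while measuring or drawing stays small.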

     -- Darin



