[webkit-dev] Should we create an 8-bit path from the network stack to the parser?
abarth at webkit.org
Thu Mar 7 11:11:58 PST 2013
The HTMLTokenizer still works in UChars. There's likely some
performance to be gained by moving it to an 8-bit character type.
There's some trickiness involved because HTML entities can expand to
characters outside of Latin-1. Also, it's unclear if we want two
tokenizers (one that's 8 bits wide and another that's 16 bits wide) or
if we should find a way for the 8-bit tokenizer to handle, for
example, UTF-16 encoded network responses.
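To illustrate the entity problem, here is a minimal sketch (not WebKit's actual code; `TokenBuffer` and all of its members are hypothetical names chosen for this example) of a buffer that stays 8 bits wide until a character outside Latin-1 arrives, then upconverts once to 16 bits:

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical buffer, for illustration only: stores Latin-1 bytes until a
// character above U+00FF is appended, then switches to 16-bit storage.
class TokenBuffer {
public:
    void append(char16_t c)
    {
        if (m_is8Bit && c <= 0xFF) {
            // Fast path: the character fits in Latin-1.
            m_narrow.push_back(static_cast<std::uint8_t>(c));
            return;
        }
        if (m_is8Bit)
            upconvert(); // First character outside Latin-1, e.g. from an entity.
        m_wide.push_back(c);
    }

    bool is8Bit() const { return m_is8Bit; }

    std::size_t size() const
    {
        return m_is8Bit ? m_narrow.size() : m_wide.size();
    }

    char16_t at(std::size_t i) const
    {
        return m_is8Bit ? static_cast<char16_t>(m_narrow[i]) : m_wide[i];
    }

private:
    // Copies the Latin-1 bytes into the 16-bit buffer. Latin-1 code points
    // map one-to-one onto the first 256 Unicode code points, so widening
    // each byte is sufficient.
    void upconvert()
    {
        m_wide.reserve(m_narrow.size() + 1);
        for (std::uint8_t byte : m_narrow)
            m_wide.push_back(static_cast<char16_t>(byte));
        m_narrow.clear();
        m_is8Bit = false;
    }

    std::vector<std::uint8_t> m_narrow;
    std::u16string m_wide;
    bool m_is8Bit = true;
};
```

Under this scheme, text that never leaves Latin-1 pays no 16-bit cost, while an entity such as &hellip; (U+2026) triggers a single copy into the wide buffer. Whether that copy is cheaper than maintaining two tokenizers is exactly the open question.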
On Thu, Mar 7, 2013 at 10:11 AM, Darin Adler <darin at apple.com> wrote:
> No. I retract my question. Sounds like we already have it right! Thanks for setting me straight.
> Maybe some day we could make a non-copying code path that points directly at the data in the SharedBuffer, but I have no idea if that'd be beneficial.
> -- Darin
> Sent from my iPhone
> On Mar 7, 2013, at 10:01 AM, Michael Saboff <msaboff at apple.com> wrote:
>> There is an all-ASCII case in TextCodecUTF8::decode(). It should be keeping all ASCII data as 8-bit. TextCodecWindowsLatin1::decode() not only has an all-ASCII case, it also upconverts to 16-bit only in a couple of rare cases. Is there some other case you don't think we are handling?
>> - Michael
>> On Mar 7, 2013, at 9:29 AM, Darin Adler <darin at apple.com> wrote:
>>> Hi folks.
>>> Today, bytes that come in from the network get turned into UTF-16 by the decoding process. We then turn some of them back into Latin-1 during the parsing process. Should we make changes so there’s an 8-bit path? It might be as simple as writing code that has more of an all-ASCII special case in TextCodecUTF8 and something similar in TextCodecWindowsLatin1.
>>> Is there something significant to be gained here? I’ve been wondering this for a while, so I thought I’d ask the rest of the WebKit contributors.
>>> -- Darin
>>> webkit-dev mailing list
>>> webkit-dev at lists.webkit.org