[webkit-reviews] review canceled: [Bug 95207] CSS Parser should directly parse 8 bit source strings : [Attachment 161039] Patch with style nit fixed
bugzilla-daemon at webkit.org
bugzilla-daemon at webkit.org
Tue Aug 28 18:54:52 PDT 2012
Michael Saboff <msaboff at apple.com> has canceled Michael Saboff
<msaboff at apple.com>'s request for review:
Bug 95207: CSS Parser should directly parse 8 bit source strings
https://bugs.webkit.org/show_bug.cgi?id=95207
Attachment 161039: Patch with style nit fixed
https://bugs.webkit.org/attachment.cgi?id=161039&action=review
------- Additional Comments from Michael Saboff <msaboff at apple.com>
(In reply to comment #7)
> (From update of attachment 161039 [details])
> View in context:
https://bugs.webkit.org/attachment.cgi?id=161039&action=review
>
> I had a first look, I have early comments about some details.
> Can you please measure how much does the code grow with the patch?
Comparing the size output from before and after X86-64 release builds of Mac
WebCore.framework:
__TEXT __DATA __OBJC others dec hex
before- 17391616 1871872 0 18460672 37724160 23fa000
after- 17408000 1871872 0 18399232 37679104 23ef000
delta- 16384 0 0
This growth is less than I would have thought!
> > Source/WebCore/css/CSSParser.cpp:168
> > +template <typename CharacterType>
> > +ALWAYS_INLINE static bool equal(const CharacterType* a, const char* b, int
length)
> > {
> > - for (int i = 0; i < a.length; ++i) {
> > + for (int i = 0; i < length; ++i) {
> > if (!b[i])
> > return false;
> > - if (a.characters[i] != b[i])
> > + if (a[i] != b[i])
> > return false;
> > }
> > - return !b[a.length];
> > + return !b[length];
> > }
>
> We have optimized versions of this in StringImpl.h, you should call them
directly.
Done.
> > Source/WebCore/css/CSSParser.cpp:186
> > +template <typename CharacterType>
> > +ALWAYS_INLINE static bool equalIgnoringCase(const CharacterType* a, const
char* b, int length)
> > {
> > - for (int i = 0; i < a.length; ++i) {
> > + for (int i = 0; i < length; ++i) {
> > if (!b[i])
> > return false;
> > ASSERT(!isASCIIUpper(b[i]));
> > - if (toASCIILower(a.characters[i]) != b[i])
> > + if (toASCIILower(a[i]) != b[i])
> > return false;
> > }
> > - return !b[a.length];
> > + return !b[length];
> > +}
>
> We have optimized versions of this in StringImpl.h, you should call them
directly.
Done.
> > Source/WebCore/css/CSSParser.cpp:335
> > + int stringLength = string.length();
> > + int length = stringLength + m_parsedTextPrefixLength + strlen(suffix)
+ 1;
>
> String length are typically unsigned.
Fixed this.
> > Source/WebCore/css/CSSParser.cpp:340
> > + for (unsigned i = 0; i < m_parsedTextPrefixLength; i++)
> > + m_dataStart8[i] = prefix[i];
>
> memcpy?
The prefixes and suffices are so small, 0 to a 33 bytes with most of them 0
that I think the call overhead would outweigh the benefit.
> > Source/WebCore/css/CSSParser.cpp:348
> > + for (unsigned i = start; i < end; i++)
> > + m_dataStart8[i] = suffix[i - start];
>
> memcpy?
> Or just a while loop incrementing the pointers. I think the "i - start" is a
little less readable.
Clang on X86 makes slightly better code using scaled index instructions instead
of needing to increment a pointer (an ALU operation). I assume that the same
is true for gcc. I can clean up the suffix index calculation, but it will
probably look best with another local.
> > Source/WebCore/css/CSSParser.cpp:350
> > + m_dataStart8[length - 1] = 0;
>
> If you do:
> unsigned length = stringLength + m_parsedTextPrefixLength + strlen(suffix) +
1;
> ...
> m_dataStart8 = adoptArrayPtr(new LChar[length + 1]);
> ...
> This becomes:
> m_dataStart8[length] = 0;
>
> I personally prefer this way of handling zero termination, but that is up to
you.
I want m_length to be the buffer length. It is used to allocate a 16 bit
buffer when there is 8 bit source with a 16 bit escape.
> > Source/WebCore/css/CSSParser.cpp:3122
> > - AtomicString variableName = String(name.characters + 12, name.length -
12);
> > + AtomicString variableName = name.is8Bit() ? String(name.characters8()
+ 12, name.length() - 12) : String(name.characters16() + 12, name.length() -
12);
>
> Creating an AtomicString from a String can be inefficient. You should use the
AtomicString constructor here instead.
Done.
> > Source/WebCore/css/CSSParser.cpp:8871
> > + UnicodeToChars(result, unicode); }
>
> Lost bracket :)
Saw and fixed after original post.
> > Source/WebCore/css/CSSParser.h:489
> > + LChar* m_currentCharacter8;
> > + UChar* m_currentCharacter16;
>
> Should this be an union? Are they mutually exclusive?
These are both needed. For the case that we have an 8 bit source and a Unicode
escape that resolves to 16 bit, we use the 16 bit data buffer and
m_currentCharacter16 to track where in that buffer.
More information about the webkit-reviews
mailing list