[Webkit-unassigned] [Bug 16179] any attribute name start with a unicode which like #xx00(x could be any hex number[0-9a-f]) will cause HTMLTokenizer parse error.

bugzilla-daemon at webkit.org bugzilla-daemon at webkit.org
Thu Dec 6 23:32:08 PST 2007


http://bugs.webkit.org/show_bug.cgi?id=16179





------- Comment #14 from johnnyding.webkit at gmail.com  2007-12-06 23:32 PDT -------
Sorry for my poor wording; I will fix that and keep it in mind! :-)
The new patch will be uploaded soon.

I agree we should change the title to avoid misunderstanding, maybe to
something like:
"Need WebKit to support Unicode characters as part of attribute name"?

About the performance test you mentioned in the previous review: is there any
update? What do you think of my approach of using a stand-alone buffer to
temporarily save the entity name?

Thanks!

(In reply to comment #11)
> (From update of attachment 17733 [edit])
> Alexey informed me that there may be some security concerns with supporting
> these additional characters in tag and attribute names. I don't know the
> details yet. Alexey would you be willing to comment?
> 
> +        (WebCore::HTMLTokenizer::parseEntity): Handle unicode Entity Name by
> using acsii version findentity.
> 
> "ASCII, not acssi". "version of findEntity", not "version findentity".
> 
> "Unicode", not "unicode".
> 
> Someone should fix the title of the bug; it no longer matches what's being
> fixed here.
> 
> +                    // Since the maximum length of entity name only
> +                    // can be 9, so one char array which is allocated
> +                    // from stack, its length is 10, should be OK.
> +                    // Also if we have illegal character, we treat it
> +                    // as illegal entity name.
> 
> "maximum length is 9", not "maximum length can be 9"
> 
> "a single char array", not "one char array"
> 
> "on the stack", not "from stack"
> 
> "have an illegal character", not "have illegal character"
> 
> +                    char chTmpEntityNameBuffer[10];
> 
> We don't normally use type prefixes like "ch" in code.
> 
> +
> 
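
For reference, the stand-alone buffer idea discussed in the review above could be sketched roughly as follows. This is only an illustration, not the code from the patch: the names copyEntityNameToAscii, lookUpEntity and resolveEntity and the tiny entity table are hypothetical, and the maximum length of 9 is taken from the quoted code comment rather than checked against the real entity table.

```cpp
#include <cstddef>
#include <cstring>

// Assumption from the quoted review comment: entity names are at most 9 characters.
constexpr std::size_t maxEntityNameLength = 9;

// Copies a UTF-16 entity name into a fixed-size ASCII buffer on the caller's stack.
// Returns false if the name is empty, too long, or contains any character that is
// not an ASCII letter or digit, so such names are treated as illegal entity names.
bool copyEntityNameToAscii(const char16_t* name, std::size_t length,
                           char (&buffer)[maxEntityNameLength + 1])
{
    if (!length || length > maxEntityNameLength)
        return false;
    for (std::size_t i = 0; i < length; ++i) {
        char16_t c = name[i];
        bool isAsciiAlphanumeric = (c >= u'0' && c <= u'9')
            || (c >= u'A' && c <= u'Z')
            || (c >= u'a' && c <= u'z');
        if (!isAsciiAlphanumeric)
            return false;
        buffer[i] = static_cast<char>(c);
    }
    buffer[length] = '\0';
    return true;
}

// Hypothetical stand-in for an ASCII-only entity lookup; returns -1 if not found.
int lookUpEntity(const char* name)
{
    struct Entry { const char* name; int codePoint; };
    static const Entry table[] = {
        { "amp", 0x26 }, { "lt", 0x3C }, { "gt", 0x3E }, { "nbsp", 0xA0 },
    };
    for (const Entry& entry : table) {
        if (!std::strcmp(entry.name, name))
            return entry.codePoint;
    }
    return -1;
}

// Resolves a UTF-16 entity name through the ASCII lookup; returns -1 on failure.
int resolveEntity(const char16_t* name, std::size_t length)
{
    char buffer[maxEntityNameLength + 1]; // 9 characters plus the terminating NUL
    if (!copyEntityNameToAscii(name, length, buffer))
        return -1;
    return lookUpEntity(buffer);
}
```

With this sketch, a name such as u"amp" is narrowed and looked up, while a name containing U+3000 or any other non-ASCII character is rejected before the lookup happens, and no heap allocation is needed.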

(In reply to comment #13)
> (In reply to comment #12)
> > (In reply to comment #11)
> > > Alexey informed me that there may be some security concerns with supporting
> > > these additional characters in tag and attribute names. I don't know the
> > > details yet. Alexey would you be willing to comment?
> > 
> > I tried to say that treating U+3000 as whitespace could be dangerous, referring
> > to the comment that "some Chinese websites use one Chinese space symbol +U3000
> > as space to separate attribute name/value group" - I'm not aware of any issues
> > with treating them as non-whitespace.
> > 
> Oh, I guess it's my fault, My previous meaning was that some authors of Chinese
> sites tried to use one Chinese space symbol +U3000 as space to separate
> attribute name/value group, or maybe they just did not realize they used some
> symbols which look like space. it does not mean WebKit need to treat them as
> space. Also in my patch, I just treat those characters as normal, right? My
> previous sentence just guess the motivation about why they used those strange
> symbol characters in their pages.
> 
> So are we clear?
> 

