[Webkit-unassigned] [Bug 16179] any attribute name start with a unicode which like #xx00(x could be any hex number[0-9a-f]) will cause HTMLTokenizer parse error.
bugzilla-daemon at webkit.org
Thu Dec 6 23:32:08 PST 2007
http://bugs.webkit.org/show_bug.cgi?id=16179
------- Comment #14 from johnnyding.webkit at gmail.com 2007-12-06 23:32 PDT -------
Sorry for my poor wording; I will fix that and keep it in mind! :-)
The new patch will be uploaded soon.
I agree we should change the title to avoid misunderstanding. Maybe something like:
Need WebKit to support Unicode characters as part of attribute name?
Is there any update on the performance test you mentioned in the previous
review? And what do you think of my approach of using a stand-alone buffer to
temporarily hold the entity name?
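
To make the buffer idea concrete, here is a minimal sketch, not the actual patch. It assumes, as the patch comment does, that the longest entity name is 9 characters. The names copyEntityNameToAscii and kMaxEntityNameLength are hypothetical, and the ASCII entity lookup that would follow is left out.

```cpp
#include <cstddef>

// Hypothetical constant: the patch comment states that the longest
// entity name is 9 characters, so a 10-element array leaves room for NUL.
constexpr std::size_t kMaxEntityNameLength = 9;

// Copies a UTF-16 entity name candidate into a NUL-terminated ASCII buffer
// that the caller allocates on the stack.
// Returns false, leaving `out` as an empty string, if the name is empty,
// too long, or contains a non-ASCII code unit. The caller can then treat
// the candidate as an illegal entity name.
bool copyEntityNameToAscii(const char16_t* name, std::size_t length,
                           char (&out)[kMaxEntityNameLength + 1])
{
    out[0] = '\0';
    if (length == 0 || length > kMaxEntityNameLength)
        return false;

    for (std::size_t i = 0; i < length; ++i) {
        if (name[i] > 0x7F) {
            // Non-ASCII character: reset the buffer and reject the name.
            out[0] = '\0';
            return false;
        }
        out[i] = static_cast<char>(name[i]);
    }
    out[length] = '\0';
    return true;
}
```

A caller would declare `char entityName[kMaxEntityNameLength + 1];` on the stack, call the function, and pass the buffer to the ASCII lookup only when it returns true.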
Thanks!
(In reply to comment #11)
> (From update of attachment 17733 [edit])
> Alexey informed me that there may be some security concerns with supporting
> these additional characters in tag and attribute names. I don't know the
> details yet. Alexey would you be willing to comment?
>
> + (WebCore::HTMLTokenizer::parseEntity): Handle unicode Entity Name by
> using acsii version findentity.
>
> "ASCII, not acssi". "version of findEntity", not "version findentity".
>
> "Unicode", not "unicode".
>
> Someone should fix the title of the bug; it no longer matches what's being
> fixed here.
>
> + // Since the maximum length of entity name only
> + // can be 9, so one char array which is allocated
> + // from stack, its length is 10, should be OK.
> + // Also if we have illegal character, we treat it
> + // as illegal entity name.
>
> "maximum length is 9", not "maximum length can be 9"
>
> "a single char array", not "one char array"
>
> "on the stack", not "from stack"
>
> "have an illegal character", not "have illegal character"
>
> + char chTmpEntityNameBuffer[10];
>
> We don't normally use type prefixes like "ch" in code.
>
> +
>
(In reply to comment #13)
> (In reply to comment #12)
> > (In reply to comment #11)
> > > Alexey informed me that there may be some security concerns with supporting
> > > these additional characters in tag and attribute names. I don't know the
> > > details yet. Alexey would you be willing to comment?
> >
> > I tried to say that treating U+3000 as whitespace could be dangerous, referring
> > to the comment that "some Chinese websites use one Chinese space symbol +U3000
> > as space to separate attribute name/value group" - I'm not aware of any issues
> > with treating them as non-whitespace.
> >
> Oh, I guess that was my fault. What I meant earlier was that some authors of
> Chinese sites try to use the Chinese space symbol U+3000 as a space to separate
> attribute name/value groups, or maybe they just did not realize they had used
> symbols that look like a space. It does not mean WebKit needs to treat them as
> spaces. Also, my patch just treats those characters as normal characters, right?
> My earlier sentence was only a guess at why they used those strange
> symbol characters in their pages.
>
> So are we clear?
>
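
The distinction discussed above can be sketched as follows. This is an illustration under stated assumptions, not WebKit's tokenizer: isAsciiWhitespace and attributeNameLength are hypothetical helpers, and the set of terminating characters is a simplification. Only ASCII whitespace ends an attribute name, so U+3000 stays part of the name like any other character.

```cpp
#include <cstddef>
#include <string>

// Only ASCII whitespace separates attributes in this sketch.
// U+3000 (ideographic space) is deliberately not in this set.
bool isAsciiWhitespace(char16_t c)
{
    return c == u' ' || c == u'\t' || c == u'\n' || c == u'\f' || c == u'\r';
}

// Returns the length of the attribute name at the start of `input`.
// The name ends at ASCII whitespace, '=', '/', '>', or the end of input.
std::size_t attributeNameLength(const std::u16string& input)
{
    std::size_t i = 0;
    while (i < input.size()
           && !isAsciiWhitespace(input[i])
           && input[i] != u'='
           && input[i] != u'/'
           && input[i] != u'>')
        ++i;
    return i;
}
```

With this rule, the input `a<U+3000>b=1` yields one three-character attribute name, while `a b=1` yields the one-character name `a`.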