[Webkit-unassigned] [Bug 69083] wrong CSS lexer rules

bugzilla-daemon at webkit.org bugzilla-daemon at webkit.org
Thu Sep 29 05:30:12 PDT 2011


https://bugs.webkit.org/show_bug.cgi?id=69083


Andras Becsi <abecsi at webkit.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |abecsi at webkit.org




--- Comment #1 from Andras Becsi <abecsi at webkit.org>  2011-09-29 05:30:11 PST ---
(In reply to comment #0)
> I am playing with a hand-written CSS lexer, and during that work I found a few bugs in the current flex rules in WebKit. I am not sure whether we prefer compatibility or standard compliance in such cases, so I will just tell you what I found and let you decide what to do with them:
> 
> The "original" comes from: http://www.w3.org/TR/CSS21/grammar.html "G.2 Lexical scanner"
> The "wk" comes from css/tokenizer.flex
> 
> original:
> nonascii    [\240-\377]
> wk:
> nonascii        [\200-\377]
> 
> They start nonascii from 160 not 128. Not sure why.

I think this is just wrong in the spec, but there might be an explanation for it.
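
For reference, the two octal ranges can be compared with a few lines of Python. This is only arithmetic on the numbers quoted above, and the variable names are made up for illustration:

```python
# Lower bounds of the two nonascii classes, written in octal as in the rules.
spec_start = int("240", 8)   # CSS 2.1 grammar: [\240-\377]
wk_start = int("200", 8)     # tokenizer.flex:  [\200-\377]
upper = int("377", 8)        # shared upper bound

# Code points accepted by the WebKit rule but not by the spec rule.
only_in_wk = list(range(wk_start, spec_start))

print(spec_start, wk_start, upper, len(only_in_wk))
```

So the difference is exactly the 32 values from 128 to 159.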

> 
> original:
> string1        \"([^\n\r\f\\"]|\\{nl}|{escape})*\"
> wk:
> string1         \"([\t !#$%&(-~]|\\{nl}|\'|{nonascii}|{escape})*\"
> 
> Basically we disallow 127 (DELETE) and <32 non-newline chars while the original grammar allows them.
> 

This might be to make the parser simpler; I am not sure why the spec is that permissive.
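
A rough way to list the characters in question is to translate the non-escape alternatives of both string1 rules into Python regular expressions and try every 8-bit value. This is only a sketch: the Python patterns are my transcription of the flex rules quoted above, and limiting the check to 0-255 is an assumption to mirror the 8-bit ranges in those rules.

```python
import re

# Single-character alternatives of string1, transcribed by hand.
# Spec: anything except newline characters, backslash and the double quote.
SPEC_CHAR = re.compile(r'[^\n\r\f\\"]')
# tokenizer.flex: the explicit class, the single quote, or {nonascii}.
WK_CHAR = re.compile(r"[\t !#$%&(-~]|'|[\x80-\xff]")

# Values the spec class accepts but none of the WebKit alternatives accept.
spec_only = [
    code for code in range(256)
    if SPEC_CHAR.fullmatch(chr(code)) and not WK_CHAR.fullmatch(chr(code))
]

print(spec_only)
```

Under these assumptions the list contains the control characters below 32 other than tab and the newline characters, plus 127, which agrees with the description in comment #0.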

> original:
> unicode        \\{h}{1,6}(\r\n|[ \t\r\n\f])?
> wk:
> unicode         \\{h}{1,6}[ \t\r\n\f]?
> 
> This can be exploited by a \r\n newline: A\41\r\nB should be "AAB" but it will be "AA" and "B" in WK.
> 

This looks like a real bug, and it should get a test case if possible.
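
As a starting point for such a test, the difference can be reproduced outside of WebKit with the two rules rewritten as Python regular expressions. This is a sketch, not the actual tokenizer: the pattern names are hypothetical and {h} is assumed to be a plain hex digit class.

```python
import re

H = r"[0-9a-fA-F]"  # assumed definition of {h}

# Spec rule: a CRLF pair after the escape is consumed as one terminator.
SPEC_UNICODE = re.compile(r"\\" + H + r"{1,6}(\r\n|[ \t\r\n\f])?")
# Rule from tokenizer.flex as quoted: at most one whitespace character.
WK_UNICODE = re.compile(r"\\" + H + r"{1,6}[ \t\r\n\f]?")

sample = "A\\41\r\nB"  # the characters A \ 4 1 CR LF B

# Match the escape that starts right after the leading "A".
spec_match = SPEC_UNICODE.match(sample, 1)
wk_match = WK_UNICODE.match(sample, 1)

# Whatever is left after the escape is what the lexer sees next.
spec_rest = sample[spec_match.end():]
wk_rest = sample[wk_match.end():]

print(repr(spec_rest), repr(wk_rest))
```

With the spec rule only "B" remains after the escape, so the identifier continues. With the WebKit rule a line feed is left over in front of "B", which is what splits the input into two tokens.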

> original:
> {num}%            {return PERCENTAGE;}
> wk:
> {num}%+                 {yyTok = PERCENTAGE; return yyTok;}
> 
> Why do we allow multiple percent signs? Although we still treat them as one...

In this case the WebKit lexer simply consumes all the extra percent signs, which I do not think is a big compatibility problem.
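
The effect is easy to see with the two rules written as Python regular expressions. Again this is only a sketch with hypothetical names, and NUM is my transcription of the {num} definition from the CSS 2.1 grammar.

```python
import re

# {num} from the CSS 2.1 grammar. The decimal form is listed first because
# Python alternation is ordered, while flex always takes the longest match.
NUM = r"(?:[0-9]*\.[0-9]+|[0-9]+)"

SPEC_PERCENTAGE = re.compile(NUM + r"%")    # {num}%
WK_PERCENTAGE = re.compile(NUM + r"%+")     # {num}%+

sample = "50%%%;"

spec_token = SPEC_PERCENTAGE.match(sample).group(0)
wk_token = WK_PERCENTAGE.match(sample).group(0)

print(repr(spec_token), repr(wk_token))
```

The spec rule stops after the first percent sign and leaves the other two for the following tokens, while the WebKit rule folds all three into a single PERCENTAGE token.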

All this is really ancient code though.
