[webkit-dev] HTML5 Parsing & MathML

David Carlisle d.p.carlisle at gmail.com
Tue Nov 2 07:55:38 PDT 2010

Alex Milowski <alex <at> milowski.org> writes:

Sorry for the late reply; I'm not subscribed and only just saw this in the archives.

> On Fri, Oct 1, 2010 at 12:52 PM, Adam Barth <abarth <at> webkit.org> wrote:
> > Our parser follows the spec (modulo late-breaking spec changes that we

Actually, most MathML in the wild will be mis-parsed by the WebKit HTML5 parser
because of


but that's hopefully a temporary glitch.

> > haven't picked up yet).  The different namespaces can only be nested
> > in certain ways, unlike in XML where arbitrary nesting is possible.
> ...
> <p> ...
> <math>
> <mfenced open="[" close="]">
> <div> ... random stuff </div>
> </mfenced>
> </math>
> </p>
> It would then pop the open stack back to the parent "p" element
> and the "div" element would be a child of the paragraph and not
> of the fencing.

Personally I agree with you that this desire to make HTML elements forcibly
close the surrounding math elements is entirely bogus, and it causes all sorts
of problems in annotation-xml (where you really want nested HTML). But we failed
to convince the HTML WG (or the HTML editor) of that, and so ended up with a
special-case workaround for annotation-xml


Sometimes you have to take what you can get :-)

However, I don't agree that using the token elements as extension points is only
necessary because of HTML parser strangeness; I think it leads to a cleaner
design, and better fallback behaviour for systems that do not understand the
foreign elements, in any case.

> In XHTML, assuming there are appropriate uses of
> namespaces, everything would work fine and you'd get a "div"
> element fenced with stretching square brackets.

It would probably render OK but wouldn't be valid according to the published
schemas. As with most "polyglot" requirements, assuming XML and HTML validity
goes a long way to ensuring that you get the same DOM.
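
To make the XML side of that concrete, here is a minimal Python sketch (not
from the original thread) that parses an XHTML form of the quoted example with
the standard library's xml.etree.ElementTree. The namespace URIs are the usual
XHTML and MathML ones; the sample content and the variable names are made up
for illustration. An XML parser has no rule for breaking out of foreign
content, so the div stays a child of mfenced. It only shows the tree shape and
says nothing about schema validity.

```python
import xml.etree.ElementTree as ET

XHTML = "http://www.w3.org/1999/xhtml"
MATHML = "http://www.w3.org/1998/Math/MathML"

# XHTML form of the quoted example, with explicit namespace declarations.
# The text content is illustrative only.
doc = f"""
<p xmlns="{XHTML}">
  <math xmlns="{MATHML}">
    <mfenced open="[" close="]">
      <div xmlns="{XHTML}">random stuff</div>
    </mfenced>
  </math>
</p>
""".strip()

root = ET.fromstring(doc)

# Locate mfenced inside math, both in the MathML namespace.
mfenced = root.find(f"{{{MATHML}}}math/{{{MATHML}}}mfenced")

# Tags of the direct children of mfenced, in ElementTree's {namespace}local form.
children = [child.tag for child in mfenced]
print(children)

# Direct XHTML div children of the paragraph (none are expected in XML parsing).
print(root.findall(f"{{{XHTML}}}div"))
```

With XML parsing the div is reported under mfenced and the paragraph has no
direct div child, which is the opposite of the HTML parser result described in
the quoted text.
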
> So, if you cut-n-pasted the same content with the 'xmlns'
> attributes, you'd get two very different results.
> That really feels "fixable" but I'm going to need to think a bit
> more about what adjustments there would need to be
> to the rules.
> I wonder what the intersection of local names is between
> MathML and HTML ...

By design there is no intersection, although it turns out that browsers
implemented (and HTML5 acknowledges) image as a synonym for img, which is
therefore the one clash with a MathML name.

> This is, of course, an HTML5 issue and not really a WebKit
> issue except for the question of difficulty of implementation.


