[webkit-dev] XML Serialization Issues

Alex Milowski alex at milowski.com
Wed Jun 19 11:44:57 PDT 2013


I was working on using MathJax [1] to turn MathML into SVG and ran into
some serious serialization issues.  In summary, when MathJax
programmatically creates SVG renderings of the MathML, it creates XLink
attributes but doesn't seem to define a prefix for them.  While this works
for rendering, it breaks when you try to extract a serialization of the
SVG.

That is, MathJax creates SVG 'use' elements that should serialize like
this (assuming SVG as the default namespace):

<use xlink:href="#MJMATHI-78" xmlns:xlink="http://www.w3.org/1999/xlink"/>

but instead I get:

<use href="#MJMATHI-78" xmlns="http://www.w3.org/1999/xlink"/>

which makes the SVG incorrect as the 'use' element is now in the xlink
namespace.
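
To make the difference concrete, here is a minimal Python sketch (not part
of the original report) that parses both forms with the standard library
and reports the namespace each 'use' element and its attribute end up in.
The wrapping 'svg' root element and the describe() helper are assumptions
added for illustration only.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
XLINK_NS = "http://www.w3.org/1999/xlink"

# Hypothetical wrapper documents: an <svg> root supplies the default namespace.
EXPECTED = (
    f'<svg xmlns="{SVG_NS}">'
    f'<use xlink:href="#MJMATHI-78" xmlns:xlink="{XLINK_NS}"/>'
    f"</svg>"
)
ACTUAL = (
    f'<svg xmlns="{SVG_NS}">'
    f'<use href="#MJMATHI-78" xmlns="{XLINK_NS}"/>'
    f"</svg>"
)


def describe(xml_text):
    """Return the expanded name of the first child and its attributes.

    ElementTree reports names in "{namespace}local" form, so the namespace
    each name resolved to is directly visible.
    """
    use = ET.fromstring(xml_text)[0]
    return use.tag, dict(use.attrib)


if __name__ == "__main__":
    print(describe(EXPECTED))
    print(describe(ACTUAL))
```

In the first form the element resolves to the SVG namespace and the
attribute to the XLink namespace.  In the second form the element resolves
to the XLink namespace and the 'href' attribute is in no namespace at all,
since unprefixed attributes never pick up the default namespace.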

You can work around this by manually setting the "prefix" property on each
xlink:href attribute.

Looking into why this happens, I can see that the serializer is seriously
broken in a number of ways when the DOM is constructed with incomplete
(e.g. missing namespace declarations) or inconsistent information (e.g.
same prefix used for different namespaces in the same context).

I found at least 6 bugs outstanding (#16739 [2], #16496 [3], #19121 [4],
#22958 [5], #83056 [6], #106531 [7]) and filed a new one (#117764 [8]).
Some of these date back to 2007 (6 years ago!).

These bugs break down to these categories:

1. Default namespace issues: #16739, #106531, #16496
2. Conflicting prefix mappings: #117764, #19121
3. Namespace attribute issues: #22958, #83056, #117764

In looking at the code (MarkupAccumulator.cpp), they all suffer from one of
two problems:

1. The computed prefix isn't properly used for the declaration.

2. The generated namespace mappings aren't properly stored, scoped, or
dealt with when they are inconsistent.

There is a general assumption in the code that certain prefixes should
always be used for certain namespaces.  Unfortunately, the code applies
those prefixes without looking to see whether there is a conflict already
in scope.  Also, when the namespace is not recognized and there is no
prefix, a prefix needs to be generated for the serialization.
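
As a rough illustration of the kind of bookkeeping this implies, here is a
minimal Python sketch.  It is not the WebKit code and not a proposed patch:
the NamespaceScope class, the choose_prefix() function, and the "ns1",
"ns2", ... naming scheme are all hypothetical.  It keeps prefix mappings in
a per-element scope stack, reuses a prefix already bound to the namespace,
uses the preferred prefix only when it is free, and otherwise generates a
new one.  Whatever prefix is returned is also the one that gets declared.

```python
import itertools
from typing import Dict, List, Optional, Tuple

XLINK_NS = "http://www.w3.org/1999/xlink"


class NamespaceScope:
    """Stack of prefix -> namespace URI maps, one frame per open element."""

    def __init__(self) -> None:
        self._frames: List[Dict[str, str]] = [{}]

    def push(self) -> None:
        self._frames.append({})

    def pop(self) -> None:
        self._frames.pop()

    def declare(self, prefix: str, uri: str) -> None:
        self._frames[-1][prefix] = uri

    def lookup(self, prefix: str) -> Optional[str]:
        """Return the URI the prefix is bound to in scope, if any."""
        for frame in reversed(self._frames):
            if prefix in frame:
                return frame[prefix]
        return None

    def prefix_for(self, uri: str) -> Optional[str]:
        """Return a non-empty, non-shadowed prefix bound to uri, if any."""
        shadowed = set()
        for frame in reversed(self._frames):
            for prefix, bound in frame.items():
                if prefix and prefix not in shadowed and bound == uri:
                    return prefix
            shadowed.update(frame)
        return None


def choose_prefix(
    scope: NamespaceScope, uri: str, preferred: Optional[str] = None
) -> Tuple[str, bool]:
    """Return (prefix, needs_declaration) for a namespaced attribute."""
    existing = scope.prefix_for(uri)
    if existing is not None:
        return existing, False
    if preferred and scope.lookup(preferred) is None:
        candidate = preferred
    else:
        # Preferred prefix is missing or bound to another URI: generate one.
        candidate = next(
            f"ns{n}" for n in itertools.count(1)
            if scope.lookup(f"ns{n}") is None
        )
    scope.declare(candidate, uri)
    return candidate, True
```

With an empty scope, choose_prefix(scope, XLINK_NS, "xlink") yields the
"xlink" prefix and asks for a declaration; a second call in the same scope
reuses it without one.  If "xlink" is already bound to some other namespace
in scope, a generated prefix is returned instead.  Redeclaring the
preferred prefix in the inner scope would also be legal XML; generating a
fresh prefix is just the simpler policy for a sketch.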

Having written several robust XML serializers for other projects, I believe
this can all be fixed in a straightforward way.  I've looked at the code
and know what should be done.  The changes are probably modest.

Unfortunately, I can't spend the time to directly write and test the code
till probably after November.  :(

I am certainly willing to help, explain my strategy, advise, test, etc. if
there is another willing developer out there who would like to see these
bugs closed.

[1] http://www.mathjax.org/
[2] https://bugs.webkit.org/show_bug.cgi?id=16739
[3] https://bugs.webkit.org/show_bug.cgi?id=16496
[4] https://bugs.webkit.org/show_bug.cgi?id=19121
[5] https://bugs.webkit.org/show_bug.cgi?id=22958
[6] https://bugs.webkit.org/show_bug.cgi?id=83056
[7] https://bugs.webkit.org/show_bug.cgi?id=106531
[8] https://bugs.webkit.org/show_bug.cgi?id=117764


-- 
--Alex Milowski
"The excellence of grammar as a guide is proportional to the paucity of the
inflexions, i.e. to the degree of analysis effected by the language
considered."

Bertrand Russell in a footnote of Principles of Mathematics