[webkit-dev] Writing a new XML parser with no external libraries

Alex Milowski alex at milowski.org
Wed Jun 29 06:59:27 PDT 2011

On Wed, Jun 29, 2011 at 3:39 AM, Maciej Stachowiak <mjs at apple.com> wrote:
> Both RapidXml and Expat apparently have not been updated in quite some time (since 2009 and 2007 respectively). Copying an unmaintained project into the WebKit repository and forking it is certainly a possible alternative to writing something new based on the approach of our HTML5 parser. But I'm not sure if this approach gives us a more hackable code base in the long term. Cases where we've done this have often resulted in code that doesn't fit WebKit style and isn't fully understood by anyone.
> RapidXml in particular only claims "reasonable W3C compatibility", which likely is not an adequate level of conformance for a browser engine. I don't know if updating it to full XML 1.0 and Namespaces in XML 1.0 compliance would be a lesser effort than adapting the HTML parser.

I agree with this assessment.  I went through an extensive search
earlier this year (just after the contributor's meeting).  Expat is a
good parser but I worry about the support.  It will also suffer from
the string copy problem.

I also checked around with some of my contacts at XML software vendors
to determine what they've done.  Some of them have "gone native" as
well and written their own parsers to deal with performance issues
regarding their own internal APIs, etc.

--Alex Milowski
"The excellence of grammar as a guide is proportional to the paucity of the
inflexions, i.e. to the degree of analysis effected by the language
considered."

Bertrand Russell in a footnote of Principles of Mathematics
