[Webkit-unassigned] [Bug 53375] space character treated as newline with XSLT stylesheet when run using XSLTProcessor script API, but not when using xml-stylesheet pi

bugzilla-daemon at webkit.org bugzilla-daemon at webkit.org
Sun Jan 30 07:53:52 PST 2011


https://bugs.webkit.org/show_bug.cgi?id=53375





--- Comment #2 from Martin Honnen <martin.honnen at gmx.de>  2011-01-30 07:53:52 PST ---
(In reply to comment #1)

> The issue is that other browsers apply XSL transformations to existing DOM trees, violating the spec. WebKit's XSLTProcessor serializes DOM trees to create XML documents (and XSL stylesheets)  for transformation.

Which spec is violated by transforming a DOM tree with XSLT? Are you saying that WebKit serializes the DOM node passed to importStylesheet to follow some spec? Which one?
And if you have chosen to serialize the DOM tree, shouldn't that happen in a way that round-trips correctly and does not change the meaning of the stylesheet?

> Now, serializing LF in attribute value doesn't produce a character reference like &#10; - you can see it by opening <http://home.arcor.de/martin.honnen/safariBugs/test2011012802Xsl.xml> in Firefox or WebKit, and executing the following script in browser address bar:
> 
> javascript:alert((new XMLSerializer).serializeToString(document.documentElement))
> 
> And parsing XML with an actual line feed embedded produces a space in DOM tree - again, we match Firefox here.


To make it clear, the XSLT stylesheet contains a numeric character reference '&#10;' in an attribute value, so a compliant XML parser must not normalize it to a space. Rather, the attribute value should contain a LF character, as http://www.w3.org/TR/xml/#AVNormalize says "For a character reference, append the referenced character to the normalized value".
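
As a rough illustration of that normalization rule, here is a minimal JavaScript sketch. The function name normalizeAttributeValue is hypothetical, and the code is a simplification: it assumes only numeric character references, the five predefined entities, and literal white space occur, and it ignores general entities and the extra processing for non-CDATA attribute types.

```javascript
// Simplified sketch of XML 1.0 attribute-value normalization (section 3.3.3).
// Assumption: only numeric character references, the five predefined
// entities, and literal white space are handled.
function normalizeAttributeValue(raw) {
  const predefined = { lt: "<", gt: ">", amp: "&", quot: '"', apos: "'" };
  // End-of-line handling happens first: CRLF and lone CR become LF.
  const text = raw.replace(/\r\n?/g, "\n");
  let out = "";
  let i = 0;
  while (i < text.length) {
    const m = /^&(#x[0-9a-fA-F]+|#[0-9]+|[A-Za-z]+);/.exec(text.slice(i));
    if (m) {
      const ref = m[1];
      if (ref.startsWith("#x")) {
        // Hexadecimal character reference: append the referenced character.
        out += String.fromCodePoint(parseInt(ref.slice(2), 16));
      } else if (ref.startsWith("#")) {
        // Decimal character reference: append the referenced character.
        out += String.fromCodePoint(parseInt(ref.slice(1), 10));
      } else if (Object.prototype.hasOwnProperty.call(predefined, ref)) {
        out += predefined[ref];
      } else {
        throw new Error("unsupported entity reference: " + ref);
      }
      i += m[0].length;
    } else if (text[i] === "\n" || text[i] === "\t" || text[i] === " ") {
      // A literal white space character is appended as a single space.
      out += " ";
      i += 1;
    } else {
      out += text[i];
      i += 1;
    }
  }
  return out;
}
```

With this sketch, normalizeAttributeValue("Line 1&#10;Line 2") keeps the LF, while the same input with a literal line feed in place of the reference yields "Line 1 Line 2".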

And I think the DOM trees in both WebKit and Firefox contain a LF and not a space in the 'select' attribute, so at that stage the stylesheet is fine.

If WebKit serializes the stylesheet for further processing with its XSLT processor then that step should not change the meaning of the stylesheet. I don't think the behaviour of XMLSerializer in Firefox should be used as an argument. If DOMParser and XMLSerializer results in Firefox or WebKit don't round-trip then they are not suitable for serializing a DOM node with an XSLT stylesheet to an XML document, if that is deemed necessary for executing the stylesheet.

Other DOMParser/XMLSerializer implementations do round-trip a numeric character reference '&#10;', for instance in Opera (tested with 11.01) the code

var xml = '<test att1="Line 1&#10;Line 2"/>';
var doc = new DOMParser().parseFromString(xml, 'application/xml');
new XMLSerializer().serializeToString(doc)

gives the result

<test att1="Line 1&#xa;Line 2"/>

so there it is ensured that the meaning is not changed (only the lexical representation changes from '&#10;' to '&#xa;' but that does not change the semantics of the document).

I am not sure whether XMLSerializer/DOMParser have ever been specified but there are XML serialization specifications, for instance http://www.w3.org/TR/xslt-xquery-serialization/ in "5 XML Output Method" says "characters MUST be output as character references, to ensure that they survive the round trip through serialization and parsing. (...) while CR, NL, TAB, NEL and LINE SEPARATOR characters in attribute nodes MUST be output respectively as "&#xD;", "&#xA;", "&#x9;", "&#x85;", and "&#x2028;", or their equivalents".
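
To illustrate what that serialization requirement amounts to for attribute values, here is a minimal JavaScript sketch. The function name escapeAttributeValue is hypothetical and this is not WebKit's or any browser's actual serializer. It assumes the attribute is written with double quotes and escapes only the markup characters plus the five white space characters named in the quoted passage.

```javascript
// Minimal sketch of attribute-value escaping in the spirit of the
// "XML Output Method" passage quoted above.
// Assumption: the attribute is delimited by double quotes.
function escapeAttributeValue(value) {
  const refs = {
    "&": "&amp;",
    "<": "&lt;",
    '"': "&quot;",
    "\r": "&#xD;",       // CR
    "\n": "&#xA;",       // NL (line feed)
    "\t": "&#x9;",       // TAB
    "\u0085": "&#x85;",  // NEL
    "\u2028": "&#x2028;" // LINE SEPARATOR
  };
  // Replace each listed character with its reference; all other
  // characters are passed through unchanged.
  return value.replace(/[&<"\r\n\t\u0085\u2028]/g, (ch) => refs[ch]);
}

// Usage: a line feed in the DOM attribute value is written as a reference,
// so a later parse yields a line feed again rather than a space.
const serialized = '<test att1="' + escapeAttributeValue("Line 1\nLine 2") + '"/>';
// serialized is '<test att1="Line 1&#xA;Line 2"/>'
```

Because the line feed is emitted as a character reference rather than a literal character, attribute-value normalization on re-parsing does not turn it into a space.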

To summarize, I don't think serializing a stylesheet should change its meaning, thus if WebKit needs to serialize the DOM node passed to importStylesheet then it should do so in a way that the meaning of the stylesheet is not changed. For that a linefeed in an attribute value must be serialized as a character reference i.e. either '&#xA;' or '&#10;'.
