[Webkit-unassigned] [Bug 14945] An ampersand ("&") appearing in a document is treated as a fatal error (instead of a non-fatal error)

bugzilla-daemon at webkit.org bugzilla-daemon at webkit.org
Tue Aug 14 17:34:40 PDT 2007


http://bugs.webkit.org/show_bug.cgi?id=14945





------- Comment #7 from robburns1 at mac.com  2007-08-14 17:34 PDT -------
(In reply to comment #5)
> 1) We should check what other XML processors do for this error.
> 2) I believe this particular document violates the well-formedness constraints.
> It's the fact that there is no semicolon before the next & that violates
> well-formedness, not the fact that the entity is not declared. So I think this
> bug is invalid, but I don't have time to study the spec further right now.


That is exactly why I went through the spec and its well-formedness rules so
carefully. Here are the well-formedness constraints:

well-formedness constraint
[Definition: A rule which applies to all well-formed XML documents. Violations
of well-formedness constraints are fatal errors.]

Well-formedness constraint: PEs in Internal Subset
In the internal DTD subset, parameter-entity references must not occur within
markup declarations; they may occur where markup declarations can occur. (This
does not apply to references that occur in external parameter entities or to
the external subset.)

Well-formedness constraint: External Subset
The external subset, if any, must match the production for extSubset.

Well-formedness constraint: PE Between Declarations
The replacement text of a parameter entity reference in a DeclSep must match
the production extSubsetDecl.

Well-formedness constraint: Element Type Match
The Name in an element's end-tag must match the element type in the start-tag.

Well-formedness constraint: Unique Att Spec
An attribute name must not appear more than once in the same start-tag or
empty-element tag.

Well-formedness constraint: No External Entity References
Attribute values must not contain direct or indirect entity references to
external entities.

Well-formedness constraint: No < in Attribute Values
The replacement text of any entity referred to directly or indirectly in an
attribute value must not contain a <.

Well-formedness constraint: Legal Character
Characters referred to using character references must match the production for
Char.

Well-formedness constraint: Entity Declared
In a document without any DTD, a document with only an internal DTD subset
which contains no parameter entity references, or a document with
"standalone='yes'", for an entity reference that does not occur within the
external subset or a parameter entity, the Name given in the entity reference
must match that in an entity declaration that does not occur within the
external subset or a parameter entity, except that well-formed documents need
not declare any of the following entities: amp, lt, gt, apos, quot. The
declaration of a general entity must precede any reference to it which appears
in a default value in an attribute-list declaration.

Well-formedness constraint: Parsed Entity
An entity reference must not contain the name of an unparsed entity. Unparsed
entities may be referred to only in attribute values declared to be of type
ENTITY or ENTITIES.

Well-formedness constraint: No Recursion
A parsed entity must not contain a recursive reference to itself, either
directly or indirectly.

Well-formedness constraint: In DTD
Parameter-entity references must not appear outside the DTD.

The spec mentions well-formedness in other places, but nothing I can find says
that a stray ampersand violates a well-formedness constraint.

What the spec does say about this is (as I quoted above):

"The ampersand character (&) and the left angle bracket (<) must not appear in
their literal form, except when used as markup delimiters, or within a comment,
a processing instruction, or a CDATA section"
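
To make the quoted rule concrete, here is a minimal sketch of which placements
of "&" one parser accepts. It assumes Python's standard-library
xml.etree.ElementTree (built on expat) as a stand-in processor; the helper name
`parses` and the sample documents are hypothetical and are not taken from the
bug report.

```python
import xml.etree.ElementTree as ET


def parses(doc: str) -> bool:
    """Return True if the parser accepts doc, False if it reports an error."""
    try:
        ET.fromstring(doc)
        return True
    except ET.ParseError:
        return False


# Hypothetical sample documents: one per context named in the quoted rule,
# plus the literal ampersand that this bug is about.
samples = {
    "escaped": "<p>Tom &amp; Jerry</p>",
    "cdata": "<p><![CDATA[Tom & Jerry]]></p>",
    "comment": "<p><!-- Tom & Jerry --></p>",
    "pi": "<p><?note Tom & Jerry?></p>",
    "literal": "<p>Tom & Jerry</p>",
}

results = {name: parses(doc) for name, doc in samples.items()}
for name, ok in results.items():
    print(f"{name:8} accepted={ok}")
```

This only shows what this one parser does. It says nothing about whether the
recommendation requires the rejection to be fatal.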

On the issue of other implementations, I think this is a problem with other
implementations as well. There's no damage caused by WebKit being less
draconian than the other implementations. The XML recommendation never intended
this level of draconian error-handling.
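
As one data point on other implementations, the sketch below feeds two
hypothetical documents to Python's expat-based xml.etree.ElementTree: one with
an undeclared entity, and one where a reference has no semicolon before the
next "&" (the case described in comment #5). The function name `parse_error`
and both sample documents are assumptions made for illustration, not the test
case attached to this bug.

```python
import xml.etree.ElementTree as ET


def parse_error(doc: str):
    """Return the parser's error message for doc, or None if it parsed."""
    try:
        ET.fromstring(doc)
    except ET.ParseError as exc:
        return str(exc)
    return None


# Hypothetical inputs, not the document from the bug report.
undeclared = "<p>&foo;</p>"            # well-delimited, but never declared
no_semicolon = "<p>&foo &bar;</p>"     # "&foo" is not closed before the next "&"
escaped = "<p>&amp;foo &amp;bar;</p>"  # ampersands escaped as &amp;

for label, doc in [("undeclared", undeclared),
                   ("no semicolon", no_semicolon),
                   ("escaped", escaped)]:
    print(f"{label:13} -> {parse_error(doc)}")
```

In this parser both failing cases raise an exception and no tree is returned,
so the application never sees the rest of the document.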

What could WebKit possibly be solving by treating these errors as fatal errors
(when the XML recommendation doesn't call for that)? The XML recommendation has
a reputation for being draconian, yet the implementations felt the need to be
even more draconian? What's up with that?

