[Webkit-unassigned] [Bug 14952] New: Unknown character reference (general reference) mistakenly treated as a fatal-error rather than a non-fatal error (and crashes with current Safari 3 beta)

bugzilla-daemon at webkit.org
Sun Aug 12 16:17:04 PDT 2007


http://bugs.webkit.org/show_bug.cgi?id=14952

           Summary: Unknown character reference (general reference)
                    mistakenly treated as a fatal-error rather than a
                    non-fatal error (and crashes with current Safari 3 beta)
           Product: WebKit
           Version: 522+ (nightly)
          Platform: Macintosh
        OS/Version: Mac OS X 10.4
            Status: UNCONFIRMED
          Severity: Major
          Priority: P2
         Component: XML
        AssignedTo: webkit-unassigned at lists.webkit.org
        ReportedBy: robburns1 at mac.com


XML lists the following fatal errors (http://www.w3.org/TR/xml/#dt-fatal):
 • Well-formedness constraint violation
(http://www.w3.org/TR/xml/#dt-wellformed)
 • Encoding declaration errors (http://www.w3.org/TR/xml/#dt-fatal)
   - entity in the wrong encoding
   - an encoding declaration not at the beginning of an entity
   - whenever the encoding cannot be processed
 • And under forbidden (http://www.w3.org/TR/xml/#forbidden):
   - the appearance of a reference to an unparsed entity, except in the
EntityValue in an entity declaration.
   - the appearance of any character or general-entity reference in the DTD
except within an EntityValue or AttValue.
   - a reference to an external entity in an attribute value.

The list of fatal errors does not mention character entity references (general
entity references), except within an XML DTD. So errant general entity
references are not part of the fatal error definition. No other errors are
fatal, and therefore: "Conforming software may detect and report an error and
may recover from it" (http://www.w3.org/TR/xml/#dt-error).

On the other hand, the recommendation says that unknown or undeclared
character entity references are a well-formedness constraint violation (a
fatal error) only for standalone='yes' documents. For standalone='no'
documents, they are instead a validity constraint violation (a non-fatal
error) (see: http://www.w3.org/TR/xml/#sec-references).
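
To make the distinction above concrete, here is a minimal Python sketch of the
rule as read in this report. It is not WebKit or libxml2 code: the function
name and the regular expression are hypothetical, and treating a missing
standalone declaration as 'no' is an assumption. It encodes only the reading
given here and is not a full implementation of the recommendation's conditions.

```python
import re

# Hypothetical helper: finds the standalone pseudo-attribute in an XML
# declaration at the very start of the document text.
_STANDALONE = re.compile(r"""<\?xml[^>]*\bstandalone\s*=\s*["'](yes|no)["']""")

def undeclared_reference_severity(document: str) -> str:
    """Classify an undeclared entity reference per the reading in this report."""
    match = _STANDALONE.match(document)
    # Assumption: a missing standalone declaration is treated as 'no'.
    standalone = match.group(1) if match else "no"
    if standalone == "yes":
        return "fatal"      # well-formedness constraint violation
    return "non-fatal"      # validity constraint violation
```

For example, a document starting with <?xml version="1.0" standalone="yes"?>
is classified as "fatal", while one declaring standalone="no" is classified as
"non-fatal".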

However, validity constraint violations are not fatal errors. Again, the
recommendation says: "Conforming software may detect and report an error and
may recover from it." This means that WebKit may report the unknown reference,
but it is not even required to do so. Since the recommendation allows WebKit
to recover from the error, I think it should. Replacing the unknown reference
with the Unicode replacement character (U+FFFD) would probably be the most
correct approach. No other error reporting should be necessary, as the
replacement character is sufficient to indicate that an error has occurred.
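
The proposed recovery could look roughly like the following Python sketch. It
is only an illustration of the idea, not how WebKit's XML parser works: the
names recover_unknown_references and declared are hypothetical, the pattern
matches only ASCII entity names, and the function operates on raw text, so it
ignores CDATA sections and comments.

```python
import re

REPLACEMENT = "\ufffd"  # Unicode replacement character

# The five entities predefined by XML itself.
PREDEFINED = {"lt", "gt", "amp", "apos", "quot"}

# Simplified pattern for &name; with ASCII names only. Numeric character
# references such as &#160; start with '#' and are therefore not matched.
_ENTITY_REF = re.compile(r"&([A-Za-z_:][A-Za-z0-9_.:\-]*);")

def recover_unknown_references(text, declared=frozenset()):
    """Replace references to undeclared entities with U+FFFD."""
    known = PREDEFINED | set(declared)

    def substitute(match):
        # Leave known references intact; replace everything else.
        return match.group(0) if match.group(1) in known else REPLACEMENT

    return _ENTITY_REF.sub(substitute, text)
```

Calling recover_unknown_references("<p>a &nbsp; b &amp; c</p>") leaves &amp;
alone and replaces &nbsp; with U+FFFD; passing declared={"nbsp"} leaves both
references untouched.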

These sorts of bugs give XML a reputation for more draconian error handling
than it actually has. I may file a separate bug on the issue of general
entities (this is also related to bug#14945).

