XML-DEV Mailing List Archive
Re: Possible changes for XML 2nd Edition
John Cowan wrote:
> Issue PE28:
>
> Currently the XML Recommendation is silent about the handling of
> documents that contain "impossible" bytes. For example, the byte 0xFF
> cannot appear in any UTF-8 encoded document. We are considering making
> such violations of the encoding a fatal error.
>
> PRO: an improperly encoded document is not really a text document at all;
> nothing should be done on the basis of it. XML's draconian error handling rule
> should lead to a "fatal error", which means the rest of the document must
> not be parsed.
>
> CON: Some parsers may be relying on libraries supplied by the OS, which may
> not properly signal erroneous input. Is it too great a burden on the
> parser implementor to impose this restriction?

I think this goes too far, for basic WF. Instead, I would propose another level of validity "character validity" which XML processors should be encouraged, but not required, to support, or to support as much as they can.

Unlike validity, which sits on top of well-formedness, "character validity" sits more-or-less underneath well-formedness as XML's soft underbelly.

An XML document that was "character valid" would
 1) not have any impossible bytes in any entity
 2) not have a BOM if the encoding="utf16le" or "utf16be" (and any other encoding constraints)
 3) all names in markup must follow the NAMECHAR conventions.
 4) all data Unicode-normalized

This would keep a basic XML implementation that did not support "character validity" simple:
 1) it can use any library for transcoding
 2) it does not have to have any special BOM handling for utf16xe
 3) it can tokenize tags based on whitespace and delimiters rather than NAMECHAR or NAMESTRT
 4) normalization not checked/enforced

A character-validating processor should be the goal for any XML processor not specifically aimed at ultra-lightweight uses.

Rick Jelliffe
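To make the proposed "character validity" checks concrete, here is a minimal sketch in Python of three of the four checks listed above (impossible bytes, BOM with an explicit-endian UTF-16 encoding, and normalization). It is an illustration under stated assumptions, not anything from the XML Recommendation: the function name `character_validity_problems` is hypothetical, the choice of NFC as the normalization form is an assumption (the proposal does not name one), and the name-character check (3) is left out because it needs a tokenizer.

```python
import codecs
import unicodedata


def character_validity_problems(raw: bytes, encoding: str) -> list[str]:
    """Return a list of problems found; an empty list means these checks passed.

    Hypothetical helper: covers checks 1, 2 and 4 of the proposal only.
    """
    problems = []
    # Canonicalize the label, e.g. "UTF-16LE" -> "utf-16-le".
    enc = codecs.lookup(encoding).name

    # Check 2: no BOM when the encoding already fixes the byte order.
    if enc == "utf-16-le" and raw.startswith(codecs.BOM_UTF16_LE):
        problems.append("BOM present with explicit-endian encoding")
    elif enc == "utf-16-be" and raw.startswith(codecs.BOM_UTF16_BE):
        problems.append("BOM present with explicit-endian encoding")

    # Check 1: no "impossible" bytes. Strict decoding rejects them,
    # for example 0xFF anywhere in a UTF-8 entity.
    try:
        text = raw.decode(enc, errors="strict")
    except UnicodeDecodeError as exc:
        problems.append(f"impossible byte sequence at offset {exc.start}")
        return problems

    # Check 4: data is Unicode-normalized (NFC is assumed here).
    if not unicodedata.is_normalized("NFC", text):
        problems.append("text is not NFC-normalized")

    return problems
```

A processor that only wants basic well-formedness could skip this function entirely; one aiming at "character validity" could call it on each entity before parsing and report whatever it returns.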