RE: Request for Erratum to XML 1.0 and 1.1 Specs
This is not an erratum. It is a change proposal.

Michael Kay

> -----Original Message-----
> From: Rick Jelliffe [mailto:ricko@a...]
> Sent: 21 October 2003 10:11
> To: xml-dev@l...
> Subject: Request for Erratum to XML 1.0 and 1.1 Specs
>
> I have just sent this off to the XML Editor mail list. I encourage
> anyone who thinks it is good or bad (or who just thinks there should
> be something but doesn't care what) to also send to them.
>
> It also raises an interesting question: the XML spec is written in
> draconian terms with, nominally, very few options. Yet SAX 2, the
> almost universally deployed parser interface, is highly
> parameterizable with features, handlers and properties. So it cannot
> be too tragic to accept that some systems may need to bend certain
> rules, without altering the basic definitions.
>
> Rick
>
> ===============================================================
>
> Request for Erratum to XML 1.0 and 1.1 Specs
> ----------------------------------------------
> Rick Jelliffe, ricko@t..., 2003-10-21
>
> I request the XML Working Group please consider the following
> erratum to XML 1.0 which should also apply to XML 1.1.
>
> The following two paragraphs, or something to the same effect,
> should be appended to section 5.1 "Validating and Non-Validating
> Processors"
>
> "A non-validating processor may, at user option, imply definitions
> for all the character entities defined by HTML 4[1]. A document or
> entity for which definitions are implied is not well-formed. The
> processor must report a non-fatal error. NOTE: The document is 'not
> well-formed but processed'. Reliance on this feature by
> specifications is deprecated; this option may be withdrawn at some
> future time should it prove dangerous."
>
> "A non-validating processor which provides the HTML 4 definitions
> may, at user option, also imply definitions for other MathML and
> ISO standard sets[2]. A processor must report a non-fatal error.
> The document is 'not well-formed but processed'. NOTE: Reliance on
> this feature by specifications is deprecated; this option may be
> withdrawn at some future time should it prove dangerous."
>
> [1] http://www.w3.org/TR/html401/sgml/entities.html
> [2] http://www.w3.org/TR/MathML2/chapter6.html#chars_entity-tables
>
> This suggested erratum has the following characteristics:
>
> 1) It does not require any change to any XML processor
> 2) It does not change the basic XML characteristic that the only
>    way to guarantee information is received at the other end is to
>    use a UTF-* encoding, no entities and no attribute defaulting.
> 3) It maintains the current layering, so no re-architecting or
>    change in design is needed
> 4) It keeps the XML specification as the location on how to go from
>    characters to data+markup.
>
> 5) It does not make any existing valid XML document invalid
> 6) It does not make any existing invalid XML document valid
> 7) It does not make any existing WF document or entity non-WF
> 8) It does not make any existing non-WF document formally WF
>
> 9) It does allow the continued non-validating processing of
>    documents which are non-WF only because they contain standard
>    references
> 10) It limits this to user option
> 11) It does not allow other specifications to use this as its
>     default
> 12) It can be withdrawn
>
> 13) I believe it is practical and would be simple to implement.
>
> I believe the beneficiaries of such an erratum include:
>
> * Users typing in editors with no adequate input methods for
>   non-ASCII characters. I note that although Unicode editors can
>   display many characters, not all operating systems have input
>   methods to allow convenient data entry even of Latin1 characters.
>   (I believe this is better provided by using decent XML markup
>   editors, without prejudice.)
>
> * XHTML users who are used to named references without declarations
>   in HTML.
>
> * Potential XInclude users, who may wish to treat a WF parsed entity
>   from a document that uses standard character references as a
>   microdocument
>
> * Potential XML Schemas, Schematron and RELAX NG users who may wish
>   to upgrade from DTDs.
>
> * Potential XQuery users who are being hindered by the lack of XML
>   Schemas.
>
> * XML pipeline systems which can pass XML without requiring tricky
>   prologs
>
> * SOAP, RSS and RDF systems which must cope with data fragments from
>   externally-generated documents being embedded
>
> * Programmers serializing data to XML, especially for internal
>   systems, who may prefer to generate "&mdash;" or "&nbsp;" rather
>   than the numeric or literal equivalents.
>
> * Vendors who make products for the above
>
> * Low-sight or motion-impaired users whose speech synthesizers or
>   input methods only support ASCII characters. Aged, enraged or
>   diminished capacity users who may be frustrated at having to look
>   up the number for something they know the name for. (Though I do
>   not want to suggest that "entity rage" is a hidden problem.)
>
> I suggest its benefits over other suggested approaches include:
>
> * It does not require change to subsequent processes, as PSVI
>   processing would, nor any changes or additions to schema
>   specifications
>
> * It does not require pre-processing, as a macro processor would
>
> * It does not require the introduction and deployment of new
>   transcoders, as would Tim Bray and John Cowan's recent thought
>   experiment "UTF-8+Names"
>
> * It does not require interaction with other standards groups,
>   notably XML Schemas EG or IANA or IETF.
>
> * By providing it at user option, it can succeed or fail; if it is
>   popular and successful, that is good; if it is unpopular or
>   unsafe, it can be withdrawn.
>
> * By limiting itself to the HTML and the MathML/ISO entities, it
>   avoids issues of user-defined entities, and the need to enumerate
>   the entities.
>
> * It does not define mappings for those characters, but defers to
>   HTML and MathML/ISO, who may provide standard mappings.
>
> This gives a very wide constituency:
>
> I note that Xerces' SAX 2 provides features by which a parser can
> continue processing after an error. This proposal could be seen as
> a very limited nod of recognition of that kind of practice.
>
> Cheers
> Rick Jelliffe
>
> -----------------------------------------------------------------
> The xml-dev list is sponsored by XML.org <http://www.xml.org>, an
> initiative of OASIS <http://www.oasis-open.org>

The list archives are at http://lists.xml.org/archives/xml-dev/

To subscribe or unsubscribe from this list use the subscription
manager: <http://lists.xml.org/ob/adm.pl>
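
For readers who want to see what "implying definitions at user option" could look like in practice, here is a minimal Python sketch. It is not taken from the proposal or from any existing processor. The function names `implied_doctype` and `parse_lenient` and the `imply_html_entities` flag are hypothetical, and the sketch emulates the behaviour by retrying with a generated internal DTD subset, which is only an approximation of a parser that implies the definitions itself. It assumes the input has no XML declaration and no DOCTYPE of its own, and it covers only the HTML 4 set (via the standard library's `html.entities.name2codepoint`), not the MathML/ISO sets.

```python
import warnings
import xml.etree.ElementTree as ET
from html.entities import name2codepoint

# Names XML already predefines; they are left out of the implied set.
PREDEFINED = {"lt", "gt", "amp", "quot", "apos"}


def implied_doctype():
    """Build a DOCTYPE whose internal subset declares the HTML 4 entities.

    The DOCTYPE name "implied" is arbitrary: a non-validating parser does
    not check it against the root element.
    """
    decls = "".join(
        f'<!ENTITY {name} "&#{cp};">'
        for name, cp in sorted(name2codepoint.items())
        if name not in PREDEFINED
    )
    return f"<!DOCTYPE implied [{decls}]>"


def parse_lenient(text, imply_html_entities=False):
    """Parse strictly first; retry with implied entities only if asked.

    Hypothetical helper. Assumes `text` has no XML declaration or DOCTYPE.
    """
    try:
        return ET.fromstring(text)
    except ET.ParseError as err:
        if not imply_html_entities:
            raise  # default stays strict: the document is not well-formed
        # If the failure had another cause, this retry raises again.
        root = ET.fromstring(implied_doctype() + text)
        # The "non-fatal error": report it, but keep the parsed result.
        warnings.warn(f"not well-formed but processed: {err}")
        return root
```

Called as `parse_lenient("<p>a&mdash;b</p>")`, the function raises `ET.ParseError`, as any conforming parser would. With `imply_html_entities=True` it returns the element and issues a warning instead, so the caller still learns that the input was not well-formed.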