Re: Common XML (was Re: Document Feature Requirements)
[Rick Jelliffe]
>
>I think the name "Common XML" doesn't capture where the Common XML
>conventions are most appropriate. Some name like "Exchange XML" or
>"Round-trippable XML" would be more appropriate.
>

True. My first cut at a spec for it over on SML-DEV used the phrase "Interchange XML". After a while, though, the group settled on "Common XML".

>I welcome Common XML. We need to keep a profile to let people know which
>features have been implemented well or are appropriate at any time.

Yup. In SGML-land years ago Wayne Wohler used the term "Monastic SGML" for what we are calling "Common XML".

[...]

>
>Does anyone else detect a running fallacy throughout some of the other
>sections that goes "if you want to use an apple as an orange you cannot,
>therefore you should use only oranges and we don't need to provide apples"?
>
>For example, comments should not be used because it is impossible to
>guarantee roundtripping or to transport data in them. PIs are "ambiguous"
>"because they are not part of the document's character data" and "many
>simple applications ignore or discard them". CDATA section tags don't
>have semantic meaning.
>
>However, it is the nature of a comment that it is not data -- it is
>an annotation for the benefit of human readers on the state of the text
>of a document.

The above interpretation of a comment is the result of a human being attaching semantics to a construct that, of itself, has no such semantics. What is this:

    <!-- bgcolour=#00AA00 -->

Syntactically, it is a comment. Semantically, it is something else, probably. There is nothing in the nature of a comment that makes it not data. It is all data at the end of the day, and processing systems will attach arbitrary semantics to chunks of it as they see fit.

What *is* different between a comment and PCDATA is that the syntax demarcates them differently and the XML type system assigns different handles to them.
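The point about different handles can be sketched with Python's standard xml.dom.minidom. This is only an illustration, not something from the original thread: the function names, the HANDLES mapping, and the "name=value" rule are assumptions made up for the example. The parser hands back the comment and the character data under different node types, and it is the application that decides to read the comment as a setting.

```python
from xml.dom import minidom, Node

# Labels follow the post's vocabulary; the mapping itself is an
# illustrative assumption, not part of any specification.
HANDLES = {Node.COMMENT_NODE: "COMMENT", Node.TEXT_NODE: "PCDATA"}


def classify_children(xml_text):
    """Return (handle, raw data) pairs for comment and text children of the root."""
    root = minidom.parseString(xml_text).documentElement
    return [(HANDLES[child.nodeType], child.data)
            for child in root.childNodes
            if child.nodeType in HANDLES]


def comment_settings(xml_text):
    """Hypothetical application rule: treat 'name=value' comments as settings."""
    settings = {}
    for handle, data in classify_children(xml_text):
        if handle == "COMMENT" and "=" in data:
            name, value = data.strip().split("=", 1)
            settings[name.strip()] = value.strip()
    return settings


sample = "<page><!-- bgcolour=#00AA00 -->some text</page>"
print(classify_children(sample))
print(comment_settings(sample))
```

Both chunks arrive as plain strings; only the handle differs, and the meaning attached to the comment lives entirely in comment_settings.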
In the ontology of XML types, PCDATA and COMMENT are at the same level:

    XML CONSTRUCT
      - COMMENT
      - PCDATA
      - ELEMENT
      - ATTRIBUTE
      etc.

It could be argued that the XML element type system is powerful enough to handle COMMENTS and PROCESSING INSTRUCTIONS without adding extra syntax and ontology entries. I.e. elements have an associated element type name. If the element type name is "xml:pi", the element contents have the general semantics of a processing instruction as defined in the XML 1.0 spec. If the element type name is "xml:comment", the element contents have the general semantics of a comment as defined in the XML 1.0 spec.

The result is the use of "element syntax" for comments and PIs. If this is done in the syntax of the XML documents, there is no need for comments or PIs, and issues of round-tripping just go away. Of course they are round-tripped, because they are kosher elements.

An alternative, from a developer's point of view, is to leave them syntactically different in serialized XML form but treat them homogeneously at the API level. In Pyxie[1], for example, I read comments and PIs in their native form but transform PIs to element nodes internally. Thus in a Pyxie tree, there is no API for manipulating comments - you use the element API, restricting your attention to element nodes of type "xml:pi". When Pyxie serializes XML, it recreates the PI syntax from XML 1.0.

[1] http://www.pyxie.org

regards,
Sean,

***************************************************************************
This is xml-dev, the mailing list for XML developers.
To unsubscribe, mailto:majordomo@x...&BODY=unsubscribe%20xml-dev
List archives are available at http://xml.org/archives/xml-dev/
***************************************************************************
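The approach described in the post (read PIs in their native syntax, hold them as element nodes of type "xml:pi", and restore PI syntax on output) can be sketched as follows. This is not Pyxie's API: it is a minimal illustration built on Python's standard xml.dom.minidom, and the function names and the "target" attribute are assumptions made up for the example. Note also that the XML specification reserves names beginning with "xml", so "xml:pi" is used here only because the post uses it.

```python
from xml.dom import minidom, Node

PI_TYPE = "xml:pi"  # element type name taken from the post


def pis_to_elements(doc):
    """Replace each PI inside the root element with an element of type PI_TYPE.

    Assumption for this sketch: PIs outside the root element are left alone,
    since a document may hold only one top-level element.
    """
    def walk(node):
        for child in list(node.childNodes):
            if child.nodeType == Node.PROCESSING_INSTRUCTION_NODE:
                elem = doc.createElement(PI_TYPE)
                # "target" is a hypothetical attribute name chosen for this sketch.
                elem.setAttribute("target", child.target)
                elem.appendChild(doc.createTextNode(child.data))
                node.replaceChild(elem, child)
            else:
                walk(child)
    walk(doc.documentElement)
    return doc


def elements_to_pis(doc):
    """Turn PI_TYPE elements back into processing instructions for serialization."""
    for elem in list(doc.getElementsByTagName(PI_TYPE)):
        data = "".join(c.data for c in elem.childNodes
                       if c.nodeType == Node.TEXT_NODE)
        pi = doc.createProcessingInstruction(elem.getAttribute("target"), data)
        elem.parentNode.replaceChild(pi, elem)
    return doc


source = '<doc><?style color="green"?><p>hi</p></doc>'
tree = pis_to_elements(minidom.parseString(source))
# While in this form, PIs are reachable through the ordinary element API.
print([node.nodeName for node in tree.documentElement.childNodes])
print(elements_to_pis(tree).documentElement.toxml())
```

Between the two calls, code that only knows about elements can find and edit PIs with getElementsByTagName("xml:pi"); the serialized form is unchanged.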