RE: Saxon and Sun Serializer problems?
Thank you for engaging me on these details, Jim.

At 2009-05-30 15:10 -0700, Jim Tivy wrote:
>Hi Ken
>
>I read what you said below. The gist seems to be:
>
>Why would you want to do this?

I'm sorry I didn't make myself clear. My gist was: what feature(s) does having the DOCTYPE give you? Which is different.

There are so many other reasons why an XML document cannot be round-tripped through XSLT that just providing the DOCTYPE feature won't solve. I cited the lack of preservation of CDATA sections, the lack of preservation of the entity references (which includes numeric character references (not even resolved by a DOCTYPE), internal parsed general entities, external parsed general entities), and there are others including no link to NOTATION declarations for processing instruction target de-referencing (a very sore point of mine that the designers of XML processing interfaces have never felt the need to support).

So given so many features for round-tripping that are not there, just putting in the DOCTYPE won't fix any of the ones I've cited.

>I should point out the "this" had to do with using SAX in java with the jaxp
>Identity Transform. However, I now extend it more tentatively to include
>the "no DocType in the XDM" problem.

Yes, I saw that. I was trying to figure out what it was about the DOCTYPE that you would get when you can't get other things left out of the infoset or XDM.

>To give you some context of what I am doing - my need is primarily pragmatic
>- I am a java programmer trying to get from A to B.

Fine ... I won't hold that against you. :{)}

>In an Xml content management system users use a variety of Xml processors
>(or programs if you would prefer) like diverse Xml Editors - XMetal, Epic,
>XmlMind and the content management systems that have file Store and Retrieve
>capabilities as well as link extract and other Xml processing needs. All of
>these parts "process" Xml.

Actually, they process XML syntax, they don't process the information in an XML document.

XSLT and XQuery were designed to build new structures from the information in structured sources. They were not designed to process the syntax of an XML document.

XML editors, in particular, are designed to process the syntax of an XML document, and as we old (er, long-time) SGML'ers learned long ago, you can't base an XML editor on an XML processor in the same way you can't base an SGML editor on an SGML processor.

Now the DOM *does* have a few features that process some (not all!) of the syntax of an XML document, but the perspective is different. In the DOM the input tree *is* the output tree, unlike XSLT and XQuery where the input tree is read-only and the output tree is write-only: created, from scratch, in a single pass, without backtrack or repair or inspection.

>All of these parts rely on the DocType for
>validation or element insertion help or just need it to "round trip" the Xml
>so other processors can use that DocType. Without the DocType, the
>serialization loses some serious part of its capability.

Well now you've lost me again, because the limited number of serialization features in XSLT/XQuery renders the information found in the DOCTYPE quite irrelevant. The XSLT feature of adding a SYSTEM identifier is there, as I see it, really only for the validation bit, because what is serialized is the information that was used to build the result tree ... not the syntax borrowed from the source tree.

>Most of these parts operate on the serialization of the Xml from time to
>time. Editors read serializations, users import serializations -
>serializations are the standard way of exchanging and making xml processors
>interoperable.

Ummmmm .... I can't agree for anything other than XML editors, which are XML syntax applications, not XML information applications.

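To illustrate the DOM point above, here is a minimal sketch using Python's standard-library xml.dom.minidom as a stand-in for a DOM implementation. That choice is an assumption made for illustration only (the thread itself concerns Java), and the sample document and the "doc.dtd" name are hypothetical. It suggests how a DOM can carry some syntax (the DOCTYPE, a CDATA section) through an in-place edit, while a numeric character reference is still resolved away.

```python
from xml.dom import minidom

# Hypothetical sample document; minidom does not fetch "doc.dtd".
SOURCE = (
    '<!DOCTYPE doc SYSTEM "doc.dtd">'
    '<doc><a><![CDATA[x < y]]></a><b>&#65;BC</b></doc>'
)

dom = minidom.parseString(SOURCE)

# The DOM keeps a DocumentType node, so the DOCTYPE is part of the tree.
doctype = dom.doctype
print(doctype.name, doctype.systemId)

# Editing happens in place: the input tree *is* the output tree.
dom.documentElement.setAttribute("status", "edited")

# Serialize the same tree that was parsed and edited.
serialized = dom.toxml()
print(serialized)
```

Some syntax survives here only because this particular interface chose to model it; the character reference shows that not all of it does.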
XML-based applications are interoperable because the XML processors all deliver the same content information to the applications using them.

And the decision by designers of DOM to include syntax-related issues (note again, not all syntax-related issues) can enable many aspects of input syntax preservation because the DOM is acting *on* the document. XSLT and XQuery are not acting *on* the document, they are acting on the information found in the document.

>Not being able to use powerful tools like XSLT and Sax to process Xml when
>"round" tripping of the serialization is required, is restrictive to say the
>least, as these technologies have their own strengths - eg: DOM is not XSLT
>is not SAX.
>
>Fortunately SAX is usable on Java - just make sure to use the Sun's Trax
>serializer which keeps the docType as the saxon one drops the docType. (see
>earlier post).
>
>Does this begin to motivate the reason why?

I hear what you are trying to say, and I had already interpreted the need for syntax preservation to be to round trip the syntax of an XML document, but I haven't yet heard a justification for adding the DOCTYPE to XDM. Adding the DOCTYPE to XDM doesn't give you round-tripping of an arbitrary XML document because so much more would be needed. And all of it would be out of scope for XSLT/XQuery.

This comes up often in the classroom from students who thought XSLT and XQuery could/should be used for XML document syntax preservation. Because XSLT and XQuery are node-tree-transformation tools and not XML syntax tools, they cannot be used for syntax preservation. XSLT and XQuery are not angle-bracket processors, they are node-tree processors.

Serialization is not needed when the processor is embedded in, say, an XSL-FO engine. Serialization is a nice-to-have that allows one to create artefacts that can be useful as input to other XML-based tools.

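The difference between syntax and information can be seen with any tree-building XML API. The sketch below is not XSLT or XQuery; it uses Python's standard-library xml.etree.ElementTree purely as a stand-in for "a processor that hands the application a node tree", and the sample document is hypothetical. It parses a document that uses a DOCTYPE, an internal entity, a CDATA section and a numeric character reference, then serializes the resulting tree.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document with a DOCTYPE, an internal entity,
# a CDATA section and a numeric character reference.
SOURCE = """<!DOCTYPE doc [
  <!ENTITY co "Crane">
]>
<doc><a><![CDATA[x < y]]></a><b>&#65; &co;</b></doc>"""

root = ET.fromstring(SOURCE)

# The application sees only information: plain text in both elements.
print(repr(root.find("a").text))
print(repr(root.find("b").text))

# Serializing writes that information back out. The DOCTYPE, the CDATA
# markers and the references were never in the tree, so they cannot be
# reproduced from it.
result = ET.tostring(root, encoding="unicode")
print(result)
```

Restoring only the DOCTYPE on output would still leave the CDATA section and both references unrecoverable, which is the round-tripping gap described in this message.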
Consider source tree data projection: if I do an XSLT or XQuery transformation on a source node tree created from a non-XML source, what is the definition of the DOCTYPE? More to the point, what information might there have been put into a DOCTYPE in the interpretation of the projection to be useful in the node-tree transformation?

I claim there is no such information. And I haven't found such information in the response that you've given.

Thank you again for trying to help me better understand what you need. I really am trying to be supportive here to reveal what specific features of DOCTYPE you will find helpful.

. . . . . . . . . . . . . . Ken

--
XQuery/XSLT/XSL-FO hands-on training - Los Angeles, USA 2009-06-08
Crane Softwrights Ltd. http://www.CraneSoftwrights.com/x/
Training tools: Comprehensive interactive XSLT/XPath 1.0/2.0 video
Video lesson: http://www.youtube.com/watch?v=PrNjJCh7Ppg&fmt=18
Video overview: http://www.youtube.com/watch?v=VTiodiij6gE&fmt=18
G. Ken Holman mailto:gkholman@CraneSoftwrights.com
Male Cancer Awareness Nov'07 http://www.CraneSoftwrights.com/x/bc
Legal business disclaimers: http://www.CraneSoftwrights.com/legal






