Re: Re: Where does the "nothing left but toolkits" myth come fr
Fair enough. I wasn't thinking at that level of round-tripping, which I agree is problematic. What worried me about ERH's example was the potential for not even being able to round-trip text -- an issue that hasn't come up before (modulo entity references).

The problem is not limited to values, such as would occur with binary representations of real numbers. It also applies to formats. Dates and numbers have multiple formats, some of which may inadvertently carry information. For example, French genealogical data might represent dates from the Napoleonic period using the Napoleonic calendar; since this is how the data was originally recorded, it should probably continue to be represented that way, even though these dates can be converted to modern date systems.

Similarly, a transcription of notes written by a criminal suspect might include dates in a particular format. Since this format might be a clue to the suspect's nationality or background, changing the format would mean losing information.

Obviously, this additional information could be represented by additional metadata. But it is naive to think that all document designers will add such metadata.

-- Ron

Bob Foster wrote:
> Ronald Bourret wrote:
> > This points out something that should be a requirement for binary XML:
> > lossless roundtripping. In other words, you should be able to go from
> > the text serialization to the binary serialization and back losslessly
> > (within the confines of canonical XML). Same is true for binary <=>
> > text, binary <=> binary, and (of course) text <=> text.
>
> Of course text <=> text? This doesn't work today. I don't keep a list,
> but off the top of my head: information in the text such as character
> references and internal general entity references in attribute values
> are removed by parsers (e.g., SAX) and are not available to write back
> out again. This is a perennial source of XSLT questions. Until SAX2
> Extensions 1.1, SAX didn't report the xml declaration, so the
> application didn't know the original encoding. The application couldn't
> tell which attribute values were specified in the document and which
> came from the DTD as defaults. As ERH points out, canonicalization loses
> the DOCTYPE declaration. And so on.
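
The first item on Bob's list, character references and internal general entity references in attribute values, can be seen with a few lines of Python's standard xml.sax module. This is only a rough sketch: the sample document, the entity name "org", and the names AttributeCollector and reported_attributes are made up for illustration and are not from the thread.

```python
import xml.sax

# Hypothetical sample document: one attribute uses an internal general
# entity reference, the other a numeric character reference.
SOURCE = b"""<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE note [
  <!ENTITY org "XML-DEV">
]>
<note list="&org;" sign="&#169;">text</note>"""


class AttributeCollector(xml.sax.ContentHandler):
    """Record attribute values exactly as the SAX parser reports them."""

    def __init__(self):
        super().__init__()
        self.seen = {}

    def startElement(self, name, attrs):
        # By the time this callback fires, the parser has already
        # replaced the references with their expansions.
        for key in attrs.getNames():
            self.seen[key] = attrs.getValue(key)


def reported_attributes(document: bytes) -> dict:
    """Parse the document and return the attribute values the handler saw."""
    handler = AttributeCollector()
    xml.sax.parseString(document, handler)
    return handler.seen


if __name__ == "__main__":
    print(reported_attributes(SOURCE))
```

The handler only ever receives the expanded values, so an application that writes the document back out has no way to know that the source said "&org;" or "&#169;" instead of the literal text.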