Re: Pure syntax vs the Infoset permathread (was Re:
mc@x... (Mike Champion) writes:
>But it further strengthens the argument that essentially nobody except
>Simon :-) and the proverbial desperate Perl hacker actually works with
>XML at the pure syntax level.

If only it were so simple. There is a large set of problems that the Infoset and even the PSVI do very poorly at expressing, though I think in many ways it goes back to bad layering (more precisely no layering) in the XML 1.0 specification.

The problems largely have to do with information that comes from the DOCTYPE declaration (or other annotative source) and is inserted into the document between the reading of the bytes and the presentation to the application of the infosettish-API.

Many applications, if asked to round-trip an XML document, save out an infosettish document rather than the original. The DOCTYPE is gone, entities are flattened into the text, default attributes are presented there, etc. If any of that information mattered to you, you're stuck.

This does happen pretty easily. Recently, I accidentally overwrote a book.xml file which had referenced huge volumes of chapter files. That was a serious mess, but fortunately I still had the original in a zip. (Of course, if it didn't read external resources and then saved it out without the DOCTYPE, it might even be worse, but I haven't seen that case much.)

Entities are probably the case where staying close to the syntax matters. I may well not want my special characters as numeric character references or straight Unicode text. In the case of books with chapters, I may want to retain the ability to edit chapters without digging into the whole @#X! book file. We do have some nifty tools, notably catalogs, which simplify dealing with these things, but they're not much good when the DOCTYPE's just plain stripped.

Default attributes are less of a problem, though I have heard of people who change document processing context (different kinds of editors, for instance) using different DOCTYPE declarations. If it weren't that DOCTYPE-sniffing has become such a common part of browsers I might write this off as an odd approach (stylesheets seem more appropriate), but there's something there.

A lot of the people dealing with these problems are users of software, not programmers, and I worry that a lot of them are just giving up. "How I Learned to Stop Worrying and Love the €" or something like that.

I wrote a piece on some of this a long time ago:
http://simonstl.com/articles/layering/layered.htm

I'm only just now getting to implementation, unfortunately.

--
Simon St.Laurent
Ring around the content, a pocket full of brackets
Errors, errors, all fall down!
http://simonstl.com -- http://monasticxml.org
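
The round-trip loss described in the message can be reproduced in a few lines. What follows is only a minimal sketch, not anything from the original post: it assumes Python's standard xml.etree.ElementTree as a stand-in for an "infosettish" API, and the sample document, the entity name "pub", and its replacement text are all made up for illustration. It covers only the DOCTYPE and internal entity cases, not default attributes or external chapter files.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document: an internal DTD subset declares one entity,
# and the content refers to it as &pub;.
SOURCE = """<?xml version="1.0"?>
<!DOCTYPE book [
  <!ENTITY pub "Example Press">
]>
<book><title>Layers &pub;</title></book>
"""

# Parse into the tree model, then serialize straight back out.
root = ET.fromstring(SOURCE)
round_tripped = ET.tostring(root, encoding="unicode")

print(round_tripped)

# What survived the round trip and what did not.
doctype_kept = "<!DOCTYPE" in round_tripped
entity_ref_kept = "&pub;" in round_tripped
entity_text_inlined = "Example Press" in round_tripped
print(doctype_kept, entity_ref_kept, entity_text_inlined)
```

With this parser the serialized result carries the entity's replacement text inline and no DOCTYPE at all, so the &pub; reference cannot be recovered from the saved document.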