Abstract Equivalence (was Microsoft FUD on binary XML...)
Alaric B Snell wrote:

> There are several data models (and, hence, equivalence tests) in circulation
> for XML - but every application uses at least one of them... Anyone using XSLT
> will be looking at the XPath tree model; likewise, CSS users will be looking
> at something similar. Application developers will be looking at DOM or SAX or
> something else.
>
> So yes, what the XML encodes is the application's problem - but without
> applications to read it, or at least the possibility of applications to read
> it, a bit of XML is just a string of bits, with no meaning or purpose...
>
> Note that I'd include being printed out on paper and read by a human as an
> application. The 'human' data model of XML will, at heart, be a woolly mix of
> SAX and DOM since one will probably look at a small snippet of a few nested
> elements on a few lines as a single tree-structure unit, but will otherwise
> read the file from top to bottom ;-)
>
> I mean, a string of bits is an encoding, although to decide 'of what' you'd
> need to (in general) hunt down where it came from and ask, unless it was an
> encoding you happened to recognise.

Having officially embraced Namespaces and the Infoset, XML years ago forfeited its defenses against the 'abstract syntax' arguments of the ASN.1 partisans. The only meaningful way to draw a line separating XML from the abstract syntax camp is to insist that--at the level of the parseable document--there is no equivalence except character-by-character lexical equivalence. That is a fundamental distinction, and one worth insisting upon for the extraordinarily useful consequences it delivers.

Now, granted, even the simplest of compliant XML 1.0 parsers will make no distinction between single and double quotes delimiting attribute values, nor between various manifestations of whitespace, but accepting such abstract equivalence is the price of compromise necessary to have an agreed XML 1.0 Recommendation implementable in a parser.
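To make the quoting and whitespace case concrete, here is a minimal sketch in Python. The two documents and the `parsed_view` helper are hypothetical, invented for illustration, and the standard library's ElementTree stands in for "a compliant parser"; nothing here is specific to any tool mentioned in this thread.

```python
import xml.etree.ElementTree as ET

# Two hypothetical instances that differ only in attribute quoting
# and in whitespace inside the start tag.
doc_a = '<line n="1" type=\'verse\'>arma virumque cano</line>'
doc_b = "<line   n='1'\n      type=\"verse\" >arma virumque cano</line>"

def parsed_view(text):
    """Return what the parser reports: name, attributes, character data."""
    root = ET.fromstring(text)
    return (root.tag, dict(root.attrib), root.text)

# Character-by-character comparison of the instance texts.
lexically_equal = doc_a == doc_b

# Comparison of what the parser hands to the application.
parser_equal = parsed_view(doc_a) == parsed_view(doc_b)

print("lexically equal:", lexically_equal)
print("equal as parsed:", parser_equal)
```

The first comparison sees two different texts; the second cannot tell them apart, because the choice of quote character and the whitespace between attributes are never reported to the application. That is the enumerated exception in action.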
Choose stricter premises and you will find it necessary to implement something like Simon St.Laurent's Ripper. However, for the benefits of a global XML community and the bounty of general-purpose tools it produces, many of us have accepted those particular assertions of abstract equivalence which are incorporated in the XML 1.0 Recommendation. We accept them as specific, enumerated exceptions to an otherwise prevailing rule, and as identified exceptions they in fact confirm the existence of that rule in their absence.

Namespaces, even if we accept only the simplest argument for them as a mechanism of disambiguation, necessarily imply that there is a level of abstract equivalence at which lexically identical GIs must be disambiguated as belonging to different (abstract) vocabularies. That implies the converse: lexically distinct GIs may in fact be understood as equivalent once it is accepted that they are separate manifestations, within different namespaces, of a single abstraction. If we embrace namespaces, this abstract equivalence becomes a fundamental rule which displaces the very different premise of XML 1.0.

Likewise the Infoset, however light an abstraction of the instance syntax its original proponents insisted on, introduces abstraction as the general rule of which lexically variant instances are equivalent manifestations. Because it reverses the fundamental assumption of a lexically grounded XML, that abstraction is a slippery slope on which there is no meaningful point at which to draw a distinction between XML and any platonic abstract syntax such as ASN.1 and the like.

I have made these arguments here before, for a number of years now. I apologize if my emphasis on these points becomes tiresome, but I think that the struggle of 50 years or more that classical philology required in the 20th century to understand the nature of oral poetry demonstrates why the physical, rather than any abstract, nature of a text is worth insisting upon.
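The namespace point made above can be sketched the same way. The `example.org` URIs, the three documents and the `expanded_name` helper are all hypothetical; Python's ElementTree is used only as one example of a namespace-aware processor, which reports each element name as `{namespace-URI}local-name`.

```python
import xml.etree.ElementTree as ET

# Hypothetical instances: doc_a and doc_b use different prefixes for the
# same namespace; doc_a and doc_c use the same prefix for different ones.
doc_a = '<p:title xmlns:p="http://example.org/poem">Iliad</p:title>'
doc_b = '<q:title xmlns:q="http://example.org/poem">Iliad</q:title>'
doc_c = '<p:title xmlns:p="http://example.org/book">Iliad</p:title>'

def expanded_name(text):
    """Return the element name as a namespace-aware parser reports it."""
    return ET.fromstring(text).tag

# Lexically distinct GIs (p:title, q:title) treated as one abstraction.
distinct_gis_equated = expanded_name(doc_a) == expanded_name(doc_b)

# Lexically identical GIs (p:title, p:title) treated as different names.
identical_gis_separated = expanded_name(doc_a) != expanded_name(doc_c)

print(expanded_name(doc_a))
print("p:title and q:title equated:", distinct_gis_equated)
print("p:title and p:title separated:", identical_gis_separated)
```

Both results follow from one rule: under namespaces the name that counts is the abstract pair of URI and local name, and the GI as written in the instance is only one manifestation of it.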
Texts can and do 'encode' physical properties, among them rhythm, scansion and various devices of assonance. Once encoded in an instance text these qualities are inherent: they require neither markup nor other metadata to impute them, nor can they be removed from the text by instructions of markup. Beginning from the concrete instance gives us these properties in an unambiguous and clearly perceived form. On the other hand, to begin from an abstraction of syntax is to forfeit a concrete means of conveying those properties and to be forced to rely on metadata--that is, metadata to the abstraction!--if they are to be communicated at all. And of course any recipient of that metadata is free to ignore it in realizing some physical instantiation from the text.

For those cases I must care most about, ASN.1 and abstract syntax generally are incapable of a precise and unambiguous encoding of inherent fundamental textual properties without resorting to a priori agreements between the creator and the consumer of a document, and from the very nature of document processing such agreements are unreliable and negligible.

This is the fundamental distinction of document and data to which all permathreads return, but on which I think the recent championing of ASN.1 on xml-dev gives us a useful new perspective. Can't we now assert that what is fundamentally data is that of which the most salient properties are abstract? That is, different lexical manifestations are understood by both their creator and their consumer to be secondary to some abstract underlying platonic reality and, conversely, the physical qualities which might be inherent in a particular lexical manifestation are understood by both creator and consumer to be spurious and negligible. The content of documents, on the other hand, most specifically includes, often as the chief concern, those characteristics which come with the lexical manifestation and cannot be purged from the physical realization.
Respectfully, Walter Perry