Re: SAX LexicalHandler::comment issue

"Simon St.Laurent" wrote:

> Where exactly in XML 1.0 is the distinction between logical and lexical
> information drawn? I don't believe it really is, except as an unfortunate
> side-effect of describing parsing in the same document which describes
> syntax.
>
> I can't say I trust anyone who talks about _the_ logical view of an XML
> document - I don't believe any such thing exists in a general way. At
> best, there may be some consensus among data-oriented folks, but I don't
> believe there is any general consensus about what always matters and what
> always doesn't.
>
> Is XML what goes into a parser or what comes out? I used to argue for
> blurring those two, but I'm leaning more and more toward XML being the
> input, not the result of parsing.

This protean 'interpretability' is the salient characteristic of text. The most concrete realization which XML provides--the document instance--nonetheless remains highly abstracted from each separate rendition, performance, or other processing which that document instance might be given. Simon is exactly right in understanding XML as the input to--not a particular product of--a process.

Anyone who would like a mind-bending view--with contemporary legal, political and technical relevance--of the chasm between text and its interpretation might look at http://www.utm.edu/research/primes/curios/485...443.html. Much of U.S. intellectual property law turns on the distinction between the expression (i.e., the instance text) which traditionally may be protected and an underlying concept or general principle which may not. In the celebrated (infamous?) DeCSS case, it turns out that a gzipped version of the C source code for decrypting CSS can be expressed in hex as a 1401-digit prime number. The website cited posts that number as a curiosity among primes, though of course the same instance text can clearly be read as something very different.

These differences between text and some processed logical view of it are not a new discovery. Nor is this a vague question of ontological form, like the parable of the moon in the water. The textual instance as surface syntax carries possibilities which even the most generous logical models will miss precisely because they refuse, a priori, to give primacy to the idiosyncratic text.

A century and a half of learned philology collapsed with Milman Parry's insight in the 1920's that Homeric metaphors and epithets were primarily devices for fitting necessary nouns and the names of heroes, in the grammatical cases required, into the various strictures of the metrical feet. The vast edifice of scholarship in the aesthetics of epic poetry has had to be rebuilt on obstinate facts of syntax which had conveyed nothing to the most literate critics, but turned out to be the crucial tool for learning the craft of reciting poetry in a pre-literate society.

The larger point, as Simon notes, is that there is no comprehensive logical view of the information conveyed by specific syntax. XML is predicated on the correct choice, the primacy of syntax. Subsequent decisions to build upon any narrower foundation seem both arrogant and pointless.

Respectfully,

Walter Perry