Re: Xml is _not_ self describing
Well said Leigh.

Jonathan

> > -----Original Message-----
> > From: Bullard, Claude L (Len) [mailto:clbullar@i...]
> > Sent: 15 January 2002 20:46
> > To: 'Elliotte Rusty Harold'; 'xml-dev@l...'
> > Subject: RE: Xml is _not_ self describing
> >
> >
> > I can't wait to see the XML.COM condensed
> > version of this thread. :-)
>
> And me, because hopefully then I can read it and
> understand what the real issue is.
>
> (...pauses...)
>
> D'oh!
>
> --
>
> Seriously though, I gave a talk recently, introducing Markup
> and XML to some Medical Informatics students. I outlined the
> overheads of writing custom parsers for custom formats;
> suggested that providing additional rules to structure
> data formats could improve the situation; then explained
> why CSV is fragile and limited; and then introduced
> labelled formats as the best solution.
>
> I also made it clear that introducing grammatical rules
> such as labelling doesn't necessarily say anything about
> the meaning of the data following those rules
> (cf: Edward Lear). That's for a higher layer.
>
> They seemed to accept the benefits of this, and
> understood where the limitations were.
>
> So aside from the philosophy (interesting as it is) it seems to me
> there's a fairly simple message to get across. Is there any real
> evidence that there's been a failure to communicate it, beyond
> the existing marketing-technology disconnects?
>
> Personally I'm not sure I've seen it. Most developers I've worked
> with just approach XML as syntax, and don't expect a whole
> lot more.
>
> Cheers,
>
> L.
>
> --
> Leigh Dodds, Research Group, Ingenta | "Pluralitas non est ponenda
> http://weblogs.userland.com/eclectic | sine necessitate"
> http://www.xml.com/pub/xmldeviant    | -- William of Ockham
>
> >
> > Is it there? We can split some fine hairs here, but
> > often meaning has to be discovered from clues found
> > elsewhere and then projected onto the text.
> > Worse, the translations into an understanding readily shared
> > can vary enormously such that any such original meaning
> > is distorted or not provable as original until some
> > acceptable number of texts are translated. There are
> > linear markings from the Mystery Hill site (American
> > Stonehenge) which some claim are Phoenician but are
> > hotly contested otherwise. Before accepted, both
> > the decipherers and the archaeologists have to
> > find mutually reinforcing but quite separate
> > evidence (previous examples of the text types and
> > artifacts attributable to some past civilization).
> >
> > It may not be random but be meaningless: see the
> > problems of assuming some astronomical signals
> > were meaningful because they were regular (rotating
> > and emitting). Non-randomness isn't meaningful
> > per se. One can assume that a wedge-shaped tablet
> > found in a collection of such is meaningful if other evidence
> > indicates the site is a library, then start building
> > up example sets until the key is discovered or a
> > dictionary is created that is self-consistent to a
> > tolerable degree. Otherwise, a Rosetta Stone is
> > required.
> >
> > So it isn't that cut and dry. As I said in my
> > reply to Mike, you can be looking for math only
> > to discover belatedly, possibly by accident, that
> > they were just saying Hi: Cheops Slept Here. Once
> > you know about star alignments, some aspects of
> > pyramid layouts make sense. Unfortunately,
> > so do Stonehenge, Mystery Hill and a myriad
> > of other sites - but it can't be proved and
> > may not be true in each or every case.
> >
> > "Documents written in natural languages have meaning even if you don't
> > speak those languages. They do carry information."
> >
> > That is so but until you learn them or someone who has tells you,
> > you don't know what they mean. We are quite close to the
> > "if the tree falls in the forest.." argument.
> > The best I can do is say, yes it has meaning to someone and yes, strictly
> > speaking, by establishing that the non-randomness is purposeful, not
> > a side effect of another regular process, we can agree there
> > is information there. Shannon built modern communications
> > by saying reproducibility, not semantics, is the key to
> > designing communication systems.
> >
> > That said, we of course agree about the value of tagging regardless
> > of whether we have the descriptions. XML is self-describing to
> > the extent one understands the Rosetta Stone that is the
> > XML 1.0 specification, then acquires, by some evidence, a
> > workable set of descriptions for the tag names. Doctor Goldfarb
> > often points to glossing as the original modern form of hypertext
> > and markup.
> >
> > All other things being equal, given some XML instance, I sure
> > do prefer a well-documented schema or DTD to reading someone
> > else's code to discover what I am supposed to expect and
> > what to do about it. Or just Hide The XML and give me
> > the stinkin' compiled application to install.
> >
> > len
> >
> > -----Original Message-----
> > From: Elliotte Rusty Harold [mailto:elharo@m...]
> >
> > At 12:17 PM -0600 1/15/02, Bullard, Claude L (Len) wrote:
> >
> > >A label is not a name unless it is meaningful.
> > >Natural language is not self-describing unless
> > >you were taught it.
> >
> > I guess it depends on what exactly you mean by "self-describing". I
> > think a book about the English language written in English is
> > self-describing in and of itself, whether anybody speaks English or
> > not. However, leaving that aside there's a deeper assumption I want
> > to cut off before it becomes too embedded in the debate.
> >
> > Documents written in natural languages have meaning even if you don't
> > speak those languages. They do carry information. They are not random
> > strings of characters.
> > I've been reading a lot about the theory and
> > history of cryptography lately, and it's amazing just how much
> > information you can pull out of ciphered text, because, in fact, it
> > isn't random. It's harder to read ciphered text than unciphered text,
> > but it's not impossible. And that's a world of difference.
> >
> > Reading text in a language you don't speak, but which has not been
> > deliberately encrypted, is a similar problem; and in fact some of the
> > same techniques were applied to languages like Linear B and
> > hieroglyphics that are used to break ciphers.
> >
> > When a document is marked up, the information of the markup is there,
> > whether we recognize it or not. It is a property of the text itself,
> > not a property of our perception of the text. With appropriate work,
> > experience, intelligence, and luck that markup can be understood. Can
> > unmarked up text be understood as well? Yes, certainly; but markup
> > adds to the information content of the text. It makes it easier to
> > decipher its meaning in a very practically useful way. This is a
> > question of degree, and text+markup is easier to understand than text
> > alone.
> >
> > Language is certainly important, but it is an orthogonal issue. Given
> > the choice of data marked up in Ugaritic vs. the same data marked up
> > in English, I pick English. But given the choice of data marked up in
> > Ugaritic vs. the same data not marked up at all, I pick the data
> > marked up in Ugaritic.
> >
> > -----------------------------------------------------------------
> > The xml-dev list is sponsored by XML.org <http://www.xml.org>, an
> > initiative of OASIS <http://www.oasis-open.org>
> >
> > The list archives are at http://lists.xml.org/archives/xml-dev/
> >
> > To subscribe or unsubscribe from this list use the subscription
> > manager: <http://lists.xml.org/ob/adm.pl>
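
For anyone who wants the CSV-versus-labelled-format point from Leigh's talk in concrete form, here is a minimal sketch in Python. The patient record, the field names and the two helper functions are all hypothetical, invented purely for illustration; nothing here comes from the talk itself. It only tries to show that a reader keyed on position breaks when a column is inserted, while a reader keyed on a label does not.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical record, first as positional CSV and then as labelled XML.
CSV_V1 = "name,age,weight_kg\nA. Patient,42,70\n"
# A later producer inserts a "ward" column; every later position shifts.
CSV_V2 = "name,ward,age,weight_kg\nA. Patient,7,42,70\n"

XML_V1 = (
    "<patient><name>A. Patient</name><age>42</age>"
    "<weight_kg>70</weight_kg></patient>"
)
XML_V2 = (
    "<patient><name>A. Patient</name><ward>7</ward><age>42</age>"
    "<weight_kg>70</weight_kg></patient>"
)


def age_by_position(text):
    """Read the age as 'second field of the first data row'."""
    rows = list(csv.reader(io.StringIO(text)))
    return rows[1][1]


def age_by_label(text):
    """Read the age as 'text content of the <age> child element'."""
    return ET.fromstring(text).findtext("age")


if __name__ == "__main__":
    print(age_by_position(CSV_V1), age_by_position(CSV_V2))
    print(age_by_label(XML_V1), age_by_label(XML_V2))
```

The positional reader silently returns the ward number once the column is inserted; the labelled reader still finds the age. Note that neither version says whether "age" is in years or months, or what a ward is. That still has to come from a schema, documentation, or some other higher layer, which is the limitation the thread keeps coming back to.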