
Re: Xml is _not_ self describing


Well said Leigh.

Jonathan

> 
> 
> > -----Original Message-----
> > From: Bullard, Claude L (Len) [mailto:clbullar@i...]
> > Sent: 15 January 2002 20:46
> > To: 'Elliotte Rusty Harold'; 'xml-dev@l...'
> > Subject: RE:  Xml is _not_ self describing
> > 
> > 
> > I can't wait to see the XML.COM condensed 
> > version of this thread. :-)
> 
> And me, because hopefully then I can read it and 
> understand what the real issue is.
> 
> (...pauses...)
> 
> D'oh!
> 
> --
> 
> Seriously though, I gave a talk recently, introducing Markup 
> and XML to some Medical Informatics students. I outlined the 
> overheads of writing custom parsers for custom formats; 
> suggested that providing additional rules to structure 
> data formats could improve the situation; then explained 
> why CSV is fragile and limited; and then introduced 
> labelled formats as the best solution.
> 
> I also made it clear that introducing grammatical rules 
> such as labelling doesn't necessarily say anything about 
> the meaning of the data following those rules 
> (cf: Edward Lear). That's for a higher layer.
> 
> They seemed to accept the benefits of this, and 
> understood where the limitations were. 
> 
> So aside from the philosophy (interesting as it is) it seems to me 
> there's a fairly simple message to get across. Is there any real 
> evidence that there's been a failure to communicate it, beyond 
> the existing marketing-technology disconnects?
> 
> Personally I'm not sure I've seen it. Most developers I've worked 
> with just approach XML as syntax, and don't expect a whole 
> lot more.
> 
> Cheers,
> 
> L.
> 
> -- 
> Leigh Dodds, Research Group, Ingenta | "Pluralitas non est ponenda
> http://weblogs.userland.com/eclectic |    sine necessitate"
> http://www.xml.com/pub/xmldeviant    |     -- William of Ockham
> 
> 
> > 
> > Is it there?  We can split some fine hairs here, but 
> > often meaning has to be discovered from clues found 
> > elsewhere and then projected onto the text.  Worse, 
> > the translations into an understanding readily shared 
> > can vary enormously such that any such original meaning 
> > is distorted or not provable as original until some 
> > acceptable number of texts are translated.  There are 
> > linear markings from the Mystery Hill site (American 
> > Stonehenge) which some claim are Phoenician but are 
> > hotly contested otherwise.  Before accepted, both 
> > the decipherers and the archaeologists have to 
> > find mutually reinforcing but quite separate 
> > evidence (previous examples of the text types and 
> > artifacts attributable to some past civilization). 
> > 
> > It may not be random but be meaningless:  see the 
> > problems of assuming some astronomical signals 
> > were meaningful because they were regular (rotating 
> > and emitting).  Non-randomness isn't meaningful 
> > per se.  One can assume that a wedge-shaped tablet 
> > found in a collection of such is meaningful if other evidence 
> > indicates the site is a library, then start building 
> > up example sets until the key is discovered or a 
> > dictionary is created that is self-consistent to a 
> > tolerable degree.  Otherwise, a Rosetta Stone is 
> > required.
> > 
> > So it isn't that cut and dry.  As I said in my 
> > reply to Mike, you can be looking for math only 
> > to discover belatedly, possibly by accident, that 
> > they were just saying Hi: Cheops Slept Here.  Once 
> > you know about star alignments, some aspects of 
> > pyramid layouts make sense.  Unfortunately, 
> > so does Stonehenge, Mystery Hill and a myriad 
> > of other sites - but it can't be proved and 
> > may not be true in each or every case.
> > 
> > "Documents written in natural languages have meaning even if you don't 
> > speak those languages. They do carry information."
> > 
> > That is so but until you learn them or someone who has tells you, 
> > you don't know what they mean.  We are quite close to the 
> > "if the tree falls in the forest.." argument.  The best I can 
> > do is say, yes it has meaning to someone and yes, strictly 
> > speaking, by establishing that the non-randomness is purposeful, not 
> > a side effect of another regular process, we can agree there 
> > is information there.  Shannon built modern communications 
> > by saying reproducibility, not semantics, is the key to 
> > designing communication systems.
> > 
> > That said, we of course agree about the value of tagging regardless 
> > of whether we have the descriptions.  XML is self-describing to 
> > the extent one understands the Rosetta Stone that is the 
> > XML 1.0 specification, then acquires by some evidence, a 
> > workable set of descriptions for the tag names.  Doctor Goldfarb 
> > often points to glossing as the original modern form of hypertext
> > and markup.
> > 
> > All other things being equal, given some XML instance, I sure 
> > do prefer a well-documented schema or DTD to reading someone 
> > else's code to discover what I am supposed to expect and 
> > what to do about it.  Or just Hide The XML and give me 
> > the stinkin' compiled application to install.
> > 
> > len
> > 
> > -----Original Message-----
> > From: Elliotte Rusty Harold [mailto:elharo@m...]
> > 
> > At 12:17 PM -0600 1/15/02, Bullard, Claude L (Len) wrote:
> > 
> > >A label is not a name unless it is meaningful.
> > >Natural language is not self-describing unless
> > >you were taught it.
> > 
> > I guess it depends on what exactly you mean by "self-describing". I 
> > think a book about the English language written in English is 
> > self-describing in and of itself, whether anybody speaks English or 
> > not. However, leaving that aside there's a deeper assumption I want 
> > to cut off before it becomes too embedded in the debate.
> > 
> > Documents written in natural languages have meaning even if you don't 
> > speak those languages. They do carry information. They are not random 
> > strings of characters. I've been reading a lot about the theory and 
> > history of cryptography  lately, and it's amazing just how much 
> > information you can pull out of ciphered text, because, in fact it 
> > isn't random. It's harder to read ciphered text than unciphered text, 
> > but it's not impossible. And that's a world of difference.
> > 
> > Reading text in a language you don't speak, but which has not been 
> > deliberately encrypted, is a similar problem; and in fact some of the 
> > same techniques were applied to languages like Linear B and 
> > hieroglyphics that are used to break ciphers.
> > 
> > When a document is marked up, the information of the markup is there, 
> > whether we recognize it or not. It is a property of the text itself, 
> > not a property of our perception of the text. With appropriate work, 
> > experience, intelligence, and luck that markup can be understood. Can 
> > unmarked up text be understood as well? Yes, certainly; but markup 
> > adds to the information content of the text. It makes it easier to 
> > decipher its meaning in a very practically useful way. This is a 
> > question of degree, and text+markup is easier to understand than text 
> > alone.
> > 
> > Language is certainly important, but it is an orthogonal issue.  Given 
> > the choice of data marked up in Ugaritic vs. the same data marked up 
> > in English, I pick English. But given the choice of data marked up in 
> > Ugaritic vs. the same data not marked up at all, I pick the data 
> > marked up in Ugaritic.
> > 
> > -----------------------------------------------------------------
> > The xml-dev list is sponsored by XML.org <http://www.xml.org>, an
> > initiative of OASIS <http://www.oasis-open.org>
> > 
> > The list archives are at http://lists.xml.org/archives/xml-dev/
> > 
> > To subscribe or unsubscribe from this list use the subscription
> > manager: <http://lists.xml.org/ob/adm.pl>
> > 
> 
> 
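
Leigh's point about fragile CSV versus labelled formats, and about labels not supplying meaning, can be sketched in a few lines of Python. This is an illustration only, not code from the thread: the record layout, the element names (obs, name, pulse, systolic) and both helper functions are assumptions invented for the example.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical observation records; all field names are made up.
CSV_V1 = "Smith,72,120\n"   # producer's original order: name, pulse, systolic
CSV_V2 = "Smith,120,72\n"   # same data after the producer reordered columns

XML_V1 = "<obs><name>Smith</name><pulse>72</pulse><systolic>120</systolic></obs>"
XML_V2 = "<obs><name>Smith</name><systolic>120</systolic><pulse>72</pulse></obs>"


def pulse_from_csv(text):
    """Read the pulse by position; the consumer must know it is column 1."""
    row = next(csv.reader(io.StringIO(text)))
    return int(row[1])


def pulse_from_xml(text):
    """Read the pulse by label; element order does not matter."""
    return int(ET.fromstring(text).findtext("pulse"))


# The positional reader silently returns the wrong field after the reorder,
# while the labelled reader returns the same value for both layouts.
csv_results = (pulse_from_csv(CSV_V1), pulse_from_csv(CSV_V2))
xml_results = (pulse_from_xml(XML_V1), pulse_from_xml(XML_V2))

# Labelling is still only syntax: a nonsense label parses just as well,
# and nothing in the document says what it means.
nonsense = ET.fromstring("<runcible>spoon</runcible>")
```

The labelled reader survives the reordering, but it still depends on the consumer knowing, from somewhere outside the document, what "pulse" is supposed to mean. That agreement belongs to the higher layer Leigh mentions.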

