
  • From: David <dlee@c...>
  • To: xml-dev@l...
  • Date: Wed, 16 Jun 2010 07:49:09 -0400


> characters. Only after that comes XML with its opening tags, closing
> tags, attributes, and namespaces. But that is already relatively easy,
> because at this point, you should see text that you can more or less
> understand.
>

I suggest this is just the basement of the tower of Babel.  Once you
have recognized a string of characters, you still have to decode the
original document.
XML is typically (usually?) not the format humans consume, or even
close to it.   Consider an Open Office, Word, or PDF document
in XML format.
Try to reconstruct the original document layout or meaning from the XML
without the necessary software processors.

Or take a representation of a database in XML.  For relevance (as
something whose real meaning someone may actually want to read), consider
a government database of the history of land ownership records, or say
even your health records from 80 years ago.
Parsing the XML and turning it into something *usable and
comprehensible* is a daunting task, potentially nearly impossible
without the actual software that was used to create/read the original
document.   This data may be relational (in the loose or strict sense) and
essentially unusable without understanding things like foreign keys,
ordering, implicit enumerations  (does a "125A64" mean you bought or
sold a property at that date?)  How about an ICD9 code?
There's often a lot more you need than just the string of bytes to
actually make data usable.
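
To make the point concrete, here is a minimal sketch in Python. Everything in it is invented for illustration: the <transfer> element, its attribute names, the parcel key "p7", and the idea that "125A64" means "sold" are all assumptions, not any real land-registry format. It only shows that parsing gives you strings, while the meaning lives in lookup tables that the XML itself does not carry.

```python
import xml.etree.ElementTree as ET

# Hypothetical archived record; element and attribute names are made up.
RECORD = """<transfers>
  <transfer id="t1" parcel="p7" date="1931-04-02" code="125A64"/>
</transfers>"""

# Out-of-band knowledge that (by assumption) lived only in the original
# application: an implicit enumeration and a foreign-key target table.
TRANSACTION_CODES = {"125A64": "sold"}
PARCELS = {"p7": "Lot 12, Elm Street"}


def parse_raw(xml_text):
    """Parse the XML and return each transfer as a plain dict of strings."""
    root = ET.fromstring(xml_text)
    return [dict(t.attrib) for t in root.iter("transfer")]


def interpret(row, codes, parcels):
    """Resolve the opaque code and the parcel key, if the tables know them."""
    return {
        "date": row["date"],
        "action": codes.get(row["code"], "unknown"),
        "parcel": parcels.get(row["parcel"], "unknown"),
    }


rows = parse_raw(RECORD)
print(rows[0])                                         # well-formed, but opaque
print(interpret(rows[0], TRANSACTION_CODES, PARCELS))  # with the side tables
print(interpret(rows[0], {}, {}))                      # side tables lost
```

With the two tables the record reads as a sale of a named parcel; with empty tables the same well-formed XML yields only "unknown" for both fields.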

This is the thinking/strategy many archivists are taking ... that it's
not enough to just reconstruct the bytes.   You also need to reconstruct
the software in which those bytes were relevant/interpreted, as well as all
the 'out of band' data that they reference or that is implied.

-David

( Can't wait for the conference !)



-- 
-------------------------
David A. Lee
dlee@c...
http://www.calldei.com
http://www.xmlsh.org




