Resolving XML's Unicode Incompatibility Problem Using "Positional Tagging"
I've simplified my response to Noah's question about how to resolve the XML Unicode incompatibility problem by transforming XML docs to grid-based files using Unicode 5.0 characters via the code2000 font.

See http://nhds.com/org_chart_xml2ss.jpg for a revised spreadsheet containing the recursive hierarchies he requested. In it, the hierarchies are reflected in the row position of the "header tags" cells, which appear in rows 1-4. Thus, for example, cells C1 and C2 indicate that a person whose "Manager" tag resides in row 1 is the immediate manager of each person whose "Employee" tag resides in row 2, etc. So, Bob is the manager of Mary and Frank, while Leroy is the manager of Stan, Wes and Brenden.

Note that any Unicode characters can be used in header tags and in any element and attribute value. Also note that the example above contains hyperlinks to each employee's resume, even though I could have embedded the resumes in the cells (since each cell can hold over 32K characters, which is more than 10 typewritten pages).

Other advantages of the data organization model I'm proposing include:

* Each tag appears only once, no matter how many rows of values are under it, which reduces overhead and compression/decompression time.
* The tags and values can be stored easily in any delimited text file (such as CSV), which is parsed in an instant using native spreadsheet functionality, thereby reducing processing time.
* The spreadsheet document is very human readable.
* It accommodates any type of elements, attributes and extensions.
* Mapping different tag names can be as simple as predefining header cell locations into which equivalent tags are to be placed. Thus, while one file uses "Employee" as the tag in cell A1, another may use, say, "FirstName" in A1, and the values are inherently associated by their column position.
* Querying the spreadsheet is easy and requires no additional metadata.
For example, compiling a list of all the employees under Bob can be done automatically via a recursive macro that builds a collection by finding all populated cells in each column with an "Employee" tag to the right of Bob's (i.e., columns B through H). The search ends at row 10, since row 11 "belongs" to the next employee at Bob's same hierarchical level.

I realize this is a major paradigm shift -- from tagged lists to cellular locations. I'd like to know what others think about this model.

Steve

-----Original Message-----
From: noah_mendelsohn@u... [mailto:noah_mendelsohn@u...]
Sent: Saturday, February 23, 2008 6:49 PM
To: Stephen-NHDS
Cc: Xml-Dev Listserv
Subject: Re: Re: Ten Years Later - XML 1.0 Fifth Edition?

SBeller writes:

> An elegant solution for many situations is available if we shift
> from a string-based language to a "positional-based" method. This
> solution involves transforming XML documents into grid-based files
> (such as spreadsheets), in which (a) the cells of each column are
> populated with the element or attribute values sharing the same XML
> name, (b) the columns are arranged in a manner that maintains
> hierarchies, and (c) the values in each cell in a row are associated.

I'm a bit confused about this proposal. One of XML's most valuable features is its ability to unify documents and data in the same framework. I can see how to translate a list of potential hires into a spreadsheet, as you suggest. How would I handle the XML documents that are, for example, their resumés? The use of XML for structured documents is at least as important as for data; indeed it's the combination that I think makes XML uniquely interesting.

I've never seen a spreadsheet that could do much more with documents than either extracting bits out of them, or maybe storing the text as blobs in cells. How would a collection of resumes look in this form, presuming that the resumes had variable structure and lots of text?
Certainly, spreadsheets are also a stretch for recursive hierarchies, even of data, and likewise I'm not sure how you represent the content corresponding to <xsd:choice>. If I had an XML organization chart for my company, with a format like:

    <manager name="bob">
      <employee name="mary"/>
      <manager name="sue">
        <employee name="tom"/>
      </manager>
    </manager>

how would that map to your spreadsheet? (Note that managers have a mix of employees and managers reporting to them at each level.)

Thank you.

Noah

--------------------------------------
Noah Mendelsohn
IBM Corporation
One Rogers Street
Cambridge, MA 02142
1-617-693-4036
--------------------------------------

Stephen-NHDS <sbeller@n...>
02/23/2008 11:40 AM
To: Xml-Dev Listserv <xml-dev@l...>
cc: (bcc: Noah Mendelsohn/Cambridge/IBM)
Subject: Re: Ten Years Later - XML 1.0 Fifth Edition?

It seems to me that the very nature of the text-based markup method is causing this Unicode incompatibility problem.

An elegant solution for many situations is available if we shift from a string-based language to a "positional-based" method. This solution involves transforming XML documents into grid-based files (such as spreadsheets), in which (a) the cells of each column are populated with the element or attribute values sharing the same XML name, (b) the columns are arranged in a manner that maintains hierarchies, and (c) the values in each cell in a row are associated. The resulting grid could then be queried easily and its contents formatted based on their cellular positions.

Any Unicode characters can be used in the names and values, e.g., Excel can accommodate all Unicode 5.0 characters via the code2000 font, as well as using a character's decimal code value in its macros. And the grid could be saved as a delimited text file, without the overhead of tags and tag-based parsing.

I realize this paradigm shift isn't easy for many to comprehend, but it can be done and is worth exploration, imo.
Steve

_______________________________________________________________________

XML-DEV is a publicly archived, unmoderated list hosted by OASIS
to support XML implementation and development. To minimize
spam in the archives, you must subscribe before posting.

[Un]Subscribe/change address: http://www.oasis-open.org/mlmanage/
Or unsubscribe: xml-dev-unsubscribe@l...
subscribe: xml-dev-subscribe@l...
List archive: http://lists.xml.org/archives/xml-dev/
List Guidelines: http://www.oasis-open.org/maillists/guidelines.php
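For readers following the thread, here is a rough Python sketch of how the org chart in Noah's message might be flattened into a grid and then queried by position. It is only an illustration under stated assumptions: the layout (tag in the first column, each name placed in a column equal to its depth) is a guess, not the layout of the spreadsheet linked above, and the helper names to_grid, depth_of, reports_of and to_csv are hypothetical. The query uses a simple top-to-bottom scan instead of the recursive macro described in the reply.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Noah's example org chart, copied from the thread.
ORG_XML = """
<manager name="bob">
  <employee name="mary"/>
  <manager name="sue">
    <employee name="tom"/>
  </manager>
</manager>
"""


def to_grid(root):
    """Flatten the tree into rows of [tag, level-0 cell, level-1 cell, ...].

    Assumed layout: a person's name sits in the column matching their
    depth, so hierarchy is carried by position rather than by nesting.
    """
    rows = []

    def walk(elem, depth):
        rows.append([elem.tag] + [""] * depth + [elem.get("name")])
        for child in elem:
            walk(child, depth + 1)

    walk(root, 0)
    width = max(len(row) for row in rows)
    # Pad every row to the same width so the grid is rectangular.
    return [row + [""] * (width - len(row)) for row in rows]


def depth_of(row):
    """Return the index of the populated name cell (column 0 is the tag)."""
    return next(i for i, cell in enumerate(row[1:]) if cell)


def reports_of(grid, name):
    """List everyone below `name`: the rows that follow it whose populated
    cell lies further right, stopping at the first row at the same or a
    shallower level."""
    start = next((i for i, row in enumerate(grid) if name in row[1:]), None)
    if start is None:
        return []
    level = depth_of(grid[start])
    found = []
    for row in grid[start + 1:]:
        depth = depth_of(row)
        if depth <= level:
            break
        found.append(row[1 + depth])
    return found


def to_csv(grid):
    """Serialize the grid as delimited text (CSV)."""
    buf = io.StringIO()
    csv.writer(buf).writerows(grid)
    return buf.getvalue()


if __name__ == "__main__":
    grid = to_grid(ET.fromstring(ORG_XML))
    print(to_csv(grid))
    print(reports_of(grid, "bob"))
```

In this sketch the mixed manager/employee levels that Noah asks about are handled by keeping the tag in its own column, so a row's depth and its role are recorded separately. Whether that matches the header-tag arrangement in the linked spreadsheet cannot be confirmed from the thread.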