Re: When Empty is Everything
At 16:46 04.12.2003, Dean Snyder wrote:
>I just joined this list so pardon me if this topic has been dealt with
>before, but it is something I have been pondering for some time and about
>which I would like expert feedback.
>
>I work with ancient texts in multiple languages, including cuneiform
>tablets, inscriptions, parchment and papyri manuscripts.

There's a mailing list that is largely devoted to working with XML representations of biblical texts, and some of these people are doing similar things. You can subscribe to it here:

http://lists.ibiblio.org/mailman/listinfo/biblical-languages

>Converting these
>texts to XML form presents messy problems because they exhibit rampantly
>overlapping hierarchies:
>
>* single graphemes split across line boundaries;
>
>* character effacements occurring randomly in the texts, across lines,
>cases, columns, and facets;
>
>* discontiguous parsing information;
>
>And on and on.

Patrick Durusau and Matt O'Donnell have been working on this problem for years. Also, Michael Sperberg-McQueen has been working on this using a different approach.

>My question is why not just use empty tags for everything? And if that
>works why have non-empty tags at all? (I'm aware of the argument for XML
>parsing simplicity.)

Using these empty tags as milestones, I presume, and using a container element for each document to make the XML tools happy? That can be done, but it does increase the programming effort and slow down processing because you no longer have one "thing" that represents each basic unit.

The basic problem is this: almost every XML tool is very good at querying or manipulating elements or the content of elements, but many XML tools aren't as good at performing primitive operations on regions between tags. If that's the basic representation of your data, you might actually be more interested in a tool based on some kind of region algebra rather than normal XML...but I don't know what tools to recommend for this.
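To make the difference concrete, here is a minimal sketch in Python using the standard library's xml.etree.ElementTree. The element names (`line`, `lb`, `w`) and the sample text are made up for illustration and are not taken from any real encoding scheme. With container elements, one query returns a line; with milestones, the code has to walk the siblings and collect the text that sits between the empty tags.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: the same two lines of text, encoded two ways.
CONTAINER = "<text><line n='1'>abc def</line><line n='2'>ghi</line></text>"
# <lb/> marks where a line begins; <w/> is some other, unrelated milestone.
MILESTONE = "<text><lb n='1'/>abc <w/>def<lb n='2'/>ghi</text>"


def line_from_container(xml, n):
    """The line is one element, so a single path query finds it."""
    root = ET.fromstring(xml)
    line = root.find(f"line[@n='{n}']")
    return "".join(line.itertext())


def line_from_milestones(xml, n):
    """The line is a region between tags, so we rebuild it by hand.

    Assumes every milestone is a direct child of the root element;
    nested markup would need a full document-order traversal instead.
    """
    root = ET.fromstring(xml)
    parts = []
    inside = False
    for child in root:
        if child.tag == "lb":
            # Each <lb/> closes the previous line region and opens a new one.
            inside = child.get("n") == str(n)
        if inside and child.tail:
            # Text following an empty tag is stored as that tag's tail.
            parts.append(child.tail)
    return "".join(parts)


if __name__ == "__main__":
    print(line_from_container(CONTAINER, 1))
    print(line_from_milestones(MILESTONE, 1))
```

Both functions return the same string for line 1, but the milestone version carries its own state machine, and every additional kind of region (words, columns, damaged spans) would need similar hand-written logic.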
For many applications, a mixture of milestones and "real" elements is very useful, and it is often useful to be able to generate multiple representations of the same data.

Hope this helps,

Jonathan