Re: XML-DEV JEWELS (was : XML-DEV on Groves)
On 12 Feb 2000, Thierry Bezecourt wrote:

> From what I have learned [...] groves are about hierarchical data
> structures and addressing nodes in these structures, so they seem
> ideal for a mailing list archive.

Having tackled the problem before in various lo-tech ways, I haven't found much hierarchic structure that was useful, as opposed to merely organizational. Unlike usenet posts, mail messages lack a References: header (the In-Reply-To:, when it's there, is much too bogotically variable) so you don't get the benefit of threading.

[However, it might be a basis for a collaborative effort, where the grove is "grown" over time with feedback on threading links - say a forms-based adjunct to a Hypermail/MHonArc-style interface, driven by a grove-aware engine doing smarter things than just spitting out the contents of an overview.fmt database.]

> To do that, if I'm correct, we would have to define a property set
> for mailing lists, where articles would be nodes, header fields
> would be properties of these nodes, the "References" header field
> would be used for links to other articles, and the "contents"
> would be the body of the article, which could contain links.

My limited understanding of groves tells me that the key is the grove plan - which basically determines the amount of analytic granularity one wants or needs to work with. (E.g. an article would be a node, but how "high" or "low" in the hierarchy?) Maximum flexibility needs an exhaustive/detailed property set as the basis.

> It does not seem very difficult.

Well, it has been my experience that reliably extracting fine-grained material from mail messages is very difficult. (Just think of the variety of quoting habits/conventions.) For comparison, look at:

(1) the monthly aggregations of messages, in UNIX mbox format, from the majordomo bot at IC where this list was.

(2) Erik Naggum's old archive of usenet posts to comp.text.sgml, already preprocessed into a SGML format at ftp://ftp.ifiuio.no/pub/SGML/comp.text.sgml

Care to develop a good property set? :)

Arjun
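
For illustration only, here is a rough Python sketch of the structure discussed above: each article becomes a node, its header fields become properties, and References / In-Reply-To supply the links. It is not a grove or property set in the formal sense. The names `ArticleNode` and `build_grove` are hypothetical, and pulling the first `<...>` token out of In-Reply-To is an assumed heuristic for coping with that header's variability.

```python
import re
from dataclasses import dataclass, field
from email import message_from_string
from typing import Optional

MSG_ID = re.compile(r"<[^>]+>")  # message ids look like <local@host>


@dataclass
class ArticleNode:
    """Hypothetical node: one article, with headers as properties."""
    message_id: str
    properties: dict           # header fields of the article
    contents: str              # body of the article
    link: Optional[str] = None # id of the article this one replies to
    parent: Optional["ArticleNode"] = None
    children: list = field(default_factory=list)


def build_grove(raw_messages):
    """Parse raw messages and link replies to parents where possible."""
    nodes = {}
    for raw in raw_messages:
        msg = message_from_string(raw)
        mid = (msg.get("Message-ID") or "").strip()
        # Prefer the last id in References; fall back to the first
        # <...> token found in In-Reply-To (assumed heuristic).
        refs = MSG_ID.findall(msg.get("References") or "")
        irt = MSG_ID.findall(msg.get("In-Reply-To") or "")
        link = refs[-1] if refs else (irt[0] if irt else None)
        nodes[mid] = ArticleNode(mid, dict(msg.items()),
                                 msg.get_payload(), link)
    roots = []
    for node in nodes.values():
        parent = nodes.get(node.link) if node.link else None
        if parent is not None and parent is not node:
            node.parent = parent
            parent.children.append(node)
        else:
            roots.append(node)  # no usable link: treat as thread root
    return nodes, roots
```

Messages whose link header is missing or unparseable simply end up as roots, which is where the collaborative "feedback on threading links" idea would have to take over.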