RE: indexing and querying XML (not XQuery)
Index what? Ideas, ideas emerging from conversations, the conversations? So far, what you are describing seems to be Google. Can you out-Google Google?

A semantic aggregator is a topical query engine that automatically synthesizes topics and arranges them by a set of meta topics sometimes known as 'annotations': opposes, in contrast to, supports, etc. The topics are links and the annotations are links, usually out-of-line. This is an old idea from the pre-web days, sometimes found in the context of researchers capturing corporate expertise. In its older but less robust form, it is an inverted index as found at the back of any decent text (which is why this field was called bibliographic linking). Instead of returning links, it returns a fully-formatted report.

So with that bit of insight from the WayBack Machine, Sherman, here is a thought experiment: XML-Dev is regularly harvested for ideas, some attributed, some not, by readers, some lurkers, some contributors. These ideas might get implemented or not, might get rephrased or reformulated to mimic invention, or not. How would you index them to:

o Prove a source is THE source.
o Diagram the emergence of an idea
o Create permathread links for any idea that recurs
o Automatically derive proofs for propositions expressed
o Provide QOS metrics in the face of determined gamers

Use REST if you like, SOA if you like; remember I don't care about the religious technical convictions, just the results. Think of it as the autoDrill (a typical Google search is an exercise in drilling not for a reference, but for an insight). Remember we don't all speak English if you want to expand to blogs from XML-Dev. XML-Dev is the easy test. Blogs are much harder.

len

From: Alan Gutierrez [mailto:alan-xml-dev@e...]

Len was in a thread a while back, on Web 2.0, where I posited the notion of a REST interface to full text search of syndicated feeds, or blogs. While we're at it, Len, did you think about that any further?

Reading through the article, the thing that strikes me is that full-text search of an XML document depends so much on the structure of the document. If that document can be divided into chapters, messages, articles, pages, etc., then it's best to create a full-text index with application-specific documents. So, perhaps, the scalable solution is a full-text engine that is fed XML documents and a full-text indexing schema. The existing schema languages like to atomize documents, while a full-text indexing schema might group their elements into concepts, like paths, links, articles, and clues for ranking articles based on conditions specified in XPath. I've wanted to explore the use of Lucene in my document object model, so I'd like to hear more about this.
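
To make the "full-text indexing schema" idea concrete, here is a rough sketch of how it might look. It is not Lucene and not anything proposed in the thread: it is a plain-Python inverted index, and the `SCHEMA` layout, the `<archive>`/`<message>` sample, and the field weights are all assumptions made up for illustration. The schema names which elements count as one searchable document and which child paths feed the index, using the limited XPath subset that Python's `xml.etree.ElementTree` supports.

```python
import re
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical indexing schema (an assumption, not an existing standard):
# it groups elements into application-level documents instead of atomizing them.
SCHEMA = {
    "unit": ".//message",                 # each match becomes one indexed document
    "id_attr": "id",                      # attribute that holds the document key
    "fields": {"subject": 2, "body": 1},  # child path -> ranking weight
}

# Made-up sample archive, only here to exercise the sketch.
SAMPLE = """<archive>
  <message id="m1">
    <subject>Indexing XML</subject>
    <body>Lucene can index XML chapters.</body>
  </message>
  <message id="m2">
    <subject>REST search</subject>
    <body>Full text search of feeds, not XML indexing.</body>
  </message>
</archive>"""


def tokenize(text):
    """Lower-case the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def build_index(xml_text, schema):
    """Build an inverted index: term -> document id -> weighted term count."""
    root = ET.fromstring(xml_text)
    index = defaultdict(lambda: defaultdict(int))
    for unit in root.findall(schema["unit"]):
        doc_id = unit.get(schema["id_attr"])
        for path, weight in schema["fields"].items():
            for node in unit.findall(path):
                for term in tokenize("".join(node.itertext())):
                    index[term][doc_id] += weight
    return index


def search(index, term):
    """Return document ids containing the term, highest score first."""
    hits = index.get(term.lower(), {})
    return sorted(hits, key=lambda doc_id: (-hits[doc_id], doc_id))


index = build_index(SAMPLE, SCHEMA)
print(search(index, "xml"))
```

The weights stand in for the "clues for ranking" mentioned above: a term in a subject counts double a term in a body. A real engine would replace the dictionary with something like a Lucene index, and the schema would likely need fuller XPath than ElementTree offers.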