Re: indexing and querying XML (not XQuery)
* Robert Koberg <rob@k...> [2005-08-23 09:06]:
> Hi,
>
> Someone on the Lucene user's list posted a link to this paper:
> http://www.idealliance.org/papers/xmle02/dx_xmle02/papers/03-02-08/03-02-08.html
> that talks about indexing and searching XML documents. I have been doing
> something similar for a while (3 years, I think) but it is specific to
> our configuration/content which probably doesn't have wider
> applicability. I have also found it to be:
> "a fast, reliable XML search engine, which has exceeded our expectations
> in terms of flexibility and low development cost."
> I was thinking the article would be of interest to many people here. I
> was also wondering about your thoughts on this method of dealing with
> XML. I have not looked in depth at XQuery, and I am wondering what
> strengths/benefits XQuery would have over using something like Lucene to
> index/query XML.
> It would be interesting to see what folk from this list would come up
> with if they put their brains to work on ways to handle
> indexing/searching with something like Lucene.

Len was in a thread a while back, on Web 2.0, where I posited the notion of a REST interface to full-text search of syndicated feeds, or blogs. While we're at it, Len, did you think about that any further?

Reading through the article, the thing that strikes me is that full-text search of an XML document depends so much on the structure of the document. If that document can be divided into chapters, messages, articles, pages, etc., then it's best to create a full-text index with application-specific documents.

So, perhaps, the scalable solution is a full-text engine that is fed XML documents and a full-text indexing schema. The existing schema languages like to atomize documents, while a full-text indexing schema might group their elements into concepts, like paths, links, articles, and clues for ranking articles based on conditions specified in XPath.
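
To make the indexing-schema idea a little more concrete, here is a rough Python sketch. It is only an illustration under my own assumptions: it does not use Lucene, the SCHEMA layout, the field weights, the sample feed, and the function names are all hypothetical, and the standard library's ElementTree supports only a subset of XPath. The schema names the element that becomes one indexed document and the fields inside it that feed a weighted inverted index.

```python
import re
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical indexing schema; the keys and layout are illustrative only.
SCHEMA = {
    "unit": ".//article",           # each match becomes one indexed document
    "fields": {                     # field name -> (relative path, rank weight)
        "title": ("title", 2.0),
        "body": ("body", 1.0),
    },
}

# Made-up sample input standing in for a syndicated feed.
SAMPLE = """<feed>
  <article><title>Lucene and XML</title><body>Indexing feeds with a search engine.</body></article>
  <article><title>REST notes</title><body>Lucene behind a REST interface.</body></article>
</feed>"""


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(xml_text, schema):
    """Split the XML into units and build a weighted inverted index."""
    root = ET.fromstring(xml_text)
    index = defaultdict(lambda: defaultdict(float))  # term -> unit id -> score
    units = root.findall(schema["unit"])
    for unit_id, unit in enumerate(units):
        for path, weight in schema["fields"].values():
            for node in unit.findall(path):
                for term in tokenize("".join(node.itertext())):
                    index[term][unit_id] += weight
    return index, units


def search(index, query):
    """Return unit ids ordered by descending summed score, then by id."""
    scores = defaultdict(float)
    for term in tokenize(query):
        for unit_id, score in index.get(term, {}).items():
            scores[unit_id] += score
    return sorted(scores, key=lambda u: (-scores[u], u))


index, units = build_index(SAMPLE, SCHEMA)
hits = search(index, "lucene")
```

With the sample feed, a title match outweighs a body match, so the first article ranks ahead of the second for "lucene". A real engine would replace the dictionary with its own index and the weights with whatever ranking clues the schema expresses.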
I've wanted to explore the use of Lucene in my document object model, so I'd like to hear more about this.

-- 
Alan Gutierrez - alan@e...
 - http://engrm.com/blogometer/index.html
 - http://engrm.com/blogometer/rss.2.0.xml