Re: Concordance with XSLT

Subject: Re: Concordance with XSLT
From: Robert Koberg <rob@xxxxxxxxxx>
Date: Thu, 03 Nov 2005 16:06:33 -0500
Rick Quatro wrote:
I am in the investigation stage of a project where the client wants a
concordance of a Bible. The concordance would be exhaustive, except for
words like "a", "the", "and", etc. We would supply an exclusion list. My
main question is this: given an XML version of the Bible, could this be done
practically with XSLT?

I don't think XSL is the best way to handle this kind of task. You might want to ask the same question on the Apache Lucene mailing list (the main project site is at http://lucene.apache.org/) or on the list for some other search/indexing software. This sounds more like a job for a search engine.


You would write a ContentHandler to index the XML into a Lucene search index. You would create fields for the passage identifier, the passage content and the passage's book ancestor. Another ContentHandler could create a list of all words not in the "stop word list". That list can then be sorted and stripped of duplicates, and each remaining word searched against the index. The results for each word could be returned as XML, and XSL could be used to write them to a file.
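
To make the shape of that pipeline concrete, here is a minimal sketch in Python. It is not the Lucene setup described above: a plain in-memory dictionary stands in for the search index, and a single SAX ContentHandler does both the indexing and the stop-word filtering. The element and attribute names (`book`, `verse`, `name`, `id`), the stop-word set and the function name `build_concordance` are all assumptions made for illustration; a real Bible XML file will use its own markup.

```python
import re
import xml.sax
from collections import defaultdict

# Hypothetical exclusion list; the client would supply the real one.
STOP_WORDS = {"a", "the", "and", "in", "of"}


class ConcordanceHandler(xml.sax.ContentHandler):
    """Collects word -> [(book, verse id), ...] while streaming the XML.

    Assumes markup like <book name="..."><verse id="...">text</verse></book>.
    """

    def __init__(self, stop_words):
        super().__init__()
        self.stop_words = stop_words
        self.index = defaultdict(list)
        self._book = None       # name of the enclosing book element
        self._verse_id = None   # set only while inside a verse element
        self._text = []         # text chunks of the current verse

    def startElement(self, name, attrs):
        if name == "book":
            self._book = attrs.get("name")
        elif name == "verse":
            self._verse_id = attrs.get("id")
            self._text = []

    def characters(self, content):
        # SAX may deliver one text node in several chunks, so accumulate.
        if self._verse_id is not None:
            self._text.append(content)

    def endElement(self, name):
        if name == "verse":
            words = re.findall(r"[a-z']+", "".join(self._text).lower())
            # One entry per distinct word per verse, stop words dropped.
            for word in sorted(set(words) - self.stop_words):
                self.index[word].append((self._book, self._verse_id))
            self._verse_id = None


def build_concordance(xml_text, stop_words=STOP_WORDS):
    """Return a dict of word -> list of (book, verse id), sorted by word."""
    handler = ConcordanceHandler(stop_words)
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return dict(sorted(handler.index.items()))
```

Calling `build_concordance` on the XML text yields the sorted, de-duplicated word list together with the passages for each word. With a real search index you would instead store each passage as a document and run one query per word, which scales better for a text the size of a Bible and gives you stemming and phrase search if the client later asks for them.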

best,
-Rob
