XSL-LIST Mailing List Archive

RE: Identifying place names in text...
This isn't difficult; there's no need to contemplate doing it in Java. You can tokenize the text using the tokenize() function in XSLT 2.0, or the str:tokenize() function/template in EXSLT (www.exslt.org). Then look up each token in your list of place names, using a key for efficiency.

Michael Kay
http://www.saxonica.com/

> -----Original Message-----
> From: Karl Koch [mailto:TheRanger@xxxxxxx]
> Sent: 21 July 2005 14:56
> To: Mulberry list
> Subject: Identifying place names in text...
>
> Hello group,
>
> I would like to find a way of automatically identifying references to
> places in XML text. I have a very large set of content, and in this
> content there are sometimes references to particular places, which I
> want to know about.
>
> This is my XML structure (made up for simplification):
>
> <bookshelf>
>   <book>
>     <title>1000 years of London's history</title>
>     ...
>   </book>
>   <book>
>     <title>1984</title>
>     ...
>   </book>
> </bookshelf>
>
> Can I use XSLT to search for place names in the titles of all the books?
> I would like to use a word list of geographical place names (which I
> already have). This would contain country and city names. The stylesheet
> would match occurrences of these words in the <title> XML element. The
> output would be a list of all books which have references to locations
> in the title. In this example, the result would be only the first book,
> because it has "London" in the title.
>
> Perhaps this is the point where XSLT is getting too complicated and I
> should consider Java as a solution. However, I am continuously impressed
> by the power of XSLT, and therefore I ask here because I think there
> might even be a solution for this problem using XSLT.
>
> A note on the side: the output of this stylesheet would be a helper and
> an additional control for a mainly handcrafted process. I could discover
> books which I have overlooked in the manual process.
>
> Any help would be greatly appreciated.
>
> Kind Regards,
> Karl
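
The approach suggested in the reply (tokenize each title, then look up the tokens through a key) might be sketched in XSLT 2.0 roughly as follows. This is only an illustration under stated assumptions, not the stylesheet either poster actually used: the file name places.xml, its `<places>`/`<place>` layout, the key name `place-by-name`, and the `<matches>` output wrapper are all hypothetical. It uses the three-argument form of key(), which is XSLT 2.0 only, and it matches single-word names case-insensitively, so a multi-word name such as "New York" would need extra handling.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" indent="yes"/>

  <!-- Assumption: the word list is stored next to the stylesheet as
       <places><place>London</place><place>Paris</place>...</places> -->
  <xsl:variable name="places" select="document('places.xml')"/>

  <!-- Index every place by its lower-cased name for fast lookup -->
  <xsl:key name="place-by-name" match="place" use="lower-case(.)"/>

  <xsl:template match="/bookshelf">
    <matches>
      <xsl:for-each select="book">
        <!-- Split the lower-cased title on runs of non-letter, non-digit
             characters and drop any empty leading token -->
        <xsl:variable name="tokens"
            select="tokenize(lower-case(title), '[^\p{L}\p{N}]+')[. != '']"/>
        <!-- key() accepts a sequence of values and returns the place
             elements that match any of them, searching within $places -->
        <xsl:if test="exists(key('place-by-name', $tokens, $places))">
          <xsl:copy-of select="."/>
        </xsl:if>
      </xsl:for-each>
    </matches>
  </xsl:template>

</xsl:stylesheet>
```

With the sample bookshelf from the question and a word list containing "London", this should copy only the first book into the output, since "1984" yields no token that appears in the list. An XSLT 1.0 processor would need str:tokenize() from EXSLT instead, plus a different way of switching context to the word-list document before calling key().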