RE: Identifying place names in text...

Subject: RE: Identifying place names in text...
From: "Michael Kay" <mike@xxxxxxxxxxxx>
Date: Thu, 21 Jul 2005 17:38:07 +0100
This isn't difficult; there's no need to contemplate doing it in Java. You can
tokenize the text using the tokenize() function in XSLT 2.0, or the
str:tokenize() function/template in EXSLT (www.exslt.org). Then look up each
token in your list of place names, using a key for efficiency.

Michael Kay
http://www.saxonica.com/
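
A minimal sketch of the tokenize-and-key approach described above, in XSLT 2.0. It is an illustration under stated assumptions, not code from the original message: the word list is assumed to be a separate file named places.xml containing <place> elements (both the file name and its structure are hypothetical), and each <book> is assumed to have exactly one <title>.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" indent="yes"/>

  <!-- Assumption: the word list lives in places.xml next to the stylesheet,
       shaped like <places><place>London</place>...</places> (hypothetical). -->
  <xsl:variable name="places" select="document('places.xml')"/>

  <!-- Index each place name by its lower-cased, whitespace-normalized text. -->
  <xsl:key name="place-by-name" match="place"
           use="lower-case(normalize-space(.))"/>

  <xsl:template match="/">
    <matches>
      <xsl:for-each select="bookshelf/book">
        <!-- Split the lower-cased title on runs of characters that are
             neither letters nor digits, so "London's" yields "london", "s". -->
        <xsl:variable name="tokens"
            select="tokenize(lower-case(title), '[^\p{L}\p{N}]+')"/>
        <!-- The three-argument key() looks the tokens up in the word-list
             document; it returns every <place> matching any of the tokens. -->
        <xsl:if test="exists(key('place-by-name', $tokens, $places))">
          <xsl:copy-of select="."/>
        </xsl:if>
      </xsl:for-each>
    </matches>
  </xsl:template>

</xsl:stylesheet>
```

Because the lookup is done token by token, this sketch only finds single-word place names; a multi-word name such as "New York" would need a different matching step (for example, testing adjacent token pairs as well).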

> -----Original Message-----
> From: Karl Koch [mailto:TheRanger@xxxxxxx] 
> Sent: 21 July 2005 14:56
> To: Mulberry list
> Subject:  Identifying place names in text...
> 
> Hello group,
> 
> I would like to find a way of automatically identifying references to places
> in XML text. I have a very large set of content, which sometimes contains
> references to particular places that I want to know about.
> 
> This is my XML structure (made up for simplification):
> 
> <bookshelf>
>   <book>
>     <title>1000 years of London's history</title>
>     ...
>   </book>
>   <book>
>     <title>1984</title>
>     ...
>   </book>
> </bookshelf>
> 
> Can I use XSLT to search for place names in the titles of all the books? I
> would like to use a word list of geographical place names (which I already
> have). This would contain country and city names. The stylesheet would match
> occurrences of these words in the <title> XML element. The output would be a
> list of all books whose titles refer to places. In this example, the result
> would be only the first book, because it has "London" in the title.
> 
> Perhaps this is the point where XSLT is getting too complicated and I should
> consider Java as a solution. However, I am continuously impressed by the
> power of XSLT, so I am asking here because I think there might be a solution
> to this problem using XSLT.
> 
> A note on the side: the output of this stylesheet would be a helper and an
> additional check for a mainly handcrafted process. I could discover books
> that I have overlooked in the manual process.
> 
> Any help would be greatly appreciated.
> 
> Kind Regards,
> Karl