RE: regexp question
> The most important thing is to get "surname, forename" so that I can
> more easily query and transform that later (into more finely parsed
> bibliographic records, for example). While there will be exceptions to
> this rule, I'm content enough to just say:
>
> - a name is all caps
> - within a name the last name is the surname
> - anything before that are the forenames
> - multiple names are delimited by either ", " or " and "

There are two approaches to this: do it all with regex analysis; or tokenize it first into words, and then use for-each-group to group the words.

> Titlecasing would be nice (though I note there's no such function in
> XSLT 2.0).

Titlecasing is very sensitive to local rules. Rules that work for English wouldn't work for German. In fact, rules that work for American English wouldn't work for British English - in Britain, it would be unthinkable to write "In" or "Is" in a headline, but I'm sure I've seen US newspapers that do it, and certainly Microsoft Word (even the UK edition) does, though the grammar checker then flags the result as being incorrect.

Michael Kay
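
The second approach above (tokenize into words, then group them with for-each-group) might look roughly like the following XSLT 2.0 sketch. It is only an illustration under the four rules quoted above, not code from the original message: the $authors parameter, its sample value, the "main" template name, and the names/name output elements are all assumptions made for the example. It treats a run of adjacent all-caps words as one name, so the ", " and " and " delimiters end a name simply because they are not all-caps words.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <xsl:output method="xml" indent="yes"/>

  <!-- Hypothetical input; in practice this would come from the source document. -->
  <xsl:param name="authors" as="xs:string"
             select="'JOHN SMITH, MARY ANN JONES and PETER BROWN'"/>

  <xsl:template name="main">
    <!-- Give every comma its own token, then split on single spaces. -->
    <xsl:variable name="words" as="xs:string*"
        select="tokenize(normalize-space(replace($authors, ',', ' , ')), ' ')"/>
    <names>
      <!-- Adjacent words with the same key form one group.
           The key is true for all-caps words, false for ',' and 'and'. -->
      <xsl:for-each-group select="$words"
                          group-adjacent="matches(., '^\p{Lu}+$')">
        <xsl:if test="current-grouping-key()">
          <xsl:variable name="n" as="xs:string+" select="current-group()"/>
          <name>
            <!-- Last word is the surname; anything before it is forenames. -->
            <xsl:value-of select="$n[last()]"/>
            <xsl:if test="count($n) gt 1">
              <xsl:value-of
                  select="concat(', ', string-join($n[position() lt last()], ' '))"/>
            </xsl:if>
          </name>
        </xsl:if>
      </xsl:for-each-group>
    </names>
  </xsl:template>

</xsl:stylesheet>
```

Run with "main" as the initial template, this should emit one name element per author in "SURNAME, FORENAMES" form. Initials with full stops, hyphenated surnames, and particles such as "van" or "de" are not all-caps words under this test, so they fall among the expected exceptions and would need a looser regex in the group-adjacent key.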