
Re: Inverting names with Jr and Sr considered

Subject: Re: Inverting names with Jr and Sr considered
From: Wolfgang Laun <wolfgang.laun@xxxxxxxxx>
Date: Tue, 6 Nov 2012 10:57:57 +0100
Hopefully you won't have names like "Augustus De Morgan", which should
not be transformed to "Morgan, Augustus De".

And I think this is the time and the place to quote this article once more:
http://www.kalzumeus.com/2010/06/17/falsehoods-programmers-believe-about-names/

-W

On 06/11/2012, Mark <mark@xxxxxxxxxxxx> wrote:
> I agree, my specification is likely not complete. However, my input is a
> single document written by one person indexing a single journal. There is a
> great deal of consistency to the data and I doubt that there are as many as
> 1000 names. That said:
>
> I received an answer off the list (so I do not feel authorized to post it
> here) that will help me discover what oddities I have not covered. It
> explained the regular expressions it used, so that if modification is
> required, I may be able to do it.
>
> Thanks for your time, Michael; as always this list provides the most
> consistent and practical advice around, something you all can be proud of.
>
> Mark
>
> -----Original Message-----
> From: Michael Kay
> Sent: Tuesday, November 06, 2012 2:10 AM
> To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
> Subject: Re:  Inverting names with Jr and Sr considered
>
> I wouldn't even attempt to write any code based on this as the
> specification. For this to work at all well, you're going to need to
> iteratively adapt the solution to handle all the names in your dataset,
> or at least a sample of a couple of thousand of them. There's just too
> much variation in the names you might encounter. Are "Jr" and "Sr"
> really the only suffixes, and are they always spelt this way, or do you
> also get "III" and "Jnr" and "Jnr."?
>
> If I'm wrong, and the names are all regular and in the pattern you
> describe, then I think you can just tokenize on whitespace and do
> something like
>
> suffix := $tokens[last()][. = ('Jr', 'Sr')]
> stem := if ($suffix) then remove($tokens, count($tokens)) else $tokens
> value-of select="concat($stem[last()], ','), remove($stem,
> count($stem)), if ($suffix) then concat('(', $suffix, ')') else ()"
>
> Michael Kay
> Saxonica
>
> On 05/11/2012 23:45, Mark wrote:
>> This must have been done many times, so can someone show me where to find
>> the answer?
>>
>> I have a series of personal names in natural order that I need to invert.
>>
>> The surname is always last except when followed by Jr or Sr (either
>> of which may be absent). I want to represent:
>>
>> J Allen Rogers -> Rogers, J Allen
>> Bill T Wilson Jr -> Wilson, Bill T (Jr)
>> A B Brown -> Brown, A B
>> John Victor Case Sr -> Case, John Victor (Sr)
>>
>> and so on. There may be a single space or multiple spaces between some of
>> the elements of the name.
>>
>> It looks like <xsl:analyze-string> will do this, but I do not know how to
>> write regex.
>>
>> Thanks,
>> Mark
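
The tokenize-based approach outlined in the thread could be sketched as a complete XSLT 2.0 stylesheet along the following lines. This is only an illustration under stated assumptions: the function name f:invert-name, the namespace urn:example:names, and the <name> input elements are hypothetical and do not come from the original posts. It handles only the regular pattern described in the question, so multi-word surnames such as "Augustus De Morgan" and suffixes other than "Jr" and "Sr" would still be inverted wrongly.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:f="urn:example:names"
    exclude-result-prefixes="xs f">

  <xsl:output method="text"/>

  <!-- Hypothetical helper: inverts one personal name given in natural order. -->
  <xsl:function name="f:invert-name" as="xs:string">
    <xsl:param name="name" as="xs:string"/>

    <!-- normalize-space() collapses runs of whitespace and trims the ends,
         so splitting on a single space is sufficient. -->
    <xsl:variable name="tokens" as="xs:string*"
        select="tokenize(normalize-space($name), ' ')"/>

    <!-- The last token counts as a suffix only if it is exactly Jr or Sr. -->
    <xsl:variable name="suffix" as="xs:string?"
        select="$tokens[last()][. = ('Jr', 'Sr')]"/>

    <!-- Everything except the suffix; its last token is taken as the surname. -->
    <xsl:variable name="stem" as="xs:string*"
        select="if (exists($suffix))
                then $tokens[position() lt last()]
                else $tokens"/>

    <xsl:sequence select="
        if (count($stem) lt 2)
        then normalize-space($name)   (: nothing to invert, return as-is :)
        else string-join(
               (concat($stem[last()], ','),
                $stem[position() lt last()],
                if (exists($suffix)) then concat('(', $suffix, ')') else ()),
               ' ')"/>
  </xsl:function>

  <!-- Assumed input shape: any document containing <name> elements. -->
  <xsl:template match="/">
    <xsl:for-each select="//name">
      <xsl:value-of select="f:invert-name(.)"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

With an input element such as <name>Bill T   Wilson Jr</name>, the function is intended to return "Wilson, Bill T (Jr)", matching the examples in the original question. Names with fewer than two tokens before the suffix are passed through unchanged rather than inverted, which is a design choice of this sketch and not something the thread specifies.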
