
Re: citation processing

Subject: Re: citation processing
From: David Carlisle <davidc@xxxxxxxxx>
Date: Fri, 20 Oct 2006 16:55:05 +0100
> If you think it's not really feasible to parse a plain text citation
> into a marked up version then that's good feedback

Well, never say never, of course, and it depends on whether the input
always follows the rules.

You could parse the example you gave with a couple (or half a dozen:-)
incantations with xsl:analyze-string,


get everything up to the first "." as the list of authors,
recursively split that up on "," to get each author,
etc.
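
The first two steps above could be sketched roughly as follows. This is in
Python rather than XSLT, purely for illustration; the function name
split_authors and the sample citation are made up, since the original example
citation is not quoted in this message. Where XSLT would recurse or tokenize,
Python simply splits.

```python
import re

def split_authors(citation):
    """Take the text before the first '.' and split it on commas."""
    # Everything up to the first "." is assumed to be the author list.
    author_part, _, _rest = citation.partition(".")
    # Split on commas, dropping surrounding whitespace and empty entries.
    return [name.strip() for name in re.split(r"\s*,\s*", author_part) if name.strip()]

# Hypothetical citation, for illustration only.
sample = "Smith J, Jones A. A study of things. Journal of Examples; 12: 34-56"
print(split_authors(sample))
```

Note that this only works if the author names themselves contain no "."
(so "Smith, J." style initials would already break it).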

The trouble is that if the citations have been entered by hand, some of
them are going to use a : where you expect a , or a . instead of a ; or
Microsoft code page "smart quote" characters instead of the real thing,
and a simple regexp replace mechanism isn't usually very good at
recovering from variable input like that.
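
Of those problems, the smart quotes are the one that can be cleaned up
mechanically before any parsing. A minimal sketch in Python, assuming the
text has already been decoded to Unicode (the names SMART_PUNCTUATION and
normalise_quotes are invented here):

```python
# Map the Unicode "smart quote" characters (which is what the Windows-1252
# quote bytes decode to) onto their plain ASCII equivalents.
SMART_PUNCTUATION = {
    "\u2018": "'",   # left single quotation mark
    "\u2019": "'",   # right single quotation mark
    "\u201c": '"',   # left double quotation mark
    "\u201d": '"',   # right double quotation mark
}
TABLE = str.maketrans(SMART_PUNCTUATION)

def normalise_quotes(text):
    """Replace smart quotes with ASCII quotes; leave everything else alone."""
    return text.translate(TABLE)

print(normalise_quotes("\u201cA study\u2019s title\u201d"))
```

This does nothing for a : typed where a , was expected; that kind of
variation still needs a human or a much more forgiving parser.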

If, on the other hand, the text was originally in a citation system and
was generated, but has been cut and pasted across a few generations of
HTML pages and the original source is no longer available, you might be
OK to assume that the text itself is regular in its use of punctuation,
and
([^\.]*)\.([^\.]*)\.([^;]*);(.*)
will, for example, give you the author list, the title, the journal title,
and volume/page info as $1 .. $4, each of which could be further split up.
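
To see what the four groups look like in practice, here is a small Python
sketch that applies the pattern with the two separators written as escaped
literal dots. The sample citation and the name split_citation are
hypothetical; real citations will only behave like this if they really are
regular.

```python
import re

# Four groups: authors . title . journal ; volume/page info
CITATION = re.compile(r"([^\.]*)\.([^\.]*)\.([^;]*);(.*)", re.DOTALL)

def split_citation(text):
    """Return (authors, title, journal, volume_pages), or None if no match."""
    m = CITATION.match(text)
    if m is None:
        return None
    # Trim the whitespace left over around each separator.
    return tuple(part.strip() for part in m.groups())

# Hypothetical citation, for illustration only.
sample = "Smith J, Jones A. A study of things. Journal of Examples; 12: 34-56"
print(split_citation(sample))
```

A citation with no ";" at all, or with fewer than two "." characters before
it, comes back as None rather than as a wrong split, which makes the
irregular entries easy to set aside for checking by hand.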

David
