Matching string values across element boundaries


Subject: Matching string values across element boundaries
From: David Sewell <dsewell@xxxxxxxxxxxx>
Date: Mon, 8 Apr 2013 14:15:28 -0400

I expect this has been discussed here before, but I can't locate any relevant
discussion, so here goes.

We have input data with many unmarked short-title citations that look like
this:

   Sprague, <hi rend="italic">Braintree Families</hi>

We want to wrap them inside another element, in our case a <ref> to the
bibliographic expansion. We have a venerable chain of XSLT 2.0 transforms that
does this, and pretty well, by preprocessing the data to convert all those <hi>
tags into a pair of unique ASCII characters, so that we can do string-matching
operations within a single text node that now includes something like

   Sprague, "Braintree Families%

which is easy to handle with xsl:analyze-string. Then, once we've wrapped all
the strings we need to, we post-process with xsl:analyze-string to put the <hi>
elements back in.
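
The three passes described above can be sketched roughly as follows. This is
an illustration in Python, not the actual XSLT chain: it works on serialized
markup rather than on text nodes, and the citation regex, the bare <ref>
wrapper, and all function names are assumptions made for the example. The two
placeholder characters are the ones shown in the sample above.

```python
import re

# Placeholder characters standing in for <hi rend="italic"> and </hi>.
# They must not occur anywhere else in the data (assumed here).
OPEN, CLOSE = '"', '%'

HI_OPEN = '<hi rend="italic">'
HI_CLOSE = '</hi>'

# Hypothetical citation pattern: a capitalized surname, a comma,
# then a title delimited by the two placeholder characters.
CITATION = re.compile(r'[A-Z][A-Za-z]+, "[^"%]+%')


def preprocess(text):
    """Pass 1: turn the <hi> tags into single placeholder characters."""
    return text.replace(HI_OPEN, OPEN).replace(HI_CLOSE, CLOSE)


def wrap(text):
    """Pass 2: wrap every matched citation in a <ref> element."""
    return CITATION.sub(lambda m: '<ref>' + m.group(0) + '</ref>', text)


def postprocess(text):
    """Pass 3: turn each placeholder pair back into a <hi> element."""
    return re.sub(r'"([^"%]*)%', HI_OPEN + r'\1' + HI_CLOSE, text)


def wrap_citations(text):
    """Run the three passes in order."""
    return postprocess(wrap(preprocess(text)))


if __name__ == '__main__':
    sample = 'See Sprague, <hi rend="italic">Braintree Families</hi>, 12.'
    print(wrap_citations(sample))
```

The sketch only works because the placeholder characters are assumed to be
absent from the rest of the data, which is the same constraint the
character-substitution step has in the real transforms.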

In practice, given the proper regexes, this works quite well and provides the
desired output, but I always feel a bit guilty about the hackishness of the
approach. Given that the citations are quite variable in structure (usually but
not always containing <hi> elements, with various combinations of text nodes at
start and end), I've never come up with a good general-purpose way to operate
purely on elements and text nodes without the convert-tags-to-characters step.
Is there one (or more)?

David S.

--
David Sewell, Editorial and Technical Manager
ROTUNDA, The University of Virginia Press
PO Box 400314, Charlottesville, VA 22904-4314 USA
Email: dsewell@xxxxxxxxxxxx   Tel: +1 434 924 9973
Web: http://rotunda.upress.virginia.edu/
