[xsl] Aligning/merging two sequences

Subject: Aligning/merging two sequences
From: Markus Flatscher <markus.flatscher@xxxxxxxxxxxx>
Date: Thu, 30 Sep 2010 12:51:00 -0400

I'm banging my head against a sequence alignment problem. I have a feeling that this is straightforward, but I can't put my finger on what's missing from my attempts.

Suppose I have two inputs like so, where the words in input1//w always form a subset of those in input2//w and appear in the same order:

<input1>
 <w n="1">I</w>
 <w n="2">am</w>
 <w n="3">a</w>
 <w n="4">sequence</w>
</input1>

<input2>
 <w>I</w>
 <w>am</w>
 <w>a</w>
 <w>longer</w>
 <w>longer</w>
 <w>sequence</w>
</input2>

I'd like to get output like so:

<output>
 <w n="1">I</w>
 <w n="2">am</w>
 <w n="3">a</w>
 <w n="skipped">longer</w>
 <w n="skipped">longer</w>
 <w n="4">sequence</w>
</output>

I.e., for each input1//w, @n should be copied to the nearest following <w> in input2 whose text matches that of the input1 <w>; any <w> in input2 that has no counterpart in input1 should be flagged with n="skipped".
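
To make the rule above concrete, here is a minimal sketch of it as a single left-to-right pass, written in Python rather than XSLT purely for illustration. The function name align and the constants INPUT1 and INPUT2 are made up for this example. It assumes every input1 <w> carries @n and that the input1 words occur in input2 in the same order; it is not meant as the definitive solution.

```python
import xml.etree.ElementTree as ET

# Sample data copied from the post (names INPUT1/INPUT2 are illustrative).
INPUT1 = """<input1>
 <w n="1">I</w><w n="2">am</w><w n="3">a</w><w n="4">sequence</w>
</input1>"""

INPUT2 = """<input2>
 <w>I</w><w>am</w><w>a</w><w>longer</w><w>longer</w><w>sequence</w>
</input2>"""


def align(input1_xml: str, input2_xml: str) -> ET.Element:
    """Greedy alignment: walk input2 once, consuming input1 words in order."""
    timed = ET.fromstring(input1_xml).findall(".//w")  # words with @n
    full = ET.fromstring(input2_xml).findall(".//w")   # complete word list
    output = ET.Element("output")
    i = 0  # index of the next input1 word that has not been matched yet
    for w in full:
        out = ET.SubElement(output, "w")
        out.text = w.text
        if i < len(timed) and timed[i].text == w.text:
            # Text matches the next pending input1 word: copy its @n.
            out.set("n", timed[i].get("n"))
            i += 1
        else:
            # No counterpart in input1 at this position.
            out.set("n", "skipped")
    return output


if __name__ == "__main__":
    print(ET.tostring(align(INPUT1, INPUT2), encoding="unicode"))
```

The pass keeps one pointer into input1 and only advances it on a match, which corresponds to "nearest following match" in the description. If the machine transcription contained words that are absent from input2, the pointer would stall and everything after that point would be flagged as skipped, so that case would need a real alignment algorithm instead.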

P.S.: The use case is aligning an imperfect but timestamped transcription of an audio file (input1, machine-generated) with a perfect but not-timestamped one (input2, human-generated).

Thanks much for any help,

Markus

--
Markus Flatscher, Project Editor
ROTUNDA, The University of Virginia Press
PO Box 400314, Charlottesville VA 22904, USA
Courier: 211 Emmet Street South, Charlottesville VA 22903, USA
Email: markus.flatscher@xxxxxxxxxxxx
Web: http://rotunda.upress.virginia.edu/
