XSLT2 node comparison, wordlists
I'm sure this is easy to do in XSLT2, but I've not yet got my head
around how to compare things properly and efficiently.
Let's say I have a wordlist, automatically generated from another
file, that records each instance of how a word was used. In many cases
these instances are identical in spelling, and what I want to do is
merge them and store links between the original file and the wordlist
using stand-off markup.
Say the file has an entry for each word, like this:
=====
<entry xml:id="let22-w27">
<form>
<orth type="hw">the</orth>
<form type="orthVar">
<orth xml:id="w72">The</orth>
<orth xml:id="w3955">The</orth>
<orth xml:id="w4513">The</orth>
<orth xml:id="w4578">The</orth>
<orth xml:id="w4650">The</orth>
<orth xml:id="w4672">The</orth>
<orth xml:id="w4703">The</orth>
<orth xml:id="w4824">The</orth>
<orth xml:id="w4830">The</orth>
<orth xml:id="w2045">the</orth>
<orth xml:id="w2079">the</orth>
<orth xml:id="w2101">the</orth>
<orth xml:id="w2112">the</orth>
<orth xml:id="w2333">the</orth>
<orth xml:id="w2400">the</orth>
<orth xml:id="w2442">the</orth>
<orth xml:id="w1402">T<ex>h</ex><hi rend="sup">e</hi></orth>
<orth xml:id="w2422">T<ex>h</ex><hi rend="sup">e</hi></orth>
<orth xml:id="w6458">T<ex>h</ex><hi rend="sup">e</hi></orth>
<orth xml:id="w7822">T<ex>h</ex><hi rend="sup">e</hi></orth>
<orth xml:id="w2097">t<ex>h</ex><hi rend="sup">e</hi></orth>
<orth xml:id="w2155">t<ex>h</ex><hi rend="sup">e</hi></orth>
<orth xml:id="w2482">t<ex>h</ex><hi rend="sup">e</hi></orth>
<orth xml:id="w5887">t<ex>h</ex><hi rend="sup">e</hi></orth>
<orth xml:id="w5642">T<ex>h</ex>e</orth>
<orth xml:id="w5378">t<ex>h</ex>e</orth>
</form>
</form>
</entry>
=====
What I want to end up with, for each form[@type='orthVar'], is only
the distinct orth elements (compared on their full content, child
markup included), each with a new @xml:id value. The old @xml:id
values (which are copies from a different file) should be preserved at
the bottom of the file, in links from each new value to the current ones.
So something like:
=====
<div>
<entry xml:id="let22-w27">
<form>
<orth type="hw">the</orth>
<form type="orthVar" n="6"> <!-- n = number of distinct variants -->
<orth xml:id="let22-w27-vA">The</orth>
<orth xml:id="let22-w27-vB">the</orth>
<orth xml:id="let22-w27-vC">T<ex>h</ex><hi rend="sup">e</hi></orth>
<orth xml:id="let22-w27-vD">t<ex>h</ex><hi rend="sup">e</hi></orth>
<orth xml:id="let22-w27-vE">T<ex>h</ex>e</orth>
<orth xml:id="let22-w27-vF">t<ex>h</ex>e</orth>
</form>
</form>
</entry>
<!-- more entries -->
<!-- at bottom of file -->
<div type="links">
<linkGrp xml:id="let22-w27-lg">
<!-- links between the orth form above with its instance in file.xml -->
<link targets="#let22-w27-vA file.xml#w72 file.xml#w3955
file.xml#w4513 file.xml#w4578 file.xml#w4650 file.xml#w4672
file.xml#w4703 file.xml#w4824 file.xml#w4830"/>
<link targets="#let22-w27-vB file.xml#w2045 file.xml#w2079
file.xml#w2101 file.xml#w2112 file.xml#w2333 file.xml#w2400
file.xml#w2442"/>
<link targets="#let22-w27-vC file.xml#w1402 file.xml#w2422
file.xml#w6458 file.xml#w7822"/>
<link targets="#let22-w27-vD file.xml#w2097 file.xml#w2155
file.xml#w2482 file.xml#w5887"/>
<link targets="#let22-w27-vE file.xml#w5642"/>
<link targets="#let22-w27-vF file.xml#w5378"/>
</linkGrp>
<!-- more linkGrps -->
</div>
</div>
=====
XSLT2 is certainly usable in this case, but all of my attempts have
either been hideously inefficient or failed to compare the nested
children accurately.
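To pin down the comparison being asked for, here is a minimal sketch of the grouping in Python rather than XSLT. It is only a reference for the intended behaviour, not the stylesheet being sought. The function names `content_key` and `merge_variants` are hypothetical, the sample entry is an abridged copy of the one above, and the sketch assumes that two orth elements count as the same variant exactly when their text and child markup serialize identically.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:id under the XML namespace URI.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Abridged copy of the entry shown above (assumption: same structure).
SAMPLE = """<entry xml:id="let22-w27"><form><orth type="hw">the</orth>
<form type="orthVar">
<orth xml:id="w72">The</orth>
<orth xml:id="w3955">The</orth>
<orth xml:id="w2045">the</orth>
<orth xml:id="w1402">T<ex>h</ex><hi rend="sup">e</hi></orth>
<orth xml:id="w5642">T<ex>h</ex>e</orth>
</form></form></entry>"""


def content_key(orth):
    """Serialize an orth's content (text plus child markup).

    The orth's own attributes, including xml:id, are left out so that
    instances differing only in their ID produce the same key.
    """
    parts = [orth.text or ""]
    for child in orth:
        # tostring() includes the child's tail text, e.g. the final "e"
        # in T<ex>h</ex>e.
        parts.append(ET.tostring(child, encoding="unicode"))
    return "".join(parts)


def merge_variants(entry, source="file.xml"):
    """Group orth variants by content; return (new_id, key, targets) tuples."""
    entry_id = entry.get(XML_ID)
    groups = {}  # insertion-ordered: first occurrence fixes the variant order
    for orth in entry.iterfind(".//form[@type='orthVar']/orth"):
        groups.setdefault(content_key(orth), []).append(orth.get(XML_ID))

    variants = []
    for index, (key, old_ids) in enumerate(groups.items()):
        # Letter suffixes as in the desired output; this sketch only
        # covers up to 26 variants per entry.
        new_id = f"{entry_id}-v{chr(ord('A') + index)}"
        targets = " ".join([f"#{new_id}"] + [f"{source}#{old}" for old in old_ids])
        variants.append((new_id, key, targets))
    return variants


for new_id, key, targets in merge_variants(ET.fromstring(SAMPLE)):
    print(f'<orth xml:id="{new_id}">{key}</orth>')
    print(f'<link targets="{targets}"/>')
```

A single pass that groups on a string key avoids comparing every orth against every other one, which is where the inefficiency tends to come from.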
Suggestions?
Thanks,
-James
--
James Cummings, Cummings dot James at GMail dot com