Re: mixed content, text-based abbreviations to xml
Hi James,
You can find below 3 transformation steps that get you to the final result. If you like, you can combine them into one stylesheet using a micro-pipelining technique: put the templates of each step in a different mode, store each step's result in a variable, and apply the templates of the next mode to the variable from the preceding step.

The first step marks the content in parentheses with ex:

step1.xsl

    <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
      <!-- Identity copy for everything except text nodes -->
      <xsl:template match="* | @* | comment() | processing-instruction()">
        <xsl:copy>
          <xsl:apply-templates select="node() | @*"/>
        </xsl:copy>
      </xsl:template>
      <!-- Wrap each parenthesized run in ex and drop the parentheses -->
      <xsl:template match="text()">
        <xsl:analyze-string select="." regex="\(.+?\)">
          <xsl:matching-substring>
            <ex><xsl:value-of select="translate(., '()', '')"/></ex>
          </xsl:matching-substring>
          <xsl:non-matching-substring>
            <xsl:value-of select="."/>
          </xsl:non-matching-substring>
        </xsl:analyze-string>
      </xsl:template>
    </xsl:stylesheet>

giving as result

    <?xml version="1.0" encoding="UTF-8"?><p>
    <lb n="1"/>In nomine Domini amen. Ne error obliuionis <supplied>geſtis</supplied> ſub tempore verſantibus pariat detrimentu<ex>m</ex>.
    <lb n="2"/>Conuenit, ut actus h<supplied>om</supplied>inu<ex>m</ex> l<ex>itte</ex>r<supplied>ar</supplied><ex>um</ex> et teſtium fidedignorum <seg>annotac<ex>i</ex>on<ex>e</ex></seg> ad poſteritatis noticiam <foo>deducantur <seg>aut int<ex>er</ex>dum</seg> ob</foo> scripture vetustatem renovent<ex>ur</ex>. Ad perpetuam proinde ...
    </p>

The second step marks with fragment the part of a word that sits directly before or after ex, or directly before supplied:

step2.xsl

    <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
      <!-- Characters that end a word: space, comma, full stop, newline.
           The newline is written as a character reference because a literal
           line break in an attribute value is normalized to a space. -->
      <xsl:variable name="marks" select="' ,.&#10;'"/>
      <xsl:template match="node() | @*">
        <xsl:copy>
          <xsl:apply-templates select="node() | @*"/>
        </xsl:copy>
      </xsl:template>
      <!-- Text that runs straight into a following ex or supplied:
           wrap its last word in fragment -->
      <xsl:template match="text()[following-sibling::*[1][self::ex or self::supplied] and not(translate(substring(., string-length(.)), $marks, '')='')]">
        <xsl:variable name="words" select="tokenize(., '\s')"/>
        <xsl:value-of select="substring(., 1, string-length(.)-string-length($words[last()]))"/>
        <fragment><xsl:value-of select="$words[last()]"/></fragment>
      </xsl:template>
      <!-- Text that continues straight after an ex:
           wrap its first word in fragment -->
      <xsl:template match="text()[preceding-sibling::*[1][self::ex] and not(translate(substring(.,1,1), $marks, '')='')]">
        <xsl:variable name="words" select="tokenize(., '\s')"/>
        <fragment><xsl:value-of select="$words[1]"/></fragment>
        <xsl:value-of select="substring(., string-length($words[1]) + 1)"/>
      </xsl:template>
    </xsl:stylesheet>

giving as result

    <?xml version="1.0" encoding="UTF-8"?><p>
    <lb n="1"/>In nomine Domini amen. Ne error obliuionis <supplied>geſtis</supplied> ſub tempore verſantibus pariat <fragment>detrimentu</fragment><ex>m</ex>.
    <lb n="2"/>Conuenit, ut actus <fragment>h</fragment><supplied>om</supplied><fragment>inu</fragment><ex>m</ex> <fragment>l</fragment><ex>itte</ex><fragment>r</fragment><supplied>ar</supplied><ex>um</ex> et teſtium fidedignorum <seg><fragment>annotac</fragment><ex>i</ex><fragment>on</fragment><ex>e</ex></seg> ad poſteritatis noticiam <foo>deducantur <seg>aut <fragment>int</fragment><ex>er</ex><fragment>dum</fragment></seg> ob</foo> scripture vetustatem <fragment>renovent</fragment><ex>ur</ex>. Ad perpetuam proinde ...
    </p>

The final step groups the adjacent fragment, supplied and ex nodes and outputs the choice:

step3.xsl

    <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
      <xsl:template match="p|seg">
        <xsl:call-template name="process"/>
      </xsl:template>
      <!-- Group runs of adjacent fragment/supplied/ex nodes; a run that
           contains at least one ex becomes a choice -->
      <xsl:template name="process">
        <xsl:copy>
          <xsl:for-each-group select="node()" group-adjacent="name() = ('fragment','supplied','ex')">
            <xsl:choose>
              <xsl:when test="current-grouping-key() and current-group()/name() = 'ex'">
                <choice>
                  <xsl:if test="current-group()/name() = 'supplied'">
                    <orig><xsl:apply-templates select="current-group()" mode="orig"/></orig>
                  </xsl:if>
                  <abbr><xsl:apply-templates select="current-group()" mode="abbr"/></abbr>
                  <expan><xsl:apply-templates select="current-group()" mode="expan"/></expan>
                </choice>
              </xsl:when>
              <xsl:otherwise>
                <xsl:apply-templates select="current-group()" mode="text"/>
              </xsl:otherwise>
            </xsl:choose>
          </xsl:for-each-group>
        </xsl:copy>
      </xsl:template>
      <!-- mode text: everything outside a choice -->
      <xsl:template match="seg" mode="text">
        <xsl:call-template name="process"/>
      </xsl:template>
      <xsl:template match="fragment" mode="text">
        <xsl:value-of select="."/>
      </xsl:template>
      <xsl:template match="node() | @*" mode="text">
        <xsl:copy>
          <xsl:apply-templates select="node() | @*" mode="text"/>
        </xsl:copy>
      </xsl:template>
      <!-- mode orig: abbreviation marks and damage, nothing supplied -->
      <xsl:template match="ex" mode="orig">
        <am/>
      </xsl:template>
      <xsl:template match="fragment" mode="orig">
        <xsl:value-of select="."/>
      </xsl:template>
      <xsl:template match="supplied" mode="orig">
        <damage/>
      </xsl:template>
      <!-- mode abbr: abbreviation marks, supplied text kept -->
      <xsl:template match="ex" mode="abbr">
        <am/>
      </xsl:template>
      <xsl:template match="fragment" mode="abbr">
        <xsl:value-of select="."/>
      </xsl:template>
      <xsl:template match="supplied" mode="abbr">
        <xsl:copy-of select="."/>
      </xsl:template>
      <!-- mode expan: full expansion, supplied and ex both kept -->
      <xsl:template match="fragment" mode="expan">
        <xsl:value-of select="."/>
      </xsl:template>
      <xsl:template match="supplied|ex" mode="expan">
        <xsl:copy-of select="."/>
      </xsl:template>
    </xsl:stylesheet>

giving the result you expect

    <?xml version="1.0" encoding="UTF-8"?><p>
    <lb n="1"/>In nomine Domini amen. Ne error obliuionis <supplied>geſtis</supplied> ſub tempore verſantibus pariat <choice><abbr>detrimentu<am/></abbr><expan>detrimentu<ex>m</ex></expan></choice>.
    <lb n="2"/>Conuenit, ut actus <choice><orig>h<damage/>inu<am/></orig><abbr>h<supplied>om</supplied>inu<am/></abbr><expan>h<supplied>om</supplied>inu<ex>m</ex></expan></choice> <choice><orig>l<am/>r<damage/><am/></orig><abbr>l<am/>r<supplied>ar</supplied><am/></abbr><expan>l<ex>itte</ex>r<supplied>ar</supplied><ex>um</ex></expan></choice> et teſtium fidedignorum <seg><choice><abbr>annotac<am/>on<am/></abbr><expan>annotac<ex>i</ex>on<ex>e</ex></expan></choice></seg> ad poſteritatis noticiam <foo>deducantur <seg>aut <choice><abbr>int<am/>dum</abbr><expan>int<ex>er</ex>dum</expan></choice></seg> ob</foo> scripture vetustatem <choice><abbr>renovent<am/></abbr><expan>renovent<ex>ur</ex></expan></choice>. Ad perpetuam proinde ...
    </p>

Best Regards,
George
--
George Cristian Bina
<oXygen/> XML Editor, Schema Editor and XSLT Editor/Debugger
http://www.oxygenxml.com

James Cummings wrote:
[resending after bounce message...because the mailing list doesn't like google app's different X-MAIL-FROM header...fingers crossed it is right now.]
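For reference, the micro-pipelining idea mentioned at the top of the message could look roughly like the skeleton below. This is only a sketch: the mode names step1, step2 and step3 and the variable names are made up for illustration, and the identity template is a placeholder for the real templates of each step.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">

  <!-- Entry point: run each step on the result of the previous one.
       A variable with content and no "as" attribute holds a temporary
       tree, so $step1/node() selects the top-level nodes of that result. -->
  <xsl:template match="/">
    <xsl:variable name="step1">
      <xsl:apply-templates select="node()" mode="step1"/>
    </xsl:variable>
    <xsl:variable name="step2">
      <xsl:apply-templates select="$step1/node()" mode="step2"/>
    </xsl:variable>
    <xsl:apply-templates select="$step2/node()" mode="step3"/>
  </xsl:template>

  <!-- Placeholder (assumption): identity copy shared by the three
       hypothetical modes. Replace it with the templates of step1.xsl,
       step2.xsl and step3.xsl, each moved into its own mode. -->
  <xsl:template match="node() | @*" mode="step1 step2 step3">
    <xsl:copy>
      <xsl:apply-templates select="node() | @*" mode="#current"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

When moving the real templates in, every xsl:apply-templates that has no mode attribute would need mode="#current" (or the explicit step mode) so processing stays inside its step. The modes that step3.xsl already uses (text, orig, abbr, expan) can stay as they are; only its unnamed-mode template for p|seg would move to the third mode.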