Re: Troubleshooting XSLT replace()
Hi,

In general this problem is difficult because the requirement defies the XML data model. It specifies a set of changes to make to a run of text that does not exist as such in the tree that XPath sees. That run of text can be inferred by processing the mixed content (as string() will do, or indeed as asking for the element's string value will do), but it is not there to be operated on. If you operate on your "view" of it, you will typically find it hard to work with any inline markup, since by definition working on the string wipes element structure away.

Usually the better part of valor is to accept that partial solutions are enough. For example, if you can assume that no token to be processed (such as "Undated") will ever be split by markup, then a solution like Graydon's (operating on each of the text nodes discretely, not the whole string together) is the best course.

When content is potentially split by markup (if you are so fortunate as to have such markup, as some applications of XML do) then things are not so easy, due to the XPath data model. XSLT is designed to work the other way around.

This suggests to me another approach to the problem, something like this:

    <xsl:template match="unittitle">
      <xsl:variable name="with-text-as-elements" as="element()">
        <xsl:apply-templates select="." mode="text-as-elements"/>
      </xsl:variable>
      <!-- Process the children of the temporary copy; applying templates
           to the copy itself would match this template again. -->
      <xsl:copy>
        <xsl:copy-of select="@*"/>
        <xsl:apply-templates select="$with-text-as-elements/node()"/>
      </xsl:copy>
    </xsl:template>

Mode 'text-as-elements' generates a temporary tree in which all text nodes are transformed into element-based representations, sequences of 't' ('token') elements, something like this:

    <unittitle>
      <n:t str="Here"/><n:t str="is"/><n:t str="my"/><n:t str="title"/>
    </unittitle>

This is created by a near-identity transformation that operates on the text node descendants. The model could be extended to handle punctuation etc. (Maybe you want to represent them with their own elements.) Or you can plan to work around any punctuation with your regular expressions.
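The near-identity transformation for mode 'text-as-elements' might be sketched roughly as follows. This is only a sketch under assumptions, not a tested implementation: it assumes XSLT 2.0, splits tokens on whitespace only (punctuation stays attached to its word), and binds the n prefix to a made-up namespace URI, since no URI is given here.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:n="http://example.org/ns/tokens">
  <!-- The namespace URI bound to n: above is hypothetical; use your own. -->

  <!-- Near-identity: in this mode, copy every node and attribute as is. -->
  <xsl:template match="@* | node()" mode="text-as-elements">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()" mode="text-as-elements"/>
    </xsl:copy>
  </xsl:template>

  <!-- Exception: replace each text node with one n:t element per
       whitespace-delimited token. The explicit priority is needed because
       text() and node() otherwise have the same default priority. -->
  <xsl:template match="text()" mode="text-as-elements" priority="1">
    <xsl:for-each select="tokenize(normalize-space(.), ' ')">
      <n:t str="{.}"/>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

Whitespace-only text nodes produce no tokens in this sketch, and all original spacing is discarded. A word that is split by inline markup still comes out as separate tokens on either side of the element boundary.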
This temporary tree could then be processed to do whatever you need to with the text (or rather, token) content. Templates to match these elements should be super-easy to write, test and extend with new substitution and filtering rules. By default:

    <xsl:template match="n:t">
      <xsl:apply-templates select="@str"/>
    </xsl:template>

Then

    <xsl:template match="n:t/@str[matches(., 'Undated', 'i')]">undated</xsl:template>

etc. (This is where the regular expressions come in.)

This works well as long as your substitution rules are confined to working with single words or tokens. Matching and processing sequences of them is possible but harder.

It also assumes that the result need not have whitespace fidelity to the source. If you do need that, add yet another token type for whitespace. (You will also need logic to insert whatever whitespace you want into the result when you serialize this back into text.)

It involves more overhead than doing it directly in templates a la Graydon. So it might be worth doing only if you have to do a lot of this.

Cheers, Wendell

Wendell Piez | http://www.wendellpiez.com
XML | XSLT | electronic publishing
Eat Your Vegetables
_____oo_________o_o___ooooo____ooooooo_^

On Tue, Dec 3, 2013 at 4:59 PM, Graydon <graydon@xxxxxxxxx> wrote:
> On Tue, Dec 03, 2013 at 04:48:32PM -0500, Nathan Tallman scripsit:
>> Thank you, Graydon. I am cleaning up a huge stack of XMLs;
>> unfortunately I cannot use lower() because there may be other text in
>> <unitdate> that needs to remain capitalized.
>
> Well, bother. replace() it is, then.
>
>> > <xsl:template match="text()[ancestor::unittitle]">
>
> If there's lots, you might want
>
> <xsl:template match="text()[ancestor::unittitle][normalize-space()]">
>
> instead; the optimizer will _probably_ figure out that you're only
> interested in text nodes with some non-whitespace contents, but it
> rarely hurts to provide a hint.
>
> -- Graydon