Re: Normalizing string containing entities
> Consider, for example, the following:
>
> <para>Some text <em>some other text</em> remaining text</para>

<snip/>

> The answer I found in several books is that we should not have elements
> mixing CDATA and subelements. If we apply this rule, it is impossible to
> represent the real structure of text.

Not entirely true. There's nothing preventing you from marking up the plain text as "plain" the same way you mark up the emphasized text as "em". For example, do this instead:

  <para>
    <plain>Some text </plain>
    <em>some other text</em>
    <plain> remaining text</plain>
  </para>

The drawbacks to this solution are that the structure can seem more complicated, and it will use more memory (I think; I'm no expert on that...). In addition, you often don't have control over the original structure, so you have to use somebody else's model, which mixes content. But if you can set up your structure this way, it makes XSLT processing much easier, as well as processing for other XML apps.

(This doesn't actually help with your initial problem, though, because there's still the matter of stripping the whitespace in the middle of text nodes...)

Imran

XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list
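If you are stuck with somebody else's mixed-content model, the wrapping described above could be done as a preprocessing pass. Here is a minimal sketch of that idea, written in Python with the standard library's ElementTree rather than in XSLT. The function name `wrap_plain_text` and the default `plain` tag are just illustrative choices, and it only handles one level of children. It is a rough sketch under those assumptions, not a tested tool.

```python
import copy
import xml.etree.ElementTree as ET


def wrap_plain_text(parent, tag="plain"):
    """Return a copy of `parent` in which every bare text run that is a
    direct child of `parent` is wrapped in its own <tag> element.

    Hypothetical helper for illustration; nested mixed content inside the
    child elements is left untouched.
    """
    result = ET.Element(parent.tag, dict(parent.attrib))

    def add_text(text):
        # Wrap any non-empty run, whitespace included, so that spaces
        # between words and inline elements are not lost.
        if text:
            ET.SubElement(result, tag).text = text

    # Text before the first child lives in parent.text.
    add_text(parent.text)
    for child in parent:
        # Copy the child, but detach the text that follows it (its tail)
        # so that it can be wrapped separately.
        clone = copy.deepcopy(child)
        clone.tail = None
        result.append(clone)
        add_text(child.tail)
    return result


para = ET.fromstring(
    "<para>Some text <em>some other text</em> remaining text</para>"
)
wrapped = wrap_plain_text(para)
print(ET.tostring(wrapped, encoding="unicode"))
```

For the example paragraph this prints `<para><plain>Some text </plain><em>some other text</em><plain> remaining text</plain></para>`. Note that whitespace inside each text run is kept exactly as it was, so the whitespace-stripping problem is still there afterwards.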