broken text surrounding an entity I want to drop?

Subject: broken text surrounding an entity I want to drop?
From: "Trevor Nicholls" <trevor@xxxxxxxxxxxxxxxxxx>
Date: Tue, 13 Sep 2005 14:50:55 +1200
Hi

My source XML file contains a myriad of <A ID="something"></A> elements which
for the most part I wish to drop. I am using an identity template plus the
following to do this:

-----
<!-- drop A tags which have no content -->
<xsl:template match="A">
  <xsl:if test="* or text() or string(.)">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:if>
</xsl:template>
-----
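
For anyone wanting to reproduce the behaviour, a minimal complete stylesheet might look like the sketch below. The identity template shown is the conventional XSLT 1.0 form and is an assumption, since the post does not quote the one actually in use; the A template is the one given above.

```xslt
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumed identity template: copies every attribute and node unchanged -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- From the post: copy an A element only if it has content.
       match="A" has a higher default priority than node(), so it
       overrides the identity template for A elements. -->
  <xsl:template match="A">
    <xsl:if test="* or text() or string(.)">
      <xsl:copy>
        <xsl:apply-templates select="@*|node()"/>
      </xsl:copy>
    </xsl:if>
  </xsl:template>

</xsl:stylesheet>
```

The empty A elements disappear, but the text nodes on either side of each one are copied through untouched, including any newline that happened to follow the tag.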

Unfortunately and unsurprisingly this is too naïve; it drops the <A> tags OK
but leaves me with broken text. If I look at the input source in a text
editor I can see that these tags are placed (arbitrarily, so far as I can
tell):
a. between newlines
b. after a newline and before text
c. after text and before a newline
d. between text strings

In cases a and b my output preserves the newlines and my later transforms
which normalize whitespace are fine.
In cases c and d the tag may be in the middle of a word or at either end of
a word. When at either end it is again not a problem as the output XML
contains a newline which is normalized acceptably. My difficulty comes with
this kind of input:

-----
<Body>
<A ID="something"></A>
To delete a n<A ID="something"></A>
ode:</Body>
-----

Note that there are several thousand node types which can potentially hold
this kind of text content, so writing a "Body" template to manage it isn't
really feasible.
Ideally this input should become
-----
<Body>To delete a node:</Body>
-----

But of course the transform I'm using isn't doing this; instead I'm getting:
-----
<Body>To delete a n
ode:</Body>
-----

My ham-fisted attempts to come up with templates which
(a) *reliably* identify this situation, and
(b) *don't* lead to my dropping huge screeds of wanted XML
are failing miserably. This doesn't seem like a terribly unusual
requirement, but I can't find an answer in the FAQ or my current set of
books. I've also read the (otherwise helpful) "controlling whitespace"
articles by Bob DuCharme on xml.com.

Could somebody please point me towards the right technique to use here?

Thanks
Trevor
