
Removing unwanted space

Subject: Removing unwanted space
From: "Charles O'Connor coconnor@xxxxxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Thu, 3 Jun 2021 23:54:18 -0000
OK, I've tried this a bunch of ways and failed (using XSLT 2.0).

The XML I'm working with has a bunch of unwanted whitespace in all sorts of
places, but looking just at paragraphs, it can have

<p>
	The rain in <bold>Spain</bold> <italic>is</italic> wet.
</p>

Or

<p>
	<bold>The rain in Spain is wet.</bold>
</p>

What I, and any semi-sane person, want is (TBH, it's the online XML editor
that wants it):

<p>The rain in <bold>Spain</bold> <italic>is</italic> wet.</p>

Or

<p><bold>The rain in Spain is wet.</bold></p>

In some places the XML actually starts this way, but it's not consistent at
all.

One track I went down dead-ended because I could not construct a regular
expression that was allowed to return an empty string. Me, I'd have been fine
with the occasional empty string, because it would have been an empty string
of things I did not want, if that makes any sense (and it does not).

Anyway, my attempt to get around that was to look at the first text node and
see if it started with spaces and if so to get rid of them:

    <xsl:template match="p/text()[1]">
        <xsl:choose>
            <xsl:when test="matches(.,'^\s+.*')">
                <xsl:analyze-string select="." regex="^\s+(\S?.*)">
                    <xsl:matching-substring>
                        <xsl:value-of select="regex-group(1)"/>
                    </xsl:matching-substring>
                </xsl:analyze-string>
            </xsl:when>
            <xsl:otherwise>
                <xsl:apply-templates/>
            </xsl:otherwise>
        </xsl:choose>
    </xsl:template>

And sure, I know that the first text node might in reality come after some
content in a child of <p>, but I was willing to cross that bridge when I
actually mangled some content. But for this template, I got a warning: "The
child axis starting at a text node will never select anything", which is
rather dreary.

Anyway, I'm a little loopy with banging my head against this, but one way or
another, I'm missing this. I'm only treating the text node as a string, not as
a node with children, but apparently I only think that and I am wrong, because
the machine is smarter than I am.
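
For what it's worth, a likely reading of that warning: <xsl:apply-templates/>
with no select attribute processes the children of the context node, and a
text node has none, so the xsl:otherwise branch can never output anything. A
minimal sketch of the same template without that branch, assuming the only
goal is to drop leading whitespace from the first text child of <p> (untested
against the real data):

    <xsl:template match="p/text()[1]">
        <!-- replace() returns the string unchanged when nothing matches,
             so no xsl:choose is needed. -->
        <xsl:value-of select="replace(., '^\s+', '')"/>
    </xsl:template>

Since '^\s+' cannot match a zero-length string, this should also sidestep the
empty-match restriction on regular expressions.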

Any help for how to get rid of the space at the beginning and end of
paragraphs without getting rid of the space between elements within the
paragraph would be appreciated.
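
One way this might be approached, sketched under assumptions: an identity
transform plus a single template for the text nodes inside <p>. It trims
leading whitespace only from the first text node descendant of the paragraph
and trailing whitespace only from the last one, so the space between inline
elements is kept. The variable names are illustrative, and the sketch assumes
the unwanted whitespace sits directly at the paragraph edges; a case such as
<p> <bold> text</bold></p> would still keep the space inside <bold>.

    <xsl:stylesheet version="2.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

        <!-- Identity template: copy everything not handled below. -->
        <xsl:template match="@*|node()">
            <xsl:copy>
                <xsl:apply-templates select="@*|node()"/>
            </xsl:copy>
        </xsl:template>

        <!-- Any text node inside a paragraph, at any depth. -->
        <xsl:template match="p//text()">
            <xsl:variable name="para" select="ancestor::p[1]"/>
            <!-- Is this the first / last text node of the paragraph? -->
            <xsl:variable name="first" select=". is ($para//text())[1]"/>
            <xsl:variable name="last" select=". is ($para//text())[last()]"/>
            <!-- Trim the leading edge only on the first text node. -->
            <xsl:variable name="trimmed"
                select="if ($first) then replace(., '^\s+', '') else string(.)"/>
            <!-- Trim the trailing edge only on the last text node. -->
            <xsl:value-of
                select="if ($last) then replace($trimmed, '\s+$', '') else $trimmed"/>
        </xsl:template>

    </xsl:stylesheet>

Applied to the two <p> examples above, this should yield the single-line
forms, provided the parser has not already stripped whitespace-only text
nodes in a way that changes which node counts as first or last.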

Thanks!
Charles


Charles O'Connor l Business Systems Analyst
Pronouns: He/Him
Aries Systems Corporation l www.ariessys.com
50 High Street, Suite 21 l North Andover, MA l 01845 l USA  


Main: +1 (978) 975-7570
Cell: +1 (802) 585-5655

       
