Re: Correcting misplaced spaces in XML documents

Subject: Re: Correcting misplaced spaces in XML documents
From: "Trevor Nicholls trevor@xxxxxxxxxxxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Sun, 26 Mar 2023 13:39:01 -0000
Thank you Gerrit, that looks like a very useful project which I will have a
close look at.
I would not have thought of the complication with footnotes without your
comments, but that's something I could well encounter in our documents.

Thanks to others who made suggestions too.

(Syd) I can't be completely generic because there are elements where leading
spaces really are significant (e.g. code snippets). But I'll look at your
methods as well.

cheers
T

-----Original Message-----
From: Imsieke, Gerrit, le-tex gerrit.imsieke@xxxxxxxxx
<xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Sent: Sunday, 26 March 2023 23:21
To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
Subject: Re:  Correcting misplaced spaces in XML documents

Hi Trevor,

emphasis-normalize-space [1] can deal with whitespace within nested elements
and with embedded footnotes whose accidental leading or trailing whitespace
shouldn't be pulled out and put into the surrounding paragraph.

Gerrit

[1] https://github.com/gimsieke/emphasis-normalize-space

On 26.03.2023 03:33, Trevor Nicholls trevor@xxxxxxxxxxxxxxxxxx wrote:
> I suppose this falls into the category of data cleanup.
>
> In the very simple case I am importing documents which have content like
> this:
>
>      <para>Press the<keyname> Escape </keyname>key.</para>
>
> You'll notice that the adjacent spaces are wrapped in the keyname
> element when they should just be adjacent to it, not in it.
>
> This is a pathological case; usually the keyname is correct, but
> occasionally there is a leading or a trailing space, hardly ever both.
>
> I've written a simple stylesheet which corrects this situation,
> identifying leading and trailing whitespace, and outputting the
> appropriate breakdown:
>
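>    <xsl:template match="keyname">
>      <!-- The bodies of these three variables were garbled in the archive;
>           the XSLT 2.0 expressions below are assumed reconstructions. -->
>      <xsl:variable name="leading" select="replace(., '\S.*$', '', 's')"/>
>      <xsl:variable name="trailing" select="replace(., '^.*\S', '', 's')"/>
>      <xsl:variable name="content" select="replace(., '^\s+|\s+$', '')"/>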
>      <xsl:if test="$leading != ''"><xsl:value-of select="$leading"/></xsl:if>
>      <xsl:element name="keyname">
>        <xsl:apply-templates select="@*"/>
>        <xsl:value-of select="$content"/>
>      </xsl:element>
>      <xsl:if test="$trailing != ''"><xsl:value-of select="$trailing"/></xsl:if>
>    </xsl:template>
>
> This is all fine, and it's adequate for the job when the "greedy"
> elements only contain text, which is the case for keynames.
>
> However now I want to extend the stylesheet to correct some other
> cases where the content model of the element is not just simple text.
>
> For example:
>
>    <para>Select the<filename> <var>username</var>.profile </filename>file.</para>
>
> Although the cases I am looking at right now only have a content model
> of text or <var> elements, a more general solution would be welcome
> because other cases are going to turn up where elements are nested two
> or three levels deep.
>
> I've got myself neck deep into conditionals trying to extend my simple
> template to cope with this, and I'm sure there's a straightforward way
> of doing it that doesn't need several hundred lines of code.
>
> Can anyone point me to a cleaner way of doing it?
>
> cheers
>
> T
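
For the nested case described above, one possible approach (a minimal
sketch, not the method used by emphasis-normalize-space) is to look only at
the first and last descendant text nodes of each "greedy" element: move
their leading or trailing whitespace outside the element, and trim it inside
via a separate mode. The sketch assumes XSLT 2.0; the element list
(keyname, filename) and the mode name "trim" are illustrative assumptions.

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Identity transform: copy anything not handled below, in every mode. -->
  <xsl:template match="@* | node()" mode="#all">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()" mode="#current"/>
    </xsl:copy>
  </xsl:template>

  <!-- Assumed list of "greedy" inline elements; extend as needed. -->
  <xsl:template match="keyname | filename">
    <!-- First and last text nodes at any depth inside the element. -->
    <xsl:variable name="first" select="(.//text())[1]"/>
    <xsl:variable name="last" select="(.//text())[last()]"/>
    <!-- Leading whitespace: drop everything from the first non-space on. -->
    <xsl:value-of select="replace($first, '\S.*$', '', 's')"/>
    <xsl:copy>
      <xsl:apply-templates select="@*"/>
      <xsl:apply-templates select="node()" mode="trim">
        <xsl:with-param name="first" select="$first" tunnel="yes"/>
        <xsl:with-param name="last" select="$last" tunnel="yes"/>
      </xsl:apply-templates>
    </xsl:copy>
    <!-- Trailing whitespace: drop everything up to the last non-space.
         Skipped when a single all-whitespace text node was already
         emitted as leading whitespace, to avoid doubling it. -->
    <xsl:if test="not($last is $first) or normalize-space($last)">
      <xsl:value-of select="replace($last, '^.*\S', '', 's')"/>
    </xsl:if>
  </xsl:template>

  <!-- Inside a greedy element, trim only the first and last text nodes. -->
  <xsl:template match="text()" mode="trim" priority="1">
    <xsl:param name="first" tunnel="yes"/>
    <xsl:param name="last" tunnel="yes"/>
    <xsl:variable name="t" select="if (. is $first)
                                   then replace(., '^\s+', '')
                                   else string(.)"/>
    <xsl:value-of select="if (. is $last)
                          then replace($t, '\s+$', '')
                          else $t"/>
  </xsl:template>

</xsl:stylesheet>
```

Applied to the filename example above, this should yield
<para>Select the <filename><var>username</var>.profile</filename> file.</para>
Known limitations of the sketch: it only inspects the outermost first and
last text nodes (whitespace in a second text node is left alone), it does
not treat a greedy element nested inside another greedy element separately,
and it would also pull whitespace out of an embedded footnote, which is the
complication Gerrit mentions. Elements where leading spaces are significant,
such as code snippets, should simply be left out of the match pattern.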
