
Subject: Re: Is it possible to use replace with an variable for entities?
From: "Wendell Piez wapiez@xxxxxxxxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Thu, 7 Jul 2022 11:13:51 -0000
Hi,

The serialization option for US-ASCII works tolerably well for this.

<xsl:output encoding="us-ascii"/>

As Mike describes, it essentially forces all characters not in US ASCII to
be represented as numeric character references.
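For illustration, here is a minimal, self-contained sketch of that setting. The `sample` element and the two characters in it are made up for the example; any non-ASCII text in the result tree would be treated the same way. Whether the serializer writes decimal or hexadecimal references is its own choice.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- US-ASCII cannot encode U+1F00 or U+00B7 directly, so the
       serializer has to write them as numeric character references. -->
  <xsl:output method="xml" encoding="us-ascii"/>

  <!-- Hypothetical template: emits one element whose text node holds
       two non-ASCII characters (written here as references, but they
       are ordinary characters once the stylesheet has been parsed). -->
  <xsl:template match="/">
    <sample>&#x1F00;&#xB7;</sample>
  </xsl:template>

</xsl:stylesheet>
```

Run against any input document, the output should contain character references inside `sample` rather than the raw characters.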

Cheers, Wendell



On Thu, Jul 7, 2022 at 3:12 AM Michael Kay mike@xxxxxxxxxxxx <
xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:

> There are two stages to this: (a) replacing \uHHHH with the unicode
> character that it represents, and (b) replacing this unicode character with
> an XML entity reference. Logically, the first step is a transformation,
> while the second step is part of serialization (since entity references
> exist only in serialized XML, and not in the XDM tree representation).
>
> The "cheap and dirty" way is probably to use disable-output-escaping:
>
> <xsl:value-of select="replace($in, '\\u(\d\d\d\d)', '&amp;#x$1;')"
> disable-output-escaping="yes"/>
>
> (Note that this won't work for surrogate pairs, since \uXXXX can represent
> half of a surrogate pair, and `&#xXXXX;` can't).
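
As a sketch of how the first stage could be done without disable-output-escaping, the following uses xsl:analyze-string and codepoints-to-string() from standard XSLT 2.0. It assumes the digits after \u are a decimal code point, as in the \u7936 example from the original question. The function name f:decode-rtf and its namespace are hypothetical, and the RTF fallback character that may follow the \u control word is not handled.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:f="urn:example:rtf"
    exclude-result-prefixes="xs f">

  <!-- Stage (b): let the serializer produce the character references. -->
  <xsl:output method="xml" encoding="us-ascii"/>

  <!-- Stage (a): f:decode-rtf is a hypothetical helper, not a built-in.
       It replaces each \uN (N decimal, by assumption) with the
       character whose code point is N. -->
  <xsl:function name="f:decode-rtf" as="xs:string">
    <xsl:param name="in" as="xs:string"/>
    <xsl:variable name="parts" as="xs:string*">
      <xsl:analyze-string select="$in" regex="\\u(\d+)">
        <xsl:matching-substring>
          <!-- Raises an error if N is not a valid XML character,
               for example one half of a surrogate pair. -->
          <xsl:sequence
              select="codepoints-to-string(xs:integer(regex-group(1)))"/>
        </xsl:matching-substring>
        <xsl:non-matching-substring>
          <xsl:sequence select="."/>
        </xsl:non-matching-substring>
      </xsl:analyze-string>
    </xsl:variable>
    <xsl:sequence select="string-join($parts, '')"/>
  </xsl:function>

  <!-- Hypothetical usage with a literal test string. -->
  <xsl:template match="/">
    <out><xsl:value-of select="f:decode-rtf('\u7936 and \u183')"/></out>
  </xsl:template>

</xsl:stylesheet>
```

Because the tree then holds real characters, no escaping tricks are needed; the us-ascii encoding takes care of writing them as references.
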
>
> If you want to do the two stages separately, then
>
> (a) Saxon offers the function saxon:replace-with() which allows you to
> apply a user-supplied function to the matched substring - see
>
> https://www.saxonica.com/documentation11/index.html#!functions/saxon/replace-with
>
> (b) You can force characters to be serialized using entity references
> (technically, character references) by using an encoding (such as
> iso-8859-1) in which the characters cannot be represented any other way.
> Saxon also has an xsl:output option (saxon:character-representation) to
> force all non-ASCII characters to be represented as character references.
> Or if you want to be more specific, you can use a character map.
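
A character map along those lines might look like the sketch below. The map name "entities" and the two mapped characters are only examples. Note that the replacement strings are written to the output as-is, so a named entity such as &middot; is only usable if whatever reads the result has a declaration for it.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" use-character-maps="entities"/>

  <!-- Each listed character is replaced by the given string during
       serialization; the string is emitted without further escaping. -->
  <xsl:character-map name="entities">
    <xsl:output-character character="&#xB7;" string="&amp;middot;"/>
    <xsl:output-character character="&#x1F00;" string="&amp;#x1F00;"/>
  </xsl:character-map>

  <!-- Hypothetical template with the two mapped characters. -->
  <xsl:template match="/">
    <sample>&#x1F00;&#xB7;</sample>
  </xsl:template>

</xsl:stylesheet>
```

With this map the content of `sample` should be serialized as `&#x1F00;&middot;`, while characters not listed in the map are left to the normal serialization rules.
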
>
> Michael Kay
> Saxonica
>
> On 7 Jul 2022, at 06:36, Torsten Schassan schassan@xxxxxx <
> xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:
>
> Dear colleagues,
>
> I need to replace Unicode references (encoded in RTF) with entities via
> XSLT.
>
> My replace command would look like these for example:
>
> replace($value, '\\u7936', '&#7936;')
> replace($value, '\\u183 \\\^b7', '&#183;')
>
> Now I want to avoid having a separate (possibly nested) replace call for
> each character, and would like to use a variable like this:
>
> replace($value, '\\u(\d{4})', '&#$1;')
> replace($value, '\\u(\d{3}) \\\^[0-9a-z]{2}', '&#$1;')
>
> This, unfortunately, throws an error, as '&#$1;' is not a valid character
> reference.
>
> Additionally, my parser doesn't allow me to use map:keys($rtfEncodingMap).
>
> Is there a workaround or a solution I might have missed?
>
>
>
> Best,
> Torsten
> --
> Torsten Schassan - Abteilung Handschriften und Sondersammlungen / Digitale
> Editionen
> Herzog August Bibliothek, D-38299 Wolfenbuettel, Tel.: +49 5331 808-130
> Fax -165
> Handschriftendatenbank: https://diglib.hab.de/?db=mss
> XSL-List info and archive <http://www.mulberrytech.com/xsl/xsl-list>
> EasyUnsubscribe <http://lists.mulberrytech.com/unsub/xsl-list/293509> (by
> email)
>
>
>


--
...Wendell Piez... ...wendell -at- nist -dot- gov...
...wendellpiez.com... ...pellucidliterature.org... ...pausepress.org...
...github.com/wendellpiez... ...gitlab.coko.foundation/wendell...
