
Re: Switching off character entity resolution in XSL

Subject: Re: Switching off character entity resolution in XSL
From: Richard Light <richard@xxxxxxxxxxxxxxxxx>
Date: Tue, 3 Feb 2004 10:09:06 +0000
In message
<OFD1EA90EE.C86A23F7-ONCA256E2F.000FE8D4-CA256E2F.00118833@xxxxxxxxxx>,
AHynes@xxxxxxxxxx writes
>Hello All,
>
>Unlike what most people would use XSL for (i.e. conversion of XML to HTML
>or other output format), I have a requirement to transform from one XML
>structure to another (subsequent presentation rendering occurring way
>downstream). No big deal I guess, but the annoying thing here is that by
>the time an XML parser has done its job as per the XML specification, all
>those pesky character entities have been resolved (as defined in the DTD
>for the source document) and the output contains square brackets.

I've done this for entities which map to character references, rather
than to the SGML-style "SDATA" strings you quote below.  My strategy is
to live with the fact that the parser has carried out all the entity
mappings, and to use a "mappings" document containing entries like this:

<char>
<name>Delta</name>
<value>&#x0394;</value>
<unicode>0394</unicode>
<description>Delta       Dec:916 </description>
<mapping>[capital Delta]</mapping>
<!--U0394 /Delta capital Delta, Greek -->
</char>

to reverse the process on output.  (For your purposes, all you need is
the <name> and <value> elements - the other element types have different
uses.)

Essentially, when you come to output text(), iterate through it
character by character.  Have a convenience variable $normal-chars,
e.g.:

<xsl:variable name="normal-chars"
          select="concat('ABCDEFGHIJKLMNOPQRSTUVWXYZ',
                  'abcdefghijklmnopqrstuvwxyz',
                  '0123456789 ',
                  '!$%^*()-_+={}[];:@#~/?.,')"/>

so you can quickly test and output characters which can be output as
found.  For all others, look up the <char> with the appropriate <value>,
and output its <name>:

    <xsl:value-of select="concat('&amp;', $ch-name, ';')"
                  disable-output-escaping="yes"/>

Yes, it is slow and clumsy, and yes, it does use the deprecated
disable-output-escaping, but it does work ...
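
Pulling the pieces above together, a minimal XSLT 1.0 sketch of the whole approach might look like the following. It is an illustration under assumptions, not the stylesheet actually used: the file name mappings.xml, its <chars> root element, and the template name "unresolve" are all hypothetical; only the <char>, <name> and <value> elements, $normal-chars and $ch-name come from the description above. Characters that are neither "normal" nor found in the mappings are simply passed through with ordinary escaping.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumption: the <char> entries sit in mappings.xml under a
       hypothetical <chars> root element. -->
  <xsl:variable name="chars"
                select="document('mappings.xml')/chars/char"/>

  <!-- Characters that can be written out exactly as found. -->
  <xsl:variable name="normal-chars"
                select="concat('ABCDEFGHIJKLMNOPQRSTUVWXYZ',
                        'abcdefghijklmnopqrstuvwxyz',
                        '0123456789 ',
                        '!$%^*()-_+={}[];:@#~/?.,')"/>

  <!-- Identity template: copy everything that is not a text node. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Send every text node through the character-by-character pass. -->
  <xsl:template match="text()">
    <xsl:call-template name="unresolve">
      <xsl:with-param name="text" select="string(.)"/>
    </xsl:call-template>
  </xsl:template>

  <!-- Emit the first character of $text, then recurse on the rest. -->
  <xsl:template name="unresolve">
    <xsl:param name="text"/>
    <xsl:if test="string-length($text) &gt; 0">
      <xsl:variable name="ch" select="substring($text, 1, 1)"/>
      <xsl:choose>
        <!-- Fast path: an ordinary character, output as found. -->
        <xsl:when test="contains($normal-chars, $ch)">
          <xsl:value-of select="$ch"/>
        </xsl:when>
        <xsl:otherwise>
          <!-- Look up the <char> whose <value> is this character. -->
          <xsl:variable name="ch-name"
                        select="string($chars[value = $ch][1]/name)"/>
          <xsl:choose>
            <xsl:when test="$ch-name != ''">
              <xsl:value-of select="concat('&amp;', $ch-name, ';')"
                            disable-output-escaping="yes"/>
            </xsl:when>
            <!-- No mapping found: output the character unchanged. -->
            <xsl:otherwise>
              <xsl:value-of select="$ch"/>
            </xsl:otherwise>
          </xsl:choose>
        </xsl:otherwise>
      </xsl:choose>
      <xsl:call-template name="unresolve">
        <xsl:with-param name="text" select="substring($text, 2)"/>
      </xsl:call-template>
    </xsl:if>
  </xsl:template>

</xsl:stylesheet>
```

Two caveats apply to a sketch like this. The recursion goes one level deep per character, so very long text nodes may exhaust the stack on processors that do not optimise tail calls. And disable-output-escaping is an optional feature that only takes effect when the XSLT processor serialises the result itself; the output also needs a DOCTYPE pointing at a DTD that declares the entities if anything downstream is to parse it.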

Richard Light

>Example:
>source document contains:     &bull;
>After transformation:         [bull  ]    (of course, the entity declared
>in the DTD is this, i.e. <!ENTITY bull "[bull  ]">)
>What I would like:            &bull;
>
>I really don't want to go messing with the DTD either, and I really don't
>think a parser would like there being unparsed entities within an entity
>declaration in a  DTD i.e. <!ENTITY bull &bull;> is illegal.
>
>I realise there is some way of dealing with this with character
>substitutions before or after using something like sed, but this isn't
>really a great solution, particularly across platforms. Is there any way of
>manipulating the output using XSL, or alternatively switching off entity
>resolution in the parser? I've played with custom entity resolvers with
>Java XML parsers (i.e. resolving URLs for example) but cannot see how this
>could be used for external character entities, and also realise there is
>some scope for writing a solution in something like JDOM - but what a pain!
>That defeats the whole purpose of XSL. I have gotten used to a pretty good
>compromise of using Saxon with the Xerces parser and the Norm Walsh entity
>resolver classes if that's of any help.
>
>Either there's a simple solution to this, it's something XML 2.0 (or
>whatever is on the horizon) might address (which is no help for me really),
>I'm on the wrong mailing list or I should just resort back to ("the good
>ol' days of" - yes, sarcasm) Omnimark which was really good at "unparsing"
>entities. I'm sure others experience similar problems so hopefully the
>first option is the right one (i.e. easy ?).
>
>Thanks very much,
>Alan Hynes.
>
> XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
>

-- 
Richard Light
SGML/XML and Museum Information Consultancy
richard@xxxxxxxxxxxxxxxxx

