
Re: Switching off character entity resolution in XSL

Subject: Re: Switching off character entity resolution in XSL
From: Wendell Piez <wapiez@xxxxxxxxxxxxxxxx>
Date: Wed, 04 Feb 2004 14:09:49 -0500
At 05:09 AM 2/3/2004, Richard wrote:
... My strategy is
to live with the fact that the parser has carried out all the entity
mappings, and to use a "mappings" document containing entries like this:

<char>
<name>Delta</name>
<value>&#x0394;</value>
<unicode>0394</unicode>
<description>Delta       Dec:916 </description>
<mapping>[capital Delta]</mapping>
<!--U0394 /Delta capital Delta, Greek -->
</char>
...

Yes, it is slow and clumsy, and yes, it does use the deprecated
disable-output-escaping, but it does work ...

In my view this is a perfectly reasonable approach, as long as one is clear on the dependencies it introduces -- by using XSLT to drive the serializer, one in effect requires that the result be written out to a file (using a processor that implements d-o-e, of course), but since that's built into the requirement to begin with, it's not a big deal. Accordingly, I don't consider it an abuse of d-o-e -- just an application of XSLT+serializer as string writer bound to XSLT's role as a transformer. (In fact when I've implemented this solution to the entity-writing problem, I've deliberately kept the d-o-e operations separate from transformation logic, pipelining two different stylesheets. This way the entity-writing routine is portable.)
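
To make the idea of a separate entity-writing pass concrete, here is a minimal sketch of that second stage. It is written in Python rather than as a d-o-e stylesheet, so it illustrates the lookup only and is not the implementation either of us used. The `<chars>` wrapper element, the `bull` entry, and the function names `load_table` and `write_entities` are assumptions made for the example; only the `<char>`, `<name>`, `<value>` and `<unicode>` elements come from the mappings entry quoted above.

```python
import xml.etree.ElementTree as ET

# Hypothetical mappings document. The <chars> wrapper and the "bull"
# entry are assumed; the <char> structure follows the quoted example.
MAPPINGS_XML = """<chars>
<char>
<name>Delta</name>
<value>&#x0394;</value>
<unicode>0394</unicode>
</char>
<char>
<name>bull</name>
<value>&#x2022;</value>
<unicode>2022</unicode>
</char>
</chars>"""


def load_table(xml_text):
    """Build a {code point: "&name;"} table from the mappings document."""
    root = ET.fromstring(xml_text)
    table = {}
    for char in root.iter("char"):
        name = char.findtext("name")
        code = char.findtext("unicode")  # hex digits, e.g. "0394"
        table[int(code, 16)] = "&%s;" % name
    return table


def write_entities(serialized, table):
    """Replace each mapped character in already-serialized output
    with its named entity reference."""
    return serialized.translate(table)
```

For instance, `write_entities("\u2022 item", load_table(MAPPINGS_XML))` gives `"&bull; item"`. As with the d-o-e version, the result is only useful to a consumer that has the entities declared, and the pass has to run on the written-out result, not on the tree.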


Also see Zarella Rendon and Tony Coates on this issue: http://www.xml.com/pub/a/2003/01/02/xmlchar.html

Also, Mike wrote:
I'm afraid the simple answer is the ugly one: just preprocess the entity
references with a text editor to read "$#$bull;" instead of "&bull;". No
point banging your head against the wall to find something more elegant,
it will only give you a headache.

This approach, wrapping your transformation in non-XSLT "entity escaping/un-escaping" routines, may perform better (faster tools), and has the virtue of architectural clarity. It does introduce other local dependencies, of course, but for this kind of a problem that's not really an issue, is it?
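
A rough sketch of such a wrapper, in Python, might look like the following. The `{{ent:name}}` placeholder is an assumption chosen for the example (any string that cannot occur in your documents and contains no markup characters would do), and the function names are hypothetical. The sketch leaves the five predefined entities and numeric character references alone, and it makes no attempt to skip CDATA sections, comments, or an internal DTD subset.

```python
import re

# Named entity references such as &bull; (numeric references like
# &#x0394; do not match, because "#" cannot start a name).
ENTITY_REF = re.compile(r"&([A-Za-z_][\w.\-]*);")

# Placeholder form used between the two passes (an assumed convention).
PLACEHOLDER = re.compile(r"\{\{ent:([A-Za-z_][\w.\-]*)\}\}")

# The parser handles these itself, so they must stay as they are.
PREDEFINED = {"amp", "lt", "gt", "apos", "quot"}


def escape_entities(text):
    """Run before parsing: hide named entity references from the parser."""
    def repl(match):
        name = match.group(1)
        if name in PREDEFINED:
            return match.group(0)
        return "{{ent:%s}}" % name
    return ENTITY_REF.sub(repl, text)


def unescape_entities(text):
    """Run after serialization: turn placeholders back into references."""
    return PLACEHOLDER.sub(r"&\1;", text)
```

The transformation runs between the two calls: `escape_entities` on the source text before it reaches the parser, `unescape_entities` on the serialized result. The placeholder passes through the stylesheet as ordinary character data, so the stylesheet itself needs no changes unless it inspects that text.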


Cheers,
Wendell


Richard Light

>Example:
>source document contains:     &bull;
>After transformation:         [bull  ]    (of course, the entity declared
>in the DTD is this, i.e. <!ENTITY bull "[bull  ]">)
>What I would like:            &bull;
>
>I really don't want to go messing with the DTD either, and I really don't
>think a parser would like there being unparsed entities within an entity
>declaration in a  DTD i.e. <!ENTITY bull &bull;> is illegal.
>
>I realise there is some way of dealing with this with character
>substitutions before or after using something like sed, but this isn't
>really a great solution, particularly across platforms. Is there any way of
>manipulating the output using XSL, or alternatively switching off entity
>resolution in the parser? I've played with custom entity resolvers with
>Java XML parsers (i.e. resolving URLs for example) but cannot see how this
>could be used for external character entities, and also realise there is
>some scope for writing a solution in something like JDOM - but what a pain!
>That defeats the whole purpose of XSL. I have gotten used to a pretty good
>compromise of using Saxon with the Xerces parser and the Norm Walsh entity
>resolver classes if that's of any help.
>
>Either there's a simple solution to this, it's something XML 2.0 (or
>whatever is on the horizon) might address (which is no help for me really),
>I'm on the wrong mailing list or I should just resort back to ("the good
>ol' days of" - yes, sarcasm) Omnimark which was really good at "unparsing"
>entities. I'm sure others experience similar problems so hopefully the
>first option is the right one (i.e. easy ?).
>
>Thanks very much,
>Alan Hynes.
>
> XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
>

--
Richard Light
SGML/XML and Museum Information Consultancy
richard@xxxxxxxxxxxxxxxxx



======================================================================
Wendell Piez                            mailto:wapiez@xxxxxxxxxxxxxxxx
Mulberry Technologies, Inc.                http://www.mulberrytech.com
17 West Jefferson Street                    Direct Phone: 301/315-9635
Suite 207                                          Phone: 301/315-9631
Rockville, MD  20850                                 Fax: 301/315-8285
----------------------------------------------------------------------
  Mulberry Technologies: A Consultancy Specializing in SGML and XML
======================================================================

