Re: Switching off character entity resolution in XSL
In message <OFD1EA90EE.C86A23F7-ONCA256E2F.000FE8D4-CA256E2F.00118833@xxxxxxxxxx>, AHynes@xxxxxxxxxx writes
>Hello All,
>
>Unlike what most people would use XSL for (i.e. conversion of XML to HTML
>or other output format), I have a requirement to transform from one XML
>structure to another (subsequent presentation rendering occurring way
>downstream). No big deal I guess, but the annoying thing here is that by
>the time an XML parser has done its job as per the XML specification, all
>those pesky character entities have been resolved (as defined in the DTD
>for the source document) and the output contains square brackets.

I've done this for entities which map to character references, rather than to the SGML-style "SDATA" strings you quote below. My strategy is to live with the fact that the parser has carried out all the entity mappings, and to use a "mappings" document containing entries like this:

  <char>
    <name>Delta</name>
    <value>Δ</value>
    <unicode>0394</unicode>
    <description>Delta Dec:916 </description>
    <mapping>[capital Delta]</mapping>
    <!--U0394 /Delta capital Delta, Greek -->
  </char>

to reverse the process on output. (For your purposes, all you need is the <name> and <value> elements - the other element types have different uses.)

Essentially, when you come to output text(), iterate through it character by character. Have a convenience variable $normal-chars, e.g.:

  <xsl:variable name="normal-chars"
    select="concat('ABCDEFGHIJKLMNOPQRSTUVWXYZ',
                   'abcdefghijklmnopqrstuvwxyz',
                   '0123456789 ',
                   '!$%^*()-_+={}[];:@#~/?.,')"/>

so you can quickly test for, and copy straight through, characters which can be output as found. For all others, look up the <char> with the appropriate <value>, and output its <name>:

  <xsl:value-of select="concat('&amp;', $ch-name, ';')"
    disable-output-escaping="yes"/>

Yes, it is slow and clumsy, and yes, it does use the deprecated disable-output-escaping, but it does work ...
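Putting those pieces together, the approach might look roughly like the stylesheet below. This is only a sketch under assumptions, not the stylesheet actually used: the file name mappings.xml, the template name emit-chars, and the use of an identity template for everything other than text are all made up for illustration, and the <char> entries are assumed to sit anywhere under a single root element.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical file name: holds the <char> entries described above. -->
  <xsl:variable name="mappings" select="document('mappings.xml')"/>

  <!-- Characters that can be copied to the output unchanged. -->
  <xsl:variable name="normal-chars"
      select="concat('ABCDEFGHIJKLMNOPQRSTUVWXYZ',
                     'abcdefghijklmnopqrstuvwxyz',
                     '0123456789 ',
                     '!$%^*()-_+={}[];:@#~/?.,')"/>

  <!-- Identity template: copy elements, attributes, comments etc. as-is. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Text nodes are routed through the character-by-character template.
       The explicit priority avoids a conflict with the node() pattern. -->
  <xsl:template match="text()" priority="1">
    <xsl:call-template name="emit-chars">
      <xsl:with-param name="text" select="."/>
    </xsl:call-template>
  </xsl:template>

  <!-- Emits the first character of $text, then recurses on the rest. -->
  <xsl:template name="emit-chars">
    <xsl:param name="text"/>
    <xsl:if test="string-length($text) &gt; 0">
      <xsl:variable name="ch" select="substring($text, 1, 1)"/>
      <!-- Entity name for this character, if the mappings list one. -->
      <xsl:variable name="ch-name"
          select="$mappings//char[value = $ch]/name"/>
      <xsl:choose>
        <!-- Fast path: ordinary character, output as found. -->
        <xsl:when test="contains($normal-chars, $ch)">
          <xsl:value-of select="$ch"/>
        </xsl:when>
        <!-- Mapped character: write an unescaped entity reference. -->
        <xsl:when test="$ch-name">
          <xsl:value-of select="concat('&amp;', $ch-name, ';')"
              disable-output-escaping="yes"/>
        </xsl:when>
        <!-- Anything else (newlines, unmapped characters): normal output. -->
        <xsl:otherwise>
          <xsl:value-of select="$ch"/>
        </xsl:otherwise>
      </xsl:choose>
      <xsl:call-template name="emit-chars">
        <xsl:with-param name="text" select="substring($text, 2)"/>
      </xsl:call-template>
    </xsl:if>
  </xsl:template>

</xsl:stylesheet>
```

Two caveats on the sketch: it recurses once per character, so very long text nodes may exhaust the stack in processors that do not optimise tail calls, and the result is only parseable downstream if it carries a DOCTYPE that declares the entity names being written out.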
Richard Light

>Example:
>source document contains: &bull;
>After transformation: [bull ] (of course, the entity declared
>in the DTD is this, i.e. <!ENTITY bull "[bull ]">)
>What I would like: &bull;
>
>I really don't want to go messing with the DTD either, and I really don't
>think a parser would like there being unparsed entities within an entity
>declaration in a DTD i.e. <!ENTITY bull &bull;> is illegal.
>
>I realise there is some way of dealing with this with character
>substitutions before or after using something like sed, but this isn't
>really a great solution, particularly across platforms. Is there any way of
>manipulating the output using XSL, or alternatively switching off entity
>resolution in the parser? I've played with custom entity resolvers with
>Java XML parsers (i.e. resolving URLs for example) but cannot see how this
>could be used for external character entities, and also realise there is
>some scope for writing a solution in something like JDOM - but what a pain!
>That defeats the whole purpose of XSL. I have gotten used to a pretty good
>compromise of using Saxon with the Xerces parser and the Norm Walsh entity
>resolver classes if that's of any help.
>
>Either there's a simple solution to this, it's something XML 2.0 (or
>whatever is on the horizon) might address (which is no help for me really),
>I'm on the wrong mailing list or I should just resort back to ("the good
>ol' days of" - yes, sarcasm) Omnimark which was really good at "unparsing"
>entities. I'm sure others experience similar problems so hopefully the
>first option is the right one (i.e. easy ?).
>
>Thanks very much,
>Alan Hynes.
>
> XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list
>

--
Richard Light
SGML/XML and Museum Information Consultancy
richard@xxxxxxxxxxxxxxxxx