
RE: multiple special characters in XML

Subject: RE: multiple special characters in XML
From: Mike Brown <mbrown@xxxxxxxxxxxxx>
Date: Thu, 2 Sep 1999 10:13:12 -0600
> I can add <!ENTITY eacute "é"> to the XML
> header. But then if they add something else cute tomorrow I 
> am stuck with bad XML again, until I add a new declaration. 
> I could add all possible declarations now and have huge XML
> documents (we store 1000s of them). Alternatively, I could
> look for all the referenced entities and construct an appropriate
> header for each document, which seems like a lot of overhead.

Well, the solution *must* be to declare those entities *somewhere* in your
documents. The efficiency tradeoff may not be as bad as you think. It's only
about 250 entity declarations with one character of replacement text per
entity.
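To get a feel for the size involved, here is a rough sketch that generates the
declarations and measures them. It assumes Python's standard html.entities
table is an acceptable stand-in for the HTML 4 entity set; the function name
entity_declarations is made up for illustration. The replacement text is
written as a numeric character reference so the output stays pure ASCII.

```python
from html.entities import name2codepoint

# XML predefines these five, so they are left out of the generated block.
XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}

def entity_declarations():
    """Return one <!ENTITY ...> declaration per HTML 4 named entity."""
    lines = []
    for name in sorted(name2codepoint):
        if name in XML_PREDEFINED:
            continue
        # e.g. <!ENTITY eacute "&#233;">
        lines.append('<!ENTITY %s "&#%d;">' % (name, name2codepoint[name]))
    return lines

decls = entity_declarations()
dtd_text = "\n".join(decls) + "\n"
print(len(decls), "declarations,", len(dtd_text.encode("ascii")), "bytes")
```

The printed byte count is the per-document cost of inlining every declaration,
or the one-time cost if the text is saved as a shared local DTD file.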

Another listmember sent me this document a few months ago:
http://www.skew.org/xml/misc_demos/Html40EntityDeclsForXml.dtd
(view source if it doesn't show up in your browser)

It seems to be derived from this section of the HTML 4.0 spec:
http://www.w3.org/TR/WD-html40-970708/sgml/entities.html

...and a more XML oriented variation can be found in the XHTML 1.0 spec:
http://www.w3.org/TR/1999/xhtml-modularization-19990406/module_defs.html#a_xhtml_character_entities

However, none of these resources is appropriate for referencing directly at
the URIs indicated -- my copy is unofficial and on a private web server, and
the others are embedded in HTML documents. Fetching the DTD from a remote
location on every parse would be inefficient anyway, and it would make
parsing depend on network connectivity.

If you don't want to explicitly declare the entities in every document, put
the Html40EntityDeclsForXml.dtd file somewhere locally, then reference it
from your XML and/or XSL documents like this (well, replace xsl:stylesheet
with the document element as needed, and make sure the path/URI to the DTD
is correct):

<!DOCTYPE xsl:stylesheet SYSTEM "Html40EntityDeclsForXml.dtd">

This has the same effect as putting the declarations in your documents, but
it introduces an inefficiency in that the parser needs to read an extra file
every time. Your other option would be to use a post-processor to replace
those named entities in the XML with their corresponding numeric character
references.
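
The post-processor option could look something like the sketch below. It is
only an illustration under some assumptions: the entity names are taken from
Python's standard html.entities table, the function name to_numeric_refs is
hypothetical, and the text is treated as a flat string, so references inside
CDATA sections or comments would be rewritten too.

```python
import re
from html.entities import name2codepoint

# XML predefines these five, so they can stay as named references.
XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}

# Matches a named reference such as &eacute; (numeric ones start with '#').
ENTITY_REF = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")

def to_numeric_refs(text):
    """Replace known named entity references with numeric references."""
    def replace(match):
        name = match.group(1)
        if name in XML_PREDEFINED or name not in name2codepoint:
            # Leave predefined and unknown names untouched.
            return match.group(0)
        return "&#%d;" % name2codepoint[name]
    return ENTITY_REF.sub(replace, text)

print(to_numeric_refs("<p>caf&eacute; &amp; cr&egrave;me</p>"))
```

Unknown names are deliberately passed through unchanged, so the parser will
still report them instead of the script silently guessing.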

-Mike


 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list

