

Subject: SEPM0004 ignores corrected XML when corrected by character-map
From: Abel Braaksma <abel.online@xxxxxxxxx>
Date: Fri, 16 Feb 2007 18:04:52 +0100
Hi List,

I have a spec question that I stumbled upon earlier today (GMT+1) while helping someone with his doctype problems. Whenever you use doctype-system and/or standalone (with a value other than 'omit') and your data model has a text node as a child of the root node, this raises an SEPM0004 error, as explained here: http://www.w3.org/TR/xslt-xquery-serialization/#ERRSEPM0004.

I found only one processor that actually raises the error (Saxon, of course); one other processor (AltovaXML) either outputs nothing or outputs the wrong XML without an error. According to the spec, a processor may also recover from the error by ignoring the doctype/standalone parameters and serializing as normal.
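
For comparison, here is a minimal sketch of the plain case, with no character map involved. The DTD name example.dtd and the text "stray text" are made-up placeholders, not taken from the original problem. As far as I read the spec, a serializer given this stylesheet should either signal SEPM0004 or recover by leaving out the document type declaration:

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- doctype-system is set; example.dtd is a hypothetical placeholder -->
  <xsl:output method="xml" doctype-system="example.dtd"/>
  <xsl:template match="/" name="main">
    <!-- a text node that becomes a child of the result document node:
         this is the condition SEPM0004 describes -->
    <xsl:text>stray text</xsl:text>
    <some-root/>
  </xsl:template>
</xsl:stylesheet>
```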

Of course, the reasons behind having this error are clear in most situations. But what should happen when the output, after character mapping, happens to become legal XHTML/XML? Example (though not a good use-case):

<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"
      use-character-maps="testmap"
      standalone="yes"/>
  <xsl:character-map name="testmap">
    <xsl:output-character
        character="&#xE050;"
        string="&#xA;&#xA;" />
  </xsl:character-map>
  <xsl:template match="/" name="main">
    <xsl:text>&#xE050;</xsl:text>
    <some-root />
  </xsl:template>
</xsl:stylesheet>



The result of this stylesheet, if SEPM0004 were not there, would be perfectly legal XML (this is the output from AltovaXML, which is non-conforming in this respect):


<?xml version="1.0" encoding="UTF-8" standalone="yes"?>

<some-root />


Most of the discussion on character-maps is about "the resulting serialized XML may be non-well-formed or non-validating". But this situation is the other way around: by applying a character-map, the resulting XML becomes correct. Shouldn't SEPM0004 be raised after this phase? Something like: "if the serialized result tree does not contain non-whitespace text nodes as children of the root, SEPM0004 should not be raised"?


I know, it is all a bit of a corner case. I just stumbled upon it today and I am just curious what your (the list) thoughts are on this matter, if any.

Cheers,
-- Abel Braaksma
  http://www.nuntia.nl
