SEPM0004 ignores corrected XML when corrected by character-map

Subject: SEPM0004 ignores corrected XML when corrected by character-map
From: Abel Braaksma <abel.online@xxxxxxxxx>
Date: Fri, 16 Feb 2007 18:04:52 +0100

Hi List,

I have a spec question that I stumbled upon while helping someone with his doctype problems earlier today (GMT+1). Whenever you use doctype-system and/or standalone (with a value other than 'omit') and your data model has a text node as a child of the root node, this raises an SEPM0004 error, as explained here: http://www.w3.org/TR/xslt-xquery-serialization/#ERRSEPM0004.

I found only one processor that actually raises the error (Saxon, of course). One other processor (AltovaXML) either outputs nothing or outputs the wrong XML without raising an error. According to the spec, a serializer may also recover from the error by ignoring the doctype/standalone request and serializing as normal.
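To make the rule concrete, here is a minimal Python sketch of the check as I read the spec text linked above. It is not taken from any processor: `Params`, `SerializationError`, `violates_sepm0004` and `check` are hypothetical names, and the children of the root node are modelled as plain (kind, value) pairs.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


class SerializationError(Exception):
    """Hypothetical exception standing in for err:SEPM0004."""


@dataclass
class Params:
    # Hypothetical container for the two serialization parameters involved.
    doctype_system: Optional[str] = None
    standalone: str = "omit"  # "yes", "no" or "omit"


# A child of the root node, e.g. ("text", "x") or ("element", "some-root").
Node = Tuple[str, str]


def violates_sepm0004(children: List[Node], params: Params) -> bool:
    """True if the parameters and the data model together trigger SEPM0004."""
    requested = params.doctype_system is not None or params.standalone != "omit"
    has_text = any(kind == "text" for kind, _ in children)
    element_count = sum(1 for kind, _ in children if kind == "element")
    return requested and (has_text or element_count > 1)


def check(children: List[Node], params: Params, recover: bool = False) -> Params:
    """Signal the error, or recover by dropping the doctype/standalone request."""
    if not violates_sepm0004(children, params):
        return params
    if recover:
        return Params()  # serialize as if neither parameter had been given
    raise SerializationError("SEPM0004")
```

In this sketch, Saxon's behaviour corresponds to `recover=False`, while a processor that silently drops the declaration would correspond to `recover=True`.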

Of course, the reasons behind having this error are clear in most situations. But what should happen when the output, after character mapping, happens to become legal XHTML/XML? Example (though not a good use-case):

<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" indent="yes"
      use-character-maps="testmap" standalone="yes"/>

  <xsl:character-map name="testmap">
    <xsl:output-character character="" string="

" />
  </xsl:character-map>

  <xsl:template match="/" name="main">
    <xsl:text></xsl:text>
    <some-root />
  </xsl:template>

</xsl:stylesheet>

The result of this stylesheet, if SEPM0004 were not there, would be perfectly legal XML (this is the output from AltovaXML, which is non-conforming in this respect):

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>

<some-root />

Most of the discussion on character maps is about the warning that "the resulting serialized XML may be non-well-formed or non-validating". This situation is the other way around: applying a character map makes the resulting XML correct. Shouldn't SEPM0004 be raised only after this phase? Something like: "if the serialized result tree does not contain non-whitespace text nodes as children of the root node, SEPM0004 should not be raised"?
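The difference between the two orderings can be sketched in a few lines of Python. This is only an illustration of the question, not of any processor or of the spec's actual phase model: the function names are hypothetical, the check is reduced to the text-node half of the rule, and `§` is a stand-in for the mapped character, which is not recoverable from the stylesheet as shown.

```python
from typing import Dict, List, Tuple

# A child of the root node, e.g. ("text", "§") or ("element", "some-root").
Node = Tuple[str, str]

XML_WHITESPACE = " \t\r\n"


def apply_character_map(text: str, char_map: Dict[str, str]) -> str:
    """Replace each mapped character with its string, as a character map would."""
    return "".join(char_map.get(ch, ch) for ch in text)


def error_before_mapping(children: List[Node], standalone: str) -> bool:
    """Current rule: any text node under the root triggers the error."""
    return standalone != "omit" and any(kind == "text" for kind, _ in children)


def error_after_mapping(children: List[Node], standalone: str,
                        char_map: Dict[str, str]) -> bool:
    """Proposed rule: only text that is still non-whitespace after mapping counts."""
    if standalone == "omit":
        return False
    for kind, value in children:
        if kind != "text":
            continue
        mapped = apply_character_map(value, char_map)
        if any(ch not in XML_WHITESPACE for ch in mapped):
            return True
    return False
```

With children `[("text", "§"), ("element", "some-root")]`, `standalone="yes"` and a map from `§` to two newlines, `error_before_mapping` reports the error while `error_after_mapping` does not, which is exactly the gap the question is about.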

I know, it is all a bit of a corner case. I just stumbled upon it today and I am just curious what your (the list) thoughts are on this matter, if any.

Cheers,
-- Abel Braaksma
  http://www.nuntia.nl
