
Re: Ascii end-of-file character output in an XSL file

Subject: Re: Ascii end-of-file character output in an XSL file
From: Kevin Rodgers <kevin.rodgers@xxxxxxx>
Date: Wed, 1 Jun 2005 13:44:07 -0600
[I just came across this draft response to an old thread and finally
finished it.]

David Carlisle writes:
>   Is there a way in XSLT to output an external unparsed entity (which
>   would contain the disallowed character)?
> 
> In standard XSLT1 you can only write one output file anyway so you
> couldn't write the actual entity (even if you could generate the
> character), You can write a reference to such an entity as it's just a
> normal attribute value, but being an attribute  value you can't put it
> anywhere near the end of file. 

I did not mean to ask how to generate the character within the
stylesheet (which isn't possible), but how to read it from an external
file and write it to the output.

Let's not constrain ourselves to XSLT 1, and let's assume there is a
UTF-8 file containing just a single character, ASCII SUB (Control-Z), aka
Unicode SUBSTITUTE (U+001A).  Since that's an ASCII character, its UTF-8
encoding is identical: the single byte 1A.
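
As a quick sanity check of that encoding claim, here is a minimal Python
sketch.  The helper name write_substitute_file is hypothetical; the file
name substitute.utf-8 is the one assumed in this message.

```python
# Sketch: confirm that U+001A (SUBSTITUTE) is one byte, 0x1A, in UTF-8.
SUB = "\u001a"

encoded = SUB.encode("utf-8")
print(encoded.hex())  # 1a


def write_substitute_file(path="substitute.utf-8"):
    """Hypothetical helper: create the one-character file discussed here."""
    with open(path, "wb") as f:
        f.write(encoded)
```

Calling write_substitute_file() would produce the one-byte file that the
entity declaration further down refers to.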

In XML, we can do something like:

<!NOTATION c0-controls SYSTEM "http://www.unicode.org/charts/PDF/U0000.pdf">
<!ENTITY substitute SYSTEM "substitute.utf-8" NDATA c0-controls>

<!ELEMENT sub EMPTY>
<!ATTLIST sub char ENTITY #FIXED "substitute">

<text>...<sub/>...</text>

A validating XML processor must inform the application of the system
identifier for the entity, and XSLT 2 supports that via the
unparsed-entity-uri function.  So we can do:

<xsl:template match="sub">
  <!-- The context node is the sub element itself, so the attribute is @char. -->
  <xsl:value-of select="unparsed-text(unparsed-entity-uri(@char))"/>
</xsl:template>

But!  The spec says:

[ERR XTDE1180] It is a non-recoverable dynamic error if a resource contains
characters that are not permitted XML characters.
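
To make "permitted XML characters" concrete, the following Python sketch
tests a character against the Char production of XML 1.0 (#x9, #xA, #xD,
#x20-#xD7FF, #xE000-#xFFFD, #x10000-#x10FFFF).  The function names are
hypothetical, and this only illustrates the check; it is not how any
particular XSLT processor implements it.

```python
def is_xml_char(ch: str) -> bool:
    """Return True if ch matches the XML 1.0 Char production."""
    cp = ord(ch)
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def first_disallowed(text: str):
    """Return (index, char) of the first non-XML character, or None."""
    for i, ch in enumerate(text):
        if not is_xml_char(ch):
            return i, ch
    return None


print(is_xml_char("\u001a"))             # False
print(first_disallowed("abc\u001adef"))  # (3, '\x1a')
```

Under this check, the content of substitute.utf-8 fails on its only
character, which is why reading it with unparsed-text() raises the error.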

What is the rationale for that restriction?  It means that an XSLT
processor can't do anything with the content of files like
substitute.utf-8 -- not to mention binary files such as images or
compressed text.  Even if a processor implemented an extension output
method for binary data (selected by giving the method attribute of
<xsl:output> a QName that is not an NCName), it would violate the spec
to read such data in the first place.

-- 
Kevin Rodgers
