
Re: doctype

Subject: Re: doctype
From: Colin Paul Adams <colin@xxxxxxxxxxxxxxxxxx>
Date: 30 Jun 2006 18:56:32 +0100
>>>>> "Marcus" == Marcus Streets <marcus@xxxxxxxxxxx> writes:

    Marcus> I am probably missing something trivial here.  I have an
    Marcus> XML document with the doctype:

    Marcus> <?xml version="1.0" encoding="UTF-8"?>
    Marcus> <?xml-stylesheet
    Marcus>     href="http://localhost/xslt/docbook/html/docbook.xsl"
    Marcus>     type="text/xsl"?>
    Marcus> <!DOCTYPE book SYSTEM "../../System/DTD/main.dtd"[
    Marcus> <!NOTATION XML SYSTEM "">
    Marcus> <!NOTATION MIF SYSTEM "">
    Marcus> <!NOTATION TIF SYSTEM "">
    Marcus> <!NOTATION AI SYSTEM "">
    Marcus> <!ENTITY % catalog PUBLIC
    Marcus>     "-//Siberlogic//ENTITIES V3.0.1//EN"
    Marcus>     "file:///C:/xml/fips/catalog.pen">
    Marcus> %catalog;
    Marcus> ]>

    Marcus> On which I am going to run an identity transformation
    Marcus> which is going to do some filtering.

    Marcus> The question is - can I keep the DOCTYPE as is?

    Marcus> There are various xsl:output options, but I seem to need
    Marcus> to know what the doctype is - and I really just want to
    Marcus> pass it through.

The first problem is reading the doctype: when the XML file is
parsed, the document type declaration is not passed on to the
stylesheet, so this information is lost.

If you are able to use XSLT 2.0, then you can recover the information
by reading the file a second time, using the unparsed-text() function.

You could then use the various XPath 2.0 string functions to extract
the DOCTYPE internal subset yourself.
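
To give a feel for the string handling involved, here is a minimal
sketch of the same extraction, written in Python rather than XPath so
that it can be tried in isolation. The function name is mine. It
assumes that the first "[" after "<!DOCTYPE" opens the internal subset
and that the first "]>" after it closes the subset, which will fail if,
say, a quoted literal inside the subset contains "]>". The two slicing
steps correspond roughly to substring-after() and substring-before().

```python
def internal_subset(raw):
    """Return the text between '[' and ']>' of the DOCTYPE, or None.

    Hypothetical helper: 'raw' is the unparsed text of the XML file.
    """
    start = raw.find("<!DOCTYPE")
    if start == -1:
        return None  # no document type declaration at all

    open_bracket = raw.find("[", start)
    first_gt = raw.find(">", start)
    # If the declaration closes before any '[' appears, it has no
    # internal subset (the '[' found belongs to later content).
    if open_bracket == -1 or first_gt < open_bracket:
        return None

    close = raw.find("]>", open_bracket)
    if close == -1:
        return None  # unterminated subset; give up rather than guess
    return raw[open_bracket + 1:close]
```

The result is the subset as one string, with whitespace untouched,
which is the form you would want if your processor lets you write the
subset back out as a single string.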

    Marcus> If I have to define it - how do I define the part within
    Marcus> the square brackets? I can see how to specify the rest
    Marcus> but not that.

There is no standard way of doing this. Some processors provide a
means to specify this information. Let's assume you are using XSLT
2.0, and that you have read the file in with unparsed-text() and
isolated the internal subset.

In the case of Saxon 8, there are processor-specific facilities to
specify the various components of the internal subset (look at the Saxon
documentation). In this case, you would have to completely parse the
internal subset, and then write each part out (I think).

In the case of gestalt, there is a processor-specific output method
that allows you to specify the entire internal subset as a
string. This would be ideal for your scenario (although I don't claim
to have had any wonderful foresight here - I was just writing the
output method as an example of how to do it).

In either case, you are fighting against the rationale of XSLT
processing: the information set of the XML document is the intended
input to a transformation. So it would be better to see if you can
avoid the whole scenario (maybe a non-XSLT approach is what you need).
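
One possible shape for such a non-XSLT approach is a small wrapper that
copies the DOCTYPE text verbatim from the source file and puts it back
on the serialized result of whatever filtering you do. The following
Python sketch is only an illustration: both function names are mine,
and it assumes the DOCTYPE is the first occurrence of "<!DOCTYPE" in
the file and that any internal subset ends at the first "]>". It also
assumes the filtered output still has the same root element, otherwise
the restored DOCTYPE will no longer match the document.

```python
def doctype_declaration(raw):
    """Return the complete <!DOCTYPE ...> text from raw XML, or ''."""
    start = raw.find("<!DOCTYPE")
    if start == -1:
        return ""
    first_gt = raw.find(">", start)
    bracket = raw.find("[", start)
    if bracket != -1 and bracket < first_gt:
        # Internal subset present: the declaration runs up to "]>".
        end = raw.find("]>", bracket)
        return raw[start:end + 2] if end != -1 else ""
    # No internal subset: the declaration ends at the first ">".
    return raw[start:first_gt + 1] if first_gt != -1 else ""


def reattach(doctype, output):
    """Put doctype back after the XML declaration of the output text."""
    if not doctype:
        return output
    if output.startswith("<?xml"):
        end = output.find("?>") + 2
        return output[:end] + "\n" + doctype + "\n" + output[end:].lstrip()
    return doctype + "\n" + output
```

You would call doctype_declaration() on the text of the original file,
run the filtering with no DOCTYPE on the output, and then pass the
result through reattach() before writing it to disk.
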
-- 
Colin Adams
Preston Lancashire
