
RE: Multiple pages of well formed HTML ---> XML

Subject: RE: Multiple pages of well formed HTML ---> XML
From: "Maxime Levesque" <maximel@xxxxxxxxxxxxxx>
Date: Tue, 3 Aug 1999 11:20:44 -0700
 As a *workaround* for the unimplemented document() function,
you could implement a 'composite parser' (or aggregating parser ...)
that calls back its org.xml.sax.DocumentHandler so that the handler
thinks it is handling a single document ...

 That will work if you are using a SAX-based XSL processor (e.g. XT);
if it's DOM-based, you can just 'glue' the trees together ...
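
 For the DOM-based case, gluing the trees together could look roughly
like the sketch below. It is not tied to any particular XSL processor:
it assumes the JAXP DocumentBuilder and DOM Level 2 importNode, and the
names DomGlue and glue, as well as the sample markup, are made up for
illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper: hangs the root element of each parsed page
// off a single fake root, as in the wp1 .. wpn picture.
final class DomGlue {

    static Document glue(String fakeRootName, InputSource... sources) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document combined = builder.newDocument();
            Element root = combined.createElement(fakeRootName);
            combined.appendChild(root);
            for (InputSource source : sources) {
                Document page = builder.parse(source);
                // importNode makes a deep copy owned by the combined document;
                // a node cannot simply be appended across documents.
                Node copy = combined.importNode(page.getDocumentElement(), true);
                root.appendChild(copy);
            }
            return combined;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("could not build the combined tree", e);
        }
    }

    public static void main(String[] args) {
        Document doc = glue("YourFakeRoot",
            new InputSource(new StringReader("<html><body>one</body></html>")),
            new InputSource(new StringReader("<html><body>two</body></html>")));
        // One child element per input page.
        System.out.println(doc.getDocumentElement().getChildNodes().getLength()); // prints 2
    }
}
```

 The combined Document can then be handed to the processor as if it
were one input; how that is done depends on the processor.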


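import org.xml.sax.AttributeList;
import org.xml.sax.DocumentHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributeListImpl;

public class CompositeParser
    implements org.xml.sax.DocumentHandler, org.xml.sax.Parser {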

    private InputSource[] inputSources_;

    private DocumentHandler documentHandler_;

    // must be set to "... your favorite parser ..." before parse() is called
    private org.xml.sax.Parser aRealParser_;

    public void setDocumentHandler(DocumentHandler handler) {
	 documentHandler_ = handler;
    }

    public CompositeParser(InputSource[] inputSources) {
	inputSources_ = inputSources;
    }

    public void parse(InputSource source) throws SAXException, java.io.IOException {

        // ignore source ...

        // fake the start of the 'aggregated' doc
        documentHandler_.startDocument();

        // fake a root start
        documentHandler_.startElement("YourFakeRoot", new AttributeListImpl());

        // receive the callbacks from all the inputSources_:
        for (int i = 0; i < inputSources_.length; i++) {
            aRealParser_.setDocumentHandler(this);
            aRealParser_.parse(inputSources_[i]);
        }

        // fake a root end
        documentHandler_.endElement("YourFakeRoot");

        // fake the end of the 'aggregated' doc
        documentHandler_.endDocument();
    }

    public void startElement(String name, AttributeList atts) throws SAXException {
        documentHandler_.startElement(name, atts);
    }

    public void endElement(String name) throws SAXException {
        documentHandler_.endElement(name);
    }

    public void characters(char[] ch, int start, int length) throws SAXException {
        documentHandler_.characters(ch, start, length);
    }

    public void ignorableWhitespace(char[] ch, int start, int length) throws SAXException {
        documentHandler_.ignorableWhitespace(ch, start, length);
    }

    public void processingInstruction(String target, String data) throws SAXException {
        documentHandler_.processingInstruction(target, data);
    }

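    public void startDocument() throws SAXException {} // silence these calls

    public void endDocument() throws SAXException {} // silence these calls

    // required by DocumentHandler; locators of the individual documents are not forwarded
    public void setDocumentLocator(org.xml.sax.Locator locator) {}

    // ... implement the other methods of org.xml.sax.Parser with empty methods ...
    // ... or delegate them to 'aRealParser_' ...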
}
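
 The class above targets the SAX 1 interfaces (Parser, DocumentHandler).
For readers on SAX 2, the same idea might be sketched on top of
org.xml.sax.helpers.XMLFilterImpl as below. This is a swapped-in
variant, not the code above: CompositeReader and elementTrace are
hypothetical names, and the tracing handler exists only to show the
event order a downstream processor would see.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical SAX 2 counterpart of CompositeParser.
final class CompositeReader extends XMLFilterImpl {

    private final String fakeRootName;
    private final InputSource[] sources;

    CompositeReader(XMLReader realReader, String fakeRootName, InputSource... sources) {
        super(realReader);
        this.fakeRootName = fakeRootName;
        this.sources = sources;
    }

    @Override
    public void parse(InputSource ignored) throws SAXException, IOException {
        // The super.* calls go straight to the registered ContentHandler.
        super.startDocument();
        super.startElement("", fakeRootName, fakeRootName, new AttributesImpl());
        for (InputSource source : sources) {
            super.parse(source); // the real reader calls back into this filter
        }
        super.endElement("", fakeRootName, fakeRootName);
        super.endDocument();
    }

    // Silence the per-document start/end events coming from the real reader.
    @Override public void startDocument() { }
    @Override public void endDocument() { }

    // Demo helper: records the events a downstream handler receives.
    static String elementTrace(String fakeRootName, String... pages) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            XMLReader real = factory.newSAXParser().getXMLReader();
            InputSource[] inputs = new InputSource[pages.length];
            for (int i = 0; i < pages.length; i++) {
                inputs[i] = new InputSource(new StringReader(pages[i]));
            }
            StringBuilder trace = new StringBuilder();
            CompositeReader reader = new CompositeReader(real, fakeRootName, inputs);
            reader.setContentHandler(new DefaultHandler() {
                @Override public void startDocument() { trace.append("[doc]"); }
                @Override public void startElement(String uri, String local,
                                                   String qName, Attributes atts) {
                    trace.append('<').append(local).append('>');
                }
                @Override public void endElement(String uri, String local, String qName) {
                    trace.append("</").append(local).append('>');
                }
            });
            reader.parse(new InputSource()); // the argument is ignored
            return trace.toString();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("composite parse failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(elementTrace("YourFakeRoot", "<a/>", "<b><c/></b>"));
    }
}
```

 The downstream handler sees exactly one startDocument, then the fake
root wrapping the elements of every page in order.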


 Maxime Levesque


> -----Original Message-----
> From: owner-xsl-list@xxxxxxxxxxxxxxxx
> [mailto:owner-xsl-list@xxxxxxxxxxxxxxxx]On Behalf Of McKisson, Shawn
> Sent: Tuesday, August 03, 1999 8:38 AM
> To: 'xsl-list@xxxxxxxxxxxxxxxx'
> Subject: Multiple pages of well formed HTML ---> XML
>
>
> Thanks to all those that helped me with the linear to deep xsl
> transformation - the information you gave was priceless to a beginner like
> myself. (see post XSL problem 8/2/1999)
> Special thanks to David Carlisle and Dave Pawson who went out of their way
> to help.
>
> Related to this, I now have the need to gather well formed HTML from
> multiple web pages and form it into a single XML document. It seems like
> the only trick here is to get each of the HTML trees to hang off of the
> root node of the DOM tree that XSL is going to manipulate.
> i.e.
>
> (wp = webpage)
>
>             DOM
>             root
>            / |  \
>           /  |   \
>          /   |    \
>         wp1 wp2..wpn
>
> With that accomplished, it seems that I could use XSL in the standard way
> to generate the XML.
> Does this sound like a reasonable solution to the problem? Any other
> suggestions? (I haven't looked into XLink, so I'm not sure exactly what it
> is or if it is relevant here)
>
> --shawn
>
>
>  XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
>

