RE: One texdocument in and several xmldocuments out?
You can convert a Word document to HTML using File / Save As... and selecting HTML or Filtered HTML. The difference between the two is that HTML preserves all of Word's information, such as the <span> tags that mark spelling and grammar issues, whereas Filtered HTML drops the Word-specific tags. Then follow the advice already provided here (e.g., Tidy) to ensure that the HTML is well-formed XML.

Cheers,
Stuart

-----Original Message-----
From: owner-xsl-list@xxxxxxxxxxxxxxxxxxxxxx [mailto:owner-xsl-list@xxxxxxxxxxxxxxxxxxxxxx] On Behalf Of Robert Koberg
Sent: Monday, May 06, 2002 06:43
To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
Subject: Re: One texdocument in and several xmldocuments out?

Hi,

Zack Brown wrote:

>On Mon, May 06, 2002 at 01:28:51PM +0200, Tove Nilstun wrote:
>
>>Hi
>>
>>I am a total beginner when it comes to XML, but in order to start working
>>with it, there are two things I need to sort out.
>>
>>I have a user guide (written in MS Word) with both text and pictures. I
>>would like to 1. convert this document to several xml documents, one per
>>headline and 2. create an additional xml file containing an index of the
>>files created in step one.
>>
>>Is this possible?
>
>Absolutely. Just create one XSLT file for each output file you desire.
>Then run the XML through your parser once for each XSLT file you've
>created

You do not need an XSLT file for each page.

First you have to get the MS Word doc into XML. There are a few products out there that convert Word to DocBook or some other XML. A neat trick we found when building our MSIE-based editor was that you can paste an MS Word doc into an element that has contentEditable="true". IE converts this to HTML. We use JS to convert it to XML on the client, but you could use Tidy to get well-formed HTML (XML).

Then hopefully there are clean separations to indicate where a new page should start. Apply templates (loop) over each page division, and you can use the extension functions built into Saxon or Xalan to create multiple output documents from one source.

best,
-Rob

XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list
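
To make the splitting step in the thread concrete, here is a rough sketch of the idea. Note that it is not the XSLT approach Rob describes: it uses Python's standard xml.etree.ElementTree as a stand-in, purely to illustrate "one output document per headline plus an index". It assumes the Word HTML has already been cleaned into well-formed XML (for example with Tidy) and that the headlines are <h1> elements sitting directly under <body>; real Word output may wrap content in extra <div> elements. The function name split_by_heading, the file names section1.xml, section2.xml, ... and the <section>, <title>, <index> and <file> element names are all made up for illustration.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Return the tag name without any '{namespace}' prefix."""
    return tag.rsplit("}", 1)[-1]


def split_by_heading(xhtml_text, heading="h1"):
    """Split well-formed XHTML into one XML document per heading.

    Returns a dict mapping (hypothetical) file names to XML strings,
    including an 'index.xml' entry that lists the generated files.
    """
    root = ET.fromstring(xhtml_text)
    # Find <body> regardless of namespace; fall back to the root element.
    body = next((el for el in root.iter() if local_name(el.tag) == "body"), root)

    sections = []  # each entry: (title text, list of elements in that section)
    for child in body:
        if local_name(child.tag) == heading:
            # A heading starts a new section; its text becomes the title.
            title = "".join(child.itertext()).strip()
            sections.append((title, []))
        elif sections:
            # Content before the first heading is skipped in this sketch.
            sections[-1][1].append(child)

    files = {}
    index = ET.Element("index")
    for number, (title, elements) in enumerate(sections, start=1):
        name = "section%d.xml" % number
        section = ET.Element("section")
        ET.SubElement(section, "title").text = title
        section.extend(elements)
        files[name] = ET.tostring(section, encoding="unicode")
        # Record each generated file in the index document.
        ET.SubElement(index, "file", href=name, title=title)
    files["index.xml"] = ET.tostring(index, encoding="unicode")
    return files
```

Calling split_by_heading on the Tidy output and writing each returned string to the file named by its key would give the per-headline files and the index asked for in the original question. The function returns strings rather than writing files itself so that the split can be inspected before anything touches the disk.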