RE: Word and XML (was: XML standards coherency and so forth)
> From: "Ogievetsky, Nikita" <nikita.ogievetsky@c...>
> Date: Wed, 13 Jan 1999 12:37:06 -0500
> Subject: RE: XML standards coherency and so forth
>
> >Andreas Berg wrote:
> > I am searching for a converter from Word documents to XML. Unfortunately
> > I have no time to wait for Office 2000..... Is there something like this
> > available?
>
> In the MS Word go to <File>/<Save As> menu, select "Save as HTML document".
> It will create a well formed XML file: HTML with all elements having start
> and end tags.
> (Just remember to exhume the <body> - sorry for bad joke).
>
> Nikita Ogievetsky.
>

Actually, it is very easy to generate a Word 97 document which, when saved as
HTML, will not be well-formed. Try the following, where *xxx* means "make xxx
bold" and _yyy_ means "make yyy italicized":

    This is *a test _of the* emergency_ broadcast system

The relevant portion of the HTML produced by Word is

    <P>This is <B>a test <I>of the</B> emergency</I> broadcast system</P>

The "nesting" of the B and I elements is not well-formed: the B element is
closed while the I element opened inside it is still open. As far as I can
tell, this works (or doesn't, as the case may be) for any format/font changes.

Word 97 also produced several well-formedness violations when doing anything
more than simple nested lists.

pvb
SGML Business Analyst
Kaiser Permanente, So Cal.

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@i...
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To (un)subscribe, mailto:majordomo@i... the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@i... the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@i...)
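The overlap in the Word output quoted in the message above can be confirmed with any XML parser. The following is only a minimal sketch, assuming Python's standard xml.etree.ElementTree module; the helper name is_well_formed and the hand-repaired variant are illustrative and not part of the original message.

```python
import xml.etree.ElementTree as ET


def is_well_formed(fragment: str) -> bool:
    """Return True if the fragment parses as well-formed XML (hypothetical helper)."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        # Raised for mismatched or overlapping tags, among other errors.
        return False


# The fragment as reported in the message: </B> closes while <I> is still open.
word_output = (
    "<P>This is <B>a test <I>of the</B> emergency</I> broadcast system</P>"
)

# A hand-repaired variant (an assumption, not Word's output): the italic run
# is split so that each element closes inside its parent.
repaired = (
    "<P>This is <B>a test <I>of the</I></B><I> emergency</I> broadcast system</P>"
)

print(is_well_formed(word_output))
print(is_well_formed(repaired))
```

Splitting the italic run at the point where the bold run ends is one common way to express overlapping formatting in a strictly nested tree; other repairs are possible.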