Subject: From WordML to XML
Author: zeinah K
Date: 08 Apr 2005 05:26 AM
Hi,
I have a WordML document with a user-defined schema called "Wordtest" attached.

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<?mso-application progid="Word.Document"?>
<w:wordDocument xmlns:w="http://schemas.microsoft.com/office/word/2003/wordml" xmlns:v="urn:schemas-microsoft-com:vml" xmlns:w10="urn:schemas-microsoft-com:office:word" xmlns:sl="http://schemas.microsoft.com/schemaLibrary/2003/core" xmlns:aml="http://schemas.microsoft.com/aml/2001/core" xmlns:wx="http://schemas.microsoft.com/office/word/2003/auxHint" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:dt="uuid:C2F41010-65B3-11d1-A29F-00AA00C14882" xmlns:ns0="schemas-WordTest" w:macrosPresent="no" w:embeddedObjPresent="no" w:ocxPresent="no" xml:space="preserve"><o:DocumentProperties><o:Title>My TitleMy Sub TitleSome text goes here</o:Title><o:Author>AIS</o:Author><o:LastAuthor>AIS</o:LastAuthor><o:Revision>1</o:Revision><o:TotalTime>1</o:TotalTime><o:Created>2005-04-08T09:14:00Z</o:Created><o:LastSaved>2005-04-08T09:15:00Z</o:LastSaved><o:Pages>1</o:Pages><o:Words>7</o:Words><o:Characters>41</o:Characters><o:Company>AIS</o:Company><o:Lines>1</o:Lines><o:Paragraphs>1</o:Paragraphs><o:CharactersWithSpaces>47</o:CharactersWithSpaces><o:Version>11.6359</o:Version></o:DocumentProperties><w:fonts><w:defaultFonts w:ascii="Times New Roman" w:fareast="Times New Roman" w:h-ansi="Times New Roman" w:cs="Times New Roman"/></w:fonts><w:styles><w:versionOfBuiltInStylenames w:val="4"/><w:latentStyles w:defLockedState="off" w:latentStyleCount="156"/><w:style w:type="paragraph" w:default="on" w:styleId="Normal"><w:name w:val="Normal"/><w:rPr><wx:font wx:val="Times New Roman"/><w:sz w:val="24"/><w:sz-cs w:val="24"/><w:lang w:val="FR" w:fareast="FR" w:bidi="AR-SA"/></w:rPr></w:style><w:style w:type="character" w:default="on" w:styleId="DefaultParagraphFont"><w:name w:val="Default Paragraph Font"/><w:semiHidden/></w:style><w:style w:type="table" w:default="on" w:styleId="TableNormal"><w:name w:val="Normal Table"/><wx:uiName wx:val="Table Normal"/><w:semiHidden/><w:rPr><wx:font wx:val="Times New Roman"/></w:rPr><w:tblPr><w:tblInd w:w="0" w:type="dxa"/><w:tblCellMar><w:top w:w="0" 
w:type="dxa"/><w:left w:w="108" w:type="dxa"/><w:bottom w:w="0" w:type="dxa"/><w:right w:w="108" w:type="dxa"/></w:tblCellMar></w:tblPr></w:style><w:style w:type="list" w:default="on" w:styleId="NoList"><w:name w:val="No List"/><w:semiHidden/></w:style></w:styles><w:shapeDefaults><o:shapedefaults v:ext="edit" spidmax="2050"/><o:shapelayout v:ext="edit"><o:idmap v:ext="edit" data="1"/></o:shapelayout></w:shapeDefaults><w:docPr><w:view w:val="print"/><w:zoom w:percent="100"/><w:doNotEmbedSystemFonts/><w:proofState w:spelling="clean" w:grammar="clean"/><w:attachedTemplate w:val=""/><w:defaultTabStop w:val="708"/><w:hyphenationZone w:val="425"/><w:punctuationKerning/><w:characterSpacingControl w:val="DontCompress"/><w:optimizeForBrowser/><w:validateAgainstSchema/><w:saveInvalidXML w:val="off"/><w:ignoreMixedContent w:val="off"/><w:alwaysShowPlaceholderText w:val="off"/><w:compat><w:breakWrappedTables/><w:snapToGridInCell/><w:wrapTextWithPunct/><w:useAsianBreakRules/><w:dontGrowAutofit/></w:compat></w:docPr><w:body><wx:sect><ns0:Fonction><w:p><w:pPr><w:rPr><w:lang w:val="EN-GB"/></w:rPr></w:pPr><ns0:Title><w:r><w:rPr><w:b/><w:lang w:val="EN-GB"/></w:rPr><w:t>My Title</w:t></w:r><w:proofErr w:type="spellStart"/></ns0:Title><ns0:SousTitre><w:proofErr w:type="spellEnd"/><w:r><w:rPr><w:lang w:val="EN-GB"/></w:rPr><w:t>My Sub Title</w:t></w:r><w:proofErr w:type="spellStart"/></ns0:SousTitre><ns0:Texte><w:proofErr w:type="spellEnd"/><w:r><w:rPr><w:lang w:val="EN-GB"/></w:rPr><w:t>Some text goes here</w:t></w:r></ns0:Texte></w:p></ns0:Fonction><w:sectPr><w:pgSz w:w="11906" w:h="16838"/><w:pgMar w:top="1417" w:right="1417" w:bottom="1417" w:left="1417" w:header="708" w:footer="708" w:gutter="0"/><w:cols w:space="708"/><w:docGrid w:line-pitch="360"/></w:sectPr></wx:sect></w:body></w:wordDocument>

I want to convert this WordML into XML like this:
<?xml version="1.0" encoding="UTF-8"?>
<Fonction>
<Title><b>My Title</b></Title>
<SousTitre>My Subtitle</SousTitre>
<Texte>Some text goes here</Texte>
</Fonction>

I want to preserve the user-defined tags from the schema and also the formatting applied by Word.
I saw an XSL on this forum and tried to use it, but it removes the user-defined tags. I have no knowledge of how to write an XSL.
Could anyone help me with this, please?
Thanks
Zeinah

Subject: From WordML to XML
Author: Minollo I.
Date: 08 Apr 2005 10:26 PM
Your wish to generate an XML document with the structure you describe seems to conflict with your second requirement, "I want to preserve the user defined tags"; you'll need to decide which of the two you want.

> Could Anyone help me with this please?

We are more than willing to help you, but you will need to learn some XSLT or XQuery if you want to succeed in any XML-to-XML transformation you are planning to do.
This XSLT does (part of) what (I guess) you are looking for:

<?xml version='1.0' ?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:w="http://schemas.microsoft.com/office/word/2003/wordml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:wx="http://schemas.microsoft.com/office/word/2003/auxHint" xmlns:ns0="schemas-WordTest" exclude-result-prefixes="w o wx ns0">
<!-- exclude-result-prefixes keeps the WordML namespace declarations out of the output -->
<xsl:template match="/">
<Fonction>
<Title>
<b>
<xsl:value-of select="w:wordDocument/w:body/wx:sect/ns0:Fonction/w:p/ns0:Title/w:r/w:t"/>
</b>
</Title>
<SousTitre>
<xsl:value-of select="w:wordDocument/w:body/wx:sect/ns0:Fonction/w:p/ns0:SousTitre/w:r/w:t"/>
</SousTitre>
<Texte>
<xsl:value-of select="w:wordDocument/w:body/wx:sect/ns0:Fonction/w:p/ns0:Texte/w:r/w:t"/>
</Texte>
</Fonction>
</xsl:template>
</xsl:stylesheet>

Hope this helps,
Minollo

Subject: From WordML to XML
Author: zeinah K
Date: 11 Apr 2005 02:15 AM
Sorry, I didn't make myself clear enough.
What I want to do is keep the <Fonction>, <SousTitre>, <Texte> tags and also the Word formatting tags like <b>, <i>, <u>, which a user can optionally choose.
For example, between the <Texte> and </Texte> tags, the user will insert some text and apply formatting to different words.
Is it possible to achieve this?
Thanks
Zeinah

Subject: From WordML to XML
Author: Minollo I.
Date: 11 Apr 2005 08:35 AM
You can copy the whole sub-tree of an XML node using xsl:copy-of; that will copy any text formatting element too.
So, you can adapt the XSLT I mentioned earlier to use <xsl:copy-of.../> rather than <xsl:value-of.../>

Hope this helps,
Minollo
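
A minimal sketch of the adaptation described above, assuming the same document structure and namespace prefixes as in the earlier stylesheet. Only <Texte> is shown; <Title> and <SousTitre> would follow the same pattern. Selecting the w:r runs (rather than only w:t) is an assumption made here so that the w:rPr formatting properties are copied along with the text. Note that xsl:copy-of copies the WordML elements unchanged, including the namespace declarations in scope; it does not turn them into <b>, <i> or <u>.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:w="http://schemas.microsoft.com/office/word/2003/wordml"
    xmlns:wx="http://schemas.microsoft.com/office/word/2003/auxHint"
    xmlns:ns0="schemas-WordTest">
  <xsl:template match="/">
    <Fonction>
      <Texte>
        <!-- Deep-copy every run inside ns0:Texte: each w:r is copied
             together with its w:rPr (formatting) and w:t (text) children -->
        <xsl:copy-of select="w:wordDocument/w:body/wx:sect/ns0:Fonction/w:p/ns0:Texte/w:r"/>
      </Texte>
    </Fonction>
  </xsl:template>
</xsl:stylesheet>
```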

Subject: From WordML to XML
Author: zeinah K
Date: 12 Apr 2005 01:16 AM
Hello,
Thanks, this brings me nearer to my desired solution.
Suppose I put the subtitle in bold; now I get this output:

<SousTitre>
<w:t xmlns:v="urn:schemas-microsoft-com:vml" xmlns:w10="urn:schemas-microsoft-com:office:word" xmlns:sl="http://schemas.microsoft.com/schemaLibrary/2003/core" xmlns:aml="http://schemas.microsoft.com/aml/2001/core" xmlns:dt="uuid:C2F41010-65B3-11d1-A29F-00AA00C14882">My Subtitle</w:t>
</SousTitre>

Is it possible to have this output instead?

<SousTitre>
<b>My Subtitle</b>
</SousTitre>

Is there some kind of mapping that can be done?
Thanks
Zeinah
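
One way such a mapping could be written, as a sketch under stated assumptions rather than a definitive answer: instead of copying WordML nodes, a template per node type rebuilds the user-defined elements and wraps each run's text in <b>, <i> or <u> when the run's w:rPr contains w:b, w:i or w:u. The mode names "italic" and "underline" are illustrative, and the checks for w:val="off" and w:val="none" are assumptions about how Word 2003 switches a property off explicitly.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:w="http://schemas.microsoft.com/office/word/2003/wordml"
    xmlns:wx="http://schemas.microsoft.com/office/word/2003/auxHint"
    xmlns:ns0="schemas-WordTest"
    exclude-result-prefixes="w wx ns0">

  <xsl:output method="xml" encoding="UTF-8"/>

  <!-- Start at the user-defined root element; the rest of the WordML is ignored -->
  <xsl:template match="/">
    <xsl:apply-templates select="w:wordDocument/w:body/wx:sect/ns0:Fonction"/>
  </xsl:template>

  <!-- Rebuild every user-defined element under its local name, without a namespace -->
  <xsl:template match="ns0:*">
    <xsl:element name="{local-name()}">
      <xsl:apply-templates/>
    </xsl:element>
  </xsl:template>

  <!-- Bold: wrap the run in <b> if w:b is present and not switched off (assumption) -->
  <xsl:template match="w:r">
    <xsl:choose>
      <xsl:when test="w:rPr/w:b[not(@w:val = 'off')]">
        <b><xsl:apply-templates select="." mode="italic"/></b>
      </xsl:when>
      <xsl:otherwise>
        <xsl:apply-templates select="." mode="italic"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

  <!-- Italic: same idea, then hand over to the underline step -->
  <xsl:template match="w:r" mode="italic">
    <xsl:choose>
      <xsl:when test="w:rPr/w:i[not(@w:val = 'off')]">
        <i><xsl:apply-templates select="." mode="underline"/></i>
      </xsl:when>
      <xsl:otherwise>
        <xsl:apply-templates select="." mode="underline"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

  <!-- Underline: innermost wrapper; emits the text of the run -->
  <xsl:template match="w:r" mode="underline">
    <xsl:choose>
      <xsl:when test="w:rPr/w:u[not(@w:val = 'none')]">
        <u><xsl:value-of select="w:t"/></u>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="w:t"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

  <!-- Suppress any stray text reached through the built-in rules -->
  <xsl:template match="text()"/>
</xsl:stylesheet>
```

Applied to the document in the first post, this should yield <Fonction> containing <Title><b>My Title</b></Title>, <SousTitre>My Sub Title</SousTitre> and <Texte>Some text goes here</Texte>. Elements such as w:p, w:pPr and w:proofErr fall through to XSLT's built-in rules, which process their children and output nothing of their own.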

Subject: From WordML to XML
Author: d p
Date: 17 Apr 2005 07:20 AM
Originally Posted: 17 Apr 2005 07:18 AM
Hi,
I have been working on a similar project, getting XML from Word 2003, and here is the process I went through:

* Download and install the Word 2003 XML toolbar (this requires .NET programmability support to be enabled in the Office 2003 installation).
* Develop a schema for your document.
* Add the schema to the schema library for the document.
* Use the XML toolbar to mark up your content with XML tags based on the schema.
* Export as XML (sometimes you may have to save as data only and not validate against the schema, as for some reason MS doesn't like particular schemas even though they are valid and well formed).

Sounds simple? Yeah... that's what their video said, but I found it a lot more difficult than that.

Drop me a line if you want more specifics.

Hope this helps,
-david

 