I am new to XSLT and am working with ASP.NET 2.0, trying to bulk upload
content from Word 2007 docx files to a SQL Server 2005 Express Edition
database in order to publish the content through my content management
system. So far I think I will need to use XSLT 2.0 and the Saxon 8.7
processor for .NET (since the .NET XslCompiledTransform processor only
supports XSLT 1.0).
I would like to split the Word 2007 documents into several parts via XSLT
so that I can publish a long Word 2007 document as several web pages. I
added my own customXML to the Word 2007 document to carry information such
as page title, URL, meta description and meta keywords (the
WORD2007SAMPLE_DOCUMENT.XML file below only shows the page title customXML
to keep the sample short). Every <w:customXml w:element="pageTitle">
indicates the start of a new web page. The content in between will be
converted to HTML.

The DESIRED_OUTPUT.XML shows the XML file I would like to get as a result.
This file will be loaded into the corresponding tables and columns of my
SQL Server 2005 Express Edition database.

The RECEIVED_OUTPUT.XML shows the output I get so far. It shows that the
content is not grouped correctly into separate web pages.

The MY_NOT_WORKING_TRANSFORM.XSL shows how I tried to transform
WORD2007SAMPLE_DOCUMENT.XML into DESIRED_OUTPUT.XML, without success. The
conversion of the content to HTML is not included, to keep the sample short.

MY PROBLEM:
When I group by <w:customXml w:element="pageTitle"> using for-each-group,
I can't get to the value of the <w:t>Content ?</w:t> nodes without
destroying my grouping effort. I suppose this is because the content is not
at the same level as, or below, my <w:customXml w:element="pageTitle">.

Thanks for your help.

----------------------------
WORD2007SAMPLE_DOCUMENT.XML
----------------------------
<?xml version="1.0"?>
<w:document xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">
  <w:body>
    <w:p>
      <w:customXml w:element="pageTitle">
        <w:r>
          <w:t>1. Web Page Title</w:t>
        </w:r>
      </w:customXml>
    </w:p>
    <w:p>
      <w:r>
        <w:t>Content A</w:t>
      </w:r>
    </w:p>
    <w:p>
      <w:r>
        <w:t>Content B</w:t>
      </w:r>
    </w:p>
    <w:p>
      <w:customXml w:element="pageTitle">
        <w:r>
          <w:t>2. Web Page Title</w:t>
        </w:r>
      </w:customXml>
    </w:p>
    <w:p>
      <w:r>
        <w:t>Content C</w:t>
      </w:r>
    </w:p>
    <w:p>
      <w:r>
        <w:t>Content D</w:t>
      </w:r>
    </w:p>
  </w:body>
</w:document>

----------------------------
DESIRED_OUTPUT.XML
----------------------------
<?xml version="1.0" encoding="utf-8"?>
<root xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">
  <pageData>
    <pageTitle>1. Web Page Title</pageTitle>
    <pageContent>Content A and Content B</pageContent>
  </pageData>
  <pageData>
    <pageTitle>2. Web Page Title</pageTitle>
    <pageContent>Content C and Content D</pageContent>
  </pageData>
</root>

----------------------------
RECEIVED_OUTPUT.XML
----------------------------
<?xml version="1.0" encoding="utf-8"?>
<root xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">
  <pageData>
    <pageTitle>1. Web Page Title</pageTitle>
    <pageContent>Content A and Content B Content C and Content D</pageContent>
  </pageData>
  <pageData>
    <pageTitle>2. Web Page Title</pageTitle>
    <pageContent>Content A and Content B Content C and Content D</pageContent>
  </pageData>
</root>

----------------------------
MY_NOT_WORKING_TRANSFORM.XSL
----------------------------
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">

  <xsl:output method="xml" indent="yes" encoding="utf-8" />
  <xsl:strip-space elements="*"/>

  <xsl:template match="/">
    <xsl:apply-templates select="//w:body"/>
  </xsl:template>

  <xsl:template match="w:body">
    <root>
      <xsl:for-each-group select="*"
          group-starting-with="w:p[w:customXml/@w:element = 'pageTitle']">
        <pageData>
          <pageTitle>
            <xsl:value-of select="."/>
          </pageTitle>
          <pageContent>
            <xsl:value-of select="//w:p/w:r/w:t"/>
          </pageContent>
        </pageData>
      </xsl:for-each-group>
    </root>
  </xsl:template>
</xsl:stylesheet>
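
For comparison, here is a minimal sketch of one way the grouping might be made to work. It rests on one observation: `//w:p/w:r/w:t` is an absolute path, so it selects every content node in the whole document no matter which group is being processed. The XSLT 2.0 function `current-group()` returns only the paragraphs of the group at hand. The `" and "` separator is an assumption read off DESIRED_OUTPUT.XML, and the sketch also assumes that every page starts with a pageTitle paragraph and that content paragraphs hold their text in `w:r/w:t`.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">

  <xsl:output method="xml" indent="yes" encoding="utf-8"/>
  <xsl:strip-space elements="*"/>

  <xsl:template match="/">
    <xsl:apply-templates select="//w:body"/>
  </xsl:template>

  <xsl:template match="w:body">
    <root>
      <!-- Each group starts at a paragraph that carries a pageTitle. -->
      <xsl:for-each-group select="w:p"
          group-starting-with="w:p[w:customXml/@w:element = 'pageTitle']">
        <pageData>
          <pageTitle>
            <!-- The context item is the first paragraph of the group. -->
            <xsl:value-of
                select="w:customXml[@w:element = 'pageTitle']/w:r/w:t"/>
          </pageTitle>
          <pageContent>
            <!-- Only this group's paragraphs, minus the title paragraph.
                 The separator is an assumption based on the desired output. -->
            <xsl:value-of
                select="current-group()
                          [not(w:customXml/@w:element = 'pageTitle')]
                          /w:r/w:t"
                separator=" and "/>
          </pageContent>
        </pageData>
      </xsl:for-each-group>
    </root>
  </xsl:template>
</xsl:stylesheet>
```

If the content is later converted to HTML, the same idea should carry over: apply templates to the filtered `current-group()` sequence instead of taking its string value.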