Subject: RE: flat file transformation: Splitting and merging of OpenOffice 1.1Documents
From: cknell@xxxxxxxxxx
Date: Fri, 22 Aug 2003 11:31:44 -0400
Congratulations on engineering a solution. I had been looking at this off and on during the week, but I was too busy to devote enough time to it. I am not at work today, and had just sat down at my computer with the intention of cracking the problem.

My idea was similar to yours, but instead of using an external file, I was planning to create a variable to hold the positions of each <text:h text:style-name="Heading 1.1" text:level="1">Level1</text:h> and compare the position of each element as I processed it to the appropriate element in the variable.

I may just do it as an exercise. If I get it working, I'll send you the file.

-- 
Charles Knell
cknell@xxxxxxxxxx - email



-----Original Message-----
From:     Xsl-list <Xsl-list@xxxxxxxxxxxxx>
Sent:     Fri, 22 Aug 2003 16:55:24 +0200
To:       XSL-List@xxxxxxxxxxxxxxxxxxxxxx
Subject:   flat file transformation: Splitting and merging of OpenOffice 1.1Documents

Hi Charles,
I finally arrived at a solution for my problem of splitting an
OpenOffice 1.1 document.
As you said before, some XPath expressions in the stylesheet were too
restrictive.
Here is the final stylesheet. It uses the Redirect extension of Xalan.
The stylesheet writes all elements that come before the first chapter (e.g.
style information), together with a list of all chapters, to a file named
"kopfdaten.xml" ("Kopfdaten" is German for "header data").
It then creates one XML file per chapter (without the elements that
precede chapter 1), named e.g. "kapitel-3.xml" ("Kapitel" means "chapter").
These files can then be processed with other tools, in my case for
translation.
Merging the resulting XML files back into a valid OpenOffice 1.1 document is
now easy, because I can use the list of all chapters in the file
"kopfdaten.xml".
Thank you for helping me out!

To get a valid XML input instance, simply unzip a *.sxw file (OpenOffice 1.1
format) and extract content.xml from it.
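
To make the role of "kopfdaten.xml" more concrete, here is a rough sketch of what that file might look like. It assumes a document with exactly three level-1 headings (the count is an assumption for illustration only). The copied content is represented by comments, all namespace declarations other than the "office" one are left out, and the indentation is added for readability (the stylesheet itself writes with indent="no").

```xml
<office:document-content xmlns:office="http://openoffice.org/2000/office">
  <!-- copies of the elements that precede office:body in content.xml -->
  <office:body>
    <!-- copies of the body elements that precede the first chapter heading -->
    <files>
      <file>kapitel-1.xml</file>
      <file>kapitel-2.xml</file>
      <file>kapitel-3.xml</file>
    </files>
  </office:body>
</office:document-content>
```

The MERGE stylesheet below reads the file elements in this order and replaces the files element with the body content of each listed chapter file.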
*********************************************************************
SPLIT
*********************************************************************
<?xml version="1.0" encoding="UTF-8" ?>
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:xalan="org.apache.xalan.xslt.extensions.Redirect"
  extension-element-prefixes="xalan"
  xmlns:office="http://openoffice.org/2000/office" 
  xmlns:style="http://openoffice.org/2000/style" 
  xmlns:text="http://openoffice.org/2000/text" 
  xmlns:table="http://openoffice.org/2000/table" 
  xmlns:draw="http://openoffice.org/2000/drawing" 
  xmlns:fo="http://www.w3.org/1999/XSL/Format" 
  xmlns:xlink="http://www.w3.org/1999/xlink" 
  xmlns:number="http://openoffice.org/2000/datastyle" 
  xmlns:svg="http://www.w3.org/2000/svg" 
  xmlns:chart="http://openoffice.org/2000/chart" 
  xmlns:dr3d="http://openoffice.org/2000/dr3d" 
  xmlns:math="http://www.w3.org/1998/Math/MathML" 
  xmlns:form="http://openoffice.org/2000/form" 
  xmlns:script="http://openoffice.org/2000/script"
  office:class="text" office:version="1.0"
  xmlns:meta="http://openoffice.org/2000/meta"
  xmlns:dc="http://purl.org/dc/elements/1.1/">

  <xsl:output method="xml" indent="no" encoding="UTF-8"
    doctype-public="-//OpenOffice.org//DTD OfficeDocument 1.0//EN"
    doctype-system="office.dtd"/>
  <xsl:strip-space elements="*" />
 
  <xsl:template match="/">
    <xsl:apply-templates select="//office:body"/>
  </xsl:template>

  <xsl:template match="office:body">
    <xsl:apply-templates />
  </xsl:template>

  <xsl:template match="text:h[@text:level='1']">
    <xsl:variable name="kap-num"
      select="count(preceding-sibling::*[name()='text:h' and @text:level='1']) + 1" />
    
    <xsl:choose>
      <xsl:when test="$kap-num = '1'">
        <xalan:write select="concat('kopfdaten','.xml')">
          <office:document-content>
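            <xsl:copy-of
              select="/office:document-content/office:body/preceding-sibling::*"/>
            <office:body>
              <!-- Select the first level-1 heading explicitly, so that a heading of
                   another level placed before it is not mistaken for chapter 1. -->
              <xsl:copy-of
                select="/office:document-content/office:body/text:h[@text:level='1'][1]/preceding-sibling::*"/>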
              <files>
                <xsl:for-each select="//text:h[@text:level='1']">
                  <xsl:call-template name="dateiliste">
                    <xsl:with-param name="anzahl">
                      <xsl:value-of
                        select="count(following-sibling::node()[name()='text:h' and @text:level='1']) + 1" />
                    </xsl:with-param>
                  </xsl:call-template>
                </xsl:for-each>
              </files>
            </office:body>
          </office:document-content>
        </xalan:write>
        <xalan:write select="concat('kapitel-',$kap-num,'.xml')">
          <office:document-content>
            <office:body>
              <xsl:copy-of select="."/>
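              <!-- Skip later level-1 headings here: they would re-enter this
                   template and write their chapter files again. -->
              <xsl:apply-templates
                select="following-sibling::node()[not(self::text:h[@text:level='1'])]">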
                <xsl:with-param name="num">
                  <xsl:value-of select="$kap-num" />
                </xsl:with-param>
              </xsl:apply-templates>
            </office:body>
          </office:document-content>
        </xalan:write>
      </xsl:when>
      <xsl:otherwise>
        <xalan:write select="concat('kapitel-',$kap-num,'.xml')">
          <office:document-content>
            <office:body>
              <xsl:copy-of select="."/>
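              <!-- Skip later level-1 headings here: they would re-enter this
                   template and write their chapter files again. -->
              <xsl:apply-templates
                select="following-sibling::node()[not(self::text:h[@text:level='1'])]">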
                <xsl:with-param name="num">
                  <xsl:value-of select="$kap-num" />
                </xsl:with-param>
              </xsl:apply-templates>
            </office:body>
          </office:document-content>
        </xalan:write>
      </xsl:otherwise>
    </xsl:choose>
    
    
  </xsl:template>

  <xsl:template match="node()">
    <xsl:param name="num" />
    <xsl:variable name="parent-kap"
      select="count(preceding-sibling::*[name()='text:h' and @text:level='1'])" />
    <xsl:if test="$num = $parent-kap">
      <xsl:copy-of select="." />
    </xsl:if>
  </xsl:template>

  <xsl:template name="dateiliste">
    <xsl:param name="anzahl" />
    <xsl:if test="not($anzahl='0')">
      <file>
        <xsl:variable name="anzahl-kapitel-gesamt"
          select="count(//text:h[@text:level='1']) + 1" />
        <xsl:value-of
          select="concat('kapitel-', $anzahl-kapitel-gesamt - $anzahl, '.xml')" />
      </file>
    </xsl:if>
  </xsl:template>

</xsl:stylesheet>
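
For comparison, the chapter grouping could also be sketched by holding the level-1 headings in a variable, close to the idea Charles describes above. This is an untested sketch, not the stylesheet posted in this thread: instead of comparing numeric positions it compares generate-id() values (a common XSLT 1.0 grouping idiom), it assumes all chapter headings are direct children of office:body, and it writes only the chapter files, not "kopfdaten.xml". The Redirect namespace and the xalan prefix are taken from the stylesheet above.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:xalan="org.apache.xalan.xslt.extensions.Redirect"
  extension-element-prefixes="xalan"
  xmlns:office="http://openoffice.org/2000/office"
  xmlns:text="http://openoffice.org/2000/text">

  <!-- All chapter headings (assumed to be direct children of office:body). -->
  <xsl:variable name="chapters"
    select="/office:document-content/office:body/text:h[@text:level='1']"/>

  <xsl:template match="/">
    <xsl:for-each select="$chapters">
      <!-- Chapter number and identity of the current heading. -->
      <xsl:variable name="kap-num" select="position()"/>
      <xsl:variable name="id" select="generate-id()"/>
      <xalan:write select="concat('kapitel-', $kap-num, '.xml')">
        <office:document-content>
          <office:body>
            <xsl:copy-of select="."/>
            <!-- Copy the following siblings whose nearest preceding level-1
                 heading is the current one, excluding the next heading itself. -->
            <xsl:copy-of select="following-sibling::node()
              [not(self::text:h[@text:level='1'])]
              [generate-id(preceding-sibling::text:h[@text:level='1'][1]) = $id]"/>
          </office:body>
        </office:document-content>
      </xalan:write>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

Each heading is visited once here, so no chapter file is written more than once. The trade-off is that every following sibling is still tested against every heading, which should be acceptable for documents of ordinary size.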
*********************************************************************
MERGE
*********************************************************************
<?xml version="1.0" encoding="UTF-8" ?>
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:xalan="org.apache.xalan.xslt.extensions.Redirect"
  extension-element-prefixes="xalan"
  xmlns:office="http://openoffice.org/2000/office" 
  xmlns:style="http://openoffice.org/2000/style" 
  xmlns:text="http://openoffice.org/2000/text" 
  xmlns:table="http://openoffice.org/2000/table" 
  xmlns:draw="http://openoffice.org/2000/drawing" 
  xmlns:fo="http://www.w3.org/1999/XSL/Format" 
  xmlns:xlink="http://www.w3.org/1999/xlink" 
  xmlns:number="http://openoffice.org/2000/datastyle" 
  xmlns:svg="http://www.w3.org/2000/svg" 
  xmlns:chart="http://openoffice.org/2000/chart" 
  xmlns:dr3d="http://openoffice.org/2000/dr3d" 
  xmlns:math="http://www.w3.org/1998/Math/MathML" 
  xmlns:form="http://openoffice.org/2000/form" 
  xmlns:script="http://openoffice.org/2000/script"
  office:class="text" office:version="1.0"
  xmlns:meta="http://openoffice.org/2000/meta"
  xmlns:dc="http://purl.org/dc/elements/1.1/">

  <xsl:output method="xml" indent="no" encoding="UTF-8"
    doctype-public="-//OpenOffice.org//DTD OfficeDocument 1.0//EN"
    doctype-system="office.dtd"/>
  <xsl:strip-space elements="*" />
  
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="files">
    <!-- Pull in the body of each chapter file listed in kopfdaten.xml, in order. -->
    <xsl:for-each select="file">
      <xsl:variable name="file" select="document(text())"/>
      <xsl:copy-of select="$file/office:document-content/office:body/*"/>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
*********************************************************************

 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list



