
RE: split OpenOffice 1.1 documents (flat xml)

Subject: RE: split OpenOffice 1.1 documents (flat xml)
From: cknell@xxxxxxxxxx
Date: Wed, 30 Jul 2003 11:26:02 -0400
It appears that the beginning and end of a chapter are not marked by an element; that is, no element contains a chapter. Is that correct?

If so, how can you determine where a chapter begins and ends? If you can answer that question, you have moved a long way toward solving the problem.

It appears that you can identify the beginning of a chapter with an XPath expression along these lines: /office:document-content/office:body/text:h[@text:level='1']. It also seems that all sibling nodes following a particular <text:h> element, up to but not including the next <text:h> sibling node, are part of the chapter. Is that correct?
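
To make the grouping rule concrete, here is a minimal sketch in Python (standard library only) rather than XSLT. It assumes that a chapter starts at every <text:h> with text:level="1" directly under <office:body>, and that everything before the first such heading (for example <text:sequence-decls>) belongs to no chapter. The sample string is a trimmed version of the posted document, and the name group_chapters is only illustrative.

```python
import xml.etree.ElementTree as ET

NS = {
    "office": "http://openoffice.org/2000/office",
    "text": "http://openoffice.org/2000/text",
}
TEXT = "{%s}" % NS["text"]  # ElementTree spells qualified names as {uri}local

# Trimmed version of the posted content.xml (assumption: only the parts
# needed to show the grouping are kept).
SAMPLE = """<office:document-content
    xmlns:office="http://openoffice.org/2000/office"
    xmlns:text="http://openoffice.org/2000/text">
  <office:body>
    <text:sequence-decls/>
    <text:h text:style-name="Heading 1" text:level="1">Kapitel 1</text:h>
    <text:p text:style-name="Standard">Dies ist mein Dokument.</text:p>
    <text:h text:style-name="Heading 1" text:level="1">Kapitel 2</text:h>
    <text:p text:style-name="Standard">Vor jedem neuen Kapitel soll gesplittet werden.</text:p>
  </office:body>
</office:document-content>"""


def group_chapters(body):
    """Return a list of chapters. Each chapter is a list of elements that
    starts with a level-1 text:h and runs up to, but not including, the next."""
    chapters = []
    for child in body:
        is_heading = child.tag == TEXT + "h" and child.get(TEXT + "level") == "1"
        if is_heading:
            chapters.append([child])      # a new chapter begins here
        elif chapters:
            chapters[-1].append(child)    # sibling belongs to the open chapter
        # children before the first heading are skipped
    return chapters


root = ET.fromstring(SAMPLE)
body = root.find("office:body", NS)
chapters = group_chapters(body)
for chapter in chapters:
    print(chapter[0].text, len(chapter))
```

The same "heading plus following siblings up to the next heading" rule is what any XSLT grouping solution would have to express as well.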
-- 
Charles Knell
cknell@xxxxxxxxxx - email



-----Original Message-----
From:     "Linnemann, Victor" <Linnemann@xxxxxxxxxxxxx>
Sent:     Wed, 30 Jul 2003 15:50:51 +0200
To:       XSL-List@xxxxxxxxxxxxxxxxxxxxxx
Subject:   split OpenOffice 1.1 documents (flat xml)

Hello everybody,
my question is about splitting large OpenOffice 1.1 documents (the
content.xml that you will see once you have unzipped the *.sxw) into single
chapters for translation purposes.
It's flat xml, and because of this I already looked in the XSL-FAQ under
http://www.dpawson.co.uk/xsl/sect2/flatfile.htm "Convert a flat XML
document", but I was not able to apply the suggested solution to my problem.
Each of the split files has to be a valid OpenOffice document and must
contain exactly one chapter (begins with <text:h ...>bla</text:h> and ends
just before the next <text:h ...>bla</text:h>).
***********************************************************
XML (sorry, very odd content):
***********************************************************
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE office:document-content PUBLIC "-//OpenOffice.org//DTD
OfficeDocument 1.0//EN" "office.dtd">
<office:document-content 
	xmlns:office="http://openoffice.org/2000/office" 
	xmlns:style="http://openoffice.org/2000/style" 
	xmlns:text="http://openoffice.org/2000/text" 
	(...)
	xmlns:script="http://openoffice.org/2000/script" office:class="text"
office:version="1.0">
<office:script/>
<office:font-decls>
	<style:font-decl style:name="Arial" fo:font-family="Arial"
style:font-family-generic="swiss" style:font-pitch="variable"/>
</office:font-decls>
<office:automatic-styles/>
<office:body>
	<text:sequence-decls>
		<text:sequence-decl text:display-outline-level="0"
text:name="Illustration"/>
		<text:sequence-decl text:display-outline-level="0"
text:name="Table"/>
		<text:sequence-decl text:display-outline-level="0"
text:name="Text"/>
		<text:sequence-decl text:display-outline-level="0"
text:name="Drawing"/>
	</text:sequence-decls>
	<text:h text:style-name="Heading 1" text:level="1">Kapitel
1</text:h>
	<text:p text:style-name="Standard">Dies ist mein Dokument.</text:p>
	<text:h text:style-name="Heading 1" text:level="1">Kapitel
2</text:h>
	<text:p text:style-name="Standard">Vor jedem neuen Kapitel soll
gesplittet werden.</text:p>
</office:body>
</office:document-content>
***********************************************************
desired result:
***********************************************************
The same document structure, but split file 1 has as its content

	<text:h text:style-name="Heading 1" text:level="1">Chapter
1</text:h>
	<text:p text:style-name="Standard">This is my content.</text:p>

whereas split file 2 has as its content

	<text:h text:style-name="Heading 1" text:level="1">Chapter
2</text:h>
	<text:p text:style-name="Standard">This is my other
content.</text:p>
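
The desired result described above (same document structure, one chapter per file) can be sketched as follows. This is a minimal Python illustration using only the standard library, not an XSLT solution. It assumes that each output keeps everything outside <office:body> unchanged, keeps the body children that precede the first level-1 heading (such as <text:sequence-decls>), and keeps exactly one chapter. The names split_document and is_chapter_start are hypothetical, the sample is a trimmed version of the posted document, and the DOCTYPE declaration and repackaging into a zip archive are not handled.

```python
import copy
import xml.etree.ElementTree as ET

OFFICE = "http://openoffice.org/2000/office"
TEXT = "http://openoffice.org/2000/text"
BODY = "{%s}body" % OFFICE

# Keep the familiar prefixes when serializing the results.
ET.register_namespace("office", OFFICE)
ET.register_namespace("text", TEXT)

# Trimmed version of the posted content.xml (assumption).
SAMPLE = """<office:document-content
    xmlns:office="http://openoffice.org/2000/office"
    xmlns:text="http://openoffice.org/2000/text" office:class="text">
  <office:script/>
  <office:body>
    <text:sequence-decls/>
    <text:h text:style-name="Heading 1" text:level="1">Kapitel 1</text:h>
    <text:p text:style-name="Standard">Dies ist mein Dokument.</text:p>
    <text:h text:style-name="Heading 1" text:level="1">Kapitel 2</text:h>
    <text:p text:style-name="Standard">Vor jedem neuen Kapitel soll gesplittet werden.</text:p>
  </office:body>
</office:document-content>"""


def is_chapter_start(elem):
    """True for a text:h element whose text:level attribute is 1."""
    return elem.tag == "{%s}h" % TEXT and elem.get("{%s}level" % TEXT) == "1"


def split_document(root):
    """Return one full copy of the document per chapter."""
    body = root.find(BODY)
    starts = [i for i, child in enumerate(body) if is_chapter_start(child)]
    if not starts:
        return []
    preamble = set(range(starts[0]))       # body children before chapter 1
    ends = starts[1:] + [len(body)]        # each chapter ends where the next starts
    documents = []
    for start, end in zip(starts, ends):
        doc = copy.deepcopy(root)          # the original tree stays untouched
        doc_body = doc.find(BODY)
        keep = preamble | set(range(start, end))
        for index, child in enumerate(list(doc_body)):
            if index not in keep:
                doc_body.remove(child)
        documents.append(doc)
    return documents


root = ET.fromstring(SAMPLE)
documents = split_document(root)
for doc in documents:
    print(ET.tostring(doc, encoding="unicode"))
```

Copying the whole tree and pruning the body is wasteful for very large documents, but it keeps the surrounding elements identical in every output without listing them by name.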

 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list



