
odf2xhtml: Processing nested element content separately

Subject: odf2xhtml: Processing nested element content separately?
From: "Andreas M." <sfamix@xxxxxxxxxxxxxxxxxxxxx>
Date: Fri, 27 Oct 2006 15:24:50 +0200
Hi,

I am trying to write an OASIS ODF -> XHTML stylesheet in XSLT. I want
the mapping to be as close to 1:1 as possible, but I have run into
some problems that I cannot find a way to solve.

I am using XSLT 1.0 and currently run the transformation with
MSXML.NET in oXygen.

A quick outline of the problem:

ODF takes a different approach to laying out text than HTML. HTML is
strict about this: html:p may contain no block elements, only inline
elements. The same holds for the inline elements themselves
(e.g. html:span, html:img, html:a): they may not contain
block elements (html:div, html:h*, etc.).

ODF, by contrast, can mix paragraphs with tables and frames (a frame
would most logically translate to html:div).

Now, if the source document has a paragraph containing a frame with
an image, and that image frame itself contains a paragraph of text
(a caption for the image), the problems start.

As far as my knowledge and skills go, it seems impossible to create a
clean ODF -> XHTML translation. Take a look at this horrible result,
which is of course completely invalid XHTML.


"content.xml" (the source):

<text:h text:style-name="Heading_20_1" text:outline-level="1">
	<draw:frame draw:style-name="fr2" draw:name="Grafik1"
	   text:anchor-type="paragraph" svg:x="2.27cm"
	   svg:y="2.057cm" svg:width="5.689cm" style:rel-width="22%"
	   svg:height="5.539cm" style:rel-height="scale"
	   draw:z-index="11">
		<draw:image 
		   xlink:href="Pictures/100000000000012C0000012CBED4AE2D.jpg"
		   xlink:type="simple"
		   xlink:show="embed" xlink:actuate="onLoad"/>
	</draw:frame>TITLE_TEXT
</text:h>
<text:p text:style-name="Text_20_body">
	<draw:frame draw:style-name="fr3"
	   draw:name="KaratekaPrincess" text:anchor-type="paragraph"
	   svg:x="15.727cm" svg:y="0.279cm"
	   svg:width="10.16cm" svg:height="7.17cm" draw:z-index="4">
	 	<draw:image xlink:href="Pictures/10000201000001800000010FE410B668.png"
		   xlink:type="simple"
		   xlink:show="embed" xlink:actuate="onLoad" />
	</draw:frame>
	SOME_PARAGRAPH_TEXT
		<text:span
		   text:style-name="Emphasis">THIS_WILL_BE_EMPHASIZED
		</text:span>.
	PARAGRAPH_TEXT_CONTINUES
</text:p>
			

"content.html" (the result):

<div style="top:2.27cm;left:2.057cm;height:5.539cm;width:5.689cm;border:1px solid black;">
	<img src="Pictures/100000000000012C0000012CBED4AE2D.jpg"
	     alt="Pictures/100000000000012C0000012CBED4AE2D.jpg"/>
</div>TITLE_TEXT
<p>
	<div style="top:15.727cm;left:0.279cm;height:7.17cm;width:10.16cm;border:1px solid black;">
		<img src="Pictures/10000201000001800000010FE410B668.png"
		     alt="Pictures/10000201000001800000010FE410B668.png"/>
	</div>SOME_PARAGRAPH_TEXT<span>THIS_WILL_BE_EMPHASIZED</span>.PARAGRAPH_TEXT_CONTINUES.
</p>


This is completely crazy!

Please note that both images are anchored "to paragraph" in OpenOffice
(text:anchor-type="paragraph"). So, in my opinion, the first image
should not end up inside <text:h>, since there is clearly a new
paragraph following the heading. In the document the title comes
_before_ the image, which is aligned at the side of the paragraph
following the heading.

I also have no clue what technique to use to get <text:p> and
<draw:frame> right. In HTML the only element that would match a
draw:frame is a <div>, but a <div> is not allowed within <p>. For ODF
this nesting is perfectly fine, and it is also perfectly legal to have
an <img> within a <p> in HTML, but as soon as the frame comes in,
there is a problem.

I would be very glad if someone knew of a solution, since right now
I turn everything into a <div>, and this is surely not how HTML
should be marked up.


I also checked the XSL FAQ, especially the entry about xsl:copy. I had
hoped that I could somehow rearrange the elements in question
programmatically: first extract all the text from the text:p element
and remember everything else contained in it, then process that
separately after the text:p has been transformed neatly into html:p.
However, if I select text() I get only the first fragment of the
text, and since I also need to issue an xsl:apply-templates, I even
get that text twice.
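
To make the idea concrete, here is a rough XSLT 1.0 sketch of the
rearrangement described above. It is only an illustration under my
own assumptions, not a tested solution: the choice to emit frames
directly after the closing </p> is arbitrary, the frame's position
and size attributes are left out, and only the elements from the
sample are handled. As far as I understand XSLT 1.0, xsl:value-of on
a node-set outputs the string value of the first node only, which
would explain the single text fragment, so the sketch relies on
xsl:apply-templates with a filtered select instead.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0"
    xmlns:draw="urn:oasis:names:tc:opendocument:xmlns:drawing:1.0"
    xmlns:xlink="http://www.w3.org/1999/xlink"
    xmlns="http://www.w3.org/1999/xhtml"
    exclude-result-prefixes="text draw xlink">

  <!-- Paragraph: inline content goes into <p>, frames are hoisted out. -->
  <xsl:template match="text:p">
    <p>
      <!-- All child nodes except frames, in document order. Text nodes
           are copied once by the built-in template rule. -->
      <xsl:apply-templates select="node()[not(self::draw:frame)]"/>
    </p>
    <!-- Assumption: frames become sibling <div>s after the paragraph. -->
    <xsl:apply-templates select="draw:frame"/>
  </xsl:template>

  <!-- Inline span stays inline. -->
  <xsl:template match="text:span">
    <span><xsl:apply-templates/></span>
  </xsl:template>

  <!-- Frame becomes a block-level container (styling omitted here). -->
  <xsl:template match="draw:frame">
    <div><xsl:apply-templates/></div>
  </xsl:template>

  <!-- Image: reuse the link as alt text, as in the result shown earlier. -->
  <xsl:template match="draw:image">
    <img src="{@xlink:href}" alt="{@xlink:href}"/>
  </xsl:template>

</xsl:stylesheet>
```

Because each child node is selected by exactly one of the two
xsl:apply-templates calls, no text should be emitted twice, and a
text:p nested inside a frame would end up as a <p> inside the <div>,
which is allowed. The same pattern could presumably be used for
text:h.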

Thanks.

-- 
Bye, 
Andreas M.
