
Re: trouble with preceding:: and parsing xhtml

Subject: Re: trouble with preceding:: and parsing xhtml
From: Chris Wolf <cw10025@xxxxxxxxx>
Date: Sun, 04 Oct 2009 00:59:14 -0400
Unfortunately, after I moved the application to Java (Xalan, or whatever is bundled
with JDK 1.5.x), it still renders *some* nodes with preceding::div[@tid='field'][1]
as the value of the first matching node in the document. For those nodes I tried
replacing "[1]" with "[last()]" again, but that hack only worked for some of them.

Other than when run programmatically, the stylesheet works perfectly fine with
"xsltproc" (MacOS/Linux) and "msxsl" on Windoze.

I also tried your Saxon-6.5.5 which works fine from the command line,
i.e. java -jar /opt/saxon-6.5.5/saxon.jar af.xhtml fbdata.xsl

...works.  Unfortunately, I get the same weird results when I replace
the default "javax.xml.transform.TransformerFactory" impl with
"com.icl.saxon.TransformerFactoryImpl".

Actually, Saxon won't even read the XSL file unless I override the parser and
revert to the JDK's built-in (Xerces) parser.  Unless I do that,
I get:

	at com.icl.saxon.PreparedStyleSheet.prepare(PreparedStyleSheet.java:121)
	at com.icl.saxon.TransformerFactoryImpl.newTemplates(TransformerFactoryImpl.java:120)
	at com.icl.saxon.TransformerFactoryImpl.newTransformer(TransformerFactoryImpl.java:72)
	at com.starclass.ciafb.parser.FbParser.main(FbParser.java:49)
Caused by: java.io.EOFException: no more input
	at com.icl.saxon.aelfred.XmlParser.popInput(XmlParser.java:4083)
	at com.icl.saxon.aelfred.XmlParser.pushURL(XmlParser.java:3620)
	at com.icl.saxon.aelfred.XmlParser.doParse(XmlParser.java:159)
	at com.icl.saxon.aelfred.SAXDriver.parse(SAXDriver.java:320)
	at com.icl.saxon.om.Builder.build(Builder.java:265)
	at com.icl.saxon.PreparedStyleSheet.prepare(PreparedStyleSheet.java:111)
	... 3 more
---------
java.io.EOFException: no more input
	at com.icl.saxon.aelfred.XmlParser.popInput(XmlParser.java:4083)
	at com.icl.saxon.aelfred.XmlParser.pushURL(XmlParser.java:3620)
	at com.icl.saxon.aelfred.XmlParser.doParse(XmlParser.java:159)
	at com.icl.saxon.aelfred.SAXDriver.parse(SAXDriver.java:320)
	at com.icl.saxon.om.Builder.build(Builder.java:265)
	at com.icl.saxon.PreparedStyleSheet.prepare(PreparedStyleSheet.java:111)
	at com.icl.saxon.TransformerFactoryImpl.newTemplates(TransformerFactoryImpl.java:120)
	at com.icl.saxon.TransformerFactoryImpl.newTransformer(TransformerFactoryImpl.java:72)
	at com.starclass.ciafb.parser.FbParser.main(FbParser.java:49)


Overriding the parser to be "com.sun.org.apache.xerces.internal.jaxp.SAXParserFactoryImpl"
fixes this, but the resulting transformation looks nothing like what
I get from the command line.

I'm using saxon-6.5.5 like this:

System.setProperty("javax.xml.transform.TransformerFactory", 
	"com.icl.saxon.TransformerFactoryImpl");
System.setProperty("javax.xml.parsers.SAXParserFactory", 
	"com.sun.org.apache.xerces.internal.jaxp.SAXParserFactoryImpl");

HtmlCleaner cleaner = new HtmlCleaner();
TagNode result = cleaner.clean(new File(fin), "utf-8");
Document doc = new DomSerializer(cleaner.getProperties(), true).createDOM(result);

TransformerFactory tFactory = TransformerFactory.newInstance();
StreamSource ss = new StreamSource(xsl);
Transformer xform = tFactory.newTransformer(ss);
StringWriter sw = new StringWriter();
StreamResult sr = new StreamResult(sw);

xform.transform(new DOMSource(doc), sr);
sw.flush();
System.out.println(sw.toString());
	

BTW, when I ran Saxon successfully from the command line, I fed it a document
produced by HtmlCleaner, also run from the command line, via:
java -jar /opt/jlib/htmlcleaner2_1.jar src=countrytemplate_af.html dest=af.data outcharset=utf-8
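
One way to narrow this down, offered as a sketch rather than a diagnosis: the command-line run reads a serialized file, while the program above hands the transformer an in-memory DOM. Serializing that DOM with an identity transform (no stylesheet involved) gives text that can be diffed against af.data. The class name DomDump, the helper names, and the stand-in document are hypothetical; in the real program the Document would be the one returned by DomSerializer.createDOM(...).

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class DomDump {

    /** Hypothetical helper: parses a string into a namespace-aware DOM. */
    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        return dbf.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
    }

    /** Serializes a DOM using the identity transform, i.e. without any stylesheet. */
    static String dump(Document doc) throws Exception {
        // newTransformer() with no arguments yields an identity transformer.
        Transformer identity = TransformerFactory.newInstance().newTransformer();
        // Force XML output so the dump is comparable with a serialized XML file.
        identity.setOutputProperty(OutputKeys.METHOD, "xml");
        identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        identity.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for the Document produced by HtmlCleaner's DomSerializer.
        Document doc = parse("<doc><div tid=\"field\"><a>Foo</a></div></doc>");
        System.out.println(dump(doc));
    }
}
```

If the dump differs from af.data (for instance in namespaces or element nesting), the stylesheet is seeing a different tree in the two runs, which would be worth ruling out before looking at the XPath again.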



Thanks,

  -Chris W.
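
As a cross-check on the "[1]" versus "[last()]" behaviour discussed in this thread, the predicates can be evaluated in isolation with the JDK's built-in javax.xml.xpath API. This is a minimal sketch on a made-up document: the class name PrecedingAxisDemo, the sample markup, and the field names are all hypothetical, and the sample deliberately omits the XHTML namespace so that no NamespaceContext is needed.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class PrecedingAxisDemo {

    // Hypothetical sample: two field/value pairs that are neither nested nor siblings.
    static final String XML =
          "<body>"
        + "<p><div tid='field'><a>First</a></div></p>"
        + "<p><div class='category_data'>one</div></p>"
        + "<p><div tid='field'><a>Second</a></div></p>"
        + "<p><div class='category_data'>two</div></p>"
        + "</body>";

    /**
     * Evaluates preceding::div[@tid='field'] plus the given extra predicate,
     * using the last category_data div as the context node, and returns the
     * result converted to a string (as xsl:value-of does in XSLT 1.0).
     */
    static String eval(String extraPredicate) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        Document doc = dbf.newDocumentBuilder()
            .parse(new InputSource(new StringReader(XML)));
        XPath xpath = XPathFactory.newInstance().newXPath();
        Node context = (Node) xpath.evaluate("//div[.='two']", doc, XPathConstants.NODE);
        return xpath.evaluate("preceding::div[@tid='field']" + extraPredicate, context);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(eval("[1]"));      // nearest preceding field: Second
        System.out.println(eval("[last()]")); // furthest preceding field: First
        System.out.println(eval(""));         // whole node-set, first in document order: First
    }
}
```

If a processor returns anything else for the "[1]" case on a small input like this, the XPath engine is suspect; if it returns the expected value here but not in the real run, the input tree is the more likely culprit.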

Michael Kay wrote:
> You're nearly there: you want  
> 
> preceding::div[@tid='field'][1]
> 
> Without the [1], you select all of them throughout the document; and if you
> then use something like xsl:value-of (in XSLT 1.0) then you get the one that
> is first in document order.
> 
>> Then I tried preceding::div[@tid='field' and last()] 
> 
> last() always gives a number that is 1 or more. "and last()" converts this
> number to a boolean, and any number other than 0 (or NaN) is treated as true.
> So you're adding "and true()" to your predicate, which doesn't change its
> result. You were probably thinking of
> 
> preceding::div[@tid='field'][last()]
> 
> which means
> 
> preceding::div[@tid='field'][position() = last()]
> 
> But numeric predicates attached to a reverse axis step count the nodes in
> reverse document order: 1 is the nearest, and last() is the furthest. So the
> correct predicate is [1].
> 
> Regards,
> 
> Michael Kay
> http://www.saxonica.com/
> http://twitter.com/michaelhkay 
>  
> 
>> -----Original Message-----
>> From: Chris Wolf [mailto:cw10025@xxxxxxxxx] 
>> Sent: 03 October 2009 20:37
>> To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
>> Subject:  trouble with preceding:: and parsing xhtml
>>
>> I have some xhtml documents that I want to process with XSL.  
>> The patterns that I'm interested in consist of a series of 
>> occurrences of the "div" element in pairs, as in:
>>
>> <xhtml...>
>> <head/>
>> <body..>
>> <table...>
>> <tr..>
>> <td...>
>> <div tid="field"><a href="...">Foo</a></div> <table...> 
>> <tr...> <td...> <div class="category_data">Bla,Bla,Bla</div>
>> <...>
>>
>> this pattern of the two pairs of div variations repeats an 
>> arbitrary number of times throughout the document and there 
>> could be other "div" elements interspersed, but not with the 
>> same qualifying attributes.
>>
>>
>> Note that the "div" with "class='category_data'" is not a 
>> descendant of the first "div[@tid='field']"
>> I don't think these pairs of DIVs are siblings either (at the 
>> same level).
>>
>> Basically, I'm trying to generate XML of name-value pairs 
>> where the name
>> comes from the content of the <a/> in the first 
>> "div[@tid='field']", and the value is the
>> content of the second "div[@class='category_data']".
>>
>> So the output should be:
>> <Field name="Foo">Bla,Bla,Bla</Field>
>>
>> Where the value of the "name" attribute is the content of the 
>> input doc's
>> div[@tid='field']/a, i.e. in this example, 'Foo'
>>
>> ...and the content of "Field" is the content of the input doc's
>> div[@class='category_data']
>>
>>
>>
>> Since the second div is not a descendant of the first, I 
>> can't capture 
>> the <a/> content in a variable and call <xsl:apply-templates 
>> select="div[@class='category_data']"/>
>> with a parameter.
>>
>> The question is how else to pass data from one template to 
>> another template?
>>
>> I tried "reaching back" from the second template by using 
>> preceding::div[@tid='field']
>> but this retrieved the value of the first node matching 
>> "div[@tid='field']" not
>> the immediately preceding node that matches, as I would have 
>> expected.  Then I tried
>> preceding::div[@tid='field' and last()] - same result; always 
>> the same value and
>> always the value of the very first node that matched.
>>
>> I guess I have no idea how "preceding::" is supposed to work.
>>
>>
>> I would greatly appreciate any help.  
>>
>> Thanks,
>>
>>    -Chris
>>
>> <xsl:stylesheet version="1.0"
>>     xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
>>     xmlns:h="http://www.w3.org/1999/xhtml">
>>
>> <xsl:output method="xml" indent="yes"/>
>> <xsl:strip-space elements="div a"/>
>>
>> <xsl:template match="/">
>>   <xsl:message>***** ROOT</xsl:message>
>>     <xsl:apply-templates select="//h:div"/>
>> </xsl:template>
>>
>> <xsl:template match="h:div[@tid='field']">
>>   <xsl:message>***** DIV1</xsl:message>
>>   <xsl:apply-templates select="h:div"/>
>> </xsl:template>
>>
>> <xsl:template match="h:div[@class='category_data']">
>>   <xsl:param name="fname"/>
>>   <xsl:message>***** DIV2</xsl:message>
>>   <xsl:message>^<xsl:value-of 
>> select="preceding::h:div[@tid='field']"/>^</xsl:message>
>>   <xsl:element name="Field">
>>     <xsl:attribute name="name">
>>       <xsl:value-of select="preceding::h:div[@tid='field']"/>
>>     </xsl:attribute>
>>     <xsl:value-of select="."/>
>>   </xsl:element><xsl:text>
>> </xsl:text>
>>         <xsl:apply-templates/>
>> </xsl:template>
>>
>> <xsl:template match="text()">
>>   <xsl:message>***** TEXT</xsl:message>
>>     <xsl:apply-templates/>
>> </xsl:template>
>>
>> </xsl:stylesheet>
