
Re: Dealing mixed content with invalid node-like text

Subject: Re: Dealing mixed content with invalid node-like text
From: Karlmarx R <karlmarxr@xxxxxxxxx>
Date: Wed, 7 Dec 2011 06:42:03 +0800 (SGT)
Hello David,

Yes, I do process the content in two stages: I preprocess it into one
form of XML and then process that further into my final XML form. BUT both
stages are done in XSL, in one single file, and the problem that I reported
is in the first-stage conversion itself. To make things clearer, here is a
rough skeleton and explanation of my process. I get the entire content of the
input into a variable $input-text, and then tokenize it to get each line of
data into another variable, as below.

<xsl:variable name="lines" select="tokenize($input-text, '\r?\n')"/>
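How $input-text itself gets populated is not shown in this skeleton. Purely as an illustration, and not necessarily how it is done here, one common XSLT 2.0 route is unparsed-text() called from a named entry template. In the sketch below the parameter input-uri, the file name input.txt, the UTF-8 encoding, the template name "main" and the <lines>/<line> wrapper elements are all assumptions made up for the example.

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <!-- Hypothetical parameter: location of the plain-text input file. -->
  <xsl:param name="input-uri" as="xs:string" select="'input.txt'"/>

  <!-- Hypothetical named entry point, used because there is no XML source. -->
  <xsl:template name="main">
    <!-- Read the whole file as one string; the encoding is an assumption. -->
    <xsl:variable name="input-text" as="xs:string"
                  select="unparsed-text($input-uri, 'UTF-8')"/>
    <!-- Split on LF or CRLF, giving one string per line. -->
    <xsl:variable name="lines" as="xs:string*"
                  select="tokenize($input-text, '\r?\n')"/>
    <!-- Emit each line as text so the split can be inspected. -->
    <lines>
      <xsl:for-each select="$lines">
        <line><xsl:value-of select="."/></line>
      </xsl:for-each>
    </lines>
  </xsl:template>

</xsl:stylesheet>
```

With Saxon this kind of stylesheet can be started from the named template (for example with the -it:main option). Note that a file ending in a newline yields one trailing empty string from tokenize().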

<!-- Then pass it to another template to process each line of data: -->
<xsl:call-template name="process-lines">
  <xsl:with-param name="lines" select="$lines"/>
</xsl:call-template>

<!-- And here, I further process it to select the REQUIRED value: -->
<xsl:template name="process-lines">
  <xsl:param name="lines" as="xs:string*"/>
  <xsl:for-each select="$lines">
    <!-- Split each line on tabs; only the last field is processed. -->
    <xsl:variable name="line-components" select="tokenize(., '\t')"/>
    <xsl:for-each select="$line-components[position() = last()]">
      <value>
        <xsl:call-template name="tag-text">
          <xsl:with-param name="unparsed" select="."/>
        </xsl:call-template>
      </value>
    </xsl:for-each>
  </xsl:for-each>
</xsl:template>


<!-- AND IT IS HERE, in this "tag-text" template, that I try to achieve
what I explained in my original posting: -->
<xsl:template name="tag-text">
  <xsl:param name="unparsed" required="yes"/>
  <xsl:analyze-string select="$unparsed"
      regex="^(.*?)&lt;(.+)&gt;(.*)&lt;/(.+)&gt;(.*?)$">
    etc., as posted earlier.

The skeleton input looks like this (as I mentioned before):

Line one text <b>within valid node</b> and like <II .> Title etc
Line two with <1a .> Title etc, <i>within</i> <b>something</b> etc
another line can be just normal text
....

And it is vital here that I get the data in the way I want, so that our
final output in stage two is correct. In view of this, I cannot use
xsl:value-of with d-o-e (disable-output-escaping) here. As it seems this
cannot be achieved in XSL (which looks likely), I am trying to get my source
corrected. But if there is a solution available, in XSL or with a better
regex, I would be happy to use it. I hope the above clarifies your question.
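One direction that might be worth trying, sketched here under assumptions rather than offered as a tested solution, is to stop matching any <...> pair and instead match only a fixed list of known element names, leaving everything else (such as <II .> or <1a .>) as plain text. In the sketch below, the whitelist 'b|i', the variable name allowed-names and the "main" demo template are hypothetical. Only the template name "tag-text" and its "unparsed" parameter come from the skeleton above.

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <!-- Assumption: only these element names count as real markup. -->
  <xsl:variable name="allowed-names" as="xs:string" select="'b|i'"/>

  <!-- Hypothetical entry point that runs tag-text on one sample line. -->
  <xsl:template name="main">
    <value>
      <xsl:call-template name="tag-text">
        <xsl:with-param name="unparsed"
            select="'Line two with &lt;1a .&gt; Title etc, &lt;i&gt;within&lt;/i&gt; &lt;b&gt;something&lt;/b&gt; etc'"/>
      </xsl:call-template>
    </value>
  </xsl:template>

  <xsl:template name="tag-text">
    <xsl:param name="unparsed" as="xs:string" required="yes"/>
    <!-- The regex attribute is an attribute value template, so
         {$allowed-names} is expanded before matching. \1 is a
         back-reference that forces the closing tag to use the same
         name as the opening tag. -->
    <xsl:analyze-string select="$unparsed"
        regex="&lt;({$allowed-names})&gt;(.*?)&lt;/\1&gt;">
      <xsl:matching-substring>
        <!-- A whitelisted tag pair: build a real element and recurse
             into its content to pick up any markup inside it. -->
        <xsl:element name="{regex-group(1)}">
          <xsl:call-template name="tag-text">
            <xsl:with-param name="unparsed" select="regex-group(2)"/>
          </xsl:call-template>
        </xsl:element>
      </xsl:matching-substring>
      <xsl:non-matching-substring>
        <!-- Anything else, including node-like text, stays a text node
             and is escaped by the serializer. -->
        <xsl:value-of select="."/>
      </xsl:non-matching-substring>
    </xsl:analyze-string>
  </xsl:template>

</xsl:stylesheet>
```

For the sample line, this should give <i> and <b> as real elements inside <value>, while <1a .> remains escaped text, with no d-o-e involved. Because the match is reluctant, an element nested inside another element of the same name (a <b> within a <b>) would not be paired correctly, and tags carrying attributes are not matched at all.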

Thanks,
Karl


----- Original Message -----
From: David Carlisle <davidc@xxxxxxxxx>
To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
Subject: Re: Re: Dealing mixed content with invalid node-like text


> and you can assume it as something like a text file format

but your post said that you were using xsl:analyze-string,
which means that you must somehow be pre-processing your text format into XML
before it gets to XSLT as otherwise the input would not be well formed and
XSLT would not even start. We can't help with the XSLT question you asked
unless we know what the input looks like _to XSLT_.

David      
