[XQuery Talk Mailing List Archive Home] [By Date] [By Thread] [By Subject] [By Author] [Recent Entries] [Reply To This Message]

Excel XML Cell index problem

Michael Kay mike at saxonica.com
Thu Aug 20 23:50:27 PDT 2009


  Excel XML Cell index problem
It's a tough format to work with, this one. Generally it's best to
preprocess it either

(a) into a format where every Cell is explicitly present, or

(b) into a format where every Cell is explicitly labelled with row and
column attributes.

The advantage is that this preprocessing is completely generic and reusable
and can be done whatever processing you intend to do next.

I found a neat way to add column numbers to all the cells using XSLT 2.0
grouping capabilities. It's something like this:

<xsl:for-each-group select="Cell" group-starting-with="*[@Index]">
  <xsl:variable name="StartIndex" select="(@Index, 1)[1]"/>
  <xsl:for-each select="current-group()">
    <xsl:copy>
       <xsl:attribute name="Index" select="$StartIndex + position() - 1"/>
       <xsl:copy-of select="@* | node()"/>
    </xsl:copy>
  </xsl:for-each>
</xsl:for-each-group>


With XQuery it will need a recursive function. If you're comfortable with
recursion it's probably not very difficult, except at this time of night.

Regards,

Michael Kay
http://www.saxonica.com/
http://twitter.com/michaelhkay  

> -----Original Message-----
> From: http://x-query.com/mailman/listinfo/talk 
> [mailto:http://x-query.com/mailman/listinfo/talk] On Behalf Of Jesse Wu
> Sent: 20 August 2009 20:16
> To: http://x-query.com/mailman/listinfo/talk
> Subject:  Excel XML Cell index problem
> 
> Trying to evaluate if XQuery is the right tool to use to 
> process large Excel workbook, XLS when save as XML:
> <Workbook><Worksheet><Table>
> <Row>
>  <Cell>AAA</Cell>
>  <Cell Index="5" HRef= "http:??...">YYY</Cell>  .
>  .
>  <Cell Index="23">XXX</Cell>
>  .
> </Row>
> <Row>
> ...
> </Row>
> 
> My XQuery code is looping through each row then the cells in 
> each row, but unfortunately Microsoft Excel when save as XML 
> skips empty cells, as the example above in the 1st row, the 
> cells(columns) from 2 to 4 were missing before the 2nd cell 
> with Index="5" which means it's actually column 5 (I was 
> expecting to have 3 <Cell /> before it). Is there a way to 
> code this in Xquery?
> 
> My code:
> 
> let $pads : = 0
> for $i in 1 to count($Row/Cell)
> let $f := $i + $pads
> return
> <dummy>
> {  xu:padcells($Row/Cell[$i], $f) }
> 
> {$tr/Cell[$i]}
> 
> </dummy>
> 
> };
> 
> declare function xu:padcells($cl, $f)
> {
> let $t := xs:integer($cl/@Index) -1
> (: let $pads := $pads + ($f - $t )+ 1 is there a way to get 
> the total number of empty cells padded back to the calling 
> code? :) for $x in $f to $t return
> 	<Cell Index="{$x}"></Cell>	
> };
> 
> Thanks in advance for all your help.
> 
> -Jes
> _______________________________________________
> http://x-query.com/mailman/listinfo/talk
> http://x-query.com/mailman/listinfo/talk



PURCHASE STYLUS STUDIO ONLINE TODAY!

Purchasing Stylus Studio from our online shop is Easy, Secure and Value Priced!

Buy Stylus Studio Now

Download The World's Best XML IDE!

Accelerate XML development with our award-winning XML IDE - Download a free trial today!

Don't miss another message! Subscribe to this list today.
Email
First Name
Last Name
Company
Subscribe in XML format
RSS 2.0
Atom 0.3
Site Map | Privacy Policy | Terms of Use | Trademarks
Free Stylus Studio XML Training:
W3C Member
Stylus Studio® and DataDirect XQuery ™are products from DataDirect Technologies, is a registered trademark of Progress Software Corporation, in the U.S. and other countries. © 2004-2011 All Rights Reserved.