|
[XQuery Talk Mailing List Archive Home] [By Date] [By Thread] [By Subject] [By Author] [Recent Entries] [Reply To This Message] Excel XML Cell index problemMichael Kay mike at saxonica.comThu Aug 20 23:50:27 PDT 2009
It's a tough format to work with, this one. Generally it's best to
preprocess it either
(a) into a format where every Cell is explicitly present, or
(b) into a format where every Cell is explicitly labelled with row and
column attributes.
The advantage is that this preprocessing is completely generic and reusable
and can be done whatever processing you intend to do next.
I found a neat way to add column numbers to all the cells using XSLT 2.0
grouping capabilities. It's something like this:
<xsl:for-each-group select="Cell" group-starting-with="*[@Index]">
<xsl:variable name="StartIndex" select="(@Index, 1)[1]"/>
<xsl:for-each select="current-group()">
<xsl:copy>
<xsl:attribute name="Index" select="$StartIndex + position() - 1"/>
<xsl:copy-of select="@* | node()"/>
</xsl:copy>
</xsl:for-each>
</xsl:for-each-group>
With XQuery it will need a recursive function. If you're comfortable with
recursion it's probably not very difficult, except at this time of night.
Regards,
Michael Kay
http://www.saxonica.com/
http://twitter.com/michaelhkay
> -----Original Message-----
> From: http://x-query.com/mailman/listinfo/talk
> [mailto:http://x-query.com/mailman/listinfo/talk] On Behalf Of Jesse Wu
> Sent: 20 August 2009 20:16
> To: http://x-query.com/mailman/listinfo/talk
> Subject: Excel XML Cell index problem
>
> Trying to evaluate if XQuery is the right tool to use to
> process large Excel workbook, XLS when save as XML:
> <Workbook><Worksheet><Table>
> <Row>
> <Cell>AAA</Cell>
> <Cell Index="5" HRef= "http:??...">YYY</Cell> .
> .
> <Cell Index="23">XXX</Cell>
> .
> </Row>
> <Row>
> ...
> </Row>
>
> My XQuery code is looping through each row then the cells in
> each row, but unfortunately Microsoft Excel when save as XML
> skips empty cells, as the example above in the 1st row, the
> cells(columns) from 2 to 4 were missing before the 2nd cell
> with Index="5" which means it's actually column 5 (I was
> expecting to have 3 <Cell /> before it). Is there a way to
> code this in Xquery?
>
> My code:
>
> let $pads : = 0
> for $i in 1 to count($Row/Cell)
> let $f := $i + $pads
> return
> <dummy>
> { xu:padcells($Row/Cell[$i], $f) }
>
> {$tr/Cell[$i]}
>
> </dummy>
>
> };
>
> declare function xu:padcells($cl, $f)
> {
> let $t := xs:integer($cl/@Index) -1
> (: let $pads := $pads + ($f - $t )+ 1 is there a way to get
> the total number of empty cells padded back to the calling
> code? :) for $x in $f to $t return
> <Cell Index="{$x}"></Cell>
> };
>
> Thanks in advance for all your help.
>
> -Jes
> _______________________________________________
> http://x-query.com/mailman/listinfo/talk
> http://x-query.com/mailman/listinfo/talk
|
PURCHASE STYLUS STUDIO ONLINE TODAY!Purchasing Stylus Studio from our online shop is Easy, Secure and Value Priced! Download The World's Best XML IDE!Accelerate XML development with our award-winning XML IDE - Download a free trial today! Subscribe in XML format
|






