[XQuery Talk Mailing List Archive Home] [By Date] [By Thread] [By Subject] [By Author] [Recent Entries] [Reply To This Message]

Excel XML Cell index problem

G. Ken Holman gkholman at CraneSoftwrights.com
Thu Aug 20 18:38:35 PDT 2009


  Excel XML Cell index problem
At 2009-08-20 15:15 -0400, Jesse Wu wrote:
>Trying to evaluate if XQuery is the right tool to use to process large
>Excel workbook, XLS when save as XML:

Absolutely it is.

><Workbook><Worksheet><Table>
><Row>
>  <Cell>AAA</Cell>
>  <Cell Index="5" HRef= "http:??...">YYY</Cell>
>  .
>  .
>  <Cell Index="23">XXX</Cell>
>  .
></Row>
><Row>
>...
></Row>
>
>My XQuery code is looping through each row then the cells in each row,

So far so good ... but I think you are looking at the cells in the 
incorrect way.

>but unfortunately Microsoft Excel when save as XML skips empty cells,
>as the example above in the 1st row, the cells(columns) from 2 to 4
>were missing before the 2nd cell with Index="5" which means it's
>actually column 5 (I was expecting to have 3 <Cell /> before it).

So you want to create pad cells when you go to do "5" and find that 
it is out of sequence.  I seem to recall also that cells after an 
indexed cell may not have an index attribute, so I've augmented your 
data with more conditions in the input.

>Is there a way to code this in Xquery?

Absolutely there is a way, but I think not by using the approach you did.

>My code:
>
>let $pads : = 0
>for $i in 1 to count($Row/Cell)
>let $f := $i + $pads
>return
><dummy>
>{  xu:padcells($Row/Cell[$i], $f) }
>
>{$tr/Cell[$i]}
>
></dummy>
>
>};
>
>declare function xu:padcells($cl, $f)
>{
>let $t := xs:integer($cl/@Index) -1
>(: let $pads := $pads + ($f - $t )+ 1 is there a way to get the total
>number of empty cells padded back to the calling code? :)
>for $x in $f to $t
>return
>         <Cell Index="{$x}"></Cell>
>};
>
>Thanks in advance for all your help.

It is unclear to me why you are treating the collection of cells as 
an array.  That is a "programming approach" to working with XPath 
rather than a "tree approach" or a "path approach", nor is it taking 
advantage of the XPath function and operator library.

Below is a working example that fills out all of the missing cells, 
and it does not do so by walking through an array index.  Rather, it 
deals with missing cells when necessary, and does nothing with 
missing cells when they aren't necessary, and it walks over the cells 
one at a time without putting them all first into a variable.

I find in the classroom those XQuery students who have a programming 
background will use constructs like arrays when they are not at all 
necessary.  And in some ways, I feel doing so breaks the tree-based 
approach of working with XML.

I hope you find the example below helpful.

. . . . . . . . . . Ken

T:\ftemp>type jesse.xml
<Table>
<Row>
  <Cell>AAA</Cell>
  <Cell Index="5" HRef= "http:??...">YYY</Cell>
  <Cell>BBB</Cell>
  <Cell>CCC</Cell>
  <Cell Index="12" HRef= "http:??...">ZZZ</Cell>
  <Cell>DDD</Cell>
  <Cell Index="23">XXX</Cell>
</Row>
</Table>

T:\ftemp>call xquery jesse.xml jesse.xq
<?xml version="1.0" encoding="UTF-8"?>
<dummy>
    <outCell outIndex="1">AAA</outCell>
    <outCell outIndex="2"/>
    <outCell outIndex="3"/>
    <outCell outIndex="4"/>
    <outCell outIndex="5">YYY</outCell>
    <outCell outIndex="6">BBB</outCell>
    <outCell outIndex="7">CCC</outCell>
    <outCell outIndex="8"/>
    <outCell outIndex="9"/>
    <outCell outIndex="10"/>
    <outCell outIndex="11"/>
    <outCell outIndex="12">ZZZ</outCell>
    <outCell outIndex="13">DDD</outCell>
    <outCell outIndex="14"/>
    <outCell outIndex="15"/>
    <outCell outIndex="16"/>
    <outCell outIndex="17"/>
    <outCell outIndex="18"/>
    <outCell outIndex="19"/>
    <outCell outIndex="20"/>
    <outCell outIndex="21"/>
    <outCell outIndex="22"/>
    <outCell outIndex="23">XXX</outCell>
</dummy>
T:\ftemp>type jesse.xq
<dummy>{

(:Walk all cells in the row:)
for $c in Table/Row/Cell

(:determine count of immediately preceding cells without an index attribute:)
let $preceding-non-indexed := count( $c/preceding-sibling::Cell[not(@Index)] )
                               - count( $c/preceding-sibling::Cell[@Index][1]/
                                         preceding-sibling::Cell[not(@Index)] )

(:which was the previously-specified index?:)
let $preceding-index := ( $c/preceding-sibling::Cell[@Index][1]/@Index,
                           0 )[1]

(:which was the last-specified index?:)
let $preceding-or-this-index := ( $c/@Index,
                                   $c/preceding-sibling::Cell[@Index][1]/@Index,
                                   0 )[1]
   return
     (
       (:first fill any absent cells:)
       for $fill-index in xs:integer( $preceding-index +
                                      $preceding-non-indexed +
                                      1 )
                          to
                          xs:integer( $preceding-or-this-index - 1 )
         return <outCell outIndex="{$fill-index}"/>,

       (:then put out this given cell with an appropriate index:)
       <outCell outIndex="{if( $c/@Index )
                             then (:index given:) $c/@Index
                             else (:index calculated :)
                                  ( $preceding-index +
                                    $preceding-non-indexed +
                                    1)
                          }">{$c/node()}</outCell>
     )
}</dummy>
T:\ftemp>rem Done!



--
Interested in these classes?  http://www.CraneSoftwrights.com/q/i/
Crane Softwrights Ltd.          http://www.CraneSoftwrights.com/q/
Training tools: Comprehensive interactive XSLT/XPath 1.0/2.0 video
Video lesson:    http://www.youtube.com/watch?v=PrNjJCh7Ppg&fmt=18
Video overview:  http://www.youtube.com/watch?v=VTiodiij6gE&fmt=18
G. Ken Holman                 mailto:http://x-query.com/mailman/listinfo/talk
Male Cancer Awareness Nov'07  http://www.CraneSoftwrights.com/q/bc
Legal business disclaimers:  http://www.CraneSoftwrights.com/legal



PURCHASE STYLUS STUDIO ONLINE TODAY!

Purchasing Stylus Studio from our online shop is Easy, Secure and Value Priced!

Buy Stylus Studio Now

Download The World's Best XML IDE!

Accelerate XML development with our award-winning XML IDE - Download a free trial today!

Don't miss another message! Subscribe to this list today.
Email
First Name
Last Name
Company
Subscribe in XML format
RSS 2.0
Atom 0.3
Site Map | Privacy Policy | Terms of Use | Trademarks
Free Stylus Studio XML Training:
W3C Member
Stylus Studio® and DataDirect XQuery ™are products from DataDirect Technologies, is a registered trademark of Progress Software Corporation, in the U.S. and other countries. © 2004-2011 All Rights Reserved.