[XQuery Talk Mailing List Archive Home] [By Date] [By Thread] [By Subject] [By Author] [Recent Entries] [Reply To This Message]

solutions to an Excel XML processing problem

Michael Kay mhk at mhk.me.uk
Thu Aug 24 01:11:43 PDT 2006


excel xml
It's a classical positional grouping problem, as discussed in

http://www.ximep-2006.org/papers/Paper-Kay.pdf

though it's not actually one of the use cases presented in that paper.

In practice I've usually found it's more convenient for subsequent
processing to convert to a form (F) in which the absent cells are still
absent, but all the cells that are present have an explicit column number.
You can of course get from F to your preferred target form (G) quite simply
using something like

for $i in 1 to max(F/c/@i)
return (F/c[@i=$i], <c/>)[1]

(the hardest bit is deleting the @i attributes, which I left out because
it's absurdly difficult in XQuery 1.0 if the cells can have arbitrary
content)

Getting to form F can be done as

for $col in $row/c
return
  if ($col/@i) then $col
  else 
    let $p = $col/preceding-sibling::c,
        $p1 = $p[@i][last()]
        $pos = $p1/@i + count($p) - count($p1/preceding-sibling::c)
    return <c i="{$pos}">{$col/node()}</c>

(untested) or of course by a recursive function. This is assuming that the
preceding-sibling axis is supported - it's still an optional feature,
unfortunately.


In XSLT 2.0 (which is what I actually use for this problem, frequently!)
it's

<xsl:for-each-group select="c" group-starting-with="c[@i]">
  <xsl:for-each select="current-group()">
    <c i="{current-group()[1]/@i + position() - 1}">
      <xsl:copy-of select="child::node()"/>
    </c>
  </xsl:for-each>
</xsl:for-each-group>

Michael Kay
http://www.saxonica.com/




> -----Original Message-----
> From: http://xquery.com/mailman/listinfo/talk 
> [mailto:http://xquery.com/mailman/listinfo/talk] On Behalf Of Daniela Florescu
> Sent: 23 August 2006 23:33
> To: XQuery talk
> Subject:  solutions to an Excel XML processing problem
> 
> Dear all,
> 
> I am trying to find various solutions to the following XML to 
> XML transformation problem.
>   It is taken out of real world Excel spreadsheets data processing.
> 
> (Input) The data is composed of cells and some of the cells 
> are empty (no content).
> This version of the data is in compressed form, i.e. does not 
> contain the empty cells and is using the attribute @i to 
> express jumps to the  next non empty cell (using the 
> position). It looks like this:
> =====================================================================
> 
> 
> <row cnt="30">
> <c i="3">third</c>
> <c>fouth</c>
> <c>fifth</c>
> <c i="10">tenth</c>
> <c>eleventh</c>
> <c i="20">twentieth</c>
> <c>twenty-one</c>
> <c>twenty-two</c>
> <c>twenty-three</c>
> </row>
> 
> 
> (Ouput) Write an XQuery to transform the XML data in the 
> uncompressed form that includes the empty cells, as follows:
> =============================================================
> 
> <row cnt="30">
>   <c/>
>    <c/>
> <c>third</c>
> <c>fouth</c>
> <c>fifth</c>
>   <c/>
>   <c/>
> <c/>
>   <c/>
> <c>tenth</c>
> <c>eleventh</c>
>   <c/>
>   <c/>
> <c/>
>   <c/>
> <c/>
>   <c/>
> <c/>
>   <c/>
> <c i="20">twentieth</c>
> <c>twenty-one</c>
> <c>twenty-two</c>
> <c>twenty-three</c>
>   ...
> </row>
> 
> 
> I know it can be done in a variety of ways, I am just looking 
> for the "best" (i.e. simplest) XQuery to achieve this purpose 
> that people on this list can help me find.
> 
> Thanks in advance,
> Dana
> 
> _______________________________________________
> http://xquery.com/mailman/listinfo/talk
> http://xquery.com/mailman/listinfo/talk



Purchase Stylus Studio Online Today!

Purchasing Stylus Studio from our online shop is Easy, Secure and Value Priced!

Buy Stylus Studio Now

Download The World's Best XML IDE!

Accelerate XML development with our award-winning XML IDE - Download a free trial today!

Don't miss another message! Subscribe to this list today.
Email
First Name
Last Name
Company
Subscribe in XML format
RSS 2.0
Atom 0.3
Site Map | Privacy Policy | Terms of Use | Trademarks
Free Stylus Studio XML Training:
W3C Member
Stylus Studio® and DataDirect XQuery™are products from DataDirect Technologies, is a registered trademark of Progress Software Corporation, in the U.S. and other countries. © 2004-2007 All Rights Reserved.