[XSL-LIST Mailing List Archive Home] [By Thread] [By Date] [Recent Entries] [Reply To This Message]

Re: finding a minimum set of references

Subject: Re: finding a minimum set of references
From: Wolfgang Laun <wolfgang.laun@xxxxxxxxx>
Date: Wed, 9 Jan 2013 09:02:54 +0100
Re:  finding a minimum set of references
If I've understood the problem correctly...

You can use a key to map a location (@area,@cite) to a sequence of
element(tree). Iterating over the tree elements, do not copy a loc
unless it is first in the sequence returned by that key: This
guarantees that all locs are covered. It may happen that this
eliminates all locs from a tree: if this must not happen, copy an
arbitrary loc.

(If the pass over all tree elements is written as a recursive
function, you may pass down a set of loc's that has been used, and
select one from this set if it can be used to avoid the empty set
within a tree.)

This may not produce the smalles set, but it should at least deal with
all necessities.

-W

On 09/01/2013, Graydon <graydon@xxxxxxxxx> wrote:
> So I have a lot (~600) tree elements that look like:
>
>     <tree parent="attribution">
>         <count count="1"/>
>         <locations>
>             <loc area="loiprop" cite="2010-07-prop10"/>
>         </locations>
>     </tree>
>     <tree parent="block">
>         <count count="10"/>
>         <locations>
>             <loc area="loiprop" cite="2004-09-bud2004"/>
>             <loc area="loiprop" cite="2010-07-prop10"/>
>             <loc area="loiprop" cite="2010-08-allprop"/>
>         </locations>
>     </tree>
>     <tree parent="block">
>         <key>section</key>
>         <count count="6689"/>
>         <locations>
>             <loc area="CRCc945-en" cite="2606"/>
>             <loc area="CRCc945-en" cite="402"/>
>             <loc area="CRCc945-en" cite="6804"/>
>             <loc area="CRCc945-en" cite="8301"/>
>             <loc area="CRCc945-en" cite="8308.1"/>
>         ....
>         </locations>
>         <child>section</child>
>     </tree>
>
> The @area,@cite pairs represent documents with the locations of particular
> patterns (via the @parent and the child elements of the tree elements) of
> documentation content, one tree element per pattern.  (So there are lots of
> tree/@parent="block" elements, etc.)
>
> What's wanted is the minimum number of documents that contain _all_ the
> patterns, for output testing purposes.
>
> So I need to produce the, or at least _a_, shortest list of <loc/> elements
> so that every (all ~600) tree element contains at least one loc element on
> the list.
>
> This has turned out to be much less straightforward than I thought.  I can't
> shake the feeling that there's a grouping solution, but I'm not seeing it if
> there is.  So I'd appreciate any algorithm hints anybody has got;
> unfortunately, efficiency is a concern.
>
> Demonstration that this is really an NP-complete problem also grateful
> accepted!
>
> Thanks!
> Graydon

Current Thread

PURCHASE STYLUS STUDIO ONLINE TODAY!

Purchasing Stylus Studio from our online shop is Easy, Secure and Value Priced!

Buy Stylus Studio Now

Download The World's Best XML IDE!

Accelerate XML development with our award-winning XML IDE - Download a free trial today!

Don't miss another message! Subscribe to this list today.
Email
First Name
Last Name
Company
Subscribe in XML format
RSS 2.0
Atom 0.3
Site Map | Privacy Policy | Terms of Use | Trademarks
Free Stylus Studio XML Training:
W3C Member
Stylus Studio® and DataDirect XQuery ™are products from DataDirect Technologies, is a registered trademark of Progress Software Corporation, in the U.S. and other countries. © 2004-2013 All Rights Reserved.