RE: How To Calculate Set of Unique Values Across a Tree of Input Documents
There was a recent thread on processing graphs in XSLT 2.0, see
http://markmail.org/message/tlletsiznepd5no6

I provided a (sketch of a) solution that involved listing all the paths
starting at a given node (while avoiding looping in the event of a cycle);
a simple adaptation of that will give you all the nodes reachable from a
given node. In your case the node identifiers can be obtained using
document-uri(); you then simply need to apply distinct-values() to the
returned set of URIs.

Michael Kay
http://www.saxonica.com/

> -----Original Message-----
> From: Eliot Kimber [mailto:ekimber@xxxxxxxxxxxx]
> Sent: 21 March 2008 18:52
> To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
> Subject: How To Calculate Set of Unique Values Across a
> Tree of Input Documents
>
> I have a tree of DITA map documents where each map references
> zero or more other map or topic documents. The same map or
> topic could be referenced multiple times.
>
> I need to calculate the "bounded object set" of unique
> documents referenced from within the compound map so that I
> can then use an XSLT process to create new copies of each
> document. Since I can't write to a given result more than
> once I have to first remove any duplicates.
>
> Each target document is referenced by a relative URI that can
> be different for different references to the same file (and
> in fact will almost always be different in my particular data set).
>
> I am using XSLT 2.
>
> Because key() tables are bound to input documents I don't
> think I can build a table of references indexed by target
> document URI (that is, the absolute URI of the target of the
> reference). If I could I would simply build that table and
> then just process the first member of each entry.
>
> I can't think of any other efficient way to approach this.
> The best idea I can come up with is to build an intermediate
> document that reflects each document reference and then use
> something like for-each-group on that to treat it as a set
> for the purpose of processing each referenced file exactly
> once. If I build a flat list of elements containing the
> document URI of each reference I can easily sort the values
> and then remove duplicates. So maybe that's as efficient as
> anything else would be.
>
> My other challenge is that my input data set is very large so
> I have the potential to run into memory issues, so it may be
> that writing out an intermediate file as part of a
> multi-stage, multi-transform pipeline is the best process,
> but my current processor will handle the entire data set in
> one process for the purpose of applying the (mostly) identity
> transform to the map set.
>
> Can anyone suggest other solution approaches to this problem?
>
> Once again I feel like I might be missing a clever solution
> hidden in the haze of my XSLT 1 brain damage.
>
> Thanks,
>
> Eliot
>
> --
> Eliot Kimber
> Senior Solutions Architect
> "Bringing Strategy, Content, and Technology Together"
> Main: 610.631.6770
> www.reallysi.com
> www.rsuitecms.com
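A minimal XSLT 2.0 sketch of the approach described above: a recursive function that collects every document reachable from a starting map, carrying the URIs already visited down each path so a cycle cannot cause infinite recursion, followed by distinct-values() over the document URIs. The function name, namespace, and the assumption that every reference is a plain @href attribute (no scope/format filtering, no fragment handling) are illustrative, not taken from the original thread:

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:f="urn:example:functions">

  <!-- Hypothetical helper: $doc plus all documents reachable from it.
       $seen holds the URIs on the current path, so a reference cycle
       terminates instead of recursing forever. -->
  <xsl:function name="f:reachable" as="document-node()*">
    <xsl:param name="doc" as="document-node()"/>
    <xsl:param name="seen" as="xs:string*"/>
    <xsl:variable name="uri" select="string(document-uri($doc))"/>
    <xsl:if test="not($uri = $seen)">
      <xsl:sequence select="$doc"/>
      <xsl:for-each select="$doc//*[@href]">
        <!-- document(@href) resolves relative to the base URI of the
             attribute's own document, which is what we want here -->
        <xsl:sequence select="f:reachable(document(@href), ($seen, $uri))"/>
      </xsl:for-each>
    </xsl:if>
  </xsl:function>

  <xsl:template match="/">
    <!-- Duplicates from sibling branches survive the traversal;
         distinct-values() removes them at the end -->
    <xsl:variable name="unique-uris" as="xs:string*"
        select="distinct-values(for $d in f:reachable(., ())
                                return string(document-uri($d)))"/>
    <xsl:sequence select="$unique-uris"/>
  </xsl:template>
</xsl:stylesheet>
```

Note that $seen only blocks revisiting ancestors on the current path; two sibling subtrees referencing the same map will both descend into it, so the final distinct-values() call is doing real work, not just tidying.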
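Eliot's intermediate-list idea can also be sketched without a separate pipeline stage, since for-each-group deduplicates without an explicit sort. This fragment (to be placed inside a template) only scans the root map; for the full compound map you would build $refs over the whole set of reachable documents. The `ref` element, the `out/` output directory, and the filename-from-URI convention are all assumptions for illustration, and writing each target to its last path segment would clash if two distinct targets share a filename:

```xml
<!-- Flat list: one element per reference, holding the absolute target URI -->
<xsl:variable name="refs" as="element()*">
  <xsl:for-each select="//*[@href]">
    <ref uri="{resolve-uri(@href, base-uri(.))}"/>
  </xsl:for-each>
</xsl:variable>

<!-- Group by absolute URI: each distinct target is processed exactly once,
     so no result document is written twice -->
<xsl:for-each-group select="$refs" group-by="@uri">
  <xsl:result-document
      href="out/{tokenize(current-grouping-key(), '/')[last()]}">
    <xsl:apply-templates select="document(current-grouping-key())"/>
  </xsl:result-document>
</xsl:for-each-group>
```

Because grouping is by the resolved absolute URI, the many different relative spellings of the same target all collapse into one group.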