Subject: Re: Building and re-using an index gradually as multiple inter-related files get transformed
From: Michel Hendriksen <michel.hendriksen@xxxxxxxxx>
Date: Tue, 10 May 2011 11:53:55 +0200
Hi

You could set up a pipeline, though whether that is practical depends
on how many files you have to transform: generate the output as normal,
and build an index into a variable. Then, for each subsequent step,
copy the old index and add the new objects. Something like:

process file1

var index1
    process file1 mode = index

process file2
    with param index = index1

var index2
    copy index1
    process file2 mode = index

etc.

Not so nice when there are a lot of files, of course.
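
A minimal XSLT 2.0 sketch of the steps above, where a recursive named
template takes the place of the numbered variables. Everything specific
here is an assumption for illustration: the file names come from the
example further down the thread, objects are assumed to carry their
identifier in an ID attribute and their references in other attributes
whose names end in "ID", the map element, the template names and the
out/ directory are made up, and generate-id() is only a stand-in for
however the real new identifiers are obtained. It also adds each file's
own entries to the index before translating that file, which is a small
variation on the outline.

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <!-- Assumed: source files, listed in dependency order -->
  <xsl:param name="files" as="xs:string*"
             select="('Fruits.xml', 'People.xml', 'Preferences.xml')"/>

  <!-- Entry point: start the recursion with an empty index -->
  <xsl:template name="main">
    <xsl:call-template name="process-files">
      <xsl:with-param name="remaining" select="$files"/>
      <xsl:with-param name="index" select="()"/>
    </xsl:call-template>
  </xsl:template>

  <!-- Process the first remaining file, then recurse with the grown index -->
  <xsl:template name="process-files">
    <xsl:param name="remaining" as="xs:string*"/>
    <xsl:param name="index" as="element(map)*"/>
    <xsl:if test="exists($remaining)">
      <xsl:variable name="doc" select="doc($remaining[1])"/>
      <!-- "copy the old index and add the new objects" -->
      <xsl:variable name="new-index" as="element(map)*">
        <xsl:sequence select="$index"/>
        <xsl:apply-templates select="$doc//*[@ID]" mode="index"/>
      </xsl:variable>
      <xsl:result-document href="out/{$remaining[1]}">
        <xsl:apply-templates select="$doc" mode="translate">
          <xsl:with-param name="index" select="$new-index" tunnel="yes"/>
        </xsl:apply-templates>
      </xsl:result-document>
      <xsl:call-template name="process-files">
        <xsl:with-param name="remaining" select="subsequence($remaining, 2)"/>
        <xsl:with-param name="index" select="$new-index"/>
      </xsl:call-template>
    </xsl:if>
  </xsl:template>

  <!-- One index entry per object; generate-id() is a placeholder new ID -->
  <xsl:template match="*[@ID]" mode="index" as="element(map)">
    <map old="{@ID}" new="{generate-id(.)}"/>
  </xsl:template>

  <!-- Identity copy for everything that is not an identifier -->
  <xsl:template match="@* | node()" mode="translate">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()" mode="translate"/>
    </xsl:copy>
  </xsl:template>

  <!-- Replace any identifier found in the index; leave unknown values alone -->
  <xsl:template match="@*[ends-with(local-name(), 'ID')]" mode="translate">
    <xsl:param name="index" as="element(map)*" tunnel="yes"/>
    <xsl:variable name="old" select="string(.)"/>
    <xsl:attribute name="{name()}"
                   select="($index[@old = $old]/@new, $old)[1]"/>
  </xsl:template>

</xsl:stylesheet>
```

The lookup $index[@old = $old] is a linear scan, so with a large index
it would probably be worth wrapping the entries in a temporary document
and using xsl:key instead. How relative href values are resolved also
depends on how the processor is invoked.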

If the index entries are only determined during processing, you might
need to process the file into a variable, build your index from that
variable, and then copy the variable to the output.
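
A rough sketch of that last idea for a single file, again in XSLT 2.0.
The assumptions are mine: the file name, the ID attribute, the map
element and generate-id() as a placeholder for the real new identifier
are all hypothetical, and old and new identifiers are paired by
position, which only works if the result keeps the objects in the same
order as the source.

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumed: the file to translate -->
  <xsl:param name="source-uri" select="'People.xml'"/>

  <xsl:template name="main">
    <xsl:variable name="source" select="doc($source-uri)"/>

    <!-- 1. Process into a variable instead of straight to the output -->
    <xsl:variable name="result">
      <xsl:apply-templates select="$source" mode="translate"/>
    </xsl:variable>

    <!-- 2. Build the index by pairing source and result identifiers -->
    <xsl:variable name="old-ids" select="$source//@ID"/>
    <xsl:variable name="new-ids" select="$result//@ID"/>
    <xsl:variable name="index" as="element(map)*">
      <xsl:for-each select="1 to count($old-ids)">
        <xsl:variable name="i" select="."/>
        <map old="{$old-ids[$i]}" new="{$new-ids[$i]}"/>
      </xsl:for-each>
    </xsl:variable>

    <!-- 3. Copy the processed file to the output -->
    <xsl:copy-of select="$result"/>

    <!-- In a multi-file run $index would be handed to the next step;
         here it is only reported -->
    <xsl:message>
      <xsl:copy-of select="$index"/>
    </xsl:message>
  </xsl:template>

  <!-- Identity copy for everything except the identifier -->
  <xsl:template match="@* | node()" mode="translate">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()" mode="translate"/>
    </xsl:copy>
  </xsl:template>

  <!-- The new identifier is only known at this point; placeholder value -->
  <xsl:template match="@ID" mode="translate">
    <xsl:attribute name="ID" select="generate-id(..)"/>
  </xsl:template>

</xsl:stylesheet>
```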

Michel



On Mon, May 9, 2011 at 5:37 PM, Fabre Lambeau <fabre.lambeau@xxxxxxxxx>
wrote:
> This was the simplification indeed.
> Instead of XML documents, I call a REST webservice (not my own) with
> the EXPath HTTP client.
> The workflow is:
> - I send a GET request to get a list of objects of one type
> - I modify the XML response payload to remove identifiers (and modify
> some values)
> - I send a PUT request with the modified payload
> - The webservice responds with a new XML payload containing the
> submitted objects with their new identifiers (which are GUIDs
> assigned randomly, i.e. they cannot be "guessed" from the properties of
> the object).
>
> The mapping index is therefore created by matching the first response
> to the second one and extracting the identifiers from both.
>
> Ideally, I would like to avoid using anything but XSLT to solve this,
> if possible.
>
> Fabre Lambeau
>
>
> On 9 May 2011 16:21, Michael Kay <mike@xxxxxxxxxxxx> wrote:
>> You haven't said how the new identifiers are generated (where do 434 and
>> 2526 come from?).
>>
>> The functional solution to this is to recognize that there is a function
>> f(oldID) -> newID that translates old identifiers to new identifiers. You
>> just need to call this function every time you want to do the translation
>> (not just the first time), and ensure of course that the function always
>> returns the same newID when given the same oldID.
>>
>> Now, how do you implement this function efficiently? I can't tell you,
>> because you haven't told us anything about it.
>>
>> Michael Kay
>> Saxonica
>>
>>
>> On 09/05/2011 15:35, Fabre Lambeau wrote:
>>>
>>> Hi!
>>> I'm after advice on how to build an "indexing" solution using XSLT 2.0.
>>>
>>> Here is my use case (simplified a bit).
>>> I have a number of XML files to "translate"/"re-map" into a second set
>>> of XML files. For each input file, there will be a single output file
>>> (1-to-1 relationship).
>>> Each document lists a series of objects and their properties. This
>>> "translation" consists of changing the identifier (GUID) of each
>>> object in the source file.
>>> However, some of the documents list objects that reference other
>>> objects (dependencies). Whilst "translating" therefore, I need to keep
>>> an index/dictionary of the old-vs-new identifiers, so that all
>>> dependencies remain valid in the new set of files, but that there is
>>> no overlap between original and new identifiers for any object.
>>>
>>> Example (simplified, assume an XML representation)
>>>
>>> SOURCE FILES
>>> Fruits.xml
>>>   Name=Apple, ID=1
>>>   Name=Orange, ID=2
>>> People.xml
>>>   Name=Bob, ID=A
>>>   Name=Marie, ID=B
>>> Preferences.xml
>>>   ID=Y, PersonID=A, FruitID=1
>>>   ID=Z, PersonID=B, FruitID=1
>>>
>>> TARGET FILES
>>> Fruits.xml
>>>   Name=Apple, ID=R
>>>   Name=Orange, ID=T
>>> People.xml
>>>   Name=Bob, ID=434
>>>   Name=Marie, ID=2526
>>> Preferences.xml
>>>   ID=G67, PersonID=434, FruitID=R
>>>   ID=E43, PersonID=2526, FruitID=R
>>>
>>> The example is obviously far more complex, with dozens of files and
>>> complex dependencies. I know however the object model, and therefore
>>> what objects have dependencies, and the direction of all dependencies.
>>> I can therefore order the file transformations so that no file is
>>> processed before all the objects it depends on have been
>>> translated. BTW, I have no control over the identifiers themselves
>>> (they are generated by a separate system).
>>>
>>> I could obviously process each transformation one at a time, and every
>>> time load the relevant source and target files already processed to
>>> create the mapping index. However, I'm after a way to do this in one
>>> single transformation.
>>> The reason I'm stuck (mentally) is the following:
>>> - Using XSLT 2.0, I could use xsl:result-document to create the
>>> target files. However, I believe I would not be able to load them in
>>> the same transformation again (in order to do a lookup in them as
>>> necessary when treating dependencies)
>>> - A variable, once defined, cannot be modified. I would therefore not
>>> be able to create a global "index" of sort and keep adding to it as I
>>> would in a procedural language.
>>>
>>> What would be the best way to go about this?  A recursive template
>>> that after each step passes the index generated at the previous step
>>> and augments it?  Would I not run into performance problems when
>>> treating hundreds of large source files?
>>>
>>> --
>>> Fabre Lambeau
>>
>>
>
>
>
> --
> Fabre Lambeau
