XQuery and id()/idref(); Controlling the children of nodes in the result sequence
Maik Stührenberg maik.stuehrenberg at uni-bielefeld.de
Wed Apr 23 12:23:41 PDT 2008
Hello,

I'm new to the list and tried to find the answer to my questions in several locations (including the list archive). So I apologize if I haven't searched thoroughly enough and the answer has been given already.

Here's my problem: We use a standoff annotation format for storing multiple annotated text files. The text files are used for defining a:span elements, which delimit the annotated stretches of text by means of start and end positions (see example below). The annotation is stored separately as children of the a:data element. In principle, everything is allowed underneath the a:data element (in the underlying XSD 'a.xsd' the a:data element is a wrapper for elements derived from a different namespace). However, there won't be any text nodes, only elements containing other elements, or empty elements. So I won't have any information about the hierarchy of the children of a:data. The connection between the annotation and the annotated text is stored in the a:span attribute (which is declared as xs:IDREF in the XSD).

<a:collection xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://www.example.org/a a.xsd"
    xmlns="http://www.example.org/a"
    xmlns:a="http://www.example.org/a">
  <a:entry xml:id="c1" type="text">
    <a:spans>
      <a:span xml:id="seg1" start="0" end="20"/>
      <a:span xml:id="seg2" start="0" end="20"/>
      <a:span xml:id="to1" start="0" end="4"/>
      <a:span xml:id="to2" start="5" end="8"/>
    </a:spans>
    <a:data xmlns:b="http://www.example.org/b"
        xsi:schemaLocation="http://www.example.org/b b.xsd">
      <b:text a:span="seg1">
        <b:para a:span="seg1"/>
      </b:text>
    </a:data>
    <a:data xmlns:c="http://www.example.org/c"
        xsi:schemaLocation="http://www.example.org/c c.xsd">
      <c:sentence id="w35" a:span="seg2">
        <c:word a:span="to1" id="w36"/>
        <c:word a:span="to2" id="w37"/>
        <!-- ... -->
      </c:sentence>
    </a:data>
  </a:entry>
</a:collection>

When I try to collect all annotations that correspond to a specific a:span element with the following XQuery, I receive the output below.

declare namespace a="http://www.example.org/a";
declare namespace b="http://www.example.org/b";
declare namespace c="http://www.example.org/c";

element resultset {
  let $d := doc('instance.xml')
  for $s in $d/a:collection/a:entry/a:spans/a:span
  return
    <result span="{$s/@xml:id}" start="{$s/@start}" end="{$s/@end}">
      { $d/a:collection/a:entry/a:data//*[@a:span = $s/@xml:id] }
    </result>
}

<resultset>
  <result start="0" end="20" span="seg1">
    <b:text xmlns:b="http://www.example.org/b"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xmlns="http://www.example.org/a"
        xmlns:a="http://www.example.org/a"
        a:span="seg1">
      <b:para a:span="seg1"/>
    </b:text>
    <b:para xmlns:b="http://www.example.org/b"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xmlns="http://www.example.org/a"
        xmlns:a="http://www.example.org/a"
        a:span="seg1"/>
  </result>
  <result start="0" end="20" span="seg2">
    <c:sentence xmlns:c="http://www.example.org/c"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xmlns="http://www.example.org/a"
        xmlns:a="http://www.example.org/a"
        id="w35" a:span="seg2">
      <c:word a:span="to1" id="w36"/>
      <c:word a:span="to2" id="w37"/>
      <!-- ... -->
    </c:sentence>
  </result>
  <result start="0" end="4" span="to1">
    <c:word xmlns:c="http://www.example.org/c"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xmlns="http://www.example.org/a"
        xmlns:a="http://www.example.org/a"
        a:span="to1" id="w36"/>
  </result>
  <result start="5" end="8" span="to2">
    <c:word xmlns:c="http://www.example.org/c"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xmlns="http://www.example.org/a"
        xmlns:a="http://www.example.org/a"
        a:span="to2" id="w37"/>
  </result>
</resultset>

Several things are not perfect here:

- Is there any way to suppress the output of the namespaces in each element? Or, to be more specific: what do I have to change to output all namespaces once (and only once) in the resultset element?

- The biggest issue is that the b:para element is output twice: as a child element of the b:text element (which is quite fine) and on its own. A related problem appears with the c:word elements: they should not be included as children of the c:sentence element, because they are related to different spans, but only as children of their respective result elements.

- The third question concerns the use of the fn:idref function in XQuery. My first versions of the query used idref() to select all those nodes underneath a:data that are related to a certain span, but I didn't manage to get any output, although all XSD files are available (I use Saxon-SA 9). What has to be changed in the XQuery to use the idref function?

Again, I apologize for asking three questions in my first post to the list.

Kind regards,
Maik Stührenberg
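For the second point, the desired result can be stated as two rules: per span, emit only the top-most matching elements, and inside each emitted element drop any child that points to a different span. The following is an untested sketch of those rules, not a confirmed solution. The function name local:prune is hypothetical, the query assumes the same 'instance.xml' as above, and it assumes that children without any a:span attribute should be kept. Because the elements are rebuilt rather than copied, comments and schema type annotations are not carried over.

```xquery
declare namespace a = "http://www.example.org/a";

(: Hypothetical helper: rebuild $e with its attributes, keeping only
   child elements that do not point to a different span than $id.
   Children without an a:span attribute are kept (an assumption). :)
declare function local:prune($e as element(), $id as xs:string) as element() {
  element { node-name($e) } {
    $e/@*,
    for $c in $e/*
    where not($c/@a:span != $id)
    return local:prune($c, $id)
  }
};

<resultset>{
  let $d := doc('instance.xml')
  for $s in $d/a:collection/a:entry/a:spans/a:span
  let $id := string($s/@xml:id)
  return
    <result span="{$id}" start="{$s/@start}" end="{$s/@end}">{
      (: top-most matches only: skip elements that already appear
         inside another element annotated with the same span :)
      for $e in $d/a:collection/a:entry/a:data//*[@a:span = $id]
                  [not(ancestor::*[@a:span = $id])]
      return local:prune($e, $id)
    }</result>
}</resultset>
```

With the example instance, the intent is that b:para appears only inside b:text for seg1, and that the c:sentence emitted for seg2 no longer contains the c:word elements, which still show up under their own result elements.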






