XML Editor
Sign up for a WebBoard account Sign Up Keyword Search Search More Options... Options
Chat Rooms Chat Help Help News News Log in to WebBoard Log in Not Logged in
Show tree view Topic
Go to previous topicPrev TopicGo to next topicNext Topic
Postnext
Rodolfo VegasSubject: Problem with conditional JOINing of SQL database tables
Author: Rodolfo Vegas
Date: 03 Nov 2008 05:51 AM
Hi, I'm stucked with this problem, that seems to be some kind of "performance" issue.

I'm constructing an XML document based upon the information keeped in 6 SQL Server database tables. On an central part of the output document, the output data must be retrieved from 1 table or from antoher 2 tablas, respectively, depending on the value of a column of a table.

On my first approach I did code the following (sample structure):

for $A in collection("db.dbo.tableA"),
$B in collection("db.dbo.tableB"),
$C in collection("db.dbo.tableC"),
$D in collection("db.dbo.tableD")
where ... (: <- JOIN conditions between the 4 tables :)
return
<Root>
<Row>
...
...
<Conditional>
{ if ($D/check = "N") then
(
<field1>{$B/code/text()}</field1>
<field2>Text</field2>
)
else
(
for $E in collection("db.dbo.tableE"),
$F in collection("db.dbo.tableF")
where ... (: <- JOIN condition between the tables D, E and F :)
return
(
<field1>{$F/code/text()}</field1>
<field2>{$E/info/text()}</field2>
)
)
}
</Conditional>
...
...
</Row>
...
...
<Row>
...
</Row>
</Root>

This works OK except when the JOIN for tables D, E and F returns no results; in this case the result is an empty <Conditional> element, but the desired result shoul be not to output anything, that is, "ignore" the complete output Row.

To try to get the desired result I coded again, including th 6 tables in the main (and now unique) for instruction:

for $A in collection("db.dbo.tableA"),
$B in collection("db.dbo.tableB"),
$C in collection("db.dbo.tableC"),
$D in collection("db.dbo.tableD")
$E in collection("db.dbo.tableE"),
$F in collection("db.dbo.tableF")
where ... (: <- JOIN conditions between the 4 main tables A, B, C, D :)
and (
($D/check = "N")
or
((not($D/check = "N") and
... (: <- JOIN between tables D, E and F :)
)
(
return
<Root>
<Row>
...
...
<Conditional>
{ if ($D/check = "N") then
(
<field1>{$B/code/text()}</field1>
<field2>Text</field2>
)
else
(
<field1>{$F/code/text()}</field1>
<field2>{$E/info/text()}</field2>
)
}
</Conditional>
...
...
</Row>
...
...
<Row>
...
</Row>
</Root>

When I execute this code, I get no error message and it remains executing for a long time without actually displaying anything nor raising any error, and I cannot stop the process normally within Stylus Studio, having to shut down the Struzzo.exe process within the Task Manager.

If I omit the condition that JOINs the D table with the E and F tables ( $D/check value ) it executes well in a reasonable amount of time, but I get only the rows for wich $D/check = "N" (those for which the JOIN condition between D, E and F satisfies):

for $A in collection("db.dbo.tableA"),
$B in collection("db.dbo.tableB"),
$C in collection("db.dbo.tableC"),
$D in collection("db.dbo.tableD")
$E in collection("db.dbo.tableE"),
$F in collection("db.dbo.tableF")
where ... (: <- JOIN conditions between the 4 main tables A, B, C, D :)
and ... (: <- JOIN between tables D, E and F :)
return
<Root>
<Row>
...


...
</Row>
</Root>

Can you tell me what's going on? Is ther any way to recode this to avoid the problem or have you any suggestion about any part of the code that could be causing the problem?

Thanks in advance.
Rodolfo

Postnext
Alberto MassariSubject: Problem with conditional JOINing of SQL database tables
Author: Alberto Massari
Date: 04 Nov 2008 08:32 AM
Hi Rodolfo,
have you tried an approach like this?

for $A in collection("db.dbo.tableA"),
$B in collection("db.dbo.tableB"),
$C in collection("db.dbo.tableC"),
$D in collection("db.dbo.tableD")

let $E := collection("db.dbo.tableE")/info[test with $D],
$F := collection("db.dbo.tableF")/code[test with $D]

where (: <- JOIN conditions between the 4 main tables A, B, C, D :)
and (
($D/check = "N")
or
(not($D/check = "N") and not(empty($E)))
)

return ...


Alberto

Postnext
Rodolfo VegasSubject: Problem with conditional JOINing of SQL database tables
Author: Rodolfo Vegas
Date: 05 Nov 2008 10:52 AM
Hi Alberto,

Although the JOIN condition between tables D, E and F was not as you assumed (instead of D->E and D->F relations, it's D->E and E->F), and the condition was more complex than
what I wrote down on the sample code (even with a SQL-like SUBSELECT), I tried the approach you suggested... and voilá... it gave me the desired output in just a few seconds!!!

Thanks very much for your invaluable help and support!!!




I have only a few questions (maybe the 2 last ones are interesting questions for the XML Connections blog from Minollo):

1) I would like to add the following 3 attributes with their corresponding values to the document node:

xmlns xmlns:xsi xsi:schemaLocation

How can I do this in XQuery?

2) What's the real difference between 'joining' the two last tables (EXTERNAL_TEACHERS and COMPANIES) in the 'for' part of the FLWOR expression (as on my second approach) or in
the 'let' part of it?
From my SQL-programmer point of view, I see the code is basically the same (same conditions, but in different place), but if I'm right, the execution is different:
On my second approach (all tables on the 'for' part of FLWOR expr.), XQuery (and therefore also SQL Server) tries to collect the whole set of data for the 6 tables involved
and (I think) this could be the (performance) issue (on SQL Server).
With the solution you suggested, only the data for the 4 tables of the 'for' part are collected on the beginning to determine the whole dataset, and then, for each of the
datarows, a new collection dataset is set up for the EXTERNAL_TEACHERS and COMPANIES tables.
I've also noted that for this solution, the plan includes the creation and use of some temporary tables on the SQL Server (those prefixed with the # simbol), that are well
known to improve overall performance, specially in complex SQL queries that involves several tables and/or views.

3) After having a look at the plan for your suggested, and finally working, solution I've noted that the generated SQL sentences include ALL the columns for the tables included in
the 'for' part of the FLWOR expr. If I just need or are going to use some of the columns of certain table(s) involved in the query, is there any way to have XQuery to 'collect'
just these columns? This is important as it should reduce traffic between the application and the SQL Server and so will improve performance. I'm thinking of something similar
to this:
for $row in collection("DATABASE.dbo.TABLE")/TABLE//*[local-name() = ('name1', 'name2', ... , 'nameN')]

that should be translated to:

SELECT name1, name2, ... , nameN
FROM DATABASE.dbo.TABLE




For other people to let them have an real example of this, here are the codes I had before and after implementing your suggested solution:

------
BEFORE (first approach)
------

<groups>
{
for $edition in collection("TRAINING.dbo.EDITIONS")/EDITIONS,
$course in collection("TRAINING.dbo.COURSES_CATALOG")/COURSES_CATALOG,
$office in collection("TRAINING.dbo.OFFICES")/OFFICES,
$teach in collection("TRAINING.dbo.TEACHERS")/TEACHERS
where
(: Parameter filters :)
$edition/YEAR = $YEAR
and $edition/FROM_DATE >= xs:dateTime(concat(normalize-space($from_date_start),"T00:00:00"))
and $edition/FROM_DATE < xs:dateTime(concat(normalize-space($from_date_end), "T00:00:00"))
(: Other filters :)
and ($edition/CANCELLED/text() = "N" or empty($edition/CANCELLED))
and not(empty($edition/ACTION))
and not(empty($edition/GROUP))
and not($edition/ACTION = 0)
and not($edition/GROUP = 0)
and not(normalize-space($edition/COURSE_TYPE/text()) = "BECA")
and $edition/TOTAL_HOURS >= 6
(: Join COURSES_CATALOG table :)
and $edition/COURSE_CODE = $course/COURSE_CODE
and not(empty($course/COURSE_NAME_2))
and not(starts-with($course/COURSE_NAME_2, "ESPAÑOL"))
(: Join OFFICES table :)
and normalize-space($edition/OFFICE_NAME) = normalize-space($office/OFFICE_NAME)
(: Join TEACHERS table :)
and $edition/COURSE_CODE = $teach/COURSE_CODE
and $edition/COURSE_ZONE = $teach/COURSE_ZONE
and $edition/YEAR = $teach/YEAR
and $edition/ORDER = $teach/ORDER
order by $edition/ACTION, $edition/GROUP
return


...


<center>
{
if ($teach/EXTERNAL = "N") then
(
<cif>12345678</cif>,
<name>My Company's Training Center</name>,
<address>{normalize-space($office/DETAILED_ADDRESS)}</address>,
<zipCode>{$office/ZIP_CODE/text()}</zipCode>,
<city>{normalize-space($office/CITY)}</city>
)
else
(
for $comp in collection("TRAINING.dbo.COMPANIES")/COMPANIES,
$xteach in collection("TRAINING.dbo.EXTERNAL_TEACHERS")/EXTERNAL_TEACHERS
where $comp/ID = $xteach/COMPANY_ID
and $xteach/PERS_ID = $teach/PERS_ID
and matches($xteach/COMPANY_ID/text(), '[A-S][0-9]{7}([A-Z]|[0-9])')
and $comp/COMPANY_CODE = max(for $comp2 in collection("TRAINING.dbo.COMPANIES")/COMPANIES
where $comp2/ID = $xteach/COMPANY_ID
return max($comp2/COMPANY_CODE))
return
(
<cif>{$comp/ID/text()}</cif>,
<name>{normalize-space($comp/COMPANY_NAME)}</name>,
<address>{normalize-space($comp/ADDRESS)}</address>,
<zipCode>{$comp/ZIP_CODE/text()}</zipCode>,
<city>{normalize-space(collection("TRAINING.dbo.CITIES")/CITIES/NAME[../PROVINCE_CODE = substring($comp/ZIP_CODE/text(), 1, 2)]/text())}</city>
)
)
}
</center>

...

}
</groups>



------
BEFORE (second approach)
------

<groups>
{
for $edition in collection("TRAINING.dbo.EDITIONS")/EDITIONS,
$course in collection("TRAINING.dbo.COURSES_CATALOG")/COURSES_CATALOG,
$office in collection("TRAINING.dbo.OFFICES")/OFFICES,
$teach in collection("TRAINING.dbo.TEACHERS")/TEACHERS,
$comp in collection("TRAINING.dbo.COMPANIES")/COMPANIES,
$xteach in collection("TRAINING.dbo.EXTERNAL_TEACHERS")/EXTERNAL_TEACHERS
where
(: Parameter filters :)
$edition/YEAR = $YEAR
and $edition/FROM_DATE >= xs:dateTime(concat(normalize-space($from_date_start),"T00:00:00"))
and $edition/FROM_DATE < xs:dateTime(concat(normalize-space($from_date_end), "T00:00:00"))
(: Other filters :)
and ($edition/CANCELLED/text() = "N" or empty($edition/CANCELLED))
and not(empty($edition/ACTION))
and not(empty($edition/GROUP))
and not($edition/ACTION = 0)
and not($edition/GROUP = 0)
and not(normalize-space($edition/COURSE_TYPE/text()) = "BECA")
and $edition/TOTAL_HOURS >= 6
(: Join COURSES_CATALOG table :)
and $edition/COURSE_CODE = $course/COURSE_CODE
and not(empty($course/COURSE_NAME_2))
and not(starts-with($course/COURSE_NAME_2, "ESPAÑOL"))
(: Join OFFICES table :)
and normalize-space($edition/OFFICE_NAME) = normalize-space($office/OFFICE_NAME)
(: Join TEACHERS table :)
and $edition/COURSE_CODE = $teach/COURSE_CODE
and $edition/COURSE_ZONE = $teach/COURSE_ZONE
and $edition/YEAR = $teach/YEAR
and $edition/ORDER = $teach/ORDER
(: Join EXTERNAL_TEACHERS and COMPANIES tables :)
and (
($teach/EXTERNAL = "N")
or
(
not($teach/EXTERNAL = "N") and
$xteach/PERS_ID = $teach/PERS_ID and
matches($xteach/COMPANY_ID/text(), '[A-S][0-9]{7}([A-Z]|[0-9])') and
$comp/ID = $xteach/COMPANY_ID and
$comp/COMPANY_CODE = max(for $comp2 in collection("TRAINING.dbo.COMPANIES")/COMPANIES
where $comp2/ID = $xteach/COMPANY_ID
return max($comp2/COMPANY_CODE))
)
)
order by $edition/ACTION, $edition/GROUP
return


...


<center>
{
if ($teach/EXTERNAL = "N") then
(
<cif>12345678</cif>,
<name>My Company's Training Center</name>,
<address>{normalize-space($office/DETAILED_ADDRESS)}</address>,
<zipCode>{$office/ZIP_CODE/text()}</zipCode>,
<city>{normalize-space($office/CITY)}</city>
)
else
(
(
<cif>{$comp/ID/text()}</cif>,
<name>{normalize-space($comp/COMPANY_NAME)}</name>,
<address>{normalize-space($comp/ADDRESS)}</address>,
<zipCode>{$comp/ZIP_CODE/text()}</zipCode>,
<city>{normalize-space(collection("TRAINING.dbo.CITIES")/CITIES/NAME[../PROVINCE_CODE = substring($comp/ZIP_CODE/text(), 1, 2)]/text())}</city>
)
}
</center>

...

}
</groups>



-----
AFTER
-----

<groups>
{
for $edition in collection("TRAINING.dbo.EDITIONS")/EDITIONS,
$course in collection("TRAINING.dbo.COURSES_CATALOG")/COURSES_CATALOG,
$office in collection("TRAINING.dbo.OFFICES")/OFFICES,
$teach in collection("TRAINING.dbo.TEACHERS")/TEACHERS

let $xprof := collection("TRAINING.dbo.EXTERNAL_TEACHERS")/EXTERNAL_TEACHERS[(PERS_ID/text() = $teach/PERS_ID/text()) and (matches(COMPANY_ID/text(), '[A-S][0-9]{7}([A-Z]|[0-9])'))],
$comp := collection("TRAINING.dbo.COMPANIES")/COMPANIES[(ID = $xteach/COMP_ID) and
(COMPANY_CODE = max(for $comp2 in collection("TRAINING.dbo.COMPANIES")/COMPANIES
where $comp2/ID = $xteach/COMPANY_ID
return max($comp2/COMPANY_CODE)))]

where
(: Parameter filters :)
$edition/YEAR = $YEAR
and $edition/FROM_DATE >= xs:dateTime(concat(normalize-space($from_date_start),"T00:00:00"))
and $edition/FROM_DATE < xs:dateTime(concat(normalize-space($from_date_end), "T00:00:00"))
(: Other filters :)
and ($edition/CANCELLED/text() = "N" or empty($edition/CANCELLED))
and not(empty($edition/ACTION))
and not(empty($edition/GROUP))
and not($edition/ACTION = 0)
and not($edition/GROUP = 0)
and not(normalize-space($edition/COURSE_TYPE/text()) = "BECA")
and $edition/TOTAL_HOURS >= 6
(: Join COURSES_CATALOG table :)
and $edition/COURSE_CODE = $course/COURSE_CODE
and not(empty($course/COURSE_NAME_2))
and not(starts-with($course/COURSE_NAME_2, "ESPAÑOL"))
(: Join OFFICES table :)
and normalize-space($edition/OFFICE_NAME) = normalize-space($office/OFFICE_NAME)
(: Join TEACHERS table :)
and $edition/COURSE_CODE = $teach/COURSE_CODE
and $edition/COURSE_ZONE = $teach/COURSE_ZONE
and $edition/YEAR = $teach/YEAR
and $edition/ORDER = $teach/ORDER
(: Internal/external teachers filter :)
and (
($teach/EXTERNAL = "N")
or
(not($teach/EXTERNAL = "N") and not(empty($comp))
)

order by $edition/ACTION, $edition/GROUP
return


...


<center>
{
if ($teach/EXTERNAL = "N") then
(
<cif>12345678</cif>,
<name>My Company's Training Center</name>,
<address>{normalize-space($office/DETAILED_ADDRESS)}</address>,
<zipCode>{$office/ZIP_CODE/text()}</zipCode>,
<city>{normalize-space($office/CITY)}</city>
)
else
(
<cif>{$comp/ID/text()}</cif>,
<name>{normalize-space($comp/COMPANY_NAME)}</name>,
<address>{normalize-space($comp/ADDRESS)}</address>,
<zipCode>{$comp/ZIP_CODE/text()}</zipCode>,
<city>{normalize-space(collection("TRAINING.dbo.CITIES")/CITIES/NAME[../PROVINCE_CODE = substring($comp/ZIP_CODE/text(), 1, 2)]/text())}</city>
)
}
</center>

...

}
</groups>


Thanks again and regards,
Rodolfo

Postnext
Marc Van CappellenSubject: Problem with conditional JOINing of SQL database tables
Author: Marc Van Cappellen
Date: 06 Nov 2008 04:24 AM
Hi Rodolfo,

let me try to answer the 3 questions you have.

1) this can be achieved in a very natural way in XQuery. Just specify these as attributes in the desired element constructors.
For example, the following is not only XML, but is also valid XQuery!

<test xmlns="hello"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="somelocation"/>

2) each additional for-clause without an appropriate join condition will result in a cartesian product. If we have a closer look at your "second approach" we see:
($D/check = "N")
or
((not($D/check = "N") and
... (: <- JOIN between tables D, E and F :)
)
As such, in case $D/check is equal to "N", there is no join condition between D, E and F. In other words, each of the $D bindings where check = "N", will result in #E * #F tuples. As you understand, depending of the number of records in your #E and #F tables the query result can become huge.
Using a let as in Alberto's query this is not the case.
First there is a predicate to joing E (and F) with D. Second, a let basically aggregates the data.

3) DataDirect XQuery definitely projects only the relevant columns in the generated SQL. This and other aspects of our SQL generation are explained in the following paper http://www.xquery.com/white_papers/generating_sql/
If you think to have a scenario where this is not the case, or at least not optimal, then we're much interested to hear about it.
May I advise to have this discussion in a new thread, and include the exact XQuery you execute, preferable with the DDL if the tables. And some details on what you believe can be more optimal.
Or if you prefer, you could also contact our support organization http://www.datadirect.com/support/index.ssp

Thanks,
Marc

Postnext
Rodolfo VegasSubject: Problem with conditional JOINing of SQL database tables
Author: Rodolfo Vegas
Date: 06 Nov 2008 06:04 AM
Originally Posted: 06 Nov 2008 06:05 AM
Hi Marc,

Thank you for your clarifying answers.

Question 2 is now clear for me.

Regarding question 1, I've added the attributes as you suggested, but I get the following error message on the first 'for' clause:

[DataDirect][XQuery][err:XPST0005]Error at line 25, column 63. Static error. The static type of step 'child::COURSES' is empty-sequence(). (Did you mean 'COURSES'?)

and on the output I get only the root node <groups> with the 3 attributes and nothing else (neiter the close tag for this node).

After this, I deleted the attributes code, and now I'm getting a very strange error message like this:

Error: on line 25 of file:///c:/.../start_groups.xquery:
FODC0004: The file or directory file:///c:/.../TRAINING.dbo.COURSES does not exist

It seems that it's not recognizing the connection to the Microsoft SQL Server database and is interpreting the 'collection' source as being a file or directory instead. I've closed the query, shut down Stylus Studio, refreshed the connection on the File Explorer, where the database and its tables are all perfectly visible and explorable, but the problem persists. Afterthis I tried to change the processor, but with the same bad results. Also a very strange and 'suspicious', but maybe revealing circumstance, is that the Plan tab on the bottom of the XQuery Source editor windows, does not appear any more (only for this specific query!!!)...
What's going on, please? **** I need help with this ASAP, please *****

Regarding question 3, I will analyze more in-depth the generated SQL sentences in the Plan tab (when again available) and will share any comment, doubt, idea, that I may have.

Rgds,
Rodolfo

Postnext
Marc Van CappellenSubject: Problem with conditional JOINing of SQL database tables
Author: Marc Van Cappellen
Date: 06 Nov 2008 12:08 PM
Hi Rodolfo,

about the last problem you raised. I advise to first read the following, http://www.xquery.com/tips_and_tricks/xpath_and_namespaces.html. this will help you understand how adding a default namespace declaration makes your enclosed path expressions change in behavior.

So, let's take a step back and start from the query where you get the error as follows: [DataDirect][XQuery][err:XPST0005]Error at line 25, column 63. Static error. The static type of step 'child::COURSES' is empty-sequence().

I assume your query is something as follows,

<root xmlns="whatever">{
for $c in collection("TRAINING.dbo.COURSES")/COURSES
return
$c/SOME_COLUMN
}</root>

Adding the xmlns="whatever" also affects the XPath expressions. Now the /COURSES and /SOME_COLUMN steps are assumed to return elements in the "whatever" namespace. Which is not what you want, they need to be in no namespace.

You have two ways to resolve this.

a) don't use the default namespace, but rather an explicit prefix.
<p:root xmlns:p="whatever">{
for $c in collection("TRAINING.dbo.COURSES")/COURSES
return
$c/SOME_COLUMN
}</p:root>

b) use the default namespace, but change your path expressions to use a wild card for the namespace. In theory it it means that all COURSES elements in all namespaces are returned. But of course, in reality it doesn't matter as you're querying an XML structure where all COURSES are in no namespace.
The query becomes something as follows (note the /*:COURSES and /*:SOME_COLUMN):
<root xmlns="whatever">{
for $c in collection("TRAINING.dbo.COURSES")/*:COURSES
return
$c/*:SOME_COLUMN
}</root>

Regards,
Marc

Postnext
Rodolfo VegasSubject: Problem with conditional JOINing of SQL database tables
Author: Rodolfo Vegas
Date: 07 Nov 2008 06:26 AM
Hi Marc,

I've been reading the whole XQuery Tips & Tricks section, including the page about XPath and namespaces, and following your advise I've taken one of your two suggested solutions, in this case the second one.

I've choosed this one because the XML document I'm generating must be uploaded on a Web page application that uses a schema to validate it, and I don't really know if using a specific prefix for the namespace and thus also prefixing the root element as in your first suggested solution, will make the validation to fail (I know, maybe this is a simple question, but namespaces and their behavior, are somewhat obscure to me....).

If not so, please let me know, as I see this first solution is much cleener (for the second one, having a complex query with many fields, it's tidy to having to prefix ALL the elements name's with the *: characters, and also produces an, at least for me, undesired side-effect of not being able in Stylus Studio to see the structure for the current node in the drop-down list when typing the code).

Regarding the SQL generating question, I'm willing to post the full XQuery file with the DDL for the relevant tables, and the saved Plan for the query. As I'm using DataDirect XQuery from within Stylus Studio, not having a DataDirect XQuery license nor a SupportLink license, can I post them to the address you supplied? If not, could you please supply me another address?

Thanks for your support,
Rodolfo

P.D.: I will also post the Schema XSD file for the namespace question, if needed.

Posttop
Marc Van CappellenSubject: Problem with conditional JOINing of SQL database tables
Author: Marc Van Cappellen
Date: 07 Nov 2008 08:27 AM
Hello,

it's good to hear you have your query up an running.

To answer your question on namespaces. Theoretically using a default namespace or using a prefix should not make a difference. Both XML documents should be considered equivalent. But of course, that is the theory, there are tools out there that require the use the default namespace approach. If so, I would say there is a problem with that component, but of course that doesn't mean you can change or want to fix it. In any case, the only way to find out, is to give it a try.

I appreciate your help to possibly improve our SQL generation. I will contact you today offline, and we can then discuss.

Thanks,
Marc

 
Go to previous topicPrev TopicGo to next topicNext Topic
Download A Free Trial of Stylus Studio 6 XML Professional Edition Today! Powered by Stylus Studio, the world's leading XML IDE for XML, XSLT, XQuery, XML Schema, DTD, XPath, WSDL, XHTML, SQL/XML, and XML Mapping!  
go

Log In Options

Site Map | Privacy Policy | Terms of Use | Trademarks
Stylus Scoop XML Newsletter:
W3C Member
Stylus Studio® and DataDirect XQuery ™are from DataDirect Technologies, is a registered trademark of Progress Software Corporation, in the U.S. and other countries. © 2004-2016 All Rights Reserved.