XQuery Tutorial: Building XQuery Based Web Service Aggregation and Reporting Applications

Introduction

The widespread adoption of XML has profoundly altered the way that information is exchanged within and between enterprises. XML, an extensible, text-based markup language, describes data in a way that is both hardware- and software-independent. As such, it has become a standard of choice for a growing number of Web services and Service Oriented Architectures. With a vast amount of data being published in XML format by multiple sources, the need has arisen for an easy and efficient means of extracting and manipulating this information. XQuery has emerged as an ideal way to aggregate data from Web services, relational databases, and other applications that employ XML.

Scenario

To illustrate the use of XQuery for aggregating data from multiple sources, this XQuery tutorial will cover using real-world data from a stock-quote Web service, and combining that information with historical company data stored in a relational database and presented as XML. In this tutorial, the historical data is being enhanced with live data about the current stock price, which is being retrieved via a Web-service call. Once the data is aggregated, it can be presented in any number of formats. For the purpose of this article, we'll use Stylus Studio® to display the data in HTML using XSLT.

Figure 1 shows an end-to-end overview of the proposed XQuery tutorial scenario. We'll work through the various stages in more detail throughout this article.

Using XML for data aggregation and reporting

Figure 1: Using XML for data aggregation and reporting

There are two XML inputs in this example: the relational database (RDBMS) and the Web service. For the RDBMS, Microsoft Access is being used to automatically format the data into XML. This conversion can also be performed by Stylus Studio®'s built-in ADO-to-XML Document Wizard. An excerpt of the converted data from the RDBMS is seen in Figure 2.

Historical stock data retrieved from Access database

Figure 2: Historical stock data retrieved from Access database

The other source of XML data in this example is an actual real-time stock-quote Web service, which is located at: http://www.swanandmokashi.com/HomePage/WebServices/StockQuotes.asmx?WSDL#StockQuotesSoap

The result of executing this Web service is a SOAP response that can be seen in Figure 3. SOAP (Simple Object Access Protocol) is the messaging protocol that allows Web service applications to talk to each other. When a Web service is identified as an input source, Stylus Studio® transparently invokes the Web service as part of any operation in which that data source is being used.

a SOAP Envelope

Figure 3: Sample SOAP response from stock-price Web service

In order to extract and aggregate the XML data from both the Microsoft Access database and the Web service, we'll build a common information model in XML using the Stylus Studio XML Schema Editor. This will enable us to more easily manipulate the results of the data aggregation for reporting purposes. Separating presentation from data extraction enables us to develop reporting algorithms independently, which makes for a more easily maintainable system. The schema for the common information model as used for the stock data results is depicted in Figure 4.

XML Schema Component

Figure 4: Common information model for stock data aggregation

Writing the solution in XQuery

Because XQuery was designed for processing XML, it's a logical choice for this scenario. Furthermore, XQuery easily supports the notion of "joins," which allows for two or more data sources to be combined based on query conditions. Other language options include:

  • Using a high-level language like Java and performing the parsing and manipulation of the data using JDOM
  • Using XSLT

The major difference is that with XSLT, the join logic is accomplished by saving the stock symbol in a variable in an outer loop when processing the SOAP stock-ticker information, and then in an inner loop testing that variable in an "xsl:if" against each stock element of the historical stock data. When the "xsl:if" evaluates to true, Annual_Revenues and City are output.

For the XQuery solution, we first build a cross-product of the elements from the two XML files, and then limit the result by returning only those items from both the Access database and SOAP sources that match. The complete program is shown below:

declare namespace soap = "http://schemas.xmlsoap.org/soap/envelope/"
declare namespace a = "http://swanandmokashi.com/"

<Result>
{
for $Quote in
/soap:Envelope/soap:Body/a:GetStockQuotesResponse/a:GetStockQuotesResult/a:Quote,
    $Report2003 in document("Report2003.xml")/dataroot/Report2003
   where $Report2003/Symbol = $Quote/a:StockTicker
   return
      <CompanyData>
         <Name_of_Company>
            {$Report2003/Company/text()}
         </Name_of_Company>
         <Ticker_Symbol>
            {$Report2003/Symbol/text()}
         </Ticker_Symbol>
         <Current_Stock_Value>
            {$Quote/a:StockQuote/text()}
         </Current_Stock_Value>
         <Annual_Revenues>
            {$Report2003/Rev/text()}
         </Annual_Revenues>
         <HeadQuarters>
            <City>
               {$Report2003/City/text()}
            </City>
         </HeadQuarters>
      </CompanyData>
}
</Result>

Let's walk through the program line by line. The first two lines of the program are namespace declarations, which are used for accessing the data contained within a SOAP document.

declare namespace soap = "http://schemas.xmlsoap.org/soap/envelope/"
declare namespace a = http://swanandmokashi.com/

These declarations are required because SOAP requests contain two distinct namespaces: one that describes the structure of the message, and one that describes the payload of the message — which, in this case, is the real-time stock information. The two XML namespaces prevent naming conflicts. The elements referring to the SOAP components of the message are accessed with a "soap:" prefix (/soap:Envelope/soap:Body), whereas elements that are part of the embedded message that is transferred as part of the Web service response use the "a:" prefix ($Quote/a:StockTicker).

The next two lines provide a clue as to how the program is going to operate:

<Result>
{

When the XQuery starts to execute, it builds an XML tree starting with the <Result> element. The squiggly bracket "{" that follows is the start of a processing block that, when completed, will have its output added to the XML tree.

The bulk of the processing of the program occurs in the "for/where" loop.

for $Quote in /soap:Envelope/soap:Body/a:GetStockQuotesResponse/a:GetStockQuotesResult/a:Quote,
$Report2003 in document("Report2003.xml")/dataroot/Report2003
where $Quote/a:StockTicker = $Report2003/Symbol

The "for" loop iterates over the SOAP message over the "a:Quote" repeating element. Each time through the loop, it assigns the "a:Quote" element to the "$Quote" variable. Similarly, for each of the "Report2003" repeating elements from the "Report2003.xml" file, the program assigns the element to $Report2003. Each time through the loop, the "where" clause executes and when it evaluates to true, it lets the next line of the XQuery run.

Note that an XQuery processor can in many cases optimize the execution of an XQuery such as this. For instance, it can take the result of the "$Quote" variable and use that to more optimally read from the "Report2003.xml" document, creating or using an existing index if necessary. The XQuery processor could also read the "Report2003.xml" document just once and create an in-memory hash-table of the structure so that during the next iteration through the loop, the "$Quote/a:StockTicker = $Report2003/Symbol" comparison can be done very efficiently.

Let's take a look at the state of the "$Quote" and "$Report2003" variables after one run through the loop:

XQuery Variables

Continuing through the code, once a "where" condition is satisfied, the return clause runs. Here's what the first few lines of the return look like:

   return
      <CompanyData>
         <Name_of_Company>
            {$Report2003/Company/text()}
         </Name_of_Company>

Here, an XML result tree is being built as part of the return operation, with <CompanyData> at the top-level, and <Name_of_Company> as a nested sub-element. The value being evaluated by the query is defined within the squiggly brackets. Note that you must specify the datatype being retrieved [here, it is "text ()"]; otherwise, the XQuery copies just the "$Report2003/Company" element — and not the contents — to the destination XML result tree.

As part of Stylus Studio®'s XSLT debugger, the intermediate result of returns is made available in the product's "Variables" window. This is especially helpful for developing XQuery programs because, unlike with languages like XSLT, the output isn't generated immediately. Therefore, this tool allows for quickly looking at the intermediate state during execution.

Skipping past the rest of the elements in the return statement, we see the following:

}
 </Result>

The end squiggly bracket "}" takes the XML result tree that was generated (starting with the begin squiggly bracket "{" ) and adds it to the outer XML result tree. Finally, the end tag of the <Result> is added to the XML result tree and the XQuery program completes.

Achieving the result

When the XQuery is executed within Stylus Studio®, the XML result tree is output to the Stylus Preview Window. From there, sorting the result according to any element from the input can be useful, and creating an extension to accomplish that is an easy matter. To sort the result in alphabetical order by company name, for example, simply insert an "order by" statement after the "where" in the "for/let/where/return" loop. We can then sort the output by company name with the following loop:

for $Quote in
/soap:Envelope/soap:Body/a:GetStockQuotesResponse/a:GetStockQuotesResult/a:Quote,
 $Report2003 in document("Report2003.xml")/dataroot/Report2003
where $Quote/a:StockTicker = $Report2003/Symbol
order by $Quote/a:CompanyName
return
...

While the code to produce the above XQuery is fairly straightforward to write, Stylus Studio® provides a mapping tool that simplifies the building of basic maps. Building the XQuery as demonstrated in this article can be accomplished in less than a minute using this mapping tool.

The visual representation of the XQuery is seen in Figure 5. The "for/let/where" loop is constructed by dragging the repeating elements to a FLWOR icon (where the "for" clause is implemented); the "where" clause is implemented by creating an "equal" icon and dragging the elements for both sides of the "equal" to that icon. The "for/let/where" loop is completed by hooking the "equal" to the FLWOR and then dragging the output of the FLWOR expression to the repeating element of the target document. After creating the structure of the "for/let/where" loop, elements from the source schema can simply be dragged and dropped onto the target schema.

XML Query Mapping

Figure 5: Stylus Studio®'s XQuery Mapper

Figure 6 shows a sample of resulting data produced by this XQuery:

XQuery Preview

Figure 6: Result of Xquery-based data aggregation

Reporting to HTML using XSLT

Once the XML-based common information model is populated, there are many options for presenting that information to the user. To complete this example, we'll create a style-sheet using Stylus Studio®'s WYSIWYG XML-to-HTML designer to present the data as HTML. Other options include using a reporting tool such as Crystal Reports, or processing the XML with another language such as Visual Basic or Java and then writing custom reports from that. Another option, of course, is simply to use XQuery to generate HTML.

This was generated by Stylus Studio®:

Sample Stock Report

Company Name  Ticker  Stock Price  Annual Revenue  HeadQuarters 
Progress Software PRGS 19.29 280.5 Bedford
Int'l Business Machines Corp IBM 86.32 81200 Armonk
BEA Systems, Inc. BEAS 10.29 934.1 San Jose

Summary

In this XQuery tutorial, we learned that Web services provide a wealth of new information that is described and made available to applications in XML. Often, data analysis requires information from multiple sources, which means that Web service data, for example, needs to be enhanced or aggregated with XML data obtained from other data sources.

Using Stylus Studio®, we proved it to be quick and easy to create an XQuery that efficiently aggregates data from historical stock data stored in a relational database with live stock-quote data provided from a Web service. A common information model was used for the target of the aggregation, and the result was translated into HTML for presentation purposes.

XQuery Resources

The tool used for creation of this article is Stylus Studio®, a powerful XQuery IDE, that including an XQuery editor, XQuery mapper, XQuery debugger and other XQuery tools. It can be downloaded from http://www.stylusstudio.com/xml_download.html

Additionally, for further information, read our XQuery Primer entitled: Learn XQuery in 10 Minutes by Dr. Michael Kay, or visit the Stylus Studio® Developer Network' XQuery Help and Discussion Forums or the XQuery-Talk Mailing List.

PURCHASE STYLUS STUDIO ONLINE TODAY!!

Purchasing Stylus Studio from our online shop is Easy, Secure and Value Priced!

Buy Stylus Studio Now

Try STYLUS XQuery Tools

Download a free trial of our award-winning XQuery Tools.

Top 10 XQuery Trends

Read about the top 10 XQuery Trends and how they will impact change the way enterprise software applications are built.

Learn XQuery in 10 Minutes!

Say goodbye to 10-minute abs, and say Hello to "Learn XQuery in Ten Minutes!", the world's fastest and easiest XQuery primer, now available for free!

Ask Someone You Know

Does your company use Stylus Studio? Do your competitors? Engineers from over 100,000 leading companies use Stylus Studio, and now you can ask someone from your own organization about their experiences using Stylus Studio.

XQuery Help and Discussion Forum

Learn about XQuery development at the SSDN's new XQuery Help and Discussion Forum

 
Free Stylus Studio XML Training:
W3C Member