An Interview with Jonathan Robie — The co-inventor of XQuery
Jonathan Robie is one of the inventors of XQuery. He's been involved with the XML Query Working Group since the beginning, and has been an editor since the first documents were published by the W3C. Jonathan is the XQuery Technology Lead for DataDirect XQuery, a high-performance, embeddable Java component, first shipped in September 2005, that enables XQuery access to all the major relational databases. He recently won the InfoWorld Innovator of the Year award for his work on the XML Query Working Group, where he's currently helping create an XQuery Update Facility.
Ivan Pedruzzi is Stylus Studio's Sr. Product Architect, and editor of 'The Stylus Scoop' newsletter, and he recently had the opportunity to interview Jonathan on behalf of the Stylus Studio XML developer community. The two chatted about relational database integration using XQuery technologies and other XQuery topics that are sure to be of interest to you.
Ivan Pedruzzi: Hi, Jonathan! Thanks for meeting with us. Well, you've had quite a year, Mr. Robie! In your recent interview with Jon Udell, you seemed pretty convinced that XQuery will succeed. Obviously as a co-inventor of XQuery, you're biased, but apart from that, what makes you so sure of its success?
Jonathan Robie: Hi Ivan, I'm glad to be here and have the opportunity to talk to developers about XQuery.
Here's the main reason I think XQuery will succeed: without XQuery, programming with XML is much harder than it needs to be, and this is particularly true if you are integrating data from different sources. If you're querying XML and relational data and you need to create XML output, XQuery is much simpler to use, and more efficient to implement, than the alternatives. XQuery and implementations like DataDirect XQuery solve some real challenges and product vendors are investing heavily in it. XQuery is easier for most programmers to learn than XSLT, because it is more like SQL, Java, and other languages they are used to.
There are already 46 XQuery implementations listed on the XML Query Working Group home page. Not all of these are serious implementations, but quite a few are, including implementations in the current betas of all the major relational databases and implementations from middleware vendors (including DataDirect Technologies), content management system vendors, and open source projects.
And XQuery is also creeping into other architectures and languages. For example, XQJ, a Java API for XQuery, is under development for J2EE, and the SQL standard is adding a pseudo-function called
IP: Wouldn't most programmers prefer to query a relational database directly with XQuery, without embedding the XQuery in SQL? And is this even possible?
JR: Yes! The XQuery language was designed from the start to query different types of data, including relational databases, and there are several implementations that allow this. Our implementation, DataDirect XQuery, provides this functionality for most of the major relational databases, including Oracle, Microsoft SQL Server, Sybase, and IBM DB2. In a nutshell, it works by translating queries from XQuery to SQL for the underlying database.
IP: But don't most database vendors plan to support XQuery? Why shouldn't a developer simply use one of those XQuery implementations?
JR: Support for XQuery varies among relational database vendors, as does the quality of their implementations. Our customers generally are working on middleware integration challenges, like messaging, data aggregation, and so on, and we've found that they need two features typically lacking in proprietary solutions: portability, to allow these applications to run on databases from many vendors, and better XQuery support than that offered by their database. Sometimes their database offers no XQuery support at all. There is a very real need for an XQuery product that works on all major relational databases, fully supports the XQuery language, and supports XQJ, the standard API for XQuery.
IP: Well, let's talk a bit about the Java API for XQuery, XQJ, then. Can you explain to our readers how XQJ is like JDBC?
JR: XQJ is a Java API that allows Java programs to execute XQuery queries and return results, and it also allows configuration and other functionality. JDBC, by comparison, is a Java API that allows Java programs to execute SQL queries and return results. Both of these standards are developed under the Java Community Process.
The design of XQJ is very much like that of JDBC, but modified to take the XML data model into account. For instance, JDBC revolves around tables, but the XQuery data model does not have tables. Also, many of the configuration requirements for XML are different from those used with relational data.
I've talked to developers who use XQJ or plan to use XQJ, and about half of them use it much like they would typically use JDBC — the queries are small, and are used to retrieve data for the Java program, which does most of the processing. The other half use Java as a relatively thin shell to set up an environment and return results, and do the bulk of their processing in XQuery. These programs don't look much like typical JDBC programs.
IP: Do you always need to embed an XQuery in some other kind of program?
JR: No. Our own implementation, DataDirect XQuery, is designed for the Java platform, but it ships with a generic program to execute a query and return a result. Also, there are two tools, Stylus Studio and the <oXygen/>® XML Editor for Eclipse (DataDirect Edition), that can take an arbitrary XQuery and generate a Java program that uses DataDirect XQuery. Either of these tools can execute an XQuery directly, without a Java program.
In the long run, I think you will see a lot of XQuery used in other programs, and a lot of XQuery used as a standalone language. Both approaches make sense to me.
IP: You mentioned earlier that your customers typically use XQuery for messaging or other Web applications. Can you describe the architecture for these kinds of applications? Where and how does XQuery fit in?
JR: XQuery is generally used either in the middle tier or on the server. The XQuery usually needs to be parameterized: you don't just write a query that returns a generic portfolio; you write one that returns a portfolio for a particular user, in a particular time frame, and you have to specify these parameters somehow. One way of doing this is to post an XML document that contains such information as the name of the user and the starting and ending date. Another way to do this is to use URI parameters, and to bind these parameters to external variables in the XQuery before executing the query.
XQuery's prominence in the middle tier is due to the fact that not all of the data necessarily comes from a database, or it might come from more than one database. The XQuery itself is written by developers and executes inside the firewall, where it has access to the data, and people outside the firewall are generally prevented from writing queries. So the user's access to this data is limited to the queries that the developer provides and the parameters that the user is allowed to specify. This lets the developer use a very high level, efficient query language to create XML and to access whatever data is needed, while limiting the user to very well-defined interfaces.
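To make the URI-parameter approach concrete, here is a minimal sketch in Java. It is not taken from the interview and does not use DataDirect XQuery or XQJ: because the JDK ships no XQuery engine, it substitutes the JDK's built-in XPath 1.0 API (`javax.xml.xpath`) and its variable resolver for an XQuery with external variables. The class name `PortfolioQuery`, the sample holdings data, the parameter names `user`, `from`, and `to`, and the `example.com` URL are all assumptions made for illustration.

```java
import java.io.StringReader;
import java.net.URI;
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

final class PortfolioQuery {

    // Hypothetical sample data standing in for whatever the middle tier can reach.
    static final String HOLDINGS =
        "<holdings>"
        + "<holding user='alice' date='2005-03-01' shares='100'/>"
        + "<holding user='alice' date='2004-11-15' shares='40'/>"
        + "<holding user='bob' date='2005-06-30' shares='70'/>"
        + "<holding user='alice' date='2005-09-12' shares='50'/>"
        + "</holdings>";

    // The developer writes the query once; callers only supply $user, $from, $to.
    // XPath 1.0 has no date type, so dates are compared as numbers after the
    // hyphens are stripped (2005-03-01 becomes 20050301).
    static final String QUERY =
        "sum(/holdings/holding[@user = $user]"
        + "[translate(@date, '-', '') >= translate($from, '-', '')]"
        + "[translate(@date, '-', '') <= translate($to, '-', '')]/@shares)";

    // Split "user=alice&from=..." into decoded name/value pairs.
    static Map<String, String> parseParams(String rawQuery) {
        Map<String, String> params = new HashMap<>();
        if (rawQuery == null || rawQuery.isEmpty()) {
            return params;
        }
        for (String pair : rawQuery.split("&")) {
            int eq = pair.indexOf('=');
            if (eq > 0) {
                params.put(
                    URLDecoder.decode(pair.substring(0, eq), StandardCharsets.UTF_8),
                    URLDecoder.decode(pair.substring(eq + 1), StandardCharsets.UTF_8));
            }
        }
        return params;
    }

    // Bind the URI parameters to query variables, then run the fixed query.
    static double sharesHeld(String requestUri) {
        Map<String, String> params = parseParams(URI.create(requestUri).getRawQuery());
        XPath xpath = XPathFactory.newInstance().newXPath();
        // Values are bound as data; they are never concatenated into the query text.
        xpath.setXPathVariableResolver(name -> params.get(name.getLocalPart()));
        try {
            return (Double) xpath.evaluate(
                QUERY, new InputSource(new StringReader(HOLDINGS)), XPathConstants.NUMBER);
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("Query failed; a parameter may be missing", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(sharesHeld(
            "http://example.com/portfolio?user=alice&from=2005-01-01&to=2005-12-31"));
    }
}
```

The design choice worth noting is that the query text is a constant and the request only contributes variable values. That mirrors the point above about well-defined interfaces: a caller who sends something like `alice' or '1'='1` as the user name is simply asking about a user who does not exist, not rewriting the query.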
IP: How does XQuery play with other standard database technologies? You mentioned running an XQuery from inside SQL as an example — what does that look like, and why did the SQL standards people add that when there are already SQL/XML extensions for SQL?
JR: The most critical need was to be able to query XML stored in columns of the database. SQL/XML does not have the full power of XQuery, and because customers were asking for that, it was added to the SQL standard.
There's now a SQL pseudo-function called
IP: Okay, how about the reverse — can I call SQL from an XQuery?
JR: That depends on the implementation. XQuery has external functions, which can be implemented in any language, but each implementation decides where these external functions can come from. In our implementation, DataDirect XQuery, you can declare any SQL function in the query prolog and call it in the query, so yes.
IP: What about Java functions? Can I call those from XQuery?
JR: Same story: the implementation decides which external functions can be supported. In DataDirect XQuery, Java functions can be called from within XQuery as external functions, so yes.
IP: Wow. XQuery is really quite extensible!
JR: Well, extensibility is important — the language was designed with data integration in mind for the simple reason that you want to be able to connect to the data you need in any environment.
IP: And what do you see down the road? How do you expect XQuery to change in the future?
JR: Adding XQuery updates and transformations will be the most significant changes, and I expect that in the next year or two. There is a lot of demand for this, and it's currently the primary focus of my work with the W3C. There are also quite a few people asking for more sophisticated grouping operators, so that's also on the list.
IP: And nearer term, what new features can our readers look forward to in the next version of DataDirect XQuery?
JR: We're dramatically improving our support for very large XML files, and adding filters to convert EDI, Java objects, email mailboxes, and other formats to XML so they can be queried. We're also adding support for MySQL 5, which many customers have been asking for, and for the most recent versions of the Microsoft and Oracle databases. We'll also be implementing the XQuery 1.0 Candidate Recommendation specification, which is newer than the version currently supported.
IP: Wow, lots of exciting developments! In closing, how can our readers get started using XQuery?
JR: The best way is probably to get an XQuery implementation, find a good tutorial, and start playing with some data. You can find a list of implementations on the W3C XQuery home page. Obviously, I like our own implementation, so make sure all your readers download the free trial of DataDirect XQuery [laughs].
I wrote a tutorial and a longer chapter introducing XQuery — they can both be found on the XQuery web site. Jonathan Bruce and I also wrote an XQJ tutorial for Java programmers. And DataDirect XQuery comes with quite a few sample programs that can help get people started with XQJ.
IP: I think you're forgetting something ...
JR: Of course! Many people find visual XQuery tools helpful for working with the input data, writing and debugging queries, and visualizing the result. Stylus Studio provides this functionality. The XQuery Mapper in particular is an excellent learning and development tool, and is also available as a free trial download.
IP: I couldn't have said it better myself!
JR: Happy to oblige.
IP: In all seriousness, Jonathan, thank you for your time, and good luck with your plans for XQuery's future.
JR: My pleasure, Ivan. Nice to chat with you.
Editor's Note: If you liked this interview, consider subscribing to The Stylus Scoop, our bi-monthly XML developer newsletter! You can also get the latest on XQuery by visiting the XML Connections blog.