Re: Indexing solution for native XML database

To: <xml-dev@l...>
Subject: Re: Indexing solution for native XML database
From: "Ken North" <kennorth@s...>
Date: Thu, 1 Dec 2005 14:16:13 -0800
References: <002e01c5f659$fa57afa0$0115a8c0@E...>

Michael Kay wrote:
> Relational databases are a very useful tool, there are some jobs they are
> very good at (mainly the kind of jobs that people used punched cards for 50
> years ago).

They do handle numbers and characters quite well, but you're describing the
strengths of vintage-1980s SQL technology. A lot has changed.

Many of the SQL platforms have evolved to a universal database model that
supports queries over rich types such as video, images, HTML, XML and so on.
Some use a plug-in model similar to how web browsers use plug-ins for Flash or
Adobe PDFs.

Using an SQL database for rich types has implications for query optimization.
Here's an example I've probably beaten to death.

You want to write a web app or service that uses tabular data, XML, maps and
geo-spatial data:

"My GPS coordinates are x, customer profile y and presentation format is
[1024x768|320x240|104x208].
Find the nearest location where I can buy a widget for less than 500 Euros. Give
me a top 10 list with directions, map and a consumer review."

One approach is to use specialized servers -- one for a native XML database, one
for maps, etc. -- each with its own data model, indexing technology and access
methods. Writing a query optimizer for that solution requires an understanding
of queries across distributed data stores that use different data models,
indexing and access methods.

That solution requires integration middleware that processes query results
before delivering a solution to the client. It will require plenty of network
roundtrips to get statistics, data and metadata.
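
To make the round-trip cost concrete, here is a rough Python sketch of what
that middleware ends up doing. Everything in it is hypothetical: FakeStore
stands in for a specialized remote server, each fetch() call is counted as one
network round trip, and the sample stores, prices and coordinates are made up.

```python
import math

class FakeStore:
    """Stands in for one specialized remote server (hypothetical)."""
    def __init__(self, rows):
        self.rows = rows
        self.round_trips = 0

    def fetch(self, predicate):
        # Each call models one request/response over the network.
        self.round_trips += 1
        return [r for r in self.rows if predicate(r)]

# Made-up sample data, one store per data model.
catalog = FakeStore([
    {"store_id": 1, "item": "widget", "price_eur": 450},
    {"store_id": 2, "item": "widget", "price_eur": 520},
    {"store_id": 3, "item": "widget", "price_eur": 300},
])
geo = FakeStore([
    {"store_id": 1, "x": 3.0, "y": 4.0},
    {"store_id": 2, "x": 1.0, "y": 1.0},
    {"store_id": 3, "x": 6.0, "y": 8.0},
])
reviews = FakeStore([
    {"store_id": 1, "review": "<review>Good service</review>"},
    {"store_id": 3, "review": "<review>Long queue</review>"},
])

def nearest_widgets(here, max_price, limit=10):
    # The join order is hard-coded here: price filter first, then one
    # lookup per candidate in each of the other stores.
    cheap = catalog.fetch(
        lambda r: r["item"] == "widget" and r["price_eur"] < max_price)
    results = []
    for row in cheap:
        sid = row["store_id"]
        loc = geo.fetch(lambda r: r["store_id"] == sid)[0]
        rev = reviews.fetch(lambda r: r["store_id"] == sid)
        dist = math.hypot(loc["x"] - here[0], loc["y"] - here[1])
        results.append(
            (dist, sid, row["price_eur"], rev[0]["review"] if rev else None))
    results.sort()
    return results[:limit]

top = nearest_widgets((0.0, 0.0), 500)
total_trips = catalog.round_trips + geo.round_trips + reviews.round_trips
```

Note that the order of retrieval (filter by price, then look up locations and
reviews) is baked into the application, and the number of round trips grows
with the number of candidate rows.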

Contrast that with having all of the data managed by a single DBMS (SQL/XML).
Because the query optimizer understands the disparate types and their associated
access methods and indexing techniques, we don't have to develop an optimizer
for a query against an XML database, an image database, geo-spatial data and so
on. We can rely on the query optimizer to determine in what order to retrieve
data instead of having to build that logic into our application or middleware.
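
As a rough illustration of the shape of such a query, here is a Python sketch
using the standard sqlite3 module. SQLite is only a stand-in: it is not a
SQL/XML universal database, the distance() function is a hypothetical
user-defined function registered to mimic a geo-spatial extension, the XML is
stored as plain text, and the tables and data are made up.

```python
import math
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical stand-in for a geo-spatial extension: plain Euclidean distance.
conn.create_function(
    "distance", 4, lambda x1, y1, x2, y2: math.hypot(x1 - x2, y1 - y2))

# Made-up schema and data: tabular offers, locations, and XML reviews as text.
conn.executescript("""
CREATE TABLE offer (store_id INTEGER, item TEXT, price_eur REAL);
CREATE TABLE location (store_id INTEGER PRIMARY KEY, x REAL, y REAL);
CREATE TABLE review (store_id INTEGER, body TEXT);
INSERT INTO offer VALUES (1,'widget',450),(2,'widget',520),(3,'widget',300);
INSERT INTO location VALUES (1,3.0,4.0),(2,1.0,1.0),(3,6.0,8.0);
INSERT INTO review VALUES
  (1,'<review>Good service</review>'),
  (3,'<review>Long queue</review>');
""")

# One declarative query; the engine, not the application, picks the join order.
rows = conn.execute("""
SELECT o.store_id, o.price_eur, distance(l.x, l.y, ?, ?) AS dist, r.body
FROM offer o
JOIN location l ON l.store_id = o.store_id
LEFT JOIN review r ON r.store_id = o.store_id
WHERE o.item = 'widget' AND o.price_eur < ?
ORDER BY dist
LIMIT 10
""", (0.0, 0.0, 500)).fetchall()
```

The application states what it wants in a single statement and leaves the
retrieval order to the engine.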

To optimize the query and prepare its access plan, the query optimizer does disk
seeks or cache reads, not RPCs over TCP/IP connections. Retrieving the data when
executing the plan is also a matter of doing disk seeks or reading cache memory,
instead of round-tripping messages across a network wire.

The performance gains you expect from using specialized data stores are negated
if your applications have to process several types of data, each with its own
specialized server or engine.

Follow-Ups:
- RE: Indexing solution for native XML database
  - From: "Michael Kay" <mike@s...>
