Re: Indexing solution for native XML database
Michael Kay wrote:
> Relational databases are a very useful tool, there are some jobs they are
> very good at (mainly the kind of jobs that people used punched cards for 50
> years ago).

They do handle numbers and characters quite well, but you're describing the strengths of vintage-1980s SQL technology. A lot has changed. Many of the SQL platforms have evolved to a universal database model that supports queries over rich types such as video, images, HTML, XML and so on. Some use a plug-in model similar to how web browsers use plug-ins for Flash or Adobe PDFs.

Using an SQL database for rich types has implications for query optimization. Here's an example I've probably beaten to death. You want to write a web app or service that uses tabular data, XML, maps and geo-spatial data:

"My GPS coordinates are x, customer profile y and presentation format is [1024x768|320x240|104x208]. Find the nearest location where I can buy a widget for less than 500 Euros. Give me a top 10 list with directions, map and a consumer review."

One approach is to use specialized servers -- one for a native XML database, one for maps, etc. -- each with its own data model, indexing technology and access methods. Writing a query optimizer for that solution requires an understanding of queries across distributed data stores that use different data models, indexing and access methods. That solution also requires integration middleware that processes query results before delivering a solution to the client. It will require plenty of network roundtrips to get statistics, data and metadata.

Contrast that with having all of the data managed by a single DBMS (SQL/XML). Because the query optimizer understands the disparate types and their associated access methods and indexing techniques, we don't have to develop an optimizer for a query against an XML database, an image database, geo-spatial data and so on. We can rely on the query optimizer to determine in what order to retrieve data instead of having to build that logic into our application or middleware. To optimize the query and prepare its access plan, the query optimizer does disk seeks or cache reads, not RPCs over TCP/IP connections. Retrieving the data when executing the plan is also a matter of doing disk seeks or reading cache memory, instead of round-tripping messages across a network wire.

The performance gains you expect from using specialized data stores are negated if your applications have to process several types of data, each with its own specialized server or engine.
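To make the single-DBMS side of that contrast concrete, here is a minimal sketch in Python. It uses SQLite purely as a stand-in, since it is not a SQL/XML platform: two user-defined functions registered on the connection play the role of the geo-spatial and XML plug-ins. The table names, column names, function names and sample rows are all hypothetical and do not come from any particular product. The point is only that the price filter, the distance ordering and the XML extraction sit in one statement that one engine plans and executes locally.

```python
import math
import sqlite3
import xml.etree.ElementTree as ET


def distance_km(lat1, lon1, lat2, lon2):
    """Haversine distance; hypothetical stand-in for a geo-spatial operator."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = p2 - p1
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def review_summary(xml_text):
    """Extract one element from an XML column; stand-in for an XML type function."""
    return ET.fromstring(xml_text).findtext("summary")


def build_demo_db():
    """Create an in-memory database with hypothetical sample data."""
    conn = sqlite3.connect(":memory:")
    # Register the "plug-ins" so SQL statements can call them directly.
    conn.create_function("distance_km", 4, distance_km)
    conn.create_function("review_summary", 1, review_summary)
    conn.executescript(
        """
        CREATE TABLE stores (id INTEGER PRIMARY KEY, name TEXT, lat REAL, lon REAL);
        CREATE TABLE offers (store_id INTEGER, product TEXT, price_eur REAL,
                             review_xml TEXT);
        INSERT INTO stores VALUES (1, 'Paris Central', 48.8566, 2.3522);
        INSERT INTO stores VALUES (2, 'Lyon Nord', 45.7640, 4.8357);
        INSERT INTO stores VALUES (3, 'Berlin Mitte', 52.5200, 13.4050);
        INSERT INTO offers VALUES (1, 'widget', 520.0,
            '<review><summary>Pricey</summary></review>');
        INSERT INTO offers VALUES (2, 'widget', 450.0,
            '<review><summary>Good value</summary></review>');
        INSERT INTO offers VALUES (3, 'widget', 480.0,
            '<review><summary>Solid</summary></review>');
        """
    )
    return conn


def nearest_offers(conn, lat, lon, max_price, limit=10):
    """Top-N nearest stores selling a widget below max_price, in one statement."""
    return conn.execute(
        """
        SELECT s.name,
               o.price_eur,
               distance_km(?, ?, s.lat, s.lon) AS km,
               review_summary(o.review_xml) AS review
        FROM offers AS o
        JOIN stores AS s ON s.id = o.store_id
        WHERE o.product = 'widget' AND o.price_eur < ?
        ORDER BY km
        LIMIT ?
        """,
        (lat, lon, max_price, limit),
    ).fetchall()


conn = build_demo_db()
rows = nearest_offers(conn, 48.85, 2.35, 500)
for name, price, km, review in rows:
    print(f"{name}: {price:.0f} EUR, {km:.0f} km, review: {review}")
```

In the multi-server design, the same request would need middleware to fetch candidates from each store, then join, filter and sort them itself. Here the application only binds parameters and reads rows. A real universal database would additionally expose the rich types to its optimizer and indexes, which this sketch does not attempt to show.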