RE: storing XML files
Ron:

As you can imagine, I have some comments ;->

About data set sizes: the number and size of the documents DOES matter. Our internal tests against Oracle 9i, IBM DB2, and SQL Server used standard OAG Purchase Order documents with relatively simple schemas at a fixed size of around 10 KB, with separate tests at 100 KB and 1 MB. They show repeatedly that as the number and size of the documents increase, the performance of queries and of applying/dropping indexes degrades exponentially. The more complex the schema, the faster the degradation, but it still degrades. We used consultants who came to us as a result of our merger with C-Bridge and who are big relational heads, so we didn't cheat... we tried to make the RDBMS systems do well with their own tools and they just didn't work. Of course, we cannot publish the exact test results since we are not an independent party and Oracle et al's lawyers would come after us if we did...

About Oracle's XSQL utility: it does not do much at all. It does not even perform the mapping for you. YOU have to design the appropriate relational schema for your XML, as Oracle does not provide any tools to assist you in generating a relational schema (a painful and time-intensive process if the XML schema is not childishly simple). It can only map into an already defined relational table structure. If your XML does not match exactly, it resorts to storing it as a BLOB. Now, Oracle is the only database I'm aware of that even attempts to do this canonical mapping through any type of tool. However, it's still painful (we tried, and our customers who are long-time Oracle customers tried). Oracle consultants will not even use this method and will instead resort to the way Oracle SAYS you should manage XML in Oracle: use the 9i XMLType, which stores as a BLOB but at least gives you XPath access and XPath indexing (albeit quite inefficient, non-scalable, and slow according to our internal tests).
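To make the two storage styles above concrete (mapping a document into a relational schema you designed beforehand, versus storing it whole and reaching into it by path), here is a minimal Python sketch. It is not taken from the tests described in this message: the sample document, the table names, and the helper functions are all hypothetical, and SQLite plus ElementTree's limited XPath subset merely stand in for the vendor tooling.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical purchase order; element names are illustrative, not the OAG schema.
PO_XML = """<PurchaseOrder id="PO-1">
  <Buyer>Acme</Buyer>
  <Line sku="A-100" qty="2"/>
  <Line sku="B-200" qty="5"/>
</PurchaseOrder>"""


def make_db():
    """Create the tables up front: the relational schema is designed by hand."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE po (id TEXT PRIMARY KEY, buyer TEXT);
        CREATE TABLE po_line (po_id TEXT, sku TEXT, qty INTEGER);
        CREATE TABLE po_doc (id TEXT PRIMARY KEY, body TEXT);
    """)
    return conn


def shred(conn, xml_text):
    """Style 1: map the document into the predefined tables."""
    root = ET.fromstring(xml_text)
    po_id = root.get("id")
    conn.execute("INSERT INTO po (id, buyer) VALUES (?, ?)",
                 (po_id, root.findtext("Buyer")))
    conn.executemany(
        "INSERT INTO po_line (po_id, sku, qty) VALUES (?, ?, ?)",
        [(po_id, ln.get("sku"), int(ln.get("qty"))) for ln in root.findall("Line")],
    )


def store_whole(conn, xml_text):
    """Style 2: keep the document as one text value (the CLOB/BLOB approach)."""
    root = ET.fromstring(xml_text)
    conn.execute("INSERT INTO po_doc (id, body) VALUES (?, ?)",
                 (root.get("id"), xml_text))


def skus_from_tables(conn, po_id):
    """Query the shredded form with plain SQL."""
    rows = conn.execute(
        "SELECT sku FROM po_line WHERE po_id = ? ORDER BY sku", (po_id,))
    return [r[0] for r in rows]


def skus_from_doc(conn, po_id):
    """Query the whole-document form: fetch the text, parse it, walk a path."""
    (body,) = conn.execute(
        "SELECT body FROM po_doc WHERE id = ?", (po_id,)).fetchone()
    return sorted(ln.get("sku") for ln in ET.fromstring(body).findall("./Line"))
```

Both query helpers return the same SKUs for the sample document. The difference is where the work goes: the first style needs the schema and mapping written in advance, while the second re-parses the stored text on every query unless the database offers path indexing.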
About API and knowledge portability: Not all of the RDBMS systems support XPath, but most of the XML-enabled middleware and native XML DBs I've seen do. Also, the tools, utilities, methodologies, and extensions to SQL all differ between the RDBMS vendors. If you want to make XML work in an RDBMS system (unless you're using a system like B-Bop, which thankfully shields you from this mess), you have to use the vendor-provided tools, which gives you vendor lock-in. Vendor lock-in is not so severe, in my experience, in the native XML DB world. RDBMS systems were just not designed for managing XML data and it shows.

About Tamino's query support: install Tamino 2.3.1.4. Their query language is a hodgepodge of some XPath and proprietary syntax that ends up not being XPath and that they unfittingly call "XQuery" (and it's not XQuery as the community knows it either). It's a fact, they know it, and I do know they are also attempting to remedy it. Admittedly my statements were inflammatory and inappropriate, but just install the product. It's not XPath, no matter what they say... but darn it, they've got good marketing ;-> Oracle 5 and 6 were crap, too... but Mr. Ellison is still a very rich man ;->

About hard evidence: Yes, we have hard data on all of our tests. We have hard data from competitive benchmarks customers have asked us to perform as well as internal competitive benchmarks. We use real customer applications to create our benchmarking tests and suites against our current and prior releases. Of course, we cannot release this data to you unless you require it to close a deal on our software ;->

Cheers,
Chris

---------------------------------------
Chris Parkerson
Product Manager
eXcelon Corporation
Burlington, MA
(781) 674-5393
http://www.exceloncorp.com
---------------------------------------

-----Original Message-----
From: Ronald Bourret [mailto:rpbourret@r...]
Sent: Tuesday, October 09, 2001 3:15 AM
To: xml-dev@l...
Cc: 'Albena Georgieva'
Subject: Re: storing XML files

Chris Parkerson wrote:

> ... and being able to
> do fast queries across large document sets is NOT a requirement (i.e.
> you've got 1000+ stock quote documents, but you do not need to query
> over them as an aggregate), then the XML capabilities of the RDBMS
> vendors should be sufficient.

I don't understand this statement. Why does the number of documents matter? If each stock quote document holds a single quote -- that is, a single row of data -- then a relational query should be very fast. As Soumitra Sengupta pointed out in his reply, I think the issues are nesting depth and how semi-structured the data is, not size.

> Oracle
> 9i's XML support is really limited with their new vaunted "XMLType"
> column data type being nothing more than a convenience wrapper around
> the CLOB type: XML still gets stored as a blob.

This is only part of Oracle 9i's support. They have two other ways to map XML to the database. The first uses SQL3 object views to perform the obvious object-relational mapping between XML documents and the database. The second is the Internet File System (iFS), which uses mapping files.

> Keep in mind, however,
> that each RDBMS vendors approach to handling XML is going to be
> proprietary: you will have little to no code [portability]

Let's take a closer look at this. An XML/database application usually consists of a number of parts:

1) APIs. These are proprietary in RDBMSs, but in native XML databases as well. That is, there is currently no way to write an XML application that is portable with respect to database access. This is because there is no widely supported standard API. (There is a good start in this direction -- the XML:DB API -- but it is not yet widely enough implemented by database vendors to make truly portable applications a reality.)
In fact, the only way currently to write an XML application that is portable across databases is to use object-relational middleware against a relational database. This is because a number of middleware vendors use ODBC, OLE DB, or JDBC for database access. So while you will be locked into a single vendor's API, the application will be portable across databases.

2) Query language. This is also not standardized across databases. While the most popular query language is probably XPath (usually with extensions for multi-document queries), numerous other query languages -- all of them proprietary -- are supported as well. This is true of XML-enabled relational databases as well as native XML databases. There is good reason for this, as XPath is not rich enough to perform many of the queries needed by users and XQuery is not yet finished. I suspect that when XQuery is done, you will see many implementations of it.

3) Update language. Where these are supported, they are all non-standard. This is because there is no standard XML update language in existence. (Again, there is an attempt to standardize this with the XUpdate language, but there are not enough implementations to make it a reality.)

4) DOM, SAX, XSLT, namespaces, etc. These are standard across all XML database products that I have seen -- native XML and XML-enabled.

I therefore think it's fair to say that you'll have roughly the same amount of code portability with XML-enabled relational databases that you'll have with native XML databases.

> or knowledge portability.

Again, this is about the same for native XML databases and XML-enabled relational databases. Both types of databases are based on fairly consistent models (object-relational mappings for XML-enabled relational databases and XML document structure for native XML databases) but the actual implementations are different for every product.
In short, moving code from one XML-enabled relational database to another means learning new mapping syntax, a new API, and possibly a new query language. Moving code from one native XML database to another means learning a new API and possibly a new query language.

> A major advantage of native XML DB systems (save for Software AG's
> Tamino which still uses a proprietary query language and schema dialect,
> among other non-standard things) is their adherence to standards.
> Queries are XPath, transformations are XSLT, granular access to document
> data is via the standard DOM API, validation can be against DTD or W3C
> XML Schema, etc.

This is a rather inflammatory statement, as it seems to imply Tamino and XML-enabled relational databases do not adhere to standards. With the exception of query languages, most XML database products that I have seen (native XML, XML-enabled databases, middleware) *do* adhere to standards, Tamino included. (With respect to the non-standard schema language in Tamino, Tamino was released *long* before XML Schemas were completed, so it hardly seems fair to complain. A year from now, yes. Today, no.)

> You gain a bit more in terms of code portability as
> well as knowledge portability between different vendors.

I think the operative term is "a bit more".

> You also gain
> a database system inherently designed to deal with the extensibility of
> XML data

True.

> and capable of removing the scalability, performance, ease of
> data update, and cross-document limitations of the RDBMS approach.

Actually, this depends a lot on the application. In some cases, this is definitely true. In others, it is not.

> The reason most XML applications are converting to native
> XML databases is that they've tried to make their applications work
> with RDBMS systems and they failed

Is there any hard data to back this up? Numbers of customers, number of transactions per day, etc.?
While there are certainly applications that can be built with native XML databases that can't be built with XML-enabled relational databases, the reverse is also true, so I have a hard time with the word "most".

-- Ron

-----------------------------------------------------------------
The xml-dev list is sponsored by XML.org <http://www.xml.org>, an
initiative of OASIS <http://www.oasis-open.org>

The list archives are at http://lists.xml.org/archives/xml-dev/

To subscribe or unsubscribe from this elist use the subscription
manager: <http://lists.xml.org/ob/adm.pl>