[XML-DEV Mailing List Archive Home]
[By Thread]
[By Date]
[Recent Entries]
[Reply To This Message]
Re: [Fwd: Re: storing XML files]
- From: "Soumitra Sengupta, Ph.D." <soumitra@b...>
- To: Ronald Bourret <rpbourret@r...>
- Date: Tue, 09 Oct 2001 09:31:13 -0500
Ronald Bourret wrote:
3BC2B2AA.793C954A@r...">Soumitra Sengupta wrote:
This is a valid way of solving XML storage, indexing and retrieval problem if you have:
a. you have pretty well defined DTD or schema for all the incoming documents and they are unlikely to change. You do not expect to add new DTD or schemas in the future often. b. the hierarchical relationship expressed in these documents are not very rich or are not too nested.
Agreed. However, I would caution that "not very rich" and "not too nested" are not well-defined terms. That is, what is acceptable performance in one application is not acceptable in another.
I should have been more precise but one person's "rich" is another person's "flat". I guess semi-structured
would have been a more appropriate definition.
3BC2B2AA.793C954A@r..."> Furthermore, native XML databases are not guaranteed to outperform XML-enabled relational databases in all cases. As a general rule:
3BC2B2AA.793C954A@r..."> 1) Native XML databases should outperform XML-enabled relational databases on queries along the XML hierarchy, such as XPath queries.
Agreed.
3BC2B2AA.793C954A@r..."> 2) It's anybody's guess which will perform better on queries not along the hierarchy, such as a query that inverts an XML document. Theoretically, it seems that if the queried fields are indexed in both types of database and similarly clever people implement the query and storage engines, performance should be comparable. However, I don't know enough about storage and query engines to know if this is true.
3) Native XML databases could encounter serious problems querying unindexed data. This is because they have to search many more nodes to find the data. That is, first they have to determine if a node is the correct node ("Are you a foo element?"), then they have to look at the data for the node ("Is your value 'bar'?"). Relational databases only have to look at a given set of column values.
Xfinity already solves this problem for any well formed XML document.
3BC2B2AA.793C954A@r...">
If on the other hand you do not have control on the structure of the incoming documents, they vary a lot and they are very rich, you should look at using a native XML database.
Agreed, with the caveat that "very rich" is an ill-defined term.
There is a slight possibility that the customer moves to Oracle, so my solution shouldn't be database provider dependent.
I recommend that you look at our Xfinity Server 2.0 (http://www.b-bop.com/products_xfinity_server.htm) as it will allow you the benefits of a native XML database but protect your investment in RDMS technology like MS-SQL server or Oracle. It meets your requirement of "database provider" independent.
Note that Xfinity provides database independence because it is built on top of a relational database. This allows you to switch the underlying database. This is not true of all native XML databases, many of which have proprietary database engines.
Is it smart to develop a generalized format into which all those data feeds will be transformed (I hope an XSLT is a good tool to do that)? Or, I should deal with them individually?
Depends on whether the structure is pretty static, how many different structure are you dealing with, are they likely to change a lot, are they very rich? Also how much time and money do you want to spend on development and do you want to protect your existing investment?
Personally, I don't think this is a good idea. The types of data you mentioned (stock quotes, weather report, events information, daily news, horoscope, etc.) are all radically different from each other. It makes little sense to convert them to a common XML schema.
In most cases that would be true. But if they want to search on kewords and the issue date for
each of these documents and nothing more, this would work just fine.
3BC2B2AA.793C954A@r...">
Can I apply XSLT on my XML data feeds and directly produce SQL statements?
I have not seen anything that does this. Anyone with an answer to this?
I haven't either. However, there are a number of middleware products that will store the data for you. (These create and execute the SQL statements internally.) See the middleware section of the product list mentioned by Chris.
Has someone already developed an XML2SQL transformation tool?
Yes, although they actually transfer the data instead of generating SQL statements. Again, see the middleware section of the product list, as well as the XML-enabled databases section.
Just a flat table solution or I can also capture relationship information?
Many of the middleware solutions, as well as all of the XML-enabled relational databases, use an object-relational mapping that captures relationship information.
We have solved this problem in Xfinity. All the rich relationships are captured and stored.
I think it is important to understand a fundamental difference between what a native XML database built on top of a relational database (like Xfinity) and an XML-enabled relational database (like SQL Server). For example, consider the following document:
<SalesOrder SONumber="12345"> <CustomerNumber="543" /> <OrderDate>981215</OrderDate> <Item ItemNumber="1"> <PartNumber="123" /> <Quantity>10</Quantity> </Item> <Item ItemNumber="2"> <PartNumber="456" /> <Quantity>5</Quantity> </Item> </SalesOrder>
In an XML-enabled relational database, this will be mapped to two tables: Orders and Items. In a native XML database built on a relational database, this will be mapped to tables such as Elements, Attributes, and PCDATA. (In
Couldn't have said it better.
3BC2B2AA.793C954A@r...">
Is it the right thing to save those XMLs in to a relational database or shall I explore the native XML databases?
Depends on incoming XML and the needs of your application.
Agreed.
Is there an existing DTD for SQL or that makes no sense to you?
These exist, but I don't think anybody actually uses them. This is usually handled internally by the products that transfer data between XML documents and relational databases.
I believe Oracle and Datachannel do publish something called XPages.
3BC2B2AA.793C954A@r...">
My system should be flexible enough to accept new kinds of data feeds ...
This I think is the key requirement that you have. Unless there are other requirements that I do not understand or know of, you really should look at native XML databases.
Agreed. This is not something that XML-enabled relational databases do well, so it's worth your time to look at native XML databases.
-- Ron
----------------------------------------------------------------- The xml-dev list is sponsored by XML.org http://www.xml.org, an initiative of OASIS http://www.oasis-open.org
The list archives are at http://lists.xml.org/archives/xml-dev/
To subscribe or unsubscribe from this elist use the subscription manager: http://lists.xml.org/ob/adm.pl
|
PURCHASE STYLUS STUDIO ONLINE TODAY!
Purchasing Stylus Studio from our online shop is Easy, Secure and Value Priced!
Download The World's Best XML IDE!
Accelerate XML development with our award-winning XML IDE - Download a free trial today!
Subscribe in XML format
RSS 2.0 |
|
Atom 0.3 |
|
|
Stylus Studio has published XML-DEV in RSS and ATOM formats,
enabling users to easily subcribe to the list from their preferred news reader application.
|
Stylus Studio Sponsored Links are added links designed to provide related and additional information to the visitors of this website.
they were not included by the author in the initial post. To view the content without the Sponsor Links please
click here.
|
|