Re: Future of Databases
Ken North wrote:
> Perhaps we'll hear from Ron Bourret. It was his question whether XQuery
> would displace SQL.

I've summarized my notes below.

DISCLAIMER: Everything is paraphrased, not quoted. Also, I'm not a trained journalist -- that is, I can't think and take notes at the same time -- so if something strikes you as particularly astonishing, it probably wasn't and I'm to blame. Same goes with answers that are merely incomprehensible. Hopefully Ken can correct me as necessary. (And my apologies to Edd Dumbill for not writing this up as an xmlhack article.)

OVERVIEW:

Basically, my take on the panel was that it summarized what is current wisdom in the XML/database world, especially as seen through the eyes of people with strong relational backgrounds. In particular:

1) Relational databases aren't going away.
2) XML is/will become the dominant exchange format, with XML front ends to everything.

As the InfoWorld article said, the one area of controversy was the role of XML databases. Rick Cattell saw them as niche players (although relational databases will support native XML) while Daniela Florescu saw them in a much larger role.

QUESTIONS AND ANSWERS:

Q. SQL is a formal standard. Java and XML are not formal standards. Does this matter?

Melton: Standards are important. Without them, you don't get interoperability. What is less important is whether these are de jure or de facto. Note that ANSI, ISO, and the W3C all work with each other now. Also important is that de jure standards come in geologic time, not Internet time, although this is changing with fast-track processes.

-------------------------------

Q. Is it possible to do transaction processing over the Internet and scale to millions?

Gray: Google and Hotmail do this. They have 10,000 processors [servers?]. Google is replicable and Hotmail is partitionable, so scaling really depends on the application. Traditional applications are 1000 transactions/second.
<aside speaker="Gray">The cost of computer management is greater than capital cost, so we need self-managing computers.</aside>

-------------------------------

Q. XQuery is document-centric. What can we expect?

Chamberlin: Database people are still trying to catch up with the 90s, when all computers were connected together. A laptop can "hold" all of the works of mankind. Most are semi-structured or un-structured. Many come from streaming sources. For the last eight years, the database industry has been trying to solve this. [RPB: I believe this refers to the various attempts by relational databases to store/query text, as well as to add metadata to BLOBs, etc.]

Predictions:

1) XML is/will become the dominant format for data interchange. It is flexible and self-describing.
2) Applications will want to query data in the same format they exchange it -- that is, they want to view all sources as XML.
3) RDBMSs and SQL won't go away. They are good for homogeneous data. Instead, they will add XML front ends.
4) Other data sources will add XML front ends as well.
5) Large RDBMS vendors will make XML a first class citizen.

-------------------------------

Q. With document-centric XML [in the main?], will tools follow?

Chamberlin: Yes. It is still the early days. Updates, transactions, indexing, etc. are not completely/yet addressed.

-------------------------------

Q. Lots of other industries have consolidated. Will the software industry consolidate into a single monolithic corporation?

Cattell: We need multiple vendors to keep innovation alive. There are only 3 1/2 database vendors left. At this level, de jure processes are appropriate. JDBC and similar technologies [which are newer?] can move faster. [In answer to Chamberlin?] Very few people store/query XML -- the momentum of relational databases is simply insurmountable. XML databases will be a niche only, for use in caching, etc. There is a bigger market for XML-to-relational translators.
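
As a rough illustration of what an "XML-to-relational translator" and an "XML front end" might look like, here is a minimal sketch. It was not shown at the panel. The order document, its element and attribute names, and the two-table schema are all assumptions made up for the example; it uses only Python's standard sqlite3 and xml.etree.ElementTree modules and ignores everything a real product would have to handle (namespaces, mixed content, document order, schema evolution).

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical order document; element and attribute names are assumptions.
ORDER_XML = """
<orders>
  <order id="1" customer="Acme">
    <item sku="A-100" qty="2"/>
    <item sku="B-200" qty="1"/>
  </order>
  <order id="2" customer="Globex">
    <item sku="A-100" qty="5"/>
  </order>
</orders>
"""


def shred(xml_text, conn):
    """XML to relational: store each <order> and <item> element as a row."""
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT)")
    conn.execute("CREATE TABLE items (order_id INTEGER, sku TEXT, qty INTEGER)")
    root = ET.fromstring(xml_text)
    for order in root.findall("order"):
        order_id = int(order.get("id"))
        conn.execute(
            "INSERT INTO orders VALUES (?, ?)", (order_id, order.get("customer"))
        )
        for item in order.findall("item"):
            conn.execute(
                "INSERT INTO items VALUES (?, ?, ?)",
                (order_id, item.get("sku"), int(item.get("qty"))),
            )


def publish(conn):
    """Relational to XML: rebuild an XML view of the rows (the 'front end')."""
    root = ET.Element("orders")
    orders = conn.execute("SELECT id, customer FROM orders ORDER BY id").fetchall()
    for order_id, customer in orders:
        order = ET.SubElement(root, "order", id=str(order_id), customer=customer)
        items = conn.execute(
            "SELECT sku, qty FROM items WHERE order_id = ? ORDER BY rowid",
            (order_id,),
        ).fetchall()
        for sku, qty in items:
            ET.SubElement(order, "item", sku=sku, qty=str(qty))
    return root


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    shred(ORDER_XML, conn)
    # Once shredded, the data can be queried with ordinary SQL ...
    print(conn.execute("SELECT sku, SUM(qty) FROM items GROUP BY sku").fetchall())
    # ... and still be handed back to applications as XML.
    print(ET.tostring(publish(conn), encoding="unicode"))
```

The point of the sketch is only that the stored form stays relational while the exchanged form is XML; which direction carries the real market is exactly what the panelists disagreed about.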
-------------------------------

Q. E-commerce is exchanging XML. RDBMSs can process 50K rows/second while parsers can only parse 2-3K XML documents per second. Is XML optimizable?

Florescu: I am optimistic about XML databases. XQuery should be equally optimizable [to SQL?]. There are three impedance mismatches in Web services: Web to XML, XML marshalled to Java, and Java marshalled to RDBMS. There is therefore a market for a language that joins XML, Java, and SQL.

-------------------------------

Q. What about peer-to-peer databases?

Chamberlin: If the data is easily replicated, doesn't belong to anyone, and is read-only, then peer-to-peer databases are a good idea. Bank of America is going to want more control over their data.

Gray: Napster is a good example, although they didn't own the data. SETI@Home was one third of the bandwidth in the University of California, which caused problems. The solution was to increase the size of computational pieces, therefore reducing the overall bandwidth use. Similarly, peer-to-peer sends lots of data around. The problem with this is that sending data around is expensive, while local computing is essentially free.

Cattell: There is a growth market for peer-to-peer databases. Knowledge management is a growth area for non-relational databases.

-------------------------------

Q. What is the future of databases -- T-spaces, in-memory databases, etc.?

Cattell: In-memory databases are a no-brainer. T-spaces are interesting. They are a database, operating system, messaging system, etc. all rolled into one, although they won't take over the world yet.

Gray: A variant of T-spaces is work flow. Flows can be described in XML to dovetail together.

-------------------------------

Q. In the 90s it was fashionable to say that databases were dead, that the Web exceeded databases. Will something replace databases?

Melton: Databases are increasingly needed. Storing traditional data is solved. Storing text is solved.
There are new problems, such as searching video and audio. We need joins across different types [RPB: e.g. email and video].

Chamberlin: There are lots of new challenges -- decades of Ph.D. work. XML queries are structurally different from relational queries. XML data is heterogeneous. For example, "Find all the red stuff" returns a cherry, a stop sign, etc. Data is ordered, which causes optimization problems. You can ask questions about both data and metadata such as, "What kinds of things are red?" There are new ways to deal with sparse data, which requires lots of nulls in a relational database. XML handles this data. The logic around nulls is different. XML databases need a different way to construct things due to using a hierarchy.

Cattell: Anybody who says that databases are dead means that relational databases are mature and you can't find a thesis topic.

-------------------------------

Q. How do we query the Web? Is the goal to query data or to describe it so it can be queried?

Cattell: Data needs to be described before it can be queried.

Florescu: You can query XML data without a schema, due to its self-describing nature. Vertical applications need schemas.

Melton: There are no silver bullets. The great majority of the world's data has no metadata and probably won't ever have any. Only commercially valuable data will get metadata. We are trying to find ways to query all types of data.

Gray: There are lots of data sources. For example, LDAP has 7 mandatory fields and 1000 optional fields. This fits XML well. Similar sources are email, schedules, etc. XML will be the standard interchange language, which encourages data sources to expose themselves as XML.

-------------------------------

Q. Are there context sensitive searches in XML, such as in the context of the previous query?

Chamberlin: This is not addressed in XQuery.

Melton: You can use successive refinement against temporary results. [RPB: I think somebody pointed out that XQuery is composable?]
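
Chamberlin's "find all the red stuff" example and Melton's point about refining temporary results can be sketched in a few lines. This is an illustration only, not something shown at the panel: it substitutes the limited XPath subset in Python's standard xml.etree.ElementTree for XQuery, and the catalog, its element names, and its attributes are invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical heterogeneous data: different element types, and sparse
# attributes (shape, wheels) that exist only where they apply, so there
# is nothing playing the role of a relational null.
CATALOG = ET.fromstring("""
<things>
  <fruit name="cherry" color="red"/>
  <sign name="stop sign" color="red" shape="octagon"/>
  <fruit name="banana" color="yellow"/>
  <vehicle name="fire engine" color="red" wheels="6"/>
</things>
""")


def red_things(root):
    """Data query: every element, whatever its type, with color='red'."""
    return root.findall(".//*[@color='red']")


def kinds(elements):
    """Metadata query: which element names occur in a result?"""
    return sorted({e.tag for e in elements})


def with_attribute(elements, attr):
    """Refinement: keep only the results that carry a given attribute."""
    return [e for e in elements if e.get(attr) is not None]


if __name__ == "__main__":
    reds = red_things(CATALOG)              # temporary result
    print([e.get("name") for e in reds])    # the red stuff, in document order
    print(kinds(reds))                      # what kinds of things are red?
    print([e.get("name") for e in with_attribute(reds, "wheels")])
```

No schema is consulted anywhere, which is Florescu's point about self-describing data; the second and third calls operate on the result of the first, which is the successive refinement Melton describes.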
-------------------------------

Q. What do you think of AMDs (associative model databases)?

[RPB: The panel didn't really understand the question. Neither did I. As near as I could tell, an AMD is where you store all your data in, for example, two tables. One contains individual data values and the other contains information about associations/links between data values.]

Gray: We already have an associative data model: SQL and XQuery are associative.

-------------------------------

Q. How will XQuery be used? That is, will distributed vs. local queries affect optimization strategies?

Florescu: Yes. Local and distributed queries are optimized differently. This will mean different implementations for different markets.

-------------------------------

Q. Relational databases have an algebra. Does XQuery have an algebra? Will XML databases replace relational databases?

Melton: Relational databases have already integrated object technology. Similarly, they will integrate XML technology. Yes, there is a formal model for XQuery, but not as formal as the relational model.

Florescu: We will formalize the model in the W3C. It is not as elegant as the relational model.

-------------------------------

Q. Will XML influence screen scrapers, etc.?

Gray: It should reduce the need to screen scrape. Note that there are no eyeballs for XML. The customers are programs.

-------------------------------

Q. Will XQuery replace SQL as the application query language of choice?

Chamberlin: No.

Florescu: Not in five years, but in ten years an extension of XQuery might replace both.

Melton: Don't underestimate relational databases.

Cattell: No. Another language might replace them. For example, something Google-like.

-- Ron