Re: Relating to XML
Chiusano Joseph wrote:
> Could you please elaborate on why you believe that using XML for data
> interchange isn't always the best solution, and what better solutions
> there may be?

Simon St. Laurent summed it up pretty well in his response, so I won't repeat what he said there.

What I might add is that although I've not seen much documentation of the actual thought processes that the original designers of XML went through, it looks to me from what they produced that they really just intended to replace HTML for the Web. From what I've seen, they were saying:

"The current HTML Web is only useful for human browsing. Despite having a global data network, we're mainly using it to communicate stuff that only humans can really understand. It's stupid that every e-commerce Web site out there has a catalogue available online, but it's still not practical to write an application that merges the catalogues of multiple suppliers together, choosing the cheapest when the same product is available from more than one supplier or just listing multiple supplier options, depending on the user's wishes. So our plan is to migrate away from HTML to a new web where people use an XML structure for their site, which does not change, and a separate layer of CSS or XSL that they can change to make cosmetic adjustments. This way, applications can view the product pages directly in XML; hopefully the catalogue sites will migrate towards a standard XML vocabulary for product pages, but even if not, it won't be the end of the world if people have to write XSLT to transform the different vocabularies into an arbitrarily chosen common one within the application. A neat side effect of this is that we will move people away from the current horrible abuse of HTML as a graphical layout language rather than a document language, which is causing accessibility problems".
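The catalogue-merging scenario in that imagined plan can be sketched roughly as follows. This is only an illustration: the two supplier vocabularies, the SKUs and the prices are all made up, and plain Python mapping functions stand in for the XSLT transforms that the plan describes.

```python
import xml.etree.ElementTree as ET

# Two hypothetical supplier catalogues in different XML vocabularies.
SUPPLIER_A = """<catalogue>
  <product sku="CH-001"><name>Cheddar</name><price>4.50</price></product>
  <product sku="CH-002"><name>Brie</name><price>6.00</price></product>
</catalogue>"""

SUPPLIER_B = """<items>
  <item code="CH-001" title="Cheddar" cost="4.20"/>
  <item code="CH-003" title="Stilton" cost="7.10"/>
</items>"""


def parse_a(xml_text):
    """Map supplier A's vocabulary onto common (sku, name, price) tuples."""
    root = ET.fromstring(xml_text)
    return [(p.get("sku"), p.findtext("name"), float(p.findtext("price")))
            for p in root.iter("product")]


def parse_b(xml_text):
    """Map supplier B's vocabulary onto the same common tuples."""
    root = ET.fromstring(xml_text)
    return [(i.get("code"), i.get("title"), float(i.get("cost")))
            for i in root.iter("item")]


def merge_cheapest(offers_by_supplier):
    """Return {sku: (supplier, name, price)}, keeping the cheapest offer."""
    best = {}
    for supplier, offers in offers_by_supplier.items():
        for sku, name, price in offers:
            if sku not in best or price < best[sku][2]:
                best[sku] = (supplier, name, price)
    return best


merged = merge_cheapest({"A": parse_a(SUPPLIER_A), "B": parse_b(SUPPLIER_B)})
for sku in sorted(merged):
    print(sku, merged[sku])
```

The per-supplier mapping functions are the part that a shared product vocabulary would make unnecessary.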
It was the logical next step after moving people away from font tags towards CSS; it was a logical progression from:

    <div class="introduction">The <span class="emphasis">best</span> type of cheese is ...

towards:

    <introduction>The <emphasis>best</emphasis> type of cheese is ...

From what I've seen - and please correct me if I'm wrong - they just wanted to try to move the Web towards something that machines can glean meaning from, in order to increase the ease of 'screen-scraping' (for application developers) and alternative ways of displaying the information to humans (such as speech synthesis, for the disabled as well as people who just plain want to control how they view information). That said, I'm not sure whether some of what Tim Berners-Lee said at some points was just somewhat idealistic wording of how technology could reshape the world, or part of an actual coherent plan.

XML seems well designed for this kind of thing. Mixed content, and the fact you have the choice between attributes and element content, seem to point towards this (attributes would normally be used for meta-information, to keep metadata out of the actual content text). DTDs, with their fabled lack of type constraints on element content, seem perfect for describing this kind of vocabulary.

If all recipes were written in the same XML vocabulary for recipes, then site authors would still be able to make their pages fit in with the rest of their site using CSS or XSL, so being constrained to the common format would not take their precious creative options away from them. Yet at the same time, hardened recipe-addicts could tell their browsers to display every recipe they encounter using a single fixed stylesheet so their skilled eye can use familiar visual cues to skip through the ingredients list. And a recipe library manager application can just accept the URL of a recipe, ignore any links to stylesheets, and just extract the data.
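As a rough sketch of what such a recipe library manager might do, the snippet below parses a recipe document, takes no notice of its stylesheet link, and pulls out the data. The recipe vocabulary here (`recipe`, `title`, `introduction`, `ingredient`, and the `quantity` and `unit` attributes) is invented for illustration and is not any real standard; the document is embedded as a string rather than fetched from a URL to keep the example self-contained.

```python
import xml.etree.ElementTree as ET

# A hypothetical recipe document; the element and attribute names are
# assumptions made up for this example.
RECIPE_XML = """<?xml version="1.0"?>
<?xml-stylesheet type="text/css" href="site.css"?>
<recipe>
  <title>Cheese on toast</title>
  <introduction>The <emphasis>best</emphasis> type of cheese is a matter of taste.</introduction>
  <ingredient quantity="2" unit="slice">bread</ingredient>
  <ingredient quantity="50" unit="g">cheddar</ingredient>
</recipe>"""


def extract_recipe(xml_text):
    """Extract the recipe data; the stylesheet link plays no part."""
    # The default parser does not add processing instructions such as
    # <?xml-stylesheet?> to the resulting tree.
    root = ET.fromstring(xml_text)
    return {
        "title": root.findtext("title"),
        # itertext() flattens mixed content such as <emphasis> into text.
        "introduction": "".join(root.find("introduction").itertext()),
        # Metadata lives in attributes, the content in the element text.
        "ingredients": [(i.text, i.get("quantity"), i.get("unit"))
                        for i in root.iter("ingredient")],
    }


recipe = extract_recipe(RECIPE_XML)
print(recipe["title"])
for name, quantity, unit in recipe["ingredients"]:
    print(quantity, unit, name)
```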
Likewise, an online bank might render your bank statement using a common XML format for bank statements so that not only can you look at it, you can drag the URL into your personal financial app and expect it to start reconciling the statement against your transaction log.

But people who write articles in the technical press, and big business (not sure in which order), seemed to think that this meant XML could do *anything*; so the idea of XML-based protocols, and file formats for applications using XML, began to spread. They were perhaps aided by the fact that a large percentage of Web developers really learnt programming by writing Web applications, hadn't seen much of the world beyond that domain, and so were unaware of a broad existing body of experience in transporting data between applications (and thought that reading from a binary file was harder than parsing text, because they had more experience with the latter than the former).

They started to use XML for stuff that was mainly intended to be communicated between applications, rather than for publishing documents on the Web. Rather than treating XML as a document format, they treated it as a data format. They wrote code that serialised objects to it - <classname><elementname>value</elementname><element2name>value</element2name></classname> - which didn't produce anything very human readable, since there wasn't any CSS or XSLT to convert it for display, and the class and element names only meant anything to programmers.

If all you want is a format to express that this particular instance of Person has a Name of "Alaric" and a Gender of "Male", then there are plenty of perfectly good formats already around for that kind of thing; there's little need to create a new XML format just to pass mail merge records from your database to the company that's sending out your invoices.
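To make the comparison concrete, here is a small sketch that serialises the same Person records both ways, once in the object-dump XML style described above and once as CSV. Only the "Alaric"/"Male" record comes from the text; the other 999 records are generated placeholders, and the `People` wrapper element is an assumption.

```python
import csv
import io
import xml.etree.ElementTree as ET


def to_xml(records):
    """Serialise records in the <classname><fieldname>value</...> style."""
    root = ET.Element("People")  # hypothetical wrapper element
    for rec in records:
        person = ET.SubElement(root, "Person")
        for field, value in rec.items():
            ET.SubElement(person, field).text = value
    return ET.tostring(root, encoding="unicode")


def to_csv(records):
    """Serialise the same records as CSV with a single header row."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=["Name", "Gender"])
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()


# One record from the text plus generated placeholder records.
records = [{"Name": "Alaric", "Gender": "Male"}] + [
    {"Name": f"Person{i}", "Gender": "Unknown"} for i in range(999)
]

xml_text = to_xml(records)
csv_text = to_csv(records)
print("XML characters:", len(xml_text))
print("CSV characters:", len(csv_text))
```

The XML repeats the field names in every record, whereas the CSV states them once in the header, so the gap grows with the number of records.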
Now, there's a use for an XML format for personal contact details if you're producing a Web site that publishes contact details; if you have them in a common XML address book format, then somebody viewing that address on the Web can just drag the URL or whatever into their word processor to have it open up a letter template with the name and address prefilled, or drag it into the address book to have it added there.

But when you're exporting a database of 10,000 people's addresses, all those human-readable element names used in every single one of those records are a bit redundant. And with no CSS or XSLT stylesheet, generic display software will only be able to display it as a tree, which won't make an immense amount of sense to non-technical people unless they sit and stare at it for a while. Not that it will fall into the hands of many such people as it is run off the database and emailed to the invoice printing house, anyway. In those industries, CSV still reigns supreme as the format of choice.

And the flamewars began... :-)

My own story began as a designer of file formats and network protocols; a large part of my theoretical study in computer science has been in the representation of information. I heard rumours that XML solved the 'babel problem'; there seemed to be claims that XML solved interoperability problems in some new way. But when I looked into it, all I found was a way of writing tree structures in text, different in no significant way from things like S-expressions.

When I questioned those who were fervently claiming that XML was suddenly making possible what was not possible before, they said things along the lines of "Look! Now my CGIs can output XML as well as HTML, so other people's CGIs can just fetch the XML to get at the raw data! Suddenly, other people's apps can use my app as a building block!" To which my reply was something along the lines of "Er... people have been doing that kind of thing for years, actually...".
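The earlier remark that XML is a way of writing tree structures in text, much like S-expressions, can be illustrated with a short converter. This is a minimal sketch: the `(@ ...)` form for attributes is an assumed convention, loosely in the spirit of SXML, and the code does no escaping of quotes and discards whitespace-only text.

```python
import xml.etree.ElementTree as ET


def to_sexpr(elem):
    """Render an element tree as an S-expression string."""
    parts = [elem.tag]
    if elem.attrib:
        # Assumed convention: attributes go in a leading (@ ...) list.
        attrs = " ".join(f'({k} "{v}")' for k, v in elem.attrib.items())
        parts.append(f"(@ {attrs})")
    if elem.text and elem.text.strip():
        parts.append(f'"{elem.text.strip()}"')
    for child in elem:
        parts.append(to_sexpr(child))
        # Text following a child element (mixed content) is kept in order.
        if child.tail and child.tail.strip():
            parts.append(f'"{child.tail.strip()}"')
    return "(" + " ".join(parts) + ")"


doc = ET.fromstring(
    '<person id="1"><name>Alaric</name><gender>Male</gender></person>'
)
sexpr = to_sexpr(doc)
print(sexpr)  # (person (@ (id "1")) (name "Alaric") (gender "Male"))
```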
That was because the fans I talked to were people who joined an Internet composed of things like IRC, email, Web pages, and Usenet and saw it only as a way for humans to communicate; whereas I, and most of the people I hung out with, had first heard of the Internet as a protocol stack, and saw it as a way for bits of software to communicate - including, but not limited to, passing around information for display to humans.

The thing is, most of the killer apps of the Internet were human-communication applications. Previously, if two companies wanted to exchange data for whatever reason, they'd set something up with a modem that dialled at a fixed time every day, or on demand, or lay a leased line. As they still do; but most new communications are set up to work over the Internet, usually using scheduled FTP uploads in my experience. The arrival of the Internet has mainly just decreased the cost of those communications links, although many businesses are worried about the security implications of putting their data over the public network (banks in particular...) and still continue to pay for leased lines, frame relay, and so on.

However, if people wanted to chat in subject-based geography-independent 'rooms' where their age, fashion tastes, regional accent, income level, ethnic background, and gender were all hidden, they'd have had to use special technology such as ham radio or dialup bulletin boards, which for various reasons (both technical and social) tended to be the province of nerds. The fact that the Internet lowered the cost of data communication enough to make things like IRC, email, Usenet, and so on accessible to consumers made a lot of new things possible, so those previously-nearly-impossible things grew like wildfire.

One good thing that has come out of the XML hype is that it got people thinking about application intercommunication across the Internet - pulling it into the limelight again.
Up until then, the decision makers in online businesses hadn't really considered the idea much, since making Web sites was what everyone was doing. Had they thought it would make them money, they could easily have made their core data and services available to other application developers using anything from ONC RPC (toolkits for which come with all the open source UNIXes, for a start - still a bigger deployed base than XML tools, probably!) to emailing CSV files. Even so, the thought of 'syndicating' their content via XML appealed to many.

However, in the long run, it seems that few commercial web sites are doing this unless you pay them for the privilege, since they don't make any more money from you viewing their pages in XML than in HTML, and writing separate XML and CSS/XSL is more complex and costly than just writing HTML in the first place.

I've been involved in a few Web projects that had the backend logic producing XML and an XSLT layer converting that to HTML for display, and they never really had any benefits to show from it. With a site that directly generates HTML, the designers can write an HTML page with little magic codes in where they want the product name, cost, and so on to be placed when the page is generated dynamically; with an XSLT site, they need to pass their desired HTML layouts to an XSLT guru who can convert them into stylesheets, and who then passes a template XML file to the programmer instead of a template HTML file.
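The "magic codes" approach to direct HTML generation might look something like the sketch below, which uses Python's standard `string.Template` as a stand-in for whatever templating system a real site would use. The placeholder names `$product_name` and `$cost`, and the page layout, are invented for the example.

```python
import html
from string import Template

# The designer's HTML page, with placeholders ("magic codes") marking
# where dynamic values go. The placeholder names are hypothetical.
PAGE = Template("""<html><body>
<h1>$product_name</h1>
<p>Price: $cost</p>
</body></html>""")


def render_product(name, cost):
    """Fill in the designer's template with values from the backend."""
    return PAGE.substitute(
        product_name=html.escape(name),  # escape markup-significant characters
        cost=f"{cost:.2f}",
    )


page = render_product("Cheddar & Onion", 4.5)
print(page)
```

The designer edits the HTML directly and the programmer only supplies values, which is the workflow that the XSLT arrangement complicates.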
What's worse is that most Web site designs are in constant flux as the management demands changes to try and bring in more successful sales - so when they say that each product page should now have links to the past 5 product pages you viewed so it's easier to switch between products to compare them, suddenly your nice generic "Product description XML" being generated from the backend database has to contain dynamic information generated from the viewer's browsing history, so the XSLT can convert that into the 'Recently Viewed Items' box. Which kind of spoils the idea of it being a display-format-independent abstract description of the product; because the sad fact is that companies want their web sites to be groups of pages that they control the display of. They have no particular desire to provide URLs for their *products*, that when viewed happen to produce a nice product page; they want URLs to point to pages.

Part of the original semantic-web XML dream seemed to be that there would be a URL for a product, and viewing that URL in a browser would produce a nice pretty page about the product while software could access the same URL to get a concise database entry about the product, whose format would not change with every site redesign, so application developers could count upon it. Sadly, most real e-commerce sites will from time to time decide to split each product's details over several pages, or not, meaning that the cosmetic changes don't just affect how a page's information is displayed, but what information is on what page. So providing a URL from which a universal product description could be fetched in XML would mean *extra* effort for them to *separately* provide that information; not just a matter of moving away from HTML to XML+CSS. And they just don't see much of a business case for doing that unless people pay to access their data directly.
One point to consider is that none of the XML vocabularies I've seen contain fields for adverts to display to the viewer - and even if they did, no recipe database application in its right mind would bother actually storing the adverts that were supposed to be shown alongside the recipe on the original site and displaying them - and many sites make a nice bit of revenue out of showing adverts. Any version of the Web that does not allow the authors of sites to enforce the display of adverts seems to get a lukewarm reception!

See also: http://www.xml.com/pub/a/2001/06/13/threemyths.html

ABS