Re: What is XML For?
On Tuesday 29 October 2002 16:16, Paul Prescod wrote:
> Alaric Snell wrote:
> >...
> >
> > I guess one thing that bugs me is that a schema might be used to test a
> > bit of code that writes out documents but not one that reads them.
> > Somebody might have added a new element and then forgotten to update the
> > schema.
>
> In the areas where XML particularly excels, these sentences wouldn't
> really even make sense. How do you "add an element to SVG" or HRML or
> RSS and "forget to update the schema"? I mean you could update the prose
> specification and forget to update the schema but that will be caught by
> implementors pretty quickly.
>
> Schemas are most valuable when the specification process is completely
> disjoint from the schedule of any particular implementations.

Exactly; but not all situations are like that :-/

> > ... At one
> > point with the data import format somebody had even allowed arbitrary
> > elements in a certain context - data fields for a record were done with
> > <fieldname>value</fieldname>, and when we moved away from a fixed data
> > structure to an editable one in the database you could have any field
> > name cropping up there and the type of the content would have to match a
> > type pulled from our database :-/
>
> You're talking about totally different kinds of applications than the
> ones I'm talking about.

Yep, I'm focussing on the areas in which your arguments don't hold to
show that there are significant exceptions to them, areas where XML's
model is weak. You can say "Well don't use XML for that then!" but that
is just walling XML into a corner; there are flaws with its use in many
situations. Should you just return to using it for marking up text
rather than for data / control flow work? I think so, but not everyone
agrees.

> Whereas in-memory data structures are very different.
> XML includes all
> of the features that can reasonably be extracted from regular language
> theory to give you power in describing your serialization. Some people
> want and need that power. Some don't.

But there is more power to be had in a system of general graphs than in
a system of trees - not that they're not mappable to each other, but
that it's more convenient to do everything as graphs than to force
non-tree things into a tree.

> > ...I'd rather write:
>
> So use a tool that allows you to do that!

I do, but everyone keeps telling me XML would be better!

> > But THEY don't even want XML; they probably don't find wandering a DOM
> > tree any more friendly than calling whatever passes for Perl's "pack" and
> > "unpack" in VB. They are the people who want to just have magic
> > serialisation from data structures to strings of bytes.
>
> If they are web developers then they are VERY familiar with the DOM.
> They eat DOMs for breakfast. Now they have DOMs not just for user
> interface but for structured data. To many of them, that's a big
> improvement.

Only client side scripters and people who wrote browsers would have done
much with the DOM before, surely? Not the people who would have been
doing data interchange betwixt business partners and their ilk?

[we only use a handful of protocols in practice]

> > That's purely a problem of adoption in the protocol marketplace, not
> > difficulty of development.
>
> I didn't say they were difficult to develop, I said they were difficult
> to deploy. Part of that is because they often use idiosyncratic
> syntaxes, operation names and addressing schemes. This makes
> implementation more challenging.

Read some ye olde protocol RFCs; they're not really hard to implement
(for a reason!
Programmers were as cunningly lazy then as now, and those early RFCs
were written by programmers after a few experimental implementations
rather than by lofty committees), and certainly apart from REALLY early
ones they all tend to use the same terminology and mental model, since
the people who wrote new ones read a lot of old ones first.

> > But I'm in a little WG right now developing a protocol to replace IMAP,
>
> They are replacing IMAP? I'm still waiting for IMAP to replace POP. ;)

Exactly... see the problem!

> > Nope, because it's the same model still, just implemented differently.
> > From a linked list of C structs to the result of an SQL "SELECT * FROM
> > PurchaseOrderLines where poNumber = <foo>" isn't a change of data
> > structure, just a change of implementation, and indeed in SQL interfaces
> > I've written for suitable dynamic languages where I can throw together a
> > 'struct' type at runtime from the result of an SQL query, the linked list
> > and the result set both support an interface like Java's Iterators since
> > they are the same data structure.
>
> You're using a definition of data structure that is totally foreign to me.

Check out java.util.Collection and its subinterfaces... they provide
some implementations of the interfaces in memory, and you can write your
own to back onto other things. I recently did one that abstracted out
some nasty SQL (where I run multiple queries then manually perform a
kind of merge operation on the result sets to generate one result) into
an Iterator, for example. I'm talking about data structures from the
perspective of the developer using them, not the developer writing them,
though.

> > Just to reverse positions, I see XML as useful for marking up text... but
> > it's not well fitted for data.
>
> Text is data. If you mean "tuple-structured data" then say so. I'd say
> that XML is good for all sorts of hierarchical, recursive data with links.

See other discussion on types of data :-)

> > <cheese>
> >   <name>Cheddar</name>
> >   <colour>Yellowish</colour>
> >   <price currency="UKP" unit="kg">2.50</price>
> > </cheese>
> >
> > ...without enough stopping to think if it's a good idea.
>
> You can't do real publishing without handling this case. This is the
> core of what complicated technical publishing is all about. You're
> describing a standard catalog! The same goes for links. Complex
> technical publishing has to handle it.

Complex technical publishing also has to handle images and so on, but
XML seems to manage fine without that. Why couldn't it just refer to the
catalogs and indexes from .csv files the same way it delegates bitmapped
images to .gif files?

> > ... The W3C seems to disavow
> > responsibility in the first paragraph of that introduction. But somebody
> > somewhere made a mental leap from "styling a human-readable document" to
> > "data transfer". There are gray areas between the two, since an invoice
> > might well be considered to need to be both a readable document and a
> > piece of data, but nobody seems to be putting <?xml-stylesheet?> PIs in
> > their XML purchase orders, do they?
>
> No, because they are sending them over SOAP which makes the PI kind of
> useless (and in fact illegal). I certainly hope that as REST catches on,
> this will become common.

I'd have expected that sending this stuff by email would have been wise;
push model is a standard way for serving documents to somebody, isn't
it? More people have email addresses than they have personal online
banking systems that can accept HTTP POSTs. But I digress.

> Today, there are a variety of REST applications that allow you to view
> "data" as rendered documents with stylesheets. Xoomle, Meerkat and the
> Amazon API come to mind. In fact, the idea that data should be
> straight-forwardly renderable has almost 100% penetration in the REST
> world and 0% penetration in the SOAP world.

Indeed...

> I still see it as part of the promise of XML that invoices and other
> structured data will be accessed through URIs,

What about email, though? What does REST have to say about push models?
Is it just a matter of saying that it's equivalent (at some level) to
POSTing to a mailto: URL?

> rendered through
> stylesheets, displayed as documents and that machines will use those
> same URIs to manipulate those same XML documents for automated processes.

I like that model, but only for things that are *predominantly* for
humans to process. I'd like to be sent invoices by email that look like
invoices, but that my accounts package notices so that when I view the
invoice in my mail client I also get a button saying "pay" alongside
"reply". This is part of my Big Vision For Networked Computing (tm). I
bet that at least one of the other people who's had that idea has called
it "Smart Documents" or something; I daren't Google out that phrase
because I know I'll see something that'll make me feel ill.

But when it's megacorp A ordering their daily batch of 10,000 widgets
from megacorp B, and both ends of the communication are unfeeling
whirring lumps of software, I'd rather go for something a little less
complex...

>
> Paul Prescod
>

ABS

--
A city is like a large, complex, rabbit - ARP