Re: Note from the Troll
Thanks for clarifying. I hope you don't mind if I respond to some of the points. I'm in agreement with many of them, but not with the overall conclusion that XML has no value.

On Sun, 2002-10-27 at 06:49, tblanchard@m... wrote:
> Lately, I'm working for a company that is exchanging HR information
> with job boards (like monster and hot jobs) - which has its own working
> groups trying to define HRXML.

That's an interesting problem space, especially for someone strongly convinced of the value of the relational model. In the past ten to fifteen years, in my experience, many of the largest firms moved personnel information out of databases and into LDAP. LDAP is all sorts of things, but it isn't very relational. It's very easy to model as a hierarchical database. Or as XML.

On the other hand, LDAP has some significant limitations. It isn't very relational, for one. *laugh*

The problem, in the HR space, is that information fits neatly into hierarchies, except when it doesn't. And relations do a nice job, except for the rigor of column definitions. So that area, in particular, is a hard problem, one that probably requires a synergy of technologies. For all the hype surrounding the XMLization of the current leading relational database products, we aren't there yet.

> 1) XML Tools [expletive deleted] - they're little more than syntax coloring editors.

Hmm. My favorites are, but I work for a company that has produced strongly graphical editors. Available on a Mac, no less (the graphics used to describe various schemata have a tendency to appear in a variety of books; we pass them around at work, when found). On the whole, I tend to agree that tools aren't up to par as yet. But ... different people need to do different things. Cue rant from ERH on the uselessness of tree view as an editing model.

> 2) The Hype is at the same level as the hype was for AI and it can't
> possibly live up to it. It should be written <genuflect>XML</genuflect>

All too true.
Are the hypesters identical to the developers? I think that that was true for the AI model. I'm doubtful that it is the case for XML. The most outrageous claims seem to be made by PHBs.

> 3) The weight of the processing model is really really heavy. As an
> example, using URLs to reference DTD's causes all sorts of problems for
> computers when they're off the network. XML parsing simply halts.
> This is especially annoying when running something like BEA WebLogic on
> your machine because you're doing a web app. BEA stores config info in
> XML which references some DTD at BEA and the server simply won't start
> if that server isn't available. You can argue this is a misuse of XML
> - I think so - but its one of those things thats going to hurt people's
> impressions of XML.

Hmm. I think that much more can be said here, and that some of it can tie in with points 5 and 6 below.

It no longer surprises me that W3C recs tend to show the adverse effects of prolonged URI abuse. One of the canons is that URIs are the perfect addressing mechanism. No, wait, the perfect identification mechanism. No, wait. Oh, and sometimes URIs are not URIs. Massive confusion is created over whether the use of URIs, in a particular context, is for identification, for comparison, for location, or for decoupling.

In fact, I would come to the same conclusion here, but argue that the problem is an *inadequate* processing model, not one that's too heavyweight. It isn't, on the whole, clear when you should retrieve a URI, and when you ought to compare it to something else by the rules governing URIs, and when you ought to compare it as a string, ignoring the fact that it has the form of a URI. Add linking (the XML version of relations, if you will), and life becomes most unpleasant.

See the discussion between ERH, SSL, and UU on order of processing of XInclude during XPath processing. Eric van der Vlist has proposed a model for specifying processing order, to address the issue.
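
To make the string-versus-URI comparison distinction concrete, here is a rough Python sketch. It is only an illustration under stated assumptions: the function names and the example URIs are made up for this note, the normalization covers only a small slice of what RFC 3986 describes (scheme and host case, default ports, empty path), and userinfo is ignored entirely.

```python
from urllib.parse import urlsplit, urlunsplit

# Default ports for the two schemes this sketch knows about (an assumption;
# a fuller implementation would cover more schemes).
DEFAULT_PORTS = {"http": 80, "https": 443}


def same_as_string(a: str, b: str) -> bool:
    """Compare character for character, ignoring that these look like URIs."""
    return a == b


def normalize(uri: str) -> str:
    """Apply a few of the RFC 3986 normalizations (partial, hypothetical helper)."""
    parts = urlsplit(uri)
    scheme = parts.scheme.lower()
    host = parts.hostname or ""          # urlsplit lowercases the hostname
    netloc = host
    # Keep the port only when it differs from the scheme's default.
    if parts.port is not None and parts.port != DEFAULT_PORTS.get(scheme):
        netloc = f"{host}:{parts.port}"
    # An empty path on a URI with an authority is equivalent to "/".
    path = parts.path or ("/" if host else "")
    return urlunsplit((scheme, netloc, path, parts.query, parts.fragment))


def same_as_uri(a: str, b: str) -> bool:
    """Compare after normalization, i.e. (roughly) by the rules governing URIs."""
    return normalize(a) == normalize(b)


a = "http://www.example.org/2002/ns"
b = "HTTP://WWW.Example.ORG:80/2002/ns"
print(same_as_string(a, b))  # different strings
print(same_as_uri(a, b))     # same resource identifier after normalization
```

The two answers differ for the same pair of inputs, which is the whole problem: Namespaces in XML compares namespace names character for character, while a retrieval layer would treat both spellings as the same address. Nothing in the URI itself says which rule applies in a given context.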
> 4) Schema is really an insane spec. I mentioned just the data types -
> too many too complicated. Do we have to specify the number of bytes
> for the ints? Thats a physical issue and for this Smalltalker it
> doesn't even make sense (Smalltalk handles arbitrary sized numbers).

Umm. See [me] for rants on schema type cluelessness. Argh.

In fact, I don't much care about the existence of the various register-based integers. They're among the twenty-five derived base types. The derivations even make a certain amount of sense.

On the other hand, there are *nineteen* primitive types. That's string, boolean, number ... uh, another number. And oh, yeah, another number. Only different. And, umm, date, time, dateTime, and duration. Oh, and another five or so things that have to do with time or duration. Not related to one another.

Oh, and remember we were talking about how URIs are sometimes just strings? Well, in schema, they *aren't* strings. No, I said I wouldn't go on this rant. Sorry.

This is a place of deep flaw, that needs very serious, very careful work. What's worse is that the spec was so delayed, and so anticipated, that most folks really, really want to overlook its enormous hairy dangling warts and just get on with the job.

> 5) Like C++, the average developer can't cope with the excessive
> complexity XML introduces relative to its value. The average
> programmer doesn't really know the difference between nonnegative int
> and positive int. In fact, the schemas I'm getting from biz partners
> (the couple that want to use XML because everyone is using it - and its
> less than half) are AWFUL.

Ugh. There is a solution to this problem; it's called RELAX NG. But it isn't a solution to the problem of primitive types, which hasn't been addressed, as yet.

> XSLT files are maintainable with the same level of ease as densely
> written perl. Developers asked to modify them routinely rewrite them
> because they can't figure out what the last guy was doing.

Hmm.
I haven't encountered that one. I think some of the more web-oriented here have reported similar things, though (my work doesn't bring me into contact with that segment). So the transform sets that I've seen in use tend to be much more maintainable. Come to think of it, though, it wouldn't surprise me to find that quickie XSLT transforms would share the ease of confusion of quickie perl hacks. Similar problem spaces; perl addresses text, XSLT XML.

> loved it because the thing itself is the damnedest puzzle. It
> entertained and challenged their intellect to work with it. I'm
> beginning to suspect the same about XML.

*laugh* I, for one, *hate* having the nasty little corners crop up. I end up having to explain (for instance) that attributes are *not* in the default namespace when unadorned, even though similarly unadorned elements are. I'm not looking forward to having to explain why 0x0D is in the whitespace production, even though it can't appear in an XML stream ... I'll have to refer folks to the post describing that particular Stupid Entity Trick, because I can't remember it. And when they ask why any spec would support something so extremely obscure, I'll just have to shrug. Smiling and wincing.

> 6) From the stand point of business process and enterprise architecture
> - XML is an evolutionary step backwards. Hierarchical databases were
> abandoned for relational models long ago and systems made out of lots

Err. There's been one objection to this already. I can add that there are a number of spaces where hierarchical organization is taking share from relational models (LDAP is one).

For long term storage of information, XML actually does make sense, because it's easy to untangle, in comparison to a number of alternatives. Note that it is not necessarily better than plists, or S-expressions, except that it's in wider use, and that means that the knowledge of how to tease information out has a better chance of remaining over the long term.
This is important for governments, for instance. Last I heard, the 1960 US census is stored in a format that can be read by only two working machines in the world (one of which is on display at the Smithsonian). Having trained as an historian in the long ago, I can say that that is an unmitigated disaster. Cost of transformation, if undertaken, will be enormous. If not undertaken, loss of information will be enormous.

Folks (including, from an apocryphal source, the IRS) store proprietary formats into databases as BLOBs. Better if that's XML.

XML's looser restrictions provide an anodyne to the limitations of databases, and one that ought to, at some point, be taken up as a synthesis. Common databases do far better with primitive types, but have far less ability to handle semi-structured information.

This, in fact, is probably the HR problem. There's personnel information that ought to be loosely structured, the sorts of things that are well described by DTD and RELAX NG without types. There's other information that has strong typing, and there's a real need to get at things fast, to be able to index and look up in a variety of ways, and to avoid redundancy of information. Somewhere in a synthesis of hierarchy and relation may lie the answer. It isn't currently available, though, to the best of my knowledge.

> I need to write the follow on to this piece but it will focus on point
> 6 above plus the assertion that XML fails as both a markup language
> (markup shouldn't require well-formedness) and as a serialization

*ABSOLUTELY* disagreed. Well-formedness is the thing that makes XML worthwhile.

SGML, intended for writing in text editors, has all sorts of cute little tricks called "markup minimization". You can end a tag with </>. If SGML had primitive types for elements, you'd see minimization like <price/2.99/. And other Stupid Tag Minimization Tricks.
These are great for typing, until you reach the point in the document that looks like this:

</></></></></>

Hope you don't have to extend one of those sections. Whatever they are. Still worse if you have full minimization, where the end tag is merely implied. Does this element imply the end of the last three open tags? Or just of the last two?

Well-formedness removes ambiguity. It does so at the expense of terseness. It's a good choice for a markup language. It's a poor choice for a data exchange format, where all the data is in very tiny chunks, and the ambiguity does not arise.

> format (too hierarchically oriented, verbose, and weirdly structured
> relative to ER models).

*shrug* ER models have their limitations as well. More loosely structured data than tables can easily model are commonly encountered. It's a good place to use XML.

> 7) While there is lots of heavyweight support for reading XML, there
> isn't any help for writing it from various other data structures.

Hmmm. I kinda need to think about that one, I guess. We've been having go-rounds on the subject of "serialization", at work.

> So there's my viewpoint. Take it for what its worth. Am I trolling?

Well, I thought so. Even I can claim to have "used" DTDs for eight years, since I was writing HTML with doctype decls at the top in '94. Clearly, I was wrong to think so, but it was hard for me to see what the needs were, behind the very strongly expressed antagonism.

> Because it runs against what I've found to be true in my own work.
> Centralize business logic and ER/OO modeling to model your business
> entities and processes. This works well when the company has the
> discipline to pick an implementation language and stick with it and
> focuses all developers on this goal.

I was a grunt programmer at some of those companies. Speaking from the trenches (RPC, DCOM, CORBA), I don't think those models work. In fact, I think that they're seriously flawed by adopting Sun's marketing slogan.
The network is not a computer.

> But of course, XML philosophy says the opposite. Where is the business
> rule repository in XML?

Where do you want it to be? XML is data, not objects. It isn't even very good at linking, just at the moment, even though it was always *supposed* to be.

> Have you considered that you're breathing your own exhaust? You don't

*laugh* Such an image. No, I'm being a hairy nuisance to various committees, trying to convince them *not* to make any more nasty little complexities that I'll end up having to explain. Unconvincingly, since there really isn't any excuse for monstrosities like gHorribleKludge.

However, I think that there are some underlying principles that are very valuable, and that address a number of problems hard to deal with using other tools. You may not agree, if you see no value in well-formedness. Alaric doesn't agree, 'cause he doesn't see value in text-only formats.

XML well-formedness and text-only format are very important for easing exchange and encouraging long-term preservation of information.

The looser-than-database-columns structure of schemas (not meaning WXS, here, but DTD and RNG, primarily) provides a useful match for many kinds of information that fit awkwardly into relational models (this information often ends up as unindexable blobs).

Localization of information is important when information is transient; XML documents are often better solutions for the delivery of information than are rowsets (particularly when the information may be loosely structured, as noted in the previous point).

As a format for exchange of strongly typed data, it's currently a mess and a failure (in my opinion), because there is no type system (what's in WXS isn't a system).

Amy!
--
Amelia A. Lewis amyzing@t... alicorn@m...
To be whole is to be part; true voyage is return.
-- Laia Asieo Odo (Ursula K. LeGuin, "The Dispossessed")