Re: seduced by markup
I feel pretty odd defending a syntax I have self-righteously sneered at since 1985, but here goes. First, let me say that I can't explain everything, even to myself. It is weird to reopen my long-closed mind about this.

> Wow - Steve I love your analogy, but I think you may have fallen in love
> with it for its own sake, because it just doesn't apply to the DTD/RNC
> comparison, at least I don't see it. Now I may be a high priest and out
> of touch, but I just don't think DTDs are as transparent or accessible
> as you seem to think they are.

It's an observation. I've had to explain the syntax many times to many diverse audiences. It's downright perverse that they get it so readily. I can't explain it. I can only make guesses.

> Years ago, as a relative newcomer to these technologies, I looked at DTD
> and RNC, and RNC made immediate, intuitive sense to me in a way that
> DTDs did not (and probably never will, because I doubt I'll spend enough
> time looking at them to get used to the squonks). Now that's a
> completely subjective assessment, but for example consider the alien
> words and tokens you have to learn to really get DTDs:
>
> You have to understand that % is a magic letter that means the following
> token doesn't really mean anything about the markup, but is a reference
> to something defined elsewhere.

ALL TRUE! By the same token, consider two scenarios:

You're in a room with 4 experienced programmers and 4 domain experts. Let's say they're experts in healthcare, since that's so topical these days. They know about medicine and/or insurance, but not much about IT. Now, everybody has to work together and come to an important agreement. You're the facilitator.

Scenario 1: You put RNG syntax on the table. The programmers grin smugly. They feel perfectly comfortable. The healthcare experts grimace and stir uneasily.

Scenario 2: You put DTD syntax on the table. Everybody stirs uneasily. What's this bizarre nonsense?
As the facilitator responsible for eliciting information from everyone in the room, I'll prefer Scenario 2 every time. I have reason to do the full explanation, and to expect everyone to listen, and to (figuratively) rap the knuckles of anyone who doesn't pay attention.

To prefer Scenario 1 is to create a gross power imbalance right at the outset. The domain experts will be humiliated and reticent. The programmers will run roughshod over the proceedings, and speak only to each other. It's a recipe for failure.

--------------------

So let's assume Scenario 2. The whole question of parameter entities (your % character) doesn't come up for quite a while. It comes up when it's needed, and when it's needed, it's the answer to a question that those present in the room have already asked. It's much easier to understand the answer to a question that you've already asked, especially if the whole field is unfamiliar.

Aside: What always surprises me is that people generally absorb Backus-Naur Form, right at the beginning, and without much difficulty. Maybe BNF should be taught in preschool. I think it might be a good idea.

> #IMPLIED (what does that mean? it seems to indicate there is a default
> value, but what is it?

Hah! I love this one. It comes up immediately, and that's a good thing, because it's the moment when the most important indoctrination occurs.

The indoctrination begins with the programmer-deflating news that we're emphatically not designing any sort of software whatsoever. This news comes as a relief to non-programmers. The programmers, on the other hand, immediately wonder whether they're in the right room. So the facilitator tells them, confidentially, that they are in exactly the right room to protect their turf, and #IMPLIED is their friend. It is a barrier that protects them from the busybody OCD standards-makers who may be lurking in the process. That thought generally restores their (bemused and grumpy) attention, at least for a while.
The explanation of #IMPLIED is: if the interchanged information does not provide an explicit value, then the value, if any, will be supplied by the receiving system, at its sole discretion. That strange idea takes a while to be absorbed. The absorption process is vital; it enormously clarifies the nature of the entire task in which we're engaged. We're only legislating what must be legislated in order to enable information interchange for a (probably unknown) range of "applications".

The full abstract import of the word "application" is worth exploring at this point. It doesn't mean "software" so much as it means "the goals we're trying to achieve in a given context" -- goals which require accurate information interchange between diverse entities in different contexts with different goals.

The indoctrination is complete when everybody can say to themselves the phrase "#IMPLIED by the application" while understanding exactly what that means. The consultants/enablers should congratulate themselves when this finally happens, because a therapeutic process has begun. Programmers and non-programmers alike feel increasingly empowered by the formalization of ambiguity. It makes for good feeling in the room, and thoughts that maybe we can all get along with each other. This might work after all.

> NMTOKEN (hunh?)

The facilitator should wait till a question arises to which it is the answer. That's the time to introduce it.

> CDATA (wha?)

Not sure why this one is a problem. I've never seen it cause any problems.

> PCDATA (seems to have something to with CDATA, but is it different in
> some subtle and important way?)

Personally, by default I try to keep #PCDATA in the realm of <!ELEMENT and CDATA in the realm of <!ATTLIST, thus not allowing the confusion to arise. As I write this, I'm just realizing now that the clunky separation of the two realms has been helpful to me in my practice in exactly this way.
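
For readers who want the #IMPLIED explanation and the #PCDATA/CDATA separation above in concrete form, here is a minimal sketch in Python. It is only an illustration: the `reading` element, the `units` attribute, and the helper `units_of` are hypothetical names invented here, not anything from the thread. The standard-library parser used does not validate against the DTD; the point is simply that the document says nothing about a default, so the receiving application chooses one for itself.

```python
import xml.etree.ElementTree as ET

# Hypothetical document type: names are invented for illustration only.
# #PCDATA stays in the <!ELEMENT realm, CDATA in the <!ATTLIST realm,
# and the attribute is declared #IMPLIED (no default in the DTD at all).
DOC_TEMPLATE = """<?xml version="1.0"?>
<!DOCTYPE reading [
  <!ELEMENT reading (#PCDATA)>
  <!ATTLIST reading units CDATA #IMPLIED>
]>
{body}"""


def units_of(xml_text, receiver_default):
    """Return the 'units' value, or whatever the receiving system decides.

    ElementTree parses the internal subset but does not validate; the
    fallback below is the application's choice, not the DTD's.
    """
    root = ET.fromstring(xml_text)
    return root.get("units", receiver_default)


explicit = DOC_TEMPLATE.format(body='<reading units="mmHg">120</reading>')
silent = DOC_TEMPLATE.format(body="<reading>120</reading>")

# The sender supplied a value, so the receiver's preference is irrelevant.
print(units_of(explicit, "kPa"))
# The sender was silent: two receivers may legitimately disagree.
print(units_of(silent, "kPa"))
print(units_of(silent, "mmHg"))
```

Two receivers handed the same silent document can end up with different values, and neither is wrong. That is one way to read "#IMPLIED by the application".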
Again, though, a question may arise to which the CDATA content model is the answer.

> special meaning of ID/IDREF, but unfamiliar semantics (IDs can't be
> numbers! that will surprise a lot of people)

This has never caused a problem for me. Nobody has ever expressed surprise that rules must govern the construction of SGMLNAMES. They don't always like the rules. So what?

(Aside: As a matter of common practice, the world's largest SGML application, HTML, has long since de-facto vacated the restriction to which you refer. Even more shockingly, at least for me, Larry Masinter wrote in 1999 that it's common practice to allow spaces in fragment identifiers, which after all are the values of SGML ID attributes. Finally, I learned only last week that %-hex-escaped spaces are OK in fragment identifiers in the context of Mozilla, but IE won't work unless they are left unescaped. Go figure.)

> What is that PUBLIC identifier in the DOCTYPE declaration anyway?

You're in grave danger of pushing my rant-button about the W3C's deliberate destruction of the distinction between PUBLIC-ly registered materials and the 100% system-dependent, rental-contract-driven identification of materials via URIs that resolve to only-the-current-lessee-knows-what-(maybe). It's not exactly a recipe for information longevity and reliability.

Anyway, I think the whole PUBLIC identifier thing is irrelevant to a project until it comes up, and when it does come up, as before, everybody has already asked the question, and then there's a good answer that really works.

> Contrived sure, and not all of these things are *necessary* in DTDs, but
> typically one does encounter them. Sure Relax has its own syntax to
> learn, but at its simplest, the RNG document structure *mirrors the
> structure of the document it models*, which to me is the thing that
> really makes it accessible and intuitive. I also like the relative lack
> of obscure words. And I have to say that most of the people that I work
> with have reacted in a similar way.

Look, I say again: I love RNG. I just don't think it's any better than DTD syntax in the real world, IF the purpose of both syntaxes is to facilitate the development of toothy agreements between diverse parties whose primary expertise is not necessarily IT. And DTD syntax remains much more commonplace. History matters, whether we like it or not. What are the facilitator's motivations? Those matter, too.

> I like the idea that redundancy provides humans extra cues, but I'm just
> not sure DTD format in itself actually provides that. Comments in human
> languages are super helpful, as usual, but of course you can have
> comments in any schema format.

Since you mention this, allow me to lament, once again, that W3C deliberately crippled the DTD syntax in XML. It disallowed comments within markup declarations. That wasn't necessary and it was extremely hurtful to consensus-seeking projects. It is hard to draw any conclusion other than that they wanted to snatch document design out of the hands of everyman, offering instead only slavery to the software products of those with the financial resources required to market them, and, of course, to XSD, which everybody knew would be SO much better.

(As to the latter point, I have never understood why the W3C was allowed to escape prosecution under the Sherman Antitrust Act. I know the theory: that TimBL has absolute authority, and therefore W3C is technically not a conspiracy in restraint of trade. But that doesn't change the fact that it is indeed a conspiracy that has allowed multiple market leaders to collude in secret sessions with the net result of restraining trade, once TimBL says "OK".)

Moreover, all that happened before RNG was around.

W3C also crippled DTD syntax in another important way, once again unnecessarily, by disallowing parameter entities in DTDs that are included directly, verbatim, in XML documents.
They're still OK in remote DTDs. They're just not OK where people can exploit their expressive power more locally, and with zero red tape. So once again it's a matter of big money trumping little money, right here in the most economically sensitive area of human intercourse, namely, information interchange. It's easy to understand why the Establishment saw fit to endow TimBL with a knighthood. A badge of honor, right? His integrity took the hit, the Establishment reaped the reward, and the public was duly impressed. No losers, right?