Re: The triples datamodel -- was Re: SemanticWeb per
On Fri, 2004-06-11 at 13:15, Elliotte Rusty Harold wrote:
> At 12:42 AM +0200 6/11/04, Henrik Martensson wrote:
>
> >What I am arguing is that:
> >* it is not likely that anyone can foresee all possible variations
> >  and build software flexible enough to handle them
> >* even in those cases where it is possible, it is often not
> >  cost efficient
>
> I agree with your first point. It isn't possible or feasible to
> foresee all possible variations. However I'm not suggesting we do
> that. I'm suggesting that you deal with new variations as they arise
> rather than trying to anticipate them. You don't need to process
> everything people might send you, just what they actually do send
> you. The first few weeks with such a system do involve a lot of time
> writing code to process one new format after another, but matters do
> stabilize to a manageable level fairly quickly.

No, they do not. I (usually) do not work with systems where content generation is automated, as you do. In the last XML project I worked on, the customer decided to do the DTD design themselves. They kept up a steady stream of more or less random changes for over 18 months. There were more than 80 versions of one of the DTDs involved. At that point I quit the project and left the company I worked for. My friends who haven't (yet) left tell me they are still keeping it up.

The refusal of both the customer and my former employer to institute measures for change control has cost both companies millions. I quit, and the project manager developed heart problems (from being forced to work overtime while seriously ill). The consequences of allowing more or less random markup changes will affect every maintenance project for the next fifteen years, maybe more.

I may be mistaken, but I believe you assume that every content author will be fairly stable (markup-wise, that is), so that changes occur mainly when a new author enters the arena. This is not the case.
The authors vary a great deal among themselves, of course, but they tend to have certain traits in common:

* They have little or no training for their job
* They are of above-average intelligence
* They are bored

As a result, they come up with wildly inventive solutions to problems, both real and imagined. To keep from being bored, they experiment, and they just keep on doing it. There is no such thing as stabilization as time goes on.

I once made a list of markup changes (in a non-XML system) that about half a dozen authors at one site had taken it upon themselves to make. It was 140 pages long. There was not one single change in there that actually solved a problem. Every single change was either a rename of existing markup, or new markup for something that wasn't supposed to be marked up in the first place.

> When a new format is discovered after the initial burn-in period, it
> normally indicates a significant new addition or deletion of
> information, not just an arbitrary random change; and it's probably
> something that you want to think about. By preventing communicators

See above.

> from sending you new markup you are preventing them from adapting to
> significant changes in the domain. You are limiting what they are
> allowed to tell you, and thereby limiting what you can know.

As I have written before, I consider good communications between content authors and developers to be an absolute necessity. However, there are better ways to communicate a need than to make a change in the markup and wait for something to explode.

For one thing, it may be several years from the change to the explosion. It is not uncommon for data to be stored for decades before it is discovered that something is seriously wrong with it. And it's not just information that is stored. One company I worked with published product catalogs for a foreign market _in_the_wrong_language_ for fifteen years before anyone discovered it.
Another thing is that when the explosion happens, well, it may be pretty serious. Some damage can be repaired, some can't be.

Have you noticed that we are going round in circles? You have made claims about me not listening to authors before, practising bad programming habits, etc. I have refuted them, and you just keep making the same claims all over again, without one shred of evidence to back you up. You keep ignoring what I write, and attribute opinions to me that I do not have. Actually, you attribute opinions and practices to me that I am very outspoken against. You have also claimed that the software I write is inflexible and inherently brittle. Frankly, you don't have the slightest idea about how I write software, under what conditions it works, what makes it break, etc. This is not a good way to make an argument.

Over the course of this thread I have provided several examples of what can and will happen in the kind of projects I work with when markup creativity runs amok. You hold several opinions that I do not agree with, like the idea that large groups of human authors would not instigate a high rate of change in the markup over an extended period of time. I have provided several real-world examples of authors doing just that, but you haven't come up with a single example supporting your opinions.

Earlier, you claimed that unknown information could usually be safely ignored. Again, I provided a counterexample (admonitions). You also wrote that an agreement between sender and receiver was not necessary, because software can usually infer what is actually meant. Again, I have provided examples showing that it may not be all that easy, and that contracts must, and do, exist, even though they are sometimes implicit. I think I also mentioned some of the problems I have encountered with markup of procedures, and how messing up the tagging affects downstream processing and reusability in DMS systems.
All my examples have one thing in common: you completely ignored them. I can't help wondering if it is because you find real-world examples detrimental to your arguments. I suppose you might ignore them because you do not believe them. Either way, there is not much point in continuing the discussion.

I don't doubt that the techniques you use work very well under the conditions you work with. Also, I believe that you are very good at using them. On the rare occasions when I work with similar problems under similar conditions, I do things in ways that are probably not too different from the way you work. For example, I often work with well-formed XML when chaining a series of transformations. I take care to write software that is loosely coupled. (I am a strong believer in the use of design patterns, and the Law of Demeter.) I design DTDs to be as flexible as possible, never constraining them more than I have to. When I extract information from a document, I don't build in unnecessary dependencies on structure. When the same kind of processing applies to many elements, I prefer distinguishing the elements by class (properties, context, whatever they have in common) instead of hardcoding element names. I write automated unit tests, and acceptance tests when circumstances allow it, etc.

Nothing strange about that, except one thing: you insist that I don't, despite us never having worked together, and you never having seen any of the code I've written.

I do not think we are getting much further. As I've written before, all I wanted to do was to refute the idea that the tools and techniques you use are applicable in general to XML processing. I believe I have done that through arguments and examples. You do not buy my opinions at all. I believe that yours are valid for the things you work with, but not generally applicable. Let's move on to other topics.

/Henrik