Re: What Clean Specs Achieve
roddey@u... wrote:
> >>>But is anyone here trying to _implement_ Java? Lots of folks here are indeed trying to _implement_ XML 1.0 (parsers and SAX), XLink and XPointer, Namespaces, XSL, etc. It's not like we're only trying to _use_ them, as is the case with Java (or SQL, another example that's been bounced around.)
> >>
> >>Most of them seem to be succeeding. What should we conclude? -Tim
> >
> >Most people who don't succeed, don't announce. We can't conclude anything.
> >
> >Judging from the volume of questions (and controversy) on this and its sibling lists (XSL-list, xlxp-dev), there's a lot of improvement that could be made.
>
> As an above average developer, who just implemented the bulk of his first XML parser (C++) in a binge over the last month, I have to question whether any 'average' developer will ever implement a full featured parser. I found it very non-trivial to write an XML parser that was well decomposed and layered and pluggable, while retaining competitive performance. I found that XML itself was not very conducive to fast processing and reasonably simple architecture.

Very true. Just to get something working that handled the data I was using took about two weeks of my time. One week reading the spec and asking questions and the other week writing the code. This was back in January of 1997 when there were no XML tutorials around (the spec was not even a recommendation then).

> As to the spec... I don't mean to hurt anyone's feelings, but I found the spec during that effort to be as confusing as enlightening. It describes the logical (sometimes illogical :-) design of XML. But it doesn't help so much when it comes to trying to apply that to some physical design.
> Of course that's not their job, but obviously there have been a good number of parsers written and some obvious issues in implementation could be discussed, to save implementers from doing the same things over and over again and then having to fix them. Of course now it's all obvious :-) But I had to really struggle through it the first time. A 4 or 5 page prose document describing the most obvious implementation pitfalls (and possibly some obvious implementation strategies) could have saved me a week probably. Yes the spec is supposed to describe XML, but is its overall goal not to facilitate the development of software that implements it?

I doubt that is the goal, but many people are hesitant to disclose their parsing "secrets" (-:. I think of XML parsers as pure commodities that you cannot make a penny off of unless you have some higher level tools built on top of a good parser framework. I have found that, at least in Java, a lot of the things I learned while tuning performance helped me out in many areas of programming that have nothing to do with XML.

I think Mr. Clark likes to refer to his generous works as reference implementations. However, XP is not easy to learn from, as it is very low-level and not very straightforward in terms of interfaces. (I am not trying to disrespect Mr. Clark here: the XML parser I wrote may be fast, but the code is practically unmaintainable, as my extreme efforts at performance severely compromised the good software engineering principles that I usually try to follow in my work.) I think this can be said of just about all of the XML parsers out there: they are all spaghetti, except perhaps Aelfred.

> And I suspect that perhaps there are probably parsers out there, where the developers really cannot intellectually prove that they do the right thing. I would be willing to bet that some of them just fix problems until it runs the James Clark tests and digest the Bosak files?
> When a customer reports a

That is what I did for a long time. Debugging through the entire Clark test suite took a week or more, and I still don't pass much more than 90% of the tests for not-well-formed documents, but I suspect Mr. Clark spent a lot longer than a week doing the test suite (-:

> problem, and sends in a sample file, then they look at the spec and try to see if that file seems to correspond to the spec and fix their code to handle if so. That is far easier than trying to prove that every method in your code meets the spec (though it's obviously not the optimum thing to do.)

Yah, generally if you control how your data is created, you can whip up a decent parser to meet your needs. Also, if you don't check for a lot of the obscure errors that may pop up, you can save yourself a ton of processing overhead. Unfortunately, in my case the XML parser will be used in an end-user product where users may edit files manually (and screw things up in the process). But if you just want some basic XML capabilities for your organization and don't want to deal with other people's codebases, XML is not too much of a beast (understanding the spec takes longer than writing the code at first).

> Am I being too cynical here? Maybe so. But, I just don't think that an 'average' developer could write an XML processor that is complete, expandable, maintainable, and speedy, if all he/she had to work with was the raw XML spec (at least not in a time that would be acceptable in a commercial setting, which is what mostly counts I guess?) I think that it would more likely just be 'proven' to be correct through empirical testing, not through an ability to completely understand all the interactions expressed in the XML spec and implement them cleanly.

Very true. I fell into this trap when people on this list were talking about how an average university CS student could whip one up in a week.
At first I said "geez, this is easy", but when I started caring about performance and about detecting some of the very obscure errors in order to be 100% compliant with the draft, I found myself going insane doing a lot more work with XML than I originally intended. Then this XML stuff ballooned into a bunch of XML-related work for several clients with these tools, and now I am here discussing XML with everyone else when all I intended at first was to have basic XML support in the core application I was working on.

> Also, the interactions that just exist in XML (regardless of how well or badly they are expressed in the spec) means that the skill level required to do something that is *maintainable and expandable* (i.e. well decomposed despite all the interactions) is that higher still. Arguing whether or not someone could manage to read the spec and squeeze something out that (in whatever shape) was a fully compliant parser, isn't very meaningful to me.

I could not agree more.

> Oh well, that's my po' two cents worth. I think that yes you need a dry laying out of the facts *and* some guidance at a higher level, related as much to possible implementation issues as interpretation issues. I think that the current spec perhaps is somewhere in between the two and thus somewhat fails to fully please either master?

You can thank the many people here who have provided open-source parsers to work from. (I was never able to get mine out in open-source form as I originally intended, for various business reasons.) I myself decided to waste a lot of time coming up with an XML architecture that works very differently from the event-based or tree-based parsers out there, as it is more of a data-driven model than anything else. (Oh, I forgot to mention Lark from Tim Bray, which uses a DFA model that is unique among the current crop of XML parsers.)
I would say Aelfred is the best "reference" implementation out there, if you could call it that, and anyone who just wants to whip up a decent event-based XML parser should take a look at its source, as it is pretty clean and straightforward.

Tyler

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@i...
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To (un)subscribe, mailto:majordomo@i... the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@i... the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@i...)