Re: Fast text output from SAX?
Elliotte Rusty Harold wrote:

> At 10:00 AM -0400 4/14/04, Stephen D. Williams wrote:
> .....
>
>> Additionally, the whole parsing etc. stream for XML must be
>> completely performed, in DOM cases and many SAX cases, for every
>> element of a document/object. With esXML, if a 3000-element
>> document/object were read in and 5 elements manipulated, you only
>> spend 5 * element-manipulation-overhead.
>
> I flat out don't believe this. I think there's an underlying
> assumption here (and in some of the other binary formats) which once
> again demonstrates that they are not as much like XML as they claim.
> The only way you can limit this is by assuming the data in your stream
> is well-formed. In XML, we don't assume that. One of the 3000 nodes
> you don't process may be malformed. You're assuming that's not the
> case, and therefore avoiding a lot of overhead in checking for it. A
> large chunk of any speed gain such a format achieves over real XML is
> by cutting corners on well-formedness checking.

I could just as easily argue that every application has to perform
schema validation, then a complete application-level sanity validation
on top of that (since DTD/Schema only goes so far), then referential
integrity checks against database tables, and so on. Very paranoid
applications processing potentially unfriendly data certainly need
those levels, but many other applications do not. Of course you do
some level of sanity checking whenever you process data, but your
assertion that the only real XML (or XIS or ORX) application is one
that fully validates all data, even data it doesn't otherwise need to
know about or use, doesn't seem right for many applications. In fact,
in the n-tier application example, it is explicitly desired that each
tier be concerned only with the data elements it operates on. As an
example, an initial step might be a full validation that is amortized
over further processing.

That said, esXML does have an equivalent of a well-formedness check.
The variable-length integers, sizes, codes, etc. all have very
particular ranges of validity, along with the standard restrictions on
the characters allowed in names, values, processing instructions, and
so on. Fully validating a message is not a requirement for every
application, just as fully schema-validating an XML 1.1 document is
not a requirement for every XML application.

> If this is not the case for esXML and indeed it does make all mandated
> well-formedness checks, then please correct my error. However, I'd be
> very surprised if in that case one could indeed limit parsing
> overhead to the raw I/O.

Completely validating well-formedness is a processing step in its own
right, not only a side effect of loading (or, in the case of esXML,
manipulating) the data. The fact that an XML 1.1 document is corrupted
is found when it is parsed; the fact that an esXML document, or ASN.1,
JPEG, or whatever, is corrupt is also found when it is parsed. That
partial corruption isn't found during partial access of the object is
not a valid argument against being able to do only the work a given
application needs. An MPEG-2 player that only plays the beginning of a
video stream won't find corruption at the end, but that doesn't mean
it can't validly detect corruption.

sdw

--
swilliams@h...  http://www.hpti.com  Per: sdw@l...  http://sdw.st
Stephen D. Williams  703-724-0118W  703-995-0407Fax  20147-4622  AIM: sdw
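
[Editor's note: a minimal Java sketch of the two access models under
debate. The SAX half uses the real javax.xml.parsers API; the
LazyDocument interface is purely hypothetical, invented here to
illustrate the esXML cost claim, and is not the actual esXML API.]

import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;
import java.io.File;

public class AccessCostDemo {

    // Text XML via SAX: the parser must scan, and well-formedness
    // check, every one of the ~3000 elements, even if the application
    // only cares about 5 of them.
    static int countViaSax(File doc) throws Exception {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        final int[] touched = {0};
        parser.parse(doc, new DefaultHandler() {
            @Override
            public void startElement(String uri, String local,
                                     String qName, Attributes atts) {
                touched[0]++;   // fires once per element, wanted or not
            }
        });
        return touched[0];      // ~3000 for the example document
    }

    // Hypothetical lazy-access interface in the spirit of the esXML
    // claim: only the elements actually dereferenced are decoded and
    // locally checked (variable-integer ranges, character rules, etc.).
    interface LazyDocument {
        String valueOf(String elementPath);  // decodes just this path
    }

    static void touchFive(LazyDocument doc) {
        // Cost is ~5 * element-manipulation-overhead, per the claim,
        // because the other ~2995 elements are never decoded. The
        // element paths below are made up for illustration.
        for (String path : new String[] {
                "/order/id", "/order/total", "/order/currency",
                "/order/status", "/order/customer/name" }) {
            System.out.println(path + " = " + doc.valueOf(path));
        }
    }
}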