RE: Well-formed vs. valid
>>FYI, our (IBM's) new version 2 architecture parsers do this. We have a
>>pluggable architecture, and one of the plug-ins is a validator. The low
>>level scanner uses this to validate content before it sends it out through
>>the internal event APIs. So, if you are wiring together a SAX style parser,
>>you just wire the internal events to the SAX events and you have a
>>validating SAX parser (actually we have that combination already provided
>>for you as a canned parser, but you can do other variations as well.)
>
>Big question: can I plug someone else's SAX parser into your scanner, and
>then have your validation component work on my SAX events? While it's
>unlikely that I'd want to plug a different SAX parser in, it's quite
>possible that I'd want to work with the SAX events (transforming with XT,
>for instance) before performing validation.
>

You can, it's just less efficient. The validators have to support 're-validation' or 'after the fact' validation, whatever you want to call it (e.g. revalidating a modified DOM tree). Internally, and in a DOM that we write specifically for our parser, we can take advantage of information that significantly speeds up the process. Once the content has passed through to the outside world (via some general API that cannot pass on our information), only the element names exist, so the validator has more work to do, but it does work.

For an event API, you will have to maintain an 'element stack' in order to gather up the information required to do the revalidation (a DOM tree already inherently represents that). It's just a simple push-down stack of the elements along the current nesting hierarchy, together with the children of those elements. When you get to the end of an element, call the validator with the child list, then pop that top element off and go back to working on the previous one.
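As a rough illustration of that element stack, here is a minimal sketch using Python's standard xml.sax module rather than our parser. The names RevalidatingHandler, validate_children and collect_child_lists are hypothetical, and the validator is only a stand-in callback that receives an element name and its list of child element names. Character data is ignored here, so a real validator would also need to track text for mixed content.

```python
import xml.sax


class RevalidatingHandler(xml.sax.ContentHandler):
    """Gathers each element's child names and hands them to a validator."""

    def __init__(self, validate_children):
        super().__init__()
        # validate_children(name, children) is a hypothetical callback,
        # standing in for whatever validator component is plugged in.
        self.validate_children = validate_children
        # One [element_name, child_names] frame per currently open element.
        self.stack = []

    def startElement(self, name, attrs):
        if self.stack:
            # Record this element as a child of the enclosing element.
            self.stack[-1][1].append(name)
        self.stack.append([name, []])

    def endElement(self, name):
        # End of element: validate its child list, then pop back to the parent.
        elem, children = self.stack.pop()
        self.validate_children(elem, children)


def collect_child_lists(xml_text):
    """Parse xml_text and return the (name, children) pairs in validation order."""
    seen = []
    handler = RevalidatingHandler(lambda name, kids: seen.append((name, kids)))
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return seen
```

Each element is validated as soon as its end tag is seen, so inner elements are handed to the validator before the elements that contain them.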

The low-level scanner (while parsing) maintains a stack like this for validation, though it only has to maintain numbers, not names. If you use our internal event API, you can also store numbers for revalidation (as would a DOM written specifically for our system). As you can imagine, just doing number comparisons is much faster than doing hashed string comparisons. Of course, that's not to say that you cannot maintain a string pool yourself, store only numbers in your stack (for speed), and then look up the element name text when it's time to validate. But that's still not as fast as using our numbers, since they already exist and the validator knows the element content models in terms of those numbers.

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@i...
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To (un)subscribe, mailto:majordomo@i... the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@i... the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@i...)