RE: validation against xml schema (xsd)
George,

Thanks very much for this information and for your thoughts. They will
be useful!

Matt

> -----Original Message-----
> From: George Cristian Bina [mailto:george@o...]
> Sent: Thursday, March 05, 2009 4:00 PM
> To: Johnson, Matthew C. (LNG-HBE)
> Cc: xml-dev@l...
> Subject: Re: validation against xml schema (xsd)
>
> Hi Matt,
>
> You can do a first parse and stop once you reach the root element, for
> instance by throwing an exception on the first startElement callback.
> That will give you enough information about the document to determine
> the schema to use. While you do this parse you can buffer what the
> parser reads and then start the validation feeding the parser with the
> buffered content and then the remaining content of your document. You
> can find an example of this in Jing, see the AutoSchemaReader and the
> RewindableReader and RewindableInputStream classes:
>
> http://code.google.com/p/jing-trang/source/browse/trunk/mod/validate/src/main/com/thaiopensource/validate/auto/AutoSchemaReader.java
> http://code.google.com/p/jing-trang/source/browse/trunk/mod/validate/src/main/com/thaiopensource/validate/auto/RewindableReader.java
> http://code.google.com/p/jing-trang/source/browse/trunk/mod/validate/src/main/com/thaiopensource/validate/auto/RewindableInputStream.java
>
> Best Regards,
> George
> --
> George Cristian Bina
> <oXygen/> XML Editor, Schema Editor and XSLT Editor/Debugger
> http://www.oxygenxml.com
>
>
> Johnson, Matthew C. (LNG-HBE) wrote:
> > Hello,
> >
> > I am wrestling with a choice and would like to ask for opinions. In
> > validating XML instance documents against a W3C XML Schema instance, I
> > can either use @xsi:schemaLocation and rely on it as a hint or I
> > can infer which schema to apply using some other piece of information
> > from the document. I believe one of the arguments against using
> > @xsi:schemaLocation is that the consuming application should arguably be
> > in a better position to determine which schema to apply than the
> > producer. This is especially true in situations where a document could
> > be valid against multiple schemas. My scenario is that a document is
> > either valid or not but I do not want to discount this argument.
> > Another argument against is that it is defined as only a hint and that
> > not all tools support it, although in my case, the tools do support it.
> >
> > My question is, if I did not use/provide @xsi:schemaLocation, what are
> > some suggested options and means to determine the schema? I will almost
> > certainly be using a catalog (OASIS) so I believe this will play a role
> > in the decision. One option I have considered is using the namespace
> > URI of the root element as a sort of public identifier that could be
> > used by the catalog resolver but this has limited support in
> > "off-the-shelf" parsing solutions. For example, Xerces (Java) supports
> > this through their (XNI) XMLCatalogResolver class but standard SAX
> > EntityResolver(2) does not expose/report namespaces.
> >
> > The piece that is bugging me a little is that, regardless of the means
> > of determining the schema, it feels like an extra
> > step/pass/look-into-the-document is required before the actual parse of
> > the document. Relying on @xsi:schemaLocation feels much more like
> > relying on a DOCTYPE for a DTD in that it is recognized during the main
> > parsing step represented by a standard API call (e.g.
> > xmlreader.parse(...)) (even if that call does a few passes itself).
> >
> > I could even remove the notion of XSD here and ask the same question if
> > I were validating against one of multiple RelaxNG schemas. Since RNG
> > does not have the standardized equivalent of @xsi:schemaLocation that
> > allows the instance document to say "validate me to this schema", it
> > feels like a pre-pass would be needed here too. The Oxygen editor uses
> > a processing instruction to indicate which RNG file it should use for
> > validation but I am unsure whether the implementation first does a pass
> > to get the PI and then another to validate or whether it is able to
> > validate in a single pass.
> >
> > Am I missing anything here? I appreciate any comments, alternatives,
> > etc. Thanks, I appreciate it!
> >
> > Matt
> >
> > PS: My scenario involves collections of heterogeneous content types so
> > each document could be of one of several schema types (but only valid to
> > one). The effect is that I could not rely on doing a pre-parse (or
> > regex) on the first of a collection and assume that all docs in that
> > collection are the same.
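
For readers who want to see the shape of George's suggestion (abort a first parse at the root element, then validate the same content from the start), here is a minimal Java sketch using standard SAX and javax.xml.validation. It is not Jing's code. The names RootSniffer, RootFound, peekRoot, validate and schemasByNamespace are hypothetical, the whole document is held in a byte array as a simplification of the rewindable stream George describes, and the namespace-to-schema map is only a stand-in for whatever catalog lookup the application would really use.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.util.Map;
import javax.xml.namespace.QName;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: all names here are illustrative, not from Jing or Xerces.
final class RootSniffer {

    /** Thrown on purpose to abort the first parse once the root element is seen. */
    static final class RootFound extends SAXException {
        final QName root;

        RootFound(QName root) {
            super("root element found");
            this.root = root;
        }
    }

    /** Parses only as far as the root start tag and returns its namespace and local name. */
    static QName peekRoot(byte[] xml)
            throws SAXException, IOException, ParserConfigurationException {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true); // needed so startElement reports the namespace URI
        XMLReader reader = factory.newSAXParser().getXMLReader();
        reader.setContentHandler(new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName,
                                     Attributes atts) throws SAXException {
                // The first startElement callback is the root element: stop here.
                throw new RootFound(new QName(uri, localName));
            }
        });
        try {
            reader.parse(new InputSource(new ByteArrayInputStream(xml)));
        } catch (RootFound found) {
            return found.root;
        }
        throw new SAXException("document has no root element");
    }

    /**
     * Picks a schema by the root element's namespace, then validates the
     * buffered bytes again from the beginning. Assumption: the caller has
     * already compiled one Schema per namespace URI.
     */
    static void validate(byte[] xml, Map<String, Schema> schemasByNamespace)
            throws SAXException, IOException, ParserConfigurationException {
        QName root = peekRoot(xml);
        Schema schema = schemasByNamespace.get(root.getNamespaceURI());
        if (schema == null) {
            throw new SAXException(
                    "no schema registered for namespace " + root.getNamespaceURI());
        }
        // With no ErrorHandler set, the validator throws on the first error.
        schema.newValidator().validate(new StreamSource(new ByteArrayInputStream(xml)));
    }
}
```

Because each document is sniffed on its own, this also fits the heterogeneous-collection case in Matt's PS. For large documents, buffering only what the first parse consumed (as George describes for Jing's rewindable classes) would avoid holding the whole document in memory.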