Re: comparing XML document structure
On Thu, Aug 18, 2011 at 12:44:02AM +0100, Tony Graham scripsit:
> On Wed, August 17, 2011 11:48 pm, Wendell Piez wrote:
> > It sounds like you want to infer content models on the fly and then
> > validate against them. I can imagine approaches to this, but I doubt
> > that I'd trust many algorithms that actually attempted it -- not because
> > of XSLT, but because of the problem specifying the problem.
> ...
> > But why not use a schema? There are processors such as Trang that can
> > infer schemas from documents.
>
> What Wendell said.

Using trang to generate a schema from the DTD in question has
historically tended to fail. (Not a whole lot, but some; generally
usable for creating a schema to get saxon to validate the output, but
not usable on the fly for structure.)

So I've got a relatively fixed content model, in the form of a
comprehensive DTD and a much less comprehensive example of how to use
that DTD for a particular content type.

Initially, what I want to do is eat the exemplar, use it to generate a
parent-child list -- so I'd have section/num, section/para, and
section/subsection -- and then take an output file and get the same list
from it, then compare the lists and produce a message for mismatches.

So if a particular output file had section/num, section/subsection, and
section/list in it, for example, there should be an exception noted for
the presence of the list. (Valid, but not expected.)

> ...
> > On 8/17/2011 5:57 PM, Graydon wrote:
> ...
> >> The desired goal is to be able to programmatically pull the structure,
> >> at least to the extent of parent-child element pairs, from the
> >> semantics-defining file, and compare that to each output file in turn.
> >>
> >> So if the semantics-defining file gives an example section element,
> >> which has num, para, and subsection element children, what I want to be
> >> able to do is create a sequence of axis relationships and test the
> >> section elements of the output for axis relationships that are not
> >> members of that sequence.
>
> It would help the rest of us wrap our heads around the problem if you
> could provide a sample fragment of the "semantics-defining file" so we can
> see what you are dealing with.

It would, but the whole NDA thing rears its ugly head.

It's just a document, to the same DTD as the output. Instead of having
actual content in it, it has things like

<para>This para is optional; if present, it should contain introductory text</para>

in it.

> You may be able to create the tests you want in Schematron, but it's a bit
> hard to tell without having an example to look at. (If you can generate
> Schematron from your definitions, you could directly create XSLT for the
> axis tests about as easily, but the advantage could be that there are
> tools such as XML IDEs that already understand the Schematron report
> format.)

Schematron is certainly something to look at, yes.

Thanks!
Graydon
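
For illustration only, the parent-child comparison described in the message can be sketched outside XSLT. This is a minimal Python sketch using the standard library's xml.etree.ElementTree, not anything posted in the thread. The function names and the two sample documents are hypothetical, invented from the section/num, section/para, section/subsection example, and it reports only pairs that appear in the output but not in the exemplar.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-ins for the exemplar ("semantics-defining file")
# and one output file; the real documents are not shown in the thread.
EXEMPLAR = (
    "<section><num>1</num>"
    "<para>This para is optional</para>"
    "<subsection/></section>"
)
OUTPUT = "<section><num>1</num><subsection/><list/></section>"


def parent_child_pairs(xml_text):
    """Return the set of 'parent/child' element-name pairs in a document."""
    root = ET.fromstring(xml_text)
    pairs = set()
    for parent in root.iter():      # every element, in document order
        for child in parent:        # its direct element children only
            pairs.add(f"{parent.tag}/{child.tag}")
    return pairs


def unexpected_pairs(exemplar_text, output_text):
    """Pairs present in the output but absent from the exemplar."""
    expected = parent_child_pairs(exemplar_text)
    found = parent_child_pairs(output_text)
    return sorted(found - expected)


for pair in unexpected_pairs(EXEMPLAR, OUTPUT):
    print(f"unexpected: {pair}")    # prints "unexpected: section/list"
```

Using a set of name pairs ignores order and repetition, which matches the "valid, but not expected" check: anything the DTD allows but the exemplar never shows gets reported.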