[XML-DEV Mailing List Archive Home] [By Thread] [By Date] [Recent Entries] [Reply To This Message] For those who wanted to know the algorithm for checking maliciousschemas
This is one of the algorithms.. I explain for DTDs, but it can be modified to be used for any schema language that I would classify as based on regular tree grammars, including W3C XML-Schema, RELAX-NG and its family and so on.. best regards, murali. ---------- Forwarded message ---------- Date: Fri, 3 Oct 2003 06:26:50 -0700 (PDT) From: Murali Mani <mani@c...> Subject: Re: Managing Innovation This problem is in the same league as the one which Henry Thompson mentioned. Henry Thompson showed once that if we use fixed values for IDREFs, then checking whether a non-empty document exists is NP-complete. I will show how to determine whether given a DTD and a doctype declaration R, we can determine whether the DTD will produce only infinite documents with root R. I use an approach similar (rather identical) to "reachability in a directed graph" First, for each element, you have already calculated the set of elements that *must* occur as children. This is possible using quite simple algorithms which look at the content model for each element. I can give an algorithm for this also if you want. Basically, what you would already have is, <!ELEMENT a (b, c?)> then you know that a must have b as a child. Suppose you have that for every element. Let us denote this set for any element A, as NecChildSet (A) Now for each element, we find the set of descendant elements that the element *must* have. For an element, A, I will call this set of descendant elements as NecDescSet (A) You can compute NecDescSet (A) using a recursive algorithm as: NecDescSet (A) = NecChildSet (A) UNION NecDescSet (B), for every B \in NecChildSet (A) Once you have NecDescSet (A) for every element in DTD, you can find malicious elements. These are elements whose NecDescSet containts itself. Now suppose the doctype declaration says root is R, now we know that the DTD produces non-infinite documents with root R, *if and only if* NecDescSet (R) does not contain any malicious elements. best regards, murali.
|
PURCHASE STYLUS STUDIO ONLINE TODAY!Purchasing Stylus Studio from our online shop is Easy, Secure and Value Priced! Download The World's Best XML IDE!Accelerate XML development with our award-winning XML IDE - Download a free trial today! Subscribe in XML format
|