[XML-DEV Mailing List Archive Home] [By Thread] [By Date] [Recent Entries] [Reply To This Message] RELAX NG and datatypes make XML generation NP-hard
After reading Henry's posting below, I realized that the following problem is also NP-hard: Given an arbitrary RELAX NG schema, which is allowed to contain XML Schema Datatypes, is there an XML document that conforms to the schema? Here's the proof: The 3SAT problem is NP-complete. Any 3SAT problem can be encoded into a RELAX NG schema, such that an XML document contains a solution to the 3SAT problem if and only if the document conforms to the RELAX NG schema. Thus, solving the XML generation problem (for RELAX NG with XML Schema Datatypes) is at least as hard as solving 3SAT. For example, consider the following 3SAT problem: ( x1 OR x2 OR x3 ) AND ( x2 OR NOT(x3) OR NOT(x4) ) AND ( x3 OR x4 OR NOT(x1) ) This 3SAT problem can be encoded as the following RELAX NG schema: <?xml version="1.0" encoding="UTF-8"?> <element name="three_sat_solution" xmlns="http://relaxng.org/ns/structure/1.0" datatypeLibrary= "http://www.w3.org/2001/XMLSchema-datatypes"> <data type="string"> <param name="pattern">x1=[01],x2=[01],x3=[01],x4=[01]</param> <param name="pattern">(.*x1=1.*)|(.*x2=1.*)|(.*x3=1.*)</param> <param name="pattern">(.*x2=1.*)|(.*x3=0.*)|(.*x4=0.*)</param> <param name="pattern">(.*x3=1.*)|(.*x4=1.*)|(.*x1=0.*)</param> </data> </element> The following XML document conforms to this RELAX NG schema and solves the 3SAT problem: <?xml version="1.0" encoding="UTF-8"?> <three_sat_solution>x1=1,x2=0,x3=0,x4=1</three_sat_solution> The following XML document does not conform to the RELAX NG schema and does not solve the 3SAT problem: <?xml version="1.0" encoding="UTF-8"?> <three_sat_solution>x1=1,x2=1,x3=0,x4=0</three_sat_solution> For more information on 3SAT (a.k.a., 3-CNF), see http://www.wikipedia.org/wiki/Boolean_satisfiability_problem. Since RELAX NG has a solid mathematical foundation, I suspect that the XML generation problem can be solved in linear time when the problem is restricted to RELAX NG schemas that do not use W3C XML Schema Datatypes. Best regards, Bob <sig name = 'Bob Lyons' title = 'XML and Java Consultant' company = 'Unidex, Inc.' phone = '+1-732-975-9877' email = 'boblyons@u...' url = 'http://www.unidex.com/' product = 'XML Convert: transforms flat files to XML and vice versa' /> -----Original Message----- From: Henry S. Thompson [mailto:ht@c...] Sent: Friday, March 28, 2003 4:59 PM To: xml-dev@l... Subject: ID/IDREF makes XML generation NP-hard Somewhat surprisingly, it turns out that answering the question, for an arbitrary XML DTD, "Are there any valid instances of the document type defined by this DTD?", is an NP-hard problem. This is true, because it is possible to encode a 3SAT problem [1] in a DTD, so that there is an instance of the DTD's document type iff the problem can be satisfied. So finding a valid instance is as hard a problem as solving the 3SAT problem, and 3SAT is known to be NP-hard. [snip]
|
PURCHASE STYLUS STUDIO ONLINE TODAY!Purchasing Stylus Studio from our online shop is Easy, Secure and Value Priced! Download The World's Best XML IDE!Accelerate XML development with our award-winning XML IDE - Download a free trial today! Subscribe in XML format
|